
Ordinary Differential Equations

Peter Philip∗

Lecture Notes

Originally Created for the Class of Spring Semester 2012 at LMU Munich,

Revised and Extended for Several Subsequent Classes

Contents

1 Basic Notions
1.1 Types and First Examples
1.2 Equivalent Integral Equation
1.3 Patching and Time Reversion

2 Elementary Solution Methods for First-Order ODE
2.1 Geometric Interpretation, Graphing
2.2 Linear ODE, Variation of Constants
2.3 Separation of Variables
2.4 Change of Variables

3 General Theory
3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
3.2 Existence of Solutions
3.3 Uniqueness of Solutions
3.4 Extension of Solutions, Maximal Solutions
3.5 Continuity in Initial Conditions

4 Linear ODE
4.1 Definition, Setting
4.2 Gronwall’s Inequality
4.3 Existence, Uniqueness, Vector Space of Solutions
4.4 Fundamental Matrix Solutions and Variation of Constants
4.5 Higher-Order, Wronskian
4.6 Constant Coefficients
4.6.1 Linear ODE of Higher Order
4.6.2 Systems of First-Order Linear ODE

5 Stability
5.1 Qualitative Theory, Phase Portraits
5.2 Stability at Fixed Points
5.3 Constant Coefficients
5.4 Linearization
5.5 Limit Sets

A Differentiability
B.1 Kn-Valued Riemann Integral
B.2 Kn-Valued Lebesgue Integral
C.1 Distance in Metric Spaces
C.2 Compactness in Metric Spaces
F Paths in Rn
I.1 Product Rule
I.2 Integration and Matrix Multiplication Commute
J.1 Equivalence of Autonomous and Nonautonomous ODE
J.2 Integral for ODE with Discontinuous Right-Hand Side

References

∗E-Mail: philip@math.lmu.de


1 Basic Notions

1.1 Types and First Examples

A differential equation is an equation for some unknown function, involving one or more

derivatives of the unknown function. Here are some first examples:

y′ = y, (1.1a)

y^{(5)} = (y′)^2 + π x, (1.1b)

(y′)^2 = c, (1.1c)

∂t x = e^{2πit} x^2, (1.1d)

x″ = −3x + (−1, 1). (1.1e)

One distinguishes between ordinary differential equations (ODE) and partial differential

equations (PDE). While ODE contain only derivatives with respect to one variable, PDE

can contain (partial) derivatives with respect to several different variables. In general,

PDE are much harder to solve than ODE. The equations in (1.1) all are ODE, and only

ODE are the subject of this class. We will see precise definitions shortly, but we can

already use the examples in (1.1) to get some first exposure to important ODE-related

terms and to discuss related issues.

As in (1.1), the notation for the unknown function varies in the literature, where the

two variants presented in (1.1) are probably the most common ones: In the first three

equations of (1.1), the unknown function is denoted y, usually assumed to depend on

a variable denoted x, i.e. x 7→ y(x). In the last two equations of (1.1), the unknown

function is denoted x, usually assumed to depend on a variable denoted t, i.e. t 7→ x(t).

So one has to use some care due to the different roles of the symbol x. The notation

t 7→ x(t) is typically favored in situations arising from physics applications, where t

represents time. In this class, we will mostly use the notation x 7→ y(x).

There is another, in a way a slightly more serious, notational issue that one commonly

encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d)

is not entirely correct, as functions and function arguments are not properly distin-

guished. Correctly written, (1.1b) and (1.1d) read

∀x∈D(y): y^{(5)}(x) = (y′(x))^2 + π x, (1.2a)

∀t∈D(x): (∂t x)(t) = e^{2πit} (x(t))^2, (1.2b)

where D(y) and D(x) denote the respective domains of the functions y and x. However,

one might also notice that the notation in (1.2) is more cumbersome and, perhaps,

harder to read. In any case, the type of slight abuse of notation present in (1.1b) and

(1.1d) is so common in the literature that one will have to live with it.


One speaks of first-order ODE if the equations involve only first derivatives such as in

(1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise

order is given by the highest derivative occurring in the equation, such that (1.1b) is

an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th.

3.1 that ODE of higher order can be equivalently formulated and solved as systems of

ODE of first order, where systems of ODE obviously consist of several ODE to be solved

simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE

in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE

of second order or as the system

x1″ = −3x1 − 1, (1.3a)

x2″ = −3x2 + 1. (1.3b)
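The reduction of a higher-order ODE to a first-order system (the subject of Th. 3.1 below) can already be tried out numerically. The following Python sketch is illustrative only and not part of the notes; the RK4 integrator and the step size are ad-hoc choices. It rewrites x″ = −3x + (−1, 1) with x(0) = x′(0) = (0, 0) as a system for z = (x1, x2, x1′, x2′) and compares the first component with the closed-form solution x1(t) = −(1 − cos(√3 t))/3 for these initial data.

```python
import math

def rk4_step(f, t, z, h):
    # One classical fourth-order Runge-Kutta step for z' = f(t, z).
    k1 = f(t, z)
    k2 = f(t + h/2, [zi + h/2*ki for zi, ki in zip(z, k1)])
    k3 = f(t + h/2, [zi + h/2*ki for zi, ki in zip(z, k2)])
    k4 = f(t + h, [zi + h*ki for zi, ki in zip(z, k3)])
    return [zi + h/6*(a + 2*b + 2*c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

# x'' = -3x + (-1, 1) as a first-order system in z = (x1, x2, x1', x2'):
def rhs(t, z):
    x1, x2, v1, v2 = z
    return [v1, v2, -3*x1 - 1, -3*x2 + 1]

z, t, h = [0.0, 0.0, 0.0, 0.0], 0.0, 1e-3   # x(0) = x'(0) = (0, 0)
while t < 1.0 - 1e-12:
    z = rk4_step(rhs, t, z, h)
    t += h

# Closed-form first component for these initial data:
exact_x1 = -(1 - math.cos(math.sqrt(3) * 1.0)) / 3
```

By the symmetry of the two components, x2 = −x1, so both components can be checked against the same closed form.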

One calls an ODE explicit if it has been solved explicitly for the highest-order deriva-

tive, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general,

explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-called

differential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider ex-

plicit ODE in this class.

As the reader might already have noticed, without further information, none of the

equations in (1.1) makes much sense. Every function, in particular, every function

solving an ODE, needs a set as the domain where it is defined, and a set as the range it

maps into. Thus, for each ODE, one needs to specify the admissible domains as well as

the range of the unknown function. For an ODE, one usually requires a solution to be

defined on a nontrivial (bounded or unbounded) interval I ⊆ R. Prescribing the possible

range of the solution is an integral part of setting up an ODE, and it often completely

determines the ODE’s meaning and/or its solvability. For example, for (1.1d), (a subset

of) C is a reasonable range. Similarly, for (1.1a)–(1.1c), one can require the range to

be either R or C, where requiring range R for (1.1c) immediately implies there is no

solution for c < 0. However, one can also specify (a subset of) Rn or Cn , n > 1, as

range for (1.1a), turning the ODE into an n-dimensional ODE (or a system of ODE),

where y now has n components (y1, . . . , yn) (note that, except in cases where we are

dealing with matrix multiplications, we sometimes denote elements of Rn as columns

and sometimes as rows, switching back and forth without too much care). A reasonable

range for (1.1e) is (a subset of) R2 or C2 .

One of the important goals regarding ODE is to find conditions under which one can guarantee the existence of solutions. Moreover, if possible, one would like to find conditions

that guarantee the existence of a unique solution. Clearly, for each a ∈ R, the function

y : R −→ R, y(x) = a ex , is a solution to (1.1a), showing one cannot expect uniqueness

without specifying further requirements. The most common additional conditions that

often (but not always) guarantee a unique solution are initial conditions (e.g. requiring y(x0) = y0, with x0, y0 given) and boundary conditions (e.g. requiring y(a) = ya, y(b) = yb for y : [a, b] −→ Cn, with ya, yb ∈ Cn given).
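How an initial condition singles out one member of the solution family can be made concrete; a minimal Python sketch (not part of the notes; the values for x0, y0 are illustrative):

```python
import math

# Every member of the family y(x) = a*exp(x), a in R, solves y' = y.
# The initial condition y(x0) = y0 forces a = y0*exp(-x0), i.e. it
# singles out exactly one member of the family.
x0, y0 = 2.0, 5.0            # sample initial data (illustrative)
a = y0 * math.exp(-x0)

def phi(x):
    return a * math.exp(x)

# phi satisfies the initial condition and (numerically) the ODE:
ic_ok = abs(phi(x0) - y0) < 1e-12
ode_residual = abs((phi(1 + 1e-6) - phi(1 - 1e-6)) / 2e-6 - phi(1))
```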

Let us now proceed to mathematically precise definitions of the abovementioned notions.


Definition 1.1. Let k, n ∈ N, let G ⊆ R × K(k+1)n, and let F : G −→ Kn. A solution to the implicit kth-order ODE

F(x, y, y′, . . . , y^{(k)}) = 0 (1.4)

is a k times differentiable function

φ : I −→ Kn, (1.5)

defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval I ⊆ R satisfying the two conditions

(i) (x, φ(x), φ′(x), . . . , φ^{(k)}(x)) ∈ G for each x ∈ I;

(ii) F(x, φ(x), φ′(x), . . . , φ^{(k)}(x)) = 0 for each x ∈ I.

Note that condition (i) is necessary so that one can even formulate condition (ii).

Definition 1.2. Let k, n ∈ N, let G ⊆ R × Kkn, and let f : G −→ Kn. A solution to the explicit kth-order ODE

y^{(k)} = f(x, y, y′, . . . , y^{(k−1)}) (1.6)

is a k times differentiable function φ as in (1.5), defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval I ⊆ R satisfying the two conditions

(i) (x, φ(x), φ′(x), . . . , φ^{(k−1)}(x)) ∈ G for each x ∈ I;

(ii) φ^{(k)}(x) = f(x, φ(x), φ′(x), . . . , φ^{(k−1)}(x)) for each x ∈ I.

Again, note that condition (i) is necessary so that one can even formulate condition (ii). Also note that φ is a solution to (1.6) if, and only if, φ is a solution to the equivalent implicit ODE y^{(k)} − f(x, y, y′, . . . , y^{(k−1)}) = 0.

Definition 1.3. (a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the initial condition

∀j=0,...,k−1: y^{(j)}(x0) = y0,j, (1.7)

where x0 ∈ R and y0,0, . . . , y0,k−1 ∈ Kn are given. A solution φ to the initial value problem is a k times differentiable function φ as in (1.5) that is a solution to the ODE and that also satisfies (1.7) (with y replaced by φ); in particular, this requires x0 ∈ I.


(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the boundary condition

∀j∈Ja: y^{(j)}(a) = ya,j, ∀j∈Jb: y^{(j)}(b) = yb,j, (1.8)

where a, b ∈ R, a < b, Ja, Jb ⊆ {0, . . . , k − 1}, and given values ya,j ∈ Kn for each j ∈ Ja, yb,j ∈ Kn for each j ∈ Jb. A solution φ to the boundary value problem is a k times differentiable function φ as in (1.5) that is a solution to the ODE and that also satisfies (1.8) (with y replaced by φ); in particular, this requires [a, b] ⊆ I.

Under suitable hypotheses, initial and boundary value problems for ODE have unique

solutions (for initial value problems, we will see some rather general results in Cor. 3.10

and Cor. 3.16 below). However, in general, they can have infinitely many solutions or

no solutions, as shown by Examples 1.4(b),(c),(e) below.

Example 1.4. (a) Let k ∈ N and a ∈ K. Then φ : R −→ K, φ(x) := a e^x, is a solution to the kth order explicit initial value problem

y^{(k)} = y, (1.9a)

∀j=0,...,k−1: y^{(j)}(0) = a. (1.9b)

We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that φ is the unique solution to (1.9) on R.

(b) Consider the initial value problem

y′ = √|y|, (1.10a)

y(0) = 0. (1.10b)

For each c ≥ 0, the function

φc : R −→ R, φc(x) := 0 for x ≤ c and φc(x) := (x − c)^2/4 for x ≥ c, (1.11)

is differentiable with derivative

φ′c : R −→ R, φ′c(x) = 0 for x ≤ c and φ′c(x) = (x − c)/2 for x ≥ c, (1.12)

solving the ODE. Thus, (1.10) is an example of an initial value problem with uncountably many different solutions, all defined on the same domain.
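The non-uniqueness can be checked mechanically; a Python sketch (not part of the notes; the sample values of c and x are arbitrary):

```python
import math

def phi(c, x):
    # phi_c from (1.11)
    return 0.0 if x <= c else (x - c) ** 2 / 4

def dphi(c, x):
    # its derivative (1.12)
    return 0.0 if x <= c else (x - c) / 2

# Each phi_c, c >= 0, satisfies y(0) = 0 and y' = sqrt(|y|):
ok = all(
    phi(c, 0.0) == 0.0
    and all(abs(dphi(c, x) - math.sqrt(abs(phi(c, x)))) < 1e-12
            for x in (0.0, 0.5, c, c + 1.0, c + 2.5))
    for c in (0.0, 0.3, 1.0, 7.0)
)
```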


(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no real-valued solution for c < 0. For c ≥ 0, every function φ : R −→ R, φ(x) := a ± x√c, a ∈ R, is a solution to (1.1c). Moreover, for c < 0, every function φ : R −→ C, φ(x) := a ± x i√(−c), a ∈ C, is a C-valued solution to (1.1c). The one-dimensional implicit first-order ODE

e^{y′} = 0 (1.13)

has no solution at all, since the exponential function never attains the value 0. It is left as an exercise to find f : R −→ R such that the explicit ODE y′ = f(x) has no solution.

(d) Given n ∈ N and a, c ∈ Kn, the function

φ : R −→ Kn, φ(x) := c + x a, (1.14)

is the unique solution to the n-dimensional explicit first-order initial value problem

y′ = a, (1.15a)

y(0) = c. (1.15b)

(e) Let a, b ∈ R, a < b. We will see later in Example 4.12 that on [a, b] the 1-dimensional explicit second-order ODE

y″ = −y (1.16)

has precisely the set of solutions

L = {(c1 sin + c2 cos) : [a, b] −→ K : c1, c2 ∈ K}. (1.17)

Thus, the boundary value problem

y(0) = 0, y(π/2) = 1, (1.18a)

for (1.16) has the unique solution φ : [0, π/2] −→ K, φ(x) := sin x (using (1.18a) and (1.17) implies c2 = 0 and c1 = 1); the boundary value problem

y(0) = 0, y(π) = 0, (1.18b)

for (1.16) has the infinitely many different solutions φc : [0, π] −→ K, φc(x) := c sin x, c ∈ K; and the boundary value problem

y(0) = 0, y(π) = 1, (1.18c)

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory requirements c2 = 0 and c2 = −1).
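Each of the three boundary value problems amounts to a small linear system for (c1, c2) in (1.17). The following Python sketch (illustrative, with the boundary data read off from the three cases above) classifies each system as uniquely solvable, underdetermined, or inconsistent:

```python
import math

def bvp_constants(beta, v0, vbeta):
    """Try to match y = c1*sin + c2*cos to y(0) = v0, y(beta) = vbeta.
    Returns ('unique', (c1, c2)), ('infinite', c2), or ('none', None)."""
    c2 = v0                                  # from y(0) = c2
    s, co = math.sin(beta), math.cos(beta)
    if abs(s) > 1e-12:
        return ('unique', ((vbeta - c2 * co) / s, c2))
    if abs(c2 * co - vbeta) < 1e-12:
        return ('infinite', c2)              # c1 remains free
    return ('none', None)

r_unique   = bvp_constants(math.pi / 2, 0.0, 1.0)   # -> sin x
r_infinite = bvp_constants(math.pi,     0.0, 0.0)   # c*sin x, c free
r_none     = bvp_constants(math.pi,     0.0, 1.0)   # contradictory
```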


(f) Consider

F : R × K2 × K2 −→ K2, F(x, (y1, y2), (z1, z2)) := (z2, z2 − 1). (1.19a)

Then the corresponding implicit first-order ODE

F(x, y, y′) = (y2′, y2′ − 1) = (0, 0) (1.19b)

has no solution, since it would require both y2′ = 0 and y2′ = 1.

(g) Consider

F : R × C3 × C3 −→ C3, F(x, (y1, y2, y3), (z1, z2, z3)) := (z2 + i y3 − 2i, z1 + y2 − x^2, y1 − i e^{ix}). (1.20a)

Then the corresponding implicit first-order ODE

F(x, y, y′) = (y2′ + i y3 − 2i, y1′ + y2 − x^2, y1 − i e^{ix}) = (0, 0, 0) (1.20b)

has a unique solution on R (note that, here, we do not need to provide initial

or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an

example of a differential algebraic equation, since, read in components, only its first

two equations contain derivatives, whereas its third equation is purely algebraic.

It is often useful to rewrite a first-order explicit initial value problem as an equivalent

integral equation. We provide the details of this equivalence in the following theorem:

Theorem 1.5. If G ⊆ R × Kn, n ∈ N, and f : G −→ Kn is continuous, then, for each (x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem

y′ = f(x, y), (1.21a)

y(x0) = y0, (1.21b)

is equivalent to the integral equation

y(x) = y0 + ∫_{x0}^{x} f(t, y(t)) dt (1.22)

in the following sense: A continuous function φ : I −→ Kn, defined on a nontrivial interval I ⊆ R with x0 ∈ I, and satisfying

{(x, φ(x)) : x ∈ I} ⊆ G, (1.23)

is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,

∀x∈I: φ(x) = y0 + ∫_{x0}^{x} f(t, φ(t)) dt, (1.24)

i.e. if, and only if, φ is a solution to the integral equation (1.22).

Proof. Assume φ : I −→ Kn to be a continuous function satisfying (1.23). If φ is a solution to (1.21), then φ is differentiable and the assumed continuity of f implies the continuity of φ′. In other words, each component φj of φ, j ∈ {1, . . . , n}, is in C1(I, K). Thus, the fundamental theorem of calculus [Phi16, Th. 10.20(b)] applies, and [Phi16, (10.51b)] together with (1.21b) yields

∀x∈I ∀j∈{1,...,n}: φj(x) = φj(x0) + ∫_{x0}^{x} fj(t, φ(t)) dt = y0,j + ∫_{x0}^{x} fj(t, φ(t)) dt, (1.25)

proving φ satisfies (1.24). Conversely, if φ satisfies (1.24), then the validity of the initial condition (1.21b) is immediate.

Moreover, as f and φ are continuous, so is the integrand function t ↦ f(t, φ(t)) of (1.24). Thus, [Phi16, Th. 10.20(a)] applies to (the components of) φ, showing φ′(x) = f(x, φ(x)) for each x ∈ I, i.e. φ is a solution to (1.21).
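The integral form (1.24) also suggests the classical Picard iteration φ_{n+1}(x) = y0 + ∫_{x0}^{x} f(t, φ_n(t)) dt, a standard device built on this equivalence (it reappears in the existence theory). A Python sketch for y′ = y, y(0) = 1, using exact rational arithmetic on polynomial coefficients (illustrative, not part of the notes):

```python
from fractions import Fraction

# Picard iteration for y' = y, y(0) = 1:
#   phi_{n+1}(x) = 1 + integral_0^x phi_n(t) dt.
# Polynomials are coefficient lists [a0, a1, ...] for sum a_k x^k,
# so each iteration is exact rational arithmetic.
def picard_step(coeffs):
    integrated = [Fraction(0)] + [a / (k + 1) for k, a in enumerate(coeffs)]
    integrated[0] = Fraction(1)      # the "+ y0" term with y0 = 1
    return integrated

phi = [Fraction(1)]                  # phi_0 identically 1
for _ in range(8):
    phi = picard_step(phi)
# phi is now the degree-8 Taylor polynomial of exp: coefficients 1/k!.
```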

Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special

case, where f does not actually depend on y, but merely on x, the equivalence between

(1.21) and (1.22) can be directly exploited to actually solve the initial value problem:

If f : I −→ Kn , where I ⊆ R is some nontrivial interval with x0 ∈ I, then we obtain

φ : I −→ Kn to be a solution of (1.21) if, and only if,

∀x∈I: φ(x) = y0 + ∫_{x0}^{x} f(t) dt, (1.26)

i.e. if, and only if, φ is the antiderivative of f that satisfies the initial condition. In

particular, in the present situation, φ as given by (1.26) is the unique solution to the

initial value problem. Of course, depending on f , it can still be difficult to carry out

the integral in (1.26).
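In this special case, carrying out the integral in (1.26) numerically already produces the solution; a Python sketch using the composite Simpson rule (the quadrature rule and the example f = cos are illustrative choices, not from the notes):

```python
import math

def solve_quadrature(f, x0, y0, x, n=1000):
    # phi(x) = y0 + integral_{x0}^{x} f(t) dt via composite Simpson
    # (n must be even).
    h = (x - x0) / n
    s = f(x0) + f(x)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(x0 + k * h)
    return y0 + s * h / 3

# y' = cos x, y(0) = 0 has the unique solution phi(x) = sin x:
val = solve_quadrature(math.cos, 0.0, 0.0, 1.0)
```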

If solutions defined on different intervals fit together, then they can be patched to obtain

a solution on the union of the two intervals:

Lemma 1.7 (Patching of Solutions). Let k, n ∈ N. Given G ⊆ R × Kkn and f : G −→ Kn, let φ : I −→ Kn and ψ : J −→ Kn both be solutions to (1.6), i.e. to y^{(k)} = f(x, y, y′, . . . , y^{(k−1)}), such that I = ]a, b], J = [b, c[, a < b < c, and such that

∀j=0,...,k−1: φ^{(j)}(b) = ψ^{(j)}(b). (1.27)

Then

σ : I ∪ J −→ Kn, σ(x) := φ(x) for x ∈ I and σ(x) := ψ(x) for x ∈ J, (1.28)

is also a solution to (1.6).

Proof. By the definition (1.28) of σ,

∀j=0,...,k−1: σ^{(j)}(b) = φ^{(j)}(b) = ψ^{(j)}(b) (1.29)

must hold, where (1.27) guarantees that σ^{(j)}(b) exists for each j = 0, . . . , k−1. Moreover, σ is k times differentiable at each x ∈ I ∪ J, x ≠ b, and

∀x∈I∪J, x≠b: σ^{(k)}(x) = f(x, σ(x), σ′(x), . . . , σ^{(k−1)}(x)). (1.30)

However, at b, we also have (using the left-hand derivatives for φ and the right-hand derivatives for ψ)

σ^{(k)}(b) = f(b, σ(b), σ′(b), . . . , σ^{(k−1)}(b)), (1.31)

which shows σ is k times differentiable and the equality of (1.30) also holds at x = b, completing the proof that σ is a solution.
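A concrete instance of the patching lemma, reusing the solutions of Example 1.4(b) with k = 1 and b = 0 (Python sketch, illustrative; the derivative at the seam is approximated by a central difference):

```python
import math

# Patch two solutions of y' = sqrt(|y|) at b = 0 (k = 1, so (1.27)
# only requires phi(0) = psi(0)); compare Example 1.4(b).
def phi(x):           # solution on I = ]-1, 0]
    return 0.0

def psi(x):           # solution on J = [0, 2[
    return x * x / 4

def sigma(x):
    return phi(x) if x <= 0 else psi(x)

def dsigma(x, h=1e-6):
    # central-difference approximation of sigma'
    return (sigma(x + h) - sigma(x - h)) / (2 * h)

# sigma solves the ODE on all of ]-1, 2[, including the seam x = 0:
residual = max(abs(dsigma(x) - math.sqrt(abs(sigma(x))))
               for x in (-0.5, -0.1, 0.0, 0.1, 1.5))
```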

Definition 1.8. Let k, n ∈ N, Gf ⊆ R × Kkn, f : Gf −→ Kn, and consider the ODE

y^{(k)} = f(x, y, y′, . . . , y^{(k−1)}). (1.32)

The time-reversed version of (1.32) is defined to be the ODE

y^{(k)} = g(x, y, y′, . . . , y^{(k−1)}), (1.33)

where

g : Gg −→ Kn, g(x, y1, . . . , yk) := (−1)^k f(−x, y1, −y2, . . . , (−1)^{k−1} yk), (1.34a)

Gg := {(x, y) ∈ R × Kkn : (−x, y1, −y2, . . . , (−1)^{k−1} yk) ∈ Gf}. (1.34b)

Lemma 1.9 (Time Reversion). Let k, n ∈ N, Gf ⊆ R × Kkn, and f : Gf −→ Kn.

(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If −∞ ≤ a < b ≤ ∞, then φ : ]a, b[ −→ Kn is a solution to (1.32) if, and only if, ψ : ]−b, −a[ −→ Kn, ψ(x) := φ(−x), is a solution to (1.33).


(b): Due to (a), it suffices to show if φ is a solution to (1.32), then ψ is a solution to (1.33). Clearly, if x ∈ ]−b, −a[, then −x ∈ ]a, b[. Moreover, noting

∀j=0,...,k ∀x∈]−b,−a[: ψ^{(j)}(x) = (−1)^j φ^{(j)}(−x), (1.36a)

one has

∀x∈]−b,−a[: (−x, φ(−x), φ′(−x), . . . , φ^{(k−1)}(−x)) ∈ Gf ⟹ (x, φ(−x), −φ′(−x), . . . , (−1)^{k−1} φ^{(k−1)}(−x)) ∈ Gg (1.36b)

and

∀x∈]−b,−a[: ψ^{(k)}(x) = (−1)^k f(−x, φ(−x), φ′(−x), . . . , φ^{(k−1)}(−x)) = g(x, ψ(x), ψ′(x), . . . , ψ^{(k−1)}(x)), (1.36c)

proving ψ to be a solution to (1.33).
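The time reversion can be illustrated for k = 1, where the reversed right-hand side simplifies to g(x, y) = −f(−x, y); a Python sketch (not part of the notes; f(x, y) = y is an illustrative choice):

```python
import math

# For k = 1, the time-reversed version of y' = f(x, y) is y' = g(x, y)
# with g(x, y) = -f(-x, y).  Take f(x, y) = y, whose solution through
# (0, 1) is phi(x) = exp(x); by Lem. 1.9(b), psi(x) = phi(-x) = exp(-x)
# must solve the reversed ODE y' = -y.
def f(x, y):
    return y

def g(x, y):
    return -f(-x, y)

def psi(x):
    return math.exp(-x)

def dpsi(x, h=1e-6):
    # central-difference approximation of psi'
    return (psi(x + h) - psi(x - h)) / (2 * h)

residual = max(abs(dpsi(x) - g(x, psi(x))) for x in (-1.0, 0.0, 0.5, 2.0))
```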

2 Elementary Solution Methods for First-Order ODE

2.1 Geometric Interpretation, Graphing

Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slope

y ′ = f (x, y) for every point (x, y). In other words, it provides a field of directions. The

task is to find a differentiable function φ such that its graph has the prescribed slope in

each point it contains. In certain simple cases, drawing the field of directions can help

to guess the solutions of the ODE.

Example 2.1. Let G := R+ × R and f : G −→ R, f (x, y) := y/x, i.e. we consider the

ODE y ′ = y/x. Drawing the field of directions leads to the idea that the solutions are

functions whose graphs constitute rays, i.e. φc : R+ −→ R, y = φc (x) = c x with c ∈ R.

Indeed, one immediately verifies that each φc constitutes a solution to the ODE.

2.2 Linear ODE, Variation of Constants

Definition 2.2. Let I ⊆ R be an open interval and let a, b : I −→ K be continuous

functions. An ODE of the form

y ′ = a(x)y + b(x) (2.1)

is called a linear ODE of first order. It is called homogeneous if, and only if, b ≡ 0; it is

called inhomogeneous if, and only if, it is not homogeneous.


Theorem 2.3 (Variation of Constants). Let I ⊆ R be an open interval and let a, b : I −→ K be continuous. Moreover, let x0 ∈ I and c ∈ K. Then the linear ODE (2.1) has a unique solution φ : I −→ K that satisfies the initial condition y(x0) = c. This unique solution is given by

φ : I −→ K, φ(x) = φ0(x) (c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt), (2.2a)

where

φ0 : I −→ K, φ0(x) = exp(∫_{x0}^{x} a(t) dt) = e^{∫_{x0}^{x} a(t) dt}. (2.2b)

Note that φ0^{−1} denotes 1/φ0 and not the inverse function of φ0 (which does not even necessarily exist).

Proof. The function a is assumed to be continuous, i.e., in particular, Riemann integrable on [x0, x]. Moreover, the fundamental theorem of calculus [Phi16, Th. 10.20(a)] applies, showing φ0 is differentiable with

φ′0 : I −→ K, φ′0(x) = a(x) exp(∫_{x0}^{x} a(t) dt) = a(x) φ0(x), (2.3)

where Lem. A.1 of the Appendix was used as well. In particular, φ0 is continuous. Since

φ0 ≠ 0 as well, φ0^{−1} is also continuous. Moreover, as b is continuous by hypothesis, φ0^{−1} b is continuous and, thus, Riemann integrable on [x0, x]. Once again, [Phi16, Th.

10.20(a)] applies, yielding φ to be differentiable with

φ′ : I −→ K,

φ′(x) = φ′0(x) (c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt) + φ0(x) φ0(x)^{−1} b(x)

= a(x) φ0(x) (c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt) + b(x) = a(x) φ(x) + b(x), (2.4)

x0

where the product rule of [Phi16, Th. 9.7(c)] was used as well. Comparing (2.4) with

(2.1) shows φ is a solution to (2.1). The computation

φ(x0) = φ0(x0) (c + 0) = e^0 · c = c (2.5)

verifies that φ satisfies the desired initial condition. It remains to prove uniqueness. To

this end, let ψ : I −→ K be an arbitrary differentiable function that satisfies (2.1) as

well as the initial condition ψ(x0) = c. We have to show ψ = φ. Since φ0 ≠ 0, we can define u := ψ/φ0 and then have to verify

∀x∈I: u(x) = c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt. (2.6)

We obtain

u′ = (ψ′ φ0 − ψ φ′0)/φ0^2 = ((a ψ + b) φ0 − ψ a φ0)/φ0^2 = φ0^{−1} b. (2.7)

Thus, the fundamental theorem of calculus in the form [Phi16, Th. 10.20(b)] implies

∀x∈I: u(x) = u(x0) + ∫_{x0}^{x} u′(t) dt = c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt, (2.8)

proving (2.6) and, thus, ψ = φ0 u = φ.

Corollary 2.4. Let I ⊆ R be an open interval and let a : I −→ K be continuous. Moreover, let x0 ∈ I and c ∈ K. Then the homogeneous linear ODE (2.1) (i.e. with b ≡ 0) has a unique solution φ : I −→ K that satisfies the initial condition y(x0) = c. This unique solution is given by

φ(x) = c exp(∫_{x0}^{x} a(t) dt) = c e^{∫_{x0}^{x} a(t) dt}. (2.9)

Remark 2.5. The name variation of constants for Th. 2.3 can be understood from

comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2)

of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the constant c, i.e. by replacing it with the function x ↦ c + ∫_{x0}^{x} φ0(t)^{−1} b(t) dt.

Example 2.6. Consider the ODE

y′ = 2xy + x^3 (2.10)

with initial condition y(0) = c, c ∈ C. Comparing (2.10) with Def. 2.2, we observe we are facing an inhomogeneous linear ODE with

a : R −→ R, a(x) := 2x, (2.11a)

b : R −→ R, b(x) := x^3. (2.11b)

From Cor. 2.4, we obtain the solution φ0,c to the homogeneous version of (2.10):

φ0,c : R −→ C, φ0,c(x) = c exp(∫_0^x a(t) dt) = c e^{x^2}. (2.12)

Th. 2.3 then yields the solution to the initial value problem:

φ : R −→ C,

φ(x) = e^{x^2} (c + ∫_0^x e^{−t^2} t^3 dt) = e^{x^2} (c + [−(1/2)(t^2 + 1) e^{−t^2}]_0^x)

= e^{x^2} (c + 1/2 − (1/2)(x^2 + 1) e^{−x^2}) = (c + 1/2) e^{x^2} − (1/2)(x^2 + 1). (2.13)
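The closed form obtained in (2.13) can be checked by differentiation; a Python sketch (not part of the notes; the value of c is an arbitrary sample):

```python
import math

c = 1.7   # arbitrary sample value for the initial datum y(0) = c

def phi(x):
    # the closed form from (2.13)
    return (c + 0.5) * math.exp(x * x) - (x * x + 1) / 2

def dphi(x):
    # its exact derivative
    return (c + 0.5) * 2 * x * math.exp(x * x) - x

# phi' - (2x*phi + x^3) should vanish identically:
residual = max(abs(dphi(x) - (2 * x * phi(x) + x ** 3))
               for x in (-2.0, -0.5, 0.0, 1.0, 2.0))
ic_ok = abs(phi(0.0) - c) < 1e-12
```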


2.3 Separation of Variables

If the ODE (1.21a) has the particular form

y′ = f(x) g(y), (2.14)

it can be solved by a method known as separation of variables:

Theorem 2.7. Let I, J ⊆ R be (bounded or unbounded) open intervals and suppose that

f : I −→ R and g : J −→ R are continuous with g(y) 6= 0 for each y ∈ J. For each

(x0 , y0 ) ∈ I × J, consider the initial value problem consisting of the ODE (2.14) together

with the initial condition

y(x0 ) = y0 . (2.15)

Define the functions

F : I −→ R, F(x) := ∫_{x0}^{x} f(t) dt, G : J −→ R, G(y) := ∫_{y0}^{y} dt/g(t). (2.16)

(a) If I′ ⊆ I is a nontrivial open interval with x0 ∈ I′ and F(I′) ⊆ G(J), then the initial value problem consisting of (2.14) and (2.15) has a unique solution on I′. This unique solution is given by

φ : I′ −→ R, φ(x) := G^{−1}(F(x)). (2.17)

(b) There exists a nontrivial open interval I′ ⊆ I with x0 ∈ I′ and F(I′) ⊆ G(J), i.e. an I′ such that (a) applies.

Proof. (a): We begin by proving G has a differentiable inverse function G−1 : G(J) −→

J. According to the fundamental theorem of calculus [Phi16, Th. 10.20(a)], G is dif-

ferentiable with G′ = 1/g. Since g is continuous and nonzero, G is even C 1 . If

G′ (y0 ) = 1/g(y0 ) > 0, then G is strictly increasing on J (due to the intermediate

value theorem [Phi16, Th. 7.57]; g(y0 ) > 0, the continuity of g, and g 6= 0 imply that

g > 0 on J). Analogously, if G′ (y0 ) = 1/g(y0 ) < 0, then G is strictly decreasing on J.

In each case, G has a differentiable inverse function on G(J) by [Phi16, Th. 9.9].

In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and

(2.15). The assumption F (I ′ ) ⊆ G(J) and the existence of G−1 as shown above provide

that φ is well-defined by (2.17). Verifying (2.15) is quite simple: φ(x0 ) = G−1 (F (x0 )) =

G−1 (0) = y0 . To see φ to be a solution of (2.14), notice that (2.17) implies F = G ◦ φ

on I ′ . Thus, we can apply the chain rule to obtain the derivative of F = G ◦ φ on I ′ :

∀x∈I′: f(x) = F′(x) = G′(φ(x)) φ′(x) = φ′(x)/g(φ(x)), (2.18)

i.e. φ′(x) = f(x) g(φ(x)), showing φ satisfies (2.14) on I′.


We now proceed to show that each solution φ : I ′ −→ R to (2.14) that satisfies (2.15)

must also satisfy (2.17). Since φ is a solution to (2.14),

φ′(x)/g(φ(x)) = f(x) for each x ∈ I′. (2.19)

Integrating both sides yields

∫_{x0}^{x} φ′(t)/g(φ(t)) dt = ∫_{x0}^{x} f(t) dt = F(x) for each x ∈ I′. (2.20)

Using the change of variables formula of [Phi16, Th. 10.25] in the left-hand side of (2.20),

allows one to replace φ(t) by the new integration variable u (note that each solution

φ : I ′ −→ R to (2.14) is in C 1 (I ′ ) since f and g are presumed continuous). Thus, we

obtain from (2.20):

F(x) = ∫_{φ(x0)}^{φ(x)} du/g(u) = ∫_{y0}^{φ(x)} du/g(u) = G(φ(x)) for each x ∈ I′. (2.21)

Applying G^{−1} to (2.21) shows φ satisfies (2.17).

(b): During the proof of (a), we have already seen G to be either strictly increasing

or strictly decreasing. As G(y0 ) = 0, this implies the existence of ǫ > 0 such that

] − ǫ, ǫ[⊆ G(J). The function F is differentiable and, in particular, continuous. Since

F (x0 ) = 0, there is δ > 0 such that, for I ′ :=]x0 −δ, x0 +δ[, one has F (I ′ ) ⊆]−ǫ, ǫ[⊆ G(J)

as desired.

Example 2.8. Consider the initial value problem for the ODE

y′ = −y/x on I × J := R+ × R+ (2.22)

with the initial condition y(1) = c for some given c ∈ R+. Introducing the functions

f : R+ −→ R, f(x) := −1/x, g : R+ −→ R, g(y) := y, (2.23)

one sees that Th. 2.7 applies. To compute the solution φ = G^{−1} ◦ F, we first have to determine F and G:

F : R+ −→ R, F(x) = ∫_{1}^{x} f(t) dt = −∫_{1}^{x} dt/t = −ln x, (2.24a)

G : R+ −→ R, G(y) = ∫_{c}^{y} dt/g(t) = ∫_{c}^{y} dt/t = ln(y/c). (2.24b)

Since G(J) = G(R+) = R, we have F(I) ⊆ G(J), so that Th. 2.7(a) applies with I′ = I = R+, i.e. the solution is defined on the entire interval I. The inverse function of G is given by

G^{−1} : R −→ R+, G^{−1}(t) = c e^t. (2.25)


Finally, we get

φ : R+ −→ R, φ(x) = G^{−1}(F(x)) = c e^{−ln x} = c/x. (2.26)

The uniqueness part of Th. 2.7 further tells us the above initial value problem can have

no solution different from φ.
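A quick check of (2.26) by differentiation (Python sketch, not part of the notes; the value of c is an arbitrary sample):

```python
c = 3.0   # sample initial value y(1) = c

def phi(x):
    # the solution (2.26)
    return c / x

def dphi(x):
    return -c / (x * x)   # exact derivative

# phi' + phi/x should vanish on R+:
residual = max(abs(dphi(x) + phi(x) / x) for x in (0.5, 1.0, 2.0, 10.0))
```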

—

The advantage of using Th. 2.7 as in the previous example, by computing the relevant

functions F, G, and G^{−1}, is that it is mathematically rigorous. In particular, one can be

sure one has found the unique solution to the ODE with initial condition. However, in

practice, it is often easier to use the following heuristic (not entirely rigorous) procedure.

In the end, in most cases, one can easily check by differentiation that the function found

is, indeed, a solution to the ODE with initial condition. However, one does not know

uniqueness without further investigations (general results such as Th. 3.15 below can

often help). One also has to determine on which interval the found solution is defined.

On the other hand, as one is usually interested in choosing the interval as large as

possible, the optimal choice is not always obvious when using Th. 2.7, either.

The heuristic procedure is as follows: Start with the ODE (2.14) written in the form

dy/dx = f(x) g(y). (2.27a)

Multiply by dx and divide by g(y) (i.e. separate the variables):

dy/g(y) = f(x) dx. (2.27b)

Integrate:

∫ dy/g(y) = ∫ f(x) dx. (2.27c)

Change the integration variables and supply the appropriate upper and lower limits for the integrals (according to the initial condition):

∫_{y0}^{y} dt/g(t) = ∫_{x0}^{x} f(t) dt. (2.27d)

Solve this equation for y, set φ(x) := y, check by differentiation that φ is, indeed, a

solution to the ODE, and determine the largest interval I ′ such that x0 ∈ I ′ and such

that φ is defined on I ′ . The use of this heuristic procedure is demonstrated by the

following example:

Example 2.9. Consider the ODE

y′ = −y^2 on I × J := R × R (2.28)


with the initial condition y(x0 ) = y0 for given values x0 , y0 ∈ R. We manipulate (2.28)

according to the heuristic procedure described in (2.27) above:

dy/dx = −y^2 ⟹ −y^{−2} dy = dx ⟹ ∫ −y^{−2} dy = ∫ dx,

−∫_{y0}^{y} t^{−2} dt = ∫_{x0}^{x} dt ⟹ [1/t]_{y0}^{y} = 1/y − 1/y0 = [t]_{x0}^{x} = x − x0,

and solving for y yields

φ(x) = y = y0/(1 + (x − x0) y0). (2.29)

Differentiating,

φ′(x) = −y0^2/(1 + (x − x0) y0)^2 = −(φ(x))^2, (2.30)

confirms that φ does solve (2.28). If y0 = 0, then φ ≡ 0 is defined on the entire interval I = R. If y0 ≠ 0, then the denominator of φ(x) has a zero at

x = (x0 y0 − 1)/y0 , and φ is not defined on all of R. In that case, if y0 > 0, then

x0 > (x0 y0 − 1)/y0 = x0 − 1/y0 and the maximal open interval for φ to be defined on

is I ′ =]x0 − 1/y0 , ∞[; if y0 < 0, then x0 < (x0 y0 − 1)/y0 = x0 − 1/y0 and the maximal

open interval for φ to be defined on is I ′ =] − ∞, x0 − 1/y0 [. Note that the formula for φ

obtained by (2.29) works for y0 = 0 as well, even though not every previous expression

in (2.29) is meaningful for y0 = 0 and, also, Th. 2.7 does not apply to (2.28) for y0 = 0.

In the present example, the subsequent Th. 3.15 does, indeed, imply φ to be the unique

solution to the initial value problem on I ′ .
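The solution formula (2.29) and the maximal intervals just discussed can be packaged as a small Python sketch (illustrative, not part of the notes):

```python
def solve(x0, y0):
    # phi from (2.29) together with its maximal open interval of
    # existence, as discussed above.
    phi = lambda x: y0 / (1 + (x - x0) * y0)
    if y0 > 0:
        interval = (x0 - 1 / y0, float('inf'))
    elif y0 < 0:
        interval = (float('-inf'), x0 - 1 / y0)
    else:
        interval = (float('-inf'), float('inf'))
    return phi, interval

phi, I = solve(0.0, 1.0)      # solution 1/(1+x), blow-up at x = -1
residual = max(abs((phi(x + 1e-6) - phi(x - 1e-6)) / 2e-6 + phi(x) ** 2)
               for x in (-0.5, 0.0, 2.0))
```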

2.4 Change of Variables

To solve an ODE, it can be useful to transform it into an equivalent ODE, using a

so-called change of variables. If one already knows how to solve the transformed ODE,

then the equivalence allows one to also solve the original ODE. We first present the

following Th. 2.10, which constitutes the base for the change of variables technique,

followed by examples, where the technique is applied.

Theorem 2.10 (Change of Variables). Let n ∈ N, let G ⊆ R × Kn be open, let f : G −→ Kn, and let T : G −→ Kn be differentiable. Define

∀x∈R: Gx := {y ∈ Kn : (x, y) ∈ G}, (2.31)

and assume that

∀x∈R, Gx≠∅: Tx := T(x, ·) : Gx −→ Tx(Gx), Tx(y) := T(x, y), is a diffeomorphism, (2.32)


i.e. Tx is invertible and both Tx and Tx⁻¹ are differentiable. Then the first-order initial value problems

y′ = f(x, y), (2.33a)

y(x0) = y0, (2.33b)

and

y′ = (DTx⁻¹(y))⁻¹ f(x, Tx⁻¹(y)) + ∂xT(x, Tx⁻¹(y)), (2.34a)

y(x0) = T(x0, y0), (2.34b)

are equivalent in the following sense:

(a) A differentiable function φ : I −→ Kn, where I ⊆ R is a nontrivial interval, is a solution to (2.33a) if, and only if, the function

µ : I −→ Kn, µ(x) := (Tx ◦ φ)(x) = T(x, φ(x)), (2.35)

is a solution to (2.34a).

(b) A differentiable function φ : I −→ Kn , where I ⊆ R is a nontrivial interval, is a

solution to (2.33) if, and only if, the function of (2.35) is a solution to (2.34).

Proof. We start by noting that the assumption of G being open clearly implies each Gx ,

x ∈ R, to be open as well, which, in turn, implies Tx (Gx ) to be open, even though this

is not as obvious¹. Next, for each x ∈ R such that Gx ≠ ∅, we can apply the chain rule

[Phi15, Th. 2.28] to Tx ◦ Tx⁻¹ = Id to obtain

∀y∈Tx(Gx): DTx(Tx⁻¹(y)) ◦ DTx⁻¹(y) = Id (2.36)

and, thus,

∀y∈Tx(Gx): DTx⁻¹(y) = (DTx(Tx⁻¹(y)))⁻¹. (2.37)

If φ and µ are related via (2.35), then

∀x∈I: φ(x) = Tx⁻¹(µ(x)), (2.38)

which also yields, via the chain rule,

∀x∈I: µ′(x) = DT(x, φ(x)) (1, φ′(x)) = DTx(φ(x)) φ′(x) + ∂xT(x, φ(x)). (2.39)

¹If Tx is a continuously differentiable map, then this is related to the inverse function theorem (see, e.g., [Phi15, Cor. C.9]); it is still true if Tx is merely continuous and injective, but then it is the invariance of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-point theorem [Oss09, 5.6.10], and is much harder to prove.


Now, if φ is a solution to (2.33a), then, for each x ∈ I,

µ′(x) = DTx(φ(x)) f(x, φ(x)) + ∂xT(x, φ(x))   (by (2.39), (2.33a))

= DTx(Tx⁻¹(µ(x))) f(x, Tx⁻¹(µ(x))) + ∂xT(x, Tx⁻¹(µ(x)))   (by (2.38))

= (DTx⁻¹(µ(x)))⁻¹ f(x, Tx⁻¹(µ(x))) + ∂xT(x, Tx⁻¹(µ(x))),   (by (2.37)) (2.40)

i.e. µ is a solution to (2.34a). Conversely, if µ is a solution to (2.34a), then, for

each x ∈ I, (2.34a) and (2.39) yield

(DTx⁻¹(µ(x)))⁻¹ f(x, Tx⁻¹(µ(x))) + ∂xT(x, Tx⁻¹(µ(x))) = µ′(x) = DTx(φ(x)) φ′(x) + ∂xT(x, φ(x)). (2.41)

Using (2.38), one can subtract the second summand from (2.41). Multiplying the result

by DTx−1 (µ(x)) from the left and taking into account (2.37) then provides

∀x∈I: φ′(x) = f(x, Tx⁻¹(µ(x))) = f(x, φ(x)), (2.42)

where the last equality is due to (2.38), i.e. φ is a solution to (2.33a).

It remains to prove (b). If φ satisfies (2.33), then µ satisfies (2.34a) by (a). Moreover, µ(x0) = T(x0, φ(x0)) = T(x0, y0), i.e. µ satisfies (2.34b) as well. Conversely, assume µ satisfies (2.34). Then φ satisfies (2.33a) by (a). Moreover, by (2.38), φ(x0) = Tx0⁻¹(µ(x0)) = Tx0⁻¹(T(x0, y0)) = y0, showing φ satisfies (2.33b) as well.

As a first application of Th. 2.10, we prove the following theorem about so-called

Bernoulli differential equations:

Theorem 2.11. Consider the Bernoulli ODE

y′ = f(x, y) := a(x) y + b(x) y^α, (2.43a)

where α ∈ R \ {0, 1}, the functions a, b : I −→ R are continuous and defined on an open interval I ⊆ R, and f : I × R+ −→ R. For (2.43a), we add the initial condition

y(x0) = y0, x0 ∈ I, y0 ∈ R+, (2.43b)

and, furthermore, we also consider the corresponding linear initial value problem

y′ = (1 − α) (a(x) y + b(x)), (2.44a)

y(x0) = y0^{1−α}. (2.44b)


(a) If ψ : I −→ R is a solution to the linear initial value problem (2.44) and I′ ⊆ I is an open interval such that x0 ∈ I′ and ψ > 0 on I′, then, on I′, the Bernoulli initial value problem (2.43) has a unique solution. This unique solution is given by

φ : I′ −→ R⁺,  φ(x) := (ψ(x))^{1/(1−α)}.    (2.45)

(b) There exists an open interval I′ ⊆ I such that x0 ∈ I′ and ψ > 0 on I′, i.e. an I′ such that (a) applies.

Proof. (b) is immediate from Th. 2.3, since ψ(x0) = y0^{1−α} > 0 and ψ is continuous.

To prove (a), we apply Th. 2.10 with the change of variables

T : I × R⁺ −→ R,  T(x, y) := S(y) := y^{1−α},  i.e. Tx = S for each x ∈ I.

Then S⁻¹(y) = y^{1/(1−α)} and DS⁻¹(y) = (S⁻¹)′(y) = (1/(1−α)) y^{α/(1−α)}. Thus, (2.34a) takes the form

y′ = (DTx⁻¹(y))⁻¹ f(x, Tx⁻¹(y)) + ∂xT(x, Tx⁻¹(y))
   = (1 − α) y^{−α/(1−α)} (a(x) y^{1/(1−α)} + b(x) y^{α/(1−α)}) + 0
   = (1 − α) (a(x) y + b(x)).    (2.48)

Thus, if I ′ ⊆ I is such that x0 ∈ I ′ and ψ > 0 on I ′ , then Th. 2.10 says φ defined

by (2.45) must be a solution to (2.43) (note that the differentiability of ψ implies the

differentiability of φ). On the other hand, if λ : I ′ −→ R+ is an arbitrary solution to

(2.43), then Th. 2.10 states µ := S ◦ λ = λ^{1−α} to be a solution to (2.44). The uniqueness part of Th. 2.3 then yields λ^{1−α} = ψ↾I′ = φ^{1−α}, i.e. λ = φ.
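The substitution underlying the Bernoulli theorem is easy to check numerically. The following sketch uses the illustrative data a ≡ 1, b ≡ −1, α = 2, y(0) = 0.1 (a concrete choice made for this sketch, not taken from the text): the transformed linear problem (2.44) is solved in closed form, transformed back via (2.45), and compared with a direct Runge-Kutta integration of the original equation.

```python
import math

# Illustrative Bernoulli IVP (data chosen for this sketch, not from the text):
# y' = y - y**2, y(0) = 0.1, i.e. a(x) = 1, b(x) = -1, alpha = 2.
# The substitution z = y**(1 - alpha) = 1/y yields the linear IVP (2.44):
# z' = (1 - alpha)(a(x) z + b(x)) = -z + 1, z(0) = 0.1**(1 - alpha) = 10,
# with closed-form solution z(x) = 1 + (z0 - 1) exp(-x).

ALPHA, X0, Y0 = 2, 0.0, 0.1

def phi(x):
    """Solution obtained via (2.45): transform the linear solution back."""
    z0 = Y0 ** (1 - ALPHA)
    z = 1 + (z0 - 1) * math.exp(-(x - X0))
    return z ** (1 / (1 - ALPHA))

def rk4(f, x0, y0, x1, n=10000):
    """Classical Runge-Kutta integration of y' = f(x, y), for comparison."""
    h, x, y = (x1 - x0) / n, x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y

y_num = rk4(lambda x, y: y - y ** 2, X0, Y0, 3.0)
print(abs(y_num - phi(3.0)))  # both methods agree up to discretization error
```

The agreement of the two values is, of course, no proof, but it is a quick sanity check of the transformation.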

As a further example, consider the initial value problem

y′ = f(x, y) := i − 1/(ix − y + 2),    (2.49a)
y(1) = i,    (2.49b)

where f : G −→ C is defined on the open set G := {(x, y) ∈ R × C : ix − y + 2 ≠ 0} (open as the preimage of the open set C \ {0} under the continuous map (x, y) ↦ ix − y + 2). We apply the change of variables

T : G −→ C,  T(x, y) := ix − y.    (2.50)


For each x ∈ R, Tx(y) = ix − y is a bijection with

Tx⁻¹ : C \ {−2} −→ C \ {ix + 2},  Tx⁻¹(y) = ix − y,    (2.52b)

and DTx⁻¹ ≡ −1, such that the right-hand side of the transformed ODE (2.34a) becomes

(DTx⁻¹(y))⁻¹ f(x, Tx⁻¹(y)) + ∂xT(x, Tx⁻¹(y)) = (−1) · (i − 1/(y + 2)) + i = 1/(y + 2).    (2.53)

Thus, the transformed initial value problem is

y′ = 1/(y + 2),    (2.54a)
y(1) = T(1, i) = i − i = 0.    (2.54b)

Solving (2.54), e.g. by separation of variables, yields the solution

µ : ]−1, ∞[ −→ ]−2, ∞[,  µ(x) := √(2x + 2) − 2,    (2.55)

such that

φ : ]−1, ∞[ −→ C,  φ(x) := Tx⁻¹(µ(x)) = ix − √(2x + 2) + 2,    (2.56)

is a solution to (2.49) (that φ is a solution to (2.49) can now also easily be checked directly). It will become clear from Th. 3.15 below that φ and µ are also the unique solutions to their respective initial value problems.
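The direct check mentioned after (2.56) can also be delegated to a few lines of code; the following sketch compares a central difference quotient of φ with the right-hand side of (2.49a) at some sample points:

```python
import cmath

# Numerical spot-check that phi(x) = i*x - sqrt(2x + 2) + 2 solves
# y' = i - 1/(i*x - y + 2) with y(1) = i, at sample points of ]-1, infinity[.

def phi(x):
    return 1j * x - cmath.sqrt(2 * x + 2) + 2

def f(x, y):
    return 1j - 1 / (1j * x - y + 2)

assert abs(phi(1.0) - 1j) < 1e-12                # initial condition (2.49b)
for x in [-0.5, 0.0, 1.0, 3.0]:
    h = 1e-6
    dphi = (phi(x + h) - phi(x - h)) / (2 * h)   # central difference quotient
    assert abs(dphi - f(x, phi(x))) < 1e-4       # ODE (2.49a) holds approximately
print("ok")
```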

—

Finding a suitable change of variables to transform a given ODE such that one is in a position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a useful transformation, and it takes a lot of practice and experience.

Remark 2.13. Somewhat analogous to the situation described in the paragraph before

(2.27) regarding the separation of variables technique, in practice, one frequently uses a

heuristic procedure to apply a change of variables, rather than appealing to the rigorous

Th. 2.10. For the initial value problem y ′ = f (x, y), y(x0 ) = y0 , this heuristic procedure

proceeds as follows:

(1) One introduces the new variable z := T (x, y) and then computes z ′ , i.e. the deriva-

tive of the function x 7→ z(x) = T (x, y(x)).

(2) In the result of (1), one eliminates all occurrences of the variable y by first replacing

y ′ by f (x, y) and then replacing y by Tx−1 (z), where Tx (y) := T (x, y) = z (i.e. one has

to solve the equation z = T(x, y) for y). One thereby obtains the transformed initial value problem z′ = g(x, z), z(x0) = T(x0, y0), with a suitable function g.


(3) One solves the transformed initial value problem to obtain a solution µ, and then

−1

x 7→ φ(x) := Tx µ(x) yields a candidate for a solution to the original initial value

problem.

(4) One checks that φ is, indeed, a solution to y ′ = f (x, y), y(x0 ) = y0 .

Consider now

f : R⁺ × R −→ R,  f(x, y) := 1 + y/x + y²/x²,    (2.57)

and the initial value problem

y′ = f(x, y),  y(1) = 0.    (2.58)

We introduce the change of variables z := T (x, y) := y/x and proceed according to the

steps of Rem. 2.13. According to (1), we compute, using the quotient rule,

z′(x) = (y′(x) x − y(x)) / x².    (2.59)

According to (2), we replace y′(x) by f(x, y) and then replace y by Tx⁻¹(z) = xz to obtain the transformed initial value problem

z′ = (1 + y/x + y²/x²) (1/x) − y/x² = (1 + z + z²) (1/x) − z/x = (1 + z²)/x,  z(1) = 0/1 = 0.    (2.60)

According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the

solution

µ : ]e^{−π/2}, e^{π/2}[ −→ R,  µ(x) := tan ln x,    (2.61)

of (2.60), and

φ : ]e^{−π/2}, e^{π/2}[ −→ R,  φ(x) := x µ(x) = x tan ln x,    (2.62)

as a candidate for a solution to (2.58). Finally, according to (4), we check that φ is,

indeed, a solution to (2.58): Due to φ(1) = 1 · tan 0 = 0, φ satisfies the initial condition,

and due to

φ′(x) = tan ln x + x · (1/x) (1 + tan² ln x) = 1 + tan ln x + tan² ln x = 1 + φ(x)/x + φ²(x)/x²,    (2.63)

φ satisfies the ODE.
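The check just performed by hand can be replicated numerically; the following sketch compares a difference quotient of φ(x) = x tan ln x with f(x, φ(x)) at a few points of the interval from (2.62):

```python
import math

# Numerical spot-check that phi(x) = x * tan(ln x) solves
# y' = 1 + y/x + y**2/x**2 with y(1) = 0 on ]exp(-pi/2), exp(pi/2)[.

def phi(x):
    return x * math.tan(math.log(x))

def f(x, y):
    return 1 + y / x + (y / x) ** 2

assert phi(1.0) == 0.0                           # initial condition (2.58)
for x in [0.5, 0.9, 1.5, 2.0]:
    h = 1e-6
    dphi = (phi(x + h) - phi(x - h)) / (2 * h)   # central difference quotient
    assert abs(dphi - f(x, phi(x))) < 1e-5       # the ODE holds approximately
print("ok")
```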


3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE

It turns out that each one-dimensional kth-order ODE is equivalent to a system of k

first-order ODE; more generally, that each n-dimensional kth-order ODE is equivalent

to a kn-dimensional first-order ODE (i.e. to a system of kn one-dimensional first-order

ODE). Even though, in this class, we will mainly consider explicit ODE, we provide the

equivalence also for the implicit case, as the proof is essentially the same (the explicit

case is included as a special case).

Theorem 3.1. In the situation of Def. 1.2(a), i.e. U ⊆ R × K(k+1)n and F : U −→ Kn ,

plus (x0, y0,0, …, y0,k−1) ∈ R × K^{kn}, consider the kth-order initial value problem

F(x, y, y′, …, y^{(k)}) = 0,    (3.1a)
∀ j ∈ {0, …, k−1}:  y^{(j)}(x0) = y0,j,    (3.1b)

as well as the first-order initial value problem

y1′ − y2 = 0,
y2′ − y3 = 0,
  ⋮    (3.2a)
y′_{k−1} − yk = 0,
F(x, y1, …, yk, yk′) = 0,

y(x0) = (y0,0, …, y0,k−1)ᵀ    (3.2b)

(note that the unknown function y in (3.1) is Kn-valued, whereas the unknown function y in (3.2) is K^{kn}-valued). Then both initial value problems are equivalent in the following sense:

(a) If φ : I −→ Kn is a solution to (3.1), defined on an interval I ⊆ R, then

Φ : I −→ K^{kn},  Φ := (φ, φ′, …, φ^{(k−1)})ᵀ,    (3.3)

is a solution to (3.2).

(b) If Φ = (Φ1, …, Φk)ᵀ : I −→ K^{kn} is a solution to (3.2), then φ := Φ1 is a solution to (3.1).


Proof. We first note that (3.2a) can be written as

G(x, y, y′) = 0,    (3.4)

where

G : V −→ K^{kn},  V := {(x, y, z) ∈ R × K^{kn} × K^{kn} : (x, y, zk) ∈ U} ⊆ R × K^{kn} × K^{kn},
G1(x, y, z) := z1 − y2,
G2(x, y, z) := z2 − y3,
  ⋮
G_{k−1}(x, y, z) := z_{k−1} − yk,
Gk(x, y, z) := F(x, y, zk),    (3.5)

with y = (y1, …, yk) and z = (z1, …, zk), where yj, zj ∈ Kn.

(a): Let φ be a solution to (3.1) and define Φ by (3.3). Then (3.1b) implies (3.2b), since

Φ(x0) = (φ(x0), φ′(x0), …, φ^{(k−1)}(x0))ᵀ = (y0,0, …, y0,k−1)ᵀ.

Moreover, (3.3) yields

{(x, Φ(x), Φ′(x)) : x ∈ I} = {(x, φ(x), …, φ^{(k−1)}(x), φ′(x), …, φ^{(k)}(x)) ∈ I × K^{kn} × K^{kn} : x ∈ I} ⊆ V,    (3.6)

since (x, φ(x), …, φ^{(k)}(x)) ∈ U for each x ∈ I. Furthermore, (3.3) and (3.5) imply

∀ j ∈ {1, …, k−1}  ∀ x ∈ I:  Gj(x, Φ(x), Φ′(x)) = φ^{(j)}(x) − φ^{(j)}(x) = 0

and, thus, by (3.1a),

∀ x ∈ I:  Gk(x, Φ(x), Φ′(x)) = F(x, φ(x), φ′(x), …, φ^{(k)}(x)) = 0,    (3.7)

showing Φ to be a solution to (3.2).

(b): As Φ is a solution to (3.2), the first k − 1 equations of (3.2a) imply, with φ := Φ1,

∀ j ∈ {1, …, k−1}:  Φ_{j+1} = Φj′ = φ^{(j)},

i.e. φ is k times differentiable and Φ has, once again, the form (3.3) (note Φ1 = φ by the definition of φ). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for Φ implies Def. 1.2(a)(i) for φ:

∀ x ∈ I:  F(x, φ(x), …, φ^{(k)}(x)) = Gk(x, Φ(x), Φ′(x)) = 0,

where (3.3) and (3.5) were used for the first equality.

As an example, consider the one-dimensional second-order initial value problem

y′′ = −y,  y(0) = 0,  y′(0) = r,  r ∈ R given,    (3.8)

which, according to Th. 3.1, is equivalent to the two-dimensional first-order initial value problem

y1′ = y2,  y2′ = −y1,  y(0) = (0, r)ᵀ.    (3.9)

The latter has the solution

Φ : R −→ R²,  Φ(x) = (Φ1(x), Φ2(x))ᵀ = (r sin x, r cos x)ᵀ,    (3.10)

such that φ : R −→ R, φ(x) := Φ1(x) = r sin x, is a solution to (3.8).
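The order reduction also underlies how such problems are solved in practice: one integrates the equivalent first-order system numerically. A minimal sketch (the RK4 integrator and the parameter value r = 2 are illustrative choices, not from the text):

```python
import math

# Integrate the first-order system (3.9), equivalent to (3.8), and compare
# with the exact solution Phi(x) = (r sin x, r cos x) from (3.10).

def F(x, Y):
    """Right-hand side of the system y1' = y2, y2' = -y1."""
    y1, y2 = Y
    return (y2, -y1)

def rk4_system(F, x0, Y0, x1, n=2000):
    h, x, Y = (x1 - x0) / n, x0, list(Y0)
    for _ in range(n):
        k1 = F(x, Y)
        k2 = F(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k1)])
        k3 = F(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k2)])
        k4 = F(x + h, [y + h * k for y, k in zip(Y, k3)])
        Y = [y + h / 6 * (a + 2 * b + 2 * c + d)
             for y, a, b, c, d in zip(Y, k1, k2, k3, k4)]
        x += h
    return Y

r = 2.0
y1, y2 = rk4_system(F, 0.0, (0.0, r), 1.0)
print(abs(y1 - r * math.sin(1.0)), abs(y2 - r * math.cos(1.0)))  # both tiny
```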

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE

(such as results regarding existence and uniqueness of solutions) for systems of first-

order ODE, obtaining the corresponding results for higher-order ODE as a corollary.

This is the strategy usually pursued in the literature and we will follow suit in this

class.

3.2 Existence of Solutions

It is a rather remarkable fact that, under the very mild assumption that f : G −→ Kn is a continuous function defined on an open subset G of R × K^{kn} with (x0, y0,0, …, y0,k−1) ∈


G, every initial value problem (1.7) for the n-dimensional explicit kth-order ODE (1.6)

has at least one solution φ : I −→ Kn , defined on a, possibly very small, open interval.

This is the content of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b),

we already know that uniqueness of the solution cannot be expected without stronger

hypotheses.

The proof of the Peano theorem requires some work. One of the key ingredients is

the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence

of continuous functions to have a uniformly convergent subsequence (the formulation

in Th. 3.7 is suitable for our purposes – many different variants of the Arzelà-Ascoli

theorem exist in the literature).

We begin with some preliminaries from the theory of metric spaces. At this point, the

reader might want to review the definition of a metric, a metric space, and basic notions

on metric spaces, such as the notion of compactness and the notion of continuity of

functions between metric spaces. Also recall that every normed space is a metric space

via the metric induced by the norm (in particular, if we use metric notions on normed

spaces, they are always meant with respect to the respective induced metric). If you

are not sufficiently familiar with metrics and norms, you might want to consult the

relevant subsections of [Phi15, Sec. 1]; for compactness and some related results see,

e.g., Appendix C.2.

Given a metric space (X, d), x ∈ X, and r > 0, we let Br(x) := {ξ ∈ X : d(x, ξ) < r} denote the open ball with center x and radius r, also known as the r-ball with center x.

Definition 3.4. Let (X, dX ) and (Y, dY ) be metric spaces. We say a sequence of func-

tions (fm )m∈N , fm : X −→ Y , converges uniformly to a function f : X −→ Y if, and

only if,

∀ ǫ > 0  ∃ N ∈ N  ∀ m ≥ N  ∀ x ∈ X:  dY(fm(x), f(x)) < ǫ.

Theorem 3.5. Let (X, dX ) and (Y, dY ) be metric spaces. If the sequence (fm )m∈N of

continuous functions fm : X −→ Y converges uniformly to the function f : X −→ Y ,

then f is continuous as well.

Proof. We have to show that f is continuous at every ξ ∈ X. Thus, let ξ ∈ X and ǫ > 0.

Due to the uniform convergence, we can choose m ∈ N such that dY fm (x), f (x) < ǫ/3

for every x ∈ X. Moreover, as fm is continuous at ξ, there exists δ > 0 such that

x ∈ Bδ (ξ) implies dY fm (ξ), fm (x) < ǫ/3. Thus, if x ∈ Bδ (ξ), then

dY f (ξ), f (x) ≤ dY f (ξ), fm (ξ) + dY fm (ξ), fm (x) + dY fm (x), f (x)

ǫ ǫ ǫ

< + + = ǫ,

3 3 3

proving f is continuous at ξ.


Definition 3.6. Let (X, dX ) and (Y, dY ) be metric spaces and let F be a set of functions

from X into Y . Then the set F (or the functions in F) are said to be uniformly

equicontinuous if, and only if, for each ǫ > 0, there is δ > 0 such that

∀ x, ξ ∈ X:  dX(x, ξ) < δ  ⇒  ∀ f ∈ F:  dY(f(x), f(ξ)) < ǫ.    (3.12)

Theorem 3.7 (Arzelà-Ascoli). Let n ∈ N, let ‖·‖ denote some norm on Kn, and let

I ⊆ R be some bounded interval. If (fm )m∈N is a sequence of functions fm : I −→ Kn

such that {fm : m ∈ N} is uniformly equicontinuous and such that, for each x ∈ I, the

sequence (fm(x))m∈N is bounded, then (fm)m∈N has a uniformly convergent subsequence (fmj)j∈N, i.e. there exists f : I −→ Kn such that

∀ ǫ > 0  ∃ N ∈ N  ∀ j ≥ N  ∀ x ∈ I:  ‖fmj(x) − f(x)‖ < ǫ.

Proof. Let (r1, r2, …) be an enumeration of the countable set Q ∩ I. Inductively, we construct a sequence (Fm)m∈N of subsequences of (fm)m∈N,

Fm = (fm,k)k∈N, such that

(i) for each m ∈ N, Fm+1 is a subsequence of Fm;

(ii) for each m ∈ N, Fm converges pointwise at each of the first m rational numbers rj; more precisely, there exists a sequence (z1, z2, …) in Kn such that, for each m ∈ N and each j ∈ {1, …, m}:

lim_{k→∞} fm,k(rj) = zj.

Actually, we construct the (zm )m∈N inductively together with the (Fm )m∈N : Since the

sequence (fm (r1 ))m∈N is, by hypothesis, a bounded sequence in Kn , one can apply the

Bolzano-Weierstrass theorem (cf. [Phi15, Th. 1.16(b)]) to obtain z1 ∈ Kn and a sub-

sequence F1 = (f1,k )k∈N of (fm )m∈N such that limk→∞ f1,k (r1 ) = z1 . To proceed by

induction, we now assume to have already constructed F1 , . . . , FM and z1 , . . . , zM for

M ∈ N such that (i) and (ii) hold for each m ∈ {1, . . . , M }. Since the sequence

(fM,k (rM +1 ))k∈N is a bounded sequence in Kn , one can, once more, apply the Bolzano-

Weierstrass theorem to obtain zM +1 ∈ Kn and a subsequence FM +1 = (fM +1,k )k∈N of

FM such that limk→∞ fM +1,k (rM +1 ) = zM +1 . Since FM +1 is a subsequence of FM , it is

also a subsequence of all previous subsequences, i.e. (i) now also holds for m = M + 1.

In consequence, limk→∞ fM +1,k (rj ) = zj for each j = 1, . . . , M + 1, such that (ii) now

also holds for m = M + 1 as required.

Next, one considers the diagonal sequence (gm )m∈N , gm := fm,m , and observes that this

sequence converges pointwise at each rational number rj (limm→∞ gm (rj ) = zj ), since,

at least for m ≥ j, (gm )m∈N is a subsequence of every Fj (exercise) – in particular,

(gm )m∈N is also a subsequence of the original sequence (fm )m∈N .


In the last step of the proof, we show that (gm )m∈N converges uniformly on the entire

interval I to some f : I −→ Kn . To this end, fix ǫ > 0. Since {gm : m ∈ N} ⊆ {fm :

m ∈ N}, the assumed uniform equicontinuity of {fm : m ∈ N} yields δ > 0 such that

∀ x, ξ ∈ I:  |x − ξ| < δ  ⇒  ∀ m ∈ N:  ‖gm(x) − gm(ξ)‖ < ǫ/3.

Since I is bounded, it has finite length and, thus, it can be covered with finitely many intervals I1, …, IN, I = ⋃_{j=1}^{N} Ij, N ∈ N, such that each Ij has length less than δ. Moreover, since Q is dense in R, for each j ∈ {1, …, N}, there exists k(j) ∈ N such that r_{k(j)} ∈ Ij. Define M := max{k(j) : j = 1, …, N}. We note that each of the finitely many sequences (gm(r1))m∈N, …, (gm(rM))m∈N converges and is, thus, a Cauchy sequence. Hence,

∃ K ∈ N  ∀ k, l ≥ K  ∀ α ∈ {1, …, M}:  ‖gk(rα) − gl(rα)‖ < ǫ/3.    (3.13)

We now consider an arbitrary x ∈ I and k, l ≥ K. Let j ∈ {1, . . . , N } such that x ∈ Ij .

Then rk(j) ∈ Ij , |rk(j) − x| < δ, and the estimate in (3.13) holds for α = k(j). In

consequence, we obtain the crucial estimate

∀ k, l ≥ K:  ‖gk(x) − gl(x)‖ ≤ ‖gk(x) − gk(r_{k(j)})‖ + ‖gk(r_{k(j)}) − gl(r_{k(j)})‖ + ‖gl(r_{k(j)}) − gl(x)‖ < ǫ/3 + ǫ/3 + ǫ/3 = ǫ.    (3.14)

The estimate (3.14) shows (gm(x))m∈N to be a Cauchy sequence for each x ∈ I, and we

can define

f : I −→ Kn , f (x) := lim gm (x). (3.15)

m→∞

Since K in (3.14) does not depend on x ∈ I, passing to the limit k → ∞ in the estimate

of (3.14) implies

∀ l ≥ K  ∀ x ∈ I:  ‖gl(x) − f(x)‖ ≤ ǫ,

proving uniform convergence of the subsequence (gm )m∈N of (fm )m∈N as desired. The

continuity of f is now a consequence of Th. 3.5.

At this point, we have all preparations in place to state and prove the existence theorem.

Theorem 3.8 (Peano). If G ⊆ R×Kn is open, n ∈ N, and f : G −→ Kn is continuous,

then, for each (x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem

y′ = f(x, y),    (3.16a)
y(x0) = y0,    (3.16b)

has at least one solution. More precisely, given an arbitrary norm ‖·‖ on Kn, (3.16) has a solution φ : I −→ Kn, defined on the open interval

I := ]x0 − α, x0 + α[,    (3.17)

where b > 0 is chosen such that

B := {(x, y) ∈ R × Kn : |x − x0| ≤ b and ‖y − y0‖ ≤ b} ⊆ G,    (3.18)

M := max{‖f(x, y)‖ : (x, y) ∈ B},    (3.19)

and

α := α(b) := min{b, b/M} for M > 0,  α := α(b) := b for M = 0.    (3.20)

In general, the choice of the norm ‖·‖ on Kn will influence the possible sizes of α and,

thus, of I.

Proof. The proof will be conducted in several steps. In the first step, we check α =

α(b) > 0 is well-defined: Since G is open, there always exists b > 0 such that (3.18)

holds. Since B is a closed and bounded subset of the finite-dimensional space R × Kn ,

B is compact (cf. [Phi15, Cor. 3.5]). Since f and, thus, ‖f‖ is continuous (every norm is even Lipschitz continuous due to the inverse triangle inequality), it must assume its maximum on the compact set B (cf. [Phi15, Th. 3.8]), showing M ∈ R₀⁺ is well-defined

by (3.19) and α is well-defined by (3.20).

In the second step of the proof, we note that it suffices to prove (3.16) has a solution φ+ ,

defined on [x0 , x0 + α[: One can then apply the time reversion Lem. 1.9(b): The proof

providing the solution φ+ also provides a solution ψ+ : [−x0 , −x0 + α[−→ Kn to the

time-reversed initial value problem, consisting of y ′ = −f (−x, y) and y(−x0 ) = y0 (note

that the same M and α work for the time-reversed problem). Then, according to Lem.

1.9(b), φ− : ]x0 − α, x0 ] −→ Kn , φ− (x) := ψ+ (−x), is a solution to (3.16). According to

Lem. 1.7, we can patch φ− and φ+ together to obtain the desired solution

φ : I −→ Kn,  φ(x) := φ−(x) for x ≤ x0,  φ(x) := φ+(x) for x ≥ x0,    (3.21)

defined on all of I. It is noted that one can also conduct the proof with the second step

omitted, but then one has to perform the following steps on all of I, which means one

has to consider additional cases in some places.

In the third step of the proof, we will define a sequence (φm)m∈N of functions

φm : I+ −→ Kn,  I+ := [x0, x0 + α[,    (3.22)

m ∈ N. Since B is compact and f is continuous, we know f is even uniformly continuous on B (cf. [Phi15, Th. 3.9]). In particular, for each m ∈ N,

∃ δm > 0  ∀ (x, y), (x̃, ỹ) ∈ B:  (|x − x̃| < δm and ‖y − ỹ‖ < δm)  ⇒  ‖f(x, y) − f(x̃, ỹ)‖ < 1/m.    (3.23)


We now form what is called a discretization of the interval I+ , i.e. a partition of I+ into

sufficiently many small intervals: Let N ∈ N and

x0 < x1 < · · · < xN −1 < xN := x0 + α (3.24)

such that

∀ j ∈ {1, …, N}:  xj − x_{j−1} < β := min{δm, δm/M, 1/m} for M > 0,  β := min{δm, 1/m} for M = 0    (3.25)

(for example one could make the equidistant choice xj := x0 + jh with h = α/N and

N > α/β, but it does not matter how the xj are defined as long as (3.24) and (3.25)

both hold). Note that we get a different discretization of I+ for each m ∈ N; however,

the dependence on m is suppressed in the notation for the sake of readability. We now

define recursively

φm : I+ −→ Kn,
φm(x0) := y0,    (3.26)
φm(x) := φm(xj) + (x − xj) f(xj, φm(xj))  for each x ∈ [xj, x_{j+1}], j ∈ {0, …, N−1}.

Note that there is no conflict between the two definitions given for x = xj with j ∈

{1, . . . , N − 1}. Each function φm defines a polygon in Kn . This construction is known

as Euler’s method and it can be used to obtain numerical approximations to the solution

of the initial value problem (while simple, this method is not very efficient, though). We

still need to verify that the definition (3.26) does actually make sense: We need to check

that f can, indeed, be applied to (xj , φm (xj )), i.e. we have to check (xj , φm (xj )) ∈ G.

We can actually show the stronger statement

∀ x ∈ I+:  (x, φm(x)) ∈ B,    (3.27)

where B is as defined in (3.18). First, it is pointed out that (3.20) implies α ≤ b, such

that x ∈ I+ implies |x − x0 | ≤ α ≤ b as required in (3.18). One can now prove (3.27)

by showing by induction on j ∈ {0, . . . , N − 1}:

∀ x ∈ [xj, x_{j+1}]:  (x, φm(x)) ∈ B.    (3.28)

To start the induction, note φm (x0 ) = y0 and (x0 , y0 ) ∈ B by (3.18). Now let j ∈

{0, . . . , N − 1} and x ∈ [xj , xj+1 ]. We estimate

‖φm(x) − y0‖ ≤ ‖φm(x) − φm(xj)‖ + Σ_{k=1}^{j} ‖φm(xk) − φm(x_{k−1})‖
             = (x − xj) ‖f(xj, φm(xj))‖ + Σ_{k=1}^{j} (xk − x_{k−1}) ‖f(x_{k−1}, φm(x_{k−1}))‖    (by (3.26))
             ≤(∗) (x − xj) M + Σ_{k=1}^{j} (xk − x_{k−1}) M = (x − x0) M
             ≤ α M ≤ b,    (3.29)

where (3.20) was used for the last estimate,


and where, at (∗), it was used that (xk, φm(xk)) ∈ B by induction hypothesis for each k = 0, …, j and, thus, ‖f(xk, φm(xk))‖ ≤ M by (3.19). Estimate (3.29) completes the induction and the third step of the proof.
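The construction (3.26) is exactly Euler's method. A minimal sketch with equidistant nodes (a special case of the general partition above; the test equation y′ = y with y(0) = 1 is an illustrative choice, not from the text):

```python
import math

# Euler polygon (3.26) with equidistant nodes for y' = f(x, y), y(x0) = y0
# on [x0, x0 + alpha[; for f(x, y) = y, y(0) = 1, the exact solution is exp(x).

def euler_polygon(f, x0, y0, alpha, n):
    """Return the nodes (x_j, phi_m(x_j)) of the Euler polygon with n steps."""
    h = alpha / n
    xs, ys = [x0], [y0]
    for _ in range(n):
        # phi_m(x_{j+1}) = phi_m(x_j) + (x_{j+1} - x_j) f(x_j, phi_m(x_j))
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

xs, ys = euler_polygon(lambda x, y: y, 0.0, 1.0, 1.0, 1000)
print(abs(ys[-1] - math.e))  # discretization error; it shrinks as n grows
```

As the text remarks, the method is simple but not very efficient: the error only decreases proportionally to 1/n.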

In the fourth step of the proof, we establish several properties of the functions φm. The first two properties are immediate from (3.26), namely that φm is continuous on I+ and differentiable at each x ∈ ]xj, x_{j+1}[, j ∈ {0, …, N−1}, where

∀ j ∈ {0, …, N−1}  ∀ x ∈ ]xj, x_{j+1}[:  φm′(x) = f(xj, φm(xj)).    (3.30)

The third property is the Lipschitz estimate

∀ s, t ∈ I+:  ‖φm(t) − φm(s)‖ ≤ M |t − s|.    (3.31)

To prove (3.31), we may assume s < t without loss of generality. If s, t ∈ [xj , xj+1 ],

j ∈ {0, . . . , N − 1}, then

‖φm(t) − φm(s)‖ = ‖φm(xj) + (t − xj) f(xj, φm(xj)) − φm(xj) − (s − xj) f(xj, φm(xj))‖
                = |t − s| ‖f(xj, φm(xj))‖ ≤ |t − s| M    (3.32a)

(using (3.26) for the first equality and (3.19) for the last estimate),

as desired. If s, t are not contained in the same interval [xj , xj+1 ], then fix j < k such

that s ∈ [xj , xj+1 ] and t ∈ [xk , xk+1 ]. Then (3.31) follows from an estimate analogous to

the one in (3.29):

‖φm(t) − φm(s)‖ ≤ ‖φm(s) − φm(x_{j+1})‖ + Σ_{l=j+1}^{k−1} ‖φm(xl) − φm(x_{l+1})‖ + ‖φm(xk) − φm(t)‖
                ≤ |s − x_{j+1}| M + Σ_{l=j+1}^{k−1} |xl − x_{l+1}| M + |t − xk| M    (by (3.32a))
                = |t − s| M,    (3.32b)

completing the proof of (3.31). The following property of the φm is the justification for

calling them approximate solutions to our initial value problem (3.16):

∀ j ∈ {0, …, N−1}  ∀ x ∈ ]xj, x_{j+1}[:  ‖φm′(x) − f(x, φm(x))‖ < 1/m.    (3.33)

Indeed, if M = 0, then f ≡ φm′ ≡ 0 and there is nothing to prove. So let M > 0. If x ∈ ]xj, x_{j+1}[, then, according to (3.30), we have φm′(x) = f(xj, φm(xj)). Thus, by (3.25),

|x − xj| < β ≤ min{δm, δm/M}  ⇒  ‖φm(x) − φm(xj)‖ ≤ |x − xj| M < δm    (3.34a)

(where (3.31) was used for the middle estimate),


and

‖φm′(x) − f(x, φm(x))‖ = ‖f(xj, φm(xj)) − f(x, φm(x))‖ < 1/m,    (3.34b)

where the last estimate is due to (3.34a) and (3.23), proving (3.33).

The last property of the φm we need is

∀ x ∈ I+:  ‖φm(x)‖ ≤ ‖φm(x) − φm(x0)‖ + ‖φm(x0)‖ ≤ |x − x0| M + ‖φm(x0)‖ ≤ α M + ‖y0‖    (3.35)

(where (3.31) was used for the middle estimate), which says that the φm are pointwise and even uniformly bounded.

In the fifth and last step of the proof, we use the Arzelà-Ascoli Th. 3.7 to obtain a

function φ+ : I+ −→ Kn , and we show that φ constitutes a solution to (3.16). According

to (3.31), the φm are uniformly equicontinuous (given ǫ > 0, condition (3.12) is satisfied

with δ := ǫ/M for M > 0 and arbitrary δ > 0 for M = 0), and according to (3.35)

the φm are bounded such that the Arzelà-Ascoli Th. 3.7 applies to yield a subsequence

(φmj )j∈N of (φm )m∈N converging uniformly to some continuous function φ+ : I+ −→ Kn .

So it merely remains to verify that φ+ is a solution to (3.16).

As the uniform convergence of the (φmj )j∈N implies pointwise convergence, we have

φ+ (x0 ) = limj→∞ φmj (x0 ) = y0 , showing φ+ satisfies the initial condition (3.16b).

Next,

∀ x ∈ I+:  (x, φ+(x)) = lim_{j→∞} (x, φmj(x)) ∈ B,

since each (x, φmj (x)) is in B and B is closed. In particular, f (x, φ+ (x)) is well-defined

for each x ∈ I+ . To prove that φ+ also satisfies the ODE (3.16a), by Th. 1.5, it suffices

to show

∀ x ∈ I+:  φ+(x) − φ+(x0) − ∫_{x0}^{x} f(t, φ+(t)) dt = 0    (3.36)

holds. To this end, fix x ∈ I+. Applying the triangle inequality for the umpteenth time, one obtains

‖φ+(x) − φ+(x0) − ∫_{x0}^{x} f(t, φ+(t)) dt‖
  ≤ ‖φ+(x) − φmj(x)‖ + ‖φmj(x) − φ+(x0) − ∫_{x0}^{x} f(t, φmj(t)) dt‖ + ‖∫_{x0}^{x} (f(t, φmj(t)) − f(t, φ+(t))) dt‖,    (3.37)

holding for every j ∈ N. We will conclude the proof by showing that all three summands

on the right-hand side of (3.37) tend to 0 for j → ∞. As already mentioned above,

the uniform convergence of the (φmj )j∈N implies pointwise convergence, implying the

convergence of the first summand. We tackle the third summand next, using

‖∫_{x0}^{x} (f(t, φmj(t)) − f(t, φ+(t))) dt‖ ≤ |∫_{x0}^{x} ‖f(t, φmj(t)) − f(t, φ+(t))‖ dt|,    (3.38)


which holds for every norm (cf. Appendix B), but can easily be checked directly for the 1-norm, where ‖(z1, …, zn)‖1 := Σ_{j=1}^{n} |zj| (exercise). Given ǫ > 0, the uniform continuity of f on B provides δ > 0 such that ‖f(t, φmj(t)) − f(t, φ+(t))‖ < ǫ/α whenever ‖φmj(t) − φ+(t)‖ < δ. The uniform convergence of (φmj)j∈N then yields K ∈ N such that ‖φmj(t) − φ+(t)‖ < δ for every j ≥ K and each t ∈ I+. Thus,

∀ j ≥ K:  |∫_{x0}^{x} ‖f(t, φmj(t)) − f(t, φ+(t))‖ dt| ≤ |x − x0| ǫ/α ≤ ǫ,

thereby establishing the convergence of the third summand from the right-hand side

of (3.37). For the remaining second summand, we note that the fact that each φm is

continuous and piecewise differentiable (with piecewise constant derivative) allows to

apply the fundamental theorem of calculus in the form [Phi16, Th. 10.20(b)] to obtain

∀ x ∈ I+:  φm(x) = φm(x0) + ∫_{x0}^{x} φm′(t) dt.    (3.39)

Using (3.39) in the second summand of the right-hand side of (3.37) (and noting φmj(x0) = y0 = φ+(x0)) provides

‖φmj(x) − φ+(x0) − ∫_{x0}^{x} f(t, φmj(t)) dt‖ ≤ |∫_{x0}^{x} ‖φmj′(t) − f(t, φmj(t))‖ dt| ≤ |∫_{x0}^{x} (1/mj) dt| ≤ α/mj,

where (3.33) was used for the second estimate, showing the convergence of the second summand, which finally concludes the proof.

Corollary 3.9. If G ⊆ R × Kn is open, n ∈ N, f : G −→ Kn is continuous, and C ⊆ G is compact, then there exists α > 0 such that, for each (x0, y0) ∈ C, the explicit n-dimensional first-order initial value problem (3.16) has a solution φ : I −→ Kn, defined on the open interval I := ]x0 − α, x0 + α[, i.e. always on an interval of the same length 2α.

Proof. Exercise.

Corollary 3.10. If G ⊆ R × K^{kn} is open, k, n ∈ N, and f : G −→ Kn is continuous, then, for each (x0, y0,0, …, y0,k−1) ∈ G, the explicit n-dimensional kth-order initial value problem consisting of (1.6) and (1.7), which, for convenience, we rewrite

y^{(k)} = f(x, y, y′, …, y^{(k−1)}),    (3.40a)
∀ j ∈ {0, …, k−1}:  y^{(j)}(x0) = y0,j,    (3.40b)

has at least one solution. More precisely, there exists an open interval I ⊆ R with

x0 ∈ I and φ : I −→ Kn such that φ is a solution to (3.40). If C ⊆ G is compact,

then there exists α > 0 such that, for each (x0 , y0,0 , . . . , y0,k−1 ) ∈ C, (3.40) has a solution

φ : I −→ Kn , defined on the open interval I :=]x0 − α, x0 + α[, i.e. always on an interval

of the same length 2α.


Proof. If f is continuous, then the right-hand side of the equivalent first-order system

(3.2a) (written in explicit form) is given by the continuous function

f̃ : G −→ K^{kn},  f̃(x, y1, …, yk) := (y2, y3, …, yk, f(x, y1, …, yk))ᵀ.    (3.41)

Thus, Th. 3.8 provides a solution Φ : I −→ Kkn to (3.2) and, then, Th. 3.1(b) yields

φ := Φ1 to be a solution to (3.40). Moreover, if C ⊆ G is compact, then Cor. 3.9 provides

α > 0 such that, for each (x0 , y0,0 , . . . , y0,k−1 ) ∈ C, (3.2) has a solution Φ : I −→ Kkn ,

defined on the same open interval I := ]x0 − α, x0 + α[. In particular, φ := Φ1, the corresponding solution to (3.40), is also defined on the same I.

While the Peano theorem is striking in its generality, it does have several drawbacks:

(a) the interval on which the existence of a solution is proved can be unnecessarily short;

(b) the selection of the subsequence using the Arzelà-Ascoli theorem makes the proof

nonconstructive; (c) uniqueness of solutions is not provided, even in cases, where unique

solutions exist; (d) it does not provide information regarding how the solution changes

with a change of the initial condition. We will subsequently address all these points,

namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomes

constructive in situations, where the solution is unique – in general, a constructive proof

is not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.

3.3 Uniqueness of Solutions

Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enough to guarantee that the initial value problem (3.16) has a unique solution, not even in some neighborhood of x0. The additional condition that will yield uniqueness is local Lipschitz continuity of f with respect to y.
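A standard instance of such non-uniqueness (the concrete right-hand side below is a common textbook choice and an assumption of this sketch, not necessarily identical with Example 1.4(b)) can be checked numerically:

```python
import math

# f(x, y) = 2*sqrt(|y|) is continuous but not Lipschitz in y near y = 0;
# the IVP y' = f(x, y), y(0) = 0 has (at least) the two distinct solutions
# phi1 = 0 and phi2(x) = x*|x|.

def f(x, y):
    return 2 * math.sqrt(abs(y))

def phi1(x):
    return 0.0

def phi2(x):
    return x * abs(x)

for x in [-1.0, -0.3, 0.0, 0.5, 2.0]:
    h = 1e-7
    for phi in (phi1, phi2):
        dphi = (phi(x + h) - phi(x - h)) / (2 * h)  # difference quotient
        assert abs(dphi - f(x, phi(x))) < 1e-5
print("both functions solve y' = 2*sqrt(|y|) with y(0) = 0")
```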

Definition 3.11. Let m, n ∈ N, G ⊆ R × Km , and f : G −→ Kn .

(a) The function f is called (globally) Lipschitz continuous or just (globally) Lipschitz

with respect to y if, and only if,

∃ L ≥ 0  ∀ (x, y), (x, ȳ) ∈ G:  ‖f(x, y) − f(x, ȳ)‖ ≤ L ‖y − ȳ‖.    (3.42)

(b) The function f is called locally Lipschitz continuous or just locally Lipschitz with

respect to y if, and only if, for each (x0 , y0 ) ∈ G, there exists a (relative) open set

U ⊆ G such that (x0 , y0 ) ∈ U (i.e. U is a (relative) open neighborhood of (x0 , y0 ))

and f is Lipschitz continuous with respect to y on U, i.e. if, and only if,

∀ (x0, y0) ∈ G  ∃ open U ⊆ G with (x0, y0) ∈ U  ∃ L ≥ 0  ∀ (x, y), (x, ȳ) ∈ U:  ‖f(x, y) − f(x, ȳ)‖ ≤ L ‖y − ȳ‖.    (3.43)


The number L occurring in (a),(b) is called Lipschitz constant. The norms on Km and

Kn in (a),(b) are arbitrary. If one changes the norms, then one will, in general, change

L, but not the property of f being (locally) Lipschitz.

Caveat: Even global Lipschitz continuity with respect to y does not imply f to be continuous: Indeed, if I ⊆ R, ∅ ≠ A ⊆ Km, and g : I −→ Kn is an arbitrary discontinuous function, then f : I × A −→ Kn, f(x, y) := g(x), is not continuous, but satisfies (3.42) with L = 0.

—

While the local neighborhoods U , where a function locally Lipschitz (with respect to y)

is actually Lipschitz continuous (with respect to y) can be very small, we will now show

that a continuous function is locally Lipschitz (with respect to y) on G if, and only if,

it is Lipschitz continuous (with respect to y) on every compact set K ⊆ G.

Proposition 3.13. Let m, n ∈ N, let G ⊆ R × Km, and let f : G −→ Kn be continuous. Then f is locally Lipschitz with respect to y if, and only if, f is (globally) Lipschitz with respect to y on every compact subset K of G.

Proof. First, assume f is not locally Lipschitz with respect to y. Then there exists

(x0, y0) ∈ G such that

∀ N ∈ N  ∃ (xN, yN,1), (xN, yN,2) ∈ G ∩ B_{1/N}(x0, y0):  ‖f(xN, yN,1) − f(xN, yN,2)‖ > N ‖yN,1 − yN,2‖.    (3.44)

The set

K := {(x0, y0)} ∪ {(xN, yN,j) : N ∈ N, j ∈ {1, 2}}

is clearly a compact subset of G (e.g. by the Heine-Borel property of compact sets (see

Th. C.19), since every open set containing (x0 , y0 ) must contain all, but finitely many,

of the elements of K). Due to (3.44), f is not (globally) Lipschitz with respect to y on

the compact set K (so, actually, continuity of f was not used for this direction).

Conversely, assume f to be locally Lipschitz with respect to y, and consider a compact

subset K of G. Then, for each (x, y) ∈ K, there is some (relatively) open U(x,y) ⊆ G

with (x, y) ∈ U(x,y) and such that f is Lipschitz with respect to y in U(x,y) . By the

Heine-Borel property of compact sets (see Th. C.19), there are finitely many U1 :=

U(x1 ,y1 ) , . . . , UN := U(xN ,yN ) , N ∈ N, such that

K ⊆ ⋃_{j=1}^{N} Uj.    (3.45)

For each j = 1, …, N, let Lj denote the Lipschitz constant for f on Uj and set L′ := max{L1, …, LN}. As f is assumed continuous and K is compact, we have

M := max{‖f(x, y)‖ : (x, y) ∈ K} < ∞.    (3.46)

Using the compactness of K once again, there exists a Lebesgue number δ > 0 for the

open cover (Uj )j∈{1,...,N } of K (cf. Th. C.21), i.e. δ > 0 such that

∀ (x, y), (x, ȳ) ∈ K:  ‖y − ȳ‖ < δ  ⇒  ∃ j ∈ {1, …, N}:  {(x, y), (x, ȳ)} ⊆ Uj.    (3.47)

Define L := max{L′ , 2M/δ}. Then, for every (x, y), (x, ȳ) ∈ K:

‖y − ȳ‖ < δ  ⇒  ‖f(x, y) − f(x, ȳ)‖ ≤ Lj ‖y − ȳ‖ ≤ L ‖y − ȳ‖,    (3.48a)
‖y − ȳ‖ ≥ δ  ⇒  ‖f(x, y) − f(x, ȳ)‖ ≤ 2M = (2M/δ) δ ≤ L ‖y − ȳ‖,    (3.48b)

completing the proof that f is Lipschitz with respect to y on K.

While, in general, the assertion of Prop. 3.13 becomes false if the continuity of f is omit-

ted, for convex G, it does hold without the continuity assumption on f (see Appendix

D). The following Prop. 3.14 provides a useful sufficient condition for f : G −→ Kn ,

G ⊆ R × Km open, to be locally Lipschitz with respect to y:

Proposition 3.14. Let m, n ∈ N, let G ⊆ R × Km be open, and f : G −→ Kn . A

sufficient condition for f to be locally Lipschitz with respect to y is f being continuously

(real) differentiable with respect to y, i.e., f is locally Lipschitz with respect to y provided

that all partials ∂yk fl, k ∈ {1, …, m}, l ∈ {1, …, n} (∂yk,1 fl, ∂yk,2 fl for K = C), exist and are continuous.

Proof. We consider the case K = R; the case K = C is included by using the identifications Cm ≅ R2m and Cn ≅ R2n. Given (x0, y0) ∈ G, we have to show f is Lipschitz

with respect to y on some open set U ⊆ G with (x0 , y0 ) ∈ U . As in the Peano Th. 3.8,

since G is open,

∃ b > 0:  B := {(x, y) ∈ R × Rm : |x − x0| ≤ b and ‖y − y0‖1 ≤ b} ⊆ G,

where k · k1 denotes the 1-norm on Rm . Since the ∂yk fl , (k, l) ∈ {1, . . . , m} × {1, . . . , n},

are all continuous on the compact set B,

M := max{|∂yk fl(x, y)| : (x, y) ∈ B, (k, l) ∈ {1, …, m} × {1, …, n}} < ∞.    (3.49)

Applying the mean value theorem (cf. [Phi15, Th. 2.32]) to the n components of the function

fx : {y ∈ Rm : (x, y) ∈ B} −→ Rn,  fx(y) := f(x, y),

yields, for each (x, y), (x, ȳ) ∈ B and each l ∈ {1, …, n}, some ηl on the segment between y and ȳ such that

fl(x, y) − fl(x, ȳ) = Σ_{k=1}^{m} ∂yk fl(x, ηl) (yk − ȳk),    (3.50)

and, thus,

∀ (x, y), (x, ȳ) ∈ B:  ‖f(x, y) − f(x, ȳ)‖1 = Σ_{l=1}^{n} |fl(x, y) − fl(x, ȳ)| ≤ Σ_{l=1}^{n} Σ_{k=1}^{m} M |yk − ȳk| = n M ‖y − ȳ‖1,    (3.51)

where (3.49) and (3.50) were used for the estimate, showing f to be Lipschitz with respect to y on B with Lipschitz constant L := nM.
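The bound L = nM from (3.51) can be observed in a concrete case; the following sketch (with the hypothetical scalar example f(x, y) = y², m = n = 1, which is locally but not globally Lipschitz in y) verifies the estimate on random samples from a box:

```python
import random

# For f(x, y) = y**2 (m = n = 1), the partial d_y f = 2y is bounded on the
# box |y - y0| <= b by M = 2*(|y0| + b); by (3.51), f is then Lipschitz in y
# on the box with constant L = n*M = M.

y0, b = 1.0, 0.5
M = 2 * (abs(y0) + b)  # = 3.0, the maximum of |2y| on [y0 - b, y0 + b]

random.seed(0)
for _ in range(1000):
    y = y0 + random.uniform(-b, b)
    ybar = y0 + random.uniform(-b, b)
    assert abs(y ** 2 - ybar ** 2) <= M * abs(y - ybar) + 1e-12
print("Lipschitz bound L =", M, "verified on samples")
```

Note that no single constant works on all of R for this f, in line with the local (rather than global) nature of Prop. 3.14.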


Theorem 3.15. If G ⊆ R × Kn is open, n ∈ N, and f : G −→ Kn is continuous and locally Lipschitz with respect to y, then, for each (x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem

y′ = f(x, y),    (3.52a)
y(x0) = y0,    (3.52b)

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I −→ Kn are both solutions to (3.52a), then φ(x0) = ψ(x0) for one x0 ∈ I implies φ(x) = ψ(x) for all x ∈ I:

(∃ x0 ∈ I:  φ(x0) = ψ(x0))  ⇒  (∀ x ∈ I:  φ(x) = ψ(x)).    (3.53)

Proof. Let x0 ∈ I with φ(x0) = ψ(x0) =: y0. We first show that φ and ψ agree in a neighborhood of x0, i.e.

∃ ǫ > 0  ∀ x ∈ ]x0 − ǫ, x0 + ǫ[:  φ(x) = ψ(x).    (3.54)

Since f is continuous and both φ and ψ are solutions to the initial value problem (3.52),

we can use Th. 1.5 to obtain

∀ x ∈ I:  φ(x) − ψ(x) = ∫_{x0}^{x} (f(t, φ(t)) − f(t, ψ(t))) dt.    (3.55)

As f is locally Lipschitz with respect to y, there exists δ > 0 such that f is Lipschitz
with some Lipschitz constant L > 0 with respect to y on

]x0 − δ, x0 + δ[ × Bδ (y0 ) ⊆ G,  Bδ (y0 ) := { y ∈ Kn : ‖y − y0 ‖ < δ },

where we have chosen some arbitrary norm ‖ · ‖ on Kn . The continuity of φ, ψ implies the

existence of ǫ̃ > 0 such that B ǫ̃ (x0 ) ⊆ I, φ(Bǫ̃ (x0 )) ⊆ Bδ (y0 ) and ψ(Bǫ̃ (x0 )) ⊆ Bδ (y0 ),

implying

∀_{x∈Bǫ̃ (x0 )}  ‖f (x, φ(x)) − f (x, ψ(x))‖ ≤ L ‖φ(x) − ψ(x)‖ .   (3.56)

Next, define

ǫ := min{ǫ̃, 1/(2L)}

and, using the compactness of B ǫ (x0 ) = [x0 − ǫ, x0 + ǫ] plus the continuity of φ, ψ,

M := max{ ‖φ(x) − ψ(x)‖ : x ∈ B̄ǫ (x0 ) } < ∞.

Combining (3.55) and (3.56), we obtain

∀_{x∈Bǫ (x0 )}  ‖φ(x) − ψ(x)‖ ≤ L | ∫_{x0}^{x} ‖φ(t) − ψ(t)‖ dt | ≤ L |x − x0 | M ≤ M/2   (3.57)


(note that the integral in (3.57) can be negative for x < x0 ). The definition of M

together with (3.57) yields M ≤ M/2, i.e. M = 0, finishing the proof of (3.54).

To prove φ(x) = ψ(x) for each x ≥ x0 , let

s := sup{ξ ∈ I : φ(x) = ψ(x) for each x ∈ [x0 , ξ]}.

One needs to show s = sup I. If s = sup I does not hold, then there exists α > 0 such

that [s, s + α] ⊆ I. Then the continuity of φ, ψ implies φ(s) = ψ(s), i.e. φ and ψ satisfy

the same initial value problem at s such that (3.54) must hold with s instead of x0 , in

contradiction to the definition of s. Finally, φ(x) = ψ(x) for each x ≤ x0 follows in a
completely analogous fashion, which concludes the proof of the theorem.

Corollary 3.16. If G ⊆ R×Kkn is open, k, n ∈ N, and f : G −→ Kn is continuous and

locally Lipschitz with respect to y, then, for each (x0 , y0,0 , . . . , y0,k−1 ) ∈ G, the explicit

n-dimensional kth-order initial value problem consisting of (1.6) and (1.7), i.e.

y (k) = f ( x, y, y ′ , . . . , y (k−1) ),
∀_{j∈{0,...,k−1}}  y (j) (x0 ) = y0,j ,

has a unique solution in the following sense: If I ⊆ R is an open interval, x0 ∈ I, and φ, ψ : I −→ Kn
are both solutions to (1.6), then

∀_{j∈{0,...,k−1}}  φ(j) (x0 ) = ψ (j) (x0 )   (3.58)

implies φ(x) = ψ(x) for all x ∈ I.

Proof. Exercise.

Remark 3.17. According to Th. 3.15, the condition of f being continuous and locally

Lipschitz with respect to y is sufficient for each initial value problem (3.52) to have a

unique solution. However, this condition is not necessary: It is an exercise to show that

the continuous function

f : R2 −→ R,  f (x, y) := { 1 for y ≤ 0;  1 + √y for y ≥ 0 },   (3.59)

is not locally Lipschitz with respect to y, but that, for each (x0 , y0 ) ∈ R2 , the initial

value problem (3.52) still has a unique solution in the sense that (3.53) holds for each

solution φ to (3.52a). And one can (can you?) even find simple examples of f being

defined on an open domain such that f is discontinuous at every point in its domain

and every initial value problem (3.52) still has a unique solution.
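To see numerically why the f of (3.59) is not locally Lipschitz with respect to y at y = 0, one can evaluate the difference quotients |f(x, y) − f(x, 0)|/|y| for y ↓ 0; they behave like y^(−1/2) and are unbounded. A minimal sketch:

```python
# f from (3.59): continuous on R^2, but not locally Lipschitz w.r.t. y,
# since |f(x, y) - f(x, 0)| / |y| = sqrt(y)/y = y**(-0.5) -> infinity as y -> 0+.
def f(x, y):
    return 1.0 if y <= 0 else 1.0 + y ** 0.5

quotients = [(f(0.0, y) - f(0.0, 0.0)) / y for y in (1e-2, 1e-4, 1e-6, 1e-8)]
print(quotients)  # roughly [1e1, 1e2, 1e3, 1e4]: no Lipschitz constant works
assert all(b > a for a, b in zip(quotients, quotients[1:]))
assert quotients[-1] > 9.9e3
```

So no neighborhood of (x0, 0) admits a finite Lipschitz constant with respect to y, in line with the remark.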

—

At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is non-

constructive due to the selection of a subsequence. The following Th. 3.18 shows that,

whenever the initial value problem has a unique solution, it becomes unnecessary to

select a subsequence, and the construction procedure (namely Euler’s method) used in

the proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approx-

imation procedure for the unique solution.
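The construction from the proof of Th. 3.8, run without subsequence selection, is just the classical explicit Euler scheme. A minimal Python sketch (the test problem y′ = y, y(0) = 1 with exact solution eˣ is a made-up illustration, not from the text) shows the approximations improving as the step size shrinks:

```python
import math

def euler(f, x0, y0, x_end, steps):
    """Explicit Euler: follow the slope field in straight pieces of length h,
    as in the polygonal approximations of the proof of Th. 3.8."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y

# Hypothetical test problem: y' = y, y(0) = 1, exact solution exp(x).
f = lambda x, y: y
errors = [abs(euler(f, 0.0, 1.0, 1.0, n) - math.e) for n in (10, 100, 1000)]
print(errors)  # decreasing errors: the Euler approximations converge
assert errors[0] > errors[1] > errors[2] and errors[2] < 2e-3
```

Th. 3.18 below is precisely the statement that, when the solution is unique, this whole sequence (not merely a subsequence) converges to it.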


Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additional

assumption that the solution to the explicit n-dimensional first-order initial value prob-

lem (3.16) is unique on some interval J ⊆ [x0 , x0 + α[, x0 ∈ J, and where α > 0 is

constructed as in Th. 3.8 (i.e. given by (3.18) – (3.20)), every sequence (φm )m∈N of func-

tions defined on J according to Euler’s method as in the proof of Th. 3.8 (i.e. defined

as in (3.26)) converges uniformly to the unique solution φ : J −→ Kn . An analogous

statement also holds for J ⊆]x0 − α, x0 ], x0 ∈ J.

Proof. Seeking a contradiction, assume (φm )m∈N does not converge uniformly to the

unique solution φ. Then there exists ǫ > 0 and a subsequence (φmj )j∈N such that

∀_{j∈N}  ‖φmj − φ‖sup = sup{ ‖φmj (x) − φ(x)‖ : x ∈ J } ≥ ǫ.   (3.60)

However, as a subsequence, (φmj )j∈N still has all the properties of the (φm )m∈N (namely
pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approximate solutions according to (3.33)) that guaranteed the existence of a subsequence
converging to a solution. Thus, since the solution is unique on J, (φmj )j∈N must, in turn,

have a subsequence, converging uniformly to φ, which is in contradiction to (3.60). This

shows the assumption that (φm )m∈N does not converge uniformly to φ must have been

false. The analogous statement for J ⊆ ]x0 − α, x0 ], x0 ∈ J, is obtained, e.g., via time
reversion (cf. the second step of the proof of Th. 3.8).

Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: It

can be applied whenever a sequence is known to have a subsequence converging to some

solution of some equation (or some other problem), provided the same still holds for

every subsequence of the original sequence – in that case, the additional knowledge that

the solution is unique implies the convergence of the original sequence without the need

to select a subsequence.
3.4 Extension of Solutions, Maximal Solutions

The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initial

value problems, i.e. the solution’s existence is proved on some, possibly small, interval

containing the initial point x0 . In the current section, we will address the question in

which circumstances such local solutions can be extended, we will prove the existence

of maximal solutions (solutions that can not be extended), and we will learn how such

maximal solutions can be identified.

Definition 3.20. Let φ : I −→ Kn , n ∈ N, be a solution to some ODE (such as (1.6)

or (1.4) in the most general case), defined on some open interval I ⊆ R.

(a) We say φ has an extension or continuation to the right (resp. to the left) if, and
only if, there exists a solution ψ : J −→ Kn to the same ODE, defined on some
open interval J ⊇ I, such that ψ↾I = φ and sup J > sup I (resp. inf J < inf I). We
call φ extendable if, and only if, it has an extension to the right or
an extension to the left (or both).

(b) The solution φ is called a maximal solution if, and only if, it does not admit any

extensions in the sense of (a) (note that we require maximal solutions to be defined

on open intervals, cf. Appendix E).

Remark 3.21. As an immediate consequence of the time reversion Lem. 1.9(b), a
solution φ : I −→ Kn , n ∈ N, to (1.6), defined on some open interval I ⊆ R, has an
extension to the right (resp. to the left) if, and only if, ψ : (−I) −→ Kn , ψ(x) := φ(−x),
(solution to y (k) = (−1)k f (−x, y, −y ′ , . . . , (−1)k−1 y (k−1) )) has an extension to the left
(resp. to the right).

—

The existence of maximal solutions is not trivial – a priori it could be that every solution

had an extension (analogous to the fact that to every x ∈ [0, 1[ (or every x ∈ R) there

is some bigger element in [0, 1[ (respectively in R)).

Theorem 3.22. Every solution φ0 : I0 −→ Kn to (1.4) (resp. to (1.6)), defined on an

open interval I0 ⊆ R, can be extended to a maximal solution of (1.4) (resp. of (1.6)).

Proof. The proof is carried out for solutions to (1.4) (the implicit ODE) – the proof for

solutions to the explicit ODE (1.6) is analogous and can also be seen as a special case.

The idea is to apply Zorn’s lemma. To this end, define a partial order on the set

S := {(I0 , φ0 )} ∪ {(I, φ) : φ : I −→ Kn is solution to (1.4), extending φ0 } (3.62)

by letting

(I, φ) ≤ (J, ψ) :⇔ I ⊆ J, ψ↾I = φ. (3.63)

Every chain C, i.e. every totally ordered subset of S, has an upper bound, namely
(IC , φC ) with IC := ∪_{(I,φ)∈C} I and φC (x) := φ(x), where (I, φ) ∈ C is chosen such that
x ∈ I (since C is a chain, the value of φC (x) does not actually depend on the choice of
(I, φ) ∈ C and is, thus, well-defined).

Clearly, IC is an open interval, I0 ⊆ IC , and φC extends φ0 as a function; we still need to

see that φC is a solution to (1.4). For this, we, once again, use that x ∈ IC means there

exists (I, φ) ∈ C such that x ∈ I and φ is a solution to (1.4). Thus, using the notation

from Def. 1.2(a),

(k)

(x, φC (x), φ′C (x), . . . , φC (x)) = (x, φ(x), φ′ (x), . . . , φ(k) (x)) ∈ U

and

(k)

F (x, φC (x), φ′C (x), . . . , φC (x)) = F (x, φ(x), φ′ (x), . . . , φ(k) (x)) = 0,

showing φC is a solution to (1.4) as defined in Def. 1.2(a). In particular, (IC , φC ) ∈ S. To

verify (IC , φC ) is an upper bound for C, note that the definition of (IC , φC ) immediately

implies I ⊆ IC for each (I, φ) ∈ C and φC ↾I = φ for each (I, φ) ∈ C.

To conclude the proof, we note that all hypotheses of Zorn’s lemma have been verified,
such that it yields the existence of a maximal element (Imax , φmax ) ∈ S, i.e. φmax :
Imax −→ Kn must be a maximal solution extending φ0 .

Proposition 3.23. Let k, n ∈ N, let G ⊆ R × Kkn be open, and let f : G −→ Kn be
continuous. If φ : I −→ Kn is a solution to (1.6) such that I = ]a, b[, a < b, b < ∞
(resp. −∞ < a), then φ has an extension to the right (resp. to the left) if, and only if,

∃_{(b,η0 ,...,ηk−1 )∈G}  lim_{x↑b} ( φ(x), φ′ (x), . . . , φ(k−1) (x) ) = (η0 , . . . , ηk−1 ),   (3.64a)

resp.  ∃_{(a,η0 ,...,ηk−1 )∈G}  lim_{x↓a} ( φ(x), φ′ (x), . . . , φ(k−1) (x) ) = (η0 , . . . , ηk−1 ).   (3.64b)

Proof. That the respective part of (3.64) is necessary for the existence of the respective

extension is immediate from the fact that, for each solution to (1.6), the solution and

all its derivatives up to order k − 1 must exist and must be continuous.

We now prove that (3.64a) is also sufficient for the existence of an extension to the right

(the sufficiency of (3.64b) for the existence of an extension to the left is then immediate

from Rem. 3.21). So assume (3.64a) to hold and consider the initial value problem

consisting of (1.6) and the initial conditions

∀_{j=0,...,k−1}  y (j) (b) = ηj .

By Cor. 3.10, there must exist ǫ > 0 such that this initial value problem has a solution

ψ : ]b−ǫ, b+ǫ[−→ Kn . We now show that φ extended to b via (3.64a) is still a solution to

(1.6). First note the mean value theorem (cf. [Phi16, Th. 9.18]) yields that φ(j) (b) = ηj

exists for j = 1, . . . , k − 1 as a left-hand derivative. Moreover,

lim_{x↑b} φ(k) (x) = lim_{x↑b} f ( x, φ(x), φ′ (x), . . . , φ(k−1) (x) ) = f (b, η0 , . . . , ηk−1 ),

showing φ(k) (b) = f ( b, φ(b), φ′ (b), . . . , φ(k−1) (b) ) (again employing the mean value theorem), which proves φ extended to b is a solution to (1.6). Finally, Lem. 1.7 ensures

σ : ]a, b + ǫ[ −→ Kn ,  σ(x) := { φ(x) for x ≤ b;  ψ(x) for x ≥ b },

is a solution to (1.6) that extends φ to the right.

Proposition 3.24. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn be

continuous, and let φ : I −→ Kn be a solution to (1.6) defined on the open interval I.

Consider x0 ∈ I and let gr+ (φ) (resp. gr− (φ)) denote the graph of (φ, . . . , φ(k−1) ) for

x ≥ x0 (resp. for x ≤ x0 ):

gr+ (φ) := gr+ (φ, x0 ) := { ( x, φ(x), . . . , φ(k−1) (x) ) ∈ G : x ∈ I, x ≥ x0 },   (3.65a)
gr− (φ) := gr− (φ, x0 ) := { ( x, φ(x), . . . , φ(k−1) (x) ) ∈ G : x ∈ I, x ≤ x0 }.   (3.65b)

If there exists a compact set K ⊆ G such that gr+ (φ) ⊆ K (resp. gr− (φ) ⊆ K), then φ

has an extension ψ : J −→ Kn to the right (resp. to the left) such that

∃_{x̃∈J}  ( x̃, ψ(x̃), . . . , ψ (k−1) (x̃) ) ∉ K.   (3.66)

The statement can be rephrased by saying that gr+ (φ) (resp. gr− (φ)) of each maximal
solution φ to (1.6) escapes from every compact subset of G when x approaches the right
(resp. the left) boundary of I (where the boundary of I can contain −∞ and/or +∞).


Proof. We conduct the proof for extensions to the right; extensions to the left can be

handled completely analogously (alternatively, one can apply the time reversion Lem.

1.9(b) as demonstrated in the last paragraph of the proof below). The proof for exten-

sions to the right is divided into three steps. Let K ⊆ G be compact.

Step 1: We show that gr+ (φ) ⊆ K implies φ has an extension to the right: Since K is
bounded, so is gr+ (φ), implying b := sup I < ∞ and

M1 := sup{ ‖( φ(x), . . . , φ(k−1) (x) )‖ : x ∈ [x0 , b[ } < ∞,

as well as

M2 := max{ ‖f (x, y)‖ : (x, y) ∈ K } < ∞.

Set M := max{M1 , M2 }.

According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice

∀_{j=0,...,k−1} ∀_{x,x̄∈[x0 ,b[}  ‖φ(j) (x) − φ(j) (x̄)‖ ≤ M |x − x̄|.   (3.68)

Indeed,

∀_{x,x̄∈[x0 ,b[}  ‖φ(k−1) (x) − φ(k−1) (x̄)‖ = ‖ ∫_{x}^{x̄} f ( t, φ(t), . . . , φ(k−1) (t) ) dt ‖ ≤ M |x − x̄|,

∀_{j=0,...,k−2} ∀_{x,x̄∈[x0 ,b[}  ‖φ(j) (x) − φ(j) (x̄)‖ = ‖ ∫_{x}^{x̄} φ(j+1) (t) dt ‖ ≤ M |x − x̄|,

proving (3.68). Since K is compact, there exists a sequence (xm )m∈N in [x0 , b[ such that
limm→∞ xm = b and

∃_{(b,η0 ,...,ηk−1 )∈K}  lim_{m→∞} ( φ(xm ), . . . , φ(k−1) (xm ) ) = (η0 , . . . , ηk−1 ).

Combining this with (3.68), we obtain, for each m ∈ N,

∀_{j=0,...,k−1} ∀_{x∈[x0 ,b[}  ‖φ(j) (x) − ηj ‖ ≤ M |x − xm | + ‖φ(j) (xm ) − ηj ‖,

implying

∀_{j=0,...,k−1}  lim_{x↑b} φ(j) (x) = ηj ,

i.e. (3.64a) holds and Step 1 is complete.


Step 2: We show that gr+ (φ) ⊆ K implies φ can be extended to the right to I∪]x0 , b+α[,

where α > 0 does not depend on b := sup I: Since K is compact, Cor. 3.9 guarantees

every initial value problem

y (k) = f (x, y, y ′ , . . . , y (k−1) ),   (3.70a)
∀_{j=0,...,k−1}  y (j) (ξ0 ) = y0,j ,  (ξ0 , y0 ) ∈ K,   (3.70b)

has a solution defined on ]ξ0 − α, ξ0 + α[ with the same α > 0. As shown in Step

1, the solution φ can be extended into b = sup I such that it satisfies (3.70b) with

(ξ0 , y0 ) = (b, η) ∈ K. Thus, using Lem. 1.7, it can be pieced together with the solution

to (3.70) given on [b, b + α[ by Cor. 3.9, completing the proof of Step 2.

Step 3: We finally show that gr+ (φ) ⊆ K implies φ has an extension ψ : J −→ Kn

to the right such that (3.66) holds: We set a := inf I and φ0 := φ. Then, by Step 2,

φ0 has an extension φ1 defined on ]a, b + α[. Inductively, for each m ≥ 1, either there

exists m0 ≤ m such that φm0 : ]a, b + m0 α[−→ Kn is an extension of φ that can be

used as ψ to conclude the proof (i.e. ψ := φm0 satisfies (3.66)) or φm can, once more,

be extended to ]a, b + (m + 1)α[. As K is bounded, {x ≥ x0 : (x, y) ∈ K} ⊆ R must

also be bounded, say by µ ∈ R. Thus, (3.66) must be satisfied for some ψ := φm with

1 ≤ m ≤ (µ − x0 )/α.

As mentioned above, one can argue completely analogously to the above proof to obtain
that gr− (φ) ⊆ K implies φ to have an extension to the left, satisfying (3.66). Here we

show how one, alternatively, can use the time reversion Lem. 1.9 to this end: Consider

the map

h : R × Kkn −→ R × Kkn , h(x, y1 , . . . , yk ) := (−x, y1 , . . . , (−1)k−1 yk ),

which clearly constitutes an R-linear isomorphism. Noting (1.6) and (1.32) are the same,

we consider the time-reversed version (1.33) and observe Gg = h(G) to be open, h(K) ⊆

Gg to be compact, g : Gg −→ Kn , g = (−1)k (f ◦h), to be continuous. If gr− (φ, x0 ) ⊆ K,

then gr+ (ψ, −x0 ) ⊆ h(K), where ψ is the solution to the time-reversed version (1.33),

given by Lem. 1.9(b). Then ψ has an extension ψ̃ to the right, satisfying (3.66) with ψ

replaced by ψ̃ and K replaced by h(K). Then, by Rem. 3.21, φ must have an extension

φ̃ to the left, satisfying (3.66) with ψ replaced by φ̃.

In Th. 3.28 below, we will show that, for continuous f : G −→ Kn , each maximal

solution to (1.6) must go to the boundary of G in the sense of the following definition.

Definition 3.25. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn , and let

φ : ]a, b[−→ Kn , −∞ ≤ a < b ≤ ∞, be a solution to (1.6). We say that the solution φ

goes to the boundary of G for x → b (resp. for x → a) if, and only if,

∀_{K⊆G compact} ∃_{x0 ∈]a,b[}  gr+ (φ, x0 ) ∩ K = ∅  (resp. gr− (φ, x0 ) ∩ K = ∅),   (3.71)

where gr+ (φ, x0 ) and gr− (φ, x0 ) are defined as in (3.65) (with I =]a, b[). In other words,

φ goes to the boundary of G for x → b (resp. for x → a) if, and only if, the graph of

(φ, . . . , φ(k−1) ) escapes every compact subset K of G forever for x → b (resp. for x → a).


Proposition 3.26. In the situation of Def. 3.25, if the solution φ goes to the boundary

of G for x → b, then one of the following conditions must hold:

(i) b = ∞,

(ii) b < ∞ and L := lim sup_{x↑b} ‖( φ(x), . . . , φ(k−1) (x) )‖ = ∞,

(iii) b < ∞, L < ∞, and

lim_{x↑b} dist( ( x, φ(x), . . . , φ(k−1) (x) ), ∂G ) = 0.   (3.72)

An analogous statement is valid for the solution φ going to the boundary of G for x → a.

Proof. The proof is carried out for x → b; the proof for x → a is analogous.

Assume (i) – (iii) are all false. Choose c ∈ ]a, b[. Since (i) and (ii) are false,

∃_{0≤M <∞} ∀_{x∈[c,b[}  ‖( φ(x), . . . , φ(k−1) (x) )‖ ≤ M.
If (iii) is false because G = R × Kkn , then K := { (x, y) ∈ R × Kkn : x ∈ [c, b], ‖y‖ ≤ M }
is a compact subset of G that shows (3.71) does not hold. In the only remaining case,

(iii) must be false, since (3.72) does not hold. Thus,

∃_{δ>0} ∀_{x0 ∈]a,b[} ∃_{x1 ∈]x0 ,b[}  dist( ( x1 , φ(x1 ), . . . , φ(k−1) (x1 ) ), ∂G ) ≥ δ.

The set

A := { (x, y) ∈ G : dist( (x, y), ∂G ) ≥ δ }

is closed (e.g. as the distance function d : R × Kkn −→ R+ 0 , d(·) := dist(·, ∂G) is

continuous (see Th. C.4) and A = (d−1 [δ, ∞[) ∩ (G ∪ ∂G)). In consequence, K ∩ A with

K as defined above is a compact subset of G that shows (3.71) does not hold.

Remark 3.27. (a) Examples such as the second ODE of Ex. 3.30(b) below show that
the lim sup in Prop. 3.26(ii) cannot be replaced with a lim.

(b) If f : G −→ Kn is continuous, then the three conditions of Prop. 3.26 are also

sufficient for φ to go to the boundary of G (cf. Cor. 3.29 below).

(c) For discontinuous f : G −→ Kn , in general, (ii) of Prop. 3.26 is no longer sufficient

for φ to go to the boundary of G as is shown by simple examples, whereas (i) and (iii)

remain sufficient, even for discontinuous f (exercise). Similarly, simple examples

show Prop. 3.24 becomes false without the assumption of f being continuous; and

it can also happen that a maximal solution escapes every compact set, but still does

not go to the boundary of G (exercise).

Theorem 3.28. In the situation of Def. 3.25, if f : G −→ Kn is continuous and

φ : ]a, b[−→ Kn is a maximal solution to (1.6), then φ must go to the boundary of G for

both x → a and x → b, i.e., for both x → a and x → b, it must escape every compact

subset K of G forever and it must satisfy one of the conditions specified in Prop. 3.26

(and one of the analogous conditions for x → a).


Proof. We carry out the proof for x → b – the proof for x → a can be done analogously

or by applying the time reversion Lem. 1.9, as indicated at the end of the proof below.

Let φ : ]a, b[−→ Kn be a maximal solution to (1.6). Seeking a contradiction, we assume

φ does not go to the boundary of G for x → b, i.e. (3.71) does not hold and there exists

a compact subset K of G and a strictly increasing sequence (xm )m∈N in ]a, b[ such that

limm→∞ xm = b < ∞ and

∀_{m∈N}  ( xm , φ(xm ), . . . , φ(k−1) (xm ) ) ∈ K.   (3.73)

Next, we choose a compact set C such that K ⊊ C ⊊ G: More precisely, we choose r > 0 such that

C := { (x, y) ∈ R × Kkn : dist( (x, y), K ) ≤ r } ⊆ G,

where

dist( (x, y), K ) = inf{ ‖(x, y) − (x̃, ỹ)‖2 : (x̃, ỹ) ∈ K },

‖ · ‖2 denoting the Euclidean norm on Rkn+1 for K = R and the Euclidean norm on

R2kn+1 for K = C (this choice of norm is different from previous choices and will be

convenient later during the current proof). As φ is a maximal solution, Prop. 3.24

guarantees the existence of another strictly increasing sequence (ξm )m∈N in ]a, b[ such

that limm→∞ ξm = b < ∞, x1 < ξ1 < x2 < ξ2 < . . . (i.e. xm < ξm < xm+1 for each

m ∈ N) and such that

∀_{m∈N}  ( ξm , φ(ξm ), . . . , φ(k−1) (ξm ) ) ∉ C.

For each m ∈ N, we define

sm := sup{ x ∈ [xm , ξm ] : (x̃, φ(x̃), . . . , φ(k−1) (x̃)) ∈ C for each x̃ ∈ [xm , x] }.

By the definition of sm as a sup, sm < xm+1 < b < ∞, and by the continuity of the

distance function d : R × Kkn −→ R+ 0 , d(·) := dist(·, K) (see Th. C.4), one obtains

∀_{m∈N}  dist( ( sm , φ(sm ), . . . , φ(k−1) (sm ) ), K ) = r,

in particular,

∀_{m∈N} ∀_{x∈[xm ,sm ]}  ( x, φ(x), . . . , φ(k−1) (x) ) ∈ C   (3.74)

and

∀_{m∈N}  ‖( xm , φ(xm ), . . . , φ(k−1) (xm ) ) − ( sm , φ(sm ), . . . , φ(k−1) (sm ) )‖2 ≥ r.   (3.75)

Since C is compact, we may define

M1 := max{ ‖y‖2 : (x, y) ∈ C } < ∞,  M2 := max{ ‖f (x, y)‖2 : (x, y) ∈ C } < ∞,
M := max{M1 , M2 }.

Due to (3.74), for each m ∈ N,

Jm : [xm , sm ] −→ R × Kkn ,  Jm (x) := ( x, φ(x), . . . , φ(k−1) (x) ),

is a continuously differentiable curve or path (using the continuity of f ), cf. Def. F.1

(for K = C, we consider Jm as a path in R2kn+1 ). To finish the proof, we will have to

make use of the notion of arc length (cf. Def. F.5) of such a continuously differentiable

curve: Recall that each such continuously differentiable path is rectifiable, i.e. it has a

well-defined finite arc length l(Jm ) (cf. Th. F.7). Moreover, l(Jm ) satisfies

‖Jm (xm ) − Jm (sm )‖2 ≤^{(F.4)} l(Jm ) =^{(F.17)} ∫_{xm}^{sm} ‖J′m (x)‖2 dx
  = ∫_{xm}^{sm} √( 1 + Σ_{j=1}^{k} ‖φ(j) (x)‖2² ) dx
  ≤ ∫_{xm}^{sm} √( 1 + ‖( φ(x), . . . , φ(k−1) (x) )‖2² + ‖f (Jm (x))‖2² ) dx
  ≤ ∫_{xm}^{sm} √( 1 + 2M ² ) dx ,   (3.76)

where it was used that k · k2 was chosen to be the Euclidean norm. For each m ∈ N, we

estimate

0 < r ≤^{(3.75)} ‖( xm , φ(xm ), . . . , φ(k−1) (xm ) ) − ( sm , φ(sm ), . . . , φ(k−1) (sm ) )‖2
  = ‖Jm (xm ) − Jm (sm )‖2 ≤^{(3.76)} ∫_{xm}^{sm} √(1 + 2M ²) dx = (sm − xm ) √(1 + 2M ²).   (3.77)

However, limm→∞ (sm − xm ) √(1 + 2M ²) = 0 due to limm→∞ sm = limm→∞ xm = b, in

contradiction to r > 0. This contradiction shows our initial assumption that φ does not

go to the boundary of G for x → b must have been wrong.

To obtain the remaining assertion that φ must go to the boundary of G for x → a,
one can proceed as in the last paragraph of the proof of Prop. 3.24, making use of the

function h defined there and of the time reversion Lem. 1.9: If K ⊆ G is a compact

set and ψ is the solution to the time-reversed version given by Lem. 1.9(b), then ψ

must be maximal as φ is maximal. Thus, for x → −a, ψ must escape the compact set

h(K) forever by the first part of the proof above, implying φ must escape K forever for

x → a.

Corollary 3.29. Let k, n ∈ N, let G ⊆ R × Kkn be open, and let f : G −→ Kn

be continuous. If φ : ]a, b[−→ Kn , a < b, is a solution to (1.6), then the following

statements are equivalent:

(i) φ is a maximal solution to (1.6).

(ii) φ must go to the boundary of G for both x → a and x → b in the sense defined in

Def. 3.25.

(iii) φ satisfies one of the conditions specified in Prop. 3.26 and one of the analogous

conditions for x → a.

Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exercise

to show (iii) implies (i) (here, Prop. 3.23 is the clue).

Example 3.30. The following examples illustrate the different kinds of possible behavior of maximal solutions listed in Prop. 3.26 (the different kinds of behavior can already
be seen for 1-dimensional ODE of first order):

(a) The initial value problem

y ′ = 0,  y(0) = −1,

has the maximal solution φ : R −→ R, φ(x) = −1 – here we have

G = R2 ,  f : G −→ R,  f (x, y) = 0,

and we are in Case (i) of Prop. 3.26.

(b) The initial value problem

y ′ = x^(−2) ,  y(−1) = 1,

has the maximal solution φ : ] − ∞, 0[−→ R, φ(x) = −x^(−1) – here we have

G = (R \ {0}) × R,  f : G −→ R,  f (x, y) = x^(−2) ,

solution interval I = ] − ∞, 0[, b := sup I = 0, and limx↑0 |φ(x)| = ∞, i.e. we are in Case
(ii) of Prop. 3.26.

To obtain an example, where we are also in Case (ii) of Prop. 3.26, but where
limx↑b |φ(x)|, b := sup I, does not exist, consider the initial value problem

y ′ = −x^(−2) sin(x^(−1)) − x^(−3) cos(x^(−1)),  y(−1/π) = 0,

which has the maximal solution φ : ] − ∞, 0[−→ R, φ(x) = x^(−1) sin(x^(−1)) (here
lim supx↑0 |φ(x)| = ∞, but, as φ(−1/(kπ)) = 0 for each k ∈ N, limx↑0 |φ(x)| does
not exist) – here we have

G = (R \ {0}) × R,  f : G −→ R,  f (x, y) = −x^(−2) sin(x^(−1)) − x^(−3) cos(x^(−1)).

To obtain an example, where we are again in Case (ii) of Prop. 3.26, but where

G = R2 , consider the initial value problem

y ′ = y^2 ,  y(−1) = 1,


which, as in the first example of (b), has the maximal solution φ : ] − ∞, 0[−→ R,

φ(x) = −x−1 – here we have

G = R2 ,  f : G −→ R,  f (x, y) = y^2 .
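A quick numerical sanity check for this example (a sketch, not part of the text): φ(x) = −1/x does satisfy y′ = y² on ]−∞, 0[, and |φ(x)| blows up as x ↑ 0 even though G = R²:

```python
# Maximal solution of y' = y^2, y(-1) = 1, on ]-inf, 0[:
phi = lambda x: -1.0 / x
f = lambda x, y: y * y

h = 1e-6
for x in (-2.0, -1.0, -0.5, -0.1):
    dphi = (phi(x + h) - phi(x - h)) / (2 * h)   # central difference
    assert abs(dphi - f(x, phi(x))) < 1e-4        # phi' = phi^2 holds

# Case (ii) of Prop. 3.26: |phi(x)| -> infinity as x increases to b = 0.
assert phi(-1e-9) > 1e8
```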

(c) The initial value problem

y ′ = −y^(−1) ,  y(−1) = 1,

has the maximal solution φ : ] − ∞, −1/2[ −→ R, φ(x) = √(−2x − 1) – here we have

G = R × (R \ {0}),  f : G −→ R,  f (x, y) = −y^(−1) ,

and

lim_{x↑b} (x, φ(x)) = (−1/2, 0) ∈ ∂G,  b := sup I = −1/2,

i.e. we are in Case (iii) of Prop. 3.26.

An example, where we are in Case (iii) of Prop. 3.26, but where limx↑b (x, φ(x))

does not exist, is given by the initial value problem

y ′ = −x^(−2) cos(x^(−1)),  y(−1/π) = 0,

which has the maximal solution φ : ] − ∞, 0[−→ R, φ(x) = sin(x^(−1)) – here we have

G = (R \ {0}) × R,  f : G −→ R,  f (x, y) = −x^(−2) cos(x^(−1)),

solution interval I =] − ∞, 0[, b := sup I = 0, ∂G = {0} × R,

lim_{x↑0} dist( (x, φ(x)), ∂G ) = lim_{x↑0} |x| = 0.

As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider the

initial value problem from (a), but this time with G := ] − 1, 1[ × R (and f ≡ 0 on G):
now the maximal solution is φ : ] − 1, 1[−→ R, φ(x) = −1, with solution interval I =
] − 1, 1[, b := sup I = 1, and limx↑1 (x, φ(x)) = (1, −1) ∈ ∂G. This last example also

illustrates that, even though it is quite common to omit an explicit specification of

the domain G when writing an ODE (as we did in (a)) – where it is usually assumed

that the intended domain can be guessed from the context – the maximal solution

will typically depend on the specification of G.


Example 3.31. We have already seen examples of initial value problems that admit

more than one maximal solution – for instance, the initial value problem of Ex. 1.4(b)

had infinitely many different maximal solutions, all of them defined on all of R. The fol-

lowing example shows that an initial value problem can have maximal solutions defined

on different intervals: Let

G := R × ] − 1, 1[,  f : G −→ R,  f (x, y) := √|y| / ( 1 − √|y| ),

and consider the initial value problem

y ′ = f (x, y) = √|y| / ( 1 − √|y| ),  y(0) = 0.   (3.78)

One maximal solution is the zero function φ : R −→ R, φ ≡ 0.

However, another maximal solution (that can be found using separation of variables) is

ψ : ] − 1, 1[−→ R,  ψ(x) := { −( 1 − √(1 + x) )^2 for −1 < x ≤ 0;  ( 1 − √(1 − x) )^2 for 0 ≤ x < 1 }.

To confirm the maximality of the solution ψ, note limx↓−1 (x, ψ(x)) = (−1, −1) ∈ ∂G

and limx↑1 (x, ψ(x)) = (1, 1) ∈ ∂G.
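A small numerical check of this example (a sketch, assuming the formulas above): ψ indeed satisfies ψ′ = f(x, ψ) away from the gluing point x = 0, and the zero function trivially solves (3.78) since f(x, 0) = 0:

```python
import math

def f(x, y):                         # right-hand side of (3.78)
    r = math.sqrt(abs(y))
    return r / (1.0 - r)

def psi(x):                          # the second maximal solution on ]-1, 1[
    if x <= 0:
        return -(1.0 - math.sqrt(1.0 + x)) ** 2
    return (1.0 - math.sqrt(1.0 - x)) ** 2

h = 1e-6
for x in (-0.9, -0.3, 0.3, 0.9):
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)   # central difference
    assert abs(dpsi - f(x, psi(x))) < 1e-4       # psi' = f(x, psi)

assert f(0.0, 0.0) == 0.0                        # so phi == 0 also solves (3.78)
```

The two maximal solutions live on the different intervals R and ]−1, 1[, as claimed.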

3.5 Continuity in Initial Conditions
The goal of the present section is to show that, under suitable conditions, small changes

in the initial condition for an ODE result in small changes in the solution. As, in

situations of nonuniqueness, we can change the solution without having changed the

initial condition at all, ensuring unique solutions to initial value problems is a minimal

prerequisite for our considerations in this section.

Definition 3.32. Let k, n ∈ N, let G ⊆ R × Kkn be open, and let f : G −→ Kn . We say that the explicit n-dimensional kth-order ODE (1.6), i.e.

y (k) = f ( x, y, y ′ , . . . , y (k−1) ),   (3.79a)

admits unique maximal solutions if, and only if, f is such that every initial value problem

consisting of (3.79a) and

∀_{j∈{0,...,k−1}}  y (j) (ξ) = ηj ,  η = (η0 , . . . , ηk−1 ),   (3.79b)

with (ξ, η) ∈ G, has a unique maximal solution φ(ξ,η) : I(ξ,η) −→ Kn (combining Cor.

3.16 with Th. 3.22 yields that G being open and f being continuous and locally Lipschitz


with respect to y is sufficient for (3.79a) to admit unique maximal solutions, but we

know from Rem. 3.17 that this condition is not necessary). If f is such that (3.79a)

admits unique maximal solutions, then

Y : Df −→ Kn ,  Y (x, ξ, η) := φ(ξ,η) (x),   (3.80)

defined on

Df := {(x, ξ, η) ∈ R × G : x ∈ I(ξ,η) }, (3.81)

is called the global or general solution to (3.79a). Note that the domain Df of Y is

determined entirely by f , which is notationally emphasized by its lower index f .
Lemma 3.33. Consider the situation of Def. 3.32 and assume f is such that (3.79a) admits unique maximal solutions, with global solution Y : Df −→ Kn .

(a) For each (ξ, η) ∈ G, Y (·, ξ, η) satisfies the initial conditions (3.79b) at ξ.

(b) If k = 1, then η = η0 and Y ( x, x̃, Y (x̃, ξ, η) ) = Y (x, ξ, η) for each (x, ξ, η), (x̃, ξ, η) ∈ Df .

(c) If k = 1, then Y ( ξ, x, Y (x, ξ, η) ) = η for each (x, ξ, η) ∈ Df .

Proof. (a) holds as Y (·, ξ, η) is a solution to (3.79b). For (b), note that (x̃, ξ, η) ∈ Df
implies ( x̃, Y (x̃, ξ, η) ) ∈ G, i.e. x̃, Y (x̃, ξ, η) are admissible initial data. Moreover,
Y ( ·, x̃, Y (x̃, ξ, η) ) and Y (·, ξ, η) are both maximal solutions to some initial value problem for (3.79a). Since both solutions agree at x = x̃, both functions must be identical
by the assumed uniqueness of solutions. In particular, they are defined for the same x
and yield the same value at each x. Setting x := ξ in (b) yields (c).
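For k = 1, property (b) is the flow (cocycle) property of the global solution. A minimal numeric illustration with the made-up test equation y′ = y (not from the text), whose global solution is Y(x, ξ, η) = η e^(x−ξ) with Df = R³:

```python
import math

# Global solution of the test equation y' = y: Y(x, xi, eta) = eta * exp(x - xi).
def Y(x, xi, eta):
    return eta * math.exp(x - xi)

x, xt, xi, eta = 2.0, 0.5, -1.0, 3.0
# (b): restarting from the intermediate value Y(xt, xi, eta) changes nothing.
assert math.isclose(Y(x, xt, Y(xt, xi, eta)), Y(x, xi, eta))
# (c): solving backwards to xi recovers the initial value eta.
assert math.isclose(Y(xi, x, Y(x, xi, eta)), eta)
```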

The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below is

the following Th. 3.34(a), which provides continuity in initial conditions locally. As a

byproduct, we will also obtain a version of the Picard-Lindelöf theorem in Th. 3.34(b),

which states the local uniform convergence of the so-called Picard iteration, a method for

obtaining approximate solutions that is quite different from the Euler method considered

above.

Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. with

k = 1, and with f being continuous and locally Lipschitz with respect to y on G open.

Fix an arbitrary norm ‖ · ‖ on Kn .

(a) For each (σ, ζ) ∈ G ⊆ R × Kn and each −∞ < a < b < ∞ such that [a, b] ⊆ I(σ,ζ)

(i.e., using the notation introduced in Def. 3.32, the maximal solution φ(σ,ζ) =

Y (·, σ, ζ) is defined on [a, b]), there exists δ > 0 satisfying:

(i) For each (ξ, η) in the set

Uδ (σ, ζ) := { (ξ, η) ∈ G : ξ ∈ ]a, b[, ‖η − Y (ξ, σ, ζ)‖ < δ },   (3.82)

the maximal solution φ(ξ,η) = Y (·, ξ, η) is defined on ]a, b[ (i.e. ]a, b[ ⊆ I(ξ,η) ).


(ii) The restriction of the global solution (x, ξ, η) 7→ Y (x, ξ, η) to the open set

W := ]a, b[ × Uδ (σ, ζ)   (3.83)

is continuous.

(b) (Picard-Lindelöf) For each (σ, ζ) ∈ G, there exists α > 0 such that the Picard

iteration, i.e. the sequence of functions (φm )m∈N0 , φm : ]σ − α, σ + α[−→ Kn , defined

recursively by

φ0 (x) := ζ,   (3.84a)
∀_{m∈N0}  φm+1 (x) := ζ + ∫_{σ}^{x} f ( t, φm (t) ) dt ,   (3.84b)

converges uniformly to the solution of the initial value problem (3.79) (with k = 1

and (ξ, η) := (σ, ζ)) on ]σ − α, σ + α[.
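The Picard iterates (3.84) can be computed exactly when f is polynomial in y. For the made-up test problem y′ = y, y(0) = 1 (so f(t, y) = y, σ = 0, ζ = 1 — an illustration, not from the text), each step integrates a polynomial, and φm is precisely the mth Taylor polynomial of the solution eˣ. A minimal sketch representing polynomials by coefficient lists:

```python
from math import factorial

def picard_step(coeffs, zeta=1.0):
    """One step of (3.84) for f(t, y) = y and sigma = 0:
    phi_{m+1}(x) = zeta + integral_0^x phi_m(t) dt,
    acting on the coefficient list [c0, c1, ...] of phi_m."""
    return [zeta] + [c / (k + 1) for k, c in enumerate(coeffs)]

phi = [1.0]                       # phi_0 == zeta = 1
for _ in range(5):
    phi = picard_step(phi)

print(phi)                        # Taylor coefficients 1/k! of exp, k = 0..5
assert all(abs(c - 1.0 / factorial(k)) < 1e-12 for k, c in enumerate(phi))
```

The uniform convergence on a neighborhood of σ asserted in (b) is visible here: the iterates are the partial sums of the exponential series.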

Proof. We will obtain (b) as an aside while proving (a). To simplify notation, we
introduce the function

ψ : [a, b] −→ Kn ,  ψ := Y (·, σ, ζ)↾[a,b] ,

whose graph

γ := { (x, ψ(x)) : x ∈ [a, b] }

is a compact subset of G (cf. C.14). Thus, γ has a positive distance from the closed set
(R × Kn ) \ G, implying

∃_{δ1 >0}  C := { (x, y) ∈ R × Kn : x ∈ [a, b], dist( (x, y), γ ) ≤ δ1 } ⊆ G.   (3.85)

Clearly, C is bounded and C is also closed (using the continuity of the distance function

d : R × Kn −→ R+ 0 , d(·) := dist(·, γ), the continuity of the projection to the first

component π1 : R × Kn −→ R, and noting C = d−1 [0, δ1 ] ∩ π1−1 [a, b]). Thus, C is

compact, and the hypothesis of f being locally Lipschitz with respect to y implies f to

be globally Lipschitz with some Lipschitz constant L ≥ 0 on the compact set C by Prop.

3.13. We can now choose the number δ > 0 claimed to exist in (a) to be any number satisfying

δ < e^{−L(b−a)} δ1 ,   (3.86)

which, in particular, implies

δ < δ1 .   (3.87)

Moreover, with d and π1 as above, Uδ (σ, ζ) as defined in (3.82) can be written in the

form

Uδ (σ, ζ) = d−1 [0, δ[ ∩ π1−1 ]a, b[,

showing it is an open set ([0, δ[ is, indeed, open in R+0 ).


Even though we are mostly interested in what happens on the open set W , it will be
convenient to define functions on the slightly larger compact set

W := [a, b] × U ,  U := { (x, y) ∈ R × Kn : x ∈ [a, b], ‖y − ψ(x)‖ ≤ δ } = d−1 [0, δ] ∩ π1−1 [a, b].

To proceed with the proof, we now carry out a form of the Picard iteration, defining a
sequence of functions (ψm )m∈N0 , ψm : W −→ Kn , recursively by

ψ0 (x, ξ, η) := ψ(x) + η − ψ(ξ),   (3.88a)
∀_{m∈N0}  ψm+1 (x, ξ, η) := η + ∫_{ξ}^{x} f ( t, ψm (t, ξ, η) ) dt .   (3.88b)

The proof will be concluded if we can show the (ψm )m∈N0 constitute a sequence of

continuous functions converging uniformly on W to Y ↾W . As an intermediate step, we

establish the following properties of the ψm (simultaneously) by induction on m ∈ N0 :

(1) ψm is continuous.

(2) One has

∀_{m∈N0} ∀_{(x,ξ,η)∈W}  ‖ψm (x, ξ, η) − ψ(x)‖ < δ1  ( ⇒ (x, ψm (x, ξ, η)) ∈ C ).

(3) One has

∀_{m∈N0} ∀_{(x,ξ,η)∈W}  ‖ψm+1 (x, ξ, η) − ψm (x, ξ, η)‖ ≤ L^{m+1} |x − ξ|^{m+1} δ / (m + 1)! .

To start the induction proof, notice that the continuity of ψ implies the continuity of

ψ0 . Moreover, if (x, ξ, η) ∈ W , then

‖ψ0 (x, ξ, η) − ψ(x)‖ = ‖η − ψ(ξ)‖ = ‖η − Y (ξ, σ, ζ)‖ ≤ δ < δ1 .   (3.89)

Also, from ψ = Y (·, σ, ζ) = φ(σ,ζ) , we know, for each x, ξ ∈ [a, b],

ψ(x) − ψ(ξ) = ζ + ∫_{σ}^{x} f (t, ψ(t)) dt − ζ − ∫_{σ}^{ξ} f (t, ψ(t)) dt = ∫_{ξ}^{x} f (t, ψ(t)) dt ,

and, thus,

‖ψ1 (x, ξ, η) − ψ0 (x, ξ, η)‖ = ‖ ∫_{ξ}^{x} ( f (t, ψ0 (t, ξ, η)) − f (t, ψ(t)) ) dt ‖
  ≤^{f L-Lip.} | ∫_{ξ}^{x} L ‖ψ0 (t, ξ, η) − ψ(t)‖ dt |
  = | ∫_{ξ}^{x} L ‖η − ψ(ξ)‖ dt | ≤ L |x − ξ| δ,


completing the proof of (1) – (3) for m = 0. For the induction step, let m ∈ N0 .

It is left as an exercise to prove the continuity of ψm+1 .

Using the triangle inequality, we estimate, for each (x, ξ, η) ∈ W ,

‖ψm+1 (x, ξ, η) − ψ(x)‖ ≤ Σ_{j=0}^{m} ‖ψj+1 (x, ξ, η) − ψj (x, ξ, η)‖ + ‖ψ0 (x, ξ, η) − ψ(x)‖
  ≤^{(3.89), ind.hyp. for (3)} Σ_{j=0}^{m} L^{j+1} |x − ξ|^{j+1} δ / (j + 1)! + δ
  ≤ e^{L|x−ξ|} δ <^{(3.86)} e^{L(b−a)} e^{−L(b−a)} δ1 = δ1 ,

establishing the estimate of (2) for m + 1. To prove the estimate in (3) for m replaced

by m + 1, one estimates, for each (x, ξ, η) ∈ W ,

‖ψm+2 (x, ξ, η) − ψm+1 (x, ξ, η)‖ ≤ | ∫_{ξ}^{x} ‖f (t, ψm+1 (t, ξ, η)) − f (t, ψm (t, ξ, η))‖ dt |
  ≤ | ∫_{ξ}^{x} L ‖ψm+1 (t, ξ, η) − ψm (t, ξ, η)‖ dt |
  ≤^{ind.hyp.} | ∫_{ξ}^{x} L · L^{m+1} |t − ξ|^{m+1} δ / (m + 1)! dt |
  = L^{m+2} |x − ξ|^{m+2} δ / (m + 2)! ,

completing the induction proof of (1) – (3).

As a consequence of (3), for each l, m ∈ N0 such that m > l:

∀_{(x,ξ,η)∈W}  ‖ψm (x, ξ, η) − ψl (x, ξ, η)‖ ≤ δ Σ_{j=l+1}^{m} L^j (b − a)^j / j! .   (3.90)

The convergence of the exponential series, thus, implies that (ψm (x, ξ, η))m∈N0 is a

Cauchy sequence for each (x, ξ, η) ∈ W , yielding pointwise convergence of the ψm to

some function ψ̃ : W −→ Kn . Letting m tend to infinity in (3.90) then shows

    ∀ (x, ξ, η) ∈ W :  ‖ψ̃(x, ξ, η) − ψl(x, ξ, η)‖ ≤ δ Σ_{j=l+1}^{∞} L^j (b − a)^j / j!,

where the independence of the right-hand side with respect to (x, ξ, η) ∈ W proves

ψm → ψ̃ uniformly on W . The uniform convergence together with (1) then implies ψ̃

to be continuous.

In the final step of the proof, we show ψ̃ = Y on W , i.e. ψ̃(·, ξ, η) solves (3.79) (with

k = 1). By Th. 1.5, we need to show

    ∀ (x, ξ, η) ∈ W :  ψ̃(x, ξ, η) = η + ∫_ξ^x f(t, ψ̃(t, ξ, η)) dt    (3.91)


(then uniqueness of solutions implies ψ̃(·, ξ, η) = Y (·, ξ, η)). To verify (3.91), given

ǫ > 0, by the uniform convergence ψm → ψ̃, choose m ∈ N sufficiently large such that

    ∀ k ∈ {m − 1, m} ∀ (x, ξ, η) ∈ W :  ‖ψ̃(x, ξ, η) − ψk(x, ξ, η)‖ < ǫ.

Then, for each (x, ξ, η) ∈ W,

    ‖ψ̃(x, ξ, η) − η − ∫_ξ^x f(t, ψ̃(t, ξ, η)) dt‖
        ≤ ‖ψ̃(x, ξ, η) − ψm(x, ξ, η)‖ + |∫_ξ^x ‖f(t, ψm−1(t, ξ, η)) − f(t, ψ̃(t, ξ, η))‖ dt|
        < ǫ + |∫_ξ^x L ‖ψm−1(t, ξ, η) − ψ̃(t, ξ, η)‖ dt| ≤ ǫ + L ǫ (b − a).    (3.92)

As ǫ > 0 was arbitrary, (3.92) proves (3.91).

It is noted that we have, indeed, proved (b) as a byproduct, since we know (for example

from the Peano Th. 3.8) that ψ must be defined on [σ − α, σ + α] for some α > 0 and

then φm = ψm (·, σ, ζ) on [σ − α, σ + α] for each m ∈ N0 .

Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order prob-

lems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect to

y on G open. Then the global solution (x, ξ, η) 7→ Y (x, ξ, η) as defined in Def. 3.32 is

continuous. Moreover, its domain Df is open.

Proof. Let (x, σ, ζ) ∈ Df . Then, using the notation from Def. 3.32, x is in the domain of

the maximal solution φ(σ,ζ) , i.e. x ∈ I(σ,ζ) . Since I(σ,ζ) is open, there must be −∞ < a <

x < b < ∞ such that [a, b] ⊆ I(σ,ζ) and then Th. 3.34(a) implies the global solution Y to

be continuous on W , where W as defined in (3.83) is an open neighborhood of (x, σ, ζ).

In particular, (x, σ, ζ) is an interior point of Df and Y is continuous at (x, σ, ζ). As

(x, σ, ζ) was arbitrary, Df must be open and Y must be continuous.

Corollary 3.36. Consider the situation of Def. 3.32 with f being continuous and locally

Lipschitz with respect to y on G open. Then the global solution (x, ξ, η) 7→ Y (x, ξ, η) as

defined in Def. 3.32 is continuous. Moreover, its domain Df is open.

Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand side

F of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuous

and locally Lipschitz with respect to y, provided f is continuous and locally Lipschitz

with respect to y. Thus, according to Th. 3.35, the equivalent first-order problem has

a continuous global solution Υ : DF −→ Kkn , defined on some open set DF . As a

consequence of Th. 3.1(b), Y = Υ1 : DF −→ Kn is the global solution to (3.79a). So

we have Df = DF and, as Υ is continuous, so is Y .

Definition 3.37. We now consider the situation, where the right-hand side f of the ODE depends on some (vector of) parameters µ in addition to depending on x and y:


for each (ξ, η, µ) ∈ G, the explicit n-dimensional kth-order initial value problem

    y^(k) = f(x, y, y′, . . . , y^(k−1), µ),    (3.93a)
    y^(j)(ξ) = ηj ∈ Kn  for each j ∈ {0, . . . , k − 1},    (3.93b)

has a unique maximal solution φ(ξ,η,µ) : I(ξ,η,µ) −→ Kn. The function

    Y : Df −→ Kn,  Y(x, ξ, η, µ) := φ(ξ,η,µ)(x),    (3.94)

defined on

    Df := {(x, ξ, η, µ) ∈ R × G : x ∈ I(ξ,η,µ)},    (3.95)

is called the global or general solution to (3.93a).

Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locally

Lipschitz with respect to (y, µ) on G open. Then the global solution Y as defined in (3.94) is continuous. Moreover, its domain Df is open.

Proof. We consider k = 1 (i.e. (3.93a) is of first order) – the case k > 1 can then, in the

usual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation,

define the auxiliary function F : G −→ K^{n+l} by

    Fj(x, (y, µ)) := fj(x, y, µ)  for j = 1, . . . , n,
    Fj(x, (y, µ)) := 0            for j = n + 1, . . . , n + l,    (3.96)

i.e. the new unknown is the combined vector (y, µ) ∈ K^{n+l}.

Then, since f is continuous and locally Lipschitz with respect to (y, µ), F is continuous

and locally Lipschitz with respect to y, and we can apply Th. 3.35 to

    y′ = F(x, y),    (3.97a)
    y(ξ) = (η, µ),    (3.97b)

obtaining that the global solution Ỹ : DF −→ K^{n+l} of (3.97a) is continuous on the open set DF. Moreover, by the definition of F in (3.96),

we have

    ∀ (x, ξ, η, µ) ∈ DF :  Ỹ(x, ξ, η, µ) = (Y(x, ξ, η, µ), µ),

where Y is as defined in (3.94). In particular, Df = DF and the continuity of Ỹ implies

the continuity of Y .

Example 3.39. As a simple example of a parametrized ODE, consider f : R × K2 −→

K, f (x, y, µ) := µy,

    y′ = f(x, y, µ) = µ y,
    y(ξ) = η,

with the global solution

    Y : R × R × K² −→ K,  Y(x, ξ, η, µ) = η e^{µ(x−ξ)}.
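As a quick plausibility check, the closed-form global solution above can be compared against a numerical integration of the initial value problem. The following sketch is illustrative only — the helper names and the sample values µ = 0.7, ξ = 0.5, η = 3 are ad-hoc choices, not from the text:

```python
import math

def Y(x, xi, eta, mu):
    """Closed-form global solution of y' = mu*y, y(xi) = eta."""
    return eta * math.exp(mu * (x - xi))

def rk4(f, xi, eta, x, n=1000):
    """Classical Runge-Kutta integration of y' = f(t, y) from xi to x."""
    h = (x - xi) / n
    t, y = xi, eta
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

mu = 0.7
exact = Y(2.0, 0.5, 3.0, mu)
numeric = rk4(lambda t, y: mu * y, 0.5, 3.0, 2.0)
```

The two values agree to high accuracy, and Y also satisfies the initial condition Y(ξ, ξ, η, µ) = η by construction.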


4 Linear ODE

In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE was

particularly simple. One can now combine the general theory of ODE with some linear

algebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODE

of higher order.

Notation 4.1. For n ∈ N, let M(n, K) denote the set of all n × n matrices over K.

Definition 4.2. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n, K)

and b : I −→ Kn be continuous. An ODE of the form

    y′ = A(x) y + b(x)    (4.1)

is called an n-dimensional linear ODE of first order. It is called homogeneous if, and only if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous.

—

Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand side

of (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initial

value problem for (4.1) has a unique maximal solution (exercise). However, to show

the maximal solution is always defined on all of I, we need some additional machinery,

which is developed in the next section.

In the current section, we will provide Gronwall’s inequality, which is also of interest

outside the field of ODE. Here, Gronwall’s inequality will allow us to prove the global ex-

istence of maximal solutions for ODE with linearly bounded right-hand side – a corollary

being that maximal solutions of (4.1) are always defined on all of I.

As an auxiliary tool on our way to Gronwall’s inequality, we will now briefly study

(one-dimensional) differential inequalities:

Definition 4.3. Given G ⊆ R × R = R², and f : G −→ R, call

    y′ ≤ f(x, y)    (4.2)

a (one-dimensional, first-order) differential inequality. A solution to (4.2) is a differentiable function w : I −→ R defined on a nontrivial interval I ⊆ R satisfying the two conditions

(i) {(x, w(x)) ∈ I × R : x ∈ I} ⊆ G,

(ii) w′(x) ≤ f(x, w(x)) for each x ∈ I.

Theorem 4.4. Let G ⊆ R × R = R² be open, let f : G −→ R be continuous and locally Lipschitz with respect to y, and let −∞ < a < b ≤ ∞. If w : [a, b[ −→ R is a solution

to the differential inequality (4.2) and φ : [a, b[−→ R is a solution to the corresponding

ODE, then

    w(a) ≤ φ(a)  ⇒  ∀ x ∈ [a, b[ :  w(x) ≤ φ(x).    (4.3)

Proof. Define g : G × R −→ R, g(x, y, µ) := f(x, y) + µ (cf. (4.8) below). Since f is continuous and locally Lipschitz with respect to y, g is continuous and locally

Lipschitz with respect to (y, µ). Thus, continuity in initial conditions as given by Cor.

3.38 applies, yielding the global solution Y : Dg −→ R, (x, ξ, η, µ) 7→ Y (x, ξ, η, µ), to

be continuous on the open set Dg .

We now consider an arbitrary compact subinterval [a, c] ⊆ [a, b[ with a < c < b, noting

that it suffices to prove w ≤ φ on every such interval [a, c]. The set

    γ := {(x, a, φ(a), 0) : x ∈ [a, c]}    (4.6)

is a compact subset of the open set Dg (being a continuous image of the compact set [a, c]), implying

    ∃ ǫ > 0 :  γǫ := {(x, ξ, η, µ) ∈ R⁴ : dist((x, ξ, η, µ), γ) < ǫ} ⊆ Dg.    (4.7)

If we choose the distance in (4.7) to be meant with respect to the max-norm on R4 and

if 0 < µ < ǫ, then (x, a, φ(a), µ) ∈ γǫ for each x ∈ [a, c], such that φµ := Y (·, a, φ(a), µ)

is defined on (a superset of) [a, c]. We proceed to prove w ≤ φµ on [a, c]: Seeking a

contradiction, assume there exists x0 ∈ [a, c] such that w(x0 ) > φµ (x0 ). Due to the

continuity of w and φµ , w > φµ must then hold in an entire neighborhood of x0 . On the

other hand, w(a) ≤ φ(a) = φµ (a), such that, for

    x1 := inf{x < x0 : w(t) > φµ(t) for each t ∈ ]x, x0]},

a ≤ x1 < x0 and w(x1) = φµ(x1). But then, for each sufficiently small h > 0,

    w(x1 + h) − w(x1) > φµ(x1 + h) − φµ(x1),

implying

    w′(x1) = lim_{h↓0} (w(x1 + h) − w(x1))/h ≥ lim_{h↓0} (φµ(x1 + h) − φµ(x1))/h = φ′µ(x1)
           = g(x1, φµ(x1), µ) = f(x1, φµ(x1)) + µ > f(x1, φµ(x1)) = f(x1, w(x1)),    (4.8)

in contradiction to w′(x1) ≤ f(x1, w(x1)) (w being a solution to (4.2)).

Thus, w ≤ φµ on [a, c] holds for every 0 < µ < ǫ, and continuity of Y on Dg yields,

    ∀ x ∈ [a, c] :  w(x) ≤ lim_{µ→0} φµ(x) = lim_{µ→0} Y(x, a, φ(a), µ) = Y(x, a, φ(a), 0) = φ(x),    (4.9)

proving (4.3).

Theorem 4.5 (Gronwall’s Inequality). Let I := [a, b[, where −∞ < a < b ≤ ∞. If

α, β, γ : I −→ R are continuous and β(x) ≥ 0 for each x ∈ I, then

    ∀ x ∈ I :  γ(x) ≤ α(x) + ∫_a^x β(t) γ(t) dt    (4.10)

implies

    ∀ x ∈ I :  γ(x) ≤ α(x) + ∫_a^x α(t) β(t) exp(∫_t^x β(s) ds) dt.    (4.11)

Proof. Define

    ψ : I −→ R,  ψ(x) := γ(x) − α(x),    (4.12a)
    w : I −→ R,  w(x) := ∫_a^x β(t) γ(t) dt.    (4.12b)

Then (4.10) states precisely that

    ∀ x ∈ I :  ψ(x) ≤ w(x).

Moreover, for each x ∈ I,

    w′(x) = β(x) γ(x) = β(x)(α(x) + ψ(x)) ≤ β(x) w(x) + α(x) β(x),

i.e. w is a solution to the differential inequality

    y′ ≤ β(x) y + α(x) β(x).    (4.13)

Continuously extending α and β to x < a (e.g. using the constant extensions α(x) = α(a)

and β(x) := β(a) for x < a), we can consider the linear ODE corresponding to (4.13) on

all of ] − ∞, b[. Using the initial condition y(a) = w(a) = 0, yields the unique solution

(employing the variation of constants Th. 2.3)

    φ : ]−∞, b[ −→ R,
    φ(x) := exp(∫_a^x β(s) ds) ∫_a^x exp(−∫_a^t β(s) ds) α(t) β(t) dt
          = ∫_a^x α(t) β(t) exp(∫_t^x β(s) ds) dt.    (4.14)

Then (4.3) yields

    ∀ x ∈ I :  ψ(x) ≤ w(x) ≤ φ(x) = ∫_a^x α(t) β(t) exp(∫_t^x β(s) ds) dt,    (4.15)

which, recalling ψ = γ − α, proves (4.11).

Example 4.6. An important special case of Gronwall's inequality occurs for constant α ≡ C: Let I := [a, b[, where −∞ < a < b ≤ ∞. If β, γ : I −→ R are continuous, β(x) ≥ 0 for each x ∈ I, and C ∈ R, then

    ∀ x ∈ I :  γ(x) ≤ C + ∫_a^x β(t) γ(t) dt    (4.16)

implies

    ∀ x ∈ I :  γ(x) ≤ C exp(∫_a^x β(t) dt) :    (4.17)

We apply Gronwall’s inequality of Th. 4.5 with α ≡ C together with the fundamental

theorem of calculus to obtain the estimate

    γ(x) ≤ C + C ∫_a^x β(t) exp(∫_t^x β(s) ds) dt
         = C − C ∫_a^x (−β(t)) exp(−∫_x^t β(s) ds) dt
         = C − C [exp(−∫_x^t β(s) ds)]_{t=a}^{t=x}
         = C − C (1 − exp(∫_a^x β(s) ds))
         = C exp(∫_a^x β(t) dt)    (4.18)

for each x ∈ I.

—

The following Th. 4.7 will be applied to show maximal solutions to linear ODE are

always defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also useful

to obtain the domains of maximal solutions for nonlinear ODE.

Theorem 4.7. Let n ∈ N, let I ⊆ R be an open interval, and let f : I × Kn −→ Kn be

continuous. If there exist nonnegative continuous functions γ, β : I −→ R₀⁺ such that

    ∀ (x, y) ∈ I × Kn :  ‖f(x, y)‖ ≤ γ(x) + β(x) ‖y‖,

then every maximal solution to

    y′ = f(x, y)

is defined on all of I.

Proof. Let c < d and φ : ]c, d[−→ Kn be a solution to y ′ = f (x, y). We prove that

d < b := sup I implies φ can be extended to the right and a := inf I < c implies φ can

be extended to the left. First, assume d < b and let x0 ∈]c, d[. The idea is to apply

Example 4.6 on the interval [x0 , d[. To this end, we estimate, for each x ∈ [x0 , d[:

    ‖φ(x)‖ = ‖φ(x0) + ∫_{x0}^x f(t, φ(t)) dt‖ ≤ ‖φ(x0)‖ + ∫_{x0}^x ‖f(t, φ(t))‖ dt    (4.19)
           ≤ ‖φ(x0)‖ + ∫_{x0}^x γ(t) dt + ∫_{x0}^x β(t) ‖φ(t)‖ dt.    (4.20)
x0 x0


Since the continuous function γ is uniformly bounded on the compact interval [x0 , d],

    ∃ C ≥ 0  ∀ x ∈ [x0, d[ :  ‖φ(x)‖ ≤ C + ∫_{x0}^x β(t) ‖φ(t)‖ dt.

Thus, Example 4.6 yields

    ∀ x ∈ [x0, d[ :  ‖φ(x)‖ ≤ C exp(∫_{x0}^x β(t) dt) ≤ C e^{M(d−x0)},    (4.21)

where M ≥ 0 is a uniform bound for the continuous function β on the compact interval

[x0 , d]. As (4.21) states that the graph

    gr⁺(φ) = {(x, φ(x)) ∈ G : x ∈ [x0, d[}

is contained in the compact set

    K := [x0, d] × {y ∈ Kn : ‖y‖ ≤ C e^{M(d−x0)}},

the extension results of Sec. 3.4 imply that φ can be extended to the right.

Now assume a < c. The idea is to apply the time reversion Lem. 1.9(b): According to

Lem. 1.9(b), ψ : ] − d, −c[−→ Kn , ψ(x) = φ(−x), is a solution to y ′ = −f (−x, y) and

the first part of the proof above shows ψ to have an extension to the right. However, then Rem. 3.21 tells us φ has an extension to the left.
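The role of the linear bound in Th. 4.7 can be illustrated numerically: y′ = y is linearly bounded and its solutions exist globally, whereas y′ = y² (no linear bound) with y(0) = 1 has the maximal solution 1/(1 − x) on ]−∞, 1[, blowing up at x = 1. The following sketch uses an explicit Euler scheme; names and step counts are arbitrary choices, not from the text:

```python
import math

def euler(f, y0, x0, x1, n):
    """Explicit Euler scheme for y' = f(x, y); enough for a qualitative picture."""
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

# Linearly bounded right-hand side (gamma = 0, beta = 1): global existence,
# the solution e^x stays moderate on [0, 5].
y_lin = euler(lambda x, y: y, 1.0, 0.0, 5.0, 200_000)

# Quadratic right-hand side: the maximal solution of y' = y^2, y(0) = 1, is
# 1/(1 - x) on ]-inf, 1[, so the numerical solution explodes as x approaches 1.
y_quad = euler(lambda x, y: y * y, 1.0, 0.0, 0.999, 200_000)
```

Near x = 0.999 the exact solution already equals 1000, and the numerical values grow accordingly, while the linear equation remains tame on a much longer interval.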

Theorem 4.8. Consider the setting of Def. 4.2 with an open interval I. Then every

initial value problem consisting of the linear ODE (4.1) and y(x0 ) = y0 , x0 ∈ I, y0 ∈ Kn ,

has a unique maximal solution φ : I −→ Kn (note that φ is defined on all of I).

Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locally

Lipschitz with respect to y. Thus, every initial value problem has a unique maximal

solution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on I

follows from Th. 4.7, as

    ∀ x ∈ I :  ‖A(x) y + b(x)‖ ≤ ‖b(x)‖ + ‖A(x)‖ ‖y‖,

where ‖A(x)‖ denotes the matrix norm of A(x) induced by the norm ‖ · ‖ on Kn (cf.

Appendix G).

We will now proceed to study the solution spaces of linear ODE – as it turns out, these

solution spaces inherit the linear structure of the ODE.

Notation 4.9. Again, we consider the setting of Def. 4.2. Define Li and Lh to be the

respective sets of solutions to (4.1) and its homogeneous version, i.e.

    Li := {(φ : I −→ Kn) : φ′ = Aφ + b},    (4.22a)
    Lh := {(φ : I −→ Kn) : φ′ = Aφ}.    (4.22b)

Theorem 4.10. Consider the setting of Def. 4.2 and Not. 4.9. Then Lh is a vector space over K and

    ∀ φ ∈ Li :  Li = φ + Lh = {φ + ψ : ψ ∈ Lh},    (4.23)

i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutions

of the homogeneous equation to a particular solution to the inhomogeneous equation

(note that this is completely analogous to what occurs for solutions to linear systems of

equations in linear algebra).

Proof. Exercise.

Theorem 4.11. Consider the setting of Def. 4.2, i.e. let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n, K) be continuous. Then the following holds:

(a) Lh is a vector space over K.

(b) For φ1, . . . , φk ∈ Lh, k ∈ N, the following statements are equivalent:

(i) The functions φ1, . . . , φk are linearly independent over K (as elements of Lh).

(ii) There exists x0 ∈ I such that the k vectors φ1(x0), . . . , φk(x0) ∈ Kn are linearly independent over K.

(iii) The k vectors φ1(x), . . . , φk(x) ∈ Kn are linearly independent over K for every x ∈ I.

(c) dim Lh = n.

(b): (iii) trivially implies (ii). That (ii) implies (i) can easily be shown by contraposition: If (i) does not hold, then there is (λ1, . . . , λk) ∈ K^k \ {0} such that Σ_{j=1}^k λj φj = 0, i.e. Σ_{j=1}^k λj φj(x) = 0 holds for each x ∈ I, i.e. (ii) does not hold. It remains to show (i) implies (iii). Once again, we accomplish this via contraposition: If (iii) does not hold, then there are (λ1, . . . , λk) ∈ K^k \ {0} and x ∈ I such that Σ_{j=1}^k λj φj(x) = 0. But then, since Σ_{j=1}^k λj φj ∈ Lh by (a), Σ_{j=1}^k λj φj = 0 (using uniqueness of solutions). In consequence, φ1, . . . , φk are linearly dependent and (i) does not hold.

(c): Let (b1 , . . . , bn ) be a basis of Kn and x0 ∈ I. Let φ1 , . . . , φn ∈ Lh be the solutions

to the initial conditions y(x0 ) = b1 , . . . , y(x0 ) = bn , respectively. Then the φ1 , . . . , φn

must be linearly independent by (b) (as they are linearly independent at x0 ), proving

dim Lh ≥ n. On the other hand, if φ1, . . . , φk ∈ Lh are linearly independent, k ∈ N, then, once more by (b), φ1(x), . . . , φk(x) ∈ Kn are linearly independent for each x ∈ I, showing k ≤ n and dim Lh ≤ n.

Example 4.12. In Example 1.4(e), we had claimed that the second-order ODE (1.16)

on [a, b], a < b, namely

y ′′ = −y


    L = {(c1 sin + c2 cos) : [a, b] −→ K : c1, c2 ∈ K}.    (1.17)

We are now in a position to verify this claim: The second-order ODE (1.16) is equivalent

to the homogeneous linear first-order ODE

    ( y1 )′   (  y2 )   (  0  1 ) ( y1 )
    ( y2 )  = ( −y1 ) = ( −1  0 ) ( y2 ),        (4.24)

which has the solutions φ1, φ2 : [a, b] −→ K² with

    φ1(x) := (sin x, cos x)ᵀ,  φ2(x) := (cos x, −sin x)ᵀ.    (4.25)

Moreover, φ1 and φ2 are linearly independent (e.g. since φ1(0) = (0, 1)ᵀ and φ2(0) = (1, 0)ᵀ are linearly independent, implying, via Th. 4.11(b), the linear independence of φ1(a), φ2(a), finally implying the linear independence of φ1, φ2 : [a, b] −→ K²). Thus,

    Lh = {(c1 φ1 + c2 φ2) : [a, b] −→ K² : c1, c2 ∈ K}    (4.26)

and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first components

of solutions to (4.24), the representation (1.17) is verified.
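The claims of this example are easy to sanity-check numerically: both candidate solutions should satisfy φ′ = Aφ for A = ((0, 1), (−1, 0)), and the determinant of the matrix (φ1(x), φ2(x)) should never vanish (here it is ≡ −1). A small sketch, with helper names and sample points chosen ad hoc:

```python
import math

def A_apply(v):
    """Right-hand side of the system (4.24): A = ((0, 1), (-1, 0))."""
    return (v[1], -v[0])

phi1 = lambda x: (math.sin(x), math.cos(x))
phi2 = lambda x: (math.cos(x), -math.sin(x))

def deriv(f, x, h=1e-6):
    """Central-difference approximation of f'(x), componentwise."""
    fp, fm = f(x + h), f(x - h)
    return tuple((p - m) / (2 * h) for p, m in zip(fp, fm))

max_residual = 0.0
min_abs_det = float("inf")
for x in (0.0, 0.5, 1.0, 2.0, -1.3):
    for phi in (phi1, phi2):
        d, rhs = deriv(phi, x), A_apply(phi(x))
        max_residual = max(max_residual, abs(d[0] - rhs[0]), abs(d[1] - rhs[1]))
    # determinant of the candidate fundamental matrix (phi1(x), phi2(x))
    det = phi1(x)[0] * phi2(x)[1] - phi1(x)[1] * phi2(x)[0]
    min_abs_det = min(min_abs_det, abs(det))
```

Here det Φ(x) = −sin²x − cos²x = −1 for every x, matching the linear independence argument above.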

Definition and Remark 4.13. A basis (φ1 , . . . , φn ), n ∈ N, of the n-dimensional

vector space Lh over K is called a fundamental system for the linear ODE (4.1). One

then also calls the matrix

          ( φ11  · · ·  φ1n )
    Φ :=  (  ⋮          ⋮  ) ,    (4.27)
          ( φn1  · · ·  φnn )

where the kth column of the matrix consists of the component functions φ1k , . . . , φnk of

φk , k ∈ {1, . . . , n}, a fundamental system or a fundamental matrix solution for (4.1). The

latter term is justified by the observation that Φ : I −→ M(n, K) can be interpreted as

a solution to the matrix-valued ODE

    Y′ = A(x) Y :    (4.28)

Indeed,

    Φ′ = (φ′1, . . . , φ′n) = (A(x) φ1, . . . , A(x) φn) = A(x) Φ.

Corollary 4.14. In the setting of Def. and Rem. 4.13, for φ1, . . . , φn ∈ Lh and Φ := (φ1, . . . , φn) as in (4.27), the following statements are equivalent:

(i) (φ1, . . . , φn) is a fundamental system for (4.1).

(ii) There exists x0 ∈ I such that det Φ(x0) ≠ 0.

(iii) det Φ(x) ≠ 0 for every x ∈ I.

Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b).

Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If Φ : I −→ M(n, K) is a fundamental system for (4.1), then the unique solution ψ : I −→ Kn of the initial value problem consisting of (4.1) and y(x0) = y0, (x0, y0) ∈ I × Kn, is given by

    ψ : I −→ Kn,  ψ(x) = Φ(x) Φ⁻¹(x0) y0 + Φ(x) ∫_{x0}^x Φ⁻¹(t) b(t) dt.    (4.29)

Proof. Indeed, ψ(x0) = Φ(x0) Φ⁻¹(x0) y0 = y0 and, for each x ∈ I,

    ψ′(x) = Φ′(x) Φ⁻¹(x0) y0 + Φ′(x) ∫_{x0}^x Φ⁻¹(t) b(t) dt + Φ(x) Φ⁻¹(x) b(x)        (by (I.3))
          = A(x) Φ(x) Φ⁻¹(x0) y0 + A(x) Φ(x) ∫_{x0}^x Φ⁻¹(t) b(t) dt + b(x)            (by (4.28))
          = A(x) ψ(x) + b(x),    (4.30)

i.e. ψ indeed solves (4.1).

Remark 4.16. The solution formula (2.2a) of Th. 2.3 is a special case of (4.29): We note that, for n = 1 and A(x) = a(x), the solution Φ := φ0 to the 1-dimensional homogeneous equation as defined in (2.2b), i.e.

    φ0 : I −→ K,  φ0(x) = exp(∫_{x0}^x a(t) dt) = e^{∫_{x0}^x a(t) dt},

constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since 1/φ0

exists). Taking into account Φ(x0) = φ0(x0) = 1, we obtain, for each x ∈ I,

    φ(x) = Φ(x) Φ⁻¹(x0) y0 + Φ(x) ∫_{x0}^x Φ⁻¹(t) b(t) dt        (by (4.29))
         = φ0(x) (y0 + ∫_{x0}^x (φ0(t))⁻¹ b(t) dt),    (4.31)

which is (2.2a).
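The scalar variation of constants formula (4.31) can be implemented directly, with the integrals evaluated by quadrature. The sketch below — function names, the trapezoid rule, and the test equation y′ = 2y + 3 are illustrative choices, not from the text — compares the result against the exact solution (5/2)e^{2x} − 3/2:

```python
import math

def trapezoid(f, lo, hi, n=500):
    """Composite trapezoidal rule."""
    if hi == lo:
        return 0.0
    h = (hi - lo) / n
    return h * (f(lo) / 2 + sum(f(lo + k * h) for k in range(1, n)) + f(hi) / 2)

def var_of_constants(a, b, x0, y0, x):
    """Evaluate (4.31): phi0(x) * (y0 + int_{x0}^x b(t)/phi0(t) dt)."""
    phi0 = lambda t: math.exp(trapezoid(a, x0, t))
    return phi0(x) * (y0 + trapezoid(lambda t: b(t) / phi0(t), x0, x))

# y' = 2y + 3, y(0) = 1 has the exact solution (5/2) e^{2x} - 3/2.
approx = var_of_constants(lambda t: 2.0, lambda t: 3.0, 0.0, 1.0, 1.0)
exact = 2.5 * math.exp(2.0) - 1.5
```

The quadrature error is O(h²), so the agreement is good; for non-constant coefficients the same code applies unchanged.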

—


In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions in

cases where A is constant. However, in general, fundamental matrix solutions are often

not explicitly available. In such situations, the following Th. 4.17 can sometimes help

to extract information about solutions.

Theorem 4.17 (Liouville’s Formula). Consider the setting of Def. 4.2 and recall the

trace of an n × n matrix A = (akl ) is defined by

    tr A := Σ_{k=1}^n akk.

If Φ : I −→ M(n, K) is a solution to the matrix-valued ODE (4.28) (e.g. a fundamental matrix solution for (4.1)), then

    ∀ x0, x ∈ I :  det Φ(x) = det Φ(x0) exp(∫_{x0}^x tr A(t) dt).    (4.32)

Proof. Exercise.
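For the system (4.24) of Example 4.12, tr A = 0, so Liouville's formula (4.32) predicts that det Φ is constant (there, ≡ −1). This is easy to confirm numerically; the following sketch uses ad-hoc sample points:

```python
import math

# Fundamental matrix of y' = A y with A = ((0, 1), (-1, 0)), cf. Example 4.12:
def Phi(x):
    return ((math.sin(x), math.cos(x)),
            (math.cos(x), -math.sin(x)))

def det2(M):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

trace_A = 0.0  # tr A = 0 for the matrix above
x0 = 0.3
xs = (-2.0, 0.0, 1.7, 4.0)
lhs = [det2(Phi(x)) for x in xs]                                    # det Phi(x)
rhs = [det2(Phi(x0)) * math.exp(trace_A * (x - x0)) for x in xs]    # (4.32)
```

Since the exponential factor is ≡ 1 here, both sides equal −1 at every sample point, as (4.32) demands.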

In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE.

We can now combine Th. 3.1 with our findings regarding first-order linear ODE to help

with the solution of higher-order linear ODE.

Definition 4.18. Let I ⊆ R be a nontrivial interval, n ∈ N. Let b : I −→ K and

a0, . . . , an−1 : I −→ K be continuous functions. Then a (1-dimensional) linear ODE of nth order is an equation of the form

    y^(n) = b(x) + Σ_{k=0}^{n−1} ak(x) y^(k).    (4.33)

It is called homogeneous if, and only if, b ≡ 0; it is called inhomogeneous if, and only if,

it is not homogeneous. Analogous to (4.22), define the respective sets of solutions

    Hi := {(φ : I −→ K) : φ^(n) = b + Σ_{k=0}^{n−1} ak φ^(k)},    (4.34a)
    Hh := {(φ : I −→ K) : φ^(n) = Σ_{k=0}^{n−1} ak φ^(k)}.    (4.34b)

Definition 4.19. Let I ⊆ R be a nontrivial interval and n ∈ N. For (n − 1) times differentiable functions φ1, . . . , φn : I −→ K, define the Wronskian

    W(φ1, . . . , φn) : I −→ K,
                                   (  φ1(x)         · · ·  φn(x)       )
    W(φ1, . . . , φn)(x) := det    (  φ1′(x)        · · ·  φn′(x)      )  .    (4.35)
                                   (  ⋮                    ⋮           )
                                   (  φ1^(n−1)(x)   · · ·  φn^(n−1)(x) )


Theorem 4.20. Consider the setting of Def. 4.18.

(a) If Hi and Hh are the sets defined in (4.34), then Hh is an n-dimensional vector space over K and, if φ ∈ Hi is arbitrary, then

    Hi = φ + Hh.    (4.36)

(b) For φ1, . . . , φn ∈ Hh, the following statements are equivalent:

(i) The functions φ1, . . . , φn are linearly independent over K (as elements of Hh).

(ii) There exists x0 ∈ I such that the Wronskian does not vanish: W(φ1, . . . , φn)(x0) ≠ 0.

(iii) The Wronskian never vanishes, i.e. W(φ1, . . . , φn)(x) ≠ 0 for every x ∈ I.

Proof. According to Th. 3.1, (4.33) is equivalent to the first-order linear ODE

    y′ = Ã(x) y + b̃(x),

with

             (   0       1       0     · · ·    0         0      )             (  0   )
             (   0       0       1     · · ·    0         0      )             (  0   )
    Ã(x) :=  (   ⋮                               ⋮        ⋮      ),  b̃(x) :=  (  ⋮   ).
             (   0       0       0     · · ·    1         0      )             (  0   )
             (   0       0       0     · · ·    0         1      )             (  0   )
             ( a0(x)   a1(x)   a2(x)   · · ·  an−2(x)  an−1(x)   )             ( b(x) )

Define

    Li := {(Φ : I −→ Kn) : Φ′ = ÃΦ + b̃},
    Lh := {(Φ : I −→ Kn) : Φ′ = ÃΦ}.

Moreover, given φ ∈ Hi, define the column vector

    Φ := (φ, φ′, . . . , φ^(n−1))ᵀ.

Then

    Hh = {Ψ1 : Ψ ∈ Lh}        (by Th. 3.1(a),(b))

and

    Hi = {Φ̃1 : Φ̃ ∈ Li} = {Φ̃1 : Φ̃ ∈ Φ + Lh} = {(Φ + Ψ)1 : Ψ ∈ Lh} = φ + Hh        (by Th. 3.1(a),(b) and (4.23)).

Moreover, the map Lh −→ Hh, Ψ ↦ Ψ1, is a linear isomorphism, implying that Hh, like Lh, is an n-dimensional vector space over K.

(b): For φ1, . . . , φn ∈ Hh, define Φkl := φl^(k−1) for each k, l ∈ {1, . . . , n} and

                                                    (  φ1(x)         · · ·  φn(x)       )
    ∀ x ∈ I :  Φ(x) := (Φ1(x), . . . , Φn(x)) = (Φkl(x)) = (  φ1′(x)        · · ·  φn′(x)      ) ,
                                                    (  ⋮                    ⋮           )
                                                    (  φ1^(n−1)(x)   · · ·  φn^(n−1)(x) )

such that det Φ(x) = W (φ1 , . . . , φn )(x) for each x ∈ I. Since Th. 3.1 yields Φ1 , . . . , Φn ∈

Lh if, and only if, φ1 , . . . , φn ∈ Hh , the equivalences of (b) follow from the equivalences

of Cor. 4.14.

Example 4.21. Consider, for x ∈ R⁺, the second-order linear ODE

    y′′ = a1(x) y′ + a0(x) y = y′/(2x) − y/(2x²).

One might be able to guess the solutions

    φ1, φ2 : R⁺ −→ K,  φ1(x) := x,  φ2(x) := √x.

The Wronskian is

    W(φ1, φ2) : R⁺ −→ K,
    W(φ1, φ2)(x) = det (  x      √x     )  =  √x/2 − √x  =  −√x/2  <  0,
                       (  1   1/(2√x)   )

such that, by Th. 4.20(b), (φ1, φ2) is a basis of Hh, i.e.

    Hh = {c1 φ1 + c2 φ2 : c1, c2 ∈ K}.
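The guessed solutions above can be double-checked with finite differences: both should leave zero residual in the ODE, and the Wronskian should equal −√x/2. A sketch, with step size and sample points chosen arbitrarily:

```python
import math

def residual(f, x, h=1e-4):
    """f''(x) - (f'(x)/(2x) - f(x)/(2x^2)) via central differences."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 - (d1 / (2 * x) - f(x) / (2 * x * x))

phi1 = lambda x: x
phi2 = lambda x: math.sqrt(x)

max_res = max(abs(residual(f, x)) for f in (phi1, phi2) for x in (0.5, 1.0, 2.0, 7.5))

# W(phi1, phi2)(x) = phi1 * phi2' - phi2 * phi1' = -sqrt(x)/2
wronskian = lambda x: phi1(x) * (0.5 / math.sqrt(x)) - phi2(x) * 1.0
```

At x = 4, for instance, the Wronskian evaluates to −1, consistent with −√x/2.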

For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 in

terms of integrals (of course, in general, evaluating integrals can still be very difficult,

and one might need effective and efficient numerical methods). In the previous sections,

we have studied systems of first-order linear ODE as well as linear ODE of higher order.

Unfortunately, there are no general solution formulas for these situations (one can use

(4.29) if one knows a fundamental system, but the problem is the absence of a general

procedure to obtain such a fundamental system). However, there is a more satisfying

solution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) and

the a0 , . . . , an−1 in (4.33) do not depend on x.

Definition 4.22. Let I ⊆ R be a nontrivial interval, n ∈ N, let b : I −→ K be continuous and a0, . . . , an−1 ∈ K. Then a (1-dimensional) linear ODE of nth order with constant coefficients is an equation of the form

    y^(n) = Σ_{j=0}^{n−1} aj y^(j) + b(x).    (4.38)

Notation 4.23. Let n ∈ N0 .

(a) Let P denote the set of all polynomials over K, Pn := {P ∈ P : deg P ≤ n}. We

will also write P[R], P[C], Pn [R], Pn [C] if we need to be specific about the field of

coefficients.

(b) Let I ⊆ R be a nontrivial interval. Let Dn(I) := Dn(I, K) denote the set of all n times differentiable functions f : I −→ K, let F(I, K) denote the set of all functions f : I −→ K, and, for each P ∈ Pn with P(x) = Σ_{j=0}^n aj x^j (a0, . . . , an ∈ K), define the differential operator

    P(∂x) : Dn(I) −→ F(I, K),  P(∂x)f := Σ_{j=0}^n aj ∂x^j f = Σ_{j=0}^n aj f^(j).    (4.39)

Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as

    P(∂x) y = b(x),  where  P(x) := x^n − Σ_{j=0}^{n−1} aj x^j.    (4.40)

The following Prop. 4.25 implies that the differential operator P (∂x ) does not, actually,

depend on the representation of the polynomial P .

Proposition 4.25. Let P, P1, P2 ∈ P and n := deg P.

(a) If P = P1 + P2, then ∀ f ∈ Dn(I) :  P(∂x)f = P1(∂x)f + P2(∂x)f.

(b) If P = P1 P2, then ∀ f ∈ Dn(I) :  P(∂x)f = P1(∂x)(P2(∂x)f).


Proof. Exercise.

Lemma 4.26. Let I ⊆ R be a nontrivial interval. For each P ∈ P and each λ ∈ K, it holds that

    ∀ x ∈ I :  P(∂x) e^{λx} = e^{λx} P(λ).    (4.42)

Proof. There exists n ∈ N0 and a0, . . . , an ∈ K such that P(x) = Σ_{j=0}^n aj x^j. One computes, for each x ∈ I,

    P(∂x) e^{λx} = Σ_{j=0}^n aj ∂x^j e^{λx} = Σ_{j=0}^n aj λ^j e^{λx} = e^{λx} P(λ),

proving (4.42).

Theorem 4.27. If a0, . . . , an−1 ∈ K, n ∈ N, and P(x) = x^n − Σ_{j=0}^{n−1} aj x^j has the n distinct zeros λ1, . . . , λn ∈ K (i.e. P(λ1) = · · · = P(λn) = 0), then (φ1, . . . , φn), where

    ∀ j ∈ {1, . . . , n} :  φj : I −→ K,  φj(x) := e^{λj x},    (4.43)

is a basis of the solution space

    Hh = {(φ : I −→ K) : P(∂x)φ = 0}    (4.44)

of the corresponding homogeneous ODE.

Proof. It is immediate from (4.42) and P (λj ) = 0 that each φj satisfies P (∂x )φj = 0.

From Th. 4.20(a), we already know Hh is an n-dimensional vector space over K. Thus,

it merely remains to compute the Wronskian. One obtains (cf. (4.35)):

                                 (  1          · · ·   1         )
    W(φ1, . . . , φn)(0) = det   (  λ1         · · ·   λn        )  =  Π_{1≤l<k≤n} (λk − λl)  ≠  0        (by (H.2)),
                                 (  ⋮                  ⋮         )
                                 (  λ1^{n−1}   · · ·   λn^{n−1}  )

since the λj are all distinct. We have used that the Wronskian, in the present case, turns

out to be a Vandermonde determinant. The formula (H.2) for this type of determinant is

provided and proved in Appendix H. We also used that the determinant of a matrix is the

same as the determinant of its transpose: det A = det Aᵗ. From W(φ1, . . . , φn)(0) ≠ 0

and Th. 4.20(b), we conclude that (φ1 , . . . , φn ) is a basis of Hh .

Example 4.28. Consider the ODE

    y′′′ = 2y′′ − y′ + 2y,    (4.45)

i.e. P(∂x)y = 0 with

    P(x) = x³ − 2x² + x − 2 = (x − 2)(x² + 1),    (4.46)

i.e. P has the distinct zeros λ1 = i, λ2 = −i, λ3 = 2. Thus, according to Th. 4.27, the three functions

    φ1, φ2, φ3 : R −→ C,  φ1(x) = e^{ix},  φ2(x) = e^{−ix},  φ3(x) = e^{2x},    (4.47)

form a basis of the C-vector space Hh . If we consider (4.45) as an ODE over R, then

we are interested in a basis of the R-vector space Hh . We can use linear combinations

of φ1 and φ2 to obtain such a basis (cf. Rem. 4.33(b) below):

    ψ1, ψ2 : R −→ R,  ψ1(x) = (e^{ix} + e^{−ix})/2 = cos x,  ψ2(x) = (e^{ix} − e^{−ix})/(2i) = sin x.    (4.48)

As explained in Rem. 4.33(b) below, as (φ1 , φ2 , φ3 ) are a basis of Hh over C, (ψ1 , ψ2 , φ3 )

are a basis of Hh over R.
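Since P has the distinct zeros i, −i, 2, we have P(x) = x³ − 2x² + x − 2, i.e. the ODE of this example reads y′′′ = 2y′′ − y′ + 2y. The real solutions cos, sin and e^{2x} obtained above can then be checked by plugging in hand-computed derivatives (a sketch; names are ad hoc):

```python
import math

# P(x) = (x - i)(x + i)(x - 2) = x^3 - 2x^2 + x - 2, so the ODE reads
# y''' = 2 y'' - y' + 2 y.
def ode_residual(y, d1, d2, d3, x):
    return d3(x) - (2 * d2(x) - d1(x) + 2 * y(x))

candidates = [
    # tuples (y, y', y'', y''') with derivatives computed by hand
    (math.cos, lambda x: -math.sin(x), lambda x: -math.cos(x), math.sin),
    (math.sin, math.cos, lambda x: -math.sin(x), lambda x: -math.cos(x)),
    (lambda x: math.exp(2 * x), lambda x: 2 * math.exp(2 * x),
     lambda x: 4 * math.exp(2 * x), lambda x: 8 * math.exp(2 * x)),
]
max_res = max(abs(ode_residual(*c, x)) for c in candidates for x in (-1.0, 0.0, 0.5, 2.0))
```

Each residual vanishes analytically, so only floating-point noise remains.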

—

By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros of

higher multiplicity. We provide this generalization in Th. 4.32 below after recalling the

notion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing two

preparatory lemmas.

Remark and Definition 4.29. According to the fundamental theorem of algebra (cf.

[Phi16, Th. 8.32, Cor. 8.33(b)]), for every polynomial P ∈ Pn with deg P = n, n ∈ N,

there exists r ∈ N with r ≤ n, k1 , . . . , kr ∈ N with k1 + · · · + kr = n, and distinct

numbers λ1, . . . , λr ∈ C such that

    P(x) = c (x − λ1)^{k1} · · · (x − λr)^{kr}  with some c ∈ C \ {0}.

One calls kj the multiplicity of the zero λj, j = 1, . . . , r.

Lemma 4.30. Let I ⊆ R be a nontrivial interval, λ ∈ K, k ∈ N0 , and f ∈ Dk (I). Then

we have

    ∀ x ∈ I :  (∂x − λ)^k (f(x) e^{λx}) = f^(k)(x) e^{λx}.    (4.50)

Proof. The proof is carried out by induction. The case k = 0 is merely the identity

(∂x − λ)0 f (x) eλx = f (x) eλx . For the induction step, let k ≥ 1 and compute, using

the product rule,

    (∂x − λ)^k (f(x) e^{λx}) = (∂x − λ)(f^(k−1)(x) e^{λx})        (ind. hyp.)
        = f^(k)(x) e^{λx} + f^(k−1)(x) λ e^{λx} − λ f^(k−1)(x) e^{λx}
        = f^(k)(x) e^{λx},    (4.51)

completing the induction.


Lemma 4.31. Let P ∈ P and λ ∈ K such that P(λ) ≠ 0. Then, for each Q ∈ P with deg Q = k, k ∈ N0, it holds that

    ∀ x ∈ R :  P(∂x)(Q(x) e^{λx}) = R(x) e^{λx}  with a suitable R ∈ P, deg R = k.    (4.52)

Proof. Writing P in powers of (x − λ) yields

    P(x) = Σ_{j=0}^n bj (x − λ)^j,  n ∈ N0,    (4.53)

where b0 = P(λ) ≠ 0 and the remaining bj ∈ K can also be calculated from the coefficients of P according to [Phi16, (6.6)]. We compute

    P(∂x)(Q(x) e^{λx}) = Σ_{j=0}^n bj (∂x − λ)^j (Q(x) e^{λx})        (by (4.53))
                       = Σ_{j=0}^n bj Q^(j)(x) e^{λx},                (by (4.50))

i.e. (4.52) holds with R := Σ_{j=0}^k bj Q^(j) (note Q^(j) ≡ 0 for j > k), and b0 ≠ 0 implies deg R = deg Q = k.

Theorem 4.32. If a0, . . . , an−1 ∈ K, n ∈ N, and P(x) = x^n − Σ_{j=0}^{n−1} aj x^j has the distinct zeros λ1, . . . , λr ∈ K with respective multiplicities k1, . . . , kr ∈ N, then the set

    B := {(φjm : I −→ K) : j ∈ {1, . . . , r}, m ∈ {0, . . . , kj − 1}},    (4.54a)

where

    ∀ j ∈ {1, . . . , r} ∀ m ∈ {0, . . . , kj − 1} :  φjm : I −→ K,  φjm(x) := x^m e^{λj x},    (4.54b)

is a basis of the solution space

    Hh = {(φ : I −→ K) : P(∂x)φ = 0}.

Proof. Since dim Hh = n by Th. 4.20(a) and #B = k1 + · · · + kr = n, it suffices to show that B ⊆ Hh and the elements of B are linearly independent. Let φjm be as in (4.54b). As λj is a zero of multiplicity kj of P, we can write P(x) = Qj(x)(x − λj)^{kj} with some Qj ∈ P. From the computation

    P(∂x)φjm(x) = Qj(∂x)(∂x − λj)^{kj}(x^m e^{λj x}) = Qj(∂x)((∂x^{kj} x^m) e^{λj x}) = 0        (by (4.50) and kj > m),

we see B ⊆ Hh. For the linear independence, it suffices to show, by induction on r ∈ N,

    (Σ_{j=1}^r Qj(x) e^{λj x} = 0  ∧  ∀ j ∈ {1, . . . , r} : Qj ∈ P_{kj−1})  ⇒  ∀ j ∈ {1, . . . , r} : Qj ≡ 0.    (4.55)

Since e^{λ1 x} ≠ 0 for each x ∈ R, the case r = 1 is immediate. For the induction step, let r ≥ 2. If at least one Qj ≡ 0, then the claim is immediate from the induction hypothesis. It remains to consider the case that none of the Qj vanishes identically. In that case, we apply (∂x − λr)^{kr} to Σ_{j=1}^r Qj(x) e^{λj x} = 0, obtaining

    Σ_{j=1}^{r−1} Rj(x) e^{λj x} = 0    (4.56)

with suitable Rj ∈ P, since Lem. 4.30 yields (∂x − λr)^{kr}(Qr(x) e^{λr x}) = Qr^(kr)(x) e^{λr x} = 0 (due to deg Qr ≤ kr − 1) and, for j < r, Lem. 4.31 applies due to (λj − λr)^{kr} ≠ 0, also providing deg Rj = deg Qj.

Thus, none of the Rj in (4.56) can vanish identically, violating the induction hypothesis.

This finishes the proof of Qj ≡ 0 for each j = 1, . . . , r and the proof of the theorem.

As it can occur in Th. 4.32 that P ∈ P[R], but λj ∈ C \ R for some or all of the zeros λj ,

the question arises of how to obtain a basis of the R-vector space Hh from the basis of

the C-vector space Hh provided by Th. 4.32. The following Rem. 4.33(b) answers this

question.

Remark 4.33. (a) If λ1 , λ2 ∈ C, then complex conjugation has the properties (cf.

[Phi16, Def. and Rem. 5.5])

    (λ1 ± λ2)‾ = λ̄1 ± λ̄2,  (λ1 λ2)‾ = λ̄1 λ̄2.

In consequence, if P ∈ P[R], then P(λ)‾ = P(λ̄) for each λ ∈ C. In particular, if P ∈ P[R] and λ ∈ C \ R is a nonreal zero of P, then λ̄ ≠ λ is also a zero of P.

(b) Consider the situation of Th. 4.32 with P ∈ P[R]. Using (a), if φjm : I −→ C,

φjm (x) = xm eλj x , λj ∈ C \ R, occurs in a basis for the C-vector space Hh (with

m = 0 in the special case of Th. 4.27), then φj̃m : I −→ C, φj̃m (x) = xm eλj̃ x , with

λj̃ = λ̄j will occur as well. Noting that, for each x ∈ R and each λ ∈ C,

    e^{λx} = e^{x(Re λ + i Im λ)} = e^{x Re λ}(cos(x Im λ) + i sin(x Im λ)),    (4.57a)
    e^{λ̄x} = e^{x(Re λ − i Im λ)} = e^{x Re λ}(cos(x Im λ) − i sin(x Im λ)),    (4.57b)
    (e^{λx} + e^{λ̄x})/2 = e^{x Re λ} cos(x Im λ),    (4.57c)
    (e^{λx} − e^{λ̄x})/(2i) = e^{x Re λ} sin(x Im λ),    (4.57d)

one can define

    ψjm : I −→ R,  ψjm(x) := (φjm(x) + φj̃m(x))/2 = x^m e^{x Re λj} cos(x Im λj),    (4.58a)
    ψj̃m : I −→ R,  ψj̃m(x) := (φjm(x) − φj̃m(x))/(2i) = x^m e^{x Re λj} sin(x Im λj).    (4.58b)

If one replaces each pair φjm , φj̃m in the basis for the C-vector space Hh with the

corresponding pair ψjm , ψj̃m , then one obtains a basis for the R-vector space Hh :

This follows from

    ( ψjm  )       ( φjm  )               (   1/2      1/2    )
    ( ψj̃m )  =  A ( φj̃m ),  with  A :=  (  1/(2i)  −1/(2i)  ),  det A = −1/(2i) ≠ 0.    (4.59)

Example 4.34. Consider the ODE

    y′′′′ = −8y′′ − 16y,    (4.60)

i.e. P(∂x)y = 0 with

    P(x) = x⁴ + 8x² + 16 = (x² + 4)² = (x − 2i)²(x + 2i)²,    (4.61)

i.e. P has the zeros λ1 = 2i, λ2 = −2i, both with multiplicity 2. Thus, according to Th. 4.32, the four functions

    φ10, φ11, φ20, φ21 : R −→ C,
    φ10(x) = e^{2ix},  φ11(x) = x e^{2ix},  φ20(x) = e^{−2ix},  φ21(x) = x e^{−2ix},

form a basis of the C-vector space Hh . If we consider (4.60) as an ODE over R, we can

use (4.58) to obtain the basis (ψ10 , ψ11 , ψ20 , ψ21 ) of the R-vector space Hh , where

ψ10 (x) = cos(2x), ψ11 (x) = x cos(2x), ψ20 (x) = sin(2x), ψ21 (x) = x sin(2x).
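With zeros ±2i of multiplicity 2, P(x) = (x² + 4)² = x⁴ + 8x² + 16, so the ODE of this example reads y′′′′ = −8y′′ − 16y. The real basis functions above can be verified directly from hand-computed derivatives (a sketch; sample points are arbitrary):

```python
import math

# P(x) = (x^2 + 4)^2 = x^4 + 8x^2 + 16, so the ODE reads y'''' = -8 y'' - 16 y.
def residual(y, d2, d4, x):
    return d4(x) - (-8 * d2(x) - 16 * y(x))

c = lambda x: math.cos(2 * x)
s = lambda x: math.sin(2 * x)
solutions = [
    # tuples (y, y'', y'''') computed by hand for each basis function
    (c, lambda x: -4 * c(x), lambda x: 16 * c(x)),
    (s, lambda x: -4 * s(x), lambda x: 16 * s(x)),
    (lambda x: x * c(x), lambda x: -4 * s(x) - 4 * x * c(x),
     lambda x: 32 * s(x) + 16 * x * c(x)),
    (lambda x: x * s(x), lambda x: 4 * c(x) - 4 * x * s(x),
     lambda x: -32 * c(x) + 16 * x * s(x)),
]
max_res = max(abs(residual(*t, x)) for t in solutions for x in (-2.0, 0.0, 0.7, 3.1))
```

All four residuals cancel exactly, confirming that the real functions form solutions of the fourth-order equation.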

—

If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b),

to obtain a basis of the homogeneous solution space Hh , then using the equivalence with

systems of first-order linear ODE and variation of constants according to Th. 4.15 to

solve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35

applies, then one can avoid using the above strategy to obtain a particular solution φ

to (4.38) (and, thus, the entire solution space via Hi = φ + Hh ).

Theorem 4.35. Let a0, . . . , an−1 ∈ K, n ∈ N, and P(x) = x^n − Σ_{j=0}^{n−1} aj x^j. Consider the ODE

    P(∂x)y = Q(x) e^{µx},  Q ∈ P,  µ ∈ K.    (4.62)

(a) (no resonance): If P(µ) ≠ 0 and m := deg(Q) ∈ N0, then there exists a polynomial R ∈ P such that deg(R) = m and

    φ : R −→ K,  φ(x) := R(x) e^{µx},    (4.63)

is a solution to (4.62).

(b) (resonance): If µ is a zero of P with multiplicity k ∈ N and m := deg(Q) ∈ N0, then there exists a solution to (4.62) of the following form:

    φ : R −→ K,  φ(x) := R(x) e^{µx},  R ∈ P,  R(x) = Σ_{j=k}^{m+k} cj x^j,  ck, . . . , cm+k ∈ K.    (4.64)

The reason behind the terms no resonance and resonance will be explained in the follow-

ing Example 4.36.


Proof. Exercise.

Example 4.36. Consider the second-order linear ODE

    d²x/dt² + ω0² x = a cos(ωt),  ω0, ω ∈ R⁺,  a ∈ R \ {0},    (4.65)

which can be written as P(∂t)x = a cos(ωt) with

    P(t) := t² + ω0² = (t − iω0)(t + iω0).    (4.66)

Note that the unknown function is written as x depending on the variable t (instead of y

depending on x). This is due to the physical interpretation of (4.65), where x represents

the position of a so-called harmonic oscillator at time t, having angular frequency ω0

and being subjected to a periodic external force of angular frequency ω and amplitude

a. We can find a particular solution φ to (4.65) by applying Th. 4.35 to

P (∂t )x = a eiωt . (4.67)

We have to distinguish two cases:

(a) Case ω ≠ ω0: In this case, one says that the oscillator and the external force are not in resonance, which explains the term no resonance in Th. 4.35(a). In this case, we can apply Th. 4.35(a) with µ := iω and Q ≡ a, yielding R ≡ a/P(iω) = a/(ω0² − ω²), i.e.

    φ0 : R −→ C,  φ0(t) := R(t) e^{µt} = (a/(ω0² − ω²)) e^{iωt},    (4.68a)

is a solution to (4.67) and

    φ : R −→ R,  φ(t) := Re φ0(t) = (a/(ω0² − ω²)) cos(ωt),    (4.68b)

is a solution to (4.65).

(b) Case ω = ω0 : In this case, one says that the oscillator and the external force are in

resonance, which explains the term resonance in Th. 4.35(b). In this case, we can

apply Th. 4.35(b) with µ := iω and Q ≡ a, i.e. m = 0, k = 1, yielding R(t) = ct

for some c ∈ C. To determine c, we plug x(t) = R(t) eµt into (4.67):

    P(∂t)(ct e^{iωt}) = ∂t(c e^{iωt} + ciωt e^{iωt}) + ω0² ct e^{iωt} = 2ciω e^{iωt} = a e^{iωt}  ⇒  c = a/(2iω).    (4.69)

Thus,

a

φ0 : R −→ C, φ0 (t) := t eiωt , (4.70a)

2iω

is a solution to (4.67) and

a

φ : R −→ R, φ(t) := Re φ0 (t) = t sin(ωt), (4.70b)

2ω

is a solution to (4.65).
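The two cases of Example 4.36 can also be checked numerically. The following sketch (parameter values a, ω0, ω are arbitrary illustrative choices, not from the text) verifies via a second central difference quotient that (4.68b) and (4.70b) indeed satisfy x″ + ω0²x = a cos(ωt):

```python
import math

# Finite-difference check that the particular solutions of Example 4.36
# satisfy x'' + w0^2 x = a*cos(w*t); parameter values are arbitrary samples.
a = 1.5

def residual(phi, w0, w, t, h=1e-4):
    # second central difference approximates phi''(t)
    lhs = (phi(t - h) - 2 * phi(t) + phi(t + h)) / h ** 2 + w0 ** 2 * phi(t)
    return abs(lhs - a * math.cos(w * t))

def phi_no_res(t, w0=2.0, w=3.0):       # case (a), w != w0, cf. (4.68b)
    return a / (w0 ** 2 - w ** 2) * math.cos(w * t)

def phi_res(t, w=2.0):                  # case (b), w == w0, cf. (4.70b)
    return a / (2 * w) * t * math.sin(w * t)

assert all(residual(phi_no_res, 2.0, 3.0, t) < 1e-5 for t in (0.0, 0.7, 2.3))
assert all(residual(phi_res, 2.0, 2.0, t) < 1e-5 for t in (0.0, 0.7, 2.3))
print("both particular solutions verified")
```

Note how in case (b) the amplitude factor t in (4.70b) grows without bound, which is the physically observable effect of resonance.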


Definition 4.37. Let I ⊆ R be a nontrivial interval, n ∈ N, A ∈ M(n, K) and b :

I −→ Kn be continuous. Then a linear ODE with constant coefficients is an equation

of the form

y ′ = A y + b(x), (4.71)

i.e. a linear ODE, where the matrix A does not depend on x.

Since, for n = 1 and a ∈ K, the exponential function x 7→ eax
is precisely the solution to the initial value problem y ′ = a y, y(0) = 1, the following
definition constitutes a natural generalization:

Definition 4.38. Given n ∈ N, A ∈ M(n, C), define the matrix exponential function
x 7→ eAx as the unique solution Y : R −→ M(n, C) to the initial value problem

    Y ′ = A Y ,   Y (0) = Id,        (4.72a)

equivalently, to the integral equation

    ∀ x ∈ R:   Y (x) = Id + ∫₀ˣ A Y (t) dt        (4.72b)

(x 7→ eAx is also called the principal matrix solution of y ′ = A y).


The previous definition of the matrix exponential function is further justified by the

following result:

Theorem 4.39. For each A ∈ M(n, C), n ∈ N, it holds that

    ∀ x ∈ R:   eAx = Σ_{k=0}^∞ (Ax)^k / k!        (4.73)

in the sense that the partial sums on the right-hand side converge pointwise to eAx on
R, where the convergence is even uniform on every compact interval.

Proof. Since all norms on the finite-dimensional space M(n, C) ≅ C^{n²} are equivalent,
we may choose a convenient
norm on M(n, C). So we let ‖ · ‖ denote an arbitrary operator norm on M(n, C),
induced by some norm ‖ · ‖ on Cn . We first show that the partial sums (Am (x))m∈N ,
Am (x) := Σ_{k=0}^m (Ax)^k / k! , in (4.73) form a Cauchy sequence in M(n, C): For M, N ∈ N,
N > M , one estimates, for each x ∈ R,

N > M , one estimates, for each x ∈ R,

    ‖AN (x) − AM (x)‖ = ‖ Σ_{k=M+1}^N (Ax)^k / k! ‖  ≤(G.10)  Σ_{k=M+1}^N ‖A‖^k |x|^k / k! .        (4.74)


Since the convergence lim_{m→∞} Σ_{k=0}^m ‖A‖^k |x|^k / k! = e^{‖A‖ |x|} is pointwise for x ∈ R and uniform
on every compact interval, (4.74) shows each (Am (x))m∈N is a Cauchy sequence that
converges to some Φ(x) ∈ M(n, C) (by the completeness of M(n, C)) pointwise for
x ∈ R and uniformly on every compact interval. It remains to show Φ is the solution to
(4.72b), i.e.

    ∀ x ∈ R:   Φ(x) = Id + ∫₀ˣ A Φ(t) dt .        (4.75)

Using the identity

    Am (x) = Id + Σ_{k=1}^m (Ax)^k / k! = Id + A Σ_{k=0}^{m−1} A^k x^{k+1} / (k + 1)! = Id + ∫₀ˣ A ( Σ_{k=0}^{m−1} (At)^k / k! ) dt ,

one estimates, for each x ∈ R,

    ‖ Φ(x) − Id − ∫₀ˣ A Φ(t) dt ‖ ≤ ‖ Φ(x) − Am (x) ‖ + ‖ Am (x) − Id − ∫₀ˣ A Φ(t) dt ‖
        = ‖ Φ(x) − Am (x) ‖ + ‖ ∫₀ˣ A ( Σ_{k=0}^{m−1} (At)^k / k! − Φ(t) ) dt ‖
        ≤ ‖ Φ(x) − Am (x) ‖ + | ∫₀ˣ ‖A‖ ‖ Am−1 (t) − Φ(t) ‖ dt |  →  0   for m → ∞,

proving (4.75).
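The series (4.73) can be evaluated directly. In the following plain-Python sketch (the matrix and evaluation point are arbitrary illustrative choices), the partial sums are compared against the known closed form of eAx for the rotation generator A = [[0, −1], [1, 0]]:

```python
import math

# Partial sums of the series (4.73) for the 2x2 rotation generator
# A = [[0,-1],[1,0]]; here e^{Ax} is known in closed form:
# e^{Ax} = [[cos x, -sin x], [sin x, cos x]].

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, x, terms=30):
    S = [[1.0, 0.0], [0.0, 1.0]]          # running partial sum, starts at Id
    T = [[1.0, 0.0], [0.0, 1.0]]          # current term (Ax)^k / k!
    for k in range(1, terms):
        T = mat_mul(T, [[a * x / k for a in row] for row in A])
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

A = [[0.0, -1.0], [1.0, 0.0]]
x = 1.3
E = expm_series(A, x)
expected = [[math.cos(x), -math.sin(x)], [math.sin(x), math.cos(x)]]
assert all(abs(E[i][j] - expected[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print("series matches closed form at x =", x)
```

The rapid convergence observed here reflects the factorials in the denominators of (4.73), the same mechanism used in the Cauchy estimate (4.74).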

The matrix exponential function has some properties that are familiar from the case

n = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected

(see Prop. 4.42(a),(b)).

Proposition 4.40. Let A ∈ M(n, C), n ∈ N.

(a) eA(t+s) = eAt eAs holds for each s, t ∈ R.

(b) (eAx )−1 = eA(−x) = e−Ax holds for each x ∈ R.

(c) For the transpose Aᵗ , one has e^{Aᵗ x} = (eAx )ᵗ for each x ∈ R.

Proof. (a): Fix s ∈ R. The function Φs : R −→ M(n, C), Φs (t) := eA(t+s) , is a solution
to Y ′ = AY (namely the one for the initial condition Y (−s) = Id). Moreover, the
function Ψs : R −→ M(n, C), Ψs (t) := eAt eAs , is also a solution to Y ′ = AY , since
Ψ′s (t) = A eAt eAs = A Ψs (t).
Finally, since Ψs (0) = eA0 eAs = Id eAs = eAs = Φs (0), the claimed Φs = Ψs follows by
uniqueness of solutions.

(b) is an easy consequence of (a), since

(a)

Id = eA0 = eA(x−x) = eAx e−Ax .


(c): The map A 7→ Aᵗ is continuous on M(n, C) (if limk→∞ Ak = A, then
limk→∞ ak,αβ = aαβ for all components, which implies limk→∞ Aᵗk = Aᵗ ), providing, for
each x ∈ R,

    e^{Aᵗ x} = lim_{m→∞} Σ_{k=0}^m (Aᵗ x)^k / k! = lim_{m→∞} ( Σ_{k=0}^m (Ax)^k / k! )ᵗ = ( lim_{m→∞} Σ_{k=0}^m (Ax)^k / k! )ᵗ = (eAx )ᵗ ,

which proves (c).

Proposition 4.41. For each A ∈ M(n, C), n ∈ N, it holds that

    det eA = e^{tr A} .

Proof. Applying the Liouville formula to the principal matrix solution x 7→ eAx of
Y ′ = A Y yields

    det eAx = det eA0 · exp( ∫₀ˣ tr A dt ) = 1 · e^{x tr A} ,        (4.76)

and setting x = 1 proves the claim.

Proposition 4.42. Let A, B ∈ M(n, C), n ∈ N.

(a) BeAx = eAx B holds for each x ∈ R if, and only if, AB = BA.

(b) e(A+B)x = eAx eBx holds for each x ∈ R if, and only if, AB = BA.

Proof. (a): If BeAx = eAx B holds for each x ∈ R, then differentiation yields BAeAx =

AeAx B for each x ∈ R, and the case x = 0 provides BA Id = A Id B, i.e. BA = AB.

For the converse, assume BA = AB and define the auxiliary maps

    fB : M(n, C) −→ M(n, C),   fB (C) := BC,
    gB : M(n, C) −→ M(n, C),   gB (C) := CB.

If ‖·‖ denotes an operator norm, then ‖BC1 −BC2 ‖ ≤ ‖B‖‖C1 −C2 ‖ and ‖C1 B−C2 B‖ ≤
‖B‖‖C1 − C2 ‖, showing fB and gB to be (even Lipschitz) continuous. Thus,

    BeAx = fB (eAx ) = fB ( lim_{m→∞} Σ_{k=0}^m (Ax)^k / k! ) = lim_{m→∞} fB ( Σ_{k=0}^m (Ax)^k / k! )
         = lim_{m→∞} B Σ_{k=0}^m (Ax)^k / k!  =(AB=BA)  lim_{m→∞} ( Σ_{k=0}^m (Ax)^k / k! ) B = lim_{m→∞} gB ( Σ_{k=0}^m (Ax)^k / k! )
         = gB ( lim_{m→∞} Σ_{k=0}^m (Ax)^k / k! ) = gB (eAx ) = eAx B,

proving (a).

(b): Exercise (hint: use (a)).
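Prop. 4.42(b) is easy to illustrate numerically: for the non-commuting pair A, B below (an arbitrary illustrative choice), e^{A+B} and e^{A} e^{B} differ markedly, while for a matrix commuting with itself the identity holds. The matrix exponential is approximated via the series (4.73):

```python
# Sketch illustrating Prop. 4.42(b): e^{(A+B)x} = e^{Ax} e^{Bx} can fail
# when AB != BA.  Matrices are arbitrary 2x2 examples; expm via the series.

def mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def expm(A, terms=40):
    n = len(A)
    S = [[float(i == j) for j in range(n)] for i in range(n)]
    T = [row[:] for row in S]
    for k in range(1, terms):
        T = mul(T, [[a / k for a in row] for row in A])
        S = add(S, T)
    return S

A = [[0.0, 1.0], [0.0, 0.0]]   # nilpotent
B = [[0.0, 0.0], [1.0, 0.0]]   # its transpose; AB != BA

lhs = expm(add(A, B))
rhs = mul(expm(A), expm(B))
diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
assert diff > 0.1     # the two sides differ markedly
print("max entrywise difference:", diff)
```

Here e^{A+B} is the symmetric matrix with cosh 1 on the diagonal and sinh 1 off the diagonal, whereas e^{A} e^{B} = [[2, 1], [1, 1]].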


We will see that the solution theory of linear ODE with constant coefficients is related

to the eigenvalues of A. We recall the definition of this notion:

Definition 4.43. Let A ∈ M(n, C), n ∈ N. Then λ ∈ C is called an eigenvalue of
A if, and only if, there exists 0 ≠ v ∈ Cn such that

    Av = λv.        (4.77)

Each such v is called an eigenvector of A with respect to λ.

Theorem 4.44. Let n ∈ N and A ∈ M(n, C).

(a) If v ∈ Cn is an eigenvector of A with respect to the eigenvalue λ ∈ C, then
φ : R −→ Cn , φ(x) := eλx v, is a solution to the homogeneous version of (4.71).

(b) If v1 , . . . , vn ∈ Cn are linearly independent and vj is an eigenvector of A with
respect to the eigenvalue λj ∈ C of A for each j ∈ {1, . . . , n}, then φ1 , . . . , φn with

    φj : R −→ Cn ,   φj (x) := e^{λj x} vj ,   j ∈ {1, . . . , n},

form a fundamental system for the homogeneous version of (4.71).

Proof. (a): One computes φ′ (x) = λ eλx v = eλx A v = A φ(x) for each x ∈ R.

(b): Without loss of generality, we may consider I = R. We already know from (a) that

each φj is a solution to the homogeneous version of (4.71). Thus, it merely remains

to check that φ1 , . . . , φn are linearly independent. As φ1 (0) = v1 , . . . , φn (0) = vn are

linearly independent by hypothesis, the linear independence of φ1 , . . . , φn is provided by

Th. 4.11(b).
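Th. 4.44 can be checked numerically for a concrete matrix. In the following sketch, the matrix A = [[2, 1], [1, 2]] and its eigenpairs are illustrative choices (not from the text); a central difference quotient of φj(x) = e^{λj x} vj is compared with A φj(x):

```python
import math

# Numerical check of Th. 4.44 for the sample matrix A = [[2, 1], [1, 2]],
# which has eigenpairs (3, (1, 1)) and (1, (1, -1)):
# phi_j(x) = e^{lambda_j x} v_j satisfies phi_j' = A phi_j.

A = [[2.0, 1.0], [1.0, 2.0]]
pairs = [(3.0, (1.0, 1.0)), (1.0, (1.0, -1.0))]

def phi(lam, v, x):
    return [math.exp(lam * x) * vi for vi in v]

h, x0 = 1e-6, 0.4
for lam, v in pairs:
    # central difference quotient approximating phi'(x0)
    d = [(a - b) / (2 * h)
         for a, b in zip(phi(lam, v, x0 + h), phi(lam, v, x0 - h))]
    Ap = [sum(A[i][j] * phi(lam, v, x0)[j] for j in range(2)) for i in range(2)]
    assert all(abs(di - ai) < 1e-4 for di, ai in zip(d, Ap))
print("eigenvector solutions verified for both eigenpairs")
```

Since v1 = (1, 1) and v2 = (1, −1) are linearly independent, the two functions form a fundamental system, exactly as in Th. 4.44(b).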

To proceed, we need a few more notions and results from linear algebra:

Theorem 4.45. Let n ∈ N and A ∈ M(n, C). Then the following statements (i) and

(ii) are equivalent:

(i) There exists a basis B of eigenvectors for Cn , i.e. there exist v1 , . . . , vn ∈ Cn and

λ1 , . . . , λn ∈ C such that B = {v1 , . . . , vn } is a basis of Cn and Avj = λj vj for

each j = 1, . . . , n (note that the vj must all be distinct, whereas some (or all) of

the λj may be identical).

(ii) There exists an invertible matrix W ∈ M(n, C) such that

    W −1 AW = diag(λ1 , . . . , λn )        (4.80)

is a diagonal matrix (in this case, the columns of W must
actually be the respective eigenvectors to the eigenvalues λ1 , . . . , λn ).

While not every matrix is diagonalizable, every A ∈
M(n, C) can at least be transformed into Jordan normal form:

Theorem 4.46 (Jordan Normal Form). Let n ∈ N and A ∈ M(n, C). There exists an
invertible matrix W ∈ M(n, C) such that

    B := W −1 AW        (4.81)

is in Jordan normal form, i.e. B is block diagonal,

    B = diag(B1 , . . . , Br ),        (4.82)

where each block Bj is either the 1 × 1 matrix (λj ) or a Jordan block having λj on the
diagonal, 1 on the superdiagonal, and 0 everywhere else:

    Bj = (λj )   or   Bj =
        ( λj  1            0  )
        (     λj  1           )
        (        ⋱   ⋱        )
        (            λj  1    )
        ( 0              λj   ) ,        (4.83)

where λj is an eigenvalue of A.

The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODE
with constant coefficients is the following theorem:

Theorem 4.47. Let n ∈ N, A ∈ M(n, C), and let W ∈ M(n, C) be invertible.

(a) For each function φ : I −→ Cn , defined on a nontrivial interval I ⊆ R, the
following statements are equivalent:

    (i) φ is a solution to y ′ = A y.

    (ii) ψ := W −1 φ : I −→ Cn is a solution to y ′ = W −1 AW y.

(b) e^{W −1 AW x} = W −1 eAx W for each x ∈ R.

Proof. (a): As W is invertible, the claim is due to the equivalences

    φ′ = Aφ ⇔ W −1 φ′ = W −1 Aφ ⇔ ψ ′ = W −1 AW ψ .

(b): By definition, x 7→ e^{W −1 AW x} is the solution to the initial value problem Y ′ =
W −1 AW Y , Y (0) = Id. Thus, noting W −1 eA0 W = Id and

    (W −1 eAx W )′ = W −1 A eAx W = W −1 AW (W −1 eAx W ),

the uniqueness of solutions to initial value problems proves
(b).

Remark 4.48. To obtain a fundamental system for (4.71) with A ∈ M(n, C), it suffices

to obtain a fundamental system for y ′ = By, where B := W −1 AW is in Jordan normal

form and W ∈ M(n, C) is invertible: If φ1 , . . . , φn are linearly independent solutions to

y ′ = By, then A = W BW −1 , Th. 4.47(a), and W being a linear isomorphism yield that

ψ1 := W φ1 , . . . , ψn := W φn are linearly independent solutions to y ′ = Ay.

Moreover, since B is in block diagonal form with each block being a Jordan matrix

according to (4.82) and (4.83), it actually suffices to solve y ′ = By assuming that

    B = λ Id + N,        (4.84)

where either N = 0 (zero matrix) or N is the canonical nilpotent matrix having all
entries 1 on the superdiagonal and all other entries 0:

    N =
        ( 0  1           )
        (    0  1        )
        (       ⋱   ⋱    )
        (           0  1 )
        (              0 ) ,        (4.85)

where the case N = 0 is already covered by Th. 4.44. The remaining case is covered by

the following Th. 4.49.

Theorem 4.49. Let λ ∈ C, k ∈ N, k ≥ 2, and assume 0 ≠ N ∈ M(k, C) is a canonical
nilpotent matrix according to (4.85). Then

    Φ : R −→ M(k, C),   Φ(x) := eλx ·
        ( 1  x  x²/2  …  x^{k−2}/(k−2)!  x^{k−1}/(k−1)! )
        ( 0  1  x     …  x^{k−3}/(k−3)!  x^{k−2}/(k−2)! )
        ( ⋮         ⋱                    ⋮              )
        ( 0  0  0     …  1               x              )
        ( 0  0  0     …  0               1              ) ,        (4.86)

is the principal matrix solution to Y ′ = (λ Id + N ) Y , i.e.

    ∀ x ∈ R:   Φ(x) = e^{(λ Id + N ) x} ;        (4.88)

in particular, the columns of Φ form a fundamental system for y ′ = (λ Id + N ) y and,
for each x ∈ R, they are linearly
independent.

Proof. Φ(0) = Id is immediate from (4.86). Since Φ(x) has upper triangular form with
all 1’s on the diagonal, we obtain det Φ(x) = e^{kλx} ≠ 0 for each x ∈ R, showing the

columns of Φ are linearly independent. Let φαβ : R −→ C denote the αth component

function of the βth column of Φ, i.e.

    ∀ α, β ∈ {1, . . . , k}:   φαβ : R −→ C,   φαβ (x) := eλx x^{β−α} / (β − α)!  for α ≤ β,   φαβ (x) := 0  for α > β.

It remains to verify Φ′ = (λ Id + N ) Φ, which, componentwise, reads

    ∀ α, β ∈ {1, . . . , k}:   φ′αβ = λφαβ + φα+1,β  for α < β,   φ′αβ = λφαβ  for α = β,   φ′αβ = 0  for α > β.        (4.89)

One computes,

    ∀ α, β ∈ {1, . . . , k}:   φ′αβ (x) = λ eλx x^{β−α} / (β − α)! + eλx x^{β−(α+1)} / (β − (α + 1))!   for α < β,
                               φ′αβ (x) = λ eλx x^{β−α} / (β − α)! + 0   for α = β,
                               φ′αβ (x) = 0   for α > β,

proving (4.89).
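The entries of (4.86) can be confirmed against the exponential series. The following sketch (λ and x are arbitrary sample values) does this for a 3 × 3 Jordan block λ Id + N:

```python
import math

# Check of the entries of (4.86) for a 3x3 Jordan block lambda*Id + N
# against the exponential series (4.73); lambda and x are arbitrary samples.

def mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    n = len(A)
    S = [[float(i == j) for j in range(n)] for i in range(n)]
    T = [row[:] for row in S]
    for k in range(1, terms):
        T = mul(T, [[a / k for a in row] for row in A])
        S = [[S[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return S

lam, x = 0.5, 0.8
J = [[lam, 1.0, 0.0], [0.0, lam, 1.0], [0.0, 0.0, lam]]   # lambda*Id + N
E = expm([[a * x for a in row] for row in J])
for al in range(3):
    for be in range(3):
        want = (math.exp(lam * x) * x ** (be - al) / math.factorial(be - al)
                if al <= be else 0.0)
        assert abs(E[al][be] - want) < 1e-12
print("entries of e^{(lambda Id + N)x} match formula (4.86)")
```

The strict upper triangle carries the polynomial factors x^{β−α}/(β−α)!, which is exactly why solutions belonging to nontrivial Jordan blocks contain polynomial-times-exponential terms.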

Example 4.50. Consider the homogeneous linear ODE

    y ′ = A y,   A ∈ M(2, R).        (4.90)

We distinguish the following cases:

(i) The matrix A has two distinct real eigenvalues λ1 , λ2 ∈ R (then diagonalization is
possible), i.e. there is a basis {v1 , v2 } of R2 such that vj is an eigenvector for λj ,
j ∈ {1, 2}. In this case, according to Th. 4.44(b), the two functions

    φ1 , φ2 : R −→ K2 ,   φ1 (x) := e^{λ1 x} v1 ,   φ2 (x) := e^{λ2 x} v2 ,        (4.91)

form a fundamental system for (4.90).

(ii) The matrix A has two complex conjugate eigenvalues λ1 ∈
C \ R, λ2 = λ̄1 . Analogous to (i), one has a basis {v1 , v2 } of C2 such that vj

is an eigenvector for λj , j ∈ {1, 2}, and the two functions in (4.91) still form a

fundamental system for (4.90), but with K replaced by C. However, one can still

obtain a real-valued fundamental system as follows: We have λ1 = µ + iω with µ ∈ R,
ω ∈ R \ {0}, as well as v1 = α + iβ with α, β ∈ R2 and
v2 := v̄1 = α − iβ, and taking complex conjugates in Av1 = λ1 v1 shows v2 to be an
eigenvector for λ2 = λ̄1 . Following the
approach described in Rem. 4.33(b) above, we can let

    ( ψ1 )   ( 1/2       1/2    ) ( φ1 )
    ( ψ2 ) = ( 1/(2i)   −1/(2i) ) ( φ2 ) ,

obtaining a fundamental system ψ1 , ψ2 of solutions with values in
R2 , where

    ψ1 (x) = Re(φ1 (x)) = Re( eµx (cos(ωx) + i sin(ωx)) (α + iβ) ) = eµx ( α cos(ωx) − β sin(ωx) ),        (4.93a)

    ψ2 (x) = Im(φ1 (x)) = eµx ( α sin(ωx) + β cos(ωx) ).        (4.93b)

(iii) The matrix A has precisely one eigenvalue λ ∈ R and the corresponding eigenspace

is 1-dimensional. Then there is an invertible matrix W ∈ M(2, R) such that

B := W −1 AW is in (nondiagonal) Jordan normal form, i.e.

    B = W −1 AW = ( λ  1 )
                  ( 0  λ ) .

According to Th. 4.49, the two functions

    φ1 , φ2 : R −→ K2 ,   φ1 (x) := eλx (1, 0)ᵗ ,   φ2 (x) := eλx (x, 1)ᵗ ,        (4.94)

form a fundamental system for y ′ = By (over K). Thus, according to Th. 4.47, the two
functions ψ1 := W φ1 , ψ2 := W φ2 form a fundamental system for (4.90).

Remark 4.51. One way of finding a fundamental matrix solution for y ′ = A y, A ∈
M(n, C), is to obtain eAx , using the following strategy based on Jordan normal forms:

(i) Determine the eigenvalues λ1 , . . . of A, i.e.
the zeros of the characteristic polynomial χA (x) := det(A − x Id) (the multiplicity
of the zero λj is called its algebraic multiplicity, the dimension of the eigenspace
ker(A − λj Id) its geometric multiplicity).

(ii) Determine an invertible W ∈ M(n, C) such that B := W −1 AW is in Jordan
normal form. In
general, this means computing the (finitely many distinct) powers (A − λj Id)k and
(suitable bases of) ker(A−λj Id)k (in general, this is a somewhat involved procedure
and the reader is referred to [Mar04, Sections 4.2,4.3] and [Str08, Sec. 27] for details – the
lecture notes do not provide further details here, as they rather recommend using
the Putzer algorithm as described below instead).

(iii) For each Jordan block Bj (as in (4.83)) of B, compute eBj x as in (4.86).

(iv) Then eAx = W eBx W −1 according to Th. 4.47(b), where eBx is block diagonal
with blocks eBj x .

As step (ii) above tends to be complicated in practice, it is usually easier to obtain eAx
using the Putzer algorithm described next.

Putzer Algorithm

The Putzer algorithm due to [Put66] is a procedure for computing eAx that avoids the

difficulty of determining the Jordan normal form of A, and, thus, is often more efficient
to employ in practice than the procedure described in Rem. 4.51 above. The Putzer

algorithm is provided by the following theorem:

ues of A (not necessarily distinct, each eigenvalue occurring possibly repeatedly according

to its multiplicity). Then

Xn−1

Ax

∀ e = pk+1 (x) Mk , (4.96)

x∈R

k=0

defined recursively by

p′k = λk pk + pk−1 , pk (0) = 0 for k = 2, . . . , n (4.97b)

linear ODE that can be solved using (2.2)) and

M0 := Id, (4.98a)

Mk = Mk−1 (A − λk Id) for k = 1, . . . , n − 1. (4.98b)

Proof. First note that

    Mn = Π_{k=1}^n (A − λk Id) = χA (A) = 0,

since each matrix is annihilated by its characteristic polynomial according to the Cayley-
Hamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note

    A Mk = Mk+1 + λk+1 Mk   for k = 0, . . . , n − 1.        (4.99)

We have to show that x 7→ Φ(x) := Σ_{k=0}^{n−1} pk+1 (x) Mk solves the initial value problem
Y ′ = AY , Y (0) = Id. The initial condition is satisfied, as Φ(0) = p1 (0) M0 = Id, and
the ODE is satisfied, as, for each x ∈ R,

    Φ′ (x) − AΦ(x) = Σ_{k=0}^{n−1} p′k+1 (x) Mk − A Σ_{k=0}^{n−1} pk+1 (x) Mk
        =(4.97),(4.99)  λ1 p1 (x) M0 + Σ_{k=1}^{n−1} ( λk+1 pk+1 (x) + pk (x) ) Mk
            − Σ_{k=0}^{n−1} pk+1 (x) ( Mk+1 + λk+1 Mk )
        = −pn (x) Mn = 0,

where the last equality holds due to Mn = 0.
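The Putzer algorithm is easy to run in practice. The sketch below (the matrix A = [[0, 1], [1, 0]] with eigenvalues 1, −1 is an arbitrary sample) builds the Mk as in (4.98) and integrates the triangular system (4.97) with an RK4 step as a numerical stand-in for solving the scalar linear ODE in closed form; the result is compared against the known closed form e^{Ax} = [[cosh x, sinh x], [sinh x, cosh x]]:

```python
import math

# Putzer algorithm (cf. (4.96)-(4.98)) for A = [[0,1],[1,0]],
# eigenvalues lambda_1 = 1, lambda_2 = -1.

def putzer_expm(A, eigvals, x, steps=2000):
    n = len(A)
    # M_0 = Id, M_k = M_{k-1} (A - lambda_k Id), cf. (4.98)
    M = [[[float(i == j) for j in range(n)] for i in range(n)]]
    for k in range(n - 1):
        lamk = eigvals[k]
        S = [[A[i][j] - (lamk if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        M.append([[sum(M[-1][i][l] * S[l][j] for l in range(n))
                   for j in range(n)] for i in range(n)])

    def rhs(q):  # right-hand side of the triangular system (4.97)
        return [eigvals[0] * q[0]] + [eigvals[k] * q[k] + q[k - 1]
                                      for k in range(1, n)]

    p = [1.0] + [0.0] * (n - 1)   # p_1(0) = 1, p_k(0) = 0 for k >= 2
    h = x / steps
    for _ in range(steps):        # classical RK4 integration of (4.97)
        k1 = rhs(p)
        k2 = rhs([p[i] + h / 2 * k1[i] for i in range(n)])
        k3 = rhs([p[i] + h / 2 * k2[i] for i in range(n)])
        k4 = rhs([p[i] + h * k3[i] for i in range(n)])
        p = [p[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(n)]
    # e^{Ax} = sum_{k=0}^{n-1} p_{k+1}(x) M_k, cf. (4.96)
    return [[sum(p[k] * M[k][i][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0.0, 1.0], [1.0, 0.0]]
x = 0.9
E = putzer_expm(A, [1.0, -1.0], x)
expected = [[math.cosh(x), math.sinh(x)], [math.sinh(x), math.cosh(x)]]
assert all(abs(E[i][j] - expected[i][j]) < 1e-8
           for i in range(2) for j in range(2))
print("Putzer reproduces e^{Ax} at x =", x)
```

Note that no eigenvectors and no Jordan form of A were needed, only the eigenvalues, which is precisely the practical advantage over the strategy of Rem. 4.51.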

5 Stability

In the qualitative theory of ODE, which can be seen as part of the field of dynamical

systems, the idea is to understand the set of solutions to an ODE (or to a class of ODE),

if possible, without making use of explicit solution formulas, which, in most situations,

are not available anyway. Examples of qualitative questions are if, and under which

conditions, solutions to an ODE are constant, periodic, are unbounded, approach some

limit (more generally, the solutions’ asymptotic behavior), etc. One often thinks of the

solutions as depending on a time-like variable, and then qualitative theory typically

means disregarding the speed of change, but rather focusing on the shape/geometry of

the solution’s image.

The topic of stability takes continuity in initial conditions further and investigates the
behavior of solutions that are, at least initially, close to some given solution. Under
which conditions do nearby solutions approach each other or diverge away from each
other, show the same or different asymptotic behavior, etc.?

Even though the above-described considerations are not limited to this situation, a nat-

ural starting point is to consider first-order ODE where the right-hand side does not

depend on x. In the following, we will mostly be concerned with this type of ODE,

which has a special name:

Definition 5.1. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn . Then the first-order ODE

    y ′ = f (y)        (5.1)

is called autonomous and Ω is called the phase space.

Remark 5.2. In fact, nonautonomous ODE are not really more general than au-

tonomous ODE, due to the, perhaps, surprising Th. J.1 of the Appendix, which states

that every nonautonomous ODE is equivalent to an autonomous ODE. However, this fact

is of little practical relevance, since the autonomous ODE arising via Th. J.1 from nonau-

tonomous ODE can never have bounded solutions on unbounded intervals, whereas the

theory of autonomous ODE is most powerful and useful for ODE that admit bounded

solutions on unbounded intervals (such as constant or periodic solutions, or solutions

approaching constant or periodic functions).

Lemma 5.3. If, in the context of Def. 5.1, φ : I −→ Kn is a solution to (5.1), defined

on the interval I ⊆ R, then

    ∀ ξ ∈ R:   φξ : I − ξ −→ Kn ,   φξ (x) := φ(x + ξ),   where I − ξ := {x − ξ ∈ R : x ∈ I},        (5.2)

is another solution to (5.1). In consequence, if φ is a maximal solution, then so is φξ .

Proof. Since φ is a solution to (5.1), one has φ(I) ⊆ Ω, implying φξ (I − ξ) ⊆ Ω. Finally,

    ∀ x ∈ I − ξ:   φ′ξ (x) = φ′ (x + ξ) = f (φ(x + ξ)) = f (φξ (x)),

completing the proof that φξ is a solution. Since each extension of φ yields an extension

of φξ and vice versa, φ is a maximal solution if, and only if, φξ is a maximal solution.

Lemma 5.4. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits unique

maximal solutions (f being locally Lipschitz on Ω open is sufficient, but not necessary,

cf. Def. 3.32), then the global solution Y : Df −→ Kn of (5.1) satisfies

(a) Y (x, ξ, η) = Y (x − ξ, 0, η) for each (x, ξ, η) ∈ R × R × Kn such that (x, ξ, η) ∈ Df
(in particular, Iξ,η = I0,η + ξ).

(b) Y (x, 0, Y (x̃, 0, η)) = Y (x + x̃, 0, η) for each (x, x̃, η) ∈ R × R × Kn such that (x̃, 0, η) ∈
Df and (x, 0, Y (x̃, 0, η)) ∈ Df .

Proof. (a): If ψ : Iξ,η −→ Kn and φ : I0,η −→ Kn denote the maximal solutions to
the initial data y(ξ) = η and y(0) = η, respectively, then (a) claims, using the notation

from Lem. 5.3, ψ = φ−ξ . As a consequence of Lem. 5.3, φ−ξ : I0,η + ξ −→ Kn , is some

maximal solution to (5.1) and, since φ−ξ (ξ) = φ(0) = η = ψ(ξ), the assumed uniqueness

yields the claimed ψ = φ−ξ , in particular, Iξ,η = I0,η + ξ.

(b): Let η̃ := Y (x̃, 0, η). If ψ : I0,η̃ −→ Kn and φ : I0,η −→ Kn denote the maximal

solutions to the initial data y(0) = η̃ and y(0) = η, respectively, then (b) claims ψ = φx̃ .

As a consequence of Lem. 5.3, φx̃ : I0,η − x̃ −→ Kn , is some maximal solution to (5.1)

and, since φx̃ (0) = φ(x̃) = η̃ = ψ(0), the assumed uniqueness yields the claimed ψ = φx̃ ,

in particular, I0,η̃ = I0,η − x̃.


Definition 5.5. Let n ∈ N and let φ : I −→ Kn be defined on some interval I ⊆ R.

(a) The image O(φ) := φ(I)
is often referred to as the orbit of φ in the present context of qualitative ODE theory.

(b) φ : R −→ Kn (note I = R) is called periodic if, and only if, there exists a smallest
ω > 0 (called the period of φ) such that

    ∀ x ∈ R:   φ(x + ω) = φ(x).        (5.4)

The requirement ω > 0 means constant functions are not periodic in the sense of
this definition.

Lemma 5.6. Let φ : R −→ Kn , n ∈ N.

(a) If φ is continuous and (5.4) holds for some ω > 0, then φ is either constant or

periodic in the sense of Def. 5.5(b).

Proof. Exercise.

Definition 5.7. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn . In the context of the

autonomous ODE (5.1), the zeros of f are called the fixed points of the ODE (5.1) (cf.

Lem. 5.8 below). One then sometimes uses the notation

    F := Ff := {η ∈ Ω : f (η) = 0}        (5.5)

for the set of fixed points.

Lemma 5.8. Let Ω ⊆ Kn , n ∈ N, f : Ω −→ Kn , η ∈ Ω. Then the following statements
are equivalent:

(i) f (η) = 0 (i.e. η is a fixed point of (5.1)).

(ii) The constant function φ : R −→ Kn , φ ≡ η, is a solution to (5.1).

Proof. If f (η) = 0 and φ ≡ η, then φ′ (x) = 0 = f (φ(x)) for each x ∈ R, i.e. (i) implies

(ii). Conversely, if φ ≡ η is a solution to (5.1), then f (η) = f (φ(x)) = φ′ (x) = 0, i.e. (ii)

implies (i).

Proposition 5.9. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits

unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for

maximal solutions φ1 : I1 −→ Kn , φ2 : I2 −→ Kn to (5.1), defined on open intervals

I1 , I2 , respectively, precisely one of the following two statements (i) and (ii) is true:


(i) O(φ1 ) ∩ O(φ2 ) = ∅, i.e. the orbits are disjoint.

(ii) There exists ξ ∈ R such that

    I2 = I1 − ξ   and   ∀ x ∈ I2 :   φ2 (x) = φ1 (x + ξ).        (5.6)

In particular, it follows in this case that O(φ1 ) = O(φ2 ), i.e. the solutions have

the same orbit.

Proof. Suppose (i) does not hold. Then there are x1 ∈ I1 and x2 ∈ I2 such that

φ1 (x1 ) = φ2 (x2 ). Define ξ := x1 − x2 and consider

φ : I1 − ξ −→ Kn , φ(x) := φ1 (x + ξ). (5.7)

Then φ is a maximal solution of (5.1) by Lem. 5.3 and φ(x2 ) = φ1 (x1 ) = φ2 (x2 ). By

uniqueness of maximal solutions, we obtain φ = φ2 , in particular, I2 = I1 − ξ, proving

(5.6). Clearly, (5.6) implies O(φ1 ) = O(φ2 ).

Proposition 5.10. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits

unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for

each maximal solution φ : I −→ Kn to (5.1), defined on the open interval I, precisely

one of the following three statements is true:

(i) φ is injective.

(ii) I = R and φ is periodic.

(iii) I = R and φ is constant (in this case η := φ(0) is a fixed point of (5.1)).

Proof. Clearly, (i) – (iii) are mutually exclusive. Suppose (i) does not hold. Then there

exist x1 , x2 ∈ I, x1 < x2 , such that φ(x1 ) = φ(x2 ). Set ω := x2 − x1 . According to Lem.

5.3, ψ : I − ω −→ Kn , ψ(x) := φ(x + ω), must also be a maximal solution to (5.1).

Since ψ(x1 ) = φ(x1 + ω) = φ(x2 ) = φ(x1 ), uniqueness implies ψ = φ and I = I − ω.

As ω > 0, this means I = R and the validity of (5.4). As φ is also continuous, by Lem.

5.6(a), either (ii) or (iii) must hold.

Corollary 5.11. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits unique

maximal solutions (f being locally Lipschitz on Ω open is sufficient), then the orbits

of maximal solutions to (5.1) partition the phase space Ω into disjoint sets. Moreover,

every point η ∈ Ω is either a fixed point, or it belongs to some periodic orbit, or it belongs

to the orbit of some injective solution.

Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10.

Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketch

showing representative orbits. Thus, the sketch shows subsets of the phase space Ω,

including fixed points (if any) and representative periodic solutions (if any). Usually,

one also uses arrows to indicate the direction in which each drawn orbit is traced as the

variable x increases.


Example 5.13. Even though it is a main goal of qualitative theory to obtain phase

portraits without the need of explicit solution formulas, and we will study techniques

for accomplishing this below, we will make use of explicit solution formulas for our first

two examples of phase portraits.

(a) The only fixed point of the autonomous linear ODE

    y1′ = −y2 ,
    y2′ = y1        (5.8)

is (0, 0). Clearly, for each r > 0, φ : R −→ R2 , φ(x) := (r cos x, r sin x) is a solution

to (5.8) and its orbit is the circle with radius r around the origin. Since every point

of Ω belongs to such a circle, every orbit is either the origin or a circle around the

origin. Thus, the phase portrait consists of such circles plus the origin and arrows

that indicate the circles are traversed counterclockwise.

(b) As compared to the previous one, the phase portrait of the autonomous linear ODE

    y1′ = y2 ,
    y2′ = y1        (5.9)

is more complicated: While (0, 0) is still the only fixed point, for each r > 0, all the

following functions φ1 , φ2 , φ3 , φ4 : R −→ R2 are solutions:

    φ1 (x) := (r cosh x, r sinh x),        (5.10a)
    φ2 (x) := (−r cosh x, −r sinh x),        (5.10b)
    φ3 (x) := (r sinh x, r cosh x),        (5.10c)
    φ4 (x) := (−r sinh x, −r cosh x),        (5.10d)

each type describing a hyperbolic orbit in some section of the plane R2 . These

sections are separated by rays, forming the orbits of the solutions φ5 , φ6 , φ7 , φ8 :
R −→ R2 :

    φ5 (x) := (ex , ex ),        (5.10e)
    φ6 (x) := (−ex , −ex ),        (5.10f)
    φ7 (x) := (e−x , −e−x ),        (5.10g)
    φ8 (x) := (−e−x , e−x ).        (5.10h)

The two rays on {(y1 , y1 ) : y1 ≠ 0} move away from the origin, whereas the two rays
on {(y1 , −y1 ) : y1 ≠ 0} move toward the origin. The hyperbolic orbits asymptoti-

cally approach the ray orbits and are traversed such that the flow direction agrees

between approaching orbits.



The next results will be useful to obtain new phase portraits from previously known

phase portraits in certain situations.

Proposition 5.14. Let Ω ⊆ Kn , n ∈ N, let I ⊆ R be some nontrivial interval, let

f : Ω −→ Kn , and let φ : I −→ Kn be a solution to

    y ′ = γ(x) f (y),        (5.11)

where γ : I −→ R is continuous. If φ′ (x) ≠ 0 for each x ∈ I (if one interprets x as
time, then one can think of φ′ as the velocity of φ), then there exists a continuously
differentiable bijective map λ : J −→ I, defined on some nontrivial interval J, such that
(φ ◦ λ) : J −→ Kn is a solution to y ′ = f (y).

Proof. Since φ′ (x) ≠ 0 for each x ∈ I, one has γ(x) ≠ 0 for each x ∈ I. As γ is also
continuous, it must be either always negative or always positive. In consequence, fixing
x0 ∈ I, Γ : I −→ R, Γ(x) := ∫_{x0}^x γ(t) dt , is continuous and either strictly increasing or

strictly decreasing. In particular, Γ is injective, J := Γ(I) is an interval, and Γ : I −→ J

is bijective. The desired function λ is λ := Γ−1 : J −→ I. Indeed, according to [Phi16,

Th. 9.9], λ is differentiable and its derivative is the continuous function

    λ′ : J −→ R,   λ′ (x) = 1 / γ(λ(x)) ,

implying, for each x ∈ J,

    (φ ◦ λ)′ (x) = φ′ (λ(x)) λ′ (x) = γ(λ(x)) f (φ(λ(x))) · (1 / γ(λ(x))) = f (φ(λ(x))),

proving φ ◦ λ to be a solution to y ′ = f (y).

Proposition 5.15. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn . Moreover, consider a

continuous function h : Ω −→ R with the property that either h > 0 everywhere on Ω

or h < 0 everywhere on Ω.

(a) If f has no zeros (i.e. F = ∅), then the ODE

    y ′ = f (y),        (5.12a)

    y ′ = h(y) f (y)        (5.12b)

have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbit
of a solution to (5.12b) and vice versa.

(b) If f and h are such that the ODE (5.12) admit unique maximal solutions, then the
ODE (5.12) have precisely the same orbits (even if F ≠ ∅).

Proof. (a): If φ : I −→ Kn is a solution to (5.12b), then γ := h ◦ φ is well-defined
and continuous. Since F = ∅ implies φ′ ≠ 0, we can apply Prop. 5.14 to obtain
the existence of a bijective λ1 : J1 −→ I such that φ ◦ λ1 is a solution to (5.12a).
Thus, O(φ) = O(φ ◦ λ1 ). Conversely, if ψ : I −→ Kn is a solution to (5.12a), which we
rewrite as

    y ′ = h(y) (f (y) / h(y)),

then γ := 1/(h ◦ ψ) is well-defined and continuous. Since F = ∅ implies
ψ ′ ≠ 0, we can apply Prop. 5.14 to obtain the existence of a bijective λ2 : J2 −→ I such
that ψ ◦ λ2 is a solution to (5.12b). Thus, O(ψ) = O(ψ ◦ λ2 ).

(b): We are now in the situation of Prop. 5.10 and Cor. 5.11, and from (a) we know every

nonconstant orbit of (5.12a) is a nonconstant orbit of (5.12b) and vice versa. However,

since h > 0 or h < 0, both ODE in (5.12) have precisely the same constant solutions,

concluding the proof.

Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume unique

maximal solutions). Prop. 5.15 says that overall multiplication with a continuous pos-

itive function h does not change the phase portrait at all. Moreover, Prop. 5.15 also

states that overall multiplication with a continuous negative function h does not change

the partition of Ω into solution orbits. However, after multiplication with a negative h,

the orbits are clearly traversed in the opposite direction, i.e., for negative h, the arrows

in the phase portrait have to be reversed. For a general continuous h, this implies the

phase portrait remains the same in each region of Ω, where h > 0; it remains the same,

except for the arrows reversed, in each region of Ω, where h < 0; and the zeros of h

add additional fixed points, cutting some of the previous orbits. We summarize how to

obtain the phase portrait of (5.12b) from that of (5.12a):

(1) Keep all orbits; in each region where h > 0, also keep the arrows, whereas in each
region where h < 0, reverse the arrows.

(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut, where
fixed points are added.

Example 5.17. (a) Consider the ODE

    y1′ = −y2 ((y1 − 1)² + y2²),
    y2′ = y1 ((y1 − 1)² + y2²),        (5.13)

which comes from multiplying the right-hand side of (5.8) by h(y) = (y1 − 1)² + y2² .
The phase portrait is the same as the one for (5.8), except for the added fixed point
at (1, 0).

(b) Consider the ODE

    y1′ = −y1 y2 + y2² ,
    y2′ = −y1 y2 + y1² ,        (5.14)

which comes from multiplying the right-hand side of (5.8) by h(y) = y1 − y2 . The

phase portrait is obtained from that of (5.8), where additional fixed points are on

the line with y1 = y2 . This line cuts each previously circular orbit into two segments.

The arrows have to be reversed for y2 > y1 , that means above the y1 = y2 line.
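The effect described for (5.14) can be observed numerically. The sketch below verifies that, at a sample point with y2 > y1 (where h(y) = y1 − y2 < 0), the vector field of (5.14) is the negative of that of (5.8), and that an RK4-integrated trajectory of (5.14) stays on its circle, as Prop. 5.15 predicts:

```python
# Numerical illustration of Rem. 5.16 / Example (5.14): orbits of (5.14)
# stay on circles, but above the line y1 = y2 (where h < 0) they are
# traversed in the reversed direction.

def f_58(y):    # right-hand side of (5.8)
    return [-y[1], y[0]]

def f_514(y):   # right-hand side of (5.14) = h(y) * f_58(y), h(y) = y1 - y2
    return [-y[0] * y[1] + y[1] ** 2, -y[0] * y[1] + y[0] ** 2]

# at (0, 1) we have h = -1 < 0, so f_514 = -f_58 there:
p = [0.0, 1.0]
assert all(abs(a + b) < 1e-12 for a, b in zip(f_514(p), f_58(p)))

# RK4 integration of (5.14): the radius y1^2 + y2^2 is preserved
y, h = [0.0, 1.0], 1e-3
for _ in range(1000):
    k1 = f_514(y)
    k2 = f_514([y[i] + h / 2 * k1[i] for i in range(2)])
    k3 = f_514([y[i] + h / 2 * k2[i] for i in range(2)])
    k4 = f_514([y[i] + h * k3[i] for i in range(2)])
    y = [y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
         for i in range(2)]
assert abs(y[0] ** 2 + y[1] ** 2 - 1.0) < 1e-9
print("radius preserved; direction reversed where h < 0")
```

The trajectory starting at (0, 1) slows down as it approaches the line y1 = y2, consistent with the fixed points that h adds on that line.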


Definition 5.18. Let Ω ⊆ Rn , n ∈ N, and f : Ω −→ Rn . A function E : Ω −→ R is
called an integral for the autonomous ODE (5.1), i.e. for y ′ = f (y), if, and only if, E ◦ φ
is constant for every solution φ of (5.1).

Lemma 5.19. Let Ω ⊆ Rn be open, n ∈ N, and f : Ω −→ Rn such that each initial

value problem for (5.1) has at least one solution (f continuous is sufficient by Th. 3.8).

Then a differentiable function E : Ω −→ R is an integral for (5.1) if, and only if,

    ∀ y ∈ Ω:   (∇ E)(y) • f (y) = Σ_{j=1}^n ∂j E(y) fj (y) = 0.        (5.15)

Proof. Let φ : I −→ Rn be any solution to (5.1). By the chain rule,

    ∀ x ∈ I:   (E ◦ φ)′ (x) = (∇ E)(φ(x)) • φ′ (x) = (∇ E)(φ(x)) • f (φ(x)).        (5.16)

The differentiable function E ◦ φ : I −→ R is constant on the interval I if, and only if,

(E ◦ φ)′ ≡ 0. Thus, by (5.16), E ◦ φ being constant for every solution φ is equivalent to

(∇ E) • f (y) = 0 for each y ∈ Ω such that at least one solution passes through y.
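Criterion (5.15) can be checked numerically. The following sketch uses E(y) = y1² + y2² and the rotation field f(y) = (−y2, y1) from (5.8) (that this E is an integral for (5.8) is confirmed in Example 5.21 below); the gradient is approximated by central differences:

```python
import random

# Criterion (5.15) checked numerically for E(y) = y1^2 + y2^2 and the
# rotation field f(y) = (-y2, y1): a finite-difference gradient of E is
# (numerically) orthogonal to f at randomly sampled points.

def grad_E(y, h=1e-6):
    E = lambda y1, y2: y1 ** 2 + y2 ** 2
    return [(E(y[0] + h, y[1]) - E(y[0] - h, y[1])) / (2 * h),
            (E(y[0], y[1] + h) - E(y[0], y[1] - h)) / (2 * h)]

random.seed(0)
for _ in range(100):
    y = [random.uniform(-2, 2), random.uniform(-2, 2)]
    g = grad_E(y)
    dot = g[0] * (-y[1]) + g[1] * y[0]     # (grad E)(y) . f(y)
    assert abs(dot) < 1e-8
print("grad E . f vanishes at 100 sample points")
```

Analytically, the dot product is 2y1(−y2) + 2y2 y1 = 0, so the residual observed here is pure floating-point noise.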

Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis of

Lem. 5.19, that each initial value problem for (5.1) has at least one solution, can not be

omitted. The following Prop. 5.20 makes use of integrals and applies to phase portraits

of 2-dimensional real ODE:

Proposition 5.20. Let Ω ⊆ R2 be open, and let f : Ω −→ R2 be continuous and

such that (5.1) admits unique maximal solutions (f being locally Lipschitz is sufficient).

Assume E : Ω −→ R to be a continuously differentiable integral for (5.1), i.e. for

y ′ = f (y), satisfying ∇ E(y) ≠ 0 for each y ∈ Ω. Then the following statements hold

for each maximal solution φ : I −→ R2 of (5.1) (I ⊆ R some open interval):

(a) If (xm )m∈N is a sequence in I such that limm→∞ φ(xm ) = η ∈ Ω, then η ∈ F (i.e. η

is a fixed point) or η ∈ O(φ) (i.e. there exists ξ ∈ I with φ(ξ) = η).

(b) If C ∈ R denotes the constant value of E ◦ φ, and the level set
E −1 {C} = {y ∈ Ω : E(y) = C} is compact with E −1 {C} ∩ F = ∅, then φ is
periodic.

Proof. Let C ∈ R denote the constant value of E ◦ φ.

(a): The continuity of E yields E(η) = limm→∞ E(φ(xm )) = C. Moreover, by hypoth-

esis, (ǫ1 , ǫ2 ) := ∇ E(η) ≠ (0, 0). We proceed with the proof for ǫ2 ≠ 0 – if ǫ2 = 0
and ǫ1 ≠ 0, then the roles of the indices 1, 2 have to be switched in the following. We

apply the implicit function theorem [Phi15, Th. C.9] to the function f˜ : Ω −→ R,

f˜(y) := E(y) − C at its zero η = (η1 , η2 ). By [Phi15, Th. C.9], there exist ǫ, δ > 0 and a

continuously differentiable map g : Ig −→ R, Ig := ]η1 − δ, η1 + δ[, such that g(η1 ) = η2 ,

    ∀ s ∈ Ig :   E(s, g(s)) = C,        (5.17a)

    ∀ y ∈ Ω:   ( ‖y − η‖ < ǫ ∧ E(y) = C )  ⇒  ∃ s ∈ Ig :   y = (s, g(s)).        (5.17b)

We now assume η ∉ F and show η ∈ O(φ). If η ∉ F, then f (η) ≠ 0 and the continuity
of f and g imply there is δ̃ > 0, δ̃ ≤ δ, such that, for each s ∈ I˜ := ]η1 − δ̃, η1 + δ̃[,
f (s, g(s)) ≠ 0. Define the auxiliary function ϕ : I˜ −→ Ω, ϕ(s) := (s, g(s)). Since
E ◦ ϕ ≡ C, we can employ the chain rule to conclude

    ∀ s ∈ I˜ :   0 = (E ◦ ϕ)′ (s) = (∇ E)(ϕ(s)) • ϕ′ (s),        (5.18)

i.e. the two-dimensional vectors (∇ E)(ϕ(s)) and ϕ′ (s) are orthogonal with respect to

the Euclidean scalar product. As E is an integral, using (5.15), f (ϕ(s)) is another vector

orthogonal to (∇ E)(ϕ(s)) and, since all vectors in R2 orthogonal to (∇ E)(ϕ(s)) form

a 1-dimensional subspace of R2 (recalling (∇ E)(ϕ(s)) 6= 0), there exists γ(s) ∈ R such

that

ϕ′ (s) = γ(s)f (ϕ(s)) (5.19)

(note f (ϕ(s)) ≠ 0 as s ∈ I˜ ). We can now apply Prop. 5.14, since (5.19) says ϕ is a
solution to (5.11), the function γ : I˜ −→ R, s 7→ γ(s) = ϕ′ (s)/f (ϕ(s)) is continuous,
and ϕ′ (s) = (1, g ′ (s)) ≠ (0, 0) for each s ∈ I˜ . Thus, Prop. 5.14 provides a bijective
λ : J −→ I˜ , such that ϕ ◦ λ is a solution to y ′ = f (y).

As we assume limm→∞ φ(xm ) = η, there exists M ∈ N such that kφ(xm )−ηk < ǫ for each

m ≥ M . Since E(φ(xm )) = C also holds, (5.17b) implies the existence of a sequence

(sm )m∈N in I˜ such that φ(xm ) = (sm , g(sm )) for each m ≥ M . Then, for each m ≥ M

and τm := λ−1 (sm ), (ϕ ◦ λ)(τm ) = ϕ(sm ) = φ(xm ). On the other hand, for τ0 := λ−1 (η1 ),

(ϕ ◦ λ)(τ0 ) = ϕ(η1 ) = η, showing φ(xm ), η ∈ O(ϕ ◦ λ). Since φ(xm ) ∈ O(φ) as well,

Prop. 5.9 implies O(ϕ ◦ λ) ⊆ O(φ), i.e. η ∈ O(φ), which proves (a). In preparation for

(b), we also observe that kφ(xm ) − ηk < ǫ for each m ≥ M implies the sm for m ≥ M

all are in some compact interval I1 with η1 ∈ I1 , implying the τm to be in the compact

interval J1 := λ−1 [I1 ] with τ0 ∈ J1 . We will use for (b) that J1 is bounded.

(b): As we have O(φ) ⊆ E −1 {C} according to the choice of C, the assumed compactness

of E −1 {C} and Prop. 3.24 show φ can only be maximal if it is defined on all of R (since

(x, φ(x)) must escape every compact [−m, m] × E −1 {C}, m ∈ N, on the left and on the

right). Using the compactness of E −1 {C} a second time, we obtain the existence of a

sequence (xm )m∈N in R such that limm→∞ xm = ∞ and limm→∞ φ(xm ) = η ∈ E −1 {C}.

So we see that we are in the situation of (a). Let ψ be the maximal extension of the

solution ϕ ◦ λ constructed in the proof of (a). Then we know O(ψ) ∩ O(φ) ≠ ∅ from the

proof of (a) and, since ψ and φ both are maximal, Prop. 5.9 implies O(ψ) = O(φ) and,

more importantly for us here, there exists ξ ∈ R such that ψ(x) = φ(x+ξ) for each x ∈ R.

Let m ≥ M with M from the proof of (a). If ξ ≠ 0, then φ(xm ) = ψ(τm ) = φ(xm + ξ)

shows φ is not injective. If ξ = 0, then φ = ψ and φ(xm ) = φ(τm ). Since the τm are

bounded, whereas the xm are unbounded, xm = τm cannot be true for all m, again

showing φ is not injective. Since E −1 {C} ∩ F = ∅, φ cannot be constant, therefore it

must be periodic by Prop. 5.10.


Example 5.21. Using condition (5.15), i.e. ∇ E • f ≡ 0, one readily verifies that the

functions

    E : R2 −→ R,   E(y1 , y2 ) := y1² + y2² ,        (5.20a)

    E : R2 −→ R,   E(y1 , y2 ) := y1² − y2² ,        (5.20b)

are integrals for the ODE

    y1′ = −y2 ,   y2′ = y1

and

    y1′ = y2 ,   y2′ = y1 ,

respectively, and we recover the respective phase portraits via the respective level curves
E(y1 , y2 ) = C, C ∈ R.

Example 5.22. Consider the autonomous ODE

    y1′ = 2y1 y2 ,
    y2′ = 1 − 2y1² .        (5.21)

We claim that

    E : R2 −→ R,   E(y1 , y2 ) := y1 e^{−(y1²+y2²)} ,        (5.22)

is an integral for (5.21) and intend to use Prop. 5.20 to establish (5.21) has orbits that

are fixed points, orbits that are periodic, and orbits that are neither. To verify E is an

integral, one computes, for each (y1 , y2 ) ∈ R2 ,

2 2 2 2 2 2

= e−(y1 +y2 ) − 2y12 e−(y1 +y2 ) , −2y1 y2 e−(y1 +y2 ) • (2y1 y2 , 1 − 2y12 )

2 2

= e−(y1 +y2 ) (1 − 2y12 , −2y1 y2 ) • (2y1 y2 , 1 − 2y12 ) = 0.

The set of fixed points of (5.21) is

    F = { (−1/√2, 0), (1/√2, 0) }.

The level set of 0 is E −1 {0} = {(0, y2 ) : y2 ∈ R}, i.e. it is the y2 -axis. This is a

nonperiodic orbit (actually, the orbit of solutions of the form φ : R −→ R2 , φ(x) :=

(0, x + c), c ∈ R). Now consider the level set E −1 {e−1 }: For y1 > 0, the equation
E(y1 , y2 ) = e−1 is equivalent to y2² = g(y1 ), where g(y1 ) := 1 + ln y1 − y1² . The function
g has precisely two zeros, namely λ1 = 1 and 0 < λ2 < 1, and g ≥ 0 precisely on the
compact interval J := [λ2 , 1], implying E −1 {e−1 } = { (y1 , ±√g(y1 )) : y1 ∈ J }, showing
E −1 {e−1 } is compact. According to Prop. 5.20(b), E −1 {e−1 } must consist of one or

more periodic orbits.
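As an illustration (a numerical sketch, with an arbitrarily chosen starting point), one can integrate (5.21) with a classical RK4 scheme and confirm that E from (5.22) stays constant along the computed trajectory:

```python
import math

def f(y):
    """Right-hand side of (5.21)."""
    y1, y2 = y
    return (2*y1*y2, 1 - 2*y1**2)

def E(y):
    """Candidate integral (5.22)."""
    y1, y2 = y
    return y1 * math.exp(-(y1**2 + y2**2))

def rk4_step(y, h):
    def step(a, b, c):  # a + c*b, componentwise
        return (a[0] + c*b[0], a[1] + c*b[1])
    k1 = f(y)
    k2 = f(step(y, k1, h/2))
    k3 = f(step(y, k2, h/2))
    k4 = f(step(y, k3, h))
    return (y[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

y = (0.9, 0.0)          # an arbitrary test point with E(y) > e^{-1}
E0 = E(y)
for _ in range(20000):  # integrate up to x = 20 with step h = 0.001
    y = rk4_step(y, 0.001)
drift = abs(E(y) - E0)
print(drift < 1e-6)  # True: E is (numerically) conserved along the orbit
```

The drift is dominated by the O(h⁴) global error of RK4 and stays far below the tolerance here.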


Given an autonomous ODE with a fixed point p, we will investigate the question under

what conditions a solution φ(x) starting out near p will remain near p as x increases or

decreases.

To simplify notation, we will restrict ourselves to initial data y(0) = y0 , which, in light

of Lem. 5.4(b), is not an essential restriction.

Notation 5.23. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn such that

y ′ = f (y) (5.23)

admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Let

Y : Df −→ Kn denote the general solution to (5.23) and define

Y : Df,0 −→ Kn , Y (x, η) := Y (x, 0, η),

(5.24)

Df,0 := {(x, η) ∈ R × Kn : (x, 0, η) ∈ Df }.

Definition 5.24. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn such that (5.23) admits unique

maximal solutions (f being locally Lipschitz on Ω open is sufficient). Moreover, assume

the set of fixed points to be nonempty, F ≠ ∅, and let p ∈ F. The fixed point p is said

to be positively (resp. negatively) stable if, and only if, the following conditions (i) and

(ii) hold:

(i) There exists r > 0 such that, for each η ∈ Ω with kη − pk < r, the maximal solution x 7→ Y(x, η) (cf. (5.24)) is defined on (a superset of) R⁺₀ (resp. R⁻₀).

(ii) For each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

kη − pk < δ ⇒ kY(x, η) − pk < ǫ for each x ≥ 0 (resp. x ≤ 0). (5.25)

The fixed point p is said to be positively (resp. negatively) asymptotically stable if, and only if, (i) and (ii) hold plus the additional condition

(iii) There exists γ > 0 such that, for each η ∈ Ω,

kη − pk < γ ⇒ lim_{x→∞} Y(x, η) = p (resp. lim_{x→−∞} Y(x, η) = p). (5.26)

The norm k · k on Kn used in (i) – (iii) above is arbitrary. Due to the equivalence of

norms on Kn , changing the norm does not change the defined stability properties, even

though, in general, it does change the sizes of r, δ, γ.

Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23),

i.e.

y ′ = −f (y). (5.27)

According to Lem. 1.9(b), (5.27) has the general solution

Ỹ : D−f,0 −→ Kn , Ỹ (x, η) := Y (−x, η),

(5.28)

D−f,0 = {(x, η) ∈ R × Kn : (−x, η) ∈ Df,0 }.


(a) p is positively stable for (5.23) ⇔ p is negatively stable for (5.27),
p is negatively stable for (5.23) ⇔ p is positively stable for (5.27).

(b) p is pos. asympt. stable for (5.23) ⇔ p is neg. asympt. stable for (5.27),
p is neg. asympt. stable for (5.23) ⇔ p is pos. asympt. stable for (5.27).

Lemma 5.26. Consider the situation of Def. 5.24 with f : Ω −→ Kn continuous and Ω ⊆ Kn open. Then the fixed point p is positively (resp. negatively) stable if, and only if, for each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

kη − pk < δ ⇒ kY(x, η) − pk < ǫ for each x ∈ I(0,η) ∩ R⁺₀ (resp. x ∈ I(0,η) ∩ R⁻₀), (5.29)

where I(0,η) denotes the domain of the maximal solution Y(·, η).

Proof. Clearly, stability in the sense of Def. 5.24 implies (5.29), and it merely remains

to show that (5.29) implies Def. 5.24(i). As (5.29) holds, we can consider ǫ := 1 and

obtain a corresponding δ =: r. Then (5.29) states that, for each η ∈ Ω with kη − pk < r,

for x ≥ 0 (resp. for x ≤ 0), the maximal solution Y(x, η) remains in the compact set B̄₁(p). Since f : Ω −→ Kn is continuous on Ω ⊆ Kn open, Th. 3.28 implies R⁺₀ ⊆ I(0,η) (resp. R⁻₀ ⊆ I(0,η)), proving Def. 5.24(i).

It is an exercise to show Lem. 5.26 becomes false if the hypothesis that f be continuous

is omitted.

Example 5.27. (a) Consider a scalar autonomous ODE y′ = f(y) with set of fixed points F = {0, 1}, where Y′(·, η) < 0 for 0 < η < 1 and Y′(·, η) > 0 for η ∈ ]−∞, 0[ ∪ ]1, ∞[. It follows that, for p = 0, the positive stability part of (5.29) holds (where, given ǫ > 0, one can choose δ := min{1, ǫ}). Moreover,

for η < 0 and 0 < η < 1, one has limx→∞ Y (x, η) = 0. Thus, all three conditions of

Def. 5.24 are satisfied and 0 is positively asymptotically stable. Analogously, one

sees that 1 is negatively asymptotically stable.
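The notes' concrete right-hand side is not reproduced above; the sketch below uses the hypothetical choice f(y) = y(y − 1), which has the stated fixed-point set {0, 1} and sign pattern, to illustrate numerically that solutions starting near 0 converge to 0:

```python
# Hypothetical instance: f(y) = y*(y-1) is negative on (0,1) and positive
# outside [0,1], matching the sign pattern described in the text.
def f(y):
    return y * (y - 1.0)

def flow(eta, x_end=50.0, h=1e-3):
    """Integrate y' = f(y), y(0) = eta, with the classical RK4 scheme."""
    y = eta
    for _ in range(int(x_end / h)):
        k1 = f(y); k2 = f(y + h/2*k1); k3 = f(y + h/2*k2); k4 = f(y + h*k3)
        y += h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return y

# Solutions starting on either side of 0 converge to 0 as x -> infinity:
print(abs(flow(0.5)) < 1e-6, abs(flow(-0.5)) < 1e-6)  # True True
```

Near 0 the linearization is y′ ≈ −y, so the decay is roughly like e^−x, which the long integration interval comfortably resolves.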

(b) For the R2 -valued ODE of (5.8), (0, 0) is a fixed point that is positively and neg-

atively stable, but neither positively nor negatively asymptotically stable. For the

R2 -valued ODE of (5.9), (0, 0) is a fixed point that is neither positively nor nega-

tively stable.


(c) Consider the ODE y′ = y². (5.31)

The only fixed point is 0, which is neither positively nor negatively stable. Indeed,

not even Def. 5.24(i) is satisfied: One obtains

Y : Df,0 −→ R, Y(x, η) := η / (1 − ηx),

where

Df,0 = {(x, 0) : x ∈ R} ∪ {(x, η) ∈ R² : η < 0, x ∈ ]1/η, ∞[} ∪ {(x, η) ∈ R² : η > 0, x ∈ ]−∞, 1/η[},

showing every neighborhood of 0 contains η such that Y(·, η) is not defined on all of R⁺₀ and η such that Y(·, η) is not defined on all of R⁻₀.
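The blow-up is easy to see directly from the explicit formula; the following sketch also spot-checks that the formula solves y′ = y² via a difference quotient:

```python
# Explicit maximal solution of y' = y^2: Y(x, eta) = eta / (1 - eta*x),
# which blows up at x = 1/eta for eta > 0, however small eta is.
def Y(x, eta):
    return eta / (1.0 - eta * x)

eta = 1e-3  # arbitrarily close to the fixed point 0 ...
# ... yet the solution leaves every bounded set before x = 1/eta = 1000:
print(Y(999.999, eta) > 100)  # True

# Spot-check the ODE via a centered difference quotient at x = 500:
x, h = 500.0, 1e-6
dY = (Y(x + h, eta) - Y(x - h, eta)) / (2 * h)
print(abs(dY - Y(x, eta) ** 2) < 1e-8)  # True
```

For η < 0 the same formula exhibits the analogous blow-up in backward time at x = 1/η < 0.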

Remark 5.28. There exist examples of autonomous ODE that show fixed points can

satisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16]

provides the following ODE in polar coordinates (r, ϕ):

r′ = r (1 − r), (5.32a)
ϕ′ = (1 − cos ϕ)/2 = sin²(ϕ/2). (5.32b)

Even though it is somewhat tedious, one can show that its fixed point (1, 0) satisfies Def.

5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix).

—

We will now study a method that, in certain cases, allows one to determine the stability properties of a fixed point without having to know the solutions to an ODE. The method is known as Lyapunov's method. The key ingredient to this method is a test function V, known as a Lyapunov function. Once a Lyapunov function is known, stability is often easily tested. The catch, however, is that Lyapunov functions can be hard to find. From the literature, it appears there is no universal definition of a Lyapunov function, as a suitable choice depends on the circumstances.

Definition 5.29. Let n ∈ N, let Ω0 ⊆ Rn be open, and let p ∈ Ω0. A continuous function V : Ω0 −→ R is said to be positive (resp. negative) definite at p ∈ Ω0 if, and only if, the following conditions (i) and (ii) hold:

(i) V(p) = 0.

(ii) V(y) > 0 (resp. V(y) < 0) for each y ∈ Ω0 \ {p}.


Theorem 5.30. Consider the situation of Def. 5.24 with K = R, Ω ⊆ Rn open, and f : Ω −→ Rn continuous. Let Ω0 be open with p ∈ Ω0 ⊆ Ω ⊆ Rn. Assume V : Ω0 −→ R to be continuously differentiable and define

V̇ : Ω0 −→ R, V̇(y) := (∇V)(y) • f(y) = ∑_{j=1}^{n} ∂j V(y) fj(y). (5.33)

If V is positive definite at p and V̇ ≤ 0 (resp. V̇ ≥ 0) on Ω0, then p is positively (resp. negatively) stable. If, in addition, V̇ is negative (resp. positive) definite at p, then p is positively (resp. negatively) asymptotically stable.

Proof. The proof is carried out for the case of positive (asymptotic) stability; the proof for the case of negative (asymptotic) stability is then easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting that V̇ changes its sign when replacing f with −f. Fix your favorite norm k·k on Rn. Let r > 0 be such that B̄r(p) = {y ∈ Rn : ky − pk ≤ r} ⊆ Ω0 (such an r > 0 exists, as Ω0 is open). Define

k : ]0, r] −→ R, k(ǫ) := min{ V(y) : ǫ ≤ ky − pk ≤ r }, (5.34)

where k is well-defined, since the continuous function V assumes its min on compact sets, and k(ǫ) > 0 by the positive definiteness of V. Given ǫ ∈ ]0, r], since V(p) = 0, k(ǫ) > 0, and V is continuous, we may choose δ(ǫ) such that

0 < δ(ǫ) < ǫ and V(y) < k(ǫ) for each y ∈ Bδ(ǫ)(p), (5.35)

where we used Not. 3.3 to denote an open ball with center p with respect to k · k.

We now claim that, for each η ∈ Bδ(ǫ) (p), the maximal solution x 7→ φ(x) := Y (x, η)

must remain inside Bǫ (p) for each x ≥ 0 in its domain I(0,η) (implying p to be positively

stable by Lem. 5.26). Seeking a contradiction, assume there exists ξ ≥ 0 such that kφ(ξ) − pk ≥ ǫ and let

s := sup{ x ≥ 0 : φ(t) ∈ Bǫ(p) for each t ∈ [0, x] } ≤ ξ < ∞. (5.36)

Then kφ(s) − pk = ǫ and, thus,

V(φ(s)) ≥ k(ǫ) (5.37)

by the definition of k(ǫ). On the other hand, by the chain rule, (V ◦ φ)′(x) = V̇(φ(x)) (cf. (5.16)), such that V̇ ≤ 0 implies

V(φ(s)) = V(η) + ∫₀ˢ V̇(φ(x)) dx ≤ V(η) < k(ǫ), (5.38)

where the last inequality holds by (5.35), in contradiction to (5.37), proving kφ(x) − pk < ǫ for each x ≥ 0 and the positive stability of p.


For the remaining part of the proof, we additionally assume V̇ to be negative definite

at p, while continuing to use the notation from above. Set γ := δ(r). We have to show limx→∞ Y(x, η) = p for each η ∈ Bγ(p), i.e.

for each ǫ ∈ ]0, r], there exists ξǫ ≥ 0 such that kY(x, η) − pk < ǫ for each x ≥ ξǫ. (5.39)

So fix η ∈ Bγ (p) and, as above, let φ(x) := Y (x, η). Given ǫ ∈]0, r], we first claim that

there exists ξǫ ≥ 0 such that φ(ξǫ ) ∈ Bδ(ǫ) (p), where δ(ǫ) is as in the first part of the

proof above. Indeed, seeking a contradiction, assume kφ(x) − pk ≥ δ(ǫ) for all x ≥ 0,

and set

α := max{ V̇(y) : δ(ǫ) ≤ ky − pk ≤ r }. (5.40)

Then α < 0 due to the negative definiteness of V̇ at p. Moreover, due to the choice of γ, we have δ(ǫ) ≤ kφ(x) − pk ≤ r for each x ≥ 0, implying

0 ≤ V(φ(x)) = V(η) + ∫₀ˣ V̇(φ(t)) dt ≤ V(η) + αx for each x ≥ 0, (5.41)

which is the desired contradiction, as α < 0 implies the right-hand side to go to −∞ for

x → ∞. Thus, we know the existence of ξǫ such that ηǫ := φ(ξǫ ) ∈ Bδ(ǫ) (p).

To finish the proof, we recall from the first part of the proof that kY(x, ηǫ) − pk < ǫ for each x ≥ 0. Using Lem. 5.4(a), we obtain

kY(x, η) − pk = kY(x − ξǫ, ηǫ) − pk < ǫ for each x ≥ ξǫ,

proving (5.39).

Example 5.31. Let α, β > 0 and k, m ∈ N. We claim that (0, 0) is a positively asymptotically stable fixed point for each R²-valued ODE of the form

y1′ = −y1^(2k−1) + α y1 y2², y2′ = −y2^(2m−1) − β y1² y2. (5.43)

Indeed, (0, 0) is clearly a fixed point, and we consider the Lyapunov function

V : R² −→ R, V(y1, y2) := y1²/α + y2²/β, (5.44a)

which is clearly positive definite at (0, 0). Since

V̇ : R² −→ R, V̇(y1, y2) = (2y1/α, 2y2/β) • (−y1^(2k−1) + α y1 y2², −y2^(2m−1) − β y1² y2) = −(2/α) y1^(2k) − (2/β) y2^(2m), (5.44b)

is clearly negative definite at (0, 0), Th. 5.30 proves (0, 0) to be a positively asymptotically stable fixed point.
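The cancellation of the mixed terms in (5.44b) can be spot-checked numerically; the parameter values below are arbitrary test choices, not taken from the notes:

```python
import random

# Arbitrary test parameters for (5.43)/(5.44a): alpha, beta > 0 and k, m in N.
alpha, beta, k, m = 0.7, 2.3, 2, 3

def f(y1, y2):
    """Right-hand side of (5.43)."""
    return (-y1**(2*k - 1) + alpha * y1 * y2**2,
            -y2**(2*m - 1) - beta * y1**2 * y2)

def V_dot(y1, y2):
    # grad V = (2*y1/alpha, 2*y2/beta); mixed terms cancel, leaving
    # -2*y1**(2k)/alpha - 2*y2**(2m)/beta <= 0.
    f1, f2 = f(y1, y2)
    return 2*y1/alpha * f1 + 2*y2/beta * f2

ok = all(V_dot(random.uniform(-2, 2), random.uniform(-2, 2)) <= 1e-12
         for _ in range(1000))
print(ok)  # True
```

The small tolerance only absorbs floating-point rounding in the cancellation of the cross terms.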


Theorem 5.32. Consider the situation of Def. 5.24 with K = R. Let Ω0 be open with p ∈ Ω0 ⊆ Ω ⊆ Rn. Assume V : Ω0 −→ R to be continuously differentiable and assume there is an open set U ⊆ Ω0 such that the following conditions (i) – (iii) are satisfied:

(i) p ∈ ∂U.

(ii) V > 0 and V̇ > 0 on U (resp. V < 0 and V̇ < 0 on U).

(iii) V = 0 on Ω0 ∩ ∂U.

Then p is not positively (resp. not negatively) stable.

Proof. We assume V and V̇ to be positive on U, proving p not to be positively stable; the corresponding statement regarding p not being negatively stable is then, once again, easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting that V̇ changes its sign when replacing f with −f.

Seeking a contradiction, assume p to be positively stable. Then there exists r > 0 such

that B r (p) = {y ∈ Rn : ky − pk ≤ r} ⊆ Ω0 and η ∈ Br (p) implies Y (x, η) is defined

for each x ≥ 0. Moreover, positive stability and p ∈ ∂U also imply the existence of

η ∈ U ∩ Br(p) such that φ(x) := Y(x, η) ∈ Br(p) for all x ≥ 0 (note p ≠ η as p ∈ ∂U).

Set

s := sup{ x ≥ 0 : φ(t) ∈ U for each t ∈ [0, x] }. (5.45)

If s < ∞, then the maximality of φ implies φ(s) to be defined. Moreover, φ(s) ∈ ∂U by

the definition of s, and φ(s) ∈ Br (p) ⊆ Ω0 by the choice of η. Thus, φ(s) ∈ Ω0 ∩ ∂U

and V (φ(s)) = 0. On the other hand, as V and V̇ are positive on U , we have

V(φ(s)) = V(η) + ∫₀ˢ V̇(φ(t)) dt > V(η) > 0, (5.46)

a contradiction. Thus, s = ∞, i.e. φ(x) ∈ U and V(φ(x)) > V(η) > 0 hold for each x > 0.

To conclude the proof, consider the compact set

C := { y ∈ Ū ∩ B̄r(p) : V(y) ≥ V(η) }.

Then the choice of η guarantees φ(x) ∈ C for all x ≥ 0. If y ∈ C, then V(y) ≥ V(η) > 0. If y ∈ Ω0 ∩ ∂U, then V(y) = 0, showing C ∩ ∂U = ∅, i.e. C ⊆ U and

α := min{ V̇(y) : y ∈ C } > 0.

Thus,

V(φ(x)) = V(η) + ∫₀ˣ V̇(φ(t)) dt ≥ V(η) + αx for each x ≥ 0. (5.49)

But this means that the continuous function V is unbounded on the compact set C and

this contradiction proves p is not positively stable.


Example 5.33. Let h1, h2 : Ω −→ R be continuous functions on some open set Ω ⊆ R² with (0, 0) ∈ Ω and h1(0, 0) > 0, h2(0, 0) > 0. We claim that (0, 0) is not a positively stable fixed point for any R²-valued ODE of the form

y1′ = y2 h1(y1, y2), y2′ = y1 h2(y1, y2). (5.50)

Indeed, (0, 0) is clearly a fixed point, and we let Ω0 be some open neighborhood of (0, 0),

where both h1 and h2 are positive (such an Ω0 exists by continuity of h1 , h2 and h1 , h2

being positive at (0, 0)), and consider the Lyapunov function

V : Ω0 −→ R, V (y1 , y2 ) := y1 y2 , (5.51a)

with V̇ : Ω0 −→ R,

V̇(y1, y2) = ∇V(y1, y2) • (y2 h1(y1, y2), y1 h2(y1, y2))
= (y2, y1) • (y2 h1(y1, y2), y1 h2(y1, y2))
= y2² h1(y1, y2) + y1² h2(y1, y2) > 0 on Ω0 \ {(0, 0)}. (5.51b)

Letting U := {(y1, y2) ∈ Ω0 : y1 > 0, y2 > 0}, we have (0, 0) ∈ ∂U, with V > 0 and V̇ > 0 on U, and V = 0 on Ω0 ∩ ∂U ⊆ ({0} × R) ∪ (R × {0}). Thus, Th. 5.32 applies, yielding that (0, 0) is not positively stable.

Theorem 5.34. Let Ω ⊆ Rn be open, n ∈ N, let F : Ω −→ R be continuously differentiable, and consider the so-called gradient system

y′ = −∇F(y). (5.52)

If p ∈ Ω is an isolated critical point of F (i.e. ∇F(p) = 0 and there exists an open set O with p ∈ O ⊆ Ω and ∇F ≠ 0 on O \ {p}), then p is a fixed point of (5.52) that is positively asymptotically stable, negatively asymptotically stable, or neither positively nor negatively stable, according to whether p is a local minimum for F, a local maximum for F, or neither, respectively.

Proof. We assume F to be such that (5.52) admits unique maximal solutions. Suppose F has a local min at p.

As p is an isolated critical point, the local min at p must be strict, i.e. there exists an

open neighborhood Ω0 of p such that F (p) < F (y) for each y ∈ Ω0 \ {p}. Then the

Lyapunov function V : Ω0 −→ R, V(y) := F(y) − F(p), is clearly positive definite at p and V̇ : Ω0 −→ R,

V̇(y) = ∇V(y) • (−∇F(y)) = −k∇F(y)k₂²,

is negative definite at p (shrinking Ω0, if necessary, such that Ω0 ⊆ O). Thus, p is positively asymptotically stable by Th. 5.30. If F has a local max at p, then the proof is conducted analogously, using V : Ω0 −→ R, V(y) := F(p) − F(y), or, alternatively, by using time reversion (if F has a local max at p, then −F has a local min at p, i.e. p is positively asymptotically stable for y′ = ∇F(y), i.e. p is negatively asymptotically stable for y′ = −∇F(y) by Rem. 5.25(b)).


If p is neither a local min nor max for F , then let Ω0 := O, and V : Ω0 −→ R, V (y) :=

F(p) − F(y), where O was chosen such that ∇F ≠ 0 on O \ {p}, i.e. V̇ : Ω0 −→ R, V̇(y) = k∇F(y)k₂², is positive definite at p. Let U := {y ∈ Ω0 : F(y) < F(p)}. Then

U is open by the continuity of F , and p ∈ ∂U , as p is neither a local min nor max

for F . By the continuity of F , F (y) = F (p) for each y ∈ Ω0 ∩ ∂U , i.e. V = 0 on

Ω0 ∩ ∂U . Thus, Th. 5.32 applies, showing p is not positively stable. Analogously, using

U := {y ∈ Ω0 : F (y) > F (p)} and V (y) := F (y) − F (p) shows p is not negatively

stable.

Example 5.35. (a) The function F : Rn −→ R, F(y) = kyk₂², has an isolated critical point at 0, which is also a local min for F. Thus, by Th. 5.34, 0 is a positively asymptotically stable fixed point of the gradient system y′ = −∇F(y) = −2y.

(b) The function F : R² −→ R, F(y) = e^{y1 y2}, has an isolated critical point at 0, which is neither a local min nor local max for F. Thus, by Th. 5.34, 0 is a fixed point of y′ = −∇F(y) that is neither positively nor negatively stable.

The stability properties of systems of first-order linear ODE (cf. Sec. 4.6.2) are closely

related to the eigenvalues of the matrix A. As it turns out, the stability of the origin is

essentially determined by the sign of the real part of the eigenvalues of A (cf. Th. 5.38

below). We start with a preparatory lemma:

Lemma 5.36. Let n ∈ N and W ∈ M(n, K) be invertible. Moreover, let k · k be some

norm on M(n, K). Then

k·kW : M(n, K) −→ R⁺₀, kAkW := kW⁻¹AWk, (5.56)

defines a norm on M(n, K).

Proof. If kAkW = 0, then kW⁻¹AWk = 0, implying W⁻¹AW = 0, i.e. A = W0W⁻¹ = 0, showing k·kW is positive definite. Next, for each λ ∈ K and each A, B ∈ M(n, K), kλAkW = |λ| kAkW and kA + BkW = kW⁻¹AW + W⁻¹BWk ≤ kAkW + kBkW, establishing homogeneity and the triangle inequality.


Remark and Definition 5.37. Let n ∈ N, A ∈ M(n, C), and let λ ∈ C be an eigenvalue of A.

(a) Since the kernels ker(A − λ Id)^k, k ∈ N0, form an increasing sequence of subspaces of the finite-dimensional space Cn, there exists a smallest r(λ) ∈ N such that ker(A − λ Id)^{r(λ)} = ker(A − λ Id)^{r(λ)+1}. Then

ker(A − λ Id)^{r(λ)} = ker(A − λ Id)^{r(λ)+k} for each k ∈ N:

Indeed, seeking a contradiction, let k0 ∈ N be smallest such that this equality fails. Then there exists v ∈ Cn such that (A − λ Id)^{r(λ)+k0} v = 0, but (A − λ Id)^{r(λ)+k0−1} v ≠ 0. However, that means w := (A − λ Id)^{k0−1} v ∈ ker(A − λ Id)^{r(λ)+1}, but w ∉ ker(A − λ Id)^{r(λ)}, in contradiction to the definition of r(λ). The space

M(λ) := ker(A − λ Id)^{r(λ)}

is called the generalized eigenspace of A with respect to λ.

(b) Due to A(A − λ Id) = (A − λ Id)A, one has

A(ker(A − λ Id)^k) ⊆ ker(A − λ Id)^k for each k ∈ N0,

i.e. all the kernels (in particular, the generalized eigenspace M(λ)) are invariant subspaces for A.

(c) As already mentioned in Rem. 4.51 the algebraic multiplicity of λ, denoted ma (λ),

is its multiplicity as a zero of the characteristic polynomial χA (x) = det(A − x Id),

and the geometric multiplicity of λ is mg (λ) := dim ker(A − λ Id). We call the

eigenvalue λ semisimple if, and only if, its algebraic and geometric multiplicities

are equal. We then have the equivalence of the following statements (i) – (iv):

(i) λ is semisimple.

(ii) M (λ) = ker(A − λ Id).

(iii) A↾M (λ) is diagonalizable.

(iv) All the Jordan blocks corresponding to λ are trivial, i.e. they all have size 1

(i.e. there are dim ker(A − λ Id) such blocks).

Indeed, note that ma(λ) = dim ker(A − λ Id)^{ma(λ)} (e.g., since, if A is in Jordan normal form, then ma(λ) provides the size of the λ-block and, for A − λ Id, this block is canonically nilpotent). This shows the equivalence between (i) and (ii).

Moreover, mg (λ) = ma (λ) means ker(A − λ Id) has a basis of ma (λ) eigenvectors

v1 , . . . , vma (λ) for the eigenvalue λ. The equivalence of (i),(ii) with (iii) and with

(iv) is then given by Th. 4.45 and Th. 4.46, respectively.


Theorem 5.38. Let n ∈ N and A ∈ M(n, C). Moreover, let k · k be some norm on

M(n, C) and let λ1, . . . , λs ∈ C, 1 ≤ s ≤ n, be the distinct eigenvalues of A.

(a) The following statements (i) – (iii) are equivalent:

(i) There exists K > 0 such that ke^{Ax}k ≤ K holds for each x ≥ 0 (resp. x ≤ 0).

(ii) Re λj ≤ 0 (resp. Re λj ≥ 0) for every j = 1, . . . , s and if Re λj = 0 occurs, then

λj is a semisimple eigenvalue (i.e. its algebraic and geometric multiplicities

are equal).

(iii) The fixed point 0 of y′ = Ay is positively (resp. negatively) stable.

(b) The following statements (i) – (iii) are equivalent:

(i) There exist K, α > 0 such that ke^{Ax}k ≤ K e^{−α|x|} holds for each x ≥ 0 (resp. x ≤ 0).

(ii) Re λj < 0 (resp. Re λj > 0) for every j = 1, . . . , s.

(iii) The fixed point 0 of y ′ = Ay is positively (resp. negatively) asymptotically

stable.

Proof. It is convenient to initially work with the max-norm on M(n, C) ≅ C^{n²}, i.e.

k(mkl)k_max := max{ |mkl| : k, l ∈ {1, . . . , n} }

(caveat: for n > 1, this is not the operator norm induced by the max-norm on Cn).

Moreover, using Th. 4.46, let W ∈ M(n, C) be invertible and such that B := W⁻¹AW is in Jordan normal form. Then, according to Lem. 5.36, kMk^W_max := kW⁻¹MWk_max also defines a norm on M(n, C). According to Th. 4.47(b),

ke^{Ax}k^W_max = kW⁻¹ e^{Ax} Wk_max = ke^{W⁻¹AW x}k_max = ke^{Bx}k_max for each x ∈ R. (5.57)

According to Th. 4.44 and Th. 4.49, the entries βkl(x) of (βkl(x)) := e^{Bx} enjoy the following property: each βkl either vanishes identically or, for each k, l ∈ {1, . . . , n}, there exist j ∈ {1, . . . , s}, C > 0, and m ∈ N0 such that

|βkl(x)| = C e^{Re λj x} |x|^m for each x ∈ R. (5.58)

Moreover,

|βkl(x)| = C e^{Re λj x} |x|^m ∧ Re λj < 0 ⇒ lim_{x→∞} |βkl(x)| = 0, (5.59a)
|βkl(x)| = C e^{Re λj x} |x|^m ∧ Re λj > 0 ⇒ lim_{x→∞} |βkl(x)| = ∞, (5.59b)
|βkl(x)| = C e^{Re λj x} |x|^m ∧ Re λj = 0 ∧ m = 0 ⇒ |βkl| ≡ C, (5.59c)
|βkl(x)| = C e^{Re λj x} |x|^m ∧ Re λj = 0 ∧ m > 0 ⇒ lim_{x→∞} |βkl(x)| = ∞. (5.59d)

(a): We start with the equivalence between (i) and (ii): Suppose, Re λj ≤ 0 for every

j = 1, . . . , s and if Re λj = 0 occurs, then λj is a semisimple eigenvalue. Then, using


Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation (5.59c) (or βkl ≡ 0). Thus, there exists K0 > 0 such that |βkl(x)| ≤ K0 for each x ≥ 0 and each k, l = 1, . . . , n. Then there exists K1 > 0 such that

ke^{Ax}k ≤ K1 ke^{Ax}k^W_max = K1 ke^{Bx}k_max ≤ K1 K0 for each x ≥ 0, (5.60)

proving (i). Conversely, if there is j ∈ {1, . . . , s} such that

Re λj > 0, then there is βkl such that (5.59b) occurs; if there is j ∈ {1, . . . , s} such that

Re λj = 0 and λj is not semisimple, then, using Rem. and Def. 5.37(c), there is βkl such

that (5.59d) occurs. In both cases,

lim_{x→∞} ke^{Ax}k = lim_{x→∞} ke^{Ax}k^W_max = lim_{x→∞} ke^{Bx}k_max = ∞, (5.61)

i.e., the corresponding statement of (i) can not be true. The remaining case is handled

via time reversion: keAx k ≤ K holds for each x ≤ 0 if, and only if, ke−Ax k ≤ K holds

for each x ≥ 0, which holds if, and only if, Re(−λj ) ≤ 0 for every j = 1, . . . , s with

λj semisimple for Re(−λj ) = 0, which is equivalent to Re λj ≥ 0 for every j = 1, . . . , s

with λj semisimple for Re λj = 0.

We proceed to the equivalence between (i) and (iii): Fix some arbitrary norm k·k on

Cn , and let k · kop denote the induced operator norm on M(n, C). Let C1 , C2 > 0 be

such that kM kop ≤ C1 kM k and kM k ≤ C2 kM kop for each M ∈ M(n, C). Suppose

there exists K > 0 such that keAx k ≤ K holds for each x ≥ 0. Given ǫ > 0, choose

δ := ǫ/(C1 K). Then, for each η ∈ Bδ(0) and each x ≥ 0,

kY(x, η) − 0k = ke^{Ax} ηk ≤ ke^{Ax}k_op kηk ≤ C1 ke^{Ax}k kηk < C1 K · ǫ/(C1 K) = ǫ, (5.62)

proving 0 is positively stable. Conversely, assume 0 to be positively stable. Then there

exists δ > 0 such that kY (x, η)k = keAx ηk < 1 for each η ∈ Bδ (0) and each x ≥ 0.

Thus,

By (G.1), for each x ≥ 0,

ke^{Ax}k_op = sup{ ke^{Ax} ηk : η ∈ Cn, kηk = 1 } = (2/δ) sup{ ke^{Ax} ηk : η ∈ Cn, kηk = δ/2 } ≤ 2/δ, (5.63)

showing (i) holds with K := 2 C2 /δ. The remaining case is handled via time reversion:

keAx k ≤ K holds for each x ≤ 0 if, and only if, ke−Ax k ≤ K holds for each x ≥ 0, which

holds if, and only if, 0 is positively stable for y ′ = −Ay, which, by Rem. 5.25(a), holds

if, and only if, 0 is negatively stable for y ′ = Ay.

(b): As in (a), we start with the equivalence between (i) and (ii): Suppose Re λj < 0 for every j = 1, . . . , s. We first show, using (5.58): for each k, l = 1, . . . , n, there exist Kkl, αkl > 0 such that

|βkl(x)| ≤ Kkl e^{−αkl x} for each x ≥ 0. (5.64)

According to (5.58), there exist suitable j, m, and Ckl > 0 such that |βkl(x)| ≤ Ckl e^{Re λj x} x^m for each x ≥ 0 (the case βkl ≡ 0 being trivial).


Since Re λj < 0, one has limx→∞ eRe λj x/2 xm = 0, i.e. eRe λj x/2 xm is uniformly bounded

on [0, ∞[ by some Mkl > 0. Thus, (5.64) holds with Kkl := Ckl Mkl and αkl := − Re λj /2.

In consequence, if K1 is chosen as in (5.60), then keAx k ≤ Ke−α|x| ≤ K for each x ≥ 0

holds with K := K1 max{Kkl : k, l = 1, . . . , n} and α := min{αkl : k, l = 1, . . . , n}.

Conversely, if there is j ∈ {1, . . . , s} such that Re λj ≥ 0, then there is βkl such that (5.59b) or (5.59c) or (5.59d) occurs. In each case, ke^{Ax}k does not tend to 0 for x → ∞, since

lim_{x→∞} ke^{Ax}k^W_max = lim_{x→∞} ke^{Bx}k_max ∈ ]0, ∞], (5.65)

i.e., the corresponding statement of (i) can not be true. The remaining case is handled

via time reversion: keAx k ≤ Ke−α|x| holds for each x ≤ 0 if, and only if, ke−Ax k ≤

Ke−α|x| holds for each x ≥ 0, which holds if, and only if, Re(−λj ) < 0 for every

j = 1, . . . , s, which is equivalent to Re λj > 0 for every j = 1, . . . , s.

It remains to consider the equivalence between (i) and (iii): Let k · kop and C1 , C2 > 0

be as in the proof of the equivalence between (i) and (iii) in (a). Suppose, there exist

K, α > 0 such that ke^{Ax}k ≤ K e^{−α|x|} holds for each x ≥ 0. Since ke^{Ax}k ≤ K e^{−α|x|} ≤ K for each x ≥ 0, 0 is positively stable by (a). Moreover, for each η ∈ Cn,

kY(x, η)k = ke^{Ax} ηk ≤ C1 K e^{−αx} kηk → 0 for x → ∞, (5.66)

showing 0 to be positively asymptotically stable. For the converse, we will actually

show (iii) implies (ii). If 0 is positively asymptotically stable, then, in particular, it

is positively stable, such that (ii) of (a) must hold. It merely remains to exclude the

possibility of a semisimple eigenvalue λ with Re λ = 0. If there were a semisimple eigenvalue λ with Re λ = 0, then e^{Bx} had a Jordan block of size 1 with entry e^{λx}, i.e. βkk(x) = e^{λx} for some k ∈ {1, . . . , n}. Let ek be the corresponding standard unit vector of Cn (all entries 0, except the kth entry, which is 1). Then, for η := W ek, one has e^{Ax} η = W e^{Bx} ek = e^{λx} η and, thus,

kY(x, ǫη)k = ǫ |e^{λx}| kηk = ǫ · 1 · kηk > 0 for each ǫ ∈ R⁺ and each x ∈ R, (5.67)

showing Y(x, ǫη) does not tend to 0 for x → ∞, even though kǫηk can be made arbitrarily small (note η ≠ 0, as W is invertible and ek ≠ 0), in contradiction to 0 being positively asymptotically stable (with respect to any norm on Cn). The remaining case is, once again, handled via time reversion: ke^{Ax}k ≤

Ke−α|x| holds for each x ≤ 0 if, and only if, ke−Ax k ≤ Ke−α|x| holds for each x ≥ 0,

which holds if, and only if, 0 is positively asymptotically stable for y ′ = −Ay, which, by

Rem. 5.25(b), holds if, and only if, 0 is negatively asymptotically stable for y ′ = Ay.

Example 5.39. (a) The matrix

A = ( 2 1 ; 1 2 )

has eigenvalues 1 and 3 and, thus, the fixed point 0 of y′ = Ay is negatively asymptotically stable, but not positively stable.

(b) The matrix

A = ( 0 1 ; 0 0 )


has eigenvalue 0, which is not semisimple, i.e. the fixed point 0 of y′ = Ay is neither negatively nor positively stable.

(c) The matrix

A = ( i 1 2 2−3i ; 0 −i 5 −17 ; 0 0 −1+3i 0 ; 0 0 0 −5 )

has simple eigenvalues i, −i, −1 + 3i, −5, i.e. the fixed point 0 of y′ = Ay is positively stable (since all real parts are ≤ 0 and the eigenvalues with real part 0 are simple, hence semisimple), but neither negatively stable nor positively asymptotically stable (since there are eigenvalues with 0 real part).
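For the nilpotent matrix of (b), the polynomial growth promised by (5.59d) can be made completely explicit, since the exponential series terminates (this is a sketch, not from the notes):

```python
# For A = [[0, 1], [0, 0]] one has A^2 = 0, so the matrix exponential series
# terminates after two terms: e^{Ax} = Id + A*x = [[1, x], [0, 1]].
def exp_Ax(x):
    return [[1.0, x], [0.0, 1.0]]

def max_norm(M):
    return max(abs(entry) for row in M for entry in row)

# Eigenvalue 0 has zero real part but is not semisimple; accordingly,
# ||e^{Ax}||_max grows without bound (here linearly, the |x|^m factor of (5.59d)):
print(max_norm(exp_Ax(10.0)), max_norm(exp_Ax(1000.0)))  # 10.0 1000.0
```

The unbounded growth of ke^{Ax}k is exactly what rules out condition (i) of Th. 5.38(a) for this matrix.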

5.4 Linearization

If the right-hand side f of an autonomous ODE is differentiable and p is a fixed point (i.e.

f (p) = 0), then one can sometimes use its linearization, i.e. its derivative A := Df (p)

(which is an n × n matrix), to infer stability properties of y ′ = f (y) at p from those of

y ′ = Ay at 0 (see Th. 5.44 below). We start with some preparatory results:

Lemma 5.40. Let n ∈ N, B = (bkl) ∈ M(n, R), and consider

β : Rn × Rn −→ R, β(y, z) := y • (Bz) = y^t B z = ∑_{k,l=1}^{n} yk bkl zl, (5.68)

where "•" denotes the Euclidean scalar product, and elements of Rn are interpreted as column vectors when involved in matrix multiplications.

(a) The function β is differentiable (it is even a polynomial, deg(β) ≤ 2, and, thus, C∞) and, for each k, l ∈ {1, . . . , n} and each (y, z) ∈ Rn × Rn,

∂yk β : Rn × Rn −→ R, ∂yk β(y, z) = ∑_{l=1}^{n} bkl zl = (Bz)k, (5.69a)

∂zl β : Rn × Rn −→ R, ∂zl β(y, z) = ∑_{k=1}^{n} yk bkl = (y^t B)l, (5.69b)

Dβ(y, z) = ∇β(y, z) : Rn × Rn −→ R, ∇β(y, z)(u, v) = β(y, v) + β(u, z) = y^t B v + u^t B z. (5.69c)

(b) The function

V : Rn −→ R, V(y) := β(y, y) = y • (By) = y^t B y = ∑_{k,l=1}^{n} yk bkl yl, (5.70)

is differentiable with


∂yk V : Rn −→ R, ∂yk V(y) = ∑_{l=1}^{n} yl (bkl + blk) = (y^t (B + B^t))k for each k ∈ {1, . . . , n} and y ∈ Rn, (5.71a)

DV(y) = ∇V(y) : Rn −→ R, ∇V(y)(u) = β(y, u) + β(u, y) = y^t B u + u^t B y = y^t (B + B^t) u for each y, u ∈ Rn. (5.71b)

Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).

(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b).

Lemma 5.41. Let n ∈ N, A, B ∈ M(n, R), and let V be as in (5.70). Then

∇V(y) • (Ay) = y^t (BA + A^t B) y for each y ∈ Rn. (5.72)

Proof. We note

(y^t B^t) • (Ay) = y^t B^t A y = (y^t B^t A y)^t = y^t A^t B y for each y ∈ Rn, (5.73)

since a 1 × 1 matrix equals its transpose. Thus,

∇V(y) • (Ay) = (y^t (B + B^t)) • (Ay) = y^t (BA + A^t B) y for each y ∈ Rn, (5.74)

where we used (5.71b) and (5.73), proving (5.72).

Definition 5.42. A matrix B ∈ M(n, R), n ∈ N, is called positive definite if, and only

if, the function V of (5.70) is positive definite at p = 0 in the sense of Def. 5.29.

Proposition 5.43. Let A ∈ M(n, R), n ∈ N. Then the following statements (i) – (iii) are equivalent:

(i) There exist positive definite matrices B, C ∈ M(n, R) satisfying

BA + A^t B = −C. (5.75)

(ii) Every eigenvalue of A has negative real part.

(iii) For each given positive definite (symmetric) C ∈ M(n, R), there exists a positive definite (symmetric) B ∈ M(n, R), satisfying (5.75).

Proof. (iii) immediately implies (i) (e.g. by applying (iii) with C := Id).

For the proof that (i) implies (ii), let B, C ∈ M(n, R) be positive definite matrices,

satisfying (5.75). By Th. 5.38(b), it suffices to show 0 is a positively asymptotically

stable fixed point for y ′ = Ay. To this end, we apply Th. 5.30, using V : Rn −→ R of


(5.70) as the Lyapunov function. Then, by Def. 5.42, B being positive definite means V being positive definite at 0. Since

V̇ : Rn −→ R, V̇(y) = ∇V(y) • (Ay) = y^t (BA + A^t B) y = −y^t C y (5.76)

(using (5.72) and (5.75)) is negative definite at 0 (as C is positive definite), Th. 5.30 shows 0 to be a positively asymptotically stable fixed point for y′ = Ay as desired.

It remains to show that (ii) implies (iii). If all eigenvalues of A have negative real part,

then, as A and At have the same eigenvalues, all eigenvalues of At have negative real

part as well. Thus, according to Th. 5.38(b), there exist K, α > 0 such that

ke^{Ax}k_max ≤ K e^{−αx} ∧ ke^{A^t x}k_max ≤ K e^{−αx} for each x ≥ 0, (5.77)

where we have chosen the norm in (5.77) to mean the max-norm on R^{n²} ≅ M(n, R) (note that e^{Ax} is real if A is real, e.g. due to the series representation (4.73)). Given C ∈ M(n, R), define

B := ∫₀^∞ e^{A^t x} C e^{Ax} dx. (5.78)

To verify that B ∈ M(n, R) is well-defined, note that each entry of the integrand matrix of (5.78) constitutes an integrable function on [0, ∞[: Indeed, there exists M > 0 such that

ke^{A^t x} C e^{Ax}k_max ≤ M ke^{A^t x}k_max kCk_max ke^{Ax}k_max ≤ M kCk_max K² e^{−2αx} for each x ≥ 0. (5.79)

Moreover, using (5.79) for the first equality,

−C = lim_{x→∞} (e^{A^t x} C e^{Ax} − C) = lim_{x→∞} ∫₀ˣ ∂s (e^{A^t s} C e^{As}) ds
= ∫₀^∞ ∂s (e^{A^t s} C e^{As}) ds
= ∫₀^∞ (A^t e^{A^t s} C e^{As} + e^{A^t s} C e^{As} A) ds (by (I.3))
= A^t ∫₀^∞ e^{A^t s} C e^{As} ds + (∫₀^∞ e^{A^t s} C e^{As} ds) A (by (I.5), (I.6))
= A^t B + BA, (5.80)

implying that B satisfies (5.75). It remains to verify that B is positive definite and, for symmetric C, symmetric. For each y ∈ Rn \ {0},

y^t B y = y^t (∫₀^∞ e^{A^t x} C e^{Ax} dx) y = ∫₀^∞ y^t e^{A^t x} C e^{Ax} y dx = ∫₀^∞ (e^{Ax} y)^t C (e^{Ax} y) dx > 0, (5.81)

using (I.5), (I.6), and Prop. 4.40(c) together with the positive definiteness of C and the invertibility of e^{Ax}, showing B to be positive definite.


If C is symmetric, then so is B:

B^t = (∫₀^∞ e^{A^t x} C e^{Ax} dx)^t = ∫₀^∞ e^{A^t x} C^t e^{Ax} dx = ∫₀^∞ e^{A^t x} C e^{Ax} dx = B, (5.82)

again using Prop. 4.40(c), which completes the proof.
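For diagonal A the integral (5.78) can be evaluated entrywise in closed form, which gives a small self-contained check of the Lyapunov equation (5.75); the eigenvalues and the matrix C below are arbitrary test choices:

```python
# For A = diag(l1, l2) with negative eigenvalues, e^{Ax} = diag(e^{l1 x}, e^{l2 x}),
# so (5.78) integrates entrywise to B_kj = -C_kj / (l_k + l_j).
l = (-1.0, -2.0)                    # eigenvalues of A (negative real parts)
C = [[2.0, 1.0], [1.0, 2.0]]        # a positive definite symmetric choice of C

B = [[-C[k][j] / (l[k] + l[j]) for j in range(2)] for k in range(2)]

# Check the Lyapunov equation BA + A^t B = -C (A diagonal: (BA)_kj = B_kj * l_j):
residual = max(abs(B[k][j] * l[j] + l[k] * B[k][j] + C[k][j])
               for k in range(2) for j in range(2))
print(residual < 1e-12)  # True

# B = [[1, 1/3], [1/3, 1/2]] is symmetric with positive diagonal and positive
# determinant, hence positive definite, as Prop. 5.43 asserts:
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
print(B[0][0] > 0 and det > 0)  # True
```

For non-diagonal A one would instead solve (5.75) as a linear system (or numerically integrate (5.78)); the diagonal case merely makes the formula transparent.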

Theorem 5.44. Let Ω ⊆ Rn be open, n ∈ N, and f : Ω −→ Rn continuously differ-

entiable. Let p ∈ Ω be a fixed point (i.e. f (p) = 0) and A := Df (p) ∈ M(n, R) the

derivative of f at p. If all eigenvalues of A have negative (resp. positive) real parts, then

p is a positively (resp. negatively) asymptotically stable fixed point for y ′ = f (y).

Proof. Let all eigenvalues of A have negative real parts. We first consider the special

case p = 0, i.e. A = Df (0). By the equivalence between (ii) and (iii) of Prop. 5.43,

we can choose C := Id in (iii) to obtain the existence of a positive definite symmetric

matrix B ∈ M(n, R), satisfying

BA + At B = − Id . (5.83)

The idea is now to apply the Lyapunov Th. 5.30 with V of (5.70), i.e.

V : Ω −→ R, V(y) := y • (By) = y^t B y = ∑_{k,l=1}^{n} yk bkl yl. (5.84)

As B is positive definite, V is positive definite at 0, and we conclude the proof of 0 being positively asymptotically stable by showing there exists δ > 0, such that

V̇ : Bδ(0) −→ R, V̇(y) = (∇V)(y) • f(y), (5.85)

is negative definite at 0, where we take Bδ(0) with respect to the 2-norm k·k₂ on Rn.

The differentiability of f at 0 implies that (cf. [Phi15, Lem. 2.21])

r : Ω −→ Rn, r(y) := f(y) − Ay, (5.86)

satisfies

lim_{y→0} kr(y)k₂ / kyk₂ = 0. (5.87)

One computes

V̇(y) = (∇V)(y) • f(y) = (∇V)(y) • (Ay) + (y^t (B + B^t)) • r(y)
= y^t (BA + A^t B) y + 2 y^t B r(y) = −kyk₂² + 2 y • B r(y), (5.88)

using (5.71b) and (5.86) for the second equality, (5.72) and B = B^t for the third, and (5.83) for the last.

We can estimate the second summand via the Cauchy–Schwarz inequality to obtain

|y • B r(y)| ≤ kyk₂ kB r(y)k₂ ≤ kyk₂ kBk kr(y)k₂, (5.89)

where kBk denotes the operator norm induced by k·k₂, such that (5.87) yields

lim_{y→0} (y • B r(y)) / kyk₂² = 0. (5.90)


In consequence, there exists δ > 0 such that B̄δ(0) ⊆ Ω and

|2 y • B r(y)| / kyk₂² < 1/2 for each y ∈ Bδ(0) \ {0}. (5.91)

Thus, for each y ∈ Bδ(0) \ {0},

V̇(y) = −kyk₂² + 2 y • B r(y) < −kyk₂² + kyk₂²/2 = −kyk₂²/2 < 0, (5.92)

showing V̇ to be negative definite at 0, and 0 to be positively asymptotically stable. If

p 6= 0, then consider the ODE y ′ = g(y) := f (y + p), g : (Ω − p) −→ Rn . Then 0 is a

fixed point for y ′ = g(y), Dg(0) = Df (p) = A, i.e. 0 is positively asymptotically stable

for y ′ = g(y). But, since ψ is a solution to y ′ = g(y) if, and only if, φ = ψ +p is a solution

to y ′ = f (y), p must be positively asymptotically stable for y ′ = f (y). The remaining

case that all eigenvalues of A have positive real parts is now treated via time reversion:

If all eigenvalues of A have positive real parts, then all eigenvalues of −A = D(−f )(p)

have negative real parts, i.e. p is positively asymptotically stable for y ′ = −f (y), i.e., by

Rem. 5.25(b), p is negatively asymptotically stable for y ′ = f (y).

Caveat 5.45. The following example shows that the converse of Th. 5.44 does not hold: A fixed point p can be positively (resp. negatively) asymptotically stable without A := Df(p) having only eigenvalues with negative (resp. positive) real parts. The same example shows that, in general, one cannot infer anything regarding the stability of the fixed point p if A := Df(p) is merely stable, but not asymptotically stable: Consider a continuously differentiable right-hand side f : R² −→ R², depending on a parameter µ ∈ R, with fixed point (0, 0) and, independently of µ,

Df(0, 0) = ( 0 1 ; −1 0 ) (5.94)

with complex eigenvalues i and −i. Thus, the linearized system is positively and nega-

tively stable, but not asymptotically stable, still independently of µ. However, we claim

that (0, 0) is a positively asymptotically stable fixed point for y′ = f(y) if µ < 0 and a negatively asymptotically stable fixed point for y′ = f(y) if µ > 0. Indeed, this can be seen by using the Lyapunov function V : R² −→ R, V(y1, y2) = y1² + y2², which has ∇V(y1, y2) = (2y1, 2y2) and V̇(y1, y2) = 2µ (y1² + y2²)². Thus, V is positive definite at (0, 0) and V̇ is negative definite at (0, 0) for µ < 0 and positive definite at (0, 0) for µ > 0.
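A standard instance consistent with (5.94) is f(y1, y2) = (y2 + µ y1 (y1² + y2²), −y1 + µ y2 (y1² + y2²)); this concrete choice is an assumption for illustration, not reproduced from the notes. For it, the sign of V̇ = 2µ (y1² + y2²)² can be spot-checked numerically:

```python
import random

# Hypothetical concrete right-hand side with Df(0,0) = [[0, 1], [-1, 0]]:
#   f(y1, y2) = (y2 + mu*y1*(y1^2 + y2^2), -y1 + mu*y2*(y1^2 + y2^2)).
def V_dot(y1, y2, mu):
    f1 = y2 + mu * y1 * (y1**2 + y2**2)
    f2 = -y1 + mu * y2 * (y1**2 + y2**2)
    return 2*y1*f1 + 2*y2*f2   # = 2*mu*(y1^2 + y2^2)^2 after cancellation

pts = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(1000)]
print(all(V_dot(a, b, -1.0) <= 1e-12 for a, b in pts),   # V_dot <= 0 for mu < 0
      all(V_dot(a, b, +1.0) >= -1e-12 for a, b in pts))  # V_dot >= 0 for mu > 0
```

The rotational part of f drops out of V̇ exactly, so only the µ-dependent radial part remains.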


The derivative is

Df : R³ −→ M(3, R), Df(x, y, z) = ( −cos y  x sin y  0 ; 0  −e^z  −y e^z ; 2x  0  −2 ). (5.97)

Clearly, (0, 0, 0) is a fixed point and Df (0, 0, 0) has eigenvalues −1 and −2. Thus,

(0, 0, 0) is a positively asymptotically stable fixed point for (x, y, z)′ = f (x, y, z) by Th.

5.44.
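The eigenvalue claim is immediate once the entries of (5.97) are evaluated at the fixed point (a trivial check, included only as a sketch):

```python
# Entries of Df(0,0,0) from (5.97) with x = y = z = 0
# (cos 0 = 1, sin 0 = 0, e^0 = 1):
Df0 = [[-1.0, 0.0, 0.0],
       [0.0, -1.0, 0.0],
       [0.0, 0.0, -2.0]]

# The matrix is triangular (here even diagonal), so the eigenvalues can be
# read off the diagonal: -1 (twice) and -2, all with negative real part,
# which is exactly the hypothesis of Th. 5.44.
eigenvalues = [Df0[i][i] for i in range(3)]
print(sorted(set(eigenvalues)))         # [-2.0, -1.0]
print(all(l < 0 for l in eigenvalues))  # True
```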

Limit sets are important when studying the asymptotic behavior of solutions, i.e. φ(x)

for x → ∞ and for x → −∞. If a solution has a limit, then its corresponding limit set

consists of precisely one point. In general, the limit set of a solution is defined to consist

of all points that occur as limits of sequences taken along the solution’s orbit (of course,

the limit sets can also be empty):

Definition 5.47. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn be such that y′ = f(y) admits unique maximal solutions. For each η ∈ Ω, we define the omega limit set and the alpha limit set of η as follows:

ω(η) := ωf(η) := { y ∈ Ω : there is a sequence (xk)k∈N in R with lim_{k→∞} xk = ∞ and lim_{k→∞} Y(xk, η) = y }, (5.98a)

α(η) := αf(η) := { y ∈ Ω : there is a sequence (xk)k∈N in R with lim_{k→∞} xk = −∞ and lim_{k→∞} Y(xk, η) = y }. (5.98b)

Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of y ′ =

f (y), i.e. y ′ = −f (y), with its general solution Ỹ (x, η) = Y (−x, η), cf. (5.28). Clearly,

for each η ∈ Ω,

ωf (η) = α−f (η), αf (η) = ω−f (η). (5.99)

Proposition 5.49. In the situation of Def. 5.47, the following hold:

(a) If R⁺₀ ⊆ I(0,η), then

ω(η) = ⋂_{m=0}^{∞} cl{ Y(x, η) : x ≥ m }; (5.100a)

if R⁻₀ ⊆ I(0,η), then

α(η) = ⋂_{m=0}^{∞} cl{ Y(x, η) : x ≤ −m }, (5.100b)

where cl denotes the closure in Kn.

(b) All points in the same orbit have the same omega and alpha limit sets, i.e.

ω(η) = ω(Y(x, η)) ∧ α(η) = α(Y(x, η)) for each x ∈ I(0,η).


Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit

sets.

(a): Let y ∈ ω(η) and m ∈ N₀. Then there is a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Y(x_k, η) = y. Since, for sufficiently large k₀ ∈ N, the sequence (Y(x_k, η))_{k≥k₀} is in {Y(x, η) : x ≥ m}, its limit y lies in the closure of this set, proving the inclusion “⊆” of (5.100a). Conversely, assume y ∈ \overline{ {Y(x, η) : x ≥ m} } for each m ∈ N₀. Then

∀ k ∈ N ∃ x_k ∈ [k, ∞[ :  ‖Y(x_k, η) − y‖ < 1/k,

providing a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Y(x_k, η) = y, proving y ∈ ω(η) and the inclusion “⊇” of (5.100a).

(b): Let y ∈ ω(η) and x ∈ I_{0,η}. Choose a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Y(x_k, η) = y. Then lim_{k→∞}(x_k − x) = ∞ and, by Lem. 5.4(b),

lim_{k→∞} Y(x_k − x, Y(x, η)) = lim_{k→∞} Y(x_k, η) = y,   (5.101)

proving ω(η) ⊆ ω(Y(x, η)). The reversed inclusion then also follows, since, by Lem. 5.4(b),

Y(−x, Y(x, η)) = Y(0, η) = η.   (5.102)

Example 5.50. Consider the situation of Def. 5.47, i.e. y′ = f(y) admits unique maximal solutions.

(a) If η ∈ Ω is such that lim_{x→∞} Y(x, η) = y ∈ Kⁿ, then ω(η) = {y}; if η ∈ Ω is such that lim_{x→−∞} Y(x, η) = y ∈ Kⁿ, then α(η) = {y}.

(b) If A ∈ M(n, C) is such that the conditions of Th. 5.38(b) hold (all eigenvalues have negative real parts, 0 is positively asymptotically stable), then ω(η) = {0} for each η ∈ Cⁿ and α(η) = ∅ for each η ∈ Cⁿ \ {0}.

(c) If η ∈ Ω is such that the orbit O(φ) of φ := Y(·, η) is periodic, then ω(η) = α(η) = O(φ). For example, for (5.8),

∀ η ∈ R² :  ω(η) = α(η) = O(Y(·, η)).   (5.103)

Example 5.51. As an example with nonperiodic orbits that have limit sets consisting of more than one point, consider

y₁′ = y₂ + y₁(1 − y₁² − y₂²),   (5.104a)
y₂′ = −y₁ + y₂(1 − y₁² − y₂²).   (5.104b)


We will show that, for each point except the origin (which is clearly a fixed point), the omega limit set is the unit circle, i.e.

∀ η ∈ R² \ {0} :  ω(η) = S₁(0) = {y ∈ R² : ‖y‖₂ = 1}.   (5.105)

The general solution is given by

Y : D_{f,0} −→ R²,  Y(x, η₁, η₂) = (η₁ cos x + η₂ sin x, η₂ cos x − η₁ sin x) / √(η₁² + η₂² + (1 − η₁² − η₂²) e^{−2x}),   (5.106)

where, letting

∀ η ∈ {y ∈ R² : ‖y‖₂ > 1} :  x_η := (1/2) ln( (‖η‖₂² − 1) / ‖η‖₂² ),   (5.107)

D_{f,0} = (R × {η ∈ R² : ‖η‖₂ ≤ 1}) ∪ {(x, η) ∈ R × R² : ‖η‖₂ > 1 ∧ x > x_η}:   (5.108)

Indeed,

Y(0, η₁, η₂) = (η₁, η₂) / √(η₁² + η₂² + (1 − η₁² − η₂²)) = (η₁, η₂).   (5.109)

The following computations prepare the check that each Y(·, η₁, η₂) satisfies (5.104):

The 2-norm squared of the numerator in (5.106) is

‖(η₁ cos x + η₂ sin x, η₂ cos x − η₁ sin x)‖₂²
= η₁² cos²x + 2η₁η₂ cos x sin x + η₂² sin²x + η₂² cos²x − 2η₁η₂ cos x sin x + η₁² sin²x
= η₁² + η₂² = ‖η‖₂².   (5.110)

Thus,

‖Y(x, η₁, η₂)‖₂ = ‖η‖₂ / √(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x})   (5.111)

and

1 − Y₁²(x, η₁, η₂) − Y₂²(x, η₁, η₂) = 1 − ‖Y(x, η₁, η₂)‖₂²
= (‖η‖₂² + (1 − ‖η‖₂²) e^{−2x} − ‖η‖₂²) / (‖η‖₂² + (1 − ‖η‖₂²) e^{−2x})
= (1 − ‖η‖₂²) e^{−2x} / (‖η‖₂² + (1 − ‖η‖₂²) e^{−2x}).   (5.112)

In consequence,

Y₁′(x, η₁, η₂)
= [ (−η₁ sin x + η₂ cos x)(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x}) + (η₁ cos x + η₂ sin x)(1 − ‖η‖₂²) e^{−2x} ] / (‖η‖₂² + (1 − ‖η‖₂²) e^{−2x})^{3/2}
= Y₂(x, η₁, η₂) + Y₁(x, η₁, η₂)(1 − Y₁²(x, η₁, η₂) − Y₂²(x, η₁, η₂)),   (5.113)

verifying (5.104a), and

Y₂′(x, η₁, η₂)
= [ (−η₂ sin x − η₁ cos x)(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x}) + (η₂ cos x − η₁ sin x)(1 − ‖η‖₂²) e^{−2x} ] / (‖η‖₂² + (1 − ‖η‖₂²) e^{−2x})^{3/2}
= −Y₁(x, η₁, η₂) + Y₂(x, η₁, η₂)(1 − Y₁²(x, η₁, η₂) − Y₂²(x, η₁, η₂)),   (5.114)

verifying (5.104b).

For ‖η‖₂ ≤ 1, Y(·, η₁, η₂) is maximal, as it is defined on R (the denominator in (5.106) has no zero in this case). For ‖η‖₂ > 1, the denominator clearly has a zero at x_η < 0, where x_η is defined as in (5.107). For x > x_η, the expression under the square root in (5.106) is positive. Since lim_{x↓x_η} ‖Y(x, η₁, η₂)‖₂ = ∞ for ‖η‖₂ > 1, Y(·, η₁, η₂) is maximal in this case as well, completing the verification that Y, defined as in (5.106) – (5.108), is the general solution of (5.104).

It remains to prove (5.105). From (5.111), we obtain

∀ η ∈ R² \ {0} :  lim_{x→∞} ‖Y(x, η₁, η₂)‖₂ = 1,   (5.115)

which implies

∀ η ∈ R² \ {0} :  ω(η) ⊆ S₁(0).   (5.116)

For the remaining inclusion, fix η ∈ R² \ {0} and let y ∈ S₁(0); we show y ∈ ω(η): Since ‖y‖₂ = 1,

∃ ϕ_y ∈ [0, 2π[ :  y = (sin ϕ_y, cos ϕ_y).   (5.117)

Analogously,

∃ ϕ_η ∈ [0, 2π[ :  η = ‖η‖₂ (sin ϕ_η, cos ϕ_η)   (5.118)

(the reader might note that, in (5.117) and (5.118), we have written y and η using their polar coordinates, cf. [Phi15, Ex. 4.19]). Then, according to (5.106), we obtain, for each x ≥ 0,

Y(x, η₁, η₂) = ‖η‖₂ (sin ϕ_η cos x + cos ϕ_η sin x, cos ϕ_η cos x − sin ϕ_η sin x) / √(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x})
= ‖η‖₂ (sin(x + ϕ_η), cos(x + ϕ_η)) / √(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x}).   (5.119)

Define

∀ k ∈ N :  x_k := ϕ_y − ϕ_η + 2πk ∈ R⁺.   (5.120)


Then lim_{k→∞} x_k = ∞ and, by (5.119),

lim_{k→∞} Y(x_k, η₁, η₂)
= lim_{k→∞} ‖η‖₂ (sin(ϕ_y − ϕ_η + 2πk + ϕ_η), cos(ϕ_y − ϕ_η + 2πk + ϕ_η)) / √(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x_k})
= lim_{k→∞} ‖η‖₂ (sin ϕ_y, cos ϕ_y) / √(‖η‖₂² + (1 − ‖η‖₂²) e^{−2x_k}) = y,   (5.121)

proving y ∈ ω(η) and, thus, (5.105).
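A small numerical sketch (my own illustration, not part of the notes) of the formula (5.106): the initial condition (5.109) holds, and the solution's 2-norm approaches 1, in line with (5.115). The sample initial value η = (0.3, −0.2) is an arbitrary choice.

```python
import math

def Y(x, eta1, eta2):
    """Explicit solution formula (5.106) of the system (5.104)."""
    denom = math.sqrt(eta1**2 + eta2**2 + (1 - eta1**2 - eta2**2) * math.exp(-2 * x))
    return ((eta1 * math.cos(x) + eta2 * math.sin(x)) / denom,
            (eta2 * math.cos(x) - eta1 * math.sin(x)) / denom)

eta = (0.3, -0.2)            # sample initial value with 0 < ||eta||_2 < 1
y0 = Y(0.0, *eta)            # should equal eta, cf. (5.109)
y_large = Y(50.0, *eta)      # far along the orbit
norm_large = math.hypot(*y_large)
print(y0, norm_large)        # y0 equals eta; norm_large is close to 1
```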

Proposition 5.52. In the situation of Def. 5.47, if f is locally Lipschitz, then orbits that intersect an omega or alpha limit set must remain entirely inside that same omega or alpha limit set, i.e., for each η ∈ Ω,

(∀ y ∈ Ω ∩ ω(η) ∀ x ∈ I_{0,y} :  Y(x, y) ∈ ω(η))  ∧  (∀ y ∈ Ω ∩ α(η) ∀ x ∈ I_{0,y} :  Y(x, y) ∈ α(η)).   (5.122)

Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set. Let y ∈ Ω ∩ ω(η) and x ∈ I_{0,y}. Choose a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Y(x_k, η) = y. Then lim_{k→∞}(x_k + x) = ∞ and, by Lem. 5.4(b),

lim_{k→∞} Y(x_k + x, η) = lim_{k→∞} Y(x, Y(x_k, η)) =(∗) Y(x, y),   (5.123)

proving Y(x, y) ∈ ω(η). At “(∗)”, we have used that, due to f being locally Lipschitz by hypothesis, Y is continuous by Th. 3.35.

Proposition 5.53. In the situation of Def. 5.47, let η ∈ Ω be such that there exists a compact set K ⊆ Ω, satisfying

{Y(x, η) : x ≥ 0} ⊆ K   (resp. {Y(x, η) : x ≤ 0} ⊆ K).   (5.124)

Then the following hold:

(a) ω(η) ≠ ∅ (resp. α(η) ≠ ∅).

(b) ω(η) (resp. α(η)) is a compact set.

(c) ω(η) (resp. α(η)) is a connected set, i.e. if O₁, O₂ are disjoint open subsets of Kⁿ such that ω(η) ⊆ O₁ ∪ O₂ (resp. α(η) ⊆ O₁ ∪ O₂), then ω(η) ∩ O₁ = ∅ or ω(η) ∩ O₂ = ∅ (resp. α(η) ∩ O₁ = ∅ or α(η) ∩ O₂ = ∅).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit

sets.

(a): Since, by hypothesis, (Y(k, η))_{k∈N} is a sequence in the compact set K, it must have a subsequence converging to some limit y ∈ K. But then y ∈ ω(η), i.e. ω(η) ≠ ∅.


(b): According to (5.100a) and (5.124), ω(η) is a closed subset of the compact set K,

implying ω(η) to be compact as well.

(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjoint open subsets O₁, O₂ of Kⁿ such that ω(η) ⊆ O₁ ∪ O₂, ω₁ := ω(η) ∩ O₁ ≠ ∅, and ω₂ := ω(η) ∩ O₂ ≠ ∅. Then ω₁ and ω₂ are disjoint, since O₁, O₂ are disjoint. Moreover, ω₁ and ω₂ are both subsets of the compact set ω(η). Due to ω₁ = ω(η) ∩ (Kⁿ \ O₂) and ω₂ = ω(η) ∩ (Kⁿ \ O₁), ω₁ and ω₂ are also closed, hence, compact. Then, according to Prop. C.10, δ := dist(ω₁, ω₂) > 0. Choosing y₁ ∈ ω₁ and y₂ ∈ ω₂, there are numbers 0 < s₁ < t₁ < s₂ < t₂ < … such that lim_{k→∞} s_k = lim_{k→∞} t_k = ∞ and

∀ k ∈ N :  Y(s_k, η) ∈ O₁ ∧ Y(t_k, η) ∈ O₂.   (5.125)

Define

∀ k ∈ N :  σ_k := sup{ x ≥ s_k :  Y(t, η) ∈ O₁ for each t ∈ [s_k, x] }.   (5.126)

Then s_k < σ_k < t_k and the continuity of the (even differentiable) map Y(·, η) yields η_k := Y(σ_k, η) ∈ ∂O₁. Thus, (η_k)_{k∈N} is a sequence in the compact set K ∩ ∂O₁ and, therefore, must have a convergent subsequence, converging to some z ∈ K ∩ ∂O₁. But then z ∈ ω(η), but not in O₁ ∪ O₂, in contradiction to ω(η) ⊆ O₁ ∪ O₂.

Theorem 5.54. Let Ω ⊆ Kⁿ, n ∈ N, and let f : Ω −→ Kⁿ be locally Lipschitz such that y′ = f(y) admits unique maximal solutions. Moreover, let Ω₀ be an open subset of Ω, assume V : Ω₀ −→ R is continuously differentiable, K := {y ∈ Ω₀ : V(y) ≤ r} is compact for some r ∈ R, and V̇(y) ≤ 0 (resp. V̇(y) ≥ 0) for each y ∈ K, where V̇ is defined as in (5.33). If η ∈ Ω₀ is such that V(η) < r, then the following hold:

(a) Y(·, η) is defined on all of R⁺₀ (resp. on all of R⁻₀).

(b) One has ω(η) ⊆ K (resp. α(η) ⊆ K) and V is constant on ω(η) (resp. on α(η)).

(c) Letting

M := { y ∈ K :  V̇(Y(x, y)) = 0 for each x ≥ 0 (resp. for each x ≤ 0) },

one has ω(η) ⊆ M (resp. α(η) ⊆ M). In particular, V̇(y) = 0 for each y ∈ ω(η) (resp. for each y ∈ α(η)).

Proof. As usual, it suffices to prove the assertions for V̇(y) ≤ 0, as the assertions for V̇(y) ≥ 0 then follow via time reversion.

(a): We claim

∀ x ∈ I_{0,η} ∩ R⁺₀ :  V(Y(x, η)) < r.   (5.127)

Indeed, if (5.127) were false, then, by the continuity of x ↦ V(Y(x, η)) and V(η) < r,

0 < s := sup{ x ≥ 0 :  V(Y(t, η)) < r for each t ∈ [0, x] } ∈ I_{0,η},   (5.128)

and

r = V(Y(s, η)) = V(η) + ∫₀ˢ V̇(Y(t, η)) dt ≤ V(η) < r,   (5.129)

a contradiction, proving (5.127). In particular, Y(x, η) ∈ K for each x ∈ I_{0,η} ∩ R⁺₀, implying R⁺₀ ⊆ I_{0,η}, since Y(·, η) is a maximal solution and K = {y ∈ Ω₀ : V(y) ≤ r} is compact.

(b): Let φ := Y(·, η). During the proof of (a) above, we have shown φ(x) ∈ K for each x ≥ 0. Since, then, (V ∘ φ)′(x) = V̇(φ(x)) ≤ 0 for each x ≥ 0, V ∘ φ is nonincreasing for x ≥ 0. Since V is also bounded on the compact set K,

∃ c ∈ R :  c = lim_{x→∞} V(φ(x)).   (5.130)

If y ∈ ω(η), then there exists a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} φ(x_k) = y. Thus, y ∈ K (since K is closed), and, by (5.130),

V(y) = lim_{k→∞} V(φ(x_k)) = c,   (5.131)

proving (b).

(c): Let y ∈ ω(η) and φ := Y(·, y). Since f is assumed to be locally Lipschitz, Prop. 5.52 applies and we obtain φ(x) = Y(x, y) ∈ ω(η) for each x ∈ R⁺₀. Using (b), we know V to be constant on ω(η), i.e. V ∘ φ must be constant on R⁺₀ as well, implying

∀ x ∈ R⁺₀ :  V̇(Y(x, y)) = (V ∘ φ)′(x) = 0,   (5.132)

as claimed.

Example 5.55. Let a < 0 < b and let h : ]a, b[ −→ R be continuously differentiable and such that

h(x) \begin{cases} < 0 & \text{for } x < 0, \\ = 0 & \text{for } x = 0, \\ > 0 & \text{for } x > 0. \end{cases}   (5.133)

Consider the ODE

y₁′ = y₂,   (5.134a)
y₂′ = −y₁² y₂ − h(y₁).   (5.134b)

The right-hand side is defined on Ω := ]a, b[ × R and is clearly C¹, i.e. the ODE admits unique maximal solutions. Due to (5.133), F = {(0, 0)}, i.e. the origin is the only fixed point of (5.134). We will use Th. 5.54(c) to show (0, 0) is positively asymptotically stable: We introduce

H : ]a, b[ −→ R,  H(x) := ∫₀ˣ h(t) dt,   (5.135)

and

V : Ω −→ R,  V(y₁, y₂) := H(y₁) + y₂²/2.   (5.136)

Since, by (5.133), H is strictly decreasing on ]a, 0] and strictly increasing on [0, b[, V is positive definite at (0, 0). We also obtain

V̇(y₁, y₂) = h(y₁) y₂ + y₂ (−y₁² y₂ − h(y₁)) = −y₁² y₂² ≤ 0.

Thus, from the Lyapunov Th. 5.30, we already know (0, 0) to be positively stable. However, V̇ is not negative definite at (0, 0), i.e. we can not immediately conclude that (0, 0) is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c): To this end, using that H is continuous and positive definite at 0, we choose r > 0 and c, d ∈ R, satisfying

a < c < 0 < d < b  ∧  H(y₁) > r for each y₁ ∈ ]a, c[ ∪ ]d, b[,

and define

K := {(y₁, y₂) ∈ Ω : V(y₁, y₂) ≤ r},  O := {(y₁, y₂) ∈ Ω : V(y₁, y₂) < r}.   (5.140)

We will show

∀ (η₁, η₂) ∈ O :  lim_{x→∞} Y(x, η₁, η₂) = (0, 0).   (5.141)

The continuity of V implies K to be closed. Since K ⊆ [c, d] × [−√(2r), √(2r)], it is also bounded, i.e. compact. Thus, Th. 5.54 applies to each η ∈ O. So let η ∈ O. We will show that M = {(0, 0)}, where M is the set of Th. 5.54(c) (then ω(η) = {(0, 0)} by Th. 5.54(c), which implies (5.141) as desired). To verify M = {(0, 0)}, note V̇(y₁, y₂) < 0 for y₁ ≠ 0 and y₂ ≠ 0, showing (y₁, y₂) ∉ M. For y₁ = 0, y₂ ≠ 0, let φ := Y(·, y₁, y₂). Then φ₂(0) = y₂ ≠ 0 and φ₁′(0) = y₂ ≠ 0, i.e. both φ₁ and φ₂ are nonzero on some interval ]0, ǫ[ with ǫ > 0, showing (y₁, y₂) ∉ M. Likewise, if y₁ ≠ 0, y₂ = 0, then let φ be as before. This time φ₁(0) = y₁ ≠ 0 and φ₂′(0) = −h(y₁) ≠ 0, again showing both φ₁ and φ₂ are nonzero on some interval ]0, ǫ[ with ǫ > 0, implying (y₁, y₂) ∉ M.
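As a concrete illustration of Example 5.55 (my own sketch, not part of the notes), take the sample choice h(x) = x, which satisfies (5.133) on any ]a, b[ with a < 0 < b; integrating (5.134) with a classical Runge-Kutta step shows the Lyapunov function (5.136) decreasing along the orbit, which spirals toward the fixed point (0, 0).

```python
import math

def f(y1, y2):
    """Right-hand side of (5.134) with the sample choice h(x) = x."""
    return y2, -y1**2 * y2 - y1

def rk4_step(y1, y2, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(y1, y2)
    k2 = f(y1 + dt/2 * k1[0], y2 + dt/2 * k1[1])
    k3 = f(y1 + dt/2 * k2[0], y2 + dt/2 * k2[1])
    k4 = f(y1 + dt * k3[0], y2 + dt * k3[1])
    return (y1 + dt/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y2 + dt/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def V(y1, y2):
    """Lyapunov function (5.136) with H(x) = x^2 / 2."""
    return y1**2 / 2 + y2**2 / 2

y1, y2 = 0.5, 0.0            # sample initial value
V0 = V(y1, y2)
dt = 0.01
for _ in range(20000):       # integrate up to x = 200
    y1, y2 = rk4_step(y1, y2, dt)

print(V(y1, y2), math.hypot(y1, y2))  # V decreased; orbit approaches (0, 0)
```

Note that the decay is slow, since the damping term −y₁²y₂ vanishes to high order near the origin, which is exactly why asymptotic stability cannot be read off from negative definiteness of V̇ here.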

A Differentiability

We provide a lemma used in the variation of constants Th. 2.3.

Lemma A.1. Let O ⊆ R be open and let a : O −→ K be differentiable. Then

f : O −→ K,  f(x) := e^{a(x)},   (A.1a)

is differentiable with

∀ x ∈ O :  f′(x) = a′(x) e^{a(x)}.   (A.1b)


Proof. For K = R, the lemma is immediate from the chain rule of [Phi16, Th. 9.11]. It remains to consider the case K = C. Note that we can not apply the chain rule for holomorphic (i.e. C-differentiable) functions, since a is only R-differentiable and it does not need to have a holomorphic extension. However, we can argue as follows, merely using the chain rule and the product rule for real-valued functions: Write a = b + ic with differentiable functions b, c : O −→ R. Then

f(x) = e^{a(x)} = e^{b(x)+ic(x)} = e^{b(x)} e^{ic(x)} = e^{b(x)} (cos c(x) + i sin c(x)).   (A.2)

Thus,

f′(x) = b′(x) e^{b(x)} e^{ic(x)} + e^{b(x)} (−c′(x) sin c(x) + i c′(x) cos c(x))
= b′(x) e^{a(x)} + i c′(x) e^{b(x)} (cos c(x) + i sin c(x)) = b′(x) e^{a(x)} + i c′(x) e^{b(x)} e^{ic(x)}
= (b′(x) + i c′(x)) e^{a(x)} = a′(x) e^{a(x)},   (A.3)

proving (A.1b).
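Lemma A.1 lends itself to a quick numerical check (an illustration of mine): compare a central difference quotient of f(x) = e^{a(x)} with a′(x)e^{a(x)} for a sample complex-valued a = b + ic.

```python
import cmath
import math

def a(x):
    """A sample R-differentiable a = b + ic with b(x) = x^2 and c(x) = sin x."""
    return x**2 + 1j * math.sin(x)

def a_prime(x):
    """Derivative a'(x) = 2x + i*cos(x)."""
    return 2 * x + 1j * math.cos(x)

def f(x):
    return cmath.exp(a(x))

x0, h = 0.7, 1e-6
finite_diff = (f(x0 + h) - f(x0 - h)) / (2 * h)   # central difference quotient
exact = a_prime(x0) * cmath.exp(a(x0))            # right-hand side of (A.1b)
print(abs(finite_diff - exact))                   # tiny discretization error
```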

B Kⁿ-Valued Integration

During the course of this class, we frequently need Kⁿ-valued integrals. In particular, for f : I −→ Kⁿ, I an interval in R, we make use of the estimate ‖∫_I f‖ ≤ ∫_I ‖f‖, for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8, the estimate can easily be checked directly for the 1-norm on Kⁿ, but it does hold for every norm on Kⁿ. To verify this result is the main purpose of the present section.

Throughout the class, it suffices to use Riemann integrals. However, some readers might be more familiar with Lebesgue integrals, which are a more general notion (every Riemann integrable function is also Lebesgue integrable). For convenience, the material is presented twice: first using Riemann integrals and arguments that make specific use of techniques available for Riemann integrals; then, second, using Lebesgue integrals and corresponding techniques. For Riemann integrals, the norm estimate is proved in Th. B.4, for Lebesgue integrals in Th. B.9.

Definition B.1. Let a, b ∈ R, I := [a, b]. We call a function f : I −→ Kⁿ, n ∈ N, Riemann integrable if, and only if, each coordinate function f_j = π_j ∘ f : I −→ K, j = 1, …, n, is Riemann integrable. Denote the set of all Riemann integrable functions from I into Kⁿ by R(I, Kⁿ). If f : I −→ Kⁿ is Riemann integrable, then

∫_I f := ( ∫_I f₁, …, ∫_I f_n ) ∈ Kⁿ   (B.1)

is called the (Riemann) integral of f over I.


Remark B.2. The linearity of the K-valued integral implies the linearity of the Kn -

valued integral.

Theorem B.3. Let a, b ∈ R, a ≤ b, I := [a, b]. If f ∈ R(I, Kⁿ), n ∈ N, and φ : f(I) −→ R is Lipschitz continuous, then φ ∘ f ∈ R(I, R).

Proof. We first reduce the case K = R to the case K = C: Consider the inclusion ι : Rⁿ −→ Cⁿ, ι(x) := x, and ψ : Cⁿ −→ R, ψ(z₁, …, z_n) := φ(Re z₁, …, Re z_n), which satisfies ψ ∘ (ι ∘ f) = φ ∘ f. Clearly, ι ∘ f ∈ R(I, Cⁿ), and, if φ is L-Lipschitz, L ≥ 0, then, due to

|ψ(z) − ψ(w)| ≤ L ‖Re z − Re w‖ ≤(∗) CL ‖Re z − Re w‖₁ = CL Σ_{j=1}^n |Re z_j − Re w_j|
≤ CL Σ_{j=1}^n |z_j − w_j| = CL ‖z − w‖₁ ≤(∗∗) C̃CL ‖z − w‖   for each z, w ∈ Cⁿ,   (B.2)

where the estimate at (∗) holds with C ∈ R⁺, due to the equivalence of ‖·‖ and ‖·‖₁ on Rⁿ, the middle inequality holds by [Phi16, Th. 5.9(d)], and the estimate at (∗∗) holds with C̃ ∈ R⁺, due to the equivalence of ‖·‖₁ and ‖·‖ on Cⁿ. Thus, by (B.2), ψ is Lipschitz as well, namely C̃CL-Lipschitz, and it suffices to consider the case K = C, which we proceed to do next. Once again using the equivalence of ‖·‖₁ and ‖·‖ on Cⁿ, there exists c ∈ R⁺ such that ‖z‖ ≤ c‖z‖₁ for each z ∈ Cⁿ.

Assume φ to be L-Lipschitz, L ≥ 0. If f ∈ R(I, Cⁿ), then Re f₁, …, Re f_n ∈ R(I, R) and Im f₁, …, Im f_n ∈ R(I, R), i.e., given ǫ > 0, Riemann's integrability criterion of [Phi16, Th. 10.13] provides partitions ∆₁, …, ∆_n of I and Π₁, …, Π_n of I such that

∀ j = 1, …, n :  R(∆_j, Re f_j) − r(∆_j, Re f_j) < ǫ/(2ncL),  R(Π_j, Im f_j) − r(Π_j, Im f_j) < ǫ/(2ncL),   (B.3)

where R and r denote upper and lower Riemann sums, respectively (cf. [Phi16, (10.7)]). Letting ∆ be a joint refinement of the 2n partitions ∆₁, …, ∆_n, Π₁, …, Π_n, we have (cf. [Phi16, Def. 10.8(a),(b)] and [Phi16, Th. 10.10(a)])

∀ j = 1, …, n :  R(∆, Re f_j) − r(∆, Re f_j) < ǫ/(2ncL),  R(∆, Im f_j) − r(∆, Im f_j) < ǫ/(2ncL).   (B.4)

Recalling that, for each g : I −→ R and each partition ∆ = (x₀, …, x_N) ∈ R^{N+1}, N ∈ N, a = x₀ < x₁ < … < x_N = b, I_k := [x_{k−1}, x_k], one has

r(∆, g) = Σ_{k=1}^N m_k(g) |I_k| = Σ_{k=1}^N m_k(g)(x_k − x_{k−1}),   (B.5a)
R(∆, g) = Σ_{k=1}^N M_k(g) |I_k| = Σ_{k=1}^N M_k(g)(x_k − x_{k−1}),   (B.5b)

where

m_k(g) := inf{g(x) : x ∈ I_k},  M_k(g) := sup{g(x) : x ∈ I_k},   (B.5c)

we obtain, for each ξ_k, η_k ∈ I_k,

|(φ ∘ f)(ξ_k) − (φ ∘ f)(η_k)| ≤ L ‖f(ξ_k) − f(η_k)‖ ≤ cL ‖f(ξ_k) − f(η_k)‖₁
= cL Σ_{j=1}^n |f_j(ξ_k) − f_j(η_k)|
≤ cL Σ_{j=1}^n |Re f_j(ξ_k) − Re f_j(η_k)| + cL Σ_{j=1}^n |Im f_j(ξ_k) − Im f_j(η_k)|   (by [Phi16, Th. 5.9(d)])
≤ cL Σ_{j=1}^n (M_k(Re f_j) − m_k(Re f_j)) + cL Σ_{j=1}^n (M_k(Im f_j) − m_k(Im f_j)).   (B.6)
j=1 j=1

Thus,

R(∆, φ ∘ f) − r(∆, φ ∘ f) = Σ_{k=1}^N (M_k(φ ∘ f) − m_k(φ ∘ f)) |I_k|
≤(B.6) cL Σ_{k=1}^N Σ_{j=1}^n (M_k(Re f_j) − m_k(Re f_j)) |I_k| + cL Σ_{k=1}^N Σ_{j=1}^n (M_k(Im f_j) − m_k(Im f_j)) |I_k|
= cL Σ_{j=1}^n (R(∆, Re f_j) − r(∆, Re f_j)) + cL Σ_{j=1}^n (R(∆, Im f_j) − r(∆, Im f_j))
<(B.4) 2ncL · ǫ/(2ncL) = ǫ.   (B.7)

Thus, φ ∘ f ∈ R(I, R) by [Phi16, Th. 10.13].

Theorem B.4. Let a, b ∈ R, a ≤ b, I := [a, b]. For each norm ‖·‖ on Kⁿ, n ∈ N, and each Riemann integrable f : I −→ Kⁿ, one has ‖f‖ ∈ R(I, R), and the following holds:

‖∫_I f‖ ≤ ∫_I ‖f‖.   (B.8)

Proof. From Th. B.3, we obtain ‖f‖ ∈ R(I, R), as the norm ‖·‖ is 1-Lipschitz by the inverse triangle inequality. Let ∆ be an arbitrary partition of I. Recall that, for each g : I −→ R and ∆ = (x₀, …, x_N) ∈ R^{N+1}, N ∈ N, a = x₀ < x₁ < … < x_N = b, I_k := [x_{k−1}, x_k], ξ_k ∈ I_k, the intermediate Riemann sums are

ρ(∆, g) = Σ_{k=1}^N g(ξ_k) |I_k| = Σ_{k=1}^N g(ξ_k)(x_k − x_{k−1}).   (B.9)
B KN -VALUED INTEGRATION 122

We obtain, for ξ_k ∈ I_k (identifying the complex vector f(ξ_k) with the corresponding tuple of real and imaginary parts),

‖( ρ(∆, Re f₁), ρ(∆, Im f₁), …, ρ(∆, Re f_n), ρ(∆, Im f_n) )‖
= ‖( Σ_{k=1}^N Re f₁(ξ_k) |I_k|, Σ_{k=1}^N Im f₁(ξ_k) |I_k|, …, Σ_{k=1}^N Re f_n(ξ_k) |I_k|, Σ_{k=1}^N Im f_n(ξ_k) |I_k| )‖
= ‖ Σ_{k=1}^N ( Re f₁(ξ_k) |I_k|, Im f₁(ξ_k) |I_k|, …, Re f_n(ξ_k) |I_k|, Im f_n(ξ_k) |I_k| ) ‖
≤ Σ_{k=1}^N ‖( Re f₁(ξ_k), Im f₁(ξ_k), …, Re f_n(ξ_k), Im f_n(ξ_k) )‖ |I_k|
= Σ_{k=1}^N ‖f(ξ_k)‖ |I_k| = ρ(∆, ‖f‖).   (B.10)

Since the intermediate Riemann sums in (B.10) converge to the respective integrals by [Phi16, (10.25b)], one obtains

‖∫_I f‖ = lim_{|∆|→0} ‖( ρ(∆, Re f₁), ρ(∆, Im f₁), …, ρ(∆, Re f_n), ρ(∆, Im f_n) )‖
≤(B.10) lim_{|∆|→0} ρ(∆, ‖f‖) = ∫_I ‖f‖,   (B.11)

proving (B.8).
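A numerical illustration of (B.8) (mine, not from the notes): for the sample function f(t) = (cos t, sin t) on I = [0, π] and the Euclidean norm, ‖∫_I f‖₂ = ‖(0, 2)‖₂ = 2, while ∫_I ‖f‖₂ = π, so the estimate holds with strict inequality. Midpoint Riemann sums reproduce this:

```python
import math

def f(t):
    """Sample f : [0, pi] -> R^2, f(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

a, b, N = 0.0, math.pi, 100000
dt = (b - a) / N
ts = [a + (k + 0.5) * dt for k in range(N)]        # midpoint Riemann sums

integral = [sum(f(t)[j] for t in ts) * dt for j in range(2)]  # componentwise, cf. (B.1)
norm_of_integral = math.hypot(*integral)
integral_of_norm = sum(math.hypot(*f(t)) for t in ts) * dt

print(norm_of_integral, integral_of_norm)  # about 2.0 and about pi
```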

Definition B.5. Let I ⊆ R be (Lebesgue) measurable, n ∈ N.

(a) We call f : I −→ Kⁿ (Lebesgue) measurable (respectively, (Lebesgue) integrable) if, and only if, each coordinate function f_j = π_j ∘ f : I −→ K, j = 1, …, n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable), which, for K = C, means if, and only if, each Re f_j and each Im f_j, j = 1, …, n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable).

(b) If f : I −→ Kⁿ is integrable, then

∫_I f := ( ∫_I f₁, …, ∫_I f_n ) ∈ Kⁿ   (B.12)

is the (Lebesgue) integral of f over I.

Remark B.6. The linearity of the K-valued integral implies the linearity of the Kn -

valued integral.

Theorem B.7. Let I ⊆ R be measurable, n ∈ N. Then f : I −→ Kn is measurable in

the sense of Def. B.5(a) if, and only if, f −1 (O) is measurable for each open subset O of

Kn .


Proof. Assume f −1 (O) is measurable for each open subset O of Kn . Let j ∈ {1, . . . , n}.

If Oj ⊆ K is open in K, then O := πj−1 (Oj ) = {z ∈ Kn : zj ∈ Oj } is open in Kn .

Thus, fj−1 (Oj ) = f −1 (O) is measurable, showing that each fj is measurable, i.e. f is

measurable. Now assume f is measurable, i.e. each f_j is measurable. Since every open O ⊆ Kⁿ is a countable union of open sets of the form O = O₁ × … × O_n with each O_j being an open subset of K, it suffices to show that the preimages of such product sets are measurable. So let O be as above. Then f⁻¹(O) = ⋂_{j=1}^n f_j⁻¹(O_j), showing that f⁻¹(O) is measurable.

Corollary B.8. Let I ⊆ R be measurable, n ∈ N. If f : I −→ Kⁿ is measurable, then ‖f‖ : I −→ R is measurable for every norm ‖·‖ on Kⁿ.

Proof. If O ⊆ R is open, then ‖·‖⁻¹(O) ⊆ Kⁿ is open by the continuity of the norm. In consequence, ‖f‖⁻¹(O) = f⁻¹(‖·‖⁻¹(O)) is measurable by Th. B.7.

Theorem B.9. Let I ⊆ R be measurable, n ∈ N. For each norm ‖·‖ on Kⁿ and each integrable f : I −→ Kⁿ, the following holds:

‖∫_I f‖ ≤ ∫_I ‖f‖.   (B.13)

Proof. First, consider f = y χ_B, where y = (y₁, …, y_n) ∈ Kⁿ, B ⊆ I is measurable with λ(B) < ∞, and χ_B denotes the characteristic function of B (i.e. the f_j are y_j on B and 0 on I \ B). Then

‖∫_I f‖ = ‖( y₁ λ(B), …, y_n λ(B) )‖ = λ(B) ‖y‖ = ∫_I ‖f‖,   (B.14)

where λ denotes Lebesgue measure on R. Next, consider the case that f is a so-called simple function, that means f takes only finitely many values y₁, …, y_N ∈ Kⁿ, N ∈ N, and each preimage B_j := f⁻¹({y_j}) ⊆ I is measurable. Then

f = Σ_{j=1}^N y_j χ_{B_j},   (B.15)

where, without loss of generality, we may assume that the B_j are pairwise disjoint. We obtain

‖∫_I f‖ = ‖ Σ_{j=1}^N y_j ∫_I χ_{B_j} ‖ ≤ Σ_{j=1}^N ‖y_j‖ ∫_I χ_{B_j} = ∫_I Σ_{j=1}^N ‖y_j‖ χ_{B_j} =(∗) ∫_I ‖f‖,   (B.16)

where, at (∗), it was used that, as the B_j are disjoint, the integrands of the two integrals are equal at each x ∈ I.

Finally, let f be a general integrable function. Then each Re f_j and each Im f_j is in L¹(I) and there exist sequences of simple functions φ_{j,k} : I −→ R and ψ_{j,k} : I −→ R such that lim_{k→∞} ‖φ_{j,k} − Re f_j‖_{L¹(I)} = lim_{k→∞} ‖ψ_{j,k} − Im f_j‖_{L¹(I)} = 0. In particular,

0 ≤ lim_{k→∞} | ∫_I φ_{j,k} − ∫_I Re f_j | ≤ lim_{k→∞} ‖φ_{j,k} − Re f_j‖_{L¹(I)} = 0,   (B.17a)
0 ≤ lim_{k→∞} | ∫_I ψ_{j,k} − ∫_I Im f_j | ≤ lim_{k→∞} ‖ψ_{j,k} − Im f_j‖_{L¹(I)} = 0,   (B.17b)

and also

0 ≤ lim_{k→∞} ‖φ_{j,k} + iψ_{j,k} − f_j‖_{L¹(I)} ≤ lim_{k→∞} ‖φ_{j,k} − Re f_j‖_{L¹(I)} + lim_{k→∞} ‖ψ_{j,k} − Im f_j‖_{L¹(I)} = 0.   (B.18)

Thus, writing φ_k := (φ_{1,k}, …, φ_{n,k}) and ψ_k := (ψ_{1,k}, …, ψ_{n,k}), we obtain

‖∫_I f‖ = ‖( ∫_I f₁, …, ∫_I f_n )‖
= ‖( lim_{k→∞} (∫_I φ_{1,k} + i ∫_I ψ_{1,k}), …, lim_{k→∞} (∫_I φ_{n,k} + i ∫_I ψ_{n,k}) )‖
= lim_{k→∞} ‖∫_I (φ_k + iψ_k)‖ ≤ lim_{k→∞} ∫_I ‖φ_k + iψ_k‖ =(∗) ∫_I ‖f‖,   (B.19)

where the inequality holds by (B.16), since each φ_k + iψ_k is simple, and where the equality at (∗) holds due to lim_{k→∞} ‖ ‖φ_k + iψ_k‖ − ‖f‖ ‖_{L¹(I)} = 0, which, in turn, is verified by

0 ≤ ∫_I | ‖φ_k + iψ_k‖ − ‖f‖ | ≤ ∫_I ‖φ_k + iψ_k − f‖ ≤ C ∫_I ‖φ_k + iψ_k − f‖₁
= C Σ_{j=1}^n ∫_I |φ_{j,k} + iψ_{j,k} − f_j| → 0 for k → ∞,   (B.20)

where C ∈ R⁺ is due to the equivalence of ‖·‖ and ‖·‖₁ on Kⁿ.

C Metric Spaces

Lemma C.1. The following law holds in every metric space (X, d):

|d(x, y) − d(x′, y′)| ≤ d(x, x′) + d(y, y′) for each x, x′, y, y′ ∈ X.   (C.1)

In particular, (C.1) states the Lipschitz continuity of d : X² −→ R⁺₀ (with Lipschitz constant 1) with respect to the metric d₁ on X² defined by

d₁ : X² × X² −→ R⁺₀,  d₁((x, y), (x′, y′)) := d(x, x′) + d(y, y′).   (C.2)

Further consequences are the continuity and even uniform continuity of d, and also the continuity of d in both components.


Definition C.2. Let (X, d) be a nonempty metric space. For each A, B ⊆ X, define the distance between A and B by

dist(A, B) := inf{ d(x, y) : x ∈ A ∧ y ∈ B }   (C.4)

and

∀ x ∈ X :  dist(x, B) := dist({x}, B) and dist(A, x) := dist(A, {x}).   (C.5)

Theorem C.4. Let (X, d) be a nonempty metric space. If A ⊆ X and A ≠ ∅, then the functions

δ, δ̃ : X −→ R⁺₀,  δ(x) := dist(x, A),  δ̃(x) := dist(A, x),   (C.7)

are both Lipschitz continuous with Lipschitz constant 1 (in particular, they are both continuous and even uniformly continuous).

Proof. Since dist(x, A) = dist(A, x), it suffices to verify the Lipschitz continuity of δ. We need to show

∀ x, y ∈ X :  |dist(x, A) − dist(y, A)| ≤ d(x, y).   (C.8)

If x, y ∈ X and a ∈ A, then

dist(x, A) ≤ d(x, a) ≤ d(x, y) + d(y, a)   (C.9)

and

dist(x, A) − d(x, y) ≤ d(y, a),   (C.10)

implying (by taking the infimum over a ∈ A)

dist(x, A) − d(x, y) ≤ dist(y, A)   (C.11)

and

dist(x, A) − dist(y, A) ≤ d(x, y).   (C.12)

Since x, y ∈ X were arbitrary, (C.12) also yields

dist(y, A) − dist(x, A) ≤ d(x, y),   (C.13)

proving (C.8).

Definition C.5. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R⁺. Define the open ǫ-fattening of A by

A_ǫ := {x ∈ X : dist(x, A) < ǫ}   (C.14a)

and the closed ǫ-fattening of A by

Ā_ǫ := {x ∈ X : dist(x, A) ≤ ǫ}.   (C.14b)

Lemma C.6. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R⁺. Then A_ǫ, the open ǫ-fattening of A, is, indeed, open, and Ā_ǫ, the closed ǫ-fattening of A, is, indeed, closed.

Proof. Let δ : X −→ R⁺₀, δ(x) := dist(x, A), which is continuous by Th. C.4. Then A_ǫ = δ⁻¹([0, ǫ[) is open as the continuous preimage of an open set (note that [0, ǫ[ is, indeed, (relatively) open in R⁺₀); Ā_ǫ = δ⁻¹([0, ǫ]) is closed as the continuous preimage of a closed set.

Lemma C.7. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R⁺. If A is bounded, then so are the fattenings A_ǫ and Ā_ǫ.

Proof. If A is bounded, then there exist x ∈ X and r > 0 such that A ⊆ B_r(x). Let s := r + ǫ + 1. If y ∈ Ā_ǫ, then there exists a ∈ A such that d(a, y) < ǫ + 1. Thus, d(x, y) ≤ d(x, a) + d(a, y) < r + ǫ + 1 = s, showing Ā_ǫ ⊆ B_s(x). Since A_ǫ ⊆ Ā_ǫ, both fattenings are bounded.

Proposition C.8. Let (X, d) be a metric space, A ⊆ X, and 0 < ǫ₁ < ǫ₂.

(a) A ⊆ A_{ǫ₁} ⊆ Ā_{ǫ₁} ⊆ A_{ǫ₂} ⊆ Ā_{ǫ₂}.

(b) If (X, ‖·‖) is a normed space with d being the induced metric, ∅ ≠ A ⊆ X, and there exists x ∉ A, satisfying δ := dist(x, A) ≥ ǫ₂, then all the inclusions in (a) are strict: A ⊊ A_{ǫ₁} ⊊ Ā_{ǫ₁} ⊊ A_{ǫ₂} ⊊ Ā_{ǫ₂}. Caveat: For general metric spaces X and A satisfying all the hypotheses, the inclusions do not need to be strict (consider discrete metric spaces for simple examples).

Proof. (a) is immediate from the definitions in (C.14). To prove (b), let a ∈ A and consider the maps

φ : [0, 1] −→ X,  φ(t) := t x + (1 − t) a,   (C.16a)
f : [0, 1] −→ R,  f(t) := dist(φ(t), A).   (C.16b)

If (s_n)_{n∈N} is a sequence in [0, 1] such that lim_{n→∞} s_n = s ∈ [0, 1], then lim_{n→∞} φ(s_n) = s x + (1 − s) a = φ(s), i.e. φ is continuous. Then, using Th. C.4, f is also continuous. Thus, since f(0) = dist(a, A) = 0 and f(1) = dist(x, A) = δ ≥ ǫ₂, one can use the intermediate value theorem [Phi16, Th. 7.57] to obtain, for each ǫ ∈ [0, ǫ₂], some τ ∈ [0, 1], satisfying f(τ) = ǫ. For ǫ ∈ ]0, ǫ₂], one then has dist(φ(τ), A) = f(τ) = ǫ > 0, i.e. φ(τ) ∈ Ā_ǫ \ A_ǫ and φ(τ) ∉ A; applying this with ǫ := ǫ₁ and ǫ := ǫ₂ shows A_{ǫ₁} ⊊ Ā_{ǫ₁} and A_{ǫ₂} ⊊ Ā_{ǫ₂}, and applying it with ǫ := ǫ₁/2 yields a point of A_{ǫ₁} \ A, showing A ⊊ A_{ǫ₁}. If ǫ := (ǫ₁ + ǫ₂)/2, then ǫ₁ < f(τ) = dist(φ(τ), A) < ǫ₂, i.e. φ(τ) ∈ A_{ǫ₂} \ Ā_{ǫ₁}, showing Ā_{ǫ₁} ⊊ A_{ǫ₂}.


Definition C.9. A subset C of a metric space X is called compact if, and only if, every

sequence in C has a subsequence that converges to some limit c ∈ C.

Proposition C.10. Let (X, d) be a metric space, C, A ⊆ X. If C is compact, A is closed, and A ∩ C = ∅, then dist(C, A) > 0.

Proof. Seeking a contradiction, assume dist(C, A) = 0. Then there exists a sequence ((c_k, a_k))_{k∈N} in C × A such that

lim_{k→∞} d(c_k, a_k) = 0.   (C.17)

Since C is compact, after passing to a subsequence, we may assume

lim_{k→∞} c_k = c ∈ C,   (C.18)

also implying

lim_{k→∞} a_k = c,   (C.19)

i.e. c ∈ A, since A is closed, in contradiction to A ∩ C = ∅.

Proposition C.11. Let (X, d) be a metric space and C ⊆ X.

(a) If C is compact, then C is closed and bounded.

(b) If C is compact and A ⊆ C is closed, then A is compact.

Proof. (a): Suppose C is compact. Let (xk )k∈N be a sequence in C that converges in

X, i.e. limk→∞ xk = x ∈ X. Since C is compact, (xk )k∈N must have a subsequence

that converges to some c ∈ C, implying x = c ∈ C and showing C is closed. If C

is not bounded, then, for each x ∈ X, there is a sequence (xk )k∈N in C such that

limk→∞ d(x, xk ) = ∞. If y ∈ X, then d(x, xk ) ≤ d(x, y) + d(y, xk ), i.e. d(y, xk ) ≥

d(x, xk ) − d(x, y), showing that limk→∞ d(y, xk ) = ∞ as well. Thus, y can not be a limit

of any subsequence of (xk )k∈N . As y was arbitrary, C can not be compact.

(b): If (xk )k∈N is a sequence in A, then (xk )k∈N is a sequence in C. Since C is compact,

it must have a subsequence that converges to some c ∈ C. However, as A is closed, c

must be in A, showing that (xk )k∈N has a subsequence that converges to some c ∈ A,

i.e. A is compact.

Corollary C.12. A subset C of Kn , n ∈ N, is compact if, and only if, C is closed and

bounded.

Proof. Every compact set is closed and bounded by Prop. C.11(a). If C is closed and

bounded, and (xk )k∈N is a sequence in C, then the boundedness and the Bolzano-

Weierstrass theorem yield a subsequence that converges to some x ∈ Kn . However,

since C is closed, x ∈ C, showing that C is compact.


The following examples show that, in general, sets can be closed and bounded without

being compact.

Example C.13. (a) If (X, d) is a noncomplete metric space, then it contains a Cauchy sequence that does not converge. It is not hard to see that such a sequence can not have a convergent subsequence, either. This shows that no noncomplete metric space can be compact. Moreover, the closure of every bounded subset of X that contains such a nonconvergent Cauchy sequence is an example of a closed and bounded set that is noncompact. Concrete examples are given by Q ∩ [a, b] for each a, b ∈ R with a < b (these sets are Q-closed, but not R-closed!) and ]a, b[ for each a, b ∈ R with a < b, in each case endowed with the usual metric d(x, y) := |x − y|.

(b) There can also be closed and bounded sets in complete spaces that are not compact.

Consider the space X of all bounded sequences (x_n)_{n∈N} in K, endowed with the sup-norm ‖(x_n)_{n∈N}‖_sup := sup{|x_n| : n ∈ N}. It is not too difficult to see that X with the sup-norm is a Banach space: Let (x^k)_{k∈N} with x^k = (x^k_n)_{n∈N} be a Cauchy sequence in X. Then, for each n ∈ N, (x^k_n)_{k∈N} is a Cauchy sequence in K and, thus, it has a limit y_n ∈ K. Let y := (y_n)_{n∈N}. We show y ∈ X and lim_{l→∞} x^l = y: Let ǫ > 0. As (x^k)_{k∈N} is a Cauchy sequence with respect to the sup-norm, there is N ∈ N such that ‖x^k − x^l‖_sup < ǫ for all k, l > N. Fix some l > N and some n ∈ N. Then ǫ ≥ lim_{k→∞} |x^k_n − x^l_n| = |y_n − x^l_n|; in particular, |y_n| ≤ ‖x^l‖_sup + ǫ, showing y ∈ X. Since the estimate is valid for each n ∈ N, we get ‖x^l − y‖_sup ≤ ǫ for each l > N, showing lim_{l→∞} x^l = y, i.e. X is complete and a Banach space.

Now consider the sequence (e^k)_{k∈N} with

e^k_n := \begin{cases} 1 & \text{for } k = n, \\ 0 & \text{otherwise.} \end{cases}

Then (e^k)_{k∈N} constitutes a sequence in X with ‖e^k‖_sup = 1 for each k ∈ N. In particular, (e^k)_{k∈N} is a sequence inside the closed unit ball B̄₁(0) and, hence, bounded. However, if k, l ∈ N with k ≠ l, then ‖e^k − e^l‖_sup = 1. Thus, neither (e^k)_{k∈N} nor any subsequence can be a Cauchy sequence. In particular, no subsequence can converge, showing that the closed and bounded unit ball B̄₁(0) is not compact.
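The pairwise distances in this example are easy to confirm numerically (a small illustration of mine); since each eᵏ has only one nonzero entry, a finite prefix suffices to represent it:

```python
def e(k, length):
    """Finite prefix of the sequence e^k: 1 at position k (1-based), 0 elsewhere."""
    return [1.0 if n == k else 0.0 for n in range(1, length + 1)]

def sup_norm(x):
    return max(abs(t) for t in x)

vectors = [e(k, 10) for k in range(1, 11)]
norms = {sup_norm(v) for v in vectors}
distances = {sup_norm([a - b for a, b in zip(u, v)])
             for i, u in enumerate(vectors) for v in vectors[i + 1:]}
print(norms, distances)  # {1.0} and {1.0}: no subsequence can be Cauchy
```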

Note: There is an important result that shows that a normed vector space is finite-dimensional if, and only if, the closed unit ball B̄₁(0) is compact (see, e.g., [Str08, Th. 28.14]).

Theorem C.14. If (X, d_X) and (Y, d_Y) are metric spaces, C ⊆ X is compact, and f : C −→ Y is continuous, then f(C) is compact.

Proof. Let (y^k)_{k∈N} be a sequence in f(C). Then there is a sequence (x^k)_{k∈N} in C such that f(x^k) = y^k for each k ∈ N. As C is compact, there is a subsequence (a^k)_{k∈N} of (x^k)_{k∈N} with lim_{k→∞} a^k = a for some a ∈ C. Then (f(a^k))_{k∈N} is a subsequence of (y^k)_{k∈N} and the continuity of f yields lim_{k→∞} f(a^k) = f(a) ∈ f(C), showing that (y^k)_{k∈N} has a convergent subsequence with limit in f(C). We have therefore established that f(C) is compact.

Corollary C.15. If (X, d) is a metric space, ∅ ≠ C ⊆ X is compact, and f : C −→ R is continuous, then f assumes its max and its min, i.e. there are x_m ∈ C and x_M ∈ C such that f has a global min at x_m and a global max at x_M.

Proof. The set f(C) ⊆ R is compact by Th. C.14. Then, by [Phi16, Lem. 7.53], f(C) contains a smallest element m and a largest element M. This, in turn, implies that there are x_m, x_M ∈ C such that f(x_m) = m and f(x_M) = M.

Theorem C.16. If (X, dX ) and (Y, dY ) are metric spaces, C ⊆ X is compact, and

f : C −→ Y is continuous, then f is uniformly continuous.

Proof. If f is not uniformly continuous, then there must be some ǫ > 0 such that, for

each k ∈ N, there exist xk , y k ∈ C satisfying dX (xk , y k ) < 1/k and dY (f (xk ), f (y k )) ≥ ǫ.

Since C is compact, there is a ∈ C and a subsequence (ak )k∈N of (xk )k∈N such that

a = limk→∞ ak . Then there is a corresponding subsequence (bk )k∈N of (y k )k∈N such that

dX (ak , bk ) < 1/k and dY (f (ak ), f (bk )) ≥ ǫ for all k ∈ N. Using the compactness of C

again, there is b ∈ C and a subsequence (v k )k∈N of (bk )k∈N such that b = limk→∞ v k .

Now there is a corresponding subsequence (uk )k∈N of (ak )k∈N such that dX (uk , v k ) <

1/k and d_Y(f(u^k), f(v^k)) ≥ ǫ for all k ∈ N. Note that we still have a = lim_{k→∞} u^k.

Given α > 0, there is N ∈ N such that, for each k > N , one has dX (a, uk ) < α/3,

dX (b, v k ) < α/3, and dX (uk , v k ) < 1/k < α/3. Thus, dX (a, b) < dX (a, uk ) + dX (uk , v k ) +

dX (b, v k ) < α, implying d(a, b) = 0 and a = b. Finally, the continuity of f implies

f (a) = limk→∞ f (uk ) = limk→∞ f (v k ) in contradiction to dY (f (uk ), f (v k )) ≥ ǫ.

Theorem C.17. If (X, dX ) and (Y, dY ) are metric spaces, C ⊆ X is compact, and

f : C −→ Y is continuous and one-to-one, then f −1 : f (C) −→ C is continuous.

Proof. Let (y^k)_{k∈N} be a sequence in f(C) such that lim_{k→∞} y^k = y ∈ f(C). Then there

is a sequence (xk )k∈N in C such that f (xk ) = y k for each k ∈ N. Let x := f −1 (y).

It remains to prove that limk→∞ xk = x. As C is compact, there is a ∈ C and a

subsequence (ak )k∈N of (xk )k∈N such that a = limk→∞ ak . The continuity of f yields

f (a) = limk→∞ f (ak ) = limk→∞ y k = y = f (x) since (f (ak ))k∈N is a subsequence of

(y k )k∈N . It now follows that a = x since f is one-to-one. The same argument shows

that every convergent subsequence of (xk )k∈N has to converge to x. If (xk )k∈N did not

converge to x, then there would have to be some ǫ > 0 such that infinitely many x^k are not in

Bǫ (x). However, the compactness of C would provide a convergent subsequence whose

limit could not be x, in contradiction to x having to be the limit of all convergent

subsequences of (xk )k∈N .

Definition C.18. A subset A of a metric space (X, d) is called precompact (or totally bounded) if, and only if, for each ǫ > 0, A can be covered by finitely many ǫ-balls, i.e. if, and only if, there exist finitely many points a₁, …, a_N ∈ A, N ∈ N, such that

A ⊆ ⋃_{j=1}^N B_ǫ(a_j).   (C.20)

Theorem C.19. For a subset C of a metric space (X, d), the following statements are equivalent:

(i) C is compact.

(ii) C has the Heine-Borel property, i.e. every open cover of C has a finite subcover, i.e. if (O_j)_{j∈I} is a family of open sets O_j ⊆ X, satisfying

C ⊆ ⋃_{j∈I} O_j,   (C.21)

then there exist j₁, …, j_N ∈ I, N ∈ N, such that C ⊆ ⋃_{k=1}^N O_{j_k}.

(iii) C is precompact (i.e. totally bounded) as defined in Def. C.18 and complete, i.e. every Cauchy sequence in C converges to a limit in C.

Proof. “(i) ⇒ (iii)”: Let (cn )n∈N be a Cauchy sequence in C. As C is compact, (cn )n∈N has a

subsequence (cnj )j∈N such that limj→∞ cnj = c ∈ C. Given ǫ > 0, choose K ∈ N such that, for each m, n ≥ K, d(cm , cn ) < ǫ/2, and such that, for each nj ≥ K, d(cnj , c) < ǫ/2. Then, fixing some nj ≥ K,
$$\forall_{n \ge K}\quad d(c_n, c) \le d(c_n, c_{n_j}) + d(c_{n_j}, c) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \qquad \text{(C.22)}$$

showing limn→∞ cn = c and the completeness of C. We now show C to be also totally

bounded. We proceed by contraposition and assume C not to be totally bounded, i.e.

there exists ǫ > 0 such that C is not contained in any finite union of ǫ-balls. Inductively,

we construct a sequence (cn )n∈N in C such that

$$\forall_{m,n\in\mathbb{N},\ m\neq n}\quad d(c_m, c_n) \ge \epsilon: \qquad \text{(C.23)}$$
Choose c1 ∈ C arbitrarily. Then, given that c1 , . . . , ck ∈ C, k ∈ N, have already been constructed such that d(cm , cn ) ≥ ǫ holds for each m, n ∈ {1, . . . , k} with m ≠ n, there must be
$$c \in C \setminus \bigcup_{j=1}^{k} B_\epsilon(c_j). \qquad \text{(C.24)}$$
Choosing ck+1 := c, (C.24) guarantees (C.23) now holds for each m, n ∈ {1, . . . , k + 1}.

Due to (C.23), no subsequence of (cn )n∈N can be a Cauchy sequence, i.e. (cn )n∈N does

not have a convergent subsequence, proving C is not compact.


“(iii) ⇒ (ii)”: Assume C to be precompact and complete. For each k ∈ N, the precompactness yields points $c^k_1, \dots, c^k_{N_k} \in C$, Nk ∈ N, such that
$$C \subseteq \bigcup_{j=1}^{N_k} B_{1/k}(c^k_j). \qquad \text{(C.25)}$$

Seeking a contradiction, assume C does not have the Heine-Borel property, i.e. there

exists an open cover (Oj )j∈I of C which does not have a finite subcover. Inductively, we

construct a decreasing sequence of subsets Ck of C, C ⊇ C1 ⊇ C2 ⊇ . . . , such that no

Ck can be covered by a finite subcover of (Oj )j∈I and such that
$$\forall_{k\in\mathbb{N}}\ \exists_{j\in\{1,\dots,N_k\}}\quad C_k \subseteq B_{1/k}(c^k_j): \qquad \text{(C.26)}$$
To start out, we note that (C.25) implies at least one of the finitely many sets $C \cap B_1(c^1_1), \dots, C \cap B_1(c^1_{N_1})$ can not be covered by a finite subcover of (Oj )j∈I , say, $C \cap B_1(c^1_{j_1})$. Define $C_1 := C \cap B_1(c^1_{j_1})$. Then, given C1 , . . . , Ck have already been constructed for some k ∈ N, since Ck can not be covered by a finite subcover of (Oj )j∈I and
$$C_k \subseteq C \subseteq \bigcup_{j=1}^{N_{k+1}} B_{1/(k+1)}(c^{k+1}_j), \qquad \text{(C.27)}$$
at least one of the sets $C_k \cap B_{1/(k+1)}(c^{k+1}_j)$ can not be covered by a finite subcover of (Oj )j∈I , either, say the one with j =: jk+1 . Define $C_{k+1} := C_k \cap B_{1/(k+1)}(c^{k+1}_{j_{k+1}})$. For each k ∈ N, choose some sk ∈ Ck (note Ck ≠ ∅, as it can not be covered by finitely many Oj ). Given ǫ > 0, there is K ∈ N such that 2/K < ǫ. If k, l ≥ K, then $s_k, s_l \in C_K \subseteq B_{1/K}(c^K_j)$ for some suitable j ∈ {1, . . . , NK }. In particular, d(sk , sl ) < 2/K < ǫ, showing (sk )k∈N is a Cauchy

sequence. As (sk )k∈N is a Cauchy sequence in C and C is complete, there exists c ∈ C

such that limk→∞ sk = c. However, then there must exist some j ∈ I such that c ∈ Oj

and, since Oj is open, there is ǫ > 0 with Bǫ (c) ⊆ Oj , and Bǫ (c) must contain almost

all of the sk . Choose k sufficiently large such that 1/k < ǫ/4 and d(sk , c) < ǫ/2. Then, since
$$s_k \in C_k \subseteq B_{1/k}(c^k_j), \qquad \text{(C.28)}$$
one has
$$\forall_{x \in B_{1/k}(c^k_j)}\quad d(x, c) \le d(x, s_k) + d(s_k, c) < \frac{2}{k} + \frac{\epsilon}{2} < \frac{2\epsilon}{4} + \frac{\epsilon}{2} = \epsilon, \qquad \text{(C.29)}$$
i.e. $C_k \subseteq B_{1/k}(c^k_j) \subseteq B_\epsilon(c) \subseteq O_j$, in contradiction to Ck not being coverable by finitely many Oj .

“(ii) ⇒ (i)”: Assume C has the Heine-Borel property. Seeking a contradiction, assume C

is not compact, that means there exists a sequence (cn )n∈N in C such that no subsequence

of (cn )n∈N converges to a limit in C. According to [Phi15, Prop. 1.38(d)], no c ∈ C can

be a cluster point of (cn )n∈N , i.e., for each c ∈ C, there exists ǫc > 0 such that Bǫc (c) contains only finitely many of the cn . Since $C \subseteq \bigcup_{c\in C} B_{\epsilon_c}(c)$, the family $(B_{\epsilon_c}(c))_{c\in C}$ constitutes an open cover of C. As C has the Heine-Borel property, there exist finitely many points a1 , . . . , aN ∈ C, N ∈ N, such that $C \subseteq \bigcup_{j=1}^{N} B_{\epsilon_{a_j}}(a_j)$, i.e. C contains only

finitely many of the cn , in contradiction to (cn )n∈N being a sequence in C.

D LOCAL LIPSCHITZ CONTINUITY 132

Caveat C.20. In general topological spaces, one defines compactness via the Heine-

Borel property (a topological space C is defined to be compact if, and only if, C has

the Heine-Borel property). Moreover, a topological space C is defined to be sequentially

compact if, and only if, every sequence in C has a convergent subsequence. Using this

terminology, one can rephrase the equivalence between (i) and (ii) in Th. C.19 by stating

that a metric space is sequentially compact if, and only if, it is compact. However, in

general topological spaces, neither implication remains true ((iii) of Th. C.19 does not

even make sense in general topological spaces, as the concepts of boundedness, total

boundedness, and Cauchy sequences are, in general, not available): For an example

of a topological space that is compact, but not sequentially compact, see, e.g. [Pre75,

7.2.10(a)]; for an example of a topological space that is sequentially compact, but not

compact, see, e.g. [Pre75, 7.2.10(c)].

Theorem C.21 (Lebesgue Number). Let (X, d) be a metric space and C ⊆ X. If C is

compact and (Oj )j∈I is an open cover of C, then there exists a Lebesgue number δ for

the open cover, i.e. some δ > 0 such that, for each A ⊆ C with diam A < δ, there exists

j0 ∈ I, where A ⊆ Oj0 . Recall that

$$\operatorname{diam} A = \begin{cases} 0 & \text{for } A = \emptyset,\\ \sup\{d(x,y) : x, y \in A\} & \text{for } A \neq \emptyset. \end{cases} \qquad \text{(C.30)}$$

Proof. Seeking a contradiction, assume there is no Lebesgue number for the open cover

(Oj )j∈I . Then there are sequences (xk )k∈N in C and (Ak )k∈N in P(C) such that
$$\forall_{k\in\mathbb{N}}\quad x_k \in A_k, \quad \operatorname{diam} A_k < \frac{1}{k}, \quad\text{and}\quad \forall_{j\in I}\ A_k \not\subseteq O_j. \qquad \text{(C.31)}$$
Since C is compact, (xk )k∈N has a subsequence converging to some c ∈ C. As (Oj )j∈I is an open cover of C, there is j ∈ I such that c ∈ Oj and ǫ > 0 such that Bǫ (c) ⊆ Oj . If k ∈ N is such that 1/k < ǫ/2 and d(xk , c) < ǫ/2, then, for each a ∈ Ak , we have d(a, c) ≤ d(a, xk ) + d(xk , c) < ǫ/2 + ǫ/2 = ǫ, implying the contradiction Ak ⊆ Bǫ (c) ⊆ Oj .

In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respect

to y if, and only if, it is globally Lipschitz with respect to y on every compact set.

The following Prop. D.1 shows that this equivalence holds even if f is not continuous,

provided that each projection Gx as in (D.1) below is convex. On the other hand, Ex.

D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitz

with respect to y without being globally Lipschitz with respect to y on every compact

set.

Proposition D.1. Let m, n ∈ N, G ⊆ R × Km , and f : G −→ Kn . If G is such that

each projection

Gx := {y ∈ Km : (x, y) ∈ G}, x ∈ R, (D.1)


is convex (in particular, if G itself is convex), then f is locally Lipschitz with respect to

y if, and only if, f is (globally) Lipschitz with respect to y on every compact subset K

of G.

Proof. The proof of Prop. 3.13 shows, without making use of the continuity of f , that

(global) Lipschitz continuity with respect to y on every compact subset K of G implies

local Lipschitz continuity on G. Thus, assume f to be locally Lipschitz with respect to

y and assume each Gx to be convex. The proof of Prop. 3.13 shows, without making

use of the continuity of f , that, for each K ⊆ G compact,
$$\exists_{\delta>0,\ L\ge 0}\ \forall_{(x,y),(x,\bar y)\in K}\quad \Big( \|y - \bar y\| < \delta \;\Rightarrow\; \|f(x,y) - f(x,\bar y)\| \le L\, \|y - \bar y\| \Big). \qquad \text{(D.2)}$$

If (x, y), (x, ȳ) ∈ K are arbitrary with y ≠ ȳ, then the convexity of Gx implies that the entire line segment from ȳ to y lies in Gx . Choose N ∈ N such that N > 2‖y − ȳ‖/δ and set h := ‖y − ȳ‖/N , such that h < δ/2. Define
$$\forall_{k=0,\dots,N}\quad y_k := \frac{kh}{\|y-\bar y\|}\, y + \Big(1 - \frac{kh}{\|y-\bar y\|}\Big)\, \bar y. \qquad \text{(D.5)}$$
Then
$$\forall_{k=0,\dots,N-1}\quad \|y_{k+1} - y_k\| = \Big\| \frac{h}{\|y-\bar y\|}\, y - \frac{h}{\|y-\bar y\|}\, \bar y \Big\| = h < \delta \qquad \text{(D.6)}$$
and
$$\|f(x,y) - f(x,\bar y)\| \le \sum_{k=0}^{N-1} \|f(x,y_k) - f(x,y_{k+1})\| \overset{\text{(D.2)}}{\le} L \sum_{k=0}^{N-1} \|y_k - y_{k+1}\| = L\, N\, h = L\, \|y - \bar y\|, \qquad \text{(D.7)}$$
proving f to be Lipschitz with respect to y on K.

Example D.2. We provide two examples that show that, in general, a discontinuous

function can be locally Lipschitz with respect to y without being globally Lipschitz with

respect to y on every compact set.

(a) Consider

$$G := \;]-2,2[\; \times \big(\,]-4,-1[\; \cup \;]1,4[\,\big) \qquad \text{(D.8)}$$
and f : G −→ R,
$$f(x,y) := \begin{cases} 1/x & \text{for } x \neq 0,\ y \in \,]-4,-1[,\\ 0 & \text{for } x = 0,\ y \in \,]-4,-1[,\\ 0 & \text{for } y \in \,]1,4[. \end{cases} \qquad \text{(D.9)}$$


For the following open balls with respect to the max norm k(x, y)k := max{|x|, |y|},

one has B1 (x, y) ∩ G ⊆ ]−2, 2[ × ]−4, −1[ for y ∈ ]−4, −1[, and B1 (x, y) ∩ G ⊆ ]−2, 2[ × ]1, 4[ for y ∈ ]1, 4[. Thus, f (x, ·) is constant on each set B1 (x, y) ∩ G (either

constantly equal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y.

In particular, f is locally Lipschitz with respect to y. However, f is not Lipschitz

continuous with respect to y on the compact set

$$K := [-1,1] \times \big([-3,-2] \cup [2,3]\big): \qquad \text{(D.10)}$$
For the sequence ((xk , yk , ȳk ))k∈N , where
$$\forall_{k\in\mathbb{N}}\quad x_k := 1/k, \quad y_k := -2, \quad \bar y_k := 2, \qquad \text{(D.11)}$$
one has
$$\lim_{k\to\infty} \frac{|f(x_k, y_k) - f(x_k, \bar y_k)|}{|y_k - \bar y_k|} = \lim_{k\to\infty} \frac{k - 0}{2 - (-2)} = \infty. \qquad \text{(D.12)}$$
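The blow-up in (D.12) is easy to reproduce numerically; the following plain-Python sketch (ad-hoc names, not part of the notes) evaluates f from (D.9) along the sequence (D.11) and shows the difference quotient k/4 growing without bound.

```python
def f(x, y):
    # f from (D.9): 1/x on the strip y in ]-4,-1[ (for x != 0), 0 otherwise
    if -4 < y < -1:
        return 1.0 / x if x != 0 else 0.0
    return 0.0

def quotient(k):
    # difference quotient from (D.12) with x_k = 1/k, y_k = -2, ybar_k = 2
    xk = 1.0 / k
    return abs(f(xk, -2.0) - f(xk, 2.0)) / abs(-2.0 - 2.0)

print([quotient(k) for k in (1, 10, 100)])  # [0.25, 2.5, 25.0] -- grows like k/4
```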

(b) If one increases the dimension by 1, then one can modify the example in (a) such that the set G is even connected (this variant was pointed out by Anton Sporrer): Let

$$A := \big(]-4,-1[ \times ]-2,2[\big) \cup \big(]-4,4[ \times ]-2,0[\big) \cup \big(]1,4[ \times ]-2,2[\big) \subseteq \mathbb{R}^2. \qquad \text{(D.13)}$$

Then A is open and connected (but not convex) and the same holds for

G :=] − 2, 2[×A ⊆ R3 . (D.14)

Define

$$f : G \longrightarrow \mathbb{R}, \qquad f(x, y_1, y_2) := \begin{cases} 1/x & \text{for } x \neq 0,\ y_1 \in \,]-4,-1[,\ y_2 > 0,\\ 0 & \text{otherwise}. \end{cases} \qquad \text{(D.15)}$$

Then everything works essentially as in (a) (it might be helpful to graphically

visualize the set A and the behavior of the function f ): For the following open balls

with respect to the max norm k(x, y)k := max{|x|, |y1 |, |y2 |}, one has

$$\forall_{(x,y_1,y_2)\in G,\ y_1\in\,]-4,-1[}\quad \Big( (\xi, \eta_1, \eta_2) \in B_1(x, y_1, y_2) \cap G \;\Rightarrow\; \eta_1 < -1 + 1 = 0 < 1 \Big). \qquad \text{(D.16)}$$
Thus, f (x, ·) is again constant on each set B1 (x, y1 , y2 ) ∩ G (either constantly equal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y. In particular, f is locally Lipschitz with respect to y. However, f is not Lipschitz continuous

with respect to y on the compact set

$$K := [-1,1] \times \big([-3,-2] \cup [2,3]\big) \times [-1,1]: \qquad \text{(D.17)}$$

For the sequence of pairs of points (xk , y1,k , y2,k ) and (xk , ȳ1,k , y2,k ), k ∈ N, with
$$\forall_{k\in\mathbb{N}}\quad x_k := 1/k, \quad y_{1,k} := -2, \quad \bar y_{1,k} := 2, \quad y_{2,k} := 0, \qquad \text{(D.18)}$$
one has
$$\lim_{k\to\infty} \frac{|f(x_k, y_{1,k}, y_{2,k}) - f(x_k, \bar y_{1,k}, y_{2,k})|}{\|(y_{1,k}, y_{2,k}) - (\bar y_{1,k}, y_{2,k})\|_{\max}} = \lim_{k\to\infty} \frac{k - 0}{\max\{4, 0\}} = \infty. \qquad \text{(D.19)}$$

E MAXIMAL SOLUTIONS ON NONOPEN INTERVALS 135

In Def. 3.20, we required a maximal solution to an ODE to be defined on an open

interval. The following Ex. E.1 shows it can occur that such a maximal solution has an

extension to a larger nonopen interval. In such cases, one might want to call the solution

on the nonopen interval maximal rather than the solution on the smaller open interval.

However, this would make the treatment of maximal solutions more cumbersome in some

places, without adding any real substance, which is why we stick to our requirement for

maximal solutions to always be defined on an open interval.

φ : [0, 1] −→ R, φ ≡ y0 , (E.2)

$$G := \mathbb{R}^2, \qquad f : G \longrightarrow \mathbb{R}, \qquad f(x,y) := \begin{cases} 0 & \text{for } x \in [0,1],\\ 1 & \text{for } x \notin [0,1]. \end{cases} \qquad \text{(E.4)}$$

Then, for each (x0 , y0 ) ∈ [0, 1] × R, the function φ of (E.2) is a solution to the initial

value problem (E.3), but, again, the maximal solution of (E.3) according to Def.

3.20 is φ↾]0,1[ .

F Paths in Rn

Definition F.1. A path or curve in Rn , n ∈ N, is a continuous map ψ : I −→ Rn , where

I ⊆ R is an interval. One calls the path differentiable, continuously differentiable, etc.

if, and only if, the function ψ has the respective property.

the length of I.


tuple ∆ := (x0 , . . . , xN ) ∈ RN +1 , N ∈ N, is called a partition of I if, and only if,

a = x0 < x1 < · · · < xN = b. The set of all partitions of I is denoted by Π(I) or by

Π[a, b]. Given a partition ∆ of I as above and letting Ij := [xj−1 , xj ], the number

$$|\Delta| := \max\big\{ |I_j| : j \in \{1, \dots, N\} \big\}, \qquad \text{(F.2)}$$

Given a path ψ : [a, b] −→ Rn , a < b, and a partition ∆ = (x0 , . . . , xN ), N ∈ N, of [a, b], we consider the approximation of ψ by the polygon connecting the points ψ(x0 ), . . . , ψ(xN ), where we denote the polygon’s length by

$$p_\psi(\Delta) := p_\psi(x_0, \dots, x_N) := \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2, \qquad \text{(F.3)}$$

$$l(\psi) := \begin{cases} 0 & \text{for } a = b,\\ \sup\big\{ p_\psi(\Delta) : \Delta \in \Pi[a,b] \big\} \in [0, \infty] & \text{for } a < b. \end{cases} \qquad \text{(F.4)}$$

The path ψ is called rectifiable with arc length l(ψ) if, and only if, l(ψ) < ∞.

$$\forall_{x \in [a,b]}\quad \psi(x) = y_0 + x\, y_1, \qquad \text{(F.5)}$$

l(φ) − l(ψ) ≤ l(φ − ψ). (F.8)


$$p_\psi(x_0, \dots, x_N) = \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 = \sum_{k=0}^{N-1} \|x_{k+1}\, y_1 - x_k\, y_1\|_2 = \|y_1\|_2 \sum_{k=0}^{N-1} (x_{k+1} - x_k) = \|y_1\|_2\, (b - a), \qquad \text{(F.10)}$$

proving (F.6).

(b): For each partition (x0 , . . . , xN ), N ∈ N, of [a, b], we have

$$p_\psi(x_0, \dots, x_N) = \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \le L \sum_{k=0}^{N-1} |x_{k+1} - x_k| = L \sum_{k=0}^{N-1} (x_{k+1} - x_k) = L\, (b - a), \qquad \text{(F.11)}$$

proving (F.7).

(c): For each partition ∆ = (x0 , . . . , xN ), N ∈ N, of [a, b], we have

$$\begin{aligned} p_\phi(\Delta) - p_\psi(\Delta) &= \sum_{k=0}^{N-1} \|\phi(x_{k+1}) - \phi(x_k)\|_2 - \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \\ &\le \sum_{k=0}^{N-1} \Big| \|\phi(x_{k+1}) - \phi(x_k)\|_2 - \|\psi(x_{k+1}) - \psi(x_k)\|_2 \Big| \\ &\le \sum_{k=0}^{N-1} \Big\| \big(\phi(x_{k+1}) - \psi(x_{k+1})\big) - \big(\phi(x_k) - \psi(x_k)\big) \Big\|_2 = p_{\phi-\psi}(\Delta) \le l(\phi - \psi), \end{aligned} \qquad \text{(F.12)}$$

proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequal-

ity).

(d): If ξ = a or ξ = b, then there is nothing to prove. Thus, assume a < ξ < b. If

∆1 := (x0 , . . . , xN ) is a partition of [a, ξ] and ∆2 := (xN , . . . , xM ) is a partition of [ξ, b],

N, M ∈ N, M > N , then ∆ := (x0 , . . . , xM ) is a partition of [a, b]. Moreover,
$$p_\psi(\Delta) = p_\psi(\Delta_1) + p_\psi(\Delta_2). \qquad \text{(F.13)}$$
On the other hand, if ∆ = (x0 , . . . , xM ), M ∈ N, is a partition of [a, b], then either there is 0 < N < M such that ξ = xN , in which case (F.13) holds once again, where ∆1 and ∆2 are defined as before. Otherwise, there is N ∈ {0, . . . , M − 1} such that xN < ξ < xN +1 ,


and then ∆1 := (x0 , . . . , xN , ξ) is a partition of [a, ξ] and ∆2 := (ξ, xN +1 , . . . , xM ) is a partition of [ξ, b]. Moreover,
$$p_\psi(\Delta) = \sum_{k=0}^{M-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 = \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 + \|\psi(x_{N+1}) - \psi(x_N)\|_2 + \sum_{k=N+1}^{M-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \le p_\psi(\Delta_1) + p_\psi(\Delta_2) \le l(\psi{\restriction}_{[a,\xi]}) + l(\psi{\restriction}_{[\xi,b]}),$$
showing
$$l(\psi) \le l(\psi{\restriction}_{[a,\xi]}) + l(\psi{\restriction}_{[\xi,b]}) \qquad \text{(F.16)}$$
and concluding the proof.

Theorem F.7. Given a, b ∈ R, a < b, each continuously differentiable path ψ : [a, b] −→ Rn , n ∈ N, is rectifiable with arc length
$$l(\psi) = \int_a^b \big\| \psi'(x) \big\|_2 \,\mathrm{d}x. \qquad \text{(F.17)}$$

Proof. Since ψ is continuously differentiable, it follows from [Phi15, Th. C.3] that ψ is Lipschitz continuous on [a, b], i.e. ψ is rectifiable by Prop. F.6(b) above. To prove (F.17),

according to the fundamental theorem of calculus [Phi16, Th. 10.20(b)], it suffices to

show the function

$$\lambda : [a,b] \longrightarrow \mathbb{R}_0^+, \qquad \lambda(x) := l(\psi{\restriction}_{[a,x]}), \qquad \text{(F.18)}$$

is differentiable with derivative λ′ (x) = kψ ′ (x)k2 . To this end, first note the continuous

function ψ ′ is even uniformly continuous by Th. C.16. Thus,

$$\forall_{\epsilon>0}\ \exists_{\delta>0}\ \forall_{x_0, x \in [a,b]}\quad \Big( |x_0 - x| < \delta \;\Rightarrow\; \|\psi'(x_0) - \psi'(x)\|_2 < \epsilon \Big). \qquad \text{(F.19)}$$

Fix x0 ∈ [a, b[ and consider x1 ∈ ]a, b] such that x0 < x1 < x0 + δ. Define the affine path
$$\alpha : [x_0, x_1] \longrightarrow \mathbb{R}^n, \qquad \alpha(x) := \psi(x_0) + (x - x_0)\, \psi'(x_0),$$
such that, by Prop. F.6(a), l(α) = (x1 − x0 ) ‖ψ ′ (x0 )‖2 . Then
$$\forall_{x \in [x_0, x_1]}\quad \|\psi'(x) - \alpha'(x)\|_2 = \|\psi'(x) - \psi'(x_0)\|_2 \overset{\text{(F.19)}}{<} \epsilon. \qquad \text{(F.22)}$$

Thus, it follows from [Phi15, Th. C.3] that ψ − α is ǫ-Lipschitz on [x0 , x1 ] and, then, Prop. F.6(b) yields
$$l(\psi{\restriction}_{[x_0,x_1]} - \alpha) \le \epsilon\, (x_1 - x_0), \qquad \text{(F.23)}$$

G OPERATOR NORMS AND MATRIX NORMS 139

l(ψ↾[x0 ,x1 ] ) − l(α) ≤ l(ψ↾[x0 ,x1 ] −α) ≤ ǫ(x1 − x0 ). (F.24)

Putting everything together, we obtain
$$\left| \frac{l(\psi{\restriction}_{[a,x_1]}) - l(\psi{\restriction}_{[a,x_0]})}{x_1 - x_0} - \|\psi'(x_0)\|_2 \right| \overset{\text{Prop. F.6(d), (F.21)}}{=} \left| \frac{l(\psi{\restriction}_{[x_0,x_1]})}{x_1 - x_0} - \frac{l(\alpha)}{x_1 - x_0} \right| \overset{\text{(F.24)}}{\le} \frac{\epsilon\, (x_1 - x_0)}{x_1 - x_0} = \epsilon, \qquad \text{(F.25)}$$

showing the function λ from (F.18) has a right-hand derivative at x0 and the value of

that right-hand derivative at x0 is the desired kψ ′ (x0 )k2 . Repeating the above argument

with x0 , x1 ∈]a, b] such that x0 − δ < x1 < x0 shows λ to have a left-hand derivative at

each x0 ∈]a, b] with value kψ ′ (x0 )k2 , which completes the proof.
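Theorem F.7 can be illustrated numerically: for the quarter circle ψ(x) = (cos x, sin x) on [0, π/2], the polygon lengths pψ (∆) from (F.3) increase under refinement towards the arc length π/2 (a plain-Python sketch with ad-hoc names, not part of the notes).

```python
import math

def psi(x):
    # quarter circle; by (F.17) its arc length is pi/2
    return (math.cos(x), math.sin(x))

def polygon_length(N):
    # p_psi(Delta) from (F.3) for the equidistant partition with N subintervals
    a, b = 0.0, math.pi / 2
    pts = [psi(a + k * (b - a) / N) for k in range(N + 1)]
    return sum(math.dist(pts[k + 1], pts[k]) for k in range(N))

# polygon lengths increase under refinement and stay below the arc length
assert polygon_length(10) <= polygon_length(100) <= math.pi / 2
assert abs(polygon_length(10000) - math.pi / 2) < 1e-6
```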

Remark F.8. An example of a differentiable nonrectifiable path is given by (cf. [Wal02, Ex. 5.14.6])
$$\psi : [0,1] \longrightarrow \mathbb{R}^2, \qquad \psi(x) := \begin{cases} \big(x,\; x^2 \cos\frac{\pi}{x^2}\big) & \text{for } x \neq 0,\\ (0,0) & \text{for } x = 0. \end{cases} \qquad \text{(F.26)}$$

For the present ODE class, we are mostly interested in linear maps from Kn into itself.

However, introducing the relevant notions for linear maps between general normed vector

spaces does not provide much additional difficulty, and, hopefully, even some extra

clarity.

Definition G.1. Let A : X −→ Y be a linear map between two normed vector spaces

(X, k · kX ) and (Y, k · kY ) over K. Then A is called bounded if, and only if, A maps

bounded sets to bounded sets, i.e. if, and only if, A(B) is a bounded subset of Y for

each bounded B ⊆ X. The vector space of all bounded linear maps between X and Y

is denoted by L(X, Y ).

Definition G.2. Let A : X −→ Y be a linear map between two normed vector spaces

(X, k · kX ) and (Y, k · kY ) over K. The number

$$\|A\| := \sup\left\{ \frac{\|Ax\|_Y}{\|x\|_X} : x \in X,\ x \neq 0 \right\} = \sup\big\{ \|Ax\|_Y : x \in X,\ \|x\|_X = 1 \big\} \in [0, \infty] \qquad \text{(G.1)}$$

is called the operator norm of A induced by k · kX and k · kY (strictly speaking, the term

operator norm is only justified if the value is finite, but it is often convenient to use the

term in the generalized way defined here).

In the special case, where X = Kn , Y = Km , and A is given via a real m × n matrix,

the operator norm is also called matrix norm.



From now on, the space index of a norm will usually be suppressed, i.e. we write just

k · k instead of both k · kX and k · kY .

Theorem G.3. For a linear map A : X −→ Y between two normed vector spaces

(X, k · k) and (Y, k · k) over K, the following statements are equivalent:

(a) A is bounded.

(b) kAk < ∞.

(c) A is Lipschitz continuous.

(d) A is continuous.

(e) There is x0 ∈ X such that A is continuous at x0 .

Proof. Since every Lipschitz continuous map is continuous and since every continuous

map is continuous at every point, “(c) ⇒ (d) ⇒ (e)” is clear.

“(e) ⇒ (c)”: Let x0 ∈ X be such that A is continuous at x0 . Thus, for each ǫ > 0, there

is δ > 0 such that kx − x0 k < δ implies kAx − Ax0 k < ǫ. As A is linear, for each x ∈ X

with kxk < δ, one has kAxk = kA(x + x0 ) − Ax0 k < ǫ, due to kx + x0 − x0 k = kxk < δ.

Moreover, one has k(δx)/2k ≤ δ/2 < δ for each x ∈ X with kxk ≤ 1. Letting L := 2ǫ/δ,

this means that kAxk = kA((δx)/2)k/(δ/2) < 2ǫ/δ = L for each x ∈ X with kxk ≤ 1.

Thus, for each x, y ∈ X with x 6= y, one has

$$\|Ax - Ay\| = \|A(x - y)\| = \|x - y\| \left\| A\!\left( \frac{x-y}{\|x-y\|} \right) \right\| < L\, \|x - y\|. \qquad \text{(G.2)}$$

Together with the fact that kAx − Ayk ≤ L kx − yk is trivially true for x = y, this shows that A is Lipschitz continuous.

“(c) ⇒ (b)”: As A is Lipschitz continuous, there exists $L \in \mathbb{R}_0^+$ such that kAx − Ayk ≤ L kx − yk for each x, y ∈ X. Considering the special case y = 0 and kxk = 1 yields kAxk ≤ L kxk = L, implying kAk ≤ L < ∞.

“(b) ⇒ (c)”: Let kAk < ∞. We will show

kAx − Ayk ≤ kAk kx − yk for each x, y ∈ X. (G.3)

For x = y, there is nothing to prove. Thus, let x 6= y. One computes

$$\frac{\|Ax - Ay\|}{\|x - y\|} = \left\| A\!\left( \frac{x-y}{\|x-y\|} \right) \right\| \le \|A\| \qquad \text{(G.4)}$$
as $\left\|\frac{x-y}{\|x-y\|}\right\| = 1$, thereby establishing (G.3).

“(b) ⇒ (a)”: Let kAk < ∞ and let M ⊆ X be bounded. Then there is r > 0 such that

M ⊆ Br (0). Moreover, for each 0 6= x ∈ M :

$$\frac{\|Ax\|}{\|x\|} = \left\| A\!\left( \frac{x}{\|x\|} \right) \right\| \le \|A\| \qquad \text{(G.5)}$$
as $\left\|\frac{x}{\|x\|}\right\| = 1$. Thus, kAxk ≤ kAk kxk ≤ r kAk, showing that $A(M) \subseteq B_{r\|A\|}(0)$. Thus, A(M ) is bounded, thereby establishing the case.

“(a) ⇒ (b)”: Since A is bounded, it maps the bounded set B1 (0) ⊆ X into some

bounded subset of Y . Thus, there is r > 0 such that A(B1 (0)) ⊆ Br (0) ⊆ Y . In

particular, kAxk < r for each x ∈ X satisfying kxk = 1, showing kAk ≤ r < ∞.

Remark G.4. For linear maps between finite-dimensional spaces, the equivalent prop-

erties of Th. G.3 always hold: Each linear map A : Kn −→ Km , (n, m) ∈ N2 , is

continuous (this follows, for example, from the fact that each such map is (trivially)

differentiable, and every differentiable map is continuous). In particular, each linear

map A : Kn −→ Km , has all the equivalent properties of Th. G.3.

Theorem G.5. Let X and Y be normed vector spaces over K.

(a) The operator norm does, indeed, constitute a norm on the set of bounded linear

maps L(X, Y ).

(b) If A ∈ L(X, Y ), then kAk is the smallest Lipschitz constant for A, i.e. kAk is a

Lipschitz constant for A and kAx − Ayk ≤ L kx − yk for each x, y ∈ X implies

kAk ≤ L.

Proof. (a): A = 0 clearly implies kAk = 0. Conversely, kAk = 0 implies Ax = 0 for each x ∈ X with kxk = 1. But then Ax = kxk A(x/kxk) = 0 for every 0 ≠ x ∈ X, i.e. A = 0. Thus, the operator norm is

positive definite. If A ∈ L(X, Y ), λ ∈ K, and x ∈ X, then

$$\|(\lambda A)x\| = \|A(\lambda x)\| = \|\lambda (Ax)\| = |\lambda|\, \|Ax\|, \qquad \text{(G.6)}$$

yielding

$$\|\lambda A\| = \sup\big\{ \|(\lambda A)x\| : x \in X,\ \|x\| = 1 \big\} = \sup\big\{ |\lambda|\, \|Ax\| : x \in X,\ \|x\| = 1 \big\} = |\lambda|\, \|A\|, \qquad \text{(G.7)}$$

showing that the operator norm is homogeneous of degree 1. Finally, if A, B ∈ L(X, Y )

and x ∈ X, then

k(A + B)xk = kAx + Bxk ≤ kAxk + kBxk, (G.8)

yielding

$$\begin{aligned} \|A + B\| &= \sup\big\{ \|(A+B)x\| : x \in X,\ \|x\| = 1 \big\} \le \sup\big\{ \|Ax\| + \|Bx\| : x \in X,\ \|x\| = 1 \big\} \\ &\le \sup\big\{ \|Ax\| : x \in X,\ \|x\| = 1 \big\} + \sup\big\{ \|Bx\| : x \in X,\ \|x\| = 1 \big\} = \|A\| + \|B\|, \end{aligned} \qquad \text{(G.9)}$$

showing that the operator norm also satisfies the triangle inequality, thereby completing

the verification that it is, indeed, a norm.

(b): That kAk is a Lipschitz constant for A was already shown in the proof of “(b) ⇒

(c)” of Th. G.3. Now let $L \in \mathbb{R}_0^+$ be such that kAx − Ayk ≤ L kx − yk for each x, y ∈ X. Specializing to y = 0 and kxk = 1 implies kAxk ≤ L kxk = L, showing kAk ≤ L.

H THE VANDERMONDE DETERMINANT 142

Remark G.6. Even though it is beyond the scope of the present class, let us mention

as an outlook that one can show that L(X, Y ) with the operator norm is a Banach space

(i.e. a complete normed vector space) provided that Y is a Banach space (even if X is

not a Banach space).

Lemma G.7. If Id : X −→ X, Id(x) := x, is the identity map on a normed vector space

X over K, then k Id k = 1 (in particular, the operator norm of a unit matrix is always

1). Caveat: In principle, one can consider two different norms on X simultaneously,

and then the operator norm of the identity can differ from 1.

Lemma G.8. Let X, Y, Z be normed vector spaces and consider linear maps A ∈

L(X, Y ), B ∈ L(Y, Z). Then

kBAk ≤ kBk kAk. (G.10)

Proof. For x ∈ X with kxk = 1 and Ax ≠ 0 (the case Ax = 0 being trivial), one estimates
$$\|B(Ax)\| = \|Ax\| \left\| B\!\left( \frac{Ax}{\|Ax\|} \right) \right\| \le \|A\|\, \|B\|, \qquad \text{(G.11)}$$
thereby establishing the case.

Example G.9. Let m, n ∈ N and let A : Rn −→ Rm be the linear map given by the

m × n matrix (akl )(k,l)∈{1,...,m}×{1,...,n} . Then

$$\|A\|_\infty := \max\left\{ \sum_{l=1}^{n} |a_{kl}| : k \in \{1, \dots, m\} \right\} \qquad \text{(G.12a)}$$
is called the row sum norm of A, and
$$\|A\|_1 := \max\left\{ \sum_{k=1}^{m} |a_{kl}| : l \in \{1, \dots, n\} \right\} \qquad \text{(G.12b)}$$
is called the column sum norm of A. It is an exercise to show that kAk∞ is the operator norm induced if Rn and Rm are endowed with the ∞-norm, and kAk1 is the operator norm induced if Rn and Rm are endowed with the 1-norm.
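As a quick plausibility check for Ex. G.9 (not the exercise's proof), the following plain-Python sketch (ad-hoc names) computes both norms for a sample matrix and verifies the compatibility inequality kAxk ≤ kAk kxk in the matching vector norms.

```python
def row_sum_norm(A):
    # ||A||_inf from (G.12a): maximal absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def col_sum_norm(A):
    # ||A||_1 from (G.12b): maximal absolute column sum
    return max(sum(abs(row[l]) for row in A) for l in range(len(A[0])))

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1.0, -2.0], [3.0, 0.5]]
x = [1.0, -1.0]
Ax = matvec(A, x)
# compatibility ||Ax||_inf <= ||A||_inf ||x||_inf, and analogously for the 1-norms
assert max(abs(v) for v in Ax) <= row_sum_norm(A) * max(abs(v) for v in x)
assert sum(abs(v) for v in Ax) <= col_sum_norm(A) * sum(abs(v) for v in x)
print(row_sum_norm(A), col_sum_norm(A))  # 3.5 4.0
```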

Theorem H.1. Let n ∈ N and λ0 , λ1 , . . . , λn ∈ C. Moreover, let

$$V := \begin{pmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{pmatrix} \qquad \text{(H.1)}$$
be the corresponding Vandermonde matrix. Then its determinant, the so-called Vandermonde determinant, is given by
$$\det(V) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l). \qquad \text{(H.2)}$$

Proof. The proof can be conducted by induction with respect to n: For n = 1, we have

$$\det(V) = \begin{vmatrix} 1 & \lambda_0 \\ 1 & \lambda_1 \end{vmatrix} = \lambda_1 - \lambda_0 = \prod_{\substack{k,l=0 \\ k>l}}^{1} (\lambda_k - \lambda_l), \qquad \text{(H.3)}$$

showing (H.2) holds for n = 1. Now let n > 1. We know from Linear Algebra that the

value of a determinant does not change if we add a multiple of a column to a different

column. Adding the (−λ0 )-fold of the nth column to the (n + 1)st column, we obtain

in the (n + 1)st column

$$\begin{pmatrix} 0 \\ \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots \\ \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{pmatrix}. \qquad \text{(H.4)}$$

Next, one adds the (−λ0 )-fold of the (n − 1)st column to the nth column, and, succes-

sively, the (−λ0 )-fold of the mth column to the (m + 1)st column. One finishes, in the

nth step, by adding the (−λ0 )-fold of the first column to the second column, obtaining

$$\det(V) = \begin{vmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \dots & 0 \\ 1 & \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \qquad \text{(H.5)}$$
Expanding the last determinant with respect to its first row yields
$$\det(V) = 1 \cdot \begin{vmatrix} \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & & \vdots \\ \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \qquad \text{(H.6)}$$

As we also know from Linear Algebra that determinants are linear in each row, for each

k, we can factor out (λk − λ0 ) from the kth row of (H.6), arriving at

$$\det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \cdot \begin{vmatrix} 1 & \lambda_1 & \dots & \lambda_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^{n-1} \end{vmatrix}. \qquad \text{(H.7)}$$

However, the determinant in (H.7) is precisely the Vandermonde determinant of the n numbers λ1 , . . . , λn , which is given according to the induction hypothesis, implying

$$\det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \prod_{\substack{k,l=1 \\ k>l}}^{n} (\lambda_k - \lambda_l) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l), \qquad \text{(H.8)}$$
completing the induction.
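Formula (H.2) is easy to sanity-check numerically; the sketch below (plain Python, ad-hoc names, not part of the notes) compares a naive cofactor-expansion determinant of V with the product on the right-hand side.

```python
def det(M):
    # naive Laplace expansion along the first row (fine for small matrices)
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += ((-1) ** j) * a * det(minor)
    return total

lam = [2.0, -1.0, 3.0, 0.5]  # lambda_0, ..., lambda_n with n = 3
V = [[x ** p for p in range(len(lam))] for x in lam]  # Vandermonde matrix (H.1)
prod = 1.0
for k in range(len(lam)):
    for l in range(k):  # runs over all pairs k > l as in (H.2)
        prod *= lam[k] - lam[l]
assert abs(det(V) - prod) < 1e-9  # both equal -67.5 for this sample
```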


I Matrix-Valued Functions

Notation I.1. Given m, n ∈ N, let M(m, n, K) denote the set of m × n matrices over

K.

Proposition I.2. Let I ⊆ R be a nontrivial interval, let m, n, l ∈ N, and suppose the functions
$$A : I \longrightarrow \mathcal{M}(m,n,\mathbb{K}), \qquad A(x) = \big(a_{\alpha\beta}(x)\big), \qquad \text{(I.1a)}$$
$$B : I \longrightarrow \mathcal{M}(n,l,\mathbb{K}), \qquad B(x) = \big(b_{\alpha\beta}(x)\big), \qquad \text{(I.1b)}$$
are differentiable at x ∈ I. Then the product C := AB is differentiable at x with C ′ (x) = A′ (x)B(x) + A(x)B ′ (x).

Proof. Writing C(x) = cαβ (x) and using the one-dimensional product rule together

with the definition of matrix multiplication, one computes, for each (α, β) ∈ {1, . . . , m}

× {1, . . . , l},

$$c'_{\alpha\beta}(x) = \left( \sum_{\gamma=1}^{n} a_{\alpha\gamma}(x)\, b_{\gamma\beta}(x) \right)' = \sum_{\gamma=1}^{n} a'_{\alpha\gamma}(x)\, b_{\gamma\beta}(x) + \sum_{\gamma=1}^{n} a_{\alpha\gamma}(x)\, b'_{\gamma\beta}(x) = \big(A'(x)B(x)\big)_{\alpha\beta} + \big(A(x)B'(x)\big)_{\alpha\beta}, \qquad \text{(I.4)}$$
proving the product rule.
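The product rule of Prop. I.2 can be illustrated with a finite-difference check (a numerical sketch, not part of the notes; the concrete matrix-valued functions A, B are arbitrary choices).

```python
import math

# sample matrix-valued functions with known entrywise derivatives
def A(x):  return [[x * x, math.sin(x)], [1.0, x]]
def dA(x): return [[2 * x, math.cos(x)], [0.0, 1.0]]
def B(x):  return [[math.exp(x), 0.0], [x, 2.0]]
def dB(x): return [[math.exp(x), 0.0], [1.0, 0.0]]

def mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

x, h = 0.7, 1e-6
C = lambda t: mul(A(t), B(t))
# central difference approximation of C'(x), compared with A'B + AB' from Prop. I.2
num = [[(C(x + h)[i][j] - C(x - h)[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
exact = add(mul(dA(x), B(x)), mul(A(x), dB(x)))
err = max(abs(num[i][j] - exact[i][j]) for i in range(2) for j in range(2))
assert err < 1e-5
```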

Proposition I.3. Let m, n, p ∈ N, let I ⊆ R be measurable (e.g. an interval), let A : I −→ M(m, n, K), x 7→ A(x) = (akl (x)), be integrable (i.e. all Re akl , Im akl : I −→ R are integrable).

(a) For each constant matrix B ∈ M(p, m, K), one has
$$B \int_I A(x)\,\mathrm{d}x = \int_I B\, A(x)\,\mathrm{d}x. \qquad \text{(I.5)}$$

(b) For each constant matrix B ∈ M(n, p, K), one has
$$\left( \int_I A(x)\,\mathrm{d}x \right) B = \int_I A(x)\, B\,\mathrm{d}x. \qquad \text{(I.6)}$$

Proof. (a): One computes, for each (j, l) ∈ {1, . . . , p} × {1, . . . , n},
$$\left( B \int_I A(x)\,\mathrm{d}x \right)_{jl} = \sum_{k=1}^{m} b_{jk} \int_I a_{kl}(x)\,\mathrm{d}x = \int_I \sum_{k=1}^{m} b_{jk}\, a_{kl}(x)\,\mathrm{d}x = \int_I \big( B A(x) \big)_{jl}\,\mathrm{d}x = \left( \int_I B\, A(x)\,\mathrm{d}x \right)_{jl}, \qquad \text{(I.7)}$$
proving (I.5).

(b): One computes, for each (k, j) ∈ {1, . . . , m} × {1, . . . , p},
$$\left( \left( \int_I A(x)\,\mathrm{d}x \right) B \right)_{kj} = \sum_{l=1}^{n} \left( \int_I a_{kl}(x)\,\mathrm{d}x \right) b_{lj} = \int_I \sum_{l=1}^{n} a_{kl}(x)\, b_{lj}\,\mathrm{d}x = \int_I \big( A(x)\, B \big)_{kj}\,\mathrm{d}x = \left( \int_I A(x)\, B\,\mathrm{d}x \right)_{kj}, \qquad \text{(I.8)}$$
proving (I.6).
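A quick numerical illustration of (I.5) (a sketch with ad-hoc names, not part of the notes; a crude midpoint rule stands in for the integral):

```python
import math

def A(x):
    # sample integrable matrix-valued function on I = [0, 1]
    return [[math.cos(x), x], [1.0, x * x]]

def integrate_matrix(f, a, b, steps=2000):
    # entrywise midpoint-rule integration
    h = (b - a) / steps
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(steps):
        M = f(a + (s + 0.5) * h)
        for i in range(2):
            for j in range(2):
                acc[i][j] += M[i][j] * h
    return acc

def mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

B = [[2.0, -1.0], [0.5, 3.0]]
lhs = mul(B, integrate_matrix(A, 0.0, 1.0))               # B * integral of A
rhs = integrate_matrix(lambda x: mul(B, A(x)), 0.0, 1.0)  # integral of B * A
err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
assert err < 1e-9
```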

J Autonomous ODE

Theorem J.1. Let G ⊆ R × Kn , n ∈ N, and f : G −→ Kn . Then the nonautonomous

ODE

y ′ = f (x, y) (J.1)

is equivalent to the autonomous ODE

y ′ = g(y), (J.2)

where

$$g : G \longrightarrow \mathbb{K}^{n+1}, \qquad g(y_1, \dots, y_{n+1}) := \big( 1,\ f(y_1, (y_2, \dots, y_{n+1})) \big), \qquad \text{(J.3)}$$

in the following sense:


(a) If φ : I −→ Kn is a solution to (J.1), then ψ : I −→ Kn+1 , ψ(x) := (x, φ(x)), is a solution to (J.2).

(b) If ψ : I −→ Kn+1 is a solution to (J.2) such that
$$\exists_{x_0 \in I}\quad \psi_1(x_0) = x_0, \qquad \text{(J.4)}$$
then φ : I −→ Kn , φ(x) := (ψ2 (x), . . . , ψn+1 (x)), is a solution to (J.1).

Proof. (a): With ψ(x) := (x, φ(x)), one has
$$\forall_{x \in I}\quad \psi'(x) = (1, \phi'(x)) = \big( 1, f(x, \phi(x)) \big) = g(x, \phi(x)) = g(\psi(x)), \qquad \text{(J.5)}$$
showing ψ to be a solution to (J.2).

(b): If ψ : I −→ Kn+1 is a solution to (J.2) with the property (J.4) and φ : I −→ Kn , φ(x) := (ψ2 (x), . . . , ψn+1 (x)), then, since ψ1′ ≡ 1 by (J.2) and (J.3), (J.4) implies ψ1 (x) = x for each x ∈ I and, thus,
$$\forall_{x \in I}\quad \phi'(x) = \big( \psi_2'(x), \dots, \psi_{n+1}'(x) \big) = f\big( x, \psi_2(x), \dots, \psi_{n+1}(x) \big) = f(x, \phi(x)), \qquad \text{(J.6)}$$
showing φ to be a solution to (J.1).

While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applications in practice, due to the unbounded first component of solutions to (J.2) (cf. Rem. 5.2).
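Despite that caveat, the correspondence of Th. J.1 is easy to observe numerically: integrating the autonomized system with a simple explicit Euler scheme, the first component simply reproduces x (a sketch, not part of the notes; the concrete right-hand side f is an arbitrary choice).

```python
# nonautonomous ODE y' = f(x, y); autonomized: (y1, y2)' = (1, f(y1, y2)) as in (J.3)
def f(x, y):
    return x + y  # sample right-hand side (hypothetical)

x, y = 0.0, 1.0    # initial condition y(0) = 1 for the nonautonomous ODE
y1, y2 = 0.0, 1.0  # autonomized initial condition with psi_1(x0) = x0, cf. (J.4)
h = 1e-3
for _ in range(1000):  # integrate up to x = 1 with explicit Euler
    y += h * f(x, y)
    x += h
    y1, y2 = y1 + h * 1.0, y2 + h * f(y1, y2)
assert abs(y1 - x) < 1e-12  # first component of the autonomized system tracks x
assert abs(y2 - y) < 1e-12  # second component reproduces the nonautonomous solution
```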

The following Example J.2, provided by Anton Sporrer, shows Lem. 5.19 becomes false

if the hypothesis that every initial value problem for the considered ODE y ′ = f (y) has

at least one solution is omitted:

Example J.2. Consider

$$f : \mathbb{R} \longrightarrow \mathbb{R}, \qquad f(y) := \begin{cases} 0 & \text{for } y \in \mathbb{Q},\\ 1 & \text{for } y \in \mathbb{R} \setminus \mathbb{Q}, \end{cases} \qquad \text{(J.7)}$$

and the autonomous ODE y ′ = f (y). If (x0 , y0 ) ∈ R × Q, then the initial value problem

y(x0 ) = y0 has the unique solution φ : R −→ R, φ ≡ y0 ∈ Q. However, if (x0 , y0 ) ∈

R × (R \ Q), then the initial value problem y(x0 ) = y0 has no solution. Since y ′ = f (y)

has only constant solutions, every function E : R −→ R is an integral for this ODE

according to Def. 5.18. However, not every differentiable function E : R −→ R satisfies

(5.15): For example, if E(y) := y, then E ′ ≡ 1, i.e.
$$\forall_{y \in \mathbb{R} \setminus \mathbb{Q}}\quad E'(y)\, f(y) = 1 \neq 0,$$

showing that Lem. 5.19 does not hold for y ′ = f (y) with f according to (J.7).


K Polar Coordinates

Recall the following functions, used in polar coordinates of the plane:

$$r : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}^+, \qquad r(y_1, y_2) := \sqrt{y_1^2 + y_2^2}, \qquad \text{(K.1a)}$$
$$\varphi : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow [0, 2\pi[, \qquad \varphi(y_1, y_2) := \begin{cases} 0 & \text{for } y_2 = 0,\ y_1 > 0,\\ \operatorname{arccot}(y_1/y_2) & \text{for } y_2 > 0,\\ \pi & \text{for } y_2 = 0,\ y_1 < 0,\\ \pi + \operatorname{arccot}(y_1/y_2) & \text{for } y_2 < 0. \end{cases} \qquad \text{(K.1b)}$$

Theorem K.1. Let f = (f1 , f2 ) : R2 \ {(0, 0)} −→ R2 and consider the autonomous ODE
$$y_1' = f_1(y_1, y_2), \qquad \text{(K.2a)}$$
$$y_2' = f_2(y_1, y_2), \qquad \text{(K.2b)}$$
together with its polar coordinate version
$$r' = g_1(r, \varphi), \qquad \text{(K.3a)}$$
$$\varphi' = g_2(r, \varphi), \qquad \text{(K.3b)}$$

where g = (g1 , g2 ) : R+ × R −→ R2 ,
$$g_1(r, \varphi) := f_1(r\cos\varphi,\, r\sin\varphi)\, \cos\varphi + f_2(r\cos\varphi,\, r\sin\varphi)\, \sin\varphi, \qquad \text{(K.4a)}$$
$$g_2(r, \varphi) := \frac{1}{r} \Big( f_2(r\cos\varphi,\, r\sin\varphi)\, \cos\varphi - f_1(r\cos\varphi,\, r\sin\varphi)\, \sin\varphi \Big). \qquad \text{(K.4b)}$$

Let µ : I −→ R2 be a solution to (K.3).

(a) Then
$$\phi : I \longrightarrow \mathbb{R}^2, \qquad \phi(x) := \big( \mu_1(x)\, \cos\mu_2(x),\ \mu_1(x)\, \sin\mu_2(x) \big), \qquad \text{(K.5)}$$

is a solution to (K.2).

(b) If µ satisfies the initial condition

µ1 (0) = ρ, ρ ∈ R+ , (K.6a)

µ2 (0) = τ, τ ∈ R, (K.6b)

and if
$$\eta_1 = \rho \cos\tau, \qquad \text{(K.7a)}$$
$$\eta_2 = \rho \sin\tau, \qquad \text{(K.7b)}$$
then φ satisfies the initial condition
$$\phi_1(0) = \eta_1, \qquad \text{(K.8a)}$$
$$\phi_2(0) = \eta_2. \qquad \text{(K.8b)}$$

Note that ρ > 0 implies (η1 , η2 ) 6= (0, 0), and that, for (ρ, τ ) ∈ R+ × [0, 2π[, (K.7)

is equivalent to

r(η1 , η2 ) = ρ, (K.9a)

ϕ(η1 , η2 ) = τ (K.9b)

Proof. Exercise.

Example K.2. Consider the autonomous ODE (K.2) with

$$f_1 : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \qquad f_1(y_1, y_2) := y_1 \big( 1 - r(y_1,y_2) \big) - \frac{y_2 \big( r(y_1,y_2) - y_1 \big)}{2\, r(y_1,y_2)}, \qquad \text{(K.10a)}$$
$$f_2 : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \qquad f_2(y_1, y_2) := y_2 \big( 1 - r(y_1,y_2) \big) + \frac{y_1 \big( r(y_1,y_2) - y_1 \big)}{2\, r(y_1,y_2)}, \qquad \text{(K.10b)}$$

where r is the radial polar coordinate function as defined in (K.1a). Using g : R+ ×R −→

R2 as defined in (K.4), one obtains, for each (ρ, ϕ) ∈ R+ × R,
$$\begin{aligned} g_1(\rho, \varphi) &= \left( \rho\cos\varphi\, (1-\rho) - \frac{\rho\sin\varphi\,(\rho - \rho\cos\varphi)}{2\rho} \right) \cos\varphi + \left( \rho\sin\varphi\, (1-\rho) + \frac{\rho\cos\varphi\,(\rho - \rho\cos\varphi)}{2\rho} \right) \sin\varphi \\ &= \rho\, (1 - \rho), \end{aligned} \qquad \text{(K.11a)}$$
$$\begin{aligned} g_2(\rho, \varphi) &= \frac{1}{\rho} \left( \left( \rho\sin\varphi\, (1-\rho) + \frac{\rho\cos\varphi\,(\rho - \rho\cos\varphi)}{2\rho} \right) \cos\varphi - \left( \rho\cos\varphi\, (1-\rho) - \frac{\rho\sin\varphi\,(\rho - \rho\cos\varphi)}{2\rho} \right) \sin\varphi \right) \\ &= \frac{1 - \cos\varphi}{2}, \end{aligned} \qquad \text{(K.11b)}$$

such that the autonomous ODE
$$r' = r\, (1 - r), \qquad \text{(K.12a)}$$
$$\varphi' = \frac{1 - \cos\varphi}{2} \overset{\text{[Phi16, (I.1c)]}}{=} \sin^2\frac{\varphi}{2}, \qquad \text{(K.12b)}$$

is the polar coordinate version of (K.2) as defined in Th. K.1.
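The simplifications (K.11) are easy to spot-check: evaluating g1 , g2 via (K.4) with f1 , f2 from (K.10) at sample points reproduces ρ(1 − ρ) and (1 − cos ϕ)/2 up to rounding (a plain-Python sketch with ad-hoc names, not part of the notes).

```python
import math

def r(y1, y2):
    return math.hypot(y1, y2)  # radial coordinate (K.1a)

def f1(y1, y2):  # (K.10a)
    return y1 * (1 - r(y1, y2)) - y2 * (r(y1, y2) - y1) / (2 * r(y1, y2))

def f2(y1, y2):  # (K.10b)
    return y2 * (1 - r(y1, y2)) + y1 * (r(y1, y2) - y1) / (2 * r(y1, y2))

def g1(rho, phi):  # (K.4a)
    c, s = math.cos(phi), math.sin(phi)
    return f1(rho * c, rho * s) * c + f2(rho * c, rho * s) * s

def g2(rho, phi):  # (K.4b)
    c, s = math.cos(phi), math.sin(phi)
    return (f2(rho * c, rho * s) * c - f1(rho * c, rho * s) * s) / rho

for rho in (0.5, 1.0, 2.0):
    for phi in (0.3, 1.0, 2.5):
        assert abs(g1(rho, phi) - rho * (1 - rho)) < 1e-12           # (K.11a)
        assert abs(g2(rho, phi) - (1 - math.cos(phi)) / 2) < 1e-12   # (K.11b)
```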


The maximal solution to the initial value problem consisting of (K.12a) and the initial condition r(0) = ρ, ρ ∈ R+ , is
$$Y_p : D_{p,0} \longrightarrow \mathbb{R}^+, \qquad Y_p(x, \rho) := \frac{\rho}{\rho + (1 - \rho)\, e^{-x}}, \qquad \text{(K.14)}$$
where
$$D_{p,0} = \big( \mathbb{R} \times\, ]0,1] \big) \cup \left\{ (x, \rho) :\ \rho > 1,\ x \in \left] -\ln\frac{\rho}{\rho-1},\ \infty \right[ \right\}. \qquad \text{(K.15)}$$
Indeed,
$$\forall_{\rho \in \mathbb{R}^+}\quad Y_p(0, \rho) = \frac{\rho}{\rho + (1 - \rho)} = \rho \qquad \text{(K.16)}$$
and
$$\forall_{(x,\rho) \in D_{p,0}}\quad Y_p'(x, \rho) = \frac{\rho\, (1 - \rho)\, e^{-x}}{\big( \rho + (1 - \rho)\, e^{-x} \big)^2} = Y_p(x, \rho)\, \big( 1 - Y_p(x, \rho) \big). \qquad \text{(K.17)}$$
To verify the form of Dp,0 , we note that the denominator in (K.14) is positive for each x ∈ R if 0 < ρ ≤ 1. If ρ > 1, then the function a : R −→ R, a(x) := ρ + (1 − ρ) e−x , is strictly increasing (note a′ (x) = (ρ − 1) e−x > 0) and has a unique zero at x = − ln(ρ/(ρ − 1)). Thus, limx↓− ln(ρ/(ρ−1)) Yp (x, ρ) = ∞, proving the maximality of Yp (·, ρ).
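That Yp from (K.14) satisfies the initial condition (K.16) and the logistic ODE (K.17) can be spot-checked numerically (a sketch, not part of the notes; central differences stand in for the derivative).

```python
import math

def Yp(x, rho):
    # logistic solution (K.14)
    return rho / (rho + (1 - rho) * math.exp(-x))

# initial condition (K.16)
for rho in (0.25, 1.0, 3.0):
    assert abs(Yp(0.0, rho) - rho) < 1e-12

# ODE (K.17) via central differences: Yp' = Yp (1 - Yp)
h = 1e-6
for rho in (0.25, 1.0, 3.0):
    for x in (0.5, 1.0, 2.0):  # points inside D_{p,0} for these rho
        num = (Yp(x + h, rho) - Yp(x - h, rho)) / (2 * h)
        assert abs(num - Yp(x, rho) * (1 - Yp(x, rho))) < 1e-6
```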

To treat (K.12b) with the initial condition ϕ(0) = τ , τ ∈ R, define
$$x_0(\tau) := \frac{2\cos\tau + 2}{\sin\tau} \quad \text{for } \tau \in \mathbb{R} \setminus \{l\pi : l \in \mathbb{Z}\}, \qquad \text{(K.18a)}$$
$$R_k := \,]0, \pi[\, + 2k\pi, \qquad \text{(K.18b)}$$
$$L_k := \,]-\pi, 0[\, + 2k\pi, \qquad \text{(K.18c)}$$
$$A_0 := \mathbb{R} \times \{2k\pi : k \in \mathbb{Z}\}, \qquad \text{(K.18d)}$$
$$A_{0,k} := \mathbb{R}^- \times \{\pi + 2k\pi\}, \qquad \text{(K.18e)}$$
$$B_{0,k} := \mathbb{R}^+ \times \{\pi + 2k\pi\}, \qquad \text{(K.18f)}$$
$$A_{1,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x \in \,]-\infty, x_0(\tau)[,\ \tau \in R_k \big\}, \qquad \text{(K.18g)}$$
$$A_{2,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x \in \,]x_0(\tau), \infty[,\ \tau \in L_k \big\}, \qquad \text{(K.18h)}$$
$$B_{1,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x \in \,]x_0(\tau), \infty[,\ \tau \in R_k \big\}, \qquad \text{(K.18i)}$$
$$B_{2,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x \in \,]-\infty, x_0(\tau)[,\ \tau \in L_k \big\}, \qquad \text{(K.18j)}$$
$$C_{1,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x = x_0(\tau),\ \tau \in R_k \big\}, \qquad \text{(K.18k)}$$
$$C_{2,k} := \big\{ (x, \tau) \in \mathbb{R}^2 :\ x = x_0(\tau),\ \tau \in L_k \big\} \qquad \text{(K.18l)}$$
(k ∈ Z). Then the maximal solution to the initial value problem consisting of
$$\varphi' = q(\varphi), \qquad q : \mathbb{R} \longrightarrow \mathbb{R}, \qquad q(\varphi) := \frac{1 - \cos\varphi}{2} = \sin^2\frac{\varphi}{2}, \qquad \text{(K.19)}$$
and ϕ(0) = τ


is
$$Y_q : \mathbb{R}^2 \longrightarrow \mathbb{R}, \qquad Y_q(x, \tau) := \begin{cases} \tau & \text{for } (x,\tau) \in A_0,\\ 2\big(k\pi + \arctan\frac{-2}{x}\big) & \text{for } (x,\tau) \in A_{0,k},\ k \in \mathbb{Z},\\ 2\big((k+1)\pi + \arctan\frac{-2}{x}\big) & \text{for } (x,\tau) \in B_{0,k},\ k \in \mathbb{Z},\\ \pi + 2k\pi & \text{for } (x,\tau) = (0, \pi + 2k\pi),\ k \in \mathbb{Z},\\ 2\big(k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\big) & \text{for } (x,\tau) \in A_{1,k} \cup A_{2,k},\ k \in \mathbb{Z},\\ \pi + 2k\pi & \text{for } (x,\tau) \in C_{1,k},\ k \in \mathbb{Z},\\ 2\big((k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\big) & \text{for } (x,\tau) \in B_{1,k},\ k \in \mathbb{Z},\\ -\pi + 2k\pi & \text{for } (x,\tau) \in C_{2,k},\ k \in \mathbb{Z},\\ 2\big((k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\big) & \text{for } (x,\tau) \in B_{2,k},\ k \in \mathbb{Z}. \end{cases} \qquad \text{(K.20)}$$

Note that
$$\mathbb{R}^2 = A_0 \,\dot\cup\, \bigcup_{k\in\mathbb{Z}} \Big( \{(0, \pi + 2k\pi)\} \,\dot\cup\, A_{0,k} \,\dot\cup\, B_{0,k} \,\dot\cup\, A_{1,k} \,\dot\cup\, A_{2,k} \,\dot\cup\, B_{1,k} \,\dot\cup\, B_{2,k} \,\dot\cup\, C_{1,k} \,\dot\cup\, C_{2,k} \Big) \qquad \text{(K.21)}$$

and, introducing the auxiliary function

∆ : R2 −→ R, ∆(x, τ ) := 2 cos τ − x sin τ + 2, (K.22)

one has

∆(x, τ ) 6= 0 for each (x, τ ) ∈ A1,k ∪ A2,k ∪ B1,k ∪ B2,k , k ∈ Z. (K.23)

It remains to show that, for each τ ∈ R, the function x 7→ Yq (x, τ ) is differentiable on

R, satisfying

$$\forall_{x \in \mathbb{R}}\quad Y_q'(x, \tau) = \frac{1 - \cos Y_q(x, \tau)}{2}, \qquad \text{(K.24)}$$

and the initial condition

Yq (0, τ ) = τ. (K.25)
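Before verifying (K.24) and (K.25) analytically, one can sanity-check (K.20) numerically by integrating (K.19) with a standard scheme and comparing against the closed form on one region. The Python sketch below (a classical Runge-Kutta integrator; all names are ad hoc) uses $\tau = \pi/2 \in R_0$, where $x_0(\tau) = 2$, and the $A_{1,0}$ branch of (K.20) at $x = 1.5 < x_0(\tau)$:

```python
import math

def q(phi):
    """Right-hand side of (K.19): sin^2(phi/2)."""
    return math.sin(phi / 2) ** 2

def rk4(phi0, x_end, n=20000):
    """Classical RK4 for phi' = q(phi), phi(0) = phi0, on [0, x_end]."""
    h = x_end / n
    phi = phi0
    for _ in range(n):
        k1 = q(phi)
        k2 = q(phi + h * k1 / 2)
        k3 = q(phi + h * k2 / 2)
        k4 = q(phi + h * k3)
        phi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return phi

tau = math.pi / 2
x = 1.5  # lies in ]-inf, x0(tau)[ = ]-inf, 2[, so (x, tau) is in A_{1,0}
closed_form = 2 * math.atan(
    2 * math.sin(tau) / (2 * math.cos(tau) - x * math.sin(tau) + 2)
)
numeric = rk4(tau, x)
err = abs(closed_form - numeric)
```
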

The initial condition (K.25) is satisfied, since
\[
\forall_{\tau \in \{k\pi :\, k \in \mathbb{Z}\}}\quad Y_q(0,\tau) = \tau, \tag{K.26a}
\]
\begin{align*}
\forall_{\tau \in R_k \cup L_k,\ k \in \mathbb{Z}}\quad Y_q(0,\tau) &= 2k\pi + 2\arctan\frac{2\sin\tau}{2\cos\tau + 2} \overset{\text{[Phi16, (I.1d)]}}{=} 2k\pi + 2\arctan\tan\frac{\tau}{2}\\
&= 2k\pi + 2\left(\frac{\tau}{2} - k\pi\right) = \tau. \tag{K.26b}
\end{align*}
Next, we show that, for each $\tau \in \mathbb{R}$, the function $x \mapsto Y_q(x,\tau)$ is differentiable on $\mathbb{R}$ and satisfies (K.24): For $\tau \in \{2k\pi : k \in \mathbb{Z}\}$, $x \mapsto Y_q(x,\tau)$ is constant, i.e. differentiability is clear, and
\[
\forall_{x \in \mathbb{R}}\quad \frac{1 - \cos Y_q(x,\tau)}{2} = \frac{1 - \cos(2k\pi)}{2} = 0 = Y_q'(x,\tau) \tag{K.27}
\]


proves (K.24).

For each $\tau \in \{(2k+1)\pi : k \in \mathbb{Z}\}$, differentiability is clear in each $x \in \mathbb{R} \setminus \{0\}$. Moreover,
\[
\forall_{x \in \mathbb{R} \setminus \{0\}}\quad \left(2\arctan\left(-\frac{2}{x}\right)\right)' = 2\,\frac{\frac{2}{x^2}}{1 + \frac{4}{x^2}} = \frac{4}{4 + x^2}, \tag{K.28}
\]
and, thus, for each $x \in \mathbb{R} \setminus \{0\}$,
\begin{align*}
\frac{1 - \cos Y_q(x,\tau)}{2} &= \frac{1}{2} - \frac{1}{2}\cos\left(2\arctan\left(-\frac{2}{x}\right)\right) \overset{\text{[Phi16, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2}\cdot\frac{1 - \frac{4}{x^2}}{1 + \frac{4}{x^2}}\\
&= \frac{1}{2} - \frac{1}{2}\cdot\frac{x^2 - 4}{x^2 + 4} = \frac{8}{2(4 + x^2)} = \frac{4}{4 + x^2} \overset{\text{(K.28)}}{=} Y_q'(x,\tau), \tag{K.29}
\end{align*}
proving (K.24) for each $x \in \mathbb{R} \setminus \{0\}$. It remains to consider $x = 0$. One has, by L'H\^opital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow 0} \frac{Y_q(0,\tau) - Y_q(x,\tau)}{0 - x} &= \lim_{x \uparrow 0} \frac{\pi + 2k\pi - 2\left(k\pi + \arctan\left(-\frac{2}{x}\right)\right)}{-x}\\
&\overset{\text{[Phi16, (9.29)], (K.28)}}{=} \lim_{x \uparrow 0} \frac{-\frac{4}{4 + x^2}}{-1} = 1 \tag{K.30}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow 0} \frac{Y_q(0,\tau) - Y_q(x,\tau)}{0 - x} &= \lim_{x \downarrow 0} \frac{\pi + 2k\pi - 2\left((k+1)\pi + \arctan\left(-\frac{2}{x}\right)\right)}{-x}\\
&\overset{\text{[Phi16, (9.29)], (K.28)}}{=} \lim_{x \downarrow 0} \frac{-\frac{4}{4 + x^2}}{-1} = 1, \tag{K.31}
\end{align*}
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = 0$ with $Y_q'(0,\tau) = 1$. Due to
\[
\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'(0, \pi + 2k\pi), \tag{K.32}
\]
(K.24) also holds.
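The gluing at $x = 0$ can also be observed numerically: a central difference of $x \mapsto Y_q(x,\pi)$ across $x = 0$ should approximate the derivative value $1$ obtained in (K.30)--(K.32). A minimal sketch (the function name is ad hoc):

```python
import math

def Yq_pi(x):
    """Y_q(x, pi) from (K.20) with k = 0: the A_{0,0} and B_{0,0} branches,
    glued at the single point (0, pi)."""
    if x < 0:
        return 2 * math.atan(-2 / x)               # branch on A_{0,0}
    if x > 0:
        return 2 * (math.pi + math.atan(-2 / x))   # branch on B_{0,0}
    return math.pi                                  # the point (0, pi)

h = 1e-6
central_diff = (Yq_pi(h) - Yq_pi(-h)) / (2 * h)   # should be close to Yq'(0, pi) = 1
rhs = (1 - math.cos(Yq_pi(0.0))) / 2              # right-hand side of (K.24) at x = 0
```
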

For each $\tau \in R_k \cup L_k$, the differentiability is clear in each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$. Moreover, recalling $\Delta(x,\tau)$ from (K.22), one has, for each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$,
\[
\left(2\arctan\frac{2\sin\tau}{\Delta(x,\tau)}\right)' = 2\,\frac{\frac{2(\sin\tau)^2}{(\Delta(x,\tau))^2}}{1 + \frac{4(\sin\tau)^2}{(\Delta(x,\tau))^2}} = \frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2}, \tag{K.33}
\]
\begin{align*}
\frac{1 - \cos Y_q(x,\tau)}{2} &= \frac{1}{2} - \frac{1}{2}\cos\left(2\arctan\frac{2\sin\tau}{\Delta(x,\tau)}\right) \overset{\text{[Phi16, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2}\cdot\frac{1 - \frac{4(\sin\tau)^2}{(\Delta(x,\tau))^2}}{1 + \frac{4(\sin\tau)^2}{(\Delta(x,\tau))^2}}\\
&= \frac{1}{2} - \frac{1}{2}\cdot\frac{(\Delta(x,\tau))^2 - 4(\sin\tau)^2}{(\Delta(x,\tau))^2 + 4(\sin\tau)^2} = \frac{8(\sin\tau)^2}{2\big(4(\sin\tau)^2 + (\Delta(x,\tau))^2\big)}\\
&= \frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2} \overset{\text{(K.33)}}{=} Y_q'(x,\tau), \tag{K.34}
\end{align*}
proving (K.24) for each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$. It remains to consider $x = x_0(\tau)$.


For $\tau \in R_k$, we have $\sin\tau > 0$ and $x_0(\tau) > 0$, and, thus, by L'H\^opital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow x_0(\tau)} \frac{Y_q\big(x_0(\tau), \tau\big) - Y_q(x,\tau)}{x_0(\tau) - x} &= \lim_{x \uparrow x_0(\tau)} \frac{\pi + 2k\pi - 2\left(k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\right)}{x_0(\tau) - x}\\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2}}{-1} = 1 \tag{K.35}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow x_0(\tau)} \frac{Y_q\big(x_0(\tau), \tau\big) - Y_q(x,\tau)}{x_0(\tau) - x} &= \lim_{x \downarrow x_0(\tau)} \frac{\pi + 2k\pi - 2\left((k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\right)}{x_0(\tau) - x}\\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2}}{-1} = 1, \tag{K.36}
\end{align*}
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau),\tau) = 1$. Due to
\[
\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'\big(x_0(\tau), \tau\big), \tag{K.37}
\]

(K.24) also holds. For $\tau \in L_k$, we have $\sin\tau < 0$ and $x_0(\tau) < 0$, and, thus, by L'H\^opital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow x_0(\tau)} \frac{Y_q\big(x_0(\tau), \tau\big) - Y_q(x,\tau)}{x_0(\tau) - x} &= \lim_{x \uparrow x_0(\tau)} \frac{-\pi + 2k\pi - 2\left((k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\right)}{x_0(\tau) - x}\\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2}}{-1} = 1 \tag{K.38}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow x_0(\tau)} \frac{Y_q\big(x_0(\tau), \tau\big) - Y_q(x,\tau)}{x_0(\tau) - x} &= \lim_{x \downarrow x_0(\tau)} \frac{-\pi + 2k\pi - 2\left(k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2}\right)}{x_0(\tau) - x}\\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x,\tau))^2}}{-1} = 1, \tag{K.39}
\end{align*}
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau),\tau) = 1$. Due to
\[
\frac{1 - \cos(-\pi + 2k\pi)}{2} = 1 = Y_q'\big(x_0(\tau), \tau\big), \tag{K.40}
\]
(K.24) also holds. $\blacksquare$
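The same kind of numerical check works at the patch point $x = x_0(\tau)$: for $\tau = \pi/2 \in R_0$, (K.20) switches from the $A_{1,0}$ branch to the $B_{1,0}$ branch at $x_0(\tau) = 2$, and a central difference there should approximate the derivative value $1$ from (K.35)--(K.37). A minimal sketch (names ad hoc):

```python
import math

tau = math.pi / 2                                 # tau in R_0, sin(tau) = 1
x0 = (2 * math.cos(tau) + 2) / math.sin(tau)      # = 2, by (K.18a)

def Delta(x):
    """Auxiliary function (K.22) at fixed tau."""
    return 2 * math.cos(tau) - x * math.sin(tau) + 2

def Yq(x):
    """Y_q(x, pi/2) from (K.20) with k = 0, glued at x = x0."""
    if x < x0:
        return 2 * math.atan(2 * math.sin(tau) / Delta(x))               # A_{1,0}
    if x > x0:
        return 2 * (math.pi + math.atan(2 * math.sin(tau) / Delta(x)))   # B_{1,0}
    return math.pi                                                        # C_{1,0}

h = 1e-6
central_diff = (Yq(x0 + h) - Yq(x0 - h)) / (2 * h)  # should approximate Yq'(x0) = 1
rhs = (1 - math.cos(Yq(x0))) / 2                    # right-hand side of (K.24) at x0
```
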


The maximal solution of (K.2) with $f_1, f_2$ according to (K.10) and initial value $(\eta_1, \eta_2)$ at $x = 0$ is then
\begin{align*}
&Y : D_{f,0} \longrightarrow \mathbb{R}^2,\\
&Y(x, \eta_1, \eta_2) := \Big( Y_p\big(x, r(\eta_1,\eta_2)\big)\, \cos Y_q\big(x, \varphi(\eta_1,\eta_2)\big),\ Y_p\big(x, r(\eta_1,\eta_2)\big)\, \sin Y_q\big(x, \varphi(\eta_1,\eta_2)\big) \Big), \tag{K.41}
\end{align*}
where
\[
D_{f,0} = \Big( \mathbb{R} \times \{\eta \in \mathbb{R}^2 : 0 < \|\eta\|_2 \leq 1\} \Big) \cup \left\{ (x,\eta) \in \mathbb{R} \times \mathbb{R}^2 : \|\eta\|_2 > 1,\ x \in \left] -\ln\frac{\|\eta\|_2}{\|\eta\|_2 - 1},\, \infty \right[ \right\}. \tag{K.42}
\]

Proof. Since (K.12) is the polar coordinate version of (K.2) with $f_1, f_2$ according to (K.10), everything follows from combining Th. K.1 with Claims 1 and 2. $\blacksquare$

Claim 4. The autonomous ODE (K.2) with $f_1, f_2$ according to (K.10) has $(1,0)$ as its only fixed point, and $(1,0)$ satisfies Def. 5.24(iii) for $x \to \infty$ (even for each $\eta \in \mathbb{R}^2 \setminus \{0\}$) without satisfying Def. 5.24(ii) (i.e. without being positively stable).

Indeed, one has
\[
\forall_{\eta \in \mathbb{R}^2 \setminus \{0\}}\quad \lim_{x \to \infty} Y_p\big(x, r(\eta)\big) = \lim_{x \to \infty} \frac{r(\eta)}{r(\eta) + \big(1 - r(\eta)\big)\, e^{-x}} = 1. \tag{K.43}
\]
If $\varphi(\eta) = 0$, then
\[
\lim_{x \to \infty} Y_q(x, 0) = \lim_{x \to \infty} 0 = 0. \tag{K.44a}
\]
If $\varphi(\eta) = \pi$, then
\[
\lim_{x \to \infty} Y_q(x, \pi) = \lim_{x \to \infty} 2\left(\pi + \arctan\left(-\frac{2}{x}\right)\right) = 2(\pi + 0) = 2\pi. \tag{K.44b}
\]
If $0 < \varphi(\eta) < \pi$ or $\pi < \varphi(\eta) < 2\pi$, then $\sin\varphi(\eta) \neq 0$ and, thus,
\[
\lim_{x \to \infty} Y_q\big(x, \varphi(\eta)\big) = \lim_{x \to \infty} 2\left(\pi + \arctan\frac{2\sin\varphi(\eta)}{2\cos\varphi(\eta) - x\sin\varphi(\eta) + 2}\right) = 2(\pi + 0) = 2\pi. \tag{K.44c}
\]
Combining (K.43) with (K.44) yields
\[
\forall_{\eta \in \mathbb{R}^2 \setminus \{0\}}\quad \lim_{x \to \infty} Y(x, \eta) = (1, 0). \tag{K.45}
\]
While $(1,0)$ is clearly a fixed point for (K.2) with $f_1, f_2$ according to (K.10), (K.45) shows that no other $\eta \in \mathbb{R}^2 \setminus \{0\}$ can be a fixed point.


For each $\tau \in\, ]0,\pi[$ and $\eta := (\cos\tau, \sin\tau)$, one has $\varphi(\eta) = \tau$ and $Y_q(0, \varphi(\eta)) = \tau$. Thus, due to (K.44c) and the intermediate value theorem, the continuous function $Y_q(\cdot, \varphi(\eta))$ must attain every value between $\tau$ and $2\pi$; in particular, there is $x_\tau > 0$ such that $Y_q(x_\tau, \varphi(\eta)) = \pi$ and, hence, $Y(x_\tau, \eta) = (\cos\pi, \sin\pi) = (-1, 0)$. Since every neighborhood of $(1,0)$ contains points $\eta = (\cos\tau, \sin\tau)$ with $\tau \in\, ]0,\pi[$, this shows that $(1,0)$ does not satisfy Def. 5.24(ii) for $x \geq 0$. $\blacksquare$
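The behavior just proved -- attraction to $(1,0)$ without positive stability -- can also be demonstrated numerically. The sketch below integrates the polar form of the system with RK4, assuming (consistently with $Y_p$ in (K.43) and with (K.19)) the radial equation $r' = r(1-r)$ and the angular equation $\varphi' = \sin^2(\varphi/2)$; all names are ad hoc. Starting on the unit circle at the small angle $\varphi(0) = 0.1$, the trajectory travels past $(-1,0)$ (distance $\approx 2$ from the fixed point) before returning toward $(1,0)$:

```python
import math

def f(r, phi):
    """Polar form of the system: logistic radial part, angular part as in (K.19)."""
    return r * (1 - r), math.sin(phi / 2) ** 2

def dist_to_fixed_point(r, phi):
    """Euclidean distance of the Cartesian point (r cos phi, r sin phi) from (1, 0)."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    return math.hypot(x - 1.0, y)

r, phi = 1.0, 0.1          # start on the unit circle, close to (1, 0)
h, steps = 0.01, 60000     # integrate up to x = 600
max_dist = dist_to_fixed_point(r, phi)
for _ in range(steps):
    k1 = f(r, phi)
    k2 = f(r + h * k1[0] / 2, phi + h * k1[1] / 2)
    k3 = f(r + h * k2[0] / 2, phi + h * k2[1] / 2)
    k4 = f(r + h * k3[0], phi + h * k3[1])
    r += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    phi += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    max_dist = max(max_dist, dist_to_fixed_point(r, phi))
final_dist = dist_to_fixed_point(r, phi)
```

The excursion to distance $\approx 2$ despite the arbitrarily small initial distance is exactly the failure of positive stability, while `final_dist` being small reflects the attraction (K.45).
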

References

[Aul04] Bernd Aulbach. Gewöhnliche Differenzialgleichungen, 2nd ed. Spektrum Akademischer Verlag, Heidelberg, Germany, 2004 (German).

[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.

Mathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.

Maximilians-Universität, Germany, 2015, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc2_forStatStudents.pdf.

[Phi16] P. Philip. Analysis I: Calculus of One Real Variable. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2015/2016, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/analysis1.pdf.

[Pre75] G. Preuß. Allgemeine Topologie, 2nd ed. Springer-Verlag, Berlin, 1975 (German).

[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of Linear Systems with Constant Coefficients. The American Mathematical Monthly 73 (1966), No. 1, 2–7.

[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathematik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).

[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (German).
