
Classical Mechanics and Dynamical Systems

Faculty of Transportation Sciences
Czech Technical University in Prague

Contents

1 Classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.1 Newton’s laws of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.2 Index notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.2.1 Einstein’s convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.2.2 Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.3 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.4 Potential energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

1.5 Conservation of energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

1.6 Conservation of momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

1.7 Conservation of angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

1.8 Curvilinear coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

1.8.1 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

1.8.2 Spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2 Lagrange equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.2 Lagrange equations of the second kind . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2.2.1 Generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2.2.2 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.2.3 Generalized forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.2.4 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.3 Lagrange equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2.4 Particle in homogeneous gravitational field . . . . . . . . . . . . . . . . . . . . . . . . 47

2.5 Harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

2.6 Mathematical pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51


2.8 Solving the equations of motion of pendulum . . . . . . . . . . . . . . . . . . . . . . 55

2.9 Deriving the Lagrangian in Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . 57

2.10 Planet in gravitational field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3 Hamilton’s equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.2 Legendre transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

3.3 Hamilton’s equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

3.4 Particle in homogeneous gravitational field . . . . . . . . . . . . . . . . . . . . . . . . 68

3.5 Conservation of energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

3.5.1 Homogeneous functions and Hamiltonian . . . . . . . . . . . . . . . . . . . . . 70

3.5.2 Conservation of energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

3.6 Phase space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

3.7 Harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

4 Variational principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4.1 Fermat’s principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4.2 Formulation of variational problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4.3 Variation of the functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

4.4 Euler-Lagrange equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.5 Non-uniqueness of the Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

4.6 Variational derivation of Hamilton’s equations . . . . . . . . . . . . . . . . . . . . . 91

4.7 Noether’s theorem: motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

4.8 Noether’s theorem: proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

4.9 Basic conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.1 Canonical transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

5.2 Hamilton-Jacobi equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

5.3 Example: harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5.4 Action-angle variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.1 Lagrangian and equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

6.2 Hamilton’s equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6.3 Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

6.4 Homogeneous fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

6.5 Electromagnetic wave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124


7.1 Complex sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

7.2 Mandelbrot set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

8.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

8.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

8.3 Implementation in Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

8.4 Chaotic pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

8.5 Critical points of the pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

8.6 Stability of critical points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

8.6.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

8.7 Classification of critical points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

8.7.1 Stable and unstable nodes, saddle points . . . . . . . . . . . . . . . . . . . . . 161

8.7.2 Centres and foci . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

8.8 General case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

8.9 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

8.10 Flow of the vector field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

8.11 Lyapunov stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

9 Bifurcations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

9.1 Saddle-node bifurcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

9.2 Transcritical bifurcations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

9.3 Pitchfork bifurcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

9.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

A.1 D-derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

A.2 Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

B.1 Rules of replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

B.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

B.3 Pure functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

B.4 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

B.5 Working with heads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208


C.1 Greek letters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

D To do . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

1 Classical mechanics

Classical mechanics is the most basic part of physics. In fact, physics as an exact science started with the development of mechanics by Sir Isaac Newton. Conventionally we distinguish two parts of mechanics: kinematics and dynamics. In this chapter we introduce basic notions of dynamics, including the notion of generalized coordinates and several conventions used throughout the entire textbook.

The word “kinematics” is derived from the Greek word κινει̃ν (kinein) meaning

“to move”. Thus, kinematics studies the motion of bodies and point masses. It

does not, however, ask why the bodies move in a given way, but rather it provides

us with the description of the motion. In kinematics we ask where the bodies are,

at what velocities and with what accelerations they move. We also classify the kinds

of motion according to the shapes of the trajectories or according to the velocities.

Typical kinematic quantities are position, velocity and acceleration.

In dynamics, on the other hand, we study the causes of the motion. The word “dynamics” has an ancient origin as well: δυναµικóς means “powerful”. In this branch of mechanics we ask what forces act on the bodies and what influence the forces have on the motion. This influence depends not only on the forces themselves, but also on the masses of the bodies. Mass, force and momentum belong to the basic quantities in dynamics.

In this chapter we start with a reformulation of non-relativistic classical, i.e. Newtonian, dynamics. In Newtonian dynamics, physical bodies or idealized point particles move and interact according to Newton's laws of motion. Newton himself formulated them in his Mathematical Principles of Natural Philosophy as follows:


• Law of inertia

Every body persists in its state of being at rest or of moving uniformly straight

forward, except insofar as it is compelled to change its state by force impressed.

• Law of force

The alteration of motion is ever proportional to the motive force impressed; and

is made in the direction of the right line in which that force is impressed.

• Law of action and reaction

To every action there is always an equal and opposite reaction: or the forces of

two bodies on each other are always equal and are directed in opposite directions.

These laws involve the important notions of force, momentum and mass, and we assume that the reader is familiar with them. Consider a point particle of mass m. Choosing some fixed point O (origin) in the space, we can describe the motion of the particle by the position vector (radius vector) r. The position vector is time-dependent if the body is moving with respect to the origin O, which is denoted mathematically by r = r(t).

The trajectory of the point particle is the set of all end-points of the position vector in some time interval (see fig. 1.1). Velocity is defined as the derivative of r(t) with respect to time:

v(t) = dr/dt.

The total derivative with respect to time will often be denoted by a “dot”, so that the last equation is briefly written as v = ṙ(t) = ṙ. Sometimes it is useful to parametrize the position vector by a parameter other than time, for example by the length of the trajectory.

Similarly, the second derivative with respect to time will be denoted by a “double-dot”. The most important example is the definition of acceleration, which is the second derivative of the position vector with respect to time:

a = d²r/dt² = dv/dt = r̈ = v̇.

Quantities r, v and a are so-called kinematic quantities. They describe the motion independently of the causes and reasons of motion. According to Aristotle, the motion is caused by forces, but this is wrong. Aristotle's opinion was so influential that it stopped the progress in physics for the next two thousand years. The experimental research of Galileo Galilei, his discovery of the law of inertia, and finally the grand work of Isaac Newton founded the basis of modern physics.

Why is Aristotle’s point of view wrong? Well, we have to clarify what we mean by

the statement that “the motion is caused by the forces”. The law of inertia says that


Fig. 1.1. The point mass moves along the trajectory. Its position at time t is given by the position vector r(t); the velocity v = ṙ is tangent to the trajectory.

if there is no force, the body will move uniformly along a straight line. We need

the force to change the motion, not to preserve it. Therefore there is no connection

between the force and velocity, but there must be a relation between the force and

acceleration. This crucial point was missed by Aristotle.

The precise form of the relation between force and acceleration is given by Newton's second law, the law of force. We expect that the acceleration has the same direction as the force, and that a bigger force causes a bigger acceleration. Experience teaches us that we need a bigger force to change the motion of heavier bodies, so the acceleration must be inversely proportional to the mass. This simple consideration directly leads us to the suggestion

a = F/m,

where m is the mass of the body and F is the force acting on the body. The last formula is a mathematical expression of Newton's second law, and experiments show that it is in very good accordance with reality, although it fails for high velocities, strong gravitational fields and for microscopic objects.

We can formulate this law in a slightly different form by defining the (linear) momentum p of the body:

p = m v.

Momentum incorporates both the measure of the inertia (mass) and the “state of motion”, the velocity. The force can then be defined as the change of the momentum in time, i.e.


F = dp/dt.

If the mass of the body is constant in time, we have ṗ = mv̇ = ma, which is again

Newton’s law

F = m a.

1.2 Index notation

In the previous section we defined the position vector, velocity and the other quantities geometrically. For example, by the position vector we mean an oriented line segment connecting the origin with a given point, the velocity was defined as a vector tangent to the trajectory, etc. This geometrical language is very convenient and useful and will be developed in more detail throughout the textbook. However, if we want to describe the position of a body, we have to introduce a coordinate system in which we can specify the coordinates. In what follows the notion of coordinates will be crucial. In this section we therefore briefly review the basics of the so-called index notation.

We suppose that the reader is familiar with the Cartesian coordinate system. Consider figure 1.2. Again, the point O is the origin of the reference frame and we want to specify the position of the point P. The position vector r is an oriented line segment connecting points O and P. Now, choose three lines called x, y and z, which are perpendicular to each other and intersect at the origin O. These lines are called axes. Then to each point P we can assign three real numbers called Cartesian coordinates.

Symbolically we write

r = (x, y, z)

and say that x, y and z are the coordinates (or the components) of the position vector

r. In the index notation we define

x1 = x, x2 = y, x3 = z,

so that all three coordinates can be written uniformly as xi, i = 1, 2, 3.

If the position vector depends on time, i.e. r = r(t), also its coordinates xi do:

xi = xi (t).


The components of the velocity are then vi = ẋi,

and the components of the acceleration are

ai = v̇i = ẍi .

Any vector equation can then be written equivalently in the index form. For example, Newton's law of force F = m a is equivalent to the equation

Fi = m ai .

Substituting for ai we obtain the index form of the law of force:

Fi = m ẍi .

Recall that the momentum of the particle was defined as p = m v. In the index

notation we can write

pi = m vi = m ẋi .

1.2.1 Einstein's convention

Let x and y be arbitrary vector quantities with corresponding components xi and yi. In other words,

x = (x1, x2, x3),   y = (y1, y2, y3). (1.1)

The scalar product of these vectors can be defined as the scalar quantity

x · y = x1y1 + x2y2 + x3y3.

Using the summation symbol Σ, the last equality can be rewritten as

x · y = Σ_{i=1}^{3} xi yi.

This notation means that we subsequently substitute values 1, 2, 3 for the variable

i and then add all terms. Notice that under the summation symbol we have two

vector quantities and the index i appears there exactly twice. In fact, expressions of

this type arise very often in mathematics and physics. Albert Einstein introduced a

convention named after him, in which we do not write the symbol Σ. More precisely,

if some index appears in some term exactly twice, the sum over this index is

automatically assumed. That is, the scalar product can be written simply as

x · y = xi yi.


1.2.2 Differentiation

Let us see another example. Suppose that f is a physical quantity depending on the position, i.e. f = f(x), and let the coordinates in turn depend on time, i.e. xi = xi(t). Then also the quantity f depends on time, for we can write

f(t) = f(x(t)).

Its total time derivative is given by the chain rule:

ḟ = (∂f/∂x1)(dx1/dt) + (∂f/∂x2)(dx2/dt) + (∂f/∂x3)(dx3/dt) = Σ_{i=1}^{3} (∂f/∂xi)(dxi/dt).

We can see that the expression under the sum has again the same structure: index i

appears there exactly twice. According to Einstein’s convention we therefore write

ḟ = (∂f/∂xi)(dxi/dt).

For convenience we also introduce the notation

∂/∂xi ≡ ∂i,

and together with the notation ẋi = dxi/dt we can write simply

ḟ = ẋi ∂i f.

1.2.3 Examples


Fig. 1.2. Position vector in Cartesian coordinates.

Example 1: scalar product

Compute the scalar product of the vectors x = (1, 2, 3) and y = (−5, 1, 1).

Solution. Both vectors have three components, so that the index i takes the values i = 1, 2, 3. In the index notation, the scalar product reads

x · y = xi yi.

Since the index i repeats twice in the previous expression, according to Einstein’s

summation convention we have

xi yi = x1 y1 + x2 y2 + x3 y3 .

For the given vectors,

xi yi = 1 · (−5) + 2 · 1 + 3 · 1 = 0,

so that x · y = 0.
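The same check can be performed directly in Mathematica, whose built-in Dot operator (.) implements exactly this sum:

{1, 2, 3} . {-5, 1, 1}   (* gives 0 *)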


Example 2: divergence

Let

v = (v1, v2, v3)

be a vector field. Its divergence is, in the index notation, ∂i vi. Find the explicit expression for the divergence of the vector field

v = (x − y, x² + y², xy). (1.3)

Solution. Writing out the repeated index, we have

∂i vi = ∂1v1 + ∂2v2 + ∂3v3,

or, equivalently,

∂i vi = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z.

Substituting (1.3) we find

∂1v1 = 1,   ∂2v2 = 2y,   ∂3v3 = 0, (1.4)

so that

∂i vi = 1 + 2y.
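The result can be cross-checked in Mathematica, which in recent versions provides the built-in function Div:

Div[{x - y, x^2 + y^2, x y}, {x, y, z}]   (* gives 1 + 2 y *)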

Example 3: Laplacian

The Laplace operator (Laplacian) is defined by

∆f = ∂i ∂i f,

where f is an arbitrary object (a scalar function or a component of a vector). Find the expression for the Laplacian in the Cartesian coordinates.


Solution. Writing out the repeated index, we find

∆f = ∂i∂i f = ∂1∂1 f + ∂2∂2 f + ∂3∂3 f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z².

Thus, the Laplacian reads

∆ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z².
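Again, recent versions of Mathematica provide a built-in Laplacian; applied for instance to the function x² + y² + z² it gives

Laplacian[x^2 + y^2 + z^2, {x, y, z}]   (* gives 6 *)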

Example 4: derivatives of the radius

Let

r = (x, y, z)

be the radius vector with magnitude

r = √(r · r) = √(x² + y² + z²).

Find the derivatives of r with respect to the Cartesian coordinates and write down the result in the index notation.

Solution. We need to evaluate quantities ∂i r. We start with ∂1 r, i.e. with the deriva-

tive with respect to coordinate x. We have

∂r/∂x = ∂x (x² + y² + z²)^{1/2} = ½ (x² + y² + z²)^{−1/2} · 2x = x/√(x² + y² + z²).

Thus, we arrived at

∂r/∂x = x/r.

Similarly, one can show that for the other coordinates the following holds:

∂r/∂y = y/r,   ∂r/∂z = z/r.

All these results can be summarized in the index notation as

∂i r = xi/r.


Example 5: circular motion

Now let the radius vector from the previous example be time-dependent in such a way that

x(t) = b cos ωt,
y(t) = b sin ωt, (1.5)
z(t) = 0.

Find the magnitudes of the velocity and of the acceleration.

Solution. Since r = (x, y, z), we can find the velocity by straightforward differenti-

ation:

ẋ = − b ω sin ωt,

ẏ = b ω cos ωt, (1.6)

ż = 0.

Its magnitude is

v = √(v · v) = √((bω sin ωt)² + (bω cos ωt)²) = ωb.

Differentiating once more we obtain the acceleration, whose magnitude is

a = √(a · a) = ω²b.
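Both magnitudes can be verified in Mathematica by straightforward differentiation; the assumptions b > 0 and ω > 0 are needed for the square roots to simplify:

x[t_] = b Cos[ω t]; y[t_] = b Sin[ω t];
Simplify[Sqrt[x'[t]^2 + y'[t]^2], b > 0 && ω > 0]    (* b ω *)
Simplify[Sqrt[x''[t]^2 + y''[t]^2], b > 0 && ω > 0]  (* b ω^2 *)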

Example 6: a useful identity

Prove the identity

d(v²)/dt = 2 v · v̇, (1.7)

preferably in the index notation.


Solution. The components of v are

v = (v1, v2, v3),

and the magnitude of v satisfies v² = v · v = v1v1 + v2v2 + v3v3. Differentiating with respect to time we obtain

d(v²)/dt = 2v1v̇1 + 2v2v̇2 + 2v3v̇3 = 2 v · v̇.

Solution in index notation. The proof is essentially identical to the previous one,

but more compact:

d(v²)/dt = d(vi vi)/dt = 2 vi v̇i = 2 v · v̇.
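The identity can also be checked symbolically in Mathematica:

vel = {v1[t], v2[t], v3[t]};
Simplify[D[vel . vel, t] == 2 vel . D[vel, t]]   (* True *)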

1.3 Kinetic energy

The notion of energy is somewhat subtle and its full understanding relies on the so-called Emmy Noether theorems, see sections 4.7 and 4.8 in chapter 4. In mechanics, however, the situation is quite simple. Roughly speaking, a body has energy if it can perform work.

A force can cause the displacement of a body, and the quantity known as “work” is a quantitative characteristic of this process. Suppose that the force F causes the displacement of the body along a given trajectory from point A to point B. The work done by this force is defined by

W = ∫_A^B F · dr.

Suppose, in addition, that the body was at rest at point A, while the final velocity

of the body was v. Let us evaluate the work done by the force in terms of the final

velocity. Using Newton’s law in the form


F = m v̇

we have

W = m ∫_A^B (dv/dt) · dr.

Along the trajectory, the displacement is dr = v dt,

so that

W = m ∫_A^B v · dv.

Since v · dv = ½ d(v²), this becomes

W = m ∫_A^B d(v²/2).

W = ½ m (vB² − vA²) = ½ m v²,

where we have used the assumptions

vA = 0, vB = v.

Let us discuss the result

W = ½ m v²

in some detail. First, we can see that the work W done by the force F does not

depend on the trajectory. It does not matter whether the body was moving along

the line or along the curved trajectory, the result depends only on the final velocity

v. Moreover, the work does not depend on the character of motion: the body could

be accelerated uniformly with constant acceleration, or it could be accelerated with


variable acceleration, but the work depends only on the final velocity. And, finally,

the work does not depend on the force. The force could be small and act for a long

time, or it could be big and act only for a moment, but the work depends only on

the final velocity.

Summa summarum, if the body was at rest at the beginning but it had velocity v at the end, the work needed to accelerate the body is always the same, regardless of how it was accelerated. This work is called kinetic energy and is defined by

T = ½ m v² = ½ m ẋi ẋi. (1.8)

The body has kinetic energy if it is moving, and the kinetic energy is equal to the work which must be done by the force to accelerate the body from rest to velocity v.

1.4 Potential energy

The notion of potential energy is a subtle one and it cannot be defined for a general system, as we will see later in this textbook. Suppose that there is some force acting on the particle. It can be the gravitational force, the electromagnetic force or the force which a spring exerts on a point mass attached to one of its endpoints. In the cases just enumerated, we can give an explicit expression for the force. The gravitational force is given by Newton's gravitational law

Fg = G m1 m2 / r²,

where m1 and m2 are the masses of the bodies, r is their distance and G is the gravitational constant. The electromagnetic force acting on a point charge q moving at velocity v in an electromagnetic field characterized by electric field E and magnetic field B is the so-called Lorentz force

F_EM = q (E + v × B).

When the spring is displaced from its equilibrium position by y, it exerts the force

F = −ky,

where k is the constant characterizing the spring. Hence, one way to characterize the force is to give an explicit expression. Since the force is a vector quantity, we have to specify three components Fx, Fy and Fz.


Kinetic energy, introduced in the previous section, is a quantity characterizing the state of motion. Recall that it is equal to the work which is necessary to accelerate the body of mass m from the state of rest to the state of motion at velocity v. An important feature of kinetic energy is that it does not depend on the process by which the body acquired its velocity.

Now, suppose that the body is under the influence of a force so that its velocity is being changed. This is connected to a corresponding change of the kinetic energy of the body. For example, a body released from some altitude undergoes the change from zero velocity to accelerated motion called free fall under the influence of the gravitational force. In this case it is the gravitational field which performs the work on the body, and this work is equal to the change of the kinetic energy of the body. Thus, another way to characterize the force is to specify how the kinetic energy of the body changes under this force. Potential energy will therefore be defined as the work performed by the force acting on the body.

Suppose that under the influence of the force, the body was displaced from point A to point B along the trajectory γ depicted in figure 1.3. The work performed by the force during this motion is, as usual, defined by

W = ∫_γ F · dr, (1.9)

where the symbol γ denotes the trajectory along which the body was moving. However, if we choose any other curve γ′, figure 1.3, the work associated with this curve will be, in general, different:

W′ = ∫_{γ′} F · dr ≠ W. (1.10)

In such a case, the notion of potential energy is useless because it depends on the particular trajectory. Hence, in general, potential energy is a meaningless quantity. Surprisingly enough, there are many examples of forces for which the work W in fact does not depend on the choice of the trajectory γ. Such forces are called conservative or potential forces, and in such cases we can define a useful and meaningful potential energy.

Let us find which forces have this property. We demand that the integral (1.9) does not depend on γ but only on the points A and B, and investigate the consequences of this assumption. Then, however, the work performed along any closed loop must be equal to zero. Indeed, let γ be an arbitrary closed loop as depicted in figure 1.4. Let us choose two arbitrary points A and B lying on the curve. In this way we obtain two curves γ1 and γ2, both connecting the points A and B, but along different trajectories.

Fig. 1.3. Under the influence of the force F, the body moves from point A to point B. Potential energy is the work done by the force during this displacement. However, there are infinitely many trajectories connecting these two points and the work is, in general, different for each of them.

The integral over γ can be written as a sum:

∮_γ F · dr = ∫_{γ1} F · dr + ∫_{γ2} F · dr. (1.11)

We have made the assumption that for any two points A and B the integral between these two points does not depend on the trajectory but only on the points themselves. Thus, we can write

∫_{γ1} F · dr = ∫_A^B F · dr,

where we do not specify the trajectory, as the integral does not depend on it. A similar consideration applies to the integral over γ2, but notice that this curve starts at point B and ends at point A. Hence,

∫_{γ2} F · dr = ∫_B^A F · dr = − ∫_A^B F · dr.

Therefore, both integrals on the right hand side of (1.11) have the same value but opposite signs, and so we arrive at


∮_γ F · dr = 0, (1.12)

as claimed. Conversely, we leave it to the reader to show that if (1.12) holds for an arbitrary closed loop γ, then necessarily the integral between any two points does not depend on the trajectory. Conservative forces are those for which (1.12) holds.

There is yet another formulation of the fact that the force is conservative. This last formulation is convenient for practical purposes because it is a differential rather than an integral criterion for a force to be conservative. If γ is an arbitrary closed loop and the force is conservative, i.e. (1.12) holds, we can use Stokes' theorem to convert the line integral into a surface integral:

∮_γ F · dr = ∫_{S(γ)} (∇ × F) · dS, (1.13)

where S(γ) is the surface surrounded by the loop γ. Then, by the conservative character of F, Stokes' theorem implies

∫_{S(γ)} (∇ × F) · dS = 0

for an arbitrary loop γ. But since the choice of γ is arbitrary, the last equality can hold

for all possible loops only if the integrand vanishes everywhere, i.e.

∇ × F = 0. (1.14)

In other words, the curl of a conservative field F is necessarily zero. Poincaré's lemma then asserts that any vector field with vanishing curl is the gradient of some scalar field φ,

F = − ∇φ, (1.15)

so that the components of the force are given by the partial derivatives of the function φ:

Fi = −∂φ/∂xi ≡ −∂i φ. (1.16)

The minus sign is conventional.

Let us recapitulate. We have discussed the notion of the work performed on the body by the force of the external force field in which the body is moving. We have argued


Fig. 1.4. Let γ be any closed trajectory (a loop), γ = γ1 ∪ γ2, and A and B any two of its points which split the curve γ into the union of curves γ1 and γ2.

that this work in general depends not only on the initial and final positions but also

on the trajectory. Then we defined a special class of the forces for which this is not

true and the work is actually path-independent and called such forces conservative

or potential. We have found four equivalent criteria for the force to be conservative:

• For arbitrary points A and B, the integral

W = ∫_A^B F · dr (1.17)

does not depend on the trajectory connecting A and B.

• For an arbitrary closed curve γ the integral W vanishes,

∮_γ F · dr = 0. (1.18)

• The curl of the force vanishes,

∇ × F = 0. (1.19)

• There exists a scalar function φ such that

F = − ∇φ. (1.20)

The function φ, if it exists, is called the potential of the vector field F, or simply the potential energy.
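As an illustration, consider an assumed example force (not taken from the discussion above): the linear restoring force F = −k(x, y, z). Mathematica confirms that its curl vanishes and that it is indeed the gradient of the potential φ = k(x² + y² + z²)/2:

f = {-k x, -k y, -k z};                (* assumed example force *)
Curl[f, {x, y, z}]                     (* {0, 0, 0}: conservative *)
φ = k (x^2 + y^2 + z^2)/2;
Simplify[f == -Grad[φ, {x, y, z}]]     (* True *)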


Often the potential is denoted by V instead of φ. This convention will be followed later in the textbook.

To conclude this section we repeat what the potential energy is. This term can be defined only for conservative forces, i.e. for forces satisfying one of the equivalent conditions¹ (1.17)–(1.20). Then, by (1.20), there exists a function φ such that F = −∇φ. This function is, by definition, called the potential energy. Recall that our original motivation was to characterize the field not by the force but by the work which the force performs during the motion of the body. This work, for conservative forces, is directly related to the potential:

W = ∫_A^B F · dr = − ∫_A^B (∇φ) · dr = − ∫_A^B dφ = φA − φB. (1.21)

Thus, the work performed by the force is equal to the difference of the values of the potential at the initial and at the final point of the trajectory. In the proof we have used the identity dφ = (∇φ) · dr.

1.5 Conservation of energy

The adjective “conservative”, introduced to name forces which display properties (1.17)–(1.20), reflects the fact that the energy of a moving body is constant in such a force field. We define the total mechanical energy of the body in the conservative field F = −∇φ by

E = T + φ. (1.22)

This quantity is constant in time. Before we prove this statement, notice that by Newton's second law and the definition of the potential, the acceleration of the body is

a = dv/dt = (1/m) F = − (1/m) ∇φ.

Let us differentiate the total energy with respect to time:

¹ The equivalence means that if the force satisfies one of these conditions, it automatically satisfies the remaining three conditions.


dE/dt = d/dt (½ m v² + φ) = m v · v̇ + φ̇ = − v · ∇φ + (∇φ) · v = 0. (1.23)

Thus, the quantity E is indeed constant in time, i.e. it is conserved.

Theorem 1. The mechanical energy of a system in which the forces are potential is constant.
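The conservation can also be illustrated numerically. The following sketch integrates the motion of a unit-mass particle in the assumed potential φ = x²/2 (so that F = −x) and evaluates E = ½v² + φ at several times:

sol = NDSolve[{x''[t] == -x[t], x[0] == 1, x'[0] == 0}, x, {t, 0, 10}];
energy[t_] = (1/2) x'[t]^2 + (1/2) x[t]^2 /. First[sol];
energy /@ {0., 2.5, 5., 10.}   (* all values ≈ 0.5 *)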

Later we will see that the conservation of energy is in fact a consequence of a deeper principle: the laws of motion cannot depend on time, i.e. the laws are the same at all times. We say that time is homogeneous. The precise meaning of this statement will be clarified in sections 4.7 and 4.8.

1.6 Conservation of momentum

Consider a system of N particles, each of which has position vector ri, velocity vi = ṙi and acceleration ai = v̇i = r̈i, where i = 1, 2, . . . , N. The mass of the i-th particle will be denoted by mi, and hence the momentum of the i-th particle is pi = mi vi.

These particles interact with each other and, in addition, there can be an external force acting on each particle, e.g. the gravitational force. The internal force exerted by the i-th particle on the j-th particle will be denoted by Fij. In accordance with the law of action and reaction, the internal forces obey the relations

Fij = −Fji. (1.24)

According to the law of force, the total force exerted on the i-th particle is equal to the derivative of its momentum, i.e.

dpi/dt = Fi + Σ_{j≠i} Fij, (1.25)

where the total force on the right hand side is a sum of the external force and internal

forces exerted by all other particles.

The total momentum of the system is the sum of the momenta of all particles,

P = Σi pi,

and its time derivative is

Ṗ = Σi ṗi = Σi Fi + Σi Σ_{j≠i} Fij.


Now we use the law of action and reaction (1.24). In the expression

Σi Σ_{j≠i} Fij

we sum over all (ordered) pairs of particles. For each pair (i, j) contributing Fij to the sum, there is a pair (j, i) contributing Fji = −Fij. Hence, the total sum of all internal forces is necessarily equal to zero and the time derivative of the momentum reads

Ṗ = Σi Fi. (1.26)

In other words, the total momentum changes only because of the external forces and

internal interaction does not contribute to the overall change of momentum. If there

are no external forces, the total momentum is constant,

Ṗ = 0. (1.27)

Law (1.26) states that the total change of the momentum is equal to the external force impressed on the system, and the total momentum is constant if there are no external forces. A system with no external forces is called isolated because of the lack of its interaction with surrounding bodies. Thus, law (1.26) can be reformulated as follows.

Theorem 2. The total momentum of an isolated system is constant in time (conserved).

Later we will see that the conservation of momentum is a consequence of the homogeneity of the space. The notions of homogeneity and isotropy of space and their relation to the conservation laws are discussed in detail in sections 4.7 and 4.8.

1.7 Conservation of angular momentum

For a single particle, as well as for a system of particles, the force impressed will cause a change of the momentum. However, in the case of a system of N particles, it makes sense to distinguish two kinds of motion: translation, when the body changes its position, and rotation.

In the introductory courses of elementary physics it is explained that the rotational effect of a force can be quantified by the so-called torque (or moment of force) with respect to a fixed origin, defined by


M = r × F. (1.28)

The magnitude of the torque is

M = r F sin α,

where α is the angle between both vectors. Because of the presence of the cross product, the torque vanishes if the force is parallel to the position vector. In such a case we expect that the force will not cause a rotation. On the contrary, the rotational effect of the force will be maximal if the vectors r and F are orthogonal, see figure 1.5.

Fig. 1.5. Rotational effect of the force F on a disk attached to a fixed point in its centre. Any force F can be decomposed into the normal part Fn and the tangential part Ft. Clearly, the normal part Fn does not affect the rotation of the disk and only the tangential part Ft is responsible for rotation. The magnitude of the tangential part is given by Ft = F sin α and hence we define the torque by (1.28).

While the torque characterizes the rotational effect of the force exerted, angular momentum characterizes the rotational state of motion. The angular momentum of the i-th particle with respect to a fixed origin is defined as

li = ri × pi. (1.29)


The total angular momentum of the system is the sum over all particles,

L = Σi li. (1.30)

Its time derivative is

L̇ = Σi l̇i = Σi [ṙi × pi + ri × ṗi].

Since pi = mi ṙi, the vectors ṙi and pi are parallel and hence their cross product vanishes.

X

L̇ = r i × ṗi .

i

Because ṗi is the total force acting on the i-th particle, we can see that the rate of change of the angular momentum of the i-th particle is given by the torque of the total force acting on this particle. However, we can proceed further and decompose ṗi into the external force and the sum of internal forces (as in the previous section) to find

L̇ = Σi ri × Fi + Σi Σ_{j≠i} ri × Fij.

Repeating the argument based on the action-reaction law (and assuming that the internal forces act along the lines joining the particles, so that ri × Fij + rj × Fji = (ri − rj) × Fij = 0), we conclude that the total change of the angular momentum is

L̇ = Σi ri × Fi = M, (1.31)

where

M = Σi ri × Fi

is the total torque of the external forces.

Hence, the internal forces do not contribute to the total change of the rotational state of the system, i.e. internal forces cannot affect the total angular momentum. The only reason why the system of particles can change its angular momentum is the presence of external forces. Again, when no external forces are present and the system is isolated, the total angular momentum is conserved.

Theorem 3. The total angular momentum of an isolated system is constant.


1.8 Curvilinear coordinates

In this chapter we have introduced the familiar Cartesian coordinate system. In the Cartesian coordinates we assign a triple of numbers (x, y, z), or xi where i = 1, 2, 3, to each point of the space. In chapter 2 we will see that sometimes it is useful to use a different coordinate system which is better adapted to the problem to be solved. The motivation will be presented in chapter 2, but let us introduce the most common coordinate systems here.

In geometry, coordinates different from the Cartesian ones are called curvilinear coordinates, because the axes associated with non-Cartesian coordinates are usually curves rather than lines, see below. In mechanics we often refer to curvilinear coordinates as generalized coordinates, in the sense that the Cartesian coordinates comprise only a special class of more general coordinate systems. In this book we use the convention that the Cartesian coordinates will always be denoted by the symbol x and labelled by the Latin indices

i, j, k, . . .

The indices run from 1 to n, where n is the dimension of the space: n = 3 for ordinary three-dimensional space, n = 2 for the plane. Later we will meet abstract spaces with higher dimensions, e.g. the phase space. Hence, for n = 3 we have three coordinates x1, x2, x3.

We do not specify the values of the indices i, j, k, . . . if the dimension is clear from the context. Notice that the symbol x without an index stands, in general, for the n-tuple of coordinates xi, where i = 1, 2, . . . , n. Occasionally, we use the standard notation

x1 = x, x2 = y, x3 = z,

Generalized coordinates will be denoted by the symbol q and labelled by the Latin indices

a, b, c, . . .

Again, the symbol q without an index stands for the n-tuple of coordinates qa,

q = (q1, q2, . . . qn).


Usually, if the generalized coordinates qa have a direct geometrical meaning, we use specific symbols for individual coordinates. For example, if q1 has the meaning of a distance, it will be denoted by q1 = r; if q2 is an angle, it will be denoted by q2 = φ.

When dealing with a coordinate transformation from Cartesian coordinates to curvilinear coordinates, we often need the Jacobi matrix of the transformation. The Jacobi matrix J is the matrix of first derivatives of the "new" coordinates with respect to the "old" ones. The elements of the Jacobi matrix are therefore defined by Jia = ∂xi/∂qa, i.e.

    ( ∂x1/∂q1   ∂x1/∂q2   ···   ∂x1/∂qn )
J = ( ∂x2/∂q1   ∂x2/∂q2   ···   ∂x2/∂qn )
    (    ⋮                              )
    ( ∂xn/∂q1   ∂xn/∂q2   ···   ∂xn/∂qn )

Notice that the matrix itself is denoted by the bold symbol J while the elements of

the matrix are denoted by Jia .

Suppose that we are given the transformation

xi = xi(q)

from curvilinear coordinates to the Cartesian coordinates. Notice that the last equation is in fact an abbreviation for n transformation relations. If we invert these relations we arrive at the inverse coordinate transformation

qa = qa(x),

whose Jacobi matrix J̄ has the elements

J̄ai = ∂qa/∂xi. (1.32)

Let us take the matrix product of the Jacobi matrix J and the matrix J̄ of the inverse transformation. We find

Jia J̄aj = (∂xi/∂qa)(∂qa/∂xj) = ∂xi/∂xj = δij,

where we have used the chain rule for partial derivatives in the last step. Since δij are the components of the unit matrix, we have

J · J̄ = I,

so that

J̄ = J⁻¹,

i.e. the Jacobi matrices of the direct and inverse coordinate transformations are mutually inverse.

1.8.1 Polar coordinates

Polar coordinates are defined in the plane rather than in three-dimensional space, see figure 1.6. Let (x, y) be the Cartesian coordinates of a given point with position vector r. The distance of this point from the origin is denoted by r and is related to the Cartesian coordinates by

r = √(x² + y²).

Now we denote the angle between the position vector and the x-axis by θ, see figure 1.6. Then the pair (r, θ) constitutes the polar coordinates of the point under consideration. Clearly, polar coordinates and Cartesian coordinates are related by the equations

x = r cos θ,

(1.33)

y = r sin θ.

The corresponding inverse relations read

r = √(x² + y²),
θ = arctan(y/x). (1.34)

In the notation introduced above, the Cartesian coordinates for the plane are x1 = x and x2 = y, while the generalized coordinates are

q1 = r, q2 = θ.


Fig. 1.6. Polar coordinates in the plane. Cartesian coordinates of the point are (x, y), polar coordinates are (r, θ), where r is the distance of the point from the origin and θ is the angle between the radius-vector and the x-axis.

The Jacobi matrix of the transformation (1.33) is

    ( ∂x/∂r   ∂x/∂θ )   ( cos θ   −r sin θ )
J = ( ∂y/∂r   ∂y/∂θ ) = ( sin θ    r cos θ ). (1.35)

Let us see how this result can be obtained using Mathematica. First we define function

Jacobi which accepts the list of the Cartesian coordinates xs, the list of generalized

coordinates qs and the list of transformation rules rules. These rules are assumed to

be of the form

{ x1 -> ..., x2 -> ..., etc.}

where the dots express the Cartesian coordinates in terms of generalized ones. Func-

tion Jacobi can be defined, for example, as follows:

In[1]:= (* x_i = x_i(q); J_ia = D[x_i, q_a] *)
        Jacobi[xs_, qs_, rules_] := Outer[D, xs /. rules, qs]

For our particular example of polar coordinates, this function should be called in the following way:

In[2]:= Jacobi[{x, y}, {r, θ}, {x -> r Cos[θ], y -> r Sin[θ]}] // MatrixForm
Out[2]//MatrixForm=
( Cos[θ]   -r Sin[θ] )
( Sin[θ]    r Cos[θ] )

We can see that the result is identical with the previous one. In the rest of this

chapter we will use function Jacobi freely without explicitly mentioning it. Moreover,

we can call the function Inverse to find

      (  cos θ        sin θ    )
J⁻¹ = ( −(sin θ)/r   (cos θ)/r ).

By (1.32), we can deduce the partial derivatives of generalized coordinates with

respect to the Cartesian ones without actually calculating them:

∂r/∂x = cos θ,   ∂r/∂y = sin θ,
∂θ/∂x = −(sin θ)/r,   ∂θ/∂y = (cos θ)/r. (1.36)

Now suppose that we want to describe the motion of a particle in the polar

coordinates. Since the particle is moving, its Cartesian coordinates will depend on

time, xi = xi (t), or explicitly

x = x(t), y = y(t).

The Cartesian components of the velocity are

vx = dx/dt, vy = dy/dt.

If the Cartesian coordinates depend on time, so do the polar coordinates, i.e. qa =

qa (t), or explicitly

r = r(t), θ = θ(t).


Differentiating the transformation relations (1.33) with respect to time we find

ẋ = d/dt (r(t) cos θ(t)) = ṙ cos θ − r θ̇ sin θ,
ẏ = d/dt (r(t) sin θ(t)) = ṙ sin θ + r θ̇ cos θ. (1.37)

The magnitude of the velocity is then

v² = ẋ² + ẏ² = ṙ² + r² θ̇². (1.38)
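The same computation can be delegated to Mathematica, in the spirit of the spherical-coordinate example shown in the next subsection:

x[t_] = r[t] Cos[θ[t]];
y[t_] = r[t] Sin[θ[t]];
Simplify[x'[t]^2 + y'[t]^2]   (* r'[t]^2 + r[t]^2 θ'[t]^2 *)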

1.8.2 Spherical coordinates

Spherical coordinates are analogous to polar coordinates but they are defined in three-dimensional space. The geometrical meaning of spherical coordinates is depicted in figure 1.7. Again, r is the distance of the point from the origin and θ is the angle between the position vector and the z-axis. Next we project the position vector onto the xy-plane, obtaining a vector r′. The angle between this vector and the x-axis is denoted by φ. By simple geometry we find the transformation relations

x = r sin θ cos φ,

y = r sin θ sin φ, (1.39)

z = r cos θ.

The corresponding inverse relations read

r = √(x² + y² + z²),
θ = arctan(√(x² + y²)/z) = arccos(z/√(x² + y² + z²)) = arcsin(√(x² + y²)/√(x² + y² + z²)), (1.40)
φ = arctan(y/x).

The Jacobi matrix and its inverse are

    ( sin θ cos φ   r cos θ cos φ   −r sin θ sin φ )
J = ( sin θ sin φ   r cos θ sin φ    r sin θ cos φ ),
    ( cos θ        −r sin θ           0            )

      ( sin θ cos φ         sin θ sin φ         cos θ      )
J⁻¹ = ( (cos θ cos φ)/r     (cos θ sin φ)/r    −(sin θ)/r  ), (1.41)
      ( −(csc θ sin φ)/r    (cos φ csc θ)/r     0          )


Fig. 1.7. Spherical coordinates: the point (x, y, z), the angles θ and φ, and the projection r′ of the position vector onto the xy-plane.

where

csc x = 1/sin x.

Components of the velocity can be calculated in the same way as in the previous

subsection. However, we can use Mathematica as in the following example:

In[31]:= x[t_] = r[t] Sin[θ[t]] Cos[φ[t]];
         y[t_] = r[t] Sin[θ[t]] Sin[φ[t]];
         z[t_] = r[t] Cos[θ[t]];
         Print["x' = ", x'[t]]
         Print["y' = ", y'[t]]
         Print["z' = ", z'[t]]
         Print["v^2 = ", Simplify[x'[t]^2 + y'[t]^2 + z'[t]^2]]

x' = Cos[φ[t]] Sin[θ[t]] r'[t] + Cos[θ[t]] Cos[φ[t]] r[t] θ'[t] - r[t] Sin[θ[t]] Sin[φ[t]] φ'[t]
y' = Sin[θ[t]] Sin[φ[t]] r'[t] + Cos[θ[t]] r[t] Sin[φ[t]] θ'[t] + Cos[φ[t]] r[t] Sin[θ[t]] φ'[t]
z' = Cos[θ[t]] r'[t] - r[t] Sin[θ[t]] θ'[t]
v^2 = r'[t]^2 + r[t]^2 θ'[t]^2 + r[t]^2 Sin[θ[t]]^2 φ'[t]^2


The last line of the output shows that the magnitude of the velocity in spherical coordinates is

v² = ṙ² + r² θ̇² + r² sin²θ φ̇².

2 Lagrange equations

2.1 Motivation

The basic equation of classical mechanics is Newton's law of force. If the force F acts on a point mass m, this point mass undergoes an acceleration a according to the formula

a = F/m.

In the previous chapter we have introduced a Cartesian coordinate system, in which

the law of force can be written in the form

Fi = m ẍi . (2.1)

We can see that Newton’s law is a differential equation of second order. Solving this

equation we find three coordinates xi as functions of time

xi = xi (t).

However, equation (2.1) holds only in Cartesian coordinates. Since we are interested in the motion of bodies in three-dimensional space E³ or two-dimensional space E², we can always introduce a Cartesian coordinate system, write down the equations of motion and, in principle, solve them. However, the Cartesian system is not always the most convenient choice and there can be other coordinate systems which are more appropriate. So, a natural question arises: what are the equations of motion in an arbitrary coordinate system?

To illustrate why we need non-Cartesian coordinates, let us consider the following example. A mathematical pendulum is a point mass m attached to a fixed point called the pivot via a rigid rod of length r, see figure 2.1. The Cartesian coordinates of the point mass


are (x, y). The pendulum is subject to the gravitational force F = mg, where g = (0, −g) is the gravitational acceleration. Thus, in order to find the equation of motion we have to find the Cartesian components of the force F and insert them into Newton's law (2.1).

There is a problem, however: the coordinates x and y are not independent. Since the rod is assumed to be rigid, it has a fixed length and, by the Pythagorean theorem, the coordinates x and y have to satisfy the equation

x² + y² = r², (2.2)

where r is the length of the rod. This is not a dynamical equation, because it is not a

differential equation which can be solved for given initial conditions. Rather it is an

algebraic equation which must be satisfied for any solution of equations of motion.

Equations of this kind are called constraints and we say that coordinates x and y

are constrained.

Fig. 2.1. Mathematical pendulum.

In other words, we have two equations of motion, one for each coordinate, but in

addition we have to satisfy the constraint (2.2). Instead of two equations we have


to solve three. The reason is that the Cartesian coordinates are not well adapted to the problem at all. If the system is described by two independent coordinates, we say that it has two degrees of freedom. But the constraint reduces the number of degrees of freedom to one! This is natural, because the pendulum can move only along the circle of radius r, and the circle is a one-dimensional object. Although we describe the position of the pendulum by two coordinates, it has only one degree of freedom.

Can we describe the motion of the pendulum in such a way that it will have manifestly only one degree of freedom? Definitely we can. The position of the pendulum is uniquely determined by the angle of deflection θ, see again figure 2.1. According to that figure, the Cartesian coordinates (x, y) are related to the angle θ by

x = r sin θ,

(2.3)

y = r cos θ.

This is similar to the polar coordinates introduced before; the exchange of sin and cos comes from a different definition of the angle θ. More important is that the quantity r is now a constant, not a variable. We can easily verify that the constraint (2.2) is satisfied for any value of θ:

x² + y² = r² sin²θ + r² cos²θ = r².

We can see that if we describe the pendulum by angle θ, we do not have to care

about the constraint anymore, for it is automatically satisfied. We thus have the

single variable θ which corresponds to the fact that the pendulum has only one

degree of freedom.

This is certainly progress! In Cartesian coordinates we had two equations of motion and one constraint. Now we have only one variable and no constraint. What remains is to find the equation of motion. From figure 2.1 it is obvious that the force F acting on the pendulum can be decomposed into two components Ft and Fn. The force Fn is the normal component parallel to the rod. It causes the tension of the rod, but since the rod is rigid, this has no effect on the motion of the pendulum. On the other hand, the component Ft tangent to the trajectory causes the acceleration. The magnitude of the tangent force is

Ft = F sin θ = m g sin θ.

The acceleration caused by it is therefore

at = Ft/m = g sin θ.


At the same time, the tangential acceleration is

at = r θ̈,

where θ̈ is the angular acceleration. The equation of motion of the mathematical pendulum is therefore

r θ̈ + g sin θ = 0,

or in a slightly modified form

θ̈ + (g/r) sin θ = 0. (2.4)
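Although equation (2.4) cannot be solved in terms of elementary functions, it is easy to integrate numerically. A minimal sketch in Mathematica, with the assumed values g/r = 9.81 s⁻² and an initial deflection of 0.5 rad:

sol = NDSolve[{θ''[t] + 9.81 Sin[θ[t]] == 0, θ[0] == 0.5, θ'[0] == 0},
              θ, {t, 0, 10}];
Plot[Evaluate[θ[t] /. sol], {t, 0, 10}]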

The point is that the Cartesian coordinates are not always the most convenient. We have seen that if we describe the pendulum by Cartesian coordinates we have to solve two equations of motion and one constraint, i.e. three equations. But the pendulum has only one degree of freedom and its description by two coordinates is redundant. This redundancy is the reason why we have to impose the constraint. The problem can be circumvented by an appropriate choice of coordinates. Choosing the angle θ as a single coordinate we have eliminated the constraint and we have found the single equation of motion. So we have one variable θ and one equation of motion. In this coordinate system the system has one degree of freedom manifestly and we do not have to impose the constraint.

The mathematical pendulum is a very simple system and we will analyze its properties later on. We will see that despite its simplicity it possesses several non-trivial properties and its equation of motion cannot even be solved in terms of elementary functions. In physics and in the modelling of realistic situations we often meet systems which are much more complicated. The double pendulum, for example, consists of two point masses: one is attached to the pivot, while the second point mass is attached to the first one. Analysis shows that the motion of the double pendulum is chaotic. But in the case of the double pendulum it is not clear how to find the equations of motion, and the procedure sketched above becomes more complicated. The Lagrange formalism to be introduced in this chapter provides a systematic way to derive the equations of motion in an arbitrary curvilinear coordinate system.

2.2 Lagrange equations of the second kind

We start with the derivation of the Lagrange equations of the second kind. Lagrange equations of the first kind also exist, but they contain the constraints explicitly; we will not study them in this text. Lagrange equations of the second kind eliminate the constraints by choosing an appropriate coordinate system. For simplicity we consider only one particle of mass m. The result can be easily generalized to more particles.


Our starting point is Newton's law of force in Cartesian coordinates,

Fi = m ẍi. (2.5)

2.2.1 Generalized coordinates

New coordinates will be denoted by q and labeled by indices a = 1, 2, . . . , n, where n is not necessarily equal to 3. For example, as we have seen, the pendulum is described by the single coordinate θ. The variables qa are called generalized coordinates. Cartesian coordinates are connected to generalized coordinates by relations of the form

xi = xi (q),

where the symbol q stands for the whole n-tuple (q1, . . . , qn). If this is too abstract for

the reader, equations (2.3) from the previous section can serve as an example of a

coordinate transformation. In the case of the pendulum, the Cartesian coordinates xi are x and y, and the only generalized coordinate is q1 = θ.

Moreover, we assume that previous relations can be inverted, i.e. we can express

generalized coordinates as functions of the Cartesian ones:

qa = qa (x).

Thus, the generalized coordinates are functions of the Cartesian coordinates and vice

versa. On the other hand, Cartesian coordinates depend on time (they are solutions

of (2.5)), so the generalized coordinates must depend on time, too:

qa (t) = qa (x(t)).

The total derivative of qa with respect to time can be obtained by the chain rule for

derivatives:

q̇a = (∂qa/∂xi) ẋi.

This relation immediately implies

∂q̇a/∂ẋi = ∂qa/∂xi. (2.6)

The total derivative of xi expressed in terms of generalized coordinates reads

ẋi = (∂xi/∂qa) q̇a. (2.7)

Notice that since qa depends on xi, also the quantity ∂qa/∂xi depends on xi. Similarly, xi depends on qa and therefore ∂xi/∂qa depends on qa as well.

We know that if xi is a Cartesian coordinate, then ẋi is the i-th component of the velocity, i.e. vi = ẋi. Analogously, the derivatives of qa with respect to time are called generalized velocities. In the Lagrangian formalism, coordinates and corresponding velocities are treated as independent variables. In other words,

∂ẋi/∂xj = ∂q̇a/∂qb = 0.

2.2.2 Kinetic energy

The kinetic energy expressed in the Cartesian coordinates is

T = ½ m v² = ½ m ẋi ẋi. (2.8)

Kinetic energy therefore depends on the Cartesian velocities, but it does not depend

on the Cartesian coordinates themselves:

∂T/∂xi = 0,

but

∂T/∂ẋi = m ẋi. (2.9)

Expression (2.8) for kinetic energy can be rewritten in terms of generalized coor-

dinates using (2.7):

T = ½ m (∂xi/∂qa) q̇a (∂xi/∂qb) q̇b = ½ m (∂xi/∂qa)(∂xi/∂qb) q̇a q̇b.

Kinetic energy depends on generalized velocities q̇a , but now it depends also on qa ,

because of partial derivatives (recall the remarks below equation (2.7)),

T = T(q, q̇),   ∂T/∂qa ≠ 0,   ∂T/∂q̇a ≠ 0.


2.2.3 Generalized forces

The last ingredient necessary for the derivation of the Lagrange equations is the notion of generalized forces. Generalized forces are the components of the force F in the curvilinear coordinate system. If Fi are the Cartesian components of the force, then the generalized forces are defined by

Qa = Fi (∂xi/∂qa). (2.10)
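To see definition (2.10) at work, the following sketch computes the generalized forces in polar coordinates for an assumed linear restoring force F = −k(x, y); the result Qr = −kr, Qθ = 0 reflects the fact that such a force is central:

rules = {x -> r Cos[θ], y -> r Sin[θ]};
f = {-k x, -k y} /. rules;                 (* assumed example force *)
jac = Outer[D, {x, y} /. rules, {r, θ}];   (* J_ia = ∂x_i/∂q_a *)
Simplify[f . jac]                          (* {-k r, 0} *)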

2.2.4 Derivation

Now we are prepared to derive the Lagrange equations of the second kind. Newton's law reads

Fi = m ẍi.

By (2.9), the right hand side is m ẍi = d/dt (m ẋi) = d/dt (∂T/∂ẋi), so that

Fi = d/dt (∂T/∂ẋi).

Multiply this equation by ∂xi /∂qa to obtain

Fi (∂xi/∂qa) = (∂xi/∂qa) d/dt (∂T/∂ẋi).

On the left hand side we can see the generalized forces Qa according to relation

(2.10):

Qa = (∂xi/∂qa) d/dt (∂T/∂ẋi). (2.11)

Now we are going to rearrange the right hand side in order to eliminate the Cartesian

coordinates xi .

Using the Leibniz rule1 , the right hand side can be rewritten as

Qa = d/dt [(∂xi/∂qa)(∂T/∂ẋi)] − (∂T/∂ẋi) d/dt (∂xi/∂qa). (2.12)

1

Leibniz rule is a product rule for differentiation. Derivative of the product f g is (f g)0 = f 0 g + f g 0 . We

use this rule in the form f g 0 = (f g)0 − f 0 g.


The first term on the right hand side is, using (2.6), equal to

\frac{d}{dt}\left(\frac{\partial x_i}{\partial q_a}\frac{\partial T}{\partial \dot{x}_i}\right) = \frac{d}{dt}\left(\frac{\partial \dot{x}_i}{\partial \dot{q}_a}\frac{\partial T}{\partial \dot{x}_i}\right) .

Recall that the kinetic energy depends on the Cartesian velocities ẋi but it does not depend on the coordinates. Then, by the chain rule, we have

\frac{\partial T}{\partial \dot{q}_a} = \frac{\partial T}{\partial \dot{x}_i}\frac{\partial \dot{x}_i}{\partial \dot{q}_a} + \underbrace{\frac{\partial T}{\partial x_i}}_{0}\frac{\partial x_i}{\partial \dot{q}_a} = \frac{\partial T}{\partial \dot{x}_i}\frac{\partial \dot{x}_i}{\partial \dot{q}_a} .

Hence, equation (2.12) becomes

Q_a = \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_a} - \frac{\partial T}{\partial \dot{x}_i}\, \frac{d}{dt}\frac{\partial x_i}{\partial q_a} .    (2.13)

Now we want to eliminate the Cartesian coordinates from the second term of equation (2.13). Consider the following identity:

\frac{d}{dt}\frac{\partial x_i}{\partial q_a} = \frac{\partial}{\partial q_b}\frac{\partial x_i}{\partial q_a}\,\dot{q}_b + \frac{\partial}{\partial \dot{q}_b}\frac{\partial x_i}{\partial q_a}\,\ddot{q}_b .

The second term vanishes, because ∂xi/∂qa does not depend on the velocities. Interchanging the order of the partial derivatives in the first term, we find

\frac{d}{dt}\frac{\partial x_i}{\partial q_a} = \frac{\partial}{\partial q_a}\frac{\partial x_i}{\partial q_b}\,\dot{q}_b = \frac{\partial}{\partial q_a}\frac{d x_i}{dt} = \frac{\partial \dot{x}_i}{\partial q_a} .

Therefore, the second term of (2.13) is

\frac{\partial T}{\partial \dot{x}_i}\, \frac{d}{dt}\frac{\partial x_i}{\partial q_a} = \frac{\partial T}{\partial \dot{x}_i}\frac{\partial \dot{x}_i}{\partial q_a} = \frac{\partial T}{\partial q_a} .

Substituting this equality into (2.13), we arrive at the final form of the equations of motion:

\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_a} - \frac{\partial T}{\partial q_a} = Q_a .    (2.14)

2.3 Lagrange equations

In the previous section we have derived, after some effort, the Lagrange equations of the second kind (2.14). These equations are completely equivalent to Newton's law of motion, but they are written in an arbitrary curvilinear coordinate system, while Newton's law has its simple form in the Cartesian coordinates only. If we want to derive the equations of motion for a particular system, we have to write down the expression for the kinetic energy T, transform it to the appropriate coordinate system, find the components of the generalized forces Qa, and insert them into equations (2.14).

Note that while it is easy to find the expression for T in generalized coordinates, because it is a scalar function, generalized forces involve the calculation of the sum

Q_a = F_i\, \frac{\partial x_i}{\partial q_a} .

There is, however, a special but very important case, when the forces Fi are conservative. We know from elementary physics that the gravitational force or the electrostatic force can be written as a gradient of a scalar function called the potential. By definition, the force with components Fi is called conservative if there exists a potential V such that

F_i = -\frac{\partial V}{\partial x_i} \equiv -\partial_i V,    (2.15)

where the minus sign is conventional. What are the components of the generalized forces in such a case? The calculation is straightforward:

Q_a = F_i\,\frac{\partial x_i}{\partial q_a} = -\frac{\partial V}{\partial x_i}\frac{\partial x_i}{\partial q_a} = -\frac{\partial V}{\partial q_a} .

Thus, for conservative forces, the components of the generalized forces are simply partial derivatives of the potential with respect to the generalized coordinates. Lagrange equations of the second kind then acquire the form

\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_a} - \frac{\partial T}{\partial q_a} = -\frac{\partial V}{\partial q_a} .    (2.16)

An important point is that the potential cannot depend on velocities. It is a consequence of the fact that conservative forces do not depend on the motion of the bodies on which they act² – they depend only on the configuration of the system, i.e. on the positions of individual objects. In other words,

² For example, the gravitational force depends only on the distance of the two objects, but it does not depend on the velocity of the bodies. On the other hand, the electromagnetic force does depend on the velocity – the magnetic force is a cross product of the velocity and the magnetic field. Nevertheless, the concept of the Lagrangian is valid also for the electromagnetic force; this issue is explained later.


\frac{\partial V}{\partial \dot{q}_a} = 0.

Now, rewrite equation (2.16) as

\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_a} - \frac{\partial (T-V)}{\partial q_a} = 0.

Since the potential does not depend on q̇a, we can also write

\frac{d}{dt}\frac{\partial (T-V)}{\partial \dot{q}_a} - \frac{\partial (T-V)}{\partial q_a} = 0,

because the term ∂V/∂q̇a which we added vanishes anyway. Obviously, it is useful to introduce a new scalar function called the Lagrangian by

L = T - V.    (2.17)

In terms of the Lagrangian, the equations of motion read

\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a} - \frac{\partial L}{\partial q_a} = 0.    (2.18)

Notice the terminology used: there exist Lagrange equations of the first kind, but we do not consider them in this text. In the previous section we derived the Lagrange equations of the second kind, which are equivalent to Newton's law of motion but written in a generalized coordinate system. Equations (2.18) are called simply Lagrange equations. They are not completely equivalent to Newton's law, because we assumed that the forces are conservative. Gravitational and electrostatic forces are typical conservative forces. By contrast, friction and general electromagnetic forces are non-conservative, i.e. there is no potential V from which they can be derived. If the system under consideration involves friction, we cannot find the Lagrangian of this system and we cannot use Lagrange equations, but we still can use the Lagrange equations of the second kind. It is interesting that although the electromagnetic force is not conservative, the Lagrangian exists, as we will see later. Friction is not a fundamental force, however: it is a result of a complicated interaction between the molecules forming the surfaces of bodies in contact. The electromagnetic force, on the other hand, is fundamental; it is one of the four basic forces in Nature. In fact, it is the most important force for us. Fortunately, it can be described in the Lagrange formalism, so that the Lagrange equations (2.18) are sufficient for the description of almost all physically relevant situations.

Lagrange equations (2.18) are also more convenient than the Lagrange equations of the second kind: the equations of motion are derived from the single function L and we do not have to calculate the generalized forces Qa. In the following sections we show a few examples of how the Lagrange formalism works; then we show how to implement the new formalism in Mathematica.

2.4 Particle in homogeneous gravitational field

We start with a very simple example: motion in a homogeneous gravitational field. The gravitational field is never exactly homogeneous, but near the surface of the Earth it is approximately constant: all bodies move with constant gravitational acceleration g. Its Cartesian components are

\mathbf{g} = (0, -g),

and it has magnitude g. On the other hand, the Cartesian components of the acceleration

are (ẍ, ÿ), so the equations of motion are

ẍ = 0, ÿ = −g. (2.19)

Let us see how the same result can be derived in the Lagrange formalism. Kinetic energy of the particle is

T = \frac{1}{2}\, m\, \dot{x}_i\dot{x}_i = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right).

The gravitational force is

\mathbf{F} = m\,\mathbf{g},

so that

F_1 = 0, \qquad F_2 = -m g.

The potential V is defined by

F_1 = -\frac{\partial V}{\partial x}, \qquad F_2 = -\frac{\partial V}{\partial y}.

The first equation merely states that V does not depend on x, i.e.

V = V(y).

The second equation reads

-\frac{\partial V}{\partial y} = -m g,

which integrates to

V = \int m g \, dy = m g\, y + \text{const}.

The integration constant does not affect the equations of motion (why?), so we can set it to zero without loss of generality.

We have found the kinetic energy and the potential, so we can write down the Lagrangian, which is by definition

L = T - V = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) - m g\, y.    (2.20)

Notice that Lagrange equations (2.18) are written in an arbitrary coordinate system. Our motivation was to introduce curvilinear coordinates, but these equations hold in the Cartesian system as well. Now the generalized coordinates are simply

q_1 = x, \qquad q_2 = y,

and the Lagrange equations read

\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} = 0.

For the Lagrangian (2.20) we have

\frac{\partial L}{\partial \dot{x}} = m\dot{x}, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = m\ddot{x},

\frac{\partial L}{\partial \dot{y}} = m\dot{y}, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{y}} = m\ddot{y},

\frac{\partial L}{\partial x} = 0, \qquad \frac{\partial L}{\partial y} = -m g.


Substituting these expressions into Lagrange equations (2.18), we arrive at the equations of motion:

\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0 \quad\rightarrow\quad m\ddot{x} = 0,

\frac{d}{dt}\frac{\partial L}{\partial \dot{y}} - \frac{\partial L}{\partial y} = 0 \quad\rightarrow\quad m\ddot{y} = -m g.

We can see that the Lagrange equations are the familiar equations of motion (2.19). Of course, for the motion in a homogeneous gravitational field we can find the equations of motion more easily than through the Lagrangian. But before we apply the formalism to more complicated problems, it is useful to see how it works in simple cases where we know the result even without Lagrange equations.
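As a quick consistency check (a sketch added here, not part of the original derivation), Mathematica can integrate these equations of motion symbolically:

DSolve[{x''[t] == 0, y''[t] == -g}, {x[t], y[t]}, t]

The output is the expected general solution: x[t] linear in t, and y[t] containing the term -g t^2/2, with four integration constants C[1], ..., C[4] fixed by initial conditions.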

2.5 Harmonic oscillator

The harmonic oscillator is one of the most important models in physics. In mechanics it corresponds to the motion of an idealized spring, see figure 2.2. The point mass m is connected to a fixed point via a massless spring. If the point mass is displaced from the equilibrium position, the spring exerts the force

F = −k q,

where q is the displacement. The minus sign is due to the fact that the force always acts in the direction opposite to the displacement. The constant k is called the rigidity of the spring. According to Newton's law of motion, the acceleration is given by

a = \frac{F}{m} .

Since the motion is one-dimensional, the only component of the previous equation is

\ddot{q} = -\frac{k}{m}\, q.

The constant k/m is usually denoted as

\omega^2 = \frac{k}{m},

so that the equation of motion is

\ddot{q} + \omega^2 q = 0.    (2.21)

This equation appears in physics very frequently, even when it is not connected with the motion of a spring, e.g. in oscillations of electric circuits or vibrations of atoms in a crystal lattice.

Let us find the Lagrangian for the harmonic oscillator. Kinetic energy is straightforward:

T = \frac{1}{2}\, m\, \dot{q}^2 .

The potential is defined by the relation

F = -\frac{\partial V}{\partial q}

and the integration yields

V = -\int F \, dq = \int k q \, dq = \frac{1}{2}\, k q^2 .

This expression is usually written in terms of the parameter ω:

V = \frac{1}{2}\, m \omega^2 q^2 .

Thus, the Lagrangian is

L = \frac{1}{2}\, m \dot{q}^2 - \frac{1}{2}\, m \omega^2 q^2 .    (2.22)

Lagrange equations are obtained in the usual way; we find

\frac{\partial L}{\partial \dot{q}} = m\dot{q}, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = m\ddot{q}, \qquad \frac{\partial L}{\partial q} = -m\omega^2 q,    (2.23)

so that

\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0 \quad\rightarrow\quad \ddot{q} + \omega^2 q = 0.    (2.24)
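Again, a one-line symbolic check in Mathematica (added here as a sketch):

DSolve[q''[t] + \[Omega]^2 q[t] == 0, q[t], t]

which returns the familiar general solution q[t] -> C[1] Cos[\[Omega] t] + C[2] Sin[\[Omega] t].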

Fig. 2.2. Equilibrium position of the spring corresponds to q = 0. The restoring force F is proportional to the displacement, F = −kq.

2.6 Mathematical pendulum

So far we considered Lagrange equations in the Cartesian coordinates. Now we return to the example from the introduction to this chapter, the mathematical pendulum; recall figure 2.1. As we explained, it is more convenient to introduce polar coordinates via the relations

x = r\sin\theta, \qquad y = r\cos\theta.

Kinetic energy in the Cartesian coordinates is

T = \frac{1}{2}\, m\, \dot{x}_i\dot{x}_i = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right).

Now we have to rewrite this expression in the polar coordinates r and θ. Since the rod

of the pendulum is supposed to be perfectly rigid, the coordinate r remains constant.

On the other hand, coordinate θ depends on time,

θ = θ(t).

Derivatives of the Cartesian coordinates x and y with respect to time are therefore given by

\dot{x} = r\,\dot{\theta}\cos\theta, \qquad \dot{y} = -r\,\dot{\theta}\sin\theta.    (2.25)


Substituting (2.25) into the kinetic energy, we obtain

T = \frac{1}{2}\, m\left(r^2\dot{\theta}^2\cos^2\theta + r^2\dot{\theta}^2\sin^2\theta\right) = \frac{1}{2}\, m r^2\dot{\theta}^2\left(\cos^2\theta + \sin^2\theta\right) = \frac{1}{2}\, m r^2\dot{\theta}^2 .    (2.26)

What about the potential V? In our elementary analysis from the beginning of the chapter, we decomposed the force F into tangent and normal components and realized that the normal component does not affect the motion, while the tangent component causes the angular acceleration. The decomposition of the force was easy in this case, but sometimes it can be very difficult and one has to find an appropriate way. In the Lagrange formalism, however, the procedure is straightforward (although it can be complicated).

Cartesian components of the force are

F_1 = 0, \qquad F_2 = m g.

Note that we do not include the minus sign, because the y−axis is oriented downwards (see figure 2.1). Now we can compute the generalized force according to relation (2.10). Since we have only one generalized coordinate θ, there is only one generalized force:

Q = F_i\,\frac{\partial x_i}{\partial \theta} = F_1\frac{\partial x}{\partial \theta} + F_2\frac{\partial y}{\partial \theta} = -m g r\sin\theta.

The potential is then defined by

Q = -\frac{\partial V}{\partial \theta},

which integrates to

V = -\int Q \, d\theta = -m g r\cos\theta.

The Lagrangian of the pendulum is therefore

L = \frac{1}{2}\, m r^2\dot{\theta}^2 + m g r\cos\theta.    (2.27)

Once we have the Lagrangian, the equations of motion follow immediately from Lagrange equations (2.18):

\frac{\partial L}{\partial \dot{\theta}} = m r^2\dot{\theta}, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} = m r^2\ddot{\theta}, \qquad \frac{\partial L}{\partial \theta} = -m g r\sin\theta,

so that

\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = 0 \quad\rightarrow\quad \ddot{\theta} + \frac{g}{r}\sin\theta = 0.    (2.28)

If we define

\omega_0^2 = \frac{g}{r},

the equation of motion acquires the form

\ddot{\theta} + \omega_0^2\sin\theta = 0.    (2.29)

2.7 Lagrange equations in Mathematica

We have seen how the Lagrange formalism can be applied to familiar problems to obtain the equations of motion, and we are ready to study a new problem where the equations of motion are unknown. We will illustrate the power of the formalism on the example of the double pendulum. Before we analyse the double pendulum, let us see how the Lagrange formalism can be implemented in Mathematica.

In this section we present one possible way to derive Lagrange equations using Mathematica. The algorithm to be explained takes the Lagrangian L and the lists of generalized coordinates and velocities, and differentiates the Lagrangian in order to obtain the Lagrange equations. As an example we study the motion in the homogeneous gravitational field investigated in section 2.4.

Generalized coordinates are now

q_1 = x, \qquad q_2 = y,

and the Lagrangian is

L = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) - m g\, y.

Let us explicitly denote which variables depend on time:

L = \frac{1}{2}\, m\left(\dot{x}(t)^2 + \dot{y}(t)^2\right) - m g\, y(t).

In order to find the Lagrange equations

\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a} - \frac{\partial L}{\partial q_a} = 0

we have to evaluate the partial derivatives

\frac{\partial L}{\partial \dot{x}(t)}, \qquad \frac{\partial L}{\partial \dot{y}(t)}, \qquad \frac{\partial L}{\partial x(t)}, \qquad \frac{\partial L}{\partial y(t)},

and then to calculate the total derivatives

\frac{d}{dt}\frac{\partial L}{\partial \dot{x}(t)}, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{y}(t)}.

Derivative of the Lagrangian with respect to coordinate x can be found by the com-

mand

D[ L, x[t] ]

where L is the Lagrangian written in Mathematica. Similarly, derivative with respect

to velocity is simply

D[ L, x’[t] ].

An equivalent way to perform the last command is

D[ L, D[x[t], t] ].

Since we need to differentiate this expression with respect to time again, we can write

D[ L, D[x[t], t], t ].

Hence, the Lagrange equation for variable x can be written in the form

D[ L, D[x[t], t], t ] - D[L, x[t]] == 0.

An analogous command can be constructed for the second variable y.

In order to make our code universal, we realize that we have to perform the operation

D[ L, D[#, t], t] - D[L, #] == 0

for each generalized coordinate #, where # must be taken from the list of generalized

coordinates. Hence, suppose that we have a list of generalized coordinates called qs,

in our case

qs = { x[t], y[t] }

and the Lagrangian L. Then we can define the function (one possible definition consistent with the operation above):

In[5]:= LagrangeEqs[qs_, L_] := (D[L, D[#, t], t] - D[L, #] == 0) & /@ qs

Applying it to our example,

In[6]:= LagrangeEqs[{x[t], y[t]}, 1/2 m (x'[t]^2 + y'[t]^2) - m g y[t]]

Out[6]= {m x''[t] == 0, g m + m y''[t] == 0}

In the case of the mathematical pendulum we can write

q = {\[Theta]};
v = {p};
L = 1/2 m r^2 p^2 + m g r Cos[\[Theta]];

where we have denoted p = θ̇. Again, running the rest of our code yields the correct equation of motion

{g m r Sin[\[Theta][t]] + m r^2 \[Theta]''[t] == 0}.

We can see that the mass can be canceled and the equation of motion is exactly (2.28).

2.8 Solving the equations of motion of pendulum

This equation cannot be solved in a closed form, which means that we cannot write down its explicit solution. Fortunately, there exist numerical methods which allow us to find an approximate solution. In fact, Mathematica has built-in methods for constructing numerical solutions of many types of differential equations; they are all encapsulated in the NDSolve function. But a numerical solution cannot be obtained if the values of the constants are not specified. Moreover, to find a particular solution we have to provide also the initial conditions. Our task is now to solve equation (2.29) with appropriate initial conditions numerically.

\[Omega]0 = 1; T0 = 2 \[Pi] / \[Omega]0;

eqs = { \[Theta]’’[t] + \[Omega]0^2 Sin[\[Theta][t]] == 0,

\[Theta][0] == \[Pi]/4, \[Theta]’[0] == 0};

sol = NDSolve[ eqs, \[Theta][t], {t, 0, 2 T0}]


Here we first set the value of ω0 to 1 for simplicity. Moreover, we define the "period" T0 = 2π/ω0, because we know that for the harmonic oscillator such a relation holds. Next we define the list of three equations,

\ddot{\theta} + \omega_0^2\sin\theta = 0, \qquad \theta(0) = \frac{\pi}{4}, \qquad \dot{\theta}(0) = 0.

The first of them is the equation of motion; the other two represent the initial conditions. Equation θ(0) = π/4 means that the angle of deflection at time t = 0 is equal to π/4 (what position of the pendulum is this?). The velocity θ̇(0) has been set to zero. The solution is found by the function NDSolve, as claimed, where we specify

• the system of equations to solve – eqs;
• the unknown variable – θ[t];
• the interval of the independent variable – {t, 0, 2 T0}.

The result of NDSolve is something of the form

{{ \[Theta][t] -> InterpolatingFunction[....][t] }}

We can see that it is a replacement rule. According to this rule, any occurrence of θ[t] will be replaced by the interpolating function. When the function NDSolve constructs the solution, it finds only a finite number of values of the unknown function θ on the desired interval. Then, however, we want to evaluate the solution at an arbitrary time t, and it can happen that this time is different from any time used in the construction of the solution. For this reason, Mathematica has to "guess" the correct value of θ at that time. By "guessing" we mean the interpolation between the two closest times at which the value of θ is known.

For us, however, it is not important how this procedure works internally. What we need to know is that in order to evaluate the solution at an arbitrary time, say t = 1, we have to type

\[Theta][t] /. sol /. t -> 1

The symbol θ[t] has no meaning to Mathematica, but the rule sol will replace the symbol by the function which is the solution of the equation of motion. Then we can replace the argument t by its concrete numerical value using the next rule.

Finally, we can visualise the solution by the command Plot. The complete code for solving the equations of motion and plotting the solution follows; the resulting picture is in figure 2.3.


eqs = { \[Theta]’’[t] + \[Omega]0^2 Sin[\[Theta][t]] ==

0, \[Theta][0] == \[Pi]/4, \[Theta]’[0] == 0};

sol = NDSolve[ eqs, \[Theta][t], {t, 0, 2 T0}]

Plot[ \[Theta][t] /. sol, {t, 0, 2 T0}]

Fig. 2.3. Numerical solution of the equation of the mathematical pendulum for initial conditions θ(0) = π/4, θ̇(0) = 0 and value ω0 = 1.

2.9 Deriving the Lagrangian in Mathematica

The code we have developed in the previous sections is able to find the equations of motion from an arbitrary Lagrangian, provided that the lists of generalized coordinates and velocities are specified. In the case of the mathematical pendulum in section 2.6 we have seen that sometimes the Lagrangian must be transformed into an appropriate coordinate system. Even this procedure can be automated in Mathematica.

As an example we use the mathematical pendulum again. The Lagrangian in Cartesian coordinates reads

L = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) + m g\, y,

where the plus sign reflects the downward orientation of the y−axis (cf. section 2.6).

Recall that polar coordinates for the pendulum were introduced by

x = r\sin\theta, \qquad y = r\cos\theta.    (2.30)

To transform the Lagrangian, we type

x[t_] = r Sin[\[Theta][t]];
y[t_] = r Cos[\[Theta][t]];
L = 1/2 m ( x'[t]^2 + y'[t]^2) + m g y[t]

which yields

g m r Cos[\[Theta][t]] +

1/2 m (r^2 Cos[\[Theta][t]]^2 \[Theta]’[t]^2 +

r^2 Sin[\[Theta][t]]^2 \[Theta]’[t]^2)

This is correct but can be simplified:

x[t_] = r Sin[\[Theta][t]];
y[t_] = r Cos[\[Theta][t]];
L = Simplify[ 1/2 m ( x'[t]^2 + y'[t]^2)] + m g y[t]

Now the identity sin²θ + cos²θ = 1 is applied automatically and the result is

g m r Cos[\[Theta][t]] + 1/2 m r^2 \[Theta]'[t]^2.

The reader can check that this result is identical with the Lagrangian (2.27).
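With the LagrangeEqs definition from section 2.7 still loaded, the equation of motion now follows in a single line (a sketch using the variables above):

LagrangeEqs[{\[Theta][t]}, L]

which again returns {g m r Sin[\[Theta][t]] + m r^2 \[Theta]''[t] == 0}.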

2.10 Planet in gravitational field

In this section we study a new problem: the motion of a planet in the gravitational field of a star. First we formulate the problem in physical terms, then we present its solution using Mathematica.

Suppose we have a massive star, e.g. the Sun, which is at rest in a given reference frame. The Sun produces a gravitational field which attracts all bodies to its center. According to Newton's gravitational law, a body of arbitrary mass m moves with the acceleration

\mathbf{a} = -M G\, \frac{\mathbf{r}}{r^3},    (2.31)

where G is Newton's gravitational constant, M is the mass of the Sun and r is the position vector of the planet with respect to the Sun, see figure 2.4. We can choose units in such a way that

G M = 4\pi^2.

The position of the planet will be described by polar coordinates

x = r\cos\theta, \qquad y = r\sin\theta.

Fig. 2.4. Position of the planet with respect to the Sun. The planet has mass m, the Sun has mass M, and r is the position vector.

In these units, the potential corresponding to the acceleration (2.31) is

V(r) = -4\pi^2\, \frac{m}{r},    (2.32)

where m is the mass of the planet. The Lagrangian of the planet is then

L = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) - V.

Let us find the expression for this Lagrangian in the polar coordinates. The corresponding Mathematica code reads

x[t_] = r[t] Sin[\[Theta][t]];
y[t_] = r[t] Cos[\[Theta][t]];
L = Simplify[1/2 m ( x'[t]^2 + y'[t]^2)] - (4 \[Pi]^2 m)/r[t]

and yields

L = \frac{1}{2}\, m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - \frac{4\pi^2 m}{r}.
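The equations of motion then follow from the LagrangeEqs function of section 2.7; a sketch, assuming LagrangeEqs and L as defined above:

LagrangeEqs[{r[t], \[Theta][t]}, L]

This yields the radial equation m r''[t] - m r[t] \[Theta]'[t]^2 - (4 \[Pi]^2 m)/r[t]^2 == 0, together with the angular equation expressing the conservation of m r[t]^2 \[Theta]'[t].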

3 Hamilton's equations

3.1 Motivation

Let us recapitulate the advantages of Lagrange's equations compared to Newton's law of motion:

• Lagrange's equations hold in an arbitrary curvilinear coordinate system;
• the number of Lagrange's equations is equal to the number of degrees of freedom, while in the Cartesian system we always have three equations for each particle and possibly some additional constraints;
• the system is described by a single scalar function called the Lagrangian, which simplifies the transformation to a generalized coordinate system.

In the context of classical mechanics, Lagrange's equations are equivalent to Newton's laws, but Newton's laws turned out to be incorrect and had to be replaced by the theory of relativity and quantum mechanics. Nevertheless, the formalism of Lagrange's equations can be applied even in those theories.

A typical Lagrangian for one particle in Cartesian coordinates has the form

L = T - V = \frac{1}{2}\, m\, \dot{x}_i\dot{x}_i - V(x).    (3.1)

Let us examine the structure of Lagrange's equations

\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} = 0

compared to Newton's law of force in the form

\frac{dp_i}{dt} = F_i .


Differentiating the Lagrangian (3.1) with respect to the velocity, we find

\frac{\partial L}{\partial \dot{x}_i} = m\,\dot{x}_i = p_i .

We can see that, in Cartesian coordinates, the derivative of the Lagrangian with respect to the velocity ẋi is equal to the ordinary momentum

p_i = m\,\dot{x}_i .

Lagrange's equations can therefore be written in the form

\frac{dp_i}{dt} = \frac{\partial L}{\partial x_i} .

But for the Lagrangian (3.1) we have

\frac{\partial L}{\partial x_i} = \frac{\partial}{\partial x_i}(T - V) = -\frac{\partial V}{\partial x_i} = F_i ,

since the kinetic energy T does not depend on xi and the potential is defined by the relation Fi = −∂V/∂xi. With this observation we immediately see that Lagrange's equations are equivalent to Newton's law, for they acquire the form

\frac{dp_i}{dt} = F_i .

Notice that all of this holds in the Cartesian coordinates only.

What about a general coordinate system qa? We have seen, see equation (2.27), that the Lagrangian of the pendulum in polar coordinates is

L = \frac{1}{2}\, m r^2\dot{\theta}^2 + m g r\cos\theta,

where the single generalized coordinate is θ and the corresponding generalized velocity is θ̇. Now

\frac{\partial L}{\partial \dot{\theta}} = m r^2\dot{\theta} .

This expression is not equal to the momentum of the pendulum; nevertheless, there is a connection. The velocity of the pendulum is

v = \omega r = r\dot{\theta},

where ω = θ̇ is the angular velocity, so the momentum of the pendulum is

p = m v = m r\dot{\theta} .

We can see that the quantities p and ∂L/∂θ̇ differ by a factor of r. But this is a consequence of the choice of the coordinates only! Although p and ∂L/∂θ̇ are different, they are obviously related.

Thus, in the two cases we presented, derivatives of the Lagrangian with respect to generalized velocities are related to the momentum of the particle. As we have seen, in the Cartesian system they coincide, but in curvilinear coordinates they do not. Nevertheless, it seems reasonable to define a notion of momenta derived from the Lagrangian.

Let L be an arbitrary Lagrangian depending on generalized coordinates qa and velocities q̇a, i.e. L = L(q, q̇). We define the generalized momentum pa conjugate to the coordinate qa by

p_a = \frac{\partial L}{\partial \dot{q}_a} .    (3.2)

Lagrange’s equations then acquire form

dpi ∂L

= .

dt ∂qa

If we know the actual position and momentum of the particle, we can calculate how the momentum varies in time. But how does the position change? We can find the answer only by solving Lagrange's equations to obtain the functions qa = qa(t) and then calculating q̇a. It would be better, however, if we could write equations of the form

\dot{q}_a = \text{something}, \qquad \dot{p}_a = \text{something else}.    (3.3)

Lagrange's equations give only the second part via

\dot{p}_a = \frac{\partial L}{\partial q_a} .    (3.4)

But the Lagrangian is a function of qa and q̇a, so we can, in principle, use the definition of the generalized momentum

p_a = \frac{\partial L}{\partial \dot{q}_a}    (3.5)

and invert it to obtain a relation of the form

\dot{q}_a = \dot{q}_a(q, p).

Summa summarum, Lagrange's equations are second order differential equations for the unknown functions qa. In the Lagrange formalism the independent variables are the coordinates qa and the velocities q̇a. We defined a generalized momentum pa by (3.2). Now we want to rewrite Lagrange's equations in such a way that the new equations will have the form (3.3). We have seen that using the momenta pa, Lagrange's equations have the form (3.4) and constitute only half of the equations we want to find. The difficulty essentially is that the Lagrangian itself is a function of qa and q̇a, but now we want the independent variables to be qa and pa. Hamilton's formalism provides a systematic way to obtain the desired equations. Before we present it, a remark on the Legendre transformation must be made.

3.2 Legendre transformation

In this section we formulate the problem in a more general way; in the following section we apply it to Lagrange's formalism. Suppose that a function f of variables (x1, . . . , xn) is given,

f = f(x_1, \dots, x_n) \equiv f(x).

Its total differential is

df = \frac{\partial f}{\partial x_1}\,dx_1 + \dots + \frac{\partial f}{\partial x_n}\,dx_n ,

or, using Einstein’s summation convention,

∂f

df = dxi .

∂xi

It is an important point that the converse is also true. If f is a function of some set of variables and its total differential is found to be

df = y_i\,dx_i ,

then

\frac{\partial f}{\partial x_i} = y_i

holds.

Now let us return to the expression

df = \frac{\partial f}{\partial x_i}\,dx_i

and denote

y_i = \frac{\partial f}{\partial x_i} ,

so that the differential df acquires the form

df = y_i\,dx_i .

Using the Leibniz rule, y_i dx_i = d(x_i y_i) − x_i dy_i, we can write

df = d(x_i y_i) - x_i\,dy_i .

This is equivalent to

d(x_i y_i) - df = x_i\,dy_i ,

or

d(x_i y_i - f) = x_i\,dy_i .

Note that on the left hand side we have the total differential of some function, which will be denoted by g = x_i y_i − f, so that

dg = x_i\,dy_i .

Function g is a function of yi, because its differential contains only the differentials dyi, which means that

g = g(y).

Moreover, the relation

\frac{\partial g}{\partial y_i} = x_i

holds.

Let us recapitulate the procedure. We started with a function f = f(x) depending on variables xi. Then we defined new variables yi by

y_i = \frac{\partial f}{\partial x_i} ,

which means that the differential df became

df = y_i\,dx_i .

Finally, we introduced the new function

g = x_i y_i - f

with differential

dg = x_i\,dy_i ,

which means that the new function depends on the new variables yi. Function g is called the Legendre transformation of function f. Thus, the Legendre transformation is a procedure for transforming a function f = f(x) into a new function g = g(y), where yi = ∂i f.
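A one-dimensional example (added for illustration): take f(x) = ½ a x². Then

y = \frac{\partial f}{\partial x} = a x, \qquad x = \frac{y}{a}, \qquad g = x y - f = \frac{y^2}{a} - \frac{y^2}{2a} = \frac{y^2}{2a},

and indeed ∂g/∂y = y/a = x. This is exactly the pattern by which the kinetic term ½mq̇² will turn into p²/(2m) in the Hamiltonian below.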

3.3 Hamilton's equations

Now we are in a position to derive Hamilton's equations. Suppose that our system is described by the Lagrangian

L = L(q, \dot{q}, t),

where qa are generalized coordinates and q̇a are generalized velocities. We also allow the Lagrangian to depend on time explicitly, i.e. ∂L/∂t ≠ 0. We introduce new variables called generalized momenta by

p_a = \frac{\partial L}{\partial \dot{q}_a} .

Thus, generalized momenta are partial derivatives of the function L with respect to one set of variables – the velocities. We want to find the Legendre transformation of the Lagrangian with respect to the velocities q̇a, i.e. to pass to the new variables pa.

The total differential of the Lagrangian reads

dL = \frac{\partial L}{\partial q_a}\,dq_a + \frac{\partial L}{\partial \dot{q}_a}\,d\dot{q}_a + \frac{\partial L}{\partial t}\,dt.

Using the definition of the generalized momentum and the Lagrange equations (3.4), we find

dL = \dot{p}_a\,dq_a + p_a\,d\dot{q}_a + \frac{\partial L}{\partial t}\,dt.

Rearrange the terms to get

p_a\,d\dot{q}_a - dL = -\dot{p}_a\,dq_a - \frac{\partial L}{\partial t}\,dt.    (3.6)

The left hand side can be rewritten as a total differential:

d\left(p_a\dot{q}_a - L\right) = \dot{q}_a\,dp_a - \dot{p}_a\,dq_a - \frac{\partial L}{\partial t}\,dt.    (3.7)

On the left hand side we have again the total differential of some function. This function is called the Hamiltonian and is defined by

H = p_a\dot{q}_a - L.    (3.8)

The Hamiltonian is a function of the coordinates qa and the momenta pa. This fact follows from equation (3.7), according to which the total differential of the Hamiltonian is

dH = \dot{q}_a\,dp_a - \dot{p}_a\,dq_a - \frac{\partial L}{\partial t}\,dt.

We know that the coefficients standing by the differentials on the right hand side are in fact the partial derivatives of the function on the left hand side, i.e. the partial derivatives of the Hamiltonian:

\frac{\partial H}{\partial p_a} = \dot{q}_a, \qquad \frac{\partial H}{\partial q_a} = -\dot{p}_a, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}.    (3.9)


Notice that the first two equations have exactly the form (3.3)! These are the new equations of motion, called Hamilton's equations:

\dot{q}_a = \frac{\partial H}{\partial p_a}, \qquad \dot{p}_a = -\frac{\partial H}{\partial q_a}.    (3.10)
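In the spirit of the LagrangeEqs function of section 2.7, equations (3.10) can also be generated automatically; a minimal sketch (the helper HamiltonEqs and its interface are our assumption, not from the text):

HamiltonEqs[H_, q_, p_] := {D[q, t] == D[H, p], D[p, t] == -D[H, q]}
HamiltonEqs[p[t]^2/(2 m) + 1/2 m \[Omega]^2 q[t]^2, q[t], p[t]]

This returns {q'[t] == p[t]/m, p'[t] == -m \[Omega]^2 q[t]}, i.e. Hamilton's equations for the harmonic oscillator treated below.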

Hamilton’s equation possess the advantages of Lagrange’s equations but there are

some differences. Let us compare them briefly.

• Both Lagrange and Hamilton equations hold in arbitrary curvilinear coordinate

system;

• equations of motion are derived from single scalar function, L or H;

• Hamilton’s equations are of the first order while the Lagrange equations are of

the second order;

• there is one Lagrange equation for each degree of freedom, so for the system with

n degrees of freedom we have n Lagrange’s equations; on the other hand, there

are two Hamilton’s equations for each degree of freedom, one for the coordinate

and one for the momentum – thus, there are 2n Hamilton’s equations.

Let us illustrate how Hamilton’s equations “work” on familiar examples.

At this stage, the reader should be very familiar with section 2.4, page 47. The Lagrangian of the particle in the homogeneous gravitational field is

L = \frac{1}{2}\, m\,\dot{x}_i\dot{x}_i - m g\, y = \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) - m g\, y.

The generalized coordinates in this case are

q_1 = x, \qquad q_2 = y,

i.e. we use the ordinary Cartesian coordinates. The generalized momenta are, by definition (3.2),

p_1 = \frac{\partial L}{\partial \dot{x}} = m\dot{x}, \qquad p_2 = \frac{\partial L}{\partial \dot{y}} = m\dot{y}.    (3.11)


We can see that the generalized momenta are the ordinary momenta, the components of p = m v in Cartesian coordinates. The last relations can be inverted to find

\dot{x} = \frac{p_1}{m}, \qquad \dot{y} = \frac{p_2}{m}.    (3.12)

In order to find the Hamiltonian, we have to perform the Legendre transformation of the Lagrangian using relation (3.8):

H = p_a\dot{q}_a - L = p_1\dot{x} + p_2\dot{y} - \frac{1}{2}\, m\left(\dot{x}^2 + \dot{y}^2\right) + m g\, y.

This is not yet the correct expression, for we have to eliminate the velocities q̇a and express them as functions of the momenta pa:

H = p_1\frac{p_1}{m} + p_2\frac{p_2}{m} - \frac{1}{2}\, m\frac{p_1^2}{m^2} - \frac{1}{2}\, m\frac{p_2^2}{m^2} + m g\, y.

Collecting similar terms, we arrive at the Hamiltonian in the form

H = \frac{p_1^2 + p_2^2}{2m} + m g\, y.    (3.13)

Notice that this is, not accidentally, an expression for the total energy of the particle. Hamilton's equations then follow straightforwardly from (3.10):

\dot{x} = \frac{p_1}{m}, \qquad \dot{y} = \frac{p_2}{m}, \qquad \dot{p}_1 = 0, \qquad \dot{p}_2 = -m g.    (3.14)
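These first order equations can again be handed to Mathematica directly; a minimal sketch (the function names p1, p2 are ours):

DSolve[{x'[t] == p1[t]/m, y'[t] == p2[t]/m, p1'[t] == 0, p2'[t] == -m g},
  {x, y, p1, p2}, t]

The momenta come out as p1[t] constant and p2[t] linear in t, reproducing the solution of section 2.4.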

3.5 Conservation of energy

In mechanics we often analyze systems where the total energy is conserved. All examples we have seen until now belong to this class of systems. The fact that energy is conserved can be made explicit in the Hamiltonian framework.

First we show that the Hamiltonian H is in fact equal to the total energy E = T + V, where T is the kinetic energy and V is the potential. Recall that the kinetic energy of one particle is

T = \frac{1}{2}\, m\,\dot{x}_i\dot{x}_i = \frac{1}{2}\, m\,\frac{\partial x_i}{\partial q_a}\frac{\partial x_i}{\partial q_b}\,\dot{q}_a\dot{q}_b = \frac{1}{2}\, g_{ab}\,\dot{q}_a\dot{q}_b ,


where we defined

g_{ab} = m\,\frac{\partial x_i}{\partial q_a}\frac{\partial x_i}{\partial q_b} .

Direct differentiation gives

\frac{\partial T}{\partial \dot{q}_a} = g_{ab}\,\dot{q}_b .

Since the potential V does not depend on the generalized velocities, we have

H = \dot{q}_a p_a - L = \dot{q}_a\frac{\partial L}{\partial \dot{q}_a} - T + V = \dot{q}_a\frac{\partial T}{\partial \dot{q}_a} - T + V = g_{ab}\,\dot{q}_a\dot{q}_b - T + V,

and therefore

H = 2T - T + V = T + V = E.

There exists a more general proof of the last statement and we present it briefly. It relies on Euler's theorem about homogeneous functions. A function f of variables x = (x1, . . . , xn) is said to be homogeneous of degree N if the following holds:

f(\lambda x) = \lambda^N f(x).

Differentiating this relation with respect to λ yields

\frac{\partial f}{\partial(\lambda x_i)}\, x_i = N\,\lambda^{N-1} f(x).

Setting λ = 1 gives the Euler theorem:

x_i\,\frac{\partial f}{\partial x_i} = N\, f(x).    (3.15)

Let us apply this theorem to the Hamiltonian. Kinetic energy T is a function of the coordinates qa and the velocities q̇a, since

T(q, \dot{q}) = \frac{1}{2}\, g_{ab}(q)\,\dot{q}_a\dot{q}_b .


We can see that the kinetic energy is a homogeneous function of degree 2 in the velocities, for we have

T(q, \lambda\dot{q}) = \frac{1}{2}\, g_{ab}(q)\,(\lambda\dot{q}_a)(\lambda\dot{q}_b) = \lambda^2\, T.

Application of (3.15) immediately yields

\dot{q}_a\,\frac{\partial T}{\partial \dot{q}_a} = 2T,

so that the Hamiltonian is

H = \dot{q}_a p_a - L = \dot{q}_a\frac{\partial L}{\partial \dot{q}_a} - T + V = \dot{q}_a\frac{\partial T}{\partial \dot{q}_a} - T + V = T + V = E.

This is an alternative proof of the above statement that the Hamiltonian is equal to the total energy.

Since we now know that the Hamiltonian is equal to the energy, we can discuss the conservation of energy. Relation (3.9) shows that

\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} .

Let us calculate the overall change of the energy per unit time:

\frac{dH}{dt} = \frac{\partial H}{\partial q_a}\,\dot{q}_a + \frac{\partial H}{\partial p_a}\,\dot{p}_a + \frac{\partial H}{\partial t} .

Using Hamilton's equations (3.9) we find

\frac{dH}{dt} = -\dot{p}_a\dot{q}_a + \dot{q}_a\dot{p}_a - \frac{\partial L}{\partial t} = -\frac{\partial L}{\partial t} .

Thus, we derived the relation

\frac{dH}{dt} = -\frac{\partial L}{\partial t} .
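As a concrete check (added here; it uses the oscillator Hamiltonian (3.16) derived in section 3.7), for H = p²/2m + ½mω²q² Hamilton's equations give

\frac{dH}{dt} = \frac{p\dot{p}}{m} + m\omega^2 q\,\dot{q} = \frac{p}{m}\left(-m\omega^2 q\right) + m\omega^2 q\,\frac{p}{m} = 0,

so the energy of the oscillator is conserved, as expected for a Lagrangian with no explicit time dependence.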

Energy is conserved if Ḣ = 0, and this is achieved when

\frac{\partial L}{\partial t} = 0.

The question is: can the Lagrangian depend on time explicitly?


3.6 Phase space

What is the interpretation of Hamilton's equations? They describe the evolution of the system in time. Let us explain this important point in some detail.

Generalized coordinates qa describe the position of all parts of the system. If we know the values

q = (q_1, \dots, q_n),

we know where all the particles are. We say that qa describe the configuration of the system. Generalized momenta describe the state of motion (recall that momenta are related to generalized velocities) of the individual particles. Together we have 2n quantities describing the actual state of the system, which can be encapsulated in the ordered 2n−tuple

(q, p) = (q_1, \dots, q_n, p_1, \dots, p_n).

Variables (q, p) define the state of the system. Hamilton's equations (3.10) then say how, for a given state, these quantities change in time. The set of all possible states of the system is called the phase space. In other words, each state can be identified with one point of the phase space. Let us illustrate the idea on the example of the harmonic oscillator.

3.7 Harmonic oscillator

The Lagrangian treatment of the harmonic oscillator can be found in section 2.5, page 49. The Lagrangian of the harmonic oscillator is

L = \frac{1}{2}\, m\dot{q}^2 - \frac{1}{2}\, m\omega^2 q^2 .

The generalized momentum is then

p = \frac{\partial L}{\partial \dot{q}} = m\dot{q}

and the corresponding Hamiltonian is

H = p\dot{q} - L = \frac{p^2}{m} - L = \frac{p^2}{2m} + \frac{1}{2}\, m\omega^2 q^2 .    (3.16)


Hamilton's equations read

\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad \dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q.

Now we set m = ω = 1 in order to simplify the analysis, so that the equations of

motion become

q̇ = p, ṗ = − q. (3.17)

Let us interpret these equations in the spirit of section 3.6. We have two variables describing the state of the harmonic oscillator, the coordinate q and the momentum p. Hence, the phase space is a two-dimensional plane with coordinates q and p, see figure 3.1. In this figure, the actual state of the oscillator is depicted as a point with coordinates (q, p). The oscillator will then evolve in accordance with Hamilton's equations (3.17), which determine the derivatives of the coordinates. Thus, the oscillator will move in the phase plane in the direction of the velocity (q̇, ṗ), which is a vector tangent to the trajectory of the oscillator in the phase plane. This trajectory is called the phase trajectory. By Hamilton's equations (3.17) we have

(\dot{q}, \dot{p}) = (p, -q).

Since Hamilton’s equations are of the first order, the evolution of the system is

given uniquely by the initial position in the phase plane. If we draw a velocity at

each point of the phase plane, we get a reasonable idea about the behaviour of the

oscillator. Simple Mathematicacode which can be used to draw the velocity field of

harmonic oscillator follows:

VectorPlot[ {p, -q}, {q, -5, 5}, {p, -5, 5},

Frame -> True,

FrameLabel -> {"q", "p"},

BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 13}

]

The result is plotted in figure 3.2. This figure suggests that the phase trajectories of the harmonic oscillator are circles centered at the origin of the phase plane. In the case of the harmonic oscillator we can prove this analytically. Since the Hamiltonian represents the total energy, which was proved to be conserved, we can write

H = E,


Fig. 3.1. Geometrical interpretation of Hamilton's equations (3.17) in the phase space (q, p). The state of the oscillator is represented by the position q and momentum p, which can be regarded as coordinates in the phase space. The "velocity" is then the vector with coordinates (q̇, ṗ), where the derivatives are determined by Hamilton's equations.

where E is the total energy of the oscillator, while H is the Hamiltonian (3.16) with the simplification m = ω = 1 employed in this section for brevity:

H = \frac{p^2}{2} + \frac{q^2}{2} .

Now, the equation H = E can be rearranged slightly so that it acquires the form of the equation of a circle,

q^2 + p^2 = \left(\sqrt{2E}\right)^2,

where the radius of the circle is manifestly r = \sqrt{2E}. We can see that the phase trajectory is determined by the single parameter E, the total energy.
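Numerically integrating Hamilton's equations confirms this picture; a short sketch (the initial state q(0) = 2, p(0) = 0 is our illustrative choice):

sol = NDSolve[{q'[t] == p[t], p'[t] == -q[t], q[0] == 2, p[0] == 0},
   {q, p}, {t, 0, 2 Pi}];
ParametricPlot[Evaluate[{q[t], p[t]} /. sol], {t, 0, 2 Pi}]

The plot is a circle of radius 2, in agreement with r = √(2E) for E = 2.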

Fig. 3.2. Velocity field of the harmonic oscillator. At each point of the phase plane (q, p) we calculate the velocity (q̇, ṗ) using the Hamilton equations (3.17) and draw the vector representing the velocity.

4 Variational principle

We have seen that both Lagrange's equations and Hamilton's equations are essentially equivalent (at least when the forces involved have a potential) to Newton's law of motion. In this chapter we derive Lagrange's equations in a completely different way, using the variational principle. We will see that with this principle it is possible to derive the equations of motion from scratch, with a minimum of initial assumptions. This approach is much more powerful, because it works even outside the realm of classical mechanics. In fact, all laws of modern physics can be formulated in terms of a variational principle.

4.1 Fermat's principle

Before we formulate the variational principle, or Hamilton's principle, in classical mechanics, we start our discussion with a perhaps more familiar example from optics. It is well known that light in different media propagates at different speeds. If c denotes the speed of light in the vacuum, then the refractive index of a given medium is defined as

n = \frac{c}{v},

where v is the speed of light in that medium. For example, the refractive index of water is about n = 1.33, which means that light propagates 1.33 times slower in water than in the vacuum. The refractive index of the air is approximately n = 1, i.e. the speed of light in the air is practically the same as in the vacuum.

Now, when a light ray propagates from one medium to another, it changes its direction. Suppose that the light ray crosses the interface between two media with refractive indices n1 and n2, figure 4.1. It is customary to measure the angle of impact with respect to the line perpendicular to the plane of the interface. For example, the angle of impact in figure 4.1 is denoted by α. Similarly, the angle of refraction is β.

Can we calculate the angle of refraction provided that the angle of impact is given? Yes, we can. According to the Snell law, these angles must satisfy the equation

\frac{\sin\alpha}{\sin\beta} = \frac{n_2}{n_1}.    (4.1)

The Snell law (4.1) is a phenomenological law which was discovered before the theory of electromagnetism and the propagation of electromagnetic waves were found. We can say that the rôle of the Snell law is similar to that of Newton's law of motion. This analogy goes even further. Three basic laws of geometrical optics are

• In a homogeneous medium, light propagates along straight lines;
• When light propagates from one medium to another, the angle of impact and the angle of refraction are related by the Snell law (4.1); if the light is reflected on the interface, then the angle of impact is equal to the angle of reflection;
• If a light ray can propagate along some trajectory, then it can propagate also in the opposite direction along the same trajectory.

With a little imagination one observes a striking analogy between Newton's laws of motion and these three laws of optics. The first law tells us that if no changes of refractive index occur, the light propagates along a straight line, which can be regarded as an analogy to Newton's law of motion: if no forces act, the body moves uniformly along a straight line. The second law, on the other hand, tells us how the direction of propagation is influenced by changes of the refractive index; Newton's law tells us how the velocity is changed under an external force. Finally, the third law of optics ensures that if light can propagate along some trajectory, it can propagate in the reverse direction. In other words, if Alice can see Bob, then also Bob can see Alice. Newton's third law says that if body A exerts a force on body B, then also body B exerts a force of the same magnitude and opposite direction on body A.

However, what we want to emphasize is that both the Snell law and the Newton law of force are empirical laws which are justified by experiment. Is there any deeper law which could explain all three laws of optics? Can we replace the three laws of optics by a single law? Yes, we can, and it is called Fermat's law.

Let us see how we can arrive at the formulation of the Fermat principle by heuristic arguments. Suppose that the light ray is propagating in a homogeneous medium in which, by definition, the refractive index is constant. Then, by the first law of optics, the ray propagates along the straight line, which is the shortest curve connecting two given points. Hence, in a homogeneous medium, the light ray which travels

4.1 Fermat’s principle 79

Fig. 4.1. The light ray changes its direction on the interface between two media with refractive indices n1 and n2. In this figure we assume n1 < n2, which means that the light is faster in the first medium.

from point A to point B follows the shortest path from A to B. Does this statement hold in general? Certainly not, as is obvious from figure 4.1: the trajectory of a light ray passing from one medium to another is not a straight line, and so its trajectory is longer than the shortest possible one. But recall that the light propagates at different speeds in the two media. Maybe it is the time, rather than the length, that is minimal!

There is a beautiful argument by Richard Feynman. Suppose that you are standing at the coast and there is a nice girl drowning in the sea. Of course, you want to save her (this statement does not depend on whether the reader is a girl or a boy). You must reach the girl in the shortest possible time, not by the shortest distance! It is not the same, because you run faster than you swim. There are two extreme trajectories along which you can travel, figure 4.2.

along which you can travel, figure 4.2.

If you run straightly to that girl, trajectory i), you have to swim a long distance

which takes a longer time. If you choose trajectory ii), you spend the shortest possible

time in the water, but you have to run longer while your enter the water. It is obvious

that we have to find a point where to enter the water, so that the overall time which

you need to reach the girl is the shortest. This qualitative analysis shows that the

trajectory must be something like it is shown in figure 4.1 which indeed suggests that

the light ray is following the trajectory with minimal time.

Thus, we have arrived at the conjecture that the light ray propagates from point

A to point B along trajectory which takes the minimal time. This is the Fermat

principle. Let us formulate it in mathematical terms. Suppose that the light ray

starts in the point A and ends in the point B as in figure 4.1. Time which the ray

needs to travel along distance dr is


Fig. 4.2. Two extreme trajectories which can be used to save the drowning girl. Trajectory i) is the most natural one, but the time you spend in the water is too long. Trajectory ii) is better, but now you spend too much time on the coast.

dt = \frac{dr}{v} = \frac{1}{c}\, n\, dr.

The speed of light c is irrelevant here, because whenever ∫ n dr is minimal, so is ∫ dt. Hence, we define the optical path length by

ds = n\, dr.

The total optical path length between points A and B is

S = \int_A^B ds = \int_A^B n\, dr.    (4.2)

This notation is slightly awkward, because the value of the integral depends not only on the points A and B but on the whole trajectory. In figure 4.3 we depict two different trajectories γ and γ′ connecting points A and B. Obviously, the optical path length S is different for the two trajectories, and thus instead of writing the integration bounds A and B we write the trajectory explicitly, e.g.

S[\gamma] = \int_\gamma n\, dr.

Here we explicitly emphasize that the integral is taken along the trajectory γ. Notice that S depends on the entire trajectory γ. In other words, S can be regarded as a mapping which assigns a number, the optical path length, to each trajectory γ,

S : \gamma \mapsto \mathbb{R}.

In general, any mapping from an arbitrary set into the real numbers is called a functional.

Fig. 4.3. There are many trajectories connecting points A and B, and the optical path length S is different for each of them.

What is the law of propagation for the light ray? The Fermat principle states that the light propagates along the trajectory for which the optical path length S[γ] is minimal. All three optical laws formulated at the beginning of this section can be recovered from this simple statement. We do not show how it can be done in general, but we show how the Snell law (4.1) can be derived from the Fermat principle.

The situation is sketched in figure 4.4. Suppose again that the light ray starts at point A in the medium with refractive index n1, crosses the interface between the two media and finally ends at point B in the medium with refractive index n2. Let a be the distance of point A from the interface, let b be the distance of point B from the interface, and let x be the coordinate of the place where the ray crosses the interface. If points A and B are held fixed, it is the coordinate x which is unknown: we want to find the place where the ray must cross the interface in order that the optical path length be minimal. The complement of the distance x will be denoted by y. Notice that, for fixed points A and B, the sum of x and y is constant, say l (it is the horizontal distance between points A and B):

x + y = l,

which implies

\frac{\partial y}{\partial x} = -1.    (4.3)


Fig. 4.4. Derivation of the Snell law using the Fermat principle.

The distance from point A to the crossing point and the corresponding optical path length are

r_1 = \sqrt{a^2 + x^2}, \qquad s_1 = n_1 r_1 = n_1\sqrt{a^2 + x^2}.

Similarly, the distance from the crossing point to point B and the corresponding optical path length are

r_2 = \sqrt{b^2 + y^2}, \qquad s_2 = n_2 r_2 = n_2\sqrt{b^2 + y^2}.

The total optical path length is therefore

S = n_1\sqrt{a^2 + x^2} + n_2\sqrt{b^2 + y^2}.    (4.4)

We want to find such x that the length S is minimal. This is an easy task of elementary calculus: we differentiate S with respect to x and set the derivative equal to zero. Assuming that n1 and n2 are constants and using (4.3), we find

\frac{\partial S}{\partial x} = n_1\frac{x}{\sqrt{a^2 + x^2}} - n_2\frac{y}{\sqrt{b^2 + y^2}} = 0.    (4.5)

From figure 4.4 we can see that x and y are related to the angles α and β by

\sin\alpha = \frac{x}{r_1} = \frac{x}{\sqrt{a^2 + x^2}}, \qquad \sin\beta = \frac{y}{r_2} = \frac{y}{\sqrt{b^2 + y^2}},


so that equation (4.5) becomes

\frac{\sin\alpha}{\sin\beta} = \frac{n_2}{n_1},

which is the Snell law. This completes the proof.
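The same minimization can be carried out numerically; a small sketch with illustrative values (n1 = 1, n2 = 1.5, a = b = 1, l = 2 are our choices, not from the text):

NMinimize[{Sqrt[1 + x^2] + 1.5 Sqrt[1 + (2 - x)^2], 0 <= x <= 2}, x]

The minimizing x satisfies sin α / sin β = n2/n1 to numerical precision, in agreement with (4.1).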

Let us recapitulate and conclude this section. First we formulated three basic laws of geometrical optics and emphasized the analogy between these laws and Newton's laws of motion. By some argumentation we arrived at the conjecture that the laws of optics can be replaced by a single law called Fermat's principle: the light ray propagates in such a way that the optical path length is minimal. From this simple law we were able to derive the Snell law of refraction of light.

Although this textbook is not concerned with optics, our aim was to illustrate the idea of a variational principle on a familiar example. The reason for this digression is the fundamental importance of variational principles in theoretical physics. It is possible to show that all phenomena which occur in geometrical optics can be explained on the basis of the Fermat principle. Since the laws of optics are analogous, at least mathematically, to the Newton laws, we can hope that it is possible to formulate Newton's laws in the framework of the variational principle. This will be done in the rest of this chapter.

4.2 Formulation of variational problem

Before we apply the variational principle to Newtonian mechanics, we formulate the problem in precise mathematical terms and solve it. Let us return to integral (4.2), which is only a formal expression of the Fermat principle. When we derived the Snell law from the Fermat principle, we assumed that the refractive index was constant in the first medium and constant but different in the second medium. This allowed us to split the integral into the sum of two terms (4.4). In general, however, the refractive index will be a function of the coordinates. Indeed, the first law of optics tells us that if the refractive index is constant everywhere, the light rays propagate along straight lines. Hence, as the first step, we must admit that n is a function of the spatial coordinates:

S[\gamma] = \int_\gamma n(x, y)\, dr.

Here we assume that the ray propagates in the xy-plane, so that nothing depends on the z−coordinate. The line element dr must then be expressed in terms of the Cartesian coordinates as well; using the Pythagorean theorem we have

dr = \sqrt{dx^2 + dy^2}.

Parametrizing the trajectory by a parameter t, so that

dx = \dot{x}\, dt, \qquad dy = \dot{y}\, dt,

the optical path length acquires its final form

S[\gamma] = \int_\gamma n(x, y)\sqrt{\dot{x}^2 + \dot{y}^2}\; dt.    (4.6)

We can see that the optical path length S[γ] is a functional of the form

S[\gamma] = \int_\gamma L(x(t), \dot{x}(t))\; dt,

where L is a function of the coordinates and their derivatives. Our task is to find the trajectory γ for which the value S[γ] is minimal. It is a task similar to finding the minimum of a function, familiar from elementary calculus. Such a problem is solved by taking the derivative with respect to the variable and setting it to zero. The difference in our case is that γ is not a single variable but the entire trajectory, and it is not obvious how to differentiate S with respect to γ. This concept is known as the functional derivative or the variation, and it can be defined in a very general context. Here we define it in a more pedestrian way, sufficient for our purposes.

4.3 Variation of the functional

In the previous section we formulated the basic problem of variational calculus in the Cartesian coordinates. We have seen in previous chapters that it is often useful to introduce generalized coordinates qa instead of the Cartesian coordinates xi. Hence, we replace x in integral (4.6) by q:

S[q] = \int_\gamma L(q(t), \dot{q}(t))\; dt,    (4.7)

where q(t) is the coordinate expression of the trajectory. How can we differentiate S with respect to γ in order to find the γ for which S[γ] is minimal?


Suppose that qa is the trajectory which is the solution to our problem, i.e. suppose that for qa the functional S[q] acquires its minimal value. Let this trajectory pass through point A at t = t1 and point B at t = t2, see figure 4.5. Since qa is the minimal trajectory, any trajectory q′a different from qa must yield a bigger value of S. Notice that we can choose an arbitrary trajectory q′a, but it must satisfy the boundary conditions

q'_a(t_1) = q_a(t_1), \qquad q'_a(t_2) = q_a(t_2),    (4.8)

because points A and B are held fixed. Let us write the trajectory q′a in the form

q'_a(t) = q_a(t) + \varepsilon\,\eta_a(t),    (4.9)

where ε is an arbitrary constant parameter and ηa(t) is an arbitrary function of time, subject to the boundary conditions

\eta_a(t_1) = \eta_a(t_2) = 0    (4.10)

in order to satisfy conditions (4.8).

Since the function ηa is the difference between the trajectories qa and q′a, it is called a variation, and in physical textbooks it is often denoted by δqa = εηa. The symbol δ has formally the same properties as the total differential d. Notice that (4.9) implies

\dot{q}'_a = \dot{q}_a + \varepsilon\,\dot{\eta}_a,

so we can write δq̇a = εη̇a. In other words, the variation δ commutes with differentiation with respect to the parameter t.

As we emphasized repeatedly, the functional S depends on the trajectory; for qa it acquires its minimal value, while for q′a ≠ qa we have

S[q'] = S[q + \varepsilon\eta].

Notice that now we have parametrized the family of trajectories q′a by the single parameter ε. Since we want to find the minimum of S[q′] (which is S[q]), we need to differentiate S[q′] somehow. While we do not know how to differentiate S with respect to the entire trajectory, differentiation with respect to ε is a well-defined operation. Hence, we define the variation or functional derivative of S by

\delta S = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} S[q + \varepsilon\eta].    (4.11)

dε ε=0

The notation d/dε|ε=0 means that first we differentiate the function with respect to ε and then set ε = 0. The reason why we substitute zero for ε will become clear soon. Now, the correct qa is a solution of the equation

\delta S = 0.


Fig. 4.5. Two trajectories qa and q′a starting at point A and ending at point B. Only for the trajectory qa is the functional S minimized.

4.4 Euler-Lagrange equations

Having defined the variation (4.11), we can now easily solve our variational problem. Let us state it again. We want to find such a trajectory qa(t) that the integral (called the action)

S[q] = \int_{t_1}^{t_2} L(q(t), \dot{q}(t))\; dt    (4.12)

is minimal. To this end we vary the trajectory,

q_a \mapsto q_a + \varepsilon\,\eta_a,

and solve the equation δS = 0, where the variation δ is defined by (4.11). Since we suppose that the initial and final points A and B are fixed, we are interested only in trajectories for which

\eta_a(t_1) = \eta_a(t_2) = 0.


The variation of the action (4.12) is

\delta S = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} S[q + \varepsilon\eta] = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_{t_1}^{t_2} L(q(t) + \varepsilon\eta(t),\; \dot{q} + \varepsilon\dot{\eta})\; dt = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q_a}\,\eta_a + \frac{\partial L}{\partial \dot{q}_a}\,\dot{\eta}_a\right) dt.    (4.13)

Note that the function L in the first line is evaluated on the varied trajectory q′a = qa + εηa. Then we differentiate L using the chain rule, with respect to its first and then with respect to its second argument. After the differentiation we put ε = 0, so that the function L is evaluated on the original trajectory qa. Hence, after the differentiation we no longer have the varied trajectory, only the original one.

The next step is to remove the derivative of the variation ηa with respect to the parameter t. Using integration by parts we find

\int_{t_1}^{t_2} \frac{\partial L}{\partial \dot{q}_a}\,\frac{d\eta_a}{dt}\; dt = \left[\frac{\partial L}{\partial \dot{q}_a}\,\eta_a\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a}\,\eta_a\; dt.    (4.14)

Now we impose the boundary conditions (4.10), that ηa must vanish at the boundary points A and B, which implies that the "boundary" term in the square brackets is equal to zero! Thus, after integration by parts, the variation of the action becomes

\delta S = \int_{t_1}^{t_2} \left[\frac{\partial L}{\partial q_a} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a}\right] \eta_a\; dt.    (4.15)

Our variational principle tells us that this variation must be equal to zero. Recall that during the variation we kept the boundary points A and B fixed. However, equation (4.15) must hold for arbitrary points A and B, because we did not say anything specific about these points. We can choose these points arbitrarily, then find δS, and this variation δS must vanish. Moreover, the variation ηa was chosen to be arbitrary as well. Then, δS can vanish for all ηa and for all points A and B only if the expression in the square brackets is zero everywhere. In other words, the variational principle implies that the following equations must hold:

\frac{\partial L}{\partial q_a} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a} = 0.    (4.16)

These equations are known as the Euler-Lagrange equations of variational calculus.
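Mathematica can produce these equations directly from a given L via the standard VariationalMethods add-on package; a quick sketch for the oscillator Lagrangian (2.22):

Needs["VariationalMethods`"]
EulerEquations[1/2 m q'[t]^2 - 1/2 m \[Omega]^2 q[t]^2, q[t], t]

which returns the equation of motion equivalent to (2.24).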


We can see that the Euler-Lagrange equations are nothing else than Lagrange's equations (2.18), if we identify the Lagrangian with the function L above. This is a surprising result: an actual physical system evolves in time in such a way as to minimize the action (4.12)!

From the other point of view, recall that Lagrange's equations (2.18) have been derived as an equivalent formulation of Newton's laws of motion in an arbitrary coordinate system. Thus, at the beginning, we had the Newton law, which is a physical law. In this chapter, on the other hand, we have not assumed anything about the physics: we merely formulated the rule, the variational principle, that the action must be minimal. Then we performed some calculations and showed that this principle is equivalent to the Euler-Lagrange equations (4.16). Hence, we have derived the same form of the law of motion without using any physics.

Of course, this strong statement is somewhat weakened if we realize that the variational principle does not tell us what the form of the function L is. In order to guess the form of L we have to impose some physical restrictions. First, consider a free particle, i.e. a particle moving in free space where no forces are present. If we describe the particle in Cartesian coordinates, the Lagrangian L can depend on the coordinates xi and the velocities ẋi. However, all points of the space are equivalent and none is preferred. If there are no forces, the particle must behave in the same way independently of its position. Hence, the Lagrangian cannot depend directly on the coordinates; it can depend only on the velocities. This is a consequence of the homogeneity of space.

The next restriction comes from the isotropy of space. While homogeneity implies that all points are equivalent, isotropy implies that, for a given point, all directions in the space are equivalent. We can rotate the system containing our particle under analysis and the particle will behave in the same way. Thus, the Lagrangian cannot depend on the direction of the velocity vi = ẋi and can depend only on its magnitude, v² = ẋi ẋi.

Thus, we have determined the Lagrangian of the free particle up to a multiplicative constant and we can write it in the form

$$L = \alpha\, v^2, \qquad (4.17)$$

where α is a constant characteristic of the particle whose value depends on the convention we use. We can argue that our Lagrangian is proportional to the kinetic energy and that it is therefore plausible to set α = m/2, but this is not necessary. We emphasize that it is more or less only a convention that we write the constant α in this form; the reason is that Newton's law was discovered first and the variational principle only later. From now on we assume α = m/2 and denote the kinetic energy by


$$T = \alpha\, v^2 = \frac{1}{2}\, m\, v^2$$

and investigate what happens in the presence of forces.

We can see the heuristic power of the variational principle: the equations of motion are provided by the Euler-Lagrange equations, which always have the same form regardless of the system we describe and independently of the coordinates used. In order to find the equations of motion we only have to specify the Lagrangian. Usually there are not many possibilities for what the Lagrangian can look like: we have seen in the case of the free particle that essentially the only admissible form of the Lagrangian is (4.17). The reason is that the Lagrangian is a scalar, so we must construct a scalar quantity from the quantities describing our system, like velocities and coordinates, and usually there are only a few such possibilities.

The situation is similar even in the presence of forces. If the force is potential, and thus described by a single scalar V such that F_i = −∂_i V, it is natural to set

$$L = \alpha\, v^2 - V,$$

where the minus sign is customary again and is related to the fact that the force is minus the gradient of the potential. This choice is convenient but by no means necessary.

Electromagnetic forces, on the other hand, are not potential. Thus, the construction of the Lagrangian as in chapter 2 is impossible: we can define the generalized forces Q_a, but they are not a gradient of any scalar. In fact, the electromagnetic field is described by one scalar potential φ and one vector potential A_i. These potentials in general depend on time and position. At this point it is not important what the vector potential is; we just want to illustrate that even in this case the Lagrangian can be constructed.

Indeed, a particle moving in the electromagnetic field is described again by the position x_i and by the velocity v_i, while the field itself is described by the potentials φ and A_i. Can we combine these quantities to form a scalar Lagrangian? Yes, and the construction is fairly unique. Since φ is itself a scalar, we can simply add it to the Lagrangian of the free particle (or, more precisely, subtract it from the Lagrangian), so that the first part of the Lagrangian will be

$$L = T - \beta\,\phi.$$

Here, β is again a constant to be specified later. Now we can form three scalar functions from the quantities x_i, v_i and A_i:

$$x_i v_i, \qquad x_i A_i, \qquad A_i v_i.$$

The first combination does not contain field quantities and we can exclude it immediately, for it cannot describe the interaction of the particle with the field. The second combination looks better, but recall that the space itself is homogeneous. This homogeneity is broken by the presence of the electromagnetic field, but the Lagrangian still should not depend on the coordinates directly, only through the potentials φ and A_i. Hence, the only plausible combination is A_i v_i and we can write

$$L = T - \beta\,\phi + \gamma\,\boldsymbol{A}\cdot\boldsymbol{v}.$$

Now, the constants β and γ obviously determine the strength of the interaction between the field and the particle. We know from experience that the electromagnetic force is proportional to the charge e of the particle, and thus we can write the Lagrangian in the final form

$$L = T - e\,\phi + e\,\boldsymbol{A}\cdot\boldsymbol{v}. \qquad (4.18)$$

We can see that our construction is not "bullet-proof", but it is very natural and, moreover, it yields the correct equations of motion. This heuristic approach is even more powerful in relativistic theories, where the action must be a scalar¹ with respect to the so-called Lorentz group, which is a strong restriction. Notice that in classical mechanics we know what the correct equations of motion are: Lagrange's equations must reduce to Newton's law. However, when we are developing a new theory, we do not know what the correct equations are. In such a position we usually assume that the variational principle is correct and guess the form of the action or the Lagrangian. In this way people constructed the modern quantum field theories of the electromagnetic, weak and strong interactions. Hence, the variational principle is a much more fundamental principle than our discussion may suggest.

¹ In classical mechanics it does not matter whether we construct the action or directly the Lagrangian, because they differ only by an integration over time. In relativistic theories, time is not invariant and transforms as a component of a (four-)vector. Hence, it is the action which must be a scalar, not the Lagrangian.

Using the variational formulation it is easy to see that the Lagrangian is not unique, i.e. there are many different Lagrangians which yield the same equations of motion. To see this, consider an arbitrary function F = F(t) of time and define

$$f(t) = \frac{dF}{dt}.$$

Let us modify the action by adding a new term to it:

$$S' = S + \int_{t_1}^{t_2} f(t)\, dt.$$

Since f is the total derivative of F, the new term integrates immediately:

$$S' = S + \int_{t_1}^{t_2} \frac{dF}{dt}\, dt = S + \left[F(t)\right]_{t_1}^{t_2} = S + F(t_2) - F(t_1).$$

Thus, the new action S′ differs from S only by boundary terms – the values of F at the boundaries of the trajectory. These are fixed under the variation and so we have

$$\delta S = \delta S'.$$

That means that the variational principle δS = 0 gives the same equations of motion as the principle δS′ = 0. By the definition of the action we have

$$S' = \int_{t_1}^{t_2} \left(L + f(t)\right) dt \equiv \int_{t_1}^{t_2} L'\, dt,$$

where

$$L' = L + f(t) = L + \frac{dF}{dt}. \qquad (4.19)$$

In other words, if we change the Lagrangian by adding a function f which is a total time derivative of some other function F, we do not change the equations of motion. Hence, the Lagrangian is not unique. This is an important observation which will be exploited in connection with canonical transformations in chapter 5.
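This non-uniqueness is easy to verify mechanically. A minimal sketch in Mathematica (the function F = x(t)² g(t) below is an arbitrary illustrative choice; the argument above used F = F(t), but the same cancellation occurs for F depending on the coordinates):

Needs["VariationalMethods`"]
lag1 = 1/2 m x'[t]^2 - V[x[t]];
lag2 = lag1 + D[x[t]^2 g[t], t];   (* add a total time derivative dF/dt *)
eq1 = EulerEquations[lag1, x[t], t];
eq2 = EulerEquations[lag2, x[t], t];
Simplify[(eq1 /. Equal -> Subtract) - (eq2 /. Equal -> Subtract)]   (* -> 0 *)

Both Lagrangians produce identical equations of motion, exactly as (4.19) predicts.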

We have shown that the variational principle reproduces Lagrange's equations. Can we reproduce Hamilton's equations as well? Let us start with the action (4.7) and express the Lagrangian in terms of the Hamiltonian using the Legendre transform (3.8):

$$S = \int_{t_1}^{t_2} \left(p_a\,\dot q_a - H\right) dt. \qquad (4.20)$$

Recall that the Hamiltonian is a function of coordinates and momenta, H = H(q, p). Let us vary the action, remembering that the δ-symbol behaves like a differential:

$$\delta S = \int_{t_1}^{t_2} \left(p_a\,\delta\dot q_a + \dot q_a\,\delta p_a - \frac{\partial H}{\partial q_a}\,\delta q_a - \frac{\partial H}{\partial p_a}\,\delta p_a\right) dt.$$

Now we have three variations, δq̇_a, δp_a and δq_a. However, they are not all independent, because δq̇_a is the time derivative of the variation δq_a. We can get rid of this term by integrating by parts,

$$\int_{t_1}^{t_2} p_a\,\delta\dot q_a\, dt = \left[p_a\,\delta q_a\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \dot p_a\,\delta q_a\, dt,$$

where the first term on the right-hand side vanishes by the boundary conditions δq_a = 0 for t = t1 and t = t2. Then the variation of the action becomes

$$\delta S = \int_{t_1}^{t_2} \left(-\dot p_a\,\delta q_a + \dot q_a\,\delta p_a - \frac{\partial H}{\partial q_a}\,\delta q_a - \frac{\partial H}{\partial p_a}\,\delta p_a\right) dt.$$

The variation will be zero for an arbitrary choice of t1 and t2 only if the integrand vanishes. Comparing the coefficients standing by the independent variations δq_a and δp_a, we recover Hamilton's equations

$$\dot q_a = \frac{\partial H}{\partial p_a}, \qquad \dot p_a = -\frac{\partial H}{\partial q_a}. \qquad (4.21)$$

In general, during the evolution of a mechanical system, the quantities characterizing the system change. Namely, coordinates and velocities (or momenta in the Hamiltonian formulation) are solutions to the equations of motion and hence they are genuine (non-trivial) functions of time. However, there are other quantities which are functions of q_a and p_a but which remain constant during the real evolution. The most familiar example is energy. We have seen that the Hamiltonian represents the total energy of the system, and if it does not depend on time explicitly, it does not depend on time at all. For example, the Hamiltonian of the harmonic oscillator is

$$H = \frac{p^2}{2m} + \frac{1}{2}\,m\,\omega^2 q^2,$$

where both p and q are functions of time. Nevertheless, for any solution of Hamilton's equations, the particular combination of coordinates and momenta given by H is a constant. In this case we say that the energy is conserved.
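This conservation is easy to check numerically. A minimal sketch in Mathematica (units with m = ω = 1 are assumed here for brevity):

sol = First[NDSolve[{q'[t] == p[t], p'[t] == -q[t], q[0] == 1, p[0] == 0},
    {q, p}, {t, 0, 10}]];
energy[t_] = (p[t]^2 + q[t]^2)/2 /. sol;
{energy[0], energy[5], energy[10]}   (* all three values ≈ 0.5 *)

Although p and q oscillate, the particular combination (p² + q²)/2 stays constant along the solution.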

Other examples of conserved quantities are momentum and angular momentum.

Total momentum and total angular momentum of an isolated system are constant

in time.

From the mathematical point of view, the existence of conserved quantities is not surprising: it is a direct consequence of the properties of differential equations. For a system with n degrees of freedom we have n Lagrange's equations of the second order or 2n Hamilton's equations of the first order. The solution of a second-order equation contains two arbitrary constants, so the solution of the complete set of Lagrange's equations contains 2n constants. Similarly, the solution of a first-order equation contains one integration constant, so the solution of the complete set of Hamilton's equations again contains 2n constants.

We have arrived at the conclusion that, regardless of the formalism, the solution of the equations of motion depends on the choice of 2n arbitrary constants C_1, ..., C_{2n}. Hence, the solution (q, p) of the equations of motion can be written in the form

$$q_1 = q_1(t, C_1, \ldots, C_{2n}), \qquad p_1 = p_1(t, C_1, \ldots, C_{2n}),$$
$$\vdots$$
$$q_n = q_n(t, C_1, \ldots, C_{2n}), \qquad p_n = p_n(t, C_1, \ldots, C_{2n}).$$

This is a system of 2n equations for the constants C_m which can be inverted to obtain

$$C_1 = C_1(q_1, \ldots, q_n, p_1, \ldots, p_n, t),$$
$$\vdots$$
$$C_{2n} = C_{2n}(q_1, \ldots, q_n, p_1, \ldots, p_n, t).$$

In other words, for any solution of Hamilton's equations there must exist 2n functions C_m of coordinates and momenta which are in fact constant and hence conserved. In this sense the existence of conserved quantities is a purely mathematical consequence of the fact that solutions of differential equations contain integration constants. Moreover, any function of the constants C_m is itself constant, and therefore the set of conserved quantities is not unique.

There is, however, a much deeper physical interpretation of the existence of conserved quantities. Some of them reflect properties of space and time and so they are intimately related to symmetries of Nature. This relation is the content of the celebrated Noether's theorem, one of the most fundamental and striking achievements of modern theoretical physics. The most important consequences of Noether's theorem can be found in relativistic quantum field theories, but it has implications even in the context of classical mechanics. In the following we derive and prove Noether's theorem; then we show that the conservation of energy, momentum and angular momentum is a consequence of this theorem. The reader will notice that the theorem is genuinely based on the variational principle to which this chapter is devoted.

When we derived Lagrange's equations from the action, the idea was to find the trajectory q_a for which the action S acquires its extremal value. The variation of the action was introduced with the help of varied trajectories, recall figure 4.5. The variation of the trajectory was arbitrary, with the only constraint that it must vanish at the boundary points A and B. Using this constraint we were able to derive the equations of motion, i.e. the Lagrange equations.

Now we proceed differently. We claimed in the previous section that to each symmetry of the system there corresponds a conserved quantity. What do we mean by a symmetry of the system? The simplest example of a symmetry is the invariance with respect to translations in time. Isolated systems must be invariant under such translations: if we perform some experiment at time t1 and then the same experiment at a later time t2 > t1, both experiments must give the same results, provided all conditions remain unchanged.

For example, suppose we study the collision of two particles with initial velocities v1 and v2 and masses m1 and m2. In addition, we suppose that these particles form an isolated system, not affected by the laboratory. After the collision we measure the velocities and find that the new velocities of the particles are v′1 and v′2. The point is that if the initial velocities and masses do not change, the resulting velocities after the collision do not depend on the time at which the experiment was performed. It does not matter whether we study the collision on Monday or on Friday: the result must be the same, independent of time.

Fig. 4.6. Translation of the system in time (the trajectory from A at t1 to B at t2 is shifted to run from A′ at t′1 to B′ at t′2).

More generally, imagine that q_a = q_a(t) is the real trajectory (i.e. a solution of the equations of motion) which passes through point A at time t1 and point B at time t2, see figure 4.6. If we perform the same experiment at a later time, we can imagine it as "shifting" the trajectory to the right (in the time direction), so that the new trajectory starts at point A′ at the shifted time t′1 and ends at point B′ at time t′2. We say that, mathematically, we have translated the system in time. If all other conditions remain the same, then the shape of the trajectory cannot change: the particle must move along the same trajectory, only at a later time.

We say that time is homogeneous, i.e. all instants of time are physically equivalent. Hence, the result of any experiment cannot depend explicitly on the time at which it was performed: an isolated system must be invariant under translations in time.

Notice that this conclusion does not apply to non-isolated systems. For example, suppose that we measure the intensity of the sunlight at 8.00 am and at 11.00 pm. Then the results will, of course, be different! We cannot say that the intensity of the sunlight is always the same. However, this is related to the fact that the Earth is not an isolated system if one studies the sunlight, because for our measurement it is crucial that energy is coming from the Sun to the Earth. The conditions which can affect the experiment are not the same in the morning and before midnight. Hence, the assumption that the system is isolated is important. In fact, the existence of the Sun and the rotation of the Earth break the homogeneity of time.

We will not always emphasize it, but in connection with conservation laws, we

will always assume that the system is isolated.

Homogeneity of time is not the only important symmetry; another one is the homogeneity of space. This principle states that the result of an experiment cannot depend on the place where we perform it. Again, we must add the assumption that all external conditions are the same. But if this assumption is satisfied, it does not matter where we perform the experiment. The physics must be invariant with respect to translations in space; this transformation is shown in figure 4.7.

Fig. 4.7. Translation of the system in space (the trajectory q(t) from A to B is shifted to q′(t) running from A′ to B′, while the times t1, t2 are unchanged).

The last of the most important symmetries is the isotropy of space. Isotropy means that at a given point of space all directions are equivalent, and the result of any experiment cannot change if we rotate the system by an arbitrary angle.

If the system is invariant with respect to some transformation (translation in time or space, or rotation), the action of this system does not change under this transformation. Noether's theorem then implies that each of these symmetries is responsible for the conservation of some quantity: homogeneity of time implies the conservation of energy, homogeneity of space implies the conservation of momentum, and isotropy of space implies the conservation of angular momentum.

Notice that in the previous examples we varied either the trajectory or the time. In the case of the spatial translation, figure 4.7, we did not transform the time, only the trajectory. However, the boundary points were not fixed, because the endpoints of the trajectory are transformed as well. Thus, in general, the boundary conditions must be relaxed. In the case of the time translation we did not change the values of the coordinates q_a, but we shifted the trajectory in time, and thus we must consider not only variations of the coordinates q_a but also a variation of the time, δt.

All transformations considered above are special cases of the general transformation

$$t \mapsto t' = t + \delta t(t), \qquad q_a(t) \mapsto q_a'(t) = q_a(t) + \delta q_a(t).$$

Here we explicitly emphasized that the variations δt and δq_a can depend on time. The variation δq_a is called the isochronous variation because it is the difference of the varied coordinate q′_a(t) and the original coordinate q_a(t) at the same time. Besides δq_a we also introduce the non-isochronous variation, or total variation, ∆q_a defined by

$$\Delta q_a = q_a'(t + \delta t) - q_a(t) = q_a'(t) + \dot q_a\,\delta t - q_a(t) = \delta q_a + \dot q_a\,\delta t. \qquad (4.22)$$

Theorem 4 (Emmy Noether's theorem). Let S be the action of the system defined by

$$S = \int_{t_1}^{t_2} L(q, \dot q, t)\, dt. \qquad (4.23)$$

Let the action be invariant under the transformation generated by the variations δt and ∆q_a,

$$\Delta S = 0. \qquad (4.24)$$

Then the quantity

$$Q = p_a\,\Delta q_a - E\,\delta t \qquad (4.25)$$

is constant during the evolution of the system whenever q_a is a solution of the equations of motion, where p_a are the generalized momenta and E is the generalized energy of the system, defined by

$$p_a = \frac{\partial L}{\partial \dot q_a}, \qquad E = p_a \dot q_a - L. \qquad (4.26)$$


Proof. The action associated with the varied trajectory q′_a and the varied time is

$$S' = \int_{t_1+\delta t_1}^{t_2+\delta t_2} L(q'(t), \dot q'(t), t)\, dt, \qquad (4.27)$$

where we use the notation δt1 = δt(t1) and δt2 = δt(t2) for brevity. Notice that the time translation affects only the integration bounds, not the integrand. The total variation of the action is then ∆S = S′ − S, which is zero by the assumption of the invariance of the action:

$$\Delta S = S' - S = 0. \qquad (4.28)$$

Splitting the integration range, we can write

$$S' = \int_{t_1+\delta t_1}^{t_2+\delta t_2} = \int_{t_1+\delta t_1}^{t_2} + \int_{t_2}^{t_2+\delta t_2} = \int_{t_1}^{t_2} - \int_{t_1}^{t_1+\delta t_1} + \int_{t_2}^{t_2+\delta t_2},$$

i.e.

$$S' = \int_{t_1}^{t_2} L(q', \dot q', t)\, dt - \int_{t_1}^{t_1+\delta t_1} L(q', \dot q', t)\, dt + \int_{t_2}^{t_2+\delta t_2} L(q', \dot q', t)\, dt,$$

where we have omitted the integrand in the intermediate steps. Hence, the total variation of the action reads

$$\Delta S = \underbrace{\int_{t_1}^{t_2}\left[L(q',\dot q',t) - L(q,\dot q,t)\right] dt}_{\Delta S_1} \;+\; \underbrace{\int_{t_2}^{t_2+\delta t_2} L(q',\dot q',t)\, dt - \int_{t_1}^{t_1+\delta t_1} L(q',\dot q',t)\, dt}_{\Delta S_2}. \qquad (4.29)$$

Now we are in a position to expand these integrals in the variations δq_a and δt, assuming they are infinitesimal and hence neglecting higher-order terms. This is "legal" because in the definition of the variation it was assumed that after the variation all quantities will be evaluated at δq_a = δt = 0, so only the first-order terms enter the result.

First we express the variation denoted by ∆S1 in the equation above. The Lagrangian is evaluated on different trajectories but at the same time, and so the expression under the integral is the isochronous variation of the Lagrangian:


$$\Delta S_1 = \int_{t_1}^{t_2} \delta L\, dt = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q_a}\,\delta q_a + \frac{\partial L}{\partial \dot q_a}\,\delta\dot q_a\right) dt.$$

Integrating the second term by parts, we obtain

$$\Delta S_1 = \left[\frac{\partial L}{\partial \dot q_a}\,\delta q_a\right]_{t_1}^{t_2} + \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q_a} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_a}\right)\delta q_a\, dt.$$

We arrived at the same expression when we derived Lagrange's equations from the variational principle, but now the interpretation is different. There we assumed that the boundary points of the trajectory are fixed, and so we imposed δq_a(t1) = δq_a(t2) = 0. By this assumption the first term, in square brackets, vanished, and hence we deduced that in order to satisfy δS = 0 the Lagrange equations must hold. Now the boundary conditions are not fixed, because we consider a transformation of the whole system. However, we assume that the equations of motion are satisfied, and therefore it is the second term which vanishes! Consequently, the only contribution from ∆S1 to the total variation is

$$\Delta S_1 = \left[\frac{\partial L}{\partial \dot q_a}\,\delta q_a\right]_{t_1}^{t_2}.$$

Next we evaluate the variation ∆S2 in the expression (4.29). Recall that we are expanding all quantities up to the first order in the variations δq_a and δt. Thus, for example, the first integral in ∆S2 is

$$\int_{t_2}^{t_2+\delta t_2} L(q', \dot q', t)\, dt.$$

Since the integral is taken over the interval (t2, t2 + δt2), whose width is the infinitesimal δt2, the inequality t − t2 < δt2 holds throughout the integration range, and to first order we may replace the integrand by its value at t = t2 and the varied trajectory q′ by the original one q:

$$\int_{t_2}^{t_2+\delta t_2} L(q', \dot q', t)\, dt \approx L(q(t_2), \dot q(t_2), t_2)\,\delta t_2.$$

Similarly,

$$\int_{t_1}^{t_1+\delta t_1} L(q', \dot q', t)\, dt \approx L(q(t_1), \dot q(t_1), t_1)\,\delta t_1,$$

so that ∆S2 = [L δt] evaluated between t1 and t2. Adding both contributions, the total variation of the action becomes

$$\Delta S = \left[\frac{\partial L}{\partial \dot q_a}\,\delta q_a + L\,\delta t\right]_{t_1}^{t_2}. \qquad (4.30)$$

Using the definition of the generalized momentum (3.2) and the relation (4.22) between the isochronous and total variations, δq_a = ∆q_a − q̇_a δt, we find

$$\Delta S = \left[p_a\,\Delta q_a - \left(p_a\,\dot q_a - L\right)\delta t\right]_{t_1}^{t_2}.$$

The coefficient standing by the variation δt is in fact equal to the Hamiltonian (3.8). The reason why we do not denote it by H is that the Hamiltonian should be expressed as a function of q_a and p_a, which is not the case here. But we know that the Hamiltonian is equal to the total energy, and hence we define the generalized energy by

$$E = p_a\,\dot q_a - L$$

and introduce the quantity

$$Q = p_a\,\Delta q_a - E\,\delta t. \qquad (4.33)$$

The total variation is then

$$\Delta S = \left[Q\right]_{t_1}^{t_2} = Q(t_2) - Q(t_1).$$

Now, by (4.28) we have ∆S = 0 and hence

$$Q(t_1) = Q(t_2). \qquad (4.34)$$

Since the times t1 and t2 can be chosen arbitrarily, the value of Q at an arbitrary time t1 is equal to its value at an arbitrary time t2. In other words, Q acquires the same value at all times and hence is a conserved quantity,

$$Q = \text{constant}.$$

Nevertheless, Q is not yet our final expression for the conserved quantity, because it contains the variations ∆q_a and δt and hence depends on the particular transformation. We have to clarify the nature of the variations further. If we say that the system is invariant under, for example, translations, we actually mean that it is invariant under an arbitrary translation. The translation in, say, the x-direction can be understood as a continuous transformation parametrized by a parameter a,

$$x \mapsto x + a.$$

For a = 0 we have the identity transformation x ↦ x. Since a is a continuous parameter, the transformation x ↦ x + a is also continuous in the variable a.

This concludes the proof of Noether's theorem. ⊓⊔

4.9 Basic conservation laws

In the previous section we proved Noether's theorem for a general transformation of the system generated by the infinitesimal variations ∆q_a and δt. We proved that for such a general transformation the quantity (4.25),

$$Q = p_a\,\Delta q_a - E\,\delta t,$$

is conserved. In this section we investigate the implications of Noether's theorem for the basic symmetries of space and time discussed above: the homogeneity of space and time and the isotropy of space.
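As a first short illustration of how the theorem is used, consider a pure translation in time: δt = ε with ε an arbitrary constant, while the trajectory itself is merely shifted, so that ∆q_a = 0. Noether's quantity (4.25) then reduces to

$$Q = p_a\cdot 0 - E\,\varepsilon = -E\,\varepsilon,$$

and since ε is an arbitrary constant, the conservation of Q is precisely the conservation of the generalized energy E. Similarly, for a spatial translation of a particle, δt = 0 and ∆x_i = ε_i, the conserved quantity is Q = p_i ε_i, i.e. the momentum.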

5

Hamilton-Jacobi equation

In the previous chapters we met, besides the Newtonian mechanics, two other formulations of classical mechanics, namely the Lagrange and the Hamiltonian formulation. Lagrange's equations are formulated in an arbitrary coordinate system. Their main advantage is that by an appropriate choice of the coordinates we can eliminate the constraints which complicate the analysis. Similarly to Newton's law, Lagrange's equations are second-order equations. Hamilton's equations are also coordinate-independent but, in addition, they have the form of first-order differential equations. In general, first-order equations are easier to solve. In the case of Hamilton's equations this advantage is only formal, because in order to solve the system of Hamilton's equations we usually have to convert them back to second-order equations. The main advantage of Hamilton's equations is that we can interpret the motion of particles as motion in the phase space. We have seen that the conservation of energy allows us to find the phase trajectories even without solving the equations of motion.

In this chapter we start with the analysis of coordinate transformations which leave the form of Hamilton's equations invariant, the so-called canonical transformations. Then we study the possibility of finding such transformations which simplify Hamilton's equations so that they can be solved easily. We will see that this is indeed possible if we solve the Hamilton-Jacobi equation. In many situations the Hamilton-Jacobi equation can be solved exactly and the solution of Hamilton's equations then simplifies significantly. The analysis of the Hamilton-Jacobi equation will lead us to a new, third formulation of classical mechanics. Finally, we introduce the action-angle variables, which will be useful in the analysis of more complicated systems with periodic behaviour.

5.1 Canonical transformations

In Hamilton's formalism we treat the coordinates q_a and the momenta p_a as independent variables. Let us investigate transformations which do not change the form of Hamilton's equations

$$\dot q_a = \frac{\partial H}{\partial p_a}, \qquad \dot p_a = -\frac{\partial H}{\partial q_a}. \qquad (5.1)$$

Hence, we are interested in transformations

$$Q_a = Q_a(q, p, t), \qquad P_a = P_a(q, p, t), \qquad (5.2)$$

preserving equations (5.1), i.e. such that the new equations of motion will be

$$\dot Q_a = \frac{\partial H'}{\partial P_a}, \qquad \dot P_a = -\frac{\partial H'}{\partial Q_a}. \qquad (5.3)$$

In chapter 4 we have seen that the Lagrangian is not determined uniquely: we can add an arbitrary function which is a total time derivative to the Lagrangian without affecting the equations of motion, recall equation (4.19).

Suppose that we have the Lagrangian L = L(q, q̇) and the corresponding Hamiltonian

$$H = \dot q_a\, p_a - L.$$

Performing the transformation (5.2) we obtain a new Lagrangian L′ = L′(Q, Q̇) with the associated Hamiltonian

$$H' = \dot Q_a\, P_a - L'.$$

We require that both Lagrangians yield the same equations of motion. Then, by (4.19), the two Lagrangians can differ only by the total time derivative of some function F,

$$L' = L + \frac{dF}{dt}.$$

In terms of the Hamiltonians this means

$$\dot q_a\, p_a - H = \dot Q_a\, P_a - H' + \frac{dF}{dt}. \qquad (5.4)$$

In general, the function F depends on the old coordinates, the new coordinates and possibly on time,

$$F = F(q, p, Q, P, t), \qquad (5.5)$$

i.e. it is a function of 4n + 1 variables. But these coordinates are not all independent, as they are constrained by the 2n equations (5.2). Hence, F is a function of 2n + 1 independent variables and we can decide which variables will be treated as independent. Transformations (5.2) are called canonical and the function F is called the generating function of the canonical transformation (5.2).

Let us choose a generating function F1 which is a function of the old and the transformed coordinates (and possibly of time), F1 = F1(q, Q, t). Its total time derivative is

$$\frac{dF_1}{dt} = \frac{\partial F_1}{\partial q_a}\,\dot q_a + \frac{\partial F_1}{\partial Q_a}\,\dot Q_a + \frac{\partial F_1}{\partial t}. \qquad (5.6)$$

Substituting this expression into (5.4) and comparing the coefficients standing by the independent derivatives q̇_a and Q̇_a, respectively, we find

$$p_a = \frac{\partial F_1}{\partial q_a}, \qquad P_a = -\frac{\partial F_1}{\partial Q_a}, \qquad H' = H + \frac{\partial F_1}{\partial t}. \qquad (5.7)$$

Hence, we can choose an arbitrary function F1 of q_a and Q_a and, using relations (5.7), find the transformation which the function F1 generates. The equation

$$p_a = \frac{\partial}{\partial q_a} F_1(q, Q, t)$$

can be used to find the defining relation for Q_a, i.e. we can solve this equation to find

$$Q_a = Q_a(q, p, t).$$

Similarly, the second relation of (5.7),

$$P_a = -\frac{\partial}{\partial Q_a} F_1(q, Q, t),$$

can then be solved to find the relation

$$P_a = P_a(q, p, t).$$

Sometimes it is more convenient to use a generating function which depends on the old coordinates q_a and the new momenta P_a. This can be achieved using the familiar Legendre transformation. Let us write the differential of F1 with the help of equations (5.7):

$$dF_1 = \frac{\partial F_1}{\partial q_a}\,dq_a + \frac{\partial F_1}{\partial Q_a}\,dQ_a + \frac{\partial F_1}{\partial t}\,dt = p_a\,dq_a - P_a\,dQ_a + \frac{\partial F_1}{\partial t}\,dt = p_a\,dq_a - d(Q_a P_a) + Q_a\,dP_a + \frac{\partial F_1}{\partial t}\,dt.$$

Let us define the function

$$F_2 = F_1 + Q_a P_a. \qquad (5.8)$$

Then

$$dF_2 = p_a\,dq_a + Q_a\,dP_a + \frac{\partial F_1}{\partial t}\,dt, \qquad (5.9)$$

which means that F2 is a function of q_a and P_a,

$$F_2 = F_2(q, P, t). \qquad (5.10)$$

In full analogy with (5.7) we now obtain

$$Q_a = \frac{\partial F_2}{\partial P_a}, \qquad p_a = \frac{\partial F_2}{\partial q_a}, \qquad H' = H + \frac{\partial F_2}{\partial t}. \qquad (5.11)$$

5.2 Hamilton-Jacobi equation

Canonical transformations preserve the form of the equations of motion. Let us find a canonical transformation after which Hamilton's equations simplify as much as possible, so that we can solve them explicitly. We introduce a generating function of the type (5.10), but we will denote it by S:

$$S = S(q, P, t).$$

Relations (5.11) then read

$$Q_a = \frac{\partial S}{\partial P_a}, \qquad p_a = \frac{\partial S}{\partial q_a}, \qquad H' = H + \frac{\partial S}{\partial t}. \qquad (5.12)$$

The simplest possible Hamiltonian is H′ = 0, and so we require that S satisfies the equation

$$H + \frac{\partial S}{\partial t} = 0. \qquad (5.13)$$

Hamilton's equations (5.3) with H′ = 0 then imply

$$\dot Q_a = 0, \qquad \dot P_a = 0. \qquad (5.14)$$

In other words, the transformed coordinates and momenta are constant. Equations (5.14) can be solved trivially,

$$Q_a = \alpha_a, \qquad P_a = \beta_a, \qquad (5.15)$$

where α_a and β_a are integration constants equal to the constant values of the coordinates and momenta. The generating function can then be written as

$$S = S(q, \beta, t) \qquad (5.16)$$

and relations (5.12) acquire the form

$$\alpha_a = \frac{\partial S}{\partial \beta_a}, \qquad p_a = \frac{\partial S}{\partial q_a}, \qquad H + \frac{\partial S}{\partial t} = 0. \qquad (5.17)$$

Thus, if we want to find the canonical transformation which simplifies Hamilton's equations, we first solve the equations

$$p_a = \frac{\partial S}{\partial q_a}, \qquad H(q, p, t) + \frac{\partial S}{\partial t} = 0.$$

Notice that the first equation is merely the definition of p_a, so the only equation which must in fact be solved is

$$H\!\left(q, \frac{\partial S}{\partial q}, t\right) + \frac{\partial S}{\partial t} = 0. \qquad (5.18)$$

This equation for the generating function S is known as the Hamilton-Jacobi equation. The Hamilton-Jacobi equation is a single first-order partial differential equation in the n + 1 variables q1, ..., qn and t, and therefore its complete solution S contains n + 1 arbitrary constants. One of them is additive, for obviously any function S′ = S + c, where c is a constant, is also a solution of (5.18). This constant can be set to zero without loss of generality, because the Hamilton-Jacobi equation contains only derivatives of S. Hence, the solution will contain n essential constants:

$$S = S(q, c_1, \ldots, c_n, t). \qquad (5.19)$$

This result should be compared to equation (5.16), where β_a are the constant momenta.

Our aim was to arrive at Hamilton's equations in the form (5.14), so in order to identify the constants c_a with the momenta β_a we have to show that the coordinates derived from the generating function (5.19) via (5.12) are indeed constant. We have

$$Q_a = \frac{\partial S}{\partial c_a},$$

and using Hamilton's equations and the Hamilton-Jacobi equation we find

$$\dot Q_a = \frac{d}{dt}\frac{\partial S}{\partial c_a} = \frac{\partial}{\partial q_b}\frac{\partial S}{\partial c_a}\,\dot q_b + \frac{\partial}{\partial t}\frac{\partial S}{\partial c_a} = \frac{\partial}{\partial c_a}\frac{\partial S}{\partial q_b}\,\dot q_b + \frac{\partial}{\partial c_a}\frac{\partial S}{\partial t} = \frac{\partial p_b}{\partial c_a}\,\frac{\partial H}{\partial p_b} - \frac{\partial H}{\partial c_a}.$$

Now we use the fact that the Hamiltonian H depends on the constants c_a only through the generating function S,

$$\frac{\partial H}{\partial c_a} = \frac{\partial}{\partial c_a}\, H\!\left(q, \underbrace{\frac{\partial S(q, c, t)}{\partial q}}_{p},\, t\right) = \frac{\partial H}{\partial p_b}\,\frac{\partial p_b}{\partial c_a},$$

so that indeed

$$\dot Q_a = 0.$$

To summarize: the solution S of the Hamilton-Jacobi equation generates a canonical transformation after which the coordinates Q_a are constant, and we denote them by α_a = Q_a as we did above. Then we can identify the unknown constants c_a in the function S with the constant momenta P_a = c_a = β_a. Hamilton's equations in the transformed coordinates thus read

$$\dot Q_a = 0, \qquad \dot P_a = 0,$$

as desired.

5.3 Example: harmonic oscillator

The procedure explained in the previous section may seem somewhat abstract, and it could be useful to see how it works on our favourite example of the harmonic oscillator. Let us take, for simplicity, the Hamiltonian in the form

$$H(q, p) = \frac{p^2}{2} + \frac{q^2}{2}.$$

In order to formulate the Hamilton-Jacobi equation, we replace the momentum p by the derivative of the generating function S,

$$p = \frac{\partial S}{\partial q},$$

in accordance with (5.12). The Hamilton-Jacobi equation (5.18) then reads

$$\frac{\partial S}{\partial t} + H\!\left(q, \frac{\partial S}{\partial q}\right) = 0.$$

Let us put

$$S = A(t) + W(q),$$

i.e. we separate the time dependence from the coordinate dependence. Then the Hamilton-Jacobi equation acquires the form

$$\frac{\partial A}{\partial t} = -H\!\left(q, \frac{\partial W}{\partial q}\right).$$

We know that the Hamiltonian is constant and equal to the total energy E, so that

$$\frac{\partial A}{\partial t} = -E,$$

which integrates to A = −E t, and the generating function can be written in the form

$$S(q, E, t) = -E\,t + W(q).$$

The energy E plays the rôle of the constant momentum β, so we write β = E. The Hamilton-Jacobi equation is now

$$H\!\left(q, \frac{\partial W}{\partial q}\right) = E.$$


For our Hamiltonian this reads

$$\frac{1}{2}\left(W'(q)\right)^2 + \frac{1}{2}\,q^2 = E,$$

and after rearrangement,

$$dW = \sqrt{2E - q^2}\;dq.$$

This is an elementary integral and can be evaluated with some work (or using Mathematica). The result is

$$W(q, E) = E\,\arcsin\frac{q}{\sqrt{2E}} + \frac{1}{2}\,q\,\sqrt{2E - q^2},$$

where the additive integration constant has been set to zero.¹

Since we have identified the integration constant E with the constant momentum P = β, we can use relation (5.17),

$$\alpha = \frac{\partial S}{\partial \beta},$$

to obtain the constant transformed coordinate Q = α. By differentiating S = −Et + W we find

$$\alpha = -t + \frac{\partial W}{\partial E} = -t + \arcsin\frac{q}{\sqrt{2E}}.$$

Since we have proved in the previous section that α must be constant, we can use the last equation to express q as a function of the time t:

$$q = \sqrt{2E}\,\sin(\alpha + t).$$
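The integration and the final check can also be done in Mathematica; a minimal sketch (we write En instead of E, because E is a built-in constant in Mathematica):

W[q_, En_] = Integrate[Sqrt[2 En - s^2], {s, 0, q},
   Assumptions -> 0 < q < Sqrt[2 En]];
(* verify the Hamilton-Jacobi equation W'(q)^2/2 + q^2/2 == E *)
Simplify[D[W[q, En], q]^2/2 + q^2/2 == En, 0 < q < Sqrt[2 En]]   (* True *)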

¹ In order to perform the integration, use the substitution q = √(2E) sin x to obtain ∫ 2E cos² x dx. Then use the trigonometric formula cos² x = (1 + cos 2x)/2 and perform the trivial integration. Finally, return to the variable q by inverting the relation for x and use the formula sin 2x = 2 sin x cos x, where sin x = q/√(2E) and cos x = √(1 − sin² x).

5.4 Action-angle variables

Let us continue with our analysis of the harmonic oscillator. An important class of systems is described by so-called integrable Hamiltonians, a term to be defined later. Before we discuss the integrability of the system, we need to introduce a new set of canonically conjugate variables known as the action-angle variables.

In the previous section we have seen that if the Hamiltonian is time-independent, the action S can be written in the form

$$S = -E\,t + W(q, E),$$

where the function W depends on the coordinate q and the total energy E. Now we are going to use this function as a generating function for a canonical transformation.

We know that the harmonic oscillator moves in a periodic way and that its phase trajectories in the phase space are circles (in the simplified units we use in this chapter; in general units they are ellipses). In other words, its phase trajectories are always closed curves. Hence, it makes sense to define a new momentum called the action variable by

$$J = \oint p\, dq, \qquad (5.20)$$

where the integral is taken along the orbit of the oscillator, i.e. along the circle. We said that J will be treated as a momentum, which means that we identify the transformed momentum β with the action variable J. Recall that the Hamiltonian is equal to the total energy,

$$H(q, p) = E,$$

which can be solved for the momentum to give

$$p = p(q, E).$$

Hence, the integrand of (5.20) depends on q and E. But since we integrate over the variable q, the integral does not depend on q anymore and we have

$$J = J(E) \qquad\text{or}\qquad E = E(J).$$

Expressing the energy as E = E(J), the function W can therefore be regarded as a function of the coordinate q and the action variable J:

$$W = W(q, J).$$

Recall that, by (5.11), the transformed coordinate is obtained as the derivative of the generating function with respect to the momentum. In our case W is the generating function and J plays the rôle of the momentum; the conjugate coordinate will be called the angle variable and is defined by

$$w = \frac{\partial W}{\partial J}. \qquad (5.21)$$

Because the generating function W does not depend on time explicitly, by (5.11) we have H′ = H, and since canonical transformations preserve the form of Hamilton's equations, the equation of motion in terms of the action-angle variables is simply

$$\dot w = \frac{\partial H}{\partial J}. \qquad (5.22)$$

In the case of the harmonic oscillator we have

$$H = \frac{p^2}{2} + \frac{q^2}{2} = E,$$

so that $p = \sqrt{2E - q^2}$.
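For the oscillator the computation can be finished explicitly. The phase trajectory p² + q² = 2E is a circle of radius √(2E), and the loop integral (5.20) is just the area it encloses:

$$J = \oint p\, dq = \pi\left(\sqrt{2E}\right)^2 = 2\pi E, \qquad\text{hence}\qquad E(J) = \frac{J}{2\pi}.$$

Equation (5.22) then gives ẇ = ∂E/∂J = 1/2π: the angle variable grows linearly in time and increases by exactly one over each period 2π of the oscillator.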

6

Electromagnetic field

Lagrange's equations and Hamilton's equations have been derived from Newton's law under the assumption that the force which acts on the particle is conservative, i.e. it can be written as a gradient of the potential,

$$\boldsymbol{F} = -\nabla V.$$

Only for such forces can we define the potential energy and, consequently, introduce the Hamiltonian. On the other hand, when we derived Lagrange's equations from the variational principle, we just assumed that the system can be described by some Lagrangian L, without assuming the conservative nature of the forces explicitly. We only argued that if the force is conservative, and thus has a potential V, then the natural choice is L = T − V. This approach, however, does not exclude the possibility that the system can be described by some function L even if the force is not conservative.

A particle in an electromagnetic field is the most important practical example of such a system. In physics we often study the motion of charged particles in external electromagnetic fields without caring how these fields emerged. Hence, we do not study the dynamics of the fields; we merely assume that the fields are given and investigate the motion of particles in the regions where the electromagnetic fields are present.

In the past people thought that electricity and magnetism are two different phenomena, while today we know that they are just two different aspects of a single entity called the electromagnetic field. The electric part of the field is described by the vector field E (sometimes called the electric field strength or electric intensity) and the magnetic part of the field is described by the vector field B (sometimes called the magnetic induction). A fully unified view of these fields as parts of the electromagnetic field is possible only in the framework of the special theory of relativity. Let us elucidate the meaning of the fields E and B.

The sources of the gravitational force are masses: the gravitational force between two point masses m and m′ is proportional to the product m m′ and is given by

$$F = G\,\frac{m\,m'}{r^2},$$

where r is the distance between the points and G is the gravitational constant. The numerical value of the constant G in standard SI units is

$$G \approx 6.67 \times 10^{-11}\ \mathrm{N\,m^2\,kg^{-2}}.$$

Similarly, the sources of the electromagnetic interaction are charges, i.e. charged particles. Charge is usually denoted by the symbol q or e and it can be either positive or negative. Particles with vanishing charge are called neutral. It is a remarkable fact that for two point charges at rest, the electric force of their interaction is given by the Coulomb law, which is formally identical to Newton's law of gravitation. Two point charges q and q′ at mutual distance r act on each other by an electric force of magnitude

$$F = k\,\frac{q\,q'}{r^2}, \qquad (6.1)$$

where k is the constant characterizing the strength of the electromagnetic interaction; it plays a rôle similar to that of the gravitational constant G in Newton's law. The numerical value of the constant k depends on the system of units we use. In standard SI units we write k in the form

$$k = \frac{1}{4\pi\varepsilon_0},$$

where ε0 is called the permittivity of the vacuum and its value is ε0 ≈ 8.85 × 10⁻¹² F m⁻¹, so that

$$k \approx 8.99 \times 10^{9}\ \mathrm{F^{-1}\,m}.$$

Comparing this value to the value of the gravitational constant G we can see that the electric force is much, much stronger than the gravitational force.


However, the simple Coulomb law (6.1) holds only for charges at rest. When the charges start to move in an arbitrary way, new effects emerge. First, the electromagnetic field propagates at a finite speed c equal to the speed of light,

$$c = 299\,792\,458\ \mathrm{m\,s^{-1}}.$$

Notice that in SI units this value is not approximate but exact. It is related to the constant ε0 by

$$c = \frac{1}{\sqrt{\varepsilon_0\,\mu_0}},$$

where µ0 is called the permeability of the vacuum and its value is, by definition,

$$\mu_0 = 4\pi\times 10^{-7}\ \mathrm{H\,m^{-1}}.$$

When we say that the speed of propagation of the electromagnetic field is finite and equal to c, we mean that if one charge changes its position, the other charges do not feel this change immediately but only after the time

$$\Delta t = \frac{r}{c},$$

where r is the distance from the charge which changed its position. From this fact it is immediately obvious that r in the Coulomb law (6.1) is a problematic quantity, because we must take into account that the charge at the actual distance r cannot have an immediate effect on another charge.

The next problem is that a moving charge produces not only an electric but also a magnetic field. A time-dependent electric field is a source of the magnetic field and vice versa. This is what we mean by the dynamics of the electromagnetic field: the field can propagate through empty space (without charges) at the speed of light. Hence, the notion of force is not appropriate for the description of the dynamics of the electromagnetic interaction and the notion of the field must be introduced.

But, as we claimed, we will not discuss the dynamics of the electromagnetic field, which is governed by the celebrated Maxwell equations. We simply assume that the electromagnetic field is given and investigate the motion of charged particles in this field. Once again, the electromagnetic field is described by the electric field E and the magnetic field B. Consider a particle with charge q which is moving in a region where only the electric field is present, i.e. B = 0. Then the electric field acts on the particle by the force

$$\boldsymbol{F} = q\,\boldsymbol{E}. \qquad (6.2)$$


In other words, the electric force is proportional to the electric field E and to the charge q of the particle, which is an experimental fact. Once we discover this fact, relation (6.2) becomes the definition of the electric vector E: E is the vector such that the electric force exerted on a point charge q is given by (6.2).

Similarly, consider a particle moving in a region where only the magnetic field is present. Once again we find (experimentally) that the force acting on the charge q is proportional to the charge. In addition, however, we find that the direction of the magnetic force is always orthogonal to the velocity v of the charge. It was discovered that the magnetic force is given by

$$\boldsymbol{F} = q\,\boldsymbol{v}\times\boldsymbol{B}, \qquad (6.3)$$

where the operation × is the standard vector product¹ (or cross product). Again, relation (6.3) is the definition of the magnetic vector B.

When both electric and magnetic fields are present, the force exerted on the particle is given by the so-called Lorentz force

$$\boldsymbol{F} = q\,(\boldsymbol{E} + \boldsymbol{v}\times\boldsymbol{B}). \qquad (6.4)$$

We emphasize that relation (6.4) is an experimental fact, similarly to Newton's law of force, and we do not derive it here from any more basic principle. It is fascinating that relation (6.4) can be derived from more basic principles, but this is completely beyond the scope of this textbook². In the theory of electromagnetism it is shown that instead of the electric field E and the magnetic field B we can introduce one scalar function φ and one vector function A; this is a consequence of Maxwell's equations. In this textbook we proceed differently and simply assume that this can be done. From this assumption we will be able to derive the correct equations of motion of a charged particle in an arbitrary electromagnetic field.

¹ Recall that the cross product a × b of vectors a and b is a vector orthogonal to both a and b whose magnitude is |a × b| = a b sin θ, where θ is the angle between the two vectors.

² The particular form of the Lorentz force can be obtained from first principles by considering the Poincaré group of isometries of the Minkowski spacetime. The electromagnetic field appears to be a massless representation of the Poincaré group with spin 1, which yields the set of Maxwell equations. The Lorentz force can then be derived using the principle of local gauge invariance.

6.1 Lagrangian and equations of motion

In accordance with the last paragraph of the previous section, we assume that the electromagnetic field can be described, in some sense, by one scalar field φ called the scalar potential and by one vector field A called the vector potential, so that the Lagrangian of a particle in the electromagnetic field is

$$L = \frac{1}{2}\,m\,v^2 - e\,\phi + e\,\boldsymbol{v}\cdot\boldsymbol{A} = \frac{1}{2}\,m\,\dot x_i \dot x_i - e\,\phi + e\,\dot x_i A_i, \qquad (6.5)$$

where e is a constant measuring the strength of the interaction between the particle and the electromagnetic field; this constant is called the charge of the particle. We assume that the Lagrangian (6.5) represents the correct description of a particle moving in a given electromagnetic field. This assumption is justified a posteriori by the accordance of the theory with experiment.

The equations of motion can be derived from the usual Lagrange equations (2.18),

$$\frac{d}{dt}\frac{\partial L}{\partial \dot x_i} - \frac{\partial L}{\partial x_i} = 0.$$

The partial derivatives read

$$\frac{\partial L}{\partial \dot x_i} = m\,\dot x_i + e\,A_i, \qquad \frac{\partial L}{\partial x_i} = -e\,\partial_i\phi + e\,\dot x_j\,\partial_i A_j.$$

Note that the total derivative of A_i with respect to time is

$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial x_j}\frac{dx_j}{dt} + \frac{\partial A_i}{\partial t} \equiv \dot x_j\,\partial_j A_i + \partial_t A_i,$$

and hence

$$\frac{d}{dt}\frac{\partial L}{\partial \dot x_i} = m\,\ddot x_i + e\,\dot x_j\,\partial_j A_i + e\,\partial_t A_i.$$

Collecting these auxiliary expressions we find that the Lagrange equations of motion are

$$m\,\ddot x_i = -e\,\partial_i\phi - e\,\partial_t A_i + e\,\dot x_j\left(\partial_i A_j - \partial_j A_i\right). \qquad (6.6)$$

Now, since ∂_i ẋ_j = 0, the last term on the right-hand side can be rewritten using the identity

$$\boldsymbol{v}\times(\nabla\times\boldsymbol{A}) = \nabla(\boldsymbol{A}\cdot\boldsymbol{v}) - (\boldsymbol{v}\cdot\nabla)\,\boldsymbol{A},$$

so that the equation of motion acquires the vector form

$$m\,\frac{d\boldsymbol{v}}{dt} = -e\,\nabla\phi - e\,\frac{\partial\boldsymbol{A}}{\partial t} + e\,\boldsymbol{v}\times(\nabla\times\boldsymbol{A}). \qquad (6.7)$$

This is the equation of motion of a charged particle. We can see that the acceleration is not given directly by the potentials but by their derivatives (that is the reason why they are called potentials). Hence, we can introduce the vectors

$$\boldsymbol{E} = -\nabla\phi - \frac{\partial\boldsymbol{A}}{\partial t}, \qquad \boldsymbol{B} = \nabla\times\boldsymbol{A}, \qquad (6.8)$$

in which case we can write equation (6.7) in the form

$$m\,\frac{d\boldsymbol{v}}{dt} = e\,(\boldsymbol{E} + \boldsymbol{v}\times\boldsymbol{B}), \qquad (6.9)$$

which is the law for the Lorentz force (6.4). For the sake of completeness we list the Cartesian components of equation (6.9):

$$m\,\frac{dv_x}{dt} = e\,E_x + e\,v_y B_z - e\,v_z B_y,$$
$$m\,\frac{dv_y}{dt} = e\,E_y + e\,v_z B_x - e\,v_x B_z, \qquad (6.10)$$
$$m\,\frac{dv_z}{dt} = e\,E_z + e\,v_x B_y - e\,v_y B_x.$$
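As a consistency check, the passage from the Lagrangian (6.5) to the equations of motion can be reproduced in Mathematica. The sketch below uses the standard add-on package VariationalMethods` and specializes, for readability, to the potentials φ = 0, A = (0, Bx, 0) of a homogeneous magnetic field (the same potentials appear in (6.16) below):

Needs["VariationalMethods`"]
r = {x[t], y[t], z[t]};
A = {0, B x[t], 0};                        (* homogeneous field B along the z-axis *)
lag = m/2 D[r, t].D[r, t] + e D[r, t].A;   (* Lagrangian (6.5) with φ = 0 *)
EulerEquations[lag, r, t]
(* returns equations equivalent to m x'' = e B y', m y'' = -e B x', m z'' = 0,
   i.e. (6.10) with E = 0 and B = (0, 0, B) *)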

Having derived the Lagrange equations of motion of a charged particle in an external electromagnetic field, we now turn to the Hamiltonian description of the same problem. Proceeding in the standard way we introduce the generalized momentum by

$$p_i = \frac{\partial L}{\partial \dot x_i} = m\,\dot x_i + e\,A_i. \qquad (6.11)$$

In order to find the Hamiltonian we invert this relation to find

$$\dot x_i = \frac{p_i}{m} - \frac{e}{m}\,A_i. \qquad (6.12)$$

Notice that although we are working in Cartesian coordinates, the generalized momentum

$$\boldsymbol{p} = m\,\boldsymbol{v} + e\,\boldsymbol{A}$$

is different from the linear momentum mv. The Hamiltonian is then given by the Legendre transformation of the Lagrangian,

$$H = \dot x_i\, p_i - L, \qquad (6.13)$$

where we must, however, express the velocities ẋ_i in terms of the generalized momenta using (6.12). After simple rearrangements we find

$$H = \frac{1}{2m}\left(\boldsymbol{p} - e\,\boldsymbol{A}\right)^2 + e\,\phi. \qquad (6.14)$$

Let us now differentiate the Hamiltonian with respect to the coordinates and momenta,

$$\frac{\partial H}{\partial x_i} = -\frac{e}{m}\left(p_j - e\,A_j\right)\partial_i A_j + e\,\partial_i\phi, \qquad \frac{\partial H}{\partial p_i} = \frac{1}{m}\left(p_i - e\,A_i\right),$$

from which the Hamilton equations follow:

$$\dot x_i = \frac{1}{m}\left(p_i - e\,A_i\right), \qquad \dot p_i = \frac{e}{m}\left(p_j - e\,A_j\right)\partial_i A_j - e\,\partial_i\phi. \qquad (6.15)$$

Hamilton's equations (6.15) can easily be implemented in Mathematica. Although the following code may look a bit complicated, it is in fact very straightforward. We implement a function

HamiltonEM[φ, A]

which takes as its arguments the scalar potential and the vector potential. This function then produces the list of six Hamilton's equations for the particle in the electromagnetic field.


HamiltonEM[Φ_, A_] :=
 Module[{xs, ps, eqs1, eqs2, dependencies, DA, DΦ},
  xs = {x, y, z};
  ps = {p1, p2, p3};
  dependencies = {x -> x[t], y -> y[t], z -> z[t],
    p1 -> p1[t], p2 -> p2[t], p3 -> p3[t]};
  (* first line of (6.15): x'[t] - (p - e A)/m == 0 *)
  eqs1 = Equal @@@ Transpose[
     {D[xs /. dependencies, t] -
       1/m ((ps - e A) /. dependencies), {0, 0, 0}}];
  DΦ = D[Φ, #] & /@ xs;   (* gradient of the scalar potential *)
  DA = D[A, #] & /@ xs;   (* matrix of the derivatives ∂i Aj *)
  (* second line of (6.15): p'[t] - (e/m)(DA.p - e DA.A) + e ∇Φ == 0 *)
  eqs2 = Equal @@@ Transpose[
     {D[ps /. dependencies, t] -
       ((e/m (DA.ps - e DA.A) - e DΦ) /. dependencies), {0, 0, 0}}];
  Flatten[{eqs1, eqs2}]]
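Before solving the equations numerically we store them in a symbol eqs; for the potentials (6.16) used below, the call presumably reads

eqs = HamiltonEM[0, {0, B x, 0}];   (* a sketch: the explicit invocation is assumed *)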

6.4 Homogeneous fields

As a first example we consider the motion of a charged particle in a homogeneous magnetic field B without the presence of an electric field, i.e.

$$\boldsymbol{E} = 0, \qquad \boldsymbol{B} = \text{constant}.$$

Since the magnetic and electric fields are assumed to be constant (or even vanishing), the potentials obviously do not depend on time, so that

$$\boldsymbol{E} = -\nabla\phi, \qquad \boldsymbol{B} = \nabla\times\boldsymbol{A}.$$

The electric field vanishes and so, by the first of these equations, the potential φ is constant and can be set to zero without loss of generality. The remaining equation B = ∇ × A in component form reads

$$B_x = \partial_y A_z - \partial_z A_y, \qquad B_y = \partial_z A_x - \partial_x A_z, \qquad B_z = \partial_x A_y - \partial_y A_x.$$

It is possible to find the solution for an arbitrary direction of the magnetic field, but for convenience we choose a coordinate system in which B points along the z-axis,

$$\boldsymbol{B} = (0, 0, B).$$

This can always be achieved by an appropriate rotation. With this choice we have

$$0 = \partial_y A_z - \partial_z A_y, \qquad 0 = \partial_z A_x - \partial_x A_z, \qquad B = \partial_x A_y - \partial_y A_x.$$

Since the field is homogeneous, we can look for a potential which does not depend on z, so that all the partial derivatives ∂_z drop out:

$$0 = \partial_y A_z, \qquad 0 = -\partial_x A_z, \qquad B = \partial_x A_y - \partial_y A_x.$$

The first two equations tell us that A_z does not depend on x and y and hence is a constant. However, this constant does not enter the expression for B in the third equation, and thus we can set A_z = 0. The equation for B can be solved, for example, by setting

$$A_x = 0, \qquad A_y = B\,x.$$

Summa summarum, the potentials φ and A representing the homogeneous magnetic field parallel to the z-axis can be written in the form

$$\phi = 0, \qquad \boldsymbol{A} = (0, B\,x, 0). \qquad (6.16)$$

The reader can check that ∇ × A = (0, 0, B). Of course, the choice of the potentials is not unique and we have chosen the simplest possibility.
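This check takes one line in Mathematica; a sketch using the same VectorAnalysis` package as in section 6.5 below:

Needs["VectorAnalysis`"]
SetCoordinates[Cartesian[x, y, z]];
Curl[{0, B x, 0}]   (* -> {0, 0, B} *)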

In the following code we generate the set of Hamilton's equations by invoking the function HamiltonEM defined above and set the initial conditions to

$$x_0 = 1, \quad y_0 = 0, \quad z_0 = 0, \quad p_{1,0} = 0, \quad p_{2,0} = 2, \quad p_{3,0} = 0.$$

The numerical values of the constants are set to m = B = e = 1.

In[2]:=
vals = {m -> 1, B -> 1, e -> 1};
initConds = {x[0] == 1, y[0] == 0, z[0] == 0,
   p1[0] == 0, p2[0] == 2, p3[0] == 0};
tmax = 20;
sol = NDSolve[Join[eqs, initConds] /. vals,
   {x[t], y[t], z[t], p1[t], p2[t], p3[t]}, {t, 0, tmax}];


Now we plot the solution. All the plotting options can be ignored; they serve only to improve the quality of the plot.

In[114]:=
g1 = ParametricPlot3D[{x[t], y[t], z[t]} /. First[sol], {t, 0, tmax},
   AxesOrigin -> {0, 0, 0}, Boxed -> False,
   PlotRange -> {{-1, 3.5}, {-1, 2}, {-1, 1}},
   Ticks -> {Range[-1, 3, 1], Range[-1, 2, 1], {-1, 1}},
   BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 15},
   ViewPoint -> {1, 1, 1}];
g2 = Graphics3D[{Text[Style["x", 15], {3.5, 0.2, 0}],
    Text[Style["y", 15], {-0.2, 2, 0}],
    Text[Style["z", 15], {-0.05, 0.1, 1}]}];
Show[g1, g2]

(figure: circular trajectory of the particle in the x-y plane)

We can see that the trajectory of the particle is a circle of radius 1 centered at the position (2, 0, 0). This is a familiar property of the magnetic field: the field does not perform work on the particle, it only changes the direction of its motion. Since the magnetic force is always orthogonal to the velocity, the resulting trajectory is a circle.
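The radius can also be predicted analytically from the constants and initial conditions above: by (6.12) the initial velocity is v_y = (p_{2,0} − e B x_0)/m = 2 − 1 = 1, and a charge moving with speed v⊥ perpendicular to a homogeneous field B circles with the Larmor radius

$$r = \frac{m\,v_\perp}{e\,B} = 1,$$

in agreement with the plot.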

Now suppose that we add an initial velocity in the z-direction, e.g. we set p_{3,0} = 0.1. That means that the initial velocity is no longer orthogonal to the magnetic field B, but the v_z-component of the velocity does not affect the magnetic force. Hence, in addition to the circular motion, the charge will move uniformly in the z-direction. The resulting trajectory of the particle is called a helix (in order to obtain this figure in Mathematica, do not forget to adjust the range on the z-axis).

(figure: helical trajectory of the particle)

Let us consider another example. Suppose that in addition to the magnetic field there is a homogeneous electric field

$$\boldsymbol{E} = (0, 0, E)$$

in the direction of the z-axis. This field is again time-independent, and thus the equation for the scalar potential reads

$$\boldsymbol{E} = -\nabla\phi$$

or, in components,

$$\frac{\partial\phi}{\partial x} = 0, \qquad \frac{\partial\phi}{\partial y} = 0, \qquad \frac{\partial\phi}{\partial z} = -E,$$

from which we find

$$\phi = -E\,z.$$

The corresponding code (the amplitude E is denoted E0, since E is a reserved symbol in Mathematica):

In[248]:=
vals = {m -> 1, B -> 1, e -> 1, E0 -> 0.01};
initConds = {x[0] == 1, y[0] == 0, z[0] == 0,
   p1[0] == 0, p2[0] == 2, p3[0] == 0};
tmax = 100;
(* the equations are regenerated with the new scalar potential;
   the concrete call shown here is one possible choice *)
eqs = HamiltonEM[-E0 z, {0, B x, 0}];
sol = NDSolve[Join[eqs, initConds] /. vals,
   {x[t], y[t], z[t], p1[t], p2[t], p3[t]}, {t, 0, tmax}];

In this case the motion of the particle consists of a uniform circular motion in the plane z = constant and a uniformly accelerated motion in the direction of the z-axis.

(figure: trajectory of the particle, a helix with increasing pitch along the z-axis)

6.5 Electromagnetic wave

In this section we consider a harmonic electromagnetic plane wave propagating in the direction of the x-axis. The electric field is assumed to have the form

$$\boldsymbol{E}(t, x) = \left(0, 0, E_0 \cos(t - x)\right),$$


i.e. it has only a z-component; E0 is the amplitude of the electric field. The electric field is related to the potentials via

$$\boldsymbol{E} = -\nabla\phi - \frac{\partial\boldsymbol{A}}{\partial t}.$$

Let us set φ = 0, so that

$$\boldsymbol{E} = -\frac{\partial\boldsymbol{A}}{\partial t}.$$

This equation can be integrated to find the vector potential in the form

$$\boldsymbol{A} = -\int \boldsymbol{E}\, dt = \left(0, 0, -E_0 \sin(t - x)\right).$$

The magnetic field is then

$$\boldsymbol{B} = \nabla\times\boldsymbol{A} = \left(0, -E_0\cos(t - x), 0\right).$$

We can see that the magnetic field points along the y-axis and hence is orthogonal to the electric field, which is a general property of electromagnetic waves. The derivation performed above can be done in Mathematica using the following commands:

In[11]:=
Needs["VectorAnalysis`"]
El[t_, x_] = {0, 0, E0 Cos[t - x]};
A = -Integrate[El[t, x], t];
B = Curl[A /. x -> Xx] /. Xx -> x

Out[14]=
{0, -E0 Cos[t - x], 0}

The Hamilton equations are then generated and solved numerically:

eqs = HamiltonEM[0, A];

In[211]:=
vals = {m -> 1, E0 -> 1, e -> 1};
initConds = {x[0] == 1, y[0] == 0, z[0] == 0,
   p1[0] == 0, p2[0] == 0, p3[0] == 0};
tmax = 100;
sol = NDSolve[Join[eqs, initConds] /. vals,
   {x[t], y[t], z[t], p1[t], p2[t], p3[t]}, {t, 0, tmax}];


and plotted by

In[216]:=
g1 = ParametricPlot3D[{x[t], y[t], z[t]} /. First[sol], {t, 0, tmax},
   BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 15},
   ViewPoint -> {1, 1, 1}];
g2 = Graphics3D[{Text[Style["x", 15], {-5, 0.5, 0}],
    Text[Style["y", 15], {0, 1.5, 0}],
    Text[Style["z", 15], {0, 0, -2.2}]}];
g = Show[g1, g2]

(figure: trajectory of the charged particle in the electromagnetic wave)

6.6 Electrostatic wave

Relations (6.8) hold in general: it can be shown directly from Maxwell's equations that the electric and magnetic fields can always be written in the form (6.8). However, there are situations too complicated to be analyzed in this way. For example, the electromagnetic field in a plasma is a complicated result of the interaction of external electromagnetic fields with the fields produced by the particles comprising the plasma. In such situations we usually cannot find the electromagnetic fields as exact solutions of Maxwell's equations and some simplifications are necessary. One can imagine an external homogeneous magnetic field penetrating the plasma and, in addition, an electrostatic wave propagating in the plasma. We have seen that an electric wave described by a time-dependent vector potential is always accompanied by a magnetic field given by the curl of this potential. Thus, any electric wave must be accompanied by a magnetic wave, as we saw in the previous section.

On the other hand, in a plasma it is possible for an electric wave to propagate through the medium without generating an accompanying magnetic wave, which is a consequence of the complicated interactions mentioned above. In this case we can proceed in the following way: we assume the presence of a homogeneous magnetic field B and, in addition, of an electrostatic wave E. For example, we can take B along the x-axis together with an electrostatic wave of the same form as in the previous section.

These fields cannot be described by the same vector potential, and hence the equations of motion cannot be derived from any potential. Nevertheless, with this prescription we can write down the usual Newtonian equation of motion

$$m\,\frac{d\boldsymbol{v}}{dt} = e\,(\boldsymbol{E} + \boldsymbol{v}\times\boldsymbol{B})$$

and solve it numerically. Appropriate Mathematica code reads (with m = e = 1; the concrete form of the wave El below is one possible choice, consistent with the previous section):

In[23]:=
El = {0, 0, Cos[t - x[t]]};   (* electrostatic wave evaluated at the particle *)
B = {1, 0, 0};
r[t_] = {x[t], y[t], z[t]};
eqs = Flatten[{Equal @@@ Transpose[{r''[t], El + Cross[r'[t], B]}],
    x[0] == 0, y[0] == 0, z[0] == 0,
    x'[0] == 0, y'[0] == 0, z'[0] == 0}];
sol = NDSolve[eqs, r[t], {t, 0, 100}]
ParametricPlot[{y[t], z[t]} /. sol, {t, 0, 100}]


Here we have set the initial velocity to zero. The trajectory is found to be a spiral.

(figure: spiral trajectory of the particle in the y–z plane)

7

Discrete dynamical systems and fractals

This chapter is a digression from the main line of exposition but, first, discrete dynamical systems provide a simple model of the more complicated continuous dynamical systems which we will study later and, second, we will plot nice pictures called fractals and get some insight into the complicated nature of chaotic systems.

7.1 Complex sequences

We start our discussion with one of the most famous examples of fractals, the Mandelbrot set, which is very easy to plot using Mathematica. Let us choose an arbitrary point z0 ∈ C in the complex plane and define a sequence of complex numbers by the recurrence relation

$$z_{n+1} = f(z_n) + z_0,$$

where f(z) = z². Thus, starting from a given z0, the members of this sequence read

$$z_1 = f(z_0) + z_0 = z_0^2 + z_0,$$
$$z_2 = f(z_1) + z_0 = z_0^4 + 2z_0^3 + z_0^2 + z_0,$$
$$\cdots$$

We can use Mathematica to generate the members of this sequence, for instance with the command NestList[#^2 + z0 &, z0, 5] // Expand (the same construction is used in the function seq defined below), which produces


$$z_0 = z_0,$$
$$z_1 = z_0^2 + z_0,$$
$$z_2 = z_0^4 + 2z_0^3 + z_0^2 + z_0,$$
$$z_3 = z_0^8 + 4z_0^7 + 6z_0^6 + 6z_0^5 + 5z_0^4 + 2z_0^3 + z_0^2 + z_0,$$
$$z_4 = z_0^{16} + 8z_0^{15} + 28z_0^{14} + 60z_0^{13} + 94z_0^{12} + 116z_0^{11} + 114z_0^{10} + 94z_0^{9} + 69z_0^{8} + 44z_0^{7} + 26z_0^{6} + 14z_0^{5} + 5z_0^{4} + 2z_0^{3} + z_0^{2} + z_0,$$
$$z_5 = z_0^{32} + 16z_0^{31} + 120z_0^{30} + 568z_0^{29} + 1932z_0^{28} + 5096z_0^{27} + 10948z_0^{26} + 19788z_0^{25} + 30782z_0^{24} + 41944z_0^{23} + 50788z_0^{22} + 55308z_0^{21} + 54746z_0^{20} + 49700z_0^{19} + 41658z_0^{18} + 32398z_0^{17} + 23461z_0^{16} + 15864z_0^{15} + 10068z_0^{14} + 6036z_0^{13} + 3434z_0^{12} + 1860z_0^{11} + 958z_0^{10} + 470z_0^{9} + 221z_0^{8} + 100z_0^{7} + 42z_0^{6} + 14z_0^{5} + 5z_0^{4} + 2z_0^{3} + z_0^{2} + z_0.$$

Obviously, the complexity of each term z_n grows very quickly with increasing n. It is instructive to see the behaviour of the sequence graphically. Hence, we choose some particular z0 and plot a few terms z_n of the sequence starting from z0. Let us define the following functions:

seq[z0_, n_] := NestList[ #^2 + z0 &, z0, n] // Expand

list[z0_, n_] := {Re[#], Im[#]} & /@ seq[z0, n]

The first definition introduces a function which generates the list of the first n members of the sequence z_n. For example, the command seq[I, 10] generates the list of ten members of the sequence starting at the point z0 = i:

{i, −1 + i, −i, −1 + i, −i, −1 + i, −i, −1 + i, −i, −1 + i, −i}.

However, we cannot plot complex numbers directly and so we must convert each complex number z = x + iy into the pair of coordinates (x, y). This is accomplished by the function list. We define a pure function

{Re[#], Im[#]}&

which splits its argument into the real and imaginary parts, and then we apply this pure function to all elements of the list seq[z0,n]. Using the previous example, the command list[I, 10] produces

{{0, 1}, {−1, 1}, {0, −1}, {−1, 1}, {0, −1}, {−1, 1}, {0, −1}, {−1, 1}, {0, −1}, {−1, 1}, {0, −1}}.


Notice that this sequence is periodic: apart from the starting point i, the sequence jumps from −1 + i to −i and back, infinitely.

The list produced by list can now be plotted by ListLinePlot. Let us plot the list list[I,10] by

ListLinePlot[ {list[I, 10]},

PlotRange -> Full, AxesOrigin -> {0, 0}, AspectRatio -> 1,

PlotMarkers -> Automatic,

PlotStyle -> { {Blue} },

BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 13}

]

The expected result is plotted in figure 7.1.

Now let us choose a different starting point close to the original point i, e.g. z0 = 0.8i, and construct the first ten members of the sequence again. We can compare both trajectories using the following code:

ListLinePlot[ {list[I, 10], list[0.8 I, 10]},
  PlotRange -> Full, AxesOrigin -> {0, 0}, AspectRatio -> 1,
  PlotMarkers -> Automatic,
  BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 13}
]

We can see in figure 7.2 that the behaviour of the sequence has changed significantly: it is no longer periodic and, moreover, it exhibits unpredictable behaviour. We could guess that if we choose the starting point z0 = 0.9i we obtain a sequence "somewhere between" the sequences starting from i and 0.8i. The reader is invited to plot the result for z0 = 0.9i; here we just present the list of points produced by list[0.9I,10]:

{{0., 0.9}, {−0.81, 0.9}, {−0.1539, −0.558}, {−0.287679, 1.07175}, {−1.06589, 0.283359}, {1.05584, 0.295938},

{1.02721, 1.52493}, {−1.27023, 4.03285}, {−14.6504, −9.34529}, {127.3, 274.725}, {−59268.4, 69945.6}}

Obviously, this sequence is not bounded and it escapes to infinity very quickly.

What conclusion can be drawn from the examples above? What we have actually seen is the most characteristic property of chaotic systems: sensitivity to initial conditions. A particular choice of the starting point z0 corresponds to imposing the initial condition. We have seen three sequences starting from points close to each other: i, 0.9i and 0.8i. In non-chaotic systems, if we change the initial positions slightly, the solution also changes only slightly. In chaotic systems, the behaviour can differ drastically even for very similar initial conditions. In our examples, the first sequence was periodic, the second was unpredictable and the third one was diverging to infinity.

In the case of Hamiltonian systems we were able to visualize the possible behaviour of the system by the method of phase portraits. Phase trajectories of the harmonic oscillator were circles; phase trajectories of the pendulum were more complicated and we revealed the existence of two types of periodic motion (open and closed curves) separated by the separatrix. For chaotic systems it is usually impossible to plot a phase portrait because the trajectories are very complicated and irregular; figure 7.3, for illustration, is certainly not very useful.

However, in order to visualize the extreme sensitivity to initial conditions, it is not important to see all kinds of trajectories; the qualitative behaviour of the trajectories is more interesting. Each sequence can either stay in a bounded region or escape to infinity. We cannot inspect the asymptotic behaviour of a particular sequence in finitely many steps, but we can choose a fixed radius R and investigate whether the sequence stays inside the region bounded by the circle of radius R or whether it escapes the circle after some number of steps. In such a way we can assign a number to each point of the plane. Let us describe the algorithm more precisely.


Fig. 7.2. Comparison of two sequences with close starting points i and 0.8i.

Fig. 7.3. Ten sequences starting from initial points of the form z0 = x + 0.8I, x ∈ (−0.5, 0.5).


Parameters of the algorithm are the radius R > 0 and the maximum number of steps nmax. We choose a point z0 = x0 + iy0 ∈ C and construct the sequence zn starting from this point. If |zn| > R, the algorithm stops and returns the value n. If |zn| < R, we compute zn+1 and repeat the procedure. If n > nmax, the algorithm stops and returns the value nmax. In this way we assign an integer to each point z0 of the complex plane or, equivalently, to each point (x0, y0) of the usual Euclidean plane.

Let us see how this algorithm can be implemented in Mathematica. In usual procedural languages we would use some kind of cycle like for or while. In Mathematica, these cycles can still be used, but functional methods are more satisfactory; in this case we use function NestWhileList. Function Mandelbrot implementing the algorithm described above follows:

Mandelbrot[x_, y_,

OptionsPattern[{MaxRadius -> 100, MaxSteps -> 50}]] :=

Module[ {c, R, n},

c = x + I y;

R = OptionValue[MaxRadius];

n = OptionValue[MaxSteps];

Length[NestWhileList[ N[#^2 + c] &, c, (Abs[#] < R) &, 1, n]]

]

The head of the function tells Mathematica that the function has two obligatory parameters x and y – these are the coordinates of the initial point (x0, y0) in the plane. Moreover, the function accepts optional arguments specifying its behaviour. In our case, the optional parameters are the maximum radius R with default value 100 and the maximum number of steps with default value 50. If we call the function without specifying optional parameters, e.g.

Mandelbrot[ 1, 3 ],

default values are used. If we want to change these values, we call the function in

the form, e.g.

Mandelbrot[1, 3, MaxRadius -> 20, MaxSteps -> 1000]

Reader should be familiar with this notation as it is used in many predefined functions

in Mathematica.
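The option mechanism itself can be illustrated on a tiny standalone function; the function greet and its option Greeting below are our own illustrative names, not part of the Mandelbrot code:

(* a minimal sketch of OptionsPattern/OptionValue with a default value *)
greet[name_, OptionsPattern[{Greeting -> "Hello"}]] :=
  StringJoin[OptionValue[Greeting], ", ", name]

greet["world"]                     (* -> "Hello, world" *)
greet["world", Greeting -> "Hi"]   (* -> "Hi, world" *)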

Then we define three local variables c, R and n. Variable c represents the initial

point because we set

c = x + i y.


Variables R and n are set to the values of parameters MaxRadius and MaxSteps and

we introduce them only to increase the readability of the code. The core of function

Mandelbrot is in the last command. Function

NestWhileList[ N[#^2 + c] &, c, (Abs[#] < R) &, 1, n]

applies the pure function #^2 + c &, which is our iteration z ↦ z² + z0, to the initial value c repeatedly. The call of function N is included in order to obtain just a numerical value of the result instead of the exact value, which would take a long time to compute and occupy a lot of memory (the reader is invited to remove this call to see the difference). Function NestWhileList stops when the condition, specified again as a pure function, is violated. In our case, the condition |zn| < R is typed as the pure function (Abs[#] < R)&. The next parameter of NestWhileList specifies how many recent results of the nested call should be inserted into the test. Here we want to test only the last result and hence set this parameter to 1. The last parameter n specifies the maximum number of calls.

The result of NestWhileList is the sequence of numbers zn which stops if |zn| > R or if n > nmax. The point is that this command returns the list of all members of the generated sequence, so by taking its length we find how long the sequence is. This number is then the result of function Mandelbrot.

Finally we can visualize function Mandelbrot using

DensityPlot[

Mandelbrot[x, y], {x, -1.5, 0.5}, {y, -1.3, 1.3},

PlotPoints -> 100]

Function DensityPlot serves to visualize functions of two variables not by plotting a

three-dimensional graph but by assigning a color to each point (x, y) depending on

the value of the function to be plotted. The result is shown in figure 7.4 and is known

as the Mandelbrot set.

The meaning of regions with different colors can be understood easily. For example, if we choose zero to be the initial point, z0 = 0, then all members of the sequence must be zero, for we have z_n = z_{n−1}^2 + 0 = 0. In other words, the sequence stays at

point zero for all n and therefore function Mandelbrot will stop only after maximum

number of steps have been reached. Indeed, typing

Mandelbrot[0, 0]

yields the result 51. That means that after 50 steps the sequence was still in the

circle of radius R. We can see that the neighbourhood of zero is plotted in white

color in figure 7.4. Hence, white regions correspond to high values of the function

Mandelbrot. Blue color, on the other hand, represents regions where the values of the


function are small and so the sequence escapes the circle of radius R very soon. For

example, at point (−1.5, 1) the value of

Mandelbrot[-1.5, 1]

is equal to 5 which means that the sequence escapes the circle after 5 steps.

It is natural to expect that small numbers close to zero yield bounded sequences while numbers distant from zero yield rapidly diverging sequences. An unexpected feature of this construction is the existence of a boundary between the blue and white regions which exhibits a highly non-trivial structure. This boundary is obviously irregular, but when we zoom into it, we find a kind of self-similarity: at each scale we observe a similar shape of the boundary. In figure 7.5 we plot the boundary of the Mandelbrot set at different zooms.
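Since the whole construction is encoded in the function Mandelbrot, one can zoom into the boundary simply by shrinking the plot range. The window below is an arbitrary choice of ours; more iteration steps are needed to resolve the fine structure:

DensityPlot[ Mandelbrot[x, y, MaxSteps -> 200],
  {x, -0.2, 0.1}, {y, 0.7, 1.0}, PlotPoints -> 200]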

This complicated structure of the Mandelbrot set corresponds to the behaviour observed in the previous section. Two different but close points give rise to sequences with very different behaviour: one sequence remains bounded while the other one escapes to infinity. Thus, in this sense, the Mandelbrot set visualizes the extreme sensitivity of the sequence zn to the choice of the initial point.

Fig. 7.5. Mandelbrot set on different scales.

8

Dynamical systems

In the previous chapters we derived the Lagrange equations and have seen that these equations are equivalent to the original Newton's law of force F = ma if the force F can be written as a gradient of the potential, i.e. if the force F is conservative. On the other hand, we have seen that the electromagnetic field is not conservative but the motion of a charged particle can still be described by the Lagrangian and consequently by the Hamiltonian.

Hamilton’s equations

q̇a = ∂H/∂pa,  ṗa = −∂H/∂qa

are first-order ordinary differential equations and we have seen that such a system of equations can be given a geometrical interpretation in the phase space. In fact, using the conservation of energy we were able to plot the phase trajectories even without actually solving the equations of motion. In this chapter we study a more general system of equations in which the right hand side is not derived from a Hamiltonian but is a general function. We will see that any second-order equation of motion can be written as a system of first-order equations; we will no longer be restricted to conservative systems, while the geometrical interpretation of the phase trajectories will be preserved. Dynamical systems provide an appropriate framework for studying all kinds of physical systems including those with friction or time-dependent external forces.

8.1 Definition

Dynamical system is a set of n first-order ordinary differential equations of the form

ẋ1(t) = f1(x1(t), x2(t), . . . , xn(t), t),
ẋ2(t) = f2(x1(t), x2(t), . . . , xn(t), t),
⋮
ẋn(t) = fn(x1(t), x2(t), . . . , xn(t), t),   (8.1)

where xa = xa(t) are unknown functions of time, a = 1, 2, . . . , n, and fa are arbitrary differentiable functions of the variables xa and possibly of time t. If the functions fa do not depend on time explicitly, the dynamical system is called autonomous, otherwise it is called non-autonomous. Using the index notation, dynamical system (8.1) can be written briefly in the form

ẋa = fa(x, t)   (8.2)

where x stands for the n-tuple of variables xa. An autonomous system is then

ẋa = fa(x).

In this notation we suppress the dependence of xa on time because this dependence is assumed implicitly.

Motivated by Hamilton’s formalism, we intend to interpret the solution xa = xa (t)

as the motion in the phase space. Phase space is an abstract space1

M = Rn [x1 , x2 , . . . xn ]

with coordinates xa . Arbitrary point x ∈ M represents the state of physical system

described by equations (8.1). Solution of dynamical system is not unique unless we

specify the initial conditions, i.e. values of coordinates xa at some given initial time

t0 ,

x10 = x1 (t0 ), ... xn0 = xn (t0 ).

Usually we set t0 = 0. The n−tuple of initial coordinates xa0 will be denoted simply

by x0 ∈ M .

Suppose we choose a point x0 ∈ M at time t0 = 0 as in figure 8.1. A mathematical

theorem guarantees that there exists unique solution x = x(t) satisfying (8.1) such

that x(0) = x0 . The solution x = x(t) is also called the phase trajectory. Equations

(8.1) essentially state that vector f (x(t)) evaluated at arbitrary point of the trajec-

tory is in fact tangent to the trajectory, see figure 8.1. Hence, we can interpret vector

field f (x) as a velocity. Although it can be very difficult or even impossible to solve

the equations of motion (8.1), the velocity gives us a good idea about the behaviour

of the system.

1
Our definition is a simplification. In differential geometry, the phase space is defined as the cotangent bundle of the configuration manifold endowed with the canonical symplectic form ω = dqa ∧ dpa.



Fig. 8.1. Two-dimensional dynamical system of the form ẋa = fa . Initial position is at x0 = x(0).

The “velocity” vector at x0 is f (x0 ) and determines the trajectory of the system in the infinitesimal

neighbourhood of the initial point.

8.2 Example

Let us see an illustrative example. We are already familiar with the equation of the harmonic oscillator

θ̈ + θ = 0.

This is a second order equation but we can bring it into the first-order form by setting

x1 = θ, x2 = θ̇.

Then we have

ẋ1 = θ̇ = x2

and

ẋ2 = θ̈ = −θ = −x1 .


Hence, instead of a single second-order equation θ̈ + θ = 0, we now have two equations of first order

ẋ1 = x2 ,

(8.3)

ẋ2 = −x1 .

Clearly, this is a dynamical system (8.1) if we set f1 = x2 and f2 = −x1 . Thus, the

velocity field can be plotted by

f[x_, y_] = {y, -x};
VectorPlot[f[x, y], {x, -2, 2}, {y, -2, 2}]

The resulting figure shows the velocity field of the harmonic oscillator circulating around the origin.

This picture agrees with our previous analysis when we used the conservation of energy to show that the phase trajectories of the harmonic oscillator are circles (or ellipses when using SI units). Another possibility is to use function StreamPlot with the same arguments, which yields the stream lines of the velocity field, i.e. the phase trajectories themselves.

8.3 Implementation in Mathematica

In this section we show how to implement a dynamical system in Mathematica in a convenient way. We define function DynSys as follows:

DynSys[f_, IC_, tmax_] := Module[{vars, lhs, rhs, eqs, inConds},
  vars = Table[x[a][t], {a, 1, Length[IC]}];
  lhs = D[vars, t];
  rhs = f[Sequence @@ vars];
  eqs = Equal @@@ Transpose[{lhs, rhs}];
  inConds = Equal @@@ Transpose[{vars /. t -> 0, IC}];
  NDSolve[Join[eqs, inConds], vars, {t, 0, tmax}]
]

This code deserves a brief explanation. Arguments of the function DynSys are

• pure function f – this is a vector function representing the right hand side of dynamical system (8.1);
• IC – the list of initial values xa(0);
• tmax – upper bound of the interval t ∈ (0, tmax).

Hence, function DynSys can be called, e.g., with the arguments

DynSys[{#2, -#1} &, {1, 1}, 10]

where the pure function

{#2, -#1} &

is equivalent to f1 = x2, f2 = −x1. Clearly, this corresponds to harmonic oscillator (8.4). Initial conditions IC are set to

x1(0) = 1, x2(0) = 1

Now suppose that we called function DynSys with the arguments above and let

us explain how this function works. Thus, we assume that the arguments are

f = {#2, -#1}&

IC = {1, 1}

tmax = 10.

The first command

vars = Table[ x[a][t], {a, 1, Length[IC]}];

creates a list of variables xa (t) in the form

vars = { x[1][t], x[2][t] }.

The left hand side of equations ẋa (t) = fa (t) is generated simply by calling

lhs = D[vars, t]

which yields

lhs = { x[1]’[t], x[2]’[t] }.


Now we form the right hand side of the equations. Recall that vars is the list of variables. We want to evaluate the functions fa at point xa, i.e. we need the expression fa(x1, . . . , xn). However, if we applied f directly to the list vars, we would obtain

f[{x[1][t], x[2][t]}]

while what we need is

f[ x[1][t], x[2][t] ].

Hence, we must turn the list vars into the sequence of arguments by replacing its

head. Command

rhs = f[Sequence @@ vars].

leads to correct application of function f to arguments xa :

f[ Sequence @@ vars ]= f[ x[1][t], x[2][t] ]

= { x[2][t], -x[1][t]}.

Having defined the left hand side and the right hand side of dynamical system

separately, we join them in a usual way,

eqs = Equal @@@ Transpose[ {lhs, rhs} ]
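To see what this construction does, it may help to run it on the explicit lists of our example (a toy check, not part of DynSys itself):

Equal @@@ Transpose[{{x'[t], y'[t]}, {y[t], -x[t]}}]
(* -> {x'[t] == y[t], y'[t] == -x[t]} *)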

Next we define the initial conditions with values specified in argument IC={1,1}. We

need to produce the list

{ x[1][0] == 1, x[2][0] == 1 }

The left hand side of initial conditions consists of the elements of vars evaluated at

time 0,

vars /. t->0,

the right hand side consists of the elements of IC. We can join them together by

inConds = Equal @@@ Transpose[{vars /. t -> 0, IC}];

Finally we solve the list of equations of motion and initial conditions by NDSolve:

NDSolve[Join[eqs, inConds], vars, {t, 0, tmax}].

Now the reader should be familiar with the functionality of function DynSys; we use this function in the following examples.

To finalize this section we show how to use function DynSys to solve the motion of the harmonic oscillator.


With the arguments discussed above, a minimal session reads

sol = DynSys[{#2, -#1} &, {1, 1}, 10];
Plot[Evaluate[{x[1][t], x[2][t]} /. sol], {t, 0, 10}]

and produces the expected harmonic oscillations of x1(t) and x2(t).

8.4 Chaotic pendulum

In the previous chapters we introduced the mathematical pendulum as a simple example of a physical system which has only one degree of freedom (the angle of deflection θ) and is described by the Lagrange equation

θ̈ + sin θ = 0.

This equation is non-linear because of the presence of the sine. We have seen that

this equation cannot be solved in terms of elementary functions but we were able

to find the numerical solution. Moreover, using the Hamiltonian formalism we were

able to plot the phase trajectories without actually solving the equation of motion.

We can generalize the model of the mathematical pendulum in several ways. First, any realistic system is dissipative, i.e. there is a resisting force which acts against the motion and has the opposite direction. In the case of the pendulum, the resisting force is proportional to the angular velocity θ̇ and hence the equation of the pendulum with a resisting force has the form

θ̈ + b θ̇ + sin θ = 0,

where b is the constant characterizing the strength of resisting force. For example,

it can be related to the viscosity of the medium in which the pendulum moves.

Pendulum with resisting force is called damped pendulum.

Next we can assume that in addition to restoring gravitational force there is an

external force acting on the pendulum. Such force is called driving force. In the

presence of driving force, even if the initial velocity of the pendulum is zero (and the

pendulum is at equilibrium position), driving force will make the pendulum to move.

Resulting motion of the pendulum will be a ”mixture” of two motions: periodic

motion due to self-oscillations of the pendulum, and motion due to driving force.

The pendulum with both the driving force and the friction, the driven pendulum, is described by the equation

θ̈ + b θ̇ + sin θ = F0 sin Ωt,   (8.5)

where we assume that the driving force is harmonic with angular frequency Ω and amplitude F0. In the subsequent analysis we will show that this kind of pendulum exhibits chaotic behaviour and hence we also call it the chaotic pendulum.

We start with rewriting equation (8.5) in the form of dynamical system. This is

straightforward since we can define

x1 = θ, x2 ≡ p = θ̇, x3 = φ = Ωt.

We will freely pass from notation (θ, p, φ) to equivalent notation (x1 , x2 , x3 ) according

to the context. By definition, variable φ satisfies equation

φ̇ = Ω,

while variable p (which is clearly related to the momentum of the pendulum) was

defined by

θ̇ = p

which can be consequently regarded as an equation for θ. The only true dynamical

equation is an equation for p which follows from (8.5):


ṗ = F0 sin φ − b p − sin θ.

Altogether, equation (8.5) is equivalent to the dynamical system

θ̇ = p,
ṗ = F0 sin φ − b p − sin θ,   (8.6)
φ̇ = Ω.

Dynamical system (8.6) can be solved in Mathematica using function DynSys defined in the previous section. We choose initial conditions

θ(0) = π/4,  p(0) = 0,  φ(0) = 0

and investigate how the values of b, Ω and F0 affect the behaviour of the pendulum. First we set

b = 0,  F0 = 0,  Ω = 0,

i.e. we consider the pendulum without friction and driving force (the damping coefficient is b = 0 and the amplitude of the force is F0 = 0).

vals = {b -> 0, f0 -> 0, W -> 0};
tmax = 10;
sol = DynSys[{#2, f0 Sin[#3] - b #2 - Sin[#1], W} & /. vals, {Pi/4, 0, 0}, tmax]

The output is a list of replacement rules whose last entry reads

x[3][t] -> InterpolatingFunction[{{0., 10.}}, <>][t]

Here we have chosen tmax = 10 but the reader should adjust this parameter in order to reproduce all figures below. Now we can plot the phase trajectory in the usual way,

ParametricPlot[{x[1][t], x[2][t]} /. sol, {t, 0, tmax}]

obtaining a closed curve: without friction and driving force the oscillations are periodic.

Next, let us switch on the friction by setting

b = 0.1,

re-running the solution above; see figure 8.2 for the result. We can see that the phase trajectory is a spiral which, in the limit tmax → ∞, ends at the origin of the phase plane. This means that the oscillations are damped until the pendulum stops. A slightly more "fancy" picture can be obtained by

g1 = ParametricPlot[{x[1][t], x[2][t]} /. sol, {t, 0, tmax},
   AxesLabel -> {"θ", "p"}, Ticks -> None,
   BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 15}];
g2 = Plot[x[1][t] /. sol, {t, 0, tmax},
   AxesLabel -> {"t", "θ"}, Ticks -> None,
   BaseStyle -> {FontFamily -> "Times New Roman", FontSize -> 15}];
GraphicsRow[{g1, g2}]


Fig. 8.2. Parameters b = 0.1, F0 = 0. Non-zero friction leads to damped oscillations of the pendulum.


Fig. 8.3. Phase trajectory of damped pendulum together with the time dependence of deflection

θ = θ(t).


Let us see how the driving force affects the motion. For this purpose we set

b = 0,  F0 = 1,  Ω = 2,

with initial conditions θ(0) = p(0) = φ(0) = 0. In other words, the pendulum is initially at its equilibrium position and, hence, without the driving force it would stay at rest. However, the presence of the driving force leads to the solution plotted in figure 8.4.


Fig. 8.4. Motion of the pendulum without friction under the external driving force with amplitude

F0 = 1 and angular frequency Ω = 2. Initial position of the pendulum is θ(0) = p(0) = 0.

Finally, we switch on both the friction and the driving force by setting

b = 1,  F0 = 1,  Ω = 1,

see figure 8.5. An interesting feature of this solution is the presence of a short transient stage during which the phase trajectory follows an outgoing spiral but then settles on a circular periodic orbit. Such behaviour is called a limit cycle.
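A run reproducing the qualitative behaviour of figure 8.5 might look as follows; the value tmax = 100 is an arbitrary choice of ours, long enough for the transient to settle on the limit cycle:

sol = DynSys[{#2, 1.0 Sin[#3] - 1.0 #2 - Sin[#1], 1.0} &, {0, 0, 0}, 100];
ParametricPlot[{x[1][t], x[2][t]} /. sol, {t, 0, 100}, AxesLabel -> {"θ", "p"}]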

The reader is invited to experiment with the values of parameters b, F0 and Ω and with the initial values θ(0), p(0) and φ(0). We can see that the resulting motion of the pendulum is a consequence of a complicated and delicate interplay between three effects:



Fig. 8.5. Motion of the pendulum with friction (b = 1) under the external driving force with amplitude

F0 = 1 and angular frequency Ω = 1. Initial position of the pendulum is θ(0) = p(0) = 0.

• free oscillations of the pendulum;
• resisting force (friction);

• external driving force.

While for some combinations of parameters the motion is perfectly understandable

(like in the absence of the friction and the driving force or in the absence of driving

force but in the presence of friction), for general values the motion is unpredictable,

chaotic.

8.5 Critical points of the pendulum

Let us return to the equation of the pure mathematical pendulum

θ̈ + sin θ = 0,

or, in the form of a dynamical system,

θ̇ = p,  ṗ = − sin θ.   (8.7)

What are the possible equilibrium positions of the pendulum? Clearly, if we set

θ(0) = 0,  p(0) = 0,

the pendulum will not move. These conditions correspond to the situation when the pendulum is hanging freely at the equilibrium position with zero initial velocity. In this case the derivatives of θ and p take values

θ̇ = p = 0,  ṗ = − sin θ = − sin 0 = 0.

In other words, the derivatives of all variables xa, where x = (θ, p), vanish and therefore the pendulum does not move.

However, there is another possibility. If we set

θ(0) = π,  p(0) = 0,

the derivatives vanish again, since sin π = 0. This corresponds to the pendulum pointing upwards. Such a configuration can hardly be realized in practice, but if we were able to arrange the initial conditions in such a way that the angle of deflection θ is exactly equal to π and the velocity is zero, we would obtain an equilibrium configuration in which the pendulum does not move.

Points with these properties are called critical points or fixed points. In general, a critical point xC of dynamical system

ẋa = fa(x)

is a point at which the right hand side vanishes,

fa(xC) = 0.

If we choose the initial point as x(0) = xC, the system will remain at this initial position forever; it will not move. For the mathematical pendulum we have two critical points,

xC1 = (0, 0),  xC2 = (π, 0).
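For the pendulum, the critical points can also be found directly in Mathematica; restricting θ to one period is our own choice, made to obtain a finite list:

Reduce[{p == 0, Sin[θ] == 0, 0 <= θ < 2 Pi}, {θ, p}]
(* -> θ == 0 or θ == π, with p == 0, i.e. the two critical points xC1 and xC2 *)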

On the other hand, we feel that the two critical points of the mathematical pendulum have a different character. The first critical point xC1 is stable in the sense that a small perturbation results in periodic oscillations near this critical point. Critical point xC2 is unstable in the sense that an arbitrarily small perturbation will cause the pendulum to fall and results in oscillations around critical point xC1!

This observation is based on our physical intuition but can we predict stability

or instability of critical points directly from equations? By definition, critical points

represent equilibrium configurations of the system. Can we predict the behaviour of

the system near the critical point?


Fig. 8.6. Two critical points of mathematical pendulum.

The idea is that small perturbations of stable critical points will produce small

deviations from equilibrium position, but small perturbations of unstable critical

points will result in a motion far from the critical point. We will use a basic fact

from mathematical analysis that, under certain assumptions, function f (x) can be

expanded into the Taylor series around arbitrary point xC in the following way

f(xC + δ) = f(xC) + δ f′(xC) + (1/2) δ² f″(xC) + O(δ³),   (8.8)

where f′(x) denotes the value of the derivative of f at point x, f″(x) is the second derivative at x, etc.

First we analyse critical point xC1 = (0, 0). Let us denote the critical values of θ = x1 and p = x2 by

θC = 0,  pC = 0,

and assume that the deviation of θ from the critical value is small,

θ = θC + δ, where |δ| ≪ 1.

Since θC is a constant, we have

θ̇ = δ̇.

Next we simplify the equations of motion (8.7) under this assumption. Let us expand sin θ around the critical point θC = 0:

sin θ = sin(θC + δ) ≈ sin θC + δ cos θC = δ,

where we have neglected higher powers of δ, which is assumed to be small. With this assumption, the equations of the pendulum (8.7) simplify to

δ̇ = p,  ṗ = −δ.

These are, in fact, well-known equations for harmonic oscillator and we can easily

plot the solution which we already know is a circle. We can plot it by

sol = DynSys[ {#2, - #1} &, {0.1, 0}, 10];

ParametricPlot[ {x[1][t], x[2][t]} /. sol, {t, 0, 10}]

where we have chosen small initial deflection θ(0) = 0.1, in accordance with the

assumption. Since the solution is a circle, we observe that phase trajectories near

the first critical point remain in the vicinity of this critical point; an indicator of

stability.

Let us now investigate the second critical point located at

θC = π,  pC = 0.

As in the previous case, we assume that deviations from the critical value θC are small and write

θ = θC + δ = π + δ,  |δ| ≪ 1.

Expanding the sine around θC = π now gives

sin θ = sin(θC + δ) ≈ sin π + δ cos π = −δ,

and the linearized equations of motion read

δ̇ = p,  ṗ = δ.

In order to compare the solutions near both critical points we use the following code:

tmax = 2 Pi;
Needs["PlotLegends`"]
sol1 = DynSys[{#2, -#1} &, {0.1, 0}, tmax];
sol2 = DynSys[{#2, #1} &, {0.1, 0}, tmax];
ParametricPlot[{{x[1][t], x[2][t]} /. sol1, {x[1][t], x[2][t]} /. sol2},
  {t, 0, tmax}, PlotRange -> {{-0.5, 2}, {-0.5, 2}},
  AxesLabel -> {δ, p}, PlotStyle -> {Red, Blue}, BaseStyle -> {FontSize -> 15},
  PlotLegend -> {"Critical point θC = 0", "Critical point θC = π"},
  LegendPosition -> {-0.5, 1}]

Both trajectories near the critical points are plotted in figure 8.7. We can see that the trajectory corresponding to the first critical point θC = 0 is a circle and thus remains in the vicinity of the critical point. The second trajectory, corresponding to the critical point θC = π, on the other hand, is a line which escapes to infinity. Hence, we can see that the second critical point is unstable in the following sense. If we move the pendulum to θ = π and set the initial velocity to zero, the pendulum remains at this equilibrium position. However, an arbitrarily small perturbation (in our case δ = 0.1) will cause the pendulum to escape from the equilibrium position quickly. In our case the trajectory escapes to infinity, but this is an artefact of the linearization: we have assumed that the perturbation δ is small, but as soon as the pendulum is far enough from the critical point, this assumption is not valid anymore.

8.6 Stability of critical points

Having illustrated the main idea of stability and instability on the example of the mathematical pendulum, we can proceed to a general theory. For simplicity we restrict ourselves to autonomous planar dynamical systems, i.e. dynamical systems with only two variables x1 = x and x2 = y which can be visualised in the plane. Hence, a planar dynamical system is a set of two first-order equations of the form

ẋ = fx(x, y),  ẏ = fy(x, y).   (8.9)

A critical point (xC, yC) of this system is a point at which

ẋ(xC, yC) = 0,  ẏ(xC, yC) = 0,   (8.10)


Fig. 8.7. Phase trajectories near two critical points θC = 0 and θC = π. In both cases the actual

deflection is θ = θC + δ but with different θC . We can see that the red trajectory is a circle about the

origin while the blue trajectory diverges to infinity rapidly.

which means that critical points represent the equilibrium configurations of the sys-

tem.

Now we want to investigate the stability or instability of critical points. That

means we want to find out how the phase trajectories behave in the vicinity of

critical points. In the case of the pendulum we have seen that an appropriate way

how to proceed is to linearize the system of equations near the critical point.

Let us assume that (xC, yC) is a critical point of system (8.9). In the neighbourhood of the critical point we can write

x = xC + δ, |δ| ≪ 1,
y = yC + ε, |ε| ≪ 1.   (8.11)

Since xC and yC are constants, for the time derivatives of x and y we have

ẋ = δ̇, ẏ = ε̇.

Function fx(x, y) can then be expanded into the Taylor series:

fx(x, y) = fx(xC + δ, yC + ε) = fx(xC, yC) + δ (∂fx/∂x)|(xC,yC) + ε (∂fx/∂y)|(xC,yC) = a δ + b ε,   (8.12)

where we have used definition (8.10) in the last step and denoted the partial derivatives of fx by

a = (∂fx/∂x)|(xC,yC),  b = (∂fx/∂y)|(xC,yC).   (8.13)

Vertical line with the subscript indicates that partial derivatives must be evaluated

at the critical point. Similarly, for fy we find

fy(x, y) = fy(xC + δ, yC + ε) = c δ + d ε,

where

c = (∂fy/∂x)|(xC,yC),  d = (∂fy/∂y)|(xC,yC).   (8.14)

Thus, near the critical point, planar dynamical system (8.9) can be replaced by

simpler equations

δ̇ = a δ + b ε,

(8.15)

ε̇ = c δ + d ε.

Coefficients a, b, c and d are not functions but constants given by (8.13) and (8.14).

It is useful to write equations (8.15) in the matrix form. Let us define

x = (δ, ε),  J = ((a, b), (c, d)).

Then two equations (8.15) are equivalent to single matrix equation

ẋ = J · x (8.16)

where the dot denotes standard matrix multiplication.
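In practice, the constants a, b, c, d are simply the entries of the Jacobi matrix of (fx, fy) evaluated at the critical point. A small sketch for the pendulum (8.7) at the critical point (0, 0); the names f and Jlin are our own:

f[θ_, p_] = {p, -Sin[θ]};
Jlin = D[f[θ, p], {{θ, p}}] /. {θ -> 0, p -> 0}
(* -> {{0, 1}, {-1, 0}}, reproducing the linearized system δ' = p, p' = -δ *)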


8.6.1 Example

Consider the dynamical system

ẋ = x (1 + y),  ẏ = y (1 − x),

whose critical points satisfy

xC (1 + yC) = 0,  yC (1 − xC) = 0.

These equations have two solutions, the critical points (0, 0) and (1, −1). We analyse these points separately. The emphasis is on finding the critical points and deriving the linearized equations of motion; the solution is merely stated because we will analyse all cases in detail later.

a) Critical point (0, 0). In this case we write

x = xC + δ = δ,  y = yC + ε = ε.

Now we have

fx = x(1 + y) = δ (1 + ε) ≈ δ,  fy = y(1 − x) = ε (1 − δ) ≈ ε,

where we have neglected the products of perturbations. Hence, in the neighbourhood of the first critical point, the equations of motion are

δ̇ = δ,  ε̇ = ε,

with solutions

δ = C1 e^t,  ε = C2 e^t,

where C1 and C2 are integration constants. We will discuss this later, but for now it is obvious that the phase trajectory escapes to infinity because

lim_{t→∞} e^t = ∞,

and hence the first critical point is unstable.


b) Critical point (1, −1). In this case we write

x = xC + δ = 1 + δ, y = yC + ε = −1 + ε,

so that

fx = x(1 + y) = (1 + δ)(1 − 1 + ε) = ε, fy = y(1 − x) = (−1 + ε)(1 − 1 − δ) = δ,

where we have neglected products εδ again. Now the linearized equations of motion

are

δ̇ = ε, ε̇ = δ

which solve to

δ = C1 cosh t + C2 sinh t, ε = C1 sinh t + C2 cosh t.

The reader is invited to check that solutions (δ, ε) are hyperbolas escaping to infinity

and hence the second critical point is unstable again.
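The hyperbolic character of these solutions can be verified symbolically: the combination δ² − ε² is conserved along the linearized flow. A quick check, using our own symbols C1 and C2:

δsol[t_] := C1 Cosh[t] + C2 Sinh[t];
εsol[t_] := C1 Sinh[t] + C2 Cosh[t];
Simplify[δsol[t]^2 - εsol[t]^2]
(* -> C1^2 - C2^2, a constant, so the trajectories lie on hyperbolas δ² − ε² = const *)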

8.7 Classification of critical points

In the previous sections we defined the critical points and sketched how these points can be divided into stable and unstable points. We have seen that the mathematical pendulum has two critical points, one stable, the other not. In the last example we have seen a system with two unstable critical points. The classification of critical points, however, is more subtle and we discuss all possibilities in this section.

Let us first recapitulate our goal. We study planar dynamical system described

by equations

ẋ = fx (x, y), ẏ = fy (x, y).

We assume that we have found critical point of this system, i.e. point (xC , yC ) such

that

fx (xC , yC ) = fy (xC , yC ) = 0,

and study the behaviour of the system near this critical point. We linearize the

equations in the neighbourhood of critical point so that we obtain equations2

2

In the notation of previous section, our functions x and y are in fact perturbations δ and ε. In this

section, however, we use x and y as they are more natural.


ẋ = a x + b y,  ẏ = c x + d y,

or, in the matrix form,

ẋ = J · x,

where

J = ((a, b), (c, d)).

Now we discuss several forms of matrix J and classify the critical points. Finally we

will show how the analysis can be done for general matrix J .

Consider a linear planar system of the form

ẋ = λ1 x,  ẏ = λ2 y   (8.17)

with the matrix

J = ((λ1, 0), (0, λ2)).   (8.18)

System (8.17) can be easily solved. Equations for x and y are independent; we say

that these equations are decoupled which means that equation for ẋ does not contain

y and vice versa.

Let us solve the equation

ẋ = λ1 x,

i.e.

dx/dt = λ1 x,

which is a separable differential equation. We can rewrite it as

dx/x = λ1 dt.

This form of the equation is called separated because the left hand side contains only x and the right hand side contains only time t. We can integrate the equation,


∫ dx/x = ∫ λ1 dt,

to obtain

log x = λ1 t + C,

where C is an integration constant. In order to exponentiate the solution, we write the constant as a logarithm as well³:

log x = λ1 t + log K.

Exponentiating, we arrive at

x = K e^{λ1 t}.

In the same way, the solution of the second equation is

y = L e^{λ2 t}.

At time t = 0 we have x(0) = K and y(0) = L, so, denoting the initial values by x0 and y0, we can write the solution of (8.17) in the form

x = x0 e^{λ1 t},  y = y0 e^{λ2 t}.   (8.19)
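As a consistency check, DSolve reproduces (8.19) directly:

DSolve[{x'[t] == λ1 x[t], y'[t] == λ2 y[t], x[0] == x0, y[0] == y0}, {x[t], y[t]}, t]
(* -> {{x[t] -> x0 E^(λ1 t), y[t] -> y0 E^(λ2 t)}} *)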

Clearly, the only critical point of system (8.17) is (0, 0). Having derived solution

of this system, we can analyze its behaviour near the critical point. Useful function

to visualise properties of the system near critical point is StreamPlot which takes the

vector field and plots trajectories. In the following example we choose λ1 = λ2 = 1.

3
Notice that an arbitrary real number C is the logarithm of some positive real number, i.e. we can write C = log K for some K > 0.


vals = {λ1 -> 1, λ2 -> 1};
StreamPlot[{λ1 x, λ2 y} /. vals, {x, -10, 10}, {y, -10, 10}]

In this figure we can see trajectories (8.19) for initial points (x0 , y0 ) chosen by Math-

ematica. Notice that we have inserted the right hand side of (8.17) as an argument

of function StreamPlot. We can see that the trajectories are straight lines emanating

from the origin (critical point) and tending to infinity exponentially.

What about other choices of λ1,2? It is clear that the function e^{λt} is increasing for λ > 0 and decreasing for λ < 0. We can conclude that the qualitative behaviour of the system depends on the signs of λ1,2; four possibilities are shown in figure 8.8, created by commands analogous to the one above. We distinguish three cases.

• λ1 > 0 and λ2 > 0

In this case the critical point is called unstable node. Trajectories are emanating

from the origin and they are repelled to infinity.

• λ1 > 0, λ2 < 0 or λ1 < 0, λ2 > 0
The critical point is called a saddle point. Trajectories are repelled from the y-axis and attracted to the x-axis (for λ1 > 0, λ2 < 0), or repelled from the x-axis and attracted to the y-axis (for λ1 < 0, λ2 > 0).

• λ1 < 0 and λ2 < 0

Critical point is called stable node. Trajectories are attracted to the origin.


In addition to this classification, critical points with distinct values λ1 ≠ λ2 are called singular while critical points with the same values λ1 = λ2 are called degenerate. Clearly, the saddle points cannot be degenerate.


Fig. 8.8. Different behaviour of planar system (8.17) for different choices of λ1,2 . Critical points are

a) unstable node, b,c) saddle point, d) stable node.


Recall that planar dynamical system (8.17) can be represented by the matrix (8.18),

J = ((λ1, 0), (0, λ2)).

From elementary linear algebra we know that with matrix J we can associate a set of eigenvalues λ and eigenvectors e defined by the equation

J · e = λ e.

The eigenvalues of matrix (8.18) are λ1 and λ2 and the corresponding eigenvectors are

e1 = (1, 0),  e2 = (0, 1),

so that J · e1 = λ1 e1 and J · e2 = λ2 e2. The eigenvectors define two lines in the phase space, and trajectories which start on these lines always remain in these lines. If the trajectory is being repelled from the critical point

along direction e, the line determined by vector e is called unstable manifold. If the

trajectory is attracted to the critical point along the vector e, the line determined by e

is called stable manifold. For matrix (8.18), vectors e1 and e2 are always eigenvectors.

We can see that e1 lies on the x−axis and e2 lies on the y−axis. Hence, the axes are

stable or unstable manifolds of system (8.17), depending on the sign of λ1,2 .

The classification introduced above can be reformulated in the following way. Let

J = ((a, b), (c, d))

be the matrix of the linear planar system

ẋ = a x + b y,  ẏ = c x + d y.

If matrix J has two real eigenvalues λ1 and λ2, then the critical point is a stable/unstable node or a saddle point, depending on the signs of these eigenvalues.

We illustrate this classification on an example. Consider the dynamical system

ẋ = 2 x + y,  ẏ = x,   (8.20)

with the matrix

J = ((2, 1), (1, 0)).

This matrix is not of the form (8.18) but we can apply the second criterion. Eigenvalues and eigenvectors can be found in Mathematica using

Eigensystem[J]

which gives

λ1 = 1 + √2,  λ2 = 1 − √2,

with eigenvectors

e1 = (1 + √2, 1),  e2 = (1 − √2, 1).

Since λ1 > 0 and λ2 < 0, vector e1 defines the unstable manifold and e2 defines the stable manifold. Since both eigenvalues have different signs, the critical point is a saddle point and it is singular. Phase trajectories together with stable and unstable manifolds

can be plotted by

g1 = StreamPlot[{2 x + y, x}, {x, -10, 10}, {y, -10, 10}];
g2 = Graphics[{Blue, Thick, Line[{-10 {1 + Sqrt[2], 1}, 10 {1 + Sqrt[2], 1}}]}];
g3 = Graphics[{Red, Thick, Line[{-10 {1 - Sqrt[2], 1}, 10 {1 - Sqrt[2], 1}}]}];
Show[g1, g2, g3]

The next special case we consider is the dynamical system of the form


Fig. 8.9. Phase portrait for dynamical system (8.20). Blue line represents unstable manifold, red line

represents stable manifold.

ẋ = α x + β y,  ẏ = −β x + α y,   (8.21)

with the matrix

J = ((α, β), (−β, α)).   (8.22)

System (8.21) is a little trickier to solve. Let us switch to the polar coordinate system by the usual transformation

x = r cos θ,  y = r sin θ,

with the inverse

r = √(x² + y²),  θ = arctan(y/x).

These relations can be used to find

∂r/∂x = x/r,  ∂r/∂y = y/r,
∂θ/∂x = −y/r²,  ∂θ/∂y = x/r²,

so that

ṙ = (∂r/∂x) ẋ + (∂r/∂y) ẏ = α r,
θ̇ = (∂θ/∂x) ẋ + (∂θ/∂y) ẏ = −β.

We can see that dynamical system (8.21) in polar coordinates decouples into two independent equations for the coordinates r and θ,

ṙ = α r,  θ̇ = −β.   (8.23)

The first equation can be separated,

dr/r = α dt,

which integrates to

log r = α t + log C,

where the integration constant has been written as a logarithm (see the footnote on page 162). Exponentiating the last equation we arrive at

r = C e^{α t}.

Obviously, at time t = 0 we have r(0) = C and so we write the solution in the form

r = r0 e^{α t}.

Similarly, the second equation separates to

dθ = −β dt,

which integrates to

θ = θ0 − β t,


where the integration constant has been denoted by θ0 and represents the value of θ at t = 0. Summa summarum, the solution of system (8.23) acquires the form

r = r0 e^{α t},  θ = θ0 − β t.   (8.24)

For α = 0, this represents motion at constant angular velocity β and constant radius r0, and therefore the phase trajectories are circles of radius r0. If α ≠ 0, the radius of the "circle" will be

r0 e^{α t}

and hence the trajectory will be a spiral. If α > 0, the radius increases exponentially and the spiral tends to infinity. If, on the other hand, α < 0, the radius decreases exponentially and the phase trajectories spiral towards the origin. All cases are plotted in figure 8.10 by the Mathematica commands

J = {{α, β}, {-β, α}};
g1 = StreamPlot[J.{x, y} /. {α -> 0, β -> 1}, {x, -5, 5}, {y, -5, 5},
   PlotLabel -> "α = 0, β > 0", BaseStyle -> {FontSize -> 10}];
g2 = StreamPlot[J.{x, y} /. {α -> 0, β -> -1}, {x, -5, 5}, {y, -5, 5},
   PlotLabel -> "α = 0, β < 0", BaseStyle -> {FontSize -> 10}];
g3 = StreamPlot[J.{x, y} /. {α -> 1, β -> 1}, {x, -5, 5}, {y, -5, 5},
   PlotLabel -> "α > 0, β > 0", BaseStyle -> {FontSize -> 10}];
g4 = StreamPlot[J.{x, y} /. {α -> -1, β -> 1}, {x, -5, 5}, {y, -5, 5},
   PlotLabel -> "α < 0, β > 0", BaseStyle -> {FontSize -> 10}];
g = GraphicsGrid[{{g1, g2}, {g3, g4}}]

• α=0

Critical point is called centre. Trajectories are circles centred at the origin.

• α>0

Critical point is called unstable focus, trajectories are spirals escaping to infinity.


• α<0

Critical point is called stable focus, trajectories are spirals tending to the origin.

Parameter β has the meaning of angular velocity. If it is zero, spirals become straight

lines and dynamical system reduces to previous case (8.17). If it is non-zero, its sign

determines the sense of rotation: trajectories orbit the origin in a clockwise sense for

β > 0 and in a counter-clockwise sense for β < 0.

Let us now analyse the critical points of system (8.21) in terms of the eigenvalues of matrix (8.22),

J = ((α, β), (−β, α)).

We can use Mathematica to find the eigenvalues and eigenvectors of matrix (8.22)

by

Eigensystem[J]

which yields the eigenvalues

λ1 = α − i β and λ2 = α + i β

with eigenvectors

e1 = (i, 1),  e2 = (−i, 1),

so that J · e1 = λ1 e1 and J · e2 = λ2 e2.

The first observation is that the eigenvectors are complex and hence there are neither stable nor unstable manifolds, i.e. there is no real direction which is mapped to the same direction. The only exception is β = 0, since in this case dynamical system (8.21) reduces to (8.17) and the eigenvectors become real. Second, the eigenvalues λ1,2 are mutually complex conjugated (as well as the eigenvectors),


Fig. 8.10. Classification of critical points for the system (8.21): a, b) centre, c) unstable focus, d)

stable focus.


λ1 = λ̄2,

where the bar denotes the complex conjugation. Hence, even if the dynamical system is not of the form (8.21), we can conclude that if the matrix J has two complex conjugated eigenvalues

α ± i β,

the critical point is a centre or a focus, with the role of α and β as classified above.

Example. Consider dynamical system

ẋ = 2 x + 4 y, ẏ = −3 x + 2y.

This system is not of the form (8.21) but we can apply the criterion based on the analysis of eigenvalues. In Mathematica we type

J = {{2, 4}, {-3, 2}};
Eigensystem[J] // Expand

where we have used Expand in order to simplify the expression for the eigenvectors (try this code without Expand). We have found two eigenvalues

λ1,2 = 2 ± 2 i √3 = α ± i β,

which are mutually complex conjugated. In this case, the parameters are

α = 2,  β = 2 √3.

Since α > 0, the critical point is an unstable focus, which is confirmed by the phase portrait of the dynamical system considered: the trajectories spiral away from the origin.


As another example, consider the dynamical system

ẋ = x + 2 y,  ẏ = −2 x − y.

Typing

J = {{1, 2}, {-2, -1}};
Eigensystem[J] // Expand

we find the eigenvalues

λ1,2 = ± i √3 = α ± i β,

so that

α = 0,  β = √3.

Since α = 0, the critical point is a centre rather than a focus, and the stream plot of this dynamical system shows closed trajectories encircling the origin.


8.8 General case

In the previous two sections we studied two special cases of planar linear dynamical systems given by the matrices

J = ((λ1, 0), (0, λ2)) and J = ((α, β), (−β, α)).

However, we have seen that the analysis can be performed using the eigenvalues of these matrices. Now we consider a general linear planar dynamical system

ẋ = α x + β y,  ẏ = γ x + δ y,   (8.26)

with the matrix J = ((α, β), (γ, δ)). Let us find the eigenvalues and eigenvectors of this general matrix. Recall that the determinant of matrix J is

D = det J = α δ − β γ.

The trace of the matrix is defined as the sum of its diagonal elements, i.e.

T = Tr J = α + δ.

The equation for the eigenvalues,

J · e = λ e,

can be rewritten as

(J − λ I) · e = 0,

where I is the identity matrix and

J − λ I = ((α − λ, β), (γ, δ − λ)).

This homogeneous system has non-trivial solutions only if the determinant of the system is zero:

det (J − λ I) = 0,

i.e.

(α − λ)(δ − λ) − β γ = 0.

Expanding the product we obtain the quadratic equation

λ² − (α + δ) λ + α δ − β γ = 0,

or, equivalently,

λ² − T λ + D = 0.

This equation has solutions

λ1,2 = (T ± √(T² − 4 D)) / 2.   (8.27)
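Formula (8.27) can be checked in Mathematica; note that T² − 4D = (α − δ)² + 4βγ:

J = {{α, β}, {γ, δ}};
Simplify[Eigenvalues[J]]
(* -> {(α + δ - Sqrt[(α - δ)^2 + 4 β γ])/2, (α + δ + Sqrt[(α - δ)^2 + 4 β γ])/2} *)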

Now we can summarize the classification of critical points as follows.

• λ1,2 real
  – λ1 ≠ λ2 – singular node
  – λ1 = λ2 – degenerate node
  – λ1 > 0, λ2 > 0 – unstable node
  – λ1 λ2 < 0 – saddle point
  – λ1 < 0, λ2 < 0 – stable node
• λ1,2 = α ± i β, λ1 = λ̄2 (complex conjugated eigenvalues)
  – α = 0 – centre
  – α > 0 – unstable focus
  – α < 0 – stable focus

Moreover, if the real parts of the eigenvalues λ1,2 are non-zero, the critical point is called hyperbolic, otherwise it is called non-hyperbolic.
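The classification can be packaged into a small function; classify is our own name and the marginal cases are not treated carefully here:

classify[M_?MatrixQ] := Module[{λ1, λ2},
  {λ1, λ2} = Chop[Eigenvalues[N[M]]];
  Which[
    Im[λ1] != 0 && Re[λ1] == 0, "centre",
    Im[λ1] != 0 && Re[λ1] > 0, "unstable focus",
    Im[λ1] != 0, "stable focus",
    λ1 > 0 && λ2 > 0, "unstable node",
    λ1 < 0 && λ2 < 0, "stable node",
    λ1 λ2 < 0, "saddle point",
    True, "marginal case"]]

classify[{{2, 1}, {1, 0}}]     (* -> "saddle point" *)
classify[{{1, 2}, {-2, -1}}]   (* -> "centre" *)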

8.9 Examples

Example 1

Consider the linear dynamical system

ẋ = 2 x + y,  ẏ = x + 2 y.

Its only critical point is

xC = 0,  yC = 0.

Since the system is linear, we do not have to linearize it and can write the matrix of the linearized system immediately:

J = ((2, 1), (1, 2)).

Its eigenvalues are

λ1 = 1,  λ2 = 3.

Eigenvectors can be found easily by hand. Recall that the eigenvectors are solutions to the equation

J · ei = λi ei,  i = 1, 2.


For λ1 = 1, this equation reduces to the homogeneous system of linear equations

((1, 1), (1, 1)) · (a, b) = 0,

where e1 = (a, b) is the unknown eigenvector. Since the rows (or columns) of the matrix above are linearly dependent⁴, this system has infinitely many non-trivial solutions satisfying the condition a = −b. Hence, all eigenvectors corresponding to eigenvalue λ1 = 1 have the form

(a, −a).

Choosing a = 1 we obtain

e1 = (1, −1).

Similarly, the eigenvector corresponding to eigenvalue λ2 = 3 is

e2 = (1, 1).

To summarize,

λ1 = 1,  e1 = (1, −1),
λ2 = 3,  e2 = (1, 1).   (8.28)

Now we can classify the critical point (0, 0). Since the eigenvalues are real and non-zero, the critical point is hyperbolic. They are both positive and hence the critical point is an unstable node. Finally, the eigenvectors are real and so the system has two unstable manifolds given by e1 and e2. Implementation in Mathematica is shown in figure 8.11.

4

This is a consequence of (8.27), because this equation has been derived under the assumption det(J −

λ I) = 0.


Dynamical system

x' = 2 x + y, y' = x + 2 y

with the matrix J = ((2, 1), (1, 2)).

J = {{2, 1}, {1, 2}};
(* critical points *)
Solve[{2 x + y == 0, x + 2 y == 0}, {x, y}]
  {{x -> 0, y -> 0}}

Origin (0, 0) is the only critical point. Eigenvalues and eigenvectors are found by

Eigensystem[J]
  {{3, 1}, {{1, 1}, {-1, 1}}}

hyperbolic point, unstable node

g1 = StreamPlot[{2 x + y, x + 2 y}, {x, -5, 5}, {y, -5, 5}];
(* unstable manifold e2 = (1, 1) *)
g2 = Graphics[{Blue, Thick, Line[{-10 {1, 1}, 10 {1, 1}}]}];
(* unstable manifold e1 = (-1, 1) *)
g3 = Graphics[{Red, Thick, Line[{-10 {-1, 1}, 10 {-1, 1}}]}];
Show[g1, g2, g3]


Example 2

The linear dynamical system has the form

ẋ = −2 x,  ẏ = −4 x − 2 y,

with the matrix

J = ((−2, 0), (−4, −2)).

Its eigenvalues are

λ1 = λ2 = −2,

so the critical point (0, 0) is a stable degenerate node. Implementation in Mathematica is shown in figure 8.12.

Volterra-Lotka equations belong to the class of predator-prey models which describe the interaction between two populations. The population of prey has a tendency to grow and the population of predators tends to die out. It is due to their mutual interaction that the population of predators can also grow and the population of prey can die out; in other words, predators eat prey.

Let x = x(t) be the number of prey, say, rabbits, and let y = y(t) be the number of predators, say, foxes. We can construct a plausible model of the interaction between foxes and rabbits by the following simple considerations. Suppose that y = 0, i.e. there are only rabbits present. As a first approximation we can assume that the population of rabbits will grow: the number of rabbits x will increase because of the "interaction" between rabbits, and the higher the number of rabbits, the higher the rate of growth. Hence, we can postulate that an isolated population of rabbits will be governed by the equation

ẋ = α x,

where the constant α > 0 expresses how often a rabbit gives birth to a new rabbit when there are no foxes. This equation has solution

x = x0 e^{α t}

Dynamical system

x' = -2 x, y' = -4 x - 2 y

with the matrix J = ((−2, 0), (−4, −2)).

J = {{-2, 0}, {-4, -2}};
(* critical points *)
Solve[{-2 x == 0, -4 x - 2 y == 0}, {x, y}]
  {{x -> 0, y -> 0}}

Origin (0, 0) is the only critical point. Eigenvalues and eigenvectors are found by

Eigensystem[J]
  {{-2, -2}, {{0, 1}, {0, 0}}}

hyperbolic point, stable degenerate node. Stable manifold is given by e1 = (0, 1).

(* stable manifold; g1 is the stream plot of the vector field, as in figure 8.11 *)
g2 = Graphics[{Red, Thick, Line[{-10 {0, 1}, 10 {0, 1}}]}];
Show[g1, g2]


Similarly, if there are no rabbits, the population of foxes will die out. Denoting by γ > 0 the rate of death of the foxes, an isolated population of foxes will be governed by the equation

ẏ = −γ y

with solution

y = y0 e^{−γ t},

Now we add an interaction to our equations. The number of rabbits eaten by foxes is proportional to the number of rabbits and to the number of foxes. Conversely, the number of new-born foxes is proportional to the number of foxes and to the number of rabbits. If we introduce constants β and δ for these two processes, the equations for the interacting populations of rabbits and foxes read

ẋ = α x − β x y,  ẏ = −γ y + δ x y.   (8.29)

These are Volterra-Lotka equations. Obviously, they are non-linear and the non-

linearity represents the interaction between two populations. All constants are as-

sumed to be positive.
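Before analysing the critical points, we can integrate (8.29) numerically with function DynSys from section 8.3; the values of the constants below are an arbitrary illustrative choice of ours:

(* α = 1, β = 0.5, γ = 1, δ = 0.3; prey x(0) = 4, predators y(0) = 2 *)
sol = DynSys[{#1 - 0.5 #1 #2, -#2 + 0.3 #1 #2} &, {4, 2}, 20];
ParametricPlot[{x[1][t], x[2][t]} /. sol, {t, 0, 20}]
(* the orbit is a closed curve encircling the critical point (γ/δ, α/β) *)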

Critical points can be found by

cp = Solve[{α x - β x y == 0, -γ y + δ x y == 0}, {x, y}]
  {{x -> γ/δ, y -> α/β}, {x -> 0, y -> 0}}

i.e. the critical points are

xC1 = (γ/δ, α/β),  xC2 = (0, 0).

In order to linearize equations (8.29) we introduce the Jacobi matrix

J = ((∂ẋ/∂x, ∂ẋ/∂y), (∂ẏ/∂x, ∂ẏ/∂y)).

The Jacobi matrix can be found in Mathematica by

182 8 Dynamical systems

f@x_ , y_ D = 8 Α x - Β x y , - Γ y + ∆ x y <;

In[16]:=

Out[17]=

which shows

α − yβ −xβ

J= .

yδ xδ − γ

Next we evaluate the Jacobian at both critical points:

J1 = J /. cp[[1]]
J2 = J /. cp[[2]]

i.e. we have

J1 = ((0, −β γ/δ), (α δ/β, 0)) at critical point xC1,
J2 = ((α, 0), (0, −γ)) at critical point xC2.

Finally we find the eigenvalues and eigenvectors by

Eigensystem[J1]
Eigensystem[J2]

For J1 we obtain purely imaginary eigenvalues ±i √(α γ) (with complex eigenvectors), so the critical point xC1 is a centre: the populations near xC1 oscillate periodically around the equilibrium. For J2 the eigenvalues are α > 0 and −γ < 0; they have opposite signs, so the critical point xC2 is a saddle point and hence unstable.


In this section we introduce some useful notions related to the concept of a dynamical system. We consider a general autonomous dynamical system (8.1),

ẋa = fa(x).   (8.30)

We know that the solution exists and is unique if we prescribe initial conditions

xa(0) = xa0,   (8.31)

where xa0 are constants with the meaning of initial values of the coordinates xa. The solution of the dynamical system is then a set of functions xa of time,

xa = xa(t, x0),   (8.32)

where we have explicitly emphasized that a particular solution depends on the initial values x0, such that

x(0, x0) = x0 and (d/dt) x(t, x0) = f(x(t, x0)).   (8.33)

In other words, x(t, x0) is a solution of dynamical system (8.30) with initial conditions (8.31).

It is useful to introduce a slightly more formal notation for x(t, x0). We defined the phase space M as an abstract space with coordinates xa. For an n-dimensional dynamical system, the phase space is

M = Rⁿ = R × R × ⋯ × R (n factors).


The flow of the dynamical system is a mapping

Φ : R × M → M

defined by

Φs(x0) = x(s, x0).

Geometrically, the flow Φs is a mapping which maps arbitrary point x0 to point

x(s, x0 ), i.e. shifts point x0 along the phase trajectory by parametric distance s.

Hence, the flow satisfies relations

Φ0 (x0 ) = x0 , Φs+t = Φs ◦ Φt , (Φs )−1 = Φ−s .

Obviously,

(d/ds) Φs(x0)|_{s=0} = (d/ds) x(s, x0)|_{s=0} = fa(x0).

Thus, we can also say that the flow Φs shifts the point x0 along the vector field fa.

Let us illustrate it on the example of familiar planar dynamical system

ẋ = y, ẏ = −x

so that we have

f1 (x, y) = y, f2 (x, y) = −x.

Vector field fa can be plotted by

StreamPlot[{y, -x}, {x, -10, 10}, {y, -10, 10}]


The system can also be solved exactly for general initial conditions

x(0) = x0,  y(0) = y0,

using DSolve:

sol = DSolve[{x'[t] == y[t], y'[t] == -x[t], x[0] == x0, y[0] == y0}, {x[t], y[t]}, t]

which yields

x(t) = x0 cos t + y0 sin t,  y(t) = y0 cos t − x0 sin t.

Thus, the flow Φs maps the point (x0, y0) to the point which lies on the solution with initial conditions (x0, y0) at time s:

Φs(x0, y0) = (x0 cos s + y0 sin s, y0 cos s − x0 sin s).

Hence, Φs(x0, y0) is the position of the system at time s for initial conditions (x0, y0).
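As a consistency check, the group property Φs ∘ Φt = Φs+t can be verified directly for this explicit flow; flow is our own symbol for Φ:

flow[s_][{a_, b_}] := {a Cos[s] + b Sin[s], b Cos[s] - a Sin[s]};
Simplify[flow[s][flow[t][{x0, y0}]] - flow[s + t][{x0, y0}]]
(* -> {0, 0} *)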

In figure 8.13 we plot the flow for initial conditions

x0 = 1, y0 = 8.

We have seen that the curve Φs(x0) for a given x0 is a solution of the dynamical system with initial condition x(0) = x0. This curve is called the orbit of the point x0 and is denoted by

Λ(x0) = {Φs(x0) | s ∈ R}.

Similarly, the positive and negative semi-orbits of x0 are defined by

Λ+(x0) = {Φs(x0) | s > 0},  Λ−(x0) = {Φs(x0) | s < 0}.   (8.35)


IC = {x0 -> 1, y0 -> 8};
g2 = ParametricPlot[{x[t], y[t]} /. sol /. IC, {t, 0, 5}, PlotStyle -> Black];
g3 = Graphics[{Black, Text[Style["x0 = Φ0(x0)", Large], {0.2, 9}]}];
g4 = Graphics[{Black, Text[Style["Φ5(x0)", Large], {-6, 4.2}]}];
Show[g1, g2, g3, g4]

Here g1 is the stream plot of the vector field shown above; the resulting figure 8.13 shows the initial point x0 = Φ0(x0) and its image Φ5(x0) on the same orbit.

8.11 Lyapunov stability

Recall that we have defined the critical point or fixed point xC of dynamical system (8.30) as a point xC for which

fa(xC) = 0.

The critical point is an equilibrium of the system in the sense that the system remains in the critical point at all times, i.e.

Λ(xC) = {xC}.


We have classified critical points according to the behaviour of the orbits (phase trajectories) in the vicinity of the critical point. If the orbit remained in the vicinity of the critical point, we said that the critical point is stable. If the orbit was attracted to the critical point, it was called a stable node or stable focus, depending on the character of the system. If the orbit was circular, the critical point was called a centre. Finally, if the orbit escaped from the critical point to infinity, we called the critical point an unstable node or unstable focus. However, this analysis was performed for the linearized dynamical system. Now we can formulate the stability of a general non-linear system in terms of the flow.

Let ‖·‖ be the standard norm defined on the phase space M, i.e. for any x ∈ M its norm is

‖x‖ = √(x1² + x2² + ⋯ + xn²).

In general, the norm is a measure of the distance of the point x from the origin. In some situations it is useful to introduce a different notion of the norm, for example the so-called p-norm (p is a positive integer) defined by

‖x‖p = (|x1|^p + |x2|^p + ⋯ + |xn|^p)^{1/p}.

For p = 2 we recover the standard Euclidean distance, as follows from the Pythagorean theorem. In general, the norm must satisfy three relations.

three relations.

• Positive definiteness

• Linearity

kα xk = |α| kxk

• Triangle inequality

kx + yk ≤ kxk + kyk.

In some contexts the first condition is relaxed, i.e. we admit that there are vectors x ≠ 0 for which ‖x‖ = 0. In this case, the operation ‖·‖ is called a semi-norm. In this textbook we consider only positive definite norms satisfying the first property. Notice

that positive definiteness implies that whenever

‖x − y‖ = 0,

the points coincide, x = y.

Solution Φs(x0) is called Lyapunov stable if for any ε > 0 there exists δ > 0 such that

‖y0 − x0‖ < δ  implies  ‖Φs(y0) − Φs(x0)‖ < ε for all s > 0.

Solution Φs(x0) is called asymptotically stable if it is stable and, in addition, there exists δ > 0 such that

‖y0 − x0‖ < δ  implies  lim_{s→∞} ‖Φs(y0) − Φs(x0)‖ = 0.

s→∞

9

Bifurcations

In the previous chapter we defined the concept of dynamical system and introduced

several notions related to dynamical systems. Among others, we have investigated the

stability of critical points. This discussion was connected with the behaviour of the

phase trajectories (or orbits) n the neighbourhood of the critical point. In this section

we analyse dynamical systems from another point of view. Instead of investigating

the orbits (but using classification introduced in previous chapter) we investigate the

influence of the parameters of the system. We will observe that there are values of

parameters for which the system can exhibit different behaviour. Which behaviour

occurs depends on the circumstances, e.g. on the history of the system. Points at

which the system must ”decide” which behaviour to choose are called bifurcation

points. These issues will be clarified and illustrated below. Bifurcation theory is a

large subject and in this chapter we merely sketch the main ideas without going into

depth.

The existence and properties of critical points can depend on the parameters of the dynamical system. Consider the one-dimensional dynamical system

ẋ = µ + x²   (9.1)

where µ is a real parameter. If µ > 0, there are no real critical points. For µ = 0, the only critical point is xC = 0, and for µ < 0 there are two critical points at xC = √(−µ) and xC = −√(−µ). Let us examine the character of the critical points briefly.
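The µ-dependence of the critical points can be obtained directly:

Solve[µ + x^2 == 0, x]
(* -> {{x -> -Sqrt[-µ]}, {x -> Sqrt[-µ]}}: real solutions exist only for µ <= 0 *)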

For µ = 0 and critical point xC = 0, the linearized version of system (9.1) reads

ẋ = 0,

which shows that xC is a non-hyperbolic critical point (the eigenvalue of the Jacobi matrix has vanishing real part).

For µ < 0, the critical points are xC = ±√(−µ). We expand the function

f(x) = µ + x²

around these points:

f(x) ≈ f(xC) + (x − xC) f′(xC) = ±2 √(−µ) (x ∓ √(−µ)).

Hence, system (9.1) linearized in the neighbourhood of the point √(−µ) reads

ẋ = 2 √(−µ) x

(where x now denotes the deviation from the critical point), which shows that the critical point √(−µ) is an unstable node. In the neighbourhood of the critical point −√(−µ) we have

ẋ = −2 √(−µ) x

and so this critical point is a stable node. We can plot the critical points corresponding to different values of µ by the code presented in figure 9.1.

Saddle-node bifurcations occur when critical points do not exist for some values of the parameter, then a critical point suddenly appears at a particular value of the parameter, and this single critical point splits into two critical points for the remaining values of the parameter. In our case, there are no critical points for µ > 0 but a critical point appears at µ = 0. This is a bifurcation point. Finally, for µ < 0 there are two critical points, one of them stable, the other one unstable.
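The same conclusions can be obtained directly in Mathematica; a minimal sketch:

Solve[µ + x^2 == 0, x]
(* the two branches ±Sqrt[-µ], real only for µ <= 0 *)
D[µ + x^2, x] /. x -> Sqrt[-µ]
(* 2 Sqrt[-µ], positive for µ < 0: the branch Sqrt[-µ] is unstable *)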

9.2 Transcritical bifurcations

Now consider the dynamical system

ẋ = µ x − x2 = x (µ − x). (9.2)

Regardless of the value of µ, there is always one critical point at xC = 0 and one critical point at xC = µ. Hence, unlike the case of saddle-node bifurcations, the number of critical points does not change. However, we will show that the character of these critical points changes at the bifurcation point.

The first critical point is xC = 0. After linearization of system (9.2) about this point we find

ẋ = µ x.


In[61]:= Plot[{Sqrt[-µ], -Sqrt[-µ]}, {µ, -2, 0.5},
   (* the two branches of critical points of (9.1); plot range reconstructed from the axes of the original figure *)
   PlotStyle -> {{Dashed, Thick}, {Thick}}, AspectRatio -> 1, AxesLabel -> {"µ", "xC"},
   BaseStyle -> {FontSize -> 15},
   Epilog -> {Disk[{0, 0}, 0.03],
     Text["unstable node", {-1, 1.3}],
     Text["stable node", {-1, -1.3}],
     Text["bifurcation point", {-0.6, 0.1}]}]

Out[61] (figure 9.1): bifurcation diagram of the saddle-node bifurcation. The dashed branch √(−µ) consists of unstable nodes, the solid branch −√(−µ) of stable nodes; both emanate from the bifurcation point at the origin.

Obviously, for µ > 0 this critical point is unstable, while for µ < 0 it is stable. Now take the second critical point xC = µ. After linearization of system (9.2) in its neighbourhood we have

ẋ = µ² − µ x.  (9.3)

This is an inhomogeneous linear equation with constant coefficients and can be solved by elementary methods. First we write down the corresponding homogeneous equation

ẋ = −µ x,

which integrates to


xH = C e^(−µt),

where the subscript H stands for “homogeneous”. Next we need to find any particular solution of the original inhomogeneous equation. This is trivial here, for obviously the constant x = µ solves equation (9.3). By a standard theorem, the general solution of equation (9.3) is the sum of the two,

x = µ + C e^(−µt).

The constant µ does not affect the character of the critical point (prove!) and only the exponential term matters. We can see that for µ > 0 the critical point is stable while for µ < 0 it is unstable.
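The whole calculation can be delegated to DSolve; a quick sketch:

DSolve[x'[t] == µ^2 - µ x[t], x[t], t]
(* general solution µ + C[1] E^(-µ t), as derived above *)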

To summarize, we have found two critical points,

xC = 0 and xC = µ,

the first being stable for µ < 0 and unstable for µ > 0, the second behaving in the opposite way. The resulting bifurcation diagram is shown in figure 9.2.

Transcritical bifurcations occur when there are two critical points for all values of the parameter. However, at the bifurcation point (in our case µ = 0), these critical points interchange their character: the point which was stable becomes unstable and vice versa.

9.3 Pitchfork bifurcation

Next we examine the system

ẋ = µ x − x³.  (9.4)

Notice that this system is invariant under the reflection x ↦ −x, for under this transformation the velocity transforms as ẋ ↦ −ẋ, and hence

ẋ = µ x − x³  ↦  −ẋ = −µ x + x³,  i.e.  ẋ = µ x − x³.

Thus, equation (9.4) does not change its form under the reflection, i.e. the reflection is a symmetry of equation (9.4). Pitchfork bifurcations often occur in systems possessing some kind of symmetry.
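This symmetry can be verified mechanically; a minimal sketch:

f[x_] = µ x - x^3;
Simplify[f[-x] == -f[x]]
(* True: the right-hand side is an odd function of x *)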


In[83]:= Plot[{µ, 0}, {µ, -2, 2},
   (* the two branches xC = µ and xC = 0 of critical points of (9.2); first argument reconstructed *)
   PlotStyle -> {{Blue, Thick}, {Dashed, Red, Thick}}, AspectRatio -> 1, AxesLabel -> {"µ", "xC"},
   Axes -> {False, True}, BaseStyle -> {FontSize -> 15},
   Epilog -> {Disk[{0, 0}, 0.03],
     Text["unstable", {-1, 0.1}],
     Text["stable", {1, 0.1}],
     Text[Style["µ", FontSize -> 15], {1.9, -0.1}]}]

Out[83] (figure 9.2): bifurcation diagram of the transcritical bifurcation. The branches xC = 0 and xC = µ cross at the bifurcation point µ = 0, where they exchange stability.

The critical point xC = 0 exists for every value of µ. Linearization of system (9.4) yields

ẋ = µ x,

and so this critical point is stable for µ < 0 and unstable for µ > 0.

For µ > 0 there are two other critical points xC = ±√µ. By linearization we find

ẋ = −2µ (x ∓ √µ),


which shows that (ignoring the constant term as in the previous section) both critical points ±√µ are stable. Indeed, µ > 0 and hence the factor multiplying x is always −2µ < 0. All possibilities are plotted in figure 9.3.

In[99]:= xC1[µ_ /; µ <= 0] = 0;
   xC2[µ_ /; µ > 0] = Sqrt[µ];
   xC3[µ_ /; µ > 0] = -Sqrt[µ];
   xC4[µ_ /; µ > 0] = 0;

In[104]:= Plot[{xC1[µ], xC2[µ], xC3[µ], xC4[µ]}, {µ, -2, 2},
   (* each xCi is defined only on its branch, so Plot draws only the existing pieces *)
   PlotStyle -> {{Blue}, {Blue}, {Blue}, {Red, Dashed}},
   AspectRatio -> 1, AxesLabel -> {"µ", "xC"},
   Axes -> {False, True}, BaseStyle -> {FontSize -> 15},
   Epilog -> {Disk[{0, 0}, 0.03],
     Text["stable", {-1, 0.1}],
     Text["unstable", {1, 0.1}],
     Text[Style["µ", FontSize -> 15], {1.9, -0.1}]}]

Out[104] (figure 9.3): bifurcation diagram of the supercritical pitchfork bifurcation of system (9.4): the branch xC = 0 is stable for µ ≤ 0 and unstable (dashed) for µ > 0, and the two stable branches ±√µ exist for µ > 0.


The dynamical system

ẋ = µ x + x³

exhibits the so-called subcritical pitchfork bifurcation: the reader can verify by the same standard analysis that the bifurcation diagram for this system is correctly depicted in figure 9.4. Here the critical point xC = 0 is again stable for µ < 0 and unstable for µ > 0, while the two symmetric critical points ±√(−µ) exist for µ < 0 and are unstable.

9.4 Example

Now let us see a non-trivial example of a pitchfork bifurcation. Let the system be

ẋ = µ x + y + sin x,  ẏ = x − y.  (9.5)

Our task is to determine the bifurcation point and the type of bifurcation. We will use Mathematica to carry out particular steps.

First we find the critical points by setting ẋ = 0 and ẏ = 0. The second equation immediately gives y = x and hence the equation for x reads

µ x + x + sin x = 0.  (9.6)

Clearly, a general solution cannot be found analytically but we can see that for

arbitrary µ there is always a solution

xC = yC = 0.

Let us determine the character of this critical point. The Jacobi matrix of system (9.5) evaluated at the origin is

J = ( µ + 1    1 )
    (   1     −1 )    (9.7)

In[30]:= J = {{µ + 1, 1}, {1, -1}};   (* the Jacobi matrix (9.7) *)

In[31]:= sys = Eigenvalues[J]

Out[31]= {1/2 (µ - Sqrt[8 + 4 µ + µ^2]), 1/2 (µ + Sqrt[8 + 4 µ + µ^2])}


In[11]:= xC1[µ_ /; µ <= 0] = 0;
   xC2[µ_ /; µ < 0] = Sqrt[-µ];
   xC3[µ_ /; µ < 0] = -Sqrt[-µ];
   xC4[µ_ /; µ > 0] = 0;

In[17]:= Plot[{xC1[µ], xC2[µ], xC3[µ], xC4[µ]}, {µ, -2, 2},
   PlotStyle -> {{Blue}, {Blue}, {Blue}, {Red, Dashed}},
   AspectRatio -> 1, AxesLabel -> {"µ", "xC"},
   Axes -> {False, True}, BaseStyle -> {FontSize -> 15},
   Epilog -> {{PointSize[Large], Point[{0, 0}]},
     Text["stable", {-1, 0.1}],
     Text["unstable", {1, 0.1}],
     Text[Style["µ", FontSize -> 15], {1.9, -0.1}]}]

Out[17] (figure 9.4): bifurcation diagram of the subcritical pitchfork bifurcation of the system ẋ = µx + x³: the branch xC = 0 is stable for µ ≤ 0 and unstable (dashed) for µ > 0, and the two branches ±√(−µ) exist for µ < 0.


Although the signs of the eigenvalues could be analysed by hand as functions of µ, it is even easier to use Mathematica to plot the dependence of λ1 and λ2 on µ.

In[14]:= λ1[µ_] = sys[[1]]
In[15]:= λ2[µ_] = sys[[2]]
In[16]:= Plot[{λ1[µ], λ2[µ]}, {µ, -10, 10}, PlotStyle -> {Blue, Red}]

Out[14]= 1/2 (µ - Sqrt[8 + 4 µ + µ^2])

Out[15]= 1/2 (µ + Sqrt[8 + 4 µ + µ^2])

Out[16]: plot of λ1 (blue) and λ2 (red) as functions of µ on the interval (−10, 10); λ1 stays negative everywhere, while λ2 crosses zero at µ = −2.

Hence, for all values of µ we have λ1 < 0, while λ2 changes sign at µ = −2. That means that for µ < −2, when both eigenvalues are negative, the critical point is a stable node. For µ > −2, the critical point is a saddle point because the eigenvalues have different signs.
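The sign change of λ2 can also be confirmed symbolically; a small sketch:

Reduce[(µ + Sqrt[8 + 4 µ + µ^2])/2 > 0, µ, Reals]
(* µ > -2: the second eigenvalue is positive exactly for µ > -2 *)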

Clearly, the point µ = −2 is a candidate for being a bifurcation point. Since we cannot solve equation (9.6) exactly, we restrict our attention to the neighbourhood of the potential bifurcation point µ = −2. The critical points are the roots of the function

Rµ(x) = (µ + 1) x + sin x.

In figure 9.5 we plot this function for three values of µ. We can see that critical points different from the origin appear only for µ > −2. The approximate location of these critical points can be found by expanding the function sin x in (9.6) up to the third order,

sin x ≈ x − (1/3!) x³,


so that equation (9.6) becomes

x (µ + 2) − (1/6) x³ = 0.

One solution is, of course, x = 0; the other two are

x = ±√(6(µ + 2)).  (9.8)
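The expansion can be checked directly in Mathematica; a small sketch:

Series[(µ + 1) x + Sin[x], {x, 0, 3}] // Normal
(* (2 + µ) x - x^3/6, in agreement with the equation above *)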

Fig. 9.5. Plot of the function Rµ(x) = (µ + 1) x + sin x for µ = −2.1, µ = −2 and µ = −1.9. Its roots are the critical points of system (9.5). For µ ≤ −2, the origin x = 0 is the only critical point; for µ > −2 there are two further critical points symmetric about the origin.

Now we can determine the character of the bifurcation point even without analysing the new critical points. Recall that the origin is a critical point, stable for µ < −2 and unstable for µ > −2. The new critical points emerge at the bifurcation point and exist for µ > −2. Hence, the bifurcation diagram is similar to that in figure 9.3. We can deduce that the bifurcation is supercritical and the two new critical points are stable.

In Mathematica we can easily find the precise locations of the critical points numerically using the function FindRoot. This function needs a starting point, and we choose this starting point to be the approximate solution (9.8). Full Mathematica code for plotting the correct bifurcation diagram in the neighbourhood of the bifurcation point µ = −2 is shown in figure 9.6.
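For a single value of the parameter, say µ = −1.9, the idea looks as follows (a sketch, with the starting point taken from (9.8)):

FindRoot[-0.9 x + Sin[x] == 0, {x, Sqrt[0.6]}]
(* {x -> 0.787...}, close to the starting estimate Sqrt[0.6] ≈ 0.775 *)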


In[218]:= cp[µ_] := x /. FindRoot[(µ + 1) x + Sin[x] == 0, {x, Sqrt[6 (µ + 2)]}];
   (* reconstructed definition: numerical root of (9.6), started from the approximation (9.8) *)

In[229]:= xC1[µ_ /; µ <= -2] = 0;
   xC2[µ_ /; µ > -2] = 0;
   xC3[µ_ /; µ > -2] := cp[µ];
   xC4[µ_ /; µ > -2] := -cp[µ];

In[262]:= Plot[{xC1[µ], xC2[µ], xC3[µ], xC4[µ]},
   {µ, -3, -1}, PlotStyle -> {{Blue}, {Red, Dashed}, {Blue}, {Blue}},
   AspectRatio -> 1, AxesLabel -> {"µ", "xC"},
   Axes -> {False, True}, BaseStyle -> {FontSize -> 15},
   Epilog -> {Disk[{0, 0}, 0.03],
     Text["stable", {-2.5, 0.2}],
     Text["stable", {-1.5, 2.5}],
     Text["unstable", {-1.2, 0.2}],
     Text[Style["µ", FontSize -> 15], {1.9, -0.1}],
     {PointSize[Large], Point[{-2, 0}]},
     Text[Style["µ=-2", FontSize -> 15], {-1.8, 0.2}]}]

Out[262] (figure 9.6): bifurcation diagram of system (9.5) in the neighbourhood of the bifurcation point µ = −2: a supercritical pitchfork with the stable branch xC = 0 for µ ≤ −2, the unstable (dashed) branch xC = 0 for µ > −2, and two stable branches located numerically by FindRoot.

A

Important commands in Mathematica

A.1 D-derivative

Derivatives in Mathematica can be computed in several ways. Command of the form

D[f, x]

differentiates function f with respect to variable x. If we need the n-th order derivative

of f , we use

D[f, {x, n}]

Similarly, mixed partial derivatives with respect to several variables can be calculated by

D[ f, x, y ]

which is the equivalent of

∂²f / (∂x ∂y)

For example, commands

D[ Sin[x^2], x]

D[ x^3, {x, 2} ]

D[ y x^2 + x y^2, x, y]

are the equivalents of the mathematical expressions

d/dx sin x²,   d²/dx² x³,   ∂²/(∂x ∂y) (y x² + x y²)

and produce the following output


2 x Cos[x^2]

6 x

2 x + 2 y

A.2 Table

The command Table[...] creates one-dimensional or higher-dimensional lists of elements. A one-dimensional list can be created by

Table[ expr, {i, imin, imax} ]

where expr is some expression depending on the variable i. The command Table successively substitutes the values of i into the expression expr and produces a list of expressions. For example, the command

squares = Table[ i^2, {i, 1, 5} ]

produces a list

{1, 4, 9, 16, 25}

which is now stored in variable squares. In order to access individual elements of the

list, use the double square brackets [[ and ]]. For example, the third element of the list

squares can be accessed via

squares[[ 3 ]]

which returns

9.
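Higher-dimensional lists (e.g. matrices) are created by adding further iterators; a small sketch:

m = Table[ i + j, {i, 1, 3}, {j, 1, 3} ]
(* {{2, 3, 4}, {3, 4, 5}, {4, 5, 6}} *)
m[[2, 3]]
(* 5, the element in the second row and third column *)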

B

Some features of Mathematica

B.1 Rule-based replacement

One of the most powerful tools in Mathematica is the rule-based replacement. We start with a simple example. Suppose we have the trivial expression

y

and we want to replace symbol y by some more complicated expression, say y = x2 .

Let us write

y /. y-> x^2

In the previous code, symbol /. means that we are going to use some rules of replace-

ment. The rule itself is

y -> x^2

and says that any occurrence of the symbol y will be replaced by the expression x². This can

be useful when the expression is more complicated. Example:

x + y^2 - 1/y /.y->x^2

will replace all occurences of symbol y in expression x + y 2 − 1/y by x2 , so that the

result is

-(1/x^2) + x + x^4

We can define the list of rules as well. Imagine we want to replace simultaneously

x and y in some expression, for example, we want to replace x by x − 1 and y by

1 − y 2 in expression x2 + y 2 :

x^2 + y^2 /. { x-> x-1, y -> 1-y^2 }

which yields

(-1 + x)^2 + (1 - y^2)^2

Sometimes it is useful to define the rules separately in order to increase the readability of the code. The previous example is equivalent to the following:

rules = { x-> x-1, y -> 1-y^2 };

x^2 + y^2 /. rules

B.2 Functions

In Mathematica you can define functions of any type and there are many features to

be covered. Here we discuss only what is necessary for the purposes of our textbook.

A function of one or more variables is defined according to the scheme

func_name [ var1_, var2_, ... ] = expr

where func_name is the name of the new function. In square brackets you have to enumerate all variables which the function depends on. Notice the underscore symbol after the name of each variable. Assignment is performed via the traditional symbol =.

Finally, on the right hand side there is an expression for the function.

For example, you can define the function f = 3x e^(−x²) as

f[ x_ ] = 3 x Exp[-x^2];

Now you can evaluate it at some point, say 10, by

f[10]

which yields

30/E^100.

If you need numerical value, type

f[10] //N

to find the result 1.11602 × 10^(−42).

Let us see an example of function of more variables.

f[ x_, y_, z_ ] = x^2 + y^2 + z^2

To evaluate this function at some point, say (1, 2, 3), type

f[1, 2, 3]

to get number 14.

B.3 Pure functions

Pure functions are very useful constructions in Mathematica. In mathematics there is a difference between f and f(x), although these symbols are (in some contexts) used as equivalent. The symbol f is a function of, say, one variable x, which means that it maps a real number to a real number, mathematically

f : ℝ → ℝ.

On the other hand, symbol f (x) is a value of function f at point x. More precisely,

f is a set of ordered pairs (x, y) such that there is only one y for each x. If a pair

(x, y) is an element of f, i.e. (x, y) ∈ f, we usually write

y = f (x).

Thus, f is a set of ordered pairs of real numbers, while f (x) is the single real number

meaning the value of f at point x.

Let us turn back to Mathematica. When you write, for example,

f[x_] = 1 + x^2

you tell Mathematica that the value of function f at point x is f (x) = 1 + x2 . But

the name of the argument is irrelevant, for if you write

f[q_] = 1 + q^2

you define exactly the same function! The name of argument is only formal. The

alternative is to use the pure function.

Consider the following definition:

f = Function[ 1 + #^2 ]

Here we do not use the names of arguments. The hash symbol # stands for the argument of the function regardless of its name. You can verify that the function f defined in this way

behaves as function f[x] or f[q] defined above. Similarly, you can define function of

more variables by

f = Function[ #1^2 + #2^2 ]

where symbols #1 and #2 stand for the first and the second argument, respectively.

Calling

f[x,y]


now yields

x² + y²,

calling

f[1, 3]

yields number 10.

A pure function can be defined without using the command Function, by the symbol &. The following three lines are equivalent:

f[x_] = 1 + x^2

f = Function[ 1+#^2 ]

f = (1 + #^2)&

The notation with the symbol & is particularly useful if we need to use the function at one place only and do not need it later. Then it is unnecessary to define the function separately. For example, suppose that you are given a list

list = { 1, 2, 3, 4, 5 };

and you want to apply function f (x) = 1 + x2 to each element of list. We can use

operator /@:

(1+#^2)& /@ list

Here we defined a pure function (1 + #^2)& which, as we have seen, is an abstract way of defining the function 1 + x². The operator /@ now substitutes each element of list into

this pure function and produces a list

{2, 5, 10, 17, 26}.

B.4 Expressions

Anything you type in Mathematica is called an expression, and expressions can be divided into two groups, atomic and composed. Atomic expressions are the simplest elements, e.g. numbers or functions. Each expression has the so-called head, which can

be found using function Head. For example, try the following code:

Head[2]

Head[4.5]

Head[2 + 3 I ]

Mathematica returns the “values”

Integer
Real
Complex

which means that 2 was recognized as an integer, 4.5 as a real, and 2 + 3i as a complex

number. Mathematica’s power rests in its ability to work with symbolic expressions.

If the atomic expression is not identified as a number, its head is Symbol. Verify this

fact for:

Head[x]

Head[Sin]

Head[f]

etc.

Atomic expressions we have seen above can be combined into composed expres-

sions. For example, symbols x and y are atomic expressions, but their sum x + y is

a composed expression. The head of the expression x+y is Plus (check!). If you want to access particular parts of a composed expression, you can use the function Part. For example,

Part[ x+y, 2 ]

returns the second part of composed expression x + y which is y. We can also list all

parts of the composed expression by Level:

Level[ x+y-z+b, 1 ]

yields {b, x, y, -z}. Try the following:

Head[x + y]

Head[x y]

Head[ x^y ]

You can see that Mathematica returns Plus, Times, Power. If, however, you type

Head[x-y]

Mathematica returns Plus again (did you expect “minus”?). The reason is obvious,

for if we type

Level[ x-y, 1 ]

Mathematica returns {x, -y}. Thus, Mathematica treats expression x − y as a sum

of x and −y. Typing

Head[-y]

Level[-y,1]

yields


Times

{-1, y}

Therefore, −y is a product of −1 and y. The reader is invited to experiment with several expressions in order to get a feeling for the structure of Mathematica expressions.

The head of an arbitrary expression can be replaced without changing the structure of the expression. For example, the expressions

x + y + z

x y z

{x, y, z}

all have the same structure and differ only by the head. We can verify that by the functions

Head and Level. These functions reveal that the head of the first expression is Plus,

the head of the second one is Times and the head of the third expression is List.

Nevertheless, calling the function Level shows that the structure of all expressions is

{x, y, z}.

Therefore, by changing the head, we can easily convert these expressions into one another. The head of the expression can be changed by the function Apply, as in the

following example:

Apply[ Plus, {x, y, z} ]

turns the list {x,y,z} into expression x+y+z. The same operation can be written in

an abbreviated form as

Plus @@ {x, y, z}

The head Plus is applied by operator @@ to the list on the right hand side.

C

Shortcuts in Mathematica

Greek letters can be typed in several ways. The most convenient is to use the following

table:

α ESC a ESC ι ESC i ESC σ ESC s ESC

β ESC b ESC κ ESC k ESC τ ESC t ESC

γ ESC g ESC λ ESC l ESC φ ESC f ESC

δ ESC d ESC µ ESC m ESC χ ESC c ESC

ε ESC e ESC ν ESC n ESC ψ ESC y ESC

ζ ESC z ESC ξ ESC x ESC ω ESC w ESC

η ESC h ESC π ESC p ESC

θ ESC q ESC ρ ESC r ESC

For example, to type α just press the Escape key, then type a and press Escape again. Mathematica will automatically display the symbol α. Another way is to use the following table of full names:

α \[Alpha] ι \[Iota] σ \[Sigma]
β \[Beta] κ \[Kappa] τ \[Tau]

γ \[Gamma] λ \[Lambda] φ \[Phi]

δ \[Delta] µ \[Mu] χ \[Chi]

ε \[Epsilon] ν \[Nu] ψ \[Psi]

ζ \[Zeta] ξ \[Xi] ω \[Omega]

η \[Eta] π \[Pi]

θ \[Theta] ρ \[Rho]

D

To do

• Rotation matrices

• Full analysis of chaotic pendulum

• Matrix eigenvalues

• Volterra-Lotka equations

• Pictures on Lyapunov stability

• More coordinate systems
