
Lecture Notes: Mathematical Methods I

S Chaturvedi
October 3, 2016

Contents
1 Finite dimensional Vector Spaces
  1.1 Vector space
  1.2 Examples
  1.3 Linear combinations, Linear Span
  1.4 Linear independence
  1.5 Dimension
  1.6 Basis
  1.7 Representation of a vector in a given basis
  1.8 Relation between bases
  1.9 Subspace
  1.10 Basis for a vector space adapted to its subspace
  1.11 Direct Sum and Sum
  1.12 Linear Operators
  1.13 Null space, Range and Rank of a linear operator
  1.14 Invertibility
  1.15 Invariant subspace of a linear operator
  1.16 Eigenvalues and Eigenvectors of a linear operator
  1.17 Representation of a linear operator in a given basis
  1.18 Change of basis
  1.19 Diagonalizability
  1.20 From linear operators to Matrices
  1.21 Rank of a matrix
  1.22 Eigenvalues and Eigenvectors of a matrix
  1.23 Diagonalizability
  1.24 Jordan Canonical Form
  1.25 Cayley Hamilton Theorem
  1.26 Scalar or inner product, Hilbert Space
  1.27 Orthogonal complement
  1.28 Orthonormal Bases
  1.29 Relation between orthonormal bases
  1.30 Gram Schmidt Orthogonalization procedure
  1.31 Gram Matrix
  1.32 Adjoint of a linear operator
  1.33 Special kinds of linear operators, their matrices and their properties
  1.34 Simultaneous Diagonalizability of Self adjoint operators
  1.35 Simultaneous reduction of quadratic forms
  1.36 Standard constructions of new vector spaces from old ones

2 Second order differential equations
  2.1 Power series, interval of convergence
  2.2 Ordinary, singular and regular singular points
  2.3 Solution around an ordinary point
  2.4 Solution around a regular singular point
  2.5 Example: Bessel Equation
  2.6 Second order diff. eqns : Sturm Liouville form
  2.7 Sturm Liouville form: Polynomial solutions

1 Finite dimensional Vector Spaces

1.1 Vector space

A vector space V is a set of mathematical objects, called vectors, written as
x, y, z, u, ..., equipped with two operations - addition and multiplication by
scalars - such that the following hold:

- For any pair x, y ∈ V, x + y = y + x is also in V [Closure under addition
  and commutativity of addition]
- For any x, y, z ∈ V, x + (y + z) = (x + y) + z [Associativity of addition]
- There is a unique zero vector 0 ∈ V such that, for any x ∈ V, x + 0 = x
  [Additive identity]
- For each x ∈ V there is a vector, denoted by -x, such that x + (-x) = 0
  [Additive inverse]
- For any scalar α and any x ∈ V, αx is also in V [Closure under scalar
  multiplication]
- For any x ∈ V, 0.x = 0 and 1.x = x
- For any scalar α and any pair x, y ∈ V, α(x + y) = αx + αy. Further,
  for any pair of scalars α, β and any x ∈ V, α(βx) = (αβ)x and (α + β)x =
  αx + βx

Depending on whether the scalars are real numbers or complex numbers one
speaks of a real or a complex vector space. In general, the scalars may
be drawn from any field, usually denoted by F, and in that case we speak
of a vector space over the field F. (A field F is a set equipped with two
composition laws - addition and multiplication - such that F is an abelian group
under addition (with 0 denoting the additive identity element) and F* =
F \ {0} is an abelian group with respect to multiplication (with 1 denoting
the multiplicative identity element). Two familiar examples of fields are the
set of real numbers and the set of complex numbers. Both of these are
infinite fields. Another not so familiar example of an infinite field is the
field of rational numbers. Finite fields also exist, but they come only in
sizes p^n where p is a prime number.) In what follows we will consider only
the real or the complex field. It is, however, important to appreciate that
the choice of the field is an integral part of the definition of the vector space.

1.2 Examples

- M_{n×m}(C): the set of n × m complex matrices.
- M_n(C): the set of n × n complex matrices.
- C^n: the set of n-dimensional columns.
- P_n(t): the set of polynomials {a0 + a1 t + a2 t^2 + ··· + a_{n-1} t^{n-1}} in t of
  degree less than n with real or complex coefficients.

(The vector space M_n(C) can also be viewed as the set of all linear operators
on C^n. We note that the vector space C^n is of special interest, as all finite
dimensional vector spaces of dimension d are isomorphic to C^d, as we shall
see later.)

1.3 Linear combinations, Linear Span

For vectors x1, x2, ..., xn ∈ V and scalars α1, α2, ..., αn, we say that the
vector x = α1 x1 + α2 x2 + ··· + αn xn is a linear combination of x1, x2, ..., xn
with coefficients α1, α2, ..., αn.
The set of all linear combinations of a given set of vectors x1, x2, ..., xn ∈
V is called the linear span of the vectors x1, x2, ..., xn and is itself a vector
space.

1.4 Linear independence

A set of vectors x1, x2, ..., xn ∈ V is said to be linearly independent if

    α1 x1 + α2 x2 + ··· + αn xn = 0  ⇒  α1 = α2 = ··· = αn = 0.

Otherwise the set is said to be linearly dependent.

1.5 Dimension

A vector space is said to be of dimension n if there exists a set of n linearly
independent vectors but every set of n + 1 vectors is linearly dependent.
On the other hand, if for every integer n it is possible to find n linearly
independent vectors then the vector space is said to be infinite dimensional.
In what follows we will exclusively deal with finite dimensional vector
spaces.

1.6 Basis

In a finite dimensional vector space V of dimension n, any set of n linearly
independent vectors x1, x2, ..., xn ∈ V is said to provide a basis for V. In
general there are infinitely many bases for a given vector space.

1.7 Representation of a vector in a given basis

Given a basis e1, e2, ..., en ∈ V, any x ∈ V can be uniquely written as x =
x1 e1 + x2 e2 + ··· + xn en. The coefficients x1, ..., xn, called the components of
x in the basis e1, e2, ..., en, can be arranged in the form of a column vector

    x ↦ x = (x1, x2, ..., xn)^T

In particular

    e1 ↦ e1 = (1, 0, ..., 0)^T,  ... ,  en ↦ en = (0, 0, ..., 1)^T

The column vector x is called the representation of x ∈ V in the basis
e1, e2, ..., en.

1.8 Relation between bases

Let e1, e2, ..., en and e'1, e'2, ..., e'n be two bases for a vector space V. Since
e1, e2, ..., en is a basis, each e'_i can be written as a linear combination of
e1, e2, ..., en:

    e'_i = Σ_{j=1}^{n} S_{ji} e_j

The matrix S must necessarily be invertible, since e'1, e'2, ..., e'n is also a basis
and each e_i can in turn be written as a linear combination of e'1, e'2, ..., e'n:

    e_i = Σ_{j=1}^{n} (S^{-1})_{ji} e'_j

Two bases are thus related to each other through an invertible matrix S;
there are as many bases in a vector space of dimension n as there are n × n
invertible matrices. The set of all n × n invertible real (complex) matrices
forms a group denoted by GL(n, R) (GL(n, C)). Under a change of basis the
components x of a vector x in the basis e1, e2, ..., en are related to the components
x' of x in the basis e'1, e'2, ..., e'n as follows:

    x'_i = Σ_{j=1}^{n} (S^{-1})_{ij} x_j

1.9 Subspace

A subset V1 of V which is a vector space in its own right is called a subspace
of V.

1.10 Basis for a vector space adapted to its subspace

If V1 is a subspace of dimension m of a vector space V of dimension n, then a
basis e1, e2, ..., em, em+1, ..., en such that the first m vectors e1, e2, ..., em
provide a basis for V1 is called a basis for V adapted to V1.

1.11 Direct Sum and Sum

A vector space V is said to be a direct sum of its subspaces V1 and V2,
V = V1 ⊕ V2, if every vector x ∈ V can be uniquely written as x = u + v
where u ∈ V1 and v ∈ V2. The requirement of uniqueness implies that the
two subspaces V1 and V2 of V can have no vectors in common except the zero
vector. This has the consequence that dim V1 + dim V2 = dim V.
Given a subspace V1 of V, there is no unique choice for the subspace V2
such that V = V1 ⊕ V2; there are infinitely many ways in which this can be
done.
If the requirement of the uniqueness of the decomposition x = u + v is
dropped then one says that V is a sum of V1 and V2. In this case V1 and
V2 do have vectors in common other than the zero vector. The common vectors
themselves form a subspace of V of dimension equal to dim V1 + dim V2 - dim V.

1.12 Linear Operators

A linear operator A on a vector space V is a rule which assigns, to any vector
x, another vector Ax such that A(αx + βy) = αAx + βAy for any x, y ∈ V
and any scalars α, β.
Linear operators on a vector space V of dimension n themselves form a
vector space of dimension n^2.

1.13 Null space, Range and Rank of a linear operator

Given a linear operator A, the set of vectors obtained by applying A to all of
V, written symbolically as AV, forms a subspace of V called the range of A.
The dimension of the range of A is called the rank of A. The set of all vectors
x such that Ax = 0, i.e. the set of all vectors which get mapped to the zero
vector, also forms a subspace of V called the null space of A. Clearly the rank
of A is equal to the dimension of V minus the dimension of the null space.

1.14 Invertibility

An operator is said to be invertible if its range is the whole of V or, in other
words, its null space is trivial: there is no nonzero vector x ∈ V such that
Ax = 0.

1.15 Invariant subspace of a linear operator

A subspace V1 of V is said to be an invariant subspace of a linear operator
A if Ax ∈ V1 whenever x ∈ V1.

1.16 Eigenvalues and Eigenvectors of a linear operator

A nonzero vector x ∈ V is said to be an eigenvector of A if Ax = λx, and λ
is called the corresponding eigenvalue. Note that if x is an eigenvector of A
corresponding to the eigenvalue λ then so is αx for any nonzero scalar α.

1.17 Representation of a linear operator in a given basis

It is evident that a linear operator on V, owing to linearity, is completely
specified by its action on a chosen basis e1, e2, ..., en for V:

    A e_i = Σ_{j=1}^{n} A_{ji} e_j

The matrix A of the coefficients A_{ij} is called the representation of the linear
operator in the basis e1, e2, ..., en.
It can further be seen that if the linear operators A and B are represented
by the matrices A and B respectively in a given basis in V, then the
operator AB is represented in the same basis by the matrix AB.

1.18 Change of basis

Clearly the representation of a linear operator depends on the chosen basis. If we change the basis, the matrix representing the operator will also
change. Let A and A' be the representations of the linear operator in the
bases e1, e2, ..., en and e'1, e'2, ..., e'n related to each other as

    e'_i = Σ_{j=1}^{n} S_{ji} e_j

then the representations A and A' are related to each other as A' = S^{-1} A S.
Thus under a change of basis the representation of a linear operator undergoes
a similarity transformation: A → A' = S^{-1} A S.
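This bookkeeping is easy to check numerically. The following sketch (not part of the original notes; it assumes NumPy is available) takes the columns of a random invertible matrix S as the new basis vectors expressed in the old basis and verifies that vector components transform as x' = S^{-1}x while the operator matrix transforms as A' = S^{-1}AS:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4

    A = rng.standard_normal((n, n))   # matrix of an operator in the old basis
    x = rng.standard_normal(n)        # components of a vector in the old basis
    S = rng.standard_normal((n, n))   # columns = new basis vectors (random, hence
                                      # invertible with probability 1)

    x_new = np.linalg.solve(S, x)     # x' = S^{-1} x
    A_new = np.linalg.solve(S, A) @ S # A' = S^{-1} A S

    # The action of the operator is basis independent: (Ax)' = A' x'
    print(np.allclose(np.linalg.solve(S, A @ x), A_new @ x_new))   # True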

1.19 Diagonalizability

A linear operator is said to be diagonalizable if one can find a basis in V
such that it is represented in that basis by a diagonal matrix. If this can
be done then clearly each of the basis vectors must be an eigenvector of A.
This also means that for an operator to be diagonalizable its eigenvectors
must furnish a basis for V, i.e. the n eigenvectors of A must be linearly
independent.

1.20 From linear operators to Matrices

From the discussion above it is evident that for any vector space V of dimension n, whatever be its nature, after fixing a basis we can make the following
identifications:

- x ∈ V ↔ its column of components x ∈ C^n
- a linear operator A on V ↔ its representing matrix A ∈ M_n(C)
- rank of the operator A ↔ rank of the matrix A
- invertibility of the operator A ↔ invertibility of the matrix A
- diagonalizability of the operator A ↔ diagonalizability of the matrix A
- eigenvalues and eigenvectors of the operator A ↔ eigenvalues and eigenvectors of the matrix A

In mathematical terms, every finite dimensional vector space of dimension n
is isomorphic to C^n.

1.21 Rank of a matrix

The rank of an n × n matrix A = (x1, x2, ..., xn), with columns x1, ..., xn, equals
the size of the largest subset of the columns x1, x2, ..., xn that is linearly
independent. It also equals the size of the largest non vanishing minor of A.
Alternatively, one may compute the number of linearly independent solutions
to Ax = 0. This gives the dimension of the null space of A. This number
subtracted from n gives the rank of A.

1.22 Eigenvalues and Eigenvectors of a matrix

The eigenvalue problem Ax = λx may be rewritten as (A - λI)x = 0. This
set of homogeneous linear equations has a non trivial solution if and only if
Det(A - λI) equals zero. This yields an nth degree polynomial equation in
λ:

    C(λ) = λ^n + c_{n-1} λ^{n-1} + c_{n-2} λ^{n-2} + ··· + c_0 = 0

whose roots give the eigenvalues. The polynomial C(λ) is called the characteristic polynomial and the equation C(λ) = 0 the characteristic equation of
A. It is here that the role of the field F comes to the fore. In general, there is
no guarantee that an nth degree polynomial with coefficients in F has n roots
also in F. This is, however, true for the field of complex numbers - one says
that the complex field is algebraically closed - and this is the main reason
behind considering vector spaces over the complex field.
The roots λ1, ..., λn of the characteristic equation, the eigenvalues or the
spectrum of A, may all be distinct or some of them may occur several times.
An eigenvalue that occurs more than once is said to be degenerate and the
number of times it occurs is called its (algebraic) multiplicity or degeneracy.
Having found the eigenvalues one proceeds to construct the corresponding
eigenvectors. Two situations may arise:

- An eigenvalue λ_k is non degenerate. In this case there is essentially
  (i.e. up to multiplication by a scalar) only one eigenvector corresponding
  to that eigenvalue.
- An eigenvalue λ_k is μ-fold degenerate. In this case one may or may
  not find μ linearly independent eigenvectors. Further, there is much
  greater freedom in choosing the eigenvectors: any linear combination
  of the eigenvectors corresponding to a degenerate eigenvalue is also an
  eigenvector corresponding to that eigenvalue.

Given the fact that the eigenvectors corresponding to distinct eigenvalues are
always linearly independent, we can make the following statements:

- If the eigenvalues of an n × n matrix A are all distinct then the corresponding eigenvectors, n in number, are linearly independent and hence
  form a basis in C^n.
- If this is not so, the eigenvectors may or may not furnish n linearly independent vectors. (Special kinds of matrices for which the existence of n linearly
  independent eigenvectors is guaranteed, regardless of the degeneracies or
  otherwise in the spectrum, will be considered later.)
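As a small numerical sketch (assuming NumPy; the example matrix is chosen arbitrarily), the characteristic polynomial, eigenvalues and eigenvectors of a matrix with a degenerate eigenvalue can be examined as follows:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])

    # Coefficients of Det(lambda*I - A), highest power first
    print(np.poly(A))                     # [ 1. -7. 16. -12.], i.e. l^3 - 7 l^2 + 16 l - 12

    eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are the eigenvectors
    print(eigvals)                        # [2. 2. 3.]: the eigenvalue 2 is doubly degenerate

    # Here the degenerate eigenvalue 2 admits only one independent eigenvector,
    # so A has only two linearly independent eigenvectors in total.
    print(np.linalg.matrix_rank(eigvecs)) # 2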

1.23 Diagonalizability

An n × n matrix A is diagonalizable, i.e. there exists a matrix S such that

    S^{-1} A S = Diag(λ1, ..., λn)

if and only if the eigenvectors x1, ..., xn corresponding to the eigenvalues
(λ1, ..., λn) are linearly independent; the matrix S is simply obtained
by putting the eigenvectors side by side:

    S = (x1 x2 ··· xn)

In view of what has been said above, a matrix whose eigenvalues are all
distinct can certainly be diagonalized. When this is not so, i.e. when one or
more eigenvalues are degenerate, we may or may not be able to diagonalize,
depending on whether or not the matrix has n linearly independent eigenvectors. If
the matrix can not be diagonalized, what is the best we can do? This leads us
to the Jordan canonical form (of which the diagonal form is a special case).
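A numerical sketch of this recipe (assuming NumPy): place the eigenvectors side by side to form S and check that S^{-1}AS is diagonal. The matrix used here has distinct eigenvalues, so diagonalizability is guaranteed.

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [2.0, 1.0]])          # eigenvalues 2 and -1 (distinct)

    eigvals, S = np.linalg.eig(A)       # S has the eigenvectors as its columns
    D = np.linalg.inv(S) @ A @ S        # S^{-1} A S

    print(np.round(D, 10))              # diagonal, with the eigenvalues on the diagonal
    print(np.allclose(D, np.diag(eigvals)))   # True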

1.24 Jordan Canonical Form

Consider a matrix A whose eigenvalues are (λ1, λ2, ..., λn). Some of the
entries in this list may be the same. Notationally it proves convenient to replace this list by a shorter list λ_1, ..., λ_r with all distinct entries and to specify
the (algebraic) multiplicity μ_i of the entry λ_i, i = 1, ..., r. (Thus, for instance, the list (0.5, 0.5, 1.5, 1.5, 1.5, 0.3) would get abridged to (0.5, 1.5, 0.3)
with μ1 = 2, μ2 = 3, μ3 = 1.) Clearly all the μ's must add up to n:
Σ_{i=1}^{r} μ_i = n. Now let ν_i, i = 1, ..., r denote the number of linearly independent eigenvectors corresponding to the eigenvalue λ_i. This number
is also referred to as the geometric multiplicity of λ_i. It is evident that
1 ≤ ν_i ≤ μ_i, i = 1, ..., r. Further, the sum Σ_{i=1}^{r} ν_i = ℓ gives the total
number of linearly independent eigenvectors of A. It can be shown that for
every matrix A there is an S such that S^{-1} A S = J where J, the Jordan
form, can be brought to a block diagonal form J = Diag(J1, J2, ..., Jℓ), where
each block J_i, i = 1, ..., ℓ has the structure

    J_i =  | λ  1            |
           |    λ  1         |
           |       .  .      |
           |          λ  1   |
           |             λ   |

where λ is one of the eigenvalues of A. Some general statements can be
made at this stage:

- The sizes of the blocks add up to n.
- The number of times each eigenvalue occurs along the diagonal equals
  its algebraic multiplicity.
- The number of blocks in which each eigenvalue occurs equals its geometric multiplicity.

Needless to say, the diagonal form is a special case of the Jordan form
in which each block is of dimension 1.
Further details concerning the sizes of the blocks and the explicit construction of
the S which effects the Jordan form have to be worked out case by case and will
be omitted.
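For explicit examples, a computer algebra system can produce both S and J. A minimal sketch using SymPy (an assumption; any CAS with a Jordan form routine would do):

    import sympy as sp

    # A 3x3 matrix with the single eigenvalue 2 of algebraic multiplicity 3
    # but geometric multiplicity 2: one 2x2 Jordan block and one 1x1 block.
    A = sp.Matrix([[2, 1, 0],
                   [0, 2, 0],
                   [0, 0, 2]])

    S, J = A.jordan_form()     # A = S * J * S**-1
    sp.pprint(J)               # blocks [[2, 1], [0, 2]] and [2] along the diagonal
    print(S**-1 * A * S == J)  # True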

1.25 Cayley Hamilton Theorem

The Cayley Hamilton theorem states that every matrix satisfies its characteristic
equation. Thus if the characteristic equation of a 3 × 3 matrix A is λ^3 + c2 λ^2 +
c1 λ + c0 = 0, then A satisfies A^3 + c2 A^2 + c1 A + c0 I = 0. This expresses A^3,
and hence any higher power of A, as a linear combination of A^2, A and
I. This result is very useful in the explicit computation of functions f(A) of an
n × n matrix A. We illustrate below the procedure for a 3 × 3 matrix.
Recall that if A has eigenvalues λ1, ..., λn with corresponding eigenvectors x1, ..., xn, then f(A) has eigenvalues f(λ1), ..., f(λn) with x1, ..., xn
as eigenvectors.
Now consider a function f(A) of, say, a 3 × 3 matrix. The Cayley Hamilton
theorem tells us that computing any function of A (which can meaningfully
be expanded in a power series in A) reduces, in this instance, to computing
powers of A up to two:

    f(A) = a2 A^2 + a1 A + a0 I

The only thing that remains is to determine the three coefficients a2, a1, a0, and
to do that we need three equations. If the three eigenvalues are distinct, by
virtue of what was said above one obtains

    f(λ1) = a2 λ1^2 + a1 λ1 + a0
    f(λ2) = a2 λ2^2 + a1 λ2 + a0
    f(λ3) = a2 λ3^2 + a1 λ3 + a0

which, when solved for a2, a1, a0, yield the desired result.
What if one of the eigenvalues, λ1, is two fold degenerate, i.e. what if the
eigenvalues turn out to be λ1, λ1, λ2? We then get only two equations for the
three unknowns. It can be shown that in such a situation the third equation
needed to supplement the two equations

    f(λ1) = a2 λ1^2 + a1 λ1 + a0
    f(λ2) = a2 λ2^2 + a1 λ2 + a0

is

    df(λ)/dλ |_{λ=λ1} = 2 a2 λ1 + a1

What if all three eigenvalues are the same, i.e. if the eigenvalues turn
out to be λ1, λ1, λ1? The three desired equations then would be

    f(λ1) = a2 λ1^2 + a1 λ1 + a0
    df(λ)/dλ |_{λ=λ1} = 2 a2 λ1 + a1
    d^2 f(λ)/dλ^2 |_{λ=λ1} = 2 a2

One can easily recognise how the pattern outlined above extends to more
general situations.
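A numerical sketch of the procedure (assuming NumPy and SciPy), for f(A) = exp(A) with a 3 × 3 matrix having a doubly degenerate eigenvalue; the third equation is the derivative condition described above.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 2.0]])    # eigenvalues 1, 1, 2

    l1, l2 = 1.0, 2.0
    f, df = np.exp, np.exp             # f(l) = e^l, f'(l) = e^l

    # f(l1) = a2 l1^2 + a1 l1 + a0,  f'(l1) = 2 a2 l1 + a1,  f(l2) = a2 l2^2 + a1 l2 + a0
    M = np.array([[l1**2, l1, 1.0],
                  [2*l1,  1.0, 0.0],
                  [l2**2, l2, 1.0]])
    a2, a1, a0 = np.linalg.solve(M, [f(l1), df(l1), f(l2)])

    fA = a2 * A @ A + a1 * A + a0 * np.eye(3)
    print(np.allclose(fA, expm(A)))    # True: matches the direct matrix exponential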

1.26 Scalar or inner product, Hilbert Space

A scalar product is a rule which assigns a scalar, denoted by (x, y), to any
pair of vectors x, y ∈ V such that

- (x, y) = (y, x)* [hermitian symmetry]
- (x, αy + βz) = α(x, y) + β(x, z) [linearity in the second argument]
- (x, x) ≥ 0, with equality holding if and only if x = 0 [positivity]

Examples:

- C^n: (x, y) = x† y
- M_n(C): (x, y) = Tr[x† y]
- P_n(t): (x, y) = ∫_a^b dt w(t) x*(t) y(t), for any fixed w(t) such that w(t) ≥ 0
  for t ∈ (a, b)

A vector x is said to be normalized if its norm ||x|| ≡ √(x, x) = 1. If
a vector is not normalized, it can be normalized by dividing it by its norm.
Two vectors x, y ∈ H are said to be orthogonal if (x, y) = 0. A vector space
V equipped with a scalar product is called a Hilbert space H. On a given
vector space one can define a scalar product in infinitely many ways. Hilbert
spaces corresponding to the same vector space with distinct scalar products
are regarded as distinct Hilbert spaces.

1.27 Orthogonal complement

Given a subspace H1 of a Hilbert space H, the set of all vectors orthogonal
to all vectors in H1 forms a subspace, denoted by H1^⊥, called the orthogonal
complement of H1 in H. Further, as the nomenclature suggests, H = H1 ⊕ H1^⊥.

1.28 Orthonormal Bases

A basis e1, e2, ..., en ∈ H such that (ei, ej) = δ_{ij} is said to be an orthonormal
basis in H. If a vector x ∈ H is expressed in terms of the orthonormal basis
e1, ..., en as x = x1 e1 + ··· + xn en, then its components xi are simply equal to
(ei, x). Similarly, if a linear operator is represented in an orthonormal basis
e1, e2, ..., en ∈ H by a matrix A,

    A e_i = Σ_{j=1}^{n} A_{ji} e_j

then the matrix elements A_{ij} are simply given by

    A_{ij} = (e_i, A e_j)

Remember that this holds only when the chosen basis is an orthonormal basis
and not otherwise.

1.29 Relation between orthonormal bases

Two orthonormal bases e1, e2, ..., en and e'1, e'2, ..., e'n are related to each
other by a unitary matrix:

    e'_i = Σ_{j=1}^{n} U_{ji} e_j,    U† U = I

Thus in an n dimensional Hilbert space there are as many orthonormal bases
as n × n unitary matrices.

1.30 Gram Schmidt Orthogonalization procedure

Given a set of linearly independent vectors x1, x2, ..., xn, the Gram Schmidt
procedure enables one to construct out of it an orthonormal set z1, z2, ..., zn
in a recursive way. The first step consists in constructing an orthogonal basis
y1, y2, ..., yn as follows:

    y1 = x1,    y_i = x_i - Σ_{j=1}^{i-1} [(y_j, x_i)/(y_j, y_j)] y_j,    i = 2, ..., n

The desired orthonormal basis z1, z2, ..., zn is then obtained by normalizing
this orthogonal set: z_i = y_i / ||y_i||.
There are infinitely many orthogonalization procedures. However, only the
Gram-Schmidt procedure has the advantage of being sequential: if one more
vector is added to the set, the construction up to the previous step remains
unaffected.
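A direct transcription of the recursion into code (a sketch assuming NumPy, with the standard dot product on R^n playing the role of the scalar product):

    import numpy as np

    def gram_schmidt(vectors):
        """Orthonormalize a list of linearly independent vectors, in order."""
        ortho = []                                     # the y_i of the text
        for x in vectors:
            y = np.array(x, dtype=float)
            for yj in ortho:
                y -= (yj @ x) / (yj @ yj) * yj         # subtract the projection on y_j
            ortho.append(y)
        return [y / np.linalg.norm(y) for y in ortho]  # the normalized z_i

    z = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                      np.array([1.0, 0.0, 1.0]),
                      np.array([0.0, 1.0, 1.0])])
    print(np.round(np.array(z) @ np.array(z).T, 10))   # identity: (z_i, z_j) = delta_ij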

1.31 Gram Matrix

Given a set of vectors x1, x2, ..., xn, one can associate
with it a matrix G with G_{ij} = (x_i, x_j), called the Gram matrix. A necessary and sufficient condition for x1, x2, ..., xn to be linearly independent is
that the determinant of the Gram matrix be non zero. (In fact the
determinant of a Gram matrix is always ≥ 0.)
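A quick numerical check of this criterion (a sketch assuming NumPy and the standard dot product): the third vector in the second set is the sum of the first two, and the Gram determinant vanishes accordingly.

    import numpy as np

    def gram(vectors):
        V = np.array(vectors, dtype=float)
        return V @ V.T                      # G_ij = (x_i, x_j)

    independent = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
    dependent   = [[1, 0, 0], [1, 1, 0], [2, 1, 0]]   # third = first + second

    print(np.linalg.det(gram(independent)))   # nonzero (here 1.0)
    print(np.linalg.det(gram(dependent)))     # 0.0 up to rounding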

1.32 Adjoint of a linear operator

An operator, denoted by A†, such that (x, Ay) = (A†x, y) for all pairs x, y ∈
H, is called the adjoint of A. Stated in terms of a basis e1, e2, ..., en ∈ H,
this may equivalently be expressed as

    (e_i, A e_j) = (A† e_i, e_j) = (e_j, A† e_i)*

If the basis is chosen to be an orthonormal basis then, after recognising that (e_i, A e_j) and
(e_i, A† e_j) are simply the matrix elements A_{ij} and (A†)_{ij} of the matrices representing
A and A† respectively in the chosen basis, the last equation translates into

    (A†)_{ij} = A*_{ji}

i.e. the matrix for A† is simply the complex conjugate transpose of the matrix
A for A. Remember that this is so only if the basis chosen is an orthonormal
basis and is not so otherwise.

1.33 Special kinds of linear operators, their matrices and their properties

- Self adjoint or Hermitian operator: An operator A for which A† = A,
  or in other words (x, Ay) = (Ax, y) for all pairs x, y, is called a self
  adjoint operator. Such an operator can be shown to have the following
  properties:
  - Its eigenvalues are real.
  - The eigenvectors corresponding to distinct eigenvalues are orthogonal.
  - Its eigenvectors are linearly independent and therefore it can always
    be diagonalized, regardless of whether its eigenvalues are distinct or
    not. Its eigenvectors can always be chosen to form an orthonormal
    basis.
  - In an orthonormal basis, a self adjoint operator A is represented
    by a Hermitian matrix A, A† = A.
  - A Hermitian matrix A can always be diagonalized by a unitary
    matrix: U† A U = Diag.

- Unitary operator: An operator U such that (Ux, Uy) = (x, y) for all
  pairs x, y is called a unitary operator. Such an operator can be shown
  to have the following properties:
  - Its eigenvalues are of unit modulus.
  - The eigenvectors corresponding to distinct eigenvalues are orthogonal.
  - Its eigenvectors are linearly independent and therefore it can always
    be diagonalized, regardless of whether its eigenvalues are distinct or
    not. Its eigenvectors can always be chosen to form an orthonormal
    basis.
  - In an orthonormal basis, a unitary operator U is represented by a
    unitary matrix U, U† U = I.

- Positive operator: An operator A such that (x, Ax) ≥ 0 for all
  x is called a positive (or non negative) operator. Such an operator
  can be shown to have the following properties:
  - Its eigenvalues are ≥ 0.
  - It is necessarily self adjoint and hence inherits all the properties
    of a self adjoint operator.

- Projection operator: A self adjoint operator P such that P^2 = P is
  called a projection operator. Such an operator can be shown to have the
  following properties:
  - Its eigenvalues are either 1 or 0.
  - Being self adjoint, it has all the properties of a self adjoint operator.
  - The number of 1's in its spectrum gives its rank.
  - A projection operator P of rank m fixes an m dimensional subspace of H. The operator Id - P is also a projection operator, of
    rank n - m, and fixes the orthogonal complement of the subspace
    corresponding to P.
  - If P1, P2, ..., Pn denote the projection operators corresponding
    to the one dimensional subspaces determined by an orthonormal
    basis e1, e2, ..., en ∈ H, then

        P_i P_j = δ_{ij} P_i,    P1 + P2 + ··· + Pn = Id

  - If e1, e2, ..., en ∈ H is an eigenbasis of a self adjoint operator
    A corresponding to the eigenvalues λ1, ..., λn, then A may be
    resolved as

        A = λ1 P1 + λ2 P2 + ··· + λn Pn    [Spectral Decomposition]

- Real symmetric matrices: These arise in the study of quadratic forms
  over the real field. An expression q(x1, ..., xn) of the form

      q(x1, ..., xn) = Σ_{i,j=1}^{n} A_{ij} x_i x_j,    A_{ij} ∈ R, x ∈ R^n,

  i.e. a real homogeneous polynomial of degree 2, is called a real quadratic
  form in n variables. A real quadratic form can be compactly expressed
  as q(x) = x^T A x where A is a real symmetric matrix. Under a linear
  change of variables x → y = S^{-1} x, A suffers a congruence transformation: A → A' = S^T A S. Given a real symmetric matrix A, can we
  always find a matrix S such that S^T A S is diagonal, so that the quadratic
  expression in the new variables has only squares and no cross terms?
  The answer is yes:
  - Every real symmetric A has real eigenvalues.
  - Its eigenvectors are real and can always be chosen to form an
    orthonormal basis.
  - An orthogonal matrix S, S^T S = I, can always be found such that
    S^T A S = Diag. The entries along the diagonal are the eigenvalues
    of A.
  - The matrix S is constructed by putting the eigenvectors of A side
    by side.

- 2n dimensional real symmetric positive matrices A can always be diagonalized by a congruence transformation through a symplectic matrix:

      S^T A S = Diag,    S^T β S = β,    β = |  0  I |
                                             | -I  0 |

  The entries along the diagonal are not the eigenvalues of A but rather
  what are known as symplectic eigenvalues of A.
  Symplectic matrices arise naturally in the context of linear canonical
  transformations in the Hamiltonian formulation of classical mechanics
  and quantum mechanics. (Linear canonical transformations in classical
  mechanics (quantum mechanics) are those linear transformations
  which preserve the fundamental Poisson brackets (commutation relations).)

All these operators (matrices) are special cases of normal operators (matrices),
i.e. operators A for which A and its adjoint A† commute with each other: [A, A†] = 0, where
[A, B] ≡ AB - BA.
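The spectral decomposition of a Hermitian matrix can be verified numerically; the sketch below (assuming NumPy) builds the rank-one projectors P_i from an orthonormal eigenbasis and reassembles A.

    import numpy as np

    A = np.array([[2.0, 1.0j],
                  [-1.0j, 2.0]])                   # a Hermitian matrix, A = A^dagger

    eigvals, U = np.linalg.eigh(A)                 # eigh: real eigenvalues, unitary U
    projectors = [np.outer(U[:, i], U[:, i].conj()) for i in range(len(eigvals))]

    print(np.allclose(sum(projectors), np.eye(2)))                          # P1 + P2 = Id
    print(np.allclose(sum(l * P for l, P in zip(eigvals, projectors)), A))  # A = sum l_i P_i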


1.34 Simultaneous Diagonalizability of Self adjoint operators

Two self adjoint operators A, B, with A† = A, B† = B, can be diagonalized simultaneously by a unitary transformation if and only if they commute: [A, B] = 0.
The task of diagonalizing two commuting self adjoint operators essentially
consists in constructing a common eigenbasis which, as we know, can always
be chosen as an orthonormal basis. The unitary operator which effects the
simultaneous diagonalization is then obtained by putting the elements of
the common basis side by side as usual. If one of the two has a non degenerate
spectrum then its eigenbasis is also an eigenbasis of the other. More work
is needed if neither of the two has a non degenerate spectrum: suitable linear
combinations of the eigenvectors corresponding to a degenerate eigenvalue
have to be constructed so that they are also eigenvectors of the other.
The significance of this result in the context of quantum mechanics arises
in the process of labelling the elements of a basis in the Hilbert space by the
eigenvalues of a commuting set of operators.
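A sketch of the generic case (assuming NumPy): two commuting real symmetric matrices, one of which has a non degenerate spectrum, so its eigenbasis automatically diagonalizes the other.

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0]])          # symmetric, eigenvalues 0 and 2 (non degenerate)
    B = np.array([[3.0, 1.0],
                  [1.0, 3.0]])          # symmetric and equal to A + 2*I, so [A, B] = 0

    print(np.allclose(A @ B, B @ A))    # True: the two matrices commute

    _, U = np.linalg.eigh(A)            # orthonormal eigenbasis of A
    print(np.round(U.T @ A @ U, 10))    # diagonal
    print(np.round(U.T @ B @ U, 10))    # also diagonal, in the same basis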

1.35 Simultaneous reduction of quadratic forms

If A is a real symmetric strictly positive matrix and B a real symmetric
matrix then there is an S such that

    S^T A S = Id,    S^T B S = Diag

This result is of considerable relevance in the context of finding the normal modes
of oscillation of coupled harmonic oscillators:

    M d^2 x/dt^2 = -K x,

where M is a real positive matrix and K is a real symmetric matrix.
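In practice the normal mode problem is solved as the generalized eigenvalue problem K v = ω² M v. A sketch (assuming SciPy) for two identical masses coupled by three identical springs, which has the familiar frequencies √(k/m) and √(3k/m):

    import numpy as np
    from scipy.linalg import eigh

    m, k = 1.0, 1.0
    M = np.diag([m, m])                   # kinetic (mass) matrix, positive definite
    K = np.array([[2*k, -k],
                  [-k, 2*k]])             # potential (stiffness) matrix

    omega2, S = eigh(K, M)                # generalized problem K v = omega^2 M v
    print(np.sqrt(omega2))                # [1.0, 1.732...] = sqrt(k/m), sqrt(3k/m)

    # The columns of S realize the simultaneous reduction of the two quadratic forms:
    print(np.allclose(S.T @ M @ S, np.eye(2)))   # S^T M S = Id
    print(np.round(S.T @ K @ S, 10))             # S^T K S = Diag(omega^2)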

1.36 Standard constructions of new vector spaces from old ones

- Quotient spaces: Given a vector space V and a subspace V1 thereof, one
  can decompose V into disjoint subsets using the equivalence relation
  that two elements of V are equivalent if they differ from each other by
  an element of V1. The subsets, the equivalence classes, themselves form
  a vector space V/V1, called the quotient of V by V1, of dimension equal
  to the difference in the dimensions of V and V1.

- Dual of a vector space: Given a vector space V, the set of all linear
  functionals on V themselves form a vector space V*, of the same dimension as V. Here by a linear functional on V one means a rule which
  assigns a scalar to each element in V respecting linearity.

- Tensor product of vector spaces: Consider two vector spaces V1 and
  V2 of dimensions n and m respectively. Let e1, ..., en and f1, ..., fm
  denote bases in V1 and V2 respectively. By introducing a formal
  symbol ⊗, we construct a set of nm objects e_i ⊗ f_j; i = 1, ..., n, j =
  1, ..., m, and decree them as the basis for a new vector space V1 ⊗ V2
  of dimension nm: elements x of V1 ⊗ V2 are taken to be all linear
  combinations of the e_i ⊗ f_j,

      x = Σ_{i=1}^{n} Σ_{j=1}^{m} α_{ij} e_i ⊗ f_j

  (It is assumed that the formal symbol ⊗ satisfies certain common sense
  properties such as (u + v) ⊗ z = u ⊗ z + v ⊗ z; (αu) ⊗ z = α(u ⊗ z) =
  u ⊗ (αz), etc.)

Here a few comments are in order:

- Elements x of V1 ⊗ V2 can be divided into two categories: product or
  separable vectors, i.e. those which can be written as u ⊗ v with u ∈ V1, v ∈ V2,
  and non separable or entangled vectors, i.e. those which can not be
  written in this form.
- Operators A and B on V1 and V2 may respectively be extended to
  operators on V1 ⊗ V2 as A ⊗ I and I ⊗ B.
- Operators on V1 ⊗ V2 can be divided into two categories: local operators,
  i.e. those which can be written as A ⊗ B, and non local operators, i.e.
  those which can not be written in this way.
- If the operators A and B on V1 and V2 are represented by the matrices
  A and B in the bases e1, ..., en and f1, ..., fm, then the operator A ⊗
  B is represented in the lexicographically ordered basis e_i ⊗ f_j; i =
  1, ..., n, j = 1, ..., m by the matrix A ⊗ B, where

      A ⊗ B = | A_{11} B  ···  A_{1n} B |
              |    ···    ···     ···   |
              | A_{n1} B  ···  A_{nn} B |

This construction can easily be extended to tensor products of three or
more vector spaces.
Tensor products of vector spaces arise naturally in the description of
composite systems in quantum mechanics. The notion of entanglement
plays a crucial role in quantum information theory.
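The matrix A ⊗ B above is exactly the Kronecker product. A small sketch (assuming NumPy) checks the defining property (A ⊗ B)(u ⊗ v) = (Au) ⊗ (Bv) in the lexicographically ordered basis:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((2, 2))     # operator on V1, dim 2
    B = rng.standard_normal((3, 3))     # operator on V2, dim 3
    u = rng.standard_normal(2)
    v = rng.standard_normal(3)

    AB = np.kron(A, B)                  # 6x6 matrix with blocks A_ij * B
    uv = np.kron(u, v)                  # components of the product vector u ⊗ v

    print(np.allclose(AB @ uv, np.kron(A @ u, B @ v)))   # True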


2 Second order differential equations

2.1 Power series, interval of convergence

An infinite series of the form Σ_{n=0}^{∞} a_n (x - x0)^n is called a power series
around x0. It converges for values of x lying in the interval |x - x0| < R,
where R, the radius of convergence of the power series, is given by

    R = lim_{n→∞} |a_n / a_{n+1}|

Stated in words, the infinite series converges for values of x lying in the
interval x0 - R to x0 + R, where R is to be computed as above.
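A quick numeric sketch of the ratio test (assuming NumPy), for the series with coefficients a_n = 2^n/(n + 1), whose radius of convergence is 1/2:

    import numpy as np

    n = np.arange(500)
    a = 2.0**n / (n + 1)                 # coefficients of sum a_n x^n

    ratios = np.abs(a[:-1] / a[1:])
    print(ratios[-1])                    # ~ 0.5: |a_n / a_{n+1}| -> R as n grows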

2.2 Ordinary, singular and regular singular points

A point x = x0 of a second order differential equation

    d^2 y/dx^2 + p(x) dy/dx + q(x) y = 0

is said to be an ordinary point of the differential equation if both p(x) and
q(x) are analytic at x = x0, i.e. both can be expanded in a power series
around x0:

    p(x) = Σ_{n=0}^{∞} p_n (x - x0)^n,    q(x) = Σ_{n=0}^{∞} q_n (x - x0)^n

If this is not so, i.e. if either p(x) or q(x) or both are not analytic at x = x0,
then x0 is said to be a singular point.
If x = x0 is a singular point such that (x - x0) p(x) and (x - x0)^2 q(x) are
analytic at x = x0, then x0 is called a regular singular point of the differential
equation.


2.3 Solution around an ordinary point

Hereafter, without any loss of generality, we will choose x0 = 0.
It can be shown that if both p(x) and q(x) are analytic at x = 0 then so
is the solution y(x) of the differential equation. Further, the power series for
y(x) around x = 0 converges at least in the common interval of convergence
of those for p(x) and q(x). To explicitly solve the differential equation one
therefore makes the ansatz:

    y(x) = Σ_{n=0}^{∞} a_n x^n

and puts it in the differential equation. On equating like powers of x on both
sides one obtains, in general, a two step recursion formula of the form

    (··) a_{n+2} + (··) a_{n+1} + (··) a_n = 0,    n = 0, 1, 2, ...

for the coefficients a_n. The recursion formula therefore determines all the
a_n's in terms of a0 and a1. Putting the expressions for a_n in the power series for
y(x) then yields

    y(x) = a0 y1(x) + a1 y2(x)

where the two functions y1(x), y2(x) provide us with the two independent
solutions of the differential equation. Any solution can be expressed as a
linear combination thereof.
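A minimal sketch of the procedure (assuming NumPy) for y'' + y = 0 around x = 0, where equating powers gives the recursion a_{n+2} = -a_n / ((n+2)(n+1)); choosing (a0, a1) = (1, 0) reproduces cos x:

    import numpy as np

    N = 20
    a = np.zeros(N)
    a[0], a[1] = 1.0, 0.0                       # the two free coefficients a0 and a1
    for n in range(N - 2):
        a[n + 2] = -a[n] / ((n + 2) * (n + 1))  # recursion from equating powers of x

    x = 0.7
    print(sum(a[n] * x**n for n in range(N)), np.cos(x))   # both ~ 0.7648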

2.4 Solution around a regular singular point

To explicitly solve the differential equation around a regular singular point
one makes the ansatz:

    y(x, λ) = Σ_{n=0}^{∞} a_n x^{n+λ}

Substituting it in the differential equation and equating like powers of x
on both sides one obtains:

1. the indicial equation, a quadratic equation for λ:

       (λ - λ1)(λ - λ2) = 0

2. a one step recursion relation for the a_n's of the form

       (··) a_{n+1} + (··) a_n = 0,    n = 0, 1, ...

   The coefficient of a_{n+1} always turns out to be proportional to
   (n + 1 + λ - λ1)(n + 1 + λ - λ2).

Note that this method can be used for solving the differential equation
around an ordinary point as well.
Three situations may arise.

Case I: (λ1 - λ2) is neither zero nor an integer. In this case solve the recursion relation
to obtain a_n, n = 1, 2, ... in terms of a0 for arbitrary λ to obtain

    y(x, λ) = a0 Σ_{n=0}^{∞} (··) x^{n+λ}

The two independent solutions y1(x) and y2(x) are then given by

    y1(x) = y(x, λ1);    y2(x) = y(x, λ2)

Case II: (λ1 - λ2) = 0. In this case the two independent solutions y1(x) and
y2(x) are given by

    y1(x) = y(x, λ1);    y2(x) = ∂y(x, λ)/∂λ |_{λ=λ1}

Case III: (λ1 - λ2) = N, a positive integer. In this case finding the solution
y1(x) corresponding to the larger root λ1 presents no difficulties and, as
before, it is given by

    y1(x) = y(x, λ1)

Difficulties arise when one tries to find the solution corresponding to the
smaller root. Here, for λ = λ2, one finds that the factor in front of a_N in the
expression relating a_N to a_{N-1} becomes zero. Two situations may arise:

1. A:  0 · a_N = 0  (the equation is satisfied for any a_N)
2. B:  0 · a_N ≠ 0  (the equation can not be satisfied)

In case A one can simply put a_N = 0 (if one doesn't, one simply
generates an expression proportional to the first solution).
In case B the second solution is obtained by putting a0 = (λ - λ2) b0 and
solving for the a_n's in terms of b0 to obtain

    y(x, λ) = b0 Σ_{n=0}^{∞} (··) x^{n+λ}

The second solution is then given by

    y2(x) = ∂y(x, λ)/∂λ |_{λ=λ2}

Note that in cases II and III B the second solution, in general, has
the structure

    y2(x) = log(x) Σ_{n=0}^{∞} (··) x^{n+λ2} + ···

which is singular at x = 0 and hence not of much physical interest.
Further, if this method is used for solving a differential equation around
an ordinary point then one would find that both λ1 and λ2 are integers and
that case III A always obtains, ensuring that the two solutions have the
structure of a power series, as they should.

2.5 Example: Bessel Equation

A particularly good example of a second order differential equation where all
the cases discussed above obtain is the Bessel equation

    x^2 d^2 y/dx^2 + x dy/dx + (x^2 - ν^2) y = 0

Here the roots of the indicial equation turn out to be ν and -ν, and one has

    Case I:     2ν not an integer
    Case II:    ν = 0
    Case III A: 2ν an odd integer
    Case III B: 2ν an even integer, i.e. ν an integer

In Case I, proceeding as above, the two solutions turn out to be

    J_ν(x) = Σ_{n=0}^{∞} [(-1)^n / (n! Γ(n + ν + 1))] (x/2)^{2n+ν};
    J_{-ν}(x) = Σ_{n=0}^{∞} [(-1)^n / (n! Γ(n - ν + 1))] (x/2)^{2n-ν};

and the general solution of the Bessel equation can be written as

    y(x) = c1 J_ν(x) + c2 J_{-ν}(x)

In cases II, IIIA, IIIB, while the first solution (corresponding to the larger
root of the indicial equation) is still J_ν(x), to obtain the second solution
one has to follow the procedure outlined above. However, in the context of
Bessel's equation, one finds that the two solutions continue to remain valid
in Case III A as well. So the only problematic cases that remain are Case
II and Case III B, i.e. when ν is zero or an integer. Here (and only in this
context) the following trick to find the second solution works. One defines a
suitable linear combination of J_ν(x) and J_{-ν}(x) as follows:

    Y_ν(x) = [J_ν(x) cos(νπ) - J_{-ν}(x)] / sin(νπ)

and for cases I and IIIA the general solution of the Bessel equation can
equally well be written as

    y(x) = c1 J_ν(x) + c2 Y_ν(x)

When ν = 0 or an integer one finds that the function Y_ν(x), the way it is
defined, becomes an indeterminate form and has to be computed by applying
L'Hospital's rule. [This happens because for N an integer or zero, J_{-N}(x) = (-1)^N J_N(x),
as can be verified from the definition of J_ν(x).] With this caveat the general
solution of the Bessel equation can always be written as

    y(x) = c1 J_ν(x) + c2 Y_ν(x)

The functions Y_ν(x) are called Bessel functions of the second kind.
The equation

    x^2 d^2 y/dx^2 + x dy/dx - (x^2 + ν^2) y = 0

obtained by replacing x by ix in the Bessel equation is called the modified
Bessel equation. The considerations given above apply here as well and its
general solution is given by

    y(x) = c1 I_ν(x) + c2 K_ν(x)

where

    I_ν(x) = Σ_{n=0}^{∞} [1 / (n! Γ(n + ν + 1))] (x/2)^{2n+ν}

and

    K_ν(x) = (π/2) [I_{-ν}(x) - I_ν(x)] / sin(νπ)

As before, when ν is zero or an integer, K_ν(x) becomes an indeterminate
form (by virtue of the fact that I_{-N}(x) = I_N(x)) and has to be computed
as a limit. The functions K_ν(x) are called modified Bessel functions of the
second kind.
There are several equations of mathematical physics which are related to
the Bessel equation by suitable changes of variables. It can be shown that
the family of equations

    x^2 d^2 y/dx^2 + (1 - 2s) x dy/dx + [(s^2 - r^2 p^2) + a^2 r^2 x^{2r}] y = 0

after putting t = a x^r, y = x^s u transforms into the Bessel equation

    t^2 d^2 u/dt^2 + t du/dt + (t^2 - p^2) u = 0

and hence their general solution can be written as

    y(x) = x^s [c1 J_p(a x^r) + c2 Y_p(a x^r)]
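For numerical work the Bessel functions of the first and second kinds are available in standard libraries; the sketch below (assuming SciPy) checks that J_ν and Y_ν indeed satisfy the Bessel equation at a sample point, using simple finite differences for the derivatives.

    import numpy as np
    from scipy.special import jv, yv   # Bessel functions of the first and second kinds

    def bessel_residual(f, nu, x, h=1e-4):
        """x^2 f'' + x f' + (x^2 - nu^2) f, with derivatives taken numerically."""
        d1 = (f(nu, x + h) - f(nu, x - h)) / (2 * h)
        d2 = (f(nu, x + h) - 2 * f(nu, x) + f(nu, x - h)) / h**2
        return x**2 * d2 + x * d1 + (x**2 - nu**2) * f(nu, x)

    for f in (jv, yv):
        print(bessel_residual(f, nu=1.0, x=2.3))   # ~ 0, up to finite difference error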

2.6 Second order diff. eqns : Sturm Liouville form

Second order differential equations with the structure

    (1/w(x)) d/dx [ s(x) w(x) dy/dx ] = λ y                    (1)

are said to have the Sturm Liouville form. They have the structure of the
eigenvalue problem for the second order differential operator

    L ≡ (1/w(x)) d/dx [ s(x) w(x) d/dx ]                       (2)

If w(x) and s(x) are such that

- w(x) ≥ 0 in the interval (a, b)
- s(a)w(a) = s(b)w(b) = 0

then it can easily be shown that the solutions y_λ(x) and y_λ'(x) with λ ≠ λ'
are orthogonal to each other with respect to the weight function w(x) in the
interval (a, b):

    ∫_a^b dx w(x) y_λ(x) y_λ'(x) = 0    if λ ≠ λ'

Alternatively, this may be seen as a consequence of the fact that L is self
adjoint with respect to the scalar product

    (f, g) = ∫_a^b dx w(x) f(x) g(x)

and that the eigenvectors of a self adjoint operator corresponding to distinct
eigenvalues are orthogonal.

2.7 Sturm Liouville form: Polynomial solutions

If in addition to the two conditions on s(x) and w(x) above one further
stipulates that

- s(x) is a polynomial in x of degree at most 2 with real roots,
- C1(x) ≡ (1/(K1 w(x))) d/dx [s(x) w(x)], with K1 a constant, is a polynomial of
  degree 1,

then it can be shown that the functions

    C_n(x) ≡ (1/(K_n w(x))) d^n/dx^n [s^n(x) w(x)],    n = 0, 1, 2, ...    [Rodrigues Formula]

1. are polynomials in x of degree n,

2. satisfy the second order differential equation

       L C_n(x) = λ_n C_n(x),   with λ_n = n [ K1 dC1(x)/dx + (1/2)(n - 1) d^2 s(x)/dx^2 ],

3. form an orthogonal system

       ∫_a^b dx w(x) C_n(x) C_m(x) = 0   if n ≠ m,

4. satisfy recursion relations of the form C_{n+1}(x) = (A_n x + B_n) C_n(x) +
   D_n C_{n-1}(x).

A systematic analysis of the four conditions on s(x) and w(x) leads one
to eight distinct systems of orthogonal polynomials - Hermite, Laguerre, Legendre,
Associated Laguerre, Jacobi, Gegenbauer and Tchebychef (of the first and second
kinds). [For details see, for instance, Mathematics for Physicists, Dennery and
Krzywicki.]

    s(x)         w(x)                     Interval     Name of the polynomial
    1            e^{-x^2}                 (-∞, ∞)      Hermite: H_n(x)
    x            e^{-x}                   [0, ∞)       Laguerre: L_n(x)
    (1 - x^2)    1                        [-1, 1]      Legendre: P_n(x)
    x            x^α e^{-x}               [0, ∞)       Associated Laguerre: L_n^α(x), α > -1
    (1 - x^2)    (1 - x)^α (1 + x)^β      [-1, 1]      Jacobi: P_n^{(α,β)}(x), α, β > -1
    (1 - x^2)    (1 - x^2)^{λ - 1/2}      [-1, 1]      Gegenbauer: C_n^λ(x), λ > -1/2
    (1 - x^2)    (1 - x^2)^{-1/2}         [-1, 1]      Tchebychef of the first kind: T_n(x)
    (1 - x^2)    (1 - x^2)^{1/2}          [-1, 1]      Tchebychef of the second kind: U_n(x)

The table below gives the values of K_n appearing in the Rodrigues formula
and those of h_n appearing in the orthogonality relation

    ∫_a^b dx w(x) C_n(x) C_m(x) = h_n δ_{nm}

for the first three orthogonal polynomial systems.

    C_n(x)     K_n            h_n
    H_n(x)     (-1)^n         2^n n! √π
    L_n(x)     n!             1
    P_n(x)     (-2)^n n!      2/(2n + 1)

It is often not possible to remember detailed expressions for these polynomials. However, their suitably defined generating functions

    F(x, t) = Σ_n a_n C_n(x) t^n

have simple analytical expressions which can easily be remembered, and these
can be used directly to deduce various properties of the corresponding polynomials. The table below gives the generating functions for the first three
polynomial systems.

    C_n(x)     a_n       F(x, t)
    H_n(x)     1/n!      e^{2xt - t^2}
    L_n(x)     1         [1/(1 - t)] e^{-xt/(1 - t)}
    P_n(x)     1         1/√(1 - 2xt + t^2)
