
Notes on covariant versus contravariant vectors, raising and lowering indices, etc.
Matt Reece
Here are some notes on how vectors and tensors work in special relativity to try to clear up
some possible confusion from the lecture.
The first vector that we introduced was $x^\mu$, the coordinates of a point in spacetime, with $x^0$
the time coordinate and $x^i$ ($i = 1, 2, 3$) the three spatial coordinates. This object, a vector with a
superscript index, was our prototypical example of a contravariant vector. In fact, it's not a great
example, because it's worth keeping in mind that coordinates of a point in spacetime are not,
in general, vectors at all; it's a special property of Euclidean space that we can add coordinates
or scale them at all. If we had labeled points on a sphere with some coordinates, for instance, there
would be no meaning to the sum of two such points. And even in Euclidean space, it doesn't really
make sense to add $x^\mu + y^\mu$, or to multiply $x^\mu$ by a constant; this would determine a new coordinate,
but that new coordinate doesn't have a very sensible meaning. The reason we can kind of get
away with thinking about spacetime coordinates as vectors is that Lorentz transformations do act
on them by matrix multiplication:

$$x^\mu \mapsto x'^\mu \equiv \Lambda^\mu{}_\nu x^\nu. \tag{1}$$
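As a concrete illustration of Eq. 1, here is a minimal sketch in plain Python. It assumes units with c = 1 and a boost along the x axis with velocity beta; the function names and the sample numbers are illustrative choices, not something taken from the lecture.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with velocity beta, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_contravariant(L, x):
    """x'^mu = Lambda^mu_nu x^nu, with the sum over nu written out."""
    return [sum(L[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

x = [1.0, 0.5, 0.0, 0.0]   # (t, x, y, z), illustrative values
x_prime = transform_contravariant(boost_x(0.6), x)
print(x_prime)             # approximately [0.875, -0.125, 0.0, 0.0]
```

The first list index plays the role of the upper index and the second the role of the lower index, so the summed index is always the second index of the matrix.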

The summation convention on the index $\nu$ is understood. Anything that transforms according to
this relationship is called a contravariant vector. This definition extends beyond the case of Lorentz
transformations, and you can take the general definition (useful in GR, for example) to be: if the
coordinates change by any arbitrary mapping $x^\mu \mapsto x'^\mu$, a contravariant vector is one that transforms
according to:
$$v^\mu \mapsto v'^\mu \equiv \frac{\partial x'^\mu}{\partial x^\nu} v^\nu. \tag{2}$$

You can see that Eq. 1 is a special case of this, which is why we can get away with thinking of
coordinates in flat spacetime as a sort of vector.
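To see the general rule of Eq. 2 at work for a change of coordinates that is not linear, the following sketch uses a velocity in the flat two-dimensional plane and the change from Cartesian to polar coordinates. This setting is an assumption chosen purely for illustration, as are the function names and numbers. The transformed components are compared against a direct finite-difference derivative of the polar coordinates along the motion.

```python
import math

def jacobian_polar(x, y):
    """Matrix of partial derivatives d(r, theta)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def to_polar(x, y):
    """Polar coordinates (r, theta) of the Cartesian point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

# A point and a velocity in Cartesian components (illustrative values).
p = (3.0, 4.0)
v = (1.0, 2.0)

# Eq. 2: v'^mu = (dx'^mu / dx^nu) v^nu, summed over nu.
J = jacobian_polar(*p)
v_polar = [sum(J[mu][nu] * v[nu] for nu in range(2)) for mu in range(2)]

# Independent check: differentiate the polar coordinates along the
# straight line p + t * v with a central finite difference at t = 0.
h = 1e-6
r_plus, th_plus = to_polar(p[0] + h * v[0], p[1] + h * v[1])
r_minus, th_minus = to_polar(p[0] - h * v[0], p[1] - h * v[1])
v_numeric = [(r_plus - r_minus) / (2 * h), (th_plus - th_minus) / (2 * h)]
```

Here the velocity components transform with the Jacobian matrix evaluated at the point, even though the polar coordinates of the point itself are not obtained from the Cartesian ones by any matrix multiplication.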
A covariant vector, on the other hand, is denoted with a lower index and transforms in the
opposite way:
$$v_\mu \mapsto v'_\mu \equiv \frac{\partial x^\nu}{\partial x'^\mu} v_\nu. \tag{3}$$

In other words, it's really transforming by the inverse matrix. In the case of a Lorentz transformation, we would have an inverse matrix defined by

$$(\Lambda^{-1})^\mu{}_\nu \Lambda^\nu{}_\rho = \delta^\mu_\rho, \tag{4}$$

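so

$$x^\mu = (\Lambda^{-1})^\mu{}_\nu \Lambda^\nu{}_\rho x^\rho = (\Lambda^{-1})^\mu{}_\nu x'^\nu, \tag{5}$$

and $\frac{\partial x^\mu}{\partial x'^\nu} = (\Lambda^{-1})^\mu{}_\nu$. Thus, if we're talking specifically about Lorentz transformations,

$$v_\mu \mapsto v'_\mu \equiv (\Lambda^{-1})^\nu{}_\mu v_\nu, \tag{6}$$

and covariant vectors have the opposite transformation law from contravariant vectors.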
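The defining property of the inverse matrix and the recovery of the original coordinates from the transformed ones can be checked numerically. The sketch below assumes a pure boost along x in units with c = 1, for which the inverse is the boost with the opposite velocity; the helper names and sample values are illustrative.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with velocity beta, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    """(A B)^mu_rho = A^mu_nu B^nu_rho, summed over nu."""
    return [[sum(A[m][n] * B[n][r] for n in range(4)) for r in range(4)]
            for m in range(4)]

def apply(L, x):
    """L^mu_nu x^nu, summed over nu."""
    return [sum(L[m][n] * x[n] for n in range(4)) for m in range(4)]

beta = 0.6
L = boost_x(beta)
L_inv = boost_x(-beta)   # inverse of a pure boost: same boost, opposite velocity

# The product of the inverse with the original should be the Kronecker delta.
identity = matmul(L_inv, L)

# Applying the inverse to the transformed coordinates recovers the originals.
x = [1.0, 0.5, -2.0, 3.0]
x_back = apply(L_inv, apply(L, x))
```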

On the other hand, the defining property of Lorentz transformations is that they preserve the
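metric. The metric $g_{\mu\nu}$ has two lower indices and transforms like a covariant vector but with one
factor for each index:

$$g_{\mu\nu} \mapsto g'_{\mu\nu} = (\Lambda^{-1})^\rho{}_\mu (\Lambda^{-1})^\sigma{}_\nu g_{\rho\sigma} = g_{\mu\nu}. \tag{7}$$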

The last equality here is the statement that the metric didn't actually change when we did a Lorentz
transformation. This is really the same relationship we denoted in class by $(\Lambda^T g \Lambda) = g$, if we were
to view $g$ as a matrix. I usually find it easier to work in the form that just writes lots of indices,
rather than writing down an explicit matrix. Especially if you ever work with tensors of rank
higher than 2, it just becomes unwieldy to use anything other than the index notation.
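The statement that a Lorentz transformation preserves the metric can be verified numerically in both the index form and the matrix form. This is a minimal sketch: the signature (+, -, -, -) is an assumption, since the convention is not stated in this section, and the boost along x and the function names are likewise illustrative.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with velocity beta, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Assumed signature (+, -, -, -); the check works the same way for (-, +, +, +).
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def sandwich(M, g):
    """Return M^rho_mu M^sigma_nu g_{rho sigma}, i.e. the matrix M^T g M."""
    return [[sum(M[r][m] * M[s][n] * g[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)]
            for m in range(4)]

def max_deviation(A, B):
    """Largest absolute entrywise difference between two 4x4 matrices."""
    return max(abs(A[m][n] - B[m][n]) for m in range(4) for n in range(4))

L = boost_x(0.6)
L_inv = boost_x(-0.6)   # inverse of a pure boost: same boost, opposite velocity

g_index_form = sandwich(L_inv, g)   # transformation law with the inverse matrices
g_matrix_form = sandwich(L, g)      # the matrix statement Lambda^T g Lambda

print(max_deviation(g_index_form, g), max_deviation(g_matrix_form, g))
```

Both deviations come out at the level of floating-point rounding, which reflects the fact that the inverse of a Lorentz transformation is itself a Lorentz transformation.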
So at this point we've discussed two kinds of objects, covariant vectors and contravariant
vectors, which transform in different ways. Now we come to the key claim: given a contravariant
vector $v^\mu$, there is an object $w_\mu$ defined by $w_\mu \equiv g_{\mu\nu} v^\nu$. We claim that this is a covariant vector.
To see that, we should work out its transformation law. $g_{\mu\nu}$ doesn't change under a Lorentz
transformation, so the change comes entirely from the change in $v^\nu$. That is:

$$w_\mu \mapsto g'_{\mu\nu} v'^\nu = g_{\mu\nu} \Lambda^\nu{}_\rho v^\rho. \tag{8}$$

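But if we multiply the last equality in Eq. 7 by $\Lambda^\nu{}_\tau$, we learn that:

$$(\Lambda^{-1})^\rho{}_\mu (\Lambda^{-1})^\sigma{}_\nu \Lambda^\nu{}_\tau g_{\rho\sigma} = g_{\mu\nu} \Lambda^\nu{}_\tau, \tag{9}$$

and if we then use the definition of the inverse to replace $(\Lambda^{-1})^\sigma{}_\nu \Lambda^\nu{}_\tau = \delta^\sigma_\tau$, it becomes:

$$\boxed{(\Lambda^{-1})^\rho{}_\mu g_{\rho\tau} = g_{\mu\nu} \Lambda^\nu{}_\tau,} \tag{10}$$

i.e.

$$w_\mu \mapsto g_{\mu\nu} \Lambda^\nu{}_\rho v^\rho = (\Lambda^{-1})^\sigma{}_\mu g_{\sigma\rho} v^\rho = (\Lambda^{-1})^\sigma{}_\mu w_\sigma. \tag{11}$$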

But this is exactly the transformation law Eq. 6 of a covariant vector. More generally, the boxed
equation 10 is what I was implicitly (if maybe a little too cavalierly) using in class to avoid writing
down any transformation laws that used the inverse matrix.
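The claim that lowering an index turns a contravariant vector into a covariant one can be checked by computing the transformed lowered vector in two ways: transform first and then lower, or lower first and then transform with the inverse matrix. The sketch below assumes the signature (+, -, -, -) and a boost along x with c = 1; these choices, the names, and the sample vector are illustrative.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with velocity beta, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Assumed signature (+, -, -, -).
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def lower(v):
    """w_mu = g_{mu nu} v^nu, summed over nu."""
    return [sum(g[m][n] * v[n] for n in range(4)) for m in range(4)]

L = boost_x(0.6)
L_inv = boost_x(-0.6)   # inverse of a pure boost: same boost, opposite velocity

v = [2.0, 1.0, 0.0, -1.0]   # contravariant components, illustrative values
w = lower(v)

# Route A: transform v as a contravariant vector, then lower the index.
v_prime = [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]
w_prime_a = lower(v_prime)

# Route B: transform w as a covariant vector, w'_mu = (L_inv)^nu_mu w_nu.
# Note that the summed index is now the FIRST index of L_inv.
w_prime_b = [sum(L_inv[n][m] * w[n] for n in range(4)) for m in range(4)]
```

The two routes agree up to rounding, which is the numerical content of the boxed identity.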
If the jumble of indices above confused you, let's repeat the key point and elaborate a bit:

Summary of the key idea: Given a contravariant vector $v^\mu$ that transforms to $\Lambda^\mu{}_\nu v^\nu$ under a Lorentz
transformation, the object $w_\mu$ defined to be $g_{\mu\nu} v^\nu$ transforms to $(\Lambda^{-1})^\nu{}_\mu w_\nu$ under a Lorentz transformation, and as such is a covariant vector. In fact, we will usually choose to denote this object
by $v_\mu$, i.e. using the same letter we used to denote its counterpart contravariant vector. We call
the operation of moving from $v^\mu$ to $v_\mu$ "lowering the index", and because it is a one-to-one map,
we will often carelessly talk about $v^\mu$ and $v_\mu$ as if they are the same object, and simply call it a
"vector". Put differently, the metric $g_{\mu\nu}$ allows us to define an isomorphism between the spaces of
contravariant and covariant vectors.

We will also make use of the inverse metric $g^{\mu\nu}$, defined to be the object such that $g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho$.
Given this definition, $v^\mu = g^{\mu\nu} v_\nu$, so we can raise the index in much the same way as we lower
it. This is just the inverse operation of the isomorphism defined above.
Given the transformation laws of covariant and contravariant vectors, you can see that for any
two vectors $v^\mu$ and $w^\mu$, the contraction $v^\mu w_\mu = g_{\mu\nu} v^\mu w^\nu = g^{\mu\nu} v_\mu w_\nu$ is a Lorentz scalar, i.e. does not
change under Lorentz transformations. Because $v^\mu w_\mu$ is a scalar, you can view contravariant vectors like
$v^\mu$ as machines for turning covariant vectors like $w_\mu$ into real numbers, i.e. as linear functions $V \to \mathbb{R}$
from the space of covariant vectors to the reals. The space of such functions is what mathematicians
call a "dual space" (often denoted $V^*$), and so contravariant vectors are dual to covariant vectors.
The metric gives us a canonical isomorphism between the space of covariant vectors $V$ and its dual
space $V^*$, allowing us to be sloppy and conflate the two. More general objects with $n$ upper indices
and $m$ lower indices can be thought of as living in a tensor product of copies of $V$ and $V^*$, i.e.
$V^{\otimes m} \otimes (V^*)^{\otimes n}$.
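The three equivalent ways of writing the contraction of two vectors, and its invariance under a Lorentz transformation, can be checked with a short sketch. As before, the signature (+, -, -, -), the boost along x with c = 1, the function names, and the sample vectors are assumptions made for illustration.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with velocity beta, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Assumed signature (+, -, -, -). A diagonal matrix with entries +1 and -1
# is its own inverse, so the inverse metric has the same components.
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]
g_inv = [row[:] for row in g]

def lower(v):
    """v_mu = g_{mu nu} v^nu."""
    return [sum(g[m][n] * v[n] for n in range(4)) for m in range(4)]

def contract(metric, a, b):
    """metric[mu][nu] * a[mu] * b[nu], summed over both indices."""
    return sum(metric[m][n] * a[m] * b[n] for m in range(4) for n in range(4))

def boost(L, v):
    """L^mu_nu v^nu."""
    return [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]

v = [2.0, 1.0, 0.0, -1.0]   # contravariant components, illustrative values
w = [1.0, 3.0, 2.0, 0.0]

s_mixed = sum(v[m] * lower(w)[m] for m in range(4))   # one upper, one lower index
s_lower_metric = contract(g, v, w)                    # metric with two upper-index vectors
s_upper_metric = contract(g_inv, lower(v), lower(w))  # inverse metric with two lowered vectors

# The same contraction after boosting both vectors.
L = boost_x(0.6)
s_boosted = contract(g, boost(L, v), boost(L, w))
```

All four numbers agree up to rounding; for these sample vectors the common value is -1.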
The gradient $\partial_\mu = \frac{\partial}{\partial x^\mu}$ is a covariant vector, while the differential form $dx^\mu$ is a contravariant
vector.
