Flat and Curved Space-Times
SECOND EDITION

George F. R. Ellis
Distinguished Professor of Complex Systems, Mathematics Department, University of Cape Town

and

Ruth M. Williams
Fellow and Lecturer in Mathematics, Girton College, and Assistant Director of Research, Department of Applied Mathematics and Theoretical Physics, University of Cambridge

Diagrams by Mauro Carfora
Department of Nuclear and Theoretical Physics, University of Pavia

OXFORD UNIVERSITY PRESS

Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in
Oxford New York
Athens Auckland Bangkok Bogotá Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris São Paulo Singapore Taipei Tokyo Toronto Warsaw
with associated companies in Berlin Ibadan
Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
Published in the United States by Oxford University Press Inc., New York
© George F. R. Ellis and Ruth M. Williams 1988, 2000
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
First edition 1988
Second edition 2000
All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.
You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer.
A catalogue record for this book is available from the British Library
Library of Congress Cataloging in Publication Data (Data available)
ISBN 0-19-850657-0 Hardback
ISBN 0-19-850656-2 Paperback
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by T.J. International Ltd, Padstow, Cornwall

Preface

This book grew out of a series of lectures and a summer school course given by one of us (G.F.R.E.) at the University of Cape Town. A series of notes taken by a student (Gavin Hough) was useful in preparing the text, the major part of which was completed while G.F.R.E. was at the University of Texas and R.M.W. at the Institute for Advanced Study, Princeton. We thank Marilyn Brink, Colin Myburgh, Sasha Loncarevic, and Clive Khouny for useful criticisms of a draft of the text.

We decided to turn the notes into an introductory book because we believed that, despite the proliferation of books on relativity theory, there was no equivalent text available. We hope that the book will make a solid understanding of flat and curved space-times accessible to a wider audience than hitherto. We are extremely grateful to Dr Mauro Carfora for combining his artistic skills with his knowledge of relativity to produce the diagrams in the book.

Relativity may at first seem to the reader to be an abstract theory, far removed from the reality of everyday life.
By the end of the book, we will have demonstrated that this theory is of fundamental importance not only for elementary particle physics and astronomy, but also in the way it affects conditions of life in the world around us. We shall also see that the cover photograph, showing an eclipse of the Sun by the Earth, as seen from an Apollo spacecraft, illustrates several features of relativity. [Editor’s note: the front cover of the second edition has a different photograph; it shows the galaxy NGC 3377, which is believed to have a black hole at its centre. As we shall see, this also illustrates several features of relativity.]

October 1987
G.F.R.E.
R.M.W.

Preface to the Second Edition

We have been very pleased to prepare a second edition of this book, at the request of Oxford University Press, in order to bring this presentation of relativity theory up to date (and to correct some errors and areas of lack of clarity that have been pointed out by readers). While the foundations of the subject remain the same as ever, there has been marked progress in some areas of application of relativity theory, particularly because of the vast explosion of new astronomical data from powerful new ground-based telescopes such as Keck, and a series of satellite observatories: IRAS (infra-red astronomical satellite), COBE (cosmic background radiation explorer), the Hubble space telescope, ROSAT (X-ray), and so on. Also, for example, gravitational radiation detectors have made enormous strides, and major new-generation gravitational wave observatories will come on line in the next five years or so, opening up a new astronomical channel of observation. The observational situation is being transformed. Thus this revision presents a substantial amount of new material that takes these developments into account.

However, we have not altered the basic structure of the book, despite critical comments by some reviewers.
The prime cause of dissatisfaction to some is that we take so long to reach the Lorentz transformation, traditionally an early part of many presentations. This policy of ours is deliberate. We believe it is essential to get the grounding right first, and that takes a long time and considerable thought; it should not be rushed. It is possible to move quickly to the Lorentz transformations, and learn to manipulate them mechanically, but that does not mean that what they represent is understood in a serious way. Our aim is to lay the foundations solidly, first deriving all the main relativity results in a simple and well-grounded way, and only then to use the Lorentz transformations as a device for summarizing concisely what has been discovered.

The other way of presentation (effectively starting with the Lorentz transformation) is right for some readers; ours is right for many others, as readers’ comments testify. So the basic presentation is the same as before. We hope that you will find it enlightening.

July 2000
G.F.R.E.
R.M.W.

We dedicate this book to our daughters Margaret Ellis and Miriam Saxl

Contents

Introduction
1. Space-time diagrams and the foundations of special relativity
   1.1 The concept of a space-time
   1.2 Causality and the speed of light
   1.3 Relative motion in special relativity
2. Fundamentals of measurement
   2.1 Time
   2.2 Distance
   2.3 Simultaneity
   2.4 World maps, world pictures, and radar maps
3. Measurements in flat space-times
   3.1 Relative velocity
   3.2 Simultaneity
   3.3 Time dilation
   3.4 Length contraction
   3.5 The whole package of kinematic effects
   3.6 Relativistic dynamics
   3.7 The consistency of physics
4. The Lorentz transformation and the invariant interval
   4.1 The Lorentz transformation
   4.2 Space-time separation invariants
   4.3 Some flat-space universes
5. Curved space-times
   5.1 The general concept
   5.2 Acceleration and gravitation: the principle of equivalence
   5.3 Freely falling motion and the meaning of geodesics
   5.4 The metric form and the metric tensor
   5.5 The field equations
   5.6 Light rays
   5.7 Causality
   5.8 Parallel propagation along a curve
   5.9 Further tests of Einstein’s theory
   5.10 Gravitational waves
   5.11 Detection of gravitational waves
   5.12 Alternative theories and approaches
6. Spherical and stellar collapse
   6.1 The Schwarzschild solution
   6.2 Spherical collapse to black holes
   6.3 More general black holes
   6.4 Black hole evaporation and thermodynamics
   6.5 Black hole candidates and ways of detecting them
7. Simple cosmological models
   7.1 Space-time geometry
   7.2 The evolution of the universe
   7.3 Observable quantities
   7.4 New observational data
   7.5 The light cone, observational limits, and horizons
   7.6 Steady-state and inflationary universes
   7.7 Small universes
   7.8 Alternative universes
   7.9 Observational tests
8. Finale
Afterword
Appendices
   A. Line integrals
   B. Four-vectors and relativistic dynamics
   C. Four-vectors, electromagnetism, and energy-momentum conservation
Symbols used
Index

Introduction

The aim of this book is to demonstrate the unifying power of the concept of a space-time in understanding the nature of the physical world. It will do so firstly by giving a good understanding of the nature and meaning of the flat space-time of the special theory of relativity, and features of that theory such as length contraction, time dilation, and the twin paradox. Secondly it will provide an introduction to the nature and meaning of the curved space-times of the general theory of relativity, including the concept of the expanding universe and the nature of black holes. Both of these theories of relativity are due to Albert Einstein (Fig. 0.1), the special theory being completed in 1905 and the general theory in 1916.
Einstein’s theories of relativity and their dramatic revelations of the unexpected nature of space-time are among the major scientific discoveries of this century, replacing the ideas about space and time that had been believed since Galileo and Newton. It is fundamental in approaching these topics that the reader be prepared to drop his/her preconceived ideas about the nature of distance measurements, time measurements, simultaneity, and causality. This is perhaps Einstein’s greatest single contribution to the understanding of space-time: teaching us to question the commonplace ideas about these concepts. The resulting revolution in understanding, leading to the discovery of length contraction, time dilation (‘a moving clock goes slow’), the relativity of simultaneity, and the fact that space-time geometry and causality are determined by the matter in it, will be explored in depth in this book.

One should note that the kinematic effects discussed here are only dramatic when speeds near the speed of light are involved; they are negligible in ordinary everyday life. That is why we do not understand these effects intuitively as ‘the way things are’. However, many of the consequences of special relativity are significant in situations that do not involve high-speed motion; in particular, the nature of magnetic forces and the possibility of nuclear power are two such consequences of considerable importance.

The concept of space-time presented here is a model of reality used with great success by theoretical physicists. It summarizes the nature of spatial and time relationships in a concise way, and is a very good illustration of the use of geometry in understanding physics. The point of a geometrical picture is that it represents in a concise way many analytical relationships that are tedious to describe in full, and are difficult to understand when they are written out in detail.
These pictures enable us to understand in a direct way the results of distance and time measurements and so are a very useful tool in making predictions about the results of physical experiments. One should remember that the space-time view embodied in relativity theory is a model of reality which has been tested by many physical experiments, and depicts more correctly than other models the results of these experiments. It is thus a way of summarizing much of what we know about the physical universe. The understanding obtained through the concept of a space-time shows how various features that we at first may regard as independent of each other are in fact manifestations of the same underlying physical phenomena. Thus this concept is not merely a tool to use in making predictions efficiently, but also provides a way of understanding a deeper unity in nature than is obvious on the surface.

Fig. 0.1 Albert Einstein, who proposed the special theory of relativity in 1905 and the general theory of relativity in 1916, thereby bringing the study of flat and curved space-times into the mainstream of physics. (The photograph shows Einstein in 1933.) (Photograph from the American Institute of Physics.)

Being able to understand fully the concept of a space-time implies being able to calculate the results of measurements in particular space-times. We shall show how this can be done without employing more than school-level mathematics plus the simple concept of a line integral (explained in Appendix A). Thus we believe that anyone with a good grasp of school algebra, some trigonometry, and the concept of a function should be able to follow our detailed argument including the calculations (in this respect our book is similar to Lilley’s book Discovering Relativity for Yourself, Cambridge University Press, 1981, which gives a more extended introduction to the actual details of calculation than we do here).
In a few restricted places in the main text, the idea of a derivative is also needed; omitting these sections will not impair understanding of the major thrust of our argument.

We recommend that the serious reader should indeed try to follow all the calculations presented in the main text and attempt at least some of the examples, both for the satisfaction this will afford and because this is the way to fullest understanding of the concepts presented. Restrictions on the length of the book meant that we were not able to include solutions to the exercises. However, a set of notes containing a mixture of complete solutions, hints, and answers to the problems may be obtained separately from the authors (please write to Dr R. M. Williams). For fun, we have included some examples involving writing programs for a microcomputer; these examples enable a good visual presentation of some of the ideas, and are amusing to carry out, but again they are not essential to understanding the text. We suggest that, if at any time you feel that you are becoming stuck in detailed argument or calculations, you should just note the general ideas presented and go on to the next section. An acquaintance with school-level physics will make the argument easier to follow at some places, but a lack of this background will not prevent the reader from grasping the main ideas.

We show how the concepts of energy and momentum are united through the concept of a space-time four-vector, leading to the famous result E = mc² (Appendix B); and how electricity and magnetism are united in a space-time tensor, leading to the fundamental understanding of a magnetic field as being essentially an electric field viewed from a relatively moving frame (Appendix C).
These topics have been separated from the main text because their full development requires somewhat more mathematics than the main text (full appreciation of Appendix C requires sufficient knowledge of partial derivatives to understand Maxwell’s equations in vector notation). Thus while this material will be interesting and useful to anyone who wishes to understand these dynamical applications of relativity theory, it is not essential to the understanding of the kinematics described in the main part of the book. With these appendices, the book describes sufficient material on special relativity to give adequate understanding for most first-year university physics courses on the subject; however, the main text should be accessible to a wider circle of readers, namely, any interested person with a reasonable knowledge of school mathematics, and the will to follow the argument through (and indeed could serve as a text for courses such as described by T. A. Roman in ‘General relativity, black holes and cosmology: a course for non-scientists’, American Journal of Physics 54, 144, 1986). Should you not have a background in physics but wish to follow through some of the physics arguments a bit further, the book Time, Space and Things by B. K. Ridley (Cambridge, 1984) might be a good starting point.

This book focuses particularly on understanding relativity from a geometrical viewpoint (perhaps the most similar other approaches being those in Geroch’s book General Relativity from A to B, University of Chicago Press, 1978, and in Lilley’s book mentioned above). We make particular use of Bondi’s K-calculus to determine the results of calculations in flat space-time (Hermann Bondi used this approach in a successful BBC television series on relativity, and published accounts of it in his books Relativity and Common Sense, Anchor Books, 1964, and Assumption and Myth in Physical Theory, Cambridge University Press, 1967).
Instead of starting off with the Lorentz transformation as the basis of the argument, we arrive at this concept fairly late in our presentation, when it appears as a convenient unified way of summarizing relationships we have previously derived by use of the K-calculus. Our presentation of the nature of simple curved space-times centres on showing the reader how he or she may deduce many properties of these space-times directly from their interval. Further reading is suggested in the concluding section of the main text (‘Afterword’), and the reader will find that the Index has been carefully prepared as a guide to the terms used and ideas presented throughout the book.

While we have endeavoured to present the material covered thoroughly, we have also tried to do so concisely so that the overall size of the volume will not be excessive or daunting. The first part of the book may seem to some to be rather leisurely, because all the detail is spelt out. This is a conscious decision on our part: we feel that the average textbook goes too fast through the fundamentals. The serious student will probably be able to read the first few chapters fairly quickly, but will benefit from this thorough grounding; he/she will find the main increment of difficulty is in the Appendices, whose inclusion results in covering what is needed for a first university course in relativity. On the other hand, readers for whom they are too technical may well omit these appendices. We believe that in their case the book will provide a good opportunity for the interested non-specialist reader or early student to understand the nature of flat and curved space-times, and how they determine physical measurements of time, distances, and instantaneity, without becoming bogged down in mathematical formalism. Thus the reader will become familiar with one of the foundations of our modern understanding of the nature of the physical world.
1 Space-time diagrams and the foundations of special relativity

1.1 The concept of a space-time

Space and time are notions familiar to everyone. We shall explore the way in which they form a single entity called space-time, firstly according to the ordinary everyday view of how events occur (i.e. according to Newtonian theory). In later sections we shall examine the space-time description of relativity theory.

Space-time according to a single observer

Consider a cine camera set up above a billiard table, pointing directly down to take a series of photographs of the billiard balls on the table (Fig. 1.1a). We may use x and y coordinates to express the position of each of the balls, and could even make these coordinates explicit by marking a coordinate grid on the billiard table. Suppose that one of the balls moves as time progresses, while the rest are stationary. Then the x and y coordinates of this ball will change with time according to this motion, and this will be reflected in the photographs.

Now imagine cutting the cine film to separate the images (Fig. 1.1b) and then stacking these photographs one above the other in their correct time sequence, with the earliest photograph at the bottom and the latest at the top (Fig. 1.1c). The position of each ball at any time t = t′ is represented by the position of its image in the corresponding photograph, with its successive positions at later times recorded in the subsequent photographs higher up in the stack. Thus a glance at this stack of pictures will show the way the arrangement of the balls changes with time; in particular it will show how one ball moves and the others are all stationary.
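The stacking procedure can be tried out on a microcomputer. The following is a minimal sketch, not taken from the text: the ball positions, the speed of the moving ball, and the function names `take_photographs`, `stack`, and `history` are all hypothetical choices made for illustration. Each photograph is a record of where the balls are at one time; stacking sorts the photographs into time order, and a ball's history is read off by passing up through the stack.

```python
def take_photographs(times, speed=-1.0):
    """Return one 'photograph' per time: the (x, y) position of each ball.

    The positions and the speed are made-up values for illustration.
    """
    photographs = []
    for t in times:
        photographs.append({
            "t": t,
            "balls": {
                "stationary": (3.0, 2.0),          # never changes
                "moving": (5.0 + speed * t, 1.0),  # drifts to the left
            },
        })
    return photographs


def stack(photographs):
    """Stack the photographs in time order, earliest at the bottom (index 0)."""
    return sorted(photographs, key=lambda photo: photo["t"])


def history(spacetime, ball):
    """Read off the events (t, x, y) that record one ball's history."""
    return [(photo["t"],) + photo["balls"][ball] for photo in spacetime]


# Photographs in a jumbled order, as they might be after cutting up the film.
loose = take_photographs([2, 0, 3, 1])
spacetime = stack(loose)

for event in history(spacetime, "moving"):
    print(event)
```

Reading the history of the stationary ball in the same way gives the same x and y at every time, which is the numerical counterpart of its image staying in one place as one moves up the stack.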
This stack of photographs already contains the essential idea of a space-time, namely the presentation of a time sequence of images one above the other showing the successive positions of objects in the space considered (here, the surface of the table). However, there is one problem: the stack of photographs is very frail: one sneeze will destroy its order. To remedy this, imagine taking the stack of photographs and fusing them together in an oven, to obtain a solid, durable space-time (Fig. 1.1d). This is a three-dimensional space-time, with the vertical axis depicting time, represented by a coordinate t (measured by a clock), and the horizontal axes depicting spatial position on the surface of the table, represented by coordinates x and y (measured by rulers). The space-time represents the histories of all objects in the two-dimensional space. Thus the histories of the stationary billiard balls are represented by vertical tubes in the space-time, while the history of a ball moving to the left is represented by a tube sloping over to the left.

Fig. 1.1 Constructing a space-time. (a) A cine camera takes photographs of billiard balls on a table. One ball moves relative to the others. (b) A series of photographs from the film. (c) The photographs stacked together, later ones above the earlier ones. (d) The photographs fused together to form a ‘space-time’, with time coordinate t and spatial coordinates x, y.

To recover the detailed history of motions of objects in the space, simply consider a series of horizontal sections of the space-time (surfaces of instantaneity) at later and later times.
These sections intersect the tubes representing the histories of the stationary balls at x and y coordinate positions that stay constant (showing that they are indeed stationary), and intersect the tube representing the ball moving to the left in positions that are successively more to the left (showing it does indeed move to the left). In effect, by considering a succession of time slices in this way one can reconstruct a series of images corresponding to the photographs from which the space-time was initially constructed, and then by considering these in turn one can visualize the motion of the particles as in a cine film. The space-time therefore completely represents these motions.

The space-time we have constructed is three-dimensional, representing the histories of objects in a two-dimensional space (the surface of the table). Of course, real space-time is four-dimensional, with three space dimensions (described by coordinates x, y, z) and one time dimension (described by the coordinate t), representing the histories of all objects in three-dimensional space. We cannot easily represent this in a single picture. However, a study of three-dimensional (or even two-dimensional) space-times will enable us to understand many of the properties of the full four-dimensional space-time. We will demonstrate this in the rest of this book.

Space-time according to different observers

Different observers will in general have different views of the space-time. Returning to consider the billiard table discussed above, we suppose now that in addition to a camera A held fixed above the billiard table (Fig. 1.2a), there is a second camera B, which moves with the moving ball (Fig. 1.2b).* To simplify matters suppose that the ball moves parallel to the x-axis; then the camera will also move parallel to the x-axis at the same speed as the ball, directly above it, so that the ball stays at a fixed position in the viewfinder.
Then in the space-time model constructed from the pictures obtained by A (exactly as described above) the history of the moving ball is a tube slanted to the left (Fig. 1.2c), while in the space-time model constructed by B (again, exactly as above) the history of this ball is a vertical tube (Fig. 1.2d). This is because the ball moves to the left relative to the coordinate x corresponding to A’s view, but stays fixed in the coordinate x′ corresponding to B’s view. Thus we have two different views of the same set of happenings. These are the same space-time described from different viewpoints.

This illustrates one of the major issues that arises in understanding space-times: one can use different coordinate systems, corresponding to making different choices of observer. The representations arising will apparently be different, but can in fact be transformed into each other by making the appropriate changes of coordinates. Later we will determine the mathematical transformations that relate the viewpoints of the two observers. For the present, we simply note that when we consider the series of photographs from which the space-time representations are constructed, the relation is a simple one. Suppose that before we fuse A’s set of photographs together, we slide them carefully sideways until the images of the moving ball are directly above each other (Fig. 1.3); then A’s and B’s representation of the same set of physical events will be the same. By this means, the view obtained by the first camera has been transformed into the same as that obtained by the second.

*If you feel that the labels A and B for the different cameras and the corresponding observers are antiseptically impersonal, you might like to substitute names such as Alfred or Angela for A, Barbara or Bernard for B. While such labelling may well initially help the beginner to grasp what is happening, ultimately it becomes an annoying distraction. We have chosen to use the more convenient abstract labels from the beginning.
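The sliding of A’s photographs can be sketched numerically. The example below is an illustration under stated assumptions rather than the transformation derived later in the book: it assumes the everyday (Newtonian) picture used in this section, in which the photograph taken at time t is slid sideways by v·t, where v is the velocity of the moving ball along the x-axis, so that x′ = x − v·t. The function name `slide_photographs` and all numerical values are hypothetical.

```python
def slide_photographs(events, v):
    """Slide each photograph sideways so that x' = x - v * t.

    events is a list of (t, x, y) positions as recorded by camera A;
    v is the velocity of camera B along the x-axis (assumed constant).
    """
    return [(t, x - v * t, y) for (t, x, y) in events]


v = -1.0  # the ball (and camera B) move to the left in A's view

# Histories as recorded by camera A (made-up positions).
moving_A = [(t, 5.0 + v * t, 1.0) for t in range(4)]
stationary_A = [(t, 3.0, 2.0) for t in range(4)]

# The same histories after sliding, i.e. as camera B would record them.
moving_B = slide_photographs(moving_A, v)
stationary_B = slide_photographs(stationary_A, v)

print(moving_B)      # x' is the same in every photograph: a vertical tube
print(stationary_B)  # x' grows with t: the 'stationary' ball drifts right
```

In B’s view the roles are exchanged: the ball that A sees moving has a vertical history, while the balls at rest on the table have histories slanting to the right.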
Fig. 1.2 Effect of the observer’s motion on the space-time picture. (a) Camera A is fixed above the billiard table. (b) Camera B moves with the moving billiard ball. (c) The space-time view of the ball’s history, constructed from A’s photographs. (d) The space-time constructed from B’s photographs.

Fig. 1.3 Although they look different, A’s and B’s space-time views are equivalent: sliding A’s pictures sideways before fusing them together will give the same space-time view as B’s.

Fig. 1.4 A planet in circular motion around the Sun, describing a helix in space-time.

Examples of space-times

The ideas explained so far should become quite clear on carefully considering two examples.

(A) A planet in circular motion around a sun. In the sun’s frame of reference, the sun is at rest in the spatial coordinates used, while the planet circles around it, describing a helix in space-time (Fig. 1.4). To see that this is the correct space-time picture, consider later and later time sections of the space-time; the positions of the planet in the successive surfaces of instantaneity trace out a circle around the sun, as required.

(B) A circular wave in a pond. Consider dropping a stone into a large pond at some time t₁, producing a spreading spherical ripple in the pond (Fig. 1.5a). Photographs of the crest of the spherically spreading wave taken from a camera stationary above the point of impact (Fig. 1.5b) produce a space-time picture in which the spreading wave is depicted as a cone with apex at t = t₁ (Fig. 1.5c). Again considering later and later surfaces of instantaneity in the space-time, we recover the series of images depicting the spherically spreading wave, starting from the centre at time t₁.

Points in space-time are called events.
An event represents a particular position in the physical world at a particular time, the set of all events representing the spatial and temporal locations of all possible physical occurrences. A world-line is the path traced out in space-time by the events representing the history of a particular particle or light ray. For example, the helix in example (A) is the world-line of the planet as it orbits around the sun. Not all lines in space-time are possible world-lines; for example, if a line reaches a maximum time and then slopes down again (Fig. 1.6), it does not represent a possible world-line of a massive body, because time would start to go backwards along such a world-line, where it slopes down. We shall discover further restrictions on allowable world-lines after considering the limiting role played by the speed of light in relativity.

Fig. 1.5 (a) Circular ripples produced by a stone thrown into a pond. (b) A succession of photographs of the spreading wave. (c) A space-time view of the spreading wave.

Fig. 1.6 Curves in space-time: A is a possible particle history, or world-line; B is not.

Summary

Space-time represents the histories of objects in space. When the space represented is two-dimensional, the space-time is three-dimensional (three coordinates are needed to characterize all events: the two spatial coordinates x and y depicting the spatial position of the event, and the coordinate t representing the time of the event). The full space-time needed to represent all events in the real physical world is four-dimensional (with one time coordinate and three spatial coordinates). Each surface {t = constant} tells us where each object was at the time t, according to an observer using a particular coordinate system, say (x, y, z); these surfaces are slices of instantaneity or simultaneity in the space-time (Fig. 1.7).
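As a small microcomputer illustration of these ideas, the sketch below generates events on the helical world-line of example (A) and applies the one restriction on world-lines stated so far, namely that time must never run backwards along the curve. It is only a sketch: the unit radius and period, the sampling of the curve at a finite list of events, and the function names `planet_event` and `time_always_increases` are assumptions made for the example.

```python
import math


def planet_event(t, radius=1.0, period=1.0):
    """Event (t, x, y) on the world-line of a planet circling a sun at the origin."""
    angle = 2.0 * math.pi * t / period
    return (t, radius * math.cos(angle), radius * math.sin(angle))


def time_always_increases(events):
    """True if the time coordinate strictly increases along the sampled curve."""
    return all(later[0] > earlier[0]
               for earlier, later in zip(events, events[1:]))


# Two orbits of the planet, sampled eight times per orbit: a helix.
helix = [planet_event(k / 8) for k in range(17)]

# In every surface {t = constant} the planet lies on the same circle.
distances = [math.hypot(x, y) for (_, x, y) in helix]

# A curve that reaches a maximum time and then slopes down again.
turns_back = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.5, 2.0, 0.0)]

print(time_always_increases(helix))       # the helix passes the test
print(time_always_increases(turns_back))  # this curve does not
```

The check is necessary but not sufficient: as noted above, further restrictions on allowable world-lines arise from the limiting role of the speed of light.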
Fig. 1.7 A {t = constant} slice of a space-time; this represents a surface of simultaneity.

Exercises

1.1 An observer O watches the engine of a train shunting on a straight track; he chooses the x coordinate to measure distance along the track. Plot the world-line of the engine in the (t, x) plane if, starting at a distance of 50 m from the observer, (i) it moves at 10 m/sec away from the observer for 5 seconds; (ii) then it is stationary for 7 seconds; (iii) then it moves at 5 m/sec towards the observer for 8 seconds.

1.2 The motion of a rocket relative to observer A is shown in Fig. 1.8. What is the distance of the rocket from A at t = 0 seconds? at t = 10 seconds? What is the speed of motion of the rocket relative to A?

Fig. 1.8 [World-line of the rocket; axes t (sec) and X = x/c (light-secs).]

1.3 Draw a space-time diagram representing the motion of the Moon about the Earth (stating carefully what reference frame you are using). Indicate approximate time and spatial scales on your diagram.

1.4 Suppose a particle in an accelerator moves in a circular orbit of radius 29 m, speeding up all the time as it moves. Sketch a space-time diagram of its motion.

1.5 Two cars A and B, watched by a person C waiting to cross the street, collide and then bounce apart. Sketch the world-lines of A, B, and C as seen by (i) the driver of one of the cars; (ii) the driver of the other car; (iii) the person waiting to cross the street. [The drivers are each securely seat-belted into their respective cars.]

So far, our discussion of space-times has been based on the everyday ideas of Newtonian theory. The concept of a space-time applies equally in the case of relativity theory, provided we take into account important relativity principles which we examine in the next two sections.

1.2 Causality and the speed of light

The speed at which light travels is very large but nevertheless is finite.
It is measured to be approximately 3 × 10¹⁰ cm/sec = 3 × 10⁸ m/sec = 300 000 km/sec. Thus, for example, light travels 30 km in 10⁻⁴ sec = (1/10 000) sec, and 300 km in 10⁻³ sec = (1/1000) sec. According to the Newtonian view of space-time, there is nothing special about the speed of light, and physical influences (e.g. changes in a gravitational field) can propagate faster: indeed, in principle they can influence distant regions instantaneously. According to relativity theory, the situation is quite different.

The limiting nature of the speed of light

One of the basic principles of Einstein’s special theory of relativity is that the speed of light is a limiting speed for all communication and for all motion of massive bodies; indeed it is a limiting speed for propagation of all causal influences. One should note here that this speed is the speed of travel of all electromagnetic radiation, not merely light; it is the speed of travel of infrared and ultraviolet radiation, of radio waves and X-rays, as well as visible light (because these are all forms of electromagnetic radiation, at different wavelengths). Further, it will be the speed of travel of any particles of zero rest mass there may be, e.g. gravitons (packets of gravitational energy) and massless neutrinos as well as photons (packets of electromagnetic energy). Thus one can send signals at the speed of light in many ways, but there is no way one can send a signal faster. Any massive object, e.g. a rocket, a meteorite, a human being, cannot travel as fast as light.

There is experimental evidence for this principle from many sources. On the one hand, no particle or signal has ever been measured to move faster than this speed. On the other, attempts to accelerate objects to higher speeds fail. For example, suppose one accelerates particles in a linear accelerator, and then plots the square of the resulting speed against the energy given to the particles.
Newtonian theory predicts that no matter how high the speed, the resulting graph will be a straight line because the kinetic energy of the particle is proportional to the square of its speed of motion; in particular, there should be no barrier to accelerating particles to move faster than light. In practice it turns out that the Newtonian prediction is correct at low speeds, but at higher speeds the experimental results deviate from this prediction: the speed attained is less than that predicted by Newtonian theory. This happens in such a way that no matter how much energy one imparts it is not possible to accelerate particles to move faster than the speed of light (Fig. 1.9). The amount of energy needed to accelerate fast-moving particles to higher speeds becomes larger and larger as the speed increases; smaller and smaller speed increments result from each doubling of the energy, and the speed of light is never reached.

Fig. 1.9 A graph of the square of the speed of a particle against the energy of motion given to it, showing the experimental result and the prediction of Newtonian theory. No matter how much energy is given to the particle, the speed of light c is a limit to the speed it attains.

This is an experimental result that has been proved many times over at a cost of many billions of dollars (since that is the cost of the high energy particle accelerators now in use). One has to invest large sums of money in accelerators to produce an observable effect, because the speed of light is so large: the speed of light does not act as a factor restricting the speed of cars, aircraft, or other vehicles on the earth!

The need to allow for the speed of light

The time delay between lightning and thunder reminds us to allow for the speed of sound, but that is not the only allowance we should make!
The limiting nature of the speed of light in special relativity means that one should always allow for light travel time in analysing any physical phenomenon. As an example, any photograph will, in general, include images of objects at various distances and so various light travel times. This means the images in a photograph will represent the states of the objects pictured at different times in the past. Thus a photograph of the Moon framed by trees represents the state of the Moon 1.27 seconds earlier than that of the trees; a photograph of distant galaxies with foreground stars (Fig. 1.10) represents delays of millions of years in the state of the galaxies relative to the stars (the stars will typically be at distances for which the light travel time is thousands of years but the galaxies at distances for which the light travel time is millions of years). In each case we see the object at the instant when the light was emitted; the camera therefore necessarily records the resulting time delays.

Fig. 1.10 Distant galaxies and foreground stars. The foreground stars all belong to our own galaxy, which is a spiral system of stars and dust like the galaxy M81 shown here. The four ‘nearby’ galaxies visible in the photograph are at a distance of some millions of light years from us (three fainter galaxies are even more distant) but the individual stars seen are within a few thousand light years. The photograph dramatically illustrates the time delays necessarily involved in all our observations of distant objects: we are seeing conditions at the galaxies millions of years ago, and those in the stars up to a few thousand years ago. Thus the images represent these objects as they were at times differing by millions of years. (Photograph from the Hale Observatory.)

The front cover of this book shows galaxy NGC 3377, which is approximately
32 million light years from us, and so the image shows the galaxy as it was 32 million years ago. The back cover shows the COBE image (see p. 59) of the surface of last scattering of light in the very early universe, approximately $10^{10}$ years ago. The light that made this image has been travelling towards us for that enormous time.

Fig. 1.11 (a) A camera above the centre of a pond: the distance $d_1$ to the centre is clearly shorter than the distance $d_2$ to a point further out. Consequently, light arriving at the camera from the centre set out later than light arriving at the same instant from the edge. (b) Circles of constant imaging time on a photograph $P_1$ of the pond, the larger circles corresponding to earlier times. (c) Surfaces of simultaneity in a stack of photographs of the pond (viewed edge-on, showing the finite thickness of each photograph). The photograph $P_1$ is shown shaded. (d) Distortion of the stack of photographs before fusing, to represent correctly surfaces of simultaneity as exactly horizontal sections of space-time.

To explore this effect further, consider a camera 3 metres above the centre of a circular pond of diameter 8 metres (Fig. 1.11a). The light has to travel a distance of 3 metres from the centre of the pond to the camera, taking (3 m)/($3 \times 10^{8}$ m/sec) = $10^{-8}$ seconds to do so, but light from the edge of the pond has to travel a distance of 5 metres (the hypotenuse of a right-angled triangle with sides 3 metres and 4 metres), taking (5 m)/($3 \times 10^{8}$ m/sec) = $\tfrac{5}{3} \times 10^{-8}$ seconds to do so. Thus light from the edge takes $\tfrac{2}{3} \times 10^{-8}$ seconds more to reach the camera than light from the centre. A photograph records one instant when light reaches the camera from different places within its field of view; if these places are at various distances from the camera, the image obtained will represent the different times when the light set out towards the camera.
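The arithmetic for the pond can be checked with a few lines of code. The following is only a minimal sketch: the function name `light_travel_time` is hypothetical, and the speed of light is taken as the rounded value $3 \times 10^{8}$ m/sec used in this section rather than the exact one.

```python
import math

# Rounded speed of light, as used in the text (assumption: not the exact value).
C_M_PER_SEC = 3.0e8


def light_travel_time(height_m, radial_offset_m):
    """Seconds for light to reach a camera at height_m above the centre of a
    flat surface, starting from a point radial_offset_m from the centre."""
    distance_m = math.hypot(height_m, radial_offset_m)  # Pythagoras' theorem
    return distance_m / C_M_PER_SEC


# Pond example: camera 3 m above the centre, pond radius 4 m (diameter 8 m).
t_centre = light_travel_time(3.0, 0.0)
t_edge = light_travel_time(3.0, 4.0)
delay = t_edge - t_centre

print(f"centre: {t_centre:.3e} s, edge: {t_edge:.3e} s, delay: {delay:.3e} s")
```

The same function applies to any "camera above a flat disc" geometry, provided the height and the radial offset are given in metres.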
Hence, when the camera takes a photograph of the pond, one will obtain images of the situation in different areas of the pond at different times: light from the edge has to travel further and so has to set out earlier in order to reach the lens at the same time as light from the centre. If we sketch lines of exact simultaneity on a photo $P_1$ of the pond taken by the camera, they will form circles with the outer circle depicting the situation at the pond earliest, say at a time $t_1$, and the central point the situation at a time $t_2$ which is $0.667 \times 10^{-8}$ seconds later than $t_1$ (Fig. 1.11b). A photograph taken by the camera is not an instantaneous photograph of the pond!

Hence, on stacking a succession of photographs together and fusing them to obtain a representation of space-time, horizontal sections will not represent exact simultaneity:* as one moves out from the centre on a horizontal slice of space-time (which will be one of the photographic images), the situation represented will be earlier and earlier the further one is from the centre. There will be an earlier photo $P_0$ in which the situation at the central point is depicted at the time $t_1$; this photograph will lie below $P_1$ in the stack (because later photographs lie above earlier ones). It follows that exact surfaces of simultaneity in the space-time (e.g. t is constant) will be lowest at the centre and will curve up as one moves from the centre to the edge (Fig. 1.11c). To correct this, i.e. to obtain a space-time representation in which horizontal sections are indeed exactly simultaneous sections of the space-time, one will have to distort the photographs of the pond by bending their outer regions downwards before stacking them and fusing them together (Fig. 1.11d). One could in this way allow for the light travel time, and obtain a space-time picture correctly representing simultaneity as exactly horizontal surfaces.
In this particular case, the effect is negligible in practice. However, this will not always be true. Consider, for example, the delays implied from the centre to the edge of the photographic image where an observer in a spacecraft photographs the disc of a galaxy from a distance of 30 000 light years above the centre of the galaxy. If the galaxy has a radius of 40 000 light years, the delay represented in the photograph will be 20 000 years, i.e. the situation at the centre will be depicted 20 000 years after that at the edge of the disc.

*In Section 1.1, we ignored light travel time and so regarded horizontal slices as exactly simultaneous. This will be a good approximation for slowly moving objects considered at ‘everyday’ time and length scales.

Light rays in space-time

In flat space, light travels in straight lines; as it travels at the constant speed c, the path traced out in space-time by light (strictly, by a photon, that is, a light particle) will also be a straight line. Each light ray in space-time represents travelling a distance d in a time t given by t = d/c, where the symbol c is used to represent the speed of light (so c = $3 \times 10^{10}$ cm/sec). For example, if a light ray is emitted in the x direction at the event O with coordinate values x = y = z = 0 with t = 0, then in 1 second it will be at the position x = 1c cm = $3 \times 10^{10}$ cm with y = z = 0; at the time t = 2 seconds, it will be at the position x = 2c cm = $6 \times 10^{10}$ cm with y = z = 0; and so on (Fig. 1.12).

Fig. 1.12 (a) A light ray travelling in the x-direction after emission at the event O (x = 0, t = 0). Its space-time position is shown at t = 1 and t = 2. (b) The same light ray depicted using a spatial coordinate X = x/c (with units of light-seconds).

It is convenient to measure spatial distances in
terms of coordinates X = x/c, Y = y/c, Z = z/c which are just the previous spatial coordinates divided by the speed of light; they are the same distances but measured in terms of ‘light-times’ (light-seconds, light-years, etc.). Then in 1 second the light would be at the position x = 1c cm, y = z = 0, so X = (1c cm)/(c cm/sec) = 1 light-second, Y = Z = 0; at the time t = 2 seconds, it will be at the position X = (2c cm)/(c cm/sec) = 2 light-seconds, Y = Z = 0; and so on. At an arbitrary time t, it will be at the position X = (ct)/c = t light-sec, Y = Z = 0 (Fig. 1.12b). The relation between this and the previous representation is easily obtained on remembering that 1 light-second = (1 sec) × (c cm/sec) = $3 \times 10^{10}$ cm = 300 000 km. Another way of thinking of the coordinates X, Y, Z is that when they are used, we have effectively chosen units of measurement for spatial distances so that the speed of light is 1 (because then light travels a unit distance in a unit time).

In flat space, initially parallel light rays never meet each other because the spatial distance between them stays constant (Fig. 1.13a); consequently in space-time diagrams, they are represented by parallel straight lines that remain a constant distance apart (Fig. 1.13b). We shall see later that this is not true in a curved space-time.

The light cone and causal regions

The future light cone of an event O is the set of all future-directed light rays through that event (Fig. 1.14). This represents the space-time paths of light rays emitted in all directions from that place and time. It may conveniently be thought of as the history in space-time of a flash of light emitted in all directions at the position and instant corresponding to the event O; thus one can imagine a flash bulb going off at this place and time, resulting in a sphere of light spreading out in all directions at the speed of light. At a time t after the flash was emitted, the light forms a sphere at distance d = ct from the source position (Fig. 1.15a).
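The rescaling X = x/c and the relation d = ct can be made concrete with a short calculation. This is an illustrative sketch only: the function names `to_light_seconds` and `light_ray_position` are hypothetical, and c is taken as the rounded value 300 000 km/sec quoted in the text.

```python
# Rounded speed of light, as quoted in the text (assumption: not the exact value).
C_KM_PER_SEC = 300_000.0


def to_light_seconds(distance_km):
    """Rescaled coordinate X = x/c, in light-seconds, for a distance in km."""
    return distance_km / C_KM_PER_SEC


def light_ray_position(t_sec):
    """Position of a light ray emitted from x = 0 at t = 0, after t_sec seconds.

    Returns a pair (x in km, X in light-seconds)."""
    x_km = C_KM_PER_SEC * t_sec  # d = ct
    return x_km, to_light_seconds(x_km)


for t in (1.0, 2.0):
    x_km, big_x = light_ray_position(t)
    print(f"t = {t} sec: x = {x_km:.0f} km, X = {big_x} light-sec")
```

In the rescaled coordinate the position of the light ray is numerically equal to the elapsed time, which is what choosing units with the speed of light equal to 1 amounts to.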
For definiteness, let us assume the event O is (x = y = z = 0, t = 0). It is difficult to represent the full light cone in a diagram, so we restrict our attention to a fixed value of z, say z = 0, obtaining the projection of this spreading light in a two-dimensional plane. The light will spread out circularly in this plane, which is described by coordinates x and y. This is exactly analogous to the spherical wave in the pond (Example (B) above). By exactly the same reasoning as used in that example (leading to Fig. 1.5c), a three-dimensional space-time diagram representing the spread of the light will show the wave front as a cone originating at (x = y = 0, t = 0) and with radius ct at time t (Fig. 1.15b). As the future light cone of the event O obtained in this way represents light travelling out in all directions from the emission event O, it is generated by all the future light rays that pass through O.

Fig. 1.13 (a) Parallel light rays in a three-space with coordinates (x, y, z). (b) These rays are represented by parallel straight lines in space-time.

Fig. 1.14 The future light cone of the event O is the set of all future-directed light rays through O.

To represent this situation in a clear, standard way, it is convenient to use the coordinates X = x/c, Y = y/c, Z = z/c introduced above. Their use has the advantage that in these units the spatial distance travelled is equal to the time elapsed (the effective speed of light is 1); for example, after a time of 1 second, the light has spread to a sphere of radius 1 light-second.

Fig. 1.15 (a) A sphere of light spreading out from a flashbulb. (b) Representation of the spherical light wave in a three-dimensional space-time diagram, giving the future light cone of O.
Consequently the light cone makes an angle of 45° with the vertical axis, representing the fact that a unit horizontal distance in these diagrams is traversed in a unit time; this makes it particularly easy to draw the light cones when these units are used (Fig. 1.15b was drawn using this convention).

It is often convenient to restrict our attention even further to a fixed value of Y (say Y = 0) as well as a fixed value of Z. The light then spreads out in a one-dimensional space with X as the spatial coordinate (this situation might be realized, for example, if a pair of optical fibres convey the light from the flashbulb in the positive and negative X directions, Fig. 1.16a). The corresponding two-dimensional space-time diagram shows the light emitted from the event O as travelling on lines at ±45° to the t axis (Fig. 1.16b); these are the two light rays through O, because such lines are precisely those in which a unit (vertical) change in time corresponds to a unit (horizontal) change in distance. This diagram is a two-dimensional section (with one time and one space dimension represented) of the three-dimensional Fig. 1.15b (representing one time and two space dimensions). In this diagram we have extended the light rays to the past of O; the light rays converging on O from the past generate its past light cone, representing converging light pulses that arrive at the position (X = Y = Z = 0) at the time t = 0.

The importance of the light cone of any event derives from the fact that it limits the region of space-time which can be causally affected from that event. For example, suppose President Lugarnev of Transylvania receives information at noon that at 3:00 p.m. a nuclear missile is to be launched towards his castle on the earth from a secret base on Mars.
He instantly presses the button firing his Super-Z lasers at the base on Mars, but he is too late: the energy bolts he has released, travelling at the speed of light, will take 4 hours to reach Mars and so will destroy the rocket launching pad 1 hour after the missile has left. Let the event where he receives the information be O; this event (specified by a time and spatial position) is then noon at his castle. The light cone of O is depicted in Fig. 1.17, where, for convenience, time is measured in hours from O and spatial distances in light-hours from O (so O has the coordinates t = 0, X = 0). Then the event where the missiles are to be launched is P, given by t = 3, X = 4. The light cone clearly shows that the laser beam emitted at O will arrive at Mars too late to influence P. One cannot influence P from O, because it is outside O’s light cone.

Fig. 1.16 (a) Light spreading from a flashbulb one-dimensionally along optical fibres. (b) Representation of these light rays in a two-dimensional space-time diagram, generating the future light cone of O. The past light cone of O (i.e. light rays converging to O) is also shown.

The reason for this limitation, of course, is the limiting nature of the speed of light. The angle of a particle’s world-line in space-time from the vertical depends on the rate of change along the world-line of spatial distance with respect to time, and so represents the speed of motion of the particle relative to the chosen coordinate system (Fig. 1.18). Therefore, the limiting nature of the speed of light means that no world-line can make a greater angle with the vertical than the light cone; using the coordinates (X, Y, Z), no world-line can make an angle larger than 45° with the vertical axis. Further, one can only send light or radio signals from any event to events on its future light cone.
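The test applied in the Mars example (is the target event on or inside the future light cone of the emission event?) can be written as a short check. This is a minimal sketch under stated assumptions: the function name `can_influence` is hypothetical, only one spatial dimension is considered, and times and distances are in matching units with the speed of light equal to 1 (here hours and light-hours).

```python
def can_influence(t_from, x_from, t_to, x_to):
    """True if a signal travelling no faster than light can get from the first
    event to the second. Units are chosen so that the speed of light is 1."""
    dt = t_to - t_from          # elapsed time
    dx = abs(x_to - x_from)     # spatial separation
    return dt >= 0 and dx <= dt


# Mars example: (t, X) in hours and light-hours, measured from O.
event_o = (0.0, 0.0)  # noon at the castle
event_p = (3.0, 4.0)  # missile launch on Mars
event_r = (4.0, 4.0)  # laser pulse arrives at Mars

print(can_influence(*event_o, *event_p))  # P is outside the future light cone of O
print(can_influence(*event_o, *event_r))  # R lies on the future light cone of O
```

The check returns False for P and True for R, matching the statement that the laser pulse reaches Mars an hour after the launch.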
Considering this, it becomes clear that an observer at an event O cannot influence any event that lies outside the future light cone of O (to do so would involve causally influencing events along paths representing motion at speeds greater than the speed of light). This is a fundamental limitation on all communication, implied by special relativity theory.

Fig. 1.17 (a) A space-time diagram showing the event P (t = 3, X = 4) where missiles are launched from Mars towards the Earth. At the time t = 0 on the earth (at X = 0), it is already too late to prevent the launching of these missiles; this is because a laser pulse emitted at this event O will reach Mars at the event R (t = 4, X = 4), an hour after the missiles were launched. (b) Depiction of this series of events by a sequence of instantaneous spatial views. At t = 0, the castle fires a bolt towards the missile base; at t = 3, the base fires the missile while the bolt is still a light-hour away from it; at t = 4, the base is destroyed but the missile is on its way to the castle. Note the direct correspondence between these spatial views and the space-time diagram. The reason event P cannot be influenced from event O is because P is outside O’s future light cone (the light ray OR lies on this light cone).

It follows that given any event P, we may divide space-time into five distinct causal regions (Fig. 1.19). The interior of the future light cone $C^{+}(P)$ is that region which can be influenced by objects travelling from the event P at less than the speed of light; the future light cone itself can be influenced from P by signals travelling at the speed of light. The past light cone represents the set of events in space-time from which signals sent at the speed of light arrive at the spatial position and time represented by event P.
Thus in a photograph of an object taken at P, the light arriving at P records the situation at the instant where the object’s world-line intersects our past light cone (Fig. 1.20); the camera necessarily records the resulting time delays (as in the cover photograph). The interior of the past light cone $C^{-}(P)$ is the region in space-time from which the event P can be influenced by objects travelling at less than the speed of light. The exterior of the light cones is the region which cannot be influenced by P and which cannot influence P.

Fig. 1.18 A straight world-line passing through O and P represents motion relative to the reference frame (t, X) at a speed v in the X-direction; at time t, it is at position X = x/c = vt/c. The angle α of this world-line to the vertical is given by tan α = X/t = v/c. For a light ray, v = c and tan α = 1.

Fig. 1.19 The future and past light cones $C^{+}(P)$, $C^{-}(P)$ of an event P determine the future of P (the interior of the future light cone), and the past of P (the interior of the past light cone). Events outside these light cones cannot be influenced from P or influence what happens there.

Fig. 1.20 A photograph taken by observer A at the event P depicts the event R in B’s history, where B’s world-line intersects the past light cone of P.

One can illustrate the latter feature by considering a particular event on the surface of the Earth, when an astronaut on the Moon is observed through an ultrapowerful telescope. Suppose that at this time one were to observe a boulder rolling down a slope towards the astronaut. Since light takes 1.27 seconds to reach the Earth from the Moon, we are observing an event 1.27 light-seconds away and 1.27 seconds to the past, on the past light cone (Fig. 1.21). It is already too late to radio a warning to the astronaut if the boulder will take 2 seconds to reach him, because the event where the boulder will reach him is outside the causal future of the reception event. Given the restrictions on communication resulting from the limiting nature of the speed of light, there is no method of sending a warning signal in time.

Fig. 1.21 The past and future light cones of an event O in the history of an observer A on the Earth, who (at the event O) sees event e (a threatening boulder starting to roll down) in the history of an astronaut B on the Moon. Observer A immediately sends a warning signal to B; but this arrives at event r, after the boulder has just hit the astronaut at the event b in his history. Because b is outside the future light cone of O, the observer at O cannot influence what happens there.

The causal limitations discussed here are fundamental, but will not significantly affect ordinary everyday life in an obvious way because the speed of light is so large: in the context of cars, aircraft, etc. on or near the surface of the Earth, the resulting delays in communication are negligible. They become significant either when large distances or times are involved, or if the time-scales involved in some process are such that the speed of light is a significant limiting factor. One example is supercomputers: an ultimate limit is imposed on their possible speed of calculation because information cannot be conveyed from one part of the computer to another at speeds greater than the speed of light; this limits the number of calculations that can be performed per second. For this reason, distances between their components must be kept small; thus supercomputers of the future will be small machines.

Exercises

1.6 A satellite takes survey pictures of a square region of the Earth, 800 km in width, from 300 km above the Earth’s surface. What is the delay from the centre of the image to the edge?
(Regard the Earth’s surface as flat in order to simplify the calculation.)

1.7 Suppose that a ‘mind reader’ in London claims to know what his twin brother in New Zealand says at any moment, within less than one-hundredth of a second after a word is uttered. Is there anything extraordinary about this claim? [The radius of the Earth is about 6000 km.]

1.8 A rocket R moves in the Z direction relative to an observer A on Mars, at a speed v where v/c = 4; their positions coincide at t = 0. Plot the world-lines of A and R in a (t, Z) diagram. The rocket emits light signals in both the forward and backward directions at t = 2 sec; draw the corresponding light rays in the space-time diagram. The observer A signals to the rocket at the time t = 1 sec: what is the earliest time he can expect to get a reply? [All distances and times are measured in the reference frame of the observer A.]

1.9 Draw a diagram to illustrate the fact that the ‘past’ (i.e. the past light cone and its interior) of any point P on any world-line always includes the ‘past’ of any earlier point Q on that world-line. Interpret this result in physical terms.

Computer Exercise 1

Write a program that will either (a) take as input a spatial distance D (in miles or km) and give as output the time T (in seconds, minutes, or hours) for light to travel that distance; or (b) take as input a light travel time T, and give as output the corresponding distance D. Try the program for suitable distances on the Earth, and in the solar system. Now alter the program to print out additionally the rescaled distance D1 = D/c, where c is the speed of light. Notice the simplification achieved. [This corresponds to use of coordinates X, Y, Z discussed above, for which the speed of light is unity. Your output should always state the units of time and distance being used.]

1.3 Relative motion in special relativity

We have seen that even in Newtonian theory, two observers in relative motion will, in general, have different views of space-time.
We have also seen how such differing Newtonian views may be reconciled. According to Einstein’s special theory of relativity there are some basic features which are common to observers using different reference frames. These are described in the special principle of relativity:

The laws of physics are the same for all non-accelerating observers.

In the Newtonian theory, this result is well established as far as the laws of dynamics are concerned: there is no way for an experimenter to determine absolute uniform motion by any dynamical experiment. For example, if one carried out a series of experiments involving measuring the motion of colliding billiard balls, timing pendulums, etc. in a compartment in a uniformly moving train, the results are independent of the speed of motion of the train. Therefore, one cannot determine the speed of motion of the train by any such experiments, as they are not affected by this speed; indeed, the results of the experiments will be exactly the same as if the train is at rest. Similarly, the results would be the same if the experiments were done in the Concorde airliner flying smoothly at twice the speed of sound. This set of results establishes the Newtonian principle of relativity, that the laws of dynamics of particles and rigid bodies are the same in all non-accelerating frames.

The genius of Einstein lay in extending this principle to all the laws of physics (it applies e.g. to optics, thermodynamics, electromagnetic effects, and elementary particle physics). Thus, the special principle of relativity implies that no physical experiment whatever can establish the absolute motion of any uniformly moving body (one can easily establish motion relative to other bodies, but that is not the issue: the point is that we cannot determine the motion of the Earth at some instant as being say 350 km/sec in any particular direction, in an absolute sense).
This is because no experiment can detect such absolute motion; and that is because the laws of physics are unaffected by any absolute uniform motion.

One can rephrase the principle of relativity as stating the equivalence of all inertial reference frames. The set of coordinates used by an observer to describe space-time, with himself at the origin (x = y = z = 0), constitutes his reference frame. A reference frame is said to be inertial if it is non-rotating and non-accelerating. Newton’s laws of motion imply that a body experiences an acceleration relative to an inertial reference frame if and only if forces caused by other bodies act on it; indeed this feature may be used to characterize inertial frames. If one frame is inertial, any other frame moving uniformly relative to it is also inertial. The claim then is that one may use any inertial reference frame and the laws of physics will be unchanged.

At first this principle seems obscure, but after we have encountered it in various contexts and seen its implications, its nature will become obvious. It is a powerful unifying principle underlying all known laws of physics. It is already clear that it is useful in the following sense: it implies that if a body is in uniform motion, we do not have to specify that state of motion before being able to apply the laws of physics to it. For example, the operation of the electric generators and motors in an aircraft is unaffected by the motion of the aircraft, if it is moving uniformly. Therefore we do not have to design the motors to take the speed of operation into account; an electric motor that works on the surface of the earth will work equally well in a rocket moving uniformly at 25 000 miles an hour relative to the surface of the Earth. Engineering would be very difficult indeed if this were not so.
Invariance of the speed of light

A major implication of the relativity principle is Einstein’s principle of the invariance of the speed of light:

The speed of light in empty space is the same for all observers, independent of the motion of the source and of the observer.

If the speed of light were not independent of the motion of the observer, we could detect absolute motion by measuring the speed of light in different directions, so contradicting the principle of relativity. Given this invariance it is then clear that the speed of light must be independent of the motion of the source also, or else its absolute motion could be detected by measuring the speed of light it has emitted (which would be measured to be the same by all observers). This principle is supported by all available experimental evidence, in particular, by the famous Michelson-Morley experiment which showed that the speed of light emitted by distant stars is the same when measured from the Earth, whether the Earth in its orbit around the Sun is moving towards or away from the stars (Fig. 1.22). In addition, this principle is also a consequence of the relativity principle applied to particle dynamics, because the speed of light is a limiting speed for particle motion (cf. the previous section). This implies that if the speed of light were different in different reference frames, one could use dynamical experiments (aimed at determining the limiting velocity of motion) to determine the absolute motion of each reference frame.

Given the validity of this result, it becomes starkly clear that we will have to revise our ideas about many features we have previously taken for granted. To see this, we consider three important effects of special relativity.

The problem of velocity addition

Consider an observer A stationary on the ground watching a rocket, which is passing by at 150 000 km/sec, as it emits a light signal in its direction of motion (Fig. 1.23a). The observer A will measure the speed of this light to be 300 000 km/sec.
If she works out on the basis of ordinary Newtonian theory the speed of motion of the light that would be measured by an observer B on the rocket, she will argue as follows: ‘I measure the light to be moving past me at 300 000 km/sec and the rocket to be moving past me in the same direction at 150 000 km/sec. Therefore observer B on the rocket will measure the speed of the light to be (300 000 - 150 000) km/sec = 150 000 km/sec’. But viewing the situation from the frame of observer B (Fig. 1.23b), the speed of this light is measured to be 300 000 km/sec, as it must be by the principle of invariance of the speed of light, in dramatic contrast to the result calculated by A. The Newtonian law of velocity addition is drastically wrong when the velocities involved are comparable with the speed of light; it must be replaced by a new law that is compatible with the principles of special relativity.

Fig. 1.22 In the Michelson-Morley experiment, the speed of light emitted by a star is measured both when the Earth in its orbit around the Sun is moving towards the star and away from it. The same result is obtained for the speed of light in both cases: this speed is independent of the relative motion of the source and the observer.

Fig. 1.23 (a) An observer at rest on the Earth measuring the speed of a light signal emitted from a fast-moving rocket. (b) The same situation but viewed from the rest frame of the rocket.

The Newtonian law of velocity addition is wrong because it is based on incorrect ideas about length and time measurements that do not adequately take into account the principle of special relativity. We will determine the correct relativistic velocity addition law in Section 3.2.
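The contrast between A’s Newtonian calculation and what B actually measures can be illustrated numerically. The sketch below is not taken from this section: it assumes the standard special-relativistic transformation for velocities along a single line, u' = (u - v)/(1 - uv/c²), which this section does not derive (the text defers the derivation to Section 3.2). The function names are hypothetical, and c is the rounded value 300 000 km/sec.

```python
# Rounded speed of light, as quoted in the text (assumption: not the exact value).
C_KM_PER_SEC = 300_000.0


def newtonian_relative_speed(u, v):
    """Speed of the signal relative to the rocket according to Newtonian theory."""
    return u - v


def relativistic_relative_speed(u, v, c=C_KM_PER_SEC):
    """Standard relativistic result for velocities along one line
    (assumed here; not derived in this section)."""
    return (u - v) / (1.0 - u * v / c**2)


u_light = 300_000.0   # light signal as measured by A (km/sec)
v_rocket = 150_000.0  # rocket as measured by A (km/sec)

print(newtonian_relative_speed(u_light, v_rocket))     # A's Newtonian estimate
print(relativistic_relative_speed(u_light, v_rocket))  # what B measures
```

With these inputs the Newtonian estimate is 150 000 km/sec, while the relativistic expression returns 300 000 km/sec, in agreement with the invariance of the speed of light.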
The dependence of relative clock rates on relative motion

It is clear that something strange happens to the measurement of time in relativity theory, because of the following simple 'thought experiment' due to Einstein. Suppose one watches a large clock (such as that on the tower in the central square in Berne) through a powerful telescope, as one moves away from it in a very fast tram (!) that passes the clock at exactly midday. If the tram could move at the speed of light, an observer on it would see the clock appear to stand still, because the light emitted by the clock at midday would be travelling away from it at exactly the same speed as the tram, and light emitted at later times could not catch up with the observer. Thus on using his telescope, he would always see the clock hands stand at 12 o'clock. Indeed all other happenings next to the clock tower would also be seen by him exactly as they were at midday, because the light he receives from the tower at all later times is the light that left it then. If one could move as fast as light, time would appear to stand still!

To analyse further how time behaves according to special relativity theory, we must consider how it is measured by a clock. In general, a clock is a complex mechanism that is difficult to analyse. Conceptually the simplest is a 'light clock', constructed by means of a light source that emits signals which travel a distance $d_0$ and are then reflected back to the source (Fig. 1.24). The time interval between emission and return of the signals defines the 'ticks' of such a clock; they occur a time $2t$ apart, where

\[ 2t = 2d_0/c \quad\Leftrightarrow\quad t = d_0/c \tag{1.1} \]

(because the signals travel at the speed of light). Suppose such a light clock is attached to a rocket (Fig. 1.25a); seen from the rocket's frame, the time measured will be given by eqn (1.1) independently of its state of motion (because of the principle of relativity).
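The light clock lends itself to a small numerical sketch. The first function below is eqn (1.1); the second anticipates the Pythagoras argument of the following paragraph, in which the light is seen from the ground to travel along the hypotenuse of a right triangle. The function names and the sample values (a one-metre arm, a speed of 0.6c) are our own choices for illustration.

```python
import math

C = 299_792_458.0  # speed of light in metres per second


def rest_frame_tick(d0, c=C):
    """Eqn (1.1): one-way light travel time t = d0/c, measured in the
    frame in which the clock (arm length d0) is at rest."""
    return d0 / c


def ground_frame_tick(d0, v, c=C):
    """One-way light travel time t' for the same clock when it moves at
    speed v past the ground. Seen from the ground the light covers the
    hypotenuse of a triangle with sides d0 and v*t' at speed c, so
    (c*t')**2 = (v*t')**2 + d0**2, which is solved here for t'."""
    if not 0 <= v < c:
        raise ValueError("v must satisfy 0 <= v < c")
    return d0 / math.sqrt(c**2 - v**2)


d0 = 1.0       # arm length in metres (illustrative)
v = 0.6 * C    # rocket speed (illustrative)

ratio = ground_frame_tick(d0, v) / rest_frame_tick(d0)
print(round(ratio, 6))  # 1.25: the moving clock's ticks are 25% longer
```

The ratio depends only on v/c and not on the arm length, which is the first hint that the effect concerns time itself rather than the construction of the clock.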
Now suppose the rocket moves past an identical clock on the ground, at a speed v (Fig. 1.25b). Considered from the ground, the light always travels at the same speed; therefore the interval between emission and reception of light by the clock on the rocket is measured from the ground to be $2t'$, where the distance travelled by the light is given by Pythagoras' theorem, so

\[ c^2 t'^2 = v^2 t'^2 + d_0^2 . \]

This implies

\[ t'^2 (c^2 - v^2) = d_0^2 \quad\Leftrightarrow\quad t'^2 (1 - v^2/c^2) = d_0^2/c^2 . \]

Taking the square root and dividing by $(1 - v^2/c^2)^{1/2}$ gives

\[ t' = \frac{d_0/c}{(1 - v^2/c^2)^{1/2}} . \]

But the rate of the clock on the ground is given by (1.1). Thus

\[ t' = \frac{t}{(1 - v^2/c^2)^{1/2}} . \tag{1.2a} \]

We see that even with identical light clocks, the ticks of clocks in relative motion measure time at different rates. Since $t'$ is larger than $t$, the moving clock is seen from the ground to 'run slow' in the ratio

\[ t/t' = (1 - v^2/c^2)^{1/2} . \tag{1.2b} \]

This effect is significant when motion is at speeds near the speed of light. We shall rederive this result in Section 3.4, and discuss its experimental verification in that section and in Section 3.6.

Fig. 1.24 A 'light clock' consisting of two mirrors held at a fixed distance by a rigid rod, and a pulsed light source. The 'ticks' of the clock occur each time the pulse of light is reflected by the bottom mirror.

Fig. 1.25 (a) A light clock fixed to a rocket, viewed from the rest-frame of the rocket. The light is reflected from a mirror at a distance $d_0$, and is received back after a time $2t$. (b) A light clock aboard a rocket moving at speed v relative to an identical clock on the ground. An observer on the ground sees the light received back by the rocket after a time $2t'$.

The 'twin paradox'

An interesting example of this effect is the so-called 'twin paradox'. Suppose that one of a pair of twins (i.e.
siblings born on the same day) goes on a long journey at very high speeds in a rocket ship, while the other stays at home (Fig. 1.26a). The twins are inequivalent because the one experiences varying accelerations associated with the changing speed of the rocket in which she is travelling, whereas the other does not. The biological systems of the moving twin will be measured by the stationary twin to run slowly. Age is measured by means of ideal clocks each twin carries with him or her, and they will not be the same age when they meet again; the one who stayed at home will be older. This different aging, evidenced in biological processes, will be confirmed by mechanical or electrical clocks the twins carry with them. For the effect to be significant, the relative motion must take place at close to the speed of light.

Fig. 1.26 The 'twin paradox'. (a) Twin A stays at home while twin B goes on a long return journey at high speed. (b) A clock actually measures time along its world-line in space-time (each 'tick' can be thought of as a marker on the world-line). (c) A space-time diagram of the twins' histories: twin A's clock measures time $t$ along his world-line, while twin B's clock measures time $t'$ along her world-line.

This effect is not really surprising when one asks what a measurement of 'time' means in space-time. Remembering that clocks are mechanisms whose history is represented by a world-line in space-time, we see that it is plausible that what they really measure is 'distance in space-time' along the world-lines representing their history (Fig. 1.26b). Because the twins have followed different space-time paths between the events when they are together initially and finally (Fig. 1.26c), it is not too surprising that they have lived for different times.

A similar effect occurs on the surface of a table. The distance d from P to Q along the curve C is different from that along the route C' (Fig.
1.27a); the 'twin paradox' is the analogous effect in space-time. There is, however, a significant difference in the two effects: from this analogy one might at first expect that the time measured by the twin who moves out and back would be longer, as her world-line looks longer in Fig. 1.26c; but the actual sign of the effect is the opposite. We must be careful; while such a space-time diagram accurately represents instantaneous relative spatial positions and time measurements made by a single observer, we must not jump to conclusions about what spatial or time measurements will be made by other observers. In this case the diagram represents accurately measurements made by the stay-at-home twin A, but does not in an obvious way represent measurements made by the traveller B. What is clear from the diagram is that we may expect B's time measurements to differ from those of A, but we must not jump to conclusions as to how they differ.

Fig. 1.27 (a) Two paths between points P and Q on the surface of a table. The straight-line path C is of length $d$, while the curved path C' is of length $d'$. (b) Four routes from town Q to town P lying on opposite sides of the city C. Travel time is longest on route a through the city centre, and is shortest on the apparently longer route d, a freeway that avoids even the outer suburbs of the city. This provides a good analogy to the situation in Fig. 1.26c.

An analogy to the situation represented by the space-time diagram can be given as follows: imagine towns P and Q lying respectively to the north and south of an ancient city C. The roads of this city are very congested by heavy traffic passing through narrow streets, so the closer one travels to the city centre the slower travel by car is. One can choose routes from P to Q through the centre of the city, through inner suburbs, through outer suburbs, or on a ring road that avoids the city altogether; these further-out routes from P to Q of course involve travelling a longer distance, as is at once apparent from a map (Fig. 1.27b). However, the travel time from P to Q is shortest on the ring road and longest on the road through the city centre. The map represents accurately the different possible paths from P to Q, but not the different times it will take to travel on these routes; the shortest travel time from the initial to the final points is associated with the path that looks longest on the map. This gives us a good analogy to the space-time situation represented in Fig. 1.26c. One can crudely understand the sign of the effect in that case by remembering the example of the observer watching a clock from a tram, which suggests that the nearer to the speed of light a clock moves relative to an observer A, the slower it will appear to him that it is running. We shall discuss the time dilation effect and 'twin paradox' fully in Section 3.4.

The dependence of simultaneity on the reference frame

One of the most important features of relativity theory is that observers in relative motion will disagree about simultaneity. As an example consider, as before, two observers A and B viewing a billiard table from above, the one being stationary above the centre of the table and the other moving to the left (cf. Fig. 1.2). Now suppose red and green billiard balls R and G fall into pockets at opposite edges of the table, at exactly the same instant (as measured by an observer stationary relative to the table). Light waves recording these events are emitted from the two edges of the table at the same instant (as seen from the table).
Let this instant be when A and B coincide (Fig. 1.28a); thus, the light is emitted equidistant from both A and B. Both waves reach A at the same instant T (Fig. 1.28b). Since A is equidistant from the two edges, he deduces that the billiard balls were pocketed simultaneously (cf. the discussion above of simultaneity in the photograph of the pond). However, by the time the waves reach A, B's motion will have ensured that the wave from the left has already passed B while the wave from the right has yet to catch up with him. Thus, B observes that the red ball was pocketed before the green one. Therefore, observers in relative motion disagree about simultaneity.

This will be manifested in photographs taken by the observers: they will represent the same spatial regions viewed at different time slices. For example, A's picture taken at T shows R and G being pocketed simultaneously, whereas B's picture shows R already in the pocket and G still approaching the edge. If two photographs taken by A and B show the left-hand edge at the same time, B's photograph will show the right-hand edge at an earlier time than A's. Thus a surface of simultaneity in space-time for B will be tilted relative to a surface of simultaneity for A (Fig. 1.29). The conclusion is that observers in relative motion determine different splittings of space-time into space and time. Space-time is a unit which unifies space and time, but does so in different ways for different observers.

Fig. 1.28 (a) Two cameras A and B over a billiard table: A is stationary above the centre, and B is moving to the left. Light rays are emitted from the sides of the table as two balls R and G are simultaneously pocketed; at the same instant, A and B coincide. (b) Both light rays reach A at the same instant, but B receives the light from the left before the light from the right. Thus he sees R fall into a pocket before G.

Fig. 1.29 (a) Surfaces of simultaneity for A and B, showing how relatively moving observers determine different space-sections of space-time as being instantaneous. (b) Cross-section (y = constant) of Figure (a). The surface of simultaneity for A is parallel to the x-axis, but that for B is tilted relative to it.

The argument above is indicative of the fundamental feature that simultaneity is determined relative to the motion of the observer, but does not enable one to understand the issues fully. A full technical examination of simultaneity and how to measure it follows in Section 3.3.

Exercises

1.10 Which of the following properties would you expect for a correct relativistic velocity addition law, combining parallel velocities $v_1$ and $v_2$ to produce a resultant velocity $v_3$? (i) … (ii) …

… What value do you find for $T'/T$ when V = 0? Interpret your results physically.

These various effects of relative velocity warn us to be cautious in interpreting space-time diagrams. Suppose that a space-time diagram is drawn from A's viewpoint; then the coordinates (t, X, Y, Z) represent the results of A's measurements, and we can read the results of his measurements directly from the diagram. Without further investigation, we cannot assume that we know the results of measurements made by other observers. In particular, we cannot read off directly the results of time or space measurements made by an observer B moving relative to A, because the diagram does not represent the relation between A's and B's measurements of space or time in a simple way. We can indeed use the diagram to understand these relationships, as we shall see later, but must be careful in the way we do so and we must avoid preconceptions.
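The billiard-table argument earlier in this section can be made quantitative while staying entirely within A's picture of events, in keeping with the caution just expressed. The sketch below is a minimal illustration with assumed names and values: A sits at X = 0, B passes X = 0 at t = 0 moving to the left at speed v, and the two flashes leave the edges X = -L and X = +L at t = 0. All times returned are those assigned by A; what B's own clock reads is a separate question, taken up in Chapter 3.

```python
def arrival_times(half_width, v, c=1.0):
    """Times, in A's frame, at which the two light flashes reach A and B.

    half_width: distance L from the table centre to each edge.
    v: speed of B towards the left-hand edge (0 <= v < c).
    """
    if not 0 <= v < c:
        raise ValueError("v must satisfy 0 <= v < c")
    return {
        # A is at rest midway between the edges: both flashes arrive together.
        "A": half_width / c,
        # B moves towards the left flash: -v*t = -L + c*t  =>  t = L/(c + v).
        "B_left": half_width / (c + v),
        # B moves away from the right flash: -v*t = L - c*t  =>  t = L/(c - v).
        "B_right": half_width / (c - v),
    }


times = arrival_times(half_width=1.0, v=0.5)
print(times["A"])        # 1.0
print(times["B_right"])  # 2.0
print(times["B_left"] < times["A"] < times["B_right"])  # True
```

The ordering in the last line is the content of Fig. 1.28b: the flash from the left reaches B before either flash reaches A, and the flash from the right reaches B only afterwards.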
Conclusion

Space-time diagrams give a very convenient description of spatial and temporal relations, which enables us to clarify important features such as the nature of causal relationships. The examples given so far show that in order to understand relativity theory properly, and the way space-time represents space and time measurements for different observers, we need to rethink carefully the nature of space and time measurements. We shall do so in Chapter 2, and then work out systematically the consequences for the geometry of the space-time of special relativity in Chapter 3 (studying there in depth the concepts introduced in this chapter). The unifying theme of a space-time interval will be introduced in Chapter 4, and used in later chapters to study some basic ideas of curved space-times.

While all the preceding material is necessary for a full understanding of the later chapters, so that ideally one should read them in sequence, nevertheless a reader who wishes to proceed directly to the main ideas of curved space-times can do so now by reading Chapter 5. However, understanding of the interesting applications in Chapters 6 and 7 will be greatly benefited by a perusal at least of the flat-space universes discussed in Section 4.3.

Although we shall mention it again in the Afterword, let us recommend as an additional source of discussion and examples the book Space-Time Physics by E. F. Taylor and J. A. Wheeler (second edition: Freeman, 1992); this describes special relativity (and a little beyond) in a highly readable way, with lots of examples and pictures, and provides a useful parallel text which could be read in conjunction with Chapters 1-4 of this book.

2 Fundamentals of measurement

To build a proper foundation for understanding relativity theory we need to consider in turn the bases of measurement of time, distance, and instantaneity, because these are the fundamentals on which other kinematic measurements, such as velocity measurements, depend.

2.1 Time

We assume the existence of ideal clocks which measure time accurately along their world-lines. These clocks may for example be mechanical (e.g. based on an escapement mechanism controlling the rate at which a spring unwinds), atomic (e.g. depending on the half-life of a radioactive substance), electromechanical (e.g. based on a crystal), or electronic (based on an electronic oscillator). The notion of perfect measurement of time along a world-line is important because it implies the universality of time measurement in the following sense. The equations determining the mechanical response of a body involve time, as do the equations of electromagnetism and of atomic and nuclear structure. Until we have investigated further, we are not entitled to assume that these times and the times in other physical laws are the same, or even simply related to each other. However, to the accuracy so far measured, it turns out that the relevant time is the same for all physical systems: we do not have to allow for different time variables in mechanical systems, thermal systems, atomic systems, etc. Therefore, we do not have to specify the kind of clock to be used by an observer: the universality of time allows him to base his clock on any physical principle he chooses. Ideal clocks constructed on the basis of any physical laws will all agree with each other.

The further point of importance to be emphasized is that a clock by itself cannot determine a time measurement at some point away from itself (I cannot obtain a reading from a clock remote from me unless transmitting and receiving mechanisms are used to transfer data from it to where I am). Thus, clocks by themselves cannot establish surfaces of instantaneity in space-time, but rather measure time along a world-line (namely, the world-line of the clock in space-time, Fig. 2.1).
There is no implication here that the same time will be measured from an initial to a final point along different world-lines, and indeed, in relativity theory this is not expected to be true (cf. Fig. 1.26 and the discussion in Section 1.3). Experimental evidence shows that special relativity is correct: ideal clocks have been flown around the world in airliners and compared with identical clocks stationary on the ground. Their readings differ, in agreement with the prediction of special relativity. Thus the Newtonian idea of a uniform flow of time that is the same for all observers is wrong.

Given any world-line, then, there is a unique time measured along that world-line by any ideal clock moving along it. This is called proper time along that world-line. All direct time measurements are measurements of proper time along some world-line or other. To relate proper times measured along different world-lines requires the use of signalling devices that can transfer information between distant observers; we shall deal with this in Section 2.3 below.

Given this understanding, there is one particular 'time' that needs clarification: namely, what is the significance of the time coordinate t specified in the standard coordinates (t, X, Y, Z) used to describe space-time by an observer A (cf. Section 1.1)? The answer is that it is the proper time measured by that observer along his own world-line in space-time, which is the line (X = Y = Z = 0) in those coordinates (Fig. 2.2). It does not directly indicate time measured along other arbitrary world-lines. However, as we shall see later, it will correctly give the time measured by any observer who is at rest in this coordinate system, i.e. who is stationary relative to A.

Fig. 2.1 Measurement of time is based on the fact that a clock measures time $t'$ along its own world-line in space-time.

Fig. 2.2 The time t in the standard coordinate system of an observer A is time measured by a clock stationary relative to him. It measures time along his world-line (the line X = Y = 0, which is the origin of the spatial coordinates in his reference frame).

Exercise

2.1 The period of rotation of the Earth as measured by an electromagnetic crystal clock is found to be increasing. Does this imply that (a) dynamical time (as measured by the fundamental laws controlling the Earth's rotation) is different from electromagnetic time, or (b) that the Earth's rotation is an imperfect clock for some reason?

2.2 Distance

In texts on elementary physics it is often stated that rulers or 'rigid rods' are the basis of measurement of distances. However, they are very imperfect measures of distance; the length of a ruler varies with temperature, for example, and will be different if it is held horizontally or vertically in a gravitational field (because of the elastic response to stresses induced by gravity). Therefore, 'corrections' must be made to allow for the fact that a ruler does not in fact measure a constant distance under all conditions. Further, it is impracticable to use a ruler (or series of rulers) to measure accurately the distance from Rome to Venice or Dover to Calais, let alone from the Earth to the Moon or Mars. Some more practical method must exist.

Measuring the distance of one object from another which is far from it implies sending signals or information between these objects. The invariance of the speed of light means that electromagnetic radiation is the best basis for standard measuring devices in space-time. This is true in particular for the measurement of distance. Thus, the proper basis for measuring distance in special relativity is radar. This works as follows: to measure the distance between points P and Q, an electromagnetic signal is emitted by a transmitter at P and reflected back to P from Q (Fig. 2.3).
The times $t_1$ of emission and $t_2$ of reception of the signals are measured by an ideal clock at P. Let the difference between these times be $t = t_2 - t_1$; this is then the light travel time for the outward and return journeys. If the distance between P and Q is $d$, the distance travelled by the light is $2d$. But light travels at the invariant speed c; so $t = 2d/c$, and the distance measured corresponds to half the light travel time:

\[ d = \tfrac{1}{2} c\,t \quad\Leftrightarrow\quad d/c = \tfrac{1}{2}(t_2 - t_1). \tag{2.1} \]

Fig. 2.3 A device to measure the distance between P and Q: a radar signal (usually a radio wave) is sent from P at the time $t_1$, reflected at Q, and the echo received by P at time $t_2$. The distance $d$ then follows from the light travel time $t_2 - t_1$.

As an example, if the light is emitted at 12:01 and received at 12:03 then $t_1$ = 12:01, $t_2$ = 12:03, $t$ = 2 minutes, and the distance is 1 light-minute = 60 light-seconds = 60 sec × 300 000 km/sec = 18 000 000 km. By contrast, if $t$ = 2 μsec = 2 × 10⁻⁶ sec, then $d$ = 1 μsec = 300 metres.

This use of radar to measure distance, apart from being the fastest method, is in most cases the only practical method. It is for example the basis of accurate measurement of distance for mapping purposes by surveyors (e.g. through a device called a Tellumat, see Fig. 2.4). It has been used to measure the distance to the Moon and to Mars with unprecedented accuracy. It is routinely used by ships and aircraft to determine distances to other ships and aircraft.

Fig. 2.4 The Tellumat, an advanced distance measuring device based on the radar principle. This instrument uses microwave radiation to measure distances between 20 m and 25 km to within an accuracy of 5 mm. The distance measured appears directly as a digital read-out on the hand-held control unit. (Photograph from Plessey plc.)

Also, because of the problems with defining
a length standard by means of a 'rigid rod', the metre is now defined as the distance light travels in a given time; thus the constancy of the speed of light, the basis of radar, is also the basis now used to define length in a laboratory. From now on, in this book we shall assume that radar is the practical means of measuring distance. A space-time diagram of the use of radar to measure distance is given in Fig. 2.5.

Fig. 2.5 A space-time diagram of the measuring procedure in Fig. 2.3.

Unless otherwise stated, we shall from now on use the coordinates (t, X, Y, Z) introduced in the last chapter, scaled so that the speed of light is 1 (because lengths are measured in light travel times) and the light cone is at 45° to the vertical in space-time diagrams. Then all world-lines of massive particles must make an angle of less than 45° to the vertical in these diagrams, because they cannot move faster than light. This convention has been used in Fig. 2.5.

When radar is used to measure distance, it is very natural to describe distances in terms of light travel times (e.g. μsec, sec, years). To convert to ordinary units, one just has to multiply by the speed of light. For example, 1 μsec is (10⁻⁶ sec) × (3 × 10¹⁰ cm/sec) = 3 × 10⁴ cm = 300 metres; 1 msec is 300 km; 1 sec is 300 000 km. In these units, the mean distance from the Earth to the Moon (381 550 km) is 1.27 sec; the mean distance from the Earth to the Sun (149 600 000 km) is 8.31 minutes; the distance to the nearest star is 4.27 light-years.

We can now give a direct meaning to the standard spatial coordinates (X, Y, Z) in an observer's space-time picture. Along the coordinate axes, they are just the distances measured by him by radar from his world-line (X = Y = Z = 0) to the event in question (Fig. 2.6), in units of light-travel time; for a general point, the distance measured is $d = (X^2 + Y^2 + Z^2)^{1/2}$. As in the case of time measurements, one cannot assume that one can read distances measured by other observers directly from the space-time diagram, since they are in general not directly represented by the coordinates X, Y, Z.

Fig. 2.6 The coordinate X in the standard coordinate system of an observer P is radar distance measured by him from his position. Thus a series of radar signals establishes the lines X = 0, X = 1, X = 2, etc. in space-time (X = 0 being his own world-line).

An important feature of distance measurement by radar is that an observer at P can measure the distance to Q purely by observations at his own position; he does not have to go himself to Q, or obtain any active collaboration from Q, to make the measurement. Instead he sends light or radio waves to Q; all that is required is that they are reflected back to P by some object at Q. This feature is what makes radar so important in navigation and in military applications.

Finally, having defined distance in terms of radar, we can now understand the common use of rulers to measure distances on scales of between $10^{\ldots}$ metres and $10^{2}$ metres as being due to their being reasonably good approximations to 'rigid rods' (rods of constant length) in many circumstances. If any conflict were ever to arise between ruler and radar measurements of distance, we would reject the ruler result in favour of that determined by radar.

Exercises

2.2 Find the light travel time between the following locations: (i) your feet and your eyes; (ii) Cambridge and London (90 km apart); (iii) the Earth and the planet Pluto (mean distance 5900 million km). Calculate the distance in kilometres to astronomical objects which are (1) one light-hour away, (2) one light-day away, (3) one light-year away.

2.3 A fighter aircraft sends out a signal that is reflected from a bomber aircraft; the echo signal is received by the fighter after an elapse of 20 μsec.
One second after sending the first signal the fighter sends another signal; the echo signal is received after 15 μsec. Deduce the distance measured by the fighter to the bomber on each occasion, and hence find the relative speed of approach of the two aircraft.

2.3 Simultaneity

In order to synchronize a clock at a distant point Q with a clock at P, one has to send information to Q about the state of the clock at P (or vice versa). An initial suggestion might be that one should send an ideal clock C from P to Q, after synchronizing C with P's clock; this will then enable synchronization of Q's clock with C, and so with P (Fig. 2.7a). However, this will not work. This is because, as we have already seen, the result obtained will depend on the path through space-time taken by C from P to Q (Fig. 2.7b), that is, on the speed with which C is moved from P to Q. Thus one cannot set up a consistent synchronization system this way that will give the same answer no matter how the clock C is moved from P to Q (in mathematical terms, proper time is not an integrable variable).

As in the case of distance measurement, one must turn to the use of electromagnetic signals ('light') to convey adequately the information needed for synchronization from P to Q. In fact, determining which events are simultaneous with particular events in the history of an inertial observer P is again best achieved by radar.

Fig. 2.7 (a) A conceivable process for synchronizing distant clocks at P and Q by transporting a third clock C between them, and a space-time diagram of this process. (b) This procedure will not work, because the result is ambiguous: another clock C', synchronized with C at P, will in general disagree with C on arrival at Q after travelling from P to Q. Thus the result of such a synchronization process is arbitrary.

Fig. 2.8 The synchronization of clocks at P and Q using a radar signal.
Because light takes the same time to travel out and back, the reflection event r at Q must be simultaneous with the event q at P half-way between emission and reception of the signal.

Suppose P sends out an electromagnetic signal at a time $t_1$ to Q and records the time $t_2$ at which the echo pulse reflected back from Q is received. Because P knows that the speed of light is constant, he will deduce that half the light travel time was taken up by the outward journey and half by the return journey, so he will judge that the reflection event r at Q is simultaneous with the time T in his history precisely half-way between when the signal was sent and when the echo was received (Fig. 2.8). This is given by adding half the light travel time to the time the light was emitted, i.e.

\[ T = t_1 + \tfrac{1}{2}(t_2 - t_1) = \tfrac{1}{2}(t_1 + t_2). \tag{2.2} \]

This is a practical way of determining simultaneity, and so of synchronizing clocks even if they are very far apart. For example, if P is on the surface of the Earth and Q on the Moon, they can synchronize their clocks by the following procedure: observer P sends a radar signal to Q. He measures the times $t_1$ and $t_2$, determines T from eqn (2.2), and transmits this value to Q. Observer Q records the time $t'$ of the reflection event r according to his initial watch setting. After receiving the signal from P, he resets his watch by the amount $T - t'$, which is the difference between the time T assigned to event r by P and the time $t'$ assigned to r by his own watch.

Each observer can use this method to define simultaneity in space-time. If they are in relative motion, they will disagree on simultaneity (as has already been pointed out in Section 1.3). This does not matter: each obtains a perfectly unambiguous definition of the meaning of simultaneity for him, that corresponds precisely with our ordinary, everyday notion of simultaneity. An example may help to clarify this. Imagine two police cars patrolling a straight road between two police stations A and B.
The drivers are instructed to go immediately to whichever station calls first, unless both call at the same time, in which case station A has priority. At a particular time, both cars are midway between the stations, with car 1 stationary and car 2 travelling towards station B. At that precise moment, according to car 1, both stations send out a call. Car 1 proceeds to station A, while car 2 proceeds to station B, having received a call from there before that from A. Who is correct? The answer, of course, is that both are correct (see Fig. 2.9). Simultaneity is not absolute but is affected by relative motion (cf. Fig. 1.29). We shall study this further in the next chapter.

Fig. 2.9 Police car 1 is stationary relative to police stations A and B, but car 2 is approaching B. Signals sent out simultaneously (as measured by car 1) from A and B at events $a$ and $b$ will be received at the same time by car 1 at event $p$, but car 2 will receive the signal from station B first (at event $q$) and the signal from station A second (at event $r$). Thus car 2 will detect the emission event $b$ before the emission event $a$.

The key concept that enables this analysis to be made is due to Einstein: it is that one should give an operational definition of simultaneity, i.e. a definition in terms of the results of possible experiments. The rest of the analysis then follows on noting the invariance of the speed of light for all observers.

One should note that when the standard coordinates $(t, X, Y, Z)$ are used by an observer P in flat space-time, according to the definition given here the surfaces $\{t = \text{constant}\}$ are precisely surfaces of simultaneity for P, whose world-line is $(X = Y = Z = 0)$. For example, if P sends out a signal at $t = -1$ (Fig. 2.10), which is reflected at the event $r$ with coordinates $(t = 0, X = 1)$, then it is received again by P at $t = 1$. The mid-time $T$ (calculated from formula (2.2)) is measured by P to be $T = \tfrac{1}{2}(-1 + 1) = 0$; so P determines the event $r$ to be simultaneous with the event $q$ $(t = 0, X = 0)$ in his own history. Similarly each event for which $t = 0$ is measured by him to be simultaneous with $q$. Thus, in flat space-time, the standard time coordinate $t$ does indeed (as would be expected) indicate the way clocks would be synchronized (using radar) by the observer who set up the coordinate system. Any other observer who is at rest in this coordinate system, i.e. who is stationary relative to P, will determine the same surfaces of simultaneity. However, an observer who is in relative motion, again using (2.2) to determine simultaneity, will disagree. We will explore this further in the next chapter.

Fig. 2.10 The surface of events in space-time simultaneous for the observer P (stationary in the chosen coordinate system) with the event $q$ at the origin of coordinates. P has to use a whole series of radar signals (e.g. those shown establishing simultaneity of $r$ and $r'$ with $q$) to determine this surface.

2.4 World maps, world pictures, and radar maps

Now that the concept of simultaneity as determined by radar has been carefully defined, it is useful to distinguish between three different possible observational views of a space-time.

A world map is the idea we inherit from Newtonian theory: it is a view of objects in a space-time at an instant, i.e. a map representing where the objects are in an instantaneous space section $\{t = \text{constant}\}$ of the space-time (Fig. 2.11a). Unfortunately, it is difficult for an observer to obtain such a view of space-time at some time $t_0$ observationally (cf. Fig. 1.11). The reason is that the further out a point in the surface $\{t = t_0\}$ is, the earlier must be the emission of the radar pulse and the later the reception of the echo pulse (cf. Fig. 2.10); hence this map can only be observationally determined by a whole series of radar measurements involving sending out a series of radar pulses.
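The radar definition of simultaneity, and the reason a world map needs a whole series of pulses, can be illustrated with a short numerical sketch. The function names `radar_event` and `pulse_times_for` are our own illustrative choices, not part of the text. The distance formula assumes the usual radar convention that distance is half the round-trip light travel time multiplied by $c$, and the Earth-Moon distance in the demonstration is an approximate figure.

```python
C = 299_792_458.0  # speed of light in m/s


def radar_event(t1, t2, c=C):
    """Time and distance that observer P assigns to a reflection event.

    t1 -- time on P's clock at which the pulse was emitted
    t2 -- time on P's clock at which the echo was received
    """
    if t2 < t1:
        raise ValueError("the echo cannot arrive before the pulse is sent")
    T = 0.5 * (t1 + t2)       # eqn (2.2): mid-time between emission and reception
    D = 0.5 * c * (t2 - t1)   # assumed radar convention: half the round trip
    return T, D


def pulse_times_for(t0, distance, c=C):
    """Emission and reception times needed to probe an event at the given
    distance on P's surface of simultaneity t = t0."""
    delay = distance / c      # one-way light travel time
    return t0 - delay, t0 + delay


if __name__ == "__main__":
    # Example of Fig. 2.10, in units where c = 1
    print(radar_event(-1.0, 1.0, c=1.0))
    # Approximate Earth-Moon distance in metres (illustrative figure)
    print(pulse_times_for(0.0, 384_400_000.0))
```

With `c=1.0`, the call `radar_event(-1.0, 1.0, c=1.0)` reproduces the example of Fig. 2.10, giving $T = 0$ and $D = 1$. The second function makes the difficulty with world maps explicit: the further away the event, the earlier the pulse must leave and the later the echo returns, so each distance needs its own pulse.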
By contrast, a world picture is a view of objects in space-time on the past light cone of the point of observation (Fig. 2.11b). Any photograph or other observation of distant objects by simply detecting incoming radiation from them* is a representation of these objects on our past light cone, inevitably therefore representing the associated time delays (cf. the discussion in Section 1.2).

* e.g. by a radio or X-ray telescope (see The New Astronomy by N. Henbest and M. Marten, Cambridge University Press, 1983), or by the human eye.

Fig. 2.11 (a) A world map depicts the position of each object in the surface of simultaneity of some event $t = t_0$ on the observer's world-line. (b) A world picture depicts the position of each object in the past light cone of some event $t = t_0$ on the observer's world-line (e.g. when a photograph was taken). (c) A radar map depicts the position of each object in the future light cone of an event $t = t_0$ on the observer's world-line (when a radar pulse was emitted). (d) When ordinary units are used to describe everyday occurrences, the light cones are extremely flat and so the three views are very similar, because the spatial position of an object cannot change much between the events $r$ and $s$ where its world-line intersects these light cones (except if the object viewed is moving at close to the speed of light).

The problem is that what we directly obtain is a two-dimensional representation of these objects (the photograph itself), with images of objects all projected onto the same image plane no matter how different their distances (cf. Fig. 1.10). How far away they are is then not at all obvious; indeed, for many decades astronomers debated whether 'spiral nebulae' were clouds of dust in our own galaxy, or distant galaxies equal in size to our own galaxy; the latter eventually turned out to be the correct answer.
To determine how far away objects are we need further analysis, e.g. determination of distances by measuring apparent sizes, apparent luminosities, or redshifts. Use of such methods of estimating distances (discussed in the following chapters) allows an observer to construct his world picture at any time $t_0$ in his history. The particular advantage of this method of observation is that it can be used out to extremely large distances.

Finally, a radar map is the natural picture obtained directly by a radar set as commonly used in aircraft, on ships, in airport control towers, etc. (see Fig. 2.12). We can conceive of a radar pulse being sent out at some time $t_0$, echo pulses received from objects at various distances, and the radar display being constructed from these echoes, representing the distance of each object according to the delay time for the corresponding echo. The implication is that this is a picture of the position of each object on the future light cone of the event $t = t_0$ (Fig. 2.11c). This picture has the great advantage that it is directly obtained and immediately displayed, but the disadvantage that it cannot be used out to very large distances, because the light travel time out to the object and back becomes too large. However, this is a real limitation only in the context of astronomical observations; it will not be a serious restriction on the Earth.

It is clear that the representation of positions of objects in space-time obtained in each case is conceptually quite different (cf. Figs 2.11a, b, c). However, the resulting maps will differ substantially only if the objects depicted move appreciably on the relevant time-scale. In the context of measurements in ordinary everyday life, the speed of light is very high, so if we use ordinary units of measurement, the light cones are extremely flat and the three maps obtained will differ very little (Fig. 2.11d).
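To get a feeling for how little the three views differ in everyday circumstances, one can compute where a uniformly moving object appears in each of them. The sketch below is illustrative only. It assumes purely radial motion along the observer's line of sight, an object that stays on one side of the observer, and all three views taken relative to the same event $t = t_0$ on the observer's world-line. The function name and the aircraft figures in the demonstration are hypothetical.

```python
def positions_in_three_views(x0, v, c=299_792_458.0):
    """Return (world_map, world_picture, radar_map) positions of an object.

    The object is assumed to move along the line of sight as
    x(t) = x0 + v * (t - t0), where x0 > 0 is its distance at the
    observer's time t0 and v > 0 means recession.
    """
    if x0 <= 0:
        raise ValueError("x0 must be a positive distance")
    beta = v / c
    if abs(beta) >= 1.0:
        raise ValueError("the object must move slower than light")
    world_map = x0                    # event on the surface t = t0
    world_picture = x0 / (1 + beta)   # event on the past light cone, t = t0 - x/c
    radar_map = x0 / (1 - beta)       # event on the future light cone, t = t0 + x/c
    return world_map, world_picture, radar_map


if __name__ == "__main__":
    # Hypothetical aircraft: 100 km away, receding at 250 m/s
    print(positions_in_three_views(100_000.0, 250.0))
    # Object receding at half the speed of light, in units where c = 1
    print(positions_in_three_views(1.0, 0.5, c=1.0))
```

For the aircraft the three positions agree to within a fraction of a metre, whereas for an object receding at half the speed of light they are $x_0$, $\tfrac{2}{3}x_0$ and $2x_0$ respectively.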
Thus, for the purposes of distance measurements in everyday life, radar provides a very adequate and convenient picture of the relative positions of objects, giving a good approximation to the instantaneous view of a world map.

Fig. 2.12 Radar used to control the movements of aircraft. (top) The radar antenna. Pulses are transmitted and received by the unit at the focus of the curved antenna, which rotates to cover all directions around the airfield. (bottom) The display (a 'radar map'), directly showing the spatial positions of aircraft relative to the airfield. (Photograph from Plessey plc.)

Exercises

2.4 Explain what practical problems will occur in using radar over very long distances, and estimate the maximum distance over which radar is a practical distance-measuring device.

2.5 Taking into account special relativity principles and the limiting nature of the speed of light, see if you can propose some other method of determining simultaneity at a distance. If you do so, convince yourself whether it is essentially equivalent to the radar definition, or not.

2.6 Two volcanoes 100 km apart on Io (a satellite of Jupiter) are seen by an observer A at rest on Io to erupt simultaneously. Observer B is the pilot of a rocket which according to A is 10 km directly above the first volcano when it explodes, flying towards the second at a speed of [...]$c$. What will B see as happening at the second volcano at the moment when he sees the first explode?

2.7 According to a nuclear treaty between two superpowers, if either strikes first the second is entitled to destroy the first completely. The superpowers deploy two ships A and B which move at a very high speed towards each other. Ship A sends off radar signals at one-second intervals which are reflected back by B. At $t = 0$ in its coordinates, A fires a weapon at B. At $t = 4$, A receives back the signal sent at $t = -6$, which detects B firing at A. What can A conclude about who fired first? [In Chapter 3 we will consider if B would reach the same conclusion.]

2.8 Ask various friends what time interval appropriately corresponds to various distance measures: e.g. 1 cm; 1 metre; 1 kilometre. [In principle it is not possible to make such a comparison, but in practice most people are able to make a reasonable correspondence on the basis of their experience in daily life, e.g. using the speed of walking or driving to set the relative scales.] Try to draw past and future light cones in space-time using 'natural units' (e.g. minutes and metres). Observe from this how the light cones closely define a 'surface of simultaneity' in everyday life.

Computer Exercise 3

Write a program that accepts as input from a radar set trained on a UFO, (a) the time T1 at which a radar pulse is transmitted towards the UFO, (b) the time T2 at which an echo is received from it; and gives as output, (i) the distance D measured to the UFO, (ii) the time TR at which the radar pulse was reflected by it.

Suppose the radar set sends out a regular train of pulses a time T apart. What condition should T satisfy to avoid confusion between different echo pulses?

Modify your program to print out also the relative speed of approach of the UFO as determined by the echo pulses received from it. Ensure your program prints out a special warning message if the speed determination for the UFO appears to violate a special relativity condition. What might be an appropriate phrasing of this warning message?

Conclusion

We have now determined methods for measuring the fundamental quantities (time, distance, simultaneity) needed as a basis for all other kinematic measurements, and have done this taking the limiting nature of the speed of light into account.
It is important to realize that (in view of the principle of relativity) every observer is equivalent and so all will use the same method to determine time, to measure distance, and to determine simultaneity, as outlined above. In the next chapter, we will determine the consequences of these methods of measurement.

3 Measurements in flat space-times

We shall now make quantitative the properties of the space-time of special relativity introduced in the previous chapters. To do so we shall use a simple formalism introduced by Hermann Bondi, called the K-calculus. We can represent faithfully all physical effects in these flat space-times, except gravity. To represent gravity properly, we need to use curved space-times; we discuss these in Chapter 5.

The major features of special relativity which we shall look at in turn are its kinematic features, namely (1) the Doppler effect, (2) relativistic velocity addition, (3) the relativity of simultaneity, (4) time dilation and the 'twin paradox', and (5) length contraction; and its dynamic features, such as (6) the effective dependence of mass on relative velocity, and the equivalence of mass and energy. While each of these effects may be regarded as important in its own right, we shall emphasize that they only make sense as a total package in which they all occur together. In the next chapter we will look at compact ways of representing this total package.

3.1 The Doppler effect

The first feature we examine is the effect of relative motion on the observed [...] astronaut B is in a rocket moving uniformly at $\tfrac{1}{5}c$ away from space station A towards the star Alpha Centauri. Once a year on the 13th of March the space station sends birthday greetings to B. Suppose the radio message carrying this greeting in the year 2010 is measured by the space station to travel a distance of $\tfrac{1}{2}$ light-year to reach the rocket, taking a time $T = \tfrac{1}{2}$ year to do this.
The next message is sent exactly a year later. When this radio message has travelled for half a year to where the astronaut received the previous signal, the rocket has moved $\tfrac{1}{5}$ light-year further on, so this signal has to travel longer to catch up the rocket; in fact the time measured by A when the signal reaches the rocket is $\tfrac{3}{4}$ year after it was emitted (Fig. 3.1). Thus according to A, the birthday greetings sent yearly will be received by B at intervals of one and a quarter years! This does not directly tell us what interval B will measure between receiving the signals (note the warnings in the last chapter!), but it does indicate that this time will not be one year. A similar effect will occur for all light or radio signals from B to A. Accordingly we expect the rate of happenings at the space station as seen by the astronaut to differ from the rate of those happenings as measured at the space station. This is the effect we now investigate.

Fig. 3.1 Two radio signals sent out 1 year apart by space station A, as seen in A's coordinates (the time $t = 0$ is chosen to be midday on 13 March 2010). The first signal is received by astronaut B at the event $a$, whose coordinates are $t = 0.5$, $X = 0.5$. The second is received by B at the event $b$, whose coordinates are $t = 1.75$, $X = 0.75$. Thus according to A, the time interval between B's reception of these signals is 1.25 years. We are unable to determine directly from this diagram the time interval B measures between these events.

Fig. 3.2 Light signals sent at an interval $T$ by observer A, as measured by his clock, to observer B moving relative to A. The signals are received by B at an interval $T'$ as measured by his clock; $K$ is defined by the relation $T' = KT$.

Consider two inertial observers A and B in relative motion. A emits a light signal, waits a time interval $T$ as measured by his clock, and then sends a second signal.
B measures the time interval between reception of these signals to be $T'$ (Fig. 3.2). A quantity $K$ is then defined as the ratio of these proper times:

$$K = T'/T \;\Leftrightarrow\; T' = KT. \qquad (3.1)$$

We shall see below that, when the speed of relative motion is non-zero, the time intervals are different, i.e. $K$ is unequal to 1. (The formulae relating $K$ to the relative velocity of the observers are (3.11) and (3.12) below.)

In principle, one can easily measure $K$ directly from definition (3.1). For example, if A's 'vehicle' (be it a spacecraft, aircraft, the Earth, or whatever) has attached to it a radio beacon that emits signals at known regular intervals (say every minute), B merely has to receive these signals and measure the time interval between them to determine $K$. Thus, if B measures the time interval between reception of the signals to be one and a half minutes, then $T = 1$ minute and $T' = 1.5$ minutes, so $K = 1.5/1 = 1.5$.

More hypothetically, suppose A and B each possess identical accurate clocks, and B has a very powerful telescope through which he can observe A's clock. He then merely has to watch A's clock through the telescope, and compare the time it registers with that registered by his own clock (e.g. noting the time interval $T'$ elapsing according to his own clock every time A's clock registers that an hour has passed; then $K$ follows from (3.1) with $T = 1$ hour). This is nothing other than the 'thought experiment' mentioned in Section 1.3, where an observer in the tram watched the clock tower in Berne. That thought experiment already tells us that we expect $K$ to get unboundedly large if the relative velocity of the observers approaches the speed of light.

Redshift

Often the easiest practical way to measure the quantity $K$ is by measuring the observed wavelength of light, radio waves, or other electromagnetic radiation emitted by the source, provided the intrinsic wavelength of this radiation is known.
This is the basis of the redshift measurements that are our major tool in [...]

Suppose that A emits electromagnetic radiation at wavelength $\lambda_E$. Then* the period $\Delta\tau_E$ of this radiation (the time for one full oscillation, cf. Fig. 3.3a) is given by $\lambda_E = c\,\Delta\tau_E$. By eqn (3.1), the period of the radiation received by B is measured by him to be $\Delta\tau_O = K\,\Delta\tau_E$ (Fig. 3.3b). The wavelength $\lambda_O$ that B observes for the light is related to its period by the relation $\lambda_O = c\,\Delta\tau_O$. Therefore the wavelength of the received radiation is related to the wavelength of the emitted radiation by

$$\lambda_O = K\lambda_E. \qquad (3.2)$$

* You can omit the details of the following derivation if you are prepared to accept eqn (3.3b) as correct.

Fig. 3.3 (a) The amplitude of an electric field plotted against time, showing the period $\Delta\tau_E$ (the time for one full oscillation). (b) An observer B measures a period $\Delta\tau_O$ for a signal emitted by observer A with period $\Delta\tau_E$.

This change in wavelength is easy to measure direct from the spectrum of received light. One identifies in the observed spectrum a line of known wavelength at the source (e.g. the 'alpha line' of wavelength 1215 angstroms in the spectrum of hydrogen), measures its received wavelength, and so determines $K$ from eqn (3.2). It is common to express the result of such measurements in terms of the redshift parameter $z$, the fractional change in wavelength. Formally, $z$ is defined by the relation

$$z = \frac{\text{change in wavelength}}{\text{emitted wavelength}} = \frac{\lambda_O - \lambda_E}{\lambda_E} = \frac{\lambda_O}{\lambda_E} - 1. \qquad (3.3a)$$

It then follows that

$$1 + z = \lambda_O/\lambda_E = K. \qquad (3.3b)$$

Redshifts for distant galaxies are routinely measured by astronomers from their spectra, and used to determine their speed of recession (Fig. 3.4; we will cover the relation of redshift to velocity in Sections 3.2 and 4.3). The name 'redshift' is used because light in distant receding galaxies is observed to be displaced towards the red end of the spectrum.
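The measurement procedure just described, identifying a line of known emitted wavelength and comparing it with the received wavelength, amounts to a one-line calculation. The following sketch uses illustrative function names of our own. The observed wavelength in the demonstration is a made-up figure chosen to give round numbers, not a measurement.

```python
def k_factor(lambda_emitted, lambda_observed):
    """K from eqn (3.2): ratio of received to emitted wavelength."""
    if lambda_emitted <= 0 or lambda_observed <= 0:
        raise ValueError("wavelengths must be positive")
    return lambda_observed / lambda_emitted


def redshift(lambda_emitted, lambda_observed):
    """z from eqn (3.3a): fractional change in wavelength."""
    if lambda_emitted <= 0 or lambda_observed <= 0:
        raise ValueError("wavelengths must be positive")
    return (lambda_observed - lambda_emitted) / lambda_emitted


if __name__ == "__main__":
    emitted = 1215.0    # hydrogen 'alpha line' quoted in the text, in angstroms
    observed = 4860.0   # hypothetical received wavelength, in angstroms
    K = k_factor(emitted, observed)
    z = redshift(emitted, observed)
    print(f"K = {K}, z = {z}, 1 + z = {1 + z}")   # eqn (3.3b): 1 + z equals K
```

Any consistent unit of length may be used for the two wavelengths, since only their ratio enters.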
This displacement arises because if $z > 0$, then $K > 1$ and the received wavelength is longer than the emitted wavelength. The colour of light is directly determined by its wavelength as follows: in units of $10^{-5}$ cm, the wavelength of red light is between 7.5 and 6.3, orange 6.3 to 5.9, yellow 5.9 to 5.3, green 5.3 to 4.9, blue 4.9 to 4.5, indigo 4.5 to 4.3, and violet 4.2 to 3.9, while infra-red is above 7.5 and ultraviolet is below 3.9. Thus, light emitted as blue may be seen as green, that emitted as green may be seen as yellow, and so on, cf. Fig. 3.5a; so the light is displaced towards the red end of the spectrum, as claimed. On the other hand if $-1$ [...] of reception of the signals by B (20 seconds) by the relation $T_2 = 2T_1$, that is, $T_2 = KT_1$ (see Fig. 3.6).

Fig. 3.6 (a) An observer A sends regular signals for 10 seconds, which are received by observer B during a period of 20 seconds because the K-factor is 2. (b) In general in this situation, $T_2 = KT_1$.

Fig. 3.7 Relative motion at speed $v$ for observers A and B, seen (a) in A's rest frame (A is at rest and B moves to the right at speed $v$), and (b) in B's rest frame (B is at rest and A moves to the left at speed $v$).

Reciprocity of K

The second basic assumption about $K$ is a consequence of the principle of relativity. Suppose that as well as A sending signals to B, the observer B sends signals to A. Then there is no intrinsic difference between the two situations: in each case the source merely sends signals to the observer, who is in motion relative to the source (see Fig. 3.7). In special relativity the factor $K$ is simply a result of relative motion in flat space-time. Since this space-time is isotropic (i.e. the same in all directions), light propagation is the same in all directions.
Because of the equivalence of all inertial observers, the two K-factors measured must be the same:

$$K_{AB} = K_{BA} \qquad (3.5)$$

where $K_{AB}$ is the K-factor for light emitted from A and received at B, and $K_{BA}$ is the K-factor for light emitted from B and received at A. If this were not so, there would be some intrinsic difference between light propagation from A to B and from B to A, contrary to the relativity assumption; this intrinsic difference would enable us to measure absolute motion.

Thus the Doppler shift effect is completely reciprocal: whatever relative time change is detected by B in observations of A, is also detected by A in observations of B. If A measures a factor-2 increase in the wavelengths of all light received from B, then B will also measure a factor-2 increase in the wavelengths of all light received from A. So A will have to retune his receiver by a factor 2 to receive signals from B, and B will also have to retune his receiver by a factor 2 to receive signals from A. The observer B will see A's clock running slow by a factor of 2, and A will observe B's clock to be running slow by a factor of 2. This symmetry allows us to omit the subscript 'AB' from $K_{AB}$ when the context makes it clear which observers are concerned (see Fig. 3.8).

Fig. 3.8 Signals sent by B at an interval $T'$ (as measured by his clock) and received by A at an interval $T''$ (as measured by his clock). By the relativity principle, $T'' = KT'$.

Measuring K by radar

A useful feature results from the symmetry relation (3.5): suppose A sends out two pulses separated by a time interval $T$, which are reflected by B and received again by A with a time separation $T''$ (Fig. 3.9). By the definition of $K$, the time between these pulses measured by B will be $T' = KT$, and then $T'' = KT' = K^2T$. Thus, A merely has to observe the ratio $T''/T$ to determine $K$ from the relation

$$K = \sqrt{T''/T}. \qquad (3.6)$$

Fig. 3.9 Signals sent by A at an interval $T$, reflected by B at an interval $T'$, and received by A at an interval $T''$.

The significance of this derives from the fact that to use relations (3.1-4) to determine $K$, the observer A has to receive radiation emitted by B where this radiation has to be of a known wavelength (or frequency). Thus, either the signal has to be deliberately transmitted at a specific frequency, or the frequency must be deduced from the received radiation (which indicates physical conditions at the source), e.g. by recognizing specific spectral lines. However, using reflected pulses and relation (3.6), A can determine $K$ even if B is not emitting any radiation. This enables him to measure the speed of motion of B relative to himself, as well as B's distance, purely on the basis of measurements at his own position, without the collaboration of B or any detailed knowledge about B.

Summary

The discussion we have given shows that when $K > 1$ (which will be the case when A and B are moving apart, as we shall see in the next section) the factor $K$ gives the relative time increase observed by B in all phenomena at A, and observed by A in all phenomena at B. The fact that we commonly refer to this effect in terms of the redshifting of light is just because this happens to be easy to observe. The time-shift observed for all other effects will be the same. For example, suppose we observe the radiation received from a quasi-stellar object at great distance to have a redshift $z = 3$, and to vary in brightness on a time-scale of 8 hours. Then (since $K = z + 1 = 4$) in fact these variations must have taken place on a time-scale of 2 hours at the source.

Exercises

3.1 A space-traveller moving away from the Earth at speed such that $K = 2$ tunes in to a television show transmitted from the Earth. In what way will the K-factor affect the display he obtains and the way he receives it?

3.2 In order to perform a complicated docking manoeuvre, it is essential that two spacecraft can be held at rest relative to each other. Devise a simple experiment to check that this is so.

Computer Exercise 4

Write a programme that will accept as input (a) either a value for $K$ or a value of $z$ due to relative motion of two observers, and (b) a time period $T$, a wavelength $L$, or a frequency $F$ measured by one of them; and give as output the corresponding time period $T'$, wavelength $L'$, or frequency $F'$ (as appropriate) measured by the other (given by eqns (3.1-4)).

Now modify your programme to accept as input a letter representing the colour of emitted light (e.g. 'B' for 'blue') and to print out the colour of this light as seen by the relatively moving observer. [Note that for high values of $z$, some light will be shifted out of the visual range and some radiation into this range.] If your computer has colour graphics, apply this change to any colour image you have available to see visually the effect of redshifting ($K > 1$) or blueshifting ($K < 1$) an image.

Redshift and background radiation

From quantum theory, we know that the energy of a photon is proportional to its frequency: $E = h\nu$, where the frequency is the number of oscillations per second, and so is just the inverse of the period: $\nu = 1/\Delta\tau$. Thus (see p. 51) frequency is inversely proportional to wavelength, with proportionality constant the speed of light: $\nu = c/\lambda$.
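The conversions asked for in the first part of Computer Exercise 4 follow directly from (3.1) and (3.2), together with the fact that frequency is the inverse of period. What follows is a minimal sketch of one possible starting point, not a complete solution. The function name `observed_value` and its string labels are our own hypothetical choices, and the function converts only from the emitter's measurement to the receiver's.

```python
def observed_value(kind, emitted_value, k=None, z=None):
    """Convert a quantity measured at the source to the value measured by
    an observer related to the source by Doppler factor K (or redshift z).

    kind -- "period", "wavelength", or "frequency" (hypothetical labels)
    """
    if (k is None) == (z is None):
        raise ValueError("give exactly one of k or z")
    if k is None:
        k = 1.0 + z                   # eqn (3.3b)
    if k <= 0:
        raise ValueError("K must be positive")
    if kind in ("period", "wavelength"):
        return k * emitted_value      # eqns (3.1) and (3.2)
    if kind == "frequency":
        return emitted_value / k      # frequency is the inverse of period
    raise ValueError("kind must be 'period', 'wavelength', or 'frequency'")


if __name__ == "__main__":
    # Brightness variations lasting 2 hours at a source with z = 3
    print(observed_value("period", 2.0, z=3))
    # A transmitter at 100 MHz (hypothetical) received with K = 2
    print(observed_value("frequency", 100.0, k=2))
```

The first call reproduces the quasi-stellar object example of the Summary: a 2-hour variation at the source is observed to last 8 hours when $z = 3$.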
Putting the relation $\nu = c/\lambda$ together with (3.3), the observed frequency of radiation and hence the measured energy per photon varies as the inverse of the redshift factor $(1 + z)$:

$$\frac{E_O}{E_E} = \frac{\nu_O}{\nu_E} = \frac{\lambda_E}{\lambda_O} = \frac{1}{1 + z}.$$

Now the rate at which photons are emitted by a source will be seen by an observer moving away to be slowed by a factor $(1 + z)$, so the rate at which energy is emitted by a source will be related to the rate energy is received from the source by a factor $(1 + z)^2$:

$$\left(\frac{dE}{dt}\right)_O = \left(\frac{dE}{dt}\right)_E \frac{1}{(1 + z)^2}. \qquad (3.7)$$

This determines the effect of motion on flux of radiation received from distant objects (see eqns (4.35) and (7.11) below, and Fig. 7.13, for the cosmological application). They look fainter if they are receding from us and brighter if they are approaching.

Now we live in a universe bathed in cosmic background radiation ('CBR'), the relic radiation from the 'Hot Big Bang'; that is, black body radiation at a temperature of 2.75 K (see pp. 272-4 below). This radiation is isotropic (i.e. is measured to be the same in all directions) for any observer at rest relative to the matter that emitted that radiation, that is, who is moving at the average velocity of all the matter in the universe. The implication of the above relation is that we can detect any motion of our own Galaxy or Sun relative to the universe by measuring a dipole anisotropy in this radiation: a higher temperature in one part of the sky (the direction towards which we are moving) and lower in the opposite part of the sky (the direction away from which we are moving). Actually the effect is even stronger: the instruments we use measure intensity of radiation (that is, flux of radiation received in a unit solid angle from many sources of radiation), rather than flux from a single source; this brings in two more factors of redshift (see (4.36) and (7.11,12) below), enhancing the dipole anisotropy effect predicted.

Fig. 3.10 The cosmic background radiation temperature anisotropy as measured in all directions in the sky (the oval shape represents the entire sky). The section surrounded by the lightest region is hotter by one part in a thousand than the dark section (which is the opposite direction in the sky). This is caused by our motion relative to the rest frame of the universe. The radiation was emitted at a redshift of about 1100.

This kind of anisotropy is precisely what we measure: there is a difference in the measured CBR temperature of one part in $10^3$ in opposite directions in the sky, as measured for example by the COBE satellite (Fig. 3.10). This suggests we are moving at a speed of 300 km/sec relative to the rest frame defined by this background radiation. If we transform to that frame (the rest frame of the universe), this dipole anisotropy will be removed. The resulting CBR anisotropy is then very small indeed; it is isotropic to the astonishing accuracy of one part in $10^5$ (Fig. 3.11), the remnant fluctuations arising from the very small inhomogeneities in the very early universe at the time of decoupling of matter and radiation, that later grow into clusters of galaxies. (Figures 3.10 and 3.11 are reproduced in colour on the back cover.)

[Note: this section refers forward to the cosmology parts of the book. This is deliberate: the idea is to make the reader aware that those parts will be reached in due course and will be interesting. The CBR anisotropy is mentioned again on p. 274.]

3.2 Relative velocity

In special relativity theory, the Doppler shift factor $K$ depends simply on the relative motion of the source and the observer. We first determine that relation in the simple case of radial relative motion, and then derive the special relativity law of addition of parallel velocities.

The relation between K and relative radial velocity

Consider two observers A and B moving directly away from each other at a uniform speed $v$.
For simplicity, let their positions coincide at the time $t = 0$ as measured by both their clocks; we can regard them as signalling to each other by radio at that time (the distance is zero, so communication is instantaneous). Suppose that a radio pulse is then emitted by A at a time $T$ as measured by his clock, which is reflected by B at a time $T'$ as measured by B's clock, and received again by A at a time $T''$ measured by A's clock (Fig. 3.12).

Fig. 3.11 (a) The residual anisotropy once the dipole has been removed. Apart from the major lane across the sky due to sources in our own galaxy, the anisotropy is only one part in a hundred thousand. (b) The remnant anisotropy once the galaxy signal has been subtracted. The primordial fluctuations detected represent inhomogeneities at the surface of last scattering of the Cosmic Background Radiation. They provide the seeds for growth of large-scale structures at much later times, such as the clusters of galaxies we see at the present time, and the matter in them is the most distant matter we can detect by any form of electromagnetic radiation (they form our visual horizon). (Images 3.10 and 3.11 reproduced by permission of the NASA Goddard Space Flight Center and the COBE Science Working Group.)

Fig. 3.12 Observer A emits a radio signal at time $T$, and observer B receives it at time $T'$ at event $p$. It is reflected back to A who receives it at time $T''$. A measures the event $q$ at time $\tfrac{1}{2}(T + T'')$ to be simultaneous with $p$.

Remembering the definition (3.1) of $K$ and the reciprocity relation (3.5), we find (cf. Fig. 3.6 and the derivation of eqn (3.6)) that

$$T' = KT, \qquad T'' = KT' = K^2T.$$

According to A, the travel time for the radio pulse is therefore

$$T'' - T = K^2T - T = (K^2 - 1)T.$$

By eqn (2.1) the radar distance measured by A between B and A is thus

$$D = \tfrac{1}{2}c(K^2 - 1)T. \qquad (3.9)$$

To determine the velocity of B as measured by A, we must find out when A measures B to be this far away.
By the definition of simultaneity (Section 2.3), A determines the reflection event p to be simultaneous with the event q in his history which is half-way between the times of transmission and of reception. By eqn (2.2), the time measured by A's clock at q is

$$t_q = \tfrac{1}{2}(T'' + T) = \tfrac{1}{2}(K^2 + 1)T. \tag{3.10}$$

Now, A and B coincided at the time $t = 0$ measured by A's clock. A therefore concludes that B has moved a distance $D$ (given by (3.9)) in the time $t_q$ (given by (3.10)), so the speed of B relative to A, as measured by A, is

$$v = D/t_q = \{\tfrac{1}{2}c(K^2 - 1)T\}/\{\tfrac{1}{2}(K^2 + 1)T\}.$$

Multiplying numerator and denominator by $2/T$ shows that

$$v = \frac{(K^2 - 1)c}{K^2 + 1}. \tag{3.11a}$$

Therefore $K$, which is directly measurable in various ways (see Section 3.1), directly determines the relative speed of separation of A and B. (Note that the results would be more complex if the motion were non-radial, i.e. if A and B were not moving directly towards or away from each other; we will only consider radial motion here.)

Just as we introduced the rescaled coordinates $(X, Y, Z)$ to simplify distance measurements relative to the speed of light, so now it is convenient to rescale our velocity measurements. We do so by defining $V = v/c$. The quantity $V$ is dimensionless; it is simply the velocity $v$ rescaled relative to the speed of light. In these units, the speed of light is $\pm 1$ (if $v = c$ then $V = c/c = 1$; if $v = -c$, then $V = -1$). The final result is then

$$V = \frac{v}{c} = \frac{K^2 - 1}{K^2 + 1}. \tag{3.11b}$$

We can solve eqn (3.11b) for $K^2$ in terms of $V$ by multiplying through by $K^2 + 1$ and collecting terms. We find

$$V(K^2 + 1) = K^2 - 1 \iff K^2(V - 1) = -(V + 1),$$

so

$$K^2 = -(V + 1)/(V - 1) = (1 + V)/(1 - V).$$

On taking the square root of this relation, the sign ambiguity is resolved because $K$ must always be positive (if B observes A's clock through a telescope, he will not see it run backwards!). Thus the Doppler shift factor $K$ resulting from a relative radial velocity $v$ is found to be

$$K = \left(\frac{1 + V}{1 - V}\right)^{1/2}. \tag{3.12a}$$

For example, if $v = \tfrac{1}{4}c$, then $V = \tfrac{1}{4}$, so $1 + V = \tfrac{5}{4}$, $1 - V = \tfrac{3}{4}$. Thus $K^2 = \tfrac{5}{3}$ and $K = (\tfrac{5}{3})^{1/2} = 1.291$. Similarly, if $v = \tfrac{1}{2}c$ then $K = 3^{1/2} = 1.732$; if $v = \tfrac{3}{4}c$ then $K = 7^{1/2} = 2.646$; if $v = \tfrac{9}{10}c$ then $K = 19^{1/2} = 4.359$; if $v = \tfrac{99}{100}c$ then $K = 199^{1/2} = 14.107$. Thus, as expected, high relative speeds cause large K-factors, and so large ratios between times measured by two observers.

Approach and recession

The calculation above was done for a relative speed of recession $v$ of A and B, and assumed $v > 0$. If we consider the case when A and B approach each other at relative speed $v$ (Fig. 3.13), the resulting formulae will be identical except that $v$ is replaced by $-v$, and $V$ by $-V$. Therefore, we can use the same formulae (3.11-12) for both approach and recession if we introduce the sign convention: $v$ will be positive whenever A and B recede from each other, and negative whenever they approach each other. We adopt this sign convention from now on; then (3.11) and (3.12) apply to relative radial motion whether it represents relative approach or recession of the observers.

With this sign convention, the reciprocity of the relation is apparent:

$$K_{AB} = K_{BA} \iff v_{AB} = v_{BA}, \tag{3.12b}$$

that is, receding observers each measure the other to be receding at the same speed, and approaching observers each measure the other to be approaching at the same speed. This result is in fact just a consequence of Einstein's relativity principle, that physics should be the same for both inertial observers, since this leads to the expressions (3.9-12) which treat both observers on exactly the same footing. If this were untrue (e.g. if you measure me to be receding at 500 km/sec, but I measure you to be receding at 250 km/sec) relative velocities would be very difficult indeed to deal with. As in the case of $K$, we will omit the subscript 'AB' from $v_{AB}$ whenever no confusion results.

Fig. 3.13 A situation similar to that depicted in Fig. 3.12 but with the observers approaching each other rather than receding.
A sends a signal at time $T''$ before the observers meet and receives it back a time $T$ before they meet, after B has reflected it at $T'$.

Suppose $V = 0$; then (3.12a) shows $K = 1$. Similarly if $K = 1$, then (3.11) shows $V = 0$. Thus the relations we have derived show

$$K = 1 \iff v = 0 \iff V = 0,$$

i.e. there is no Doppler shift effect if and only if the relative velocity is zero.

Considering now the relation of $v$ to $K$ and $z$ implied by (3.3, 11, 12), we find that $K > 1$ (a relative slowing down of time is observed) when observers recede from each other, and $K < 1$ (a relative speeding up of time is observed) when they approach each other:

Relative approach: $-1 < V < 0 \iff 0 < K < 1 \iff z < 0$ (light blueshifted)
Relative recession: $0 < V < 1 \iff K > 1 \iff z > 0$ (light redshifted)

Basically, this follows because when receding, each observes the other to be positioned at steadily increasing distances and so light travelling either way has to travel larger and larger distances; so we then expect the time intervals observed at the receiver to be longer than the time intervals at the emitter, i.e. $K > 1$ (cf. Fig. 3.1). Similarly, when approaching, light travelling either way will travel shorter and shorter distances so the observed time intervals at the receiver will be shorter than those at the emitter, i.e. $K < 1$.

Figure 3.14 shows the relation between $v/c$ and $K$; one can read off the relation either way from this graph (e.g. one can find the K-value corresponding to any $v/c$, or the $v/c$ value corresponding to any $K$). It is clear from this graph (and follows from eqns (3.11,12)) that as the relative speed of motion approaches the speed of light, the relative time-change observed increases without limit. In the case of relative approach,

$$v/c \to -1 \iff K \to 0,$$

i.e. a time interval $\Delta t$ at A is observed by B in an indefinitely short time period. In the case of relative recession,

$$v/c \to 1 \iff K \to \infty,$$

i.e. a time interval $\Delta t$ at A is observed by B to last an indefinitely long time period.

Fig. 3.14 The relation between $K$ and $V = v/c$.
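The two conversions between the K-factor and the rescaled radial velocity, eqns (3.11b) and (3.12a), are easy to tabulate. The following is a minimal Python sketch, not part of the original text; the function names `k_from_v` and `v_from_k` are illustrative choices. It uses the sign convention adopted above (V positive for recession, negative for approach) and should reproduce K-values such as 1.291 and 1.732 quoted earlier.

```python
import math


def k_from_v(V: float) -> float:
    """Doppler shift factor K for rescaled radial velocity V = v/c, eqn (3.12a).

    Sign convention from the text: V > 0 for recession, V < 0 for approach.
    """
    if not -1.0 < V < 1.0:
        raise ValueError("relative speed must be less than the speed of light")
    return math.sqrt((1.0 + V) / (1.0 - V))


def v_from_k(K: float) -> float:
    """Rescaled radial velocity V = v/c for a given K-factor, eqn (3.11b)."""
    if K <= 0.0:
        raise ValueError("K must be greater than zero")
    return (K * K - 1.0) / (K * K + 1.0)


# Tabulate K for recession (+V) and approach (-V) at the same speed.
for V in (0.25, 0.5, 0.75, 0.9, 0.99):
    K_recede = k_from_v(V)
    K_approach = k_from_v(-V)
    print(f"V = {V:4.2f}  K(recession) = {K_recede:7.3f}  "
          f"K(approach) = {K_approach:6.3f}  V recovered = {v_from_k(K_recede):.2f}")
```

Printing both columns makes the limiting behaviour visible: as V approaches 1 the recession value grows without bound while the approach value shrinks towards zero.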
This unbounded growth of $K$ as $v \to c$ is the result Einstein realized by thinking about observing the clock in the square at Berne through a telescope (see Section 1.3 above): as $v \to c$, time appears to stand still.

Exercise

3.3 (i) What relative radial velocity $V$ corresponds to a K-factor of 3? Determine the corresponding velocity $v = cV$ in km/sec. (ii) What relative radial velocity $V$ corresponds to a K-factor of 4? Determine the corresponding velocity $v = cV$ in km/sec. (iii) If A recedes radially from B at a speed $v =$ [...]$c$, what is the K-factor observed by A? What is the K-factor observed by B? (iv) If A approaches B radially at a speed $v =$ [...]$c$, what is the K-factor observed by A? What is the K-factor observed by B?

The change in K during a fly-by

Consider an observer B approaching A at a speed $v$, and another observer C receding from A at the same speed. Then

$$V_{AB} = -V_{AC}, \qquad V_{AC} > 0, \tag{3.13}$$

where $V_{AB}$ is the speed of B relative to A and $V_{AC}$ is the speed of C relative to A, measured as a fraction of the speed of light (we are using the sign conventions just introduced). Therefore (3.12a) shows that

$$K_{AC} = \left\{\frac{1 + V_{AC}}{1 - V_{AC}}\right\}^{1/2} = \left\{\frac{1 - V_{AB}}{1 + V_{AB}}\right\}^{1/2} = 1/K_{AB}.$$

Thus the K-factors for B and C relative to A are related by

$$K_{AB} = 1/K_{AC}, \qquad K_{AC} > 1. \tag{3.14}$$

We have just proved that (3.13) implies (3.14). Similarly, one can show from eqn (3.11) that (3.14) implies (3.13); that is, two K-factors are reciprocal to each other if and only if the corresponding relative velocities are the same in magnitude but opposite in sign (one corresponding to approach and the other to recession).

This is precisely the situation that will occur during a 'fly-by' (see Fig. 3.15a). For example, suppose that B flies past A at a constant speed of $\tfrac{3}{5}c$. While B is approaching A, we have $v_{AB}/c = -\tfrac{3}{5}$ and $K = \tfrac{1}{2}$. After B has reached A and is receding, $v_{AB}/c = +\tfrac{3}{5}$ and $K = 2$. As B passes A, the K-factor suddenly changes to its reciprocal (in this case, from $\tfrac{1}{2}$ to 2).
There are good physical reasons for this sudden change in $K$: initially A points his receiving antenna to the left (B is approaching from that side). As B passes, A has to swing the antenna round to receive signals from B, which now come from the right. A then receives signals from B on a different family of light rays than the family of light rays on which the signals were initially travelling (Fig. 3.15b). As a consequence, A will also have to retune his receiver as B passes; e.g. if B transmits radio signals at a wavelength of 1 metre, A will receive the signals at a wavelength of 0.5 metres while B is approaching but at 2 metres while B is receding. This is closely analogous to the corresponding effect in the case of sound waves: as a train or car passes a stationary observer while emitting a warning note, the tone heard drops from a high pitch to a low pitch. The Doppler shift factor again changes discontinuously as approach changes to recession.

Fig. 3.15 A 'fly-by'. (a) A watches B approaching from the left and then receding to the right. (b) A space-time situation, showing the light rays by which A observes B when he is approaching and receding.

Exercise

3.4 Show that (3.14) implies (3.13), that is, reciprocal K-factors imply that the measured radial speeds of approach and recession are the same.

The relativity law of addition of parallel velocities

Consider now three inertial (non-accelerating) observers A, B, and C in motion relative to each other in the same direction (Fig. 3.16). Then their relative velocities are parallel, and we can choose coordinates so that all the motion takes place in the $x$ direction and their world-lines in a space-time diagram lie in the $(t, X)$ plane. Figure 3.17 is such a diagram drawn from the viewpoint of A.

Fig. 3.16 Observers A, B, C in relative motion, all moving in the same direction.
We can immediately read off the relative velocities $v_{AB}$ and $v_{AC}$ from this diagram, because the axes are marked off according to the measurements made by A; but we cannot read off $v_{BC}$, because it is not apparent from this diagram how the time and space measurements made by B or C relate to those made by A.

Fig. 3.17 The world-lines of the observers A, B, C, seen from A's reference frame.

To determine $V_{BC}$, suppose that A emits light signals separated by a time interval $T$, and B and C measure the time intervals between reception of these signals as $T'$ and $T''$ respectively (Fig. 3.18). Then by the definition of the K-factor,

$$T' = K_{AB}T, \qquad T'' = K_{AC}T. \tag{3.15a,b}$$

However, we can also consider B emitting light signals a time $T'$ apart. Then

$$T'' = K_{BC}T'. \tag{3.15c}$$

Combining (3.15c) and (3.15a) shows that $T'' = K_{BC}K_{AB}T$. Comparing with (3.15b) and noting that these relations hold for all values of $T$, one finds

$$K_{AC} = K_{AB}K_{BC}, \tag{3.16}$$

the composition law for Doppler shift factors $K$.

Fig. 3.18 A emits signals separated by a time interval $T$; they are received by B separated by $T'$, and by C separated by $T''$.

Squaring relation (3.16) to obtain $K_{AC}^2 = K_{AB}^2 K_{BC}^2$ and using formula (3.12a), we obtain

$$\frac{1 + V_{AC}}{1 - V_{AC}} = \left(\frac{1 + V_{AB}}{1 - V_{AB}}\right)\left(\frac{1 + V_{BC}}{1 - V_{BC}}\right),$$

which may be solved for $V_{AC}$ as follows: multiply through by the product of the denominators to obtain

$$(1 + V_{AC})(1 - V_{AB})(1 - V_{BC}) = (1 + V_{AB})(1 + V_{BC})(1 - V_{AC}).$$

Now multiply out, cancel terms, and collect terms in $V_{AC}$ to give

$$V_{AC}(1 + V_{AB}V_{BC}) = V_{AB} + V_{BC}.$$

Dividing by $1 + V_{AB}V_{BC}$,

$$V_{AC} = \frac{V_{AB} + V_{BC}}{1 + V_{AB}V_{BC}}, \tag{3.17a}$$

that is,

$$v_{AC} = \frac{v_{AB} + v_{BC}}{1 + v_{AB}v_{BC}/c^2}, \tag{3.17b}$$

the relativistic velocity addition law for parallel velocities. When the speeds involved are very small compared with the speed of light $c$ ($|v_{AB}/c| \ll 1$, $|v_{BC}/c| \ll 1$) the denominator is very nearly equal to 1 and this reduces to the Newtonian result

$$v_{AC} = v_{AB} + v_{BC}. \tag{3.18}$$

However, for larger speeds the results given by eqns (3.17,18) differ considerably.
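The derivation above can be checked numerically in two ways: directly from eqn (3.17a), and by composing K-factors as in eqn (3.16) and converting back to a velocity. The sketch below is an illustration only; the function names are hypothetical and the sample speeds are arbitrary choices, not values from the text. It also prints the error made by the Newtonian sum (3.18).

```python
import math


def k_from_v(V: float) -> float:
    """K-factor for rescaled radial velocity V = v/c, eqn (3.12a)."""
    if not -1.0 < V < 1.0:
        raise ValueError("relative speed must be less than the speed of light")
    return math.sqrt((1.0 + V) / (1.0 - V))


def v_from_k(K: float) -> float:
    """Rescaled radial velocity for a given K-factor, eqn (3.11b)."""
    if K <= 0.0:
        raise ValueError("K must be greater than zero")
    return (K * K - 1.0) / (K * K + 1.0)


def add_velocities(V_ab: float, V_bc: float) -> float:
    """Relativistic addition of parallel velocities, eqn (3.17a)."""
    for V in (V_ab, V_bc):
        if not -1.0 < V < 1.0:
            raise ValueError("each input speed must be less than the speed of light")
    return (V_ab + V_bc) / (1.0 + V_ab * V_bc)


def add_velocities_via_k(V_ab: float, V_bc: float) -> float:
    """Same quantity obtained from the composition law K_AC = K_AB * K_BC, eqn (3.16)."""
    return v_from_k(k_from_v(V_ab) * k_from_v(V_bc))


# Compare the relativistic result with the Newtonian sum (3.18).
for V_ab, V_bc in [(1e-6, 2e-6), (0.1, 0.2), (0.9, 0.9), (0.99, 0.99)]:
    relativistic = add_velocities(V_ab, V_bc)
    newtonian = V_ab + V_bc
    print(f"V_AB = {V_ab:g}, V_BC = {V_bc:g}: "
          f"V_AC = {relativistic:.9g}, via K = {add_velocities_via_k(V_ab, V_bc):.9g}, "
          f"Newtonian error = {newtonian - relativistic:.3g}")
```

For the smallest pair of speeds the Newtonian error is far below anything measurable in everyday circumstances, while for the largest pair the Newtonian sum exceeds 1 and the relativistic result stays below it.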
As a worked example of the difference between (3.17) and (3.18), suppose $v_{AB} = v_{BC} = \tfrac{1}{2}c$. Then the relativity result is

$$v_{AC} = (\tfrac{1}{2}c + \tfrac{1}{2}c)/(1 + \tfrac{1}{2} \times \tfrac{1}{2}) = \tfrac{4}{5}c,$$

from (3.17), while the Newtonian result is $v_{AC} = \tfrac{1}{2}c + \tfrac{1}{2}c = c$, from (3.18). Similarly if $v_{AB} = v_{BC} = \tfrac{3}{4}c$, the relativity result is $v_{AC} = \tfrac{3}{2}c/(1 + \tfrac{9}{16}) = \tfrac{24}{25}c = (0.96)c$ while the Newtonian result is $v_{AC} = 1.5c$.

The speed of light as an invariant limiting speed

In the example above, the relativity velocity addition law shows the relative speed of A and C is less than the speed of light, although a simple velocity addition suggests it would be greater (cf. Section 1.3). This is no accident; relation (3.17) is of a form which guarantees that as long as $v_{AB}$ and $v_{BC}$ are both less than the speed of light, so is $v_{AC}$. This is an important feature, since it is necessary in order to have consistency with the principle that no observer should measure a massive object to move as fast as the speed of light (Section 1.2).

Further, the limit of this law as $v_{BC} \to c$ is just what we would wish. Indeed, suppose we put $V_{BC} = 1$ in (3.17a). Then $V_{AC} = (V_{AB} + 1)/(1 + V_{AB}) = 1$, no matter what the value of $V_{AB}$. Thus taking the limit as $v_{BC} \to c$ in (3.17), we confirm that this velocity addition law implies Einstein's principle of invariance of the speed of light, because, if B measures a particle C to move at the speed of light, so will A, no matter what the relative velocity of A and B is. This resolves the velocity-addition problem we encountered in Section 1.2.

Finally, we note that if we consider situations where the relative motion of B and C is not parallel to that of A and B, the relativistic result is more complex than that derived here, but still guarantees consistency with the principle of invariance of the speed of light (and so with the limiting nature of the speed of light for motion of massive particles). The theory is self-consistent!

Exercises

3.5 Let rocket A move away from B to the left at [...]$c$, and rocket C move away from B to the right at [...]$c$.
Draw a space-time diagram of this situation from B's viewpoint, and show that from this diagram B can determine the relative separation of A and C to be increasing at a rate [...]$c$. How is this consistent with the fact that the relative speed of motion measured by two observers for each other cannot exceed the speed of light? What relative velocity will A measure for C?

3.6 (i) Consider eqn (3.16) in the case when $K_{BC} = 1$. Explain the situation occurring. Is the result obtained reasonable? What particular conclusion can you draw if $K_{AB} = 1$ also? (ii) Consider eqn (3.16) in the case when $K_{AC} = 1$. Explain the situation occurring, and hence rederive the result that $K$ is replaced by $1/K$ when a speed of approach $v$ is replaced by a speed of recession of the same magnitude.

3.7 (i) What value of $K$ corresponds to a relative speed of approach of 1000 km/hr (a typical speed of approach of airliners)? Is this measurable? (ii) What is the value of $K$ if $v$ is 500 km/sec (typical of the relative motions of galaxies in our cluster)? (iii) If $K$ is measured to be 4, what is the corresponding speed of relative motion? (iv) A traffic officer measures a car 150 m from him to be travelling towards him at 100 km/hr in a 60 km/hr speed zone. How long does the radar echo take to reach him? If the pulses emitted by his radar set are separated by 3 $\mu$sec, what is the separation measured for the echo pulses?

3.8 Prove from (3.17) that if $|V_{AB}| < 1$ and $|V_{BC}| < 1$, then $|V_{AC}| < 1$. [Hint: prove that $(1 - V_{AB})(1 - V_{BC})/(1 + V_{AB}V_{BC}) = 1 - V_{AC}$ and a similar expression for $1 + V_{AC}$.]

Computer Exercises

5. Write a program that will either (a) accept as input a value for a radial relative velocity $V$ and compute the corresponding K-factor (from eqn (3.12)), or (b) accept as input a K-factor and compute the corresponding radial relative velocity $V$ (from eqn (3.11)). [Ensure that your program accepts only relative speeds less than the speed of light, and values of $K$ greater than zero.]
Use your program to confirm (i) the form of Fig. 3.14, and (ii) the reciprocal K-relation (3.14) for equal speeds of approach and recession.

6. Write a program that will accept as input speeds $V_{AB}$ and $V_{BC}$ of relative motion, and print out $V_{AC}$, the speed of relative motion measured by A for C (calculated from eqn (3.17); restrict the inputs to physically acceptable values). Use your program to verify that $V_{AC}$ does not exceed the speed of light. Adjust the program to print out the error if $V_{AC}$ is estimated by the corresponding Newtonian value (3.18), and hence check that the Newtonian value is acceptably good in ordinary everyday circumstances.

3.3 Simultaneity

We have already seen in Section 1.3 that the surfaces of simultaneity or instantaneity for observers A and B in space-time depend on their motion. This is a key feature: most of the 'paradoxes' of relativity theory require an understanding of the relativity of instantaneity for their resolution. We now examine this issue.

Simultaneity in the observer's rest frame

To have in mind a specific example, one can consider setting up a standard time system throughout the solar system in order to facilitate communication between space ships and assist space navigation. Initially the plan is to extend Greenwich Mean Time out as far as Mars. The way to do this is for an observer A at Greenwich to set up a standard clock, and then to use the concept of simultaneity determined by radar (as explained in Section 2.3) to extend time measured by this clock to other points in the solar system.

Just as one would intuitively expect, when the space-time is represented using the standard coordinates $(t, X)$ of A's reference frame, the surfaces of instantaneity he determines by use of radar are the surfaces $\{t = \text{constant}\}$ (Fig. 3.19).
For example, if A emits a light signal at $t_1 = -1$ and receives its echo at $t_2 = +1$, then since light moves at unit speed in these coordinates the reflection event P has coordinates $t = 0$, $X = 1$. By eqn (2.2), A measures P to occur at the time $T = \tfrac{1}{2}(-1 + 1) = 0$. Thus P is measured by A to be simultaneous with the event O (at $t = 0$, $X = 0$) in his history (Fig. 3.19b). Similarly emitting light at $t = -2$ and receiving it at $t = +2$, A determines the event Q at $\{t = 0, X = 2\}$ also to be simultaneous with O; and in fact A determines all points for which $t = 0$ to be simultaneous with each other. This is not an accident; use of simultaneity (as defined by radar) is the natural way observer A extends clock readings from his world-line to other points in space-time, so he will naturally define the surfaces $\{t = \text{constant}\}$ to denote simultaneity with clock readings along his own world-line. Essentially, we have simply verified that this natural interpretation is correct.

Fig. 3.19 (a) The surfaces of simultaneity for an observer A (who is, by definition, stationary in his own coordinate system $(t, X)$). (b) Observer A determines the event P at (0, 1) to be simultaneous with O because light emitted by A at $t = -1$ is reflected at P and received back by A at $t = 1$. Similarly A can determine Q at (0, 2) to be simultaneous with O.

The effect of relative motion

Continuing our specific example, suppose now a rival commercial enterprise decides to set up an alternative time standard for space navigation. Being forward-looking, they decide to base this on the standard of rest defined by the galactic centre. Because of the rotation of our galaxy, the Earth is moving at a speed of about 350 km/sec relative to this standard of rest. The question is how one would relate times determined in the two reference frames.

To do so, consider the reference frame of an observer B moving past the observer A at a relative speed $v$.
To simplify the calculation, we assume their positions coincide at an event O to which each assigns the time 0; then in terms of proper time $t'$ measured by B along his world-line, $t'_O = 0$. Using the standard radar procedure for determining simultaneity (see Section 2.3), observer B will measure a reflection event $P'$ to be simultaneous with O if O is the half-time between emission and reception of light reflected at $P'$; since $P'$ is simultaneous with O, then also $t'_{P'} = 0$.

We do not yet know how $t'$ relates to the coordinate time $t$. However, one can see (because of the constancy of the K-factor when relative motion is uniform) that equal time intervals measured by B will be represented in the space-time diagram by equal distances along his world-line (cf. Fig. 3.6). Thus, the light by which B determines simultaneity with O must be emitted and received at events E and R represented at equal distances from O along his world-line in a space-time diagram (Fig. 3.20). Because light travels at 45°, it is clear from Fig. 3.20 that an event $P'$ measured by B to be simultaneous with O will lie above the surface $\{t = 0\}$ in space-time, if B is moving towards the spatial position of $P'$. Thus, although B measures O and $P'$ to occur simultaneously, A will determine $P'$ to occur after O ($t_{P'} > t_O$). If B moves away from the spatial position of $P'$, then A will determine $P'$ to occur before O.

Fig. 3.20 Observer B, moving relative to A, determines the event $P'$ to be simultaneous with O because O coincides with the point half-way between E, where B emitted a signal, and R, where the signal was received back after reflection at $P'$.

The equal-angle rule

Consider the situation above, as represented in Fig. 3.21. Examination of the geometry implied by the equality of the distances OE and OR, plus the fact that the segments $EP'$ and $RP'$ are at 45° to the vertical, shows that the shaded triangles ORS and $OP'V$ are congruent to each other.
One can convince oneself of the congruence of triangles ORS and $OP'V$ experimentally (for various values of the angle SOR, draw equal line segments OE and OR accurately and then determine $P'$ as the intersection of lines at 45° from R and E), or by formal geometric proof based on ordinary Euclidean geometry (such a proof is given at the end of this section). Consequently, the angles SOR and $VOP'$ are the same. This implies a simple rule characterizing surfaces of simultaneity in space-time (Fig. 3.22): if an observer's world-line makes an angle $\alpha$ with the vertical in a space-time diagram, the surfaces of instantaneity for that observer tilt up by an angle $\alpha$ toward the world-line.

Fig. 3.21 Figure 3.20 redrawn to illustrate the fact that triangles ORS and $OP'V$ are congruent.

Fig. 3.22 The angle $\alpha$ between the surfaces of simultaneity for A and B is the same as the angle between their world-lines in a space-time diagram drawn from A's viewpoint.

Fig. 3.23 A point $(t_0, X_0)$ on B's world-line, where $X_0 = Vt_0$, and a point $(t_1, X_1)$ on B's surface of simultaneity. Because of the equal-angle rule (Fig. 3.22), $t_1/X_1 = X_0/t_0 = \tan\alpha$.

The simultaneity equation

The preceding result enables us to derive a simple formula for these surfaces of simultaneity. Because B is moving at a speed $v$ relative to A, we see that B's world-line is given by $x = vt$, so $X = x/c = vt/c = (v/c)t = Vt$ (on remembering that $X = x/c$, $V = v/c$). Thus at the time $t_0$ (measured by A) B will lie at a coordinate position $X_0 = Vt_0$ (measured by A; see Fig. 3.23). Therefore the angle $\alpha$ of B's world-line with the vertical is given by

$$\tan\alpha = X_0/t_0 = (Vt_0)/t_0 = V.$$

On the other hand if $(t_1, X_1)$ is a point on the surface of instantaneity for B, where $X_1 = x_1/c$, then the angle $\alpha$ of this surface from the horizontal is given by $\tan\alpha = t_1/X_1$. Equating these values for $\tan\alpha$ shows

$$t_1 = VX_1 = vx_1/c^2, \tag{3.19}$$

which is the equation for B's surface of instantaneity in terms of the variables measured by A.
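The equal-angle rule and eqn (3.19) can also be checked by carrying out B's radar construction numerically instead of on paper. The sketch below is an illustration under stated assumptions: the function names are hypothetical, events are written as $(t, X)$ in A's rescaled coordinates (so light rays have slope ±1), and B's emission and reception events E and R are placed symmetrically about O on the world-line $X = Vt$, as argued in the text.

```python
def simultaneity_offset(V: float, X1: float) -> float:
    """Coordinate time t1 of the event at X1 on B's surface of simultaneity, eqn (3.19).

    X1 is a distance in light-travel-time units; t1 comes out in the same units.
    """
    return V * X1


def radar_simultaneous_event(V: float, s: float) -> tuple:
    """Event P' that B finds simultaneous with O by the radar method.

    B emits at E = (-s, -V*s) and receives at R = (s, V*s), both on the
    world-line X = V*t, with s the coordinate time of R (an arbitrary choice).
    The outgoing ray from E keeps X - t constant; the returning ray into R
    keeps X + t constant. P' is where the two rays meet.
    """
    t_E, X_E = -s, -V * s
    t_R, X_R = s, V * s
    outgoing = X_E - t_E   # value of X - t along the ray leaving E
    returning = X_R + t_R  # value of X + t along the ray arriving at R
    X_P = 0.5 * (returning + outgoing)
    t_P = 0.5 * (returning - outgoing)
    return t_P, X_P


# Compare the constructed event with the prediction of eqn (3.19).
for V in (0.1, 0.5, 0.9):
    t_P, X_P = radar_simultaneous_event(V, s=3.0)
    print(f"V = {V}: P' = (t = {t_P:.3f}, X = {X_P:.3f}), "
          f"eqn (3.19) gives t1 = {simultaneity_offset(V, X_P):.3f}")
```

In each case the constructed event satisfies $t/X = V$, which is the statement that B's surface of simultaneity tilts by the same angle as his world-line.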
Two examples

As a first example, consider the observer A to be on the surface of the Earth; B is in a rocket moving past at a speed $\tfrac{1}{2}c$ in the direction of the planet Mars, at a time when the distance to Mars is 4 light-hours. Then Fig. 3.23 applies with $v/c = \tfrac{1}{2}$, $X_1 = x_1/c = 4$ hours, and $t_1 = \tfrac{1}{2} \times 4 = 2$ hours (from eqn (3.19)). Thus, the event P in Mars' history that A measures to be simultaneous with the event O when A and B pass each other is 2 hours prior to the event $P'$ in Mars' history that B measures to be simultaneous with O.

As a second example, the Andromeda Nebula is about 2 190 000 light-years from the Earth. Consider simultaneity between events on the Earth and at Andromeda as measured by an observer A on the surface of the Earth, and an observer B in an airliner flying at 300 km/hr above the Earth in the direction of Andromeda. The relative speed of motion of these observers is $V = v/c = (300\ \text{km/hr}) \times (1/3600\ \text{hr/sec})/(300\,000\ \text{km/sec}) = 1/3\,600\,000$, so by (3.19) the difference in time between events at Andromeda they measure to be simultaneous with a single event on the Earth is $t_1 = (2\,190\,000/3\,600\,000) = 0.61$ years. Similarly, if observer C travels on a bus at 30 km/hr towards Andromeda, he will disagree with A about simultaneity on Andromeda by 22 days.

Conclusion

This analysis confirms what we discovered previously, namely that space-time is a unit which is split into space (surfaces of simultaneity) and time in different ways by different observers (Fig. 3.24). The splitting depends on their relative velocities; it is given by eqn (3.19), which is the analytic form of the simple 'equal tilt' result illustrated in Fig. 3.22. The analysis is inevitable once we have decided to base the concept of simultaneity on measurable effects, and recognize that it is best to do so on the basis of the speed of light because of the fundamental importance of this speed in nature.
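The arithmetic of the rocket, airliner and bus examples in this section can be reproduced in a few lines. This is a sketch for checking the numbers only; the constant and function names are illustrative, the speed of light is rounded to 300 000 km/sec as in the text, and the conversion of 365.25 days per year is an assumption made here.

```python
C_KM_PER_SEC = 300_000.0             # speed of light, rounded as in the text
ANDROMEDA_LIGHT_YEARS = 2_190_000.0  # distance quoted in the text
DAYS_PER_YEAR = 365.25               # assumption used only for the unit conversion


def v_over_c(speed_km_per_hr: float) -> float:
    """Convert an everyday speed in km/hr to the dimensionless V = v/c."""
    return (speed_km_per_hr / 3600.0) / C_KM_PER_SEC


def simultaneity_offset(V: float, X1: float) -> float:
    """Difference in simultaneity t1 = V * X1, eqn (3.19); units follow X1."""
    return V * X1


rocket_hours = simultaneity_offset(0.5, 4.0)  # X1 = 4 light-hours
airliner_years = simultaneity_offset(v_over_c(300.0), ANDROMEDA_LIGHT_YEARS)
bus_days = simultaneity_offset(v_over_c(30.0), ANDROMEDA_LIGHT_YEARS) * DAYS_PER_YEAR

print(f"rocket at c/2, 4 light-hours away: {rocket_hours:.1f} hours")
print(f"airliner at 300 km/hr, Andromeda:  {airliner_years:.2f} years")
print(f"bus at 30 km/hr, Andromeda:        {bus_days:.0f} days")
```

The large offsets at Andromeda come entirely from the enormous distance $X_1$; the factor $V$ is tiny in both everyday cases.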
As is the case for all relativity effects, the relativity of simultaneity is completely reciprocal: viewed from B's reference frame, his surfaces of simultaneity are horizontal and it is A's surfaces of simultaneity that are tilted, inclining up towards A's world-line (see Fig. 3.25, which is just Fig. 3.22 redrawn from B's viewpoint).

Fig. 3.24 Space-time split differently into space (surfaces of simultaneity) and time (measured along world-lines) for observers A and B in relative motion.

Fig. 3.25 Figure 3.22 redrawn from B's viewpoint.

Finally we note that for small values of $|vx/c^2|$ the effect is very small; in particular, it is negligible in everyday life (the differences for simultaneity of different observers are in the region of $10^{-[...]}$ sec). On the other hand, as $v$ increases towards $c$, $t_1 \to x_1/c$: that is, events simultaneous with O approach closer and closer to the future light cone. Figure 3.22 shows that as $v$ increases, $\alpha \to 45°$ and B's surface of simultaneity in space-time approaches closer and closer to his world-line. If the limit when $v/c = 1$ could be attained, B's world-line would be contained in his surface of simultaneity: time would cease to flow for him. If he tried to determine simultaneity at distant regions in his direction of motion, he could not do so: having emitted a radar signal for this purpose, he would arrive at these regions at the same time as the signal he was attempting to use to determine simultaneity there! Further, on observing backwards, wave fronts emitted long ago from regions he had already passed would perpetually be moving with him, informing him that conditions there were unchanging. Luckily, these strange situations cannot happen for real observers, since they cannot move at the speed of light.

Exercises

3.9 An airliner flies at 500 km/hr towards a destination 1000 km away. What is the resulting difference in simultaneity for the aircraft and the control tower at the destination?
Does the pilot have to allow for it?

3.10 Twin A on the Earth maintains radio contact with Twin B who is in a rocketship moving away from him at a speed of [...]$c$. They decide to blow out candles on cakes simultaneously at midday on January 10th (their birthday). At that moment the distance between them, as measured by A, will be 2 light-years. What difference will there be between the times they each consider the appropriate moment for each to blow out the candles? B turns around when her distance from A is measured by A to be 3 light-years, and starts returning at a speed [...]$c$. Let P be the event in A's history that B measures to occur immediately before the turn-around, and Q be the event in A's history that B measures to occur immediately after the turn-around (for simplicity take this to happen instantaneously). What difference in time does A measure between the events P and Q?

3.11 Return to consideration of Exercise 2.7. Determine what instant in B's history she measures to be simultaneous with the event when A fired at her. Does B reach the same conclusion as A about who fires first?

Computer Exercise

7. Write a program that will accept as input (a) the relative speed of motion $V$ of two observers, and (b) a distance $D$; and will then print out the difference in simultaneity $DT$ measured by these observers at the distance $D$ (given by eqn (3.19)). Verify the negligible nature of the effect for everyday speeds of motion on the Earth. Modify your program so that it can also calculate $D$ from $DT$, given $V$; or $V$ from $D$ and $DT$. Hence find, for example, what relative speed will cause a difference of simultaneity of one hour at a distance of four light-hours. What limit can you deduce on the possible magnitude of $DT$ at this distance? What is the general form of the limit on the magnitude of $DT$, given $D$?

Appendix: Geometric proof of the congruent triangle result

Consider Fig. 3.26, which is an extension of Fig. 3.21.
By sending out a light signal in the opposite direction to $P'$, the observer B would determine the event $Q'$ also to be simultaneous with O. Now, parallelogram $EP'RQ'$ is formed from light rays all at 45° to the vertical. Thus it is a rectangle and its diagonals, which bisect each other at O, must be equal. Hence the length of $OP'$ is equal to those of OE and OR. Looking now at angles, we see that $\beta_2 = \beta_3$ (triangle $OP'E$ is isosceles); $\alpha_2 + \beta_2 = 45°$ (the angle OTU exterior to triangle OET is equal to the sum of the interior angles); and similarly $\alpha_3 + \beta_3 = 45°$. These three equations show that $\alpha_2 = \alpha_3$. We also have $\alpha_1 = \alpha_2$ (opposite angles are equal), therefore $\alpha_1 = \alpha_3$. We see now that triangles ORS and $OP'V$ are congruent, with equal sides (OR and $OP'$) and two pairs of equal angles ($\alpha_1$ and $\alpha_3$, and the two right angles OSR and $OVP'$).

Fig. 3.26 Figure to prove congruence of the two shaded triangles in Fig. 3.21 (see text). Light emitted at event E in B's history is reflected at events $Q'$ and $P'$ and returns to event R in B's history.

3.4 Time dilation

We have seen that a relatively moving observer will measure time differently from a stationary observer (Section 1.3); and have emphasized that consequently one cannot directly measure proper time $t'$ for an observer B from a space-time diagram drawn from the viewpoint of an observer A, because this diagram will be calibrated in terms of A's variables $(t, X, Y, Z)$, and we are not entitled simply to assume what the relation between $t$ and $t'$ is. However, we can easily calculate this relation (cf. the derivation of eqn (1.2)).

Fig. 3.27 Observers A and B in relative motion. By sending a light signal at $T$ and receiving it back at $T''$, the observer A determines the event Q to be simultaneous with the reflection event P, which is at time $T'$ according to B's clock.
In this section we shall work out the magnitude of the time dilation effect in terms of the Doppler shift factor $K$ and in terms of the relative velocity $v$, consider direct evidence for time dilation, discuss the symmetry of the time dilation effect, and investigate the 'twin paradox'.

The comparison of clock readings by radar

Consider again the situation described at the beginning of Section 3.2, where observer B moves past A at a speed $v$ (see Fig. 3.27). Both observers set their clocks to zero at the time they pass each other, thus the event O where their world-lines coincide is given by $t_O = 0$, $t'_O = 0$. After emitting a radar pulse at time $T$, the observer A determines the reflection event P on B's world-line to be simultaneous with the event Q on his world-line: i.e. $t_P - t_Q = 0$. Then from eqn (3.10), $t_P = t_Q = \tfrac{1}{2}(K^2 + 1)T$. On the other hand, for B, we have $t'_P = T' = KT$. Because both A and B set the time at O to zero, the ratio of the time from O to P as measured by A to the time from O to P as measured by B is given by $t_P/t'_P = \tfrac{1}{2}(K^2 + 1)T/KT$. Cancelling the factor $T$, we find that

$$t_P/t'_P = \frac{K^2 + 1}{2K} \equiv \gamma(K) \tag{3.20}$$

('$\equiv$' means 'identically equal to'). This is the relativity effect of time dilation, showing how times $t'$ measured by B relate to times $t$ measured by A when they are compared by synchronization of clocks (i.e. using radar to determine simultaneity).

The K-factor and the $\gamma$-factor

Equation (3.20) defines the time dilation factor $\gamma$ ('gamma') in terms of the Doppler shift factor $K$, measured by A directly observing an apparent slowing down or speeding up of time in the image of events at B (as discussed in Section 3.1). It is easy to get confused over these two effects (the time dilation effect and the Doppler shift effect), so we now note their major distinctions.

Essentially, the K-factor relates time at the observer to clock rates at the object as directly observed (e.g. through a telescope).
Thus it compares clock rates then (at the time of emission) to now (at the time of observation). On the other hand, $\gamma$ relates clock times as related by instantaneity. Thus it is based on the concept of 'simultaneity' or 'now', and compares the rates at which clocks at the observer and object are both running now.

Considering these two situations, it becomes clear that the crucial difference is that direct measurements of the K-factor involve light travelling only one way, from an object to the observer (Fig. 3.28a); thus one only needs a receiver to carry out the observations. By contrast, radar measurements (such as those used to determine $\gamma$ by clock synchronization) depend on light travelling both ways between the object and observer, as pulses travel from the observer to the object and back again (Fig. 3.28b), so one needs both a transmitter and receiver to carry out the observations.

Fig. 3.28 The distinction between $K$ and $\gamma$. (a) The K-factor relates observers' clocks by observed Doppler shifts and depends on light signals travelling only in one direction. (b) The $\gamma$-factor relates the observers' clocks by simultaneity determined by radar, and depends on signals travelling both ways between the observers. (c) A situation where information is conveyed by a one-way signal from A to B, but that information was determined by previous radar measurements using reflected (two-way) signals and so is based on the $\gamma$-factor.

The K-factor observations are essentially simpler, requiring only analysis of a received signal. Through them, we only obtain information about conditions at the object at the time the light was emitted, which could be a very long time ago; indeed we have measured redshifts in light from distant objects using radiation that has been travelling towards us for over a thousand million years. We do not obtain information about the object 'now'.
We can make these measurements to such great distances because the object itself (perhaps a galaxy or quasi-stellar object) provides the power supply for the signal. The information is relatively easy to obtain, but is also relatively limited; in particular, neither distance nor simultaneity is directly deducible from measurements of the K-factor.

By contrast, observations to determine the $\gamma$-factor directly depend on obtaining an echo pulse; the experiments are essentially more complex, requiring coordinated measurement of emitted and received signals. Correspondingly, they give more information (distance and simultaneity can be deduced directly, and indeed the Doppler shift factor is also directly measurable from a series of radar pulses, see eqn (3.6)). The distance to which radar can be used is more limited, both because of practical limits on observing time delays, and because of limits on power requirements, since we provide the power for the signal detected. Unless either the radiation is emitted parallel (i.e. non-spreading) to very high accuracy, or the target actively aids the process by amplifying and rebroadcasting the signal, the power needed goes up as the fourth power of the distance because of the need to obtain an echo pulse. It is hardly practical to use radar to measure distances of more than a few light-years; at present, the maximum distance measured by radar is about 8 light-hours. Clearly, the same limits will apply to the use of radar for clock synchronization.

Finally, we note that in any complex situation one may have to consider carefully before deciding which is the real effect in operation. As an example, suppose observer A tracks a uniformly moving spacecraft B by radar for some time, and then after suitable computations sends a message: ‘When you receive this message, the time will be 12:00 noon’ (Fig. 3.28c).
Now, the final message is one-way from A to B, so one could conceivably think that the information sent was essentially a deduction from the K-factor effect. However, this would be incorrect; the data sent are based on the two-way radar observations by A that took place initially, the final signal merely transferring from A to B the results of these previous measurements. The information B receives in this case is not about conditions at the time of transmission, but rather about what conditions will be at the time of reception: at that time, A’s clock will simultaneously read 12:00 (where simultaneity is measured by A). Thus the information sent is about radar-based determination of simultaneity, and the relative time dilation measured by such observations will be determined by the $\gamma$-factor.

The inverse relation and the symmetry of $\gamma$

One can solve relation (3.20) for K in terms of $\gamma$, obtaining

$$K = \gamma \pm (\gamma^2 - 1)^{1/2}.$$

The plus sign will correspond to relative recession (when $K > 1$), and the minus sign to relative approach of the two observers (when $0 < K < 1$). As examples of the relation between the Doppler-shift and time-dilation factors, eqn (3.20) shows, for instance, that

$$K = 2 \Rightarrow \gamma = \tfrac{5}{4}, \qquad K = 3 \Rightarrow \gamma = \tfrac{5}{3}.$$

These examples are all for observers receding from each other. The same formula (3.20) holds if they approach each other. As examples,

$$K = \tfrac{1}{2} \Rightarrow \gamma = \tfrac{5}{4}, \qquad K = \tfrac{1}{3} \Rightarrow \gamma = \tfrac{5}{3}.$$

These results suggest that the same value of $\gamma$ is obtained for $1/K$ as for $K$, and indeed eqn (3.20) confirms this:

$$\gamma(1/K) = \frac{(1/K)^2 + 1}{2(1/K)} = \frac{K^2 + 1}{2K} = \gamma(K).$$

We already know (Section 3.2) that $K \to 1/K$ corresponds to changing from approach to recession (or vice versa) at the same relative speed of motion. Thus, we have shown that the time dilation effect (determined by radar comparison of clock settings) is the same for relative approach or recession at the same speed. The above examples suggest, and further investigation confirms, that $\gamma \geq 1$.
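The relation (3.20) and its inverse are easy to check numerically. The following is a minimal sketch rather than part of the original text; the function names `gamma_from_k` and `k_from_gamma`, and the sample values of K, are our own illustrative choices.

```python
import math


def gamma_from_k(k: float) -> float:
    """Time dilation factor from the Doppler shift factor K, eqn (3.20)."""
    if k <= 0:
        raise ValueError("K must be positive")
    return (k * k + 1.0) / (2.0 * k)


def k_from_gamma(gamma: float, receding: bool = True) -> float:
    """Invert eqn (3.20): K = gamma + sqrt(gamma^2 - 1) for recession,
    K = gamma - sqrt(gamma^2 - 1) for approach."""
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    root = math.sqrt(gamma * gamma - 1.0)
    return gamma + root if receding else gamma - root


if __name__ == "__main__":
    # Illustrative K values only: two for recession, two for approach.
    for k in (2.0, 3.0, 0.5, 1.0 / 3.0):
        g = gamma_from_k(k)
        # gamma(K) and gamma(1/K) should agree, as the text argues.
        print(f"K = {k:.4f}  gamma(K) = {g:.4f}  gamma(1/K) = {gamma_from_k(1.0 / k):.4f}")
```

Choosing the branch of the square root with a `receding` flag mirrors the remark that the plus sign belongs to recession and the minus sign to approach.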
Thus B’s clock (moving past A) is measured by A to run slow relative to A’s clock (at rest in the chosen coordinate system), whether they are approaching or receding from each other. Essentially, this symmetry is because the light used to make the measurements travels both ways between A and B. It contrasts with the Doppler shift effect, where A observes B’s clock to be running slow if B recedes, but to be running fast if B approaches; the difference between observed consequences of approach and recession in this case is possible because the light used to make the measurement travels only one way (either from A to B or from B to A).

The relation to relative velocity

By substituting from eqn (3.12a) in (3.20) we can re-express the time dilation factor $\gamma$ in terms of $V = v/c$ instead of K. We find

$$\gamma = \frac{1}{2}\left(K + \frac{1}{K}\right) = \frac{1}{2}\left\{\left(\frac{1+V}{1-V}\right)^{1/2} + \left(\frac{1-V}{1+V}\right)^{1/2}\right\} = \frac{(1+V) + (1-V)}{2(1-V^2)^{1/2}}.$$

Therefore

$$\gamma = 1/(1 - V^2)^{1/2} = 1/(1 - v^2/c^2)^{1/2}, \qquad (3.21)$$

which confirms the result already obtained by other means (eqn (1.2)). As examples, if $v/c = \tfrac{1}{4}$, then $1 - V^2 = \tfrac{15}{16}$, $(1 - V^2)^{1/2} = 0.968$, and $\gamma = 1/0.968 = 1.033$. Similarly,

if $v/c = \tfrac{1}{2}$ then $\gamma = 1/0.866 = 1.155$;
if $v/c = \tfrac{3}{4}$ then $\gamma = 1/0.661 = 1.512$;
if $v/c = 0.9$ then $\gamma = 1/0.436 = 2.294$;
if $v/c = 0.99$ then $\gamma = 1/0.141 = 7.089$.

Fig. 3.29 A graph of the $\gamma$-factor against $V = v/c$, plotted from eqn (3.21). Note that $\gamma$ becomes arbitrarily large as the relative speed approaches the speed of light.

Thus, as expected, high relative speeds cause large $\gamma$-factors and so large observed time dilations. It follows immediately from eqn (3.21) that (a) the effect is always one of an observed slowing down of the moving clock ($\gamma \geq 1$); (b) it vanishes if and only if there is no relative motion of the object and observer: $\gamma = 1 \Leftrightarrow v = 0$; and (c) the time dilation becomes indefinitely large as the relative speed approaches the speed of light: $\gamma \to \infty \Leftrightarrow |v/c| \to 1$. It also confirms (d) that time dilation depends only on the magnitude of the relative velocity, not whether it is a speed of approach or recession: $\gamma(-v) = \gamma(v)$.
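The numerical examples and the properties (a) to (d) of eqn (3.21) can be reproduced with a few lines of code. This is a sketch under our own naming (`gamma_from_v`, `k_from_v`); the form $K^2 = (1+V)/(1-V)$ assumed for eqn (3.12) is the one used in the worked twin-paradox example of this section.

```python
import math


def gamma_from_v(v_over_c: float) -> float:
    """Time dilation factor from V = v/c, eqn (3.21)."""
    if abs(v_over_c) >= 1.0:
        raise ValueError("|v/c| must be less than 1")
    return 1.0 / math.sqrt(1.0 - v_over_c * v_over_c)


def k_from_v(v_over_c: float) -> float:
    """Doppler shift factor, assuming K^2 = (1 + V)/(1 - V) (eqn (3.12));
    V > 0 is recession, V < 0 is approach."""
    if abs(v_over_c) >= 1.0:
        raise ValueError("|v/c| must be less than 1")
    return math.sqrt((1.0 + v_over_c) / (1.0 - v_over_c))


if __name__ == "__main__":
    for v in (0.25, 0.5, 0.75, 0.9, 0.99):
        k = k_from_v(v)
        # Cross-check eqn (3.21) against eqn (3.20): gamma = (K^2 + 1)/2K.
        from_k = (k * k + 1.0) / (2.0 * k)
        print(f"v/c = {v:<5} gamma = {gamma_from_v(v):.3f}  via K: {from_k:.3f}")
    # Property (d): gamma depends only on the magnitude of v.
    print(gamma_from_v(-0.9) == gamma_from_v(0.9))
```

Computing $\gamma$ both directly and via K is a cheap consistency check between eqns (3.20) and (3.21).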
All these features are clear in the graph of $\gamma$ as a function of $v/c$ shown in Fig. 3.29 (plotted from eqn (3.21)).

Direct evidence for time dilation

The time dilation effect is at first quite unexpected, so it is important to determine experimentally whether it actually occurs. It has indeed been confirmed by a classic experiment. The basic idea is to (a) synchronize two atomic clocks; (b) leave one on the ground while the other travels on a jet aircraft; then (c) compare the times measured by the clocks when the aircraft returns. If the Earth’s surface could be regarded as an inertial frame for this purpose, then, after allowance has been made for accelerations during the jet’s journey, the clock on the jet should have run slow by a factor $\gamma$. The experiment of Hafele and Keating in 1972 used two aircraft going round the Earth in opposite directions. The time dilation recorded by the clocks showed remarkable agreement with the theoretical predictions, after allowance had been made for such features as the rotation of the Earth. The interested reader will find a fuller description of the experiment in Science 177, p. 166 (1972). We will discuss a different way of verifying the time dilation effect in Section 3.6.

The symmetry of time dilation

Perhaps the most difficult thing to understand about the time dilation effect is that, like all the other relativity effects discussed here, it is completely symmetrical. Thus not only does observer A measure B’s clock to run slow by a factor $\gamma$, but also B measures A’s clock to run slow by a factor $\gamma$. As an illustrative example, consider a Transylvanian spacecraft travelling at very high speed outwards from its home base on Earth towards the star Aldebaran, having left home on the first of January.
In order to make sure the crew of the spacecraft celebrate the President’s birthday (on the 1st of June) at the appropriate time, the base uses radar to track the progress of the spacecraft, and their computer transmits a radio signal (‘Today is the day!’) timed so that it will arrive at the spacecraft at precisely midday on June 1st as measured on the ground. According to their accurate clock and carefully kept calendar, the crew of the spacecraft receive this signal at midday on April 1st. The discrepancy arises because when viewed from the frame of the Earth, the moving clock (in the spacecraft) goes slow.

Now, by a curious coincidence, the birthday of the captain of the spacecraft is also the 1st of June. The crew of the spacecraft use the same tactics as the home base: they track the position of the Earth by radar, and transmit a celebratory radio signal timed to arrive at the home base precisely at midday on June 1st (according to their calendar). This signal also arrives at the home base at midday on April 1st. This is because the situation is completely reciprocal: when viewed from the spacecraft, the clock on Earth (moving relative to the spacecraft) goes slow. The two reference frames (assumed here both to be inertial) are equivalent, and each finds the other’s clock to go slow by the same amount.

There are various ways to understand this feature. One is to note that in the derivation above of the time dilation result, there was nothing special whatever about A as opposed to B; they were simply two inertial observers in relative motion. To determine what B measures, we simply need to relabel A and B; the whole calculation (with an implied relabelling of the coordinates) remains valid. That the reciprocity of the result must be true is thus simply a consequence of the basic relativity principle.

While this proves the result desired, it does not explain the relation between the two sets of observations. To consider this, start with Fig.
3.30a (drawn from A’s viewpoint). As we have seen above, A measures Q on his world-line to be simultaneous with P on B’s world-line, and so determines that

$$t_Q = \gamma t'_P, \qquad (3.22a)$$

where $\gamma$ is given by (3.21). Hence $t_Q > t'_P$: although OP looks longer than OQ, the segment OP in the space-time diagram represents a smaller time measured by B than the time A measures from O to Q (cf. Fig. 1.27b). A measures B’s clock to be running slow.

Fig. 3.30 (a) Observer A measures the point Q on his world-line to be simultaneous with P on B’s world-line. (b) Observer B measures the point R on A’s world-line to be simultaneous with P on his world-line. Event R precedes event Q. (c) The same situation redrawn in B’s rest frame.

How can B also measure A’s clock to be running slow? The key feature is that B does not measure Q on A’s world-line to be simultaneous with P on his world-line. Rather, from what we have learned in Section 3.3, B measures a point R on A’s world-line to be simultaneous with P, where R precedes Q: i.e. $t'_R = t'_P$ with $t_R < t_Q$ (Fig. 3.30b). Exactly analogously to (3.22a), B’s analysis shows that

$$t'_P = \gamma t_R, \qquad (3.22b)$$

showing that B measures A’s clock to be running slow. There is no contradiction between these results; rather (3.22a, b) show that

$$t_Q = \gamma^2 t_R, \qquad (3.22c)$$

confirming the result $t_Q > t_R$, as required for consistency.

Hence, the key to understanding the way the time dilation effect can be reciprocal is to note that A measures Q and P to be simultaneous, but B measures R and P to be simultaneous. Finally, we note that Fig. 3.30b is drawn from A’s viewpoint. To understand fully the reciprocity, consider Fig. 3.30c, which is the identical space-time situation drawn from B’s viewpoint. Relations (3.22) therefore hold for Fig. 3.30c, just as they do for Fig. 3.30b.
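A small numerical sketch may help to make relations (3.22) concrete. It rebuilds the events Q and R from the radar prescription: each observer assigns to a reflection event the mid-point of his own emission and reception times. The function names and the sample values K = 3, $t'_P = 6$ are illustrative choices of ours, and we assume (as the relativity principle requires) that B's pulses reach A stretched by the same factor K as A's pulses reaching B.

```python
def gamma_from_k(k: float) -> float:
    """Time dilation factor from the Doppler shift factor K, eqn (3.20)."""
    return (k * k + 1.0) / (2.0 * k)


def simultaneous_times(k: float, t_b_at_p: float) -> tuple:
    """Return (t_Q, t_R): the times on A's clock that A and B respectively
    judge to be simultaneous with event P, which occurs at time t_b_at_p
    on B's clock. Both clocks read zero at the event O where they pass."""
    # A's radar: a pulse sent at A-time T reaches B at B-time K*T (event P)
    # and its echo returns to A at K*K*T; A assigns P the mid-point time.
    t_emit_a = t_b_at_p / k
    t_q = 0.5 * (t_emit_a + k * k * t_emit_a)

    # B's radar (assumed reciprocal): a pulse sent at B-time T' reaches A at
    # A-time K*T' (event R) and its echo returns to B at K*K*T'. B assigns R
    # the mid-point time, which is required to equal t_b_at_p.
    t_emit_b = 2.0 * t_b_at_p / (1.0 + k * k)
    t_r = k * t_emit_b
    return t_q, t_r


if __name__ == "__main__":
    k, t_p = 3.0, 6.0                 # illustrative values only
    gamma = gamma_from_k(k)
    t_q, t_r = simultaneous_times(k, t_p)
    print(t_q, gamma * t_p)           # the two sides of (3.22a)
    print(t_p, gamma * t_r)           # the two sides of (3.22b)
    print(t_q, gamma ** 2 * t_r)      # the two sides of (3.22c)
```

The two observers run the same radar procedure yet arrive at different events on A's world-line, which is exactly the disagreement about simultaneity that makes the reciprocity consistent.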
Note that one can directly read off proper times $t'$ measured by B from Fig. 3.30c, because it is calibrated in terms of his variables $(t', X', Y', Z')$; however, one cannot directly read off times measured by A from this diagram. Later (in Section 4.2) we shall find out how to represent a time along B’s world-line equal to the time OQ measured by A.

The ‘twin paradox’

We have already referred to the ‘twin paradox’ (Section 1.3). The question that now arises is, how can it be compatible with the symmetry between inertial observers which we have just established? To examine the issue, consider a specific example. Let A be an observer who stays at rest in an inertial frame while B travels away from A at a speed $v = \tfrac{4}{5}c$ for 6 years, as measured by B’s clock, and then returns at the same speed for 6 years. Thus B measures a total duration for the trip of 12 years. What does A measure?

Figure 3.31a is a space-time diagram of the situation. On the outward journey, A and B recede from each other at $v/c = \tfrac{4}{5}$, and by eqn (3.12) $K^2 = (1 + \tfrac{4}{5})/(1 - \tfrac{4}{5}) = 9$, so $K = 3$. On the return journey, A and B approach each other; $v/c = -\tfrac{4}{5}$, and $K = \tfrac{1}{3}$ (as expected, the inverse of K for the outward trip). Let O be the event in A’s history when B leaves, S the event where A sends a signal to B that arrives at the event U when B turns around, and P the event when B arrives back at A. The relation $t'_{OU} = K t_{OS}$ follows from the definition of K; as $t'_{OU} = 6$ and $K = 3$, the time $t_{OS}$ measured by A from O to S is 2 years. Similarly $t'_{UP} = K t_{SP}$, where now $t'_{UP} = 6$ years and $K = \tfrac{1}{3}$. Therefore, the time $t_{SP}$ measured by A from S to P is 18 years. Thus, the total time measured for the trip by A is $t_{OP} = t_{OS} + t_{SP} = 20$ years. This illustrates the twin paradox: after the journey, A will have aged by 8 years more than B.

An alternative way of obtaining this result is to note that on both the outward and return journey, from (3.21), $\gamma = \{1 - (\tfrac{4}{5})^2\}^{-1/2} = \tfrac{5}{3}$. Thus, if we consider the event W in A’s history simultaneous with U (Fig.
3.31b) then $t_{OW} = \gamma t'_{OU} = \tfrac{5}{3} \times 6 = 10$ years; similarly $t_{WP} = \gamma t'_{UP} = 10$ years. Thus, A measures a total travel time for B of $t_{OP} = t_{OW} + t_{WP} = 20$ years, as before.

How do we reconcile this difference in the times measured by A and B from O to P, if the time dilation effect is reciprocal between inertial observers? In a nutshell, the point is that B does not move inertially, but A does. Thus, B is not an inertial observer. Rather, her history consists of inertial segments joined by a period of acceleration. In order that this acceleration take place (when B’s direction of motion reverses), she has to fire a rocket, experience elastic forces, or in some other way break her inertial motion; if she does not do so, the two observers are equivalent and their distance apart increases indefinitely. Acceleration is demanded in order that they meet again. This acceleration is physically detectable. Suppose each observer has with him or her an acceleration detector consisting of a weight constrained to move within a framework by springs which are fitted with strain detectors (Fig. 3.32). Since A moves inertially, his detector will register no forces, but B’s will; this shows that the distinction between their motions has clear-cut measurable physical consequences. The symmetry of the time dilation effect holds only between inertially moving observers. In the example, B does not move inertially between the events O and P, but A does, and that is the source of the asymmetry whereby A measures a longer proper time between O and P than B does.

Fig. 3.31 The ‘twin paradox’. (a) Twin B travels at speed v for 6 years, and then returns at the same speed to rejoin A, who has remained at rest (in an inertial reference frame) during B’s journey. A light signal emitted by A at S is received by B at U, when she turns around. (b) Twin A measures the event W on his world-line to be simultaneous with the event U on B’s world-line.

Fig. 3.32 An acceleration detector, consisting of a weight held between springs fitted with detectors that record movement of the weight relative to the sides of the framework.

It is the time dilation effect that makes it clear that we should properly regard time as a quantity measured along world-lines from their initial to their final points, that is, a line integral along the world-lines (see Appendix A for a brief discussion of the concept of a line integral). Then the asymmetry in the ‘twin paradox’ has a particularly clear interpretation (Fig. 3.33). Consider any two events O and P in space-time that are time-like separated, that is, that are such that a particle can move from O to P without exceeding the speed of light. Then one can show that the unique path from O to P along which a clock will measure the longest time is that representing inertial (free-fall) motion. This is a straight line in space-time from O to P. Thus, it is precisely the inertial observer who will have aged the most when two observers meet again, no matter what path the other has taken through space-time from O to P (i.e. no matter what accelerations he has undergone). In the example above, this singles out the observer A as unique compared to all others who pass through both events O and P.

Fig. 3.33 Various world-lines between events O and P. The straight-line path A is the one with the longest proper time. It is uniquely characterized by the fact that an acceleration detector will measure no acceleration along this path.

Conclusion

In summary, ‘a moving clock goes slow’ in a way that is completely reciprocal for any pair of inertial observers (each measures the other’s clock to be going slowly). This is consistent because they disagree about simultaneity. This time dilation effect refers to a comparison of times measured by both clocks ‘now’, i.e.
it is based on the idea of simultaneity. It must not be confused with the Doppler shift effect relating observed times, which is also completely reciprocal, but relates time measured by the observer now to time at the source of the radiation when this radiation was emitted (which could be a very long time ago). Time dilation gives rise to the ‘twin paradox’: any observer who moves away from an inertial observer and then back again will find he has experienced a smaller increase in time than the inertial observer. This feature has been observed experimentally by comparison of a clock in an aircraft with a clock stationary on the surface of the Earth (the Hafele-Keating experiment described above).

Exercises

3.12 Consider the ‘twin paradox’ example above (Fig. 3.31).
(1) Let light emitted by B at the event U be received by A at an event V. Using the K-factor, determine the times A measures from O to V and from V to P; hence deduce the total time measured by A from O to P.
(2) What event in A’s history does B determine to be simultaneous with U, (a) just before she turns around, (b) just after she turns around? Use the $\gamma$-factor based on B’s view of space-time during her inertial segments of motion to determine the time intervals in A’s history between O, these events, and P. Hence confirm that B can also use $\gamma$ to determine the time A measures from O to P.
(3) Suppose A and B each observe the other by radar. Find the relative motions each determines for the other. [This reveals a quite unexpected motion that B measures for A, and so sharply shows the distinction between them.]

3.13 Let the world-line of an inertial observer A go from a space-time event O to P. Let observer D move inertially from O to some event Q, and then inertially to P. Show that D measures a shorter time interval from O to P than A does.
[Hint: find the time in A’s history he measures to be simultaneous with Q; then use the relevant $\gamma$-factors separately for the outward and return journey of D.] Generalize to show that if D moves on any finite number of inertial segments from O to P, he measures a shorter time interval from O to P than A does (unless he moves inertially from O to P, in which case he measures the same time interval).

3.14 Suppose that a spaceship cruises at v = 3c. Assuming that you may neglect the times for acceleration and deceleration, find how much the Earth will have aged during an outward and return journey which takes 50 years as measured by the astronauts on board. How far from the Earth will the spaceship have travelled? What limits does this suggest to what may be achieved in space travel?

3.15 The relation between velocity and K (and so redshift) considered so far has been for the case of radial motion (the source moving directly towards or away from the observer). Now consider the case of transverse motion; the source is moving at right angles to the line of sight from the observer (Fig. 3.34). Then the distance between the source and observer is unchanging instantaneously. Calculate the K-factor for light emitted by the source and received by the observer, and hence the redshift measured in this case. [Hint: the K-factor is simply due to the time dilation effect (3.21) in this case.] How large will be the resultant effect on the measured CBR temperature in those directions? (See the discussion of redshift and background radiation at the end of Section 3.1.)

Fig. 3.34 [Diagram labelled: light, source, observer.]

Computer Exercise 8

Write a program which will accept as input any one of the three parameters: velocity $V$ $(= v/c)$, time dilation factor $G$ $(= \gamma)$, Doppler shift factor $K$; and prints out the other two. Use your program to plot carefully a graph of $\gamma$ and $K$ against $V$ for all allowed values of $V$. Modify your program (a) to print out additionally the ‘slow-motion’ approximations $G1 = 1 + \tfrac{1}{2}V^2$ and $K1 = 1 + V$.
Find out for what ranges of $V$ the quantities $G1$ and $K1$ are good approximations to $G$ and $K$ respectively. (b) Repeat this for the fast-motion approximations $G2 = 1/\sqrt{2\epsilon}$ and $K2 = \sqrt{2/\epsilon}$, where $\epsilon$ is defined by $V = 1 - \epsilon$.

3.5 Length contraction

The final major kinematic effect of special relativity is length contraction. Just as time measurements depend on relative motion, so we might expect the same general kind of effect to hold for length measurements. As an example, suppose a Special Interstellar Shuttle requires 15 miles of surface to land safely, so a runway of this length has been constructed for it in the Mojave desert. As the pilot approaches at very high speed, he checks the length of the runway by radar, and measures it to be only 7 miles long, apparently far too short for a safe landing! This is due to the relativity length contraction effect; when he makes his final approach at low speed, he will measure its length to be about 15 miles, so he can make a safe landing.

In general, the length measurements made by two relatively moving observers are related by length contraction, which is the companion to time dilation, and is similar in many ways. In particular, it is also a reciprocal relation because of the relativity of simultaneity. In this section, we calculate how relative motion affects lengths measured by radar, consider how length contraction can be a reciprocal effect, mention the lack of a width contraction effect, and discuss the relation of these results to photographic images of objects.

The determination of lengths by radar

The crucial feature about measuring the length of a rigid ruler, rod, or other object is that it is a measurement of length at an instant. To understand the implications of this statement, it is important to realize that the space-time representation of the history of a rod is a ribbon in space-time, bounded by two time-like lines. To see this, consider a straight rod with ends u and w (Fig.
3.35a). Suppose that it is at rest in A’s reference frame; for simplicity, we take u to be at the origin of A’s coordinates. The world-lines of the left-hand end of the rod (u) and the right-hand end (w) are then lines at fixed X values, as shown in Fig. 3.35b. Obviously, the central point v in the rod lies between u and w, so its history will be represented by a world-line between those of u and w; similarly, the history of each point in the rod will be represented by a world-line between u and w. Thus, the material of the rod occupies the entire region between these world-lines (Fig. 3.35b).

Fig. 3.35 (a) A ruler with ends u and w and mid-point v. (b) A space-time diagram of the ruler at rest in the reference frame of an observer A, showing the world-lines of the ends u and w and of the mid-point v. Clearly, the entire strip between the world-lines of u and w will represent histories of particles comprising the ruler. A surface of simultaneity for A is horizontal in this diagram.

A measurement of the length of the rod is a measurement by radar of the distance between u and w (Fig. 3.35a) ‘at an instant’. Thus, when A measures the length of the rod from u to w, this is the distance between their respective world-lines in a surface of simultaneity for A (Fig. 3.35b).

The effect of relative motion on measured lengths

As usual, we consider an inertial observer B moving past the inertial observer A at a relative speed v. Given a space-time diagram drawn from A’s viewpoint (and calibrated by A’s coordinates), we can directly read off from it distances measured by A, but cannot directly read off distance measurements made by B. As simultaneity differs for A and B, when each uses radar to measure the length of the rod they are measuring somewhat different aspects of its space-time history (Fig. 3.36a), so it is hardly surprising that they obtain different results.
This diagram shows the situation from A’s viewpoint; Figure 3.36b shows the same situation as seen by B. The detailed examination that follows leads to the length contraction formula (3.24).

Fig. 3.36 (a) Observer A measures the instantaneous length of the ruler to be $L$, while observer B (moving relative to A) measures the length $L'$ in his surfaces of simultaneity. (b) The same situation drawn in B’s rest frame.

To simplify the comparison, consider the measurements made so that A and B both use the same light rays to determine the length of the rod (Fig. 3.37). B emits a light signal at P; this is reflected from the end w of the rod at W, and it is received back by B at R. Let the event on B’s world-line half-way between the emission and reception of the signal as measured by his own clock be O. Suppose that B chooses the event P so that he coincides with the end u of the rod at the event O; then $t'_{PO} = t'_{OR}$ and B determines O and W to be simultaneous. The light travel time is $\tau' = 2t'_{PO} = 2t'_{OR}$, and the length of the rod is measured by B to be

$$L' = \tfrac{1}{2}c\tau'. \qquad (3.23a)$$

Fig. 3.37 Observers A and B both determine the length of the rod by radar, A emitting a signal at Q and receiving the echo at S, while B emits a signal at P and receives the echo at R. A measures U and W to be simultaneous, while B measures O and W as simultaneous (OP and OR represent equal times).

Let the light emitted at the event P by B reach A’s world-line at the event Q. Suppose that A emits a light signal at the event Q. This light is reflected at the event W and received back by A at the event S. Let $K_1$ be the K-factor for a relative speed of approach v; this relates the time $t'_{PO}$ to $t_{QO}$, so the time measured by A from Q to O is $t_{QO} = K_1 \times \tfrac{1}{2}\tau'$. Let $K_2$ be the K-factor for a relative speed of recession v; this relates the time $t'_{OR}$ to $t_{OS}$, so the time measured by A from O to S is $t_{OS} = K_2 \times \tfrac{1}{2}\tau'$. Let the total light travel time measured by A be $\tau$.
Then $\tau = t_{QS} = t_{QO} + t_{OS} = (K_1 + K_2) \times \tfrac{1}{2}\tau'$. Now, $K_2 = 1/K_1$ because they relate approach and recession at the same speed. Then

$$\tau = (K_1 + 1/K_1)(\tau'/2) = \{(K_1^2 + 1)/2K_1\}\tau' = \gamma\tau', \qquad (3.23b)$$

by (3.20). The length of the rod is measured by A to be

$$L = \tfrac{1}{2}c\tau. \qquad (3.23c)$$

Hence the ratio of the length of the rod measured by A to the length of the rod measured by B is

$$L/L' = \tau/\tau' = \gamma \qquad (3.24)$$

by eqns (3.23), where $\gamma$ is expressed in terms of v by equation (3.21). (Note that $\gamma$ is the same for $K_1$ and $K_2$, as they represent approach and recession at the same speed, so the same result is obtained if we replace $K_1$ by $K_2$ in (3.23b).) Thus the length of the rod measured by A (for whom the rod is at rest) is greater than the length of the rod measured by B (for whom the rod moves at a speed v) by a factor $\gamma$. In brief, moving objects are measured to be shorter than stationary objects by a factor $1/\gamma = (1 - v^2/c^2)^{1/2}$.

As an example, consider an interstellar rocket that is measured by its crew to be 500 m = 0.5 km long. Suppose it is viewed from a planet which it passes at a speed $v = 0.9c$. Then $1/\gamma = \{1 - (0.9)^2\}^{1/2} \approx 0.44$. Hence the length measured from the ground is $L' = L/\gamma \approx (0.44)(0.5) = 0.22$ km. Now suppose $v = 0.99c$. Then $1/\gamma = \{1 - (0.99)^2\}^{1/2} = 0.14$ and the measured length is $L' = 0.07$ km $= 70$ m.

A graph of the length-contraction quantity $1/\gamma$ is given in Fig. 3.38. It has properties corresponding to those of $\gamma$ (cf. the previous section), i.e. (1) it is always less than or equal to 1; (2) it is only 1 if the speed of relative motion is zero; (3) it goes to 0 as $|v/c| \to 1$; and (4) it is the same for approach (v negative) and recession (v positive).

Fig. 3.38 A graph of the length-contraction factor $1/\gamma$ against $V = v/c$.

Thus, the effect is negligible at speeds of relative motion low compared to that of light, but at speeds close to the speed of light the length of a moving object as measured by radar goes to zero.
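The radar argument and the rocket example can be checked with a short calculation. The sketch below is ours, not the book's: the function names are hypothetical, and the Doppler factor is taken in the form $K^2 = (1+V)/(1-V)$ quoted from eqn (3.12) in the twin-paradox example, with $V < 0$ denoting approach.

```python
import math


def doppler_factor(v_over_c: float) -> float:
    """K-factor, assuming K^2 = (1 + V)/(1 - V); V < 0 means approach."""
    return math.sqrt((1.0 + v_over_c) / (1.0 - v_over_c))


def contraction_factor(v_over_c: float) -> float:
    """The length-contraction quantity 1/gamma = (1 - V^2)^(1/2)."""
    return math.sqrt(1.0 - v_over_c * v_over_c)


def radar_time_ratio(speed_over_c: float) -> float:
    """tau / tau' from the radar argument: (K1 + K2)/2, with K1 the factor
    for approach and K2 = 1/K1 the factor for recession at the same speed."""
    k1 = doppler_factor(-abs(speed_over_c))
    k2 = doppler_factor(abs(speed_over_c))
    return 0.5 * (k1 + k2)


def measured_length(rest_length: float, v_over_c: float) -> float:
    """Length measured by radar for a rod moving at v, given its length
    as measured in the frame where it is at rest."""
    return rest_length * contraction_factor(v_over_c)


if __name__ == "__main__":
    for v in (0.9, 0.99):
        # The radar ratio should equal gamma, i.e. the inverse of 1/gamma.
        print(v, radar_time_ratio(v), 1.0 / contraction_factor(v))
        print("500 m rocket measured as", round(measured_length(500.0, v), 1), "m")
```

Keeping `radar_time_ratio` separate from `contraction_factor` lets the two routes to $\gamma$ (via the K-factors and via the velocity) be compared directly.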
As indicated above, we may regard the basic cause of the length-contraction effect as being that B measures the length of the rod at the ‘instant’ represented by the surface of simultaneity OW (Fig. 3.37), whereas A measures its length at the ‘instant’ represented by the surface of simultaneity WU, where U is midway between Q and S on A’s world-line; this surface appears horizontal in A’s space-time diagram. Although OW is apparently a longer line than UW, it represents a shorter length measured by radar by B. In Section 4.2 we will derive a precise representation of how lengths measured along different surfaces of instantaneity relate to each other.

The symmetry of length contraction

Just as in the case of time dilation, the effect is completely reciprocal: each observer measures objects at rest in the other’s frame to be short by a factor $1/\gamma$. As in the previous case, one can see this by noting that there is no intrinsic difference between A and B, so in the analysis above we could equally well have changed the labels of A and B to determine the length contraction observed by B for objects moving with A. Thus, the reciprocity is a result of the relativity principle. However, this does not completely account for how the reciprocity can be consistently possible. To see this, we need to consider measurements made by both A and B of rigid rods moving with each of them.

Figure 3.39 is a space-time diagram drawn from A’s viewpoint, showing a rod $R_A$ with end-points $u$ and $w$ at rest in A’s frame and a rod $R_B$ with end-points $u'$ and $w'$ at rest in B’s frame. The rods are chosen to be such that the length measured by A for both of them is the same: the world-lines $u$ and $u'$ coincide at the event U and the world-lines $w$ and $w'$ coincide at the event W, where U and W are simultaneous for A, so A measures the same length $L$ (the radar distance measured by him between U and W) for both of them.

Fig. 3.39 Measurement of two rods: $R_A$, with end-points $u$ and $w$, and $R_B$, with end-points $u'$ and $w'$. Observer A measures both to have length $L$ (the distance between events U and W). Observer B measures them to have the lengths $L'$ (between U and N) and $L''$ (between U and M) in his surface of simultaneity UM.

Now, using radar, B measures lengths in his surface of instantaneity, which is indicated as the line UNM in the diagram, where N lies on the world-line of $w$ and M on the world-line of $w'$. He measures the length of $R_A$ as the radar distance $L'$ between U and N, and the length of $R_B$ as the radar distance $L''$ between U and M. By the results above, A measures the relatively moving rod $R_B$ short by a factor $\gamma$:

$$L = L''/\gamma, \qquad L < L''. \qquad (3.25a)$$

Also, B measures the relatively moving rod $R_A$ short by a factor $\gamma$:

$$L' = L/\gamma, \qquad L > L'. \qquad (3.25b)$$

These results are consistent with each other. Indeed, they show that

$$L' = L''/\gamma^2, \qquad (3.25c)$$

consistent with the feature that $L'' > L'$ (apparent because the segment UM is longer than UN).

In view of this reciprocity, it is apparent that given any rigid object, it will appear longest to an observer for whom it is at rest (i.e. who moves at the same speed as the object). We may use the name proper length to denote the length of the object as measured by such an observer. Then every observer moving relative to the rod will measure the length to be less than its proper length.

Transverse measurements

The length contraction effect is a longitudinal effect: that is, it is observed in the direction of relative motion of the object (in the above calculation, the relative motion was in the X-direction and the length contraction occurred in the length of the object measured in that direction). No change of size is measured in directions perpendicular to the motion, because there is no change of relative distances in those directions.
Thus, radar sets aligned along the Y and Z axes by A and B will give the same measurements of distances along these axes, and one will find the size of objects measured in the Y and Z directions unaffected by relative motion in the X-direction. A body moving past will therefore be measured to be distorted in shape, having the same Y and Z dimensions as when stationary but being contracted in the X-direction.

Photographic images

The length contraction effect refers to measurements made by radar. This does not mean that a photographic image will show the length contraction in an obvious way, because such an image does not represent the state of the object ‘at an instant’. To work out what the image will show, one must allow for the light travel time from different parts of the object to the camera, and this works in the opposite way to the length contraction. In general, the result is complex to work out, but a simple example will make the principle clear. This detailed study is peripheral to our main line of argument, and so may be omitted at a first reading.

Fig. 3.40 A photograph being taken by an observer A of a ruler with end-points $u'$ and $w'$ moving towards the camera. The events U at $u'$ and W at $w'$ are recorded by the camera at event R. By the time the light ray leaves W, the ruler is a distance $d$ nearer the camera, so its apparent length is $L_0 = L + d$.

Consider a rigid rod $R_B$ with edges $u'$ and $w'$ moving towards the observer A (Fig. 3.40). As in the previous example, denote the proper length of the rod by $L''$ and the length A measures for it by $L$; these quantities are then related by (3.25a). In the following, unless otherwise stated, all distances will be scaled according to A’s coordinate X, which is used to calibrate the x-axis in Fig. 3.40 and is normalized so that the speed of light is 1. At event R, observer A takes a photograph of $R_B$. The light arriving at the event R has travelled up its past light cone; we see that the light recorded from the far edge $u'$ set out (at event U) before the light recorded from the near edge $w'$ left the edge $w'$ (at event W).
Suppose $R_B$ moves a distance $d$ towards A while the light travels from U to W; since $R_B$ is moving at a speed $v/c$ towards A, we have $|d| = T|v/c|$, where $T$ is the time the light takes to travel from U to W. Remembering our sign convention for $v$, a speed of approach is represented by a negative value of $v$, so $d = -Tv/c$. When the light arrives at W it has travelled a distance $L + d$ towards A, so $T = L + d$; consequently $d = -(L + d)v/c$. Solving for $d$ shows that

$$d = \{-(v/c)/(1 + v/c)\}L. \tag{3.26a}$$

Now the effective length of $R_B$ in A’s photograph is $L_0 = L + d$, because this is the distance between the ends $u'$ and $w'$ apparent in the photograph (for example, if the rod slides over a scale with distances from A marked off, the event U where A’s photograph depicts the end $u'$ will be shown by this scale to be a distance $L + d$ from the event W where A’s photograph depicts the end $w'$). From (3.26a), then,

$$L_0 = L/(1 + v/c). \tag{3.26b}$$

Now, using (3.25a) and the expression (3.21) for $\gamma$ and simplifying, we can show that $L_0 = \{(1 - v/c)/(1 + v/c)\}^{1/2} L''$. But by (3.12) this is just

$$L_0 = (1/K)L''; \tag{3.27}$$

the effective length of $R_B$ observed by A in a photograph is related to its proper length not by the length contraction factor $\gamma$ but by the inverse of the Doppler shift factor $K$!

This is actually not surprising: the situation is analogous to the way the time-scale difference measured in Doppler shift observations is given by $K$ rather than $\gamma$. In both cases the occurrence of $K$ is essentially because light travels only one way in that observation, rather than both ways as when radar is used to determine lengths or simultaneity. Equivalently, $K$ occurs because the measurement relates magnitudes of quantities ‘then’ and ‘now’, rather than quantities measured ‘at an instant’. As in the case of Doppler shifts, the sign of the effect depends on whether the relative motion is one of approach or recession.
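The chain of steps from (3.25a) through (3.26a) and (3.26b) to (3.27) can be checked numerically. The following is a minimal sketch, not part of the original text; the function names and the sample speeds are illustrative assumptions. It uses the sign convention of the text (negative $v$ for approach) and units in which $c = 1$.

```python
import math

def doppler_k(beta):
    """Doppler factor K = sqrt((1 + beta) / (1 - beta)), with beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def photographed_length(proper_length, beta):
    """Effective length L0 in A's photograph, built step by step.

    beta = v/c is negative for approach and positive for recession.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    contracted = proper_length / gamma        # L, as in eqn (3.25a)
    d = -beta / (1.0 + beta) * contracted     # eqn (3.26a)
    return contracted + d                     # L0 = L + d, eqn (3.26b)

# Hypothetical sample values: a rod of unit proper length at |v| = 0.6c.
for beta in (-0.6, 0.6):
    l0 = photographed_length(1.0, beta)
    print(f"beta = {beta:+.1f}: L0 = {l0:.3f}, L''/K = {1.0 / doppler_k(beta):.3f}")
```

For these sample values the two columns agree: the approaching rod is recorded at about twice its proper length and the receding rod at about half, whereas the radar (length-contracted) value is 0.8 in both cases.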
If the object approaches, it will appear to be longer by a factor $1/K$ than its rest length (when $v < 0$ and $K < 1$); if it recedes, it will appear shorter by a factor $1/K$ (in this case, $v > 0$ and $K > 1$).

The above example calculates the effective length of $R_B$ seen in a photograph when it travels directly towards or away from A (admittedly it would be rather difficult actually to demonstrate this relation because of the motion of the object being directly toward the camera). More complex effects occur when the object moves transversely to the line of sight. It will in fact appear undistorted when viewed from a long way off; in this case the length-contraction and light travel effects just cancel (but a careful examination of the way light travels from the object to camera shows that it will then appear to be rotated!).

Conclusion

In conclusion, the ‘length contraction’ effect discussed in this section, with magnitude determined by the factor $1/\gamma$, represents the behaviour of distance measurements made at an instant, for example by radar. It will not directly represent measurements determined from photographs, where light travel times have to be taken into account and the image obtained does not represent the situation ‘at an instant’. Given this understanding, the basic feature is simple: an observer will measure the length of an object moving radially towards or away from him to be shorter than that of an identical object that is stationary, by precisely a factor $1/\gamma$.

Exercises

3.16 A section of the surface of a road has pressure studs laid into it, connected to a measurement centre by cables that are all exactly the same length. A series of lights at the centre indicate which studs are loaded by a vehicle in the road. An articulated lorry passes over them at high speed. When the lorry is at rest, its length is measured to be 30 metres. What length would this apparatus measure for the lorry, if its speed of travel were $v = 0.01c$?
3.17 To measure the length of a high-speed train, an observer measures the time $T$ it takes to pass a fixed point on the track, and then determines its length $L'$ from its speed of motion $v$ (which he also measures) by the relation $L' = vT$. Show that the length-contraction formula (3.24) relates $L'$ to the (proper) length $L$ measured for the train by an observer moving with it.

3.18 A science-fiction story features a moon-buggy which has continuous contact with the ground (via caterpillar tracks) and has its weight distributed uniformly along its 10-metre length. What is the upper limit to the speed at which it can travel directly across a 4-metre-wide chasm without falling into it? For speeds greater than this critical speed, explain how it is possible, from the point of view of someone travelling in the buggy, for it to fall into the chasm.

Computer Exercise 9

Write a program that will accept as input a relative velocity $V$ ($= v/c$) and a proper length $L$, and prints out the measured length $L'$ given by eqn (3.24). Also print out the approximate value $L1' = L(1 - \frac{1}{2}V^2)$, and find for what range of $V$ the estimate $L1'$ is a good approximation to $L'$. Apply your program (a) to a Concorde airliner at maximum speed, (b) to a space shuttle.

3.6 The whole package of kinematic effects

We have now considered the basic principles of special relativity (the equivalence of all inertial observers, and the invariance and limiting nature of the speed of light) and four major phenomena resulting from these principles: time dilation, length contraction, the relativity of simultaneity, and the relativistic velocity addition law. It is important to realize that these phenomena are intimately related to each other. Any one of them only makes sense if the others also operate; only the whole package is consistent. We shall illustrate this through two illuminating examples, and will then consider how one can express the essential features either through a single unified relation (the Lorentz transformation) or through the concept of an invariant (the space-time interval), both of which will be discussed in detail in Chapter 4.

Example (a): muon decay

Cosmic rays are particles that arrive at the Earth from space at extremely high relative speeds $v$ (often $|v/c| \simeq 0.99$). Their origin, and where they get such great energy from, is still something of a mystery. At a height of about 20 km above sea level they collide with atoms in the Earth’s atmosphere, and among the particles resulting from these collisions are muons (Fig. 3.41a). These move very rapidly towards the ground (their mean speed being nearly the same as that of the incoming cosmic rays), but they are unstable, decaying rapidly to less massive particles (electrons and neutrinos). One can measure this decay rate in the laboratory; the mean lifetime of a muon at rest is $t_1$, where

$$t_1 \simeq 2.2 \times 10^{-6}\ \text{sec}. \tag{3.28a}$$

Fig. 3.41 (a) Cosmic rays colliding with particles in the Earth’s atmosphere to produce muons which then decay into other particles. The muons travel at a speed of about $0.99c$ relative to the Earth. (b) The same situation from the viewpoint of the muons, with the Earth approaching at high speed.

Their mean flight time through the Earth’s atmosphere, from where they are created, to sea level is $t_2$, given by

$$t_2 \simeq 20\ \text{km}/(0.99 \times 3 \times 10^5\ \text{km/sec}) \simeq 6.7 \times 10^{-5}\ \text{sec}. \tag{3.28b}$$

Defining $f$ by

$$f = (\text{mean time of flight})/(\text{mean lifetime}), \tag{3.28c}$$

we find that $f = t_2/t_1 \simeq (6.7 \times 10^{-5})/(2.2 \times 10^{-6}) \simeq 30$. Now a statistical analysis shows that during one mean lifetime $t_1$, the proportion of muons surviving will be about $1/e$, where $e$ is the transcendental number occurring in natural logarithms ($e = 2.71828\ldots$); and during the time $t_2$, the fraction surviving should be about $e^{-f} \simeq e^{-30} \simeq 10^{-13}$.
However, when measurements are made of the number of muons created high in the atmosphere and those arriving at sea level it turns out that a much higher fraction arrive at sea level: about 1% $= 10^{-2}$ of the total number created. Thus, the prediction is entirely wrong: enormously more particles survive than expected on the basis of this simple calculation. What has gone wrong?

The essential point is that we have failed to take time dilation into account. In considering any physical situation, one should make a definite decision as to which frame will be used for the analysis, and then stick to this decision; mixing results of measurements by two different observers will usually lead to incorrect results. We first choose to look at the situation from the viewpoint of an observer on the ground. Then eqn (3.28a) is an incorrect estimate of the measured muon lifetime, because it is the lifetime measured by an observer moving with the muon. The lifetime $t_1'$ measured by an observer stationary on the ground will differ by a factor $\gamma$, where

$$\gamma = (1 - v^2/c^2)^{-1/2} = \{1 - (0.99)^2\}^{-1/2} \simeq 7.1, \tag{3.29a}$$

so

$$t_1' = \gamma t_1 \simeq 1.5 \times 10^{-5}\ \text{sec}. \tag{3.29b}$$

Equation (3.28b) is a correct estimate of the time of flight measured by an observer stationary on the ground. In evaluating (3.28c), we must use values measured by the same observer (in this case, an observer stationary on the ground) for the numerator and denominator. Thus we find

$$f' = t_2/t_1' \simeq 4.2 \tag{3.29c}$$

(a factor $1/\gamma$ times our previous estimate). Hence $e^{-f'} \simeq e^{-4.2} \simeq 0.015$, an estimate of the fraction surviving which is in good agreement with the experiment. The time dilation effect therefore reconciles the theoretical and experimental results in the Earth’s frame; the observations in fact provide an experimental verification of the time dilation effect.

However, a problem is apparent if we consider the situation from the viewpoint of an observer travelling with the muon.
This is because in that frame there is no time dilation effect for the decay: the muon is stationary in the observer’s reference frame (Fig. 3.41b), and has the lifetime (3.28a). The previous analysis, which gave an incorrect answer, appears to apply.

The resolution in this case is provided by remembering that we must apply all the special relativity results in analyzing our observations. Seen from the muon’s reference frame (Fig. 3.41b), the Earth is approaching at the same speed $v$ ($|v/c| = 0.99$) as the observer on the Earth measured for the muon, because both observers agree about the relative rate of approach (see eqn (3.17b)). However, from this viewpoint the atmosphere is also moving by at high speed, so the path through the atmosphere is measured to be much shorter because of the length-contraction effect. In fact, the moving observer would measure the path through the atmosphere, from creation of the muon until it is hit by the surface of the Earth, to have a length of $20/\gamma$ km $\simeq 20 \times 0.141 \simeq 2.8$ km, instead of the 20 km measured by an observer on the Earth (at rest relative to the atmosphere). Thus, for the moving observer the muon traverses this path in a time $t_2'$ given by

$$t_2' = 2.8\ \text{km}/(0.99 \times 3 \times 10^5\ \text{km/sec}) \simeq 9.4 \times 10^{-6}\ \text{sec} = t_2/\gamma. \tag{3.30a}$$

Hence, evaluating both terms in (3.28c) in the muon’s reference frame,

$$f' = t_2'/t_1 = (t_2/\gamma)/t_1 = t_2/t_1', \tag{3.30b}$$

and we obtain exactly the same result (3.29c) as before. In the muon’s reference frame, we reconcile the theoretical and experimental results by use of the length-contraction effect, and the experiment serves as a verification of this effect.

This analysis shows very clearly why one must consider length contraction and time dilation together: they are the same phenomenon seen from different points of view.
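The muon argument can be repeated numerically in both frames. The sketch below is illustrative only and is not part of the original text; the variable names are assumptions, and the input values are the rounded figures quoted above (20 km, $0.99c$, $2.2 \times 10^{-6}$ sec, $c \simeq 3 \times 10^5$ km/sec).

```python
import math

C_KM_PER_S = 3.0e5    # speed of light, rounded as in the text
BETA = 0.99           # muon speed v/c
HEIGHT_KM = 20.0      # creation height, measured in the Earth frame
T_REST_S = 2.2e-6     # mean lifetime of a muon at rest, eqn (3.28a)

gamma = 1.0 / math.sqrt(1.0 - BETA ** 2)

def surviving_fraction(flight_time, lifetime):
    """Fraction e^(-f) surviving, with f = flight time / mean lifetime (3.28c)."""
    return math.exp(-flight_time / lifetime)

# Earth frame: 20 km path; the mixed-frame estimate wrongly uses the rest lifetime.
t_flight_earth = HEIGHT_KM / (BETA * C_KM_PER_S)
naive = surviving_fraction(t_flight_earth, T_REST_S)

# Earth frame, done consistently: the lifetime is dilated by gamma.
earth = surviving_fraction(t_flight_earth, gamma * T_REST_S)

# Muon frame: rest lifetime, but the path is contracted by 1/gamma.
t_flight_muon = (HEIGHT_KM / gamma) / (BETA * C_KM_PER_S)
muon = surviving_fraction(t_flight_muon, T_REST_S)

print(f"gamma = {gamma:.2f}")
print(f"mixed-frame estimate: {naive:.1e}")
print(f"Earth frame: {earth:.4f}   muon frame: {muon:.4f}")
```

The two consistent calculations give the same surviving fraction, as (3.30b) requires. Carried through without intermediate rounding, the fraction comes out near 0.013 rather than the 0.015 quoted in the text; the difference arises only from rounding $\gamma$ and $t_1'$ along the way.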
From the stationary frame, theory and experiment agree because of time dilation; from the moving frame, because of length contraction; the analysis would be inconsistent if only one of the effects occurred. The experimental data for muon decay serves to verify that both effects occur in the real physical world. The interested reader will find more details about how the experiment is performed in the book Special Relativity by A. P. French (published by the MIT Press).

Example (b): tied rockets

Imagine an observer B watching two identical stationary rockets C and A a distance $d = 400$ m apart, joined by an inextensible rope of length $d$ (so the rope is stretched tautly between them). As measured by B, at an instant $t_0$ they ignite their engines simultaneously and start moving parallel to the rope with their engines at full thrust, with A leading C (Fig. 3.42a). Since they are identical, their speeds relative to the observer B are identical, and he will therefore measure them to remain precisely a distance $d$ apart. At the time $t_1$, as measured by B, they both turn off their engines and continue moving inertially at a speed $v = \frac{3}{5}c$ relative to him. His measurement of the distance between them is still 400 m. To simplify the problem, we will assume the rocket engines are very powerful and fast burning, giving very brief but strong impulses to the rockets that accelerate them up to their final speed. Thus, we will assume that $t_1$ is a negligibly short time after $t_0$. A space-time diagram of this situation (drawn from B’s viewpoint) is given in Fig. 3.42b.

Now, an observer moving with the rocket A will measure rocket C to be stationary relative to himself both at early and at late times. What will he measure the final distance between A and C to be? Denote this distance by $D$.
According to the length-contraction formula, $D/d = \gamma$ (see eqn (3.24)); $D$ corresponds to $L$, since this is the distance measured in A’s rest frame, while $d$ corresponds to $L'$ measured by the observer B moving relative to that frame, who will measure the distance to be shorter. Therefore

$$D = \gamma d = \{1 - (\tfrac{3}{5})^2\}^{-1/2} d = \tfrac{5}{4} d = 500\ \text{m}.$$

If the rope still joins the two rockets, it is at rest relative to them; this must then be its length (measured in its own rest frame). But it is inextensible; it cannot stretch to this length. It will therefore have broken.

Fig. 3.42 (a) Observer B sees two rockets, A and C, accelerate simultaneously in the same direction. The distance $d$ between them stays constant because they accelerate identically. (b) An idealized space-time diagram of the situation as seen by observer B. Very powerful engines are switched on just before events $s$ and $s'$ and switched off just after these events. (c) Initially the surfaces of simultaneity of A and C coincide with those of B, but after they have finished accelerating they are tilted relative to those of B. Thus, just after he has completed accelerating (at event $s$), A determines the event $f'$ in C’s history (before C started accelerating) to be simultaneous with $s$. Thus his measurements show that, at that instant, C is still to start accelerating.

The problem arises when one considers how this can have happened. As established above, B observed both rockets to accelerate in precisely the same way. This seems to imply that the distance between them would not change, and therefore that the rope did not break. They accelerated identically; how can the distance between them have changed from 400 m (as measured by A initially) to 500 m (as measured by A finally)? Does the rope actually break or not?
As before, the problem is that we have not taken all the relativity effects into account. The apparent paradox is resolved by considering the relativity of simultaneity. Specifically, instantaneous surfaces in space-time for A and C when they are moving at their final speed are tilted relative to their initial surfaces of instantaneity, which coincide with those of B (see Fig. 3.42c). Thus, consider events as determined by A. Just before he starts to fire his rocket engine (at the event $s$ in his history), C is also just about to start his (at the event $s'$). At this stage, A and C both measure their distance apart to be 400 m, and they agree about simultaneity. But, when A has finished firing his engine (just after the event $s$), C has not yet started firing his (since A measures $s$ to be simultaneous with the event $f'$ in C’s history, which precedes $s'$). At this stage, A determines that he is moving away from C, because he has finished accelerating but C has not yet begun to accelerate. The distance between them increases and the rope snaps. C then begins accelerating (just before the event $s'$ in his history, measured by A to be simultaneous with the event $f$ in A’s history). Finally, C ceases accelerating just after the event $s'$. Both A and C now measure their distance apart to be 500 m, and they agree about simultaneity. This explains why their final distance apart is greater than their initial distance apart, which of course means that the rope must break.

As before, we see that consistency of the special relativity effects depends on taking all of them into account; the puzzling ‘paradoxes’ of relativity usually result from ignoring one or other of them. The most difficult to appreciate initially is the relativity of simultaneity; indeed, a rough rule of thumb is that when a problem appears particularly paradoxical, it is usually because this effect has been forgotten.
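The numbers in the tied-rockets example can be checked with a short calculation. The sketch below is not part of the original text. It assumes the standard Lorentz transformation (which the book only develops in Chapter 4), the idealization of instantaneous acceleration used above, and units in which $c = 1$ so that times are expressed in metres of light travel time; all names are illustrative.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def to_moving_frame(t, x, beta):
    """Standard Lorentz transformation with c = 1 (assumed here, see Chapter 4)."""
    g = gamma(beta)
    return g * (t - beta * x), g * (x - beta * t)

BETA = 0.6   # final speed of both rockets relative to B
D_B = 400.0  # separation measured by B, in metres

# In B's frame both engines fire at t = 0: rocket C at x = 0 and rocket A at x = d.
t_c, x_c = to_moving_frame(0.0, 0.0, BETA)   # event s' on C's world-line
t_a, x_a = to_moving_frame(0.0, D_B, BETA)   # event s on A's world-line

final_separation = x_a - x_c   # distance D in the rockets' final rest frame
head_start = t_c - t_a         # how much earlier A fires, in the final rest frame

print(f"D = {final_separation:.1f} m")
print(f"A fires {head_start:.1f} m of light travel time before C")
```

With these values the final separation is 500 m, and in the final rest frame A’s engine fires about 300 m of light travel time (roughly one microsecond) before C’s, which is the tilt of the surfaces of simultaneity that makes the rope snap.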
Exercises

3.19 Particles called pions decay into other particles at a rate such that (when measured in their rest frame) on average half the number of pions present decay in $18 \times 10^{-9}$ sec. Suppose now that in a high-energy collision experiment, pions are produced with a speed $0.99c$. How long will it take on average, as measured by a stationary observer, for half their number to decay? How far will they have travelled in this time? [Compare with the distance travelled by the muons described in the text.]

3.20 A car 5 metres long drives into a garage 4 metres long at a speed $v = \frac{3}{5}c$. According to a stationary observer, the length of the car will appear to be reduced by a factor of $1/\gamma$ to 4 metres and so it will fit into the garage exactly. On the other hand the driver of the car will perceive the length of the garage to be reduced by $1/\gamma$ to 3.2 metres so the car will not fit in. How would you resolve this apparent paradox? What wording in its statement is not sufficiently precise?

3.21 Construct a space-time diagram to illustrate the possibility of causal paradox if tachyons (particles travelling faster than light) were to exist. Observer A is at rest while observer B moves past at relative speed [?]c. Draw in the surfaces showing which events are simultaneous, according to B, with the events on A’s world-line at t = 0, 1, 2, 3, 4, 5. Suppose that at t = 1, A were to send a signal toward B with speed [?]c. Show that B will determine that A sent the signal at a time when B had already received it. Hence according to his (radar) measure of instantaneity, B could transmit an answer to the signal before it had been sent! [Moral: consistency of relativity theory forbids sending signals faster than light.]

3.22 Our analysis above of Example (b) (tied rockets) referred to instantaneous distances as deduced by A, B, and C from their surfaces of simultaneity. In practice, they would measure their separation by radar signals which are not instantaneous measurements; e.g.
the surface of simultaneity ($sf'$) would be determined by A from signals sent out before event $s$ and received after $s$. Work out in detail the separation A would measure for C by radar measurements of distance (cf. Example 3.12; note that this is a lengthy but interesting exercise).

The whole package: unifying viewpoints

We have seen now, through these examples, that the whole set of relativity kinematic effects must be taken into account if one is to get a consistent description of what is happening. Their inextricable intertwining is made clear by the fact that what appears to be a length contraction in one frame of reference may turn out to be time dilation in another. Thus, we are naturally led to consider if there is some way of writing the theory so as to bring out this unity, and present a unified view of space-time measurements and geometry. This can be done: indeed, there are two separate ways of going about the problem.

The first approach relies on working out in detail how all space and time measurements alter when one changes from one reference frame to another. Thus we are led to the idea of a Lorentz transformation.

The second approach takes what is in effect an opposite viewpoint. We have established that various features (length, time differences, simultaneity) that we previously believed were invariant when one changes reference frame, are not immutable after all. We can now ask: given our new insights, is there any feature of the space-time that is unchanged by an arbitrary change of reference frame? That is, are there any significant invariant features of space-time? We shall find that there are various such quantities. One in particular (the metric form) summarizes in a compact way the results of measurements of spatial distances and time differences.

The next chapter will look at each of these approaches in turn.
Before we turn to this, however, we consider briefly the nature of relativistic dynamics, and the relation of the relativity principle to the rest of physics: the ‘whole package’ that must be consistent includes all physical laws, and so in particular the laws of dynamics.

3.7 Relativistic dynamics

If the Newtonian laws of particle motion were correct, one could accelerate a particle to move faster than the speed of light, thus violating one of the basic assumptions of relativity theory, and contradicting the experimental evidence (see Section 1.2). Thus, the laws of particle motion in relativity theory must be different from those of Newtonian theory. Similarly the laws of energy and momentum conservation must also be different.

When one takes into account the four-dimensional nature of space-time, the real nature of the concepts ‘mass’, ‘momentum’, ‘energy’, and ‘force’ turns out to be somewhat different than in Newtonian theory. A four-dimensional formulation of these topics is presented in Appendix B. Here we will simply summarize in a three-dimensional form the revised laws of dynamics that result. These form the basis of a dynamical theory that is consistent with the relativistic kinematical results we have established, and so establish a consistent relativity theory of motion of particles and massive bodies which has many important practical consequences (such as providing the theoretical basis for the extraction of nuclear energy, and for understanding the processes taking place in the Sun).

The topics dealt with in this section are an important part of special relativity theory, but are not essential for understanding the nature of space-time geometry or measurements. Thus, the reader who wishes to concentrate on the geometry of space-times can omit this section.
A: Mass

Just as we had to be prepared to question all our preconceived ideas about space-time measurements, so we must also be prepared to revise our ideas about the basic quantities involved in dynamics. In Newtonian theory, the mass of an object is a quantity of considerable importance, since the energy and momentum of any body are proportional to its mass. Thus, the mass of a rocket determines the amount of energy needed to place it in an orbit around the Earth at a particular distance; the mass of a meteorite determines the amount of kinetic energy it dissipates when it crashes into the Moon at a particular speed and forms a new crater; the masses of elementary particles determine the final speed each attains after a collision; the mass of a car of given power determines the time it takes to accelerate from rest to a speed of 100 km/hr.

In Newtonian theory, the mass $m$ of an object is independent of the motion of the observer who measures it. In relativity theory we must be prepared to question whether this is still true or not. Accordingly, we will denote by $m_0$ the mass measured for an object by an observer when it is at rest relative to him. It will then be an experimental question whether or not he should still regard its mass as $m_0$ when the body is in relative motion. It will turn out that the effective relativistic mass $m$ does indeed depend on relative motion (eqn (3.34) below).

The second important feature is that in Newtonian theory, total mass is conserved in interactions; for example, if substances of total mass 90 kg react chemically to form water, it is predicted that the mass of water produced will be 90 kg. We shall see below that mass conservation remains true in relativity theory, but in an extended sense: mass can be converted to energy and energy to mass; it is the total of mass and energy that is conserved.

B: Momentum

In Newtonian theory, the momentum of an object is its mass multiplied by its velocity.
The importance of momentum is that it underlies the basic conservation laws of dynamical motion:

(M1) when no forces act on a body, its momentum is conserved;

(M2) when a collision takes place between particles or massive bodies, the total momentum of all the objects involved in the collision is conserved.

Consider, for example, a space station of mass 100 tons and a meteorite of mass 50 tons approaching each other. In the reference frame of an inertial observer B the space station is initially moving in the +X-direction at a speed $\frac{1}{10}c$ and the meteorite in the −X-direction at a speed $\frac{1}{2}c$ (Fig. 3.43a). The initial momentum of the space station is $100 \times \frac{1}{10}c = 10c$ (to the right, as implied by the positive sign), and the initial momentum of the meteorite is $50 \times (-\frac{1}{2})c = -25c$ (to the left, as implied by the negative sign). As no forces are acting on them, by (M1) these momenta stay constant; they therefore continue approaching each other at constant speeds. They then collide, generate considerable heat, and fuse together. Let the wreckage have mass $M$ and speed $v'$ in the +X-direction (Fig. 3.43b). The total final momentum of the material involved is $Mv'$. By (M2), this is equal to the total initial momentum of the space station plus the meteorite, which is $10c + (-25c) = -15c$. Thus conservation of momentum tells us $Mv' = -15c$, so the final speed is $v' = -15c/M$. Now in Newtonian theory, total mass is conserved so the final mass of the wreckage is equal to the mass of the space station plus the meteorite, i.e. $M = 100 + 50 = 150$. Thus $v' = -15c/150 = -\frac{1}{10}c$; that is, the wreckage moves to the left at $\frac{1}{10}$ the speed of light ($v'/c = -0.1$).

Fig. 3.43 (a) A space station moves right at $v = \frac{1}{10}c$ while a meteorite moves left at $v = \frac{1}{2}c$. (b) After they collide and fuse together, the wreckage moves at speed $v'$ in the +X-direction.
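The Newtonian bookkeeping in this example is simple enough to write out as a short calculation. The following is a minimal sketch, not part of the original text; the function name is an illustrative assumption, masses are in tons, and velocities are given as fractions of $c$ (positive to the right).

```python
def newtonian_fused_velocity(bodies):
    """Newtonian collision in which all bodies fuse together.

    bodies: list of (mass, velocity) pairs.
    Returns (total mass, final velocity), using conservation of total mass
    and conservation of total momentum (M2).
    """
    total_mass = sum(mass for mass, _ in bodies)
    total_momentum = sum(mass * velocity for mass, velocity in bodies)
    return total_mass, total_momentum / total_mass

# Space station: 100 tons at +c/10.  Meteorite: 50 tons at -c/2.
wreck_mass, wreck_velocity = newtonian_fused_velocity([(100.0, 0.1), (50.0, -0.5)])
print(f"M = {wreck_mass:.0f} tons, v'/c = {wreck_velocity:.3f}")
```

Changing the meteorite mass to 20 tons in the same call gives zero total momentum, so in Newtonian theory the wreckage would then stay at rest in B’s frame.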
In this example, the situation was particularly simple because all motion took place parallel to the X axis. If the motion is in a general direction, we can write the velocity vector $\mathbf{v}$ in terms of its components $(v_x, v_y, v_z)$ parallel to the X, Y, and Z axes respectively; then the components $(p_x, p_y, p_z)$ of the momentum vector $\mathbf{p}$ parallel to these axes are given by

$$p_x = mv_x, \quad p_y = mv_y, \quad p_z = mv_z. \tag{3.31a}$$

We can conveniently combine these three relations in the single vector equation

$$\mathbf{p} = m\mathbf{v} \tag{3.31b}$$

giving the momentum $\mathbf{p}$ measured by an observer B for a particle of mass $m$ moving with velocity $\mathbf{v}$. According to Newtonian theory, B will measure each component (3.31a) of total momentum to be conserved when collisions take place.

In relativity theory, on examining momentum conservation from a space-time viewpoint (see Appendix B), it turns out that the quantity conserved is not $\mathbf{p}$ but rather a vector $\boldsymbol{\pi}$, the relativistic three-momentum, defined by

$$\boldsymbol{\pi} = m_0\gamma(v)\mathbf{v} \tag{3.32a}$$

with components

$$\pi_x = m_0\gamma(v)v_x, \quad \pi_y = m_0\gamma(v)v_y, \quad \pi_z = m_0\gamma(v)v_z, \tag{3.32b}$$

where $m_0$ is a mass associated with the particle (which we later identify as its ‘rest mass’) and $\gamma(v) = \{1 - (v/c)^2\}^{-1/2}$ (see eqn (3.21)). Given this definition, the relativity-theory prediction is that momentum $\boldsymbol{\pi}$ is conserved in collisions:

$$(\text{total initial momentum } \boldsymbol{\pi}) = (\text{total final momentum } \boldsymbol{\pi}), \tag{3.33}$$

and from this one can work out the effects of collisions in relativity theory almost identically to the way one does in Newtonian theory.

To see this, consider again the space station and meteorite in the example above. We naturally assume that the masses stated previously are rest masses. Relative to the observer B, the $\gamma$-factor for the space station is $\gamma(\frac{1}{10}c) = \{1 - (\frac{1}{10})^2\}^{-1/2} = (\frac{99}{100})^{-1/2} = 1.005$, so the x component of its initial momentum is $\pi_x = m_0\gamma(v)v_x = 100 \times 1.005 \times \frac{1}{10}c = 10.05c$. The $\gamma$-factor for the meteorite is $\gamma(\frac{1}{2}c) = \{1 - (\frac{1}{2})^2\}^{-1/2} = (\frac{3}{4})^{-1/2} = 1.155$, so the x component of its initial momentum is $50 \times 1.155 \times (-\frac{1}{2})c \simeq -28.868c$ (using the unrounded value $\gamma = 1.1547$).
The total initial momentum is therefore $10.05c - 28.868c = -18.818c$, which will be equal to the total final momentum, so

$$M_0\gamma(v')v' = -18.818c \tag{$*$}$$

where $M_0$ is the rest mass of the wreckage.

Completion of the calculation to find $v'$ demands that we work out the final total mass $M_0$. According to Newtonian theory, total mass is conserved. Can we generalize this result in a simple way? This depends on identifying a conserved quantity that we should call ‘mass’ in relativity theory. Now, on comparing (3.31) and (3.32) it becomes clear that if we define the mass $m$ of a moving particle by

$$m = \gamma(v)m_0 = m_0\{1 - (v/c)^2\}^{-1/2} \tag{3.34}$$

so that (3.32) can be rewritten in the form

$$\{\boldsymbol{\pi} = m\mathbf{v}\} \Leftrightarrow \{\pi_x = mv_x,\ \pi_y = mv_y,\ \pi_z = mv_z\}, \tag{3.35}$$

then the Newtonian and relativistic equations both take the same form: the conserved momentum is given by ‘momentum = mass × velocity’. Further, given this definition of a mass $m$ that depends on the velocity relative to the observer ($m_0$ being independent of this velocity), the four-dimensional momentum conservation equation shows that $m$ is conserved in collisions (Appendix B). From now on, we refer to $m$ (determined from the rest mass and relative speed by equation (3.34)) as the ‘mass’ of an object, both because the momentum equations then preserve their form (cf. (3.31), (3.35)) and because this quantity is conserved in collisions:

$$(\text{total initial mass } m) = (\text{total final mass } m). \tag{3.36}$$

When the body is at rest relative to the observer, (3.34) shows $m = m_0$, hence the name ‘rest mass’ for $m_0$. Clearly $m \geq m_0$, with $m = m_0$ if and only if the body is at rest relative to the observer.

Returning to our example, the initial mass of the space station relative to the observer was $m = 100 \times 1.005 = 100.5$ tons, and the initial mass of the meteorite was $50 \times 1.155 \simeq 57.75$ tons.
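The steps of this example (total relativistic momentum from (3.33), total relativistic mass from (3.36), then the final speed and rest mass) can be sketched as a short calculation. This is an illustration rather than part of the original text: the function names are assumptions, velocities are fractions of $c$, and, as in the simplified treatment here, no radiation is assumed to carry off mass or momentum.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta = v/c, as in eqn (3.21)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def relativistic_fusion(bodies):
    """Collision in which all bodies fuse together, with nothing radiated away.

    bodies: list of (rest mass, beta) pairs, beta positive to the right.
    Returns (final beta, final rest mass).
    """
    # Conserved relativistic mass m = gamma * m0, eqns (3.34) and (3.36).
    total_mass = sum(m0 * gamma(beta) for m0, beta in bodies)
    # Conserved relativistic momentum pi = m * v, eqns (3.33) and (3.35), in units of c.
    total_momentum = sum(m0 * gamma(beta) * beta for m0, beta in bodies)
    beta_final = total_momentum / total_mass
    rest_mass_final = total_mass / gamma(beta_final)
    return beta_final, rest_mass_final

# Space station: rest mass 100 tons at +c/10.  Meteorite: rest mass 50 tons at -c/2.
beta_f, m0_f = relativistic_fusion([(100.0, 0.1), (50.0, -0.5)])
print(f"v'/c = {beta_f:.3f}, M0 = {m0_f:.1f} tons")
```

Without intermediate rounding the final speed is about $-0.119c$ and the final rest mass about 157.1 tons, roughly 7 tons more than the 150 tons of rest mass that went into the collision.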
Thus, the total initial mass was $100.5 + 57.75 = 158.25$ tons. Provided that no mass has been lost any other way, it follows that, by conservation of relativistic mass, this will be the final mass $M$ also; so

$$M = M_0\gamma(v') = 158.25. \tag{**}$$

Dividing this into the relation (*) above, $v'/c = -18.818/158.25 = -0.119$; then substituting this value back into (**) shows $M_0 = 158.25/\gamma(0.119c) = 158.25/1.0071 = 157.13$ tons, 7 tons more than the total rest mass of the bodies that collided! The source of the extra rest mass would be conversion of some of the kinetic energy of the two bodies into mass, as we shall discuss later in this section. We shall then also see that the collision as discussed so far is oversimplified; in practice radiation would be given off which we need to take into account to get the full picture.

Exercise 3.23 Consider the example above when the mass of the meteorite is taken to be 20 tons, all other conditions remaining unchanged. Show that then, according to Newtonian theory, after the collision the wreckage remains at rest in the rest frame of observer B, but according to relativity theory this is not so. What is the final total rest mass in this case?

As a second example, suppose an observer sees a particle of rest mass $m_0$ approach from the left at a speed $v = \tfrac{1}{2}c$ and collide with a particle which approaches from the right at speed $v = \tfrac{1}{4}c$; after the collision they both remain stationary relative to the observer. What was the rest mass of the second particle? Suppose this mass is $M_0$. The total final momentum is zero, so the total initial momentum is zero. Hence, the initial momentum to the right of the one body is equal to that to the left of the other:

$$m_0\gamma(\tfrac{1}{2}c) \times \tfrac{1}{2}c = M_0\gamma(\tfrac{1}{4}c) \times \tfrac{1}{4}c \;\Leftrightarrow\; \frac{m_0}{2\{1 - (\tfrac{1}{2})^2\}^{1/2}} = \frac{M_0}{4\{1 - (\tfrac{1}{4})^2\}^{1/2}}.$$

Thus $m_0/(\tfrac{3}{4})^{1/2} = M_0/2(\tfrac{15}{16})^{1/2}$, i.e. $M_0 = 2m_0(\tfrac{4}{3} \times \tfrac{15}{16})^{1/2} = 2m_0(\tfrac{5}{4})^{1/2} = 2.236m_0$. By contrast, conservation of momentum according to Newtonian theory would give $m_0 \times \tfrac{1}{2}c = M_0 \times \tfrac{1}{4}c$, i.e. $M_0 = 2m_0$, which gives an error of about 11 per cent compared with the relativistic result.

We have now seen how to calculate the consequences of relativistic mass and momentum conservation. Are these laws actually correct, i.e. do they describe the real world? This has to be determined by experiment. Major particle accelerators are used daily to produce particle collisions at very high energies, and many thousands of such collisions have been analysed on the basis of conservation of relativistic momentum (eqn (3.33)) and mass (eqn (3.36)). The theory enables us to understand the collisions in each case, so these are among the best-tested laws in physics.

C: Force

In Newtonian theory, if a force $\mathbf{F}$ acts on a body with momentum $\mathbf{p}$ the rate of change of momentum is equal to the force acting: that is,*

$$\{\mathbf{F} = d\mathbf{p}/dt\} \Leftrightarrow \{F_x = dp_x/dt,\ F_y = dp_y/dt,\ F_z = dp_z/dt\}. \tag{3.37a}$$

This determines the motion of the body when acted on by any force. For example, the electrons which generate the display in a television set are initially accelerated from rest by an electric field. To analyse this, note that if a particle with electric charge $e$ moves non-relativistically in a uniform electric field $E$, parallel to the field, the force exerted by the field on the particle will be $F = eE$ (Fig. 3.44). If the $x$ axis is chosen parallel to the field, then since $p = mv$ the motion of the particle is determined by the equation $eE = m\,dv/dt$ for the velocity component $v$ in the $x$ direction, with solution $v = (eE/m)t$ if it starts from rest. In principle, the particle can eventually reach arbitrarily high speeds if it moves in a uniform electric field long enough.

Fig. 3.44 Acceleration of a charged particle by an electric field (e.g. in a television display tube).
*Here $d\mathbf{p}/dt$ (the 'derivative of $\mathbf{p}$ with respect to $t$') means the rate of change of $\mathbf{p}$ as time $t$ evolves; for example the velocity $v$ is the rate of change of position, $v = dx/dt$, and acceleration $a$ is rate of change of velocity, $a = dv/dt$. If you have not learnt about derivatives in calculus courses, you will simply have to accept as correct some of the results we quote below.

In relativity theory, the same equation of motion is valid; again force = rate of change of momentum; however, here 'momentum' is now the relativistic momentum (3.32). Thus the equations of motion are

$$\{\mathbf{F} = d\boldsymbol{\pi}/dt\} \Leftrightarrow \{F_x = d\pi_x/dt,\ F_y = d\pi_y/dt,\ F_z = d\pi_z/dt\}. \tag{3.37b}$$

This again determines the motion of a body acted on by any force, but now correctly takes relativistic effects into account. Reconsidering the example above, the equation for $v$ now becomes $eE = m_0\,d\{v/(1 - v^2/c^2)^{1/2}\}/dt$. This leads to the relation $v/\{1 - (v/c)^2\}^{1/2} = eEt/m_0$, which can be solved for $v/c$, giving the result $v/c = (eEt/m_0c)/\{1 + (eEt/m_0c)^2\}^{1/2}$. In this case, even an arbitrarily long acceleration period will not enable the particle to exceed the speed of light (Fig. 3.45).

Again the question is: does the relativity force-law (3.37b) describe accurately the effects of a force acting on a particle? The answer is similar to the previous one: this force law has been tested many thousands of times up to very high energies in many particle accelerators, and is a very well-established law of motion in accord with all the experimental data.

The essential difference

An observer B will measure the inertial mass of a body by experiments based on momentum conservation or on the force law (e.g. he can find the mass of an elementary particle by measuring the change in its speed when momentum is given to it in a collision).
Thus, since the mass $m$ defined by (3.34) is the quantity that directly enters the momentum definition (3.35) and so the momentum conservation equation (3.33) and the force-law equation (3.37b), it is indeed the quantity that he will measure as its effective inertial mass. For example, the response of an electron in a particle accelerator to the force acting will be that expected of a particle of mass $m$ (rather than $m_0$). Hence the effective mass of a particle moving relative to B will be measured by him to vary with its relative speed of motion.

Fig. 3.45 The speed of motion of the charged particle as a function of time. No matter how long the particle accelerates, it does not exceed the speed of light.

Clearly, the form of the momentum and force equations is very similar in Newtonian theory and relativity theory; indeed, we can regard the only difference as being that the effective mass in the relativistic theory depends on the speed of motion of the body relative to the observer according to formula (3.34), while in Newtonian theory it is independent of this motion.

Despite this close similarity in form, variation of $m$ with speed $v$ results in a fundamental difference between the Newtonian and relativistic cases. In Newtonian theory, $m$ is a constant and there is nothing special about the speed of light. In relativity theory, $m$ is related to $v$ by (3.34); the relation is shown in Fig. 3.46. The crucial feature is that the effective mass $m$ diverges as $V \to 1$ (i.e. as $v \to c$) and so the momentum $\boldsymbol{\pi}$ (given by (3.35)) diverges then also; a graph of the magnitude of the momentum against the magnitude of the relative velocity $v$ is given in Fig. 3.47. The consequence is that as one imparts more and more momentum to an object, either through collisions or through exerting forces on it, it moves closer and closer to the speed of light as its momentum increases, but never reaches that speed because the inertial mass increases without limit and so the force needed to increase its speed by some given amount also increases without limit. Thus, one cannot accelerate a particle to faster than light in a particle accelerator, no matter how large the accelerator is (see Fig. 3.45), nor can one accelerate a rocket to faster than light no matter how much fuel one burns or how powerful the rocket motor is.

Fig. 3.46 A graph of $m/m_0$, the ratio of effective mass $m$ to rest mass $m_0$, against relative speed of motion $V = v/c$.

Fig. 3.47 A graph of $|\boldsymbol{\pi}/m_0|$, the ratio of the magnitude of relativistic momentum to rest mass, against $V = v/c$.

To see this in a specific case, suppose a projectile is moving at $v/c = \tfrac{4}{5}$; then $(1 - \tfrac{16}{25})^{-1/2} = \tfrac{5}{3}$, so its effective mass is $\tfrac{5}{3}m_0$ and its momentum has magnitude $\pi = \tfrac{5}{3}m_0(\tfrac{4}{5})c = \tfrac{4}{3}m_0c$. If its momentum is now doubled, then $\pi' = \tfrac{8}{3}m_0c$. The new speed of motion $v'$ is related to the momentum by $\tfrac{8}{3}m_0c = v'm = v'\{1 - (v'/c)^2\}^{-1/2}m_0$; solving for $v'$ shows $(v'/c)^2 = (\tfrac{64}{9})/(1 + \tfrac{64}{9}) = \tfrac{64}{73}$, so $v'/c = (\tfrac{64}{73})^{1/2} = 0.936$. Successively doubling the momentum, we find

$$\pi = \tfrac{16}{3}m_0c \Rightarrow V = v/c = (\tfrac{256}{265})^{1/2} = 0.983,$$
$$\pi = \tfrac{32}{3}m_0c \Rightarrow V = v/c = (\tfrac{1024}{1033})^{1/2} = 0.996,$$
$$\pi = \tfrac{64}{3}m_0c \Rightarrow V = v/c = (\tfrac{4096}{4105})^{1/2} = 0.999,$$

showing that less and less return is gained for each doubling of momentum and the speed of light is not attained.

Because the effective mass diverges at the speed of light, one cannot by any physical process accelerate a real object so that its final speed exceeds the speed of light. Thus, the dynamical theory of special relativity is in agreement with the basic assumption that the speed of light is a limiting speed for motion of all massive bodies, and indeed ensures that this condition is fulfilled.
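The diminishing return from each doubling of momentum can be tabulated directly. The sketch below is illustrative only (the function names are not from the text). It inverts $\pi = m_0\gamma(v)v$ to give $V = v/c$ as a function of the dimensionless momentum $\pi/(m_0c)$; with $\pi = eEt$ this is the same relation as the uniform-field result quoted earlier in this section.

```python
import math

def speed_from_momentum(p):
    """Return V = v/c for momentum p = pi/(m0*c), by inverting pi = m0*gamma(v)*v."""
    return p / math.sqrt(1.0 + p * p)

def effective_mass_ratio(V):
    """Return m/m0 = gamma for V = v/c, as in (3.34)."""
    return 1.0 / math.sqrt(1.0 - V * V)

p = 4.0 / 3.0  # starting momentum in units of m0*c (corresponds to v/c = 4/5)
for _ in range(6):
    V = speed_from_momentum(p)
    print(f"pi/(m0 c) = {p:7.3f}   v/c = {V:.6f}   m/m0 = {effective_mass_ratio(V):7.3f}")
    p *= 2.0  # double the momentum at each step
```

Each row doubles the momentum of the previous one; the speed creeps towards 1 (in units of $c$) without reaching it, while the effective mass grows roughly in proportion to the momentum.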
There is no inconsistency between the kinematics and dynamics of special relativity; they form a consistent whole together, as long as we do not omit any of the relativistic effects.

Computer Exercises

10. Write a program that accepts as input the rest mass M0 and speed of motion V1 of a particle moving relative to an observer B in the X-direction of his reference frame, and prints out its relativistic mass M1 and momentum P1. Use this program to verify the forms of Figs 3.46 and 3.47, and so to check that no matter how much momentum may be imparted to a particle its speed will not exceed the speed of light. Print out also the slow-motion approximation M1 = M0(1 + ½(V1/c)²), and find out for what ranges of V1 this is a good approximation to M1.

11. Write a program that will accept as inputs the rest masses M0(I) and speeds of motion V1(I) of two particles labelled I (I = 1, 2) which collide in a particle accelerator and are converted to two new particles (labelled J, J = 3, 4) in this collision, all particles moving in the X-direction of the chosen coordinate axes. The program should additionally accept as inputs the measured speeds V2(J) of the product particles, and then calculate and print out their rest masses M0(J). [Find the total momentum and mass of the initial particles, use the mass and momentum conservation equations, and then solve for the final rest masses.] What happens if you enter a value of V2(J) greater than the speed of light? What happens for a value equal to the speed of light?

D: Energy and mass

One of the fundamental features of Newtonian theory is the principle of energy conservation. For example, when the engine of a rocket accelerates it, energy is supplied to the rocket by the fuel it burns, the rate at which work is done being equal to the rate at which energy is supplied by the fuel.
Similarly in the case of relativity theory one can calculate the rate at which work is done by a force acting on a body; this turns out to be proportional to the rate of change of its mass $m$ (see Appendix B). Thus if we assume the rate of working is again equal to the rate of change of energy of that body, we deduce that the rate of change of energy is proportional to the rate of change of mass, suggesting a relation between mass and energy.

Further, if a body of rest mass $m_0$ moves slowly relative to the observer so that $(v/c)^2 \ll 1$, one can approximate expression (3.34) by $m = m_0/\{1 - (v/c)^2\}^{1/2} = m_0\{1 + \tfrac{1}{2}(v/c)^2 + \text{terms of order } (v/c)^4\}$, that is,

$$m = m_0 + \tfrac{1}{2}m_0v^2/c^2 + (\text{small terms that can be neglected}). \tag{3.38a}$$

This shows that the Newtonian kinetic energy $E_K = \tfrac{1}{2}m_0v^2$ contributes an amount $E_K/c^2$ to the effective mass of the body. Dropping the small terms and multiplying by the constant $c^2$, we find that approximately

$$mc^2 = m_0c^2 + E_K. \tag{3.38b}$$

Thus one is again led to the idea that mass and energy are closely related to each other.

On the basis of these kinds of arguments, Einstein proposed that mass and energy are different aspects of the same fundamental physical quantity. Mathematically, this is expressed in the famous relation

$$E = mc^2 \tag{3.39}$$

where the constant factor $c^2$ is required so that the dimensions of the equation are correct; the fact that this conversion factor is required to relate mass and energy units is evident from (3.38). This relation immediately implies that the mass conservation law (3.36) also expresses conservation of energy during particle interactions:

$$(\text{total final energy}) = (\text{total initial energy}). \tag{3.40}$$

Conservation of energy is a necessity because on taking a relativity viewpoint (see Appendix B), it is clear that the (three-dimensional) law of momentum conservation (3.33) and (one-dimensional) law of energy conservation (3.40) can be written as a single (four-dimensional) law of energy-momentum conservation; what appears to be momentum conservation in one frame will be energy conservation in another, and vice versa. These are then absolute laws that hold in all interactions as viewed by all observers; because of (3.39), the law of mass conservation (3.36) is implied as well.

Before exploring the meaning of these relations, we define the concepts of rest-mass energy and kinetic energy in relativity theory. Putting $v = 0$ in eqn (3.39), we find the energy of a body at rest relative to the observer, that is, the energy $E = E_0$ associated with its rest mass. Since $\gamma = 1$ in this case, then $m = m_0$ and we find

$$E_0 = m_0c^2 = (\text{the particle's 'rest-mass energy'}). \tag{3.41}$$

Using eqns (3.34) and (3.41) in (3.39) shows that the total energy is

$$E = m_0\gamma(v)c^2 = E_0\gamma(v). \tag{3.42}$$

We now define the relativistic kinetic energy $E_K$ from the rest-mass energy $E_0$ and the total energy $E$ by the relation

$$E = E_0 + E_K \tag{3.43}$$

that is, the kinetic energy $E_K$ is defined to be precisely that part of the total energy $E$ due to motion of the body relative to the observer. Using the definitions (3.39), (3.41) we recover eqn (3.38b) as a relation that is exactly true in relativity theory. Also using (3.42) and (3.21), eqn (3.43) shows that

$$E_K = m_0c^2\left(\frac{1}{\{1 - (v/c)^2\}^{1/2}} - 1\right) \tag{3.44}$$

is the exact relativity expression for kinetic energy. In the case of slow motion this reduces to $E_K = \tfrac{1}{2}m_0v^2 + (\text{small terms that may be neglected})$, thus recovering the Newtonian expression $\tfrac{1}{2}m_0v^2$ as a good approximation to the kinetic energy when $|v/c| \ll 1$.
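How quickly the Newtonian expression loses accuracy can be seen by evaluating both forms side by side. The following is a small illustrative sketch (the function names are hypothetical); energies are expressed as fractions of the rest-mass energy $m_0c^2$ and speeds as $V = v/c$.

```python
import math

def kinetic_energy_exact(beta):
    """E_K / (m0 c^2) = gamma - 1, the exact expression (3.44)."""
    return 1.0 / math.sqrt(1.0 - beta * beta) - 1.0

def kinetic_energy_newtonian(beta):
    """E_K / (m0 c^2) in the slow-motion limit: (1/2) (v/c)^2."""
    return 0.5 * beta * beta

for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
    exact = kinetic_energy_exact(beta)
    newton = kinetic_energy_newtonian(beta)
    shortfall = (exact - newton) / exact  # fraction of E_K the Newtonian form misses
    print(f"v/c = {beta:5.3f}  exact = {exact:.6e}  Newtonian = {newton:.6e}  "
          f"shortfall = {shortfall:.2%}")
```

For small $V$ the fractional shortfall grows roughly as $\tfrac{3}{4}V^2$, which is the first term neglected in the slow-motion expansion.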
When low speeds are involved, we may use either expression for kinetic energy, for the difference between them will be negligible; when high speeds are involved, we must use the relativity expressions for energy, or we shall obtain wrong answers. Summing up, in relativity theory, eqn (3.43) splits the total energy $E$ into its rest-mass energy $E_0$, that part of the total energy independent of the motion of the body, and its kinetic energy $E_K$, the part solely dependent on that motion. $E_0$ is given by (3.41) and $E_K$ by (3.44).

These ideas are of fundamental importance in physics. We do not have space to consider all their implications in detail, but will outline some of the most important consequences.

The conservation of mass and energy

Einstein's vision was to see that these relations apply to all forms of energy, this total energy being conserved. In Newtonian theory, provided one accounts for all forms of energy, total mass and total energy are separately conserved when complex interactions take place between bodies or systems of particles. In relativity theory these two laws are replaced by a single conservation law, the law of conservation of relativistic energy, which accounts for all forms of energy and simultaneously represents conservation of mass (because mass and energy are just different aspects of the same quantity).

The implications are profound. Consider two billiard balls colliding. In the real world, while most of their kinetic energy will be regained in the rebound of the balls after the collision, some of this energy will be expended in heating the balls up. Thus the final kinetic energy will be less than the initial kinetic energy, the difference being accounted for by the heat energy gained by the balls (resulting in their final temperatures being higher than their initial temperatures).
Total energy is conserved, so we can calculate the change in heat energy, which will result in the balls finally having a larger rest mass than before the collision.

As a specific example, suppose two identical balls of rest mass $\tfrac{1}{2}$ kg are seen by an observer to approach each other symmetrically, each moving at a speed $\tfrac{1}{2}c$, and then to move apart after colliding, each moving at a speed $\tfrac{1}{4}c$ (Fig. 3.48). The total momentum is zero initially and therefore is zero finally, as momentum is conserved. It follows that the final rest masses are equal (which is also clear by the symmetry of the situation). The initial total energy is

$$2 \times \tfrac{1}{2} \times \gamma(\tfrac{1}{2}c)\,c^2 = \{1 - (\tfrac{1}{2})^2\}^{-1/2}c^2 = (\tfrac{3}{4})^{-1/2}c^2 = 1.155\,c^2.$$

Fig. 3.48 Two balls of mass 0.5 kg approach each other, each moving at a speed $\tfrac{1}{2}c$ relative to an observer B, and then move apart after colliding, each moving at a speed $\tfrac{1}{4}c$.

If the final rest mass of each ball is $M_0$, the final total energy is

$$2 \times M_0 \times \gamma(\tfrac{1}{4}c)\,c^2 = 2M_0\{1 - (\tfrac{1}{4})^2\}^{-1/2}c^2 = 2M_0(\tfrac{15}{16})^{-1/2}c^2 = 2.066\,M_0c^2.$$

Since these energies are equal, then $M_0 = 1.155/2.066 = 0.559$ kg, an amount of 0.059 kg more than before the collision. By definition, this is the mass an observer moving with each ball will measure for it; how can the rest mass of the ball have increased? The answer is that the ball has more energy in it finally than initially because it has heated up, and this results in an increase in the ball's rest mass because of the equivalence of mass and energy.

The inertia of energy

This example illustrates the fact that a hot ball will have a larger inertial mass than an identical ball which is cold. By (3.32) its momentum at a given velocity will also increase; heat energy has inertia! Thus one will, for example, have to impart more momentum to the hot ball to increase its speed from rest to 30 km/hr than in the case of the cold ball. The same applies to all forms of energy, for (3.41) shows that if there is any increase in energy of a body (e.g. if it is heated up, if a battery is charged, if a spring is wound up), its rest mass will increase. Thus, all forms of energy (e.g. the chemical energy in a charged battery, the mechanical energy in a wound spring) contribute to the inertia of a moving body, increasing the force required to accelerate it to a given speed and the momentum it carries at that speed.

As is often the case, this effect will be negligible in everyday life; in our illustrative example we took the billiard balls to approach at half the speed of light in order to demonstrate a substantial effect, but it is of course totally impossible to achieve this on an ordinary billiard table! However, relativistic speeds (i.e. speeds close to the speed of light) are attainable in many physical situations, and the effect then becomes very important as the fraction of momentum due to internal energy, rather than rest-mass energy, increases.

The most dramatic example is in the case of zero-rest-mass particles: these are particles that, although they have no rest mass, have non-zero internal energy and consequently non-zero momentum. To see the properties of these particles, we note that from equations (3.21), (3.32) and (3.42) one can prove the relations

$$E^2 - \pi^2c^2 = m_0^2c^4, \tag{3.45}$$

$$\boldsymbol{\pi} = (E/c^2)\mathbf{v}, \quad \pi^2 = E^2v^2/c^4 \tag{3.46}$$

where $\pi^2$ and $v^2$ are respectively the squared magnitudes of the momentum and velocity vectors (see Exercise 3.27). This in turn implies the equation

$$E^2(1 - v^2/c^2) = m_0^2c^4. \tag{3.47}$$

These equations hold for any particle or body. The idea now is to take the limit $m_0 \to 0$ while $E$ remains non-zero. From (3.46) and (3.47) we find

$$\{m_0 = 0,\ E \neq 0\} \Rightarrow \{v^2 = c^2,\ |\boldsymbol{\pi}| = E/c\}, \tag{3.48}$$

showing that zero-rest-mass particles must move at the speed of light, and their energy and momentum are the same (up to a factor of $c$ required to convert dimensions between these quantities).
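The limiting behaviour can be made concrete by holding the total energy fixed and letting the rest-mass energy shrink. The sketch below is an illustration under stated assumptions, not a derivation: it takes relations (3.45) and (3.47) as given, works in arbitrary energy units, and uses a hypothetical function name.

```python
import math

def speed_and_momentum(E, rest_energy):
    """Given total energy E and rest-mass energy m0*c^2 (same units, E >= rest_energy),
    return (v/c, |pi|*c).

    v/c comes from E^2 (1 - v^2/c^2) = (m0 c^2)^2   (3.47)
    |pi|*c comes from E^2 - pi^2 c^2 = (m0 c^2)^2   (3.45)
    """
    beta = math.sqrt(1.0 - (rest_energy / E) ** 2)
    pc = math.sqrt(E * E - rest_energy * rest_energy)
    return beta, pc

E = 1.0  # total energy, held fixed
for rest_energy in (0.5, 0.1, 0.01, 0.001, 0.0):
    beta, pc = speed_and_momentum(E, rest_energy)
    print(f"m0 c^2 / E = {rest_energy:5.3f}   v/c = {beta:.8f}   |pi| c / E = {pc / E:.8f}")
```

As the rest-mass energy tends to zero, both $v/c$ and $|\boldsymbol{\pi}|c/E$ tend to 1, which is the content of (3.48).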
Thus, we can consistently conceive of particles which have finite energy and carry finite momentum for which (3.32), (3.42), (3.45), and (3.46) hold in a limiting form where (3.48) is true. Such particles do indeed exist, for example the photon, which is the particle associated with light (and so must of necessity move at the speed of light, as required by (3.48)). As is familiar, photons are able to carry energy between distant points, e.g. one can destroy a satellite in space by use of a suitable laser on the Earth which focuses light on the satellite; the photons which carry the energy from the laser to the satellite will also carry momentum, so the laser will recoil as it is fired and the satellite wreckage will be pushed into a new orbit by this momentum.

This kind of effect will occur for every collision involving zero-rest-mass particles. Suppose for example that a photon with energy 1 MeV collides with a stationary electron with rest-mass energy 0.511 MeV. After the collision it is observed that the photon has been deflected through an angle of 45°. Suppose that its energy and momentum are then $E'$ and $\boldsymbol{\pi}'$, and those of the electron are $E''$ and $\boldsymbol{\pi}''$. Now $|\boldsymbol{\pi}'| = E'/c$, and conservation of energy gives

$$1 + 0.511 = E' + E''.$$

Conservation of momentum shows

$$\boldsymbol{\pi} = \boldsymbol{\pi}' + \boldsymbol{\pi}'',$$

where $\boldsymbol{\pi}$ is the initial momentum of the photon. Rearranging the last equation and taking the squared magnitude, we obtain

$$(\boldsymbol{\pi} - \boldsymbol{\pi}') \cdot (\boldsymbol{\pi} - \boldsymbol{\pi}') = \boldsymbol{\pi}'' \cdot \boldsymbol{\pi}'',$$

giving

$$|\boldsymbol{\pi}|^2 + |\boldsymbol{\pi}'|^2 - 2\boldsymbol{\pi} \cdot \boldsymbol{\pi}' = |\boldsymbol{\pi}''|^2.$$

Noting that the magnitude of $\boldsymbol{\pi}$ is $1/c$ (from (3.48)) and that $\boldsymbol{\pi} \cdot \boldsymbol{\pi}' = |\boldsymbol{\pi}||\boldsymbol{\pi}'| \cos 45°$, we find that

$$(1/c)^2 + (E'/c)^2 - 2(E'/c^2)\cos 45° = |\boldsymbol{\pi}''|^2 = (1/c^2)\{(E'')^2 - (0.511)^2\}$$

from (3.45). Substitution for $E''$ from the equation of conservation of energy leads to

$$1 + (E')^2 - 2 \times \tfrac{1}{2}\sqrt{2}\,E' = (1.511 - E')^2 - (0.511)^2.$$

Solving this equation gives $E' = 0.636$ MeV and we see that the photon has lost energy as a result of the collision.
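The same steps can be carried out for any initial photon energy and deflection angle. The sketch below is illustrative (the function and constant names are not from the text); it solves the energy and momentum conservation equations used above for $E'$ in closed form, with all energies in MeV.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511

def scattered_photon_energy(E, angle_deg, rest_energy=ELECTRON_REST_ENERGY_MEV):
    """Energy E' of a photon of energy E after deflection through angle_deg
    by a stationary particle with the given rest-mass energy.

    Solves  E^2 + E'^2 - 2 E E' cos(theta) = (E + rest_energy - E')^2 - rest_energy^2
    for E'; the E'^2 terms cancel, leaving a linear equation.
    """
    cos_t = math.cos(math.radians(angle_deg))
    return E * rest_energy / (rest_energy + E * (1.0 - cos_t))

E_in = 1.0
E_out = scattered_photon_energy(E_in, 45.0)
E_electron = E_in + ELECTRON_REST_ENERGY_MEV - E_out  # total electron energy E''
print(f"photon after collision:   E'  = {E_out:.3f} MeV")
print(f"electron after collision: E'' = {E_electron:.3f} MeV "
      f"(kinetic part {E_electron - ELECTRON_REST_ENERGY_MEV:.3f} MeV)")
```

For the 1 MeV photon deflected through 45° this reproduces $E' \approx 0.636$ MeV; the energy lost by the photon appears as kinetic energy of the recoiling electron.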
The existence of zero-rest-mass particles leads to significant changes to the equation of state of matter at very high temperatures, which in turn affects such features as the equilibrium states of massive stars and the rate of expansion of the early universe. As a specific example: because the temperatures there are so high, the interior of a star like the Sun contains vast numbers of high-energy photons. The gravitational forces trying to cause the Sun to collapse are prevented from doing so in part by radiation pressure exerted by these photons because of the momentum they necessarily carry (eqn (3.48)). Thus, this is one of the features making a long life possible for stars like the Sun; it contributes to the stability of the Sun, which in turn makes life possible on the Earth.

The conversion of mass to energy and vice versa

While total energy is conserved in relativity theory, (3.41) suggests the possibility of converting rest mass to energy or vice versa. This can indeed be done in different ways. Because of the very large constant $c^2$ in this relation, a very large amount of energy is obtained by conversion of a very small amount of matter. The major processes whereby mass is converted to energy are fusion, fission, and pair annihilation; we briefly discuss these in turn.

Fusion

In the example above of a collision between a space station and meteorite, we found that the total final rest mass was higher than the initial rest mass, and we are now able to interpret this increase as being the mass equivalent of heat energy of the wreckage. Suppose now that this heat radiates off to space until the wreckage is cold: all the heat energy is gone. What would be the rest mass then?

Initially one might think this would be exactly equal to the mass of the components, i.e. 150 tons.
However, this cannot be correct, for the following reason. The material of the satellite and meteorite are fused together into a solid whole, which is stable (it does not spontaneously break apart). This means one would have to supply some energy in order to break it up into its constituent parts; because of this extra energy, which will increase the mass of the components, the total masses of these parts would be more than the mass of the wreckage. Thus the cold wreckage must weigh less than 150 tons.

An exactly analogous effect occurs in atomic nuclei. The nucleus of an atom consists of protons and neutrons ('nucleons') tightly bound together. Energy is needed to break a stable nucleus up into its constituent parts; the amount of energy needed to do this is its binding energy. If the nucleus were to be assembled from its elementary constituents, this amount of energy would be given off as they bind together to form the nucleus (Fig. 3.49). But, by (3.41) this energy loss implies a mass loss: so all nuclei have masses less than the masses of their constituent nucleons. For example, the mass of a proton is 1.007825 a.m.u.* and the mass of a neutron is 1.008665 a.m.u., so the total mass of a proton and a neutron is 2.016490 a.m.u. However, the mass of a deuterium nucleus, formed by fusing together a proton and a neutron, is 2.014102 a.m.u., less than the masses of its constituents by 0.002388 a.m.u. This mass difference, known as the mass defect, is directly measurable. The binding energy of deuterium is also directly measurable, and is found to be 2.224 MeV, which is just the energy equivalent of 0.002388 a.m.u. Thus equation (3.41) can be directly verified by these delicate experiments.

*1 a.m.u. = 'atomic mass unit' = $1.6605 \times 10^{-27}$ kg. Its energy equivalent is 931.5 MeV.

Fig. 3.49 A proton and a neutron fuse together, giving off energy, to form a deuteron nucleus with mass less than the total mass of the constituent particles.

For most nuclei the binding energy per nucleon is about 8 MeV; the nucleus with the largest binding energy per nucleon is iron, which is therefore the most stable nucleus. Some of the lighter elements can give off energy by fusion, when they combine to make heavier elements. A dramatic demonstration is the fusion (in a series of steps) of hydrogen to helium, releasing the binding energy of the helium; this is the process occurring in the hydrogen bomb, and is also the main source of energy in the Sun. This is an immensely important consequence of special relativity theory, because the Sun is the source of all the energy that enables life to exist on the Earth. The stars in galaxy NGC 3377 on the cover of this book would be invisible were it not for the release of fusion energy, not to mention the fact that neither the photographer nor the reader would exist in the absence of this process!

Fission

Elements heavier than iron have a lower binding energy per nucleon than iron, and so can give off energy by fission, as they split to make lighter elements. The most famous example of this is when uranium 235 splits into two nuclei, giving off the difference in binding energy between the initial uranium nucleus and the two final nuclei. This is the process occurring in the original atomic bomb, and the source of energy in many nuclear reactors used to supply electricity. Thus, the tiny mass differences corresponding to the binding energy of nuclei have very important consequences in the modern world.

Pair annihilation and creation

In pair annihilation, an electron and its antiparticle,* a positron, annihilate entirely and all their rest mass plus their kinetic energy is converted to energy carried off in the form of electromagnetic radiation (in the form of particles of light, i.e. photons).
The rest-mass energy of an electron is $m_0c^2 = 0.511$ MeV, so an energy of at least 1.022 MeV is released per electron-positron pair annihilated. This process may be a powerful source of energy in various astrophysical processes.

*A particle with equal mass but opposite values of other quantities such as charge.

The converse process of pair creation is also possible. If two photons collide with a total energy greater than the threshold energy of 1.022 MeV, sufficient energy is present to provide the rest mass of an electron-positron pair, so such a pair can be created where none existed before. This does not violate the law of conservation of mass, because energy has been turned into mass, and it is the total of mass and energy that is conserved in the reaction. Similarly, a single photon of sufficient energy can create an electron-positron pair, provided there is a nucleus nearby to allow conservation of energy and momentum. The creation of matter out of pure radiation is perhaps the most dramatic demonstration of the mass-energy relation. It has been demonstrated many hundreds of thousands of times in particle-accelerator experiments; Figure 3.50 shows the creation of an electron-positron pair in a bubble chamber, where a high-energy photon enters the chamber from the left and produces the pair near a nucleus which allows balancing of total energy and momentum. Neither the photon nor the nucleus leaves a visible track in the chamber, so the tracks of the electron-positron pair seem to appear out of nothing.

The relativistic basis

These consequences of special relativity all follow naturally from the four-dimensional view of energy and momentum that results when we consider dynamics from a space-time viewpoint, leading naturally to equations (3.32) and (3.42). It is perhaps useful to conclude this section by indicating how this happens (details being given in Appendix B).
Our basic ideas of kinetic energy and momentum lead us to believe that in both Newtonian and relativistic theory, when the relative velocity is zero, an observer will measure both quantities to be zero: thus in the relativity case,

$$\{\mathbf{v} = 0\} \Rightarrow \{\boldsymbol{\pi} = 0,\ E_K = 0,\ E = E_0\} \tag{3.49}$$

(the equality of $E$ and $E_0$ following from (3.43)). Thus, if an observer moves with a particle, all he will measure will be its rest-mass energy. Now change to another frame so that the particle is in relative motion at speed $v$; equations (3.32) for the momentum and (3.42) for the mass then follow. The latter provides the basis for (3.38) and so for (3.39).

Conclusion

We have seen how the application of relativity theory to dynamics leads to the understanding of many important phenomena: the velocity-dependence of effective mass, the equivalence of mass and energy, the inertia of all forms of energy, the concept of 'rest mass', and the possibility of converting mass to energy and vice versa. While many of the consequences of relativity theory are important only when high speed motion or large distances are involved, some of the dynamical phenomena are important in everyday life; for example, nuclear fission is a source of power for many cities at the present time.

Fig. 3.50 Conversion of energy to matter according to Einstein's famous formula $E = mc^2$, illustrated by pair production. A very energetic photon provides the energy needed to create the rest masses of an electron-positron pair. The photon does not make a visible track, but the tracks of the electron and positron after their creation are visible as they move to the right in spiral tracks in the Brookhaven bubble chamber. (Photograph provided by Brookhaven National Laboratory.)

Exercises

3.24 Suppose that a particle moves with speed (i) 10 c (ii) 10-¢ (iii) $. Find in each case the ratio of the kinetic energy to the rest energy.
3.25 A particle with rest mass $M_0$ and speed }c collides with a stationary particle of rest mass j $M_0$. They coalesce to form a new particle. Find its rest mass and speed of motion.

3.26 A particle of rest mass $M_0$ is moving with speed §¢ at time $t = 0$. It is subject to a constant force of magnitude ; $M_0c$ parallel to its direction of motion. Find its speed when $t = 1$. How inaccurate would the Newtonian result be in this case?

3.27 Obtain the energy-momentum relation for zero-rest-mass particles as follows. (i) Square equation (3.32a). (ii) Square equation (3.42). (iii) Obtain equation (3.45), and solve for $E$. (iv) Take the limit of this expression as $m_0 \to 0$. (v) Obtain (3.46) from (3.45), and so show the speed $v = c$ allows non-zero values for $\boldsymbol{\pi}$ and $E$ even though $m_0$ is zero. [Note that relations (3.32) and (3.42) are indeterminate in this case.]

3.28 The energy received from the Sun at the Earth is $8 \times 10^7$ erg/cm²·min on a surface perpendicular to the rays of the Sun. At what rate (in kg per minute) is hydrogen consumed in the Sun, in a chain of reactions fusing 4 protons (i.e. hydrogen nuclei) together to form one helium nucleus, to provide this radiated energy? What does this suggest about the lifetime of the Sun? [Hint: (i) Find in a.m.u. the energy released when 4 protons form a helium nucleus (the mass of a helium nucleus is 4.002603 a.m.u.). Convert this to ergs, using the relation 1 a.m.u. = $1.4916 \times 10^{-3}$ ergs. (ii) Find the surface area ($4\pi r^2$) of a sphere with radius equal to the Earth-Sun distance of $1.496 \times 10^{13}$ cm. (iii) Find the total energy falling on this sphere per minute if $8 \times 10^7$ ergs fall on each cm² of the sphere per minute. (iv) Determine how many fusion events will be needed per minute to produce this energy. (v) Convert the total mass of hydrogen needed per minute into kilograms. (vi) Estimate the maximum lifetime of the Sun if all its mass ($1.989 \times 10^{30}$ kg) is used up in the fusion process.]
3.29 How much energy is released in the fission of a radium nucleus (consisting of 224 nucleons with an average binding energy of 7.5 MeV per nucleon) into 4 iron nuclei (each with 56 nucleons with an average binding energy of 8.6 MeV per nucleon)?

3.30 What energy is released in annihilation of a proton-antiproton pair? What is the threshold energy a pair of photons require in order to create a proton-antiproton pair? [A proton has a rest mass 1836.1 times larger than the rest mass of an electron.]

3.8 The consistency of physics

We have discussed how the whole package of the kinematic and dynamical ideas of relativity theory must be consistent, and have important physical consequences. But Einstein's requirement goes further than this: all of physics must be consistent with the relativity principle. This central idea has many important consequences. It implies that relativity kinematics must obey the relativity principle and be consistent; we discussed this in Section 3.6. It implies that the laws of dynamics must obey the relativity principle, and be consistent with relativistic kinematics; we discussed this in Section 3.7. When applied to quantum mechanics, it led Dirac to predict the existence of antiparticles long before they were observed experimentally.

We will close this section by considering briefly one of the most successful of the classical physical theories, namely the theory of electromagnetism. What happens when we consider this from the viewpoint of relativity theory? The answer is perhaps rather surprising: this classical theory is completely compatible with relativity.
Unlike the case of classical dynamics, where a reformulation was required and dramatic new results were found when relativity was taken into account, in this case it was discovered that the principle of relativity was already deeply imbedded in the classical theory; indeed, it forms the foundation of the well-known relation between electricity and magnetism. This will be discussed briefly in Appendix C.

Because of this unity of physics and the compatibility of all physics with relativity theory, we do not have to rely on delicate measurements of clock times (as in the Hafele-Keating experiment) or of the speed of light (as in the Michelson-Morley experiment) to verify that special relativity is correct. Rather, we can prove its validity by the accuracy of the predictions of relativistic momentum conservation, verified many hundreds of thousands of times in high-energy accelerators, and by the existence of nuclear reactors and the demonstration of pair production and annihilation, as discussed in Section 3.7. Therefore, special relativity is in fact one of the best tested of all physical theories. As mentioned above, even the cover photograph of the book is evidence for special relativity theory, because the stars shine as a result of the conversion of mass to energy as outlined in Exercise 3.28.

Computer-Graphics Exercise 1

Write a program that will draw on the screen the $(t, X)$ axes of an observer A. It should then contain subroutines to do the following:
(a) Accept as input the speed of motion $v$ of an observer B relative to A, and the spatial position $X_0$ of B at the time $t = 0$ as measured by A; and then draw on the screen B's world-line. [Note: $|v/c| < 1$.]
(b) Accept as input a time $t_P$ as measured by A, and then indicate the point P in B's history that corresponds to that time.
(c) Draw a surface of simultaneity for B through any designated point Q on his world-line.
(d) Draw the future and past light cones of any designated space-time point Q (with coordinates $(t_1, X_1)$).
(e) Draw a series of light rays from A to B emitted at regular time intervals $T_0$, and print out the interval of reception of signals by B as measured by A [found directly from $t$-coordinate values], and as measured by B [determined by use of the K-factor].
(f) Draw a radar signal sent from A to B at some time $t_1$ and reflected back to A.

Use your program to depict (1) an observer A receiving radiation at some instant $t_0$ from a distant galaxy, indicating clearly the time $t_1$ in the galaxy's history that is observable by A at the time $t_0$; (2) the front and back ends of a rocket B moving past A, measured by A to be of length $L$, showing surfaces of simultaneity for A and for B; (3) observer B moving away from observer A and then back at speed $v$, while A keeps track of his motion by radar. [You should be able to think of many other elaborations and uses for this graphics program.]

4 The Lorentz transformation and the invariant interval

To examine the unifying ideas set out in Section 3.6, we look in turn at Lorentz transformations, at simple quantities invariant under these transformations, and at the invariant interval of flat space-time. We complete our examination of flat space-times by looking at three universe models based on these space-times. An understanding of the invariant interval and its meaning (presented in Section 4.2) is important for a full appreciation of the properties of the curved space-times discussed in Chapters 6 and 7. From the viewpoint of this book, the primary importance of the Lorentz-transformation discussion (in Section 4.1) is that it enables us to prove (in Section 4.2) that the space-time interval is an invariant, i.e. is the same for all observers.
4.1 The Lorentz transformation

Some people would claim that man's destiny, if he does not destroy himself before he has time to attain it, is to explore the Galaxy. As always, in venturing into the unknown, accurate surveying and mapping will be of vital importance for the safety of those engaged in this enterprise. Thus, it will be necessary to collate observations from many observatories, space stations, and spacecraft to give an overall picture of the regions that have been explored.* However, many of these observations will be made by observers moving relative to each other, so we must know how to reduce to a common reference frame observations made by relatively moving observers. The essential problem is to derive a relationship between the space-time views obtained by any two observers. Once obtained, this will unify the fundamental special-relativity results of time dilation, length contraction, and the relativity of simultaneity, in a single relation. We determine this relation, the Lorentz transformation, in this section.

*See 'Navigation between the planets', W. G. Melbourne. Scientific American 234, June 1976, 58.

The two coordinate systems

Consider inertial observers A and B moving at speed $v$ relative to each other, with their space-time positions coinciding at an event O; for concreteness, one might consider A as being in a control tower at an airport, and B in an aircraft flying past.

A's coordinates

Observer A defines a time coordinate $t$ and spatial coordinates $(x, y, z)$ in the standard way.
Essentially, he (i) measures proper time $t$ along his world-line by using an ideal clock; (ii) determines surfaces of simultaneity in space-time by use of radar; these are the surfaces $\{t = \text{constant}\}$, extending his time measurements from his world-line to the rest of space-time; and (iii) determines a set of non-rotating orthogonal directions along his world-line by local dynamical experiments (use of ideal gyroscopes, pendulums, etc.), and measures spatial distances $(x, y, z)$ along these axes by radar. This implies that he chooses $(x, y, z)$ so that his own position at all times is at the origin $(0, 0, 0)$. (iv) Surfaces $\{x = \text{constant},\ y = \text{constant},\ z = \text{constant}\}$ in space-time can be determined by radar observations of distances at arbitrary angles $\theta$ and $\phi$ relative to these coordinate axes and use of standard trigonometry, thus extending the distance measurements along the axes to the rest of space-time (the relationships involved here are somewhat complex to write out in full, and do not particularly illuminate the nature of special relativity theory, so we shall omit the details of this procedure). It is then convenient (v) to define new coordinates $(X, Y, Z)$ by the relations $(X = x/c,\ Y = y/c,\ Z = z/c)$ to obtain a geometrical picture of space-time where the light cone is at 45° to the vertical: physically, he has then chosen distance units that set the speed of light to 1 (see Sections 1.2 and 2.2). When he has done this, he has effectively covered space-time by a coordinate grid that labels each event P by coordinates $(t, X, Y, Z)$ in the standard way (see Fig. 4.1a).

One should note particularly that, by their nature, these coordinates represent specific time and distance measurements as outlined above. The time coordinate of any point is measured along the time lines (parallel to A's world-line) from the surface of simultaneity $\{t = 0\}$; in particular $t$ specifies time measured by any observer whose world-line is the coordinate axis $\{X = Y = Z = 0\}$.
The surfaces $\{t = \text{constant}\}$ are surfaces of instantaneity for such an observer, and $(X, Y, Z)$ are spatial distance coordinates for him in these surfaces from the origin $\{X = Y = Z = 0\}$. Thus, the position of a general point P can be represented as a combination of a spatial displacement in a surface $\{t = \text{constant}\}$ and a time displacement along a line $\{X, Y, Z\ \text{constant}\}$, where one may make these displacements in either order.

We need four coordinates to describe the set of all events because space-time is four-dimensional, so precisely four numbers are needed to locate each happening in space and time. For example, a collision between two aircraft is located precisely if we are told it took place at 4:00 p.m. (Greenwich mean time) on 19 November 1986, at the height of 10 000 metres above the Earth's surface and at latitude 40°30′20″, longitude 10°23′34″. More coordinates would be inconsistent or redundant; fewer would be insufficient to tell us completely 'where and when' (i.e. at what space-time point) it took place.

It is often convenient to use a superscript notation for coordinates. Thus instead of writing $(t, X, Y, Z)$ we can write $(x^0, x^1, x^2, x^3)$ where we have defined $x^0 = t$, $x^1 = X$, $x^2 = Y$, $x^3 = Z$. For brevity, we write the whole sequence of four coordinates as $(x^a)$, where the coordinate index (or label) $a$ is understood to run through the values 0, 1, 2, 3; so $(x^a) = (x^0, x^1, x^2, x^3)$.
We may avoid possible conflict between 'the coordinate $x^2$' and 'the square of the coordinate $x$' by writing the latter as $(x)^2$; and similarly for other powers of $x$ that may occur. However, in the context of using the coordinate names $t, X, Y, Z$ we continue to denote the squares simply by $t^2, X^2, Y^2, Z^2$, when no confusion will arise. We refer to A's coordinate system $(x^a)$ determined in this way as his reference frame and denote it by the single letter $F$.

Fig. 4.1 (a) The coordinate system of observer A. (b) The coordinate system of observer B. (c) The coordinate grids of observers A and B both lie in the same space-time and describe the same set of events.

B's coordinates

Observer B defines his coordinates $(t', X', Y', Z') = (x^{a'})$ in the identical way, by use of ideal clocks and radar. Then he has covered space-time by a second coordinate grid that labels each event P by coordinates $(t', X', Y', Z')$ (Fig. 4.1b). In this case, the position of a point P is represented by a combination of a spatial displacement in a surface of instantaneity $\{t' = \text{constant}\}$ (tilted at some angle $\alpha$ relative to the surface of constant time $t$) and a time displacement along a line $\{X', Y', Z'\ \text{constant}\}$ parallel to B's world-line (tilted at the same angle $\alpha$ relative to A's world-line, cf. Fig. 3.20). Again one may make these displacements in either order. We refer to B's coordinate system $(x^{a'})$ as his reference frame $F'$.

The relationship between the two coordinate systems

Now the crucial point is that the same set of events are described by the two coordinate systems $(x^a)$ and $(x^{a'})$ (Fig. 4.1c).
The relationship between the two observers' space-time measurements is contained in the relation between these two coordinate systems. This relation can be simplified greatly by making two choices: (i) A and B each rotate their spatial axes so that the direction of relative motion lies in the $X$ direction and $X'$ direction respectively, with their $Y$ and $Y'$ and their $Z$ and $Z'$ axes respectively coincident, and (ii) A and B each choose the origin of their time coordinates so that the event O where they meet is at the time $t = 0$ and $t' = 0$ respectively; that is, the origins of their coordinates coincide then. From now on we shall assume these simplifications have been made. Our task then is to determine the relationships between the two coordinate systems, and hence between measurements made by the two observers. The derivation that follows is somewhat involved; if you find yourself getting bogged down in the next few paragraphs on a first reading, we recommend turning to the formulae themselves (4.3, 5) and their consequences (4.6-14).

Consider the event E labelled $(t, X, Y, Z)$ by A and $(t', X', Y', Z')$ by B. As the relative motion is in the $X$ (or $X'$) direction, measurements in the $Y$ and $Z$ directions are unaffected (see Section 3.5), so $Y = Y'$ and $Z = Z'$ for all $t$ (or $t'$). Thus we need to concentrate only on the relation between $(t, X)$ and $(t', X')$. These two pairs of coordinates are drawn from A's viewpoint in Fig. 4.2. Because B's surfaces of simultaneity are tilted up by the same angle as his world-line is tilted over from vertical (Section 3.3), the triangles OCF and EKJ in Fig. 4.2 are identical to each other (formally: they are congruent); and similarly, the triangles OHJ and EGF are identical to each other. This is clear from the diagram; a formal proof is given in the appendix to this section. Since E has coordinates $(t', X')$ according to B, the displacements OF and JE both represent the distance $X'$ measured by B, and OJ and FE both represent the time $t'$ measured by B.
Let J have the coordinates $(t_0, X_0)$ in A's frame; then the displacements OH and GE both represent times $t_0$ measured by A, and HJ and FG both represent distances $X_0$ measured by A. Similarly let F have the coordinates $(t_1, X_1)$ in A's frame; then the displacements OC and KE both represent a distance $X_1$ measured by A, and JK and CF both represent times $t_1$ measured by A.

Fig. 4.2 The event E has coordinates $(t, X)$ in A's frame and coordinates $(t', X')$ in B's frame. Lines IE and OD are simultaneous for A, while JE and OF are simultaneous for B. Observer A measures DE to be at constant distance from his world-line OI, while observer B measures EF to be at constant distance from his world-line OJ. The lines FG and JH are parallel to the $X$-axis, and the lines CF and JK are parallel to the $t$-axis.

We now use some earlier results to relate these different coordinate values.
(a) Time dilation (3.18, 20) for the time measured from O to J gives
$t_0 = \gamma(v)\,t'$, (4.1a)
where $\gamma(v)$ is given by (3.19).
(b) Length contraction (3.22, 23) for the distance measured from O to F gives
$X_1 = \gamma(v)\,X'$. (4.1b)
(c) Since B moves at the speed $v$ relative to A, the distance $X_0$ represented by HJ is related to the time $t_0$ represented by OH by
$X_0 = x_0/c = v t_0/c = (v/c)\,t_0 = V t_0$. (4.1c)
(d) Because the angle HOJ is equal to the angle COF, we have HJ/HO = FC/CO, that is, $X_0/t_0 = t_1/X_1$; so, using (4.1c), we obtain
$t_1 = X_0 X_1/t_0 = V X_1$ (4.1d)
(which is essentially the simultaneity result (3.17)).

This information now lets us determine the lengths of the sides of the large rectangle OIED. Firstly, OD = OC + CD = OC + FG = OC + HJ (because triangles OHJ, EGF are congruent). But OD = $X$, OC = $X_1$, and HJ = $X_0$. Hence $X = X_1 + X_0 = \gamma(v)X' + V\gamma(v)t'$ from (4.1b,a,c); that is,
$X = \gamma(v)(X' + V t')$. (4.2a)
Similarly, OI = OH + HI = OH + JK = OH + CF (because triangles OCF and EKJ are congruent). Therefore $t = t_0 + t_1 = \gamma(v)t' + V\gamma(v)X'$ by (4.1a,d,b); that is,
$t = \gamma(v)(t' + V X')$. (4.2b)
Finally, as we have already seen,
$Y = Y', \quad Z = Z'$. (4.2c)

Equations (4.2a-c) are called the Lorentz transformation resulting from a change of velocity in the $x$ direction (sometimes called a 'boost'). They give A's coordinates for the event E in terms of B's coordinates for that event. The equations are usually written in terms of the coordinates $x, y, z$ and $x', y', z'$:
$t = \gamma(v)(t' + v x'/c^2)$, (4.3a)
$x = \gamma(v)(x' + v t')$, (4.3b)
$y = y', \quad z = z'$, (4.3c)
where
$\gamma(v) = (1 - v^2/c^2)^{-1/2}$. (4.4)
This is the required set of equations relating two observers' coordinates for the same event.

As an example of their use, suppose observer B is in a rocket moving uniformly at speed $v = \tfrac{4}{5}c$ away from observer A at a base on the Earth, both observers agreeing to measure time from the instant when B passed A at the space-time event O. After a while B observes a tremendous explosion on a planet he is observing by radar. He measures the coordinates of the event P where the explosion took place to be $(x^{a'}) = (5, 1, 3, 0)$, that is, $t' = 5$, $X' = 1$, $Y' = 3$, $Z' = 0$. He radios back to A, 'Danger! Radioactive debris at position (1, 3, 0) because of explosion at time $t' = 5$' (in this example, units are assumed to be years and light-years). Standard coordinates are based on A's position and motion. What coordinates should A assign to this event when broadcasting a warning to other spacecraft?

In this case, $V = \tfrac{4}{5}$ and so, by (4.4), $\gamma = \{1 - V^2\}^{-1/2} = \tfrac{5}{3}$; thus eqns (4.3) become $t = \tfrac{5}{3}(5 + \tfrac{4}{5} \times 1) = \tfrac{29}{3}$, $X = \tfrac{5}{3}(1 + \tfrac{4}{5} \times 5) = \tfrac{25}{3}$, $Y = 3$, $Z = 0$. In brief, we can write this as
$(x^{a'}) = (5, 1, 3, 0) \Rightarrow (x^{a}) = (\tfrac{29}{3}, \tfrac{25}{3}, 3, 0)$, (*)
showing the different coordinates allocated by the two observers to the explosion at this event.

The inverse transformation

Relations (4.2, 3) are completely reciprocal (as are all relativity formulae); that is, the transformation from A's coordinates to B's also has the form (4.3). In fact, one can solve eqns (4.3) for $t'$ and $x'$, finding
$t' = \gamma(v)(t - v x/c^2)$, (4.5a)
$x' = \gamma(v)(x - v t)$, (4.5b)
$y' = y, \quad z' = z$. (4.5c)
As an example, suppose that A measures the coordinates of an event P to be $(x^{a}) = (\tfrac{29}{3}, \tfrac{25}{3}, 3, 0)$, that is, $t = \tfrac{29}{3}$, $X = \tfrac{25}{3}$, $Y = 3$, $Z = 0$, and the speed of motion of B relative to A is $v = \tfrac{4}{5}c$ (so $V = \tfrac{4}{5}$).
Then use of (4.5) will show (just as in the above example) that observer B measures the coordinates of the event P to be $(x^{a'}) = (5, 1, 3, 0)$, that is, $t' = 5$, $X' = 1$, $Y' = 3$, $Z' = 0$. In brief, we can write this as
$(x^{a}) = (\tfrac{29}{3}, \tfrac{25}{3}, 3, 0) \Rightarrow (x^{a'}) = (5, 1, 3, 0)$, (**)
showing the different coordinates allocated by the two observers to the same event P. This result is of course just the inverse of (*) above, as it must be because (4.5) is the inverse of (4.3).

Equations (4.5) are identical in form to (4.3) except for the minus sign preceding $v$. The reason for the sign is as follows: according to A, the origin of B's reference frame (the point $X' = Y' = Z' = 0$) moves in the positive $x$-direction at a speed $v$; according to B, the origin of A's reference frame (the point $X = Y = Z = 0$) moves in the negative $x'$-direction at the same speed. The Lorentz transformation formulae (4.3, 5) are valid when A and B are approaching each other for $t, t' < 0$ and receding from each other for $t, t' > 0$, i.e. when B moves in the $+x$-direction as measured by A. However, we would find the opposite sign for $v$ if we calculated the result for B moving in the negative $x$-direction relative to A. Therefore, the convention for the sign of $v$ implied in the Lorentz transformation formula is that $v$ will be positive when the relative motion is in the $+x$-direction and negative when it is in the $-x$-direction. Given this understanding (which is different from that implied in the K-factor formulae above in Sections 3.1 and 3.2), eqns (4.5) are precisely what we would expect to determine B's coordinates from those of A. This equivalence of the formulae is a direct consequence of the relativity principle (each observer is equivalent to the other, so there must be no essential difference in the transformation formulae between them).
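The forward and inverse transformations can be checked numerically. The following is a minimal Python sketch, assuming units in which $c = 1$ (so that $V = v/c$, as in (4.2)); the function names `gamma`, `boost_to_A` and `boost_to_B` are illustrative choices, not notation from the text. It applies (4.2) to the explosion event with $V = 4/5$ and then uses the reciprocal form, with $V$ replaced by $-V$, to recover B's coordinates.

```python
import math

def gamma(V):
    """Lorentz factor of eq. (4.4) for a speed V = v/c with |V| < 1."""
    return 1.0 / math.sqrt(1.0 - V * V)

def boost_to_A(event_B, V):
    """Eq. (4.2): B's coordinates (t', X', Y', Z') -> A's coordinates (t, X, Y, Z)."""
    t_p, X_p, Y_p, Z_p = event_B
    g = gamma(V)
    return (g * (t_p + V * X_p), g * (X_p + V * t_p), Y_p, Z_p)

def boost_to_B(event_A, V):
    """Eq. (4.5) with c = 1: the same form as (4.2) with V replaced by -V."""
    return boost_to_A(event_A, -V)

V = 0.8                              # B recedes from A at 4/5 of the speed of light
event_B = (5.0, 1.0, 3.0, 0.0)       # (t', X', Y', Z') in years and light-years
event_A = boost_to_A(event_B, V)     # A's coordinates for the same event
round_trip = boost_to_B(event_A, V)  # back to B's coordinates

print("A's coordinates:", event_A)
print("B's coordinates recovered:", round_trip)
```

Up to floating-point rounding, the first line printed corresponds to $(29/3, 25/3, 3, 0)$ and the second reproduces $(5, 1, 3, 0)$, which is the reciprocity expressed by (4.3) and (4.5).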
Consequences of the Lorentz transformation

The forward and inverse Lorentz transformation formulae (4.3, 5) for the relatively moving observers A and B determine each observer's coordinates for any event from the other's coordinates for that event. They can be used to find relations between any space, time, or velocity measurements made by the observers. For example, as we shall see shortly, one can derive the length contraction, time dilation, relativity of simultaneity, and special-relativity velocity addition laws directly from these relations. We shall now explore briefly the main consequences of the Lorentz transformations.

A: The Newtonian limit

It is important to note that when the speeds involved are small compared to the speed of light and distances small relative to the times involved, the Lorentz transformation reduces to the usual Newtonian results. For example, if $v = 300$ km/sec (a very large speed by daily standards!) then $v/c = 300/300\,000 = 1/1000$ so from (4.4), $\gamma(v) = \{1 - (1/1000)^2\}^{-1/2} = (1 - 10^{-6})^{-1/2} \approx 1 + (1/2)10^{-6}$, a value extremely close to 1; so for most purposes we can accurately approximate $\gamma$ in (4.3) and (4.5) by 1. If, further, the distance $x$ involved is less than 10 000 km, then $x/c < 10\,000\ \text{km}/(300\,000\ \text{km/sec}) = 1/30$ sec, so $(v/c)(x/c) < (1/1000)(1/30) = \tfrac{1}{3} \times 10^{-4}$ sec; so $t - vx/c^2 \approx t$ if $t \gg 10^{-4}$ sec. Thus, in these circumstances, all the characteristic relativity terms are so small that they can be ignored, and the transformation (4.3) becomes $t \approx t'$, $x \approx x' + 300t'$, $y = y'$, $z = z'$, where $x$ is measured in kilometres and $t$ in seconds. More generally,
$\{|v/c| \ll 1,\ |x/c| \ll |t|\} \Rightarrow \{\gamma(v) \approx 1\} \Rightarrow \{t \approx t',\ x \approx x' + vt',\ y = y',\ z = z';\quad t' \approx t,\ x' \approx x - vt,\ y' = y,\ z' = z\}$, (4.6)
which are just the usual relations we intuitively understand through our experiences of everyday life, that have been formalized in Newtonian theory. Thus, the standard results of Newtonian theory will be valid when slow motion (i.e.
speeds slow relative to the speed of light) takes place and the length scales involved are everyday lengths; relativistic effects will only show up when large velocities or light travel times are involved, but they will then be important. We now turn to these effects.

B: Time dilation

Firstly, consider the transformation for a point Q on B's world-line ($x' = y' = z' = 0$). Thus (see Fig. 4.3a) we set $x' = 0$ in (4.3), finding
$t = \gamma(v)\,t'$, (4.7a)
$x = v\gamma(v)\,t' = vt$. (4.7b)
The second result confirms that the quantity '$v$' is indeed the velocity measured for B in the $+x$-direction by A; and the first is the standard time-dilation effect (3.20) for B's clock as measured by A. For example, if $v = \tfrac{4}{5}c$ so that $\gamma = \tfrac{5}{3}$, a time of 5 years measured by B ($t' = 5$) will be found to be equivalent to a time $t = \tfrac{5}{3} \times 5 = 8\tfrac{1}{3}$ years measured by A (using radar to determine simultaneity in B's history with events on his own world-line). Similarly, setting $x = 0$ in (4.5) will give the reciprocal results when B observes A.

Fig. 4.3 (a) A point Q on B's world-line has coordinates $(t, X)$ in A's frame and coordinates $(t', 0)$ in B's frame; $t$ and $t'$ are related by the time dilation factor. (b) A point R on B's surface of events simultaneous with the origin O has coordinates $(t, X)$ in A's frame and $(0, X')$ in B's frame; $X$ and $X'$ are related by the length contraction factor. (c) For both observers A and B, light rays have the same speed; according to A their equation is $t = x/c = X$, and according to B it is $t' = x'/c = X'$.

C: Length contraction and the relativity of simultaneity

Secondly, consider the transformation for a point R in the surface $\{t' = 0\}$ defining simultaneity, as measured by B, with the event O. Thus (see Fig. 4.3b) we set $t' = 0$ in (4.3), finding
$x = \gamma(v)\,x'$, (4.8a)
$t = \gamma(v)\,v x'/c^2 = v x/c^2$.
(4.8b)
The first result relates the distance from O to R as measured by A to the same distance measured by B, and shows the standard length-contraction effect (3.23) for B moving past A. The second is the formula (3.17) giving A's coordinates for an arbitrary point R in B's surface of simultaneity with O. For example, if $v = \tfrac{4}{5}c$ so that $\gamma = \tfrac{5}{3}$, a length of 1 light-year measured instantaneously by B ($X' = 1$) for an object at rest relative to him parallel to $v$ will correspond by (4.8a) to a length of $\tfrac{5}{3} \times 1 = \tfrac{5}{3}$ light-years measured by A; also by (4.8b), A and B will disagree about simultaneity over that distance by an amount $(v/c)(x/c) = \tfrac{4}{5} \times \tfrac{5}{3} = \tfrac{4}{3}$ years. Again the reciprocal results follow on setting $t = 0$ in (4.5).

D: Invariance of the speed of light

Thirdly, both observers should agree about motion at the speed of light. We check this for light moving in the $+x$-direction by setting $t' = +x'/c$ in (4.3), finding $x = \gamma(v)x'(1 + v/c)$ and $t = \gamma(v)(x'/c)(1 + v/c) = x/c$; thus
$t' = +x'/c \Rightarrow t = +x/c$, (4.9a)
confirming that if B measures light to travel at speed $c$ in the $+x$-direction, then A agrees (Fig. 4.3c). For example, if $v = \tfrac{4}{5}c$ so that $\gamma = \tfrac{5}{3}$ then, after 1 year, B will measure light emitted at O ($t' = 0$, $X' = 0$) to be at event P ($t' = 1$, $X' = 1$). A will determine the coordinates for event O to be $\{t = 0, X = 0\}$ and the coordinates for the event P to be $\{t = \tfrac{5}{3} \times 1 \times \tfrac{9}{5} = 3$ years, $X = \tfrac{5}{3} \times 1 \times \tfrac{9}{5} = 3$ light-years$\}$, confirming that A measures this light to be travelling at the speed $c$. Similarly,
$t' = -x'/c \Rightarrow t = -x/c$, (4.9b)
showing that A agrees with B about the speed of light in the $-x$-direction.

E: Relativistic velocity addition

Fourthly, suppose a third observer C moves past B at a speed $v'$ in the $+x'$-direction (Fig. 4.4). Let C's coordinates $(t'', x'', y'', z'')$ be aligned in the standard way. Then (applying the results above for B observing C) B's coordinates are related to C's by
$t' = \gamma(v')(t'' + v' x''/c^2)$, (4.10a)
$x' = \gamma(v')(x'' + v' t'')$, (4.10b)
$y' = y'', \quad z' = z''$, (4.10c)
where
$\gamma(v') = (1 - v'^2/c^2)^{-1/2}$. (4.11)
Fig. 4.4 Three observers in relative motion: A has coordinate system $(t, X, Y, Z)$; observer B has coordinate system $(t', X', Y', Z')$ and is moving at speed $v$ relative to A in the $X$-direction; C has coordinates $(t'', X'', Y'', Z'')$ and is moving with speed $v'$ relative to B in the $X'$-direction (which is parallel to the $X$-direction).

Now, the relation between A's and C's coordinates should again be a Lorentz transformation of the form (4.3), because A and C are just two inertial observers whose coordinates are related in the standard way. Indeed this is so: one can substitute (4.10, 11) into (4.3, 5) and simplify, obtaining eventually (after some tedious algebra)
$t = \gamma(v'')(t'' + v'' x''/c^2)$, (4.12a)
$x = \gamma(v'')(x'' + v'' t'')$, (4.12b)
$y = y'', \quad z = z''$, (4.12c)
where
$\gamma(v'') = (1 - v''^2/c^2)^{-1/2}$, (4.13)
the quantity $v''$ being defined by
$v'' = (v + v')/(1 + v v'/c^2)$. (4.14)
This indeed shows that measurements made by A and C are related in the standard way, with the relative velocity of A and C given by (4.14). Thus, we have confirmed that if A measures B to move at a speed $v$ in the $x$-direction, and B measures C to move at a speed $v'$ in the (parallel) $x'$-direction, then A measures C to move at a speed $v''$ in the $x$-direction where $v''$ is given by the special relativity velocity addition formula (3.15). For example, if $v = \tfrac{4}{5}c$ and $v' = \tfrac{1}{2}c$, then $1 + vv'/c^2 = 1 + \tfrac{4}{5} \times \tfrac{1}{2} = \tfrac{7}{5}$, so $v'' = (\tfrac{4}{5} + \tfrac{1}{2})c \times \tfrac{5}{7} = \tfrac{13}{14}c$ (which is less than $c$, as required).

Reprise

We have now determined the Lorentz-transformation equations (4.3) and (4.5) relating the measurements made by two observers with different velocities in the $x$-direction, and verified that we can derive the standard kinematic results of special relativity from them, thus confirming that these formulae do indeed encapsulate in a compact form the kinematics of special relativity.
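The consequences B to E can also be verified numerically. The sketch below is illustrative only: it assumes units with $c = 1$ and two space-time dimensions $(t, X)$, the helper names `to_A` and `add_velocities` are hypothetical, and the speeds $V = 4/5$ and $V' = 1/2$ and the test event are chosen simply as convenient values. It checks time dilation (4.7), the length and simultaneity relations (4.8), the invariance of the speed of light (4.9), and that two successive boosts agree with a single boost at the speed given by (4.14).

```python
import math

def gamma(V):
    """Lorentz factor (4.4) with c = 1."""
    return 1.0 / math.sqrt(1.0 - V * V)

def to_A(t_p, X_p, V):
    """Eq. (4.2a,b): coordinates (t', X') of the moving frame -> (t, X)."""
    g = gamma(V)
    return g * (t_p + V * X_p), g * (X_p + V * t_p)

def add_velocities(V, V_p):
    """Eq. (4.14) with c = 1."""
    return (V + V_p) / (1.0 + V * V_p)

V = 0.8

# B: a point on B's world-line (X' = 0) after t' = 5 years.
t_Q, X_Q = to_A(5.0, 0.0, V)      # t = gamma * t', X = V * t

# C: a point simultaneous with O for B (t' = 0) at X' = 1 light-year.
t_R, X_R = to_A(0.0, 1.0, V)      # X = gamma * X', t = V * X

# D: a light ray, t' = X' = 1, must also satisfy t = X for A.
t_P, X_P = to_A(1.0, 1.0, V)

# E: C moves at V_p relative to B; compare two boosts with one boost at V_pp.
V_p = 0.5
V_pp = add_velocities(V, V_p)
event_C = (2.0, 1.0)              # arbitrary event (t'', X'') in C's frame
via_B = to_A(*to_A(*event_C, V_p), V)
direct = to_A(*event_C, V_pp)

print("time dilation:", t_Q, X_Q)
print("simultaneity surface:", t_R, X_R)
print("light ray:", t_P, X_P)
print("combined speed:", V_pp, "two boosts:", via_B, "one boost:", direct)
```

Agreement between `via_B` and `direct` is the numerical counterpart of the 'tedious algebra' leading to (4.12).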
In the next section we turn to looking for quantities invariant under Lorentz transformations; the present section concludes by giving a worked example of the use of the Lorentz transformation, and showing how these transformations may be viewed in an active sense rather than the passive sense used so far. This is useful later on in constructing simple universe models.

An example

Suppose that a rocket of length 100 m travels horizontally above the ground at a speed of $10^7$ m/sec. At a certain moment, a light signal is emitted from the front end of the rocket. Let us compare the times the signal takes to reach the tail end of the rocket according to (i) an observer travelling on the rocket, and (ii) a stationary observer on the ground. For the observer travelling in the rocket, the length of the rocket is of course 100 m. Thus (i) the time taken is this length divided by the speed of light, i.e. $100\ \text{m}/(3 \times 10^8\ \text{m/sec}) = 0.33 \times 10^{-6}$ sec. (ii) Suppose that in this observer's reference frame the light is emitted at event A with coordinates $t_A = x_A = 0$, and received at event B with time $t_B$ as calculated above and distance $x_B = 100$ m. Then, in the frame of the stationary observer, who moves with the relative speed $v = 10^7$ m/sec in the $x$-direction, eqn (4.5a) gives
$t'_B = \gamma(v)(t_B - v x_B/c^2) = (1 - v^2/c^2)^{-1/2}(x_B/c)(1 - v/c) = t_B\{(1 - v/c)/(1 + v/c)\}^{1/2} = t_B\{(1 - \tfrac{1}{30})/(1 + \tfrac{1}{30})\}^{1/2} = (\tfrac{29}{31})^{1/2}\,t_B \approx 0.32 \times 10^{-6}$ sec.
Notice that the result cannot be obtained naively from either the length contraction factor or the time dilation factor, because neither the length of the rocket nor the rate of a moving clock is directly at issue.

Active transformations

So far, we have regarded the Lorentz transformation in a passive sense: it relates the reference frames of different observers, and so determines how their different coordinates for the same event are related. However, it can also be regarded in an active sense. To see how this works, consider first an ordinary rotation of axes in Euclidean 2-space (Fig. 4.5a).
Changing from reference frame $F$ (with coordinates $(x, y)$) to reference frame $F'$ (with coordinates $(x', y')$), we find the coordinates of the same point P in the two frames are related by
$x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta$
(which is completely analogous to (4.5)). This is what we refer to as a passive transformation: each point in the space is simply being referred to in two different coordinate systems $F$ and $F'$, with $F'$ related to $F$ by a rotation. By contrast, in an active transformation the space as a whole rotates relative to the fixed coordinate axes of the initial frame $F$, the rotation of the reference frame $F'$ dragging the points of the space with it, i.e. moving the points so as to preserve their coordinate values (Fig. 4.5b). For example the point P at $\{x = 0, y = 1\}$ is dragged along with the rotation of axes to the point P′ given by $\{x' = 0, y' = 1\}$, see Fig. 4.5c. Similarly, each point P is mapped by the transformation into the point P′ which has the same coordinate values relative to the new frame as the old point had relative to the old frame.

Fig. 4.5 (a) A rotation of axes in the Euclidean two-plane changes the coordinates $(X, Y)$ of the point P to coordinates $(X', Y')$. This is a passive transformation: the points in the space remain fixed, but the reference frame changes. (b) In an active rotation, the point P moves with the axes and coordinates to a new point P′. (c) The image point P′ has the same coordinates relative to the new coordinate system as the initial point P had relative to the old coordinates. (d) The movement of points in the Euclidean plane generated by an active rotation.

In precisely the same way, we can regard the Lorentz transformation as either a passive or an active transformation. Previously we have regarded eqns (4.5) as representing a passive transformation (referring to the same fixed space-time points in two different coordinate systems).
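Before turning to boosts, the two readings of a rotation can be compared in a short Python sketch. It is illustrative only: the function names `passive` and `active` and the angle of 30° are assumptions made for the example. `passive` re-labels a fixed point in the rotated frame $F'$ using the formula above, while `active` returns the old-frame coordinates of the image point P′ whose $F'$ coordinates equal the original coordinates of P.

```python
import math

def passive(x, y, theta):
    """Coordinates (x', y') in the rotated frame F' of a fixed point (x, y)."""
    return (x * math.cos(theta) + y * math.sin(theta),
            -x * math.sin(theta) + y * math.cos(theta))

def active(x, y, theta):
    """Coordinates in the old frame F of the image point P', i.e. the point
    whose F' coordinates are the original values (x, y)."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

theta = math.radians(30.0)        # assumed rotation angle for the example
P = (0.0, 1.0)                    # the point {x = 0, y = 1} used in the text
P_image = active(*P, theta)       # where P is dragged to, in F's coordinates
check = passive(*P_image, theta)  # F' coordinates of the image point

print("image point in F:", P_image)
print("its coordinates in F':", check)
```

The second line printed recovers $(0, 1)$ up to rounding: the image point has the same coordinate values in the new frame as the original point had in the old one, which is the defining property of the active transformation.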
Now suppose we start with two reference frames A and B at rest with respect to each other, representing space-time in terms of coordinates $(t, x, y, z)$ and $(t', x', y', z')$ respectively. Initially these coordinates are identical ($t = t'$, $x = x'$, $y = y'$, $z = z'$) because the frames are at rest with respect to each other. Now we set frame B in motion so that it is moving relative to frame A at a speed $v$ in the $+x$ direction; we may refer to this as giving the frame B a boost through $+v$. We regard space-time events as dragged along with the frame B when we apply the boost (but the frame A as fixed, unaffected by it). Thus, the effect of the boost is to move each point P from an initial position given relative to both A's and B's frames by the coordinates $(x', y', z', t')$, to a final position given relative to B's frame by the same values $(x', y', z', t')$ (see Fig. 4.6a). Relative to A's frame the final coordinates $(t, x, y, z)$ will be determined from $(t', x', y', z')$ through eqns (4.3). For example, the event Q′ a unit time along B's time axis, $\{t' = 1, x' = y' = z' = 0\}$, is then found to have coordinates

$$t = \gamma(v), \quad x = v\gamma(v), \quad y = 0, \quad z = 0 \tag{4.15}$$

according to A (cf. eqn (4.7)). This event was initially a unit time along both A's and B's axes. Thus, if we take the event Q at $\{t = 1, x = y = z = 0\}$ in A's frame and give it a boost through $+v$, it will end up at Q′ with coordinates (4.15).

By this construction, it is clear that length and time measurements are preserved under an active Lorentz transformation (e.g. a unit time measurement in B's frame remains a unit time measurement as the boost is performed). From this viewpoint, this is in fact the defining property of Lorentz transformations, which move points in space-time as shown in Fig. 4.6b.

Fig. 4.6 An active Lorentz transformation moving the points of two-dimensional flat space-time into each other. (a) The effect of a boost on specific points P, Q, R, moving each point $(t, X)$ into a new point with new coordinates $(t' = t, X' = X)$. (b) The pattern of motion generated by the boost (this is the exact analogue of Fig. 4.5d).

If we keep on repeating the boost for a particular relative velocity $v$, we will get an infinite series of frames, each related to the previous one by (4.3), representing a relative velocity $+v$ in the $x$-direction. If we sketch these frames in a single space-time diagram, the result is Fig. 4.7. We can regard this as showing how repeated application of a boost through $+v$ will move the unit time vector of A's frame (i.e. the vector OT, where T has coordinates $(1, 0, 0, 0)$ in A's frame) into a succession of vectors, each representing a unit time displacement for an observer moving relative to A. These unit time vectors each represent the relevant observer measuring 1 unit of time from the space-time event O. Thus, these arrows all represent unit clock measurements made by observers moving at different velocities relative to A; and the surface they define is a surface at unit proper time from O (where this time is measured along the straight line from O).

Fig. 4.7 The effect of a repeated series of boosts on the unit time-like and space-like vectors along the axes of the reference frame of an observer A. The image vectors can be thought of as the unit time and space vectors along the axes of the reference frames of a series of relatively moving observers. They define the surfaces at unit time and unit spatial distance from the origin O.

This surface enables one to compare the units of time on different lines through the origin representing the uniform motion of particles at different speeds, all passing O at time $t = 0$.
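The effect of repeating a boost on the unit time vector can be tabulated with a few lines of code. This is a minimal sketch under stated assumptions: units with $c = 1$, only the $(t, X)$ plane, a hypothetical speed $V = 0.5$, and illustrative function names (`gamma`, `active_boost`, `repeated_boosts`) that do not come from the text. Each active boost sends the event with coordinate values $(t, X)$ to the event whose coordinates in A's frame are $\gamma(t + VX)$, $\gamma(X + Vt)$, which for the starting event $(1, 0)$ reproduces (4.15).

```python
import math

def gamma(V):
    """Lorentz factor for a speed V = v/c with |V| < 1."""
    return 1.0 / math.sqrt(1.0 - V * V)

def active_boost(t, X, V):
    """Move the event (t, X) by a boost through +V; result is in A's coordinates."""
    g = gamma(V)
    return g * (t + V * X), g * (X + V * t)

def repeated_boosts(event, V, n):
    """Return the starting event followed by its images under n successive boosts."""
    images = [event]
    for _ in range(n):
        event = active_boost(*event, V)
        images.append(event)
    return images

# Start from the tip of A's unit time vector, T = (1, 0).
for t, X in repeated_boosts((1.0, 0.0), 0.5, 4):
    # -t^2 + X^2 should stay at -1 (unit proper time from O), while the
    # slope X/t creeps towards 1, i.e. towards the light cone.
    print(f"t = {t:.4f}  X = {X:.4f}  -t^2 + X^2 = {-t * t + X * X:.6f}  X/t = {X / t:.4f}")
```

The printed ratio $X/t$ is the speed of each successive observer relative to A; it increases with every boost but stays below 1.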
Similarly, repeatedly boosting the displacement $\{t = 0, x = 1, y = 0, z = 0\}$ representing a unit spatial displacement will give a series of vectors representing instantaneous unit spatial measurements by this family of observers, defining a surface at unit spatial distance from O (measured along the straight line from O). This surface enables one to compare the units of spatial distance along different space-like lines all passing through the origin. These two surfaces are the space-time equivalent of a unit circle in the Euclidean plane (since that is the surface at constant unit distance from the origin O, measured along the straight line from O; there is in that case no distinction between time-like and space-like curves or measurements).

Figure 4.7 also displays how, as the relative velocity $v$ tends towards $c$, the frames of the other observers (viewed from A) appear to collapse towards the light cone. This is a consequence of the limiting nature of the speed of light in special relativity.

Exercises

4.1 Deduce explicitly transformation (**) following (4.5) from the general Lorentz transformation formula (4.5).

4.2 Consider two events A and B defined in some frame of reference by coordinates $t_A = x_A = y_A = z_A = 0$ and $t_B = 1$, $x_B = 2c$, $y_B = z_B = 0$. What are their coordinates in a frame moving with speed […]$c$ in the $x$ direction relative to the first frame? What has happened to their time ordering on transformation between the two frames? What aspect of the relationship between A and B makes this feature possible?

4.3 Suppose that two events are connected by a time-like line in one reference frame. Show that their time order is the same in all reference frames.

4.4 A passenger on a train moving with speed $v$ watches a girl stationary on the ground throw a ball at speed $2v$ at an angle of 60° to the horizontal, in the direction parallel to the train's motion. According to the girl the path of the ball is given by

$$x = vt, \qquad y = \sqrt{3}\,vt - \tfrac{1}{2}gt^2,$$

where $x$ and $y$ measure horizontal and vertical distances. Find the path according to the passenger on the train.

4.5 A spaceship with a top speed of […]$c$ pursues one with a top speed of […]$c$. An observer on a nearby planet observes them to be one light-year apart. How much later, according to the observer on the planet, will the slower one be caught? What will this time difference be (i) according to an observer on the slower spaceship, (ii) according to an observer on the faster ship?

4.6 Apply a boost with parameter $v$ to the following events described by their $(t, X)$ coordinates: (a) $(-1, 2)$, (b) $(0, \sqrt{3})$, (c) $(1, 2)$, (d) $(-1, -1)$, (e) $(1, 1)$, (f) $(2, -1)$, (g) $(3, 0)$, (h) $(2, 1)$. Plot the old and new points on a space-time diagram for $v = 5c/13$, and draw in what lines of constant distance from the origin are needed to show the effect of the boost on these points.

4.7 The group property [this example presumes you know the mathematical definition of a group]. Show that a combination of any number of Lorentz transformations of the standard form (with parallel velocities) will lead to a final Lorentz transformation of the same form, for some appropriate velocity. Consider, for example, a family of observers $A_1, A_2, A_3, \ldots$, each moving at the speed $v$ relative to the previous member of the family ($A_2$ moves at a speed $v$ relative to $A_1$; $A_3$ moves at a speed $v$ relative to $A_2$; and so on). The resulting series of coordinate axes are shown in Fig. 4.7. This shows the unit time-like vectors (from the origin $\{t = 0, x = 0\}$ to the point $\{t = 1, x = 0\}$ on each observer's world-line) and space-like vectors (from the origin $\{t = 0, x = 0\}$ to the point $\{t = 0, x/c = 1\}$ in each observer's frame). Then every pair of reference frames in this family is related by a Lorentz transformation of the form given in eqn (4.3), with $v$ replaced by the relevant value for the relative velocity (derived by repeating the relativistic velocity addition law the appropriate number of times).
The identity transformation is a Lorentz transformation (put $v = 0$ in (4.3)), and the inverse transformation to any Lorentz transformation is also a Lorentz transformation (in fact (4.5) is the inverse of (4.3)). Prove that, together with the composition property discussed above, the $(t, x)$ Lorentz transformations form a group of transformations.

Computer Exercise 12

Write a program that will accept as input (a) a speed $V$ ($= v/c$), (b) coordinates $(t, x, y, z)$ of a point P measured by an observer A, and will print as output coordinates $(t', x', y', z')$ of P measured by an observer B, given by the Lorentz-transformation equations (4.5). Make sure that your program allows repeated Lorentz transformations, i.e. having made one transformation, unless new data is fed in, the output of the previous transformation is automatically the input for the next one.

Get your program (c) to print out additionally the result of the Newtonian transformations (4.6), and so experiment to see when these are a good approximation to the Lorentz transformation; (d) to print out the quantity $Q = -t^2 + X^2 + Y^2 + Z^2$. What is the change in this quantity each time you perform a Lorentz transformation?

Computer Graphics Exercise 2

Write a program that draws a set of axes $(t, X)$, and then shows the effect on a space-time diagram of moving a chosen point P with coordinates $(t, X)$ to the point P′ with coordinates $(t', X')$ given by the Lorentz-transformation formula (4.5) for a specified speed $V$ ($= v/c$) [arrange for an arrow to be drawn on the screen from P to the new point P′; this exercise regards the Lorentz transformation as an active transformation, but you may use the program from Computer Exercise 12 to perform your calculations]. Try the effect of repeated transformations on the points (1) $t = 1$, $X = 0$; (2) $t = 1$, $X = 1$; (3) $t = 0$, $X = 1$; (4) $t = -1$, $X = -1$.
Modify the program (a) to show the effect of the transformation in moving several chosen points simultaneously; (b) to show its effect on a line through the origin, as follows: given a specification of a point Q, (i) draw the straight line through the origin O ($t = 0$, $X = 0$) and Q; (ii) mark off on this line the series of points $Q_i$, where $Q_1$ is Q, the point $Q_2$ is twice as far from O along the line as Q, the point $Q_3$ is three times as far from O along this line, etc., until the edge of your diagram has been reached; (iii) show the effect of the transformation on all of the points $Q_i$, and draw the new straight line through the origin that they move into. Try this program on the set of points (1)-(4) listed above.

Appendix: geometric proof of congruence of triangles in Fig. 4.2

By construction, OFEJ is a parallelogram and so OJ = FE. The angles HOJ and COF are equal (see Section 3.3). Angles KEJ and COF are equal (the parallel lines JE and OF are at the same angle to the horizontal). Similarly angles HOJ and GEF are equal (the parallel lines OJ and FE are at the same angles to the vertical). Angles OHJ, JKE, EGF and OCF are all right angles by construction. Therefore, triangles OCF and EKJ are congruent (with equal sides OF and JE and two pairs of equal angles COF and KEJ, and OCF and EKJ). Similarly, triangles OHJ and EGF are congruent (with equal sides OJ and FE, and two pairs of equal angles HOJ and GEF, and OHJ and EGF).

4.2 Space-time separation invariants

We have seen that many features of space-time, which we previously took for granted to be unchanging, in fact change according to the relative motion of observers. It would be very useful if we could find quantities that are invariant, i.e. independent of the reference frame chosen. Then all observers will agree on their values, so communication will be simpler, and physical laws may be expected to take a simpler form, if expressed in terms of such invariants.
Their independence of the state of motion implies that such quantities are of particular physical or geometrical significance: they reflect some deeper underlying structure, which is independent of the reference frame or coordinate systems used to describe it.

A simple example to keep in mind is that of rotations in Euclidean space. When different axes are used, different coordinates $(x, y)$ are assigned to the same point (Fig. 4.5a). However, the distance $d$ from the origin, defined by $d = (x^2 + y^2)^{1/2}$, is calculated to be the same no matter what coordinates are used, since it is invariant under rotations of the $(x, y)$ axes. Thus it is very useful to be able to talk about the distance between two points, because this is an invariant quantity: all observers agree about its value no matter what choice of coordinates they have made, and so it is appropriate to characterize the geometry of Euclidean space in terms of distances between points. We seek analogous quantities in space-time.

We shall consider here three invariants related to distance and time measurements in flat space-time: the functions $S^2$ and $\Delta S^2$, and the infinitesimal $ds^2$. Although the argument is sometimes an involved one, it is worth following through, because it provides the basis for understanding the invariant interval of curved space-times which we consider in the following chapter. These quantities characterize how clock measurements behave and how light travels in space-time, and so also determine instantaneity and spatial distance measurements. The invariant $S^2$ characterizes these properties relative to the origin in flat space-time, while $\Delta S^2$ characterizes them for any two points in flat space-time. By contrast, $ds^2$ determines these properties for any two neighbouring points in space-time; the properties of curved spaces and space-times are built up by knowing the local distances between any pair of neighbouring points in the space, and this is described by $ds^2$.

The space-time invariant $S^2$

We have already seen that the speed of light is an invariant. Other invariants may be built up from entities which by themselves change from frame to frame, but are combined in such a way that the resulting quantity is unchanging. An important example is the quantity $S^2$, defined in terms of the coordinates $(t, X, Y, Z)$ of an observer A, where $X = x/c$, $Y = y/c$, $Z = z/c$, by

$$S^2 = -t^2 + X^2 + Y^2 + Z^2. \tag{4.16}$$

(It is important to note here that although this is written, for historical reasons, as '$S$ squared', it is not necessarily positive. This will become clear in the following discussion.) When an observer B using coordinates $(t', X', Y', Z')$ evaluates this quantity, by its definition (4.16) he will evaluate

$$S'^2 = -t'^2 + X'^2 + Y'^2 + Z'^2. \tag{4.17}$$

Suppose B moves at a speed $v$ relative to A in the $+x$-direction. Using the relation (4.5), he finds

$$S'^2 = -\{\gamma(v)(t - VX)\}^2 + \{\gamma(v)(X - Vt)\}^2 + Y^2 + Z^2,$$

where we have set $V = v/c$. On multiplying out (the cross terms $\pm 2VtX$ cancel) and using the expression (4.4) for $\gamma$, this becomes

$$S'^2 = \{-t^2(1 - V^2) + X^2(1 - V^2)\}/(1 - V^2) + Y^2 + Z^2.$$

On cancelling the factor $1 - V^2$, we find that

$$S'^2 = S^2; \tag{4.18}$$

that is, both observers obtain the same value for this expression (whatever their speed of relative motion). Thus $S^2$ is an invariant under change of velocity in the $X$-direction. It is also invariant under any spatial rotation of the $X$, $Y$, $Z$ axes, because $t$ and $X^2 + Y^2 + Z^2$ are separately invariant under such rotations. It is therefore invariant under any velocity change whatever (a spatial rotation can bring any change of velocity to a change of velocity in the $x$-direction); so $S^2$ is an invariant: it will be found to have the same value by all inertial observers.
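The invariance of $S^2$ under a boost in the $x$-direction is easy to confirm numerically. The sketch below assumes units with $c = 1$ and applies the passive transformation $t' = \gamma(t - VX)$, $X' = \gamma(X - Vt)$, $Y' = Y$, $Z' = Z$; the function names and the sample event are arbitrary choices made for illustration, not taken from the text.

```python
import math

def lorentz_passive(event, V):
    """Coordinates of the same event for an observer moving at speed V (= v/c) along +x."""
    t, X, Y, Z = event
    g = 1.0 / math.sqrt(1.0 - V * V)
    return (g * (t - V * X), g * (X - V * t), Y, Z)

def interval_squared(event):
    """S^2 = -t^2 + X^2 + Y^2 + Z^2 for an event given as (t, X, Y, Z)."""
    t, X, Y, Z = event
    return -t * t + X * X + Y * Y + Z * Z

P = (3.0, 1.0, 2.0, 0.5)   # an arbitrary illustrative event
for V in (0.0, 0.3, 0.6, -0.9):
    P_prime = lorentz_passive(P, V)
    # The last column should agree with interval_squared(P) up to rounding.
    print(V, P_prime, interval_squared(P_prime))
```

The individual coordinates change with every choice of $V$, while the combination $S^2$ does not, apart from floating-point rounding in the last digits.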
As an example, suppose in A's coordinate system an event P is given by $(x^a) = (7, 5, 3, 0)$; then $S^2 = -(7)^2 + (5)^2 + 3^2 + 0^2 = -49 + 25 + 9 = -15$. Now if B moves relative to A at a speed $v = \tfrac{3}{5}c$, then B's coordinates for the event P will be $(x'^a) = (5, 1, 3, 0)$ (see eqn (**) in the previous section). Thus the value of $S'^2$ calculated by B is $S'^2 = -(5)^2 + (1)^2 + (3)^2 + (0)^2 = -25 + 1 + 9 = -15$, the same value as before, confirming that $S^2$ is an invariant in this particular case.

When we remember that many other quantities we previously believed to be invariant have turned out not to be so, it is clear that this quantity must have some special meaning. What is the meaning? It is just 'the square of the space-time distance' from the origin O with coordinates $(0, 0, 0, 0)$ to the point P with coordinates $(t, X, Y, Z)$. Thus it is a natural generalization to the space-time situation of $r^2 = x^2 + y^2 + z^2$, the square of the spatial distance from the origin O with coordinates $(0, 0, 0)$ to the point P with coordinates $(x, y, z)$ in Euclidean three-dimensional space. However, there is an important difference: $r^2$ is non-negative, $r^2 \geq 0$, but because of the minus sign in (4.16), $S^2$ may take negative, positive, or zero values, with slightly different interpretations in each case. We shall consider them in turn.

In examining these meanings, it is convenient to rewrite (4.16) in the form

$$S^2 = -t^2 + R^2, \tag{4.19}$$

where $R^2 = X^2 + Y^2 + Z^2$ is the square of the spatial distance from O to P measured by A, in units such that the speed of light is 1 (and so is non-negative: $R^2 \geq 0$). We define $V = R/t$, the measured speed of motion of an object moving on the straight line from O to P.

Negative values of $S^2$

Suppose that $S^2 < 0$. Then there is some positive real number $T$ such that $S^2 = -T^2$. Consider the set of points seen by an observer A to be given by $t > 0$ and $S^2 = -T^2$ (Fig. 4.8a). Choose any point P on this surface. By eqn (4.19), A finds the time $t$ and distance $R$ of P from O to be related by

$$S^2 < 0 \iff R^2 < t^2 \iff V^2 = (R/t)^2 < 1$$

[…]

Positive values of $S^2$

Suppose that $S^2 > 0$. Then there is some positive real number $D$ such that $S^2 = D^2$. Consider the set of points given by $S^2 = D^2$ viewed from the frame of an observer A (Fig. 4.11a), and any event Q on this surface. Then there is a straight line from the origin O to Q; we rotate the spatial axes so that $y$ and $z$ are constant along this line, i.e. so that its spatial direction is the $x$-direction. By eqn (4.19), A finds the time $t$ and distance $R$ of Q from O to be related by

$$S^2 > 0 \iff R^2 > t^2 \iff V^2 = (R/t)^2 > 1;$$

that is, the straight line OQ from O to Q represents motion at greater than the speed of light relative to A. Therefore OQ cannot be the world-line of any observer B moving inertially between O and Q. In the $(t, X)$ plane, this line will make some angle $\alpha$ with the horizontal axis (Fig. 4.11a); the line at the angle $\alpha$ from the vertical axis towards Q is then the world-line of an observer B for whom the events O and Q are simultaneous, i.e. the line OQ lies in his surface of instantaneity. Change to B's frame of reference by a suitable Lorentz transformation; then the events O and Q will both lie in his surface of instantaneity $\{t' = 0\}$ (Fig. 4.11b). His coordinates $(t', X', Y', Z')$ for Q will then be $(0, X', 0, 0)$. Evaluating $S'^2$ for this point shows $S'^2 = X'^2$. But, since this is an invariant, $S'^2 = S^2 = D^2$; so $X' = D$. This means that B measures Q to be a distance $D$ from O (at the instant $t' = 0$). This is also true in Fig. 4.11a, which just represents the same set of events in a different reference frame; thus every point in the surface $S^2 = D^2$ is at a distance $D$ from O when measured by any observer for whom this displacement is instantaneous. We can therefore characterize these surfaces as lying at '1 light-second', '2 light-seconds', etc., distance from O.

As an example, consider A to measure the event Q to be at $\{t = 3\ \text{sec}, X = 5\ \text{light-sec}, Y = Z = 0\}$. Then $S^2 = -3^2 + 5^2 = -9 + 25 = 16 = 4^2$; so Q lies in the surface '4 light-seconds distant from O'. An observer B moving at $V = v/c = \tfrac{3}{5}$ in the $+x$ direction will measure O and Q to be simultaneous and separated by a spatial distance of 4 light-seconds.

Fig. 4.12 Surfaces at one unit and two units of spatial distance from the origin O.

Again, the surface $S^2 = 1$ at unit spatial distance has special significance, for this gives the scaling of distances along different surfaces of instantaneity in a space-time diagram by setting the unit distance scale along each of these spatial sections (Fig. 4.12). The invariance of unit spatial vectors under boosts is apparent in Fig. 4.7, because when any one such vector undergoes a Lorentz transformation it remains in the surface $\{S^2 = \text{constant}\}$ in which it lay initially. This invariant also provides the last piece of information we need to understand completely the length-contraction effect, for it shows what length is measured by the stationary observer A to be the same as the 'contracted' length measured by a moving observer B (Fig. 4.13).

Vanishing values of $S^2$

Suppose now $S^2 = 0$. Let L be any point on this surface. By eqn (4.19), A finds the time $t$ and distance $R$ of L from O to be related by

$$S^2 = 0 \iff R^2 = t^2 \iff V^2 = (R/t)^2 = 1;$$

that is, the straight line OL from O to L represents motion at the speed of light relative to A. Thus this surface is just the light cone measured by A for the event O. Since $S^2$ is invariant, any other observer B will also find $S'^2 = 0$: this set of events will also be the light cone he determines for the event O. That is, invariance of $S^2 = 0$ for different observers is just Einstein's principle of the invariance of the speed of light for all observers.

Fig. 4.13 A rigid rod, stationary relative to the observer A, has end-points u and w. It is measured by the relatively moving observer B to have a length $X'$.
To find the length the observer A will measure for the rod, we draw the surface $S^2 = X'^2$. This intersects B's surface of simultaneity through O at Q, which is a distance $X$ from the origin in A's reference frame. Therefore A measures the length of the rod to be $X$.

Fig. 4.14 The surfaces $\{S^2 = \text{constant}\}$ at constant space-time distance from the origin drawn in a space-time diagram. The surfaces $S^2 = 0$ are the light cone of the origin.

Summary

All observers will agree on the value of the invariant $S^2$. The surfaces $S^2 = \text{constant}$ are drawn in Fig. 4.14; they represent proper times from O (when $S^2$ is negative), instantaneous spatial distances from O (when $S^2$ is positive), and the light cone $C^{\pm}(O)$ of O (when $S^2$ is zero). It is convenient to refer to the latter as being at zero (space-time) distance from O, for the following reason. Taking the limit as a point Q approaches $C^{\pm}(O)$, either Q is simultaneous with O and the spatial distance OQ goes to zero (if approached from the region where $S^2 > 0$) or the measured proper time OQ goes to zero (if approached from the region where $S^2 < 0$). One can use this invariant to compare easily the spatial distance and proper time measurements made by different inertial observers who pass through the event O.

Exercises

4.8 Calculate explicitly the quantity $S^2$ for the cases (a) $t = 4$, $X = 2$, $Y = 3$, $Z = 0$; (b) $t = 2$, $X = 4$, $Y = 0$, $Z = 5$; (c) $t = 5$, $X = 3$, $Y = 0$, $Z = 4$. In each case interpret your results in terms of the relation between the origin of coordinates O and the point P with the stated coordinates. Use equations (4.5) to prove explicitly the invariance of $S^2$ in these cases if $v/c$ = […].

4.9 If the light cone is projected into the $(t, X)$ plane by setting $Y = Z = 0$, $S^2 = 0$ becomes $-t^2 + X^2 = 0$. Deduce that the solution is $X = \pm t$. Show explicitly that these rays are invariant under (4.5).

4.10 Suppose that a light signal is emitted at the space-time event O ($t = 0$, $X = 0$) and absorbed at the space-time event B ($t = 1$, $X = 1$). Is $S^2$ zero for B?
Suppose now the light is reflected by a mirror at B and absorbed when at the event C ($t = 2$, $X = 0$). Is $S^2$ zero for C?

4.11 Consider again the discussion of muon decay in Section 3.6. Calculate from quantities given in the Earth's frame the proper time taken by the muons to move through the Earth's atmosphere. Use this time to predict the fraction of muons surviving at sea level.

The invariant $\Delta S^2$

We have seen that the invariant $S^2$ determines surfaces in space-time 'at constant distance' from the point O with coordinates $(0, 0, 0, 0)$, thereby determining clock measurements on inertial paths through O, spatial measurements on surfaces of simultaneity, and the directions of light rays from that event. Can we find a similar invariant telling us about such measurements based on an arbitrary space-time point Q?

An example will be useful in suggesting the way to go. Suppose that scout ships have established that a star exploded in a massive supernova explosion at the event Q given in standard galactic coordinates by $(x^a_Q) = (2, 3, 1, 0)$ and that dinosaurs became extinct in a catastrophic event P on a planet of a nearby star, the coordinates of P being $(x^a_P) = (3, 1, 2, 0)$. The question is: could the supernova explosion possibly have been responsible for the extinction of the dinosaurs? A way to arrive at the answer is to notice that the displacement from Q to P (Fig. 4.15) has coordinates $(y^a) = (3 - 2, 1 - 3, 2 - 1, 0 - 0) = (1, -2, 1, 0)$, i.e. these are the components of the position of P relative to Q. Thus, if we regard Q as the origin of coordinates, we can work out the corresponding invariant $\Delta S^2$ for this displacement by using eqn (4.16) but with the left-hand side being now denoted $\Delta S^2$ (which just stands for the interval based on Q rather than O) and the right-hand side evaluated for the displacement components $(1, -2, 1, 0)$ from Q to P. Explicitly,

$$\Delta S^2 = -1^2 + (-2)^2 + 1^2 + 0^2 = -1 + 4 + 1 = +4.$$

Because this is positive, the displacement from Q to P is space-like (it represents a spatial distance of 2 light-years); therefore no causal effect spreading from Q, travelling at less than or at the speed of light, could influence what happened at P. The extinction of the dinosaurs was not caused by the supernova explosion.

Fig. 4.15 A supernova explosion occurred at event Q and dinosaurs became extinct on a neighbouring planet at event P. The time coordinates $t$ of these events differ by $\Delta t$, and the spatial coordinates $X$ by $\Delta X$.

This example makes clear that it is useful to focus on the displacement from Q to P (with components $(y^a)$ in the above example). To consider this more generally, consider two points P and Q in space-time, to which an inertial observer A assigns coordinates $(t_P, X_P, Y_P, Z_P)$ and $(t_Q, X_Q, Y_Q, Z_Q)$ (as in Fig. 4.15). When we make a Lorentz transformation (4.2) to the frame of a second observer B, these points will then be assigned coordinates $(t'_P, X'_P, Y'_P, Z'_P)$ and $(t'_Q, X'_Q, Y'_Q, Z'_Q)$ respectively. It is straightforward to work out how the displacement from Q to P behaves; the result is (4.21c), leading to the invariant distance between these points (4.22). The details are as follows.

The old and new coordinates of P are related by

$$t_P = \gamma(v)(t'_P + VX'_P), \quad X_P = \gamma(v)(X'_P + Vt'_P), \quad Y_P = Y'_P, \quad Z_P = Z'_P,$$

and those of Q by

$$t_Q = \gamma(v)(t'_Q + VX'_Q), \quad X_Q = \gamma(v)(X'_Q + Vt'_Q), \quad Y_Q = Y'_Q, \quad Z_Q = Z'_Q.$$

Subtracting these equations shows that

$$t_P - t_Q = \gamma(v)\{(t'_P - t'_Q) + V(X'_P - X'_Q)\}, \qquad Y_P - Y_Q = Y'_P - Y'_Q,$$

$$X_P - X_Q = \gamma(v)\{(X'_P - X'_Q) + V(t'_P - t'_Q)\}, \qquad Z_P - Z_Q = Z'_P - Z'_Q.$$

This is somewhat clumsy to deal with, so we use the notation that $\Delta$ represents the change in a quantity between Q and P.
Then

$$\Delta t = t_P - t_Q, \quad \Delta X = X_P - X_Q, \quad \Delta Y = Y_P - Y_Q, \quad \Delta Z = Z_P - Z_Q, \tag{4.21a}$$

$$\Delta t' = t'_P - t'_Q, \quad \Delta X' = X'_P - X'_Q, \quad \Delta Y' = Y'_P - Y'_Q, \quad \Delta Z' = Z'_P - Z'_Q, \tag{4.21b}$$

are the changes in the coordinates $(t, X, Y, Z)$ and $(t', X', Y', Z')$ between Q and P; and we find finally

$$\Delta t = \gamma(v)(\Delta t' + V\Delta X'), \quad \Delta X = \gamma(v)(\Delta X' + V\Delta t'), \quad \Delta Y = \Delta Y', \quad \Delta Z = \Delta Z'. \tag{4.21c}$$

This again has exactly the Lorentz-transformation form (4.2), but with $X$ replaced by $\Delta X$, etc. Now given the definition (4.16), the invariance result (4.18) was a direct result of (4.2). In exactly the same way, define

$$\Delta S^2 = -(\Delta t)^2 + (\Delta X)^2 + (\Delta Y)^2 + (\Delta Z)^2. \tag{4.22}$$

Then it follows from (4.21c) that this is an invariant: for any change of reference frame,

$$\Delta S'^2 = \Delta S^2. \tag{4.23}$$

What this result shows is that the space-time distance of the point P from the point Q is invariant. Thus, just as before, we can draw surfaces of constant distance about the point Q, which is an arbitrary point in the space-time, and interpret the result exactly as before except with O replaced by Q (see Fig. 4.16). Specifically, if $\Delta S^2 < 0$, then the displacement QP represents motion at less than the speed of light, and so is a possible history of a massive particle or observer (Fig. 4.16b); we shall then call it time-like. If $\Delta S^2 = 0$, it represents motion at the speed of light, and so is a possible path of a zero-rest-mass particle (e.g. a photon); we shall then call it null or light-like. If $\Delta S^2 > 0$, it cannot represent motion of any particle, since it would be motion at greater than the speed of light; rather, it represents an instantaneous spatial displacement for some observer. We then call it space-like.

These are a more general form of the previous results; in fact, the previous calculations will follow on choosing Q to be O (with coordinates $(0, 0, 0, 0)$) here, cf. Exercise 4.12. The new formulation has several advantages.
One is that it is clear that expression (4.22) is invariant not only under boosts and rotations of the axes, but also under translations: that is, if we change the origin of coordinates, setting

$$t' = t + t_0, \quad X' = X + X_0, \quad Y' = Y + Y_0, \quad Z' = Z + Z_0,$$

for some choice of constants $t_0$, $X_0$, $Y_0$, $Z_0$, the values (4.21a,b) will be unchanged and so will the value (4.22). Thus the quantity $\Delta S^2$, the space-time separation between Q and P, is invariant under translations, boosts, and rotations. It enables us to work out the spatial or time differences measured by any inertial observer between any two points in the space-time from measurements made in A's frame, without having to make an explicit change of coordinates to that observer's frame. As an example, suppose that a particle B passes through the event Q with coordinates measured by A (in units of seconds) to be $(5, 1, 1, 1)$ and then through the event P with coordinates $(7, 2, 2, 2)$; what time interval does B measure between these events? We find immediately that $\Delta t = 7 - 5 = 2$ and $\Delta X = \Delta Y = \Delta Z = 2 - 1 = 1$; thus $\Delta S^2 = -4 + 1 + 1 + 1 = -1 = -1^2$. Hence this is indeed a possible particle path (since the result is negative) and the time measured between Q and P by the particle is 1 second.

Fig. 4.17 (a) In a flat space-time given in standard coordinates, the light cones at each point are parallel to each other. (b) The future of a point Q which lies on the future null cone $C^+(P)$ of a point P lies in the future of P; the null cones of Q are tangent to the null cone of P.

Also, the quantity $\Delta S^2$ enables us to characterize the speed of light at any event Q by determining those events P around Q for which $\Delta S^2 = 0$.
If we do this for many different choices of Q, we can see how the light cones at these different points relate to each other; in the case of the flat space-time of special relativity which we are examining at present, these light cones are parallel to each other (Fig. 4.17a).

Exercises

4.12 We are free to choose any point in space-time as the origin O of our coordinates. Choose the origin as the point Q in the calculation above. Then $(X^a_Q) = (0)$, i.e. $t_Q = X_Q = Y_Q = Z_Q = 0$ by definition. Verify that $(X'^a_Q) = (0)$, i.e. $t'_Q = X'_Q = Y'_Q = Z'_Q = 0$, and that therefore the calculation above leading to (4.23) reduces precisely to the previous calculation leading to (4.18). Deduce that all the results following (4.18) for positive, negative, and zero values of $S^2$, understood as a measure of separation from O, also hold for $\Delta S^2$ understood as a measure of separation from Q.

4.13 The light cone $C^+(P)$ of an event P is generated by the light rays through P. Show that the light cones of each point Q on these light rays are tangent to $C^+(P)$ (Fig. 4.17b) by deducing (a) that the interior of $C^+(Q)$ lies in the interior of $C^+(P)$; (b) that the interior of $C^-(Q)$ lies outside $C^+(P)$; and (c) that the light cones $C^+(Q)$, $C^-(Q)$ intersect $C^+(P)$ precisely in the light ray from P through Q. [It will be important later that these features remain true in curved space-times.]

The metric form

So far, we have determined the invariant $\Delta S^2$ for the straight line in space-time between any points Q and P. We now wish to generalize our results to any path from Q to P, so that we can for example determine the time measured between events Q, P by an arbitrarily accelerating observer. We generalize our results by first considering a piecewise straight path from Q to P, and then a general curved path between them.

Consider a path in space-time made up of connected straight line segments (Fig. 4.18a). We will assume that all these segments are time-like, i.e. $\Delta S^2 < 0$.
Then they each represent possible inertial (i.e. unaccelerated) motion of an observer or particle, so the whole path represents the history of an observer who moves inertially except for a finite number of times when he suddenly accelerates to a different velocity (e.g. by firing a very powerful rocket). On each inertial segment the proper time $\Delta\tau$ measured by the observer is $\Delta\tau = (-\Delta S^2)^{1/2}$, where $\Delta S^2$ is given by (4.22). In the idealization which we are considering, no proper time elapses during the accelerations (which we regard as instantaneous). Thus the total proper time $\tau$ measured to elapse along the path is

$$\tau = \sum (-\Delta S^2)^{1/2} = \sum (\Delta t^2 - \Delta X^2 - \Delta Y^2 - \Delta Z^2)^{1/2}, \tag{4.24}$$

where the sign $\sum$ represents summation of the expression over all the inertial segments (that is, the total proper time along the path is just obtained by adding up the proper times measured along each of these segments); here and in the sequel, '$\Delta t^2$' means $(\Delta t)^2$, etc. This is clearly an invariant (as each term in the sum is an invariant).

Fig. 4.18 (a) A time-like path made up of time-like straight (inertial) segments. (b) Paths made up of smaller and smaller straight (and therefore inertial) segments. (c) The limit of these paths is a smooth time-like path.

As an example, consider again the motion of the twins discussed in the 'twin paradox' (Section 3.4). Seen by A, twin B moves away for 10 years at a speed of $\tfrac{4}{5}c$ to a distance of 8 light-years, and returns in a further 10 years. Thus she moves on a broken geodesic where $(t, X)$ goes from $(0, 0)$ to $(10, 8)$ to $(20, 0)$ (we ignore $Y$ and $Z$, since they remain constant and so do not contribute to $\Delta S^2$). On the first leg $\Delta t = 10 - 0 = 10$ and $\Delta X = 8 - 0 = 8$. On the second leg $\Delta t = 20 - 10 = 10$ and $\Delta X = 0 - 8 = -8$. Thus $\tau = \{10^2 - 8^2\}^{1/2} + \{10^2 - (-8)^2\}^{1/2} = (100 - 64)^{1/2} + (100 - 64)^{1/2} = (36)^{1/2} + (36)^{1/2} = 6 + 6 = 12$ years, confirming our previous results.
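The sum (4.24) translates directly into a short routine. The sketch below is illustrative only: the function name `proper_time`, the list-of-vertices representation of a path, and the choice to raise an error for a segment that is not time-like are all assumptions of the example; units are such that $c = 1$. It is applied to twin B's broken path and to the straight path from $(0, 0)$ to $(20, 0)$.

```python
import math

def proper_time(path):
    """Sum (-dS^2)^(1/2) over the straight segments joining successive
    vertices (t, X, Y, Z) of a piecewise inertial path, as in eqn (4.24)."""
    total = 0.0
    for (t0, X0, Y0, Z0), (t1, X1, Y1, Z1) in zip(path, path[1:]):
        dS2 = -(t1 - t0) ** 2 + (X1 - X0) ** 2 + (Y1 - Y0) ** 2 + (Z1 - Z0) ** 2
        if dS2 >= 0:
            raise ValueError("segment is not time-like")
        total += math.sqrt(-dS2)
    return total

twin_B = [(0, 0, 0, 0), (10, 8, 0, 0), (20, 0, 0, 0)]   # out and back
straight = [(0, 0, 0, 0), (20, 0, 0, 0)]                 # single inertial segment

print(proper_time(twin_B))    # 12.0
print(proper_time(straight))  # 20.0
```

Because every term in the sum is built from an invariant $\Delta S^2$, the same totals would be obtained from the vertex coordinates assigned by any other inertial observer.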
On the direct path between the initial and final points travelled by A, we have $\Delta t = 20 - 0 = 20$ and $\Delta X = 0 - 0 = 0$; so $\tau = \{20^2 - 0^2\}^{1/2} = 20$, as expected.

Expression (4.24) enables us to determine what clock measurements would be along any time-like path in space-time made up of a finite number of inertial segments. However, general paths may have a direction that is continuously varying, and we wish to determine proper time along any feasible path of an observer. To do this, we consider piecewise inertial paths from Q to P with smaller and smaller inertial segments (Fig. 4.18b). In the limit as these segments shrink to zero, we obtain a smooth time-like path C (Fig. 4.18c). As long as the limiting value for $\Delta S^2$ remains negative for each segment as we take the limit, this represents a possible motion of an observer from Q to P, and the proper time $\tau$ measured by an observer moving along the path is the limit of the expression (4.24). It is conventional to write this limit as a line integral:
$$\tau = \int (-ds^2)^{1/2} \tag{4.25a}$$
where
$$ds^2 = -dt^2 + dX^2 + dY^2 + dZ^2, \tag{4.25b}$$
or equivalently,
$$ds^2 = -dt^2 + (dx^2 + dy^2 + dz^2)/c^2, \tag{4.25c}$$
where '$dt^2$' means $(dt)^2$, etc. (It would be more in line with the notation we have used previously to write $dS^2$ instead of $ds^2$; however, it is an almost universal convention to use the notation $ds^2$, so we shall do so here.) This is nothing other than a formalism for the limit of expression (4.24) as all the inertial segments are shrunk to indefinitely small lengths and the piecewise inertial path tends to the smooth world-line C. We may interpret this as representing the path C from Q to P as made up of 'infinitesimal' segments, each consisting of a displacement $(dt, dX, dY, dZ)$ from a point $P_1$ with coordinates $(t, X, Y, Z)$ to a point $P_2$ with coordinates $(t + dt, X + dX, Y + dY, Z + dZ)$ (Fig. 4.19), each of which (by (4.24)) contributes a proper time $d\tau = (-ds^2)^{1/2}$ (given by (4.25b)) to the total time $\tau$.
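The sum (4.24) and its limit (4.25a) can be tried out numerically. The following is a minimal Python sketch under stated assumptions: events are two-dimensional pairs (t, X) in units with c = 1, the function names `proper_time` and `hyperbola_path` are our own, and the curve $X^2 - t^2 = 16$ between $t = -3$ and $t = 3$ is chosen purely as an illustrative smooth time-like path.

```python
import math

def proper_time(points):
    """Total proper time along a piecewise inertial path.

    `points` is a list of (t, X) events in units with c = 1.
    Each straight segment contributes (dt^2 - dX^2)^(1/2), which is
    (4.24) with Y and Z held constant.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        dt, dx = t1 - t0, x1 - x0
        minus_ds2 = dt * dt - dx * dx      # -(Delta S^2) for this segment
        if dt <= 0 or minus_ds2 <= 0:
            raise ValueError("segment is not future-directed time-like")
        total += math.sqrt(minus_ds2)
    return total

def hyperbola_path(n):
    """n straight segments with vertices on X^2 - t^2 = 16, -3 <= t <= 3."""
    ts = [-3.0 + 6.0 * i / n for i in range(n + 1)]
    return [(t, math.sqrt(t * t + 16.0)) for t in ts]

# The twins: B travels out and back, A stays at home.
twin_b = proper_time([(0, 0), (10, 8), (20, 0)])
twin_a = proper_time([(0, 0), (20, 0)])
print(twin_b, twin_a)

# Shrinking the inertial segments along a smooth curve.
for n in (1, 10, 100, 1000):
    print(n, proper_time(hyperbola_path(n)))
```

For this particular curve the line integral (4.25a) can be evaluated in closed form, giving $8 \ln 2 \approx 5.545$. The printed sums approach this value from above, because each straight segment records more proper time than the curved arc it replaces.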
Expression (4.25a) thus simply states that the total time measured along the path is the sum of the contributions $d\tau$ from all these infinitesimal segments (cf. Appendix A). Invariant expressions such as (4.25b, c) are known as metric forms or intervals.

Fig. 4.19 Two points $P_1$ and $P_2$ on a smooth time-like curve, with coordinates differing by $dt$ and $dX$.

The Euclidean two-plane

This concept is illustrated now by considering how one measures length along an arbitrary curve C in the ordinary Euclidean two-plane. First consider using standard Cartesian coordinates $(x, y)$ (Fig. 4.20a). This length can then be written as
$$L = \int (ds^2)^{1/2} \tag{4.26a}$$
where
$$ds^2 = dx^2 + dy^2. \tag{4.26b}$$
(It is not appropriate to 'take the square root' in (4.26a), as the entity in the bracket is really the full expression $ds^2$ given by (4.26b).) Again we are regarding the total length as made up of contributions from segments representing displacements from $(x, y)$ to $(x + dx, y + dy)$, of length $(ds^2)^{1/2}$ where $ds^2$ is given by (4.26b). This expression is a line integral evaluating the length of any curve in the plane (similarly, expression (4.25) is a line integral evaluating the proper time along any time-like path in space-time). Again it is an invariant agreed on by all observers (as each of the infinitesimal contributions $ds^2$ is an invariant); in fact this is nothing other than repeatedly using Pythagoras' theorem (4.26b) applied to small line elements to estimate the length of the whole line.

Fig. 4.20 (a) A curve C in the Euclidean two-plane between points P and Q. Neighbouring points have coordinates differing by $dx$ and $dy$, and the distance between them can be found by Pythagoras' theorem. (b) A curve such that $dy = 0$ (that is, $y$ = constant) has $x$ as a curve parameter.

Expression (4.26) tells us the length along any curve segment $(dx, dy)$.
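The repeated use of Pythagoras' theorem described here can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper names `curve_length` and `semicircle` are hypothetical, and the curve is approximated by straight chords between sample points, so the result is an estimate of the line integral (4.26a) rather than its exact value.

```python
import math

def curve_length(points):
    """Sum of (dx^2 + dy^2)^(1/2) over successive points (x, y), cf. (4.26)."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def semicircle(R, n):
    """n + 1 sample points on the upper half of the circle of radius R."""
    return [(R * math.cos(math.pi * i / n), R * math.sin(math.pi * i / n))
            for i in range(n + 1)]

# Straight line between the end points of the semicircle of radius 1.
print(curve_length([(1.0, 0.0), (-1.0, 0.0)]))

# Chord approximations to the semicircle itself, with finer and finer steps.
for n in (4, 40, 400):
    print(n, curve_length(semicircle(1.0, n)))
```

As the number of chords grows, the estimate for the semicircle tends to $\pi R$, while the straight line between the same end points has length $2R$; the length obtained depends on the curve chosen.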
In understanding the meaning of (4.26), it is useful to consider first the specialization of this expression to curve segments on which only $x$ or only $y$ varies. Take the first case: if only $x$ varies along the curve, then $y$ is constant and so $dy = 0$ all along the curve (Fig. 4.20b). The expression (4.26) then reduces to
$$L = \int (dx^2)^{1/2} = \int dx = x_P - x_Q;$$
that is, distance along this curve is simply measured by the change in the coordinate $x$, so (4.26) tells us that $x$ is indeed a coordinate directly representing distance along the lines {$y$ = constant}. Similarly, $y$ is a coordinate directly representing distance along the lines {$x$ = constant}.

This will not be true for more general coordinates. As an example, change to plane polar coordinates $(r, \theta)$, where $r$ is the distance from the origin and $\theta$ is the angle from the $x$ axis (Fig. 4.21). Now (4.26b) will be replaced by the expression
$$ds^2 = dr^2 + r^2\, d\theta^2. \tag{4.27}$$
To see that this is correct, note that along the lines {$r$ only varies} the coordinate $\theta$ is constant; so $d\theta = 0$ along this line. Then (4.27) shows $ds^2 = dr^2 + 0 = dr^2$; but this is the square of the distance travelled. Thus $r$ directly measures distance along these curves (as required by its definition). On the other hand, along the curves {$\theta$ only varies} the coordinate $r$ is constant so $dr = 0$ along these curves. Then (4.27) shows $ds^2 = 0 + r^2\, d\theta^2 = r^2\, d\theta^2$; that is, distance along the curve element defined by $d\theta$ is $r\, d\theta$, and, because $r$ is constant and so is the same for all the curve elements, distance along the curve will be given by $r(\theta_P - \theta_Q)$ rather than just $\theta_P - \theta_Q$.

Fig. 4.21 The same curve as in Fig. 4.20(a) but now described by polar coordinates $r$ and $\theta$. The distance between neighbouring points is now given by Pythagoras' theorem from orthogonal displacements $dr$ and $d\theta$, through distances $dr$ and $r\, d\theta$ respectively.
This is precisely in accord with our usual understanding of the definition of an angle (measured in radians). Finally, for a general displacement, (4.27) says that the final result is given by Pythagoras' theorem from its components along the $r$ and $\theta$ directions. Clearly the total distance determined by this formula between two points P and Q along some curve from P to Q depends on the choice of this curve.

Euclidean three-space

As a further example, the geometry of Euclidean three-space is given in terms of Cartesian coordinates $(x, y, z)$ by the expression
$$ds^2 = dx^2 + dy^2 + dz^2, \tag{4.28a}$$
generalizing (4.26b) in an obvious way to three dimensions. However, in many cases a geometrical or physical situation may display spherical symmetry, and so one may wish to use spherical polar coordinates instead (Fig. 4.22a). If we use such coordinates $(r, \theta, \phi)$ instead of the coordinates $(x, y, z)$, the corresponding expression describing the Euclidean geometry is
$$ds^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2). \tag{4.28b}$$
One can read off directly from this form that (1) the coordinate $r$ directly represents distance travelled along the curves {$r$ only varies}, that is, the curves {$\theta$, $\phi$ constant}; however, (2) a coordinate increment $d\theta$ represents a distance $r\, d\theta$ along the curves {$\theta$ only varies}, that is, {$r$, $\phi$ constant}, and (3) a coordinate increment $d\phi$ represents a distance $r \sin\theta\, d\phi$ along the curves {$\phi$ only varies}, that is, the curves {$r$, $\theta$ constant}. This is indeed precisely the way distances relate to standard polar coordinates (see Fig. 4.22b).

Fig. 4.22 (a) Spherical polar coordinates $r$, $\theta$, and $\phi$ in Euclidean three-space. Here $r$ describes radial distance, $\theta$ is the angle between the radial direction and the $z$ axis, and $\phi$ describes rotation about this axis. (b) The distance between neighbouring points described in spherical polars is given by Pythagoras' theorem from orthogonal displacements $dr$, $d\theta$, and $d\phi$ through distances $dr$, $r\, d\theta$, and $r \sin\theta\, d\phi$ respectively.

Of course, the spatial geometries represented by (4.28a) and (4.28b) are the same; it is the coordinates used that differ. The important point to notice here is that when general coordinates are used, they will not directly represent distances even along these coordinate curves, but the relation between a coordinate increment and the actual distance travelled can be read off from the interval (in this case, from (4.28b)). Distances travelled along any curves will be given by (4.26a). Actually working out these expressions in the case of a general curved line may be complex (but if it is a coordinate line, the expression can often be evaluated without trouble). More details on the concept of a line integral needed to evaluate these distances are given in Appendix A.

Exercise

4.14 The circle C given by {$r = R$ = constant} passes through the point P at {$r = R$, $\theta = 0$} and the point Q at {$r = R$, $\theta = \pi$}. Show that (a) the straight line L from P to Q has length $2R$, (b) the segment of C joining P to Q for $0 < \theta < \pi$ has length $\pi R$. [Apply (4.26a), (4.27) first to the straight line joining P and Q, and then to the curve $r = R$.] Deduce that this circle has radius $R$, diameter $2R$, and circumference $2\pi R$.

Space-time

These examples have simply considered Euclidean spaces, where $ds^2 > 0$, described by different coordinate systems. In space-time, $ds^2$ is not constrained to be $> 0$ because of the minus sign in (4.25b). As intimated above, it will in this case represent in one quantity time measurements, spatial distance measurements, and the speed of light, according as $ds^2$ is negative, positive, or zero for the displacement $(dt, dX, dY, dZ)$ considered (Fig. 4.23). In particular, when $ds^2 < 0$, its magnitude is the square of proper time $d\tau$ measured along that displacement: $ds^2 = -d\tau^2$.

Fig. 4.23 Neighbouring points in a space-time diagram, with coordinates differing by $dt$ and $dX$.
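The three cases distinguished by the sign of $ds^2$ can be illustrated with a short Python sketch. It assumes the form (4.25b), with X, Y, Z measured in light-travel time; the function names `interval` and `classify` are our own, and the exact comparison with zero is adequate only for simple inputs like those below (a tolerance would be needed for general floating-point data).

```python
import math

def interval(dt, dX, dY=0.0, dZ=0.0):
    """ds^2 = -dt^2 + dX^2 + dY^2 + dZ^2, as in (4.25b)."""
    return -dt**2 + dX**2 + dY**2 + dZ**2

def classify(dt, dX, dY=0.0, dZ=0.0):
    """Return the character of a displacement and the associated measurement."""
    ds2 = interval(dt, dX, dY, dZ)
    if ds2 < 0:
        return "time-like", math.sqrt(-ds2)   # proper time d(tau) along it
    if ds2 > 0:
        return "space-like", math.sqrt(ds2)   # spatial distance (light-travel time)
    return "null", 0.0                        # displacement along the light cone

print(classify(5, 4))       # more time than distance
print(classify(3, 5))       # more distance than time
print(classify(5, 3, 4))    # exactly on the light cone
```

The first displacement corresponds to possible motion of an observer, the third to motion at the speed of light, and the second to neither.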
Equation (4.25a) now enables us to calculate proper time measured along any world-line in space-time (even if it is accelerated, i.e. represents non-inertial motion). As an example, on the curve {$x = y = z = 0$}, which is the world-line of the observer who set up the coordinates, the equations $dx = dy = dz = 0$ hold, so (4.25b) reduces to $ds^2 = -dt^2$, and (4.25a) shows that the coordinate $t$ does indeed measure proper time along this particular world-line. Further, when $ds^2 = 0$ we have a displacement along the light cone, i.e. motion at the speed of light. Since all other kinematic quantities, e.g. spatial distances, can be determined from such measurements, the metric form enables us to make all the basic space-time measurements we may wish. When standard coordinates are used, $ds^2$ will be given by (4.25c), but if other coordinates are used it will be given by some other expression. For example, if we use spherical polar coordinates, the spatial part (4.28a) will be replaced by (4.28b). Then
$$ds^2 = -dt^2 + \{dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\}/c^2. \tag{4.29}$$
This enables us to work out measured time intervals along any world-line, in terms of these coordinates, from expression (4.25a). As in the spatial case, the time interval measured along a time-like curve from P to Q will depend on the choice of that curve, and this is the source of the 'twin paradox'.

As a final example of use of the form $ds^2$, suppose observer A sees a particle move past at a speed $v$. Let $\eta = (dt, dX, dY, dZ)$ be a displacement along the particle world-line in standard rectangular coordinates (Fig. 4.24a). Then the corresponding proper time experienced by the particle is
$$d\tau = (-ds^2)^{1/2} = (dt^2 - dX^2 - dY^2 - dZ^2)^{1/2} = (dt^2 - dr^2/c^2)^{1/2} = \{1 - (dr/dt)^2/c^2\}^{1/2}\, dt$$
where $dr^2 = dx^2 + dy^2 + dz^2 = c^2(dX^2 + dY^2 + dZ^2)$ gives the spatial distance measured by A along $\eta$. Now $v$ = (change of distance)/(change of time) = $dr/dt$, so
$$d\tau = (1 - v^2/c^2)^{1/2}\, dt \iff dt = \gamma(v)\, d\tau, \tag{4.30}$$
and we have regained the time-dilation result (3.20) directly from $ds^2$. (Strictly speaking, one should integrate this result up to determine relative clock measurements for finite time intervals along the world-lines, but the meaning of (4.30) in terms of 'infinitesimal displacements' is quite clear.)

Fig. 4.24 (a) A displacement $(dt, dX)$ along the world-line of a particle moving at speed $v$ relative to the observer A. The corresponding proper time $d\tau$, measured by a clock moving with the particle along this displacement, is related to $dt$ by the time-dilation relation $dt = \gamma(v)\, d\tau$, which shows $dt \geq d\tau$ with $dt = d\tau$ if and only if $v = 0$. (b) Several piecewise inertial paths joining two time-like separated points P and Q. The longest time will be measured along the path $\lambda$, the straight line path between them. (c) The same situation as seen by an observer B moving inertially between Q and P. Clearly (from (a)) proper time along each inertial segment on $\lambda_1$ and $\lambda_2$ will correspond to a longer time as measured by B; thus proper time from Q to P along these paths will be less than along the single inertial path $\lambda$. (d) Displacements $\eta_1$ and $\eta_2$ in space-time. Their scalar product is defined by eqn (4.31).

Now consider two points P and Q whose separation is time-like. Let $\lambda$ be the time-like straight line joining them and $\tau$ be the proper time from P to Q measured by an observer whose world-line is $\lambda$. From an examination of (4.24) and (4.30) it then becomes clear that a shorter time will be measured by any observer whose history is any other piecewise straight line joining P and Q (cf. Fig. 4.24b). Taking the limit, as in (4.25), it becomes clear that a shorter time than $\tau$ is measured along every other time-like line from P to Q. Thus the longest time between P and Q is measured by an observer who moves uniformly, that is, without acceleration, between P and Q (cf. the discussion of the 'twin paradox' in Section 3.4 above). The space-time diagram from his viewpoint is shown in Fig. 4.24c.

The scalar product

A generalization of the invariant metric form is the scalar product between two displacements. Let $\eta_1 = (dt_1, dX_1, dY_1, dZ_1)$ and $\eta_2 = (dt_2, dX_2, dY_2, dZ_2)$ be any two displacements (Fig. 4.24d). Their scalar product is then the quantity
$$\eta_1 \cdot \eta_2 = -dt_1\, dt_2 + dX_1\, dX_2 + dY_1\, dY_2 + dZ_1\, dZ_2. \tag{4.31}$$
As in the case of $ds^2$, this is easily seen to be an invariant by use of (4.21c); it generalizes $ds^2$ because $ds^2 = \eta \cdot \eta$. However, it gives us further interesting information; for example, if an observer moves along a world-line segment characterized by the displacement $\eta_1$, the displacement $\eta_2$ is instantaneous for him if and only if $\eta_1 \cdot \eta_2 = 0$. This may easily be seen by choosing the rest frame of the observer, in which $\eta_1 = (dt_1, 0, 0, 0)$: then $\eta_1 \cdot \eta_2 = -dt_1\, dt_2 = 0$, which implies $dt_2 = 0$, so $\eta_2$ is indeed an instantaneous displacement (for that observer). By the same method it can be shown that if $\eta_1$ is time-like and $\eta_2$ time-like or null (with both pointing into the future), then $\eta_1 \cdot \eta_2 < 0$. We have given the scalar product here only when standard (Minkowski) coordinates are used; the generalization to any coordinates is given in Appendix B.

Having defined the scalar product, we are now able to prove analytically the result of Exercise 4.13 as follows. Take a point T inside $C^+(Q)$. The displacement PT equals PQ + QT. Then
$$(PT)^2 = (PQ + QT)^2 = (PQ)^2 + 2\, PQ \cdot QT + (QT)^2.$$
Now $(PQ)^2$ is zero and $PQ \cdot QT$ and $(QT)^2$ are both negative. Hence $(PT)^2$ is negative and T lies inside $C^+(P)$.

Conclusion

In this section, we have looked at the invariants related directly to measurements of time and distance in space-time. There are other important invariants we have not considered here, related to energy, momentum, and the electromagnetic field; they are most easily constructed by using the tensor formalism discussed in Appendix B. Some of those invariants are introduced there and in Appendix C.
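The properties of the scalar product (4.31) can be checked numerically. The Python sketch below is an illustration under stated assumptions: displacements are 4-tuples $(dt, dX, dY, dZ)$ in units with c = 1, the function names are our own, and `boost_x` uses the standard Lorentz transformation for motion along the X axis, which we assume agrees with the form referred to as (4.21c).

```python
import math

def scalar_product(eta1, eta2):
    """eta1 . eta2 = -dt1 dt2 + dX1 dX2 + dY1 dY2 + dZ1 dZ2, as in (4.31)."""
    (t1, x1, y1, z1), (t2, x2, y2, z2) = eta1, eta2
    return -t1 * t2 + x1 * x2 + y1 * y2 + z1 * z2

def boost_x(eta, v):
    """Components of eta for an observer moving at speed v along X (|v| < 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = eta
    return (g * (t - v * x), g * (x - v * t), y, z)

# An observer moving at speed 0.6, and a displacement orthogonal to his world-line.
eta1 = (1.0, 0.6, 0.0, 0.0)
eta2 = (0.6, 1.0, 0.3, 0.0)
print(scalar_product(eta1, eta2))          # vanishes: eta2 is instantaneous for him
print(boost_x(eta1, 0.6))                  # purely along t' in his rest frame
print(boost_x(eta2, 0.6))                  # has dt' = 0 in his rest frame

# Invariance of the scalar product for two arbitrary displacements.
a, b = (2.0, 0.5, 1.0, -0.3), (1.0, -0.2, 0.4, 0.7)
print(scalar_product(a, b), scalar_product(boost_x(a, 0.8), boost_x(b, 0.8)))

# The argument for Exercise 4.13: PQ null, QT time-like, both future-directed.
PQ, QT = (1.0, 1.0, 0.0, 0.0), (2.0, 0.5, 0.0, 0.0)
PT = tuple(p + q for p, q in zip(PQ, QT))
print(scalar_product(PT, PT))              # negative, so T lies inside C+(P)
```

The last line agrees with the expansion of $(PT)^2$ term by term: $0 + 2(-1.5) + (-3.75) = -6.75$.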
Exercises

4.15 (i) In the Euclidean two-plane, consider a path as shown in Fig. 4.25a, joining {$x = 0$, $y = -a$} and {$x = 0$, $y = +a$} via the point {$x = \lambda a$, $y = 0$}. Find the length $L$ (given by (4.26a)) of the path, and show that the shortest path (i.e. the minimum value of $L$) corresponds to $\lambda = 0$. (ii) Now in a two-dimensional space-time consider a path as shown in Fig. 4.25b, joining {$t = -a$, $x = 0$} and {$t = +a$, $x = 0$} via the point {$t = 0$, $x = \lambda a$}. Find the proper time $\tau$ (given by (4.25)) along the path, and show that the longest proper time (i.e. the maximum value of $\tau$) corresponds to $\lambda = 0$.

4.16 Illustrate how you would use the metric form to determine the K-factor for two observers in relative motion by working through the following exercise. Suppose that the metric form in a two-dimensional space-time is $ds^2 = -a^2\, dt^2 + b^2\, dx^2$ where $a$ and $b$ are positive constants. Observer A is at rest at $x = 0$ and emits light signals at $t = t_1$ and $t = t_2$. Observer B moves at speed $v$ relative to A, passing him at $t = 0$. Calculate (i) the equations of the light rays sent by A; (ii) the coordinates of the points where B receives the signals; (iii) the interval $\Delta s_1$ between the emission events, and the interval $\Delta s_2$ between the reception events; (iv) the ratio $K = \Delta s_2/\Delta s_1$.

4.17 (a) Prove that the scalar product (4.31) is an invariant. (b) Suppose that an observer O determines both the displacements $\eta_1$ and $\eta_2$ to be instantaneous. Show that the scalar product (4.31) then reduces to the expression $\eta_1 \cdot \eta_2 = dX_1\, dX_2 + dY_1\, dY_2 + dZ_1\, dZ_2$ which determines both distances and angles in Euclidean space (e.g. if $\eta_1 \cdot \eta_2 = 0$ then the displacements are orthogonal to each other).

4.18 Two-dimensional flat space-time has the metric form $ds^2 = -dt^2 + dX^2$ (obtained from (4.25b) by setting $dY = dZ = 0$). On choosing a new coordinate $v$ defined by $v = t + X$ instead of $t$, then $dv = dt + dX$ and in terms of the coordinates $(v, X)$ the interval becomes
$$ds^2 = -dv^2 + 2\, dv\, dX. \tag{*}$$
Deduce from this that a curve {$v$ = constant} is a light ray, but a curve {$X$ = constant} is time-like. Sketch these curves in a space-time diagram. On further choosing the coordinate $w = t - X$ instead of $X$, then $dw = dt - dX$ and in terms of the coordinates $(v, w)$ the metric form becomes
$$ds^2 = -dv\, dw. \tag{**}$$
Show from this that the curves {$v$ = constant} and the curves {$w$ = constant} are light rays (for this reason, these coordinates are called null coordinates). Sketch these curves in a space-time diagram. Check that if we define a new null coordinate $u = -v$, the metric form becomes
$$ds^2 = du\, dw. \tag{***}$$

Computer Exercise 13

Write a program that will accept as input (a) coordinates (TP, XP) and (TQ, XQ) for the initial point P and final point Q of a time-like curve, (b) an integer N indicating the number of intermediate points to be specified, (c) coordinates T(I) and X(I) for each of these intermediate points R(I) (I = 1 to N). It should give as output the total proper time T measured by an observer moving from P to Q along the piecewise inertial path P → R(1) → R(2) → ... → R(N) → Q. [The program must check that the total path and each of these segments is indeed time-like.]

A uniformly accelerated path between P with coordinates ($t = -3$, $X = 5$) and Q with coordinates ($t = 3$, $X = 5$) satisfies the equation $t^2 - X^2 = -16$. Choose a series of N points R(I) on this curve (I = 1 to N) between P and Q and determine the proper time T from P to Q along the piecewise inertial path defined by these points. Show that as N gets larger and larger, T tends to a limit TL, the proper time from P to Q along the uniformly accelerated path. [One way to choose the points is to choose a set of values for T ($-3 < T < 3$) and then solve the equation $X^2 = T^2 + 16$ for X.]

4.3 Some flat-space universes

We shall now illustrate some of the ideas of the previous sections by looking briefly at three cosmologies in flat space-time. These examples are included to show some intriguing possibilities that arise in the case of special relativity (when gravitational effects are negligible).
Similar effects occur in the curved space-times of general relativity, considered in the following chapters (when gravity is taken properly into account). For the sake of simplicity, we will concentrate mainly on two-dimensional examples which show the major features of full four-dimensional versions of these space-times. If you find the details heavy-going, then omit them at a first reading and turn to the discussion of curved space-times.

Matter in the universe

In the real universe, we observe matter (stars and dust) clustered into galaxies and clusters of galaxies (Fig. 1.10) which are measured to have systematically increasing redshifts as their distance from us increases (Fig. 3.4). This suggests there is a well-defined average motion of matter in each region in the universe (e.g. in our local region of the universe, the motion of our supercluster of galaxies). Therefore, a model of the universe must specify both the space-time itself and this average motion of matter. We will call a space-time a model universe when a family of preferred world-lines is specified in it,* representing the average motion of matter at each point in space-time (Fig. 4.26). These world-lines, which we call fundamental world-lines, then represent the history of galaxies or observers moving precisely with the average motion of matter at each point (not all matter will move in this way; for example, cosmic rays will be moving at high speed relative to most matter). We refer to observers moving with precisely this average velocity as fundamental observers, and analyse the behaviour of the universe model in terms of the observations of presumed fundamental galaxies (moving with the preferred velocity) made by such (idealized) fundamental observers.
Given a universe model, we can test how good a representation of the real universe it is by comparing observations of galaxies in the real universe with the observations predicted by that model for fundamental observers.

Fig. 4.26 A model universe is a space-time together with a family of world-lines representing the average motion of matter at each space-time point. An observer moving with this average motion is called a fundamental observer.

* In a complete cosmological model we will also have to specify many other physical features of the matter in the universe, but in this book we examine only the space-time geometry of these universe models.

As has been mentioned above, the universe models we look at here do not attempt to represent the nature of gravity (which will be discussed in the next section). Instead they are based on the symmetries of flat space-time, which define a structure for space-time that picks out particular classes of world-lines as 'naturally preferred', so we choose these for the world-lines of the fundamental observers. We look at three such models: the Minkowski universe, the Rindler universe, which has many properties similar to those of a black hole, and the Milne universe, which is a simple expanding universe model. We will discuss curved-space universe models of the black-hole type and the expanding type in Chapters 6 and 7 respectively.

Minkowski universes

We first consider a two-dimensional version of this universe model, and then a four-dimensional version.

A two-dimensional Minkowski universe

This is just the two-dimensional flat space-time of special relativity with the metric form given in terms of coordinates $(t, X)$ by
$$ds^2 = -dt^2 + dX^2, \tag{4.32a}$$
the world-lines of the fundamental observers being lines {$X$ = constant}, and the number density of galaxies being uniform in the surfaces {$t$ = constant}, which are surfaces of instantaneity for all the fundamental observers (Fig. 4.27a). This universe model is based on the translation invariance of the space-time: the world-lines are moved into themselves by the time-translation symmetry
$$t' = t + t_0, \quad X' = X, \tag{4.33a}$$
where $t_0$ is any constant. This, in particular, implies that the world-lines stay a constant distance from each other. They are moved into each other by the spatial translation symmetry
$$X' = X + X_0, \quad t' = t, \tag{4.33b}$$
where $X_0$ is any constant. This implies spatial homogeneity; in particular, the symmetry leaves invariant the density of matter in the spatial surfaces {$t$ = constant}. Note that (4.33) are space-time symmetries because the form (4.32) is clearly invariant under them (cf. (4.22) and the following comments).

The static, uniform distribution of matter

We can think of this universe either in the continuum approximation where a world-line is defined through every space-time point, or we can conveniently think of it in discrete terms, where there are still an infinite set of uniformly distributed world-lines, but not one through every point. Then we start with the world-line L passing through the event O {$X = 0$, $t = 0$}, and generate all the other world-lines (see Fig. 4.27b) by (i) repeatedly applying a spatial translation (4.33b) to it for some suitable value of $X_0$ to determine the events $O_n$ where the world-lines intersect the initial surface {$t = 0$}; (ii) applying the time-translation (4.33a) to determine the world-lines $L_n$ in space-time from these initial events.

Fig. 4.27 The Minkowski universe. (a) The world-lines of the fundamental observers, representing the average motion of matter in the universe, are {$X$ = constant} and their surfaces of instantaneity are {$t$ = constant}. (b) Construction of the universe by (i) repeatedly applying a spatial translation through a distance $X_0$ to the world-line L through the origin to determine the initial points of these world-lines in the surface $t = 0$, and (ii) applying time translations to these events for all values $t_0$ to determine the world-lines in space-time. By this construction, the density of matter measured in the surfaces {$t$ = constant} is uniform.

The distribution of world-lines so created is necessarily time-invariant (since it is defined by a time translation which is a space-time symmetry). It is also spatially homogeneous in the initial surface {$t$ = constant} by construction (the world-lines are all the same distance $X_0$ apart from each other in this surface), and will remain spatially homogeneous when the time translation (4.33a) is applied to determine it elsewhere in space-time, because the initial symmetry is preserved by this time-invariance symmetry (the distance $X_0$ between the world-lines is maintained at all later times). Because of the spatial homogeneity, the density function representing the number of galaxies per unit spatial distance will be spatially constant; because of the time symmetry, this density is also constant in time.

A four-dimensional Minkowski universe

This is the four-dimensional flat space-time of special relativity with the invariant metric form given in terms of coordinates $(t, X, Y, Z)$ by
$$ds^2 = -dt^2 + dX^2 + dY^2 + dZ^2, \tag{4.32b}$$
the world-lines of the fundamental observers being lines {$X$, $Y$, $Z$ constant}, and the number density of galaxies being uniform in the surfaces {$t$ = constant}, which are surfaces of instantaneity for all the fundamental observers. The properties of this space-time are clear immediately from the discussion above of the two-dimensional version (which is just the section of the full four-dimensional space-time obtained on setting $Y = Z$ = constant in (4.32b)).

This is the simplest kind of universe model: a static, uniform distribution of matter in a flat space-time, without beginning or end and without spatial limit. It is rather uninteresting: there are no observed redshifts or blueshifts, and the density of matter in the universe is uniform in time and space. The model does not correspond to the real universe, where systematic galactic redshifts are observed; we include it mainly for contrast with the other two to follow, and to illustrate in a familiar context some of the methods we will use in the rest of this section. There is a universe model with curved space-time, the Einstein static universe, which is similar to the Minkowski universe discussed here; we will discuss it in Chapter 7.

We conclude examination of this universe model by considering briefly three conceivable methods of estimating the distance of an object in such a space-time: distance by apparent angle, by apparent luminosity, and by apparent brightness. This detailed material is included because similar methods will be used later in examining the properties of curved space-times; it may be omitted on a first reading.

Apparent size

To determine how apparent sizes will appear in these universes, we change to spherical polar coordinates $(r, \theta, \phi)$ so that the metric form becomes
$$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{4.32c}$$
(cf. (4.29); we have chosen units for the radial coordinate $r$ that set the speed of light to unity) where now the fundamental world-lines are the lines {$r$, $\theta$, $\phi$ constant}.
It follows immediately from this form that $r$ measures directly the distance from the origin along the radial curves {$t$, $\theta$, $\phi$ constant}. Now consider a linear object of length $D$ lying transverse to this radial line at distance $r$ (Fig. 4.28a); without loss of generality we can choose the polar coordinates so that the object lies in a surface {$\phi$ = constant}, with its ends at $\theta = \theta_1$ and $\theta = \theta_2$ respectively. The interval along the rod measured at an instant {$t$ = constant} by a fundamental observer is then $ds^2 = r^2\, d\theta^2$ (from (4.32c) on setting $dt = 0 = dr = d\phi$). Its length is then given by $D = r(\theta_2 - \theta_1)$.

Fig. 4.28 (a) A rod of length $D$ lying perpendicular to the line of sight from the observer at a distance $r$. The apparent angular size of the rod is $\alpha$. (b) We estimate the distances of objects such as cars by observing their apparent angle $\alpha$, and deducing the distance to them because we know approximately what their length is.

Thus on defining the apparent angular size of the object as $\alpha = \theta_2 - \theta_1$, this is related to the length of the object by
$$\alpha = D/r \tag{4.34}$$
showing that the apparent size of the object is proportional to its length $D$ and inversely proportional to its distance $r$. It is effectively through this equation that we estimate distance of objects in everyday life: for example our eye estimates the apparent angle of a car as it passes (Fig. 4.28b), we know the approximate size $D$ of the car, and so we can deduce its distance $r$ (from eqn (4.34)). If the object is not at rest relative to the observer, or does not lie transverse to the line of sight, the calculation becomes more complex but still follows directly from (4.32).

Apparent luminosity

We wish to calculate the rate at which energy is received by an observer at a distance $r$ from a star. For generality, we will not assume the observer is at rest relative to the star.
To be precise, we will assume that the star is at rest at the origin $r = 0$ of coordinates for which the metric form is (4.32c), and the observer is moving radially outwards so as to measure a redshift $z$ for the received radiation (Fig. 4.29a). Suppose the star is measured in its rest frame to emit radiation uniformly in all directions at a rate $L$ ergs/sec. This radiation is carried by photons, the energy of each photon being $E = h\nu$ where $h$ is a constant and $\nu$ is the frequency of the radiation, related to its wavelength $\lambda$ by $c = \nu\lambda$. The rate at which photons are emitted by the star will then be $L/E = L/h\nu$ photons per second. Assuming that photons are conserved, after travelling a distance $r$ from the star (as measured in the star's frame) they will all arrive at the observer, at which distance they will be spread over a sphere of area $4\pi r^2$ (Fig. 4.29b).

Fig. 4.29 (a) An observer moving relative to a star in flat space-time measures a redshift $z$ in radiation received from the star. (b) When radiation from the star arrives at the distance $r$ at which the observer is situated, it has spread out over an area $4\pi r^2$. (c) The solid angle $\Omega$ is the apparent size of the object as seen by the observer; it can be thought of as the amount of sky covered by the star.

Because of the K-factor effect (see the redshift relations (3.3, 4)) the rate at which these photons arrive will be a factor $1 + z$ slower, in the observer's rest frame, than the rate at which they were emitted in the rest frame of the star; thus the rate at which photons arrive per unit area will be measured by the observer as
$$R = (L/h\nu)(1/4\pi r^2)\{1/(1 + z)\}.$$
Now the energy per photon measured by the observer is $h\nu'$ where $\nu'$ is the frequency measured by the observer, related to $\nu$ by $\nu'/\nu = 1/(1 + z)$.
Consequently the flux of radiation (i.e. the energy received per unit area per unit time) measured by the observer from the star is

$F = R h\nu' = (L/4\pi)/\{r(1 + z)\}^2$. (4.35)

This is the basis of measurement of distance by apparent luminosity. We can measure the flux $F$ and redshift $z$ by use of telescopes and appropriate detectors. If we are able to estimate the intrinsic luminosity $L$ of the star (e.g. by our knowledge of the luminosity of other stars whose distance can be determined by other means) then we can find the distance $r$ of the star from (4.35). This method of distance estimation is widely used in astronomy, e.g. to estimate the distance of distant galaxies. In the Minkowski universe, the flux measured by a fundamental observer will be given by this equation with $z = 0$, which is nothing other than the inverse-square law for the flux of light received from an object (since $F$ is then simply proportional to $1/r^2$).

Apparent brightness

The flux $F$ is the total radiation received from an object. When observing an extended object such as a galaxy, what our instruments directly record is actually its apparent brightness, i.e. the flux received per unit solid angle, in the wavelength band lying in its range of sensitivity (for example, this is what is recorded by our eye or by a photographic plate). The solid angle $\Omega$ is the amount of the sky covered by the image of the object. It is defined by the equation $S = r'^2\Omega$ where $S$ is the cross-section area of the star, and $r'$ is the distance measured to the object by the observer (Fig. 4.29c). The observed intensity of radiation $I$ (the brightness at all wavelengths) is the flux received per unit solid angle, i.e. $I = F/\Omega = F r'^2/S$.
(4.36a)

Now the relation between $r$ (the distance measured between the object and observer by someone stationary relative to the star) and $r'$ (the same distance measured by the observer) is $r' = r/(1 + z)$, which is effectively eqn (3.25) applied to the present situation (it is clear that these distances must be related by $K = 1 + z$ rather than $\gamma$ because the light we are concerned with travels one way, from the source to the observer, rather than both ways; the solid angle is the solid angle subtended by the source at the time of observation, not at the present time as deduced by radar). Combining this result with (4.35) and (4.36a) shows that

$I = I_0/(1 + z)^4$, (4.36b)

where $I_0 = L/(4\pi S)$ is the surface brightness of the star. This shows that in flat space-time, the observed intensity of radiation from a given source is independent of the distance between the observer and the source; it depends only on their relative motion. In the case of the Minkowski universe, a fundamental observer will measure the same intensity of radiation from a source, no matter how far he is from it (as $z = 0$ then). Thus, it is not possible to use observed intensity (or surface brightness, i.e. the measured intensity in restricted wavelength bands) alone to estimate the distance of an observed object.

Exercise

4.19 In a Minkowski universe every past light ray from an event P would eventually intersect a star. Prove that the redshift observed by a fundamental observer is zero for every star (assuming each star moves at the fundamental velocity). Deduce from eqn (4.36b) that if the stars shone continuously in such a universe, the entire night sky would be as bright as the surface of a star, contrary to our experience that the sky is dark at night (this is Olbers' paradox). What conservation law shows stars cannot shine continuously (i.e. puts a limit on the possible lifetime of a star)?
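The relations (4.35), (4.36a) and (4.36b) can be checked numerically. The following is a minimal sketch, assuming units in which the luminosity, area and distance are simply positive numbers; the function names are illustrative and do not come from the text. It evaluates the flux from (4.35), builds the intensity from (4.36a) with $r' = r/(1+z)$, and compares it with the closed form (4.36b).

```python
import math


def observed_flux(lum, r, z):
    """Flux F = (L / 4 pi) / {r (1 + z)}^2, eqn (4.35)."""
    return lum / (4.0 * math.pi * (r * (1.0 + z)) ** 2)


def intensity_closed_form(lum, area, z):
    """Intensity I = I0 / (1 + z)^4 with I0 = L / (4 pi S), eqn (4.36b)."""
    return lum / (4.0 * math.pi * area) / (1.0 + z) ** 4


def intensity_from_flux(lum, area, r, z):
    """Intensity I = F r'^2 / S with r' = r / (1 + z), eqn (4.36a)."""
    r_observer = r / (1.0 + z)
    return observed_flux(lum, r, z) * r_observer ** 2 / area


if __name__ == "__main__":
    lum, area, z = 1.0, 1.0e-3, 0.5  # illustrative values only
    for r in (1.0, 10.0, 100.0):
        # The flux falls off with distance, the intensity does not.
        print(r, observed_flux(lum, r, z), intensity_from_flux(lum, area, r, z))
```

In this sketch the printed flux changes with $r$ while the intensity column stays fixed, which is the distance independence stated after (4.36b).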
The two-dimensional Rindler universe

Although this model universe is based on flat space-time, it displays some of the essential features of a black hole (which we shall examine in Section 6.2). It is based on the 'boost'-invariance of flat space-time, and may most easily be constructed as in the previous example. Start with flat space-time given in terms of coordinates $(t, X)$ and with $ds^2$ from (4.32a). Use the spatial translations (4.33b) to determine the initial positions of a family of world-lines in the surface $\{t = 0\}$ through the origin O, resulting in an initially uniform distribution of matter as in the previous case. We now use the boosts about O (eqn (4.37a) below) to determine the world-lines elsewhere from their initial positions (Fig. 4.30). As discussed above (cf. (4.23)) the interval is invariant, so this determines the world-lines in such a way that the distance $X_0$ between them in their surfaces of instantaneity remains constant at all later times. The result is clearly different from the Minkowski universe.

Fig. 4.30 The Rindler universe. The world-lines L of the fundamental observers are obtained by boosts (see Fig. 4.6(b)) applied to their initial positions at equal distances along the surface $\{t = 0\}$. The boosts move the surface $\{t = 0\}$ into the surfaces $\{\beta = \beta_0\}$, $\{\beta = 2\beta_0\}$, ..., in terms of the parameter $\beta$ (eqn (4.44)).

Explicitly, a general point P on each line L is obtained from the initial event $(X', t')$ by a boost

$X = \gamma(V)(X' + Vt')$, $t = \gamma(V)(t' + VX')$ (4.37a)

for some value $V$ for the relevant change of velocity, where $\gamma(V) = (1 - V^2)^{-1/2}$; thus $V$ ($|V| < 1$) serves as a parameter along the world-line L. For every value of $V$, the boost preserves the invariant $S^2$ giving the distance from O to P (see eqns (4.16-18)), which on each world-line L takes the value at the initial point:

$-t^2 + X^2 = \rho^2 = \text{constant}$. (4.37b)

This is therefore the equation for the fundamental world-lines. These curves are sketched in Fig.
4.31; they are all asymptotic to the light cone through O at large values of $|X|$. As the world-line L passes through the point $\{t' = 0, X' = \rho\}$, a general point on L can be expressed in terms of this initial point via eqn (4.37a) as

$X = \gamma(V)\rho$, $t = \gamma(V)V\rho$. (4.37c)

In this form, $V$ is a parameter along the curve that is labelled by the value $\rho$. Note that the point O is a fixed point of these boosts, so this procedure does not generate a world-line through O itself; for later purposes it will be convenient to define the world-line $L_0$ to be given by $\{X = 0\}$.

Fig. 4.31 The world-lines $S^2 = \rho^2$ in the Rindler universe, and their surfaces of simultaneity, which are also surfaces of homogeneity (i.e. of constant density).

These universe models have many interesting properties, which we will investigate in turn.

(A) Constant relative distances By construction, the world-lines are invariant under the Lorentz transformations (boosts) about O; therefore, they maintain a constant distance from each other at all times. This does not at first appear to be the case in Fig. 4.31, but is clear because they lie in surfaces at a constant distance from O (see (4.37b)). The point is that the surfaces of instantaneity for this whole family of observers are the straight lines $L_V$ through O; at every point on each surface $L_V$ the angle to the horizontal is the same, but at later and later times on each world-line (corresponding to larger and larger values of $V$) the $L_V$ tilt up more and more relative to the X-axis, asymptotically approaching the light cone. This is because these observers are accelerating: at every time on each world-line, the speed relative to the $t$ and $X$ axes is increasing, so the lines tilt over at an angle $\alpha$ from the vertical which steadily increases towards 45°.
Correspondingly, the surfaces of instantaneity tilt up from the X-axis by the same angle $\alpha$; hence larger and larger length contraction effects make a constant distance (for an observer L) look shorter and shorter (to the stationary observer $L_0$, who is not a fundamental observer).

The event O is at a strangely privileged position for this family of observers. It is regarded by each observer L to be simultaneous with every event in his history (because all their surfaces of instantaneity intersect here) and to be always at the same distance from him. Conversely, every observer at the event O (no matter what his velocity) will measure the same distance to an observer L.

By contrast, an observer with world-line $L_0$ has surfaces of instantaneity $\{t = \text{constant}\}$, and by (4.37b) will measure all the observers L to be approaching him (until the event O) and then moving further and further away from him (after the event O). That observer will measure the density of matter to be uniform at the time $t = 0$ (because it was constructed to be uniform then) but not at any other time, because, as (4.37b) shows, the instantaneous ($t$ = constant) spatial distance $X_1 - X_2$ measured by $L_0$ between two fundamental world-lines depends on the time $t$.

Nevertheless the universe model is spatially uniform for the fundamental observers. The space-time symmetries (4.33b) combined with (4.37a) act in the surfaces of instantaneity $L_V$, showing the space-time itself is uniform on these surfaces. Also, the distance between the world-lines is measured to be constant on these surfaces, so the fundamental observers will measure the density of matter to be constant on them. Thus they will be seen to be surfaces of homogeneity in this universe model.

(B) Uniform acceleration Since the world-lines L are not straight lines, each observer is moving non-inertially.
Because of the construction of these world-lines by the use of Lorentz transformations, which preserve space and time intervals and will uniformly increment the velocity for the same time step on each world-line, they are determined in such a way that each observer will measure his rate of change of speed relative to his proper time to be a constant, i.e. he is in a state of constant acceleration. From the force law (3.35b), this would require a constant force (e.g. a steadily firing rocket engine) to keep each observer on his orbit. However, as seen by $L_0$, these world-lines move closer and closer to the speed of light but never exceed it (in accord with the limiting nature of the speed of light).

While these statements are obvious once one appreciates the role of the Lorentz transformation as a map of the space-time into itself that preserves space and time measurements, it is interesting to verify these results explicitly. Consider the event $Q = (t, X)$ on the world-line L: $\{\rho = \rho_0\}$, mapped into another event $Q' = (t', X')$ on L by (4.37a) for some specific value $\Delta V$ of $V$. Then the proper time between Q and Q' is $\Delta\tau$ given by

$\Delta\tau^2 = \Delta t^2 - \Delta X^2$ (4.38a)

where

$\Delta t = t' - t$, $\Delta X = X' - X$. (4.38b)

Substituting from (4.37a) with relative velocity $\Delta V$, we obtain

$\Delta\tau^2 = [t' - \gamma(\Delta V)(t' + \Delta V X')]^2 - [X' - \gamma(\Delta V)(X' + \Delta V t')]^2 = (t'^2 - X'^2)(\{1 - \gamma(\Delta V)\}^2 - \gamma^2(\Delta V)(\Delta V)^2)$. (4.38c)

Since Q' is on L, $t'^2 - X'^2 = -\rho_0^2$. On simplifying the terms involving $\Delta V$ and $\gamma(\Delta V)$ (using $\gamma^2(1 - \Delta V^2) = 1$), one finds

$\Delta\tau^2 = 2\rho_0^2\{\gamma(\Delta V) - 1\}$ (4.39a)

showing that $\Delta\tau$ is a constant on the world-line, for a given $\Delta V$. This is the time measured moving on a straight line from Q to Q', which is nearly the same as the time $\Delta_L\tau$ measured moving from Q to Q' along L if $\Delta V$ is small. Now $\Delta V$ is the change in velocity undergone by the observer in that time (Fig. 4.32a). Thus, the acceleration undergone in that time is $\Delta V/\Delta_L\tau$. In the limit of small $\Delta V$ it follows that $\gamma - 1 \approx \frac{1}{2}\Delta V^2$ and (4.39a) then shows $\Delta\tau = \rho_0\,\Delta V$.
(4.39b)

Also $\Delta_L\tau \approx \Delta\tau$, so the proper acceleration $A = dV/d\tau$, which is the limit of $\Delta V/\Delta_L\tau$ for small $\Delta V$ and so for small $\Delta_L\tau$, is given by

$A = \rho_0^{-1}$ (4.40)

confirming that the acceleration is constant on each world-line and is smaller the further the world-line is from O. This is exactly similar, for example, to a static observer maintaining a constant radial distance from the centre of the Earth: he is held at this constant distance by a constant force, usually supplied by the floor, and the size of force needed decreases with distance from the centre of the Earth (Fig. 4.32b). This similarity between uniformly accelerated observers and a uniform gravitational field will turn out later to be of fundamental importance.

Fig. 4.32 (a) Two neighbouring points Q and Q' on the world-line L ($\rho = \rho_0$) have velocities differing by $\Delta V$. (b) Just as the acceleration required to move on the uniformly accelerated path L decreases with distance $\rho$, so does the force required to keep an observer on a path of uniform acceleration at constant distance from the centre of the Earth (in everyday life, that force is exerted on us by the floor; without the floor we would fall freely towards the Earth's centre).

(C) Redshifts measured by fundamental observers Because the observers are not moving inertially, the analysis of Sections 3.1 and 3.2 no longer holds. However, we can easily calculate the observed K-factor for this family of observers. Consider light emitted at an event $r_1$ by an observer $O_1$ on the world-line $L_1$: $\{\rho = \rho_1\}$ and received at an event $r_2$ by an observer $O_2$ on the world-line $L_2$: $\{\rho = \rho_2\}$ (Fig. 4.33). Under the boost (4.37a), for some chosen value of $\Delta V$, light rays are mapped into light rays. Thus, if $r_1$ is mapped to $r_1'$ on $L_1$ and $r_2$ is mapped to $r_2'$ on $L_2$, then the light ray from $r_1$ to $r_2$ is mapped to a light ray from
$r_1'$ to $r_2'$. By (4.39a), the proper time $\Delta\tau_1$ from $r_1$ to $r_1'$ is given by

$\Delta\tau_1^2 = 2\rho_1^2\{\gamma(\Delta V) - 1\}$, (4.41a)

and the proper time $\Delta\tau_2$ from $r_2$ to $r_2'$ is given by

$\Delta\tau_2^2 = 2\rho_2^2\{\gamma(\Delta V) - 1\}$. (4.41b)

Fig. 4.33 Light is emitted at event $r_1$ on the world-line $O_1$ ($\rho = \rho_1$) and received at event $r_2$ on the world-line $O_2$ ($\rho = \rho_2$). When event $r_1$ is boosted to the event $r_1'$ at a proper time $\Delta\tau_1$ later, the light ray is boosted to another light ray linking these world-lines (since both light rays and the world-lines are invariant under these Lorentz transformations). The second ray is emitted at $r_1'$ on $O_1$ and received at event $r_2'$ on $O_2$, a time $\Delta\tau_2$ after $r_2$.

Taking the ratio of these equations, we find $\Delta\tau_1/\rho_1 = \Delta\tau_2/\rho_2$; hence the time intervals are related by

$K = \Delta\tau_2/\Delta\tau_1 = \rho_2/\rho_1$. (4.42)

This expression is independent of $\Delta V$, so on considering the limit for small $\Delta V$, it gives the observed K-factor at each instant and by (3.3) determines the redshift observed by $O_2$ for radiation emitted by $O_1$. This redshift is due to the accelerated motion of the observers; since it depends only on the ratio of the two distances $\rho_1$ and $\rho_2$, it is independent of time. The redshift increases as $\rho_2$ increases and as $\rho_1$ decreases, and diverges if either $\rho_1 \to 0$ or $\rho_2 \to \infty$.

(D) Redshifts relative to a stationary observer A more complex calculation determines the K-factor if the emission events $r_1$ and $r_1'$ are on the exceptional world-line $L_0$ through the origin O (Fig. 4.34). One finds after a certain amount of algebra that

$\Delta\tau_1 = \rho_2^2\{1 - [(1 - \Delta V)/(1 + \Delta V)]^{1/2}\}/\{t_2 + (t_2^2 + \rho_2^2)^{1/2}\}$

where $t_2$ is the time of reception of the signal at the event $r_2$, while $\Delta\tau_2$ is given by (4.41b). Taking the ratio determines $K$. In the limit of small $\Delta V$ and dropping the subscript '2', one finds

$K = \{t + (t^2 + \rho^2)^{1/2}\}/\rho$. (4.43)

This gives both blueshifts (for negative $t$, as L approaches $L_0$) and redshifts (for positive $t$, as L recedes from $L_0$) of indefinitely large magnitude for $t$ large enough in magnitude.
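The two K-factors (4.42) and (4.43) lend themselves to a numerical cross-check. The sketch below is illustrative only: it assumes units with $c = 1$, the function names are hypothetical, and the finite-$\Delta V$ estimate follows the construction described above (two forward light rays from $L_0$, with the reception event boosted by (4.37a)) rather than any procedure given explicitly in the text.

```python
import math


def gamma(v):
    """Lorentz factor gamma(V) = (1 - V^2)^(-1/2), with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)


def boost(t, x, v):
    """Boost (4.37a) about O applied to the event (t, x)."""
    g = gamma(v)
    return g * (t + v * x), g * (x + v * t)


def k_between_fundamental(rho1, rho2):
    """K-factor between two fundamental observers, eqn (4.42)."""
    return rho2 / rho1


def k_from_stationary(t, rho):
    """K-factor for light emitted on L0 and received on rho at time t, eqn (4.43)."""
    return (t + math.sqrt(t * t + rho * rho)) / rho


def k_from_stationary_numeric(t2, rho, dv=1e-5):
    """Finite-dV estimate of the same K-factor from two neighbouring light rays."""
    x2 = math.sqrt(rho * rho + t2 * t2)       # reception event lies on X^2 - t^2 = rho^2
    t2b, x2b = boost(t2, x2, dv)              # boosted reception event, also on that world-line
    # Forward rays from X = 0 satisfy X = t - t1, so the emission time is t1 = t - X.
    d_tau1 = (t2b - x2b) - (t2 - x2)          # proper time on L0 equals coordinate time
    g = gamma(dv)
    gamma_minus_one = g * g * dv * dv / (g + 1.0)   # numerically stable form of gamma - 1
    d_tau2 = math.sqrt(2.0 * rho * rho * gamma_minus_one)  # eqn (4.41b)
    return d_tau2 / d_tau1


if __name__ == "__main__":
    for t in (-5.0, 0.0, 5.0):
        print(t, k_from_stationary(t, 1.0), k_from_stationary_numeric(t, 1.0))
```

Values of $K$ below 1 correspond to blueshifts and values above 1 to redshifts, so the sign of $t$ in the printed table shows the transition described in (D).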
(E) The event horizon A little reflection on the last example or on Fig. 4.31 will show that the observer on $L_0$ can only receive signals from the observer on L when $t > 0$, but can only send signals to him when $t < 0$. Thus, any fundamental observer L cannot send a signal to $L_0$ and receive an answer! In fact, it is clear (Fig. 4.35) that all events for which $t - X > 0$ cannot send signals to L, while all events for which $t + X < 0$ cannot receive signals from L. The surface $\{t = X\}$ is called an event horizon for these fundamental observers. All the events 'the other side' of the horizon, i.e. for which $t > X$, are forever hidden from the fundamental observers: they can never know what happens there.

Fig. 4.34 Light signals emitted from the exceptional world-line $L_0$ at events $r_1$ and $r_1'$, and received by the uniformly accelerating observer L.

To clarify this, suppose an observer L in a spaceship moving as a fundamental observer at time $t = 0$ releases an astronaut in a capsule which then falls freely (i.e. no forces act on it). Since it moves inertially, its world-line is a straight line C (Fig. 4.36). At any time until the capsule crosses the event horizon at the event Q, the astronaut could return to the spaceship by turning on a sufficiently powerful rocket motor. However, after the event Q, the capsule can never return to the spaceship: it would have to move faster than light to do so. It can be thought of as 'trapped' by the event horizon, a surface in space-time which it cannot cross in one direction. Neither can it send any signals to the spaceship to tell what has happened to it. As far as the outer world ($t < X$) is concerned, the astronaut has then effectively ceased to exist.

Suppose C sends out signals at regular intervals that are received by L (Fig. 4.36). For simplicity, suppose the event Q is measured by C to occur at 12:00 noon.
Then the regular signals sent out before 12:00 noon will all eventually be received by L, but the 12:00 signal will not, neither will any subsequent signal. Watching C's clock through a telescope, L will never see it reach 12:00 o'clock. In fact, the regular signals will be received by L at longer and longer time intervals, the last minute to noon in C's history being seen by L in an infinite length of time; that is, the Doppler-shift factor $K$ diverges and the redshift becomes infinite. This is clear from the diagram because this last minute is seen by L over his entire remaining history. It also follows directly from (4.43), because $t \to \infty$ on L's world-line in the distant future. As the redshift diverges, the image intensity will decrease to zero (by eqn (4.36b)). Thus observing C continuously, L will see all activity on C slowing down indefinitely; the observed redshift will increase without limit, and the image will fade away. The event Q and all subsequent events will be unobservable to L, but as far as C is concerned, nothing special at all will happen there.

Fig. 4.35 The event horizons $t = \pm X$ in a Rindler universe. A fundamental observer with world-line L cannot send signals to events in the region $t < -X$ behind the past event horizon $t = -X$, and cannot receive signals from events in the region $t > X$ lying behind the future event horizon $t = +X$.

Fig. 4.36 At $t = 0$, a fundamental observer L (in a spaceship) releases a capsule which then moves inertially on the world-line C. Before the event Q when C crosses $t = X$, the capsule can send light signals to L, but after Q this is impossible. Thus if Q occurs at 12:00 o'clock as measured by C's watch, events after 12:00 in his history cannot be observed by L; thus L will consider them to be 'hidden behind the event horizon'.
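The slowing down of the capsule's signals can be made quantitative with a short calculation. This is a sketch under stated assumptions, not a construction taken from the text: the capsule is assumed to share the spaceship's velocity at release, which by (4.37c) is zero at $t = 0$, so C is the line $X = \rho$ and crosses the horizon $t = X$ at capsule time $s = \rho$ (with $c = 1$). A signal sent at capsule time $s$ travels along $X = \rho + (t - s)$ and meets L where $X^2 - t^2 = \rho^2$. The function names are hypothetical.

```python
import math


def reception_time(s, rho):
    """Coordinate time t at which L receives the signal sent by C at capsule time s."""
    if not 0.0 <= s < rho:
        raise ValueError("the signal must be sent before C crosses the horizon at s = rho")
    gap = rho - s  # capsule time remaining before the horizon crossing
    # Solve sqrt(rho^2 + t^2) = t + gap for t.
    return (rho * rho - gap * gap) / (2.0 * gap)


def proper_time_on_L(t, rho):
    """Proper time of L since t = 0, from t = rho sinh(beta) and tau = rho beta."""
    return rho * math.asinh(t / rho)


if __name__ == "__main__":
    rho = 1.0
    for k in range(1, 7):
        s = rho * (1.0 - 10.0 ** (-k))  # signals sent ever closer to the crossing
        t = reception_time(s, rho)
        print(s, t, proper_time_on_L(t, rho))
```

Under these assumptions the reception proper time works out to $\rho \ln\{\rho/(\rho - s)\}$, which grows without bound as $s \to \rho$: each factor of ten closer to the crossing costs L a further fixed stretch of proper time.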
This behaviour is exactly similar to that of a particle watched by an outside observer as it crosses the event horizon of a black hole (see Chapter 6).

(F) The metric form Finally, it is interesting to see how the metric form (4.32a) is transformed if we change to coordinates adapted to the symmetry of the world-lines. We do so by using as coordinates $\rho$ (given by (4.37b)) and a quantity $\beta$ determined from $\tau$ by the relation $d\tau = \rho\,d\beta$ along the world-lines (this relation is just the infinitesimal limit of relation (4.39b)). These are comoving coordinates for the fundamental observers: $\rho$ labels the world-lines, and $\beta$ is a time parameter (but not proper time) along them. Explicitly, $\beta$ is the 'hyperbolic velocity' related to $V$ in (4.37) by $V = \tanh\beta$; then $\gamma(V) = \cosh\beta$ and $V\gamma(V) = \sinh\beta$. This implies (4.37c) can be written*

$X = \rho\cosh\beta$, $t = \rho\sinh\beta$ (4.44a)

(to check this, use (4.32a) and (4.44a) to determine $d\tau$ along the world-lines on which $d\rho = 0$). From the definition of $\beta$ and the fact that $\rho$ measures radial distance, the metric form may be written

$ds^2 = -\rho^2\,d\beta^2 + d\rho^2$. (4.44b)

The static nature of the solution is apparent, because the metric and the world-lines (given by $\rho$ = constant) are independent of the time variable $\beta$. One should note that the form (4.44b) covers only that part of the space-time where there are fundamental world-lines, i.e. the region of the universe outside the future event horizon $t = X$ (discussed above) and the past event horizon $t = -X$ (whose properties we have not investigated here).

*Here, $\cosh\beta = \frac{1}{2}\{\exp\beta + \exp(-\beta)\}$, $\sinh\beta = \frac{1}{2}\{\exp(\beta) - \exp(-\beta)\}$, $\tanh\beta = (\sinh\beta)/\cosh\beta$, where exp is the exponential function, which can be given in terms of a power series by $\exp x = 1 + x + x^2/2! + x^3/3! + x^4/4! + \cdots$. From these relations, it follows that $\cosh^2\beta - \sinh^2\beta = 1$, $\cosh 0 = 1$, $\sinh 0 = 0$, $\tanh 0 = 0$ (more details of these 'hyperbolic functions' may be found in any standard book on calculus).
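The step from (4.44a) to (4.44b) can be written out explicitly. The following worked derivation is a sketch that assumes (4.32a) has the form $ds^2 = -dt^2 + dX^2$ with $c = 1$, which is consistent with (4.37b) and (4.38a) but is not quoted in this section.

```latex
\begin{align*}
dX &= \cosh\beta\,d\rho + \rho\sinh\beta\,d\beta, \\
dt &= \sinh\beta\,d\rho + \rho\cosh\beta\,d\beta, \\
ds^2 &= -dt^2 + dX^2 \\
     &= (\cosh^2\beta - \sinh^2\beta)\,d\rho^2
        - \rho^2(\cosh^2\beta - \sinh^2\beta)\,d\beta^2 \\
     &= d\rho^2 - \rho^2\,d\beta^2 .
\end{align*}
```

The cross terms in $d\rho\,d\beta$ cancel between $-dt^2$ and $dX^2$. Setting $d\rho = 0$ gives $d\tau^2 = -ds^2 = \rho^2\,d\beta^2$, i.e. $d\tau = \rho\,d\beta$ along each world-line, in agreement with the relation used to define $\beta$.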
Exercises

4.20 (a) Explain why it is necessary for a force to act to keep a fundamental observer in a Rindler universe on his world-line. In what way might one produce the required force? (b) Noting that this force (measured at each instant in the observer's rest-frame) must be constant for an infinite proper time along his world-line, what physical considerations suggest that this would be difficult to achieve in practice in some circumstances?

4.21 Find and sketch the paths of light rays in a Rindler universe in terms of the coordinates in the interval (4.44b). What is the coordinate speed of light at a point $(\rho, \beta)$?

4.22 (a) Derive (4.39) and (4.40) from the preceding equations; (b) derive the formula (4.43) for the redshift relative to a stationary observer as follows. (i) Write down the equations of the forward light-rays through the events $r_1$ $(t_1, 0)$ and $r_1'$ $(t_1', 0)$. (ii) Use these equations to relate $t_1$ and $t_1'$ to the coordinates of $r_2$ $(t_2, X_2)$ and $r_2'$ $(t_2', X_2')$ where the light rays meet the path of the observer $O_2$: $\rho = \rho_2$. (iii) Use $t_2' = \gamma(t_2 + \Delta V X_2)$, $X_2' = \gamma(X_2 + \Delta V t_2)$, $X_2^2 - t_2^2 = \rho_2^2$ to eliminate $t_2'$, $X_2'$, and $X_2$. (iv) Find a formula for $K$ by taking the ratio of $\Delta\tau_2$ to $\Delta\tau_1$. In the limit of small $\Delta V$ you should obtain (4.43).

4.23 Investigate the properties of the past event horizon $t = -X$ [consider an observer on $L_0$ observing the fundamental world-lines, and show that radiation emitted in an infinite proper time by a fundamental observer L is received by $L_0$ in a finite proper time]. Will infinite redshifts be associated with this horizon? What will be the apparent flux of radiation?

The Milne universe

In this case, we again start off with two-dimensional flat space-time given in coordinates $(t, X)$ and with metric form (4.32a). Let the world-line $L_0$ be the line $\{X = 0\}$ which passes through the origin O. Choose a value $\Delta V_0$, and repeatedly use the boost (4.37a) with $V$ chosen as $\pm\Delta V_0$, to generate a family of world-lines which all pass through O (Fig. 4.37). These are the world-lines of the fundamental
These are the world-lines of the fundamental 4.3 Some flat-space universes 179 world lines Fig. 4.37 The Milne universe. The world-lines are generated by repeatedly applying a boost through a speed -+-AVo to the world-line Lo. The surfaces of uniformity (or homogeneity) are given hy S? 2 observers in this model universe which represents an expanding universe. We look in turn at its major features. (A) Equivalent world-lines By the construction from a series of boosts, which leave all space and time measurements invariant, the world lines are all equivalent to each other; each fundamental observer will determine the same history for the universe model as every other one. Thus the universe model obeys the cosmo- logical principle: all the fundamental observers are equivalent to each other. This basic assumption, formalizing the idea that we are not in a privileged position in the universe, underlies the standard models of the expanding universe used by cists today astrop! Because the world-line Lo is a straight line representing inertial (i.e. non- accelerated) motion, the same is true for the world-lines of all the other funda- mental observers in this universe. Since (4.37a) is repeated infinitely often, an infinite number of fundamental world-lines are obtained by this construction; thus these universe models will contain an infinite number of galaxies. (B) Homogeneous spatial sections The surfaces S are defined to be at constant space-time distance from O; that is, they are the surfaces P-Xs= (7° = constant), (4.45) Because the world-lines are straight lines, 7 is just proper time measured along these world-lines from O; so these surfaces are surfaces of constant proper timein the history of the fundamental observers. The boost (4.37a) leaves these surfaces invariant and so moves the intersection Q of any world-line L with asurface S toa point Q’ representing the intersection of another world-line L’ with the same surface S. 
Because the world-lines are generated by repeated use of the transformation (4.37a) with the same value of $\Delta V$, they are equally spaced in the surface S. By a calculation similar to that leading to (4.39a), the spatial distance $\Delta\rho$ between Q and Q' is given by

$\Delta\rho^2 = 2\tau^2\{\gamma(\Delta V) - 1\}$; (4.46a)

in the limit of small $\Delta V$ ($\Delta V \ll 1$), this becomes

$\Delta\rho = \tau\,\Delta V$. (4.46b)

Just as we arrived at the invariant metric form (4.44b) in the Rindler universe, if we here use $(\tau, \beta)$ as coordinates for this universe model, where $d\rho = \tau\,d\beta$, we obtain the metric form

$ds^2 = -d\tau^2 + \tau^2\,d\beta^2$ (4.47a)

for these space-times, where the fundamental world-lines are the curves $\{\beta = \text{constant}\}$. As before, $\beta$ is the hyperbolic velocity related to $V$ in (4.37a) by $V = \tanh\beta$; because the curve $L_0$ goes through the point $\{t = \tau, X = 0\}$ we can express the transformation (4.37a) in this case as

$t = \tau\cosh\beta$, $X = \tau\sinh\beta$, (4.47b)

where $\beta$ labels the fundamental world-lines and $\tau$ is proper time along them.

The spatial homogeneity of the space-time is manifest here, because the form (4.47a) is independent of the spatial variable $\beta$. The spatial distance between two world-lines of the family of fundamental observers, measured in a surface S ($d\tau = 0$), will be

$\rho = \int_{\beta_1}^{\beta_2} \tau\,d\beta = \tau(\beta_2 - \beta_1)$, (4.48)

where $\beta_1$ and $\beta_2$ are the values of $\beta$ on the world-lines. By construction, the values of $\beta$ on neighbouring world-lines differ by $\tanh^{-1}\Delta V_0$, so by (4.48) their spatial separations are all the same. Since they are uniformly spaced in these surfaces, the density of matter in the surface S, determined by the number of world-lines per unit spatial distance, is constant. Thus these spacelike surfaces are surfaces of homogeneity in these universe models. By contrast, on the surfaces $\{t = \text{constant}\}$ the density is non-uniform; in fact it diverges at the boundary, where $t \to \pm X$, because the surfaces cross an infinite number of world-lines as they approach it (Fig. 4.38).
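The equal spacing of the world-lines in a surface S can be verified directly from (4.47b). The sketch below is illustrative, with hypothetical function names and $c = 1$: it labels the world-lines by $\beta_k = k\tanh^{-1}\Delta V_0$, computes the invariant separation of neighbouring events on the surface of proper time $\tau$, and compares it with (4.46a).

```python
import math


def milne_event(tau, beta):
    """Event at proper time tau on the world-line labelled beta, eqn (4.47b)."""
    return tau * math.cosh(beta), tau * math.sinh(beta)


def world_line_labels(dv0, n):
    """Labels beta_k = k * atanh(dv0) for k = -n..n, from repeated boosts by dv0."""
    step = math.atanh(dv0)
    return [k * step for k in range(-n, n + 1)]


def invariant_separation(tau, beta_a, beta_b):
    """Straight-line spatial interval between two events on the same surface S."""
    ta, xa = milne_event(tau, beta_a)
    tb, xb = milne_event(tau, beta_b)
    return math.sqrt((xb - xa) ** 2 - (tb - ta) ** 2)


def separation_formula(tau, dv):
    """Right-hand side of eqn (4.46a): sqrt(2 tau^2 {gamma(dv) - 1})."""
    g = 1.0 / math.sqrt(1.0 - dv * dv)
    return math.sqrt(2.0 * tau * tau * (g - 1.0))


if __name__ == "__main__":
    tau, dv0 = 2.0, 0.3  # illustrative values only
    labels = world_line_labels(dv0, 3)
    for a, b in zip(labels, labels[1:]):
        print(invariant_separation(tau, a, b), separation_formula(tau, dv0))
```

Every neighbouring pair gives the same separation, and doubling $\tau$ doubles it, which is the linear scaling with proper time used in the discussion of the expansion.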
One should note here that the uniform space-like sections are infinite in extent. The point is that the coordinate $\beta$ in (4.47) takes all values from $-\infty$ to $+\infty$, so at each time $\tau$ there exist galaxies separated by distances (4.48) that are unboundedly large.

(C) Linear expansion and observed redshifts From (4.46) and (4.48), the spatial distance between any two fundamental world-lines measured in a surface of homogeneity S scales linearly with the proper time $\tau$. Thus the matter in this universe model is expanding uniformly. Since their motion is inertial, the K-factor analysis of Chapter 3 applies to the fundamental observers, and their observations of distant galaxies will show a redshift increasing systematically with distance (cf. Fig. 3.4).

Fig. 4.38 A boost through $\Delta V$ applied to the event Q on the world-line L moves it to the event Q' where a second world-line L' intersects the same surface of homogeneity; clearly L' is at a larger distance from $L_0$ than L, and is moving at a higher speed relative to $L_0$ than L. The surface $t = t_0$ is not a surface of homogeneity because it crosses an infinite number of world-lines near the boundary ($t = \pm X$).

Fig. 4.39 (a) An observer on the galaxy with fundamental world-line $L_0$ sees all the other galaxies to be receding from him in all directions. (b) The same is true for an observer on any other galaxy with fundamental world-line L' say.

While a fundamental observer L will measure all other galaxies to be receding linearly from him, this does not imply that he is at the centre of the expansion: on the contrary, every other observer will observe exactly the same thing. Indeed, all galaxies measure all others to be receding linearly from them and there is no centre to the expansion, all the galaxies being equivalent to each other.
While our diagrams suggest the world-line $L_0$ is privileged, this is just because we have drawn them in terms of coordinates centred on that galaxy. We could choose any other galaxy L' and centre the coordinates on it; the picture obtained would be exactly the same, except now centred on L' (Fig. 4.39). The relative speed of motion of the most distant galaxies approaches the speed of light, no matter which observer makes the observation. (Each member of the family of world-lines is equivalent under the Lorentz transformation 'boosts', and so the kinematic properties of special relativity, as discussed in the previous chapter, keep appearing here in new guises.)

Fig. 4.40 (a) A projector throws a picture of a cluster of galaxies on a screen. (b) If the screen is moved steadily further and further away, the images of the galaxies on the screen move further and further apart from each other; the appearance is just like that of an expanding universe. (c) The relation between the Hubble constant $H_0$ in a Milne universe, and its age $\tau_0$.

This feature of every galaxy receding equally from every other one is perhaps difficult to grasp at first, but can be visualized in the following manner: consider a projector throwing images of a cluster of galaxies on a distant screen (Fig. 4.40a). If the screen is moved further away from the projector, the whole scene depicted increases in scale and the image of each galaxy moves away from the image of each other galaxy without there being a centre to this apparent expansion (Fig. 4.40b). Thus, if the screen is steadily moved away, one will see visually depicted on it the expansion of a small section of the universe. The space-time diagram formed from a succession of these images on the screen will be just like Fig. 4.37.
(D) The Hubble constant The Hubble constant $H_0$ measures the rate of expansion of the universe at a specified time $\tau_0$. It is defined as the rate of change of distance to a nearby galaxy per unit proper time divided by the distance to that galaxy, this ratio being evaluated at the time $\tau_0$. In the case considered here, we see from (4.46) for a given pair of galaxies at times $\tau_1$ and $\tau_2$ that $\Delta\rho_1 = \tau_1\Delta V$ and $\Delta\rho_2 = \tau_2\Delta V$, so the change of distance in time $\Delta\tau = \tau_2 - \tau_1$ is $\Delta\rho_2 - \Delta\rho_1 = \Delta\tau\,\Delta V$. Hence $H_0 = (\Delta\tau\,\Delta V/\Delta\tau)/(\tau\,\Delta V) = 1/\tau$ evaluated at the time $\tau_0$, i.e. $H_0 = 1/\tau_0$, which clearly decreases with the age of the universe (Fig. 4.40c).

(E) Initial singularity Since the expansion is linear, if it is followed back in time to O ($\tau = 0$), there is a 'Big Bang' at O where all the matter world-lines intersect (by (4.48), the distance between every pair of galaxies goes to zero there). Clearly then the matter density is infinite at the origin O. However, since the surfaces S are surfaces of constant density, this means that the matter density increases everywhere on these surfaces as $\tau \to 0$, and so goes infinite at all points on the boundary (Fig. 4.41). Accordingly, this boundary should really be regarded as the edge of the universe model, because the spatially homogeneous region where the matter is expanding and has finite density is bounded by this surface. Thus, having constructed the universe model, it is regular only within the region bounded by the surfaces $t = \pm X$, and the exterior region should be discarded because it is separated from the expanding universe region by infinite-density surfaces.

While there is an edge to the galaxy distribution in each surface $\{t = \text{constant}\}$, when we exclude the exterior region we cannot really regard the model as representing an expansion of the matter in the universe into a surrounding vacuum. How can we then interpret what is happening?
The key is to note that there is no boundary or edge to the galaxies in the surfaces of homogeneity $\tau$ = constant. Thus when analysed in terms of these surfaces, the expansion does not take place into a surrounding vacuum or anything else, but is simply a continuous increase in distance between every pair of galaxies in these surfaces. This gives a complete description of the universe model, because these surfaces completely cover the space-time region representing the expanding universe (Fig. 4.41).

Fig. 4.41 The 'Big Bang': at the point O where all the world-lines intersect, the density of matter is infinite. As the surfaces shown are surfaces of constant density, the density is also infinite on the limiting surfaces, which are therefore the boundary of this universe model: the spatially homogeneous expanding universe region comes to an end at these surfaces where the matter density diverges. The event O is the beginning of the universe.

Fig. 4.42 The past light-cone $C^-(p)$ of an event p at time $\tau_0$ on a world-line L intersects all the other fundamental world-lines in the universe before reaching the boundary surface. Thus the observer on L can see all the galaxies in the universe. However, the furthest spatial distance which L can have measured by radar at that time is $\tfrac{1}{2}\tau_0$, the distance to the event R where $C^-(p)$ intersects the boundary.

The past light-cone $C^-(p)$ of any point p on a world-line L intersects all the other world-lines back to the boundary. Thus in principle each fundamental observer can at all times see and communicate with every other galaxy in the universe, even though there are an infinite number of them. By (3.10a) the Doppler shift factor will diverge as one looks to earlier and earlier times (i.e. to galaxies for which $\tau \to 0$ and the relative velocity $v \to c$), so by (3.3) the redshift will also diverge there and by (4.36b) the intensity of received light will fade away to zero.
By contrast, although at each time $\tau_0$ each observer can receive signals from all the other galaxies in the universe, the distance measured by radar to the limiting observable event R in any direction would be just $\tfrac{1}{2}\tau_0$, so one might say that the size of the observed universe is just $\tau_0$. Every fundamental observer would agree on this measurement (Fig. 4.42).

Four-dimensional Milne universes

One can construct four-dimensional flat-space Milne universe models that have all the essential features discussed above; these will be presented in Chapter 7. Since these are flat space-times, eqns (4.35) (with $r = \rho$) and (4.36) will determine the observed flux and intensity of radiation in such universes. These universe models display many features of the curved-space-time expanding universe models which we will examine in Chapter 7.

Exercises

4.24 In a diagram of the Milne universe, draw in the world-lines of some inertially moving particles. Why will each such particle eventually be at rest relative to the fundamental observers and matter around it? Suppose a particle is emitted from the origin at time $t = t_0$ and moves freely with speed $V_0$. Which is the furthest fundamental observer (i.e. the one with the largest value of V) which this particle can reach, given an infinite amount of time?

4.25 Derive eqns (4.46a) and (4.46b).

4.26 Suppose the Hubble constant is measured to be $H_0$ = 50 km/sec per Mpc, where one 'megaparsec' (Mpc) is $3.26 \times 10^6$ light years, and the age of the oldest stars in globular clusters in the universe is established to be $16 \times 10^9$ years. Is this data consistent with a Milne universe model? What if we find that the Hubble constant is really 100 km/sec per Mpc?

4.27 Deduce from the interval (4.47a) that the redshift z of light observed by a fundamental observer A at time $\tau_0$ for light emitted by a fundamental galaxy at time $\tau_e$ is given by $1 + z = \tau_0/\tau_e$.
Hence prove that the redshift observed by A at a given time $\tau_0$ will diverge as he examines spectra emitted by galaxies at earlier and earlier times (i.e. as $\tau_e \to 0$). What does this imply about the measurements A might make of the flux or intensity of radiation emitted by galaxies at very early times in the history of this universe? [For simplicity, assume here that the light emitted by each galaxy is constant throughout its history.]

5 Curved space-times

5.1 The general concept

Our discussions so far have all been concerned with flat space-times, where we can choose physical coordinates so that all the light cones are parallel to each other. This is possible because, in a flat space-time, initially parallel light rays remain parallel to each other. In curved space-times, the situation is radically different. According to Einstein's general theory of relativity, in which gravitational fields are represented through space-time curvature, the gravitational fields of massive objects not only curve the paths of other massive objects but also bend light rays (Fig. 5.1); in fact, observation of this effect gave the first experimental verification of the correctness of general relativity (in 1919). This feature affects the causal and observational properties of curved space-times in intriguing ways.

Fig. 5.1 The bending of light rays by the gravitational field of a massive object; the paths in space and in space-time are no longer straight.

The concept of a curved space is familiar from everyday life. For example, the surface of a football is a two-dimensional curved space, as is the surface of a doughnut; but we do not include the surface of a cylinder in this category, because a cylinder can be opened out onto a plane without distortion. In fact, for a two-dimensional surface, a lot can be learned about its curvature by attempting to lay
it out flat on a plane after making appropriate cuts where necessary. If distortion, gaps, or overlap arise at any point in this process then the surface is curved there. If the surface has positive curvature (e.g. the summit of a hill) there will be gaps in the projection onto the plane (Fig. 5.2a). If the curvature is negative (e.g. the saddle-shaped surface between two neighbouring hills) there will be overlap in the projection (Fig. 5.2b).

Fig. 5.2 (a) A surface with positive curvature. Because the circumference of a circle of radius r is less than $2\pi r$, if we flatten a section of it onto the plane it will tear, and there will then be gaps in this projection onto the plane. (b) A surface with negative curvature. Because the circumference of a circle of radius r is greater than $2\pi r$, if we flatten a section of it onto the plane it will fold and there will then be overlaps in this projection onto the plane (see 'The mathematics of three-dimensional manifolds', W. P. Thurston and J. R. Weeks, Scientific American, July 1984, pp. 103 and 106).

Fig. 5.3 The equator and lines of constant longitude are great circles ('geodesics') on the surface of the Earth.

Geometrical relationships in curved spaces differ from those in flat spaces. As an example, consider the surface of a sphere; we can regard this as an idealized model of the surface of the Earth. Great circles are the curves in this surface where any plane through the centre of the sphere intersects it, e.g. lines of constant longitude, and the equator (Fig. 5.3). The analogue, on this surface, of a straight line is a great circle, because (i) when one moves on the surface of the sphere, these are the curves of shortest distance between any two points (as can be seen by stretching a piece of elastic between two points on a sphere), and (ii) these are the
curves obtained if one starts out from any point on the surface of the sphere in a given direction and then moves on this surface without deviation from its direction of motion (think of a ship or aircraft steering straight ahead, deviating neither to the left nor the right). We shall call curves in any space that have these two properties geodesics of the space; thus great circles are geodesics on the surface of a sphere.

Now if you try drawing a triangle on the surface of a sphere, with sides given by great circles, you will find that the angles do not add up to 180°; indeed one can find such a triangle for which every corner is 90° (Fig. 5.4a). Further, if you follow two such curves that start off parallel to each other (e.g. they are both initially at right angles to the equator, see Fig. 5.4b) the distance between them does not remain the same; on the contrary they eventually intersect each other. If two aircraft start off exactly parallel to each other, and fly straight ahead at the same height above the surface of the Earth, they will eventually collide. Thus the geometry of this curved space is different from that of a flat space; Euclid's axiom, that parallel straight lines never meet, is untrue. Further, it is intuitively clear that the smaller the radius of the sphere considered, the more highly curved is its surface, and then the shorter is the distance until initially parallel great circles intersect (Fig. 5.4c). Thus this distance provides a measure of the amount of curvature of the surface.

A curved (four-dimensional) space-time is rather more difficult to imagine, but geodesics can again be defined in essentially the same manner and similar kinds of effects occur. This will be made clear in this and the following chapters. In this chapter we consider the nature of curved space-times, and how they are described mathematically.
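The two statements about the sphere made above (a great-circle triangle with three right angles, and initially parallel great circles meeting after a distance that shrinks with the radius) can be checked numerically. The following is a minimal sketch rather than anything taken from the text: points are represented as unit vectors, and the helper names `tangent`, `interior_angle`, `angle_sum` and `quarter_circle_distance` are invented for the illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit length (a point on the unit sphere)."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def tangent(a, b):
    """Unit tangent at point a along the great circle heading towards b."""
    d = dot(a, b)
    return unit(tuple(bi - d * ai for ai, bi in zip(a, b)))


def interior_angle(a, b, c):
    """Angle in degrees at vertex a between the great-circle arcs ab and ac."""
    cos_angle = dot(tangent(a, b), tangent(a, c))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def angle_sum(a, b, c):
    """Sum of the three interior angles of the great-circle triangle abc."""
    return (interior_angle(a, b, c)
            + interior_angle(b, c, a)
            + interior_angle(c, a, b))


def quarter_circle_distance(radius):
    """Distance along a meridian from the equator to the pole, where two
    great circles that start at right angles to the equator intersect."""
    return math.pi * radius / 2


north_pole = (0.0, 0.0, 1.0)
equator_1 = (1.0, 0.0, 0.0)
equator_2 = (0.0, 1.0, 0.0)
print(angle_sum(north_pole, equator_1, equator_2))   # octant triangle
print(angle_sum(unit((1, 0, 0)), unit((1, 0.01, 0)), unit((1, 0, 0.01))))
print(quarter_circle_distance(1.0), quarter_circle_distance(0.5))
```

The octant triangle gives an angle sum of 270°, a very small triangle gives a sum only slightly above 180° (small regions of the sphere are nearly flat), and halving the radius halves the distance to the intersection point.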
As a preliminary to this we first examine Einstein's principle of equivalence, which underlies the curved space-time understanding of the nature of gravitation.

Fig. 5.4 (a) A 'spherical triangle' formed by three great circles (the equator and two lines of longitude meeting at a right angle at the North Pole). Each of the three interior angles of the triangle is 90°. (b) Two great circles (lines of longitude), initially parallel to each other at the equator, intersect at the North Pole. (c) The distance d from the equator to the intersection of these initially parallel great circles is shorter if the radius r of the sphere is shorter; then the surface of the sphere is more highly curved.

Exercises

5.1 Pick a point P on a plane, and draw various circles of radius r with P as centre. Repeat the procedure on the surface of a sphere of radius a. In both cases, find the ratio R = C/r between the circumference C and radius r of each circle (for the circles drawn on the sphere, measure the radius along a geodesic on the sphere). How does the ratio R for the circles on the sphere depend on their radius? [You can do this exercise experimentally, actually drawing the circles on a piece of paper and on a ball, or use simple trigonometry to calculate the answers you would obtain if you actually carried the experiment out.] How would R vary with the radius a of the sphere?

5.2 The basic problem of mapping the surface of the world in an atlas arises because the Earth's surface is not flat. Consider this problem in the light of the above discussion. Can you characterize the kinds of distortion that are likely to arise in mapping the Earth's surface on a flat map (as in an ordinary atlas)? How could you best minimize this distortion? In attempting a least-distorted map of the Earth's surface by 'cutting' into separate areas and projecting these onto a plane, would you expect to find gaps or overlaps in this projection?
5.3 Consider the surface of a cone. By projecting (i) a region including the vertex, and (ii) a region not including the vertex, onto a plane so as to preserve distances and angles, determine the nature of its curvature.

5.2 Acceleration and gravitation: the principle of equivalence

The dynamical reaction of an object (e.g. a rocket ship) to the forces exerted on it is determined by its inertial mass, that is, the mass $m_I$ entering the equation $F = m_I a$ relating the total force F acting on it to the resulting acceleration a. If it is in the gravitational field of a spherical massive body (e.g. a star) with mass M whose centre is situated a distance r away, the resulting gravitational force on the object is determined by its gravitational mass, that is, the mass $m_G$ entering Newton's gravitational equation $F = G m_G M/r^2$, where G is the Newtonian gravitational constant. A crucial feature of gravity is that the gravitational and inertial masses of any object are the same; that is, $m_G = m_I = m$. Combining these three equations shows that at a distance r from the centre of a star or other massive body of mass M, the acceleration experienced by any small object due to the gravitational force exerted on it is

$a = GM/r^2$, (+)

independent of its mass m. Thus, different objects accelerate at the same rate in a gravitational field, irrespective of their mass or composition. Indeed, this is the essential content of Galileo's famous observation that bodies of all kinds fall at the same rate when air resistance can be ignored. It also underlies the fact that we do not have to know the composition or nature of a planet in order to calculate its orbit (the outer planets such as Saturn and Jupiter, composed mainly of hydrogen and helium, move on elliptic orbits, just as do the inner planets such as Mars and the Earth, made mainly of rock and iron). This fundamental feature has two major consequences which we consider in turn.
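The cancellation of the mass m in the acceleration $a = GM/r^2$ can be seen directly in a short calculation. The sketch below is illustrative only: the function names are invented, and the values used for G and for the Earth's mass and radius are standard approximate figures supplied for the example rather than numbers taken from the text.

```python
# Approximate reference values (assumptions for this illustration).
G = 6.674e-11        # Newtonian gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m


def gravitational_force(m_grav, big_mass, r):
    """Newton's law: force on a body of gravitational mass m_grav."""
    return G * m_grav * big_mass / r ** 2


def free_fall_acceleration(m, big_mass, r):
    """Acceleration a = F / m_I, taking m_G = m_I = m as in the text."""
    return gravitational_force(m, big_mass, r) / m


# A gram, a kilogram and a tonne all fall at the same rate.
for mass in (1e-3, 1.0, 1e3):
    print(mass, free_fall_acceleration(mass, M_EARTH, R_EARTH))
```

Each mass gives the same acceleration, roughly 9.8 m/sec² at the Earth's surface, because the mass entering the force law cancels the mass entering the law of motion.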
We consider the principle of equivalence in this section, and the meaning of geodesics in the next.

Accelerated reference frames and the force of gravity

In our discussion of special relativity (Chapter 3), we restricted ourselves to considering only inertial motion, that is, observers whose world-lines are geodesics in the flat space-time of special relativity. Thus we considered Einstein's principle of relativity only for such observers (see Section 1.3). In general relativity, we extend the principle of relativity to all observers, whether moving inertially or not. Thus in the general theory of relativity, it is assumed that the laws of physics are the same for all observers, no matter what their state of motion. As we shall now see, this leads to a new understanding of the nature of gravity.

It is clear that the gravitational force measured by an observer depends critically on her state of acceleration. It is convenient here to think of an observer carrying out experiments in a lift (in the USA: an elevator). As long as the lift is stationary or in uniform motion, the results are identical to those she finds in a stationary laboratory on the Earth's surface. For simplicity, consider the lift when stationary; the Earth's gravity acts on the lift and on the observer in it. Tension exerted by the cable holding the lift (Fig. 5.5a) prevents it accelerating downwards at the rate g observed for every freely falling object (where g has approximately the value 9.8 m/sec² at the surface of the Earth, determined by (+) with M as the Earth's mass and r its radius). The reaction exerted by the floor of the lift on the observer prevents her from falling down the lift shaft; she experiences this as her weight. If she releases a glass held in her hand, it accelerates downward relative to her at the rate g and breaks on hitting the floor.
Because of the equivalence of gravitational and inertial mass, the same acceleration is experienced by all bodies no matter how heavy they are (within limits) or what they are made of, this being demonstrated by Galileo's celebrated experiments at the leaning tower of Pisa, and other more modern versions of that experiment.

Fig. 5.5 (a) An observer in a stationary lift, held in place by the tension T in the cable. The force of gravity holds her to the floor; an object dropped by her will accelerate to the floor of the lift at the rate g. (b) An observer in a lift in free fall after the cable has broken. She will not experience any force holding her to the floor; an object dropped by her will float next to her as it accelerates downwards at the same rate g as she does.

However, if the cable attached to the lift breaks (Fig. 5.5b), and we ignore friction and air resistance, then relative to the Earth's surface the lift will accelerate downwards at the rate g (since it will be a freely falling object). The observer also accelerates downwards relative to the Earth at this rate, because the floor no longer prevents this happening: it accelerates away from her at just the free-fall rate, and so exerts no force to slow down her fall. Since the reaction from the floor now vanishes, she will no longer feel her weight holding her down on the floor. Thus, as far as she is concerned, the force of gravity now appears to have no effect. If she releases a glass held in her hand, it will accelerate downwards relative to the Earth at the rate g, precisely as she is doing, and so will float next to her at a constant distance above the floor (which is also accelerating down, relative to the Earth, at the rate g). Thus, because all freely falling bodies experience the same acceleration in a gravitational field, any freely falling object will appear to be stationary in the observer's reference frame.
Measured by local experiments in this accelerating reference frame, the Earth's gravitational field no longer causes objects to accelerate towards the floor of the lift at the rate g. Its usual effects have been transformed away by changing to an accelerating reference frame.

One can make the point even more strongly by considering what the observer would experience if one were to attach rockets to the roof of the lift to accelerate it downwards at a rate 2g (Fig. 5.6). She can then stand as if in a normal gravitational field with her feet on the ceiling of the lift! Gravity tends to accelerate her down at a rate g relative to the Earth, but the roof of the lift accelerates down at 2g; the reaction exerted by the roof on her feet will act to make her accelerate down at the rate 2g instead of the free-fall rate g. Consequently, the observer would apparently experience a perfectly normal force of gravity acting from the floor to the roof, holding her against the roof. If she releases a glass from her hand, relative to her it will accelerate towards the roof at the rate g and break on hitting the ceiling. From experiments within the lift, she will measure a standard value for the acceleration due to gravity but would regard the roof as 'down' and the floor as 'up'. Thus, by changing to an appropriately accelerating reference frame, one can reverse the effective direction of gravity (for a short while!).

Fig. 5.6 An observer in a lift being accelerated downwards at a rate 2g by a rocket. The observer is upside down with her feet on the ceiling, and apparently experiences the normal force of gravity holding her against the ceiling (in the same way as gravity holds the observer in Fig. 5.5(a) against the floor). An object dropped by her will accelerate (relative to her) at the rate g towards the ceiling.

The equivalence principle

These examples depend crucially on the equivalence of gravitational and inertial mass.
If this were not true, different bodies of the same inertial mass would experience different gravitational forces and so would accelerate at different rates in a gravitational field, contrary to experiment; transformation to an accelerating frame could remove the effective gravitational force for some bodies but not others (because the required rate of acceleration would be different for different objects). As a result of this equivalence, there is a close relation between acceleration and gravity. To understand this relationship more clearly, we follow Einstein in considering various possible states of motion of an observer in some small region of space-time.

Firstly, suppose observer A is in a lift which is at rest relative to the Earth. The results of any experiments done there will be those of everyday life on Earth (Fig. 5.7a): if an object is released, it will fall to the ground. Secondly, consider observer B in a rocket moving with constant acceleration g far from any massive body (Fig. 5.7b). For him, the results of experiments will be the same as for A. An object when released will fall to the floor (or, if you prefer, the floor will accelerate into it!) with relative acceleration g. Suppose that observer C is in a lift which is falling freely under gravity because its cable has broken (Fig. 5.7c). The observer will fall at the same rate as any object released, and so will measure no relative acceleration; thus the results of all experiments will be the same as for observer D in a stationary rocket far away from any gravitational field (Fig. 5.7d). The fact that observers A and B have the same experience of an apparently normal gravitational field in seemingly different physical situations, and that observers C and D have the same experience of an apparently zero gravitational field when their physical situations are again quite different, can be summarized in the principle of equivalence: there is no way of distinguishing between the effects on an observer of a uniform gravitational field and of constant acceleration.

Fig. 5.7 (a) An observer A in a lift at rest relative to the Earth (cf. Fig. 5.5(a)). (b) An observer B in a rocket moving with constant acceleration g far from any massive body. An object dropped by B will accelerate to the rocket floor at rate g. (c) An observer C in a lift falling freely under gravity (cf. Fig. 5.5(b)). (d) An observer D in a rocket in free fall far from any massive body. An object dropped by D will float next to him.

The case of observer B moving in a rocket is exactly equivalent to that of a fundamental observer in the Rindler universe (Section 4.3). On the other hand, observer A experiences the gravitational field of a spherically symmetric body described by the Schwarzschild solution, which will be discussed in Chapter 6.

The need for curved space-times

By varying the acceleration of an observer in a flat space-time, one can mimic any gravitational field. So why do we need curved space-times? This can be motivated both by considering accelerating motion, and by considering gravitational effects.

Fig. 5.8 The freely falling observer C will measure a light ray travelling across the lift to move in a straight line (because this situation is equivalent to that of observer D in a freely falling rocket). The same light ray will appear curved to observer A, the stationary observer in the gravitational field, because C is accelerating relative to A.

To see the effect of acceleration, let us return to the stationary observer D. According to his observations, a light ray sent across the cabin of a rocket will travel in a straight line. The principle of equivalence implies that the equivalent freely falling observer C will measure a light ray sent across the lift to travel in a straight line. A stationary observer (not in free fall) will therefore regard C's light ray as being bent downwards (Fig.
5.8) and conclude that the space-time cannot be flat. Thus in order to be able to describe the experiences of all possible observers, we need to consider curved space-times.

One might be tempted to ask at this stage what gravity actually is: is it due to the local distortion of space-time or is it a force mediated by the exchange of particles? The answer here lies in the concept of complementarity (see also p. 260 in the subsection on the thermodynamics of black holes in Section 6.4); both descriptions are valid, with one being more useful in some circumstances and the other more useful in other situations. Gravity produces the curvature of space-time, which we experience as a force when we move on particular paths in that space-time.

In a realistic consideration of gravitation, one must take into account the fact that real gravitational fields are non-uniform. Thus for example the gravitational force exerted by the Earth varies in direction and magnitude (Fig. 5.9a). While it is possible by a change to an accelerated reference frame F to transform away the effective gravitational field at any point P near the Earth, use of this reference frame will not transform away the effective gravitational field at other positions, because there the direction or magnitude of the acceleration would be wrong. For example, at the point P' on the other side of the Earth to P, use of the frame F will double the effective gravitational field rather than cancelling it (Fig. 5.9b). Thus if one uses a flat-space description, one can only mimic the effect of gravity everywhere by having available infinitely many accelerated frames (Fig. 5.9c). However, an observer using a single reference frame can represent any gravitational field by using a curved space-time description.

Fig. 5.9 (a) The direction of the gravitational field at various points around the Earth.
The directions at P and P' are opposite. (b) An acceleration that transforms away the gravitational field at P will double it at P', so no single reference frame can transform it away everywhere. (c) In a flat space-time, a separate accelerated frame is needed at each point to transform away the gravitational field.

5.3 Freely falling motion and the meaning of geodesics

It follows from the equivalence of gravitational and inertial mass that when a body moves freely under gravity and inertia alone, with no other forces acting, its motion is determined completely by giving its initial position and velocity at a chosen initial time. Thus for example one might specify that a stone is dropped (starting from rest) from the top of the Tower of Pisa at 12:00 noon on 1 January in the year 1604 (Fig. 5.10a). This completely specifies the initial conditions for the motion (the place and time of the starting event, and the velocity of the stone at that event). Assuming air resistance is negligible for the short duration of the fall, the stone falls freely under gravity and inertia only, and the complete motion is determined by this initial data (the stone accelerates from rest at approximately 9.8 m/sec²).

What does this look like from the space-time viewpoint? The world-line of the stone (Fig. 5.10b) is uniquely determined by this initial data, which amounts to specifying (a) the initial event P in space-time (the place and time where we choose to start monitoring the motion) and (b) the initial four-velocity at that event, which is just the space-time direction of the world-line at the event P (see Appendix B). The stone being released from rest, the initial space-time direction of its world-line is parallel to the t-axis, since this corresponds to no change in the Z-direction; if it were thrown down instead of being released from rest, its initial direction would be sloping in the Z-direction.

Fig. 5.10 (a) A stone dropped from rest from the top of the Tower of Pisa at 12 noon on 1 January 1604. (b) The world-line of the stone, starting at the event P in space-time with an initial four-velocity V. (c) In general the world-line in space-time of a freely falling object (i.e. an object moving under gravity and inertia only) is uniquely determined by an initial space-time position Q and an initial four-velocity U defined at that event.

From this example, it is clear that a similar result will hold in general for any object moving freely under gravity and inertia alone: the initial conditions needed to specify the motion are its initial space-time position Q and velocity (a time-like direction at that event, Fig. 5.10c). Given these, the motion is completely determined, and is described by a unique time-like path in space-time. For example, if we know the position of an artificial satellite moving around the Earth at a particular time, and its motion at that instant, we can predict its future motion around the Earth as long as no force other than gravity acts on it (e.g. as long as it does not fire a rocket engine). A unique space-time curve describes this motion, being completely determined by an initial point in space-time and direction at that event.

The physics of free fall

After this somewhat lengthy introduction to the relation between acceleration and gravitation, we are in a position to pull the threads of the discussion together. When we move from the special to the general principle of relativity, so taking into account the use of accelerated reference frames, it is no longer possible to make a clear-cut distinction between gravity and inertia (since that distinction depends on the acceleration of the reference frame chosen).
In particular, inertial motion no longer has a clear physical meaning, because motion that is inertial in one reference frame will not be inertial in another that is accelerating relative to the first. However, we can assign a clear physical meaning to the notion of a particle in free fall, that is, a particle which is in motion under gravity and inertia alone. As examples, observers C and D discussed above (Fig. 5.7) were in free fall, whereas A and B were not (A was not in free fall because of the cable restraining the lift from falling, while B was not because of the force exerted by the rocket motors). An object will be in free fall unless some force other than gravitation is exerted on it. Given this physical identification of a uniquely determined set of particle motions, it is natural to identify them with the geometrically unique set of particle motions discussed above, namely time-like geodesics of space-time. We therefore make this identification: the paths of freely falling objects in space-time, i.e. objects moving under gravity and inertia alone, are time-like geodesics in space-time. An example of bodies in free fall is the motion of planets around the Sun, and indeed this prescription turns out to provide a satisfactory description of that motion. We shall consider briefly how this can be.

Planets

Just as gravity curves the paths of light rays in a curved space-time, so it will also curve the paths of massive objects. Note the inherent non-linearity of the theory: massive bodies produce space-time curvature which then affects the motion of these same massive bodies. This is the reason why some calculations in curved space-times are very difficult. However, here we shall be concerned mostly with the motion of what are known technically as 'test particles', which just means that we are neglecting their effect on the curvature of space-time, and seeing how their motion is affected by curvature produced by other more massive bodies.
The curving of the paths of massive objects by space-time is clearly necessary if we are to describe the nearly circular motion of the planets as due to gravity producing a curved space-time. One aspect of this motion may be illustrated by considering two everyday examples of circular motion. Firstly, consider a ball made to describe a circular path by someone swinging the piece of string to which it is attached. The force or tension in the string maintains the circular path. Secondly, consider a ball following a circular path at a fixed height inside a hemispherical shell (Fig. 5.11); in this case the reaction of the shell maintains the circular path. The first of these examples corresponds to the idea of gravity being a force determining motion; the second embodies the idea of motion being determined by the shape of the surrounding space, and is essential in our discussion of curved space-times.

Fig. 5.11 A ball moving at a fixed height inside a spherical shell is maintained on its circular path by the curvature of the shell. Bound planetary motion is just like this, the planet being held in its circular orbit by the curvature of space-time caused by the gravitational field of the Sun. Bodies with sufficient kinetic energy will escape to infinity.

Fig. 5.12 A geodesic in curved space-time is a path of least distance (longest proper time) in space-time, which also has the property that its direction is undeviating (in the curved space-time). Its spatial projection can be highly curved.

The spatial paths of the planets may be highly curved; this is a result of their moving on geodesics in space-time which are paths giving the longest* possible time between their initial and final points. Because of the space-time curvature, these 'longest time' paths in space-time result in curved spatial orbits (Fig. 5.12).
Thus, one can understand the planets as moving around the Sun in such a way as to minimize the space-time distance they travel between their initial and final positions (by maximizing the proper time). As well as giving an extremal space-time distance between their end-points, geodesics (as explained earlier) are curves that have an undeviating direction in space-time. How then can a particle moving on a geodesic arrive back at the same spatial position (as happens, for example, in the case of a planet moving in a circular orbit around the Sun)? This is difficult to illustrate, but the example mentioned above of the ball moving in a hemispherical shell gives some insight into this; for it is clear that if the ball were moving at the equator, it would veer neither to right nor left, and end up back at the same position. A practical example that nearly demonstrates this is a motorcycle rider on a 'wall of death' at a fair. In a curved space-time representing the gravitational field of a massive star, the effect of the space-time curvature is as if its planets were moving on a smoothly curved surface of revolution that holds those planets with sufficiently small kinetic energy near it, but lets those with large energy escape to infinity (Fig. 5.11). One must remember here that the undeviating direction is in space-time, rather than space; this is not easy to visualize, and in the end we have to rely on our calculations to see that the paths predicted by the theory do indeed work out as observed in the solar system, for example, the Earth moving around the Sun in its nearly circular orbit, held at this distance by the space-time curvature.

* In space, a geodesic is the path giving the shortest distance between its end-points, but in space-time it is the path giving the longest time between its initial and final points (cf. the discussion in Section 4.2, and Section 5.4 below).
Geodesic deviation: curvature and tidal effects

One cannot measure the strength of a gravitational field by an absolute measurement of the amount it bends a light ray or particle path, because this depends on the frame of reference used; indeed one can always choose a reference frame in which the particle's motion is uniform (e.g. choose the particle as the origin of the reference system; then it will always be stationary at the origin, by the choice of coordinates used). However, the strength of the gravitational field is readily detectable by measuring the relative motion of particles or light rays. Thus, for example, in a static situation, one may be able to measure the bending of light relative to a static observer and thus estimate the strength of the gravitational field. The relative motion of neighbouring particles or light rays can be examined systematically, and leads directly to estimates of the space-time curvature.

Consider a pair of particles in free fall in the gravitational field of a massive star G, after being released from rest (Fig. 5.13a). They will both fall towards the centre of the star, and so will gradually move closer together. Thus one can detect the effect of the gravitational field in causing relative motion of freely falling particles. If one considers a spherical cloud of freely falling particles that are released from rest, the particles nearer the star accelerate faster than those further away, so the sphere becomes compressed sideways but elongated towards the star (Fig. 5.13b). It turns out that, in this case, the volume of the cloud of particles remains constant. Thus the gravitational field of a distant mass has a pure distorting effect, which we are familiar with as a tidal force (the gravitational field of the distant Moon is the essential cause of the tides on the Earth, cf. Fig. 5.13c). If we took into account the gravitational effect of the particles themselves on their motion, we would find that the volume decreased.
These examples illustrate that the effect of gradients in the gravitational field is to cause relative acceleration of test particles, which can be measured and used to estimate the strength of these gradients. One cannot transform such gradients away to zero by changing to an accelerating reference frame, so they represent a real physical aspect of the space-time.

Fig. 5.13 (a) Two particles falling freely from rest towards a star G. The distance between them decreases as they move towards G. (b) A spherical cloud of particles is distorted as it falls freely towards a star. (c) The tides on the Earth are produced by the gravitational field of the Moon. The sea on the side of the Earth nearer the Moon experiences stronger acceleration than the sea on the far side (cf. the distortion in (b)).

Fig. 5.14 The space-time paths of the freely falling particles in Fig. 5.13(a). They are parallel initially but meet after a finite time (cf. Fig. 5.4(b)).

To understand this a bit further, consider a space-time view (Fig. 5.14) of the freely falling particles released from rest (Fig. 5.13a). The geodesics start off initially parallel, but then converge towards each other; in fact they will intersect in a finite time if they continue far enough. This is a very general feature of gravitational fields; it is completely analogous to the effect of curvature on the geodesics on a sphere (Fig. 5.4b). In that case the distance until parallel lines intersect is an inverse measure of the amount of curvature. In the space-time case, by analogy we can measure the strength of the space-time curvature by the time elapsing until particles initially at rest run into each other; the shorter this time, the greater the space-time curvature and the stronger the gravitational field.
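The size of the tidal effect described above can be estimated with a short numerical sketch. It is a Newtonian illustration only, not a calculation from the text: the function names, the rounded values for the Moon's mass parameter, the Earth's radius and the Earth-Moon distance, and the first-order estimate 2GMd/r^3 for the gradient are all assumptions introduced here.

```python
# Newtonian sketch of the tidal (relative) acceleration produced by the Moon
# across the radius of the Earth. All constants are rounded, assumed values.

GM_MOON = 4.9028e12   # gravitational parameter G*M of the Moon, m^3/s^2 (assumed)
R_EARTH = 6.371e6     # mean radius of the Earth, m (assumed)
D_MOON = 3.844e8      # mean Earth-Moon distance, m (assumed)


def newtonian_acceleration(gm, r):
    """Inverse-square acceleration towards a mass with parameter gm at distance r."""
    return gm / r**2


def tidal_difference(gm, r, d):
    """Exact difference between the accelerations at distances r - d and r."""
    return newtonian_acceleration(gm, r - d) - newtonian_acceleration(gm, r)


def tidal_gradient_estimate(gm, r, d):
    """First-order estimate of the same difference: gradient 2*gm/r^3 times d."""
    return 2.0 * gm * d / r**3


exact = tidal_difference(GM_MOON, D_MOON, R_EARTH)
estimate = tidal_gradient_estimate(GM_MOON, D_MOON, R_EARTH)
print(f"near side minus centre: {exact:.3e} m/s^2")
print(f"gradient estimate:      {estimate:.3e} m/s^2")
```

Both numbers come out near one millionth of a metre per second squared. The point of the sketch is that only this difference, and not the much larger common acceleration of the whole Earth towards the Moon, survives a change to a freely falling reference frame.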
Exercises

5.4 Devise a method for constructing the geodesic routes to be used by aircraft flying at a constant height above the Earth's surface between various cities. In particular, look at (i) London-Sydney, (ii) New York-Tokyo, (iii) Cape Town-Los Angeles.

5.5 Explain why an astronaut in a satellite orbiting the Earth experiences a state of weightlessness.

5.6 Two particles are simultaneously released from rest a distance 9 metres apart at the surface of the Earth, and fall down a tunnel which allows them to fall to the centre of the Earth. What will happen there? Draw a space-time diagram of this situation.

5.4 The metric form and the metric tensor

We have now attained a broad idea of the nature of curved space-times. This section addresses the issue of how one can describe them mathematically.

The metric form for curved spaces

The basic idea we shall use is that one describes a curved space by giving the metric form $ds^2$, in some suitable coordinate system. Just as in flat space, this then determines all distance measurements and angles (cf. Section 4.2). As an example, the metric form for the surface of a sphere of radius $a$ is

$ds^2 = a^2(d\theta^2 + \sin^2\theta\, d\phi^2)$   (5.1)

where $\theta$ and $\phi$ are standard polar coordinates (we can think of $\theta$ as latitude measured from the north pole, and $\phi$ as longitude; see Fig. 5.15). Just as in the argument following eqn (4.28b), this shows that the distance measured along a line of constant longitude ($\phi$ constant) from $\theta_1$ to $\theta_2$ is $a(\theta_2 - \theta_1)$, while the distance measured along a line of constant latitude ($\theta$ constant) from $\phi_1$ to $\phi_2$ is $a(\phi_2 - \phi_1)\sin\theta$ (see Fig. 4.22*).

Fig. 5.15 The angles $\theta$ and $\phi$ used to describe position on the surface of a sphere. Small increments in $\theta$ and $\phi$ result in displacements $a\, d\theta$ and $a \sin\theta\, d\phi$ on the surface of the sphere.
Moving through a general small displacement ($d\theta$ in the $\theta$ direction, $d\phi$ in the $\phi$ direction), then because the lines of constant latitude and longitude are at right angles to each other, we very nearly have a small flat right-angled triangle, and the smaller these displacements are, the more accurate this approximation is. In such a flat triangle, Pythagoras' result will hold: the square on the hypotenuse is the sum of the squares on the other two sides. The form (5.1) shows that the geometry of the curved surface agrees in the limit of very small displacements with this flat-space result. Thus in the limit very near any point, the geometry of the curved surface is the same as that of a flat space. This is of course clear on the surface of the Earth: one does not need to use spherical trigonometry to lay out a football field or design a building!

The distinction at this level between flat and curved spaces is that, for a flat (two-dimensional) space, it is possible to find a coordinate system in which the metric form is everywhere

$ds^2 = dx^2 + dy^2$   (5.2a)

i.e. with the coefficients of $dx^2$ and $dy^2$ being 1, whereas no such coordinate system can be found for a curved space (e.g. on the surface of a sphere). Note that this statement does not imply that the metric form is the same for all coordinate systems in a flat space; indeed we have seen various other forms for the flat-space metric in Section 4.2. In a curved two-dimensional space, one can always find coordinates such that the metric form is (5.2a) at any point P, but it will not be this at other points (for example, up to a common scaling factor $a$ the two-dimensional metric form (5.1) reduces to this at each point on the line $\theta = \pi/2$ but not elsewhere). If one could find coordinates such that this form applied everywhere, this would imply that Pythagoras' theorem holds for arbitrarily large displacements, in contrast to the situation in curved spaces where it only holds in the limit near each point.
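The statement that the metric form (5.1) determines all distances on the sphere can be checked numerically. The following is a minimal sketch, not part of the original text: the function name, the parametrization of the path by a variable s running from 0 to 1, and the midpoint summation are choices made here for illustration. It adds up many small Pythagorean steps, each computed from (5.1), along a given path.

```python
import math


def sphere_path_length(a, theta_of, phi_of, n=10000):
    """Approximate the length of a path on a sphere of radius a.

    The path is given by functions theta_of(s), phi_of(s) for s in [0, 1].
    Each small step contributes ds = a * sqrt(dtheta^2 + sin^2(theta) dphi^2),
    which is the metric form (5.1) applied to a small displacement.
    """
    total = 0.0
    for k in range(n):
        s0, s1 = k / n, (k + 1) / n
        d_theta = theta_of(s1) - theta_of(s0)
        d_phi = phi_of(s1) - phi_of(s0)
        theta_mid = theta_of(0.5 * (s0 + s1))
        total += a * math.sqrt(d_theta**2 + (math.sin(theta_mid) * d_phi) ** 2)
    return total


a = 2.0

# Line of constant longitude, from theta = 0.2 to theta = 1.0.
meridian = sphere_path_length(a, lambda s: 0.2 + 0.8 * s, lambda s: 0.5)

# Line of constant latitude theta = pi/3, from phi = 0 to phi = 1.
latitude = sphere_path_length(a, lambda s: math.pi / 3, lambda s: 1.0 * s)

print(meridian, a * 0.8)                      # compare with a(theta_2 - theta_1)
print(latitude, a * math.sin(math.pi / 3))    # compare with a(phi_2 - phi_1) sin(theta)
```

The same function accepts any path, so it can also be used to compare the length of a route along a line of latitude with that of other routes between the same end-points.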
Similar results hold for higher-dimensional spaces, e.g. a three-dimensional space is flat if and only if coordinates $x$, $y$, $z$ can be found such that the metric form everywhere is

$ds^2 = dx^2 + dy^2 + dz^2$.   (5.2b)

In general coordinates, the metric form will be different (see e.g. (4.28b)).

* Cf. eqn (4.28b); here we have the same metric form but with $r$ = constant, which implies $dr = 0$, giving a 2-sphere of radius $a$ as required.

The metric tensor

It will be convenient later to introduce a general notation that will apply to all the spaces and space-times we consider. First we recall the coordinate notation $(x^1, x^2, x^3, x^4)$ introduced in Section 4.1. Define the quantities

$g_{11} = a^2, \quad g_{22} = a^2\sin^2\theta, \quad g_{12} = g_{21} = 0,$   (5.3a)

which can also be conveniently written in the matrix notation

$[g_{ij}] = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2\sin^2\theta \end{pmatrix}.$

Then the metric form (5.1) can be written

$ds^2 = g_{11}(dx^1)^2 + g_{12}\,dx^1 dx^2 + g_{21}\,dx^2 dx^1 + g_{22}(dx^2)^2.$   (5.4a)

On the other hand, if we define

$g_{11} = 1, \quad g_{22} = 1, \quad g_{12} = g_{21} = 0,$   (5.3b)

which can also be conveniently written in the matrix notation

$[g_{ij}] = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$

then (5.4a) gives the metric form (5.2a). Thus the formalism $[g_{ij}]$ may be used to specify the metric form of a flat two-dimensional space in Cartesian coordinates, or a (curved) two-sphere in polar coordinates. Examination of other examples suggests that for a general two-dimensional space in general coordinates, the metric form can be written as in (5.4a), where the coefficients $g_{ij}$, called the components of the metric tensor, are symmetric:

$g_{12} = g_{21}$   (5.4b)

and otherwise are arbitrary functions of the coordinates $x^1$ and $x^2$. A more concise way of writing (5.4) is

$ds^2 = \sum_{i,j} g_{ij}\, dx^i dx^j,$   (5.5a)

$g_{ij} = g_{ij}(x^k), \quad g_{ij} = g_{ji},$   (5.5b)

where $\sum$ stands for summation over all values of the indices $i$ and $j$ (in this case, $i = 1, 2$ and $j = 1, 2$) and the last equation is understood to hold for all values of $i$ and $j$ (in this case, $i, j = 1, 2$).

One great advantage of this notation is that it includes all the cases we have come across so far, no matter what the dimension of the space (provided we take the summation over appropriate values). Thus, for example, we recover (4.28b) from (5.5) on setting $g_{11} = 1$, $g_{22} = r^2$, $g_{33} = r^2\sin^2\theta$, $g_{ij} = 0$ otherwise, but obtain (5.2b) if instead we set $g_{11} = g_{22} = g_{33} = 1$, $g_{ij} = 0$ otherwise. Thus the general concept is that a curved space of $n$ dimensions is described by a metric form $ds^2$ given by (5.5), where $i$ and $j$ range over the values 1 to $n$.

Exercises

5.7 Flat two-dimensional space is given in terms of plane polar coordinates $(r, \theta)$. What form will the metric components take in this case?

5.8 In the case of a general three-dimensional space, verify that when written out in full detail, expression (5.5a) becomes

$ds^2 = g_{11}(dx^1)^2 + g_{12}\,dx^1 dx^2 + g_{13}\,dx^1 dx^3 + g_{21}\,dx^2 dx^1 + g_{22}(dx^2)^2 + g_{23}\,dx^2 dx^3 + g_{31}\,dx^3 dx^1 + g_{32}\,dx^3 dx^2 + g_{33}(dx^3)^2.$

What simplification results from (5.5b)?

The metric form of space-time

Similarly, to describe a general curved (four-dimensional) space-time, one must give the metric form $ds^2$ in some suitable coordinate system, and this form can be written in terms of metric tensor components (5.5). Again, the distinction between flat and curved space-times is that in a flat space-time it is possible to find a coordinate system in which the form is everywhere

$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2,$   (5.6a)

so that, with $i, j = 0, 1, 2, 3$, the metric components $g_{ij}$ are $\pm 1$ if $i = j$ and zero otherwise;* that is

$[g_{ij}] = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix};$   (5.6b)

in a curved space-time one can find a coordinate system in which the metric form is (5.6a) at any specified point P, but there is no coordinate system giving this form everywhere.
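The double sum in (5.5a) is easy to evaluate mechanically once the components $g_{ij}$ are stored as a matrix. The sketch below is an illustration added here, with hypothetical function and variable names; it uses the components (5.3a), (5.3b) and (5.6b) given above and represents a small displacement as a plain list of coordinate increments.

```python
import math


def line_element(g, dx):
    """Return ds^2 = sum over i, j of g[i][j] * dx[i] * dx[j], as in eqn (5.5a)."""
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))


def sphere_metric(a, theta):
    """Components (5.3a) for a sphere of radius a, evaluated at polar angle theta."""
    return [[a**2, 0.0],
            [0.0, (a * math.sin(theta)) ** 2]]


# Components (5.3b): flat two-dimensional space in Cartesian coordinates.
flat_2d = [[1.0, 0.0],
           [0.0, 1.0]]

# Components (5.6b): flat space-time, coordinates ordered (t, x, y, z), c = 1.
minkowski = [[-1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]

print(line_element(flat_2d, [3.0, 4.0]))                            # Pythagoras: 3^2 + 4^2
print(line_element(sphere_metric(2.0, math.pi / 2), [0.1, 0.2]))    # on the equator of a sphere
print(line_element(minkowski, [1.0, 1.0, 0.0, 0.0]))                # displacement along a light ray
```

Only the matrix changes from one space to another; the summation itself is the same in every case and in any number of dimensions, which is the advantage of the notation described above.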
In flat space-time this form will apply only if special coordinates are used; but the general form (5.5) will apply in all cases (see e.g. (4.29)).

The metric also gives a convenient way of writing the scalar product (see (4.31)). In a general space, the scalar product of vectors $\eta_1$, $\eta_2$ is

$\eta_1 \cdot \eta_2 = \sum_{i,j} g_{ij}\, \eta_1^i \eta_2^j.$   (5.6c)

It can easily be seen that this reduces to (4.31) when the metric takes the form (5.6b) and the $\eta$s are chosen as in (4.31).

Once the metric form is given, then just as in flat space-time, it determines all time measurements by ideal clocks in the space-time (moving on time-like curves, for which $ds^2 < 0$) through eqn (4.25a), and the motion of light at each point (paths on which $ds^2 = 0$). Thus it determines the light rays at each point and the past and future null cones of each event (which are generated by these light rays), and so the nature of causality.

* We are here using the same units for spatial distances (measured by light travel times) as for time measurements, i.e. we are using units such that the speed of light $c$ is 1.

As a simple example, consider the universe model with metric form given in terms of suitable coordinates:

$ds^2 = -dt^2 + t^{4/3}(dx^2 + dy^2 + dz^2)$   (5.7a)

(that is, $g_{00} = -1$, $g_{11} = g_{22} = g_{33} = t^{4/3}$, $g_{ij} = 0$ otherwise). One immediately sees that along each world-line {$x$ = const, $y$ = const, $z$ = const}, the identities $dx = 0 = dy = dz$, and so $ds^2 = -dt^2$, hold; therefore by (4.25a) the coordinate $t$ measures proper time along those world-lines, which are the fundamental world-lines in this universe. However, along a curve {$t$ = const, $y$ = const, $z$ = const}, we have $ds^2 = t^{4/3} dx^2$, so proper distance along that curve is measured by $t^{2/3}x$ rather than $x$, which (as we will see in detail in Chapter 7) implies that this is an expanding universe. The null cone is determined by the condition $ds^2 = 0$; from (5.7a) this shows that a displacement $(dx^i) = (dt, dx, dy, dz)$ along the null cone must obey

$dt^2 = t^{4/3}(dx^2 + dy^2 + dz^2).$   (5.7b)

To see the implications, consider the null cones projected into a surface {$y$ = const, $z$ = const}, i.e. set $dy = dz = 0$ in (5.7b), to obtain

$dt^2 = t^{4/3}dx^2 \;\Leftrightarrow\; dt = \pm t^{2/3}dx.$   (5.7c)

This shows that, for small values of the coordinate $t$, a given displacement $dx$ results in a very small displacement $dt$; at larger values of $t$, the same displacement $dx$ results in a larger displacement $dt$ (Fig. 5.16). Thus in terms of these coordinates, the light cone 'flattens out' as one approaches the surface $t = 0$ (for the same increment in $t$, the required increment in $x$ in order to fulfil (5.7c) gets larger and larger as $t$ decreases). It does so in a way independent of the value of $x$ (since the coordinate $x$ does not appear explicitly in (5.7a,c)). We shall examine this and related models in detail in Chapter 7.

Fig. 5.16 The light cones for the interval (5.7a), given by (5.7c). For small values of the coordinate $t$, the cones are flattened out.

Exercise

5.9 Consider flat space-time (which has spatial symmetry about any chosen axis). Take cylindrical polar coordinates in which $z$ measures distance parallel to the axis, $r$ measures distance from the axis, and $\phi$ is an angle describing rotation about this axis (see Fig. 5.17). Write down the metric form $ds^2$ and metric tensor $[g_{ij}]$ in these coordinates.

Fig. 5.17 (Diagram label: axis of symmetry.)

5.5 The field equations

The geometry of a space-time is determined by the metric form $ds^2$, or equivalently by the space-time metric tensor components $g_{ij}(x^k)$. The critical question, then, is what determines the metric tensor? Einstein proposed in 1916 that the space-time geometry is determined by gravitational field equations. Broadly speaking, these equations express the idea that the matter present in a space-time causes curvature of that space-time, which determines the space-time metric form. This is another revolutionary idea; until Einstein, it was assumed that geometry was static, a feature of the physical world given ab initio which affected everything in the universe but was affected by nothing. The new view was that the geometrical structure of space-time, like other aspects of the physical world, is a quantity affected by physical conditions in the world, and whose evolution is determined by well-defined equations from given initial conditions. The effects of gravity are then enshrined in the space-time curvature. Thus geometry also became a branch of physics through this new understanding: one could set out to determine the space-time geometry by appropriate observations, and to find the laws determining this geometry.

Einstein proposed a particular set of equations to determine the space-time geometry, the Einstein gravitational field equations. These are a complex set of partial differential equations for the metric tensor components $g_{ij}$, written in the mathematical language of tensor calculus. Although the use of tensor calculus is beyond the scope of this book, we shall state Einstein's equations to show how his revolutionary and profound ideas about the nature of space-time geometry and gravity can be expressed in an extremely concise and elegant way. On one side of the equations is the symmetric Einstein tensor $G^{ij} = G^{ji}$, which is built from second partial derivatives of the components of the metric tensor with respect to the various coordinates. The tensor describes the geometry of space-time and is the most general such object which satisfies certain important requirements, such as transforming correctly and being zero when the space-time has no curvature. On the other side of the equation is the symmetric stress-energy tensor $T^{ij} = T^{ji}$ (see Appendix C, Section C5).
The components of this object describe the matter and energy which cause the space-time curvature, combining in one the energy density, momentum density, and isotropic and anisotropic pressures. Then Einstein's equations take the simple form

$G^{ij} = \kappa T^{ij}$   (5.8)

where $\kappa$ is the gravitational constant, equal to $8\pi G/c^2$.* This equation states that matter (represented by the stress-energy tensor on the right) causes space-time curvature (represented by the Einstein tensor on the left). The space-time curvature in turn determines how the matter moves, and this is how we experience gravitational effects.

The equations of motion of the matter are embodied in the conservation law satisfied by the stress-energy tensor (see the discussion of these laws in the flat space-time case, in Appendix C). We can choose coordinates so that at a particular point this law is

$\sum_j \partial T^{ij}/\partial x^j = 0.$   (5.9)

(When written in a form valid in general coordinates, the partial derivatives have to be replaced by a 'covariant derivative' which involves extra terms and gives the correct tensor transformation properties.) This is just the statement of energy-momentum conservation. By (5.8) this law means that a similar property must hold for $G^{ij}$, and this is one of the requirements determining the form of these equations.

Using the symmetry of $G^{ij}$ and $T^{ij}$ in their indices, and recalling that each index can take four different values in four-dimensional space-time, we might be led to conclude that there are ten independent coupled equations for the ten independent components of the metric tensor $g_{ij}$. However, four of the degrees of freedom of the metric tensor correspond to the freedom to choose what coordinate system to use in a particular problem; we require this freedom because we know that the physical reality studied must be independent of the coordinates used to describe it.
Thus, given a suitable coordinate choice, only six metric tensor components have to be determined by the field equations, the remaining four components being fixed by our choice of gauge, the technical term for coordinate choice. On the other hand, it turns out that only six of the Einstein equations are independent because there are four relations between them, the Bianchi identities. These are precisely the derivative conditions $\sum_j \partial G^{ij}/\partial x^j = 0$ on $G^{ij}$ mentioned above. Hence we may solve four of Einstein's equations (the initial value equations, also called the constraint equations) for the unknown metric tensor components on a space-like initial surface $\Sigma$, and six of the equations (the propagation equations) in a suitable open set $U$ in space-time containing $\Sigma$; it then turns out that, because of the Bianchi identities, the constraint equations will be true in all of $U$ (and not just on $\Sigma$), so we do not have to solve these four equations throughout $U$. Additionally, if we choose coordinates cleverly in particular cases, we may be able to do so in such a way as to guarantee that some of the field equations are identically satisfied. Thus despite the great complexity of these equations, many solutions are known.

* Several years after Einstein first formulated his equations, he inserted an extra term, adding $\Lambda g^{ij}$ to $G^{ij}$, where $\Lambda$ is the so-called cosmological constant. This was to allow the possibility of a static unchanging universe as a particular solution. However, when the expansion of the universe was discovered in 1929, he changed his mind and set $\Lambda = 0$. Many considerations, including the validity of the Newtonian limit, constrain $\Lambda$ to be extremely small, and it is usually taken to be strictly zero, except in cosmological applications (see Chapter 7), where it may indeed be important.

Einstein's equations embody the physics of gravitation.
It is of course important to show that in the slow-motion, weak-field limit, we regain from them the results of Newtonian gravitational theory to a high degree of accuracy, because that theory gives a very good description of the behaviour of matter in the solar system. It is far from obvious that this is true, because the Einstein and Newtonian gravitational equations are so dissimilar from each other. However, amazingly, this can be demonstrated, provided we employ suitable coordinates; and this requirement fixes the constant of proportionality $\kappa$ between $G^{ij}$ and $T^{ij}$ in (5.8). However, the predictions of Newtonian theory are not completely accurate, and where there is a disagreement, Einstein's theory gives the better prediction. In fact it has stood the test of all experiments so far conducted to examine its accuracy (see Sections 5.6 and 5.9).

Einstein's theory disagrees dramatically with Newtonian theory in the case of strong fields. As we shall see, according to Einstein's theory extremely dense matter can cause space-time to 'curl up' on itself, resulting in a 'black hole' (Chapter 6); there is now evidence that solar-mass black holes exist in the outer regions of our galaxy, and that much more massive black holes may exist at the centres of galaxies. In general, the curvature of space-time manifests itself in the bending of light rays and similar gravitational effects, resulting for example in a redshift that has a gravitational rather than a Doppler origin being detected in observations of massive stars.

Exercises

5.10 What symmetries would you expect in the metric form describing the space-time around a static, spherically symmetric star? From general arguments, write down the most general metric form that might represent this space-time, provided coordinates are chosen adapted to these symmetries.

5.11 What other physical situation might the interval of Exercise 5.10 represent?
Geodesics again

We have already discussed the physical meaning of time-like geodesics and their importance in describing the effects of gravity. How does that discussion relate to the mathematical formalism we have now set up?

As has been mentioned before, in a curved space one can look for the shortest distance between two points. This can be found by choosing a path which minimizes $L = \int (ds^2)^{1/2}$ (cf. (4.26a)), where $ds^2$ is the metric form (5.5). Similarly in curved space-time, we may find the time-like path that maximizes the value of $\tau = \int (-ds^2)^{1/2}$ (cf. (4.25a)), where $ds^2$ is the space-time metric form (again given by (5.5)). This will correspond to the path with the longest proper time between its end-points, as pointed out in the discussion in Section 4.2.* Any paths that are either maxima or minima of the space-time distance between their end-points are geodesics of the space-time (cf. Section 5.1). As we have seen, particles moving freely (i.e. not subject to any non-gravitational forces) will follow such paths in curved space-times.

In introducing the idea of a curved space, we indicated that there is an alternative way of defining a geodesic: namely, as a curve whose direction is unchanging as one moves along it. This idea can be made precise in any curved space or curved space-time (cf. Section 5.7), and it turns out that the two definitions are the same: a curve of extreme length is also one that does not deviate from its initial direction. In a flat space or space-time, the geodesics are simply straight lines between their initial and final points.

Time-like geodesics (those for which $ds^2 < 0$ at each point) in space-time have a very clear physical meaning which we have already discussed (Section 5.3). Null geodesics (those for which $ds^2 = 0$ at each point) also have an important physical meaning, which we will discuss next.
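The 'longest proper time' property of time-like geodesics can be seen in the simplest case, flat space-time, where the geodesics are straight lines. The sketch below is an added illustration with hypothetical names: it evaluates the integral of $(-ds^2)^{1/2}$ for world-lines made of straight segments in one space dimension, using the flat metric form (5.6a) with $c = 1$, and compares a straight world-line with one that makes a detour between the same two events.

```python
import math


def proper_time(events):
    """Proper time along a world-line made of straight segments in flat space-time.

    events is a list of (t, x) pairs in units with c = 1. On each segment
    -ds^2 = dt^2 - dx^2, and the elapsed proper time is its square root.
    """
    tau = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        minus_ds2 = dt**2 - dx**2
        if minus_ds2 < 0:
            raise ValueError("segment is not time-like (it would exceed the speed of light)")
        tau += math.sqrt(minus_ds2)
    return tau


# Both world-lines join the event (t, x) = (0, 0) to the event (10, 0).
straight = [(0.0, 0.0), (10.0, 0.0)]                # the geodesic: stays at x = 0
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]      # moves out at speed 0.6 and back

print(proper_time(straight))   # 10.0
print(proper_time(detour))     # 8.0
```

Any detour shortens the elapsed proper time, so the straight world-line is the one of longest proper time. In a curved space-time the same comparison would have to be made with the appropriate $g_{ij}$ in place of the flat form, which this sketch does not attempt.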
5.6 Light rays

We have now determined a unique physical interpretation of the time-like geodesics in a curved space-time. What about the null geodesics? The obvious answer is that they must represent light rays, for they are the null curves (i.e. they represent motion at the speed of light) that are the nearest one can get, in a curved space-time, to a straight line. Thus, we will make this identification: light rays in a curved space-time are null geodesics. This assumption can be confirmed by examining the geometric optics solutions of Maxwell's equations in a curved space-time, and by considering the propagation of zero-rest-mass particles in a curved space-time. This identification is of considerable importance, since on the one hand light rays determine the results of any astronomical observations we may make, and on the other they are the generators of the light cones in space-time and so determine the nature of causality. Before discussing these issues, we look at some implications of the principle of equivalence.

* In space-time, whether the path is 'shortest' or 'longest' depends on the sign convention used for the space-time interval; this convention is arbitrary, and one can quite consistently use the opposite sign for $ds^2$ than that used here. However, what is independent of this choice is the physical effect: these are the paths of longest proper time. Here we regard $ds^2$, which is negative on a time-like path, as minimized, resulting in a maximum value for the elapsed time, given by integrating $(-ds^2)^{1/2}$.

Bending of light rays

We have seen already that light rays observed by a freely falling observer D far from any gravitational field should be seen to move in straight lines (for this is just the flat-space-time situation). Hence, by the principle of equivalence, this should also be true for an observer C freely falling radially towards the centre of the Earth (Fig. 5.7c).
But the path of this light will appear curved relative to an observer A at rest relative to the Earth, just as the path of light will appear curved relative to an observer B in a uniformly accelerating rocket far from any gravitational field (Figs 5.7a,b; cf. Fig. 5.8). Hence the principle of equivalence leads us to believe that (relative to an observer at rest on that body) light rays will be bent by the gravitational field of a massive body.

The classical way of testing this is by observing the apparent positions of stars during a solar eclipse. The stars are seen by light rays which just graze the surface of the Sun, and the bending of these rays produces a distorted image of their positions (see Fig. 5.18). From the Schwarzschild solution of Einstein's equations (see Section 6.1), which describes the gravitational field outside a spherically symmetric object like the Sun, the gravitational deflection of such a light ray can be calculated to be 1.75 seconds of arc. This prediction was first tested during the total eclipse in 1919 by an expedition led by Eddington, and it was confirmed to within an accuracy of about 10 per cent. This led to the widespread acceptance of the general theory of relativity. Since then, many similar observations have been made during total eclipses of the Sun, but the difficulties which seem inherent in such measurements mean that the accuracy has not improved significantly. However, it has proved possible to test the Einstein prediction more rigorously by radio interferometer measurements of the bending of radio waves from quasars (very distant objects that appear very like stars) being eclipsed by the Sun. In 1976, Fomalont and Sramek performed such measurements to an accuracy of 1 per cent, giving excellent agreement with Einstein's prediction.

Fig. 5.18 Light rays from a distant star are bent by the gravitational field of the Sun, producing a distorted image of the star's position.
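The figure of 1.75 seconds of arc quoted above can be reproduced from the standard weak-field result of general relativity for a ray passing a mass $M$ at closest distance $b$, namely a deflection angle of $4GM/(c^2 b)$ radians. That formula is not derived or stated in the text and is taken here as an assumption, as are the function name and the value used for the Sun's mass parameter; the solar radius of 696 000 km is the one given in the hint to Exercise 5.12.

```python
import math

GM_SUN = 1.32712440018e20   # gravitational parameter G*M of the Sun, m^3/s^2 (assumed value)
C = 299792458.0             # speed of light, m/s
R_SUN = 6.96e8              # radius of the Sun, m (696 000 km)


def deflection_arcsec(gm, b):
    """Weak-field light deflection 4*G*M/(c^2 * b), converted to seconds of arc.

    b is the closest distance of the ray from the centre of the mass.
    The formula is a standard result assumed here, not derived in the text.
    """
    angle_rad = 4.0 * gm / (C**2 * b)
    return math.degrees(angle_rad) * 3600.0


grazing = deflection_arcsec(GM_SUN, R_SUN)
print(f"deflection of a ray grazing the Sun: {grazing:.2f} seconds of arc")
```

Because the angle falls off as $1/b$, a ray passing at two solar radii is bent by only half as much, which is one reason the eclipse measurements concentrate on stars seen close to the Sun's edge.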
Exercise 5.12 The focal length of the sun 5.6 Light rays 211 Consider parallel light rays projected towards the Sun from infinity. After passing the Sun, they will intersect within a distance d because of the bending of light by the Sun. Find d (in light years). [Hint: 1 parsec = 3.26 light years is the distance from which the diameter of the orbit of the Earth (of radius 150 million km) subtends an angle of | second of arc. The radius of the Sun is 696 000km.] How does this distance compare with the distance to the nearest star? Gravitational redshifts We can note similarly that if light is emitted from the floor of a laboratory or rocket in free fall and received by a detector at the roof, then observer D should measure no change in frequency of this light. On the other hand, for observer Bin an accelerating rocket, the roof accelerates away from the position of the floor when the light was emitted; thus, in every time interval as measured by B, the light, has to travel further before reaching the roof, than in the previous time interval (Fig. 5.19a). Consequently, the accelerating observer B will detect a redshift in the received light (indeed this was shown by the calculation of observed redshift in a Rindler universe presented in Section 4.3). The principle of equivalence leads us to believe that the same will be true for the observer A stationary on the surface of the Earth (Fig. 5.19b). Thus we have the prediction of gravitational redshift: light ‘climbing out’ of a stationary gravitational field will be redshifted when received by a stationary observer (Fig. 5.19c). This has been verified in a number of different types of experiments. 
The celestial ones involve observations of distant massive stars, and a measurement by Brault in 1962 of the redshift of the sodium D1 line emitted on the surface of the Sun confirmed the general relativistic prediction to a precision of 5 per cent. The classic terrestrial experiments were by Pound and Rebka in 1959 and Pound and Snider in 1965; they used the Mössbauer effect to measure the redshift of photons emitted at the base of a 22.5 m tower at Harvard University and received at the top of that tower (see Fig. 6.7). The measured redshift agreed with that calculated from Einstein's theory to within 1 per cent.

Fig. 5.19 (a) In an accelerated rocket containing an observer B, light emitted at successive intervals from the floor has further and further to travel to the roof. (b) Observation of light rays by the equivalent observer A in a stationary lift in the Earth's gravitational field must give the same results as B's observations. (c) Gravitational redshift: the time interval $\Delta\tau'$ between reception of signals sent out at interval $\Delta\tau$ is larger than $\Delta\tau$, although the reception point w is not moving relative to the emission point u; this is because of a gravitational field between w and u, causing space-time curvature.

Geodesic deviation: light rays
One result of the gravitational bending of light rays is that the relationship between observed angles and distances is changed. In flat space-time, an observer receiving light rays with an angular separation of $\alpha$ from an object a distance $r$ away can conclude that the size of the object is $d = \alpha r$ (Fig. 5.20a). However, in curved space-time the conclusion is invalid, because the light rays will have been bent by gravity (Fig. 5.20b).
If the light rays are bent in towards each other (as we expect for an attractive gravitational field) they will be closer together at the object than one would directly deduce from their angular separation, and the object will appear to be larger than its real size because of this 'gravitational lensing' effect (Fig. 5.20c). This effect will also increase the observed luminosity of the object, because the light emitted by it is spread over a smaller surface area than would be the case in flat space-time. A further effect is that in general the light conveying images of distant objects will be differentially bent, since the light nearer a massive object will be bent more than the light further from the object, because the gravitational field is stronger near the object (Fig. 5.21). Thus, distortion will occur in the image; for example, a spherical object will appear elliptical, so in general the gravitational lensing is imperfect and distorts the appearance of the object observed.

Fig. 5.20 (a) In a flat space, the size $d$ of an object viewed with angular width $\alpha$ at a distance $r$ must be $\alpha r$. (b) In a curved space this relation is not true. If the space has negative curvature, the apparent size $\alpha r$ will be smaller than the real size $d$. (c) In a space of positive curvature, the light rays will be closer together at the object than they would be in flat space, and the apparent size $\alpha r$ will be larger than the real size $d$. This is the 'gravitational lensing' effect.

Fig. 5.21 Light rays nearer a massive body will be bent more than those further away, because the gravitational field is stronger nearer the body. Consequently, images will be distorted when light moves near a massive object.
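The differential bending described above can be made slightly more concrete with a short worked estimate. It assumes the standard weak-field deflection formula for a ray passing a point mass $M$ at closest distance $b$, which is not derived in this chapter and is used here purely as an illustration; the symbols $\delta$, $b$, and $\Delta b$ are introduced for this sketch only.

```latex
\delta(b) = \frac{4GM}{c^{2}\, b},
\qquad
\delta(b + \Delta b) - \delta(b)
  \approx -\frac{4GM}{c^{2}\, b^{2}}\,\Delta b
  = -\,\delta(b)\,\frac{\Delta b}{b}
\qquad (\Delta b \ll b).
```

Under this assumption, two neighbouring rays separated by $\Delta b$ are turned through slightly different angles, the inner ray more than the outer one. It is this difference in deflection across the bundle, rather than the common deflection of the bundle as a whole, that changes the shape of the image.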
From the space-time viewpoint, it is clear that what we are discussing is nothing other than the 'geodesic deviation' effect discussed above (Section 5.3), but now considered in the case of light rays. Because of the tidal effects of the gravitational fields of distant objects, initially parallel light rays will tend to intersect each other, and light rays diverging from a point will tend to be focused. As in the case of particle world-lines, the relative separation of neighbouring light rays can be used to detect space-time curvature, and to measure its strength. In the space-time context, Euclid's axiom that parallel straight lines never meet is replaced by an equation (the equation of geodesic deviation) determining how the distance between neighbouring geodesics varies as a result of space-time curvature. In the case of light rays, these effects are directly observable by measuring apparent angular diameters of distant objects.

Gravitational lensing
In extreme cases, the focusing effect resulting from the presence of massive objects or diffuse matter can cause bending sufficient to produce refocusing of the light rays. Then they no longer recede from each other as one goes to greater distances, but rather approach each other. Consequently, beyond a certain distance where the light rays start refocusing, the size of an object subtending a constant angular size $\alpha$ at the observer now decreases with distance from the observer (Fig. 5.22a), so if one were to move a rigid object further away (Fig. 5.22b) its apparent size would increase with distance from the observer (instead of decreasing, as one would normally expect). This can occur locally, or cover the whole past light cone.

Fig. 5.22 (a) The refocusing of light rays in a gravitational field. The size of objects subtending the same angle at an observer increases with distance first and then decreases with distance. (b) An object of size $d$ beyond the point of refocusing subtends a greater angle at the observer as it moves further away ($\alpha' > \alpha$).

Local lensing. An example of the occurrence of local refocusing is when, in a cosmological model, a massive object refocuses light rays from more distant objects, so causing multiple images (Fig. 5.23). This has now been observed in several cases where light from very distant quasi-stellar objects is focused by an intervening galaxy.* Figure 5.24 shows such a case; the two quasi-stellar images 0957+561 have been identified by their spectra as coming from the same quasi-stellar object; the galaxy causing the focusing is very faint, and can only be detected by special processing of the image (Fig. 5.25). This is a dramatic demonstration of the effect of intervening space-time curvature on light rays. In this example, the effect is local: light passing near the focusing galaxy is refocused, but light that does not go near it will be unaffected. Thus, this effect will only occur in comparatively few directions in the sky, for light rays that pass sufficiently near very massive galaxies or other objects.

* See 'The discovery of gravitational lenses' by F. H. Chaffee, Scientific American, November 1980.

Fig. 5.23 A massive object refocuses light from a more distant source, producing multiple images $I_1$ and $I_2$ of the source.

Fig. 5.24 and 5.25 Gravitational lensing by an intervening galaxy creates two images of a single quasi-stellar object (QSO 0957+561). In Fig. 5.24 the two QSO images are identified as coming from a single very distant object because of the similarity of their spectra. In Fig. 5.25 one of the QSO images has been digitally removed, revealing the fainter image of the lensing galaxy (which is nearer but does not radiate as energetically as the QSO). These photographs thus reveal directly the bending of light caused by the gravitational field of the galaxy, and so demonstrate space-time curvature. (These images were made by Alan Stockton at the Institute of Astronomy, University of Hawaii.)

Large-scale refocusing. The second kind of refocusing implies that the light cone as a whole is bent back in on itself. In flat space-time, the area of a wave front necessarily increases with distance from the observer (after having gone a distance $r = ct$ in a time $t$, the light from a source is spread out over an area $4\pi r^2$, cf. Fig. 4.29b). In a curved space-time, this will not be true; in general, the total area of a wave front will decrease with distance instead of increasing (Fig. 5.26a), because neighbouring light rays are focused towards each other (as in Fig. 5.22). Correspondingly, going back down our past light cone, the light cone as a whole will reach a maximum distance from our past world-line C and then start refocusing towards that world-line (Fig. 5.26b). Examination of expanding universe models confirms that this is indeed the kind of behaviour we expect for our own past light cone in the real universe, because there is sufficient matter and radiation uniformly spread out through the universe to cause this overall refocusing. That means that we expect the kind of refocusing behaviour shown in Fig. 5.22 to occur down every light ray, as we follow it back sufficiently far into the past. Locally, the light cone at each point still represents the speed of light, thus the local light cones (in a coordinate system in which the coordinates directly represent lengths and times) cannot be parallel to each other in such a space-time, and are shown tilted over appropriately in Fig. 5.26b. We believe that the density of matter in the universe is sufficient to cause this kind of refocusing, and so cause 'anomalous' angular diameters and luminosities in images of distant objects, at a redshift of somewhere between 1 and 5 (Fig. 5.27). However, this has not yet been verified observationally.

Fig. 5.26 Refocusing of light where the light cone as a whole is refocused by the curvature of space-time caused by the gravitational field of uniformly distributed matter or radiation. (a) Light spreading out spherically from a source s at a distance $d$ has area less than $4\pi d^2$, and eventually focuses to zero. In this situation, as seen by the observer, the light originates at a distant region, spreads to a maximum, and then focuses to the observer. (b) In a space-time view, this implies that the light cone of the observer reaches a surface S of maximum area and then bends back on itself as we follow it back into the past (the local light cones tip over, remaining tangent to the light cone of P; cf. Fig. 4.17(b) in the flat-space case). The surface of refocusing S where the geodesics are a maximum distance apart is seen by an observer at P as a surface of minimum angular diameters. Going further back into the past, the area of the light front decreases as the light rays approach each other.

Fig. 5.27 The apparent angular diameter of a rigid object as it is moved to further and further distances in an Einstein-de Sitter universe is given in terms of the observed redshift $z$ of the object by the relation
$$\alpha = (\text{constant})\,(1+z)^{3/2}\left[(1+z)^{1/2} - 1\right]^{-1}.$$
There is a minimum of the apparent angular diameter at a redshift $z = 5/4$.

Exercises
5.13 Suppose a black box is dropped from an aircraft and falls freely towards the Earth from a height of 10 km above the Earth's surface. Initially two marbles in the box are at rest a distance of 10 cm apart horizontally. How far apart will they be when the box hits the Earth? [The Earth's radius is approximately 6000 km. You may neglect the gravitational attraction between the marbles.] This device measures gravitational tidal forces by their geodesic deviation effect. Indicate how one might in principle construct another measuring device basically using the same idea but this time applied to light rays. Would this be useful in practice?
5.14 Consider a region of space-time far from any gravitating masses. How would you test whether it is flat or curved?
5.15 If light rays are bent in gravitational fields, can we still use light as a basis for the measurement of time and distance in curved space-times?
5.16 Draw diagrams to show how two images of the same object may be seen when the light rays are bent by (a) layers of air at different temperatures in a desert, (b) the gravitational field of a very dense body.
5.17 When refocusing of light rays occurs, the apparent flux of radiation measured from a distant object will differ from that measured in flat space-time. Consider how the argument leading to eqn (4.35) should be adapted to this situation. [Denote the area of the outgoing light front by A.]

5.7 Causality
The large-scale refocusing of light rays shows that the local behaviour of light cones can be very different in a curved space-time from that in flat space-time. This in turn implies that causal properties can be quite different. One particular feature that can occur is the existence of various types of horizon in a curved space-time, that is, surfaces that limit predictability in various ways. The simplest such surface is our past light cone, limiting the regions we can have had causal contact with (cf. the discussion in Chapter 1).
In the following chapters we will discuss carefully the concepts of an event horizon around a black hole, the basic concept having already been introduced during our discussion of the (flat-space) Rindler universe model, and of a particle horizon in cosmology. A further possibility in curved space-times is the violation of our normal ideas of causality, which we discuss briefly in this section.

To see how this can occur, we note that the local light cones can tilt over relative to each other; indeed we may expect that this will happen in a rotating system (the light rays get dragged along by the rotation). However, as before, the speed of light (locally determined by the light cone) is still a limiting speed, so the light cones and associated paths of light rays still determine what parts of space-time can be influenced by any particular event. If the rotation is large enough, the light cones may tip over until they appear horizontal in a given coordinate system; an example of a space-time where this occurs is Gödel's stationary, rotating universe, where the light cones tip right over if one goes far enough out from any observer (Fig. 5.28). Then causal violations will be theoretically possible in this space-time, because closed time-like lines can exist. Thus in principle an old man can stand next to, and converse with, a young man who is himself (i.e. the same person) at an earlier stage in his life history (Fig. 5.29)! It is in principle possible for an observer on any galaxy in this space-time to travel from any event in the galaxy's history to any previous event in its history, by accelerating far enough away from its world-line and then back. There is no evidence that this can occur in the real universe, but on the other hand this possibility (which raises various causal paradoxes) has not been disproved observationally or experimentally.
We do not claim it is likely that the real universe is like this, but merely point out that curved space-time models exist where this is a theoretical possibility.

Fig. 5.28 Gödel's stationary universe. On the axis the light cones are vertical but away from the axis the rotation causes them to tilt over. This tilting increases with distance from the axis so that eventually they are horizontal and there are then closed time-like lines (the curves drawn are everywhere pointing in the future-directed time-like direction of the local light cones).

Fig. 5.29 In a universe with closed time-like lines, world-lines can come back to themselves so it would be possible for an old man to stand next to himself as a young man!

Exercise
5.18 In a certain region the space-time interval $ds^2$ is given by
$$ds^2 = -(1 - a/r)\,dt^2 + \frac{dr^2}{1 - a/r} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)$$
where $a$ is a constant. Find the equation of the null cone at radius $r$ for $\theta$ and $\phi$ constant. At what value of $r$ would you expect there to be a horizon?

5.8 Parallel propagation along a curve
We have now covered most of the major new effects occurring in curved spaces and space-times in a qualitative manner. There is one further feature not mentioned so far: this is the concept of parallel propagation along a curve. While this plays an important role in setting up the mathematical formalism of curved space-times, the following chapters do not use the ideas introduced in this section, so it can be omitted during a first study of the idea of a curved space-time.

Consider a point P in a curved space; choose a direction $x_0$ at that point, and consider a curve $\gamma$ starting at P and ending at another point Q.
One can imagine moving along the curve $\gamma$, defining a direction $x$ at each point on it by keeping track of changes of direction and correcting so that $x$ remains parallel to the initial direction $x_0$. For example, in an aircraft one might register an initial direction as being along the axis of the aircraft; if it then turns 30° to the right, on the next leg of the trip the parallel direction will be 30° to the left of the new direction of the aircraft (Fig. 5.30). We call such a vector parallel transported along $\gamma$. A geodesic is then a curve whose direction is parallel transported along itself, i.e. whose direction is unchanging.

Parallel transport along a curve allows us to compare vectors at distant points in a curved space; however, there is no well-defined concept of 'parallel' at distant points (e.g. at London and New York) in an absolute sense because the result depends on the path taken between these two points. For example, consider a sphere (Fig. 5.31) and motion along the curve $\gamma$ along the great circle from P (on the equator) to Q (the north pole), e.g. by steering a ship straight ahead all the way. Let $x_0$ at P point along the equator to the right. Then at each point of $\gamma$ the parallel transported vector $x$ will remain at right angles to the direction of $\gamma$, and so will define the vector $x_\gamma$ at Q. Now consider motion from P to Q along the segment of the equator from P to the point R a quarter around the equator (call this segment $\lambda'$), and then up the great circle $\lambda''$ from R to Q, these two segments together defined as the curve $\lambda$ from P to Q. Parallel transporting $x$ along $\lambda'$, it always points along the direction of $\lambda'$; when the new path turns 90° left at R, the vector $x$ will initially lie at right angles to the direction of motion and this will remain true until Q is reached, defining the vector $x_\lambda$ at Q. This is at right angles to the vector $x_\gamma$ there.

Thus parallel transporting a vector from P to Q along two different paths $\gamma$ and $\lambda$ in general gives a different result at Q; mathematically, we say that parallel transport is not integrable. It is then clear that parallel transporting $x_\gamma$ along $\gamma$ from Q to P and then along $\lambda$ back from P to Q will result in a vector parallel to $x_\lambda$; thus parallel transport round a closed loop results in rotation of the vector. The amount of this rotation is a measure of the amount of curvature enclosed by the loop; in a flat space with its normal topology, the rotation will be zero.

Fig. 5.30 The parallel transport of a direction along the path of an aircraft. Initially the direction is along the axis of the aircraft, but after its path turns through 30° to the right, the direction is 30° to the left of the aircraft's axis.

Fig. 5.31 The parallel transport of a direction $x$ on the surface of a sphere: when transported from P to Q along the path $\gamma$, it defines the vector $x_\gamma$ at Q; when transported along the path $\lambda$ via R, it defines the vector $x_\lambda$ at Q. The vectors $x_\gamma$ and $x_\lambda$ are not parallel to each other!

The idea of parallel transport can be extended to space-time. Parallel transport of a space-like vector $x$ along a time-like geodesic is understood to represent the physical situation of using a perfect gyroscope (or equivalent mechanical device, such as a Foucault pendulum) that keeps pointing in the same direction and so tells one what direction at a later time in one's history is parallel with a direction at an earlier time (Fig. 5.32a,b). This is the basis of the non-rotating reference frame that underlies the usual studies of mechanics and is realized, for example, in the inertial guidance systems of ships, aircraft, and spacecraft. As particles in free fall and light rays move on space-time geodesics, the directions of their world-lines are parallel propagated along them.

Fig. 5.32 (a) Parallel transport of a gyroscope along a world-line in space-time. (b) This defines parallel directions at one place but at different times (e.g. what is the direction in this room that is parallel to where the vertical direction was an hour ago?)

Again, if different world-lines join events P and Q, then parallel transport of the same vector at P along these two world-lines will, in general, result in a different vector at Q. On the one hand, this is the basis of a delicate test of general relativity using gyroscopes taken on different paths around the Earth (Fig. 5.33). On the other hand, it is the basic explanation of how geodesics can represent free fall. Consider a particle thrown up from the Moon's surface and falling back freely (Fig. 5.34a). Since there is no air resistance, it is in free fall and so its space-time path is a geodesic that leaves and then returns to the world-line of the observer at events A and B (Fig. 5.34b). Its velocity is parallel propagated along its path from A to B, and at B makes the opposite angle with the world-line of the observer compared with the initial direction parallel propagated along the observer's path from A to B. This is possible because parallel propagation along different paths from A to B results in different directions at B. An analogous effect occurs with great circles on the surface of the Earth; this is in each case a direct result of curvature of space or space-time.

Exercise
5.19 Consider a circle drawn on the surface of a cone, at a constant distance from the vertex. Take a direction in the surface at right angles to the circle, and perform parallel transport on it round the circle. By what angle will its direction change in one circuit? What do you conclude about the curvature of the surface? [You may find it useful to 'flatten out' the cone onto a plane, as discussed previously, to see clearly what is happening.]
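The sphere example of this section (transport from P on the equator to the pole Q, either directly or via the point R a quarter of the way round the equator) can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it works on a unit sphere embedded in ordinary three-dimensional space, and it uses the fact that parallel transport along a great circle is the same as a rigid rotation about the axis perpendicular to the plane of that circle. The coordinates chosen for P, R, and Q and all function names are hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def transport_along_great_circle(vec, start, end):
    """Parallel transport a tangent vector from start to end (unit vectors)
    along the shorter great-circle arc joining them.

    Implemented as a rotation about the normal to the plane of the great
    circle, using Rodrigues' rotation formula.
    """
    axis = cross(start, end)
    norm = math.sqrt(dot(axis, axis))
    k = tuple(component / norm for component in axis)
    angle = math.atan2(norm, dot(start, end))
    k_cross_v = cross(k, vec)
    k_dot_v = dot(k, vec)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return tuple(v * cos_a + w * sin_a + ki * k_dot_v * (1.0 - cos_a)
                 for v, w, ki in zip(vec, k_cross_v, k))


P = (1.0, 0.0, 0.0)    # point on the equator
R = (0.0, 1.0, 0.0)    # a quarter of the way round the equator from P
Q = (0.0, 0.0, 1.0)    # north pole
x0 = (0.0, 1.0, 0.0)   # initial direction at P: along the equator, towards R

# Path gamma: straight up the meridian from P to Q.
x_gamma = transport_along_great_circle(x0, P, Q)

# Path lambda: along the equator from P to R, then up the meridian from R to Q.
x_lambda = transport_along_great_circle(
    transport_along_great_circle(x0, P, R), R, Q)

cos_angle = dot(x_gamma, x_lambda) / math.sqrt(
    dot(x_gamma, x_gamma) * dot(x_lambda, x_lambda))
angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
print(f"Angle between the two transported vectors at Q: {angle_deg:.1f} degrees")
```

The two results at Q come out at right angles, as argued in the text. The triangle PRQ covers one eighth of the sphere, so on a unit sphere its area is π/2, which is 90° expressed in radians; this is consistent with the statement that the rotation round a closed loop measures the curvature enclosed by the loop.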
Fig. 5.33 (a) A measurement of the curvature of space-time involves transporting gyroscopes on different paths around the Earth, and comparing their final direction with a 'stay-at-home' gyroscope. (b) This compares parallel transport along different space-time paths between the same events.

Fig. 5.34 (a) A particle thrown from the Moon's surface at event A and landing again at event B, after falling freely (and therefore travelling on a geodesic in space-time). (b) A space-time diagram of this situation. The initial direction $v$ of particle motion is parallel transported along the world-line $\gamma$ of the observer from A to B, defining a vector $v_\gamma$ at B. However, after parallel transport along the geodesic path $\lambda$ of the particle from A to B, it defines the direction $v_\lambda$ at B. The vectors $v_\gamma$ and $v_\lambda$ are not parallel to each other ($v_\gamma$ is in the $+z$-direction, but $v_\lambda$ is in the $-z$-direction). This corresponds to the fact that in (a), when the particle leaves the observer its motion is upwards but when it returns its motion is downwards.

5.9 Further tests of Einstein's theory
We have already mentioned in Section 5.6 the ways in which measurements of the gravitational bending of light and of the gravitational redshift provide evidence in support of Einstein's theory. In this section, we shall describe briefly some other experimental tests, leaving until Section 5.11 the important topic of the detection of gravitational waves. For a fuller discussion of experimental tests of general relativity, the reader is referred to Clifford Will's book Was Einstein Right? (second edition: Basic Books, New York, 1993).

Perihelion shifts
According to Newtonian theory, a planet moving in the gravitational field of the Sun and sufficiently far removed from the gravitational effects of other bodies would describe a closed elliptical orbit.
However, it has been known for a long time that motion in the solar system does not fit this idealized picture. The planet subject to the most intense scrutiny has been Mercury; being the nearest planet to the Sun, the gravitational effects on its motion are easiest to measure. It turns out that its orbit is not closed, but like an ellipse with axes which rotate by a tiny amount each time the planet goes round. The way to make this idea precise is to consider the perihelion, which is the position of closest approach to the attracting body, the Sun in this case. The line joining the planet to the Sun at this point is observed and is found to precess; it rotates through a very small angle each time. A very large part of the rotation of Mercury's perihelion is a result of classical Newtonian effects, in particular the perturbation of the orbit due to other planets. This accounts for 5557 seconds of arc per century. Very accurate observations and calculations left a tiny rotation of 43 seconds of arc per century which could not be explained this way. This presented a major challenge for Einstein's new theory. Using the Schwarzschild solution to describe the spherically symmetric gravitational field of the Sun, Einstein was able to determine the orbit of Mercury according to general relativity, and amazingly the prediction gave a rotation of 43 seconds, in excellent agreement with observation. This was the first experimental test of the theory and provided very compelling evidence in support of it. The test is particularly compelling because the theory was not designed specifically to meet this challenge: it just turned out that it did so, after the theory had been fashioned on the basis of fundamental considerations by Einstein on the nature of space, time, and gravitation.

More recently, the general relativistic prediction of perihelion precession has been confirmed by observations of the binary pulsar discovered by Hulse and Taylor in 1974.
This system, which will be discussed in more detail in Section 5.11, consists of two very compact stars in a very tight orbit around each other. The perihelion precession is orders of magnitude larger than that of Mercury; the prediction of about 4 degrees per year agrees closely with the measured value.

Radar time-delay
A way of investigating the curvature of space produced by the Sun, say, is to measure the delay in the travel time of a radar beam passing near it, as compared with the travel time if the space were flat. Early experiments were performed by sending a radar beam from Earth and measuring the round trip time after it was returned by a reflector on the surface of Venus or Mercury or on board a Mariner spacecraft. As the path of the radar beam moved nearer to the Sun as the relative positions of the Earth and the reflector changed, the travel time varied (see Fig. 5.35). In a more recent experiment by Shapiro in 1976, the radar travelled to Mars and was sent back by reflectors both on the surface of the planet and in a spacecraft in orbit around it. The round trip time for signals passing near the Sun was measured and found to agree well with the values calculated from general relativity. Because of the curvature of space, the distance was found to be larger by about 37 km out of a total distance of 378 million km from Earth to Mars. The radar travel time was about 42 minutes for the round trip.

The Global Positioning System
An intriguing application of, and testing ground for, relativity lies in the setting up of the Global Positioning System (GPS). The basic idea is that an observer should be able to determine his or her position in space and time with extreme accuracy by using signals from a network of satellites. In the current system, there are twenty-four satellites in various orbits around the Earth, arranged such that four or more of them are almost always visible from any place on Earth.
Each satellite carries an extremely accurate and stable atomic clock, and signals from this are emitted from the satellite. Ground-based monitoring stations collect data which is processed and re-transmitted to the satellites. The user has a small computer which uses information from the satellites to solve for its position, time, and velocity. The accuracy is extremely impressive, and can be as good as 5-10 cm in position. Although the velocities of the clocks are small and the gravitational fields are weak, relativistic effects like time dilation and gravitational frequency shift would cause errors much larger than possible errors in the accuracy of the cesium clocks used, and so they need to be taken into account.

Fig. 5.35 The gravitational field of the Sun produces curvature in space, which is represented here by a 'rubber sheet' picture. (a) When the target planet is far from the Sun, the radar path is on the 'flat' part of the sheet. (b) As the target approaches the Sun, the radar has greater and greater distance to cover because of the 'dip' in the sheet, so the travel time is longer.

A natural consequence of this is that the GPS provides a way of testing the theory of relativity. Data from the GPS satellites recorded by the TOPEX satellite (in orbit primarily to measure the height of the sea) is providing the first explicit measurements of the periodic part of the combined effect of time dilation and gravitational frequency shifts on an orbiting receiver. Preliminary analysis of the data gives an agreement between theory and experiment to within 2.5 per cent. (For more details, see 'The global positioning system' by T. A. Herring, Scientific American, February 1996, 32-38.)
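To see why the relativistic effects mentioned above cannot be ignored in the GPS, a rough order-of-magnitude estimate can be sketched as follows. Everything numerical here is an assumption introduced for illustration and is not taken from the text: a circular orbit of radius about 26 560 km, standard values for the Earth's mass parameter and mean radius, the weak-field expression for the gravitational frequency shift, and the low-velocity expression for time dilation. The rotation of the ground clock with the Earth and the eccentricity of real orbits are neglected.

```python
# Rough estimate of the rate difference between a GPS satellite clock and a
# clock on the ground. All numerical values are assumed reference values.
GM_EARTH = 3.986004e14      # Earth's gravitational parameter, m^3 s^-2
C = 2.99792458e8            # speed of light, m/s
R_EARTH = 6.371e6           # mean radius of the Earth, m
R_ORBIT = 2.656e7           # assumed orbital radius of a GPS satellite, m
SECONDS_PER_DAY = 86400.0


def gravitational_rate_offset(r_ground, r_satellite):
    """Fractional amount by which the higher clock runs fast (weak field)."""
    return GM_EARTH / C ** 2 * (1.0 / r_ground - 1.0 / r_satellite)


def velocity_rate_offset(r_satellite):
    """Fractional amount by which the moving clock runs slow, -v^2/(2c^2),
    with v^2 = GM/r for a circular orbit."""
    return -(GM_EARTH / r_satellite) / (2.0 * C ** 2)


grav_us = gravitational_rate_offset(R_EARTH, R_ORBIT) * SECONDS_PER_DAY * 1e6
vel_us = velocity_rate_offset(R_ORBIT) * SECONDS_PER_DAY * 1e6
net_us = grav_us + vel_us

print(f"Gravitational frequency shift: {grav_us:+.1f} microseconds per day")
print(f"Time dilation from orbital speed: {vel_us:+.1f} microseconds per day")
print(f"Net offset of the satellite clock: {net_us:+.1f} microseconds per day")
print(f"Equivalent light-travel distance: {net_us * 1e-6 * C / 1000:.1f} km per day")
```

Under these assumptions the two effects have opposite signs and the gravitational one dominates, so the satellite clock runs fast by a few tens of microseconds per day. Since light travels about 300 m in one microsecond, an uncorrected offset of this size would be far larger than the positional accuracy quoted in the text.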
5.10 Gravitational waves Einstein realized more than 80 years ago that his theory predicted the existence of gravitational waves, but it is only relatively recently that any progress has been made on detecting them. In this section, we shall first look at the nature of gravitational waves and then at possible sources for their emission. In Section 5.11, we shall discuss methods of detecting them, both by experi- ments with bars and interferometers and also less directly through study of the energy decrease of a system which is best explained by the emission of such waves. The nature of gravitational waves Gravitational waves are fluctuations in the metric tensor which describes the curvature of space-time. One can think of them as ripples which travel through space-time, usually as a result of rotation or other changes in the body producing the gravitational field. We shall look first at a particularly simple way in which Einstein’s equations can have wave solutions. This is in the case when the gravitational field in empty space is weak, so that the metric tensor can be written as its flat-space value 7” plus a small perturbation 7. The ‘linearized field equations’ are derived from Einstein’s equations by ignoring terms in h¥-squared and higher powers. We obtain a set of linear second-order partial differential equations for hY, and witha particular choice of coordinates called the Lorentz gauge (see Section 5.5) they take the form in empty space of the wave equation for the combination of components h” defined by A” = h¥ —4ninsh". This equation has precisely the same form as the equation describing electromagnetic waves, which means that gravitational waves, like their electromagnetic counterparts, travel with the speed of light. As in the electromagnetic case, the wave equation here has plane wave solutions h® = AY sin(wt +k.x) (5.10) (or cosine of the same argument) where AY and k; = (w,k) are constants. There is already a restriction on A! 
because of the choice of the Lorentz gauge, and it can be restricted further to what is known as the transverse traceless gauge. Traceless means that if $A^{ij}$ is written as a matrix, the sum of its diagonal components is zero. To understand what we mean by transverse, let us choose our axes so that the wave is travelling in the z-direction, so $k_i = (\omega, 0, 0, k)$. Then our gauge conditions mean that components of $A^{ij}$ with either $i$ or $j$ being in the z-direction are zero, so that the wave oscillations are transverse to the direction in which it is travelling. In particular, we can write (rows and columns ordered x, y, z)

$$A^{ij} = \begin{pmatrix} A_{xx} & A_{xy} & 0 \\ A_{xy} & -A_{xx} & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

so that there are only two independent components, $A_{xx}$ and $A_{xy}$. Hence the perturbation of the metric from its flat-space value also has only two independent components.

One way of understanding such waves more concretely is to consider their effect on the motion of particles which they pass. It is no good to consider a single particle as we could always choose coordinates moving with it, so we need to consider the relative motion of two or more particles (see Section 5.3 and pp. 212-13). It is possible to construct and solve an equation for the vector separating two particles; this is called the equation of geodesic deviation and relates derivatives of the separation to derivatives of the metric describing the space-time curvature. It is beyond the scope of this book to give details of this equation but we shall describe its predictions pictorially. Consider a circle of particles around a central one, all lying in the (x, y)-plane, perpendicular to the direction of travel of the wave. For a wave with $h_{xx}$ non-zero, $h_{xy} = 0$, the circle would be distorted as shown in Fig. 5.36b, first squeezed in the x-direction and elongated in the y-direction, followed by the opposite effect. For a wave with $h_{xx} = 0$ but $h_{xy}$ non-zero, there would be a similar effect but the squeezing and elongation would be at 45 degrees to the x- and y-axes (Fig.
5.36c). We say that the plane wave has two polarization states corresponding to Fig. 5.36.

Fig. 5.36 Distortion by gravitational waves with the two types of polarization. (a) A circle of particles around a central one, all lying in the (x, y)-plane, before a gravitational wave travelling in the z-direction reaches them. (b) Distortion for non-zero $h_{xx}$. (c) Distortion for non-zero $h_{xy}$.

We have focused our discussion on plane wave solutions of the linearized form of Einstein’s equations. There are also exact wave solutions in the more general case where no approximation is made about the weakness of the gravitational field, but since any gravitational waves reaching the Earth are likely to be weak, we shall not discuss the more general waves here.

Expected sources of gravitational waves

One might wonder first whether any experiment in a laboratory could generate gravitational waves. The answer is yes. For example, a large heavy bar rotating rapidly should produce gravitational radiation, but calculations show that the power generated would be orders of magnitude too small to be detected by the most sensitive detectors currently envisaged. This means that we need to consider astrophysical sources if we want any realistic chance of detecting the waves. There are a large number of possibilities here which we shall describe briefly. One of the main requirements is that the source should not be too symmetrical. For instance, we know from the solution of Einstein’s equations for a spherically symmetric body that it cannot emit gravitational waves.

One of the most important phenomena generating gravitational radiation is that of stellar collapse.
A white dwarf star, which is typically of about one solar mass and has stopped burning its nuclear fuel, can collapse under gravity to form a much more dense neutron star, which is supported against further gravitational collapse by the pressure of degenerate neutrons and by strong interaction forces. If rotating rapidly, this could fragment into a number of pieces which would then lose energy and angular momentum, and eventually recoalesce. The resulting reduced-size neutron star could then collapse through its gravitational radius or ‘horizon’ (see Chapter 6) to form a black hole, into which nearby matter would fall. Throughout this process, gravitational waves should be emitted, with particularly large amounts at moments of collapse.

Another major source of gravitational waves is binary star systems, which are quite common. These consist of two compact objects (two neutron stars, a neutron star and a black hole, or two black holes) in orbit around each other. As energy is lost by gravitational radiation, the objects spiral in towards each other and will eventually coalesce.

Supernova events are thought to occur when a star with mass more than about twelve solar masses exhausts its nuclear fuel and suffers a massive explosion, which produces a short but powerful burst of gravitational radiation. An end product of the explosion is likely to be a rotating neutron star or pulsar, which, if it is not axially symmetric, produces gravitational waves.

Supermassive black holes, thought to exist at the centres of galaxies, should give rise to gravitational radiation as matter or smaller black holes fall into them.

Finally, physical processes in the early universe (quantum fluctuations amplified by a period of ‘inflation’; see discussion below) might lead to a cosmological background of gravitational waves.
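To make the two polarization states described in this section more concrete, the sketch below displaces a ring of test particles lying in the (x, y)-plane. It uses the standard first-order result of the geodesic deviation equation in the transverse traceless gauge, $\delta x = \tfrac{1}{2}(h_{xx}x + h_{xy}y)$ and $\delta y = \tfrac{1}{2}(h_{xy}x - h_{xx}y)$. That formula is not derived in the text and is taken here as an assumption; the function name `distort_ring` and the hugely exaggerated amplitude 0.2 are purely illustrative.

```python
import math


def distort_ring(h_xx, h_xy, n_points=8, radius=1.0):
    """Return displaced (x, y) positions of a ring of test particles.

    Assumed first-order transverse-traceless displacement:
        dx = (h_xx * x + h_xy * y) / 2
        dy = (h_xy * x - h_xx * y) / 2
    """
    ring = []
    for n in range(n_points):
        angle = 2.0 * math.pi * n / n_points
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        dx = 0.5 * (h_xx * x + h_xy * y)
        dy = 0.5 * (h_xy * x - h_xx * y)
        ring.append((x + dx, y + dy))
    return ring


# One polarization at a time; the wave amplitude is taken at its maximum.
plus = distort_ring(h_xx=0.2, h_xy=0.0)    # stretch/squeeze along x and y
cross = distort_ring(h_xx=0.0, h_xy=0.2)   # stretch/squeeze along the diagonals

for (px, py), (cx, cy) in zip(plus, cross):
    print(f"h_xx only: ({px:+.3f}, {py:+.3f})    h_xy only: ({cx:+.3f}, {cy:+.3f})")
```

Half a period later the signs of $h_{xx}$ and $h_{xy}$ reverse, which swaps the stretched and squeezed directions, as in the pictorial description of Fig. 5.36.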
5.11 Detection of gravitational waves

Although there are many potential sources of gravitational waves in the universe, the detection of such waves has proved to be an extremely challenging experimental problem and it is only now that direct detection seems a realistic prospect in the foreseeable future. Although the predicted strength of waves from astrophysical sources is much greater than anything which could be generated in a laboratory, they are still extremely weak, requiring detectors at the forefront of current technology. We shall now see how detectors could work.

Direct detection

The strength of a gravitational wave is usually characterized by a parameter $h$, the strain produced on an idealized detector consisting of two free masses a distance $L$ apart. If their separation changes by $\Delta L$ as a result of the passing of the wave, $h$ is given by

$$h = \frac{2\Delta L}{L} \qquad (5.11)$$

Although the largest signals currently anticipated, say from a supernova explosion in our galaxy, have $h$ in the range 107!” to $10^{-18}$, such events are likely to be rare and so it makes more sense to have detectors with sensitivities of 10-7! to 10-2. We shall see to what extent present-day detectors match up to this aim.

The basic idea for a means of detecting gravitational waves is, as already suggested, to measure changes in the metric by studying the separation of two heavy masses suspended in a way which isolates them as much as possible from all other vibrations. As a model of what are known as resonant detectors, consider two masses joined by a spring (Fig. 5.37). In the absence of gravitational waves, the oscillations of the spring would be simple harmonic motion with damping (like the motion of an imperfect pendulum which gradually slows down because of air resistance).
However, gravitational waves impinging on the masses could provide a forcing term for this damped motion, and adjustment of the parameters of the detector to match the frequency of the waves could result in a large or resonant response, which would be more likely to be detected.

The pioneer of gravitational wave detection is Joseph Weber from the University of Maryland, who first built such resonant detectors in the 1960s. Rather than the simplified model just described, a resonant detector usually consists of a very large cylindrical bar, with the elasticity of the bar, when it is stretched along its axis, playing the role of the spring. In the last 30 years, Weber has reported a number of gravitational wave ‘events’, and although they have not been confirmed by other laboratories, they have certainly inspired the search. Currently there are resonant detectors in operation in Australia, Italy and the United States, with sensitivities as good as 10-?°. It is hoped to increase these sensitivities to 10-° to 10- in the next decade or two.

Fig. 5.37 A schematic representation of a resonant detector of gravitational waves; two masses m are joined by a spring.

Clearly the larger a detector, the more its length will be changed by the passage of a gravitational wave. It is not realistic to construct solid bars with lengths increased by several orders of magnitude, but there is an exciting and conceptually simple way of surmounting this problem, the development of interferometric detectors. The basic idea is to bounce laser beams back and forth between mirrors suspended as pendulums so that they act as free masses. The paths of the laser beams are along two arms at right angles and typically several kilometers long. The beams pass through various partially-transmitting mirrors (see Fig.
5.38) and when they are eventually recombined, the interference pattern changes if the lengths of the paths change because of gravitational radiation. Such interferometers should be able to detect waves from both ‘light’ sources, like exploding stars, and ‘dark’ sources, such as black holes.

The most well-known project using interferometry is LIGO (Laser Interferometer Gravitational-Wave Observatory) in the United States. Detectors with arm-lengths of 4 km are being constructed in Washington State and Louisiana. The aim is to start experiments in 2002 with a sensitivity of about $10^{-21}$, which should be improved within two years. Other projects with similar intended sensitivities are under way in Germany, Italy, and Japan.

Taken a stage further is the somewhat mind-blowing LISA (Laser Interferometer Space Antenna) project. The idea of this is to have spacecraft in orbit in positions forming the vertices of an equilateral triangle of edge-length 6 million km! The sensitivity of this could be many orders of magnitude better than anything achieved so far, but unfortunately it is unlikely to be operational before 2015, if then.

There is a great deal of optimism about the prospects for direct detection in the near future of gravitational waves, perhaps from some astrophysical sources not yet anticipated. Whatever happens, it is likely that we shall learn a great deal about the universe and about Einstein’s theory as a result of these experiments.

Fig. 5.38 A schematic representation of an interferometric detector of gravitational waves (not to scale: the paths to the fully reflective mirrors are much longer than the other paths). Labelled components: laser, beam splitter, partially transmitting mirrors, fully reflective mirrors, detector.

Indirect detection

Astonishing as it may seem, we already have indirect evidence for the existence of gravitational waves.
Moreover, this was obtained not from gravitational wave detectors but from conventional radio telescopes. Before looking at the details, we need to consider the theoretical basis for this indirect detection. Gravitational waves carry energy (which is why a bar detector oscillates when such energy is transferred to it by a passing wave). Since we believe that energy is conserved overall, this means that the source of the waves must be losing energy. Suppose that the source is two compact objects in orbit around each other. As energy is lost, the size of the orbit decreases and the period of rotation becomes shorter. So the idea is that if one observes a binary system with decreasing period, the most likely explanation is that gravitational radiation is being emitted.

In 1974, Hulse and Taylor, astronomers then at the University of Massachusetts at Amherst, discovered what was labelled as PSR1913+16, a type of neutron star known as a pulsar because it rotates rapidly and very regularly, beaming out charged particles from each of its magnetic poles. This particular pulsar also moves in close orbit about a very massive companion neutron star, with a period of about 8 hours. If this system emits gravitational waves, then its energy must decrease, the pulsar and its companion will move closer to each other and their orbital period will decrease. This effect was calculated as early as 1941 by the Russian physicists, Landau and Lifshitz, and observations of the binary pulsar agreed extremely closely with the theoretical prediction. The observed value in 1982 for the rate of decrease of the period was $(2.30 \pm 0.22) \times 10^{-12}$ compared with the relativistic prediction of $2.4 \times 10^{-12}$. These values translate into about $7 \times 10^{-5}$ seconds per year, so the experimental accuracy needed was very high. Hulse and Taylor were awarded the Nobel Prize in 1993.
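The conversion from the dimensionless rate of period decrease to seconds per year is simple arithmetic, sketched below. It takes the quoted rates to be $2.30 \times 10^{-12}$ (observed, with uncertainty $0.22 \times 10^{-12}$) and $2.4 \times 10^{-12}$ (predicted), both in seconds of period per second of elapsed time, and assumes a year of 365.25 days; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 86_400.0   # assumed length of a year in seconds


def period_change_per_year(rate):
    """Seconds by which the orbital period shortens during one year,
    given the dimensionless rate of decrease (seconds per second)."""
    return rate * SECONDS_PER_YEAR


OBSERVED, UNCERTAINTY, PREDICTED = 2.30e-12, 0.22e-12, 2.4e-12

observed_per_year = period_change_per_year(OBSERVED)
predicted_per_year = period_change_per_year(PREDICTED)
consistent = abs(OBSERVED - PREDICTED) <= UNCERTAINTY

print(f"observed : {observed_per_year:.2e} s per year")
print(f"predicted: {predicted_per_year:.2e} s per year")
print("prediction lies within the quoted uncertainty:", consistent)
```

Both results are a little over $7 \times 10^{-5}$ seconds per year, and the prediction falls inside the quoted error bar on the observation.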
Perhaps at some stage in the future, gravitational wave detectors will be sufficiently sensitive to register directly the radiation from PSR1913+16. In the meantime, it provides us with the best indirect evidence for the existence of such radiation.

The reader who wishes to keep up-to-date on this exciting subject will find articles on it from time-to-time in journals like Scientific American. Two very informative articles from earlier this decade are ‘Catching the wave’ by Russell Ruthen, Scientific American, March 1992, 72-81, and ‘Binary neutron stars’ by Tsvi Piran, Scientific American, May 1995, 53-61. The theoretical background is covered in a very accessible way by Bernard Schutz in A First Course in General Relativity (Cambridge University Press, 1985).

A different kind of indirect detection applies to the possible gravitational cosmic radiation background mentioned above. While this might be observed directly with extremely sensitive detectors, it should also be detectable by its effects on the cosmic microwave background radiation anisotropies (discussed in the section on cosmology). There are currently various groups undertaking high-sensitivity measurements of these anisotropies; if they have the appropriate angular pattern, they might give us an indirect detection of the cosmic background of gravitational radiation.

5.12 Alternative theories and approaches

Although general relativity has emerged unscathed from all the tests to which it has been subjected so far, a number of alternative classical theories also exist, and there is still considerable uncertainty as to how a theory of quantum gravity can be set up. We shall discuss these possibilities briefly in this section.

Varying $\kappa$

When Einstein put forward his theory of general relativity, and for quite some time afterwards, it was assumed that the gravitational constant $\kappa$ was just that: a fundamental constant of nature with fixed value like the mass of the electron.
However, with the discovery in 1929 that the universe is expanding this assumption was called into question.

The origin of inertia has for long been a source of speculation. It is clear that the gravitational force on a test particle depends on the matter in the rest of the universe. But we have seen that gravity and inertia are intimately connected with each other. Putting these together, Mach’s principle (see the discussion in D.W. Sciama’s book The Physical Foundations of General Relativity, Doubleday, 1969) suggests that inertia is the result of interactions with very distant matter in the universe. Indeed it is likely that the most distant matter we see is most important, the essential point being that the very large amount of such matter makes up for its very large distance. If this is so, then because the universe is expanding, it might be that the consequent change in the force on a test particle would be described by changes in the value of $\kappa$.

Another motivation for considering the idea of varying $\kappa$ came from the British physicist Paul Dirac, who was awarded the Nobel Prize in 1933 for his leading role in the development of quantum mechanics. Dirac noticed a rather extraordinary coincidence between particular combinations of quantities appearing in physics. The ratio of the electric force between a proton and an electron to the gravitational force between them, and the ratio of the age of the universe to the time for light to travel a tiny distance called the classical electron radius, are both enormous numbers, and what is more, they are both approximately $10^{40}$. Unless we live at a special time, this coincidence should be valid at other times. Now the age of the universe is certainly not constant, so that suggests that some other ‘ingredient’ in the numerical coincidence is also changing (keeping the ratio constant). The most likely candidate is $\kappa$!
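Dirac's two large numbers are easy to evaluate, and the sketch below does so using standard (CODATA-style) values of the constants. The age of the universe is an assumption of this example, set to 14 billion years as a round figure, and the variable names are illustrative only. The two ratios are not equal, but both lie within about an order of magnitude of $10^{40}$, which is the sense in which the coincidence is meant.

```python
import math

# Standard physical constants in SI units.
E_CHARGE = 1.602176634e-19     # elementary charge, C
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
G_NEWTON = 6.67430e-11         # Newtonian gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192369e-27   # kg
M_ELECTRON = 9.1093837015e-31  # kg
C = 299_792_458.0              # speed of light, m/s

AGE_OF_UNIVERSE_YEARS = 1.4e10          # assumed round value, not from the text
SECONDS_PER_YEAR = 365.25 * 86_400.0

# Both forces fall off as 1/r^2, so their ratio is independent of separation.
coulomb_factor = E_CHARGE**2 / (4.0 * math.pi * EPSILON_0)
force_ratio = coulomb_factor / (G_NEWTON * M_PROTON * M_ELECTRON)

# Classical electron radius and the light-crossing time associated with it.
electron_radius = coulomb_factor / (M_ELECTRON * C**2)
crossing_time = electron_radius / C
time_ratio = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR / crossing_time

print(f"electric / gravitational force: {force_ratio:.2e}")
print(f"age of universe / light-crossing time: {time_ratio:.2e}")
```

Because the first ratio contains the gravitational constant and the second contains the age of the universe, keeping them in step as the universe ages would require the gravitational coupling to change with time, which is the argument made above.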
The simplest way to incorporate this possibility into physics is just to replace the constant $\kappa$ in Einstein’s equations by a function of time. Another way, which forms the basis of the Brans-Dicke theory, which is one of the so-called scalar-tensor theories of gravity, is to introduce a completely new term into Einstein’s equations. This term involves a scalar field $\phi$ (that is, a field without indices), the value of which is determined by the matter throughout the universe. This field also plays the role of the inverse of $\kappa$, leading of course to a varying value of that so-called constant again.

Experiments to distinguish between theories with varying $\kappa$ and conventional general relativity are very difficult, partly because any variation in $\kappa$ would be expected to be very small anyway. For many years, it was not possible to differentiate between the theories, but recent experiments all seem to come out in favour of general relativity. If $\kappa$ did vary, planetary orbits would slow down as a result, but observations of Mercury, Venus, and Mars have found no such effect, down to one part in 100 billion per year. It is possible that future observations of gravitational waves would also provide conclusive evidence for or against theories with varying $\kappa$.

Current best limits on the time variation $\dot{\kappa}$ of $\kappa$ from combined solar system measurements are $|\dot{\kappa}/\kappa|$ < 4 x 107!” yr$^{-1}$ (with the same limits on the time variation of the Newtonian gravitational constant).

Quantum gravity

A problem which has challenged theoretical physicists for many years is how to combine two of the most successful physical theories of the twentieth century, namely general relativity and quantum mechanics. As we have seen, general relativity provides a very accurate description of the large scale behaviour of gravitating bodies in the solar system and beyond.
On the other hand, quantum mechanics deals predominantly with the behaviour of matter on the very small scale. Why then is there any need to try to relate these theories? In fact there are a number of compelling reasons for attempting this.

Think first about the very early universe. Just after the Big Bang, the gravitational fields were extremely strong and the distances minute, so that both relativistic and quantum effects would have been very important. In Einstein’s theory, the Big Bang itself was what is known as a singularity because the density of matter was infinite. General relativity does not deal with singularities (they are specifically excluded from its domain) so a new theory, which takes quantum effects into account, is needed to throw further light on the Big Bang and on other singularities. In Chapter 6, we shall consider black holes, which have singularities in their centres. In the presence of the strong gravitational fields produced by black holes, quantum effects are known to be significant. For example, Stephen Hawking has shown that black holes are not really so black when quantum mechanics is taken into account; particles can be radiated from these objects which, classically, absorb everything and emit nothing.

Another argument for trying to find a synthesis of quantum mechanics and general relativity is one of completeness. There are four fundamental forces in nature: the strong and weak nuclear forces, the electromagnetic force, and gravity. A breakthrough in particle theory occurred in the mid-70s when the weak and electromagnetic forces were combined in a unified theory, described by mathematical objects called Lie groups. Glashow, Salam, and Weinberg received the Nobel Prize for this work. The obvious next step was to incorporate the strong force, with the description of strong interactions known as quantum chromodynamics. This was partially combined with the electroweak
theory to give the Standard Model, which describes all these interactions in terms of three types of particle: leptons, quarks and gauge bosons. This has proved highly successful and has had many of its predictions confirmed by experiment. Theorists now search for a Grand Unified Theory (‘GUT’), in which the different types of interaction are low energy manifestations of a single master theory, and many proposals have been made in this regard. Of course it remains to incorporate the final force, gravity, into the scheme and a great deal of work has gone into trying to formulate general relativity along the lines of these so-called gauge theories.

There are two types of approach to the quest for a theory of quantum gravity. The first starts with general relativity and attempts to extend and modify it to make a theory describing the quantum properties of the gravitational field. The second approaches it from the other end, starting with some new quantum theory which will have, it is hoped, general relativity as its limit in appropriate circumstances. While it is outside the scope of this book to give a detailed account of progress in quantum gravity, we will just mention one approach which is regarded by many physicists as the best candidate available so far for the ultimate description of the forces of nature. This is string theory.

Traditionally, physicists have regarded particles, idealized as point-like objects, as the fundamental constituents of matter. String theorists argue that one could just as well consider ‘extended objects’, strings, which trace out two-dimensional surfaces, called world sheets, as they move through space-time. The world sheets of these strings are either bounded by two lines (like a ribbon) or are closed up on themselves to form a thin tube (like a drinking straw). Thus the strings can be of finite length, infinitely long (without ends) or form closed loops.
But the extension does not stop with strings: they have been generalized to p-branes, which are objects with p spatial dimensions, sweeping out (p + 1)-dimensional surfaces as they move in a higher dimensional space. (For example, p = 0 corresponds to a particle, p = 1 to a string, and p = 2 to a membrane.) The study of such objects is currently known as M-theory, although at the time of writing (1999) no one seems quite sure about the definition of the theory or indeed what the M stands for! (See ‘The theory formerly known as strings’ by M.J. Duff, Scientific American, February 1998, 54-59.)

String theory (or M-theory) contains some very important ingredients. One of these is supersymmetry, which is a mathematical formalism by which particles with integer values of spin and particles with half-odd integer values can be treated together. (In more conventional quantum theory, these different types of particles had to be dealt with separately mathematically.) Secondly, as mentioned, the strings or more general objects live in higher dimensional spaces (indeed it appears that when one adopts this viewpoint, everything becomes simpler in eleven dimensions) and to make contact with real four-dimensional space-time, the extra dimensions have to be ‘compactified’. Imagine a two-dimensional surface in the shape of a hollow cylinder. If the radius is extremely small compared with the length, the object appears to all intents and purposes to be a line, a one-dimensional entity. In an analogous way, the extra dimensions in string theory are curled up on themselves, so that they are not seen at the macroscopic level. Thirdly, and more generally, these theories involve some very complicated and sophisticated mathematical ideas.
These include the discovery of unexpected symmetries (for example, dualities between high energy and low energy results), and the use of gauge theories, in which force-fields are represented via a generalization of the idea of parallel transport. The elegance with which these ideas fit together makes the theory very attractive.

The obvious question to ask now is what all this has to do with gravity. One very significant connection is that one of the string states is a massless spin-2 particle which can be identified with the graviton, which is the entity through which particle theorists think the gravitational force is mediated (in the same way as the photon or light particle mediates the electromagnetic force). Thus string theory has general relativity as an approximation, in particular as its low energy limit. A second rather amazing connection is the calculation of the entropy or information content of an extreme charged black hole. This is done by counting the number of string states that have the same mass and charge as an extremal black hole, that is a black hole with as much charge as possible. No one quite understands why this works, but it indicates a deep relationship between string theory and general relativity.

There are still many fundamental questions in string theory which have yet to be answered, and it is not clear whether some of the difficulties will ever be overcome. However, it has certainly stimulated a lot of work, which has produced some fascinating results. In common with other approaches to quantum gravity, it is extremely difficult to relate it to any observational data. Currently there is no experimental evidence for any theory of quantum gravity.
Indeed there is a fundamental problem here: it is unlikely that we will ever be able to test such theories, which make predictions for the future state of black holes, which are hidden behind event horizons, and the quantum gravity era of the early universe, which is also inaccessible to observation. This is because the early universe is highly opaque, and any remnants of the quantum gravity era have probably been swept away by a period of inflation at very early times (discussed below). Certainly the possibility of testing these quantum gravity theories in the laboratory is highly improbable, so verifying them in some observational or experimental way poses a serious problem. However, the verification of supersymmetry in accelerator experiments would provide strong indirect evidence for the correctness of the M-theory approach. Also the experimental observation of light scalar fields or of space-time-dependent coupling constants could provide evidence for the existence of higher dimensions, since the scalar could be interpreted as the size of the extra dimensions.

An excellent description of the aims and achievements of superstring theory is given by Brian Greene in his book The Elegant Universe (Jonathan Cape, London 1999).

Broken symmetries

It has been mentioned that gauge theories are central to modern theoretical physics. Their successful application to particle physics depends on the idea of a broken symmetry (an idea imported from the theory of magnetism), which is thus now fundamental to much of physics. It underlies for instance the mechanism proposed for the inflationary universe idea (see Section 7.6). The point we want to make here is that this idea is also of importance in other ways in relating physics, and in particular relativity theory, to modern cosmology.
Two particular examples of broken symmetries in the universe are the preferred 4-velocity in cosmology (a rest frame for the universe) and the preferred direction of time in physics (the origin of the arrow of time). The basic idea is that particular solutions to the laws of physics in general do not have the same symmetries as the laws themselves.

Thus in the case of cosmology, as has been emphasized at the end of Section 3.1, there is a preferred rest frame in cosmology, defined by the CBR. This breaks the Lorentz invariance of the laws of physics, expressed via relativity theory, as described in detail in this book. But that invariance of the laws themselves does not mean that solutions of the gravitational equations will also have that symmetry, so there is no contradiction. There is indeed a preferred rest frame in the universe, and we are close to such a rest frame (we are moving at about 300 km/sec relative to it). This will happen in any solution where the existence of matter defines a local rest-frame, so it is not very mysterious, but it is still important to realize that this is indeed the situation.

Secondly, and more profoundly, the laws of fundamental physics are time symmetric (except for a weak symmetry-breaking associated with the weak force); but all macroscopic physics, chemistry, and biology are dominated by a unique arrow of time and in particular by the second law of thermodynamics. How is this consistent? Again, the situation is that the solutions to the laws break the symmetry inherent in the laws. However, here the consequences are profound: we are unable to send signals to the past (which Maxwell’s equations by themselves would allow) or to reconstruct a broken glass by simply reversing the motion of its particles (see the discussion by Roger Penrose in The Emperor's New Mind: Oxford University Press, 1989). It is unclear how the only solutions to the time-symmetric fundamental equations all come to have the one-way arrow of time imposed on them.
The best suggestion so far is that this is because of the expansion of the universe, which supplies a ‘master’ arrow of time, that then results in all the others (the mechanical, thermodynamic, gravitational, electrodynamic, biological, and psychological arrows). However, this is not yet fully understood; it has something to do with the way boundary conditions are imposed on physical quantities at the origin of the universe, and the way this differs from the corresponding conditions at the end of the universe (see The Emperor's New Mind for further discussion).

This arrow of time is profoundly important to physics in general, and to the nature of life in particular. We still await a fully convincing explanation of this broken symmetry, and how it comes into being physically (mathematically we impose it by hand: we simply reject half of the solutions that are allowed by the equations). It probably does have a cosmological origin, but how it works still needs explanation. For further discussion of this fascinating and important topic, see for example The Arrow of Time by Peter Coveney and Roger Highfield (Fawcett Books, 1997).

Other representations of general relativity

When Einstein’s field equations were described in Section 5.6, it was stated that despite their deceptively simple form, they are actually very complicated sets of simultaneous equations for the components of the metric tensor. It is very hard to solve them without assuming a high degree of symmetry for the space-time under consideration. When physicists or mathematicians meet equations which they cannot solve, they usually resort to some sort of approximation. For example, they may replace the equations by similar ones which retain some of the essential properties but which can be solved. A variant on this idea is used in general relativity in a scheme called Regge calculus (after Tullio Regge, the Italian physicist who invented it in 1961).
The basis of this approach is to replace the space-times with smoothly varying curvature usually considered in general relativity by spaces which are flat almost everywhere but have curvature at discrete locations. One can think of it as taking a set of flat blocks and gluing them together to approximate a curved space, in the same way that a polygon with lots of sides can approximate a circle, and a geodesic dome (see Fig. 5.39) can approximate part of a sphere. One sometimes sees maps of the world made this way (the problem being how to represent the curved surface of the world on a flat piece of paper). The curvature is at specified places where the blocks meet (faces with two dimensions less than that of the blocks) and is only non-zero if the blocks would not fit together exactly in a flat space. A space built from a simplicial set of blocks (triangles, tetrahedra, and their higher-dimensional analogues) can be described completely by giving all the edge-lengths, which therefore carry the same sort of information as the metric. Regge showed that these edge-lengths satisfy a set of equations which are the discrete equivalent of Einstein’s equations. Regge calculus has been used in classical calculations, such as the time-development of model universes and stellar collapse, and also is a crucial ingredient in some approaches to quantum gravity.

Regge calculus is not the only approximation scheme used in general relativity. The difficulties of solving Einstein’s equations analytically have led to a large amount of numerical work, where the field equations are usually approximated by difference equations (a rather different approach to that of Regge calculus) and then solved by computer. Current work uses supercomputers and is extremely sophisticated.
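The idea of replacing a field equation by a difference equation can be illustrated with a toy problem. The sketch below is not a numerical relativity code: it solves only the one-dimensional wave equation $u_{tt} = u_{xx}$ (the same form as the linearized equations of Section 5.10, in units with the wave speed equal to 1) on a periodic grid, using a standard second-order leapfrog scheme. The function name, grid size, and initial data are assumptions made for this example.

```python
import math


def evolve_wave(n_points=64, n_steps=64, courant=1.0):
    """Evolve u_tt = u_xx on a periodic grid of unit length with a leapfrog
    difference scheme, starting from the right-moving wave sin(2*pi*(x - t))."""
    dx = 1.0 / n_points
    dt = courant * dx                      # time step; stable for courant <= 1
    c2 = courant**2
    x = [j * dx for j in range(n_points)]
    previous = [math.sin(2.0 * math.pi * (xj + dt)) for xj in x]   # u at t = -dt
    current = [math.sin(2.0 * math.pi * xj) for xj in x]           # u at t = 0
    for _ in range(n_steps):
        following = [
            2.0 * current[j] - previous[j]
            + c2 * (current[(j + 1) % n_points] - 2.0 * current[j] + current[j - 1])
            for j in range(n_points)
        ]
        previous, current = current, following
    return x, current


# With courant = 1, 64 steps of size 1/64 carry the wave exactly once round the grid.
x, u = evolve_wave()
max_error = max(abs(uj - math.sin(2.0 * math.pi * xj)) for xj, uj in zip(x, u))
print(f"largest deviation from the exact solution after one period: {max_error:.1e}")
```

For this special choice of time step the difference scheme reproduces the travelling wave up to rounding error; with a smaller Courant number the scheme remains stable but shows the small phase errors typical of finite-difference methods, which is one reason realistic simulations need fine grids and large computers.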
It is now possible to simulate very complex problems like black hole collisions. As attempts to detect gravitational waves using laser interferometry (see Section 5.11) become operative, predictions of what to expect will become very important and these predictions will come mainly from numerical relativity.

Fig. 5.39 A geodesic dome constructed from a network of flat triangles.

Computer Exercise 14

(A) The geometry of a space-time is represented by a diagonal metric tensor:

$ds^2 = -A^2\,dT^2 + B^2\,dX^2 + C^2\,dY^2 + D^2\,dZ^2$   (*)

where A, B, C, and D are functions of the coordinates $\{x^i\} = \{T, X, Y, Z\}$, defined in a subroutine METRIC. A simple example is A = 1, B = T, C = T, D = T.

(1) Arrange for the coordinates X(T), Y(T), Z(T) of a curve in the space-time from an initial point (T0, X0, Y0, Z0) to a final point (T1, X1, Y1, Z1) to be stored in a subroutine CURVE, either in analytic form (i.e. giving suitable formulae for the curve in terms of simple functions) or in a numerical table. As a particular example, you might take X( [...] (T) = 0.

(2) Split the time period (T0, T1) into M equal parts labelled by J (J = 1, 2, ..., M) with the Jth interval starting at the time T(J). Write a subroutine STEP that
(a) determines from CURVE the coordinates X(J), Y(J), Z(J) corresponding to T(J);
(b) finds the increments DT, DX, DY, and DZ in the Jth interval, and (from METRIC) the functions A, B, C, D evaluated at T(J);
(c) evaluates the approximation DS2 to the interval (*), where DS2 = −A²DT² + B²DX² + C²DY² + D²DZ²;
(d) sets a flag I to −1, 0, or +1 respectively if DS2 is negative, zero, or positive, and then as appropriate prints ‘time-like’, ‘null’, or ‘space-like’;
(e) if I = −1, finds TAU = SQR(−DS2); if I = +1, finds DIST = SQR(DS2).

(3) Your main program PROPER should sum separately TAU, DIST evaluated by STEP from the beginning to the end of the curve, and print out the sums TAU-TOTAL, DIST-TOTAL. If I = −1 for all steps, print ‘time-like’; if I = 0 for all steps, print ‘null’; if I = +1 for all steps, print ‘space-like’.

(B) Using your program,
(1) check the stability, as the number of steps M is varied, of this approximation to the line integral giving the proper time along time-like curves in the space-time;
(2) examine examples of the twin paradox in special relativity;
(3) examine the behaviour of clocks in the Schwarzschild and Robertson-Walker metrics described in the next two chapters, when they have been introduced.

Particular examples

The concepts and results described in this chapter are difficult to understand in general, so in the following chapters we examine the nature of particular curved space-times of interest. The simplest examples of curved space-times are those produced by a single isolated massive body like a star, and those produced by all the matter in the universe. We shall consider the space-time around a single massive body in the next chapter, and describe the gravitational collapse of such bodies to form a black hole. In the final chapter, we shall look at the simplest viable expanding universe models.

6 Spherical stars and stellar collapse

In this chapter, we are concerned with two problems of astrophysical importance: firstly, the description of the gravitational field of the Sun, which dominates the dynamics of the solar system; and secondly, the nature of the gravitational field of a massive star, and how stellar collapse takes place leading to the formation of a ‘black hole’. The analysis of these topics is based on an exact solution of Einstein’s field equations, the Schwarzschild solution (discovered by Karl Schwarzschild in 1916, shortly before his death in the First World War).

6.1 The Schwarzschild solution

A single massive object, like the Earth, the Sun, or a star, produces curvature in the empty space-time around it.
If we assume that this object is spherically symmetric and isolated from all other massive objects, it can be shown from Einstein's field equations that the space-time around it is given by the Schwarzschild exterior solution. In suitable coordinates the metric form is

$ds^2 = -(1 - 2m/r)\,dt^2 + (1 - 2m/r)^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$   (6.1)

where m is the mass of the body measured in geometric units. Here r is a radial coordinate, θ and φ are the usual angular coordinates, and t is a time coordinate [...] along the coordinate curves. The form (6.1) will be valid for $r > R_s$, where $R_s$ is the value of the coordinate r at the surface of the body; for $0 < r < R_s$ [...] $R_s > 2m$ for a static star.

In these expressions, the mass m is naturally given in geometric units. These units will be the same as the units used for spatial distances (since m/r must be dimensionless in (6.1)). The mass m in these units is related to the mass M given in ordinary units of mass by the formula $m = GM/c^2$, where G is the gravitational constant and c the speed of light. In keeping with the previous sections we will often measure distances in terms of light travel times, so masses will then also be measured in units of time! (The mass m* in these units is given by m* = m/c.) An idea of the meaning of these units may be gained from the following:

Earth's mass: $6 \times 10^{27}$ gm ↔ 0.44 cm ↔ $1.5 \times 10^{-11}$ sec.
Sun’s mass: $2 \times 10^{33}$ gm ↔ $1.5 \times 10^{5}$ cm ↔ $5 \times 10^{-6}$ sec.

As already stated, the object referred to could be a planet, the Sun, or a star, but in the analysis that follows, we shall in general refer to it just as a star for simplicity.

Symmetries

Clearly the space-time is static (i.e. unchanging with time), because the form (6.1) is independent of time. We will refer to observers for whom r, θ, and φ are constant as ‘static observers’, since they do not move relative to the star; they would measure all physical properties of the space-time to be constant in time.
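The geometric-unit figures quoted above for the Earth and the Sun follow directly from m = GM/c² and m* = m/c. As a quick check, here is a minimal sketch in CGS units; the numerical values of G and c are rounded standard values supplied here rather than taken from the text, and the function names are illustrative.

```python
# Rounded CGS constants (assumed values, not quoted in the text).
G_CGS = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10   # speed of light, cm s^-1


def geometric_mass_cm(mass_grams):
    """Mass expressed as a length: m = G M / c^2, in centimetres."""
    return G_CGS * mass_grams / C_CGS**2


def geometric_mass_seconds(mass_grams):
    """Mass expressed as a light travel time: m* = m / c, in seconds."""
    return geometric_mass_cm(mass_grams) / C_CGS


earth_cm = geometric_mass_cm(6e27)
earth_s = geometric_mass_seconds(6e27)
sun_cm = geometric_mass_cm(2e33)
sun_s = geometric_mass_seconds(2e33)

print(f"Earth: m = {earth_cm:.2f} cm, m* = {earth_s:.2e} s")
print(f"Sun  : m = {sun_cm:.2e} cm, m* = {sun_s:.2e} s")
```

To the two significant figures used in the text, these reproduce 0.44 cm and 1.5 × 10⁻¹¹ sec for the Earth and 1.5 × 10⁵ cm and 5 × 10⁻⁶ sec for the Sun.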
The space-time is also spherically symmetric about the central body. This is not so obvious, until one realizes that the coefficient of r² in the metric form is simply the metric form describing a two-dimensional unit sphere (see eqn (5.1)), which is of course spherically symmetric about the centre of the sphere. This is the only part of the metric where θ and φ occur; so in fact the space-time described has the same symmetry as the two-sphere, that is, it is spherically symmetric about the centre of the star generating the gravitational field.

Distances and times

When we work out distances in the radial direction from the surface of the star, and proper times for an observer at constant r, θ, and φ, the 1 − 2m/r factors in the metric form mean that the answers are not the same as they would be in flat space-time. We can easily work out the implications of these two factors for the space-time geometry (Fig. 6.1a).

Fig. 6.1 (a) A space-time diagram for the Schwarzschild solution, with the θ angle suppressed. Surfaces {r = constant} are represented by cylinders, with D denoting the radial distance from the surface $r = r_1$ to the surface $r = r_2$. (b) A spatial section {t = constant} of the Schwarzschild solution. Surfaces {r = constant} are spheres.

The three-geometries

Firstly, consider the factor $(1 - 2m/r)^{-1}$ in the dr² term. This determines the geometries of the surfaces {t = constant}. The significance of the coordinate r used here is that it is an ‘area coordinate’: that is, it is chosen so that the area of the two-sphere defined by {t = constant, r = constant} is precisely $4\pi r^2$ (this follows immediately from the form (6.1), which reduces to that of the two-sphere with surface area $4\pi r^2$ when we set dt = 0, dr = 0).
However, this coordinate does not directly measure the distances between these two-spheres (which it does in the case of flat space-time). In fact, the distance one would measure along the normal to these spheres at any time t, from the sphere $r = r_1$ to $r = r_2$, is given by integrating (6.1) with dt = 0, dθ = 0, and dφ = 0:

$D = \int_{r_1}^{r_2} (1 - 2m/r)^{-1/2}\,dr$, where $\int (1 - 2m/r)^{-1/2}\,dr = r(1 - 2m/r)^{1/2} + 2m\log_e[(r - 2m)^{1/2} + r^{1/2}]$,

giving

$D = r_2(1 - 2m/r_2)^{1/2} - r_1(1 - 2m/r_1)^{1/2} + 2m\{\log_e[(r_2 - 2m)^{1/2} + r_2^{1/2}] - \log_e[(r_1 - 2m)^{1/2} + r_1^{1/2}]\}$   (6.2)

(Fig. 6.1b). This is greater than the corresponding distance $d = r_2 - r_1$ in flat space-time. Figure 6.2 shows the relation between D and d for various values of $r_1$. This illustrates how the curvature of space-time results in the curvature of the space-sections {t = const} in these space-times, as expressed by the fact that d ≠ D (whereas in flat space-time, these are necessarily equal to each other).

Fig. 6.2 The distance D between spheres $r = r_1$ and $r = r_2$, plotted as a function of $d = r_2 - r_1$ for $r_1$ equal to 2.01m, 3m, and 100m.

The time coordinate

Secondly, consider the factor 1 − 2m/r in the dt² term. This shows how the coordinate time t relates to proper time T measured by a static observer. In flat space-time and in the usual coordinates, these are identical. However, here, while the coordinate t serves to mark the passage of time along the histories of static observers, and even to synchronize times measured by different such observers (because the surfaces {t = const} are surfaces of instantaneity for such observers), it does not represent directly the proper time they would measure. One can read off from the metric form that the proper time measured by a static observer (for whom dr = 0, dθ = 0, and dφ = 0) between coordinate times $t_1$ and $t_2$ is given by $DT = \int (1 - 2m/r)^{1/2}\,dt$, giving

$DT = (1 - 2m/r)^{1/2}\,Dt$   (6.3)

where Dt is the coordinate time difference: $Dt = t_2 - t_1$ (Fig. 6.3a). Thus DT is always less than Dt (for $r > R_s > 2m$), with the difference decreasing as r increases from $R_s$ (Fig. 6.3b).

Fig. 6.3 (a) The relation between clock time and coordinate time varies with the value of the radial coordinate r. (b) The proper time interval DT divided by the corresponding coordinate time interval Dt, plotted as a function of r.

Asymptotic behaviour

Very far from the body, when r becomes very large, the factors (2m/r) become negligible and then ds² coincides with the flat space metric in spherical polar coordinates. Thus this solution represents an asymptotically flat space-time. This corresponds to the physical situation that far enough away from the Earth or Sun, their gravitational fields are negligible.

To investigate this further, one can use the approximation $(1 - 2m/r)^{-1} \approx 1 + 2m/r$ when $|2m/r| \ll 1$, to obtain the approximate metric form

$ds^2 = -(1 - 2m/r)\,dt^2 + (1 + 2m/r)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$   (6.1a)

valid far from the star. Indeed for ordinary stars or planets this form will be a good approximation everywhere outside their surfaces, because the condition $r > R_s$ implies $m/r < m/R_s$; and for the Earth and the Sun we find:

Earth: mass = 0.44 cm, $R_s = 6.4 \times 10^{8}$ cm, $m/R_s = 6.9 \times 10^{-10}$;
Sun: mass = $1.5 \times 10^{5}$ cm, $R_s = 7 \times 10^{10}$ cm, $m/R_s = 2.1 \times 10^{-6}$.

Therefore, even close to the surface of the Earth, $m/r < 6.9 \times 10^{-10}$ and, in the case of the Sun, $m/r < 2.1 \times 10^{-6}$; hence, in both cases (6.1a) will be a good approximation to (6.1). Then (6.2) is closely approximated by $D = \int (1 + m/r)\,dr$, giving

$D = r_2 - r_1 + m\log_e(r_2/r_1)$   (6.2a)

and (6.3) by

$DT = (1 - m/r)\,Dt$.   (6.3a)

Clearly the larger r is, the more closely (6.1a) approximates the flat-space metric (4.29), while (6.2a) and (6.3a) approximate the flat-space results $D = r_2 - r_1$ and DT = Dt.

The singularity

Clearly, problems would arise in the metric if r could approach the value 2m, because then DT would go to zero, the coefficient of dt² in (6.1) would go to zero, and that of dr² would diverge. We do not have to worry about this in the present section, where we assume $r > R_s > 2m$ (and indeed, as we have seen, in ordinary astrophysical situations in the solar system, $R_s \gg 2m$). However, we shall have to investigate the ‘singularity’ in the metric form as r approaches 2m in the next section, when we consider gravitational collapse.

Redshifts

A consequence of (6.3) is that there are observed gravitational redshifts in these space-times (as there were in the Rindler universe). Let us see why this is. Consider two static observers situated radially relative to each other, that is, at the same values of θ and φ but at different values $r_1$ and $r_2$ of r (Fig. 6.4). A light ray travelling radially outwards from $r_1$ to $r_2$ will obey the conditions dθ = 0, dφ = 0, ds² = 0 (the first two following because the path is radial, the last because it represents motion at the speed of light). Then from (6.1) it follows that along the light ray, the displacements dr and dt will be related by dr/dt = 1 − 2m/r.

Fig. 6.4 Two static observers $O_1$ and $O_2$ on the same radial line (θ, φ constant), but at different values $r_1$ and $r_2$ of r.

Hence, if the light is emitted by $O_1$ at time $t_1$ and received by $O_2$ at time $t_2$ (see Fig. 6.5a), we find

$t_2 - t_1 = r_2 - r_1 + 2m\log_e\{(r_2 - 2m)/(r_1 - 2m)\}$.

Now observe that the right-hand side is not explicitly dependent on $t_1$ and $t_2$, but rather on $r_1$ and $r_2$. Thus, if a second signal is emitted at a later time $t_1'$ by $O_1$ and received by $O_2$ at $t_2'$ (see Fig. 6.5a), then

$t_2' - t_1' = r_2 - r_1 + 2m\log_e\{(r_2 - 2m)/(r_1 - 2m)\}$

also.
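The closed form for the coordinate travel time of a radial light ray used in the argument above can be checked by integrating dt = dr/(1 − 2m/r) numerically. The following is a minimal sketch: the midpoint rule, the step count, and the sample values m = 1, r₁ = 3, r₂ = 10 (in geometric units) are choices made here for illustration.

```python
import math


def travel_time_closed(r1, r2, m):
    """Coordinate time for a radial light ray from r1 to r2 (closed form)."""
    return (r2 - r1) + 2.0 * m * math.log((r2 - 2.0 * m) / (r1 - 2.0 * m))


def travel_time_numeric(r1, r2, m, steps=20000):
    """Midpoint-rule integral of dt = dr / (1 - 2m/r) from r1 to r2."""
    h = (r2 - r1) / steps
    total = 0.0
    for k in range(steps):
        r = r1 + (k + 0.5) * h          # midpoint of the k-th sub-interval
        total += h / (1.0 - 2.0 * m / r)
    return total


m, r1, r2 = 1.0, 3.0, 10.0
closed = travel_time_closed(r1, r2, m)
numeric = travel_time_numeric(r1, r2, m)

print(f"closed form : {closed:.6f}")
print(f"numerical   : {numeric:.6f}")
print(f"flat space  : {r2 - r1:.6f}")
```

Nothing in either function depends on the emission time, which is the property the argument relies on: any two signals sent between the same pair of static observers take the same coordinate time. The travel time also exceeds the flat-space value r₂ − r₁.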
Subtracting, we see that the time difference $Dt_2 = t_2' - t_2$ between these pulses for the receiver is related to the time difference $Dt_1 = t_1' - t_1$ for the emitter by the relation $Dt_2 = Dt_1$, i.e. they are the same! In an obvious way, we can define a K-factor for this experiment, just as we did in the case of flat space-time. Does the result just proved imply K = 1? No! What we have shown is that it is the coordinate times that are the same, not the proper times. We must use eqn (6.3) to determine the ratio of proper times. We find

$K_{12} = DT_2 / DT_1 = (1 - 2m/r_2)^{1/2} / (1 - 2m/r_1)^{1/2}$   (6.4)

This is the formula for the gravitational redshift observed in these space-times (Fig. 6.5b). Crudely, we can think of light travelling radially out as ‘climbing out’ of a potential well and so losing energy and hence being received as redder than it was emitted. K would be measured in precisely the same way as in flat space-time (see Section 3.1). As in that case, it is the ratio observed between times of all events as measured at the object and at the observer; referring to it as a redshift effect is labelling it by one of the most direct ways of measuring it (Fig. 6.5c).

Fig. 6.5 (a) Radial light signals are emitted by $O_1$ at $t_1$ and $t_1'$ (at coordinate interval $Dt_1$) and received by $O_2$ at $t_2$ and $t_2'$ (at coordinate interval $Dt_2$). (b) The gravitational redshift is defined to be the ratio of the proper time intervals $DT_2$ and $DT_1$. (c) The gravitational redshift $1 + z = K_{12}$ plotted against $r_2$ for various values of $r_1$.

As in the flat-space case, $K_{12}$ is independent of $t_1$ and $DT_1$. This is essentially because in both cases, the space-times are static (i.e. unchanging in time). However, in the present case, quite unlike the case of Doppler shifts for inertial observers in flat space-time, the effect is not reciprocal.
In fact, clearly now $K_{12} = 1/K_{21}$; correspondingly, light travelling inwards from $r_2$ to $r_1$ is gaining energy from the gravitational field and so is blueshifted rather than redshifted. This has the further consequence that, unlike the situation of inertial observers in flat space-time, radar measurements of distance will reveal a K-factor of 1 (the factor $K_{12}$ on the outward trip will be compensated by the factor $K_{21}$ on the inward trip, resulting in no overall change in observed wavelength; Fig. 6.6). These differences from the Minkowski-universe case occur because the redshifts observed are due to the inhomogeneity of the space-time, rather than due to Doppler shifts in a homogeneous space-time; the factor K now is caused by the gravitational field of the star (represented by the factors 1 − 2m/r in the metric form). The situation is very analogous to that of the static accelerating observers in the Rindler universe (Section 4.3), which is not surprising: we expect this on the basis of the principle of equivalence.

Fig. 6.6 The reciprocal nature of the gravitational redshift. A proper time interval $DT_1$ between radial light rays at $r = r_1$ produces a proper time interval $DT_2 = K_{12}\,DT_1$ at $r = r_2$. Reflection of these signals produces a proper time interval $DT_1' = K_{21}\,DT_2 = K_{21}K_{12}\,DT_1 = DT_1$ at $r = r_1$.

In the weak-field case ($m/r \ll 1$), (6.4) becomes

$K_{12} = 1 + m/r_1 - m/r_2$.   (6.4a)

This will hold, for example, for the gravitational redshift caused by the Earth or the Sun in the solar system.

According to these results, the light emitted by dense stars (e.g. white dwarfs) will show a gravitational redshift when it is received on the Earth. This has been verified observationally. Again, if sensitive enough measurements of precisely emitted wavelengths can be made, the effect can be observed for light moving radially out from the Earth (i.e.
climbing vertically away from the Earth’s surface), and this too has been verified observationally (as mentioned in Section 5.6) in the case of light emitted at the base of the Harvard Tower and received near the top of the tower (Fig. 6.7). Thus, gravitational redshift is a phenomenon that has been well verified experimentally.

Fig. 6.7 A test of the gravitational redshift using light emitted at the bottom of the Harvard Tower at $r = r_1$ and absorbed near the top at $r = r_2$.

Further properties

Many other results follow from the metric form (6.1). In particular, one can derive from it the particle orbits in the gravitational field represented, and the bending of light that will result from that field. The methods used to derive these results, however, demand more advanced mathematical techniques than we are allowing ourselves in this book. We will not pursue this topic further here, except to note that these calculations form the basis of the classical tests of general relativity (see Section 5.9) through examination of the paths of light-rays (particularly the famous observations of the bending of light by the Sun) and the motion of planets and spacecraft in the solar system (particularly observation of the perihelion of the planet Mercury). Extremely good data is now available through the tracking of spacecraft through the solar system and the measurement by radar of the distance to reflectors placed on the planet Mars. The best evidence at the present time, from examining the motion of light and massive bodies in the solar system, is that the geometry of the space-time of the solar system is indeed well represented by the Schwarzschild metric form (6.1).

Conclusion

There is good reason to believe that the Schwarzschild solution describes accurately the gravitational field of an isolated massive body, e.g.
a spherical star, for that geometry appears to describe the space-time of the solar system to a high degree of accuracy. Thus the geometric properties described above will characterize the local space-time features of many regions of the universe.

Exercises

6.1 Consider a light ray in the gravitational field of a spherically symmetric object, described by (6.1). Find its coordinate velocity if it travels (a) radially, (b) transversely. Does the dependence on distance violate Einstein’s principle of the invariance of the speed of light?

6.2 Light signals are emitted from a lift which moves at 20 metres/sec in a vertical lift shaft on the outside of a sky-scraper. An observer at the base of the lift shaft records the light signal when the lift is 100 metres above ground level. Calculate the redshift due to (a) the Doppler effect, (b) the gravitational effect (you may take the radius of the Earth to be 6000 km).

6.3 The gravitational effect of the Earth on the Moon is due to the curvature of space-time caused by the Earth at the distance of the Moon. Calculate (a) the distance from the surface of the Earth to the Moon, and the circumference C of the Moon’s orbit; hence find the ratio R = C/d of C to the distance d from the centre of the Earth to the Moon; (b) the ratio DT/Dt of proper time to coordinate time at the Moon’s orbit; (c) the gravitational redshift from the Earth’s surface to the Moon’s surface. [The radius of the Earth is 6000 km and the average distance from the centre of the Earth to the centre of the Moon is 386 000 km; after converting to suitable units, take these as the appropriate values of the coordinate r in (6.1). Note that we do not know the form of ds² inside the Earth.] Similarly, calculate the curvature effects at the Earth’s orbit caused by the Sun. It is this tiny effect that is responsible for us remaining in our nearly circular orbit around the Sun!
[Take the distance from the centre of the Sun to the Earth as $1.496 \times 10^{8}$ km.]

6.4 Read about light bending and perihelion precession (see e.g. Space, Time and Gravitation by A. Eddington, Harper Torchbooks, 1959), and other tests of general relativity theory (see e.g. ‘Gravitation theory’ by C. M. Will, Scientific American, November 1974, and Will's book referred to on p. 223).

6.2 Spherical collapse to black holes

Having studied the space-time around an isolated spherical body, we are now in a position to consider what happens when such a body, for example a massive star that has burned all its nuclear fuel, collapses to form a ‘black hole’. Two of the most important effects resulting are the formation of a singularity at the end-point of the collapse, and the occurrence of causal limits (the ‘event horizon’) restricting communication between an outside observer and the collapsing star, and preventing the singularity ever being visible to any outside observer. The major features of these causal limits follow simply from an understanding of the nature of curved space-times and their null-cones.

A powerful result known as Birkhoff’s theorem shows that the Schwarzschild solution (metric form (6.1)) will represent the exterior gravitational field of any spherically symmetric star, not only if it is static but also if it is expanding, collapsing, or pulsating. Thus the metric form is of very wide applicability.

The form (6.1) is clearly singular at r = 2m: the coefficient of dt² vanishes there, and the coefficient of dr² diverges there. At first one might suspect that this implies that the space-time itself is badly behaved there. However, investigation shows this is not so: rather it is the coordinates that are badly behaved. Thus this is a coordinate singularity rather than a physical singularity (but it does have physical significance, as we shall see later).
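One way to see numerically the difference between a coordinate singularity and a physical one is to compare the metric coefficients of (6.1) with a quantity that does not depend on the choice of coordinates. The sketch below assumes the standard result, not derived in this book, that the Kretschmann curvature scalar of the Schwarzschild solution is 48m²/r⁶ in geometric units; the function names and the choice m = 1 are illustrative.

```python
def metric_coefficients(r, m):
    """Coefficients of dt^2 and dr^2 in the metric form (6.1)."""
    f = 1.0 - 2.0 * m / r
    return -f, 1.0 / f


def kretschmann(r, m):
    """Curvature invariant 48 m^2 / r^6 (assumed standard result)."""
    return 48.0 * m**2 / r**6


m = 1.0
for eps in (1e-1, 1e-3, 1e-6):
    r = 2.0 * m * (1.0 + eps)            # just outside r = 2m
    g_tt, g_rr = metric_coefficients(r, m)
    print(f"r = 2m(1 + {eps:g}): g_tt = {g_tt:.3e}, "
          f"g_rr = {g_rr:.3e}, curvature = {kretschmann(r, m):.4f}")

# The curvature invariant is finite at r = 2m but grows without bound as r -> 0.
print(f"curvature at r = 2m  : {kretschmann(2.0 * m, m):.4f}")
print(f"curvature at r = 0.01: {kretschmann(0.01, m):.3e}")
```

As r approaches 2m the coefficient of dt² goes to zero and that of dr² diverges, while the curvature stays finite; it is only towards r = 0 that the curvature itself blows up.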
Collapse of a star: use of a null coordinate

Let us consider a star in which the density of matter is so high that gravitational forces overwhelm other forces, and it shrinks in on itself and eventually collapses to form a black hole. Its surface will then decrease to zero, so $R_s$ decreases to zero too; clearly a physical singularity occurs then (the star has collapsed to zero volume). During the collapse, the interior geometry will be described by some dynamic metric which we do not wish to investigate here. What concerns us is the exterior solution to the collapsing star. As it collapses, its surface will eventually reach and fall through the critical value $R_s = 2m$. Thus, to represent the exterior solution at all times, we need a new coordinate system that will cover the surface r = 2m in a regular manner.

Various such coordinates can be found; see e.g. Gravitation by C. Misner, K. Thorne, and J. Wheeler (Freeman, 1973), pp. 823-836, or Essential Relativity by W. Rindler (Springer, 1976), pp. 185-6. One such set, the Eddington-Finkelstein coordinates, are particularly suited to exploring gravitational collapse. We will not pursue the complex details of the changes of coordinates, but rather focus on the resulting metric form for the exterior space-time, which can be written as

$ds^2 = -(1 - 2m/r)\,dv^2 + 2\,dv\,dr + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$   (6.5)

for $r > R_s$, where $v = t + r + 2m\log_e\{(r/2m) - 1\}$ is a coordinate such that the past light cones centred on the star are the surfaces {v = constant}. To see this, consider a radial displacement {v = const, θ = const, φ = const} in these surfaces. Because dv = 0, dθ = 0, and dφ = 0, it will have components $(dx^a) = (0, dr, 0, 0)$; because there is no term in dr² in the metric form, (6.5) shows that the metric form ds² for this displacement is zero, i.e. it is a light ray.
We call v a null coordinate (the use of these coordinates in two-dimensional flat space-time was investigated in Exercise 4.18; the similarity between (6.5) and the metric form (*) derived there for flat space-time when null coordinates are used is immediately apparent).

We must emphasize that for r > 2m, (6.5) is just the Schwarzschild exterior solution (6.1), but in new coordinates. The advantage of these coordinates is that they are well behaved where r = 2m (when this condition is fulfilled, (6.5) reduces just to the interval of flat space-time in double null coordinates, and flat space-time is perfectly regular; cf. equation (***) in Exercise 4.18). As we shall see, {r = 2m} is a null surface called the event horizon. Use of the form (6.5) lets us extend the solution to r < 2m, and explore what happens to an object that crosses the event horizon from the outside (r > 2m) to the inside (r < 2m).

One can conveniently draw the space-time represented by (6.5) in a form where one generator of the past null-cones (the surfaces {v = const}) is drawn at 45° to the vertical axes, and the surfaces {r = const} are cylinders parallel to the central line at r = 0 (Fig. 6.8). One of the angular coordinates has been suppressed, but the spherical symmetry is readily apparent from the diagrams in terms of invariance under changes of the coordinate θ. Note that the surfaces {t = constant} are not horizontal planes in this diagram.

The diagram represents an interior solution as well as the exterior solution (6.5). The interior solution (that is, the collapsing spherical star) is represented by the interior of the surface of the star ($r = R_s$, where $R_s$ is a decreasing function of time). Clearly, the radius decreases steadily with time until it reaches zero; then the remains of the star form the ‘singularity’ at r = 0 (which we discuss later). We have not attempted to show any details of the interior solution (which depends on the equation of state of the matter in the star).
The important point is simply to note that the interior of the star lies inside the surface shown.

Fig. 6.8 A space-time diagram of the collapse of a star to form a black hole. The vertical axis represents time, and r and θ are polar coordinates in planes perpendicular to the t-axis (the angle φ has been suppressed). Lines of constant v are drawn at 45° to the vertical. The radius of the star decreases to zero, where a ‘singularity’ (with infinite density) is formed on the axis. The surface r = 2m forms the event horizon, which encloses the events which cannot be seen from the outside world. The ingoing light rays move on lines of constant v, while the directions of the outgoing ones depend on radial distance. The light cones tilt over toward the spatial origin with decreasing r, and are vertical on the surface r = 2m (the ‘event horizon’).

Outside the surface of the star, we have the exterior solution represented by the metric form (6.5), which is just the Schwarzschild solution in new coordinates. An important feature is that the light cones, determined as usual by the equation ds² = 0, ‘tip over’ as one moves from large to small values of r. The significance of the surface {r = 2m} can now be seen: this is a null surface, i.e. it is generated by light rays (the rays {r = 2m, θ = const, φ = const}). At all points, the ‘ingoing’ light rays are at 45° to the vertical. These trace the path of light that is emitted radially inwards towards the centre (e.g. by pointing a flashlight towards the centre of the star, and pressing the ‘on’ button for a brief instant).
Similarly, the ‘outgoing’ light rays trace the path of light that is emitted radially outwards from the centre (e.g. by pointing a flashlight directly away from the centre of the star, and pressing the ‘on’ button for a brief instant). Outside the surface {r = 2m}, these rays are tilted outward; inside the surface, they are tilted inward. On the surface, they are precisely vertical, i.e. they are lines of constant r (as follows from the fact that the coefficient of dv² in the metric (6.5) vanishes there). The light rays indicate the orientation of the light cone at each point; and the causal properties of this space-time all follow from this behaviour of the local light cones (cf. Fig. 4.17(b)).

Exercises

6.5 Consider radial light rays in the metric (6.5). Deduce that the coordinate displacements dv and dr along the light rays are related by $\{2\,dr - (1 - 2m/r)\,dv\}\,dv = 0$. Hence show that the ingoing light rays are given by dv = 0 and the outgoing light rays by $dr = \tfrac{1}{2}(1 - 2m/r)\,dv$; and so confirm that the local light cones are correctly represented in Fig. 6.8 (where lines {v = constant} are drawn at 45° to the vertical, while lines {r = constant} are vertical). [Hint: look at the possible signs of dr/dt on the null lines.]

6.6 Check that the transformation from the coordinate t to the coordinate v is not well-behaved when r = 2m.
[This feature is necessary to enable removal of the apparent singularity in (6.1) to give the form (6.5), regular at r = 2m.]

The event horizon

The most important feature shown by Fig. 6.8 is that the event horizon (the surface r = 2m) is a one-way ‘trapping surface’, which lets radiation and matter fall into the inside region (r < 2m) but prevents any matter or radiation escaping from there. Specifically, at this critical radius (r = 2m, the Schwarzschild radius) an outgoing light ray attempts to escape from the star but is ‘held back’ by the star’s gravitational field, which is precisely strong enough to hold it at this distance from the star. Light emitted just outside the event horizon can escape to infinity (by following the outgoing null rays). Light emitted just inside cannot escape; radially outgoing light rays are dragged back by the gravitational field, and fall into the singularity at r = 0.

Clearly, any massive object inside the event horizon cannot escape, since it cannot exceed the speed of light. Thus its possible future histories are bounded by the ingoing and outgoing light rays, representing radial inward and outward motion at the speed of light; so no matter how he may accelerate, the fate of any observer or object that has crossed the horizon to the inside region is necessarily to fall into the singularity at r = 0. This is the reason for the name ‘black hole’; no radiation or signal of any kind can reach the outside from inside.* It is an unknown region to the exterior observers, who cannot see what is happening inside by any observational technique whatever.

Note that inside the region r < 2m, it is not even possible to emit light that moves radially outwards. The ‘outward’ light cone tips inward, and if one considers any path that does in fact move outward radially in this regime, then this corresponds to motion at greater than the speed of light, which is not physically allowed.
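The tilting of the outgoing light rays can be read off from the relation dr = ½(1 − 2m/r) dv stated in Exercise 6.5. The sketch below simply evaluates that slope on either side of the horizon; the function names, the descriptive labels, and the sample radii (with m = 1 in geometric units) are illustrative choices.

```python
def outgoing_slope(r, m):
    """dr/dv along a radially 'outgoing' light ray in the metric form (6.5)."""
    return 0.5 * (1.0 - 2.0 * m / r)


def describe_outgoing_ray(r, m):
    """Say which way the 'outgoing' ray actually moves in r at radius r."""
    slope = outgoing_slope(r, m)
    if slope > 0.0:
        return "moves outward"
    if slope < 0.0:
        return "moves inward"
    return "stays at constant r"


m = 1.0
for r in (10.0, 3.0, 2.0, 1.5, 0.5):
    print(f"r = {r:4.1f} m: dr/dv = {outgoing_slope(r, m):+.3f} "
          f"({describe_outgoing_ray(r, m)})")
```

Far from the star the slope tends to ½, as for outgoing rays in flat space-time written in these coordinates; it falls to zero on r = 2m and is negative everywhere inside, so even the ‘outgoing’ rays there move to smaller r.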
Hence, one also cannot set a particle in motion on a space-time path that moves outwards (i.e. to larger values of r) in this region. This becomes even clearer if one uses conformally flat coordinates where the light cone appears at ±45°; however, to go into that representation is beyond the scope of this book. For details, see e.g. R. d’Inverno, Introducing Einstein's Relativity, pp. 230-238 (Oxford University Press, 1992), or C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation, pp. 833-840 (Freeman, 1973).

* We are considering the situation here classically. When quantum effects are significant, the situation is different, as will be mentioned briefly at the end of the chapter.

This gravitational trapping of light and matter will happen for very small radii. For example, in the case of an object with the mass of the Sun, in appropriate units m = 1.5 km, so the Schwarzschild radius is 3 km. Thus we would have to compress the Sun (whose radius is 696 000 km) until its radius is less than 3 km in order to make the curvature of space-time high enough to cause this trapping effect. Similarly, the Earth would have to be compressed to about 0.9 cm radius before it fell within its event horizon.

Collapse seen from outside

Consider the collapse from the viewpoint of an external observer (see Fig. 6.9). An observer O1 who stays outside the event horizon sees the star shrinking towards r = 2m, but never actually reaching this radius: clearly, no light ray can reach the observer when the surface of the star lies at r = 2m or r < 2m. Thus the final collapse is hidden behind the event horizon. An inward moving observer O2 who falls across the event horizon can indeed see all the collapse, but he himself is inevitably drawn into the singularity within a short time thereafter, and there is no way he can send signals to the outside observer O1 to report his findings. Similarly, O1 cannot see what happens to O2 once he has crossed the horizon.

Fig. 6.9 An infalling observer O2 emits radial light signals each minute; a stationary observer O1 receives them at longer and longer intervals, and the final minute before the infalling observer O2 crosses the event horizon appears to the external observer O1 to last for ever. Thus, O1 sees ever-increasing redshifts in the images of O2; consequently, O2 fades away from sight.

Suppose O2 crosses the horizon at the time 12:00 measured by his clock. Light emitted by him at 12:00 will never reach O1, since it will stay at the distance r = 2m. Light emitted by him at every previous time will reach O1. For illustration, signals sent by him at 11:57, 11:58, and 11:59 are shown in Fig. 6.9. Clearly, when O2 and O1 are at the same radial distance, there will be a K-factor determined by the Doppler redshift effect alone. However, as O2 gets further away from O1, a gravitational redshift will contribute to K as discussed in the previous section. The crucial feature is that the light emitted in the one-minute interval from 11:59 to 12:00 will take an infinite time to be received by O1 (the second signal never arrives). The light emitted during the intervals 11:57 to 11:58 and 11:58 to 11:59 will be climbing out of deeper and deeper gravitational potential wells, so the observed redshift (and thus the K-factor) will be getting larger and larger. In the limit as O2 crosses the horizon, the K-factor becomes infinitely large (the observed time dilation increasing without limit, see eqn (6.4)). Thus the event horizon may also be characterized as an infinite-redshift surface. This situation is precisely modelled by the Rindler universe discussed in Section 4.3.
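The growth of the K-factor near the horizon can be illustrated numerically in the simplest case, where both emitter and receiver are static (so that no Doppler contribution is present). The following is a minimal sketch, not the book's own calculation: it assumes the standard result that a static clock at radius r runs at the rate sqrt(1 − 2m/r) relative to the coordinate time t, and the function name `static_k_factor` and the sample radii are purely illustrative.

```python
import math


def static_k_factor(r_emit, r_obs, m=1.0):
    """K-factor (received interval / emitted interval) for light exchanged
    between two static observers outside a Schwarzschild black hole.

    Assumption: each static clock runs at the rate sqrt(1 - 2m/r) relative
    to coordinate time t. r_emit, r_obs and m share the same length units.
    """
    if r_emit <= 2.0 * m or r_obs <= 2.0 * m:
        raise ValueError("static observers exist only for r > 2m")
    return math.sqrt((1.0 - 2.0 * m / r_obs) / (1.0 - 2.0 * m / r_emit))


if __name__ == "__main__":
    m = 1.0
    r_obs = 1.0e6 * m  # distant static observer (illustrative value)
    # Move the static emitter closer and closer to the horizon r = 2m.
    for excess in (1.0, 1e-2, 1e-4, 1e-6):
        r_emit = 2.0 * m + excess
        k = static_k_factor(r_emit, r_obs, m)
        print(f"r_emit - 2m = {excess:8.1e}   K = {k:12.4e}")
```

As the emitter radius approaches 2m the printed K-factor grows without bound, which is the sense in which the horizon is an infinite-redshift surface. For the infalling observer O2 an additional Doppler factor would multiply this static value; that factor is deliberately left out of the sketch.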
From this discussion, it becomes clear that the surface of the star too will be observed with ever-increasing redshift as it approaches the horizon. As the observed redshift increases, the observed intensity of light received from the star will decrease, so the star (seen from outside) will be observed to fade away as its surface approaches the horizon, and larger and larger redshifts are seen by the outside observer. One should note here that the speed at which the observer O2 and the surface of the star cross the horizon is perfectly finite (and indeed less than c); the infinite redshift observed is a gravitational redshift in a static space-time, namely the exterior Schwarzschild solution (note that the redshift will become infinite even for the family of static observers in the space-time). While K-factors will be reciprocal for ingoing and outgoing signals outside r = 2m, one cannot consider their reciprocity for r < 2m, for the outgoing signals cannot then be received by O1. Ingoing signals will be received by O2 with increasing blueshifts, while the outgoing signals will not be received at all. Thus the surface of the star after it has crossed the event horizon, and its final destruction at the central singularity, cannot be witnessed by an outside observer.

The central singularity

What is the fate of the matter in the star, and any observer or other object that falls into the central singularity at r = 0? This is a real physical singularity where the gravitational field is unbounded. Thus they are torn to pieces by the associated tidal forces, which increase without limit as the particles approach the centre (where the space-time curvature diverges). Space-time itself breaks down there: our model of space and time cannot be continued any more. Thus the theory we are using (general relativity) predicts a singularity, an end to space-time, there.
More fundamental theories that unite gravity with quantum theory may make other predictions, but the classical theory predicts that the end-point of spherical gravitational collapse is a breakdown of our present laws of physics at a singularity where space-time itself comes to a singular ending. However, this singularity is invisible to the external world; it is veiled by the event horizon (see ‘Gravitational collapse’ by K. S. Thorne, Scientific American, November 1967).

Exercises

6.7 A radial geodesic $x^i(v)$ in the Schwarzschild solution, where v is an affine parameter, is characterized by three features: (a) θ = constant, φ = constant; (b) $\sum_{i,j} g_{ij}\,(dx^i/dv)(dx^j/dv) = \epsilon$ is constant along the geodesic (the tangent vector $X^i = dx^i/dv$ has constant magnitude along a geodesic because it is parallel propagated); (c) $dt/dv = E/(1 - 2m/r)$, where E (≠ 0) is a constant (this is energy conservation for the particle relative to the static frame). (1) What is the value of ε if $x^i(v)$ is null? What is its sign if $x^i(v)$ is time-like? (2) Show that (a) to (c) lead to the equation

$$(dr/dv)^2 - E^2 = \epsilon\,(1 - 2m/r) \qquad (*)$$

relating the displacement dr to the affine parameter increment dv; using (c), deduce the relation

$$dr/dt = \pm\{1 + (\epsilon/E^2)(1 - 2m/r)\}^{1/2}\,(1 - 2m/r) \qquad (**)$$

between the displacements dr, dt along the geodesic.

6.8 Consider a typical galaxy with radius 10” cm and mass 10^17 cm. Suppose that it collapses under gravity; at what radius will it become invisible to the rest of the universe?

6.9 On a graph of mass against radius (using logarithmic scales), plot points representing the Earth, the Sun, the galaxy in Exercise 6.8, and as many other astronomical objects as you can. Draw in also the line r = 2m. Confirm from this diagram that no ordinary planet, star, star cluster, or galaxy violates the condition R > 2m.

Do black holes actually exist in the universe? If so, will they be like the spherically-symmetric case we have just considered, or are there other possibilities? How will they have formed?
Is there any way we can detect them? In the next three sections, we shall try to answer these and other questions.

6.3 More general black holes

In Section 6.2, we considered the special case of collapse to a non-rotating (spherically-symmetric) black hole. In the real world, such collapsing systems are much more likely to be rotating, and then various more complex geometric features come into operation (see e.g. ‘Black holes’ by Roger Penrose, Scientific American, May 1972). An important issue that arises is the question whether, when rotation is taken into account, every collapse of a massive star will result in the formation of a black hole (i.e. will an event horizon necessarily occur?). The alternative would be the creation of a naked singularity visible to the outside world. Kip Thorne has conjectured that whenever a mass M is concentrated inside a region with ‘circumference’ in any direction less than 2π(2M), then there will be a horizon enclosing the mass. Complementary to this is the cosmic censorship hypothesis that, under certain conditions, all singularities are ‘hidden’ in black holes where they cannot be ‘seen’ by distant observers (i.e. the singularities are not ‘naked’). Neither of these conjectures has been proved, and while the majority view is that they are correct, the issues are not completely resolved (see, for example, The Edge of Infinity by P. C. W. Davies, Dent and Sons, 1981; Frontiers of Modern Physics by T. Rothman et al., Chapter 2, Dover, 1985).

Rotating black holes

The Schwarzschild geometry described in Section 6.2 has to be replaced by a more general solution to Einstein’s equations when the system rotates. For an axially-symmetric rotating black hole, with mass m, charge q and angular momentum per unit mass a, the Kerr-Newman solution for the space-time outside the matter is given by the metric form:

$$ds^2 = -\frac{\Delta}{\rho^2}\left[dt - a\sin^2\theta\,d\phi\right]^2 + \frac{\sin^2\theta}{\rho^2}\left[(r^2 + a^2)\,d\phi - a\,dt\right]^2 + \frac{\rho^2}{\Delta}\,dr^2 + \rho^2\,d\theta^2 \qquad (6.6)$$

where

$$\Delta = r^2 - 2mr + a^2 + q^2, \qquad (6.7)$$

$$\rho^2 = r^2 + a^2\cos^2\theta. \qquad (6.8)$$

This is a generalization of (6.1), the Schwarzschild metric, to which it reduces when a = q = 0. (When a = 0, the solution is known as the Reissner-Nordström solution for a charged black hole.) The Kerr-Newman solution has a horizon at

$$r = r_+ = m + (m^2 - q^2 - a^2)^{1/2}; \qquad (6.9)$$

this takes a real value only if $m^2 \ge q^2 + a^2$ and so corresponds to a black hole only if this inequality holds. Unlike the Schwarzschild metric, there is another non-zero value of r with physical significance:

$$r_0(\theta) = m + (m^2 - q^2 - a^2\cos^2\theta)^{1/2}, \qquad (6.10)$$

which is known as the static limit. To understand the importance of this, we need to consider first the case of a particle dropped straight in towards a black hole from very far away. The cross-term involving dφ dt in the metric means that such a particle acquires an angular velocity in the same direction as the rotating black hole. This effect is known as the dragging of inertial frames. When the particle reaches the ergosphere, which is the region between the static limit and the horizon (Fig. 6.10), this dragging effect is so strong that the particle has to rotate with the hole even if it has arbitrarily large angular momentum in the opposite direction!

Fig. 6.10 A Kerr-Newman black hole; the ergosphere is the region between the horizon at r = r_+ and the static limit at r = r_0(θ).

A black hole has no hair

Let us summarize what is definitely known about the end results of gravitational collapse (see Schutz’s book). If the collapse is nearly spherical, a rotating mass settles down to a Kerr black hole. Any horizon is expected to become stationary eventually, and a stationary black hole is characterized in principle by four quantities: its mass, angular momentum, electric charge, and magnetic monopole charge.
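Returning briefly to the Kerr-Newman radii (6.9) and (6.10) given earlier in this section, the extent of the ergosphere can be made concrete with a few lines of code. This is only an illustrative sketch in geometric units; the function names `horizon_radius` and `static_limit_radius` and the sample values m = 1, a = 0.8 are assumptions introduced here, not part of the text.

```python
import math


def horizon_radius(m, a=0.0, q=0.0):
    """Horizon radius r_+ = m + sqrt(m^2 - q^2 - a^2), as in (6.9).

    Returns None when m^2 < q^2 + a^2, i.e. when no horizon exists.
    """
    disc = m * m - q * q - a * a
    if disc < 0.0:
        return None
    return m + math.sqrt(disc)


def static_limit_radius(m, theta, a=0.0, q=0.0):
    """Static limit r_0(theta) = m + sqrt(m^2 - q^2 - a^2 cos^2 theta), as in (6.10)."""
    disc = m * m - q * q - (a * math.cos(theta)) ** 2
    if disc < 0.0:
        return None
    return m + math.sqrt(disc)


if __name__ == "__main__":
    m, a = 1.0, 0.8  # illustrative uncharged rotating hole
    r_plus = horizon_radius(m, a)
    for degrees in (0, 30, 60, 90):
        theta = math.radians(degrees)
        r0 = static_limit_radius(m, theta, a)
        # Radial thickness of the ergosphere at this polar angle.
        print(f"theta = {degrees:2d} deg   r_0 - r_+ = {r0 - r_plus:.4f}")
```

In this sketch the static limit touches the horizon at the poles (θ = 0) and lies furthest outside it in the equatorial plane, which is where the ergosphere is thickest. Setting a = q = 0 returns r = 2m for both radii, the Schwarzschild value.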
Since magnetic monopoles are not known to exist in nature and since any net electric charge is likely to be neutralized by accretion of opposite charge, in practice it is usual to consider only the mass and angular momentum. This has given rise to the cliché ‘a black hole has no hair’, meaning that it has no other independent physical characteristics. If the horizon is not stationary, it has been proved by Hawking that the area can only increase in size.

Energy extraction from black holes

An unexpected result of Penrose has shown how energy could in principle be extracted from a black hole with an ergosphere. The basic idea is to make the black hole absorb a particle with negative energy, so that there is an effective increase in energy outside the black hole. It has to be arranged (somehow) that a particle entering the ergosphere breaks up into two parts, one with negative total energy. For a Kerr black hole, this fragment falls into the hole, while the other one escapes to infinity. Unfortunately, this Penrose process is not the solution to the world’s energy needs as it cannot go on indefinitely. This is because the negative energy particles also carry negative angular momentum, so the rotation of the black hole slows down and the ergosphere eventually disappears.

6.4 Black hole evaporation and thermodynamics

As mentioned in Section 5.12, Hawking has shown that, due to quantum effects, one would expect a black hole to emit radiation so that it is no longer opaque. This holds whether or not the hole is rotating and is quite different from the Penrose process, which is a classical phenomenon. Although the details of the
Although the details of the 258 Spherical stars and stellar collapse Hawking effect involve the interaction of quantum field theory with curved space-time, a complex topic beyond the scope of this book, we shall explain the basic ideas and refer the reader to “The quantum mechanics of black holes’ by Stephen Hawking, Scientific American, January 1977, for a fuller discussion. The Hawking process From quantum field theory, it is known that space is filled with ‘vacuum fluc- tuations’ of the electromagnetic field, in which pairs of photons with energies +E are created and then recombine within a time At given by the uncertainty prin- ciple A‘AE > h, where AE is the uncertainty in their energies and f is Planck’s constant. Normally a photon with negative energy could not propagate in ordinary space, but if a vacuum fluctuation takes place near the horizon of a black hole, the position of the horizon itself being influenced by the uncertainty in the position of the light cone, then there is a small chance that within time Ar the negative-energy photon will end up inside the horizon, where, for technical reasons, it can propagate freely. The positive-energy photon can then escape to infinity, producing radiation from the black hole (see Fig. 6.11). This mechanism works not just for photons, but also for other lypes of particle, so a black hole should emit the full range of radiation, provided it is sufficiently hot. Hawking showed that the radiation has a black-body spectrum, with temperature inversely proportional to the mass of the black hole. (Incidentally, this shows that a black hole has negative specific heat, which is typical of self-gravitating systems.) Using Positive energy Particle escaping to infinity | Negative energy particle fallin Particle into black hole anti-particle pairs ~~ Event horizon Fig. 6.11 ‘Iwo-dimensional diagram of the mechanism for the Hawking process: particle-antiparticle pairs are produced in vacuum fluctuations. 
When this happens near the horizon of a black hole, the negative energy particle may fall into the black hole and the positive energy one escape to infinity. 6.4 Black hole evaporation and thermodynamics 259 the rate of radiation ofa black body, it can be shown (see Exercise 6.10) that the lifetime of the black hole is proportional to the cube of its mass. Thus big black holes live longer, but not for ever; they radiate away all their mass in a finite time. For black holes of stellar mass, which have temperature 3 x 10-®K, their potential life-time is of the order of 10°? years, which is much longer than the current age of the universe. On the other hand, much smaller black holes formed in the early universe should have radiated away by now (see Section 6.5). The thermodynamics of black holes The work of Hawking and others has led to a very elegant analogy between the laws of black hole radiation and the laws of thermodynamics. We have already stated in the last section that the area of a black hole cannot decrease, which parallels the second law of thermodynamics, that the entropy or disorder of an isolated system never decreases. The dE = TdS in the first law of thermo- dynamics is paralleled by the black hole law dM = (1/87M)d(4/4) where dM is the change in the hole’s energy and 4 is the horizon area. Thus if we take the horizon area to be proportional to the entropy, and the surface gravity (the strength of gravity’s pull on an object just outside the horizon), which depends roughly on the inverse of the mass of the black hole, to be proportional to the temperature, these laws are stating the same properties in different contexts, There are also analogues of the zeroth and third laws of thermodynamics (see J.M. Bardeen, B. Carter, and S. W. Hawking: Comm. Math. Phys. 31, 161-170, 1973). 
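The two scaling statements made above (temperature inversely proportional to mass, lifetime proportional to the cube of the mass) can be checked with a short numerical sketch. Two assumptions are made that are not spelled out in the text: the temperature is evaluated with the standard expression T = ħc³/(8πGMk_B) for a non-rotating hole, using rounded SI constants, and the mass-loss rate is taken as dM/dt = −1/M² in arbitrary units, which is what a black-body rate proportional to (area) × T⁴ implies. The function names are illustrative only.

```python
import math

# Rounded SI values of the physical constants (assumed, not from the text).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg


def hawking_temperature_kelvin(mass_kg):
    """Standard Hawking temperature of a non-rotating black hole, in kelvin."""
    return HBAR * C ** 3 / (8.0 * math.pi * G * mass_kg * K_B)


def relative_lifetime(mass, steps=100_000):
    """Evaporation time in arbitrary units, assuming dM/dt = -1/M^2.

    The time to radiate the whole mass is the integral of M^2 dM from 0 to
    `mass`, evaluated here with the midpoint rule.
    """
    dm = mass / steps
    return sum(((i + 0.5) * dm) ** 2 for i in range(steps)) * dm


if __name__ == "__main__":
    for n_solar in (1.0, 2.0, 10.0):
        t_k = hawking_temperature_kelvin(n_solar * M_SUN)
        print(f"M = {n_solar:4.1f} solar masses   T = {t_k:.2e} K")
    ratio = relative_lifetime(2.0) / relative_lifetime(1.0)
    print(f"lifetime(2M) / lifetime(M) = {ratio:.6f}")
```

Doubling the mass halves the temperature and multiplies the lifetime by eight, as the cube law requires. For a hole of about two solar masses the temperature formula gives a few times 10^-8 K, the same order of magnitude as the figure quoted for stellar-mass black holes.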
Although the argument may seem to have gone full circle, we may now see that, because of this analogy, a black hole must have a finite temperature and must emit radiation (see J. D. Bekenstein: Physics Today 33, 24-31, 1980).

The relationship between entropy and the horizon area of a black hole is a very puzzling one, and leads to what some physicists regard as a fascinating paradox. When anything disappears into a black hole, then almost all information about it appears to be lost, since in practice only the mass and angular momentum are likely to characterize the black hole, and they are all that can be measured from outside. (Incidentally, this means that one can think of the entropy of a black hole as a measure of all the ways in which the hole could have been made.) Once inside the black hole, the object, and most of its information, will eventually be crunched up in the central singularity. However, it may be that that information is still available in some form through the Hawking process, when the mass of the black hole is gradually radiated away (for if this is not true, some of the usual features of quantum theory are violated). Some would argue that although this does not mean that a copy of this book which happened to fall into a black hole would necessarily re-emerge in a recognizable form millions of years later, the information contained in it would be there in some sense.

The so-called information paradox arises from these two apparently contradictory pictures: almost all the information is destroyed in the central singularity, but at the same time it is available to re-emerge as Hawking radiation! This is still a controversial subject; some physicists argue that information really is lost, and any apparent problem arises because we do not yet have a complete theory of
Some who believe there is a real paradox think that its reso- lution lies in complementarity, the idea that both pictures are correct but are describing the same reality from different view points (in the way that, for example, light has to be viewed both as a particle and a wave to explain all its properties). Fora fuller discussion, see ‘Black holes and the information paradox’ by L. Susskind, Scientific American, April 1997, 40-45. Inside a black hole We have not talked much about the singularity at the centre ofa black hole. This is partly because no one actually knows very much about it. What we do know, as stated already in the section on quantum gravity, is that the laws of general relativity fail at or near the hole’s centre and need to be replaced by some new laws of quantum gravity (see ‘The lesson of the black hole’ by J. A. Wheeler, Proc. Am. Phil. Soc. 125, 25-37, 1981). In the region where quantum effects become important, it has been conjectured that space-time has a foam-like structure, with its topology (the way it is connected together) fluctuating probabilistically over very short distances. One has to imagine a multiply-connected structure with lots of handles and tunnels! The mystery surrounding the singularity itself has led to some conjectures which sound more like science-fiction than science, but are taken as serious scientific possibilities. For example, maybe the singularity is in some sense the gateway for expansion into a new universe. his idea has given rise to the idea of a ‘phoenix universe’ and to the concept of a succession of universes following a type of Darwinian evolution (see Lee Smolin’s book The Life of the Cosmos (London, Weidenfeld and Nicolson, 1997). 
Exercise

6.10 Use the following facts to show that the lifetime of a black hole is proportional to the cube of its mass: (i) the temperature of a black hole is inversely proportional to its mass; (ii) the horizon area is proportional to the square of its mass; (iii) the rate of radiation is proportional to the horizon area and to the fourth power of the temperature.

6.5 Black hole candidates and ways of detecting them

It is hardly necessary to say that it is very hard to detect black holes, because of their very nature. We have to look for evidence of their gravitational pull on other bodies, radiation like X-rays from matter accreting onto them, and gravitational waves emitted, particularly at moments of collapse. We shall now discuss the various possible types of black holes and the particular ways in which each type might be detected.

Stellar collapse

On theoretical grounds, we believe that many black holes should occur at the end-point of the life of massive stars, which cannot be prevented from collapsing by any known physical force (see the subsection on sources of gravitational waves in Section 5.10). Such black holes would have masses between two and one hundred solar masses. Their detection is much more feasible when they are in a binary orbit with a visible star. In that case, not only does the motion of the visible companion suggest the presence of an invisible object, but also matter slowly spiralling in towards the rotating black hole tends to form an ‘accretion disc’ in its equatorial plane. Different parts of the disc rotate with different speeds, and the resulting frictional heating leads to the emission of X-rays. Once these are detected, study of the structure of the accretion disc and of the orbit of the visible star can lead to limits on the mass of the invisible object. For example, the first candidate widely believed to be a black hole is the X-ray source known as Cygnus X-1.
The Uhuru satellite recorded data showing that the X-rays varied over a very short time-scale, which meant that the source was very compact. The data on the spectrum of the visible star gave an indication of its mass, and the resulting model predicted that the mass of the invisible object was at least 3 solar masses, probably greater than 7 solar masses, and most likely about 16 solar masses! Since all of these possibilities are well over the mass limit for a neutron star, it was deduced that a black hole had been located (see e.g. ‘The search for black holes’ by Kip Thorne, Scientific American, December 1974).

Recent work by Narayan and collaborators on particular models of accretion discs has pinpointed another way of distinguishing black holes from neutron stars. The energy carried through an accretion disc to the central object will disappear if it is a black hole, or be re-radiated if it hits the ‘hard’ surface of a neutron star. Observations of this phenomenon promise an exciting new approach to black hole detection. (See R. Narayan, ‘Astrophysical evidence for black hole event horizons’ in Gravitation and Relativity: At the Turn of the Millennium; Proceedings of the GR-15 Conference, Pune, India, December 1997, edited by N. Dadhich and J. Narlikar (IUCAA 1998).)

Quasars and galactic centres

Although quasi-stellar objects or quasars were discovered in the early 1960s, they are still not fully understood. They are extreme examples of what are known as active galactic nuclei, which are the very bright central sources in so-called ‘active galaxies’. These central sources emit an unusually large component of blue light and are often as bright as the entire surrounding galaxy. The light-emitting region is typically about a light-year in size but can be much smaller, and the brightness in various parts of the spectrum depends on a number of factors like the magnetic fields.
One theory is that at its centre, a quasar has a supermassive black hole of perhaps 100 million solar masses. As material accelerates in, it heats up and radiates, but this does not explain the enormous amounts of power produced by quasars. The light comes from a massive compact gaseous object heated by an extremely powerful small engine. Chemical power, nuclear power and the conversion of matter to anti-matter are all inadequate as sources of this power, and it is believed that only gravity can provide the energy. The existence of giant double lobes in radio galaxies has been used to cast doubt on the ‘small engine’ idea, but in fact, such galaxies also emit radio waves from their central cores, and a single source there could be responsible for all the radio emission through gas jets emerging from the centre and creating the radio lobes. Because these jets, which emerge on opposite sides of the core, are straight for at least a very substantial distance, the central engine has to fire them in the same direction for a very long time. Therefore the nozzles that collimate the jets must be attached to a superbly steady object, a long-lived gyroscope of some sort. The ‘best-buy’ candidate is a gigantic spinning black hole, with the less likely alternative being a massive spinning magnetic star. There are at least four possible mechanisms for the creation of the jets. For example, in the Blandford-Znajek process, their energy comes from the hole’s rotational energy.

An example of the type of object just described is the radio source 3C 273, identified in the 1960s. It has a high redshift, which shows that it moves with 16 per cent of the speed of light. Although it is very distant, it looks very bright; it radiates immense amounts of power, making it 100 times more luminous than the brightest galaxy ever seen before.
Its brightness fluctuates within the period of a month, indicating that light comes from a region smaller in size than a light-month and therefore 10^18 times smaller than the volume in which a typical galaxy produces its light (see ‘The quasar 3C 273’ by T. J.-L. Courvoisier and E. I. Robson, Scientific American, June 1991, 24-31).

It is also thought that there are supermassive black holes at the centres of many ‘ordinary’ galaxies like our own (see, for example, the image of the galaxy NGC 3377 on the front cover, taken from http://www.seds.org/hst/97-01.html). These were perhaps formed by the collapse of a cluster of stars, or from the cumulative interactions of stars in the galaxy core, with friction driving the interstellar gas down into the core. Evidence for the existence of these black holes comes from brightness enhancement in nearby stars or perhaps just an enhanced concentration of stars there, and anomalously high velocities near the centre, indicating the presence of a very massive object. The orbital motion of gas clouds near the core of our galaxy suggests that they are moving round an object of mass about 3 million times the mass of the Sun. Another particularly good candidate is at the centre of the galaxy M87, which displays these features, including velocities ranging up to 500 km/sec, and is thought to have a mass of about 10°° solar masses. (For more discussion of the evidence for black holes at the centres of galaxies, see for example, ‘Galactic nuclei and quasars: supermassive black holes’ by M. J. Rees, New Scientist 80, 188-191, 1978; ‘The central parsec of the galaxy’ by T. Geballe, Scientific American, July 1979; ‘Cosmic jets’ by M. Begelman, R. Blandford and M. J. Rees, Scientific American, May 1982; ‘Centaurus A’ by J. Burns and R. Price, Scientific American, November 1983; ‘Black holes in galactic centres’ by M. J. Rees, Scientific American, November 1990; ‘A new look at quasars’ by M.
Disney, Scientific American, June 1998, 36-41.)

Primordial black holes

At the other end of the scale, small ‘primordial’ black holes could have formed from massive density fluctuations in the early universe. Making reasonable assumptions about the possible size of such fluctuations shows that a primordial black hole could well have a mass comparable to that of the Earth, in which case the radius of its horizon would be about 1 cm! Detecting such black holes sounds even more implausible than detecting supermassive ones, but in theory there is one particular possibility based on the existence of Hawking radiation described in Section 6.4. It is conceivable that any primordial black hole still around would be completing its evaporation process now. Calculation of the energy radiated in its last second shows that it would briefly have similar luminosity to a small star, but its spectrum would be very different. Unfortunately no events of this type have been observed as yet.

To summarize, we have reasonable evidence for several stellar mass black holes in our galaxy. Also, many astronomers find the existence of supermassive black holes at the centres of quasars and many galaxies the most plausible explanation for the phenomena observed. The reader is referred to two books which discuss these issues in much greater detail: Gravity’s Fatal Attraction: Black Holes in the Universe by M. Begelman and M. J. Rees (Scientific American Library, W. H. Freeman, 1996) and Black Holes and Time Warps: Einstein’s Outrageous Legacy by K. S. Thorne (W. W. Norton, 1994).

Exercise

6.11 Read up on the evidence for the existence of black holes (a) as remnants of the collapse of massive stars, (b) at the centre of our own galaxy [see e.g. the Scientific American articles cited above; The Cambridge Encyclopaedia of Astronomy, ed. S. Mitton, Cambridge University Press, 1979; The New Astronomy by N. Henbest and M.
Marten, Cambridge University Press, 1983].

Computer Exercise 15

(A) (a) Using equation (**) of Exercise 6.7, write a subroutine GEODESIC to determine a numerical approximation to the geodesic curve R(T) starting at a point (T1, R1), with the constant ε/E² denoted by EPS [choose a time increment DT and then for I = 1, 2, ... repeatedly determine the corresponding increment DR(I) and hence find the next point T(I), R(I) on the curve, where the initial values are T(1) = T1, R(1) = R1]. (b) Choose values for T1, R1 (where R1 > 2M) and EPS (where EPS ≠ 0 and is such that DR(1) < 0; note that the numerical approximation suggested will break down if DR(I) = 0). Use your subroutine GEODESIC to find the resulting geodesic γ, and show that for arbitrarily large coordinate times this curve never crosses R = 2M. (c) Now use program PROPER from Computer Exercise 14 to show that the proper time along the geodesic until R = 2M is reached is finite.

(B) If you are feeling strong, calculate [using similar methods to Part (A)] the radial outgoing null geodesics from γ to an observer O who remains stationary at radial distance R = R1, starting from γ at the times T(I). Use the program PROPER from Computer Exercise 14 to determine the corresponding proper-time intervals DTAU-EMIT, DTAU-OBS between the light rays measured by γ and by O. Hence explicitly determine the redshift measured by O for light emitted by γ. Show how this diverges as γ approaches r = 2m.

7 Simple cosmological models

In this final main chapter, we look at curved space-time models of the large-scale geometry of the universe. These explain our observations of the systematic redshifts of distant galaxies in terms of expansion of the universe from a hot initial state, preceded by a space-time singularity (the ‘Big Bang’) which is the origin of the universe and indeed of space-time itself.
We have already examined an expanding universe model (the Milne universe) in Chapter 4; however, that was a flat-space universe model, not incorporating the effects of gravity. In this chapter we use the same concept that a universe model is a space-time on which there is defined a set of preferred world-lines (‘fundamental world-lines’), representing the average motion of matter in the universe (cf. Section 4.3). However, we now look at the consequences of using Einstein’s field equations to determine the space-time curvature. As in the case of black holes, perhaps the most intriguing features resulting are causal limits that occur in space-time, which lead to the existence of ‘particle horizons’. We shall look at their properties in detail; these can be understood on the basis of a good grasp of the nature of curved space-times and the properties of light cones in these space-times.

A good companion book to the present one in terms of understanding present day cosmology is A Short History of the Universe by Joseph Silk (Scientific American Library, W. H. Freeman, 1997). We will refer to that book in what follows as A Short History.

7.1 Space-time geometry

The simplest cosmological models are obtained by assuming that the large scale features of the universe are spatially homogeneous, i.e. are the same at all points in space, and are isotropic about us, i.e. are the same in all directions about us. These assumptions are clearly not true locally, but may be good approximations on a very large scale; specifically, when we average over volumes of size 600 million light years and up (substantially larger than the local supercluster of galaxies). It follows from these assumptions that the universe must be isotropic about every point, i.e. every observer will see the large-scale properties of the universe to be the same in all directions about him.
For historical reasons, we shall refer to universe models in which this condition is exactly fulfilled as Friedmann-Lemaître-Robertson-Walker, or FLRW, universe models (Friedmann was Russian, Lemaître was Belgian, Robertson was American, and Walker is English).

The metric form

In FLRW space-times, coordinates can be chosen so that the invariant interval takes the form

$ds^2 = -dt^2 + R^2(t)\{dr^2 + f^2(r)(d\theta^2 + \sin^2\theta\, d\phi^2)\}$ (7.1)

where t is a time coordinate, θ and φ are standard angular coordinates, r is a coordinate determining spatial radial distance, and depending on the nature of the universe model f(r) is either sin r, r, or sinh r.* The fundamental world-lines, representing the average motion of matter in the universe, are the curves {r, θ, φ constant}. As in Section 4.3, we shall refer to an observer moving on these world-lines as a ‘fundamental observer’; by definition, he is moving at the average velocity of motion of matter in the universe, and so his observations represent what the universe will be like for an average observer. Because of the spatial homogeneity, the density μ and pressure p of matter in this universe are functions only of the coordinate time t. There is thus no spatial gradient of any physical quantity in the universe that could cause the fundamental observers to move non-inertially; and indeed it follows from the metric form (7.1) that they are in free fall (i.e. they are moving under gravity and inertia alone, cf. Section 5.3).

It immediately follows from the metric form (7.1) that the coordinate t measures proper time along these flow lines (Fig. 7.1). It is not immediately obvious, but (7.1) implies spatial homogeneity: all physical and geometrical quantities are uniform in the surfaces {t = const}. The form (7.1) also implies that for every fundamental observer, these space-times are isotropic about each point, i.e. all directions are equivalent (so for example a fundamental observer cannot point in any particular direction and say ‘the centre of the universe lies in that direction’, since no direction is preferred over any other). As in the previous chapter, this follows for the observer at the origin r = 0 because θ and φ occur in (7.1) only in the form of the metric of a two-sphere, which is spherically symmetric. From the spatial homogeneity of the universe model, it then is clear that this is true for every fundamental observer.

Fig. 7.1. The world-lines {r, θ, φ constant} of fundamental galaxies and observers in the FLRW universe. The proper times along these world-lines between surfaces of constant time are just the coordinate time differences.

* The function sinh r is the hyperbolic sine, introduced in the discussion of the Rindler universe (Section 4.3).

Exercise 7.1 By considering an arbitrary displacement $dx^i = (0, dx^1, dx^2, dx^3)$ in a surface $t = t_0$, show that this surface is orthogonal in the space-time sense (see eqn (5.6c)) to the world-lines {r, θ, φ = constant}. Deduce that this displacement is instantaneous for a fundamental observer.

The space-sections

The surfaces {t = constant} are locally surfaces of simultaneity for all fundamental observers, because they are orthogonal (in the space-time sense) to the matter world-lines. They are surfaces of homogeneity in space-time, that is, all physical quantities are constant on them (in particular p = p(t), μ = μ(t)). It is instructive to examine in some detail the geometry of these surfaces. Consider the surface $t = t_0$, which can be seen to have the metric form

$ds^2 = R^2(t_0)\{dr^2 + f^2(r)(d\theta^2 + \sin^2\theta\, d\phi^2)\}$ (7.2)

Just as in the case of spherical polars in flat space-time, so the coordinates here are centred on the (arbitrary) point O where r = 0, which is equivalent to every other point in these surfaces.
Moving radially out from this point to the coordinate value $r = r_s$ (Fig. 7.2), one will reach a two-sphere S with metric

$ds^2 = R^2(t_0) f^2(r_s)(d\theta^2 + \sin^2\theta\, d\phi^2)$ (7.3)

the area of which is $A = 4\pi R^2(t_0) f^2(r_s)$. From (7.2), the distance from O to this sphere is $D = R(t_0) r_s$. The implications depend on which form of f(r) applies.

Fig. 7.2 The coordinate r is not the radial distance, nor is it an area coordinate as it was for the Schwarzschild solution (Chapter 6). The area A of a sphere with coordinate $r_s$ at $t = t_0$ is $4\pi f^2(r_s) R^2(t_0)$ and its radial distance from the origin is $R(t_0) r_s$.

Flat space If f(r) = r, then $A = 4\pi R^2(t_0) r_s^2$ and we have the usual relation between A and D, i.e. $A = 4\pi D^2$. This is precisely the relation that holds in Euclidean space, and in fact this is just the case when the space-sections are flat (i.e. they are surfaces of zero curvature). These space-sections continue indefinitely; thus this is a spatially infinite universe. It will, for example, therefore contain an infinite number of galaxies (because of the spatial homogeneity of the distribution of galaxies).

Hyperbolic space If f(r) = sinh r, $A = 4\pi R^2(t_0) \sinh^2 r_s$. Now as we move out from any point, the area of the sphere S is greater than it would be in Euclidean space because $\sinh^2 r > r^2$ (Fig. 7.3). This is the case of a hyperbolic three-space of constant negative curvature $-1/R^2(t_0)$, characterized by the relation $A = 4\pi R^2(t_0) \sinh^2(D/R(t_0))$, showing how the distance D relates to the surface area A of a two-sphere centred on any point in the three-space. Again, the space-sections continue indefinitely; this is also a spatially infinite universe containing an infinite number of galaxies.

Elliptic space If f(r) = sin r, then $A = 4\pi R^2(t_0) \sin^2 r_s$. Now as we move out from any point, the area of the sphere S is less than it would be in Euclidean space because $\sin^2 r < r^2$ (Fig. 7.3).
This is the case of an elliptic three-space of constant positive curvature $1/R^2(t_0)$, characterized by the relation $A = 4\pi R^2(t_0) \sin^2(D/R(t_0))$, showing how the distance D relates to the surface area A of a two-sphere centred on any point in the three-space.

Fig. 7.3 The local geometry of the three-spaces of constant time in the FLRW universes is characterized by the area of a sphere of radius D. Here this area is plotted against $D^2$ for elliptic (k = +1), flat (k = 0), and hyperbolic (k = −1) spaces (we have taken $R(t_0) = 1$; then $D = r_s$).

In this case, a new feature arises. As D increases, the area A increases to a maximum, then decreases, and finally goes to zero at a point P, the point ‘antipodal’ to O. Thereafter A increases again, goes to a maximum, and decreases to zero again. To understand this, consider moving out from O on geodesics in any direction $D_1$, and the directly opposite direction $D_2$. At a distance d from O, these curves intersect a two-sphere S centred at O in two points $p_1$ and $p_2$ antipodal to each other on S (Fig. 7.4). As the distance d increases, the area of the sphere S reaches a maximum and then starts decreasing again. As this area goes to zero, the geodesics approach a point P antipodal to O in the three-space; the curves approach P from precisely opposite directions $D_1'$ and $D_2'$ (because they intersect each surface S in points antipodal to each other on S). Hence, the situation is as follows: moving out from O in the direction $D_1$, a geodesic passes through all the points $p_1$, reaches P in direction $D_1'$, leaves P in direction $D_2'$, passes through all the points $p_2$, and arrives back at O from the direction $D_2$ (Fig. 7.4). Therefore, the universe is necessarily spatially closed: moving radially out from O in any direction, one passes through the antipodal point P and then arrives back at O.
The maximum distance of any point in the space from O cannot exceed the distance to P, and the total volume of the three-space is finite.

Fig. 7.4 The global geometry of an elliptic (k = +1) three-space. Geodesics in opposite directions $D_1$ and $D_2$ from O cut a series of two-spheres in antipodal points $p_1$ and $p_2$. Because the area of the two-spheres eventually goes to zero, the geodesics eventually meet again at P, the point antipodal to O. They approach P from opposite directions $D_1'$ and $D_2'$; hence a geodesic starting from O in the direction $D_1$ and continuing without deviation will arrive at P from the direction $D_1'$, pass through P, and continue in the direction $D_2'$, eventually arriving back at O from the direction $D_2$.

Fig. 7.5 The two-dimensional analogue of Fig. 7.4. Geodesics in opposite directions from a point O on the two-sphere meet again at P, the point antipodal to O. En route they cut each circle centred at O in opposite points $p_1$ and $p_2$.

An exact model of this situation is given by looking at the geometry of a two-sphere of radius a, where exactly the same occurs but with one dimension less (Fig. 7.5). Starting from any point O on a two-sphere, moving a distance d along great circles whose initial directions at O are opposite each other, one arrives at opposite points $p_1$ and $p_2$ on a circle C with circumference $2\pi a \sin(d/a)$. These circles focus at a point P antipodal to O on the two-sphere; continuing in an unchanged direction along either of them, one arrives back at the original point O from the opposite direction. This is an exact model of the geometry of the three-dimensional spaces of constant curvature. The way the circles C spread out from the point O and refocus at the antipodal point P gives a good idea of how the two-spheres S do the same in the full three-dimensional case.
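The spreading and refocusing just described can be checked numerically. The following is a minimal sketch, not part of the original exercises: the function names are our own, and we take a = 1 and $R(t_0) = 1$ purely for convenience. It tabulates the circumference $2\pi a \sin(d/a)$ of the circles C on the two-sphere alongside the area $4\pi R^2(t_0)\sin^2(D/R(t_0))$ of the two-spheres S in the elliptic three-space.

```python
import math


def circle_circumference(d, a=1.0):
    """Circumference of the circle of points at geodesic distance d from O
    on a two-sphere of radius a: C = 2*pi*a*sin(d/a)."""
    return 2.0 * math.pi * a * math.sin(d / a)


def sphere_area_elliptic(D, R0=1.0):
    """Area of the two-sphere at distance D from O in an elliptic three-space
    with scale R0 = R(t0): A = 4*pi*R0**2 * sin(D/R0)**2."""
    return 4.0 * math.pi * R0 ** 2 * math.sin(D / R0) ** 2


if __name__ == "__main__":
    a = 1.0  # assumed radius / scale, chosen only for illustration
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        d = fraction * math.pi * a  # distance from O as a fraction of the way to P
        print(f"d = {fraction:4.2f} * pi * a   "
              f"C = {circle_circumference(d, a):7.4f}   "
              f"A = {sphere_area_elliptic(d, a):7.4f}")
```

Both quantities grow from zero at O, peak halfway to the antipodal point (d = πa/2), and return to zero at P (d = πa), which is the behaviour sketched in Figs 7.4 and 7.5.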
That this two-dimensional case gives a good model of the full three-dimensional case follows because such a two-sphere is a two-dimensional section of the three-dimensional spaces of constant curvature. To see this, choose a circle in each two-sphere S by setting θ = π/2 (so dθ = 0 and the circle coordinate is φ); then (7.2) reduces to

$ds^2 = R^2(t_0)(dr^2 + \sin^2 r\, d\phi^2)$, (7.4)

the metric form of a two-dimensional section of the full three-space. However, this is a two-sphere metric with the properties just discussed. We see then that, in the elliptic case, the space-sections are necessarily finite, and consequently the universe contains a finite number of galaxies.

The three-dimensional spaces with metric $ds^2$ given by (7.2), whose geometry we have now examined in detail, are called three-spaces of constant curvature κ. The curvature κ depends on the time $t_0$; it can be expressed in the form $\kappa = k/R^2(t_0)$, where k = +1 in the elliptic case, k = 0 in the flat case, and k = −1 in the hyperbolic case (i.e. k is +1, 0, or −1 when f(r) is sin r, r, or sinh r respectively).

The scale function and time evolution

The above discussion leads us to expect that all distances in the surfaces {t = constant} will scale as R(t), all areas as $R^2(t)$, and all volumes as $R^3(t)$. This is indeed the case; for example, the total volume of the finite (k = +1) universes scales as $R^3(t)$. Now the fundamental particles of the models (representing clusters of galaxies in the universe) can be thought of as at fixed positions in these surfaces, because they lie at constant coordinate values r, θ, φ. Thus, all the distances between them will also scale with R(t) (Fig. 7.6); this follows directly from (7.2) on noting that this distance will be an expression depending only on the spatial coordinates, multiplied by R(t). For this reason, the function R(t) is often referred to as the scale function of the universe model. As indicated by the functional notation used, in general R varies as time progresses.
In this case, the distances between all particles in the universe scale with R(t), increasing when R does and decreasing when R does. Hence the metric form (7.1) expresses the very important concept that the universe can evolve with time. Note that not merely do the distances between all clusters of galaxies vary with R(t), but also the space-time itself evolves: the curvature of the three-spaces {t = constant} varies as $k/R^2(t)$; the density of matter will vary, and this can be shown to represent part of the curvature of space-time; further, in the case k = +1, the total volume of these three-spaces will vary with time. The way R(t) varies is determined by Einstein’s field equations of gravitation, and depends on the amount of matter and radiation in the universe.

As we shall see shortly, the evidence is that we live in an expanding universe, where R(t) is presently increasing, having increased to its present value from zero. It is important to notice two features of such an expansion. Firstly, the expansion is an expansion of the universe as a whole; therefore it is not an expansion into anything (there is nothing outside the universe for it to expand into, since it is the totality of all that exists!). A simple model is as follows: consider a sheet of paper with pictures of galaxies on it, but where the sheet has no edge: it continues to infinity, thus there is nothing beyond it. Now consider what happens if the size of the sheet is increased to twice as large. It will still stretch to infinity, but now the distances between the galaxy images will be twice as large. It has not expanded into anything: it has just got twice as large while still being without limit (this is possible because of the fundamental paradoxical property of [...]

[Fig. 7.6: galaxy world lines]

[...] = 0; that is, it has closed spatial sections.
This universe model, originally found by Einstein in 1917, is a curved space-time which is an exact static solution of the gravitational field equations. Apart from the fact that it has spatial sections of positive curvature (and so there are only a finite number of galaxies in such a universe model), it is similar to the Minkowski universe discussed in Section 4.3. It is unchanging in time, and so is infinite in the time dimension. There will be no systematic redshifts predicted in it. Further, it is unstable: if any density fluctuation were to occur, it would either collapse to a singularity because gravitational forces overwhelm the repulsion due to the cosmological constant, or expand forever because the cosmological constant overcomes the attractive power of the matter. For these reasons, the Einstein static universe is not believed to be a good model of the real universe, and we will not consider it further. If we follow Einstein and assume that the ‘cosmological constant’ Λ is zero, then static solutions of the field equations appear to be impossible. (For further discussion of the value of Λ in the light of new observational data, see Section 7.4.)

Evolving universes

Provided the energy density and pressure in the universe are positive, Einstein’s field equations uniquely imply that the universe must expand from an infinitely compressed state, the ‘hot big bang’, with the rate of expansion decreasing as the universe ages. We consider first the initial expansion of the universe, and then its behaviour at later times.

The early universe

In all cases, at very early times (when the universe is filled with radiation, cf. the discussion below) the evolution proceeds according to

$R(t) \propto t^{1/2}$, (7.5)

where we have chosen the time coordinate t so that t = 0 at the origin of the universe (where R = 0).
This implies in particular that the universe begins by expanding from that time, when the matter in the universe is indefinitely compressed (because R(t) = 0 there) and the density and temperature are infinitely large (Fig. 7.7). This is the ‘Hot Big Bang’, a singular origin to the universe. As far as classical physics can predict, the curvature of space-time is infinite there and space, time, and even the laws of physics do not exist before: thus this is the origin of the universe. We can only use the laws of physics to understand the evolution of the universe after this creation event.

Fig. 7.7 The density μ, temperature T and scale factor R of the universe plotted against time t. At t = 0, the ‘Hot Big Bang’ singularity corresponds to infinite density and temperature in zero volume!

At very early times the physics involved in understanding the evolution of the universe is ill understood, and our theories about what happens are speculative. However, at times later than about 1 second after the expansion began, the physics involved is reasonably well understood. The universe was filled with a very hot interacting mixture of particles and radiation in equilibrium with each other, that cooled as the universe expanded (the temperature T is proportional to 1/R, and was $10^{10}$ K at t = 1 second, see Fig. 7.7). As the temperature dropped, element formation (nucleosynthesis) took place at about $10^9$ K, and then the matter and radiation in the universe decoupled when the temperature was about 3000 K (the universe was opaque to electromagnetic radiation at earlier times when electrons, freely moving between nuclei, scattered light strongly, but was transparent afterwards when the electrons were bound together with nuclei to form atoms). The remnant radiation from this time is observed by us today as black-body radiation at a temperature T of approximately 3 K, observed with very sensitive radio receivers and infra-red-radiation detectors (see ‘The primeval fireball’ by P. Peebles and D. T. Wilkinson, Scientific American, June 1967). Although it is very difficult to detect because of this low temperature, the discovery of this radiation in 1965 was of great importance, because it is direct evidence that there was a hot early stage in the universe, when R(t) was much less than it is now. Further, this radiation provides direct evidence of conditions very early on (at the time of decoupling, long before the existence of any stars or galaxies: see Fig. 7.8). It is not possible by analysing any electromagnetic radiation to obtain information about times earlier than the time of decoupling, because the universe was opaque before then. Further, the isotropy of this temperature (it is the same in all directions to an accuracy of 1 part in 10*) is the best evidence we have for the uniformity of the universe at very early times; the very small remnant anisotropy detected can be understood as due to our motion relative to the fundamental velocity at our space-time position (see Section 3.1 above, and ‘The cosmic background radiation and the new aether drift’ by R. Muller, Scientific American, May 1978).

Fig. 7.8. Black-body radiation arriving along the past light cone. Before the decoupling time $t_d$ the universe was opaque, the redshift greater than 1000 and the temperature greater than 3000 K. The temperature of the radiation is now 3 K.

The physics involved in the early stages of the universe is very complex. A brief summary, and references for further reading, is given in an Appendix to this section.

The late universe

The later behaviour of the universe differs according to whether the spatial curvature is positive, zero, or negative (see Fig. 7.9a).
Assuming the cosmological constant, Λ, is zero, then if k = −1, the universe is a low-density universe that easily expands forever; if k = 0, it is a high-density universe that just manages to expand forever; if k = +1, it is a high-density universe that expands to a maximum value of R(t) and then recollapses in the future, ending at a second singularity similar to the initial singularity where it began. If Λ > 0, then the universe will in many cases expand forever, even if k = +1.

It is clearly of considerable interest to find out whether k = 0, +1, or −1, since this determines not only whether the universe space-sections are finite or infinite (as discussed above) but also whether the universe will expand forever or not if Λ = 0. One attempts to determine the value of k by astronomical observations of distant galaxies, using these to determine the behaviour of R(t) and hence to infer the value of k.

Fig. 7.9 (a) The scale factor R(t) plotted against time t. For k = −1 and 0, it increases indefinitely; for k = +1 it increases to a maximum and then decreases again to zero. (b) The Hubble constant $H_0$ is the slope of the curve R(t) at the time $t_0$, and the deceleration parameter $q_0$ its curvature then.

The basic parameters

The basic parameters characterizing different universe models are the Hubble constant

$H_0 = \left(\frac{1}{R}\frac{dR}{dt}\right)_0$

and the deceleration parameter

$q_0 = -\left(\frac{1}{R H_0^2}\frac{d^2R}{dt^2}\right)_0$

where the subscript 0 means ‘evaluated at the time $t_0$’. The first characterizes the rate of expansion of the universe (see the discussion of the Milne universe in Section 4.3), and the second the rate at which the expansion of the universe is slowing down (Fig. 7.9(b)). According to the Einstein field equations with vanishing cosmological constant Λ, $q_0$ is directly proportional to the amount of matter in the universe; if $q_0 > \frac{1}{2}$ we are in a high-density (k = +1) universe which will recollapse, whereas if $q_0 < \frac{1}{2}$, we are in a low density (k = −1) universe which will expand forever.
The critical case $q_0 = \frac{1}{2}$ (the Einstein-de Sitter universe) has flat spatial sections (k = 0) and a simple form for R(t): in this case,

$R(t) \propto (t - t_i)^{2/3}$ (7.6)

where $t_i$ is a constant (this would be the time at which the expansion began, if this expansion law held all the way back to the initial singularity; however, as we have seen, that is not the case). Our present observations of the density of matter in the universe suggest it is too low to cause a recollapse; the highest densities suggested by direct observations correspond to $q_0 \approx 0.1$. Although observed densities are less than those predicted in the critical-density case, in order to understand the broad nature of the evolution of the universe it is common practice to use (7.5) at early times and (7.6) at late times, matching the expressions for R to $R_c$ at some critical time $t_c$ at which a transition took place from a radiation-dominated to a matter-dominated universe. Similarly the expressions for dR/dt must be matched at $t_c$.

The Hubble constant gives an estimate of the age of the universe. In the case $q_0 = 0$, we are in an empty universe with age $t_0 = 1/H_0$; this is just the Milne universe discussed in Section 4.3 (an empty universe with linear expansion*). If $q_0 = \frac{1}{2}$ (the critical case), then $t_0 = \frac{2}{3}(1/H_0)$. Present estimates of $H_0$ imply that $1/H_0$ is about $15 \times 10^9$ years. Combining this with present estimates of the ages of stars in globular clusters (between 14 and $18 \times 10^9$ years) suggests that in a high-density universe, the deduced ages of stars may be uncomfortably large compared with the age of the universe. However, there is still considerable uncertainty in the value of the Hubble constant, so arguments based on ages need to be treated with caution. Additionally it now seems possible that Λ > 0, which will imply larger ages for a given Hubble constant, solving the age problem, as is discussed later.
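The age estimates above are easy to reproduce. The sketch below is only illustrative: the value $H_0$ = 65 km/s/Mpc is our own assumption, chosen because it gives $1/H_0$ close to the $15 \times 10^9$ years quoted in the text, and the function names and unit-conversion constants are likewise ours rather than anything defined in this book.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.15576e7  # seconds in one Julian year


def hubble_time_years(H0_km_s_Mpc):
    """Return 1/H0 in years for H0 given in km/s/Mpc."""
    H0_per_second = H0_km_s_Mpc / KM_PER_MPC
    return 1.0 / H0_per_second / SECONDS_PER_YEAR


def age_years(H0_km_s_Mpc, q0):
    """Age t0 for the two cases quoted in the text (Lambda = 0):
    q0 = 0   (empty Milne universe):        t0 = 1/H0
    q0 = 1/2 (critical Einstein-de Sitter): t0 = (2/3)(1/H0)."""
    if q0 == 0.0:
        factor = 1.0
    elif q0 == 0.5:
        factor = 2.0 / 3.0
    else:
        raise ValueError("only q0 = 0 and q0 = 1/2 are handled in this sketch")
    return factor * hubble_time_years(H0_km_s_Mpc)


if __name__ == "__main__":
    H0 = 65.0  # km/s/Mpc, an assumed illustrative value
    print(f"1/H0          = {hubble_time_years(H0):.3e} years")
    print(f"t0 (q0 = 0)   = {age_years(H0, 0.0):.3e} years")
    print(f"t0 (q0 = 1/2) = {age_years(H0, 0.5):.3e} years")
```

With this assumed $H_0$, the critical-density age comes out near $10 \times 10^9$ years, below the quoted range of globular-cluster ages; this is the tension referred to above.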
* More precisely, the four-dimensional Milne universe has metric form (7.1) with k = −1, R(t) = t, and $q_0 = 0$. It is a flat space-time but with negatively curved space sections.

Appendix: Physics of the early universe

The matter created initially in the very early universe will be a very hot gas of elementary particles (protons, electrons, positrons, neutrinos, etc.) and photons (i.e. radiation) in equilibrium with each other. Because of the pair-production process discussed in Section 3.7, photons will collide to produce particle-antiparticle pairs; conversely, particles will collide with antiparticles to form photons. The temperature of this gas will drop as it expands. Let us consider this in a little more detail.

One might expect that as the universe expands, the wavelength of any radiation present, just like all other length scales, will vary as R(t). An examination of Maxwell’s equations for electromagnetic radiation shows that that is indeed precisely what happens. Now, the wavelength λ and frequency ν of the radiation are related by the fundamental relation c = νλ, so ν varies as 1/R(t). Further, the energy of the light is related to ν by E = hν (h a constant). Thus the energy E scales as 1/R(t). This suggests that the temperature T of black-body radiation in equilibrium in an expanding universe will vary as 1/R(t), because T measures the average energy of the radiation. Indeed this is so; further consideration of the thermodynamics of such radiation confirms the conclusion that, as the universe expands, the radiation cools according to the law T ∝ 1/R (see e.g. Cosmology, by E. Harrison, Cambridge University Press, 1981).

As the universe cools, the various reactions that are possible at very high temperatures drop below their threshold temperatures and cease to occur, thereby causing the various equilibria to be broken.
Further, the disruptive effect of the photons decreases as the temperature decreases, so more and more complex particles and structures can come into existence. In particular, the light elements (helium, deuterium, lithium, tritium) are created by nucleosynthesis at temperatures of about $10^9$ K. Stars and galaxies form at much later times when the radiation has cooled down to about 300 K, and second-generation stars and the solar system form even later.

We will not discuss the synthesis of the elements here, but refer the reader to other books (e.g. Modern Cosmology, by Dennis Sciama, Cambridge University Press, 1976, The First Three Minutes, by Stephen Weinberg, Basic Books, 1977 or Chapter 5 of A Short History) for details. However, two points are particularly important and must be mentioned.

Firstly, one can compare observations of element abundances in the universe with the predictions of these models. Excellent agreement is attained on the understanding that the light elements (hydrogen, deuterium, helium) are created in the hot early universe but the heavier elements (e.g. carbon, nitrogen, oxygen, iron) are created in subsequent processes in stars, some of which then spread these elements through space in supernova explosions. Indeed this is why our Sun must be a second-generation star: the planets in the solar system are formed of heavy elements which must have been created in the interior of an earlier-generation star.

Secondly, at early times the radiation in the universe is in close equilibrium with the hot matter in it, which is ionized, that is, the atoms are split into their constituents, the nuclei and the electrons, and these move independently in the gas instead of being bound together as atoms. The radiation will then be black-body radiation at a temperature appropriate to the stage of evolution of the universe. The free electrons interact strongly with all electromagnetic radiation.
This means that the universe is then opaque to light, radio waves, X-rays, etc.; as in the interior of the Sun, a photon (that is, a particle of light) can proceed only a very small distance before colliding with an electron and being scattered from it. However, at later times (when the temperature of the universe drops to about 3000 K) the electrons and nuclei recombine to form atoms. The free electrons are now closely bound to the nuclei, and so they no longer scatter light as they did at earlier times, and the universe becomes transparent, with radiation mostly moving freely between the atoms without interacting with them; thus the time of recombination is also the time of decoupling of matter and radiation. The radiation that was in equilibrium with the matter at early times thereafter remains black-body radiation, with its temperature falling steadily as the universe expands. As mentioned above, the solar system is bathed in the very dilute remnants of this black-body radiation at the present time. For more details, see e.g. the books by Weinberg, Sciama, or Harrison mentioned above or Chapter 3 of A Short History.

Exercise 7. Determine the relation between the Hubble constant and the age of the universe (a) in the case of a matter-dominated universe (i.e. (7.6) holds), and (b) for a radiation-dominated universe (i.e. (7.5) holds).

7.3 Observable quantities

The major observable features of distant objects in the universe are their redshifts, apparent angles, and apparent luminosities. These have been used to estimate the distances of stars, galaxies, and quasi-stellar objects; thus they are the way we establish the size of the universe. The somewhat detailed argument in this section is not needed to understand the causal arguments that follow in Section 7.5.

Redshift

It is easy to work out from the fundamental form (7.1) the paths of radial light rays.
On them, dθ = 0 = dφ (as they are radial geodesics on which θ and φ are constant) and $ds^2 = 0$ (expressing the fact that they are light rays). Then we see from (7.1) that on these curves, dr = dt/R(t) (taking both dr and dt positive for a future-outgoing geodesic). Thus, if light is emitted by a galaxy $O_1$ at r = 0 and time $t = t_e$, and received by an observer $O_2$ at r = u and time $t = t_0$ (Fig. 7.10), we find

$u = \int_{t_e}^{t_0} \frac{dt}{R(t)}$ (7.7)

where the integral is taken from the time $t_e$ of emission of the light to the time $t_0$ of its observation. Similarly, a light ray emitted by $O_1$ a short time later at $t_e + Dt_e$ and received by $O_2$ at $t_0 + Dt_0$ will obey relation (7.7) but now with the integral taken from the time $t_e + Dt_e$ to the time $t_0 + Dt_0$.

Fig. 7.10 Radial light rays are emitted at $t_e$ and $t_e + Dt_e$ by a source at r = 0, and received at $t_0$ and $t_0 + Dt_0$ by an observer at r = u.

Now, the crucial feature is that u is constant (because the fundamental observers are at constant values of the coordinate r) so the right-hand side of (7.7) has the same value on both light rays. We can therefore equate the two integrals. If we now approximate these expressions, allowing for the fact that $Dt_0$ and $Dt_e$ are small and so R(t) is very nearly constant for the relevant interval, we find that $Dt_0/R(t_0) = Dt_e/R(t_e)$. Hence, the ratio of time intervals observed is given by

$K = Dt_0/Dt_e = R(t_0)/R(t_e) = 1 + z$ (7.8)

(the last relation following from (3.3)). We have worked the result out for one galaxy at the origin of coordinates, but the result applies to any galaxy pair because of the homogeneity of the universe (the emitter can always be chosen as the origin of the coordinates). This expression shows how observed redshifts directly measure the expansion that has taken place in the universe; so by (7.8), redshifts directly measure the ratio of the scale factor at the time of observation and the time of emission.
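The argument leading to (7.8) can be checked by following two light rays explicitly. The sketch below assumes, purely for illustration, the matter-dominated law (7.6) with $t_i = 0$ and unit constant of proportionality, so that $R(t) = t^{2/3}$ and the integral (7.7) has the closed form $u = 3(t_0^{1/3} - t_e^{1/3})$. The function names and the chosen times are our own.

```python
def R(t):
    """Scale function, assumed here to be R(t) = t**(2/3) (eqn (7.6), t_i = 0)."""
    return t ** (2.0 / 3.0)


def comoving_distance(t_e, t_0):
    """u from eqn (7.7) for R(t) = t**(2/3): the integral of dt/R(t) from t_e to t_0."""
    return 3.0 * (t_0 ** (1.0 / 3.0) - t_e ** (1.0 / 3.0))


def arrival_time(t_e, u):
    """Invert eqn (7.7): the time at which a ray emitted at t_e reaches r = u."""
    return (u / 3.0 + t_e ** (1.0 / 3.0)) ** 3


def K_from_rays(t_e, t_0, Dt_e=1.0e-6):
    """Ratio Dt_0/Dt_e found by sending a second ray a short time Dt_e later."""
    u = comoving_distance(t_e, t_0)          # fixed coordinate separation
    Dt_0 = arrival_time(t_e + Dt_e, u) - arrival_time(t_e, u)
    return Dt_0 / Dt_e


if __name__ == "__main__":
    t_e, t_0 = 1.0, 8.0                      # arbitrary illustrative times
    print("K from two light rays :", K_from_rays(t_e, t_0))
    print("K from R(t_0)/R(t_e)  :", R(t_0) / R(t_e))
```

For these times both expressions give K close to 4, i.e. z close to 3, agreeing to within the small error introduced by the finite interval $Dt_e$.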
Note that in this case, the effect is entirely reciprocal; $O_1$ would observe exactly the same redshift as $O_2$ for light emitted at $t_e$ and received at $t_0$. However, the value of $K$ will not stay constant for any particular pair of galaxies: rather, its variation with time will reflect directly the dynamic expansion or contraction of the universe. Thus, $K$ is a function of $t_e$ (or of $t_0$). The Milne universe described in Section 4.3 is an exact model of this situation.

The factor $K$ is directly observed through measuring the redshifts in spectra of distant galaxies (see e.g. The Realm of the Nebulae, by E. Hubble, Yale University Press, 1936, reprinted 1982; 'The redshift' by A. Sandage, Scientific American, September 1956; and Fig. 3.4). At the time of writing, redshifts up to $z = 5.34$ for galaxies have been measured, detected by light emitted when the universe was about one-seventh of its present age. In the case of quasi-stellar objects, redshifts of up to 5.0 have been measured, again corresponding to seeing these objects a very long time ago (about $7 \times 10^9$ years) when they were 6.0 times closer than at present. In the case of the cosmic microwave background radiation, because the radiation temperature varies as $1/R(t)$ and its present temperature is 3 K, the temperature of this radiation at the time corresponding to a redshift of $z$ will be $T = 3(1 + z)$ K. Thus the radiation we have detected, emitted by hot dense matter in the early universe at a temperature of about 3000 K (when the universe became transparent), was emitted at a redshift of about 1000 (Fig. 7.8). Because $R(t) \to 0$ at the beginning of the universe (when $t \to 0$), the redshift of radiation received from earlier and earlier times, if it could penetrate the intervening matter, would diverge to infinity.
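The temperature-redshift relation for the background radiation is simple enough to evaluate directly. The following minimal sketch uses the round present-day value of 3 K quoted in the text; the function names are hypothetical and chosen only for illustration.

```python
T_NOW_K = 3.0  # present radiation temperature quoted in the text (round value)


def radiation_temperature(z, t_now=T_NOW_K):
    """Temperature of the background radiation at redshift z: T = T_now (1 + z)."""
    return t_now * (1.0 + z)


def redshift_for_temperature(temperature, t_now=T_NOW_K):
    """Redshift at which the radiation had the given temperature."""
    return temperature / t_now - 1.0


if __name__ == "__main__":
    # Decoupling at about 3000 K corresponds to a redshift of about 1000.
    print(redshift_for_temperature(3000.0))
```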
Because of the opaqueness of intervening matter, we cannot in fact receive electromagnetic radiation from extremely early times, but we may one day be able to detect neutrinos emitted at a redshift of about $10^9$. If we were able to detect extremely weak gravitational waves, we could in principle observe to much earlier times.

Exercises

7.4 Suppose the light rays emitted at an interval $\Delta t_e$ by $O_1$ were reflected from $O_2$ and received again by $O_1$. What would be the interval $\Delta t'$ measured by $O_1$ between their reception? Contrast your result with that for the Schwarzschild solution (Section 6.1).

7.5 Use eqn (7.8) to confirm the result that $\lambda$ scales as $R(t)$. [Consider the relation between the period and the wavelength of the light.]

To compare models of the universe with astronomical observations of distant objects, one must measure some other characteristic of the objects observed besides their redshift in order to obtain an observable relation that can be compared with theory. There is no direct way to determine the time of emission of the radiation, which is just the look-back time to when the light was emitted. The distance of the object can be estimated either from its apparent size or from its apparent luminosity; and these quantities can also be predicted from the metric form (7.1) if we know the source's intrinsic properties. We will look at them in turn.

Apparent angles

Just as we examined the apparent size of an object in the Minkowski universe (Section 4.3), so again we can consider an object of length $D$ at radial coordinate $r = u$ which is seen by the observer at $r = 0$ to have an apparent angular size $\alpha$ (Fig. 4.28a). In that case we obtained (4.34) from the flat-space metric (4.32c); using the same methods, in the present case we obtain

$$\alpha = D/r_0, \qquad (7.9a)$$

where the 'area distance' $r_0$ is defined by

$$r_0 = R(t_e) f(u) = R(t_0) f(u)/(1 + z); \qquad (7.9b)$$

here $u$ is given by (7.7), $t_e$ is the time of emission of the light, and $t_0$ the time of its observation.
This equation enables us to predict the angular size of any object of known size, given its distance, or conversely to estimate this distance from measurements of its angular size.

Fig. 7.11 The relation between 'area distance' $r_0$, which determines apparent angles through equation (7.9a), and redshift $z$ in a matter-dominated universe with flat spatial sections ($k = 0$). There is a maximum of the area distance at $z = 1.25$; correspondingly there will be a minimum in apparent angular diameters at this redshift.

In the case of a flat ($k = 0$) matter-dominated universe with $\Lambda = 0$, (7.6) holds. This implies that $R(t) = A(t - t_1)^{2/3}$ and $H_0 = (2/3)(t_0 - t_1)^{-1}$, where $t_1$ is a constant. Also, since $f(u) = u$ when $k = 0$, $f(u) = (3/A)[(t_0 - t_1)^{1/3} - (t_e - t_1)^{1/3}]$ from (7.6). From (7.8) it follows that in this case $(t_0 - t_1) = (t_e - t_1)(1 + z)^{3/2}$, and then, substituting for $R(t)$ and $f(u)$ in (7.9b), we find that

$$r_0 = (2/H_0)(1 + z)^{-2}\{1 + z - (1 + z)^{1/2}\}. \qquad (7.10)$$

This relation is plotted in Fig. 7.11. The striking feature here is that this quantity has a maximum at $z = 5/4$, and thereafter decreases. Consider observing a series of sources with sharply defined features that can be used to define angular diameters (e.g. barred galaxies), and suppose that they are all of the same intrinsic size $D$. Then, by (7.9a), in such a universe the apparent angular diameter of this set of uniform objects will reach a minimum at redshift $z = 5/4$ and thereafter increase (Fig. 5.26). This is precisely the situation indicated in Fig. 5.21, but it is true for observations made in all directions, and at all times (since this behaviour is independent of the value of $t_0$ or $H_0$). Thus, in these universes we have the situation shown in Fig. 5.25b, where the entire past light cone of each observer refocuses at $z = 5/4$.
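The turnover in the area distance can be seen by evaluating (7.10), written as $r_0 = (2/H_0)(1+z)^{-2}\{1 + z - (1+z)^{1/2}\}$, over a range of redshifts. The sketch below is a rough numerical illustration, not a derivation: it works in units where $H_0 = 1$ by default, locates the maximum by a simple grid scan, and uses hypothetical function names.

```python
import math


def area_distance(z, hubble=1.0):
    """Area distance r_0 of eqn (7.10) for the flat matter-dominated model."""
    return (2.0 / hubble) * (1.0 + z - math.sqrt(1.0 + z)) / (1.0 + z) ** 2


def angular_size(length, z, hubble=1.0):
    """Apparent angle alpha = D / r_0 of eqn (7.9a), in radians for small angles."""
    return length / area_distance(z, hubble)


def redshift_of_maximum(z_max=10.0, steps=100000):
    """Grid scan for the redshift at which the area distance is largest."""
    best_z, best_r = 0.0, 0.0
    for i in range(1, steps + 1):
        z = z_max * i / steps
        r = area_distance(z)
        if r > best_r:
            best_z, best_r = z, r
    return best_z


if __name__ == "__main__":
    z_peak = redshift_of_maximum()
    print(z_peak, area_distance(z_peak))
```

The scan should return a redshift close to 1.25, where the area distance is $8/(27 H_0) \approx 0.296/H_0$; objects of fixed size $D$ subtend their smallest angle there and appear larger again at higher redshift.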
Examination of the equations involved shows that similar refocusing is expected to occur in all expanding-universe models with metric form (7.1) that contain normal matter (more precisely, matter with a positive energy density). Unfortunately, this predicted behaviour is difficult to verify observationally, because there is a great variation in the intrinsic size of galaxies and radio sources, and because most galaxies do not have sharply defined outer edges (they fade away into the night sky).

Exercise 7.6 Derive eqns (7.9) from (7.1). Derive eqn (7.10), and verify that $r_0$ has a maximum at $z = 5/4$.

Observed luminosities

Again, we can essentially follow the calculation previously given for the case of flat space-time in Section 4.3. Consider a source of luminosity $L$, that is, a source emitting radiation at rate $L$ in all directions. We choose coordinates centred on the source, i.e. the source is at $r = 0$. When we place a detector to receive this radiation, that detector (say of area $A$) intersects a particular bundle of light rays out of all such rays emanating from the source (Fig. 7.12a). If this bundle of rays is characterized by angular displacements $(d\theta, d\phi)$, the fraction of light $L$ emitted in these directions in unit time by the source is

$$P = \frac{L}{4\pi} \sin\theta \, d\theta \, d\phi.$$

If we assume there is no absorbing medium in the way, all these photons reach the receiver. Now three effects occur, which determine the radiation intensity detected at the receiver. Firstly, at the detector (where $t = t_0$ and $r = u$), this radiation is spread over an area $A$ (Fig. 7.12). One can easily relate this area to the angles $d\theta$ and $d\phi$ at the source because the light rays are radial, i.e. $\theta$ and $\phi$ are constant on them. Thus, the bundle of light rays will still be characterized by $\theta$, $d\theta$, and $d\phi$ at the detector.
Hence, from the metric form (7.1), the area $A$ is given by the expression $A = R^2(t_0) f^2(u) \sin\theta \, d\theta \, d\phi$. Because the photons are conserved, all the photons emitted into the bundle of light rays will be received at the detector. Thus, the rate at which photons are received by the detector per unit area will be proportional to $P$ and inversely proportional to the area $A$; taking the ratio, the rate of reception of photons per unit area is inversely proportional to $R^2(t_0) f^2(u)$. Secondly, the energy per photon is proportional to its frequency $\nu$ which, by (7.8), is reduced by the factor $1 + z$ between emission and reception.

Fig. 7.12 (a) A bundle of light rays emitted by a source and received by a detector of area $A$. (b) The relation between the area $A$ of the bundle of light rays at the detector and the solid angle at the source. For radial light rays with solid angle $\sin\theta \, d\theta \, d\phi$ at the source, the width of the beam in the $\theta$-direction at the detector will be $dl_1 = R(t_0) f(r) \, d\theta$; similarly the width in the $\phi$-direction will be $dl_2 = R(t_0) f(r) \sin\theta \, d\phi$. The area at the detector is then $A = dl_1 \, dl_2$.

Finally, because photons are conserved, the rate at which they are received would be the same as that at which they are emitted, were it not for the Doppler shift factor $K = 1 + z$. Because this factor relates all time intervals measured by the source and by the observer, it relates in particular the time interval measured by the source and the observer for transmission and reception of any particular set of photons. Thus, the ratio of the rate at which they are received to the rate at which they are emitted is inversely proportional to $1 + z$ (cf. eqn (7.8)). When all these factors are put together, the radiation flux (the radiation received per unit area per unit time) measured from the source is given by

$$F = \frac{L}{4\pi r_0^2 (1 + z)^4} \qquad (7.11)$$

where $r_0$ is defined by (7.9b).
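As a consistency check on how the three effects combine, the sketch below computes the flux in two ways: directly from (7.11) in the form $F = L/(4\pi r_0^2 (1+z)^4)$, and by applying the separate factors (area dilution over $4\pi R^2(t_0) f^2(u)$, photon energy loss, and reduced arrival rate) one at a time. The function names and the sample numbers are hypothetical; only the formulae come from the text.

```python
import math


def area_distance(scale_now, f_u, z):
    """Eqn (7.9b): r_0 = R(t_0) f(u) / (1 + z)."""
    return scale_now * f_u / (1.0 + z)


def flux_from_area_distance(luminosity, r0, z):
    """Eqn (7.11): F = L / (4 pi r_0^2 (1 + z)^4)."""
    return luminosity / (4.0 * math.pi * r0 ** 2 * (1.0 + z) ** 4)


def flux_from_factors(luminosity, scale_now, f_u, z):
    """Build the flux by applying the three effects one after another."""
    # 1. The radiation is spread over a sphere of area 4 pi R^2(t_0) f^2(u).
    flux = luminosity / (4.0 * math.pi * scale_now ** 2 * f_u ** 2)
    # 2. Each photon's energy is reduced by the redshift factor 1 + z.
    flux /= (1.0 + z)
    # 3. Photons arrive at a rate reduced by the same factor 1 + z.
    flux /= (1.0 + z)
    return flux


if __name__ == "__main__":
    L, R0, f_u, z = 1.0, 2.0, 3.0, 1.5  # arbitrary illustrative values
    r0 = area_distance(R0, f_u, z)
    print(flux_from_area_distance(L, r0, z), flux_from_factors(L, R0, f_u, z))
```

The two printed values should agree: two powers of $1 + z$ come from the photon energy and arrival rate, and the other two arise from expressing $R(t_0) f(u)$ in terms of $r_0$.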
Equations (7.9) and (7.7) enable us to calculate the flux of radiation (or 'apparent luminosity') of any source of known intrinsic luminosity $L$ at a redshift $z$, once $R(t)$ is known (from the Einstein field equations). That is, they enable us to construct a theoretical redshift-luminosity relation for each universe model. The nature of these curves is shown in Fig. 7.13, where, as customary, the observed source flux has been re-expressed in terms of its magnitude $m$ defined by $m = -2.5 \log_{10} F + K_1$, where $K_1$ is a constant and $F$ is given by (7.11).

Fig. 7.13 Magnitude-redshift curves: $\log(cz)$, where $z$ is the redshift, is plotted against $m$, the apparent magnitude of the source, which is effectively the logarithm of the received flux of radiation $F$.

In principle, these theoretical relations can be compared with astronomical observations of distant galaxies to determine whether the relations for $k = +1$, $k = 0$ or $k = -1$ give a better fit to observations, and to determine the curvature of space-time by measuring $R(t_0)$. Unfortunately, there are many observational difficulties and many problems in interpreting the observations. In particular, it is difficult to estimate what the intrinsic luminosity $L$ of the source was at the time of emission of the light observed, sometimes thousands of millions of years ago, when the source luminosity may have been different from that of similar sources at the present time. These difficulties have so far prevented us from satisfactorily determining even the sign of $k$ by this method. When we add evidence from the age of the universe and the abundance of the light elements, insofar as we can come to a conclusion from the weight of evidence, the conclusion has until recently been that we live in a low-density ($k = -1$) universe that will expand forever (see e.g. 'Will the universe expand forever?' by J. R. Gott, J. E. Gunn, D. N. Schramm, and B. M. Tinsley, Scientific American, March 1976).
However, for theoretical reasons related to the 'inflationary universe' idea (see Section 7.6), many astronomers believe that the universe actually has almost-flat spatial sections, very like the Einstein-de Sitter ($k = 0$) model, containing a large amount of 'dark matter' which we have not yet detected. In fact 90% of the mass of the universe could consist of hidden matter (e.g. black holes, neutrinos, or exotic particles) and there is probably also a non-zero cosmological constant $\Lambda$ (see below).

Apparent brightness

Just as (4.34) and (4.35) led to the brightness relation (4.36b) in the case of flat space-time, so now (7.9a) and (7.11) again lead to the same brightness relation (4.36b):

$$I = I_0/(1 + z)^4. \qquad (7.12)$$

That is, in a curved-space-time expanding-universe model, the apparent surface brightness $I$ of a distant object depends only on the intrinsic surface brightness $I_0$ of the object at the time the radiation was emitted, and its observed redshift.

Again we are led to Olbers' paradox (Exercise 4.19): every ray of light eventually either runs into the surface of a star, or into the hot matter in the early universe; why do we not observe the night sky to be as bright as the surface of a star? The resolution to this paradox lies in the form of eqn (7.12), together with the idea of the expanding universe. The first major feature apparent is the factor $1/(1 + z)^4$, showing that the surface brightness of light from distant galaxies or stars will be greatly dimmed as we look back to earlier and earlier times, and the redshift increases. A second factor is that because the expanding universe has a beginning a finite time ago, galaxies and stars did not exist at very early times, or when observed had not yet had enough time to begin burning brightly; so $I_0$ was very low at early enough times.
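The size of the dimming factor in (7.12) is easy to underestimate, so a short numerical sketch may help. It evaluates $I = I_0/(1+z)^4$ and, as an additional assumption not taken from the text, uses the standard black-body result that total surface brightness scales as the fourth power of temperature to convert the dimming into an equivalent temperature. The function names are hypothetical.

```python
def observed_brightness(emitted_brightness, z):
    """Eqn (7.12): I = I_0 / (1 + z)^4."""
    return emitted_brightness / (1.0 + z) ** 4


def equivalent_temperature(emitted_temperature, z):
    """Black-body temperature matching the dimmed brightness.

    Assumes total surface brightness is proportional to T^4, so that
    T_observed = T_emitted * (I / I_0)^(1/4).
    """
    dimming = observed_brightness(1.0, z)
    return emitted_temperature * dimming ** 0.25


if __name__ == "__main__":
    z = 999.0  # roughly the redshift of decoupling used earlier in the section
    print(observed_brightness(1.0, z))       # fraction of brightness that survives
    print(equivalent_temperature(3000.0, z))  # temperature in kelvin
```

At a redshift of about 1000 only about one part in $10^{12}$ of the emitted surface brightness survives, which under the stated assumption corresponds to a drop in temperature by the factor $1 + z$.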
Some light might miss any intervening matter and enable us to see directly the hot primeval matter in the early universe, emitted at a surface brightness high enough to burn us to death. Thus, the modern form of the paradox is that it at first seems the entire sky should be at least at the temperature of matter at the time of decoupling of matter and radiation (about 3000 K). However (see eqn (7.12)), redshifting reduces this to the harmless 3 K radiation that we can only detect with very sensitive receivers. The argument here has referred to the sky at night; however, the same argument applies to the sky not obscured by the Sun during the day. Thus the reason the sky in the photograph on the cover is not everywhere nearly as bright as the surface of the Sun is the expansion of the universe from its beginning a finite time ago.

Exercises

7.7 Check the derivation of eqn (7.11), and derive eqn (7.12).

7.8 Number counts: Suppose there is a density $n$ of objects at a coordinate distance $r$ from the observer in a FLRW universe model. Determine how many such objects one would expect to see between distances $r$ and $r + dr$ within the range of angles $d\theta$ and $d\phi$ about a direction $(\theta, \phi)$. [Hint: (i) Find the proper distance $dl$ corresponding to $dr$ at the distance $r$ from the observer. (ii) Find the area $dA$ defined by the quantities $d\theta$ and $d\phi$ at this distance. (iii) Hence find the proper volume $dV$ corresponding to $dr$, $d\theta$, and $d\phi$, and so find the number $dN$ of such objects in the volume $dV$ from the formula $dN = n \, dV$.]

7.4 New observational data

The past decade has seen a great proliferation of data. This is due firstly to new telescopes that have come into operation, particularly the Keck Telescope in Hawaii, and many others in space (in satellites circling the Earth: IRAS, ROSAT, the Hubble Space Telescope, and COBE, for example) (Fig. 7.14).
Secondly, it is due to improvement in detector technology, particularly development of CCDs (Charge Coupled Devices) enabling extremely sensitive and efficient digital image recording, and introduction of optical fibres enabling a great increase in the number of redshifts that can be measured in a single observing run. Consequently our understanding of the physical universe has developed in a remarkable way, still keeping the same basic picture as before, but with many of the details filled in and giving solidity to the previously rather schematic representations we had. New evidence on the nature of the universe will probably lead to determination of the main cosmological parameters in the next decade.

Large-scale structure

First, because we have better distance indicators we have been able to identify large-scale 'walls' made of galaxies, surrounding much emptier voids, the whole having something like a bubble structure. These have not been seen in the past because we see images of all these objects projected against one another; to separate them out we need careful distance estimates apart from redshift (see, for example, A Short History for details of such measures).
