Performance of Standard Fourier-Transform Spectrometers Copyright © 2007 by Douglas Cohen
PRINTED IN THE UNITED STATES OF AMERICA
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior permission of the author.


1 · ETHER WIND, SPECTRAL LINES, AND MICHELSON INTERFEROMETERS

1.1 THE FIRST MICHELSON INTERFEROMETER

The Michelson interferometer is named after Albert Abraham Michelson, who designed and built

it in 1881 to detect the ether wind caused by the Earth’s orbital motion. Michelson’s attempt

failed; his interferometer, sensitive enough to detect stamping feet 100 meters away,1 could not

detect the Earth’s orbital motion. So important and difficult to explain was this result that

Michelson and Edward Morley repeated the experiment with a larger and more sensitive

interferometer in 1887. This second attempt, which is today called the Michelson-Morley

experiment, also yielded a negative result: The Earth’s motion could not be detected. The

Michelson-Morley experiment is one of the most important negative findings of 19th-century

science; it encouraged physics to discard the idea of a luminiferous ether and prepared the way

for Einstein’s relativity theories at the beginning of the 20th century.

The idea of a luminiferous ether—a plenum pervading both (transparent) matter and empty

space—had been widely accepted ever since Young and Fresnel established around 1820 that

light behaved like a transverse vibration or wavefield as it propagated past obstacles. There were

recognized difficulties with the concept; for example, the ether provided no detectable resistance

to the motion of material bodies yet was elastic enough to transmit light vibrations without

measurable energy loss. In the 1820s and ’30s, Poisson, Cauchy, and Green, famous

mathematical scientists, derived equations of motion for transverse waves in an elastic medium,

but when these equations were applied to the already known behavior of light, the results were at

best mixed.2 In 1867 James Clerk Maxwell modified the formulas describing the interdependent

behavior of electric and magnetic fields to make them a self-consistent set of equations; he

believed himself to be constructing a mechanical analogy for the ether. After showing that the

new set of equations predicted transverse electromagnetic waves traveling at the speed of light,

Maxwell not only asserted that light was a propagating electromagnetic disturbance, but he also

used his discovery to connect electric and magnetic properties to the behavior of the luminiferous

ether. It was not until 1888 that Hertz demonstrated experimentally that propagating

electromagnetic disturbances actually exist; and the optical community itself did not

acknowledge until 1896, with the discovery of the Lorentz-Zeeman effect, that light had to be

1

A. Michelson, “The Relative Motion of the Earth and the Luminiferous Ether,” American Journal of Science 22, Series 3 (1881), pp. 120–129.

2

E. Whittaker, A History of the Theories of Aether and Electricity, Vol. I, The Classical Theories (Thomas Nelson &

Sons, Ltd., New York, 1951), pp. 129–142.


such a propagating electromagnetic wavefield.3 So the ether concept was not only alive and well

at the time of Michelson’s experiments, but it could also be said, with the growing acceptance of

Maxwell’s equations to describe the behavior of the luminiferous ether, that it had never been

healthier.

Figure 1.1(a) is a drawing of the instrument Michelson described in his 1881 paper, and Fig.

1.1(b) shows how the interferometer works. Incident light enters from the left, as shown by the

dark solid arrow, and hits a glass plate whose back is a partly reflecting, partly transmitting

surface. Ideally, half the incident light is transmitted through to mirror C and half is reflected up

to mirror D. Mirrors C and D then return the light to the beam splitter, as shown by the dashed

arrows. At the beam splitter, the light is again half transmitted and half reflected to send two

equal-intensity beams into the observer’s telescope. The light that is first transmitted and then

reflected at the beam splitter is called beam TR, and the light that is first reflected and then

transmitted at the beam splitter is called beam RT. These beams are drawn as two side-by-side

dotted arrows, but in reality they should be thought of as lying one on top of the other, filling the

same volume of space as they travel from the beam splitter to the telescope.

Michelson, thinking then in terms of 19th-century optical theory, would have regarded light as

transverse and elastic vibrations in the ether. The ether’s plane of vibration might be horizontal,

as shown in Fig. 1.2(a), or vertical, as shown in Fig. 1.2(b). It was assumed, in fact, that the ether

could undergo transverse vibrations in any plane at all—horizontal, vertical, or something in

between, as shown in Fig. 1.2(c)—although not all at the same time. At any given point in the

light beam, there could be only one plane of vibration, with different colors of light characterized

by different wavelengths of vibration. If a “snapshot” of a light beam could be taken, the plane of

vibration could well be changing along its length, as shown in Fig. 1.3(a). At some slightly later

time, the snapshot would show the same configuration advanced in the direction of propagation,

as shown in Fig. 1.3(b). White light, then as now, was taken to be a composite beam consisting of

many different wavelengths simultaneously traveling in the same direction. Different colors of

light correspond to disturbances of different wavelengths. Combining or adding together many

different-colored disturbances produces a total transverse vibration having no particular or unique

wavelength and with the plane of vibration free to change in an irregular fashion along the length

of the beam, as shown in Fig. 1.3(c). The situation depicted in Figs. 1.3(a)–1.3(c) is actually very

close to the physical models used today to explain the behavior of light; all we need to do is

accept Maxwell’s equations—but not Maxwell’s ether—and say that the sinusoidal curves in

3

D. Goldstein, Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003), p. 298.


FIGURE 1.1(a). The first Michelson interferometer.

Figs. 1.3(a)–1.3(c) describe the changing length and orientations of the tip of the wavefield’s

oscillating electric or magnetic field vectors.4

Suppose length D in Fig. 1.1(b) is adjusted until the distance from mirror C to the beam splitter

is exactly the same as the distance from mirror D to the beam splitter. When monochromatic

light—that is, light having a unique wavelength—enters the interferometer as shown in Figs.

1.4(a) and 1.4(b), then the beams reflected from C and D recombine when leaving the

interferometer in such a way that their planes of vibration, as well as their state of oscillation,

exactly match. Since the planes of vibration match, we can disregard the planes’ orientation and

just add together the two beams’ sinusoidal curves. Figure 1.5(a) shows that if the RT and TR

beams line up exactly—as they must when the distances from mirrors C and D to the beam

splitter are equal—then the summed oscillation is a maximum because the two wavefields are in

phase. If the distances from mirrors C and D to the beam splitter are unequal, then beams RT and

TR shift with respect to each other, as shown in Figs. 1.5(b)–1.5(e). The two beams can be out of phase by any fraction of a wavelength, depending on the amount of inequality in the two distances.

4

See, for example, the discussion in Secs. 4.2 through 4.4 of Chapter 4. Figures 1.2(a) and 1.2(b) can be profitably

compared to Figs. 4.5 and 4.6 in Chapter 4.


FIGURE 1.1(b). Layout of the interferometer: incident light, the beam splitter with its partially reflective surface, the compensator plate, mirrors C and D, beam RT (first reflected then transmitted at the beam splitter), beam TR (first transmitted then reflected at the beam splitter), and the observing telescope.


FIGURE 1.2(a). Vibrations of a transverse wavefield, cut in a plane perpendicular to the direction of propagation.

FIGURE 1.2(b). Vibrations of the same transverse wavefield, cut in a plane containing the direction of propagation.


FIGURE 1.3(a). A transverse wavefield with three different planes of vibration along its length; the vibration wavelength is indicated.

FIGURE 1.3(b). The same wavefield at a slightly later time.


The closer this fraction is to one-half, the smaller the summed oscillation; and if they are out of

phase by exactly a half-wavelength, then their sum is zero and the combined beam disappears.

When one beam is shifted against the other by exactly one wavelength, and the planes of

vibration still match, then once again the monochromatic RT and TR beams are in phase and

producing a bright combined oscillation.5 There seems to be a real possibility that a

monochromatic beam cannot be used to confirm that mirrors C and D are the same distance from

the beam splitter because the recombined exit beam may look the same as it does when no shift at

all exists if one wavefield is shifted against the other by one, two, etc., wavelengths.
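This phase argument can be checked numerically. The following sketch is illustrative (the sampling grid and unit amplitudes are arbitrary choices, not values from the text): it sums two monochromatic beams shifted against each other by a fraction of a wavelength and reports the peak of the total.

```python
import numpy as np

def combined_amplitude(shift_in_wavelengths):
    """Peak amplitude of the sum of two unit-amplitude monochromatic
    beams (RT and TR) when one is shifted against the other by the
    given fraction of a wavelength."""
    delta = 2 * np.pi * shift_in_wavelengths    # relative phase of the beams
    x = np.linspace(0.0, 1.0, 2001)             # one wavelength of travel
    total = np.sin(2 * np.pi * x) + np.sin(2 * np.pi * x + delta)
    return float(np.max(np.abs(total)))

print(combined_amplitude(0.0))    # in phase: maximum amplitude
print(combined_amplitude(0.5))    # half-wavelength shift: complete cancellation
print(combined_amplitude(1.0))    # full-wavelength shift: in phase again
```

A shift of one full wavelength is numerically indistinguishable from no shift at all, which is exactly the ambiguity just described.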

Suppose two monochromatic beams with two different wavelengths are sent through the

interferometer at the same time. If the distances from mirrors C and D to the beam splitter are

equal, then both the monochromatic beams, even though they have different wavelengths, must

be in phase when leaving the interferometer, producing a maximally bright oscillation in the

recombined exit beam. When the distances to the beam splitter are not exactly equal, however,

one of the monochromatic beams may end up shifted against itself by one, two, etc., wavelengths,

but there is no reason for the other beam to be shifted against itself the same way. When three

monochromatic beams are sent through the interferometer while the distances to the beam splitter

are not equal, matching all three wavetrains becomes even more unlikely. Hence, if we pass

white light containing innumerable distinct monochromatic wavetrains through the instrument,

then the RT and TR beams will recombine to produce a maximally bright output beam if and only

if the distances from mirrors C and D to the beam splitter are equal.
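The uniqueness of the zero-shift condition for white light can be illustrated with a toy model; the 31-wavelength grid below is an arbitrary stand-in for the innumerable wavetrains of a real white-light beam.

```python
import numpy as np

# Crude "white light": monochromatic components spread across the
# visible range, wavelengths in micrometers (an assumed grid).
wavelengths = np.linspace(0.4, 0.7, 31)

def recombined_intensity(path_difference_um):
    """Total intensity of the recombined RT and TR beams when the
    round-trip distances in the two arms differ by the given amount.
    Each monochromatic component contributes 1 + cos(2*pi*d/lambda)."""
    d = path_difference_um
    return float(np.sum(1.0 + np.cos(2 * np.pi * d / wavelengths)))

zero_shift = recombined_intensity(0.0)
others = [recombined_intensity(d) for d in np.arange(0.2, 20.0, 0.1)]
print(zero_shift)     # every component in phase: 2 units per component
print(max(others))    # always below the zero-shift maximum
```

No nonzero path difference brings all the components back into phase at once, so the brightest output marks equal arm lengths.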

To make the white-light beam work as intended, the interferometer needs a glass compensator

plate between mirror C and the beam splitter [see Fig. 1.1(b)]. The compensator plate must be the

same thickness and orientation—and made from the same type of glass—as the glass in front of

the beam splitter’s partially reflecting surface. Figure 1.6(a) shows how light waves reflect from

mirrors C and D; the wavelength does not change while reflecting. In Fig. 1.6(b), however, light

waves inside the glass are somewhat shorter than they are outside the glass; the wavelength of the

light with respect to the glass thickness is greatly exaggerated to show this effect.

Therefore, a given distance traveled inside the glass corresponds to more wavelengths of a

monochromatic beam than the same distance in empty space. Moreover, different colors or

wavelengths of light shrink by different amounts, and this effect was a familiar one to 19th-

century optical scientists. If the compensator plate is not present, then the RT beam in Fig. 1.1(b)

passes through the glass in the beam splitter three times, whereas the TR beam passes through the

beam-splitter glass only once. The RT beam thus contains more wavelengths than the TR beam

even though the distances between the mirrors and the beam splitter are equal. With the

compensator plate present, however, both the TR and the RT beams pass through three glass thicknesses.
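The wavelength counting that makes the compensator necessary is easy to quantify; the plate thickness, refractive index, and wavelength below are illustrative assumptions.

```python
# Number of wavelengths accumulated over a given path, in vacuum and
# inside glass, for one pass through an assumed 5 mm plate.
glass_thickness_mm = 5.0
n_glass = 1.5               # assumed refractive index
wavelength_mm = 500e-6      # 500 nm light, expressed in millimetres

waves_in_vacuum = glass_thickness_mm / wavelength_mm
waves_in_glass = glass_thickness_mm / (wavelength_mm / n_glass)

# Without the compensator, beam RT makes three passes through this much
# glass while beam TR makes only one, so RT accumulates extra wavelengths
# relative to TR even when the mirror distances are equal:
extra_waves = 2 * (waves_in_glass - waves_in_vacuum)

print(waves_in_vacuum)   # about 10000 wavelengths per pass in empty space
print(waves_in_glass)    # about 15000 wavelengths per pass in the glass
print(extra_waves)       # about 10000 extra wavelengths in beam RT
```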

5

In fact, we now know that a strictly monochromatic beam of light must have matching planes of vibration when

shifted against itself by exactly one, two, etc., wavelengths.


FIGURE 1.4(a). Figure 1.4(a) shows a segment of radiation entering the interferometer and Fig. 1.4(b) shows what that segment becomes when it leaves the interferometer if the distance it travels up and back each interferometer arm is the same.


FIGURE 1.4(b). Beams RT and TR leaving the interferometer.


FIGURES 1.5(a)–1.5(e). Beams TR and RT and their total: in phase; out of phase by a quarter wavelength; out of phase by a half wavelength; out of phase by three-quarters of a wavelength; and in phase again after a shift of one full wavelength.


FIGURE 1.6(a). An incident wavefield and the reflected wavefield at a mirror.

FIGURE 1.6(b). Incident, reflected, and transmitted wavefields at the beamsplitting film on its glass substrate.


Each monochromatic component has its own unique number of wavelengths in each arm

of the interferometer; thus, the blue-light component in one arm has the same number of

wavelengths as the blue-light component in the other arm, the red-light component in one arm

has the same number of wavelengths as the red-light component in the other arm, and the same

can be said about all the other colors in the white-light beam.

Michelson wanted to do more than just make the distances traveled by light going back and forth between the C, D mirrors and the beam splitter equal; he also wanted to see how the distances traveled by the light beams changed when he rotated the interferometer on its stand [see Fig. 1.1(a)]. Up to now, we have assumed that mirrors C and D are exactly perpendicular to the

line of sight between their centers and the beam splitter, but nothing stops us from tilting one of

them a very slight amount, as shown in Fig. 1.7. The degree of tilt is, of course, greatly

exaggerated to show what is happening. When the tilt is imposed after the distances of mirrors C

and D to the beam splitter have been made equal, the center line of the tilted mirror remains at the

same distance from the beam splitter as it was before the tilt occurred. If the tilt is so small that

the slight change in direction of the beam can be disregarded, then that part of the beam reflecting

off the mirror’s center line still recombines with light from the other mirror in such a way as to

produce the maximally bright oscillation already discussed above. The off-center parts of the

recombined beam are, of course, dimmer because the off-center parts of the tilted mirror no

longer match up properly to the untilted mirror.6 An observer looking through the telescope

shown in Figs. 1.1(a) and 1.1(b) sees a bright central band, called a “fringe,” corresponding to the

central strip lying along the center line of the tilted mirror, with dark and less bright bands or

fringes on either side. If the distance that the light travels between the tilted mirror and the beam

splitter changes slightly, we expect the central fringe to shift as one side or another of the tilted

mirror—instead of its center line—becomes equal to the distance traveled by the light in the other

arm of the interferometer. It is exactly this sort of fringe shift that Michelson hoped to see when

he rotated the interferometer on its stand, changing the direction in space of the light going up

and back the arms of the interferometer.
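The spacing of these fringes follows from the tilt geometry. In the sketch below, the tilt angle and wavelength are assumed values, and the round-trip path in the tilted arm is taken to differ by 2·theta·x at transverse position x on the mirror.

```python
import math

theta = 1e-4           # assumed tilt of the mirror, in radians
wavelength = 500e-9    # assumed wavelength, in metres

def relative_intensity(x):
    """Brightness of the recombined beam at transverse position x,
    for a round-trip path difference of 2*theta*x between the arms."""
    path_difference = 2 * theta * x
    return 1.0 + math.cos(2 * math.pi * path_difference / wavelength)

x_dark = wavelength / (4 * theta)       # first dark band: shift of lambda/2
fringe_spacing = wavelength / (2 * theta)

print(relative_intensity(0.0))          # centerline of the tilted mirror: bright
print(relative_intensity(x_dark))       # first dark fringe
print(fringe_spacing)                   # about 2.5 mm between bright fringes
```

A small change in the length of the tilted-mirror arm slides this whole pattern sideways, which is the fringe shift discussed above.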

One last point we need to make is that many beam splitters of the type shown in Fig. 1.1(b)

reflect differently from the glass side and the nonglass side of the partially reflecting surface,

reversing the direction of vibration in the TR beam reflecting off the nonglass side and not

reversing it in the RT beam reflecting off the glass side.7

Figure 1.5(c) shows that reversing the direction of vibration is the same as changing the phase

of the beam by one half-wavelength or 180°, so the phenomenon is often referred to as a 180°

phase shift on reflection. Michelson used this sort of phase-shifting beam splitter, so the RT and

TR beams in his interferometer did not match up the way they are shown in Fig. 1.4(b) when the

distances of mirrors C and D from the beam splitter are equal but instead match up as shown in

6

See Secs. 5.20 and 5.21 in Chapter 5 for a more detailed discussion of how to analyze a tilted mirror.

7

F. Jenkins and H. White, Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New York, 1957), p. 251.


FIGURE 1.7. The centerline and angle of tilt of the tilted mirror. Note: The angle of tilt is greatly exaggerated in this diagram.


Fig. 1.8. Now the central fringe coming from the center line of the tilted mirror is dark because

all the monochromatic components of the two beams cancel out rather than add together. When

Michelson sent white light through his interferometer, he thus saw a central dark fringe with

parallel multicolored fringes on either side. The colored fringes come from the off-center strips of

the tilted mirror where one or another monochromatic wavetrain is shifted against itself by

exactly one, two, etc., wavelengths, increasing the amplitude of its oscillation with respect to the

wavetrains of other colors inside the recombined beam. In this setup, the central dark fringe is

unique, making it easy for Michelson to see how its position changes as the interferometer is

rotated.
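A toy numerical model shows the effect of the reversal; the wavelength grid below is an arbitrary assumption, and the reversal is represented as an extra phase of pi in one beam.

```python
import numpy as np

# Crude white light: visible-range wavelengths in micrometers (assumed).
wavelengths = np.linspace(0.4, 0.7, 31)

def component_intensity(d_um, lam_um):
    """Intensity of one monochromatic component for a path difference
    d between the arms; the extra pi models the reversal of the
    direction of vibration at the beam splitter."""
    return 1.0 + np.cos(2 * np.pi * d_um / lam_um + np.pi)

# At zero path difference every component cancels: a dark central fringe.
center = sum(component_intensity(0.0, w) for w in wavelengths)

# Off center, a path difference of half a green wavelength restores the
# green component to full brightness (the reversal plus the half-wave
# shift put it back in phase), while other colors lag behind, giving
# the colored side fringes.
d = 0.275                                    # half of 0.55 micrometers
green = component_intensity(d, 0.55)
blue = component_intensity(d, 0.40)
print(center, green, blue)
```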

1.2 HISTORICAL REASONING BEHIND THE ETHER-WIND EXPERIMENT

Physical theory has changed a great deal since 1881, but it is still relatively easy to understand

the reasoning behind Michelson’s experiment. As soon as light is taken to be a wavefield in a

medium at rest, such as waves on the surface of water, and the Earth’s motion through space is

regarded as carrying the interferometer through the medium, everything falls into place.

The first point worth mentioning is that the velocity at the equator due to the Earth’s daily rotation is about 0.46 km/sec, much less than the Earth’s orbital velocity around the sun of about 29.7

km/sec. Consequently, the rotational velocity of Michelson’s laboratory—well north of the

equator—was only about 1% of the orbital velocity, and Michelson did not have to pay any

attention to it. The interferometer in Fig. 1.1(a) can be rotated on its stand, so at noon and

midnight, Michelson could always arrange for one arm to be aligned with the Earth’s orbital

velocity. Figures 1.9(a) and 1.9(b) show light traveling along the arms of a Michelson

interferometer when the interferometer is viewed as moving with a velocity v through a stationary

medium—that is, a luminiferous ether—and one of the arms is aligned with v. To keep life

simple, we have dropped the compensator plate from the two diagrams. Figure 1.9(a) shows light

traveling out and back along the arm aligned with v, with the interferometer rotated so that this is

the arm holding mirror C in Fig. 1.1(b). Figure 1.9(b) shows light traveling out and back along

the arm holding mirror D in Fig. 1.1(b). The positions of mirrors C and D are adjusted so that

each one is the same distance a from the beam splitter.

Figure 1.9(a) shows the beam splitter at three different positions as a single crest of the light’s

wavefield moves through the interferometer: when the wavecrest first enters the arm of the

interferometer, when the wavecrest reflects off mirror C, and when the wavecrest returns to the

beam splitter for the second time. Mirror C is shown at the same three times—when the

wavecrest enters the arm, when it reflects off C, and when it returns to the beam splitter. The

velocity of the wavecrest with respect to the ether is c, and time t1 elapses as the wavecrest goes from the beam splitter to mirror C. Hence, the wavecrest covers a distance a + vt1 in the stationary ether while traveling at velocity c, with

$$a + v t_1 = c t_1 . \tag{1.1a}$$


FIGURE 1.8. Beams RT and TR.


FIGURE 1.9(a). Light traveling out and back along the arm aligned with the direction of the Earth’s motion: incident light, the beam splitter, and mirror C are shown, with arm length a and displacements vt1 and vt2 of the beam splitter and mirror; the recombined light heads to the telescope.


FIGURE 1.9(b). Light traveling out and back along the arm perpendicular to the direction of the Earth’s motion: positions of the beam splitter and mirror D are shown, with arm length a and beam-splitter displacements vt3; the recombined light heads to the telescope.


Time t2 elapses while the wavecrest returns from mirror C to the beam splitter, and similar reasoning shows that

$$a - v t_2 = c t_2 . \tag{1.1b}$$

Solving for t1 and t2 in Eqs. (1.1a) and (1.1b) gives

$$t_1 = \frac{a}{c - v} \qquad \text{and} \qquad t_2 = \frac{a}{c + v} .$$

The wavecrest spends time

$$t_1 + t_2 = \frac{a}{c - v} + \frac{a}{c + v} = \frac{2ac}{c^2 - v^2}$$

going out to mirror C and back to the beam splitter, and it does so while traveling at velocity c, so it covers a total distance

$$c\,(t_1 + t_2) = \frac{2ac^2}{c^2 - v^2} . \tag{1.1c}$$
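Equations (1.1a)–(1.1c) can be verified with exact rational arithmetic; the numbers below are arbitrary illustrative values (only the ratio v/c matters).

```python
from fractions import Fraction

# Arbitrary illustrative values in consistent units: arm length a,
# wave speed c, and ether-wind speed v.
a, c, v = Fraction(1), Fraction(5), Fraction(3)

t1 = a / (c - v)    # outbound leg, from a + v*t1 = c*t1
t2 = a / (c + v)    # return leg, from a - v*t2 = c*t2

# Each leg satisfies its defining equation:
assert a + v * t1 == c * t1
assert a - v * t2 == c * t2

# Total distance covered matches Eq. (1.1c):
assert c * (t1 + t2) == 2 * a * c**2 / (c**2 - v**2)
print(t1 + t2)      # total round-trip time, 5/8 in these units
```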

Figure 1.9(a) also shows the wavecrest traveling at an angle, instead of straight down, after it

reflects off the beam splitter when leaving the interferometer’s arm. This allows it to head toward

where the observing telescope will be by the time the wavecrest reaches it; there is thus no

danger of the telescope missing the wavecrest because it has moved out of position. Figures

1.10(a) and 1.10(b) show why this happens. Figure 1.10(a) shows a single wavecrest reflecting

off a 45° stationary mirror. The large dots indicate where the “corner” of the reflecting wavecrest

is now and has been in the past as it reflects from the stationary mirror. The reflected wavecrest

travels upward at 90° from its original direction, as expected. Figure 1.10(b) shows what happens when the same type of wavecrest reflects off a moving 45° mirror. The four thin solid lines show

the positions of the mirror at four equally spaced instants in time, and the large dots again show

where the corner of the reflecting wavecrest is at these times. Connecting these dots with a thick

dashed line, we see that the wavecrest feels an effective stationary mirror that is slanted at an

angle somewhat greater than 45°. This means the reflected wavecrest does not travel straight up

as in Fig. 1.10(a) but instead moves a little off to the right.
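The slant of the effective mirror can be extracted from the geometry of Fig. 1.10(b). In the sketch below, the incident crest is modeled as a vertical line sweeping left at speed c and the 45° mirror as the line y = x − vt sliding to the right; the speeds are arbitrary assumed values.

```python
import math

c, v = 1.0, 0.2    # arbitrary speeds; only the ratio v/c matters

def corner(t, x0=10.0):
    """Where the incident crest x = x0 - c*t meets the moving mirror
    line y = x - v*t: the 'corner' marked by dots in the figure."""
    x = x0 - c * t
    return x, x - v * t

(x1, y1), (x2, y2) = corner(0.0), corner(1.0)
slope = (y2 - y1) / (x2 - x1)       # slope of the effective mirror surface
angle = math.degrees(math.atan(slope))

print(slope)   # (c + v)/c, steeper than the stationary mirror's slope of 1
print(angle)   # somewhat greater than 45 degrees, as the text states
```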


Figure 1.9(b) shows how the wavecrest travels up and back the interferometer arm perpendicular to velocity v. In time t3, the wavecrest travels a distance $\sqrt{a^2 + v^2 t_3^2}$ from the beam splitter to mirror D; and, because it does this at velocity c, we must have

$$c t_3 = \sqrt{a^2 + v^2 t_3^2}$$

or

$$t_3 = \frac{a}{\sqrt{c^2 - v^2}} .$$

Figure 1.9(b) shows that the total distance traveled from the beam splitter to mirror D and back again must be

$$2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} . \tag{1.2}$$

Even though the two interferometer arms are both of length a, if the interferometer is moving

then a single wavecrest splitting at the beam splitter does not travel the same distance in each arm

before recombining at the beam splitter. The difference Δs between the distances traveled out and back in each arm is, according to Eqs. (1.2) and (1.1c),

$$\Delta s = c\,(t_1 + t_2) - 2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} \left[ \frac{c}{\sqrt{c^2 - v^2}} - 1 \right] = \frac{2a}{\sqrt{1 - v^2/c^2}} \left[ \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right] .$$

The Earth’s orbital velocity is about $10^{-4}$ of the speed of light c, so we can make the approximation

$$\frac{1}{\sqrt{1 - v^2/c^2}} \cong 1 + \frac{v^2}{2c^2} .$$

This gives

$$\Delta s \cong 2a \left( 1 + \frac{v^2}{2c^2} \right) \left( 1 + \frac{v^2}{2c^2} - 1 \right) = \frac{a v^2}{c^2} + O(v^4/c^4) .$$
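Plugging in numbers shows how good this approximation is; the arm length below is an assumed illustrative value.

```python
import math

c = 2.998e8    # speed of light, m/s
v = 3.0e4      # Earth's orbital speed, about 1e-4 of c
a = 1.2        # assumed arm length, metres

gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
ds_exact = 2 * a * gamma * (gamma - 1.0)    # exact Delta_s from the formula above
ds_approx = a * v**2 / c**2                 # the small-velocity approximation

print(ds_exact, ds_approx)   # both about 1.2e-8 m; the v^4/c^4 error is invisible
```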


FIGURE 1.10(a). An incident wavecrest enters from the right and is reflected up from a stationary surface. The dots show where the corner of the wavecrest is at equally spaced time intervals while it is reflecting off the surface.


FIGURE 1.10(b). The same wavecrest is shown here at four instants of time, each instant separated from the next by a time interval of Δt, as it enters from the right and reflects off a flat surface traveling from left to right across the page. The dots show where the corner of the wavecrest is at these four instants of time, and the thick dashed line shows the effective slant of the surface experienced by the wavecrest as it reflects.


Since $v^2/c^2 \approx 10^{-8}$ and $v^4/c^4 \approx 10^{-16}$, it makes sense to neglect the $v^4/c^4$ terms and write

$$\Delta s \cong \frac{a v^2}{c^2} \approx 10^{-8}\, a . \tag{1.3a}$$

It is perhaps of interest to point out that Michelson, by mistakenly assuming that the light traveling up and back the arm perpendicular to the orbital velocity covered a distance 2a instead of $2ac/\sqrt{c^2 - v^2}$, ended up with

$$\Delta s \cong \frac{2 a v^2}{c^2} \approx 2 \times 10^{-8}\, a \tag{1.3b}$$

in his 1881 paper. This incorrect formula did not affect Michelson’s overall analysis because, as

he explained in the paper, the data was good enough to rule out an effect ten times smaller than

what he expected to see.

As pointed out in Sec. 1.1, when white light passed through the interferometer with one of the

end mirrors slightly tilted, Michelson saw a central dark band or fringe from the centerline of the

tilted mirror because the centerline is the same distance from the beam splitter as the untilted

mirror. Remembering that Michelson used a beam splitter that reversed the direction of vibration

in one of the recombining beams, we know that at the center of the dark fringe each

monochromatic wavetrain in the white-light beam cancels itself out. At the first colored band or

fringe on either side of the centerline, the wavetrains go from cancelling themselves out to

reinforcing themselves, becoming bright at those positions on the tilted mirror where the length

traveled out and back the tilted mirror arm is a half-wavelength longer than at the center of the

dark band [see, for example, the transition from Fig. 1.5(c) to Fig. 1.5(e)]. Hence, for each

monochromatic wavetrain, the transition from dark to bright is halfway complete where the

length traveled out and back the tilted-mirror arm is a quarter wavelength different from what it is

at the center of the dark band. Considering the joint actions of all the monochromatic wavetrains

in the white-light beam, Michelson then knew that going from the center to the edge of the dark

fringe corresponded to shifting from a position on the tilted mirror where the length out and back

in both interferometer arms was equal to a position where the length out and back the tilted

mirror arm was different by one quarter of the average wavelength λav of the white-light beam.

Thus the fringe widths inside the telescope’s field of view gave him an extremely fine-grained

scale for measuring the difference in distance between the two arms. For greater accuracy, a

monochromatic beam could be sent through the interferometer and the tilted mirror adjusted until

the fringes matched up with the scale marks of the telescope’s eyepiece.

If the interferometer is rotated so that the arm originally parallel to v is now perpendicular to

v, then the distance out and back one arm is shorter by Δs and the distance out and back in the other arm is longer by Δs, so there is—according to Eq. (1.3a)—a shift of


$$2 \Delta s \cong \frac{2 a v^2}{c^2} \approx 2 \times 10^{-8}\, a \tag{1.4}$$

of the wavefield from one arm when compared to the wavefield from the other arm. If 2Δs equals λav/4, the dark fringe shifts until its center is located at the previous position of one of its edges; if 2Δs is larger, then the dark fringe shifts more; and if 2Δs is smaller, then the dark fringe shifts

less. For the value of a he chose, Michelson expected the fringe to shift by approximately one-

tenth its width. To within experimental error, he did not see the dark fringe shift at all. Michelson

concluded that

the hypothesis of the stationary ether is thus shown to be incorrect, and the necessary conclusion follows that

the hypothesis is erroneous.8
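An order-of-magnitude check of the expected shift is straightforward; the arm length and average wavelength below are assumed round numbers, not values taken from Michelson’s paper.

```python
a = 1.2                  # assumed arm length, metres
lam_av = 570e-9          # assumed average visible wavelength, metres

two_delta_s = 2e-8 * a   # Eq. (1.4)

# Per the discussion above, the dark fringe's half-width corresponds to
# lam_av/4 of path difference, so a full fringe width is about lam_av/2.
shift_in_fringe_widths = two_delta_s / (lam_av / 2)
print(shift_in_fringe_widths)   # of order one-tenth of a fringe width,
                                # the size of shift Michelson was seeking
```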

The existence of the ether was accepted by many scientists, so this experiment was by no means the last word in the matter; indeed, it inaugurated 50 years of ever more painstaking

attempts to detect an ether wind using larger and more sensitive Michelson interferometers.

Michelson himself took the first step down this road when, in 1887, he collaborated with Edward

Morley to repeat his experiment; Fig. 1.11 shows the optical diagram of the interferometer they

constructed. They concluded that the velocity v of the interferometer with respect to the ether was

probably less than a sixth of the Earth’s orbital velocity, an upper limit suggested by

experimental error.9 Michelson and Morley regarded this as another negative result. Many

scientists, including Michelson, at first interpreted these experiments as showing that the Earth

dragged along a layer of ether near its surface, making it hard to say just how fast the

interferometer might be moving with respect to the ether in the laboratory. Interferometers were

set up on tops of mountains and sent up in high-altitude balloons, hoping to get outside the ether

layer dragged along by the Earth, but no one came up with any results convincingly larger than

experimental error. According to Einstein’s special theory of relativity, published in 1905, there

is no reason to expect “ether drift” at all, because the speed of light is the same in all inertial

frames of reference. After 1905, attempts to detect ether drift were basically attempts to disprove

relativity theory, and scientists who pursued them were regarded by their peers as ever more

eccentric. Perhaps the last serious attempt to detect an ether wind using a Michelson

interferometer took place on top of Mount Wilson, where Dayton Miller ran an extremely large

and sensitive Michelson experiment in the 1920s. When publishing the results in the early 1930s,

he claimed to detect ether-wind velocities on the order of 10 km/sec,10,11 but the data remained

8. Michelson, "The Relative Motion of the Earth."

9. A. Michelson and E. Morley, "On the Relative Motion of the Earth and the Luminiferous Ether," American Journal of Science 34, Series 3 (1887), 333–345.

10. D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Reviews of Modern Physics 5, no. 2 (July 1933), 203–242.

1 · Ether Wind, Spectral Lines, and Michelson Interferometers

controversial. After his death, the results were attributed to slight but systematic temperature

changes in the instrument during the measurements.12

1.3 Monochromatic Light and Spectral Lines

The wavelength λ of a monochromatic light wave and the frequency f in cycles per unit time of

that same monochromatic light wave are connected by

$$\lambda f = c, \qquad (1.5)$$

where c is the velocity of light. By the second half of the 19th century, it was known that the light

emitted by free atoms, such as from the atoms inside a hot dilute gas, is often emitted at specific

frequencies called spectral lines. Equation (1.5) then requires the light from a spectral line to

have a precise wavelength λ = c/f. Michelson used these spectral lines to generate the

monochromatic light sent through his interferometer. When, for example, a spectroscope was

used to separate out the cadmium red line and send it through the interferometer, he would see a

regular pattern of red fringes; when the mercury green line was sent through, he would see

regular green fringes; and so on. Many of these lines are in reality clumped groups of spectral

lines, all having nearly the same wavelength; they masquerade as a single bright line when

observed by low-resolution spectroscopes and spectrometers.
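As a quick numerical illustration of Eq. (1.5), the frequency of a spectral line follows directly from its wavelength. The cadmium red wavelength below is a rounded reference value used here only as an assumption:

```python
# Eq. (1.5): lambda * f = c, so f = c / lambda.
# The cadmium red wavelength is a rounded reference value (an assumption
# for illustration, not quoted from the text).
c = 2.998e8            # speed of light, m/s
lam_cd_red = 643.8e-9  # cadmium red line wavelength, meters

f = c / lam_cd_red     # frequency of the spectral line
print(f)               # roughly 4.66e14 Hz
```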

1.4 Applying the Michelson Interferometer to Spectral Lines

After the first ether-wind experiments, Michelson demonstrated that his interferometer could also

be used both as an extremely accurate, practical ruler for measuring fundamental lengths and as

an extremely high-resolution spectrometer. To understand Michelson’s approach, we must keep

in mind that the only "optical detectors" available back then were cameras (whose images had to

be chemically developed in darkrooms) and the human eye.

When the interferometer is used as a ruler or spectrometer, one of the arms is modified so that

its mirror is easily moved, as shown in Fig. 1.12. This moving mirror and the fixed mirror on the

other arm are still slightly tilted with respect to each other; that is, when extended indefinitely,

the planes of the mirror surfaces do not meet at exactly 90°. In this discussion, we refer to the

moving mirror as being tilted and the fixed mirror as being untilted. To keep things consistent

with the discussion in Sec. 1.1, the beam splitter is assumed to be the same type used in the 1881

11. D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Nature (February 3, 1934), 162–164.

12. R. Shankland, S. McCuskey, F. Leone, and G. Kuerti, "New Analysis of the Interferometer Observations of Dayton C. Miller," Reviews of Modern Physics 27, no. 2 (April 1955), 167–178.

FIGURE 1.11. Optical diagram of the interferometer constructed by Michelson and Morley in 1887.

ether-wind experiment. Hence, when a white-light beam is sent through the instrument, an

observer notes a central dark fringe if the center of the tilted moving mirror is the same distance

from the beam splitter as the center of the fixed mirror. This equidistant position of the moving

mirror is today often called the position of zero-path difference (ZPD) because the light’s path up

and back each arm of the interferometer is the same when there is no tilt present.

The position and tilt of the moving mirror can be adjusted until the central dark fringe is

centered on rulings marked in the telescope’s eyepiece. When the white-light beam is replaced by

a monochromatic beam from a spectral line, the observer sees a sequence of light and dark bands

forming a regular pattern of fringes having the same color as the spectral line. The marked

position of the central dark fringe in the center of the eyepiece is now occupied by a dark null of

the monochromatic fringe pattern. This null corresponds to the centerline strip of the tilted

mirror’s surface being the same distance from the beam splitter as the untilted mirror’s surface.

The two bright fringes on either side of the marked null separate that null from the two

neighboring nulls, with the neighboring nulls corresponding to two strips of the tilted mirror’s

surface that are a half-wavelength closer to, and a half-wavelength further away from, the beam

splitter. A half-wavelength difference in distance from the beam splitter creates, of course, a full

wavelength’s difference in the distance traveled up and back the interferometer’s arm, which is

why we see another null. Depending on the configuration of the telescope, the amount of tilt in

the tilted mirror, and the wavelength of the monochromatic beam, there will be some number of

additional fringes alternating bright and dark across the field of view, with the nulls

corresponding to strips of the tilted mirror’s surface that are one half-wavelength closer to and

further away from the beam splitter, two halves or one full wavelength closer to and further away

from the beam splitter, three halves closer to and further away from the beam splitter, and so on.

The observer can slowly move the tilted mirror out along its arm, watching as the fringe

pattern moves across the telescope’s field of view. The movement occurs, of course, because the

strips of the moving mirror’s tilted surface that are 1/2, 1, 3/2, etc., wavelengths closer to or

further away from the beam splitter are now no longer where they used to be. The marked null

shifts and, after the mirror moves half a wavelength from its original position, the null that used

to be immediately to one side shifts into the marked location. The fringe pattern looks the same

as just before the mirror began moving, but the observer knows there has been a half-wavelength

shift in the position of the moving mirror because the fringes have been carefully watched as their

positions changed. As the mirror moves, old fringes move out of sight on one side of the field of

view while new fringes replace them on the other side of the field of view. The observer checks

that the tilt of the moving mirror does not change by making sure that there is always the same

number of bright-null repetitions in the fringe pattern. Since the position of the moving mirror is

always known to within a small fraction of a wavelength, the interferometer has now become an

extremely accurate way to measure distance.
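The fringe-counting procedure just described reduces to simple arithmetic: each bright-null repetition that crosses the marked position means the mirror has moved another half wavelength. A minimal sketch, with an assumed wavelength and an assumed fringe count:

```python
# Each fringe repetition corresponds to a half-wavelength of mirror travel.
# Both numbers below are assumed demonstration values.
lam = 643.8e-9          # wavelength of the monochromatic line, meters
fringe_shifts = 10000   # counted half-wavelength repetitions

distance_moved = fringe_shifts * lam / 2.0  # mirror displacement, meters
print(distance_moved)   # about 3.2 mm
```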


FIGURE 1.12. Interferometer arranged for spectral-line work: light from spectral lines enters at the beam splitter and compensator plate, reaches the fixed mirror and the moving mirror (a distance p along its arm), and the recombined beam goes to the telescope.


Michelson did not hesitate to measure distances with his interferometer. In 1892 he

established that the standard meter bar in Paris corresponded, to an accuracy of one part in two

million, to 1,553,163.5 wavelengths of monochromatic light from the red cadmium spectral line.

At Yerkes Observatory in Wisconsin, he measured the extremely small tidal distortions of the

planet Earth due to the moon’s gravity, helping to establish that the Earth has an iron core, and

published the results in 1919. There is, however, a fundamental difficulty limiting his ability to

use the interferometer as a ruler: As the moving mirror gets further and further away from its

equidistant or ZPD position, the pattern of fringes starts to fade and eventually disappears. This

phenomenon is caused by the beam from the spectral line not being exactly monochromatic—

either because what looks like a single spectral line is in reality a group of two or more lines

having almost the same wavelength, or because the line itself has a finite spectral “width,”

simultaneously emitting light at a very large number of wavelengths all very close to each other

in value.

To see why the fade-out occurs for a closely spaced group of spectral lines, we first analyze

what happens when the light from a pair of equal-intensity, closely spaced spectral lines,

sometimes called a spectral doublet, is sent through the interferometer. Inside the interferometer,

the doublet behaves like two monochromatic beams—each having a slightly different

wavelength—simultaneously passing through the instrument. After using white light to put the

moving, tilted mirror at its ZPD position, we begin sending the doublet beam through the

interferometer. Each monochromatic beam produces a fringe pattern. To the human eye, the

fringe patterns have the same color and their nulls seem to be at exactly the same places in the

telescope’s field of view. Because the wavelengths of the beams are nearly identical, the two

fringe patterns lie almost exactly on top of each other, reinforcing each other the same way the

dashed and solid oscillations lie on top of each other to create a thicker line at the left-hand edge

of Fig. 1.13. When, for example, there is a null in one beam’s fringe pattern because that strip of

the tilted mirror’s surface is an integer number of half-wavelengths closer to or further away from

the beam splitter, the null from the other beam’s fringe pattern falls in almost exactly the same

place because it has almost exactly the same wavelength. As we shift the moving mirror further

away from ZPD and watch the fringes move, we know that when each new fringe forms at the

leading edge of the field of view, it shows that the edge of the tilted moving mirror is an ever

larger number of half-wavelengths further from the beam splitter. Sooner or later, however, the

same thing happens to the two beams’ fringe patterns that happens in Fig. 1.13 as we look away

from its left-hand edge—the oscillations get out of phase. Just as the dashed and solid lines in

Fig. 1.13 no longer match up exactly because they have slightly different repetition lengths, so do

the two fringe patterns of the two beams match up less well because they have slightly different

wavelengths. There always comes a point—perhaps when the next null is forming at 10,000 or

50,000 or more half-wavelengths from the ZPD position of the moving mirror—where the

monochromatic beam with the slightly shorter wavelength λ1 is ready to form a null somewhat

before the beam with the slightly longer wavelength λ2. The nulls and brights from one

monochromatic fringe pattern shift enough with respect to the other that we begin to notice a

change: the pattern begins to fade. Eventually, the two fringe patterns are completely out of


phase, with the brights and nulls of one pattern lying on, respectively, the nulls and brights of the

other. If the two beams are of equal intensity, then the fringe pattern fades away completely.

Suppose the λ1 set of fringes first becomes exactly out of phase with the λ2 set of fringes when

the moving mirror has traveled a distance of approximately N/2 wavelengths of the λ2 beam from

its equidistant or ZPD location. At this point, N satisfies the approximate equation

$$\frac{1}{2}N\lambda_2 \cong \frac{1}{2}\left(N + \frac{1}{2}\right)\lambda_1, \qquad (1.6a)$$

which can also be written as

$$\frac{\lambda_2 - \lambda_1}{\lambda_1} \cong \frac{1}{2N}. \qquad (1.6b)$$

Equation (1.6b) gives the fractional spread (λ2 − λ1)/λ1

between the doublet's wavelengths in terms of N. If N is too large for convenient counting and

only several digits of accuracy are needed, we can directly measure the distance p in Fig. 1.12 at

which the fringe pattern disappears. Recognizing that both sides of Eq. (1.6a) are formulas for p

at the fade-out point, we can approximate either side of Eq. (1.6a) by Nλav/2, where λav is the

approximate wavelength of the doublet, and write

$$\frac{N\lambda_{av}}{2} \cong p. \qquad (1.6c)$$

This can be solved for

$$N \cong \frac{2p}{\lambda_{av}} \qquad (1.6d)$$

to estimate N in terms of the known values of p and λav . This approximate value of N can then

be put into Eq. (1.6b) to find the fractional spread in the doublet. Hence, we see that the fade-out

is both a “bug” and a “feature” of the interferometer—although it sets a limit on the distances that

can be measured, it also specifies the exact separation of spectral lines too close to be resolved by

other types of spectrometers. This exercise also establishes the basic idea behind Michelson-

based spectroscopy: examining the behavior of the interference signal to measure the beam’s

spectral shape.
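The doublet analysis in Eqs. (1.6a)–(1.6d) can be turned into numbers. A sketch using the sodium D doublet as an assumed example (its wavelengths are standard reference values; the text does not say which doublet Michelson observed):

```python
# Estimating the fade-out distance for a doublet, following Eqs. (1.6b)-(1.6c).
# The sodium D wavelengths are standard reference values used as an assumed example.
lam1 = 589.0e-9   # shorter doublet wavelength, meters
lam2 = 589.6e-9   # longer doublet wavelength, meters
lam_av = 0.5 * (lam1 + lam2)

# Rearranging Eq. (1.6b): N ~= lam1 / (2 * (lam2 - lam1))
N = lam1 / (2.0 * (lam2 - lam1))

# Eq. (1.6c): the fringes first wash out near p ~= N * lam_av / 2
p = N * lam_av / 2.0
print(N, p)   # N on the order of 500; p well under a millimeter
```

Run in the other direction, a measured fade-out distance p and Eq. (1.6d) give N, and then Eq. (1.6b) gives the doublet's fractional splitting.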


FIGURE 1.13. The solid oscillation represents the fringe pattern of one spectral line in the doublet and

the dashed oscillation represents the fringe pattern of the other spectral line in the doublet. The

wavelengths of both spectral lines are almost the same, so their fringe patterns slowly change from being

in-phase, to being out-of-phase, and then back to being in-phase.


Now that we understand why the fringe pattern of a doublet fades, it is easy to see why the

same sort of thing happens with any size group—or multiplet—of closely spaced spectral lines.

Each line of intrinsically greater or lesser intensity generates a fringe pattern of intrinsically

greater or lesser intensity connected to its wavelength. Near ZPD, all the fringe patterns are in

phase, but as the moving mirror shifts away from ZPD, the fringe patterns, since each is produced

by a slightly different wavelength, go out of phase, causing the fringes to fade. Figure 1.14 even

suggests a quick way of understanding something about why a single, finite-width spectral line

also produces fading fringe patterns; approximating it as a closely spaced multiplet, we might

expect its fringes to behave the same way any other multiplet’s would. We should, however, be

careful about carrying this sort of reasoning too far. Figure 1.13 suggests that if, after reaching

the fade-out point, we keep moving the tilted mirror away from its ZPD position, then the

doublet’s fringe pattern starts to reappear, eventually becoming as strong as it was near ZPD. The

same sort of phenomenon should also occur for any multiplet consisting of a finite number of

exact wavelengths; if we go far enough from ZPD, then there should be a region where the fringe

patterns are all back in phase. In reality, when moving away from ZPD, there are indeed regions

where a multiplet’s fringe pattern first fades then grows stronger, but the finite width of each

spectral line inside the multiplet stops the fringes from ever regaining their full ZPD strength.

The fringes always, eventually, fade away completely. To explain this behavior, it is enough to

examine how and why the fringe pattern of a single, finite-width spectral line fades away. This is

done in the next three sections, where we show how a fringe pattern is connected to the Fourier

transform of the spectral intensity.

1.5 Interference Equation for the Ideal Michelson Interferometer

When using a Michelson interferometer for Fourier-transform spectroscopy, the end mirrors in

each arm are aligned to be perpendicular to the line of sight between their centers and the center

of the beam splitter. In effect, we remove the tilt from the moving mirror so that its central fringe

fills the detector’s field of view in Fig. 1.15. The light beam passing through the interferometer

should be collimated, shown schematically in Fig. 1.15, by putting the point source of the beam

at the focus of a thin lens. The beam leaving the interferometer is concentrated onto a detector by

another thin lens. The dashed line shows the ZPD position of the moving mirror in Figs. 1.15 and

1.16. The moving mirror is a distance p from ZPD in these two figures, with p taken to be

positive when the mirror is further away from the beam splitter than its ZPD position and

negative when it is closer to the beam splitter than its ZPD position. The moving mirror should

remain perpendicular to the line of sight between it and the beam splitter as p changes, and the

detector records the changing intensity I of the collimated beam leaving the interferometer.

Even though Michelson did not usually set up his interferometers this way, optical theory was

advanced enough then for him to predict how I depends on p. The first step is to set up an x, y, z

Cartesian coordinate system such as the one shown in Fig. 1.16, with the collimated exit beam

traveling down the z axis. There are dimensionless unit vectors x̂ , ŷ , ẑ pointing in the direction

of the positive x, y, z coordinate axes. Still treating a light beam as a transverse wavefield of the

type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c), we assume that beam TR in Fig. 1.16 is

monochromatic light and write its transverse disturbance as

$$\vec{A}_f = \hat{x}\,U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right). \qquad (1.7a)$$

Here, t is the time coordinate, f is the frequency of the monochromatic disturbance, and λf is the

wavelength corresponding to frequency f. The period of the disturbance is, of course, 1/f, and Eq.

(1.5) reminds us that the wavelength λf is connected to the frequency f by

$$\lambda_f f = c,$$

where again c is the speed of light. Vector $\vec{A}_f$ has no ẑ component, allowing it to represent a

transverse disturbance in the "ether" of the type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c).

The x̂ and ŷ components of $\vec{A}_f$ are the real-valued expressions

$$U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right)$$


FIGURE 1.14. Two plots of spectral intensity against frequency f; the second shows a spectral multiplet.


FIGURE 1.15. Michelson interferometer set up for Fourier-transform spectroscopy: a source at the focus of a lens produces a collimated beam; the beam splitter (at 45°) and compensator plate send light to the fixed mirror and to the moving mirror (a distance p from ZPD, both at 90°), and the recombined beam is concentrated onto a detector.


and

$$V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right)$$

respectively. These components must both oscillate at the same frequency f because the light

beam is monochromatic, but they can have different constant phase shifts δU and δV. This allows

$\vec{A}_f$ to point in different directions in the x, y plane when we move along the beam, as suggested

by the changing orientations of the arrows in beams RT and TR of Fig. 1.16. The Uf and Vf

amplitudes of the x and y oscillations do not have to be equal. To simplify the notation, and

because the concept will be routinely used in the rest of the book, we define

$$\sigma_f = \frac{1}{\lambda_f} \qquad (1.7b)$$

to be the wavenumber of the monochromatic disturbance. Now Eqs. (1.7a) and (1.5) can be

written as

$$\vec{A}_f = \hat{x}\,U_f \cos(2\pi\sigma_f z - 2\pi f t + \delta_U) + \hat{y}\,V_f \cos(2\pi\sigma_f z - 2\pi f t + \delta_V) \qquad (1.7c)$$

with

$$\sigma_f = f/c. \qquad (1.7d)$$

This is the same monochromatic disturbance as before; all that changes is the notation used to

specify how its phase changes with z.

The power transported by a physical wavefield of any type is usually proportional to its

squared amplitude;13,14 and in optics it is now, as it was in Michelson’s time, customary to set the

time average of the squared amplitude equal to the intensity of the transverse wavefield.15 Visible

light has a wavelength on the order of 5 × 10⁻⁷ meters, so by Eq. (1.5) its frequency is about

$$f \cong \frac{c}{5\times10^{-7}\ \mathrm{m}} \cong 6\times10^{14}\ \mathrm{Hz} \qquad (1.8a)$$

given that c ≅ 3 × 10⁸ m/sec. Hence one cycle of the transverse wavefield has a period of about

13. H. Lamb, Hydrodynamics (6th edition), Dover Publications, New York, 1945 copy of the 6th edition first published in 1879, p. 370.

14. P. Morse and K. Ingard, Theoretical Acoustics, McGraw-Hill, Inc., New York, 1968, p. 250.

15. G. Stokes, Mathematical and Physical Papers, Vol. III, Cambridge at the University Press, 1901, pp. 233–258.


FIGURE 1.16. Interferometer with beam splitter, compensator plate, fixed mirror, and moving mirror; beams RT and TR travel along the z axis of an x, y, z coordinate system with unit vectors x̂, ŷ, ẑ, and the optical-path difference is χ = 2p.


$$\frac{1}{6\times10^{14}\ \mathrm{Hz}} \cong 2\times10^{-15}\ \mathrm{sec}. \qquad (1.8b)$$

The response time of the unaided human eye is perhaps as short as 10⁻² s, and 2×10⁻¹⁵ s is

shorter than that by a factor of about 10¹³. The response of the fastest optical detectors available

today is on the order of 10⁻⁹ s, which is still an incredibly long time compared to 2×10⁻¹⁵ s.

Therefore, we might as well take the time over which the squared amplitude is averaged to be

infinitely long, because compared to the wavefield’s period, that’s what it effectively is.

Following the notation of the time, the time average of a function g(t) is taken to be

$$\langle g(t)\rangle = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt. \qquad (1.9a)$$

For any two functions g(t) and h(t), we then have

$$\langle g(t) + h(t)\rangle = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [g(t) + h(t)]\,dt = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt + \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} h(t)\,dt$$

or

$$\langle g(t) + h(t)\rangle = \langle g(t)\rangle + \langle h(t)\rangle. \qquad (1.9b)$$

Multiplying g(t) by a constant K and then averaging, we get

$$\langle K g(t)\rangle = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [K g(t)]\,dt = K \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt$$

or

$$\langle K g(t)\rangle = K \langle g(t)\rangle. \qquad (1.9c)$$

From Eq. (1.7c), the squared amplitude of the monochromatic wavefield is

$$\vec{A}_f \cdot \vec{A}_f = U_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) + V_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V),$$

so that

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = \left\langle U_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) + V_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V) \right\rangle, \qquad (1.10a)$$


which, by Eqs. (1.9b) and (1.9c), becomes

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = U_f^2 \left\langle \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) \right\rangle + V_f^2 \left\langle \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V) \right\rangle. \qquad (1.10b)$$

The average of the squared cosine is 1/2 over one of its cycles.16 As the averaging time gets

longer, it contains ever more cycles of the squared cosine, as well as—almost certainly—some

fraction of a cycle. The contribution of the squared cosine over a fractional cycle has practically

no influence compared to the squared cosine’s average value of 1/2 over a large number of

complete cycles. In the limit as T → ∞, it follows that

$$\langle \cos^2(at + b)\rangle = 1/2 \qquad (1.10c)$$

for all real values of a and b. Hence, the formula for the intensity of the monochromatic beam in

Eq. (1.10b) now reduces to

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = \frac{1}{2}\left(U_f^2 + V_f^2\right). \qquad (1.10d)$$

Although the squared cosine is always positive, the cosine itself is negative as often as it is

positive and averages to zero over one cycle. As the averaging time increases, it includes an ever

larger number of cycles as well as (probably) some leftover fraction of a cycle. Again, the

influence of the zero from the large number of complete cycles outweighs the contribution of

whatever fractional cycle may be present, and in the limit as T → ∞

$$\langle \cos(at + b)\rangle = 0 \qquad (1.11)$$

for all real values of a and b.
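The limits (1.10c) and (1.11) are easy to confirm numerically by averaging over many cycles. A minimal sketch; the frequency, phase, interval, and sample count are arbitrary demonstration choices:

```python
import math

# Numerical check of Eqs. (1.10c) and (1.11): over many cycles, the average
# of cos^2 tends to 1/2 and the average of cos tends to 0.
a, b = 2.0 * math.pi * 7.0, 0.3   # arbitrary real coefficients
n = 100_000                       # number of samples
T = 1000.0                        # averaging interval, many periods long

phases = [a * (k / n) * T + b for k in range(n)]
avg_cos2 = sum(math.cos(s) ** 2 for s in phases) / n
avg_cos = sum(math.cos(s) for s in phases) / n
print(avg_cos2, avg_cos)  # close to 0.5 and 0.0
```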

The wavefield of a beam of light containing two monochromatic wavetrains of frequencies f1

and f2 can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2}, \qquad (1.12a)$$

where

$$\vec{A}_{f_1} = \hat{x}\,U_{f_1} \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) + \hat{y}\,V_{f_1} \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}) \qquad (1.12b)$$

and

$$\vec{A}_{f_2} = \hat{x}\,U_{f_2} \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) + \hat{y}\,V_{f_2} \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}). \qquad (1.12c)$$

16. D. Griffiths, Introduction to Electrodynamics, 2nd ed. (Prentice Hall, Englewood Cliffs, NJ, 1989), p. 359.


The beam’s intensity is the time average of its squared amplitude, which is

$$\langle \vec{A}\cdot\vec{A} \rangle = \left\langle (\vec{A}_{f_1} + \vec{A}_{f_2}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2}) \right\rangle = \left\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} + \vec{A}_{f_2}\cdot\vec{A}_{f_2} + 2\,\vec{A}_{f_1}\cdot\vec{A}_{f_2} \right\rangle,$$

so that

$$\langle \vec{A}\cdot\vec{A} \rangle = \langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2}\cdot\vec{A}_{f_2} \rangle + 2\,\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle. \qquad (1.12d)$$

Substituting Eqs. (1.12b) and (1.12c) into the cross term in Eq. (1.12d) gives

$$\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle = \left\langle U_{f_1}U_{f_2} \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) + V_{f_1}V_{f_2} \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}) \right\rangle,$$

or

$$\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle = U_{f_1}U_{f_2} \left\langle \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) \right\rangle + V_{f_1}V_{f_2} \left\langle \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}) \right\rangle. \qquad (1.12e)$$

There is a trigonometric identity

$$(\cos\alpha)(\cos\beta) = \frac{1}{2}\cos(\alpha - \beta) + \frac{1}{2}\cos(\alpha + \beta), \qquad (1.12f)$$

which shows that

$$\cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) = \frac{1}{2}\cos\!\big(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\big) + \frac{1}{2}\cos\!\big(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\big). \qquad (1.12g)$$

Taking the time average of both sides and applying Eqs. (1.9b) and (1.9c), we see that

$$\left\langle \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) \right\rangle = \frac{1}{2}\left\langle \cos\!\big(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\big) \right\rangle + \frac{1}{2}\left\langle \cos\!\big(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\big) \right\rangle.$$

Equation (1.11) requires both terms on the right-hand side to be zero, which gives

$$\left\langle \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}) \right\rangle = 0. \qquad (1.12h)$$

Replacing δU(1,2) by δV(1,2) in the algebra used to reach this result does not change the

conclusion, which means that

$$\left\langle \cos(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}) \cos(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}) \right\rangle = 0, \qquad (1.12i)$$

so that

$$\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle = 0 \qquad (1.12j)$$

for any two frequencies f1 and f2 such that f1 ≠ f2. Hence, Eq. (1.12d) can be written as

$$\langle \vec{A}\cdot\vec{A} \rangle = \langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2}\cdot\vec{A}_{f_2} \rangle. \qquad (1.12k)$$

Comparing the formula in (1.12k) for the intensity of a beam containing two monochromatic

monochromatic

wavefields to the left-hand side of the formula in (1.10d) for the intensity of a single

monochromatic wavefield, we note that the intensity of the beam with two monochromatic

wavefields is the sum of the intensities of each monochromatic wavefield.

The wavefield of a beam of light containing three monochromatic wavetrains of frequencies

f1, f2, and f3 can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3} \qquad (1.13a)$$

with $\vec{A}_{f_1}$, $\vec{A}_{f_2}$ specified by formulas (1.12b) and (1.12c) respectively and $\vec{A}_{f_3}$ specified by

$$\vec{A}_{f_3} = \hat{x}\,U_{f_3} \cos(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_U^{(3)}) + \hat{y}\,V_{f_3} \cos(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_V^{(3)}). \qquad (1.13b)$$

Following the same analysis as before, we note that the intensity of this three-frequency light

beam is

$$\langle \vec{A}\cdot\vec{A} \rangle = \left\langle (\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}) \right\rangle$$
$$= \left\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} + \vec{A}_{f_2}\cdot\vec{A}_{f_2} + \vec{A}_{f_3}\cdot\vec{A}_{f_3} + 2\,\vec{A}_{f_1}\cdot\vec{A}_{f_2} + 2\,\vec{A}_{f_1}\cdot\vec{A}_{f_3} + 2\,\vec{A}_{f_2}\cdot\vec{A}_{f_3} \right\rangle$$
$$= \langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2}\cdot\vec{A}_{f_2} \rangle + \langle \vec{A}_{f_3}\cdot\vec{A}_{f_3} \rangle + 2\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle + 2\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3} \rangle + 2\langle \vec{A}_{f_2}\cdot\vec{A}_{f_3} \rangle.$$

Equation (1.12j) shows that

$$\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle = 0$$

for any two distinct frequencies f1 and f2. The only thing different about $\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3} \rangle$ and

$\langle \vec{A}_{f_2}\cdot\vec{A}_{f_3} \rangle$ is the subscripts assigned to the distinct frequencies, so the same algebra showing

that $\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2} \rangle$ is zero also shows that

$$\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3} \rangle = \langle \vec{A}_{f_2}\cdot\vec{A}_{f_3} \rangle = 0.$$

Hence, the three-frequency formula for $\langle \vec{A}\cdot\vec{A} \rangle$ reduces to

$$\langle \vec{A}\cdot\vec{A} \rangle = \langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2}\cdot\vec{A}_{f_2} \rangle + \langle \vec{A}_{f_3}\cdot\vec{A}_{f_3} \rangle. \qquad (1.13c)$$

Here again, the intensity of the beam equals the sum of the intensities of its monochromatic

wavetrains.

This same argument can obviously be generalized to a beam consisting of N monochromatic

wavetrains. Since N may be left unspecified and can be made as large as we please, this is the

same as extending it to a beam of white light. The white-light wavefield can be written as

$$\vec{A} = \sum_{i=1}^{N} \vec{A}_{f_i}, \qquad (1.14a)$$

where

$$\vec{A}_{f_i} = \hat{x}\,U_{f_i} \cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) + \hat{y}\,V_{f_i} \cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}). \qquad (1.14b)$$


The intensity of this beam is then

$$\langle \vec{A}\cdot\vec{A} \rangle = \left\langle \left(\sum_{i=1}^{N} \vec{A}_{f_i}\right) \cdot \left(\sum_{j=1}^{N} \vec{A}_{f_j}\right) \right\rangle = \left\langle \sum_{i=1}^{N}\sum_{j=1}^{N} \vec{A}_{f_i}\cdot\vec{A}_{f_j} \right\rangle,$$

so that

$$\langle \vec{A}\cdot\vec{A} \rangle = \sum_{i=1}^{N}\sum_{j=1}^{N} \left\langle \vec{A}_{f_i}\cdot\vec{A}_{f_j} \right\rangle. \qquad (1.14c)$$

Since, by Eq. (1.12j),

$$\left\langle \vec{A}_{f_i}\cdot\vec{A}_{f_j} \right\rangle = 0 \quad \text{for } f_i \neq f_j, \qquad (1.14d)$$

this reduces to

$$\langle \vec{A}\cdot\vec{A} \rangle = \langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2}\cdot\vec{A}_{f_2} \rangle + \cdots + \langle \vec{A}_{f_N}\cdot\vec{A}_{f_N} \rangle = \sum_{i=1}^{N} \left\langle \vec{A}_{f_i}\cdot\vec{A}_{f_i} \right\rangle \qquad (1.14e)$$

because all the i ≠ j terms disappear. Equation (1.14e) shows that the intensity of any beam, even

a white-light beam, is the sum of the intensities of its monochromatic wavetrains. This is

sometimes called the principle of independent superposition,17 and can be written as

$$I = I_{f_1} + I_{f_2} + \cdots + I_{f_N} = \sum_{i=1}^{N} I_{f_i}, \qquad (1.14f)$$

where

$$I = \langle \vec{A}\cdot\vec{A} \rangle \qquad (1.14g)$$

is the total intensity of the beam and

$$I_{f_i} = \left\langle \vec{A}_{f_i}\cdot\vec{A}_{f_i} \right\rangle \qquad (1.14h)$$

is the intensity of its ith monochromatic wavetrain.
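The principle of independent superposition can be checked numerically: with two wavetrains of different frequencies, the time-averaged squared amplitude equals the sum of the individual averages because the cross term averages to zero. A minimal sketch with arbitrary demonstration frequencies:

```python
import math

# Numerical check of the principle of independent superposition, Eq. (1.14f):
# the cross term between distinct frequencies averages to zero, so the total
# intensity is the sum of the individual intensities.
f1, f2 = 3.0, 5.0   # two distinct frequencies, cycles per unit time
n = 100_000         # samples over t in [0, 1), an integer number of periods
                    # for both frequencies

def avg(fn):
    """Average fn(t) over n uniform samples of t in [0, 1)."""
    return sum(fn(k / n) for k in range(n)) / n

I1 = avg(lambda t: math.cos(2 * math.pi * f1 * t) ** 2)
I2 = avg(lambda t: math.cos(2 * math.pi * f2 * t) ** 2)
I_total = avg(lambda t: (math.cos(2 * math.pi * f1 * t)
                         + math.cos(2 * math.pi * f2 * t)) ** 2)
print(I1, I2, I_total)  # I_total matches I1 + I2
```

Sampling over an integer number of periods of both frequencies makes the cross term vanish to machine precision.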

Returning now to Fig. 1.16, we suppose that Eqs. (1.14f)–(1.14h) refer to beam TR and

consider how to write the disturbance for beam RT. In an ideal Michelson interferometer, the

only difference between beam RT and beam TR is that the wavefields in beam RT lag behind the

wavefields in beam TR by a distance χ = 2p that is usually called the optical-path difference.

Using the notation specified in Eq. (1.14b), we see that for every monochromatic wavetrain

17. J. Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley & Sons, New York, 1979), p. 98.


$$\vec{A}_{f_i}^{(TR)} = \hat{x}\,U_{f_i} \cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) + \hat{y}\,V_{f_i} \cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}) \qquad (1.15a)$$

in beam TR, there must be, according to Fig. 1.16, a corresponding monochromatic wavetrain

$$\vec{A}_{f_i}^{(RT)} = \hat{x}\,U_{f_i} \cos(2\pi\sigma_{f_i} (z + \chi) - 2\pi f_i t + \delta_U^{(i)}) + \hat{y}\,V_{f_i} \cos(2\pi\sigma_{f_i} (z + \chi) - 2\pi f_i t + \delta_V^{(i)}) \qquad (1.15b)$$

in beam RT. The total disturbance for the combined beams’ fith wavetrain is then

$$\vec{A}_{f_i}^{(RT)} + \vec{A}_{f_i}^{(TR)}$$

in Fig. 1.16. We also note, however, that the beam splitter in Fig. 1.16 is evidently not the same

sort of beam splitter as the one used by Michelson because it does not reverse the direction of the

oscillation of the TR beam the way that the beam splitter in Fig. 1.8 did. For this sort of beam

splitter, the total disturbance of the combined beam’s fith wavetrain should be

$$\vec{A}_{f_i}^{(RT)} - \vec{A}_{f_i}^{(TR)}$$

according to the discussion at the end of Sec. 1.1. To accommodate both possibilities, we write

the fith wavetrain of the combined beam as

$$\vec{A}_{f_i}^{(cb)} = \vec{A}_{f_i}^{(RT)} + W \vec{A}_{f_i}^{(TR)}, \qquad (1.15c)$$

where parameter W is −1 for Michelson-type beam splitters and +1 for non-Michelson beam

splitters. The superscript (cb) indicates that the disturbance $\vec{A}_{f_i}^{(cb)}$ is the fith wavetrain of two

beams combined in a balanced way—that is, each beam has undergone one transmission and one

reflection at the beam splitter. The intensity of the combined fith wavetrain is

$$I^{(cb)}_{f_i} = \langle \vec A^{(cb)}_{f_i} \cdot \vec A^{(cb)}_{f_i} \rangle = \langle (\vec A^{(RT)}_{f_i} + W\vec A^{(TR)}_{f_i}) \cdot (\vec A^{(RT)}_{f_i} + W\vec A^{(TR)}_{f_i}) \rangle$$
$$= \langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} + W^2\, \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} + 2W\, \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle,$$

which becomes

$$I^{(cb)}_{f_i} = \langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} \rangle + \langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle + 2W \langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle, \qquad (1.15d)$$


Interference Equation for the Ideal Michelson Interferometer· 1.5

where we have recognized that W² = 1 because W = ±1. Since both disturbances have the same frequency $f_i$, Eq. (1.12j) cannot be used to say that $\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle$ is zero. Substituting from

(1.15a) and (1.15b) gives

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle = \langle\, U_{f_i}^2 \cos(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_U^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)})$$
$$+\; V_{f_i}^2 \cos(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_V^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}) \,\rangle,$$

or

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle = U_{f_i}^2 \langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) \rangle \qquad (1.15e)$$
$$+\; V_{f_i}^2 \langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}) \rangle.$$

Expanding the product of cosines in the first term,

$$\langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) \rangle$$
$$= \left\langle \tfrac12 \cos(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}) + \tfrac12 \cos(2\pi\sigma_{f_i}\chi) \right\rangle.$$

Applying (1.9b) and (1.9c), we get that

$$\langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) \rangle \qquad (1.15f)$$
$$= \tfrac12 \langle \cos(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}) \rangle + \tfrac12 \langle \cos(2\pi\sigma_{f_i}\chi) \rangle.$$

The time average of any time-independent quantity equals that quantity—that is, $\langle \cos(2\pi\sigma_{f_i}\chi) \rangle = \cos(2\pi\sigma_{f_i}\chi)$—while the time average of the oscillating term vanishes,

$$\langle \cos(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}) \rangle = 0. \qquad (1.15g)$$

Hence


$$\langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}) \rangle = \tfrac12 \cos(2\pi\sigma_{f_i}\chi). \qquad (1.15h)$$

Replacing $\delta_U^{(i)}$ by $\delta_V^{(i)}$ does not change the algebra used to derive (1.15h). It follows that

$$\langle \cos(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)})\cos(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}) \rangle = \tfrac12 \cos(2\pi\sigma_{f_i}\chi). \qquad (1.15i)$$

Substituting (1.15h) and (1.15i) into (1.15e) now gives

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle = \tfrac12 \left(U_{f_i}^2 + V_{f_i}^2\right)\cos(2\pi\sigma_{f_i}\chi), \qquad (1.15j)$$

so that Eq. (1.15d) becomes

$$I^{(cb)}_{f_i} = \langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} \rangle + \langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle + W\left(U_{f_i}^2 + V_{f_i}^2\right)\cos(2\pi\sigma_{f_i}\chi). \qquad (1.15k)$$

For an ideal Michelson interferometer, the intensity of the $f_i$th monochromatic wavetrain in the RT beam and the intensity of the $f_i$th monochromatic wavetrain in the TR beam must be identical because they arise in a symmetric way from the $f_i$th wavetrain of the white-light beam

entering the instrument. We can imagine taking out the moving mirror from its interferometer arm so that only the TR beam is reflected back to the beam splitter. This means that only the $\vec A^{(TR)}_{f_i}$ monochromatic disturbance leaves the interferometer in the proper direction, and its intensity is, of course, $\langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle$. Taking out the fixed mirror in the other arm and

replacing the moving mirror in the first arm ensures that only the RT beam reflects back to the

K K

beam splitter. Now j A(fiRT ) = A(fiRT ) is the intensity of the monochromatic disturbance leaving

the interferometer in the proper direction. Since we have just said that these two intensities must

be equal, it follows that

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} \rangle = \langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle. \qquad (1.16a)$$


Equation (1.10d) holds true for any monochromatic wavetrain $\vec A_f$ of frequency f, so it must apply to wavetrain $\vec A^{(TR)}_{f_i}$ of frequency $f_i$. Hence, Eq. (1.15a) must mean that

$$\langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle = \tfrac12 \left(U_{f_i}^2 + V_{f_i}^2\right). \qquad (1.16b)$$

Equation (1.10d) also applies to wavetrain $\vec A^{(RT)}_{f_i}$ of frequency $f_i$ in Eq. (1.15b), which similarly leads to

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} \rangle = \tfrac12 \left(U_{f_i}^2 + V_{f_i}^2\right). \qquad (1.16c)$$

The right-hand sides of (1.16b) and (1.16c) are the same, which makes sense since the left-hand

sides of (1.16b) and (1.16c) must satisfy Eq. (1.16a).

Again taking out the moving mirror, we note that then, in an ideal interferometer, one quarter of the entering beam's power ends up leaving the interferometer as beam TR traveling along the z axis in Fig. 1.16. Hence, if $I^{(0)}_{f_i}$ is the intensity of the $f_i$th monochromatic wavetrain entering this interferometer, we must have

$$\langle \vec A^{(TR)}_{f_i}\cdot\vec A^{(TR)}_{f_i} \rangle = \tfrac14 I^{(0)}_{f_i}. \qquad (1.17a)$$

Similarly,

$$\langle \vec A^{(RT)}_{f_i}\cdot\vec A^{(RT)}_{f_i} \rangle = \tfrac14 I^{(0)}_{f_i}, \qquad (1.17b)$$

so that, comparing these with (1.16b) and (1.16c),

$$I^{(0)}_{f_i} = 2\left(U_{f_i}^2 + V_{f_i}^2\right). \qquad (1.17c)$$

Substituting (1.17a)–(1.17c) into (1.15k) gives

$$I^{(cb)}_{f_i} = \tfrac12 I^{(0)}_{f_i} + \tfrac{W}{2}\, I^{(0)}_{f_i}\cos(2\pi\sigma_{f_i}\chi),$$

or

$$I^{(cb)}_{f_i} = \tfrac12 I^{(0)}_{f_i}\left[1 + W\cos(2\pi\sigma_{f_i}\chi)\right]. \qquad (1.17d)$$


Equation (1.17d) is the basic equation for the intensity of a monochromatic wavetrain leaving an ideal Michelson interferometer when the intensity of the corresponding wavetrain entering the interferometer is $I^{(0)}_{f_i}$ and the moving mirror is displaced from its ZPD position by a distance p = χ/2, as shown in Fig. 1.16. We note that for those values of χ = 2p where $W\cos(2\pi\sigma_{f_i}\chi) = 1$, the intensity of the $f_i$th monochromatic wavetrain leaving the interferometer is the same as the intensity of the $f_i$th monochromatic wavetrain entering the interferometer. This

corresponds to constructive interference of the $f_i$th monochromatic component of the RT and TR

beams. Suppose the beam entering the interferometer consists of just this one monochromatic

component. Glancing back at Fig. 1.1(b), we see that the power of the beam entering an ideal

Michelson interferometer can leave by either the combined RT and TR dotted beams or by the

two combined dash-dot beams traveling in the opposite direction to the incident beam. The dotted

beams are often called the balanced output of the interferometer, because each one has undergone

one transmission and one reflection at the beam splitter; similarly, the dash-dot beams are called

the unbalanced output, because one beam has undergone two reflections and the other beam has

undergone two transmissions. Conservation of energy requires that the power in all the

monochromatic beams leaving the ideal interferometer must equal the power in the one

monochromatic beam entering the interferometer. Hence, when constructive interference of the

balanced RT and TR beams makes their combined intensity equal to that of the beam entering the

interferometer, we know that destructive interference of the two unbalanced beams must make

their combined intensity equal to zero. Consequently, at each χ = 2p value where $W\cos(2\pi\sigma_{f_i}\chi) = 1$, not only is the intensity of the balanced monochromatic beams the same as that of the monochromatic beam entering the interferometer, but also the intensity of the unbalanced monochromatic beams is zero. On the other hand, for moving-mirror positions where χ = 2p has a value such that $W\cos(2\pi\sigma_{f_i}\chi) = -1$, the intensity of the combined monochromatic RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror

RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror

locations, the balanced output undergoes destructive interference. Conservation of energy then

requires the unbalanced output to undergo constructive interference and have the same intensity

as the monochromatic beam entering the interferometer.

This analysis can be generalized to any mirror position and value of χ = 2p. If $I^{(cu)}_{f_i}$ is the intensity of the unbalanced monochromatic wavetrain and, as before, $I^{(0)}_{f_i}$ and $I^{(cb)}_{f_i}$ are the intensities of the incident monochromatic wavetrain and balanced monochromatic wavetrain

respectively, then conservation of energy forces us to write

$$I^{(0)}_{f_i} = I^{(cb)}_{f_i} + I^{(cu)}_{f_i}. \qquad (1.18a)$$


Substituting from Eq. (1.17d),

$$I^{(0)}_{f_i} = \tfrac12 I^{(0)}_{f_i}\left[1 + W\cos(2\pi\sigma_{f_i}\chi)\right] + I^{(cu)}_{f_i},$$

which can be solved for $I^{(cu)}_{f_i}$ to get

$$I^{(cu)}_{f_i} = \tfrac12 I^{(0)}_{f_i}\left[1 - W\cos(2\pi\sigma_{f_i}\chi)\right]. \qquad (1.18b)$$

This specifies the intensity of the $f_i$th monochromatic wavetrain in the unbalanced output of an

ideal Michelson interferometer.
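As a numerical sanity check on Eqs. (1.17d), (1.18a), and (1.18b), the balanced and unbalanced outputs of an ideal interferometer must always sum to the input intensity. A minimal sketch in Python (the intensity and wavenumber values below are arbitrary, not taken from the text):

```python
import math

def balanced(I0, sigma, chi, W):
    # Eq. (1.17d): intensity of the balanced output
    return 0.5 * I0 * (1.0 + W * math.cos(2.0 * math.pi * sigma * chi))

def unbalanced(I0, sigma, chi, W):
    # Eq. (1.18b): intensity of the unbalanced output
    return 0.5 * I0 * (1.0 - W * math.cos(2.0 * math.pi * sigma * chi))

I0, sigma, W = 2.0, 1000.0, -1          # W = -1: Michelson-type beam splitter
for chi in [0.0, 1.3e-4, 2.5e-4]:       # a few optical-path differences
    Icb, Icu = balanced(I0, sigma, chi, W), unbalanced(I0, sigma, chi, W)
    assert abs(Icb + Icu - I0) < 1e-12  # conservation of energy, Eq. (1.18a)
```

Note that with W = −1 the balanced output is exactly zero at ZPD (χ = 0), matching the null at zero-path difference discussed above.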

The dashed lines in Fig. 1.17 show the positions of the moving mirror at which

$$\chi = \ldots,\ \frac{n}{\sigma_{f_i}},\ \frac{n+1}{\sigma_{f_i}},\ \frac{n+2}{\sigma_{f_i}},\ \ldots\,.$$

These are the positions where $I^{(cb)}_{f_i} = 0$ in Eq. (1.17d) when W = −1 for an interferometer using a Michelson-type beam splitter. Substituting from Eq. (1.7b), this can also be written as

$$\chi = \ldots,\ n\lambda_{f_i},\ (n+1)\lambda_{f_i},\ (n+2)\lambda_{f_i},\ \ldots\,,$$

where $\lambda_{f_i}$ is the wavelength of the $f_i$th monochromatic wavetrain. For beam splitters where

W = 1, of course, these dashed lines represent the moving-mirror positions at which $I^{(cb)}_{f_i} = I^{(0)}_{f_i}$. If

the moving mirror is slightly tilted, so that its surface crosses more than one dashed line, and the

beam entering the interferometer contains only the fith monochromatic wavetrain, then the

combined RT and TR beams leaving the interferometer have light and dark strips as the surface

of the tilted mirror crosses through those planes in space where an untilted mirror would produce

an all-bright or an all-dark balanced output. This connects Eq. (1.17d) to the bright and null

fringe patterns from a spectral line discussed in Sec. 1.4.

When a beam of white light passes through the interferometer—that is, a beam having many

different frequencies—the principle of independent superposition in Eq. (1.14f) requires the

intensity of the interferometer’s balanced output to be the sum of the intensities of each

monochromatic wavetrain,

$$I^{(cb)} = \sum_{i=1}^{N} I^{(cb)}_{f_i},$$

which becomes, after substituting from Eq. (1.17d),

$$I^{(cb)} = \frac12 \sum_{i=1}^{N} I^{(0)}_{f_i}\left[1 + W\cos(2\pi\sigma_{f_i}\chi)\right]. \qquad (1.19a)$$
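The sum in Eq. (1.19a) is straightforward to evaluate directly. A small sketch (the wavetrain intensities and wavenumbers are made up for illustration):

```python
import math

def I_cb(I0_list, sigma_list, chi, W=1):
    # Eq. (1.19a): balanced output for a beam of N monochromatic wavetrains
    return 0.5 * sum(I0 * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
                     for I0, s in zip(I0_list, sigma_list))

I0s    = [1.0, 0.5, 0.25]          # hypothetical wavetrain intensities
sigmas = [900.0, 1000.0, 1100.0]   # hypothetical wavenumbers
print(I_cb(I0s, sigmas, chi=0.0))  # prints 1.75: at ZPD every cosine is +1
```

At χ = 0 with W = 1 every wavetrain interferes constructively, so the balanced output carries the full entering intensity, here 1.75.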


FIGURE 1.17. Successive positions of the moving mirror at which the $f_i$th wavetrain forms fringes: the nth through (n + 3)rd crossings occur at χ = nλ_{f_i}, (n + 1)λ_{f_i}, (n + 2)λ_{f_i}, and (n + 3)λ_{f_i}; the distance between dashed lines is λ_{f_i}/2.


When describing natural sources of light, we often replace sums of discrete quantities with

integrals over continuous functions, and this transformation was perhaps even more characteristic

of late 19th-century science than it is of today’s physics. So it would be an automatic process for

Michelson and his contemporaries to define a spectral intensity function $I^{(0)}(f)$ to describe the radiation entering the instrument. When using this sort of mathematical formalism, we say that $I^{(0)}(f)\,df$ is the optical intensity of all the radiation having frequency values between f and f + df

entering the interferometer. The intensity of the balanced output is then

$$I^{(cb)} = \frac12 \int_0^{\infty} I^{(0)}(f)\left[1 + W\cos(2\pi\sigma_f\chi)\right] df. \qquad (1.19b)$$

The physical meaning of Eq. (1.19b) is exactly the same as Eq. (1.19a); we have just replaced $I^{(0)}_{f_i}$ by $I^{(0)}(f)\,df$ and changed the sum to an integral. We have also relied on variable f itself

instead of index i to label the different frequencies. To make this last tactic work, we just assume

that $I^{(0)}(f)$ is zero for those frequencies f that are not part of the original sum over i; this also lets us specify the integral to be over all possible frequencies f between 0 and ∞. The wavenumber $\sigma_f$ can be eliminated by substituting from the formula for f in (1.7d) to get

$$I^{(cb)} = \frac12 \int_0^{\infty} I^{(0)}(f)\left[1 + W\cos\!\left(\frac{2\pi f}{c}\,\chi\right)\right] df. \qquad (1.19c)$$

The only problem with this equation is the unreasonably high numbers required to represent f

at optical frequencies—when going from one extreme to the other across the visible spectrum, for

example, frequency f changes from 4×1014 Hz to 7.5×1014 Hz (approximately). Consequently,

today’s Fourier spectroscopists often use Eq. (1.7d) to eliminate f rather than σ from Eq. (1.19b).

To do this, we differentiate both sides of (1.7d) to get

$$df = c\,d\sigma \quad\text{or}\quad d\sigma = \frac{1}{c}\,df,$$

and define

$$S(\sigma) = c\,I^{(0)}(c\sigma) \qquad (1.19d)$$

so that

$$S(\sigma)\,d\sigma = c\,I^{(0)}(c\sigma)\cdot\frac{1}{c}\,df$$

simplifies to

$$S(\sigma)\,d\sigma = I^{(0)}(c\sigma)\,df. \qquad (1.19e)$$


Substituting (1.19e) into (1.19b) then yields

$$I^{(cb)} = \frac12 \int_0^{\infty} S(\sigma)\left[1 + W\cos(2\pi\sigma\chi)\right] d\sigma. \qquad (1.19f)$$

To get the white-light intensity formulas for the unbalanced output, we can apply to the

unbalanced monochromatic formula the same analysis used on the balanced monochromatic

formula. Comparing the unbalanced formula (1.18b) to the balanced formula (1.17d), we see that

changing the sign of W is all that needs to be done to go from the balanced formula to the

unbalanced formula. Hence, when we apply to the unbalanced formula the same algebra used on

the balanced formula, we know that all the way through the derivation—and, of course, in the

final results—the only difference would be that W is replaced by −W. Consequently, we can write

down at once the unbalanced white-light formulas corresponding to (1.19b), (1.19c), and (1.19f)

as

$$I^{(cu)} = \frac12 \int_0^{\infty} I^{(0)}(f)\left[1 - W\cos(2\pi\sigma_f\chi)\right] df, \qquad (1.20a)$$

$$I^{(cu)} = \frac12 \int_0^{\infty} I^{(0)}(f)\left[1 - W\cos\!\left(\frac{2\pi f}{c}\,\chi\right)\right] df, \qquad (1.20b)$$

and

$$I^{(cu)} = \frac12 \int_0^{\infty} S(\sigma)\left[1 - W\cos(2\pi\sigma\chi)\right] d\sigma \qquad (1.20c)$$

respectively. Formulas (1.19b), (1.19c), and (1.19f) contain all the basic information needed to understand how Fourier-transform spectroscopy works, and they were derived here using only those facts that Michelson knew over 100 years ago about the nature of light. Unfortunately, they apply only to an ideal interferometer; not surprisingly, the 19th-century approach used to derive them is

difficult to adapt to the study of both the random and nonrandom errors present in even the most

accurate of today’s Michelson interferometers. For this reason, in Chapter 4 we return to basic

principles and rederive the formula for I(cb) starting from the modern form of Maxwell’s

equations, this time being careful to include all the nonideal terms needed for the error analysis.

Formula (1.19f) is, however, already good enough—if we borrow several mathematical results

from Chapter 2—to explain why the fringes from even the thinnest of spectral lines discussed in

Sec. 1.4 must eventually fade away as χ = 2p increases.


Fringe Patterns of Finite-Width Spectral Lines· 1.6

Finite-width spectral lines, such as the one in the top graph of Fig. 1.18, can be represented by a

spectral intensity function I(0)(f). We can also follow the standard practice of Fourier

spectroscopists and represent the finite-width spectral line by the S(σ) function defined in Eq. (1.19d) and plotted in the bottom graph of Fig. 1.18. If the intensity of a spectral line is described by a narrow $I^{(0)}(f)$ function such as the one in the top graph of Fig. 1.18, which is significantly different from zero only between two very closely spaced frequencies f₁ and f₂, then the corresponding S(σ) curve is significantly different from zero only between the two closely spaced wavenumbers σ₁ = f₁/c and σ₂ = f₂/c, as shown in the bottom graph of Fig. 1.18.

The right-hand side of Eq. (1.19f) can be split up into the sum of a constant term and a term

that changes as the location coordinate p = χ/2 of the moving mirror changes,

$$I^{(cb)} = \frac12 \int_0^{\infty} S(\sigma)\,d\sigma + \frac{W}{2} \int_0^{\infty} S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma. \qquad (1.21a)$$

Since σ ≥ 0 in the integrals over dσ, nothing stops us from replacing S(σ) by S(|σ|) in the second term to get

$$I^{(cb)} = \frac12 \int_0^{\infty} S(\sigma)\,d\sigma + \frac{W}{2} \int_0^{\infty} S(|\sigma|)\cos(2\pi\sigma\chi)\,d\sigma. \qquad (1.21b)$$

Anticipating some of the Fourier material in Chapter 2, we note that, according to Eq. (2.11a) in Chapter 2, function S(|σ|) is even because

$$S(|-\sigma|) = S(|\sigma|),$$

and, of course, it is real because it represents a real physical quantity—the intensity of the

spectral line. Turning next to Eq. (2.34g) in Chapter 2, we see that because S(|σ|) is a real and

even function, the cosine integral on the right-hand side of Eq. (1.21b) is one half of the Fourier

transform of S [if we specify that parameter ı in (1.21b) corresponds to variable t in (2.34g) and

that parameter Ȥ in (1.21b) corresponds to variable f in (2.34g)]. Anticipating the material in

Chapter 2 one last time, we consult Eq. (2.35k) and note that if the nth derivative of S has a well-

defined Fourier transform, then for large values of its argument the Fourier transform of S

approaches zero as the nth power of the absolute value of its argument. Since S describes a

spectral line—that is, a natural phenomenon—we expect it to have derivatives of all orders and

also expect those derivatives to have Fourier transforms. The argument of the Fourier transform

of S is Ȥ, and we already know that the right-hand side of (1.21b) is half the Fourier transform of

S, so we can now conclude that


$$\frac{W}{2} \int_0^{\infty} S(|\sigma|)\cos(2\pi\sigma\chi)\,d\sigma = O\!\left(|\chi|^{-n}\right), \qquad (1.21c)$$

so that Eq. (1.21b) shows

$$I^{(cb)} = \frac12 \int_0^{\infty} S(\sigma)\,d\sigma + O\!\left(|\chi|^{-n}\right) \qquad (1.21d)$$

for large values of χ. Hence, as the moving mirror gets further and further from its ZPD location, increasing the value of χ = 2p, the value of $I^{(cb)}$ eventually stops changing and approaches the

constant value

$$\lim_{\chi\to\infty} I^{(cb)} = \frac12 \int_0^{\infty} S(\sigma)\,d\sigma. \qquad (1.21e)$$

This happens for all types of intensity curves, not just those associated with spectral lines. If S

does represent a spectral line such as the one in Fig. 1.18, the brights and nulls associated with

the dashed lines in Fig. 1.17 eventually fade away. Consequently, no matter how the moving

mirror is tilted, no fringes can be seen. If the Michelson interferometer is being used as a ruler,

the fringe counting must stop. When the spectral line is a closely spaced multiplet, each line in

the group has a finite spectral width, ensuring that—no matter how the lines interact with each

other to form bright and dim regions in the overall fringe pattern—eventually any and all fringe

traces must disappear. Every spectral line found in nature produces light having some finite

spectral width, no matter how small, so this sort of fade-out is a universal phenomenon.
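The fade-out is easy to demonstrate numerically. The sketch below evaluates Eq. (1.19f) for a hypothetical Gaussian line (the center wavenumber, width, and grid are all invented for illustration) and confirms that the fringe term has effectively vanished once χ is large:

```python
import numpy as np

# Hypothetical Gaussian spectral line centered at 1000 (wavenumber units)
sigma = np.linspace(990.0, 1010.0, 4001)  # wavenumber grid
S = np.exp(-0.5 * (sigma - 1000.0) ** 2)  # S(sigma), unit width, arbitrary scale
dsig = sigma[1] - sigma[0]
W = 1

def I_cb(chi):
    # Eq. (1.19f) approximated by a Riemann sum over the wavenumber grid
    return 0.5 * np.sum(S * (1.0 + W * np.cos(2.0 * np.pi * sigma * chi))) * dsig

const = 0.5 * np.sum(S) * dsig        # the limiting value of Eq. (1.21e)
near_zpd = abs(I_cb(1e-4) - const)    # fringe term close to ZPD
far_away = abs(I_cb(5.0) - const)     # fringe term far from ZPD
assert near_zpd > 100.0 * far_away    # the fringes have faded away
```

The wider the line, the faster the cosine integral decays, which is the quantitative content of the fade-out argument above.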

In Michelson’s time there was no easy way to measure the intensity of the exit beam leaving the

interferometer, so it was not practical to measure the change in $I^{(cb)}$ as a function of χ = 2p in order to determine the χ-dependent curve,

$$\int_0^{\infty} S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma,$$

coming from the second term on the right-hand side of Eq. (1.21a). In the previous section we

found that this curve is half the Fourier transform of S. This means that if the curve could be


Fourier-Transform Spectrometers · 1.7

FIGURE 1.18. Top: the spectral intensity $I^{(0)}(f)$ of a finite-width line plotted against frequency f, significantly different from zero only between f₁ and f₂. Bottom: the corresponding function S(σ) = cI^{(0)}(cσ) plotted against wavenumber σ, nonzero only between σ₁ = f₁/c and σ₂ = f₂/c.


measured, then the Fourier transform could be reversed to get the shape of the S spectrum

entering the interferometer. In the 1950s, both optical detectors to measure I(cb) and digital

computers to reverse the Fourier transform became widely available. Spectroscopists began to

design and build spectrometers based on measuring I(cb) as a function of Ȥ and then reversing the

Fourier transform to find S. Today, these sorts of instruments are usually called Fourier-transform

spectrometers.
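A toy version of this measure-then-invert procedure can be written in a few lines. Everything below—the Gaussian line shape, the grids, the sample counts—is invented for illustration; real instruments use FFTs and careful sampling:

```python
import numpy as np

# Hypothetical spectral line: a Gaussian at sigma0 = 10 (arbitrary units)
sigma0, width = 10.0, 0.5
sig = np.linspace(0.0, 20.0, 2001)
S = np.exp(-0.5 * ((sig - sigma0) / width) ** 2)
dsig = sig[1] - sig[0]

# "Measure" the chi-dependent fringe term of Eq. (1.21a) with W = 1
chi = np.linspace(0.0, 4.0, 2001)
dchi = chi[1] - chi[0]
F = np.array([np.sum(S * np.cos(2 * np.pi * sig * x)) * dsig for x in chi])

# Invert: a cosine transform of F(chi) peaks at the line's wavenumber
S_rec = np.array([np.sum(F * np.cos(2 * np.pi * s * chi)) * dchi for s in sig])
assert abs(sig[np.argmax(S_rec)] - sigma0) < 0.1   # line recovered at sigma0
```

The recovered curve is proportional to S(σ) smeared by the finite range of χ values sampled, which is why larger χ values yield more detailed spectra.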

Equation (1.21a) is an idealized form of the fundamental equation of Fourier-transform spectroscopy. It describes the intensity of the beam leaving an interferometer whenever we

1) Divide the entering beam into two secondary beams, and

2) Recombine the two secondary beams after the wavefield of one is shifted a distance χ with respect to the wavefield of the other.

Although this is exactly what happens inside a standard Michelson interferometer, Figs. 1.19(a)–

1.19(d) show that there are many other combinations of beam splitters and mirrors that divide and

recombine beams in this way.18

Figure 1.19(a) shows the first and perhaps most obvious modification. Michelson put the arms

of his interferometer at right angles to maximize the fringe shift due to the ether wind thought to

exist by 19th-century scientists. If all that is desired, however, is to divide and recombine beams,

then the two arms can be at any (reasonable) angle with respect to each other, as shown in Fig.

1.19(a). The setup in Fig. 1.19(a) may in fact have some advantages over the standard Michelson

interferometer; arranging for near-normal reflections off the beam splitter usually modifies the

polarization of the wavefields less than large-angle reflections (see Sec. 4.4 of Chapter 4 for an

explanation of polarization).

Figure 1.19(b) shows that the end mirrors can be replaced by retroreflectors like corner cubes

or cat’s-eyes. For best results, both arms should have the same type of retroreflector.

The discussion following Eq. (1.17d) above explains the difference between the balanced and

unbalanced optical outputs leaving the standard Michelson interferometer. In Figs. 1.19(a) and

1.19(b), the unbalanced output cannot be detected because it goes back out along the entrance

beam, making it impossible to separate the two. The interferometer in Fig. 1.19(c), however,

shows that there are ways to keep the entrance beam separate from the unbalanced output, giving

us access to both the balanced and unbalanced optical signals. According to Eqs. (1.19f) and

(1.20c), if $I^{(cb)}$ is the intensity of the balanced output and $I^{(cu)}$ is the intensity of the unbalanced output, then

$$I^{(cb)} - I^{(cu)} = W \int_0^{\infty} S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma \qquad (1.22a)$$

and

18

To keep things simple, compensation plates and other secondary optical components have been omitted.


$$I^{(cb)} + I^{(cu)} = \int_0^{\infty} S(\sigma)\,d\sigma. \qquad (1.22b)$$

Equation (1.22a) shows that subtracting the output of the detectors measuring the balanced and

unbalanced signals eliminates the constant term and doubles the size of the signal component

containing the Fourier transform. Adding the detectors’ outputs in Eq. (1.22b) eliminates the

Fourier transform, producing the integrated spectral intensity of the entrance beam. This integrated source intensity should, of course, remain constant during a spectral measurement; monitoring it matters because Fourier-transform spectrometers are vulnerable to source fluctuations. Astronomers often

design their Fourier-transform spectrometers so that both the balanced and unbalanced outputs

are available. When they investigate the spectra of weak and fluctuating sources (such as

twinkling stars), these instruments allow them both to double the signal from—and to check the

constancy of—the radiances being measured. If the source fluctuates, formula (1.22b) can be

used to measure the fluctuation. Sometimes this allows the astronomer to rescale the Fourier

signal in (1.22a) to correct the spectral measurement.
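For a single monochromatic wavetrain, the subtraction and addition in Eqs. (1.22a) and (1.22b) reduce to the monochromatic formulas (1.17d) and (1.18b), which makes the double-detector bookkeeping easy to sketch (all numbers below are arbitrary):

```python
import math

def outputs(I0, sigma, chi, W):
    # Monochromatic balanced and unbalanced intensities,
    # Eqs. (1.17d) and (1.18b)
    fringe = W * math.cos(2.0 * math.pi * sigma * chi)
    return 0.5 * I0 * (1.0 + fringe), 0.5 * I0 * (1.0 - fringe)

I0, sigma, W = 3.0, 250.0, 1
for chi in [0.0, 7.3e-4, 1.9e-3]:
    Icb, Icu = outputs(I0, sigma, chi, W)
    # Difference isolates the fringe (Fourier) term, as in Eq. (1.22a):
    assert abs((Icb - Icu) - I0 * W * math.cos(2.0 * math.pi * sigma * chi)) < 1e-12
    # Sum recovers the constant source intensity, as in Eq. (1.22b):
    assert abs((Icb + Icu) - I0) < 1e-12
```

In an instrument, the sum channel monitors source constancy while the difference channel doubles the Fourier signal, exactly as described above.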

In a standard Michelson interferometer such as the one shown in Fig. 1.1(b), and in the setups shown in Figs. 1.19(a)–1.19(c), the wavefield of one recombining beam is displaced a distance χ with respect to the wavefield of the other whenever the moving mirror or corner cube is displaced from ZPD by a distance χ/2. In Fig. 1.19(d), however, the corner cube only has to move a

distance Ȥ/4 to displace one wavefield by Ȥ with respect to the other. Equation (5.67) in Chapter 5

shows that larger values of Ȥ lead to more detailed spectral measurements in standard Michelson

interferometers, and the same holds true for the nonstandard interferometers discussed here. In

particular, a setup such as the one shown in Fig. 1.19(d) lets us achieve larger Ȥ values with

smaller displacements of the corner cube. The moving corner cube is also, strictly speaking, no

longer the retroreflector; plane mirrors in both arms are used to reverse the beam directions.

During the 1950s, it was established that Fourier-transform spectrometers had two basic

advantages—often called the Jacquinot advantage and the Fellgett advantage—over contemporary

types of prism-based and grating-based spectrometers.19 These advantages revealed that under

many circumstances spectra measured by Fourier-transform spectrometers had a better signal-to-

noise ratio than equivalent prism-based or grating-based instruments. With the popularization of

the fast Fourier transform (FFT) algorithms in the 1960s, Fourier-transform spectrometers soon

established themselves as usually the first and best choice for measuring infrared spectra

(electromagnetic radiation having wavelengths between 1 and 100 ȝm). The growing availability

of personal and desktop computers in the late 1970s and 1980s made Fourier-transform systems

more compact, powerful, and user-friendly. Over the past two decades, there has been a tendency to use standard Michelson configurations, such as those in Figs. 1.1(b) or 1.19(a), when

19

J. Chamberlain, The Principles of Interferometric Spectroscopy, p. 16.


FIGURE 1.19(a). A Michelson interferometer with its two arms at an oblique angle: the entrance beam strikes the beam splitter and travels to a fixed mirror and a moving mirror (displaced by p = χ/2); the balanced output goes to the signal detector.

FIGURE 1.19(b). The same layout with the end mirrors replaced by retroreflectors: a fixed corner cube and a moving corner cube (displaced by p = χ/2).

FIGURE 1.19(c). A corner-cube arrangement that keeps the entrance beam separate from the unbalanced output, so that both the balanced and the unbalanced signal detectors can be used.


FIGURE 1.19(d). A folded arrangement in which the moving corner cube need only be displaced by p = χ/4: the entrance beam passes through the beam splitter to the corner cube and a fixed mirror before recombining.

designing the optics of Fourier-transform spectrometers. Standard Michelsons are well suited to

the laser-based servo controls often used to maintain the alignment of the fixed and moving

mirrors.

Today’s Fourier-transform spectrometers often rely on laser-based servo systems to maintain

alignment and control the motion of the moving mirror. The average wavelength of the measured

spectra determines the standards of alignment and control required for good spectral


measurement. Systems designed to measure infrared spectra typically have lasers that work in the

visible. Not only do modest standards of alignment and control in the visible correspond to

extremely accurate standards of alignment and control in the infrared—because visible

wavelengths are much shorter than infrared wavelengths—but the infrared detectors responsible

for the spectral measurements are also easily shielded from stray laser light. The laser servo

systems follow many different designs. Figures 1.20(a) and 1.20(b) show a typical setup that may

not be exactly like any system now in use but that does present the basic ideas behind them.

In Fig. 1.20(a), a single laser beam is separated into beams A, B, and C by laser-beam

splitters. Separating one beam into three ensures that all three beams have the same wavelength.

The three beams enter the interferometer parallel to, and at the edges of, the entrance beam.

Figure 1.20(b) shows the path of beams A and B through the instrument; beam C is not shown

because it is out of the plane of the page, but it is assumed to follow a path similar to beams A

and B. The solid lines representing the laser beams are always parallel to the dotted lines showing

the path of the entrance beam through the interferometer; and the laser beams interact with the

interferometer’s beam splitter, fixed mirror, and moving mirror exactly the same way the

entrance beam does. Because all three laser beams are monochromatic wavetrains of wavelength

λ, the same reasoning used to produce Fig. 1.17 shows that we can draw a sequence of dashed

lines perpendicular to the laser beams to represent the moving-mirror positions where the laser

beams would form fringes. Just like in Fig. 1.17, each dashed line is separated from its two

nearest neighbors by λ/2. Taking the dashed lines to represent nulls, we note that if the moving

mirror has a slight tilt, as shown in Fig. 1.20(b), then the laser detector for beam B will see a near

null in the beam B fringe while the laser detector for beam A will see a near bright in the beam A

fringe. If the moving mirror is aligned in the plane of Fig. 1.20(b) but has a small out-of-plane

tilt, then the laser detector for beam C is sure to see a different fringe brightness than the laser

detectors for beams A and B. The three laser detectors send their signals to a servomechanism

that readjusts the mirror tilt until all three detectors see the same fringe intensity, keeping the

interferometer aligned while the moving mirror changes position. Often these servomechanisms

readjust the tilt of the fixed mirror instead of directly correcting the moving mirror’s tilt. It is not

difficult to design systems of this sort that can detect changes of λ/100 in the position of the

moving-mirror’s surface. The A, B, and C laser detectors can also be used to count fringes as the

moving mirror changes position, keeping a record of where the moving mirror is and how fast it

is moving. This information is almost always used to sample the interferometer’s output signal at

equally spaced positions of the moving mirror, and it is often sent to a servomechanism

responsible for producing steady motion in the moving mirror.
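The fringe-counting arithmetic itself is elementary. Assuming a helium–neon metrology laser at 632.8 nm—a typical choice, though no particular wavelength is specified here—each fringe crossing marks a mirror displacement of λ/2 and an OPD step of λ:

```python
LAMBDA = 632.8e-9  # assumed HeNe metrology-laser wavelength in metres

def mirror_position(fringe_count):
    # Successive fringes are lambda/2 apart in mirror position (Fig. 1.17)
    return fringe_count * LAMBDA / 2.0

def opd(fringe_count):
    # The optical-path difference is twice the mirror displacement: chi = 2p
    return 2.0 * mirror_position(fringe_count)

# 1000 fringe crossings correspond to about 0.316 mm of mirror travel
print(mirror_position(1000), opd(1000))
```

Counting fringes this way yields the equally spaced sampling positions mentioned above.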

___________

Chapters 2 and 3 spell out the mathematical ideas needed to analyze the performance of

Fourier-transform spectrometers, and they also establish the notation used to describe these ideas

in subsequent chapters. Readers who are already familiar with Fourier theory and random


Laser-Based Control Systems · 1.8

functions can skip ahead to Chapter 4, returning to Chapters 2 and 3 as needed to refresh their

understanding. Chapter 4 starts with Maxwell’s equations, working with them to derive the

nonideal versions of Eq. (1.19f) and (1.20c) needed to understand both the nonrandom and

random sources of error in Fourier-transform spectrometers. We always assume a standard

Michelson configuration, such as the ones shown in Fig. 1.1(b) or 1.19(a), controlled by laser-

based metrology and alignment systems similar to the ones shown in Figs. 1.20(a) and 1.20(b).

These are arguably the most common type of Fourier-transform spectrometer in use today. Most

of the basic ideas applied here to these standard Michelson systems are also relevant to other

types of Fourier-transform spectrometers; anyone who reads and understands the analysis

presented in Chapters 4 through 8 will be able to modify the equations presented there so that

they apply to nonstandard Michelson configurations. One possible exception to this rule is Michelsons such as the one shown in Fig. 1.19(b) that use nonstandard retroreflectors to return

the split entrance beam to the beam splitter. These sorts of systems, which are outside the scope

of this book, are spared many forms of the “tilt” misalignment possible in a standard Michelson,

which is an advantage, but on the other hand exhibit shear types of misalignments, which

standard Michelsons do not have. The equations governing shear misalignment turn out to be

similar to those for tilt misalignment, but it does not necessarily make sense to analyze them as a

source of random error, the way tilt is analyzed in Chapter 7.


FIGURE 1.20(a). A single laser beam is separated by laser-beam splitters into beams A, B, and C, which enter the interferometer parallel to, and at the edges of, the entrance beam before reaching the interferometer beam splitter.


FIGURE 1.20(b). The paths of laser beams A and B through the interferometer: the beams pass the laser-beam splitters and the interferometer beam splitter, reflect off the fixed and moving mirrors, and travel to laser detectors A and B, while the entrance beam goes to the infrared detector. (Beam C lies out of the plane of the page.)


2

FOURIER THEORY

Many single-chapter introductions to Fourier theory follow a top-down approach, defining what a

Fourier transform is and then listing the mathematical consequences. Here, on the other hand, we

begin with more of a bottom-up approach, seeking not only to present the mathematical

formalism of Fourier transforms but also to give an intuitive feel for how they work and what

they mean. Once the basic idea is established, we need to know which data sequences and

functions have well-defined Fourier transforms. This topic is often scanted because Fourier

theory is notorious for providing no simple mathematical answers to this simple mathematical

question. Indeed, engineers, scientists, and applied mathematicians have a long tradition of using

Fourier transforms in mathematically improper—yet extremely useful—ways that usually give

the correct answer. To show why these techniques work, and also when they cannot be trusted,

there is a brief sketch of generalized function theory. This is followed by a discussion of the

Fourier series and the discrete Fourier transform, including an exact description of how they are

connected to the integral Fourier transform. The discrete Fourier transform is particularly

important because, almost without exception, the only type of Fourier transform calculated on

today’s computers is the discrete Fourier transform; without it, the Michelson interferometer

would be a much more limited instrument. The chapter then concludes with a brief discussion of

how Fourier transforms are applied to two-dimensional and three-dimensional functions.

The idea of a Fourier transform develops naturally from a simple idea for comparing the shape of

two sequences of measurements. A sequence of measurements is really just a list of numbers, so

when we compare sequences of measurements we compare the shapes of number lists graphed in

the order of their measurement. We can suppose without any loss of generality that two lists, $u_k$
and $v_k$, have the same number of members, with $k = 1, 2, \ldots, N$. Figures 2.1(a) and 2.1(b) show

two lists $u_k$ and $v_k$ graphed against their index value k. Defining $\bar u$ and $\bar v$ to be the mean values
of $u_k$ and $v_k$,

$$\bar u = \frac{1}{N}\sum_{k=1}^{N} u_k \qquad (2.1a)$$

and

$$\bar v = \frac{1}{N}\sum_{k=1}^{N} v_k\,, \qquad (2.1b)$$


Basic Concept of a Fourier Transform · 2.1

FIGURE 2.1(a). [Figure: list $u_k$ graphed against increasing index k.]

FIGURE 2.1(b). [Figure: list $v_k$ graphed against increasing index k.]


2 · Fourier Theory

we form the sum S of the products of the differences from the mean,

$$S = \sum_{k=1}^{N} (u_k - \bar u)(v_k - \bar v)\,. \qquad (2.2)$$

If the graphs of $u_k$ and $v_k$ have similar shapes, so that $u_k - \bar u \propto v_k - \bar v$ for most values of k,
then $u_k - \bar u$ and $v_k - \bar v$ are very likely to have the same sign for most values of k. This means
few terms in the sum are negative and S ends up being a large positive number. If $u_k$ and $v_k$ have
little similarity in shape, then $u_k - \bar u$ and $v_k - \bar v$ are as likely to have opposite signs as the
same sign and the terms in the sum are just as likely to be positive as they are to be negative.

When this happens, S is a sum of terms that tend to cancel out, and the magnitude of S is likely to

be small.
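The behavior of the similarity sum is easy to see numerically. The sketch below (plain Python; the sample lists are invented for illustration and are not data from the text) shows S coming out large for two lists with the same shape and nearly canceling for unrelated shapes.

```python
import math

# Similarity sum S of Eq. (2.2): sum over k of (u_k - ubar)*(v_k - vbar).
def similarity(u, v):
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    return sum((uk - ubar) * (vk - vbar) for uk, vk in zip(u, v))

# Invented sample lists: one full period of a sine, sampled 100 times.
t = [k / 100 for k in range(100)]
wave = [math.sin(2 * math.pi * x) for x in t]
same_shape = [2 * math.sin(2 * math.pi * x) for x in t]        # scaled copy
different = [math.sin(2 * math.pi * 7 * x + 1.3) for x in t]   # unrelated wave

print(similarity(wave, same_shape))  # large and positive
print(similarity(wave, different))   # near zero: the terms cancel
```

The second sum is tiny because, just as the text argues, the products are as often negative as positive.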

The same basic idea can be applied to continuous functions u(t) and v(t). To create a formal

correspondence between functions and lists, we define an interval $\Delta t$ in t and match $u_k$ and $v_k$ to
u(t) and v(t) with the equations

$$u(k\,\Delta t) = \frac{u_k}{\Delta t}$$

and

$$v(k\,\Delta t) = v_k\,.$$

Because u and v are continuous functions of time, we can assume that they vary in an

unsurprising manner between the isolated points at $t = \Delta t, 2\Delta t, \ldots, N\Delta t$ at which they have been

specified. Traditionally, the argument of functions u and v is called t and assumed to be time, but

it is worth remembering that t can stand for any relevant physical parameter, such as length,

voltage, current, etc. Now we can approximate Eq. (2.2) as

$$S \cong \int_{\Delta t}^{N\Delta t} \big[u(t) - \bar u\big]\big[v(t) - \bar v\big]\,dt\,, \qquad (2.3a)$$

where now

$$\bar u = \frac{1}{N\,\Delta t}\int_{\Delta t}^{N\Delta t} u(t)\,dt \qquad (2.3b)$$

and

$$\bar v = \frac{1}{N\,\Delta t}\int_{\Delta t}^{N\Delta t} v(t)\,dt\,. \qquad (2.3c)$$

Equations (2.3b) and (2.3c) just ensure that $\bar u$ and $\bar v$ are now the average values of u(t) and


v(t) respectively. We note that the value of $\bar u$ has been redefined from what it was in Eq. (2.1a)
above,

$$\bar u_{\,new} = \bar u_{\,old}/\Delta t\,,$$

whereas $\bar v$ has basically the same value as in Eq. (2.1b)—the only change is to replace the sum
by the equivalent integral. At this point, the finite value of $\Delta t$ is just a distraction, because it is the
shapes of the continuous functions u(t) and v(t) that are being compared. Taking the limit as
$\Delta t \to 0$ and $N \to \infty$ in such a way that

$$N\,\Delta t = T_{max} \qquad (2.4a)$$

remains constant, we get

$$S = \int_{0}^{T_{max}} \big[u(t) - \bar u\big]\big[v(t) - \bar v\big]\,dt\,, \qquad (2.4b)$$

where

$$\bar u = \frac{1}{T_{max}}\int_{0}^{T_{max}} u(t)\,dt \qquad (2.4c)$$

and

$$\bar v = \frac{1}{T_{max}}\int_{0}^{T_{max}} v(t)\,dt\,. \qquad (2.4d)$$

We still expect S to be large when functions u and v have similar shapes and S to be small when

they have dissimilar shapes.

Equation (2.4b) can be written as

$$\begin{aligned}
S &= \int_{0}^{T_{max}} \big[u(t)-\bar u\big]\,v(t)\,dt - \bar v\cdot\left[\int_{0}^{T_{max}} u(t)\,dt - \bar u\cdot T_{max}\right] \\[4pt]
  &= \int_{0}^{T_{max}} u(t)\,v(t)\,dt - \bar u\int_{0}^{T_{max}} v(t)\,dt - \bar v\cdot\left[\int_{0}^{T_{max}} u(t)\,dt - \bar u\cdot T_{max}\right] \\[4pt]
  &= \int_{0}^{T_{max}} u(t)\,v(t)\,dt - \bar u\,\bar v\,T_{max}\,,
\end{aligned} \qquad (2.5)$$

where in the last step (2.4c) ensures that the term in the square brackets [ ] is zero and (2.4d) is


used to replace the integral over v by $\bar v\,T_{max}$. To get to Fourier theory from Eq. (2.5), we suppose
v(t) to be an oscillatory function like $\sin(2\pi ft)$ or $\cos(2\pi ft)$ with $f > 0$. This makes function u
the data—that is, the value of our measurement at time t is u(t). Equation (2.4d) then reveals,
depending on whether we choose v to be a sine curve or a cosine curve, that

$$\bar v\,T_{max} = \int_{0}^{T_{max}}\sin(2\pi ft)\,dt = \frac{1}{2\pi f}\cdot\big[1 - \cos(2\pi f\,T_{max})\big] \qquad (2.6a)$$

or

$$\bar v\,T_{max} = \int_{0}^{T_{max}}\cos(2\pi ft)\,dt = \frac{1}{2\pi f}\cdot\sin(2\pi f\,T_{max})\,. \qquad (2.6b)$$

When v is a sine curve, $\bar v\,T_{max}$ oscillates between $1/(\pi f)$ and 0 as $T_{max}$ increases; and when v
is a cosine curve, $\bar v\,T_{max}$ oscillates between $-1/(2\pi f)$ and $1/(2\pi f)$ as $T_{max}$ increases. Keeping in
mind that u(t) represents a function measured in a laboratory, if we want to compare the shape of
u to either $\sin(2\pi ft)$ or $\cos(2\pi ft)$, common sense requires $T_{max}$, the range of t over which data is
gathered, to be much greater than 1/ƒ, the period of the sine or cosine curve to which we want to
compare the data.

compare the data. Unless u entirely lacks a resemblance to the sine or cosine so that

Tmax

³ u (t )v(t )dt
0

0

Tmax

³ u (t )v(t )dt

0

to be large when the u measurements are large, and small when the u measurements are small—

and the integral’s magnitude should also increase as Tmax increases. So when u represents a

typical set of data that is not completely unlike v in shape, then

Tmax

0

max )

or


$$\frac{1}{\bar u}\int_{0}^{T_{max}} u(t)\,v(t)\,dt = O(T_{max})\,.$$

Equations (2.6a) and (2.6b) show that $\bar v\,T_{max}$ must remain somewhere between the two values
$-1/(2\pi f)$ and $1/(\pi f)$ no matter how large $T_{max}$ gets, which means

$$\bar v\,T_{max} = O(f^{-1})\,.$$

Having already concluded that $T_{max}$ has been chosen much larger than 1/ƒ, we expect

$$\frac{1}{\bar u}\int_{0}^{T_{max}} u(t)\,v(t)\,dt = O(T_{max}) \gg O(f^{-1}) = \bar v\,T_{max}\,,$$

so that

$$\frac{1}{\bar u}\int_{0}^{T_{max}} u(t)\,v(t)\,dt \gg \bar v\,T_{max}\,.$$

Equation (2.5) can then be approximated as

$$S = \bar u\cdot\left[\frac{1}{\bar u}\int_{0}^{T_{max}} u(t)\,v(t)\,dt - \bar v\,T_{max}\right] \cong \int_{0}^{T_{max}} u(t)\,v(t)\,dt\,. \qquad (2.7)$$

The integral in (2.7) can be regarded as assigning the number S to the similarity in shape of u and

v, when v is a sine or cosine curve of frequency ƒ. Remembering where S came from, we realize

that this number is large when u and v have similar shapes and small when u and v have

dissimilar shapes.
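The approximation in Eq. (2.7) can be checked directly. In the sketch below (plain Python; the test signal u(t) = 1 + sin(2πft) is our invention), Tmax covers 100 periods of v, and the similarity sum S computed from its definition nearly equals the plain integral of u·v, exactly as Eq. (2.7) predicts.

```python
import math

# Checking Eq. (2.7): when Tmax >> 1/f, the correction term ubar*vbar*Tmax
# is negligible and S is nearly the plain integral of u(t)*v(t).
f = 5.0
Tmax = 20.0                      # 100 periods of v(t) = sin(2*pi*f*t)
N = 20000
dt = Tmax / N
ts = [(k + 0.5) * dt for k in range(N)]
u = [1.0 + math.sin(2 * math.pi * f * t) for t in ts]   # invented data
v = [math.sin(2 * math.pi * f * t) for t in ts]

ubar = sum(u) * dt / Tmax        # Eq. (2.4c)
vbar = sum(v) * dt / Tmax        # Eq. (2.4d)
S = sum((a - ubar) * (b - vbar) for a, b in zip(u, v)) * dt   # Eq. (2.4b)
uv_integral = sum(a * b for a, b in zip(u, v)) * dt

print(S, uv_integral)            # nearly equal; both close to Tmax/2 = 10
```

Because v completes an integer number of periods, its mean is essentially zero and the two numbers agree to machine precision.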

To make the ideas of the previous section mathematically rigorous, we define the Fourier sine

transform of function u to be

$$\mathfrak{S}^{(ft)}u(t) = 2\int_{0}^{\infty} u(t)\sin(2\pi ft)\,dt \qquad (2.8a)$$


and the Fourier cosine transform of function u to be

$$\mathfrak{C}^{(ft)}u(t) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt\,. \qquad (2.8b)$$

The notation $\mathfrak{S}^{(ft)}u(t)$ and $\mathfrak{C}^{(ft)}u(t)$ shows that the function u(t) is being multiplied by,
respectively, the sine or cosine function having—as indicated by the superscript—an argument ft
multiplied by $2\pi$. The order of the ft product in the superscript does not matter because it does
not matter in the arguments of the sine and cosine, so

$$\mathfrak{S}^{(ft)}u(t) = \mathfrak{S}^{(tf)}u(t) \quad\text{and}\quad \mathfrak{C}^{(ft)}u(t) = \mathfrak{C}^{(tf)}u(t)\,.$$

In particular we know, because t is repeated in both u(t) and the superscript of $\mathfrak{S}$ and $\mathfrak{C}$, that t is
the dummy variable of integration whereas ƒ, which is only contained in the superscript, is an
independent parameter. This means the transforms $\mathfrak{S}^{(ft)}u(t)$ and $\mathfrak{C}^{(ft)}u(t)$ are themselves

functions of the parameter ƒ,

$$U_{\mathfrak{S}}(f) = 2\int_{0}^{\infty} u(t)\sin(2\pi ft)\,dt \qquad (2.8c)$$

and

$$U_{\mathfrak{C}}(f) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt\,. \qquad (2.8d)$$

The “capital U” names of functions $U_{\mathfrak{S}}$ and $U_{\mathfrak{C}}$ show that they are mathematically associated
with the original function u(t), created from u(t) by the integrals in (2.8c) and (2.8d).

Although the upper limit of integration is now $\infty$ in Eqs. (2.8a) and (2.8b), this should not be
interpreted as taking the limit as $T_{max} \to \infty$ in Eq. (2.7). The upper limit is put at $\infty$ just to
eliminate $T_{max}$ as an explicit parameter, and the idea behind the presence of $T_{max}$—that u(t)
represents the result of a measurement—is kept alive by placing restrictions on the type of
function u can be. In particular, we expect u(t), in some sense, to diminish or get small as t gets
large, because it is impossible to measure data for all the times t out to $\infty$. It turns out that when
the right sorts of restrictions are placed on u, the Fourier sine and cosine transforms can be
inverted to recover the original functions,

$$u(t) = 2\int_{0}^{\infty} U_{\mathfrak{S}}(f)\sin(2\pi ft)\,df \qquad (2.8e)$$


Fourier Sine and Cosine Transforms · 2.2

and

$$u(t) = 2\int_{0}^{\infty} U_{\mathfrak{C}}(f)\cos(2\pi ft)\,df \qquad (2.8f)$$

for $t > 0$.
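A quick numerical illustration of the transform pair (2.8d)/(2.8f): for the test function u(t) = e⁻ᵗ (our own example, chosen because it clearly satisfies the requirements discussed below), the cosine transform has the closed form U(f) = 2/[1 + (2πf)²], and applying the same recipe to that transform recovers e⁻ᵗ. The infinite upper limits are truncated, so the agreement is only approximate.

```python
import math

# Midpoint-rule version of the cosine transform, Eqs. (2.8d) and (2.8f):
#   transform(g, y) ~= 2 * integral from 0 to `upper` of g(x)*cos(2*pi*x*y) dx
def cosine_transform(g, y, upper=40.0, n=100000):
    dx = upper / n
    return 2.0 * sum(g((k + 0.5) * dx) * math.cos(2 * math.pi * (k + 0.5) * dx * y)
                     for k in range(n)) * dx

u = lambda t: math.exp(-t)                            # invented test function
Uc_exact = lambda f: 2.0 / (1.0 + (2 * math.pi * f) ** 2)

f0 = 0.3
print(cosine_transform(u, f0), Uc_exact(f0))          # forward: Eq. (2.8d)

t0 = 0.7
print(cosine_transform(Uc_exact, t0), math.exp(-t0))  # inverse: Eq. (2.8f)
```

The same quadrature routine serves both directions because the forward transform and its inverse have identical form.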

If we adopt the strictest definition of what is meant by the integral of a function between 0 and
$\infty$, then Eqs. (2.8a)–(2.8f) are true when function u(t) satisfies the following four requirements:

(I) It is absolutely integrable, $\displaystyle\int_0^\infty |u(t)|\,dt < \infty$.

(II) It is continuous except for a finite number of jump discontinuities.

(III) It is bounded on any finite interval $0 < a \le t \le b < \infty$.

(IV) It has finite variation on any finite interval $0 < a \le t \le b < \infty$.

We now show why function u(t) naturally satisfies all these restrictions when it represents a

(possibly idealized) measurement controlled or described by a continuous parameter t.

No matter what the argument t of function u represents—time, voltage, energy, etc.—function

u(t) can only be measured over a finite range of t. Although there may be no reason to think u is

zero or negligible when measured outside this range, we obviously cannot “make up” values for

what it might be. If we extrapolate to get the unmeasured t values, the extrapolation should not

dominate the information contained in u. In general, the measurement should be carried out in

such a way that the unmeasured or extrapolated values are of negligible importance compared to

the measured values. Mathematically we might say that there exists a positive, finite value of t,

which we call $T_{max}$, such that the important measured values of u are all at $t \le T_{max}$. One way of

expressing this constraint is to require

$$\int_{0}^{T_{max}} |u(t)|\,dt \cong \int_{0}^{\infty} |u(t)|\,dt\,. \qquad (2.9a)$$

Since the left-hand integral ought to be finite, when (2.9a) is true, it follows that

$$\int_{0}^{\infty} |u(t)|\,dt < \infty\,. \qquad (2.9b)$$

Functions u that satisfy (2.9b) are said to be absolutely integrable; clearly, all functions

representing possible measurements share this quality, satisfying requirement (I) above.

Understanding requirement (II) requires some discussion of what it means to call an

experimental measurement continuous. To assign, with negligible experimental error, a definite

value of t to a measurement u, some minimum and finite change in t must occur between adjacent

measurements. In practice, continuous measurements are constructed by connecting sequences of


adjacent but separate points. We then assume that if u were measured between these already

known points, it would equal (to within experimental error) the values selected by connecting the

points. Thus, the continuity of u is a requirement that the measurement captures all the relevant

detail. In this sense, asserting that u is continuous is a type of idealization—just another way of

saying that the measurement is accurate and representative. This takes care of the first part of

requirement (II), but there is a second part permitting u to have a finite number of jump

discontinuities. Figure 2.2 shows a jump discontinuity in u(t). Jump discontinuities represent

another type of idealization—what can occur when, for example, instruments are turned on or off

during a measurement. Because it is unrealistic to have this happen an infinite number of times

over a finite range of t, it makes sense to say that all functions u representing measurements are

continuous over any finite range of t except for a finite number of jump discontinuities.

Consequently, we can expect all functions representing measurements to satisfy requirement (II).

Standard proofs that the Fourier transform of the Fourier transform returns the original

function u usually end up showing as their final step that

$$2\int_{0}^{\infty} U_{\mathfrak{S}}(f)\sin(2\pi ft)\,df = \lim_{\epsilon\to 0^+}\frac{1}{2}\big[u(t+\epsilon) + u(t-\epsilon)\big] \qquad (2.9c)$$

and

$$2\int_{0}^{\infty} U_{\mathfrak{C}}(f)\cos(2\pi ft)\,df = \lim_{\epsilon\to 0^+}\frac{1}{2}\big[u(t+\epsilon) + u(t-\epsilon)\big]\,. \qquad (2.9d)$$

When u is continuous, this immediately reduces to the desired result, but when the integrals are

evaluated at a jump discontinuity, such as at $t = t_0$ in Fig. 2.2, the limits on the right-hand side of

(2.9c) and (2.9d) give u a value at the jump discontinuity that is probably different from the

original value of u at the jump discontinuity. To keep this from happening, we define the value of

u to be, for all values $t = t_{jump}$ marking the location of a jump discontinuity,

$$u(t_{jump}) = \lim_{\epsilon\to 0^+}\frac{1}{2}\big[u(t_{jump}+\epsilon) + u(t_{jump}-\epsilon)\big]\,. \qquad (2.9e)$$

Modifying u this way cannot change the value of any integral whose integrand is the product of u

with another smooth function. The sine and cosine are smooth functions, so using (2.9e) to

modify the value of u at jump discontinuities does not change the values of the sine or cosine

transforms.

Measurements must be done with physically realizable equipment, which necessarily

produces finite values of u. This means there always exists a finite real number $B < \infty$ such that


FIGURE 2.2. [Figure: a function u(t) with a jump discontinuity at $t = t_0$.]

$$|u(t)| \le B \qquad (2.9f)$$

on any finite interval $0 < a \le t \le b < \infty$ when function u represents a measurement. Functions

obeying this inequality are called bounded functions, so functions representing measurements

always satisfy requirement (III).

Requirement (IV) is a little bit more complicated to explain. Any function u(t) can be written

as the difference of two other functions $u_1(t)$ and $u_2(t)$, as shown in Figs. 2.3(a) and 2.3(b),

$$u(t) = u_1(t) - u_2(t)\,. \qquad (2.9g)$$

In Fig. 2.3(a), function u is drawn with a continuous line where it is increasing and with a dashed

line where it is decreasing. In Fig. 2.3(b), we see that functions u1 and u2 are constructed so that

every time u increases, u1 also increases while u2 remains the same, and every time u decreases,

$u_2$ increases while $u_1$ remains the same. Consequently, for any function u and time values $b \ge a$,
the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ are non-negative and can only increase, which
means that their sum

$$V_a^b(u) = \big[u_1(b) - u_1(a)\big] + \big[u_2(b) - u_2(a)\big] \qquad (2.9h)$$


FIGURE 2.3(a). [Figure: a function u(t) over $a \le t \le b$, drawn with a continuous line where it increases and a dashed line where it decreases, with turning points at $t_1$, $t_2$, $t_3$.]

FIGURE 2.3(b). [Figure: the nondecreasing functions $u_1(t)$ and $u_2(t)$ constructed from u(t) over the same interval.]


is also non-negative. Functions u1 and u2 have been constructed so that every time u goes up and

down, the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ increase, making the size of $V_a^b(u)$ a
record of how many times u oscillates in the interval $a \le t \le b$. We define $V_a^b(u)$ to be the
variation of u over the interval $a \le t \le b$, and if

$$V_a^b(u) < \infty\,, \qquad (2.9i)$$

then u is said to have finite variation over $a \le t \le b$. Requirement (IV), that u have finite
variation in any interval $0 < a \le t \le b < \infty$, means that u can only oscillate a finite number of
times in that interval. The function $\sin\!\big((t-1)^{-1}\big)$, for example, does not have finite variation over
any interval containing $t = 1$. If we attempted to measure a quantity that had infinite variation

inside a finite interval, we would be blocked by the realization, already discussed above in

connection with requirement (II), that adjacent measurements must be separated by some

minimum value of t. If the measurement were repeated over and over, it would seem as if u were

changing unpredictably in the region of infinite variation, leading us to wonder whether our

measurement reflected the same physical reality. Therefore, our measurements cannot have

infinite variation, and so any function u(t) representing a realistic measurement must also satisfy

requirement (IV).
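For sampled data the construction of Eqs. (2.9g)–(2.9i) can be carried out directly: the variation is just the total rise plus the total fall between consecutive samples. A minimal sketch (the sample lists are invented):

```python
# Total variation of a sampled function, following Eqs. (2.9g)-(2.9h):
# u1 accumulates the rises of u and u2 accumulates the falls, and
# V = [u1(b) - u1(a)] + [u2(b) - u2(a)] is their combined growth.
def variation(samples):
    rises = sum(max(b - a, 0.0) for a, b in zip(samples, samples[1:]))
    falls = sum(max(a - b, 0.0) for a, b in zip(samples, samples[1:]))
    return rises + falls

print(variation([0.0, 2.0, 1.0, 3.0]))   # rises 2 + 2 and fall 1 give 5.0
print(variation([1.0, 1.0, 1.0]))        # a constant never oscillates: 0.0
```

A function that oscillates wildly between samples would show ever-larger values of this sum as the sampling is refined, which is exactly the infinite-variation pathology the text describes.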

We see that requirements (I) through (IV) are always satisfied by functions representing

physically realizable measurements. It should be emphasized that requirements (I) through (IV)

are sufficient to ensure that Eqs. (2.8a)–(2.8f) hold true, but not necessary. It is easy to show that

there exist functions that do not meet requirements (I) through (IV) yet still satisfy Eqs. (2.8a)–

(2.8f). Consider, for example,

$$g(t) = \begin{cases} \pi & \text{for } 0 \le t < \dfrac{1}{2\pi}\\[6pt] \pi/2 & \text{for } t = \dfrac{1}{2\pi}\\[6pt] 0 & \text{for } t > \dfrac{1}{2\pi}\,. \end{cases} \qquad (2.10a)$$

This test function clearly satisfies (I) through (IV) and so must have a Fourier cosine transform,

$$G_{\mathfrak{C}}(f) = 2\pi\int_{0}^{1/(2\pi)}\cos(2\pi ft)\,dt = \frac{\sin(f)}{f}\,, \qquad (2.10b)$$

such that we return to the original function g by taking the cosine transform of the $G_{\mathfrak{C}}$ transform,


$$g(t) = 2\int_{0}^{\infty} G_{\mathfrak{C}}(f)\cos(2\pi ft)\,df = 2\int_{0}^{\infty}\frac{\sin(f)}{f}\,\cos(2\pi ft)\,df\,. \qquad (2.10c)$$

Now consider the function

$$h(t) = \frac{\sin(t)}{t}\,,$$

whose Fourier cosine transform is

$$H_{\mathfrak{C}}(f) = 2\int_{0}^{\infty}\frac{\sin(t)}{t}\,\cos(2\pi ft)\,dt\,. \qquad (2.10d)$$

The integral in (2.10d) is clearly the same as the first integral in (2.10c) with the variables ƒ and t

interchanged. Therefore,

$$H_{\mathfrak{C}}(f) = g(f) = \begin{cases} \pi & \text{for } 0 \le f < \dfrac{1}{2\pi}\\[6pt] \pi/2 & \text{for } f = \dfrac{1}{2\pi}\\[6pt] 0 & \text{for } f > \dfrac{1}{2\pi}\,. \end{cases}$$

Hence we know that h(t) satisfies Eqs. (2.8b), (2.8d), and (2.8f)—it is both cosine transformable

and its cosine transform returns the original function when cosine transformed—exactly because

g(t) in (2.10a) satisfies Eqs. (2.8b), (2.8d), and (2.8f). Yet h(t), unlike g(t), does not satisfy

requirements (I) through (IV)—in particular, it violates requirement (I) because it is not

absolutely integrable. To see that this is true, note that

$$\int_{0}^{\infty}\frac{|\sin(t)|}{t}\,dt = \sum_{j=1}^{\infty}\int_{(j-1)\pi}^{j\pi}\frac{|\sin(t)|}{t}\,dt \;\ge\; \sum_{j=1}^{\infty}\frac{1}{j\pi}\int_{(j-1)\pi}^{j\pi}|\sin(t)|\,dt = \frac{2}{\pi}\sum_{j=1}^{\infty}\frac{1}{j} \to \infty\,,$$

where the last step uses a well-known property of the harmonic series,

$$\sum_{j=1}^{\infty}\frac{1}{j}\,,$$

that it grows large without limit. This simple example also shows that just because a function g(t)

satisfies requirements (I) through (IV), so that the transform of the transform returns the original


function g(t), it does not necessarily follow that the transform itself satisfies requirements (I) through

(IV).
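The divergence of the integral of |sin t|/t can be watched numerically: summing the contributions of successive half-periods, the partial integrals keep growing—roughly like (2/π) times the logarithm of the number of half-periods—and never settle down. (The quadrature below is our own illustration, not a construction from the text.)

```python
import math

# Partial integrals of |sin(t)|/t over the first j_max half-periods,
# each half-period integrated by the midpoint rule; the running total
# grows without bound, like the harmonic series it is bounded below by.
def partial_integral(j_max, n=1000):
    total = 0.0
    for j in range(1, j_max + 1):
        a = (j - 1) * math.pi
        dt = math.pi / n
        total += sum(abs(math.sin(a + (k + 0.5) * dt)) / (a + (k + 0.5) * dt)
                     for k in range(n)) * dt
    return total

for j_max in (10, 100, 1000):
    print(j_max, partial_integral(j_max))   # keeps increasing
```

Each tenfold increase in the number of half-periods adds roughly (2/π) ln 10 ≈ 1.47 to the total, which is the numerical face of the harmonic-series argument above.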

Here is another example to show that, even though the transform of a function may exist, if

requirements (I) through (IV) are violated, then the transform of the transform does not

necessarily return the original function. We consider another test function,

$$z(t) = t^{-1}\,, \qquad (2.10e)$$

which is not absolutely integrable, because

$$\int_{0}^{\infty}\frac{dt}{t} = \lim_{\substack{\epsilon\to 0 \\ A\to\infty}}\int_{\epsilon}^{A}\frac{dt}{t} = \lim_{\substack{\epsilon\to 0 \\ A\to\infty}}\big[\ln A - \ln\epsilon\big] = \infty\,.$$

Its Fourier sine transform

$$Z_{\mathfrak{S}}(f) = 2\int_{0}^{\infty}\frac{\sin(2\pi ft)}{t}\,dt$$

nevertheless exists, taking the values

$$Z_{\mathfrak{S}}(f) = \begin{cases} 0 & \text{for } f = 0\\ \pi & \text{for } f > 0\,. \end{cases} \qquad (2.10f)$$

Therefore, the sine transform $Z_{\mathfrak{S}}$ of $z(t) = t^{-1}$ exists, yet the sine transform of the sine transform
does not return z:

$$2\pi\int_{0}^{\infty}\sin(2\pi ft)\,df = \lim_{F\to\infty}2\pi\int_{0}^{F}\sin(2\pi ft)\,df = \lim_{F\to\infty}\frac{1}{t}\big[1-\cos(2\pi Ft)\big] \ne \frac{1}{t}\,. \qquad (2.10g)$$
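Because the truncated integral in Eq. (2.10g) has a closed form at finite cutoff F, the failure is easy to display: at a fixed t, the truncated transform keeps swinging between 0 and 2/t instead of converging to 1/t. (The sample values of t and F below are our own.)

```python
import math

# Truncated sine transform of Z_S(f) = pi, from Eq. (2.10g):
#   2*pi * integral from 0 to F of sin(2*pi*f*t) df = (1/t)*(1 - cos(2*pi*F*t))
def truncated_transform(t, F):
    return (1.0 / t) * (1.0 - math.cos(2 * math.pi * F * t))

t0 = 0.25                       # the "expected" limit would be 1/t0 = 4.0
for F in (10.0, 10.5, 11.0, 11.5):
    print(F, truncated_transform(t0, F))   # swings between 0 and 2/t0 = 8
```

No matter how large F becomes, the oscillating cosine term never dies away, which is the concrete meaning of the limit in (2.10g) failing to exist.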

Clearly, if a function violates requirements (I) through (IV) yet has a well-defined sine or

cosine transform, the sine transform of the sine transform and the cosine transform of the cosine

transform must be checked explicitly to confirm that the original function is returned. The only

exception is when the transform itself satisfies (I) through (IV) even though the original test

function does not. Because we could just as easily have started with the transform itself instead of

the original test function, we can conclude that the transform of the transform of the original

function must return the original function. In general, repeatedly applying the sine or cosine


transform just takes us back and forth between the same two functions, and the transformations

are mathematically justified whenever at least one of those functions satisfies requirements (I)

through (IV).

Fourier transform theory can be extended to include functions that are evaluated for negative as

well as positive values of their arguments. To assist our analysis of these extended transforms, we

decide to classify u as an even, odd, or mixed function. An even function u satisfies the constraint

$$u(-t) = u(t) \qquad (2.11a)$$

for all values of t, negative as well as positive; an odd function satisfies the constraint

$$u(-t) = -u(t) \qquad (2.11b)$$

for all values of t, negative as well as positive; and a mixed function is partly even and partly odd

in the sense that it is the sum of an even function and an odd function, neither of which is

identically zero. Any function u(t)—whether even, odd, or mixed—can be written as the sum of

two functions, $u_e$ and $u_o$, with $u_e$ being an even function obeying (2.11a) and $u_o$ being an odd
function obeying (2.11b),

$$u(t) = u_e(t) + u_o(t)\,, \qquad (2.11c)$$

where

$$u_e(t) = \frac{1}{2}\big[u(t) + u(-t)\big] \qquad (2.11d)$$

and

$$u_o(t) = \frac{1}{2}\big[u(t) - u(-t)\big]\,. \qquad (2.11e)$$

Clearly,

$$u_e(-t) = \frac{1}{2}\big[u(-t) + u(t)\big] = \frac{1}{2}\big[u(t) + u(-t)\big] = u_e(t)$$

and

$$u_o(-t) = \frac{1}{2}\big[u(-t) - u(t)\big] = -\frac{1}{2}\big[u(t) - u(-t)\big] = -u_o(t)\,.$$

If u starts off as an even function, then $u = u_e$, and $u_o$ is identically zero; if u starts off as an odd
function, then $u = u_o$, and $u_e$ is identically zero; and if u starts off as a mixed function, then


Even, Odd, and Mixed Functions · 2.3

neither $u_e$ nor $u_o$ is identically zero. If u is identically zero, it can be regarded as either even or

odd, according to the classifier’s convenience.
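The decomposition of Eqs. (2.11c)–(2.11e) is one line of code per part. Using u(t) = eᵗ as an invented mixed test function, its even part is cosh t and its odd part is sinh t:

```python
import math

# Even/odd split of Eqs. (2.11d) and (2.11e); u(t) = exp(t) is a mixed
# function whose even part is cosh(t) and whose odd part is sinh(t).
u = math.exp
ue = lambda t: 0.5 * (u(t) + u(-t))      # Eq. (2.11d)
uo = lambda t: 0.5 * (u(t) - u(-t))      # Eq. (2.11e)

t = 0.8
print(ue(t) - ue(-t))         # 0.0: the even part satisfies Eq. (2.11a)
print(uo(t) + uo(-t))         # 0.0: the odd part satisfies Eq. (2.11b)
print(ue(t) + uo(t) - u(t))   # 0.0: the parts reassemble u, Eq. (2.11c)
```

The same two lambdas split any function handed to them, which is the content of the claim that every function is the sum of an even and an odd part.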

Figures 2.4(a) and 2.4(b) graph examples of even and odd functions respectively, and Fig.

2.4(c) shows a mixed function that is split up into its even and odd parts. We note that $\cos(2\pi ft)$
is an even function of both ƒ and t and $\sin(2\pi ft)$ is an odd function of both ƒ and t. One point
worth remembering is that the behavior of even and odd functions is severely constrained near
$t = 0$. For any odd function at $t = 0$, we have

$$u(0) = -u(0)$$

from Eq. (2.11b). Since the only number equal to its own negative value is zero, all odd functions

u(t) that have a well-defined value at $t = 0$ must be zero at $t = 0$,

$$u(0) = 0\,. \qquad (2.12a)$$

Because $u(-t) = u(t)$ for even functions, when t is near zero the value of u (if u is continuous) is
almost constant. Therefore, when t is exactly zero the derivative of any even function u(t), if it is
well defined, must be zero,

$$\frac{du}{dt}\bigg|_{t=0} = 0 \quad \text{if the derivative at zero exists and u is even.} \qquad (2.12b)$$

To see why, write the derivative of u as the symmetric limit

$$\frac{du}{dt} = \lim_{\epsilon\to 0}\left[\frac{u(t+\epsilon) - u(t-\epsilon)}{2\epsilon}\right]\,,$$

so that when u is even

$$\frac{du}{dt}\bigg|_{t=-t_0} = \lim_{\epsilon\to 0}\left[\frac{u(-t_0+\epsilon) - u(-t_0-\epsilon)}{2\epsilon}\right] = \lim_{\epsilon\to 0}\left[\frac{u(t_0-\epsilon) - u(t_0+\epsilon)}{2\epsilon}\right] = -\frac{du}{dt}\bigg|_{t=t_0}\,.$$

This shows that when u is even, the derivative of u is odd, and so from (2.12a), which states that

odd functions are zero when their argument is zero, we know that (2.12b) must be true. Similarly,

for any odd function u,


FIGURE 2.4(a). [Figure: an even function u(t).]

FIGURE 2.4(b). [Figure: an odd function u(t).]


FIGURE 2.4(c). [Figure: a mixed function u(t) together with its even part $u_e(t)$ and odd part $u_o(t)$, graphed for $-2 \le t \le 2$.]

$$\frac{du}{dt}\bigg|_{t=-t_0} = \lim_{\epsilon\to 0}\left[\frac{u(-t_0+\epsilon) - u(-t_0-\epsilon)}{2\epsilon}\right] = \lim_{\epsilon\to 0}\left[\frac{u(t_0+\epsilon) - u(t_0-\epsilon)}{2\epsilon}\right] = \frac{du}{dt}\bigg|_{t=t_0}\,,$$

showing that when u is odd, its derivative is even. The second derivative $d^2u/dt^2$ of an even
function u is the first derivative of $du/dt$, which is odd, and so $d^2u/dt^2$ must be even; similarly, the
third derivative $d^3u/dt^3$ is the first derivative of $d^2u/dt^2$, which is even, and so must be odd.

Examining in this fashion ever higher derivatives of the even function u, we conclude that


$$\frac{d^n u}{dt^n} = \begin{cases} \text{odd function for } n = 1, 3, 5, \ldots\\ \text{even function for } n = 2, 4, 6, \ldots \end{cases} \quad \text{when u is even.} \qquad (2.12c)$$

The same reasoning applied to the derivatives of an odd function u shows that

$$\frac{d^n u}{dt^n} = \begin{cases} \text{even function for } n = 1, 3, 5, \ldots\\ \text{odd function for } n = 2, 4, 6, \ldots \end{cases} \quad \text{when u is odd.} \qquad (2.12d)$$

Equation (2.12c) states that the odd-numbered derivatives of an even function are odd while the

even-numbered derivatives of an even function are even, and Eq. (2.12d) states that the odd-

numbered derivatives of an odd function are even while the even-numbered derivatives of an odd

function are odd. Therefore, an immediate consequence of (2.12a), (2.12c), and (2.12d) is that the

odd-numbered derivatives of an even function—if they exist and are well-defined—are zero at
$t = 0$, and the even-numbered derivatives of an odd function—if they exist and are well-defined—
are zero at $t = 0$.
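The parity rules (2.12a)–(2.12d) can be spot-checked with central differences, using cos (even) and sin (odd) as test functions of our choosing; comparing the difference quotient at t and at −t makes the claimed symmetries visible.

```python
import math

# Central-difference stand-in for the derivative; the pair cos (even) and
# sin (odd) illustrates the parity rules of Eqs. (2.12a)-(2.12d).
def ddt(func, t, eps=1e-6):
    return (func(t + eps) - func(t - eps)) / (2 * eps)

t = 0.9
print(ddt(math.cos, t) + ddt(math.cos, -t))  # ~0: derivative of even is odd
print(ddt(math.sin, t) - ddt(math.sin, -t))  # ~0: derivative of odd is even
print(ddt(math.cos, 0.0))                    # ~0: odd functions vanish at t = 0
```

The last line is Eq. (2.12b) in action: the derivative of the even function cos, being odd, is forced to zero at the origin.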

We can now extend the sine and cosine transforms to include functions u(t) evaluated for

negative as well as positive values of t while generalizing requirements (I) through (IV)

previously applied to u for $t > 0$ in Sec. 2.2. The extended requirements are

(V) Function u(t) must be absolutely integrable,

$$\int_{-\infty}^{\infty} |u(t)|\,dt < \infty\,. \qquad (2.13a)$$

(VI) Function u(t) must be continuous except for a finite number of jump discontinuities
over any finite interval $-\infty < a \le t \le b < \infty$.

(VII) There must exist a finite positive number B such that

$$|u(t)| \le B\,. \qquad (2.13b)$$

(VIII) The non-negative variation $V_a^b(u)$ of function u(t), as defined in Eqs. (2.9g) and (2.9h),
is finite over any finite interval $-\infty < a \le t \le b < \infty$,

$$V_a^b(u) < \infty\,. \qquad (2.13c)$$


Extended Sine and Cosine Transforms · 2.4

We also define the value of u at all its jump discontinuities to be given by Eq. (2.9e). These new

requirements are clearly just the old set of requirements extended to cover negative as well as

positive values of t.

The extended Fourier sine transform of u is

$$\mathfrak{S}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt\,, \qquad (2.14a)$$

and the extended Fourier cosine transform of u is

$$\mathfrak{C}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt\,. \qquad (2.14b)$$

Just like in Eqs. (2.8a) and (2.8b), defining the standard sine and cosine transforms, the order of

the ft product in the superscript does not matter:

$$\mathfrak{S}_E^{(ft)}u(t) = \mathfrak{S}_E^{(tf)}u(t) \quad\text{and}\quad \mathfrak{C}_E^{(ft)}u(t) = \mathfrak{C}_E^{(tf)}u(t)\,.$$

We can write u as the sum of even and odd functions, $u(t) = u_e(t) + u_o(t)$, as described in Eq.
(2.11c), and substitute this sum into the definitions of the extended sine and cosine transforms in
(2.14a) and (2.14b) to get

$$\mathfrak{S}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_e(t)\sin(2\pi ft)\,dt + \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt \qquad (2.15a)$$

and

$$\mathfrak{C}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt + \int_{-\infty}^{\infty} u_o(t)\cos(2\pi ft)\,dt\,. \qquad (2.15b)$$

We note that the product of an even function $u_e$ and the sine, as well as the product of an odd
function $u_o$ and the cosine, must be an odd function,

$$u_e(-t)\,\sin\!\big(2\pi f\cdot(-t)\big) = -u_e(t)\sin(2\pi ft) \qquad (2.16a)$$

and

$$u_o(-t)\,\cos\!\big(2\pi f\cdot(-t)\big) = -u_o(t)\cos(2\pi ft)\,. \qquad (2.16b)$$

The integral between $-\infty$ and $+\infty$ of any odd function o(t) can be thought of as the limit of
the sum of a large number of small terms,

$$\int_{-\infty}^{\infty} o(t)\,dt = \lim_{dt\to 0}\big[\cdots + o(-2\,dt)\cdot dt + o(-dt)\cdot dt + o(0)\cdot dt + o(dt)\cdot dt + o(2\,dt)\cdot dt + \cdots\big]\,.$$

Because o is odd, o(0) is zero; $o(-dt)\cdot dt = -o(dt)\cdot dt$ and cancels $o(dt)\cdot dt$;
$o(-2\,dt)\cdot dt = -o(2\,dt)\cdot dt$ and cancels $o(2\,dt)\cdot dt$; and so on. Therefore,$^{20}$

$$\int_{-\infty}^{\infty} o(t)\,dt = 0\,, \qquad (2.17)$$

and Eqs. (2.15a) and (2.15b) reduce to

$$\mathfrak{S}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt \qquad (2.18a)$$

and

$$\mathfrak{C}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt\,. \qquad (2.18b)$$

The integral between $-\infty$ and $+\infty$ of any even function e(t) can be written as the same sort of
limit,

$$\int_{-\infty}^{\infty} e(t)\,dt = \lim_{dt\to 0}\big[\cdots + e(-2\,dt)\cdot dt + e(-dt)\cdot dt + e(0)\cdot dt + e(dt)\cdot dt + e(2\,dt)\cdot dt + \cdots\big]\,.$$

Because e is even, $e(-dt) = e(dt)$, $e(-2\,dt) = e(2\,dt)$, and so on. Therefore, the integral over
negative t has the same value as the integral over positive t and we can write

$^{20}$ Strictly speaking, we are here treating the integral between $-\infty$ and $+\infty$ as a Cauchy principal value, a concept
introduced in Sec. 2.10 below.


$$\int_{-\infty}^{\infty} e(t)\,dt = 2\int_{0}^{\infty} e(t)\,dt\,. \qquad (2.19)$$

The product of $u_o$ and the sine, both of them odd functions, is an even function; and the product of $u_e$ and the cosine, both of them even functions, is another even function.

Consequently, the extended sine and cosine transforms in Eqs. (2.18a) and (2.18b) are, according

to (2.19), (2.8a), and (2.8b),

$$\mathfrak{S}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt = 2\int_{0}^{\infty} u_o(t)\sin(2\pi ft)\,dt = \mathfrak{S}^{(ft)}u_o(t) \qquad (2.21a)$$

and

$$\mathfrak{C}_E^{(ft)}u(t) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt = 2\int_{0}^{\infty} u_e(t)\cos(2\pi ft)\,dt = \mathfrak{C}^{(ft)}u_e(t)\,. \qquad (2.21b)$$

Equation (2.21a) shows that the extended sine transform of a function u(t) is the unextended sine

transform of $u_o$, the odd component of u; and Eq. (2.21b) shows that the extended cosine
transform of u(t) is the unextended cosine transform of $u_e$, the even component of u. Because the

result will be needed later, we also show that the extended sine transform defined in Eq. (2.14a)

is an odd function of ƒ,

$$\mathfrak{S}_E^{(-ft)}u(t) = \int_{-\infty}^{\infty} u(t)\sin(-2\pi ft)\,dt = -\int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt = -\mathfrak{S}_E^{(ft)}u(t)\,; \qquad (2.22a)$$

and a similar manipulation shows that the extended cosine transform defined in (2.14b) is an even

function of ƒ,

$$\mathfrak{C}_E^{(-ft)}u(t) = \int_{-\infty}^{\infty} u(t)\cos(-2\pi ft)\,dt = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt = \mathfrak{C}_E^{(ft)}u(t)\,. \qquad (2.22b)$$

We now examine what happens when the extended sine and cosine transforms are applied

twice to the same function. We define

$$U_{\mathfrak{S}E}(f) = \mathfrak{S}_E^{(ft)}u(t) = \mathfrak{S}^{(ft)}u_o(t) \qquad (2.23a)$$

and

$$U_{\mathfrak{C}E}(f) = \mathfrak{C}_E^{(ft)}u(t) = \mathfrak{C}^{(ft)}u_e(t)\,, \qquad (2.23b)$$

where the second step in Eqs. (2.23a) and (2.23b) comes from (2.21a) and (2.21b). Taking the

extended Fourier sine and cosine transforms of $U_{\mathfrak{S}E}$ and $U_{\mathfrak{C}E}$ respectively, we get

$$\mathfrak{S}_E^{(tf)}U_{\mathfrak{S}E}(f) = \mathfrak{S}_E^{(ft)}U_{\mathfrak{S}E}(f) = \int_{-\infty}^{\infty} U_{\mathfrak{S}E}(f)\sin(2\pi ft)\,df \qquad (2.24a)$$

and

$$\mathfrak{C}_E^{(tf)}U_{\mathfrak{C}E}(f) = \mathfrak{C}_E^{(ft)}U_{\mathfrak{C}E}(f) = \int_{-\infty}^{\infty} U_{\mathfrak{C}E}(f)\cos(2\pi ft)\,df\,. \qquad (2.24b)$$

The second step in (2.24a) and (2.24b) is there just to emphasize that we are allowed to change

the order of the ft product in the superscripts.

Equation (2.22a) shows that the extended sine transform $U_{\mathfrak{S}E}$ is an odd function of ƒ, so its
product with the sine is an even function of ƒ; and Eq. (2.22b) shows that the extended cosine
transform $U_{\mathfrak{C}E}$ is an even function of ƒ, so its product with the cosine is also an even function of
ƒ. Hence, according to (2.19), Eqs. (2.24a) and (2.24b) become

$$\mathfrak{S}_E^{(tf)}U_{\mathfrak{S}E}(f) = 2\int_{0}^{\infty} U_{\mathfrak{S}E}(f)\sin(2\pi ft)\,df \qquad (2.25a)$$

and

$$\mathfrak{C}_E^{(tf)}U_{\mathfrak{C}E}(f) = 2\int_{0}^{\infty} U_{\mathfrak{C}E}(f)\cos(2\pi ft)\,df\,. \qquad (2.25b)$$

But Eq. (2.23a) shows that $U_{\mathfrak{S}E}$ is also the unextended sine transform of $u_o$, so from (2.25a) we
see that

$$\mathfrak{S}_E^{(tf)}U_{\mathfrak{S}E}(f)$$

equals the unextended sine transform of the unextended sine transform of $u_o$, the odd component
of function u. According to Eqs. (2.8a), (2.8c), and (2.8e), the unextended sine transform of the
unextended sine transform returns the original function for positive values of t. This means that
the extended sine transform of the extended sine transform,


$$\mathfrak{S}_E^{(tf)}U_{\mathfrak{S}E}(f)\,,$$

which we have just seen to be equal to the unextended sine transform of the unextended sine

transform, must return $u_o$ for positive values of t. Consequently, for positive values of t, Eq.

(2.25a) becomes

$$\mathfrak{S}_E^{(tf)}U_{\mathfrak{S}E}(f) = 2\int_{0}^{\infty} U_{\mathfrak{S}E}(f)\sin(2\pi ft)\,df = u_o(t)\,. \qquad (2.26a)$$

Function $u_o$ is, however, defined for all values of t according to the rule for odd functions,
$u_o(-t) = -u_o(t)$, and the integral

$$2\int_{0}^{\infty} U_{\mathfrak{S}E}(f)\sin\!\big(2\pi f(-t)\big)\,df = -2\int_{0}^{\infty} U_{\mathfrak{S}E}(f)\sin(2\pi ft)\,df$$

changes sign just as $u_o$ does when t changes sign. Consequently, the integral exists and is well
defined for negative t whenever the integral exists and is well defined for positive t. We conclude
that Eq. (2.26a) holds true for negative as well as positive t. Hence, using Eq. (2.23a) to substitute
for $U_{\mathfrak{S}E}$ in Eq. (2.26a), we can write

$$\mathfrak{S}_E^{(tf)}\Big[\mathfrak{S}_E^{(ft')}u(t')\Big] = u_o(t)\,. \qquad (2.26b)$$

This shows that taking the extended sine transform of the extended sine transform returns the odd

component $u_o$ of function u for all values of t, both positive and negative. Switching now to the
extended cosine transform $U_{\mathfrak{C}E}$, we see that Eq. (2.23b) shows the extended cosine transform
$U_{\mathfrak{C}E}$ is also the unextended cosine transform of $u_e$, the even component of function u. From the
right-hand side of Eq. (2.25b), we then know that

CE ( tf ) U CE ( f )

is equal to the unextended cosine transform of the unextended cosine transform of ue . Equations

(2.8b), (2.8d), and (2.8f) show that the unextended cosine transform of the unextended cosine

transform returns the original function for positive values of t. Consequently, the extended cosine

-- 85

85 --

2 · Fourier Theory

CE ( tf ) U CE ( f ) ,

which we have just seen to be equal to the unextended cosine transform of the unextended cosine

transform of ue , must also equal ue for positive values of t. This means that Eq. (2.25b) becomes

(for positive values of t),

5

CE ( tf )

U CE ( f ) 2³ U CE ( f ) cos(2& ft )df ue (t ) . (2.26c)

0

But $u_e(t)$ is defined for negative as well as positive values of t according to the rule $u_e(-t) = u_e(t)$ for even functions of t, and the integral

$$2\int_0^\infty U_{CE}(f)\,\cos\big(2\pi f(-t)\big)\,df = 2\int_0^\infty U_{CE}(f)\,\cos(2\pi ft)\,df.$$

Consequently, the integral exists and is well defined for negative t if it exists and is well defined for positive t. We conclude that Eq. (2.26c) is valid for both negative and positive t and that, substituting Eq. (2.23b) into Eq. (2.26c),

$$\mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} = u_e(t). \qquad (2.26d)$$

This shows that taking the extended cosine transform of the extended cosine transform returns $u_e$, the even component of function u, for all values of t, both positive and negative. Equations (2.11d) and (2.11e), the original definitions of the even and odd components of a function u, show that Eqs. (2.26b) and (2.26d) can be written as

$$\mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\} = \frac{1}{2}\big[u(t) - u(-t)\big] \qquad (2.26e)$$

and

$$\mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} = \frac{1}{2}\big[u(t) + u(-t)\big]. \qquad (2.26f)$$

Adding together the extended sine transform of the extended sine transform and the extended cosine transform of the extended cosine transform then gives

$$\mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\} + \mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} = \frac{1}{2}\big[u(t) - u(-t)\big] + \frac{1}{2}\big[u(t) + u(-t)\big] = u(t). \qquad (2.26g)$$

We conclude that for any function u(t), the sum of the extended sine transform of the extended sine transform and the extended cosine transform of the extended cosine transform returns the original function.
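The reconstruction property just derived can be checked numerically. The sketch below is an illustration added to this edition of the text, not part of the original derivation: it discretizes the extended sine and cosine transforms with simple Riemann sums, using the arbitrary test function $u(t) = (1+t)e^{-t^2}$, whose odd and even components are $t\,e^{-t^2}$ and $e^{-t^2}$. The grid limits and spacing are ad-hoc choices, wide and fine enough that truncation and quadrature errors are negligible.

```python
import numpy as np

# Test function with both even and odd parts: u(t) = (1 + t) exp(-t^2)
t = np.linspace(-8.0, 8.0, 2001)
dt = t[1] - t[0]
f = t.copy()                  # reuse the same grid for the frequency variable
df = dt
u = (1.0 + t) * np.exp(-t**2)

# Extended sine and cosine transforms, Eqs. (2.14a) and (2.14b), by quadrature:
# U_SE(f) = integral of u(t) sin(2π f t) dt, U_CE(f) likewise with cos.
S = np.sin(2.0 * np.pi * np.outer(f, t))     # S[j, k] = sin(2π f_j t_k)
C = np.cos(2.0 * np.pi * np.outer(f, t))
U_SE = (S * u).sum(axis=1) * dt
U_CE = (C * u).sum(axis=1) * dt

# Transform of the transform: S.T and C.T are the same kernels with the
# roles of t and f interchanged.
SS = (S.T * U_SE).sum(axis=1) * df           # sine transform of U_SE
CC = (C.T * U_CE).sum(axis=1) * df           # cosine transform of U_CE

assert np.allclose(SS, t * np.exp(-t**2), atol=1e-6)   # odd part u_o(t)
assert np.allclose(CC, np.exp(-t**2), atol=1e-6)       # even part u_e(t)
assert np.allclose(SS + CC, u, atol=1e-6)              # Eq. (2.26g)
```

The two intermediate checks confirm (2.26b) and (2.26d) separately; their sum confirms (2.26g).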

One obvious way to proceed from this point is to define the Hartley transform

$$\mathcal{H}^{(ft)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt + \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt = \mathcal{C}_E^{(ft)}\{u(t)\} + \mathcal{S}_E^{(ft)}\{u(t)\} = U_{CE}(f) + U_{SE}(f), \qquad (2.26h)$$

where in the next-to-last step we use definitions (2.14a) and (2.14b) of the extended sine and cosine transforms and in the last step Eqs. (2.23a) and (2.23b) are used to write the extended sine and cosine transforms as functions of ƒ. The order of the ft product in the superscript is not important because, just as in the sine and cosine transforms, we have

$$\mathcal{H}^{(ft)}\{u(t)\} = \mathcal{H}^{(tf)}\{u(t)\}.$$

Working with this definition, we see that the Hartley transform of the Hartley transform gives

$$\mathcal{H}^{(tf)}\Big\{\mathcal{H}^{(ft')}\{u(t')\}\Big\} = \mathcal{H}^{(tf)}\big\{U_{CE}(f) + U_{SE}(f)\big\} = \int_{-\infty}^{\infty}\big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df. \qquad (2.26i)$$


According to Eqs. (2.22a) and (2.22b), the extended sine transform $U_{SE}$ is an odd function of ƒ and the extended cosine transform $U_{CE}$ is an even function of ƒ. Using the same reasoning as in Eqs. (2.16a) and (2.16b) above,

$$U_{CE}(-f)\,\sin\big(2\pi(-f)t\big) = U_{CE}(f)\,\big[-\sin(2\pi ft)\big] = -U_{CE}(f)\,\sin(2\pi ft)$$

and

$$U_{SE}(-f)\,\cos\big(2\pi(-f)t\big) = \big[-U_{SE}(f)\big]\cos(2\pi ft) = -U_{SE}(f)\,\cos(2\pi ft).$$

We see that $U_{CE}(f)\sin(2\pi ft)$ and $U_{SE}(f)\cos(2\pi ft)$ are both odd functions of ƒ, and Eq. (2.17) states that the integral between −∞ and +∞ of any odd function is zero. Therefore,

$$\int_{-\infty}^{\infty} U_{CE}(f)\,\sin(2\pi ft)\,df = \int_{-\infty}^{\infty} U_{SE}(f)\,\cos(2\pi ft)\,df = 0.$$

Now the Hartley transform of the Hartley transform in Eq. (2.26i) can be simplified to

$$\begin{aligned}
\mathcal{H}^{(tf)}\Big\{\mathcal{H}^{(ft')}\{u(t')\}\Big\}
&= \int_{-\infty}^{\infty}\big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df \\
&\qquad + \int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \mathcal{C}_E^{(tf)}\{U_{CE}(f)\} + \mathcal{S}_E^{(tf)}\{U_{SE}(f)\}.
\end{aligned}$$

Since $U_{CE}$ and $U_{SE}$ are the extended cosine and sine transforms of u [see Eqs. (2.23a) and (2.23b)], we have

$$\mathcal{H}^{(tf)}\Big\{\mathcal{H}^{(ft')}\{u(t')\}\Big\} = \mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} + \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\},$$


so that, by Eq. (2.26g),

$$\mathcal{H}^{(tf)}\Big\{\mathcal{H}^{(ft')}\{u(t')\}\Big\} = u(t). \qquad (2.26j)$$

We see that the Hartley transform of the Hartley transform returns the original function for both positive and negative values of t. The Hartley transform was never very popular and is only rarely encountered today. What is done instead, as we shall see in the next section, is to combine the extended sine and cosine transforms into a single Fourier transform based on a complex exponential.
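The involution property (2.26j) has a well-known discrete analogue that is easy to verify. The sketch below is an added illustration (not from the original text): it builds a discrete Hartley transform from NumPy's FFT—for a real sequence, the cas-kernel sum equals the real part minus the imaginary part of the DFT—and checks that applying it twice returns the original sequence up to a factor of N.

```python
import numpy as np

def dht(x):
    # Discrete Hartley transform: sum_n x[n] * cas(2π k n / N),
    # cas(θ) = cos(θ) + sin(θ).  For real x this is Re(FFT) - Im(FFT).
    X = np.fft.fft(x)
    return X.real - X.imag

rng = np.random.default_rng(0)
x = rng.standard_normal(64)

# The Hartley transform is (up to the grid size N) its own inverse:
y = dht(dht(x)) / len(x)
assert np.allclose(y, x)
```

The 1/N factor plays the role of the continuous transform's symmetric normalization; with it divided out, the transform of the transform is exactly the original sequence.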

2.5 Forward and Inverse Fourier Transforms

The Fourier transform is based on the well-known identity

$$e^{2\pi ift} = \cos(2\pi ft) + i\,\sin(2\pi ft), \qquad (2.27)$$

where $i = \sqrt{-1}$.

For any real function u(t) satisfying requirements (V) through (VIII) in Sec. 2.4, we can add the extended cosine transform to i times the extended sine transform to get

$$\mathcal{C}_E^{(ft)}\{u(t)\} + i\,\mathcal{S}_E^{(ft)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + i\sin(2\pi ft)\big]\,dt = \int_{-\infty}^{\infty} e^{2\pi ift}\,u(t)\,dt. \qquad (2.28a)$$

Since $\mathcal{C}_E^{(ft)}\{u(t)\} = U_{CE}(f)$ and $\mathcal{S}_E^{(ft)}\{u(t)\} = U_{SE}(f)$, this becomes

$$\int_{-\infty}^{\infty} e^{2\pi ift}\,u(t)\,dt = U_{CE}(f) + i\,U_{SE}(f). \qquad (2.28b)$$

Taking the extended sine transform of both sides of Eq. (2.28b) gives

$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{S}_E^{(tf)}\{U_{CE}(f)\} + i\,\mathcal{S}_E^{(tf)}\{U_{SE}(f)\} = i\int_{-\infty}^{\infty} U_{SE}(f)\,\sin(2\pi ft)\,df \qquad (2.28c)$$

because $U_{CE}(f)\sin(2\pi ft)$ is an odd function of ƒ and integrates to zero [see discussion after Eq. (2.26i) above]. Taking the extended cosine transform of both sides of Eq. (2.28b) gives

$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\{U_{CE}(f)\} + i\,\mathcal{C}_E^{(tf)}\{U_{SE}(f)\} = \int_{-\infty}^{\infty} U_{CE}(f)\,\cos(2\pi ft)\,df \qquad (2.28d)$$

because $U_{SE}(f)\cos(2\pi ft)$ is likewise an odd function of ƒ. Substituting Eqs. (2.24a) and (2.24b) into (2.28c) and (2.28d) gives

$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\{U_{SE}(f)\} \qquad (2.28e)$$

and

$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\{U_{CE}(f)\}. \qquad (2.28f)$$

Since $U_{SE}$ and $U_{CE}$ are themselves the extended sine and cosine transforms of u, Eqs. (2.28e) and (2.28f) can be written as

$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\} \qquad (2.28g)$$

and

$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\}. \qquad (2.28h)$$

We now multiply both sides of (2.28g) by (−i) and sum the resulting equation with Eq. (2.28h) to get

$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') - i\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} + \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\}$$

or, since $\cos(2\pi ft) - i\sin(2\pi ft) = e^{-2\pi ift}$,

$$\int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} + \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\}. \qquad (2.28i)$$

According to Eq. (2.26g), the right-hand side is just u(t), so

$$\int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = u(t). \qquad (2.28j)$$

5 5

If, in Eq. (2.28a), we start out by adding the extended cosine transform to (i ) times the extended

sine transform, then instead of Eqs. (2.28g) and (2.28h), we get [just replace i by (i )

everywhere]

5 5

³

5

df sin(2& ft ) ³ dt 3 e2& ift 3u (t 3) i A pE ( tf ) pE ( ft 3) u (t 3)

5

and

5 5

³ df cos(2& ft ) ³ dt 3 e

2& ift 3

u (t 3) CE ( tf ) CE ( ft 3) u (t 3) .

5 5

Now we must multiply the top equation by i before summing it with the bottom equation to get

5 5 5 5

2& ift 3

5 5 5 5

C E

( tf )

C E

( ft 3 )

u (t 3) pE (tf ) pE ( ft3) u (t 3)

or

5 5

³ df e ³ dt 3 e

2& ift 2& ift 3

u (t 3) u (t ) . (2.28k)

5 5

Clearly, Eqs. (2.28j) and (2.28k) are basically the same identity, which can be written as

5 5

³ df e ³ dt 3 e

92& ift B2& ift 3

u (t 3) u (t ) . (2.28 A )

5 5

As long as the exponent of e changes sign in the two integrals over ƒ and t, we get back the

original function. Looking at how Eqs. (2.28j) and (2.28k) are derived, we see that if the sign of

the exponent does not change, we get

$$\mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} - \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\}$$

instead of

$$\mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} + \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\}.$$

Equations (2.26e) and (2.26f) then show that

$$\mathcal{C}_E^{(tf)}\Big\{\mathcal{C}_E^{(ft')}\{u(t')\}\Big\} - \mathcal{S}_E^{(tf)}\Big\{\mathcal{S}_E^{(ft')}\{u(t')\}\Big\} = u(-t),$$

which gives

$$\int_{-\infty}^{\infty} df\,e^{\mp 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u(t') = u(-t). \qquad (2.28m)$$

This interesting result shows that when u is even, so that $u(-t) = u(t)$, we still get back the original function; and when u is odd, so that $u(-t) = -u(t)$, we just have to multiply by (−1) to retrieve u. Even when u is mixed, no information is lost; reversing the sign of the argument still gets us back to the original function. Replacing t by −t in (2.28m) takes us back to the original formula (2.28ℓ).
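The pair of identities just discussed has an exact finite-dimensional counterpart in the discrete Fourier transform. The added sketch below (my illustration, using NumPy's DFT conventions) checks both: opposite exponent signs recover the sequence, while repeating the same sign returns the index-reversed sequence, the discrete analogue of u(−t).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(32)

# Opposite signs in the two exponents [analogue of Eq. (2.28ℓ)]:
# the inverse DFT undoes the forward DFT.
assert np.allclose(np.fft.ifft(np.fft.fft(x)), x)

# Same sign in both exponents [analogue of Eq. (2.28m)]: applying the
# forward DFT twice returns x[(-n) mod N], scaled by N under NumPy's
# unnormalized-forward convention.
y = np.fft.fft(np.fft.fft(x)) / len(x)
x_reversed = np.roll(x[::-1], 1)      # x_reversed[m] = x[(-m) mod N]
assert np.allclose(y, x_reversed)
```

The modular index (−m) mod N is the periodic grid's version of the sign reversal t → −t.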

Up to this point, we have taken u to be real, but if Eq. (2.28ℓ) holds true when u is a real function of a real argument, it must also hold true when u is a complex function of a real argument. To show why this is so, we break a complex function u(t) of a real argument t into real and imaginary parts,

$$u(t) = u_r(t) + i\,u_i(t),$$

where $u_r$ and $u_i$ are both real functions of t. Substituting this complex-valued u(t) into the left-hand side of (2.28ℓ) gives

$$\int_{-\infty}^{\infty} df\,e^{\mp2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\pm2\pi ift'}\big[u_r(t') + i\,u_i(t')\big] = \int_{-\infty}^{\infty} df\,e^{\mp2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\pm2\pi ift'}u_r(t') + i\int_{-\infty}^{\infty} df\,e^{\mp2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\pm2\pi ift'}u_i(t').$$

Since (2.28ℓ) holds for the real functions $u_r$ and $u_i$, this last expression must be equal to the original complex function u,

$$u_r(t) + i\,u_i(t) = u(t),$$

showing that Eq. (2.28ℓ) is true for complex functions of t as well as strictly real functions of t. Similar reasoning shows that (2.28m) also holds true for complex functions of real variables. Indeed, we can even apply this analysis to the unextended sine and cosine transforms to show that the unextended sine transform of the unextended sine transform and the unextended cosine transform of the unextended cosine transform return the original function (for positive values of the argument) when the original function is complex.

We now define the Fourier transform of a complex function u with real argument t to be

$$\mathcal{F}^{(-ift)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.29a)$$

The notation for F introduced in (2.29a) explicitly shows that t, being repeated inside both the upper and lower parentheses, is the dummy variable of integration, and that F produces a function of ƒ because ƒ is listed only in the upper parentheses. We call (2.29a) the forward Fourier transform and, when convenient, follow the custom of writing it with the upper-case letter of the transformed function,

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.29b)$$

The inverse Fourier transform of U(f) is then defined to be

$$\mathcal{F}^{(itf)}\{U(f)\} = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df. \qquad (2.29c)$$

In both the forward and inverse transforms the order of the tf product in the superscript is irrelevant, just as it is for the sine, cosine, and Hartley transforms:

$$\mathcal{F}^{(-ift)}\{u(t)\} = \mathcal{F}^{(-itf)}\{u(t)\} \quad\text{and}\quad \mathcal{F}^{(itf)}\{U(f)\} = \mathcal{F}^{(ift)}\{U(f)\}.$$

What is important is the sign inside the superscript, since it determines whether the forward or inverse transform is being performed. Equation (2.28ℓ) shows, of course, that

$$u(t) = \mathcal{F}^{(itf)}\{U(f)\} = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\Big\{\mathcal{F}^{(-ift')}\{u(t')\}\Big\}. \qquad (2.29d)$$

It is entirely a matter of convention which Fourier transform is called the forward transform and which is called the inverse transform; all that matters is for (2.28ℓ) to be satisfied. Some authors

change the sign of the exponent $2\pi ift$, defining the forward Fourier transform to be

$$\mathcal{F}^{(ift)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,e^{2\pi ift}\,dt$$

and the inverse Fourier transform to be

$$\mathcal{F}^{(-itf)}\{U(f)\} = \int_{-\infty}^{\infty} U(f)\,e^{-2\pi ift}\,df.$$

Clearly, this convention also satisfies (2.28ℓ), with the inverse Fourier transform of the forward Fourier transform still returning the original function.

In physics and related disciplines, the frequency variable is often changed to $\omega = 2\pi f$, so that (2.28ℓ) becomes

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,e^{\mp i\omega t}\int_{-\infty}^{\infty} dt'\,e^{\pm i\omega t'}u(t') = u(t). \qquad (2.30a)$$

Authors using the frequency variable ω allocate the factor of $1/(2\pi)$ in different ways when defining the forward and inverse Fourier transforms in terms of ω, with all reasonable possibilities chosen at one time or another:

Forward Fourier transform of u(t) is $\displaystyle\int_{-\infty}^{\infty} u(t)\,e^{\pm i\omega t}\,dt = U(\omega)$,  (2.30b)
Inverse Fourier transform of U(ω) is $\displaystyle\frac{1}{2\pi}\int_{-\infty}^{\infty} U(\omega)\,e^{\mp i\omega t}\,d\omega = u(t)$;

Forward Fourier transform of u(t) is $\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t)\,e^{\pm i\omega t}\,dt = U(\omega)$,  (2.30c)
Inverse Fourier transform of U(ω) is $\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} U(\omega)\,e^{\mp i\omega t}\,d\omega = u(t)$;

Forward Fourier transform of u(t) is $\displaystyle\frac{1}{2\pi}\int_{-\infty}^{\infty} u(t)\,e^{\pm i\omega t}\,dt = U(\omega)$,  (2.30d)
Inverse Fourier transform of U(ω) is $\displaystyle\int_{-\infty}^{\infty} U(\omega)\,e^{\mp i\omega t}\,d\omega = u(t)$.

In each of the three pairs of definitions listed above, the plus and minus signs are synchronized, so if the top (bottom) sign is chosen for the first member of the pair then the top (bottom) sign must also be chosen for the second member of the pair. This gives a total of six different ways of defining the forward and inverse Fourier transforms, and all six satisfy Eq. (2.30a).
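Any one of these allocations can be verified by quadrature. The sketch below is an added numerical check (not from the original text) of the first pair, using the assumed Gaussian test function $u(t) = e^{-t^2/2}$, whose forward transform under that convention is $\sqrt{2\pi}\,e^{-\omega^2/2}$; the grid sizes are arbitrary but generous choices.

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 1501)
dt = t[1] - t[0]
w = np.linspace(-10.0, 10.0, 1501)
dw = w[1] - w[0]
u = np.exp(-0.5 * t**2)                       # Gaussian test function

# Forward transform, convention (2.30b) with the bottom sign:
# U(ω) = ∫ u(t) e^{-iωt} dt = sqrt(2π) exp(-ω²/2) for this u.
U = (np.exp(-1j * np.outer(w, t)) * u).sum(axis=1) * dt
assert np.allclose(U.real, np.sqrt(2 * np.pi) * np.exp(-0.5 * w**2), atol=1e-8)
assert np.allclose(U.imag, 0.0, atol=1e-8)

# The matching inverse, carrying the whole 1/(2π) factor, restores u(t).
u_back = (np.exp(1j * np.outer(t, w)) * U).sum(axis=1) * dw / (2 * np.pi)
assert np.allclose(u_back.real, u, atol=1e-8)
```

Moving part of the 1/(2π) into the forward transform, as in (2.30c) or (2.30d), leaves the round trip unchanged as long as the two factors multiply to 1/(2π).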

The unextended sine and cosine transforms—usually called just the sine and cosine transforms—can also be defined in many different ways. Equations (2.8a), (2.8c), (2.8e) and (2.8b), (2.8d), (2.8f) can be combined to write

$$4\int_0^\infty df\,\sin(2\pi ft)\int_0^\infty dt'\,u(t')\,\sin(2\pi ft') = u(t) \quad\text{for } t > 0 \qquad (2.31a)$$

and

$$4\int_0^\infty df\,\cos(2\pi ft)\int_0^\infty dt'\,u(t')\,\cos(2\pi ft') = u(t) \quad\text{for } t > 0. \qquad (2.31b)$$

In terms of the frequency variable $\omega = 2\pi f$, these become

$$\frac{2}{\pi}\int_0^\infty d\omega\,\sin(\omega t)\int_0^\infty dt'\,u(t')\,\sin(\omega t') = u(t) \quad\text{for } t > 0 \qquad (2.31c)$$

and

$$\frac{2}{\pi}\int_0^\infty d\omega\,\cos(\omega t)\int_0^\infty dt'\,u(t')\,\cos(\omega t') = u(t) \quad\text{for } t > 0. \qquad (2.31d)$$

Just like the factor of $1/(2\pi)$ in Eq. (2.30a), the factor of $2/\pi$ in (2.31c) and (2.31d) can be allocated three different ways when defining the forward and inverse sine and cosine transforms:

Forward sine transform of u(t) for t > 0 is $\displaystyle\int_0^\infty u(t)\,\sin(\omega t)\,dt = U_S(\omega)$,  (2.31e)
Forward cosine transform of u(t) for t > 0 is $\displaystyle\int_0^\infty u(t)\,\cos(\omega t)\,dt = U_C(\omega)$,
Inverse sine transform of $U_S(\omega)$ is $\displaystyle\frac{2}{\pi}\int_0^\infty U_S(\omega)\,\sin(\omega t)\,d\omega = u(t)$ for t > 0,
Inverse cosine transform of $U_C(\omega)$ is $\displaystyle\frac{2}{\pi}\int_0^\infty U_C(\omega)\,\cos(\omega t)\,d\omega = u(t)$ for t > 0;

Forward sine transform of u(t) for t > 0 is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^\infty u(t)\,\sin(\omega t)\,dt = U_S(\omega)$,  (2.31f)
Forward cosine transform of u(t) for t > 0 is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^\infty u(t)\,\cos(\omega t)\,dt = U_C(\omega)$,
Inverse sine transform of $U_S(\omega)$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^\infty U_S(\omega)\,\sin(\omega t)\,d\omega = u(t)$ for t > 0,
Inverse cosine transform of $U_C(\omega)$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^\infty U_C(\omega)\,\cos(\omega t)\,d\omega = u(t)$ for t > 0;

Forward sine transform of u(t) for t > 0 is $\displaystyle\frac{2}{\pi}\int_0^\infty u(t)\,\sin(\omega t)\,dt = U_S(\omega)$,  (2.31g)
Forward cosine transform of u(t) for t > 0 is $\displaystyle\frac{2}{\pi}\int_0^\infty u(t)\,\cos(\omega t)\,dt = U_C(\omega)$,
Inverse sine transform of $U_S(\omega)$ is $\displaystyle\int_0^\infty U_S(\omega)\,\sin(\omega t)\,d\omega = u(t)$ for t > 0,
Inverse cosine transform of $U_C(\omega)$ is $\displaystyle\int_0^\infty U_C(\omega)\,\cos(\omega t)\,d\omega = u(t)$ for t > 0.

The reader should expect to encounter all three classes of definitions given in (2.31e)–(2.31g). The symmetric definitions in (2.31f) are the most popular, probably because they remove the distinction between the forward and inverse transforms, letting us say that the sine transform of the sine transform and the cosine transform of the cosine transform return the original function for t > 0.
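As an added numerical illustration of the symmetric convention (2.31f)—not part of the original text—the sketch below uses the assumed test function $u(t) = t\,e^{-t^2/2}$, which happens to be its own symmetric sine transform, so transforming twice trivially returns the original function for t > 0.

```python
import numpy as np

def sym_sine_transform(g, x, dx):
    # Symmetric convention (2.31f): sqrt(2/π) ∫₀^∞ g(x) sin(ωx) dx,
    # evaluated by a Riemann sum on the same grid for x and ω.
    K = np.sin(np.outer(x, x))
    return np.sqrt(2.0 / np.pi) * (K * g).sum(axis=1) * dx

x = np.linspace(0.0, 12.0, 2001)
dx = x[1] - x[0]
u = x * np.exp(-0.5 * x**2)

# u(t) = t e^{-t²/2} is its own symmetric sine transform ...
U = sym_sine_transform(u, x, dx)
assert np.allclose(U, u, atol=1e-6)

# ... so the sine transform of the sine transform returns u for t > 0.
assert np.allclose(sym_sine_transform(U, x, dx), u, atol=1e-6)
```

The self-reciprocal test function makes both the single and the double application easy to check against closed-form values.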

In today’s optical-engineering textbooks—and user manuals for the fast Fourier transform—there is a tendency to choose Eqs. (2.29a)–(2.29d) as the definitions of the forward and inverse Fourier transforms, and that is the convention followed here. It is perhaps somewhat unconventional not to use the frequency variable $\omega = 2\pi f$ when defining the sine and cosine transforms, but using ƒ rather than ω brings their definitions into conformity with the definitions chosen for the forward and inverse Fourier transforms.
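Under the ƒ-convention of Eqs. (2.29a)–(2.29d), the Gaussian $e^{-\pi t^2}$ is its own transform, which makes the convention easy to verify numerically. The sketch below is an added check (my illustration; the grids are arbitrary choices):

```python
import numpy as np

# Under U(f) = ∫ u(t) e^{-2πift} dt, the Gaussian u(t) = exp(-π t²)
# transforms into itself: U(f) = exp(-π f²).
t = np.linspace(-8.0, 8.0, 4001)
dt = t[1] - t[0]
u = np.exp(-np.pi * t**2)

f = np.linspace(-3.0, 3.0, 61)
U = np.array([(u * np.exp(-2j * np.pi * fk * t)).sum() * dt for fk in f])

assert np.allclose(U, np.exp(-np.pi * f**2), atol=1e-9)
```

With the ω-convention (2.30b) the same function would instead transform into $e^{-\omega^2/(4\pi)}$, one reason the ƒ-convention is convenient for bookkeeping.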

2.6 Fourier Transform as a Linear Operation

The forward and inverse Fourier transforms are linear operations. If α, β are any two complex constants and u(t), v(t) are two complex-valued functions of a real variable t, then the definition of a linear operator L is that

$$\mathbf{L}\{\alpha\,u(t) + \beta\,v(t)\} = \alpha\,\mathbf{L}\{u(t)\} + \beta\,\mathbf{L}\{v(t)\}. \qquad (2.32a)$$

Three simple examples of linear operators are multiplication by a function g(t),

$$\mathbf{L}_1\{u(t)\} = g(t)\cdot u(t),$$

differentiation with respect to t,

$$\mathbf{L}_2\{u(t)\} = \frac{du(t)}{dt},$$

and integration between fixed limits $t_1$ and $t_2$,

$$\mathbf{L}_3\{u(t)\} = \int_{t_1}^{t_2} u(t)\,dt.$$

Each satisfies definition (2.32a); for the derivative and the integral, for example,

$$\mathbf{L}_2\{u(t) + v(t)\} = \frac{du(t)}{dt} + \frac{dv(t)}{dt} = \mathbf{L}_2\{u(t)\} + \mathbf{L}_2\{v(t)\}$$

and

$$\int_{t_1}^{t_2}\big[\alpha\,u(t) + \beta\,v(t)\big]\,dt = \alpha\int_{t_1}^{t_2} u(t)\,dt + \beta\int_{t_1}^{t_2} v(t)\,dt.$$

Combinations of linear operators are always linear; for example, the operator Z defined by

$$\mathbf{Z}\{u(t)\} = \mathbf{L}_3\big\{\mathbf{L}_1\{u(t)\}\big\}$$

must be linear because

$$\mathbf{Z}\{\alpha u + \beta v\} = \mathbf{L}_3\big\{\alpha\,\mathbf{L}_1\{u\} + \beta\,\mathbf{L}_1\{v\}\big\} = \alpha\,\mathbf{L}_3\big\{\mathbf{L}_1\{u(t)\}\big\} + \beta\,\mathbf{L}_3\big\{\mathbf{L}_1\{v(t)\}\big\} = \alpha\,\mathbf{Z}\{u(t)\} + \beta\,\mathbf{Z}\{v(t)\}. \qquad (2.32b)$$

The forward Fourier transform

$$\mathcal{F}^{(-ift)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt$$

as defined in Eq. (2.29a) is, in fact, just $\mathbf{L}_3\{\mathbf{L}_1\{u(t)\}\}$ with $g(t) = e^{-2\pi ift}$ in the L₁ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the L₃ integration. Similarly, the inverse Fourier transform is, interchanging the roles of the ƒ and t variables in Eq. (2.29b),

$$\mathcal{F}^{(ift)}\{U(t)\} = \int_{-\infty}^{\infty} U(t)\,e^{2\pi ift}\,dt,$$

which is again $\mathbf{L}_3\{\mathbf{L}_1\{U(t)\}\}$, now with $g(t) = e^{2\pi ift}$ in the L₁ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the L₃ integration. Equation (2.32b) thus shows that both the forward and inverse Fourier transforms are linear. The unextended and extended sine transforms in Eqs. (2.8a) and (2.14a),

$$\mathcal{S}^{(ft)}\{u(t)\} = 2\int_0^\infty u(t)\,\sin(2\pi ft)\,dt \quad\text{and}\quad \mathcal{S}_E^{(ft)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,\sin(2\pi ft)\,dt,$$

are also both $\mathbf{L}_3\{\mathbf{L}_1\{u(t)\}\}$: the unextended sine transform has $g(t) = 2\sin(2\pi ft)$ in the L₁ multiplication and $t_1 = 0$, $t_2 = \infty$ in the L₃ integration; and the extended sine transform has $g(t) = \sin(2\pi ft)$ in the L₁ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the L₃ integration. The unextended and extended cosine transforms in Eqs. (2.8b) and (2.14b),

$$\mathcal{C}^{(ft)}\{u(t)\} = 2\int_0^\infty u(t)\,\cos(2\pi ft)\,dt \quad\text{and}\quad \mathcal{C}_E^{(ft)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,\cos(2\pi ft)\,dt,$$

are, of course, identical to the unextended and extended sine transforms in being $\mathbf{L}_3\{\mathbf{L}_1\{u(t)\}\}$; the only change is that the sines become cosines in the L₁ multiplications. From Eq. (2.32b), all

four transforms—the extended sine transform, the unextended sine transform, the extended cosine transform, and the unextended cosine transform—are linear operations. The only other transform discussed so far, the Hartley transform

$$\mathcal{H}^{(ft)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt,$$

is the sum of the extended sine and cosine transforms, so it too must be linear.
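Linearity is one of the few Fourier properties that survives discretization exactly rather than approximately. The added sketch below (my illustration) checks definition (2.32a)-style linearity for the DFT with complex constants and complex sequences:

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.standard_normal(128) + 1j * rng.standard_normal(128)
v = rng.standard_normal(128) + 1j * rng.standard_normal(128)
alpha, beta = 2.0 - 1.0j, 0.5 + 3.0j

# F{αu + βv} = α F{u} + β F{v}, exactly (up to rounding) for the DFT.
lhs = np.fft.fft(alpha * u + beta * v)
rhs = alpha * np.fft.fft(u) + beta * np.fft.fft(v)
assert np.allclose(lhs, rhs)
```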

2.7 Mathematical Symmetries of the Fourier Transform

There are a large number of symmetry relations that hold for any function u(t) and its Fourier transform

$$U(f) = \mathcal{F}^{(-ift)}\{u(t)\} = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.33a)$$

We have already seen that the inverse Fourier transform of U(f) returns the original function,

$$\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\{U(f)\} = u(t), \qquad (2.33b)$$

that is, $u(t) = \mathcal{F}^{(itf)}\{U(f)\}$. Interchanging the roles of the t and ƒ variables in this last result and then replacing ƒ by −ƒ gives

$$u(-f) = \mathcal{F}^{(-ift)}\{U(t)\}, \qquad (2.33c)$$

which shows that u(−f) is the forward Fourier transform of U(t). We expect, then, that U(t) is the inverse Fourier transform of u(−f). To show this is true, we interchange the roles of variables ƒ and t in (2.33a) and then make $f' = -f$ the new variable of integration to get

$$U(t) = \mathcal{F}^{(-itf)}\{u(f)\} = \int_{-\infty}^{\infty} u(f)\,e^{-2\pi ift}\,df = \int_{-\infty}^{\infty} u(-f')\,e^{2\pi if't}\,df' = \int_{-\infty}^{\infty} u(-f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\{u(-f)\}. \qquad (2.33d)$$

Not only does this show that U(t) is the inverse Fourier transform of u(−f), but also, by comparing the two expressions involving the F operator, we see that changing the sign of the integration variable ƒ does not change the value of the Fourier operation F. It does, however, change its name—the first F operation in (2.33d) is the forward Fourier transform of u(f), and the second F operation in (2.33d) is the inverse Fourier transform of u(−f). Taking the complex conjugate of all three expressions in Eq. (2.33b) gives

$$\int_{-\infty}^{\infty} U(f)^*\,e^{-2\pi ift}\,df = \mathcal{F}^{(-itf)}\{U(f)^*\} = u(t)^*,$$

which shows that we get the complex conjugate of operator F by taking the complex conjugates of the quantities inside both parentheses. Starting with the original Fourier-transform relationship

between U and u,

$$U(f) = \mathcal{F}^{(-ift)}\{u(t)\} \qquad (2.33e)$$

and

$$u(t) = \mathcal{F}^{(itf)}\{U(f)\}, \qquad (2.33f)$$

we take the complex conjugate of both sides of (2.33e) to get

$$U(f)^* = \mathcal{F}^{(ift)}\{u(t)^*\}$$

and then change the sign of ƒ to get

$$U(-f)^* = \mathcal{F}^{(-ift)}\{u(t)^*\}. \qquad (2.33g)$$

This shows that U(−f)* is the forward Fourier transform of u(t)*. Since U(−f)* is the forward Fourier transform of u(t)*, we expect the inverse Fourier transform of U(−f)* to be u(t)*. To show this is true, we just change the sign of the integration variable in the complex conjugate of Eq. (2.33f),

$$u(t)^* = \mathcal{F}^{(-itf)}\{U(f)^*\},$$

which converts it to

$$u(t)^* = \mathcal{F}^{(itf)}\{U(-f)^*\}. \qquad (2.33h)$$

When u(t) is a strictly real function, as it is for much of the Fourier-transform work done in this book, u equals its complex conjugate, so that

$$\mathcal{F}^{(-ift)}\{u(t)^*\} = \mathcal{F}^{(-ift)}\{u(t)\} = U(f).$$

Equation (2.33g) then becomes

$$U(-f)^* = U(f)$$

or

$$U(f)^* = U(-f). \qquad (2.34a)$$

Functions U(f) that obey Eq. (2.34a) are called Hermitian. If u(t) is purely imaginary, so that $u(t)^* = -u(t)$, then Eq. (2.33g) becomes

$$U(-f)^* = \mathcal{F}^{(-ift)}\{-u(t)\}$$

or

$$\mathcal{F}^{(-ift)}\{u(t)\} = -U(-f)^*, \qquad (2.34b)$$

where the linearity of F is used to take (−1) outside the transform and shift it over to the other side of the equation. Since $\mathcal{F}^{(-ift)}\{u(t)\}$ is just U(f), Eq. (2.34b) shows that

$$U(f) = -U(-f)^*$$

or

$$U(f)^* = -U(-f) \qquad (2.34c)$$

when u is purely imaginary. Functions U(f) that obey Eq. (2.34c) are called anti-Hermitian. A special and very important case occurs when u is both real and even. Then, since U is the forward Fourier transform of u with $U(f) = \mathcal{F}^{(-ift)}\{u(t)\}$, we take the complex conjugate of both sides to get

$$U(f)^* = \mathcal{F}^{(ift)}\{u(t)^*\}.$$

Because u is real this becomes, changing the sign of the variable of integration,

$$U(f)^* = \mathcal{F}^{(-ift)}\{u(-t)\} = \mathcal{F}^{(-ift)}\{u(t)\} = U(f),$$

where the last step uses the evenness of u, $u(-t) = u(t)$; so that

$$U(f)^* = U(f). \qquad (2.34d)$$

Hence, U equals its own complex conjugate, which shows it must be real. Because u is real, we already know that U is Hermitian and (2.34a) must hold true; now that U is known to be real, Eq. (2.34a) can be written as

$$U(f) = U(-f). \qquad (2.34e)$$
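The Hermitian and anti-Hermitian symmetries carry over exactly to the DFT, where the index (−k) mod N plays the role of −ƒ. The added sketch below (my illustration) verifies both Eq. (2.34a) and Eq. (2.34c):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 64
k = np.arange(N)

# Real u  ->  Hermitian spectrum: U(-f) = U(f)*   [Eq. (2.34a)]
u_real = rng.standard_normal(N)
U = np.fft.fft(u_real)
assert np.allclose(U[(-k) % N], np.conj(U))

# Purely imaginary u  ->  anti-Hermitian: U(-f) = -U(f)*   [Eq. (2.34c)]
u_imag = 1j * rng.standard_normal(N)
V = np.fft.fft(u_imag)
assert np.allclose(V[(-k) % N], -np.conj(V))
```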

This shows that U must be real and even when u is real and even. Taking the real part of Eq. (2.33a) now gives, since both U and u are known to be real,

$$U(f) = \mathrm{Re}\left(\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt\right) = \int_{-\infty}^{\infty} u(t)\,\mathrm{Re}\big(e^{-2\pi ift}\big)\,dt,$$

so that

$$U(f) = \int_{-\infty}^{\infty} u(t)\,\cos(2\pi ft)\,dt. \qquad (2.34f)$$

Because u(t) is also even, we know that the product $u(t)\cos(2\pi ft)$ is even with respect to t, which means that (2.34f) can be written as [see formula (2.19) above]

$$U(f) = 2\int_0^\infty u(t)\,\cos(2\pi ft)\,dt. \qquad (2.34g)$$

The right-hand side is the unextended cosine transform of u, showing that when u(t) is real and even, its Fourier transform equals its cosine transform. According to Eq. (2.8f), it follows that u must then be the cosine transform of U,

$$u(t) = 2\int_0^\infty U(f)\,\cos(2\pi ft)\,df. \qquad (2.34h)$$
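The real-and-even case can also be checked discretely: a real sequence that is even on the periodic grid, u[n] = u[(−n) mod N], has a DFT that is itself real and even. The sketch below is an added illustration with an arbitrary symmetric test sequence:

```python
import numpy as np

# A real sequence that is even on the DFT grid: u[n] = u[(-n) mod N].
N = 64
n = np.arange(N)
u = np.exp(-0.05 * np.minimum(n, N - n)**2)

U = np.fft.fft(u)
assert np.allclose(U.imag, 0.0, atol=1e-10)   # transform is real ...
assert np.allclose(U[(-n) % N], U)            # ... and even
```

This is the discrete counterpart of the statement that the Fourier transform of a real, even function reduces to the cosine transform (2.34g).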

2.8 Basic Fourier Identities

There are a number of simple Fourier identities that are true for the transforms of any function u. One very simple identity—surprisingly easy to overlook—is that when U(f) is the forward or inverse Fourier transform of u(t), the value of U at the origin is the total integral of u:

$$U(f)\Big|_{f=0} = \left[\int_{-\infty}^{\infty} u(t)\,e^{\pm2\pi ift}\,dt\right]_{f=0}$$

or

$$U(0) = \int_{-\infty}^{\infty} u(t)\,dt. \qquad (2.35a)$$

Similarly, when u(t) is the forward or inverse Fourier transform of U(f),

$$u(t)\Big|_{t=0} = \left[\int_{-\infty}^{\infty} U(f)\,e^{\mp2\pi ift}\,df\right]_{t=0}$$

or

$$u(0) = \int_{-\infty}^{\infty} U(f)\,df. \qquad (2.35b)$$
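Both identities have one-line discrete analogues, shown in the added sketch below (my illustration): the zero-frequency DFT value is the sum of the sequence, and summing the spectrum with the DFT's 1/N normalization returns the sample at the origin.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(100)
X = np.fft.fft(x)

# Analogue of Eq. (2.35a): U(0) = sum ("total integral") of the sequence.
assert np.isclose(X[0].real, x.sum())

# Analogue of Eq. (2.35b): u(0) = sum of the spectrum (times 1/N here,
# because NumPy puts the normalization in the inverse transform).
assert np.isclose(X.sum().real / len(x), x[0])
```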

When U(f) is the forward Fourier transform of u(t), the nth derivative of U is

$$\frac{d^nU}{df^n} = \frac{\partial^n}{\partial f^n}\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt = (-2\pi i)^n\int_{-\infty}^{\infty} t^n\,u(t)\,e^{-2\pi ift}\,dt; \qquad (2.35c)$$

and, because Eqs. (2.29a) and (2.29d) require u to be the inverse transform of U when U is the forward transform of u, the nth derivative of u is

$$\frac{d^nu}{dt^n} = \frac{\partial^n}{\partial t^n}\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = (2\pi i)^n\int_{-\infty}^{\infty} f^n\,U(f)\,e^{2\pi ift}\,df. \qquad (2.35d)$$

Therefore, when both u and $d^nu/dt^n$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(f) is the forward Fourier transform of u(t), Eq. (2.35d) shows that $(2\pi i)^n f^n\,U(f)$ must be the forward Fourier transform of $d^nu/dt^n$, because $d^nu/dt^n$ is the inverse Fourier transform of $(2\pi i)^n f^n\,U(f)$. Equation (2.35c) similarly shows that when u(t) and $t^n u(t)$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(f) is the forward Fourier transform of u(t), the forward Fourier transform of $t^n u(t)$ is

$$\frac{1}{(-2\pi i)^n}\,\frac{d^nU}{df^n}.$$

It is often convenient to use the symbol "⇔" to connect a pair of functions, adopting the convention that the function on the right is always the forward Fourier transform of the function on the left and the function on the left is always the inverse Fourier transform of the function on the right. The results of the above analysis can then be written as

$$\frac{d^nu}{dt^n} \;\Leftrightarrow\; (2\pi i)^n f^n\,U(f) \qquad (2.35e)$$

and

$$t^n\,u(t) \;\Leftrightarrow\; \frac{1}{(-2\pi i)^n}\,\frac{d^nU}{df^n}. \qquad (2.35f)$$
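Identity (2.35e) is the basis of spectral differentiation: multiply the transform by $2\pi if$ and transform back. The added sketch below (my illustration, on a periodic grid with a band-limited test function, so the result is exact to rounding) checks the n = 1 case:

```python
import numpy as np

# Check du/dt <-> (2πif) U(f) on a periodic grid [Eq. (2.35e), n = 1].
N = 256
L = 2 * np.pi
t = np.arange(N) * (L / N)
u = np.sin(3 * t) + 0.5 * np.cos(5 * t)
du_exact = 3 * np.cos(3 * t) - 2.5 * np.sin(5 * t)

f = np.fft.fftfreq(N, d=L / N)        # frequencies in cycles per unit t
du = np.fft.ifft(2j * np.pi * f * np.fft.fft(u)).real

assert np.allclose(du, du_exact, atol=1e-10)
```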

For any complex-valued function c(t) of a real argument, the magnitude of its integral obeys

$$\left|\int_a^b c(t)\,dt\right| \le \int_a^b \big|c(t)\big|\,dt, \qquad (2.35g)$$

which must hold true for any two real values of a and b where $a \le b$. When u(t) is real, so is its nth derivative, and we can write

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty}\left|\frac{d^nu}{dt^n}\right|\,dt. \qquad (2.35h)$$

Because we are supposing the Fourier transform of $d^nu/dt^n$ to exist, the existence requirement in Eq. (2.13a) shows that

$$\int_{-\infty}^{\infty}\left|\frac{d^nu}{dt^n}\right|\,dt$$

is finite, forcing

$$\left|\int_{-\infty}^{\infty}\frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right|$$

also to be finite, which means that we can assume that it is less than or equal to some finite real

and non-negative number B for all values of ƒ:

$$\left|\int_{-\infty}^{\infty}\frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| \le B. \qquad (2.35i)$$

From Eq. (2.35e), the forward Fourier transform of the nth derivative is

$$\int_{-\infty}^{\infty}\frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt = (2\pi)^n\,i^n\,f^n\,U(f), \qquad (2.35j)$$

where

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt$$

is, of course, the Fourier transform of u(t). Taking the magnitude of the complex values of both sides of (2.35j) and remembering that $|i^n| = 1$ shows that

$$\left|\int_{-\infty}^{\infty}\frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| = (2\pi)^n\,|f|^n\,\big|U(f)\big|,$$

so that, by (2.35i),

$$B \ge (2\pi)^n\,|f|^n\,\big|U(f)\big|$$

or

$$\big|U(f)\big| \le \frac{B}{(2\pi)^n}\,|f|^{-n}. \qquad (2.35k)$$

Hence, when the Fourier transform of the nth derivative of u(t) exists, we know that the magnitude $|U(f)|$ of the Fourier transform of u decreases at least as fast as $|f|^{-n}$ for large values of ƒ.

We next examine a set of identities often called the Fourier shift theorem. When U(f) is the forward Fourier transform of u(t),

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt,$$

and the argument of u is shifted by a constant a,

$$u(t) \to u(t-a),$$

then the forward Fourier transform of $u(t-a)$ is, changing the variable of integration to $t' = t - a$,

$$\int_{-\infty}^{\infty} u(t-a)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} u(t')\,e^{-2\pi if(t'+a)}\,dt' = e^{-2\pi ifa}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi ift'}\,dt' = e^{-2\pi ifa}\,U(f).$$

Hence the forward Fourier transform of $u(t-a)$ is $e^{-2\pi ifa}U(f)$ when the forward Fourier transform of u(t) is U(f), which we can write as

$$u(t-a) \;\Leftrightarrow\; e^{-2\pi ifa}\,U(f). \qquad (2.36a)$$

In terms of the Fourier F operator, we have

$$\mathcal{F}^{(-ift)}\{u(t-a)\} = e^{-2\pi ifa}\,\mathcal{F}^{(-ift)}\{u(t)\}. \qquad (2.36b)$$

Working with the inverse Fourier transform of $U(f - f_0)$ and changing the variable of integration to $f' = f - f_0$, we see that

$$\int_{-\infty}^{\infty} U(f-f_0)\,e^{2\pi ift}\,df = e^{2\pi if_0t}\int_{-\infty}^{\infty} U(f')\,e^{2\pi if't}\,df' = e^{2\pi if_0t}\,u(t) \qquad (2.36c)$$

or

$$e^{2\pi if_0t}\,u(t) \;\Leftrightarrow\; U(f-f_0). \qquad (2.36d)$$

In terms of the F operator,

$$\mathcal{F}^{(itf)}\{U(f-f_0)\} = e^{2\pi if_0t}\,\mathcal{F}^{(itf)}\{U(f)\} \qquad (2.36e)$$

or

$$\mathcal{F}^{(-ift)}\big\{e^{2\pi if_0t}\,u(t)\big\} = U(f-f_0) = \mathcal{F}^{(-i(f-f_0)t)}\{u(t)\}. \qquad (2.36f)$$

Equations (2.36d)–(2.36f) show that multiplying u(t) by $e^{2\pi if_0t}$ shifts U(f), the forward Fourier transform of u(t), to the right by a frequency $f_0$. By interchanging the roles of t and ƒ—and replacing u by U and $f_0$ by a—in (2.36e) and comparing the result to (2.36b), we see the two equations can be combined into one formula:

$$\mathcal{F}^{(\mp ift)}\{u(t-a)\} = e^{\mp2\pi ifa}\,\mathcal{F}^{(\mp ift)}\{u(t)\}. \qquad (2.36g)$$

This last result can also be written as, defining a new constant $b = -a$,

$$\int_{-\infty}^{\infty} u(t+b)\,e^{\mp2\pi ift}\,dt = e^{\pm2\pi ifb}\int_{-\infty}^{\infty} u(t)\,e^{\mp2\pi ift}\,dt \qquad (2.36h)$$

or

$$\mathcal{F}^{(\mp ift)}\{u(t+b)\} = e^{\pm2\pi ifb}\,\mathcal{F}^{(\mp ift)}\{u(t)\}. \qquad (2.36i)$$
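The shift theorem holds exactly for circular shifts on the DFT grid. The added sketch below (my illustration) verifies the discrete counterpart of (2.36b), with frequencies measured in cycles per sample:

```python
import numpy as np

rng = np.random.default_rng(5)
N, a = 64, 5
x = rng.standard_normal(N)
f = np.fft.fftfreq(N)          # frequencies in cycles per sample

# A circular shift x[n] -> x[n - a] multiplies the spectrum by e^{-2πifa}.
lhs = np.fft.fft(np.roll(x, a))
rhs = np.exp(-2j * np.pi * f * a) * np.fft.fft(x)
assert np.allclose(lhs, rhs)
```

Note that the shift must be circular (np.roll); a truncating shift introduces edge effects and the identity then holds only approximately.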

The next set of identities is sometimes called the Fourier scaling theorem. If U(f) is the forward Fourier transform of u(t) and the argument of u is scaled by the positive constant a,

$$u(t) \to u(at),$$

then changing the variable of integration to $t' = at$ gives

$$\int_{-\infty}^{\infty} u(at)\,e^{-2\pi ift}\,dt = \frac{1}{a}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi i(f/a)t'}\,dt' = \frac{1}{a}\,U\!\left(\frac{f}{a}\right).$$

Hence

$$u(at) \;\Leftrightarrow\; \frac{1}{a}\,U\!\left(\frac{f}{a}\right) \qquad (2.37a)$$

or

$$\mathcal{F}^{(-ift)}\{u(at)\} = \frac{1}{a}\,\mathcal{F}^{(-i(f/a)t)}\{u(t)\}. \qquad (2.37b)$$

We also have, scaling the frequency by a positive constant a and letting $f' = af$, that

$$\int_{-\infty}^{\infty} U(af)\,e^{2\pi ift}\,df = \frac{1}{a}\int_{-\infty}^{\infty} U(f')\,e^{2\pi if'(t/a)}\,df' = \frac{1}{a}\,u\!\left(\frac{t}{a}\right).$$

Hence

$$\frac{1}{a}\,u\!\left(\frac{t}{a}\right) \;\Leftrightarrow\; U(af) \quad\text{for } a > 0 \qquad (2.37c)$$

or

$$\mathcal{F}^{(itf)}\{U(af)\} = \frac{1}{a}\,\mathcal{F}^{(i(t/a)f)}\{U(f)\} \quad\text{for } a > 0. \qquad (2.37d)$$

Equation (2.37b) and (after interchanging the roles of ƒ and t) Eq. (2.37d) can be combined into the single formula

$$\mathcal{F}^{(\mp ift)}\{u(at)\} = \frac{1}{a}\,\mathcal{F}^{(\mp i(f/a)t)}\{u(t)\} \quad\text{for } a > 0. \qquad (2.37e)$$
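The scaling theorem can be checked by quadrature. The added sketch below (my illustration, with the assumed self-transforming Gaussian $u(t) = e^{-\pi t^2}$ and a = 2) confirms that compressing u stretches and rescales its spectrum as in (2.37a):

```python
import numpy as np

# Scaling check of Eq. (2.37a): u(t) = exp(-πt²) has U(f) = exp(-πf²),
# so u(at) should transform to (1/a) U(f/a).
a = 2.0
t = np.linspace(-8.0, 8.0, 4001)
dt = t[1] - t[0]
f = np.linspace(-4.0, 4.0, 81)

u_scaled = np.exp(-np.pi * (a * t)**2)
F = np.array([(u_scaled * np.exp(-2j * np.pi * fk * t)).sum() * dt for fk in f])

assert np.allclose(F, (1.0 / a) * np.exp(-np.pi * (f / a)**2), atol=1e-9)
```

Compressing the function by a factor of two (a = 2) halves the peak of the spectrum and doubles its width, the trade-off discussed in the next paragraph.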

Because u(t) must satisfy requirements (V) through (VIII) in Sec. 2.4 for these results to be true—and in particular it must satisfy requirement (V) that it be absolutely integrable—there may well be only a finite region of t over which u(t) is significantly different from zero. When $0 < a < 1$, so that the range of t over which $u(at)$ is significantly different from zero expands, formula (2.37a) shows that the region of ƒ over which U(f) is significantly different from zero shrinks; and, of course, when $a > 1$, just the opposite occurs. For $0 < a < 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for smaller values of ƒ, explaining why the region of ƒ for which U is significantly different from zero shrinks; and when $a > 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for larger values of ƒ, explaining why the region of ƒ for which U is significantly different from zero expands. We also note that if $f \approx 1/(2\pi)$, so that $\sin(2\pi ft) \approx \sin(t)$ and $\cos(2\pi ft) \approx \cos(t)$, then the sine and cosine can change significantly in value only when t changes by at least

$$\Delta t_{\min} = O(1).$$

Suppose t must also change by at least $\Delta t_{\min} = O(1)$ for a significant change in u(t) to occur, which means that $\sin(2\pi ft) \approx \sin(t)$ and $\cos(2\pi ft) \approx \cos(t)$ vary about as fast with respect to t as u does—that is, sin(t) and cos(t) "resemble" u somewhat. Recalling the heuristic reasoning used in Sec. 2.1 to introduce and justify the sine and cosine integrals, we now expect U(f) to be significantly different from zero when $f \approx 1/(2\pi)$. Suppose next that t changes by less than $\Delta t_{\min} = O(1)$, so that u does not change significantly in value, remaining almost constant. Now when ƒ becomes significantly larger than $1/(2\pi)$, the functions $\sin(2\pi ft)$ and $\cos(2\pi ft)$ oscillate ever more rapidly, so that they change significantly in value for changes in t that are ever smaller than $\Delta t_{\min}$. For these larger values of ƒ, the sine and cosine do not much resemble u(t), forcing the Fourier transform U(f) to be negligible or zero for $f \gg O\big(1/(2\pi)\big)$. We can modify the original function u by creating a new function $u_\varepsilon(t) = u(t/\varepsilon)$ for $\varepsilon > 0$. Now t must change by at least an $O(\varepsilon)$ amount for $u_\varepsilon$ to change significantly; and when t changes by less than $O(\varepsilon)$, function $u_\varepsilon$ does not change significantly in value. We know from (2.37a) with $a = 1/\varepsilon$ that the forward Fourier transform of $u_\varepsilon$ is $U_\varepsilon(f) = \varepsilon\,U(\varepsilon f)$. Hence, when ƒ is larger than $O\big(1/(2\pi\varepsilon)\big)$, it must be true that $U_\varepsilon(f)$ is negligible or zero, since this is the same as having $f \gg O\big(1/(2\pi)\big)$ in U(f). Because $2\pi$ is often regarded as an O(1) quantity, this result can also be interpreted as showing that $U_\varepsilon(f)$ must be negligible or zero for $f \gg O(1/\varepsilon)$. Since the original Fourier-transform pair

$$u(t) \;\Leftrightarrow\; U(f)$$

is left unspecified, $u_\varepsilon$ in fact represents any function v(t) where t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur. Consequently, we can conclude that if t must change by at least an $O(\varepsilon)$ amount for v(t) to change significantly, then the forward Fourier transform of v(t) must be negligible or zero for $f \gg O(1/\varepsilon)$. The arguments leading to this conclusion work just as well when we consider the inverse Fourier transform in Eqs. (2.37c) and (2.37e). Therefore, this more general result is also true: if v(t) is a function such that t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur, then the forward or inverse Fourier transform,

$$V(f) = \int_{-\infty}^{\infty} v(t)\,e^{\mp2\pi ift}\,dt,$$

must be negligible or zero for $f \gg O(1/\varepsilon)$.

2.9 Fourier Convolution Theorem

It is hard to overstate the importance of the Fourier convolution theorem; it plays a fundamental role in linear signal theory and structures the thinking of many different engineering disciplines—signal processing, electrical engineering, image analysis, and servomechanism design, to name but a few.

We define the convolution of two functions u(t) and v(t) to be

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' . \qquad (2.38a)$$

Here, u and v may be complex functions, but their argument t is assumed to be real. The convolution is commutative and associative. It is commutative because making the substitution $t'' = t - t'$ gives

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' = \int_{-\infty}^{\infty} u(t - t'')\, v(t'')\, dt'' = \int_{-\infty}^{\infty} v(t'')\, u(t - t'')\, dt'' ,$$

showing that

$$u(t) * v(t) = v(t) * u(t) . \qquad (2.38b)$$
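Commutativity is easy to confirm numerically. The following Python sketch (our illustration; the grid size and the particular test functions are arbitrary choices) approximates both u∗v and v∗u by Riemann sums and compares them at a sample point.

```python
import math

def convolve(u, v, T=10.0, n=2000):
    # Return a function approximating (u * v)(t) by a Riemann sum over
    # t' in [-T, T]; assumes the integrand is negligible outside that range.
    dt = 2 * T / n
    ts = [-T + k * dt for k in range(n + 1)]
    def w(t):
        return sum(u(tp) * v(t - tp) for tp in ts) * dt
    return w

u = lambda t: math.exp(-t * t)        # Gaussian
v = lambda t: math.exp(-abs(t))       # two-sided exponential

w_uv = convolve(u, v)
w_vu = convolve(v, u)
# Eq. (2.38b): u * v = v * u
assert abs(w_uv(0.7) - w_vu(0.7)) < 1e-9
```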

The convolution is associative because, for three complex functions u(t), v(t), and h(t) with real argument t, we can write, changing the variable of integration to $t''' = t'' - t'$,

$$[u(t) * v(t)] * h(t) = \int_{-\infty}^{\infty} dt'' \left[ \int_{-\infty}^{\infty} dt'\, u(t')\, v(t'' - t') \right] h(t - t'') = \int_{-\infty}^{\infty} dt'\, u(t') \int_{-\infty}^{\infty} dt'''\, v(t''')\, h(t - t' - t''') = u(t) * [v(t) * h(t)] .$$

Hence,

$$[u(t) * v(t)] * h(t) = u(t) * [v(t) * h(t)] . \qquad (2.38c)$$

The convolution is a linear operation because, for any two complex constants α and β,

$$h(t) * [\alpha\, u(t) + \beta\, v(t)] = \alpha\, [h(t) * u(t)] + \beta\, [h(t) * v(t)] . \qquad (2.38d)$$

Fourier Convolution Theorem · 2.9

The constants α and β pass straight through the integral defining the convolution, and

$$h(t) * [u(t) + v(t)] = \int_{-\infty}^{\infty} h(t')\, [u(t - t') + v(t - t')]\, dt' = \int_{-\infty}^{\infty} h(t')\, u(t - t')\, dt' + \int_{-\infty}^{\infty} h(t')\, v(t - t')\, dt' ,$$

showing that

$$h(t) * [u(t) + v(t)] = h(t) * u(t) + h(t) * v(t) . \qquad (2.38e)$$

Together with commutativity (2.38b), this shows that the convolution is linear on both the left-hand and right-hand sides of the $*$.

The convolution of two even functions or two odd functions is an even function. If u(t) and v(t) are both even, then we have, using $t'' = -t'$,

$$[u * v](-t) = \int_{-\infty}^{\infty} u(t')\, v(-t - t')\, dt' = \int_{-\infty}^{\infty} u(-t'')\, v(-t + t'')\, dt'' = \int_{-\infty}^{\infty} u(t'')\, v(t - t'')\, dt'' = [u * v](t) , \qquad (2.38f)$$

since $u(-t'') = u(t'')$ and $v(-(t - t'')) = v(t - t'')$. If u(t) and v(t) are both odd, the same substitution gives

$$[u * v](-t) = \int_{-\infty}^{\infty} u(-t'')\, v(-(t - t''))\, dt'' = \int_{-\infty}^{\infty} [-u(t'')]\, [-v(t - t'')]\, dt'' = [u * v](t) . \qquad (2.38g)$$

If u and v have more than one argument, so that they are written $u(y, x_1, x_2, \ldots)$ and $v(y, x_1', x_2', \ldots)$, then we adopt the convention that the convolution is taken over the first argument:

$$u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots) = \int_{-\infty}^{\infty} u(y', x_1, x_2, \ldots)\, v(y - y', x_1', x_2', \ldots)\, dy' .$$

To derive the Fourier convolution theorem, we take the forward or inverse transform of $u(t) * v(t)$ to get

$$\mathcal{F}^{(\mp ift)}\big( u(t) * v(t) \big) = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, [u(t) * v(t)]\, dt = \int_{-\infty}^{\infty} dt\, e^{\mp 2\pi i f t} \int_{-\infty}^{\infty} dt'\, u(t')\, v(t - t') = \int_{-\infty}^{\infty} dt'\, u(t') \int_{-\infty}^{\infty} dt\, e^{\mp 2\pi i f t}\, v(t - t') .$$

Substituting $t'' = t - t'$ in the inner integral gives

$$\mathcal{F}^{(\mp ift)}\big( u(t) * v(t) \big) = \int_{-\infty}^{\infty} dt'\, u(t')\, e^{\mp 2\pi i f t'} \int_{-\infty}^{\infty} dt''\, e^{\mp 2\pi i f t''}\, v(t'') = \left[ \int_{-\infty}^{\infty} dt'\, u(t')\, e^{\mp 2\pi i f t'} \right] \cdot \left[ \int_{-\infty}^{\infty} dt''\, e^{\mp 2\pi i f t''}\, v(t'') \right]$$

or

$$\mathcal{F}^{(\mp ift)}\big( u(t) * v(t) \big) = \mathcal{F}^{(\mp ift)}\big( u(t) \big) \cdot \mathcal{F}^{(\mp ift)}\big( v(t) \big) . \qquad (2.39a)$$

If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39a) to get

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\, [u(t) * v(t)]\, dt = U(f) \cdot V(f) , \qquad (2.39b)$$

which shows that

$$u(t) * v(t) \leftrightarrow U(f) \cdot V(f) . \qquad (2.39c)$$
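The discrete analogue of (2.39b) and (2.39c) holds exactly for the discrete Fourier transform and circular convolution, and can be checked directly. The naive DFT below is our own minimal sketch, not a formula from the book.

```python
import cmath

def dft(x):
    # Naive forward discrete Fourier transform.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * f * n / N) for n in range(N))
            for f in range(N)]

def circ_conv(x, y):
    # Circular (periodic) convolution, the discrete analogue of Eq. (2.38a).
    N = len(x)
    return [sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 0.5, -1.0]
y = [0.5, 0.25, 0.0, 0.25]
# DFT of the convolution equals the product of the DFTs:
lhs = dft(circ_conv(x, y))
rhs = [a * b for a, b in zip(dft(x), dft(y))]
assert all(abs(l - r) < 1e-9 for l, r in zip(lhs, rhs))
```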

Equation (2.28ℓ) can be written, for any function g(t), after interchanging the roles of t and $t'$, as

$$\mathcal{F}^{(\mp it'f)}\Big( \mathcal{F}^{(\pm ift)}\big( g(t) \big) \Big) = g(t') . \qquad (2.39d)$$

We replace $\mathcal{F}^{(\mp)}$ by $\mathcal{F}^{(\pm)}$ on the right-hand side of Eq. (2.39a), which is just a change in the order in which the two possible signs of the exponent are listed, and then take $\mathcal{F}^{(\mp it'f)}$ of both sides to get, applying (2.39d) with $g(t) = u(t) * v(t)$,

$$u(t') * v(t') = \mathcal{F}^{(\mp it'f)}\Big( \mathcal{F}^{(\pm ift)}\big( u(t) \big) \cdot \mathcal{F}^{(\pm ift)}\big( v(t) \big) \Big) . \qquad (2.39e)$$

Because u(t) and v(t) represent arbitrary, Fourier-transformable functions of t, $\mathcal{F}^{(\pm ift)}(u(t))$ and $\mathcal{F}^{(\pm ift)}(v(t))$ must be arbitrary, Fourier-transformable functions of ƒ, which we can call $U^{(\pm)}$ and $V^{(\pm)}$ respectively,

$$U^{(\pm)}(f) = \mathcal{F}^{(\pm ift)}\big( u(t) \big) \qquad (2.39f)$$

and

$$V^{(\pm)}(f) = \mathcal{F}^{(\pm ift)}\big( v(t) \big) . \qquad (2.39g)$$

Applying this notation to (2.39d), first with $g(t) = u(t)$ and then with $g(t) = v(t)$, we see that

$$\mathcal{F}^{(\mp it'f)}\big( U^{(\pm)}(f) \big) = u(t') \qquad (2.39h)$$

and

$$\mathcal{F}^{(\mp it'f)}\big( V^{(\pm)}(f) \big) = v(t') . \qquad (2.39i)$$

Substituting (2.39f)–(2.39i) into (2.39e) now gives

$$\mathcal{F}^{(\mp it'f')}\big( U^{(\pm)}(f') \big) * \mathcal{F}^{(\mp it'f'')}\big( V^{(\pm)}(f'') \big) = \mathcal{F}^{(\mp it'f)}\big( U^{(\pm)}(f) \cdot V^{(\pm)}(f) \big) ,$$

where the convolution is over $t'$ because it is the only argument appearing on both sides of the equation. Since $U^{(\pm)}$ and $V^{(\pm)}$ are arbitrary, transformable functions, we can replace them by the arbitrary transformable functions u and v to get, after interchanging the roles of ƒ and $t'$,

$$\mathcal{F}^{(\mp ift)}\big( u(t) \cdot v(t) \big) = \mathcal{F}^{(\mp ift)}\big( u(t) \big) * \mathcal{F}^{(\mp ift)}\big( v(t) \big) . \qquad (2.39j)$$

If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39j) to get

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\, u(t) \cdot v(t)\, dt = U(f) * V(f) \qquad (2.39k)$$

or

$$u(t) \cdot v(t) \leftrightarrow U(f) * V(f) . \qquad (2.39\ell)$$

Equation (2.39b) shows that the forward Fourier transform of the convolution of two functions

is the product of the forward Fourier transform of each function, and (2.39k) shows that the

forward Fourier transform of the product of two functions is the convolution of the forward

Fourier transform of each function. Equations (2.39a) and (2.39j) show that everything we just

said about the forward Fourier transform still holds true when we take the reverse Fourier

transform of the product of two functions or of the convolution of two functions.

When using the Fourier convolution theorem, we usually regard one of the two convolved

functions as representing the undisturbed signal—that is, the true set of values for what is to be

measured—and the other, usually much narrower, function as specifying the blurring or

smearing effect of an imperfect measurement. The blurring or smearing function has different

names in different engineering disciplines; optical engineers often call it the instrument-response

or instrument line-shape function. In Fig. 2.5(a), function u is taken to be the true signal, and in

Fig. 2.5(b) function v is the instrument-response or instrument line-shape function. The convolution

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' = u_{\text{blur}}(t)$$

defines the new function $u_{\text{blur}}(t)$, as shown in Figs. 2.5(c)–2.5(e). The function v is flipped left to right and slid along the $t'$ axis in Fig. 2.5(c) by changing the value of t. Figure 2.5(d) is a close-up of v at a specific value of t, with the shaded region being the area under the product $u(t')\,v(t - t')$. Since $u(t')\,v(t - t')$ is zero where $v(t - t')$ is zero, the area of the shaded region can be found by integrating $u(t')\,v(t - t')$ over $t'$ between $-\infty$ and $+\infty$. This is, of course, just the convolution of u and v for this particular value of t, which means the area of the shaded region must be $u_{\text{blur}}(t)$ for this value of t. Figure 2.5(e) represents the complete $u_{\text{blur}}(t)$ function for all values of t; clearly $u_{\text{blur}}$ has less detail than the original signal u.

The v(t) function in Fig. 2.5(b) is an unusual type of instrument response because it is not an even function of t. Figure 2.5(f) shows a typical even instrument response $v_e(t)$. When the instrument-response function is $v_e$, the blurred signal is

$$u_{e,\text{blur}}(t) = u(t) * v_e(t) . \qquad (2.40a)$$

Written out, this is

$$u_{e,\text{blur}}(t) = \int_{-\infty}^{\infty} u(t')\, v_e(t - t')\, dt' = \int_{-\infty}^{\infty} u(t')\, v_e(t' - t)\, dt' , \qquad (2.40b)$$

with the last integral in (2.40b), which uses the evenness of $v_e$, making it perhaps more obvious that $u_{e,\text{blur}}$ is a localized and weighted average of u centered on t. Instrument-response or line-shape functions are usually designed to be even because an even instrument-response function does not shift the center point of isolated peaks in the true data u.
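The claim that an even instrument response leaves peak centers in place can be demonstrated with a small discrete example; the signal and kernel below are arbitrary choices of ours.

```python
N = 64
u = [0.0] * N
for k in range(-3, 4):
    u[(20 + k) % N] = 4.0 - abs(k)       # symmetric triangular peak centered at n = 20

ve = [0.0] * N                            # even smoothing kernel: ve[n] = ve[-n mod N]
for k in range(-5, 6):
    ve[k % N] = 6.0 - abs(k)

# circular convolution of the signal with the even kernel
blur = [sum(u[m] * ve[(n - m) % N] for m in range(N)) for n in range(N)]

assert max(range(N), key=lambda n: u[n]) == 20      # original peak center
assert max(range(N), key=lambda n: blur[n]) == 20   # blurred peak center unchanged
```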

As described in the first chapter, when using Michelson interferometers, we do not much care about the exact shape of the optical intensity signal u but are instead interested in the shape of its transform,

$$U(f) = \mathcal{F}^{(-ift)}\big( u(t) \big) , \qquad (2.40c)$$

a function of ƒ, the signal frequency. The electrical circuits transmitting and recording the signal u can never do a perfect job—they always blur and smooth the original signal to some extent—so what we end up with is not u(t) and U(ƒ) but rather $u_{e,\text{blur}}(t)$ and the associated Fourier transform

$$U_{e,\text{blur}}(f) = \mathcal{F}^{(-ift)}\big( u_{e,\text{blur}}(t) \big) . \qquad (2.40d)$$

The relationship between $U_{e,\text{blur}}$ and U must be understood to design the electrical circuits properly. Here is an important example of how to use the Fourier convolution theorem. Substitution of (2.40a) into (2.40d) gives

$$U_{e,\text{blur}}(f) = \mathcal{F}^{(-ift)}\big( u(t) * v_e(t) \big) .$$

Using the Fourier convolution theorem as presented in Eq. (2.39a), this is rewritten as

$$U_{e,\text{blur}}(f) = \mathcal{F}^{(-ift)}\big( u(t) \big) \cdot \mathcal{F}^{(-ift)}\big( v_e(t) \big)$$

or

$$U_{e,\text{blur}}(f) = U(f) \cdot V_e(f) , \qquad (2.40e)$$

-- 115

115 --

[FIGURE 2.5(a)–(f): panel (a) shows the true signal u(t); panel (c) shows v(t − t′) flipped and slid along the t′ axis for a chosen value of t; panel (d) is a close-up with the shaded area under the product u(t′)v(t − t′); panel (e) plots the blurred signal u_blur(t); panel (f) shows a typical even instrument response v_e(t).]


where

$$V_e(f) = \mathcal{F}^{(-ift)}\big( v_e(t) \big) .$$

Equation (2.40e) is a very reassuring result, stating that as long as $V_e(f)$ is known and not zero, we can recover the Fourier transform of the true signal U(ƒ) from $U_{e,\text{blur}}(f)$ by calculating

$$U(f) = \frac{U_{e,\text{blur}}(f)}{V_e(f)} . \qquad (2.40f)$$

To design the circuits of a Michelson interferometer, we find the frequencies ƒ for which U(ƒ)

must be known and arrange for Ve to be as constant as possible—and definitely not zero—over

these frequencies. It turns out that preserving certain signal frequencies while neglecting others is

a standard problem in electrical circuit design, and it is usually easy to arrange for this to occur.

There is, in fact, a whole branch of electrical engineering called filter theory that describes

exactly how to design circuits where Ve is zero or very small at some frequencies while being

large and quasi-constant at others.
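The deconvolution recipe of Eq. (2.40f) can be tried out on a small discrete example. The sketch below is our own: it blurs a short sequence with an even kernel whose DFT is nowhere zero, divides the two spectra, and recovers the original sequence.

```python
import cmath

def dft(x, sign=-1):
    # Naive DFT; sign=-1 gives the forward transform, sign=+1 the unnormalized inverse.
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * f * n / N) for n in range(N))
            for f in range(N)]

def idft(X):
    N = len(X)
    return [v / N for v in dft(X, sign=+1)]

u  = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]    # "true" signal
ve = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]    # even kernel; its DFT is 0.6 + 0.4*cos(2*pi*f/8) > 0
blur = [sum(u[m] * ve[(n - m) % 8] for m in range(8)) for n in range(8)]

# Eq. (2.40f): divide the blurred spectrum by the kernel spectrum, then invert.
U_rec = [B / V for B, V in zip(dft(blur), dft(ve))]
u_rec = idft(U_rec)
assert all(abs(a.real - b) < 1e-9 for a, b in zip(u_rec, u))
```

If the kernel's DFT vanished at some frequency, the division would fail there, which is exactly why $V_e(f)$ must be kept away from zero over the frequencies of interest.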

Fourier-transform theory has a history of treating with extreme kindness engineers and scientists

who blindly use its formalism without worrying about whether their manipulations make

mathematical sense. The rule of thumb seems to be that if the final result is mathematically

sound—such as a finite integral or the transform of an obviously transformable function—it

almost never matters whether intermediate steps involve the transforms of functions that

obviously cannot be transformed or even, strictly speaking, are not true functions at all. Any

reasonably comprehensive table of Fourier transforms contains functions that not only violate

requirements (V) through (VIII) in Sec. 2.4 but also have transform integrals that, according to

the standard definition of integration, either diverge or have no well-defined value. This book

shows that these puzzling entries are the modest but ubiquitous legacy of mathematicians who

have extended the meaning of what is meant by an integral and what is meant by a function in

Fourier-transform theory. Their work has not only benefited many scientists and engineers who

no longer have to apologize for the way they solve Fourier-transform problems but has also

helped their students who no longer need to accept without good explanations divergent integrals

and the transforms of poorly defined functions.

The standard definition of an improper integral

$$\int_{-\infty}^{\infty} u(t)\, dt$$

for the function u(t) is that


$$\int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T_1 \to \infty,\; T_2 \to \infty \\ \epsilon_1 \to 0,\; \epsilon_2 \to 0}} \left[ \int_{-T_1}^{t_s - \epsilon_1} u(t)\, dt + \int_{t_s + \epsilon_2}^{T_2} u(t)\, dt \right] , \qquad (2.41a)$$

where $t_s$ is an interior point at which u(t) is singular. No matter how $T_1$, $T_2$, $\epsilon_1$, and $\epsilon_2$ approach their limits, the same answer is expected if the integral exists. We now decide, in the interest of expanding Fourier-transform theory, to change this standard definition of the improper integral by connecting $\epsilon_1$ to $\epsilon_2$ and $T_1$ to $T_2$ as we take the limit,

$$\int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T \to \infty \\ \epsilon \to 0}} \left[ \int_{-T}^{t_s - \epsilon} u(t)\, dt + \int_{t_s + \epsilon}^{T} u(t)\, dt \right] . \qquad (2.41b)$$

The limiting process in definition (2.41b) is said to give the Cauchy principal value of the integral, sometimes written as

$$\mathrm{PV} \int_{-\infty}^{\infty} u(t)\, dt$$

or as an integral sign with a bar drawn through it.

If u(t) has multiple singular points, the definition is expanded in the obvious way. For example, with two singular points at $t_{s1}$ and $t_{s2}$ with $t_{s1} < t_{s2}$, we have

$$\mathrm{PV} \int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T \to \infty \\ \epsilon_1 \to 0,\; \epsilon_2 \to 0}} \left[ \int_{-T}^{t_{s1} - \epsilon_1} u(t)\, dt + \int_{t_{s1} + \epsilon_1}^{t_{s2} - \epsilon_2} u(t)\, dt + \int_{t_{s2} + \epsilon_2}^{T} u(t)\, dt \right] \qquad (2.41c)$$

and so on for three, four, etc., interior points of singularity in u(t). If an improper integral converges to a finite value in the standard sense of (2.41a), then its Cauchy principal value also converges to the same answer, but many improper integrals that do not converge in the sense of (2.41a) nevertheless have well-defined Cauchy principal values. For this reason, it is customary in Fourier-transform theory to interpret all improper integrals—such as the forward and inverse Fourier transforms—as Cauchy principal values, and that is what we shall do from now on. There will be no special notation used to distinguish Cauchy principal values from ordinary improper integrals.
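The balanced limit of (2.41b) is easy to mimic numerically: integrate symmetrically about the singular point so every increment at t is paired with one at −t. This sketch is ours, with arbitrary cutoff values.

```python
import math

def pv_integral(f, ts=0.0, T=100.0, eps=1e-6, n=200000):
    # Approximate the PV integral of f(t) dt with a singular point at ts by
    # summing mirrored midpoint samples over [ts - T, ts - eps] and
    # [ts + eps, ts + T], so exclusion and truncation are exactly balanced.
    total = 0.0
    h = (T - eps) / n
    for k in range(n):
        t = eps + (k + 0.5) * h
        total += (f(ts + t) + f(ts - t)) * h
    return total

# PV of the odd function 1/t is zero, as in Eqs. (2.43b)-(2.43c):
assert abs(pv_integral(lambda t: 1.0 / t)) < 1e-9
```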


Fourier Transforms and Divergent Integrals · 2.10

To show the relevance of the Cauchy principal value, we calculate the Fourier transform of $1/t$, an example already considered above in connection with the sine transform [see the discussion following Eq. (2.10e)]. Using the identity $e^{i\theta} = \cos(\theta) + i\sin(\theta)$, we have

$$\mathcal{F}^{(-ift)}\big( t^{-1} \big) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, t^{-1}\, dt = \int_{-\infty}^{\infty} \cos(2\pi f t)\, t^{-1}\, dt - i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt . \qquad (2.42a)$$

There is no problem evaluating the imaginary part of this transform. Because $t^{-1}\sin(2\pi f t)$ is an even function of t, we can apply formulas (2.19) and (2.10f) to get

$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -2i \int_{0}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -i\pi \quad \text{for } f > 0 .$$

When $f < 0$, we have

$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = i \int_{-\infty}^{\infty} \sin(2\pi |f| t)\, t^{-1}\, dt = i\pi ,$$

allowing us to write

$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -i\pi\, \mathrm{sgn}(f) , \qquad (2.42b)$$

where we define

$$\mathrm{sgn}(f) = \begin{cases} \;\;\,1 & \text{for } f > 0 \\ \;\;\,0 & \text{for } f = 0 \\ -1 & \text{for } f < 0 \end{cases} \qquad (2.42c)$$

The specification that $\mathrm{sgn}(0) = 0$ makes $\mathrm{sgn}(f)$ a proper odd function, equal to zero at $f = 0$, even though it has a jump discontinuity there. It also, of course, makes sense considering that (2.42b) is the integral of the zero function when $f = 0$. Evaluation of the real part of the transform in (2.42a) shows the usefulness of interpreting improper integrals as Cauchy principal values. When $f = 0$, the real part of the left-hand side of (2.42a) becomes, using the standard interpretation of an improper integral in (2.41a),


$$\int_{-\infty}^{\infty} \frac{dt}{t} = \lim_{\substack{T_1 \to \infty,\; T_2 \to \infty \\ \epsilon_1 \to 0,\; \epsilon_2 \to 0}} \left[ \int_{-T_1}^{-\epsilon_1} \frac{dt}{t} + \int_{\epsilon_2}^{T_2} \frac{dt}{t} \right] = \lim_{\substack{T_1 \to \infty,\; T_2 \to \infty \\ \epsilon_1 \to 0,\; \epsilon_2 \to 0}} \left[ \ln\!\left( \frac{\epsilon_1}{T_1} \right) + \ln\!\left( \frac{T_2}{\epsilon_2} \right) \right] = \lim_{\substack{T_1 \to \infty,\; T_2 \to \infty \\ \epsilon_1 \to 0,\; \epsilon_2 \to 0}} \left[ \ln\!\left( \frac{\epsilon_1}{\epsilon_2} \right) + \ln\!\left( \frac{T_2}{T_1} \right) \right] . \qquad (2.43a)$$

The expression $\ln(\epsilon_1/\epsilon_2)$ can be made anything we want depending on the limiting ratio chosen for $\epsilon_1/\epsilon_2$ as $\epsilon_1 \to 0$ and $\epsilon_2 \to 0$; the same is true of $\ln(T_2/T_1)$ as $T_1 \to \infty$ and $T_2 \to \infty$. Therefore, under the standard interpretation of an improper integral, the limit in (2.43a) does not exist. Comparison of (2.41a) to (2.41b) shows that (2.43a) can be converted to a Cauchy principal value by setting $\epsilon_1 = \epsilon_2 = \epsilon$ and $T_1 = T_2 = T$ and taking the limit as $T \to \infty$, $\epsilon \to 0$. This leads to

$$\lim_{\substack{T \to \infty \\ \epsilon \to 0}} \left[ \ln\!\left( \frac{\epsilon}{\epsilon} \right) + \ln\!\left( \frac{T}{T} \right) \right] = 0 ,$$

allowing us to give a well-defined value to the expression $\int_{-\infty}^{\infty} \frac{dt}{t}$.

In general, the Cauchy principal value of the integral of any odd function is always zero,

$$\mathrm{PV} \int_{-\infty}^{\infty} u(t)\, dt = 0 \quad \text{for any function } u \text{ such that } u(-t) = -u(t) , \qquad (2.43b)$$

because when taking the limit we are always simultaneously adding $u(t)\,dt$ increments to the integral at values of t and −t, with the balanced addition of increments always cancelling out. Hence, interpreted as a Cauchy principal value,

$$\int_{-\infty}^{\infty} \cos(2\pi f t)\, t^{-1}\, dt = 0 , \qquad (2.43c)$$

giving a definite meaning to the forward Fourier transform of $1/t$ in (2.42a) using (2.43c) and (2.42b):

$$\mathcal{F}^{(-ift)}\big( t^{-1} \big) = -i\pi\, \mathrm{sgn}(f) . \qquad (2.43d)$$


For this answer to be a true extension of Fourier-transform theory, however, $1/t$ must satisfy Eq. (2.28ℓ); that is, the inverse transform of $-i\pi\,\mathrm{sgn}(f)$ must return $1/t$. Direct evaluation of the inverse transform gives

$$\mathcal{F}^{(itf)}\big( -i\pi\, \mathrm{sgn}(f) \big) = -i\pi \int_{-\infty}^{\infty} e^{2\pi i f t}\, \mathrm{sgn}(f)\, df = -i\pi \int_{-\infty}^{\infty} \cos(2\pi f t)\, \mathrm{sgn}(f)\, df + \pi \int_{-\infty}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df . \qquad (2.43e)$$

The cosine integral is again the integral of an odd function of f, so its Cauchy principal value is zero, but it is still not clear what value to assign to the integral of $\sin(2\pi f t)\,\mathrm{sgn}(f)$. As the integral of an even function, we might try applying formula (2.19) to get

$$\pi \int_{-\infty}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df \overset{?}{=} 2\pi \int_{0}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df = 2\pi \int_{0}^{\infty} \sin(2\pi f t)\, df , \qquad (2.43f)$$

but then we have the same difficulty already encountered when trying to evaluate the sine transform

$$2\pi \int_{0}^{\infty} \sin(2\pi f t)\, df$$

in Eq. (2.10g). To evaluate the inverse transform of $-i\pi\,\mathrm{sgn}(f)$, we need to create a new class of mathematical entities, called generalized functions, together with a set of rules for how they behave inside integrals. This extension of Fourier-transform theory is often called distribution theory, with the generalized functions called distributions.

Generalized functions are based on the well-established mathematical concept of a functional. A functional is a rule for assigning a complex number to each member of a set of test functions, where each test function has only one number assigned to it and the same number may end up assigned to different test functions. The Fourier transform of a function $\phi(t)$ at a specific frequency $f = f_0$ is a functional because it assigns the number $\Phi(f_0) = \mathcal{F}^{(-if_0 t)}\big( \phi(t) \big)$ to the test


function φ. In general, we can use any complex function u(t) having a real argument t as a weighting function inside an integral to create a functional. This functional, called $\int u$, is defined to be

$$\left( \int u \right)(\phi) = \int_{-\infty}^{\infty} u(t)\, \phi(t)\, dt . \qquad (2.44)$$

According to this definition, the functional $\int u$ is linear, like the Fourier transform, because for any two complex constants $\alpha_1$, $\alpha_2$ and test functions $\phi_1$, $\phi_2$,

$$\left( \int u \right)(\alpha_1 \phi_1 + \alpha_2 \phi_2) = \int_{-\infty}^{\infty} u(t)\, [\alpha_1 \phi_1(t) + \alpha_2 \phi_2(t)]\, dt = \alpha_1 \left( \int u \right)(\phi_1) + \alpha_2 \left( \int u \right)(\phi_2) . \qquad (2.45)$$
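A functional of the form (2.44) is straightforward to realize in code. The sketch below is our construction, with an arbitrary weighting function and test functions: it builds ∫u as a Riemann sum and checks the linearity property (2.45).

```python
import math

def make_functional(u, T=20.0, n=40000):
    # Return the functional phi -> integral of u(t) * phi(t) dt as a midpoint
    # Riemann sum; assumes u(t) * phi(t) is negligible outside [-T, T].
    h = 2 * T / n
    ts = [-T + (k + 0.5) * h for k in range(n)]
    def F(phi):
        return sum(u(t) * phi(t) for t in ts) * h
    return F

Fu = make_functional(lambda t: math.exp(-t * t))   # weighting function u(t)
phi1 = lambda t: math.exp(-abs(t))
phi2 = lambda t: math.cos(t) * math.exp(-t * t)

# Linearity, Eq. (2.45):
a, b = 2.0, -3.0
lhs = Fu(lambda t: a * phi1(t) + b * phi2(t))
assert abs(lhs - (a * Fu(phi1) + b * Fu(phi2))) < 1e-9
```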

From the notation $\int u$, it is clear that all functions u, as long as the integral in Eq. (2.44) exists, have associated with them the functional $\int u$ defined for the test functions φ. There are also functionals that behave in every way like the functionals $\int u$ but for which no corresponding true function u can be defined. We can, however, associate with these functionals a new class of mathematical objects, called generalized functions, which can be shown to have many of the properties of true functions. For this reason, it is customary to use function notation when referring to generalized functions. If an already-understood functional has no true function u(t) associated with it, we can use the properties of this already-understood functional to define a generalized function called $u_G(t)$, with the subscript G reminding us that $u_G$ is a generalized function. By analogy with the true function u(t) associated with the functional $\int u$, the generalized function and its behavior inside integrals is defined in terms of the already-known functional, which we call $\int u_G$, using the definition

$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \left( \int u_G \right)(\phi) \qquad (2.46)$$

for any test function φ. Since we already know what complex number the functional $\int u_G$ gives for any test function φ, Eq. (2.46) is not a definition of $\int u_G$ but rather a definition of what it means to put $u_G(t) \cdot \phi(t)$ inside an integral. Clearly, the generalized function itself is well defined only when its product with a test function is integrated over t. Because the functional $\int u_G$ behaves in every way like the functionals $\int u$ based on the Cauchy-principal-value integration of true functions, we have established a new type of integration using the product of generalized


Generalized Functions · 2.11

functions $u_G(t)$ with test functions $\phi(t)$. Hence, we have not only generalized what is meant by a function but have also extended again what is meant by integration.

To handle algebraic expressions involving both generalized functions and true functions, we must define what it means to say two generalized functions $u_G(t)$ and $v_G(t)$ are equal. We say that when

$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt \qquad (2.47a)$$

for all test functions φ, then

$$u_G(t) = v_G(t) . \qquad (2.47b)$$

We also define a generalized function $u_G(t)$, which we know only from its associated functional $\int u_G$ using definition (2.46), to be equal to a true function v(t) when

$$\left( \int u_G \right)(\phi) = \left( \int v \right)(\phi) \qquad (2.48a)$$

for all appropriate test functions φ. Another way of stating this is that whenever

$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} v(t)\, \phi(t)\, dt \qquad (2.48b)$$

for all test functions φ, then

$$u_G(t) = v(t) . \qquad (2.48c)$$

Similarly, we say that two generalized functions $u_G(t)$ and $v_G(t)$ are equal in the interval $a < t < b$ when

$$\left( \int u_G \right)(\phi_{ab}) = \left( \int v_G \right)(\phi_{ab}) \qquad (2.48d)$$

or

$$\int_{-\infty}^{\infty} u_G(t)\, \phi_{ab}(t)\, dt = \int_{-\infty}^{\infty} v_G(t)\, \phi_{ab}(t)\, dt \qquad (2.48e)$$

for all test functions $\phi_{ab}(t)$ that are identically zero for all $t \le a$ and for all $t \ge b$. The key point here is that we are explicitly allowing $\phi_{ab}(t)$ to be nonzero only inside the interval $a < t < b$. We also say that a true function v(t) equals a generalized function $u_G(t)$ in the interval $a < t < b$,


$$u_G(t) = v(t) \quad \text{for } a < t < b , \qquad (2.48f)$$

whenever

$$\int_{-\infty}^{\infty} u_G(t)\, \phi_{ab}(t)\, dt = \int_{-\infty}^{\infty} v(t)\, \phi_{ab}(t)\, dt \qquad (2.48g)$$

for all the $\phi_{ab}(t)$ test functions. In Eqs. (2.48d)–(2.48g), we allow for half-infinite intervals by permitting constant b to be $+\infty$ with constant a finite, and constant a to be $-\infty$ with constant b finite.

The definitions of equality between two generalized functions or between a generalized function and a true function can be, depending on the set of test functions chosen, either very much looser than the standard idea of equality or very much the same. Suppose, by way of analogy, we define two true functions $u_1(t)$ and $u_2(t)$ to be “equal” when

$$\int_{-\infty}^{\infty} u_1(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_2(t)\, \phi(t)\, dt \qquad (2.49)$$

for all test functions φ. If the only allowed test function is $\phi(t) = 0$, then any two functions $u_1(t)$ and $u_2(t)$ are “equal.” If, on the other hand, the allowed test functions are $\phi(t) = e^{\mp 2\pi i f t}$ for all real values of ƒ, we are saying that $u_1(t)$ and $u_2(t)$ are “equal” when their Fourier transforms $\mathcal{F}^{(\mp 2\pi ift)}(u_1(t))$ and $\mathcal{F}^{(\mp 2\pi ift)}(u_2(t))$ are the same. From the Fourier inversion formulas, it then follows that $u_1(t)$ must be identical to $u_2(t)$, except possibly at jump discontinuities and isolated points, for all reasonably well-behaved functions $u_1(t)$ and $u_2(t)$. In general, we expect the set of test functions to be diverse enough that serious thought and some mathematical ingenuity are required to find two functions $u_1(t)$ and $u_2(t)$ that satisfy Eq. (2.49) yet are not basically the same function. Of course, the integrals used in Eq. (2.49)—and all the other integrals involving only true functions in Eqs. (2.44) through (2.48g), for that matter—must be known to exist. Often the finiteness of these integrals and the general smoothness of the test functions are enforced by the requirements that

$$\lim_{t \to \pm\infty} \big[\, t^N \phi(t) \,\big] = 0 \quad \text{for } N = 0, 1, 2, \ldots \qquad (2.50a)$$

and


$$\lim_{t \to \pm\infty} \big[\, t^N \phi^{(M)}(t) \,\big] = 0 \quad \text{for } N = 0, 1, 2, \ldots \text{ and } M = 1, 2, \ldots , \qquad (2.50b)$$

where $\phi^{(M)}$ is the Mth derivative of the test function. A function such as $e^{-at^2}$ for $a > 0$ satisfies (2.50a) and (2.50b), and in general all functions representing physically realistic measurements can be taken to satisfy these two requirements. It turns out, however, that the most useful and popular generalized function used in Fourier theory can handle a wider variety of test functions, requiring only that the test functions be continuous at $t = 0$ (see Sec. 2.14 below).

Continuing to develop what is meant by the = sign applied to generalized functions, we say that the product of a true function w(t) and a generalized function $u_G(t)$ is another generalized function $v_G(t)$,

$$v_G(t) = w(t) \cdot u_G(t) , \qquad (2.51a)$$

defined by requiring

$$\int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_G(t)\, \big[ w(t)\, \phi(t) \big]\, dt$$

for all test functions φ(t). A linear combination of true functions and generalized functions specified by

$$w_G(t) = u_1(t)\, v_{G1}(t) + u_2(t)\, v_{G2}(t) + \cdots \qquad (2.51b)$$

is defined by requiring

$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_1(t)\, v_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} u_2(t)\, v_{G2}(t)\, \phi(t)\, dt + \cdots$$

for all test functions φ(t). In general, there is no difficulty assigning a meaning to equations such as

$$u_1(t)\, v_{G1}(t) + u_2(t)\, v_{G2}(t) + \cdots + u_N(t)\, v_{GN}(t) = U_1(t)\, V_{G1}(t) + U_2(t)\, V_{G2}(t) + \cdots + U_M(t)\, V_{GM}(t) \qquad (2.51c)$$

involving the true functions $u_1, \ldots, u_N, U_1, \ldots, U_M$ and the generalized functions $v_{G1}(t), v_{G2}(t), \ldots, v_{GN}(t), V_{G1}(t), V_{G2}(t), \ldots, V_{GM}(t)$. As long as both sides of the equation are just linear combinations of generalized functions and true functions, we interpret their equality to mean that


$$\int_{-\infty}^{\infty} u_1(t)\, v_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} u_2(t)\, v_{G2}(t)\, \phi(t)\, dt + \cdots + \int_{-\infty}^{\infty} u_N(t)\, v_{GN}(t)\, \phi(t)\, dt$$
$$= \int_{-\infty}^{\infty} U_1(t)\, V_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} U_2(t)\, V_{G2}(t)\, \phi(t)\, dt + \cdots + \int_{-\infty}^{\infty} U_M(t)\, V_{GM}(t)\, \phi(t)\, dt$$

for all test functions φ(t). Even the simplest nonlinear expressions, however, such as

$$v_G(t) \overset{?}{=} \big[ u_G(t) \big]^2 ,$$

cannot be resolved by putting both sides inside an integral, because the right-hand side of

$$\int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt \overset{?}{=} \int_{-\infty}^{\infty} \big[ u_G(t) \big]^2\, \phi(t)\, dt$$

is still undefined. We know that an integral containing a single generalized function is the same as applying the already-understood functional $\int u_G$ to φ,

$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \left( \int u_G \right)(\phi) ,$$

but there is no corresponding rule assigning a value to

$$\int_{-\infty}^{\infty} \big[ u_G(t) \big]^2\, \phi(t)\, dt$$

in terms of the functional $\int u_G$. It turns out that, in general, nonlinear expressions involving generalized functions cannot be given useful interpretations. Hence, generalized functions must be treated with caution unless they are used inside linear combinations of the type shown in (2.51b) and (2.51c).

Although generalized functions do have limitations, there are many things that can be done with them. We can give meaning to $u_G(t - a)$ for any real constant a by defining

$$\int_{-\infty}^{\infty} u_G(t - a)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_G(t)\, \phi(t + a)\, dt \qquad (2.52a)$$

for all test functions φ. This definition is, of course, consistent with what happens when the formal substitution $t' = t - a$ is made inside the original integral,


$$\int_{-\infty}^{\infty} u_G(t - a)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_G(t')\, \phi(t' + a)\, dt' ,$$

treating $u_G(t - a)$ like a true function $u(t - a)$. We can give meaning to $u_G(at)$ for any real constant $a \ne 0$ by defining

$$\int_{-\infty}^{\infty} u_G(at)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\, \phi(t/a)\, dt \qquad (2.52b)$$

for all test functions φ. This definition is consistent with what happens when we make the formal substitution $t' = at$ in the integral:

$$\int_{-\infty}^{\infty} u_G(at)\, \phi(t)\, dt = \begin{cases} \dfrac{1}{a} \displaystyle\int_{-\infty}^{\infty} u_G(t')\, \phi(t'/a)\, dt' & \text{for } a > 0 \\[2ex] \dfrac{1}{|a|} \displaystyle\int_{-\infty}^{\infty} u_G(t')\, \phi(t'/a)\, dt' & \text{for } a < 0 , \end{cases}$$

where for $a < 0$ the interchange of the limits of integration supplies the factor $1/|a|$.

When the argument of $u_G$ is the linear combination $at + c$ for real constants a and c, we define

$$\int_{-\infty}^{\infty} u_G(at + c)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\, \phi\!\left( \frac{t - c}{a} \right) dt \qquad (2.52c)$$

and, combining the arguments used to explain definitions (2.52a) and (2.52b), we see that transforming the variable of integration to $t' = at + c$ gives

$$\int_{-\infty}^{\infty} u_G(at + c)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t')\, \phi\!\left( \frac{t' - c}{a} \right) dt' ,$$

justifying definition (2.52c). In general, any variable transformation that is permitted for the argument of a true function we also permit for the argument of a generalized function, unless it results in an inappropriate test function.
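For a true function u, definition (2.52c) is simply the substitution t′ = at + c, which can be verified numerically; the particular u, φ, a, and c below are arbitrary choices of ours.

```python
import math

def integ(f, T=30.0, n=60000):
    # Midpoint-rule integral over [-T, T]; assumes f decays before |t| = T.
    h = 2 * T / n
    return sum(f(-T + (k + 0.5) * h) for k in range(n)) * h

u   = lambda t: math.exp(-t * t)
phi = lambda t: 1.0 / (1.0 + t * t)
a, c = -2.5, 0.7

# Pattern of Eq. (2.52c) with a true function u:
lhs = integ(lambda t: u(a * t + c) * phi(t))
rhs = integ(lambda t: u(t) * phi((t - c) / a)) / abs(a)
assert abs(lhs - rhs) < 1e-6
```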

We define a generalized function $u_G(t)$ to be even if

$$\int_{-\infty}^{\infty} u_G(t)\, \phi_o(t)\, dt = 0 \qquad (2.52d)$$

for all odd test functions $\phi_o$, and to be odd if

$$\int_{-\infty}^{\infty} u_G(t)\, \phi_e(t)\, dt = 0 \qquad (2.52e)$$

for all even test functions $\phi_e$. This gives $u_G(t)$ the same behavior it would have if it were an even or odd true function multiplied by $\phi_e$ or $\phi_o$ and integrated over all t. Putting a subscript e on the generalized function, $u_{Ge}(t)$, to show that it obeys the above definition for an even generalized function, we note that, as described in Eq. (2.11c) above, any test function $\phi(t)$ can be written as the sum of an even function $\phi_e(t)$ and an odd function $\phi_o(t)$. Hence, for any test function φ and an even generalized function $u_{Ge}(t)$, we can write, using definition (2.52d),

$$\int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, [\phi_e(t) + \phi_o(t)]\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt + \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_o(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt$$

and

$$\int_{-\infty}^{\infty} u_{Ge}(-t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(-t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt - \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_o(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt ,$$

where in the last two steps we use $\phi_o(-t) = -\phi_o(t)$, $\phi_e(-t) = \phi_e(t)$, and definition (2.52d). We see that both


$$\int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(t)\, dt \quad \text{and} \quad \int_{-\infty}^{\infty} u_{Ge}(-t)\, \phi(t)\, dt$$

are equal to

$$\int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt$$

for any test function φ, so by definition (2.47a) for the equality of two generalized functions, it follows that

$$u_{Ge}(-t) = u_{Ge}(t) \qquad (2.52f)$$

for any even generalized function $u_{Ge}(t)$. If $u_{Go}(t)$ is any odd generalized function, we can use $\phi(t) = \phi_e(t) + \phi_o(t)$ and definition (2.52e) to get

$$\int_{-\infty}^{\infty} u_{Go}(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, [\phi_e(t) + \phi_o(t)]\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$$

and

$$\int_{-\infty}^{\infty} u_{Go}(-t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi(-t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_e(t)\, dt - \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt = -\int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$$

or

$$\int_{-\infty}^{\infty} \big[ -u_{Go}(-t) \big]\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt .$$

Clearly, $\int_{-\infty}^{\infty} u_{Go}(t)\, \phi(t)\, dt$ and $\int_{-\infty}^{\infty} \big[ -u_{Go}(-t) \big]\, \phi(t)\, dt$ are equal to each other because they are both equal to $\int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$ for any test function φ, so by definition (2.47a) we conclude that


$$-u_{Go}(-t) = u_{Go}(t)$$

or

$$u_{Go}(-t) = -u_{Go}(t) . \qquad (2.52g)$$

The derivative of a generalized function $u_G(t)$ can also be defined; we write it as

$$u_G'(t) = u_G^{(1)}(t) .$$

What functional $\int u_G'$ defines the generalized function $u_G'(t)$? We specify this new functional $\int u_G'$ with the definition

$$\left( \int u_G' \right)(\phi) = -\left( \int u_G \right)(\phi')$$

or

$$\left( \int u_G' \right)(\phi) = -\int_{-\infty}^{\infty} u_G(t)\, \phi'(t)\, dt = -\int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt \qquad (2.53a)$$

for any test function φ. Therefore, the new generalized function $u_G'(t)$ satisfies the equation

$$\int_{-\infty}^{\infty} u_G'(t)\, \phi(t)\, dt = -\int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt \qquad (2.53b)$$

for any test function φ. We note that this definition is consistent with a formal integration by parts, treating $u_G'(t)$ like a true function $u'(t)$, to get

$$\int_{-\infty}^{\infty} u_G'(t)\, \phi(t)\, dt = \big[ u_G(t)\, \phi(t) \big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt = -\int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt ,$$

with the term in square brackets [ ] zero for all test functions φ. We can make this first term zero either by requiring φ to approach zero as $t \to \pm\infty$ or by having $u_G(t)$ equal a true function in the sense of (2.48g), with the true function becoming zero as $t \to \pm\infty$. The integral involving $\phi'(t) = d\phi/dt$ must also, of course, have a well-defined meaning for all the test functions φ.
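For a differentiable true function with decaying tails, the pattern of (2.53b) is just integration by parts, and a quick numerical check makes the sign convention concrete. The functions below are our own arbitrary smooth, rapidly decaying choices.

```python
import math

u    = lambda t: math.exp(-t * t)
du   = lambda t: -2.0 * t * math.exp(-t * t)                   # u'(t)
phi  = lambda t: math.exp(-0.5 * (t - 1.0) ** 2)
dphi = lambda t: -(t - 1.0) * math.exp(-0.5 * (t - 1.0) ** 2)  # phi'(t)

def integ(f, T=10.0, n=20000):
    # Midpoint-rule integral over [-T, T]; both integrands decay fast.
    h = 2 * T / n
    return sum(f(-T + (k + 0.5) * h) for k in range(n)) * h

# Pattern of Eq. (2.53b): integral of u' * phi = - integral of u * phi'
lhs = integ(lambda t: du(t) * phi(t))
rhs = -integ(lambda t: u(t) * dphi(t))
assert abs(lhs - rhs) < 1e-6
```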

The convolution of two generalized functions $u_G(t)$ and $v_G(t)$ is defined to be another generalized function

$$w_G(t) = u_G(t) * v_G(t) . \qquad (2.54a)$$

From Eqs. (2.47a) and (2.47b), we know that (2.54a) must mean that

$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} \big[ u_G(t) * v_G(t) \big]\, \phi(t)\, dt \qquad (2.54b)$$

for all test functions φ. We now give meaning to both sides of (2.54b) by defining that, for all test functions φ,

$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} \big[ u_G(t) * v_G(t) \big]\, \phi(t)\, dt = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') . \qquad (2.54c)$$

Note that the right-hand side of (2.54c) is as well defined as our previous definitions, since

$$\Phi_v(t') = \int_{-\infty}^{\infty} v_G(t'')\, \phi(t' + t'')\, dt''$$

is just another complex number depending on the real parameter $t'$, which can be treated as another true test function $\Phi_v(t')$ inside the double integral of (2.54c),

$$\int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') = \int_{-\infty}^{\infty} u_G(t')\, \Phi_v(t')\, dt' .$$

As long as $\phi(t' + t'')$ and $\Phi_v(t')$ are both test functions whenever φ is a test function, definition (2.54c) should present no difficulties. To justify this definition, we note that formally treating $u_G(t)$ and $v_G(t)$ as true functions gives

$$\int_{-\infty}^{\infty} dt''\, \phi(t'') \int_{-\infty}^{\infty} dt'\, u_G(t')\, v_G(t'' - t') = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, \phi(t'')\, v_G(t'' - t') ,$$

where the last step interchanges the order of integration. We now use (2.52a) to write

$$\int_{-\infty}^{\infty} \phi(t'')\, v_G(t'' - t')\, dt'' = \int_{-\infty}^{\infty} v_G(t'')\, \phi(t'' + t')\, dt'' ,$$

which leads to


$$\int_{-\infty}^{\infty} \big[ u_G(t) * v_G(t) \big]\, \phi(t)\, dt = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t'' + t') ,$$

justifying the definition given in (2.54c). Note that the order of integration inside the double integral of (2.54c) can be freely interchanged,

$$\int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') = \int_{-\infty}^{\infty} dt''\, v_G(t'') \int_{-\infty}^{\infty} dt'\, u_G(t')\, \phi(t' + t'') .$$

Because the convolution itself is defined as an integral, there is no problem giving a meaning to

the convolution of a true function with a generalized function as long as the true function is an

acceptable test function. For a generalized function uG (t ) and test function (t ) , we have

5 5 5

uG (t ) (t ) ³u

5

G (t 3) (t t 3)dt 3 ³u

5

G (t 3) (t 3 t ) dt 3 ³u

5

G (t t 3) (t 3)dt 3 , (2.55a)

where definition (2.52c) with a 1 and c t is used in the last step of (2.55a). It clearly makes

sense to say that

5

³u

5

G (t t 3) (t 3)dt 3 (t ) uG (t ) ,

uG (t ) (t ) (t ) uG (t ) (2.55b)
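Convolution with a generalized function, as in Eq. (2.55a), can be illustrated numerically by replacing the generalized function with a member of a defining sequence of true functions. A small Python sketch (mine, not from the text): taking $u_G = \delta'$, approximated by the derivative of a narrow Gaussian, the convolution $u_G \ast \phi$ reproduces $\phi'$.

```python
import numpy as np

def delta_prime_n(t, n):
    # derivative of a narrow Gaussian: a true-function stand-in for delta'(t)
    return (n / np.sqrt(np.pi)) * (-2.0 * n**2 * t) * np.exp(-(n * t)**2)

phi = lambda t: np.exp(-t**2)                    # smooth test function
phi_prime = lambda t: -2.0 * t * np.exp(-t**2)   # its exact derivative

n, t0, dt = 50, 0.3, 1e-4
tp = np.arange(-1.0, 1.0, dt)                    # grid for the dummy variable t'
# (delta'_n * phi)(t0) = integral of delta'_n(t') phi(t0 - t') dt' -> phi'(t0)
conv_val = np.sum(delta_prime_n(tp, n) * phi(t0 - tp)) * dt
print(conv_val, phi_prime(t0))
```

The agreement improves as $n$ grows, exactly in the sense of the generalized limits introduced in the next section.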

Generalized Limits · 2.12

Given a sequence of true functions $u_1(t), u_2(t), \ldots, u_n(t), \ldots$, we can form a corresponding sequence of integrals with the test functions $\phi$,

$$ \int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt,\quad \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt,\quad \ldots,\quad \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt,\quad \ldots\,. $$

We define Glim, the generalized limit of the sequence of true functions $u_n(t)$, by taking the standard limit of the sequence of integrals,

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt, $$

and requiring that the generalized limit of the sequence of true functions $u_n(t)$, written as

$$ \operatorname{Glim}_{n\to\infty} u_n(t), $$

satisfy

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \left[\operatorname{Glim}_{n\to\infty} u_n(t)\right]\phi(t)\,dt \tag{2.56a} $$

for any test function $\phi$. In effect, the generalized limit Glim is what we get when we insist on moving the standard limit inside the integral. Almost always, of course, it turns out that the generalized limit is the same as the standard limit,

$$ \operatorname{Glim}_{n\to\infty} u_n(t) = \lim_{n\to\infty} u_n(t), $$

so that

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \left[\lim_{n\to\infty} u_n(t)\right]\phi(t)\,dt, \tag{2.56b} $$

but this is not always the case. If we define the function $\Pi$ (see Fig. 2.6) by

$$ \Pi(t, T) = \begin{cases} 1 & \text{for } |t| < T \\ 1/2 & \text{for } |t| = T \\ 0 & \text{for } |t| > T \end{cases} \tag{2.56c} $$

and construct the sequence of true functions

$$ u_n(t) = \frac{1}{n}\,\Pi\!\left(\frac{t}{n},\, 1\right), \tag{2.56d} $$

then each $u_n(t)$ equals $1/n$ over the interval $|t| < n$, so when $\phi(t) = 1$

$$ \int_{-\infty}^{\infty} u_n(t)\,dt = \frac{1}{n} \int_{-\infty}^{\infty} \Pi\!\left(\frac{t}{n},\, 1\right)dt = 2, $$

which makes

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,dt = 2. \tag{2.56e} $$

On the other hand,

$$ \lim_{n\to\infty} u_n(t) = \lim_{n\to\infty}\left[\frac{1}{n}\,\Pi\!\left(\frac{t}{n},\, 1\right)\right] = 0, $$

which gives

$$ \int_{-\infty}^{\infty} \left[\lim_{n\to\infty} u_n(t)\right] dt = 0. \tag{2.56f} $$

______________________________________________________________________________
FIGURE 2.6. The function $\Pi(t, T)$: equal to 1 for $|t| < T$, to $1/2$ at $t = \pm T$, and to 0 for $|t| > T$.
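The disagreement between the two limits is easy to see numerically. A small Python sketch (mine, not from the text) evaluates $\int u_n(t)\,dt$ and the pointwise size of $u_n$ for increasing $n$: the integral stays at 2 while the function itself sinks to zero everywhere.

```python
import numpy as np

def Pi(t, T):
    # rectangle function of Eq. (2.56c): 1 for |t| < T, 1/2 at |t| = T, 0 for |t| > T
    return np.where(np.abs(t) < T, 1.0, np.where(np.abs(t) == T, 0.5, 0.0))

dt = 1e-3
t = np.arange(-100.0, 100.0, dt)
for n in (1, 5, 25):
    u_n = (1.0 / n) * Pi(t / n, 1.0)
    integral = np.sum(u_n) * dt     # integral of u_n(t) against phi(t) = 1
    peak = u_n.max()                # largest pointwise value of u_n
    print(n, integral, peak)
```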

The disagreement of (2.56e) and (2.56f) shows that there can be a very important difference between the generalized limit and the standard limit, because Eq. (2.56b) does not always hold true. We cannot avoid this problem by ruling out constant test functions such as $\phi(t) = 1$. Consider, for example,

$$ \phi(t) = \frac{1}{1 + t^2} \quad\text{and}\quad u_n(t) = t\,\sin(t/n). $$

We find that²¹

$$ \int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1 + t^2}\,dt = \pi e^{-1/n}, \tag{2.57a} $$

which gives

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1 + t^2}\,dt = \pi. \tag{2.57b} $$

On the other hand, since $t\,\sin(t/n) \to 0$ as $n \to \infty$ for every fixed $t$,

$$ \int_{-\infty}^{\infty} \frac{\lim_{n\to\infty}\bigl[t\,\sin(t/n)\bigr]}{1 + t^2}\,dt = 0. \tag{2.57c} $$

Once again, we have found a sequence of true functions $u_n(t)$ that does not satisfy (2.56b). This second example can, in fact, be seen to fail (2.56b) for much the same reason as the first. Since an even function is being integrated, we can write that [see Eq. (2.19)]

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1 + t^2}\,dt = 2 \lim_{n\to\infty} \int_{0}^{\infty} \frac{t\,\sin(t/n)}{1 + t^2}\,dt. \tag{2.57d} $$

Consider what happens to the first, positive hump of the sine as $n$ increases in the integral on the right-hand side of Eq. (2.57d). The values of $t$ for which $\sin(t/n)$ is significantly different from zero, say from $n \cdot (\pi/4)$ to $n \cdot (3\pi/4)$, comprise an interval $\Delta t \cong n \cdot (\pi/2)$ with a width that increases linearly with $n$, just like the interval $2n$ in (2.56d) over which $\Pi(t/n, 1)$ equals one. The center of this hump is at $t \cong n \cdot (\pi/2)$, so as $n$ increases, the hump's center appears at ever larger values of $t$. Hence, we can make the approximation that for large $n$

$$ \frac{t}{1 + t^2} \cong \frac{1}{t} \cong \frac{2}{n\pi} $$

at the hump. Consequently the size of

$$ \frac{t\,\sin(t/n)}{1 + t^2} $$

at the hump decreases as $1/n$, while the hump's width, $\Delta t \cong n \cdot (\pi/2)$, increases as $n$. The product of the size and width therefore tends to a constant as $n$ gets large, preventing the integral from shrinking as $n \to \infty$. This is the same phenomenon that caused our first example $n^{-1}\,\Pi(t/n, 1)$ to fail Eq. (2.56b). Up to this point, we have, of course, only discussed the contribution of the first hump of the sine; the later humps can be analyzed in the same way.

²¹ I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, edited by Alan Jeffrey, 5th ed. (Academic Press, New York, 1994), p. 445, formula 4 in Sec. 3.723 with $a = 1/n$ and $\beta = 1$.
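The tabulated result (2.57a) can be spot-checked by quadrature. A quick Python sketch (mine, not from the text) approximates the integral on a large truncated grid and compares it with $\pi e^{-1/n}$:

```python
import numpy as np

dt = 0.01
t = np.arange(-4000.0, 4000.0, dt) + dt / 2.0   # symmetric midpoint grid
for n in (1, 2, 4):
    # integral of t*sin(t/n)/(1+t^2), truncated at |t| = 4000
    val = np.sum(t * np.sin(t / n) / (1.0 + t**2)) * dt
    print(n, val, np.pi * np.exp(-1.0 / n))
```

The truncation error shrinks like $n/T$ for a grid cut off at $|t| = T$, so the agreement is only approximate but clearly visible.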

Fourier Transforms of Generalized Functions · 2.13

For every generalized function $u_G(t)$, there is at least one sequence of true functions $u_1(t), u_2(t), \ldots, u_n(t), \ldots$ such that

$$ \operatorname{Glim}_{n\to\infty} u_n(t) = u_G(t). \tag{2.58a} $$

This formula should be interpreted in the sense of (2.47b) and (2.56a); that is, it means

$$ \int_{-\infty}^{\infty} \left[\operatorname{Glim}_{n\to\infty} u_n(t)\right]\phi(t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt \tag{2.58b} $$

for all test functions $\phi$. We use the sequence of true functions whose generalized limit is the generalized function to define the Fourier transform of the generalized function. If a sequence of true functions $w_1(t), w_2(t), \ldots, w_n(t), \ldots$ can be forward Fourier transformed to give another sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ with

$$ W_n(f) = \int_{-\infty}^{\infty} w_n(t)\,e^{-2\pi i f t}\,dt \tag{2.59a} $$

and

$$ w_n(t) = \int_{-\infty}^{\infty} W_n(f)\,e^{2\pi i f t}\,df \tag{2.59b} $$

for all values of $n$, we then define the forward Fourier transform of the generalized function

$$ w_G(t) = \operatorname{Glim}_{n\to\infty} w_n(t) \tag{2.59c} $$

to be

$$ F^{(-ift)}\bigl(w_G(t)\bigr) = \operatorname{Glim}_{n\to\infty} W_n(f). \tag{2.59d} $$

We expect the sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ also to give a generalized function when we take the generalized limit of the sequence,

$$ W_G(f) = \operatorname{Glim}_{n\to\infty} W_n(f), \tag{2.59e} $$

so that

$$ F^{(-ift)}\bigl(w_G(t)\bigr) = W_G(f). \tag{2.59f} $$

The double-arrow notation $\Leftrightarrow$ introduced in the discussion after Eq. (2.35d) can be used to restate this definition more concisely. We define that whenever $w_1(t), w_2(t), \ldots$ has the generalized limit $w_G(t)$, $W_1(f), W_2(f), \ldots$ has the generalized limit $W_G(f)$, and

$$ w_1(t) \Leftrightarrow W_1(f),\quad w_2(t) \Leftrightarrow W_2(f),\quad \ldots,\quad w_n(t) \Leftrightarrow W_n(f),\quad \ldots, $$

then

$$ w_G(t) \;\Leftrightarrow\; W_G(f). \tag{2.59g} $$

Now at last we can attach a meaning to the Fourier transform pair that could not be completed in Eqs. (2.43d)–(2.43f). The explicit development that follows is perhaps somewhat long, but worth doing to show how to construct the Fourier transforms of some of the functions violating one or more of requirements (V) through (VIII) in Sec. 2.4. We create the sequence

$$ \text{``}\operatorname{sgn}(f)\text{''} = \operatorname{Glim}_{n\to\infty}\bigl[\operatorname{sgn}(f)\,\Pi(f, n)\bigr], \tag{2.60a} $$

where quotes " " are used to indicate that the "$\operatorname{sgn}(f)$" is a generalized function instead of the true function $\operatorname{sgn}(f)$ defined in Eq. (2.42c) above. The reason for this choice of sequence is straightforward—the function $[\operatorname{sgn}(f)\,\Pi(f, n)]$ satisfies requirements (V) through (VIII) in Sec. 2.4 for every finite value of $n$ and so has a well-defined Fourier transform; as $n$ increases, the function $[\operatorname{sgn}(f)\,\Pi(f, n)]$ resembles ever more closely the $\operatorname{sgn}(f)$ function to which we want to give a Fourier transform. We note that for any test function $\phi$

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,\Pi(f, n)\,df = \lim_{n\to\infty} \int_{-n}^{n} \phi(f)\,\operatorname{sgn}(f)\,df = \int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,df, $$

so

$$ \text{``}\operatorname{sgn}(f)\text{''} = \operatorname{sgn}(f) \tag{2.60b} $$


in the sense of Eq. (2.48c). This equivalence can be used to justify dropping the distinction

between “ sgn( f ) ” and sgn( f ) . Applied mathematicians who work with generalized functions

often drop the distinction between a generalized function and the true function to which it is

equivalent, and the double-quote notation introduced here is not standard usage. There is,

however, no harm in keeping track of the distinction between the two types of functions, and the

double quotes acknowledge the close relationship of the two functions while reminding us that

they are not the same.

The inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)]$ is, using the identity $e^{i\alpha} = \cos\alpha + i\sin\alpha$,

$$ F^{(itf)}\bigl(-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)\bigr) = -i\pi \int_{-\infty}^{\infty} e^{2\pi i f t}\,\operatorname{sgn}(f)\,\Pi(f, n)\,df = 2\pi \int_{0}^{n} \sin(2\pi f t)\,df. $$

Here the integral between $-n$ and $n$ of $[\cos(2\pi f t)\operatorname{sgn}(f)]$, which is an odd function in $f$, is zero according to Eq. (2.17); and the integral between $-n$ and $n$ of $[\sin(2\pi f t)\operatorname{sgn}(f)]$, which is an even function in $f$, is twice the value of its integral from zero to $n$ according to Eq. (2.19). Making the substitution $f' = 2\pi t f$ gives

$$ F^{(itf)}\bigl(-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)\bigr) = \frac{1}{t}\Bigl[-\cos f'\Bigr]_{0}^{2\pi n t} = \frac{1}{t}\bigl[1 - \cos(2\pi n t)\bigr]. $$

This shows that the inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)]$ is $(1/t)[1 - \cos(2\pi n t)]$.

Now we calculate the forward Fourier transform of $(1/t)[1 - \cos(2\pi n t)]$. We get

$$ F^{(-ift)}\Bigl(t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\Bigr) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\,t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\,dt = \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{dt}{t} - \int_{-\infty}^{\infty} e^{-2\pi i f t}\cos(2\pi n t)\,\frac{dt}{t} $$
$$ = -i\pi\,\operatorname{sgn}(f) + i \int_{-\infty}^{\infty} \frac{1}{t}\cos(2\pi n t)\sin(2\pi f t)\,dt. $$

In the last step, Eq. (2.43d) is used to evaluate the integral of $[e^{-2\pi i f t}\,t^{-1}]$; we also substitute $e^{i\alpha} = \cos\alpha + i\sin\alpha$ into the integral of $[e^{-2\pi i f t}\,t^{-1}\cos(2\pi n t)]$, discovering that the Cauchy principal value of the integral of $[t^{-1}\cos(2\pi f t)\cos(2\pi n t)]$, which is an odd function in $t$, is zero [see Eq. (2.17)]. The remaining integral over the even function

$$ t^{-1}\sin(2\pi f t)\cos(2\pi n t) $$

can be simplified by applying Eq. (2.19) and then consulting a table of definite integrals,²²

$$ \int_{-\infty}^{\infty} \frac{1}{t}\cos(2\pi n t)\sin(2\pi f t)\,dt = 2\operatorname{sgn}(f) \int_{0}^{\infty} \frac{1}{t}\cos(2\pi n t)\sin(2\pi|f|t)\,dt = \pi\,\operatorname{sgn}(f)\,\Pi(2\pi n,\, 2\pi|f|) = \pi\,\operatorname{sgn}(f)\,\Pi(n,\, |f|). $$

Consequently,

$$ F^{(-ift)}\Bigl(t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\Bigr) = \operatorname{sgn}(f)\bigl[-i\pi + i\pi\,\Pi(n, |f|)\bigr] = -i\pi\,\operatorname{sgn}(f)\bigl[1 - \Pi(n, |f|)\bigr] = -i\pi\,\operatorname{sgn}(f)\,\Pi(f, n). $$

Hence, $(1/t)[1 - \cos(2\pi n t)]$ and $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)]$ are a Fourier-transform pair,

$$ \frac{1}{t}\bigl[1 - \cos(2\pi n t)\bigr] \;\Leftrightarrow\; -i\pi\,\operatorname{sgn}(f)\,\Pi(f, n). $$

We therefore have the two sequences

$$ \frac{1}{t}\bigl[1 - \cos(2\pi t)\bigr],\quad \frac{1}{t}\bigl[1 - \cos(4\pi t)\bigr],\quad \ldots,\quad \frac{1}{t}\bigl[1 - \cos(2\pi n t)\bigr],\quad \ldots \tag{2.60c} $$

and

$$ -i\pi\,\operatorname{sgn}(f)\,\Pi(f, 1),\quad -i\pi\,\operatorname{sgn}(f)\,\Pi(f, 2),\quad \ldots,\quad -i\pi\,\operatorname{sgn}(f)\,\Pi(f, n),\quad \ldots $$

²² I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, p. 453, formula 2 in Sec. 3.741 with $a = 2\pi|f|$ and $b = 2\pi n$.

such that each member of the lower sequence is the forward Fourier transform of the corresponding member of the upper sequence and each member of the upper sequence is the inverse Fourier transform of the corresponding member of the lower sequence. We know from (2.60a) and (2.60b) that the generalized function given by the generalized limit of the lower sequence is

$$ \operatorname{Glim}_{n\to\infty}\bigl[-i\pi\,\operatorname{sgn}(f)\,\Pi(f, n)\bigr] = -i\pi\,\text{``}\operatorname{sgn}(f)\text{''} = -i\pi\,\operatorname{sgn}(f), \tag{2.60d} $$

but what is the generalized function given by the generalized limit of the upper sequence? We have for any test function $\phi$

$$ \int_{-\infty}^{\infty} \phi(t)\,\operatorname{Glim}_{n\to\infty}\Bigl\{t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\Bigr\}\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\bigl[1 - \cos(2\pi n t)\bigr]\,dt $$
$$ = \lim_{n\to\infty}\left\{\int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \int_{-\infty}^{\infty} \phi(t)\cos(2\pi n t)\,\frac{dt}{t}\right\} \tag{2.60e} $$
$$ = \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt. $$

The remaining limit can be split into three pieces,

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{-\varepsilon} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt + \lim_{n\to\infty} \int_{-\varepsilon}^{\varepsilon} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt + \lim_{n\to\infty} \int_{\varepsilon}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt, \tag{2.60f} $$

where $\varepsilon$ is a small positive number. By making all the test functions $\phi(t)$ have finite variation as in requirement (VIII) in Sec. 2.4, we recognize that the first and third integrals on the right-hand side of (2.60f) become zero as $n \to \infty$, because eventually the cosine oscillates both positive and negative over each infinitesimal interval while $\phi(t)/t$ barely changes at all—the integrals can be made as small as desired by picking a large enough value of $n$. For future use, we note that for any continuous, finite-variation test function $\phi$,

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\sin(nt)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\cos(nt)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,e^{\mp i n t}\,dt = 0, $$

so that

$$ \operatorname{Glim}_{n\to\infty} \sin(nt) = \operatorname{Glim}_{n\to\infty} \cos(nt) = \operatorname{Glim}_{n\to\infty} e^{\mp i n t} = 0. \tag{2.60g} $$

As for the middle integral on the right-hand side of (2.60f), we can write

$$ \int_{-\varepsilon}^{\varepsilon} \phi(t)\cos(2\pi n t)\,\frac{dt}{t} \cong \phi(0) \int_{-\infty}^{\infty} \Pi(t, \varepsilon)\,\frac{1}{t}\cos(2\pi n t)\,dt, $$

where we have chosen $\varepsilon$ small enough that $\phi(t)$ barely changes over the integral, letting us replace it by $\phi(0)$. Now the middle integral on the right-hand side of (2.60f) can be recognized as the Cauchy principal value of the integral of $(1/t)\,\Pi(t, \varepsilon)\cos(2\pi n t)$, which is an odd function of $t$ and must be zero according to Eq. (2.17). Hence, (2.60f) becomes

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt = 0, $$

and (2.60e) reduces to

$$ \int_{-\infty}^{\infty} \phi(t)\,\operatorname{Glim}_{n\to\infty}\Bigl\{t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\Bigr\}\,dt = \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} \tag{2.60h} $$

for any test function $\phi$. Since (2.60h) denotes equality in the sense of Eq. (2.48c), we can define the generalized function "$t^{-1}$" to be

$$ \text{``}t^{-1}\text{''} = \operatorname{Glim}_{n\to\infty}\Bigl\{t^{-1}\bigl[1 - \cos(2\pi n t)\bigr]\Bigr\}, \tag{2.60i} $$

so that, in the sense of Eq. (2.48c),

$$ \int_{-\infty}^{\infty} \text{``}t^{-1}\text{''}\,\phi(t)\,dt = \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t}, \tag{2.60j} $$

where the integral on the right-hand side is understood as a Cauchy principal value.

Equations (2.60d) and (2.60j) show that $[-i\pi\,\text{``}\operatorname{sgn}(f)\text{''}]$ and "$t^{-1}$" are the generalized limits of the two sequences in (2.60c). Because all the sequence members are Fourier transform pairs, we know, according to (2.59g), that $[-i\pi\,\text{``}\operatorname{sgn}(f)\text{''}]$ and "$t^{-1}$" are a Fourier transform pair even though $[-i\pi\,\operatorname{sgn}(f)]$ and $t^{-1}$ do not satisfy requirements (V) through (VIII) in Sec. 2.4 and, as shown in Eqs. (2.43a) and (2.43f), their transforms cannot be evaluated as standard integrals. In this sense, we can write that

$$ F^{(-ift)}\bigl(t^{-1}\bigr) = -i\pi\,\operatorname{sgn}(f) \tag{2.60k} $$

and

$$ F^{(itf)}\bigl(-i\pi\,\operatorname{sgn}(f)\bigr) = t^{-1}. \tag{2.60\ell} $$

This can also be written as, reversing the sign of $f$ in (2.60k), the sign of $t$ in (2.60$\ell$), and using Eq. (2.42c) to get that $\operatorname{sgn}(-f) = -\operatorname{sgn}(f)$,

$$ F^{(ift)}\bigl(t^{-1}\bigr) = i\pi\,\operatorname{sgn}(f) \tag{2.60m} $$

and

$$ F^{(-itf)}\bigl(i\pi\,\operatorname{sgn}(f)\bigr) = t^{-1}. \tag{2.60n} $$

It is important to remember that Eqs. (2.60k) and (2.60m) are true only when integrals between $-\infty$ and $+\infty$ are interpreted as Cauchy principal values, and (2.60$\ell$) and (2.60n) are true only when equality is defined as in Eq. (2.48c) using generalized function theory. Strictly speaking, it might be better to say that the Cauchy principal value of

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,\frac{dt}{t} \quad\text{is}\quad \mp i\pi\,\operatorname{sgn}(f) $$

and that

$$ \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\bigl[\mp i\pi\,\text{``}\operatorname{sgn}(f)\text{''}\bigr]\,df = \text{``}t^{-1}\text{''}. $$

The formula

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,\frac{dt}{t} = \mp i\pi\,\operatorname{sgn}(f) \tag{2.61a} $$

is usually not listed in standard tables of improper integrals without notation showing that it is a Cauchy principal value, and the equality

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,\operatorname{sgn}(f)\,df = \mp \frac{i}{\pi t} \tag{2.61b} $$

is usually not listed in these tables under any circumstances. It is also true, however, that (2.61a)

and (2.61b) are constantly used either explicitly or implicitly in Fourier-transform theory; and

lists of Fourier-transform pairs often contain (2.61a) and (2.61b). Unfortunately, it is standard

practice in the Fourier-transform tables that do list these integrals to omit any explanation that

they are only true when interpreted as the Fourier transforms of generalized functions. In general,

when using tables of Fourier transforms, all those transforms that do not exist as standard

integrals or Cauchy principle values should be interpreted as the transforms of generalized

functions and used only in the context of generalized function theory.
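The principal-value reading of (2.61a) can be spot-checked numerically. In the Python sketch below (mine, not from the text), a grid symmetric about $t = 0$ makes the odd $\cos(2\pi f t)/t$ part cancel in pairs—exactly the Cauchy principal value—leaving $-i\int \sin(2\pi f t)/t\,dt = -i\pi\,\operatorname{sgn}(f)$:

```python
import numpy as np

dt = 0.005
t_pos = np.arange(dt / 2.0, 2000.0, dt)   # positive half of a symmetric grid
for f in (0.5, -0.5, 2.0):
    # On a symmetric grid the odd cos(2*pi*f*t)/t part cancels (the principal
    # value); the even sine part survives and doubles:
    val = -1j * 2.0 * np.sum(np.sin(2.0 * np.pi * f * t_pos) / t_pos) * dt
    print(f, val, -1j * np.pi * np.sign(f))
```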

The Delta Function · 2.14

The most popular and useful generalized function is the Dirac delta function, a name usually shortened to just the delta function. In a sense, Secs. 2.11–2.13 describing generalized function theory are there just so we can give a mathematically exact description of the delta function. The delta function is often inexactly described in elementary textbooks as that function $\delta(t)$ such that

$$ \delta(t) = \begin{cases} \infty & \text{for } t = 0 \\ 0 & \text{for } t \neq 0 \end{cases} \tag{2.62a} $$

with

$$ \int_{a}^{b} \delta(t)\,f(t)\,dt = \begin{cases} f(0) & \text{for } a < 0 < b \\ 0 & \text{for } a < b < 0 \text{ or } 0 < a < b \end{cases}. \tag{2.62b} $$

The delta function is also sometimes described as the limit of a sequence of true functions, such as

$$ \delta(t) = \lim_{n\to\infty}\Bigl[n\,\Pi\bigl(t,\, (2n)^{-1}\bigr)\Bigr] \tag{2.63a} $$

or

$$ \delta(t) = \lim_{n\to\infty}\left(\frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\right). \tag{2.63b} $$

There are, in fact, two different—but equivalent—mathematically exact ways to define the delta function. The first way is to create a well-defined functional $\int_\delta$ that, when operating on a complex-valued test function $\phi(t)$ with a real argument $t$, produces as its complex number the value $\phi(0)$ of $\phi$ at $t$ equal to zero,

$$ \int_\delta(\phi) = \phi(0). \tag{2.64a} $$

This makes $\delta(t)$ the generalized function associated with the functional $\int_\delta$, with $\delta(t)$ having the property that

$$ \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt = \phi(0) \tag{2.64b} $$

for all test functions $\phi$. The second way to define $\delta(t)$ is to say it is the generalized limit of a sequence such as the ones specified in (2.63a) and (2.63b),

$$ \delta(t) = \operatorname{Glim}_{n\to\infty}\Bigl[n\,\Pi\bigl(t,\, (2n)^{-1}\bigr)\Bigr] \tag{2.65a} $$

or

$$ \delta(t) = \operatorname{Glim}_{n\to\infty}\left(\frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\right). \tag{2.65b} $$

Although the delta function is a generalized function in every sense of the term, we follow standard notation and do not add the $G$ subscript—or add the quotes " "—used to label other generalized functions in this chapter.

Defining $\delta(t)$ with a functional, as in (2.64a), shows that this generalized function can be used on an extremely large set of test functions—any true function that is continuous at the origin is an acceptable and appropriate test function. The subset of test functions $\phi_{ab}$ used in Eqs. (2.48d)–(2.48g) has $a < b$ with $\phi_{ab}(t)$ automatically set to zero when $t$ does not lie inside the interval $a < t < b$. These functions can be used in (2.64b) to show that

$$ \int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = \phi_{ab}(0) = 0 $$

when $a < b < 0$ or $0 < a < b$. Therefore, we have

$$ \int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} 0 \cdot \phi_{ab}(t)\,dt = 0 \tag{2.65c} $$

whenever the interval $a < t < b$ does not include $t = 0$. This is a mathematically exact way of stating the lower level of Eq. (2.62a). If $\delta(t)$ is defined using generalized limits, as in Eqs. (2.65a) and (2.65b), then we must show why Eq. (2.64b) is true. The sequence in (2.65b), for example, leads to

$$ \int_{-\infty}^{\infty} \phi(t)\left[\operatorname{Glim}_{n\to\infty} \frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\right]dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\,\phi(t)\,dt \cong \lim_{n\to\infty}\left\{\phi(0) \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\,dt\right\} = \phi(0)\,\lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}\,e^{-n^2 t^2}\,dt = \phi(0) \tag{2.66} $$

for any test function $\phi$. As $n$ gets large in (2.66), only the value of $\phi$ at $t = 0$ can contribute significantly to the integral. Replacing $\phi(t)$ by $\phi(0)$ quickly reduces the whole expression to $\phi(0)$, showing that the generalized limit of the sequence in (2.65b) is indeed the delta function.

Some commonly used sequences that have the delta function as their generalized limits are

$$ \delta(t) = \operatorname{Glim}_{n\to\infty} \frac{n}{\pi\,(1 + n^2 t^2)}, \tag{2.67a} $$

$$ \delta(t) = \operatorname{Glim}_{n\to\infty} \frac{\sin^2(nt)}{n\pi t^2}, \tag{2.67b} $$

$$ \delta(t) = \operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}, \tag{2.67c} $$

and so on. Perhaps the most interesting of these sequences is (2.67c). We know from (2.65c) that one important property of the delta function is

$$ \int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = 0 $$

whenever the interval $a < t < b$ does not include $t = 0$. The reason that

$$ \int_{-\infty}^{\infty} \left[\operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right]\phi_{ab}(t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \left[\frac{\sin(2\pi n t)}{\pi t}\right]\phi_{ab}(t)\,dt = 0 $$

whenever the interval $a < t < b$ does not include $t = 0$ is that for extremely large $n$ values the sine oscillates rapidly between $+1$ and $-1$ while $\phi_{ab}(t)/t$ stays essentially constant for $t \neq 0$, averaging the integrand to zero. Hence,

$$ \operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t} = 0 \quad\text{for } t \neq 0 $$

for the same reason that

$$ \operatorname{Glim}_{n\to\infty} e^{\mp i n t} = 0 $$

in Eq. (2.60g). To understand the behavior near $t = 0$, we construct a test function $\phi_{a0b}(t)$ in which the interval $a < t < b$ does include $t = 0$. Now we can write, transforming the variable of integration to $t' = 2\pi n t$,

$$ \int_{-\infty}^{\infty} \left[\operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right]\phi_{a0b}(t)\,dt = \lim_{n\to\infty} \frac{1}{\pi} \int_{-\infty}^{\infty} \left[\frac{\sin(t')}{t'}\right]\phi_{a0b}\!\left(\frac{t'}{2\pi n}\right)dt' = \phi_{a0b}(0)\,\frac{1}{\pi} \lim_{n\to\infty} \int_{-\infty}^{\infty} \left[\frac{\sin(t')}{t'}\right]dt' = \phi_{a0b}(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi_{a0b}(t)\,dt, $$

where in the second-to-last step we use (see any handbook of definite integrals)

$$ \int_{-\infty}^{\infty} \frac{\sin(t')}{t'}\,dt' = \pi. $$

Any arbitrary test function can be written as a function $\phi_{a0b}(t)$ whose interval of nonzero values includes $t = 0$ plus other test functions whose intervals of nonzero values do not include $t = 0$; that is, we can always write $\phi(t) = \phi_{a0b}(t) + [\text{other functions zero at the origin}]$. When this $\phi(t)$ is multiplied by $\operatorname{Glim}_{n\to\infty} \sin(2\pi n t)/(\pi t)$ and integrated over $t$ between $-\infty$ and $+\infty$, we realize that the value of the integral is $\phi_{a0b}(0) = \phi(0)$ because the other functions that are zero at the origin give zero contribution to the integral as $n \to \infty$. Consequently,

$$ \int_{-\infty}^{\infty} \left[\operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right]\phi(t)\,dt = \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt, $$

showing that

$$ \operatorname{Glim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t} $$

equals the delta function in the only sense that two generalized functions can ever be equal—the integral of the left-hand side with any test function $\phi$ is always the same as the integral of the right-hand side with any test function $\phi$ [see discussion after Eq. (2.47b)]. Figures 2.7(a)–2.7(c) and 2.8(a)–2.8(c) plot the behavior of the $(n/\sqrt{\pi})\,e^{-n^2 t^2}$ and $(\pi t)^{-1}\sin(2\pi n t)$ sequences, showing the two different ways these sequences change into delta functions.
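Each of the sequences (2.67a)–(2.67c) can be checked numerically against Eq. (2.64b). The Python sketch below (mine, not from the text) integrates each sequence member against a smooth test function with $\phi(0) = 1$; all three integrals drift toward $\phi(0)$ as $n$ grows.

```python
import numpy as np

phi = lambda t: np.cos(t) / (1.0 + t**2)        # smooth test function with phi(0) = 1
dt = 1e-4
t = np.arange(-50.0, 50.0, dt) + dt / 2.0       # midpoint grid, never hits t = 0

results = {}
for n in (10, 100, 1000):
    lorentz = n / (np.pi * (1.0 + (n * t)**2))            # Eq. (2.67a)
    fejer   = np.sin(n * t)**2 / (n * np.pi * t**2)       # Eq. (2.67b)
    sinc    = np.sin(2.0 * np.pi * n * t) / (np.pi * t)   # Eq. (2.67c)
    results[n] = [np.sum(seq * phi(t)) * dt for seq in (lorentz, fejer, sinc)]
    print(n, np.round(results[n], 4))
```

The Lorentzian sequence converges most slowly because of its heavy $1/t^2$ tails, which is visible in the printed values.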

We note that for any odd test function $\phi_o(t)$

$$ \int_{-\infty}^{\infty} \delta(t)\,\phi_o(t)\,dt = \phi_o(0) = 0 $$

because, according to Eq. (2.12a), odd functions are zero at the origin. Therefore, from the definitions of even and odd generalized functions in Eqs. (2.52d) and (2.52e), we conclude that the delta function is an even generalized function because its integral with all odd test functions is always zero. This means we can write [see Eq. (2.52f)]

$$ \delta(-t) = \delta(t). \tag{2.68a} $$

Applying Eq. (2.52a) to shift the argument of the delta function gives

$$ \int_{-\infty}^{\infty} \delta(t - t_0)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t + t_0)\,dt = \phi(t_0) $$

and, because the delta function equals the zero function for $t \neq 0$, this result can be written as

$$ \int_{a}^{b} \delta(t - t_0)\,\phi(t)\,dt = \begin{cases} 0 & \text{for } a < b < t_0 \text{ or } t_0 < a < b \\ \phi(t_0) & \text{for } a < t_0 < b \end{cases}. \tag{2.68b} $$

For any real constant $c \neq 0$, we also have

$$ \int_{-\infty}^{\infty} \delta(c \cdot t)\,\phi(t)\,dt = \frac{1}{|c|} \int_{-\infty}^{\infty} \delta(t)\,\phi(t/c)\,dt = \frac{1}{|c|}\,\phi(0), $$

______________________________________________________________________________
FIGURES 2.7(a)–2.7(c). Plots showing how $(n/\sqrt{\pi})\,e^{-n^2 t^2}$ changes into a delta function of $t$ as $n$ increases.

FIGURES 2.8(a)–2.8(c). Plots showing how $(\pi t)^{-1}\sin(2\pi n t)$ changes into a delta function of $t$ as $n$ increases.
______________________________________________________________________________

from which we conclude that

$$ \delta(ct) = \frac{1}{|c|}\,\delta(t) \tag{2.68c} $$

because

$$ \int_{-\infty}^{\infty} \delta(c \cdot t)\,\phi(t)\,dt = \frac{1}{|c|}\,\phi(0) = \int_{-\infty}^{\infty} \left[\frac{1}{|c|}\,\delta(t)\right]\phi(t)\,dt $$

for all test functions $\phi$. We note that this last rule, Eq. (2.68c), can also be used to show that the delta function is even, since (2.68a) is just a special case of (2.68c) with $c = -1$.

Equation (2.52c) shows that there is no difficulty handling a general linear transformation of the delta function's argument, because for any two real constants $a \neq 0$ and $c$, we have

$$ \int_{-\infty}^{\infty} \delta(a \cdot t - c)\,\phi(t)\,dt = \frac{1}{|a|} \int_{-\infty}^{\infty} \delta(t)\,\phi\!\left(\frac{t + c}{a}\right)dt = \frac{1}{|a|}\,\phi\!\left(\frac{c}{a}\right) = \int_{-\infty}^{\infty} \left[\frac{1}{|a|}\,\delta\!\left(t - \frac{c}{a}\right)\right]\phi(t)\,dt, $$

so that

$$ \delta(a \cdot t - c) = \frac{1}{|a|}\,\delta\!\left(t - \frac{c}{a}\right). \tag{2.68d} $$

This is the same answer we would get from factoring $a$ out of the delta function argument and then using (2.68c) to rescale the delta function.

When the delta function is multiplied by a true function $v(t)$, we have

$$ \int_{-\infty}^{\infty} \bigl[v(t)\,\delta(t - t_0)\bigr]\,\phi(t)\,dt = v(t_0)\,\phi(t_0) = \int_{-\infty}^{\infty} \bigl[v(t_0)\,\delta(t - t_0)\bigr]\,\phi(t)\,dt, $$

so that

$$ v(t)\,\delta(t - t_0) = v(t_0)\,\delta(t - t_0). \tag{2.68e} $$

When the argument of the delta function is itself a true function $u(t)$, we have

$$ \delta\bigl(u(t)\bigr) = \sum_{\text{all } k} \frac{1}{|u'(t_k)|}\,\delta(t - t_k), \tag{2.68f} $$

where $u'(t) = du/dt$ and $t_1, t_2, \ldots$ are the values of $t$ for which $u(t) = 0$. This formula only makes sense, of course, when $u'(t_k) \neq 0$ for $t_1, t_2, \ldots$. Perhaps the easiest way to see that (2.68f) must be true is to note that the delta function equals the zero function whenever its argument is not zero. Therefore,

$$ \int_{-\infty}^{\infty} \delta\bigl(u(t)\bigr)\,\phi(t)\,dt = \sum_{\text{all } k} \left[\int_{t_k - \varepsilon}^{t_k + \varepsilon} \delta\bigl(u(t)\bigr)\,\phi(t)\,dt\right] \tag{2.68g} $$

where $\varepsilon > 0$ is chosen small enough that each interval $t_k - \varepsilon < t < t_k + \varepsilon$ only includes one of the $t_k$ values for which $u$ is zero. Nothing stops us from making $\varepsilon$ as small as we please—as long as it does not become zero—and eventually each integral on the right-hand side of (2.68g) can be written as

$$ \int_{t_k - \varepsilon}^{t_k + \varepsilon} \delta\bigl(u(t)\bigr)\,\phi(t)\,dt \cong \int_{t_k - \varepsilon}^{t_k + \varepsilon} \delta\bigl((t - t_k)\,u'(t_k)\bigr)\,\phi(t)\,dt, $$

where we expand $u$ as $u(t) \cong (t - t_k)\,u'(t_k)$ near $t_k$. Applying (2.68d) gives

$$ \delta\bigl((t - t_k)\,u'(t_k)\bigr) = \frac{1}{|u'(t_k)|}\,\delta(t - t_k), $$

so that

$$ \int_{t_k - \varepsilon}^{t_k + \varepsilon} \delta\bigl(u(t)\bigr)\,\phi(t)\,dt \cong \int_{t_k - \varepsilon}^{t_k + \varepsilon} \left[\frac{1}{|u'(t_k)|}\,\delta(t - t_k)\right]\phi(t)\,dt = \int_{-\infty}^{\infty} \left[\frac{1}{|u'(t_k)|}\,\delta(t - t_k)\right]\phi(t)\,dt. $$

Consequently,

$$ \int_{-\infty}^{\infty} \delta\bigl(u(t)\bigr)\,\phi(t)\,dt = \sum_{\text{all } k} \int_{-\infty}^{\infty} \left[\frac{1}{|u'(t_k)|}\,\delta(t - t_k)\right]\phi(t)\,dt = \int_{-\infty}^{\infty} \left[\sum_{\text{all } k} \frac{1}{|u'(t_k)|}\,\delta(t - t_k)\right]\phi(t)\,dt $$

for all test functions $\phi$. This justifies Eq. (2.68f) according to the definition for the equality of generalized functions [see Eqs. (2.47a) and (2.47b)].
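A numerical sanity check of Eq. (2.68f) (my sketch, not from the text): replace the delta function with a narrow Gaussian, take $u(t) = t^2 - 1$ (zeros at $t = \pm 1$ with $|u'(\pm 1)| = 2$), and compare the two sides integrated against a test function.

```python
import numpy as np

def nascent_delta(x, eps):
    # narrow Gaussian standing in for the delta function
    return np.exp(-(x / eps)**2) / (eps * np.sqrt(np.pi))

u   = lambda t: t**2 - 1.0          # zeros at t = +1 and t = -1, |u'(+-1)| = 2
phi = lambda t: np.exp(-t**2)       # test function

dt = 1e-5
t = np.arange(-3.0, 3.0, dt)
lhs = np.sum(nascent_delta(u(t), 1e-3) * phi(t)) * dt
rhs = (phi(1.0) + phi(-1.0)) / 2.0  # sum over the zeros of phi(t_k)/|u'(t_k)|
print(lhs, rhs)
```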

Derivatives of the Delta Function · 2.15

We have already remarked that the set of test functions for $\delta(t)$ contains all functions that are continuous at the origin. Changing the argument of the delta function changes the set of appropriate test functions. In Eq. (2.68b), for example, the test functions must be continuous at $t = t_0$; in (2.68d) they must be continuous at $t = c/a$; and in (2.68f) they must be continuous at all $t = t_k$. When Eq. (2.53b) is used to define the derivative of a delta function, $\delta'(t)$, we have

$$ \int_{-\infty}^{\infty} \delta'(t)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'(t)\,dt = -\phi'(0), \tag{2.69a} $$

which shows that now the first derivative of all the test functions must be continuous at the origin. If we start out with a test function $\phi_{ab}(t)$ that must be identically zero for all $t \le a$ and for all $t \ge b$, then Eq. (2.69a) becomes

$$ \int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t)\,dt = -\phi'_{ab}(0) = 0 $$

whenever the interval $a < t < b$ does not contain the origin. Hence, we can write

$$ \int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0 \cdot \phi_{ab}(t)\,dt $$

for $a < b < 0$ or $0 < a < b$, showing that $\delta'(t)$ equals the zero function in the sense of Eq. (2.48f) for $t \neq 0$. Equation (2.52a) can be used in conjunction with (2.53b) to evaluate $\delta'(t)$ when it is shifted from the origin by an amount $t_0$,

$$ \int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta'(t)\,\phi(t + t_0)\,dt = -\phi'(t_0), \tag{2.69b} $$

where now we require the first derivative of the test functions to be continuous at $t = t_0$. This result can be applied to test functions $\phi_{ab}(t)$ to get

$$ \int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t + t_0)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'_{ab}(t + t_0)\,dt = -\phi'_{ab}(t_0) = 0 $$

whenever the interval $a < t < b$ does not contain $t = t_0$. Therefore

$$ \int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0 \cdot \phi_{ab}(t)\,dt $$

whenever $a < b < t_0$ or $t_0 < a < b$, showing that $\delta'(t - t_0)$ equals the zero function [in the sense of Eq. (2.48f)] for $t \neq t_0$. Equations (2.69a) and (2.69b) can be applied any number of times to get $\delta^{(n)}(t - t_0)$, the $n$th derivative of the delta function shifted away from the origin by an amount $t_0$. We have

$$ \int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} \delta^{(n-1)}(t)\,\phi'(t + t_0)\,dt = \int_{-\infty}^{\infty} \delta^{(n-2)}(t)\,\phi''(t + t_0)\,dt = \cdots, $$

which eventually becomes

$$ \int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi(t)\,dt = (-1)^n\,\phi^{(n)}(t_0) = (-1)^n \left.\frac{d^n \phi}{dt^n}\right|_{t = t_0}. \tag{2.69c} $$

Again, this latest result can be applied to test functions $\phi_{ab}(t)$ to get

$$ \int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi_{ab}(t)\,dt = (-1)^n\,\phi_{ab}^{(n)}(t_0) = 0 $$

whenever the interval $a < t < b$ does not contain $t = t_0$. Because

$$ \int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0 \cdot \phi_{ab}(t)\,dt $$

whenever $t = t_0$ lies outside this interval, we end up with [using the definition of equality in (2.48f)]

$$ \delta^{(n)}(t - t_0) = 0 \quad\text{for } t \neq t_0. \tag{2.69d} $$

The test functions integrated with $\delta^{(n)}(t - t_0)$ must, of course, have their $n$th derivatives continuous at $t = t_0$.

The Heaviside step function can be defined as

$$ H(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0 \end{cases}. \tag{2.70a} $$

If we define

$$ H^{(1)}(t) = \frac{dH}{dt} \tag{2.70b} $$

to be the first derivative of the function, then $H^{(1)}(t) = 0$ for all $t \neq 0$. To evaluate $H^{(1)}(t)$ at the origin, we turn $H(t)$ and $H^{(1)}(t)$ into generalized functions that we call "$H(t)$" and "$H^{(1)}(t)$" respectively. We define

$$ \int_{-\infty}^{\infty} \text{``}H(t)\text{''}\,\phi(t)\,dt = \int_{-\infty}^{\infty} H(t)\,\phi(t)\,dt \tag{2.70c} $$

for all test functions $\phi$, which means that, according to Eqs. (2.48b) and (2.48c), "$H(t)$" $= H(t)$.

Having established the generalized function "$H(t)$", we know from Eq. (2.53b) that the generalized function "$H^{(1)}(t)$" must satisfy, after a formal integration by parts,

$$ \int_{-\infty}^{\infty} \text{``}H^{(1)}(t)\text{''}\,\phi(t)\,dt = \Bigl[\text{``}H(t)\text{''}\,\phi(t)\Bigr]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \text{``}H(t)\text{''}\,\phi'(t)\,dt. \tag{2.70d} $$

Evaluating the right-hand side gives

$$ \int_{-\infty}^{\infty} \text{``}H^{(1)}(t)\text{''}\,\phi(t)\,dt = \Bigl[\lim_{t\to\infty}\phi(t)\Bigr] - \int_{0}^{\infty} \phi'(t)\,dt = \Bigl[\lim_{t\to\infty}\phi(t)\Bigr] + \phi(0) - \Bigl[\lim_{t\to\infty}\phi(t)\Bigr] = \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt. $$

Hence, for all test functions $\phi$ continuous at the origin (note that they do not have to approach zero at $\pm\infty$), we have

$$ \int_{-\infty}^{\infty} \text{``}H^{(1)}(t)\text{''}\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt, $$

so

$$ \text{``}H^{(1)}(t)\text{''} = \frac{d}{dt}\,\text{``}H(t)\text{''} = \delta(t) \tag{2.70e} $$

in the sense of Eq. (2.47b). There is nothing unique about the Heaviside step function. We can also show, using the generalized function "$\operatorname{sgn}(t)$" introduced in Eqs. (2.60a) and (2.60b) above, that for any test function $\phi$

$$ \int_{-\infty}^{\infty} \frac{1}{2}\,\text{``}\operatorname{sgn}^{(1)}(t)\text{''}\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt, \tag{2.70f} $$

where "$\operatorname{sgn}^{(1)}(t)$" is the first derivative of "$\operatorname{sgn}(t)$". To show this is true, we do a formal integration by parts,

$$ \int_{-\infty}^{\infty} \frac{1}{2}\,\text{``}\operatorname{sgn}^{(1)}(t)\text{''}\,\phi(t)\,dt = \frac{1}{2}\Bigl[\text{``}\operatorname{sgn}(t)\text{''}\,\phi(t)\Bigr]_{-\infty}^{\infty} - \frac{1}{2}\int_{-\infty}^{\infty} \text{``}\operatorname{sgn}(t)\text{''}\,\phi'(t)\,dt. $$

Evaluating the terms on the right-hand side gives

$$ \int_{-\infty}^{\infty} \frac{1}{2}\,\text{``}\operatorname{sgn}^{(1)}(t)\text{''}\,\phi(t)\,dt = \frac{1}{2}\Bigl[\lim_{t\to\infty}\phi(t)\Bigr] + \frac{1}{2}\Bigl[\lim_{t\to-\infty}\phi(t)\Bigr] + \frac{1}{2}\int_{-\infty}^{0} \phi'(t)\,dt - \frac{1}{2}\int_{0}^{\infty} \phi'(t)\,dt $$
$$ = \frac{1}{2}\Bigl[\lim_{t\to\infty}\phi(t)\Bigr] + \frac{1}{2}\Bigl[\lim_{t\to-\infty}\phi(t)\Bigr] - \frac{1}{2}\Bigl[\lim_{t\to-\infty}\phi(t)\Bigr] + \frac{1}{2}\,\phi(0) + \frac{1}{2}\,\phi(0) - \frac{1}{2}\Bigl[\lim_{t\to\infty}\phi(t)\Bigr] = \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt. $$

This shows Eq. (2.70f) is true. Again, we get a formula

$$ \frac{1}{2}\,\text{``}\operatorname{sgn}^{(1)}(t)\text{''} = \delta(t) \tag{2.70g} $$

in the sense of Eq. (2.47b), where the only major restriction on the test functions $\phi$ is that they be continuous at the origin.
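Equation (2.70e) can be illustrated numerically by smoothing the step: the derivative of a logistic approximation to $H(t)$ acts like a nascent delta function. A Python sketch (mine, not from the text; the logistic form is an arbitrary choice of smoothing):

```python
import numpy as np

phi = lambda t: np.exp(-t**2) + 0.5     # continuous at 0; does NOT vanish at +-infinity

k = 200.0                               # steepness of the smoothed step
dt = 1e-4
t = np.arange(-5.0, 5.0, dt)
H_k = 0.5 * (1.0 + np.tanh(0.5 * k * t))   # logistic approximation to H(t)
H_k_prime = k * H_k * (1.0 - H_k)          # its derivative: a nascent delta at t = 0
val = np.sum(H_k_prime * phi(t)) * dt      # integral of H'_k(t) phi(t) dt
print(val, phi(0.0))
```

The test function deliberately does not vanish at $\pm\infty$, yet the integral still returns $\phi(0)$, matching the boundary-term cancellation in the derivation.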

Fourier Transform of the Delta Function · 2.16

To find the Fourier transform of the delta function, we construct two sequences of functions having the relationship specified in (2.59a)–(2.59g) above. It is easiest to start with the delta-function sequence in Eq. (2.67c). Any standard table of Fourier transforms gives²³

$$ \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{\sin(2\pi n t)}{\pi t}\,dt = F^{(-ift)}\!\left(\frac{\sin(2\pi n t)}{\pi t}\right) = \Pi(f, n) $$

and

$$ \int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi(f, n)\,df = F^{(ift)}\bigl(\Pi(f, n)\bigr) = \frac{\sin(2\pi n t)}{\pi t}, $$

so that

$$ \frac{\sin(2\pi n t)}{\pi t} \;\Leftrightarrow\; \Pi(f, n). \tag{2.71a} $$

Although Eq. (2.71a) holds true for all real $n$, it is here used only for integer values of $n$. We know from (2.67c) that the generalized limit as $n \to \infty$ of the left-hand side of (2.71a) is $\delta(t)$, but what is the corresponding generalized limit of the right-hand side? We have

$$ \int_{-\infty}^{\infty} \phi(f)\left[\operatorname{Glim}_{n\to\infty} \Pi(f, n)\right]df = \lim_{n\to\infty} \int_{-\infty}^{\infty} \Pi(f, n)\,\phi(f)\,df = \lim_{n\to\infty} \int_{-n}^{n} \phi(f)\,df = \int_{-\infty}^{\infty} 1 \cdot \phi(f)\,df, $$

so that

$$ \operatorname{Glim}_{n\to\infty} \Pi(f, n) = 1, $$

which is no surprise. Therefore, taking the generalized limit as $n \to \infty$ of both sides of (2.71a) gives

$$ \delta(t) \;\Leftrightarrow\; 1, \tag{2.71b} $$

²³ Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, New York, 1978), p. 201, with the sinc, rect function pair corresponding to formula (2.71a) above.

or, restating this result,

$$ \int_{-\infty}^{\infty} \delta(t)\,e^{-2\pi i f t}\,dt = 1 \tag{2.71c} $$

and

$$ \int_{-\infty}^{\infty} e^{2\pi i f t}\,df = \delta(t). \tag{2.71d} $$

Equation (2.71c) makes good sense, since it is just (2.64b) applied to the test function $e^{-2\pi i f t}$ with

$$ e^{-2\pi i f \cdot 0} = 1; $$

but Eq. (2.71d) is true only in the sense of Eq. (2.47b), and it is only safe to substitute freely from (2.71d) when the substitution takes place inside an integral.

Because the sine is an odd function of its argument, we have according to Eq. (2.17), and assuming the integral is a Cauchy principal value, that

$$ \int_{-\infty}^{\infty} \sin(2\pi f t)\,df = 0. $$

Therefore, Eq. (2.71d) becomes, using Eq. (2.19) and that the cosine is even,

$$ \int_{-\infty}^{\infty} \cos(2\pi f t)\,df = 2 \int_{0}^{\infty} \cos(2\pi f t)\,df = \delta(t). $$

Since the integral over the sine always disappears, we can also write

$$ \int_{-\infty}^{\infty} \bigl[\cos(2\pi f t) \mp i \sin(2\pi f t)\bigr]\,df = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,df = \delta(t). $$

Hence

$$ 2 \int_{0}^{\infty} \cos(2\pi f t)\,df = \delta(t) \tag{2.71e} $$

and

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,df = \delta(t). \tag{2.71f} $$

As was the case for Eq. (2.71d), these formulas are meant to be used inside integrals.
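The pair (2.71b)–(2.71c) can be checked by direct quadrature on the Gaussian delta sequence of Eq. (2.65b): its forward transform, computed numerically below (a Python sketch of mine, not from the text), approaches 1 at every fixed frequency as $n$ grows.

```python
import numpy as np

dt = 1e-4
t = np.arange(-2.0, 2.0, dt)
f_values = np.array([0.0, 0.5, 1.0, 2.0])
for n in (10, 100):
    seq = (n / np.sqrt(np.pi)) * np.exp(-(n * t)**2)   # Gaussian delta sequence
    # direct quadrature of Eq. (2.59a); analytically exp(-pi^2 f^2 / n^2) -> 1
    W = np.array([np.sum(seq * np.exp(-2j * np.pi * f * t)) * dt for f in f_values])
    print(n, np.round(np.abs(W), 6))
```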

Fourier Convolution Theorem with Generalized Functions · 2.17

Now that we have defined what is meant by the Fourier transform of a generalized function, it is surprisingly easy to show that the Fourier convolution theorem holds for the product of a generalized function and a true function.

We start with two sequences of true functions, one of them labeled with a superscript minus sign for reasons that will shortly become apparent, called

$$ v_1(t) \Leftrightarrow V_1^{(-)}(f),\quad v_2(t) \Leftrightarrow V_2^{(-)}(f),\quad \ldots,\quad v_n(t) \Leftrightarrow V_n^{(-)}(f),\quad \ldots\,. $$

If the appropriate generalized limits exist, we know from Eq. (2.59g) that the generalized functions $v_G(t)$ and $V_G^{(-)}(f)$ specified by

$$ v_G(t) = \operatorname{Glim}_{n\to\infty} v_n(t) \tag{2.72a} $$

and

$$ V_G^{(-)}(f) = \operatorname{Glim}_{n\to\infty} V_n^{(-)}(f) \tag{2.72b} $$

are a Fourier transform pair,

$$ v_G(t) \;\Leftrightarrow\; V_G^{(-)}(f). \tag{2.72c} $$

We also suppose that there exists a third sequence of true functions labeled with a superscript plus sign, $V_1^{(+)}(t), V_2^{(+)}(t), \ldots, V_n^{(+)}(t), \ldots$, such that

$$ V_1^{(+)}(t) \Leftrightarrow v_1(f),\quad V_2^{(+)}(t) \Leftrightarrow v_2(f),\quad \ldots,\quad V_n^{(+)}(t) \Leftrightarrow v_n(f),\quad \ldots\,. $$

If we now define

$$ V_G^{(+)}(t) = \operatorname{Glim}_{n\to\infty} V_n^{(+)}(t), \tag{2.72d} $$

then the generalized functions $V_G^{(+)}(t)$ and $v_G(f)$ are also a Fourier transform pair,

$$ V_G^{(+)}(t) \;\Leftrightarrow\; v_G(f), \tag{2.72e} $$

where we have replaced $t$ by $f$ in (2.72d); and Eqs. (2.72c) and (2.72e) taken together give the combined statement

$$ V_G^{(\mp)}(f) = F^{(\mp ift)}\bigl(v_G(t)\bigr) = \operatorname{Glim}_{n\to\infty} V_n^{(\mp)}(f). \tag{2.72f} $$

From the Fourier convolution theorem for true functions [see Eq. (2.39j)], it follows that for any true function $u(t)$

$$ F^{(\mp ift)}\bigl(u(t)\,v_n(t)\bigr) = U^{(\mp)}(f) \ast V_n^{(\mp)}(f) \tag{2.72g} $$

or

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,u(t)\,v_n(t)\,dt = \int_{-\infty}^{\infty} U^{(\mp)}(f')\,V_n^{(\mp)}(f - f')\,df', $$

where

$$ U^{(\mp)}(f) = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,u(t)\,dt \quad\text{and}\quad V_n^{(\mp)}(f) = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,v_n(t)\,dt. $$

The integral formula for $V_n^{(\mp)}(f)$ just restates the definitions given to $V_n^{(-)}$ and $V_n^{(+)}$ on the two previous pages. Taking the limit of both sides as $n \to \infty$ gives

- 160

- 160- -

Fourier Convolution Theorem with Generalized Functions · 2.17

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, u(t)\, v_n(t)\, dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} U^{(\mp)}(f')\, V_n^{(\mp)}(f - f')\, df' $$

or, moving the limiting process inside the integral so that it becomes a generalized limit [see discussion after Eq. (2.56a)],

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, u(t)\, \Big[\text{G-lim}_{n\to\infty}\, v_n(t)\Big]\, dt = \int_{-\infty}^{\infty} U^{(\mp)}(f')\, \Big[\text{G-lim}_{n\to\infty}\, V_n^{(\mp)}(f - f')\Big]\, df'. $$

From the definitions of $v_G(t)$ and $V_G^{(\mp)}(f)$ [see Eqs. (2.72a) and (2.72f)], we get

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, u(t)\, v_G(t)\, dt = \int_{-\infty}^{\infty} U^{(\mp)}(f')\, V_G^{(\mp)}(f - f')\, df', $$

which becomes

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, u(t)\, v_G(t)\, dt = U^{(\mp)}(f) * V_G^{(\mp)}(f) \qquad (2.72h) $$

or, in operator notation,

$$ \mathcal{F}^{(\mp i f t)}\big(u(t)\, v_G(t)\big) = \mathcal{F}^{(\mp i f t')}\big(u(t')\big) * \mathcal{F}^{(\mp i f t'')}\big(v_G(t'')\big). \qquad (2.72i) $$

Consulting Eq. (2.55b) above, we note that convolution with a generalized function is

commutative, just like the convolution of two standard functions, so Eqs. (2.72h) and (2.72i) can

also be written as

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, u(t)\, v_G(t)\, dt = V_G^{(\mp)}(f) * U^{(\mp)}(f) \qquad (2.72j) $$

and

$$ \mathcal{F}^{(\mp i f t)}\big(u(t)\, v_G(t)\big) = \mathcal{F}^{(\mp i f t'')}\big(v_G(t'')\big) * \mathcal{F}^{(\mp i f t')}\big(u(t')\big). \qquad (2.72k) $$

This establishes the generalized-function counterpart to Eq. (2.39j) whenever $e^{\mp 2\pi i f t}\, u(t)$ and $U^{(\mp)}(f)$ qualify as acceptable test functions. Since almost all well-behaved, continuous functions are acceptable test functions when used with linear combinations of delta functions or the derivatives of delta functions, Eqs. (2.72h) and (2.72i) are valid whenever $v_G(t)$ is a linear combination of delta functions or the derivatives of delta functions.


Establishing the Fourier convolution theorem in the other direction is even easier. We just write, making the variable substitution $t'' = t - t'$ and remembering that the convolutions are commutative,

$$ \begin{aligned} \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\, \big[u(t) * v_G(t)\big]\, dt &= \int_{-\infty}^{\infty} dt\; e^{\mp 2\pi i f t} \int_{-\infty}^{\infty} dt'\; u(t - t')\, \Big[\text{G-lim}_{n\to\infty}\, v_n(t')\Big] \\ &= \int_{-\infty}^{\infty} dt'\; \Big[\text{G-lim}_{n\to\infty}\, v_n(t')\Big] \int_{-\infty}^{\infty} dt\; u(t - t')\, e^{\mp 2\pi i f t} \\ &= \lim_{n\to\infty} \int_{-\infty}^{\infty} dt'\; v_n(t') \int_{-\infty}^{\infty} dt\; u(t - t')\, e^{\mp 2\pi i f t} \\ &= \lim_{n\to\infty} \int_{-\infty}^{\infty} dt'\; v_n(t')\, e^{\mp 2\pi i f t'} \int_{-\infty}^{\infty} dt''\; u(t'')\, e^{\mp 2\pi i f t''} \\ &= \Big[\int_{-\infty}^{\infty} e^{\mp 2\pi i f t'}\, \text{G-lim}_{n\to\infty}\, v_n(t')\, dt'\Big] \cdot \Big[\int_{-\infty}^{\infty} u(t'')\, e^{\mp 2\pi i f t''}\, dt''\Big]. \end{aligned} $$

We conclude that

$$ \mathcal{F}^{(\mp i f t)}\big(u(t) * v_G(t)\big) = \mathcal{F}^{(\mp i f t')}\big(u(t')\big) \cdot \mathcal{F}^{(\mp i f t'')}\big(v_G(t'')\big), \qquad (2.72\ell) $$

showing that Eq. (2.39a) holds true for the convolution of a true function and a generalized

function as well as for the convolution of two true functions.

The shah function, often written as Ш, can be defined as the generalized limit

$$ \text{Ш}(t, T) \;=\; \frac{1}{T}\; \text{G-lim}_{n\to\infty}\, \frac{\sin\!\big(2\pi \frac{t}{T}\big(n + \frac{1}{2}\big)\big)}{\sin\!\big(\pi \frac{t}{T}\big)}. \qquad (2.73) $$

For any test function $\varphi(t)$, we have

$$ \int_{-\infty}^{\infty} \varphi(t)\, \text{G-lim}_{n\to\infty} \left[ \frac{\sin\big(2\pi t T^{-1} (n + \frac{1}{2})\big)}{\sin(\pi t T^{-1})} \right] dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(t) \left\{ \frac{\sin\big(2\pi t T^{-1} (n + \frac{1}{2})\big)}{\sin(\pi t T^{-1})} \right\} dt. \qquad (2.74a) $$

The Shah Function · 2.18

As $n$ gets large in (2.74a), the term in braces $\{\,\}$ oscillates ever more rapidly between +1 and −1, causing the more slowly varying function $\varphi$ to make only a negligible contribution to the integral. The only place this might not hold true is at the isolated $t$ values

$$ t = 0,\; \pm T,\; \pm 2T,\; \ldots. \qquad (2.74b) $$

It is easy to see why these isolated values are different. Suppose $t$ differs from one of these isolated values by only a small amount $\Delta t$, so that $t = \Delta t \pm mT$. Then

$$ \frac{\sin\big(2\pi(\Delta t \pm mT)\, T^{-1} (n + \frac{1}{2})\big)}{\sin\big(\pi(\Delta t \pm mT)\, T^{-1}\big)} = \frac{\sin\big(2\pi \Delta t\, T^{-1}(n + \frac{1}{2}) \pm 2\pi n m \pm \pi m\big)}{\sin\big(\pi \Delta t\, T^{-1} \pm \pi m\big)} = \frac{\sin\big(2\pi \Delta t\, T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi \Delta t\, T^{-1})}. $$

To explain the last step, we note that the sine does not change when a $\pm nm$ number of 2π's is added to its argument, and adding a $\pm m$ number of π's to the sine's argument either leaves the sine unchanged (if $m$ is even) or multiplies it by −1 (if $m$ is odd). Since the sine values in both the numerator and denominator have the same number of π's added to their arguments, we do not care if $m$ is odd because the factor of −1 cancels, leaving the sine ratio unchanged. As $\Delta t$ is taken to be ever smaller in magnitude for a fixed value of $n$, there comes a time when the arguments of both sines are small in magnitude, allowing each sine to be approximated by its argument. We then have

$$ \frac{\sin\big(2\pi(\Delta t \pm mT)\, T^{-1}(n + \frac{1}{2})\big)}{\sin\big(\pi(\Delta t \pm mT)\, T^{-1}\big)} = \frac{\sin\big(2\pi \Delta t\, T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi \Delta t\, T^{-1})} \cong \frac{2\pi \Delta t\, T^{-1}(n + \frac{1}{2})}{\pi \Delta t\, T^{-1}} = 2\Big(n + \frac{1}{2}\Big). $$

Consequently, the peak values of the term in braces get ever larger at the isolated points in (2.74b) as $n$ increases, as shown in Figs. 2.9(a)–2.9(c). We see that the triangular peaks at the isolated points in (2.74b) have widths equal to $T/(n + \frac{1}{2})$. As $n$ gets ever larger, the term in braces oscillates so rapidly between +1 and −1 compared to the test function that there is no contribution made to the integral on the right-hand side of (2.74a) except at the isolated $t$ values shown in Figs. 2.9(a)–2.9(c). At these $t$ values, we have


5 ° sin 2& tT 1 n (1 2)

°½dt " (T ) 1area of triangular peak2

lim ³ (t ) ® ¾

n 75

5 °¯

sin & tT 1 °¿

(0) 1area of triangular peak2

(T ) 1area of triangular peak2 "

1 T

A A 2 n (1 2) 1" (T ) (0) (T ) "2 ,

2 n (1 2)

which simplifies to

5 ° sin 2& tT 1 n (1 2)

½° dt T k 5

lim ³ (t ) ® ¾ ¦ (kT ) . (2.75a)

n 75

5 °¯ sin & tT 1 °¿ k 5
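The limiting behavior in Eq. (2.75a) can be checked numerically. The sketch below (assuming Python with NumPy; the Gaussian test function, the period $T = 1$, and the value $n = 200$ are arbitrary choices) integrates a test function against the sine-ratio kernel and compares the result with $T \sum_k \varphi(kT)$:

```python
import numpy as np

# Numerical check of Eq. (2.75a): for large n, the integral of a test
# function phi(t) against sin(2*pi*(t/T)*(n + 1/2))/sin(pi*t/T)
# approaches T * sum_k phi(k*T).  (Illustrative sketch; T = 1, Gaussian phi.)
T = 1.0
n = 200
dt = 1.0e-4
t = np.arange(-6.0, 6.0, dt)               # fine grid over the support of phi
phi = np.exp(-t**2 / 2.0)                  # test function

# Evaluate the kernel through its finite-sum form
#   sin(2*pi*(t/T)*(n + 1/2))/sin(pi*t/T) = sum_{k=-n}^{n} cos(2*pi*k*t/T),
# which avoids the 0/0 evaluations at t = 0, +-T, +-2T, ...
kernel = np.ones_like(t)
for k in range(1, n + 1):
    kernel += 2.0 * np.cos(2.0 * np.pi * k * t / T)

lhs = np.sum(phi * kernel) * dt            # Riemann-sum approximation
rhs = T * sum(np.exp(-(k * T)**2 / 2.0) for k in range(-10, 11))
print(abs(lhs - rhs))                      # small: the two sides agree
```

Rewriting the sine ratio as a finite cosine sum is the same identity used below in Eq. (2.77c); it sidesteps the removable singularities of the ratio at the sample points.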

But $T \sum_{k=-\infty}^{\infty} \varphi(kT)$ can be regarded as what we get when evaluating the integral

$$ \int_{-\infty}^{\infty} \varphi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt = T \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(t - kT)\, \varphi(t)\, dt = T \sum_{k=-\infty}^{\infty} \varphi(kT). $$

Hence, Eq. (2.75a) can be written as

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(t) \left\{ \frac{\sin\big(2\pi t T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi t T^{-1})} \right\} dt = \int_{-\infty}^{\infty} \varphi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt \qquad (2.75b) $$

or, using (2.56a) to take the limit inside the integral as a generalized limit,

$$ \int_{-\infty}^{\infty} \varphi(t)\; \text{G-lim}_{n\to\infty} \left\{ \frac{\sin\big(2\pi t T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi t T^{-1})} \right\} dt = \int_{-\infty}^{\infty} \varphi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt. $$

Since this last result is true for any test function $\varphi$, we conclude that


$$ \text{G-lim}_{n\to\infty} \left\{ \frac{\sin\big(2\pi t T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi t T^{-1})} \right\} = T \sum_{k=-\infty}^{\infty} \delta(t - kT) \qquad (2.75c) $$

in the sense of Eq. (2.47b). Comparison of this result to the definition of the shah function in Eq. (2.73) above shows that

$$ \text{Ш}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT). \qquad (2.75d) $$

Replacing $t$ by $f$ in (2.75c) gives

$$ \text{G-lim}_{n\to\infty} \left\{ \frac{\sin\big(2\pi f T^{-1}(n + \frac{1}{2})\big)}{\sin(\pi f T^{-1})} \right\} = T \sum_{k=-\infty}^{\infty} \delta(f - kT), $$

and since parameter $T$ is arbitrary, it can be replaced by $T^{-1}$ everywhere to get

$$ \text{G-lim}_{n\to\infty} \left\{ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)} \right\} = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big). \qquad (2.75e) $$

To get the Fourier transform of the shah function, we construct the sequence of true functions $G_1(t,T),\, G_2(t,T),\, \ldots,\, G_n(t,T),\, \ldots$ such that

$$ G_n(t, T) = \sum_{k=-n}^{n} g_n(t - kT), \qquad (2.76a) $$

where

$$ g_n(t) = \frac{\sin\big(2\pi (n + 1)\, t\big)}{\pi t}. \qquad (2.76b) $$

From our earlier work with the delta function, we know that

$$ \text{G-lim}_{n\to\infty}\, g_{n-1}(t) = \text{G-lim}_{n\to\infty}\, \frac{\sin(2\pi n t)}{\pi t} = \delta(t). $$


FIGURE 2.9(a).

FIGURE 2.9(b).

FIGURE 2.9(c).

The formula for the t interval between the arrows is T/(n + 1/2) in all three plots. Figures 2.9(a), 2.9(b),

and 2.9(c) show how the base width of the central lobe becomes ever narrower as n increases.

Fourier Transform of the Shah Function · 2.19

Since adding one to $n$ does not make any difference in the limit, we end up with

$$ \text{G-lim}_{n\to\infty}\, g_n(t) = \delta(t); \qquad (2.76c) $$

and from the transform of this sinc-type kernel we also have

$$ \frac{\sin\big(2\pi (n+1)\, t\big)}{\pi t} \;\leftrightarrow\; \Pi(f, n+1) \quad\text{for } n = 1, 2, \ldots, \qquad (2.76d) $$

where $\Pi(f, F)$ is the unit rectangle function used earlier, equal to one for $\lvert f\rvert \le F$ and zero for $\lvert f\rvert > F$.

To find the generalized function that is the forward Fourier transform of the generalized limit of $G_n$ as $n \to \infty$, we must evaluate the forward Fourier transform of $G_n$ for finite $n$,

$$ \mathcal{F}^{(-ift)}\big(G_n(t)\big) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, G_n(t)\, dt = \sum_{k=-n}^{n} \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g_n(t - kT)\, dt = \sum_{k=-n}^{n} e^{-2\pi i f k T} \int_{-\infty}^{\infty} e^{-2\pi i f t'}\, g_n(t')\, dt', $$

where in the last step the variable of integration has been changed to $t' = t - kT$. The Fourier transform inside the sum can be done using (2.76b) and (2.76d) to get

$$ \mathcal{F}^{(-ift)}\big(G_n(t)\big) = \Pi(f, n+1) \cdot \sum_{k=-n}^{n} e^{-2\pi i f k T}. \qquad (2.77a) $$

The sum $\sum_{k=-n}^{n} e^{-2\pi i f k T}$ is just a disguised form of geometric series. We can write

$$ \sum_{k=-n}^{n} e^{-2\pi i f k T} = \sum_{k=-n}^{n} w^k, \qquad (2.77b) $$

where

$$ w = e^{-2\pi i f T}, $$

and define

$$ S_n = \sum_{k=-n}^{n} w^k = \sum_{k=-n}^{n} e^{-2\pi i f k T}. $$


Using the standard approach for calculating the sum of a geometric series, we note that multiplying every term in the sum by $w$ increases each power of $w$ in the sum by one. This is the same as adding $w^{n+1}$ and subtracting $w^{-n}$ from the original sum, giving

$$ w S_n = \sum_{k=-n+1}^{n+1} w^k = S_n + w^{n+1} - w^{-n} $$

or

$$ S_n = \frac{w^{n+1} - w^{-n}}{w - 1}. $$

Hence

$$ \sum_{k=-n}^{n} e^{-2\pi i f k T} = \frac{e^{-2\pi i f T (n+1)} - e^{2\pi i f T n}}{e^{-2\pi i f T} - 1} = \frac{e^{-2\pi i f T (n + \frac{1}{2})} - e^{2\pi i f T (n + \frac{1}{2})}}{e^{-\pi i f T} - e^{\pi i f T}} = \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}, \qquad (2.77c) $$

so that

$$ \mathcal{F}^{(-ift)}\big(G_n(t)\big) = \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}\, \Pi(f, n+1). \qquad (2.77d) $$

The inverse Fourier transform of the forward Fourier transform returns the original function [see Eqs. (2.29b) and (2.29d)], so this last result lets us write

$$ G_n(t) \;\leftrightarrow\; \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}\, \Pi(f, n+1). \qquad (2.77e) $$
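The closed-form sum in Eq. (2.77c) is easy to verify numerically. The following sketch (Python with NumPy; the values of $n$ and $T$, and a frequency grid chosen to avoid the zeros of $\sin(\pi f T)$, are arbitrary) compares the direct geometric sum with the sine ratio:

```python
import numpy as np

# Numerical check of Eq. (2.77c): the finite geometric sum
# sum_{k=-n}^{n} exp(-2*pi*i*f*k*T) equals sin(2*pi*f*T*(n + 1/2))/sin(pi*f*T).
# (Illustrative sketch; n, T, and the frequency grid are arbitrary choices.)
n, T = 7, 0.5
f = np.linspace(0.01, 3.7, 500)            # grid avoiding zeros of sin(pi*f*T)
direct = sum(np.exp(-2j * np.pi * f * k * T) for k in range(-n, n + 1))
closed = np.sin(2 * np.pi * f * T * (n + 0.5)) / np.sin(np.pi * f * T)
print(np.max(np.abs(direct - closed)))     # tiny: the two forms agree
```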

From the definition of the Fourier transform of a generalized function [see (2.59g)], we know that

taking the generalized limit of both sides of (2.77e) gives a Fourier transform relationship

between two generalized functions—all that needs to be done now is to find out what these

generalized functions are.

To find the generalized function that is the generalized limit of $G_n$ as $n \to \infty$, we write for any test function $\varphi$, using Eq. (2.76a), that


$$ \int_{-\infty}^{\infty} \varphi(t)\, \Big[\text{G-lim}_{n\to\infty}\, G_n(t)\Big]\, dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(t)\, G_n(t)\, dt = \lim_{n\to\infty} \sum_{k=-n}^{n} \int_{-\infty}^{\infty} \varphi(t)\, g_n(t - kT)\, dt. \qquad (2.77f) $$

Equation (2.76c) states that the generalized limit of $g_n$ is the delta function, so

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(t)\, g_n(t - kT)\, dt = \int_{-\infty}^{\infty} \varphi(t)\, \Big[\text{G-lim}_{n\to\infty}\, g_n(t - kT)\Big]\, dt = \int_{-\infty}^{\infty} \varphi(t)\, \delta(t - kT)\, dt = \varphi(kT), $$

so that

$$ \lim_{n\to\infty} \sum_{k=-n}^{n} \int_{-\infty}^{\infty} \varphi(t)\, g_n(t - kT)\, dt = \sum_{k=-\infty}^{\infty} \varphi(kT). $$

Hence,

$$ \int_{-\infty}^{\infty} \varphi(t)\, \Big[\text{G-lim}_{n\to\infty}\, G_n(t)\Big]\, dt = \sum_{k=-\infty}^{\infty} \varphi(kT). \qquad (2.77g) $$

But, just as in the discussion following Eq. (2.75a) above, we can regard $\sum_{k=-\infty}^{\infty} \varphi(kT)$ as what we get when evaluating the shah function

$$ \text{Ш}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT) $$

with any test function $\varphi$, since

$$ \int_{-\infty}^{\infty} \text{Ш}(t, T)\, \varphi(t)\, dt = \int_{-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] \varphi(t)\, dt = \sum_{k=-\infty}^{\infty} \varphi(kT). $$


Therefore,

$$ \int_{-\infty}^{\infty} \varphi(t)\, \Big[\text{G-lim}_{n\to\infty}\, G_n(t)\Big]\, dt = \int_{-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] \varphi(t)\, dt, \qquad (2.77h) $$

which shows that

$$ \text{G-lim}_{n\to\infty}\, G_n(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT) = \text{Ш}(t, T). \qquad (2.77i) $$

The generalized function that is the generalized limit of the right-hand side of (2.77e) is multiplied by an arbitrary test function $\varphi(f)$ and integrated over all $f$ to get

$$ \begin{aligned} \int_{-\infty}^{\infty} \varphi(f)\, \left\{ \text{G-lim}_{n\to\infty} \left[ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}\, \Pi(f, n+1) \right] \right\} df &= \lim_{n\to\infty} \int_{-(n+1)}^{n+1} \varphi(f) \left[ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)} \right] df \\ &= \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(f) \left[ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)} \right] df, \end{aligned} \qquad (2.78a) $$

where in the last step we recognize that the behavior of the sine ratio inside the square brackets is not affected by the endpoints for the region of integration as $n \to \infty$. Equations (2.56a) and (2.75e) show that

$$ \lim_{n\to\infty} \int_{-\infty}^{\infty} \varphi(f) \left\{ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)} \right\} df = \int_{-\infty}^{\infty} \varphi(f) \left[ \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) \right] df, $$


$$ \int_{-\infty}^{\infty} \varphi(f)\, \left\{ \text{G-lim}_{n\to\infty} \left[ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}\, \Pi(f, n+1) \right] \right\} df = \int_{-\infty}^{\infty} \varphi(f) \left[ \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) \right] df $$

for any test function $\varphi(f)$. Therefore,

$$ \text{G-lim}_{n\to\infty} \left[ \frac{\sin\big(2\pi f T (n + \frac{1}{2})\big)}{\sin(\pi f T)}\, \Pi(f, n+1) \right] = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) \qquad (2.78b) $$

in the sense of Eq. (2.47b). Since the right-hand side of (2.78b) is, according to (2.75d), proportional to the shah function, we end up with

$$ \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) = \frac{1}{T}\, \text{Ш}(f, T^{-1}). \qquad (2.78c) $$

Equations (2.78b) and (2.77i) let us take the generalized limits as $n \to \infty$ of both sides of (2.77e) to get

$$ \sum_{k=-\infty}^{\infty} \delta(t - kT) \;\leftrightarrow\; \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) \qquad (2.78d) $$

or

$$ \text{Ш}(t, T) \;\leftrightarrow\; \frac{1}{T}\, \text{Ш}(f, T^{-1}). \qquad (2.78e) $$
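A discrete analogue of Eqs. (2.78d) and (2.78e) can be seen with the DFT: the transform of an impulse comb is another impulse comb. This sketch (Python with NumPy; $N = 240$ and the spacing $P = 12$ are arbitrary choices with $P$ dividing $N$) illustrates the idea:

```python
import numpy as np

# Discrete illustration of Eqs. (2.78d)-(2.78e): the DFT of a sampled
# impulse comb is another impulse comb.  With N samples and comb spacing
# P samples (P divides N), the transform has spikes every N/P bins,
# each of height N/P.  (Illustrative sketch; N and P chosen arbitrarily.)
N, P = 240, 12
comb = np.zeros(N)
comb[::P] = 1.0                 # unit impulses every P samples
C = np.fft.fft(comb)            # DFT of the comb
spikes = np.flatnonzero(np.abs(C) > 1e-9)
print(spikes[:5])               # multiples of N//P = 20: [0 20 40 60 80]
print(C[spikes][:3].real)       # each spike has height N/P = 20.0
```

The reciprocal spacings (comb every $P$ samples in one domain, every $N/P$ bins in the other) mirror the $T \leftrightarrow 1/T$ relationship in Eq. (2.78e).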

This result can be modified to show explicitly that both the forward and the inverse Fourier transforms of the shah function produce another shah function. We first write (2.78d) as the forward and inverse Fourier transforms,

$$ \int_{-\infty}^{\infty} e^{-2\pi i f t} \left[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \qquad (2.79a) $$

and

$$ \int_{-\infty}^{\infty} e^{2\pi i f t} \left[ \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \right] df = \sum_{k=-\infty}^{\infty} \delta(t - kT). \qquad (2.79b) $$


The discussion following Eq. (2.52c) above shows that linear transformations of the variables of integration are allowed when using generalized functions, so we can change to $t' = -t$ in Eqs. (2.79a) and (2.79b) to get

$$ \int_{-\infty}^{\infty} e^{2\pi i f t'} \left[ \sum_{k=-\infty}^{\infty} \delta(-t' - kT) \right] dt' = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) $$

and

$$ \int_{-\infty}^{\infty} e^{-2\pi i f t'} \left[ \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \right] df = \sum_{k=-\infty}^{\infty} \delta(-t' - kT). $$

The sum over index $k$ goes over all positive and negative integers, so we can change the sum's index to $k' = -k$ and use the fact that the delta function is even [see Eq. (2.68a)] to get

$$ \int_{-\infty}^{\infty} e^{2\pi i f t'} \left[ \sum_{k'=-\infty}^{\infty} \delta(t' - k'T) \right] dt' = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) $$

and

$$ \int_{-\infty}^{\infty} e^{-2\pi i f t'} \left[ \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \right] df = \sum_{k'=-\infty}^{\infty} \delta(t' - k'T). $$

Dropping the primes and combining these results with Eqs. (2.79a) and (2.79b) produces the more general formulas

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \left[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \qquad (2.79c) $$

and

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \left[ \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \right] df = \sum_{k=-\infty}^{\infty} \delta(t - kT). \qquad (2.79d) $$

In fact, we can easily show that Eqs. (2.79c) and (2.79d) are really the same formula. First, we interchange the $j$, $k$ indices and the $f$, $t$ variables in Eq. (2.79c) so that it becomes

$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \left[ \sum_{j=-\infty}^{\infty} \delta(f - jT) \right] df = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\Big(t - \frac{k}{T}\Big). $$

Parameter $T$ is arbitrary, so—just like in the analysis following Eq. (2.75d) above—it can be replaced everywhere by $T^{-1}$ to get


$$ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \left[ \sum_{j=-\infty}^{\infty} \delta\Big(f - \frac{j}{T}\Big) \right] df = T \sum_{k=-\infty}^{\infty} \delta(t - kT). $$

After dividing through by $T$, we see that this last result is the same as Eq. (2.79d), showing that Eqs. (2.79c) and (2.79d) are really the same formula.

Integral Fourier transforms are connected in a direct and straightforward way to both the Fourier

series and the discrete Fourier transform. This section shows the connection to the Fourier series

and the next section shows the connection to the discrete Fourier transform.²⁴

We begin with an arbitrary, nonpathological function u(t) that has a well-defined Fourier

integral transform. Function u can be complex-valued but its argument t must be real, and U(ƒ) is

the forward Fourier transform of u(t), so

$$ U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt \qquad (2.80a) $$

and

$$ u(t) \;\leftrightarrow\; U(f). \qquad (2.80b) $$

From $u(t)$, we create a new function $u^{[\infty]}(t, T)$ that repeats forever along the $t$ axis at intervals of $T$,

$$ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT). \qquad (2.81a) $$

Although perhaps redundant, it turns out that listing $T$ as one of the arguments of $u^{[\infty]}$ is a convenient way to keep track of the connection between $u$ and $u^{[\infty]}$. Function $u^{[\infty]}$ is called a periodic function of period $T$ because, for any finite positive or negative integer $m$,

$$ u^{[\infty]}(t + mT, T) = u^{[\infty]}(t, T). \qquad (2.81b) $$

Figures 2.10(a) and 2.10(b) show the plots for both $u$ and $u^{[\infty]}$ as functions of $t$. Since function $u$ is left unspecified, $u^{[\infty]}$ can be thought of as representing an arbitrary periodic function. We can also define the partial sum

²⁴ The analysis in Secs. 2.20 and 2.21 is adapted from A. Papoulis, Signal Analysis (McGraw-Hill Book Company, New York, 1977), pp. 76–81.


$$ u^{[N]}(t, T) = \sum_{k=-N}^{N} u(t - kT). \qquad (2.81c) $$

Clearly,

$$ \lim_{N\to\infty} u^{[N]}(t, T) = u^{[\infty]}(t, T). \qquad (2.81d) $$

We assume that $u^{[N]}$ is well behaved with respect to the test functions $\varphi$, so that

$$ \lim_{N\to\infty} \int_{-\infty}^{\infty} \varphi(t)\, u^{[N]}(t, T)\, dt = \int_{-\infty}^{\infty} \varphi(t)\, u^{[\infty]}(t, T)\, dt. \qquad (2.81e) $$

FIGURE 2.10(a). $u(t)$

FIGURE 2.10(b). $u^{[\infty]}(t, T)$

Figure 2.10(a) is a plot of $u(t)$. The solid curve in Fig. 2.10(b), shifted upward from its true position, is $u^{[\infty]}(t, T)$ and the dashed curves represent $u(t)$ displaced by multiples of $T$.

Fourier Series · 2.20

From (2.81e) and the definition of the generalized limit [see Eq. (2.56a)], we then know that

$$ \lim_{N\to\infty} \int_{-\infty}^{\infty} \varphi(t)\, u^{[N]}(t, T)\, dt = \int_{-\infty}^{\infty} \varphi(t)\, \Big[\text{G-lim}_{N\to\infty}\, u^{[N]}(t, T)\Big]\, dt = \int_{-\infty}^{\infty} \varphi(t)\, u^{[\infty]}(t, T)\, dt, $$

so that

$$ \text{G-lim}_{N\to\infty}\, u^{[N]}(t, T) = u^{[\infty]}(t, T). \qquad (2.81f) $$

Following the pattern of the definitions in (2.81a) and (2.81c), we define

$$ \delta^{[N]}(t, T) = \sum_{k=-N}^{N} \delta(t - kT) \qquad (2.82a) $$

and

$$ \delta^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT). \qquad (2.82b) $$

Function $\delta^{[\infty]}(t, T)$ is clearly just another way of writing the shah function $\text{Ш}(t, T)$. [The shah function is defined in Eq. (2.73) and shown equal to $\sum_{k=-\infty}^{\infty} \delta(t - kT)$ in Eq. (2.75d).] The convolution of $u(t)$ with $\delta^{[N]}(t, T) = \sum_{k=-N}^{N} \delta(t - kT)$ is

$$ u(t) * \delta^{[N]}(t, T) = \int_{-\infty}^{\infty} u(t')\, \delta^{[N]}(t - t', T)\, dt' = \sum_{k=-N}^{N} \int_{-\infty}^{\infty} u(t')\, \delta(t - t' - kT)\, dt' = \sum_{k=-N}^{N} u(t - kT), $$

where the next-to-last step uses $\delta(-x) = \delta(x)$ as shown in Eq. (2.68a). The definition of $u^{[N]}$ in (2.81c) then gives

$$ u^{[N]}(t, T) = u(t) * \delta^{[N]}(t, T). \qquad (2.82c) $$


Taking the integral Fourier transform of both sides, using the Fourier convolution theorem [see Eq. (2.72ℓ)], and remembering that $U(f)$ is the forward Fourier transform of $u(t)$, we get

$$ \begin{aligned} \mathcal{F}^{(-ift)}\big(u^{[N]}(t, T)\big) &= \mathcal{F}^{(-ift)}\big(u(t)\big) \cdot \mathcal{F}^{(-ift)}\big(\delta^{[N]}(t, T)\big) \\ &= U(f) \cdot \sum_{k=-N}^{N} \int_{-\infty}^{\infty} e^{-2\pi i f t}\, \delta(t - kT)\, dt \\ &= U(f) \sum_{k=-N}^{N} e^{-2\pi i k f T} \\ &= U(f)\, \frac{\sin\big(2\pi f T (N + \frac{1}{2})\big)}{\sin(\pi f T)}, \end{aligned} \qquad (2.83a) $$

where in the last step we substitute from Eq. (2.77c) above. Having now found that

$$ \mathcal{F}^{(-ift)}\big(u^{[N]}(t, T)\big) = U(f)\, \frac{\sin\big(2\pi f T (N + \frac{1}{2})\big)}{\sin(\pi f T)}, $$

we can take the inverse Fourier transform of both sides to get

$$ u^{[N]}(t, T) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\, \frac{\sin\big(2\pi f T (N + \frac{1}{2})\big)}{\sin(\pi f T)}\, df. \qquad (2.83b) $$

Taking the limit of both sides as $N \to \infty$ gives

$$ u^{[\infty]}(t, T) = \lim_{N\to\infty} \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\, \frac{\sin\big(2\pi f T (N + \frac{1}{2})\big)}{\sin(\pi f T)}\, df. \qquad (2.83c) $$

Moving the limit inside the integral as a generalized limit and applying (2.75e), we get

$$ u^{[\infty]}(t, T) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\; \text{G-lim}_{N\to\infty} \left[ \frac{\sin\big(2\pi f T (N + \frac{1}{2})\big)}{\sin(\pi f T)} \right] df = \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\, \frac{1}{T} \left[ \sum_{k=-\infty}^{\infty} \delta\Big(f - \frac{k}{T}\Big) \right] df $$

or

$$ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} \Big[ T^{-1}\, U\Big(\frac{k}{T}\Big) \Big]\, e^{2\pi i \frac{k t}{T}}. \qquad (2.83d) $$


Equation (2.83d) specifies the Fourier series for an arbitrary periodic function $u^{[\infty]}$, showing that $u^{[\infty]}$ can be written as the infinite sum of complex exponentials multiplied by the complex constants $[T^{-1} U(k/T)]$. To get these complex constants directly from $u^{[\infty]}$, we note that for any real number $\alpha$ and integer $m$,

$$ \frac{1}{T}\, U\Big(\frac{m}{T}\Big) = \frac{1}{T} \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt = \lim_{N\to\infty} \left\{ \frac{1}{T} \int_{\alpha - NT}^{\alpha + (N+1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt \right\} $$

$$ = \lim_{N\to\infty} \frac{1}{T} \left\{ \int_{\alpha - NT}^{\alpha - (N-1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\alpha - (N-1)T}^{\alpha - (N-2)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \cdots + \int_{\alpha - T}^{\alpha} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\alpha}^{\alpha + T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\alpha + T}^{\alpha + 2T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \cdots + \int_{\alpha + NT}^{\alpha + (N+1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt \right\}. $$

This can be simplified to

$$ \frac{1}{T}\, U\Big(\frac{m}{T}\Big) = \lim_{N\to\infty} \frac{1}{T} \sum_{k=-N}^{N} \int_{\alpha + kT}^{\alpha + (k+1)T} e^{-2\pi i \frac{m}{T} t}\, u(t)\, dt. \qquad (2.83e) $$

Making the variable substitution $t' = t - kT$ in each term gives

$$ \int_{\alpha + kT}^{\alpha + (k+1)T} e^{-2\pi i \frac{m}{T} t}\, u(t)\, dt = \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'}\, e^{-2\pi i m k}\, u(t' + kT)\, dt' = \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' + kT)\, dt', $$

where we use that $e^{-2\pi i m k} = 1$. Substituting this into (2.83e) gives

$$ \frac{1}{T}\, U\Big(\frac{m}{T}\Big) = \lim_{N\to\infty} \frac{1}{T} \sum_{k=-N}^{N} \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' + kT)\, dt' = \lim_{N\to\infty} \frac{1}{T} \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'} \left[ \sum_{k'=-N}^{N} u(t' - k'T) \right] dt', $$

where in the last step we have replaced index $k$ by index $k' = -k$. Now, taking the limit inside the integral to get the generalized limit [see Eq. (2.56a) above], we rely on (2.81f) to get

$$ \frac{1}{T}\, U\Big(\frac{m}{T}\Big) = \frac{1}{T} \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'} \left[ \text{G-lim}_{N\to\infty} \sum_{k'=-N}^{N} u(t' - k'T) \right] dt' = \frac{1}{T} \int_{\alpha}^{\alpha + T} e^{-2\pi i \frac{m}{T} t'}\, u^{[\infty]}(t', T)\, dt'. \qquad (2.83f) $$


Equations (2.83d) and (2.83f) let us put the Fourier series into its standard form. For any periodic function

$$ v(t) = u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT) $$

of period $T$, we have found that

$$ v(t) = \sum_{k=-\infty}^{\infty} A_k\, e^{2\pi i k \frac{t}{T}}, \qquad (2.84a) $$

where

$$ A_k = \frac{1}{T} \int_{\alpha}^{\alpha + T} e^{-2\pi i k \frac{t}{T}}\, v(t)\, dt \qquad (2.84b) $$

for any finite value of $\alpha$. Because we did not require $u(t)$ to be real in (2.80a), Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) still hold true for complex periodic functions with real arguments $t$. It is customary—but of course not mandatory—to choose $\alpha = 0$ or $\alpha = -T/2$ in (2.84b).

Using $v(t) = u^{[\infty]}(t, T)$, we know from Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) that the $A_k$ coefficients can be specified in terms of the forward Fourier transform $U(f)$ of $u(t)$,

$$ A_k = \frac{1}{T}\, U\Big(\frac{k}{T}\Big). \qquad (2.85a) $$

When $u$ is real—which means that $v(t) = u^{[\infty]}(t, T)$ is also real—we know from entry 7 of Table 2.1 (located at the end of this chapter) that $U(f)$ must be Hermitian so that

$$ U(-f) = U(f)^{*}. $$

Hence, when $v(t)$ is real in (2.84a), it then follows from (2.85a) that

$$ A_{-k} = A_k^{*} \qquad (2.85b) $$

in (2.84b). This procedure can be extended to all the entries in Table 2.1, giving us the entries in Table 2.2 (also located at the end of this chapter). To go through another example, if $u$ is imaginary and odd, we know from entry 3 of Table 2.1 that $U$ is real and odd, so

$$ U(-f) = -U(f) \quad\text{and}\quad \operatorname{Im} U(f) = 0. $$


From (2.85a), it then follows that

$$ A_{-k} = -A_k \quad\text{and}\quad \operatorname{Im} A_k = 0. \qquad (2.85c) $$

We can show that $v(t) = u^{[\infty]}(t, T)$ is imaginary and odd when $u$ is imaginary and odd (let $k' = -k$),

$$ v(-t) = u^{[\infty]}(-t, T) = \sum_{k=-\infty}^{\infty} u(-t - kT) = \sum_{k'=-\infty}^{\infty} u\big({-(t - k'T)}\big) = -\sum_{k'=-\infty}^{\infty} u(t - k'T) = -u^{[\infty]}(t, T) = -v(t) $$

and

$$ \operatorname{Re} v(t) = \sum_{k=-\infty}^{\infty} \operatorname{Re} u(t - kT) = 0. $$

This shows that we end up with (2.85c) associated with $v(t)$ being imaginary and odd, as stated in entry 3 of Table 2.2.
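The relation $A_k = T^{-1} U(k/T)$ in Eq. (2.85a) can be illustrated numerically. The sketch below (Python with NumPy; the Gaussian $u(t) = e^{-t^2/2}$, whose transform is $U(f) = \sqrt{2\pi}\, e^{-2\pi^2 f^2}$, and the period $T = 3$ are arbitrary choices) periodizes the Gaussian, computes its $A_k$ coefficients from Eq. (2.84b) with $\alpha = 0$, and checks them against samples of the known transform:

```python
import numpy as np

# Numerical check of Eqs. (2.84b) and (2.85a): the Fourier-series
# coefficients A_k of u[inf](t,T) = sum_m u(t - mT) equal (1/T)*U(k/T).
# (Illustrative sketch using a Gaussian, u(t) = exp(-t^2/2), whose
# forward transform is U(f) = sqrt(2*pi)*exp(-2*pi^2*f^2).)
T = 3.0
dt = 1.0e-4
t = np.arange(0.0, T, dt)
v = sum(np.exp(-(t - m * T)**2 / 2.0) for m in range(-6, 7))    # u[inf] over one period

for k in range(4):
    A_k = np.sum(v * np.exp(-2j * np.pi * k * t / T)) * dt / T  # Eq. (2.84b), alpha = 0
    U_k = np.sqrt(2.0 * np.pi) * np.exp(-2.0 * np.pi**2 * (k / T)**2)
    assert abs(A_k - U_k / T) < 1.0e-6                          # Eq. (2.85a)
print("Fourier coefficients match samples of U(f)/T")
```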

A final point worth mentioning about Fourier series is that the $A_k$ coefficients are often reshuffled so that the series can be written as a sum of sines and cosines. Equation (2.84a) can be rewritten as, using $e^{i\theta} = \cos\theta + i\sin\theta$,

$$ v(t) = A_0 + \sum_{k=1}^{\infty} \Big[ A_{-k}\, e^{-2\pi i k \frac{t}{T}} + A_k\, e^{2\pi i k \frac{t}{T}} \Big] = A_0 + \sum_{k=1}^{\infty} \big[ A_{-k} + A_k \big] \cos\Big(\frac{2\pi k t}{T}\Big) + \sum_{k=1}^{\infty} i\,\big[ A_k - A_{-k} \big] \sin\Big(\frac{2\pi k t}{T}\Big). \qquad (2.86a) $$

From (2.84b),

$$ A_0 = \frac{1}{T} \int_{\alpha}^{\alpha + T} v(t)\, dt, \qquad (2.86b) $$

$$ A_{-k} + A_k = \frac{1}{T} \int_{\alpha}^{\alpha + T} v(t) \Big[ e^{2\pi i k \frac{t}{T}} + e^{-2\pi i k \frac{t}{T}} \Big]\, dt = \frac{2}{T} \int_{\alpha}^{\alpha + T} v(t) \cos\Big(\frac{2\pi k t}{T}\Big)\, dt, \qquad (2.86c) $$

and

$$ i\,\big[ A_k - A_{-k} \big] = \frac{i}{T} \int_{\alpha}^{\alpha + T} v(t) \Big[ e^{-2\pi i k \frac{t}{T}} - e^{2\pi i k \frac{t}{T}} \Big]\, dt = \frac{2}{T} \int_{\alpha}^{\alpha + T} v(t) \sin\Big(\frac{2\pi k t}{T}\Big)\, dt. \qquad (2.86d) $$


These results let us write

$$ v(t) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k \cos\Big(\frac{2\pi k t}{T}\Big) + \sum_{k=1}^{\infty} s_k \sin\Big(\frac{2\pi k t}{T}\Big), \qquad (2.87a) $$

where

$$ c_k = \frac{2}{T} \int_{\alpha}^{\alpha + T} v(t) \cos\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 0, 1, 2, \ldots \qquad (2.87b) $$

and

$$ s_k = \frac{2}{T} \int_{\alpha}^{\alpha + T} v(t) \sin\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 1, 2, 3, \ldots. \qquad (2.87c) $$

The absolute value signs are dropped from index $k$ because it is defined positive in (2.87a), and $A_0$ is replaced by $c_0/2$ so that the formula for $c_0$ can be folded into the general formula for $c_k$ in (2.87b). Although it is still not mandatory, parameter $\alpha$ is usually given the value 0 or $-T/2$.
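As a concrete example of Eqs. (2.87a)–(2.87c), the sketch below (Python with NumPy; the period $T = 2$ is an arbitrary choice) computes the $c_k$ and $s_k$ coefficients of a square wave, whose sine coefficients are the classic $4/(\pi k)$ for odd $k$:

```python
import numpy as np

# Numerical illustration of Eqs. (2.87b)-(2.87c): sine/cosine Fourier
# coefficients of a square wave of period T (value +1 on the first half
# period, -1 on the second).  The classic result is s_k = 4/(pi*k) for
# odd k, with all other coefficients zero.  (Illustrative sketch.)
T = 2.0
dt = 1.0e-5
t = np.arange(0.0, T, dt)
v = np.where(t < T / 2, 1.0, -1.0)

for k in range(1, 6):
    c_k = (2.0 / T) * np.sum(v * np.cos(2 * np.pi * k * t / T)) * dt
    s_k = (2.0 / T) * np.sum(v * np.sin(2 * np.pi * k * t / T)) * dt
    expected_s = 4.0 / (np.pi * k) if k % 2 == 1 else 0.0
    assert abs(c_k) < 1e-3 and abs(s_k - expected_s) < 1e-3
print("square-wave coefficients match 4/(pi*k) for odd k")
```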

Nowhere has $v$ been required to be real, so Eqs. (2.87a)–(2.87c), just like Eqs. (2.84a) and (2.84b), still hold true when $v$ is a complex-valued periodic function of (real) period $T$. Indeed, if $v$ is a complex-valued function of a real argument $t$, both its real part

$$ v_R(t) = \operatorname{Re} v(t) $$

and its imaginary part

$$ v_I(t) = \operatorname{Im} v(t) $$

are real-valued periodic functions of period $T$. This means that, for any integer $m$, we have

$$ v_R(t \pm mT) = v_R(t) \qquad (2.88b) $$

and

$$ v_I(t \pm mT) = v_I(t). \qquad (2.88c) $$


$$ v_R(t) = \frac{\operatorname{Re}(c_0)}{2} + \sum_{k=1}^{\infty} \operatorname{Re}(c_k) \cos\Big(\frac{2\pi k t}{T}\Big) + \sum_{k=1}^{\infty} \operatorname{Re}(s_k) \sin\Big(\frac{2\pi k t}{T}\Big), \qquad (2.89a) $$

with

$$ \operatorname{Re}(c_k) = \frac{2}{T} \int_{\alpha}^{\alpha + T} v_R(t) \cos\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 0, 1, 2, \ldots \qquad (2.89b) $$

and

$$ \operatorname{Re}(s_k) = \frac{2}{T} \int_{\alpha}^{\alpha + T} v_R(t) \sin\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 1, 2, 3, \ldots, \qquad (2.89c) $$

as well as

$$ v_I(t) = \frac{\operatorname{Im}(c_0)}{2} + \sum_{k=1}^{\infty} \operatorname{Im}(c_k) \cos\Big(\frac{2\pi k t}{T}\Big) + \sum_{k=1}^{\infty} \operatorname{Im}(s_k) \sin\Big(\frac{2\pi k t}{T}\Big), \qquad (2.90a) $$

with

$$ \operatorname{Im}(c_k) = \frac{2}{T} \int_{\alpha}^{\alpha + T} v_I(t) \cos\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 0, 1, 2, \ldots \qquad (2.90b) $$

and

$$ \operatorname{Im}(s_k) = \frac{2}{T} \int_{\alpha}^{\alpha + T} v_I(t) \sin\Big(\frac{2\pi k t}{T}\Big)\, dt \quad\text{for } k = 1, 2, 3, \ldots. \qquad (2.90c) $$

The first step in going from the integral Fourier transform to the discrete Fourier transform is to repeat the procedure used in Sec. 2.20 to get the Fourier series. We pick a nonpathological function $u(t)$ having a forward Fourier transform

$$ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt \qquad (2.91a) $$

and, following the same procedure used in Eq. (2.81a) above, create a periodic function of period $T$:

$$ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT). \qquad (2.91b) $$

As was shown in Sec. 2.20, we can now write the associated Fourier series as [see Eq. (2.83d)]


$$ u^{[\infty]}(t, T) = \frac{1}{T} \sum_{k=-\infty}^{\infty} U\Big(\frac{k}{T}\Big)\, e^{2\pi i \frac{k t}{T}}. \qquad (2.91c) $$

Next we divide the period $T$ of $u^{[\infty]}$ into $N$ equal lengths, $\Delta t = T/N$, and evaluate (2.91c) only for $t = m\,\Delta t$ with $m = 0, 1, 2, \ldots, N-1$,

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{k=-\infty}^{\infty} U\Big(\frac{k}{T}\Big)\, e^{2\pi i \frac{k m}{N}}, \qquad (2.92a) $$

where we have used

$$ N\,\Delta t = T \qquad (2.92b) $$

to simplify the exponent of (2.92a). The infinite sum in (2.92a) can be split in two by making the substitution $k = n + rN$ with $n = 0, 1, 2, \ldots, N-1$ and $r = 0, \pm 1, \pm 2, \ldots$. This gives

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{r=-\infty}^{\infty} \sum_{n=0}^{N-1} U\Big(\frac{n + rN}{T}\Big)\, e^{2\pi i \frac{n m}{N}}\, e^{2\pi i r m}. $$

Since $e^{2\pi i r m} = 1$ and $T = N\,\Delta t$, this becomes, making the index substitution $r' = -r$,

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \frac{n m}{N}} \sum_{r'=-\infty}^{\infty} U\Big(\frac{n}{T} - \frac{r'}{\Delta t}\Big) $$

or

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \frac{n m}{N}}\; U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big), \qquad (2.93a) $$

where we follow the pattern of Eqs. (2.81a) and (2.91b) and define

$$ U^{[\infty]}(f, F) = \sum_{r=-\infty}^{\infty} U(f - rF). \qquad (2.93b) $$

Equation (2.93a) is a somewhat disguised version of the discrete Fourier transform (DFT). Figures 2.11(a) and 2.11(b) show the relationship of the two periodic functions $u^{[\infty]}$ and $U^{[\infty]}$, graphed with solid lines, to the two original functions $u$ and $U$ graphed with dashed lines. [In graphs such as these, $u(t)$ typically stands for data and is usually real, making it easy to represent

Discrete Fourier Transform · 2.21

with a two-dimensional plot; but its transform $U(f)$ is often complex, so it makes more sense to plot $\lvert U(f)\rvert$ if we just want to show where $U(f)$ is different from zero.] When function $u^{[\infty]}$ has period $T$ and is uniformly sampled at intervals of $\Delta t$, then function $U^{[\infty]}$ has period

$$ F = \frac{1}{\Delta t} \qquad (2.93c) $$

and is uniformly sampled at intervals of

$$ \Delta f = \frac{1}{T}. \qquad (2.93d) $$

Note, of course, we could also say that $u^{[\infty]}$ has period $1/\Delta f$ and is uniformly sampled at intervals of $1/F$ when $U^{[\infty]}$ has period $F$ and is sampled at intervals of $\Delta f$. When both $\Delta f$ and $\Delta t$ are known, we have from (2.92b) and (2.93d) that

$$ \Delta f \cdot \Delta t = \frac{1}{N}. \qquad (2.93e) $$

Figures 2.12(a) and 2.12(b) show that if $T$ and $F$ are large and functions $u(t)$ and $U(f)$ die away relatively quickly when $\lvert t\rvert$ and $\lvert f\rvert$ are large—which means that $u$ and $U$ are localized near the $t$ and $f$ origins—then the corresponding periodic functions $u^{[\infty]}(t, T)$ and $U^{[\infty]}(f, F)$ can be used to approximate the non-negligible regions of $u$ and $U$. Almost always when the DFT is used, its users have in mind a situation such as that shown in Figs. 2.12(a) and 2.12(b), with $u^{[\infty]}$ and $U^{[\infty]}$ being good approximations of $u$ and $U$ for small to moderately large values of $\lvert t\rvert$ and $\lvert f\rvert$.

To complete the DFT transform pair, we define

$$ w_N = e^{\frac{2\pi i}{N}} \qquad (2.94a) $$

so that (2.93a) becomes

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} w_N^{\,n m}\; U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big). \qquad (2.94b) $$

Multiplying both sides by $w_N^{-mk}$ and summing over $m$ gives

$$ \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk} = \frac{1}{T} \sum_{n=0}^{N-1} \left\{ U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big) \cdot \left[ \sum_{m=0}^{N-1} w_N^{\,m(n-k)} \right] \right\}. \qquad (2.94c) $$


FIGURE 2.11(a). $u^{[\infty]}(t, T)$ plotted against $t$; the period is $T$ and the sample spacing is $\Delta t = 1/F$.

FIGURE 2.11(b). $U^{[\infty]}(f, F)$ plotted against $f$; the period is $F$ and the sample spacing is $\Delta f = 1/T$.

The sum over $m$ on the right-hand side is the sum of a geometric series,

$$ V^{[N]}_{n,k} = \sum_{m=0}^{N-1} w_N^{\,m(n-k)}. \qquad (2.94d) $$

This can be solved using the standard procedure for geometric sums [see the analysis following Eq. (2.77b) above], multiplying every term in the sum by $w_N^{\,n-k}$ to get

$$ V^{[N]}_{n,k} = \frac{1 - w_N^{\,N(n-k)}}{1 - w_N^{\,n-k}} = \frac{1 - e^{2\pi i (n-k)}}{1 - e^{2\pi i \left(\frac{n-k}{N}\right)}}, \qquad (2.94f) $$


where in the last step definition (2.94a) is used to eliminate $w_N$. Index $n$ goes from zero to $N-1$ for each value of $k$ [see Eqs. (2.94b) and (2.94c)]. Deciding also to restrict $k$ to one of the integers $k = 0, 1, 2, \ldots, N-1$, we see that the denominator in (2.94f) can be zero only when $n = k$. This looks like it could be a problem, but when $n = k$, we can return to the original formula in (2.94d), noting that for $n = k$ the sum $V^{[N]}_{n,k}$ is equal to $N$. When $n \ne k$, the right-hand side of (2.94f) shows that $V^{[N]}_{n,k}$ is zero because $e^{2\pi i (n-k)} = 1$. We conclude that

$$ V^{[N]}_{n,k} = \begin{cases} N & \text{for } n = k \\ 0 & \text{for } n \ne k \end{cases} \;=\; N\,\delta_{kn}, \qquad (2.94g) $$

where $\delta_{kn}$ is the Kronecker delta,

$$ \delta_{kn} = \begin{cases} 1 & \text{for } n = k \\ 0 & \text{for } n \ne k. \end{cases} \qquad (2.94h) $$
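The orthogonality result in Eqs. (2.94g) and (2.94h) is easy to confirm numerically. This sketch (Python with NumPy; $N = 8$ is an arbitrary choice) builds the full matrix of sums $V^{[N]}_{n,k}$:

```python
import numpy as np

# Numerical check of Eqs. (2.94d)-(2.94h): the geometric sum
# V[N]_{n,k} = sum_{m=0}^{N-1} w_N^{m(n-k)}, with w_N = exp(2*pi*i/N),
# equals N when n = k and zero otherwise (N times the Kronecker delta).
# (Illustrative sketch with N = 8.)
N = 8
w = np.exp(2j * np.pi / N)
V = np.array([[sum(w**(m * (n - k)) for m in range(N))
               for n in range(N)] for k in range(N)])
print(np.allclose(V, N * np.eye(N)))       # True
```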

Applying (2.94g) to (2.94c) gives

$$ \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk} = \frac{1}{T} \sum_{n=0}^{N-1} \left\{ U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big) \cdot V^{[N]}_{n,k} \right\}, $$

so that

$$ \frac{N}{T}\, U^{[\infty]}\Big(\frac{k}{T}, \frac{1}{\Delta t}\Big) = \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk}. \qquad (2.94i) $$

This equation is the other half of the DFT [the first half is specified by Eqs. (2.94a) and (2.94b)]. Using Eqs. (2.94a) and (2.92b) to replace $w_N$ by $e^{(2\pi i)/N}$ and $N/T$ by $1/\Delta t$, we write (2.94b) and (2.94i) as

$$ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \left(\frac{m n}{N}\right)}\; U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big) \qquad (2.95a) $$

and

$$ U^{[\infty]}\Big(\frac{n}{T}, \frac{1}{\Delta t}\Big) = \Delta t \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, e^{-2\pi i \left(\frac{m n}{N}\right)}, \qquad (2.95b) $$


FIGURE 2.12(a). $u^{[\infty]}(t, T)$ plotted against $t$ when $u$ is well localized near the origin; the period is $T$ and the sample spacing is $\Delta t = 1/F$.

FIGURE 2.12(b). $U^{[\infty]}(f, F)$ plotted against $f$ when $U$ is well localized near the origin; the period is $F$ and the sample spacing is $\Delta f = 1/T$.

where index $k$ has been replaced by $n$ in (2.94i). This can also be written as, using Eqs. (2.93c) and (2.93d),

$$ u^{[\infty]}(m\,\Delta t, T) = \Delta f \sum_{n=0}^{N-1} e^{2\pi i \left(\frac{m n}{N}\right)}\; U^{[\infty]}(n\,\Delta f, F) \qquad (2.95c) $$

and

$$ U^{[\infty]}(n\,\Delta f, F) = \Delta t \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, e^{-2\pi i \left(\frac{m n}{N}\right)}. \qquad (2.95d) $$

The forward and inverse DFTs shown in (2.95c) and (2.95d) are often written as

$$ u_m = \sum_{n=0}^{N-1} U_n\, e^{2\pi i \left(\frac{m n}{N}\right)} \qquad (2.96a) $$

and

$$ U_n = \frac{1}{N} \sum_{m=0}^{N-1} u_m\, e^{-2\pi i \left(\frac{m n}{N}\right)}. \qquad (2.96b) $$

Here we have defined

$$ u_m = u^{[\infty]}(m\,\Delta t, T) \qquad (2.96c) $$

and

$$ U_n = \Delta f \cdot U^{[\infty]}(n\,\Delta f, F), \qquad (2.96d) $$

and to get Eq. (2.96b), both sides of (2.95d) are multiplied by $\Delta f$, using (2.93e) to replace $\Delta f \cdot \Delta t$ by $1/N$. We can also define

$$ U_n = U^{[\infty]}(n\,\Delta f, F) \qquad (2.97a) $$

and

$$ u_m = \Delta t \cdot u^{[\infty]}(m\,\Delta t, T) \qquad (2.97b) $$

to get

$$ u_m = \frac{1}{N} \sum_{n=0}^{N-1} U_n\, e^{2\pi i \left(\frac{m n}{N}\right)} \qquad (2.97c) $$

and

$$ U_n = \sum_{m=0}^{N-1} u_m\, e^{-2\pi i \left(\frac{m n}{N}\right)}. \qquad (2.97d) $$
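The DFT pair in Eqs. (2.97c) and (2.97d) can be checked directly against a library FFT. Note that NumPy's `numpy.fft.fft` follows the convention of Eq. (2.97d), with the $1/N$ factor appearing in the inverse, Eq. (2.97c). (Illustrative sketch on random complex data.)

```python
import numpy as np

# Direct implementation of the DFT pair in Eqs. (2.97c) and (2.97d),
# compared against numpy's FFT.  numpy.fft.fft matches the convention of
# Eq. (2.97d): no 1/N on the forward sum, 1/N on the inverse.
rng = np.random.default_rng(0)
N = 16
u = rng.standard_normal(N) + 1j * rng.standard_normal(N)

m = np.arange(N)
U = np.array([np.sum(u * np.exp(-2j * np.pi * m * n / N))
              for n in range(N)])                               # Eq. (2.97d)
u_back = np.array([np.sum(U * np.exp(2j * np.pi * mm * np.arange(N) / N)) / N
                   for mm in range(N)])                         # Eq. (2.97c)

print(np.allclose(U, np.fft.fft(u)))       # True
print(np.allclose(u_back, u))              # True
```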


Figures 2.13(a) and 2.13(b) show how the continuous functions $u^{[\infty]}$ and $U^{[\infty]}$ are sampled to create the DFT formulas in the previous paragraph. The values of the original functions $u$ and $U$ are ignored for negative values of $t$ and $f$; instead, we sample $u^{[\infty]}$ and $U^{[\infty]}$ out to $t = T$ and $f = F$, picking up the original $u$ and $U$ values at negative $t$ and $f$ where they repeat near $t = T$ and $f = F$. Many times DFT plots show $u_m$ and $U_n$ with $n$ and $m$ running from 0 to $N-1$. When this is done, it is with the understanding that the large index values greater than $N/2$ represent $u$ and $U$ for negative $t$ and $f$ values respectively.

The DFT is important because there is an algorithm, called the fast Fourier transform (FFT), that allows computers to calculate the sums in Eqs. (2.96a), (2.96b), (2.97c), and (2.97d) rapidly when $N$ is a multiple of 2. The FFT performs best when $N = 2^j$ for $j$ a positive integer. In fact, when faced with calculating an integral Fourier transform

$$ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt $$

over a range of $f$ values for an arbitrary function $u(t)$, it is standard practice to convert the integral to a DFT and do the job on a computer with an FFT. As we saw in the previous section, the DFT deals directly with $u^{[\infty]}$ and $U^{[\infty]}$ rather than $u$ and $U$. Thus, successfully using the DFT to calculate the integral transform requires that $u^{[\infty]}$ and $U^{[\infty]}$ consist of well-separated, repetitive regions of $u$ and $U$, as shown in Figs. 2.12(a) and 2.12(b), instead of overlapping regions of $u$ and $U$, as shown in Figs. 2.11(a) and 2.11(b). Ensuring that $u^{[\infty]}$ consists of nonoverlapping regions of $u$ tends to occur naturally; the shape of $u$ is already known, so there is no real difficulty in picking $T$ large enough to prevent significant amounts of overlap in $u^{[\infty]}$. The shape of $U$, however, is not known in advance, so care must be taken to avoid significant amounts of overlap in $U^{[\infty]}$.
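The conversion just described can be sketched as follows (Python with NumPy; the Gaussian $u(t) = e^{-\pi t^2}$, whose integral transform is known to be $U(f) = e^{-\pi f^2}$, and the values of $N$ and $\Delta t$ are arbitrary choices). Samples of the periodized $u$ are fed to an FFT and scaled by $\Delta t$ as in Eq. (2.95d); because $u$ and $U$ are both well localized, the first few bins reproduce $U(n\,\Delta f)$ closely:

```python
import numpy as np

# Sketch of the standard practice described above: approximating the
# integral Fourier transform with an FFT.  Samples of u(t) = exp(-pi*t^2),
# whose transform is U(f) = exp(-pi*f^2), are fed to numpy.fft.fft and the
# result is scaled by dt as in Eq. (2.95d).
N = 1024
dt = 0.05                      # T = N*dt is large enough that u ~ 0 at the ends
t = np.arange(N) * dt          # t runs from 0 toward T; negative t wraps near T
u = np.exp(-np.pi * np.minimum(t, N * dt - t)**2)   # periodized Gaussian
U = dt * np.fft.fft(u)         # approximates U(n*df) with df = 1/(N*dt)
df = 1.0 / (N * dt)
f = np.arange(8) * df
print(np.max(np.abs(U[:8].real - np.exp(-np.pi * f**2))))   # tiny: close match
```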

Consider what happens when the DFT is used to analyze a real signal $u(t)$ having the spectrum $U(f)$, and we know that $U(f)$ is zero for all $f \ge f_{\max}$ and nonzero for $0 < f < f_{\max}$. Because $u$ is real, we know from entry 7 in Table 2.1 that $U(-f) = U(f)^{*}$, ensuring that $U(f)$ is also nonzero for negative frequency values $-f_{\max} < f < 0$; that is, for every positive $f$ at which $U$ is nonzero there must be a $-f$ at which $U$ is nonzero, and because $U$ is zero for $f \ge f_{\max}$ it follows that $U$ is zero for all $f \le -f_{\max}$. Hence $U$ can be represented schematically by the solid triangle centered on the origin of Fig. 2.14. To construct $U^{[\infty]}$, we write

Aliasing as an Error · 2.22

FIGURE 2.13(a). [Graph of $u^{[\infty]}(t,T)$ over one period T, sampled at intervals $\Delta t = 1/F$ along the t axis.]

FIGURE 2.13(b). [Graph of $U^{[\infty]}(f,F)$ over one period F, sampled at intervals $\Delta f = 1/T$ along the f axis.]

2 · Fourier Theory

$$U^{[\infty]}(f, F) = \sum_{k=-\infty}^{\infty} U(f - kF), \qquad (2.98a)$$

where the smallest we can make F and still avoid overlap is, as shown by the dotted triangles in Fig. 2.14,

$$F = 2 f_{\max}. \qquad (2.98b)$$

We also know that

$$F = \frac{1}{\Delta t},$$

where Δt is the interval in t between adjacent samples of u(t). If Δt is made smaller, then F increases, moving the regions of nonzero U further apart in Fig. 2.14; and if Δt is made larger, then F decreases, forcing the regions of nonzero U to overlap in Fig. 2.14. Making Δt smaller is wasteful, in that more effort than is needed goes into sampling u(t), and making Δt larger damages the integrity of the U calculations for large values of ƒ near $f_{\max}$. Clearly, the frequency value F/2 plays an important role in DFT analysis, because optimum performance requires $f_{\max} \le F/2$. For this reason the frequency F/2 is given a special name: the Nyquist frequency $f_{\mathrm{Nyq}} = F/2$. From (2.93c), we see that

$$f_{\mathrm{Nyq}} = \frac{1}{2\Delta t}. \qquad (2.99a)$$

A realistic system, of course, is designed with some built-in margin for error. The requirement then becomes that Δt be small enough to separate unexpectedly high frequencies when the highest expected frequency is $f_{\max}$. To provide this margin, we take

$$f_{\mathrm{Nyq}} = \frac{1}{2\Delta t} > f_{\max} \qquad (2.99b)$$

or

$$\Delta t < \frac{1}{2 f_{\max}}. \qquad (2.99c)$$

Now the region between $f_{\max}$ and $f_{\mathrm{Nyq}}$ is available for analysis of unexpectedly high frequencies.
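A quick numeric illustration of Eqs. (2.99a)–(2.99c); the numbers below are hypothetical, chosen only to show the margin:

```python
# Hypothetical numbers illustrating the Nyquist margin of Eqs. (2.99a)-(2.99c).
f_max = 100.0                 # highest expected frequency (Hz)
dt = 0.8 / (2 * f_max)        # Δt chosen 20% below the limit 1/(2 f_max)
f_nyq = 1.0 / (2 * dt)        # Nyquist frequency, Eq. (2.99a)
print(f_nyq)                  # 125.0 Hz; the 100-125 Hz band is the margin
```

The band between f_max = 100 Hz and f_Nyq = 125 Hz is then available for catching unexpectedly high frequencies.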

Suppose U(ƒ) is negligible everywhere except at two frequencies, the positive frequency $f_0$ and the corresponding negative frequency $-f_0$. Since U(ƒ) is the transform of a real signal, entry 7 of Table 2.1 requires $U(-f) = U(f)^*$, forcing the existence of a non-negligible transform value at $-f_0$ when there is a non-negligible transform value at $f_0$. The two frequencies are represented by wide, solid-sided arrows in Fig. 2.15. The arrows represent isolated, narrow regions where U is very large, so we can think of them as proportional to delta functions and write U(ƒ) as

$$U(f) = A\,\delta(f - f_0) + B\,\delta(f + f_0).$$

Variables A and B are arbitrary complex constants. We have just seen that Table 2.1 requires $U(-f) = U(f)^*$. Because the delta functions are real, the equation $U(-f) = U(f)^*$ can be written as

$$A\,\delta(-f - f_0) + B\,\delta(-f + f_0) = A^*\,\delta(f - f_0) + B^*\,\delta(f + f_0)$$

or, since the delta functions are also even [see Eq. (2.68a)],

$$A\,\delta(f + f_0) + B\,\delta(f - f_0) = A^*\,\delta(f - f_0) + B^*\,\delta(f + f_0).$$

This can only be true if $B = A^*$ (which is, of course, the same thing as having $A = B^*$). Therefore, we have the freedom to choose only one arbitrary complex constant, say A, and after making that choice function U(ƒ) becomes

______________________________________________________________________________

FIGURE 2.14. [The solid triangle centered on the origin is U(ƒ), zero outside $(-f_{\max}, f_{\max})$; the dotted triangles at $\pm F$ are the neighboring copies in $U^{[\infty]}(f,F)$, which just avoid overlapping when $F = 2 f_{\max}$.]

$$U(f) = A\,\delta(f - f_0) + A^*\,\delta(f + f_0). \qquad (2.100a)$$

It is not difficult to figure out what happens when the DFT is used to calculate this double-delta frequency spectrum. If the double-delta U(ƒ) is used to construct $U^{[\infty]}(f,F)$ according to formula (2.98a), we get multiple isolated regions where $U^{[\infty]}$ is very large, as shown by the wide dashed arrows in Fig. 2.15. The curved single arrows show which wide dashed arrows come from the wide, solid-sided arrow at $f_0$ and which come from the wide, solid-sided arrow at $-f_0$. For example, the wide dashed arrow closest to $f_0$ comes from the wide solid-sided arrow at $-f_0$, and the wide dashed arrow closest to $-f_0$ comes from the wide solid-sided arrow at $f_0$. The two wide solid-sided arrows at $f_0$ and $-f_0$ lie a distance a inside the positions of the positive and negative Nyquist frequencies $f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$, and the two wide dashed arrows that are closest to $f_0$ and $-f_0$ lie a distance a outside $f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$. We see that the original double-delta U(ƒ) transform can be written as [from Eq. (2.100a), with $f_0 = f_{\mathrm{Nyq}} - a$]

$$U(f) = A\,\delta(f - f_{\mathrm{Nyq}} + a) + A^*\,\delta(f + f_{\mathrm{Nyq}} - a), \qquad (2.100b)$$

and we can pair up the two wide dashed arrows closest to $f_0$ and $-f_0$ to create the transform

$$U^{[1]}(f) = A^*\,\delta(f - f_{\mathrm{Nyq}} - a) + A\,\delta(f + f_{\mathrm{Nyq}} + a). \qquad (2.100c)$$

Because the delta function $\delta(f + f_{\mathrm{Nyq}} - a) = \delta(f + f_0)$ has the coefficient $A^*$ in (2.100b), the curved single arrow going from $-f_0$ to $f_{\mathrm{Nyq}} + a$ shows that the delta function $\delta(f - f_{\mathrm{Nyq}} - a)$ at $f_{\mathrm{Nyq}} + a$ must have the coefficient $A^*$ in Eq. (2.100c); similarly, the curved single arrow going from $f_0$ to $-f_{\mathrm{Nyq}} - a$ shows that the delta function $\delta(f + f_{\mathrm{Nyq}} + a)$ at $-f_{\mathrm{Nyq}} - a$ must have the coefficient A in Eq. (2.100c). Nothing stops us from continuing out from the origin, pairing the wide dashed arrows at $f = 3 f_{\mathrm{Nyq}} - a$ and $f = -3 f_{\mathrm{Nyq}} + a$ to get

$$U^{[2]}(f) = A\,\delta(f - 3 f_{\mathrm{Nyq}} + a) + A^*\,\delta(f + 3 f_{\mathrm{Nyq}} - a) \qquad (2.100d)$$

and pairing the wide dashed arrows at $f = 3 f_{\mathrm{Nyq}} + a$ and $f = -3 f_{\mathrm{Nyq}} - a$ to get

$$U^{[3]}(f) = A^*\,\delta(f - 3 f_{\mathrm{Nyq}} - a) + A\,\delta(f + 3 f_{\mathrm{Nyq}} + a). \qquad (2.100e)$$

FIGURE 2.15. [The wide solid-sided arrows at frequencies $-f_0$ and $f_0$ each lie a distance a inside $\mp f_{\mathrm{Nyq}}$; the wide dashed arrows are their aliases, spaced at intervals of $F = 2 f_{\mathrm{Nyq}}$.]

Each time, the curved single arrows in Fig. 2.15 are consulted to find the coefficients of the delta functions. This can obviously be continued out to indefinitely large values of ƒ, creating the paired transforms $U^{[4]}, U^{[5]}, \ldots$, etc. The general formula for $U^{[k]}$ turns out to be

$$U^{[k]}(f) = \begin{cases} A\,\delta(f - f_{\mathrm{Nyq}} - k f_{\mathrm{Nyq}} + a) + A^*\,\delta(f + f_{\mathrm{Nyq}} + k f_{\mathrm{Nyq}} - a) & \text{for } k \text{ even} \\[4pt] A^*\,\delta(f - f_{\mathrm{Nyq}} - (k-1) f_{\mathrm{Nyq}} - a) + A\,\delta(f + f_{\mathrm{Nyq}} + (k-1) f_{\mathrm{Nyq}} + a) & \text{for } k \text{ odd} \end{cases} \qquad (2.100f)$$

We started out with the double-delta U(ƒ) being the forward Fourier transform of u(t), which means that u(t) is the inverse Fourier transform of the double-delta U(ƒ),

$$u(t) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi i f t}\,df.$$

We now show that u(t), the inverse transform of the double-delta U(ƒ), and $u^{[1]}(t), u^{[2]}(t), \ldots$, the inverse transforms of $U^{[1]}, U^{[2]}, \ldots$, all have the same values at $t = m\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$:

$$u(m\Delta t) = u^{[1]}(m\Delta t) = u^{[2]}(m\Delta t) = \cdots = u^{[k]}(m\Delta t) = \cdots. \qquad (2.100g)$$

We begin by taking the inverse Fourier transform of the double-delta U(ƒ) function specified in (2.100b),

$$u(t) = \int_{-\infty}^{\infty} \left[ A\,\delta(f - f_{\mathrm{Nyq}} + a) + A^*\,\delta(f + f_{\mathrm{Nyq}} - a) \right] e^{2\pi i f t}\,df = A e^{2\pi i t (f_{\mathrm{Nyq}} - a)} + A^* e^{2\pi i t (a - f_{\mathrm{Nyq}})} = 2\,\mathrm{Re}\!\left[ A e^{2\pi i t (f_{\mathrm{Nyq}} - a)} \right]. \qquad (2.101a)$$

Taking the inverse Fourier transform of $U^{[k]}$ in Eq. (2.100f) gives, in the same way,

$$u^{[k]}(t) = \begin{cases} 2\,\mathrm{Re}\!\left[ A e^{2\pi i t (f_{\mathrm{Nyq}} + k f_{\mathrm{Nyq}} - a)} \right] & \text{for } k \text{ even} \\[4pt] 2\,\mathrm{Re}\!\left[ A^* e^{2\pi i t (f_{\mathrm{Nyq}} + (k-1) f_{\mathrm{Nyq}} + a)} \right] & \text{for } k \text{ odd} \end{cases}. \qquad (2.101b)$$

Substituting $t = m\Delta t$ from (2.100g) and $f_{\mathrm{Nyq}} = 1/(2\Delta t)$ from (2.99a) into Eq. (2.101a) gives

$$u(m\Delta t) = 2\,\mathrm{Re}\!\left[ A e^{2\pi i m \Delta t \left( (2\Delta t)^{-1} - a \right)} \right] = 2\,\mathrm{Re}\!\left[ A e^{i\pi m} e^{-2\pi i m a \Delta t} \right] = 2\,\mathrm{Re}\!\left[ (-1)^m A\,e^{-2\pi i m a \Delta t} \right]. \qquad (2.101c)$$

Making the same substitutions in Eq. (2.101b) gives

$$u^{[k]}(m\Delta t) = \begin{cases} 2\,\mathrm{Re}\!\left[ A e^{2\pi i m \Delta t \left( (2\Delta t)^{-1} + k (2\Delta t)^{-1} - a \right)} \right] = 2\,\mathrm{Re}\!\left[ A e^{i\pi m} e^{i\pi m k} e^{-2\pi i m a \Delta t} \right] & \text{for } k \text{ even} \\[4pt] 2\,\mathrm{Re}\!\left[ A^* e^{2\pi i m \Delta t \left( (2\Delta t)^{-1} + (k-1)(2\Delta t)^{-1} + a \right)} \right] = 2\,\mathrm{Re}\!\left[ A^* e^{i\pi m} e^{i\pi m (k-1)} e^{2\pi i m a \Delta t} \right] & \text{for } k \text{ odd} \end{cases}. \qquad (2.101d)$$

But $e^{\pm i\pi m k} = (-1)^{mk} = 1$ when k is even and $e^{\pm i\pi m (k-1)} = (-1)^{m(k-1)} = 1$ when k is odd, so this last result can be written as

$$u^{[k]}(m\Delta t) = \begin{cases} 2\,\mathrm{Re}\!\left[ (-1)^m A\,e^{-2\pi i m a \Delta t} \right] & \text{for } k \text{ even} \\[4pt] 2\,\mathrm{Re}\!\left[ (-1)^m A^*\,e^{2\pi i m a \Delta t} \right] & \text{for } k \text{ odd} \end{cases}. \qquad (2.101e)$$

Comparing this with (2.101c), and remembering that the real part of a complex number equals the real part of its complex conjugate, we conclude that $u(m\Delta t) = u^{[k]}(m\Delta t)$ for all values of m and k, showing that (2.100g) must be true. Because the $u^{[k]}$ functions have exactly the same values as the function u at $t = m\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$, the $u^{[k]}$ functions are called aliases of function u. Figure 2.16 graphs an example of u(t) and its alias $u^{[1]}(t)$ to show how u and $u^{[1]}$ can have identical values at all the sample positions on the t axis.

The term “alias” is an interesting one; it suggests that there is no real way to distinguish these functions if all we know are the values of the samples at $t = m\Delta t$. Yet in Figs. 2.14 and 2.15, there is really no question as to which is the correct region of $U^{[\infty]}$; spectral values whose frequencies do not lie between $+f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$ can clearly be disregarded. Consider, however, that before u(t) is analyzed there is no guarantee as to what the correct value of $f_{\max}$ is. Figure 2.17, for example, shows a pattern for $U^{[\infty]}$ that seems to have well-separated regions for U and all its aliases when in fact there is a high-frequency triangle hidden by aliasing. The unwary analyst might conclude that U has the shape shown in Fig. 2.18(a) when its true shape is the one shown in Fig. 2.18(b). There is really no way to be sure of the true shape of U when all that is known is the DFT of the sampled signal u(t). The basic problem, which is that the DFT is the sampled version of $U^{[\infty]}$ instead of U, does not disappear when $F = 1/\Delta t$ is made larger by decreasing the sampling interval Δt; there is always the possibility that the true U curve is broad enough to overlap. Returning to Fig. 2.16, we see that no matter how small Δt is made, the information thrown away from between the samples inevitably allows high frequencies to masquerade as low frequencies. There is no foolproof method for both sampling the data and avoiding this possibility.

Fortunately, there are usually ways of avoiding this logical dead end. As is pointed out in Sec.

2.2 above [see discussion after Eq. (2.9b)], in practice all measurements are sampled and, before

representing them by continuous functions, we must know that the samples capture all the


relevant detail. In other words, there must be some way of knowing, based on past experience or

knowledge of how the data is gathered, that the sampling is rapid enough to represent faithfully

all the important high-frequency details. In terms of the notation used to discuss Fig. 2.14, we

must eventually be prepared to say that, for some specific ƒmax, no higher frequencies are present

to create aliasing—that is, we must know that if more closely spaced sampling is done all that

would be found is a smooth, quasi-linear variation between the current samples. Many times the

electronic instruments used to make the measurements cannot sense high-frequency data, so even

if high-frequency components exist, they cannot be recorded. Other times, all that can be done is

to look at the data samples and decide whether it is reasonable to suspect the presence of unseen

high-frequency components. The data in Fig. 2.19(a), for example, almost certainly do not

contain significant amounts of unseen high frequencies, whereas unseen high frequencies could

well be present in Fig. 2.19(b). There may be cases where all that can be done is to shorten Δt and

see whether previously aliased frequency components suddenly appear. The question of whether

aliasing is present is analogous to the question of whether experimental error is present. Just as it

is always logically possible that data contain significant amounts of undetected error, so it is always logically possible that significant amounts of aliasing are being overlooked. Just as we often expect insignificant amounts of error to occur no matter what precautions are taken, so we often expect insignificant amounts of aliasing to occur in the calculated DFT. What is needed is the presence of good engineering and scientific judgment; there must always be someone willing to pick a value for $f_{\max}$, allowing us to specify a sampling interval $\Delta t < 1/(2 f_{\max})$ that prevents significant aliasing in the DFT.

FIGURE 2.16. The solid line represents a sinusoidal oscillation at a frequency that is 0.8 times the Nyquist frequency, and the dashed line represents a sinusoidal oscillation at 1.2 times the Nyquist frequency. When the curves are sampled at the rate represented by the black dots, there is no way to tell them apart in the sampled data.
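The indistinguishability illustrated in Fig. 2.16 is easy to reproduce numerically. The sketch below is our own, with an arbitrary Δt; it compares cosines at 0.8 and 1.2 times the Nyquist frequency at the sample points t = mΔt:

```python
import numpy as np

# Aliasing demo: sinusoids at 0.8 and 1.2 times the Nyquist frequency
# have identical samples, because 1.2*f_Nyq aliases to
# 2*f_Nyq - 1.2*f_Nyq = 0.8*f_Nyq.
dt = 0.5                      # sampling interval Δt (arbitrary choice)
f_nyq = 1.0 / (2 * dt)        # Nyquist frequency, Eq. (2.99a)
m = np.arange(-8, 9)          # sample indices, t = mΔt

low = np.cos(2 * np.pi * 0.8 * f_nyq * m * dt)
high = np.cos(2 * np.pi * 1.2 * f_nyq * m * dt)
print(np.max(np.abs(low - high)))   # ~0: the samples are indistinguishable
```

Between the samples the two curves differ, but that information is exactly what sampling throws away.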

The previous section presented the bad aspects of aliasing, treating it as a form of data corruption. There are, however, occasions when aliasing is more of a feature than a bug. Many times, a real function u(t) is known to have a Fourier transform

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt,$$

which is zero for all positive frequencies ƒ that do not lie between the two positive numbers $f_{\min}$ and $f_{\max}$; that is, U(ƒ) is zero when $0 < f < f_{\min}$ and $f > f_{\max}$. Because u(t) is real, U(ƒ) must be Hermitian (see entry 7 of Table 2.1), which means

$$U(-f) = U(f)^*.$$

This shows that U(ƒ) must also be strictly zero for negative frequencies ƒ where $-f_{\min} < f < 0$ and $f < -f_{\max}$. The U(ƒ) transform is schematically represented in Fig. 2.20, with the two blocks showing that U is zero unless ƒ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$.

The situation shown in Fig. 2.20 describes the signal produced by Michelson interferometers. At the beginning of this chapter, we mentioned that interferometers produce interferograms that must then be Fourier transformed to produce the desired spectral measurement. As explained later in Chapter 4 (see Sec. 4.10), interferometers use optical filters to block out undesired electromagnetic frequencies, which means there always exist values of $f_{\min}$ and $f_{\max}$ such that the transform U(ƒ) of the interferogram signal u(t) is zero unless ƒ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$. Suppose we sample the interferogram signal with a sampling interval Δt such that the Nyquist frequency $f_{\mathrm{Nyq}} = (2\Delta t)^{-1}$ is slightly larger than $f_{\max}$. Repeating the reasoning used to get Fig. 2.15 above, we see that

$$U^{[\infty]}(f, F) = \sum_{k=-\infty}^{\infty} U(f - kF)$$

FIGURE 2.17. [Plot of $U^{[\infty]}(f,F)$ with $F = 2 f_{\mathrm{Nyq}}$; the regions of U and its aliases appear well separated between $-f_{\mathrm{Nyq}}$ and $f_{\mathrm{Nyq}}$.]

FIGURE 2.18(a). [Apparent shape of U(ƒ).]

FIGURE 2.18(b). [True shape of U(ƒ).]

The $U^{[\infty]}(f,F)$ data in Fig. 2.17 contain hidden aliasing that can lead spectral analysts to assume that Fig. 2.18(a) rather than Fig. 2.18(b) depicts the true frequency spectrum.

Aliasing as a Tool · 2.23

FIGURE 2.19(a). This data is relatively smooth, suggesting that it does not contain high-frequency components.

FIGURE 2.19(b). This curve varies rapidly in three locations, suggesting the presence of high-frequency components in the data.

now has the form shown in Fig. 2.21. Again, the solid blocks show the original U(ƒ), the dashed blocks show the aliases created by turning U(ƒ) into $U^{[\infty]}(f,F)$, and the curved arrows show exactly how the aliased blocks are created from the original blocks. No solid blocks overlap with the dashed blocks, so aliasing is not a problem.

Now consider what happens when we force aliasing to occur by choosing F to be half its original size, that is, by doubling the sampling interval Δt, creating the $U^{[\infty]}$ plot shown in Fig. 2.22. As in Fig. 2.21, none of the solid blocks overlap with the dashed blocks. Because the dashed blocks come from turning U into $U^{[\infty]}$, the spectral shapes represented by the solid and dashed blocks are all identical. This means that the aliasing does not cause spectral information to be lost; either the solid blocks or the dashed blocks can be used to recover the true shape of U(ƒ). The electronic equipment used to sample u(t) only needs to sample half as often as before, which usually makes it less expensive to build, and as a bonus the rate at which data flows from the interferometer ends up being cut in half. This last point is often a significant consideration when the interferometer is on a satellite and all the data has to be communicated to the ground. The scheme shown in Fig. 2.22 is called undersampling. There is nothing special about undersampling by a factor of 2; if the distance between $f_{\min}$ and $f_{\max}$ is small enough, and $f_{\min}$ is far enough from $f = 0$, we can undersample by much higher factors. Figure 2.23 shows a scheme that undersamples by a factor of 5.
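The undersampling idea can be sketched numerically. The numbers below are hypothetical (a 90 Hz tone in a known 80–100 Hz band, sampled at only 40 Hz); knowing the band lets us undo the aliasing:

```python
import numpy as np

# Undersampling sketch: a band-limited tone at f0 = 90 Hz lies in a known
# band (f_min, f_max) = (80, 100) Hz. Sampling at fs = 40 Hz, far below
# 2*f_max = 200 Hz, aliases the tone, but the band is known, so the true
# frequency is still recoverable.
f0, fs, N = 90.0, 40.0, 400
t = np.arange(N) / fs
u = np.cos(2 * np.pi * f0 * t)

spec = np.abs(np.fft.rfft(u))
f_alias = np.fft.rfftfreq(N, d=1/fs)[np.argmax(spec)]
print(f_alias)            # 10.0 Hz: the alias of 90 Hz at a 40 Hz rate

# Undo the aliasing: the spectral copies are spaced fs apart, so the true
# frequency is k*fs ± f_alias for some integer k; pick the candidate that
# falls inside the known band.
candidates = [k * fs + s * f_alias for k in range(6) for s in (+1, -1)]
f_true = [f for f in candidates if 80.0 <= f <= 100.0][0]
print(f_true)             # 90.0 Hz, recovered despite undersampling
```

This is the digital analogue of the block-shuffling in Figs. 2.22 and 2.23: the dashed alias blocks carry the same spectral shape as the solid ones.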

We define a band-limited function u(t) to be a function for which there exists a positive frequency $f_{\max}$ such that the forward Fourier transform of u(t),

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt,$$

is strictly zero when $f < -f_{\max}$ or $f > f_{\max}$. The previous section indicated that the interferogram of a Michelson interferometer is a special case of a band-limited function; not only is its transform zero for $f > f_{\max}$, but there is also a positive frequency $f_{\min}$ such that its transform is zero for $|f| < f_{\min}$ (see Fig. 2.20). It can be shown that whenever a continuous function u(t) is also band-limited, then its samples $u(m\Delta t)$ (with $m = 0, \pm 1, \pm 2, \ldots$) can be used to reconstruct the complete function, including the values of u between the samples, as long as we choose

$$\Delta t \le \frac{1}{2 f_{\max}} \qquad (2.102)$$

to prevent aliasing.

We start by forming the mathematical construct

Sampling Theorem · 2.24

FIGURE 2.20. [U(ƒ) is zero unless ƒ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$, shown as two blocks on the f axis.]

FIGURE 2.21. [Plot of $U^{[\infty]}(f,F)$ with $F = 2 f_{\mathrm{Nyq}}$ slightly larger than $2 f_{\max}$: the solid blocks of U(ƒ) and the dashed alias blocks do not overlap.]

$$v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t). \qquad (2.103)$$

Clearly, the $u(m\Delta t)$ sample values of function u are the only data used to set up function v(t). Because $u(t)\,\delta(t - t_0) = u(t_0)\,\delta(t - t_0)$ for any continuous function u [see Eq. (2.68e) above], this can be written as

$$v(t) = \sum_{m=-\infty}^{\infty} u(t)\,\delta(t - m\Delta t)$$

or

$$v(t) = u(t) \cdot \left[ \sum_{m=-\infty}^{\infty} \delta(t - m\Delta t) \right].$$

Note that here t in the function u has returned to being a continuous, not a sampled, variable. Taking the Fourier transform of both sides gives, using the Fourier convolution theorem [see Eq. (2.72i)],

$$V(f) = U(f) * \left[ \frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left( f - \frac{k}{\Delta t} \right) \right], \qquad (2.104a)$$

where

$$V(f) = \int_{-\infty}^{\infty} v(t)\,e^{-2\pi i f t}\,dt, \qquad (2.104b)$$

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt, \qquad (2.104c)$$

and

$$\int_{-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} \delta(t - k\Delta t) \right] e^{-2\pi i f t}\,dt = \frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left( f - \frac{k}{\Delta t} \right) \qquad (2.104d)$$

from formula (2.78d). Note that here both ƒ and t are continuous, not sampled, variables. We can now use the linearity of the convolution [see the discussion after Eq. (2.38c)] and the definition of the convolution in Eq. (2.38a) to write (2.104a) as

$$\Delta t \cdot V(f) = \sum_{k=-\infty}^{\infty} U(f) * \delta\!\left( f - \frac{k}{\Delta t} \right) = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} U(f')\,\delta\!\left( f - f' - \frac{k}{\Delta t} \right) df' = \sum_{k=-\infty}^{\infty} U\!\left( f - \frac{k}{\Delta t} \right) = U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right), \qquad (2.105a)$$

FIGURE 2.22. [Plot of $U^{[\infty]}(f,F)$ after undersampling by a factor of 2: the solid blocks of U(ƒ) between $\pm f_{\min}$ and $\pm f_{\max}$ and the dashed alias blocks interleave without overlapping.]

FIGURE 2.23. [Undersampling by a higher factor: the alias blocks of $U^{[\infty]}(f,F)$ fill the frequency axis between the blocks of U(ƒ) without overlapping them.]

In both Figs. 2.22 and 2.23, frequency F is twice the Nyquist frequency $f_{\mathrm{Nyq}}$.

where $U^{[\infty]}$ is as defined in Eq. (2.93b) above. Inequality (2.102) ensures that the separate regions of U that combine to create $U^{[\infty]}$ do not overlap, giving us the graph of $U^{[\infty]}$ shown in Fig. 2.24. Hence, we can use the function $\Pi(f,F)$ defined in Eq. (2.56c) to select just the region of nonzero $U^{[\infty]}$ between $-(2\Delta t)^{-1}$ and $(2\Delta t)^{-1}$, recreating the original U(ƒ) transform. Multiplication of (2.105a) by $\Pi\!\left( f, (2\Delta t)^{-1} \right)$ then gives

$$U(f) = \Pi\!\left( f, \frac{1}{2\Delta t} \right) \cdot U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right) = \Delta t \cdot V(f) \cdot \Pi\!\left( f, \frac{1}{2\Delta t} \right). \qquad (2.105b)$$

Having recovered the original U(ƒ), an inverse Fourier transform of U(ƒ) gives back the original unsampled u(t). Using the Fourier convolution theorem again to take the inverse Fourier transform of both sides of (2.105b), we get [applying Eq. (2.39j) after interchanging the roles of ƒ and t]

$$u(t) = \Delta t \int_{-\infty}^{\infty} V(f)\,\Pi\!\left( f, \frac{1}{2\Delta t} \right) e^{2\pi i f t}\,df = \Delta t \left[ \int_{-\infty}^{\infty} V(f)\,e^{2\pi i f t}\,df \right] * \left[ \int_{-\infty}^{\infty} \Pi\!\left( f', \frac{1}{2\Delta t} \right) e^{2\pi i f' t}\,df' \right], \qquad (2.106a)$$

where the convolution between the two expressions inside square brackets [ ] is over the variable t. From (2.104b), function V(ƒ) is the forward Fourier transform of v(t), making v(t) equal to the inverse Fourier transform of V(ƒ) in (2.106a), with v(t) defined as

$$v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t)$$

in Eq. (2.103). From Eq. (2.71a) above, the inverse Fourier transform of $\Pi\!\left( f, (2\Delta t)^{-1} \right)$ is

$$\mathcal{F}^{(ift)}\!\left( \Pi\!\left( f, \frac{1}{2\Delta t} \right) \right) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi\!\left( f, \frac{1}{2\Delta t} \right) df = \frac{1}{\pi t}\,\sin\!\left( \frac{\pi t}{\Delta t} \right).$$

Substituting these two inverse transforms into (2.106a) gives

$$u(t) = \Delta t \left[ \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t) \right] * \left[ \frac{1}{\pi t}\,\sin\!\left( \frac{\pi t}{\Delta t} \right) \right]. \qquad (2.106b)$$

Carrying out the convolution over the delta functions, we get

$$u(t) = \Delta t \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \delta(t - m\Delta t) * \frac{1}{\pi t}\,\sin\!\left( \frac{\pi t}{\Delta t} \right) \right] \right\}$$

or

$$u(t) = \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \frac{1}{\pi (t - m\Delta t)/\Delta t}\,\sin\!\left( \frac{\pi (t - m\Delta t)}{\Delta t} \right) \right] \right\}. \qquad (2.106c)$$

FIGURE 2.24. [Graph of $U^{[\infty]}(f, 1/\Delta t)$: nonoverlapping copies of U(ƒ) centered on multiples of $1/\Delta t$, with the copy between $-(2\Delta t)^{-1}$ and $(2\Delta t)^{-1}$ being the original U(ƒ).]

This formula gives us u(t) everywhere in terms of the samples $u(m\Delta t)$ and the function

$$\frac{1}{\pi t / \Delta t}\,\sin\!\left( \frac{\pi t}{\Delta t} \right).$$

Defining

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}, \qquad (2.106d)$$

we can write Eq. (2.106c) as

$$u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\mathrm{sinc}\!\left( \frac{\pi (t - m\Delta t)}{\Delta t} \right). \qquad (2.106e)$$
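Equation (2.106e) can be tried directly in code. The sketch below is our own (a finite sum standing in for the infinite one, and an arbitrary test signal). Note that NumPy's `np.sinc` uses the other normalization, sin(πx)/(πx), so `np.sinc((t - m*dt)/dt)` equals the $\mathrm{sinc}(\pi(t - m\Delta t)/\Delta t)$ appearing in (2.106e):

```python
import numpy as np

# Reconstruct a band-limited signal from its samples via Eq. (2.106e).
# Test signal: u(t) = cos(2π f0 t) with f0 = 3 Hz, sampled at Δt = 0.1 s,
# so f_Nyq = 5 Hz > f0 and the sampling theorem applies.
f0, dt = 3.0, 0.1
m = np.arange(-200, 201)            # finite stand-in for the infinite sum
samples = np.cos(2 * np.pi * f0 * m * dt)

def reconstruct(t):
    # np.sinc(x) = sin(πx)/(πx), so each term equals sinc(π(t - mΔt)/Δt)
    return np.sum(samples * np.sinc((t - m * dt) / dt))

t = 0.033                           # a point between the sample positions
err = abs(reconstruct(t) - np.cos(2 * np.pi * f0 * t))
print(err)                          # small; shrinks as more terms are kept
```

The residual error here comes only from truncating the infinite sum; the slow 1/x decay of the sinc kernel is why practical systems use smoother interpolation filters.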

Many authors use a different definition of the sinc function, which we call here $\mathrm{sinc}_{\mathrm{alt}}$, with

$$\mathrm{sinc}_{\mathrm{alt}}(x) = \frac{\sin(\pi x)}{\pi x}.$$

In terms of $\mathrm{sinc}_{\mathrm{alt}}$, Eq. (2.106e) becomes

$$u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\mathrm{sinc}_{\mathrm{alt}}\!\left( \frac{t - m\Delta t}{\Delta t} \right).$$

For the rest of this book, the symbol sinc will refer to $\sin(x)/x$ instead of $\sin(\pi x)/(\pi x)$. We also note that the Fourier transform pair in (2.71a) can be written in terms of $\mathrm{sinc}(x)$ as

$$\int_{-\infty}^{\infty} e^{-2\pi i f t} \left[ 2F\,\mathrm{sinc}(2\pi F t) \right] dt = \Pi(f, F)$$

and

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(2\pi F t).$$

Changing the sign of the exponent leaves the right-hand sides unchanged,

$$\int_{-\infty}^{\infty} e^{2\pi i f t} \left[ 2F\,\mathrm{sinc}(2\pi F t) \right] dt = \Pi(-f, F) = \Pi(f, F)$$

and

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(-2\pi F t) = 2F\,\mathrm{sinc}(2\pi F t),$$

where we have used that $\Pi(f, F)$ and $\mathrm{sinc}(2\pi F t)$ are even functions of their arguments:

$$\mathrm{sinc}(-2\pi F t) = \mathrm{sinc}(2\pi F t) \qquad (2.107a)$$

and

$$\Pi(-f, F) = \Pi(f, F). \qquad (2.107b)$$

This means we can write this Fourier relationship using the more general formulas

$$\mathcal{F}^{(\mp i f t)}\!\left( 2F\,\mathrm{sinc}(2\pi F t) \right) = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \left[ 2F\,\mathrm{sinc}(2\pi F t) \right] dt = \Pi(f, F) \qquad (2.108a)$$

and

$$\mathcal{F}^{(\mp i t f)}\!\left( \Pi(f, F) \right) = \int_{-\infty}^{\infty} e^{\mp 2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(2\pi F t). \qquad (2.108b)$$

The integral Fourier transform extends easily and naturally to two- and three-dimensional functions. We can, for example, define the integral Fourier transform of any two-dimensional function u(x,y) to be

$$U(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{-2\pi i (x\zeta + y\eta)}\,u(x, y), \qquad (2.109a)$$

with the inverse transform

$$u(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\; e^{2\pi i (x\zeta + y\eta)}\,U(\zeta, \eta). \qquad (2.109b)$$

Similarly, in three dimensions,

$$U(\zeta, \eta, \xi) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{-2\pi i (x\zeta + y\eta + z\xi)}\,u(x, y, z) \qquad (2.109c)$$

and

$$u(x, y, z) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta \int_{-\infty}^{\infty} d\xi\; e^{2\pi i (x\zeta + y\eta + z\xi)}\,U(\zeta, \eta, \xi). \qquad (2.109d)$$

This pattern of forward and inverse transforms can be extended indefinitely to functions u and U with ever larger numbers of arguments, but for the purposes of this book there is no need to go beyond the two- and three-dimensional transforms given in Eqs. (2.109a)–(2.109d). As a matter of notation, we often use the standard Cartesian $\hat{x}$ and $\hat{y}$ unit vectors pointing along the x and y axes of a Cartesian coordinate system to define vectors

$$\vec{\rho} = x\hat{x} + y\hat{y} \quad \text{and} \quad \vec{q} = \zeta\hat{x} + \eta\hat{y}.$$

We introduce the symbol $u(\vec{\rho})$ as a shorthand for u(x,y) and the symbol $U(\vec{q})$ as a shorthand for $U(\zeta, \eta)$. Now Eqs. (2.109a) and (2.109b) can be written as

$$U(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{-2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho}) \qquad (2.110a)$$

and

$$u(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2q\; e^{2\pi i \vec{\rho}\cdot\vec{q}}\,U(\vec{q}). \qquad (2.110b)$$

Similarly, defining the three-dimensional vectors

$$\vec{r} = x\hat{x} + y\hat{y} + z\hat{z} \quad \text{and} \quad \vec{s} = \zeta\hat{x} + \eta\hat{y} + \xi\hat{z},$$

we can write Eqs. (2.109c) and (2.109d) as

$$U(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u(\vec{r}) \qquad (2.110c)$$

and

$$u(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U(\vec{s}). \qquad (2.110d)$$
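The two-dimensional transform (2.110a) can likewise be approximated with a 2-D FFT. The sketch below is our own (Gaussian test function, arbitrary grid sizes), using the same area-weighting trick as in one dimension:

```python
import numpy as np

# Approximate U(ζ,η) = ∫∫ e^{-2πi(xζ+yη)} u(x,y) dx dy of Eq. (2.110a)
# with a 2-D FFT, testing on u = exp(-π(x²+y²)), which is its own
# transform under this sign convention.
N, d = 256, 0.1                       # grid size and spacing in x and y
x = (np.arange(N) - N // 2) * d
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.exp(-np.pi * (X**2 + Y**2))

# Weight by the area element d² so the double DFT sum approximates the integral.
U = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(u))) * d**2
zeta = np.fft.fftshift(np.fft.fftfreq(N, d=d))
Z, H = np.meshgrid(zeta, zeta, indexing="ij")

err = np.max(np.abs(U - np.exp(-np.pi * (Z**2 + H**2))))
print(err)    # small: the sampled transform matches the analytic one
```

As in one dimension, the grid must be wide enough that the periodized copies of u and U do not overlap.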

Vector notation is sometimes used to group families of associated forward and inverse Fourier transforms into a single equation. We might, for example, write the six scalar equations

$$U_x(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_x(\vec{r}), \qquad u_x(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_x(\vec{s}),$$

$$U_y(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_y(\vec{r}), \qquad u_y(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_y(\vec{s}),$$

and

$$U_z(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_z(\vec{r}), \qquad u_z(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_z(\vec{s})$$

as the pair of vector equations

$$\vec{U}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,\vec{u}(\vec{r}) \qquad (2.110e)$$

Fourier Transforms in Two and Three Dimensions · 2.25

and

$$\vec{u}(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,\vec{U}(\vec{s}), \qquad (2.110f)$$

where

$$\vec{u}(\vec{r}) = \hat{x}\,u_x(\vec{r}) + \hat{y}\,u_y(\vec{r}) + \hat{z}\,u_z(\vec{r}) \quad \text{and} \quad \vec{U}(\vec{s}) = \hat{x}\,U_x(\vec{s}) + \hat{y}\,U_y(\vec{s}) + \hat{z}\,U_z(\vec{s}).$$

We call $\vec{U}(\vec{s})$ the vector Fourier transform of $\vec{u}(\vec{r})$ and $\vec{u}(\vec{r})$ the vector inverse Fourier transform of $\vec{U}(\vec{s})$. Just as in the one-dimensional case, it makes no difference which Fourier transform is labeled the forward transform and which is labeled the inverse transform as long as there is a change in sign of the exponent of e. Following the pattern of Eq. (2.28ℓ), we can also write

$$\iint_{-\infty}^{\infty} d^2q\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\pm 2\pi i \vec{\rho}\,'\cdot\vec{q}}\,u(\vec{\rho}\,') = u(\vec{\rho}) \qquad (2.110g)$$

and

$$\iiint_{-\infty}^{\infty} d^3s\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\pm 2\pi i \vec{r}\,'\cdot\vec{s}}\,v(\vec{r}\,') = v(\vec{r}) \qquad (2.110h)$$

for two-dimensional and three-dimensional scalar functions $u(\vec{\rho})$ and $v(\vec{r})$. For three-dimensional vector functions, this becomes

$$\iiint_{-\infty}^{\infty} d^3s\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\pm 2\pi i \vec{r}\,'\cdot\vec{s}}\,\vec{v}(\vec{r}\,') = \vec{v}(\vec{r}). \qquad (2.110i)$$

Many of the one-dimensional Fourier theorems have two- and three-dimensional counterparts. For example, the Fourier shift theorem [see Eq. (2.36h) above] in two dimensions becomes, for a two-dimensional vector constant $\vec{a} = \hat{x}\,a_x + \hat{y}\,a_y$,

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho} - \vec{a}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)}\,u(x - a_x, y - a_y) = e^{\mp 2\pi i (\zeta a_x + \eta a_y)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; e^{\mp 2\pi i (x'\zeta + y'\eta)}\,u(x', y'),$$

where in the last step we define $x' = x - a_x$ and $y' = y - a_y$. We now see that (dropping the primes inside the double integral)

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho} - \vec{a}) = e^{\mp 2\pi i \vec{a}\cdot\vec{q}} \iint_{-\infty}^{\infty} d^2\rho\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho}). \qquad (2.110j)$$

This shows the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho} - \vec{a})$ to be $e^{\mp 2\pi i \vec{a}\cdot\vec{q}}$ multiplied by the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho})$. Similarly in three dimensions we have, for a three-dimensional constant vector $\vec{b} = \hat{x}\,b_x + \hat{y}\,b_y + \hat{z}\,b_z$, that

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r} - \vec{b}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\mp 2\pi i (x\zeta + y\eta + z\xi)}\,v(x - b_x, y - b_y, z - b_z) = e^{\mp 2\pi i (b_x\zeta + b_y\eta + b_z\xi)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dz'\; e^{\mp 2\pi i (x'\zeta + y'\eta + z'\xi)}\,v(x', y', z'),$$

where $x' = x - b_x$, $y' = y - b_y$, and $z' = z - b_z$. This time we find that the forward or inverse three-dimensional Fourier transform of $v(\vec{r} - \vec{b})$ is $e^{\mp 2\pi i \vec{s}\cdot\vec{b}}$ multiplied by the forward or inverse three-dimensional Fourier transform of $v(\vec{r})$,

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r} - \vec{b}) = e^{\mp 2\pi i \vec{s}\cdot\vec{b}} \iiint_{-\infty}^{\infty} d^3r\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r}). \qquad (2.110k)$$
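A small numerical check of the two-dimensional shift theorem, Eq. (2.110j): on a periodic sampled grid, a shift by whole grid steps (`np.roll`) multiplies the 2-D DFT by the corresponding phase factor. This discrete sketch is our own; the grid size and shift amounts are arbitrary:

```python
import numpy as np

# Discrete analogue of the 2-D shift theorem: fft2 of a circularly
# shifted array equals the fft2 of the original times exp(-2πi a·q).
N, d = 64, 0.25
rng = np.random.default_rng(0)
u = rng.standard_normal((N, N))          # arbitrary real test function

ax_steps, ay_steps = 3, 5                # shift a = (3d, 5d), whole samples
u_shift = np.roll(np.roll(u, ax_steps, axis=0), ay_steps, axis=1)

zeta = np.fft.fftfreq(N, d=d)
Z, H = np.meshgrid(zeta, zeta, indexing="ij")
phase = np.exp(-2j * np.pi * (Z * ax_steps * d + H * ay_steps * d))

lhs = np.fft.fft2(u_shift)
rhs = phase * np.fft.fft2(u)
print(np.max(np.abs(lhs - rhs)))         # ~0 up to roundoff
```

Restricting the shift to whole grid steps keeps the discrete check exact; fractional shifts would require interpolation.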

There are also two- and three-dimensional versions of the Fourier scaling theorem discussed in Sec. 2.8 above [see Eq. (2.37a)]. In two dimensions, when we have

$$V^{(\mp)}(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}}\,v(\vec{\rho}) \qquad (2.110\ell)$$

and $v(\vec{\rho})$ is replaced by $v(\alpha\vec{\rho})$, where α is a real scalar, then we can substitute $\vec{\rho}\,' = \alpha\vec{\rho}$ to get

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\mp 2\pi i \vec{\rho}\cdot\vec{q}}\,v(\alpha\vec{\rho}) = \frac{1}{\alpha^2} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\mp 2\pi i \left( \vec{\rho}\,'/\alpha \right)\cdot\vec{q}}\,v(\vec{\rho}\,') = \frac{1}{\alpha^2}\,V^{(\mp)}\!\left( \frac{\vec{q}}{\alpha} \right). \qquad (2.110m)$$

Suppose there is a function of $\vec{\rho}$ called $u(\vec{\rho})$ such that $\vec{\rho}$ has to change by a vector distance $\Delta\vec{\rho}$ whose magnitude $|\Delta\vec{\rho}|$ must be at least β for there to be a significant change in the value of $u(\vec{\rho})$. Using the same reasoning as was applied to the one-dimensional Fourier scaling theorem [see the analysis following Eq. (2.37e)], we can show that $U^{(\mp)}(\vec{q})$, the two-dimensional forward or inverse Fourier transform of u, must be negligible or zero for all vectors $\vec{q}$ whose magnitude $|\vec{q}|$ exceeds $1/\beta$. The Fourier scaling theorem in three dimensions starts with

$$V^{(\mp)}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r}), \qquad (2.110n)$$

from which we discover, replacing $\vec{r}$ by $\vec{r}\,' = \alpha\vec{r}$, that

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\mp 2\pi i \vec{r}\cdot\vec{s}}\,v(\alpha\vec{r}) = \frac{1}{\alpha^3} \iiint_{-\infty}^{\infty} d^3r'\; e^{\mp 2\pi i \left( \vec{r}\,'/\alpha \right)\cdot\vec{s}}\,v(\vec{r}\,') = \frac{1}{\alpha^3}\,V^{(\mp)}\!\left( \frac{\vec{s}}{\alpha} \right). \qquad (2.110o)$$

Again we can conclude that if there is a function $u(\vec{r})$ such that $|\Delta\vec{r}|$ must be at least β for there to be a significant change in u, then $U^{(\mp)}(\vec{s})$, the three-dimensional forward or inverse Fourier transform of u, must be negligible or zero for all vector arguments $\vec{s}$ whose magnitude $|\vec{s}|$ exceeds $1/\beta$.

The two-dimensional convolution of scalar functions u(x,y) and v(x,y) is written using the symbol $*$ and defined to be

$$u(x,y) * v(x,y) = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; u(x', y')\,v(x - x', y - y'), \qquad (2.111a)$$

or

$$u(\vec{\rho}) * v(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') \qquad (2.111b)$$

using the more concise vector notation. The vector notation may make the connection between the one- and two-dimensional convolutions in Eqs. (2.38a) and (2.111b) easier to see. The two-dimensional convolution, like the one-dimensional convolution, is both commutative and associative. Using the same type of reasoning as in the analysis in Sec. 2.9, we have for the two-dimensional functions $u(\vec{\rho})$, $v(\vec{\rho})$, and $h(\vec{\rho})$ that

$$u(\vec{\rho}) * v(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') = \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec{\rho} - \vec{\rho}\,'')\,v(\vec{\rho}\,'') = v(\vec{\rho}) * u(\vec{\rho}) \qquad (2.111c)$$

and

$$\big[ u(\vec{\rho}) * v(\vec{\rho}) \big] * h(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho''\; h(\vec{\rho} - \vec{\rho}\,'') \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho}\,'' - \vec{\rho}\,') = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \iint_{-\infty}^{\infty} d^2\rho''\; h(\vec{\rho} - \vec{\rho}\,'')\,v(\vec{\rho}\,'' - \vec{\rho}\,') = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \iint_{-\infty}^{\infty} d^2\rho'''\; v(\vec{\rho}\,''')\,h\big( (\vec{\rho} - \vec{\rho}\,') - \vec{\rho}\,''' \big) = u(\vec{\rho}) * \big[ v(\vec{\rho}) * h(\vec{\rho}) \big], \qquad (2.111d)$$

where to show that the two-dimensional convolution is commutative we make the variable substitution $\vec{\rho}\,'' = \vec{\rho} - \vec{\rho}\,'$ in (2.111c); and to show it is associative, we make the variable substitution $\vec{\rho}\,''' = \vec{\rho}\,'' - \vec{\rho}\,'$ in (2.111d). The two-dimensional convolution is also linear. For any two complex constants α and β, we have

$$u(\vec{\rho}) * \big[ \alpha\,v(\vec{\rho}) + \beta\,h(\vec{\rho}) \big] = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \big[ \alpha\,v(\vec{\rho} - \vec{\rho}\,') + \beta\,h(\vec{\rho} - \vec{\rho}\,') \big] = \alpha \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') + \beta \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,h(\vec{\rho} - \vec{\rho}\,') = \alpha\,u(\vec{\rho}) * v(\vec{\rho}) + \beta\,u(\vec{\rho}) * h(\vec{\rho}), \qquad (2.111e)$$

and by commutativity

$$\big[ \alpha\,v(\vec{\rho}) + \beta\,h(\vec{\rho}) \big] * u(\vec{\rho}) = \alpha\,v(\vec{\rho}) * u(\vec{\rho}) + \beta\,h(\vec{\rho}) * u(\vec{\rho}). \qquad (2.111f)$$

It is easy to show that the Fourier convolution theorem holds true in two dimensions. We start with

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)} \big[ u(x,y) * v(x,y) \big] = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; u(x', y')\,v(x - x', y - y').$$

Now we replace the x, y integration variables by $x'' = x - x'$ and $y'' = y - y'$, with $dx'' = dx$ and $dy'' = dy$, so that

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)} \big[ u(x,y) * v(x,y) \big] = \left[ \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; e^{\mp 2\pi i (x'\zeta + y'\eta)}\,u(x', y') \right] \left[ \int_{-\infty}^{\infty} dx'' \int_{-\infty}^{\infty} dy''\; e^{\mp 2\pi i (x''\zeta + y''\eta)}\,v(x'', y'') \right]$$

or

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)} \big[ u(x,y) * v(x,y) \big] = U^{(\mp)}(\zeta, \eta) \cdot V^{(\mp)}(\zeta, \eta), \qquad (2.112a)$$

where $U^{(\mp)}$ is the two-dimensional forward or inverse Fourier transform of u,

$$U^{(\mp)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)}\,u(x, y), \qquad (2.112b)$$

and $V^{(\mp)}$ is the two-dimensional forward or inverse Fourier transform of v,

$$V^{(\mp)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\mp 2\pi i (x\zeta + y\eta)}\,v(x, y). \qquad (2.112c)$$

This gives the first half of the two-dimensional Fourier convolution theorem. To get the second half, we reverse the transform in (2.112a). If the plus sign is used in (2.112a), take the forward two-dimensional Fourier transform of both sides, and if the minus sign is used take the inverse two-dimensional Fourier transform of both sides. This leads to

$$\int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i (x\zeta + y\eta)}\,U^{(\mp)}(\zeta, \eta) \cdot V^{(\mp)}(\zeta, \eta) = u(x, y) * v(x, y), \qquad (2.113a)$$

where

$$u(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i (x\zeta + y\eta)}\,U^{(\mp)}(\zeta, \eta) \qquad (2.113b)$$

and

$$v(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i (x\zeta + y\eta)}\,V^{(\mp)}(\zeta, \eta). \qquad (2.113c)$$
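The discrete analogue of the two-dimensional Fourier convolution theorem is easy to verify numerically: the 2-D FFT of a circular convolution equals the product of the 2-D FFTs. The sketch below is our own, with random test arrays on a small grid:

```python
import numpy as np

# Check the discrete 2-D convolution theorem: fft2(u ⊛ v) = fft2(u)·fft2(v),
# where ⊛ is circular (periodic) 2-D convolution.
N = 32
rng = np.random.default_rng(1)
u = rng.standard_normal((N, N))
v = rng.standard_normal((N, N))

# Circular 2-D convolution computed directly from the definition:
# conv[x,y] = Σ_{x',y'} u[x',y'] v[(x-x') mod N, (y-y') mod N].
conv = np.zeros((N, N))
for xp in range(N):
    for yp in range(N):
        conv += u[xp, yp] * np.roll(np.roll(v, xp, axis=0), yp, axis=1)

lhs = np.fft.fft2(conv)
rhs = np.fft.fft2(u) * np.fft.fft2(v)
print(np.max(np.abs(lhs - rhs)))   # ~0 up to roundoff
```

In practice one computes the convolution the other way around, multiplying the FFTs and inverse-transforming, which is far faster than the direct double sum.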

The first half of the two-dimensional Fourier convolution theorem, Eqs. (2.112a)–(2.112c),
shows that the forward or inverse two-dimensional Fourier transform of the two-dimensional
convolution of two functions u and v is the product of the forward or inverse two-dimensional
Fourier transforms of u and v. Because no restrictions are placed on the nature of u and v, other
than that they are transformable, there are also no restrictions on the nature of their U^{(∓)} and V^{(∓)}
transforms. This means we can think of U^{(∓)} and V^{(∓)} as arbitrary transformable functions. The
(∓) superscripts on U and V in Eqs. (2.113a)–(2.113c) then just tell us that, according to Eqs.
(2.112b) and (2.112c),

U^{(∓)}(ξ,η) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy e^{∓2πi(xξ+yη)} u(x,y)

and

V^{(∓)}(ξ,η) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy e^{∓2πi(xξ+yη)} v(x,y).

We already know this, however, from looking at Eqs. (2.113b) and (2.113c)—just take the
opposite-sign Fourier transform of both sides. Hence, we can drop the (∓) superscripts on U and
V in Eqs. (2.113a)–(2.113c) as long as (±) superscripts are added to u and v to distinguish
between the two choices of sign in (2.113b) and (2.113c). Now Eqs. (2.113a)–(2.113c) become

∫_{−∞}^{∞} dξ ∫_{−∞}^{∞} dη e^{±2πi(xξ+yη)} [U(ξ,η) · V(ξ,η)] = u^{(±)}(x,y) ∗ v^{(±)}(x,y),      (2.114a)

where

u^{(±)}(x,y) = ∫_{−∞}^{∞} dξ ∫_{−∞}^{∞} dη e^{±2πi(xξ+yη)} U(ξ,η)      (2.114b)

and

v^{(±)}(x,y) = ∫_{−∞}^{∞} dξ ∫_{−∞}^{∞} dη e^{±2πi(xξ+yη)} V(ξ,η).      (2.114c)

The letters used to label the functions and variables are, of course, arbitrary, so nothing stops us
from interchanging the letters u and U, v and V, x and ξ, y and η, and the vertical order of the ±
signs to get

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy e^{∓2πi(xξ+yη)} [u(x,y) · v(x,y)] = U^{(∓)}(ξ,η) ∗ V^{(∓)}(ξ,η),      (2.115a)

where

U^{(∓)}(ξ,η) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy e^{∓2πi(xξ+yη)} u(x,y)      (2.115b)

and

V^{(∓)}(ξ,η) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy e^{∓2πi(xξ+yη)} v(x,y).      (2.115c)

Equations (2.115a)–(2.115c) are the other half of the two-dimensional Fourier convolution

theorem—they show that the forward or inverse two-dimensional Fourier transform of the

product of two functions u and v is the two-dimensional convolution of the forward or inverse

two-dimensional Fourier transforms of u and v.
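Both halves of the theorem survive discretization. As a quick numerical aside (not part of the text's derivation), the discrete analogue of Eqs. (2.112a)–(2.112c) can be checked with NumPy: for a circular convolution on a periodic grid, the two-dimensional DFT of u ∗ v equals the pointwise product of the DFTs. The grid size, seed, and random test arrays below are arbitrary choices.

```python
import numpy as np

# Discrete (DFT) analogue of the two-dimensional convolution theorem.
rng = np.random.default_rng(0)
u = rng.standard_normal((8, 8))
v = rng.standard_normal((8, 8))

# Circular convolution computed directly from its defining double sum.
conv = np.zeros((8, 8))
for m in range(8):
    for n in range(8):
        for mp in range(8):
            for np_ in range(8):
                conv[m, n] += u[mp, np_] * v[(m - mp) % 8, (n - np_) % 8]

lhs = np.fft.fft2(conv)                 # transform of the convolution
rhs = np.fft.fft2(u) * np.fft.fft2(v)   # product of the transforms
assert np.allclose(lhs, rhs)
```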

The three-dimensional convolution is written using the same ∗ symbol and defined to be

u(x,y,z) ∗ v(x,y,z) = ∫_{−∞}^{∞} dx′ ∫_{−∞}^{∞} dy′ ∫_{−∞}^{∞} dz′ u(x′,y′,z′) v(x − x′, y − y′, z − z′)      (2.116a)

or

u(r⃗) ∗ v(r⃗) = ∫∫∫_{−∞}^{∞} d³r′ u(r⃗′) v(r⃗ − r⃗′).      (2.116b)

Using three-dimensional vector notation, the three-dimensional convolution has the same
commutative, associative, and linearity properties as the two-dimensional convolution, as can be
seen by returning to Eqs. (2.111c)–(2.111f), mentally adding an extra integration variable and an
extra integral sign, and replacing all the superscript 2’s by superscript 3’s:

u(ρ⃗) ∗ v(ρ⃗) = v(ρ⃗) ∗ u(ρ⃗),      (2.117a)

u(ρ⃗) ∗ [v(ρ⃗) ∗ h(ρ⃗)] = [u(ρ⃗) ∗ v(ρ⃗)] ∗ h(ρ⃗),      (2.117b)

u(ρ⃗) ∗ [v(ρ⃗) + h(ρ⃗)] = u(ρ⃗) ∗ v(ρ⃗) + u(ρ⃗) ∗ h(ρ⃗),      (2.117c)

and

[v(ρ⃗) + h(ρ⃗)] ∗ u(ρ⃗) = v(ρ⃗) ∗ u(ρ⃗) + h(ρ⃗) ∗ u(ρ⃗).      (2.117d)
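A brief numerical aside: because a fast Fourier transform turns circular convolution into pointwise multiplication, the commutative, associative, and linearity properties above reduce, in the discrete case, to the ordinary algebra of products. The sketch below (array shape and seed are arbitrary choices) checks the discrete analogues of (2.117a)–(2.117c) on three-dimensional arrays.

```python
import numpy as np

# Discrete check of the convolution algebra using FFT-based circular
# convolution on 3-D arrays (an arbitrary 4x4x4 test case).
rng = np.random.default_rng(1)
shape = (4, 4, 4)
u, v, h = (rng.standard_normal(shape) for _ in range(3))

def conv3(a, b):
    # circular 3-D convolution computed via the convolution theorem
    return np.real(np.fft.ifftn(np.fft.fftn(a) * np.fft.fftn(b)))

assert np.allclose(conv3(u, v), conv3(v, u))                      # (2.117a)
assert np.allclose(conv3(u, conv3(v, h)), conv3(conv3(u, v), h))  # (2.117b)
assert np.allclose(conv3(u, v + h), conv3(u, v) + conv3(u, h))    # (2.117c)
```

Implementing the convolution through the transform is itself an application of the theorem; the identities then follow from the commutativity, associativity, and distributivity of elementwise multiplication.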


Looking carefully at the variable manipulations used to derive Eqs. (2.112a)–(2.112c), the first
half of the two-dimensional Fourier convolution theorem, we see that working with an extra
product zζ in the exponent of e and an extra integration over dz does not affect the end result.

We can therefore say that

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} [u(x,y,z) ∗ v(x,y,z)] = U^{(∓)}(ξ,η,ζ) · V^{(∓)}(ξ,η,ζ),      (2.118a)

where

U^{(∓)}(ξ,η,ζ) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} u(x,y,z)      (2.118b)

and

V^{(∓)}(ξ,η,ζ) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} v(x,y,z).      (2.118c)

The argument about relabeling the functions and variables used to go from (2.112a)–(2.112c) to

(2.115a)–(2.115c) works equally well here, giving us at once the other half of the three-

dimensional Fourier convolution theorem,

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} [u(x,y,z) · v(x,y,z)] = U^{(∓)}(ξ,η,ζ) ∗ V^{(∓)}(ξ,η,ζ),      (2.119a)

where

U^{(∓)}(ξ,η,ζ) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} u(x,y,z)      (2.119b)

and

V^{(∓)}(ξ,η,ζ) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz e^{∓2πi(xξ+yη+zζ)} v(x,y,z).      (2.119c)

One last matter of notation worth mentioning is that we can create two-dimensional and three-

dimensional delta functions from the products of the already-discussed one-dimensional delta

function:


δ(ρ⃗) = δ(x) · δ(y)      (2.120a)

and

δ(r⃗) = δ(x) · δ(y) · δ(z).      (2.120b)

These delta functions behave the way we expect delta functions to behave:

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy u(x,y) δ(x − x_o) δ(y − y_o) = ∫_{−∞}^{∞} dx δ(x − x_o) u(x, y_o) = u(x_o, y_o)      (2.121a)

and

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dy ∫_{−∞}^{∞} dz v(x,y,z) δ(x − x_o) δ(y − y_o) δ(z − z_o)

    = ∫_{−∞}^{∞} dx δ(x − x_o) ∫_{−∞}^{∞} dy v(x, y, z_o) δ(y − y_o)

    = ∫_{−∞}^{∞} dx δ(x − x_o) v(x, y_o, z_o) = v(x_o, y_o, z_o).      (2.121b)

In vector notation, these become

∫∫_{−∞}^{∞} d²ρ u(ρ⃗) δ(ρ⃗ − ρ⃗_o) = u(ρ⃗_o)      (2.121c)

and

∫∫∫_{−∞}^{∞} d³r v(r⃗) δ(r⃗ − r⃗_o) = v(r⃗_o).      (2.121d)
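Strictly, the delta function only makes sense inside an integral, but the sifting property (2.121a) can be illustrated numerically by letting a narrow normalized Gaussian stand in for each one-dimensional delta. Everything in this sketch is an arbitrary choice, not from the text: the grid, the width eps, the point (x_o, y_o), and the test function u.

```python
import numpy as np

# Sifting property (2.121a): a narrow normalized Gaussian approximates
# delta(x - x_o)*delta(y - y_o); the double integral picks out u(x_o, y_o).
eps = 0.01
x = np.linspace(-1.0, 1.0, 2001)
dx = x[1] - x[0]
xo, yo = 0.25, -0.4

def ndelta(t):
    # unit-area Gaussian; approaches delta(t) as eps -> 0
    return np.exp(-t**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

def u(xv, yv):
    return np.cos(3 * xv) * np.exp(yv)

# integrate over x first (axis 1), then over y, as in Eq. (2.121a)
inner = (ndelta(x - xo)[None, :] * u(x[None, :], x[:, None])).sum(axis=1) * dx
total = (ndelta(x - yo) * inner).sum() * dx
assert abs(total - u(xo, yo)) < 1e-3
```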

Combining Eq. (2.71f) for the one-dimensional delta function with Eqs. (2.120a) and (2.120b),

we see that in two dimensions

δ(ρ⃗) = δ(x) · δ(y) = ∫_{−∞}^{∞} dξ e^{∓2πixξ} ∫_{−∞}^{∞} dη e^{∓2πiyη} = ∫∫_{−∞}^{∞} d²q e^{∓2πi(ρ⃗·q⃗)}      (2.122a)

using the vector notation q⃗ = ξ x̂ + η ŷ; and in three dimensions

δ(r⃗) = δ(x) · δ(y) · δ(z) = ∫_{−∞}^{∞} dξ e^{∓2πixξ} ∫_{−∞}^{∞} dη e^{∓2πiyη} ∫_{−∞}^{∞} dζ e^{∓2πizζ}

    = ∫∫∫_{−∞}^{∞} d³s e^{∓2πi(r⃗·s⃗)}      (2.122b)

using the vector notation s⃗ = ξ x̂ + η ŷ + ζ ẑ.

__________

This chapter provides both an intuitive understanding and a rigorous explanation of how

Fourier transforms work. Sine and cosine transforms are introduced as a way to measure how

much functions resemble sine and cosine curves, and these transforms are then combined to

create the standard complex Fourier transform. We describe convolutions and how they produce

new functions by blurring old ones. The Fourier convolution theorem—whose importance is

difficult to overstate—directly connects the convolution to Fourier-transform theory. Generalized

limits are explained to show in what sense some of the more puzzling functions found in lists of

Fourier transforms belong there, and a brief outline of generalized functions is presented to show

how delta functions can be described without making them sound like obvious nonsense.

Computers use discrete Fourier transforms to handle Fourier calculations, and we explain how

the discrete Fourier transform can be used to approximate the integral Fourier transform. The

discrete Fourier transform produces aliasing; we show when aliasing is desirable, when it is not

desirable, and when it can be neglected. All the major concepts explained in this chapter—the

linearity of the Fourier transform, the linearity of the convolution, the Fourier convolution

theorem, the idea of even and odd functions, and the delta function—have important roles to play

in the pages that follow.


Table 2.1

      U(f) = F^(−ift)(u(t))                      u(t) = F^(ift)(U(f))

(1)   Im(U(f)) = 0, U(f) = U(−f)                 Im(u(t)) = 0, u(t) = u(−t)
      [real, even]                               [real, even]

(2)   Re(U(f)) = 0, U(f) = U(−f)                 Re(u(t)) = 0, u(t) = u(−t)
      [imag., even]                              [imag., even]

(3)   Im(U(f)) = 0, U(f) = −U(−f)                Re(u(t)) = 0, u(t) = −u(−t)
      [real, odd]                                [imag., odd]

(4)   Re(U(f)) = 0, U(f) = −U(−f)                Im(u(t)) = 0, u(t) = −u(−t)
      [imag., odd]                               [real, odd]

(5)   Re(U(f)) ≠ 0 for some f,                   Re(u(t)) ≠ 0 for some t,
      Im(U(f)) ≠ 0 for some f,                   Im(u(t)) ≠ 0 for some t,
      U(f) = U(−f)                               u(t) = u(−t)
      [complex, even]                            [complex, even]

(6)   Re(U(f)) ≠ 0 for some f,                   Re(u(t)) ≠ 0 for some t,
      Im(U(f)) ≠ 0 for some f,                   Im(u(t)) ≠ 0 for some t,
      U(f) = −U(−f)                              u(t) = −u(−t)
      [complex, odd]                             [complex, odd]

(7)   U(f) = U(−f)*                              Im(u(t)) = 0
      [Hermitian]                                [real]

(8)   Im(U(f)) = 0                               u(t) = u(−t)*
      [real]                                     [Hermitian]

(9)   U(f) = −U(−f)*                             Re(u(t)) = 0
      [anti-Hermitian]                           [imaginary]

(10)  Re(U(f)) = 0                               u(t) = −u(−t)*
      [imaginary]                                [anti-Hermitian]
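The entries of Table 2.1 are easy to spot-check with a discrete Fourier transform. The sketch below (length-16 arrays and an arbitrary seed, not from the text) verifies the Hermitian row for a real signal and the real-even row for a symmetrized signal.

```python
import numpy as np

# Spot-check of Table 2.1 symmetries with a length-16 DFT.
rng = np.random.default_rng(2)
t = np.arange(16)

u = rng.standard_normal(16)              # real signal
U = np.fft.fft(u)
# np.roll(U[::-1], 1) is U evaluated at negative frequencies, U(-f):
assert np.allclose(U, np.conj(np.roll(U[::-1], 1)))   # Hermitian: U(f)=U(-f)*

ue = u + u[(-t) % 16]                    # real and even: ue(t) = ue(-t)
Ue = np.fft.fft(ue)
assert np.allclose(Ue.imag, 0, atol=1e-10)            # transform is real
assert np.allclose(Ue, np.roll(Ue[::-1], 1))          # and even
```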


Table 2.2

      A_k = (1/T) ∫_0^T e^{−2πik(t/T)} v(t) dt    v(t) = Σ_{k=−∞}^{∞} A_k e^{2πik(t/T)}

(1)   Im(A_k) = 0, A_k = A_{−k}                  Im(v(t)) = 0, v(t) = v(−t)
      [real, even]                               [real, even]

(2)   Re(A_k) = 0, A_k = A_{−k}                  Re(v(t)) = 0, v(t) = v(−t)
      [imag., even]                              [imag., even]

(3)   Im(A_k) = 0, A_k = −A_{−k}                 Re(v(t)) = 0, v(t) = −v(−t)
      [real, odd]                                [imag., odd]

(4)   Re(A_k) = 0, A_k = −A_{−k}                 Im(v(t)) = 0, v(t) = −v(−t)
      [imag., odd]                               [real, odd]

(5)   Re(A_k) ≠ 0 for some k,                    Re(v(t)) ≠ 0 for some t,
      Im(A_k) ≠ 0 for some k,                    Im(v(t)) ≠ 0 for some t,
      A_k = A_{−k}                               v(t) = v(−t)
      [complex, even]                            [complex, even]

(6)   Re(A_k) ≠ 0 for some k,                    Re(v(t)) ≠ 0 for some t,
      Im(A_k) ≠ 0 for some k,                    Im(v(t)) ≠ 0 for some t,
      A_k = −A_{−k}                              v(t) = −v(−t)
      [complex, odd]                             [complex, odd]

(7)   A_k = A_{−k}*                              Im(v(t)) = 0
      [Hermitian]                                [real]

(8)   Im(A_k) = 0                                v(t) = v(−t)*
      [real]                                     [Hermitian]

(9)   A_k = −A_{−k}*                             Re(v(t)) = 0
      [anti-Hermitian]                           [imaginary]

(10)  Re(A_k) = 0                                v(t) = −v(−t)*
      [imaginary]                                [anti-Hermitian]
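The Fourier-series pair at the head of Table 2.2 can likewise be checked numerically. In the sketch below the period T, the real test function v, and the truncation |k| ≤ 20 are all arbitrary choices; because v is real, the computed coefficients should also satisfy the Hermitian entry of the table.

```python
import numpy as np

# Numerical check of A_k = (1/T) * integral of e^{-2 pi i k t/T} v(t) dt over
# one period, and of the reconstruction v(t) = sum_k A_k e^{2 pi i k t/T}.
T = 2.0
tt = np.linspace(0.0, T, 4000, endpoint=False)
dt = tt[1] - tt[0]
v = np.cos(2 * np.pi * tt / T) + 0.5 * np.sin(6 * np.pi * tt / T)

ks = np.arange(-20, 21)
A = np.array([(np.exp(-2j * np.pi * k * tt / T) * v).sum() * dt / T for k in ks])

# v is real, so the coefficients are Hermitian: A_{-k} = A_k*
assert np.allclose(A[::-1], np.conj(A), atol=1e-12)

# Reconstruction at an arbitrary point t0:
t0 = 0.3
v0 = np.sum(A * np.exp(2j * np.pi * ks * t0 / T))
assert abs(v0 - (np.cos(2 * np.pi * t0 / T) + 0.5 * np.sin(6 * np.pi * t0 / T))) < 1e-6
```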

3

RANDOM VARIABLES, RANDOM

FUNCTIONS, AND POWER SPECTRA

Engineers and scientists are taught many statistical concepts in school, but all too often this is

done in an informal manner that does a good job of explaining how to eliminate random errors

and noise from real experimental data and a poor job of explaining how to analyze random errors

and noise in physical models. Understanding the correct way to represent random errors and

noise requires formal knowledge of the statistical concepts used to describe random signals;

otherwise, basic equations can be misunderstood and misused. For this reason, we here take a

more formal approach to the subject. Starting off with an explanation of the basics—random

functions, independent and dependent random variables, the expectation operator E , stationarity

and ergodicity—that do not require the Fourier theory discussed in the previous chapter, we then

move on to topics that do, such as autocorrelation functions, white noise, the noise-power

spectrum, and the Wiener-Khinchin theorem. The techniques explained in this chapter are used a

few times in the next chapter during the derivation of the Michelson interference equations and

then over and over again in Chapters 6, 7, and 8 to analyze the random errors and noise found in

Michelson systems.

Random variables can be thought of as uncontrolled variables and nonrandom variables can be

thought of as controlled variables. When, for example, a computer program is being written, the

programmer controls the values of nonrandom program variables using inputs or lines of code,

but the programmer has no desire to control the program’s random variables—a pseudo-random

number generator gives them values instead. In a similar spirit, a statistician constructing a set of

model equations always ends up controlling the nonrandom variables—either directly by saying

this variable can be measured like this and that variable can be measured like that, or indirectly,

by saying these variables must solve that set of equations. Even when a statistician plots a

function against its argument, the graph is constructed by specifying the argument’s values and

then calculating the function according to its definition, which puts both the nonrandom argument

and the nonrandom value of the function under the statistician’s control. The statistician always,

on the other hand, treats random variables in a model as if they cannot be controlled. They must

be handled as if coins will be flipped, dice rolled, or needles spun on dials to determine their

values after the model is written down. All the statistician can know is the probability this

random variable takes on that value and the probability that random variable takes on this value;


3 · Random Variables, Random Functions, and Power Spectra

that is, he knows what the chances are that the coins, dice, or needles return one set of numbers

rather than another. Most scientists and engineers do not pay much attention to the difference

between controlled and uncontrolled variables—perhaps because most of their “controlled”

variables are usually a little “uncontrolled” in the sense that they come from imperfectly accurate

measurements—but it is very convenient when analyzing a statistical model to keep careful track

of this distinction. To help us remember which variables are random and which are not, we put a

wavy line or tilde over the random variables while writing the nonrandom variables in the usual

way. As an example of how this looks, we note that u, a0, and zƍ are all nonrandom variables

whereas Ǌ, ã0, and z′ are all random.

When the argument of a function is a random variable, the value of the function is also random.
If, for example, x̃ is a random variable and f is a function, then

ỹ = f(x̃)      (3.1a)

is another random variable. To give an example of how this works, we create a nonrandom time
variable t and a random angular frequency ω̃, multiply them together and take the sine of their
product to get

ỹ = sin(ω̃ t).      (3.1b)

The value of ỹ is clearly uncontrolled; for each unpredictable value of ω̃ at time t, there is a
corresponding unpredictable number ỹ that is given by sin(ω̃ t). This example also shows that
when a function has several arguments, its value becomes random when only one of the
arguments is random. In Eq. (3.1b) the sine of ω̃t, regarded as a function of both ω̃ and t, is
random even though only one of its arguments, ω̃, is random.

Many times when a function has multiple arguments, the controlled argument or arguments
are more interesting than the uncontrolled argument or arguments that make the function random.
One way to handle this situation is to list only the nonrandom arguments and say that what we
have is a random function with nonrandom arguments. To show what is going on, we put a wavy
line over the function name, indicating that even though all the listed arguments are nonrandom,
the function itself is random. If, for example, we are only interested in the nonrandom time t, we
could define

R̃(t) = sin(ω̃ t)      (3.2a)

to be a random function of the nonrandom variable t. Now whenever there is a list of time values
t₁, t₂, …, there is a corresponding list of random variables


Random and Nonrandom Functions · 3.2

ũ₁ = R̃(t₁) = sin(ω̃ t₁),  ũ₂ = R̃(t₂) = sin(ω̃ t₂),  … .      (3.2b)

Although Eq. (3.2b) implicitly assumes a list of distinct and separate t values, this reasoning still
holds up when t is explicitly made a continuous variable. Nothing, for example, stops us from
saying that for each value of t between −∞ and +∞, there corresponds a different random variable
R̃(t).

The idea of a random function of nonrandom arguments becomes more attractive when there is

no realistic possibility of analyzing the effect of multiple random arguments on a single

nonrandom function. We might, for example, know exactly how N random parameters r̃₁, r̃₂, …,
r̃_N interact to cause an error e in an electrical signal s at time t. This lets us write the error as a
nonrandom function

e(t, r̃₁, r̃₂, …, r̃_N).

Rather than investigating how r̃₁, r̃₂, …, r̃_N are behaving, it usually makes more sense to say that
there is a random noise

ñ(t) = e(t, r̃₁, r̃₂, …, r̃_N)      (3.3a)

contaminating electrical signal s. Now we can put the error into our model as a random function ñ
that depends on a nonrandom parameter t instead of as a nonrandom function e that depends on t
and N random parameters r̃₁, r̃₂, …, r̃_N. Sometimes the signal s in our model depends on more
than one nonrandom parameter, such as the x, y coordinates of an image point at time t. If the
corresponding error e in the signal s depends on x, y, and t as well as the random parameters r̃₁,
r̃₂, …, r̃_N, then we can say there is a random noise

ñ(x, y, t) = e(x, y, t, r̃₁, r̃₂, …, r̃_N)      (3.3b)

contaminating signal s(x, y, t). Note that we can think in terms of a signal noise ñ(t) or ñ(x, y, t)
even when we are not sure what random arguments r̃₁, r̃₂, …, r̃_N make the nonrandom function e

behave randomly. This is, of course, why the idea of a random function is so useful. In this book,

we use the term “random function” to refer to what statisticians often prefer to call a random or

stochastic process.
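The distinction between a random function and its realizations is easy to see in code. In this sketch, which is an illustration rather than anything from the text, each draw of the random frequency turns R̃(t) = sin(ω̃t) into one ordinary curve of t, while at a fixed time the draws give a list of random values; the uniform distribution for ω̃, the seed, and the time grid are arbitrary choices.

```python
import numpy as np

# Realizations of the random function R(t) = sin(omega*t): one curve per
# draw of the random frequency omega.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 10.0, 101)

realizations = [np.sin(w * t) for w in rng.uniform(0.5, 2.0, size=5)]

# At the fixed time t[10] = 1.0 the five realizations give five random
# values u = R(t1); each realization itself is a deterministic curve.
u_at_t1 = [r[10] for r in realizations]
assert len(set(np.round(u_at_t1, 6))) > 1          # values differ between draws
assert all(np.all(np.abs(r) <= 1.0) for r in realizations)
```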


Probability Density Distributions: Mean, Variance, Standard Deviation

With every random variable r̃, we associate a nonrandom probability density distribution p_r̃(x)
such that p_r̃(x) dx is the probability that the random variable r̃ takes on a value between x and
x + dx. The nonrandom argument x of p_r̃ is a dummy variable, and nothing stops us from calling
it r instead—in fact, that is the convention. The usual way to introduce a probability density
distribution for a random variable r̃ is to say that p_r̃(r) dr is the probability that r̃ takes on a
value between r and r + dr. The dummy argument of a probability density distribution p must be
nonrandom, and the subscript of the probability density distribution p must be random—the
subscript, after all, labels p to show which random variable is being described. Since r̃ must
always take on some sort of value between −∞ and +∞, the sum of all the probabilities p_r̃(r) dr
between −∞ and +∞ must always be one. Consequently, for any probability density distribution
p_r̃(r), we have

∫_{−∞}^{∞} p_r̃(r) dr = 1.      (3.4)

For Eq. (3.4) to make sense, the probability density distribution p_r̃(r) must be defined for all r
between −∞ and +∞ with the understanding that

p_r̃(r) = 0

for those values of r to which the random variable r̃ can never be equal.

The predicted average or mean value of r̃ can be written as

µ_r̃ = ∫_{−∞}^{∞} p_r̃(r) r dr.      (3.5a)

Note that µ_r̃, just like p_r̃, is nonrandom even though it has a random subscript. The predicted
variance of r̃, which is defined to be the predicted average or mean squared difference between
r̃ and µ_r̃, is another nonrandom quantity,

v_r̃ = ∫_{−∞}^{∞} p_r̃(r) (r − µ_r̃)² dr.      (3.5b)

Many people prefer to characterize a random number r̃ by its standard deviation σ_r̃ instead of its
variance v_r̃. The standard deviation of a random number r̃ is defined to be the square root of the
variance,


Probability Density Distributions: Mean, Variance, Standard Deviation · 3.3

σ_r̃ = √v_r̃.      (3.5c)

Of course σ_r̃, like v_r̃, is a nonrandom quantity. In general, the probability density distribution p_r̃
lets us find the predicted average or mean value of any nonrandom function f of the random
variable r̃ by calculating the nonrandom quantity

predicted mean value of f = ∫_{−∞}^{∞} p_r̃(r) f(r) dr.      (3.5d)

When f(r) = r, this equation reduces to formula (3.5a) for µ_r̃; and when f(r) = (r − µ_r̃)², this
equation reduces to formula (3.5b) for v_r̃.

Many random variables found in nature appear to obey a Gaussian, or “normal,” probability

distribution:

p_r̃(r) = (1/(σ_r̃ √(2π))) e^{−(r − µ_r̃)²/(2σ_r̃²)}.      (3.6a)

This can in part be explained as a consequence of the central limit theorem,²⁵ which is described
in Sec. 3.11 below. It is easy to show that the parameter µ_r̃ in Eq. (3.6a) is the mean of the Gaussian
distribution. Consulting formula (3.5a) above, we see that the mean of the distribution in (3.6a)
must be

∫_{−∞}^{∞} (r/(σ_r̃ √(2π))) e^{−(r − µ_r̃)²/(2σ_r̃²)} dr = (1/(σ_r̃ √(2π))) ∫_{−∞}^{∞} (r′ + µ_r̃) e^{−(r′)²/(2σ_r̃²)} dr′,      (3.6b)

where on the right-hand side the variable of integration is changed to r′ = r − µ_r̃. This becomes,
consulting Eq. (7A.3d) in Appendix 7A of Chapter 7,

(1/(σ_r̃ √(2π))) ∫_{−∞}^{∞} (r′ + µ_r̃) e^{−(r′)²/(2σ_r̃²)} dr′
    = (1/(σ_r̃ √(2π))) ∫_{−∞}^{∞} r′ e^{−(r′)²/(2σ_r̃²)} dr′ + (µ_r̃/(σ_r̃ √(2π))) ∫_{−∞}^{∞} e^{−(r′)²/(2σ_r̃²)} dr′      (3.6c)
    = (1/(σ_r̃ √(2π))) ∫_{−∞}^{∞} r′ e^{−(r′)²/(2σ_r̃²)} dr′ + µ_r̃ · 1.

²⁵ Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, Inc., New
York, 1991), p. 214.


If we replace r′ by −r′ in

g(r′) = r′ e^{−(r′)²/(2σ_r̃²)},

it is the same as multiplying g by −1, which makes g an odd function [see Eq. (2.11b) in Chapter
2]. Hence, according to Eq. (2.17) in Chapter 2,

∫_{−∞}^{∞} r′ e^{−(r′)²/(2σ_r̃²)} dr′ = 0

because it is the integral of an odd function between −∞ and +∞. Therefore, Eq. (3.6c) simplifies
to

(1/(σ_r̃ √(2π))) ∫_{−∞}^{∞} (r′ + µ_r̃) e^{−(r′)²/(2σ_r̃²)} dr′ = µ_r̃,      (3.6d)

so that

∫_{−∞}^{∞} (r/(σ_r̃ √(2π))) e^{−(r − µ_r̃)²/(2σ_r̃²)} dr = µ_r̃.      (3.6e)

This shows that, as claimed above, parameter µ_r̃ is the mean of the probability distribution
specified in Eq. (3.6a). It is just as easy to show that σ_r̃ is the standard deviation of the
distribution in (3.6a). From (3.5b) we know that the variance of this distribution is

∫_{−∞}^{∞} ((r − µ_r̃)²/(σ_r̃ √(2π))) e^{−(r − µ_r̃)²/(2σ_r̃²)} dr = ∫_{−∞}^{∞} ((r′)²/(σ_r̃ √(2π))) e^{−(r′)²/(2σ_r̃²)} dr′

when the variable of integration is changed to r′ = r − µ_r̃. According to Eq. (7A.3b) in Appendix
7A of Chapter 7, we can write

∫_{−∞}^{∞} ((r′)²/(σ_r̃ √(2π))) e^{−(r′)²/(2σ_r̃²)} dr′ = σ_r̃².      (3.6f)

Consequently, σ_r̃² is the variance of this probability density distribution. The square root of the
variance is the standard deviation according to (3.5c). Hence it is, as claimed, easy to see that σ_r̃
is the standard deviation of the distribution in (3.6a).
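The two results just derived can be confirmed by direct numerical quadrature of the Gaussian density (3.6a); the particular values of µ and σ in this sketch are arbitrary.

```python
import numpy as np

# Quadrature check that mu and sigma in the Gaussian density (3.6a) really
# are its mean and standard deviation, per Eqs. (3.4), (3.5a), and (3.5b).
mu, sigma = 1.3, 0.7
r = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200001)
dr = r[1] - r[0]
p = np.exp(-(r - mu) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

norm = p.sum() * dr                       # Eq. (3.4): total probability
mean = (p * r).sum() * dr                 # Eq. (3.5a)
var = (p * (r - mean) ** 2).sum() * dr    # Eq. (3.5b)

assert abs(norm - 1) < 1e-9
assert abs(mean - mu) < 1e-9
assert abs(np.sqrt(var) - sigma) < 1e-9
```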

When r̃ can only take on the values r₁, r₂, …, r_N, then p_r̃ can be written as a sum of delta
functions. If, for example, p₁ is the probability that r̃ is r₁, p₂ is the probability that r̃ is r₂, …,
p_N is the probability that r̃ is r_N, then

p_r̃(r) = Σ_{k=1}^{N} p_k · δ(r − r_k).      (3.7a)

The integral for the predicted mean value of r̃ in Eq. (3.5a) now reduces to

µ_r̃ = ∫_{−∞}^{∞} [Σ_{k=1}^{N} p_k · δ(r − r_k)] r dr = Σ_{k=1}^{N} p_k ∫_{−∞}^{∞} δ(r − r_k) r dr = Σ_{k=1}^{N} p_k r_k ;      (3.7b)

similarly, the integral for the predicted variance in Eq. (3.5b) reduces to

v_r̃ = ∫_{−∞}^{∞} [Σ_{k=1}^{N} p_k · δ(r − r_k)] (r − µ_r̃)² dr = Σ_{k=1}^{N} p_k ∫_{−∞}^{∞} δ(r − r_k) (r − µ_r̃)² dr
    = Σ_{k=1}^{N} p_k (r_k − µ_r̃)² ;      (3.7c)

and, according to Eq. (3.5d), the predicted mean value of f(r̃) becomes

∫_{−∞}^{∞} [Σ_{k=1}^{N} p_k · δ(r − r_k)] f(r) dr = Σ_{k=1}^{N} p_k ∫_{−∞}^{∞} f(r) δ(r − r_k) dr = Σ_{k=1}^{N} p_k f(r_k).      (3.7d)

Again, the integral formulas reduce to the correct probability-weighted sums. Looking at the
limiting case where N = 1 and p₁ = 1, we get

p_r̃(r) = δ(r − r₁)

so that

µ_r̃ = ∫_{−∞}^{∞} δ(r − r₁) r dr = r₁      (3.7e)


and

v_r̃ = ∫_{−∞}^{∞} (r − r₁)² δ(r − r₁) dr = (r₁ − r₁)² = 0.      (3.7f)

Results (3.7e) and (3.7f) show that the value of r̃ is now completely controlled; it must be equal
to r₁ and no longer needs to be treated like a random variable. Hence, the limiting case where
N = 1 and p₁ = 1 can be regarded as changing a random variable into a nonrandom variable.
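The collapse of the integrals (3.5a) and (3.5b) into the probability-weighted sums (3.7b) and (3.7c) is easy to check numerically. In this sketch the values r_k and probabilities p_k are arbitrary choices, and a Monte-Carlo estimate from repeated draws is compared against the predicted mean and variance.

```python
import numpy as np

# Discrete random variable: mean and variance as probability-weighted sums.
r_k = np.array([1.0, 2.0, 5.0])
p_k = np.array([0.2, 0.5, 0.3])
assert abs(p_k.sum() - 1) < 1e-12        # Eq. (3.4) for the delta-function sum

mu = (p_k * r_k).sum()                   # Eq. (3.7b)
var = (p_k * (r_k - mu) ** 2).sum()      # Eq. (3.7c)

# Monte-Carlo estimate from many draws agrees with the predicted values.
rng = np.random.default_rng(4)
draws = rng.choice(r_k, p=p_k, size=200000)
assert abs(draws.mean() - mu) < 0.02
assert abs(draws.var() - var) < 0.05
```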

Statisticians avoid the mathematical awkwardness of probability density distributions and their
associated integrals by defining an expectation operator E. For any nonrandom function f with a
random argument x̃, we say that E(f(x̃)) is the predicted mean, or average, value of f(x̃). We also
call E(f(x̃)) the expectation value of f(x̃). Mathematically we define

E(f(x̃)) = ∫_{−∞}^{∞} p_x̃(x) f(x) dx.      (3.8a)

Just like before, p_x̃(x) dx is the probability that the random variable x̃ takes on a value between
x and x + dx. We can find E(x̃), the expectation value of x̃, by choosing f(x̃) = x̃ in Eq. (3.8a)
to get

E(x̃) = ∫_{−∞}^{∞} p_x̃(x) x dx.      (3.8b)

Comparing this to Eq. (3.5a) above, we see that the expectation value of x̃ is the same as the
predicted mean or average value of x̃,

E(x̃) = µ_x̃,      (3.8c)

and the expectation value of (x̃ − µ_x̃)² is

E((x̃ − µ_x̃)²) = ∫_{−∞}^{∞} p_x̃(x) (x − µ_x̃)² dx.      (3.8d)

Comparing this to Eq. (3.5b) above, we see that the expectation value of (x̃ − µ_x̃)² is the same as
the predicted variance of x̃,

v_x̃ = E((x̃ − µ_x̃)²).      (3.8e)

The variance of x̃ is often written as Var(x̃), with

Var(x̃) = E((x̃ − µ_x̃)²).      (3.8f)

When the E operator is applied to any sort of random variable or function—for example,
f(x̃)—the result is always a nonrandom variable or function, namely

∫_{−∞}^{∞} p_x̃(x) f(x) dx.

For example, the characteristic function Φ_x̃ of a random variable x̃, which is the nonrandom
Fourier transform of the probability density distribution of x̃,

Φ_x̃(ν) = ∫_{−∞}^{∞} p_x̃(x) e^{−2πiνx} dx,      (3.9a)

can be written as

Φ_x̃(ν) = E(e^{−2πiνx̃}).      (3.9b)

Consider, as another example, a random variable ρ̃ that has the probability density distribution

p_ρ̃(ρ) = δ(ρ − c).      (3.9c)

According to the discussion following Eqs. (3.7e,f) above, this makes ρ̃ equivalent to the
nonrandom variable c. Consequently, we can say that

E(c) = E(ρ̃)      (3.9d)

and use Eq. (3.8b) above to get


E(c) = ∫_{−∞}^{∞} p_ρ̃(ρ) ρ dρ = ∫_{−∞}^{∞} δ(ρ − c) ρ dρ = c.      (3.9e)

This justifies the general rule—which also makes good intuitive sense—that

E( c ) = c (3.9f)

for any nonrandom quantity c.

The expectation operator E can be applied to multiple random variables at the same time—all
that we need is the appropriate probability density distribution. Suppose, for example, that the
behavior of two random variables x̃ and X̃ is described by a two-argument probability density
distribution p_x̃X̃(x, X), with p_x̃X̃(x, X) dx dX being the probability that the random variable x̃
takes on a value between x and x + dx while the random variable X̃ takes on a value between X
and X + dX. No matter what the behavior of random variables x̃ and X̃, we can always
construct an appropriate probability density distribution p_x̃X̃. Since x̃ and X̃ must always take
on some pair of values, the same reasoning used to produce Eq. (3.4) now shows that

∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dX p_x̃X̃(x, X) = 1      (3.10a)

for any choice of p_x̃X̃. The expectation value of any function of the random variables x̃ and X̃,
such as f(x̃, X̃), is defined to be

E(f(x̃, X̃)) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dX p_x̃X̃(x, X) f(x, X).      (3.10b)

In particular, we can always set f(x̃, X̃) = x̃X̃ to get the expected value of the random variables’
product,

E(x̃X̃) = ∫_{−∞}^{∞} x dx ∫_{−∞}^{∞} dX X p_x̃X̃(x, X).      (3.10c)


Independent and Dependent Random Variables · 3.5

When comparing two random variables such as x and X , one of the first questions that arises is

whether they are dependent or independent. When two random variables are dependent, the

random variables influence each other; and when two random variables are independent, they do

not.

Independent random variables are used to describe random quantities for which no cause-and-

effect relationship can be found. When, for example, we pick a car randomly from all the cars

sold in a given year, there is no reason to expect that the random variable representing the

brightness of the car’s headlights is associated with any particular value of the random variable

representing the car’s length. Lacking any evidence to the contrary, then, we say that these two

random variables ought to be independent. Similarly, if we pick someone at random from a

collection of adults, there is no obvious reason to assume that the random variable representing

the person’s yearly income is associated with any particular value of the person’s shoe size.

Again, we might assume that these are independent random variables. In general, when there is

no reason to connect the values of random quantities, we set them up in our models as

independent random variables.

Many times random variables turn out to be dependent in surprising ways. Returning to the

first of the previous examples, when we examine the connection between a car’s length and the

brightness of its headlights, it might turn out that very short cars are more likely to be European

sports cars frequently washed by their owners, making them more likely to have cleaner and thus

brighter headlights. Similarly, returning to the second example, a person’s shoe size and height

are connected; and statisticians have in fact shown that tall people, who are more likely to wear

large shoes, are also more likely to earn large incomes (if only because people living in the

United States, Australia, Canada, and Europe are more likely to be tall). Just as in these two

examples, many random variables that look like they ought to be unconnected and independent

turn out, after closer examination, to be dependent; in this sense, the independence of random

variables is the ideal case from which realistic random variables tend to deviate to a greater or

lesser degree.

When x̃ and X̃ are independent random variables, their probability density distribution can be
written as²⁶

p_x̃X̃(x, X) = p_x̃(x) · p_X̃(X),      (3.11a)

where p_x̃ and p_X̃ are the standard probability density distributions for x̃ and X̃ when x̃ and X̃

²⁶ Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 132.


are treated as solitary random variables. This means that p_x̃(x) dx is the probability that x̃ lies
between x and x + dx regardless of the value of X̃, and p_X̃(X) dX is the probability that X̃ lies
between X and X + dX regardless of the value of x̃. We see that, according to Eqs. (3.10c) and
(3.11a), the expectation value of the product x̃X̃ of two independent random variables is

E(x̃X̃) = ∫_{−∞}^{∞} x dx ∫_{−∞}^{∞} dX X p_x̃X̃(x, X) = ∫_{−∞}^{∞} x dx ∫_{−∞}^{∞} dX X p_x̃(x) p_X̃(X)

       = [∫_{−∞}^{∞} p_x̃(x) x dx] · [∫_{−∞}^{∞} p_X̃(X) X dX].

Hence

E(x̃X̃) = E(x̃) · E(X̃)      (3.11b)

or

E(x̃X̃) = µ_x̃ µ_X̃.      (3.11c)
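Equation (3.11b) can be illustrated by sampling: for two independently generated sequences of draws, the sample mean of the product should match the product of the sample means to within sampling error. The uniform and exponential distributions in this sketch are arbitrary choices.

```python
import numpy as np

# Sampling check of E(xX) = E(x)E(X) for independent random variables.
rng = np.random.default_rng(5)
n = 400000
x = rng.uniform(-1.0, 3.0, n)     # E(x) = 1, independent of X by construction
X = rng.exponential(2.0, n)       # E(X) = 2

lhs = (x * X).mean()              # estimate of E(xX)
rhs = x.mean() * X.mean()         # estimate of E(x) * E(X)
assert abs(lhs - rhs) < 0.05
assert abs(rhs - 2.0) < 0.05      # exact product of means is 1.0 * 2.0
```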

Our analysis of two random variables can be extended in a straightforward way to large
collections of random variables. If there are N random variables x̃₁, x̃₂, …, x̃_N, then we can
always construct a probability density distribution p_x̃₁x̃₂⋯x̃_N(x₁, x₂, …, x_N) such that

p_x̃₁x̃₂⋯x̃_N(x₁, x₂, …, x_N) dx₁ dx₂ ⋯ dx_N

is the probability that x̃₁ lies between x₁ and x₁ + dx₁, that x̃₂ lies between x₂ and x₂ + dx₂, …,
that x̃_N lies between x_N and x_N + dx_N. The expectation value of any function f(x̃₁, x̃₂, …, x̃_N) of
these N random variables is

E(f(x̃₁, x̃₂, …, x̃_N)) = ∫_{−∞}^{∞} dx₁ ∫_{−∞}^{∞} dx₂ ⋯ ∫_{−∞}^{∞} dx_N f(x₁, x₂, …, x_N) p_x̃₁x̃₂⋯x̃_N(x₁, x₂, …, x_N).      (3.12a)


Large Numbers of Random Variables · 3.7

Note that nothing has been said so far about the connections between these N random variables;
they could be either dependent or independent. If we now assume that these N random variables
are all independent with respect to one another, then

p_x̃₁x̃₂⋯x̃_N(x₁, x₂, …, x_N) = p_x̃₁(x₁) · p_x̃₂(x₂) ⋯ p_x̃_N(x_N),      (3.12b)

where p_x̃₁(x₁) dx₁ is the probability that x̃₁ lies between x₁ and x₁ + dx₁ regardless of the values
of the other N − 1 random variables, p_x̃₂(x₂) dx₂ is the probability that x̃₂ lies between x₂ and
x₂ + dx₂ regardless of the values of the other N − 1 random variables, …, p_x̃_N(x_N) dx_N is the
probability that x̃_N lies between x_N and x_N + dx_N regardless of the values of the other N − 1

random variables. The expectation value of the product of these N random variables can now be
written as, setting f(x̃₁, x̃₂, …, x̃_N) = x̃₁x̃₂⋯x̃_N in Eq. (3.12a),

E(x̃₁x̃₂⋯x̃_N) = ∫_{−∞}^{∞} dx₁ ∫_{−∞}^{∞} dx₂ ⋯ ∫_{−∞}^{∞} dx_N [x₁x₂⋯x_N] p_x̃₁x̃₂⋯x̃_N(x₁, x₂, …, x_N)
             = [∫_{−∞}^{∞} p_x̃₁(x₁) x₁ dx₁] · [∫_{−∞}^{∞} p_x̃₂(x₂) x₂ dx₂] ⋯ [∫_{−∞}^{∞} p_x̃_N(x_N) x_N dx_N]      (3.12c)

or

E(x̃₁x̃₂⋯x̃_N) = µ_x̃₁ µ_x̃₂ ⋯ µ_x̃_N.      (3.12d)

We can calculate the predicted mean values of x̃ and X̃ by choosing f(x̃, X̃) = x̃ and
f(x̃, X̃) = X̃ in Eq. (3.10b) above. This gives

µ_x̃ = E(x̃) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dX x p_x̃X̃(x, X)      (3.13a)

and

µ_X̃ = E(X̃) = ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dX X p_x̃X̃(x, X).      (3.13b)


3 · Random Variables, Random Functions, and Power Spectra

Rewriting these integrals as

$$ E(x) = \int_{-\infty}^{\infty} x \left[ \int_{-\infty}^{\infty} p_{xX}(x, X)\, dX \right] dx \qquad (3.13c) $$

and

$$ E(X) = \int_{-\infty}^{\infty} X \left[ \int_{-\infty}^{\infty} p_{xX}(x, X)\, dx \right] dX , \qquad (3.13d) $$

we compare them to the formula for the expected value of a random variable given in Eq. (3.8b).

This comparison suggests that, if we want to specify the behavior of one random variable while

disregarding the presence of the other, we can construct the single-argument probability density

distributions of x and X by writing

$$ p_x(x) = \int_{-\infty}^{\infty} p_{xX}(x, X)\, dX \qquad (3.13e) $$

and

$$ p_X(X) = \int_{-\infty}^{\infty} p_{xX}(x, X)\, dx . \qquad (3.13f) $$

Up to this point, none of the integrations have required assumptions about the dependence or

independence of the random variables, so Eqs. (3.13e) and (3.13f) hold true both for dependent

and independent random variables x and X . If we specify that x and X are independent, then

Eq. (3.11a) can be substituted into (3.13e) and (3.13f) to get

$$ p_x(x) = \int_{-\infty}^{\infty} p_x(x)\, p_X(X)\, dX = p_x(x) \int_{-\infty}^{\infty} p_X(X)\, dX $$

and

$$ p_X(X) = \int_{-\infty}^{\infty} p_x(x)\, p_X(X)\, dx = p_X(X) \int_{-\infty}^{\infty} p_x(x)\, dx . $$

Glancing back at Eq. (3.4), we note that these last two equalities are trivially true, because in both

cases the right-most integrals must be one.
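The marginalization in Eqs. (3.13e) and (3.13f) has a direct discrete analogue: summing a joint distribution over one variable yields the single-variable distribution of the other. A minimal Python sketch, with an arbitrary made-up joint table:

```python
# A hypothetical joint distribution p_xX(x, X) for two discrete random
# variables, stored as {(x, X): probability}; the entries sum to one.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.25, (1, 1): 0.15,
    (2, 0): 0.05, (2, 1): 0.25,
}

def marginal(joint, axis):
    """Sum the joint distribution over the other variable -- the discrete
    analogue of Eqs. (3.13e) and (3.13f)."""
    out = {}
    for pair, p in joint.items():
        key = pair[axis]
        out[key] = out.get(key, 0.0) + p
    return out

p_x = marginal(joint, 0)   # p_x(x): summed over X
p_X = marginal(joint, 1)   # p_X(X): summed over x

# Each marginal is itself a valid probability distribution (Eq. 3.4).
assert abs(sum(p_x.values()) - 1.0) < 1e-9
assert abs(sum(p_X.values()) - 1.0) < 1e-9
```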

Having found formulas for µ_x and µ_X that hold true for any pair of dependent or independent random variables x and X, we now use µ_x and µ_X to define a new random variable


Analyzing Dependent Random Variables · 3.9

$$ y = (x - \mu_x)(X - \mu_X) . \qquad (3.14a) $$

The expectation value

$$ E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) \qquad (3.14b) $$

is just the predicted average value of y . We can imagine, each time we acquire a random pair of

x and X values, comparing the sizes of x and X to their respective averages µ_x and µ_X by subtracting µ_x and µ_X from them. If x and X are both simultaneously greater than, or both simultaneously less than, their averages, then y is positive; and if one is greater than its average when the other is less than its average, then y is negative. If there is a tendency for one of the random variables to exceed its average whenever the other exceeds its average, or a tendency for one of the random variables to fall below its average whenever the other falls below its average, then y has a greater probability of being positive than negative, so

E( y ) > 0 .

If, on the other hand, there is a tendency for one of the random variables to exceed its average

when the other falls below its average, then y has a greater probability of being negative than

positive, so

E( y ) < 0 .

If E(y) is zero, it indicates that y is just as likely to be negative as positive, which means that

knowing one variable lies above or below its average tells us nothing about the likelihood that the

other variable lies above or below its average. Writing out the integral formula for E( y ) in terms

of the probability density distribution pxX

( x, X ) gives

∞ ∞

( ) ³ dx ³ dX [( x − µ )( X − µ

E( y ) = E ( x − µ x )( X − µ X ) = x X

)] pxX

( x, X ) . (3.14c)

−∞ −∞

We say that the value of the integral in Eq. (3.14c) measures the covariance of random variables

x and X . When

$$ E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) $$

is greater than zero, x and X are said to be positively correlated; when


$$ E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) $$

is less than zero, x and X are said to be negatively correlated; and when

$$ E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) $$

equals zero, x and X are said to be uncorrelated.

Evaluating E( y ) and finding it not equal to zero is a standard way of showing that two

random variables x and X are correlated and so cannot be independent. We cannot, however,

say that x and X are independent just because E( y ) is zero; that is, saying that x and X are

uncorrelated is a weaker statement than saying that x and X are independent. To show why this

is so, we set up a random variable φ which has a probability density distribution

$$ p_{\phi}(\phi) = \begin{cases} \dfrac{1}{2\pi} & \text{for } 0 \le \phi < 2\pi \\[4pt] 0 & \text{for } \phi < 0 \text{ or } \phi \ge 2\pi \end{cases} . \qquad (3.15a) $$

The probability density distribution pφ shows that φ is equally likely to take on any value

between zero and 2π, and that φ never takes on values less than zero or greater than 2π. We next

define two random variables u and v such that

u = sin(φ ) (3.15b)

and

v = cos(φ ) . (3.15c)

It follows that

$$ \mu_u = E(u) = E(\sin\phi) = \int_{-\infty}^{\infty} p_{\phi}(\phi) \sin(\phi)\, d\phi = \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\phi)\, d\phi = 0 \qquad (3.15d) $$

and

$$ \mu_v = E(v) = \frac{1}{2\pi} \int_{0}^{2\pi} \cos(\phi)\, d\phi = 0 . \qquad (3.15e) $$

Note that


$$ E\big( (u - \mu_u)(v - \mu_v) \big) = E(uv) = E\big( (\sin\phi)(\cos\phi) \big) = \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\phi)\cos(\phi)\, d\phi = \frac{1}{4\pi} \int_{0}^{2\pi} \sin(2\phi)\, d\phi = 0 , \qquad (3.15f) $$

which means that u and v are uncorrelated random variables. On the other hand, we also know

that

$$ u^2 + v^2 = \sin^2\phi + \cos^2\phi = 1 , $$

which means that whenever u takes on a particular random value, say 1/2, then v must take on

one of the two random values

$$ \pm\sqrt{1 - (1/2)^2} = \pm\frac{\sqrt{3}}{2} . $$

Consequently, u and v are by no means independent random variables even though by definition

they are uncorrelated random variables.
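This uncorrelated-but-dependent pair is easy to simulate. The Python sketch below (the seed and sample size are arbitrary choices) draws φ uniformly on [0, 2π), verifies that the sample covariance of u and v is near zero, and confirms that every sample nevertheless satisfies u² + v² = 1 exactly:

```python
import math
import random

# Sample the uncorrelated-but-dependent pair u = sin(phi), v = cos(phi)
# with phi uniform on [0, 2*pi); seed and sample size are arbitrary.
rng = random.Random(12345)
phis = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(200_000)]
u = [math.sin(phi) for phi in phis]
v = [math.cos(phi) for phi in phis]

n = len(phis)
mean_u = sum(u) / n
mean_v = sum(v) / n
cov_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v)) / n

# The sample covariance hovers near zero (the variables are uncorrelated) ...
assert abs(cov_uv) < 0.01
# ... yet u and v are rigidly dependent: u**2 + v**2 = 1 for every sample.
assert all(abs(ui * ui + vi * vi - 1.0) < 1e-12 for ui, vi in zip(u, v))
```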

3.10 Linearity of the Expectation Operator

The expectation operator is linear with respect to all random quantities. To see why, we take any

two functions f and g whose arguments are the N random variables x1 , x2 ,…, x N and multiply

them by two nonrandom variables α and β. The expectation operator E applied to αf + βg gives

$$ E\big( \alpha f(x_1, \dots, x_N) + \beta g(x_1, \dots, x_N) \big) $$
$$ = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, [\alpha f(x_1, \dots, x_N) + \beta g(x_1, \dots, x_N)]\, p_{x_1 x_2 \cdots x_N}(x_1, \dots, x_N) $$
$$ = \alpha \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, f(x_1, \dots, x_N)\, p_{x_1 x_2 \cdots x_N}(x_1, \dots, x_N) \qquad (3.16a) $$
$$ + \beta \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, g(x_1, \dots, x_N)\, p_{x_1 x_2 \cdots x_N}(x_1, \dots, x_N) $$
$$ = \alpha\, E\big( f(x_1, \dots, x_N) \big) + \beta\, E\big( g(x_1, \dots, x_N) \big) . $$


Note that in the last step Eq. (3.12a) is applied again to return to the expectation operator.

According to Eq. (2.32a) in Chapter 2, the definition of a linear operator L is that

L (α f + β g ) = α L ( f ) + β L ( g ) (3.16b)

for any two functions f, g and any two constants α, β. When we think of the nonrandom variables α and β as "constants," we see that Eqs. (3.16a) and (3.16b) provide plenty of justification for calling the expectation operator E a linear operator with respect to all random quantities.

The linearity of E can be used to show that multiplying any random variable x by a nonrandom parameter α results in the mean of x being multiplied by α and the variance of x being multiplied by α². Starting with Eq. (3.8c), we multiply both sides by α to get

$$ \alpha\, E(x) = \alpha \mu_x . \qquad (3.16c) $$

Because E is linear, E(αx) = α E(x), which means that Eq. (3.16c) can be written as

$$ E(\alpha x) = \alpha \mu_x . \qquad (3.16d) $$

This shows that multiplying x by α changes its average value from µ_x to αµ_x. As for the

variance vx of random variable x , according to Eq. (3.8e) we have

$$ E\big( (x - \mu_x)^2 \big) = v_x . \qquad (3.16e) $$

Multiplying both sides by α² gives

$$ \alpha^2 E\big( (x - \mu_x)^2 \big) = \alpha^2 v_x . \qquad (3.16f) $$

Again the linearity of E lets us write

$$ \alpha^2 E\big( (x - \mu_x)^2 \big) = E\big( \alpha^2 (x - \mu_x)^2 \big) = E\big( (\alpha x - \alpha \mu_x)^2 \big) . $$

- 240 -

Linearity of the Expectation Operator · 3.10

Consequently, (3.16f) becomes

$$ E\big( (\alpha x - \alpha \mu_x)^2 \big) = \alpha^2 v_x . \qquad (3.16g) $$

Since αx is the new random variable which comes from multiplying x by α and [according to Eq. (3.16d)] the quantity αµ_x is the mean of this new random variable, we now realize—consulting the definition of the variance in Eq. (3.8e)—that E((αx − αµ_x)²) must be the variance of the new random variable αx. Equation (3.16e) reminds us that v_x is the variance of the old

random variable x. Hence, Eq. (3.16g) states that if x is multiplied by α then its variance must be multiplied by α².
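These two scaling rules can be checked on any data set. A minimal Python sketch with arbitrary illustrative values:

```python
# Multiplying every sample of x by a nonrandom alpha multiplies the sample
# mean by alpha and the sample variance by alpha**2, mirroring Eqs. (3.16d)
# and (3.16g); the data and alpha below are arbitrary illustrative choices.
x = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
alpha = 3.0

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

scaled = [alpha * v for v in x]
assert abs(mean(scaled) - alpha * mean(x)) < 1e-9          # mean scales by alpha
assert abs(variance(scaled) - alpha ** 2 * variance(x)) < 1e-9  # variance by alpha**2
```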

The expectation operator usually can be moved inside an integral over a nonrandom variable.

Suppose function f depends on one nonrandom variable z in addition to N random variables

x1 , x2 ,…, x N . Then, again using Eq. (3.12a), the expectation value of the integral

$$ \int_{z_A}^{z_B} f(z, x_1, x_2, \dots, x_N)\, dz $$

is

$$ E\left( \int_{z_A}^{z_B} f(z, x_1, x_2, \dots, x_N)\, dz \right) = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \dots, x_N) \int_{z_A}^{z_B} f(z, x_1, x_2, \dots, x_N)\, dz . $$

As long as we can interchange the order of these integrations—which is almost always allowed

when dealing with physically realistic integrals—the expectation value can also be written as

$$ E\left( \int_{z_A}^{z_B} f(z, x_1, x_2, \dots, x_N)\, dz \right) = \int_{z_A}^{z_B} dz \left[ \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \dots, x_N)\, f(z, x_1, x_2, \dots, x_N) \right] , $$

or

$$ E\left( \int_{z_A}^{z_B} f(z, x_1, x_2, \dots, x_N)\, dz \right) = \int_{z_A}^{z_B} E\big( f(z, x_1, x_2, \dots, x_N) \big)\, dz . \qquad (3.17a) $$
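Equation (3.17a) can be checked numerically by computing both sides on a simple example. In the Python sketch below, a hypothetical discrete random variable stands in for x_1, …, x_N, f(z, x) = x z² is an arbitrary choice, and a midpoint rule approximates the z integral:

```python
# Numerical check that E and an integral over a nonrandom variable z commute
# (Eq. 3.17a).  The discrete distribution and f(z, x) = x * z**2 below are
# arbitrary illustrative choices.
dist = {1.0: 0.2, 2.0: 0.5, 4.0: 0.3}   # hypothetical p_x(x)

def f(z, x):
    return x * z ** 2

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def expect(g):
    """Expectation of g(x) over the discrete random variable x."""
    return sum(g(x) * p for x, p in dist.items())

lhs = expect(lambda x: integrate(lambda z: f(z, x), 0.0, 1.0))  # E(integral)
rhs = integrate(lambda z: expect(lambda x: f(z, x)), 0.0, 1.0)  # integral of E
assert abs(lhs - rhs) < 1e-9
```

Both orders of operation reduce to the same double sum, which is exactly the interchange of integrations invoked in the text.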


The same reasoning can be extended to M integrals over M nonrandom variables z1 , z2 ,…, zM .

We have

$$ E\left( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, z_2, \dots, z_M, x_1, x_2, \dots, x_N) \right) $$
$$ = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, \dots, x_N) \int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, \dots, z_M, x_1, \dots, x_N) $$
$$ = \int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M \left[ \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, \dots, x_N)\, f(z_1, \dots, z_M, x_1, \dots, x_N) \right] , $$

so that

$$ E\left( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, z_2, \dots, z_M, x_1, x_2, \dots, x_N) \right) = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, E\big( f(z_1, z_2, \dots, z_M, x_1, x_2, \dots, x_N) \big) . \qquad (3.17b) $$

The expectation operator can even be moved inside the integral of a random function f̃(z_1, z_2, …, z_M), since any such random function can be written as

$$ \tilde f(z_1, z_2, \dots, z_M) = f(z_1, z_2, \dots, z_M, x_1, x_2, \dots, x_N) $$

for some set of random variables x_1, x_2, …, x_N. Hence, we can just suppress the random variables x_1, x_2, …, x_N in Eq. (3.17b) to get


$$ E\left( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, \tilde f(z_1, z_2, \dots, z_M) \right) = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, E\big( \tilde f(z_1, z_2, \dots, z_M) \big) . \qquad (3.17c) $$

3.11 The Central Limit Theorem

The central limit theorem states that if there is a random variable s_N equal to the sum of N independent random variables r_1, r_2, …, r_N,

$$ s_N = r_1 + r_2 + \cdots + r_N = \sum_{j=1}^{N} r_j , \qquad (3.18a) $$

then s_N has a probability density distribution p_{s_N}(s_N) that resembles a Gaussian or normal probability density distribution more and more as N gets large,

$$ p_{s_N}(s_N) \cong \frac{1}{\sigma_{s_N}\sqrt{2\pi}}\, e^{-\frac{(s_N - \mu_{s_N})^2}{2\sigma_{s_N}^2}} . \qquad (3.18b) $$

In Eq. (3.18b), µ sN is the mean or average value of sN and σ sN is the standard deviation of sN

about its mean. Figure 3.1 is a plot of the Gaussian distribution specified on the right-hand side of

(3.18b). For large but finite values of N, this Gaussian distribution tends to be a relatively good

approximation of psN ( sN ) for sN values near the peak in Fig. 3.1 and a not-so-good

approximation of psN ( sN ) for sN values in the tails of Fig. 3.1—that is, for sN values far from

the peak.

The mean of sN comes from applying the expectation operator E to both sides of Eq. (3.18a).

Remembering that E is linear with respect to random quantities [see Eq. (3.16a) above], we get


$$ \mu_{s_N} = \mu_{r_1} + \mu_{r_2} + \cdots + \mu_{r_N} . \qquad (3.19a) $$

[FIGURE 3.1: Plot of the Gaussian probability density distribution p_{s_N}(s_N) versus s_N.]

The variance of s_N is

$$ v_{s_N} = E\big( (s_N - \mu_{s_N})^2 \big) , $$


The Central Limit Theorem · 3.11

which can be written as

$$ v_{s_N} = E\left( \left( \sum_{j=1}^{N} (r_j - \mu_{r_j}) \right)^{\!2}\right) = E\left( \sum_{j=1}^{N} (r_j - \mu_{r_j})^2 + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} [(r_j - \mu_{r_j})(r_k - \mu_{r_k})] \right) , $$

and the linearity of the expectation operator with respect to random quantities then lets us write

this as

$$ v_{s_N} = \sum_{j=1}^{N} E\big( (r_j - \mu_{r_j})^2 \big) + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) . \qquad (3.19b) $$

Since r1 , r2 ,…, rN are independent random quantities, so must the random quantities r1 − µr1 ,

r_2 − µ_{r_2}, …, r_N − µ_{r_N} also be independent. Hence, according to Eq. (3.11b), we see that when j ≠ k

$$ E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) = E(r_j - \mu_{r_j}) \cdot E(r_k - \mu_{r_k}) . \qquad (3.19c) $$

But, applying the linearity of the expectation operator and Eqs. (3.8c) and (3.9f), we have E(r_j − µ_{r_j}) = E(r_j) − µ_{r_j} = 0, so that

$$ E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) = 0 \qquad (3.19d) $$

and Eq. (3.19b) collapses to

$$ v_{s_N} = \sum_{j=1}^{N} E\big( (r_j - \mu_{r_j})^2 \big) , \qquad (3.19e) $$


where

$$ E\big( (r_j - \mu_{r_j})^2 \big) = v_{r_j} \qquad (3.19f) $$

is the variance of rj for j = 1, 2,… , N . The standard deviation of a random quantity is the square

root of its variance [see Eq. (3.5c)], so formulas (3.19e) and (3.19f) can also be written as

$$ \sigma_{s_N} = \sqrt{ \sigma_{r_1}^2 + \sigma_{r_2}^2 + \cdots + \sigma_{r_N}^2 } , \qquad (3.19g) $$

where

$$ \sigma_{r_j} = \sqrt{ E\big( (r_j - \mu_{r_j})^2 \big) } \qquad (3.19h) $$

is the standard deviation of r_j for j = 1, 2, …, N and σ_{s_N} is the standard deviation of s_N.
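Equations (3.19a) and (3.19e)–(3.19g) can be verified exactly for discrete random variables by enumerating the joint distribution outright. The Python sketch below uses three fair dice as an arbitrary stand-in for the independent r_j:

```python
import itertools

# Exact check that, for independent random variables, the mean of the sum
# is the sum of the means and the variance of the sum is the sum of the
# variances; three fair dice let us enumerate every outcome.
faces = [1, 2, 3, 4, 5, 6]
N = 3

def mean_var(values, probs):
    """Mean and variance of a discrete distribution."""
    m = sum(v * p for v, p in zip(values, probs))
    var = sum((v - m) ** 2 * p for v, p in zip(values, probs))
    return m, var

# One die: uniform over its faces.
die_mean, die_var = mean_var(faces, [1 / 6] * 6)

# Sum of N independent dice: enumerate all 6**N equally likely outcomes.
sums = [sum(c) for c in itertools.product(faces, repeat=N)]
p = 1 / 6 ** N
sum_mean, sum_var = mean_var(sums, [p] * len(sums))

assert abs(sum_mean - N * die_mean) < 1e-9   # means add
assert abs(sum_var - N * die_var) < 1e-9     # variances add
```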

Returning to the approximation in Eq. (3.18b) used to explain the central limit theorem, we

notice that some care must be exercised in interpreting the limit as N → ∞ ; in particular, it is

clear from Eqs. (3.19a) and (3.19g) that there is a tendency for both µ sN and σ sN to become large

without limit as N increases, making the expression on the right-hand side of (3.18b) difficult to

interpret in the limit of large N. The central limit theorem can be written in terms of a

mathematically well-defined limit as N → ∞ if we are careful how the arguments of the

Gaussian or normal distribution are defined. To state the central limit theorem precisely, we

define a new random variable

$$ z_N = \frac{s_N - \mu_{s_N}}{\sigma_{s_N}} \qquad (3.20a) $$

that has a probability density distribution pzN ( z N ) . Now we can present the central limit theorem

exactly by stating that

$$ \lim_{N \to \infty} \big[ p_{z_N}(z) \big] = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} . \qquad (3.20b) $$

The right-hand side of (3.20b) is the Gaussian or normal distribution introduced above in Eq.

(3.6a) where the random variable has a mean of zero and a standard deviation of one. For any

large but finite value of N, we can recover the approximation in (3.18b) by assuming that pzN is

near its limit and then replacing z in (3.20b) by z_N as defined in (3.20a). [The extra factor of σ_{s_N}


multiplying the √(2π) on the right-hand side of (3.18b) can be regarded as coming from Eq. (3.4) above—if it isn't there, then the integral of the probability density distribution between −∞ and +∞ does not equal one.]

3.12 Averaging to Improve Experimental Accuracy

It is now easy to explain why averaging together many identical but independent measurements

from the same experiment improves the accuracy of the result. Suppose N independent

measurements are to be averaged together this way. We can say that each measurement is an

independent random number rj for j = 1, 2,… , N having the same mean value µ, with µ taken to

be the true value of the experimental quantity being measured. Since the measurements are all

identical, all the r_j have the same standard deviation σ due to the same sorts of random errors

occurring in each independent measurement. When all the experimental results are averaged, we

create a new random number—namely, the sum of all the rj divided by N. Let’s call this new

random number a N . The work done in the previous section lets us write this as [see Eq. (3.18a)]

$$ a_N = \frac{s_N}{N} . \qquad (3.21a) $$

Applying the expectation operator E to both sides gives, using the linearity of the expectation

operator (see Sec. 3.10 above),

$$ E(a_N) = \frac{1}{N}\, E(s_N) . \qquad (3.21b) $$

Since E( sN ) = µ sN , Eq. (3.19a) shows that, since all the rj have the same mean value µ,

$$ E(a_N) = \frac{1}{N} (N\mu) = \mu . \qquad (3.21d) $$

Equation (3.21d) states that the expected value of the experimental average a N is µ, the true

value of the experimental quantity being measured. This is no great surprise, because the

averaging process would not make sense unless it were true. The typical size of the error left after

the rj are averaged together—that is, the amount by which a N is likely to be different from its

average value—is just its standard deviation [see Eqs. (3.5c) and (3.8e) above],


$$ \sigma_{a_N} = \sqrt{ E\big( (a_N - \mu)^2 \big) } , $$

which can also be written as, after substituting from Eq. (3.21a) and using the linearity of the

expectation operator,

$$ \sigma_{a_N} = \sqrt{ E\left( \left( \frac{1}{N}\, s_N - \mu \right)^{\!2}\right) } = \frac{1}{N} \sqrt{ E\big( (s_N - N\mu)^2 \big) } . \qquad (3.21e) $$

Because all the r_j have the same mean µ, Eq. (3.19a) shows that µ_{s_N} = Nµ, so the expectation value E((s_N − Nµ)²) is just the variance v_{s_N} of s_N [see Eq. (3.8e) above]. Hence, (3.21e) can be written as

$$ \sigma_{a_N} = \frac{1}{N} \sqrt{ v_{s_N} } = \frac{1}{N} \sqrt{ \sigma_{s_N}^2 } $$

because the variance is the square of the standard deviation σ sN . Substituting from (3.19g) now

gives

$$ \sigma_{a_N} = \frac{1}{N} \sqrt{ v_{s_N} } = \frac{1}{N} \sqrt{ \sigma_{r_1}^2 + \sigma_{r_2}^2 + \cdots + \sigma_{r_N}^2 } . $$

As already mentioned above, we can assume that all the r_j have the same standard deviation σ.

Hence,

$$ \sigma_{a_N} = \frac{1}{N} \sqrt{ N \sigma^2 } = \frac{\sigma}{\sqrt{N}} . \qquad (3.21f) $$

This shows that when the standard deviation or expected error in one measurement is σ, then the standard deviation or expected error in the average a_N of N identical but independent measurements is σ/√N, a significantly smaller number. Although we use several formulas from

the previous section on the central limit theorem to get this result, there is no assumption here

that the rj obey any particular probability density distribution. In order to derive Eqs. (3.21d) and

(3.21f), all that is needed is that the rj are independent and that the probability density

distributions of the rj have the same mean and standard deviation.
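The 1/√N shrinkage of Eq. (3.21f) can likewise be verified exactly by enumeration. In the Python sketch below a fair die is an arbitrary stand-in for a single measurement:

```python
import itertools
import math

# Exact check of Eq. (3.21f): averaging N identical independent measurements
# shrinks the standard deviation to sigma / sqrt(N).  A fair die stands in
# for one measurement so every outcome can be enumerated.
faces = [1, 2, 3, 4, 5, 6]
N = 3

def stdev(values, probs):
    """Standard deviation of a discrete distribution."""
    m = sum(v * p for v, p in zip(values, probs))
    return math.sqrt(sum((v - m) ** 2 * p for v, p in zip(values, probs)))

sigma = stdev(faces, [1 / 6] * 6)                 # one measurement

# Average of N independent measurements: all 6**N equally likely outcomes.
averages = [sum(c) / N for c in itertools.product(faces, repeat=N)]
p = 1 / 6 ** N
sigma_avg = stdev(averages, [p] * len(averages))  # the N-sample average

assert abs(sigma_avg - sigma / math.sqrt(N)) < 1e-9
```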

When spectrometers are used to make independent measurements of the same radiance


Averaging to Improve Experimental Accuracy · 3.12

spectra, we can extend the above analysis to the spectral measurements by regarding the

independent but identical random variables rj as random functions of the spectral wavelength or

frequency, with different values of index j now representing different spectral curves from

independent spectral measurements. We can now repeat all the algebraic manipulations used in

(3.21a)–(3.21f) above while regarding every quantity except N as a function of the spectral

wavelength or frequency and end up with the same results. If, for example, the quantities are

regarded as functions of the spectral wavelength Ȝ, then we just need to visualize a (Ȝ)

immediately following the relevant variables. In a sense, all that is happening is that we have

decided to repeat the algebra of Eqs. (3.21a)–(3.21f) at each spectral wavelength. Equation

(3.21d), for example, becomes

$$ E\big( a_N(\lambda) \big) = \mu(\lambda) , \qquad (3.22a) $$

showing that the point-by-point average of the r_j(λ) spectral curves creates another curve a_N(λ) whose expected value is the true spectrum µ(λ). The average spectrum a_N(λ) is allowed to have a different expected value µ(λ) at each wavelength λ because it is now, of course, taken to be a function of λ. Similarly Eq. (3.21f) becomes

$$ \sigma_{a_N}(\lambda) = \frac{\sigma(\lambda)}{\sqrt{N}} . \qquad (3.22b) $$

This shows that the expected error σ_{a_N}(λ) at wavelength λ of the average spectrum a_N(λ) is smaller by a factor of √N than the expected error σ(λ) at wavelength λ of a single spectral measurement. The expected error σ(λ), just like the average µ(λ), is allowed to be different at different wavelengths. As long as the expected value µ(λ) of a_N(λ) is the true spectral curve, Eq. (3.22b) shows that we can approach this true spectrum as closely as we desire—that is, make the error in our point-by-point average spectrum arbitrarily small—by making N as large as necessary.

3.13 Mean, Autocorrelation, and Autocovariance of Random Functions of Time

Using the same notation as in the discussion following Eq. (3.2a) above, we write ñ(t) to

represent a random function ñ of a nonrandom time t. As we already mentioned at the end of Sec.

3.2, ñ(t) is often called a random or stochastic process. Having specified a random function—or

stochastic process or random process—called ñ(t), we know that for each time t there is a random

variable ñ(t); and when there are two different time values t1 and t2 with t1 ≠ t2, there is no reason

to expect the random variables ñ(t1) and ñ(t2) to behave the same way.


We also know the behavior of random variables can be described by probability density

distributions. Associated with any N sequential random variables ñ(t_1), ñ(t_2), …, ñ(t_N) specified by the time values t_1 < t_2 < ⋯ < t_N there is a probability density distribution

$$ p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N) $$

such that

$$ p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N)\, dn_1\, dn_2 \cdots dn_N $$

is the probability first that ñ(t_1) takes on a value between n_1 and n_1 + dn_1, and then that ñ(t_2) takes on a value between n_2 and n_2 + dn_2, and then that ñ(t_3) takes on a value between n_3 and n_3 + dn_3, …, and then that ñ(t_N) takes on a value between n_N and n_N + dn_N. The expectation

operator E has the same meaning as before: the expected or mean value of any function f of the

N random variables ñ(t_1), ñ(t_2), …, ñ(t_N) is

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N\, f(n_1, n_2, \dots, n_N)\, p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N) . \qquad (3.23a) $$

One of the most important expectation values associated with ñ occurs when we set N = 2 and

specify that

$$ f\big( \tilde n(t_1), \tilde n(t_2) \big) = \tilde n(t_1) \cdot \tilde n(t_2) . $$

This gives the autocorrelation function of ñ,

$$ R_{\tilde n \tilde n}(t_1, t_2) = E\big( \tilde n(t_1) \cdot \tilde n(t_2) \big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2\, [n_1 n_2]\, p_{\tilde n(t_1)\,\tilde n(t_2)}(n_1, n_2) . \qquad (3.23b) $$

We can also define the mean of ñ at time t,

$$ \mu_{\tilde n}(t) = E\big( \tilde n(t) \big) = \int_{-\infty}^{\infty} n\, p_{\tilde n(t)}(n)\, dn , \qquad (3.23c) $$

−∞

and the autocovariance of ñ,


Mean, Autocorrelation, Autocovariance of Random Functions of Time · 3.13

$$ C_{\tilde n \tilde n}(t_1, t_2) = E\Big( \big( \tilde n(t_1) - \mu_{\tilde n}(t_1) \big)\big( \tilde n(t_2) - \mu_{\tilde n}(t_2) \big) \Big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2\, (n_1 - \mu_{\tilde n}(t_1))(n_2 - \mu_{\tilde n}(t_2))\, p_{\tilde n(t_1)\,\tilde n(t_2)}(n_1, n_2) . \qquad (3.23d) $$

Expanding the product inside the expectation shows that

$$ C_{\tilde n \tilde n}(t_1, t_2) = R_{\tilde n \tilde n}(t_1, t_2) - \mu_{\tilde n}(t_1)\, \mu_{\tilde n}(t_2) . \qquad (3.23e) $$

Almost always, the random functions used to represent noise in a physical system are specified in

such a way that µ_{ñ}(t) = 0, which means the distinction between the autocorrelation function and the autocovariance function becomes irrelevant.
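The relation between the autocorrelation and the autocovariance survives in sample estimates as an exact algebraic identity. A minimal Python sketch with arbitrary paired sample values of ñ(t_1) and ñ(t_2):

```python
# Sample-based check of the identity C(t1, t2) = R(t1, t2) - mu1 * mu2 that
# follows from expanding the autocovariance; the paired samples below are
# arbitrary illustrative draws of n(t1) and n(t2).
n1 = [0.3, -1.2, 0.8, 2.1, -0.4, 0.9]   # draws of n(t1)
n2 = [0.1, -0.9, 1.1, 1.8, -0.2, 0.5]   # matching draws of n(t2)

m = len(n1)
mu1 = sum(n1) / m
mu2 = sum(n2) / m
R = sum(a * b for a, b in zip(n1, n2)) / m                  # sample autocorrelation
C = sum((a - mu1) * (b - mu2) for a, b in zip(n1, n2)) / m  # sample autocovariance

assert abs(C - (R - mu1 * mu2)) < 1e-12
```

When the sample means are zero, R and C coincide, which is the situation described in the paragraph above for zero-mean noise.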

3.14 Ensembles

Just as random variables are often regarded as taking on one or another specific value chosen

randomly from some collection of allowed nonrandom values, so too do we often think of

random functions as becoming one or another specific, nonrandom function chosen randomly

from a collection—or ensemble—of allowed nonrandom functions. We can visualize this

situation by imagining an infinitely long row of biased and crooked slot machines, one for every

value of t on the time axis.27 The slot machines do not necessarily behave identically and they are

wired together so that they can influence each other. When a slot machine’s lever is pulled, there

is never any jackpot; all that happens is that another number appears inside its window. Each time

we simultaneously pull all the levers of the slot machines, we randomly choose another member

of the ensemble of allowed functions. The probability p_{ñ(t)}(n) dn that random variable ñ(t) takes

on a value between n and n + dn is just the probability that the slot machine at t takes on a value

between n and n + dn , and it is also the probability that some member function randomly chosen

from the ensemble of allowed functions has a value between n and n + dn at time t. In fact, we

can say that

$$ p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N)\, dn_1\, dn_2 \cdots dn_N $$

is the probability, after the slot machine levers are pulled, that the slot machine at t_1 has a value between n_1 and n_1 + dn_1, that the slot machine at t_2 has a value between n_2 and n_2 + dn_2, …, and

27 An objection that could be raised here is that an infinite number of slot machines is only what is called countably infinite whereas the number of points on the time axis is uncountably infinite, a much "larger" type of infinity. For our purposes, the distinction between these two types of infinity is not important.


that the slot machine at tN has a value between nN and nN + dnN . It can also, of course, be thought

of as the probability that a member function randomly chosen from the ensemble of allowed

functions has values at times t_1 < t_2 < ⋯ < t_N that lie between n_1 and n_1 + dn_1, n_2 and n_2 + dn_2,

…, nN and nN + dnN respectively.

3.15 Stationary Random Functions

A random function ñ(t) is strictly stationary,28 or strict-sense stationary,29 if all its statistical

properties are unaffected when the origin of its time axis is changed (that is, when we change the

point at which t = 0). Mathematically we require, for any t_1 < t_2 < ⋯ < t_N, that the probability density distributions satisfy

$$ p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N) = p_{\tilde n(t_1+\tau)\,\tilde n(t_2+\tau) \cdots \tilde n(t_N+\tau)}(n_1, n_2, \dots, n_N) \qquad (3.24a) $$

for any value of τ and all N = 1, 2, …, ∞. Thus, for any integrable function f with N arguments,

$$ \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N\, f(n_1, n_2, \dots, n_N)\, p_{\tilde n(t_1)\,\tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \dots, n_N) $$
$$ = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N\, f(n_1, n_2, \dots, n_N)\, p_{\tilde n(t_1+\tau)\,\tilde n(t_2+\tau) \cdots \tilde n(t_N+\tau)}(n_1, n_2, \dots, n_N) , \qquad (3.24b) $$

where t_1 < t_2 < ⋯ < t_N and N = 1, 2, …, ∞. This means that, according to Eq. (3.23a),

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) = E\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \dots, \tilde n(t_N+\tau)) \big) \qquad (3.24c) $$

for any integrable function f, any value of τ , and N = 1, 2,… , ∞ . We note that when Eq. (3.24c)

holds true,

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) $$

cannot depend on all the N independent time values t1 , t2 ,…, t N as we might at first suppose. To

see why this is so, we just set τ = −t1 in (3.24c) to get

28 Paul H. Wirsching, Thomas L. Paez, and Keith Ortiz, Random Vibrations: Theory and Practice (John Wiley and Sons, Inc., New York, 1995), p. 80.

29 Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 297.


Stationary Random Functions · 3.15

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) = E\big( f(\tilde n(0), \tilde n(t_2 - t_1), \tilde n(t_3 - t_1), \dots, \tilde n(t_N - t_1)) \big) . \qquad (3.24d) $$

This shows that

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) $$

must be a function of just the nonrandom time parameters (t2 − t1 ) , (t3 − t1 ) ,…, (t N − t1 ) and there

are, of course, only N − 1 of these.

Equations (3.24b)–(3.24d) can be understood in terms of the following thought experiment.

We randomly pick some function from the ensemble of allowed functions and choose N time

values t_1 < t_2 < ⋯ < t_N. The randomly picked function has values n_1, n_2, …, n_N at times

t1 , t2 ,…, t N respectively. Next, we create some nonrandom function f that has N arguments and is

not one of those physically unreasonable abstractions that mathematicians specialize in. We

calculate and store the value of f (n1 , n2 ,… , nN ) . Randomly choosing another function from the

ensemble of allowed functions for n (t ) , we again use n1 , n2 ,…, nN at t1 , t2 ,…, t N to calculate and

store a new value of f (n1 , n2 ,… , nN ) . Repeating this procedure enough times to get a large

collection of f values, we average them all together to get a good estimate of

$$ E\big( f(\tilde n(t_1), \tilde n(t_2), \dots, \tilde n(t_N)) \big) . $$

Shifting to a new set of time values t_1 + τ, t_2 + τ, …, t_N + τ, we again generate another large collection of f values, this time averaging them together to get a good estimate of

$$ E\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \dots, \tilde n(t_N+\tau)) \big) . $$

Since ñ is strict-sense stationary, we know that no matter what the positive integer N is, and no

matter what the function f is, and no matter what the value of τ is, both collections of f values

always have approximately the same average, with the difference between the averages becoming

less and less as the collections of f values get larger and larger.

To give an example of a random function ñ(t) that is strict-sense stationary, we define

$$ \tilde n(t) = \tilde a \cos(\omega t) + \tilde b \sin(\omega t) , \qquad (3.25a) $$

where ã and b̃ are two random variables having a joint probability density distribution p_{ãb̃}(a, b) such that p_{ãb̃}(a, b) da db is the probability that ã takes on a value between a and a + da when b̃ takes on a value between b and


b + db. Equivalently, p_{ãb̃}(a, b) da db is the probability that b̃ takes on a value between b and b + db when ã takes on a value between a and a + da. We next require

$$ p_{\tilde a \tilde b}(a, b) = p_{\tilde a \tilde b}\big( \sqrt{a^2 + b^2} \big) . \qquad (3.25b) $$

We say that p_{ãb̃}(a, b) is circularly symmetric because it depends on a and b only through √(a² + b²), the "radius length" of a point whose x and y coordinates are a, b. Returning to

the slot-machine model for ñ(t) explained in Sec. 3.14, we note that randomly choosing values for

a and b is the same as simultaneously pulling the levers of all the slot machines representing

ñ(t) in Eq. (3.25a). Having pulled the levers and gotten, say, values a1 for a and b1 for b , we

then know that the number in the window of the slot machine located at time value t1 is

a1 cos(ω t1 ) + b1 sin(ω t1 ) ,

we know that the number in the window of the slot machine located at time value t2 is

a1 cos(ω t2 ) + b1 sin(ω t2 ) ,

and so on. If we pull all the levers again and get values a2 for a and b2 for b , then we know that

the slot machine at t1 has a number

$$ a_2 \cos(\omega t_1) + b_2 \sin(\omega t_1) $$

in its window, the slot machine at t_2 has a number

$$ a_2 \cos(\omega t_2) + b_2 \sin(\omega t_2) , $$

and so on. Since the joint probability density distribution p_{ãb̃}(a, b) completely determines the statistics of random variables ã and b̃, we see that it must also completely determine the statistics of ñ(t) in Eq. (3.25a).
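A quick Monte Carlo experiment supports the stationarity claim for one circularly symmetric choice of p_{ãb̃}: independent zero-mean, unit-variance Gaussians for ã and b̃. In the Python sketch below (the seed, ω, lag, and sample count are arbitrary choices), the estimated E[ñ(t) ñ(t + Δt)] comes out near cos(ω Δt) regardless of the absolute time t:

```python
import math
import random

# For n(t) = a*cos(omega*t) + b*sin(omega*t) with a, b independent standard
# Gaussians (a circularly symmetric joint density), the autocorrelation
# should depend only on the lag dt; omega, dt, seed, and trial count are
# arbitrary choices for this sketch.
rng = random.Random(2024)
omega, dt = 2.0, 0.7
trials = 200_000

def corr(t):
    """Monte Carlo estimate of E[n(t) n(t + dt)] over fresh draws of (a, b)."""
    acc = 0.0
    for _ in range(trials):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        n_t = a * math.cos(omega * t) + b * math.sin(omega * t)
        n_s = a * math.cos(omega * (t + dt)) + b * math.sin(omega * (t + dt))
        acc += n_t * n_s
    return acc / trials

c0 = corr(0.0)
c5 = corr(5.0)

# The exact value is cos(omega * dt) for unit-variance a and b; two very
# different absolute times give estimates near that same number.
assert abs(c0 - math.cos(omega * dt)) < 0.02
assert abs(c5 - math.cos(omega * dt)) < 0.02
```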

It is not difficult to show that ñ(t) in Eq. (3.25a) is strict-sense stationary when p_{ãb̃} is circularly symmetric.30 Picking an arbitrary time interval τ, we construct two new random variables

30 Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 301.


$$ \tilde A = \tilde a \cos(\omega\tau) + \tilde b \sin(\omega\tau) \qquad (3.26a) $$

and

$$ \tilde B = \tilde b \cos(\omega\tau) - \tilde a \sin(\omega\tau) . \qquad (3.26b) $$

The inverse relations are

$$ \tilde a = \tilde A \cos(\omega\tau) - \tilde B \sin(\omega\tau) \qquad (3.26c) $$

and

$$ \tilde b = \tilde B \cos(\omega\tau) + \tilde A \sin(\omega\tau) , \qquad (3.26d) $$

which we can find by solving Eqs. (3.26a) and (3.26b) for ã and b̃ in terms of Ã and B̃.

Equations (3.26a) and (3.26b) state that if random variables ã and b̃ take on the values a and b, then random variables Ã and B̃ must take on the values

$$ a \cos(\omega\tau) + b \sin(\omega\tau) $$

and

$$ b \cos(\omega\tau) - a \sin(\omega\tau) $$

respectively. Similarly, Eqs. (3.26c) and (3.26d) state that if random variables Ã and B̃ take on values A and B, then random variables ã and b̃ must take on the values

$$ A \cos(\omega\tau) - B \sin(\omega\tau) $$

and

$$ B \cos(\omega\tau) + A \sin(\omega\tau) $$

respectively.
