This textbook is for Singapore H2 Mathematics students (Singapore-Cambridge A-levels). [Free here: http://bit.ly/h2math] Covers both 9740 (old) and 9758 (revised) syllabuses. Includes ~300 exercises and all 2006-2015 A-level exam questions, with all worked solutions included. (Brief contents: I. Functions and Graphs. II. Sequences and Series. III. Vectors. IV. Complex Numbers. V. Calculus. VI. Probability and Statistics.)

Jul 25, 2016


H2 Mathematics Textbook

CHOO YAN MIN

Covers both 9740 & 9758 syllabuses.

Includes TYS & Answers.

The latest version will always be at this link.

This book is optimised for viewing in PDF format (click the above link).

Other existing formats are crude conversions and may be sub-optimal.

Recent changes: First complete draft done. Spent a few days checking for errors.

Upcoming changes: None planned.

www.EconsPhDTutor.com

With your help, I plan to keep improving this textbook.

The actual main content takes up only about 700+ pages. The other 700 pages are for things like front matter, TYS questions, appendices, reproductions of formula lists and syllabuses, and answers to exercises.

This book is licensed under the Creative Commons license CC-BY-NC-SA 4.0.

You are free to:

Share: copy and redistribute the material in any medium or format.

Adapt: remix, transform, and build upon the material.

Under the following terms:

Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

NonCommercial: You may not use the material for commercial purposes.

ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

Title: H2 Mathematics Textbook.

ISBN: 978-981-11-0383-4 (e-book).

- Paul Lockhart (2009, A Mathematician's Lament, p. 22).

more permanent than theirs, it is because they are made with ideas. ... Beauty is the first test: there is no permanent place in the world for ugly mathematics.

- G.H. Hardy (1940 [1967], A Mathematician's Apology, pp. 84-85).

The scientist does not study nature because it is useful to do so. He studies it because he takes pleasure in it, and he takes pleasure in it because it is beautiful.

- Henri Poincaré (1908 [1914], Science and Method, English trans., p. 22).

This textbook is for Singaporean H2 Maths students (hence the occasional Singlish and TLAs[1]). Of course, I hope that anyone else in the world will also find this useful!

I needed a definitive reference for my own teaching needs, but could find nothing satisfying. So I decided to just write my own textbook.

This textbook is based exactly on the old (9740) and revised (9758) syllabuses (also reproduced in Part VIII). Do check to make sure which exam you're taking. The revised syllabus (9758) is the same as the old syllabus (9740), but with noticeable chunks excised and is thus easier.[2]

Year   9740 (old) examined?       9758 (revised) examined?
2016   Yes.                       No.
2017   Yes, for the last time.    Yes, for the first time.
2018   No.                        Yes.

SYLLABUS ALERT

Where there are any differences between the old and revised syllabuses, I'll let you know with a yellow box like this.

FREE! This book is free. But if you paid any money for it, I certainly hope your money is going to me! This book is free because:

1. It is a shameless advertising vehicle for my awesome tutoring services.

2. The marginal cost of reproducing this book is zero.

DONATE! This book may be free, but donations are more than welcome! Donation methods in footnote.[3]

It's irrational for Homo economicus to donate. But please consider donating because:

1. You're a nice human being, [*emotional_manipulation*].

2. Your donations will encourage me and others to continue producing awesome free content for the world.

[1] Three Letter Abbreviations.

[2] Indeed, some chunks of the old syllabus (9740) have simply been moved into the syllabus of Further Maths (9649), which the authorities have kindly resurrected for the 2017 exam season.

[3] Singapore. POSB Savings Account 174052271 or OCBC Savings Account 5523016383 (Name: Choo Yan Min). International. Bitcoin wallet: 1GDGNAdGZhEq9pz2SaoAdLb1uu34LFwViz. Paypal ychoo@umich.edu (Name: Yan Min Choo, USD preferred because this account was set up in the US). USA. Venmo link (Name: Yanmin Choo).

1. There are any errors in this book. Please let me know even if it's something as trivial as a spelling mistake or a grammatical error.

2. You have absolutely any suggestions for improvement.

3. Any part of this book is less than crystal clear.

Here's an anecdote about Richard Feynman, the great teacher and physicist:

Feynman was once asked by a Caltech faculty member to explain why spin 1/2 particles obey Fermi-Dirac statistics. He gauged his audience perfectly and said, "I'll prepare a freshman lecture on it." But a few days later he returned and said, "You know, I couldn't do it. I couldn't reduce it to the freshman level. That means we really don't understand it."

I agree: If you can't explain something simply, you don't understand it well enough.[4] And as a corollary, the best way to gauge whether you understand something is to see if you can explain it simply to someone else.

If at any point in this textbook, you have read the same passage a few times, tried to reason it through, and still find things confusing, then it is a failure on MY part. Please let me know and I will try to rewrite it so that it's clearer. (There is also the possibility that I simply messed up! So please let me know if there's anything confusing!)

I deeply value any feedback, because I'd like to keep improving this textbook for the benefit of everyone! I am very grateful to all the kind folks who've already written in, allowing me to rid this book of more than a few embarrassing errors.

LyX rocks!

This book was written using LyX.[5]

Is the font size big enough?

You're probably reading this on some device. So I've tried to set the font sizes and stuff so that one can comfortably read this on a device as small as a seven-inch tablet. It should also be possible to read this on a phone, though somewhat less comfortably. (Please let me know if you have any feedback about this!)

(I'll probably be contacting some publishers to see if they want to do a print version of this, for anyone who prefers it in print.)

[4] This quote or some similar variant is often (mis)attributed to Einstein. But as Einstein himself once said, "73% of Einstein quotes are misattributed."

[5] LaTeX is the typesetting programme used by most economists and scientists. But LaTeX can be difficult to use. LyX is a user-friendly GUI version of LaTeX. LyX has boosted my productivity by countless hours over the years and you should use LyX too!

Read maths slowly.

Reading maths is not like reading Harry Potter. Most of Harry Potter is fluff. There is little fluff in maths.

So go slowly. Dwell upon and carefully consider every sentence in this textbook. Make sure you completely understand what each statement says and why it is true. Reading maths is very different from reading any other subject matter.

If you don't quite understand some material, you might be tempted to move forward anyway. Don't. In maths, later material usually builds on earlier material. So if you simply move forward, this will usually cost you more time and frustration in the long run.

Better then to stop right there. Keep working on it until you get it. Ask a friend or a teacher for help. Feel free to even email me! (I'm always interested to know what the common points of confusion are and how I can better clear them up.)

Examples and exercises are your best friends. So work through them.

A good stock of examples, as large as possible, is indispensable for a thorough understanding of any concept, and when I want to learn something new, I make it my first job to build one.

Work through all the examples and exercises. Merely moving your eyeballs is not the same as working. Working means having pencil and paper by your side and going through each example/exercise word-by-word, line-by-line.

For example, I might say something like "$x^2 - y^2 = 0$. Thus, $(x - y)(x + y) = 0$." If it's not obvious to you why the first sentence implies the second, stop right there and work on it until you understand why. Don't just let your eyeballs fly over these sentences and pretend that your brain is getting it.
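As a sketch of the kind of working you might do on paper here (this expansion is an illustration added for clarity, not a step printed in the original), the implication rests on the difference-of-squares identity, which can be checked by multiplying out the brackets:

```latex
\begin{align*}
(x - y)(x + y) &= x^2 + xy - yx - y^2 \\
               &= x^2 - y^2 .
\end{align*}
```

So if $x^2 - y^2 = 0$, then $(x - y)(x + y) = 0$ as well, which in turn means $x = y$ or $x = -y$.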

I will often not bother to explain some steps, especially if they simply involve some simple

algebra.

You get a List of Formulae during the A-level exam.

So there's no need to memorise all the formulae that are already on the list you're getting. Note that you get a different list depending on which exam you're taking: List of Formulae (MF15) for the old 9740 exam and List of Formulae (MF26) for the revised 9758 exam. (Both lists are reproduced in Part VIII of this book.) I cannot guarantee though that your JC will give you the List during your JC common tests and exams.

You've probably forgotten some (or most?) of it, but unfortunately, you are still assumed to know EVERYTHING from O-Level A Maths. See the lists near the end of either the 9740 (old) or the 9758 (revised) syllabus. Skim through and see if anything looks totally alien to you!

Some chapters (e.g. Chapters 5 and 26) in this textbook will give a quick review of some of the O-Level Maths material that you may have forgotten but which we'll use quite often.

Online Calculators

Google is probably the quickest for simple calculations. Type anything into your browser's Google search bar and the answer will instantly show up.

Wolfram Alpha is somewhat more advanced (but also slower). Enter "sin x" for example and you'll get graphs, the derivative, the indefinite integral, the Maclaurin series, and a bunch of other stuff you neither know nor care about.

The Derivative Calculator and the Integral Calculator are probably unbeatable for the specific purposes of differentiation and integration. Both give step-by-step solutions for anything you want to differentiate or integrate.

Here is a collection of spreadsheets I made. These spreadsheets are for doing tedious and repetitive calculations you'll often encounter in H2 maths (e.g. with vectors, complex numbers, etc.). As with anything I do, I welcome any feedback you may have about these spreadsheets. Perhaps in the future I will make a more attractive version of them.

(Instructions: Click "Make a copy" to open up your own independent copy of this spreadsheet. Enter your input in the yellow cells. Output is produced in the blue cells. If you mess up anything, simply click the same link and "Make a copy" again.)

There are way too many websites out there catering to primary, secondary, and lower-level undergraduate maths. Unfortunately, some of them can be awful and can get things wrong. Three resources I like (though they are probably a bit advanced for JC students) are:

1. Math StackExchange

A great resource where you can ask maths questions and often get them answered fairly promptly. Note though that this site is mostly frequented by fairly advanced students of maths (not to mention also mathematicians), so they can be pretty impatient and quick to downvote questions they perceive to be stupid. Nonetheless, if you make an effort to write down a carefully-crafted question and show also that you've made some effort to look for an answer (either on your own or online), they can be very helpful.[6]

2. ProofWiki gives succinct and rigorous definitions and proofs. Unfortunately it is very incomplete.

3. Mathworld.Wolfram is also great, but at times excessively encyclopaedic, at the cost of clarity and brevity.

And of course, you can find countless free maths textbooks online (some less legal than others). Two totally illegal[7] resources are: LibGen for books and SciHub for articles.[8] An old reliable is Bittorrent.

[6] There is an entire StackExchange family of websites. The flagship site is StackOverflow where you can ask any programming question and get it answered amazingly quickly.

[7] Well, depending on which jurisdiction you live in. Of course, in Singapore, unless told otherwise, you should assume that everything is illegal.

[8] Note though that these sites are constantly playing whac-a-mole with the fascist authorities so the URLs often change. If so, simply google to look up the current URLs.

Preface

Divide students into two extremes:

1. Type #1 is happy to get an A, even if this means learning absolutely nothing.

2. Type #2 would rather learn a lot, even if this means getting a C.

The good Singaporean is trained to view pragmatism as the highest virtue (and obedience second). She is thus also trained to be a Type #1 student (and indeed a Type #1 human being).

If you're a Type #1 student, then this textbook may not be the best use of your time (though you may still find the TYS and answers useful). Please use instead these three resources, which are provided with the efficient Type #1 student in mind:

- The H2 Mathematics CheatSheet, which contains all the formulae you'll ever need on two sides of a single A4 sheet of paper.[9]

- The H2 Maths Exercise Book (coming soon), which teaches you how to mindlessly apply formulae and give the correct answer to every exam question.

- My totally awesome tuition classes!

Of course, it is fully intended that this textbook (complemented by a capable teacher) will help any student get her A. But that for me is quite beside the point.

My broader goal in writing this textbook is to impart genuine understanding, or at least as much as is possible within the stultifying confines of the A-level syllabus.

Maths education in Singapore is at least every bit as stupid and boring and formulaic and mindless as in the US.[10] But at least the average US student has the consolation that only a very small portion of her life will have been squandered on mindless pseudo-maths.

The same cannot be said for the average Singaporean student. By the time she turns 18, she will have, just for the subject of maths alone, clocked many thousands of hours attending classes; doing homework; doing practice exam questions; doing assessment books, Ten Year Series; going to tuition classes; taking common tests, promos, prelims, one big exam after another; etc.

This textbook is for the Singaporean A-level student. So a good deal of mindless formulae is unavoidable. But at the same time, I try in this textbook to give the student a tiny glimpse of what maths really is: the art of explanation.[11] So for example, this textbook explains:

[9] Two things: (1) This CheatSheet does not include many of the formulae already printed in List MF26. (2) It is written for 9758 (revised) students (so 9740 students may find a few things missing).

[10] At least as described by Paul Lockhart, in A Mathematician's Lament.

[11] Lockhart, p. 29.

- Calculus. (To get an A, no understanding of these is necessary. Instead, one need merely know how to do differentiation and integration problems.)

- Why the Central Limit Theorem is so amazing. (To get an A, one need merely treat the CLT as yet another mysterious mathematical trick that helps solve exam questions. No appreciation of why it is wonderful is necessary.)

- A bit of intuition behind the Maclaurin series. (To get an A, it suffices to know how to mindlessly apply this strange formula that falls out from the sky.)

- Why it is terribly wrong to believe that a high correlation coefficient means a good model. (Yet this is exactly what the writers of the A-level exams seem to believe. See Section 73.9.)

Once upon a time, I had the misfortune of being a Singaporean JC student myself. I remember being deeply mystified by why the scalar (or dot) product, despite having a simple algebraic definition, could at the same time also tell us about the cosine of the angle between the two vectors. I never figured it out,[12] but it didn't matter, because this was simply yet another formula that we were required to know, for the sole purpose of answering exam questions.

I remember being confused about the difference between the sample mean, the mean of the sample mean, the variance of the sample mean, and the sample variance. But this confusion didn't matter, because once again, all we needed to do to get an A was to mindlessly apply formulae and algorithms. Monkey see, monkey do.

This textbook is thus partly in response to my unhappy and unsatisfactory experience as a maths student in Singapore. Almost all results are proven. I often try to supply the intuition for each result in the simplest possible terms. Many proofs are relegated to the appendices, but where a proof is especially simple and beautiful, I encourage the student to savour it by leaving it in the main text. In the rare instances where proofs are entirely omitted from this book (usually because they are too advanced), I make sure to clearly state so, lest the student wonder whether the result is supposed to be obvious.

Finally, I also hope that this textbook will serve as an authoritative resource to which teachers and students alike can refer.

[12] The internet was, at that time, not so well-developed, so one could not easily find answers online.

Tuition Ad

I give tuition for the following at any level:

Economics.

Mathematics.

I have a PhD in economics (University of Michigan) and have been teaching and tutoring since 2010.

www.EconsPhDTutor.com

Or email:

DrChooYanMin@gmail.com

Contents

About This Book
Preface

I Functions and Graphs

1 Sets
1.1 In and Not In
1.2 Greater than >, Less Than <, Positive > 0, and Negative < 0
1.3 Types of Numbers
1.11 Subset Of
1.13 Union
1.14 Intersection
2 Dividing By Zero
3 Functions
3.6 One-to-One Functions
3.7 Inverse Functions
3.9 Composite Functions
4 Graphs
5.1 Laws of Exponents
5.3 Absolute Value
6 Intercepts
7 Symmetry
7.3 Lines of Symmetry
8.2 Continuity
9 Differentiation
9.5 $\frac{d}{dx}$ Operator
10.2 The First Derivative Increasing/Decreasing Test
11 Extreme, Stationary, and Turning Points
11.2 Global Maximum and Minimum Points
11.3 Stationary and Turning Points
11.4 The Interior Extremum Theorem
11.5 How to Find Maximum and Minimum Points
12 Concavity, Inflexion Points, and the 2DT
12.2 Summary of Points and Venn Diagram
15 Transformations
15.2 $y = f(x + a)$
15.3 $y = af(x)$
15.4 $y = f(ax)$
15.5 Combinations of the Above
15.6 y = f (x)
15.7 y = f (x)
15.8 $y = \frac{1}{f(x)}$
16 Conic Sections
16.2 The Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$
16.3 The Hyperbola: $y = 1/x$
16.4 The Hyperbola $x^2 - y^2 = 1$
16.5 The Hyperbola $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$
16.6 The Hyperbola $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$
16.8 The Hyperbola $y = \frac{bx + c}{dx + e}$
16.9 The Hyperbola $y = \frac{ax^2 + bx + c}{dx + e}$
17 Simple Parametric Equations
18.1 $\frac{ax + b}{cx + d} > 0$
18.2 $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$
18.4 Systems of Equations

II Sequences and Series

19 Finite Sequences
19.2 Recurrence Relations
19.3 Creating New Sequences
20 Infinite Sequences
21 Series
22 Summation Notation
24 Geometric Sequences and Series
24.2 Infinite Geometric Sequences and Series
25 Proof by the Method of Mathematical Induction

III Vectors

26.2 Angles - Acute, Right, Obtuse, Straight, Reflex
26.3 Triangles - Acute, Right, Obtuse
26.4 Sine, Cosine, Tangent - Definitions
26.5 Sine, Cosine, Tangent - Values and Graphs
26.6 Formulae for Sine, Cosine, and Tangent
26.7 Arcsine, Arccosine, Arctangent
26.8 The Law of Sines and the Law of Cosines
27 Vectors in Two Dimensions (2D)
27.2 Sum, Additive Inverse, and Difference of Vectors
27.3 Displacement Vectors
27.4 Length (or Magnitude) of a Vector
27.5 Scalar Multiplication of a Vector
27.6 Unit Vectors
27.7 The Ratio Theorem
28 Scalar Product
28.2 Projection of One Vector on Another
28.3 Direction Cosines
29 Vectors in 3D
30 Vector Product
30.2 Areas of Triangles and Parallelograms
30.3 Vector Product in 3D
31 Lines
31.2 Lines on a 2D Plane: Vector to Cartesian Equations
31.3 Lines in 3D Space: Vector Equations
31.4 Lines in 3D Space: Vector to and from Cartesian Equations
31.5 Collinearity
32 Planes
32.2 Planes: Hessian Normal Form
33 Distances
33.2 Distance of a Point from a Plane
34 Angles
34.2 Angle between Two Lines (3D)
34.3 Angle between A Line and a Plane
34.4 Angle between Two Planes
35 Relationships between Lines and Planes
35.2 Relationship between a Line and a Plane
35.3 Relationship between Two Planes
35.4 Relationship between Three Planes

IV Complex Numbers

37.2 Multiplication
37.3 Division
38 Solving Polynomial Equations
38.2 The Fundamental Theorem of Algebra
38.3 The Complex Conjugate Roots Theorem
39 The Argand Diagram
39.2 Complex Numbers in Exponential Form
40 More Arithmetic of Complex Numbers
40.2 The Ratio of Two Complex Numbers
40.3 Sine and Cosine as Weighted Sums of the Exponential
41 Geometry of Complex Numbers
41.2 The Product and Ratio of Two Complex Numbers
41.3 Conjugating a Complex Number
42 Loci Involving Cartesian Equations
42.2 Lines
42.3 Intersection of Lines and Circles
43.2 Lines
43.3 Rays
43.4 Quick O-Level Revision: Properties of The Circle
44 De Moivre's Theorem
44.2 Roots of a Complex Number

V Calculus

45.2 Differentiation of Simple Parametric Functions
45.4 Connected Rates of Change Problems
45.5 Finding Max/Min Points on the TI84
45.6 Finding the Derivative at a Point on the TI84
46 The Maclaurin Series
46.2 Maclaurin Series
46.3 The Amazing Maclaurin Series
46.4 Finite-Order Maclaurin Series as Approximations
46.5 Product of Two Power Series
46.6 Composition of Two Functions
46.7 How the Maclaurin Series Works (Optional)
47.2 The Indefinite Integral is Unique Up to the C.O.I.
48 Integration Techniques
48.2 More Basic Rules of Integration
48.3 Trigonometric Functions
48.4 Integration by Substitution (IBS)
48.5 Integration by Parts (IBP)
49 The Fundamental Theorems of Calculus (FTCs)
49.2 The First Fundamental Theorem of Calculus (FTC1)
49.3 The Definite (or Riemann) Integral
50 Definite Integrals
50.2 Area between a Curve and a Line
50.3 Area between Two Curves
50.4 Area below the x-Axis
50.5 Area under a Parametrically-Defined Curve
50.7 Finding Definite Integrals on your TI84
51 Differential Equations
51.1 $\frac{dy}{dx} = f(x)$
51.2 $\frac{dy}{dx} = f(y)$
51.3 $\frac{d^2 y}{dx^2} = f(x)$
51.4 Word Problems

VI

511

512

52.2 How to Count: The Multiplication Principle . . . . . . . . . . . . . . . . . . . 516

52.3 How to Count: The Inclusion-Exclusion Principle . . . . . . . . . . . . . . . . 520

52.4 How to Count: The Complements Principle . . . . . . . . . . . . . . . . . . . . 522

53 How to Count: Permutations

523

53.2 Circular Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533

53.3 Partial Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536

53.4 Permutations with Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . 537

54 How to Count: Combinations

539

54.2 The Combination as Binomial Coefficient . . . . . . . . . . . . . . . . . . . . . 543

54.3 The Number of Subsets of a Set is 2^n . . . . . . . . . . . . . . . . . . . . . . . . 545

55 Probability: Introduction

547

55.2 The Experiment as a Model of Scenarios Involving Chance . . . . . . . . . . . 549

55.3 The Kolmogorov Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554

55.4 Implications of the Kolmogorov Axioms . . . . . . . . . . . . . . . . . . . . . . 555

56 Probability: Conditional Probability

557

56.2 Two-Boys Problem (Fun, Optional) . . . . . . . . . . . . . . . . . . . . . . . . . 564

57 Probability: Independence

566

57.2 Probability: Independence of Multiple Events . . . . . . . . . . . . . . . . . . . 573

574

58.2 The Birthday Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577

59 Random Variables: Introduction

578

59.2 X = k Denotes the Event {s ∈ S | X(s) = k} . . . . . . . . . . . . . . . . . . . . 580

59.3 The Probability Distribution of a Random Variable . . . . . . . . . . . . . . . 581

59.4 Random Variables Are Simply Functions . . . . . . . . . . . . . . . . . . . . . 584

60 Random Variables: Independence

586

589

61.2 The Expectation Operator is Linear . . . . . . . . . . . . . . . . . . . . . . . . 594

62 Random Variables: Variance

596

62.2 Standard Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604

62.3 The Variance Operator is Not Linear . . . . . . . . . . . . . . . . . . . . . . . . 605

62.4 The Definition of the Variance (Optional) . . . . . . . . . . . . . . . . . . . . . 607

63 The Coin-Flips Problem (Fun, Optional)

608

609

65 The Binomial Distribution

612

65.2 The Mean and Variance of the Binomial Random Variable . . . . . . . . . . 615

617

66.2 When is the Poisson Random Variable an Appropriate Model? . . . . . . . . 620

66.3 The Mean and Variance of the Poisson Random Variable . . . . . . . . . . . 623

66.5 The Sum of Two Independent Poisson R.V.s is a Poisson R.V. . . . . . . . . 627

67 The Continuous Uniform Distribution

630

67.2 The Cumulative Distribution Function (CDF) . . . . . . . . . . . . . . . . . . 633

67.3 Important Digression: P (X ≤ k) = P (X < k) . . . . . . . . . . . . . . . . . . . 634

67.4 The Probability Density Function (PDF) . . . . . . . . . . . . . . . . . . . . . 635

68 The Normal Distribution

636

68.2 Sum of Independent Normal Random Variables . . . . . . . . . . . . . . . . . 651

68.3 The Central Limit Theorem and The Normal Approximation . . . . . . . . . 655

68.3.1 Normal Approximation to the Binomial Distribution . . . . . . . . . 658

68.3.2 Normal Approximation to the Poisson Distribution . . . . . . . . . . 659

69 The CLT is Amazing (Optional)

660

69.2 Illustrating the Central Limit Theorem (CLT) . . . . . . . . . . . . . . . . . . 664

69.3 Why Are So Many Things Normally Distributed? . . . . . . . . . . . . . . . . 668

69.4 Don't Assume That Everything is Normal . . . . . . . . . . . . . . . . . . . . . 669

70 Statistics: Introduction (Optional)

675

70.2 Objectivists vs Subjectivists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

71 Sampling

678

71.2 Population Mean and Population Variance . . . . . . . . . . . . . . . . . . . . 679

71.3 Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 680

71.4 Distribution of a Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681

71.5 A Random Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682

71.6 Sample Mean and Sample Variance . . . . . . . . . . . . . . . . . . . . . . . . . 684

71.7 Sample Mean and Sample Variance are Unbiased Estimators . . . . . . . . . 690

71.8 The Sample Mean is a Random Variable . . . . . . . . . . . . . . . . . . . . . 693

71.10 Non-Random Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695

71.11 Stratified, Quota, and Systematic Sampling . . . . . . . . . . . . . . . . . . . . 696

72 Null Hypothesis Significance Testing (NHST)

701

72.2 The Abuse of NHST (Optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . 708

72.3 Common Misinterpretations of the Margin of Error (Optional) . . . . . . . . 709

72.4 Critical Region and Critical Value . . . . . . . . . . . . . . . . . . . . . . . . . . 712

72.5 Testing of a Population Mean (Small Sample, Normal Distribution, σ² Known) . . . . . . . . . . . . . . . . . 714

72.6 Testing of a Population Mean (Large Sample, Any Distribution, σ² Known) . . . . . . . . . . . . . . . . . . . 716

72.7 Testing of a Population Mean (Large Sample, Any Distribution, σ² Unknown) . . . . . . . . . . . . . . . . . 718

72.8 Testing of a Population Mean (Small Sample, Normal Distribution, σ² Unknown) . . . . . . . . . . . . . . . 720

72.9 Formulation of Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724

725

73.2 Product Moment Correlation Coefficient (PMCC) . . . . . . . . . . . . . . . . 727

73.3 Correlation Does Not Imply Causation (Optional) . . . . . . . . . . . . . . . . 733

73.4 Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734

73.5 Ordinary Least Squares (OLS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736

73.6 TI84 to Calculate the PMCC and the OLS Estimates . . . . . . . . . . . . . . 741

73.7 Interpolation and Extrapolation . . . . . . . . . . . . . . . . . . . . . . . . . . . 743

73.8 Transformations to Achieve Linearity . . . . . . . . . . . . . . . . . . . . . . . . 751

73.9 The Higher the PMCC, the Better the Model? . . . . . . . . . . . . . . . . . . 755

VII

Ten-Year Series

757

758

769

779

788

794

823

VIII

854

855

876

889

908

IX

921

Appendices (Optional)

922

84.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 923

84.3 Reflection in a Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 924

84.4 The Hyperbola y = (bx + c)/(dx + e) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926

(ax² + bx + c)/(dx + e) . . . . . . . . . . . . . . . . . . . . . . . . . . . 927

929

930

86.2 Scalar Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 931

86.3 The Ratio Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 932

86.4 Vector Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 933

86.5 2D Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 936

86.6 3D Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 937

87 Appendices for Part IV: Complex Numbers

943

946

88.2 Left- and Right-Sided Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 948

88.3 Infinite Limits and Vertical Asymptotes . . . . . . . . . . . . . . . . . . . . . . 949

88.4 Limits at Infinity, Horizontal, and Oblique Asymptotes . . . . . . . . . . . . 950

88.6 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 954

88.8 Differentiability Implies Continuity . . . . . . . . . . . . . . . . . . . . . . . . . 961

88.9 Maximum, Minimum, and Turning Points . . . . . . . . . . . . . . . . . . . . . 962

88.10 Concavity and Inflexion Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . 964

88.11 Concavity and Inflexion Points with Differentiability . . . . . . . . . . . . . . 966

88.12 Inverse Function Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 969

88.13 Parametric Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 970

88.14 Maclaurin Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 971

88.15 Product of Two Power Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 975

88.16 Composition of Two Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977

88.17 The Fundamental Theorems of Calculus . . . . . . . . . . . . . . . . . . . . . . 978

88.18 The Natural Logarithm and Euler's Number e . . . . . . . . . . . . . . . . . . 982

88.19 Euler's Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 984

89 Appendices for Part VI: Probability and Statistics

986

89.2 Circular Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 988

89.3 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 989

89.4 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 990

89.5 The Poisson Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 994

89.6 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 996

89.7 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 999

89.8 Null Hypothesis Significance Testing . . . . . . . . . . . . . . . . . . . . . . . . 1001

89.9 Calculating the Margin of Error . . . . . . . . . . . . . . . . . . . . . . . . . . . 1002

89.10 Correlation and Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . 1004

89.10.1 Deriving a Linear Model from the Barometric Formula . . . . . . . . . 1006

Answers to Exercises

1007

1008

90.2 Answers to Exercises in Ch. 2: Dividing by Zero . . . . . . . . . . . . . . . . . 1010

90.3 Answers to Exercises in Ch. 3: Functions . . . . . . . . . . . . . . . . . . . . . 1011

90.4 Answers to Exercises in Ch. 4. Graphs . . . . . . . . . . . . . . . . . . . . . . . 1017

90.5 Answers to Exercises in Ch. 5. Quick Revision . . . . . . . . . . . . . . . . . . 1020

90.6 Answers to Exercises in Ch. 6. Intercepts . . . . . . . . . . . . . . . . . . . . . 1022

90.7 Answers to Exercises in Ch. 7. Symmetry . . . . . . . . . . . . . . . . . . . . . 1023

90.8 Answers to Exercises in Ch. 8. Limits, Continuity, and Asymptotes . . . . . 1024

90.9 Answers to Exercises in Ch. 9. Differentiation . . . . . . . . . . . . . . . . . . 1025

90.10 Answers to Exercises in Ch. 11. Stationary, Maximum, Minimum, and Inflexion Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1027

90.11 Answers to Exercises in Ch. 14. Quadratic Equations . . . . . . . . . . . . . . 1035

90.12 Answers to Exercises in Ch. 15. Transformations . . . . . . . . . . . . . . . . 1036

90.13 Answers to Exercises in Ch. 16: Conic Sections . . . . . . . . . . . . . . . . . 1038

90.14 Answers to Exercises in Ch. 17. Simple Parametric Equations . . . . . . . . . 1050

90.15 Answers to Exercises in Ch. 18: Equations and Inequalities . . . . . . . . . . 1055

91 Answers to Exercises in Part II: Sequences and Series

1068

91.2 Answers for Ch. 20: Infinite Sequences . . . . . . . . . . . . . . . . . . . . . . . 1070

91.3 Answers for Ch. 22: Summation . . . . . . . . . . . . . . . . . . . . . . . . . . . 1071

91.4 Answers for Ch. 23: Arithmetic Sequences and Series . . . . . . . . . . . . . . 1072

91.5 Answers for Ch. 24: Geometric Sequences and Series . . . . . . . . . . . . . . 1073

91.6 Answers for Ch. 25: Proof by Induction . . . . . . . . . . . . . . . . . . . . . . 1074

1077

92.2 Answers for Ch. 29: Vectors in 3D . . . . . . . . . . . . . . . . . . . . . . . . . 1082

92.3 Answers for Ch. 30: Vector Product . . . . . . . . . . . . . . . . . . . . . . . . 1085

92.4 Answers for Ch. 31: Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1086

92.5 Answers for Ch. 32: Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1089

92.6 Answers for Ch. 33: Distances . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1091

92.7 Answers for Ch. 34: Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1098

92.8 Answers for Ch. 35: Relationships between Lines and Planes . . . . . . . . . 1101

93 Answers to Exercises in Part IV: Complex Numbers

1105

93.2 Answers for Ch. 37: Basic Arithmetic of Complex Numbers . . . . . . . . . . 1106

93.3 Answers for Ch. 38: Solving Polynomial Equations . . . . . . . . . . . . . . . 1108

93.4 Answers for Ch. 39: The Argand Diagram . . . . . . . . . . . . . . . . . . . . 1111

93.5 Answers for Ch. 40: More Arithmetic of Complex Numbers . . . . . . . . . . 1114

93.6 Answers for Ch. 41: Geometry of Complex Numbers . . . . . . . . . . . . . . 1117

93.7 Answers for Ch. 42: Loci Involving Cartesian Equations . . . . . . . . . . . . 1118

93.8 Answers for Ch. 43: Loci Involving Complex Equations . . . . . . . . . . . . . 1122

93.9 Answers for Ch. 44: De Moivre's Theorem . . . . . . . . . . . . . . . . . . . . 1125

94 Answers to Exercises in Part V: Calculus

1128

94.1 Answers for Ch. 45: Solving Problems Involving Differentiation . . . . . . . . 1128

94.2 Answers for Ch. 46: Maclaurin Series . . . . . . . . . . . . . . . . . . . . . . . 1130

94.3 Answers for Ch. 47: The Indefinite Integral . . . . . . . . . . . . . . . . . . . . 1133

94.4 Answers for Ch. 48: Integration Techniques . . . . . . . . . . . . . . . . . . . . 1134

94.5 Answers for Ch. 49: The Fundamental Theorems of Calculus . . . . . . . . . 1143

94.6 Answers for Ch. 50: Definite Integrals . . . . . . . . . . . . . . . . . . . . . . . 1144

94.7 Answers for Ch. 51: Differential Equations . . . . . . . . . . . . . . . . . . . . 1147

1152

95.1 Answers for Ch. 52: How to Count: Four Principles . . . . . . . . . . . . . . . 1152

95.2 Answers for Ch. 53: How to Count: Permutations . . . . . . . . . . . . . . . . 1155

95.3 Answers for Ch. 54: How to Count: Combinations . . . . . . . . . . . . . . . . 1157

95.4 Answers for Ch. 55: Probability: Introduction . . . . . . . . . . . . . . . . . . 1160

95.5 Answers for Ch. 56: Conditional Probability . . . . . . . . . . . . . . . . . . . 1164

95.6 Answers for Ch. 57: Probability: Independence . . . . . . . . . . . . . . . . . 1165

95.7 Answers for Ch. 59: Random Variables: Introduction . . . . . . . . . . . . . . 1166

95.8 Answers for Ch. 60: Random Variables: Independence . . . . . . . . . . . . . 1170

95.9 Answers for Ch. 61: Random Variables: Expectation . . . . . . . . . . . . . . 1171

95.10 Answers for Ch. 62: Random Variables: Variance . . . . . . . . . . . . . . . . 1173

95.11 Answers for Ch. 64: Bernoulli Trial and Distribution . . . . . . . . . . . . . . 1174

95.12 Answers for Ch. 65: Binomial Distribution . . . . . . . . . . . . . . . . . . . . 1175

95.13 Answers for Ch. 66: Poisson Distribution . . . . . . . . . . . . . . . . . . . . . 1176

95.14 Answers for Ch. 67: Continuous Uniform Distribution . . . . . . . . . . . . . 1178

95.15 Answers for Ch. 68: Normal Distribution . . . . . . . . . . . . . . . . . . . . . 1179

95.16 Answers for Ch. 71: Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1186

95.17 Answers for Ch. 72: Null Hypothesis Significance Testing . . . . . . . . . . . 1189

95.18 Answers for Ch. 73: Correlation and Linear Regression . . . . . . . . . . . . . 1194

96 Answers to Exercises in Part VII (2006-2015 A-Level Exams) 1198

96.1 Answers for Ch. 74: Functions and Graphs . . . . . . . . . . . . . . . . . . . . 1198

96.2 Answers for Ch. 75: Sequences and Series . . . . . . . . . . . . . . . . . . . . . 1225

96.3 Answers for Ch. 76: Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1255

96.4 Answers for Ch. 77: Complex Numbers . . . . . . . . . . . . . . . . . . . . . . 1269

96.5 Answers for Ch. 78: Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1292

96.6 Answers for Ch. 79: Probability and Statistics . . . . . . . . . . . . . . . . . . 1345

Part I

Sets

"The glory of [maths] is its complete irrelevance to our lives. That's why it's so fun!"

- Paul Lockhart (2009, A Mathematician's Lament, p. 38).

I have never done anything useful. No discovery of mine has made, or is likely to make,

directly or indirectly, for good or ill, the least difference to the amenity of the world.

- G.H. Hardy (1940 [1967], A Mathematician's Apology, p. 150).

The set is the basic building block of mathematics. Informally, a set is a container that

usually has some objects in it, but can sometimes also be empty.

Each object in a set is called an element (of that set).

Example 1. Let A = {3, 2 , Clementi Mall, Love, the colour green}.

Observations:

The name of a set is often an upper-case letter; in this case, it is A.

Mathematical punctuation marks called braces {} are used to denote a set. Listed within

these braces are the elements of the set.

Elements of the set are separated by commas (,). This mathematical punctuation mark means "and".

Thus, {3, 2 , Clementi Mall, Love, the colour green} is the set consisting of five elements, namely 3 and 2 and Clementi Mall and Love and the colour green.

Elements in a set can be almost anything whatsoever! In this example, the

elements included a building (Clementi Mall), an abstract notion (Love), and even a

colour (green). The elements of a set can even be other sets! But don't worry, in the

context of A-level maths, the elements of a set will almost always be numbers.

When we talk about a set, we refer to both the container itself and all the objects in

it.

Exercise 1. B is the set of the first 7 positive integers. Write down B in set notation.

(Answer on p. 1008.)

Exercise 2. C is the set of even prime numbers. Write down C in set notation. (Answer

on p. 1008.)

1.1

In and Not In

The mathematical punctuation mark ∈ means "is in", while ∉ means "is not in".

Example 2. Let B = {1, 2, 3, 4, 5, 6, 7}. Then 1 ∈ B, 2 ∈ B, 3 ∈ B, etc. You can read these statements aloud as "1 is in B", "2 is in B", "3 is in B", etc.

We can also write 1, 2, 3 ∈ B ("1, 2, and 3 are in B").

Also, 8 ∉ B, 9 ∉ B, 10 ∉ B, etc. ("8 is not in B", "9 is not in B", "10 is not in B", etc.).

We can also write 8, 9, 10 ∉ B ("8, 9, and 10 are not in B").

Example 3. Cow ∈ {Cow, Chicken} reads aloud as "Cow is in the set consisting of Cow and Chicken".

Cow, Chicken ∈ {Cow, Chicken} reads aloud as "Cow and Chicken are in the set consisting of Cow and Chicken".
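An optional sketch, not part of the A-level syllabus: Python's built-in sets have `in` and `not in` operators that behave like "is in" and "is not in". The variable names below are illustrative assumptions only, and strings stand in for the animals.

```python
# The set B from Example 2.
B = {1, 2, 3, 4, 5, 6, 7}

# "1 is in B" and "8 is not in B".
one_in_B = 1 in B
eight_not_in_B = 8 not in B

# "1, 2, and 3 are in B": every listed element must be in B.
all_in_B = all(x in B for x in (1, 2, 3))

# Example 3, with strings standing in for Cow and Chicken.
farm = {"Cow", "Chicken"}
cow_in_farm = "Cow" in farm

print(one_in_B, eight_not_in_B, all_in_B, cow_in_farm)
```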

1.2

Greater than >, Less Than <, Positive > 0, and Negative < 0

In this textbook:

"Greater than" means "strictly greater than" (>). So I won't bother saying "strictly", unless it's something I want to emphasise.

"Less than" means "strictly less than" (<).

If I want to say "greater than or equal to" (≥) or "smaller than or equal to" (≤), I'll say exactly that.

"Positive" means "greater than zero" (> 0).

"Negative" means "less than zero" (< 0).

"Non-negative" means "greater than or equal to zero" (≥ 0).

"Non-positive" means "less than or equal to zero" (≤ 0).

0 is neither positive nor negative. Instead, 0 is both non-negative and non-positive.

1.3

Types of Numbers

The following taxonomy lists the several types of numbers you'll encounter in this textbook.

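[Figure: taxonomy of numbers. Complex Numbers branch into Real Numbers and Imaginary Numbers; Real Numbers branch into Rational Numbers and Irrational Numbers; Rational Numbers branch into Integers and Non-Integers.]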

We'll study imaginary numbers only later on in Part IV of this textbook. For now, all numbers we'll consider are real numbers (or reals). We won't define what real numbers are. Instead, we'll simply assume (like in secondary school) that "everyone knows" what real numbers are.

Infinity (∞) and negative infinity (−∞) are NOT numbers. Informally, ∞ is "the thing" that is greater than any real number. Similarly, −∞ is "the thing" that is smaller than any real number. I repeat: INFINITY IS NOT A NUMBER.[13]

So what exactly are real numbers, infinity, and negative infinity? This is actually a fascinating question that mathematicians were able to answer satisfactorily only from the 19th

century, but is beyond the scope of the A-levels.

Definition 1. An integer is any one of these real numbers: . . . , −3, −2, −1, 0, 1, 2, 3, . . .

Definition 2. A rational number (or simply a rational) is any real number that can be expressed as the ratio of two integers (with a non-zero denominator). An irrational number (or simply an irrational) is any other real number.

Example 5. The number 1.87 is a rational and a real, but it is not an integer.

Example 6. The number π = 3.14159 . . . is an irrational and a real, but it is neither an integer nor a rational.
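An optional sketch of Definition 2 in Python, assuming the standard-library `fractions` module: a number is rational exactly when it can be stored as a ratio of two integers, as 1.87 can. The variable names are illustrative only.

```python
from fractions import Fraction

# 1.87 is the ratio of the integers 187 and 100, so it is rational.
x = Fraction(187, 100)

# A rational is an integer exactly when its denominator (in lowest terms) is 1.
x_is_integer = (x.denominator == 1)

# By contrast, 3 = 3/1 is both a rational and an integer.
three = Fraction(3, 1)
three_is_integer = (three.denominator == 1)

print(x, x_is_integer, three, three_is_integer)
```

No such pair of integers exists for an irrational number, so it cannot be represented exactly in this way.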

[13] Actually, the truth is somewhat more complicated. In certain special contexts in more advanced mathematics, infinity is treated as a number. But in this textbook, I'll keep it simple and insist that infinity is not a number.

1.4

The order in which we write out the elements of the set does not matter:

Definition 3. Two sets are equal (or identical) if both sets contain exactly the same elements.

Example 7. There are at least six equivalent ways to write the set of the 3 smallest positive

even numbers: {2, 4, 6} = {2, 6, 4} = {4, 2, 6} = {4, 6, 2} = {6, 2, 4} = {6, 4, 2}.

Example 8. {Cow, Chicken} = {Chicken, Cow}.
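As an optional illustration (not examinable), Python sets also ignore the order in which elements are written. The sketch below builds a set from each of the six orderings in Example 7 and checks that they are all equal; the names are assumptions made for this example.

```python
from itertools import permutations

# Build one set from each of the 3! = 6 orderings of 2, 4, 6.
orderings = [set(p) for p in permutations([2, 4, 6])]

# Every ordering yields the same set.
all_equal = all(s == {2, 4, 6} for s in orderings)

print(len(orderings), all_equal)
```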

1.5

Example 9. The set of the 3 smallest positive even numbers can be written as {2, 4, 6}. It

can also be written as: {2, 2, 4, 6} or {2, 6, 6, 6, 4, 4}. Repeated elements are simply ignored.

The notation n({2, 4, 6}) denotes the number of elements in the set of the 3 smallest positive even numbers. Hence, n({2, 4, 6}) = 3. And we also have n({2, 2, 4, 6}) = 3 and n({2, 6, 6, 6, 4, 4}) = 3.

Example 10. {Cow, Chicken} = {Cow, Cow, Chicken} = {Chicken, Cow, Chicken}. And

n({Cow, Chicken}) = n({Cow, Cow, Chicken}) = n({Chicken, Cow, Chicken}) = 2.

Note that more commonly, the number of elements in the set A is written as |A|. But for some reason, the A-level syllabus instead uses the notation n(A), so that's what we'll use.
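An optional sketch: in Python, repeated elements in a set literal are likewise ignored, and the built-in `len` plays the role of n(A). The variable names are illustrative assumptions.

```python
# The same set written with and without repeated elements (Example 9).
evens = {2, 4, 6}
with_repeats = {2, 6, 6, 6, 4, 4}

# Repeats are ignored, so the two sets are equal.
same_set = (evens == with_repeats)

# len(...) counts distinct elements, like n(...).
n_evens = len(evens)
n_with_repeats = len(with_repeats)

print(same_set, n_evens, n_with_repeats)
```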

Exercise 3. W = {Apple, Apple, Apple, Banana, Banana, Apple}. What is n(W )? (Answer on p. 1008.)

Exercise 4. C is the set of even prime numbers. What is n(C)? (Answer on p. 1008.)

1.6

The mathematical punctuation mark . . . is called the ellipsis and means "continue in the obvious fashion".

Example 11. D is the set of all odd positive integers smaller than 100. So in set notation, we can write D = {1, 3, 5, 7, 9, 11, . . . , 99}.

Example 12. T is the set of all negative integers greater than −100. So in set notation, we can write T = {−99, −98, −97, . . . , −2, −1}.

What is obvious to one person might not be obvious to another. So only use the ellipsis when you're confident it will be obvious to your reader! And never be shy to write a few more of the set's elements (as I did with the sets above)!
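One way to make "the obvious fashion" completely explicit is to spell out the rule that generates the elements. The optional Python sketch below does this for the sets D (odd positive integers smaller than 100) and T (negative integers greater than −100), using the built-in `range`; this is an illustration only.

```python
# D: start at 1, stop before 100, step by 2, giving 1, 3, 5, ..., 99.
D = set(range(1, 100, 2))

# T: start at -99, stop before 0, giving -99, -98, ..., -1.
T = set(range(-99, 0))

# Spot-check a few elements that the ellipsis is hiding.
print(7 in D, 51 in D, 50 in D)
print(-50 in T, 0 in T)
```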

Exercise 5. Let D and T be as in the above two examples. What are n(D) and n(T )?

(Answer on p. 1008.)

1.7

Example 13. Z⁺ is the set of all positive integers. So, Z⁺ = {1, 2, 3, . . . }. And since Z⁺ is infinite, we write n(Z⁺) = ∞.

Example 14. Z is the set of all integers. So, Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . . }. And since Z is infinite, we write n(Z) = ∞.

Obviously, for an infinite set, we cannot explicitly list out all of its elements. So we'll often use ellipses to help out, as we did in the above examples. Alternatively, we can use interval notation or set-builder notation, which we'll learn about shortly.

Exercise 6. H is the set of all prime numbers. Write down H in set notation. (Answer on

p. 1008.)

1.8

The following sets are so common that they have special symbols:

1. Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . . } is the set of all integers. (Z is for Zahl, German for "number".)

2. Q is the set of all rational numbers. (Q is for quoziente, Italian for "quotient".)

3. R is the set of all real numbers.

4. C is the set of all complex numbers. (To be studied only in Part IV of this textbook.)

To create a new set that contains only the positive (or negative) elements of the old set, append a superscript plus (⁺) or minus (⁻) to the name of a set:

1. Z⁺ = {1, 2, 3, . . . } is the set of all positive integers. Z⁻ = {. . . , −3, −2, −1} is the set of all negative integers.

2. Q⁺ is the set of all positive rational numbers. Q⁻ is the set of all negative rational numbers.

3. R⁺ is the set of all positive real numbers. R⁻ is the set of all negative real numbers.

As we'll learn later, there is no such thing as a positive or negative complex number. Hence, there is no such set named C⁺ or C⁻.

To add the number 0 to a set, append a subscript zero (₀) to its name:

Example 15. The set A = {3, 2 , Clementi Mall, Love, the colour green}. And so the set A₀ = {3, 2 , Clementi Mall, Love, the colour green, 0}.

Example 16. The set B = {1, 2, 3, 4, 5, 6, 7}. And so the set B₀ = {0, 1, 2, 3, 4, 5, 6, 7}.

Adding both a superscript ⁺ and a subscript ₀ to the name of a set creates a new set that contains all positive elements of the old set and in addition the number 0.

Similarly, adding both a superscript ⁻ and a subscript ₀ to the name of a set creates a new set that contains all negative elements of the old set and in addition the number 0.

Example 17. If V = {−2, −1, 3, 4}, then V⁺ = {3, 4}, V⁻ = {−2, −1}, V₀⁺ = {0, 3, 4}, and V₀⁻ = {−2, −1, 0}.
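The superscript and subscript rules can be sketched in Python as filters on a set. This optional illustration takes V to be {−2, −1, 3, 4}; the variable names are assumptions chosen to mirror the notation.

```python
V = {-2, -1, 3, 4}

# Superscript plus / minus: keep only the positive / negative elements.
V_plus = {x for x in V if x > 0}
V_minus = {x for x in V if x < 0}

# Subscript zero: additionally include the number 0.
V_zero_plus = V_plus | {0}
V_zero_minus = V_minus | {0}

print(V_plus, V_minus, V_zero_plus, V_zero_minus)
```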

Exercise 7. If U = {−1, 0, 2}, then what are U⁺, U⁻, U₀, U₀⁺, and U₀⁻? (Answer on p. 1008.)

1.9

The left-parenthesis: (

The right-parenthesis: )

The left-bracket: [

The right-bracket: ]

An interval is a (usually infinite) set of real numbers. It is written using parentheses and/or brackets. Let a and b be real numbers where b ≥ a. Then:

1. (a, b) is the set of all real numbers that are greater than a and smaller than b. Such a

set is also called an open interval.

Example 18. The set I = (0, 3) denotes the set of all real numbers that are greater than 0 and smaller than 3. So for example, √2 ≈ 1.41 ∈ I, but 0, 3 ∉ I.

2. [a, b] is the set of all real numbers that are greater than or equal to a and smaller than or equal to b. Such a set is also called a closed interval.

Example 19. The set J = [0, 3] denotes the set of all real numbers that are greater than or equal to 0 and smaller than or equal to 3. So for example, the numbers 0, 2, 3 ∈ J.

3. (a, b] is the set of all real numbers that are greater than a and smaller than or equal to

b. Such a set is also called a half-open interval or a half-closed interval.

Example 20. The set K = (0, 3] denotes the set of all real numbers that are greater than 0 and smaller than or equal to 3. So for example, the numbers 2, 3 ∈ K, but 0 ∉ K.

4. [a, b) is the set of all real numbers that are greater than or equal to a and smaller than

b. Such a set is also called a half-open interval or a half-closed interval.

Example 21. The set L = [0, 3) denotes the set of all real numbers that are greater than or equal to 0 and smaller than 3. So for example, the numbers 0, 2 ∈ L, but 3 ∉ L.
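The four kinds of interval differ only in whether each endpoint is included. The optional Python sketch below makes that explicit with a small helper, `in_interval`, which is a hypothetical function written for this illustration (it is not a standard library function).

```python
import math

def in_interval(x, a, b, closed_left, closed_right):
    """Return True if x lies in the interval from a to b.

    closed_left / closed_right say whether the endpoints a and b
    are themselves included (bracket) or excluded (parenthesis).
    """
    left_ok = (x >= a) if closed_left else (x > a)
    right_ok = (x <= b) if closed_right else (x < b)
    return left_ok and right_ok

# I = (0, 3): open at both ends.
root2_in_I = in_interval(math.sqrt(2), 0, 3, False, False)
zero_in_I = in_interval(0, 0, 3, False, False)

# J = [0, 3]: closed at both ends.
zero_in_J = in_interval(0, 0, 3, True, True)

# K = (0, 3] and L = [0, 3): half-open intervals.
three_in_K = in_interval(3, 0, 3, False, True)
three_in_L = in_interval(3, 0, 3, True, False)

print(root2_in_I, zero_in_I, zero_in_J, three_in_K, three_in_L)
```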

Exercise 8. How many elements does the set Z = [1, 1] contain? (Answer on p. 1008.)

Exercise 9. How many elements does the set Y = (1, 1) contain? (Answer on p. 1008.)

Exercise 10. How many elements does the set X = (1, 1.01) contain? (Answer on p. 1008.)

Exercise 11. Write down R, R⁺, R₀⁺, R⁻, and R₀⁻ in interval notation. (Answer on p. 1008.)

1.10

The empty set is literally the set that contains no elements. Hence the name!

Definition 4. The empty set is the set {}. It can also be denoted ∅.

Example 22. In 2016, the set of all Singapore Ministers who are younger than 30 is {} or ∅. This means that there is no Singapore Minister who is younger than 30.

Example 23. The set of all even prime numbers greater than 2 is {} or ∅. This means that there is no even prime number that is greater than 2.

Example 24. The set of numbers that are greater than 4 and smaller than 4 is {} or ∅. This means that there is no number that is simultaneously greater than 4 and smaller than 4.

As already mentioned, in this textbook (and also for the A-levels), the elements in a set will

almost always be numbers. But in general, the elements of a set can be (nearly) anything

whatsoever. In other words, a set really and simply is a container that can contain

(nearly) anything whatsoever.

Indeed, the elements of a set can be other sets, including even the empty set! Here are two

examples to illustrate:

Example 25. The set {∅} is not the same as the set ∅. The former is a set containing a single element, namely the empty set. The latter is the empty set. It is perhaps clearer if we rewrite them as

{∅} = {{}} and ∅ = {}.

Now we clearly see that {{}} ≠ {}.

Note that the set {∅} = {{}} is certainly not empty, because it contains a single element (namely the empty set).

Example 26. The set {∅, 1, {∅}} is the set containing three elements, namely the empty set, the number 1, and a set containing the empty set.
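The distinction between the empty set and a set containing the empty set can be sketched in Python. This optional illustration uses `frozenset`, because Python only allows immutable sets to be elements of other sets (and, as a Python quirk, `{}` on its own denotes an empty dictionary rather than an empty set). The names are illustrative assumptions.

```python
# The empty set.
empty = frozenset()

# The set whose single element is the empty set.
holds_empty = frozenset({empty})

# The first has 0 elements, the second has 1, so they are different sets.
n_empty = len(empty)
n_holds_empty = len(holds_empty)
different = (empty != holds_empty)

# Example 26: the empty set, the number 1, and a set containing the empty set.
example_26 = {empty, 1, holds_empty}
n_example_26 = len(example_26)

print(n_empty, n_holds_empty, different, n_example_26)
```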

1.11

Subset Of

Definition 5. A set A is a subset of a set B (denoted A ⊆ B) if every element in A is also in B.

Not surprisingly, A ⊈ B denotes that A is not a subset of B.

Example 27. Let M = {1, 2}, N = {1, 2, 3}, and O = {1, 2, 4, 5}. Then M ⊆ N, but N ⊈ M. Also, M ⊆ O, but O ⊈ M. Further, N ⊈ O and O ⊈ N.
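An optional sketch of Example 27: for Python sets, the operator `<=` tests "is a subset of". The variable names follow the example.

```python
M = {1, 2}
N = {1, 2, 3}
O = {1, 2, 4, 5}

# M is a subset of N, but N is not a subset of M.
M_sub_N = M <= N
N_sub_M = N <= M

# M is a subset of O; N and O are not subsets of each other.
M_sub_O = M <= O
N_sub_O = N <= O
O_sub_N = O <= N

print(M_sub_N, N_sub_M, M_sub_O, N_sub_O, O_sub_N)
```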

Exercise 12. State whether Z, Q, and R are subsets of each other. (Answer on p. 1008.)

Exercise 13. True or false: The set of currently-serving Singapore Prime Ministers is a

subset of the set of currently-serving Singapore Ministers. (Answer on p. 1008.)

The next fact is useful for showing that two sets are equal.

Fact 1. Two sets are subsets of each other ⟺ They are identical.

The symbol ⟺ stands for "is equivalent to" or "if and only if".

The above claim may be decomposed into two separate claims:

1. Two sets are subsets of each other ⟹ they are identical. (The symbol ⟹ stands for "implies" or "only if".)

2. Two sets are subsets of each other ⟸ they are identical. (The symbol ⟸ stands for "is implied by" or "if".)

Note importantly that A ⟹ B is different from its converse B ⟹ A. For example, x > 10 ⟹ x > 3, but it is certainly not the case that x > 3 ⟹ x > 10.
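The difference between a statement and its converse can be probed numerically. The optional Python sketch below tests both directions over a handful of integers; a finite check like this is not a proof, but a single counterexample is enough to refute the converse. The helper `implies` is a hypothetical function defined here for illustration.

```python
def implies(p, q):
    """'p implies q' is false only when p is true and q is false."""
    return (not p) or q

xs = range(-20, 21)

# x > 10 implies x > 3: holds for every x we try.
forward_holds = all(implies(x > 10, x > 3) for x in xs)

# x > 3 implies x > 10: fails for some x.
converse_holds = all(implies(x > 3, x > 10) for x in xs)

# The smallest counterexample in our range: greater than 3 but not greater than 10.
counterexample = next(x for x in xs if x > 3 and not x > 10)

print(forward_holds, converse_holds, counterexample)
```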

1.12

Proper Subset Of

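Definition 6. A set A is a proper subset of a set B (denoted A ⊂ B) if A is a subset of B and A ≠ B.

Not surprisingly, A ⊄ B denotes that A is not a proper subset of B.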

Example 28. Let M = {1, 2}, N = {1, 2, 3}, O = {1, 2, 4, 5}, and P = {1, 2, 3}. Then M ⊆ N, O, P and M ⊂ N, O, P. In contrast, N ⊆ P, but N ⊄ P; this is because N = P.
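An optional sketch of Example 28: for Python sets, `<` tests "is a proper subset of", while `<=` tests "is a subset of". The variable names follow the example.

```python
M = {1, 2}
N = {1, 2, 3}
O = {1, 2, 4, 5}
P = {1, 2, 3}

# M is a proper subset of each of N, O, and P.
M_proper_all = (M < N) and (M < O) and (M < P)

# N is a subset of P, but not a proper subset, because N equals P.
N_sub_P = N <= P
N_proper_P = N < P
N_equals_P = (N == P)

print(M_proper_all, N_sub_P, N_proper_P, N_equals_P)
```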

Exercise 14. Is the set of all squares (call it S) a proper subset of the set of all rectangles

(call it R)? (Answer on p. 1008.)

Exercise 15. Does A B imply that A B? (Answer on p. 1009.)

Exercise 16. Does A B imply that A B? (Answer on p. 1009.)

Exercise 17. True or false statement: If A is a subset of B, then A is either a proper

subset of or is equal to B. (Answer on p. 1009.)

Remark 1. The official A-level syllabus uses the symbol ⊆ to mean "subset of" and ⊂ to mean "proper subset of". So this is what we'll use in this textbook.

However, confusingly enough, many writers use the symbol ⊂ to mean "subset of" and ⊊ to mean "proper subset of". We will not follow such practice in this textbook. Just to let you know, in case you get confused while reading other mathematical texts!

1.13

Union

Definition 7. The union of the sets A and B (denoted A ∪ B) is the set of elements that are in A OR in B (or in both).

Tip: U for Union.

Example 29. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∪ U = {1, 2, 3, 4}, T ∪ V = {1, 2, 3}, and U ∪ V = {1, 2, 3, 4}. And T ∪ U ∪ V = {1, 2, 3, 4}.
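An optional sketch of Example 29: for Python sets, the operator `|` computes the union. The variable names follow the example.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

# Pairwise unions.
T_union_U = T | U
T_union_V = T | V
U_union_V = U | V

# The union of all three sets.
all_three = T | U | V

print(T_union_U, T_union_V, U_union_V, all_three)
```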

Exercise 18. Rewrite each of the following sets more simply: (a) [1, 2] ∪ [2, 3]. (b) (−∞, 3) ∪ [−16, 7). (c) {0} ∪ Z⁺. (Answer on p. 1009.)

Exercise 19. What is the union of the set of squares (S) and the set of rectangles (R)?

(Answer on p. 1009.)

Exercise 20. What is the union of the set of rationals (Q) and the set of irrationals?

(Answer on p. 1009.)

1.14 Intersection

Definition 8. The intersection of the sets A and B (denoted A ∩ B) is the set of elements that are in A AND B.

Definition 9. Two sets intersect if their intersection contains at least one element (i.e. A ∩ B ≠ ∅).

Definition 10. Two sets are mutually exclusive or disjoint if their intersection is empty (i.e. A ∩ B = ∅).

Example 30. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∩ U = ∅, T ∩ V = {1, 2}, and U ∩ V = {3}. And T ∩ U ∩ V = ∅.
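For readers who like to check such examples on a computer, here is a minimal sketch using Python's built-in set type. It assumes only the sets T, U, and V from the two examples above; the variable names (union_TU and so on) are made up for illustration.

```python
# The three sets used in the union and intersection examples.
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

# Union: elements that are in at least one of the sets (the | operator).
union_TU = T | U
union_TV = T | V
union_TUV = T | U | V

# Intersection: elements that are in all of the sets (the & operator).
inter_TU = T & U
inter_TV = T & V
inter_UV = U & V

# Two sets are disjoint exactly when their intersection is empty.
T_and_U_disjoint = T.isdisjoint(U)

print(union_TU, union_TV, union_TUV)
print(inter_TU, inter_TV, inter_UV, T_and_U_disjoint)
```

Note that Python prints the empty set as set(), because {} is reserved for an empty dictionary.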

Exercise 21. Rewrite each of the following sets more simply: (a) (4, 7] ∩ (6, 9). (b) [1, 2] ∩ [5, 6]. (c) (−∞, 3) ∩ (−16, 7). (Answer on p. 1009.)

Exercise 22. What is the intersection of the set of squares (S) and the set of rectangles

(R)? (Answer on p. 1009.)

Exercise 23. What is the intersection of the set of rationals (Q) and the set of irrationals?

(Answer on p. 1009.)

1.15 Set Minus ∖

The set minus (sometimes also called set difference) operator is very convenient. Sadly, it is not in the A-level syllabus and so I'll avoid using it in this textbook. Nonetheless, it's worth a quick mention.

Definition 11. A set minus B (denoted A ∖ B or A − B) is the set that contains every element in A that is not also in B.

Example 31. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∖ U = T, T ∖ V = ∅, and U ∖ V = {4}.
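As a quick check of the example above, here is a small Python sketch. It assumes the same sets T, U, and V; Python writes set minus with the ordinary minus sign.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

# Set minus: keep every element of the left set that is not in the right set.
T_minus_U = T - U   # nothing in T is in U, so T is unchanged
T_minus_V = T - V   # every element of T is also in V, so nothing is left
U_minus_V = U - V   # 3 is removed because it is in V

print(T_minus_U, T_minus_V, U_minus_V)
```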

1.16 Set Complement A′

Definition 12. The set complement of A (denoted A′ or Aᶜ) is the set of all elements that are not in A.

Example 32. Consider the set of positive integers. Let A = {2, 4, 6, 8, 10, . . . }. Then in this context, A′ = {1, 3, 5, 7, 9, 11, . . . }.

Example 33. Consider the set of all reals. Let A = R+. Then in this context, A′ = R−0 (the set of all non-positive reals).

Example 34. Consider the roll of a die. The set of desired outcomes was A = {1, 6}. Unfortunately, no desired outcome occurred. In other words, the actual outcome was an element in the set A′ = {2, 3, 4, 5}.

1.17 Set-Builder Notation

Set-builder notation is an alternative method of writing down sets. In the current context, the mathematical punctuation mark colon “:” will mean “such that”.

Example 35. The set {x ∈ R : x > 0} contains all x ∈ R such that x > 0. In words, this set contains all real numbers that are positive.

What comes after the colon are the conditions or criteria that x must satisfy, in order to qualify as a member of the set. Our sets will usually contain only numbers, but here's an example to show you how we can write down one particular set of musical artists.

Example 36. The set {x : x is an artist that has had a US Billboard Hot 100 #1 Single} contains all the artists who have ever had a US Billboard Hot 100 #1 Single.

It will however be more typical for our sets to be sets such as these:

Example 37. R+ = {x ∈ R : x > 0}, Q+ = {x ∈ Q : x > 0}, Z+ = {x ∈ Z : x > 0}, R+0 = {x ∈ R : x ≥ 0}, Q+0 = {x ∈ Q : x ≥ 0}, and Z+0 = {x ∈ Z : x ≥ 0}.
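Set-builder notation has a close cousin in programming. The sketch below uses a Python set comprehension, which reads almost word for word like the notation above. Since a computer cannot list infinitely many numbers, it assumes a small stand-in universe (the integers from −5 to 5); that universe and the variable names are illustrative choices only.

```python
# A finite stand-in for Z: the integers from -5 to 5 inclusive.
universe = range(-5, 6)

# {x in universe : x > 0}, read as "all x in the universe such that x > 0".
positive = {x for x in universe if x > 0}

# {x in universe : x >= 0}
non_negative = {x for x in universe if x >= 0}

print(sorted(positive))
print(sorted(non_negative))
```

The word "if" plays the role of the colon: what follows it is the condition that x must satisfy to qualify as a member.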

Remark 2. We use the colon “:” but some writers use instead the pipe “|”.

Exercise 24. Write down R−, Q−, Z−, R−0, Q−0, and Z−0 in set-builder notation. (Answer on p. 1009.)

Exercise 25. Write down (a, b), [a, b], (a, b], and [a, b) in set-builder notation. (Answer

on p. 1009.)

Exercise 26. Let X = {x : x is a living current or former Prime Minister of Singapore}.

Write down the set X so that all its elements are explicitly stated. (Answer on p. 1009.)

Exercise 27. Rewrite each of the following sets in set-builder notation: (a) (−∞, 3)

Dividing By Zero

This very brief chapter is to warn you against making a common mistake: dividing by 0. Students have little trouble avoiding this mistake if the divisor is obviously a big fat 0. Instead, students usually make this mistake when the divisor is an unknown constant or variable that might be 0.

Example 38. Find the values of x for which x(x − 1) = (2x − 2)(x − 1).

Here's the wrong solution: Divide both sides by x − 1 to get x = 2x − 2. So x = 2.

Here's the correct solution: Case #1. Suppose x − 1 = 0. Then the given equation is satisfied. So x = 1 is one possible value for which x(x − 1) = (2x − 2)(x − 1). Case #2. Now suppose x − 1 ≠ 0. So we can divide both sides by x − 1 to get x = 2x − 2. So x = 2.

Conclusion. The two possible values of x for which x(x − 1) = (2x − 2)(x − 1) are x = 1 and x = 2.
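As a cross-check, the same two values can be reached without dividing at all, by moving everything to one side and factorising. This is only an alternative route to the answer above, not a replacement for the two-case method:

```latex
\begin{align*}
x(x-1) &= (2x-2)(x-1) \\
x(x-1) - (2x-2)(x-1) &= 0 \\
(x-1)\bigl[x - (2x-2)\bigr] &= 0 \\
(x-1)(2-x) &= 0 \\
x = 1 \quad &\text{or} \quad x = 2.
\end{align*}
```

Because no division takes place, the value x = 1 cannot be lost along the way.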

Moral of the story. Whenever you divide by a certain quantity, make sure it's non-zero. If you're not sure whether it equals 0, then break up your analysis into two cases, as was done in the above example: Case #1: the quantity equals 0 (and see what happens in this case); Case #2: the quantity is non-zero (in which case you can go ahead and divide).

By the way, let's take this opportunity to clear up another popular misconception. You may have heard that 1/0 = ∞. This is wrong. Instead, any non-zero number divided by 0 is undefined.¹⁴ “Undefined” is the mathematician's way of saying, “You haven't told me what you are talking about. So what you are saying is meaningless.”¹⁵

Exercise 28. What's wrong with this proof that 1 = 0? (Answer on p. 1010.)

1. Let x, y be positive numbers such that x = y.

2. Square both sides: x² = y².

3. Rearrange: x² − y² = 0.

4. Factorise: (x − y)(x + y) = 0.

5. Divide both sides by x − y to get x + y = 0.

6. Since x = y, sub y = x into the above equation to get 2x = 0.

7. Divide both sides by 2x to get 1 = 0.

¹⁴ Once again, the truth is actually somewhat more complicated. Under certain special contexts in more advanced mathematics, 1/0 is well-defined. But in this textbook, I'll keep things simple and insist that 1/0 is undefined.

¹⁵ On the other hand, 0/0 is indeterminate, which means that it's typically undefined, but can sometimes be defined under certain circumstances.

Functions

almost every branch of modern mathematics functions turn out to be the central objects of

investigation.

- Michael Spivak (1994 [2006], Calculus, p. 39).

You are probably familiar from secondary school with such statements as: “Let f(x) = x + 8 be a function.” Strictly speaking, this is not the correct way of describing a function.

Here is a more precise definition of a function.¹⁶ A function consists of three pieces:

1. A set called the domain;

2. A set called the codomain; and

3. A mapping rule (or simply mapping or simply rule) which specifies how each and every

element in the domain is mapped (or assigned) to one (and exactly one) element in the

codomain.

Remark 3. The codomain is not the same thing as the range. We'll learn about the range only in the next section.

Altogether then, a function simply maps (or assigns) each element in the domain to one

(and exactly one) element in the codomain.

Example 39. Let f be the function whose ...

Domain is the set {Cow, Chicken};

Codomain is the set {Produces eggs, Produces milk, Guards the home}; and

Mapping rule is, informally, match the animal to its role.

According to the mapping rule, Cow (in the domain) is mapped to Produces milk

(in the codomain) and Chicken (in the domain) is mapped to Produces eggs (in the

codomain). Every element in the domain is mapped to exactly one element in the codomain.

This is indeed a function, because it has a domain, codomain, and a correctly-specified

mapping rule.

¹⁶ This definition is still informal. See Definition 135 in the Appendices for the exact, formal definition (optional).

Domain is the set {1, 2};

Codomain is the set {1, 2, 3, 4, 5}; and

Mapping rule is, informally, multiply by 2.

According to the mapping rule, 1 (in the domain) is mapped to 2 (in the codomain)

and 2 (in the domain) is mapped to 4 (in the codomain). Every element in the domain

is mapped to exactly one element in the codomain.

This is indeed a function, because it has a domain, codomain, and a correctly-specified

mapping rule.

Domain is the set R;

Codomain is the set R; and

Mapping rule is, informally, round off to the nearest integer, where half-integers are

rounded up

According to the mapping rule, 3 (in the domain) is mapped to 3 (in the codomain),

3.14159 (in the domain) is mapped to 3 (in the codomain), 3.5 (in the domain) is

mapped to 4 (in the codomain), and 3.88 (in the domain) is mapped to 4 (in the

codomain). Every element in the domain is mapped to exactly one element in the codomain.

This is indeed a function, because it has a domain, codomain, and a correctly-specified

mapping rule.
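The rounding rule above can be sketched in Python as follows. One caution: Python's built-in round() rounds half-integers to the nearest even integer (so round(2.5) gives 2), which is not the rule described here. The sketch therefore uses math.floor(x + 0.5) instead; the function name round_half_up is made up for illustration.

```python
import math


def round_half_up(x: float) -> int:
    """Map x to the nearest integer, with half-integers rounded up."""
    return math.floor(x + 0.5)


# The four inputs discussed in the example above.
values = [3, 3.14159, 3.5, 3.88]
rounded = [round_half_up(v) for v in values]
print(rounded)
```

Each input is sent to exactly one output, which is what makes this a legitimate mapping rule. (For decimals extremely close to a half-integer, floating-point error can affect the result, so treat this as a sketch rather than a robust implementation.)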


3.1

The function f : D → C is defined by f : x ↦ f(x) for all x ∈ D.

Or alternatively:

The function f : D → C is defined by f(x) = ... for all x ∈ D.

This says that the function's name is f, its domain is D, and its codomain is C. The last bit “f : x ↦ f(x)” is the mapping rule and this mapping rule applies to all x ∈ D (all elements in the domain).

To save ourselves a bit of writing, if it's clear from the context that we're talking about the function f, then we'll omit “f :” from the front of the mapping rule. Also, if the mapping rule applies universally to all elements of the domain, then we also omit the “for all x ∈ D” at the end.

Altogether then, we will often simply write:

The function f : D → C is defined by x ↦ f(x).

We will sometimes also denote the domain and codomain of f by Dom(f) and Cod(f).

The function f : {1, 2} → {1, 2, 3, 4, 5} is defined by x ↦ 2x. Or alternatively, the function f : {1, 2} → {1, 2, 3, 4, 5} is defined by f(x) = 2x.

This says that the function has:

Name f ;

Domain {1, 2};

Codomain {1, 2, 3, 4, 5}; and

Mapping rule: Map every element x in the domain to the element 2x in the codomain.

In the context of set-builder notation (Section 1.17), the mathematical punctuation mark colon “:” stood for “such that”. However, in the context of functions, the colon “:” stands instead for “from”. Unfortunately there are only so many symbols and punctuation marks, so invariably some symbols will have to play more than one role!

The mathematical punctuation mark “→” (right arrow) simply stands for “to”.

Altogether then, “f : D → C” reads as “f is the function from domain D to codomain C”.

The mathematical punctuation mark “↦” stands for “maps to”. Hence, “x ↦ f(x)” reads as “x is mapped to f(x)”.

The function f : {Cow, Chicken} → {Produces eggs, Produces milk, Guards the home} is defined by Cow ↦ Produces milk and Chicken ↦ Produces eggs. Or alternatively, the function f : {Cow, Chicken} → {Produces eggs, Produces milk, Guards the home} is defined by f(Cow) = Produces milk and f(Chicken) = Produces eggs.

This says that the function has

Name f ;

Domain {Cow, Chicken};

Codomain {Produces eggs, Produces milk, Guards the home}; and

Mapping rule: Map the element Cow in the domain to the element Produces milk in

the codomain and the element Chicken in the domain to the element Produces eggs

in the codomain.

The function f : R → R is defined by x ↦ Integer closest to x.

This says that the function has

Name f ;

Domain R;

Codomain R;

Mapping rule: Map every element x in the domain to the closest integer in the codomain.

f and f (x) refer to two different things.

f denotes a function. f (x) denotes the value of f at x.

This may seem like an excessively pedantic distinction. But maths is precise and pedantic.

In maths, what we mean is precisely what we say and what we say is precisely what we

mean. There is never any room for ambiguity or alternative interpretations.

More examples:

The function f : [0, 1] → R is defined by f(x) = 3x + 4.

This says that the function's name is f, its domain is [0, 1] (the set of all reals between 0 and 1, including 0 and 1), its codomain is R (the set of all reals), and its mapping rule is that we map each element x in the domain to the element 3x + 4 in the codomain. The value of f at 0.5 is f(0.5) = 3(0.5) + 4 = 5.5.

What is f(3)? It is not 3(3) + 4 = 13. This is because 3 is not in the domain of f. Hence, f(3) is simply undefined.
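The point that the domain is part of the function can be made concrete in code. The following is a minimal Python sketch of the function with domain [0, 1] and rule f(x) = 3x + 4; raising a ValueError for inputs outside the domain is an illustrative design choice, standing in for "undefined".

```python
def f(x: float) -> float:
    """f : [0, 1] -> R defined by f(x) = 3x + 4."""
    if not 0 <= x <= 1:
        # Outside the domain, f has no value at all.
        raise ValueError(f"{x} is not in the domain [0, 1], so f({x}) is undefined")
    return 3 * x + 4


value_at_half = f(0.5)

try:
    f(3)
    f3_is_defined = True
except ValueError:
    f3_is_defined = False

print(value_at_half, f3_is_defined)
```

The formula 3x + 4 could of course be evaluated at 3, but the function f, as defined, simply does not accept 3 as an input.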

The function f : R+ → R is defined by f(x) = ln x.

This says that the function's name is f, its domain is R+ (the set of all positive reals), its codomain is R (the set of all reals), and its mapping rule is that we map each element x in the domain to the element ln x in the codomain. The value of f at 2 is f(2) = ln 2 ≈ 0.693.

f(0) is simply undefined, because 0 is not in the domain of f. Likewise, f(a) is undefined, for any a < 0.

Exercise 29. For each of the following functions, write down the value of the function at 1. (a) The function f : R → R is defined by x ↦ x + 1. (b) The function g : [−1, 1] → R is defined by x ↦ 17x. (c) The function h : Z+ → R is defined by x ↦ 3^x. (d) The function i : Z− → R is defined by x ↦ 3^x. (Answer on p. 1011.)

3.2

This section simply repeats and emphasises what was already said above.

Example 44. Say we have ...

Domain {Cow, Chicken};

Codomain {Produces eggs, Produces milk, Guards the home}; and

Mapping rule: Chicken is mapped to Produces eggs.

Can we define a function using the above domain, codomain, and mapping rule?

No. The reason is that the mapping rule fails to specify what Cow (an element of the

domain) should be mapped to. It thus fails the requirement that every element in the

domain be mapped to an element in the codomain.

Example 45. Say we have ...

Domain {Cow, Chicken};

Codomain {Produces eggs, Produces milk, Guards the home}; and

Mapping rule: Cow is mapped to both Produces milk and Guards the home; and

Chicken is mapped to Produces eggs.

Can we define a function using the above domain, codomain, and mapping rule?

No. The reason is that the mapping rule maps Cow (an element of the domain) to more

than one element in the codomain. It thus fails the requirement that every element in the

domain be mapped to exactly one element in the codomain.

Example 46. Say we have ...

Domain R;

Codomain [0, 1]; and

Mapping rule: x ↦ x + 1.

Can we define a function using the above domain, codomain, and mapping rule?

No. The reason is that the mapping rule fails to map some elements in the domain (e.g.

14) to any element in the codomain. It thus fails the requirement that every element in the

domain be mapped to an element in the codomain.

Domain R;

Codomain R; and

Mapping rule: x ↦ ±x.

Can we define a function using the above domain, codomain, and mapping rule?

No. The reason is that the mapping rule maps every non-zero element in the domain (e.g. 14) to more than one element in the codomain (+14 and −14). It thus fails the requirement that every element in the domain be mapped to exactly one element in the codomain.
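For finite sets, the requirement "every element of the domain is mapped to exactly one element of the codomain" can be checked mechanically. The sketch below is one way to do so in Python. The helper name is_function and the choice to store a rule as a dictionary from inputs to sets of outputs are assumptions made purely for illustration, so that faulty rules (no output, or several outputs) can be written down at all.

```python
def is_function(domain, codomain, rule):
    """Return True if `rule` maps every element of `domain` to exactly one
    element of `codomain`.

    `rule` is a dict sending each input to the SET of outputs assigned to it.
    """
    codomain = set(codomain)
    for x in domain:
        images = rule.get(x, set())
        if len(images) != 1:          # unmapped, or mapped to several elements
            return False
        if not images <= codomain:    # the single image lies outside the codomain
            return False
    return True


animals = {"Cow", "Chicken"}
roles = {"Produces eggs", "Produces milk", "Guards the home"}

good_rule = {"Cow": {"Produces milk"}, "Chicken": {"Produces eggs"}}
missing_cow = {"Chicken": {"Produces eggs"}}                 # as in Example 44
two_roles = {"Cow": {"Produces milk", "Guards the home"},
             "Chicken": {"Produces eggs"}}                   # as in Example 45

print(is_function(animals, roles, good_rule))
print(is_function(animals, roles, missing_cow))
print(is_function(animals, roles, two_roles))
```

The two failure checks in the loop correspond to the two ways a rule failed in the examples of this section: an element left unmapped (or mapped outside the codomain), and an element mapped to more than one element.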

For Exercises 30-37: (i) State (yes/no) whether we can define a function using the given

domain, codomain, and rule. (ii) Explain why or why not. (iii) If we can, then write down

the function in formal notation.

Exercise 30. Let the domain be {5, 6, 7}, the codomain be Z+, and the mapping rule be x ↦ 2x. (Answer on p. 1011.)

Exercise 31. Let the domain be {0, 3}, the codomain be {3, 4}, and the mapping rule be (informally) “any larger number will work”. (Answer on p. 1011.)

Exercise 32. Let the domain be {2, 4}, the codomain be {3, 4}, and the mapping rule be (informally) “any smaller number will work”. (Answer on p. 1011.)

Exercise 33. Let the domain be {1}, the codomain be {1}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 1011.)

Exercise 34. Let the domain be {1}, the codomain be {1, 2}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 1011.)

Exercise 35. Let the domain be {1, 2}, the codomain be {1}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 1011.)

Exercise 36. Let the domain be R, the codomain be R, and the mapping rule be x ↦ √x. (Answer on p. 1011.)

Exercise 37. Let the domain be R, the codomain be R, and the mapping rule be x ↦ 1/x. (Answer on p. 1011.)

Exercise 38. How might you change the domain in Exercise 36 so that a function can be

defined? (Answer on p. 1012.)

Exercise 39. How might you change the domain in Exercise 37 so that a function can be

defined? (Answer on p. 1012.)


3.3

Definition 13. A function of a real variable is any function whose domain is a subset of

R.

Altogether then, a real-valued function of a real variable is any function both of whose

domain and codomain are subsets of R.

Example 48. Consider the functions f : R → R, g : R → R, and h : R → R defined by x ↦ x². All three are real-valued functions, functions of a real variable, and thus also real-valued functions of a real variable.

Consider the function i : {Cow, Chicken} → Z defined by Cow ↦ 5 and Chicken ↦ 32. This is a real-valued function, but not a function of a real variable. Thus, it is not a real-valued function of a real variable.

Consider the function j : Z → {Cow, Chicken} defined by x ↦ Cow if x is odd and x ↦ Chicken if x is even. This is a function of a real variable, but not a real-valued function.

Almost all functions considered in H2 Maths are real-valued functions of a real variable. So we'll see plenty of functions like f, g, and h from the above example, but rarely (if ever) will we see functions like i or j.

In this textbook, unless otherwise clearly-stated, it may be assumed that all functions are

real-valued functions of a real variable.


3.4

Informally, the range of a function f : D → C is the set of elements in the codomain C that are “hit” by the function. Formally:

Definition 15. The range of a function f : D → C is f(D) = {y ∈ C : There is x ∈ D such that f(x) = y}.

The range of f may be denoted Range(f) or f(D) (if D is the domain of f).

The range is not the same thing as the codomain. Because this is such a common misconception, let me repeat: the range is not the same thing as the codomain.

Indeed, the range is usually a proper subset of the codomain, as is the case in each of the following examples.

Example 49. Define f : [0, 1] → R by x ↦ x + 1. Then Range(f) = f([0, 1]) = [1, 2].

Example 50. Define f : {2, 3} → R by x ↦ x + 1. Then Range(f) = f({2, 3}) = {3, 4}.

Example 51. Define f : R → R by x ↦ e^x. Then Range(f) = f(R) = R+.

The range is often a proper subset of the codomain, but sometimes they can be equal:

Example 52. Define f : R+ → R by x ↦ ln x. Then Range(f) = f(R+) = R = Cod(f).
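When the domain is a finite set, the range can be computed directly from its definition: apply the mapping rule to every element of the domain and collect the results. Here is a minimal Python sketch. The helper name image is made up, and the second function (domain {1, 2}, codomain {1, 2, 3, 4, 5}, rule x ↦ 2x) is the one used earlier in this chapter.

```python
def image(f, domain):
    """Return the set of values f(x) as x runs over a finite domain."""
    return {f(x) for x in domain}


# Example 50: f : {2, 3} -> R defined by x -> x + 1.
range_ex50 = image(lambda x: x + 1, {2, 3})

# Domain {1, 2}, codomain {1, 2, 3, 4, 5}, rule x -> 2x.
codomain = {1, 2, 3, 4, 5}
range_double = image(lambda x: 2 * x, {1, 2})

# "<" on Python sets tests for a PROPER subset.
range_is_proper_subset = range_double < codomain

print(range_ex50, range_double, range_is_proper_subset)
```

In the second case the codomain contains 1, 3, and 5, none of which is hit, so the range is a proper subset of the codomain.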

(Answer on p. 1012.)

(Answer on p. 1012.)

Exercise 42. Which of the following statements is/are true? (a) The range of any function

is a subset of its domain. (b) The range of any function is a subset of its codomain. (c)

The range of any function is a proper subset of its codomain. (Answer on p. 1012.)


3.5

Let k ∈ R be a constant. Then we can create the function f + g in the obvious fashion. We can also create the functions f − g, f ⋅ g, f/g, and kf in the obvious fashions.¹⁷

The symbol ⋅ is an alternative symbol for multiplication. We will often prefer using ⋅ rather than × because there is the slight risk of confusing × with the letter x.

As we shall see, fg shall refer to a function that is entirely different from f ⋅ g, so we must really be careful to write f ⋅ g when that is what we mean.

Example 53. Let f : R → R be defined by x ↦ 7x + 5 and g : R → R be defined by x ↦ x³. Let k = 2. Then

f + g is the function with domain R, codomain R, and mapping rule x ↦ 7x + 5 + x³;

f − g is the function with domain R, codomain R, and mapping rule x ↦ 7x + 5 − x³;

f ⋅ g is the function with domain R, codomain R, and mapping rule x ↦ (7x + 5) ⋅ x³;

f/g is the function with domain R− ∪ R+, codomain R, and mapping rule x ↦ (7x + 5)/x³; and

kf is the function with domain R, codomain R, and mapping rule x ↦ 2(7x + 5).

We can of course give these five new functions new names (perhaps a single-letter name for each), but this is not necessary. We can simply write:

(f + g)(1) = 7(1) + 5 + 1³ = 13,

(f − g)(1) = 7(1) + 5 − 1³ = 11,

(f ⋅ g)(1) = [7(1) + 5] ⋅ 1³ = 12,

(f/g)(1) = [7(1) + 5]/1³ = 12,

(kf)(1) = 2[7(1) + 5] = 24,

where the pairs of parentheses around each of the five new functions are just to be clear that we are talking about a single, fully-fledged function.
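The idea that f + g, f − g, and so on are themselves single, fully-fledged functions can be sketched in Python, where a function can be built from other functions and then called like any other. The helper names (add, sub, mul, div, scale) are illustrative assumptions; f, g, and k are those of Example 53.

```python
def f(x):
    return 7 * x + 5


def g(x):
    return x ** 3


k = 2


# Each helper takes functions as input and returns a NEW function.
def add(f, g):
    return lambda x: f(x) + g(x)


def sub(f, g):
    return lambda x: f(x) - g(x)


def mul(f, g):
    return lambda x: f(x) * g(x)


def div(f, g):
    return lambda x: f(x) / g(x)   # fails wherever g(x) = 0


def scale(k, f):
    return lambda x: k * f(x)


values_at_1 = [add(f, g)(1), sub(f, g)(1), mul(f, g)(1), div(f, g)(1), scale(k, f)(1)]
print(values_at_1)
```

Calling div(f, g)(0) raises a ZeroDivisionError, which mirrors the fact that 0 must be excluded from the domain of f/g.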

¹⁷ Formally (writing A for the domain of f and B for the domain of g), f + g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f(x) + g(x). Similarly, f − g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f(x) − g(x). f ⋅ g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f(x)g(x). f/g is the function with domain {x : x ∈ A ∩ B, g(x) ≠ 0}, codomain R, and mapping rule x ↦ f(x)/g(x). The set (A ∩ B) ∖ {x : g(x) = 0} is the set of all elements x that are in both A and B, excluding those for which g(x) = 0. This exclusion is necessary, otherwise f(x)/g(x) may sometimes not be well-defined. Finally, kf is simply the function with domain A, codomain R, and mapping rule x ↦ kf(x).

3.6 One-to-One Functions

Informally, a function is one-to-one (or invertible) if every element in its range is “hit” exactly once (by exactly one element in the domain). Put another way: every element y in the range corresponds to exactly one element in the domain. Formally:

Definition 16. A function f : D → C is one-to-one (or invertible) if for every y ∈ f(D), there is only one x ∈ D such that f(x) = y.

Example 54. Consider the function f whose domain is the set {Cow, Chicken}, codomain is the set {Produces eggs, Produces milk, Guards the home}, and mapping rule is Cow ↦ Produces milk and Chicken ↦ Produces eggs.

The range is {Produces eggs, Produces milk}.

This function is one-to-one because each element in the range is hit exactly once, as we

can easily verify: Produces eggs is hit once by Chicken and Produces milk is hit once

by Cow.

To check whether this function is one-to-one, we need to show that every element y in the range corresponds to exactly one element x in the domain. To this end, let's pick any element y in the range and write: y = x + 1 ⟺ y − 1 = x.

Thus, indeed, this function is one-to-one: every element y in the range corresponds to exactly one element y − 1 in the domain.

To show that a function is not one-to-one, simply give a counter-example:

Example 56. Let f : R → R be defined by x ↦ x². The range of f is R+0.

This function is not one-to-one: for example, the element 9 in the range is “hit” twice, once by −3 and again by 3.
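On a finite domain, the "hit exactly once" test can be carried out by brute force: compute every output and see whether any output repeats. The sketch below does this in Python; the helper name is_one_to_one and the sample domain (the integers from −3 to 3) are assumptions for illustration. For a function on all of R, a finite sample like this can expose a counter-example, but it can never prove that the function is one-to-one.

```python
def is_one_to_one(f, domain):
    """Return True if no two elements of a finite domain share an output."""
    outputs = [f(x) for x in domain]
    return len(set(outputs)) == len(outputs)


sample = range(-3, 4)   # the integers -3, -2, ..., 3

square_is_one_to_one = is_one_to_one(lambda x: x ** 2, sample)
plus_one_is_one_to_one = is_one_to_one(lambda x: x + 1, sample)

print(square_is_one_to_one, plus_one_is_one_to_one)
```

The squaring rule fails on this sample because 9 is hit by both −3 and 3, exactly as in the counter-example above.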

Remark 4. One-to-one or invertible functions are also known as injective functions (or simply injections), but we won't use this term in this textbook.

Exercise 43. State and explain whether each of the following functions is one-to-one. (a) f : R+0 → R is defined by x ↦ √x. (b) g : R+0 → R is defined by x ↦ x². (c) h : R → R is defined by x ↦ |x|. (d) i : R+0 → R is defined by x ↦ |x|. (e) j : R → R is defined by x ↦ sin x. (Answer on p. 1013.)

3.7 Inverse Functions

defined by the mapping rule f⁻¹(y) = x ⟺ y = f(x).

Only invertible functions have inverse functions. If a function is not invertible, then its inverse function simply does not exist.

Given a one-to-one (or invertible) function f, to find its inverse function f⁻¹, follow these steps:

1. Dom(f⁻¹) = Range(f).

2. Cod(f⁻¹) = Dom(f).

3. Write down an expression f⁻¹(y) that involves only y and show that f⁻¹(y) = x ⟺ y = f(x).
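For a function on a finite domain, the three steps above can be sketched in a few lines of Python. The function is stored as a dictionary (input to output), and the helper name inverse is made up for illustration. The cow-and-chicken function from earlier in this chapter serves as the example.

```python
def inverse(mapping):
    """Return the inverse of a one-to-one function stored as a dict.

    Raises ValueError if some output is hit more than once, since a function
    that is not one-to-one has no inverse function.
    """
    inv = {}
    for x, y in mapping.items():
        if y in inv:
            raise ValueError("not one-to-one, so no inverse function exists")
        inv[y] = x          # f_inverse(y) = x exactly when y = f(x)
    return inv


f = {"Cow": "Produces milk", "Chicken": "Produces eggs"}
f_inv = inverse(f)

# Step 1: Dom(f_inverse) = Range(f).  Step 2: the values of f_inverse lie in Dom(f).
domain_of_inverse = set(f_inv.keys())
values_of_inverse = set(f_inv.values())

print(f_inv)
```

Note that "Guards the home" is in the codomain of f but not in its range, so it is not in the domain of the inverse.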

one-to-one. So its inverse function f⁻¹ exists. Let's find it.

1. f has range [1, 2]. So f⁻¹ has domain [1, 2].

2. f has domain [0, 1]. So f⁻¹ has codomain [0, 1].

3. Pick any element y in the range of f and write: y = f(x) ⟺ y = x + 1 ⟺ x = y − 1, so that f⁻¹(y) = y − 1.

We'll actually only formally talk about graphs in the next few chapters. But for now, as a visual aid, I'll provide the graphs of f⁻¹ (blue) and f (red) anyway.

Observe that f⁻¹ is simply the reflection of f in the line y = x (dotted). Section 7.2 (in particular Fact 7) will explain why exactly this is so.

[Figure: graphs of f(x) and f⁻¹(x) plotted against x.]

Example 58. You can verify for yourself that the function f : R → R defined by x ↦ 2x is one-to-one. Let's find its inverse function f⁻¹.

1. f has range R. So f⁻¹ has domain R.

2. f has domain R. So f⁻¹ has codomain R.

3. As usual, let's pick any element y in the range of f and write: y = f(x) ⟺ y = 2x ⟺ x = 0.5y, so that f⁻¹(y) = 0.5y.

[Figure: graphs of f(x) and f⁻¹(x) plotted against x.]

Example 59. You can verify for yourself that the function f : R+ ∪ R− → R defined by x ↦ 1/x is one-to-one. Let's find its inverse function f⁻¹.

1. f has range R+ ∪ R−. So f⁻¹ has domain R+ ∪ R−.

2. f has domain R+ ∪ R−. So f⁻¹ has codomain R+ ∪ R−.

3. As usual, let's pick any element y in the range of f and write: y = f(x) ⟺ y = 1/x ⟺ 1/y = x (∵ y ≠ 0).

So f⁻¹ has mapping rule y ↦ 1/y.

(Note that ∵ is the shorthand symbol for “because”. Similarly, ∴ is the shorthand symbol for “therefore”.)

The condition here that y ≠ 0 is important and goes back to our warning in Chapter 2 (Dividing by Zero). We know for sure that the range of f does not contain 0. This is why in the last line above, we can safely divide both sides of the equation by y.

[Figure: graphs of f(x) and f⁻¹(x) plotted against x.]

Example 60. You can verify for yourself that the function f : R+0 → R defined by x ↦ x² is one-to-one. Let's find its inverse function f⁻¹.

1. f has range R+0. So f⁻¹ has domain R+0.

2. f has domain R+0. So f⁻¹ has codomain R+0.

3. As usual, let's pick any element y in the range of f and write: y = f(x) ⟺ y = x² ⟺ ±√y = x.

Here there are two possibilities for the mapping rule of f⁻¹, namely y ↦ √y and y ↦ −√y. We must pick one. We know that the domain of f and hence the codomain of f⁻¹ is R+0, so the mapping rule of f⁻¹ must be y ↦ √y.

[Figure: graphs of f(x) and f⁻¹(x) plotted against x.]

Exercise 44. Find the inverse function for each of the following functions. (a) f : R+0 → R defined by x ↦ √x. (b) g : [−0.5, 0.5] → R defined by x ↦ sin x. (c) h : R → R defined by x ↦ x³. (Answers on p. 1014.)

3.8

We saw that some functions are not one-to-one (i.e. they are non-invertible). For these functions, an inverse function simply does not exist.

Nonetheless, we can often transform a non-invertible function into an invertible function.

One way to do this is by restricting the domain. The new invertible function will then have

an inverse function.

Example 61. We saw in Exercise 43 that the function j : R → R defined by x ↦ sin x was not one-to-one. However, we can restrict the domain to [−0.5, 0.5] to get a brand new function g : [−0.5, 0.5] → R defined by x ↦ sin x. This brand new function g is identical to the original function j except for its domain. g is one-to-one, as you should verify for yourself.

We can thus go ahead and construct the inverse function g⁻¹. Actually, we already did this in Exercise 44.

not one-to-one. However, we can restrict the domain to R+0 to get a brand new function g : R+0 → R defined by x ↦ x². This brand new function g is identical to the original function f except for its domain. g is one-to-one, as we verified in Exercise 43.

We can thus go ahead and construct the inverse function g⁻¹. I leave this as an exercise for you.

There is almost always more than one way to restrict the domain of a non-invertible function

to obtain an invertible function. Indeed, a trivial case would be where we restrict its domain

to be the empty set! In which case the function thus formed would certainly be invertible,

though not very interesting (it would have an empty domain and an empty range, and so too would its inverse function).

Exercise 45. (a) Show that the function f : (−∞, 1) ∪ (1, ∞) → R defined by x ↦ 1/(x − 1)² is not one-to-one. (b) Show that by restricting its domain to (1, ∞), we can create a new invertible function g (you must prove that this new function is invertible). (c) Then find the inverse function g⁻¹. (Answer on p. 1015.)

Exercise 46. For the function f in Example 62, let's instead restrict the domain to [20, 30]. Show that the new function thus obtained is one-to-one and find its inverse. (Answer on p. 1015.)

3.9 Composite Functions

Definition 18. Let f and g be functions such that the range of g is a subset of the domain of f. Then the composite function fg is the function with the same domain as g, the same codomain as f, and mapping rule x ↦ f(g(x)).

The composite function fg can be read aloud as “f circle g” and is sometimes denoted f ∘ g, especially when we want to make clear that we are not talking about f ⋅ g. But we'll rarely use the f ∘ g notation, unless there is some risk of confusion with f ⋅ g.

The condition in Definition 18 is important: The range of g must be a subset of the domain of f in order for the composite function fg to exist. This condition ensures that given any x from the domain of g, the value g(x) is itself also in the domain of f, so that f(g(x)) is well-defined.

If this condition fails, then the composite function fg simply does not exist.

Example 63. The functions g, f : R → R are defined by g : x ↦ x + 1 and f : x ↦ 2x. The range of g is R; this is indeed a subset of the domain of f (which is R). So the composite function fg : R → R exists and is defined by x ↦ f(g(x)) = 2(g(x)) = 2(x + 1).

Let's try computing fg(2). We can use the definition of a composite function: fg(2) = f(g(2)) = f(2 + 1) = f(3) = 6.

Alternatively, we can directly use fg(x) = 2(x + 1) to compute fg(2) = 2(2 + 1) = 6.

Notice that for the composite function fg, we apply the function g first before applying the function f. So for example, to compute, say, fg(7), we compute g(7) first, then compute f(g(7)). (A common mistake by students is to instinctively read from left to right, and so apply f first before g.)
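The order of application is easy to test in code. Below is a minimal Python sketch using the two functions from Example 63 (g adds 1, f doubles); the helper name compose is an assumption for illustration.

```python
def compose(f, g):
    """Return the composite function fg, i.e. x -> f(g(x)). g is applied FIRST."""
    return lambda x: f(g(x))


def g(x):
    return x + 1


def f(x):
    return 2 * x


fg = compose(f, g)   # x -> 2(x + 1)
gf = compose(g, f)   # x -> 2x + 1, a different function

print(fg(2), gf(2))
```

The two results differ, which shows that fg and gf are in general different functions and that reading left to right gives the wrong answer.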

range of g is R+0; this is indeed a subset of the domain of f (which is R). So the composite function fg : R → R exists and is defined by x ↦ f(g(x)) = g(x) + 1 = x² + 1.

Let's try computing fg(3). We can use the definition of a composite function: fg(3) = f(g(3)) = f(3²) = f(9) = 10.

Alternatively, we can directly compute, using fg(x) = x² + 1: fg(3) = 3² + 1 = 10.

defined by x ↦ ln x. The range of g is R, which is not a subset of the domain of f (which is R+). Hence, the composite function fg simply does not exist.

We saw that if f is non-invertible, then its inverse function f⁻¹ simply does not exist. Nonetheless, we could restrict its domain to create a new invertible function g, whose inverse function g⁻¹ we could then write down.

By analogy, suppose we have functions f and g where g's range is not a subset of f's domain. Thus, the composite function fg simply does not exist. But we can play a similar trick: We can restrict the domain of g to create a new function ĝ, so that the range of ĝ is a subset of f's domain. We can then write down the composite function fĝ.

Fortunately, this is not in the syllabus, so you don't need to know how to do this. Yay!

Exercise 47. For each of the following pairs of functions f and g, verify that the composite function fg exists and write it out in full. Also, compute fg(1) and fg(2). (a) The functions g, f : R → R defined by g : x ↦ x² + 1 and f : x ↦ e^x. (b) The functions g, f : R → R defined by g : x ↦ e^x and f : x ↦ x² + 1. (c) The functions g, f : R− ∪ R+ → R defined by g : x ↦ 1/(2x) and f : x ↦ 1/x. (d) The functions g, f : R− ∪ R+ → R defined by g : x ↦ 1/x and f : x ↦ 1/(2x). (Answer on p. 1016.)

We can of course also build a composite function out of a single function.

Example 66. The function f : R → R is defined by x ↦ 2x. The range of f is R and this is indeed a subset of the domain of f (which is R). So the composite function ff : R → R exists and is defined by x ↦ f(f(x)) = 2f(x) = 2(2x) = 4x. And so for example ff(3) = 2(2 ⋅ 3) = 12.

The composite function ff can instead be written as f². So in the above example, we'd write f²(3) = 12.

We can, analogously, define the composite function ff² and denote it f³. Using the above example, f³(x) = 8x and f³(3) = 24.

Of course, there are also f⁴, f⁵, etc.
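Repeated composition is just "apply the function again and again", which a short loop captures. Here is a minimal Python sketch using the doubling function from Example 66; the helper name power is made up for illustration.

```python
def power(f, n):
    """Return f composed with itself n times (n >= 1)."""
    def composed(x):
        for _ in range(n):
            x = f(x)      # apply f once more to the running value
        return x
    return composed


def f(x):
    return 2 * x


f2 = power(f, 2)   # x -> 4x
f3 = power(f, 3)   # x -> 8x

print(f2(3), f3(3))
```

Each extra application of f doubles the result once more, which is why the value at 3 goes from 6 to 12 to 24.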

www.EconsPhDTutor.com

Remark 5. The official A-level syllabus uses $f^2$ to mean the composite function $ff$ and nothing else. So this is what we'll do in this textbook.

But confusingly enough, some writers use the symbol $f^2$ to mean the second derivative of $f$, $f^3$ to mean the third derivative of $f$, etc. We won't follow such practice. This is just to let you know, in case you read other mathematical texts and get confused.

However, we will use $f^{(3)}$ to mean the third derivative of $f$, $f^{(4)}$ to mean the fourth derivative of $f$, etc. This will show up occasionally in Part V (Calculus).

Exercise 48. For each of the following functions $f$, verify that the composite function $f^2$ exists and write it out in full. Also, compute $f^2(1)$ and $f^2(2)$. (a) The function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto e^x$. (b) The function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 3x + 2$. (c) The function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 2x^2 + 1$. (Answer on p. 1016.)

Graphs

An ordered pair is a mathematical object. Like a set of two objects, an ordered pair is, informally, a container with two objects, where the objects are listed out with a comma separating them.

The only difference between a set of two objects and an ordered pair is that order

matters for the latter.

To distinguish an ordered pair from a set of two objects, we use parentheses (instead of

braces).

Example 67. (Cow, Chicken) is an ordered pair. (5, 4) is an ordered pair.

We also refer to (a, b) as ordered set notation. So (Cow, Chicken) and (5, 4) are both

examples of ordered pairs, written out in ordered set notation.

Example 68. Let (Cow, Chicken) and (Chicken, Cow) be ordered pairs. Let {Cow,

Chicken} and {Chicken, Cow} be sets.

Recall that for sets, order did not matter. Hence, {Cow, Chicken} = {Chicken, Cow}.

In contrast, for ordered pairs, order does matter. And so (Cow, Chicken) $\neq$ (Chicken, Cow).

Definition 19. An ordered pair of real numbers is any (x, y) where both x, y are real.

Example 69. (5, 4), (1, 1), and (2, 3) are all ordered pairs of real numbers.

Confusingly, above in Section 1.9 (Intervals), we said that $(-5, 4)$ was a set, namely $\{x \in \mathbb{R} : -5 < x < 4\}$. Here we say instead that $(-5, 4)$ is an ordered pair, consisting of two objects ($-5$ and $4$), the order of which matters.

Unfortunately this is yet another bit of confusing notation you'll have to live with. You'll have to learn to tell, from the context, whether $(-5, 4)$ is a set of infinitely many real numbers or an ordered pair. But don't worry, this is usually pretty obvious.

Definition 20. In any ordered pair of real numbers, the first real is called the x-coordinate

and the second is the y-coordinate.

Definition 21. The cartesian plane is the set of all ordered pairs of real numbers.

In set-builder notation, the cartesian plane can be written as $\{(x, y) : x \in \mathbb{R}, y \in \mathbb{R}\}$. This reads aloud as "the cartesian plane is the set of ordered pairs of real numbers $(x, y)$".

In this textbook, we'll usually only ever look at ordered pairs of real numbers.

Hence, rather than say "ordered pair of real numbers", we'll simply say "ordered pair". And so whenever you see the notation $(x, y)$, it should be understood that this is an ordered pair of real numbers (and not cows or chickens).

And so instead of writing the cartesian plane as $\{(x, y) : x \in \mathbb{R}, y \in \mathbb{R}\}$, we'll simply write it as $\{(x, y)\}$, with the understanding that $x, y$ are reals.

In the present context, we'll also simply call any ordered pair of real numbers a point.

(Later on, in the context of three-dimensional geometry, points will also refer to ordered

triples of real numbers.)

Definition 22. In the context of the cartesian plane, the origin is the point (0, 0).

Points are usually given lower-case letters as names.


We can illustrate the cartesian plane graphically. The horizontal axis corresponds to the

x-coordinate of the points and is thus also called the x-axis. The vertical axis corresponds

to the y-coordinate of the points and is thus also called the y-axis.

Example 70. The points (or ordered pairs of real numbers) a = (5, 4), b = (1, 1), and

c = (2, 3) are illustrated graphically on the cartesian plane:

[Figure: the points $a$, $b$, and $c$ plotted on the cartesian plane, with horizontal axis $x$.]

Example 71. The set of three points {a, b, c} = {(5, 4), (1, 1), (2, 3)} is a graph.


The graph of a function $f$ is simply the set of points $(x, y)$ that satisfy $x \in D$, $y \in C$, and $f(x) = y$. Formally:

Definition 24. The graph of a function $f: D \to C$ is the set $\{(x, y) : x \in D, y \in C, y = f(x)\}$.

Given a function that is named with a lower-case letter, we will often use the upper-case version of that same letter to denote that function's graph. So for example, given the function $f$, we often give its graph the name $F$.

Example 72. Consider the function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^2$. Its graph may be written as $F = \{(x, y) : y = x^2\}$.

[Figure: the graph of $f$, with horizontal axis $x$ and vertical axis $f(x)$, $y$.]

We've defined graph as a noun. But at the slight risk of confusion, we'll also use it as a verb that means "draw in the cartesian plane a given set of points". So we can say either "we draw the graph of $f$" (graph as a noun), or "we graph $f$" (graph as a verb).

Definition 25. Given a function $f: D \to C$, a point of the function is any element of its graph. That is, it is any ordered pair $(x, f(x))$, where $x \in D$.

To use the above example, we say that $(2, 4)$ and $(5, 25)$ are both points of $f$.

But since $x$ determines $f(x)$, it is nice but not necessary to specify the complete ordered pair $(x, f(x))$. Instead, we can refer to the point simply as $x$. So in the above example, we can simply say that 2 and 5 are both points of $f$, with the understanding that what we really mean is "$(2, 4)$ and $(5, 25)$ are both points of $f$". This is a bit sloppy and carries some risk of confusion, but will save us a lot of messy notation.
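To make the idea of "a point of a function" concrete, here is a small Python sketch. It assumes the function from Example 72 (the squaring function); the helper name `is_point_of` is hypothetical and introduced only for this illustration.

```python
def f(x: float) -> float:
    return x ** 2  # the function of Example 72

def is_point_of(func, point, tol: float = 1e-12) -> bool:
    """Check whether the ordered pair (x, y) lies on the graph of func,
    i.e. whether y equals func(x) up to a small tolerance."""
    x, y = point
    return abs(func(x) - y) <= tol

print(is_point_of(f, (2, 4)))   # True
print(is_point_of(f, (5, 25)))  # True
print(is_point_of(f, (2, 5)))   # False
```

Since the $x$-coordinate pins down the $y$-coordinate, calling `f(2)` recovers the full point $(2, 4)$, which is why the text can get away with calling 2 itself a point of $f$.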

So in the context of functions, $x$ does double duty. It can either refer to an element in the function's domain OR it can refer to a point of the function.

On exams though, it is probably safer to simply list out the full co-ordinates, whenever you're referring to a point. Just in case your marker is damn niao.

We just learnt about the graph of a function. A graph of an equation is very similarly

defined:

Definition 26. Given an equation involving x and y as the only two variables, the graph

of the equation is the set of points (x, y) that satisfy the equation.

Example 73. The graph below is of the equation $x^2 + y^2 = 1$. It is simply the set $\{(x, y) : x^2 + y^2 = 1\}$.

[Figure: the circle $x^2 + y^2 = 1$, with centre $(0, 0)$ and a typical point $p = (x, y)$ marked.]

Exercise 49. (a) Can the equation $x^2 + y^2 = 1$ be rewritten into the form of a single function? (b) Can it be rewritten into the form of two functions? (Answer on p. 1017.)

Exercise 50. Draw the graphs of each of the following equations. (a) $y = e^x$. (b) $y = 3x + 2$. (c) $y = 2x^2 + 1$. (Answers on pp. 1017, 1018, and 1019.)

4.1

You are required to know how to use a graphing calculator to graph a given function.

Pretty bizarre that in this age of the smartphone, they want you to learn how to use these

clunky and now-useless devices from the 80s and 90s. It is the equivalent of having to

learn how to program a VCR.

This textbook will give only a very few examples involving graphing calculators.

There is no better way of learning to use it than to play around with it yourself. By the

time you sit down for your A-level exams, you will have had plenty of practice with it.

You can also use any of the seven calculators in the list below (last updated by SEAB on March 1st, 2016, PDF). But this textbook will stick with the TI-84 PLUS Silver Edition (which I'll simply call the TI84).¹⁸

I'll always start each example with the calculator freshly reset.

Example 74. Graph the function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^2$.

1. Press ON to turn on your calculator.

2. Press Y= to bring up the Y= editor.

3. Press X,T,θ,n to enter X; then x² to enter the squared symbol.

4. Now press GRAPH and the calculator will graph $y = x^2$.

[Screenshots: the calculator display after each of Steps 1 to 4.]

¹⁸ My understanding is that most students use a TI calculator and that the five approved TI calculators are pretty similar.

The TI84 requires that we enter equations in a form where $y$ is directly expressed in terms of $x$. But there is no way to do this here. As explained in Exercise 49, there is no way to rewrite this equation into a single function.

However, we can rewrite it into two functions: namely, $f: [-1, 1] \to \mathbb{R}$ defined by $x \mapsto \sqrt{1 - x^2}$ and $g: [-1, 1] \to \mathbb{R}$ defined by $x \mapsto -\sqrt{1 - x^2}$. We can thus tell our calculator to graph two separate equations: $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.
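As a quick sanity check of this two-function trick, the sketch below assumes the equation being graphed is the circle $x^2 + y^2 = 1$ (the equation of Exercise 49). The names `upper` and `lower` are hypothetical labels for the two branches.

```python
import math

def upper(x: float) -> float:
    """Upper branch: y = sqrt(1 - x^2)."""
    return math.sqrt(1 - x ** 2)

def lower(x: float) -> float:
    """Lower branch: y = -sqrt(1 - x^2)."""
    return -math.sqrt(1 - x ** 2)

# Sample x-values across the shared domain [-1, 1].
xs = [-1 + 0.25 * k for k in range(9)]
points = [(x, upper(x)) for x in xs] + [(x, lower(x)) for x in xs]

# Every sampled point from either branch should satisfy x^2 + y^2 = 1.
on_circle = all(math.isclose(x * x + y * y, 1.0) for x, y in points)
print(on_circle)  # True
```

Neither branch alone covers the whole circle: `upper` never returns a negative value and `lower` never returns a positive one, which is why both must be entered into the calculator.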

1. Press ON to turn on your calculator.

2. Press Y= to bring up the Y= editor.

Most buttons on the TI84 have three different roles. If you simply press a button, then the TI84 executes the role that is printed on the button itself. If you press the blue 2ND and then a button, then the TI84 executes the role printed in blue above the button. And if you press the green ALPHA and then a button, then the TI84 executes the role printed in green above the button.

3. Press the blue 2ND button and then the x² button (whose 2ND role is √), then enter 1 − X². Altogether you will have entered $\sqrt{1 - X^2}$.

4. Now press ENTER and the blinking cursor will move down, to the right of Y2 =.

[Screenshots: the calculator display after each of Steps 1 to 9.]

5. Press the (-) button. (Warning: This is different from the − button. If you use the − button, you will get an error message when you try to generate your graphs later.) Now repeat what we did in Step 3 above: press the blue 2ND button and then the x² button (whose 2ND role is √), then enter 1 − X². Altogether you will have entered $-\sqrt{1 - X^2}$.

6. Now press GRAPH and the calculator will graph both $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.

Notice the graphs are very small. To zoom in:

7. Press the ZOOM button to bring up a menu of ZOOM options.

8. Press 2 to select the Zoom In option. Nothing seems to happen. But now press ENTER and the TI84 will zoom in a little for you.

We expected to see a perfect circle. Instead we get an elongated oval. What's going on? The reason is that by default, the x- and y-axes are scaled differently. To set them to the same scale:

9. Press the ZOOM button again to bring up the ZOOM menu of options. Press 5 to select the ZSquare option. Nothing seems to happen. But now press ENTER and the TI84 will adjust the x- and y-axes to have the same scale. And now we have a perfect circle.

Here's a super quick revision of some O-Level Maths we'll be using. If you have severe difficulty with these exercises, you should go back and review your O-Level Maths material!

5.1

Laws of Exponents

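For all real numbers $x$, $y$, $a$, and $b$ (provided any denominators are non-zero):

$$x^a x^b = x^{a+b}, \qquad \left(\frac{x}{y}\right)^a = \frac{x^a}{y^a}, \qquad \frac{x^a}{x^b} = x^{a-b}, \qquad (x^a)^b = x^{ab},$$

$$(xy)^a = x^a y^a, \qquad x^{-a} = \frac{1}{x^a}, \qquad a^{1/b} = \sqrt[b]{a}, \qquad a^{c/b} = \sqrt[b]{a^c} = \left(\sqrt[b]{a}\right)^c.$$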

$$\frac{5^{3x} \cdot 25^{1-x}}{5^{2x+1} + 3(25^x) + 17(5^{2x})}, \qquad \frac{8^{x+2} - 34(2^{3x})}{\left(\sqrt{8}\right)^{2x+1}}.$$

¹⁹ By convention, $0^0$ is usually defined to be equal to 1; this textbook will follow this practice.

5.2

Example 76. Here's a case where there's just a surd in the denominator:

$$\frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{\sqrt{2}\sqrt{2}} = \frac{\sqrt{2}}{2}.$$

For more complicated cases, the trick is to use the fact that $(a + b)(a - b) = a^2 - b^2$.

Given $a + b$, we call $a - b$ the conjugate of $a + b$. We refer to $a + b$ and $a - b$ as a conjugate pair.

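Example 77.

$$\frac{1}{1 + \sqrt{2}} = \frac{1 - \sqrt{2}}{(1 + \sqrt{2})(1 - \sqrt{2})} = \frac{1 - \sqrt{2}}{1^2 - (\sqrt{2})^2} = \frac{1 - \sqrt{2}}{1 - 2} = \frac{1 - \sqrt{2}}{-1} = \sqrt{2} - 1.$$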
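A numerical check can catch sign slips when rationalising a denominator. The sketch below (variable names are illustrative only) compares $1/(1 + \sqrt{2})$ with $\sqrt{2} - 1$, and $1/\sqrt{2}$ with $\sqrt{2}/2$, using floating-point arithmetic, so it confirms agreement only up to rounding and is no substitute for the algebra.

```python
import math

root2 = math.sqrt(2)

# Surd only in the denominator: 1/sqrt(2) versus sqrt(2)/2.
simple_original = 1 / root2
simple_rationalised = root2 / 2

# Conjugate trick: 1/(1 + sqrt(2)) versus sqrt(2) - 1.
conj_original = 1 / (1 + root2)
conj_rationalised = root2 - 1

simple_agrees = math.isclose(simple_original, simple_rationalised)
conj_agrees = math.isclose(conj_original, conj_rationalised)
print(simple_agrees, conj_agrees)  # True True
```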

x

y

x2

y2

+1

x

x2

+1 .

2

y

y


5.3

Absolute Value

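$$|z| = \begin{cases} z, & \text{if } z \geq 0, \\ -z, & \text{if } z < 0. \end{cases}$$

(b) $|x| \leq b \iff -b \leq x \leq b$.

Proof. (a) $|x| < b \iff 0 \leq x < b$ OR $-b < x < 0 \iff -b < x < b$.

(b) Very similar.

(b) $|x - a| \leq b \iff a - b \leq x \leq a + b$.

Proof. (a) By Fact 2, $|x - a| < b$ if and only if $-b < x - a < b$. Rearranging the latter set of inequalities yields $a - b < x < a + b$.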
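The equivalence between $|x - a| < b$ and $a - b < x < a + b$ can be spot-checked numerically. The sketch below uses hypothetical helper names and the arbitrary illustrative values $a = 2$, $b = 3$; agreement on a grid of test points is evidence, not a proof.

```python
def in_abs_form(x: float, a: float, b: float) -> bool:
    """The condition |x - a| < b."""
    return abs(x - a) < b

def in_interval_form(x: float, a: float, b: float) -> bool:
    """The condition a - b < x < a + b."""
    return a - b < x < a + b

a, b = 2.0, 3.0
xs = [k / 4 for k in range(-40, 41)]  # grid from -10 to 10 in steps of 0.25

agree = all(in_abs_form(x, a, b) == in_interval_form(x, a, b) for x in xs)
print(agree)  # True
```

The endpoints $x = -1$ and $x = 5$ fail both conditions, as expected for strict inequalities.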

(b) Very similar.

²⁰ The absolute value operator is the function with domain $\mathbb{R}$, codomain $\mathbb{R}_0^+$, and mapping rule $x \mapsto x$ if $x \geq 0$ and $x \mapsto -x$ if $x < 0$.

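Fact 4. $\left|\dfrac{a}{b}\right| = \dfrac{|a|}{|b|}$ (provided $b \neq 0$).

If $a$ and $b$ have the same signs (and are non-zero), then $\dfrac{a}{b} > 0$ so that $\left|\dfrac{a}{b}\right| = \dfrac{a}{b}$. Moreover, either $\dfrac{|a|}{|b|} = \dfrac{a}{b}$ or $\dfrac{|a|}{|b|} = \dfrac{-a}{-b} = \dfrac{a}{b}$. Altogether then, indeed $\left|\dfrac{a}{b}\right| = \dfrac{|a|}{|b|}$.

If $a$ and $b$ have opposite signs (and are non-zero), then $\dfrac{a}{b} < 0$ so that $\left|\dfrac{a}{b}\right| = -\dfrac{a}{b}$. Moreover, either $\dfrac{|a|}{|b|} = \dfrac{-a}{b}$ or $\dfrac{|a|}{|b|} = \dfrac{a}{-b}$. Altogether then, indeed $\left|\dfrac{a}{b}\right| = \dfrac{|a|}{|b|}$.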

Intercepts

Example 79. The graph below is of the equation $y = x + 3$. It has horizontal intercept $-3$ and vertical intercept $3$.

[Figure: the line $y = x + 3$ drawn on the cartesian plane.]

Horizontal intercepts are the x-coordinates of the points at which the graph intersects

the horizontal or x-axis. Similarly, vertical intercepts are the y-coordinates of the points

at which the graph intersects the vertical or y-axis.

Definition 27. $a$ is a horizontal intercept (or x-intercept) of a graph $G$ if $(a, 0) \in G$.

Definition 28. $b$ is a vertical intercept (or y-intercept) of a graph $G$ if $(0, b) \in G$.

Where the graph $G$ is of an equation (or function), we sometimes also call the horizontal intercepts zeros or roots of the equation (or function). (We'll use the terms zeros and roots interchangeably in this textbook.)

Example 80. Consider the equation $y = x^2 - 1$. Its zeros or roots are $-1$ and $1$, because these are the values of $x$ for which $y = 0$.

Of course, $-1$ and $1$ are also the horizontal intercepts (or x-intercepts) of the graph of the equation.

Example 81. Consider the function $f$ with domain $\mathbb{R}$, codomain $\mathbb{R}$, and mapping rule $x \mapsto x^2 - 1$. Its zeros or roots are $-1$ and $1$, because these are the values of $x$ for which $f(x) = 0$.

Of course, $-1$ and $1$ are also the horizontal intercepts (or x-intercepts) of the graph of $f$.

vertical intercept $-1$ and two horizontal intercepts, $-1$ and $1$.

$-1$ and $1$ are also the zeros or roots of $f$, because $f(-1) = 0$ and $f(1) = 0$.

[Figure: the graph of $f(x) = x^2 - 1$.]

The A-level exams will often ask you to write down the full co-ordinates of the points at which a graph (or curve) crosses the axes. This means writing down both the x- and y-coordinates, and not just the horizontal intercept or the vertical intercept. Here's an exercise to help you make this a habit.

Exercise 54. Write down in full the point(s) at which the graph of each of the following equations crosses the axes: (a) $x^2 + y^2 = 1$. (b) $y = x^2 - 4$. (c) $y = x^2 + 2x + 1$. (d) $y = x^2 + 2x + 2$. (Answer on p. 1022.)

7

7.1

Symmetry

A reflection of a point in a line is the point's mirror image in that line. Formally:

Definition 29. Let $a$ be a point and $l_1$ be a line. Let $l_2$ be the line that is perpendicular to $l_1$ and runs through $a$. Let $x$ be the point where $l_1$ and $l_2$ intersect. Then the reflection of $a$ in $l_1$ is the point $a'$ on $l_2$ such that the distances $ax$ and $a'x$ are equal.

[Figure: the point $a$, its reflection $a'$, and the perpendicular lines $l_1$ and $l_2$.]

Fact 5. Let $(a, b)$ be a point. Its reflection in the line $y = x$ is the point $(b, a)$.

Fact 6. Let $(a, b)$ be a point. Its reflection in the line $y = -x$ is the point $(-b, -a)$.

Example 83. (a) Given the point $(3, 17)$, its reflection in the line $y = x$ is $(17, 3)$ and its reflection in the line $y = -x$ is $(-17, -3)$.

(b) Given the point $(1, 5)$, its reflection in the line $y = x$ is $(5, 1)$ and its reflection in the line $y = -x$ is $(-5, -1)$.

(c) Given the point $(0, 0)$, its reflection in the line $y = x$ is $(0, 0)$ and its reflection in the line $y = -x$ is $(0, 0)$.
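The two reflection rules are simple enough to sketch as Python functions. The function names are hypothetical; the rules themselves are the standard ones, namely that reflecting in $y = x$ swaps the coordinates and reflecting in $y = -x$ swaps and negates them.

```python
from typing import Tuple

Point = Tuple[float, float]

def reflect_in_y_eq_x(p: Point) -> Point:
    """Reflect (a, b) in the line y = x, giving (b, a)."""
    a, b = p
    return (b, a)

def reflect_in_y_eq_minus_x(p: Point) -> Point:
    """Reflect (a, b) in the line y = -x, giving (-b, -a)."""
    a, b = p
    return (-b, -a)

print(reflect_in_y_eq_x((3, 17)))        # (17, 3)
print(reflect_in_y_eq_minus_x((3, 17)))  # (-17, -3)
```

Reflecting twice in the same line returns the original point, which is a handy way to check an answer.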

Exercise 55. For each of the following points, write down their reflections in the lines (i) $y = x$; and (ii) $y = -x$. (a) $(3, 17)$. (b) $(1, 5)$. (c) $(0, 0)$. (Answer on p. 1023.)

7.2

Definition 30. The reflection of a graph $G$ in a line is the graph $G'$ where each point in $G'$ is a reflection of a point in $G$.

Example 84. The reflection of the graph $G = \{(x, y) : y = x^2 + 4\}$ in the line $y = 2$ is the graph $G' = \{(x, y) : y = -x^2\}$.

[Figure: the graphs $G: y = x^2 + 4$ and $G': y = -x^2$, with the line of reflection $y = 2$.]

Example 85. The reflection of the graph $G = \{(x, y) : y = \ln x\}$ in the line $x = 0$ is the graph $G' = \{(x, y) : y = \ln(-x)\}$.

[Figure: the graphs $G: y = \ln x$ and $G': y = \ln(-x)$, with the line of reflection $x = 0$.]

Fact 7 formalises our earlier observation in Section 3.7 (Inverse Functions) that the graphs of $f$ and its inverse $f^{-1}$ are reflections of each other in the line $y = x$.

Fact 7. Let $f$ be an invertible function. Then the reflection of the graph of $f$ in the line $y = x$ is the graph of its inverse function $f^{-1}$.

The next Fact simply makes the obvious observation that the reflection in the line y = x of

any point along the line y = x is itself.

Fact 8. Let (a, a) be a point. Its reflection in the line y = x is (a, a).

Fact 9. Let $f$ be invertible. Suppose $f$ passes through $(a, a)$. Then so too does its inverse $f^{-1}$. And hence, $f$ and $f^{-1}$ intersect at those points where $x = f(x)$.

The above Fact is useful for finding where a function and its inverse intersect.

Example 86. Let $f: \mathbb{R} \to \mathbb{R}$ be the invertible function defined by $x \mapsto 2x$. The graph of $f$ intersects the graph of $f^{-1}$ at the point(s) where $x = f(x) \iff x = 2x \iff x = 0$. Notice the intersection point $(0, 0)$ is also on the line $y = x$. See figure on p. 70.

Example 87. Let $f: \mathbb{R}_0^+ \to \mathbb{R}$ be the invertible function defined by $x \mapsto x^2$. The graph of $f$ intersects the graph of $f^{-1}$ at the point(s) where $x = f(x) \iff x = x^2 \iff x(x - 1) = 0 \iff x = 0, 1$. Notice the intersection points $(0, 0)$ and $(1, 1)$ are also on the line $y = x$. See figure on p. 72.

Be careful not to make the mistake of believing that $f$ and $f^{-1}$ can only intersect at points where $x = f(x)$. A function and its inverse can certainly intersect at points that are not on the $y = x$ line.

Example 88. Let $f: \mathbb{R}^+ \cup \mathbb{R}^- \to \mathbb{R}$ be the invertible function defined by $x \mapsto 1/x$. The graph of $f$ intersects the graph of $f^{-1}$ at the point(s) where $x = f(x) \iff x = 1/x \iff x = -1, 1$.

We have merely found two points at which $f$ and $f^{-1}$ intersect. There may very well be other intersection points. Indeed, in this example, $f$ and $f^{-1}$ also intersect at every other $x \neq 0$! See figure on p. 71.
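The warning in Example 88 can be illustrated with a short sketch. It assumes the reciprocal function $x \mapsto 1/x$ of that example, whose inverse has the same rule; the sample inputs are powers of two chosen so that floating-point division is exact. All names are illustrative.

```python
def f(x: float) -> float:
    return 1 / x

def f_inv(y: float) -> float:
    # Solving y = 1/x for x gives x = 1/y, so the inverse has the same rule.
    return 1 / y

xs = [-4.0, -1.0, 0.5, 1.0, 2.0]

# Confirm that f_inv really undoes f on these inputs.
inverts = all(f_inv(f(x)) == x for x in xs)

# Points shared by the graphs of f and f_inv.
shared = [(x, f(x)) for x in xs if f(x) == f_inv(x)]

# Shared points that do NOT lie on the line y = x.
off_line = [p for p in shared if p[0] != p[1]]

print(inverts)   # True
print(off_line)  # [(-4.0, -0.25), (0.5, 2.0), (2.0, 0.5)]
```

Only $(-1, -1)$ and $(1, 1)$ lie on $y = x$; solving $x = f(x)$ finds exactly those two and misses all the others.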


7.3

Lines of Symmetry

that line.

Example 89. The graph of y = x2 is symmetric in the line x = 0 (which also happens to

be the vertical axis).

[Figure: the graph of $y = x^2$, with the reflection line $x = 0$.]

The graph of $y = \dfrac{1}{x}$ is symmetric in the lines $y = x$ and $y = -x$.

[Figure: the graph of $y = 1/x$, together with the lines $y = x$ and $y = -x$.]

The syllabus makes nearly no mention of limits and none of continuity. Yet differentiation and integration are built entirely on the concept of limits. Continuity is also almost always assumed. It is thus well worth spending an hour or two on these concepts, especially since they're not difficult and everything will become that much clearer.

8.1

Example 91. Graphed below is the function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 5x + 2$. Observe that as $x$ approaches 3, $f(x)$ approaches 17. We write this as:

Statement #1. As $x \to 3$, $f(x) \to 17$.

(The right arrow symbol $\to$ means "to" in the context of functions, but now means "approaches" in the context of limits.)

Equivalently, we may say "The limit of $f(x)$ as $x$ approaches 3 is equal to 17." We write:

Statement #2. $\lim_{x \to 3} f(x) = 17$.

Statements #1 and #2 are entirely equivalent. Either may be (informally) interpreted thus:

For all values of $x$ that are close to (but not equal to) 3, $f(x)$ is close to (or possibly even equal to) 17.
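The informal reading of a limit can be explored numerically. The sketch below takes the function $x \mapsto 5x + 2$ of Example 91 and measures how far $f(x)$ is from 17 when $x$ is a small distance $h$ on either side of 3. The helper name `worst_gap` is hypothetical, and a shrinking gap is only suggestive evidence of a limit, not a proof.

```python
def f(x: float) -> float:
    return 5 * x + 2

def worst_gap(func, a: float, limit: float, h: float) -> float:
    """Largest distance between func(x) and the proposed limit,
    over the two points x = a - h and x = a + h (never x = a itself)."""
    return max(abs(func(a - h) - limit), abs(func(a + h) - limit))

steps = (0.1, 0.01, 0.001)
gaps = [worst_gap(f, 3, 17, h) for h in steps]

for h, gap in zip(steps, gaps):
    print(h, gap)  # each gap is roughly 5 * h
```

Note that the code never evaluates `f(3)`: the value of the function at 3 plays no role in the limit.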

[Figure: the graph of $f$.]

More generally, $\lim_{x \to a} f(x) = L$ may be (informally) interpreted thus: for all values of $x$ that are close to (but not equal to) $a$, $f(x)$ is close to (or possibly even equal to) $L$.

This interpretation is informal because the words "close to" are vague. For the formal definitions of limits (optional), see Section 88.1 in the Appendices.

The subtle condition "but not equal to" requires emphasis. When considering the limit of $f$ at 3, we do NOT care about the value $f(3)$. Indeed, we do NOT care even if $f(3)$ is undefined!

Here's an example where $\lim_{x \to 3} g(x)$ is well-defined, even though $g(3)$ is not.

It looks almost exactly like that of $f$ (from the previous example), except there is now a hole (or more formally, a discontinuity) at $x = 3$.

Nonetheless, it is still true that $g(x)$ is close to (or possibly even equal to) 17 for all values of $x$ that are close to (but not equal to) 3.

In formal notation, we write "as $x \to 3$, $g(x) \to 17$" or $\lim_{x \to 3} g(x) = 17$.

[Figure: the graph of $g$, with a hole at $x = 3$.]

In the next example, both $h(3)$ and $\lim_{x \to 3} h(x)$ are well-defined, but $\lim_{x \to 3} h(x) \neq h(3)$.

and $h(3) = 0$. The graph of $h$ looks almost exactly like those of $f$ and $g$ (from the previous examples), except that now the value of $h$ at $x = 3$ is, strangely enough, 0.

Nonetheless, it is still true that $h(x)$ is close to (or possibly even equal to) 17 for all values of $x$ that are close to (but not equal to) 3.

In formal notation, we write "as $x \to 3$, $h(x) \to 17$" or $\lim_{x \to 3} h(x) = 17$.

[Figure: the graph of $h$.]

$i(3) = 17$. This graph looks very different from those of $f$, $g$, and $h$ (from the previous examples). But like $f$, we again have $i(3) = 17$.

We are tempted to conclude that therefore $\lim_{x \to 3} i(x) = 17$. This though is wrong, because we cannot make $i(x)$ as close to 17 as we like by restricting $x$ to values that are close to but not equal to 3. Hence,

As $x \to 3$, $i(x) \not\to 17$, i.e. $\lim_{x \to 3} i(x) \neq 17$. Instead, $\lim_{x \to 3} i(x) = 0$: for all values of $x$ that are close to (but not equal to) 3, $i(x)$ is close to (or possibly even equal to) 0.

[Figure: the graph of $i$.]

Section 8.3 gives more examples of limits. But first, let's learn about continuity.

8.2

Continuity

And so a function is continuous on an interval (of points) if you can smoothly draw its

graph for that entire interval without once lifting your pencil. Formally:

Definition 32. $f: D \to \mathbb{R}$ is continuous at $a \in D$ if $\lim_{x \to a} f(x) = f(a)$.

Section 88.6 in the Appendices contains additional definitions and results concerning continuity (optional).

Example 91 (revisited). Graphed below is the function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 5x + 2$. It is continuous at 3, because $\lim_{x \to 3} f(x) = 17$ and $f(3) = 17$.

[Figure: the graph of $f$; $f$ is continuous everywhere.]

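by $x \mapsto 5x + 2$. It is continuous at 1, because $\lim_{x \to 1} g(x) = 7$ and $g(1) = 7$.

However, it is not continuous at 3, because $\lim_{x \to 3} g(x) = 17$, but $g(3)$ is undefined, and so $\lim_{x \to 3} g(x) \neq g(3)$.

Altogether, $g$ is continuous at any $a \in (-\infty, 3) \cup (3, \infty)$, because for any such $a$, we have $\lim_{x \to a} g(x) = g(a)$. But $g$ fails to be continuous at 3.

[Figure: the graph of $g$; $g$ is continuous everywhere except at $x = 3$.]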

for $x \neq 3$ and $h(3) = 0$. It is continuous at 1, because $\lim_{x \to 1} h(x) = 7$ and $h(1) = 7$.

However, it is not continuous at 3, because $\lim_{x \to 3} h(x) = 17$, but $h(3) = 0$ and so $\lim_{x \to 3} h(x) \neq h(3)$.

Altogether, $h$ is continuous at any $a \in (-\infty, 3) \cup (3, \infty)$, because for any $a \in (-\infty, 3) \cup (3, \infty)$, we have $\lim_{x \to a} h(x) = h(a)$. But $h$ fails to be continuous at 3.

[Figure: the graph of $h$; $h$ is continuous everywhere except at $x = 3$.]

8.3

We now turn to examples where limits do not exist. We start with a trivial example.

Example 95. Graphed below is the function $f: \mathbb{R}_0^+ \to \mathbb{R}$ defined by $x \mapsto 5x + 2$.

There is no number $L$ such that for all values of $x$ that are close to but not equal to $-3$, $f(x)$ is also close to $L$. And so we simply say that $\lim_{x \to -3} f(x)$ does not exist.

This is a trivial example because $-3$ is far from the domain of $f$. So obviously, for all values of $x$ that are close to but not equal to $-3$, $f(x)$ is undefined and so of course there is no number $L$ that $f(x)$ is always close to!

[Figure: the graph of $f$, with a label marking where $f$ is undefined.]

Example 96. Graphed below is the function $g: \mathbb{R} \to \mathbb{R}$ defined by $g(0) = 0$ and $g(x) = \sin\dfrac{1}{x}$ for all $x \neq 0$. This is a very strange function. As $x$ gets ever closer to 0, $g(x)$ fluctuates ever more rapidly between $-1$ and $1$.

It's difficult or even impossible to draw an accurate graph of $g$ near the origin.

In this example, $\lim_{x \to 0} g(x)$ does not exist. The reason is that for all values of $x$ that are close to but not equal to 0, there is no number $L$ that $g(x)$ is close to. When $x$ is close to 0, $g(x)$ takes on every value in $[-1, 1]$ infinitely often! And so $g(x)$ can never be said to be close to any one single number $L$.

Altogether then, $g$ is not continuous at 0. (With a little work, we can actually prove that $g$ is continuous on $\mathbb{R}^-$ and also on $\mathbb{R}^+$, but this is beyond the scope of A-levels.)
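The wild behaviour near 0 can be seen numerically. The sketch below assumes the function of Example 96, namely $g(x) = \sin(1/x)$ with $g(0) = 0$, and picks inputs of the form $1/(\pi/2 + 2\pi k)$ and $1/(3\pi/2 + 2\pi k)$, where the sine equals $1$ and $-1$ respectively. The variable names are illustrative, and the printed values are subject to floating-point rounding.

```python
import math

def g(x: float) -> float:
    return 0.0 if x == 0 else math.sin(1 / x)

samples = []
for k in (1, 10, 100):
    x_hi = 1 / (math.pi / 2 + 2 * math.pi * k)      # here sin(1/x) is 1
    x_lo = 1 / (3 * math.pi / 2 + 2 * math.pi * k)  # here sin(1/x) is -1
    samples.append((x_hi, g(x_hi), x_lo, g(x_lo)))

for x_hi, g_hi, x_lo, g_lo in samples:
    print(f"g({x_hi:.5f}) = {g_hi:.6f},  g({x_lo:.5f}) = {g_lo:.6f}")
```

However small we make $x$ (by taking $k$ larger), there are still inputs where $g$ is about $1$ and others where it is about $-1$, so no single number $L$ can serve as the limit.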


form representation of $x$ contains the digit 7 and $x \mapsto 2$ otherwise.

This function is arguably even stranger than the previous one. We have for example, $h(7) = h(70) = h(1.27) = h(0.0007) = 1$ and $h(15) = h(16) = h(16.335) = 2$.

There are infinitely many points along the line $y = 1$. And there are also infinitely many points along the line $y = 2$! It is quite impossible to sketch its graph accurately.

Nonetheless, $h$ is a perfectly well-defined function. Indeed, $h(3)$ is well-defined and $h(x)$ is well-defined for any $x \in \mathbb{R}$.

However, $\lim_{x \to 3} h(x)$ does not exist. However we try to restrict $x$ to values that are close to (but not equal to) 3, $h(x)$ is never close to any one single value; instead, $h(x)$ switches infinitely often between 1 and 2.

Indeed, $\lim_{x \to a} h(x)$ does not exist for any $a \in \mathbb{R}$! However we try to restrict $x$ to values that are close to (but not equal to) $a$, $h(x)$ is never close to any one single value; instead, $h(x)$ switches infinitely often between 1 and 2.

$h$ is nowhere-continuous: for every $a \in \mathbb{R}$, $h(a)$ is perfectly well-defined, but $\lim_{x \to a} h(x)$ is not. And so for every $a \in \mathbb{R}$, $\lim_{x \to a} h(x) \neq h(a)$.

[Figure: the graph of $h$, with points along the lines $y = 1$ and $y = 2$; $h$ is nowhere-continuous.]

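Exercise 56. Consider the function $f: \mathbb{R} \to \mathbb{R}$ defined by

$$f(x) = \begin{cases} 1, & \text{if } x \leq 0, \\ 2, & \text{if } x > 0. \end{cases}$$

What are $\lim_{x \to -5} f(x)$, $\lim_{x \to 0} f(x)$, and $\lim_{x \to 5} f(x)$?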

8.4

This section considers infinite limits, i.e. where as $x$ approaches some number, $f(x)$ increases (or decreases) without bound.

Example 98. Graphed below are the functions $f$ and $g$, both with domain $(-\infty, 3) \cup (3, \infty)$ and codomain $\mathbb{R}$, defined by $f: x \mapsto 2 + \dfrac{1}{(x - 3)^2}$ and $g: x \mapsto 2 - \dfrac{1}{(x - 3)^2}$.

[Figure: the graphs of $f$ and $g$, with the vertical asymptote $x = 3$.]

Observe that for all values of $x$ that are close to but not equal to 3, there is no number $L$ that $f(x)$ is close to. Hence, we say that $\lim_{x \to 3} f(x)$ simply does not exist. Similarly, $\lim_{x \to 3} g(x)$ does not exist. But as $x \to 3$, $f(x)$ increases without bound and $g(x)$ decreases without bound. By a very special convention, we are allowed to write these observations as:

$$\lim_{x \to 3} f(x) = \infty \quad \text{and} \quad \lim_{x \to 3} g(x) = -\infty.$$

$\lim_{x \to 3} f(x) = \infty$ must NOT be interpreted to mean that there exists something called $\lim_{x \to 3} f(x)$ (no such thing exists); or that this thing is equal to some other thing called $\infty$ (recall that $\infty$ is not a number!). Instead, $\lim_{x \to 3} f(x) = \infty$ is interpreted informally as: $f(x)$ is as large as we like, for all values of $x$ that are sufficiently close to but not equal to 3.

Again, see Section 88.3 in the Appendices (optional) for the formal definitions.

Example 99. Graphed below is the equation $y = \tan x$. It has two vertical asymptotes $x = \pm\pi/2$, because (for $x$ between $-\pi/2$ and $\pi/2$) $\lim_{x \to -\pi/2} \tan x = -\infty$ and $\lim_{x \to \pi/2} \tan x = \infty$.

[Figure: the graph of $y = \tan x$ between the vertical asymptotes $x = -\pi/2$ and $x = \pi/2$.]

8.5

This section considers limits at infinity (not to be confused with the infinite limits discussed in the previous section). That is, the behaviour of $f(x)$ as $x$ increases (or decreases) without bound.

Example 98 (revisited). Reproduced below are the graphs of the functions $f$ and $g$, both with domain $(-\infty, 3) \cup (3, \infty)$ and codomain $\mathbb{R}$, defined by $f: x \mapsto 2 + \dfrac{1}{(x - 3)^2}$ and $g: x \mapsto 2 - \dfrac{1}{(x - 3)^2}$.

We already saw that $f$ and $g$ both have vertical asymptote $x = 3$, because as $x \to 3$, $f(x)$ increases without bound and $g(x)$ decreases without bound. We now consider instead what happens as $x$ increases or decreases without bound.

[Figure: the graphs of $f$ and $g$, with their horizontal asymptotes.]

As $x$ increases or decreases without bound, $f(x) \to 2$ and $g(x) \to 2$. We can write these observations as $\lim_{x \to \infty} f(x) = 2$, $\lim_{x \to \infty} g(x) = 2$, $\lim_{x \to -\infty} f(x) = 2$, and $\lim_{x \to -\infty} g(x) = 2$.

The line $y = 2$ is thus a horizontal asymptote of the graph of $f$ and also a horizontal asymptote of the graph of $g$. [See Section 88.4 in the Appendices (optional) for the formal definition of a horizontal asymptote.]

Pedantic point: Infinite limits do not exist. In contrast, limits at infinity DO exist. Here in this example, $\lim_{x \to 3} f(x)$ does not exist. In contrast, $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$ both exist (and are both equal to 2).

As $x \to -\infty$, $y \to 0$. We can also write this as $\lim_{x \to -\infty} y = 0$. And we can also say that this graph has horizontal asymptote $y = 0$.

[Figure: the graph of $y = e^x$, with the horizontal asymptote $y = 0$.]

oblique (or slant) asymptote.

Example 101. Consider the function $f: \mathbb{R}^- \cup \mathbb{R}^+ \to \mathbb{R}$ defined by $x \mapsto x + \dfrac{1}{x}$.

As $x$ increases without bound or decreases without bound, $f(x)$ approaches the line $y = x$. We can also write these observations as $\lim_{x \to \infty} f(x) = x$ and $\lim_{x \to -\infty} f(x) = x$.

Again, see Section 88.4 in the Appendices (optional) for the formal definition of an oblique asymptote.

[Figure: the graph of $f$, with the oblique asymptote $y = x$.]

9

9.1

Differentiation

The problem of finding the derivative is the problem of finding the slope of the tangent to

a graph at a given point.

Graphed below is some function $f: \mathbb{R} \to \mathbb{R}$. Pick some point $A = (a, f(a))$. Draw the line $l$ which is tangent to the graph at the point $A$.

How do we find the slope of $l$? Unsure of how to proceed, we try a crude approximation. Pick some point $X_1 = (x_1, f(x_1))$ that is also on the graph. Consider the line $AX_1$. What's its slope? Slope = Rise ÷ Run and so $AX_1$ has slope $\dfrac{f(x_1) - f(a)}{x_1 - a}$.

This number serves as our first crude approximation of the slope of $l$.

How can we improve on this approximation? Simple: just pick some point $X_2 = (x_2, f(x_2))$ that is closer to $A$. The line $AX_2$ has slope $\dfrac{f(x_2) - f(a)}{x_2 - a}$.

This number serves as our second, improved approximation of the slope of $l$.

[Figure: the graph of $y = f(x)$, its tangent $l$ at $A$, and the points $X_1$ and $X_2$ on the graph, with $a < x_2 < x_1$.]

At least in theory, we can keep repeating this procedure, by picking points that are ever closer to $A$. Our estimates of the slope of $l$ will get ever better. Altogether then, we are motivated to make the following formal definition of the derivative:
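The "ever better estimates" idea can be tried out numerically. The text leaves $f$ unspecified, so the sketch below assumes, purely for illustration, that $f(x) = x^2$ and $a = 1$; the helper name `secant_slope` is hypothetical.

```python
def secant_slope(func, a: float, x: float) -> float:
    """Slope of the line through (a, func(a)) and (x, func(x)): rise over run."""
    return (func(x) - func(a)) / (x - a)

def f(x: float) -> float:
    return x ** 2  # assumed example function, not taken from the text

a = 1.0
# Points X1, X2, ... chosen ever closer to A.
xs = [2.0, 1.5, 1.1, 1.01, 1.001]
slopes = [secant_slope(f, a, x) for x in xs]

for x, slope in zip(xs, slopes):
    print(x, slope)
```

For this choice of $f$, the slope of the line through $A$ and $(x, f(x))$ works out to $x + 1$, so the estimates head towards 2 as $x$ approaches 1.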


$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$

If this limit exists, then we say that $f$ is differentiable at the point $a \in D$ and we call this limit the value of $f$'s derivative at the point $a \in D$.

But if this limit does not exist, then we say that $f$ is not differentiable at the point $a \in D$ and the value of $f$'s derivative at the point $a \in D$ is undefined or does not exist.

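[Figure: the graph of $f(x) = |x|$, annotated "Derivative = $-1$ for $a < 0$" and "Derivative does not exist at $a = 0$".]

At $a = -5$:

$$\lim_{x \to -5} \frac{f(x) - f(-5)}{x - (-5)} = \lim_{x \to -5} \frac{|x| - |-5|}{x + 5} = \lim_{x \to -5} \frac{-x - 5}{x + 5} = \lim_{x \to -5} (-1) = -1.$$

At $a = -3$:

$$\lim_{x \to -3} \frac{f(x) - f(-3)}{x - (-3)} = \lim_{x \to -3} \frac{|x| - |-3|}{x + 3} = \lim_{x \to -3} \frac{-x - 3}{x + 3} = \lim_{x \to -3} (-1) = -1.$$

Indeed, the value of $f$'s derivative at any point $a < 0$ is $-1$, because for any $a < 0$,

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{|x| - |a|}{x - a} = \lim_{x \to a} \frac{-x + a}{x - a} = \lim_{x \to a} (-1) = -1.$$

Similarly, for any $a > 0$,

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{|x| - |a|}{x - a} = \lim_{x \to a} \frac{x - a}{x - a} = \lim_{x \to a} 1 = 1.$$

At $a = 0$, however,

$$\lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{|x| - |0|}{x - 0} = \lim_{x \to 0} \frac{|x|}{x} = \begin{cases} \lim_{x \to 0} \dfrac{x}{x} = \lim_{x \to 0} 1 = 1, & \text{for } x > 0, \\[2mm] \lim_{x \to 0} \dfrac{-x}{x} = \lim_{x \to 0} (-1) = -1, & \text{for } x < 0. \end{cases}$$

So as $x \to 0$, there is no one single value towards which the expression $\dfrac{f(x) - f(0)}{x - 0}$ approaches. So the limit does not exist.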
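The failure of the limit at 0 shows up clearly in a numerical sketch. It assumes, as the example above appears to, that $f$ is the absolute value function; the helper name `difference_quotient` is hypothetical.

```python
def difference_quotient(func, a: float, x: float) -> float:
    """The expression (f(x) - f(a)) / (x - a) from the definition of the derivative."""
    return (func(x) - func(a)) / (x - a)

steps = [0.1, 0.01, 0.001]

# Approach 0 from the right (x > 0) and from the left (x < 0).
from_right = [difference_quotient(abs, 0.0, h) for h in steps]
from_left = [difference_quotient(abs, 0.0, -h) for h in steps]

print(from_right)  # [1.0, 1.0, 1.0]
print(from_left)   # [-1.0, -1.0, -1.0]
```

The quotient stays at 1 on one side and at $-1$ on the other, however small the step, so there is no single value for it to approach. Away from 0 the two sides agree: near $a = -5$, for instance, both give values close to $-1$.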


9.2

The Value of the Derivative of f at a

Lagrange's notation: $f'(a) = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Leibniz's notation: $\left.\dfrac{df(x)}{dx}\right|_{x=a} = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$ or $\left.\dfrac{df}{dx}\right|_{x=a} = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Newton's notation: $\dot{f}(a) = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Some remarks.

Lagrange's and Leibniz's notations are widely used. Newton's notation is not.

But Newton's notation is sometimes used in physics (especially when the independent variable is time). You certainly need to know about Newton's notation because it is on the A-level syllabus. Nonetheless, this textbook will avoid using Newton's notation.

Leibniz's notation is convenient in that it allows us to interpret $\dfrac{d}{dx}$ as the "differentiate with respect to $x$" operator. Section 9.5 will give some examples of how this operator works.

Define $\Delta x$ to be equal to $x - a$, and $\Delta f(x)$ to be equal to $f(x) - f(a)$, so that we can write:

$$\frac{\Delta f(x)}{\Delta x} = \frac{f(x) - f(a)}{x - a}.$$

The limit of this expression as $x \to a$ is precisely the value of the derivative of $f$ at $a$:

$$\lim_{x \to a} \frac{\Delta f(x)}{\Delta x} = \left.\frac{df}{dx}\right|_{x=a} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$

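At the points $a = -5$, $a = 2$, and $a = 0$:

Lagrange's notation: $f'(-5) = -1$, $f'(2) = 1$, $f'(0)$ is undefined.

Leibniz's notation: $\left.\dfrac{df}{dx}\right|_{x=-5} = -1$, $\left.\dfrac{df}{dx}\right|_{x=2} = 1$, $\left.\dfrac{df}{dx}\right|_{x=0}$ is undefined.

Newton's notation: $\dot{f}(-5) = -1$, $\dot{f}(2) = 1$, $\dot{f}(0)$ is undefined.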

Here is a very oversimplified history of Leibnizs notation, to give you a better sense of why

we use it.

Leibniz (1646-1716) thought of dx as an infinitesimal change in x. And dy was the

corresponding infinitesimal change in y. Leibniz then defined the derivative to be literally

the quotient dy/dx. Unfortunately, the idea of infinitesimals was rather vague, imprecise,

and non-rigorous. So in the 19th century, mathematicians embarked on a project to put

calculus on a firmer footing. In particular, they wished to rid mathematics of all references

to infinitesimals. Eventually, they settled on the modern notion of limits, in which no

reference to infinitesimals was necessary. This modern notion of limits is also what you've

just learnt.

So simply put, Leibniz was wrong to think of the derivative as a fraction. And

you should be very careful not to think of the derivative as a fraction, even though it looks

very much like one.

You are now being taught things in the correct order. First you are taught about limits.

Next we define the derivative in terms of limits. We are careful to note that the derivative

is not a fraction.

But if Leibniz was wrong to think of the derivative as a fraction, then why are we still

using his notation? The main reason is that it is highly intuitive. In particular, it reminds

us of what calculus is really about: how a small change in one variable affects another

variable. It also allows us to quickly grasp the intuition behind such results as the Chain

Rule, which may informally be stated as:

\[
\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}.
\]

It is tempting to naively interpret the expressions in the above equation as fractions, naively apply simple algebra, naively cancel out the $dy$'s, so that the equation is indeed true. But the correct informal interpretation (easily seen when written in Leibniz's notation) is this:

"The change in $z$ caused by a small unit change in $x$" is equal to "The change in $z$ caused by a small unit change in $y$" $\times$ "The change in $y$ caused by a small unit change in $x$".

Another result is the Inverse Function Theorem, which may informally be stated as:

\[
\frac{dy}{dx} = \frac{1}{\frac{dx}{dy}}.
\]

Again, the naive interpretation would be of $\frac{dy}{dx}$ and $\frac{dx}{dy}$ as fractions, so that indeed by naive algebra, the above equation is true. But again, the correct informal interpretation (easily seen when written in Leibniz's notation) is this: "The change in $y$ caused by a small unit change in $x$" is equal to "The reciprocal of the change in $x$ caused by a small unit change in $y$".

For a more detailed discussion, see the leading answer to this question on Math StackExchange.


9.3

In contrast, we now define the derivative to be a function:

Definition 34. Let $f : D \to \mathbb{R}$ be a function and $A$ be the set of points at which $f$ is differentiable. Then the derivative of $f$ is the function with domain $A$, the same codomain as $f$ (namely $\mathbb{R}$), and mapping rule $x \mapsto f'(x)$.

The Derivative of f

Lagrange's notation: $f'$.

Leibniz's notation: $\frac{df(x)}{dx}$ or $\frac{df}{dx}$.

Newton's notation: $\dot{f}$.

$\frac{d}{dx} cx^2 = 2cx$ and $\frac{d}{dx} cx = c$.

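Example 103. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = 7x^2$. Its derivative is the function $f' : \mathbb{R} \to \mathbb{R}$ defined by $f'(x) = 14x$. This derivative may be denoted $f'$ or $\frac{df(x)}{dx}$ or $\frac{df}{dx}$ or $\dot{f}$.

The value of the derivative of $f$ at 0.5 is $f'(0.5) = \left.\frac{df(x)}{dx}\right|_{x=0.5} = \left.\frac{df}{dx}\right|_{x=0.5} = \dot{f}(0.5) = 7$.

The value of the derivative of $f$ at 1 is $f'(1) = \left.\frac{df(x)}{dx}\right|_{x=1} = \left.\frac{df}{dx}\right|_{x=1} = \dot{f}(1) = 14$.

The value of the derivative of $f$ at 2 is $f'(2) = \left.\frac{df(x)}{dx}\right|_{x=2} = \left.\frac{df}{dx}\right|_{x=2} = \dot{f}(2) = 28$.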


9.4

The derivative is also known as the first derivative. The second derivative is, similarly, also

a function:

Definition 35. Let $f : D \to \mathbb{R}$ be a function. The second derivative of $f$ is simply the derivative of the derivative of $f$.

The Second Derivative of f

Lagrange's notation: $f''$.

Leibniz's notation: $\frac{d^2 f(x)}{dx^2}$ or $\frac{d^2 f}{dx^2}$.

Newton's notation: $\ddot{f}$.

Since $\frac{d}{dx}$ is the operator, it makes sense to denote the second derivative of $f$ by $\frac{d^2}{dx^2} f$ or $\frac{d^2 f}{dx^2}$.

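is the function with domain and codomain both $\mathbb{R}$, and mapping rule $x \mapsto 14$. This second derivative may be denoted $f''$ or $\frac{d^2 f(x)}{dx^2}$ or $\frac{d^2 f}{dx^2}$ or $\ddot{f}$.

The value of the second derivative of $f$ at 0.5 is $f''(0.5) = \left.\frac{d^2 f(x)}{dx^2}\right|_{x=0.5} = \left.\frac{d^2 f}{dx^2}\right|_{x=0.5} = \ddot{f}(0.5) = 14$.

The value of the second derivative of $f$ at 1 is $f''(1) = \left.\frac{d^2 f(x)}{dx^2}\right|_{x=1} = \left.\frac{d^2 f}{dx^2}\right|_{x=1} = \ddot{f}(1) = 14$.

The value of the second derivative of $f$ at 2 is $f''(2) = \left.\frac{d^2 f(x)}{dx^2}\right|_{x=2} = \left.\frac{d^2 f}{dx^2}\right|_{x=2} = \ddot{f}(2) = 14$.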


We similarly define the third, fourth, fifth, etc. derivatives in the obvious fashion.

Definition 36. Let $f : D \to \mathbb{R}$ be a function. For $n \geq 3$, the $n$th derivative of $f$ is simply the derivative of the $(n-1)$th derivative of $f$.

The 3rd Derivative of f and the 4th Derivative of f

Lagrange's notation: $f^{(3)}$ (3rd), $f^{(4)}$ (4th).

Leibniz's notation: $\frac{d^3 f}{dx^3}$ (3rd), $\frac{d^4 f}{dx^4}$ (4th).

Newton's notation: $\dddot{f}$ (3rd), $\ddddot{f}$ (4th).

Etc.

Example 103 (revisited). Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto 7x^2$. Its first derivative is the function $f' : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 14x$. Its second derivative is the function $f'' : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 14$. We have $f'(2) = 28$ and $f''(2) = 14$.

Its third derivative is the function $f^{(3)} : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 0$. Its fourth derivative is the function $f^{(4)} : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 0$. Observe that $f^{(3)} = f^{(4)}$. Indeed, the third and all higher-order derivatives are identical functions: $f^{(3)} = f^{(4)} = f^{(5)} = \dots$

We have $f^{(3)}(2) = f^{(4)}(2) = f^{(5)}(2) = \dots = 0$. Indeed, for any $x \in \mathbb{R}$, we have $f^{(3)}(x) = f^{(4)}(x) = f^{(5)}(x) = \dots = 0$.

Exercise 57. Given f D R and f A R, what is f ? (Answer on p. 1025.)

Exercise 58. (Tedious but easy.) Let $g : \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto x^4 - x^3 + x^2 - x + 1$. Write down all of its derivatives. Evaluate all of these derivatives at 1. Write your answers in Lagrange's, Leibniz's, and Newton's notation. (Answer on p. 1025.)


9.5

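$\frac{d}{dx}$ Operator

Example 104. $\frac{d}{dx} x^2 = 2x$ is simply shorthand for this statement:

The derivative of the function with mapping rule $x \mapsto x^2$ is the function with mapping rule $x \mapsto 2x$.

Example 105. $\frac{d}{dx} f = g$ is simply shorthand for this statement:

The derivative of the function $f$ is the function $g$.

Example 106. $\frac{d}{dx} (f \cdot g) = g \cdot f' + f \cdot g'$ is simply shorthand for this statement:

The derivative of the function with mapping rule $x \mapsto f(x) g(x)$ is the function with mapping rule $x \mapsto g(x) f'(x) + f(x) g'(x)$.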


9.6

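and $g'$. Suppose also that the composite function $f \circ g : A \to \mathbb{R}$ is well-defined. Let $k \in \mathbb{R}$ be a constant. Then:

\[
\frac{d}{dx} k = 0, \qquad \frac{d}{dx} kf = kf', \qquad \frac{d}{dx} x^k = k x^{k-1},
\]

\[
\frac{d}{dx} (f \pm g) = f' \pm g', \qquad \frac{d}{dx} (f \cdot g) = g \cdot f' + f \cdot g', \qquad \frac{d}{dx} \frac{f}{g} = \frac{g \cdot f' - f \cdot g'}{g \cdot g},
\]

\[
\frac{d}{dx} \sin x = \cos x, \qquad \frac{d}{dx} \cos x = -\sin x, \qquad \frac{d}{dx} e^x = e^x, \qquad \frac{d}{dx} \ln x = \frac{1}{x},
\]

\[
\frac{d}{dx} (f \circ g) = \frac{d (f \circ g)}{dg} \cdot \frac{dg}{dx}.
\]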

(My mnemonic for the Quotient Rule is: Lo-D-Hi minus Hi-D-Lo; cross over and square

the low.)

Proof. Optional, see p. 957 in the Appendices.
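As a sanity check (not a proof), the Product, Quotient, and Chain Rules can be compared against a numerical estimate of the derivative. The sketch below is only an illustration: the choices $f = \sin$, $g(x) = x^2 + 1$, the point $x_0 = 0.7$, and the helper `numerical_derivative` are all assumptions made here, not part of the text.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Illustrative choices (assumptions for this sketch).
f, f_prime = math.sin, math.cos


def g(x):
    return x * x + 1.0


def g_prime(x):
    return 2.0 * x


x0 = 0.7

# Right-hand sides of the three rules, evaluated at x0.
product_rule = g(x0) * f_prime(x0) + f(x0) * g_prime(x0)
quotient_rule = (g(x0) * f_prime(x0) - f(x0) * g_prime(x0)) / (g(x0) * g(x0))
chain_rule = f_prime(g(x0)) * g_prime(x0)

# Numerical estimates of the corresponding left-hand sides.
product_estimate = numerical_derivative(lambda x: f(x) * g(x), x0)
quotient_estimate = numerical_derivative(lambda x: f(x) / g(x), x0)
chain_estimate = numerical_derivative(lambda x: f(g(x)), x0)

print(product_rule, product_estimate)
print(quotient_rule, quotient_estimate)
print(chain_rule, chain_estimate)
```

Each printed pair should agree to several decimal places; any remaining gap comes from the finite step size `h`.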

Of the above rules, the Chain Rule is the most powerful. We can also write it more elegantly (if a little imprecisely) as

\[
\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}.
\]

As discussed above in the historical note (p. 118), thus written, the Chain Rule has a beautiful informal interpretation: "The change in $z$ caused by a small unit change in $x$" is equal to "The change in $z$ caused by a small unit change in $y$" $\times$ "The change in $y$ caused by a small unit change in $x$". This makes perfect sense:


Example 107. When I add 1 g of Milo (the $x$-variable) to a cup of water, the volume of the water increases by 2 cm$^3$ (the $y$-variable). That is, $\frac{dy}{dx} = 2$ cm$^3$ g$^{-1}$.

When the volume of the water increases by 1 cm$^3$ (the $y$-variable), the water level (in the cup) rises by 0.3 cm (the $z$-variable). That is, $\frac{dz}{dy} = 0.3$ cm cm$^{-3}$ $= 0.3$ cm$^{-2}$.

Altogether then, when I add 1 g of Milo (the $x$-variable) to a cup of water, I should expect the water level to rise by 0.6 cm. That is, $\frac{dz}{dx} = 0.6$ cm g$^{-1}$. This is indeed consistent with

\[
\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = 0.3 \times 2 = 0.6 \text{ cm g}^{-1}.
\]

In case you've forgotten how it works, here are a few examples to illustrate:

Example 108. Let $h : \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto e^{\sin x}$.

\[
h'(x) = \frac{d e^{\sin x}}{dx} = \frac{d e^{\sin x}}{d \sin x} \cdot \frac{d \sin x}{dx} = e^{\sin x} \cos x.
\]

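Example 109. Let $g$ be defined by $x \mapsto \sqrt{4x - 1}$ (for $x > 1/4$, so that the square root is defined and $g$ is differentiable).

\[
g'(x) = \frac{d \sqrt{4x - 1}}{dx} = \frac{d \sqrt{4x - 1}}{d (4x - 1)} \cdot \frac{d (4x - 1)}{dx} = 0.5 (4x - 1)^{-0.5} \cdot 4 = 2 (4x - 1)^{-0.5}.
\]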

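Here's a more complicated example, where the Chain Rule is applied twice. With $f(x) = [\sin(2x - 3) + \cos(5 - 2x)]^3$:

\[
f'(x) = \frac{d [\sin(2x - 3) + \cos(5 - 2x)]^3}{dx} = \frac{d [\sin(2x - 3) + \cos(5 - 2x)]^3}{d [\sin(2x - 3) + \cos(5 - 2x)]} \cdot \frac{d [\sin(2x - 3) + \cos(5 - 2x)]}{dx}
\]

\[
= 3 [\sin(2x - 3) + \cos(5 - 2x)]^2 \left[ \frac{d \sin(2x - 3)}{d(2x - 3)} \cdot \frac{d(2x - 3)}{dx} + \frac{d \cos(5 - 2x)}{d(5 - 2x)} \cdot \frac{d(5 - 2x)}{dx} \right]
\]

\[
= 3 [\sin(2x - 3) + \cos(5 - 2x)]^2 \left[ 2 \cos(2x - 3) + 2 \sin(5 - 2x) \right].
\]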
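The three Chain Rule computations above can be checked numerically. The following Python sketch is only a rough check under stated assumptions: the evaluation point $x_0 = 1$ and the helper `numerical_derivative` are choices made here, and the closed form used for the last example is the one obtained by carrying the Chain Rule through to the end.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


def h_example(x):
    return math.exp(math.sin(x))


def h_prime(x):
    # Chain Rule: derivative of e^u is e^u, times derivative of sin x.
    return math.exp(math.sin(x)) * math.cos(x)


def g_example(x):
    # Only defined for x >= 1/4.
    return math.sqrt(4 * x - 1)


def g_prime(x):
    return 2 * (4 * x - 1) ** -0.5


def f_example(x):
    return (math.sin(2 * x - 3) + math.cos(5 - 2 * x)) ** 3


def f_prime(x):
    inner = math.sin(2 * x - 3) + math.cos(5 - 2 * x)
    return 3 * inner ** 2 * (2 * math.cos(2 * x - 3) + 2 * math.sin(5 - 2 * x))


x0 = 1.0  # illustrative evaluation point (an assumption)
for name, func, formula in (
    ("h", h_example, h_prime),
    ("g", g_example, g_prime),
    ("f", f_example, f_prime),
):
    print(name, formula(x0), numerical_derivative(func, x0))
```

In each row the formula and the numerical estimate should agree to several decimal places.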


Exercise 59. For each of the following functions (assume they have a suitably defined

domain and codomain), evaluate the first derivative at 0. (a) f (x) = x2 . (b) g(x) =

x

2

1 + [x ln (x + 1)] . (c) h(x) = sin

2 . (Answer on p. 1026.)

1 + [x ln (x + 1)]

Corollary 1. $\frac{d}{dx} \tan x = \sec^2 x$, $\frac{d}{dx} \cot x = -\csc^2 x$, and $\frac{d}{dx} \csc x = -\csc x \cot x$.

\[
\frac{d}{dx} \tan x = \frac{d}{dx} \frac{\sin x}{\cos x} = \frac{\cos x \cos x - \sin x (-\sin x)}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x.
\]

For the derivatives of $\cot x$ and $\csc x$, see Exercise 60.

SYLLABUS ALERT

$\frac{d}{dx} \csc x = -\csc x \cot x$ is in the List of Formulae for 9758 (revised), but not for 9740 (old).

$\frac{d}{dx} \cot x = -\csc^2 x$ and $\frac{d}{dx} \csc x = -\csc x \cot x$.

(a) Newton's Second Law of Motion is that force is equal to the rate of change of momentum,

where momentum is the product of mass and velocity. Write down this law in mathematical

notation, with F , m, v, and t denoting force, mass, velocity, and time.

(b) Assume that mass is constant. Explain why Newton's Second Law then simplifies into

the more-familiar F = ma, where a is acceleration (i.e. the rate of change of velocity).


9.7

point in that set.

at every point in that set.

In other words, $f$ is differentiable if and only if $f'$ has the same domain as $f$. Similarly, $f$ is twice-differentiable if and only if $f''$ has the same domain as $f$.

And of course, if a function is twice-differentiable, then it is also differentiable.

(The definitions for a function to be thrice-differentiable, four-times-differentiable,

etc. are very much analogous, but this textbook will have no reason to use these terms.)

The condition that the first derivative (or second derivative) exists at every point in the

domain is important. Failing which, we do not consider the function to be differentiable

(or twice-differentiable). The three functions in the next example illustrate:


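for all $x \in \mathbb{R}$. And so $f$ is both differentiable and twice-differentiable.

Now consider $g : \mathbb{R} \to \mathbb{R}$ defined by $g(x) = x|x|$ (graphed below). We have $g'(x) = 2|x|$ for all $x \in \mathbb{R}$ and

\[
g''(x) =
\begin{cases}
-2, & \text{for } x < 0, \\
2, & \text{for } x > 0.
\end{cases}
\]

But $g''(0)$ does not exist. And so $g$ is differentiable but NOT twice-differentiable.

(Figure: graph of $g$, labelled with $g'(x) = 2|x|$ for all $x$; $g''(x) = -2$ for $x < 0$ and $2$ for $x > 0$; $g''(0)$ is undefined.)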

\[
h'(x) =
\begin{cases}
-2, & \text{for } x < 0, \\
2, & \text{for } x > 0.
\end{cases}
\]

But $h'(0)$ does not exist. So $h$ is not even once-differentiable. (And thus it is certainly not twice-differentiable either.)
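One way to see why $g''(0)$ fails to exist for $g(x) = x|x|$ is to compute the one-sided difference quotients of $g'(x) = 2|x|$ at 0. The Python sketch below is a minimal illustration; the helper name `one_sided_quotient` and the step size are assumptions made here.

```python
def g_prime(x):
    # First derivative of g(x) = x|x|, as given in the example.
    return 2 * abs(x)


def one_sided_quotient(func, a, step):
    """Difference quotient of func at a, using the single point a + step."""
    return (func(a + step) - func(a)) / step


right = one_sided_quotient(g_prime, 0.0, 1e-6)   # approaching 0 from the right
left = one_sided_quotient(g_prime, 0.0, -1e-6)   # approaching 0 from the left
print(right, left)
```

The two quotients settle on different values (2 from the right, -2 from the left), so there is no single limit, which is exactly what it means for the second derivative not to exist at 0.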


We can of course also consider thrice-differentiable, four-times-differentiable, etc. functions. We can even consider infinitely-differentiable functions. Indeed, in the A-levels, most functions are infinitely-differentiable. For example, all polynomials are infinitely-differentiable, as illustrated in the next example.

Example 112. Consider $i : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^5 - x^4 + x^3 - x^2 + x - 1$. We have, for all $x \in \mathbb{R}$,

$i'(x) = 5x^4 - 4x^3 + 3x^2 - 2x + 1$,

$i''(x) = 20x^3 - 12x^2 + 6x - 2$,

$i^{(3)}(x) = 60x^2 - 24x + 6$,

$i^{(4)}(x) = 120x - 24$,

$i^{(5)}(x) = 120$,

$i^{(6)}(x) = 0$.

The function $i$ is infinitely-differentiable, with the 6th and higher-order derivatives all having the mapping rule $x \mapsto 0$.
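The repeated differentiation of the polynomial in Example 112 is mechanical enough to automate. In the Python sketch below, a polynomial is stored as a list of coefficients in ascending order of power; this representation and the helper name `differentiate` are choices made for illustration only.

```python
def differentiate(coefficients):
    """Differentiate a polynomial whose coefficients[k] is the coefficient of x**k."""
    # The term c * x**k becomes k * c * x**(k - 1); the constant term drops out.
    return [k * c for k, c in enumerate(coefficients)][1:]


# x^5 - x^4 + x^3 - x^2 + x - 1, written in ascending powers of x.
polynomial = [-1, 1, -1, 1, -1, 1]

derivatives = [polynomial]
for _ in range(6):
    derivatives.append(differentiate(derivatives[-1]))

for order, coefficients in enumerate(derivatives):
    print(order, coefficients)
```

The first derivative comes out as `[1, -2, 3, -4, 5]`, that is $5x^4 - 4x^3 + 3x^2 - 2x + 1$, and the sixth derivative is the empty list, which stands for the zero polynomial.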

Example 113. Consider $j : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto e^x$. We have, for all $x \in \mathbb{R}$,

$j'(x) = j''(x) = j^{(3)}(x) = j^{(4)}(x) = \dots = e^x$.

The function j is infinitely-differentiable, with every derivative simply being the same

function as j.


9.8

graph has no holes or jumps anywhere and can be drawn smoothly without lifting

your pencil.

Differentiability is a stronger smoothness condition. If a function is differentiable,

then its graph is continuous (i.e. has no holes or jumps) and moreover has no kinks

or other abrupt turns.

Example 114. Graphed below are the functions f , g, and h.

f is both continuous and differentiable.

g is continuous: you can draw its entire graph without lifting your pencil. However, it is

not differentiable because of the kink.

h is neither continuous nor differentiable, because of the hole.

(Figure: graphs of $f$, $g$, and $h$. $f$ is both continuous and differentiable; $g$ is continuous, but not differentiable; $h$ is neither continuous nor differentiable.)

Proof. Optional, see p. 961 in the Appendices.


9.9

Implicit Differentiation

dy

?

dx

dy

2x

x

x

=

=

=

.

dx

2 1 x2

1 x2

1 x2

Method #2 (implicit differentiation). Directly apply $\frac{d}{dx}$ to the given equation:

\[
\frac{d}{dx} (x^2 + y^2) = \frac{d}{dx} (1) \implies 2x + 2y \frac{dy}{dx} = 0 \implies \frac{dy}{dx} = -\frac{x}{y}.
\]

x

dy

x

=

.

=

dx

1 x2

1 x2

In the above example, the second method (implicit differentiation) is not obviously superior

to the first. However, it is sometimes difficult (or impossible) to express y in terms of x.

Nonetheless we might still want to compute dy/dx. In such cases, the method of implicit

differentiation is wonderful. The next example illustrates:
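For the equation $x^2 + y^2 = 1$, the slope $-x/y$ obtained by implicit differentiation can be compared against a direct numerical slope. The Python sketch below is a rough check only: it assumes we are on the upper half of the circle (so that $y = \sqrt{1 - x^2}$), and the point $x_0 = 0.6$ and the function names are illustrative choices.

```python
import math


def upper_semicircle(x):
    # Upper half of x^2 + y^2 = 1 (an assumption made for this check).
    return math.sqrt(1 - x * x)


def implicit_slope(x, y):
    # dy/dx = -x / y, from differentiating x^2 + y^2 = 1 implicitly.
    return -x / y


x0 = 0.6
y0 = upper_semicircle(x0)

step = 1e-6
secant_slope = (upper_semicircle(x0 + step) - upper_semicircle(x0 - step)) / (2 * step)

print(implicit_slope(x0, y0), secant_slope)
```

Both numbers should be close to $-0.75$, since at $x_0 = 0.6$ we have $y_0 = 0.8$.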

when evaluated at x = 0)?

y

dy

dy

= 1. What is

(i.e. what is

cos x

dx x=0

dx

In this example, it's difficult to express y in terms of x. But this doesn't matter, because

we can use implicit differentiation:

d

y

d

1 dy y( sin x) cos x dx

(x2 y +

) = (1) 2x y + x2

+

= 0.

dx

cos x

dx

2 y dx

cos2 x

dy

Now plug in x = 0:

1 dy y( sin 0) cos 0 dx

dy

2 0 y + 02

+

=

0

= 0.

2 y dx

cos2 0

dx

dy


The four rules of differentiation in the next corollary are in the List of Formulae you get

during A-level exams (both 9740 and 9758), so you need not know these by heart.

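Corollary 2. $\frac{d}{dx} \sec x = \sec x \tan x$, $\frac{d}{dx} \sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$, $\frac{d}{dx} \cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}}$, and $\frac{d}{dx} \tan^{-1} x = \frac{1}{1 + x^2}$.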

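Proof. To prove that $\frac{d}{dx} \sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$, first rewrite $y = \sin^{-1} x$ as $x = \sin y$. Then apply $\frac{d}{dx}$ (implicit differentiation) to get $1 = \cos y \frac{dy}{dx}$. But $\sin^2 y + \cos^2 y = 1$, so $\cos y = \sqrt{1 - x^2}$ (we take the non-negative root because $y = \sin^{-1} x$ lies in $[-\pi/2, \pi/2]$, where $\cos y \geq 0$). And so,

\[
\frac{dy}{dx} = \frac{d}{dx} \sin^{-1} x = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - x^2}}.
\]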
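The formula for the derivative of $\sin^{-1} x$ can be checked numerically using the standard library's `math.asin`. The sketch below is illustrative only; the evaluation point $x_0 = 0.5$ and the helper `numerical_derivative` are assumptions made here.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


def arcsin_derivative(x):
    # The claimed formula: 1 / sqrt(1 - x^2), valid for -1 < x < 1.
    return 1 / math.sqrt(1 - x * x)


x0 = 0.5
print(arcsin_derivative(x0), numerical_derivative(math.asin, x0))
```

The two printed values should agree to several decimal places.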

Exercise 62 asks you to prove that the derivatives of $\sec x$, $\cos^{-1} x$ and $\tan^{-1} x$ are as claimed.

$\frac{d}{dx} \sec x = \sec x \tan x$, $\frac{d}{dx} \cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}}$, and $\frac{d}{dx} \tan^{-1} x = \frac{1}{1 + x^2}$. (Answer on p. 1026.)


10

10.1

increasing on $\mathbb{R}^+_0$, strictly decreasing on $\mathbb{R}^-$, and strictly increasing on $\mathbb{R}^+$.

(Figure: graph of $f$, with the regions on which $f$ is decreasing, strictly decreasing, increasing, and strictly increasing labelled.)

Note: At x = 0, f is both decreasing and increasing, but neither strictly decreasing nor

strictly increasing. This follows from the formal definitions (below).

Definition 41. Given a function $f$ and a set of points $S$, we say that $f$ is ...

1. ... increasing on $S$ if for any $x_1, x_2 \in S$ with $x_2 > x_1$, we have $f(x_2) \geq f(x_1)$;

2. ... strictly increasing on $S$ if for any $x_1, x_2 \in S$ with $x_2 > x_1$, we have $f(x_2) > f(x_1)$;

3. ... decreasing on $S$ if for any $x_1, x_2 \in S$ with $x_2 > x_1$, we have $f(x_2) \leq f(x_1)$;

4. ... strictly decreasing on $S$ if for any $x_1, x_2 \in S$ with $x_2 > x_1$, we have $f(x_2) < f(x_1)$.

Of course, if a function is strictly increasing on a set of points, then it is also increasing on

that set. And if it is strictly decreasing, then it is also decreasing.
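The four conditions in the definition translate directly into a check over pairs of points. The Python sketch below tests the conditions on a finite sample of points only, so it can refute a claim but never prove it for an infinite set; the function names and the sample functions are illustrative assumptions.

```python
from itertools import combinations


def is_increasing(f, points, strict=False):
    """Check f(x2) >= f(x1) (or > if strict) for every pair x2 > x1 in points."""
    for x1, x2 in combinations(sorted(set(points)), 2):
        if strict and not f(x2) > f(x1):
            return False
        if not strict and not f(x2) >= f(x1):
            return False
    return True


def is_decreasing(f, points, strict=False):
    """Check f(x2) <= f(x1) (or < if strict) for every pair x2 > x1 in points."""
    return is_increasing(lambda x: -f(x), points, strict)


def square(x):
    return x * x


def constant(x):
    return 3


non_negative_sample = [0, 0.5, 1, 2]
non_positive_sample = [-2, -1, -0.5, 0]

print(is_increasing(square, non_negative_sample, strict=True))   # True
print(is_decreasing(square, non_positive_sample, strict=True))   # True
print(is_increasing(constant, non_negative_sample))              # True
print(is_increasing(constant, non_negative_sample, strict=True)) # False
```

The last two lines illustrate the remark above: a constant function is increasing (and decreasing) on any set, but not strictly so.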

Exercise 63. Let $g : \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto \sin x$. Identify the sets on which $g$ is

increasing, decreasing, strictly increasing and/or strictly decreasing. (Answer on p. 1027.)


10.2

The derivative is the slope of the tangent. And so, not surprisingly, the derivative is intimately related to whether a function is increasing or decreasing. Formally:

Fact 10. Let f R R be a differentiable function. Let a, b R with b > a. Then

1. f is decreasing on (a, b) f (x) 0, for all x (a, b).

2. f is increasing on (a, b) f (x) 0, for all x (a, b).

4. f is strictly increasing on (a, b) f (x) > 0, for all x (a, b).

5. f is both increasing and decreasing at a f (a) = 0.

1. f is decreasing on R0 , and so f (x) 0 for x 0.

3. f is strictly decreasing on R0 , and so f (x) < 0 for x 0.

4. f is strictly increasing on R+0 , and so f (x) > 0 for x 0.


11

11.1

1. If $f(x) \geq f(a)$ for all $a \in D$ that are "close to" $x$, then we call $x$ a maximum point of $f$ and $f(x)$ a maximum value.

2. If $f(x) \leq f(a)$ for all $a \in D$ that are "close to" $x$, then we call $x$ a minimum point of $f$ and $f(x)$ a minimum value.

3. If $f(x) > f(a)$ for all $a \in D$ that are "close to" $x$, then we call $x$ a strict maximum point of $f$ and $f(x)$ a strict maximum value.

4. If $f(x) < f(a)$ for all $a \in D$ that are "close to" $x$, then we call $x$ a strict minimum point of $f$ and $f(x)$ a strict minimum value.

Of course, a strict maximum point is also a maximum point. And a strict minimum point

is also a minimum point.

Any maximum or minimum point is also known as an extremum (plural: extrema) or an

extreme point.


point and a strict maximum point of f . The corresponding maximum value (and also strict

maximum value) is f (1) = 0.

Also graphed is $g : \mathbb{R} \to \mathbb{R}$ defined by $g(x) = (x + 1)^2$. $x = -1$ is a minimum point and a strict minimum point of $g$. The corresponding minimum value (and also strict minimum value) is $g(-1) = 0$.

(Figure: graphs of $f$ and $g$, with $x = -1$ labelled as the minimum point for $g$ and $x = 1$ as the maximum point for $f$.)


Example 119. Graphed below is $h : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 6x^5 - 15x^4 - 10x^3 + 30x^2$.

$x = -1$ is a maximum point and a strict maximum point of $h$. The corresponding maximum value (and also strict maximum value) is $h(-1) = 19$.

$x = 1$ is a maximum point and a strict maximum point of $h$. The corresponding maximum value (and also strict maximum value) is $h(1) = 11$.

$x = 0$ is a minimum point and a strict minimum point of $h$. The corresponding minimum value (and also strict minimum value) is $h(0) = 0$.

$x = 2$ is a minimum point and a strict minimum point of $h$. The corresponding minimum value (and also strict minimum value) is $h(2) = -8$.

(Figure: graph of $h$, with $x = \pm 1$ labelled as maximum points and $x = 0, 2$ labelled as minimum points.)


The next example highlights the fact that a maximum point is sometimes not a strict

maximum point. Likewise with minimum points.

Example 120. Below is graphed $i : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 3$. (This is a constant function.)

Every point $x \in \mathbb{R}$ is a maximum point of $i$. The corresponding maximum value is always $i(x) = 3$.

But no point is a strict maximum point.

Every point $x \in \mathbb{R}$ is a minimum point of $i$. The corresponding minimum value is always $i(x) = 3$.

But no point is a strict minimum point.

(Figure: graph of $i$. Every point is a maximum point. Every point is a minimum point.)


11.2

1. If $f(a) \geq f(x)$ for all $x \in D$, we call $a$ the global maximum point of $f$ and $f(a)$ the global maximum value.

2. If $f(a) \leq f(x)$ for all $x \in D$, we call $a$ the global minimum point of $f$ and $f(a)$ the global minimum value.

3. If $f(a) > f(x)$ for all $x \in D \setminus \{a\}$, we call $a$ the strict global maximum of $f$ and $f(a)$ the strict global maximum value.

4. If $f(a) < f(x)$ for all $x \in D \setminus \{a\}$, we call $a$ the strict global minimum of $f$ and $f(a)$ the strict global minimum value.

Fact 11. There cannot be more than one strict global maximum point of a function. (Similarly, there cannot be more than one strict global minimum point of a function.)

Proof. Suppose for contradiction that two distinct points x1 and x2 are strict global maximum points of f . Then since x1 is a strict global maximum point, we have f (x1 ) > f (x2 ).

Similarly, since x2 is a strict global maximum point, we have f (x2 ) > f (x1 ). The two

inequalities are contradictory. So it is impossible that two distinct points x1 and x2 are

strict global maximum points of f .


$6x^5 - 15x^4 - 10x^3 + 30x^2$. (Graph reproduced below for convenience.)

$x = \pm 1$ are maximum points. However, they are not global maximum points. Indeed, $h$ has no global maximum point because $\lim_{x \to \infty} h(x) = \infty$ (as $x$ increases without bound, $h(x)$ also increases without bound). In other words, there is no $x$ such that $h(x) \geq h(a)$ for all $a \in \mathbb{R}$.

Similarly, $x = 0, 2$ are minimum points. However, they are not global minimum points. Indeed, $h$ has no global minimum point because $\lim_{x \to -\infty} h(x) = -\infty$ (as $x$ decreases without bound, $h(x)$ also decreases without bound). In other words, there is no $x$ such that $h(x) \leq h(a)$ for all $a \in \mathbb{R}$.

(Figure: graph of $h$, with $x = \pm 1$ labelled as maximum points and $x = 0, 2$ labelled as minimum points.)

We next restrict the domain of h in two ways to create two new functions i and j:


Example 119 (revisited). Graphed below (left) is the function $i : [-1.5, 2.5] \to \mathbb{R}$ defined by $x \mapsto 6x^5 - 15x^4 - 10x^3 + 30x^2$.

$i$ has three maximum points in total, namely $\pm 1, 2.5$. However, only 2.5 is a global maximum point of $i$ because only $i(2.5) \geq i(x)$ for all $x \in [-1.5, 2.5]$. Of course, it is also a strict global maximum point because $i(2.5) > i(x)$ for all other $x \in [-1.5, 2.5]$.

$i$ has three minimum points in total, namely $-1.5, 0, 2$. However, only $-1.5$ is a global minimum point of $i$ because only $i(-1.5) \leq i(x)$ for all $x \in [-1.5, 2.5]$. Of course, it is also a strict global minimum point because $i(-1.5) < i(x)$ for all other $x \in [-1.5, 2.5]$.

(Figure: graphs of $i$ (left) and $j$ (right). For $i$: $x = \pm 1$ max; $x = 2.5$ max and global max; $x = -1.5$ min and global min; $x = 0, 2$ min. For $j$: $x = -1$ max and global max; $x = 1, 2.2$ max; $x = -1.2, 0$ min; $x = 2$ min and global min.)

Also graphed above (right) is the function $j : [-1.2, 2.2] \to \mathbb{R}$ defined by $x \mapsto 6x^5 - 15x^4 - 10x^3 + 30x^2$.

Again, there are three maximum points in total, namely $\pm 1, 2.2$. However, only $-1$ is a global maximum point of $j$ because only $j(-1) \geq j(x)$ for all $x \in [-1.2, 2.2]$. Of course, it is also a strict global maximum point because $j(-1) > j(x)$ for all other $x \in [-1.2, 2.2]$.

And again, there are three minimum points in total, namely $-1.2, 0, 2$. However, only 2 is a global minimum point of $j$ because only $j(2) \leq j(x)$ for all $x \in [-1.2, 2.2]$. Of course, it is also a strict global minimum point because $j(2) < j(x)$ for all other $x \in [-1.2, 2.2]$.
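The claims about global maxima and minima on the two restricted domains can be checked by evaluating the polynomial at the candidate points named above. The Python sketch below takes the lists of candidate points as given (it does not derive them), and the variable names are illustrative.

```python
def h(x):
    return 6 * x**5 - 15 * x**4 - 10 * x**3 + 30 * x**2


# Candidate points: the maximum and minimum points listed for each domain.
candidates_i = [-1.5, -1.0, 0.0, 1.0, 2.0, 2.5]   # domain [-1.5, 2.5]
candidates_j = [-1.2, -1.0, 0.0, 1.0, 2.0, 2.2]   # domain [-1.2, 2.2]

values_i = {x: h(x) for x in candidates_i}
values_j = {x: h(x) for x in candidates_j}

global_max_i = max(values_i, key=values_i.get)
global_min_i = min(values_i, key=values_i.get)
global_max_j = max(values_j, key=values_j.get)
global_min_j = min(values_j, key=values_j.get)

print(values_i)
print(global_max_i, global_min_i)
print(values_j)
print(global_max_j, global_min_j)
```

On $[-1.5, 2.5]$ the largest value occurs at 2.5 and the smallest at $-1.5$; on $[-1.2, 2.2]$ the largest occurs at $-1$ and the smallest at 2, in line with the text.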


Note that the A-level syllabuses and exams only ever talk about maximum and minimum

points. They do not ever talk about

1. Strict maximum points;

2. Strict minimum points;

3. Global maximum points;

4. Global minimum points;

5. Strict global minimum points; and

6. Strict global maximum points.

Nonetheless, these concepts are not difficult to grasp. It is thus well worth learning them,

just so you have a better understanding of how to find maximum and minimum points.

Note also that what we simply call maximum and minimum points are sometimes instead

called local maximum and minimum points, so that they are better contrasted with global

maximum or minimum points.

Exercise 64. (Answer on p. 1027.) For each of the following functions, write down,

if any of these exist, the (i) maximum points, (ii) minimum points, (iii) strict maximum

points, (iv) strict minimum points, (v) global maximum points, (vi) global minimum points,

(vii) strict global maximum points, (viii) strict global minimum points; and also all the

corresponding values of the function at these points.

(a) $f : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 100$.

(b) $g : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^2$.

(c) $h : [1, 2] \to \mathbb{R}$ defined by $x \mapsto x^2$.


11.3

Graphically, a stationary point is where the slope of the tangent is 0 (flat).

Definition 44. A turning point is any point that is both a stationary point and a maximum

or minimum point.

But the converse is not true: A stationary point need not always be a turning point. And

an extreme point need not always be a turning point.


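Example 121. Graphed below is the function $f : [-1.5, 0.5] \to \mathbb{R}$ defined by $x \mapsto x^5 + 2x^4 + x^3$. Five points are labelled. The table below classifies each point.

D is a stationary point but not a turning point. (As we shall learn in Section 151, D is an example of an inflexion point.)

A is a minimum point and E is a maximum point. But neither is a turning point.

Type                 A    B    C    D    E
Max                  No   Yes  No   No   Yes
Min                  Yes  No   Yes  No   No
Strict Max           No   Yes  No   No   Yes
Strict Min           Yes  No   Yes  No   No
Global Max           No   No   No   No   Yes
Global Min           Yes  No   No   No   No
Strict Global Max    No   No   No   No   Yes
Strict Global Min    Yes  No   No   No   No
Stationary           No   Yes  Yes  Yes  No
Turning              No   Yes  Yes  No   No

(Figure: graph of $f(x) = x^5 + 2x^4 + x^3$, with the points A, B, C, D, and E labelled.)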

Exercise 65. Is each of the following statements true or false? To show that a statement

is false, simply give a counterexample from the above example. If it is true, explain why.

(Answer on p. 1028.)

(a) Every maximum point or minimum point is a stationary point.

(b) Every maximum point or minimum point is a turning point.

(c) Every stationary point is a maximum point or minimum point.

(d) Every turning point is a maximum point or minimum point.

(e) Every turning point is a stationary point.

(f) Every stationary point is a turning point.


11.4

Definition 45. $x \in S$ is in the interior of $S$ if there exists $\varepsilon > 0$ such that $(x - \varepsilon, x + \varepsilon) \subseteq S$.

$x \in S$ is a non-interior point of $S$ if it is not in the interior of $S$.

Example 122. Consider the set $S = [0, 1]$. The points 0.2, 1/3, and 0.775 are all in the interior of $S$. Indeed, every point $x \in (0, 1)$ is in the interior of $S$.

In contrast, the points 0 and 1 are non-interior points of $S$.

Example 123. Consider the set $S = [0, 0.5) \cup (0.5, 1]$. The points 0.2, 1/3, and 0.775 are all in the interior of $S$. Indeed, every point $x \in (0, 0.5) \cup (0.5, 1)$ is in the interior of $S$.

In contrast, the points 0 and 1 are non-interior points of $S$.

The point 0.5 is not in the interior of $S$. It is not even a non-interior point of $S$, because it is not in the set $S$ to begin with.


The Interior Extremum Theorem (IET) is the fundamental reason why we lurrrve taking derivatives and setting them equal to zero: this is a great way to find maxima and minima!

Theorem 2. (Interior Extremum Theorem [IET].) Let $f : D \to \mathbb{R}$ be a differentiable function. If $a$ is a maximum or minimum point AND in the interior of $D$, then $f'(a) = 0$ (i.e. $a$ is a stationary point).

Example 136 (revisited). Graphed below is $f : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto -(x - 1)^2$. Here's the intuition for why $f'(1) = 0$:

In order for 1 to be a maximum point of $f$, it must be that to its left, $f$ is increasing; while to its right, $f$ is decreasing. In other words, to the left of 1, $f'(x) \geq 0$. While to the right of 1, $f'(x) \leq 0$. Altogether then, we must have $f'(1) = 0$: at the maximum point, the slope of the function must be 0.

(Figure: graphs of $f$ and $g$, with $x = -1$ labelled as the minimum point for $g$ and $x = 1$ as the maximum point for $f$.)

Exercise 66. Refer to the above Example. Explain the intuition for why $g'(-1) = 0$. (Answer on p. 1028.)

or minimum point AND in the interior of D, then x is a turning point. (Answer on p.

1028.)


11.5

In secondary school, you may have been taught that to find the maximum and minimum

points of f , simply follow this procedure:

Given a differentiable function $f : D \to \mathbb{R}$,

1. Compute $f'(x)$. Find the points $x$ at which $f'(x) = 0$.

2. These points are also the maximum and minimum points. (If we also want to know which are maximum and which are minimum points, then simply employ some method like sketch-the-graph or the Second Derivative Test.)

Unfortunately, the above procedure (let's call it the Incorrect Recipe) may sometimes fail. It rests on the false belief that "$f'(x) = 0 \iff x$ is an extremum". This is false because

1. The IET does NOT say, "$f'(x) = 0 \implies x$ is an extremum". It is perfectly possible that $f'(x) = 0$ without $x$ being an extremum.

2. The IET says only that "$x$ is an extremum AND an interior point $\implies f'(x) = 0$". Thus, it is perfectly possible that $x$ is an extremum without $f'(x) = 0$.

Here is an example to illustrate these two failings of the Incorrect Recipe.


Example 144 (revisited). Graphed below is the function $f : [-1.5, 0.5] \to \mathbb{R}$ defined by $x \mapsto x^5 + 2x^4 + x^3$. Five points are labelled.

According to the Incorrect Recipe,

1. Compute $f'(x) = 5x^4 + 8x^3 + 3x^2 = x^2 (5x^2 + 8x + 3) = x^2 (5x + 3)(x + 1)$. We see that $f'(x) = 0 \iff x = -\frac{3}{5}, -1, 0$.

2. So $-\frac{3}{5}, -1, 0$ are the maximum and minimum points of $f$.

The Incorrect Recipe does correctly identify the points $B = (-1, f(-1))$ and $C = \left(-\frac{3}{5}, f\left(-\frac{3}{5}\right)\right)$ as maximum and minimum points, respectively. But it makes two mistakes.

Mistake #1: $D = (0, 0)$ is neither a maximum nor a minimum point, contrary to the Incorrect Recipe.

Mistake #2: A and E are respectively a minimum and a maximum point, but neither is detected by the Incorrect Recipe.

(Figure: graph of $f(x) = x^5 + 2x^4 + x^3$, with the points A, B, C, D, and E labelled.)

We now give the Correct Recipe for finding maximum and minimum points:


Given a differentiable function f D R,

1. Identify all the stationary points (i.e. $x$ where $f'(x) = 0$).

2. Identify all the non-interior points.

3. Check to see if each stationary point and each non-interior point is a maximum point,

a minimum point, or neither. (To do so, employ some method like sketch-the-graph or

the Second Derivative Test.)

1. The Correct Recipe demands that you also check the non-interior points, which may

possibly be maximum or minimum points, but may not be detected by the Incorrect

Recipe.

2. The Correct Recipe does not assume that every single one of our shortlist of points (the

stationary points and the non-interior points) is either a maximum point or a minimum

point. It allows for the possibility that some of these points could be neither.

By the way, the condition that f is differentiable is very important. If f is not

differentiable, then the above Correct Recipe might not work. But not to worry,

since most functions on the A-levels are usually differentiable.

Example 124. Consider $f : [-1, 1] \to \mathbb{R}$ defined by $x \mapsto x^3$. Let's apply the Correct Recipe.

1. Identify all the stationary points (i.e. $x$ where $f'(x) = 0$).

$f'(x) = 3x^2$. So $f'(x) = 0 \iff x = 0$. The only stationary point is $x = 0$.

2. Identify all the non-interior points.

Every point $x \in (-1, 1)$ is in the interior of $[-1, 1]$. The only non-interior points are $-1$ and 1.

3. Check if each of these points is a maximum point, a minimum point, or neither.

From a sketch of the graph, we see that the stationary point $x = 0$ is neither a maximum nor a minimum point. The non-interior point $-1$ is a minimum point. The non-interior point 1 is a maximum point.

Altogether, we conclude that $-1$ is the only minimum point and 1 is the only maximum point.
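Step 3 of the Correct Recipe can be imitated crudely by comparing the function's value at each shortlisted point with its values at nearby points inside the domain. The Python sketch below does this for $x \mapsto x^3$ on $[-1, 1]$. It is a rough sampling check, not a proof: it looks at one small step on each side only, and the helper name `classify` and the step size are assumptions made here.

```python
def cube(x):
    return x ** 3


def classify(f, x, lower, upper, step=1e-3):
    """Compare f(x) with f at nearby points that lie inside [lower, upper]."""
    neighbours = [t for t in (x - step, x + step) if lower <= t <= upper]
    if all(f(x) >= f(t) for t in neighbours):
        return "maximum"
    if all(f(x) <= f(t) for t in neighbours):
        return "minimum"
    return "neither"


# Shortlist: the stationary point 0 and the non-interior points -1 and 1.
shortlist = [-1.0, 0.0, 1.0]
for point in shortlist:
    print(point, classify(cube, point, -1.0, 1.0))
```

The stationary point 0 comes out as neither, while the two non-interior points come out as a minimum and a maximum, matching the conclusion of the example.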


Exercise 68. For each of the following functions, find all the maximum and minimum

points using the Correct Recipe. (Answer on p. 1029.)

(a) f R R defined by x x.

(b) g [0, 1] R defined by x x.

(c) $h : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^4 - 2x^2$. Identify also the global minimum point(s) of $h$ (if

any exist).


12

segment connecting any two points of the graph in this interval is below the graph.

A function is concave upwards (or simply convex) on an interval if the line segment

connecting any two points of the graph in this interval is above the graph.

An inflexion point is any point where the concavity of the function changes, either

from downwards to upwards, or upwards to downwards.22

$f$ is concave downwards on $\mathbb{R}^-_0$ because there, the line segment connecting any two points on $f$ is below the graph of $f$.

(Figure: graph of $f$ with the tangent line at $x = 0$; $f$ is concave upwards on $\mathbb{R}^+_0$ and concave downwards on $\mathbb{R}^-_0$.)

In contrast, $f$ is concave upwards on $\mathbb{R}^+_0$ because there, the line segment connecting any two points on $f$ is above the graph of $f$.

0 is an inflexion point because this is where the function $f$ changes from being concave downwards to being concave upwards.

A test for whether a point is an inflexion point is this: Draw the tangent line to the graph at that point. The point is an inflexion point $\iff$ The line is above the graph on one side of the point and below the graph on the other side (see Fact 95 in the Appendices).

The tangent line to the graph at the point 0 is drawn in green (it coincides with the horizontal axis). We indeed see that the line is above the graph on the left side of the point and below the graph on the right side of the point. Therefore, 0 is an inflexion point.

22

These are informal definitions. For the formal definitions, see p. 964 in the Appendices (optional).


concave upwards, its slope must be increasing. Altogether then, the following proposition

is intuitively plausible.

Proposition 2. Let $f : D \to \mathbb{R}$ be a twice-differentiable function.

(a) $f$ is concave downwards on an interval $\iff f''(x) \leq 0$ for every $x$ in this interval.

(b) $f$ is concave upwards on an interval $\iff f''(x) \geq 0$ for every $x$ in this interval.

(c) $x$ is an inflexion point $\implies f''(x) = 0$.

on $\mathbb{R}_0^-$, concave upwards on $\mathbb{R}_0^+$, and has an inflexion point at x = 0.

We can verify that, as per the above proposition:

$$f''(x) = 6x \begin{cases} < 0, & \text{for } x \in \mathbb{R}^-, \\ = 0, & \text{for } x = 0, \\ > 0, & \text{for } x \in \mathbb{R}^+. \end{cases}$$

It is very tempting to believe that the converse of part (c) of the above proposition is true. That is, it is very tempting to believe that

$$f''(x) = 0 \implies x \text{ is an inflexion point.}$$

But this is wrong! It is perfectly possible that $f''(x) = 0$ without x being an inflexion point! Here's an example:

Example 126. Consider $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^4$. We have $g'(x) = 4x^3$ and $g''(x) = 12x^2$, so that $g''(x) = 0 \iff x = 0$.

We might thus be tempted to conclude that 0 is an inflexion point. However, this is not the case. Although $g''(0) = 0$, we have $g''(x) > 0$ for x > 0 and we also have $g''(x) > 0$ for x < 0, and so the concavity of g does not change at the point 0.

To qualify as an inflexion point, the concavity of the function must change. At 0, the concavity of g does not change. Therefore, 0 is NOT an inflexion point.

[Figure: graph of g, which is concave upwards everywhere. The point 0 is not an inflexion point of g.]
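The point of Example 126 can also be checked numerically. The sketch below is only an illustration, not a proof: it samples the second derivative just to the left and just to the right of a point and reports whether the sign changes. The helper name `concavity_changes` and the comparison function $x \mapsto x^3$ (whose second derivative is $6x$) are my own choices for the sake of contrast.

```python
def g_second(x):
    """Second derivative of g(x) = x**4, i.e. g''(x) = 12x^2."""
    return 12 * x**2


def cubic_second(x):
    """Second derivative of x**3, i.e. 6x (a contrasting function chosen for illustration)."""
    return 6 * x


def concavity_changes(second_derivative, a, h=1e-3):
    """Return True if the second derivative has opposite signs just left and just right of a.

    Only two nearby points are sampled, so this is a quick check rather than a proof.
    """
    left = second_derivative(a - h)
    right = second_derivative(a + h)
    return left * right < 0


print("x^4 at 0:", concavity_changes(g_second, 0))
print("x^3 at 0:", concavity_changes(cubic_second, 0))
```

For $x^4$ the second derivative is positive on both sides of 0, so no sign change is reported, even though the second derivative is zero at 0 itself.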

Inflexion points may be further sub-divided into stationary points of inflexion and

non-stationary points of inflexion.

Definition 46. A stationary point of inflexion is simply any point that is both an inflexion

point and a stationary point.

A non-stationary point of inflexion is simply any point that is an inflexion point, but not

a stationary point.

There is the temptation to believe that every inflexion point must also be a stationary point. Here's a quick counter-example that dispels this belief:

Example 127. The graph below is for the function $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^3 + x$.

We have $f'(x) = 3x^2 + 1$ and $f''(x) = 6x$. The point 0 is not a stationary point because $f'(0) = 1 \ne 0$.

However, 0 is an inflexion point, because to the left of 0, f is concave downwards; and to the right, f is concave upwards. So 0 is a point of inflexion. Indeed, it is a non-stationary point of inflexion.

Also illustrated is the tangent line at 0, namely y = x (whose slope is indeed non-zero). Observe that indeed, to the left of 0, the tangent line is above the graph; while to the right of 0, the tangent line is below the graph. This serves as a second way to verify that 0 is a point of inflexion.

[Figure: graph of f with the tangent line at 0. f is concave downwards to the left of 0 and concave upwards to the right of 0.]

12.1

[Figures: a graph that is concave upwards around a minimum turning point, and a graph that is concave downwards around a maximum turning point.]

From graphs, it looks like around a maximum turning point a, f must be concave downwards, i.e. $f''(a) < 0$. Similarly, around a minimum turning point b, f must be concave upwards, i.e. $f''(b) > 0$. The next proposition is thus intuitively plausible.

Proposition 3. (Second Derivative Test [2DT].) Let f be a twice-differentiable function. Let a be a stationary point (i.e. $f'(a) = 0$).

1. If $f''(a) < 0$, then a is a maximum point.

2. If $f''(a) > 0$, then a is a minimum point.

3. If $f''(a) = 0$, then the 2DT is uninformative. That is, a could be a maximum point, a minimum point, an inflexion point, or something else altogether!

The third part of the above Proposition must be heavily emphasised: If $f'(a) = 0$ and $f''(a) = 0$, then the 2DT tells us absolutely nothing about a! a could be a maximum point, a minimum point, an inflexion point, or something else altogether!

We previously gave the Correct Recipe for finding maximum and minimum points. Let's now add the 2DT to this recipe:

Given a twice-differentiable function $f: D \to \mathbb{R}$,

1. Identify all the stationary points (i.e. a where $f'(a) = 0$).

(a) Evaluate $f''$ at each of these points.

(b) $f''(a) < 0 \implies a$ is a maximum point. Conversely, $f''(a) > 0 \implies a$ is a minimum point. If $f''(a) = 0$, then we need to determine the nature of a using some other method (e.g. sketch-the-graph).

2. Identify all the non-interior points.

(a) Check if each of these points is a maximum point, a minimum point, or neither.

If f is not twice-differentiable, then the Enriched Recipe may not work. Fortunately, most functions in A-levels are twice-differentiable.

Example 121 (revisited). Consider $f: [-1.5, 0.5] \to \mathbb{R}$ defined by $x \mapsto x^5 + 2x^4 + x^3$.

1. Identify all the stationary points. $f'(x) = 5x^4 + 8x^3 + 3x^2 = x^2(5x^2 + 8x + 3) = 0 \iff x = 0$ or $x = -1, -0.6$ (quadratic formula).

(a) $f''(x) = 20x^3 + 24x^2 + 6x = 2x(10x^2 + 12x + 3)$.

(b) $f''(-0.6) > 0 \implies -0.6$ is a minimum point. $f''(-1) < 0 \implies -1$ is a maximum point. But $f''(0) = 0$, so the 2DT tells us nothing. By sketching the graph (a non-rigorous method that will suffice for the A-levels), we see that 0 is an inflexion point.

2. The only two non-interior points are $-1.5$ and $0.5$. Again by sketching the graph, we see that $-1.5$ is a minimum point and $0.5$ is a maximum point.

Altogether, we conclude that there are two maximum points ($-1$ and $0.5$) and two minimum points ($-0.6$ and $-1.5$).
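The arithmetic in this example can be double-checked with a short script. This is a minimal sketch, assuming the stationary points have already been found by hand as above; the function names `f1`, `f2` and `second_derivative_test` are my own labels, and the small tolerance only guards against floating-point rounding.

```python
def f(x):
    return x**5 + 2 * x**4 + x**3


def f1(x):
    """First derivative of f."""
    return 5 * x**4 + 8 * x**3 + 3 * x**2


def f2(x):
    """Second derivative of f."""
    return 20 * x**3 + 24 * x**2 + 6 * x


def second_derivative_test(a, tol=1e-9):
    """Apply the 2DT at a stationary point a."""
    value = f2(a)
    if value < -tol:
        return "maximum"
    if value > tol:
        return "minimum"
    return "2DT uninformative"


stationary_points = [-1.0, -0.6, 0.0]  # roots of f'(x) = 0 found in the example
endpoints = [-1.5, 0.5]                # the two non-interior points

classification = {a: second_derivative_test(a) for a in stationary_points}
for a, verdict in classification.items():
    print(f"x = {a}: f'(x) = {f1(a):.2e}, f''(x) = {f2(a):.2f}, {verdict}")

for e in endpoints:
    print(f"endpoint x = {e}: f(x) = {f(e):.5f}")
```

The endpoint values still have to be compared with nearby values of f (for instance on a sketch) to decide whether each endpoint is a maximum or a minimum point; the script only evaluates f there.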

Exercise 69. Use the Enriched Recipe to find the maximum and minimum points of each of the following functions. (Answer on p. 1031.)

(a) $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^8 + x^7 - x^6$.

(b) $h: \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to \mathbb{R}$ defined by $x \mapsto \tan x$.

(c) $i: [0, 2\pi] \to \mathbb{R}$ defined by $x \mapsto \sin x + \cos x$.

12.2

The Venn diagram below depicts the five types of points you need to know for the A-levels: inflexion, maximum, minimum, stationary, and turning points. To its right is a graph of a rather arbitrary function $t: D \to \mathbb{R}$ designed to illustrate these various points. The x- and y-coordinates of a are denoted $a_x$ and $a_y$; similarly for other points.

[Figure: Venn diagram of all points, with regions for inflexion, stationary, turning, maximum (Max), and minimum (Min) points, alongside the graph of t showing the labelled points discussed below.]

For most functions you'll ever encounter, most points are like a. For lack of a better name, we can call such points boring points: a boring point is simply any point that is not an inflexion, maximum, minimum, stationary, or turning point.

b is a non-stationary point of inflexion (explicitly excluded from the A-levels).

c is a stationary point of inflexion.

A point like d (not illustrated), a stationary point that is not a maximum, minimum, or inflexion point, is extremely unusual. You can find an exotic example on p. 968.

f is both a maximum and minimum point because for all $x \in D$ that are close to $f_x \in D$, we have $t(x) \le t(f_x) \le t(x)$.

The set of turning points is simply the intersection of the set of stationary points and the set of maximum and minimum points.

h is a maximum point because $t(x) \le t(h_x)$ for all $x \in D$ that are close to $h_x$.

j is a minimum point because $t(x) \ge t(j_x)$ for all $x \in D$ that are close to $j_x$.

i is both a maximum and minimum point because there are simply no $x \in D$ that are close to $i_x \in D$, and thus it is trivially or vacuously true that $t(x) \le t(i_x) \le t(x)$ for x that are close to $i_x$. [23] i is not a stationary point because $t'(i_x) \ne 0$; indeed, $t'(i_x)$ is undefined. [24]

[23] A point like $i_x \in D$ that is not close to any other $x \in D$ is, aptly enough, called an isolated point.

[24] $i_x$ is an example of a critical point. A critical point is any point that is either stationary or where the derivative is undefined. Don't worry, this is not something you need to know for the A-levels.

Exercise 70. For each of the following equations, (i) sketch its graph. (ii) Write down the

points at which it intersects the axes. (iii) Identify any turning points. (iv) Write down the

equations of any lines of symmetry and also (v) asymptotes. (a) y = 2ex + x. (b) x = 3x + 2.

(c) y = 2x2 + 1. (Answers on pp. 1032, 1033, and 1034.)


13

Given the graph of $f'$, you are required to know how to figure out what f looks like. Let's start with a very simple example.

Example 128. Let $f: \mathbb{R} \to \mathbb{R}$ be some differentiable function. Graphed below in blue is its derivative $f'$. You are told also that f(0) = 2. What does the graph of f look like? (Pretend for a moment that you can't see the red graph.)

The derivative simply gives the slope of f. Since $f'(x) = 1$ for all x, this means that f has a constant slope of 1. We are given moreover that f(0) = 2 (i.e. the vertical intercept is 2).

Altogether then, f(x) = x + 2 and is graphed in red above.

is its derivative g . You are told also that lim g(x) = 2. What does the graph of g look

x0

like? (Pretend for a moment that you cant see the red graph.)

-1

The derivative simply gives the slope of g. Since g (x) = 1 for all x < 0 and g (x) = 1 for

all x > 0, this means that g has constant slope of 1 for x < 0 and constant slope of 1 for

all x > 0. We are given moreover that lim g(x) = 2, so the two branches of g nearly meet

x0

x 2,

g(x) =

x 2,

for x < 0,

for x > 0.


Example 130. Let $h: \mathbb{R} \to \mathbb{R}$ be some differentiable function. Graphed below in blue is its derivative $h'$ defined by $h'(x) = x$. You are told also that h(0) = 0. What does the graph of h look like? (Pretend for a moment that you can't see the red graph.)

The derivative simply gives the slope of h. Since $h'(x) < 0$ for all x < 0, $h'(0) = 0$, and $h'(x) > 0$ for all x > 0, this means that h is strictly decreasing on $\mathbb{R}^-$, has a turning point at 0, and is strictly increasing on $\mathbb{R}^+$.

Moreover, the derivative (slope) is increasing (indeed it is increasing at a constant rate) so the graph of h is concave upwards throughout.

Altogether then, even if we don't know how to figure out what h(x) is, we can at least roughly sketch the graph of h (in red below). (Of course, you probably already know from secondary school that $h(x) = \frac{x^2}{2}$, but we're not supposed to know this until we learn about integration later in this textbook.)
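The idea of sketching h from its slope can be imitated numerically: start at the known value h(0) = 0 and repeatedly add "slope times a small step". The sketch below is an illustration only; the function name `value_from_slope` and the step count are my own choices, and the averaging of the slope at both ends of each step (the trapezium rule) is a technique swapped in here, not something the example itself uses.

```python
def slope(x):
    """The given derivative h'(x) = x."""
    return x


def value_from_slope(slope_fn, x, start_value=0.0, steps=1000):
    """Approximate h(x), given h(0) = start_value, by accumulating slope * small step.

    Each small step uses the average of the slope at its two ends (trapezium rule).
    """
    dx = x / steps
    total = start_value
    for k in range(steps):
        left = k * dx
        right = (k + 1) * dx
        total += 0.5 * (slope_fn(left) + slope_fn(right)) * dx
    return total


for x in [-2, -1, 0, 1, 2]:
    print(x, round(value_from_slope(slope, x), 6), x**2 / 2)
```

The two printed columns agree, which is consistent with the remark that $h(x) = x^2/2$; the values fall as x moves from -2 to 0 and rise afterwards, matching the sketch described above.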

14

Quadratic equations show up very often in various contexts. So here is a fairly complete if brisk review of quadratic equations, which you were supposed to have completely mastered in secondary school.

Example 131. Below are the graphs of the equations $y = x^2 + 3x + 1$ (red), $y = x^2 + 2x + 1$ (blue), $y = x^2 + x + 1$ (green), $y = -x^2 + x + 1$ (red dotted), $y = -x^2 - 2x - 1$ (blue dotted), and $y = -x^2 - x - 1$ (green dotted).

[Figure: graphs of the six quadratic equations above, each labelled with its equation.]

otherwise we are in the trivial case of a linear equation. First, write:

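$$ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x + \frac{c}{a}\right).$$

Next, observe that

$$x^2 + \frac{b}{a}x = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}.$$

Hence,

$$ax^2 + bx + c = a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2} + \frac{c}{a}\right] = a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}\right].$$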

What we just did above is called completing the square. We can use this to compute the zeros or roots of the equation $ax^2 + bx + c = 0$.

$$\begin{aligned}
ax^2 + bx + c = 0 &\iff a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}\right] = 0 \\
&\iff \left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2} = 0 \\
&\iff \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \\
&\iff x + \frac{b}{2a} = \pm\frac{\sqrt{b^2 - 4ac}}{2a} \\
&\iff x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{aligned}$$

This last expression gives the roots of the equation $ax^2 + bx + c = 0$. This expression will NOT be printed in the A-Level List of Formulae! So be sure you remember it!
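The two results above (the completed square and the formula for the roots) are easy to check on concrete numbers. The following is a minimal sketch; the function names `complete_the_square` and `real_roots` and the sample coefficients are my own, chosen only for illustration.

```python
import math


def complete_the_square(a, b, c):
    """Return (p, q) such that a*x**2 + b*x + c == a*((x + p)**2 - q), for a != 0."""
    p = b / (2 * a)
    q = (b**2 - 4 * a * c) / (4 * a**2)
    return p, q


def real_roots(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 (a != 0), smallest first; empty list if none."""
    disc = b**2 - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})


a, b, c = 2, 3, -5
p, q = complete_the_square(a, b, c)
x = 1.7
print(a * x**2 + b * x + c, a * ((x + p) ** 2 - q))  # both sides of the identity
print(real_roots(a, b, c))
print(real_roots(1, 2, 1), real_roots(1, 1, 1))
```

Each root returned by `real_roots` can be substituted back into $ax^2 + bx + c$ to confirm that the result is zero up to rounding.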

We can distinguish between six categories of quadratic equations, based on the signs of a (the coefficient of $x^2$) and $b^2 - 4ac$ (the discriminant). Each of these six categories was illustrated in the figure above.

Category: Features

1. $a > 0$, $b^2 - 4ac > 0$: ∪-shaped. Intersects the horizontal axis at two points.

2. $a > 0$, $b^2 - 4ac = 0$: ∪-shaped. Just touches the horizontal axis at the minimum point.

3. $a > 0$, $b^2 - 4ac < 0$: ∪-shaped. Doesn't intersect the horizontal axis.

4. $a < 0$, $b^2 - 4ac > 0$: ∩-shaped. Intersects the horizontal axis at two points.

5. $a < 0$, $b^2 - 4ac = 0$: ∩-shaped. Just touches the horizontal axis at the maximum point.

6. $a < 0$, $b^2 - 4ac < 0$: ∩-shaped. Doesn't intersect the horizontal axis.
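As a quick check of the six categories, the sketch below classifies the six equations of Example 131 by the signs of a and of the discriminant. The function name `category` is a hypothetical helper of mine, and the numbering simply follows the list above.

```python
def category(a, b, c):
    """Return the category number (1 to 6) of y = a*x**2 + b*x + c, with a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = b**2 - 4 * a * c
    if disc > 0:
        offset = 1      # two horizontal intercepts
    elif disc == 0:
        offset = 2      # just touches the horizontal axis
    else:
        offset = 3      # no horizontal intercepts
    # Categories 1-3 have a > 0; categories 4-6 have a < 0.
    return offset if a > 0 else offset + 3


# Coefficients (a, b, c) of the six equations in Example 131, in the order listed there.
equations = [(1, 3, 1), (1, 2, 1), (1, 1, 1), (-1, 1, 1), (-1, -2, -1), (-1, -1, -1)]
categories = [category(a, b, c) for a, b, c in equations]
print(categories)  # [1, 2, 3, 4, 5, 6]
```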

The Sign of a. If a > 0, then the graph is ∪-shaped and has a minimum turning point at $x = -\frac{b}{2a}$. Conversely, if a < 0, then the graph is ∩-shaped and has a maximum turning point at $x = -\frac{b}{2a}$.

The Discriminant. The term $b^2 - 4ac$ is called the discriminant. This name makes sense, because it helps us discriminate between several possible cases of the equation $ax^2 + bx + c = 0$:

If $b^2 - 4ac > 0$, then:

There are two real roots (or zeros or horizontal intercepts), namely $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

$$ax^2 + bx + c = a\left(x - \frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\left(x + \frac{b + \sqrt{b^2 - 4ac}}{2a}\right).$$

What we have just done is to factorise the expression $ax^2 + bx + c$. Factorisation is often a useful trick to play.

Notice that if you plug either of the roots into the right-hand side (RHS) of the above equation, you do indeed get zero, as expected.

If $b^2 - 4ac = 0$, then:

There is only one real root (or zero or horizontal intercept), namely $-\frac{b}{2a}$.

$$ax^2 + bx + c = a\left(x - \left(-\frac{b}{2a}\right)\right)^2 = a\left(x + \frac{b}{2a}\right)^2.$$

Notice that if you plug $x = -\frac{b}{2a}$ into the RHS of the above equation, you do indeed get zero, as expected.

If $b^2 - 4ac < 0$, then:

There are no real roots (or zeros or horizontal intercepts).

There is no way to factorise the expression $ax^2 + bx + c$ (unless we use complex numbers, which we'll learn about only in Part IV).

Exercise 71. For each of the following equations, sketch its graph and identify its intercepts

and turning points (if these exist). (a) y = 2x2 + x + 1. (b) y = 2x2 + x + 1. (c) y = x2 + 6x + 9.

(Answer on p. 1035.)


15

Transformations

15.1

y = f (x) + a

The graph of y = f (x) + a is simply the graph of y = f (x) translated (moved) upwards by

a units.

Example 132. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = f(x) + 2 (blue) are shown below.

Notice the blue curve is simply the red curve translated upwards by 2 units.

[Figure: graphs of f(x) and f(x) + 2.]

15.2

y = f (x + a)

Why leftwards (and not rightwards)? The reason is that in order for $f(x_1)$ and $f(x_2 + a)$ to hit the same value, we must have $x_2 = x_1 - a$. That is, every x value is moved to the left by a units.
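The "leftwards" argument can be seen on concrete numbers. In this sketch, the function $x \mapsto x^3 - 1$ is borrowed from the examples of this chapter, and the helper name `shifted` and the sample inputs are my own choices.

```python
def f(x):
    return x**3 - 1


def shifted(func, a):
    """Return the function x -> func(x + a)."""
    return lambda x: func(x + a)


a = 2
g = shifted(f, a)  # g(x) = f(x + 2)

for x1 in [-1.0, 0.0, 1.5]:
    x2 = x1 - a  # the input at which g hits the value f(x1)
    print(f"f({x1}) = {f(x1)}, g({x2}) = {g(x2)}")
```

Each value that f takes at $x_1$ is taken by g at $x_1 - a$, i.e. a units further to the left.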

Example 133. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = f(x + 2) (blue) are shown below. (The latter equation is simply $y = (x + 2)^3 - 1$.)

Notice the blue curve is simply the red curve translated leftwards by 2 units.

[Figure: graphs of f(x) and f(x + 2).]

15.3

y = af (x)

The graph of y = af (x) is simply the graph of f (x) vertically-stretched (outwards from the

horizontal axis) by a stretching factor of a.

Example 134. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = 2f(x) (blue) are shown below.

Notice the blue curve is simply the red curve stretched vertically (outwards from the horizontal axis) by a factor of 2.

[Figure: graphs of f(x) and 2f(x).]

15.4

y = f (ax)

The graph of y = f (ax) is simply the graph of f (x) horizontally-stretched (outwards from

the vertical axis) by a stretching factor of 1/a. Or equivalently, the graph of y = f (ax) is

simply the graph of f (x) horizontally-compressed (inwards towards the vertical axis) by a

compression factor of a.

Why a stretching factor of 1/a (and not a)? The reason is that in order for f (x1 ) and

f (ax2 ) to hit the same value, we must have x2 = x1 /a. That is, every x value is scaled by

a factor of 1/a.
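The same kind of numerical check works for the stretching factor 1/a. As before, this is only a sketch: the function $x \mapsto x^3 - 1$ is borrowed from the examples of this chapter, and the helper name `scaled` and the sample inputs are my own.

```python
import math


def f(x):
    return x**3 - 1


def scaled(func, a):
    """Return the function x -> func(a * x)."""
    return lambda x: func(a * x)


a = 2
g = scaled(f, a)  # g(x) = f(2x)

for x1 in [-3.0, 1.0, 2.5]:
    x2 = x1 / a  # the input at which g hits the value f(x1)
    print(f"f({x1}) = {f(x1)}, g({x2}) = {g(x2)}")
```

With a = 2, every value of f is reached by g at half the original x value, which is the horizontal stretch by a factor of 1/2 described above.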

Example 135. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = f(2x) (blue) are shown below. (The latter equation is simply $y = (2x)^3 - 1 = 8x^3 - 1$.)

Notice the blue curve is simply the red curve stretched horizontally (outwards from the vertical axis) by a factor of $\frac{1}{2}$. (Again, the A-level exams might instead word this as a stretch with scale factor 0.5 parallel to the x-axis.)

Equivalently, the blue curve is simply the red curve compressed horizontally (inwards towards the vertical axis) by a factor of 2.

[Figure: graphs of f(x) and f(2x).]

15.5

y = 1.1f(x - 1) (blue), and y = f(1.1x) - 1 (green) are shown below.

Notice the blue curve is simply the red curve translated rightwards by 1 unit and then stretched vertically (outwards from the horizontal axis) by a factor of 1.1.

Notice the green curve is simply the red curve stretched horizontally (outwards from the vertical axis) by a factor of 1/1.1 and then translated downwards by 1 unit.

[Figure: graphs of f(x), 1.1f(x - 1), and f(1.1x) - 1.]

15.6

y = |f(x)|

The graph of y = |f(x)| is simply the graph of f(x), but with all points for which f(x) < 0 reflected in the horizontal axis.

Example 137. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = |f(x)| (blue) are shown below.

[Figure: graphs of f(x) and |f(x)|.]

15.7

y = f(|x|)

The graph of y = f(|x|) is simply the graph of f(x) for $x \ge 0$, together with its reflection in the vertical axis (which replaces the part of the graph of f for which x < 0).

Example 138. Define the function $f: \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3 - 1$. The graphs of f (red) and y = f(|x|) (blue) are shown below.

[Figure: graphs of f(x) and f(|x|).]

15.8

$y = \frac{1}{f(x)}$

[Figure: graphs of f(x) and $y = \frac{1}{f(x)}$ (blue).]

Notice that wherever f(x) = 0, we have a vertical asymptote for the graph of $y = \frac{1}{f(x)}$. So in this case, $x = 1 \implies f(x) = 0$ and thus x = 1 is a vertical asymptote for the graph of $y = \frac{1}{f(x)}$. As x approaches 1 from the left, $\frac{1}{f(x)} \to -\infty$. And as x approaches 1 from the right, $\frac{1}{f(x)} \to \infty$.

Also, if as $x \to \pm\infty$, $f(x) \to \pm\infty$, then we also have $\frac{1}{f(x)} \to 0$, so that y = 0 is a horizontal asymptote. So here, as $x \to \infty$, $\frac{1}{f(x)}$ approaches 0 from above and as $x \to -\infty$, $\frac{1}{f(x)}$ approaches 0 from below.
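The behaviour of 1/f(x) near a zero of f, and far from the origin, can be tabulated directly. This sketch assumes that f is the same function $x \mapsto x^3 - 1$ used in the earlier examples of this chapter (the text here only tells us that f(1) = 0); the sample inputs are arbitrary.

```python
def f(x):
    """Assumed for illustration: f(x) = x^3 - 1, which is zero at x = 1."""
    return x**3 - 1


def reciprocal(x):
    """The transformed function 1/f(x); undefined where f(x) = 0."""
    return 1 / f(x)


print("Approaching 1 from the left:")
for x in [0.9, 0.99, 0.999]:
    print(x, reciprocal(x))

print("Approaching 1 from the right:")
for x in [1.1, 1.01, 1.001]:
    print(x, reciprocal(x))

print("Far from the origin:")
for x in [10, 100, -10, -100]:
    print(x, reciprocal(x))
```

The first table grows large and negative, the second grows large and positive, and the third shrinks towards 0 (from above for large positive x, from below for large negative x).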

15.9

$y^2 = f(x)$

1. It is symmetric in the horizontal axis. This is because if $y_1$ satisfies $y^2 = f(x)$, then so too does $-y_1$.

2. If f(x) < 0, then there is no value of y for which $y^2 = f(x)$. And so the graph of $y^2 = f(x)$ is empty wherever f(x) < 0.

3. The graph of $y^2 = f(x)$ intersects the horizontal axis at the same points as the graph of y = f(x). Moreover, at any such point, the tangent to the graph of $y^2 = f(x)$ is vertical.

[Figure: graphs of y = f(x) and $y^2 = f(x)$ (blue).]

Exercise 72. The graph of the function f R R is drawn below in red. Graph each of

the following equations. (a) y = 2f (3x). (b) y = f (x 1). (c) y 2 = f (x) + 4. (Answer on

p. 1036.)

[Figure: graph of f (red).]

Exercise 73. Describe a series of transformations that would transform the graph of $y = \frac{1}{x}$ to $y = 3 - \frac{1}{5x - 2}$. (Answer on p. 1037.)

16

Conic Sections

Conic sections are formed from the intersection of a double cone and a 2D cartesian plane.

Take an infinitely large double cone (it goes upwards and downwards forever). Use a 2D

cartesian plane to slice the double cone from all conceivable positions and at all conceivable

angles. The intersection of the plane and the surface of the double cone form curves which,

aptly enough, are called conic sections.

The figure below [25] doesn't show the upper half of the double cone, but you can easily imagine it. Of the four curves depicted, only the hyperbola also cuts the upper half of the double cone.

25


The three types of conic sections are the ellipse (plural: ellipses), the parabola (parabolae), and the hyperbola (hyperbolae). The circle is regarded as a special case of the ellipse. [26] Here are the distinguishing characteristics of each:

Type: Description. Arises when.

Ellipse: Formed from only one half of the double cone. A closed curve. [27] Arises when $B^2 - 4AC < 0$.

Parabola: Formed from only one half of the double cone. Not a closed curve. Arises when $B^2 - 4AC = 0$.

Hyperbola: Formed from both halves of the double cone and is thus composed of two distinct branches. Not a closed curve. Arises when $B^2 - 4AC > 0$.

We can prove (but do not do so in this textbook) that in general, a conic section is the graph of the equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, \tag{1}$$

where A, B, C, D, E, F are real constants and x and y are the two variables (on the cartesian plane).

We refer to the expression $B^2 - 4AC$ as the discriminant of the above equation. It is so named because it discriminates between the three possible types of conic sections. We can prove (but do not do so in this textbook) that if $B^2 - 4AC < 0$, then we have an ellipse; if $B^2 - 4AC = 0$, then we have a parabola; and if $B^2 - 4AC > 0$, then we have a hyperbola.

In secondary school, we already learnt in some detail a special case of conic sections: the quadratic $y = ax^2 + bx + c$. This is the special case of equation (1) where A = a, B = 0, C = 0, D = b, E = -1, and F = c.

The quadratic $y = ax^2 + bx + c$ is indeed a parabola, because $B^2 - 4AC = 0^2 - 4(a)(0) = 0$.

We already reviewed quadratic equations in section 14 and so we won't talk any more about them in this chapter.
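The discriminant rule (negative: ellipse; zero: parabola; positive: hyperbola) is simple enough to put into a few lines of code. This is a minimal sketch that, like the text, ignores degenerate conic sections; the function name `classify_conic` and the three sample equations are my own choices.

```python
def classify_conic(A, B, C):
    """Classify A*x^2 + B*x*y + C*y^2 + D*x + E*y + F = 0 by the sign of B^2 - 4AC.

    Degenerate cases (e.g. a single point or a pair of lines) are not detected.
    """
    disc = B**2 - 4 * A * C
    if disc < 0:
        return "ellipse"
    if disc == 0:
        return "parabola"
    return "hyperbola"


print(classify_conic(1, 0, 1))  # x^2 + y^2 - 1 = 0, the unit circle
print(classify_conic(3, 0, 0))  # 3x^2 - y = 0, i.e. the quadratic y = 3x^2
print(classify_conic(0, 1, 0))  # xy - 1 = 0, i.e. y = 1/x
```

Only A, B and C matter for the classification; D, E and F shift or rescale the curve without changing its type (degenerate cases aside).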

[26] Strictly speaking, there are also the so-called degenerate conic sections, but we shall ignore these.

For A-levels, we are only required to learn about five more special cases of conic sections, listed below. And so that's the plan for this chapter.

1. $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$,

2. $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$,

3. $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$,

4. $y = \frac{ax + b}{cx + d}$,

5. $y = \frac{ax^2 + bx + c}{dx + e}$.

Exercise 74. As per the general form given in equation (1), state for each of the above five equations, what A, B, C, D, E, and F are. Compute the discriminant for each equation. Hence, conclude that the first equation is of an ellipse and the remaining four are of hyperbolae. (Answer on p. 1038.)

16.1

The equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ describes an ellipse. In this section, we'll study a special case of this equation, where a = b = 1. The equation then becomes $x^2 + y^2 = 1$, which is the unit circle centred on the origin.

By unit circle, we mean that it has radius of unit length, i.e. length 1.

[Figure: the unit circle $x^2 + y^2 = 1$ with centre (0, 0) and a point p = (x, y) on it.]

Why does this equation describe a circle?

You can easily see that (1, 0), (0, 1), (-1, 0), and (0, -1) all satisfy the equation and are thus part of its graph. Indeed, these are the horizontal and vertical intercepts. What about elsewhere on the circle?

Consider any point p on the unit circle. It forms a triangle: the line connecting it to the origin is the hypotenuse; that connecting it to the horizontal axis is the side; and that

We have just proven that every point (x, y) on the unit circle satisfies the equation $x^2 + y^2 = 1$. We now examine some of its characteristics.

1. Intercepts. The graph intersects the vertical axis at the points (0, 1) and (0, -1) and the horizontal axis at the points (1, 0) and (-1, 0).

2. Turning points. In this case, it is easy to see that there is a maximum turning point at (0, 1) and a minimum turning point at (0, -1). But just as an exercise, let's also try to find these turning points more rigorously, i.e. through calculus.

Exercise 49 showed that although it is impossible to rewrite the equation $x^2 + y^2 = 1$ into the form of a single function, it is nonetheless possible to break it up and rewrite it into the form of two functions.

Namely, $f: [-1, 1] \to \mathbb{R}$ defined by $x \mapsto \sqrt{1 - x^2}$ and $g: [-1, 1] \to \mathbb{R}$ defined by $x \mapsto -\sqrt{1 - x^2}$. Above, the graph of the function f is the upper semicircle (red) and the graph of the function g is the lower semicircle (blue).

Let's compute the first derivative of f and set it equal to 0:

$$f'(x) = 0.5(1 - x^2)^{-0.5}(-2x) = -x(1 - x^2)^{-0.5},$$

$$-x(1 - x^2)^{-0.5} = 0 \iff x = 0.$$

So the only stationary point of the function f is 0. We must now determine whether it is a maximum, minimum, or inflexion point.

Compute the second derivative and evaluate it at the stationary point:

$$f''(x) = -(1 - x^2)^{-0.5} - x\left[-0.5(1 - x^2)^{-1.5}(-2x)\right].$$

This second derivative is messy and can be further simplified, but in this case there is no need to simplify it, since all we want is to evaluate it at 0. We have

$$f''(0) = -(1 - 0^2)^{-0.5} - 0 \cdot \left[-0.5(1 - 0^2)^{-1.5}(-2 \cdot 0)\right] = -1 < 0.$$

Hence, the point x = 0 is a maximum turning point of f. We should make it a habit to write out the point in full, as (0, f(0)) = (0, 1).

Since g = -f, it follows that $g'(0) = 0$ and $g''(0) = 1 > 0$. That is, the only stationary point of the function g is (0, g(0)) = (0, -1), and it is a minimum point.

3. Asymptotes. By observation, there are no asymptotes.

4. Symmetry. The graph is a perfect circle centred on the origin. So by observation,

every line that passes through the origin is a line of symmetry!
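The derivative values found in the turning-points step can be sanity-checked without any symbolic algebra, using finite differences. This is a rough numerical sketch only: the helper names and step sizes are my own choices, and the results are approximations of $f'(0)$ and $f''(0)$, not exact values.

```python
import math


def f(x):
    """Upper semicircle of the unit circle."""
    return math.sqrt(1 - x**2)


def g(x):
    """Lower semicircle of the unit circle, g = -f."""
    return -f(x)


def first_derivative(func, x, h=1e-5):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def second_derivative(func, x, h=1e-4):
    """Central-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2


print("f'(0)  ~", first_derivative(f, 0))
print("f''(0) ~", second_derivative(f, 0))
print("g''(0) ~", second_derivative(g, 0))
```

The approximations come out close to 0, -1 and 1 respectively, in line with the calculus above.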


16.2

The Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$

[Figure: the ellipse $x^2/a^2 + y^2/b^2 = 1$ with centre (0, 0), vertical intercepts b and -b, horizontal intercepts a and -a, and lines of symmetry y = 0 and x = 0.]

Squares are a proper subset of rectangles. Similarly, circles are a proper subset of ellipses.

The ellipse can be regarded as the generalisation of the circle.

Why does the equation $x^2/a^2 + y^2/b^2 = 1$ describe an ellipse? Rewrite the equation as

$$\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1.$$

Hence, going from $x^2 + y^2 = 1$ to $x^2/a^2 + y^2/b^2 = 1$ involves two transformations:

1. First, stretch the graph horizontally, outwards from the vertical axis, by a factor of a.

2. Then stretch the graph vertically, outwards from the horizontal axis, by a factor of b.

This gives us an elongated circle that we call an ellipse.
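The two stretches can be tried out on actual points. In the sketch below, a few points are taken from the unit circle, stretched by a horizontally and by b vertically, and then tested against the ellipse equation. The values a = 3 and b = 2, the helper names, and the tolerance are illustrative assumptions of mine.

```python
import math


def stretch(point, a, b):
    """Stretch a point horizontally by a factor of a and vertically by a factor of b."""
    x, y = point
    return (a * x, b * y)


def on_ellipse(point, a, b, tol=1e-12):
    """Check whether a point satisfies x^2/a^2 + y^2/b^2 = 1, up to rounding."""
    x, y = point
    return abs((x / a) ** 2 + (y / b) ** 2 - 1) < tol


a, b = 3.0, 2.0  # hypothetical values, chosen only for illustration

# Eight points evenly spaced around the unit circle.
circle_points = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8)) for k in range(8)]
ellipse_points = [stretch(p, a, b) for p in circle_points]

print(all(on_ellipse(p, a, b) for p in ellipse_points))
print(ellipse_points[0], ellipse_points[2])  # images of (1, 0) and (0, 1)
```

The images of (1, 0) and (0, 1) land (up to rounding) on the horizontal intercept (a, 0) and the vertical intercept (0, b).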

1. Intercepts. The graph intersects the vertical axis at the points (0, b) and (0, -b), and the horizontal axis at the points (a, 0) and (-a, 0).

2. Turning points. Clearly, there are maximum and minimum turning points at (0, b) and (0, -b). Let's find these rigorously using calculus.

Let's again break the equation up and rewrite it into the form of two functions. Namely, $f: [-a, a] \to \mathbb{R}$ defined by $x \mapsto b\sqrt{1 - x^2/a^2}$ and $g: [-a, a] \to \mathbb{R}$ defined by $x \mapsto -b\sqrt{1 - x^2/a^2}$. These are graphed above. Let's compute the first derivative of f and set it equal to 0:

$$f'(x) = 0.5b\left(1 - \frac{x^2}{a^2}\right)^{-0.5}\left(-\frac{2x}{a^2}\right) = -bx\left(1 - \frac{x^2}{a^2}\right)^{-0.5}a^{-2} = 0 \iff x = 0.$$

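So the only stationary point of the function f is 0. We can show that it is a maximum point, by computing the second derivative and evaluating it at 0:

$$\begin{aligned}
f''(x) &= \frac{d}{dx}\left[-bx\left(1 - \frac{x^2}{a^2}\right)^{-0.5}a^{-2}\right] = -a^{-2}b\,\frac{d}{dx}\left[x\left(1 - \frac{x^2}{a^2}\right)^{-0.5}\right] \\
&= -a^{-2}b\left[\left(1 - \frac{x^2}{a^2}\right)^{-0.5} - 0.5x\left(1 - \frac{x^2}{a^2}\right)^{-1.5}\left(-\frac{2x}{a^2}\right)\right],
\end{aligned}$$

$$f''(0) = -a^{-2}b\left[\left(1 - \frac{0^2}{a^2}\right)^{-0.5} - 0.5 \cdot 0 \cdot \left(1 - \frac{0^2}{a^2}\right)^{-1.5}\left(-\frac{2 \cdot 0}{a^2}\right)\right] = -a^{-2}b < 0.$$

And since g = -f, $g'(0) = 0$ and $g''(0) = a^{-2}b > 0$. That is, the only stationary point of g is (0, -b) and it is a minimum point.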

3. Asymptotes. By observation, there are no asymptotes.

4. Symmetry. By observation, there are only two lines of symmetry, namely y = 0 and

x = 0 (the horizontal and vertical axes).

the equation

$$\frac{(x + c)^2}{a^2} + \frac{(y + d)^2}{b^2} = 1.$$

(i) Sketch its graph. (ii) Write down the points at which it intersects the axes. (iii) Identify any turning points. (iv) Write down the equations of any lines of symmetry and also (v) asymptotes.

16.3

y = 1/x (graphed) is the first hyperbola we'll study. It is also the simplest possible hyperbola.

[Figure: graph of y = 1/x, which has two branches, with centre (0, 0), horizontal asymptote y = 0, vertical asymptote x = 0, and lines of symmetry y = x and y = -x.]

It turns out that all hyperbolae we'll study have some common features. They have two branches. In the case of y = 1/x, one branch is top-right and the other is bottom-left.

1. Intercepts. Hyperbolae may or may not cross the axes. It depends.

y = 1/x is an example of a hyperbola that crosses neither the vertical nor the horizontal axis. (But this is not true of all hyperbolae.)

2. Turning points. Hyperbolae may or may not have turning points. It depends.

y = 1/x is an example of a hyperbola that has no turning points. (But this is not true of all hyperbolae.)

3. Asymptotes. Hyperbolae always have two asymptotes.

In the case of y = 1/x, they are y = 0 and x = 0.

An interesting feature here is that the two asymptotes are perpendicular. A rectangular hyperbola is any hyperbola whose two asymptotes are perpendicular. And so y = 1/x is an example of a rectangular hyperbola.

(But as we'll see, not all hyperbolae are rectangular.)

4. The centre is the point at which the two asymptotes intersect.

In the case of y = 1/x, the centre is (0, 0).

5. Two lines of symmetry. Both pass through the centre. Moreover, each line of symmetry bisects an angle formed by the two asymptotes.

In the case of y = 1/x, they are y = x and y = -x.

16.4

The Hyperbola $x^2 - y^2 = 1$

$x^2 - y^2 = 1$ is a hyperbola and so it has two distinct branches. Notice also that if $x \in (-1, 1)$, then there is no value of y for which $x^2 - y^2 = 1$. Hence, the graph of this equation is empty in the region where $x \in (-1, 1)$.

[Figure: the hyperbola $x^2 - y^2 = 1$, with the curves y = f(x) and y = g(x), centre (0, 0), two horizontal intercepts, lines of symmetry x = 0 and y = 0, and linear asymptotes y = x and y = -x.]

1. Intercepts. The graph crosses the horizontal axis at the points (-1, 0) and (1, 0), but does not intersect the vertical axis.

2. There are no turning points.

3. Asymptotes. We have $y = \pm\sqrt{x^2 - 1}$. So as $x \to \pm\infty$, $y = \pm\sqrt{x^2 - 1} \to \pm\sqrt{x^2} = \pm x$. (Informally, as $x \to \pm\infty$, the $-1$ becomes negligible and we can simply ignore it.) And so the two asymptotes are y = x and y = -x. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (0, 0).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Since the asymptotes have slopes 1 and -1, one line of symmetry must be horizontal and the other vertical. Moreover, both pass through the centre (0, 0). Altogether, we can work out that the lines of symmetry are y = 0 and x = 0.

16.5

The Hyperbola $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$

Going from $x^2 - y^2 = 1$ to

$$\left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 = 1$$

involves two simple transformations:

1. First stretch the graph horizontally, outwards from the vertical axis, by a factor of a.

2. Then stretch the graph vertically, outwards from the horizontal axis, by a factor of b.

[Figure: the hyperbola $x^2/a^2 - y^2/b^2 = 1$, with the curves y = f(x) and y = g(x), centre (0, 0), horizontal intercepts a and -a, lines of symmetry x = 0 and y = 0, and linear asymptotes y = bx/a and y = -bx/a.]

The graph's characteristics are similar to before. Again, $\left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 = 1$ is a hyperbola and so it has two distinct branches. Notice also that if $x \in (-a, a)$, then there is no value of y for which $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$. Hence, the graph of this equation is empty in the region where $x \in (-a, a)$.

1. Intercepts. The graph crosses the horizontal axis at the points (-a, 0) and (a, 0), but does not intersect the vertical axis.

2. There are no turning points.

3. Asymptotes. We have $y = \pm b\sqrt{\left(\frac{x}{a}\right)^2 - 1}$. So as $x \to \pm\infty$, $y = \pm b\sqrt{\left(\frac{x}{a}\right)^2 - 1} \to \pm b\sqrt{\left(\frac{x}{a}\right)^2} = \pm b\frac{x}{a}$. And so the two asymptotes are $y = b\frac{x}{a}$ and $y = -b\frac{x}{a}$. The two asymptotes are perpendicular only if a = b, in which case this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (0, 0).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Since the asymptotes have slopes b/a and -b/a, one line of symmetry must be horizontal and the other vertical. Moreover, both pass through the centre (0, 0). Altogether, we can work out that the lines of symmetry are y = 0 and x = 0.
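The claim about the asymptotes can be illustrated numerically: the vertical gap between the line y = bx/a and the upper half of the right-hand branch should shrink as x grows. This is a sketch under assumed values a = 3 and b = 2, which are my own; the function names are likewise hypothetical.

```python
import math


def upper_branch(x, a, b):
    """Upper half of the hyperbola x^2/a^2 - y^2/b^2 = 1, defined for |x| >= a."""
    return b * math.sqrt((x / a) ** 2 - 1)


def asymptote(x, a, b):
    """The asymptote y = b*x/a."""
    return b * x / a


a, b = 3.0, 2.0  # hypothetical values, chosen only for illustration

xs = [10, 100, 1000]
gaps = [asymptote(x, a, b) - upper_branch(x, a, b) for x in xs]
for x, gap in zip(xs, gaps):
    print(x, gap)
```

The gap stays positive (the curve lies below the line for positive x) but gets smaller and smaller, which is what it means for the line to be an asymptote.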

Exam Tip

On the A-level exams, they typically only ask for (i) the intercepts; (ii) the asymptotes;

and (iii) turning points.

Nonetheless, you might as well know about the centre and the two lines of symmetry,

because these concepts are not difficult and will help you to sketch better graphs.

16.6 The Hyperbola $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$

SYLLABUS ALERT

$y^2/b^2 - x^2/a^2 = 1$ is explicitly in the 9758 (revised) but not the 9740 (old) syllabus.

But even if you're taking 9740, you might as well learn to draw $y^2/b^2 - x^2/a^2 = 1$, because it's really simple (since you now know how to draw $x^2/a^2 - y^2/b^2 = 1$).

The graph of the equation $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ is simply the graph we studied in the previous section,

[Figure: graph of $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$, with parts labelled $y = f(x)$ and $y = g(x)$. Marked features: centre $(0, 0)$; lines of symmetry $x = 0$ and $y = 0$; vertical intercepts $b$ and $-b$; linear asymptotes $y = bx/a$ and $y = -bx/a$.]

Let's summarise the graph's characteristics. This is a hyperbola and so there are two distinct branches. Notice also that if $y \in (-b, b)$, then there is no value of $x$ for which $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$. Hence, the graph of this equation is empty in the region where $y \in (-b, b)$. The range of $y$ is thus $(-\infty, -b] \cup [b, \infty)$.

1. Intercepts. The graph crosses the vertical axis at the points $(0, b)$ and $(0, -b)$, but does not intersect the horizontal axis.

2. The two turning points are $(0, b)$ (minimum) and $(0, -b)$ (maximum).

3. Asymptotes. We have $y = \pm b\sqrt{1 + \left(\frac{x}{a}\right)^2}$. So as $x \to \pm\infty$, $y = \pm b\sqrt{1 + \left(\frac{x}{a}\right)^2} \to \pm b\sqrt{\left(\frac{x}{a}\right)^2} = \pm b\frac{x}{a}$. And so the two asymptotes are $y = b\frac{x}{a}$ and $y = -b\frac{x}{a}$. The two asymptotes are perpendicular only if $a = b$, and only in that case is this a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(0, 0)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Since the asymptotes have slopes $b/a$ and $-b/a$, these bisectors are horizontal and vertical. Moreover, both pass through the centre $(0, 0)$.

Altogether, we can work out that the lines of symmetry are $y = 0$ and $x = 0$.

If we'd like, we can also find the turning points of $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ more rigorously, that is, through calculus. As with the circle, although it is not possible to rewrite this equation into the form of a single function, it is possible to rewrite it into the form of two functions.

Namely, $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto b\sqrt{\left(\frac{x}{a}\right)^2 + 1}$ and $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto -b\sqrt{\left(\frac{x}{a}\right)^2 + 1}$. The graph of the function $f$ is entirely above the horizontal axis, while that of $g$ is entirely below the horizontal axis.

Let's compute the first derivative of $f$:

$$f'(x) = 0.5b\left[\left(\frac{x}{a}\right)^2 + 1\right]^{-0.5}\left(\frac{2x}{a^2}\right) = \frac{b}{a}\,\frac{x}{\sqrt{x^2 + a^2}}.$$

Hence, the only stationary point of $f$ is $(0, b)$. Let's check what sort of a stationary point this is.

$$f''(x) = \frac{b}{a}\,\frac{\sqrt{x^2 + a^2} - x(0.5)\left(x^2 + a^2\right)^{-0.5}(2x)}{x^2 + a^2}.$$

And so $f''(0) = \frac{b}{a^2} > 0$. Hence, this is a minimum point.

Similarly, by computing the first derivative of $g$ and doing the work, we can find that the only stationary point of $g$ is $(0, -b)$ and that this is a maximum point.
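If you'd like to double-check the derivative of the upper branch without redoing the algebra, here is a small sketch that compares the closed form $\frac{b}{a}\frac{x}{\sqrt{x^2+a^2}}$ against a central-difference estimate. The values of `a` and `b` and the helper names are hypothetical.

```python
import math

def f(x, a, b):
    """Upper branch of y^2/b^2 - x^2/a^2 = 1."""
    return b * math.sqrt((x / a) ** 2 + 1)

def f_prime(x, a, b):
    """Closed-form derivative (b/a) * x / sqrt(x^2 + a^2)."""
    return (b / a) * x / math.sqrt(x ** 2 + a ** 2)

def central_difference(func, x, h=1e-5):
    """Numerical estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

a, b = 2.0, 3.0  # hypothetical values, chosen only for illustration
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    estimate = central_difference(lambda u: f(u, a, b), x)
    print(x, f_prime(x, a, b), estimate)
```

The derivative vanishes only at $x = 0$, where $f(0) = b$, and nearby values of $f$ are larger, which is what we expect at a minimum.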


16.7

Remember long division? Turns out it'll be useful for dividing polynomials. Here are a couple of primary school examples to jog your memory.

Example 141. What's $83 \div 7$? By long division, the quotient is 11 with a remainder of 6. So, $83 \div 7 = 11\tfrac{6}{7}$.

          11
      7 ) 83
          77
           6

The quotient is the integer portion of the solution and the remainder is the left-over integer.

Example 142. What's $470 \div 17$? By long division, the quotient is 27 with a remainder of 11. So, $470 \div 17 = 27\tfrac{11}{17}$.

           27
      17 ) 470
           459
            11
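For what it's worth, the two examples above can be checked in one line each with Python's built-in `divmod`, which returns the quotient and the remainder together. The function name `long_divide` is just a label made up for this sketch.

```python
def long_divide(dividend, divisor):
    """Return (quotient, remainder) for whole-number division."""
    return divmod(dividend, divisor)

print(long_divide(83, 7))    # (11, 6)
print(long_divide(470, 17))  # (27, 11)
```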

Long division can be used to divide one polynomial by another. But first of all, in case you don't remember what a polynomial is...

Definition 47. An nth-degree polynomial in one variable is any expression $a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n$ where each $a_i$ is a constant and $x$ is the variable.

In this textbook, we'll almost always consider only polynomials in one variable. So when I say polynomial, I'll always mean a polynomial in one variable, unless otherwise stated.28

Example 143. The expressions $7x - 3$ and $4x + 2$ are 1st-degree polynomials (in one variable). These are also called linear polynomials. (Polynomials of low degree are often also called by such special names.)

These are also called quadratic polynomials.

Example 145. The expressions $2x^3 + 2x^2 + 3x - 1$ and $3x^3 + 2x^2 + 3x + 1$ are 3rd-degree polynomials. These are also called cubic polynomials.

Example 146. The expressions $5x^4 - 2x^3 + 2x^2 + 3x - 1$ and $9x^4 + 3x^3 + 2x^2 + 3x + 1$ are 4th-degree polynomials. These are also called quartic polynomials.

28 Actually, we've already secretly studied an example of a polynomial in two variables: the expression on the LHS of the equation of the conic section $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$.

Example 147. Say you have an expression $\frac{x^2 + 3}{x - 1}$. We might be perfectly content with this expression. Or we might try to simplify it through long division:

              x  + 1
      x - 1 ) x² + 0x + 3
              x² -  x
                    x + 3
                    x - 1
                        4

The quotient is $x + 1$ and the remainder is $4$. Hence,

$$\frac{x^2 + 3}{x - 1} = x + 1 + \frac{4}{x - 1}.$$

Example 148. Let's simplify $\frac{4x^3 + 2x^2 + 1}{2x^2 - x - 1}$ through long division:

                     2x  + 2
      2x² - x - 1 ) 4x³ + 2x² + 0x + 1
                    4x³ - 2x² - 2x
                          4x² + 2x + 1
                          4x² - 2x - 2
                                4x + 3

$$\frac{4x^3 + 2x^2 + 1}{2x^2 - x - 1} = 2x + 2 + \frac{4x + 3}{2x^2 - x - 1}.$$
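The long-division procedure used in the last two examples can be sketched in code. The block below is only one possible way to write it: polynomials are stored as lists of coefficients with the highest power first, and the function name `poly_divmod` is made up for this illustration. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def poly_divmod(numerator, denominator):
    """Divide two polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) as lists of Fractions.
    """
    num = [Fraction(c) for c in numerator]
    den = [Fraction(c) for c in denominator]
    if not den or den[0] == 0:
        raise ValueError("leading coefficient of the divisor must be non-zero")
    quotient = []
    # Each pass cancels the leading term of what is left of the dividend.
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)  # the leading term is now zero, so drop it
    return quotient, num

# (x^2 + 3) / (x - 1): quotient x + 1, remainder 4
print(poly_divmod([1, 0, 3], [1, -1]))
# (4x^3 + 2x^2 + 1) / (2x^2 - x - 1): quotient 2x + 2, remainder 4x + 3
print(poly_divmod([4, 2, 0, 1], [2, -1, -1]))
```

Remember to include a zero for any missing power (the `0` in `[1, 0, 3]` stands for $0x$), just as we wrote $+0x$ in the working above.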

Exercise 76. Simplify each of the following fractions through long division. (a) $\frac{16x + 3}{5x - 2}$. (b) $\frac{4x^2 - 3x + 1}{x + 5}$. (c) $\frac{x^2 + x + 3}{x^2 - 2x + 1}$. (Answer on p. 1040.)

16.8 The Hyperbola $y = \frac{bx + c}{dx + e}$

$$y = \frac{ax^2 + bx + c}{dx + e}.$$

In this section, as a warm-up, we'll study the special case of the above equation, where $a = 0$:

$$y = \frac{bx + c}{dx + e}.$$

Example 149. Graphed below is the equation $y = (2x + 1)/(x + 1)$. This is the case where $b = 2$, $c = 1$, $d = 1$, and $e = 1$. Do the long division:

              2
      x + 1 ) 2x + 1
              2x + 2
                  -1

$$y = \frac{2x + 1}{x + 1} = 2 - \frac{1}{x + 1}.$$

[Figure: graph of $y = (2x + 1)/(x + 1)$. Marked features: centre $(-1, 2)$; vertical asymptote $x = -1$; horizontal asymptote $y = 2$; lines of symmetry $y = x + 3$ and $y = -x + 1$.]

1. Intercepts. The graph intersects the vertical axis at the point $(0, 1)$ and the horizontal axis at the point $(-0.5, 0)$.

2. There are no turning points.

3. Asymptotes. As $x \to -1$, $y \to \pm\infty$. And so $x = -1$ is a vertical asymptote. As $x \to \pm\infty$, $y \to 2$. And so $y = 2$ is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(-1, 2)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope $1$ and $-1$. Moreover, both pass through the centre $(-1, 2)$. Altogether, we can work out that the lines of symmetry are $y = x + 3$ and $y = -x + 1$.

Example 150. Graphed below is the equation $y = (7x + 3)/(2x + 4)$. This is the case where $b = 7$, $c = 3$, $d = 2$, and $e = 4$. Do the long division:

               3.5
      2x + 4 ) 7x +  3
               7x + 14
                   -11

$$y = \frac{7x + 3}{2x + 4} = 3.5 - \frac{11}{2x + 4}.$$

Let's summarise the graph's characteristics. This is a hyperbola and so there are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, 0.75)$ and the horizontal axis at the point $(-3/7, 0)$.

2. There are no turning points.

3. Asymptotes. As $x \to -2$, $y \to \pm\infty$. And so $x = -2$ is a vertical asymptote. As $x \to \pm\infty$, $y \to 3.5$. And so $y = 3.5$ is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(-2, 3.5)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope $1$ and $-1$. Moreover, both pass through the centre $(-2, 3.5)$. Altogether, we can work out that the lines of symmetry are $y = x + 5.5$ and $y = -x + 1.5$.

Consider $\frac{bx + c}{dx + e}$. By long division, we have:

               b/d
      dx + e ) bx + c
               bx + be/d
                    c - be/d

The quotient is $b/d$ and the remainder is $c - be/d$. Let's further simplify this so that $x$ has no coefficient.

$$\frac{bx + c}{dx + e} = \frac{b}{d} + \frac{c - be/d}{dx + e} = \frac{b}{d} + \frac{c - be/d}{d}\cdot\frac{1}{x + e/d} = \frac{b}{d} + \frac{cd - be}{d^2}\cdot\frac{1}{x + e/d}.$$

We can thus get from $y = 1/x$ to the above equation, through these transformations:

1. Shift the graph leftwards by $\frac{e}{d}$ units to get the graph of $y = \frac{1}{x + e/d}$.

2. Stretch the graph vertically, outwards from the horizontal axis, by a factor of $\frac{cd - be}{d^2}$ to get the graph of $y = \frac{cd - be}{d^2 (x + e/d)}$.

3. Finally, shift the graph upwards by $b/d$ units to get the final graph.
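Here is a quick numerical check, offered only as a sketch, that applying the three transformations to $y = 1/x$ really does reproduce $y = \frac{bx + c}{dx + e}$. The function names are made up, and the coefficients are those of the earlier example $y = (7x + 3)/(2x + 4)$.

```python
def direct(x, b, c, d, e):
    """Evaluate (bx + c)/(dx + e) directly."""
    return (b * x + c) / (d * x + e)

def via_transformations(x, b, c, d, e):
    """Build the same value from y = 1/x using the three transformations."""
    y = 1 / (x + e / d)                 # 1. shift leftwards by e/d
    y = (c * d - b * e) / d ** 2 * y    # 2. stretch vertically by (cd - be)/d^2
    return y + b / d                    # 3. shift upwards by b/d

b, c, d, e = 7, 3, 2, 4
for x in (-5.0, -1.0, 0.0, 2.5, 10.0):
    print(x, direct(x, b, c, d, e), via_transformations(x, b, c, d, e))
```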

Exam Tip

The A-level exams often ask you to list down a series of transformations that will get you

from one graph to another, as was just done.

Let's now summarise the characteristics of the graph of the equation $y = \frac{bx + c}{dx + e}$. This is a hyperbola with two distinct branches.

1. Intercepts. If $e = 0$, then the graph does not cross the vertical axis. If $e \neq 0$, then the graph intersects the vertical axis at the point $(0, c/e)$. If $b = 0$, then the graph does not cross the horizontal axis. If $b \neq 0$, then the graph intersects the horizontal axis at the point $(-c/b, 0)$.

2. There are no turning points.29

3. Asymptotes. As $x \to -e/d$, $y \to \pm\infty$. And so $x = -e/d$ is a vertical asymptote. As $x \to \pm\infty$, $y \to b/d$. And so $y = b/d$ is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(-e/d, b/d)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope $1$ and $-1$. Moreover, both pass through the centre $(-e/d, b/d)$. Altogether, we can work out that the lines of symmetry are $y = x + e/d + b/d$ and $y = -x - e/d + b/d$.
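The summary above translates fairly directly into code. The sketch below is one way of doing so; the function name, the dictionary keys, and the extra requirement that $cd - be \neq 0$ (so that the fraction does not collapse to a constant) are assumptions made for this illustration.

```python
from fractions import Fraction

def hyperbola_features(b, c, d, e):
    """Features of y = (bx + c)/(dx + e), assuming d != 0 and cd - be != 0."""
    b, c, d, e = (Fraction(v) for v in (b, c, d, e))
    if d == 0 or c * d - b * e == 0:
        raise ValueError("not a hyperbola of this form")
    centre = (-e / d, b / d)
    return {
        "vertical_asymptote": centre[0],       # the line x = -e/d
        "horizontal_asymptote": centre[1],     # the line y = b/d
        "centre": centre,
        "vertical_intercept": None if e == 0 else c / e,
        "horizontal_intercept": None if b == 0 else -c / b,
        # The lines of symmetry are y = x + k1 and y = -x + k2.
        "k1": centre[1] - centre[0],
        "k2": centre[1] + centre[0],
    }

print(hyperbola_features(2, 1, 1, 1))  # y = (2x + 1)/(x + 1)
print(hyperbola_features(7, 3, 2, 4))  # y = (7x + 3)/(2x + 4)
```

For $y = (2x + 1)/(x + 1)$ this gives `k1 = 3` and `k2 = 1`, i.e. the lines $y = x + 3$ and $y = -x + 1$.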

Exercise 77. For each of the following equations, sketch its graph and identify its intercepts, turning points, asymptotes, centre, and lines of symmetry (if there are any of these).

(a) $y = \frac{x - 2}{x + 2}$. (b) $y = \frac{3x + 1}{2x + 1}$. (c) $y = \frac{3x + 2}{2x + 3}$. (Answers on pp. 1041, 1042, and 1043.)

29

bx + c

has no turning points.

dx + e

16.9 The Hyperbola $y = \frac{ax^2 + bx + c}{dx + e}$

$$y = \frac{ax^2 + bx + c}{dx + e}.$$

We'll rule out the following cases.

• $a = 0$, because in that case $\frac{ax^2 + bx + c}{dx + e} = \frac{bx + c}{dx + e}$ and this was already studied in the last section.

• $d = 0$, because in that case $\frac{ax^2 + bx + c}{dx + e} = \frac{ax^2 + bx + c}{e}$, which is a quadratic and which we already studied in secondary school.

• Both $c$ and $e$ are $0$, because in that case $\frac{ax^2 + bx}{dx} = \frac{a}{d}x + \frac{b}{d}$, which is a linear expression.

We'll start with the simplest possible case ($a = 1$, $b = 0$, $c = 1$, $d = 1$, and $e = 0$). This is the equation

$$y = \frac{x^2 + 1}{x}.$$

[Figure: graph of $y = (x^2 + 1)/x$. Marked features: centre $(0, 0)$; vertical asymptote $x = 0$; oblique asymptote $y = x$; a minimum turning point and a maximum turning point; lines of symmetry $y = (1 - \sqrt{2})x$ and $y = (1 + \sqrt{2})x$.]

          x
      x ) x² + 1
          x²
               1

$$y = \frac{x^2 + 1}{x} = x + \frac{1}{x}.$$

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects neither the vertical axis nor the horizontal axis.

2. There are two turning points: $(-1, -2)$ is a maximum turning point and $(1, 2)$ is a minimum turning point. (To find these, compute the first derivative $dy/dx = 1 - 1/x^2$. Set this equal to $0$ to find two stationary points: $x = \pm 1$. Use the 2DT to determine that $x = -1$ and $x = 1$ are, respectively, maximum and minimum turning points.)

By observation, $y$ can take on any value except those between these two turning points. The range of $y$ is thus $(-\infty, -2] \cup [2, \infty)$.

3. Asymptotes. As $x \to 0$, $y \to \pm\infty$. Hence, there is one vertical asymptote: $x = 0$. As $x \to \pm\infty$, $y \to x$. Hence, there is one oblique asymptote: $y = x$. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(0, 0)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. You don't need to learn how to figure out their equations (but see pp. 927ff. in the Appendices if you're interested).

Example 152. Graphed below is the equation $y = \frac{x^2 + 3x + 1}{x + 1}$. Do the long division:

              x  + 2
      x + 1 ) x² + 3x + 1
              x² +  x
                   2x + 1
                   2x + 2
                       -1

$$y = \frac{x^2 + 3x + 1}{x + 1} = x + 2 - \frac{1}{x + 1}.$$

[Figure: graph of $y = (x^2 + 3x + 1)/(x + 1)$. Marked features: centre $(-1, 1)$; vertical asymptote $x = -1$; oblique asymptote $y = x + 2$; lines of symmetry $y = (1 - \sqrt{2})x + 2 - \sqrt{2}$ and $y = (1 + \sqrt{2})x + 2 + \sqrt{2}$.]

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects the vertical axis at the point $(0, 1)$ and the horizontal axis at the points $(0.5(-3 + \sqrt{5}), 0)$ and $(0.5(-3 - \sqrt{5}), 0)$. (The horizontal intercepts are simply the zeros of the quadratic $x^2 + 3x + 1$.)

2. There are no turning points. (Compute $dy/dx = 1 + 1/(x + 1)^2$. This is always positive and so can never equal $0$: there are no stationary points and thus no turning points either.)

By observation, $y$ can take on any value. The range of $y$ is thus $\mathbb{R}$.

3. Asymptotes. As $x \to -1$, $y \to \pm\infty$. Hence, there is one vertical asymptote: $x = -1$. As $x \to \pm\infty$, $y \to x + 2$. Hence, there is one oblique asymptote: $y = x + 2$. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(-1, 1)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don't need to know how to find their equations.

Example 153. Graphed below is the equation $y = \frac{2x^2 + 2x + 1}{-x + 1}$. Do the long division:

               -2x - 4
      -x + 1 ) 2x² + 2x + 1
               2x² - 2x
                     4x + 1
                     4x - 4
                          5

$$y = \frac{2x^2 + 2x + 1}{-x + 1} = -2x - 4 + \frac{5}{-x + 1} = -2x - 4 - \frac{5}{x - 1}.$$

[Figure: graph of $y = (2x^2 + 2x + 1)/(-x + 1)$. Marked features: centre $(1, -6)$; vertical asymptote $x = 1$; oblique asymptote $y = -2x - 4$; a minimum turning point and a maximum turning point; lines of symmetry $y = (-2 + \sqrt{5})x - 4 - \sqrt{5}$ and $y = (-2 - \sqrt{5})x - 4 + \sqrt{5}$.]

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects the vertical axis at the point $(0, 1)$, but not the horizontal axis, because there are no real zeros for the quadratic $2x^2 + 2x + 1$.

2. There are two turning points: $(1 - \sqrt{2.5}, 0.325)$ and $(1 + \sqrt{2.5}, -12.325)$ are the minimum and maximum turning points. (Verify this.)

By observation, $y$ can take on any value except those between these two turning points. The range of $y$ is thus $(-\infty, -12.325] \cup [0.325, \infty)$.

3. Asymptotes. As $x \to 1$, $y \to \pm\infty$. Hence, there is one vertical asymptote: $x = 1$. As $x \to \pm\infty$, $y \to -2x - 4$. Hence, there is one oblique asymptote: $y = -2x - 4$. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(1, -6)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don't need to know how to find their equations.
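If you want to verify the turning points and the centre in the last three examples, the sketch below redoes the long division symbolically. Writing $y = mx + k + \frac{r}{dx + e}$, we get $m = a/d$, $k = (b - me)/d$ and $r = c - ke$, and the stationary points satisfy $(dx + e)^2 = rd/m$. The function name and dictionary keys are made up for this illustration, and the code assumes $a \neq 0$ and $d \neq 0$.

```python
import math

def quadratic_over_linear(a, b, c, d, e):
    """Features of y = (ax^2 + bx + c)/(dx + e), assuming a != 0 and d != 0."""
    m = a / d              # slope of the oblique asymptote
    k = (b - m * e) / d    # intercept of the oblique asymptote
    r = c - k * e          # remainder of the long division
    x_asym = -e / d        # vertical asymptote
    centre = (x_asym, m * x_asym + k)
    # dy/dx = m - r*d/(dx + e)^2 = 0  <=>  (dx + e)^2 = r*d/m
    s = r * d / m
    stationary = []
    if s > 0:
        for sign in (-1, 1):
            x = (-e + sign * math.sqrt(s)) / d
            stationary.append((x, m * x + k + r / (d * x + e)))
    return {"asymptote": (m, k), "remainder": r, "centre": centre,
            "stationary_points": sorted(stationary)}

print(quadratic_over_linear(1, 0, 1, 1, 0))   # y = (x^2 + 1)/x
print(quadratic_over_linear(1, 3, 1, 1, 1))   # y = (x^2 + 3x + 1)/(x + 1)
print(quadratic_over_linear(2, 2, 1, -1, 1))  # y = (2x^2 + 2x + 1)/(-x + 1)
```

The code only locates the stationary points; deciding which is the maximum and which is the minimum still needs the second derivative test or a sketch.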

We will not look more generally at the equation $y = \frac{ax^2 + bx + c}{dx + e}$, because it gets rather messy. But if you want, you can read about it on pp. 927ff. of the Appendices (optional).

Exercise 78. For each of the following equations, sketch its graph and identify its intercepts, turning points, asymptotes, centre, and lines of symmetry (if any of these exist). (a) $y = \frac{x^2 + x - 1}{x - 4}$. (b) $y = \frac{2x^2 - 2x - 1}{x + 1}$. (c) $y = \frac{x^2 + 2x + 1}{x + 4}$. (Answers on pp. 1044, 1046, and 1048.)

17

A graph (or curve) is simply a set of points. Parametric equations give us an alternative method of describing the same graph (or curve).

Example 154. Recall that the graph of the equation $x^2 + y^2 = 1$, i.e. the set $S = \{(x, y) : x^2 + y^2 = 1\}$, is the unit circle centred on the origin.

[Figure: the unit circle $x^2 + y^2 = 1$. Arrows indicate the instantaneous direction of travel. Marked points:
$t = 0$: $x = 1$, $y = 0$; $v_x = 0$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$; $a_x = -1$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
$t = 3\pi/4$: $x = -\sqrt{2}/2$, $y = \sqrt{2}/2$; $v_x = -\sqrt{2}/2$ ms$^{-1}$, $v_y = -\sqrt{2}/2$ ms$^{-1}$; $a_x = \sqrt{2}/2$ ms$^{-2}$, $a_y = -\sqrt{2}/2$ ms$^{-2}$.
$t = 3\pi/2$: $x = 0$, $y = -1$; $v_x = 1$ ms$^{-1}$, $v_y = 0$ ms$^{-1}$; $a_x = 0$ ms$^{-2}$, $a_y = 1$ ms$^{-2}$.]

Observe that if $x = \cos t$ and $y = \sin t$, then by the trigonometric identity $\cos^2 t + \sin^2 t = 1$, we have $x^2 + y^2 = 1$. As it turns out, this gives us a second way of writing the set $S$:

$$S = \{(x, y) : x = \cos t,\ y = \sin t,\ t \in \mathbb{R}\}.$$

The variable $t$ is called a parameter, hence the name parametric equations. As $t$ increases from $0$ to $2\pi$, we trace out, anti-clockwise, a unit circle centred on the origin.

$t = 0 \implies (x, y) = (1, 0)$,

Example 154 (continued from above). The set $S = \{(x, y) : x = \cos t,\ y = \sin t,\ 0 \le t < 2\pi\}$ can also be interpreted as tracing the motion of a particle as it moves anti-clockwise around a circle. $x$ and $y$ give the distances of the particle (in metres) from the origin, in the x- and y-directions.

We have $x = \cos t$ and $y = \sin t$. This says that at any instant of time $t$, the particle is $\cos t$ metres to the east of the origin and $\sin t$ metres to the north of the origin. (Note that if $\cos t < 0$, then the particle is to the west of the origin. And if $\sin t < 0$, then the particle is to the south of the origin.)

At time $t = 0$ s, the particle is at the position $(x, y) = (1, 0)$. At time $t = 1$ s, the particle has moved to the position $(x, y) \approx (0.54, 0.84)$. At time $t = \pi/2 \approx 1.57$ s, the particle has moved to position $(x, y) = (0, 1)$.

Having interpreted t as time, we can now also easily talk about the velocity and acceleration of the particle at different instants in time.

Example 154 (continued from above). We have $x = \cos t$ and $y = \sin t$. From this, we can easily compute the particle's velocity in each direction: $v_x = dx/dt = -\sin t$ and $v_y = dy/dt = \cos t$.

This says that at any instant of time $t$, the velocity of the particle is $-\sin t$ ms$^{-1}$ in the x-direction and $\cos t$ ms$^{-1}$ in the y-direction. (Note that if $-\sin t < 0$, then the particle is moving westwards. And if $\cos t < 0$, then the particle is moving southwards.)

So for example, at time $t = 7\pi/4$, its velocity is $-\sin(7\pi/4) = \sqrt{2}/2$ ms$^{-1}$ rightwards and $\cos(7\pi/4) = \sqrt{2}/2$ ms$^{-1}$ upwards.

Similarly, we can compute the particle's acceleration in each direction: $a_x = d^2x/dt^2 = -\cos t$ and $a_y = d^2y/dt^2 = -\sin t$.

So for example, at time $t = 7\pi/4$, its acceleration is $-\cos(7\pi/4) = -\sqrt{2}/2$ ms$^{-2}$ rightwards and $-\sin(7\pi/4) = \sqrt{2}/2$ ms$^{-2}$ upwards. That is, the particle is travelling rightwards (because its velocity rightwards at this instant in time is positive); however, its rightwards velocity is slowing down.
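To see these numbers come out, here is a minimal sketch that evaluates the position, velocity, and acceleration of the particle on the unit circle at any time $t$. The function name `circle_motion` is made up for this illustration.

```python
import math

def circle_motion(t):
    """Position, velocity and acceleration for x = cos t, y = sin t."""
    position = (math.cos(t), math.sin(t))
    velocity = (-math.sin(t), math.cos(t))
    acceleration = (-math.cos(t), -math.sin(t))
    return position, velocity, acceleration

position, velocity, acceleration = circle_motion(7 * math.pi / 4)
print(position)      # roughly (0.707, -0.707)
print(velocity)      # roughly (0.707, 0.707)
print(acceleration)  # roughly (-0.707, 0.707)
```

Notice that the acceleration is always the negative of the position, so it always points from the particle towards the origin.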

Exercise 79. (Answer on p. 1050.) Let P be the particle whose position (in metres) is described by the set $\{(x, y) : x = \cos t,\ y = \sin t,\ t \in \mathbb{R}\}$, where $t$ is time (seconds). Let Q be the particle whose position (in metres) is described by the set $\{(x, y) : x = \sin t,\ y = \cos t,\ t \in \mathbb{R}\}$.

(a) How does the starting point (when $t = 0$) of Q differ from that of P? (b) What about the direction of travel?

Example 155. Recall that the graph of the equation $x^2/a^2 + y^2/b^2 = 1$, i.e. the set $T = \{(x, y) : x^2/a^2 + y^2/b^2 = 1\}$, is the ellipse centred on the origin, with horizontal intercepts $\pm a$ and vertical intercepts $\pm b$.

[Figure: the ellipse, with the particle's positions at $t = 0$, $t = 3\pi/4$, and $t = 3\pi/2$ marked. Arrows indicate the instantaneous direction of travel.]

Observe that if $x = a\cos t$ and $y = b\sin t$, then by the same trigonometric identity as before, $x^2/a^2 + y^2/b^2 = 1$. As it turns out, this gives us a second way of writing the set $T$:

$$T = \{(x, y) : x = a\cos t,\ y = b\sin t,\ t \in \mathbb{R}\}.$$

Similar to before, as $t$ increases from $0$ to $2\pi$, we trace out, anti-clockwise, an ellipse centred on the origin.

At any instant in time $t$, the particle's position, velocity, and acceleration are $(x, y) = (a\cos t, b\sin t)$, $(v_x, v_y) = (-a\sin t, b\cos t)$, and $(a_x, a_y) = (-a\cos t, -b\sin t)$.

Exercise 80. Let P be the particle whose position (in metres) is described by the set $\{(x, y) : x = a\cos t,\ y = b\sin t,\ t \in \mathbb{R}\}$, where $t$ is time (seconds). At each of the following times, state the particle's position and also its velocity and acceleration in both the x- and

4

2

Example 156. Recall that the graph of the equation $x^2 - y^2 = 1$, i.e. the set $U = \{(x, y) : x^2 - y^2 = 1\}$, is the rectangular east-west hyperbola centred on the origin, with horizontal intercepts $\pm 1$ and no vertical intercepts.

[Figure: the hyperbola $x^2 - y^2 = 1$, with the particle's positions at $t = 0, 1, 2, 3, 4$, and $5$ marked. Arrows indicate the instantaneous direction of travel.]

Observe that if $x = \sec t$ and $y = \tan t$, then by the trigonometric identity $\sec^2 t - \tan^2 t = 1$, we have $x^2 - y^2 = 1$. As it turns out, this gives us a second way of writing the set $U$:

$$U = \{(x, y) : x = \sec t,\ y = \tan t,\ t \in \mathbb{R},\ t \neq k\pi/2 \text{ for any odd integer } k\}.$$

Note that $t$ cannot be a half-integer multiple of $\pi$, because then $\tan t$ would be undefined.

Again, let's interpret this as the movement of a particle. Interestingly, the particle always moves upwards, as we can easily prove: $v_y = dy/dt = \sec^2 t > 0$ for all $t$.

At $t = 0$, the particle is at $(x, y) = (1, 0)$. During $t \in [0, \pi/2)$, the particle moves northeast along the green segment and flies off towards infinity as $t \to \pi/2 \approx 1.57$.

An instant after $\pi/2$ seconds, the particle magically reappears near infinity in the southwest. During $t \in (\pi/2, \pi]$, the particle moves northeast along the blue segment. At $t = \pi$, the particle is at $(-1, 0)$.

During $t \in [\pi, 3\pi/2)$, the particle moves northwest along the red segment and flies off towards infinity as $t \to 3\pi/2 \approx 4.71$.

An instant after $3\pi/2$ seconds, the particle magically reappears near infinity in the southeast. During $t \in (3\pi/2, 2\pi]$, the particle moves northwest along the pink segment.

Exercise 81. (Answer on p. 1051.) Suppose that the position of a particle is described by the set $\{(x, y) : x = \tan t,\ y = \sec t,\ t \in \mathbb{R}\}$, where $t$ is time, measured in seconds.

(a) Rewrite the set using a single cartesian equation.

(b) Compute $dx/dt$. And hence make an observation about how the particle travels in the x-direction.

The graph below indicates six positions of the particle: A, B, C, D, E, and F. (Also indicated are the directions of travel.) The particle is at these positions at times $t = 0, 1, 2, 3, 4$, and $5$, but not necessarily in that order.

(c) Using only the graphs of $s = \tan t$ and $s = \sec t$ (above) to guide you and without using a calculator, state where the particle is, at each of the times $t = 0, 1, 2, 3, 4$, and $5$.

[Figure: graph of $\{(x, y) : x = \tan t,\ y = \sec t\}$ with the six positions A to F marked. Arrows indicate the instantaneous direction of travel.]

17.1

Given a pair of parametric equations that describes a set of points, we can often go in reverse: We can eliminate the parameter $t$ and describe the same set of points using a single equation.

Example 157. The set $\{(x, y) : x = t^2 + t,\ y = t - 1,\ t \in \mathbb{R}\}$ describes the position (metres) of a particle at time $t$ (seconds).

[Figure: graph of $x = y^2 + 3y + 2$. Arrows indicate the instantaneous direction of travel. Marked points:
$t = -1$: $x = 0$, $y = -2$; $v_x = (2t + 1)$ ms$^{-1}$ $= -1$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$; $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
$t = 0$: $x = 0$, $y = -1$; $v_x = (2t + 1)$ ms$^{-1}$ $= 1$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$; $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
$t = 1$: $x = 2$, $y = 0$; $v_x = (2t + 1)$ ms$^{-1}$ $= 3$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$; $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.]

Since $y = t - 1$, we have $t = y + 1$. Substituting this into $x = t^2 + t$ gives $x = (y + 1)^2 + (y + 1) = y^2 + 3y + 2$. We can thus rewrite the set as: $\{(x, y) : x = y^2 + 3y + 2\}$.

As an exercise, let's also compute the velocity and acceleration of the particle.

$v_x = dx/dt = 2t + 1$ and $v_y = dy/dt = 1$. This says that at any instant in time $t$, the particle has velocity $2t + 1$ ms$^{-1}$ rightwards and $1$ ms$^{-1}$ upwards.

$a_x = d^2x/dt^2 = 2$ and $a_y = d^2y/dt^2 = 0$. This says that the particle is always accelerating rightwards at the rate $2$ ms$^{-2}$. Moreover, it is never accelerating upwards (this is consistent with the above finding that its upwards velocity is a constant $1$ ms$^{-1}$).
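A quick way to gain confidence that the parameter was eliminated correctly is to plug in a handful of values of $t$ and check that each resulting point satisfies $x = y^2 + 3y + 2$. The sketch below does just that; the function names and the sample values of $t$ are arbitrary choices made for this illustration.

```python
def position(t):
    """Parametric form: x = t^2 + t, y = t - 1."""
    return t ** 2 + t, t - 1

def on_cartesian_curve(x, y, tol=1e-9):
    """Check whether (x, y) satisfies x = y^2 + 3y + 2."""
    return abs(x - (y ** 2 + 3 * y + 2)) < tol

for t in (-3, -1, 0, 0.5, 1, 10):
    x, y = position(t)
    print(t, (x, y), on_cartesian_curve(x, y))
```

Of course, passing at a few sample points is only a sanity check and not a proof; the substitution $t = y + 1$ is what actually establishes the result.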

Example 158. The set $\{(x, y) : x = 2\cos t - 4,\ y = 3\sin t + 1,\ t \in \mathbb{R}\}$ describes the position (metres) of a particle at time $t$ (seconds).

[Figure: the ellipse traced by the particle. Marked points:
$t = \pi/2$: $x = -4$, $y = 4$; $v_x = -2\sin t$ ms$^{-1}$ $= -2$ ms$^{-1}$, $v_y = 3\cos t$ ms$^{-1}$ $= 0$ ms$^{-1}$; $a_x = -2\cos t$ ms$^{-2}$ $= 0$ ms$^{-2}$, $a_y = -3\sin t$ ms$^{-2}$ $= -3$ ms$^{-2}$.
$t = \pi$: $x = -6$, $y = 1$; $v_x = 0$ ms$^{-1}$, $v_y = -3$ ms$^{-1}$; $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
$t = 3\pi/2$: $x = -4$, $y = -2$; $v_x = 2$ ms$^{-1}$, $v_y = 0$ ms$^{-1}$; $a_x = 0$ ms$^{-2}$, $a_y = 3$ ms$^{-2}$.]

Write $(x + 4)/2 = \cos t$ and $(y - 1)/3 = \sin t$. Using the trigonometric identity $\cos^2 t + \sin^2 t = 1$, we can rewrite the set as $\{(x, y) : [(x + 4)/2]^2 + [(y - 1)/3]^2 = 1\}$. This is the ellipse centred on $(-4, 1)$.

As an exercise, let's also compute the velocity and acceleration of the particle.

$v_x = dx/dt = -2\sin t$ and $v_y = dy/dt = 3\cos t$. This says that at any instant in time $t$, the particle has velocity $2\sin t$ ms$^{-1}$ leftwards and $3\cos t$ ms$^{-1}$ upwards.

$a_x = d^2x/dt^2 = -2\cos t$ and $a_y = d^2y/dt^2 = -3\sin t$. This says that at any instant in time $t$, the particle is accelerating leftwards at the rate $2\cos t$ ms$^{-2}$ and upwards at the rate $-3\sin t$ ms$^{-2}$.
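As before, we can spot-check the cartesian equation numerically. The sketch below evaluates the left-hand side of $[(x + 4)/2]^2 + [(y - 1)/3]^2 = 1$ at several points generated by the parametric equations; the function names and sample times are made up for this illustration.

```python
import math

def position(t):
    """Parametric form: x = 2 cos t - 4, y = 3 sin t + 1."""
    return 2 * math.cos(t) - 4, 3 * math.sin(t) + 1

def ellipse_lhs(x, y):
    """Left-hand side of [(x + 4)/2]^2 + [(y - 1)/3]^2 = 1."""
    return ((x + 4) / 2) ** 2 + ((y - 1) / 3) ** 2

for t in (0.0, 1.0, math.pi / 2, math.pi, 4.0, 3 * math.pi / 2):
    x, y = position(t)
    print(round(t, 3), (round(x, 3), round(y, 3)), ellipse_lhs(x, y))
```

Every printed value of the left-hand side should be 1, up to floating-point rounding.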

Exercise 82. Each of the following sets describes the position (metres) of a particle at time $t$ (seconds). Rewrite each set into a form where the parameter $t$ is eliminated. Sketch the graph of each. Indicate the particle's position and direction of travel at $t = 0$. (Answers on pp. 1052, 1053, and 1054.)

(a) $\{(x, y) : x = 2\sin t - 1,\ y = 3\cos^2 t,\ t \in \mathbb{R}\}$.

(b) $\{(x, y) : x = \frac{1}{t - 1},\ y = t^2 + 1,\ t \in \mathbb{R}\}$.

18

Given any fraction $\frac{N}{D}$ (where $N$ and $D$ are real numbers with $D$ non-zero), we have $\frac{N}{D} > 0$ if and only if one of the following is true:

1. $N > 0$ AND $D > 0$; OR

2. $N < 0$ AND $D < 0$.

The expressions that are in the numerator ($N$) and denominator ($D$) can get pretty complicated. So here are some very simple examples just to warm you up.

Example 159. $\frac{4}{7} > 0$ because both the numerator and denominator are positive.

Example 160. $\frac{-5}{-3} > 0$ because both the numerator and denominator are negative.

Example 161. $\frac{-9}{2} < 0$ because the numerator is negative but the denominator is positive.

Example 162. $\frac{1}{-8} < 0$ because the numerator is positive but the denominator is negative.

18.1 $\frac{ax + b}{cx + d} > 0$

Example 163. $\frac{x + 3}{3x + 2} > 0 \iff$ one of the following is true:

1. $x + 3 > 0$ AND $3x + 2 > 0$; OR

2. $x + 3 < 0$ AND $3x + 2 < 0$.

Notice that (1) "$x + 3 > 0$ AND $3x + 2 > 0$" $\iff$ "$x > -3$ AND $x > -2/3$", which in turn is equivalent to the single inequality $x > -2/3$.

Notice that (2) "$x + 3 < 0$ AND $3x + 2 < 0$" $\iff$ "$x < -3$ AND $x < -2/3$", which in turn is equivalent to the single inequality $x < -3$.

Altogether then, $\frac{x + 3}{3x + 2} > 0 \iff x > -2/3$ OR $x < -3$ (equivalently, $x \in (-\infty, -3) \cup (-\frac{2}{3}, \infty)$).

Note that I use quotation marks " ", but these are not necessary. Instead, they merely help to make especially clear which groups of conditions correspond to each other.

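Example 164. $\frac{4x - 1}{x + 2} > 0 \iff$ one of the following is true:

1. "$4x - 1 > 0$ AND $x + 2 > 0$" $\iff$ "$x > 1/4$ AND $x > -2$" $\iff x > 1/4$; OR

2. "$4x - 1 < 0$ AND $x + 2 < 0$" $\iff$ "$x < 1/4$ AND $x < -2$" $\iff x < -2$.

Altogether then, $\frac{4x - 1}{x + 2} > 0 \iff x > 1/4$ OR $x < -2$ (equivalently, $x \in (-\infty, -2) \cup (1/4, \infty)$).

Example 165. $\frac{5x + 4}{-2x + 1} > 0 \iff$ one of the following is true:

1. "$5x + 4 > 0$ AND $-2x + 1 > 0$" $\iff$ "$x > -4/5$ AND $x < 1/2$" $\iff x \in (-4/5, 1/2)$; OR

2. "$5x + 4 < 0$ AND $-2x + 1 < 0$" $\iff$ "$x < -4/5$ AND $x > 1/2$", but these are mutually contradictory and thus impossible.

Altogether then, $\frac{5x + 4}{-2x + 1} > 0 \iff x \in (-4/5, 1/2)$.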

When given any inequality that is of a slightly different form, be sure to always convert it into what I'll call the standard form $\frac{N}{D} > 0$. Strictly speaking, this is not necessary, but if you always do this, you'll make a habit of solving inequalities in this form, and thus be less likely to make a careless mistake.

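Example 166. $\frac{-3x - 2}{5x + 1} < 3$. This inequality is equivalent to

$$3 - \frac{-3x - 2}{5x + 1} > 0 \iff \frac{15x + 3 - (-3x - 2)}{5x + 1} > 0 \iff \frac{18x + 5}{5x + 1} > 0.$$

1. "$18x + 5 > 0$ AND $5x + 1 > 0$" $\iff$ "$x > -5/18$ AND $x > -1/5$" $\iff x > -1/5$; OR

2. "$18x + 5 < 0$ AND $5x + 1 < 0$" $\iff$ "$x < -5/18$ AND $x < -1/5$" $\iff x < -5/18$.

Altogether then, $\frac{-3x - 2}{5x + 1} < 3 \iff x < -5/18$ OR $x > -1/5$ (equivalently, $x \in (-\infty, -5/18) \cup (-1/5, \infty)$).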
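The case-by-case reasoning in this section can be mimicked by a sign test: find where the numerator and the denominator each equal zero, then test one sample point in each of the three resulting intervals. The sketch below does this for $\frac{ax + b}{cx + d} > 0$. The function name is made up, the coefficients are assumed to be integers with $a \neq 0$ and $c \neq 0$, and `None` is used to stand for an infinite endpoint.

```python
from fractions import Fraction

def solve_linear_fraction_positive(a, b, c, d):
    """Open intervals on which (ax + b)/(cx + d) > 0.

    Assumes integer coefficients with a != 0 and c != 0, and that the
    numerator and denominator vanish at different points.
    """
    zero = Fraction(-b, a)  # where the numerator vanishes
    pole = Fraction(-d, c)  # where the denominator vanishes
    if zero == pole:
        raise ValueError("numerator and denominator share a zero")
    lo, hi = sorted((zero, pole))
    pieces = [(None, lo, lo - 1), (lo, hi, (lo + hi) / 2), (hi, None, hi + 1)]
    solution = []
    for left, right, sample in pieces:
        # N/D > 0 exactly when N and D have the same sign, i.e. N*D > 0.
        if (a * sample + b) * (c * sample + d) > 0:
            solution.append((left, right))
    return solution

print(solve_linear_fraction_positive(1, 3, 3, 2))   # (x + 3)/(3x + 2) > 0
print(solve_linear_fraction_positive(5, 4, -2, 1))  # (5x + 4)/(-2x + 1) > 0
```

For an inequality not yet in standard form, convert it first; for example, $\frac{-3x - 2}{5x + 1} < 3$ becomes $\frac{18x + 5}{5x + 1} > 0$, which corresponds to the call `solve_linear_fraction_positive(18, 5, 5, 1)`.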

2x + 1

Exercise 83. For what values of x is each of the following inequalities true? (a)

> 0.

3x + 2

1

1

3x 18

2x + 3

x1

> 0. (c)

> 0. (d)

> 0. (e)

> 0. (f)

< 9. (Answers on p.

(b)

4

4

4

9x 14

x + 7

1055.)


18.2 $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$

Don't worry, you are not required to know how to graph the equation $y = \frac{ax^2 + bx + c}{dx^2 + ex + f}$. But you are required to know how to find the values of $x$ for which $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$. This is just the same game as before, albeit slightly more complicated.

Example 167. $\frac{2x^2 + x + 3}{-x^2 + 3x + 2} > 0 \iff$ one of the following is true:

1. $2x^2 + x + 3 > 0$ AND $-x^2 + 3x + 2 > 0$; OR

2. $2x^2 + x + 3 < 0$ AND $-x^2 + 3x + 2 < 0$.

$y = 2x^2 + x + 3$ is a ∪-shaped quadratic and has no real roots (because the discriminant is negative). Hence, it is always positive. It is thus impossible that $2x^2 + x + 3 < 0$ AND $-x^2 + 3x + 2 < 0$ (Case 2).

We need thus only examine Case 1. As we just said, it is always true that $2x^2 + x + 3 > 0$. So we need only examine when it is true that $-x^2 + 3x + 2 > 0$.

The equation $y = -x^2 + 3x + 2$ has a ∩-shaped graph and has two real zeros given by:

$$\frac{-3 \pm \sqrt{3^2 - 4(-1)(2)}}{2(-1)} = \frac{-3 \pm \sqrt{17}}{-2} = \frac{3 \mp \sqrt{17}}{2} = 0.5\left(3 \mp \sqrt{17}\right).$$

Since the graph is ∩-shaped, $-x^2 + 3x + 2$ is positive strictly between these two zeros.

Altogether then, $\frac{2x^2 + x + 3}{-x^2 + 3x + 2} > 0 \iff x \in \left(0.5\left(3 - \sqrt{17}\right), 0.5\left(3 + \sqrt{17}\right)\right)$.

A dirty trick is to use your TI84 to do a quick check that this answer is correct:

Example 168. $\frac{-x^2 + 4x - 1}{2x^2 + x + 2} > 0 \iff$ one of the following is true:

1. $-x^2 + 4x - 1 > 0$ AND $2x^2 + x + 2 > 0$; OR

2. $-x^2 + 4x - 1 < 0$ AND $2x^2 + x + 2 < 0$.

The equation $y = 2x^2 + x + 2$ has a ∪-shaped graph and has no real zeros (because the discriminant is negative). Hence, it is always positive. It is thus impossible that $-x^2 + 4x - 1 < 0$ AND $2x^2 + x + 2 < 0$ (Case 2).

We need thus only examine Case 1. As we just said, it is always true that $2x^2 + x + 2 > 0$. So we need only examine when it is true that $-x^2 + 4x - 1 > 0$.

The equation $y = -x^2 + 4x - 1$ has a ∩-shaped graph and has two real zeros given by:

$$\frac{-4 \pm \sqrt{4^2 - 4(-1)(-1)}}{2(-1)} = \frac{-4 \pm \sqrt{12}}{-2} = \frac{4 \mp \sqrt{12}}{2} = 2 \mp \sqrt{3}.$$

Thus, $\frac{-x^2 + 4x - 1}{2x^2 + x + 2} > 0 \iff x \in \left(2 - \sqrt{3}, 2 + \sqrt{3}\right) \approx (0.268, 3.732)$. As usual, let's check using our TI84:

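Example 169. $\frac{x^2 + 5x + 4}{-x^2 - 2x + 1} > 0 \iff$ one of the following is true:

1. $x^2 + 5x + 4 > 0$ AND $-x^2 - 2x + 1 > 0$; OR

2. $x^2 + 5x + 4 < 0$ AND $-x^2 - 2x + 1 < 0$.

The equation $y = x^2 + 5x + 4$ has a ∪-shaped graph and has two real zeros given by:

$$\frac{-5 \pm \sqrt{5^2 - 4(1)(4)}}{2(1)} = \frac{-5 \pm \sqrt{9}}{2} = \frac{-5 \pm 3}{2} = -4, -1.$$

The equation $y = -x^2 - 2x + 1$ has a ∩-shaped graph and has two real roots given by:

$$\frac{2 \pm \sqrt{(-2)^2 - 4(-1)(1)}}{2(-1)} = \frac{2 \pm \sqrt{8}}{-2} = -1 \mp \sqrt{2}.$$

Hence, $-x^2 - 2x + 1 > 0 \iff x \in \left(-1 - \sqrt{2}, \sqrt{2} - 1\right)$. Thus:

1. "$x^2 + 5x + 4 > 0$ AND $-x^2 - 2x + 1 > 0$" $\iff$ "($x < -4$ OR $x > -1$) AND $x \in \left(-1 - \sqrt{2}, \sqrt{2} - 1\right)$" $\iff x \in \left(-1, \sqrt{2} - 1\right)$.

2. "$x^2 + 5x + 4 < 0$ AND $-x^2 - 2x + 1 < 0$" $\iff$ "$x \in (-4, -1)$ AND ($x < -1 - \sqrt{2}$ OR $x > \sqrt{2} - 1$)" $\iff x \in \left(-4, -1 - \sqrt{2}\right)$.

Altogether then, $\frac{x^2 + 5x + 4}{-x^2 - 2x + 1} > 0 \iff x \in \left(-4, -1 - \sqrt{2}\right) \cup \left(-1, \sqrt{2} - 1\right)$. As usual, let's check using our TI84: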

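Example 170. $\frac{-x^2 + 4x - 3}{-x^2 + 2x} > 0 \iff$ one of the following is true:

1. $-x^2 + 4x - 3 > 0$ AND $-x^2 + 2x > 0$; OR

2. $-x^2 + 4x - 3 < 0$ AND $-x^2 + 2x < 0$.

The equation $y = -x^2 + 4x - 3$ has a ∩-shaped graph and has two real zeros given by:

$$\frac{-4 \pm \sqrt{4^2 - 4(-1)(-3)}}{2(-1)} = \frac{-4 \pm \sqrt{4}}{-2} = 1, 3.$$

The equation $y = -x^2 + 2x$ has a ∩-shaped graph and has two real roots given by:

$$\frac{-2 \pm \sqrt{2^2 - 4(-1)(0)}}{2(-1)} = \frac{-2 \pm \sqrt{4}}{-2} = 0, 2.$$

1. "$-x^2 + 4x - 3 > 0$ AND $-x^2 + 2x > 0$" $\iff$ "$x \in (1, 3)$ AND $x \in (0, 2)$" $\iff x \in (1, 2)$.

2. "$-x^2 + 4x - 3 < 0$ AND $-x^2 + 2x < 0$" $\iff$ "($x < 1$ OR $x > 3$) AND ($x < 0$ OR $x > 2$)" $\iff x < 0$ OR $x > 3$.

Altogether then, $\frac{-x^2 + 4x - 3}{-x^2 + 2x} > 0 \iff x \in (-\infty, 0) \cup (1, 2) \cup (3, \infty)$. As usual, let's check using our TI84: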
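If you don't have a calculator handy, the same kind of check can be sketched in code. The block below finds the real zeros of the numerator and the denominator, cuts the number line at those zeros, and tests the sign of the fraction at one sample point per interval. The function names are made up for this illustration, and each quadratic is passed as a triple $(a, b, c)$ with $a \neq 0$.

```python
import math

def real_roots(a, b, c):
    """Real roots of ax^2 + bx + c (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

def solve_quadratic_fraction_positive(num, den):
    """Open intervals where num(x)/den(x) > 0, for coefficient triples num, den."""
    def value(coeffs, x):
        a, b, c = coeffs
        return a * x * x + b * x + c

    cuts = sorted(set(real_roots(*num) + real_roots(*den)))
    edges = [-math.inf] + cuts + [math.inf]
    solution = []
    for left, right in zip(edges, edges[1:]):
        # Pick one sample point strictly inside the interval.
        if left == -math.inf and right == math.inf:
            sample = 0.0
        elif left == -math.inf:
            sample = right - 1
        elif right == math.inf:
            sample = left + 1
        else:
            sample = (left + right) / 2
        # The fraction is positive exactly when num and den share a sign.
        if value(num, sample) * value(den, sample) > 0:
            solution.append((left, right))
    return solution

# (-x^2 + 4x - 3)/(-x^2 + 2x) > 0
print(solve_quadratic_fraction_positive((-1, 4, -3), (-1, 2, 0)))
```

Because the roots are computed in floating point, the endpoints are approximations; the exact surd forms still have to come from the working by hand.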

Exercise 84. Without using a calculator, find the values of $x$ for which each of the following inequalities is true. (a) $\frac{x^2 + 2x + 1}{x^2 - 3x + 2} > 0$. (b) $\frac{x^2 - 1}{x^2 - 4} > 0$. (c) $\frac{x^2 - 3x - 18}{x^2 + 9x - 14} > 0$. (d) $\frac{2x + 5}{x + 4} > \frac{3x + 1}{6x - 7}$. (Answers on pp. 1057, 1058, 1059, and 1060.)

18.3

Rewrite the inequality as $x - \sin(0.5\pi x) > 0$. Graph $y = x - \sin(0.5\pi x)$ on your graphing calculator. Our goal is to first find the horizontal intercepts of this equation; this will let us solve for $x > \sin(0.5\pi x)$.

After Step 1.

After Step 2.

After Step 3.

After Step 4.

After Step 5.

After Step 6.

In the TI84:

1. Press ON to turn on your calculator.

2. Press Y= to bring up the Y= editor.

3. Press X,T,θ,n − SIN 0 . 5 π. To enter π, press the blue 2ND button and then ^ (which corresponds to the π button). Now press X,T,θ,n ) and altogether you will have entered $x - \sin(0.5\pi x)$.

4. Now press GRAPH and the calculator will graph $y = x - \sin(0.5\pi x)$.

It looks like the horizontal intercepts are close to the origin. Let's zoom in to see better.

5. Press the ZOOM button to bring up a menu of ZOOM options.

6. Press 2 to select the Zoom In option. Nothing seems to happen. But now press ENTER and the TI84 will zoom in a little for you.

It looks like there are 3 horizontal intercepts. To find out what precisely they are, we'll use the TI84's zero option.

[Screenshots: TI84 display after Steps 7, 8, and 9.]

3. Press the blue 2ND button and then CALC (which corresponds to the TRACE

button). This brings up the CALCULATE menu.

4. Press 2 to select the zero option. This brings you back to the graph, with a cursor

flashing. Also, the TI84 prompts you with the question: "Left Bound?"

The TI84's ZERO function works by you first specifying a Left Bound and a Right Bound for x. The TI84 will then check to see if there are any horizontal intercepts (i.e. values of x for which y = 0) within those bounds.

5. Using the < and > arrow keys, move the blinking cursor until it is where you want your

first Left Bound to be. For me, I have placed it a little to the left of where I believe

the leftmost horizontal intercept to be.

6. Press ENTER and you will have just entered your first Left Bound.

The TI84 now prompts you with the question: "Right Bound?"

7. So now just repeat. Using the < and > arrow keys, move the blinking cursor until it is where you want your first Right Bound to be. For me, I have placed it a little to the right of where I believe the leftmost horizontal intercept is.

8. Again press ENTER and you will have just entered your first Right Bound.

The TI84 now asks you: "Guess?" This is just asking if you want to proceed and get the TI84 to work out where the horizontal intercept is. So go ahead and:

9. Press ENTER. The TI84 now informs you that there is a "Zero" at x = −1, y = 0 and places the blinking cursor at precisely that point. This is the first horizontal intercept we've found.

To find each of the other 2 horizontal intercepts, just repeat steps 3 through 9. You should be able to find that they are at x = 0 and x = 1. Altogether, the 3 intercepts are x = −1, 0, 1. Based on these and what the graph looks like, we conclude: $x > \sin(0.5\pi x) \iff x \in (-1, 0) \cup (1, \infty)$.
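The Left Bound / Right Bound idea can be imitated with the bisection method. The sketch below is not what the TI84 actually runs internally (that is an assumption-free statement: I simply don't know its algorithm); it is just a minimal illustration, with the function name `bisect_zero` and the three pairs of bounds chosen by me after looking at the graph.

```python
import math

def f(x):
    """y = x - sin(0.5*pi*x), the function graphed in this example."""
    return x - math.sin(0.5 * math.pi * x)

def bisect_zero(g, left, right, tol=1e-10):
    """Find a zero of g between left and right by repeated halving.

    Requires g(left) and g(right) to have opposite signs (or one of them
    to be zero), just as the calculator needs bounds on either side.
    """
    if g(left) * g(right) > 0:
        raise ValueError("g must change sign between the two bounds")
    while right - left > tol:
        mid = (left + right) / 2
        if g(left) * g(mid) <= 0:
            right = mid      # the zero lies in the left half
        else:
            left = mid       # the zero lies in the right half
    return (left + right) / 2

# One (Left Bound, Right Bound) pair per intercept, read off the graph.
bounds = [(-1.5, -0.5), (-0.4, 0.3), (0.5, 1.5)]
zeros = [bisect_zero(f, a, b) for a, b in bounds]
print(zeros)  # three values very close to -1, 0 and 1
```

As on the calculator, a badly chosen pair of bounds (with no sign change in between) gives no answer, which is why the sketch raises an error in that case.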

For this example, I won't give the full detailed instructions of what to do on the TI84; I'll only show a few screenshots. First, rewrite the inequality as $x - e - \ln x > 0$ and so graph $y = x - e - \ln x$ on your graphing calculator:

[Screenshots: after graphing; after zooming in and adjusting the window.]

Look for the values of x for which $x - e - \ln x = 0$. They are $x \approx 0.0708, 4.1387$:

[Screenshots: leftmost horizontal intercept; rightmost horizontal intercept.]

Based on these horizontal intercepts and what the graph looks like, we conclude: $x > e + \ln x$ if and only if $x \in (0, 0.0708) \cup (4.1387, \infty)$.

Exercise 85. Use a graphing calculator to find the values of x for which each of the

1

> x3 + sin x.

following inequalities is true. (a) x3 x2 + x 1 > ex . (b) x > cos x. (c)

2

1x

(Answers on pp. 1061, 1061, and 1062.)


18.4

Systems of Equations

Warm-up questions:

Exercise 86. (PSLE-style question.) When Apu was 40 years old, Beng was twice as old

as Caleb. Today, Caleb is 28 years old and Apu is twice as old as Beng. What are the ages

of Apu and Beng today? (If necessary, assume that the age of a person is always an integer

and is fixed between January 1st and December 31st of each year.) (Answer on p. 1063.)

Exercise 87. (O-Level style question.) Planes A and B leave the same point at 12pm.

Plane A travels northeast at a constant speed of 100 km/h. Plane B travels south at a

constant speed of 200 km/h. At 3pm, both planes make an instant turn and start flying

directly towards each other at the same speed. At what time will the two planes collide?

(Answer on p. 1063.)

Definition 48. Given an equation involving a single variable x, a real solution to the equation is any value of $x \in \mathbb{R}$ such that the equation is true.

Example 173. The equation $x + 5 = 8$ has one real solution: 3. The equation $x^2 - 1 = 0$ has two real solutions: −1 and 1. The equation $x^2 - 1 = 8$ has two real solutions: −3 and 3. The equation $x^3 - 4x = 0$ has three real solutions: −2, 0, and 2.

Example 174. The equation $x^2 + 1 = 0$ has no real solution.

Definition 49. Given an equation involving a single variable x, a real solution set is the set of values of $x \in \mathbb{R}$ such that the equation is true.

Example 175. The real solution set of the equation $x + 5 = 8$ is {3}. The real solution set of the equation $x^2 - 1 = 0$ is {−1, 1}. The real solution set of the equation $x^2 - 1 = 8$ is {−3, 3}. The real solution set of the equation $x^3 - 4x = 0$ is {−2, 0, 2}.

Example 176. The real solution set of the equation $x^2 + 1 = 0$ is $\emptyset = \{\}$.

Definition 50. Given a system of equations (or more simply a set of equations) involving two variables x and y, a real solution to the set of equations is any point (or ordered pair) (x, y) with $x, y \in \mathbb{R}$ for which the system of equations is true; and a real solution set is the set of ordered pairs (x, y) for which the system of equations is true.

Example 177. Consider the system of equations $y = x + 1$, $y = -x + 3$. To solve this system of equations, plug in the second equation into the first to get: $-x + 3 = x + 1$. Now solve: x = 1. And so y = x + 1 = 2. Altogether, this system of equations has one real solution (1, 2). Its real solution set is thus {(1, 2)}.

Example 178. Consider the system of equations $y = 0.5x^2 - 1.5$ and $y = x$. To solve this system of equations, plug in the second equation into the first to get: $x = 0.5x^2 - 1.5$. Rearranging: $x^2 - 2x - 3 = 0$. Now solve: x = 3, −1. Correspondingly, y = 3, −1. Altogether, this system of equations has two real solutions: (3, 3) and (−1, −1). Its real solution set is thus {(3, 3), (−1, −1)}.
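The substitution steps of Example 178 can be mirrored in a few lines of code. This is only a sketch under my own naming (the helper `solve_quadratic` is hypothetical, not something from the text): it solves the rearranged quadratic $x^2 - 2x - 3 = 0$ with the quadratic formula and then uses $y = x$ to recover each y-coordinate.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                      # no real roots
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

# After substituting y = x into y = 0.5x^2 - 1.5 and rearranging:
xs = solve_quadratic(1, -2, -3)

# The second equation says y = x, so each solution is the pair (x, x).
solutions = [(x, x) for x in xs]
print(solutions)  # [(-1.0, -1.0), (3.0, 3.0)]
```

Plugging each pair back into $y = 0.5x^2 - 1.5$ is a quick way to confirm that both equations hold.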

A system of equations can have no real solutions.

Example 179. Consider the system of equations $y = \ln x$ and $y = x$. Observe that for all $x \in (0, 1)$, $\ln x < 0$ and hence $x > \ln x$. Moreover, for x = 1, $\ln x = 0 < x$. Also, for x > 1, $\dfrac{d}{dx}\ln x = \dfrac{1}{x} < 1 = \dfrac{d}{dx}x$, so the slope of y = x is steeper than that of y = ln x. Altogether then, for all x > 0, $x > \ln x$. Hence, this system of equations has no real solutions. Its real solution set is thus $\emptyset = \{\}$.

A system of equations can also have infinitely many real solutions.

Example 180. Consider the system of equations $y = x$ and $2y = 2x$. Observe that this system of equations has infinitely many real solutions, e.g. (1, 1), (2, 2), (2.74, 2.74). There is thus no way to explicitly list out all its real solutions. However, using set-builder notation, we can write down its real solution set as $\{(x, y) : y = x\}$. This says that every ordered pair (x, y) such that y = x is a real solution to the given system of equations.

Exercise 88. The points (1, 2), (3, 5), and (6, 9) satisfy the equation $y = ax^2 + bx + c$. What are a, b, and c? (Answer on p. 1064.)

Exercise 89. The point (1, 2) satisfies the equation $y = ax^2 + bx + c$. Moreover, the minimum point of the equation $y = ax^2 + bx + c$ is (0, 0). What are a, b, and c? (Answer on p. 1064.)


You are required to know how to use a graphing calculator to find the numerical solution

of equations (including system of linear equations).

Example 181. Solve the system of equations $y = x^4 - x^3 - 5$, $y = \ln x$.

One method is to graph both equations on your graphing calculator and then find their intersection points.

Here I'll use another method: First rewrite the two equations as a third equation $y = x^4 - x^3 - 5 - \ln x$. Our goal is to find the horizontal intercepts of this equation, which will in turn also be the solutions to the above set of equations.

Briefly, in the TI84:

1. Graph the equation $y = x^4 - x^3 - 5 - \ln x$.

It looks like there is only one horizontal intercept.

2. Zoom in.

3. Find the horizontal intercept using the zero option.

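Conclusion: The horizontal intercept found is at $x \approx 1.8658$. To find the y-coordinate, we need merely plug in this value of x into either of the equations in the original set of equations: $y = \ln x = \ln 1.8658 \approx 0.6237$. Altogether, the solution found is (1.8658, 0.6237). (Strictly speaking, because $\ln x \to -\infty$ as $x \to 0^+$, the graph has a second horizontal intercept very close to the vertical axis, at $x \approx 0.0067$, which is too close to the axis to be seen in these windows.)

[Screenshots: TI84 display after Steps 1, 2, and 3.]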
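The same "rewrite as one equation, then find the zero" method can be sketched in code. Everything named here (`gap`, `bisect_zero`, and the search interval from 1 to 3) is my own assumption for illustration; the interval was simply picked by eye from the graph.

```python
import math

def gap(x):
    """Vertical gap between y = x^4 - x^3 - 5 and y = ln x."""
    return x**4 - x**3 - 5 - math.log(x)

def bisect_zero(g, left, right, tol=1e-10):
    """Halve [left, right] repeatedly, keeping the half where g changes sign."""
    if g(left) * g(right) > 0:
        raise ValueError("g must change sign between the two bounds")
    while right - left > tol:
        mid = (left + right) / 2
        if g(left) * g(mid) <= 0:
            right = mid
        else:
            left = mid
    return (left + right) / 2

# gap(1) is negative and gap(3) is positive, so a zero lies in between.
x = bisect_zero(gap, 1.0, 3.0)
y = math.log(x)          # plug x back into y = ln x
print(x, y)
```

The printed pair should agree with the calculator's answer to about four decimal places.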

Exercise 90. Using your graphing calculator, solve the following systems of equations. (a)

1

1

, y = x5 x3 + 2. (c) y =

x2 + y 2 = 1, y = sin x. (b) y =

, y = x3 + sin x. (Answers

2

1

x

1+ x

on pp. 1065, 1066, and 1067.)


Part II


19

Finite Sequences

Recall that an ordered pair (of real numbers) was simply any pair of real numbers, enclosed

by parentheses, and whose order matters (and this was the only difference between an

ordered pair and a set of two objects).

Example 182. (1, 2) and (2, 1) are both ordered pairs with (1, 2) ≠ (2, 1).

We can analogously define ordered triples, quadruples, quintuples, etc.

Example 183. (1, 2, 3) and (2, 1, 3) are both ordered triples with (1, 2, 3) ≠ (2, 1, 3).

(1, 1, 1, 1) and (2, 4, 1, 3) are both ordered quadruples with (1, 1, 1, 1) ≠ (2, 4, 1, 3).

(2, 2, 3, 2, 2) and (2, 4, 1, 5, 3) are both ordered quintuples with (2, 2, 3, 2, 2) ≠ (2, 4, 1, 5, 3).

We'll simply call all of these ordered n-tuples or even simply tuples. Hence,

Example 184. (1, 2, 3), (2, 1, 3), (1, 1, 1, 1), (2, 4, 1, 3), (2, 2, 3, 2, 2), and (2, 4, 1, 5, 3) are all ordered n-tuples. (1, 2, 3) and (2, 1, 3) are ordered 3-tuples or triples. (1, 1, 1, 1) and (2, 4, 1, 3) are ordered 4-tuples or quadruples. (2, 2, 3, 2, 2) and (2, 4, 1, 5, 3) are ordered 5-tuples or quintuples.

In fact, when talking about tuples, it will be understood that they are ordered, so we'll drop the word "ordered" and simply call them tuples (instead of ordered tuples).

Definition 51. A finite sequence of length n is any n-tuple.

Example 185. (1, 2, 3) and (2, 1, 3) are 3-tuples or, equivalently, finite sequences of length 3.

(1, 2, 3, 4) and (2, 4, 1, 3) are 4-tuples or, equivalently, finite sequences of length 4.

(1, 2, 3, 4, 5) and (2, 4, 1, 5, 3) are 5-tuples or, equivalently, finite sequences of length 5.

We refer to the objects in a sequence as terms.

Example 186. Given the sequence (2, 1, 3), 2 is its first term, 1 is its second term, and

3 is its third term.


19.1

is {1, 2, 3, . . . , n} and whose codomain is R.30

Example 187. (2, 4, 6, 8, 10, 12, 14) is a finite sequence of length 7, consisting of the first

seven even positive integers. A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7};

Codomain R; and

Mapping rule f (n) = 2n, for all n.

Indeed, the values of the function f (1) = 2, f (2) = 4, f (3) = 6, ..., f (7) = 14 exactly list out

the terms in the finite sequence (2, 4, 6, 8, 10, 12, 14).

Example 188. (2, 5, 12, 23, 38, 57, 80, 107, 138, 173) is a finite sequence of length 10. A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

Codomain $\mathbb{R}$; and

Mapping rule: $f(n) = 2n^2 - 3n + 3$, for all n.

Indeed, the values of the function f(1) = 2, f(2) = 5, f(3) = 12, f(4) = 23, ..., f(10) = 173 exactly list out the terms in the finite sequence (2, 5, 12, 23, 38, 57, 80, 107, 138, 173).
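To see the "sequence as a function" idea in action, here is a small sketch (the variable names are mine, chosen for illustration) that applies the mapping rule of Example 188 to every element of the domain and collects the values in order.

```python
def f(n):
    """Mapping rule of Example 188: f(n) = 2n^2 - 3n + 3."""
    return 2 * n**2 - 3 * n + 3

domain = range(1, 11)                     # {1, 2, ..., 10}
sequence = tuple(f(n) for n in domain)    # apply the rule term by term
print(sequence)
# (2, 5, 12, 23, 38, 57, 80, 107, 138, 173)
```

A tuple is used because, like a finite sequence, it is ordered and has a fixed length.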

Exercise 91. (Answer on p. 1068.) For each of the following finite sequences, write down

a corresponding function.

(a) (1, 4, 9, 16, 25, 36, 49, 64, 81, 100).

(b) (2, 5, 8, 11, 14, 17, 20).

(c) (0.5, 4, 13.5, 32, 62.5, 108, 171.5).

(d) (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42).

(e) (18, 14.5).


19.2

Recurrence Relations

SYLLABUS ALERT

Recurrence relations are included in the 9740 (old) syllabus, but not in the 9758 (revised)

syllabus. So you can skip this section if you're taking 9758.

Example 189. (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024) is a finite sequence of length 11. A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

Codomain $\mathbb{R}$; and

Mapping rule: $f(1) = 1$ and $f(n) = 2f(n - 1)$ (the recurrence relation), for all $n \ge 2$.

The equation $f(n) = 2f(n - 1)$ is an example of a recurrence relation. That is, it describes how each term in the sequence is generated, depending on what previous terms were.

In this particular example of a sequence, we can easily write down another corresponding function that does not involve a recurrence relation:

Example 190. (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024) is a finite sequence of length 11. A corresponding function g for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

Codomain $\mathbb{R}$; and

Mapping rule: $g(n) = 2^{n-1}$ (not a recurrence relation), for all n.

If we can describe a sequence without using a recurrence relation, then we can immediately

compute what each term in the sequence is. So in the case of the finite sequence just given,

we prefer to use the function g rather than the function f as a corresponding function.

In contrast, with a recurrence relation, we need to know what some of the previous terms

are, in order to compute each term. So if possible, we prefer to describe sequences without

using recurrence relations.

But sometimes, it is difficult to describe a sequence without using a recurrence relation.


Example 191. (1, 4, 10, 22, 46, 94, 190, 382, 766, 1534) is a finite sequence of length 10. A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

Codomain $\mathbb{R}$; and

Mapping rule: $f(1) = 1$ and $f(n) = 2f(n - 1) + 2$ (the recurrence relation), for all $n \ge 2$.

It is possible to describe the sequence just given without using a recurrence relation, but it does not come obviously (at least to the untrained eye) and takes a little work, as we'll see.
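The difference between the two kinds of description shows up clearly in code. In the sketch below, `by_recurrence` has to build every earlier term before it can reach a later one. The direct rule $3 \cdot 2^{n-1} - 2$ used in `closed_form` is a candidate formula that I am supplying for illustration (it is not stated in the text at this point); the code merely checks that it reproduces the same ten terms.

```python
def by_recurrence(length):
    """Generate terms using f(1) = 1 and f(n) = 2 f(n-1) + 2."""
    terms = [1]
    while len(terms) < length:
        terms.append(2 * terms[-1] + 2)   # needs the previous term
    return terms

def closed_form(n):
    """Assumed direct rule (not from the text): 3 * 2^(n-1) - 2."""
    return 3 * 2 ** (n - 1) - 2

rec = by_recurrence(10)
direct = [closed_form(n) for n in range(1, 11)]
print(rec)            # [1, 4, 10, 22, 46, 94, 190, 382, 766, 1534]
print(rec == direct)  # True
```

Agreement on ten terms is evidence, not proof; showing that a direct rule holds for every n takes a separate argument.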

A recurrence relation can certainly involve more than just the previous term. In the Fibonacci sequence, each term (from the third term onwards) is the sum of the previous two terms: $f(n) = f(n - 2) + f(n - 1)$. This equation is again a recurrence relation.

But in the past ten years' exams, I haven't seen a question where the recurrence relation involves more than just the previous term. So we shall not bother doing much of these.

Example 192. (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89) is a finite sequence of length 11, consisting of the first 11 Fibonacci numbers. A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

Codomain $\mathbb{R}$; and

Mapping rule: $f(n) = 1$, for n = 1, 2; and $f(n) = f(n - 2) + f(n - 1)$ (the recurrence relation), for all $n \ge 3$.
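A recurrence relation that looks back two terms is just as easy to follow step by step. Here is a minimal sketch (the function name `fibonacci` is my own) of the mapping rule in Example 192.

```python
def fibonacci(length):
    """First `length` Fibonacci numbers, using f(1) = f(2) = 1."""
    terms = []
    for n in range(1, length + 1):
        if n <= 2:
            terms.append(1)                       # the two starting terms
        else:
            terms.append(terms[-2] + terms[-1])   # f(n-2) + f(n-1)
    return terms

print(fibonacci(11))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Notice that two starting values are needed here, because the rule refers to two previous terms.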

Exercise 92. Each of the following finite sequences involves a recurrence relation. (Hint:

Each involves only the previous term and also a squared term.) Write down a corresponding

function for each. (a) (3, 4, 9, 64, 3969). (b) (1, 2, 10, 290, 252010). (Answer on p. 1069.)


19.3

notation for this sequence is $(a_n)_{n \le k}$. We call n the index variable or dummy variable. We'll assume the index variable always starts from 1, unless otherwise specified.

Example 193. (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193) is a finite sequence of length 11. We can also write it as $(a_n)_{n \le 11} = (a_1, a_2, a_3, \dots, a_{11})$, where $a_1 = 1$, $a_2 = 1$, $a_3 = 1$, $a_4 = 3$, $a_5 = 5$, ..., $a_{11} = 193$.

Example 194. (1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12, 16, 21, 28, 37, 49, 65, 86, 114, 151) is a finite sequence of length 20. We can also write it as $(b_n)_{n \le 20} = (b_1, b_2, b_3, \dots, b_{20})$, where $b_1 = 1$, $b_2 = 1$, $b_3 = 1$, $b_4 = 2$, $b_5 = 2$, ..., $b_{20} = 151$.

Example 195. (2, 4, 6, 8, 10, 12, 14) is a finite sequence of length 7. We can also write it as $(c_n)_{n \le 7} = (c_1, c_2, c_3, \dots, c_7)$, where $c_1 = 2$, $c_2 = 4$, $c_3 = 6$, ..., $c_7 = 14$.

Example 196. (1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683) is a finite sequence of length 11. We can also write it as $(d_n)_{n \le 11} = (d_1, d_2, d_3, \dots, d_{11})$, where $d_1 = 1$, $d_2 = 1$, $d_3 = 3$, $d_4 = 5$, $d_5 = 11$, ..., $d_{11} = 683$.

We can create new sequences out of old ones, in the obvious fashion:

Example 197. Using the sequence $(a_n)_{n \le 11} = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)$, here are some new sequences we can create:

$(z_n)_{n \le 11} = (a_n + 1)_{n \le 11} = (a_1 + 1, a_2 + 1, a_3 + 1, \dots, a_{11} + 1)$
$= (2, 2, 2, 4, 6, 10, 18, 32, 58, 106, 194) = (z_1, z_2, z_3, \dots, z_{11})$,

$(y_n)_{n \le 11} = (2a_n)_{n \le 11} = (2a_1, 2a_2, 2a_3, \dots, 2a_{11})$
$= (2, 2, 2, 6, 10, 18, 34, 62, 114, 210, 386) = (y_1, y_2, y_3, \dots, y_{11})$,

$(x_n)_{n \le 11} = (a_n - 1)_{n \le 11} = (a_1 - 1, a_2 - 1, a_3 - 1, \dots, a_{11} - 1)$
$= (0, 0, 0, 2, 4, 8, 16, 30, 56, 104, 192) = (x_1, x_2, x_3, \dots, x_{11})$,

$(w_n)_{n \le 11} = (a_n/2)_{n \le 11} = (a_1/2, a_2/2, a_3/2, \dots, a_{11}/2)$
$= (1/2, 1/2, 1/2, 3/2, 5/2, 9/2, 17/2, 31/2, 57/2, 105/2, 193/2) = (w_1, w_2, w_3, \dots, w_{11})$.

Moreover, using two (or more) finite sequences that are of the same length, we can likewise create a new finite sequence (also of the same length), in the obvious fashion:

Example 198. Using the sequences $(a_n)_{n \le 11} = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)$ and $(d_n)_{n \le 11} = (1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683)$, here are some new sequences we can create:

$(e_n)_{n \le 11} = (a_n + d_n)_{n \le 11} = (a_1 + d_1, a_2 + d_2, a_3 + d_3, \dots, a_{11} + d_{11})$
$= (2, 2, 4, 8, 16, 30, 60, 116, 228, 446, 876) = (e_1, e_2, e_3, \dots, e_{11})$,

$(f_n)_{n \le 11} = (a_n d_n)_{n \le 11} = (a_1 d_1, a_2 d_2, a_3 d_3, \dots, a_{11} d_{11})$
$= (1, 1, 3, 15, 55, 189, \dots, 131819) = (f_1, f_2, f_3, \dots, f_{11})$,

$(g_n)_{n \le 11} = (a_n - d_n)_{n \le 11} = (a_1 - d_1, a_2 - d_2, a_3 - d_3, \dots, a_{11} - d_{11})$
$= (0, 0, -2, -2, -6, -12, -26, \dots, -490) = (g_1, g_2, g_3, \dots, g_{11})$,

$(h_n)_{n \le 11} = (a_n/d_n)_{n \le 11} = (a_1/d_1, a_2/d_2, a_3/d_3, \dots, a_{11}/d_{11})$
$= (1, 1, 1/3, 3/5, 5/11, 9/21, \dots, 193/683) = (h_1, h_2, h_3, \dots, h_{11})$.
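Combining two sequences "in the obvious fashion" simply means working term by term. The sketch below is one way to write this down (the helper `combine` is my own hypothetical name); it uses exact fractions for the quotient sequence so that terms like 1/3 are not rounded.

```python
from fractions import Fraction

a = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)
d = (1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683)

def combine(op, s, t):
    """Apply op to matching terms of two sequences of equal length."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    return tuple(op(x, y) for x, y in zip(s, t))

e = combine(lambda x, y: x + y, a, d)   # termwise sum
f = combine(lambda x, y: x * y, a, d)   # termwise product
g = combine(lambda x, y: x - y, a, d)   # termwise difference
h = combine(Fraction, a, d)             # termwise quotient, kept exact

print(e)
print(f[-1], g[-1], h[-1])
```

The length check reflects the requirement that the two sequences be of the same length; with different lengths there would be terms left without a partner.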

There are of course many other new sequences we can create, whether using only one

sequence, using two sequences, or even using three or more sequences.

Remark 6. You cannot create a new sequence using two finite sequences that are of different lengths. For example, given two finite sequences $(a_n)_{n \le 11} = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)$ and $(c_n)_{n \le 7} = (2, 4, 6, 8, 10, 12, 14)$, there is no such sequence as $(a_n + c_n)_{n \le 11}$ or even $(a_n + c_n)_{n \le 7}$. Either of these supposed sequences is simply undefined.

It turns out that we are rarely interested in finite sequences. Instead, we are much more

interested in infinite sequences, which is a simple extension of the concept of finite sequences.


20

Infinite Sequences

We can easily extend the concept of finite sequences to infinite sequences, which have domain $\mathbb{Z}^+ = \{1, 2, 3, 4, \dots\}$ (the entire set of positive integers).

Example 199. (2, 4, 6, 8, 10, 12, 14, 16, 18, . . . ) is the infinite sequence consisting of all the even positive integers. A corresponding function f for this sequence has

Domain $\mathbb{Z}^+$;

Codomain $\mathbb{R}$; and

Mapping rule $f(n) = 2n$ for all n.

Example 200. (1, 3, 6, 10, 15, 21, 28, 36, 45, 55, . . . ) is the infinite sequence consisting of the triangular numbers. A corresponding function f for this sequence has

Domain $\mathbb{Z}^+$;

Codomain $\mathbb{R}$; and

Mapping rule $f(1) = 1$ and $f(n) = 1 + 2 + \dots + n$ for all $n \ge 2$.

Example 201. The infinite sequence (1, 2, 6, 24, 120, 720, 5040, ...) has the corresponding function f with

Domain $\mathbb{Z}^+$;

Codomain $\mathbb{R}$; and

Mapping rule $f(n) = 1 \times 2 \times \dots \times n = n!$ for all n.

Exercise 93. For each of the following infinite sequences, write down a corresponding function. (a) (1, 4, 9, 16, 25, 36, 49, 64, 81, 100, . . . ). (b) (2, 5, 8, 11, 14, 17, 20, . . . ). (c)

(0.5, 4, 13.5, 32, 62.5, 108, 171.5, . . . ). (d) (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42, . . . ).

(Answer on p. 1070.)


20.1

$(a_n)$ is our shorthand notation for an infinite sequence, where $(a_n) = (a_1, a_2, a_3, \dots)$.

As stated, we are rarely interested in finite sequences. And so whenever we talk about a sequence, it should be assumed that we are talking about an infinite sequence, unless otherwise clearly stated.

The idea of creating new sequences carries over from the finite case in the obvious fashion.

Example 202. Let $(a_n) = (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \dots)$ and $(b_n) = (2, 4, 6, 8, 10, 12, 14, 16, 18, 20, \dots)$. Then $(a_n + b_n) = (3, 5, 8, 11, 15, 20, 27, 37, 52, 75, \dots)$.

Analogous to Remark 6, you cannot create a new sequence using a finite sequence and an infinite sequence. Instead, you can only create one using two infinite sequences.

Example 203. Let $(a_n)$ be an infinite sequence and $(b_n)_{n \le 7} = (2, 4, 6, 8, 10, 12, 14)$. Then $(a_n + b_n)$ is undefined.


21

Series

Definition 52. Given a finite sequence $(a_n)_{n \le k}$, its series is the expression $a_1 + a_2 + a_3 + \dots + a_k$.

We refer to $a_1$ as the first term of the sequence and also as the first term of the series. Similarly, $a_2$ is the second term of both the sequence and the series. Etc.

Definition 53. Given a finite sequence $(a_n)_{n \le k}$, its sum of series is the number S such that $S = a_1 + a_2 + a_3 + \dots + a_k$.

Example 204. Given the finite sequence (1, 1, 1, 3, 5, 9, 17, 31), its series is the expression 1 + 1 + 1 + 3 + 5 + 9 + 17 + 31 and its sum of series is the number 68.

Example 205. Given the finite sequence (2, 4, 6, 8, 10, 12, 14), its series is the expression 2 + 4 + 6 + 8 + 10 + 12 + 14 and its sum of series is the number 56.

It may seem strange and unnecessary to distinguish between a series and a sum of series. Aren't they exactly the same thing?

It turns out that expressions like $a_1 + a_2 + a_3 + \dots + a_k$ play an important role in maths and so we want to reserve a special name for the expression itself and distinguish it from the sum of series. For example, we might be specifically interested in the series 1 + 2 + 3, rather than just the sum of series 6.

Clearly, every finite sequence has a well-defined sum of series: simply add up all the terms in the finite sequence!

Definition 54. Given an infinite sequence $(a_n)$, its series is the expression $a_1 + a_2 + a_3 + \dots$.

A series that corresponds to a finite sequence is called a finite series, while a series that

corresponds to an infinite sequence is called an infinite series.


21.1

Every finite sequence has a sum of series. In contrast, not all infinite sequences do:

Example 206. Consider the sequence (an ) = (1, 1, 1, 1, 1, 1, . . . ). Its series is the expression

1 + 1 + 1 + 1 + 1 + . . . . There is no number equal to 1 + 1 + 1 + 1 + 1 + . . . and so a sum of series

does not exist for this sequence.

But some infinite sequences do have sums of series:

Example 207. Consider the sequence (bn ) = (0, 0, 0, 0, 0, 0, . . . ). Its series is the expression

0 + 0 + 0 + 0 + 0 + . . . . The sum of series for this sequence exists and is 0.

Definition 55. An infinite sequence for which a sum of series exists is said to have a

convergent series.

An infinite sequence for which no sum of series exists is said to have a divergent series.

So in the above examples, we say that the sequence (an ) has a divergent series (because its

sum of series does not exist), while the sequence (bn ) has a convergent series (because its

sum of series exists).


These are actually fascinating questions, which means, of course, that they're not in the syllabus. Here is a simple example that gives you a glimpse of the difficulties involved. Chapter 85 in the Appendices (optional) gives the precise definitions of when a series converges or diverges.

Example 208. Consider the sequence $(c_n) = (1, -1, 1, -1, 1, -1, \dots)$, where the terms simply alternate between 1 and −1. Its series is the expression $1 - 1 + 1 - 1 + 1 - 1 + \dots$. Is there any number that is equal to $1 - 1 + 1 - 1 + 1 - 1 + \dots$? It's actually not obvious. On the one hand, we can pair together every two terms like so:

$$1 - 1 + 1 - 1 + 1 - 1 + \dots = \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \dots = 0 + 0 + 0 + \dots$$

and happily conclude that the sum of series is 0. But wait a minute ... what if we instead pair together every two terms like so:

$$1 - 1 + 1 - 1 + 1 - 1 + 1 - \dots = 1 + \underbrace{(-1 + 1)}_{0} + \underbrace{(-1 + 1)}_{0} + \underbrace{(-1 + 1)}_{0} + \dots = 1 + 0 + 0 + 0 + \dots$$

and conclude instead that the sum of series is 1?

It turns out that the sequence $(c_n) = (1, -1, 1, -1, 1, -1, \dots)$ has a divergent series. Or equivalently, a sum of series simply does not exist for this sequence.
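One way to get a feel for why no sum exists is to look at the running totals. The sketch below assumes the usual approach of judging an infinite series by its partial sums (the text defers the precise definition to the Appendices, so treat this as an informal illustration only).

```python
from itertools import accumulate

# First ten terms of (c_n): 1, -1, 1, -1, ...
c = [(-1) ** n for n in range(10)]

# Running totals: c1, c1 + c2, c1 + c2 + c3, ...
partial_sums = list(accumulate(c))
print(partial_sums)  # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
```

The running totals keep jumping between 1 and 0 and never settle on a single number, which matches the two conflicting "answers" obtained by pairing the terms in different ways.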


22

Summation Notation

$\Sigma$ is the upper-case Greek letter sigma. An enlarged version of that letter, $\sum$, read aloud as "sum", is used to express series in compact notation:

Example 209. Consider the series 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9. Another way to write it is to use summation notation:

$$1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = \sum_{n=1}^{9} n.$$

The variable n below the $\sum$ is called the index variable or dummy variable. We could have named it p or z or x or any other letter (instead of n) and it wouldn't have mattered. Hence the name "dummy".

The "= 1" below the $\sum$ says that we start counting the index variable from n = 1. We call the number 1 the starting point.

The 9 above the $\sum$ is called the stopping point. It says that we should stop adding once we hit n = 9.

Altogether, the notation $\sum_{n=1}^{9}$ says that we are adding up 9 terms, namely $a_1, a_2, \dots, a_9$.

The expression to the right of the $\sum_{n=1}^{9}$ tells us what each $a_n$ is. In this example, it is n, which simply says that for every n, $a_n = n$.

Now consider

$$\sum_{n=1}^{9} 1.$$

This says that the starting point is 1 and the ending point is 9. In other words, we add up $a_1, a_2, \dots, a_9$, where for each n, $a_n = 1$. And so $a_1 = 1$, $a_2 = 1$, etc. Altogether:

$$\sum_{n=1}^{9} 1 = a_1 + a_2 + \dots + a_9 = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1.$$

Now consider

$$\sum_{n=1}^{7} 2n.$$

This says that the starting point is 1 and the ending point is 7. In other words, we add up $a_1, a_2, \dots, a_7$, where for each n, $a_n = 2n$. And so $a_1 = 2$, $a_2 = 4$, etc. Altogether:

$$\sum_{n=1}^{7} 2n = a_1 + a_2 + \dots + a_7 = 2 \cdot 1 + 2 \cdot 2 + \dots + 2 \cdot 7 = 2 + 4 + 6 + 8 + 10 + 12 + 14.$$

Now consider

$$\sum_{n=1}^{7} (2n + 1).$$

This says that the starting point is 1 and the ending point is 7. In other words, we add up $a_1, a_2, \dots, a_7$, where for each n, $a_n = 2n + 1$. And so $a_1 = 3$, $a_2 = 5$, etc. Altogether:

$$\sum_{n=1}^{7} (2n + 1) = a_1 + a_2 + \dots + a_7 = \underbrace{(2 \cdot 1 + 1)}_{3} + (2 \cdot 2 + 1) + \dots + \underbrace{(2 \cdot 7 + 1)}_{15}.$$

Example 212. The series 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 can be written as

$$\sum_{n=1}^{10} 2^n.$$

This says that the starting point is 1 and the ending point is 10. In other words, we add up $a_1, a_2, \dots, a_{10}$, where for each n, $a_n = 2^n$. And so $a_1 = 2$, $a_2 = 4$, etc. Altogether:

$$\sum_{n=1}^{10} 2^n = a_1 + a_2 + \dots + a_{10} = 2^1 + 2^2 + \dots + 2^{10}.$$

It's nice to have 1 as the starting point, but there's no reason why this must always be so.

Example 213. The series 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 can be written as

$$\sum_{n=0}^{10} 2^n.$$

This says that the starting point is 0 and the ending point is 10. In other words, we add up $a_0, a_1, a_2, \dots, a_{10}$, where for each n, $a_n = 2^n$. And so $a_0 = 1$, $a_1 = 2$, $a_2 = 4$, etc. Altogether:

$$\sum_{n=0}^{10} 2^n = a_0 + a_1 + \dots + a_{10} = 2^0 + 2^1 + \dots + 2^{10}.$$

Exercise 94. Rewrite each of the following in summation notation. (Answer on p. 1071.)

(a) 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100.

(b) 2 + 5 + 8 + 11 + 14 + 17 + 20 + 23.

(c) 0.5 + 4 + 13.5 + 32 + 62.5 + 108 + 171.5.

Exercise 95. Find the sum of each of the following series. (Answer on p. 1071.)

5

(a) (2 n) .

n=2

17

n=16

33

(c) (x 3).

x=31

Definition 56. Let s, k be integers with $s \le k$. Let f be a function whose domain contains s, s + 1, . . . , k and whose codomain is $\mathbb{R}$. Then

$$\sum_{n=s}^{k} f(n) = f(s) + f(s + 1) + \dots + f(k).$$
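Summation notation translates almost word for word into code: start the index at the starting point, stop at the stopping point, and add up f(n) along the way. The function name `summation` below is my own choice for this sketch.

```python
def summation(f, s, k):
    """Add up f(s) + f(s+1) + ... + f(k), for integers s <= k."""
    if s > k:
        raise ValueError("need s <= k")
    return sum(f(n) for n in range(s, k + 1))   # k itself is included

print(summation(lambda n: n, 1, 9))           # 45
print(summation(lambda n: 2 * n + 1, 1, 7))   # 63
print(summation(lambda n: 2 ** n, 0, 10))     # 2047
```

The three calls correspond to the series 1 + 2 + ... + 9, the series with terms 2n + 1 for n from 1 to 7, and Example 213 (starting point 0). Renaming `n` inside the lambda changes nothing, which is exactly why it is called a dummy variable.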


23

Example 214. Consider the finite sequence (4, 7, 10, 13, 16, 19, 22). A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7} (a subset of $\mathbb{Z}^+$);

Codomain $\mathbb{R}$; and

Mapping rule $f(1) = 4$ and $f(n) - f(n - 1) = 3$ for all $n \ge 2$.

This is an example of a finite arithmetic sequence.

Example 215. Consider the infinite sequence (4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, . . . ). A corresponding function f for this sequence has

Domain $\mathbb{Z}^+$;

Codomain $\mathbb{R}$; and

Mapping rule $f(1) = 4$ and $f(n) - f(n - 1) = 3$ for all $n \ge 2$.

This is an example of an infinite arithmetic sequence.

Definition 57. An arithmetic sequence (or an arithmetic progression) is any finite or infinite sequence $(a_n)$ where $a_{n+1} - a_n$ is a constant for all n = 1, 2, 3, . . . . We call $a_{n+1} - a_n$ the common difference. We call the series for an arithmetic sequence an arithmetic series. And its sum of series (if it exists at all) is called an arithmetic sum of series.

Example 216. The sequence $(a_n) = (1, 4, 7, 10, 13, 16, 19, \dots)$ is an arithmetic sequence because $a_{n+1} - a_n$ is constant for n = 1, 2, 3, . . . .

But the sequence $(b_n) = (1, 1, 4, 7, 10, 13, 16, 19, \dots)$ is not an arithmetic sequence because $b_2 - b_1 = 0 \neq b_3 - b_2 = 3$.

The next fact is intuitively obvious. Clearly, there is no number that, for example, 4 + 7 + 10 + 13 + 16 + 19 + 22 + . . . is equal to.

Fact 12. The infinite arithmetic sequence $(a_n)$ has no sum of series, except in the trivial case where $(a_n) = (0, 0, 0, 0, 0, 0, \dots)$.


23.1

Example 217. You've probably heard of the apocryphal story about an eight-year-old Gauss adding up the numbers from 1 to 100 in an instant. The trick is to pair the first number with the last, the second number with the second last, etc., then use multiplication. Like this:

$$1 + 2 + 3 + 4 + \dots + 100 = \overbrace{\underbrace{(1 + 100)}_{101} + \underbrace{(2 + 99)}_{101} + \underbrace{(3 + 98)}_{101} + \dots + \underbrace{(50 + 51)}_{101}}^{50 \text{ terms}} = 101 \times 50 = 5050.$$

In general, there is a simple formula for the sum of a finite arithmetic series: (First Term + Last Term) × (Number of Terms) ÷ 2.

Fact 13. The finite arithmetic series $a_1 + a_2 + \dots + a_k$ has sum of series $(a_1 + a_k)\dfrac{k}{2}$.

(We will only prove Fact 13 on p. 249.)

Example 218. Consider the arithmetic sequence (7, 17, 27, 37, . . . , 837). Its common difference is 10. The difference between the first and last terms is 830. And so the last term is 830 ÷ 10 = 83 terms after the first. Hence, there are in total 84 terms. By Fact 13, its sum of series is $(7 + 837) \times \dfrac{84}{2} = 35448$.

Example 219. Consider the arithmetic sequence (1, 5, 9, 13, 17, 21, 25, 29, 33, . . . , 393). Its common difference is 4. The difference between the first and last terms is 392. And so the last term is 392 ÷ 4 = 98 terms after the first. Hence, there are in total 99 terms. By Fact 13, its sum of series is $(1 + 393) \times \dfrac{99}{2} = 19503$.
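The recipe used in these examples (count the terms, then apply Fact 13) is easy to put into code. The sketch below uses my own function name `arithmetic_sum`; it takes the first term, the last term, and the common difference, and also compares one result against simply adding every term.

```python
def arithmetic_sum(first, last, difference):
    """Sum of first + (first + d) + ... + last, using Fact 13."""
    steps, remainder = divmod(last - first, difference)
    if remainder != 0:
        raise ValueError("last term is not reachable from the first term")
    count = steps + 1                      # number of terms
    return (first + last) * count // 2     # (first + last) * k / 2

print(arithmetic_sum(1, 100, 1))    # 5050
print(arithmetic_sum(7, 837, 10))   # 35448
print(arithmetic_sum(1, 393, 4))    # 19503

# Brute-force check: add up 7, 17, 27, ..., 837 one term at a time.
print(sum(range(7, 838, 10)))       # 35448
```

The integer division is exact here because (first + last) × (number of terms) is always even for an arithmetic sequence of integers.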

Exercise 96. Rewrite each of the following arithmetic series in summation notation and compute their sums. (a) 2 + 7 + 12 + 17 + 22 + 27 + 32 + ⋯ + 997. (b) 3 + 20 + 37 + 54 + 71 + ⋯ + 1703. (c) 81 + 89 + 97 + 105 + 113 + ⋯ + 8081. (Answer on p. 1072.)


24

Example 220. Consider the finite sequence (1, 2, 4, 8, 16, 32, 64, 128). A corresponding function f for this sequence has

Domain {1, 2, 3, 4, 5, 6, 7, 8} (a subset of $\mathbb{Z}^+$);

Codomain $\mathbb{R}$; and

Mapping rule $f(1) = 1$ and $f(n + 1) \div f(n) = 2$ for all n = 1, 2, . . . , 7.

This is an example of a finite geometric sequence.

Example 221. Consider the infinite sequence (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, . . . ). A corresponding function f for this sequence has

Domain $\mathbb{Z}^+$;

Codomain $\mathbb{R}$; and

Mapping rule $f(1) = 1$ and $f(n + 1) \div f(n) = 2$ for all n = 1, 2, 3, . . . .

This is an example of an infinite geometric sequence.

Definition 58. A geometric sequence (or a geometric progression) is any sequence $(a_n)$ where $a_{n+1} \div a_n$ is constant for all n = 1, 2, 3, . . . . We call $a_{n+1} \div a_n$ the common ratio. We call the series for a geometric sequence a geometric series. And its sum of series (if it exists at all) is called a geometric sum of series.

Example 222. The sequence $(a_n) = (1, 2, 4, 8, 16, 32, \dots)$ is a geometric sequence because $a_{n+1} \div a_n$ is constant for all n = 1, 2, 3, . . . .

But the sequence $(b_n) = (1, 1, 2, 4, 8, 16, 32, \dots)$ is not a geometric sequence because $b_2 \div b_1 = 1 \neq b_3 \div b_2 = 2$.


24.1

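It turns out that just like with finite arithmetic series, there is a nice formula for the finite geometric series. Let's start with the simple case first where the first term is simply 1.

Fact 14. $1 + r + r^2 + r^3 + \dots + r^{n-1} = \dfrac{1 - r^n}{1 - r}$.

Proof. Let $S = 1 + r + r^2 + r^3 + \dots + r^{n-1}$. Then $rS = r + r^2 + r^3 + \dots + r^n$.

Now take the difference: $S - rS = 1 - r^n$.

Hence, $S = \dfrac{1 - r^n}{1 - r}$.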

The trick used in the above proof is called the method of differences and the A-level

syllabus requires you to know it. The general case of a geometric series follows immediately

from the above:

Fact 15. $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots + a_1 r^{n-1} = a_1 \dfrac{1 - r^n}{1 - r}$.

Example 223. Consider the geometric sequence (1, 2, 4, 8, 16, . . . , 1024). Its common ratio is 2. The ratio of the last term to the first is $1024 \div 1 = 1024 = 2^{10}$. And so the last term is 10 terms after the first. Hence, there are in total 11 terms. Thus, its sum of series is

$1 \times \dfrac{1 - 2^{11}}{1 - 2} = \dfrac{-2047}{-1} = 2047$.

Example 224. Consider the geometric sequence (4, 12, 36, 108, . . . , 8748). Its common ratio is 3. The ratio of the last term to the first is $8748 \div 4 = 2187 = 3^7$. And so the last term is 7 terms after the first. Hence, there are in total 8 terms. Thus, its sum of series is

$4 \times \dfrac{1 - 3^8}{1 - 3} = 4 \times \dfrac{-6560}{-2} = 4 \times 3280 = 13120$.
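The closed form for a finite geometric series can be compared against brute-force addition. The following is only a sketch: the function names `geometric_sum` and `direct_sum` are made up for illustration, and exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

def geometric_sum(a1, r, n):
    """Closed form a1 * (1 - r**n) / (1 - r) for the first n terms (needs r != 1)."""
    a1, r = Fraction(a1), Fraction(r)
    return a1 * (1 - r**n) / (1 - r)

def direct_sum(a1, r, n):
    """Add up a1, a1*r, ..., a1*r**(n-1) one term at a time."""
    return sum(Fraction(a1) * Fraction(r)**k for k in range(n))

# The two geometric sequences (1, 2, ..., 1024) and (4, 12, ..., 8748).
print(geometric_sum(1, 2, 11), direct_sum(1, 2, 11))  # 2047 2047
print(geometric_sum(4, 3, 8), direct_sum(4, 3, 8))    # 13120 13120
```

Agreement on a few cases is of course not a proof, but it is a quick way to catch a miscounted number of terms.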

Exercise 97. Rewrite each of the following geometric series into summation notation and compute their sums. (a) $7 + 14 + 28 + 56 + \dots + 448 + 896$. (b) $20 + 10 + 5 + \dots + 5/8$. (c) $1 + 1/3 + 1/9 + \dots + 1/243$. (Answer on p. 1073.)


24.2

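Perhaps surprisingly, it turns out that under a certain condition, an infinite geometric sequence can have a sum of series. Again, let's start with the simple case:

Fact 16. If $|r| < 1$, then $1 + r + r^2 + r^3 + \dots = \dfrac{1}{1 - r}$.

Proof. Let $S = 1 + r + r^2 + r^3 + \dots$. Then $rS = r + r^2 + r^3 + \dots$. (By the way, we can also use summation notation for infinite series: $S = \sum_{n=0}^{\infty} r^n$ and $rS = \sum_{n=0}^{\infty} r^{n+1}$.) Every term of $rS$ cancels against a term of $S$, so the difference is simply $S - rS = 1$.

And so, $S = \dfrac{1}{1 - r}$.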

Fact 17. If $|r| < 1$, then $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots = \dfrac{a_1}{1 - r}$.

Fact 18. If $|r| \geq 1$, then $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots$ diverges.
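To get a feel for the convergence condition, one can watch the partial sums numerically. This is a rough sketch with made-up values ($a_1 = 3$, $r = 0.25$, then $r = 2$); `partial_sum` is a hypothetical helper, not something from the syllabus.

```python
def partial_sum(a1, r, n):
    """Sum of the first n terms of a1 + a1*r + a1*r**2 + ..."""
    total, term = 0.0, float(a1)
    for _ in range(n):
        total += term
        term *= r
    return total

# |r| < 1: the partial sums settle down near a1 / (1 - r).
a1, r = 3, 0.25
limit = a1 / (1 - r)
for n in (2, 5, 10, 30):
    print(n, partial_sum(a1, r, n), limit)

# |r| >= 1: the partial sums keep growing instead.
for n in (2, 5, 10, 30):
    print(n, partial_sum(a1, 2, n))
```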

Exercise 98. Rewrite each of the following infinite geometric series in summation notation

and compute its sum. (a) 6 + 9/2 + 27/8 + . . . . (b) 20 + 10 + 5 + . . . . (c) 1 + 1/3 + 1/9 + . . . . (Answer

on p. 1073.)


25

SYLLABUS ALERT

Proof by the method of mathematical induction is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this Chapter if you're taking 9758.

We'll now learn a new technique called proof by the method of mathematical induction. It's pretty difficult, so go real slow.31

Imagine an infinite chain of dominos. Our goal is to knock all of them down. Suppose we

manage to do two things:

1. Knock down the 1st domino (the base case).

2. Prove that if the jth domino is knocked down, then so too is the (j + 1)th domino

(the inductive step).

Then we will have succeeded. Because once the 1st domino is knocked down, the inductive

step implies that the 2nd domino is also knocked down, and now again by the inductive

step the 3rd domino is also knocked down, and now again by the inductive step the 4th

domino is also knocked down, ..., ad infinitum (to infinity).

31 Which is perhaps why they decided to drop it from the revised 9758 syllabus! It does appear though as the first topic of Further Maths, which will be revived in 2017 and for which a free textbook will soon be appearing!

which I'll standardise into a three-step recipe:

Step #1. Let P(k) be (shorthand for) the proposition to be proven. Our goal is to show

that P(k) is true for all k = 1, 2, 3, . . .

Step #2 (the base case). Verify that P(1) is true.

Step #3 (the inductive step). Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Step #1 rarely involves much work. Step #2 is usually, but not always, very easy. Step #3 is usually the hardest part; on the A-level exams, it usually just involves some (or a lot of) algebra.

Why does the method of mathematical induction work? Step #2 (the base case) shows

that P(1) is true (knock down the 1st domino). Step #3 (the inductive step) then

implies that P(2) is also true (the falling 1st domino knocks down the 2nd domino).

Step #3 (the inductive step) then implies that P(3) is also true (the falling 2nd domino

knocks down the 3rd domino).

Step #3 (the inductive step ) then implies that P(4) is also true (the falling 3rd domino

knocks down the 4th domino).

Ad infinitum (to infinity). Thus, we have proven that P(k) is true for all k = 1, 2, 3, . . . , as

desired.

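Too abstract? Work through all the examples and exercises and you should find that it is not very difficult. For our first example, we'll reprove an earlier fact, but now using the method of mathematical induction.

Fact 14 (reproduced from p. 243). $1 + r + r^2 + r^3 + \dots + r^{n-1} = \dfrac{1 - r^n}{1 - r}$.

Proof. Step #1. Let P(k) be (shorthand for) the proposition that

$1 + r + r^2 + r^3 + \dots + r^{k-1} = \dfrac{1 - r^k}{1 - r}$.

Step #2 (the base case). P(1) is true because

$1 = \dfrac{1 - r^1}{1 - r}$.

Step #3 (the inductive step). Assume that P(j) is true. That is,

$1 + r + r^2 + r^3 + \dots + r^{j-1} \overset{1}{=} \dfrac{1 - r^j}{1 - r}$.

Our goal is to show that P(j + 1) is true, that is,

$1 + r + r^2 + r^3 + \dots + r^j = \dfrac{1 - r^{j+1}}{1 - r}$.

To this end, write:

$1 + r + r^2 + r^3 + \dots + r^j = (1 + r + r^2 + r^3 + \dots + r^{j-1}) + r^j \overset{1}{=} \dfrac{1 - r^j}{1 - r} + r^j = \dfrac{1 - r^j + (1 - r)r^j}{1 - r} = \dfrac{1 - r^{j+1}}{1 - r}$, as desired.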

In this particular instance, the method of mathematical induction was terribly cumbersome,

compared to our earlier four-sentence proof (p. 243). But it turns out that in many other

instances, this method is the best and sometimes the only tool to use.

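Let's try more examples.

Example 225. Prove that $\sum_{r=1}^{n} r^2 = \dfrac{n(n + 1)(2n + 1)}{6}$.

Proof. Step #1. Let P(k) be (shorthand for) the proposition that

$\sum_{r=1}^{k} r^2 = \dfrac{k(k + 1)(2k + 1)}{6}$.

Step #2 (the base case). P(1) is true because

$\sum_{r=1}^{1} r^2 = 1 = \dfrac{1(1 + 1)(2 \times 1 + 1)}{6}$.

Step #3 (the inductive step). Assume that P(j) is true. That is,

$\sum_{r=1}^{j} r^2 \overset{1}{=} \dfrac{j(j + 1)(2j + 1)}{6}$.

Our goal is to show that P(j + 1) is true, that is,

$\sum_{r=1}^{j+1} r^2 = \dfrac{(j + 1)[(j + 1) + 1][2(j + 1) + 1]}{6}$.

To this end, write:

$\sum_{r=1}^{j+1} r^2 \overset{2}{=} \sum_{r=1}^{j} r^2 + (j + 1)^2$

$\overset{3}{=} \dfrac{j(j + 1)(2j + 1)}{6} + (j + 1)^2$ (Using $\overset{1}{=}$)

$\overset{6}{=} \dfrac{j + 1}{6}[j(2j + 1) + 6(j + 1)]$

$\overset{7}{=} \dfrac{j + 1}{6}(2j^2 + 7j + 6)$

$\overset{5}{=} \dfrac{(j + 1)(j + 2)(2j + 3)}{6}$

$\overset{4}{=} \dfrac{(j + 1)[(j + 1) + 1][2(j + 1) + 1]}{6}$,

as desired.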

I just used the "backwards-forwards" method. The order in which I wrote down each line is given by the numbers above each = sign.

Another trick is to exploit the fact that "it has got to work out right". So for example, it might not immediately be obvious that $2j^2 + 7j + 6 = (j + 2)(2j + 3)$, but you know "it has got to work out right" and thus this must surely be true (unless of course you made some mistake with the algebra somewhere). And if you expand the RHS, you find that this equation is indeed true.
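A computer cannot replace the induction proof, but it can sanity-check the pieces of one. The sketch below (with a made-up helper `sum_of_squares`) tests the base case, the inductive step for the first 500 values of j, and the factorisation of the quadratic that appears when $j(2j + 1) + 6(j + 1)$ is expanded.

```python
def sum_of_squares(n):
    """Closed form n(n+1)(2n+1)/6, which is always an integer."""
    return n * (n + 1) * (2 * n + 1) // 6

# Base case P(1).
base_case_ok = sum_of_squares(1) == 1

# Inductive step, checked numerically for j = 1, ..., 500:
# formula(j) + (j+1)^2 should equal formula(j+1).
step_ok = all(
    sum_of_squares(j) + (j + 1) ** 2 == sum_of_squares(j + 1)
    for j in range(1, 501)
)

# j(2j + 1) + 6(j + 1) = 2j^2 + 7j + 6 = (j + 2)(2j + 3).
factor_ok = all(
    j * (2 * j + 1) + 6 * (j + 1) == 2 * j * j + 7 * j + 6 == (j + 2) * (2 * j + 3)
    for j in range(-100, 101)
)

print(base_case_ok, step_ok, factor_ok)  # True True True
```

If the algebra had gone wrong somewhere, one of these checks would print False and point to the faulty line.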

Fact 13 (reproduced from p. 241). The finite arithmetic sequence $(a_n)_{n \leq k}$ has sum of series $(a_1 + a_k)\dfrac{k}{2}$.

Proof. Step #1. Let P(k) be (shorthand for) the proposition that

$a_1 + a_2 + \dots + a_k = \dfrac{k(a_1 + a_k)}{2}$.

Step #2 (the base case). P(1) is true because

$a_1 = \dfrac{1(a_1 + a_1)}{2}$.

Step #3 (the inductive step). Assume that P(j) is true. That is,

$a_1 + a_2 + \dots + a_j \overset{1}{=} \dfrac{j(a_1 + a_j)}{2}$.

Our goal is to show that P(j + 1) is true, that is,

$a_1 + a_2 + \dots + a_{j+1} = \dfrac{(j + 1)(a_1 + a_{j+1})}{2}$.

Let's first observe that $a_j - a_1 = (j - 1)(a_{j+1} - a_j)$. In words, this equation says: Consider the difference between the jth term and the first term; it is equal to $j - 1$ times the difference between two consecutive terms. Rearranging, we have $a_j \overset{2}{=} \dfrac{(j - 1)a_{j+1} + a_1}{j}$.

Now write:

$a_1 + a_2 + \dots + a_{j+1} = (a_1 + a_2 + \dots + a_j) + a_{j+1}$

$\overset{1}{=} \dfrac{j(a_1 + a_j)}{2} + a_{j+1}$

$\overset{2}{=} \dfrac{j\{a_1 + [(j - 1)a_{j+1} + a_1]/j\}}{2} + a_{j+1} = \dfrac{ja_1 + (j - 1)a_{j+1} + a_1}{2} + a_{j+1}$

$= \dfrac{(j + 1)a_1 + (j - 1)a_{j+1}}{2} + a_{j+1} = \dfrac{(j + 1)a_1 + (j - 1)a_{j+1} + 2a_{j+1}}{2}$

$= \dfrac{(j + 1)a_1 + (j + 1)a_{j+1}}{2} = \dfrac{(j + 1)(a_1 + a_{j+1})}{2}$, as desired.

Exercise 99. Prove that $\sum_{r=1}^{n} r^3 = \left[\dfrac{n(n + 1)}{2}\right]^2$. (Answer on p. 1074.) By the way, this shows that $\sum_{r=1}^{n} r^3 = \left(\sum_{r=1}^{n} r\right)^2$.

Exercise 100. (Answer on p. 1075.) Let $a \in \mathbb{R}$ with $a \neq 1$. Prove that

$\sum_{r=1}^{n} r a^r = a\,\dfrac{1 - (n + 1)a^n + na^{n+1}}{(1 - a)^2}$.

n

r=1

. (Answer on p. 1076.)

30


Part III

Vectors


26

26.1

Example 226. The line running through points a and b goes forever, in both directions

(red dotted line). In contrast, the line segment ab is finite. The line ab is a different

mathematical object from the line segment ab.

The length of the line segment ab is thus a well-defined concept. In contrast, it makes no

sense to talk about the length of the line ab.

A ray is a portion of a line, beginning at some point along the line, then going towards

infinity. You can think of a ray as a half-infinite-line. The figure above illustrates in grey

the ray that starts from the point a and goes in the direction b.

This textbook will strictly reserve the word ray to mean a half-infinite-line. But you should

know that some other writers use ray to mean a (finite) line segment.


26.2

We will not use the degree as a unit of measurement for angles. In this textbook, the unit of measurement for angles is the radian. As we'll see in a moment, the radian is actually a "unitless" unit. So we'll always write, for example, $\pi/3$ instead of $\pi/3$ rad.

$\dfrac{\pi}{4}$ rad $= 45^\circ$, $\dfrac{\pi}{2}$ rad $= 90^\circ$, $\pi$ rad $= 180^\circ$, and $2\pi$ rad $= 360^\circ$. (This last sentence is the one and only time in this textbook that we'll use degrees as a unit of measurement for angles.)

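$\theta$ is the zero angle if $\theta = 0$,

$\theta$ is an acute angle if $\theta \in \left(0, \dfrac{\pi}{2}\right)$,

$\theta$ is a right angle if $\theta = \dfrac{\pi}{2}$,

$\theta$ is an obtuse angle if $\theta \in \left(\dfrac{\pi}{2}, \pi\right)$,

$\theta$ is a straight angle if $\theta = \pi$,

$\theta$ is a reflex angle if $\theta \in (\pi, 2\pi)$.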

In the figure below, the angle A is acute, R is right, O is obtuse, S is straight, and X is

reflex. The zero angle is not depicted.

By convention, every angle is depicted as a sector of a circle, unless it is a right angle, in

which case it is depicted by a square.


26.3

Triangles are also given different names, depending on the size of their largest angle. A

triangle is:

Acute if its largest angle is acute;

Right if its largest angle is right; and

Obtuse if its largest angle is obtuse.

In the figure below, the largest angle of each triangle is highlighted.

[Figure: an obtuse triangle, an acute triangle, and a right triangle.]

26.4

Both the sine and cosine functions have domain $\mathbb{R}$ and codomain $[-1, 1]$.

The tangent function has domain $\dots \cup (-1.5\pi, -0.5\pi) \cup (-0.5\pi, 0.5\pi) \cup (0.5\pi, 1.5\pi) \cup \dots$, i.e. all reals except half-integer multiples of $\pi$. And the tangent function's codomain is $\mathbb{R}$.

Draw a unit circle. Then given any point $p = (p_x, p_y)$ on the unit circle and the angle A that the line segment op makes with the positive x-axis, we define $\sin A = p_y$, $\cos A = p_x$, and $\tan A = \dfrac{p_y}{p_x}$. Note that the line segment op has length 1.

[Figure: the unit circle, showing the point $p = (p_x, p_y)$ and the angle A.]

In the case where A is acute (the point p is in the top-right quadrant of the cartesian plane), one mnemonic is SOH, CAH, TOA: Sine is Opposite over Hypotenuse, Cosine is Adjacent over Hypotenuse, and Tangent is Opposite over Adjacent.

26.5

Sine and cosine fluctuate between $-1$ and $1$. We describe their fluctuations as being sinusoidal. In contrast, tangent fluctuates between $-\infty$ and $\infty$. At half-integer multiples of $\pi$, the tangent function is undefined.

[Figure: graphs of $y = \sin x$, $y = \cos x$, and $y = \tan x$.]

You don't need to memorise the following (because you have a calculator). But you will solve problems a little more quickly if you have these memorised.

x        0    π/6     π/4     π/3     π/2         2π/3     3π/4     5π/6     π
sin x    0    1/2     √2/2    √3/2    1           √3/2     √2/2     1/2      0
cos x    1    √3/2    √2/2    1/2     0           −1/2     −√2/2    −√3/2    −1
tan x    0    √3/3    1       √3      Undefined   −√3      −1       −√3/3    0

26.6

For all x for which all expressions are well defined, we have:

$\tan x = \dfrac{\sin x}{\cos x}$,

$\sin(-x) = -\sin x$, $\sin(x + 2\pi) = \sin x$,

$\cos(-x) = \cos x$, $\cos(x + 2\pi) = \cos x$,

$\tan(-x) = -\tan x$, $\tan(x + 2\pi) = \tan x$.

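The following formulae will appear in the List of Formulae you'll get during exams, so you don't need to memorise them. Exam Tip: Whenever you see a question with trigonometric functions, make sure you have this list right next to you! For all A, B, P, Q for which all expressions are well-defined, we have:

$\sin(A \pm B) = \sin A \cos B \pm \cos A \sin B$,

$\cos(A \pm B) = \cos A \cos B \mp \sin A \sin B$,

$\tan(A \pm B) = \dfrac{\tan A \pm \tan B}{1 \mp \tan A \tan B}$,

$\sin 2A = 2 \sin A \cos A$,

$\cos 2A = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A$,

$\tan 2A = \dfrac{2 \tan A}{1 - \tan^2 A}$,

$\sin P + \sin Q = 2 \sin\left(\dfrac{P + Q}{2}\right) \cos\left(\dfrac{P - Q}{2}\right)$,

$\sin P - \sin Q = 2 \cos\left(\dfrac{P + Q}{2}\right) \sin\left(\dfrac{P - Q}{2}\right)$,

$\cos P + \cos Q = 2 \cos\left(\dfrac{P + Q}{2}\right) \cos\left(\dfrac{P - Q}{2}\right)$,

$\cos P - \cos Q = -2 \sin\left(\dfrac{P + Q}{2}\right) \sin\left(\dfrac{P - Q}{2}\right)$.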
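When copying a trigonometric formula from the list, it is easy to slip a sign. One way to catch such slips is a quick numerical test at arbitrary angles. The sketch below uses the standard addition, double-angle and sum-to-product identities; the angles 0.7, 0.3, 1.1, 0.4 and the helper `close` are arbitrary choices for illustration.

```python
import math

def close(x, y):
    """True if x and y agree up to floating-point rounding."""
    return math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)

A, B = 0.7, 0.3   # arbitrary test angles, in radians
P, Q = 1.1, 0.4
s, c, t = math.sin, math.cos, math.tan

checks = {
    "sin(A+B)": close(s(A + B), s(A) * c(B) + c(A) * s(B)),
    "cos(A+B)": close(c(A + B), c(A) * c(B) - s(A) * s(B)),
    "tan(A+B)": close(t(A + B), (t(A) + t(B)) / (1 - t(A) * t(B))),
    "cos 2A": close(c(2 * A), 1 - 2 * s(A) ** 2),
    "tan 2A": close(t(2 * A), 2 * t(A) / (1 - t(A) ** 2)),
    "sin P + sin Q": close(s(P) + s(Q), 2 * s((P + Q) / 2) * c((P - Q) / 2)),
    "cos P - cos Q": close(c(P) - c(Q), -2 * s((P + Q) / 2) * s((P - Q) / 2)),
}

for name, ok in checks.items():
    print(name, ok)
```

A formula with a wrong sign or a swapped sine and cosine would show up as False here.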

26.7

We define $\sin^2 A$ to be the square of $\sin A$. One might thus suppose that analogously, $\sin^{-1} x = 1/\sin x$, but this is not so! Instead:

Definition 59. The arcsine function, denoted $\sin^{-1}$, has domain $[-1, 1]$, codomain (and range) $[-0.5\pi, 0.5\pi]$, and rule $x \mapsto y$ where $\sin y = x$.

Below is the graph of the arcsine function. The endpoints $(-1, -0.5\pi)$ and $(1, 0.5\pi)$ are marked with red dots.

[Figure: graph of $y = \sin^{-1} x$ for $x \in [-1, 1]$.]

We refer to $[-0.5\pi, 0.5\pi]$ as the principal values of the arcsine function. What does this mean?

Angles come "full circle" every $2\pi$ radians. And so for example, $\sin\dfrac{\pi}{6} = 0.5$. But also $\sin\left(\dfrac{\pi}{6} + 2\pi\right) = 0.5$. And also $\sin\left(\dfrac{\pi}{6} + 4\pi\right) = 0.5$. And also $\sin\left(\dfrac{\pi}{6} - 2\pi\right) = 0.5$. Indeed, $\sin\left(\dfrac{\pi}{6} + 2k\pi\right) = 0.5$ for any $k \in \mathbb{Z}$. We say that the sine function is periodic.

Yet we do not say that $\sin^{-1}(0.5) = \dfrac{\pi}{6} + 2k\pi$ for any $k \in \mathbb{Z}$ because this would mean that $\sin^{-1}$ maps each element in the domain to more than one (indeed infinitely many) elements in the codomain. And so $\sin^{-1}$ wouldn't be a function.

Instead, we define the arcsine function so that its principal values are $[-0.5\pi, 0.5\pi]$. That is, the codomain of the arcsine function is $[-0.5\pi, 0.5\pi]$. And thus, $\sin^{-1}(0.5) = \dfrac{\pi}{6}$.

Note that the choice of $[-0.5\pi, 0.5\pi]$ as the principal values of the arcsine function is a somewhat arbitrary convention. We could equally well have chosen, say, $[0.5\pi, 1.5\pi]$ as our principal values. It's nicer though that our principal values are centred on 0.
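Most calculators and programming languages follow the same principal-value convention. As a small illustration (assuming Python's standard `math` module, whose `asin` returns a value between $-\pi/2$ and $\pi/2$):

```python
import math

# Many angles have sine 0.5 ...
angles = [math.pi / 6 + 2 * k * math.pi for k in (-1, 0, 1, 2)]
same_sine = [math.isclose(math.sin(angle), 0.5) for angle in angles]
print(same_sine)  # [True, True, True, True]

# ... but the arcsine picks out exactly one of them, the principal value.
principal = math.asin(0.5)
print(math.isclose(principal, math.pi / 6))        # True
print(-math.pi / 2 <= principal <= math.pi / 2)    # True
```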

Definition 60. The arccosine function, denoted $\cos^{-1}$, has domain $[-1, 1]$, codomain (and range) $[0, \pi]$, and rule $x \mapsto y$ where $\cos y = x$.

Below is the graph of the arccosine function. The endpoints $(-1, \pi)$ and $(1, 0)$ are marked with blue dots.

Note that $[0, \pi]$ are the principal values of the arccosine function. Why can't we select $[-0.5\pi, 0.5\pi]$ as the principal values for the arccosine function, like we did for the arcsine function?32

[Figure: graph of $y = \cos^{-1} x$ for $x \in [-1, 1]$.]

Definition 61. The arctangent function, denoted $\tan^{-1}$, has domain $\mathbb{R}$, codomain (and range) $(-0.5\pi, 0.5\pi)$, and rule $x \mapsto y$ where $\tan y = x$.

Below is the graph of the arctangent function. There are two horizontal asymptotes, namely $y = 0.5\pi$ and $y = -0.5\pi$. That is, as $x \to \pm\infty$, $y \to \pm 0.5\pi$.

Note that $(-0.5\pi, 0.5\pi)$ are the principal values of the arctangent function.

[Figure: graph of $y = \tan^{-1} x$, with the horizontal asymptotes $y = 0.5\pi$ and $y = -0.5\pi$.]

Remark 7. This notation can be tremendously confusing, which is why many writers prefer to write $\arcsin x$, $\arccos x$, and $\arctan x$ instead of $\sin^{-1} x$, $\cos^{-1} x$, $\tan^{-1} x$. But the Singapore-Cambridge A-level syllabus does not use the $\arcsin x$, $\arccos x$, or $\arctan x$ notation and so neither shall this textbook.

26.8

[Figure: a triangle with sides a and b meeting at the angle C, and the angle A; the height $a \sin C$ divides the base b into lengths $a \cos C$ and $b - a \cos C$.]

Proposition 4. A triangle with sides of lengths a, b, and c and angles A, B, and C has area $0.5ab \sin C$.

Proof. The triangle has base b and height $a \sin C$. Hence, its area is $0.5ab \sin C$.

Proposition 5. (The Law of Sines.) For a triangle with sides of lengths a, b, and c and angles A, B, and C,

$\dfrac{a}{\sin A} = \dfrac{b}{\sin B} = \dfrac{c}{\sin C}$.

Proof. The area of the above triangle is $0.5ab \sin C$. By symmetry, it is also $0.5bc \sin A$ and $0.5ac \sin B$. Equate these and divide by $0.5abc$:

$0.5ab \sin C = 0.5bc \sin A = 0.5ac \sin B \iff \dfrac{\sin C}{c} = \dfrac{\sin A}{a} = \dfrac{\sin B}{b} \iff \dfrac{a}{\sin A} = \dfrac{b}{\sin B} = \dfrac{c}{\sin C}$.

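Proposition 6. (The Law of Cosines.) For a triangle with sides of lengths a, b, and c and angles A, B, and C, $c^2 = a^2 + b^2 - 2ab \cos C$.

Proof. In the above figure, c is the hypotenuse of a right triangle whose other two sides have lengths $a \sin C$ and $b - a \cos C$. So by the Pythagorean Theorem,

$c^2 = (a \sin C)^2 + (b - a \cos C)^2$

$= a^2 \sin^2 C + b^2 - 2ab \cos C + a^2 \cos^2 C$

$= a^2 (\sin^2 C + \cos^2 C) + b^2 - 2ab \cos C$

$= a^2 + b^2 - 2ab \cos C$,

where the last line uses the identity $\sin^2 C + \cos^2 C = 1$.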

One perhaps-obvious implication of the Law of Cosines is that the length of any one side of a triangle is always less than the sum of the lengths of the other two sides.

Corollary 3. For a triangle with sides of lengths a, b, and c, $a < b + c$.

Proof. $c^2 = a^2 + b^2 - 2ab \cos C = a^2 + b^2 - 2ab + 2ab - 2ab \cos C = (a - b)^2 + 2ab(1 - \cos C) > (a - b)^2$. Hence, $c > a - b$ or $a < b + c$.
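The area formula, the Law of Sines and the Law of Cosines can be tried out together on one concrete triangle. The numbers below ($a = 5$, $b = 7$, $C = \pi/3$) are made up for illustration. Because a is not the longest side here, the angle A must be acute, which is why it is safe to recover it with the arcsine.

```python
import math

a, b, C = 5.0, 7.0, math.pi / 3     # two sides and the angle between them

# Law of Cosines gives the third side.
c = math.sqrt(a**2 + b**2 - 2 * a * b * math.cos(C))

# Law of Sines gives the angle A opposite side a (acute in this triangle).
A = math.asin(a * math.sin(C) / c)
B = math.pi - A - C                 # the angles of a triangle sum to pi

area = 0.5 * a * b * math.sin(C)

print(c)
print(a / math.sin(A), b / math.sin(B), c / math.sin(C))  # three (nearly) equal ratios
print(area, 0.5 * b * c * math.sin(A), 0.5 * a * c * math.sin(B))
print(a < b + c, b < a + c, c < a + b)  # True True True
```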

27

Example 227. The points $a = (-1, 2)$, $b = (3, -1)$, $c = (-1, 1)$, and $d = (3, -2)$ can be illustrated graphically on the cartesian plane. The origin (0, 0) is usually named o.

[Figure: the points a, b, c, d, and the origin o on the cartesian plane.]

We will not formally define vectors, because to do so would require more maths than is

covered at A-level. But informally, a vector is an arrow with two properties: direction

and length.


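Example 228. In the figure, $\overrightarrow{ab}$, $\overrightarrow{cd}$, and $\mathbf{u}$ are all vectors. (As we'll see, there are multiple ways to denote vectors.)

[Figure: the points a, b, c, and d on the cartesian plane, with the vector $\overrightarrow{ab} = \vec{v} = \mathbf{v}$ and the vector $\overrightarrow{cd} = \vec{v} = \mathbf{v}$ (each of length 5), and the vector $\mathbf{u}$.]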

Given two points a and b, $\overrightarrow{ab}$ denotes the vector from point a to point b. The word vector means "carrier" (in Latin). You may have learnt in biology that mosquitoes are vectors, because they carry diseases (to humans). In mathematics likewise, a vector "carries" us from one point to another.

Example 229. The vector $\overrightarrow{ab}$ carries us from point a to point b. The vector $\overrightarrow{cd}$ carries us from point c to point d.

Like a point, a vector can be described as being an ordered pair of real numbers.

Example 230. The vector $\overrightarrow{ab} = (4, -3)$ carries us 4 units to the right and 3 units down. The vector $\overrightarrow{cd} = (4, -3)$ carries us 4 units to the right and 3 units down. The vector $\mathbf{u} = (2, -1.5)$ carries us 2 units to the right and 1.5 units down.

Note that we're now using the (x, y) ordered set notation for the third time!33

Do not confuse a point with a vector!

Example 231. The point $(4, -3)$ is a zero-dimensional object. In contrast, the vector $(4, -3)$ is a two-dimensional object.

A vector $(x, y)$ can also be written in column notation as $\dbinom{x}{y}$.

Example 232. $\overrightarrow{ab} = (4, -3) = \dbinom{4}{-3}$ and $\mathbf{u} = (2, -1.5) = \dbinom{2}{-1.5}$.

The $\dbinom{a}{b}$ notation for vectors is very useful, because as we'll see shortly, we'll be doing a lot of addition and multiplication with vectors, and this notation can help us see better (in a literal sense). But in print, I'll often prefer using the (a, b) notation, simply because this takes up less space.

For the vector $\overrightarrow{ab}$, the point a is called the vector's tail and the point b is called the vector's head. This is potentially confusing, so always remember: a vector carries us from tail to head and not the other way round!

A vector is defined by two characteristics: direction and length.

It must be stressed that the tail and head of a vector do not matter. Only the direction and length do. So long as two vectors have the same direction and length, they are considered to be the exact same vector. Examples to illustrate:

33 So far, we have used (x, y) to denote (i) an open interval (specifically, the set of real numbers greater than x but smaller than y); (ii) the ordered pair of real numbers x and y; and now also (iii) the vector that carries us x units to the right and y units up.

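Example 233. Informally, $\overrightarrow{ab}$, $\overrightarrow{cd}$, and $\mathbf{u}$ all point in the same direction. $\overrightarrow{ab}$ and $\overrightarrow{cd}$ have the same length, which we can compute using the Pythagorean Theorem as $\sqrt{3^2 + 4^2} = 5$.

Hence, $\overrightarrow{ab}$ and $\overrightarrow{cd}$ are considered to be exactly the same: $\overrightarrow{ab} = \overrightarrow{cd}$. Even though they have different heads and tails, both $\overrightarrow{ab}$ and $\overrightarrow{cd}$ carry us 4 units right and 3 units down. The vector $(4, -3)$ can carry us from a to b or from c to d. Thus, $\overrightarrow{cd} = (4, -3) = \overrightarrow{ab} = (4, -3)$. They are one and the same vector.

In contrast, the vector $\mathbf{u}$ has only half the length of $\overrightarrow{ab}$ and so $\mathbf{u} \neq \overrightarrow{ab}$. (Indeed, as we shall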

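Example 234. The vector $(0, -1)$ can carry us from a to c or from b to d. Thus, $\overrightarrow{ac} = (0, -1)$. Thus, $\overrightarrow{bd} = (0, -1) = \overrightarrow{ac}$.

But, $\overrightarrow{ca} = (0, 1) \neq \overrightarrow{ac} = (0, -1)$, and $(0, -0.5) \neq \overrightarrow{ac} = (0, -1)$.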

Yet another way of denoting vectors is by a single letter, either with a right arrow overhead or in bold font. For example, in the figure above, the vector $\overrightarrow{ab}$ or $\overrightarrow{cd}$ is also named using $\vec{v}$ or as boldfont $\mathbf{v}$.

Example 235. So altogether, I can write the vector $\overrightarrow{ab}$ in five different ways:

$\overrightarrow{ab} = \vec{v} = \mathbf{v} = (4, -3) = \dbinom{4}{-3}$.

Of $\vec{v}$ and $\mathbf{v}$, the bold font $\mathbf{v}$ is preferred in print publications. In handwriting, we instead use $\vec{v}$ (because writing in bold font is hard).

Exercise 102. Using a, b, c, or d from the above figure as the tail and a distinct point as the head, there are 12 possible vectors. We've already written out 4 of these in the last two examples. Write out the other 8 in ordered set notation. (Answer on p. 1077.)

The position vector of a point a is simply the vector from the origin o = (0, 0) to the point a. Formally:

Definition 62. Given a point $a = (a_1, a_2)$, its position vector is the vector $\mathbf{a} = (a_1, a_2)$.

The position vector of the point a carries us from the origin o to the point a and so it can also be denoted $\overrightarrow{oa}$. Take care not to confuse the point $a = (a_1, a_2)$ with the vector $\mathbf{a} = (a_1, a_2)$; they are different objects!

Informally, the zero vector is the vector that carries us nowhere. Formally:

Definition 63. The zero vector is the vector (0, 0) and can be denoted $\mathbf{0}$ or $\vec{0}$.

27.1

(1) Point + Point = Undefined,

(2) Point − Point = Vector,

(3) Point + Vector = Point,

(4) Point − Vector = Point.

If a and b are points, then there is no such thing as a + b.34

The analogy is to points in the real world it makes no sense to talk about the sum of

two locations:

Example 236. Consider the points Paris and Tokyo. The sum Paris + Tokyo = ?? is

undefined. It makes no sense to talk about the sum of two locations.

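[Figure: the points p and p + v joined by the vector v; the points a and b joined by the vector b − a; the points q − u and q joined by the vector u.]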

Definition 64. Given two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$, their difference $b - a$ is defined to be the vector from a to b, i.e., $b - a = (b_1 - a_1, b_2 - a_2)$.

Example 237. Paris − Tokyo = The journey that carries us from Tokyo to Paris. We might write Paris − Tokyo = (−9000 km, 1000 km), meaning that to get from Tokyo to Paris, we must travel 9,000 km west and 1,000 km north.

It makes sense to talk about the distance of the journey from Tokyo to Paris. Shortly, we'll see that it similarly makes sense to talk about the length of the vector from a to b.

Example 238. (See figure on p. 265.) Given the points $a = (-1, 2)$ and $b = (3, -1)$, their difference $b - a$ is the vector from a to b, i.e., $b - a = (3 - (-1), -1 - 2) = (4, -3)$.

Definition 65. Given the point p = (p1 , p2 ) and the vector v = (v1 , v2 ), their sum p + v is

defined to be the point p + v = (p1 + v1 , p2 + v2 ).

Geometrically, if the vector v has tail p, then it also has head p + v.

Example 239. Tokyo + (−9000 km, 1000 km) = Paris. This says that starting from Tokyo, if we embark on a journey that carries us 9,000 km west and 1,000 km north, then we'll end up in Paris.

Example 240. (See figure on p. 265.) Consider the vector $(4, -3)$. If its tail is $a = (-1, 2)$, then its head is $(-1, 2) + (4, -3) = (3, -1) = b$. And if its tail is $c = (-1, 1)$, then its head is $(-1, 1) + (4, -3) = (3, -2) = d$.

Definition 66. Given the point $q = (q_1, q_2)$ and the vector $\mathbf{u} = (u_1, u_2)$, their difference $q - \mathbf{u}$ is defined to be the point $q - \mathbf{u} = (q_1 - u_1, q_2 - u_2)$.

Geometrically, if the vector $\mathbf{u}$ has head q, then it also has tail $q - \mathbf{u}$.

Example 241. Paris − (−9000 km, 1000 km) = Tokyo. This says that starting from Paris, if we embark on a journey that is the exact opposite of going 9,000 km west and 1,000 km north (equivalently, we embark on a journey that goes 9,000 km east and 1,000 km south), then we'll end up in Tokyo.

Example 242. (See figure on p. 265.) Consider again the vector $(4, -3)$. If its head is $b = (3, -1)$, then its tail is $(3, -1) - (4, -3) = (-1, 2) = a$. And if its head is $d = (3, -2)$, then its tail is $(3, -2) - (4, -3) = (-1, 1) = c$.
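The three meaningful operations between points and vectors translate directly into code. In this sketch, points and vectors are both stored as plain pairs, the function names are invented for illustration, and the coordinates of a, b, c, d are assumed to be $(-1, 2)$, $(3, -1)$, $(-1, 1)$, $(3, -2)$.

```python
def point_minus_point(b, a):
    """Point - Point = Vector: the vector that carries us from a to b."""
    return (b[0] - a[0], b[1] - a[1])

def point_plus_vector(p, v):
    """Point + Vector = Point: the head of v when its tail is p."""
    return (p[0] + v[0], p[1] + v[1])

def point_minus_vector(q, u):
    """Point - Vector = Point: the tail of u when its head is q."""
    return (q[0] - u[0], q[1] - u[1])

# Assumed coordinates of the four points.
a, b, c, d = (-1, 2), (3, -1), (-1, 1), (3, -2)

v = point_minus_point(b, a)
print(v)                          # (4, -3)
print(point_plus_vector(c, v))    # (3, -2), which is the point d
print(point_minus_vector(b, v))   # (-1, 2), which is the point a
```

Note that there is deliberately no `point_plus_point` function: the sum of two points is left undefined.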

Exercise 103. Consider the vector (4, 3). (a) If it has tail (0, 0), then what is its head?

(b) If it has head (0, 0), then what is its tail? (c) If it has tail (5, 2), then what is its head?

(d) If it has head (5, 2), then what is its tail? (Answer on p. 1077.)


27.2

(1) Vector + Vector = Vector,

(2) −Vector = Vector, (additive inverse)

(3) Vector − Vector = Vector.

Definition 67. If u = (u1 , u2 ) and v = (v1 , v2 ) are vectors, then their sum, denoted u + v,

is the vector defined by u + v = (u1 + v1 , u2 + v2 ).

Geometrically, if the tail of v is the head of u, then u + v is the vector from the tail of u

to the head of v.

[Figure: the vectors u and v placed head to tail, together with their sum u + v.]

ac.


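Definition 68. The additive inverse of the vector $\mathbf{v} = (v_1, v_2)$, denoted $-\mathbf{v}$, is the vector defined by $-\mathbf{v} = (-v_1, -v_2)$.

Geometrically, if the vector $\mathbf{v}$ is from point a to point b, then $-\mathbf{v}$ is the vector from point b to point a. And so informally, the additive inverse is simply the same vector but flipped in the opposite direction.

Definition 69. Given two vectors $\mathbf{u}$ and $\mathbf{v}$, their difference, denoted $\mathbf{u} - \mathbf{v}$, is defined to be the sum of the vectors $\mathbf{u}$ and $-\mathbf{v}$. Or equivalently, if $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$, then $\mathbf{u} - \mathbf{v}$ is the vector defined by $\mathbf{u} - \mathbf{v} = (u_1 - v_1, u_2 - v_2)$.

Geometrically, if we place the heads of $\mathbf{u}$ and $\mathbf{v}$ at the same point, then $\mathbf{u} - \mathbf{v}$ is the vector from the tail of $\mathbf{u}$ to the tail of $\mathbf{v}$.

[Figure: the vectors u and v with their heads at the same point, together with their difference u − v.]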

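In the previous section, we learnt that by definition, the vector $\overrightarrow{pq}$ can be written as the difference of two points: $\overrightarrow{pq} = q - p$. Now, we'll prove that $\overrightarrow{pq}$ can also be written as the difference of two vectors: $\overrightarrow{pq} = \mathbf{q} - \mathbf{p}$.

Fact 19. Let p and q be two points with position vectors $\mathbf{p}$ and $\mathbf{q}$. Then $\overrightarrow{pq} = \mathbf{q} - \mathbf{p}$.

Proof. $\mathbf{q} - \mathbf{p} = \mathbf{q} + (-\mathbf{p}) = \overrightarrow{oq} + (-\overrightarrow{op}) = \overrightarrow{oq} + \overrightarrow{po} = \overrightarrow{po} + \overrightarrow{oq}$. This is thus the vector that carries us first from p to o, then from o to q; in short, it carries us from p to q. So it is simply the vector $\overrightarrow{pq}$.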

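Interpreting $\mathbf{u} - \mathbf{v}$ as the sum of the vectors $\mathbf{u}$ and $-\mathbf{v}$ is often convenient:

Example 249. (See figure on p. 265.) Without any numbers, we can compute: $\overrightarrow{ab} - \overrightarrow{cb} = \overrightarrow{ab} + (-\overrightarrow{cb}) = \overrightarrow{ab} + \overrightarrow{bc} = \overrightarrow{ac}$.

With numbers, we get the same result: $\overrightarrow{ab} - \overrightarrow{cb} = (4, -3) - (4, -2) = (0, -1) = \overrightarrow{ac}$.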

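Exercise 104. Write down what $\overrightarrow{ac} + \overrightarrow{cb}$, $\overrightarrow{dc} + \overrightarrow{ca}$, $\overrightarrow{bd} + \overrightarrow{da}$, $\overrightarrow{ad} - \overrightarrow{cd}$, $\overrightarrow{dc} - \overrightarrow{bd}$, and $\overrightarrow{bd} + \overrightarrow{db}$ are, without writing out any numbers. (Answer on p. 1077.)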

Exercise 105. Using the figure on p. 265, compute each of the following: $\overrightarrow{ac} - \overrightarrow{cb}$, $\overrightarrow{dc} - \overrightarrow{ca}$,

27.3

Displacement Vectors

Definition 70. If a moving particle starts at point a and ends at point b, we call $\overrightarrow{ab}$ its displacement vector.

Example 251. A particle is travelling along the red arc, along the path shown. Its starting point is in blue and its ending point is in purple. Its displacement vector is thus (2, 2).

[Figure: the path of the particle from its starting point to its ending point, together with the displacement vector (2, 2).]

27.4

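Pythagorean Theorem. If c is the length of the hypotenuse of a right triangle and a, b are the lengths of the other two sides, then $a^2 + b^2 = c^2$.

[Figure: a right triangle with hypotenuse c and other sides a and b.]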

As you learnt in secondary school, we can calculate the distance between two points using the Pythagorean Theorem:

Example 252. Let $p = (1, 1)$ and $q = (-1, -1)$ be two points. Then the distance between p and q is $\sqrt{[1 - (-1)]^2 + [1 - (-1)]^2} = \sqrt{4 + 4} = \sqrt{8}$.

[Figure: the points p and q on the cartesian plane, with horizontal separation 1 − (−1) and vertical separation 1 − (−1).]

The vector $\mathbf{v} = (v_1, v_2)$ goes $v_1$ units right and $v_2$ units up. We are thus motivated to define its length (or magnitude) as:

Definition 71. The length (or magnitude) of a vector $\mathbf{v} = (v_1, v_2)$ is denoted $|\mathbf{v}|$ and defined by $|\mathbf{v}| = \sqrt{v_1^2 + v_2^2}$.

Example 252 (continued). Another way to find the distance between p and q is to first find the vector that carries us from p to q. This is $\overrightarrow{pq} = (-2, -2)$. The distance between p and q is thus simply the length (or magnitude) of this vector: $|\overrightarrow{pq}| = \sqrt{(-2)^2 + (-2)^2} = \sqrt{8}$.

Of course, the distance from p to q is the same as the distance from q to p. So we could just as well have calculated the length of $\overrightarrow{qp} = (2, 2)$ and gotten the same answer: $|\overrightarrow{qp}| = \sqrt{2^2 + 2^2} = \sqrt{8}$.
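Here is how the same distance calculation might look in code. This is a sketch only: `length` and `distance` are illustrative names, and `math.hypot` is the standard-library function that returns $\sqrt{x^2 + y^2}$.

```python
import math

def length(v):
    """Length (magnitude) of the 2D vector v = (v1, v2)."""
    return math.hypot(v[0], v[1])

def distance(p, q):
    """Distance between points p and q: the length of the vector from p to q."""
    return length((q[0] - p[0], q[1] - p[1]))

p, q = (1, 1), (-1, -1)
print(distance(p, q))                               # about 2.83
print(math.isclose(distance(p, q), math.sqrt(8)))   # True
print(math.isclose(distance(p, q), distance(q, p))) # True
```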

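Exercise 106. Using the figure on p. 265, compute each of the following: $|\overrightarrow{ac} - \overrightarrow{cb}|$, $|\overrightarrow{dc} - \overrightarrow{ca}|$, $|\overrightarrow{bd} - \overrightarrow{da}|$, $|\overrightarrow{ad} + \overrightarrow{cd}|$, $|\overrightarrow{dc} + \overrightarrow{bd}|$, and $|\overrightarrow{bd} - \overrightarrow{db}|$. Also, find the distance between (18, 4) and (1, 2). (Answer on p. 1077.)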

Exercise 107. In general, given any two vectors $\mathbf{u}$ and $\mathbf{v}$, is it true that $|\mathbf{u} + \mathbf{v}| = |\mathbf{u}| + |\mathbf{v}|$? (Answer on p. 1077.)

27.5

A scalar is often contrasted with a vector. A vector has both magnitude (or length) and

direction. In contrast, a scalar has magnitude but no direction.

Definition 73. If $\mathbf{v} = (v_1, v_2)$ is a vector and $c \in \mathbb{R}$ is a scalar, then $c\mathbf{v}$ denotes the vector defined by $c\mathbf{v} = (cv_1, cv_2)$. We call this operation scalar multiplication of a vector.

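Graphically, $c\mathbf{v}$ is simply the vector that has the same direction as $\mathbf{v}$ (or the opposite direction, if c is negative), but with $|c|$ times the length. This is formally shown in the next fact.

[Figure: a vector v and its scalar multiple cv.]

Fact 20. If $\mathbf{v}$ is a vector and $c \in \mathbb{R}$ is a scalar, then $|c\mathbf{v}| = |c||\mathbf{v}|$.

Proof.

$|c\mathbf{v}| = |(cv_1, cv_2)| = \sqrt{(cv_1)^2 + (cv_2)^2}$

$= \sqrt{c^2 v_1^2 + c^2 v_2^2} = |c|\sqrt{v_1^2 + v_2^2}$

$= |c||(v_1, v_2)| = |c||\mathbf{v}|$.

Exercise 108. Write $2\overrightarrow{ab}$, $3\overrightarrow{ac}$, and $4\overrightarrow{ad}$ in ordered set notation. Verify that $|2\overrightarrow{ab}| = 2|\overrightarrow{ab}|$, $|3\overrightarrow{ac}| = 3|\overrightarrow{ac}|$, and $|4\overrightarrow{ad}| = 4|\overrightarrow{ad}|$. (Answer on p. 1078.)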
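The relationship between scalar multiplication and length can be observed numerically. The sketch below uses the vector $(4, -3)$ (length 5) and a handful of arbitrary scalars, including a negative one and zero; `scale` and `length` are illustrative helper names.

```python
import math

def scale(c, v):
    """Scalar multiplication: c * (v1, v2) = (c*v1, c*v2)."""
    return (c * v[0], c * v[1])

def length(v):
    return math.hypot(v[0], v[1])

v = (4, -3)
for c in (2, -2, 0.5, 0):
    # The last two columns should agree: |cv| and |c||v|.
    print(c, scale(c, v), length(scale(c, v)), abs(c) * length(v))
```

The row for c = -2 shows why the absolute value is needed: the vector flips direction, but its length is still 2 times the original, not -2 times.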

27.6

Unit Vectors

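Example 253. Let's verify that the vectors (1, 0), (0, 1), and $\left(\dfrac{\sqrt{2}}{2}, \dfrac{\sqrt{2}}{2}\right)$ are all unit vectors:

$|(1, 0)| = \sqrt{1^2 + 0^2} = 1$,

$|(0, 1)| = \sqrt{0^2 + 1^2} = 1$,

$\left|\left(\dfrac{\sqrt{2}}{2}, \dfrac{\sqrt{2}}{2}\right)\right| = \sqrt{\left(\dfrac{\sqrt{2}}{2}\right)^2 + \left(\dfrac{\sqrt{2}}{2}\right)^2} = \sqrt{2/4 + 2/4} = 1$.

Example 254. Let's verify that the vectors (1, 1) and $(-1, -1)$ are not unit vectors:

$|(1, 1)| = \sqrt{1^2 + 1^2} = \sqrt{2} \neq 1$,

$|(-1, -1)| = \sqrt{(-1)^2 + (-1)^2} = \sqrt{2} \neq 1$.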

We specially reserve the name $\mathbf{i}$ (or $\vec{i}$) for the unit vector (1, 0), which is the unit vector that is purely in the direction of the x-axis. Similarly, we specially reserve the name $\mathbf{j}$ (or $\vec{j}$) for the unit vector (0, 1), which is the unit vector that is purely in the direction of the y-axis.

And so, using also what we learnt about the sum of and scalar multiplication of vectors, we can rewrite any vector into the sum of $\mathbf{i}$'s and $\mathbf{j}$'s:

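Example 255. The position vectors for the points a, b, and c (illustrated below) are $\mathbf{a} = (1, 2) = \mathbf{i} + 2\mathbf{j}$, $\mathbf{b} = (4, -3) = 4\mathbf{i} - 3\mathbf{j}$, and $\mathbf{c} = (0, 6) = 6\mathbf{j}$.

[Figure: the points a, b, and c on the cartesian plane, each reached from the origin by a chain of unit vectors i, j, and −j.]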

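Informally, the unit vector in the direction $\mathbf{v}$, denoted $\hat{\mathbf{v}}$, is the vector that points in the same direction as $\mathbf{v}$, but has length 1. Formally:

Definition 75. Given a non-zero vector $\mathbf{v}$, the unit vector in the direction $\mathbf{v}$ is $\hat{\mathbf{v}} = \dfrac{1}{|\mathbf{v}|}\mathbf{v}$.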

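Exercise 109. In the figure on p. 265, what are the unit vectors in the directions $\overrightarrow{ab}$, $\overrightarrow{ac}$, and $\overrightarrow{ad}$? What are the unit vectors in the directions $2\overrightarrow{ab}$, $3\overrightarrow{ac}$, and $4\overrightarrow{ad}$? (Answer on p. 1078.)

Fact 21. If c is a scalar and $\hat{\mathbf{v}}$ is a unit vector, then the vector $c\hat{\mathbf{v}}$ has length $|c|$.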

Informally, two vectors have the same unit vector $\iff$ they both point in the same direction. Formally:

Fact 22. Let $\mathbf{a}$ and $\mathbf{b}$ be any two non-zero vectors. Then $\hat{\mathbf{a}} = \hat{\mathbf{b}} \iff \mathbf{a}$ can be written as a positive scalar multiple of $\mathbf{b}$.

Informally, any vector in the plane can be written as the linear combination of any other two vectors. Formally:

Fact 23. Let $\mathbf{a}$ and $\mathbf{b}$ be any two non-zero vectors in the same plane that are not parallel (i.e. $\hat{\mathbf{a}} \neq \pm\hat{\mathbf{b}}$). Then every vector in the same plane can be written as $\lambda\mathbf{a} + \mu\mathbf{b}$ for some $\lambda, \mu \in \mathbb{R}$.

Proof. Optional, see p. 930 in the Appendices.

See TYS Exercise 338 (i) for an application of the above fact.

Exercise 110. Given the vectors $\mathbf{a} = (1, 3)$ and $\mathbf{b} = (7, 5)$, show that each of the following vectors can be written in the form $\lambda\mathbf{a} + \mu\mathbf{b}$ for some $\lambda, \mu \in \mathbb{R}$. (i) (0, 1). (ii) (1, 0). (iii) (1, 1). (Answer on p. 1078.)
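Finding the two coefficients of such a linear combination amounts to solving two simultaneous linear equations, one for each coordinate. One possible sketch, using Cramer's rule and exact fractions, is shown below. The function name `linear_combination`, the symbols lam and mu, and the vectors $(2, 1)$, $(1, 3)$, $(4, 7)$ are all chosen purely for illustration.

```python
from fractions import Fraction

def linear_combination(a, b, t):
    """Return (lam, mu) such that lam*a + mu*b == t, for non-parallel 2D vectors a, b.

    Solves  lam*a1 + mu*b1 = t1  and  lam*a2 + mu*b2 = t2  by Cramer's rule.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are parallel, so they do not span the plane")
    lam = Fraction(t[0] * b[1] - t[1] * b[0], det)
    mu = Fraction(a[0] * t[1] - a[1] * t[0], det)
    return lam, mu

a, b, t = (2, 1), (1, 3), (4, 7)
lam, mu = linear_combination(a, b, t)
print(lam, mu)  # 1 2

# Check: lam*a + mu*b reproduces t.
print((lam * a[0] + mu * b[0], lam * a[1] + mu * b[1]) == t)  # True
```

The determinant is zero exactly when the two vectors are parallel, in which case no unique pair of coefficients exists.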

27.7

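[Figure: the points a and b and the origin o.]

Theorem 3. Ratio Theorem. Let a, b, and p be points, where p is on the line segment ab. Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{p}$ be the corresponding position vectors. Then

$\mathbf{p} = \dfrac{|\overrightarrow{bp}|}{|\overrightarrow{ap}| + |\overrightarrow{bp}|}\mathbf{a} + \dfrac{|\overrightarrow{ap}|}{|\overrightarrow{ap}| + |\overrightarrow{bp}|}\mathbf{b}$.

Proof. Optional, see p. 932 (Appendices).

Or if we let $\lambda = |\overrightarrow{ap}|$ and $\mu = |\overrightarrow{bp}|$, then the above can be rewritten in a form that is perhaps easier to remember:

$\mathbf{p} = \dfrac{\mu}{\lambda + \mu}\mathbf{a} + \dfrac{\lambda}{\lambda + \mu}\mathbf{b} = \dfrac{\mu\mathbf{a} + \lambda\mathbf{b}}{\lambda + \mu}$.

The point dividing AB in the ratio $\lambda : \mu$ has position vector $\dfrac{\mu\mathbf{a} + \lambda\mathbf{b}}{\lambda + \mu}$.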

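Example 256. Consider the points $a = (3, 4)$ and $b = (-1, 2)$. Find the point p that divides the line segment ab into the ratio 3 : 2.

We have $\mathbf{p} = \dfrac{2}{5}\mathbf{a} + \dfrac{3}{5}\mathbf{b} = \dfrac{2}{5}(3, 4) + \dfrac{3}{5}(-1, 2) = \left(\dfrac{3}{5}, \dfrac{14}{5}\right)$. Hence, the point is $p = \left(\dfrac{3}{5}, \dfrac{14}{5}\right)$.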

Example 257. Consider the points $a = (8, 3)$ and $b = (2, -6)$. Find the point p that divides the line segment ab into the ratio 3 : 7.

We have $\mathbf{p} = 0.7\mathbf{a} + 0.3\mathbf{b} = 0.7(8, 3) + 0.3(2, -6) = (6.2, 0.3)$. Hence, the point is $p = (6.2, 0.3)$.

Exercise 111. (a) Consider the points a = (1, 2) and b = (3, 4). Find the point p that

divides the line segment ab into the ratio 5 6. (b) Consider the points a = (1, 4) and

b = (2, 3). Find the point p that divides the line segment ab into the ratio 5 1. (c)

Consider the points a = (1, 2) and b = (3, 4). Find the point p that divides the line

segment ab into the ratio 2 3. (Answer on p. 1078.)


28

Scalar Product

Definition 76. Given two 2D vectors $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$, their scalar product (or dot product), denoted $\mathbf{u} \cdot \mathbf{v}$, is defined by $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2$.

And so to get the scalar product, simply multiply each component of one vector by the corresponding component of the other, then add these up. It's that simple!

The scalar product is itself simply a scalar (i.e. a real number). Hence the name.

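Example 258. $(5, -3) \cdot (2, 1) = 5 \cdot 2 + (-3) \cdot 1 = 7$.

Example 259. $(0, 17) \cdot (-1, 3) = 0 \cdot (-1) + 17 \cdot 3 = 51$.

Ordinary multiplication is distributive:

Example 260. $3 \times (5 + 11) = 3 \times 5 + 3 \times 11$ and $18 \times (7 - 31) = 18 \times 7 - 18 \times 31$.

It turns out that the scalar product is likewise distributive:

Fact 24. Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be vectors. Then $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$ and $(\mathbf{a} + \mathbf{b}) \cdot \mathbf{c} = \mathbf{a} \cdot \mathbf{c} + \mathbf{b} \cdot \mathbf{c}$.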

Here is one use of the scalar product: the length of a vector is simply the square root of its

scalar product with itself. Formally:

Fact 25. Given a vector $\mathbf{v}$, $|\mathbf{v}| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$.
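As a quick illustration, the scalar product and the length formula of Fact 25 can be sketched in Python. The helper names `dot` and `length` and the sample vectors are assumptions made for this sketch.

```python
import math

def dot(u, v):
    """Scalar product: sum of the products of corresponding components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def length(v):
    """Length of v, computed as the square root of v . v (Fact 25)."""
    return math.sqrt(dot(v, v))

print(dot((5, -3), (2, 1)))  # 7
print(length((3, 4)))        # 5.0
```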

Next up is a more important use of the scalar product:


28.1

Fact 26. Let $\theta \in [0, \pi]$ be the angle between two non-zero vectors $\mathbf{u}$ and $\mathbf{v}$. Then

$$\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta.$$

The above fact (see footnote 35) gives us a very convenient way to calculate the angle between two vectors, because rearranging, we have:

$$\theta = \cos^{-1}\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}\right).$$

Footnote 35. We have two possible interpretations of the scalar product that are entirely equivalent. We can use either of these interpretations as our definition and then prove that the other interpretation is true.

(1) In this textbook, we first define the scalar product by $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2$, then prove that $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$. That is, we start with the algebraic definition, then prove a geometric property.

(2) In contrast, others may prefer to first define the scalar product by $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$, then prove that $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2$. That is, we start with the geometric definition, then prove an algebraic property.

Either way, we first define the scalar product one way or the other. We then prove that the alternative statement is equivalent.

(It is possible that your JC teachers take the second approach, rather than the first, as is done in this textbook. Or worse, your teachers simply leave you confused as to why the hell $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$ and at the same time, magically enough, $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2$. This was my experience as a JC student a number of years ago. If this is also your current experience, hopefully this textbook has helped to clear things up!)


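Example 261. The vector i = (1, 0) points east. The vector (1, 1) points northeast. We know the angle between these two vectors is $\frac{\pi}{4}$. Let's check and verify that the formula works:

$$\theta = \cos^{-1}\left(\frac{\mathbf{i} \cdot (1, 1)}{|\mathbf{i}||(1, 1)|}\right) = \cos^{-1}\left(\frac{(1, 0) \cdot (1, 1)}{|(1, 0)||(1, 1)|}\right) = \cos^{-1}\left(\frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{1^2 + 0^2}\sqrt{1^2 + 1^2}}\right) = \cos^{-1}\left(\frac{1 + 0}{1 \cdot \sqrt{2}}\right) = \cos^{-1}\left(\frac{1}{\sqrt{2}}\right) = \frac{\pi}{4}.$$

Example 262. The vector i = (1, 0) points east. The vector j = (0, 1) points north. We know the angle between these two vectors is right (i.e. $\frac{\pi}{2}$). Let's check and verify that the formula works:

$$\theta = \cos^{-1}\left(\frac{\mathbf{i} \cdot \mathbf{j}}{|\mathbf{i}||\mathbf{j}|}\right) = \cos^{-1}\left(\frac{(1, 0) \cdot (0, 1)}{|(1, 0)||(0, 1)|}\right) = \cos^{-1}\left(\frac{1 \cdot 0 + 0 \cdot 1}{\sqrt{1^2 + 0^2}\sqrt{0^2 + 1^2}}\right) = \cos^{-1}\left(\frac{0 + 0}{1 \cdot 1}\right) = \cos^{-1} 0 = \frac{\pi}{2}.$$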


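Example 263. The angle between the vectors (3, 2) and (−1, −4) is

$$\theta = \cos^{-1}\left(\frac{(3, 2) \cdot (-1, -4)}{|(3, 2)||(-1, -4)|}\right) = \cos^{-1}\left(\frac{3 \cdot (-1) + 2 \cdot (-4)}{\sqrt{3^2 + 2^2}\sqrt{(-1)^2 + (-4)^2}}\right) = \cos^{-1}\left(\frac{-3 - 8}{\sqrt{13}\sqrt{17}}\right) = \cos^{-1}\left(\frac{-11}{\sqrt{221}}\right) \approx 2.404.$$

[Figure: the vectors (3, 2) and (−1, −4) in the xy-plane, with the angle of 2.404 rad between them.]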
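The angle formula lends itself to a short numerical check. Here is a minimal Python sketch; the function name `angle_between` is an assumption, and the clamping step is a floating-point safeguard added for this sketch rather than part of the formula itself.

```python
import math

def angle_between(u, v):
    """Angle in [0, pi] between non-zero vectors u and v, rearranged from Fact 26."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    # Clamp to [-1, 1] so that rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)

print(round(angle_between((3, 2), (-1, -4)), 3))              # 2.404
print(math.isclose(angle_between((1, 0), (1, 1)), math.pi / 4))  # True
```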


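Recall that the arccosine function is defined to have range $[0, \pi]$. That is, $\cos^{-1} x \in [0, \pi]$. Moreover,

$$\cos^{-1} x \in \left[0, \tfrac{\pi}{2}\right) \iff x > 0, \qquad \cos^{-1} x = \tfrac{\pi}{2} \iff x = 0, \qquad \cos^{-1} x \in \left(\tfrac{\pi}{2}, \pi\right] \iff x < 0.$$

These three observations, together with Fact 26, imply the following Fact, which by the way was already illustrated by the previous three examples:

Fact 27. Let $\mathbf{u}$ and $\mathbf{v}$ be non-zero vectors. The angle between $\mathbf{u}$ and $\mathbf{v}$ is

(i) acute (or zero) if $\mathbf{u} \cdot \mathbf{v} > 0$;

(ii) right if $\mathbf{u} \cdot \mathbf{v} = 0$; and

(iii) obtuse (or straight) if $\mathbf{u} \cdot \mathbf{v} < 0$.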

We'll use the words perpendicular, orthogonal, and normal interchangeably:

Definition 77. Two vectors are orthogonal (or perpendicular or normal) if the angle between them is right (i.e. equal to $\frac{\pi}{2}$).

I will sometimes write $\mathbf{u} \perp \mathbf{v}$ to mean u is orthogonal (or perpendicular or normal) to v.

Exercise 112. First write down the angle between each of the following pairs of vectors without using the above formula. Then verify that the formula does indeed give you these correct angles: (a) (2, 0) and (0, 17); (b) (5, 0) and (3, 0); (c) i and $(1, \sqrt{3}/3)$; (d) i

Exercise 113. Verify that i and j are orthogonal, by computing their scalar product.

(Answer on p. 1081.)


28.2

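The scalar product also gives a convenient way of computing the length of the projection of one vector on another.

Say we have a right triangle (left diagram) where the angle $\theta$ and the length $a$ are known. What is the length $b$? It is simply $a \cos\theta$.

Now suppose $\mathbf{a}$ (blue) and $\mathbf{b}$ (green) are vectors (right diagram). The projection of the vector $\mathbf{a}$ on the vector $\mathbf{b}$ is denoted $\mathbf{a}_{\mathbf{b}}$ (red). Note that $\mathbf{a}_{\mathbf{b}}$ is itself a vector.

What is the length of the projection? Well, if $|\mathbf{a}|$ is the length of the vector $\mathbf{a}$ and $\theta$ is the angle between the two vectors, then the length of the projection $|\mathbf{a}_{\mathbf{b}}|$ is simply $|\mathbf{a}| \cos\theta$.

Nicely enough, we actually have a quick alternative method of computing this length. Let $\hat{\mathbf{b}}$ be the unit vector for $\mathbf{b}$. Then

$$\mathbf{a} \cdot \hat{\mathbf{b}} = |\mathbf{a}||\hat{\mathbf{b}}| \cos\theta = |\mathbf{a}| \cdot 1 \cdot \cos\theta = |\mathbf{a}| \cos\theta = |\mathbf{a}_{\mathbf{b}}|.$$

So we have a nice interpretation for $\mathbf{a} \cdot \hat{\mathbf{b}}$, or more correctly $|\mathbf{a} \cdot \hat{\mathbf{b}}|$, since $\mathbf{a} \cdot \hat{\mathbf{b}}$ may sometimes be negative:

$|\mathbf{a} \cdot \hat{\mathbf{b}}|$ is the length of the projection of $\mathbf{a}$ on $\mathbf{b}$.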


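Example 264. The length of the projection of (3, 2) on (1, 1) is

$$\left|(3, 2) \cdot \widehat{(1, 1)}\right| = \left|(3, 2) \cdot \left[\frac{1}{|(1, 1)|}(1, 1)\right]\right| = \left|(3, 2) \cdot \frac{1}{\sqrt{2}}(1, 1)\right| = \frac{1}{\sqrt{2}}(3 \cdot 1 + 2 \cdot 1) = \frac{5}{\sqrt{2}} = \frac{5\sqrt{2}}{2}.$$

You should verify for yourself that the length of the projection of (3, 2) on (1000, 1000) is also $\frac{5\sqrt{2}}{2}$. The length of the vector to be projected, (3, 2), matters, but the length of the vector onto which it is projected, be it (1, 1) or (1000, 1000), doesn't matter.

Example 265. The length of the projection of (6, 1) on (2, 0) is

$$\left|(6, 1) \cdot \widehat{(2, 0)}\right| = \left|(6, 1) \cdot \left[\frac{1}{|(2, 0)|}(2, 0)\right]\right| = \left|(6, 1) \cdot \frac{1}{2}(2, 0)\right| = \frac{1}{2}(6 \cdot 2 + 1 \cdot 0) = \frac{12}{2} = 6.$$

Again, you can verify for yourself that the length of the projection of (6, 1) on (50000, 0) is also 6. Again, the length of the vector to be projected, (6, 1), matters, but the length of the vector onto which it is projected, be it (2, 0) or (50000, 0), doesn't matter.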
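The claim that only the direction of the second vector matters is easy to test numerically. Below is a minimal Python sketch; the function name `projection_length` is an assumption made for illustration.

```python
import math

def projection_length(a, b):
    """Length of the projection of a on b, computed as |a . b_hat|."""
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    b_hat = tuple(bi / norm_b for bi in b)  # unit vector in the direction of b
    return abs(sum(ai * bi for ai, bi in zip(a, b_hat)))

# Scaling b leaves the projection length unchanged.
print(projection_length((6, 1), (2, 0)))      # 6.0
print(projection_length((6, 1), (50000, 0)))  # 6.0
```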

Exercise 114. What are the lengths of the projections of (a) (1, 0) on (33, 33) and (b)

(33, 33) on (1, 0)? (Answer on p. 1081.)


28.3

Direction Cosines

The angle between a vector v and the x-axis is simply the angle between v and i = (1, 0).

Similarly, the angle between v and the y-axis is simply the angle between v and j = (0, 1).

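Example 266. Consider the angle $a$ between the vector (3, 2) and the x-axis. We have:

$$\alpha = \cos a = \frac{(3, 2) \cdot (1, 0)}{|(3, 2)||(1, 0)|} = \frac{3 \cdot 1 + 2 \cdot 0}{\sqrt{3^2 + 2^2}\sqrt{1^2 + 0^2}} = \frac{3}{\sqrt{13}}.$$

Since $a = \cos^{-1}\alpha = \cos^{-1}(3/\sqrt{13}) \approx 0.588$, we find that the angle $a$ between the vector (3, 2) and the x-axis is 0.588.

[Figure: the vectors (3, 2) and (−1, −4) in the xy-plane, with the angles 0.983 rad and 2.404 rad marked.]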


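Example 267. Consider the angle $b$ between the vector (3, 2) and the y-axis. We have:

$$\beta = \cos b = \frac{(3, 2) \cdot (0, 1)}{|(3, 2)||(0, 1)|} = \frac{3 \cdot 0 + 2 \cdot 1}{\sqrt{3^2 + 2^2}\sqrt{0^2 + 1^2}} = \frac{2}{\sqrt{13} \cdot 1} = \frac{2}{\sqrt{13}}.$$

Since $b = \cos^{-1}\beta = \cos^{-1}(2/\sqrt{13}) \approx 0.983$, we find that the angle $b$ between the vector (3, 2) and the y-axis is 0.983.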

Definition 78. Given a vector $\mathbf{v}$, its x-direction cosine is simply the length of the projection of $\hat{\mathbf{v}}$ on the x-axis.

Similarly, its y-direction cosine is simply the length of the projection of $\hat{\mathbf{v}}$ on the y-axis.

The next Fact is immediate from the above definition:

Fact 28. Let $\mathbf{v}$ be a vector and $\alpha$ and $\beta$ be its x- and y-direction cosines. Then $\hat{\mathbf{v}} = (\alpha, \beta)$.

Example 268. The x- and y-direction cosines of the vector (3, 2) are

$$\alpha = \frac{3}{\sqrt{13}} \quad \text{and} \quad \beta = \frac{2}{\sqrt{13}}.$$

Hence, the unit vector in the direction (3, 2) is $\left(\frac{3}{\sqrt{13}}, \frac{2}{\sqrt{13}}\right)$.
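Since the direction cosines are just the components of the unit vector, they can be computed in one step. A minimal Python sketch follows; the function name `direction_cosines` is an assumption for illustration.

```python
import math

def direction_cosines(v):
    """Return v divided by its length; each entry is the cosine of the angle with an axis."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return tuple(vi / norm for vi in v)

alpha, beta = direction_cosines((3, 2))
print(round(alpha, 3), round(beta, 3))  # 0.832 0.555
# Taking the arccosine recovers the angle with the x-axis.
print(round(math.acos(alpha), 3))       # 0.588
```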

Exercise 115. For each of the following vectors, find their x- and y-direction cosines.

Hence write down their unit vectors. (a) (1, 3). (b) (4, 2). (c) (1, 2). (Answer on p.

1081.)


29

Vectors in 3D

In two dimensions, we had the cartesian (or two-dimensional) plane with x- and y-axes.

Informally, the x-axis goes to the right and the y-axis goes up. A point was any ordered

pair of real numbers. The origin o = (0, 0) was the intersection point of the two axes. And

relative to the origin, the generic point a = (a1 , a2 ) was the point a1 units to the right and

a2 units up.

In three dimensions, we now instead have the three-dimensional space (3D space).

The x- and y-axes are as before. There is an additional z-axis that, informally, comes

out of the paper, perpendicular to the plane of the paper, straight towards your face.

We call this the right hand coordinate system, because if you take your right hand,

stick out your thumb, forefinger, and middle finger so that they are perpendicular, your

thumb represents the x-axis, your forefinger the y-axis, and your middle finger the z-axis.

(Try it!)

(If instead the z-axis goes into the paper, then we'd have a left hand coordinate system.

Can you explain why?)

[Figure: the x-, y-, and z-axes of 3D space, with the coordinates a1, a2, and a3 of a point marked.]

In the context of 3D space, a point is any ordered triple of real numbers. The origin

o = (0, 0, 0) is the point where the x-, y-, and z-axes intersect. And relative to the origin,

the generic point a = (a1 , a2 , a3 ) is the point a1 units to the right, a2 units up, and a3 units

out of the paper.

Everything we learnt about 2D vectors finds its analogy in three-dimensional (3D)

vectors. Most of the time, the analogy is obvious. Try these exercises.

Exercise 116. (Answer on p. 1082.) (a) Fill in the blanks. A 3D vector is an arrow

that has two characteristics: __________ and __________. Just like a

point, it can be described by an __________ of __________. The vector

a = (a1 , a2 , a3 ) carries us from the origin to _______________.

(b) What other ways are there to denote the vector a = (a1 , a2 , a3 )? (Hint. The unit vector

in the z-axis is now called k.)

(c) Let a = (a1, a2, a3) and b = (b1, b2, b3) be points. What are (i) $a + b$; (ii) $a + \vec{ob}$; (iii) $\vec{oa} + \vec{ob}$; and (iv) $\vec{oa} - \vec{ba}$?

The length (or magnitude) of a 2D vector $\mathbf{v} = (v_1, v_2)$ was defined by $\sqrt{v_1^2 + v_2^2}$. What then is the length (or magnitude) of a 3D vector? This is the one instance where the analogy from the 2D case to the 3D case is perhaps less than obvious. So let's explore this issue.

Consider the blue point a in the figure below. What is its distance from the origin (0, 0, 0)?

In other words, what is the length of the green dotted line?

[Figure: the point a in 3D space, with the coordinates a1, a2, and a3 marked along the axes.]

First let's calculate the distance of the red point from the origin, in other words the length of the red dotted line. By the Pythagorean Theorem, this is $\sqrt{a_2^2 + a_3^2}$.

Now, notice that the green dotted line, the red dotted line (length $\sqrt{a_2^2 + a_3^2}$), and the blue dotted line (length $a_1$) form a right-angled triangle, with the hypotenuse being the green dotted line. Thus, the length of the green dotted line is (again by the Pythagorean Theorem):

$$\sqrt{a_1^2 + \left(\sqrt{a_2^2 + a_3^2}\right)^2} = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$

We are thus motivated to define the length (or magnitude) of a 3D vector as follows:

Definition 79.

The length (or magnitude) of a vector $\mathbf{a} = (a_1, a_2, a_3)$ is denoted $|\mathbf{a}|$ and defined by $|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}$.
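Because the 2D and 3D definitions have the same shape, a single function can cover both. This is a minimal Python sketch; the function name `length` and the sample vectors are assumptions made for illustration.

```python
import math

def length(v):
    """Length of a vector of any dimension: square root of the sum of squared components."""
    return math.sqrt(sum(vi * vi for vi in v))

print(length((3, 4)))     # 5.0
print(length((2, 3, 6)))  # 7.0
```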

This is very much analogous to the definition of the length (or magnitude) of a 2D vector.

Exercise 117. (Answer on p. 1083.) (a) Compute the lengths of the vectors a = (1, 2, 3), b = (4, 5, 6), and a − b.

(b) Compute the lengths of the vectors 2a = (2, 4, 6), 3b = (12, 15, 18), and 4(a − b).

(c) Compute the unit vectors in the directions a = (1, 2, 3), b = (4, 5, 6), and a − b.

(d) Compute $(1, 2, 3) \cdot (4, 5, 6)$ and $(2, 4, 6) \cdot (1, 2, 3)$.

(e) Compute the angles (i) between the vectors a = (1, 2, 3) and b = (4, 5, 6); and (ii) between the vectors u = (2, 4, 6) and v = (1, 2, 3). (iii) Are the vectors (2, 4, 6) and (1, 2, 3) orthogonal?

(f) Compute the length of the projection of a = (1, 2, 3) on b = (4, 5, 6).

(g) Find the point that divides the line segment ab in the ratio 2 : 3.

(h) For each of the following vectors, find their x-, y-, and z-direction cosines. And then write down their unit vectors. (i) (1, 3, 2). (ii) (4, 2, 3). (iii) (1, 2, 4).

30 Vector Product

30.1 Vector Product in 2D

Recall that given two 2D vectors $\mathbf{u} = (u_x, u_y)$ and $\mathbf{v} = (v_x, v_y)$, their scalar product was the scalar defined by $\mathbf{u} \cdot \mathbf{v} = u_x v_x + u_y v_y$. We now define a very similar concept.

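Definition 80. Given two 2D vectors $\mathbf{u} = (u_x, u_y)$ and $\mathbf{v} = (v_x, v_y)$, their vector product (or cross product), denoted $\mathbf{u} \times \mathbf{v}$, is the scalar defined by

$$\mathbf{u} \times \mathbf{v} = u_x v_y - u_y v_x.$$

Example 270. If p = (3, 5) and q = (6, 1), then $\mathbf{p} \times \mathbf{q} = 3 \cdot 1 - 5 \cdot 6 = -27$.

Example 271. If u = (−1, 4) and v = (2, −3), then $\mathbf{u} \times \mathbf{v} = (-1) \cdot (-3) - 4 \cdot 2 = -5$.

Ordinary multiplication is commutative. This simply means that given any real numbers a, b, we have $a \times b = b \times a$. For example,

Example 272. $4 \times 7 = 7 \times 4$ and $3 \times 5 = 5 \times 3$.

In contrast, the vector product is not commutative because in general $\mathbf{u} \times \mathbf{v} \neq \mathbf{v} \times \mathbf{u}$. This might be the first time in your life that you're encountering a product that isn't commutative.

In fact, the vector product is anticommutative because $\mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u}$! For example,

Example 273. If u = (1, 2) and v = (3, 4), then $\mathbf{u} \times \mathbf{v} = 1 \cdot 4 - 2 \cdot 3 = -2$, but $\mathbf{v} \times \mathbf{u} = 2 \cdot 3 - 1 \cdot 4 = 2$.

Example 274. If u = (−1, 4) and v = (2, −3), then $\mathbf{u} \times \mathbf{v} = (-1) \cdot (-3) - 4 \cdot 2 = -5$, but $\mathbf{v} \times \mathbf{u} = 4 \cdot 2 - (-1) \cdot (-3) = 5$.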
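Anticommutativity is easy to see in code. The following is a minimal Python sketch of the 2D vector product; the function name `cross2d` is an assumption for illustration.

```python
def cross2d(u, v):
    """2D vector product: the scalar u_x * v_y - u_y * v_x (Definition 80)."""
    return u[0] * v[1] - u[1] * v[0]

u, v = (1, 2), (3, 4)
print(cross2d(u, v))  # -2
print(cross2d(v, u))  # 2
```

Swapping the arguments flips the sign, and taking the product of a vector with itself gives 0.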


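Recall that if $\theta \in [0, \pi]$ is the angle between two vectors, then based on our definition that $\mathbf{u} \cdot \mathbf{v} = u_x v_x + u_y v_y$, we could prove that $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$.

It turns out that based on our definition that $\mathbf{u} \times \mathbf{v} = u_x v_y - u_y v_x$, we can prove a very similar result (see footnote 36):

Fact 29. Let $\mathbf{u}$ and $\mathbf{v}$ be two non-zero 2D vectors and $\theta \in [0, \pi]$ be the angle between them. Then the scalar $\mathbf{u} \times \mathbf{v}$ is equal to either $|\mathbf{u}||\mathbf{v}| \sin\theta$ or $-|\mathbf{u}||\mathbf{v}| \sin\theta$.

Earlier we already had one formula for calculating the angle between two vectors. Let $\theta \in [0, \pi]$ be the angle between $\mathbf{u}$ and $\mathbf{v}$. Then

$$\theta = \cos^{-1}\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}\right).$$

The above Fact now gives us a second formula. Let $\theta \in [0, \frac{\pi}{2}]$ be the acute or right angle between $\mathbf{u}$ and $\mathbf{v}$. Then

$$\theta = \sin^{-1}\left(\frac{|\mathbf{u} \times \mathbf{v}|}{|\mathbf{u}||\mathbf{v}|}\right).$$

However, we'll stick with using only the first cosine formula. We won't use the second sine formula, mainly because, as we'll see, computing the vector product is very tedious, especially in the 3D case, where it is a different creature altogether.

Footnote 36. Footnote 35 explained that the scalar product could be defined in one of two equivalent ways. Similarly, the vector product can be defined in one of two equivalent ways. We can use either definition and then prove that the other is true.

(1) In this textbook, we first define the vector product by $\mathbf{u} \times \mathbf{v} = u_x v_y - u_y v_x$; we then prove that $|\mathbf{u} \times \mathbf{v}| = |\mathbf{u}||\mathbf{v}| \sin\theta$, where $\theta$ is the angle between the two vectors. That is, we start with the algebraic definition, then prove a geometric property.

The alternative approach is this:

(2) Define the vector product by $\mathbf{u} \times \mathbf{v} = |\mathbf{u}||\mathbf{v}| \sin\theta$ or $\mathbf{u} \times \mathbf{v} = -|\mathbf{u}||\mathbf{v}| \sin\theta$, with the sign depending on the orientation of $\mathbf{v}$ relative to $\mathbf{u}$; then prove that $\mathbf{u} \times \mathbf{v} = u_x v_y - u_y v_x$. That is, we start with the geometric definition, then prove an algebraic property.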


30.2

SYLLABUS ALERT

Calculation of the area of a triangle or parallelogram is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you're taking 9758.

The vector product is also helpful for computing the area of triangles and parallelograms.

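Fact 30. The triangle with sides of lengths $|\mathbf{u}|$, $|\mathbf{v}|$, and $|\mathbf{v} - \mathbf{u}|$ has area $0.5|\mathbf{u} \times \mathbf{v}|$.

[Figure: two triangles with sides u, v, and v − u. Case #1: acute or right angle θ, with height |v| sin θ. Case #2: obtuse angle θ, with height |v| sin(π − θ).]

Proof. Case #1. If the vectors $\mathbf{u}$ and $\mathbf{v}$ form an acute or right angle $\theta$, then the area of the triangle is simply 0.5 × Base × Height or $0.5|\mathbf{u}||\mathbf{v}| \sin\theta$. And by Fact 29, $0.5|\mathbf{u}||\mathbf{v}| \sin\theta = 0.5|\mathbf{u} \times \mathbf{v}|$.

Case #2. And if the vectors $\mathbf{u}$ and $\mathbf{v}$ form an obtuse angle $\theta$, then the area of the triangle is again simply 0.5 × Base × Height or $0.5|\mathbf{u}||\mathbf{v}| \sin(\pi - \theta)$. Recall that $\sin(\pi - \theta) = \sin\pi \cos\theta - \sin\theta \cos\pi = \sin\theta$. So again the area of the triangle is $0.5|\mathbf{u}||\mathbf{v}| \sin\theta$ or $0.5|\mathbf{u} \times \mathbf{v}|$.

Example 275. Consider the triangle formed by the points (0, 0), (3, 4), and (5, 6). Its area is simply $0.5|(3, 4) \times (5, 6)| = 0.5|3 \cdot 6 - 4 \cdot 5| = 1$.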
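The area formula extends directly to a triangle given by any three points, by first forming two edge vectors from a common vertex. A minimal Python sketch, with the hypothetical helper name `triangle_area`:

```python
def triangle_area(p, q, r):
    """Area of triangle pqr as half the absolute 2D vector product of two edge vectors."""
    u = (q[0] - p[0], q[1] - p[1])  # edge from p to q
    v = (r[0] - p[0], r[1] - p[1])  # edge from p to r
    return 0.5 * abs(u[0] * v[1] - u[1] * v[0])

print(triangle_area((0, 0), (3, 4), (5, 6)))  # 1.0
```

Doubling the result gives the area of the corresponding parallelogram.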

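Fact 31. The parallelogram with sides of lengths $|\mathbf{u}|$ and $|\mathbf{v}|$, and diagonal of length $|\mathbf{v} - \mathbf{u}|$, has area $|\mathbf{u} \times \mathbf{v}|$.

[Figure: a parallelogram with sides u and v and diagonal v − u.]

Proof. Such a parallelogram is simply composed of two of the triangles from Fact 30. And so its area is simply twice the area of the triangle, or $2 \times 0.5|\mathbf{u} \times \mathbf{v}| = |\mathbf{u} \times \mathbf{v}|$.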

30.3

Vector Product in 3D

The 3D vector product is very different from the 2D vector product. The latter was simply

a scalar (real number); in contrast, the 3D vector product is instead a VECTOR!

Also previously, we first started with the algebraic definitions. For example, the 3D scalar product was defined as $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3$ and the 2D vector product as $\mathbf{u} \times \mathbf{v} = u_1 v_2 - u_2 v_1$.

We then showed that these algebraic definitions were equivalent to some geometric

interpretations.

For the vector product in 3D, I will go the other way round. That is, I will start with

the (very long) geometric definition, then show that it is equivalent to some algebraic

interpretation.


Definition 81. Given two distinct 3D vectors $\mathbf{u} = (u_x, u_y, u_z)$ and $\mathbf{v} = (v_x, v_y, v_z)$, their vector product (or cross product), denoted $\mathbf{u} \times \mathbf{v}$, is the (unique) vector that satisfies 3 properties:

1. $\mathbf{u} \times \mathbf{v}$ is orthogonal (perpendicular) to both $\mathbf{u}$ and $\mathbf{v}$.

Let's see what this first property means. Recall that it doesn't matter where we put the heads and tails of vectors. So let's put $\mathbf{u}$ and $\mathbf{v}$ on the same plane, with their heads at the same point.

[Figure: the vectors u and v on a plane, with one orthogonal vector pointing up (green) and one pointing down (purple).]

We see that there are exactly two vectors that are orthogonal to both $\mathbf{u}$ and $\mathbf{v}$: the vector pointing up (green) and the vector pointing down (purple).

There is thus an ambiguity. Which of these two vectors is $\mathbf{u} \times \mathbf{v}$?

To resolve this ambiguity, we also require that $\mathbf{u} \times \mathbf{v}$ satisfy a second property:

2. $\mathbf{u} \times \mathbf{v}$ satisfies the right-hand rule: Take your right hand and stick out your thumb, forefinger, and middle finger so that they are perpendicular. Your thumb represents the vector $\mathbf{u} \times \mathbf{v}$, your forefinger the vector $\mathbf{u}$, and your middle finger the vector $\mathbf{v}$. Hence, in the figure, $\mathbf{u} \times \mathbf{v}$ points up (green). (Try it yourself!)

Note that the right-hand rule is a mere convention, but one that everyone has agreed upon. There is no especially compelling reason for using it, other than the fact that left-handed people are an oppressed minority! (If instead we used the left-hand rule, then $\mathbf{u} \times \mathbf{v}$ would point down (purple) (try it!), but for better or worse, we don't use the left-hand rule.)

The third and last property specifies the length (or magnitude) of $\mathbf{u} \times \mathbf{v}$.

3. $|\mathbf{u} \times \mathbf{v}| = |\mathbf{u}||\mathbf{v}| \sin\theta$, where $\theta \in [0, \pi]$ is the angle between them.

Note that $\theta \in [0, \pi] \implies \sin\theta \geq 0$, so that $|\mathbf{u}||\mathbf{v}| \sin\theta$ is never negative. (Otherwise, we'd have the distressing possibility that the length of $\mathbf{u} \times \mathbf{v}$ is sometimes negative!)

One implication of this third and last property is that if both $\mathbf{u}$ and $\mathbf{v}$ point in the same direction (so that $\theta = 0$), then $\mathbf{u} \times \mathbf{v}$ is the zero vector (i.e. $\mathbf{u} \times \mathbf{v} = \mathbf{0}$).

Fact 32. (a) $\mathbf{i} \times \mathbf{j} = \mathbf{k}$; (b) $\mathbf{j} \times \mathbf{k} = \mathbf{i}$; (c) $\mathbf{k} \times \mathbf{i} = \mathbf{j}$; (d) $\mathbf{j} \times \mathbf{i} = -\mathbf{k}$; (e) $\mathbf{k} \times \mathbf{j} = -\mathbf{i}$; and (f) $\mathbf{i} \times \mathbf{k} = -\mathbf{j}$.

Proof. In each case, use the right-hand rule to show that properties #1 and #2 of Definition 81 are satisfied.

In each case, the length of the cross product is $|\mathbf{u}||\mathbf{v}| \sin\theta = 1 \cdot 1 \cdot \sin\frac{\pi}{2} = 1$. So indeed, property #3 is also satisfied.

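Ordinary multiplication is distributive:

Example 276. $3 \times (5 + 11) = 3 \times 5 + 3 \times 11$ and $18 \times (7 - 31) = 18 \times 7 - 18 \times 31$.

It turns out that the vector product is likewise distributive:

Fact 33. Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be vectors. Then $\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$. Moreover, $(\mathbf{a} + \mathbf{b}) \times \mathbf{c} = \mathbf{a} \times \mathbf{c} + \mathbf{b} \times \mathbf{c}$.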

Proof. Optional, see p. 934 in the Appendices.

The next proposition gives the promised algebraic interpretation of the vector product.

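Proposition 7. Given two 3D vectors $\mathbf{u} = (u_x, u_y, u_z)$ and $\mathbf{v} = (v_x, v_y, v_z)$, their vector product is given by:

$$\mathbf{u} \times \mathbf{v} = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix}.$$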

Proof. Optional, see p. 935 in Appendices. (The proof is actually quite simple. It just

involves some tedious algebra.)

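Example 277. If u = (1, 2, 3) and v = (4, 5, 6), then

$$\mathbf{u} \times \mathbf{v} = (2 \cdot 6 - 3 \cdot 5,\ 3 \cdot 4 - 1 \cdot 6,\ 1 \cdot 5 - 2 \cdot 4) = (-3, 6, -3).$$

Let's verify that $\mathbf{u} \times \mathbf{v}$ is orthogonal to $\mathbf{u}$, by computing $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{u} = (-3, 6, -3) \cdot (1, 2, 3) = -3 + 12 - 9 = 0$. Similarly, let's verify that $\mathbf{u} \times \mathbf{v}$ is orthogonal to $\mathbf{v}$, by computing $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{v} = (-3, 6, -3) \cdot (4, 5, 6) = -12 + 30 - 18 = 0$.

Example 278. If u = (−1, 3, −5) and v = (2, −4, 6), then

$$\mathbf{u} \times \mathbf{v} = (3 \cdot 6 - (-5) \cdot (-4),\ (-5) \cdot 2 - (-1) \cdot 6,\ (-1) \cdot (-4) - 3 \cdot 2) = (-2, -4, -2).$$

Let's verify that $\mathbf{u} \times \mathbf{v}$ is orthogonal to $\mathbf{u}$, by computing $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{u} = (-2, -4, -2) \cdot (-1, 3, -5) = 2 - 12 + 10 = 0$. Similarly, let's verify that $\mathbf{u} \times \mathbf{v}$ is orthogonal to $\mathbf{v}$, by computing $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{v} = (-2, -4, -2) \cdot (2, -4, 6) = -4 + 16 - 12 = 0$.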
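The component formula of Proposition 7 and the orthogonality checks can be sketched in Python as follows. The helper names `cross3d` and `dot` are assumptions made for illustration.

```python
def cross3d(u, v):
    """3D vector product, computed component by component (Proposition 7)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two vectors of equal dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))

u, v = (1, 2, 3), (4, 5, 6)
w = cross3d(u, v)
print(w)                     # (-3, 6, -3)
print(dot(w, u), dot(w, v))  # 0 0
print(cross3d(v, u))         # (3, -6, 3)
```

The last line shows the sign flip when the two vectors are swapped.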

As in the 2D case of the vector product, here again in the 3D case, the vector product is anticommutative, i.e. $\mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u}$ (see Exercise 120).

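Exercise 118. For each of the following pairs of vectors, compute the vector product and verify that it is orthogonal to each of the two vectors. (a) u = (0, 1, 2) and v = (3, 4, 5). (b) u = (1, 2, 3) and v = (1, 0, 5). (Answer on p. 1085.)

Exercise 119. Verify that in general, $\mathbf{u} \times \mathbf{v}$ is orthogonal to $\mathbf{u}$ and $\mathbf{v}$ by showing that $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{u} = 0$ and that $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{v} = 0$. (Answer on p. 1085.)

Exercise 120. (a) Given u = (1, 2, 3) and v = (4, 5, 6), show that $\mathbf{v} \times \mathbf{u} = -\mathbf{u} \times \mathbf{v}$. (b) Prove that in general, the 3D vector product is anticommutative, i.e. $\mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u}$. (Answer on p. 1085.)

Ordinary multiplication is associative:

Example 279. $(2 \times 3) \times 7 = 2 \times (3 \times 7)$ and $(8 \times 13) \times 2 = 8 \times (13 \times 2)$.

In contrast, the vector product is not associative.

Example 280. If u = (1, 2, 3), v = (4, 5, 6), and w = (1, 0, 1), then $(\mathbf{u} \times \mathbf{v}) \times \mathbf{w} = (-3, 6, -3) \times (1, 0, 1) = (6, 0, -6)$, but $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (1, 2, 3) \times (5, 2, -5) = (-16, 20, -8)$.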

31 Lines

31.1

ax + by + c = 0.

This says that the line consists of exactly those points (x, y) that satisfy the equation

ax + by + c = 0.

(You may be more familiar with describing lines in the form y = mx + d. This simply involves a rearrangement of the above equation. But the above equation is preferred because it is more general: it allows for the possibility that the coefficient on y is 0.)

Example 281. Consider the line described by the cartesian equation 3x − y + 2 = 0. Rearranging, we get a more familiar-looking equation: y = 3x + 2.

For convenience (but at the cost of some sloppiness), we may even simply identify the line with the cartesian equation.

Example 282. Consider the line 3x − y + 2 = 0.

Describing lines using cartesian equations is secondary school stuff. We'll now learn a second method of describing lines: through vector equations. In general, any line can be described in the form

$$r = p + \lambda\mathbf{v}, \quad \lambda \in \mathbb{R},$$

where $r$ is a generic point on the line, $p$ is some known point on the line, $\mathbf{v}$ is a direction vector of the line, and $\lambda$ is a parameter that can take any real value.

Here are some examples to make sense of this.


Example 283. Consider the line (on a 2D plane) described by the cartesian equation 3x − y + 2 = 0. It runs through the point (0, 2). A vector that points in the same direction as this line is (1, 3). Hence, we can also describe it using the vector equation

$$r = (0, 2) + \lambda(1, 3), \quad \lambda \in \mathbb{R}.$$

This says that the line consists of every point $r$ that can be written as $(0, 2) + \lambda(1, 3)$ for some real number $\lambda$. We call $\lambda$ a parameter. As $\lambda$ varies, we get different points of the line. So for example, corresponding to $\lambda = 0$, $1$, and $-1$, the line contains the points $(0, 2) + 0(1, 3) = (0, 2)$, $(0, 2) + 1(1, 3) = (1, 5)$, and $(0, 2) - 1(1, 3) = (-1, -1)$. Of course, it also contains infinitely many other points, one for each value of $\lambda \in \mathbb{R}$.

We call (1, 3) a direction vector of the line. Note that this direction vector is not unique: any non-zero scalar multiple thereof, i.e. $c(1, 3)$ with $c \in \mathbb{R}$ and $c \neq 0$, is also a direction vector of the line!

Again, we can either say the line is described by the vector equation $r = (0, 2) + \lambda(1, 3)$. OR, for convenience (but at the cost of some sloppiness), we can also say the line is the very equation $r = (0, 2) + \lambda(1, 3)$.
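To see that the vector equation and the cartesian equation describe the same line, one can generate points from the former and test them against the latter. A minimal Python sketch, with the hypothetical helper name `point_on_line`:

```python
def point_on_line(p, v, lam):
    """Point p + lam * v on the line through p with direction vector v."""
    return tuple(pi + lam * vi for pi, vi in zip(p, v))

p, v = (0, 2), (1, 3)
for lam in (0, 1, -1):
    x, y = point_on_line(p, v, lam)
    # Each generated point should satisfy the cartesian equation 3x - y + 2 = 0.
    print((x, y), 3 * x - y + 2 == 0)
```

This prints the points (0, 2), (1, 5), and (-1, -1), each followed by True.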

[Figure: the line with cartesian equation 3x − y + 2 = 0 and vector equation r = (0, 2) + λ(1, 3), plotted in the xy-plane.]

Example 284. Consider the line (on a 2D plane) described by the cartesian equation x + y − 1 = 0. It runs through the point (0, 1). A vector that points in the same direction as this line is (1, −1). Hence, we can also describe it using the vector equation

$$r = (0, 1) + \lambda(1, -1), \quad \lambda \in \mathbb{R}.$$

This says that the line consists of every point $r$ that can be written as $(0, 1) + \lambda(1, -1)$ for some real number $\lambda$. Corresponding to $\lambda = 0$, $1$, and $-1$, the line contains the points $(0, 1) + 0(1, -1) = (0, 1)$, $(0, 1) + 1(1, -1) = (1, 0)$, and $(0, 1) - 1(1, -1) = (-1, 2)$.

[Figure: the line with cartesian equation x + y − 1 = 0 and vector equation r = (0, 1) + λ(1, −1), plotted in the xy-plane with the point (0, 1) marked.]

Example 285. Consider the line (on a 2D plane) described by the cartesian equation y − 3 = 0. It runs through the point (0, 3). A vector that points in the same direction as this line is (1, 0). Hence, we can also describe it using the vector equation

$$r = (0, 3) + \lambda(1, 0), \quad \lambda \in \mathbb{R}.$$

This says that the line consists of every point $r$ that can be written as $(0, 3) + \lambda(1, 0)$ for some real number $\lambda$. Corresponding to $\lambda = 0$, $1$, and $-1$, the line contains the points $(0, 3) + 0(1, 0) = (0, 3)$, $(0, 3) + 1(1, 0) = (1, 3)$, and $(0, 3) - 1(1, 0) = (-1, 3)$.

[Figure: the line with cartesian equation y − 3 = 0 and vector equation r = (0, 3) + λ(1, 0), plotted in the xy-plane.]

Exercise 121. Rewrite each of the following lines into vector equation form. (a) 5x + y + 1 = 0. (b) x − 2y − 1 = 0. (c) y − 4 = 0. (d) x − 4 = 0. (Answer on p. 1086.)

$$r = p + \lambda\mathbf{v}, \quad \lambda \in \mathbb{R}. \tag{1}$$

Notice that the LHS of this equation is the generic Point $r$. And the RHS of the equation is the Point $p$ plus the Vector $\lambda\mathbf{v}$, which equals a Point (see p. 270). So LHS and RHS do indeed match up.

There is another way to describe a line using a vector equation. We can instead write:

$$\mathbf{r} = \mathbf{p} + \lambda\mathbf{v}, \quad \lambda \in \mathbb{R}, \tag{2}$$

where now $\mathbf{r}$ is the position vector of a generic point $r$ on the line and $\mathbf{p}$ is the position vector of some known point $p$ on the line. So now LHS is a Vector and so too is RHS.

Equation (1) said that the line consists of those points $r$ that could be written as $p + \lambda\mathbf{v}$. In contrast, equation (2) says that the line consists of those points whose position vector $\mathbf{r}$ can be written as $\mathbf{p} + \lambda\mathbf{v}$. But both equations can equally well describe the very same line. The difference is a fine and pedantic one and really doesn't matter much.

However, it would be wrong to mix the two and write:

$$r = \mathbf{p} + \lambda\mathbf{v}, \quad \lambda \in \mathbb{R}; \quad \text{WRONG!} \tag{3}$$

$$\text{or } \mathbf{r} = p + \lambda\mathbf{v}, \quad \lambda \in \mathbb{R}. \quad \text{WRONG!} \tag{4}$$

The LHS of (3) is a Point while the RHS of (3) is a Vector. Therefore (3) cannot possibly be true.

The LHS of (4) is a Vector while the RHS of (4) is a Point. Therefore (4) cannot possibly be true.

As usual, this is all very pedantic, but can serve as a useful test of your understanding.

31.2

In the previous section, given the cartesian equation of a line, we worked out its vector

equation. Now given its vector equation, well work out its cartesian equation.

Suppose a line (on a 2D plane) can be described by the vector equation

$$\mathbf{r} = \mathbf{p} + \lambda\mathbf{v} = (p_1, p_2) + \lambda(v_1, v_2),$$

where $\lambda \in \mathbb{R}$ and $\mathbf{v}$ is a non-zero vector. And so any point (x, y) on this line must satisfy

$$x = p_1 + \lambda v_1 \quad \text{and} \quad y = p_2 + \lambda v_2.$$

The above are the cartesian equations for a line (on a 2D plane)! But wait a minute ... isn't there supposed to be just one equation? Well, if we'd like, we can quite easily combine them into a single equation by eliminating the parameter $\lambda$. In general:

Fact 34. The line with vector equation $\mathbf{r} = (p_1, p_2) + \lambda(v_1, v_2)$ (for $\lambda \in \mathbb{R}$) is the line with cartesian equations as given by the 3 cases below.

(1) $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$, if $v_1, v_2 \neq 0$;

(2) $x = p_1$, y is free, if $v_1 = 0$, $v_2 \neq 0$;

(3) x is free, $y = p_2$, if $v_1 \neq 0$, $v_2 = 0$.

Proof. Optional, see p. 936 in the Appendices.

Some examples:


Example 286. The line described by the vector equation $\mathbf{r} = (1, 2) + \lambda(1, 1)$, where $\lambda \in \mathbb{R}$, has cartesian equations $x = 1 + \lambda$ and $y = 2 + \lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, this pair of equations gives us the points that are on the line. For example, when $\lambda = 1, 17, 33$, we have the points (2, 3), (18, 19), and (34, 35).

We can eliminate $\lambda$ and reduce the above pair of equations into the single cartesian equation

$$y = x + 1 \quad \text{or} \quad \frac{y - 1}{1} = \frac{x}{1}.$$

Example 287. The line described by the vector equation $\mathbf{r} = (0, 0) + \lambda(4, 5)$, where $\lambda \in \mathbb{R}$, has cartesian equations $x = 4\lambda$ and $y = 5\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, this pair of equations gives us the points that are on the line. For example, when $\lambda = 1, 17, 33$, we have the points (4, 5), (68, 85), and (132, 165).

Eliminating $\lambda$, we can reduce the above pair of equations to

$$y = 1.25x \quad \text{or} \quad \frac{x}{1} = \frac{y}{1.25}.$$

Example 288. The line described by the vector equation $\mathbf{r} = (3, 1) + \lambda(0, 2)$, where $\lambda \in \mathbb{R}$, has cartesian equations $x = 3$ and $y = 1 + 2\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, this pair of equations gives us the points that are on the line. So in fact, the above equations say that x must always be 3 and y is free to vary along with $\lambda$. For example, when $\lambda = 1, 17, 33$, we have the points (3, 3), (3, 35), and (3, 67).

Hence, the above pair of equations can be reduced to x = 3.
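The elimination of the parameter can be done once and for all. The sketch below, in Python, returns the coefficients of a single equation ax + by + c = 0; the function name `line_coefficients` is an assumption, and the cross-multiplied form is chosen here (rather than the fraction form) so that the cases with a zero component need no special handling.

```python
def line_coefficients(p, v):
    """Return (a, b, c) with ax + by + c = 0 for the 2D line r = p + lam * v.

    Eliminating lam from x = p1 + lam*v1 and y = p2 + lam*v2 gives
    v2*(x - p1) - v1*(y - p2) = 0, which also covers v1 = 0 or v2 = 0.
    """
    (p1, p2), (v1, v2) = p, v
    return (v2, -v1, v1 * p2 - v2 * p1)

print(line_coefficients((1, 2), (1, 1)))  # (1, -1, 1), i.e. x - y + 1 = 0
print(line_coefficients((3, 1), (0, 2)))  # (2, 0, -6), i.e. x = 3
```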

Exercise 122. Rewrite each of the following lines into cartesian equation form. (a) $\mathbf{r} = (1, 1) + \lambda(3, 2)$, where $\lambda \in \mathbb{R}$. (b) $\mathbf{r} = (5, 6) + \lambda(7, 8)$, where $\lambda \in \mathbb{R}$. (c) $\mathbf{r} = (0, 3) + \lambda(3, 0)$, where $\lambda \in \mathbb{R}$. (Answer on p. 1086.)

31.3

guess might be that lines in 3D space are analogously described by the cartesian equation ax + by + cz + d = 0. Turns out this is wrong! The equation ax + by + cz + d = 0 actually describes a plane, as we'll see later (Chapter 32).

In the 3D case, it's easier to start by looking at the vector equation of a line. It turns out to be exactly analogous to the 2D case. It can be written as $\mathbf{r} = \mathbf{a} + \lambda\mathbf{v}$, where $\lambda \in \mathbb{R}$ and $\mathbf{v}$ is a non-zero 3D vector.

This vector equation says that the line contains every point $r$ that can be expressed as $(a_1, a_2, a_3) + \lambda(v_1, v_2, v_3)$, where $\lambda \in \mathbb{R}$ is a parameter.

Example 289. Consider the line described by the vector equation $\mathbf{r} = (1, 2, 3) + \lambda(0, 1, 1)$, where $\lambda \in \mathbb{R}$. Corresponding to $\lambda = 0$, $1$, and $-1$, the line contains the points (1, 2, 3), (1, 3, 4), and (1, 1, 2).

[Figure: the line r = (1, 2, 3) + λ(0, 1, 1) in 3D space, with the points (1, 2, 3), (1, 3, 4), and (1, 1, 2) marked.]

Example 290. Consider the line described by the vector equation $\mathbf{r} = (0, 0, 0) + \lambda(1, 0, 0)$, where $\lambda \in \mathbb{R}$. Corresponding to $\lambda = 0$, $1$, and $-1$, the line contains the points (0, 0, 0), (1, 0, 0), and (−1, 0, 0).

[Figure: the line r = (0, 0, 0) + λ(1, 0, 0) in 3D space, with the points (−1, 0, 0), (0, 0, 0), and (1, 0, 0) marked.]

31.4

We now try to work out the cartesian equation of a line in 3D space. Suppose a line can be described by the vector equation

$$\mathbf{r} = \mathbf{p} + \lambda\mathbf{v} = (p_1, p_2, p_3) + \lambda(v_1, v_2, v_3),$$

where $\lambda \in \mathbb{R}$ and $\mathbf{v}$ is a non-zero vector. And so any point (x, y, z) on this line must satisfy

$$x = p_1 + \lambda v_1, \quad y = p_2 + \lambda v_2, \quad \text{and} \quad z = p_3 + \lambda v_3.$$

The above are the cartesian equations for a line (in 3D space)! These are exactly analogous to the cartesian equations (Section 31.2) in the 2D case.

Unlike in the 2D case, it is generally impossible to reduce these equations into a single cartesian equation. However, we can reduce them into two equations.

Fact 35. The line with vector equation $\mathbf{r} = (p_1, p_2, p_3) + \lambda(v_1, v_2, v_3)$ where $\lambda \in \mathbb{R}$ is the line with cartesian equations as given by the 7 cases below.

(1) $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$, if $v_1, v_2, v_3 \neq 0$;

(2) $x = p_1$, $\dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$, if $v_1 = 0$, $v_2, v_3 \neq 0$;

(3) $y = p_2$, $\dfrac{x - p_1}{v_1} = \dfrac{z - p_3}{v_3}$, if $v_2 = 0$, $v_1, v_3 \neq 0$;

(4) $z = p_3$, $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$, if $v_3 = 0$, $v_1, v_2 \neq 0$;

(5) $x = p_1$, $y = p_2$, z is free, if $v_1, v_2 = 0$, $v_3 \neq 0$;

(6) $x = p_1$, $z = p_3$, y is free, if $v_1, v_3 = 0$, $v_2 \neq 0$;

(7) $y = p_2$, $z = p_3$, x is free, if $v_2, v_3 = 0$, $v_1 \neq 0$.

Proof. Optional, see p. 937 in the Appendices.


The first two examples are where v1 , v2 , and v3 are non-zero (Case 1 of Fact 35).

Example 291. Consider the line described by the vector equation $\mathbf{r} = (1, 2, 3) + \lambda(4, 5, 6)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1 + 4\lambda$, $y = 2 + 5\lambda$, and $z = 3 + 6\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points (5, 7, 9), (13, 17, 21), and (69, 87, 105).

By rearranging each equation so that $\lambda$ is on one side, we can reduce these three equations to just two:

$$\frac{x - 1}{4} = \frac{y - 2}{5} = \frac{z - 3}{6}.$$

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

Example 292. Consider the line described by the vector equation $\mathbf{r} = (0, 0, 0) + \lambda(2, 3, 5)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 2\lambda$, $y = 3\lambda$, and $z = 5\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points (2, 3, 5), (6, 9, 15), and (34, 51, 85).

By rearranging each equation so that $\lambda$ is on one side, we can reduce these three equations to just two:

$$\frac{x}{2} = \frac{y}{3} = \frac{z}{5}.$$

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

www.EconsPhDTutor.com

We now look at examples where exactly one of $v_1$, $v_2$, or $v_3$ is zero (Cases 2, 3, and 4 of Fact 35).

In the case where $v_1 = 0$ (but $v_2 \neq 0$ and $v_3 \neq 0$), the line lies on the 2D plane where $x = p_1$ (a plane parallel to the $yz$-plane).

Example 293. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(0, 5, 6)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1$, $y = 2 + 5\lambda$, and $z = 3 + 6\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(1, 7, 9)$, $(1, 17, 21)$, and $(1, 87, 105)$. We see that $x$ must always be equal to 1.

By rearranging the second and third equations so that $\lambda$ is on one side, we can reduce these three equations to just two:

$$x = 1, \qquad \frac{y - 2}{5} = \frac{z - 3}{6}.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations.

Similarly, in the case where $v_2 = 0$ (but $v_1 \neq 0$ and $v_3 \neq 0$), the line lies on the 2D plane where $y = p_2$ (a plane parallel to the $xz$-plane).

Example 294. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(4, 0, 6)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1 + 4\lambda$, $y = 2$, and $z = 3 + 6\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(5, 2, 9)$, $(13, 2, 21)$, and $(69, 2, 105)$. We see that $y$ must always be equal to 2.

By rearranging the first and third equations so that $\lambda$ is on one side, we can reduce these three equations to just two:

$$y = 2, \qquad \frac{x - 1}{4} = \frac{z - 3}{6}.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations.

Finally, in the case where $v_3 = 0$ (but $v_1 \neq 0$ and $v_2 \neq 0$), the line lies on the 2D plane where $z = p_3$ (a plane parallel to the $xy$-plane).

Example 295. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(4, 5, 0)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1 + 4\lambda$, $y = 2 + 5\lambda$, and $z = 3$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(5, 7, 3)$, $(13, 17, 3)$, and $(69, 87, 3)$. We see that $z$ must always be equal to 3.

By rearranging the first and second equations so that $\lambda$ is on one side, we can reduce these three equations to just two:

$$z = 3, \qquad \frac{x - 1}{4} = \frac{y - 2}{5}.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations.

We now look at examples where exactly two of v1 , v2 , or v3 are zero (Cases 5, 6, and 7 of

Fact 35).

In the case where $v_1 = 0$ and $v_2 = 0$, but $v_3 \neq 0$, this is a line that runs through the points $(p_1, p_2, \mu)$ for $\mu \in \mathbb{R}$.

Example 296. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(0, 0, 6)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1$, $y = 2$, and $z = 3 + 6\lambda$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(1, 2, 9)$, $(1, 2, 21)$, and $(1, 2, 105)$.

We see that $x$ and $y$ must always be equal to 1 and 2. Hence, the above equations simply reduce to:

$$x = 1, \qquad y = 2.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations. These are the points $(1, 2, \mu)$, where $\mu$ can be any real.

Similarly, in the case where $v_1 = 0$ and $v_3 = 0$, but $v_2 \neq 0$, this is a line that runs through the points $(p_1, \mu, p_3)$ for $\mu \in \mathbb{R}$.

Example 297. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(0, 5, 0)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1$, $y = 2 + 5\lambda$, and $z = 3$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(1, 7, 3)$, $(1, 17, 3)$, and $(1, 87, 3)$.

We see that $x$ and $z$ must always be equal to 1 and 3. Hence, the above equations simply reduce to:

$$x = 1, \qquad z = 3.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations. These are the points $(1, \mu, 3)$, where $\mu$ can be any real.

In the case where $v_2 = 0$ and $v_3 = 0$, but $v_1 \neq 0$, this is a line that runs through the points $(\mu, p_2, p_3)$ for $\mu \in \mathbb{R}$.

Example 298. Consider the line described by the vector equation $r = (1, 2, 3) + \lambda(4, 0, 0)$, where $\lambda \in \mathbb{R}$. It can be described by the cartesian equations $x = 1 + 4\lambda$, $y = 2$, and $z = 3$.

As $\lambda$ varies between $-\infty$ and $\infty$, these 3 equations give us the points that are on the line. For example, when $\lambda = 1, 3, 17$, we have the points $(5, 2, 3)$, $(13, 2, 3)$, and $(69, 2, 3)$.

We see that $y$ and $z$ must always be equal to 2 and 3. Hence, the above equations simply reduce to:

$$y = 2, \qquad z = 3.$$

That is, this is the line that contains the points $(x, y, z)$ which satisfy the above cartesian equations. These are the points $(\mu, 2, 3)$, where $\mu$ can be any real.

Exercise 123. Rewrite each of the following vector equation descriptions of lines into cartesian equations describing the same line. (a) $r = (1, 1, 1) + \lambda(3, 2, 1)$, where $\lambda \in \mathbb{R}$. (b) $r = (5, 6, 1) + \lambda(7, 8, 1)$, where $\lambda \in \mathbb{R}$. (c) $r = (0, 3, 1) + \lambda(3, 0, 1)$, where $\lambda \in \mathbb{R}$. (d) $r = (9, 9, 9) + \lambda(1, 0, 0)$, where $\lambda \in \mathbb{R}$. (Answer on p. 1087.)

SYLLABUS ALERT

The 9740 (old) List of Formulae contains the following statement, but not the 9758 (revised)

List of Formulae!

If A is the point with position vector $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and the direction vector $\mathbf{b}$ is given by $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, then the straight line through A with direction vector $\mathbf{b}$ has cartesian equation

$$\frac{x - a_1}{b_1} = \frac{y - a_2}{b_2} = \frac{z - a_3}{b_3} \ (= \lambda).$$

By the way, the above statement printed in the 9740 (old) List of Formulae is false (*gasp*), because it fails to specify that $b_1$, $b_2$, $b_3$ must be non-zero. (The correct statement was just given as Fact 35.)

Consider, for example, the point $A = (0, 0, 0)$ and the direction vector $\mathbf{b}$ given by $\mathbf{b} = \mathbf{j} + \mathbf{k}$. Then contrary to the above statement, the straight line through A with direction vector $\mathbf{b}$ does not have cartesian equation

$$\frac{x}{0} = \frac{y}{1} = \frac{z}{1},$$

because $\dfrac{x}{0}$ is undefined. This is the common mistake to which I devoted an entire chapter (Chapter 2) earlier in this book. This seems like a very pedantic point to make, but dividing by zero has been the cause of the downfall of many a student (and in this case some folks at MOE).

In the examples and exercise we just went through, we started with a vector equation of a line and then wrote down the line's cartesian equations. Now we'll go the other way round: starting with the line's cartesian equations, we write down a vector equation.

Example 299. Consider a line described by the cartesian equations

$$\frac{3x - 4}{6} = \frac{2y - 18}{5} = \frac{z - 1}{3}.$$

In order to directly apply Fact 35, you must make sure that the coefficients on $x$, $y$, and $z$ are all 1! So first rewrite the above into

$$\frac{x - 4/3}{2} = \frac{y - 9}{2.5} = \frac{z - 1}{3}.$$

And now by Fact 35, we can immediately describe this line by the vector equation $r = (4/3, 9, 1) + \lambda(2, 2.5, 3)$, for $\lambda \in \mathbb{R}$.

Example 300. Consider a line described by the cartesian equations

$$\frac{5x}{2} = \frac{y - 13}{6} = \frac{3z - 14}{8}.$$

Rewrite the above equations into

$$\frac{x}{2/5} = \frac{y - 13}{6} = \frac{z - 14/3}{8/3}.$$

And so by Fact 35, we can immediately describe this line by the vector equation $r = (0, 13, 14/3) + \lambda(2/5, 6, 8/3)$, for $\lambda \in \mathbb{R}$.

Example 301. Consider a line described by the cartesian equations $2x = 17$ and $3z = 4$. Rewrite them into $x = 8.5$ and $z = 4/3$. And so by Fact 35, we can immediately describe this line by the vector equation $r = (8.5, 0, 4/3) + \lambda(0, 1, 0)$, for $\lambda \in \mathbb{R}$.
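The rescaling step used in Examples 299 and 300 can be sketched in a few lines. This is an illustrative sketch only: the function name `cartesian_to_vector` and the `(a, b, c)` encoding of each fraction are assumptions made here, and it covers only Case 1 of Fact 35 (every coordinate appears, with a non-zero coefficient and denominator).

```python
from fractions import Fraction

def cartesian_to_vector(terms):
    """Convert (a1*x - b1)/c1 = (a2*y - b2)/c2 = (a3*z - b3)/c3 into a
    (point, direction) pair for r = point + lambda * direction.

    `terms` holds one (a, b, c) triple per coordinate, all a and c non-zero.
    """
    # Dividing numerator and denominator by a makes the coefficient on the
    # coordinate equal to 1: (x - b/a) / (c/a).
    point = tuple(Fraction(b, a) for a, b, c in terms)
    direction = tuple(Fraction(c, a) for a, b, c in terms)
    return point, direction

# Example 299: (3x - 4)/6 = (2y - 18)/5 = (z - 1)/3
point, direction = cartesian_to_vector([(3, 4, 6), (2, 18, 5), (1, 1, 3)])
```

With the input of Example 299, `point` is (4/3, 9, 1) and `direction` is (2, 5/2, 3), in agreement with the vector equation found there.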

Exercise 124. Rewrite each of the following cartesian equation descriptions of lines into a vector equation describing the same line. (a) $\dfrac{7x - 2}{5} = \dfrac{0.3y - 5}{7} = \dfrac{8z}{7}$. (b) $2x = 3y = 5z$. (c) $17x - 4 = \dfrac{3y - 1}{2} = 3z$. (d) $\dfrac{x - 3}{2} = \dfrac{5z - 2}{7}$, $3y = 11$. (Answer on p. 1088.)

31.5

Collinearity

Definition 82. A set of points are said to be collinear if there is some line that contains

all of these points.

Any two points are always collinear: simply take the line that passes through both of them.

In contrast, three points may not be collinear. To check whether three points are collinear,

1. First take the line that passes through two of the points.

2. Then check whether the third point is on this line.

[Figure: one set of points a, b, and c that are not collinear, and another set of points a, b, and c that are collinear.]

Example 302. Are the points $a = (1, 2, 3)$, $b = (4, 5, 6)$, and $c = (7, 8, 9)$ collinear?

First take the line through $a$ and $b$. The vector from $a$ to $b$ is $(3, 3, 3)$ and the line passes through $a$. Hence, the line can be written as $r = (1, 2, 3) + \lambda(3, 3, 3)$ ($\lambda \in \mathbb{R}$).

Then check whether $c$ is on the line: Is there $\lambda$ such that $c = (7, 8, 9) = (1, 2, 3) + \lambda(3, 3, 3)$? Rearranging, we have $(6, 6, 6) = \lambda(3, 3, 3)$, which we can write out as:

$$6 = 3\lambda, \qquad 6 = 3\lambda, \qquad 6 = 3\lambda.$$

Clearly, all three of the above equations are true if $\lambda = 2$. And so $c$ is also on the line. Hence, the three points are collinear.

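[Figure: the collinear points a = (1, 2, 3), b = (4, 5, 6), and c = (7, 8, 9), with the vector from a to b equal to (3, 3, 3); and the non-collinear points d = (1, 0, 0), e = (0, 1, 0), and f = (0, 0, 1), with the vector from d to e equal to (-1, 1, 0).]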

Example 303. Are the points $d = (1, 0, 0)$, $e = (0, 1, 0)$, and $f = (0, 0, 1)$ collinear?

First take the line through $d$ and $e$. The vector from $d$ to $e$ is $(-1, 1, 0)$ and the line passes through $d$. Hence, the line can be written as $r = (1, 0, 0) + \lambda(-1, 1, 0)$ ($\lambda \in \mathbb{R}$).

Then check whether $f$ is on the line: Is there $\lambda$ such that $f = (0, 0, 1) = (1, 0, 0) + \lambda(-1, 1, 0)$? Rearranging, we have $(-1, 0, 1) = \lambda(-1, 1, 0)$, which we can write out as:

$$-1 = -\lambda, \qquad 0 = \lambda, \qquad 1 = 0.$$

Clearly, there is no $\lambda$ such that the above three equations can be true. And so the point $f$ is not on the line through $d$ and $e$. Hence, the three points are not collinear.
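The two-step check used in Examples 302 and 303 can be sketched as follows. This is a rough illustration rather than a prescribed method: the function name `collinear` is hypothetical, and exact integer (or `Fraction`) coordinates are assumed.

```python
from fractions import Fraction

def collinear(a, b, c):
    """Check whether c lies on the line through a and b, by looking for a
    single lambda with c - a = lambda * (b - a)."""
    v = tuple(bi - ai for ai, bi in zip(a, b))  # vector from a to b
    w = tuple(ci - ai for ai, ci in zip(a, c))  # vector from a to c
    if all(vi == 0 for vi in v):
        return True  # a and b coincide, and any two points are collinear
    lam = None
    for wi, vi in zip(w, v):
        if vi == 0:
            # This row reads w_i = lambda * 0, so it needs w_i = 0.
            if wi != 0:
                return False
        else:
            candidate = Fraction(wi, vi)
            if lam is None:
                lam = candidate
            elif candidate != lam:
                return False  # two rows demand different values of lambda
    return True
```

Calling `collinear((1, 2, 3), (4, 5, 6), (7, 8, 9))` finds the common value lambda = 2, while `collinear((1, 0, 0), (0, 1, 0), (0, 0, 1))` fails on the row 1 = 0.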

Exercise 125. Determine whether each of the following sets of three points is collinear. (a) a = (3, 1, 2), b = (1, 6, 5), and c = (0, 1, 0). (b) a = (1, 2, 4), b = (0, 0, 1), and c = (3, 6, 10). (Answer on p. 1088.)

32

Planes

Here are two very useful clues:

1. $\mathbf{u} \perp \mathbf{v} \iff \mathbf{u} \cdot \mathbf{v} = 0$.

In words: Two vectors are orthogonal if and only if their scalar product is 0.

2. Since the plane is a flat surface, there must be some vector $\mathbf{n}$ that is orthogonal (perpendicular) to this plane.

That is, $\mathbf{n}$ is orthogonal to every vector on the plane. We call $\mathbf{n}$ the plane's normal vector (hence the use of the letter n).

Is the normal vector unique? No, because any other vector $c\mathbf{n}$ (where $c$ is any non-zero scalar) serves equally well as a normal vector. In the figure below, $\mathbf{n}$ is a normal vector to the illustrated plane. So too is $0.5\mathbf{n}$. And so too is $-\mathbf{n}$.

But otherwise, besides $c\mathbf{n}$, there are no other vectors that are also orthogonal to the plane. That is, any vector that cannot be written in the form $c\mathbf{n}$ is not orthogonal to the plane.

[Figure: a plane with several vectors lying on it (drawn in black), together with three normal vectors: n, 0.5n, and -n.]

Suppose a plane contains some point $p = (p_1, p_2, p_3)$ and has a normal vector $\mathbf{n} = (a, b, c)$. Now consider any point $r$ on the plane. We can construct the vector $r - p$.

This vector $r - p$ lies on the plane and must therefore be orthogonal to $\mathbf{n}$, the plane's normal vector. So, for any point $r$ that lies on the plane, we have

$$(r - p) \cdot \mathbf{n} = 0.$$

[Figure: a plane containing the points p and r, and a point q not on the plane. The vector r - p lies on the plane; the vector q - p does not. The normal vector n is normal to r - p, but not to q - p.]

Now consider any point $q$ that is not on the plane. We can construct the vector $q - p$.

This vector $q - p$ does not lie on the plane and must therefore not be orthogonal to $\mathbf{n}$, the plane's normal vector. So, for any point $q$ that does not lie on the plane, we have

$$(q - p) \cdot \mathbf{n} \neq 0.$$

Altogether then, we conclude: A point $r$ is on the plane if and only if $(r - p) \cdot \mathbf{n} = 0$.

Fact 36. Suppose a plane contains point $p$ and has normal vector $\mathbf{n}$. Then the plane contains exactly those points $r$ such that

$$(r - p) \cdot \mathbf{n} = 0.$$

Recall that a line can be described by the vector equation $r = p + \lambda\mathbf{v}$, where $r$ is a generic point on the line and $p$ is a known point on the line. Alternatively, it can also be described by the vector equation $\mathbf{r} = \mathbf{p} + \lambda\mathbf{v}$, where $\mathbf{r}$ is the position vector of $r$ and $\mathbf{p}$ is the position vector of $p$.

On the previous page, we proved that if a plane contains point $p$ and has normal vector $\mathbf{n}$, then it may be described by the vector equation $(r - p) \cdot \mathbf{n} = 0$, where $r$ is a generic point on the plane. Similar to a line, the same plane can also be described by the vector equation $(\mathbf{r} - \mathbf{p}) \cdot \mathbf{n} = 0$.

By the distributivity of the scalar product, we have:

$$(\mathbf{r} - \mathbf{p}) \cdot \mathbf{n} = 0 \iff \mathbf{r} \cdot \mathbf{n} - \mathbf{p} \cdot \mathbf{n} = 0 \iff \mathbf{r} \cdot \mathbf{n} = \mathbf{p} \cdot \mathbf{n}.$$

Now, $\mathbf{p}$ is known (it is the position vector of a point $p$ known to be on the plane). So too is $\mathbf{n}$ (it is the plane's normal vector). Thus, $\mathbf{p} \cdot \mathbf{n}$ is simply some known number.

So we can describe the plane even more simply by the vector equation

$$\mathbf{r} \cdot \mathbf{n} = d,$$

where $d = \mathbf{p} \cdot \mathbf{n}$.

Example 304. Consider a plane that contains the point $p = (1, 2, 3)$ and has normal vector $\mathbf{n} = (1, 1, 0)$. We compute

$$d = \mathbf{p} \cdot \mathbf{n} = (1, 2, 3) \cdot (1, 1, 0) = 1 \cdot 1 + 2 \cdot 1 + 3 \cdot 0 = 3.$$

We thus conclude that the plane may be described by the vector equation $\mathbf{r} \cdot (1, 1, 0) = 3$.

This says that the plane contains exactly every point $r$ whose position vector $\mathbf{r}$ satisfies the above equation. For example, the points $r_1 = (3, 0, 0)$, $r_2 = (0, 3, 5)$, and $r_3 = (1, 2, -1)$ are on the plane, because their position vectors $\mathbf{r}_1 = (3, 0, 0)$, $\mathbf{r}_2 = (0, 3, 5)$, and $\mathbf{r}_3 = (1, 2, -1)$ satisfy the above equation, as we can easily verify:

$$\mathbf{r}_1 \cdot \mathbf{n} = (3, 0, 0) \cdot (1, 1, 0) = 3 \cdot 1 + 0 \cdot 1 + 0 \cdot 0 = 3.$$

$$\mathbf{r}_2 \cdot \mathbf{n} = (0, 3, 5) \cdot (1, 1, 0) = 0 \cdot 1 + 3 \cdot 1 + 5 \cdot 0 = 3.$$

$$\mathbf{r}_3 \cdot \mathbf{n} = (1, 2, -1) \cdot (1, 1, 0) = 1 \cdot 1 + 2 \cdot 1 + (-1) \cdot 0 = 3.$$

Lest you be sceptical that a plane could be described so simply, let's verify that two vectors on the plane are indeed orthogonal to the normal vector $\mathbf{n}$. First consider $\mathbf{r}_2 - \mathbf{r}_1 = (0, 3, 5) - (3, 0, 0) = (-3, 3, 5)$; this is a vector on the plane. We can verify that indeed

$$(\mathbf{r}_2 - \mathbf{r}_1) \cdot \mathbf{n} = (-3, 3, 5) \cdot (1, 1, 0) = -3 \cdot 1 + 3 \cdot 1 + 5 \cdot 0 = 0.$$

Next consider $\mathbf{p} - \mathbf{r}_3 = (1, 2, 3) - (1, 2, -1) = (0, 0, 4)$; this is also a vector on the plane. We can verify that indeed

$$(\mathbf{p} - \mathbf{r}_3) \cdot \mathbf{n} = (0, 0, 4) \cdot (1, 1, 0) = 0 \cdot 1 + 0 \cdot 1 + 4 \cdot 0 = 0.$$

Example 305. Consider a plane that contains the point $p = (0, 0, 1)$ and has normal vector $\mathbf{n} = (2, -1, 1)$. We compute

$$d = \mathbf{p} \cdot \mathbf{n} = (0, 0, 1) \cdot (2, -1, 1) = 0 \cdot 2 + 0 \cdot (-1) + 1 \cdot 1 = 1.$$

We thus conclude that the plane may be described by the vector equation $\mathbf{r} \cdot (2, -1, 1) = 1$.

This says that the plane contains exactly every point $r$ whose position vector $\mathbf{r}$ satisfies the above equation. For example, the points $r_1 = (1, 1, 0)$, $r_2 = (0, 1, 2)$, and $r_3 = (1, 2, 1)$ are on the plane, because their position vectors $\mathbf{r}_1 = (1, 1, 0)$, $\mathbf{r}_2 = (0, 1, 2)$, and $\mathbf{r}_3 = (1, 2, 1)$ satisfy the above equation, as we can easily verify:

$$\mathbf{r}_1 \cdot \mathbf{n} = (1, 1, 0) \cdot (2, -1, 1) = 1 \cdot 2 + 1 \cdot (-1) + 0 \cdot 1 = 1.$$

$$\mathbf{r}_2 \cdot \mathbf{n} = (0, 1, 2) \cdot (2, -1, 1) = 0 \cdot 2 + 1 \cdot (-1) + 2 \cdot 1 = 1.$$

$$\mathbf{r}_3 \cdot \mathbf{n} = (1, 2, 1) \cdot (2, -1, 1) = 1 \cdot 2 + 2 \cdot (-1) + 1 \cdot 1 = 1.$$

Lest you be sceptical that a plane could be described so simply, let's verify that two vectors on the plane are indeed orthogonal to the normal vector $\mathbf{n}$. First consider $\mathbf{r}_2 - \mathbf{r}_1 = (0, 1, 2) - (1, 1, 0) = (-1, 0, 2)$; this is a vector on the plane. We can verify that indeed

$$(\mathbf{r}_2 - \mathbf{r}_1) \cdot \mathbf{n} = (-1, 0, 2) \cdot (2, -1, 1) = (-1) \cdot 2 + 0 \cdot (-1) + 2 \cdot 1 = 0.$$

Next consider $\mathbf{p} - \mathbf{r}_3 = (0, 0, 1) - (1, 2, 1) = (-1, -2, 0)$; this is also a vector on the plane. We can verify that indeed

$$(\mathbf{p} - \mathbf{r}_3) \cdot \mathbf{n} = (-1, -2, 0) \cdot (2, -1, 1) = (-1) \cdot 2 + (-2) \cdot (-1) + 0 \cdot 1 = 0.$$

We just learnt how to write down the vector equation of a plane, given a point on the

plane and its normal vector.

We now do the same, but given instead three points on a plane.

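Example 306. A plane contains the points $a = (1, 2, 3)$, $b = (4, 5, 8)$, and $c = (2, 3, 5)$.

Both vectors $\overrightarrow{ab} = (3, 3, 5)$ and $\overrightarrow{ac} = (1, 1, 2)$ are on the plane. Hence, a normal vector to the plane is $\overrightarrow{ab} \times \overrightarrow{ac} = \mathbf{n} = (1, -1, 0)$.

Since $\mathbf{a} \cdot \mathbf{n} = -1$, the plane can be described by the vector equation $\mathbf{r} \cdot (1, -1, 0) = -1$.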

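Example 307. A plane contains the points $a = (1, 0, 0)$, $b = (0, 1, 0)$, and $c = (0, 0, 1)$.

Both vectors $\overrightarrow{ab} = (-1, 1, 0)$ and $\overrightarrow{ac} = (-1, 0, 1)$ are on the plane. Hence, a normal vector to the plane is $\overrightarrow{ab} \times \overrightarrow{ac} = \mathbf{n} = (1, 1, 1)$.

Since $\mathbf{a} \cdot \mathbf{n} = 1$, the plane can be described by the vector equation $\mathbf{r} \cdot (1, 1, 1) = 1$.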
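The recipe in these examples (take two vectors on the plane, cross them to get a normal vector, then compute d from a known point) can be sketched in code. The helper names below are invented for illustration, and integer coordinates are assumed so that results are exact.

```python
def cross(u, v):
    """Vector (cross) product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def plane_from_point_and_vectors(a, u, v):
    """Plane through point a containing the vectors u and v.
    Returns (n, d) for the vector equation r . n = d."""
    n = cross(u, v)
    if n == (0, 0, 0):
        raise ValueError("u and v are parallel, so they fix no plane")
    return n, dot(a, n)

def plane_from_points(a, b, c):
    """Plane through three non-collinear points a, b, c."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ac = tuple(ci - ai for ai, ci in zip(a, c))
    return plane_from_point_and_vectors(a, ab, ac)
```

For the three points a = (1, 2, 3), b = (4, 5, 8), c = (2, 3, 5), this sketch returns n = (1, -1, 0) and d = -1. The same helper `plane_from_point_and_vectors` also covers the case of two points and a vector on the plane, by passing the vector from one point to the other as `u`.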

We now write down the vector equation of a plane, given two points and a vector on the plane.

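Example 308. A plane contains the points $a = (0, 0, 3)$ and $b = (1, 4, 5)$, and the vector $\mathbf{v} = (3, 2, 1)$.

Both vectors $\overrightarrow{ab} = (1, 4, 2)$ and $\mathbf{v} = (3, 2, 1)$ are on the plane. Hence, a normal vector to the plane is $\overrightarrow{ab} \times \mathbf{v} = \mathbf{n} = (0, 5, -10)$.

Since $\mathbf{a} \cdot \mathbf{n} = -30$, the plane can be described by the vector equation $\mathbf{r} \cdot (0, 5, -10) = -30$.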

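Example 309. A plane contains the points $a = (8, -2, 0)$ and $b = (3, 6, 9)$, and the vector $\mathbf{v} = (0, 1, 1)$.

Both vectors $\overrightarrow{ab} = (-5, 8, 9)$ and $\mathbf{v} = (0, 1, 1)$ are on the plane. Hence, a normal vector to the plane is $\overrightarrow{ab} \times \mathbf{v} = \mathbf{n} = (-1, 5, -5)$.

Since $\mathbf{a} \cdot \mathbf{n} = -18$, the plane can be described by the vector equation $\mathbf{r} \cdot (-1, 5, -5) = -18$.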

32.1

Let $\mathbf{n} = (a, b, c)$ be the normal vector of a plane. Let $p = (p_1, p_2, p_3)$ be a point on the plane. Then the plane can be described by the vector equation

$$\mathbf{r} \cdot \mathbf{n} = \mathbf{p} \cdot \mathbf{n},$$

where $\mathbf{r} = (x, y, z)$ is the position vector of a generic point on the plane. Writing out the vectors in the above equation explicitly, we have:

$$(x, y, z) \cdot (a, b, c) = (p_1, p_2, p_3) \cdot (a, b, c) \quad \text{or} \quad ax + by + cz = ap_1 + bp_2 + cp_3.$$

This last equation is the cartesian equation description of the same plane. Note, once again, that $d = ap_1 + bp_2 + cp_3$ is simply some known number. So this cartesian equation simply says that the plane contains exactly those points $(x, y, z)$ that satisfy the equation

$$ax + by + cz = d.$$

Example 310. The plane with vector equation $\mathbf{r} \cdot (1, 1, 0) = 3$ has cartesian equation $x + y = 3$.

Example 311. The plane with vector equation $\mathbf{r} \cdot (2, -1, 1) = 1$ has cartesian equation $2x - y + z = 1$.

Example 312. The plane with vector equation $\mathbf{r} \cdot (1, -1, 0) = -1$ has cartesian equation $x - y = -1$.

Example 313. The plane with vector equation $\mathbf{r} \cdot (1, 1, 1) = 1$ has cartesian equation $x + y + z = 1$.

Example 314. The plane with vector equation $\mathbf{r} \cdot (0, 5, -10) = -30$ has cartesian equation $5y - 10z = -30$.

Example 315. The plane with vector equation $\mathbf{r} \cdot (-1, 5, -5) = -18$ has cartesian equation $-x + 5y - 5z = -18$.

It's thus surprisingly easy to go back and forth between a plane's vector and cartesian equations:

$$\mathbf{r} \cdot (a, b, c) = d \iff ax + by + cz = d.$$

Example 316. Given a plane with cartesian equation $2x + 3y + 5z = 7$, we immediately know that it has vector equation $\mathbf{r} \cdot (2, 3, 5) = 7$.

Example 317. Given a plane with cartesian equation $2x + 3z = 5$, we immediately know that it has vector equation $\mathbf{r} \cdot (2, 0, 3) = 5$.

Here's a nice observation: Every plane that contains the origin $(0, 0, 0)$ can be written in the form $ax + by + cz = 0$. Conversely, every plane that does not contain the origin can be written in the form $ax + by + cz = 1$. Formally:

Fact 37. A plane $\mathbf{r} \cdot \mathbf{n} = d$ contains the origin $\iff d = 0$.

Proof. Given a plane $\mathbf{r} \cdot \mathbf{n} = d$, the origin is on the plane (and thus satisfies this equation) $\iff \mathbf{0} \cdot \mathbf{n} = d \iff d = 0$.

SYLLABUS ALERT

The 9740 (old) List of Formulae contains the following statement, but not the 9758 (revised)

List of Formulae!

The plane through A with normal vector $\mathbf{n} = n_1\mathbf{i} + n_2\mathbf{j} + n_3\mathbf{k}$ has cartesian equation

$$n_1 x + n_2 y + n_3 z + d = 0, \quad \text{where } d = -\mathbf{a} \cdot \mathbf{n}.$$

Exercise 126. Find the vector and cartesian equations that describe the planes containing each of the following sets of three points: (a) a = (7, 3, 4), b = (8, 3, 4), and c = (9, 3, 7). (b) a = (8, 0, 2), b = (4, 4, 3), and c = (2, 7, 2). (c) a = (8, 5, 9), b = (8, 4, 5), and c = (5, 6, 0). (Answer on p. 1089.)

Exercise 127. Write down the vector equations of the planes whose cartesian equations are as given: (a) 3x + 2y + 5z = 3. (b) 2y + 5z = 3. (c) 5z = 3. (Answer on p. 1090.)

32.2

Example 318. The plane with vector equation $\mathbf{r} \cdot (1, 0, 1) = 11$ or cartesian equation $x + z = 11$ can be described in an infinite number of ways. For example, the same plane can also be described by any of the following four equations: $\mathbf{r} \cdot (2, 0, 2) = 22$, $\mathbf{r} \cdot (1/11, 0, 1/11) = 1$, $2x + 2z = 22$, and $x/11 + z/11 = 1$.

If you talk about the plane $\mathbf{r} \cdot (2, 0, 2) = 22$ and I talk about the plane $x/11 + z/11 = 1$, it may take us a moment to realise that we are talking about the exact same plane. To save ourselves such trouble, it may be desirable to describe planes in a standardised form, called the Hessian normal form.

This involves simply picking the unit normal vector $\hat{\mathbf{n}}$ as our normal vector. However, there are two possible unit normal vectors, one pointing up and the other pointing down. We will choose the unit normal vector that ensures that $\mathbf{p} \cdot \hat{\mathbf{n}} \geq 0$, so that the RHS of our vector or cartesian equation in Hessian normal form is always non-negative.

Example 319. Consider again the plane $\mathbf{r} \cdot (1, 0, 1) = 11$. The unit normal vector is $\frac{1}{\sqrt{2}}(1, 0, 1) = (\sqrt{2}/2, 0, \sqrt{2}/2)$. And so the plane can be rewritten in Hessian normal form as $\mathbf{r} \cdot (\sqrt{2}/2, 0, \sqrt{2}/2) = 11\sqrt{2}/2$.

Notice that in the Hessian normal form, the number $d = \mathbf{p} \cdot \hat{\mathbf{n}}$ is uniquely defined. Indeed, it is the distance of the plane from the origin! (We'll prove this in section 33.2.)

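Example 320. […] the unit normal vector is $\frac{1}{\sqrt{74}}(8, 1, 3) = (8/\sqrt{74}, 1/\sqrt{74}, 3/\sqrt{74})$. Note though that right now, the RHS is negative. So in order to ensure that $d \geq 0$ (as required by the Hessian normal form), we need simply reverse the sign of our unit normal vector; that is, we should pick $(-8/\sqrt{74}, -1/\sqrt{74}, -3/\sqrt{74})$ as our unit normal vector. Altogether then, the plane can be rewritten in Hessian normal form as

$$\mathbf{r} \cdot \left(-\frac{8}{\sqrt{74}}, -\frac{1}{\sqrt{74}}, -\frac{3}{\sqrt{74}}\right) = \frac{3}{\sqrt{74}} \quad \text{or} \quad -\frac{8}{\sqrt{74}}x - \frac{1}{\sqrt{74}}y - \frac{3}{\sqrt{74}}z = \frac{3}{\sqrt{74}}.$$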
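The conversion to Hessian normal form amounts to dividing by the length of the normal vector and, if necessary, flipping signs. Here is a minimal floating-point sketch; the function name `hessian_normal_form` is an assumption of this illustration, not something from the syllabus.

```python
import math

def hessian_normal_form(n, d):
    """Rewrite the plane r . n = d as r . n_hat = d_hat, where n_hat is a
    unit normal vector and d_hat >= 0."""
    length = math.sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("the normal vector must be non-zero")
    n_hat = tuple(c / length for c in n)
    d_hat = d / length
    if d_hat < 0:
        # Reverse the unit normal so that the right-hand side is non-negative.
        n_hat = tuple(-c for c in n_hat)
        d_hat = -d_hat
    return n_hat, d_hat

# The same plane described two ways gives the same Hessian normal form.
form_1 = hessian_normal_form((1, 0, 1), 11)
form_2 = hessian_normal_form((2, 0, 2), 22)
```

Both `form_1` and `form_2` come out (up to rounding) as the unit normal (√2/2, 0, √2/2) with right-hand side 11/√2. When d = 0 either unit normal is acceptable, and this sketch simply keeps the direction it was given.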

Exercise 128. Rewrite each of the following planes' vector equations into Hessian normal form. (a) $\mathbf{r} \cdot (3, 6, 2) = 4$. (b) $\mathbf{r} \cdot (1, 2, 2) = 1$. (c) $\mathbf{r} \cdot (8, 1, 4) = 0$. (Answer on p. 1090.)

33

Distances

Before we proceed, here are some useful things to remember. A line can be fully determined by

1. Any two distinct points.

2. Any non-zero vector and a point.

Similarly, a plane can be fully determined by

1. Any three distinct points that are not collinear.

2. Any two distinct points and a distinct vector.[39]

3. Two distinct (non-parallel) vectors and a point.

[39] If the two points are $a$ and $b$, then the vector must be distinct from $c\,\overrightarrow{ab}$, for any $c \in \mathbb{R}$.

33.1

Definition 83. The foot of the perpendicular from a point a to a line l is the point b on

the line l that is closest to the point a. The distance between the point a and the line l is

the length of the line segment ab.

[Figure: a point a, and a line l through the points p and b; the distance between a and b is marked.]

Note that the line ab must be perpendicular to the line l. Hence the name foot of the

perpendicular.

It is easier to remember how the proof of the following proposition works, than to try to

memorise the proposition itself:

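Proposition 8. Given a point $a$ and a line $r = p + \lambda\mathbf{v}$ (for $\lambda \in \mathbb{R}$),

(a) The distance between the point and the line is $\sqrt{|\overrightarrow{pa}|^2 - (\overrightarrow{pa} \cdot \hat{\mathbf{v}})^2}$; and

(b) The foot of the perpendicular from the point to the line is the point $p + (\overrightarrow{pa} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}$.

Proof. Let $b$ be the foot of the perpendicular from the point to the line.

(a) Pick any known point on the line; here the obvious choice is $p$. Consider the right-angled triangle $bpa$: it has hypotenuse of length $|\overrightarrow{pa}|$ and base of length $|\overrightarrow{pa} \cdot \hat{\mathbf{v}}|$ (refer to the diagram above). Hence, by the Pythagorean Theorem, the length of line segment $ab$ (or the distance between the point $a$ and the line $l$) is:

$$\sqrt{|\overrightarrow{pa}|^2 - (\overrightarrow{pa} \cdot \hat{\mathbf{v}})^2}, \text{ as desired.}$$

(b) The point $b$ is a distance $|\overrightarrow{pa} \cdot \hat{\mathbf{v}}|$ away from the point $p$, heading in the direction $\overrightarrow{pb}$. Hence

$$b = p + |\overrightarrow{pa} \cdot \hat{\mathbf{v}}|\,\widehat{pb}. \qquad (\star)$$

There are two possible cases to examine.

Case #1: $\hat{\mathbf{v}}$ and $\overrightarrow{pb}$ point in the same direction. Then $\widehat{pb} = \hat{\mathbf{v}}$ and $\overrightarrow{pa} \cdot \hat{\mathbf{v}} > 0$, so that $|\overrightarrow{pa} \cdot \hat{\mathbf{v}}| = \overrightarrow{pa} \cdot \hat{\mathbf{v}}$. Altogether then, $(\star)$ becomes $b = p + |\overrightarrow{pa} \cdot \hat{\mathbf{v}}|\,\widehat{pb} = p + (\overrightarrow{pa} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}$, as desired.

Case #2: $\hat{\mathbf{v}}$ and $\overrightarrow{pb}$ point in opposite directions. Then $\widehat{pb} = -\hat{\mathbf{v}}$ and $\overrightarrow{pa} \cdot \hat{\mathbf{v}} < 0$, so that $|\overrightarrow{pa} \cdot \hat{\mathbf{v}}| = -\overrightarrow{pa} \cdot \hat{\mathbf{v}}$. Altogether then, $(\star)$ becomes $b = p + (-\overrightarrow{pa} \cdot \hat{\mathbf{v}})(-\hat{\mathbf{v}}) = p + (\overrightarrow{pa} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}$, as desired.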

On p. 938 in the Appendices (optional), I give another proof of the above Proposition using

calculus. The idea of this second proof will be illustrated in the last two examples of this

section.
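Proposition 8 translates almost directly into code. The sketch below is only an illustration under stated assumptions: the function name `foot_and_distance` is made up, coordinates are assumed to be integers (or `Fraction`s), and the unit vector is avoided by writing the projection coefficient as (pa · v)/(v · v), which equals (pa · v̂)/|v|.

```python
import math
from fractions import Fraction

def dot(u, v):
    """Scalar (dot) product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def foot_and_distance(a, p, v):
    """Foot of the perpendicular from point a to the line r = p + lambda*v,
    and the distance from a to that line."""
    vv = dot(v, v)
    if vv == 0:
        raise ValueError("the direction vector must be non-zero")
    pa = tuple(ai - pi for ai, pi in zip(a, p))  # vector from p to a
    t = Fraction(dot(pa, v), vv)                 # parameter lambda of the foot
    foot = tuple(pi + t * vi for pi, vi in zip(p, v))
    # Pythagoras: |pa|^2 minus the squared length of the projection onto v.
    dist_sq = dot(pa, pa) - Fraction(dot(pa, v) ** 2, vv)
    return foot, math.sqrt(dist_sq)
```

For a = (1, 2, 3) and the line r = (0, 1, 2) + λ(9, 1, 3), the sketch gives the foot (9/7, 8/7, 17/7) and a distance of about 1.069. Because the sign of pa · v is carried by `t`, the two cases in the proof need no separate handling.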


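Example 321. Consider the point $a = (1, 2, 3)$ and the line $r = (0, 1, 2) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$).

Pick a point on the line, say $p = (0, 1, 2)$. We have $\overrightarrow{pa} = (1, 1, 1)$ and so $|\overrightarrow{pa}|^2 = 1^2 + 1^2 + 1^2 = 3$. Also,

$$\overrightarrow{pa} \cdot \hat{\mathbf{v}} = \frac{(1, 1, 1) \cdot (9, 1, 3)}{\sqrt{9^2 + 1^2 + 3^2}} = \frac{13}{\sqrt{91}}.$$

And so $(\overrightarrow{pa} \cdot \hat{\mathbf{v}})^2 = 169/91$. Hence, the length of the side $ab$ is

$$\sqrt{3 - \frac{169}{91}} = \sqrt{\frac{104}{91}} = \sqrt{\frac{8}{7}} \approx 1.069.$$

And the foot of the perpendicular is

$$b = (0, 1, 2) + \frac{13}{\sqrt{91}} \cdot \frac{(9, 1, 3)}{\sqrt{91}} = \frac{1}{7}(9, 8, 17).$$

[Figure (not to scale): the point a = (1, 2, 3), the line l through p = (0, 1, 2) and b = (1/7)(9, 8, 17); the distance between a and b is 1.069.]

Note that in this example, $\mathbf{v}$ and $\overrightarrow{pb}$ do point in the same direction and we have $\overrightarrow{pa} \cdot \hat{\mathbf{v}} > 0$. In contrast, in the next example, $\mathbf{v}$ and $\overrightarrow{pb}$ will point in opposite directions and we will have $\overrightarrow{pa} \cdot \hat{\mathbf{v}} < 0$.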

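Example 322. Consider the point $a = (-1, 0, 1)$ and the line $r = (3, 2, 1) + \lambda(5, 1, 2)$ ($\lambda \in \mathbb{R}$).

Pick a point on the line, say $p = (3, 2, 1)$. We have $\overrightarrow{pa} = (-4, -2, 0)$ and so $|\overrightarrow{pa}|^2 = 4^2 + 2^2 + 0^2 = 20$. Also,

$$\overrightarrow{pa} \cdot \hat{\mathbf{v}} = \frac{(-4, -2, 0) \cdot (5, 1, 2)}{\sqrt{5^2 + 1^2 + 2^2}} = -\frac{22}{\sqrt{30}}.$$

(As noted, $\overrightarrow{pa} \cdot \hat{\mathbf{v}} < 0$ and sure enough, $\mathbf{v}$ and $\overrightarrow{pb}$ point in opposite directions.) So $(\overrightarrow{pa} \cdot \hat{\mathbf{v}})^2 = 484/30$. Hence, the length of the side $ab$ is

$$\sqrt{20 - \frac{484}{30}} = \sqrt{\frac{116}{30}} = \sqrt{\frac{58}{15}} \approx 1.966.$$

And the foot of the perpendicular is

$$b = (3, 2, 1) - \frac{22}{\sqrt{30}} \cdot \frac{(5, 1, 2)}{\sqrt{30}} = \frac{1}{15}(-10, 19, -7).$$

[Figure (not to scale): the point a = (-1, 0, 1), the line l through p = (3, 2, 1) and b = (1/15)(-10, 19, -7), with the distance between a and b marked.]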

Exercise 129. For each of the following, find (i) the distance between the given point $a$ and the given line $l$; and also (ii) the point $b$ on the line that is closest to $a$. (a) The point $a = (7, 3, 4)$ and the line $l$ described by $r = (8, 3, 4) + \lambda(9, 3, 7)$. (b) The point $a = (8, 0, 2)$ and the line $l$ described by $r = (4, 4, 3) + \lambda(2, 7, 2)$. (c) The point $a = (8, 5, 9)$ and the line $l$ described by $r = (8, 4, 5) + \lambda(5, 6, 0)$. (Answers on pp. 1091, 1092, and 1093.)

We now learn a second method for finding the foot of a perpendicular and hence the

distance of a point to a line. This second method involves calculus and finding the minimum

point. It also occasionally features on the A-level exams.

Example 323. Consider the point $a = (1, 2, 3)$ and the line $r = (0, 1, 2) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$). The distance between $a$ and a generic point $r$ on the line is

$$|a - r| = \left|(1, 2, 3) - (9\lambda, 1 + \lambda, 2 + 3\lambda)\right| = \left|(1 - 9\lambda, 1 - \lambda, 1 - 3\lambda)\right| = \sqrt{(1 - 9\lambda)^2 + (1 - \lambda)^2 + (1 - 3\lambda)^2} = \sqrt{91\lambda^2 - 26\lambda + 3}.$$

Each value of $\lambda$ gives us another point $r = (0, 1, 2) + \lambda(9, 1, 3)$ of the line. As $\lambda$ varies, the distance between the point $a$ and the corresponding point $r$ on the line is $\sqrt{91\lambda^2 - 26\lambda + 3}$.

Our goal is to find the point $r$ on the line that is closest to the point $a$. In other words, our goal is to minimise $\sqrt{91\lambda^2 - 26\lambda + 3}$.

To simplify matters, note that minimising $\sqrt{91\lambda^2 - 26\lambda + 3}$ is the same as minimising $91\lambda^2 - 26\lambda + 3$. So we might as well look for the minimum point of $91\lambda^2 - 26\lambda + 3$. To this end:

$$\frac{d}{d\lambda}\left(91\lambda^2 - 26\lambda + 3\right) = 182\lambda - 26 \overset{\text{set}}{=} 0 \iff \lambda = \frac{26}{182} = \frac{1}{7}.$$

Altogether then, the point $b$ on the line $l$ that is closest to the point $a$ has parameter $\lambda = 1/7$. So $b = (0, 1, 2) + \frac{1}{7}(9, 1, 3) = \frac{1}{7}(9, 8, 17)$.

And the distance between $a$ and $l$ (or equivalently, the length of the line segment $ab$) is

$$\sqrt{91\lambda^2 - 26\lambda + 3} = \sqrt{91\left(\frac{1}{7}\right)^2 - 26\left(\frac{1}{7}\right) + 3} = \sqrt{\frac{8}{7}}.$$

Of course, these are the same as what we found in Example 321 a few pages ago.

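Example 324. Consider the point $a = (-1, 0, 1)$ and the line described by the vector equation $r = (3, 2, 1) + \lambda(5, 1, 2)$ ($\lambda \in \mathbb{R}$). The distance between $a$ and a generic point $r$ on the line is

$$|a - r| = \left|(-1, 0, 1) - (3 + 5\lambda, 2 + \lambda, 1 + 2\lambda)\right| = \left|(-4 - 5\lambda, -2 - \lambda, -2\lambda)\right| = \sqrt{30\lambda^2 + 44\lambda + 20}.$$

As before, minimising this distance is the same as minimising $30\lambda^2 + 44\lambda + 20$. To this end:

$$\frac{d}{d\lambda}\left(30\lambda^2 + 44\lambda + 20\right) = 60\lambda + 44 \overset{\text{set}}{=} 0 \iff \lambda = -\frac{44}{60} = -\frac{11}{15}.$$

So the point $b$ on the line $l$ that is closest to the point $a$ is $b = (3, 2, 1) - \frac{11}{15}(5, 1, 2) = \frac{1}{15}(-10, 19, -7)$.

And the distance between $a$ and $l$ (or equivalently, the length of the line segment $ab$) is

$$\sqrt{30\lambda^2 + 44\lambda + 20} = \sqrt{30\left(-\frac{11}{15}\right)^2 + 44\left(-\frac{11}{15}\right) + 20} = \sqrt{\frac{58}{15}}.$$

Of course, these are the same as what we found in Example 322 a few pages ago.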
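The calculus method can also be sketched numerically. The squared distance is always a quadratic Aλ² + Bλ + C in λ, and setting its derivative 2Aλ + B to zero gives the minimising λ. The function name `closest_by_calculus` below is hypothetical, and exact integer inputs are assumed.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar (dot) product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def closest_by_calculus(a, p, v):
    """Minimise |a - (p + lambda*v)|^2 = A*lambda^2 + B*lambda + C.
    Returns the coefficients (A, B, C), the minimising lambda, the closest
    point on the line, and the minimum squared distance."""
    pa = tuple(ai - pi for ai, pi in zip(a, p))
    A = dot(v, v)
    B = -2 * dot(pa, v)
    C = dot(pa, pa)
    if A == 0:
        raise ValueError("the direction vector must be non-zero")
    lam = Fraction(-B, 2 * A)  # solves d/d(lambda) = 2*A*lambda + B = 0
    closest = tuple(pi + lam * vi for pi, vi in zip(p, v))
    min_sq = A * lam ** 2 + B * lam + C
    return (A, B, C), lam, closest, min_sq
```

For a = (1, 2, 3) and r = (0, 1, 2) + λ(9, 1, 3), this gives the quadratic 91λ² − 26λ + 3, λ = 1/7, and a minimum squared distance of 8/7. Since A = v · v is positive, the stationary point is indeed a minimum.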

Exercise 130. For each of the following, use the second method (calculus) to find (i) the distance between the given point $a$ and the given line $l$; and also (ii) the point $b$ on the line that is closest to $a$. (a) The point $a = (7, 3, 4)$ and the line $l$ described by $r = (8, 3, 4) + \lambda(9, 3, 7)$. (b) The point $a = (8, 0, 2)$ and the line $l$ described by $r = (4, 4, 3) + \lambda(2, 7, 2)$. (c) The point $a = (8, 5, 9)$ and the line $l$ described by $r = (8, 4, 5) + \lambda(5, 6, 0)$. (Answers on p. 1094.)

33.2

Definition 84. The foot of the perpendicular from a point a to a plane P is the point b on

the plane P that is closest to the point a. The distance between the point a and the plane

P is the length of the line segment ab.

[Figure: a point a above a plane containing the points p and b; the distance between a and b is marked.]

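Proposition 9. Given a point $a$ (with position vector $\mathbf{a}$) and a plane given in Hessian normal form $\mathbf{r} \cdot \hat{\mathbf{n}} = d$,

(a) The distance between the point and the plane is $|d - \mathbf{a} \cdot \hat{\mathbf{n}}|$; and

(b) The foot of the perpendicular from the point to the plane is the point $a + (d - \mathbf{a} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}}$.

Proof. Let $b$ be the foot of the perpendicular from the point to the plane.

(a) Pick any point $p$ on the plane. The length of the line segment $ab$, and hence also the distance between the point and the plane, is simply the length of the projection of $\overrightarrow{ap}$ on the plane's normal vector, which is simply $|\overrightarrow{ap} \cdot \hat{\mathbf{n}}| = |(\mathbf{p} - \mathbf{a}) \cdot \hat{\mathbf{n}}| = |d - \mathbf{a} \cdot \hat{\mathbf{n}}|$, as desired.

(b) The point $b$ is a distance $|d - \mathbf{a} \cdot \hat{\mathbf{n}}|$ away from the point $a$, heading in the direction $\overrightarrow{ab}$. Hence

$$b = a + |d - \mathbf{a} \cdot \hat{\mathbf{n}}|\,\widehat{ab}. \qquad (\star)$$

Case #1: If $\hat{\mathbf{n}}$ is pointing in the same direction as $\overrightarrow{ab}$, then $\hat{\mathbf{n}} = \widehat{ab}$. Moreover $\overrightarrow{ap} \cdot \hat{\mathbf{n}} = d - \mathbf{a} \cdot \hat{\mathbf{n}} > 0$, so that $|d - \mathbf{a} \cdot \hat{\mathbf{n}}| = d - \mathbf{a} \cdot \hat{\mathbf{n}}$. Altogether then, $(\star)$ becomes $b = a + (d - \mathbf{a} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}}$, as desired.

Case #2: If $\hat{\mathbf{n}}$ and $\overrightarrow{ab}$ are pointing in opposite directions, then $-\hat{\mathbf{n}} = \widehat{ab}$. Moreover $\overrightarrow{ap} \cdot \hat{\mathbf{n}} = d - \mathbf{a} \cdot \hat{\mathbf{n}} < 0$, so that $|d - \mathbf{a} \cdot \hat{\mathbf{n}}| = -(d - \mathbf{a} \cdot \hat{\mathbf{n}})$. Altogether then, $(\star)$ becomes $b = a - (d - \mathbf{a} \cdot \hat{\mathbf{n}})(-\hat{\mathbf{n}}) = a + (d - \mathbf{a} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}}$, as desired.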

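Example 325. Consider the point $a = (1, 2, 3)$ and the plane described by the vector equation $\mathbf{r} \cdot (1, 1, 1) = 3$.

Convert the vector equation of the plane to Hessian normal form:

$$\mathbf{r} \cdot \left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right) = \frac{3}{\sqrt{3}} = \sqrt{3}.$$

So $\hat{\mathbf{n}} = \frac{\sqrt{3}}{3}(1, 1, 1)$, $d = \sqrt{3}$, and $\mathbf{a} \cdot \hat{\mathbf{n}} = 2\sqrt{3}$.

Altogether then, the distance between the point and the plane is $|d - \mathbf{a} \cdot \hat{\mathbf{n}}| = |\sqrt{3} - 2\sqrt{3}| = \sqrt{3}$ and the foot of the perpendicular is

$$a + (d - \mathbf{a} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}} = (1, 2, 3) + (\sqrt{3} - 2\sqrt{3})\frac{\sqrt{3}}{3}(1, 1, 1) = (0, 1, 2).$$

By the way, notice that in this example, $\hat{\mathbf{n}}$ points in the opposite direction from $\overrightarrow{ab}$. And so $\widehat{ab} = -\hat{\mathbf{n}}$. And moreover, $d - \mathbf{a} \cdot \hat{\mathbf{n}} < 0$.

[Figure (not to scale): the point a = (1, 2, 3), the plane, and the foot of the perpendicular b = (0, 1, 2), with the distance between a and b marked.]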

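Example 326. Consider the point $a = (0, 0, 0)$ and the plane described by the vector equation $r \cdot (1, 2, 3) = 32$.

Convert the vector equation of the plane to Hessian normal form:

\[ r \cdot \left( \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{3}{\sqrt{14}} \right) = \frac{32}{\sqrt{14}}. \]

So $\hat{n} = \frac{1}{\sqrt{14}}(1, 2, 3)$, $d = \frac{32}{\sqrt{14}}$, and $a \cdot \hat{n} = 0$.

Altogether then, the distance between the point and the plane is $|d - a \cdot \hat{n}| = \left| \frac{32}{\sqrt{14}} - 0 \right| = \frac{32}{\sqrt{14}}$ and the foot of the perpendicular is

\[ a + (d - a \cdot \hat{n}) \hat{n} = (0, 0, 0) + \left( \frac{32}{\sqrt{14}} - 0 \right) \frac{1}{\sqrt{14}}(1, 2, 3) = \frac{16}{7}(1, 2, 3). \]

By the way, notice that in this example, $\hat{n}$ points in the same direction as $\overrightarrow{ab}$. And so $\widehat{ab} = \hat{n}$. And moreover, $d - a \cdot \hat{n} > 0$.

[Figure (not to scale): the point a = (0, 0, 0), the plane, a point p = (4, 5, 6) on the plane, the foot of the perpendicular b = (16/7)(1, 2, 3), and the distance between a and b.]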
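The recipe in the last two examples (convert to Hessian normal form, then read off the distance and the foot of the perpendicular) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `distance_and_foot` and its argument order are assumptions made here, not notation from the text.

```python
import math

def distance_and_foot(a, n, d):
    """Point a and plane r . n = d (n need not be a unit vector).

    Returns (distance from a to the plane, foot of the perpendicular).
    """
    norm = math.sqrt(sum(c * c for c in n))
    n_hat = [c / norm for c in n]      # unit normal vector
    d_hat = d / norm                   # right-hand side in Hessian normal form
    gap = d_hat - sum(x * y for x, y in zip(a, n_hat))   # d - a . n_hat
    foot = [x + gap * y for x, y in zip(a, n_hat)]       # a + (d - a . n_hat) n_hat
    return abs(gap), foot

# Point (1, 2, 3) and plane r . (1, 1, 1) = 3
dist, foot = distance_and_foot((1, 2, 3), (1, 1, 1), 3)
print(dist, foot)
```

The sign of `gap` tells us whether the unit normal points from a towards the plane (positive) or away from it (negative).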


Exercise 131. For each of the following, find (i) the distance between the given point $a$ and the given plane $P$; and also (ii) the point $b$ on the plane that is closest to $a$. (a) $a = (7, 3, 4)$, $P$: $r \cdot (9, 3, 7) = 109$. (b) $a = (8, 0, 2)$, $P$: $r \cdot (2, 7, 2) = 42$. (c) $a = (8, 5, 9)$, $P$: $r \cdot (5, 6, 0) = 64$. (Answers on pp. 1095, 1096, and 1097.)


34

34.1

Angles

Consider two lines on the 2D cartesian plane that are parallel (and thus either do not

intersect or are identical). We define the angle between them to be 0.


Now consider two lines that intersect (see diagram below). Taking their intersection point

to be the vertex, A and B are, respectively, the acute and obtuse angles between the two

lines. Of course, there is the possibility that the two lines are perpendicular, in which case

A and B are both right (i.e. equal to $\pi/2$).

So when talking about the angle between two lines, there is some potential for confusion.

Are we talking about angle A or angle B?

By convention, the angle between two lines is the smaller angle. (Also, on the A-level exams, they are usually quite careful to specify that they want the acute angle, so that there is no confusion.)


If we have the direction vectors of the lines, then we can simply use what we learnt about

the scalar product to compute the angle between them.

Example 327. Consider the lines (on the 2D cartesian plane) $r = (1, 3) + \lambda(2, 1)$ and $r = (1, 1) + \mu(1, 3)$ ($\lambda, \mu \in \mathbb{R}$). The angle between their direction vectors $v_1 = (2, 1)$ and $v_2 = (1, 3)$ is given by

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(2, 1) \cdot (1, 3)}{|(2, 1)||(1, 3)|} \right) = \cos^{-1}\left( \frac{5}{\sqrt{5}\sqrt{10}} \right) \approx 0.785. \]

[Figure: the two lines on the x-y plane, together with the vectors (2, 1) and (1, 3); the angle between the lines, and between the vectors, is A = 0.785.]


Example 328. Consider the lines (on the 2D cartesian plane) $r = (0, 0) + \lambda(-2, 3)$ and $r = (1, 0) + \mu(3, 1)$ ($\lambda, \mu \in \mathbb{R}$). The angle between their direction vectors $v_1 = (-2, 3)$ and $v_2 = (3, 1)$ is given by

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(-2, 3) \cdot (3, 1)}{|(-2, 3)||(3, 1)|} \right) = \cos^{-1}\left( \frac{-3}{\sqrt{13}\sqrt{10}} \right) \approx 1.837. \]

This is the obtuse angle between the two lines. So the acute angle between the two lines is $A = \pi - 1.837 \approx 1.305$.

[Figure: the lines r = (0, 0) + λ(-2, 3) and r = (1, 0) + μ(3, 1) on the x-y plane, with the acute angles A = 1.305 and the obtuse angles B = 1.837 marked at their intersection.]


Example 329. Consider the lines (on the 2D cartesian plane) $r = (2, -2) + \lambda(3, 3)$ and $r = (1, 1) + \mu(-1, -1)$ ($\lambda, \mu \in \mathbb{R}$). The angle between their direction vectors $v_1 = (3, 3)$ and $v_2 = (-1, -1)$ is given by

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(3, 3) \cdot (-1, -1)}{|(3, 3)||(-1, -1)|} \right) = \cos^{-1}\left( \frac{-6}{\sqrt{18}\sqrt{2}} \right) = \cos^{-1}\left( \frac{-6}{6} \right) = \pi. \]

So the two vectors are parallel (they point in exactly opposite directions). Which means that the two lines are parallel and so by definition, the angle between the two lines is 0.

[Figure: the two lines on the x-y plane; they are parallel and thus the angle between them is 0.]


Exercise 132. Find the acute angle between each of the following pairs of lines. (a) $r = (1, 2) + \lambda(1, 1)$ and $r = (0, 0) + \mu(2, 3)$ ($\lambda, \mu \in \mathbb{R}$). (b) $r = (1, 2) + \lambda(1, 5)$ and $r = (0, 0) + \mu(8, 1)$ ($\lambda, \mu \in \mathbb{R}$). (c) $r = (1, 2) + \lambda(2, 6)$ and $r = (0, 0) + \mu(3, 2)$ ($\lambda, \mu \in \mathbb{R}$). (Answer on p. 1098.)


34.2

Visualising lines in 3D space is difficult. Which is why we tackled the 2D case first.

It turns out that we compute angles between two lines in 3D space in exactly the same way

as in the 2D case.

1. If two lines are parallel, then again we define the angle between them to be 0.

2. If two lines intersect, then again we take their intersection point to be the vertex and

take the smaller angle formed to be the angle between the two lines.

On the 2D cartesian plane, the above were the only two possibilities: two lines either are parallel or intersect. In contrast, in 3D space, there is the third possibility that two lines neither are parallel nor intersect! As we'll learn in section 35.1, any two lines that neither are parallel nor intersect are called skew lines.

What is the angle between two skew lines, given that they do not intersect?

3. Given two skew lines, translate one of them so that they intersect. Examine the angle

between the two now-intersecting lines. This is defined to be the angle between the two

skew lines.

The next example illustrates.


Example 330. Below, the red line and pink line are skew lines, i.e., they neither intersect

nor are parallel. To find the angle between them, translate the red line upwards so that the

new red dotted line intersects the pink line at the purple dot. The angle A is then defined

to be the angle between the two skew lines.

[Figure: two skew lines (which neither intersect nor are parallel) drawn in 3D space with x-, y-, and z-axes. Translate one of the lines so that they intersect.]

So once again, given any two lines, the angle between them is simply the angle between

their direction vectors. So again the scalar product comes in handy.
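As a rough sketch of this computation, the Python function below takes two direction vectors (in 2D or 3D) and returns the smaller angle between the corresponding lines. The name `acute_angle_between_lines` is an assumption made here for illustration. Taking the absolute value of the scalar product is a shortcut that returns the smaller angle directly, instead of first finding a possibly obtuse angle and then subtracting it from π.

```python
import math

def acute_angle_between_lines(v1, v2):
    """Smaller angle (in radians) between lines with direction vectors v1 and v2."""
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    cos_theta = abs(dot) / (norm1 * norm2)
    # The clamp guards against rounding that pushes the ratio slightly above 1.
    return math.acos(min(1.0, cos_theta))

print(round(acute_angle_between_lines((2, 1), (1, 3)), 3))        # a 2D pair
print(round(acute_angle_between_lines((9, 1, 3), (3, 2, 1)), 3))  # a 3D pair
```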


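Example 331. Consider the lines $r = (0, 1, 2) + \lambda(9, 1, 3)$ and $r = (4, 5, 6) + \mu(3, 2, 1)$ ($\lambda, \mu \in \mathbb{R}$). The angle between their direction vectors $v_1 = (9, 1, 3)$ and $v_2 = (3, 2, 1)$ is given by

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(9, 1, 3) \cdot (3, 2, 1)}{|(9, 1, 3)||(3, 2, 1)|} \right) = \cos^{-1}\left( \frac{32}{\sqrt{91}\sqrt{14}} \right) \approx 0.459. \]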

Example 332. Consider the lines $r = (1, 2, 3) + \lambda(0, 1, 0)$ and $r = (0, 0, 0) + \mu(8, -3, 5)$ ($\lambda, \mu \in \mathbb{R}$). Their direction vectors are $v_1 = (0, 1, 0)$ and $v_2 = (8, -3, 5)$. Thus,

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(0, 1, 0) \cdot (8, -3, 5)}{|(0, 1, 0)||(8, -3, 5)|} \right) = \cos^{-1}\left( \frac{-3}{\sqrt{1}\sqrt{98}} \right) \approx 1.879. \]

So the obtuse angle between the two lines is 1.879. And the angle between the two lines is $\pi - 1.879 \approx 1.263$.

Example 333. Consider the lines $r = (1, 3, 3) + \lambda(1, 5, 3)$ and $r = (7, 4, 7) + \mu(7, -2, 1)$ ($\lambda, \mu \in \mathbb{R}$). Their direction vectors are $v_1 = (1, 5, 3)$ and $v_2 = (7, -2, 1)$. Thus,

\[ \theta = \cos^{-1}\left( \frac{v_1 \cdot v_2}{|v_1||v_2|} \right) = \cos^{-1}\left( \frac{(1, 5, 3) \cdot (7, -2, 1)}{|(1, 5, 3)||(7, -2, 1)|} \right) = \cos^{-1}(0) = 0.5\pi. \]

So the two lines are perpendicular and the angle between them is right (i.e. $\pi/2$).

Exercise 133. Find the angle between each of the following pairs of lines. (a) $r = (1, 2, 3) + \lambda(1, 1, 0)$ and $r = (0, 0, 0) + \mu(2, 3, 4)$ ($\lambda, \mu \in \mathbb{R}$). (b) $r = (1, 2, 3) + \lambda(1, 5, 6)$ and $r = (0, 0, 0) + \mu(8, 1, 1)$ ($\lambda, \mu \in \mathbb{R}$). (c) $r = (1, 2, 3) + \lambda(2, 6, 7)$ and $r = (0, 0, 0) + \mu(3, 2, 1)$ ($\lambda, \mu \in \mathbb{R}$). (Answer on p. 1098.)


34.3

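Fact 38. The angle $A$ between the line $r = p + \lambda v$ and the plane $r \cdot n = d$ is given by

\[ A = \sin^{-1} \frac{|v \cdot n|}{|v||n|}. \]

[Figure: two panels, each showing a line, a plane, and the direction vector of the line. In one panel $\theta$ is obtuse and $A = \theta - 0.5\pi$; in the other $\theta$ is acute and $A = 0.5\pi - \theta$.]

Proof. Consider the angle $\theta$ between the line's direction vector and the plane's normal vector. By the scalar product, we have

\[ \cos\theta = \frac{v \cdot n}{|v||n|}. \]

Case #1: If $\theta$ is acute or right, i.e. $\theta \in (0, \pi/2]$, then the angle $A$ between the line and the plane is $A = \pi/2 - \theta$ (see figure). And so

\[ \sin A = \sin\left( \frac{\pi}{2} - \theta \right) = \sin\left( \frac{\pi}{2} \right)\cos\theta - \sin\theta\cos\left( \frac{\pi}{2} \right) = \cos\theta = \frac{v \cdot n}{|v||n|}. \]

Note that if $\theta \in (0, \pi/2]$, then $v \cdot n \geq 0$, so that $|v \cdot n| = v \cdot n$. Altogether, we indeed have

\[ \sin A = \frac{|v \cdot n|}{|v||n|} \quad \text{or} \quad A = \sin^{-1} \frac{|v \cdot n|}{|v||n|}. \]

Case #2: If $\theta$ is obtuse or straight, i.e. $\theta \in (\pi/2, \pi]$, then the angle $A$ between the line and the plane is $A = \theta - 0.5\pi$ (see figure). And so

\[ \sin A = \sin\left( \theta - \frac{\pi}{2} \right) = \sin\theta\cos\left( \frac{\pi}{2} \right) - \sin\left( \frac{\pi}{2} \right)\cos\theta = -\cos\theta = \frac{-v \cdot n}{|v||n|}. \]

Note that if $\theta \in (\pi/2, \pi]$, then $v \cdot n < 0$, so that $|v \cdot n| = -v \cdot n$. Altogether, we indeed have

\[ \sin A = \frac{|v \cdot n|}{|v||n|} \quad \text{or} \quad A = \sin^{-1} \frac{|v \cdot n|}{|v||n|}. \]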


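Example 334. The angle between the line $r = (0, 1, 2) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (1, 1, 1) = 3$ is

\[ \sin^{-1} \frac{|v \cdot n|}{|v||n|} = \sin^{-1} \frac{|(9, 1, 3) \cdot (1, 1, 1)|}{|(9, 1, 3)||(1, 1, 1)|} = \sin^{-1} \frac{13}{\sqrt{91}\sqrt{3}} = \sin^{-1} \frac{\sqrt{13}}{\sqrt{7}\sqrt{3}} \approx 0.906. \]

Example 335. The angle between the line $r = (4, 2, 3) + \lambda(1, 0, 1)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (1, 1, 0) = 5$ is

\[ \sin^{-1} \frac{|v \cdot n|}{|v||n|} = \sin^{-1} \frac{|(1, 0, 1) \cdot (1, 1, 0)|}{|(1, 0, 1)||(1, 1, 0)|} = \sin^{-1} \frac{1}{\sqrt{2}\sqrt{2}} = \sin^{-1}(1/2) = \pi/6. \]

Example 336. The angle between the line $r = (5, 5, 5) + \lambda(1, 0, 1)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (0, 1, 0) = 3$ is

\[ \sin^{-1} \frac{|v \cdot n|}{|v||n|} = \sin^{-1} \frac{|(1, 0, 1) \cdot (0, 1, 0)|}{|(1, 0, 1)||(0, 1, 0)|} = \sin^{-1} 0 = 0. \]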
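Fact 38 translates almost word for word into code. The sketch below is illustrative only; the function name `angle_line_plane` is an assumption made here, and the inputs are a line's direction vector and a plane's normal vector.

```python
import math

def angle_line_plane(v, n):
    """Angle (in radians) between a line with direction vector v
    and a plane with normal vector n, via A = asin(|v . n| / (|v| |n|))."""
    dot = sum(x * y for x, y in zip(v, n))
    norm_v = math.sqrt(sum(x * x for x in v))
    norm_n = math.sqrt(sum(x * x for x in n))
    ratio = abs(dot) / (norm_v * norm_n)
    return math.asin(min(1.0, ratio))   # clamp guards against rounding above 1

print(angle_line_plane((9, 1, 3), (1, 1, 1)))
print(angle_line_plane((1, 0, 1), (0, 1, 0)))
```

Note that neither the point p on the line nor the constant d of the plane enters the computation: translating the line or the plane does not change the angle.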

Exercise 134. Find the angle between the given line and plane. (a) $r = (1, 2, 3) + \lambda(1, 1, 0)$ ($\lambda \in \mathbb{R}$) and $r \cdot (3, 4, 5) = 0$. (b) $r = (1, 2, 3) + \lambda(0, 2, 6)$ ($\lambda \in \mathbb{R}$) and $r \cdot (1, 3, 5) = 2$. (c) $r = (1, 2, 3) + \lambda(1, 9, 8)$ ($\lambda \in \mathbb{R}$) and $r \cdot (2, 8, 2) = 3$. (Answer on p. 1099.)


34.4

Given two planes $P_1$ and $P_2$, the angle between them is simply the angle between two vectors $v_1$ and $v_2$, one on each plane, that are both perpendicular to the line along which the two planes intersect.

[Figure: the planes P1 and P2, with a vector v1 on P1, a vector v2 on P2, and the normal vectors n1 and n2. The angle between the two planes equals the angle between the two planes' normal vectors.]

But the normal vector $n_1$ of the first plane is orthogonal to $v_1$; similarly, the normal vector $n_2$ of the second plane is orthogonal to $v_2$. And so the angle between $v_1$ and $v_2$ is equal to the angle between $n_1$ and $n_2$ (or to its supplement, depending on which way the normal vectors point).

Altogether then, the angle between two planes is simply the angle between their normal vectors.

Again, there are two possible angles; by convention, we take the smaller one.


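Example 337. Consider the planes $r \cdot (-1, 1, 1) = 12$ and $r \cdot (1, -1, 0) = -1$. The angle between the two planes is:

\[ \theta = \cos^{-1}\left( \frac{n_1 \cdot n_2}{|n_1||n_2|} \right) = \cos^{-1}\left( \frac{(-1, 1, 1) \cdot (1, -1, 0)}{|(-1, 1, 1)||(1, -1, 0)|} \right) = \cos^{-1}\left( \frac{-2}{\sqrt{3}\sqrt{2}} \right) = \cos^{-1}\left( \frac{-2}{\sqrt{6}} \right) \approx 2.526. \]

This is the obtuse angle. So the acute angle between the two planes is $\pi - 2.526 \approx 0.615$ radian.

Example 338. Consider the planes $r \cdot (2, 1, 3) = 26$ and $r \cdot (-3, 0, 5) = 25$. The angle between the two planes is

\[ \theta = \cos^{-1}\left( \frac{n_1 \cdot n_2}{|n_1||n_2|} \right) = \cos^{-1}\left( \frac{(2, 1, 3) \cdot (-3, 0, 5)}{|(2, 1, 3)||(-3, 0, 5)|} \right) = \cos^{-1}\left( \frac{9}{\sqrt{14}\sqrt{34}} \right) \approx 1.146. \]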
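Since the scalar product of the two normal vectors may give either the acute or the obtuse angle, it can be handy to compute both at once. The sketch below does this; the function name `angles_between_planes` and the sample normal vectors are assumptions chosen here purely for illustration.

```python
import math

def angles_between_planes(n1, n2):
    """Return (smaller angle, larger angle), in radians, between two planes
    with normal vectors n1 and n2."""
    dot = sum(x * y for x, y in zip(n1, n2))
    norm1 = math.sqrt(sum(x * x for x in n1))
    norm2 = math.sqrt(sum(x * x for x in n2))
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))  # clamp for rounding
    theta = math.acos(cos_theta)
    return min(theta, math.pi - theta), max(theta, math.pi - theta)

smaller, larger = angles_between_planes((1, 1, 1), (1, 1, 0))
print(round(smaller, 3), round(larger, 3))
```

Flipping the sign of either normal vector swaps which of the two angles the scalar product produces, but leaves the returned pair unchanged.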

Exercise 135. Find the angle between the two given planes. (a) $r \cdot (1, 2, 3) = 1$ and $r \cdot (3, 4, 5) = 2$. (b) $r \cdot (1, 2, 3) = 3$ and $r \cdot (5, 1, 1) = 4$. (c) $r \cdot (1, 1, 8) = 5$ and $r \cdot (3, 0, 10) = 6$. (Answer on p. 1100.)


35

35.1

Definition 85. Two lines are parallel if their direction vectors can be written as scalar

multiples of each other.

Example 339. The lines $r = (0, 0, 0) + \lambda(0, 1, 0)$ and $r = (4, 17, 0) + \mu(1, 0, 0)$ ($\lambda, \mu \in \mathbb{R}$) are not parallel, because $(0, 1, 0)$ cannot be written as a scalar multiple of $(1, 0, 0)$.

Example 340. The lines $r = (8, 1, 1) + \lambda(3, 6, 9)$ and $r = (4, 5, 6) + \mu(1, 2, 3)$ ($\lambda, \mu \in \mathbb{R}$) are parallel, because $(3, 6, 9) = 3(1, 2, 3)$.


Definition 86. A set of points are coplanar if there is some plane that contains all of these

points.

Any two points are always coplanar; indeed, they are collinear ($p_1$ and $p_2$ in the figure below). Three points are also always coplanar, although they may not be collinear ($p_1$, $p_2$, and $p_3$ in the figure below). But four points may not be coplanar ($p_1$, $p_2$, $p_3$, and $p_4$ in the figure below).

[Figure: a plane containing the points p1, p2, p3 and Line 1; the point p4 lies off the plane, and Line 2 cuts through the plane. Two points are coplanar and also lie on the same line. Three points are coplanar but not necessarily on the same line. Four points may not be coplanar.]

Definition 87. Two lines are coplanar if there is some plane on which both lie. Two lines

that are not coplanar are called skew lines.

Example 341. In the figure above, Line 1 and Line 2 are skew lines. Line 1 lies on the

plane illustrated. Line 2 cuts through the plane and does not intersect Line 1.


How do we determine whether two lines l1 and l2 are coplanar or skew? Well,

1. If they are parallel, then obviously we can construct a plane that contains both lines.

And so the two lines are coplanar.

2. If they are not parallel and they lie on the same plane, then they must intersect. This is just the familiar fact you learnt in primary school: two non-parallel lines on the plane must definitely intersect.

Altogether we conclude:

Fact 39. Two lines are coplanar if and only if they (i) are parallel; OR (ii) intersect.

Equivalently, two lines are skew if and only if they (i) are not parallel; AND (ii) do not

intersect.

Example 342. Consider the lines $r = (8, 1, 1) + \lambda(3, 6, 9)$ and $r = (4, 5, 6) + \mu(1, 2, 3)$ ($\lambda, \mu \in \mathbb{R}$). The direction vector of one can be written as a scalar multiple of the other, so they are parallel. Hence, they are also coplanar; or equivalently, they are not skew.

Example 343. Consider the lines $r = (0, 0, 0) + \lambda(0, 1, 0)$ and $r = (4, 17, 0) + \mu(1, 0, 0)$ ($\lambda, \mu \in \mathbb{R}$). The direction vector of one cannot be written as a scalar multiple of the other, so they are not parallel. If they intersect, then there are reals $\lambda$ and $\mu$ such that $(0, 0, 0) + \lambda(0, 1, 0) = (4, 17, 0) + \mu(1, 0, 0)$, or

\[ 0 = 4 + \mu, \quad \lambda = 17, \quad \text{and} \quad 0 = 0. \]

$\lambda = 17$, $\mu = -4$ solves the above equations. (What does this mean? This means that the first line goes through the point $(0, 0, 0) + 17(0, 1, 0) = (0, 17, 0)$ and the second line also goes through the same point $(4, 17, 0) - 4(1, 0, 0) = (0, 17, 0)$.)

The two lines intersect at $(0, 17, 0)$. And so they are coplanar; or equivalently, they are not skew.

If we'd like, we can easily find the plane on which these two lines lie. Remember: All we need are two non-parallel vectors and a point to determine a plane. We already have two non-parallel vectors, namely the direction vectors of the two lines. Using these, we can find a normal vector for the plane, namely $(0, 1, 0) \times (1, 0, 0) = (0, 0, -1)$. Noting also that the origin is on the first line and therefore on the plane, we conclude that the plane is $r \cdot (0, 0, -1) = 0$.


Example 344. Consider the lines $r = (0, 1, 2) + \lambda(9, 1, 3)$ and $r = (4, 5, 6) + \mu(3, 2, 1)$ ($\lambda, \mu \in \mathbb{R}$).

They are not parallel. Let's see if they have an intersection point. If they intersect, then there are reals $\lambda$ and $\mu$ such that $(0, 1, 2) + \lambda(9, 1, 3) = (4, 5, 6) + \mu(3, 2, 1)$, or

\[ 9\lambda \overset{1}{=} 4 + 3\mu, \quad 1 + \lambda \overset{2}{=} 5 + 2\mu, \quad \text{and} \quad 2 + 3\lambda \overset{3}{=} 6 + \mu. \]

Now from $\overset{2}{=}$ and $\overset{3}{=}$, this means that $\lambda = 0.8$ and $\mu = -1.6$. These do not work if we try plugging them into $\overset{1}{=}$. Hence, there are no reals $\lambda$ and $\mu$ that solve the above system of equations. In other words, the two lines do not intersect.

And so the two lines are not coplanar; or equivalently, they are skew.
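For checking answers, here is a short Python sketch that classifies a pair of lines in 3D. Note that it does not solve the simultaneous equations as above. Instead it uses a different (but equivalent) test: two non-parallel lines $r = p_1 + \lambda v_1$ and $r = p_2 + \mu v_2$ are coplanar exactly when $(p_2 - p_1) \cdot (v_1 \times v_2) = 0$. The function names are assumptions made here, and integer inputs are assumed so that the comparisons with zero are exact.

```python
def cross(u, v):
    """Vector (cross) product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product."""
    return sum(x * y for x, y in zip(u, v))

def classify_lines(p1, v1, p2, v2):
    """Classify the lines r = p1 + s*v1 and r = p2 + t*v2 (integer entries)."""
    n = cross(v1, v2)
    if n == (0, 0, 0):                 # direction vectors are scalar multiples
        return "parallel (coplanar)"
    gap = tuple(b - a for a, b in zip(p1, p2))
    if dot(gap, n) == 0:               # p2 - p1 lies in the plane spanned by v1, v2
        return "intersecting (coplanar)"
    return "skew"

print(classify_lines((8, 1, 1), (3, 6, 9), (4, 5, 6), (1, 2, 3)))
print(classify_lines((0, 0, 0), (0, 1, 0), (4, 17, 0), (1, 0, 0)))
print(classify_lines((0, 1, 2), (9, 1, 3), (4, 5, 6), (3, 2, 1)))
```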

Exercise 136. Determine whether each of the following pairs of lines is coplanar or skew. If they are coplanar, find the plane that contains both of them. (a) $r = (8, 1, 5) + \lambda(3, 2, 1)$ and $r = (1, 2, 3) + \mu(5, 6, 7)$ ($\lambda, \mu \in \mathbb{R}$). (b) $r = (0, 0, 6) + \lambda(3, 9, 0)$ and $r = (1, 1, 1) + \mu(1, 3, 0)$ ($\lambda, \mu \in \mathbb{R}$). (c) $r = (6, 5, 5) + \lambda(1, 0, 1)$ and $r = (8, 3, 6) + \mu(0, 1, 1)$ ($\lambda, \mu \in \mathbb{R}$). (Answer on p. 1101.)


35.2

Definition 88. A line with direction vector $v$ and a plane with normal vector $n$ are parallel if $v \cdot n = 0$ (i.e. $v$ and $n$ are perpendicular).

The above definition makes sense, because if the line is perpendicular to the plane's normal vector, then the line must be parallel to the plane itself.

Fact 40. Given a plane and a line, there are three possible cases (illustrated below):

1. The line and plane are parallel and do not intersect at all.

2. The line and plane are parallel and the line lies completely on the plane.

3. The line and plane are not parallel and intersect at exactly one point.

[Figure: a plane together with Line 1, Line 2, and Line 3, illustrating the three cases.]

Note that if a line and a plane are parallel, then either (i) they do not intersect at all; or

(ii) the line lies completely on the plane.

So if a line and a plane are parallel and you can prove that they share at least one

intersection point, then it must be that the line lies completely on the plane.

Conversely, if a line and a plane are parallel and you can prove that there is at least one point on the line that is not on the plane, then it must be that they do not intersect at all.


Example 345. Consider the line $r = (3, 5, 5) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (1, 1, 1) = 3$. We have $(9, 1, 3) \cdot (1, 1, 1) = 13 \neq 0$ and so they are not parallel.

They must therefore intersect at exactly one point. Let's find it.

Plug in a generic point of the line into the equation for the plane: $[(3, 5, 5) + \lambda(9, 1, 3)] \cdot (1, 1, 1) = 3 \iff 3 + 9\lambda + 5 + \lambda + 5 + 3\lambda = 3 \iff 13 + 13\lambda = 3 \iff \lambda = -10/13$. So the intersection point is $(3, 5, 5) - \frac{10}{13}(9, 1, 3)$.

Example 346. Consider the line $r = (3, 5, 5) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (-1, 0, 3) = 6$. We have $(9, 1, 3) \cdot (-1, 0, 3) = 0$ and so they are parallel.

There are two possibilities. Either they do not intersect at all OR the line lies completely on the plane.

The point $(3, 5, 5)$ is on the line but is not on the plane, as we can easily verify: $(3, 5, 5) \cdot (-1, 0, 3) = 12 \neq 6$. And so the line and plane do not intersect at all.

Example 347. Consider the line $r = (3, 5, 3) + \lambda(9, 1, 3)$ ($\lambda \in \mathbb{R}$) and the plane $r \cdot (-1, 0, 3) = 6$. We have $(9, 1, 3) \cdot (-1, 0, 3) = 0$ and so they are parallel.

There are two possibilities. Either they do not intersect at all OR the line lies completely on the plane.

The point $(3, 5, 3)$ on the line is also on the plane: $(3, 5, 3) \cdot (-1, 0, 3) = 6$. Since they are parallel and share at least one intersection point, it must be that the line lies completely on the plane.
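The three cases of Fact 40 can be told apart mechanically: first compute $v \cdot n$, and then (only if it is zero) test whether a single point of the line lies on the plane. The sketch below does this with exact fractions. The function name `line_plane_relation` and the sample inputs are assumptions chosen here to illustrate the three cases; integer inputs are assumed.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar (dot) product."""
    return sum(x * y for x, y in zip(u, v))

def line_plane_relation(p, v, n, d):
    """Relationship between the line r = p + t*v and the plane r . n = d.

    Returns (description, intersection point or None). Integer inputs assumed.
    """
    vn = dot(v, n)
    if vn == 0:                                   # line is parallel to the plane
        if dot(p, n) == d:                        # one shared point is enough
            return "parallel, line lies on plane", None
        return "parallel, no intersection", None
    t = Fraction(d - dot(p, n), vn)               # solve (p + t*v) . n = d for t
    point = tuple(pi + t * vi for pi, vi in zip(p, v))
    return "single intersection point", point

print(line_plane_relation((3, 5, 5), (9, 1, 3), (1, 1, 1), 3))
print(line_plane_relation((3, 5, 5), (9, 1, 3), (-1, 0, 3), 6))
print(line_plane_relation((3, 5, 3), (9, 1, 3), (-1, 0, 3), 6))
```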

Exercise 137. Determine whether the given line and plane are (i) parallel but do not intersect; (ii) parallel with the line lying completely on the plane; or (iii) intersect at exactly one point. (a) $r = (4, 5, 6) + \lambda(2, 3, 5)$ ($\lambda \in \mathbb{R}$) and $r \cdot (10, 0, 4) = 26$. (b) $r = (5, 5, 6) + \lambda(2, 3, 5)$ ($\lambda \in \mathbb{R}$) and $r \cdot (10, 0, 4) = 26$. (c) $r = (4, 5, 6) + \lambda(2, 3, 5)$ ($\lambda \in \mathbb{R}$) and $r \cdot (10, 0, 3) = 26$. (Answer on p. 1102.)


35.3

Definition 89. Two planes are parallel if their normal vectors can be written as scalar

multiples of each other.

(Note that an alternative definition is this: Two planes are parallel if they do not intersect.

We will show that these two definitions are equivalent.)

Imagine that two planes intersect at some line, which we'll call the intersection line. Since this intersection line is on both planes, it must also be perpendicular to the normal vectors of both planes. In other words, it must have direction vector $n_1 \times n_2$. The next fact is thus not surprising (although actually proving it takes a little work).

Fact 41. Two non-parallel planes with normal vectors $n_1$ and $n_2$ intersect at all if and only if they intersect along a line with direction vector $n_1 \times n_2$ (i.e. the line is perpendicular to both $n_1$ and $n_2$).

Fact 42. Given two planes, there are three possible cases:

1. The two planes are parallel and exactly identical.

2. The two planes are parallel and do not intersect at all.

3. The two planes are not parallel and share an intersection line with direction vector $n_1 \times n_2$ (where $n_1$, $n_2$ are the normal vectors of the planes).


Example 348. In the figure below, planes $P_1$ and $P_2$ are parallel and do not intersect at all. Planes $P_2$ and $P_3$ are not parallel and share an intersection line with direction vector $n_2 \times n_3$.

[Figure: the planes P1, P2, and P3, the normal vectors n2 and n3, and the intersection line of P2 and P3, which has direction vector n2 × n3.]

Note that analogous to our study of two lines, if two planes are parallel, then either (i) they

do not intersect at all; or (ii) they are identical.

So if two planes are parallel and you can prove that they share at least one intersection

point, then it must be that the two planes are identical.

Conversely, if two planes are parallel and you can prove that there is at least one point

on one plane that is not on the other plane, then it must be that they do not intersect

at all.

And in the case where they are not parallel, to find the intersection line, simply find a point $p$ where the two planes intersect. Then the intersection line is simply $r = p + \lambda(n_1 \times n_2)$ ($\lambda \in \mathbb{R}$).


Clearly, $(7, 1, 1)$ cannot be written as a scalar multiple of $(1, 1, 2)$. So the two planes are not parallel and share an intersection line whose direction vector is $(7, 1, 1) \times (1, 1, 2) = (1, -13, 6)$.

Find a point $p = (x, y, z)$ where the two planes intersect:

\[ 7x + y + z \overset{1}{=} 42, \]
\[ x + y + 2z \overset{2}{=} 6. \]

There are infinitely many points where the two planes intersect. So why not look for an intersection point where $x = 0$? I'll call this the "plug in $x = 0$" trick.

In which case $\overset{2}{=}$ minus $\overset{1}{=}$ yields $z = -36$ and $y = 78$. Hence, the intersection line is $r = (0, 78, -36) + \lambda(1, -13, 6)$ ($\lambda \in \mathbb{R}$).

Clearly, $(-1, 1, 1)$ cannot be written as a scalar multiple of $(1, -1, 0)$. So the two planes are not parallel and share an intersection line whose direction vector is $(-1, 1, 1) \times (1, -1, 0) = (1, 1, 0)$.

Find a point $p = (x, y, z)$ where the two planes intersect:

\[ -x + y + z \overset{1}{=} 12, \]
\[ x - y \overset{2}{=} -1. \]

Again, we can play the "plug in $x = 0$" trick. In which case $\overset{2}{=}$ says that $y = 1$ and now from $\overset{1}{=}$, we have $z = 11$. And so $(0, 1, 11)$ is an intersection point of the two planes. Hence, the intersection line is $r = (0, 1, 11) + \lambda(1, 1, 0)$ ($\lambda \in \mathbb{R}$).


You can use the "plug in $x = 0$" trick whenever the intersection line has a direction vector with an $x$-coordinate that is not equal to 0.

But the "plug in $x = 0$" trick may not work if the intersection line has a direction vector with $x$-coordinate equal to 0.

Clearly, $(0, 1, 3)$ cannot be written as a scalar multiple of $(1, 1, 3)$. So the two planes are not parallel and share an intersection line whose direction vector is $(0, 1, 3) \times (1, 1, 3) = (0, 3, -1)$.

Find a point $p = (x, y, z)$ where the two planes intersect:

\[ y + 3z \overset{1}{=} 0, \]
\[ x + y + 3z \overset{2}{=} 2. \]

Here the direction vector of the intersection line has $x$-coordinate 0. So the "plug in $x = 0$" trick might not work. And indeed it doesn't, because if we plug in $x = 0$, then $\overset{1}{=}$ and $\overset{2}{=}$ are contradictory.

So let's try the "plug in $y = 0$" trick instead, which I know will work because the $y$-coordinate of the direction vector of the intersection line is non-zero (it's 3). Then from $\overset{1}{=}$ we have $z = 0$ and now from $\overset{2}{=}$ we have $x = 2$. And so $(2, 0, 0)$ is an intersection point of the two planes. Hence, the intersection line is $r = (2, 0, 0) + \lambda(0, 3, -1)$ ($\lambda \in \mathbb{R}$).

Alternatively, we could also have used the "plug in $z = 0$" trick instead, which again I know will work because the $z$-coordinate of the direction vector of the intersection line is non-zero (it's $-1$). Then from $\overset{1}{=}$ we have $y = 0$ and now from $\overset{2}{=}$ we have $x = 2$. And so again we find that $(2, 0, 0)$ is an intersection point of the two planes. And so again we would have concluded that the intersection line is $r = (2, 0, 0) + \lambda(0, 3, -1)$ ($\lambda \in \mathbb{R}$).
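The "plug in a coordinate = 0" trick can be automated: choose a coordinate in which the direction vector $n_1 \times n_2$ is non-zero, set that coordinate to 0, and solve the remaining two equations in two unknowns. The sketch below uses Cramer's rule for that last step, which is a choice made here rather than the elimination used in the examples. The function name `intersection_line` is likewise an assumption, and integer inputs are assumed.

```python
from fractions import Fraction

def cross(u, v):
    """Vector (cross) product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(n1, d1, n2, d2):
    """Planes r . n1 = d1 and r . n2 = d2 (integer entries).

    Returns (point, direction) of the intersection line,
    or None if the planes are parallel.
    """
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        return None
    # Coordinate k is the one we set to 0; it must have a non-zero
    # direction component, which guarantees det != 0 below.
    k = next(idx for idx, c in enumerate(direction) if c != 0)
    i, j = [m for m in range(3) if m != k]
    det = n1[i] * n2[j] - n1[j] * n2[i]
    point = [Fraction(0)] * 3
    point[i] = Fraction(d1 * n2[j] - n1[j] * d2, det)   # Cramer's rule
    point[j] = Fraction(n1[i] * d2 - d1 * n2[i], det)
    return tuple(point), direction

print(intersection_line((7, 1, 1), 42, (1, 1, 2), 6))
print(intersection_line((0, 1, 3), 0, (1, 1, 3), 2))
```

In the second call the direction vector has $x$-coordinate 0, so the function automatically falls back to setting $y = 0$.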


Clearly, $(4, 0, 3)$ can be written as a scalar multiple of $(8, 0, 6)$ and so the two planes are parallel.

Let's check if they are identical. The point $(5/4, 0, 0)$ is on the first plane. However, it is not on the second plane because $(5/4, 0, 0) \cdot (8, 0, 6) = 10 \neq 64$.

So the two planes are not identical and do not intersect at all.

Clearly, $(4, 0, 3)$ can be written as a scalar multiple of $(8, 0, 6)$ and so the two planes are parallel.

Let's check if they are identical. The point $(8, 0, 0)$ is on the first plane. And it is also on the second plane because $(8, 0, 0) \cdot (8, 0, 6) = 64$.

Since the two planes are parallel and share at least one intersection point, it must be that the two planes are exactly identical.

Exercise 138. Determine whether the given pair of planes are parallel and identical, parallel and do not intersect, or are not parallel. If they are not parallel, determine also their intersection line. (a) $r \cdot (4, 9, 3) = 61$ and $r \cdot (1, 1, 2) = 19$. (b) $r \cdot (1, 1, 0) = 4$ and $r \cdot (1, 6, 8) = 60$. (c) $r \cdot (4, 4, 8) = 56$ and $r \cdot (1, 1, 2) = 12$. (d) $r \cdot (4, 4, 8) = 48$ and $r \cdot (1, 1, 2) = 12$. (Answer on p. 1103.)


35.4

SYLLABUS ALERT

The relationship between three planes is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you're taking 9758.

P1 and P2 ;

P1 and P3 ; and

P2 and P3 .

To find the relationship between the 3 planes is simply to find the relationships between

each of these 3 pairs of planes. This can be insanely tedious, but there is nothing new here.

Everything follows from what you learnt in the previous sections.

Let's nonetheless give a summary of the possibilities. Given three planes, we have 3 possible

cases, each of which can be broken up into several sub-cases, for a total of 8 distinct

possibilities.


1. All 3 planes are parallel to each other. There are 3 possible sub-cases:

(a) All 3 planes are identical.

(b) Only 2 planes are identical.

(c) No 2 planes are identical.

[Figure: panels illustrating Case 1, including 3 parallel, identical planes P1, P2, P3; and 3 parallel planes, where P1 and P2 are identical.]

2. Two planes are parallel to each other, but the 3rd plane is not parallel to either

of the first 2 planes. There are 2 possible sub-cases:

(a) The first 2 planes are identical. And so here we are really back to the situation of

two non-parallel planes, which we already covered in detail in the previous section.

They intersect along a line.

(b) The first 2 planes are not identical. And so the non-parallel plane intersects each

of the other two planes along a separate line of intersection.

[Figure: two panels illustrating Case 2. In the first, P1 and P2 are identical and P3 intersects both at the same line. In the second, P1 and P2 are non-identical and P3 intersects both at separate lines.]

3. No 2 planes are parallel. Each pair of planes intersects along a line. There are thus three intersection lines (though possibly some may be identical). It is possible to prove (but we won't do so in this book) that there are only 3 possible sub-cases:

(a) None of the intersection lines intersect with each other. That is, each pair of planes simply intersects along some distinct intersection line.

(b) All 3 intersection lines are identical. So all 3 planes intersect along the same intersection line.

(c) The 3 intersection lines, and thus all 3 planes, intersect at a single point.

To determine which of the above sub-cases we're in, we must determine the relation between each pair of intersection lines. This is tedious, but nothing new.

[Figure: three panels illustrating Case 3 for the planes P1, P2, and P3: the planes intersect at different lines; the planes intersect at the same line; the planes intersect at only one point.]

and $r \cdot (0, 0, 1) = 0$.

Step #1. Check if any two planes are parallel.

By observation, no plane's normal vector can be written as a scalar multiple of another plane's normal vector. So no two planes are parallel. (So we are in Case 3.)

Step #2. Find the 3 intersection lines along which each pair of planes intersect.

By observation, all three planes contain the origin.

The planes $P_1$ and $P_2$ share an intersection line with direction vector $(1, 0, 0) \times (0, 1, 0) = (0, 0, 1)$ and so their intersection line is $r = (0, 0, 0) + \lambda(0, 0, 1)$ ($\lambda \in \mathbb{R}$). Call this line $l_1$.

The planes $P_1$ and $P_3$ share an intersection line with direction vector $(1, 0, 0) \times (0, 0, 1) = (0, -1, 0)$ and so their intersection line is $r = (0, 0, 0) + \lambda(0, -1, 0)$ ($\lambda \in \mathbb{R}$). Call this line $l_2$.

The planes $P_2$ and $P_3$ share an intersection line with direction vector $(0, 1, 0) \times (0, 0, 1) = (1, 0, 0)$ and so their intersection line is $r = (0, 0, 0) + \lambda(1, 0, 0)$ ($\lambda \in \mathbb{R}$). Call this line $l_3$.

Step #3. Determine where, if at all, the 3 intersection lines intersect.

$l_1$ and $l_2$ are not parallel and they intersect at the point $(0, 0, 0)$, so that is their only intersection point.

$l_1$ and $l_3$ are not parallel and they intersect at the point $(0, 0, 0)$, so that is their only intersection point.

$l_2$ and $l_3$ are not parallel and they intersect at the point $(0, 0, 0)$, so that is their only intersection point.

Conclusion.

Altogether, we conclude that the 3 intersection lines intersect at a single point. Hence, the 3 planes also intersect at a single point. (So we are in Case 3c.)


and $r \cdot (0, 1, 1) = 1$.

Step #1. Check if any two planes are parallel.

By observation, $P_1$'s normal vector $(1, 1, 0)$ can be written as a scalar multiple of $P_2$'s normal vector $(-2, -2, 0)$. And so these two planes are parallel.

$P_3$ is not parallel to either of the first two planes, since $(0, 1, 1)$ cannot be written as a scalar multiple of $(1, 1, 0)$ or $(-2, -2, 0)$.

Altogether then, we are in Case 2.

Step #2. Check if the two parallel planes are identical.

They are not, because $(1, 0, 0)$ is on $P_1$ but is not on $P_2$, as we can easily verify: $(1, 0, 0) \cdot (-2, -2, 0) = -2 \neq -4$. So we are in Case 2b.

Step #3. Find the intersection lines. There are two: one shared by $P_1$ and $P_3$ and the other shared by $P_2$ and $P_3$. ($P_1$ and $P_2$ are distinct, parallel planes and thus do not intersect at all.)

The intersection line of $P_1$ and $P_3$ has direction vector $(1, 1, 0) \times (0, 1, 1) = (1, -1, 1)$. Let's find a point $(x, y, z)$ at which $P_1$ and $P_3$ intersect: the equations for the planes are $x + y \overset{1}{=} 1$ and $y + z \overset{2}{=} 1$.

Using the "plug in $x = 0$" trick, we see that they intersect at $(0, 1, 0)$. Hence, their intersection line is $r = (0, 1, 0) + \lambda(1, -1, 1)$ ($\lambda \in \mathbb{R}$). Call this line $l_1$.

The intersection line of $P_2$ and $P_3$ must also have direction vector $(1, -1, 1)$. Let's find a point $(x, y, z)$ at which $P_2$ and $P_3$ intersect: the equations for the planes are $-2x - 2y \overset{1}{=} -4$ and $y + z \overset{2}{=} 1$.

Using the "plug in $x = 0$" trick, we see that they intersect at $(0, 2, -1)$. Hence, their intersection line is $r = (0, 2, -1) + \lambda(1, -1, 1)$ ($\lambda \in \mathbb{R}$). Call this line $l_2$.

The lines $l_1$ and $l_2$ are parallel.

Exercise 139. What is the relationship between the 3 planes $P_1$, $P_2$, and $P_3$, given by $r \cdot (1, 0, 1) = 1$, $r \cdot (0, 1, 1) = 1$, and $r \cdot (1, 1, 0) = 2$? (Answer on p. 1104.)


Part IV

Complex Numbers


36

1. Solve $x - 1 = 0$. Easy; the answer is a natural number: $x = 1$.

2. To solve $x + 1 = 0$, we must invent negative numbers. The answer is $x = -1$.

We'll start by defining the imaginary unit, then work our way to complex numbers.

Definition 90. The imaginary unit, denoted $i$, is a number that satisfies $i^2 = -1$.

Using the imaginary unit, we can construct other purely imaginary numbers:

Definition 91. A purely imaginary number is any real, non-zero multiple of the imaginary unit. That is, a purely imaginary number is any $bi$, where $b \in \mathbb{R}$ with $b \neq 0$.

(We specify that $b \neq 0$ because $0i = 0$ is not a purely imaginary number, but a real number.)

$i$ is both the imaginary unit and a purely imaginary number.

We can add real numbers to purely imaginary numbers to form imaginary numbers:

Definition 92. An imaginary number is any $a + bi$, where $a, b \in \mathbb{R}$ with $b \neq 0$.

Again, we specify that $b \neq 0$ because otherwise $a + 0i$ would not be an imaginary number, but a real number.

but a real number.

Example 357. 3 + 2i is imaginary, but not purely imaginary. In contrast, 2i is both

imaginary and purely imaginary.


Definition 93. A complex number is any $a + bi$, where $a, b \in \mathbb{R}$.

Notice that here in contrast, we do not specify that $b \neq 0$. The reason is that complex numbers include all real numbers.

Example 358. 10 and 17 are complex and real. $2 + 9i$ and $3 - 2i$ are complex and imaginary. $2i$ is complex, imaginary, and purely imaginary. $i$ is complex, imaginary, purely imaginary, and also the imaginary unit.
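The nested definitions above can be summarised as a small decision procedure. The sketch below uses Python's built-in `complex` type (where the imaginary unit is written `1j`); the function name `classify` and the label strings are assumptions chosen here for illustration.

```python
def classify(z):
    """List every label from Definitions 90 to 93 that applies to z."""
    z = complex(z)
    labels = ["complex"]                   # every a + bi is complex
    if z.imag == 0:
        labels.append("real")              # b = 0
    else:
        labels.append("imaginary")         # b != 0
        if z.real == 0:
            labels.append("purely imaginary")   # a = 0 and b != 0
        if z == 1j:
            labels.append("imaginary unit")
    return labels

for number in (10, 2 + 9j, 2j, 1j):
    print(number, classify(number))
```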

We denoted the set of real numbers by the symbol $\mathbb{R}$. We now denote the set of complex numbers by the symbol $\mathbb{C}$.

Definition 94. The set of all complex numbers, denoted $\mathbb{C}$, is defined as $\{a + bi : a, b \in \mathbb{R}\}$.

The set of reals is a proper subset of the set of complex numbers. Formally,

Fact 43. $\mathbb{R} \subset \mathbb{C}$.

Proof. Every element $a \in \mathbb{R}$ can be written as $a + 0i$ and is thus an element of $\mathbb{C}$. So $\mathbb{R}$ is a subset of $\mathbb{C}$. Moreover $\mathbb{R}$ is not equal to $\mathbb{C}$, because for example $3 + 7i \in \mathbb{C}$ but $3 + 7i \notin \mathbb{R}$. Altogether then, $\mathbb{R}$ is a proper subset of $\mathbb{C}$.

Complex numbers are thus the extension of the concept of real numbers. On the next page

is a modified version of our taxonomy of numbers from p. 39, with the complex numbers

fleshed out:

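[Figure: taxonomy of numbers. The Complex Numbers are made up of the Real Numbers and the Imaginary Numbers. The Imaginary Numbers are made up of the Impure Imaginary Numbers and the Purely Imaginary Numbers. The Imaginary Unit is one of the Purely Imaginary Numbers.]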

appreciate why is beyond the scope of the A-levels. But for now here is a very simple

example just to illustrate the point.

Example 359. $1$, $-1$, $i$, and $-i$ are all complex numbers. We do say that $1$ is a positive real number and $-1$ is a negative real number.

But we do not say that $i$ is a positive complex number or that $-i$ is a negative complex number.

In fact, we do not even say that $1$ is a positive complex number or that $-1$ is a negative complex number.

Exercise 140. Fill in the following table. The first column has been done for you. (Answer

on p. 1105.)

Is this ...

13 2i

A complex number?

Yes

A real number?

No

An imaginary number?

Yes

A purely imaginary number?

No

The imaginary unit?

No

3i 0 4 4 + 2i i

3


36.1

Definition 95. Given a complex number z = a + bi, its real part is a and is denoted Re(z).

Similarly, its imaginary part is b and is denoted Im(z).

Example 361. Re(7) = 7 and Im(7) = 0.

Example 362. Re(19i) = 0 and Im(19i) = 19.

It is also often convenient to write complex numbers in ordered pair notation, with the first

term being the real part and the second term being the imaginary.

Example 363. Given z = 3 + 2i, we can also write z = (3, 2).

Example 364. Given z = 7, we can also write z = (7, 0).

Example 365. Given z = 19i, we can also write z = (0, 19).

Of course, two complex numbers z and w are equal if and only if (i) their real parts are

equal; AND (ii) their imaginary parts are equal.

Example 366. Suppose $z = 3 + bi$ and $w = a - 17i$ are equal. Then it must be that $a = 3$ and $b = -17$.
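If you'd like to experiment with real parts, imaginary parts, and ordered pair notation on a computer, here is a minimal Python sketch. It relies on Python's built-in complex type, which writes the imaginary unit as j rather than i. The helper names as_pair and same_number are my own inventions for illustration and are not part of this textbook.

```python
def as_pair(z):
    """Return z in ordered pair notation: (Re(z), Im(z))."""
    z = complex(z)  # also accepts plain real numbers such as 7
    return (z.real, z.imag)


def same_number(z, w):
    """Two complex numbers are equal iff their real AND imaginary parts agree."""
    return as_pair(z) == as_pair(w)


print(as_pair(3 + 2j))  # ordered pair for 3 + 2i
print(as_pair(7))       # a real number has imaginary part 0
print(as_pair(19j))     # a purely imaginary number has real part 0
print(same_number(3 - 17j, complex(3, -17)))
```

Comparing the two parts separately mirrors conditions (i) and (ii) above.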

Exercise 141. Exactly two of the following complex numbers are identical. Find out which

two. (Answer on p. 1105.)

1

2

a=

i,

2

2

3

1

b = i,

2

2

c = sin sin i,

3

3

d=

cos ( ) i.

2

4


37

The familiar arithmetic operations work the same way on imaginary numbers as they do

on real numbers. Addition and subtraction are especially simple.

37.1

We can also write $z = (2, 1)$ and $w = (0, 3)$, so that $z + w = (2 + 0, 1 + 3) = (2, 4)$ and $z - w = (2 - 0, 1 - 3) = (2, -2)$.

Example 368. Let $z = 7 - i$ and $w = 2 + 5i$. Then $z + w = 9 + 4i$ and $z - w = 5 - 6i$.

We can also write $z = (7, -1)$ and $w = (2, 5)$, so that $z + w = (7 + 2, -1 + 5) = (9, 4)$ and $z - w = (7 - 2, -1 - 5) = (5, -6)$.

In general,

Fact 44. If $z = (a, b)$ and $w = (c, d)$, then

$$z + w = (a + c, b + d),$$

$$z - w = (a - c, b - d).$$
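Fact 44 says that addition and subtraction work component by component. Here is a small Python sketch of that rule, with complex numbers stored as ordered pairs (tuples). The function names add and sub are my own choice for illustration.

```python
def add(z, w):
    """Sum of two complex numbers given as ordered pairs (real, imaginary)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def sub(z, w):
    """Difference z - w of two complex numbers given as ordered pairs."""
    (a, b), (c, d) = z, w
    return (a - c, b - d)


z = (7, -1)  # 7 - i
w = (2, 5)   # 2 + 5i
print(add(z, w))
print(sub(z, w))
```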

z +w and z w. (a) z = 5+2i, w = 7+3i.

(b) z = 3 i, w = 11 + 2i. (c) z = 1 + 2i, w = 3 2i. (Answer on p. 1106.)


37.2

Multiplication

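Below are listed the powers of $i$. Note that the cycle repeats after every fourth power, because $i^4 = 1$.

$$i^1 = i, \quad i^2 = i \cdot i = -1, \quad i^3 = i \cdot i^2 = -i, \quad i^4 = i \cdot i^3 = 1,$$

$$i^5 = i \cdot i^4 = i, \quad i^6 = i \cdot i^5 = -1, \quad i^7 = i \cdot i^6 = -i, \quad i^8 = i \cdot i^7 = 1,$$

$$i^9 = i \cdot i^8 = i, \quad i^{10} = i \cdot i^9 = -1, \quad i^{11} = i \cdot i^{10} = -i, \quad i^{12} = i \cdot i^{11} = 1,$$

etc.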

Example 369. Let $z = i$ and $w = 1 + i$. Then $zw = i(1 + i) = i \cdot 1 + i \cdot i = i - 1$.

Example 370. Let $z = -2 + i$ and $w = 3i$. Then

$$zw = (-2 + i)(3i) = (-2)(3i) + i(3i) = -6i + 3i^2 = -6i + 3(-1) = -3 - 6i.$$

Example 371. Let $z = 2 - i$ and $w = -1 + i$. Then

$$zw = (2 - i)(-1 + i) = -2 + 2i + i - i^2 = -2 + 3i - i^2 = -2 + 3i - (-1) = -1 + 3i.$$

Example 372. Let $z = 3 + 2i$ and $w = -7 + 4i$. Then

$$zw = (3 + 2i)(-7 + 4i) = -21 + 12i - 14i + (2i)(4i) = -21 - 2i + 8i^2 = -21 - 2i + 8(-1) = -29 - 2i.$$

In general,

Fact 45. If $z = (a, b)$ and $w = (c, d)$, then

$$zw = (ac - bd, ad + bc).$$
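Here is a short Python sketch of Fact 45, again with complex numbers stored as ordered pairs. The names mul and power_of_i are my own and purely illustrative. The second function simply multiplies by $i = (0, 1)$ repeatedly, so you can watch the powers of $i$ cycle.

```python
def mul(z, w):
    """Product of two complex numbers given as ordered pairs, as in Fact 45."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def power_of_i(n):
    """Compute i**n for an integer n >= 0 by repeated multiplication by i."""
    result = (1, 0)  # the number 1
    for _ in range(n):
        result = mul(result, (0, 1))
    return result


print(mul((3, 2), (-7, 4)))                # (3 + 2i)(-7 + 4i)
print([power_of_i(n) for n in range(1, 9)])  # the cycle i, -1, -i, 1 twice over
```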


compute zw. (a) z = 5 + 2i, w = 7 + 3i. (b)

z = 3 i, w = 11 + 2i. (c) z = 1 + 2i, w = 3 2i. (Answer on p. 1106.)

Exercise 145. Given that $z = 2 + i$ and $az^3 + bz^2 + 3z - 1 = 0$, find $a$ and $b$. (Answer on p. 1106.)


37.3

Division

Recall that to rationalise a surd in the denominator (section 5.2), we used a trick involving

conjugate pairs.

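Example 373.

$$\frac{3}{1 + \sqrt{5}} = \frac{3}{1 + \sqrt{5}} \cdot \frac{1 - \sqrt{5}}{1 - \sqrt{5}} = \frac{3\left(1 - \sqrt{5}\right)}{1 - 5} = \frac{3\left(\sqrt{5} - 1\right)}{4}.$$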

root of some number, then this is a rationalisation (make rational) that helps get rid

of an ugly surd.

Now, given $z = a + bi$, we call $z^* = a - bi$ its conjugate. And we call $a + bi$ and $a - bi$ a conjugate pair, because

$$(a + bi)(a - bi) = a^2 - (bi)^2 = a^2 - b^2 i^2 = a^2 + b^2.$$

This is a realisation (make real) that helps get rid of any complex numbers. Example:

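Example 374. $(1 + i)^* = 1 - i$, $i^* = -i$, and $(1 - i)^* = 1 + i$. Thus:

(a) $$\frac{1}{1 + i} = \frac{1}{1 + i} \cdot \frac{1 - i}{1 - i} = \frac{1 - i}{1^2 - i^2} = \frac{1 - i}{1 + 1} = 0.5 - 0.5i.$$

(b) $$\frac{1}{i} = \frac{1}{i} \cdot \frac{-i}{-i} = \frac{-i}{-i^2} = \frac{-i}{1} = -i.$$

(c) $$\frac{1}{1 - i} = \frac{1}{1 - i} \cdot \frac{1 + i}{1 + i} = \frac{1 + i}{1^2 - i^2} = \frac{1 + i}{1 + 1} = 0.5 + 0.5i.$$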

In general,

Fact 46. If $z = (a, b)$, then

$$z^* = (a, -b),$$

$$zz^* = a^2 + b^2,$$

$$\frac{1}{z} = \frac{1}{z} \cdot \frac{z^*}{z^*} = \frac{z^*}{a^2 + b^2} = \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right).$$

Exercise 146. For each of the following $z$, write down its conjugate $z^*$ and hence compute

its reciprocal (i.e. 1/z). (a) z = 5 + 2i. (b) z = 3 i. (c) z = 1 + 2i. (Answer on p. 1106.)

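Example 375.

(a) $$\frac{-2 + i}{3i} = \frac{-2 + i}{3i} \cdot \frac{-3i}{-3i} = \frac{6i - 3i^2}{-9i^2} = \frac{6i + 3}{9}.$$

(b) $$\frac{3 + i}{1 - i} = \frac{3 + i}{1 - i} \cdot \frac{1 + i}{1 + i} = \frac{(3 + i)(1 + i)}{1^2 - i^2} = \frac{3 + 3i + i + i^2}{1 + 1} = \frac{2 + 4i}{2} = 1 + 2i.$$

(c) $$\frac{1 + i}{3 - 2i} = \frac{1 + i}{3 - 2i} \cdot \frac{3 + 2i}{3 + 2i} = \frac{3 + 2i + 3i + 2i^2}{9 + 4} = \frac{1 + 5i}{13}.$$

(d) $$\frac{2 - i}{-1 + i} = \frac{2 - i}{-1 + i} \cdot \frac{-1 - i}{-1 - i} = \frac{-2 - 2i + i + i^2}{1 + 1} = \frac{-3 - i}{2} = -1.5 - 0.5i.$$

(e) $$\frac{3 + 2i}{-7 + 4i} = \frac{3 + 2i}{-7 + 4i} \cdot \frac{-7 - 4i}{-7 - 4i} = \frac{-21 - 12i - 14i - 8i^2}{49 + 16} = \frac{-13 - 26i}{65} = -0.2 - 0.4i.$$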

(f)

=

=

=

.

2 + i

2 + i 2 i

22 2 i 2

4 + 2

In general,

Fact 47. If $z = (a, b)$ and $w = (c, d)$ with $w \neq 0$, then

$$\frac{z}{w} = \frac{z}{w} \cdot \frac{w^*}{w^*} = \frac{zw^*}{c^2 + d^2} = \left(\frac{ac + bd}{c^2 + d^2}, \frac{bc - ad}{c^2 + d^2}\right).$$
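To check your own division answers, here is a minimal Python sketch of the conjugate trick. It stores complex numbers as ordered pairs of integers and uses the standard library's Fraction type so that answers stay exact. The function names conj and div are my own, and the sketch assumes integer inputs.

```python
from fractions import Fraction


def conj(z):
    """Conjugate of z = (a, b), namely (a, -b)."""
    a, b = z
    return (a, -b)


def div(z, w):
    """Compute z / w by multiplying top and bottom by the conjugate of w."""
    (a, b), (c, d) = z, w
    denominator = c * c + d * d  # this is w times its conjugate
    if denominator == 0:
        raise ZeroDivisionError("w must not be zero")
    return (Fraction(a * c + b * d, denominator),
            Fraction(b * c - a * d, denominator))


print(conj((1, 1)))          # conjugate of 1 + i
print(div((3, 1), (1, -1)))  # (3 + i) / (1 - i)
print(div((1, 0), (1, 1)))   # the reciprocal of 1 + i
```

Setting $z = (1, 0)$ gives the reciprocal $1/w$.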

Exercise 147. Rewrite each of the following fractions into the form a + bi. (a)

2 3i

11 + 2i

2 i

3

7 2i

. (d)

. (c)

. (e)

. (f)

. (Answer on p. 1107.)

1+i

i

2+i

5+i

3 2i

1 + 3i

. (b)

i


38

38.1

Complex Roots to Quadratic Equations

discriminant (i.e. $b^2 - 4ac \geq 0$), then its real roots are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

$(-3)^2 - 4(1)(2) = 1 > 0$. Hence, it has two real roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{3 \pm \sqrt{1}}{2} = 1, 2.$$

Now, armed with our new concept of imaginary numbers, we can completely dispense with the requirement that $b^2 - 4ac \geq 0$. We can simply say that $ax^2 + bx + c = 0$ ALWAYS has complex roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Example 377. Consider the equation $x^2 - 2x + 2 = 0$. Its discriminant is negative: $b^2 - 4ac = (-2)^2 - 4(1)(2) = -4 < 0$. It has two imaginary (and thus also complex) roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2 \pm \sqrt{-4}}{2} = 1 \pm \frac{\sqrt{4}\sqrt{-1}}{2} = 1 \pm \frac{2i}{2} = 1 \pm i.$$

Notice that $1 + i$ was a root to the given quadratic equation. And interestingly enough, so too was $1 - i$.

It turns out that in general, a quadratic equation with real coefficients has roots that come in conjugate pairs. That is, if $x + yi$ is a root, then so too is its conjugate $x - yi$ (footnote 40). More examples:

Footnote 40. This is not terribly surprising if you examine the general solution for the quadratic equation: the $\pm\sqrt{b^2 - 4ac}$ bit corresponds precisely to the imaginary part.

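Example 378. Consider the equation $3x^2 + x + 1 = 0$. Its discriminant is negative: $b^2 - 4ac = (1)^2 - 4(3)(1) = -11 < 0$. It has two imaginary (and thus also complex) roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-1 \pm \sqrt{-11}}{6} = -\frac{1}{6} \pm \frac{\sqrt{11}\sqrt{-1}}{6} = -\frac{1}{6} \pm \frac{\sqrt{11}}{6} i.$$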
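The quadratic formula with a possibly negative discriminant is easy to try out on a computer. The sketch below is one way to do it in Python, using the standard library's cmath module, whose sqrt function happily takes square roots of negative numbers. The function name quadratic_roots is my own, and the sketch assumes $a \neq 0$.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a != 0) as complex numbers."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))


print(quadratic_roots(1, -3, 2))  # positive discriminant: two real roots
print(quadratic_roots(1, -2, 2))  # negative discriminant: a conjugate pair
print(quadratic_roots(3, 1, 1))   # another conjugate pair
```

Python prints the imaginary unit as j, so a root such as $1 + i$ appears as (1+1j).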

both real), then what are $b$ and $c$? Well, we know that $3 - 2i$ is also a root to the equation. And so

$$x^2 + bx + c = [x - (3 + 2i)][x - (3 - 2i)] = (x - 3)^2 - (2i)^2 = x^2 - 6x + 13.$$

Hence, $b = -6$ and $c = 13$.

Exercise 148. Find the roots for each of the following quadratic equations. (a) $x^2 + x + 1 = 0$. (b) $x^2 + 2x + 2 = 0$. (c) $3x^2 + 3x + 1 = 0$. (Answer on p. 1108.)

Exercise 149. If $1 - i$ is a root to the quadratic equation $x^2 + bx + c = 0$ (where $b$ and $c$ are both real), then what are $b$ and $c$? (Answer on p. 1108.)


38.2

$a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n$, where each $a_i$ is a constant and $x$ is the variable.

Theorem 4. The Fundamental Theorem of Algebra. A polynomial of degree $n$ in one variable has exactly $n$ zeros (though some may be repeated). That is, there are exactly $n$ (possibly repeated) solutions to the equation $a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n = 0$.

Proof. The proof of this theorem is way too advanced and so omitted from this book.41

namely $1$ and $-1$.

Example 381. $x^2 + 1$ is a polynomial of degree 2. And indeed, $x^2 + 1 = 0$ has two solutions, namely $i$ and $-i$.

There are sometimes repeated solutions or what are more formally called multiple roots, as the next example illustrates.

Example 382. $x^2 - 2x + 1 = 0$ has two (repeated) solutions, namely $1$ and $1$. We call $1$ a multiple root (indeed a double root).

Example 383. $x^3 - 6x^2 + 12x - 8 = 0$ has three (repeated) solutions, namely $2$, $2$, and $2$. We call $2$ a multiple root (indeed a triple root).


The Fundamental Theorem of Algebra can be useful even if we have no idea how to find

the solutions to an equation.

Example 384. $x^{17} + 3x^4 - 2x + 1$ is a polynomial of degree 17. I may not know what the solutions to $x^{17} + 3x^4 - 2x + 1 = 0$ are, but I know from the Fundamental Theorem of Algebra that there MUST be 17 solutions (though some may possibly be repeated).

zeros. Suppose we are given as a hint that two of them are $i$ and $-i$. Then how would we go about finding the other two zeros?

The problem of finding the zeros of a polynomial is really the same as the problem of factorising a polynomial. This is because $a$ is a zero of a polynomial if and only if $(x - a)$ is a factor of the polynomial.

So $(x - i)$ and $(x + i)$ are factors of the polynomial. Now, $(x - i)(x + i) = x^2 - i^2 = x^2 + 1$. So find $(x^4 + x^3 - 5x^2 + x - 6) \div (x^2 + 1)$ through long division:

$$\begin{array}{rrrrrr}
 & & & x^2 & +x & -6 \\ \hline
x^2 + 1 \;\big) & x^4 & +x^3 & -5x^2 & +x & -6 \\
 & x^4 & +0 & +x^2 & & \\ \hline
 & & x^3 & -6x^2 & +x & \\
 & & x^3 & +0 & +x & \\ \hline
 & & & -6x^2 & +0 & -6 \\
 & & & -6x^2 & +0 & -6 \\ \hline
 & & & & & 0
\end{array}$$

By observation, $x^2 + x - 6 = (x - 2)(x + 3)$. Hence,

$$x^4 + x^3 - 5x^2 + x - 6 = (x^2 + 1)(x^2 + x - 6) = (x - i)(x + i)(x - 2)(x + 3).$$

Altogether, the four zeros of the given polynomial are $\pm i$, $2$, and $-3$.
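Long division of polynomials is mechanical enough to hand over to a computer. Below is a minimal Python sketch, assuming each polynomial is stored as a list of coefficients from the highest power down to the constant term. The function name poly_divmod and this list convention are my own choices, not anything prescribed by the syllabus.

```python
def poly_divmod(numerator, denominator):
    """Divide two polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder), both as coefficient lists.
    """
    remainder = list(numerator)
    quotient = []
    while len(remainder) >= len(denominator):
        # Match the leading terms, exactly as in long division by hand.
        factor = remainder[0] / denominator[0]
        quotient.append(factor)
        for i, coefficient in enumerate(denominator):
            remainder[i] -= factor * coefficient
        remainder.pop(0)  # the leading term has been cancelled
    return quotient, remainder


# (x^4 + x^3 - 5x^2 + x - 6) divided by (x^2 + 1)
print(poly_divmod([1, 1, -5, 1, -6], [1, 0, 1]))
```

A remainder made up entirely of zeros confirms that the divisor is a factor.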

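zeros. As a hint, we are told that one of them is $5$. What are the other two?

$$\begin{array}{rrrrr}
 & & x^2 & +2x & +5 \\ \hline
x - 5 \;\big) & x^3 & -3x^2 & -5x & -25 \\
 & x^3 & -5x^2 & & \\ \hline
 & & 2x^2 & -5x & \\
 & & 2x^2 & -10x & \\ \hline
 & & & 5x & -25 \\
 & & & 5x & -25 \\ \hline
 & & & & 0
\end{array}$$

So $x^3 - 3x^2 - 5x - 25 = (x - 5)(x^2 + 2x + 5)$. I'm unable to easily see how $x^2 + 2x + 5$ can be factorised. So let me just use the quadratic formula:

$$x = \frac{-2 \pm \sqrt{2^2 - 4(1)(5)}}{2} = -1 \pm \sqrt{1 - 5} = -1 \pm \sqrt{-4} = -1 \pm 2i.$$

Altogether then,

$$x^3 - 3x^2 - 5x - 25 = (x - 5)(x^2 + 2x + 5) = (x - 5)[x - (-1 + 2i)][x - (-1 - 2i)].$$

So the three zeros of the polynomial are $5$ and $-1 \pm 2i$.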

Exercise 150. Each of the following polynomials has $1$ as a zero. Find the other zeros. (a) $x^3 + x^2 - 2$. (b) $x^4 - x^2 - 2x + 2$. (Answer on p. 1109.)


38.3

and $c$ are real), then so too is its conjugate $c - di$.

What is perhaps surprising is that this generalises to the case of any polynomial, provided that all coefficients of the polynomial are real.

Example 387. If told that $2 - i$ solves $x^3 - x^2 - 7x + 15 = 0$, we know immediately that its conjugate $2 + i$ also solves the same equation.

Example 388. If told that $i$ solves $4x^4 + 5x^2 + 1 = 0$, we know immediately that its conjugate $-i$ also solves the same equation. Similarly, if told also that $0.5i$ solves the same equation, we know immediately that its conjugate $-0.5i$ also solves the same equation.

$a + bi$ solves $a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0 = 0$, then so does $a - bi$.

The condition that all coefficients $a_k$ are real is important. The above theorem does not apply if any of the coefficients are imaginary.

Example 389. $i$ solves $x^2 + ix + 2 = 0$. However, its conjugate $-i$ does not solve the same equation (verify this yourself!).
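If you'd rather let a computer do the verifying, here is a small Python sketch. It evaluates a polynomial at a complex number using Horner's method (my choice of technique, not something this textbook covers). Coefficients are listed from the highest power down, and the function name poly_eval is made up for illustration.

```python
def poly_eval(coefficients, x):
    """Evaluate a polynomial at x; coefficients are listed highest power first."""
    result = 0
    for coefficient in coefficients:
        result = result * x + coefficient  # Horner's method
    return result


real_cubic = [1, -1, -7, 15]  # x^3 - x^2 - 7x + 15, all coefficients real
print(poly_eval(real_cubic, 2 - 1j), poly_eval(real_cubic, 2 + 1j))

imaginary_quadratic = [1, 1j, 2]  # x^2 + ix + 2, one imaginary coefficient
print(poly_eval(imaginary_quadratic, 1j), poly_eval(imaginary_quadratic, -1j))
```

With real coefficients, both the root and its conjugate give zero. With an imaginary coefficient, only one of them does.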

Example 390. $\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} i$ solves $x^2 = i$. However, its conjugate $\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2} i$ does not solve the same equation (verify this yourself!).

Exercise 151. Each of the following polynomials has $2 - 3i$ as a zero. Find the other zeros. (a) $x^4 - 6x^3 + 18x^2 - 14x - 39$. (b) $-2x^4 + 21x^3 - 93x^2 + 229x - 195$. (Answer on p. 1110.)


39

The complex plane (or Argand diagram) gives us a nice geometric interpretation: The

complex numbers are simply points on the plane. The real axis is the horizontal

or x-axis. The imaginary axis is the vertical or y-axis.

Example 392. In the figure below, marked in red are the real numbers 3, 0, , and 2,

which may be written in ordered pair notation as (3, 0), (0, 0), (, 0), (2, 0). Points on

the horizontal axis are real numbers.

In blue are the purely imaginary numbers 4i and 3i, which may be written in ordered pair

notation as (0, 4) and (0, 3). Points on the vertical axis are purely imaginary numbers.

In green are the impure imaginary numbers 1 + i, 3 + 2i, 1 3i, and 4 i, which may

be written in ordered pair notation as (1, 1), (3, 2), (1, 3), and (4, 1). Points not on

either axis are impure imaginary numbers.

[Figure: Argand diagram for Example 392, with the real axis ($x$) horizontal and the imaginary axis ($y$) vertical.]

For our purposes, we'll regard the complex plane $\mathbb{C}$ as being exactly identical to the cartesian plane $\{(x, y) : x \in \mathbb{R}, y \in \mathbb{R}\}$. Both are represented graphically as a two-dimensional plane. The only difference is that we interpret points on each plane differently: Points on the complex plane are complex numbers, while points on the cartesian plane are ordered pairs of real numbers (footnote 42).

Exercise 152. Illustrate the complex numbers 1, 3, 2i, 1 + 2i, and 1 3i on a single

Argand diagram. (Answer on p. 1111.)

Footnote 42. The differences between $\mathbb{C}$ and $\mathbb{R}^2$ in fact run deeper. See e.g. this discussion.

39.1

To write a complex number in standard form i.e. z = x + iy, we need only two pieces

of information: its real part (x) and its imaginary part (y).

We now write a complex number in polar form. Again, we need only two pieces of information: the modulus, denoted $|z|$, and the argument, denoted $\arg z$. Informally, the modulus is the length of the position vector of $z$; the argument is the angle the position vector of $z$ makes with the positive $x$-axis.

Example 393. The complex number $-3 = (-3, 0)$ has modulus $|-3| = 3$ and argument $\arg(-3) = \pi$. The complex number $4i = (0, 4)$ has modulus $|4i| = 4$ and argument $\arg(4i) = \pi/2$. The complex number $3 + 3i = (3, 3)$ has modulus $|3 + 3i| = \sqrt{3^2 + 3^2} = 3\sqrt{2}$ and argument $\arg(3 + 3i) = \pi/4$.

[Figure: Argand diagram showing the points $-3 = (-3, 0)$ and $3 + 3i = (3, 3)$.]

Definition 96. The modulus function has domain $\mathbb{C}$, codomain $\mathbb{R}$, and mapping rule $z = x + iy \mapsto |z| = \sqrt{x^2 + y^2}$.

In contrast, it is tricky to write down a formal definition of the argument function. One

problem is this: Angles are periodic.

Example 394. Consider again the complex number $3 + 3i = (3, 3)$. The angle it makes with the positive $x$-axis is $\pi/4$.

But angles are periodic. Equivalently, angles come full circle every $2\pi$ radians. So it would make just as much sense to say that the angle is $9\pi/4$. Or $17\pi/4$. Or $-7\pi/4$. Or indeed any $\pi/4 + 2k\pi$, where $k$ is any integer.

To overcome this problem, we shall somewhat arbitrarily choose $(-\pi, \pi]$ as our principal values. Thus, $\arg(3 + 3i)$ shall be uniquely defined to be the value $\pi/4$ and nothing else.

Another problem is this: We are tempted to simply define $\arg(x + yi) = \tan^{-1}(y/x)$. Unfortunately, the $\tan^{-1}$ function has range $(-\pi/2, \pi/2)$. Whereas, as we just decided, $\arg$ should have codomain $(-\pi, \pi]$. To overcome this, altogether, the argument function is defined as follows:

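Definition 97. The argument function has domain $\mathbb{C}$, codomain $(-\pi, \pi]$, and mapping rule as given below (where $z = x + yi$):

$$\arg z = \begin{cases}
\tan^{-1}(y/x), & \text{if } x > 0 \text{ (top-right and bottom-right quadrants)}, \\
\text{Undefined}, & \text{if } x = 0 = y \text{ (the origin)}, \\
\pi/2, & \text{if } x = 0, y > 0 \text{ (the positive } y\text{-axis)}, \\
-\pi/2, & \text{if } x = 0, y < 0 \text{ (the negative } y\text{-axis)}, \\
\tan^{-1}(y/x) + \pi, & \text{if } x < 0, y \geq 0 \text{ (top-left quadrant, including the negative } x\text{-axis)}, \\
\tan^{-1}(y/x) - \pi, & \text{if } x < 0, y < 0 \text{ (bottom-left quadrant)}.
\end{cases}$$

(My mnemonic for the above: $\arg z = \tan^{-1}\frac{y}{x}$. Top left $+\pi$. Bottom left $-\pi$.)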
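Definition 97 translates almost line for line into code. The Python sketch below is my own transcription, with a made-up function name arg, and it treats the origin by raising an error since the argument is left undefined there. As a cross-check, it compares the result against the standard library's math.atan2, which returns the same principal value for any point other than the origin.

```python
import math


def arg(x, y):
    """Principal argument of x + yi, following the cases of Definition 97."""
    if x > 0:
        return math.atan(y / x)            # right-hand quadrants
    if x == 0 and y == 0:
        raise ValueError("arg 0 is undefined")
    if x == 0:
        return math.pi / 2 if y > 0 else -math.pi / 2  # vertical axis
    if y >= 0:
        return math.atan(y / x) + math.pi  # top left: add pi
    return math.atan(y / x) - math.pi      # bottom left: subtract pi


for point in [(3, 3), (-3, 0), (0, 4), (-4, 7), (-1, -3)]:
    print(point, arg(*point), math.atan2(point[1], point[0]))
```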

We now illustrate and explain the above definition:

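[Figure: Argand diagram with a green point (right of the vertical axis), a red point (top-left quadrant), and a blue point (bottom-left quadrant), each labelled with its value of $\arg z$.]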

If $x > 0$ (top-right and bottom-right quadrants), then define $\arg(x + yi) = \tan^{-1}(y/x)$.

The green point in the figure above illustrates. The angle that the position vector of the green $(x, y)$ makes with the positive $x$-axis is indeed simply $\tan^{-1}(y/x)$.

If $x = 0$, $y = 0$ (the origin), then $\arg(x + yi)$ is undefined. In other words, we leave $\arg 0$ undefined.

If $x = 0$, $y > 0$ (positive vertical axis), then define $\arg(x + yi) = \arg(yi) = \pi/2$.

If $x = 0$, $y < 0$ (negative vertical axis), then define $\arg(x + yi) = \arg(yi) = -\pi/2$.

If $x < 0$, $y \geq 0$ (top-left quadrant plus the negative horizontal axis), then define $\arg(x + yi) = \tan^{-1}(y/x) + \pi$.

The red point illustrates. The angle its position vector makes with the negative $x$-axis is $\tan^{-1}(y/|x|)$. And so $\arg(x + yi) = \pi - \tan^{-1}(y/|x|)$. Observe that $\tan^{-1}(y/|x|) = \tan^{-1}(y/(-x)) = -\tan^{-1}(y/x)$. Thus, $\arg(x + yi) = \pi - \tan^{-1}(y/|x|) = \tan^{-1}(y/x) + \pi$.

If $x < 0$, $y < 0$ (bottom-left quadrant), then define $\arg(x + yi) = \tan^{-1}(y/x) - \pi$.

The blue point illustrates. The angle its position vector makes with the negative $x$-axis is $\tan^{-1}(|y|/|x|)$. And so $\arg(x + yi) = \tan^{-1}(|y|/|x|) - \pi$. Observe that $(|y|/|x|) = (-y/(-x)) = (y/x)$. Thus, $\arg(x + yi) = \tan^{-1}(|y|/|x|) - \pi = \tan^{-1}(y/x) - \pi$.


Fact 48. (a) $z$ is purely imaginary ($z$ is on the vertical axis) $\iff \arg z = \pm\pi/2$.

(b) $z$ is real ($z$ is on the horizontal axis) $\iff \arg z = 0, \pi$.

Exercise 153. Compute the modulus and argument of 4, 3, 2i, 1+2i, and 13i. Illustrate

these numbers and their arguments on a single Argand diagram. (Answer on p. 1112.)

Exercise 154. Where on the complex plane must a complex number be, if its argument is ... (a) Positive? (b) Negative? (c) $0$? (d) $\pi/2$? (e) $-\pi/2$? (f) $> \pi/2$? (g) $< -\pi/2$? (Answer on p. 1113.)

Armed with the modulus and the argument, we have a nice geometric interpretation:

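Fact 49. Let $z$ be a complex number with $|z| = r$ and $\arg z = \theta$. Then $z = r(\cos\theta + i\sin\theta)$.

We call $r(\cos\theta + i\sin\theta)$ the polar form representation of $z$.

[Figure: the point $z = r(\cos\theta + i\sin\theta) = (r\cos\theta, r\sin\theta)$ on an Argand diagram, at distance $r$ from the origin, with horizontal coordinate $r\cos\theta$ and vertical coordinate $r\sin\theta$.]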

2

2

5

3

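Example 396. For $z = 1 + 3i = (1, 3)$, $|z| = \sqrt{1^2 + 3^2} = \sqrt{10}$ and $\arg z = \tan^{-1}\frac{3}{1} \approx 1.249$. So in polar form, $z \approx \sqrt{10}\,(\cos 1.249 + i\sin 1.249)$.

Example 397. For $z = -4 + 7i = (-4, 7)$, $|z| = \sqrt{(-4)^2 + 7^2} = \sqrt{65}$ and $\arg z = \tan^{-1}\frac{7}{-4} + \pi \approx 2.090$. So in polar form, $z \approx \sqrt{65}\,(\cos 2.090 + i\sin 2.090)$.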

Exercise 155. Rewrite each of the following complex numbers in polar form: 1, 3, 2i,

1 + 2i, and 1 3i. (Answer on p. 1113.)


39.2

Fact 50. Let $z$ be a complex number with $|z| = r$ and $\arg z = \theta$. Then $z = re^{i\theta}$.

2

49 and uses the Euler Formula.

We call $re^{i\theta}$ the exponential form representation of $z$.

Euler's identity is one of the most extraordinary and beautiful equations in all of mathematics. It links together five fundamental mathematical constants: $e$, $i$, $\pi$, $1$, and $0$.

Corollary 4. (Euler's identity.) $e^{i\pi} + 1 = 0$.

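Example 398. The number $z = 5 - 2i = (5, -2)$ has modulus $\sqrt{5^2 + (-2)^2} = \sqrt{29}$ and argument $\tan^{-1}(-2/5) \approx -0.381$. Hence, we can also write $z = \sqrt{29}\,e^{i(-0.381)}$.

Example 399. The number $z = 1 + 3i = (1, 3)$ has modulus $\sqrt{1^2 + 3^2} = \sqrt{10}$ and argument $\tan^{-1}(3/1) \approx 1.249$. Hence, we can also write $z = \sqrt{10}\,e^{i(1.249)}$.

Example 400. The number $z = -4 + 7i = (-4, 7)$ has modulus $\sqrt{(-4)^2 + 7^2} = \sqrt{65}$ and argument $\tan^{-1}(7/(-4)) + \pi \approx 2.090$. Hence, we can also write $z = \sqrt{65}\,e^{i(2.090)}$.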
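Python's standard library can convert between standard, polar, and exponential forms, which makes a handy check on hand calculations. The sketch below uses cmath.polar (which returns the modulus and the principal argument) for the number $-4 + 7i$. The variable names are my own.

```python
import cmath
import math

z = -4 + 7j                    # Python writes i as j
r, theta = cmath.polar(z)      # modulus and principal argument
print(r, theta)                # roughly sqrt(65) and 2.090

# Rebuild z from its polar form and from its exponential form.
from_polar = r * (math.cos(theta) + 1j * math.sin(theta))
from_exponential = r * cmath.exp(1j * theta)
print(from_polar, from_exponential)

# Euler's identity, up to floating-point rounding error.
print(cmath.exp(1j * math.pi) + 1)
```

The last line prints a number that is tiny but not exactly zero, because the computer works with a rounded value of $\pi$.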

Exercise 156. Rewrite each of the following complex numbers in exponential form: 1, 3,

2i, 1 + 2i, and 1 3i. (Answer on p. 1113.)


40

Now that we know how to write complex numbers in polar and exponential forms, the

arithmetic of complex numbers becomes even easier.

40.1

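Fact 51. Product of two complex numbers. Let $z$ and $w$ be complex numbers. Then

$$|zw| = |z||w|, \quad \text{and} \quad \arg(zw) = \arg z + \arg w + 2k\pi,$$

where $k = -1, 0, 1$ ensures that $\arg z + \arg w + 2k\pi \in (-\pi, \pi]$.

Proof. Let $z = r(\cos\theta + i\sin\theta)$ and $w = s(\cos\phi + i\sin\phi)$. Then

$$\begin{aligned}
zw &= rs(\cos\theta + i\sin\theta)(\cos\phi + i\sin\phi) \\
&= rs(\cos\theta\cos\phi + i\sin\theta\cos\phi + i\cos\theta\sin\phi - \sin\theta\sin\phi) \\
&= rs[\cos(\theta + \phi) + i\sin(\theta + \phi)].
\end{aligned}$$

This is the complex number with modulus $rs$ and which makes an angle $\theta + \phi$ with the positive $x$-axis.

Note though that $\theta + \phi$ may not be in $(-\pi, \pi]$. Thus, rather than say that $\arg(zw) = \arg z + \arg w$, we instead say that $\arg(zw) = \arg z + \arg w + 2k\pi$ (where $k = -1, 0, 1$ ensures that $\arg z + \arg w + 2k\pi \in (-\pi, \pi]$).

Here is an alternative quicker proof of the above fact, using the exponential form.

Proof. Let $z = re^{i\theta}$ and $w = se^{i\phi}$. Then $zw = rse^{i(\theta + \phi)}$. This is the complex number with modulus $rs$ and which makes an angle $\theta + \phi$ with the positive $x$-axis.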
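Here is a quick numerical check of the product rule, written as a Python sketch. The helper principal, which shifts an angle by a multiple of $2\pi$ into $(-\pi, \pi]$, is my own and plays the role of the $+2k\pi$ term. The numbers $-3 + 4i$ and $-5 + 2i$ are chosen so that the shift is actually needed.

```python
import cmath
import math


def principal(angle):
    """Shift an angle by a multiple of 2*pi so that it lies in (-pi, pi]."""
    while angle > math.pi:
        angle -= 2 * math.pi
    while angle <= -math.pi:
        angle += 2 * math.pi
    return angle


z, w = -3 + 4j, -5 + 2j
product = z * w
print(product)                                   # standard form of zw
print(abs(product), abs(z) * abs(w))             # the moduli multiply
raw_sum = cmath.phase(z) + cmath.phase(w)        # arg z + arg w, about 4.975
print(cmath.phase(product), principal(raw_sum))  # both about -1.308
```

Here cmath.phase returns the principal argument of a complex number.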

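Example 401. Let $z = 5 - 2i$ and $w = 1 + 3i$. Then

$$|z| = \sqrt{5^2 + (-2)^2} = \sqrt{29}, \quad \text{and} \quad \arg z = \tan^{-1}(-2/5),$$

$$|w| = \sqrt{1^2 + 3^2} = \sqrt{10}, \quad \text{and} \quad \arg w = \tan^{-1}(3/1),$$

$$|zw| = \sqrt{29}\sqrt{10} = \sqrt{290}, \quad \text{and}$$

$$\arg(zw) = \tan^{-1}(-2/5) + \tan^{-1}(3/1) + 2k\pi \approx 0.869 + 2k\pi = 0.869 \quad (k = 0).$$

Notice that here $\arg z + \arg w \approx 0.869 \in (-\pi, \pi]$. So $\arg z + \arg w$ is already a principal value and we can simply set $k = 0$, so that $\arg(zw) = \arg z + \arg w$.

To get $zw$ in standard form, use a calculator. You'll get

$$\sqrt{290}\cos\left[\tan^{-1}\frac{-2}{5} + \tan^{-1}\frac{3}{1}\right] = 11, \quad \text{and} \quad \sqrt{290}\sin\left[\tan^{-1}\frac{-2}{5} + \tan^{-1}\frac{3}{1}\right] = 13.$$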

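Example 402. Let $z = -4 + 7i$ and $w = 1 - 6i$. Then

$$|z| = \sqrt{(-4)^2 + 7^2} = \sqrt{65}, \quad \text{and} \quad \arg z = \tan^{-1}[7/(-4)] + \pi,$$

$$|w| = \sqrt{1^2 + (-6)^2} = \sqrt{37}, \quad \text{and} \quad \arg w = \tan^{-1}(-6/1),$$

$$|zw| = \sqrt{65}\sqrt{37} = \sqrt{2405}, \quad \text{and}$$

$$\arg(zw) = \tan^{-1}[7/(-4)] + \pi + \tan^{-1}(-6/1) + 2k\pi \approx 0.684 + 2k\pi = 0.684 \quad (k = 0).$$

Notice that here $\arg z + \arg w \approx 0.684 \in (-\pi, \pi]$. So $\arg z + \arg w$ is already a principal value and we can simply set $k = 0$, so that $\arg(zw) = \arg z + \arg w$.

To get $zw$ in standard form, use a calculator. You'll get

$$\sqrt{2405}\cos\left[\tan^{-1}\frac{7}{-4} + \pi + \tan^{-1}\frac{-6}{1}\right] = 38, \quad \text{and} \quad \sqrt{2405}\sin\left[\tan^{-1}\frac{7}{-4} + \pi + \tan^{-1}\frac{-6}{1}\right] = 31.$$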

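Example 403. Let $z = -3 + 4i$ and $w = -5 + 2i$. Then

$$|z| = \sqrt{(-3)^2 + 4^2} = 5, \quad \text{and} \quad \arg z = \tan^{-1}[4/(-3)] + \pi,$$

$$|w| = \sqrt{(-5)^2 + 2^2} = \sqrt{29}, \quad \text{and} \quad \arg w = \tan^{-1}[2/(-5)] + \pi,$$

$$|zw| = 5\sqrt{29}, \quad \text{and}$$

$$\arg(zw) = \left(\tan^{-1}\frac{4}{-3} + \pi\right) + \left(\tan^{-1}\frac{2}{-5} + \pi\right) + 2k\pi \approx 4.975 + 2k\pi \approx -1.308 \quad (k = -1).$$

Here $\arg z + \arg w \approx 4.975 \notin (-\pi, \pi]$. So we set $k = -1$, which gives $\arg(zw) = \arg z + \arg w - 2\pi \approx -1.308 \in (-\pi, \pi]$, so that $\arg(zw)$ is indeed a principal value.

To get $zw$ in standard form, use a calculator. You'll get

$$5\sqrt{29}\cos\left[\tan^{-1}\frac{4}{-3} + \pi + \tan^{-1}\frac{2}{-5} + \pi\right] = 7 \quad \text{and} \quad 5\sqrt{29}\sin\left[\tan^{-1}\frac{4}{-3} + \pi + \tan^{-1}\frac{2}{-5} + \pi\right] = -26.$$

And indeed $zw = (-3 + 4i)(-5 + 2i) = 7 - 26i$.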

Exercise 157. Write down zw in polar and exponential forms, for each of the following

pair of z and w. (a) z = 1, w = 3. (b) z = 2i, w = 1 + 2i. (c) z = 1 3i, w = 3 + 4i. (Answer

on p. 1114.)


40.2

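Fact 52. Ratio of two complex numbers. Let $z$ and $w$ be complex numbers with $w \neq 0$. Then

$$\left|\frac{z}{w}\right| = \frac{|z|}{|w|}, \quad \text{and} \quad \arg\frac{z}{w} = \arg z - \arg w + 2k\pi,$$

where $k = -1, 0, 1$ ensures that $\arg z - \arg w + 2k\pi \in (-\pi, \pi]$.

Proof. Let $z = r(\cos\theta + i\sin\theta)$ and $w = s(\cos\phi + i\sin\phi)$. Then

$$\begin{aligned}
\frac{z}{w} &= \frac{r(\cos\theta + i\sin\theta)}{s(\cos\phi + i\sin\phi)} = \frac{r}{s} \cdot \frac{\cos\theta + i\sin\theta}{\cos\phi + i\sin\phi} \cdot \frac{\cos\phi - i\sin\phi}{\cos\phi - i\sin\phi} \\
&= \frac{r}{s} \cdot \frac{\cos\theta\cos\phi + \sin\theta\sin\phi + i(\sin\theta\cos\phi - \cos\theta\sin\phi)}{\cos^2\phi + \sin^2\phi} \\
&= \frac{r}{s}[\cos(\theta - \phi) + i\sin(\theta - \phi)].
\end{aligned}$$

This is the complex number with modulus $r/s$ and argument $\theta - \phi + 2k\pi$ (where $k$ is the unique integer such that $\theta - \phi + 2k\pi \in (-\pi, \pi]$).

Here is an alternative quicker proof of the above fact, using the exponential form.

Proof. Let $z = re^{i\theta}$ and $w = se^{i\phi}$. Then $z/w = e^{i(\theta - \phi)}(r/s)$. This is the complex number with modulus $r/s$ and argument $\theta - \phi + 2k\pi$ (where $k$ is the unique integer such that $\theta - \phi + 2k\pi \in (-\pi, \pi]$).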
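The ratio rule can be checked numerically in just the same way as the product rule. The Python sketch below uses the numbers $-4 + 7i$ and $1 - 6i$, for which the difference of the arguments falls outside $(-\pi, \pi]$ and must be shifted. As before, the helper name principal is my own.

```python
import cmath
import math


def principal(angle):
    """Shift an angle by a multiple of 2*pi so that it lies in (-pi, pi]."""
    while angle > math.pi:
        angle -= 2 * math.pi
    while angle <= -math.pi:
        angle += 2 * math.pi
    return angle


z, w = -4 + 7j, 1 - 6j
ratio = z / w
print(abs(ratio), abs(z) / abs(w))                    # the moduli divide
raw_difference = cmath.phase(z) - cmath.phase(w)      # about 3.496
print(cmath.phase(ratio), principal(raw_difference))  # both about -2.788
```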

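Example 404. Let $z = 5 - 2i$ and $w = 1 + 3i$. Then

$$|z| = \sqrt{29}, \quad \arg z = \tan^{-1}\frac{-2}{5}, \quad |w| = \sqrt{10}, \quad \arg w = \tan^{-1}\frac{3}{1}.$$

$$\left|\frac{z}{w}\right| = \frac{\sqrt{29}}{\sqrt{10}} = \sqrt{2.9}, \quad \arg\left(\frac{z}{w}\right) = \tan^{-1}\frac{-2}{5} - \tan^{-1}\frac{3}{1} + 2k\pi \approx -1.630 \quad (k = 0).$$

Example 405. Let $z = -4 + 7i$ and $w = 1 - 6i$. Then

$$|z| = \sqrt{65}, \quad \arg z = \tan^{-1}\frac{7}{-4} + \pi, \quad |w| = \sqrt{37}, \quad \arg w = \tan^{-1}\frac{-6}{1}.$$

$$\left|\frac{z}{w}\right| = \sqrt{\frac{65}{37}}, \quad \arg\left(\frac{z}{w}\right) = \tan^{-1}\frac{7}{-4} + \pi - \tan^{-1}\frac{-6}{1} + 2k\pi \approx 3.496 + 2k\pi \approx -2.788 \quad (k = -1).$$

$$\frac{z}{w} \approx \sqrt{\frac{65}{37}}\,[\cos(-2.788) + i\sin(-2.788)] = \sqrt{\frac{65}{37}}\,e^{i(-2.788)}.$$

Example 406. Let $z = -3 + 4i$ and $w = -5 + 2i$. Then

$$|z| = 5, \quad \arg z = \tan^{-1}\frac{4}{-3} + \pi, \quad |w| = \sqrt{29}, \quad \arg w = \tan^{-1}\frac{2}{-5} + \pi.$$

$$\left|\frac{z}{w}\right| = \frac{5}{\sqrt{29}}, \quad \arg\left(\frac{z}{w}\right) = \tan^{-1}\frac{4}{-3} - \tan^{-1}\frac{2}{-5} + 2k\pi \approx -0.547 + 2k\pi = -0.547 \quad (k = 0).$$

$$\frac{z}{w} \approx \frac{5}{\sqrt{29}}\,[\cos(-0.547) + i\sin(-0.547)] = \frac{5}{\sqrt{29}}\,e^{i(-0.547)}.$$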

Exercise 158. Write down $z/w$ in polar and exponential forms, for each of the following pairs of z and w. (a) z = 1, w = 3. (b) z = 2i, w = 1 + 2i. (c) z = 1 3i, w = 3 + 4i. (Answer on p. 1114.)

40.3

Fact 53 expresses the sine and cosine functions as weighted sums of the exponential functions. It is not in the syllabus, but made a sudden first-time appearance on the 2015 A-level

exams (Exercise 356), just to screw students over.

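Fact 53. $$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad \text{and} \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$

Proof. By the Euler Formula, $e^{i\theta} = \cos\theta + i\sin\theta$. Moreover, $e^{-i\theta} = \cos(-\theta) + i\sin(-\theta) = \cos\theta - i\sin\theta$, where the second equality uses the properties $\cos x = \cos(-x)$ and $\sin(-x) = -\sin x$.

Hence, $$\frac{e^{i\theta} + e^{-i\theta}}{2} = \frac{2\cos\theta}{2} = \cos\theta, \text{ as desired.}$$

Similarly, $$\frac{e^{i\theta} - e^{-i\theta}}{2i} = \frac{2i\sin\theta}{2i} = \sin\theta, \text{ also as desired.}$$

The 2015 question was about the sum (or difference) of two complex numbers that have the same modulus. Here's a similar example: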

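Example 407. Let $z = 5e^{i\pi}$ and $w = 5e^{0.4\pi i}$. What, exactly, are the modulus and arguments of $z + w$ and $z - w$?

Without the above fact, it's not obvious. With the above fact, it's easy. First, observe that $0.7\pi$ is the average of $\pi$ and $0.4\pi$. Then factorise $5e^{i\pi} + 5e^{0.4\pi i}$ into a form where we can exploit the above fact. Like so:

$$z + w = 5e^{i\pi} + 5e^{0.4\pi i} = 5e^{0.7\pi i}\left(e^{0.3\pi i} + e^{-0.3\pi i}\right) = 5e^{0.7\pi i} \cdot 2\cos(0.3\pi),$$

where the last $=$ uses Fact 53. And thus:

$$\begin{aligned}
\arg(z + w) &= \arg\left[5e^{0.7\pi i} \cdot 2\cos(0.3\pi)\right] \\
&= \arg 5 + \arg\left(e^{0.7\pi i}\right) + \arg 2 + \arg[\cos(0.3\pi)] + 2k\pi \\
&= 0 + 0.7\pi + 0 + 0 + 2k\pi = 0.7\pi \quad (k = 0).
\end{aligned}$$

$$|z + w| = \left|5e^{0.7\pi i} \cdot 2\cos(0.3\pi)\right| = |5|\left|e^{0.7\pi i}\right||2\cos(0.3\pi)| = 5 \cdot 1 \cdot 2\cos(0.3\pi) = 10\cos(0.3\pi).$$

Altogether then, $z + w = 10\cos(0.3\pi)\,e^{i(0.7\pi)}$.

We can play a similar trick to figure out the modulus and argument of $z - w$:

$$z - w = 5e^{i\pi} - 5e^{0.4\pi i} = 5e^{0.7\pi i}\left(e^{0.3\pi i} - e^{-0.3\pi i}\right) = 5e^{0.7\pi i} \cdot 2i\sin(0.3\pi),$$

where again the last $=$ uses Fact 53. And thus:

$$\begin{aligned}
\arg(z - w) &= \arg\left[5e^{0.7\pi i} \cdot 2i\sin(0.3\pi)\right] \\
&= \arg 5 + \arg\left(e^{0.7\pi i}\right) + \arg 2 + \arg i + \arg[\sin(0.3\pi)] + 2k\pi \\
&= 0 + 0.7\pi + 0 + 0.5\pi + 0 + 2k\pi = -0.8\pi \quad (k = -1),
\end{aligned}$$

$$|z - w| = \left|5e^{0.7\pi i} \cdot 2i\sin(0.3\pi)\right| = |5|\left|e^{0.7\pi i}\right||2||i||\sin(0.3\pi)| = 5 \cdot 1 \cdot 2 \cdot 1 \cdot \sin(0.3\pi) = 10\sin(0.3\pi).$$

Altogether then, $z - w = 10\sin(0.3\pi)\,e^{i(-0.8\pi)}$.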
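A numerical check is reassuring here, since the factorising trick is easy to get wrong. The Python sketch below assumes the exponents in the example carry a factor of $\pi$, i.e. $z = 5e^{i\pi}$ and $w = 5e^{0.4\pi i}$ (that reading is my assumption). It then compares the modulus and argument of $z + w$ and $z - w$ against the closed forms obtained from Fact 53.

```python
import cmath
import math

z = 5 * cmath.exp(1j * math.pi)        # assumed: z = 5e^(i*pi)
w = 5 * cmath.exp(0.4j * math.pi)      # assumed: w = 5e^(0.4*pi*i)

total, difference = z + w, z - w

# Modulus and principal argument of z + w, versus 10cos(0.3pi) and 0.7pi.
print(abs(total), 10 * math.cos(0.3 * math.pi))
print(cmath.phase(total), 0.7 * math.pi)

# Modulus and principal argument of z - w, versus 10sin(0.3pi) and -0.8pi.
print(abs(difference), 10 * math.sin(0.3 * math.pi))
print(cmath.phase(difference), -0.8 * math.pi)
```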

In general,

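Fact 54. $$e^{i\theta} + e^{i\phi} = 2\cos\frac{\theta - \phi}{2}\,e^{i\left(\frac{\theta + \phi}{2} + 2k\pi\right)} \quad \text{and} \quad e^{i\theta} - e^{i\phi} = 2\sin\frac{\theta - \phi}{2}\,e^{i\left(\frac{\theta + \phi + \pi}{2} + 2m\pi\right)},$$

where $k$ and $m$ are integers.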

Proof. See Exercise 160.

Exercise 159. Let z = 3ei(0.2) and w = 3ei(0.9) . By mimicking the steps in Example 407,

find z + w and z w in exact polar and exponential forms. (Answer on p. 1115.)

SYLLABUS ALERT

If you're taking the 9758 (revised) exam, you are done with Part IV: Complex Numbers.

The remaining chapters in Part IV cover the following, which are on the 9740 (old) syllabus but not on the 9758 (revised) syllabus:

- geometrical effects of conjugating a complex number and of adding, subtracting, multiplying, dividing two complex numbers
- loci such as $|z - c| \leq r$, $|z - a| = |z - b|$ and $\arg(z - a) = \alpha$
- use of de Moivre's theorem to find the powers and nth roots of a complex number.

41

In secondary school, we learnt to do some geometry using cartesian equations. And in Part

III (Vectors), we learnt to do some geometry using vector equations. Now, we'll learn to do some geometry using complex equations!

41.1

Given two complex numbers z = x + iy and w = a + ib, their sum is simply the complex

number z + w = (x + a) + (y + b)i.

We already know how to interpret z = (x, y) and w = (a, b) as points on the plane. This

gives us a nice geometric interpretation: z +w = (x+a, y +b) is likewise a point on the plane.

We can also interpret z = (x, y) and w = (a, b) as position vectors. And thus as usual, the

sum of two vectors is itself a vector: z + w = (x + a, y + b).

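[Figure: the position vectors $z = (x, y)$ and $w = (a, b)$, together with their sum $z + w = (x + a, y + b)$.]

Similarly, their difference $z - w$ is simply the point $(x - a, y - b)$. This corresponds to the vector $z - w = (x - a)\mathbf{i} + (y - b)\mathbf{j}$.

[Figure: the position vectors $z = (x, y)$ and $w = (a, b)$, together with their difference $z - w = (x - a, y - b)$.]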

from the above figures and also bearing in mind Corollary 3 (the sum of the lengths of any

two sides of a triangle is always greater than the length of the third side).


41.2

With sums and differences, there was an exact analogy to vectors. In contrast, with products

and ratios of complex numbers, there is no analogy to vectors. In particular, the

product of two complex numbers has nothing to do with the scalar product or vector

product of their position vectors.

Nonetheless, we do have nice geometric interpretations. We already know from Fact 51 that the product of two complex numbers $z$ and $w$ is simply the complex number $zw$ with

1. $|zw| = |z||w|$; and
2. $\arg(zw) = \arg z + \arg w + 2k\pi$, where $k = -1, 0, 1$ ensures that $\arg z + \arg w + 2k\pi \in (-\pi, \pi]$.

So geometrically, to get $zw$, we take $z$ and

1. First multiply its length by a factor equal to the length of $w$;
2. Then rotate it anti-clockwise by the angle $\arg w$.

[Figure: the points $z = (a, b)$, $w = (c, d)$, and $zw = (ac - bd, ad + bc)$ on an Argand diagram.]

Similarly, we already know from Fact 52 that the ratio of complex numbers $z$ to $w$ is simply the complex number $z/w$ with

1. $|z/w| = |z| / |w|$; and
2. $\arg(z/w) = \arg z - \arg w + 2k\pi$, where $k = -1, 0, 1$ ensures that $\arg z - \arg w + 2k\pi \in (-\pi, \pi]$.

So geometrically, to get $z/w$, we take $z$ and

1. First compress its length by a factor equal to the length of $w$;
2. Then rotate it anti-clockwise by the angle $-\arg w$. (Or equivalently, clockwise by the angle $\arg w$.)

[Figure: the points $z = (a, b)$ and $w = (c, d)$ on an Argand diagram.]

41.3

complex number is simply to reflect it in the horizontal axis.

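Fact 55. Complex conjugate. Let $z$ be a complex number. Then

$$|z^*| = |z|, \quad \text{and} \quad \arg z^* = -\arg z$$

(unless $z$ is a negative real number, in which case $\arg z^* = \arg z = \pi$).

Proof. Let $z = r(\cos\theta + i\sin\theta)$. Then

$$z^* = r(\cos\theta - i\sin\theta) = r[\cos(-\theta) + i\sin(-\theta)].$$

This is the complex number with modulus $r$ and angle $-\theta$ with the positive $x$-axis.

So given $z = r(\cos\theta + i\sin\theta) = re^{i\theta}$, its conjugate is simply

$$z^* = r[\cos(-\theta) + i\sin(-\theta)] = re^{i(-\theta)}.$$

[Figure: the point $z = (x, y)$ and its reflection in the horizontal axis, $z^* = (x, -y)$.]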

42

A locus (plural: loci) is a set of points that satisfy some condition (or conditions). We've actually already encountered plenty of loci in Part I (Functions and Graphs), so this is

nothing new. This chapter reviews loci involving cartesian equations (and inequalities).

The goal is to prepare you for the next chapter, where we look at loci involving complex

equations (and inequalities).

42.1

Circles

Example 408. $\{(x, y) : x^2 + y^2 = 1\}$ is the set of all points $(x, y)$ in the cartesian plane that satisfy the condition $x^2 + y^2 = 1$. Graphically, this locus describes the unit circle centred on the origin. (To be clear, it includes only the circumference of the circle.)

[Figure: the unit circle $\{(x, y) : x^2 + y^2 = 1\}$.]

Example 409. The locus $\{(x, y) : x^2 + y^2 \leq 1\}$ describes the entire interior of the unit circle centred on the origin, including the circumference of the circle.

[Figure: the closed disc $\{(x, y) : x^2 + y^2 \leq 1\}$.]

Example 410. The locus $\{(x, y) : x^2 + y^2 < 1\}$ describes the entire interior of the unit circle centred on the origin, excluding the circumference of the circle.

[Figure: the open disc $\{(x, y) : x^2 + y^2 < 1\}$.]

Example 411. The locus $\{(x, y) : x^2 + y^2 \geq 1\}$ describes everything outside the unit circle centred on the origin, including the circumference of the circle.

[Figure: the region $\{(x, y) : x^2 + y^2 \geq 1\}$.]

Example 412. The locus $\{(x, y) : x^2 + y^2 > 1\}$ describes everything outside the unit circle centred on the origin, excluding the circumference of the circle.

[Figure: the region $\{(x, y) : x^2 + y^2 > 1\}$.]

(a) $\{(x, y) : (x - a)^2 + (y - b)^2 = r^2\}$. (b) $\{(x, y) : (x - a)^2 + (y - b)^2 \leq r^2\}$. (c) $\{(x, y) : (x - a)^2 + (y - b)^2 < r^2\}$. (d) $\{(x, y) : (x - a)^2 + (y - b)^2 \geq r^2\}$. (e) $\{(x, y) : (x - a)^2 + (y - b)^2 > r^2\}$. (Answer on p. 1118.)

42.2

Lines

[Figure: the line $\{(x, y) : y = x\}$.]

Example 414. The locus $\{(x, y) : y \leq x\}$ describes the set of all points under the line $y = x$, including the line itself. It contains literally half the plane, so we call this a half-plane. We can also specify that this is a closed half-plane; the word closed means that it includes also the line $y = x$.

Graphically, the locus $\{(x, y) : y < x\}$ describes the set of all points under the line $y = x$, excluding the line itself. This is an open half-plane. The word open means that it excludes the line $y = x$.

[Figure: the half-planes of Example 414.]

Example 415. Graphically, the locus $\{(x, y) : y \geq x\}$ describes the set of all points above the line $y = x$, including the line itself. Again, this is a closed half-plane.

Graphically, the locus $\{(x, y) : y > x\}$ describes the set of all points above the line $y = x$, but excluding the line itself. Again, this is an open half-plane.

[Figure: the half-planes of Example 415.]

The locus of points that are equidistant to two points is simply a line.

Example 416. Let $(a, b)$ and $(c, d)$ be points. The locus of points that are equidistant to $(a, b)$ and $(c, d)$ is the line illustrated below. This is because if you pick any point (e.g. $P$) on the line, it is indeed equidistant to $(a, b)$ and $(c, d)$. And if you pick any point (e.g. $Q$) not on the line, it must be either closer to $(a, b)$ or closer to $(c, d)$. In this case, $Q$ is closer to $(a, b)$ than to $(c, d)$.

[Figure: the points $(a, b)$ and $(c, d)$ and the line of points equidistant to them. Any point on the line (e.g. $P$) is equidistant to the two points. Any point not on the line (e.g. $Q$) must be closer to one of the two points.]

Let $(a, b)$ and $(c, d)$ be points. We now prove that the locus $\{(x, y) : |(x - a, y - b)| = |(x - c, y - d)|\}$ simply describes a line:

$$\begin{aligned}
|(x - a, y - b)| &= |(x - c, y - d)| \\
\iff (x - a)^2 + (y - b)^2 &= (x - c)^2 + (y - d)^2 \\
\iff x^2 - 2ax + a^2 + y^2 - 2by + b^2 &= x^2 - 2cx + c^2 + y^2 - 2dy + d^2 \\
\iff -2ax + a^2 - 2by + b^2 &= -2cx + c^2 - 2dy + d^2.
\end{aligned}$$
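Rearranging the last equation gives $2(c - a)x + 2(d - b)y = c^2 + d^2 - (a^2 + b^2)$, which is the equation of a line. Here is a small Python sketch that computes those three coefficients for any two given points and then tests whether a point lies on the resulting line. The function names equidistant_line and on_line are my own, made up for illustration.

```python
import math


def equidistant_line(p, q):
    """Return (A, B, C) so that points equidistant to p and q satisfy A*x + B*y = C."""
    (a, b), (c, d) = p, q
    return (2 * (c - a), 2 * (d - b), c**2 + d**2 - (a**2 + b**2))


def on_line(line, point):
    """Check whether the point satisfies A*x + B*y = C."""
    A, B, C = line
    x, y = point
    return A * x + B * y == C


line = equidistant_line((0, 0), (2, 2))
print(line)                    # coefficients of the line x + y = 2, scaled by 4
print(on_line(line, (3, -1)))  # this point is on the line ...
print(math.dist((3, -1), (0, 0)), math.dist((3, -1), (2, 2)))  # ... and equidistant
```

The exact equality test in on_line is fine for integer inputs like these, but would need a tolerance for decimals.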

Exercise 162. (a) Find the cartesian equation of the line that is equidistant to the points

(1, 4) and (5, 0).

(b) Describe in words the set $\{(x, y) : |(x - 17, y - 3)| = |(x + 2, y + 11)|\}$. Then rewrite the cartesian equation $|(x - 17, y - 3)| = |(x + 2, y + 11)|$ into the form $ay + bx + c = 0$. (Answer on p. 1120.)

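44 If we'd like, we can further simplify this equation. If $d - b \neq 0$, then it can be rewritten as

$$y = \frac{a - c}{d - b}\,x + \frac{c^2 + d^2 - (a^2 + b^2)}{2(d - b)}.$$

If instead $d - b = 0$, then the line is vertical and the equation can be rewritten as

$$x = \frac{c^2 + d^2 - (a^2 + b^2)}{2(c - a)}.$$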
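The algebra above can be checked numerically. The following Python sketch computes the coefficients of the equidistant line from the last line of the proof and confirms that a point on it is equally far from both given points. The function name `equidistant_line` and the sample points (0, 0) and (2, 4) are illustrative assumptions, not taken from the text.

```python
import math

def equidistant_line(a, b, c, d):
    """Coefficients (A, B, C) of the line A*x + B*y = C that is
    equidistant to the distinct points (a, b) and (c, d).

    Obtained by rearranging
        -2ax + a^2 - 2by + b^2 = -2cx + c^2 - 2dy + d^2.
    """
    A = 2 * (c - a)
    B = 2 * (d - b)
    C = c**2 + d**2 - (a**2 + b**2)
    return A, B, C

# Illustrative points (an assumption made for this sketch).
A, B, C = equidistant_line(0, 0, 2, 4)   # gives 4x + 8y = 20, i.e. x + 2y = 5

# (5, 0) satisfies x + 2y = 5, so it should be equally far from both points.
x, y = 5, 0
on_line = math.isclose(A * x + B * y, C)
dist_1 = math.hypot(x - 0, y - 0)
dist_2 = math.hypot(x - 2, y - 4)
print(on_line, dist_1, dist_2)
```

Working with the form Ax + By = C avoids having to treat the vertical-line case (d = b) separately.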


42.3

namely the equation $x^2 + y^2 = 1$ and the equation y = x. This locus describes the intersection points of the circle $x^2 + y^2 = 1$ and the line y = x. By plugging the equation of the line into the equation of the circle, we can show that this locus consists of only two points:

$$\{(x, y) : x^2 + y^2 = 1,\ y = x\} = \left\{\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right), \left(-\frac{\sqrt{2}}{2}, -\frac{\sqrt{2}}{2}\right)\right\}.$$

[Figure: the line $\{(x, y) : y = x\}$, the circle $\{(x, y) : x^2 + y^2 = 1\}$, and their two intersection points $\{(x, y) : y = x,\ x^2 + y^2 = 1\}$.]


Example 418. $\{(x, y) : x^2 + y^2 \le 1,\ y = x\}$ is the portion of the line y = x that lies within the circle, including the two endpoints on its circumference. It is illustrated in green in the figure below.

[Figure: the line $\{(x, y) : y = x\}$, the disc $\{(x, y) : x^2 + y^2 \le 1\}$, and the segment $\{(x, y) : y = x,\ x^2 + y^2 \le 1\}$ in green.]

Example 419. The locus $\{(x, y) : x^2 + y^2 > 1,\ y > x\}$ describes the region that is both outside the circle $x^2 + y^2 = 1$ and above the line y = x, excluding the circumference of the circle and the line. It is illustrated in green in the figure below.

[Figure: the line $\{(x, y) : y = x\}$, the circle $\{(x, y) : x^2 + y^2 = 1\}$, and the region described above in green.]

Exercise 163. Sketch on a cartesian plane the locus $\{(x, y) : x^2 + y^2 = 1,\ x > 0\}$. (Answer on p. 1121.)

43

43.1

Circles

On an Argand diagram (or complex plane), the locus $\{z \in \mathbb{C} : |z| = 1\}$ simply describes the unit circle centred on the origin, as we now prove:

Write z = x + iy, where x and y are real. Then $|z| = 1 \iff \sqrt{x^2 + y^2} = 1$, or $x^2 + y^2 = 1$. Hence,

$$\{z \in \mathbb{C} : |z| = 1\} = \{(x, y) : x^2 + y^2 = 1\}.$$

But we already saw in the previous chapter that the locus $\{(x, y) : x^2 + y^2 = 1\}$ describes the unit circle centred on the origin.

Loci involving complex equations (or inequalities) can usually be easily transformed into a familiar cartesian equation (or inequality).

[Figure: the unit circle $\{z : |z| = 1\} = \{(x, y) : x^2 + y^2 = 1\}$.]

Exercise 164. (a) Prove that the locus $\{z \in \mathbb{C} : |z| = r\}$ describes the circle of radius r centred on the origin.

(b) Let c be some fixed complex number. Prove that the locus $\{z \in \mathbb{C} : |z - c| = r\}$ is the circle of radius r centred on the point c.

(c) What does the locus $\{z \in \mathbb{C} : |z - c| \le r\}$ describe?

(d) What does the locus $\{z \in \mathbb{C} : |z - c| < r\}$ describe? (Answer on p. 1122.)

43.2

Lines

Let b and c be fixed complex numbers. The equation $|z - c| = |z - b|$ is simply the condition that z is equidistant to b and c.

Hence, the locus $\{z \in \mathbb{C} : |z - c| = |z - b|\}$ simply describes the points that are equidistant to b and c. And as we showed earlier, such a locus is simply a line.

[Figure: the points b and c and the line $\{z : |z - b| = |z - c|\}$, with d on the line and e off it. Any point on the line is equidistant to the two points. Any point not on the line must be closer to one of the two points.]

Exercise 165. Let b and c be fixed complex numbers. What is the locus of complex numbers z that satisfy each of the following inequalities? (a) $|z - c| \le |z - b|$. (b) $|z - c| < |z - b|$. (c) $|z - c| \ge |z - b|$. (d) $|z - c| > |z - b|$. (Answer on p. 1122.)

43.3

Rays

The locus $\{z \in \mathbb{C} : \arg z = \theta\}$ describes the set of points z whose argument is $\theta$. It is thus the ray (or half-line) which starts from but excludes the origin and which makes an angle $\theta$ with the positive x-axis. The figure below illustrates.

The point A is in the locus, because indeed $\arg A = \theta$. In contrast, the point B is not in the locus, because $\arg B \neq \theta$.

Note importantly that points along the dotted red ray, such as C, are not in the locus, because $\arg C$ differs from $\theta$ by $\pi$.

Moreover, the origin is not in the locus, because arg 0 is undefined.

[Figure: the ray $\{z : \arg z = \theta\}$ on an Argand diagram, with the opposite ray through C drawn as a dotted red line.]

If we really wanted to, we could rewrite the complex equation $\arg z = \theta$ into cartesian form. But it turns out that in this case, the cartesian form is more complicated. And so we'll just stick with the equation $\arg z = \theta$.

Let a be a complex number. Then the graph of $\arg(z - a) = \theta$ is simply the translation of the graph of $\arg z = \theta$. And so $\arg(z - a) = \theta$ is the ray (or half-line) which starts from but excludes the point a and which makes an angle $\theta$ with the positive x-axis.

The point b is in the locus, because indeed $\arg(b - a) = \theta$. In contrast, the point c is not in the locus, because $\arg(c - a) \neq \theta$.

Note importantly that points along the dotted red ray, such as d, are not in the locus, because $\arg(d - a)$ differs from $\theta$ by $\pi$.

Moreover, the point a is not in the locus, because $\arg(a - a) = \arg 0$ is undefined.

[Figure: the ray $\{z : \arg(z - a) = \theta\}$ starting at a, with the points b and d marked.]

The 9740 syllabus doesn't mention loci of the form $\theta_1 \le \arg z \le \theta_2$. Unfortunately, such loci have occasionally appeared on the A-level exams,45 which means you have to learn them.

The locus $\theta_1 \le \arg z \le \theta_2$ is simply the region bounded by (and including) the rays $\arg z = \theta_1$ and $\arg z = \theta_2$.

[Figure: the region between the rays $\{z : \arg z = \theta_1\}$ and $\{z : \arg z = \theta_2\}$.]

43.4

Definition 98. A chord is a line segment connecting any two points on a circle's circumference.

Here are a few properties of the circle (which you are supposed to still remember from O-levels) and which would definitely have been useful in some complex loci questions in the past ten years' A-levels.

Fact 56. Let A be a point exterior to a circle. Let B and C be the points at which the tangents from A touch the circle. Let O be the centre of the circle.

(a) The line through A and O (i) bisects the angle BAC; (ii) is the perpendicular bisector of the chord BC; and (iii) passes through the points D and E, which are the points on the circle that are respectively closest to and furthest from A.

(b) The lengths AB and AC are equal.

(c) The angles OBA and OCA are right angles.

[Figure: a circle with centre O, the tangents from an exterior point touching the circle at B and C, the chord BC and its perpendicular bisector, and the point E on the circle.]

Here's an example that illustrates the uses of the above properties of the circle.

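Example 420. The complex number z satisfies the equation $|z + 4 + 2i| = 1$. (a) What are the maximum and minimum possible values of $|z|$? (b) For what values of z is $|z|$ maximised and minimised?

$|z + 4 + 2i| = 1$ describes a unit circle centred on the point C = (−4, −2). Even if not asked for, you should make a quick sketch to help yourself see better.

(a) By the above fact, $|z|$ is maximised at F and minimised at N, where F and N lie on the line through the origin and the circle's centre. Since $OC = \sqrt{4^2 + 2^2} = \sqrt{20}$ and the radius is 1, the maximum value of $|z|$ is $\sqrt{20} + 1$ and the minimum value is $\sqrt{20} - 1$.

(b) Consider the right-angled triangle CAN, where A is the point directly below N and level with C. The line through F, C, N, and the origin is y = 0.5x. So AN = 0.5CA. Moreover, $CA^2 + AN^2 = CN^2 = 1^2 = 1$.

Altogether then, $CA^2 + 0.25CA^2 = 1$ or $CA^2 = \frac{4}{5}$ or $CA = \frac{2}{\sqrt{5}}$. And $AN = \frac{1}{\sqrt{5}}$. Hence,

$$N = \left(-4 + \frac{2}{\sqrt{5}},\ -2 + \frac{1}{\sqrt{5}}\right).$$

Symmetrically, $F = \left(-4 - \frac{2}{\sqrt{5}},\ -2 - \frac{1}{\sqrt{5}}\right)$.

[Figure: the circle $|z + 4 + 2i| = 1$, the line y = 0.5x through the origin O and the centre of the circle, and the points U, N, F and D on the circle.]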

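z satisfies $|z + 4 + 2i| = 1$. (c) What are the maximum and minimum possible values of arg z? (d) For what values of z is arg z maximised and minimised?

(c) The points U and D at which arg z takes its extreme values are where the tangents OU and OD from the origin touch the circle. By the above fact, OU is perpendicular to CU. Similarly, OD is perpendicular to CD.

The angle the lower half of the line y = 0.5x makes with the positive x-axis is $\theta = \tan^{-1} 0.5 - \pi$. Since CU = 1 and $OC = \sqrt{20}$, the angle COU is $\sin^{-1}(1/\sqrt{20})$, and likewise for the angle COD. Hence the minimum value of arg z is $\arg U = \theta - \angle COU = \tan^{-1} 0.5 - \pi - \sin^{-1}(1/\sqrt{20})$, and the maximum value is $\arg D = \theta + \angle COD = \tan^{-1} 0.5 - \pi + \sin^{-1}(1/\sqrt{20})$.

(d) By Pythagoras' Theorem, $OD = \sqrt{OC^2 - CD^2} = \sqrt{20 - 1} = \sqrt{19}$. Altogether then, $|D| = \sqrt{19}$ and $\arg D = \tan^{-1} 0.5 - \pi + \sin^{-1}(1/\sqrt{20})$.

Symmetrically, we also have $|U| = \sqrt{19}$ and $\arg U = \tan^{-1} 0.5 - \pi - \sin^{-1}(1/\sqrt{20})$.

(Figure reproduced for convenience.)

[Figure: the circle $|z + 4 + 2i| = 1$, the line y = 0.5x through the origin O and the centre of the circle, and the points U, N, F and D on the circle.]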
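The geometry in Example 420 can be cross-checked numerically. The sketch below (variable names are illustrative assumptions) computes the extreme values of |z| and arg z from the distance between the origin and the centre, then compares them against a brute-force scan of points on the circle.

```python
import cmath
import math

centre = complex(-4, -2)   # |z + 4 + 2i| = 1 is a circle centred at -4 - 2i
radius = 1.0
dist = abs(centre)         # distance from the origin to the centre, sqrt(20)

# Extremes of |z| lie on the line through the origin and the centre.
max_mod = dist + radius
min_mod = dist - radius
nearest = centre - radius * centre / dist    # the point N
farthest = centre + radius * centre / dist   # the point F

# Extremes of arg z occur where the tangents from the origin touch the circle.
half_angle = math.asin(radius / dist)
arg_min = cmath.phase(centre) - half_angle
arg_max = cmath.phase(centre) + half_angle
tangent_length = math.sqrt(dist**2 - radius**2)   # sqrt(19)

# Brute-force cross-check: sample many points on the circle.
samples = [centre + radius * cmath.exp(2j * math.pi * k / 20000)
           for k in range(20000)]
sampled_max_mod = max(abs(z) for z in samples)
sampled_min_mod = min(abs(z) for z in samples)
sampled_max_arg = max(cmath.phase(z) for z in samples)
sampled_min_arg = min(cmath.phase(z) for z in samples)

print(max_mod, sampled_max_mod)
print(arg_max, sampled_max_arg)
```

The scan is safe here because the whole circle lies in the third quadrant, so `cmath.phase` never jumps across the branch cut at ±π.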

Exercise 167. The complex number z satisfies the equation $|z - 2 - 2i| = 1$. (a) What are the maximum and minimum possible values of $|z|$? (b) For what values of z is $|z|$ maximised and minimised? (c) What are the maximum and minimum possible values of arg z? (d) For what values of z is arg z maximised and minimised? (Answer on p. 1123f.)

44

De Moivre's Theorem

$$(\cos\theta + i\sin\theta)^n = \cos(n\theta) + i\sin(n\theta).$$

Proof. $\cos\theta + i\sin\theta$ is the complex number with modulus 1 and argument $\theta$. So by Fact 51, $(\cos\theta + i\sin\theta)^n$ is the complex number with modulus $1^n = 1$ and argument $n\theta + 2k\pi$ (where k is the unique integer such that $n\theta + 2k\pi$ is a principal value). This complex number can be written as $\cos(n\theta) + i\sin(n\theta)$.

Here is an alternative proof that uses the Euler Formula:

Proof. $(\cos\theta + i\sin\theta)^n \overset{1}{=} (e^{i\theta})^n \overset{2}{=} e^{i(n\theta)} \overset{3}{=} \cos(n\theta) + i\sin(n\theta)$, where $\overset{1}{=}$ and $\overset{3}{=}$ use the Euler Formula (Theorem 6) and $\overset{2}{=}$ uses the law of exponents $(x^a)^b = x^{ab}$, which applies even when a is imaginary.

Exercise 168. Prove de Moivre's Theorem using the method of mathematical induction. (Answer on p. 1125.)

De Moivre's Theorem is stated for complex numbers with modulus 1, but it is easy to rewrite it so that it applies more generally to any complex number with modulus r:

$$[r(\cos\theta + i\sin\theta)]^n = r^n[\cos(n\theta) + i\sin(n\theta)].$$

Or equivalently, if $|z| = r$ and $\arg z = \theta$, then $|z^n| = r^n$ and $\arg z^n = n\theta + 2k\pi$ (where k is the unique integer such that $n\theta + 2k\pi$ is a principal value).

44.1

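Example 421. Let z = 1 + i. Then:

$$
\begin{aligned}
|z| &= \sqrt{2}, & \arg z &= \frac{\pi}{4} + 2k\pi = \frac{\pi}{4} && (k = 0), \\
|z^2| &= (\sqrt{2})^2 = 2, & \arg z^2 &= 2\left(\frac{\pi}{4}\right) + 2k\pi = \frac{\pi}{2} && (k = 0), \\
|z^3| &= (\sqrt{2})^3 = 2\sqrt{2}, & \arg z^3 &= 3\left(\frac{\pi}{4}\right) + 2k\pi = \frac{3\pi}{4} && (k = 0), \\
|z^4| &= (\sqrt{2})^4 = 4, & \arg z^4 &= 4\left(\frac{\pi}{4}\right) + 2k\pi = \pi && (k = 0), \\
|z^5| &= (\sqrt{2})^5 = 4\sqrt{2}, & \arg z^5 &= 5\left(\frac{\pi}{4}\right) + 2k\pi = -\frac{3\pi}{4} && (k = -1),
\end{aligned}
$$

etc.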
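The bookkeeping with k in Example 421 can be automated. This is a minimal sketch, assuming the principal argument is taken in (−π, π]; the function name `power_in_polar_form` is an illustrative choice. It applies de Moivre's Theorem and compares the result with Python's built-in complex power.

```python
import cmath
import math

def power_in_polar_form(r, theta, n):
    """Return (modulus, principal argument, k) of z**n, where z has
    modulus r and argument theta, using de Moivre's Theorem."""
    modulus = r ** n
    # k is the unique integer with -pi < n*theta + 2*k*pi <= pi.
    k = math.floor((math.pi - n * theta) / (2 * math.pi))
    return modulus, n * theta + 2 * k * math.pi, k

z = 1 + 1j
for n in range(1, 6):
    modulus, argument, k = power_in_polar_form(abs(z), cmath.phase(z), n)
    direct = z ** n   # built-in complex power, for comparison
    print(n, round(modulus, 4), round(argument, 4), k,
          round(abs(direct), 4), round(cmath.phase(direct), 4))
```

Because the computation uses floating-point numbers, an argument that lands exactly on the boundary ±π may be sensitive to rounding.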

[Figure: the powers $z, z^2, \dots, z^{13}$ of z = 1 + i plotted on an Argand diagram.]

$$|z| = \sqrt{1^2 + 0.4^2} = \sqrt{1.16},\quad |z^2| = 1.16,\quad |z^3| = 1.16\sqrt{1.16},\quad |z^4| = 1.16^2,\quad |z^5| = 1.16^2\sqrt{1.16},$$

etc.

The powers of z = 1 + 0.4i, up to the 14th, are illustrated in the figure below.

[Figure: the powers $z, z^2, \dots, z^{14}$ of z = 1 + 0.4i plotted on an Argand diagram.]

Exercise 169. (a) Given $z = 3 - 4i$, find $|z|$ and $\arg z$. Hence find $|z^7|$ and $\arg z^7$. Write down $(3 - 4i)^7$ in exponential form.

(b) Given $z = 5 + 12i$, find $|z|$ and $\arg z$. Hence find $|z^8|$ and $\arg z^8$. Write down $(5 + 12i)^8$ in exponential form. (Answer on p. 1125.)

Exercise 170. For each of the given values of z, compute $z^{10}$, expressing your answer in all three forms (polar, exponential, and standard). (a) $z = 1 - i$. (b) $z = 2 + i$. (c) $z = 1 - 3i$. (Answer on p. 1126.)

44.2

Example 423. What are the roots of the equation $z^3 = 1 + i$? That is, for what values of z is the given equation true?

A naïve application of de Moivre's Theorem might suggest that

$$|z^3| = 2^{1/2} \text{ and } \arg z^3 = \pi/4 \implies |z| = \left(2^{1/2}\right)^{1/3} = 2^{1/6} \text{ and } \arg z = \pi/12.$$

This is not incorrect, but it gives us only one root of the equation $z^3 = 1 + i$, namely $z = 2^{1/6}e^{i(\pi/12)}$.

In contrast, the Fundamental Theorem of Algebra tells us that since the equation $z^3 = 1 + i$ involves a degree-3 polynomial, it should have 3 roots. We've just found one root. How do we find the other two?

The trick is to recognise that $z^3 = 2^{1/2}e^{i\pi/4}$ can also be written as $z^3 = 2^{1/2}e^{i(\pi/4 + 2k\pi)}$, for any integer k. This is because if you plug in any integer k, you will always get $2^{1/2}e^{i(\pi/4 + 2k\pi)} = 2^{1/2}e^{i(\pi/4)}$. The reason is that $e^{i(2\pi)} = 1$.

We then have $z = (z^3)^{1/3} = \left[2^{1/2}e^{i(\pi/4 + 2k\pi)}\right]^{1/3} = 2^{1/6}e^{i(\pi/4 + 2k\pi)/3}$, for any integer k. Now in contrast to before, different integers k can yield distinct values for $z = 2^{1/6}e^{i(\pi/4 + 2k\pi)/3}$.

In particular, if we pick values of k so that the values of $(\pi/4 + 2k\pi)/3$ are principal values, that is, if we pick $k = 0, \pm 1$, we have

$$z = 2^{1/6}e^{i(\pi/12)},\quad 2^{1/6}e^{i(3\pi/4)},\quad 2^{1/6}e^{-i(7\pi/12)}.$$

Observe that, beautifully enough, the roots of the equation $z^3 = 1 + i$ lie on a circle: in particular, the circle of radius $2^{1/6}$ centred on the origin. Moreover, each root can be obtained by rotating another root $\frac{2\pi}{3}$ radians about the origin.

In general, given $z^n$, we have

$$|z| = |z^n|^{1/n} \quad\text{and}\quad \arg z = \frac{\arg z^n + 2k\pi}{n},$$

where k are those integers such that $\arg z = \frac{\arg z^n}{n} + \frac{2k\pi}{n} \in (-\pi, \pi]$.

The annoying part is to figure out the appropriate values of k. So here's how to do it:

1. If n is odd, then simply pick $k = 0, \pm 1, \pm 2, \dots, \pm\frac{n-1}{2}$. (E.g., if n = 15, then pick $k = 0, \pm 1, \pm 2, \dots, \pm 7$.)

2. If n is even AND $\arg z^n > 0$, then simply pick $k = 0, \pm 1, \pm 2, \dots, \pm\left(\frac{n}{2} - 1\right), -\frac{n}{2}$. (E.g., if n = 16 and $\arg z^n > 0$, then pick $k = 0, \pm 1, \pm 2, \dots, \pm 7, -8$.)

3. If n is even AND $\arg z^n \le 0$, then simply pick $k = 0, \pm 1, \pm 2, \dots, \pm\left(\frac{n}{2} - 1\right), \frac{n}{2}$. (E.g., if n = 16 and $\arg z^n \le 0$, then pick $k = 0, \pm 1, \pm 2, \dots, \pm 7, 8$.)

You can easily verify that in each case, we do indeed have n roots (just count them). See Fact 90 in the Appendices for a proof (or explanation) of why the above values of k ensure that we have n distinct principal values for arg z.
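Rather than memorising which values of k to use, one can let a short program test each candidate. The sketch below is one way to do this in Python (the function name `nth_roots` is an illustrative assumption): it tries every k in a generous range and keeps those for which the resulting argument is a principal value.

```python
import cmath
import math

def nth_roots(w, n):
    """All n roots of z**n = w, each as (k, modulus, principal argument)."""
    modulus = abs(w) ** (1 / n)
    phi = cmath.phase(w)              # arg w, in (-pi, pi]
    roots = []
    for k in range(-n, n + 1):        # more than enough candidates
        theta = (phi + 2 * k * math.pi) / n
        if -math.pi < theta <= math.pi:   # keep principal values only
            roots.append((k, modulus, theta))
    return roots

for k, modulus, theta in nth_roots(1 + 1j, 3):
    z = cmath.rect(modulus, theta)
    print(k, round(modulus, 4), round(theta / math.pi, 4), z ** 3)
```

For w = 1 + i and n = 3 this keeps k = −1, 0, 1, in line with rule 1 above. When an argument falls exactly on ±π, floating-point rounding can affect which candidate is kept, so boundary cases deserve a manual check.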

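$$z = 13^{1/4}e^{i[\tan^{-1}(12/5) + 2k\pi]/4}, \text{ for } k \in \mathbb{Z}.$$

Since 4 is even and $\arg z^4 > 0$, we should pick $k = 0, \pm 1, -2$ to get

$$
z = \begin{cases}
13^{1/4}e^{i[\tan^{-1}(12/5)]/4}, & (k = 0), \\
13^{1/4}e^{i[\tan^{-1}(12/5) + 2\pi]/4}, & (k = 1), \\
13^{1/4}e^{i[\tan^{-1}(12/5) - 2\pi]/4}, & (k = -1), \\
13^{1/4}e^{i[\tan^{-1}(12/5) - 4\pi]/4}, & (k = -2).
\end{cases}
$$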

$z = 5^{1/7}e^{i[\tan^{-1}(4/3) + 2k\pi]/7}$, for $k \in \mathbb{Z}$. Since 7 is odd, we should pick $k = 0, \pm 1, \pm 2, \pm 3$.

$w = 5^{1/8}e^{i[-\tan^{-1}(4/3) + 2m\pi]/8}$, for $m \in \mathbb{Z}$. Since 8 is even and $\arg w^8 \le 0$, we should pick $m = 0, \pm 1, \pm 2, \pm 3, 4$.

Altogether then, the possible values of z and w are given by:

$$z = 5^{1/7}e^{i[\tan^{-1}(4/3) + 2k\pi]/7} \text{ for } k = 0, 1, 2, 3, -1, -2, -3,$$

and by the expression for w above for $m = 0, 1, 2, 3, -1, -2, -3, 4$.

[Figure: the possible values of z on a red circle centred on the origin.]

Notice that the eight possible values of w are on a circle whose radius is just slightly shorter than that of the red circle. (Only the red circle is illustrated.)

Exercise 171. Find the roots of each of the following equations. (a) $z^{10} = 1 - i$. (b) $z^{11} = 2 + i$. (c) $z^{12} = 1 - 3i$. (Answer on p. 1127.)

Part V

Calculus


45

Part I already covered differentiation. This chapter merely ties up some loose ends.

45.1

The Inverse Function Theorem (IFT) simply says that "the change in y caused by a small unit change in x (dy/dx) is the inverse of the change in x caused by a small unit change in y (dx/dy)".46 That is,

$$\frac{dy}{dx} = \frac{1}{\dfrac{dx}{dy}}.$$

Example 426. Suppose that adding 1 g of Milo (the x-variable) to a cup of water increases the volume of water by 2 cm³ (the y-variable). That is, dy/dx = 2 cm³ g⁻¹.

Then dx/dy = 0.5 g cm⁻³. That is, if instead we had wanted to increase the volume of water by 1 cm³, we should have added 0.5 g of Milo to the water.

Example 427. Let $x \in [-\pi/2, \pi/2]$. Let y = sin x. Suppose we wish to find dx/dy in terms of x.

Method #1 (longer method using Corollary 2). $y = \sin x \iff x = \sin^{-1} y$. So

$$\frac{dx}{dy} = \frac{d}{dy}\sin^{-1} y = \frac{1}{\sqrt{1 - y^2}} = \frac{1}{\sqrt{1 - \sin^2 x}} = \frac{1}{\cos x}.$$

Method #2 (quicker method using the IFT).

$$\frac{dy}{dx} = \cos x \implies \frac{dx}{dy} = \frac{1}{\cos x}.$$

$\frac{dy}{dx}$. Hence write down $\frac{dx}{dy}$. (You may leave your answers expressed in terms of x and y.) (Answer on p. 1128.)

46 This is informal. For the formal statement of the IFT (optional), see p. 969 in the Appendices.

45.2

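Informal Fact. $\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt}$ (provided $\dfrac{dx}{dt} \neq 0$).

Here is an informal proof of the above informal fact. By the Chain Rule,

$$\frac{dy}{dx} \overset{1}{=} \frac{dy}{dt}\cdot\frac{dt}{dx}.$$

By the IFT,

$$\frac{dt}{dx} \overset{2}{=} \frac{1}{\dfrac{dx}{dt}}.$$

Plugging $\overset{2}{=}$ into $\overset{1}{=}$, we have

$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt}.$$

See p. 970 in the Appendices for a formal version of the above Fact.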

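$\left.\dfrac{dy}{dx}\right|_{t=0}$.

$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{6t^5 - 1}{5t^4 + 1}.$$

So $\left.\dfrac{dy}{dx}\right|_{t=0} = -1$. It would be much more difficult (perhaps even impossible) if instead we first tried to express y in terms of x, then compute $\dfrac{dy}{dx}$.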

$\dfrac{dy}{dx}$. (Answer on p. 1128.)
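The informal fact dy/dx = (dy/dt) ÷ (dx/dt) can be checked numerically. The sketch below uses the curve x = t⁵ + t, y = t⁶ − t that appears in this chapter's later examples, and compares the closed-form quotient with a central-difference estimate. The function names and the step size h are illustrative assumptions.

```python
def x(t):
    return t**5 + t

def y(t):
    return t**6 - t

def dydx_formula(t):
    """dy/dx = (dy/dt) / (dx/dt) for the curve above."""
    return (6 * t**5 - 1) / (5 * t**4 + 1)

def dydx_numeric(t, h=1e-6):
    """Central-difference estimate of (dy/dt) / (dx/dt)."""
    dy_dt = (y(t + h) - y(t - h)) / (2 * h)
    dx_dt = (x(t + h) - x(t - h)) / (2 * h)
    return dy_dt / dx_dt

for t in (0.0, 1.0):
    print(t, dydx_formula(t), dydx_numeric(t))
```

The two columns should agree closely, which is also roughly what a graphing calculator does internally when asked for dy/dx on a parametric curve.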


45.3

Fact 57. The line with slope m through the point (a, b) has equation $y - b = m(x - a)$.

Fact 58. Given a line with slope $m \neq 0$, its perpendicular has slope $-\dfrac{1}{m}$.

Consider the normal line at the point where t = 0. Find any point(s) at which the normal line intersects the curve C again.

First, note that $t = 0 \implies (x, y) = (0, 0)$. Next,

$$\left.\frac{dy}{dx}\right|_{t=0} = \left.\left(\frac{dy}{dt} \div \frac{dx}{dt}\right)\right|_{t=0} = \left.\frac{6t^5 - 1}{5t^4 + 1}\right|_{t=0} = -1.$$

So the tangent line at the point t = 0 or (0, 0) has slope −1. Thus, the normal line at this point has slope 1. Its equation is thus $y - 0 = 1(x - 0)$ or more simply y = x.

The points where this normal line intersects the curve are thus given by the system of equations y = x, $x = t^5 + t$, and $y = t^6 - t$. Putting these together, we have $t^5 + t = t^6 - t \iff t(t^5 - t^4 - 2) = 0$. So t = 0 or $t \approx 1.45$ (calculator). (We know by the Fundamental Theorem of Algebra that there must be six roots altogether. In this case, only two are real, while the other four are complex.)

So the normal line intersects the curve C again at the point where $t \approx 1.45$ or where $(x, y) \approx (7.88, 7.88)$.
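The text finds the nonzero root of t⁵ − t⁴ − 2 = 0 with a calculator. As an alternative sketch, the same root can be located by bisection in Python; the bracket [1, 2] and the tolerance are assumptions chosen for this illustration.

```python
def f(t):
    return t**5 - t**4 - 2

def bisect(func, lo, hi, tol=1e-12):
    """Bisection: assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2

t_star = bisect(f, 1.0, 2.0)       # f(1) = -2 < 0 and f(2) = 14 > 0
x_star = t_star**5 + t_star
y_star = t_star**6 - t_star
print(round(t_star, 2), round(x_star, 2), round(y_star, 2))
```

Since the point lies on the normal line y = x, the computed x and y coordinates should coincide up to rounding.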

Exercise 174. A curve C is described by the pair of parametric equations $x = t^5 + t$ and $y = t^4 - t$. Find the tangent lines to the curve at the points where t = 0 and t = 1. Find the intersection point of these two tangent lines. (Answer on p. 1128.)

45.4

Example 430. Sand is being unloaded onto a flat surface at a steady rate of 0.01 m³ s⁻¹. Assume that the unloaded sand always forms a perfect cone with equal height and base diameter. Find the rate at which the area of the base of the cone is increasing at the instant t = 20 s.

First, recall that a cone has volume $V = \frac{1}{3}\pi r^2 h$, where r is the radius of the base and h is the height. Since the base diameter equals the height (or h = 2r), we can rewrite this as $V = \frac{2}{3}\pi r^3$. Applying $\frac{d}{dt}$, we have $\frac{dV}{dt} = 2\pi r^2\frac{dr}{dt}$.

The base area is $A = \pi r^2$. So the rate at which the base area is increasing is

$$\frac{dA}{dt} = 2\pi r\frac{dr}{dt} = \frac{dV}{dt} \div r.$$

We are given that $\frac{dV}{dt} = 0.01$ at all times. So $V|_{t=20} = 20 \times 0.01 = 0.2$, so that $r|_{t=20} = \left.\left(\frac{3V}{2\pi}\right)^{1/3}\right|_{t=20} = \left(\frac{0.3}{\pi}\right)^{1/3}$. Altogether then,

$$\left.\frac{dA}{dt}\right|_{t=20} = 0.01 \div \left(\frac{0.3}{\pi}\right)^{1/3} \approx 0.0219 \text{ m}^2\,\text{s}^{-1}.$$
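The arithmetic in Example 430 is easy to reproduce. The sketch below follows the same chain of steps (volume, then radius, then dr/dt, then dA/dt); the variable names are illustrative, and it assumes, as the example does, that there is no sand at t = 0.

```python
import math

dV_dt = 0.01          # m^3 per second, constant
t = 20.0              # seconds

V = dV_dt * t                                   # volume at time t
r = (3 * V / (2 * math.pi)) ** (1 / 3)          # from V = (2/3) * pi * r^3
dr_dt = dV_dt / (2 * math.pi * r**2)            # from dV/dt = 2*pi*r^2 * dr/dt
dA_dt = 2 * math.pi * r * dr_dt                 # from A = pi * r^2

print(round(dA_dt, 4))
```

Note that the last two steps simplify to dA/dt = (dV/dt) ÷ r, which is the expression used in the example.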


Exercise 175. (Answer on p. 1129.) Illustrated below is a cone with lateral length l, base radius r, and height h. You are given that such a cone has total external surface area (excluding the base) $\pi r l$ and volume $\frac{1}{3}\pi r^2 h$.

total external surface area (excluding the base) is minimised. Find out what its height should be. (You can follow the steps below.)

(a) Express r in terms of h.

(b) Use the Pythagorean Theorem to express l in terms of r and h. Hence express l solely in terms of h.

(c) Now express the total external surface area A (excludes the base) solely in terms of h.

(d) Show that $\dfrac{dA}{dh} = \dfrac{3}{2}\cdot\dfrac{\pi - 6h^{-3}}{A}$. Hence conclude that the only stationary point is $h = \left(\dfrac{6}{\pi}\right)^{1/3}$.

$$\frac{d^2A}{dh^2} = \frac{9}{4}\cdot\frac{12h^{-4}A^2 - \left(\pi - 6h^{-3}\right)^2}{A^3}.$$

$\dfrac{d^2A}{dh^2}$. Replace $A^2$ with the expression for A that you found in (c). Now fully expand this numerator. Observe that it is a quadratic and prove that it is always positive.

(g) Hence conclude that the stationary point we found is indeed the global minimum.

45.5

Example 431. Define $f : [0, 2] \to \mathbb{R}$ by $x \mapsto x - \sin(0.5\pi x)$. We can easily find the minimum point of f analytically:

$$\frac{df}{dx} = 1 - \frac{\pi}{2}\cos\left(\frac{\pi}{2}x\right) = 0 \iff \cos\left(\frac{\pi}{2}x\right) = \frac{2}{\pi} \iff x = \frac{2}{\pi}\cos^{-1}\frac{2}{\pi} \approx 0.560664181.$$

[Figures: TI84 screenshots after Steps 1 to 6.]

2. Press Y= to bring up the Y= editor.

3. Press X,T,θ,n − SIN 0 . 5 . To enter π, press the blue 2ND button and then ∧ (which corresponds to the π button). Now press X,T,θ,n ) and altogether you will have entered x − sin(0.5πx).

4. Now press GRAPH and the calculator will graph y = x − sin(0.5πx).

Note that in the question given, the domain is actually [0, 2], but we didn't bother telling the calculator this. So the calculator just went ahead and graphed the equation y = x − sin(0.5πx) for all possible real values of x and y.

No big deal, all we need to do is to zoom in to the region where 0 ≤ x ≤ 2.

5. Press the ZOOM button to bring up a menu of ZOOM options.

6. Press 2 to select the Zoom In option. Using the < and > arrow keys, move the cursor to where X = 1.0638298, Y = 0. Now press ENTER and the TI84 will zoom in a little, centred on the point X = 1.0638298, Y = 0.

It looks like starting at x = 0, the function is decreasing, then hits a minimum point, then keeps increasing. Our goal now is to find out what that minimum point is.

[Figures: TI84 screenshots after Steps 7 to 9.]

4. Press the blue 2ND button and then CALC (which corresponds to the TRACE button). This brings up the CALCULATE menu.

5. Press 3 to select the minimum option. This brings you back to the graph, with a cursor flashing. Also, the TI84 prompts you with the question: "Left Bound?"

The TI84's MINIMUM function works by you first choosing a Left Bound and a Right Bound for x. The TI84 will then look for the minimum point within your chosen bounds.

6. Using the < and > arrow keys, move the blinking cursor until it is where you want your Left Bound to be. For me, I have placed it a little to the left of where I believe the minimum point to be.

7. Press ENTER and you will have just entered your Left Bound.

The TI84 now prompts you with the question: "Right Bound?"

8. So now just repeat. Using the < and > arrow keys, move the blinking cursor until it is where you want your Right Bound to be. For me, I have placed it a little to the right of where I believe the minimum point to be.

9. Again press ENTER and you will have just entered your Right Bound.

The TI84 now asks you: "Guess?" This is just asking if you want to proceed and get the TI84 to work out where the minimum point is. So go ahead and:

10. Press ENTER. The TI84 now informs you that there is a Minimum at X = .56066485, Y = −.2105137 and places the cursor at precisely that point. This is our desired minimum point.

(Notice there's a slight error, because the TI84 uses slightly imprecise numerical methods. Analytically, we found that the minimum point was x ≈ 0.560664181, while the TI84 claims it is X = .56066485.)
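For comparison with the calculator, here is a minimal Python sketch of a bracketing search for the same minimum. It uses golden-section search, which is an assumption chosen for illustration; the text does not say which numerical method the TI84 uses. As on the calculator, the user supplies a left and a right bound.

```python
import math

def f(x):
    return x - math.sin(0.5 * math.pi * x)

def golden_section_min(func, lo, hi, tol=1e-10):
    """Minimise a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2

x_min = golden_section_min(f, 0.0, 2.0)
x_exact = (2 / math.pi) * math.acos(2 / math.pi)
print(x_min, f(x_min), x_exact)
```

Like the calculator, this search is only accurate to a limited number of digits, because the function is very flat near its minimum.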


45.6

This example will also illustrate how to graph parametric equations on the TI84.

Example 432. The curve C has parametric equations $x = t^5 + t$ and $y = t^6 - t$, $t \in \mathbb{R}$. We'll find $\left.\frac{dy}{dx}\right|_{t=1}$ using our TI84, even though this is easily found analytically:

$$\left.\frac{dy}{dx}\right|_{t=1} = \left.\frac{6t^5 - 1}{5t^4 + 1}\right|_{t=1} = \frac{5}{6}.$$

[Figures: TI84 screenshots after Steps 1 to 8.]

2. Press MODE to bring up a menu of settings that you can play with. In this example, all we want is to plot a curve based on parametric equations. So:

3. Using the arrow keys, move the blinking cursor to the word PAR (short for parametric) and press ENTER.

4. Now as usual, we'll input the equations of our curve. To do so, press Y= to bring up the Y= editor. Notice that this screen looks a little different from usual, because we are now under the parametric setting.

5. Press X,T,θ,n ∧ 5 + X,T,θ,n and altogether you will have entered T^5 + T in the first line.

6. Now press ENTER to go to the second line.

7. Press X,T,θ,n ∧ 6 − X,T,θ,n and altogether you will have entered T^6 − T in the second line.

8. Now press GRAPH and the calculator will graph the given pair of parametric equations.

Notice that strangely enough, the graph seems to be empty for the region where x < 0. But clearly there are values for which x < 0: for example, $t = -1.1 \implies (x, y) \approx (-2.71, 2.87)$. So why isn't the TI84 graphing this?

The reason is that by default, the TI84 graphs only the region where 0 ≤ t ≤ 2π (at least this is so for my particular calculator). We can easily adjust this:

4. Press the WINDOW button to bring up a menu of WINDOW options.

5. Using the arrow keys, the number pad, and the ENTER key as is appropriate, change Tmin and Tmax to your desired values. In my case, I decided somewhat randomly to enter Tmin = −10 and Tmax = 10.

6. Then press GRAPH again and the calculator will graph the given pair of parametric equations, now for the region Tmin ≤ t ≤ Tmax, where Tmin and Tmax are whatever you chose.

[Figure: TI84 screenshot after Step 9.]

Actually, the last few steps were really not necessary, if all we wanted was to find $\left.\frac{dy}{dx}\right|_{t=1}$, as we do now:

7. Press the blue 2ND button and then CALC (which corresponds to the TRACE button). This brings up the CALCULATE menu, which once again looks a little different under the current parametric setting.

8. Press 2 to select the dy/dx option. This brings you back to the graph.

Nothing seems to be happening. But now, simply ...

9. Press 1 and now the bottom left of the screen changes to display T = 1.

10. Hit ENTER. What you've just done is to ask the calculator to calculate $\frac{dy}{dx}$ at the point where t = 1. The calculator tells you that dy/dx = .83333528.

Again, there's a slight error: the exact correct answer is $\frac{dy}{dx} = \frac{5}{6} = 0.8333\ldots$, so again the TI84 is a tiny bit off.

46

46.1

Power Series

Example 433. $4 + x + 3x^2$ is a 2nd-degree polynomial. $18 + 5x - x^2 + x^4$ is a 4th-degree polynomial.

You can easily imagine what an ∞-degree polynomial is. Only we don't call it an ∞-degree polynomial. Instead, we call it a power series.

Definition 99. A power series is simply any infinite series

$$\sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \dots,$$

where the $a_i$ are constants.

Example 434. $1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 + \dots$ is a power series, with $a_0 = 1$, $a_1 = 2$, $a_2 = 3$, ..., $a_k = k + 1$, ...

So too is $1 - x + x^2 - x^3 + x^4 - x^5 + \dots$, with $a_0 = 1$, $a_1 = -1$, $a_2 = 1$, ..., $a_k = (-1)^k$, ...

As we learnt before, a series can either be convergent or divergent.

Example 435. $1 + x + x^2 + x^3 + x^4 + x^5 + \dots$ is a power series, with $a_0 = a_1 = \dots = a_k = \dots = 1$. It is, moreover, a convergent power series, provided $|x| < 1$. Indeed, provided $|x| < 1$, we know that this is an infinite geometric series that converges to $\frac{1}{1 - x}$ and we may write

$$1 + x + x^2 + x^3 + x^4 + x^5 + \dots = \frac{1}{1 - x}.$$

For H2 Maths, the only power series we'll be interested in is called the Maclaurin series.

46.2

Maclaurin Series

at x is denoted M(x) and is defined to be the power series

$$M(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots,$$

where $a_0 = f(0)$, $a_1 = f'(0)$, $a_2 = \dfrac{f''(0)}{2!}$, $a_3 = \dfrac{f^{(3)}(0)}{3!}$, ..., $a_n = \dfrac{f^{(n)}(0)}{n!}$, ...

$$M(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \frac{f^{(4)}(0)}{4!}x^4 + \dots + \frac{f^{(n)}(0)}{n!}x^n + \dots = \sum_{i=0}^{\infty}\frac{f^{(i)}(0)}{i!}x^i.$$

Definition 101. Let f be an n-times differentiable function. The nth-order Maclaurin series of f at x is denoted $M_n(x)$ and is defined as the nth-degree polynomial (or finite series)

$$M_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n,$$

where $a_0 = f(0)$, $a_1 = f'(0)$, $a_2 = \dfrac{f''(0)}{2!}$, $a_3 = \dfrac{f^{(3)}(0)}{3!}$, ..., $a_n = \dfrac{f^{(n)}(0)}{n!}$.

have $f(0) = e^0 = 1$. We also have $f'(x) = e^x$, so that $f'(0) = e^0 = 1$. Similarly, $f''(x) = e^x$, so that $f''(0) = e^0 = 1$. Indeed, for any $k \in \mathbb{Z}^+$, $f^{(k)}(x) = e^x$, so that $f^{(k)}(0) = e^0 = 1$. Hence, the Maclaurin series for f is

$$M(x) = 1 + 1x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \dots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$$

$$M_0(x) = 1,\quad M_1(x) = 1 + x,\quad M_2(x) = 1 + x + \frac{x^2}{2!},\quad M_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$

Exercise 176. Write down the third-order Maclaurin series for each of the following functions: (Answer on p. 1130.)

(a) $f : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto (1 + x)^n$,

(b) $g : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \sin x$,

(c) $h : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \cos x$,

(d) $i : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \ln(1 + x)$.

Remark 8. The A-level syllabuses make no mention of the Taylor series and so we won't talk about it. But just so you know, the Maclaurin series is simply a special case of the Taylor series: specifically, it is the Taylor series about 0.

46.3

The Maclaurin series is simply an (infinite) series. And as we saw in Part II, an infinite

series may or may not be convergent.

The following is a very powerful theorem:

Informal Theorem. If f satisfies a nice property at a, then M (a) converges to

f (a). That is, M (a) = f (a).

(See section 971 in the Appendices for a more thorough and formal discussion of this

theorem.)

This table is in the List of Formulae you get, so no need to memorise.

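$$
\begin{aligned}
f(x) &= f(0) + x f'(0) + \frac{x^2}{2!}f''(0) + \dots + \frac{x^n}{n!}f^{(n)}(0) + \dots \\
(1 + x)^n &= 1 + nx + \frac{n(n-1)}{2!}x^2 + \dots + \frac{n(n-1)\dots(n-r+1)}{r!}x^r + \dots && (|x| < 1) \\
e^x &= 1 + x + \frac{x^2}{2!} + \dots + \frac{x^r}{r!} + \dots && (\text{all } x) \\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots + \frac{(-1)^r x^{2r+1}}{(2r+1)!} + \dots && (\text{all } x) \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots + \frac{(-1)^r x^{2r}}{(2r)!} + \dots && (\text{all } x) \\
\ln(1 + x) &= x - \frac{x^2}{2} + \frac{x^3}{3} - \dots + \frac{(-1)^{r+1} x^r}{r} + \dots && (-1 < x \le 1)
\end{aligned}
$$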

1. The first row of the above table says that if x is a value at which the function f satisfies the nice property, then f(x) is equal to the Maclaurin series of f at x.

2. The second row says that $g : \mathbb{R} \to \mathbb{R}$ by $x \mapsto (1 + x)^n$ satisfies the nice property for all $x \in (-1, 1)$. Thus, for all $x \in (-1, 1)$, we have $(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \dots$ We say that (−1, 1) is the range of values for which g has a convergent Maclaurin series.47

3. The third row says that $h : \mathbb{R} \to \mathbb{R}$ by $x \mapsto e^x$ satisfies the nice property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $e^x = 1 + x + \frac{x^2}{2!} + \dots$ We say that $\mathbb{R}$ is the range of values for which h has a convergent Maclaurin series.

4. The fourth row says that $i : \mathbb{R} \to \mathbb{R}$ by $x \mapsto \sin x$ satisfies the nice property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$ We say that $\mathbb{R}$ is the range of values for which i has a convergent Maclaurin series.

5. The fifth row says that $j : \mathbb{R} \to \mathbb{R}$ by $x \mapsto \cos x$ satisfies the nice property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$ We say that $\mathbb{R}$ is the range of values for which j has a convergent Maclaurin series.

6. The sixth row says that $k : (-1, \infty) \to \mathbb{R}$ by $x \mapsto \ln(1 + x)$ satisfies the nice property for all $x \in (-1, 1]$. Thus, for all $x \in (-1, 1]$, we have $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$ We say that (−1, 1] is the range of values for which k has a convergent Maclaurin series.

In the syllabus, these five particular Maclaurin series are called standard series.

47 We should be careful to state that if n < 0, then the domain should be restricted to exclude −1.

Example 437. Here we will not rigorously prove that $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$ for all $x \in \mathbb{R}$. Instead, we will merely verify that this equation is plausible, for x = 0, 1, 5. (Try these out yourself using the sheet Maclaurin series at the usual link.)

For x = 0, we have $e^x = e^0 = 1$. And $M_n(0) = 1$ for all n. So it does appear plausible that $e^0 = M(0)$.

For x = 1, we have $e^x = e^1 \approx 2.718$. And $M_0(1) = 1$, $M_1(1) = 1 + 1 = 2$, $M_2(1) = 1 + 1 + \frac{1}{2} = 2.5$, $M_3(1) = 1 + 1 + \frac{1}{2} + \frac{1}{6} \approx 2.67$, ..., $M_7(1) \approx 2.718$. It appears that $M_n(1) \approx 2.718$ for all n ≥ 7. So it does appear plausible that $e^1 = M(1)$.

For x = 5, we have $e^x = e^5 \approx 148.413$. And $M_0(5) = 1$, $M_1(5) = 1 + 5 = 6$, $M_2(5) = 1 + 5 + \frac{25}{2} = 18.5$, $M_3(5) = 1 + 5 + \frac{25}{2} + \frac{125}{6} = 39\frac{1}{3}$, ..., $M_{18}(5) \approx 148.413$. It appears that $M_n(5) \approx 148.413$ for all n ≥ 18. So it does appear plausible that $e^5 = M(5)$.
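The partial sums quoted in Example 437 can be reproduced with a few lines of Python, as an alternative to the spreadsheet mentioned in the text. The function name `maclaurin_exp` is an illustrative assumption.

```python
import math

def maclaurin_exp(x, n):
    """n-th order Maclaurin polynomial of e^x, evaluated at x."""
    return sum(x**i / math.factorial(i) for i in range(n + 1))

for x, orders in ((1, (0, 1, 2, 3, 7)), (5, (0, 1, 2, 3, 18))):
    for n in orders:
        print(x, n, round(maclaurin_exp(x, n), 3))
    print("exp:", round(math.exp(x), 3))
```

Running it shows how many more terms are needed at x = 5 than at x = 1 before the partial sums settle to three decimal places.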

Exercise 177. (Tedious, use the sheet named Maclaurin series at the usual link.) Verify that for $x = 0, \frac{\pi}{2}, 2\pi$, it is similarly plausible that $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$ (Answer on p. 1131.)

46.4

One important practical use of Maclaurin series is that finite-order Maclaurin series can be

used as approximations.

Example 438. Consider $h : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto e^x$. We have $h(1) = e \approx 2.718$.

The 0th-order Maclaurin series is a pretty terrible approximation: $M_0(1) = 1$. The 1st-order Maclaurin series is slightly better: $M_1(1) = 1 + x = 1 + 1 = 2$. The 2nd-order Maclaurin series is even better: $M_2(1) = 1 + x + 0.5x^2 = 1 + 1 + 0.5 = 2.5$. The 3rd-order Maclaurin series is better still: $M_3(1) = 1 + x + 0.5x^2 + \frac{x^3}{6} = 1 + 1 + 0.5 + \frac{1}{6} \approx 2.67$.

We see that it tends to be that the higher the order of the Maclaurin series, the better the approximation.

I emphasise the phrase "tends to be", because the approximation can sometimes get worse before it gets better, especially if we're looking at a value that is far from 0. The next example illustrates.

we do the tedious computations, we find that

$M_0(2\pi) = 0,$

The 0th-order Maclaurin series gets it exactly right. But each subsequent finite-order Maclaurin series then drifts ever further from 0! Having computed the 5th- and 6th-order Maclaurin series, it certainly does not look like the approximations will get any better. Yet if we persevere, we find that

$M_7(2\pi) = M_8(2\pi) \approx -30.159,$

$M_{15}(2\pi) = M_{16}(2\pi) \approx -0.093$, $M_{17}(2\pi) = M_{18}(2\pi) \approx 0.011$,

...

Indeed, $M_n(2\pi) \approx 0.000$ for all $n \geq 21$. So it does indeed look like the Maclaurin series for $\sin x$ converges. Graphed below are $\sin x$ and $M_{21}(x)$. We see that $M_{21}(x)$ almost perfectly approximates $\sin x$ for $x \in [-7, 7]$. But for larger values, $M_{21}(x)$ veers far away from $\sin x$.
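The "worse before better" behaviour at $x = 2\pi$ can be reproduced with a short Python sketch. It assumes the standard series $\sin x = x - x^3/3! + x^5/5! - \dots$; the name `maclaurin_sin` is illustrative only.

```python
import math

def maclaurin_sin(x, n):
    """n-th order Maclaurin polynomial of sin x (only odd powers contribute)."""
    total = 0.0
    for k in range(1, n + 1, 2):               # k = 1, 3, 5, ...
        sign = -1.0 if (k // 2) % 2 else 1.0   # signs alternate: +, -, +, ...
        total += sign * x**k / math.factorial(k)
    return total

x = 2 * math.pi
for n in (1, 3, 5, 7, 15, 17, 21):
    print(n, round(maclaurin_sin(x, n), 3))
```

The printed values first swing far away from $\sin 2\pi = 0$ and only later settle down near 0.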

[Figure: graphs of $\sin x$ and $M_{21}(x)$.]

Graphed below are y = sin x, M1 (x), . . . , M10 (x). We see that the 1st-order Maclaurin series

M1 (x) = x is indeed a good approximation for values of x that are close to 0, but terrible

for larger values.

Low-order Maclaurin series work well as approximations, provided we are looking at small

values of x (i.e. values that are close to 0).

But for large x, even if the Maclaurin series eventually converges, low-order Maclaurin

series may fare very poorly as approximations. Indeed, as we saw on the previous page, for

sufficiently large values of x, even a relatively-high-order Maclaurin series like M21 (x) will

fare poorly as an approximation!


If $a$ is not within the range of values for which the Maclaurin series for the function $f$ converges, then $M(a) \neq f(a)$. That is, the (infinite) Maclaurin series does not converge. Hence, there is no reason to expect that $M_i(a) \approx f(a)$ (i.e. that any finite-order Maclaurin series will serve as a good approximation). Here is an example to illustrate.

Example 440. Consider the function $k$ defined by $x \mapsto \ln(1 + x)$. The range of values for which the Maclaurin series converges is $(-1, 1]$. Suppose we pick $x = 2$, which is certainly outside this range. Then we have $k(2) = \ln 3 \approx 1.099$. Let's see what the finite-order Maclaurin series look like:

$M_0(2) = 0$, $M_1(2) = 2$, $M_2(2) = 0$, $M_3(2) = 2\frac{2}{3}$, $M_4(2) = -1\frac{1}{3}$,

$M_5(2) = 5\frac{1}{15}$, $M_6(2) = -5.6$, $M_7(2) \approx 12.686$, $M_8(2) \approx -19.314$, $M_9(2) \approx 37.575$.

Unlike before, further perseverance will not pay off here. Indeed, the finite-order Maclaurin series will grow without bound (in absolute value). For example, $M_{50}(2) \approx -14.9$ trillion! The Maclaurin series simply does not converge for $x = 2$. So there is no reason to expect any finite-order Maclaurin series to be a good approximation.
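To see the contrast between a point inside and a point outside the range of convergence, here is a small Python sketch. It assumes the standard series $\ln(1 + x) = x - x^2/2 + x^3/3 - \dots$; the name `maclaurin_log1p` and the comparison point $x = 0.5$ are illustrative choices, not from the text.

```python
import math

def maclaurin_log1p(x, n):
    """n-th order Maclaurin polynomial of ln(1 + x): x - x^2/2 + x^3/3 - ..."""
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, n + 1))

# Columns: order n, value at x = 2 (outside (-1, 1]), value at x = 0.5 (inside).
for n in (1, 5, 9, 20, 50):
    print(n, maclaurin_log1p(2.0, n), maclaurin_log1p(0.5, n))
print("targets:", math.log(3), math.log(1.5))
```

At $x = 0.5$ the values settle on $\ln 1.5$, while at $x = 2$ they oscillate with ever larger magnitude.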


46.5

Informally, if two power series converge, then so too does their product; and to get this

product, simply multiply the two series together as if they were finite polynomials.48

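Example 441. For all $x \in \mathbb{R}$, $\sin x = 0 + 1x + 0x^2 - \frac{1}{3!}x^3 + \dots$ and $\cos x = 1 + 0x - \frac{1}{2!}x^2 + 0x^3 + \dots$.

Thus, for all $x \in \mathbb{R}$, we have $\sin x \cos x = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots$, where

Constant term: $c_0 = 0 \cdot 1 = 0$,

Coefficient on $x$: $c_1 = 0 \cdot 0 + 1 \cdot 1 = 1$,

Coefficient on $x^2$: $c_2 = 0 \cdot \left(-\frac{1}{2!}\right) + 1 \cdot 0 + 0 \cdot 1 = 0$,

Coefficient on $x^3$: $c_3 = 0 \cdot 0 + 1 \cdot \left(-\frac{1}{2!}\right) + 0 \cdot 0 + \left(-\frac{1}{3!}\right) \cdot 1 = -\frac{2}{3}$.

Hence, $\sin x \cos x = 0 + 1x + 0x^2 + \left(-\frac{2}{3}\right)x^3 + \dots = x - \frac{2}{3}x^3 + \dots$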

The expression on the RHS is, of course, simply also the Maclaurin series for sin x cos x.

You are asked to show this in Exercise 178.
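The "multiply as if they were finite polynomials" recipe can be sketched in Python with exact fractions. This is only an illustration: the function name `multiply_series` and the choice to truncate at $x^3$ are assumptions made here, not part of the text.

```python
from fractions import Fraction as Fr

def multiply_series(a, b):
    """Multiply two truncated power series given as coefficient lists
    [a0, a1, a2, ...]; c_k is the sum of a_i * b_(k-i) for i = 0..k."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

sin_coeffs = [Fr(0), Fr(1), Fr(0), Fr(-1, 6)]   # sin x up to x^3
cos_coeffs = [Fr(1), Fr(0), Fr(-1, 2), Fr(0)]   # cos x up to x^3
print(multiply_series(sin_coeffs, cos_coeffs))
```

The resulting list holds the coefficients $c_0, c_1, c_2, c_3$ of the product series.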

Exercise 178. Let $f: \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto \sin x \cos x$. Evaluate $f(0)$, $f'(0)$, $f''(0)$, and $f^{(3)}(0)$. Hence, write down the 3rd-order Maclaurin series for $f$ and verify that this is consistent with what we found in Example 441. (Answer on p. 1132.)

The next example illustrates that one must be careful about when the Maclaurin series is

convergent:

48 This assertion is formally stated and proven at Fact 97 in the Appendices (optional).

Example 442. For all $x \in \mathbb{R}$, we have $\sin x = 0 + 1x + 0x^2 - \frac{1}{3!}x^3 + \dots$ For all $x \in (-1, 1]$, we have $\ln(1 + x) = 0 + 1x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \dots$ And so for $x \in (-1, 1]$, we have $\sin x \ln(1 + x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots$, where

Constant term: $c_0 = 0 \cdot 0 = 0$,

Coefficient on $x$: $c_1 = 0 \cdot 1 + 1 \cdot 0 = 0$,

Coefficient on $x^2$: $c_2 = 0 \cdot \left(-\frac{1}{2}\right) + 1 \cdot 1 + 0 \cdot 0 = 1$,

Coefficient on $x^3$: $c_3 = 0 \cdot \frac{1}{3} + 1 \cdot \left(-\frac{1}{2}\right) + 0 \cdot 1 + \left(-\frac{1}{3!}\right) \cdot 0 = -\frac{1}{2}$.

And so $\sin x \ln(1 + x) = 0 + 0x + 1x^2 + \left(-\frac{1}{2}\right)x^3 + \dots = x^2 - \frac{1}{2}x^3 + \dots$, for $x \in (-1, 1]$. This set is simply the intersection of $\mathbb{R}$ and $(-1, 1]$, which are respectively the ranges of values on which the Maclaurin series for $\sin x$ and $\ln(1 + x)$ converge.

The expression on the RHS is, of course, simply also the Maclaurin series for $\sin x \ln(1 + x)$. You are asked to show this in Exercise 179.

Exercise 179. Let $f$ be defined by $x \mapsto \sin x \ln(1 + x)$ (for $x > -1$). Evaluate $f(0)$, $f'(0)$, $f''(0)$, and $f^{(3)}(0)$. Hence, write down the 3rd-order Maclaurin series for $f$ and verify that this is consistent with what we found in Example 442. (Answer on p. 1132.)

46.6

2

That is, to get f (g(c)), simply plug in g(c) into the power series for f .49

Example 443. Define $f$ by $f(x) = (1 + x)^{-1}$ (for $x \neq -1$) and $g: \mathbb{R} \to \mathbb{R}$ by $g(x) = 2x$. We know that for all $x \in (-1, 1)$, we have $f(x) = (1 + x)^{-1} = 1 - x + x^2 - x^3 + \dots$

Thus, $f(g(x)) = (1 + 2x)^{-1} = 1 - (2x) + (2x)^2 - (2x)^3 + \dots$ for all $g(x) = 2x \in (-1, 1)$. Equivalently,

$$f(g(x)) = (1 + 2x)^{-1} = 1 - 2x + 4x^2 - 8x^3 + \dots \quad \text{for all } x \in (-0.5, 0.5).$$

Example 444. Define $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = e^x$ and $g: \mathbb{R} \to \mathbb{R}$ by $g(x) = x^2$. We know that for all $x \in \mathbb{R}$, we have $f(x) = e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$

Thus, $f(g(x)) = e^{x^2} = 1 + x^2 + \frac{(x^2)^2}{2!} + \frac{(x^2)^3}{3!} + \dots$ for all $g(x) \in \mathbb{R}$. Equivalently,

$$f(g(x)) = e^{x^2} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \dots \quad \text{for all } x \in \mathbb{R}.$$

In the case where g also has a convergent Maclaurin series, we can likewise also simply

plug in the Maclaurin series for g.50 Example:

49 For a more careful and formal version of this assertion, see Fact 98 in the Appendices (optional).

50 Again, for a more careful and formal version of this assertion, see Fact 99 in the Appendices (optional).

$g(x) = \sin x$. Write down the Maclaurin series for $f \circ g$, up to the 4th-order term.

Method #1 (composition method). We know that for all $x \in (-1, 1)$, we have $f(x) = 1 - x + x^2 - x^3 + \dots$ And for all $x \in \mathbb{R}$, we have $g(x) = x - \frac{x^3}{3!} + \dots$ Hence, for all $g(x) \in (-1, 1)$, i.e. for all $x$ that are not odd multiples of $\frac{\pi}{2}$, we have

$$f(g(x)) = \frac{1}{1 + \sin x} = 1 - g(x) + [g(x)]^2 - [g(x)]^3 + \dots$$

$$= 1 - \left(x - \frac{x^3}{3!} + \dots\right) + \left(x - \frac{x^3}{3!} + \dots\right)^2 - \left(x - \frac{x^3}{3!} + \dots\right)^3 + \dots$$

$$= 1 - x + x^2 + x^3\left(\frac{1}{3!} - 1\right) + \dots = 1 - x + x^2 - \frac{5}{6}x^3 + \dots$$

Finding the general term for a Maclaurin series is explicitly excluded from the A-level syllabuses. Usually you'll just have to write down the first few terms.

Method #2 (direct method). Let $h(x) = 1/(1 + \sin x)$. We have $h(0) = 1$. We also have

$$\left.\frac{dh}{dx}\right|_{x=0} = \left.\frac{-\cos x}{(1 + \sin x)^2}\right|_{x=0} = -1,$$

$$\left.\frac{d^2h}{dx^2}\right|_{x=0} = \left.\frac{(1 + \sin x)^2 \sin x + 2\cos^2 x\,(1 + \sin x)}{(1 + \sin x)^4}\right|_{x=0} = \left.\frac{(1 + \sin x)\sin x + 2\cos^2 x}{(1 + \sin x)^3}\right|_{x=0} = \left.\frac{\sin x + 2 - \sin^2 x}{(1 + \sin x)^3}\right|_{x=0} = 2,$$

$$\left.\frac{d^3h}{dx^3}\right|_{x=0} = \left.\frac{(1 + \sin x)^3(\cos x - 2\sin x\cos x) - (\sin x + 2 - \sin^2 x)\cdot 3(1 + \sin x)^2\cos x}{(1 + \sin x)^6}\right|_{x=0} = \frac{1 \cdot 1 - 2 \cdot 3 \cdot 1}{1} = -5.$$

Thus,

$$\frac{1}{1 + \sin x} = 1 + (-1)x + \frac{2}{2!}x^2 + \frac{-5}{3!}x^3 + \dots = 1 - x + x^2 - \frac{5}{6}x^3 + \dots$$

In the above example, I gave two methods. Use whichever seems to be easier or quicker.
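The composition method can also be mechanised. The sketch below substitutes one truncated power series into another using exact fractions, for the series of $1/(1 + x)$ and $\sin x$. The names `ORDER`, `mul` and `compose` are illustrative, and the routine assumes that the inner series has zero constant term (as $\sin x$ does), so that only finitely many terms of the outer series matter at each order.

```python
from fractions import Fraction as Fr

ORDER = 3  # keep terms up to x^3

def mul(a, b):
    """Product of two coefficient lists, truncated at x^ORDER."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(ORDER + 1)]

def compose(f, g):
    """Coefficients of f(g(x)) up to x^ORDER, assuming g has zero constant term."""
    result = [Fr(0)] * (ORDER + 1)
    power = [Fr(1)] + [Fr(0)] * ORDER      # g(x)^0 = 1
    for coeff in f:
        result = [r + coeff * p for r, p in zip(result, power)]
        power = mul(power, g)              # next power of g(x)
    return result

f = [Fr(1), Fr(-1), Fr(1), Fr(-1)]         # 1/(1+x) = 1 - x + x^2 - x^3 + ...
g = [Fr(0), Fr(1), Fr(0), Fr(-1, 6)]       # sin x   = x - x^3/3! + ...
print(compose(f, g))
```

The output lists the coefficients of $1$, $x$, $x^2$ and $x^3$ in the series for $1/(1 + \sin x)$.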

Here's another example:

Example 446. Write down the Maclaurin series for sec x up to the 4th-order term.

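Method #1 (composition method).

$$\sec x = \frac{1}{\cos x} = \frac{1}{1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots} = \left[1 - \left(\frac{x^2}{2!} - \frac{x^4}{4!} + \dots\right)\right]^{-1}$$

$$= 1 + \left(\frac{x^2}{2!} - \frac{x^4}{4!} + \dots\right) + \left(\frac{x^2}{2!} - \frac{x^4}{4!} + \dots\right)^2 + \dots$$

$$= 1 + \frac{x^2}{2!} + x^4\left[-\frac{1}{4!} + \left(\frac{1}{2!}\right)^2\right] + \dots = 1 + \frac{x^2}{2} + \frac{5x^4}{24} + \dots$$

Method #2 (direct method). Let $f(x) = \sec x$. Then $f(0) = \sec 0 = 1$. And

$f'(x) = \sec x \tan x$, so $f'(0) = 0$,

$f''(x) = 2\sec^3 x - \sec x$, so $f''(0) = 1$,

$f^{(3)}(x) = 6\sec^2 x\, f'(x) - f'(x)$, so $f^{(3)}(0) = 0$,

$f^{(4)}(0) = 5$.

Thus,

$$\sec x = 1 + 0x + \frac{1}{2!}x^2 + \frac{0}{3!}x^3 + \frac{5}{4!}x^4 + \dots = 1 + \frac{x^2}{2!} + \frac{5x^4}{24} + \dots$$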

Exercise 180. Write down the third-order Maclaurin series for sin [ln(1 + x)]. State also

the range of values for which the Maclaurin series converges. (Answer on p. 1132.)


46.7

all $x \in (-c, c)$, we have

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \dots$$

Then the coefficients in the above power series are as given by the Maclaurin series. That is, for each $i = 0, 1, 2, \dots$, we have

$$a_i = \frac{f^{(i)}(0)}{i!}.$$

$f'(x) = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \dots,$

Thus, $f(0) = a_0$ and

$f'(0) = a_1$,

$f''(0) = 2!\,a_2$,

Rearranging, we have $a_0 = f(0)$, $a_1 = f'(0)$, $a_2 = f''(0)/2!$, $a_3 = f^{(3)}(0)/3!$, $a_4 = f^{(4)}(0)/4!$, ..., $a_i = f^{(i)}(0)/i!$, ..., as desired.

The above theorem is merely a tantalising hint of why the Maclaurin series works. This is

because the theorem merely says this: If we make the very big assumption that the infinitely-differentiable function f can be written down as a power series, then the coefficients of the

power series are as given by the Maclaurin series.

But this is not very useful, because how do we know that the function can be written

down as a power series? For a continuation of this discussion, see section 88.14 in the

Appendices.


47

Definition 102. Given functions $f$ and $F$, we call $F$ an indefinite integral (or antiderivative or primitive) of $f$ if for all $x$ in the domain of $f$,

$$F'(x) = f(x).$$

In Leibniz's notation, we may write

$$F = \int f(x)\,dx \quad \text{or more simply} \quad F = \int f\,dx.$$

The statement $F = \int f\,dx$ is thus completely equivalent to the statement $F' = f$.

Example 447. Consider the functions $f, F: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = 2x$ and $F(x) = x^2$. We see that $F$ is an indefinite integral of $f$, because $F'(x) = 2x = f(x)$ for all $x$. We can equivalently say that $f$ is the derivative of $F$. We can also write

$$F = \int f\,dx \quad \text{or} \quad \frac{dF}{dx} = f.$$

The statement "the value of $F$ at 5 is 25" can be written as $F(5) = 25$. It can also be written as

$$\left.\int f\,dx\,\right|_{x=5} = 25 \quad \text{or}$$

called the differential of the variable $x$; it informs us that the variable of integration is $x$. The function $f$ to be integrated is called the integrand.

Just like with summation, x is a dummy variable. We can replace x with any other letter

and the function F will still remain exactly the same function.


Example 448. The following two expressions are equal because i on the LHS and r on the

RHS are simply dummy variables.

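$$\sum_{i=1}^{n} i = \sum_{r=1}^{n} r.$$

Similarly, the statement $F = \int f(x)\,dx$ is equivalent to any of the following three statements, because the letters $x$, $a$, $b$, $c$, etc. are merely dummy variables:

$$F = \int f(a)\,da, \quad \text{or} \quad F = \int f(b)\,db, \quad \text{or} \quad F = \int f(c)\,dc.$$

So the statement "the value of $F$ at 5 is 25" can also be written $F(5) = 25$ or any of the following four statements:

$$\left.\int f(x)\,dx\,\right|_{x=5} = 25, \quad \left.\int f(a)\,da\,\right|_{a=5} = 25, \quad \left.\int f(b)\,db\,\right|_{b=5} = 25, \quad \left.\int f(c)\,dc\,\right|_{c=5} = 25.$$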

47.1

$F: \mathbb{R} \to \mathbb{R}$ defined by $F(x) = -\cos x$ is an indefinite integral of $f$, because $F'(x) = \sin x = f(x)$ for all $x \in \mathbb{R}$.

Are there any other indefinite integrals of $f$? Yes, certainly.

For example, $G: \mathbb{R} \to \mathbb{R}$ defined by $G(x) = -\cos x + 200$ is also an indefinite integral of $f$, because $G'(x) = \sin x = f(x)$ for all $x \in \mathbb{R}$.

Indeed, any $H: \mathbb{R} \to \mathbb{R}$ defined by $H(x) = -\cos x + C$ where $C \in \mathbb{R}$ is also an indefinite integral of $f$, because $H'(x) = \sin x = f(x)$ for all $x \in \mathbb{R}$.

In general:

Fact 59. If $F$ is an indefinite integral of $f$, then so too is $G$ defined by $G(x) = F(x) + C$, for any $C \in \mathbb{R}$.

We call $C$ the constant of integration.

Proof. Since $F'(x) = f(x)$ for all $x$, we also have $G'(x) = \frac{d}{dx}[F(x) + C] = F'(x) + 0 = f(x)$ for all $x$. And so by definition, $G$ is also an indefinite integral of $f$.

47.2

and G are both indefinite integrals of f , then it must be that F and G differ only by a

constant.

Example 450. Say f has indefinite integral F defined by F (x) = sin (ex 3x+5 ). Suppose

G is another indefinite integral of f . Then it must be that F (x) = G(x) + C, for some

C R.

2

Formally:

Fact 60. If $F$ and $G$ are both indefinite integrals of $f$, then there exists some $C \in \mathbb{R}$ such that $F(x) = G(x) + C$ for all $x$.

Proof. Since $F$ and $G$ are both indefinite integrals of $f$, by definition, $F'(x) = G'(x)$ for all $x$. And thus $(F - G)'(x) = 0$ for all $x$. But the only functions whose derivative is always 0 are constant functions.51 Thus, $F(x) - G(x) = C$, for all $x$, for some $C \in \mathbb{R}$.

$f(x) = 4\sin 4x$, $F(x) = -\cos 4x$, and $G(x) = 8\sin^2 x\cos^2 x$.

(a) Show that $F$ and $G$ are both indefinite integrals of $f$.

(b) $F$ and $G$ seem to be very different functions. Yet both are indefinite integrals of $f$. Why does this not contradict our assertion that the indefinite integral is unique up to a constant?

51 The alert reader will note that this assertion has not actually been proven in this textbook. We'll simply take it for granted that the only functions whose derivative is 0 are constant functions.

48

Integration Techniques

As before with our notation for differentiation, let's be clear (pedantic). To take an example, the notation

$$\int \sin x\,dx = -\cos x + C$$

is simply shorthand for the following long-winded statement:

"Consider a function with mapping rule $x \mapsto \sin x$. Its indefinite integrals are functions, all of which have the mapping rule $x \mapsto -\cos x + C$."52

52 This shorthand statement fails to mention the domain and codomains of the function and its indefinite integral. However, the careful writer will of course have specified these nearby.

48.1

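indefinite integrals $F$ and $G$. Then

$\int k\,dx = kx + C$,

$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$, ($x \neq 0$ if $n < 0$)

$\int \frac{1}{x}\,dx = \ln|x| + C$, ($x \neq 0$)

$\int e^x\,dx = e^x + C$,

$\int \sin x\,dx = -\cos x + C$,

$\int \cos x\,dx = \sin x + C$,

$\int [f(x) \pm g(x)]\,dx = F \pm G + C$,

$\int kf(x)\,dx = kF + C$.

Proof. In general, to prove that $\int f(x)\,dx = F$, it suffices to prove that $F'(x) = f(x)$ for all $x$.

And so to prove that $\int x^{-1}\,dx = \ln|x| + C$, it suffices to prove that $\frac{d}{dx}(\ln|x| + C) = x^{-1}$ for all $x \neq 0$. This we now do. First note that

$$\ln|x| + C = \begin{cases} \ln x + C, & \text{for } x > 0, \\ \ln(-x) + C, & \text{for } x < 0. \end{cases}$$

Thus,

$$\frac{d}{dx}(\ln|x| + C) = \begin{cases} \dfrac{1}{x}, & \text{for } x > 0, \\ \dfrac{-1}{-x} = \dfrac{1}{x}, & \text{for } x < 0. \end{cases}$$

And so indeed $\frac{d}{dx}(\ln|x| + C) = x^{-1}$ for all $x \neq 0$.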

dx

You are asked to prove the remaining rules of integration in Exercise 182.

Exercise 182. Prove the remaining rules of integration listed in Proposition 10. (Answer

on p. 1134.)


48.2

No need to memorise the following rules of integration, because the List of Formulae contains a (slightly less general) version.

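Proposition 11. Let $a > 0$. Then

(a) $\int \frac{1}{x^2 + a^2}\,dx = \frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C$,

(b) $\int \frac{1}{\sqrt{a^2 - x^2}}\,dx = \sin^{-1}\left(\frac{x}{a}\right) + C$, for $|x| < a$,

(c) $\int \frac{1}{x^2 - a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C$, for $x \neq \pm a$,

(d) $\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C$, for $x \neq \pm a$,

(e) $\int \tan x\,dx = \ln|\sec x| + C$, for $x$ not an odd multiple of $\frac{\pi}{2}$,

(f) $\int \cot x\,dx = \ln|\sin x| + C$, for $x$ not a multiple of $\pi$,

(g) $\int \csc x\,dx = -\ln|\csc x + \cot x| + C$, for $x$ not a multiple of $\pi$,

(h) $\int \sec x\,dx = \ln|\sec x + \tan x| + C$, for $x$ not an odd multiple of $\frac{\pi}{2}$.

Proof. We prove only (a), (c), and (e). (You are asked to prove the remaining rules of integration in Exercise 183.)

(a) By Corollary 2, $\frac{d}{dx}\tan^{-1} x = \frac{1}{x^2 + 1}$. Hence,

$$\frac{d}{dx}\left[\frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C\right] = \frac{1}{a} \cdot \frac{1}{\left(\frac{x}{a}\right)^2 + 1} \cdot \frac{1}{a} = \frac{1}{x^2 + a^2}.$$

So indeed $\int \frac{1}{x^2 + a^2}\,dx = \frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C$.

(c) Case #1: $\frac{x - a}{x + a} > 0$.

$$\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C\right) = \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{x - a}{x + a} + C\right) = \frac{1}{2a}\frac{d}{dx}\left[\ln(x - a) - \ln(x + a)\right]$$

$$= \frac{1}{2a}\left(\frac{1}{x - a} - \frac{1}{x + a}\right) = \frac{1}{2a} \cdot \frac{x + a - (x - a)}{(x - a)(x + a)} = \frac{1}{2a} \cdot \frac{2a}{x^2 - a^2} = \frac{1}{x^2 - a^2},$$

so that indeed $\int \frac{1}{x^2 - a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C$.

Case #2: $\frac{x - a}{x + a} < 0$.

$$\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C\right) = \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{a - x}{x + a} + C\right) = \frac{1}{2a}\frac{d}{dx}\left[\ln(a - x) - \ln(x + a)\right]$$

$$= \frac{1}{2a}\left(\frac{-1}{a - x} - \frac{1}{x + a}\right) = \frac{1}{2a}\left(\frac{1}{x - a} - \frac{1}{x + a}\right) = \frac{1}{x^2 - a^2},$$

so that again $\int \frac{1}{x^2 - a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C$.

(e) Let $x$ not be an odd integer multiple of $\pi/2$, so that $\sec x \neq 0$.

Case #1: $\sec x > 0$.

$$\frac{d}{dx}(\ln|\sec x| + C) = \frac{d}{dx}(\ln \sec x + C) = \frac{\sec x \tan x}{\sec x} = \tan x,$$

so that indeed $\int \tan x\,dx = \ln|\sec x| + C$.

Case #2: $\sec x < 0$.

$$\frac{d}{dx}(\ln|\sec x| + C) = \frac{d}{dx}[\ln(-\sec x) + C] = \frac{-\sec x \tan x}{-\sec x} = \tan x,$$

so that again $\int \tan x\,dx = \ln|\sec x| + C$.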

Exercise 183. Prove the remaining rules of integration listed in Proposition 11. (Answers

on pp. 1135, 1136, 1137, and 1138.)


48.3

Trigonometric Functions

The following indefinite integrals are NOT on the List of Formulae and you are definitely

required to know how to derive them on your own!

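Fact 61. Let $m, n \in \mathbb{R}$ (with $m \neq \pm n$ in (d) to (f), so that the denominators are non-zero). Then

(a) $\int \sin^2 x\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C$,

(b) $\int \cos^2 x\,dx = \frac{1}{2}x + \frac{\sin 2x}{4} + C$,

(c) $\int \tan^2 x\,dx = \tan x - x + C$,

(d) $\int \sin(mx)\cos(nx)\,dx = -\frac{1}{2}\left\{\frac{\cos[(m - n)x]}{m - n} + \frac{\cos[(m + n)x]}{m + n}\right\} + C$,

(e) $\int \sin(mx)\sin(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m - n)x]}{m - n} - \frac{\sin[(m + n)x]}{m + n}\right\} + C$,

(f) $\int \cos(mx)\cos(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m - n)x]}{m - n} + \frac{\sin[(m + n)x]}{m + n}\right\} + C$.

Proof. (a) The trick is to recall the trigonometric identity $\cos 2x = 1 - 2\sin^2 x$ (this is in the List of Formulae, as are several other trig identities). And so:

$$\int \sin^2 x\,dx = \int \frac{1 - \cos 2x}{2}\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C.$$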

You are asked to prove the remaining rules of integration in Exercise 184.
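A claimed indefinite integral can always be sanity-checked numerically: differentiate the candidate $F$ and compare with the integrand $f$. The Python sketch below does this for the standard result $\int \sin^2 x\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C$, using a central-difference estimate of the derivative. The helper name `derivative` and the step size `h` are illustrative assumptions, and a numerical check of this kind is not a proof.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def F(x):
    # Candidate indefinite integral of sin^2 x, taking C = 0.
    return x / 2 - math.sin(2 * x) / 4

# The last two columns should agree to many decimal places.
for x in (-2.0, 0.3, 1.0, 4.5):
    print(x, derivative(F, x), math.sin(x) ** 2)
```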

Exercise 184. Prove the remaining rules of integration listed in Fact 61. (Answer on p.

1139.)


48.4

The method of integration by substitution (IBS) is the Chain Rule in reverse. Before

we explain why it works, here are two examples of how it works.53

Example 451. Let's find $\int \cot x\,dx$. First, observe that $\cot x = \frac{\cos x}{\sin x}$. Next, observe that $\frac{d}{dx}\sin x = \cos x$. Let $u = \sin x$ (this is our substitution), so that we also have $\frac{du}{dx} = \cos x$. So:

$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \int \frac{1}{u}\frac{du}{dx}\,dx.$$

So far, nothing unusual has happened. Now we're going to do something strange, which is to take that last expression and merrily "cancel out the $dx$'s":

$$\int \frac{1}{u}\frac{du}{dx}\,dx = \int \frac{1}{u}\,du + C_1.$$

Didn't we repeatedly insist earlier that the derivative $\frac{du}{dx}$ is NOT a fraction? So why are we allowed to merrily "cancel out the $dx$'s"!? Shortly we'll explain why this move is legitimate. For now, let us blindly persevere:

$$\int \frac{1}{u}\,du + C_1 = \ln|u| + C = \ln|\sin x| + C.$$

Another example, before we explain why exactly we can merrily "cancel out the $dx$'s":

Example 452. Let's find $\int 2x\cos x^2\,dx$. Let $u = x^2$, so that we also have $\frac{du}{dx} = 2x$. Now,

$$\int 2x\cos x^2\,dx = \int \frac{du}{dx}\cos u\,dx.$$

Again, we merrily "cancel out the $dx$'s" and write:

$$\int \frac{du}{dx}\cos u\,dx = \int \cos u\,du + C_1 = \sin u + C = \sin x^2 + C.$$

53 Actually we secretly already used this method a few times above, though not very explicitly.

We now explain why it is OK to merrily "cancel out the $dx$'s". In fact, saying that we merrily "cancel out the $dx$'s" is merely a mnemonic (memory device). We are not actually cancelling out the $dx$'s. Instead, we are appealing to the following result:

Theorem 9. Let $f: D \to \mathbb{R}$ be any continuous function. Let $u$ be a real-valued differentiable function. Assume $\mathrm{Range}(u) \subseteq D$, so that the composite function $f \circ u$ exists. Then

$$\int f\frac{du}{dx}\,dx = \int f\,du + C.$$

Proof. Let $P = \int f\frac{du}{dx}\,dx$ and $Q = \int f\,du$. In other words, $\frac{dP}{dx} \overset{1}{=} f\frac{du}{dx}$ and $\frac{dQ}{du} \overset{2}{=} f$. By the Chain Rule and $\overset{2}{=}$,

$$\frac{dQ}{dx} = \frac{dQ}{du}\frac{du}{dx} \overset{3}{=} f\frac{du}{dx}.$$

Examining $\overset{1}{=}$ and $\overset{3}{=}$, we see that $P$ and $Q$ are both indefinite integrals for $f\frac{du}{dx}$. And so by Fact 60 (uniqueness of the indefinite integral up to a constant), $P$ and $Q$ must be equal (or differ by at most a constant). That is, $P = Q + C$ or

$$\int f\frac{du}{dx}\,dx = \int f\,du + C.$$

The above result says that when doing integration, we are allowed to merrily do two things:

1. Replace $\frac{du}{dx}\,dx$ with $du$ ("cancel out the $dx$'s" from $\frac{du}{dx}\,dx$ to get $du$);

2. Replace $du$ with $\frac{du}{dx}\,dx$ ("multiply" $du$ by $\frac{dx}{dx} = 1$ to get $\frac{du}{dx}\,dx$).

Of course, we are not actually doing any such things as "cancelling out the $dx$'s" or "multiplying by $\frac{dx}{dx} = 1$"; these are merely mnemonics. Instead, all we are doing is appealing to the above theorem.54

Let's try more examples, now that we have a better understanding of how this works:

54 Compare the Inverse Function Theorem (IFT): $\frac{dy}{dx} = 1 / \frac{dx}{dy}$. The IFT is true NOT because $\frac{dy}{dx}$ and $\frac{dx}{dy}$ are fractions. Nonetheless, as a convenient mnemonic, we can pretend that the IFT holds because $\frac{dy}{dx}$ and $\frac{dx}{dy}$ are fractions, even though strictly speaking, such thinking is wrong.

Example 453. Let's find $\int e^{\sin x}\cos x\,dx$. Let $u = \sin x$, so that we also have $\frac{du}{dx} = \cos x$. Now we can write

$$\int e^{\sin x}\cos x\,dx = \int e^u\frac{du}{dx}\,dx \overset{1}{=} \int e^u\,du + C_1 = e^u + C = e^{\sin x} + C,$$

where $\overset{1}{=}$ uses Theorem 9. Purely as a mnemonic, we may think of this step $\overset{1}{=}$ as "cancelling out the $dx$'s", even though strictly speaking, we are doing no such thing; instead, we are appealing to Theorem 9.

Example 454. Let's find $\int (x^3 + 5x^2 - 3x + 2)^{50}(3x^2 + 10x - 3)\,dx$. One method would be to fully expand the integrand to get a 152nd-degree polynomial, then integrate this polynomial term-by-term. This is doable, but absurdly tedious.

A better method is to observe that $3x^2 + 10x - 3 = \frac{d}{dx}(x^3 + 5x^2 - 3x + 2)$. Thus, let $u = x^3 + 5x^2 - 3x + 2$. Then we can write

$$\int (x^3 + 5x^2 - 3x + 2)^{50}(3x^2 + 10x - 3)\,dx = \int u^{50}\frac{du}{dx}\,dx \overset{1}{=} \int u^{50}\,du + C_1 = \frac{u^{51}}{51} + C = \frac{(x^3 + 5x^2 - 3x + 2)^{51}}{51} + C,$$

where $\overset{1}{=}$ uses Theorem 9.

In the next three examples, we go in the opposite direction. That is, instead of "cancelling out the $dx$'s" as was done in the previous few examples, we instead "multiply by $\frac{dx}{dx} = 1$".

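Example 455. Let's find $\int \sqrt{1 - u^2}\,du$. We'll use the substitution $u = \sin x$. Note that $\sqrt{1 - u^2} = \sqrt{1 - \sin^2 x} = \cos x$. Moreover, $\frac{du}{dx} = \cos x$. So

$$\int \sqrt{1 - u^2}\,du = \int \cos x\,du = \int \cos x\,\frac{dx}{dx}\,du \overset{1}{=} \int \cos x\,\frac{du}{dx}\,dx + C_1$$

$$= \int \cos x\cos x\,dx + C_1 = \int \cos^2 x\,dx + C_1 \overset{2}{=} \frac{1}{2}x + \frac{\sin 2x}{4} + C$$

$$= \frac{1}{2}\sin^{-1} u + \frac{2\sin x\cos x}{4} + C = \frac{\sin^{-1} u + u\sqrt{1 - u^2}}{2} + C,$$

where $\overset{1}{=}$ uses Theorem 9 and $\overset{2}{=}$ uses Fact 61(b).

Example 456. Let's find $\int \frac{x^2}{\sqrt[3]{1 + 2x}}\,dx$. We'll use the substitution $u^3 = 1 + 2x$. Note that $x = \frac{1}{2}(u^3 - 1)$ and $\frac{dx}{du} = \frac{3}{2}u^2$. So

$$\int \frac{x^2}{\sqrt[3]{1 + 2x}}\,dx = \int \frac{\left[\frac{1}{2}(u^3 - 1)\right]^2}{\sqrt[3]{u^3}}\,dx = \int \frac{(u^3 - 1)^2}{4u}\,dx = \int \frac{(u^3 - 1)^2}{4u}\frac{du}{du}\,dx$$

$$\overset{1}{=} \int \frac{(u^3 - 1)^2}{4u}\frac{dx}{du}\,du + C_1 = \int \frac{(u^3 - 1)^2}{4u}\left(\frac{3}{2}u^2\right)du + C_1 = \int \frac{3u(1 - 2u^3 + u^6)}{8}\,du + C_1$$

$$= \frac{3}{8}\int \left(u - 2u^4 + u^7\right)du + C_1 = \frac{3}{8}\left(\frac{u^2}{2} - \frac{2u^5}{5} + \frac{u^8}{8}\right) + C = \frac{3}{8}u^2\left(\frac{1}{2} - \frac{2u^3}{5} + \frac{u^6}{8}\right) + C$$

$$= \frac{3}{8}(1 + 2x)^{2/3}\left[\frac{1}{2} - \frac{2(1 + 2x)}{5} + \frac{(1 + 2x)^2}{8}\right] + C$$

$$= \frac{3}{8}(1 + 2x)^{2/3}\,\frac{20 - 16 - 32x + 5 + 20x + 20x^2}{40} + C = \frac{3}{8}(1 + 2x)^{2/3}\,\frac{20x^2 - 12x + 9}{40} + C,$$

where $\overset{1}{=}$ uses Theorem 9. (The last line is just further simplification, which is nice but not necessary.)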
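With this much algebra it is easy to slip, so a quick numerical check is worthwhile. The Python sketch below compares the integrand $\frac{x^2}{\sqrt[3]{1 + 2x}}$ with a central-difference derivative of the candidate answer $\frac{3}{8}(1 + 2x)^{2/3}\,\frac{20x^2 - 12x + 9}{40}$ (taking $C = 0$). The function names and the sample points are illustrative assumptions, and the check only covers $x > -\frac{1}{2}$, where the floating-point powers are real.

```python
def integrand(x):
    return x**2 / (1 + 2 * x) ** (1 / 3)

def antiderivative(x):
    # Candidate answer for the integral, with C = 0.
    return (3 / 8) * (1 + 2 * x) ** (2 / 3) * (20 * x**2 - 12 * x + 9) / 40

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# The last two columns should agree closely.
for x in (0.1, 1.0, 3.0):
    print(x, derivative(antiderivative, x), integrand(x))
```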


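Example 457. Let's find $\int \frac{1}{1 + 3\cos^2 x}\,dx$. We'll use the substitution $u = \tan x$ or $x = \tan^{-1} u$. Note that

$$\cos^2 x = \frac{1}{\sec^2 x} = \frac{1}{1 + \tan^2 x} = \frac{1}{1 + u^2} \quad \text{and} \quad \frac{dx}{du} = \frac{1}{1 + u^2}.$$

So

$$\int \frac{1}{1 + 3\cos^2 x}\,dx = \int \frac{1}{1 + \frac{3}{1 + u^2}}\,dx = \int \frac{1}{1 + \frac{3}{1 + u^2}}\frac{du}{du}\,dx \overset{1}{=} \int \frac{1}{1 + \frac{3}{1 + u^2}}\frac{dx}{du}\,du + C_1 = \int \frac{1}{1 + \frac{3}{1 + u^2}} \cdot \frac{1}{1 + u^2}\,du + C_1$$

$$= \int \frac{1}{1 + u^2 + 3}\,du + C_1 = \int \frac{1}{2^2 + u^2}\,du + C_1 \overset{2}{=} \frac{1}{2}\tan^{-1}\left(\frac{u}{2}\right) + C = \frac{1}{2}\tan^{-1}\left(\frac{\tan x}{2}\right) + C,$$

where $\overset{1}{=}$ uses Theorem 9 ("multiply by $\frac{du}{du} = 1$") and $\overset{2}{=}$ uses Proposition 11.

Usually, the hard part is to figure out the appropriate substitution to make. Fortunately, in the A-level exams, you'll always be told what substitution to make.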

Exercise 185. (Answers on pp. 1140 and 1141.) (a) (i) Use the substitution x = 3 sec u

9

to find

dx.

x2 x2 9

9

9

to find

dx.

2 x2 9

1u

x

1

(iii) Show that sin (sec1 y) = 1 2 . Then explain why your answers in (i) and (ii) are

y

consistent.

x3

3

tan u to find

dx.

3/2

2

(4x2 + 9)

x3

2

(ii) Now use instead the substitution u = 4x + 9 to find

dx.

3/2

(4x2 + 9)

1

(iii) Show that cos (tan1 y) =

. Then explain why your answers in (i) and (ii) are

1 + y2

consistent.

(b) (i) Use the substitution x =


48.5

Theorem 10. (Integration by Parts.) Let $u$ and $v$ be differentiable functions, which have continuous derivatives $u'$ and $v'$. Then $\int uv'\,dx = uv - \int u'v\,dx$.

$uv - \int u'v\,dx$, as desired.

$\frac{dv}{dx}$, Exponential, Trig, Algebraic, Inverse trig, Log. (This is because exponential functions are easiest to integrate, followed by trigonometric functions, etc.)

By the DETAIL rule of thumb, we should choose $v' = e^x$. Now,

$$\int \underbrace{x}_{u}\,\underbrace{e^x}_{v'}\,dx = uv - \int u'v\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C = e^x(x - 1) + C.$$

By the DETAIL rule of thumb, we should choose $v' = e^x$. Now,

$$\int \underbrace{x^2}_{u}\,\underbrace{e^x}_{v'}\,dx = uv - \int u'v\,dx = x^2e^x - \int 2xe^x\,dx = x^2e^x - 2e^x(x - 1) + C = e^x(x^2 - 2x + 2) + C.$$

(Use the previous example.)

49

The problem of finding the definite integral is the problem of finding the area under a curve.

The problem of finding the derivative is the problem of finding the slope of the tangent.

The two Fundamental Theorems of Calculus (FTCs) show that, surprisingly enough, these

two problems are intimately (indeed inversely) related.

This chapter is a largely-informal discussion of the intuition behind the FTCs.

49.1

Given a continuous real-valued function f , its area function is denoted A and is, informally, defined by the mapping A(c) = Area bounded by the graph of f , the horizontal

axis, and the vertical lines x = 0 and x = c.

Example 460. Graphed below is the continuous function $f: \mathbb{R}_0^+ \to \mathbb{R}$ defined by $f(x) = \sqrt{x} + 1$.

The area $A(6)$ is highlighted in red. It is the area bounded by the graph of $f$, the horizontal axis, and the vertical lines $x = 0$ and $x = 6$.

Using a graphing calculator, $A(6) = 15.79795897\ldots$ Is there a way I can figure this out without a graphing calculator? Here's one possible approach: let's approximate the area by using three rectangles.

We'll use three rectangles of equal width, so each rectangle has width 2. The leftmost rectangle will occupy the interval $[0, 2]$, the middle rectangle will occupy $[2, 4]$, and the rightmost rectangle will occupy $[4, 6]$.

For each rectangle, we choose its height to be the lowest value attained by the function in that interval. In the interval $[0, 2]$, the lowest value attained by $f$ is $f(0)$. So the leftmost blue rectangle has height $f(0)$ and thus area Base × Height $= 2f(0)$.

Similarly, the middle green rectangle has height $f(2)$, because in the interval $[2, 4]$, the lowest value attained by $f$ is $f(2)$. Hence, it has area Base × Height $= 2f(2)$.

The rightmost grey rectangle has height $f(4)$, because in the interval $[4, 6]$, the lowest value attained by $f$ is $f(4)$. Hence, it has area Base × Height $= 2f(4)$.

Adding up the areas of the three rectangles gives

$$S_{L3} = 2f(0) + 2f(2) + 2f(4) \approx 12.828,$$

where $S_{L3}$ stands for "Lower Sum in the case of 3 rectangles with equal width". This is our very first approximation of the area $A(6)$. We see that this is a fairly poor approximation, because the true area is $A(6) = 15.79795897\ldots$ Nonetheless, it is useful: we know that $S_{L3}$ is a lower bound for $A(6)$. That is, we know that $S_{L3} \leq A(6)$.

We'll next try a different approximation $S_{U3}$. Can you guess what this involves?

We'll again use three rectangles of equal width (width 2), occupying intervals $[0, 2]$, $[2, 4]$, and $[4, 6]$. The difference now is that for each rectangle, we choose its height to be the highest value attained by the function in that interval. In the interval $[0, 2]$, the highest value attained by $f$ is $f(2)$. So the leftmost blue rectangle has height $f(2)$ and thus area Base × Height $= 2f(2)$.

Similarly, the middle green rectangle has height $f(4)$, because in the interval $[2, 4]$, the highest value attained by $f$ is $f(4)$. Hence, it has area Base × Height $= 2f(4)$.

The rightmost grey rectangle has height $f(6)$, because in the interval $[4, 6]$, the highest value attained by $f$ is $f(6)$. Hence, it has area Base × Height $= 2f(6)$.

Adding up the areas of the three rectangles gives

$$S_{U3} = 2f(2) + 2f(4) + 2f(6) \approx 17.727,$$

where $S_{U3}$ stands for "Upper Sum in the case of 3 rectangles with equal width". This is our second approximation of the area $A(6)$. We see that again, this is a fairly poor approximation, because the true area is $A(6) = 15.79795897\ldots$ Nonetheless, it is again useful: we know that $S_{U3}$ is an upper bound for $A(6)$. That is, we know that $A(6) \leq S_{U3}$.

Altogether, we know that $12.828 \approx S_{L3} \leq A(6) \leq S_{U3} \approx 17.727$.

Can we do better than this? Yes, certainly. An obvious follow-up would be to increase the number of rectangles we use. Let's next use 6 rectangles instead.

We'll now use six rectangles of equal width (width 1), occupying intervals $[0, 1]$, $[1, 2]$, $[2, 3]$, $[3, 4]$, $[4, 5]$, and $[5, 6]$. To calculate the Lower Sum $S_{L6}$, we give the first rectangle height of $f(0)$, the second $f(1)$, ..., the sixth $f(5)$. So each rectangle has, respectively, area $1 \cdot f(0)$, $1 \cdot f(1)$, ..., and $1 \cdot f(5)$. Hence, $S_{L6} = f(0) + f(1) + f(2) + f(3) + f(4) + f(5) \approx 14.382$.

Analogously, to calculate the Upper Sum $S_{U6}$, we give the first rectangle height of $f(1)$, the second $f(2)$, ..., the sixth $f(6)$. So each rectangle has, respectively, area $1 \cdot f(1)$, $1 \cdot f(2)$, ..., and $1 \cdot f(6)$. Hence, $S_{U6} = f(1) + f(2) + f(3) + f(4) + f(5) + f(6) \approx 16.832$.

Once again, $A(6)$ has lower and upper bounds $S_{L6}$ and $S_{U6}$. That is, $14.382 \approx S_{L6} \leq A(6) \leq S_{U6} \approx 16.832$.

You can see where this is going. We can get ever better lower and upper bounds, by increasing the number of rectangles we use.
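The rectangle sums are easy to automate. The Python sketch below assumes the function in this example is $f(x) = \sqrt{x} + 1$ (this matches the quoted value $A(6) = 15.79795897\ldots$). Because it takes the left end of each interval for the lower sum and the right end for the upper sum, it is only valid for an increasing function. The name `lower_upper_sums` is illustrative.

```python
import math

def f(x):
    return math.sqrt(x) + 1   # assumed form of the function in this example

def lower_upper_sums(f, a, b, n):
    """Lower and upper sums with n equal-width rectangles, for an INCREASING f:
    the lowest value on each sub-interval is at its left end, the highest at its right."""
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    lower = width * sum(f(x) for x in points[:-1])
    upper = width * sum(f(x) for x in points[1:])
    return lower, upper

for n in (3, 6, 1000):
    print(n, lower_upper_sums(f, 0, 6, n))
```

As $n$ grows, the two sums squeeze together around the true area.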


Exercise 187. Continuing with the above example, find $S_{L12}$ and $S_{U12}$. Hence give lower and upper bounds for $A(6)$. (Answer on p. 1143.)

Let $n$ be the number of rectangles we use. We will always have $S_{Ln} \leq A(6) \leq S_{Un}$. As $n$ increases, we have increasingly-many, increasingly-slim rectangles. As $n \to \infty$, we have infinitely-many, infinitely-slim rectangles, whose total area should approach $A(6)$.

Indeed, this "slim rectangles" approach is exactly how the area function is formally and rigorously defined; see Section 88.17 in the Appendices for the details (optional).

It appears then that we need to do more maths to figure out how to add up all these infinitely-many, infinitely-slim rectangles. ... But it turns out though that there is an absolutely-fantastic shortcut we can use.

49.2

Given a function $f$, we sketched an idea of how to find its area function $A$: approximate the area under the curve using infinitely-many, infinitely-slim rectangles and add up the total area of these rectangles. This though was merely a sketch of an idea. How do we go about adding up the area of these infinitely-many, infinitely-slim rectangles? Easier said than done!

It turns out though that we'll take an entirely different approach. Strangely enough, instead of finding the area function $A$, we shall try to find the area function's derivative $A'$. This seems utterly bizarre. If we don't know what $A$ is in the first place, how could we possibly figure out what $A'$ is? This is analogous to asking someone, who has no idea where Singapore is, to find the Singapore Flyer!

But surprisingly, it turns out to be much easier to find $A'$ than it is to find $A$! We'll recycle the example from the last section:

Example 460 (continued from the previous section). Pick some $x$ that's just a little larger than 6. $A(x)$ is the area bounded by the graph of $f$, between the vertical line at 0 and the vertical line at $x$. And so $A(x)$ is just slightly larger than $A(6)$.

[Figure: graph of $f$, with a thin green vertical strip between 6 and $x$.]

Consider the thin green vertical strip. This green strip is roughly rectangular in shape: its left, right, and bottom edges are all straight. Only its upper edge is not straight.

This green strip's area is exactly $A(x) - A(6)$. Moreover, we know that its base is $x - 6$, its left side is $f(6)$, and its right side is $f(x)$. Hence,

(Area of rectangle with base $x - 6$ and height $f(6)$) < (Area of thin green vertical strip) < (Area of rectangle with base $x - 6$ and height $f(x)$), that is,

$$(x - 6)f(6) < A(x) - A(6) < (x - 6)f(x).$$

Rearranging, we have

$$f(6) < \frac{A(x) - A(6)}{x - 6} < f(x).$$

Now consider what happens if we pick another $x$ that is slightly smaller but still larger than 6. Then the above pair of inequalities will still hold. Indeed, for all $x > 6$, the above pair of inequalities hold. If we let $x$ approach 6, the above pair of inequalities becomes

$$\lim_{x \to 6} f(6) \leq \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} \leq \lim_{x \to 6} f(x).$$

(For why the strict inequalities $<$ became weak inequalities $\leq$, either you simply trust me or see Fact 7 in the Appendices.)

Of course, $\lim_{x \to 6} f(6)$ is simply $f(6)$. And by the continuity of $f$, $\lim_{x \to 6} f(x) = f(6)$. Hence,

$$f(6) \leq \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} \leq f(6),$$

which means of course that $\lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} = f(6)$. But wait a second ... what is $\lim_{x \to 6} \frac{A(x) - A(6)}{x - 6}$? It is simply the value of the derivative of $A$ at 6!!

$$A'(6) \overset{\text{Def}}{=} \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6}.$$

We thus conclude that astonishingly enough, $A'(6) = f(6)$. And this is more generally true: given a continuous function $f$, the derivative of its area function is simply the original function itself! This is the First Fundamental Theorem of Calculus.


Note that earlier we defined the area function so that we started counting the area from

x = 0 (vertical axis). But this was just to keep the above arguments and diagrams simple.

It makes no difference if we start counting the area from any other x = a instead.
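As a rough numerical sanity check of the argument above, the sketch below approximates the area function $A$ by adding up thin rectangles and then compares the difference quotient $(A(6+h) - A(6))/h$ with $f(6)$. The function $f(x) = x^2 + 1$, the step $h$, and the helper names are assumptions chosen purely for illustration; they do not come from Example 460.

```python
def f(x):
    # Hypothetical continuous function, chosen only for illustration.
    return x ** 2 + 1


def area(x, n=20_000):
    """Approximate A(x), the area under f between 0 and x, using n thin rectangles."""
    width = x / n
    return sum(f((i + 0.5) * width) for i in range(n)) * width


def difference_quotient(x, h=1e-3):
    """(A(x + h) - A(x)) / h, which should be close to A'(x) when h is small."""
    return (area(x + h) - area(x)) / h


if __name__ == "__main__":
    print("difference quotient at 6:", difference_quotient(6.0))
    print("f(6):                    ", f(6.0))
```

The two printed numbers should be close to each other, and they should get closer as `h` shrinks, which is what the limit argument predicts.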

Theorem 11. (First Fundamental Theorem of Calculus [FTC1], informal statement.) Let $f$ be a real-valued continuous function with area function $A$. Then $A' = f$.

In words, the FTC1 says that the derivative of the area function of a continuous function is simply the function itself! Equivalently, an indefinite integral (or antiderivative) of a continuous function is the area function.[55]

Exercise 188. Why did I use the indefinite article "an", rather than the definite article "the", in the last sentence above? (Answer on p. 1143.)

Example 461. Graphed below is the velocity $v$ (m s⁻¹) of a car as a function of time $t$ (s). Recall that the area under the graph is the distance travelled by the car. For example, the shaded red area $A(5)$ is the total distance travelled by the car after 5 s.

But the derivative of the distance travelled with respect to time is precisely the velocity! Hence, this example illustrates the FTC1: the derivative of the area under the graph of a function is precisely the function itself!

[Figure: velocity (m s⁻¹) against time (s), with the area $A(5)$ shaded in red.]

[55] For a formal, rigorous statement of FTC1 and its proof, see section 88.17 in the Appendices.


49.3

$\int_p^q f(x)\,dx$ is the area under the graph of $f$, between $p$ and $q$. (Compare this to the area function: $A(k)$ is the area under the graph of $f$, between 0 and $k$.) We call $\int_p^q f(x)\,dx$ the definite (or Riemann) integral of $f$ between $p$ and $q$.[56]

the area under f , between 1 and 3) is highlighted in blue. Similarly, the definite integral

q

[Figure: graph of $f$, with the area under $f$ between 1 and 3 highlighted in blue.]

IMPORTANT REMARK

The indefinite integral $\int f\,dx$ and the definite integral $\int_p^q f\,dx$ have very similar names and notation. But do not make the mistake of believing that we've simply defined them so that they're similar; we have not.

The indefinite integral $\int f(x)\,dx$ is an antiderivative of $f$. (It is also a function.)

The definite integral $\int_a^b f(x)\,dx$ is the area under the graph of $f$, between $a$ and $b$. (It is also a number.)

A priori, there is no reason whatsoever to believe that some antiderivative of f and

some area under the graph of f have anything in common.

It is the two FTCs that establish the connection between the two. This is what makes the

FTCs remarkable and surprising.

And it is because of this connection that we give these two distinctly-defined mathematical

objects such similar names and notation.

[56] For the formal definition of the definite integral, see section 88.17 in the Appendices.

49.4

$A(q)$ is the area under the curve between 0 and $q$. Similarly, $A(p)$ is the area under the curve between 0 and $p$.

Thus, $\int_p^q f\,dx = A(q) - A(p)$. From this and also with the aid of the FTC1, we can easily prove the FTC2.

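Theorem 12. (The Second Fundamental Theorem of Calculus [FTC2].) Let $f : [a, b] \to \mathbb{R}$ be a continuous function and $p, q \in [a, b]$. Then

$$\int_p^q f\,dx = \left.\int f\,dx\,\right|_{x=q} - \left.\int f\,dx\,\right|_{x=p}.$$

Proof. By Theorem 11, the area function $A$ is an indefinite integral of $f$. And so by Fact 60, $A$ and $\int f\,dx$ differ by at most a constant. That is, for all $r \in [a, b]$, we have $A(r) = \left.\int f\,dx\,\right|_{x=r} + C$. Hence,

$$\begin{aligned}
\int_p^q f\,dx &= A(q) - A(p) \\
&= \left[\left.\int f\,dx\,\right|_{x=q} + C\right] - \left[\left.\int f\,dx\,\right|_{x=p} + C\right] \\
&= \left.\int f\,dx\,\right|_{x=q} - \left.\int f\,dx\,\right|_{x=p}.
\end{aligned}$$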

50

Definite Integrals

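To repeat:

The indefinite integral $\int f\,dx$ is an antiderivative of $f$.

The definite integral $\int_a^b f\,dx$ is the area under the graph of $f$, between $a$ and $b$.

A priori, there is no reason to believe that the two are in any way related. It is the two FTCs that establish their remarkable relationship:

$$\int_a^b f\,dx = \left.\int f\,dx\,\right|_{x=b} - \left.\int f\,dx\,\right|_{x=a}.$$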

To compute $\int_a^b f\,dx$, one method would have been to painfully add up the area of the infinitely-many infinitely-slim rectangles. Thanks to the FTCs, we have a wonderful alternative method that is much easier:

1. Find any indefinite integral of f .

2. The difference of the values of this indefinite integral at b and a is our desired area.

We can simply apply all the rules of integration we learnt earlier.
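The contrast between the two methods can be sketched numerically. The block below is only an illustration under assumed choices: it takes $f = \cos$ (for which $\sin$ is one indefinite integral), the limits $a = 0$ and $b = 1$, and a helper named `rectangles`; none of these come from the text.

```python
import math


def rectangles(f, a, b, n=100_000):
    """Add up n thin rectangles (midpoint heights) under f between a and b."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Illustrative choice: f = cos, for which sin is one indefinite integral.
f = math.cos
F = math.sin
a, b = 0.0, 1.0

slow_way = rectangles(f, a, b)  # painfully adding up many slim rectangles
fast_way = F(b) - F(a)          # Steps 1 and 2: difference of an indefinite integral

print(slow_way, fast_way)
```

Both numbers approximate the same area; the second needs only two function evaluations.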


50.1

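Example 463. Find the exact area bounded by the curve $y = x^2$ and the horizontal lines $y = 1$ and $y = 2$.

It's always helpful to make a quick sketch (given below). Our desired area is labelled $A$ below. To find a desired area, there are usually multiple methods, some quicker than others.

Method #1. The area of $B$ is

$$\int_{-\sqrt2}^{-1} x^2\,dx = \left[\frac{x^3}{3}\right]_{-\sqrt2}^{-1} = -\frac{1}{3} - \left(-\frac{2\sqrt2}{3}\right) = \frac{2\sqrt2 - 1}{3}.$$

By symmetry, $D$ has the same area as $B$. $C$ has area $1 \times 2 = 2$. Hence, $A$ has area

$$A + B + C + D - (B + C + D) = 4\sqrt2 - \left(\frac{2\sqrt2 - 1}{3} + 2 + \frac{2\sqrt2 - 1}{3}\right) = \frac{4}{3}\left(2\sqrt2 - 1\right).$$

Method #2. The right branch of the parabola $y = x^2$ has equation $x = \sqrt{y}$. The right half of the area $A$ is

$$\int_{y=1}^{y=2} x\,dy = \int_{y=1}^{y=2} \sqrt{y}\,dy = \frac{2}{3}\left[y^{3/2}\right]_1^2 = \frac{2}{3}\left(2\sqrt2 - 1\right).$$

Hence, $A = \frac{4}{3}\left(2\sqrt2 - 1\right)$.

[Figure: the parabola $y = x^2$ and the lines $y = 1$ and $y = 2$, with the regions $A$, $B$, $C$, and $D$ labelled.]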

Exercise 189. Find the exact area bounded by the curve $y = x^3$, the horizontal lines $y = 1$ and $y = 2$, and the vertical axis. (Answer on p. 1144.)

50.2

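Example 464. Find the area $A$ bounded by the curve $y = x^2$ and the line $y = x + 1$.

By the quadratic formula, the curve and line intersect at the points $x = \frac{1 \pm \sqrt5}{2}$. So

$$\begin{aligned}
A &= \int_{(1-\sqrt5)/2}^{(1+\sqrt5)/2} x + 1 - x^2\,dx = \left[\frac{x^2}{2} + x - \frac{x^3}{3}\right]_{(1-\sqrt5)/2}^{(1+\sqrt5)/2} \\
&= \left[\frac{(1+\sqrt5)^2}{2^3} + \frac{1+\sqrt5}{2} - \frac{(1+\sqrt5)^3}{3\cdot2^3}\right] - \left[\frac{(1-\sqrt5)^2}{2^3} + \frac{1-\sqrt5}{2} - \frac{(1-\sqrt5)^3}{3\cdot2^3}\right] \\
&= \left[\frac{6+2\sqrt5}{8} + \frac{1+\sqrt5}{2} - \frac{16+8\sqrt5}{24}\right] - \left[\frac{6-2\sqrt5}{8} + \frac{1-\sqrt5}{2} - \frac{16-8\sqrt5}{24}\right] \\
&= \left[\frac{3+\sqrt5}{4} + \frac{1+\sqrt5}{2} - \frac{2+\sqrt5}{3}\right] - \left[\frac{3-\sqrt5}{4} + \frac{1-\sqrt5}{2} - \frac{2-\sqrt5}{3}\right] \\
&= \frac{7+5\sqrt5}{12} - \frac{7-5\sqrt5}{12} = \frac{5\sqrt5}{6}.
\end{aligned}$$

[Figure: the curve $y = x^2$ and the line $y = x + 1$, with the enclosed area $A$.]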
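The tedious arithmetic in Example 464 can be cross-checked numerically. The sketch below is only a check, not the author's method: it finds the intersection points from the quadratic formula and adds up thin rectangles under the gap between the line and the parabola. The helper name `midpoint_integral` and the number of rectangles are assumptions for illustration.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b with n thin rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Intersections of y = x^2 and y = x + 1, i.e. the roots of x^2 - x - 1 = 0.
left = (1 - math.sqrt(5)) / 2
right = (1 + math.sqrt(5)) / 2

# Between the intersections, the line lies above the parabola.
numeric_area = midpoint_integral(lambda x: (x + 1) - x ** 2, left, right)
exact_area = 5 * math.sqrt(5) / 6

print(numeric_area, exact_area)
```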

Exercise 190. Find the exact area bounded by the curve $y = \sin x$ and the line $y = 0.5$, for $x \in (0, \pi/2)$. (Answer on p. 1145.)

50.3

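By the quadratic formula, the curves intersect at $x = \frac{1 \pm \sqrt5}{2}$. So

$$\begin{aligned}
A &= \int_{0.5(1-\sqrt5)}^{0.5(1+\sqrt5)} 1 - x^2 - \left(x^2 - 2x - 1\right)dx = 2\int_{0.5(1-\sqrt5)}^{0.5(1+\sqrt5)} 1 - x^2 + x\,dx \\
&= 2\left[x - \frac{x^3}{3} + \frac{x^2}{2}\right]_{0.5(1-\sqrt5)}^{0.5(1+\sqrt5)} = \frac{5\sqrt5}{3},
\end{aligned}$$

where we've simply recycled our tedious calculations from the previous example.

[Figure: the two curves, with the enclosed area $A$.]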

Exercise 191. Find the exact area bounded by the curves $y = 2 - x^2$ and $y = x^2 + 1$. (Answer on p. 1145.)

50.4

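The definite integral calculates the signed area under the curve and above the $x$-axis. So if the curve is under the $x$-axis, the computed area will be negative, as we now see:

$$\int_{-2}^{2} x^2 - 4\,dx = \left[\frac{x^3}{3} - 4x\right]_{-2}^{2} = \left(\frac{8}{3} - 8\right) - \left(-\frac{8}{3} + 8\right) = -\frac{32}{3}.$$

But of course, an area is simply a magnitude, so we'll take the absolute value and conclude that the desired area is $\frac{32}{3}$.

[Figure: the curve and the $x$-axis, with the enclosed area $A$ lying below the $x$-axis.]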
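To see the difference between signed area and area as a magnitude, here is a small numerical sketch. It reads the integrand above as $x^2 - 4$ on the interval from $-2$ to $2$; the helper name `midpoint_integral` is an assumption for illustration.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b with n thin rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def g(x):
    # This curve lies below the x-axis for -2 < x < 2.
    return x ** 2 - 4


signed_area = midpoint_integral(g, -2.0, 2.0)                      # negative
unsigned_area = midpoint_integral(lambda x: abs(g(x)), -2.0, 2.0)  # a magnitude

print(signed_area, unsigned_area)
```

Integrating $|g|$ instead of $g$ gives the magnitude directly, which is also the safer route when a curve crosses the axis inside the interval.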

Exercise 192. Find the exact area bounded by $y = x^4 - 16$ and the $x$-axis. (Answer on p. 1146.)

50.5

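Example 467. Consider the curve described by the equations $x = t^3 - 2$ and $y = 4 - t^5$. Find the exact area bounded by the curve, the lines $x = -2$ and $x = -1$, and the horizontal axis.

It helps to graph this curve on your graphing calculator. The desired area may be computed as:

$$\int_{x=-2}^{x=-1} y\,dx = \int_{t=0}^{t=1} y\,\frac{dx}{dt}\,dt = \int_{t=0}^{t=1} \left(4 - t^5\right)3t^2\,dt = \left[4t^3 - \frac{3t^8}{8}\right]_0^1 = 4 - \frac{3}{8} = \frac{29}{8}.$$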
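The substitution used for parametric curves can be checked numerically. The sketch below assumes the curve is read as $x = t^3 - 2$, $y = 4 - t^5$ (the minus signs are a reconstruction), so that $x = -2$ and $x = -1$ correspond to $t = 0$ and $t = 1$. It integrates $y \cdot \frac{dx}{dt}$ with respect to $t$; the function names are hypothetical.

```python
def x_of_t(t):
    return t ** 3 - 2


def y_of_t(t):
    return 4 - t ** 5


def dx_dt(t):
    return 3 * t ** 2


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b with n thin rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Area under the curve between x = -2 (t = 0) and x = -1 (t = 1).
area = midpoint_integral(lambda t: y_of_t(t) * dx_dt(t), 0.0, 1.0)
print(area)
```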

Exercise 193. Find the exact area bounded by the curve, the lines $y = 1$ and $y = 2$, and the vertical axis. (Answer on p. 1146.)

50.6

Example 468. Consider the line $y = 1$. Rotate it about the $x$-axis to form an (infinite) 3D cylinder. Now consider the finite portion of the cylinder between $x = 1$ and $x = 2$. By a primary school formula, its volume is Base Area × Height $= \pi \cdot 1^2 \times (2 - 1) = \pi$.

[Figure: the cylinder, with its height, radius, and volume labelled.]

We can also compute this same volume using integration. The intuition is that we're adding up infinitely-many infinitely-thin circle-shaped slices, laid on their sides, from $x = 1$ to $x = 2$ (left to right). The face of each of these circles has area $\pi y^2$. In this particular example, $y$ is constant (simply 1). Thus, the total volume is

$$\int_1^2 \pi y^2\,dx = \pi\int_1^2 dx = \pi\left[x\right]_1^2 = \pi.$$

Example 469. Rotate the line $y = 3x$ about the $x$-axis to form an infinite double cone. Consider the finite portion of the cone between $x = 0$ and $x = 2$. By the formula for the volume of a cone, we know its volume is $\frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 6^2 \cdot 2 = 24\pi$.

We can also compute this same volume using integration. Again, the intuition is that we're adding up infinitely-many infinitely-thin circle-shaped slices, from $x = 0$ to $x = 2$. Again, the face of each of these circles has area $\pi y^2$. In this particular example, $y = 3x$. Thus, the total volume is

$$\int_0^2 \pi y^2\,dx = \pi\int_0^2 (3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_0^2 = 24\pi.$$

[Figure: the cone, with its height, radius, and volume labelled.]

Now consider instead the finite portion of the cone between $x = 3$ and $x = 5$. This looks like a pedestal tilted sideways (not illustrated). We can easily compute its volume using integration:

$$\int_3^5 \pi y^2\,dx = \pi\int_3^5 (3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_3^5 = 294\pi.$$

Computing its volume using geometric formulae is possible, if slightly more tedious. The finite portion of the cone between $x = 0$ and $x = 3$ is $V_1 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 9^2 \cdot 3 = 81\pi$. The finite portion of the cone between $x = 0$ and $x = 5$ is $V_2 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 15^2 \cdot 5 = 375\pi$. Hence, the desired volume is $V = V_2 - V_1 = 375\pi - 81\pi = 294\pi$.
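The "adding up thin circular slices" intuition translates almost directly into code. The following is a minimal sketch, with a hypothetical helper `volume_about_x_axis`, that sums many thin discs for the line $y = 3x$ and compares the totals with the cone-formula values from the example.

```python
import math


def volume_about_x_axis(y, a, b, n=100_000):
    """Add up n thin discs of radius y(x) and thickness (b - a) / n."""
    width = (b - a) / n
    return sum(math.pi * y(a + (i + 0.5) * width) ** 2 for i in range(n)) * width


def line(x):
    return 3 * x


cone = volume_about_x_axis(line, 0.0, 2.0)      # compare with 24 * pi
pedestal = volume_about_x_axis(line, 3.0, 5.0)  # compare with 294 * pi

print(cone / math.pi, pedestal / math.pi)
```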


We can just as easily find the volume of rotation about the y-axis.

Example 470. Consider the curve $y = x^2$. Find its volume of rotation about the $y$-axis, from $y = 0$ to $y = 5$.

In this case, there are no familiar geometric formulae we can apply. So we really just have to compute this volume using integration. Again, the intuition is that we're adding up infinitely-many infinitely-thin circle-shaped slices, but this time these circle-shaped slices are stacked from bottom to top, from $y = 0$ to $y = 5$. The face of each of these circles has area $\pi x^2$, where in this particular example, $x^2 = y$. Thus, the total volume is

$$\int_0^5 \pi x^2\,dy = \pi\int_0^5 y\,dy = \pi\left[\frac{y^2}{2}\right]_0^5 = 12.5\pi.$$

[Figure: the solid of rotation, with its volume labelled.]

Exercise 194. Compute the volume of rotation of $y = \sin x$ about the $x$-axis from $x = 0$ to $x = \pi$. (Answer on p. 1146.)

50.7

Example 471. Use your TI84 to find the approximate area bounded by the curve $y = e^{\sin x}$ and the horizontal axis, between $x = 1$ and $x = 2$.

[Screenshots: the calculator display after each of Steps 1 to 9.]

2. Press Y= .

3. Press the blue 2ND button and then e^x (which corresponds to the LN button). Then press SIN X,T,θ,n ) ) and altogether you will have entered $e^{\sin x}$.

4. Now press GRAPH and the calculator will graph the given equation.

5. Press the blue 2ND button and then CALC (which corresponds to the TRACE button), to bring up the CALCULATE menu.

6. Press 7 to select the $\int f(x)\,dx$ option. This brings you back to the graph.

7. The TI84 is now prompting you for "Lower Limit?" Simply press 1 .

8. Now press ENTER and you will have told the TI84 that your lower limit is $x = 1$.

9. The TI84 is now similarly prompting you for "Upper Limit?" Simply press 2 .

10. Now press ENTER and you will have told the TI84 that your upper limit is $x = 2$. The TI84 also informs you that $\int f(x)\,dx = 2.60466115$. This is our desired area (which is now also kindly shaded in black by our TI84).
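For readers without the calculator to hand, the same approximate area can be reproduced in a few lines. This sketch uses composite Simpson's rule, which is an assumption made here for illustration: it is not necessarily the algorithm the TI84 uses internally.

```python
import math


def simpson(f, a, b, n=1_000):
    """Composite Simpson's rule with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-numbered interior points get weight 4, even-numbered ones weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


approx_area = simpson(lambda x: math.exp(math.sin(x)), 1.0, 2.0)
print(approx_area)  # should agree with the calculator's value to several decimal places
```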


51

Differential Equations

51.1 $\frac{dy}{dx} = f(x)$

$\frac{dy}{dx} = f(x)$ is simply equivalent to $y = \int f\,dx$.

Example 472. Solve $\frac{dy}{dx} = x^2$. Easy: $y = \int x^2\,dx = \frac{x^3}{3} + C$, where as usual $C$ is the constant of integration.

This is the general solution to the given differential equation. It is general because $C$ is free to vary and so there are many possible solutions for $y$.

However, suppose we are given also an additional piece of information: $x = 0 \implies y = 1$. Such information is often called an initial condition. Here's why. It might be that $y$ is the number of bats in a cave and $x$ is time. Then the initial condition tells us that at time $x = 0$ (i.e. initially), there is $y = 1$ bat in the cave. Over time, the bats in the cave grow according to the differential equation $\frac{dy}{dx} = x^2$.

With the initial condition $x = 0 \implies y = 1$, we have $1 = \frac{0^3}{3} + C$. We thus find that $C = 1$.

We thus have that $y = \frac{x^3}{3} + 1$. This is the particular solution to the given differential equation (with given initial condition).
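The difference between the general and the particular solution can be checked numerically. The sketch below estimates the slope of $y = \frac{x^3}{3} + C$ by a central difference for several values of $C$, and then confirms that $C = 1$ meets the initial condition. The helper names, sample points, and tolerance are assumptions for illustration.

```python
def y(x, C):
    """General solution y = x^3/3 + C."""
    return x ** 3 / 3 + C


def slope(g, x, h=1e-5):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Every member of the family should have slope x^2, whatever C is.
for C in (-2.0, 0.0, 1.0, 5.0):
    for x in (-1.5, 0.0, 0.7, 2.0):
        assert abs(slope(lambda u: y(u, C), x) - x ** 2) < 1e-6

# Of these, only C = 1 also satisfies the initial condition y(0) = 1.
print(y(0.0, 1.0))
```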

Example 473. Solve $\frac{dy}{dx} = \sin x$.

We have $y = \int \sin x\,dx = -\cos x + C$. This is the general solution to the given differential equation.

If we are given the initial condition that $x = 0 \implies y = 1$, then we can write $1 = -\cos 0 + C$ and find that $C = 2$. We thus have that $y = -\cos x + 2$. This is the particular solution to the given differential equation (with given initial condition).

dy

= ex sin x. Find also the particular solution,

dx

if given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1147.)

51.2 $\frac{dy}{dx} = f(y)$

Note that $\frac{dx}{dy} = \frac{1}{dy/dx}$ (for $\frac{dy}{dx} \neq 0$).

So given $\frac{dy}{dx} = f(y)$, rearrange to get $\frac{dx}{dy} = \frac{1}{f(y)}$ (for $f(y) \neq 0$). Equivalently, $x = \int \frac{1}{f(y)}\,dy$.

Example 474. Solve $\frac{dy}{dx} = y^2$.

Rearrange to get $\frac{dx}{dy} = \frac{1}{y^2}$ (for $y^2 \neq 0$ or $y \neq 0$). Hence, $x = \int \frac{1}{y^2}\,dy = -\frac{1}{y} + C$ (for $y \neq 0$). This is the general solution to the given differential equation.

We will often be asked to express $y$ in terms of $x$. If so, we can easily rearrange to get $y = \frac{1}{C - x}$ (for $x \neq C$). This is also the general solution to the given differential equation!

If given also the initial condition $x = 0 \implies y = 1$, then we have

$$1 = \frac{1}{C - 0} \implies C = 1.$$

Thus, $y = \frac{1}{1 - x}$ is the particular solution to the given differential equation (with given initial condition).
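One way to gain confidence in a particular solution is to step the differential equation forward numerically and compare. The sketch below uses Euler's method, which is brought in here only as a checking device and is not part of the text's method; the step count and the end point $x = 0.5$ are arbitrary choices.

```python
def euler(f, x0, y0, x_end, steps=100_000):
    """Euler's method for dy/dx = f(y), starting from the point (x0, y0)."""
    h = (x_end - x0) / steps
    y = y0
    for _ in range(steps):
        y += h * f(y)  # follow the slope f(y) over one small step
    return y


numeric = euler(lambda y: y ** 2, 0.0, 1.0, 0.5)
exact = 1 / (1 - 0.5)  # the particular solution y = 1/(1 - x), evaluated at x = 0.5

print(numeric, exact)
```

The two values should be close, with the gap shrinking as `steps` grows. Note that the solution blows up as $x$ approaches 1, so the check deliberately stops well short of that.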


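Example 475. Solve $\frac{dy}{dx} = \sin y$.

Rearrange to get $\frac{dx}{dy} = \frac{1}{\sin y} = \csc y$ (for $y$ not an integer multiple of $\pi$). Hence, by Proposition 11, $x = \int \csc y\,dy = -\ln\left|\csc y + \cot y\right| + C$ (for $y$ not an integer multiple of $\pi$). This is the general solution for the given differential equation.

This time, $y$ cannot be written as a single function of $x$, because for each given value of $x$, there are multiple possible values of $y$, as we now show, by manipulating that last equation:

$$x = -\ln\left|\csc y + \cot y\right| + C.$$

Now,

$$\csc y + \cot y = \frac{1 + \cos y}{\sin y} = \frac{2\cos^2(y/2)}{2\sin(y/2)\cos(y/2)} = \frac{\cos(y/2)}{\sin(y/2)} = \cot\frac{y}{2}.$$

So the general solution may be rewritten as $x = -\ln\left|\cot\frac{y}{2}\right| + C$. Because $\cot$ is periodic, for each given value of $x$, there are infinitely-many possible values of $y$ (one for each integer $m$).

But now suppose we have the initial condition $x = 3 \implies y = \frac{\pi}{2}$. Then

$$3 = -\ln\left|\csc\frac{\pi}{2} + \cot\frac{\pi}{2}\right| + C = -\ln 1 + C = C,$$

so that $C = 3$. We may write $y = 2\left(\cot^{-1} e^{3-x} + 2m\pi\right)$. Moreover, plugging in the same values for $x$ and $y$, we see that

$$\frac{\pi}{2} = 2\left(\cot^{-1} e^{0} + 2m\pi\right) = 2\left(\frac{\pi}{4} + 2m\pi\right).$$

Hence, $m = 0$ and $y = 2\cot^{-1} e^{3-x}$. This is the particular solution to the given differential equation (with given initial condition).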

Exercise 196. Solve $\frac{dy}{dx} = y^2 + 1$. Find also the particular solution, given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1147.)

51.3 $\frac{d^2y}{dx^2} = f(x)$

$\frac{d^2y}{dx^2} = f(x)$ is equivalent to $\frac{dy}{dx} = \int f\,dx$, which in turn is equivalent to $y = \int\left(\int f\,dx\right)dx$.

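Example 476. Solve $\frac{d^2y}{dx^2} = x^2$.

$\frac{dy}{dx} = \int x^2\,dx = \frac{x^3}{3} + C_1$. Next, $y = \int \frac{x^3}{3} + C_1\,dx = \frac{x^4}{12} + C_1 x + C_2$. This is the general solution to the given differential equation.

If given the initial conditions $x = 0 \implies y = 1$ and $x = 1 \implies y = 2$, then we have

$$1 = \frac{0^4}{12} + 0C_1 + C_2 \implies C_2 = 1,$$

$$2 = \frac{1^4}{12} + 1C_1 + 1 \implies C_1 = \frac{11}{12}.$$

Hence $y = \frac{x^4}{12} + \frac{11}{12}x + 1$ is the particular solution.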

Example 477. Solve $\frac{d^2y}{dx^2} = \sin x$.

$\frac{dy}{dx} = \int \sin x\,dx = -\cos x + C_1$. Next, $y = \int -\cos x + C_1\,dx = -\sin x + C_1 x + C_2$. This is the general solution to the given differential equation.

If given the additional pieces of information that $x = 0 \implies y = 1$ and $x = \pi \implies y = 2$, then we have

$$1 = -\sin 0 + 0C_1 + C_2 \implies C_2 = 1,$$

$$2 = -\sin \pi + \pi C_1 + 1 \implies C_1 = \frac{1}{\pi}.$$

Hence $y = -\sin x + \frac{1}{\pi}x + 1$ is the particular solution.

d2 y

= ex sin x. Find also the particular solution,

2

dx

given also that $x = 0 \implies y = 1$. (Answer on p. 1148.)

51.4

Word Problems

Formulate a differential equation from a problem situation; and

Interpret a differential equation and its solution in terms of a problem situation.

So that's what we'll do in this section.

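Example 478. A plate of bacteria grows at a rate that is inversely proportional to the number of bacteria. Express the number of bacteria as a function of time.

Let $x$ be the number of bacteria. Let $t$ be time. We are given that the growth rate of $x$ is inversely proportional to $x$. In other words, $\frac{dx}{dt} = \frac{k}{x}$, for some constant $k \in \mathbb{R}$. Rearranging, we have $\frac{dt}{dx} = \frac{x}{k}$. Thus,

$$t = \int \frac{x}{k}\,dx = \frac{x^2}{2k} + C.$$

Further rearranging, we have $x = \pm\sqrt{2k(t - C)}$, where of course the negative root may be rejected. Hence, $x = \sqrt{2k(t - C)}$.

Suppose we are also given that $t = 0 \implies x = 1$ and $t = 1 \implies x = 2$. Then we have

$$1 = \sqrt{2k(-C)} \quad\text{and}\quad 2 = \sqrt{2k(1 - C)}.$$

Squaring the first equation gives $C = -\frac{1}{2k}$. Plugging this into the square of the second equation gives $4 = 2k\left(1 + \frac{1}{2k}\right) = 2k + 1$, or $k = \frac{3}{2}$. Hence $C = -\frac{1}{3}$. Altogether then, the particular solution is $x = \sqrt{3t + 1}$.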
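A quick way to interpret the particular solution in terms of the problem situation is to check it against the original wording. The sketch below confirms numerically that $x = \sqrt{3t + 1}$ meets both given conditions, and that the product (growth rate) × (number of bacteria) stays constant over time, which is what "inversely proportional" means. The sample times and the helper `growth_rate` are assumptions for illustration.

```python
import math


def x(t):
    """Particular solution from the example: x = sqrt(3t + 1)."""
    return math.sqrt(3 * t + 1)


def growth_rate(t, h=1e-6):
    """Central-difference estimate of dx/dt."""
    return (x(t + h) - x(t - h)) / (2 * h)


# If dx/dt = k/x, then (dx/dt) * x should be the same constant at every time.
products = [growth_rate(t) * x(t) for t in (0.0, 0.5, 1.0, 4.0, 10.0)]

print(x(0.0), x(1.0))
print(products)
```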


Exercise 198. Follow these steps to find the escape velocity (of an object from Earth). (Answer on p. 1149.)

(a) The law of gravitation states that the force of attraction $F$ between two point masses $M$ and $m$ is proportional to the product of their masses and inversely proportional to the square of the distance $r$ between them. Write down this law in the form of an equation. Your answer should contain a constant; name this constant $G$ (this is the gravitational constant).

Momentum is defined as the product of mass $m$ and velocity $v$. Newton's Second Law of Motion states that force is the rate of change of momentum.

(b) (i) Write down Newton's Second Law in the form of an equation.

(ii) Assume that mass $m$ is constant. Explain why $F = m\frac{dv}{dt}$.

Now suppose $M$ and $m$ are, respectively, the masses of the Earth and a small ball. Assume that

The Earth is a perfect sphere with radius $R$ m.

You can treat the Earth as a single point with its mass concentrated at the centre of the sphere. Thus, the initial distance between the Earth's centre of mass and the ball is $R + x$ m.

Upwards (away from the Earth) is the positive direction and downwards (towards the centre of the Earth) is the negative direction.

The Earth is immobile.

There is no air resistance or any other form of friction.

(c) The small ball is initially held at rest, $x$ m above the surface of the Earth. It is then released. Let $v$ be the velocity of the ball. Explain why $-\frac{GM}{r^2} = \frac{dv}{dt}$. (In particular, explain why there is a negative sign.)

From the equation in (c), we may write:

$$\int_{R+x}^{R} -\frac{GM}{r^2}\,dr = \int_{R+x}^{R} \frac{dv}{dt}\,dr.$$

Let $v_s$ be the velocity at which the ball hits the surface of the Earth.

(d) (i) Show that the LHS of the above equation is equal to $GM\left(\frac{1}{R} - \frac{1}{R+x}\right)$.

(ii) Show that the RHS of the above equation is equal to $\frac{v_s^2}{2}$. (Hint 1: Use integration by substitution. Hint 2: What is $\frac{dr}{dt}$?)

(iii) Hence show that $v_s = -\sqrt{2GM\left(\frac{1}{R} - \frac{1}{R+x}\right)}$. Again, explain why $v_s$ is negative.

Suppose instead that the small ball is initially at rest on the surface of the earth. It is then propelled upwards at a velocity $V$.

(e) Explain why the ball will reach a maximum height of $x$ m, where $V = \sqrt{2GM\left(\frac{1}{R} - \frac{1}{R+x}\right)}$, before falling back down to the earth.

(f) The escape velocity $v_e$ is the velocity with which we must propel the ball upwards (from its initial resting position on the surface of the earth) so that it never falls back down. Explain why $v_e = \sqrt{\frac{2GM}{R}}$.

(g) Given that $G = 6.674 \times 10^{-11}$ m³ kg⁻¹ s⁻², $M = 5.972 \times 10^{24}$ kg, and $R = 6{,}371$ km, compute $\sqrt{\frac{2GM}{R}}$ (express your answer in km s⁻¹, correct to 4 significant figures).

51.5

SYLLABUS ALERT

This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you're taking 9758.

Example 479. The general solution to $\frac{dy}{dx} = x^2$ is $y = \int x^2\,dx = \frac{x^3}{3} + C$.

The corresponding family of solution curves is the set of equations $\left\{y = \frac{x^3}{3} + C : C \in \mathbb{R}\right\}$. This family is illustrated below.

[Figure: several members of the family of solution curves.]

The particular solution $y = \frac{x^3}{3} + 1$ is highlighted in red above.

Exercise 199. Sketch five members of the family of solution curves for $\frac{d^2y}{dx^2} = x$, given also that $x = 0 \implies y = 1$. (Answer on p. 1151.)

Part VI


52

How many arrangements or permutations are there of the three letters in CAT? For example, one possible permutation of CAT is TCA.

To solve this problem, one possible method is the method of enumeration. That is, simply list out (enumerate) all the possible permutations:

ACT, ATC, CAT, CTA, TAC, TCA.

Enumeration works well enough when we have just three letters, as in CAT. Indeed, enumeration is sometimes the quickest method.

In contrast, the 13 letters in the word UNPREDICTABLY have 6,227,020,800 possible permutations. So enumeration is probably not practical.

To help us count more efficiently, we'll learn about four basic principles of counting:

1. The Addition Principle (AP);

2. The Multiplication Principle (MP);

3. The Inclusion-Exclusion Principle (IEP); and

4. The Complements Principle (CP).

52.1

Example 480. For lunch today, I can either go to the food court or the hawker centre. At

the food court, I have 2 choices: ramen or briyani. At the hawker centre, I have 3 choices:

bak chor mee, nasi lemak, or kway teow.

Altogether then, I have 2 + 3 = 5 choices of what to eat for lunch today.

Here's an informal statement of the AP:[57]

The Addition Principle (AP). I have to choose a destination, out of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from.

The Addition Principle (AP) simply states that I have, in total, $p + q$ different choices.

(Just so you know, the AP is sometimes also called the Second Principle of Counting or the Rule of Sum or the Disjunctive Rule.)

Of course, the AP generalises to cases where there are more than just 2 areas. It may seem a little silly, but just to illustrate, let's use the AP to tackle the CAT problem:

[57] See section 89.1 in the Appendices (optional) for a more precise statement of the AP.

Example 481. Problem: How many permutations are there of the letters in the word CAT?

We can divide the possibilities into three cases:

Case #1. First letter is an A. Then the next two letters are either CT or TC: 2 possibilities.

Case #2. First letter is a C. Then the next two letters are either AT or TA: 2 possibilities.

Case #3. First letter is a T. Then the next two letters are either AC or CA: 2 possibilities.

Altogether then, by the AP, there are 2 + 2 + 2 = 6 possibilities. That is, there are 6 possible

permutations of the letters in CAT. These are illustrated in the tree diagram below.
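The case-by-case count can be mirrored in code. This is a small sketch using Python's standard `itertools.permutations`: it enumerates the arrangements of CAT, groups them by first letter (the three cases above), and then adds up the case sizes as the AP prescribes. The variable names are illustrative only.

```python
from itertools import permutations

word = "CAT"
all_perms = sorted("".join(p) for p in permutations(word))

# Split the possibilities into cases by first letter, as in the example.
cases = {}
for perm in all_perms:
    cases.setdefault(perm[0], []).append(perm)

for first_letter, perms in cases.items():
    print(first_letter, perms, len(perms))

# Addition Principle: the cases do not overlap, so their sizes simply add up.
total = sum(len(perms) for perms in cases.values())
print(total)
```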


The next exercise is very simple and just to illustrate again the AP.

Exercise 200. Without retracing your steps, how many ways are there to get from the

Starting Point to the River (see figure below)? (Answer on p. 1152.)

Exercise 201. How many permutations are there of the letters in the word DEED? Illustrate your answer with a tree diagram similar to that given in the CAT example above.

(Answer on p. 1152.)


52.2

Example 482. For lunch today, I can either have prata or horfun. For dinner tonight, I can have McDonald's, KFC, or Pizza Hut.

Enumeration shows that I have a total of 6 possible choices for my two meals today:

(Prata, McDonald's), (Prata, KFC), (Prata, Pizza Hut),

(Horfun, McDonald's), (Horfun, KFC), (Horfun, Pizza Hut).

Alternatively, we can use the Multiplication Principle (MP). I have 2 choices for lunch and 3 choices for dinner. Hence, for my two meals today, I have in total $2 \times 3 = 6$ possible choices.

Here's an informal statement of the MP:[58]

The Multiplication Principle (MP). I have to choose two destinations, one from each of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from.

The Multiplication Principle (MP) simply states that I have, in total, $p \times q$ different choices.

(The MP is sometimes also called the Fundamental or First Principle of Counting or the Rule of Product or the Sequential Rule.)

Of course, the MP generalises to cases where there are more than just 2 areas. Here's an example where we have to make 3 decisions:

[58] See section 89.1 in the Appendices (optional) for a more precise statement of the MP.

Example 483. For breakfast tomorrow, I can have shark's fin or bird's nest (2 choices). For lunch tomorrow, I can have black pepper crab or curry fishhead (2 choices). For dinner tomorrow, I can have an apple, a banana, or a carrot (3 choices). By the MP, for tomorrow's meals, I have a total of $2 \times 2 \times 3 = 12$ possible choices. We can enumerate these (I'll use abbreviations):

More examples:


Example 484. Problem: How many four-letter words can be formed using the letters in the 26-letter alphabet?

Let's rephrase this problem so that it is clearly in the framework of the MP. We have 4 blank spaces to be filled:

_ _ _ _.

1 2 3 4

These 4 blank spaces correspond to 4 decisions to be made. Decision #1: What letter to put in the first blank space? Decision #2: What letter to put in the second blank space? Decision #3: What letter to put in the third blank space? Decision #4: What letter to put in the fourth blank space?

How many choices have we for each decision?

For Decision #1, we can put A, B, C, ..., or Z. So we have 26 choices for Decision #1.

For Decision #2, we can again put A, B, C, ..., or Z. So we again have 26 choices for Decision #2.

We likewise have 26 choices for Decision #3 and also 26 choices for Decision #4.

Altogether then, by the MP, there are $26 \times 26 \times 26 \times 26 = 26^4 = 456{,}976$ ways to make our four decisions.

Solution: There are $26^4 = 456{,}976$ possible four-letter words that can be formed using the 26-letter alphabet.

Example 485. One 18-sided die has the numbers 1 through 18 printed on each of its sides.

Another six-sided die has the letters A, B, C, D, E, and F printed on each of its sides. We

roll the two dice. How many distinct possible outcomes are there?

Again, let's rephrase this problem in the framework of the MP. Consider 2 blank spaces:

_ _.

1 2

These 2 blank spaces correspond to 2 decisions to be made. Decision #1: What number to

put in the first blank space? Decision #2: What letter to put in the second blank space?

Again we ask: How many choices have we for each decision?

For Decision #1, we can put 1, 2, 3, ..., or 18. So we have 18 choices for Decision #1.

For Decision #2, we can put A, B, C, D, E, or F. So we have 6 choices for Decision #2.

Altogether then, by the MP, there are $18 \times 6 = 108$ ways to make our two decisions. In other words, there are 108 possible outcomes from rolling these two dice.

(If necessary, it is tedious but not difficult to enumerate them: 1A, 1B, 1C, 1D, 1E, 1F,

2A, 2B, ..., 17E, 17F, 18A, 18B, 18C, 18D, 18E, and 18F.)
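The tedious enumeration can be delegated to a computer. The sketch below uses Python's standard `itertools.product` to list every (number, letter) pair for the two dice, and confirms that the count agrees with the MP. The variable names are illustrative only.

```python
from itertools import product

numbers = range(1, 19)  # the 18-sided die: 1 through 18
letters = "ABCDEF"      # the six-sided die

# One outcome per (number, letter) pair: Decision #1 followed by Decision #2.
outcomes = [f"{n}{letter}" for n, letter in product(numbers, letters)]

print(outcomes[:7], "...", outcomes[-3:])
print(len(outcomes), len(numbers) * len(letters))
```

The same idea scales to the four-letter-word example, although there it is quicker to compute `26 ** 4` than to list all 456,976 words.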

Exercise 202. A club has a shortlist of 3 men for president, 5 animals for vice-president,

and 10 women for club mascot. How many possible ways are there to choose the president,

the vice-president, and the mascot? (Answer on p. 1153.)

Exercise 203. (Answer on p. 1153.) The highly-stimulating game of 4D consists of selecting a four-digit number, between 0000 and 9999 (so there are 10,000 possible numbers). Your mother tells you to go to the nearest gambling den (also known as a Singapore Pools outlet) to buy any three numbers, subject to these two conditions:

The four digits in each number are distinct.

Each four-digit number is distinct.

How many possible ways are there to fulfil your mother's request?

52.3

Example 486. For lunch today, I can either go to the food court or the hawker centre. At

the food court, I have 4 choices of cuisine: Chinese, Indian, Malay, and Western. At the

hawker centre, I have 3 choices of cuisine: Chinese, Malay, and Thai.

There are 2 choices of cuisine that are common to both the food court and the hawker

centre (Chinese and Malay).

And so by the Inclusion-Exclusion Principle (IEP), I have in total $4 + 3 - 2 = 5$ choices of cuisine. The Venn diagram below illustrates.

Why do we subtract 2? If we simply added the 4 choices available at the food court to the 3 available at the hawker centre, then we'd double-count the Chinese and Malay cuisines, which are available at both the food court and the hawker centre. And so we must subtract the 2 cuisines that are at both locations.

Example 487. Problem: How many integers between 1 and 20 are divisible by 2 or 5?

There are 10 integers divisible by 2, namely 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20.

There are 4 integers divisible by 5, namely 5, 10, 15, and 20.

There are 2 integers divisible by BOTH 2 and 5, namely 10 and 20.

Hence, by the IEP, there are $10 + 4 - 2 = 12$ integers that are divisible by either 2 or 5.

(These are namely 2, 4, 5, 6, 8, 10, 12, 14, 15, 16, 18, and 20.)
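The same count can be reproduced with sets, where the overlap that would otherwise be double-counted is simply the intersection. This is a minimal sketch; the variable names are illustrative only.

```python
integers = range(1, 21)

by_2 = {n for n in integers if n % 2 == 0}
by_5 = {n for n in integers if n % 5 == 0}

# IEP: add the two counts, then subtract the overlap that was counted twice.
via_iep = len(by_2) + len(by_5) - len(by_2 & by_5)

# Direct count of the union, for comparison.
direct = len(by_2 | by_5)

print(via_iep, direct, sorted(by_2 | by_5))
```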

The Inclusion-Exclusion Principle (IEP). I have to choose a destination, out of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from. Areas #1 and #2 overlap: they have $r$ destinations in common.

The IEP simply states that I have, in total, $p + q - r$ different choices.

Exercise 204. (Answer on p. 1154.) The food court has 4 types of cuisine: Chinese,

Indonesian, Korean, and Western. The hawker centre has 3: Chinese, Malay, and Western.

A restaurant has 3: Chinese, Japanese, or Malay.

In total, how many different types of cuisine are there? Illustrate your answer with a Venn

diagram.

[59] See section 89.1 in the Appendices (optional) for a more precise statement of the IEP.

52.4

Example 488. The food court has 4 types of cuisine: Chinese, Malay, Indian, and Other. I'm at the food court but don't feel like eating Malay or Chinese. So by the Complements Principle (CP), I have $4 - 2 = 2$ possible choices of cuisine (Indian and Other).

Here's an informal statement of the CP:[60]

The Complements Principle (CP). There are $p$ possible destinations. I must choose one. I rule out $q$ of the possible destinations.

The Complements Principle says that I am left with $p - q$ possible choices.

Exercise 205. There are 10 Southeast Asian countries, of which 3 (Brunei, Indonesia, and

the Philippines) are not on the mainland. How many mainland Southeast Asian countries

are there that a European tourist can visit? (Answer on p. 1154.)

[60] See section 89.1 in the Appendices (optional) for a more precise statement of the CP.

53

In this chapter, we'll use the MP to generate several more methods of counting.

But first, some notation you should find familiar from secondary school:

Definition 103. Let $n \in \mathbb{Z}_0^+$. Then n-factorial, denoted $n!$, is defined by $n! = n \times (n-1) \times \dots \times 1$ for $n \geq 1$ and $0! = 1$.

Example 489. $0! = 1$, $1! = 1$, $2! = 2 \times 1 = 2$, $3! = 3 \times 2 \times 1 = 6$, $4! = 4 \times 3 \times 2 \times 1 = 24$, $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$.

Example 490. Problem: How many permutations (or arrangements) are there of the three

letters in the word CAT?

Let's rephrase this problem in the framework of the MP. Consider three blank spaces:

_ _ _.

1 2 3

These 3 blank spaces correspond to 3 decisions to be made. Decision #1: What letter to

put in the first blank space? Decision #2: What letter to put in the second blank space?

Decision #3: What letter to put in the third blank space?

Again we ask: How many choices have we for each decision?

For Decision #1, we can put C, A, or T. So we have 3 choices for Decision #1.

Having already used up a letter in Decision #1, we are left with two letters. So we have 2

choices for Decision #2.

Having already used up a letter in Decision #1 and another in Decision #2, we are left

with just one letter. So we have only 1 choice for Decision #3.

Altogether then, by the MP, there are $3 \times 2 \times 1 = 3! = 6$ possible ways of making our decisions. This is also the number of ways there are to arrange the three letters in the word CAT.

Let's now try the UNPREDICTABLY problem.

Example 491. Problem: How many permutations are there of the 13 letters in the word UNPREDICTABLY?

Again, let's rephrase this problem in the framework of the MP. Consider 13 blank spaces:

_ _ _ _ _ _ _ _ _ _ _ _ _.

1 2 3 4 5 6 7 8 9 10 11 12 13

These 13 blank spaces correspond to 13 decisions to be made. Decision #1: What letter

to put in the first blank space? Decision #2: What letter to put in the second blank space?

... Decision #13: What letter to put in the 13th blank space?

Again we ask: How many choices have we for each decision?

First an important note: In the word UNPREDICTABLY, no letter is repeated. (Indeed,

UNPREDICTABLY is the longest common English word without any repeated letters.)

For Decision #1, we can put U, N, P, R, E, D, I, C, T, A, B, L, or Y. So we have 13 choices

for Decision #1.

For Decision #2, having already used up a letter in Decision #1, we are left with 12 letters.

So we have 12 choices for Decision #2.

For Decision #3, having already used up a letter in Decision #1 and another letter in

Decision #2, we are left with 11 letters. So we have 11 choices for Decision #3.

For Decision #13, having already used up a letter in Decision #1, another in Decision #2,

another in Decision #3, ..., and another in Decision #12, we are left with one letter. So

we have 1 choice for Decision #13.

Altogether then, by the MP, there are $13 \times 12 \times \dots \times 2 \times 1 = 13! = 6{,}227{,}020{,}800$ possible

ways of making our decisions. This is also the number of ways there are to arrange the 13

letters in the word UNPREDICTABLY.

The next fact simply summarises what should already be obvious from the above examples:

Fact 62. There are n! possible permutations of n distinct objects.

Footnote 61. This is informal because, amongst other omissions, we haven't yet given a precise definition of the term permutation.

Consider n empty spaces. We are to fill them with the n distinct objects.

_ _ _ . . . _

1 2 3 . . . n

For space #1, we have $n$ possible choices. For space #2, we have $n - 1$ possible choices (because one object was already placed in space #1). ... And finally for space #n, we have only 1 object left and thus only 1 choice. By the MP then, there are $n \times (n-1) \times \dots \times 1 = n!$ possible ways of filling in these n spaces with the n distinct objects.
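For short words, Fact 62 can be checked by brute force. The sketch below is only an illustration: the helper name `count_permutations` is our own, and it simply lists every arrangement with Python's standard `itertools.permutations` and compares the count against `math.factorial`.

```python
from itertools import permutations
from math import factorial


def count_permutations(objects):
    """Count the arrangements of distinct objects by listing every one of them."""
    return sum(1 for _ in permutations(objects))


# Compare the brute-force count with n! for two words without repeated letters.
for word in ["CAT", "COWDUNG"]:
    print(word, count_permutations(word), factorial(len(word)))
```

Enumeration quickly becomes impractical (13! is already over six billion), which is why the formula is worth having.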

Example 492. The word COWDUNG has seven distinct letters. Hence, there are 7! = 5040

permutations of the letters in the word COWDUNG.


53.1

In the previous section, we saw that there are 3! permutations of the three letters in the

word CAT and 13! permutations of the 13 letters in the word UNPREDICTABLY. We

made an important note: In each of these words, there was no repeated letter.

We now consider permutations of a set where some elements are repeated.

Example 493. How many permutations are there of the three letters in the word SEE?

A naive application of the MP would suggest that the answer is 3! = 6. This is wrong.

Enumeration shows that there are only 3 possible permutations:

EES,

ESE,

SEE.

To see why a naive application of the MP fails, set up the problem in the framework of the

MP. Consider 3 blank spaces:

_ _ _.

1 2 3

These 3 blank spaces correspond to 3 decisions to be made. Decision #1: What letter to

put in the first blank space? Decision #2: What letter to put in the second blank space?

Decision #3: What letter to put in the third blank space?

Again we ask: How many choices have we for each decision?

For Decision #1, we can put E or S. So we have 2 choices for Decision #1.

But now the number of choices available for Decision #2 depends on what we chose for Decision #1! (If we chose E in Decision #1, then we again have 2 choices for Decision #2. But if instead we chose S in Decision #1, then we now have only 1 choice for Decision #2.) This violates the implicit but important assumption in the MP that the number of choices available in one decision is independent of the choice made in the other decision.

Hence, the MP does not directly apply.

The reason SEE has only 3 possible permutations (instead of 3! = 6) is that it contains a

repeated element, namely E. But why would this make any difference?

To understand why, let's rename the second E as E′, so that the word SEE is now transformed into a new word SEE′. From the three letters of this new word, we'd again have 3! = 6 possible permutations:

EE′S, E′ES, ESE′, E′SE, SEE′, SE′E.

Restricting attention to the two letters E and E′, there are 2! = 2 ways to permute these two letters. Hence, any single permutation (in the case where we do not distinguish between the two E's) corresponds to 2 possible permutations (in the case where we do). The figure below illustrates how the 3 permutations of SEE correspond to the 6 permutations of SEE′.

Hence, when we do not distinguish between the two E's, there are only half as many possible permutations.

We next consider permutations of SASS.

Example 494. How many permutations are there of the four letters in the word SASS?

The answer is 4!/3! = 4. Let's see why.

If we distinguish between the three S's, perhaps by calling them S, S′, and S″, then we'd have 4! = 24 possible permutations of SAS′S″. Each single permutation of SASS corresponds to 3! = 6 of these. For example, the permutation SASS corresponds to SAS′S″, SAS″S′, S′ASS″, S′AS″S, S″ASS′, and S″AS′S. So distinguishing between the three S's increases by 6-fold the number of possible permutations. Working backwards, the word SASS thus has one-sixth as many permutations as SAS′S″. That is, SASS has 4!/3! = 4 possible permutations.

The figure below illustrates how the 4 possible permutations of SASS correspond to the 24 permutations of SAS′S″.

Example 495. How many permutations are there of the four letters in the word DEED?

Answer: $\dfrac{4!}{2!\,2!}$.

In the numerator, the 4! corresponds to the total of 4 letters. In the denominator, the first 2! corresponds to the 2 D's and the second 2! corresponds to the 2 E's. Where do these numbers come from?

Let x be the number of permutations of DEED (i.e. x is our desired answer).

If we distinguish between the two D's, then we'd increase by 2!-fold the number of possible permutations, to $x \times 2!$. If, in addition, we distinguish between the 2 E's, then we'd increase again by 2!-fold the number of possible permutations, to $x \times 2! \times 2!$. But we know that if all 4 letters are distinct, then there are 4! possible permutations. Therefore,

$$x \times 2! \times 2! = 4!$$

Rearrangement yields the answer:

$$x = \frac{4!}{2!\,2!} = 6.$$

You can go back and check that this answer is consistent with our answer for Exercise 201 (above).

We next consider permutations of ASSESSES.

Example 496. Problem: How many permutations are there of the eight letters in the word

ASSESSES?

Answer: $\dfrac{8!}{2!\,5!}$.

In the numerator, the 8! corresponds to the total of 8 letters. In the denominator, the 2! corresponds to the 2 E's and the 5! corresponds to the 5 S's. Where do these come from?

Let y be the number of permutations of ASSESSES (i.e. y is our desired answer).

If we distinguish between the two E's, then we'd increase by 2!-fold the number of possible permutations, to $y \times 2!$. If, in addition, we distinguish between the 5 S's, then we'd increase again by 5!-fold the number of possible permutations, to $y \times 2! \times 5!$. But we know that if all 8 letters are distinct, then there are 8! possible permutations. Therefore,

$$y \times 2! \times 5! = 8!$$

Rearrangement yields the answer:

$$y = \frac{8!}{2!\,5!}.$$

In general,

Fact 63. Consider n objects, only k of which are distinct. Let $r_1, r_2, \dots, r_k$ be the numbers of times the 1st, 2nd, ..., and kth distinct objects appear. (So $r_1 + r_2 + \dots + r_k = n$.) Then the number of possible ways to permute these n objects is

$$\frac{n!}{r_1!\, r_2! \cdots r_k!}.$$
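Fact 63 can be checked against direct enumeration for the words met so far. The following is a sketch under our own naming (`count_arrangements` and `count_by_enumeration` are illustrative helpers, not notation from this textbook): the first applies the formula using letter counts, the second lists every ordering and discards duplicates.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_arrangements(word):
    """Apply n! / (r1! r2! ... rk!) using the letter counts of `word`."""
    total = factorial(len(word))
    for repeats in Counter(word).values():
        total //= factorial(repeats)  # divide out the orderings of each repeated letter
    return total


def count_by_enumeration(word):
    """Brute force: list all orderings and keep only the distinct ones (short words only)."""
    return len(set(permutations(word)))


for word in ["SEE", "SASS", "DEED", "ASSESSES"]:
    print(word, count_arrangements(word), count_by_enumeration(word))
```

The two columns printed for each word should agree; the enumeration is only feasible because these words are short.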

More examples:

Example 497. How many permutations are there of the six letters in the word BANANA?

We have three distinct letters B, A, and N. The letter B appears 1 time. The letter A

appears 3 times. The letter N appears 2 times. Hence, by the above Fact, the number of

possible permutations of these 6 letters is

$$\frac{6!}{1!\,3!\,2!} = 60.$$

Of course, 1! is simply equal to 1. So for the denominator, we shall usually not bother to write out any 1!. So we will normally instead write that the number of permutations of BANANA is:

$$\frac{6!}{3!\,2!} = 60.$$

Example 498. How many permutations are there of the 11 letters in the word MISSISSIPPI?

We have four distinct letters M, I, S, and P. The letter M appears 1 time. The letter I

appears 4 times. The letter S appears 4 times. The letter P appears 2 times. Hence, by

the above Fact, the number of possible permutations of these 11 letters is

$$\frac{11!}{4!\,4!\,2!} = 34{,}650.$$

Exercise 207. There are 3 identical white tiles and 4 identical black tiles. How many ways

are there of arranging these 7 tiles in a row? (Answer on p. 1155.)


53.2

Circular Permutations

Informal Definition. Two circular permutations are equivalent if one can be transformed

into another by means of a rotation.

Example 499. There are 3! = 6 (linear) permutations of CAT. That is, there are 3! = 6

possible ways to fill them into these 3 linearly-arranged spaces:

___

1 2 3

In contrast, there are only 2! = 2 circular permutations of CAT. That is, there are only

2! = 2 possible ways to fill them into these 3 circularly-arranged spaces:


The three seemingly-different arrangements above are considered to be the same circular

permutation. This is because any arrangement is simply a rotation of another. Take the

left red arrangement, rotate it clockwise by one-third of a circle to get the middle green

arrangement. Repeat the rotation to get the right blue arrangement.

The second and only other circular arrangement of CAT is shown below. Again, these

three seemingly-different arrangements are considered to be the same circular permutation.

This is because any arrangement is simply a rotation of another. Take the left black arrangement, rotate it clockwise by one-third of a circle to get the middle pink arrangement.

Repeat the rotation to get the right orange arrangement.

Note importantly, that the arrangement (or three arrangements) below cannot be rotated

to get the arrangement (or three arrangements) above. Hence, the arrangement below is

indeed distinct from the arrangement above.

It turns out that in general, if we have n distinct objects, there are $(n-1)!$ ways to arrange them in a circle. So here there are only $(3-1)! = 2! = 2$ ways to arrange CAT in a circle.

In general:

Fact 64. n distinct objects have $(n-1)!$ circular permutations.

Proof. Given n distinct objects, any 1 circular permutation can be rotated n times to obtain

n distinct (linear) permutations. Hence, there are n times as many (linear) permutations

as there are circular permutations.

But we already know that there are n! (linear) permutations of n distinct objects. Hence,

there are $n!/n = (n-1)!$ circular permutations of n distinct objects.
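The counting argument in the proof can be mirrored in code. In this sketch (the function names are our own, chosen for illustration), every linear permutation is reduced to one representative of its rotation class, here the lexicographically smallest rotation, and the distinct representatives are counted.

```python
from itertools import permutations
from math import factorial


def canonical_rotation(arrangement):
    """Pick one representative among all rotations: the lexicographically smallest."""
    n = len(arrangement)
    return min(arrangement[i:] + arrangement[:i] for i in range(n))


def count_circular_permutations(objects):
    """Count arrangements of distinct objects in a circle, treating rotations as equal."""
    return len({canonical_rotation(p) for p in permutations(objects)})


for word in ["CAT", "ABCDE"]:
    n = len(word)
    print(word, count_circular_permutations(word), factorial(n - 1))
```

This only works as stated for distinct objects; with repeated objects a rotation class can contain fewer than n linear permutations, which is why that case is harder.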

Exercise 208. How many ways are there to seat 10 people in a circle? (Answer on p.

1155.)

Note that if there are repeated objects, then the problem is considerably more difficult. See

Section 89.2 in the Appendices for a brief discussion.


53.3

Partial Permutations

Example 500. Using the 26-letter alphabet, how many 3-letter words can we form that

have no repeated letters? This, of course, is simply the problem of filling in these 3 empty

spaces using 26 distinct elements. For space #1, we have 26 possible choices. For space

#2, we have 25. And for space #3, we have 24.

___

1 2 3

By the MP then, the number of ways to fill the three spaces is $26 \times 25 \times 24$. This is also the

number of three-letter words with no repeated letters.

Problems like the above example crop up often enough to motivate a new piece of notation:

Definition 104. Let n, k be positive integers with $n \ge k$. Then $P(n, k)$, read aloud as n permute k, is defined by

$$P(n, k) = \frac{n!}{(n-k)!}.$$

$P(n, k)$ answers the following question: Given n distinct objects and k spaces (where $k \le n$), how many ways are there to fill the k spaces?

Just so you know, $P(n, k)$ is also variously denoted $nPk$, $P^n_k$, ${}^nP_k$, etc., but we'll stick solely with $P(n, k)$ in this textbook.

Example 500 (continued from above). The number of 3-letter words without repeated letters is simply $P(26, 3) = 26!/23! = 26 \times 25 \times 24$.

Example 501. Problem: Using the 22-letter Phoenician alphabet, how many 4-letter words

can we form that have no repeated letters?

This, of course, is simply the problem of filling in these 4 empty spaces using 22 distinct

elements. So the answer is $P(22, 4) = 22!/18! = 22 \times 21 \times 20 \times 19$ words.
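As a quick check of Definition 104, the sketch below computes P(n, k) from the factorial formula and compares it with the products in the two examples. The helper name `partial_permutations` is our own; `math.perm` is the standard-library equivalent (available from Python 3.8).

```python
from itertools import permutations
from math import factorial, perm


def partial_permutations(n, k):
    """P(n, k) = n! / (n - k)!: ways to fill k ordered spaces from n distinct objects."""
    return factorial(n) // factorial(n - k)


print(partial_permutations(26, 3), 26 * 25 * 24)
print(partial_permutations(22, 4), 22 * 21 * 20 * 19)

# Brute-force check on a small case: fill 3 spaces using 5 distinct letters.
print(sum(1 for _ in permutations("ABCDE", 3)), partial_permutations(5, 3), perm(5, 3))
```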

Exercise 209. Out of a committee of 11 members, how many ways are there to choose a

president and a vice-president? (Answer on p. 1155.)


53.4

Example 502. At a dance party, there are 7 heterosexual married couples (and thus 14

people in total). Problem #1. How many ways are there of arranging them in a line, with

the restriction that every person is next to his or her partner?

Think of there as being 7 units (each unit being a couple). There are 7! ways to arrange

these 7 units in a line. Within each unit, there are 2 possible arrangements. Hence, in

total, there are $7! \times 2^7$ possible arrangements.

Problem #2. Repeat the above problem, but now for a circle, rather than a line.

There are 6! ways to arrange the 7 units in a circle. Within each unit, there are 2 possible

arrangements. Hence, in total, there are $6! \times 2^7$ possible arrangements.

Problem #3. How many ways are there of arranging them in a circle, with the restriction

that every man is to the right of his wife?

There are 6! ways to arrange the 7 units in a circle. Within each unit, there is only 1

possible arrangement. Hence, in total, there are 6! possible arrangements.

Example 503. (I assume you're familiar with the standard 52-card deck.)

Problem #1. Using a standard 52-card deck, how many ways are there of arranging any

3 cards in a line, with the restriction that no two cards of the same suit are next to each

other?

This is the problem of filling in 3 spaces with 52 distinct objects. For space #1, we have

52 possible choices.

_ _ _.

1 2 3

For space #2, having picked a card of suit X for space #1, we must pick a card from some

other suit Y. And so there are only 39 possible choices (we have three suits available, that's $3 \times 13 = 39$).

For space #3, having picked a card of suit Y for space #2, we must pick a card from some other suit Z. Note that suit Z can be the same as suit X. And so there are 38 possible choices (we have three suits available, less the card used for space #1, that's $3 \times 13 - 1 = 38$).

Altogether then, there are $52 \times 39 \times 38$ possible arrangements.

Problem #2. Repeat the above problem, but now for a circle, rather than a line.

One subtle thing is that, in addition to space #1 being of a different suit from space #2

and space #2 being of a different suit from space #3, we must also have that space #3 is

of a different suit from space #1. Thus, there are $52 \times 39 \times 26$ possible ways to fill in these three spaces, if they were in a line.

Since they are instead in a circle, there are $52 \times 39 \times 26 \div 3$ possible ways to arrange three cards in a circle, with the condition that no two cards of the same suit are next to each other.
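Both card counts are small enough to confirm by brute force. The sketch below assumes a deck modelled as (suit, rank) pairs; the names `DECK`, `suits_differ`, `line_count` and `circle_arrangements` are our own. For the circle, each arrangement is represented by the smallest of its three rotations so that rotations are counted once.

```python
from itertools import permutations, product

SUITS = "SHDC"
RANKS = range(1, 14)
DECK = list(product(SUITS, RANKS))  # 52 (suit, rank) pairs


def suits_differ(a, b):
    """True if the two cards belong to different suits."""
    return a[0] != b[0]


# Line: only the neighbouring pairs (1, 2) and (2, 3) must differ in suit.
line_count = sum(
    1
    for c1, c2, c3 in permutations(DECK, 3)
    if suits_differ(c1, c2) and suits_differ(c2, c3)
)

# Circle: space 3 is also next to space 1, and rotations count as one arrangement.
circle_arrangements = {
    min((c1, c2, c3), (c2, c3, c1), (c3, c1, c2))
    for c1, c2, c3 in permutations(DECK, 3)
    if suits_differ(c1, c2) and suits_differ(c2, c3) and suits_differ(c3, c1)
}

print(line_count, 52 * 39 * 38)
print(len(circle_arrangements), 52 * 39 * 26 // 3)
```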

Exercise 210. (Answer on p. 1155.) There are 4 brothers and 3 sisters. In how many

ways can they be arranged ...

(a) in a line, without any 2 brothers being next to each other?

(b) in a line, without any 2 sisters being next to each other?

(c) in a circle, without any 2 brothers being next to each other?

(d) in a circle, without any 2 sisters being next to each other?


54

P (n, k) is the number of ways we can fill k (ordered) spaces using n distinct objects.

In contrast, C(n, k) is the number of ways of choosing k out of n distinct objects.

Equivalently, it is the same problem of filling k spaces using n distinct objects, except

that now order does not matter.

Example 504. Suppose we have a committee of 13 members and wish to select a president

and a vice-president. This is equivalent to the problem of filling in 2 spaces, given 13

distinct objects.

__

1 2

The answer is thus simply $P(13, 2) = 13 \times 12$.

Suppose instead that we want to choose two co-presidents. How many ways are there of

doing so?

This is simply the same problem as before: again we want to fill in 2 spaces, given 13

distinct objects. The only difference now is that the order of the 2 chosen objects

does not matter. So the answer must be that there are P (13, 2)/2! ways of choosing the

two co-presidents.

Example 505. How many ways are there of choosing 5 cards out of a standard 52-card

deck?

_____

1 2 3 4 5

First, how many ways are there to fill 5 spaces using 52 distinct objects (where order

matters)? Answer: $P(52, 5) = 52 \times 51 \times 50 \times 49 \times 48 = 311{,}875{,}200$.

And so if we don't care about order, we must adjust this number by dividing by 5! to get $P(52, 5)/5! = 2{,}598{,}960$. So the answer is that to choose 5 cards out of a 52-card deck, there are 2,598,960 ways.

The above examples suggest that, in general, to choose k out of n given distinct objects,

there are P (n, k)/k! possible ways. This motivates the following definition:


Definition 105. Let n, k be positive integers with $n \ge k$. Then $C(n, k)$, read aloud as n choose k, is defined by

$$C(n, k) = \frac{P(n, k)}{k!} = \frac{n!}{(n-k)!\,k!}.$$
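Definition 105 translates directly into code. In this sketch the helpers `permute` and `choose` are our own names for P(n, k) and C(n, k); the standard library's `math.comb` computes the same quantity, and `itertools.combinations` lists the selections themselves.

```python
from itertools import combinations
from math import comb, factorial


def permute(n, k):
    """P(n, k) = n! / (n - k)!."""
    return factorial(n) // factorial(n - k)


def choose(n, k):
    """C(n, k) = P(n, k) / k!: ordered fillings divided by the k! orderings of each selection."""
    return permute(n, k) // factorial(k)


print(choose(13, 2), comb(13, 2))   # two co-presidents out of 13 members
print(choose(52, 5), comb(52, 5))   # 5-card hands from a 52-card deck
print(sum(1 for _ in combinations(range(13), 2)))  # the same count, by listing every pair
```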

It turns out that $C(n, k)$ appears so often in maths that it has many alternative notations; one of the most common is $\binom{n}{k}$.

n choose k also has several names, such as the combination, the combinatorial number, and even the binomial coefficient. Shortly, we'll see why the name binomial coefficient makes sense.

Exercise 211 gives an alternate expression for C(n, k) which you'll often find very useful.

Exercise 211. Show that $C(n, k) = \dfrac{n(n-1)(n-2)\cdots(n-k+1)}{k!}$. (Answer on p. 1157.)

Exercise 212. Compute C(4, 2), C(6, 4), and C(7, 3). (Answer on p. 1157.)

Exercise 213. We wish to form a basketball team, consisting of 1 centre, 2 forwards, and

2 guards. We have available 3 centres, 7 forwards, and 5 guards. How many ways are there

of forming a team? (Answer on p. 1157.)

Ways to choose k out of n distinct objects = Ways to choose $n - k$ out of n distinct objects.

Intuitively, this property is true because choosing k out of n objects is the same as choosing which $n - k$ out of n objects to ignore. Let's jot down this symmetry property as a formal fact:

Fact 65. (Symmetry.) $C(n, k) = C(n, n-k)$.
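The symmetry property can also be checked algebraically. The short derivation below is a sketch that uses only the factorial formula from Definition 105, taking $1 \le k \le n-1$ so that both $k$ and $n-k$ are positive integers:

$$C(n, n-k) = \frac{n!}{\bigl(n-(n-k)\bigr)!\,(n-k)!} = \frac{n!}{k!\,(n-k)!} = C(n, k).$$

The two denominators contain the same pair of factorials, just written in the opposite order.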


Example 506. We have a group of 100 men. 70 are needed for a task. The number of

ways to choose these 70 men is:

$$C(100, 70) = \frac{100!}{30!\,70!}.$$

This is the same as the number of ways to choose the 30 men that will not be used for the task:

$$C(100, 30) = \frac{100!}{70!\,30!}.$$

54.1

Pascal's Triangle

Pascal's Triangle consists of a triangle of numbers. If we adopt the convention that the topmost row is row 0 and the leftmost term of each row is the 0th term, then the nth row, kth term is the number C(n, k):

Row 0: 1

Row 1: 1 1

Row 2: 1 2 1

Row 3: 1 3 3 1

Row 4: 1 4 6 4 1

Row 5: 1 5 10 10 5 1

Row 6: 1 6 15 20 15 6 1

Row 7: 1 7 21 35 35 21 7 1

It turns out that beautifully enough, each term is equal to the sum of the two terms above

it. The next exercise asks you to verify several instances of this:

Exercise 214. Verify the following: (a) C(1, 0) + C(1, 1) = C(2, 1); (b) C(4, 2) + C(4, 3) =

C(5, 3); (c) C(17, 2) + C(17, 3) = C(18, 3). (Answer on p. 1157.)

Proof. We show that $C(n+1, k) = C(n, k) + C(n, k-1)$. By definition, $C(n+1, k)$ is the number of ways of choosing k out of $n + 1$ distinct objects.

Suppose we do not choose the last object, i.e. the $(n+1)$th object. Then we have to choose our k objects out of the first n objects. There are $C(n, k)$ ways of doing so.

Suppose we do choose the last object. Then we have to choose another $k - 1$ objects, out of the first n objects. There are $C(n, k-1)$ ways of doing so.

Altogether then, by the Addition Principle, there are $C(n, k) + C(n, k-1)$ ways of choosing k out of $n + 1$ distinct objects.
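The rule that each term is the sum of the two terms above it is enough to generate the whole triangle. The sketch below (with `pascal_rows` as our own illustrative name) builds the rows using nothing but that addition rule; comparing its output with `math.comb` is one way to see the rule and the formula for C(n, k) agree.

```python
from math import comb


def pascal_rows(last_row):
    """Build rows 0..last_row, each inner entry being the sum of the two entries above it."""
    rows = [[1]]
    for _ in range(last_row):
        prev = rows[-1]
        inner = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + inner + [1])
    return rows


for n, row in enumerate(pascal_rows(7)):
    print(n, row)
```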


54.2

- Henri Poincaré, p. 34 in Science and Method.

Poincaré's quote is especially true in combinatorics. In this section, we'll learn why $C(n, k)$ can be called the combination and also the binomial coefficient.

Verify for yourself that the following equations are true:

$(1 + x)^0 = 1$,

$(1 + x)^1 = 1 + x$,

$(1 + x)^2 = 1 + 2x + x^2$,

$(1 + x)^3 = 1 + 3x + 3x^2 + x^3$,

$(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$,

$(1 + x)^5 = 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5$,

$(1 + x)^6 = 1 + 6x + 15x^2 + 20x^3 + 15x^4 + 6x^5 + x^6$,

$(1 + x)^7 = 1 + 7x + 21x^2 + 35x^3 + 35x^4 + 21x^5 + 7x^6 + x^7$.

Each of the expressions on the RHS is called a binomial series. Each can also be called the binomial expansion of $(1 + x)^n$.

Notice anything interesting? No? Try this exercise:

Exercise 215. Compute $\binom{7}{0}, \binom{7}{1}, \binom{7}{2}, \binom{7}{3}, \binom{7}{4}, \binom{7}{5}, \binom{7}{6}, \binom{7}{7}$. Compare these to the coefficients of the binomial expansion of $(1 + x)^7$. What do you notice? (Answer on p. 1158.)

It turns out that somewhat surprisingly, the coefficients of the binomial expansions of $(1 + x)^n$ are simply $\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n}$. As an additional exercise, you should verify for yourself that this is also true for n = 0 through n = 6.

There are several ways to explain why the combinatorial numbers also happen to be the

binomial coefficients. Here we'll give only the combinatorial explanation:

$$(1 + x)^2 = (1 + x)(1 + x) = 1 \cdot 1 + 1 \cdot x + x \cdot 1 + x \cdot x.$$

Consider the 4 terms on the right.

For $1 \cdot 1$, we chose 1 from the first $(1 + x)$ and 1 from the second $(1 + x)$. In the product, there is $C(2, 0) = 1$ way to choose 0 of the x's.

For $1 \cdot x$, we chose 1 from the first $(1 + x)$ and x from the second $(1 + x)$. For $x \cdot 1$, we chose x from the first $(1 + x)$ and 1 from the second $(1 + x)$. In the product, there are $C(2, 1) = 2$ ways to choose 1 of the x's.

For $x \cdot x$, we chose x from the first $(1 + x)$ and x from the second $(1 + x)$. In the product, there is $C(2, 2) = 1$ way to choose 2 of the x's.

Altogether then, the coefficient on $x^0$ is $C(2, 0)$ (choose 0 of the x's), that on $x^1$ is $C(2, 1)$ (choose 1 of the x's), and that on $x^2$ is $C(2, 2)$ (choose 2 of the x's). That is:

$$(1 + x)^2 = \binom{2}{0}x^0 + \binom{2}{1}x^1 + \binom{2}{2}x^2 = 1 + 2x + x^2.$$

Exercise 216. (Answer on p. 1158.) Mimicking what was just done above, explain why

$$(1 + x)^3 = \binom{3}{0}x^0 + \binom{3}{1}x^1 + \binom{3}{2}x^2 + \binom{3}{3}x^3.$$

Fact 67. Let $n \in \mathbb{Z}^+$. Then

$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n-i} y^i = \binom{n}{0}x^n y^0 + \binom{n}{1}x^{n-1}y^1 + \binom{n}{2}x^{n-2}y^2 + \dots + \binom{n}{n}x^0 y^n.$$
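One way to see Fact 67 at work is to multiply the factors out mechanically and compare the resulting coefficients with the combinatorial numbers. The sketch below is an illustration only: it represents a polynomial as a list of coefficients (index = power of x), and `multiply` and `expand_binomial` are our own helper names.

```python
from math import comb


def multiply(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def expand_binomial(a, b, n):
    """Coefficients of (a + b*x)^n, found by multiplying out the n factors one at a time."""
    coeffs = [1]
    for _ in range(n):
        coeffs = multiply(coeffs, [a, b])
    return coeffs


print(expand_binomial(1, 1, 7))          # coefficients of (1 + x)^7
print([comb(7, i) for i in range(8)])    # C(7, 0), ..., C(7, 7)
```

For general a and b, the coefficient of $x^i$ should match $\binom{n}{i} a^{n-i} b^i$, as the Fact predicts with $x = a$ and $y = bx$.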


54.3

By plugging x = 1, y = 1 into the last fact, we see that $(1 + 1)^n = 2^n$ is the sum of the terms in the nth row of Pascal's triangle:

Fact 68. Let $n \in \mathbb{Z}^+$. Then

$$2^n = \sum_{i=0}^{n} \binom{n}{i} = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n}.$$

There's a nice combinatorial interpretation of the above fact (Poincaré's quote at work again).

Consider the set S = {A, B}. S has $2^2 = 4$ subsets: ∅ = {}, {A}, {B}, and S = {A, B}.

Now consider the set T = {A, B, C}. T has $2^3 = 8$ subsets: ∅ = {}, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, and T = {A, B, C}.

In general, if a set has n elements, how many subsets does it have? We can couch this in the framework of the Multiplication Principle: this is really a sequence of n decisions of whether or not to include each element in the subset. There are 2 choices for each decision. Thus, there are $2^n$ choices altogether. In other words, using a set of n elements, we can form $2^n$ subsets.

But of course, this must in turn be equal to the sum of the following:

C (n, 0) ways to form subsets with 0 elements;

C (n, 1) ways to form subsets with 1 element;

C (n, 2) ways to form subsets with 2 elements;

...

C (n, n) ways to form subsets with n elements.

Thus,

$$2^n = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n}.$$
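The subset-counting argument can be reproduced for a small set. This sketch (with `subsets_by_size` as our own illustrative name) groups all subsets of {A, B, C} by size, so that the group sizes can be compared with C(3, 0), ..., C(3, 3) and their total with $2^3$.

```python
from itertools import combinations
from math import comb


def subsets_by_size(elements):
    """Group all subsets of `elements` by how many elements they contain."""
    elements = list(elements)
    return {k: list(combinations(elements, k)) for k in range(len(elements) + 1)}


groups = subsets_by_size("ABC")
for size, subsets in groups.items():
    print(size, len(subsets), comb(3, size), subsets)

print(sum(len(subsets) for subsets in groups.values()), 2 ** 3)
```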

Exercise 217. Compute $\binom{7}{0} + \binom{7}{1} + \binom{7}{2} + \dots + \binom{7}{7}$. (Answer on p. 1158.)

Exercise 218. Using what you've learnt, write down $(3 + x)^4$. (Answer on p. 1159.)

Exercise 219. (Answer on p. 1159.) (a) The Tan family has 4 sons and the Wong family

has 3 daughters. Using the sons and daughters from these two families, how many ways

are there of forming 2 heterosexual couples?

(b) The Lee family has 6 sons and the Ho family has 9 daughters. Using the sons and

daughters from these two families, how many ways are there of forming 5 heterosexual

couples?


55

Probability: Introduction

55.1

Mathematical Modelling

- G.E.P. Box, p. 202 in Robustness in Statistics.

Whenever we use maths in a real-world scenario, we have some mathematical model in

mind. Here's a very simple example just to illustrate:

Example 507. We want to know how much material to purchase, in order to build a fence

around a field. We might go through these steps:

1. Formulate a mathematical model: Our field is the shape of a rectangle, with length

100 m and breadth 50 m.

2. Analyse: The rectangle has perimeter 100 + 50 + 100 + 50 = 300 m.

3. Apply the results of our analysis: We need to buy enough material to build a

300-metre long fence.

The figure below depicts how mathematical modelling works.

1. Formulate a mathematical model.

That is, describe the real-world scenario in mathematical language and concepts.

This first step is arguably the most important. It is often subjective: not everyone will

agree that your mathematical model is the most appropriate for the scenario at hand.

To use the above example, the field may not be a perfect rectangle, so some may object

to your description of the field as a rectangle. Nonetheless, you may decide that all things

considered, the rectangle is a good mathematical model.

2. Analyse.

This involves using maths and the rules of logic. (A-level maths exams tend to be mostly

concerned with this second step.)

In the above example, this second step simply involved computing the perimeter of the

rectangle: $100 + 50 + 100 + 50 = 300$ m. Of course, for the A-levels, you can expect the

analysis to be more challenging than this.

Note that this second step, in contrast to the first, is supposed to be completely watertight,

non-subjective, and with no room for disagreement. After all, hardly anyone reasonable

could disagree that a perfect rectangle with length 100 m and breadth 50 m has perimeter

300 m.

3. Apply your results.

Now apply the results of your analysis to the real-world scenario.

In the above example, pretend you're a mathematical consultant hired by the fence-builder. Then your final report might simply say, "We recommend the purchase of 300 m worth of fence material."

This third and last step is, like the first, subjective and open to debate. It involves your

interpretation of what the results of your analysis mean (in the real world) and your recommendation of what actions to take.

For example, you find that the fence will have perimeter 300 m and thus recommend that

300 m of fence material be purchased. However, someone else, looking at the same result,

might point out that the corners of the fence require additional or special material; she

might thus make a slightly different recommendation.

We've secretly always been using mathematical modelling; we just haven't always been terribly explicit about it. The foregoing discussion was placed here because, with probability and statistical models, we want to be especially clear that we are doing mathematical modelling.

55.2

Real-world scenarios often involve chance. We can model such scenarios mathematically. For this purpose, we'll use a mathematical object named the experiment, typically denoted E (see footnote 62).

An experiment E = (S, Σ, P) is an ordered triple (see footnote 63) composed of three objects, called the sample space S, the event space Σ (upper-case sigma), and the probability function P, where

• The sample space S is simply the set of possible outcomes.

• An event is simply any set of possible outcomes. In turn, the event space Σ is simply the set of all events.

• The probability function P simply assigns to each event some probability between 0 and 1. This probability is interpreted as the likelihood of that particular event occurring.

Examples:

Footnote 62. An experiment is often instead called a probability triple or probability space or (probability) measure space.

Footnote 63. Previously, in the only ordered triples we encountered, the three terms were always simply real numbers. Here however, the first two terms are sets and the third is a function. Nonetheless, this is all the same an ordered triple, albeit a more complicated one.

Example 508. We model a coin-flip with the experiment E = (S, Σ, P). What are the sample space S, the event space Σ, and the probability function P?

1. S = {H, T }.

The sample space is simply the set of possible outcomes.

The choice of the sample space belongs to Step #1 (Formulate a mathematical model) in

the process of mathematical modelling. It is subjective and open to disagreement.

For example, John (another scientist) might argue that the coin sometimes lands exactly

on its edge. This is exceedingly unlikely but nonetheless possible: one empirical estimate

is that the US 5-cent coin has probability 1 in 6000 of landing on its edge when flipped

(source). So John might denote this third possible outcome X and his sample space would

instead be S = {H, T, X}.

2. Event space Σ = {∅, {H}, {T}, {H, T}}.

An event is simply any subset of S. In other words, an event is simply some set of possible outcomes. So here, {H} is an event. So too is {T}. But there are also two other events, namely ∅ = {} (this is the event that never occurs) and S = {H, T} (this is the event that always occurs).

The event space Σ is simply the set of events. In other words, the event space is the set of all subsets of S.*

As we saw in Section 54.3, given any finite set S, there are $2^{|S|}$ possible subsets of S. In general, given a finite sample space S, the corresponding event space Σ always simply contains $2^{|S|}$ events. And so here, since there are 2 possible outcomes, there are, altogether, $2^2 = 4$ possible events.

If the real-world outcome of the coin flip is Heads, then our interpretation (in terms of our

model) is that the events {H} and {H, T } occur. If the real-world outcome of the coin

flip is Tails, then our interpretation (in terms of our model) is that the events {T } and

{H, T } occur.

The event ∅ never occurs, whatever the real-world outcome is. And the event S = {H, T} always occurs, whatever the real-world outcome is.

*Provided S is finite. If S is infinite, then this sentence must be modified slightly, but this is well beyond the scope of the A-levels.

The mathematical modeller is free to select the sample space S she deems most appropriate.

However, once she has selected the sample space S, the event space Σ is automatically determined by the rules of maths. There is no room for interpretation. Hence, the selection of the event space belongs to Step #2 (Analysis) in the process of mathematical modelling.

So likewise, John, who chooses S = {H, T, X} as his sample space, has no freedom to choose his event space Σ. It is automatically Σ = {∅, {H}, {T}, {X}, {H, T}, {H, X}, {T, X}, S} (consists of 8 elements).

3. Probability function P : Σ → ℝ.

The probability function simply assigns to each event a number (between 0 and 1) called

a probability. So here, if heads and tails are equally likely (or the coin is unbiased or

fair), then it makes sense to assign

P(∅) = 0, P({H}) = P({T}) = 0.5, P(S) = 1.

The mathematical modeller has no freedom over the domain Σ and codomain ℝ of the

probability function. However, she does have freedom to choose the mapping rule she

deems most appropriate. Hence, the act of choosing the mapping rule belongs to Step #1

(Formulation) in the process of mathematical modelling.

So here, if told that heads and tails are equally likely (or that the coin is unbiased or

fair), the mathematical modeller would naturally choose to assign probability 0.5 to each

of the events {H} and {T }.

John, who chooses S = {H, T, X} as his sample space, might instead assign probability

1/6000 to the event {X} and probability 5999/12000 to each of the events {H} and {T }.

It is correct and proper to write P ({H}) = P ({T }) = 0.5. It is incorrect and improper to

write P (H) = P (T ) = 0.5. This is because the function P is of events (sets of outcomes)

and NOT of outcomes themselves.

Nonetheless, we will often allow ourselves to be sloppy and write the incorrect and improper P (H) = P (T ) = 0.5. This is because the notation P ({H}) = P ({T }) = 0.5 can get

rather messy. But you should always remember, even as you write P (H) = P (T ) = 0.5,

that this is technically incorrect.


where

1. S = {1, 2, 3, 4, 5, 6}.

2. Event space:

$$\Sigma = \{\emptyset, \{1\}, \{2\}, \dots, \{6\}, \{1, 2\}, \{1, 3\}, \dots, \{5, 6\}, \{1, 2, 3\}, \{1, 2, 4\}, \dots, \{4, 5, 6\}, \dots, S\}$$

There are 6 possible outcomes and thus $2^6 = 64$ possible events. The event space, given above, is simply the set of all possible events.

If the real-world outcome of the die roll is 3, then the interpretation (in terms of our

model) is that the following 32 events occur: {3}, {1, 3}, {2, 3}, . . . , {1, 2, 3}, {1, 3, 4}, . . . ,

S = {1, 2, 3, 4, 5, 6}. (These are simply the events that contain the outcome 3.)

Similarly, if the real-world outcome of the die roll is 5, then the interpretation is that 32

events occur. You should be able to list all 32 of these events on your own.

3. Probability function $P : \Sigma \to \mathbb{R}$.

If the die is unbiased or fair, then it makes sense to assign

$$P(\{1\}) = P(\{2\}) = P(\{3\}) = P(\{4\}) = P(\{5\}) = P(\{6\}) = \frac{1}{6}.$$

What about the other 58 events? It makes sense to assign, for example, $P(\{1, 3, 5, 6\}) = \frac{4}{6}$.

In general, the mapping rule of the probability function can be fully specified as: For any event $A \in \Sigma$,

$$P(A) = \frac{|A|}{|S|} = \frac{|A|}{6}.$$

In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 6.
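The die-roll model can be written out concretely. The sketch below is an illustration only (the names `prob`, `all_events`, and `events_with_3` are hypothetical); it uses exact fractions to implement the mapping rule $P(A) = |A|/6$ and counts how many events occur when the outcome is 3.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))

def prob(event):
    """Mapping rule for a fair die: P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))

# Every subset of S is an event.
all_events = [
    frozenset(subset)
    for size in range(len(S) + 1)
    for subset in combinations(sorted(S), size)
]

# The events that occur when the die shows 3 are those containing 3.
events_with_3 = [event for event in all_events if 3 in event]

print(len(all_events), len(events_with_3))  # 64 32
print(prob({1, 3, 5, 6}))                   # 2/3
```

Note that `Fraction` reduces 4/6 to its lowest terms, 2/3.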


Definition 106. An experiment is an ordered triple $(S, \Sigma, P)$, where

• S, the sample space, is simply any set (interpreted as the set of possible outcomes in a real-world scenario involving chance).

• $\Sigma$, the event space, is the set of possible events.

• P, the probability function, has domain $\Sigma$, codomain $\mathbb{R}$, and must satisfy the three Kolmogorov axioms (to be discussed below in Definition 107).

Given any event $A \in \Sigma$, the number P(A) is called the probability of A.

For the probability function P, the mathematical modeller is free to choose the mapping

rule she deems most appropriate. The only restriction is that P satisfies three axioms,

called the Kolmogorov Axioms, to be discussed in the next section.

Exercise 220. (Answers on pp. 1160, 1161, and 1162.) Consider each of the following

real-world scenarios.

(a) You pick, at random, a card from a standard 52-card deck.

(b) You flip two fair coins.

(c) You roll two fair dice.

Model each of the above real-world scenarios as an experiment, by following steps (i)-(iv):

(i) Write down the appropriate sample space S.

(ii) How many possible events are there? Hence, how many elements does the event space $\Sigma$ contain? If it is not too tedious, write out $\Sigma$ in full.

(iii) What are the domain and codomain of the probability function P? Write down the probabilities of any three events. Given any event $A \in \Sigma$, what is P(A)?

(iv) In each scenario, explain briefly how John, another scientist, might justify choosing a

different sample space, event space, and probability function.


55.3

An axiom (or postulate) is a statement that is simply accepted as being true, without

justification or proof.

Example 510. Euclid's parallel axiom says that "Two non-parallel lines in the plane eventually intersect." Historically, this axiom was accepted as a self-evident truth, without need for justification or proof.

However, in the 19th century, mathematicians discovered non-Euclidean geometries, in

which the parallel axiom did not hold. These turned out to have significant implications

for maths, philosophy, and physics.

The above example illustrates that an axiom is not an eternal and immutable truth. Instead,

it is merely a statement that some mathematicians tentatively accept as being true. Having

listed a bunch of axioms, mathematicians then study their implications.

In probability theory, we impose three axioms on the probability function. These can be

thought of as restrictions on what the probability function looks like. Informally:

1. Probabilities can't be negative.

2. The probability that some outcome occurs is 1.

3. The probability that one of two disjoint events occurs is the sum of their individual probabilities.

Formally:

Definition 107. We say that a function P satisfies the three Kolmogorov axioms if:

1. Non-Negativity Axiom. For any event $E \subseteq S$, we have $P(E) \geq 0$.

2. Normalisation Axiom. $P(S) = 1$.

3. Additivity Axiom. Given any two disjoint events $E_1, E_2 \subseteq S$, we have $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.*

*This additivity axiom is actually not quite the correct third Kolmogorov axiom. Strictly speaking, we want instead a countable-additivity axiom: for any sequence of pairwise disjoint events $E_1, E_2, \dots$, we have $P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$. But for the A-levels we'll gloss over this.

In case you've forgotten, two sets are disjoint if they have no elements in common.
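On a small finite sample space, the three axioms (in the two-event form of additivity used here) can be verified by exhaustive enumeration. The sketch below is illustrative only: the function names are hypothetical, and the second candidate function `squared` is an invented example of a function that is non-negative and normalised but not additive.

```python
from fractions import Fraction
from itertools import combinations

def all_events(sample_space):
    """List every subset of a finite sample space as a frozenset."""
    outcomes = sorted(sample_space)
    return [
        frozenset(subset)
        for size in range(len(outcomes) + 1)
        for subset in combinations(outcomes, size)
    ]

def satisfies_kolmogorov(sample_space, P):
    """Check non-negativity, normalisation, and two-event additivity."""
    events = all_events(sample_space)
    non_negative = all(P(E) >= 0 for E in events)
    normalised = P(frozenset(sample_space)) == 1
    additive = all(
        P(E1 | E2) == P(E1) + P(E2)
        for E1 in events
        for E2 in events
        if not (E1 & E2)  # only pairs of disjoint events
    )
    return non_negative and normalised and additive

S = {1, 2, 3, 4, 5, 6}

def uniform(A):
    """Fair-die rule: P(A) = |A| / 6."""
    return Fraction(len(A), len(S))

def squared(A):
    """Invented counterexample: (|A| / 6) squared is not additive."""
    return Fraction(len(A) ** 2, len(S) ** 2)

print(satisfies_kolmogorov(S, uniform))  # True
print(satisfies_kolmogorov(S, squared))  # False
```

The second function fails because, for instance, it assigns 4/36 to {1, 2} but only 1/36 + 1/36 to {1} and {2} separately.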


55.4

Obviously, $P(\emptyset) = 0$ (the probability that the empty event occurs is 0). Previously, you've probably taken this and other "obvious" properties for granted. Now we'll prove that they follow from the Kolmogorov axioms.

Recall that given any set A, its complement $A^c$ (sometimes also denoted $A'$) is defined to be "everything else": more precisely, $A^c$ is the set of all elements that are not in A.

Proposition 12. Let P be a probability function and A, B be events. Then P satisfies the following properties:

1. Complements. $P(A) = 1 - P(A^c)$.

2. Probability of Empty Event is Zero. $P(\emptyset) = 0$.

3. Monotonicity. If $B \subseteq A$, then $P(B) \leq P(A)$.

4. Probabilities Are At Most One. $P(A) \leq 1$.

5. Inclusion-Exclusion. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

You may recognise that the Complements and the Inclusion-Exclusion properties are analogous to the CP and IEP from counting.

Proof. 1. Complements. By definition, $A$ and $A^c$ are disjoint. And so by the Additivity Axiom, $P(A) + P(A^c) = P(A \cup A^c)$.

Also by definition, $A \cup A^c = S$. And so $P(A \cup A^c) = P(S)$.

By the Normalisation Axiom, $P(S) = 1$.

Altogether then, $P(A) + P(A^c) = P(A \cup A^c) = P(S) = 1$. Rearranging, $P(A) = 1 - P(A^c)$, as desired.

The remainder of the proof is continued on p. 989 in the Appendices.
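As a hedged illustration of how the remaining properties can follow from the first, here is one possible route to Properties 2 and 4, using only the Complements property and the axioms. This is a sketch and may differ from the argument given in the Appendices.

```latex
\begin{align*}
P(\emptyset) &= 1 - P(\emptyset^c) && \text{(Complements, with } A = \emptyset\text{)} \\
             &= 1 - P(S)           && \text{(since } \emptyset^c = S\text{)} \\
             &= 1 - 1 = 0          && \text{(Normalisation Axiom)}, \\[1ex]
P(A)         &= 1 - P(A^c)         && \text{(Complements)} \\
             &\leq 1               && \text{(Non-Negativity Axiom: } P(A^c) \geq 0\text{)}.
\end{align*}
```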

Venn diagrams are helpful for illustrating probabilities. Those below help to illustrate four of the above five properties.


Exercise 221. Prove each of the following properties and illustrate with a Venn diagram: (a) If two events A and B are mutually exclusive, then $P(A \cap B) = 0$. (b) Let A, B, and C be events. Then $P(A \cup B \cup C) = P(A) + P(A^c \cap B) + P(A^c \cap B^c \cap C)$. (Answer on p. 1163.)


56

Example 511. Flip three fair coins. Model this as an experiment $E = (S, \Sigma, P)$, where

• The sample space is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

• The event space $\Sigma$ has $2^8 = 256$ elements.

• The probability function $P : \Sigma \to \mathbb{R}$ has mapping rule

$$P(HHH) = P(HHT) = \dots = P(TTT) = \frac{1}{8},$$

and more generally, for any event $A \in \Sigma$, $P(A) = \frac{|A|}{8}$.

Problem: Suppose there is at least 1 tail. Find the probability that there are at least 2 heads.

There are 7 possible outcomes where there is at least 1 tail: HHT, HTH, HTT, THH, THT, TTH, and TTT. Each is equally likely to occur. Of these, 3 outcomes involve at least 2 heads (HHT, HTH, and THH). Thus, given there is at least 1 tail, the probability that there are at least 2 heads is simply 3/7.

The above analysis was somewhat informal. Here is a more formal analysis.

Let A be the event that there are at least 2 heads: A = {HHT, HTH, THH, HHH}.

Let B be the event that there is at least 1 tail: B = {HHT, HTH, HTT, THH, THT, TTH, TTT}.

$A \cap B$ is thus the event that there are at least 2 heads and at least 1 tail: $A \cap B = \{HHT, HTH, THH\}$.

The probability we want is the conditional probability of A given B, which is given by:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{3/8}{7/8} = \frac{3}{7}.$$
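The three-coin computation can be reproduced by enumerating the sample space. The sketch below is illustrative only; the names `prob` and `p_A_given_B` are hypothetical, and outcomes are represented as strings such as "HHT".

```python
from fractions import Fraction
from itertools import product

# Sample space: all 8 sequences of three flips.
S = ["".join(flips) for flips in product("HT", repeat=3)]

def prob(event):
    """Equally likely outcomes: P(A) = |A| / 8."""
    return Fraction(len(event), len(S))

A = {s for s in S if s.count("H") >= 2}  # at least 2 heads
B = {s for s in S if s.count("T") >= 1}  # at least 1 tail

# Conditional probability of A given B.
p_A_given_B = prob(A & B) / prob(B)
print(p_A_given_B)  # 3/7
```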


• P(A) = 0.5 (the probability that A occurs is 0.5).

• P(B) = 0.6 (the probability that B occurs is 0.6).

• $P(A \cap B) = 0.2$ (the probability that both A and B occur is 0.2).

Hence, given that B has occurred, the probability that A has also occurred is simply 0.2/0.6 = 1/3. (The information that P(A) = 0.5 is irrelevant.) Formally:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.2}{0.6} = \frac{1}{3}.$$

Definition 108. Let P be a probability function and A, B be events. Then the conditional probability of A given B is denoted $P(A \mid B)$ and is defined by:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

Exercise 222. Roll two dice. Given that the sum of the two dice rolls is 8, what is the

probability that we rolled at least one even number? (Answer on p. 1164.)


56.1

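Fact 69. Suppose $P(A \cap B) > 0$.

(a) If $P(A) < P(B)$, then $P(A \mid B) < P(B \mid A)$.

(b) If $P(A) > P(B)$, then $P(A \mid B) > P(B \mid A)$.

(c) If $P(A) = P(B)$, then $P(A \mid B) = P(B \mid A)$.

Proof. By definition,

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{and} \quad P(B \mid A) = \frac{P(B \cap A)}{P(A)}.$$

Thus, $P(A \mid B) = \frac{P(A)}{P(B)} P(B \mid A)$. Hence,

$$P(A) < P(B) \implies P(A \mid B) < P(B \mid A),$$
$$P(A) > P(B) \implies P(A \mid B) > P(B \mid A),$$
$$P(A) = P(B) \implies P(A \mid B) = P(B \mid A).$$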

Informally, the conditional probability fallacy (CPF) is the fallacy of leaping from

"If A, then probably B"

to

"If B, then probably A".

Formally:

Definition 109. The conditional probability fallacy (CPF) is the mistaken belief that

$$P(A \mid B) = P(B \mid A)$$

is always true.

The CPF is also known as the confusion of the inverse or the inverse fallacy. In different contexts, it is also known variously as the base-rate fallacy, false-positive fallacy, or prosecutor's fallacy.


Example 513. Suppose the following statement is true: "If Mary has ebola, then Mary will probably vomit today." Formally, we might write $P(\text{Vomit} \mid \text{Ebola}) = 0.99$.

Mary vomits today. One might then reason, "Since $P(\text{Vomit} \mid \text{Ebola}) = 0.99$, by the CPF, we also have $P(\text{Ebola} \mid \text{Vomit}) = 0.99$. Thus, Mary probably has ebola."

Formally, this reasoning is flawed because P(Vomit) is probably much larger than P(Ebola). Thus, $P(\text{Vomit} \mid \text{Ebola})$ is probably much larger than $P(\text{Ebola} \mid \text{Vomit})$.

Informally, the reasoning is flawed because:

• Ebola is extremely rare, so it is extremely unlikely that Mary has ebola in the first place.

• Besides ebola, there are many other alternative explanations for why Mary might have vomited. For example, she might have had motion sickness or food poisoning.


Example 514. Sally buys a 4D ticket every week. One day, she wins the first prize. To

her astonishment, she wins the first prize again the following week.

Her jealous cousin Ah Kow makes a police report, based on the following erroneous reasoning:

"Without cheating, the probability that Sally wins the first prize two weeks in a row is 1 in 100 million. Given that she did win first prize two weeks in a row, the probability that she didn't cheat must likewise be 1 in 100 million. In other words, there is almost no chance that Sally didn't cheat."

Let's rephrase Ah Kow's reasoning more formally. Let A and B be the events "Sally wins the first prize two weeks in a row" and "Sally didn't cheat", respectively. We know that $P(A \mid B) = 0.00000001$. By the CPF, we have $P(A \mid B) = P(B \mid A)$. Hence, $P(B \mid A) = 0.00000001$. Equivalently, there is probability 0.99999999 that Sally cheated.

Formally, this reasoning is flawed because P(B) is probably much larger than P(A). Thus, $P(B \mid A)$ is probably much larger than $P(A \mid B)$.

Informally, the reasoning is flawed because:

Cheating in 4D is extremely rare (and difficult), so it is extremely unlikely that Sally

cheated in the first place.

Besides cheating, there are many other alternative explanations for why there exists an

individual who won first prize two weeks in a row.

One important alternative explanation is that so many individuals buy 4D tickets regularly that there will invariably be someone as lucky as Sally. Suppose that only 100,000 Singaporeans (less than 2% of Singapore's population) buy one 4D number every week. Then we'd expect that about once every 20 years, one of these 100,000 Singaporeans will have the fortune of winning the first prize on consecutive weeks. Rare, but hardly impossible.
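The "about once every 20 years" figure can be checked with a rough calculation. The sketch below rests on assumptions that are not spelled out in the text: that a 4D number has a 1 in 10,000 chance of winning first prize in a given week (which is what the quoted 1 in 100 million for two consecutive wins implies), that there are 52 draws a year, and that each week together with the following week counts as one chance of a back-to-back win.

```python
# Assumption: 1 in 10,000 chance of first prize per weekly draw,
# consistent with the text's "1 in 100 million" for two weeks in a row.
p_first_prize = 1 / 10_000
p_back_to_back = p_first_prize ** 2

players = 100_000      # weekly players, as in the text
weeks_per_year = 52    # assumption: one draw per week

# Expected number of back-to-back winners per year among all players.
expected_per_year = players * weeks_per_year * p_back_to_back

# Average number of years between such occurrences.
years_between = 1 / expected_per_year
print(round(years_between, 1))  # 19.2
```

Under these assumptions the waiting time is a little over 19 years, in line with the rough figure of 20 years.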

The next example uses concrete numbers to illustrate how large the discrepancy between $P(A \mid B)$ and $P(B \mid A)$ can be.


that 1 out of every 1,000,000 people has smallpox. The test is very accurate: If you have smallpox, it correctly tells you so 99% of the time. (Equivalently, it gives a false negative only 1% of the time.) And if you don't have smallpox, it also correctly tells you so 99% of the time. (Equivalently, it gives a false positive only 1% of the time.)

Formally, let $S$, $+$, and $-$ denote the events "the randomly-chosen person has smallpox", "the test returns positive", and "the test returns negative". Then

$$P(S) = \frac{1}{1000000}, \quad P(S^c) = \frac{999999}{1000000},$$

$$P(+ \mid S) = 0.99, \quad P(- \mid S) = 0.01, \quad P(- \mid S^c) = 0.99, \quad P(+ \mid S^c) = 0.01.$$

The test result returns positive (i.e. it says that the randomly-chosen person has smallpox).

What is the probability that this person actually has smallpox?

In words, it is easy to confuse "the probability of a positive test result conditional on having smallpox" with "the probability of having smallpox conditional on a positive test result".

Formally, this is the CPF. One starts with $P(+ \mid S) = 0.99$ and confusedly concludes that $P(S \mid +) = 0.99$, so that this person almost certainly has smallpox.

In fact, as we now show, despite testing positive, the person is very unlikely to have smallpox. The correct answer is $P(S \mid +) \approx \frac{1}{10{,}000}$, which follows from the definition of conditional probability (Definition 108):

$$\begin{aligned}
P(S \mid +) &= \frac{P(S \cap +)}{P(+)} = \frac{P(S) P(+ \mid S)}{P(+)} = \frac{P(S) P(+ \mid S)}{P(+ \cap S) + P(+ \cap S^c)} \\
&= \frac{P(S) P(+ \mid S)}{P(S) P(+ \mid S) + P(S^c) P(+ \mid S^c)} \\
&= \frac{\frac{1}{1000000} \times 0.99}{\frac{1}{1000000} \times 0.99 + \frac{999999}{1000000} \times 0.01} \\
&\approx 0.000098990 \approx \frac{1}{10{,}000}.
\end{aligned}$$

This example illustrates how far the CPF can lead one astray.
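The smallpox calculation is easy to reproduce with exact fractions. The following sketch is illustrative only; the variable names are hypothetical, and the numbers are those given in the example.

```python
from fractions import Fraction

p_S = Fraction(1, 1_000_000)          # P(S): prevalence of smallpox
p_pos_given_S = Fraction(99, 100)     # P(+ | S): true positive rate
p_pos_given_not_S = Fraction(1, 100)  # P(+ | S^c): false positive rate

# Total probability of a positive test result.
p_pos = p_S * p_pos_given_S + (1 - p_S) * p_pos_given_not_S

# Conditional probability of smallpox given a positive result.
p_S_given_pos = p_S * p_pos_given_S / p_pos

print(p_S_given_pos)         # 1/10102
print(float(p_S_given_pos))  # roughly 0.000099
```

The exact answer is 1/10102, which is where the approximation of 1 in 10,000 comes from. Almost all positive results are false positives, simply because people without smallpox vastly outnumber those with it.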

Now an actual, real-world example:


Example 516. The British mother who murdered her two babies. In 1996, Sally Clark's first-born died suddenly within a few weeks of birth. In 1998, the same happened to her second child. Clark was then arrested on suspicion of murdering her babies.

At her trial, an expert witness claimed that in an affluent, non-smoking family such as Sally Clark's, the probability of an infant suddenly dying with no explanation was 1/8543. Hence, he concluded, the probability of two sudden infant deaths in the same family was $(1/8543)^2$, or approximately 1 in 73 million.

The expert then committed the CPF. He argued that since

$$P(\text{Two babies suddenly die} \mid \text{Mother did not murder babies}) = \frac{1}{73{,}000{,}000},$$

it must also be that

$$P(\text{Mother did not murder babies} \mid \text{Two babies suddenly die}) = \frac{1}{73{,}000{,}000}.$$

This erroneous reasoning led to Sally Clark being convicted for murdering her two babies.

(Some of you may have noticed that the expert actually also made another mistake. But we'll examine this only in the next chapter.)

It turns out that not only laypersons and court prosecutors commit the CPF. As we'll see later, even academic researchers also often commit the CPF, when it comes to interpreting the results of a null hypothesis significance test (Chapter 72).

collected. Its DNA is analysed and compared to a database of DNA profiles. A match with one John Brown is found. Say there is only a 1 in 10 million chance that two random individuals have a DNA match.

Does this mean that there is probability 1 in 10 million that the DNA match with John Brown is merely a coincidence, and thus a near-certainty that the blood stain is really his? Explain why or why not, with reference to the following conditional probabilities:

$$P(\text{Blood stain is not John Brown's} \mid \text{DNA match}),$$

$$P(\text{DNA match} \mid \text{Blood stain is not John Brown's}).$$


56.2

Example 517. Consider all the families in the world that have two children, of whom at

least one is a boy. Randomly pick one of these families. What is the probability that both

children in this family are boys?

Think about it (set aside this book) before reading the answer below.

We already know that one child is a boy. So intuition might suggest that obviously,

P (Both boys) = P (The other child is a boy) = 0.5.

Intuition would be wrong. Intuition goes astray by failing to recognise that there are three equally likely ways that a family with two children can have at least one boy: BB, BG, or GB. The answer is in fact 1/3:

$$P(BB \mid \text{At least one boy}) = \frac{P(BB \cap \text{At least one boy})}{P(\text{At least one boy})} = \frac{P(BB)}{P(BB) + P(BG) + P(GB)} = \frac{\frac{1}{4}}{\frac{1}{4} + \frac{1}{4} + \frac{1}{4}} = \frac{1}{3}.$$

Next is a variant of the above Martin Gardner problem. This variant was first presented

in 2010.


Example 518. Consider all the families in the world that have two children, of whom at

least one is a boy born on a Tuesday. Randomly pick one of these families. What is the

probability that both children in this family are boys?

Those familiar with the previous problem might think, "Well, this is exactly the same as the two-boys problem, except with an obviously-irrelevant bit of information about the boy being born on a Tuesday. So the answer must be the same as before: 1/3."

It turns out though that, surprisingly, the Tuesday bit of information makes a big difference. The answer is $13/27 \approx 0.481$. This is much closer to 0.5 than to 1/3!

Consider all the two-child, at-least-one-boy-born-on-a-Tuesday families in the world. The four mutually-exclusive possibilities are

Case       Child #1                   Child #2               Probability
B_T B      Boy born on a Tuesday      Boy                    P(B_T B) = 1/14 × 1/2 = 7/196
B_T G      Boy born on a Tuesday      Girl                   P(B_T G) = 1/14 × 1/2 = 7/196
B_N B_T    Boy not born on a Tuesday  Boy born on a Tuesday  P(B_N B_T) = 6/14 × 1/14 = 6/196
G B_T      Girl                       Boy born on a Tuesday  P(G B_T) = 1/2 × 1/14 = 7/196

Altogether then, amongst two-child families with at least one boy born on a Tuesday, the proportion that have two boys is

$$\frac{P(BB \cap \text{At least one Tuesday boy})}{P(\text{At least one Tuesday boy})} = \frac{P(B_T B) + P(B_N B_T)}{P(B_T B) + P(B_T G) + P(B_N B_T) + P(G B_T)} = \frac{\frac{7}{196} + \frac{6}{196}}{\frac{7}{196} + \frac{7}{196} + \frac{6}{196} + \frac{7}{196}} = \frac{13}{27}.$$
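Both two-boys answers can be confirmed by enumerating all equally likely families. The sketch below is illustrative only and assumes, as the examples implicitly do, that each child is equally likely to be a boy or a girl and equally likely to be born on any of the 7 days of the week, independently of the other child. The helper names and the label `TUESDAY` are hypothetical.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label for one of the 7 weekdays (0 to 6)

# A child is a (sex, weekday) pair: 14 equally likely types.
children = list(product("BG", range(7)))
# A family is an ordered pair of children: 196 equally likely families.
families = list(product(children, repeat=2))

def both_boys(family):
    return all(sex == "B" for sex, _ in family)

def at_least_one_boy(family):
    return any(sex == "B" for sex, _ in family)

def at_least_one_tuesday_boy(family):
    return any(sex == "B" and day == TUESDAY for sex, day in family)

def cond_prob(event, given):
    """P(event | given), by counting equally likely families."""
    conditioned = [f for f in families if given(f)]
    favourable = [f for f in conditioned if event(f)]
    return Fraction(len(favourable), len(conditioned))

print(cond_prob(both_boys, at_least_one_boy))          # 1/3
print(cond_prob(both_boys, at_least_one_tuesday_boy))  # 13/27
```

Of the 196 families, 27 have at least one Tuesday boy, and 13 of those have two boys.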


57

Probability: Independence

Informally, two events A and B are independent if the probability that both occur is

simply the product of the probabilities that each occurs. Independence is thus analogous

to the MP from counting. Formally:

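Definition 110. Two events A, B are independent if

$$P(A \cap B) = P(A)P(B).$$

There is a second, equivalent perspective of independence. Informally, two events A and B are independent if the probability that A occurs is independent of whether B has occurred. Formally:

Fact 70. Suppose $P(B) \neq 0$. Then A, B are independent events $\iff$ $P(A \mid B) = P(A)$.

Proof. By definition of conditional probability, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ (equation 1). By definition of independence, $P(A \cap B) = P(A)P(B)$ (equation 2). Plugging equation 2 into equation 1, we have $P(A \mid B) = P(A)$, as desired. Conversely, if $P(A \mid B) = P(A)$, then multiplying equation 1 through by $P(B)$ gives equation 2, so A and B are independent.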


Example 519. Flip two fair coins. Model this with the usual experiment, where

S = {HH, HT, TH, TT},

P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.

Let $H_1$ be the event that the first coin flip is Heads, that is, $H_1 = \{HH, HT\}$. Analogously define $T_1$, $H_2$, and $T_2$.

The intuitive idea of independence is easy to grasp. If we say that the two coin flips are

independent, what we mean is that the following four conditions are true:

1. H1 and H2 are independent. (The probability that the second flip is heads is independent

of whether the first flip is heads.)

2. H1 and T2 are independent. (The probability that the second flip is tails is independent

of whether the first flip is heads.)

3. T1 and H2 are independent. (The probability that the second flip is heads is independent

of whether the first flip is tails.)

4. T1 and T2 are independent. (The probability that the second flip is tails is independent

of whether the first flip is tails.)

Formally:

1. $P(H_1 \cap H_2) = P(\{HH\}) = P(H_1)P(H_2) = P(\{HH, HT\})P(\{HH, TH\}) = 0.5 \times 0.5 = 0.25$.

2. $P(H_1 \cap T_2) = P(\{HT\}) = P(H_1)P(T_2) = P(\{HH, HT\})P(\{HT, TT\}) = 0.5 \times 0.5 = 0.25$.

3. $P(T_1 \cap H_2) = P(\{TH\}) = P(T_1)P(H_2) = P(\{TH, TT\})P(\{HH, TH\}) = 0.5 \times 0.5 = 0.25$.

4. $P(T_1 \cap T_2) = P(\{TT\}) = P(T_1)P(T_2) = P(\{TH, TT\})P(\{HT, TT\}) = 0.5 \times 0.5 = 0.25$.


Example 520. Flip a fair coin and roll a fair die. This can be modelled by an experiment, where

• S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}.

• $\Sigma$ consists of $2^{12}$ events.

• $P(A) = |A|/12$, for any event $A \in \Sigma$.

Now consider the event "Heads", $E_1 = \{H1, H2, H3, H4, H5, H6\}$, and the event "Roll an odd number", $E_2 = \{H1, H3, H5, T1, T3, T5\}$. These two events $E_1$ and $E_2$ are independent, as we now verify:

$$P(E_1 \mid E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} = \frac{3/12}{6/12} = \frac{1}{2} = P(E_1).$$

More broadly, we can even say that the coin flip and die roll are independent. Informally,

this means that the outcome of the coin flip has no influence on the outcome of the die roll,

and vice versa.
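Independence in the coin-and-die model can be checked directly from the product definition. The sketch below is illustrative only: the names are hypothetical, and the third event ("the die shows at most 3") is an invented extra, included to show a pair of events that fails the test.

```python
from fractions import Fraction

# Sample space: 12 equally likely (coin, die) outcomes.
S = {(coin, die) for coin in "HT" for die in range(1, 7)}

def prob(event):
    return Fraction(len(event), len(S))

def independent(A, B):
    """Product definition: P(A and B) == P(A) * P(B)."""
    return prob(A & B) == prob(A) * prob(B)

heads = {s for s in S if s[0] == "H"}
odd = {s for s in S if s[1] % 2 == 1}
at_most_3 = {s for s in S if s[1] <= 3}  # invented event, for contrast

print(independent(heads, odd))      # True
print(independent(odd, at_most_3))  # False
```

The second pair is not independent because knowing the roll is at most 3 raises the probability of an odd number from 1/2 to 2/3.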

The idea of independence is a little tricky to illustrate on a Venn diagram. I'll try anyway.


Example 521. The Venn diagram below illustrates a sample space with 100 equally likely

outcomes (represented by 100 small squares). The event A is highlighted in red. The event

B is highlighted in blue.

P(A) = 0.2 (A is made of 20 small squares). P(B) = 0.1 (B is made of 10 small squares). The event $A \cap B$, coloured in green, is made of 2 small squares, so $P(A \cap B) = 0.02$.

We compute

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.02}{0.1} = 0.2.$$

We observe that $P(A) = 0.2 = P(A \mid B)$. And so by Fact 70, we conclude that the events A and B are independent.


Exercise 224. Symmetry of Independence. In Fact 70, we showed that A, B are independent events $\iff$ $P(A \mid B) = P(A)$. Now prove that A, B are independent events $\iff$ $P(B \mid A) = P(B)$. (Answer on p. 1165.)

A = B and B = C, then A = C. Another example is $\geq$: If $A \geq B$ and $B \geq C$, then $A \geq C$.

In contrast, independence is not transitive, as this exercise will demonstrate. That is,

even if A and B are independent, and B and C are independent, it may not be that A and

C are also independent.

Flip two fair coins. Let H1 be the event that the first coin flip is heads, H2 be the event

that the second is heads, and T1 be the event that the first flip is tails. Show that

(a) H1 and H2 are independent.

(b) H2 and T1 are independent.

(c) H1 and T1 are not independent.


57.1

The idea of independence is intuitively easy to grasp. Indeed, so much so that students often assume that everything is independent. This is a mistake. Unless you're explicitly told so, NEVER assume that two events are independent.

Here are two examples where the assumption of independence is plausible:

Example 522. The event "coin-flip #1 is heads" and the event "coin-flip #2 is heads" are probably independent.

Example 523. The event "die-roll #1 is 3" and the event "die-roll #2 is 6" are probably independent.

Here are two examples where the assumption of independence is not plausible:

Example 524. The event "Google's share price rises today" is probably not independent of the event "Apple's share price rises today".

Example 525. The event "it rains in Singapore today" is probably not independent of the event "it rains in Kuala Lumpur today".

Nonetheless, the assumption of independence is frequently and incorrectly made even when it is implausible. One reason is that the maths is easy if we assume independence: we can simply multiply probabilities together.

We now revisit the Sally Clark case. Previously, we saw that the court's expert witness committed the CPF. Now, we'll see that he also made a second mistake: that of assuming independence.


Example 526. The expert witness claimed that in an affluent, non-smoking family such as Sally Clark's, the probability of an infant suddenly dying with no explanation was 1/8543. Hence, he concluded, the probability of two sudden infant deaths in the same family was $(1/8543)^2$, or approximately 1 in 73 million.

Can you spot the error in the reasoning?

By simply multiplying together probabilities, the expert implicitly assumed that the two events "sudden death of baby #1" and "sudden death of baby #2" are independent.

But as any doctor will tell you, if your family has a history of heart attack, diabetes, or

pretty much any other ailment, then you may be at higher risk (than the average person)

of suffering the same.

And so, it may well be that in any given year, a random person has probability 0.001 of dying of a heart attack. It does not however follow that in any given year, a random family has probability $0.001^2 = 0.000001$ of two deaths by heart attack.

Similarly, it may be that if one baby in a family has already suddenly died, a second baby

is at higher risk (than the average baby) of suddenly dying.

Exercise 226. (Answer on p. 1165.) Say the probability that a randomly-chosen person is or was an NBA player is one in a million. (This is probably about right, since there've only ever been 4,000 or so NBA players, since the late 1940s.)

The Barry family had four players in the NBA: the father Rick Barry and three of his four sons, Jon, Brent, and Drew. (The oldest son Scooter didn't make the NBA but was still good enough to play professionally in other basketball leagues around the world.)

A journalist concludes that the probability of a "Barry family" ever occurring is

$$\left(\frac{1}{1{,}000{,}000}\right)^4 = \frac{1}{1{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000}.$$

This is equal to the probability of buying a 4D number on six consecutive weeks, and

winning first prize every time. Is the journalist correct?


57.2

A, B, C are pairwise independent if all three of the following conditions are true:

$$P(A \cap B) = P(A)P(B), \quad P(B \cap C) = P(B)P(C), \quad P(A \cap C) = P(A)P(C).$$

A, B, C are independent if in addition to the above three conditions being true, it is also true that

$$P(A \cap B \cap C) = P(A)P(B)P(C).$$

It is tempting to believe that pairwise independence implies independence. That is, if the

first three conditions listed above are true, then so is the fourth. Alas, this is false, as the

next exercise demonstrates:

on p. 1165.)

Flip two fair coins. Let H1 be the event that the first coin flip is heads, T2 be the event

that the second is tails, and X be the event that the two coin flips are different. Show that

(a) These three events are pairwise independent.

(b) These three events are not independent.


58

58.1

The Monty Hall Problem

The Monty Hall Problem is probably the world's most famous probability puzzle. It takes less than a minute to state. Yet its counter-intuitive answer confuses nearly everyone.

You're at a gameshow. There are three boxes, labelled #1, #2, and #3. One box contains one year's worth of a Singapore minister's salary. The other two are empty.

You are asked to pick one box (but you are not allowed to open it yet).

The host, who knows where the minister's salary is, opens one of the other two boxes, to reveal that it is empty. Important: The host is not allowed to open the box that contains the minister's salary; he must always open a box that is empty.

You're now given a choice: Stay (with your original choice) or switch (to the other unopened box). What should you do?

To illustrate:

Example 527. Say you pick Box #2. The host then opens an empty Box #1. You're now given a choice: Stay (with Box #2) or switch (to Box #3). Which do you choose?

[Figure: Box #1 (opened, empty), Box #2, Box #3.]

Take as long as you need to think about this problem, before turning to the

next page for the answer.


A magazine columnist named Marilyn vos Savant [64] gave the correct answer:

"Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance."

1. The probability that the minister's salary is in the box you picked is 1/3. The probability that the minister's salary is in either of the other two boxes is 2/3. Of the other two boxes, the gameshow host (who knows where the salary is) helps you eliminate one of them. So the remaining unopened box still has probability 2/3 of containing the minister's salary.

2. Imagine instead that there are 100 boxes, of which one contains the minister's salary and the others are empty. You pick one. Of the remaining 99, the gameshow host opens 98. You are again given the choice: Should you stay or switch? In this more extreme version of the game, it is perhaps more obvious that your originally-picked box has only probability 1/100 of containing the minister's salary, while the only other unopened box has probability 99/100 of the same. Therefore, you should switch.

Here's a more formal explanation using the method of enumeration:

3. Say you originally pick Box #1. There are three possible cases, each occurring with probability 1/3:

Case   Box #1              Box #2              Box #3              Host opens
A      Minister's salary   Empty               Empty               Box #2 or Box #3
B      Empty               Minister's salary   Empty               Box #3
C      Empty               Empty               Minister's salary   Box #2

Not switching wins you the minister's salary only in Case A (1/3 probability).

Switching wins you the minister's salary in Cases B and C (2/3 probability).

[64] Marilyn vos Savant was, briefly, in the Guinness Book of Records as the person with the world's highest IQ, until Guinness retired this category because IQ tests were considered to be too unreliable.


Even with the above explanations, some of you may remain unconvinced. Don't worry, you are not alone. After Marilyn's initial response, 10,000 readers sent in letters telling her she was wrong. Some were from Professors of Mathematics and PhDs. A few examples:

"As a professional mathematician, I'm very concerned with the general public's lack of mathematical skills. Please help by confessing your error and in the future being more careful."

"There is enough mathematical illiteracy in this country, and we don't need the world's highest IQ propagating more. Shame!"

"Maybe women look at math problems differently than men."

Unfortunately for the above letter writers, Marilyn was correct and they were wrong.

The best way to convince the sceptical is through simulations (try this Google spreadsheet). Or if you don't trust computers, do an actual experiment:

Class Activity

Form pairs. One person is the gameshow host and the other is the contestant. The host

decides where the prize is (Box #1, #2, or #3). The contestant then picks a box. The

host then tells the contestant which one of the other two boxes is empty. The contestant

then decides whether to stay or switch.

Repeat as many times as you have time for. Record the proportion of times that the

contestant should have switched. You should find that this proportion is about 2/3.
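The class activity can also be run many times on a computer. The following is a minimal simulation sketch, not the spreadsheet mentioned above; the function names, the number of trials, and the fixed random seed are all arbitrary choices made for illustration. It assumes the prize and the contestant's pick are each chosen uniformly at random, and that the host picks at random when two empty boxes are available to open.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    boxes = [1, 2, 3]
    prize = rng.choice(boxes)
    pick = rng.choice(boxes)
    # The host opens an empty box that is not the contestant's pick.
    opened = rng.choice([b for b in boxes if b != pick and b != prize])
    if switch:
        # Switch to the only box that is neither picked nor opened.
        pick = next(b for b in boxes if b != pick and b != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Proportion of wins over many simulated rounds."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(win_rate(switch=True))   # close to 2/3
print(win_rate(switch=False))  # close to 1/3
```

With this many trials the simulated proportions should land within about a percentage point of 2/3 and 1/3.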


58.2

Example 528. (The birthday problem.) What is the smallest number n of people in a

room, such that it is more likely than not, that at least 2 people in the room share the same

birthday?*

The probability that person #2s birthday is different (from person #1) is 364/365.

The probability that person #3s birthday is different (from persons #1 and #2) is

363/365.

The probability that person #4s birthday is different (from persons #1, #2, and #3)

is 362/365.

... ...

The probability that person #ns birthday is different (from persons #1 through #n1)

is (366 n)/365.

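Altogether, the probability that no 2 persons share the same birthday is

$$\frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \dots \times \frac{366 - n}{365}.$$

Hence, the probability that at least 2 persons share the same birthday is

$$1 - \frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \dots \times \frac{366 - n}{365}.$$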

The smallest integer n for which the above probability is at least 0.5 is 23. (Wolfram

Alpha.) That is, perhaps surprisingly, with just 23 people, it is more likely than not that

at least 2 persons share a birthday.

*Assume there are no leap years (every year has 365 days). Also, assume each person's birthday is equally likely to be on any day of the year and does not depend on the birthday of anyone else.
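The search for the smallest n can also be done directly. The following Python sketch evaluates the product above under the same assumptions (365 equally likely, independent birthdays); the function names are illustrative.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least 2 of n people share a birthday."""
    p_all_different = 1.0
    for i in range(n):
        # Person #(i + 1) must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1.0 - p_all_different

def smallest_group(threshold: float = 0.5) -> int:
    """Smallest n for which a shared birthday has probability >= threshold."""
    n = 1
    while p_shared_birthday(n) < threshold:
        n += 1
    return n

if __name__ == "__main__":
    print(smallest_group())
    print(p_shared_birthday(22), p_shared_birthday(23))
```

The loop multiplies the factors 365/365, 364/365, ..., (366 − n)/365, so the first factor (equal to 1) simply stands for person #1.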


59

Informally, a random variable X is a function that assigns a real number X(s) (you can think of this as a numerical code) to each possible outcome s. We call any such real number an observed value of X.

Example 529. Model a fair coin-flip with the usual experiment $E = (S, \Sigma, P)$, where

$S = \{H, T\}$.

$\Sigma = \{\emptyset, \{H\}, \{T\}, S\}$.

$P : \Sigma \to \mathbb{R}$ is defined by $P(\emptyset) = 0$, $P(\{H\}) = P(\{T\}) = 0.5$, and $P(S) = 1$.

Let $X : S \to \mathbb{R}$ be the random variable that indicates whether the coin-flip is heads. That is, the observed value of X is $X(H) = 1$ if the outcome is heads and $X(T) = 0$ if the outcome is tails.

Formally:

Definition 112. Let $E = (S, \Sigma, P)$ be an experiment. A random variable X (on the experiment E) is any function with domain S and codomain $\mathbb{R}$.

Given any random variable X and any outcome $s \in S$, we call $X(s)$ the observed (or realised) value of the random variable X. We often denote a generic observed value $X(s)$ by the lower-case letter x.


59.1

Students often confuse a random variable with an observed value of the random variable.

This confusion is, of course, simply the confusion between a function and the value taken

by the function.

Example 529 (continued from above). X is a function with domain S and codomain

R. X is therefore a random variable.

If the outcome of the coin-flip is heads, we do not say that X is 1. Instead, we say that

the observed value of X is 1.

If the outcome of the coin-flip is tails, we do not say that X is 0. Instead, we say that the

observed value of X is 0.

Remember: A random variable X is a function that can take on many possible real

number values. Each such value x = X(s) is called an observed value of X.


59.2

event $\{s \in S : X(s) = k\}$.

The notation $X \geq k$, $X > k$, $X \leq k$, $X < k$, $a \leq X \leq b$, etc. are similarly defined.

Example 529 (continued from above). $X(H) = 1$ and $X(T) = 0$. So we can write:

$X = 1$ denotes the event $\{s \in S : X(s) = 1\} = \{H\}$,

$X = 0$ denotes the event $\{s \in S : X(s) = 0\} = \{T\}$.

Moreover, $P(\{H\}) = 0.5$ and $P(\{T\}) = 0.5$. So we can also write:

$P(X = 1) = 0.5$ and $P(X = 0) = 0.5$.

Now let's try some other arbitrary number like 13.71. Notice there is no outcome s such that $X(s) = 13.71$. Thus:

$X = 13.71$ denotes the event $\{s \in S : X(s) = 13.71\} = \emptyset$.

More generally, for any $k \neq 0, 1$,

$X = k$ denotes the event $\{s \in S : X(s) = k\} = \emptyset$,

and $P(X = k) = 0$.

Define $Y : S \to \mathbb{R}$ by $Y(H) = 15.5$, $Y(T) = 15.5$. Y is an example of a constant random variable. We may write:

$Y = 15.5$ denotes the event $\{s \in S : Y(s) = 15.5\} = \{H, T\}$,

so that $P(Y = 15.5) = P(S) = 1$. And for any $k \neq 15.5$,

$Y = k$ denotes the event $\{s \in S : Y(s) = k\} = \emptyset$,

and $P(Y = k) = 0$.


59.3

We call a complete specification of P (X = k) for all values of k the probability distribution (or probability law or probability mass function) of X. In the above example,

we gave the probability distributions of both X and Y .

More examples of random variables and their probability distributions:

Example 530. Flip two fair coins. Model this with the usual experiment, where $S = \{HH, HT, TH, TT\}$.

Let $X : S \to \mathbb{R}$ indicate whether the two coin flips are the same and $Y : S \to \mathbb{R}$ count the number of heads. That is,

$X(HH) = 1$, $X(HT) = 0$, $X(TH) = 0$, $X(TT) = 1$,

$Y(HH) = 2$, $Y(HT) = 1$, $Y(TH) = 1$, $Y(TT) = 0$.

And

$P(X = 0) = 0.5$, $P(X = 1) = 0.5$, and $P(X = k) = 0$ for any $k \neq 0, 1$.

$P(Y = 0) = 0.25$, $P(Y = 1) = 0.5$, $P(Y = 2) = 0.25$, and $P(Y = k) = 0$ for any $k \neq 0, 1, 2$.
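The probability distributions in the two-coin example can be tabulated mechanically. Below is a small Python sketch that treats a random variable as a function on the sample space and assumes all outcomes are equally likely; the helper names `pmf`, `same`, and `heads` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def pmf(sample_space, random_variable):
    """Map each value k to P(X = k), assuming equally likely outcomes."""
    weight = Fraction(1, len(sample_space))
    dist = defaultdict(Fraction)  # unseen values start at probability 0
    for outcome in sample_space:
        dist[random_variable(outcome)] += weight
    return dict(dist)

S = ["HH", "HT", "TH", "TT"]

def same(outcome):
    """X: 1 if the two coin flips are the same, 0 otherwise."""
    return 1 if outcome[0] == outcome[1] else 0

def heads(outcome):
    """Y: the number of heads."""
    return outcome.count("H")

pmf_X = pmf(S, same)
pmf_Y = pmf(S, heads)

if __name__ == "__main__":
    print(pmf_X)
    print(pmf_Y)
```

Any value k that does not appear as a key (such as 13.71) has P(X = k) = 0.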

Another example:


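Example 531. Pick a random card from the standard 52-card deck. Model this with the usual experiment, where

$$S = \{A\spadesuit, K\spadesuit, \dots, 2\spadesuit, A\heartsuit, K\heartsuit, \dots, 2\heartsuit, A\diamondsuit, K\diamondsuit, \dots, 2\diamondsuit, A\clubsuit, K\clubsuit, \dots, 2\clubsuit\}.$$

$X : S \to \mathbb{R}$ is the High Card Point count (used in the game of bridge). I.e.,

X(A of any suit) = 4,

X(K of any suit) = 3,

X(Q of any suit) = 2,

X(J of any suit) = 1,

X(Any other card) = 0.

Thus,

$$P(X = 0) = \frac{36}{52}, \quad P(X = 1) = \frac{4}{52}, \quad P(X = 2) = \frac{4}{52}, \quad P(X = 3) = \frac{4}{52}, \quad P(X = 4) = \frac{4}{52},$$

and $P(X = k) = 0$ for any $k \neq 0, 1, 2, 3, 4$.

$Y : S \to \mathbb{R}$ indicates whether the picked card is a spade ($\spadesuit$). I.e.,

Y(Any $\spadesuit$) = 1, Y(Any other card) = 0.

Thus,

$$P(Y = 0) = \frac{39}{52}, \quad P(Y = 1) = \frac{13}{52},$$

and $P(Y = k) = 0$ for any $k \neq 0, 1$.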


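Example 532. Roll two fair dice. Model this with the usual experiment, where the sample space consists of the 36 possible outcomes. Writing $(i, j)$ for the outcome in which the first die shows $i$ and the second die shows $j$,

$$S = \{(1, 1), \dots, (1, 6), (2, 1), \dots, (6, 6)\}.$$

Let $X : S \to \mathbb{R}$ be the sum of the two dice. For example, $X(2, 5) = 7$ and $X(1, 4) = 5$.

The table below says that $P(X = 2) = 1/36$, because there is only one way the event $X = 2$ can occur. And $P(X = 3) = 2/36$, because there are two ways the event $X = 3$ can occur. You are asked to complete the table in the next exercise.

| k        | 2    | 3    | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| P(X = k) | 1/36 | 2/36 |   |   |   |   |   |   |    |    |    |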

Exercise 228. (Continuation of the above example.) (Answer on p. 1166.) (a) Complete

the above table.

Consider the event E, described in words as "the sum of the two dice is at least 10".

(b) Write down the event E in terms of X.

(c) Calculate P (E).


59.4

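Example 532 (continued from above). Continue with the same roll-two-fair-dice example, with X again being the random variable that is the sum of the two dice. Writing $(i, j)$ for the outcome in which the first die shows $i$ and the second die shows $j$, we had $X(2, 5) = 7$ and $X(1, 4) = 5$.

Now let $Y : S \to \mathbb{R}$ be the product of the two dice. So $Y(2, 5) = 10$ and $Y(1, 4) = 4$.

Remember: random variables are simply functions. And thus, we can manipulate random variables just like we manipulate any functions.

So for example, consider the function $X + Y : S \to \mathbb{R}$. It is also a random variable. We have

$(X + Y)(2, 5) = 17$ and $(X + Y)(1, 4) = 9$.

Similarly, $XY$ and $4X - 5Y$ are also random variables, with

$(XY)(2, 5) = 70$ and $(XY)(1, 4) = 20$,

$(4X - 5Y)(2, 5) = -22$ and $(4X - 5Y)(1, 4) = 0$.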
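Since random variables are just functions, combining them can be sketched in a few lines of Python. The sketch below takes X to be the sum of the two dice, as in the example. As an assumption, it takes Y to be the product of the two dice and evaluates everything at the two hypothetical outcomes (2, 5) and (1, 4); these choices reproduce the values 7, 5, 10, 4, 17, 9, 70, 20, −22, and 0 quoted above.

```python
from itertools import product

# Sample space: ordered pairs (first die, second die).
S = list(product(range(1, 7), repeat=2))

def X(s):
    """Sum of the two dice."""
    return s[0] + s[1]

def Y(s):
    """Product of the two dice (an assumption made for this sketch)."""
    return s[0] * s[1]

def combine(op, f, g):
    """Return the random variable s -> op(f(s), g(s))."""
    return lambda s: op(f(s), g(s))

X_plus_Y = combine(lambda a, b: a + b, X, Y)
X_times_Y = combine(lambda a, b: a * b, X, Y)
four_X_minus_five_Y = combine(lambda a, b: 4 * a - 5 * b, X, Y)

if __name__ == "__main__":
    for s in [(2, 5), (1, 4)]:
        print(s, X(s), Y(s), X_plus_Y(s), X_times_Y(s), four_X_minus_five_Y(s))
```

Each combined object is again a function with domain S and real values, which is all that the definition of a random variable requires.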


Exercise 229. Continue with the above roll-two-fair-dice example. Let $P : S \to \mathbb{R}$ be the greater of the two dice. Let $Q : S \to \mathbb{R}$ be the difference of the two dice. Evaluate the

functions P , Q, and P Q at

and

. (Answer on p. 1167.)

Exercise 230. (Answer on p. 1167.) Model a fair die-roll with the usual experiment $E = (S, \Sigma, P)$. Define the function $X : S \to \mathbb{R}$ by the mapping rule X(1) = 1, X(2) = 2, X(3) = 3, X(4) = 4, X(5) = 5, and X(6) = 6.

Is X a random variable on E? Why or why not?

If X is indeed a random variable on E, then write down also P(X = k), for all possible k.

Exercise 231. For each of the following real-world scenarios, write down, in precise mathematical notation, (i) the experiment $E = (S, \Sigma, P)$; (ii) what the random variable X is; and (iii) P(X = k), for all possible k. (Answers on pp. 1167 and 1168.)

(a) Flip 4 (fair) coins. Let the random variable X be a count of the number of heads.

(b) Roll 3 (fair) dice. Let the random variable X be the sum of the three dice. (Tedious.)


60

x, Y = y denotes the event $\{s \in S : X(s) = x, Y(s) = y\}$.

Example 533. Flip two fair coins. Model this with the usual experiment where $S = \{HH, HT, TH, TT\}$.

Let $X : S \to \mathbb{R}$ indicate whether the two coin flips were the same and $Y : S \to \mathbb{R}$ count the number of heads. That is,

$X(HH) = 1$, $X(HT) = 0$, $X(TH) = 0$, $X(TT) = 1$,

and $Y(HH) = 2$, $Y(HT) = 1$, $Y(TH) = 1$, $Y(TT) = 0$.

Then $X = 0, Y = 0$ is the event that the two coin flips were not the same AND the number of heads was 0. By observation, this event is the empty set. Thus, $P(X = 0, Y = 0) = P(\emptyset) = 0$.

$X = 1, Y = 0$ is the event that the two coin flips were the same AND the number of heads was 0. By observation, this event is $\{TT\}$. Thus, $P(X = 1, Y = 0) = P(\{TT\}) = 0.25$.

Exercise: Verify for yourself that

$P(X = 0, Y = 1) = 0.5$, $P(X = 1, Y = 1) = 0$,

$P(X = 0, Y = 2) = 0$, $P(X = 1, Y = 2) = 0.25$.
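The joint probabilities in this example can be checked by listing the outcomes that belong to each event. Here is a minimal Python sketch, assuming equally likely outcomes; the helper name `joint_probability` is illustrative.

```python
from fractions import Fraction

S = ["HH", "HT", "TH", "TT"]

def X(s):
    """1 if the two coin flips are the same, 0 otherwise."""
    return 1 if s[0] == s[1] else 0

def Y(s):
    """Number of heads."""
    return s.count("H")

def joint_probability(x, y):
    """P(X = x, Y = y) for equally likely outcomes in S."""
    event = [s for s in S if X(s) == x and Y(s) == y]
    return Fraction(len(event), len(S))

if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1, 2):
            print(x, y, joint_probability(x, y))
```

The six printed probabilities sum to 1, as they must, since the six events partition the sample space.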


Informally, two random variables are independent if knowing the value of one does not

tell us anything about the value of the other.

Example 533 (continued from above). Flip two fair coins. We say the two coin-flips are independent. Informally, the outcome of one doesn't affect the other. Knowing that the first coin-flip is heads tells us nothing about the second coin-flip.

A little more formally, let A and B be the random variables indicating whether the first and second coin-flip are heads (respectively). That is, A = 1 if the first coin-flip is heads and A = 0 otherwise; and B = 1 if the second coin-flip is heads and B = 0 otherwise. Then the informal statement "the two coin-flips are independent" may be translated into the formal statement "the random variables A and B are independent".

Informally, knowing the observed value of A tells us nothing about whether B = 0 or B = 1.

(And vice versa.)

Formally:

Definition 115. Given random variables $X : S \to \mathbb{R}$ and $Y : S \to \mathbb{R}$, we say that X and Y are independent if for all x, y,

$$P(X = x, Y = y) = P(X = x)P(Y = y).$$

Let's restate the above definition more explicitly. Suppose X can take on values $x_1, x_2, \dots, x_n$ and Y can take on values $y_1, y_2, \dots, y_m$. Then to say that X and Y are independent is to say that all of the following $n \times m$ pairs of events are independent:

$$\begin{array}{llll}
X = x_1,\ Y = y_1; & X = x_2,\ Y = y_1; & \dots & X = x_n,\ Y = y_1; \\
X = x_1,\ Y = y_2; & X = x_2,\ Y = y_2; & \dots & X = x_n,\ Y = y_2; \\
\vdots & \vdots & & \vdots \\
X = x_1,\ Y = y_m; & X = x_2,\ Y = y_m; & \dots & X = x_n,\ Y = y_m.
\end{array}$$

That is $n \times m$ many pairs of events.


Example 533 (continued from above). We now verify, in more formal and precise language, that the two coin-flips are indeed independent.

Again, A and B are the random variables indicating whether the first and second coin-flips are heads (respectively).

We now verify that indeed, $P(A = a, B = b) = P(A = a)P(B = b)$ for all possible values of a and b:

| a, b         | P(A = a, B = b) | P(A = a)P(B = b)                           |
| a = 0, b = 0 | P({TT}) = 0.25  | P({TH, TT}) P({HT, TT}) = 0.5 × 0.5 = 0.25 |
| a = 1, b = 0 | P({HT}) = 0.25  | P({HH, HT}) P({HT, TT}) = 0.5 × 0.5 = 0.25 |
| a = 0, b = 1 | P({TH}) = 0.25  | P({TH, TT}) P({HH, TH}) = 0.5 × 0.5 = 0.25 |
| a = 1, b = 1 | P({HH}) = 0.25  | P({HH, HT}) P({HH, TH}) = 0.5 × 0.5 = 0.25 |
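The row-by-row check above can be automated. The following Python sketch tests Definition 115 by brute force over all pairs of values, assuming equally likely outcomes. The random variable `C` (the number of heads) is a hypothetical extra, included only to show a pair for which the check fails.

```python
from fractions import Fraction
from itertools import product

S = ["HH", "HT", "TH", "TT"]

def prob(event):
    """P(event), where event is a true/false function of the outcome."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def are_independent(U, V):
    """Check P(U = u, V = v) == P(U = u) P(V = v) for all values u, v."""
    u_values = {U(s) for s in S}
    v_values = {V(s) for s in S}
    return all(
        prob(lambda s: U(s) == u and V(s) == v)
        == prob(lambda s: U(s) == u) * prob(lambda s: V(s) == v)
        for u, v in product(u_values, v_values)
    )

def A(s):
    """1 if the first coin-flip is heads, 0 otherwise."""
    return 1 if s[0] == "H" else 0

def B(s):
    """1 if the second coin-flip is heads, 0 otherwise."""
    return 1 if s[1] == "H" else 0

def C(s):
    """Number of heads (hypothetical extra random variable)."""
    return s.count("H")

if __name__ == "__main__":
    print(are_independent(A, B))
    print(are_independent(A, C))
```

The pair A, C fails because, for instance, the event A = 0, C = 2 is impossible even though P(A = 0) and P(C = 2) are both positive.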

Exercise 232. Flip two fair coins. Let $X : S \to \mathbb{R}$ indicate whether the two coin flips were the same and $Y : S \to \mathbb{R}$ count the number of heads. Are X and Y independent random variables? (Answer on p. 1170.)

Earlier we warned against blithely assuming that any two events are independent. Here we

can repeat this warning: Unless explicitly told (or you have a good reason), do not assume

that two random variables are independent.

The assumption of independence is a strong one. There are many scenarios where it is

plausible. For example, the flips of two coins are probably independent. The rolls of two

dice are probably independent.

There are, however, also many scenarios where it is not plausible. Today's changes in the share prices of Google and Apple are probably not independent. Today's rainfall in Singapore and in Kuala Lumpur are probably not independent.

Nonetheless, the assumption of independence is frequently (and incorrectly) made even when it is implausible. The reason is that the maths is easy if we assume independence: we can simply multiply probabilities together. Unfortunately, incorrectly assuming independence has sometimes had tragic consequences, as we saw in the Sally Clark case.


61

What is the expected value (or the mean) of X? In other words, on average, what's the expected outcome of a fair die roll?

Note that X takes on a value 1 with probability 1/6. Similarly, it takes on a value 2 with probability 1/6. Etc. Hence, the expected value of X, denoted $E[X]$, is given by:

$$E[X] = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5.$$

$E[X]$ is thus simply a weighted average of the possible values of X, where the weights are the probability weights.

We'll use the following slightly-incorrect definition of a discrete random variable:[66]

Slightly-Incorrect Definition. A random variable is discrete if its range is finite.

That is, a random variable is discrete if it takes on finitely many possible values.

We can now formally define the expected value of a discrete random variable:

Definition 116. Let $E = (S, \Sigma, P)$ be an experiment. Then the corresponding expectation operator, denoted E, is the function that maps any discrete random variable $X : S \to \mathbb{R}$ to a real number, according to the mapping rule

$$E[X] = \sum_{k \in \text{Range}(X)} P(X = k) \cdot k.$$

We call $E[X]$ the expected value (or mean) of X. We often write $\mu_X = E[X]$ or even $\mu = E[X]$ (if it is clear from the context that we're talking about the mean of X).

[66] The correct definition is this: A random variable is discrete if its range is finite or countably-infinite. I avoid giving this correct definition because this would require explaining what countably-infinite means.


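Example 535. Let X be the outcome of a fair die roll. The range of X is $\text{Range}(X) = \{1, 2, 3, 4, 5, 6\}$. So

$$\begin{aligned}
E[X] &= \sum_{k \in \text{Range}(X)} P(X = k) \cdot k \\
&= P(X = 1) \cdot 1 + P(X = 2) \cdot 2 + P(X = 3) \cdot 3 + P(X = 4) \cdot 4 + P(X = 5) \cdot 5 + P(X = 6) \cdot 6 \\
&= \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = 3.5.
\end{aligned}$$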

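Example 536. Roll two fair dice and let Y be the sum of the two dice. The range of Y is $\text{Range}(Y) = \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$. In Exercise 228, we worked out that $P(Y = 2) = 1/36$, $P(Y = 3) = 2/36$, etc. Thus:

$$\begin{aligned}
E[Y] &= \sum_{k \in \text{Range}(Y)} P(Y = k) \cdot k \\
&= P(Y = 2) \cdot 2 + P(Y = 3) \cdot 3 + P(Y = 4) \cdot 4 + P(Y = 5) \cdot 5 + \dots + P(Y = 12) \cdot 12 \\
&= \frac{1}{36} \cdot 2 + \frac{2}{36} \cdot 3 + \frac{3}{36} \cdot 4 + \frac{4}{36} \cdot 5 + \frac{5}{36} \cdot 6 + \frac{6}{36} \cdot 7 + \frac{5}{36} \cdot 8 + \frac{4}{36} \cdot 9 + \frac{3}{36} \cdot 10 + \frac{2}{36} \cdot 11 + \frac{1}{36} \cdot 12 \\
&= \frac{2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12}{36} = \frac{252}{36} = 7.
\end{aligned}$$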
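Both expected values can be recomputed from the mapping rule E[X] = Σ P(X = k) · k. Below is a minimal Python sketch using exact fractions; the dictionary names `die` and `two_dice` are illustrative.

```python
from fractions import Fraction
from itertools import product

def expectation(pmf):
    """E[X] = sum over k of P(X = k) * k, for a PMF given as {k: P(X = k)}."""
    return sum(p * k for k, p in pmf.items())

# One fair die: each face has probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

# Sum of two fair dice: each of the 36 ordered pairs has probability 1/36.
two_dice = {}
for i, j in product(range(1, 7), repeat=2):
    two_dice[i + j] = two_dice.get(i + j, 0) + Fraction(1, 36)

if __name__ == "__main__":
    print(expectation(die))
    print(expectation(two_dice))
```

Using `Fraction` rather than floating-point numbers keeps the arithmetic exact, so the results can be compared directly with 21/6 and 252/36.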


Example 537. Flip two fair coins and roll two fair dice. Let X be the number of heads and Y be the number of sixes.

Problem: What is $E[X + Y]$?

As it turns out, it is generally true that $E[X + Y] = E[X] + E[Y]$ (as we'll see in the next section). So if we knew this, then the problem is very easy: $E[X + Y] = E[X] + E[Y] = 1 + \frac{1}{3} = \frac{4}{3}$.

But as an exercise, let's pretend we don't know that $E[X + Y] = E[X] + E[Y]$. We thus have to work out $E[X + Y]$ the hard way:

First, note that $\text{Range}(X + Y) = \{0, 1, 2, 3, 4\}$. $P(X + Y = 0)$ is the probability of 0 heads and 0 sixes. And $P(X + Y = 1)$ is the probability of 1 head and 0 sixes OR 0 heads and 1 six. We can compute:

$$P(X + Y = 0) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{5}{6} \cdot \frac{5}{6} = \frac{25}{144},$$

$$P(X + Y = 1) = \binom{2}{1} \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{5}{6} \cdot \frac{5}{6} + \frac{1}{2} \cdot \frac{1}{2} \cdot \binom{2}{1} \frac{5}{6} \cdot \frac{1}{6} = \frac{50}{144} + \frac{10}{144} = \frac{60}{144} = \frac{5}{12}.$$

You are asked to complete the rest of this problem in the exercise below.

Exercise 233. Complete the above example by following these steps: (a) Compute

P (X + Y = 2). (b) Compute P (X + Y = 3). (c) Compute P (X + Y = 4). (d) Now compute E[X + Y ]. (Answer on p. 1171.)


61.1

Example 538. Let 5 be a constant random variable on some experiment $E = (S, \Sigma, P)$. That is, $5 : S \to \mathbb{R}$ is the function defined by $s \mapsto 5$. (Note that the symbol 5 does double duty by denoting both a function and a real number.) Then not surprisingly,

$$E[\overbrace{5}^{\text{Function}}] = \overbrace{5}^{\text{Number}}.$$

That is, on average, we expect the random variable 5 to take on the value 5.

We can easily prove the above observation:

Fact 71. If the constant random variable c maps every outcome to the number c, then $E[c] = c$.

Proof. The PMF of the constant random variable c is given by $P(c = c) = 1$ and $P(c = k) = 0$ for any $k \neq c$. Hence, $E[c] = P(c = c) \cdot c = 1 \cdot c = c$.


Exercise 234. In the game of 4D, you pay $1 to pick any four-digit number between 0000 and 9999 (there are thus 10,000 possible choices). There are two variants of the 4D game: "big" and "small". The prize structures are as given below. Let X be the prize received from a $1 stake in the "big" game and Y be the prize received from a $1 stake in the "small" game. (Answer on p. 1172.)

(a) Write down the range of X and the range of Y .

(b) Write down the probability distributions of X and Y .

(c) Hence find E[X] and E[Y ].

(d) Which game, "big" or "small", is expected to lose you less money?

(Source: Singapore Pools, Rules for the 4-D Game, Version 1.11, 17/11/15. PDF.)


61.2

transformation if it satisfies the following two conditions:

(a) Additivity: f (x + y) = f (x) + f (y); and

(b) Homogeneity of degree 1: f (kx) = kf (x).

Example 539. The summation operator $\sum$ is an example of a linear transformation. Because it satisfies both additivity and homogeneity of degree 1:

$$\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i \quad \text{and} \quad \sum_{i=1}^{n} (k a_i) = k \sum_{i=1}^{n} a_i.$$

Example 540. The differentiation operator $\frac{d}{dx}$ is an example of a linear transformation. Because it satisfies both additivity and homogeneity of degree 1:

$$\frac{d}{dx}\left(f(x) + g(x)\right) = \frac{d}{dx} f(x) + \frac{d}{dx} g(x) \quad \text{and} \quad \frac{d}{dx}\left(k f(x)\right) = k \frac{d}{dx} f(x).$$

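Here are two examples of operators that are not linear transformations.

Example 541. The square-root operator is not a linear transformation, because in general we do not have

$$\sqrt{x + y} = \sqrt{x} + \sqrt{y} \quad \text{or} \quad \sqrt{kx} = k\sqrt{x}.$$

Example 542. The squaring operator is not a linear transformation, because in general we do not have

$$(x + y)^2 = x^2 + y^2 \quad \text{or} \quad (kx)^2 = kx^2.$$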


Proposition 13. The expectation operator E is linear. That is, if X and Y are random

variables and c is a constant, then

(a) Additivity: E[X + Y ] = E [X] + E [Y ],

(b) Homogeneity of degree 1: E[cX] = cE [X].

Proof. Optional, see p. 990 in the Appendices.

The linearity of the expectation operator is a powerful property, especially because it is

true even if independence is not satisfied.
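To see linearity at work without independence, consider a small Python sketch. It uses a hypothetical pair of random variables on a single fair die roll: `top` (the face showing) and `bottom` (the opposite face, which on a standard die is 7 minus the top face). These two are as dependent as can be, since each determines the other.

```python
from fractions import Fraction

S = range(1, 7)  # outcomes of one fair die roll (the face on top)

def expectation(rv):
    """E[rv] for a random variable rv on S, with equally likely outcomes."""
    return sum(Fraction(1, 6) * rv(s) for s in S)

def top(s):
    return s

def bottom(s):
    # Opposite faces of a standard die sum to 7, so bottom is determined by top.
    return 7 - s

e_top = expectation(top)
e_bottom = expectation(bottom)
e_sum = expectation(lambda s: top(s) + bottom(s))
e_triple = expectation(lambda s: 3 * top(s))

if __name__ == "__main__":
    print(e_top, e_bottom, e_sum, e_triple)
```

Here the sum of the two faces is always exactly 7, and E[top] + E[bottom] also equals 7, as additivity promises. Likewise E[3 · top] equals 3 · E[top].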

Example 543. I stake $100 on each of two different 4D numbers for Saturday's drawing ("big" game). (So that's $200 total.)

Let X and Y be my winnings (excluding my original stake) from the first and second

numbers (respectively). Now, X and Y are certainly not independent because for example,

if my first number wins first prize, then my second number cannot possibly also win first

prize.

Nonetheless, despite X and Y not being independent, the linearity of the expectation

operator tells us that

E [X + Y ] = E [X] + E [Y ] = $65.90 + $65.90 = $131.80.


62

Example 544. Consider a random variable X that is equally likely to take on one of 5 possible values: 0, 1, 2, 3, 4. Its mean is

$$\mu_X = \sum_k P(X = k) \cdot k = \frac{1}{5} \cdot 0 + \frac{1}{5} \cdot 1 + \frac{1}{5} \cdot 2 + \frac{1}{5} \cdot 3 + \frac{1}{5} \cdot 4 = 2.$$

Now consider another random variable Y that is equally likely to take on one of 5 possible values: −8, −3, 2, 7, 12. Coincidentally, its mean is the same:

$$\mu_Y = \sum_k P(Y = k) \cdot k = \frac{1}{5} \cdot (-8) + \frac{1}{5} \cdot (-3) + \frac{1}{5} \cdot 2 + \frac{1}{5} \cdot 7 + \frac{1}{5} \cdot 12 = 2.$$

The random variables X and Y share the same mean. However, there is an obvious difference: Y is more "spread out".

What, precisely, do we mean when we say that one random variable is more spread out

than another?

Our goal in this section is to invent a measure of "spread-outness". We'll call this the variance and denote the variance of any random variable X by $V[X]$.

It's not at all obvious how the variance should be defined. One possibility is to define the variance as the weighted average of the deviations from the mean.


Example 544 (continued from above). (Our first proposed definition of variance.) For X, the weighted average of the deviations from the mean is

$$\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot (k - \mu) \\
&= \frac{1}{5}(0 - \mu) + \frac{1}{5}(1 - \mu) + \frac{1}{5}(2 - \mu) + \frac{1}{5}(3 - \mu) + \frac{1}{5}(4 - \mu) \\
&= \frac{1}{5}(0 - 2) + \frac{1}{5}(1 - 2) + \frac{1}{5}(2 - 2) + \frac{1}{5}(3 - 2) + \frac{1}{5}(4 - 2) \\
&= -\frac{2}{5} - \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = 0.
\end{aligned}$$

Hmm. This works out to be 0. Is that just a weird coincidence? Let's try the same for Y:

$$\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot (k - \mu) \\
&= \frac{1}{5}(-8 - \mu) + \frac{1}{5}(-3 - \mu) + \frac{1}{5}(2 - \mu) + \frac{1}{5}(7 - \mu) + \frac{1}{5}(12 - \mu) \\
&= \frac{1}{5}(-8 - 2) + \frac{1}{5}(-3 - 2) + \frac{1}{5}(2 - 2) + \frac{1}{5}(7 - 2) + \frac{1}{5}(12 - 2) \\
&= -2 - 1 + 0 + 1 + 2 = 0.
\end{aligned}$$

Hmm. Again it works out to be 0.

This is no mere coincidence. It turns out that $\sum_k P(X = k) \cdot (k - \mu)$ is always equal to 0. This is because

$$\sum_k P(X = k) \cdot (k - \mu) = \underbrace{\sum_k P(X = k) \cdot k}_{= \mu} - \sum_k P(X = k) \cdot \mu = \mu - \mu \underbrace{\sum_k P(X = k)}_{= 1} = 0.$$

So our first proposed definition of the variance (the weighted average of the deviations from the mean) is always equal to 0. Intuitively, the reason is that the negative deviations (corresponding to those values below the mean) exactly cancel out the positive deviations (corresponding to those values above the mean).

This proposed definition is thus quite useless. We cannot use it to say things like "Y is more spread out than X".

This suggests a second approach: define the variance to be the weighted average of the absolute deviations from the mean.


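Example 544 (continued from above). (Our second proposed definition of variance.) For X, the weighted average of the absolute deviations from the mean is

$$\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot |k - \mu| \\
&= \frac{1}{5}|0 - \mu| + \frac{1}{5}|1 - \mu| + \frac{1}{5}|2 - \mu| + \frac{1}{5}|3 - \mu| + \frac{1}{5}|4 - \mu| \\
&= \frac{1}{5}|0 - 2| + \frac{1}{5}|1 - 2| + \frac{1}{5}|2 - 2| + \frac{1}{5}|3 - 2| + \frac{1}{5}|4 - 2| \\
&= \frac{2}{5} + \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = \frac{6}{5}.
\end{aligned}$$

And now let's work out the same for Y:

$$\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot |k - \mu| \\
&= \frac{1}{5}|-8 - \mu| + \frac{1}{5}|-3 - \mu| + \frac{1}{5}|2 - \mu| + \frac{1}{5}|7 - \mu| + \frac{1}{5}|12 - \mu| \\
&= \frac{1}{5}|-8 - 2| + \frac{1}{5}|-3 - 2| + \frac{1}{5}|2 - 2| + \frac{1}{5}|7 - 2| + \frac{1}{5}|12 - 2| \\
&= 2 + 1 + 0 + 1 + 2 = 6.
\end{aligned}$$

Wonderful! So we can now use this second proposed definition of the variance to say things like "Y is more spread out than X".

This second proposed definition seems perfectly satisfactory. Yet for some bizarre reason, it will not be our actual definition of variance. Instead, the variance will be defined as the weighted average of the squared deviations from the mean.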


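For X, the weighted average of the squared deviations from the mean is

$$\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot (k - \mu)^2 \\
&= \frac{1}{5}(0 - \mu)^2 + \frac{1}{5}(1 - \mu)^2 + \frac{1}{5}(2 - \mu)^2 + \frac{1}{5}(3 - \mu)^2 + \frac{1}{5}(4 - \mu)^2 \\
&= \frac{1}{5}(0 - 2)^2 + \frac{1}{5}(1 - 2)^2 + \frac{1}{5}(2 - 2)^2 + \frac{1}{5}(3 - 2)^2 + \frac{1}{5}(4 - 2)^2 \\
&= \frac{4}{5} + \frac{1}{5} + 0 + \frac{1}{5} + \frac{4}{5} = 2.
\end{aligned}$$

And now let's work out the same for Y:

$$\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot (k - \mu)^2 \\
&= \frac{1}{5}(-8 - \mu)^2 + \frac{1}{5}(-3 - \mu)^2 + \frac{1}{5}(2 - \mu)^2 + \frac{1}{5}(7 - \mu)^2 + \frac{1}{5}(12 - \mu)^2 \\
&= \frac{1}{5}(-8 - 2)^2 + \frac{1}{5}(-3 - 2)^2 + \frac{1}{5}(2 - 2)^2 + \frac{1}{5}(7 - 2)^2 + \frac{1}{5}(12 - 2)^2 \\
&= 20 + 5 + 0 + 5 + 20 = 50.
\end{aligned}$$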

Formally,

Definition 118. Let $\mu = E[X]$. Then the variance operator is denoted V and is the function that maps each random variable X to a real number c, given by the mapping rule

$$V[X] = E\left[(X - \mu)^2\right].$$

We call $V[X]$ the variance of X. This is often also instead written as $\sigma_X^2$ or even more simply as $\sigma^2$ (if it is clear from the context that we're talking about the variance of X).

So to calculate the variance, we do this: Consider all the possible values that X can take.

Take the difference between these values and the mean of X. Square them. Then take the

probability-weighted average of these squared numbers.
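The three candidate measures of spread discussed in this section can be compared side by side. The Python sketch below assumes a random variable that is equally likely to take each value in a list, and applies it to X (values 0, 1, 2, 3, 4) and Y (values −8, −3, 2, 7, 12); the function name `spread_measures` is illustrative.

```python
from fractions import Fraction

def spread_measures(values):
    """Return (mean deviation, mean absolute deviation, variance)
    for a random variable equally likely to take each listed value."""
    p = Fraction(1, len(values))
    mu = sum(p * k for k in values)                     # the mean
    mean_dev = sum(p * (k - mu) for k in values)        # first proposal
    mean_abs_dev = sum(p * abs(k - mu) for k in values) # second proposal
    variance = sum(p * (k - mu) ** 2 for k in values)   # actual definition
    return mean_dev, mean_abs_dev, variance

if __name__ == "__main__":
    print(spread_measures([0, 1, 2, 3, 4]))
    print(spread_measures([-8, -3, 2, 7, 12]))
```

The first entry is 0 for both random variables, while the second and third entries are both larger for Y than for X.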

More examples:


Example 545. Let the random variable X be the outcome of the roll of a fair die. We already know that $\mu = 3.5$. Hence,

$$\begin{aligned}
\sigma^2 &= P(X = 1) \cdot (1 - 3.5)^2 + P(X = 2) \cdot (2 - 3.5)^2 + \dots + P(X = 6) \cdot (6 - 3.5)^2 \\
&= \frac{1}{6}\left(2.5^2 + 1.5^2 + 0.5^2 + 0.5^2 + 1.5^2 + 2.5^2\right) = \frac{35}{12} \approx 2.92.
\end{aligned}$$

So the variance of the die roll is $\frac{35}{12} \approx 2.92$. This means that the expected squared deviation of X from its mean $\mu = 3.5$ is $\frac{35}{12} \approx 2.92$.

12

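Example 546. Roll two fair dice. Let the random variable Y be the sum of the two dice. We already know from Example 536 that $\mu = 7$. So, using also our findings from Exercise 228,

$$\begin{aligned}
\sigma^2 &= P(Y = 2) \cdot (2 - 7)^2 + P(Y = 3) \cdot (3 - 7)^2 + \dots + P(Y = 12) \cdot (12 - 7)^2 \\
&= \frac{1}{36} \cdot 5^2 + \frac{2}{36} \cdot 4^2 + \frac{3}{36} \cdot 3^2 + \frac{4}{36} \cdot 2^2 + \frac{5}{36} \cdot 1^2 + \frac{6}{36} \cdot 0^2 + \frac{5}{36} \cdot 1^2 + \frac{4}{36} \cdot 2^2 + \frac{3}{36} \cdot 3^2 + \frac{2}{36} \cdot 4^2 + \frac{1}{36} \cdot 5^2 \\
&= \frac{2 \cdot (25 + 32 + 27 + 16 + 5)}{36} = \frac{210}{36} = \frac{70}{12} \approx 5.83.
\end{aligned}$$

So the variance of the sum of the two dice is $\frac{70}{12} \approx 5.83$. This means that on average, the square of the deviation of Y from its mean $\mu = 7$ is $\frac{70}{12} \approx 5.83$.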

12

As the above examples suggest, calculating the variance can be tedious. Fortunately, there

is a shortcut:


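Fact 72. Let X be a random variable with mean $\mu = E[X]$. Then $V[X] = E[X^2] - \mu^2$.

Proof. Using the definition of variance, the linearity of the expectation operator (Proposition 13), and the fact that $\mu$ is a constant, we have

$$V[X] = E\left[(X - \mu)^2\right] = E\left[X^2 + \mu^2 - 2\mu X\right] = E[X^2] + \mu^2 - 2\mu E[X] = E[X^2] + \mu^2 - 2\mu^2 = E[X^2] - \mu^2.$$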

Example 545 (continued from above). Let the random variable X be the outcome of the roll of a fair die. We already know that $\mu = 3.5$. So compute

$$E[X^2] = P(X = 1) \cdot 1^2 + P(X = 2) \cdot 2^2 + \dots + P(X = 6) \cdot 6^2 = \frac{1}{6}\left(1^2 + 2^2 + \dots + 6^2\right) = \frac{91}{6}.$$

Hence,

$$V[X] = E[X^2] - \mu^2 = \frac{91}{6} - 3.5^2 = \frac{182}{12} - \frac{147}{12} = \frac{35}{12}.$$


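Example 546 (continued from above). Let the random variable Y be the sum of two rolled dice. We already know from Example 536 that $\mu = 7$. So, using also our findings from Exercise 228,

$$\begin{aligned}
E[Y^2] &= P(Y = 2) \cdot 2^2 + P(Y = 3) \cdot 3^2 + \dots + P(Y = 12) \cdot 12^2 \\
&= \frac{1}{36} \cdot 2^2 + \frac{2}{36} \cdot 3^2 + \frac{3}{36} \cdot 4^2 + \frac{4}{36} \cdot 5^2 + \frac{5}{36} \cdot 6^2 + \frac{6}{36} \cdot 7^2 + \frac{5}{36} \cdot 8^2 + \frac{4}{36} \cdot 9^2 + \frac{3}{36} \cdot 10^2 + \frac{2}{36} \cdot 11^2 + \frac{1}{36} \cdot 12^2 \\
&= \frac{4 + 18 + 48 + 100 + 180 + 294 + 320 + 324 + 300 + 242 + 144}{36} = \frac{1974}{36} = \frac{658}{12}.
\end{aligned}$$

Hence,

$$V[Y] = E[Y^2] - \mu^2 = \frac{658}{12} - 7^2 = \frac{658}{12} - \frac{588}{12} = \frac{70}{12}.$$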
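As a check on the shortcut, the variance of the sum of two dice can be computed both ways. The following Python sketch uses exact fractions and builds the distribution of the sum by enumerating the 36 equally likely ordered pairs; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Distribution of the sum of two fair dice.
pmf = {}
for i, j in product(range(1, 7), repeat=2):
    pmf[i + j] = pmf.get(i + j, 0) + Fraction(1, 36)

def expectation(g):
    """E[g(Y)] = sum over k of P(Y = k) * g(k)."""
    return sum(p * g(k) for k, p in pmf.items())

mu = expectation(lambda k: k)
second_moment = expectation(lambda k: k ** 2)      # E[Y^2]
by_definition = expectation(lambda k: (k - mu) ** 2)
by_shortcut = second_moment - mu ** 2

if __name__ == "__main__":
    print(mu, second_moment, by_definition, by_shortcut)
```

Both routes give 35/6, which is the same number as 70/12.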

Exercise 235. Let the random variable Z be the sum of three rolled dice. Find V[Z].

(Answer on p. 1173.)


62.1

A constant random variable cannot vary. So not surprisingly, the variance of a constant

random variable is 0.

Fact 73. Let c be a constant random variable (i.e. it maps every outcome to the real number

c). Then

V[c] = 0.



62.2

Standard Deviation

Let X be a random variable. Then E [X] has the same unit of measure as X. In contrast,

V [X] uses the squared unit.

Example 547. There are 100 dumbbells in a gym, of which 30 have weight 5 kg and the remaining 70 have weight 10 kg. Let X be the weight of a randomly-chosen dumbbell. Then the mean of X is

$$E[X] = \mu = 0.3 \times 5 \text{ kg} + 0.7 \times 10 \text{ kg} = 8.5 \text{ kg}.$$

And the variance of X is

$$V[X] = \sigma^2 = 0.3 \times (5 \text{ kg} - 8.5 \text{ kg})^2 + 0.7 \times (10 \text{ kg} - 8.5 \text{ kg})^2 = 5.25 \text{ kg}^2.$$

To get a measure of spread that uses the original unit of measure, we simply take the square root of the variance. This is called the standard deviation.

Definition 119. Let X be a random variable and $V[X]$ be its variance. Then the standard deviation of X is defined as

$$SD[X] = \sqrt{V[X]}.$$

The variance of a random variable X is often denoted $\sigma_X^2$ or even more simply as $\sigma^2$ (if it is clear from the context that we're talking about the variance of X).

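Example 547 (continued from above). We calculated the variance of X to be $V[X] = \sigma^2 = 5.25 \text{ kg}^2$. So the standard deviation of X is $SD[X] = \sigma = \sqrt{5.25 \text{ kg}^2} \approx 2.29$ kg.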
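The dumbbell figures can be reproduced in a few lines of Python. This sketch stores the distribution as a dictionary from weight (in kg) to probability; the units in the comments are reminders only, since the code works with bare numbers.

```python
import math
from fractions import Fraction

# Weight in kg -> probability of picking a dumbbell of that weight.
pmf = {5: Fraction(30, 100), 10: Fraction(70, 100)}

mean = sum(p * k for k, p in pmf.items())                    # in kg
variance = sum(p * (k - mean) ** 2 for k, p in pmf.items())  # in kg^2
sd = math.sqrt(variance)                                     # back in kg

if __name__ == "__main__":
    print(float(mean), float(variance), sd)
```

Taking the square root brings the measure of spread back to kilograms, the same unit as the mean.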

Exercise 236. There are 100 rulers in a bookstore, of which 35 have length 20 cm and the remaining 65 have length 30 cm. Let Y be the length of a randomly-chosen ruler. Find the mean, variance, and standard deviation of Y. (Be sure to include the units of measurement.) (Answer on p. 1173.)


62.3

The variance operator is not linear. However, given independence, the variance operator

does satisfy additivity and homogeneity of degree 2.

Proposition 14. Let X and Y be independent random variables and c be a constant. Then

(a) Additivity: $V[X + Y] = V[X] + V[Y]$,

(b) Homogeneity of degree 2: $V[cX] = c^2 V[X]$.

Proof. Optional, see p. 991 in the Appendices.

With the above, it becomes much easier than before to find the variance of the sum of 2

dice, 3 dice, or indeed n dice.


Example 548. Let X be the outcome of a fair die-roll. We showed earlier that $V[X] = \frac{35}{12}$.

Now roll two fair dice. Let $X_1$ and $X_2$ be the respective outcomes. Let Y be the sum of the two dice (i.e. $Y = X_1 + X_2$). Assuming independence, we have

$$V[Y] = V[X_1 + X_2] = V[X_1] + V[X_2] = \frac{70}{12}.$$

Now roll three fair dice. Let $X_3$, $X_4$, and $X_5$ be the respective outcomes. Let Z be the sum of the three dice (i.e. $Z = X_3 + X_4 + X_5$). Again, assuming independence, we have

$$V[Z] = V[X_3 + X_4 + X_5] = V[X_3] + V[X_4] + V[X_5] = \frac{105}{12}.$$

Again, compare this quick computation to the work you had to do in Exercise 235!

Now, let A be double the outcome of a die roll (i.e. $A = 2X$). Note importantly that $A \neq Y$. Y is the sum of two independent die rolls. In contrast, A is double the outcome of a single die roll. Indeed, by Proposition 14, we see that

$$V[A] = V[2X] = 4V[X] = \frac{140}{12} \neq V[Y].$$

Similarly, let B be triple the outcome of a die roll (i.e. $B = 3X$). Note importantly that $B \neq Z$. Z is the sum of three independent die rolls. In contrast, B is triple the outcome of a single die roll. Indeed, by Proposition 14, we see that

$$V[B] = V[3X] = 9V[X] = \frac{315}{12} \neq V[Z].$$
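The contrast between summing independent rolls and scaling a single roll can be confirmed by brute-force enumeration. The Python sketch below lists every equally likely outcome of one, two, or three dice and computes each variance directly from the definition; the helper name `variance` is illustrative.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Variance of a random variable that is equally likely to take each
    entry of `values` (repeated entries count once per repetition)."""
    p = Fraction(1, len(values))
    mu = sum(p * v for v in values)
    return sum(p * (v - mu) ** 2 for v in values)

faces = range(1, 7)

var_X = variance(list(faces))                                  # one die
var_Y = variance([a + b for a, b in product(faces, repeat=2)]) # X1 + X2
var_A = variance([2 * a for a in faces])                       # 2X
var_Z = variance([sum(t) for t in product(faces, repeat=3)])   # X3 + X4 + X5
var_B = variance([3 * a for a in faces])                       # 3X

if __name__ == "__main__":
    print(var_X, var_Y, var_A, var_Z, var_B)
```

Doubling one roll gives twice the variance of the sum of two independent rolls, and tripling one roll gives three times the variance of the sum of three.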

12

Exercise 237. The weight of a fish in a pond is a random variable with mean $\mu$ kg and variance $\sigma^2$ kg$^2$. (Include the units of measurement in your answers.) (Answer on p. 1173.)

(a) If two fish are caught and the weights of these fish are independent of each other, what

are the mean and variance of the total weight of the two fish?

(b) If one fish is caught and an exact clone is made of it, what are the mean and variance

of the total weight of the fish and its clone?

(c) If two fish are caught and the weights of these fish are not independent of each other,

what are the mean and variance of the total weight of the two fish?


62.4

Why is the variance defined as the weighted average of squared deviations from the mean?

1. First, we tried defining the variance as the weighted average of deviations from the mean, i.e. $V[X] = E[X - \mu]$. But this was no good, because this quantity would always be equal to 0.

2. Next, we tried defining the variance as the weighted average of absolute deviations from the mean, i.e. $V[X] = E[|X - \mu|]$. This seemed to work well enough. Yet for some bizarre reason, we choose not to use this definition.

3. Instead, we choose to use this definition:

$$V[X] = E\left[(X - \mu)^2\right].$$

Why do we prefer using squared (rather than absolute) deviations as our definition of variance? The conventional view is that the squared deviations definition is superior to the absolute deviations definition (but see Gorard (2005) and Taleb (2014) for dissenting views). Here are some reasons for believing the squared deviations definition to be superior:

• The maths works out more nicely. For example:

  - The algebra is easier when dealing with squares than with absolute values.

  - Differentiation is easier (observe that $x^2$ is differentiable but $|x|$ is not).

  - Variances are additive: If X and Y are independent, then $V[X + Y] = V[X] + V[Y]$. In contrast, if we use the definition $V[X] = E[|X - \mu|]$, then variances are no longer additive.

• Tradition. A century or two ago, some Europeans preferred using squared to absolute deviations. And so we're stuck with using this.

See also these Stack Exchange Q&A discussions: [1], [2], [3], [4], and [5].


63

Here's another example of a probability problem that can be stated very simply, yet has counter-intuitive results.

Example 549. Keep flipping a fair coin until you get a sequence of HH (two heads in a row). Let X be the number of flips taken.

Now, keep flipping a fair coin until you get a sequence of HT. Let Y be the number of flips taken.

Which is larger: $\mu_X = E[X]$ or $\mu_Y = E[Y]$?

Intuition might suggest that obviously, $\mu_X = \mu_Y$. Intuition would be wrong. It turns out that, surprisingly enough, $\mu_X = 6$ and $\mu_Y = 4$!
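A quick simulation makes this result easier to believe. The Python sketch below repeatedly flips a simulated fair coin until a target pattern appears and averages the number of flips taken; the function names and the fixed seed are illustrative assumptions.

```python
import random

def flips_until(pattern: str, rng: random.Random) -> int:
    """Flip a fair coin until the last flips spell `pattern`; return the count."""
    history = ""
    while not history.endswith(pattern):
        history += rng.choice("HT")
    return len(history)

def mean_flips(pattern: str, trials: int = 50_000, seed: int = 0) -> float:
    """Average number of flips needed to see `pattern`, over many trials."""
    rng = random.Random(seed)
    return sum(flips_until(pattern, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(mean_flips("HH"), mean_flips("HT"))
```

With enough trials, the two averages should come out close to 6 and 4 respectively.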

Example 550. Now suppose we flip a fair coin 10,001 times. This gives us a sequence of 10,000 pairs of consecutive coin-flips.

For example, if the 10,001 coin-flips are HHTHT..., then the first four pairs of consecutive coin-flips are HH, HT, TH, and HT.

Let A be the proportion of the 10,000 pairs of consecutive coin-flips that are HH. Let B be the proportion of the 10,000 pairs of consecutive coin-flips that are HT.

Which is larger: $\mu_A = E[A]$ or $\mu_B = E[B]$?

In the previous example, we saw that it took, on average, 6 flips before getting HH and 4 flips before getting HT. So obviously, we'd expect a smaller proportion to be HH's. That is, $\mu_A < \mu_B$.

Sadly, we would again be wrong! It turns out that $\mu_A = \mu_B = 1/4$! This Google spreadsheet simulates 10,001 coin-flips and calculates A and B.

If you're interested, the above two results are formally proven in Fact 103 in the Appendices.
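The same experiment can be run in Python instead of a spreadsheet. The sketch below simulates 10,001 flips of a fair coin and returns the observed proportions of HH and HT among the 10,000 pairs of consecutive flips; the function name and the fixed seed are illustrative assumptions.

```python
import random

def pair_proportions(n_flips: int = 10_001, seed: int = 0):
    """Return (proportion of HH pairs, proportion of HT pairs)
    among the n_flips - 1 pairs of consecutive coin-flips."""
    rng = random.Random(seed)
    flips = [rng.choice("HT") for _ in range(n_flips)]
    # Pair each flip with the one that follows it.
    pairs = [a + b for a, b in zip(flips, flips[1:])]
    return pairs.count("HH") / len(pairs), pairs.count("HT") / len(pairs)

if __name__ == "__main__":
    print(pair_proportions())
```

A single run gives one observed value each of A and B rather than their means, but both observed values should be close to 1/4.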


64

trial.

Example 551. Flip a coin. We can model this with a Bernoulli trial with probability of success (heads) 0.5:

Sample space $S = \{T, H\}$,

Event space $\Sigma = \{\emptyset, \{T\}, \{H\}, S\}$,

Probability function $P(\{T\}) = 0.5$ and $P(\{H\}) = 0.5$.

The corresponding Bernoulli random variable is simply the random variable $X : S \to \mathbb{R}$ defined by $X(T) = 0$ and $X(H) = 1$. Its probability distribution is given by $P(X = 0) = 0.5$ and $P(X = 1) = 0.5$.

Formally:

Definition 120. A Bernoulli trial with probability of success p is an experiment $(S, \Sigma, P)$ where

$S = \{0, 1\}$. (The sample space contains 2 elements.)

$\Sigma = \{\emptyset, \{0\}, \{1\}, S\}$.

$P : \Sigma \to \mathbb{R}$ is defined by $P(\{0\}) = 1 - p$ and $P(\{1\}) = p$. (And as usual $P(\emptyset) = 0$ and $P(S) = 1$.)

The corresponding Bernoulli random variable is simply the random variable $X : S \to \mathbb{R}$ defined by $X(\{0\}) = 0$ and $X(\{1\}) = 1$. Its probability distribution is given by $P(X = 0) = 1 - p$ and $P(X = 1) = p$.

Note that we can denote the two elements of the sample space with any symbols. We could

use 0 standing for failure and 1 standing for success. Or we could use T and H,

as was done in the example above.


Example 552. On any given day, our refrigerator at home has probability 0.001 of breaking

down. We can model this with a Bernoulli trial with probability of success 0.001:

Sample space $S = \{0, 1\}$,

Event space $\Sigma = \{\emptyset, \{0\}, \{1\}, S\}$,

Probability function $P(\{0\}) = 0.999$ and $P(\{1\}) = 0.001$.

The corresponding Bernoulli random variable is simply the random variable $T : S \to \mathbb{R}$ defined by $T(\{0\}) = 0$ and $T(\{1\}) = 1$.

Its probability distribution is given by P (T = 0) = 0.999 and P(T = 1) = 0.001. In words,

the probability of no failure is 0.999 and the probability of a failure is 0.001.

Example 553. 90% of H2 Maths students pass their H2 Maths A-level exams. We randomly pick a H2 Maths student and see if she passes her H2 Maths A-level exam.

We can model this with a Bernoulli trial with probability of success 0.9:

Sample space $S = \{F, P\}$,

Event space $\Sigma = \{\emptyset, \{F\}, \{P\}, S\}$,

Probability function $P(\{F\}) = 0.1$ and $P(\{P\}) = 0.9$.

The corresponding Bernoulli random variable is simply the random variable $Y : S \to \mathbb{R}$ defined by $Y(\{F\}) = 0$ and $Y(\{P\}) = 1$. Its probability distribution is given by $P(Y = 0) = 0.1$ and $P(Y = 1) = 0.9$.

The following two statements are equivalent:

1. T is a Bernoulli random variable with probability of success p.

2. The random variable T has Bernoulli distribution with probability of success p.


64.1

Fact 74. A Bernoulli random variable T with probability of success p has mean $p$ and variance $p(1 - p)$.

Proof.

$$E[T] = P(T = 0) \cdot 0 + P(T = 1) \cdot 1 = (1 - p) \cdot 0 + p \cdot 1 = p.$$

$$E[T^2] = P(T = 0) \cdot 0^2 + P(T = 1) \cdot 1^2 = (1 - p) \cdot 0 + p \cdot 1^2 = p.$$

Hence, $V[T] = E[T^2] - (E[T])^2 = p - p^2 = p(1 - p)$.


65

Informally, the binomial random variable simply counts the number of successes in a

sequence of n identical, but independent Bernoulli trials.

Example 554. Flip 3 fair coins. Let X be the number of heads.

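X is an example of a binomial random variable with parameters 3 and $\frac{1}{2}$.

X can take on values 0, 1, 2, or 3 (corresponding to the number of heads).

The probability distribution of X is given by:

$$P(X = 0) = \binom{3}{0} \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^3 = \frac{1}{8}, \qquad P(X = 1) = \binom{3}{1} \left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^2 = \frac{3}{8},$$

$$P(X = 2) = \binom{3}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^1 = \frac{3}{8}, \qquad P(X = 3) = \binom{3}{3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^0 = \frac{1}{8}.$$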

Formally:

Definition 121. Let $T_1, T_2, \dots, T_n$ be n identical, but independent Bernoulli random variables, each with probability of success p. Then the binomial random variable X with parameters n and p is defined as:

$$X = T_1 + T_2 + \dots + T_n.$$

The following three statements are entirely equivalent:

1. X is a binomial random variable with parameters n and p.

2. The random variable X has the binomial distribution with parameters n and p.

3. $X \sim B(n, p)$.


Let Y be the number of passes among two randomly-chosen students. Then Y is a binomial

random variable with parameters 2 and 0.9. Its probability distribution is given by:

$$P(Y = 0) = \binom{2}{0} 0.9^0 \, 0.1^2 = 0.01,$$

$$P(Y = 1) = \binom{2}{1} 0.9^1 \, 0.1^1 = 0.18,$$

$$P(Y = 2) = \binom{2}{2} 0.9^2 \, 0.1^0 = 0.81.$$

In words, the probability that both fail is 0.01, the probability that exactly one passes is

0.18, and the probability that both pass is 0.81.


65.1

Observe that $P(X = k)$ is simply the probability that in a sequence of n independent Bernoulli trials, each with probability of success p, there are exactly k successes.

First consider instead the probability that in a sequence of n trials, the first k trials are successes and the remaining $n - k$ are failures. We know that the probability of a success is $p$ and the probability of a failure is $1 - p$. Hence, by the Multiplication Principle, this probability is simply $p^k (1 - p)^{n-k}$.

The above is the probability of k successes and $n - k$ failures, but where exactly the first k trials are successes and exactly the last $n - k$ trials are failures. But we don't care about where the successes are. We only care that there are k successes. And there are $C(n, k)$ ways to have exactly k successes in n trials. Thus,

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$

In summary:

Fact 75. Let $X \sim B(n, p)$. Then for any $k = 0, 1, \dots, n$,

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$

Example 556. Let X be the number of heads when 10 fair coins are flipped.

Then $X \sim B(10, 0.5)$. And the probability that exactly 8 coins are heads is:

$$P(X = 8) = \binom{10}{8} 0.5^8 \, 0.5^2 = \frac{45}{1024}.$$

Let Y be the number of passes among 20 randomly-chosen students. Then $Y \sim B(20, 0.9)$. And the probability that at least 18 pass is

$$P(Y \ge 18) = P(Y = 18) + P(Y = 19) + P(Y = 20)$$

$$= \binom{20}{18} 0.9^{18} \, 0.1^2 + \binom{20}{19} 0.9^{19} \, 0.1^1 + \binom{20}{20} 0.9^{20} \, 0.1^0 \approx 0.677.$$
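The binomial probabilities above are easy to check by computer. Here is a minimal Python sketch of the formula for P(X = k), using only the standard library. The function name `binomial_pmf` and the variable names are my own labels, not notation from this textbook.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 8 heads among 10 fair coins.
p_8_heads = binomial_pmf(8, 10, 0.5)

# At least 18 passes among 20 students, each passing with probability 0.9.
p_at_least_18 = sum(binomial_pmf(k, 20, 0.9) for k in range(18, 21))

print(p_8_heads, p_at_least_18)
```

The first value should agree with 45/1024 and the second with 0.677 after rounding to three decimal places.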


65.2

Example 558. Problem: Three machines each have, independently, probability 0.3 of failure. What is the expected number of failures? What is the variance of the number of

failures?

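Solution: Let $Z \sim B(3, 0.3)$ be the number of failures. Then

$$P(Z = 1) = \binom{3}{1} 0.3^1 \, 0.7^2, \qquad P(Z = 2) = \binom{3}{2} 0.3^2 \, 0.7^1, \qquad P(Z = 3) = \binom{3}{3} 0.3^3 \, 0.7^0.$$

Hence,

$$E[Z] = P(Z = 1) \cdot 1 + P(Z = 2) \cdot 2 + P(Z = 3) \cdot 3$$

$$= \binom{3}{1} 0.3^1 \, 0.7^2 \cdot 1 + \binom{3}{2} 0.3^2 \, 0.7^1 \cdot 2 + \binom{3}{3} 0.3^3 \, 0.7^0 \cdot 3$$

$$= 0.441 + 0.378 + 0.081 = 0.9.$$

Now,

$$E[Z^2] = P(Z = 1) \cdot 1^2 + P(Z = 2) \cdot 2^2 + P(Z = 3) \cdot 3^2$$

$$= \binom{3}{1} 0.3^1 \, 0.7^2 \cdot 1^2 + \binom{3}{2} 0.3^2 \, 0.7^1 \cdot 2^2 + \binom{3}{3} 0.3^3 \, 0.7^0 \cdot 3^2$$

$$= 0.441 + 0.756 + 0.243 = 1.44.$$

Thus, $V[Z] = E[Z^2] - (E[Z])^2 = 1.44 - 0.9^2 = 0.63$. That is, the variance of the number of failures is 0.63.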

It turns out though that there is a much quicker formula for finding the mean and variance

of any binomial random variable.


Fact 76. If $X \sim B(n, p)$, then $E[X] = np$ and $V[X] = np(1 - p)$.

(You can verify that this formula works for the last example: n = 3, p = 0.3, and thus $E[Z] = np = 0.9$ and $V[Z] = np(1 - p) = 0.63$.)

Proof. Let $T_1, T_2, \dots, T_n$ be identical, but independent Bernoulli random variables with parameter p. Then $X = T_1 + T_2 + \dots + T_n$. Hence,

$$E[X] = E[T_1 + T_2 + \dots + T_n] = E[T_1] + E[T_2] + \dots + E[T_n] = p + p + \dots + p = np.$$

$$V[X] = V[T_1 + T_2 + \dots + T_n] = V[T_1] + V[T_2] + \dots + V[T_n]$$

$$= p(1 - p) + p(1 - p) + \dots + p(1 - p) = np(1 - p).$$

(Splitting the variance of the sum into a sum of variances uses the independence of $T_1, T_2, \dots, T_n$.)
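To see the shortcut formulas agree with the long-hand method, we can compute the mean and variance directly from the probability distribution and compare them with np and np(1 - p). The sketch below does this in Python. The helper name `binomial_mean_and_variance` is hypothetical, introduced here only for illustration.

```python
from math import comb


def binomial_mean_and_variance(n, p):
    """Compute E[X] and V[X] for X ~ B(n, p) directly from the distribution."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    second_moment = sum(k**2 * q for k, q in enumerate(pmf))
    # V[X] = E[X^2] - (E[X])^2
    return mean, second_moment - mean**2


mean_z, var_z = binomial_mean_and_variance(3, 0.3)
print(mean_z, 3 * 0.3)        # long-hand mean versus np
print(var_z, 3 * 0.3 * 0.7)   # long-hand variance versus np(1 - p)
```

Up to floating-point rounding, each printed pair should match.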

of which has probability 0.01 of failure. Plane engine #2 contains 35 components, each

of which has probability 0.005 of failure. The probability that any component fails is

independent of whether any other component has failed.

An engine fails if and only if at least 2 of its components fail. What is the probability that

both engines fail?


66

SYLLABUS ALERT

The Poisson distribution is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus.

So you can skip this chapter if you're taking 9758.

The Poisson process is the continuous-time analogue of the Bernoulli process.[68] And in parallel, the Poisson random variable is the limit of the binomial random variable.

Example 559. The long-term average number of murders per year in Singapore is 2.4.

How might we model the rate at which murders are committed in Singapore?

Let's assume that the rate at which murders are committed satisfies two properties:

1. (Time-homogeneity.) The probability that there are k murders in a time interval depends only on the length of that interval, and not on when the interval begins.

For example, the probability that there are 2 murders in the first 90 days of the year, is

the same as the probability that there are 2 murders in the last 90 days of the year. As

another example, the probability that there is 1 murder on January 10th is the same as the

probability that there is 1 murder on August 5th.

2. (Independence.) The probability that there is a murder at any given moment does

not depend on the number of murders that have already been committed that year.

For example, the probability that there is a murder in December does not depend on how

many murders were committed between January and November.

Then an appropriate model might be the Bernoulli process. Let us say that each month,

there is a murder with probability 2.4/12 = 0.2, and no murder with probability 0.8. The

number of murders each month may thus be modelled by a Bernoulli random variable T

with parameter 0.2.

By assumption, the number of murders in one month has no influence on the number of

murders in another month. Thus, the number of murders in a given year can be modelled

by the binomial random variable $X \sim B(12, 0.2)$. Equivalently,

$$X = T_{\text{Jan}} + T_{\text{Feb}} + \dots + T_{\text{Dec}}.$$


[68] The Poisson process is an infinite process that is beyond the scope of the A-levels, and is thus omitted from this book.


(Notice the number 0.2 was chosen so that $E[X] = np = 12 \times 0.2 = 2.4$ matches the long-term average number of murders per year.)

This model is reasonably good, but suffers from at least two flaws: It implicitly assumes

that

In any given month, there can be at most 1 murder; and

In any given year, there can be at most 12 murders.

These two implicit assumptions are somewhat unrealistic.

In the above model, what we did was to partition the year into 12 time intervals. If we

instead partitioned the year into 365 time intervals, the above two implicit assumptions

would be relaxed.

Let's say that each day, there is probability $2.4/365 \approx 0.00658$ of a murder and probability $1 - 2.4/365 \approx 0.99342$ of no murders. The number of murders each day may thus be modelled by a Bernoulli random variable U with parameter $2.4/365$.

Thus, the number of murders in a given year can be modelled by the binomial random variable $Y \sim B\left(365, \frac{2.4}{365}\right)$. Equivalently,

$$Y = U_1 + U_2 + \dots + U_{365}.$$

(Again, the number $\frac{2.4}{365}$ was deliberately chosen so that $E[Y] = np = 365 \times \frac{2.4}{365} = 2.4$ matches the long-term average number of murders per year.)

This second model $Y \sim B\left(365, \frac{2.4}{365}\right)$ is probably better than the first model $X \sim B(12, 0.2)$. But why stop at partitioning the year into 365 days?

In general, we can model the number of murders by the binomial random variable $Z \sim B\left(n, \frac{2.4}{n}\right)$. Taking the above reasoning to the extreme, we can instead partition the year into infinitely-many infinitely-short time intervals. That is, we can let $n \to \infty$. And as it turns out, as $n \to \infty$, the binomial random variable Z approaches something called the Poisson random variable with parameter 2.4. That is,

$$\lim_{n \to \infty} Z \sim \mathrm{Po}(2.4).$$
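We can watch this convergence happen numerically. The Python sketch below compares the binomial probabilities for n intervals with the Poisson probabilities for parameter 2.4, and reports the largest gap over k = 0, 1, ..., 12. It is an illustration under my own choices: the function names, the cut-off of 12, and the three values of n are not from the text, and the Poisson formula used is the standard one for P(Y = k).

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(Z = k) for Z ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson random variable with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)


def max_gap(n, lam=2.4, k_max=12):
    """Largest absolute difference between the two distributions for k <= k_max."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


gaps = {n: max_gap(n) for n in (12, 365, 100_000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n grows, which is what the limit statement leads us to expect.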


66.1

variable that satisfies the following two properties:

$\mathrm{Range}(Y) = \{0, 1, 2, 3, \dots\} = \mathbb{Z}_0^+$.

Y has probability distribution given by $P(Y = k) = \dfrac{\lambda^k e^{-\lambda}}{k!}$, for all $k \in \mathbb{Z}_0^+$.

The following result establishes that the limit of a binomial random variable is a Poisson random variable.[69]

Theorem 13. Let $\lambda > 0$. Let $X_n \sim B\left(n, \frac{\lambda}{n}\right)$. Let $Y = \lim_{n \to \infty} X_n$. Then Y is a Poisson random variable with parameter $\lambda$.

Proof. This proof is actually also not too difficult. It just involves some algebra and manipulation of limits. But as usual, I'll put it in the Appendices (on p. 995).

The following three statements are entirely equivalent:

1. Y is a Poisson random variable with parameter $\lambda$.

2. The random variable Y has the Poisson distribution with parameter $\lambda$.

3. $Y \sim \mathrm{Po}(\lambda)$.

[69] By the way, the Poisson random variable is a discrete random variable because although its range is not finite, its range is countably-infinite.


66.2

The Poisson random variable is typically used to model the number of occurrences or

arrivals of some phenomenon, within a given timespan or space. We already saw one

example where it could be deployed (murders in Singapore).

In general, the Poisson random variable is an appropriate model if:

1. (Time-homogeneity.) The rate of occurrences is constant.

2. (Independence.) The probability of occurrence is independent of when the last occurrence took place.

Example 560. Consider the number of goals scored in a given football match. Arguably,

an appropriate model for this number is a Poisson random variable, because arguably,

1. The rate of goals is constant.

2. The probability of a goal being scored in, say, the next 60 seconds is independent of

when the last goal was scored.

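Suppose that, on average, the number of goals scored in a football match is 2.3. We can model the number of goals scored with the Poisson random variable $X \sim \mathrm{Po}(\lambda = 2.3)$.

By definition of the Poisson random variable, the probability that 0 goals are scored is

$$P(X = 0) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{2.3^0 e^{-2.3}}{0!} = e^{-2.3} \approx 0.100.$$

The probability that exactly 3 goals are scored is

$$P(X = 3) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{2.3^3 e^{-2.3}}{3!} \approx 0.203.$$

And the probability that at most 2 goals are scored is

$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{2.3^0 e^{-2.3}}{0!} + \frac{2.3^1 e^{-2.3}}{1!} + \frac{2.3^2 e^{-2.3}}{2!} \approx 0.596.$$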
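These three probabilities can be reproduced with a short Python sketch of the Poisson probability formula. The function names `poisson_pmf` and `poisson_cdf` are my own labels for P(X = k) and P(X ≤ k).

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k), found by adding up P(X = 0), P(X = 1), ..., P(X = k)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


p_no_goals = poisson_pmf(0, 2.3)
p_three_goals = poisson_pmf(3, 2.3)
p_at_most_two = poisson_cdf(2, 2.3)
print(p_no_goals, p_three_goals, p_at_most_two)
```

Rounded to three decimal places, the printed values should agree with 0.100, 0.203, and 0.596.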


Example 561. Consider the number of public mass shootings in the US in a given year.

Arguably, an appropriate model for this number is a Poisson random variable, because

arguably,

1. The rate of public mass shootings in the US is constant.

2. The probability of a public mass shooting being committed in, say, the next three days

is independent of when the last public mass shooting was committed.

given millennium. Arguably, an appropriate model for this number is a Poisson random

variable, because arguably,

1. The rate of supernovae is constant.

2. The probability of a supernova being observed in, say, the next ten years is independent of when the last supernova was observed.

Now, no model is perfect. We do not know and may never know the exact processes

governing when a goal will be scored, a public mass shooting committed, or a supernova

observed. Nonetheless, we can argue that the Poisson random variable works reasonably

well as a model. We can make use of this model to analyse the phenomenon at hand.

If we choose not to use the Poisson random variable, then our alternatives are to:

Find some alternative model that works better than the Poisson random variable.

Shrug our shoulders and say that the phenomenon cannot be analysed mathematically.

The first alternative, if it exists, is great. The second is anti-intellectual and not very useful.


Exercise 239. For each of the following phenomena, make an argument for whether or not the Poisson random variable is a suitable model. (Answer on p. 1176.)

(a) The number of cats killed in northern Singapore, between 2011 and 2020.

(b) The number of errors in this textbook.

(c) The number of emails you receive, in a given 24-hour timespan.

Exercise 240. This exercise revisits Example 561. Suppose the number of public mass

shootings in the US in a given year can be modelled by X, a Poisson random variable

with parameter $\lambda = 4.2$. Compute the probability that there are more than 5 public mass

shootings in the US in a given year. (Answer on p. 1176.)

Exercise 241. This exercise revisits Example 562. Suppose the number of supernovae

observed in a millennium can be modelled by Y, a Poisson random variable with parameter $\lambda = 3.7$. Compute the probability that there are no supernovae observed in a given

millennium. (Answer on p. 1176.)


66.3

It turns out that interestingly (and conveniently) enough, the mean and variance of $X \sim \mathrm{Po}(\lambda)$ are both equal to $\lambda$.

Fact 77. Let $X \sim \mathrm{Po}(\lambda)$. Then $E[X] = \lambda$ and $V[X] = \lambda$.

Proof. The proof is actually not too difficult, given what we know about Maclaurin series. But as usual, I'll put it in the Appendices (p. 994).
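While the proof is in the Appendices, a numerical sanity check is quick to do. The Python sketch below adds up the first 100 terms of the sums defining the mean and the second moment, on the assumption that the terms beyond that are negligibly small for a modest parameter. The helper name `poisson_mean_and_variance` and the cut-off of 100 terms are choices made here, not part of the text.

```python
from math import exp, factorial


def poisson_mean_and_variance(lam, terms=100):
    """Approximate E[X] and V[X] for a Poisson random variable by truncated sums."""
    pmf = [lam**k * exp(-lam) / factorial(k) for k in range(terms)]
    mean = sum(k * q for k, q in enumerate(pmf))
    second_moment = sum(k**2 * q for k, q in enumerate(pmf))
    # V[X] = E[X^2] - (E[X])^2
    return mean, second_moment - mean**2


mean_x, var_x = poisson_mean_and_variance(2.3)
print(mean_x, var_x)
```

Both printed numbers should be very close to the parameter 2.3.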


66.4

Binomial Distribution

n

n

This implies that if $X \sim B(n, p)$, n is large enough, and p is small enough, then the random variable $Y \sim \mathrm{Po}(\lambda = np)$ serves as a good approximation for the random variable X.

Different writers give different (and somewhat-arbitrary) rules-of-thumb as to how large

n and how small p must be in order for the Poisson random variable to serve as a good

approximation. The rule-of-thumb well use in this textbook is this:

If $n \ge 30$ and $p \le 0.05$, then the Poisson distribution is a good approximation to the binomial distribution.

The following example illustrates why we might want to approximate the binomial distribution with the Poisson distribution.


Example 563. Problem: We have 300 machines, each of which has, independently, probability 0.02 of breaking down in any given month. What is the probability that at most 10

break down in a given month?

Let $X \sim B(300, 0.02)$ be the number of machines that break down in a given month. Then

$$P(X \le 10) = p_X(0) + p_X(1) + p_X(2) + \dots + p_X(10)$$

$$= \binom{300}{0} 0.02^0 \, 0.98^{300} + \binom{300}{1} 0.02^1 \, 0.98^{299} + \dots + \binom{300}{10} 0.02^{10} \, 0.98^{290}.$$

In the old days, it would have been tedious to compute the above probability. So one might

instead have preferred to use the Poisson approximation.

Now, since $n \ge 30$ and $p \le 0.05$, by our rule-of-thumb, $Y \sim \mathrm{Po}(np) = \mathrm{Po}(6)$ serves as a suitable approximation to $X \sim B(n = 300, p = 0.02)$. Thus,

$$P(X \le 10) \approx P(Y \le 10).$$

Now, it would have been easy to find $P(Y \le 10)$, because one would have had a print copy of a Poisson table, partly reproduced below. A Poisson table tells us what the value of $P(Y \le k)$ is, for various possible values of $\lambda$ and the number k, given that $Y \sim \mathrm{Po}(\lambda)$. (For the full table, see sheet titled "Poisson Table" at the usual link.)

Reading off the table, we have $P(Y \le 10) \approx 0.9574$. We thus conclude: The probability that at most 10 machines break down in a given month is approximately 0.9574.


You might wonder, "Well, weren't there similarly also binomial tables that one could read off of? If so, why then would we need to go through the trouble of approximating the binomial with the Poisson and then refer to the Poisson table? We could just directly refer to the binomial tables."

Now, observe that to print a Poisson table, we need only be concerned with the Poisson parameter $\lambda$ and the number k. That's a total of 2 parameters. So we can, in a single table, list a lot of information.

In contrast, to print a binomial table, we have two binomial parameters n and p, and in addition the number k. That's a total of 3 parameters. So a binomial table really involved multiple binomial tables, one for each value of n! (See this example.) Typically, the tables would end at some small-ish value of n (20 in the linked table). Whereas in this particular example, we would have needed the binomial tables all the way to n = 300!

And so, even though there were binomial tables, these were limited and would typically

not have furnished the desired information. This, then, was one big reason for using the

Poisson approximation, at least in the old days.

But today, it is no more difficult to compute $P(X \le 10)$ than it is to compute $P(Y \le 10)$. For example, using my spreadsheet titled "Binomial" (at the usual link), one can simply punch in n = 300 and p = 0.02 and read off that the exact solution to our problem $P(X \le 10)$ is approximately 0.9590.
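If you prefer code to spreadsheets, the exact binomial answer and the Poisson approximation can both be computed in a few lines. The Python sketch below is one way to do so. The function names `binomial_cdf` and `poisson_cdf` are hypothetical labels for P(X ≤ k) and P(Y ≤ k).

```python
from math import comb, exp, factorial


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), by summing the individual probabilities."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_cdf(k, lam):
    """P(Y <= k) for a Poisson random variable with parameter lam."""
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))


exact = binomial_cdf(10, 300, 0.02)       # the exact binomial answer
approx = poisson_cdf(10, 300 * 0.02)      # Poisson approximation with parameter np = 6
print(exact, approx)
```

The two values should round to 0.9590 and 0.9574 respectively, so the approximation is off by less than 0.002 in this example.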

Exercise 242. Suppose the number of deaths by lightning strikes in Singapore in a given year can be modelled by the random variable $X \sim B(5500000, 10^{-6})$. (Answer on p. 1177.)

(a) What is an appropriate interpretation of the numbers 5500000 and $10^{-6}$?

(b) Using a suitable approximation (and justify your use of this approximation), find the

probability that at least 5 people are killed by lightning strikes in Singapore in a given year.


66.5

Formally:

Theorem 14. Suppose $X \sim \mathrm{Po}(\lambda)$ and $Y \sim \mathrm{Po}(\mu)$ are independent Poisson random variables. Then $X + Y \sim \mathrm{Po}(\lambda + \mu)$.

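Proof. (Optional.) We'll prove that the probability distribution of X + Y is that of the Poisson random variable with parameter $\lambda + \mu$.

$$
\begin{aligned}
P(X + Y = k) &= \sum_{i=0}^{k} P(X + Y = k, X = i) \\
&= \sum_{i=0}^{k} P(Y = k - i, X = i) = \sum_{i=0}^{k} P(Y = k - i) \, P(X = i) \\
&= \sum_{i=0}^{k} p_Y(k - i) \, p_X(i) = \sum_{i=0}^{k} \frac{\mu^{k-i} e^{-\mu}}{(k - i)!} \cdot \frac{\lambda^{i} e^{-\lambda}}{i!} \\
&= e^{-(\lambda + \mu)} \sum_{i=0}^{k} \frac{1}{(k - i)! \, i!} \mu^{k-i} \lambda^{i} \\
&= \frac{e^{-(\lambda + \mu)}}{k!} \sum_{i=0}^{k} \frac{k!}{(k - i)! \, i!} \mu^{k-i} \lambda^{i} \\
&= \frac{e^{-(\lambda + \mu)}}{k!} \sum_{i=0}^{k} \binom{k}{i} \mu^{k-i} \lambda^{i} \overset{2}{=} \frac{e^{-(\lambda + \mu)}}{k!} (\lambda + \mu)^{k},
\end{aligned}
$$

where the factorisation $P(Y = k - i, X = i) = P(Y = k - i) \, P(X = i)$ uses the independence of X and Y, and $\overset{2}{=}$ uses the binomial theorem.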
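The key step in the proof is to condition on the value of X and add up the products P(Y = k - i) P(X = i). The Python sketch below carries out that sum numerically and compares it with the Poisson probability for the combined parameter. The parameter values 1.02 and 0.84, and the names `lam`, `mu` and `sum_pmf`, are arbitrary choices made for this illustration.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)


def sum_pmf(k, lam, mu):
    """P(X + Y = k) for independent Poisson X and Y, by conditioning on X = i."""
    return sum(poisson_pmf(k - i, mu) * poisson_pmf(i, lam) for i in range(k + 1))


lam, mu = 1.02, 0.84  # illustrative parameters for X and Y
max_gap = max(
    abs(sum_pmf(k, lam, mu) - poisson_pmf(k, lam + mu)) for k in range(20)
)
print(max_gap)
```

The printed gap should be zero up to floating-point rounding, consistent with the sum being Poisson with the two parameters added together.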


Example 564. Problem: There are 34 machines in Room A and 42 in Room B. In any

given month, each machine in Room A has, independently, probability 0.03 of breaking

down; and each machine in Room B has, independently, probability 0.02 of breaking down.

Using a suitable approximation, find the probability that in a given month, more than 2

machines (in total across the two rooms) break down.

Let $A \sim B(34, 0.03)$ and $B \sim B(42, 0.02)$. We could directly solve this problem by finding $P(A + B > 2)$.

It is however quicker if we use the Poisson distribution as an approximation. (We have a simple formula for the sum of two independent Poisson random variables. In contrast, there is no similarly simple formula for the sum of two independent binomial random variables.)

A suitable approximation for A is $X \sim \mathrm{Po}(34 \times 0.03) = \mathrm{Po}(1.02)$. A suitable approximation for B is $Y \sim \mathrm{Po}(42 \times 0.02) = \mathrm{Po}(0.84)$. Thus, a suitable approximation for A + B is $X + Y \sim \mathrm{Po}(1.02 + 0.84) = \mathrm{Po}(1.86)$. Hence,

$$P(A + B > 2) \approx P(X + Y > 2) = 1 - P(X + Y \le 2) \approx 0.285.$$
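For comparison, the direct route is not hard with a computer: we list every way the two rooms can have at most 2 breakdowns in total and add up the probabilities. The Python sketch below computes both the Poisson approximation and this direct answer. The function and variable names are my own, and the direct calculation relies on A and B being independent, as the problem states.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_cdf(k, lam):
    """P(Y <= k) for a Poisson random variable with parameter lam."""
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))


# Poisson approximation: the sum is treated as Poisson with parameter 1.02 + 0.84.
approx = 1 - poisson_cdf(2, 34 * 0.03 + 42 * 0.02)

# Direct calculation: add P(A = a) P(B = b) over all pairs with a + b <= 2.
at_most_2 = sum(
    binomial_pmf(a, 34, 0.03) * binomial_pmf(b, 42, 0.02)
    for a in range(3)
    for b in range(3 - a)
)
exact = 1 - at_most_2
print(approx, exact)
```

The approximation should round to 0.285, and the direct answer should differ from it only slightly.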

is 30,000,000. In any given year, each Singaporean has, independently, probability $10^{-6}$ of being killed by a lightning strike; and each Malaysian has, independently, probability $10^{-7}$ of suffering the same fate.

Using a suitable approximation, find the probability that in any given year, at least 10

people are killed by lightning strikes in Singapore and Malaysia, combined.


The sum of two independent Poisson random variables is itself a Poisson random variable.

We might thus wonder, "Is the difference of two independent Poisson random variables also a Poisson random variable?" Unfortunately, the answer is no.

A trivial reason for this is that the difference of two independent Poisson random variables can take on negative values. In contrast, a Poisson random variable never takes on negative values. To illustrate:

Example 564 (continued from above). Reproduced from above: There are 34 machines in Room A and 42 in Room B. In any given month, each machine in Room A

has, independently, probability 0.03 of breaking down; and each machine in Room B has,

independently, probability 0.02 of breaking down.

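Let $A \sim B(34, 0.03)$ and $B \sim B(42, 0.02)$, with the independent Poisson approximations $X \sim \mathrm{Po}(1.02)$ and $Y \sim \mathrm{Po}(0.84)$ as before. We now show that $Y - X$ is not a Poisson random variable.

The ranges of X and Y are both $\mathbb{Z}_0^+ = \{0, 1, 2, \dots\}$. Thus, the range of $Y - X$ is $\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$. By the definition of a Poisson random variable then (Definition 122), whose range must be $\mathbb{Z}_0^+$, $Y - X$ cannot possibly be a Poisson random variable.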


67

So far, all examples of random variables we've seen have been discrete. For example, the binomial random variable $X \sim B(n, p)$ is discrete, because $\mathrm{Range}(X) = \{0, 1, 2, \dots, n\}$ is finite.

We'll now look at continuous random variables. Informally, a random variable Y is continuous if its range takes on a continuum of values.

For H2 Maths, you need only learn about one continuous random variable: the normal random variable (subject of the next chapter).

Nonetheless, we'll first look at another continuous random variable that is not in the syllabus. This is the continuous uniform random variable. It is much simpler than the normal random variable and can thus help build up your intuition of how continuous random variables work.


67.1

A line measuring exactly 1 metre in length is drawn on the floor. It is about to rain. Let

X be the position of the first rain-drop that hits the line. X is measured as the distance

(in metres) from the left-most point of the line.

So for example, if the first rain-drop hits the left-most point of the line, then x = 0. If it

hits the exact midpoint of the line, then x = 0.5. And if it hits the right-most point, then

x = 1.

Assume we can measure X to infinite precision.

Then, assuming the first rain-drop is equally likely to hit any point of the line, we can

model X as a continuous uniform random variable on [0, 1]. This says that

The range of X is [0, 1] (the first rain-drop can hit any point along the line); and

X is equally likely to take on any value in the interval [0, 1] (the first rain-drop is equally

likely to hit any point along the line).

The following three sta
