How To Solve Mathematical Problems

Wayne A. Wickelgren
How to Solve Problems
ELEMENTS OF A THEORY OF PROBLEMS
AND PROBLEM SOLVING
Wayne A. Wickelgren
UNIVERSITY OF OREGON

W. H. FREEMAN AND COMPANY
San Francisco
The Dover reprint added "Mathematical" to the title.
Library of Congress Cataloging in Publication Data
Wickelgren, Wayne A 1938-
How to solve problems.
Bibliography: p.
1. Mathematics-Problems, exercises, etc.
2. Problem solving. I. Title.
QA43.W52 511 73-15787
ISBN 0-7167-0846-9
ISBN 0-7167-0845-0 (pbk.)
Copyright © 1974 by W. H. Freeman and Company
No part of this book may be reproduced by any mechanical,
photographic, or electronic process, or in the form of
a phonographic recording, nor may it be stored in a retrieval
system, transmitted, or otherwise copied for public or
private use without written permission of the publisher.
Printed in the United States of America
1 2 3 4 5 6 7 8 9
For as long as I can remember, I have been more
interested in reflecting on what I was doing or
thinking and in thinking about ways to improve my
methods than I have been in the particular things
I was doing or thinking about. This emphasis on
self-analysis and improvement reflects the influence
of my mother and father, Alma and Herman Wickelgren,
to whom this book is dedicated and whose values and
practical principles have contributed so much
to my life.
Contents
Preface
1 Introduction
2 Problem Theory
3 Inference
4 Classification of Action Sequences
5 State Evaluation and Hill Climbing
6 Subgoals
7 Contradiction
8 Working Backward
9 Relations Between Problems
10 Topics in Mathematical Representation
11 Problems from Mathematics, Science, and Engineering
References
Index
Preface
In the mathematics and science courses I took in college, I was
enormously irritated by the hundreds of hours that I wasted staring at
problems without any good idea about what approach to try next in
attempting to solve them. I thought at the time that there was no
educational value in those "blank" minutes, and I see no value in them
today. The general problem-solving methods described in this book
virtually guarantee that you will never again have a blank mind in such
circumstances. They should also help you solve many more problems
and solve them faster. But whether or not you solve any particular
problem, you will always have lots of ideas about ways to attack the
problem. Also, the use of general problem-solving methods often
indicates the properties of the principles you need to know from the
subject matter that the problem is attempting to teach and test. Thus,
whether you succeed or fail in solving any particular problem, the
effort will be interesting and educational.
The theoretical and practical analyses of problems and problem
solving presented here were heavily influenced by advances made
over the last years in the fields of artificial intelligence and
computer simulation of thought. My greatest intellectual debts are to
Allen Newell, Herbert Simon, and George Polya. Newell and Simon's
analyses of problems and problem solving constituted my starting point
for working in this area, and many of the best ideas in the book are
ideas they have already presented in one form or another. Many other
good ideas were taken more or less directly from Polya, whose books
on mathematical problem solving are a rich source of methods and a
stimulus for thought.
My efforts to understand and organize problem-solving methods
began in 1959 when, as an undergraduate at Harvard, I first became
aware of the pioneering work of Allen Newell, Cliff Shaw, and Herb
Simon on the computer simulation of thinking. During graduate school
at the University of California, Berkeley, I regarded problem solving
as my major research area. I do not think that my experimental studies
of human problem solving ever amounted to much. However, I thought
at the time (and think today) that my theoretical (mathematical)
understanding of problems and problem solving was immeasurably
increased and that this greatly enhanced my ability to solve all kinds
of mathematical problems. Shortly after coming to MIT as a new
faculty member in the Psychology Department, I decided that one
contribution I could make to the undergraduates there was to teach
them this newly acquired skill of mathematical problem solving. The
students enjoyed the course and, more important, reported back to
me in later years that they thought that their problem-solving ability
in mathematics, science, and engineering courses had been greatly
increased by learning these general problem-solving methods.
Enrollment in the course went from to in three years, when I
stopped giving it because my primary research interest had shifted
to human memory. Some years later, after moving to the University
of Oregon, I decided that I now had the time to write a book containing
all the ideas that I had acquired from others and generated myself
concerning problems and problem solving.
The purpose of the book is to improve your ability to solve all kinds
of mathematical problems, whether in mathematics, science,
engineering, business, or purely recreational mathematical problems
(puzzles, games, and so on). This book is primarily intended for
college students who are currently taking elementary mathematics,
science, or engineering courses. However, I hope that students with
less mathematical background can read the book and master the
methods without an undue degree of additional effort and also that
more advanced readers will profit from it without being bored. I
believe that almost everyone who solves mathematical problems can
profit substantially from learning the general problem-solving methods
described here, and I have tried to write in a way that will
communicate effectively to all such people. The approach is to define each
general problem-solving method and illustrate its application to simple
recreational mathematics problems that require no more mathematical
background than that possessed by someone with a year of high school
algebra and a year of plane geometry. An elementary knowledge of
"new mathematics" (sets, relations, functions, probability, and so
on) would be helpful, and some of this is briefly taught in Chapter
The solutions to example problems are presented gradually, usually
in the form of hints to give the reader more and more chances to go
back and solve the problem. This technique is founded on the belief
that you will remember best what you discover for yourself. The book
aims to guide you to discovering how to apply general problem-solving
methods to a rich variety of problems. I believe that if you read this
book and try to apply the methods to around or of your own
problems, you will improve substantially in problem-solving ability,
with consequent benefits in job performance, school grades, and
"intelligence" test scores (including SAT college entrance exams and
the Graduate Record Exam).
Finally, I would like to make a negative acknowledgment. This book
was written in spite of my four-year-old son, Abraham, and my
six-year-old daughter, Ingrid, who are such delightful people that I cannot
resist spending vast amounts of time with them.
October 1973 Wayne A. Wickelgren
How to Solve Problems
1
Introduction
The purpose of this book is to help you improve your ability to solve
mathematical, scientific, and engineering problems. With this in mind,
I will describe certain elementary concepts and principles of the theory
of problems and problem solving, something we have learned a great
deal about since the 1950s, when the advent of computers made
possible research on artificial intelligence and computer simulation of
human problem solving. I have tried to organize the discussion of
these ideas in a simple, logical way that will help you understand,
remember, and apply them.
You should be warned, however, that the theory of problem solving
is far from being precise enough at present to provide simple cookbook
instructions for solving most problems. Partly for this reason and partly
for reasons of intrinsic merit, teaching by example is the primary
approach used in this book. First, a problem-solving method will be
discussed theoretically, then it will be applied to a variety of problems,
so that you may see how to use the method in actual practice.
To master these methods, it is essential to work through the examples
of their application to a variety of problems. Thus, much of the book
is devoted to analyzing problems that exemplify the use of different
methods. You should pay careful attention to these problems and
should not be discouraged if you do not perfectly understand the
theoretical discussions. The theory of problem solving will undoubtedly
help those students with sufficient mathematical background to
understand it, but students who lack such a background can compensate by
spending greater time on the examples.
SCOPE OF THE BOOK
This book is primarily a practical guide to how to solve a certain class
of problems, specifically, what I call formal problems or just
"problems" (with the adjective formal being understood in later contexts).
Formal problems include all mathematical problems of either the "to
find" or the "to prove" character but do not include problems of
defining "mathematically interesting" axiom systems. A student taking
mathematics courses will hardly be aware of the practical significance
of this exclusion, since defining interesting axiom systems is a
problem not typically encountered except in certain areas of basic research
in mathematics. Similarly, the problem of constructing a new
mathematical theory in any field of science is not a formal problem, as I use
the term, and I will not discuss it in this book. However, any other
mathematical problem that comes up in any field of science,
engineering, or mathematics is a formal problem in the sense of this book.
Problems such as what you should eat for breakfast, whether you
should marry X or Y, whether you should drop out of school, or how
you can get yourself to spend more time studying are not formal
problems. These problems are virtually impossible at the present time to
turn into formal problems because we have no good ways of
restricting our thinking to a specified set of given information and operations
(courses of action we might take), nor do we often even know how to
specify precisely what our goals are in solving these problems.
Understanding formal problems can undoubtedly make some contributions to
your thinking in regard to these poorly specified personal problems,
but the scope of the present book does not include such problems.
Even if it did, it would be extremely difficult to specify any precise
methods for solving them.
However, formal problems include a large class of practical problems
that people might encounter in the real world, although they usually
encounter them as games or puzzles presented by friends or appearing
in magazines. A practical problem such as how to build a bridge across
a river is a formal problem if, in solving the problem, one is limited to
some specified set of materials (givens), operations, and, of course,
the goal of getting the bridge built.
In actuality, you might limit yourself in this way for a while and,
if no solution emerged, decide to consider the use of some additional
materials, if possible. Expanding the set of given materials (by means
other than the use of acceptable operations) is not a part of formal
problem solving, but often the situation presents certain givens in sufficiently
disguised or implicit form that recognition of all the givens is an
important part of skill in formal problem solving. That skill will be
discussed later.
Practical problems or puzzles of the type we will consider differ
from problems in mathematics, science, or engineering in that to pose
them requires less background information and training. Thus, puzzle
problems are especially suitable as examples of problem-solving
methods in this book, because they communicate the workings of the
methods most easily to the widest range of readers. For this reason,
puzzle problems will constitute a large proportion of the examples
used in this book, at least prior to the last chapter.
In principle, it might seem that most important problem-solving
methods would be unique to each specialized area of mathematics,
science, or engineering, but this is probably not the case. There are
many extremely general problem-solving methods, though, to be sure,
there are also special methods that can be of use in only a limited range
of fields.
It may be quite difficult to learn the special methods and knowledge
required in a particular field, but at least such methods and
knowledge are the specific object of instruction in courses. By contrast,
general problem-solving methods are rarely, if ever, taught, though they
are quite helpful in solving problems in every field of mathematics,
science, and engineering.
GENERAL VERSUS SPECIAL METHODS
The relation between specific knowledge and methods, on the one
hand, and general problem-solving methods, on the other hand,
appears to be as follows. When you understand the relevant material
and specific methods quite well and already have considerable
experience in applying this knowledge to similar problems, then in
solving a new problem you use the same specific methods you used before.
Considering the methods used in similar problems is a general
problem-solving technique. However, in cases where it is obvious that a
particular problem is a member of a class of problems you have solved
before, you do not need to make explicit, conscious use of the method:
simply go ahead and solve the problem, using methods that you have
learned to apply to this class. Once you have this level of
understanding of the relevant material, general problem-solving methods are
of little value in solving the vast majority of homework and
examination problems for mathematics, science, and engineering courses.
When problems are more complicated, in the sense of involving
more component steps, and are not highly similar to previously solved
problems, the use of general problem-solving methods can be a
substantial aid in solution. However, such complex problems will be
encountered only rarely by the beginning mathematics, science, and
engineering students taking courses in high school and college. More
important to the immediate needs of such students is the role of
general problem-solving methods in simple homework and examination
problems where one does not completely understand the relevant
material and does not have considerable experience in solving the
relevant class of problems. In such cases, general problem-solving
methods serve to guide the student to recognize what relevant
background information needs to be understood. For example, when one
understands the general problem-solving method of setting subgoals,
one can often set particular subgoals that directly indicate what types
of specific information are being tested (and thereby taught) by a
particular problem. One then knows what sections of the textbook to
reread in order to understand the relevant material.
If, however, the book is not available, as in many examination
situations, general problem-solving methods provide one with powerful
general methods for retrieving from memory the relevant background
information. For example, the use of general problem-solving methods
can indicate for which quantities one needs a formula and can provide
a basis for choosing among different alternative formulas. Frequently,
a student may know all the definitions, formulas, and so on, but not
have strong associations to this knowledge from the cues present in
each type of problem to which this knowledge is relevant.
With experience in solving a variety of problems to which the
knowledge is relevant, one will develop strong direct associations
between the cues in such problems and this relevant knowledge.
However, in the early stages of learning the material, a student will lack
such direct associations and will need to use general problem-solving
methods to indicate where in one's memory to retrieve relevant
information or where in the book to look it up. Assuming this idea is
true (and this book aims to convince you it is), mastering general
problem-solving methods is important to you both so you can use
problems as a learning device and so you can achieve the maximum range
of applicability of the knowledge you have stored in mind, whether on
an examination, on a job, or whatever.
The goal of this book is to teach as many of these general
problem-solving methods as I know about, so that if you spend the time to
master these methods you can more effectively learn the subject matter
of your courses. Also, since the ability to use the information given
in most mathematics, science, and engineering courses is often
primarily the ability to solve problems in these fields, the book aims to
increase this ability to use knowledge.
RELATION TO ARTIFICIAL INTELLIGENCE
It should be emphasized that this text is primarily a practical
how-to-do-it book in a field where the level of precise (mathematical)
formulation is far below what I am sure it will be in the future, perhaps even
the near future. Artificial intelligence and computer simulation of
human problem solving are currently very active fields of research, and
results from some of this work have heavily influenced this book.
However, theoretical formulations of problem solving superior to
those we currently have will eventually make the present formulation
outdated. Nevertheless, the methods described in the present book,
however imperfectly, can be of substantial benefit to any student who
masters them. When someone has a beautiful mathematical theory of
problems and problem solving sometime in the future, then clearer
and more effective how-to-do-it books can be written. Meanwhile,
it is my hope that this book will help many people to solve problems
better than they did before.
APPLYING METHODS TO PROBLEMS
As discussed previously, to master the problem-solving methods
described in this book, it is necessary to study the example problems
illustrating their use. The problems and solutions analyzed in Chapters
3 to 10 illustrate the use of the methods discussed in the particular
chapter. Chapter 11 considers a variety of homework and examination
problems for mathematics, science, and engineering courses. Of
course, you probably have lots of your own problems to solve in school
or work, and you should begin using the methods on these problems
immediately. Merely reading this book provides only the beginning
concepts necessary to mastering general problem-solving methods.
Practice in using the methods is essential to achieving a high level
of skill.
Everyone who solves problems uses many or all of the methods
described in this book, but if you are not an extremely good problem
solver, you may be using the methods less effectively or more
haphazardly than you could be by more explicit training in the methods.
At first, the application of such explicitly taught problem-solving
methods involves a rather slow, conscious analysis of each problem.
There is no particular reason to engage in this careful, conscious
analysis of a problem when you can immediately get some good ideas
on how to solve it. Just go ahead and solve the problem "naturally."
However, after you solve it or, even better, while you are solving it,
analyze what you are doing. It will greatly deepen your understanding
of problem-solving methods, and you might discover new methods or
a new application of an old method.
As you get extensive practice in using these problem-solving
methods you should become so skilled in their use that the process
becomes less conscious and more automatic or natural. This is the way
of all skill learning, whether driving a car, playing tennis, or solving
mathematical problems.
2
Problem Theory
FOUR SAMPLE PROBLEMS
To illustrate the concepts involved in the theory of problems described
in this chapter, we will begin with four sample problems.
Instant Insanity
Instant Insanity is the name of a popular puzzle consisting of four
small cubes. Each face of every cube has one of four colors: red (R),
blue (B), green (G), or white (W). Each cube has at least one of its
six faces with each of the four different colors, but the remaining two
faces necessarily must repeat one or two of the colors already used.
The exact configurations of colors on the faces of the cubes are
shown in Fig. 2-1. The faces of the cubes in the figure have been cut
along the edges and flattened out for easy presentation on the
two-dimensional page. (To reconstruct the cube in three dimensions, one
would simply cut out the outlined figure, turn the top flap over on the
top and the bottom over on the bottom, and wrap the left side and back
around to join up with the right side at the rear of the cube.) For
convenience, the faces of one cube in the figure have been labeled front,
top, bottom, back, left side, right side. If you think of the front of the
cube as being closest to you (facing you), then mentally constructing the
cube from the two-dimensional drawing should be relatively easy to
do. However, you may wish to buy the puzzle to provide a more
concrete and enjoyable representation.
FIGURE 2-1
The six colored faces of each of the four cubes in the
Instant Insanity puzzle. You could cut out each of
the above figures and fold along the edges to make
cubes. In the above figure R = red, B = blue, W =
white, and G = green. The cubes have been given
these "names": 2B2G, 2G2W, 2R2W, and 3R, which
indicate the colored faces of which they have more
than one.
The goal of the puzzle is to arrange the cubes one on top of the
other in such a way that they form a stack four cubes high, with each
of the four sides having exactly one red cube, one blue cube, one
green cube, and one white cube.
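The goal condition above can be checked mechanically. The following is a minimal sketch, assuming each cube in the stack is described only by the four colors it shows on the front, right, back, and left of the stack; the function name `is_solved` and the sample orientations are hypothetical and are not the actual cubes of Fig. 2-1.

```python
from typing import Sequence

COLORS = {"R", "B", "G", "W"}  # red, blue, green, white

def is_solved(stack: Sequence[Sequence[str]]) -> bool:
    """Return True if every side of the four-cube stack shows all four colors.

    stack[i] holds the colors cube i shows on the four vertical sides of the
    stack, in the order front, right, back, left (an assumed encoding).
    """
    if len(stack) != 4:
        return False
    for side in range(4):
        showing = {cube[side] for cube in stack}
        if showing != COLORS:
            return False
    return True

# Hypothetical orientations, chosen only to exercise the check.
good = ["RBGW", "BGWR", "GWRB", "WRBG"]
bad = ["RBGW", "RGWB", "GWRB", "WRBG"]  # front side shows red twice

print(is_solved(good))
print(is_solved(bad))
```

Note that the sketch tests only whether a proposed arrangement meets the goal; finding such an arrangement is the actual puzzle.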
Chess Problem
From the board configuration shown in Fig. 2-2, describe a sequence of
moves such that white can achieve mate in five moves.
Find Problem from Mechanics
What constant force will cause a mass of 3 kilograms to achieve a
speed of 30 meters per second in 6 seconds, starting from rest?
Proof Problem from Modern Algebra
You are given a mathematical system consisting of a set of elements
(A, B, C), with two binary operations (call them addition and
multiplication) that combine two elements to give a third element. The system
has the following properties: (1) Addition and multiplication are
closed; that is, A + B and A · B are members of the original set for all
A and B in the set. (2) Multiplication is commutative; that is, AB equals
BA for all A and B in the set. (3) Equals added to equals are equal;
that is, if A = A' and B = B', then A + B = A' + B', for all A, B, A', B'
in the set. (4) The left distributive law applies; that is, C(A + B) =
CA + CB, for all A, B, C in the set. (5) The transitive law also applies;
that is, if A = B and B = C, then A = C. From these given assumptions,
you are to prove the right distributive law, that is, that (A + B)C =
AC + BC, for all A, B, C in the set.
FIGURE 2-2
A famous chess problem. White to achieve mate in five moves.
WHAT IS A PROBLEM?
All the formal problems of concern to us can be considered to be
composed of three types of information: information concerning givens
(given expressions), information concerning operations that transform
one or more expressions into one or more new expressions, and
information concerning goals (goal expressions). There may be
intermediate subgoal expressions mentioned explicitly in the problem, or
the problem solver may define these subgoal expressions for himself;
but we will assume that there is only one terminal goal per problem.
Any problem stated with two or more independent terminal goals
could always be viewed as two or more problems with the same givens
and operations and different goals.
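The three-part description of a problem can be written down as a small data structure. This is a minimal sketch under stated assumptions: expressions are modeled as plain strings, a world state is a set of expressions, and the names `Problem`, `solved_by`, and `modus_ponens` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence, Tuple

Expression = str                  # assumption: expressions are plain strings
State = FrozenSet[Expression]     # the expressions currently in the problem's world
Operation = Callable[[State], State]

@dataclass(frozen=True)
class Problem:
    givens: State                       # expressions present at the outset
    operations: Tuple[Operation, ...]   # allowed transformations
    goal: Callable[[State], bool]       # test for the single terminal goal

def solved_by(problem: Problem, sequence: Sequence[Operation]) -> bool:
    """Apply a sequence of operations to the givens and test the goal."""
    state = problem.givens
    for operation in sequence:
        state = operation(state)
    return problem.goal(state)

def modus_ponens(state: State) -> State:
    """Toy rule of inference: from 'P' and 'P implies Q', add 'Q'."""
    derived = set(state)
    for expression in state:
        if " implies " in expression:
            premise, conclusion = expression.split(" implies ", 1)
            if premise in state:
                derived.add(conclusion)
    return frozenset(derived)

toy = Problem(
    givens=frozenset({"A", "A implies B"}),
    operations=(modus_ponens,),
    goal=lambda state: "B" in state,
)

print(solved_by(toy, [modus_ponens]))
print(solved_by(toy, []))
```

A problem with two independent terminal goals would, in this sketch, simply be two `Problem` values sharing the same givens and operations.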
For convenience and accuracy, I tend to take the more formal view
that a problem involves expressions of information rather than actual
physical objects. Even in a practical problem stated in terms of physical
objects, it is always possible to consider objects or sets of properties
of objects as represented by expressions. Indeed, we must have
representations in our heads of objects, properties of objects, and
operations when we solve practical problems, since we certainly do not
have the real objects there. Thus, definitions of problems, solutions,
and methods need not make any distinction between practical
(concrete) and symbolic (abstract, mathematical). However, when dealing
with a practical problem, there is no need to talk of representations
or expressions, if the problem is more easily solved without using
this more abstract language.
Givens
Givens refer to the set of expressions that we accept as being present
in the world of the problem at the onset of work on the problem.
Indeed, the givens and the operations together constitute the entire world
of the problem at the beginning of work on it. This definition of the
givens encompasses expressions representing objects, things, pieces
of material, and so on, as well as expressions representing
assumptions, definitions, axioms, postulates, facts, and the like.
In some kinds of puzzles the givens consist of the materials. For
example, the givens in Instant Insanity are four cubes, with each side
of each cube having one of four colors (red, blue, green, or white), as
shown in Fig. 2-1.
In the chess problem, the givens are the pieces of each player and
their positions on the board plus the information concerning whose
move it is. In the particular chess problem shown in Fig. 2-2, the
givens are that white has a king, a rook, and a pawn at the positions
indicated; that black has a king, a bishop, and two pawns at the
positions indicated; and that it is white's move. The implicitly specified
given information consists of all the rules of chess, including such
information as that a rook can move any number of squares along a row
or column until blocked by another piece, that a king can move one
square in any direction (horizontally, vertically, or diagonally), and that
checkmate consists of putting the opponent's king into a position
where it would be captured on the next move if it were not moved out
of the square it is in and such that all squares that the king could
move to would also result in capture.
In the find problem, the givens are the information explicitly stated
in the problem plus whatever other mathematical or scientific
knowledge is to be implicitly assumed as part of the givens. In the physics
problem described above, the explicitly described given information
includes the following: the mass of the given object is 3 kilograms, its
initial speed is zero, its final speed after 6 seconds of applying a force
is 30 meters per second, and the force and mass are constant.
Implicitly specified information includes Newton's second law, that force
equals mass times acceleration, and the rules of algebra and possibly
calculus (depending upon how one solves the problem).
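As an illustration of how the explicit and implicit givens of the mechanics problem combine, here is a minimal sketch that applies Newton's second law under the stated assumption of constant force and mass, so that the acceleration is simply the change in speed divided by the elapsed time. The function and variable names are hypothetical.

```python
def constant_force(mass_kg: float, initial_speed: float,
                   final_speed: float, duration_s: float) -> float:
    """Force in newtons that takes the mass from initial to final speed.

    Assumes a constant force, so acceleration = (v1 - v0) / t,
    and then applies F = m * a (Newton's second law).
    """
    acceleration = (final_speed - initial_speed) / duration_s  # m/s^2
    return mass_kg * acceleration

# Explicit givens from the problem statement.
force_n = constant_force(mass_kg=3.0, initial_speed=0.0,
                         final_speed=30.0, duration_s=6.0)
print(force_n)  # 15.0
```

The algebraic route is used here; one could equally integrate a constant acceleration over time, which is the sense in which calculus is only "possibly" among the givens.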
In a mathematical proof problem, the givens are all the axioms that
one is allowed to assume. The givens in the particular proof problem
described above are three of the five assumptions: (1) that the system
is closed, (2) that multiplication is commutative, and (4) that the left
distributive law holds. Assumptions (3), that equals added to equals
are equal, and (5), that the transitive law holds (that is, if A = B and
B = C, then A = C) are really rules of inference rather than givens.
Rules of inference are operations, discussed below.
Operations
Operations refer to the actions you are allowed to perform on the
givens or on expressions derived from the givens by some previous
sequence of actions. Other terms for operations include
transformations and rules of inference, though the latter term seems to be
appropriate only for conclusion-drawing problems and not so appropriate
for action-oriented problems.
In Instant Insanity, the allowable operations can be conceptualized
in a variety of equivalent ways, the simplest of which is just that cubes
can be placed on top of one another in a single tower (such that all
faces of all cubes are either parallel or perpendicular to one another).
In a chess problem, the allowable operations are given by the
allowable moves of each piece on the board of the player whose turn it is
to move. In a find problem, the operations are sometimes peculiar
to the problem but are often the operations (or rules of inference) of
mathematics or logic. In the mechanics problem described at the
beginning of this chapter, multiplying or dividing both sides of an
equation by the same quantity is an allowable operation.
In a proof problem, the operations are those rules of inference that
are allowable within the mathematical system in question. For example,
in propositional logic, if proposition A is true and if the statement
"A implies B" is true, then one may infer that proposition B is true.
In the modern-algebra proof problem described at the beginning of
the chapter, the two rules of inference that constitute the allowable
operations in this problem are property (3), that if A = A' and B = B',
then A + B = A' + B', and property (5), that if A = B and B = C, then
A = C. Note that these operations take two input expressions and
produce a single new output expression. Also note that, although
addition and multiplication are certainly operations within the
mathematical system described in the proof problem, multiplication and
addition are not the operations to be used in solving the problem.
Something that is an operation in one problem may be only a part of
the given expressions in another problem.
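To make the roles of givens and operations concrete, here is one possible chain of steps for the modern-algebra proof problem, written as a worked derivation. It is a sketch of one route only, using givens (1), (2), and (4) as sources of expressions and rules (3) and (5) as the operations; it is not necessarily the route the book itself develops.

```latex
\begin{align*}
(A+B)C &= C(A+B)   && \text{given (2), with $A+B$ in the set by (1)} \\
C(A+B) &= CA + CB  && \text{given (4)} \\
(A+B)C &= CA + CB  && \text{rule (5) applied to the two lines above} \\
CA &= AC           && \text{given (2)} \\
CB &= BC           && \text{given (2)} \\
CA + CB &= AC + BC && \text{rule (3) applied to the two lines above} \\
(A+B)C &= AC + BC  && \text{rule (5) applied to lines 3 and 6}
\end{align*}
```

Each use of rule (3) or (5) takes two existing expressions as input and adds one new expression, which matches the description of these operations above.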
Let me distinguish between destructive operations, which produce
new expressions by destroying old expressions, and nondestructive
operations, which produce new expressions to increase the set of
existing expressions without destroying any old expressions. In the
above examples, Instant Insanity and chess involve destructive
operations; algebraic find problems and logical proof problems involve
nondestructive operations.
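
The distinction can be illustrated with a small Python sketch. It assumes a problem state is modeled as a set of expressions (here plain strings); the function names and the toy operation add_prime are hypothetical and are not taken from the text.

```python
def apply_nondestructive(state, op, operand):
    """Add the new expression; every old expression survives."""
    return state | {op(operand)}


def apply_destructive(state, op, operand):
    """Replace the operand with the new expression; the old one is destroyed."""
    return (state - {operand}) | {op(operand)}


# Hypothetical toy operation: mark an expression with a prime.
add_prime = lambda expression: expression + "'"

state = frozenset({"A"})
kept = apply_nondestructive(state, add_prime, "A")    # contains A and A'
replaced = apply_destructive(state, add_prime, "A")   # contains only A'
```

Under this modeling, a proof grows its state with each step, whereas a chess move or a cube rotation leaves a state of the same size.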
Although many problems allow one to use any allowable operation
at any time, some problems place restrictions on the number of times
an operation can be used or the conditions under which it can be used.
For instance, in chess a pawn first can be moved either one or two
squares, but thereafter it can be moved ahead only one square at a time.
Let us adopt the convention that an operation refers to a class of
actions, with the actions being distinguished only by the operands
(expressions or objects) to which the operation is applied. Assume
that a particular operation, F, can be applied to any expression within
some set of expressions, {Xi}. The particular Xi to which we will apply
the operation will be called the operand. The operation applied to a
particular operand, namely, F(Xi), will be called an action. Obviously,
these definitions of operations, operands, and actions generalize easily
to functions of more than one variable, for example, F(x, y, z).
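
As a rough illustration of this convention, the following Python sketch treats an operation as a function and an action as that function applied to one particular operand. The names double and actions_for are hypothetical and chosen only for the example.

```python
def double(x):
    """A hypothetical operation F: a rule applicable to any operand."""
    return 2 * x


def actions_for(operation, operands):
    """Pair the operation with each operand Xi; each F(Xi) is one action."""
    return [(x, operation(x)) for x in operands]


# One operation, three operands, hence three distinct actions.
actions = actions_for(double, [1, 5, 7])
```

The single operation yields as many actions as there are operands to which it can be applied.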
Goals
The goal of a problem is a terminal expression one wishes to cause to
exist in the world of the problem. There are two types of goals specified
in problems: completely specified goal expressions in proof problems
and incompletely specified goal expressions in find problems.
For example, consider the problem of finding the value of x, given the
expression 4x + 5 = 17. In this problem, one can regard the goal
expression as being of the form x = ___, where the correct number
is to be found in order to fill in the blank in the goal expression. The
goal expression in a find problem of this type is incompletely specified.
If the goal expression were specified completely (for example, x = 3),
then the problem would be a proof problem, with only the sequence of
operations to be determined in order to solve the problem. Of course,
if one were not guaranteed that the goal expression x = 3 was true, then
the terminal goal expression should really be considered to be
incompletely specified: something like the statement "x = 3 is (true or
false)."
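
The find problem above can be traced in a short Python sketch. It assumes, purely for illustration, that an equation a*x + b = c is stored as the tuple (a, b, c) and that the allowable operations are applied to both sides; the function names are hypothetical.

```python
from fractions import Fraction


def subtract_from_both_sides(equation, k):
    """Subtract k from both sides of a*x + b = c."""
    a, b, c = equation
    return (a, b - k, c - k)


def divide_both_sides(equation, k):
    """Divide both sides of a*x + b = c by a nonzero k."""
    a, b, c = equation
    return (Fraction(a, k), Fraction(b, k), Fraction(c, k))


given = (4, 5, 17)                           # 4x + 5 = 17
step = subtract_from_both_sides(given, 5)    # 4x = 12
goal = divide_both_sides(step, 4)            # x = 3
```

The blank in the goal expression x = ___ is filled by the right-hand side of the last tuple.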
In Instant Insanity, the goal is incompletely specified. The goal is
to get a tower of four cubes arranged in such a way that each of the
four rows of sides has one of each of the four colors. However, one is
not told exactly what the arrangement of the colors is to be; if one
were, it would be a very simple proof problem instead of a rather hard
find problem.
In many chess problems, the goal is to checkmate the other player in
some small number of moves. This goal is clear, but it is certainly not the
same as giving a complete specification of the terminal board position.
Incomplete specification of the goal state does not imply any
ambiguity about what constitutes a correct or incorrect solution to the
problem, as I shall define the term solution. There may be more than
one correct solution to a problem, but all formal problems discussed
in this book have the property that a solution is either correct or
incorrect, without ambiguity.
One reason for discussing the completeness of specification of the
goal is to clearly describe the nature of the difference between find
and proof problems. Another reason is to point out that find problems
have a terminal or goal expression that is specified (in various ways
and to different degrees) in a manner rather similar to the theorem to
be proved in a proof problem. It turns out that the degree of similarity
in the specification of the goal expression is sufficient to allow most of
the same problem-solving methods to be applied to find problems and
to proof problems. Working backward from the goal is probably the
only general problem-solving method that is used primarily in proof
problems and virtually never in find problems. All other methods
discussed in this book are frequently used in both find and proof
problems. Thus, although the distinction between find and proof problems is
perhaps the most familiar distinction between types of problems, it
has only moderate significance for problem-solving methods.
Implicit Specification of Givens, Operations, and Goals
Although some problems (for example, some proof problems) explicitly
specify all of the givens, operations, and goals, other problems specify
them only implicitly. For example, in solving the typical physics
problem, all of the assumptions, operations, and previously proved
theorems of real-variable and complex-variable mathematics are at
one's disposal in working on the problem, though this fact is generally
not stated explicitly. Usually, the implicit givens, operations, and
goals of a problem are clear to the problem solver, but sometimes
they are not.
Incomplete Specification of Givens,
Operations and Goals
There are often deliberately incomplete statements of givens,
operations, and goals. That is, the problem solver may have some degree
of choice among a set of possible given expressions, a set of possible
operations, and a set of possible goal expressions. We have already
discussed the case where the terminal goal expression is not specified
completely, but instead the problem solver has to find the correct
expression to fill into a blank space in the terminal goal expression.
Many find problems, such as the example given earlier of finding
x = ___ given 4x + 5 = 17, are equivalent to a problem with a
completely specified goal, 4x + 5 = 17, but with an incompletely specified
given, x = ___. Equivalences like this obtain where operations
are uniquely reversible (that is, where there exist inverse operations
for all operations).
In algebra problems (for instance, solving for x = ___ in a cubic
equation such as x^3 + 2x^2 - x - 2 = 0) it is probably somewhat better
to view the problem as having a completely specified goal expression,
x^3 + 2x^2 - x - 2 = 0, and an incompletely specified given expression,
x = ___, than the reverse. Often you are asked to determine all the
values of x that satisfy the equation, which means that you need to
know all the values of x from which you could derive the complicated
equation. Basically, this is a hypothesis generation (guessing) and
testing situation, because the direction of implication (by ordinary
arithmetic operations) is from an unknown x = ___ to a known goal,
x^3 + 2x^2 - x - 2 = 0, not the reverse. There are three values of x that
satisfy the equation x^3 + 2x^2 - x - 2 = 0, so the latter equation cannot
imply three contradictory equations, x = 1, x = -1, and x = -2.
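
The guess-and-test view of the cubic can be sketched in a few lines of Python. The range of integer guesses is an assumption made for the illustration; the text does not say how candidate values are to be generated.

```python
def cubic(x):
    """Left-hand side of x^3 + 2x^2 - x - 2 = 0."""
    return x**3 + 2 * x**2 - x - 2


# Hypothesis generation: a hypothetical range of integer guesses for x.
candidates = range(-5, 6)

# Hypothesis testing: keep each guess from which the goal can be derived.
roots = [x for x in candidates if cubic(x) == 0]
```

Each candidate plays the role of a possible given, x = ___, and the test checks whether the completely specified goal follows from it.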
Other examples of problems with incomplete specification of givens
or operations include many construction problems. Many such problems
require one to build something with a range of possible given
materials and operations, but there are costs or other restrictions
attached to the use of the materials (givens) and operations. The
problem solver must select an unordered set of materials and an ordered
set of (sequence of) operations that satisfies some constraints specified
in the problem and also achieves the goal.
Optimization problems are a natural extension of problems where
givens or operations have costs. In an optimization problem, one is
supposed to find the way to achieve the goal that minimizes some cost
or maximizes some utility.
WHAT IS A PROBLEM STATE?
A problem state, the state of the world of a problem, is the set of all
the expressions that exist in the world of the problem at a particular
time. The problem state can be changed only by applying an operation
to one or more expressions existing in the previous problem state to
produce one or more new expressions.
In problems that have only nondestructive operations, a problem
state consists of all the expressions that have been obtained from
the givens up to that moment in working on the problem. In problems
that have one or more destructive operations, the problem state
includes only the currently existing expressions (those obtained that
have not been destroyed). Often problems with destructive operations
are considered to have only a single expression representing their state
at the current moment, with the operations being able to change that
entire state into a new state. In such problems, there is no reason to
distinguish between state and expression.
The given problem state is the set of all given expressions. When
the givens are not specified completely, there are multiple possible
given states. When the givens are completely specified, there is a
unique given state. A goal state is a state that includes the goal
expression. When the goal is not completely specified or when there are
nondestructive operations, there are multiple possible goal states.
When the goal is completely specified and all operations are
destructive, there may be a unique goal state.
WHAT IS A SOLUTION?
A solution to a problem contains all four of the following parts.
(a) Complete specification of the givens; that is, a unique given state from
which the goal can be derived via a sequence of allowable operations.
(b) Complete specification of the set of operations to be used.
(c) Complete specification of the goals. (d) An ordered succession or sequence
of problem states, starting with the given state and terminating with a
goal state, such that each successive state is obtained from the
preceding state by means of an allowable action (operation applied to one
or more expressions in the preceding state).
Part (d) really includes the first three parts, so it may be taken to be
a sufficient definition of a problem solution. However, part (d) appears
to place primary emphasis on the sequencing of actions, and in many
problems it is the specification of givens or operations that constitutes
the main source of difficulty in the problem. Thus, it is important to give
these matters proper emphasis.
A simple and completely equivalent definition of a solution is to say
that a solution is a sequence of allowable actions that produces a
completely specified goal expression.
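
Part (d) of the definition lends itself to a mechanical check, sketched below in Python. The toy problem (integer states, two arithmetic operations, a goal of reaching 10) is hypothetical, and the checker assumes a nonempty sequence of states in which each operation takes the whole state as its operand.

```python
def is_solution(states, operations, is_goal):
    """True if every state follows from its predecessor by one allowable
    operation and the final state satisfies the goal."""
    for previous, current in zip(states, states[1:]):
        if not any(op(previous) == current for op in operations):
            return False
    return is_goal(states[-1])


# Hypothetical toy problem: start at 1, reach 10 by adding 1 or doubling.
operations = [lambda s: s + 1, lambda s: s * 2]
reaches_ten = lambda s: s == 10

valid = is_solution([1, 2, 4, 5, 10], operations, reaches_ten)
invalid = is_solution([1, 3, 10], operations, reaches_ten)  # 1 -> 3 not allowed
```

A sequence fails either because one step is not an allowable action or because its last state does not include the goal.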
In Instant Insanity, a solution could be considered to consist of
some given configuration of the four cubes, followed by a sequence of
different configurations of the cubes, each of which was obtained by an
allowable operation from the previous configuration, and ending with
a configuration that satisfies the goal of having each of the four colors
represented once on each of the four sides of the row of four cubes.
In a chess problem, a solution consists of some given board
configuration, followed by a sequence of board configurations, each of
which is derived from the previous configuration by an allowable
move, and ending with a checkmate configuration. If the problem
asserts that this solution is to be accomplished with some restrictions on
the number of moves, then the description of the problem state must
include a move counter that is increased by one on every move. The
terminal expression must not only be a checkmate position, but the
move counter must be less than or equal to some value. Chess
problems are often optimization problems, in which the different solutions
have different values depending upon how few moves they require.
In algebraic find problems or logical proof problems, the solution
consists of a sequence of states such that (a) the given state is the
conjunction of all the givens, (b) each successive state is derived
from the previous state by adding an expression that has been obtained
by applying an allowable operation to one or more of the previously
obtained expressions, (c) the goal state includes a completely specified
goal expression. When there are several given expressions, the most
common practice is to write down the given expressions only as soon
as they are needed for some operation. This procedure makes it easier
for the reader to follow the proof, but I think it is more logical to
regard all the givens as having been written down in the given problem
state. If there is some psychological benefit in writing them down again
in problems involving only nondestructive operations, of course you
should do it. But I do not think this writing exercise should influence
your definition of a problem solution.
STATE-ACTION TREE
Although the solution of a problem can be defined in terms of either
a sequence of actions or a sequence of states (terminating with the
achievement of the goal), it is very useful to represent both the
possible sequences of actions and the possible sequences of states in a
common diagram, which could be called a state-action tree for a
problem. An example of such a tree is shown in Fig. 2-3.
In a state-action tree, the nodes or branch points of the tree represent
all the possibly different problem states that could result from all the
different action sequences. The concept of a node in a state-action tree
differs from the concept of a problem state in a somewhat subtle, but
important, way. To be sure, every node represents a state of the
problem, but two distinct nodes do not necessarily represent two distinct
or different states of the problem. That is, two or more action
sequences, which result in two different nodes, may result in two identical
problem states. Strictly speaking, a node represents the sequence of
actions or the sequence of states that led up to it, not the problem
state achieved by that sequence of actions or states. However, as long
as you bear in mind that distinct nodes do not necessarily represent
distinct problem states, there is no harm in considering a node to
represent a state, rather than the sequence of actions or states that led
up to it.
FIGURE 2-3
State-action tree for a problem with two possible actions at each state, showing
how the number of possible "terminal" states at level n (equaling the number of
different action sequences that are n actions in length) increases geometrically
with n. The tree has 1 possible state at level 0 (the given state), 2 at level 1,
4 at level 2, 8 at level 3, and 16 at level 4.
The branches from each node represent the different actions that
could be selected at that node. Obviously, the actions possible at each
node need not be similar to the actions possible at any other node, but
in many problems the actions that are possible at each node fall into
the same action classes or operations, with only the available operands
being different; however, this similarity is not true of every problem.
In addition, the number of possible actions at each node need not be
equal either at the same level or across different levels.
These possible differences from node to node do not alter the
primary lesson to be learned from examining a state-action tree,
namely, how rapidly the number of possible nodes or action sequences
increases in such a tree as a function of level, that is, the length of
the prior action sequence. If M actions occur at each node, then there
are M^n possible action (or state) sequences terminating at level n. Each
of these different action (state) sequences is represented by a node at
level n in the state-action tree, so there are M^n different nodes at level n.
This geometric (discrete exponential) increase is perhaps the single
most important fact to consider in developing problem-solving methods.
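
The count M^n can be confirmed by brute-force enumeration, as in the following Python sketch. The function name count_sequences is hypothetical; the enumeration simply lists every sequence of n choices among M actions.

```python
from itertools import product


def count_sequences(m, n):
    """Enumerate every action sequence of length n with m actions per node."""
    return sum(1 for _ in product(range(m), repeat=n))


# Two actions per node, levels 0 through 4, as in Fig. 2-3.
counts = [count_sequences(2, n) for n in range(5)]
```

With ten actions per node, the same enumeration at level 6 would already visit a million sequences, which is why exhaustive search soon becomes impractical.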
To solve a problem you must state the exact sequence of actions
(states) that results in the goal, and many problems require a
moderately long sequence of actions to accomplish the goal. Thus, we are
often faced with a search among an extremely large number of
alternative action sequences. In these cases, we must "prune the tree"
so that there are not so many possible action sequences to
investigate. But, of course, we must prune in such a manner that we do not
cut off all the branches that have "fruit," that is, states including
the goal.
If you had no basis for choosing between the alternative actions at
each node, if all the nodes at all levels represented distinct states
(distinct sets of expressions), and if only one of the states (up to and
including level n) included the goal, then there would be no way to
prune the tree and reduce the search. However, in most problems, it is
possible to prune the tree.
Different sequences of actions often result in equivalent problem
states, allowing you to combine nodes, prune branches, construct
equivalent reduced state-action trees, and so on (for example,
classificatory trial and error and macroaction in Chapter 4). Usually, there
are good reasons for choosing certain actions at any node and
ignoring other actions and the branches they generate (for example, state
evaluation and hill climbing in Chapter 5). Frequently, a large problem
can be broken up into subproblems, thereby transforming a large tree
into several smaller trees, with a great reduction in the total number of
branches (for example, subgoals in Chapter 6). Sometimes, a much
smaller tree results from trying to get from the goal back to the givens,
rather than the reverse (for example, working backward in Chapter 8).
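
The saving from combining nodes that represent the same state can be seen in a small Python sketch. The toy problem (integer states with the operations "add 1" and "add 2") is hypothetical; it is chosen only because different action sequences often reach the same state.

```python
def full_tree_nodes(m, depth):
    """Nodes in a state-action tree with m actions per node, levels 0..depth."""
    return sum(m ** level for level in range(depth + 1))


def distinct_states(start, operations, depth):
    """Expand level by level, combining nodes that repeat an earlier state."""
    frontier, seen = {start}, {start}
    for _ in range(depth):
        frontier = {op(s) for s in frontier for op in operations} - seen
        seen |= frontier
    return len(seen)


# Hypothetical toy operations applied to an integer state.
operations = [lambda s: s + 1, lambda s: s + 2]

tree_size = full_tree_nodes(len(operations), 4)
merged_size = distinct_states(0, operations, 4)
```

In this toy case the unpruned tree has 31 nodes through level 4, but only 9 distinct states need to be examined.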
Problems with multiple given states can be represented by as many
state-action trees as there are possible given states. In some problems,
the principal task is to choose among the given states (alternative sets
of givens), the one or more given states whose state-action trees
contain a goal state. Often these problems require only a very short action
sequence to achieve the goal, once the correct given state has been
selected. In such problems, the main difficulty is to find the correct
type of tree in a large forest; climbing the tree may pose only a minor
problem. The method of contradiction discussed in Chapter 7 is often
useful for these problems.
There is a special case of problems with multiple given states that
occurs quite frequently and is of particular interest. In these
problems, the solver has the option of considering state A to be given and
state B to be the goal or of considering state B to be given and state A
to be the goal. This kind of equivalence between two problems occurs
where inverse operations exist for all operations. One problem of this
type was discussed earlier in the chapter, namely, the equivalence of
deriving x = 3 from 4x + 5 = 17 or vice versa.

Inference
Virtually all problems present some of the relevant information in
implicit, rather than explicit, form. That is, some of the information
concerning givens, operations, or occasionally even goals is presented
in a subtle manner that may not strongly attract your attention, unless
you know what to look for. In a sense, this situation might be said to
be poor communication of the components of a problem. Why do not
the people who make up problems simply do a better job of
communicating the relevant information?
I would agree that, in some cases, problems used for teaching
purposes could be improved by making the relevant information very clear.
In these cases the problem is difficult enough in explicit form without
the added difficulty of the relevant information being presented
implicitly. However, when you are posing and solving mathematical,
scientific, and engineering problems for yourself in some real-life
endeavor, your own initial posing of problems will contain implicit
statements of information. Unless you know how to analyze a problem
for implicit information, you will have difficulty solving actual
problems later on.
Problems often evolve from (a) vaguely formulated to
(b) semiprecisely formulated to (c) precisely but partly implicitly formulated
to (d) precisely and explicitly formulated stages. It is very important
for problem solvers to know what kinds of implicit information to look
for in problems, because this information is often a critical step in
problem solving, whether in school or in life. Furthermore, even when
all the givens and operations are explicitly presented in the problem,
it is, of course, necessary to transform the givens by means of the
operations in some way in order to solve the problem. The solver must
make inferences, draw conclusions, from the given information, a
process that is, in essence, rendering explicit the statements that were
(in a somewhat different sense) only implicit in the givens.
When implicit information refers to the consequences of given
information, it is a somewhat different use of the term than when it refers
to information not contained in the explicit statement of the problem
(although, by convention, one is supposed to know that it is part of
the information in the problem). However, there are all degrees of
explicit mention of implicit information found in different problems.
For example, a problem might refer to even numbers. In one sense,
this statement is explicit mention of even numbers from which one
can draw the inference that, if n is an integer and an even number,
then it can be expressed as 2m, where m is also an integer. However,
the definition of even numbers is not presented explicitly in the
problem and must be supplied from memory. This sort of semiexplicit,
semi-implicit presentation of information occurs all the time in
problems. Thus, it is probably not too useful to distinguish between the
drawing of conclusions from different degrees of implicitly versus
explicitly presented information.
Drawing inferences from implicitly or explicitly presented
information is essentially random trial and error, unless some criteria are
specified regarding which inferences (more generally, which
transformations of the goal or the given information) should be made first.
There are essentially two criteria that can be formulated semiprecisely,
but not completely precisely, at the present time. The first criterion
is that the inferences should be those that you have frequently made
in the past from the same type of information. You assume that the
properties that proved useful in the past will most likely prove useful
in the present problem. The second criterion is that the inferences you
draw should be those inferences that are concerned with properties
mentioned in the goal, the givens, or in previously derived
consequences of the goal or the givens. Inferences that satisfy this second
criterion are likely to combine with other information to yield still
further inferences.
Thus, the general problem-solving method described in this chapter
may be stated as follows: Draw inferences from explicitly and implicitly
presented information that satisfy one or both of the following two
criteria: (a) the inferences have frequently been made in the past from
the same type of information; (b) the inferences are concerned with
properties (variables, terms, expressions, and so on) that appear in
the goal, the givens, or inferences from the goal and the givens.
Throughout the rest of the book, the expression "drawing inferences"
will be used to refer to the above statement of the method, namely,
drawing inferences that satisfy one or both of the previously stated
criteria.
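
One way to picture the two criteria is as a filter and ranking over candidate inferences, as in the Python sketch below. Everything in it is an assumption made for illustration: the text gives no scoring scheme, and the inference names, past-use counts, and property sets are invented.

```python
def rank_inferences(candidates, past_counts, relevant_terms):
    """Keep candidates meeting criterion (a) or (b); those meeting both first."""
    def score(candidate):
        name, terms = candidate
        made_before = past_counts.get(name, 0) > 0     # criterion (a)
        on_topic = bool(set(terms) & relevant_terms)   # criterion (b)
        return made_before + on_topic
    kept = [c for c in candidates if score(c) > 0]
    return sorted(kept, key=score, reverse=True)


# Hypothetical candidates: (inference name, properties it concerns).
candidates = [
    ("even_means_2m", {"n", "even"}),
    ("people_have_two_legs", {"legs"}),
    ("sum_of_evens_is_even", {"even", "sum"}),
]
past_counts = {"even_means_2m": 12}   # invented record of past use
relevant_terms = {"even", "n"}        # properties in the goal and givens

ranked = [name for name, _ in
          rank_inferences(candidates, past_counts, relevant_terms)]
```

The irrelevant inference is discarded, and the one that satisfies both criteria is tried first.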
Drawing inferences (more generally, making transformations of the
goal or the givens) is probably the first problem-solving method you
should employ in attempting to solve a problem. You are essentially
expanding the goal or the givens by bringing to bear all of the
knowledge you have concerning this problem in your memory. Frequently,
problems are quite simply solved, once all the relevant information
is retrieved from memory, in the drawing of inferences from explicitly
and implicitly presented information. Most people do make frequent
use of the inference method, at least in connection with drawing
inferences from givens. (This procedure is often thought to be random
trial and error, but this characterization is largely inaccurate, since
people's inferences usually do meet one or both of the stated criteria.)
The general problem-solving methods discussed later in the book are
somewhat less universally used by human problem solvers, but the
discussion of them should not lead you to ignore the basic inference
method. For this reason, this method is the first general
problem-solving method discussed in this book. Furthermore, a greater
understanding of how the inference method operates and an awareness of
some illustrative use can greatly facilitate your proficiency in using
the method, particularly with respect to inferences from the goal
information, which people do not pay enough attention to. People have
a bias to start at the beginning, which they take to mean the givens.
This bias is often inappropriate in problem solving, since the goal is
frequently a better beginning point than the givens.
So-called insight problems are often problems in which the principal
step in solution is to draw the appropriate inference from certain
explicitly or implicitly presented information. Very few steps are required
to solve the problem. What is necessary is to make that one critical
transformation of the givens that essentially solves the problem.
Difficult insight problems are often difficult precisely because they
require you to draw an inference that is not too close to the top of
your hierarchy of inferences from this type of given information
[criterion (a)]. Obviously, the more you have stored in your memory
concerning the principal inferences to be drawn from the types of given
information contained in the problem, the more likely you are to be
able to achieve the critical insight. However, whatever your level of
specific knowledge concerning the given information, greater
understanding and experience in the use of the inference method will
increase your chances of systematically discovering the required insight
in the course of drawing inferences concerning properties of the
given information. Just knowing that what you are doing is surely not
random trial and error may cause you to go further and further down
the list of inferences to be made from the information in the problem,
rather than giving up this approach after the first few inferences fail.
With the knowledge of problem-solving methods contained in this book
and experience in applying them to the solution of problems, you can
gradually develop a fairly accurate intuition as to which problems are
insight problems and thus most suited to the inference method and
not to other problem-solving methods. If you classify a problem as an
insight problem, then you should continue drawing inferences (rather
than use other methods) for a longer period of time than if you do not
classify it as an insight problem.
Of course, drawing inferences (including explicit representation of
implicit information) is often an important part of solving any problem,
not just insight problems. Insight problems are simply those in which
inference is the principal or only method employed in solving them.
In noninsight problems, you should stop using the inference method
when you "run out of gas" using the method, that is, when you find
it difficult to draw from the given information any new conclusions
that seem to have any likelihood of being useful in solving the problem.
In noninsight problems, you should then go on to consider employing
other general problem-solving methods, using the expanded set of
given information provided by the inference method. In insight
problems, when you run out of gas, you should go back and try over and
over again to look at the problem from a different point of view to
yield additional new inferences.
The discussion of inference and implicit information naturally divides
into three sections. First, givens may be, to some extent, stated
implicitly and, in any event, can usually be expanded considerably by
use of the inference method. Second, operations are not always
explicitly stated. Third, the goal of the problem is occasionally not
completely clear, and the solver must get a precise and correct
definition of the goal. In addition, it is often helpful to specify the
properties of the goal in more detail. This procedure frequently involves
drawing inferences from presented information (givens and goal),
including explicit symbolic or diagrammatic representation of
information that may appear only implicitly in the problem.
GIVENS
The problems at the end of a section in a textbook are there to test
the reader's knowledge of the material presented in that section. Each
problem, then, includes all of the given assumptions, proved theorems,
and operations that appeared in the section as well as the particular
givens of the particular problem. In addition, some previous material
presented in the book may be relevant to solving the problem, and
certain background knowledge from other books may also be needed.
Such background information concerning givens and operations is one
kind of implicit information in problems.
You should be aware of this kind of implicit information in
problems, and take care to master background subject matter before
proceeding on to courses that have this background as a prerequisite.
If you have not fully understood what was presented previously in
the course or what was presented in relevant background courses, you
should face this fact and go back to learn the relevant prior material,
either simultaneously with or instead of taking a subsequent course.
It is lunacy to go on to more advanced courses without a reasonably
clear understanding of the relevant background material. The general
problem-solving methods taught in this book will not substitute for
lack of the relevant knowledge.
It is true that you can understand the relevant material and not be
able to solve problems for lack of understanding of general
problem-solving methods. However, you will also fail to solve problems if you
lack the relevant knowledge, no matter how skillful a problem solver
you are. In today's schools a C or even a B in a course may represent
an inadequate level of understanding for going on to more advanced
courses, and the conscientious student should recognize this fact and
act accordingly.
In addition to background information, there is another kind of
implicit problem information that the skilled problem solver can come
to recognize rather easily, sometimes greatly facilitating solution.
This other kind of implicit information concerns the properties
possessed by each of the givens or operations in a problem. When a
familiar object or activity is presented in a problem, all of the
known properties of that object or activity (including all its known
relations to other objects or activities) are usually considered to
be part of the given information. There may be no question that
everyone who works on the problem knows all of the relevant
properties of all the givens and operations in the problem. That is,
no specialized background knowledge is required. However, amateur
problem solvers frequently fail to ask themselves what they know
about the givens and operations in a problem from their own past
experience. Insight problems are very often problems that require one
to notice (which means represent explicitly) properties of givens
presented in the problem.
Of course, many of the implicit properties of the givens are
irrelevant to solving the problem. We know that most people have two
legs, two arms, two eyes, skin, hair, a nose, a mouth, and so on, but
most of these properties are irrelevant to the solution of any single
problem where people are included in the given information. Such
irrelevant properties should be ignored, and problem solvers are
usually able to reject such truly irrelevant implicit properties. The
difficulty usually comes in abstracting or consciously considering
the possibly relevant implicit properties. Some examples are
described in the following subsections.
Numerical Properties
Whenever numbers are involved in a problem in any way, you should
consider whether the known properties of the kind of numbers involved
in the problem might be of any value in solving the problem. For
example, if some number is known to be a positive integer, then it
cannot be negative, zero, or a fraction. If an integer, n, is known to
be even, then it can be expressed as n = 2m, where m is also an
integer, or as n = 2^s p, where s is an integer and p is an odd
integer. If an integer, n, is known to be odd, then it can be
expressed as n = 2m + 1, where m is an integer, or n = 2^s p + 1,
where s is an integer and p is an odd integer.
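As a small illustration of making such a numerical property explicit, the sketch below splits a nonzero integer into a power of two times an odd factor. The function name split_power_of_two is hypothetical and is not part of the original text; it is only one way such a representation might be computed.

```python
def split_power_of_two(n: int) -> tuple[int, int]:
    """Return (s, p) such that n == 2**s * p with p odd.

    Hypothetical helper for illustration; n must be a nonzero integer.
    """
    if n == 0:
        raise ValueError("zero has no representation 2**s * p with p odd")
    s = 0
    # Divide out factors of two until the remaining factor is odd.
    while n % 2 == 0:
        n //= 2
        s += 1
    return s, n


if __name__ == "__main__":
    for value in (24, 40, 7):
        s, p = split_power_of_two(value)
        print(value, "= 2 **", s, "*", p)
```

An even number always comes back with s of at least 1, and an odd number comes back with s equal to 0.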
A somewhat famous example in the psychology of problem solving of
the abstraction of numerical properties comes in the 13 problem of
Karl Duncker (1945, p. 31). The problem can be stated as follows:
Prove that all six-place numbers of the form abcabc (for example,
416416 or 258258) are divisible (evenly) by 13.
Stop reading and try to solve this problem, then read on.
You might try a variety of special cases, verifying that in every case
the number was divisible by 13, but that would probably not suggest
how to prove the theorem in general. The critical step is to inquire
whether you know any numerical properties of a number of the form
abcabc. If you could not solve this problem before, stop reading and
try again by abstracting numerical properties of numbers of the form
abcabc.
If you still could not solve the problem, consider whether you could
factor a number of the form abcabc into a product of other numbers.
Now stop reading and try again.
In factoring the number, you no doubt determined that abcabc =
(abc)(1001), for all numbers of the form abc and therefore for all
numbers of the form abcabc. Now, of course, 1001 is divisible (evenly)
by 13, so (abc)(1001) is divisible by 13, and the theorem is proved.
Furthermore, the factoring of abcabc into abc(1001) can be achieved
quite automatically by representing the numerical properties of abcabc
in the following standard way (for which abcabc is really the
conventional abbreviation):
\[
\begin{aligned}
\mathit{abcabc} &= (a \cdot 10^5) + (b \cdot 10^4) + (c \cdot 10^3) + (a \cdot 10^2) + (b \cdot 10) + (c) \\
&= a(10^5 + 10^2) + b(10^4 + 10) + c(10^3 + 1) \\
&= a \cdot 10^2 (10^3 + 1) + b \cdot 10 (10^3 + 1) + c(10^3 + 1) \\
&= (1001)(a \cdot 10^2 + b \cdot 10 + c) = (1001)(\mathit{abc})
\end{aligned}
\]
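The factoring argument can also be checked mechanically. The following sketch builds every six-place number of the form abcabc from its digits and confirms both the identity abcabc = 1001 x abc and divisibility by 13. The function names are hypothetical, and an exhaustive check of this kind only confirms the theorem; it does not replace the proof.

```python
def repeated_block(a: int, b: int, c: int) -> int:
    """Build the number abcabc from the digits a, b, c by place value."""
    return (a * 10**5 + b * 10**4 + c * 10**3
            + a * 10**2 + b * 10 + c)


def every_abcabc_divisible_by_13() -> bool:
    """Check the identity and the divisibility claim for all digit triples."""
    for a in range(1, 10):          # a >= 1 so the number has six places
        for b in range(10):
            for c in range(10):
                abc = 100 * a + 10 * b + c
                n = repeated_block(a, b, c)
                if n != 1001 * abc or n % 13 != 0:
                    return False
    return True


if __name__ == "__main__":
    print(repeated_block(4, 1, 6))          # the example 416416
    print(every_abcabc_divisible_by_13())
```

Because 1001 = 7 x 11 x 13, the same representation shows that these numbers are also divisible by 7 and by 11.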
Topological Properties
Topology is concerned with the properties of geometric figures that
remain unaltered when the figures are stretched, shrunk, and twisted
in any regular or irregular way. For example, consider the square
shown in Fig. 3-1. Imagine that the square was drawn on a sheet of
very flexible rubber and that it was stretched so that the square
looked like that shown at right in the figure. What properties remain
invariant under the stretching, shrinking, and twisting of the rubber
sheet?
Actually, a number of properties are unchanged. Points inside the
figure remain inside, points outside the figure remain outside, and
points on the edges (lines) of the figure remain on the edges. If you
consider that the figure has only four points (namely, the four
vertices A, B, C, and D) and that the edges are defined merely as
unordered pairs of the vertex points, then the set of points and the
set of edges (unordered pairs of points) have not been changed by the
distortion either.
FIGURE 3-1
Distorting a square drawn on a rubber sheet to illustrate the
topological properties of a figure (those properties that are
unchanged by stretching, shrinking, and twisting). Panels: (a) the
original square with vertices A, B, C, and D; (b) the distorted figure.
Consider a figure with several faces or regions entirely enclosed by
lines with no interior lines, such as the three-face figure shown in
Fig. 3-2. All of the invariants described in the preceding paragraph
for a single-face figure obtain for the multiface figure. In addition,
the faces that border on each other (have a common edge) still border
on exactly the same faces after the distortion. Thus, if you
constructed the set of unordered pairs of faces that border on each
other (namely, (f, g) and (g, h)), this set would remain invariant
under stretching, shrinking, and twisting.
FIGURE 3-2
Distorting a three-face figure drawn on a rubber sheet to illustrate
topological properties. Faces are represented by f, g, and h. Vertices
are represented by A, B, C, D, E, and F.
One of my favorite problems involves the property of the bordering
(direct connection) of faces in an important way. This is the notched
checkerboard problem:
You are given a checkerboard and 32 dominoes. Each domino covers
exactly two adjacent squares on the board. Thus, the 32 dominoes can
cover all 64 squares of the checkerboard. Now suppose two squares are
cut off at diagonally opposite corners of the board (see Fig. 3-3). Is
it possible to place 31 dominoes on the board so that all of the 62
remaining squares are covered? If so, show how it can be done. If not,
prove it impossible.
Stop reading and try to solve this problem.
If you could not solve it, consider the following hint. This problem
primarily involves use of the inference method to explicitly represent
certain properties of the checkerboard and dominoes that are only
implicitly presented in the present problem. Once the appropriate
property or properties are recognized, the solution to the problem is
obvious. Now stop reading and try to solve the problem, if you could
not do so before.
FIGURE 3-3
The notched checkerboard.
The critical property is that of the two squares of the checkerboard
that are covered by any domino. What are some of the properties of
any such two squares? If you have not yet solved the problem, stop
reading and try again, considering this hint.
The critical properties of the two squares covered by any domino can
be expressed in terms of the colors of these two squares. What are the
colors of the two squares covered by any domino on a checkerboard?
If you have not yet solved the problem, stop reading and try again,
considering this hint.
The key insight required to solve the notched-checkerboard problem
is to notice that a domino covers two squares that are always of
different colors (that is, one black and one white). Since the
diagonally opposite corner squares are of the same color, there are
now 30 squares of one color and 32 squares of the other color. Any 31
dominoes would have to cover exactly 31 squares of each color, so the
62 squares cannot be covered by 31 dominoes.
What has intrigued me most about the problem is this: the
impossibility of covering the remaining 62 squares with 31 dominoes
can be proved irrespective of whether the eight-by-eight matrix is
presented as a checkerboard with a checkerboard coloring pattern and
even irrespective of whether the problem solver has ever experienced a
checkerboard coloring pattern. But what problem-solving method would
lead one to discover the elegant proof that comes from imposing a
checkerboard coloring pattern on the matrix? Is this kind of ingenious
idea a chance happening, or something only very brilliant people can
think of, using methods that are not understandable by others? I do
not think so. I think that use of the problem-solving method of
representing all of the possibly relevant properties of the givens in
a problem makes it likely that many problem solvers would discover the
elegant solution of even the notched eight-by-eight colorless matrix
problem.
I think that it is not likely a person unfamiliar with a checkerboard
coloring pattern would impose such a pattern on a colorless
eight-by-eight matrix. However, I think that it is likely that a
person would do something equivalent to imposing checkerboard coloring
on the matrix, as follows. Using the method of trying to represent all
of the possibly relevant properties of the givens in the problem, one
would eventually label the squares in the eight-by-eight matrix in
ordered-pair (coordinate) notation, as shown in Fig. 3-4. Now one
might eventually look for some property common to all pairs of squares
that a single domino could cover. If the idea occurred to one to look
for this kind of property, then having labeled the squares in
ordered-pair (coordinate) notation, it is likely that one would see
that a domino must cover two squares, one of whose coordinate sums is
odd and the other even. Since the diagonally opposite squares of the
matrix both have either an odd or an even coordinate sum, the notched
matrix cannot be covered by the 31 dominoes. The solution is in every
way equivalent to that given for the notched checkerboard using the
color property but in no way requires one to invent some special
labeling scheme such as a checkerboard coloring pattern. Only the very
generally useful and familiar coordinate labeling scheme is needed.
      7,1  7,2  7,3  7,4  7,5  7,6  7,7
 6,0  6,1  6,2  6,3  6,4  6,5  6,6  6,7
 5,0  5,1  5,2  5,3  5,4  5,5  5,6  5,7
 4,0  4,1  4,2  4,3  4,4  4,5  4,6  4,7
 3,0  3,1  3,2  3,3  3,4  3,5  3,6  3,7
 2,0  2,1  2,2  2,3  2,4  2,5  2,6  2,7
 1,0  1,1  1,2  1,3  1,4  1,5  1,6  1,7
 0,0  0,1  0,2  0,3  0,4  0,5  0,6
FIGURE 3-4
The notched checkerboard with ordered-pair (coordinate) labeling of
the squares.
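The coordinate-sum argument can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: squares are labeled (row, column) with indices 0 through 7, the removed corners are taken to be (0, 7) and (7, 0), and the function names are hypothetical.

```python
def parity_counts(removed, size=8):
    """Count remaining squares with even and with odd coordinate sum."""
    even = odd = 0
    for row in range(size):
        for col in range(size):
            if (row, col) in removed:
                continue
            if (row + col) % 2 == 0:
                even += 1
            else:
                odd += 1
    return even, odd


def domino_always_mixes_parity(size=8):
    """Check that every pair of adjacent squares has one even, one odd sum."""
    for row in range(size):
        for col in range(size):
            # Only look right and up; that visits each adjacent pair once.
            for d_row, d_col in ((0, 1), (1, 0)):
                r2, c2 = row + d_row, col + d_col
                if r2 < size and c2 < size:
                    if (row + col) % 2 == (r2 + c2) % 2:
                        return False
    return True


if __name__ == "__main__":
    print(domino_always_mixes_parity())
    print(parity_counts({(0, 7), (7, 0)}))
```

Since every domino covers one even-sum and one odd-sum square, a full covering needs equal counts of the two kinds, and removing two diagonally opposite corners leaves the counts unequal.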
Let us examine why this problem is an example of the abstraction of
the topological properties of a figure. A domino covers two faces that
border on each other in a complex figure composed of faces with a very
special type of bordering structure. It is the bordering structure of
the matrix of faces that is represented by the coordinate labeling
scheme (or the checkerboard coloring pattern), and the shapes and
sizes of the faces or the matrix are completely irrelevant. Thus, the
notched eight-by-eight matrix problem is a problem where the critical
properties to be represented are topological properties.
Other problems in which representing topological information is
important for achieving solution are those in which a block is cut
into component subblocks. The following cube-cutting problem is, I
guess, the classic such problem:
You are working with a power saw and wish to cut a wooden cube,
3 inches on a side, into 27 1-inch cubes. You can do this by making
six cuts through the cube, keeping the pieces together in the cube
shape (see Fig. 3-5). Can you reduce the number of necessary cuts by
rearranging the pieces after each cut?
FIGURE 3-5
Slicing a 3-by-3-by-3-inch cube into 27 subcubes.
Stop reading and try to solve the problem.
Consider the 3-by-3-by-3-inch cube to be already divided into its
27 component cubes but still stacked in such a way as to form a
3-by-3-by-3 cube. The important topological properties of such a
structure are concerned with the vertices, the edges, and the faces of
the component cubes. If you did not solve the problem, stop reading
and try again.
Among the important topological properties the one most likely to
be relevant to the solution of the present problem concerns the faces
of the component cubes, since the power saw essentially separates
the faces of certain component cubes from the faces of other component
cubes. If you have so far not solved the problem, stop reading and
try again, using this hint.
The 27 component cubes fall into several classes on the basis of how
many of their faces (sides) border on other component cubes versus
how many are parts of the exterior faces of the 3-by-3-by-3 cube.
Classify the component cubes by this criterion and consider this
information in relation to the solution of the problem. If you have
not solved the problem thus far, stop reading and try again.
There are four classes of cubes with respect to the property of the
number of "interior" faces (that is, the number of faces that border
on faces of other component cubes and thus must be cut). There are the
8 corner cubes that have only three interior faces; there are the 12
edge cubes that have four interior faces; there are the 6 face cubes
that have five interior faces; and there is the one center cube that
has six interior faces (and is totally hidden from view in the
3-by-3-by-3 cube). A cross-sectional diagram of the cube representing
the number of interior faces for each component cube is shown in
Fig. 3-6. If you have not solved the problem thus far, consider the
information in Fig. 3-6 and try again.
Top section     Middle section     Bottom section
  3  4  3          4  5  4            3  4  3
  4  5  4          5  6  5            4  5  4
  3  4  3          4  5  4            3  4  3
FIGURE 3-6
The number of interior faces (needing to be cut) for each component
cube of the 3-by-3-by-3-inch cube.
The key insight required to solve block-cutting problems in general
and this cube-cutting problem in particular is to focus on that
subblock which has the greatest number of faces that must be cut in
order to separate it from the other subblocks. The reason for focusing
on the subblock with the largest number of faces to cut is that the
number of such faces on this block sets a minimum to the number of
cuts that must be made. It is obvious why this is so, since under no
circumstances can one cut more than one face of a subblock at a time.
In the case of the cube, this fact means focusing on the most central
cube, which has no exposed faces to begin with. This cube has six
faces that must be cut, and therefore no fewer than six cuts will
solve the problem. Since we know by inspection that six cuts will
solve the problem, the number of cuts that is required is exactly six.
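The classification of component cubes by number of interior faces can be reproduced by direct counting. The sketch below is an illustration only: it models an n-by-n-by-n cube with unit cubes at integer coordinates, and the function names are hypothetical. Note that the value it returns is a lower bound on the number of cuts; as in the text, one still has to show separately that this many cuts actually suffice.

```python
from collections import Counter
from itertools import product


def interior_face_counts(n=3):
    """Map 'number of interior faces' to 'how many unit cubes have it'."""
    counts = Counter()
    for position in product(range(n), repeat=3):
        # Along each axis a unit cube has a neighbor below it if its
        # coordinate is > 0 and a neighbor above it if it is < n - 1.
        interior = sum((c > 0) + (c < n - 1) for c in position)
        counts[interior] += 1
    return dict(counts)


def min_cuts_lower_bound(n=3):
    """Largest number of interior faces on any single unit cube."""
    return max(interior_face_counts(n))


if __name__ == "__main__":
    print(interior_face_counts(3))
    print(min_cuts_lower_bound(3))
```

For n = 3 the counts fall into the four classes described above (corner, edge, face, and center cubes), and the bound comes from the single center cube.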
The same principle can be applied to a large class of other problems
to set a minimum on the number of cuts that are required. For example,
a 2-by-2-by-2 cube cut into eight subcubes, as in Fig. 3-7, requires
three cuts, since each of the eight subcubes has three unexposed faces
that must be cut.
FIGURE 3-7
Slicing a 2-by-2-by-2-inch cube into subcubes.
Operations
Many practical problems require you to think of a type of operation
that will solve the problem. The operation is usually one with which
you would be quite familiar, but thinking of that operation may be far
from trivial. Nevertheless, it is probably of some help to be
explicitly aware of the possibility of implicit operations and to have
some examples of such problems in your mind. One example is the
well-known radiation problem of Duncker (1945, p. 1):
Given a human being with an inoperable stomach tumor, and rays which
destroy organic tissue at sufficient intensity, by what procedure can
one free him of the tumor by these rays and at the same time avoid
destroying the healthy tissue which surrounds it?
There are a number of plausible solutions, each of which involves
thinking of some operation not specified in the statement of the
problem. For example, the rays might be focused from several sources
so that they intersected in the region of the tumor. A single source
of radiation could be rotated around the body so that all the beams
intersected in the region of the tumor. Perhaps a source of radiation
could be implanted inside the tumor.
In some formal problems with a precisely delimited set of operations,
the properties of one or more of the operations may be somewhat
implicit. Failure to achieve a completely explicit and accurate
understanding of the properties of such operations may block solution
of the problem. As an example of this type of situation, consider the
following one-heavy-coin problem:
You have a pile of 24 coins. Twenty-three of these coins have the same
weight, and one is heavier than the others. Your task is to determine
which coin is heavier and to do so in the minimum number of weighings.
You are given a beam balance (scale), which will compare the weights
of any two sets of coins out of the total set of 24 coins.
Stop reading and try to solve the problem. Consider the properties
of the weighing operation if a beam balance is used. What kind of
information does the beam balance provide concerning relative weights
in any two sets of coins? How many different outcomes are there to a
weighing on a beam balance? If you have not solved the problem thus
far, stop reading and try again.
A beam balance actually has three different outcomes, not two:
namely, the left pan is heavier, lighter, or equal in weight to the
right pan. Since there are three different outcomes to weighing on a
beam balance, it is at least theoretically possible that a beam
balance could provide one with an answer as to which of three subsets
of coins contains the heavy coin (not just deciding which of two
subsets contains the heavy coin). Consider this hint, and you should
easily be able to solve the problem.
A beam balance has two pans and compares the weights of two sets
of coins. For this reason, many people assume that the operation they
have available to solve the problem is essentially to ask which of two
(equally large) sets of coins contains the heavy coin. Accordingly,
they reason that the optimal strategy must be to divide the total set
of coins in half and weigh one half against the other half (12 coins
against 12 coins). Then, having determined which set of 12 coins
contains the heavy coin, they proceed to divide that set in half and
weigh 6 coins against 6 coins, then 3 against 3, and finally 1 against
1, or 2 against 2 followed by 1 against 1.
When the number of coins remaining is only three, it might occur to
a person that one's original characterization of the operation as a
two-way question was in error. However, sometimes even this simple
terminal problem does not necessarily indicate to the problem solver
that the beam balance can actually provide one with an answer to a
three-way question, if the coins are divided into three piles (two of
which are equal) on each and every weighing. This procedure is, of
course, the solution to the problem: namely, you should first weigh
two sets of eight coins against each other. If one pan is heavier than
the other, then it contains the heavy coin. If the two pans balance,
then the heavy coin is in the remaining set of eight coins that was
left off the pans of the balance scale. In any case, you find out
which subset of eight coins contains the heavy coin. You then continue
to partition the remaining set of eight coins into three parts by
weighing three coins against three coins. No matter which subset of
coins contains the heavy coin, the answer will be found in one
additional weighing, or three weighings in all. By contrast, dividing
the set into two equal parts (whenever possible) requires four
weighings.
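The three-way procedure can be simulated to confirm the count of weighings. The sketch below is a minimal illustration under stated assumptions: the coins are represented by a list of weights with exactly one larger value, each pass through the loop stands for one use of the beam balance, and the function name find_heavy_coin is hypothetical.

```python
def find_heavy_coin(weights):
    """Return (index of the heavy coin, number of weighings used).

    Each weighing puts two equal-sized piles on the pans and sets the
    rest aside, so one weighing answers a three-way question.
    """
    candidates = list(range(len(weights)))
    weighings = 0
    while len(candidates) > 1:
        pile = (len(candidates) + 2) // 3      # size of each pan's pile
        left = candidates[:pile]
        right = candidates[pile:2 * pile]
        aside = candidates[2 * pile:]
        weighings += 1
        left_weight = sum(weights[i] for i in left)
        right_weight = sum(weights[i] for i in right)
        if left_weight > right_weight:
            candidates = left
        elif right_weight > left_weight:
            candidates = right
        else:
            candidates = aside                 # pans balance
    return candidates[0], weighings


if __name__ == "__main__":
    coins = [10] * 24
    coins[17] = 11                             # the one heavy coin
    print(find_heavy_coin(coins))
```

With 24 coins the piles are 8, 8, and 8, then 3, 3, and 2, and the last weighing compares single coins, so the simulation uses three weighings wherever the heavy coin is placed.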
If you are careful to explicitly state the properties of the operations
available in a problem, then you will be more likely to avoid the
inaccurate characterizations that frequently occur in problems such
as this one.
GOALS
Occasionally in school and more frequently in real-life problems, the
goal of a problem is not completely clear. Obviously, an important
step in representing the information in a problem is to be sure you
have a precise and correct definition of the goal. It is often
worthwhile to question whether or not you understood the goal
correctly, since it is sometimes easy to make a mistake in this
regard. As an example, consider the following logic problem:
The country of Marr is inhabited by two types of people, liars and
truars (truth tellers). Liars always lie and truars always tell the
truth. As the newly appointed United States ambassador to Marr, you
have been invited to a local cocktail party. While consuming some of
the native spirits, you are engaged in conversation with three of
Marr's most prominent citizens: Joan Landill, Shawn Farrar, and Peter
Gant. At one point in the conversation Joan remarks that Shawn and
Peter are both liars. Shawn vehemently denies that he is a liar, but
Peter replies that Shawn is indeed a liar. From this information can
you determine how many of the three are liars and how many are truars?
Stop reading and try to solve this problem, then read on.
In solving logic problems of this type, it is a common procedure to
list all of the possibilities in the form of one or more tables. For
example, in this problem one might attempt to fill out the following
table:
Person     Liar     Truar
Joan
Shawn
Peter
In a problem very similar to this one, a student in one of my
problem-solving classes attempted to solve the problem by filling out
just such a table. Perhaps you did this in the present problem.
However, in translating the problem into this form, a subtle
transformation has taken place with respect to the goal: namely, the
goal has been changed from determining how many of the three are liars
to determining whether each of the three persons is a liar or a truar.
If you try to answer the new version of the problem, you will never be
able to reach a solution! All you must determine is how many of the
three are liars. The correct answer is a number: 0, 1, 2, or 3. The
names of the people are largely irrelevant information, added to make
the problem appear more interesting and simultaneously to act as a
distraction. If you have not yet solved the problem, stop reading and
try again, then read on.
In fact, it is impossible to determine whether Shawn is a liar and it
is also impossible to determine whether Peter is a liar. However, one
can conclude that either Shawn is a liar and Peter a truar or Shawn is
a truar and Peter a liar. Either way one and only one of the men is a
liar. Thus, Joan must be a liar. This conclusion implies that there
are exactly two liars and one truar in the group of three natives to
whom you are talking. This answer solves the original problem but does
not allow you to completely fill out the table.
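The distinction between the two goals can be made concrete by listing every assignment of liar or truar to the three people and keeping the ones consistent with what was said. The sketch below is an illustration only; the encoding (True for truar, False for liar) and the function names are assumptions introduced here.

```python
from itertools import product


def consistent(joan, shawn, peter):
    """True if the three statements fit the assignment (True = truar)."""
    joan_says = (not shawn) and (not peter)   # "Shawn and Peter are liars"
    shawn_says = shawn                        # "I am not a liar"
    peter_says = not shawn                    # "Shawn is a liar"
    # A truar's statement must be true and a liar's must be false.
    return (joan == joan_says
            and shawn == shawn_says
            and peter == peter_says)


def solve():
    """Return the consistent assignments and the set of liar counts."""
    worlds = [w for w in product([True, False], repeat=3)
              if consistent(*w)]
    liar_counts = {w.count(False) for w in worlds}
    return worlds, liar_counts


if __name__ == "__main__":
    worlds, liar_counts = solve()
    print(worlds)
    print(liar_counts)
```

Two assignments survive, so the table of who is a liar cannot be filled out, yet both give the same number of liars, which is all the stated goal asks for.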
In addition to reading and rereading a problem to avoid
misunderstanding the goal, you should have a clear, precise statement
of the goal, rather than some vague formulation of it. Sometimes vague
statements of the goal are partly or completely due to unclear
statements of the goal in the problem, and sometimes the vague
formulation is due partly or completely to sloppy reformulation by the
problem solver. In either case, a vague formulation of the goal may do
considerable harm when you attempt to solve a problem. For example,
consider the cheap-necklace problem:
You are given four separate pieces of chain that are each three links
in length. It costs 2 cents to open a link and 3 cents to close a
link. All links are closed at the beginning of the problem. Your goal
is to obtain a single closed chain, using all links, at a cost of no
more than 15 cents.
The goal is to obtain a single closed chain, using all links. But what
does that mean? Is a single closed chain a simple loop or circle? Or
would multiple loops be satisfactory? Is it conceivable that a closed
chain might mean a long chain formed without joining the ends together
in a loop? Could there be some other variations on these
possibilities? Until you can decide which of the reasonable
possibilities constitutes the actual goal of the problem, you do not
really know what the problem is and cannot expect to make much
progress in solving it. Some people object to such deliberately vague
statements of goals, but vagueness regarding goals is often a feature
of real problem solving and you probably should get some experience in
dealing with this kind of problem. Whether in school or elsewhere, the
lesson is that you should be sure you have a precisely formulated and
accurate understanding of the goal.
In addition to having a precise and accurate understanding of the
goal, you will frequently find it helpful to have a more detailed
representation and understanding of the goal than may be provided in
the original statement of the problem. As Polya (1962, p. 7) has
emphasized, it may be useful to imagine for a moment that you have
already solved the problem and ask yourself, "What would I have?"
Polya variously calls this exercise wishful thinking or taking the
problem as solved. Whatever one calls this type of thinking, it
involves some sort of increase in the explicit representation of the
goal either in symbolic (verbal) or diagrammatic form. The usefulness
of increasing the specification of the goal may involve little more
than introducing names (labels, symbols) for concepts that appear in
the goal but are not explicitly represented in that form in the
original statement of the problem. Thus, one purpose of increasing the
specification of the goal is to introduce the necessary working
concepts for reaching the goal.
Another purpose is to derive some additional properties possessed
by the goal, either by rigorous inference from the information
contained in the goal and/or the givens of the problem or by
representing a reasonable conjecture (guess) based on one or another
heuristic consideration. In either case, deriving additional
properties of the goal may make it easier to reach the goal because
then you have a more specific idea of the different components that
you are attempting to achieve.
As an example of a problem where it is useful to represent the goal
explicitly in a diagrammatic form, consider the following geometry
construction problem:
Given an acute angle UVW and a point P within the angle, use a compass
and straightedge to construct a segment QR passing through P, such
that QP and PR stand in the ratio 2:1, Q and R lying on UV and VW,
respectively.
In solving this problem, it is, of course, useful to represent the
acute angle UVW and the point P within the angle as shown in Fig. 3-8.
However, although we do not yet know exactly where point Q lies on the
line UV or where point R lies on the line VW, it is useful to
explicitly represent the goal line QR. This representation is done by
drawing in a hypothetical dashed line, as shown in Fig. 3-8. The
solution to this problem will be discussed in more detail in Chapter
4, but the advantage of explicit representation of the line QR is that
you are more likely to see how to construct similar triangles
involving the line segment QR, and the construction of these similar
triangles is a critical step in solving the problem. Although the
placement of the line QR in Fig. 3-8 is certainly not expected to be
exactly correct, it gives you a more explicit representation of what
the final goal would look like. In this case, that makes it much more
probable that you will see certain relations that are critical in
solving the problem.
FIGURE 3-8
Explicit representation of goal line.
Increasing the specificity of the goal generally means more than
merely drawing an extra line or two in a figure or introducing a few
new symbols (important though this purely representative aspect may
be). Often it means deriving additional properties possessed by the
goal, using either the statement of properties of the goal as given in
the original problem or possibly also using given information to
derive properties of the goal (without necessarily achieving the
entire goal). A marvelous example of the importance of deriving
properties of the goal is provided by the following plane-geometry
problem:
Can two triangles have five of their six parts (three sides and three
angles) equal and yet the triangles not be congruent?
Stop reading and try to solve this problem, using the problem-solving
method of explicit representation of the goal and deriving properties
of the goal.
The first inference you might make is that, for two triangles to have
five of their six parts equal, this goal subdivides into two
alternative, more specifically stated goals: namely, the two triangles
have three sides and two angles equal, or the two triangles have two
sides and three angles equal. Having explicitly represented both
possibilities, it is easy to see that the first is impossible, since
two triangles with three equal sides must be congruent, by a theorem
of plane geometry. Thus, we need only consider the case where the two
triangles have two sides equal and all three angles equal. If you did
not solve the problem previously, stop reading and try again.
Another property that can be derived regarding the goal pair of
triangles is that the two triangles must be similar. This property is
a trivial restatement of the property already derived that the three
angles are equal. Nevertheless, restating this property using the
words "similar triangles" is quite helpful in bringing to mind a
useful representation and the proper theorems regarding the
relationship of corresponding parts in similar triangles. If you did
not solve the problem so far, stop reading and try again.
If you had not already done so, you should have introduced some
kind of diagrammatic representation of the two similar triangles that
constitute the goal, such as the diagram illustrated in Fig. 3-9. Besides
drawing the two similar triangles, you should also have labeled the
sides in a manner that easily reflects which sides are corresponding,
as is also shown in Fig. 3-9: namely, by using the same letter for
corresponding sides and distinguishing the two triangles by the
presence or absence of a prime. It is not strictly necessary for the solution
of this problem to label the angles, but it does not hurt. If you have not
solved the problem already, stop reading and try again.
Another inference that can be drawn regarding the goal is that the
two equal sides in the triangles ABC and A'B'C' will be
noncorresponding sides. This conclusion is clearly true (from the method of
contradiction, to be explained in Chapter 7), since, if the two sides
were corresponding, we should have three equal angles and two equal
corresponding sides in the two triangles, and such triangles are clearly
congruent, by several theorems of plane geometry. If you still have
not solved the problem, stop reading and try again.
Yet another relevant inference concerning the properties of the goal
is that the ratios of all corresponding sides of similar triangles are
equal. Thus, a'/a = b'/b = c'/c. If you have not solved the problem, stop
reading and try again.

FIGURE 3-9
Two similar triangles with two equal sides (not corresponding sides).
Another important set of inferences to be drawn from the goal are
the inequalities that hold between the lengths of the different sides
within each of the two triangles. Of course, it is completely arbitrary
which side we decide is longest, next longest, and shortest.
Nevertheless it is important to represent this information explicitly in solving
this problem. As indicated in Fig. 3-9, we have assumed that c ≥ b ≥ a
and c' ≥ b' ≥ a'. If you have not solved the problem thus far, stop
reading and try again.
Now it is useful to consider whether the goal triangles ABC and
A'B'C' could be equilateral or isosceles triangles. Since the triangles
are similar, if one is equilateral the other is equilateral, and if one is
isosceles the other is isosceles. Clearly, the triangles cannot be
equilateral, since then all three sides of triangle ABC would have to be
equal to all three sides of A'B'C', or else all three sides of triangle
ABC would have to be unequal to all three sides of triangle A'B'C'.
Neither case satisfies the goal constraint of having two equal sides.
In a somewhat similar manner, we can contradict the possibility that
the triangles ABC and A'B'C' are isosceles. If you have not already
done so, stop reading and prove this and attempt to solve the rest of
the problem.
In proving that the two triangles cannot be isosceles and in further
work in connection with this problem, it is useful to derive another
property of the goal triangles, namely, that triangle A'B'C' is bigger
than triangle ABC. Clearly, one is free to make this assumption
without any loss of generality, since the labeling of the triangles is purely
arbitrary. We can simply adopt a convention for convenience that the
A'B'C' triangle refers to the larger of the two similar triangles, no
matter what pair of similar triangles we choose to work with.
Incidentally, this trick of observing when one is free to assume certain
relations without any loss of generality comes up often enough in
problem solving to be worth taking special note of. In the present
instance, if the triangles are isosceles, there are two possible cases:
the two longest sides are equal (c = b and c' = b') or the two shortest
sides are equal (b = a and b' = a'). In the former case, the two larger
sides (c' and b') of triangle A'B'C' will be larger than any of the three
sides of triangle ABC. Thus, there cannot be two sides of A'B'C' equal
to two sides of ABC. Similarly, in the latter case, there will be two sides
(a and b) of triangle ABC that will be smaller than any of the three
sides of triangle A'B'C'. So, in the latter case, there also cannot be
two sides of triangle ABC that are equal to two sides of triangle
A'B'C'. Thus, we can assume that c > b > a and c' > b' > a'. That is,
the sides must have a strict inequality relationship among them, within
any given triangle. Now, if you have not solved the problem already,
stop reading and try again.
Again continuing to focus on the properties of the goal, we can derive
which of the two sides of triangle ABC must be equal to which of the
two sides of triangle A'B'C'. If you have not solved the problem, stop
reading and answer the question concerning which sides of triangle
ABC must be equal to which sides of triangle A'B'C'. Having answered
that question, try again to solve the problem, if you have not done
so already.
Since c' is the largest side of triangle A'B'C' and triangle A'B'C' is
larger than triangle ABC (c' > c, b' > b, a' > a), in the goal triangles,
b' must be equal to c and a' must be equal to b. That is, the largest
side (c') of triangle A'B'C' can have no side equal to it in triangle ABC,
and the smallest side (a) of triangle ABC can have no side equal to it
in triangle A'B'C'. Again, if you have not solved the problem already,
stop reading and try again.
It is now helpful to represent another concept connected with the
goal triangles, namely, the expansion ratio of the corresponding sides
of the two triangles: x = a'/a = b'/b = c'/c. Using the relationships
expressed in this series of equations, we can derive the equations
b' = xb and a' = xa. Recalling that in the goal triangles, b' must equal
c and a' must equal b, we can derive the expressions x = c/b and
x = b/a. From this fact we conclude that in the goal triangles c/b = b/a.
That is, the ratio of the large side to the middle side in the triangle ABC
must be equal to the ratio of the middle side to the small side of
triangle ABC.
Now all that is necessary is to realize that we have so completely
specified the goal pair of triangles in this case that we have everything
we need to solve the original problem. We know that two triangles can
have five of their parts equal, provided that those parts are three angles
and two sides and that the ratio of the large side of one triangle to the
middle side is equal to the ratio of the middle side to the small side of
the same triangle. Then clearly this relationship will hold for both
triangles, since the triangles are similar. In addition to triangle ABC
satisfying the relation c/b = b/a, it is also necessary that triangle
ABC satisfy the triangle inequality, namely, c < b + a. However, there
are an infinity of sets of three lines (a, b, c) that do indeed form a
triangle (satisfy the triangle inequality) and are in the relation c/b = b/a.
For each such triangle (ABC), one can construct exactly one larger
triangle (and one smaller triangle) such that five of the six parts of the
two triangles are equal. The required expansion of the triangle ABC is
obviously the factor x = c/b = b/a.
Thus, the original problem is solved. Note that in essence this was
a construction problem, though it was phrased more as if it were an
existence problem. However, in order to determine the existence of
pairs of triangles having five of their six parts equal, it was necessary
in this case to sketch a specific means by which such a pair of triangles
could be constructed. The strategy used in this proof was to indicate
a plan for constructing a pair of such triangles.
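The construction plan can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values a = 8 and x = 1.5 are assumptions chosen for illustration. With b = xa and c = xb, the triangle inequality c < b + a becomes x^2 < x + 1, so the sample ratio has to stay below roughly 1.618.

```python
import math

def angles(a, b, c):
    """Return the three angles (degrees, sorted) of a triangle with sides a, b, c."""
    def angle_opposite(opp, s1, s2):
        # Law of cosines: opp^2 = s1^2 + s2^2 - 2*s1*s2*cos(angle)
        return math.degrees(math.acos((s1**2 + s2**2 - opp**2) / (2 * s1 * s2)))
    return sorted([angle_opposite(a, b, c),
                   angle_opposite(b, a, c),
                   angle_opposite(c, a, b)])

def five_part_pair(a, x):
    """Build ABC with sides (a, a*x, a*x^2) and the larger A'B'C' scaled by x.

    The values of a and x are hypothetical inputs; x > 1 is assumed.
    """
    small = (a, a * x, a * x**2)
    if not small[2] < small[0] + small[1]:
        raise ValueError("triangle inequality violated: choose a smaller x")
    large = tuple(side * x for side in small)
    return small, large

small, large = five_part_pair(8, 1.5)
shared = sorted(set(small) & set(large))

print(small)   # (8, 12.0, 18.0)
print(large)   # (12.0, 18.0, 27.0)
print(shared)  # [12.0, 18.0]
print(all(math.isclose(p, q) for p, q in zip(angles(*small), angles(*large))))  # True
```

The two triangles share all three angles and two side lengths (12 and 18), yet their remaining sides (8 and 27) differ, so they are not congruent.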
This problem provides a truly remarkable example of the importance
in some problems of focusing on the goal and deriving properties of
the goal (drawing inferences concerning the goal). The number of times
that properties of the goal were represented or inferences were made
concerning the goal in this problem was unusually large. Practically
the entire problem-solving process consisted of representing or deriving
properties of the goal, in this case. The reason for this extensive
focusing on the goal was primarily that the givens were so unspecific:
they were all the axioms and theorems of plane geometry. The only
unique aspect of the problem that indicated what to select from all of
our knowledge of plane geometry was the goal. Thus, we necessarily
had to focus entirely on the goal, since it was the only unique aspect of
the problem. Said another way, the goal provided us a unique
beginning point from which to draw inferences, whereas if we had started
from the givens our first step would have been to write down any axiom
or theorem of plane geometry. Starting with the givens we would have
had little idea where to proceed. Therefore, good strategy in this
problem was to focus on the goal and make that goal progressively more
and more specific to indicate exactly what aspects of plane geometry
were relevant to solve the problem. In this case, once all the
properties of the goal were explicitly represented, it was trivial to solve the
problem from the beginning, by specifying a plan of construction.
A problem in a completely different context that illustrates the
usefulness of focusing on the goal and deriving some of its properties at
an early stage in problem solving is the following 63-link-chain problem:
Wanda the witch agrees to trade one of her magic broomsticks to Gaspar
the ghost in exchange for one of his gold chains. Gaspar is somewhat
skeptical that the broomstick is in working order and insists on a guarantee
equal in days to the number of links in his gold chain. To facilitate
enforcement of the guarantee, he insists on paying by the installment plan,
one gold link per day until the end of the 63-day period, with the balance
to be forfeit if the broomstick malfunctions during the guarantee period.
Wanda agrees to this request, but insists that the installment payment be
effected by cutting no more than three links in the gold chain. Can this
be done, and, if so, what links in the chain should be cut? The chain
initially consists of 63 gold links arranged in a simple linear order (not
closed into a circle).
Stop reading and try to determine an additional property possessed
by the goal that would be helpful to derive for use in solving the
problem.
The primary property of the goal that is useful in solving the problem
is the information that only three links need be cut to achieve the
goal. This property means that there will be at least three single links
in the goal state, namely, the three links that have been cut. We still
do not know how long the other lengths of chain will be (which single
links in the 63-link chain will be cut), but we can then begin to work
on the problem, knowing the lengths of three of the segments of chain
in the solution of the problem (the three single-link chains). This
problem also illustrates the subgoal method, and is discussed in this context
in more detail in Chapter 6. If you cannot wait until Chapter 6 to check
your solution to this problem, please turn ahead to pages 100-101.

FIGURE 3-10
Part of a famous chess problem. White to move and
to achieve mate in five moves.
In some problems, one cannot rigorously infer additional properties
possessed by the goal, but one can make reasonable conjectures based
on heuristic principles. For example, consider the following chess
problem, which constitutes one portion of a famous problem originated
by Sam Loyd. The problem is for white to achieve checkmate in five
moves from the starting position shown in Fig. 3-10. Stop reading and
try to solve this problem by guessing one or more plausible goal
positions in which black is checkmated and then try to determine how you
could achieve such a checkmate position.
It seems reasonable to conjecture that the checkmate position will
have white's rook at his own king's rook one. That is, white's rook
will be at the end of the open file where he has the black king trapped.
Black has the potential opportunity to interpose his bishop at two
places in that file between the conjectured position of white's rook and
black's king. However, the move sequences by which black can
interpose his bishop between the white rook and the black king can all be
frustrated by white in one way or another. The essential strategy for
solving the chess problem comes by conjecturing that you wish to
have the white rook at white's king's rook one, without the possibility
of black blocking the attack by his bishop, and then working forward
to determine what white must do at each move in order to achieve that
terminal checkmate position.
4
Classification of
Action Sequences
RANDOM TRIAL AND ERROR
The first thing that most people do when confronted with a problem is
to start applying the allowable operations to the givens in the problem.
Call this random trial and error. (Readers with a course in probability
should understand that what I am calling random trial and error is
equivalent to random sampling with replacement from the population
of action sequences less than or equal to some maximum length.) If a
very short sequence of such actions is sufficient to get from the givens
to the goal, even randomly generated sequences of actions may yield
the solution fairly quickly.
SYSTEMATIC TRIAL AND ERROR
To avoid going around in circles, it is obviously desirable to remember
what sequences of actions have been tried already with no success.
In addition, it is desirable to have a scheme for systematically
generating different sequences of actions, which guarantees that all sequences
(to some maximum length) will be generated. Most desirable of all
random trial-and-error schemes would be a generation method that
automatically produced a mutually exclusive and exhaustive listing of
all sequences of actions up to some maximum length. Call this
systematic trial and error (equivalent to random sampling without
replacement).
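The contrast between the two schemes can be sketched in a few lines of code. This is an illustrative sketch only: the three action names and the maximum length of 3 are assumptions, and the random generator picks a length first and then the actions, which is a simplification rather than uniform sampling over all sequences.

```python
import itertools
import random

def systematic_sequences(actions, max_len):
    """Yield every action sequence of length 1..max_len exactly once."""
    for length in range(1, max_len + 1):
        yield from itertools.product(actions, repeat=length)

def random_sequence(actions, max_len, rng):
    """Draw one sequence at random; the same sequence may be drawn again later."""
    length = rng.randint(1, max_len)
    return tuple(rng.choice(actions) for _ in range(length))

actions = ["a", "b", "c"]  # hypothetical action names

# Systematic trial and error: a mutually exclusive and exhaustive listing.
all_sequences = list(systematic_sequences(actions, 3))
print(len(all_sequences))  # 3 + 9 + 27 = 39

# Random trial and error: 39 draws generally cover fewer than 39 sequences,
# because nothing prevents repeats.
rng = random.Random(0)
draws = [random_sequence(actions, 3, rng) for _ in range(39)]
print(len(set(draws)), "distinct sequences in", len(draws), "random draws")
```

The systematic generator needs no memory of past attempts, because its ordering alone guarantees that nothing is repeated or skipped.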
From the above discussion, it should be relatively obvious that
there can be different degrees of systematicness between random and
completely systematic trial and error. A problem solver could have
some memory for past attempts, but it could be limited or subject to
error. A problem solver could also have different degrees of
effectiveness in systematically generating all of the different sequences of
actions. The degree of systematicness in the use of trial and error is
one useful indicator of the intelligence of different species of animals.
I think it is not known whether any species of animal below human
beings can be trained to be more systematic in their trial and error,
but humans certainly can be. People can overcome their memory
limitations by writing things down, and they can often invent mutually
exclusive and exhaustive generation schemes, though the difficulty
of accomplishing the latter varies from problem to problem.
CLASSIFICATORY TRIAL AND ERROR
The most powerful kind of trial and error is what might be called
classificatory trial and error, which requires that sequences of actions
be organized into classes that are equivalent (or probably equivalent)
with respect to the solution of the problem. That is, if one sequence of
actions within a class will solve the problem, then all the other
sequences of actions within the same class will probably also solve the
problem. Conversely, if one sequence of actions within the class can
be shown not to solve the problem, then probably every other sequence
of actions in the same class will also fail.
To appreciate the power of classificatory trial and error, consider a
state-action tree with n possible actions at each node of the tree.
With this representation, there are n^m possible sequences of actions
that are m actions in length. For even rather small values of n and m,
n^m can be so large as to prohibit the use of systematic trial and error.
Obviously, if n^m sequences of actions could be reduced to a small
number of equivalence classes (that is, classes that are equivalent
with respect to the solution of the problem), it would make the problem
much simpler to solve. In this case, you could systematically try one
sequence from each class until you found a class that solved the
problem. Such classificatory trial and error only works for problems where
sequences of actions fall into classes that are equivalent with respect
to solution of the problem, but most of the problems people solve
probably exhibit some such equivalences.
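To get a feel for how quickly n^m grows, a few sample values can be computed. The particular (n, m) pairs below are arbitrary choices for illustration, not values from the text.

```python
def count_sequences(n, m):
    """Number of action sequences of length m with n possible actions per node."""
    return n ** m

for n, m in [(3, 5), (5, 10), (10, 10)]:
    print(f"n={n}, m={m}: {count_sequences(n, m)} sequences")
# n=3, m=5: 243 sequences
# n=5, m=10: 9765625 sequences
# n=10, m=10: 10000000000 sequences
```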
There are four different categories of problems in which
classificatory trial and error is helpful, and within each of the four are
two subtypes. To discuss these types of problems, let us imagine that
the states reached by various action sequences applied to the given
information can be represented by a sequence of letters, where the first
letter stands for the action taken at the first node, the second letter for
action at the second node, and so on. Thus, abc represents the state
reached by taking action a at the first node, followed by action b at the
second node, followed by action c at the third node in the state-action
tree. The basic principle is that two or more action sequences are
equivalent if and only if they result in the same state or states thought
to be equivalent with respect to solving the problem.
The first major type of equivalence class of action sequences is the
obvious one that results from having equivalence classes of actions.
In this case, for example, let us imagine that a set of actions
{b_1, b_2, b_3, ...} = {b_i} are all identical or thought to be
equivalent. In this case, the action sequence ab_ic is equivalent to the
action sequence ab_jc, for all i and j. People usually have no trouble in
identifying equivalence classes of action sequences based on such
elementary equivalences of component actions. Such equivalent actions
often arise in problems where there are a large number of equivalent
givens, such as a large number of entities of the same type: for example,
six sticks identical in length and every other important property. When
the givens are not identical in every respect but are equivalent with
respect only to the properties that are thought to be important to the
problem, then recognition of such equivalences may be more difficult
and subject to error. In any event, identical or equivalent actions (often
resulting from identical or equivalent givens) produce the first type of
equivalence classes of action sequences.
A second and relatively familiar type of equivalence class of action
sequences arises in problems having commutative actions, that is,
where the result of taking action a followed by action b yields the same
result as taking action b followed by action a. If three actions (abc) are
all commutative with respect to one another, then action sequences
abc, acb, bac, bca, cab, and cba are all equivalent, since they result
in the same state when applied to the same given information or other
starting point in a problem. For example, in solving for x, given the
equation 5x + 17 = 3x + 21, we could subtract 3x from both sides of the
equation as the first step, then subtract 17 from both sides as the
second step; but the same result is achieved by performing the actions
in the reverse order. Even the final action of dividing both sides of the
equation by 2 could be commuted with respect to the other two
actions and equivalent results obtained.
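The commutativity of the two subtractions can be checked mechanically. The sketch below is illustrative only: the tuple encoding of an equation and the function names are assumptions introduced here, and it covers only the two subtraction steps, not the commuted division by 2.

```python
from fractions import Fraction
from itertools import permutations

# An equation  p*x + q = r*x + s  is stored as the tuple (p, q, r, s).
def subtract_3x(eq):
    p, q, r, s = eq
    return (p - 3, q, r - 3, s)

def subtract_17(eq):
    p, q, r, s = eq
    return (p, q - 17, r, s - 17)

start = (Fraction(5), Fraction(17), Fraction(3), Fraction(21))  # 5x + 17 = 3x + 21

# Apply the two actions in both possible orders and collect the end states.
results = set()
for order in permutations([subtract_3x, subtract_17]):
    eq = start
    for action in order:
        eq = action(eq)
    results.add(eq)

final = next(iter(results))
p, q, r, s = final
x = (s - q) / (p - r)  # the last step: divide both sides by 2

print(len(results))  # 1, since both orders reach the same state
print(final == (2, 0, 0, 4))  # True: the state 2x = 4
print(x)  # 2
```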
A third major way in which action sequences may be equivalent
occurs in problems where one or more actions have inverse actions. If
action a has an inverse action a^-1, then the result of applying action a
followed by action a^-1 is to leave the state of the problem identical to
what it was before the sequence aa^-1. Said another way, the sequence
of actions aa^-1 equals the identity action, which leaves the state of
the problem unchanged. If all the actions in some sequence are
commutative, then any co-occurrence of action a and its inverse a^-1
permits you to cancel both a and a^-1 from the sequence. For example,
if the actions a, a^-1, b, c, d are all commutative with respect to each
other, then the action sequence c a^-1 b d a is equivalent to the action
sequence cbd, since the a and a^-1 cancel each other.
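The cancellation rule for commutative actions with inverses can be sketched as a small reduction routine. The string convention "a^-1" for an inverse and the function name are assumptions made for this illustration; because the actions are assumed to commute, the reduced sequence is returned in sorted order as one representative of its class.

```python
from collections import Counter

def reduce_sequence(seq):
    """Cancel each action against its inverse, assuming all actions commute.

    Actions are strings; the inverse of "a" is written "a^-1".
    """
    net = Counter()
    for action in seq:
        if action.endswith("^-1"):
            net[action[:-3]] -= 1   # an inverse undoes one occurrence
        else:
            net[action] += 1
    reduced = []
    for base in sorted(net):
        count = net[base]
        if count > 0:
            reduced += [base] * count
        elif count < 0:
            reduced += [base + "^-1"] * (-count)
    return reduced

print(reduce_sequence(["c", "a^-1", "b", "d", "a"]))  # ['b', 'c', 'd']
```

The output lists b, c, d rather than c, b, d; under the commutativity assumption these are the same equivalence class.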
A good example of the power produced by the combined recognition
of commutativity and inverse actions in reducing the number of
different action sequences to a small set of equivalence classes is
provided by the six-arrow problem:
You are given six arrows in a row, the left three pointing up, and the right
three pointing down. The goal is to transform these arrows into an
alternating sequence such that the left-most arrow points up, the next arrow
to it points down, the next up, then down, then up, and then down. The
actions allowed are to simultaneously invert (turn upside down) any
two adjacent arrows. Note that you cannot invert one arrow at a time but
must invert two arrows at a time, and the two arrows must be adjacent.
The given and goal states are illustrated in Fig. 4-1. Achieve the
solution using the minimum number of actions (inversions of adjacent pairs).
Before reading further, try to solve this problem by determining the
very small number of different equivalence classes of action sequences.
1 2 3 4 5 6        1 2 3 4 5 6
↑ ↑ ↑ ↓ ↓ ↓        ↑ ↓ ↑ ↓ ↑ ↓
   Given               Goal
FIGURE 4-1
The six-arrow problem.
If you have not solved the problem, consider the following. In
representing the information given in the problem, you should note that
there are only five different possible actions that you can take at any
given stage of the problem: namely, to invert arrows 1 and 2, 2 and 3,
3 and 4, 4 and 5, or 5 and 6, as shown in Fig. 4-1. Of course, if every
different sequence of actions had to be considered, and you had no way
of knowing how long a possible sequence might be necessary to solve
the problem, then the problem could be extraordinarily difficult,
despite the limited number of actions available at each node. In fact, a
little reasoning concerning the equivalence of different action
sequences reduces the number of nonequivalent action sequences to an
extremely small number.
In the first place, you should note that the order in which you
perform the actions makes no difference. That is, the actions commute
one with another, so that inverting arrows 3 and 4 and then inverting
4 and 5 is completely equivalent to first inverting 4 and 5 and then
inverting 3 and 4. The same is true for any set of three or more actions
in a sequence. Thus, you do not have to deal with ordered sets of
actions but only with unordered sets. This statement simply means that
all the different orderings (permutations) of a given unordered set of
actions are equivalent, greatly reducing the number of
possible solutions to be considered. Now stop reading and try to solve
the problem, if you could not before.
If you still cannot solve the problem, consider the following hint.
An optimal solution will contain no more than one occurrence of any
given type of action. An action is its own inverse. Inverting arrows 2
and 3 twice leaves the arrows exactly the same as they were. Thus,
any pair of two occurrences of a given action can be canceled (even
if they are not adjacent, since the actions are completely commutative).
Any even number of occurrences of an action is equivalent to zero
occurrences of that action, and, for the same reason, any odd number
of occurrences is equivalent to a single occurrence. Thus, we need
consider only combinations (unordered sets) of from one to five
possible actions. At this point we have reduced a potentially infinite
number of different action sequences to 31 possible classes of action
sequences. Each of the 31 classes can be represented by its simplest
member, as follows: 5 single-step actions, 10 two-step action sequences,
10 three-step action sequences, 5 four-step action sequences, and
1 five-step action sequence. Stop reading and try to solve the problem,
if you did not before.
We can now observe that two of the five actions, namely, inverting
arrows 1 and 2 and inverting arrows 5 and 6, cannot possibly be
included in the optimal solution, since these actions change arrows 1
and 6, which are in the right position in the beginning state. To change
these end arrows back to the correct position would require another
use of exactly the same action, since no other actions change arrows
1 and 6. This solution cannot possibly be optimal, since it is equivalent
to not performing the action at all. Thus, we have reduced the number
of possible actions to consider at any node to three. The maximum
number of actions in a solution sequence is now reduced to three.
It is then a simple matter to rule out all of the one-step and two-step
action sequences, leaving only the single three-step action sequence
as a solution to the problem. Of course, as illustrated in Fig. 4-2,
there are actually six different action sequences that all achieve the
goal in the smallest number of steps (three). These six solutions differ
only in the order with which the three actions are applied. This
solution points out once again the existence of a large variety of action
sequences that are completely equivalent with respect to the solution
of the problem.
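The reduction to 31 classes can be confirmed by brute force. The sketch below is illustrative and not from the original text: the 1/0 encoding of up/down arrows, the zero-based pair indices, and all names are assumptions. It tries one representative of each class and then counts the orderings of the class that works.

```python
from itertools import combinations, permutations

GIVEN = (1, 1, 1, 0, 0, 0)   # 1 = up, 0 = down
GOAL = (1, 0, 1, 0, 1, 0)
# Zero-based adjacent pairs: (0, 1) means "invert arrows 1 and 2", and so on.
ACTIONS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

def invert_pair(state, action):
    """Turn the two arrows named by the action upside down."""
    arrows = list(state)
    for i in action:
        arrows[i] ^= 1
    return tuple(arrows)

def run(state, actions):
    """Apply a sequence of actions in order and return the final state."""
    for action in actions:
        state = invert_pair(state, action)
    return state

# One representative per equivalence class: every nonempty unordered set of actions.
classes = [subset for k in range(1, 6) for subset in combinations(ACTIONS, k)]
solutions = [subset for subset in classes if run(GIVEN, subset) == GOAL]
orderings = [p for p in permutations(solutions[0]) if run(GIVEN, p) == GOAL]

print(len(classes))    # 31
print(solutions)       # [((1, 2), (2, 3), (3, 4))], i.e. pairs 2-3, 3-4, 4-5
print(len(orderings))  # 6
```

Only one of the 31 classes reaches the goal, and all six orderings of its three actions succeed, matching the six solutions of Fig. 4-2.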
FIGURE 4-2
Six equivalent three-action sequences that solve the six-arrow problem.

State     Six equivalent solutions
          1 2 3 4 5 6     1 2 3 4 5 6     1 2 3 4 5 6
Given     ↑ ↑ ↑ ↓ ↓ ↓     ↑ ↑ ↑ ↓ ↓ ↓     ↑ ↑ ↑ ↓ ↓ ↓
          ↑ ↓ ↓ ↓ ↓ ↓     ↑ ↓ ↓ ↓ ↓ ↓     ↑ ↑ ↓ ↑ ↓ ↓
          ↑ ↓ ↑ ↑ ↓ ↓     ↑ ↓ ↓ ↑ ↑ ↓     ↑ ↓ ↑ ↑ ↓ ↓
Goal      ↑ ↓ ↑ ↓ ↑ ↓     ↑ ↓ ↑ ↓ ↑ ↓     ↑ ↓ ↑ ↓ ↑ ↓

Given     ↑ ↑ ↑ ↓ ↓ ↓     ↑ ↑ ↑ ↓ ↓ ↓     ↑ ↑ ↑ ↓ ↓ ↓
          ↑ ↑ ↓ ↑ ↓ ↓     ↑ ↑ ↑ ↑ ↑ ↓     ↑ ↑ ↑ ↑ ↑ ↓
          ↑ ↑ ↓ ↓ ↑ ↓     ↑ ↑ ↓ ↓ ↑ ↓     ↑ ↓ ↓ ↑ ↑ ↓
Goal      ↑ ↓ ↑ ↓ ↑ ↓     ↑ ↓ ↑ ↓ ↑ ↓     ↑ ↓ ↑ ↓ ↑ ↓

The fourth major type of equivalence class of action sequences arises
in problems where some arbitrary sequence of actions abc results in
an identical or equivalent state as some other sequence of actions
dafg, for example. You may not have any elegant theoretical definition
for all the equivalences of this fourth major type in a problem, but
nevertheless you should recognize such equivalences and take
advantage of them in speeding up the solution to the problem. In this fourth,
most general case of classificatory trial and error, what we are doing
is to define certain problem states and recognize that any action
sequence that achieves a given problem state is a member of the same
equivalence class. We may not know prior to executing an action
sequence that it will result in the same state as some already executed
action sequence; however, the advantage of recognizing the equivalent
state reached by both action sequences is that we need not continue
pursuing an action sequence dafg that arrives at the same state as
a previously executed action sequence abc, if we have already
determined that no path from that state is likely to reach the goal. Thus,
we can truncate in this way, recognizing that a different sequence of
actions has resulted in the same state, and, therefore, we need not
continue the action sequence from that point on, since it would be,
in essence, repeating sequences previously shown to be fruitless.
One practical way to implement this use of classificatory trial and error
is to give names or otherwise store in your memory a representation of
certain distinctive states reached in various attempts to solve the
problem.
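Storing the states already reached is what a search program does with a "seen" set. The following sketch uses breadth-first search, which is a technique chosen here for illustration rather than one named in the text, on a made-up toy problem (reach 10 from 1 using "add1" and "double"); the problem and all names are assumptions.

```python
from collections import deque

def search(start, goal, actions, apply_action):
    """Breadth-first search that expands each distinct state only once."""
    seen = {start}
    queue = deque([(start, ())])
    expanded = 0
    while queue:
        state, path = queue.popleft()
        expanded += 1
        if state == goal:
            return path, expanded
        for action in actions:
            nxt = apply_action(state, action)
            if nxt not in seen:   # skip states some other sequence already reached
                seen.add(nxt)
                queue.append((nxt, path + (action,)))
    return None, expanded

# Hypothetical toy problem: from 1, "add1" and "double" both lead to state 2,
# so the two one-step sequences fall into the same equivalence class.
TOY_ACTIONS = {"add1": lambda s: s + 1, "double": lambda s: s * 2}

def step(state, action):
    return TOY_ACTIONS[action](state)

path, expanded = search(1, 10, list(TOY_ACTIONS), step)
print(path, expanded)
```

Without the `seen` check, every sequence reaching an already-visited state would be continued again from that state, which is exactly the repetition the paragraph above advises truncating.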
A good example of the usefulness of explicitly identifying
distinctive states achieved in various attempts at solution of a problem is
provided by the railroad-siding problem:
You are given a circular railroad track that passes through a tunnel and
has one siding, as illustrated in Fig. 4-3. In the given state of the problem,
an engine rests on the siding and two cars, A and B, rest on the
circular track on opposite sides of the tunnel, as illustrated in Fig. 4-3. The
goal is to interchange the positions of cars A and B and have the engine
back on the siding. An important restriction is that only the engine can
pass through the tunnel; the cars cannot. Both cars and engine may rest
on a siding in any order and in any numbers. However, as with real-world
railroad sidings, a series of cars coming off the siding must go on to the
circular track in the direction of the tunnel; they cannot make the sharp
angle turn from the siding onto the circular track in the direction away
from the tunnel. Cars can be coupled and uncoupled from one another
or coupled or uncoupled from the engine at any point.

FIGURE 4-3
The railroad-siding problem: (a) given; (b) goal.
Stop reading and attempt to solve the problem by drawing a diagram
on a piece of paper and getting three distinguishable objects to act as
the engine and cars A and B.
You might consider there to be many distinguishable states in the
problem, where the states are defined to be the different arrangements
of the engine and cars A and B on the circular track and siding; however,
it might be extremely useful to consider a simple distinguishable
state in which only one car or engine rests in each of the three major
portions of the track, namely, the siding, the portion of the track in
a clockwise direction from the siding to the tunnel, and the portion of
the track in a counterclockwise direction from the siding to the tunnel.
Considering only these three different positions and limiting
consideration to those cases where only a single car or engine rests in each
of the three positions, there are only six possible configurations of the
three entities in the three positions. One is the given state, another is
the goal state, and the remaining four can easily be represented with
pencil and paper. The six possible configurations result from the
possibility of picking any of the three entities to fill the position on the
siding, then picking either of the two remaining entities to fill the upper
position on the circular track, which leaves only one remaining entity
to fill the lower position on the circular track. This yields 3 × 2, or six,
possible configurations. Having listed all six configurations, you might
find it useful to quickly classify your action sequences with respect
to whether they achieve as a subgoal any one of the four configurations
other than the given or goal configuration. All action sequences
that achieve any particular one of the four nonterminal configurations
are equivalent. Thus, you might set each of the four nonterminal
configurations as a subgoal, try to achieve it, and then see whether you
could get to the goal position from that particular nonterminal position.
In this way, you ensure a certain degree of variety in the action
sequences you take, so that you are not going around in circles. Now
stop reading and try to solve the problem again, if you did not before.
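
The count of six simple configurations can be checked mechanically. The following is a minimal sketch in Python, not part of the original problem; the position names and the labels "E", "A", and "B" are assumptions chosen for illustration, with car A taken to start on the upper portion of the track.

```python
from itertools import permutations

# Hypothetical labels: the three positions and the three entities.
POSITIONS = ("siding", "upper track", "lower track")
ENTITIES = ("E", "A", "B")  # engine, car A, car B

def simple_configurations():
    """Return every way to place exactly one entity in each position."""
    return [dict(zip(POSITIONS, order)) for order in permutations(ENTITIES)]

# Assumed given and goal states: the engine on the siding in both,
# with cars A and B interchanged.
GIVEN = {"siding": "E", "upper track": "A", "lower track": "B"}
GOAL = {"siding": "E", "upper track": "B", "lower track": "A"}

configs = simple_configurations()
nonterminal = [c for c in configs if c not in (GIVEN, GOAL)]

for c in nonterminal:
    print(c)  # the four candidate subgoal configurations
print(len(configs), len(nonterminal))
```

Each of the four printed configurations is one candidate subgoal; any action sequence can then be filed under the first of these it reaches.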
In my opinion, the ideal subgoal position to work for is to interchange
the engine and car A, placing the engine on the upper portion of the
circular track and car A on the siding, with car B remaining in the lower
position of the circular track. This subgoal configuration is probably
optimal because it is just slightly more than halfway between the given
state and the goal state in terms of the sequence of steps needed to
solve the problem. Note that, although this intermediate state is being
called a subgoal state, there is no sense in which we have any reason
to think that this state is closer to the goal than the given state. Thus,
we are not really using the subgoal method, as this will be defined in
Chapter 6. Rather, the basis for selecting this configuration as a state
to work toward is simply that getting to it from the given state and
getting from it to the goal state each represent equivalence classes
of action sequences. Stop reading and try again to solve the problem.
Consider an extension of the method of classifying action sequences
with respect to states achieved, namely, identifying action sequences
in terms of the sequence of states achieved. In the present problem,
it turns out that the solution sequence passes through five of the six
simple configurations that we distinguished. The sequence of five
simple configurations in the solution of the problem is shown in Fig.
4-4. By following this series of subgoals, you should be able to solve
the problem. This problem provides an excellent illustration of the
utility of identifying equivalent action sequences by states achieved
or sequences achieved, since it is rather easy to get mixed up
concerning where one is and where one is going with all the involved
movements of cars and engines required in order to solve the problem. If
one does not identify landmarks along the way, it is easy to go around
in circles.
FIGURE 4-4
A sequence of simple configurations in the solution to the
railroad-siding problem.
So far, the examples that have been presented have illustrated only
cases where several different action sequences were considered
equivalent because they achieved precisely identical states. In some
problems, it is useful to consider action sequences to be equivalent when
they achieve states that are considered equivalent with respect to
solution of the problem, despite the fact that these states may not be
identical in every respect. The states (and therefore the action sequences
that lead to them) are considered equivalent because they all have
certain properties in common that we judge to make the states equivalent
insofar as solving this particular problem is concerned (though the
states might well not be judged equivalent with respect to solving some
other problem). Such classification of states (and action sequences)
as equivalent is, of course, more dangerous than the completely safe
classification of states as equivalent when the states are identical.
If our judgment is faulty concerning which properties are relevant and
irrelevant to the solution of the problem, then our judgment that all
members of some equivalence class will fail to solve the problem may
be faulty. Nevertheless, our judgment concerning relevant and
irrelevant properties is generally sufficiently good that such equivalence
classes are generally quite useful. An example of this type of
equivalence classification of action sequences, along with some of the
identity-based equivalence classification of action sequences, is provided
by the cheap-necklace problem:
You are given four separate pieces of chain that are each three links in
length (see left side of Fig. 4-5). It costs 2¢ to open a link and 3¢ to close
a link. All links are closed at the beginning of the problem. Your goal is
to join all 12 links of chain into a single circle (see right side of figure)
at a cost of no more than 15¢.
Stop reading and try to solve the problem by defining equivalence
classes of action sequences based on the achievement of equivalent
states.
If you did not solve the problem, consider these hints. There is an
implicit operation of inserting one link into an open link that has a cost
of 0 attached to it. In addition, there is another implicit operation of
detaching an open link from a closed link that also has a cost of 0.
These operations are only implicitly specified. Now try to solve the
problem again, if you did not before because you did not explicitly
represent these operations. If that was not your difficulty, read on.
FIGURE 4-5
The given and goal states for the cheap-necklace problem. The given
state consists of four separate chains, A, B, C, and D.
Having represented all the important givens and operations in the
problem, let us examine how many different types of action sequences
there might be that achieve the goal of getting all the links into a closed
chain (circle). The one type of action sequence that virtually everyone
considers first is to open an end link of one chain (for example, chain
A), insert an end link of another chain into it (for example, chain B),
close the joining link, open an end link of the combined (6-link) chain,
insert another 3-link chain (for example, chain C) into it, close the
joining link, open an end link of the combined (9-link) chain, insert the
last 3-link chain (chain D) into it, close the joining link, open an end
link of the combined (12-link) chain, insert the other end link, and
close the joining link to form a closed chain. However, this action
sequence opens and closes four links, so it costs 4 × (2¢ + 3¢) = 20¢,
which exceeds the limit of 15¢.
There are a large number of action sequences that are essentially
equivalent to the one just mentioned, which we might refer to as the
end-to-end action sequence. Obviously, it makes no difference which
3-link chain we start with, add on second, or add on third. In addition,
it makes no difference which end links we open at various stages of the
problem. If we have explicitly noticed the equivalence of all these
different action sequences, we are in a favorable position to discover
whether there are any action sequences that are not equivalent to
end-to-end that might result in the solution. If you have so far not
solved the problem, stop reading and try again.
It might occur to you that, after opening a link, you could insert
two end links into it or insert a middle link. These actions as a part of
any action sequence (which did not later essentially reverse the effect
of these actions) would undeniably result in an outcome that was not
equivalent to end-to-end. However, a little inspection of the nature
of the goal to be achieved reveals that, in the final closed chain, there
are no links that have more than two links inserted in them. Thus, all
methods that insert two end links (of chains with two or more links)
or a middle link (of a chain with three or more links) into an end link
of any chain with two or more links could not produce the goal state.
(This statement is true unless the critical action were later reversed,
and reversing actions seems very unlikely to be a part of the correct
solution, in view of the cost limitation.)
Assuming that we have heuristically rejected all action sequences
that involve actions that result in three or more links being inserted
inside another link, are there any other types of action sequences
that are not essentially equivalent to end-to-end? If you have not yet
solved the problem, stop reading and try again, then read on.
Yes, there is exactly one other type, and it is the solution to the
problem. For some reason, this type of action sequence does not occur
to many people very quickly, but if you have ruled out all examples of
the previously mentioned two equivalence classes of action sequences,
then you are in a rather favorable position for discovering this
remaining type of action sequence. At least, you are not wasting time trying
out many specific examples of each of the two large classes of action
sequences that do not work.
Perhaps you have already discovered this remaining type of action
sequence and perhaps you have not. In any event, this type of action
sequence could be called destroying a chain. The action sequence is as
follows: Open a link of one (3-link) chain (for example, chain A), detach
the link from that chain (chain A), insert the link into ends of two
different 3-link chains (for example, B and C), close the joining link;
open another link of the chain one is destroying (chain A), insert that
link into an end of the combined (7-link) chain and an end of the
remaining 3-link chain (D), close the joining link; open the last link of
the first chain (A), insert the ends of the combined (11-link) chain into
it, and close the joining link. The cost of this type of action sequence
is exactly 15¢, solving the problem within the cost limitation.
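
The two classes of action sequences can be compared by counting joining links, since every joining link is opened once and closed once. The sketch below is an illustration under that assumption; the function names are hypothetical and the shortcut in `links_to_open` relies on the destroyed chain having exactly as many links as there are chains left to join.

```python
OPEN_COST = 2   # cents to open one link (from the problem statement)
CLOSE_COST = 3  # cents to close one link (from the problem statement)

def sequence_cost(links_opened):
    """Cost in cents when every opened link is later closed again."""
    return links_opened * (OPEN_COST + CLOSE_COST)

def links_to_open(pieces, destroy_one_chain):
    """Number of joining links needed to form one closed circle.

    Closing `pieces` separate chains into a circle takes one joining
    link per chain. Destroying one chain leaves one chain fewer to join,
    and (here) supplies exactly that many loose joining links.
    """
    return pieces - 1 if destroy_one_chain else pieces

end_to_end = sequence_cost(links_to_open(4, destroy_one_chain=False))
destroying = sequence_cost(links_to_open(4, destroy_one_chain=True))
print(end_to_end, destroying)
```

Under these assumptions the end-to-end class needs four joining links and the destroying-a-chain class needs three, which is the whole difference between exceeding and meeting the limit.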
MACROACTIONS
Now consider a sequence of actions that starts with some given state
and achieves some other state. Call that a microaction sequence.
Furthermore, consider any other sequence of microactions that starts
from the same given state and achieves the same terminal state to be
a member of an equivalence class of microaction sequences. Call such
an equivalence class a macroaction. Thus, a macroaction is defined
to be an equivalence class of sequences of microactions, though in
some cases the equivalence class may consist of only one member.
Defining one or more macroactions based on the microactions
specified in the problem is sometimes a significant aid in solution. If you
like silly analogies, it is like wearing seven-league boots and taking
giant steps instead of baby steps. Often, when one has defined
macroactions, the number of such macroactions necessary to go from the
givens to the goal is extremely small. This means a small state-action
tree, with only a relatively small number of distinct possible
macroaction sequences to test. Systematic trial and error will often be quite
adequate to solve the problem from this point on.
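
The definition of a macroaction as an equivalence class can be made concrete with a toy problem. The sketch below is an illustration only: the grid world, the microactions "N" and "E", and the function names are assumptions, not anything specified in the text. It enumerates all microaction sequences of a fixed length and groups them by the terminal state they achieve.

```python
from collections import defaultdict
from itertools import product

# Hypothetical microactions on a grid: one step north or one step east.
MICROACTIONS = {"N": (0, 1), "E": (1, 0)}

def step(state, name):
    """Apply one microaction to an (x, y) state."""
    dx, dy = MICROACTIONS[name]
    return (state[0] + dx, state[1] + dy)

def macroactions(start, length):
    """Group all microaction sequences of `length` by terminal state.

    Each key is a terminal state; each value is the equivalence class
    (macroaction) of microaction sequences from `start` that reach it.
    """
    classes = defaultdict(list)
    for sequence in product(MICROACTIONS, repeat=length):
        state = start
        for name in sequence:
            state = step(state, name)
        classes[state].append(sequence)
    return dict(classes)

macros = macroactions((0, 0), 2)
for terminal, sequences in macros.items():
    print(terminal, sequences)
```

Here four microaction sequences collapse into three macroactions, because "N then E" and "E then N" reach the same state; with longer sequences the reduction in the size of the state-action tree grows quickly.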
Defining macroactions from sequences of microactions does have
one possible difficulty, namely, that application of a macroaction
might take one past the goal. In problems with destructive operations,
one would have sped past the goal much as an express subway train
speeds past a local subway stop. You would not even have the
opportunity to see the goal for an instant as you passed through it,
whereas at least on the express subway there is some chance that you
could see your local stop as you sped through it.
Even with nondestructive operations, the effect of applying a
macroaction that took you past the goal could be much the same as
with destructive operations. To be sure, even when you have gone
past the goal, you still have achieved it in some sense, when the
problem involves nondestructive operations. However, if you do not
know that the goal was achieved, because you never wrote down the
goal expression, then in a practical sense, you have not achieved
the goal.
As an example of the successful use of macroactions and
classificatory trial and error, consider the set of possible action sequences
involved in reducing an equation of the form ax + b = cx + d to an
equation of the form x = (d − b)/(a − c). There are a number of different
microactions in different orders that will serve to reduce this equation. One
could subtract b from both sides, then subtract cx from both sides,
and then divide both sides by (a − c), but one could also add −cx to
both sides, then multiply both sides by 1/(a − c), and then subtract
b/(a − c) from both sides, and so on. To someone experienced in
solving such simple algebra problems, this problem may seem pretty trivial,
but many students learning elementary algebra for the first time find
these problems difficult. One reason for this difference is that the
experienced linear-equation reducer sees all these different action
sequences as equivalent, while many of the inexperienced linear-equation
reducers have yet to learn this fact. Although probably few people
actually think in terms of a single equation for linear-equation
reduction, what an experienced linear-equation reducer has in his head is
essentially equivalent to a single macroaction for problems of the
above type, namely, ax + b = cx + d → x = (d − b)/(a − c). The
experienced linear-equation reducer probably goes through a short
sequence of microactions to solve such a problem, but such a person does
this in one of a very small number of completely routinized ways, with,
at most, a single choice of one of the equivalent microaction sequences.
This statement is what is meant by thinking in terms of a single
macroaction rather than a sequence of microactions.
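
The equivalence of the two microaction sequences just described, and their agreement with the single macroaction, can be checked directly. This is a minimal sketch; the tuple representation of an equation and all function names are assumptions introduced for illustration.

```python
from fractions import Fraction

def macroaction(a, b, c, d):
    """ax + b = cx + d  ->  x = (d - b)/(a - c), in one step."""
    if a == c:
        raise ValueError("a and c must differ for a unique solution")
    return Fraction(d - b) / Fraction(a - c)

# An equation p*x + q = r*x + s is stored as the tuple (p, q, r, s).
def add_constant(eq, k):
    """Microaction: add the constant k to both sides."""
    p, q, r, s = eq
    return (p, q + k, r, s + k)

def add_x_term(eq, k):
    """Microaction: add k*x to both sides."""
    p, q, r, s = eq
    return (p + k, q, r + k, s)

def multiply(eq, k):
    """Microaction: multiply both sides by k."""
    return tuple(term * k for term in eq)

def sequence_one(a, b, c, d):
    """Subtract b, subtract cx, then divide by (a - c)."""
    eq = (Fraction(a), Fraction(b), Fraction(c), Fraction(d))
    eq = add_constant(eq, -b)
    eq = add_x_term(eq, -c)
    return multiply(eq, 1 / Fraction(a - c))

def sequence_two(a, b, c, d):
    """Add -cx, multiply by 1/(a - c), then subtract b/(a - c)."""
    eq = (Fraction(a), Fraction(b), Fraction(c), Fraction(d))
    eq = add_x_term(eq, -c)
    eq = multiply(eq, 1 / Fraction(a - c))
    return add_constant(eq, -Fraction(b) / Fraction(a - c))

print(sequence_one(5, 3, 2, 12), sequence_two(5, 3, 2, 12))
print(macroaction(5, 3, 2, 12))
```

Both sequences end in the state (1, 0, 0, x), that is, x alone on the left and the value given by the macroaction on the right, which is what it means for them to belong to the same equivalence class.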
Another example of the usefulness of thinking in terms of
macroactions occurs in geometry construction problems. In such problems,
you are allowed to use a compass, an unmarked straightedge (no
gradations as on a ruler), and, of course, pencil and paper. The microactions
that you have available are then to draw arcs of circles and
straight-line segments. In learning how to solve geometry construction
problems, you first learn what sequences of microactions allow you to
achieve certain states, such as constructing a perpendicular to a line
at a given point, constructing a perpendicular bisector of a line
segment, constructing an angle bisector, or constructing a parallel to a
given line through a given outside point. Thereafter, in more complex
geometry construction problems, you think in terms of what sequence
of these macroactions is necessary in order to solve these geometry
construction problems rather than in terms of the original microactions
of drawing arcs and circles and straight-line segments, though you must
use a sequence of such microactions in achieving each macroaction.
However, in constructing the basic plan for solving a more complex
geometry construction problem, the use of a macroaction is extremely
helpful. As an example of how thinking in terms of such macroactions
simplifies geometry construction problems, consider the following
problem, which was previously discussed briefly in Chapter 3.
Given an acute angle UVW and a point P within the angle, use a compass
and straightedge to construct a segment QR passing through P, such
that QP and PR stand in the ratio 2:1, Q and R lying on UV and VW,
respectively.
Of course, the line QR shown in Fig. 4-6 is not part of the given state
but is rather the goal to be achieved. The line QR has simply been
drawn in the figure to facilitate thinking about the problem. Now stop
reading and attempt to solve the problem, thinking in terms of
geometry construction macroactions.
If you did not solve the problem, consider the following hint. The
line segments QP and PR will be in the ratio 2:1 if and only if the
line segments QR and PR are in the ratio 3:1. This
hint merely exemplifies a relatively trivial inference made from the
given information, though in this case this trivial transformation of
the given information can be of considerable help in solving the
problem. Stop reading and try to solve the problem, if you did not before.
If you have not yet solved the problem, consider the following
additional hint. One way to make the line segments QR and PR be in
the ratio 3:1 is to make them corresponding parts of similar triangles,
one of whose other sides is known to be in the ratio 3:1. Since PR is
part of the line segment QR, the obvious choice for similar triangles
would be to construct a parallel line to the line UV through the point
P, producing a little triangle MPR (illustrated in Fig. 4-7), which would
be similar to the big triangle VQR. Of course, we have not yet
determined the line QR, so this operation is still to be done in order to
solve the problem. However, any line QPR drawn through the point P
will now result in triangle MPR being similar to triangle VQR. Thus,
all that remains is to determine which line QPR will result in a similar
triangle in which the ratios of the sides are 3:1.
Note that constructing a parallel to a given line through a given
outside point is not an elementary microoperation, but rather a
macrooperation that requires a sequence of microoperations to be achieved.
However, in planning the solution of the problem, we need not bother
to explicitly carry out the sequence of microoperations necessary to
achieve this macrooperation. Stop reading and try to solve the
problem, if you have not done so.
FIGURE 4-6
Construct QR such that QP = 2 × PR.
FIGURE 4-7
Constructing a parallel (PM) to UV through the point P.
Placing the line segments QR and PR in a 3:1 ratio is the same as
placing the line segments VR and MR in the ratio 3:1. The latter
is equivalent to placing the line segments VM and MR in the ratio 2:1.
The length of VM is already determined. Therefore, all we need is to
determine the length MR (which determines the point R). Since MR
is half the length of VM, what we need is to determine what half the
length of VM is. By this time it should be clear what macroaction will
allow us to determine half of VM and mark it off from point M along
the line MW to determine the point R. Stop reading and see if you can
determine this macroaction and then solve the problem.
The required macroaction for determining half the line segment VM
is to construct a perpendicular bisector to the line segment VM. This
procedure determines the midpoint of VM, from which one can
determine the length of half the segment VM by measuring from the midpoint
to either point V or point M with the compass. Then we simply hold
the compass at this setting, place one end of the compass at point M,
and mark the other point along the line MW to determine point R.
Having determined point R such that the line segment VM is in the ratio
2:1 to the line segment MR, we have now uniquely determined the line QPR
such that the ratio of the line segment QP to the line segment PR is 2:1,
and the problem is solved.
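
The plan can be checked numerically by replacing each geometric macroaction with its coordinate equivalent. The following is a sketch under assumed coordinates (the particular points V, U, W, and P are arbitrary choices, and the helper names are hypothetical); it is a verification of the plan, not a compass-and-straightedge procedure.

```python
import math

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def scale(p, k):
    return (p[0] * k, p[1] * k)

def cross(p, q):
    return p[0] * q[1] - p[1] * q[0]

def intersect(p, d, q, e):
    """Intersection of the line p + t*d with the line q + s*e."""
    t = cross(sub(q, p), e) / cross(d, e)
    return add(p, scale(d, t))

def construct_qr(V, U, W, P):
    """Follow the three macroactions of the plan in coordinates."""
    u, w = sub(U, V), sub(W, V)
    # Macroaction 1: parallel to UV through P, meeting VW at M.
    M = intersect(P, u, V, w)
    # Macroaction 2: bisect VM, then mark half of VM off from M along VW.
    R = add(M, scale(sub(M, V), 0.5))
    # Final step: the line through R and P meets UV at Q.
    Q = intersect(R, sub(P, R), V, u)
    return M, R, Q

# Assumed example: an acute angle at V with P inside it.
V, U, W, P = (0.0, 0.0), (1.0, 3.0), (4.0, 0.0), (2.0, 1.5)
M, R, Q = construct_qr(V, U, W, P)
ratio = math.dist(Q, P) / math.dist(P, R)
print(M, R, Q, ratio)
```

For these coordinates the ratio QP : PR comes out as 2 : 1 up to rounding, as the similar-triangle argument predicts.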
For any readers without experience in plane geometry construction
problems or who have forgotten what they learned, a bit of
instruction concerning the achievement of the two principal macroactions
used in this problem may be helpful in making the solution of the
problem completely concrete.
To construct a parallel to the line UV through the point P, use the
arbitrarily drawn line QPR in Fig. 4-6. Place one point of the compass
at point Q and draw an arc through the lines QU and QP. The arc may
be of any reasonable radius. Now draw an arc of the same radius
around the point P. Now use the compass to measure the distance
between the intersections of the arc around Q with QU and
QP. To do this operation, place one point of the compass at the
intersection of this arc with the line QU. Now, keeping the same radius
with the compass, place one point of the compass at the intersection of
the arc around the point P with the line PR and measure off the same
distance along that arc. This point when connected to point P will
produce a line parallel to the line VQU.
To achieve the macroaction of bisecting line segment VM, place the
compass at point V and draw an arc around V intersecting the line
VM at a point more than halfway between V and M. Now draw the
same radius arc around the point M, intersecting the line VM and
intersecting the arc around V once above the line and once below the
line. Connecting the two intersecting points for the arc around V
and the arc around M results in a perpendicular bisector to the line
VM and therefore determines the midpoint of line VM.
Considering how complex especially the first of these two
macroactions is, in terms of the sequence of required microactions, it is
clear why it facilitates planning the solution of the problem to think
in terms of macroactions rather than the sequences of microactions
necessary to achieve them.
Knowing which sequences of microactions or equivalence classes
of sequences of microactions to define as macroactions appears to
depend very heavily (if not completely) on specific knowledge of the
area from which the problem is taken. General problem-solving analysis
makes it clear what the potential value is in defining macroactions,
but it does not tell you which macroactions to define in any given
problem area. This characteristic is very frequently the nature of the
relationship between general problem-solving methods and specific
knowledge; that is, general problem-solving methods direct you
toward the type of specific knowledge that you should acquire and
motivate you to acquire this knowledge by demonstrating their
usefulness in the solution of problems.
Incidentally, there are other ways to solve this problem, using other
geometric macroactions. So, if you thought of a different way to solve
the problem, it may well be correct.
GETTING OUT OF LOOPS
Beginning problem solvers frequently run out of ideas to apply to a
problem. Highly skilled problem solvers often experience the opposite
difficulty; they have too many ideas and are forced to choose among
a variety of possible approaches to the problem. I have devoted some
space in this book to this question of deciding among many possible
problem-solving methods, but I have been mainly concerned with the
matter of providing the student with a rich variety of general methods
for attacking problems to ensure that you do not spend a great deal of
time staring at a problem without getting any ideas.
Sometimes you may have no ideas at all as to how to solve a
problem, but more frequently you will run out of new ideas after having
tried various methods, none of which worked. In such cases, you may
repeatedly think of the inadequate methods for solving the problem
and get the feeling that you are going around in circles. When you are
caught in a loop like this, it is obviously time to do something different
from what you have been doing. But how? In many cases that seems
to be just the trouble: you are in a series of loops, thinking of the same
inadequate ideas over and over again.
An excellent first step in getting out of a loop and doing something
different is to analyze what you have been doing. You must determine
the attributes (properties) of the approaches you have been taking.
Usually when you make an effort to characterize what you have been
doing in trying to solve a problem, you can immediately think of some
ways to approach the problem differently. Often, what is critical is to
step back and think about what you have been doing rather than think
about the problem itself.
There are two basic levels at which this analysis of your
problem-solving methods can take place: (a) the level of the specific action or
action sequences specified in the problem (classifying action sequences)
and (b) the level of general problem-solving methods (classifying
problem-solving methods). In each case, after you have characterized
what actions or methods you have used, you should ask what other
classes of actions or methods seem remotely applicable to the problem.
At the level of general problem-solving methods, there are generally
many specific ways to implement any given general method in any
particular problem. What are the properties of the way you have
chosen? Could you construct an alternative way that had different
properties? Is there any information that is explicitly or implicitly
a part of the problem that has not been explicitly represented? What
kind of information has been used? Can you think of any alternative
representation of this same information?
At the action-sequence level, a good example of the usefulness of
classifying action sequences to get out of loops is provided by the
nine-dot four-line problem:
Without your pencil leaving the paper, draw four straight lines through
the following three-by-three array of nine dots (see Fig. 4-8).
FIGURE 4-8
The nine-dot four-line problem.
Stop reading and try to solve this problem.
If you are like many amateur problem solvers, you may have
produced a number of attempted solutions such as those shown in Fig. 4-9.
Although there are many different ways you can produce incorrect
solutions of this type, you can get the feeling rather quickly that they
fall into a small number of classes, all of which are incorrect. You may
feel you are going around in circles, producing attempted solutions
that are of the same character as your previous tries and getting no
closer to solution with each attempt. When you reach such a stage, it
is well to try to determine the properties of your attempted methods
of solution. If you ask what all these action sequences have in common,
one answer is that they all keep the four lines within the perimeter of
the three-by-three array of nine dots. If you examine the given
information in the problem, it is clear that this restriction to the perimeter of
the array is not a part of the problem. Thus, it is permissible to attempt
solutions in which the lines extend beyond the perimeter of the array
of dots, and, with this insight, the solution is readily achieved as
illustrated in Fig. 4-10.
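
A candidate solution that leaves the perimeter can be verified mechanically. The sketch below assumes the dots sit at integer coordinates 0 to 2 and checks one well-known four-line path; the specific path is an assumption for illustration and is not necessarily the one drawn in the figure.

```python
# Assumed coordinates: the nine dots at (x, y) with x and y in 0..2.
DOTS = {(x, y) for x in range(3) for y in range(3)}

# One well-known path: five vertices, hence four connected straight lines.
PATH = [(0, 0), (3, 0), (0, 3), (0, 0), (2, 2)]

def on_segment(p, a, b):
    """True if point p lies on the closed segment from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False  # p is not collinear with a and b
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))

# Consecutive vertices share endpoints, so the pencil never leaves the paper.
segments = list(zip(PATH, PATH[1:]))
covered = {d for d in DOTS for a, b in segments if on_segment(d, a, b)}
leaves_array = any(not (0 <= x <= 2 and 0 <= y <= 2) for x, y in PATH)

print(len(segments), covered == DOTS, leaves_array)
```

The check confirms that the path uses four segments, passes through all nine dots, and has vertices outside the array, which is exactly the property that the incorrect attempts share in lacking.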
FIGURE 4-9
Incorrect solutions to the nine-dot four-line problem.
FIGURE 4-10
Correct solution to the nine-dot four-line problem.
INCUBATION
When you have been going around in circles and wish to do something
different to try to solve a problem, probably the most frequently given
piece of advice is to put the problem aside for several minutes, hours,
or days, and work on something else or get a good night's sleep before
coming back to the problem. This is good advice, though in an
examination situation the maximum period of time you can let any problem
incubate is, of course, set by the time limitations of the exam. But
even then it may be best to work on other problems and come back
later to the more difficult ones, so that you will not spend too much
time on difficult problems and fail to finish a number of easier
questions. In addition, even a few minutes or tens of minutes spent solving
other problems may give you a fresh perspective for solving problems
you found difficult on the first attempt.
I must confess that incubation is not one of my favorite
problem-solving methods, primarily, I suppose, because, when one is forced
to use it, it indicates that all the other general problem-solving methods
have failed. However, when you have tried a large number of
approaches to a problem with no success, there comes a point at which
even the most skilled problem solver should undoubtedly put the
problem aside for a few hours or days and come back to it later. This is
true even though a skilled problem solver may still be able to generate
new ideas concerning how to solve the problem.
Psychologists do not understand why incubation is useful in solving
problems. The difficulty in explaining the beneficial effects of
incubation on problem solving is not that we lack any ideas concerning
possible mechanisms for the effect. On the contrary, there are too many
possible mechanisms for the beneficial effects of incubation on
problem solving.
First, you may be quite generally fatigued after you have worked
on a problem for a long time, and coming back to it in a fresher state
of mind seems likely to be beneficial (though again we do not
understand the mechanisms of general intellectual fatigue or the need for
sleep and so on).
Second, there may be more specific intellectual fatigue or
interference in the use of your memory because of the large number of
incorrect actions you have taken in trying to solve the problem. The
passage of time filled with intervening activities provides an
opportunity for these interfering memories to fade away. Only the most
valuable lessons you have learned remain in the foreground of your
mind when you go back to the problem, with a host of lesser interfering
associations having decayed to a low level. It is not clear that this
sort of memory loss should necessarily be beneficial to problem
solving, but it well might be.
Third, when you come back to the problem, you have an altered
memory and a new set of things on your mind as a result of the
intervening activity. These new associations and new cues may well
result in the retrieval of new ideas from memory concerning how to solve
the given problem. This explanation is probably the single most
plausible reason for the success of the method of incubation.
There is a fourth, somewhat more exotic possibility, namely, that
a person's mind goes on unconsciously working the problem all during
the long incubation period. Either because the unconscious mind has
a long time to work on the problem or because something special is
added by unconscious problem solving, the problem manages to get
solved in this way, when conscious problem solving has failed. In any
event, the unconscious problem solving may modify memory in a manner
that facilitates conscious problem solving at a later time. There
is not one shred of evidence for this explanation of incubation, whereas
the first three possible mechanisms are all extensions of previously
established psychological principles. Nevertheless, many psychologists
believe in unconscious problem solving. I am very skeptical on
the matter, but that is primarily a matter of philosophical preference.
In any event, incubation often works, whatever the mechanism.
THEORY

State Evaluation
and Hill Climbing
In the last chapter we reduced the amount of trial-and-error search
in a problem by constructing equivalent state-action trees of reduced
size. In this chapter, we discuss a very different way of reducing the
number of state-action sequences that have to be searched before
achieving the solution. The method has two parts: (a) defining an
evaluation function over all states including the goal state and
(b) choosing actions at any given state to achieve a next state with an
evaluation closer to that of the goal. Picking an action on the basis
of such a local evaluation of its consequences is known as hill
climbing, since evaluation functions are frequently defined so that the
goal state has the maximum value on some one-dimensional evaluation
function.
Figure 5-1 illustrates the application of state evaluation and hill
climbing to the state-action tree for some unknown problem with a
hypothetical evaluation function defined over each state. The value of
the function for each state is written inside the circle for each node
(state). This example arbitrarily uses an integer-valued evaluation
function, with the beginning state having value 0, the goal state having
value 10, and nongoal states having values intermediate between 0
and 10. Application of a one-step hill-climbing method to this
state-action tree with this evaluation function yields the sequence of
action choices shown by arrows in Fig. 5-1. You will note that hill
climbing need not succeed in achieving the goal the first time, and this
time it did not.

FIGURE 5-1
State-action tree with an integer-valued evaluation function defined
over every state (node). One-step hill climbing results in the action
sequence shown by the arrow. Note that, in this case, hill climbing
does not achieve the goal state. (Labels in the figure: beginning state,
state levels, goal state, hill-climbing result.)
Having failed to achieve the goal by hill climbing in the first attempt,
there are many things you can do to achieve the goal, still using hill
climbing. You could try choosing the action with the next-to-best
value at one of the various nodes on the original hill-climbing path,
use strict hill climbing at all other nodes, and see if you achieved
the goal with any of these minimal violations of the general
hill-climbing method. In the present instance, this minimal modification
of hill climbing would succeed if you took the next-to-best action going
from state level 0 to state level 1, because, from that point on, hill
climbing results in an action sequence that achieves the goal.
Alternatively, you could try two-step hill climbing and choose the
sequence of two actions at any given node that resulted in a node
with the greatest value. This two-step hill climbing would produce the
goal the first time in the problem shown in Fig. 5-1.
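
The contrast between one-step and two-step hill climbing can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the tree and its values below are invented (they are
not the tree of Fig. 5-1), the names TREE, VALUE, lookahead, and
hill_climb are hypothetical, and the sketch takes one action at a time
while looking ahead the given number of steps, which is a close variant
of choosing a whole sequence of two actions.

```python
# Hypothetical state-action tree (NOT the tree of Fig. 5-1).
# Each node maps to the nodes reachable by one action.
TREE = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "goal"],
    "a1": [], "a2": [], "b1": [], "goal": [],
}
# Invented integer evaluation: beginning state 0, goal state 10.
VALUE = {"start": 0, "a": 5, "b": 3, "a1": 6, "a2": 4, "b1": 2, "goal": 10}


def lookahead(node, depth):
    """Best evaluation found `depth` actions beyond `node` (or at a leaf)."""
    if depth == 0 or not TREE[node]:
        return VALUE[node]
    return max(lookahead(child, depth - 1) for child in TREE[node])


def hill_climb(start, steps=1):
    """Descend the tree, picking the action whose `steps`-step outlook is best."""
    path = [start]
    node = start
    while TREE[node]:
        node = max(TREE[node], key=lambda child: lookahead(child, steps - 1))
        path.append(node)
    return path


one_step = hill_climb("start", steps=1)  # follows the locally best value
two_step = hill_climb("start", steps=2)  # looks one action further ahead
print(one_step, two_step)
```

In this invented tree the one-step climber is drawn toward the node
valued 5 and ends at a nongoal leaf, while the two-step climber accepts
the lower value 3 at the first level because the goal lies just beyond it.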
Finally, you could question the evaluation function you had defined
over the states in the problem. There is usually no way to be certain
that you have defined the evaluation function that is ideal for
representing progress in achieving the goal in any given problem.
Sometimes the failure of hill climbing suggests that a reexamination of
the (explicit or implicit) evaluation function is in order. Evaluation
functions are generally not given in the problem (except in optimization
problems), and so any evaluation function can be chosen to see if it
works in conjunction with hill climbing (or some other problem-solving
method) to produce the solution to the problem.
Sometimes when hill climbing is used in conjunction with a state
evaluation function, a real-valued (numerical) evaluation is defined
for each state. In other cases, you may have some ability to compare
several states and judge which is closer to the goal, but no actual
numbers are assigned to the states. Whether or not numbers are
assigned to states, two states can have equivalent evaluations, and so
you could not choose between them.
So far we have discussed problems with only a single-valued
(one-dimensional) state-evaluation function, but there are also problems
where the goal differs from the beginning state on several dimensions.
In these cases, it is usually possible to make judgments regarding
closeness to the goal on each of the dimensions separately, but there
may be no single, necessarily optimal way to combine the evaluations
on each separate dimension into a single overall evaluation of each
state. Thus, you could have a vector-valued evaluation function
assigned to each state, as shown in Fig. 5-2.
There are a number of hill-climbing options in regard to vector-valued
evaluation functions, such as that shown in Fig. 5-2. You could
try various alternation schemes, that is, hill climbing on one
dimension for a while and then hill climbing on another dimension for a
while. Obviously, when no improvement is possible on a particular
dimension by any action that you could take from the node where you
are currently located, you should hill-climb on a different dimension
for at least that node. If you have reached the goal with respect to
one dimension, you should also hill-climb on other dimensions. In
using these alternation schemes, it helps to keep records of the nodes
where you could have chosen to improve on a different dimension
than the one you did choose. When the first hill-climbing path through
the state-action tree fails to produce the solution, these nodes where
you had good alternative choices are the obvious places to back up
to and start new paths.
FIGURE 5-2
State-action tree with a two-dimensional vector-valued evaluation
function defined over every state (node). In this case, the goal state
has the evaluation vector (5, 4), and the beginning state has the
evaluation vector (0, 0). The path taken by a hill-climbing method
depends on whether you hill-climb on weighted summed components or try
some alternation scheme. In the former case, the exact weighting of the
two component values is also important in determining the path taken by
hill climbing. (Labels in the figure: beginning state, state levels,
goal state.)
Another approach to multidimensional evaluation functions is to
combine the values on the separate dimensions into a single overall
value for each state. If there is some single most natural way to
combine them, do it that way first; but remember that, no matter how
natural the combination method might be, it could be the wrong way
to combine the values on the different dimensions, that is, wrong for
achieving the solution by one-dimensional hill climbing. If the
originally chosen combination method fails to work, try some other
method of combination, alternation schemes with the original
multidimensional evaluation function, multistep hill climbing, defining
a new evaluation function, or the like.
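
How much the choice of weights matters can be seen in a small Python
sketch. Everything here is assumed for illustration: the two candidate
next states and their evaluation vectors are invented (only the goal
vector (5, 4) comes from Fig. 5-2), and the function names are
hypothetical.

```python
def weighted_value(vector, weights):
    """Collapse an evaluation vector into one number by a weighted sum."""
    return sum(w * v for w, v in zip(weights, vector))


def pick_next(candidates, weights):
    """One-dimensional hill climbing on the weighted sum of the components."""
    return max(candidates, key=lambda vec: weighted_value(vec, weights))


# Two invented next states on the way toward a goal evaluated at (5, 4).
candidates = [(3, 1), (1, 2)]

equal_weights = pick_next(candidates, (1, 1))   # sums are 4 and 3
second_heavy = pick_next(candidates, (1, 3))    # sums are 6 and 7
print(equal_weights, second_heavy)
```

With equal weights the first candidate looks better; weighting the
second dimension three times as heavily reverses the choice, so the path
taken by hill climbing depends on the combination method as much as on
the component values themselves.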
APPLICATIONS
Examples of the use of state-evaluation functions and hill climbing
abound in problem solving. For instance, when you plan a trip across
the country on a map, you initially examine roads that go in nearly
the right direction. The right direction is the direction that reduces
the distance between where you are and where you are going at the
fastest rate. Of course, choosing the road at the beginning of a trip
that goes closest to the right direction may prove to be a bad choice.
This road may eventually lead to a dead end or require you to go far
out of the way to reach the goal. In addition, planning a trip on a map
usually involves other considerations (speed, scenery, or other
properties) besides finding the shortest road between the starting
and ending points. These considerations place you in the position of
doing hill climbing on a vector-valued evaluation function. Despite
all these complications, experience suggests that hill climbing is a
prominent method used in solving trip-planning problems with a map.
Pencil-and-paper maze problems are rather similar to trip-planning
problems on a map, and people frequently use hill climbing in an
attempt to solve them. However, challenging maze problems are usually
deliberately constructed to frustrate a hill-climbing approach. Maze
problems frequently require nonoptimal choices at early and middle
stages of the solution and may even require detours (increases in the
distance from the goal, as measured by the most obvious evaluation
function of physical distance). On the other hand, maze problems
usually do not involve considerations of road speed or scenic beauty.
Defining an explicit evaluation function and employing hill climbing
is also useful in solving the one-heavy-coin problem discussed in
Chapter 3:
You have a pile of 24 coins. Twenty-three of these coins have the same
weight, and one is heavier. Your task is to determine which coin is heavier
and to do so in the minimum number of weighings. You are given a beam
balance (scale), which will compare the weight of any two sets of coins
out of the total set of 24 coins.
A suitable evaluation function for solving this problem would be the
number of coins whose classification as heavy or light is known. At the
beginning of the problem, the value of the function is zero, since none
of the 24 coins is known to be either heavy or light. In the goal state,
the heavy-light classification of all 24 coins is known, so the value of
the function is 24. Thus, a hill-climbing approach would choose an
action at each node that maximized the number of coins whose
heavy-light classification is known.
A very large number of alternative actions are present at each node.
For example, at the first node, you might weigh any one of the coins
against any two of the other coins. In general, you might weigh any
set of m coins against any set of n coins, where $m \ge 1$, $n \ge 1$,
and $m + n \le 24$. The number of different pairs of sets of m and n
coins that satisfy the restriction that $m + n \le 24$ is extremely
large. However, the most elementary consideration of the previously
mentioned evaluation function and the hill-climbing approach
immediately rules out all actions that do not involve weighing two sets
containing equal numbers of coins in the two pans of the beam balance.
This exclusion reduces the number of alternative actions considerably.
Furthermore, using the method of defining equivalence classes of
actions discussed in Chapter 4, note that, at the first node of the
problem, you have no way to distinguish different subsets of n coins;
thus, you must consider any two sets of n coins to be equivalent to each
other (in their likelihood of containing the heavy coin). This
consideration reduces the number of different alternative actions at the
first node to 12, that is, a set of 12 coins is weighed against a set of
12 coins, a set of 11 coins against another set of 11 coins, 10 against
10, and so on, down to 1 against 1.
If you explicitly inquire which of these 12 alternative actions results
in the greatest number of known coins following the first weighing,
you should be led to select the optimal action at the first node, that
is, to weigh a set of 8 coins against another set of 8 coins, since this
maximally increases the value of the evaluation function from 0 known
coins to 16 known coins following the first weighing, whatever the
outcome of the first weighing.
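
The comparison of the 12 equal-sized weighings can be checked with a
short Python sketch. It assumes that "whatever the outcome" is read as
the worst case over the outcomes of a weighing; the function name and
the `total` parameter are hypothetical.

```python
def known_after_first_weighing(k, total=24):
    """Worst-case number of classified coins after weighing k against k."""
    # Tilt: the heavy coin is one of the k coins on the low pan, so the
    # other total - k coins are all known to be light.
    known_if_tilt = total - k
    # Balance: the 2k coins on the pans are all known to be light.
    known_if_balance = 2 * k
    return min(known_if_tilt, known_if_balance)


# Evaluate the 12 alternatives: 1 against 1 up to 12 against 12.
scores = {k: known_after_first_weighing(k) for k in range(1, 13)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Under this reading, weighing 8 against 8 is the only choice that
guarantees 16 known coins: smaller sets leave too many coins unweighed
if the pans balance, and larger sets leave too many suspects if they
tilt.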
The same sort of evaluation function and hill-climbing approach
can be used to solve more complex coin-weighing problems, such as
those involving two heavy coins or one coin that might be either heavier
or lighter than the other coins. When the coins are classified into
three or more categories (for example, heavy, medium, and light),
then it may be useful to use as an evaluation function the number of
coin-classification pairings (for example, coin 1 is heavy, coin 2 is
medium, coin 3 is light) that have been ruled out.
In all of the coin-weighing problems, from the simplest to the most
complex, keep in mind that, after a given weighing, the value of the
evaluation function may be different for the different outcomes of
the weighing. In such cases, the value of the evaluation function for
a particular weighing is usually best considered to be the expected
value of the evaluation function across all different outcomes, where
the value of the evaluation function for each outcome is weighted by
the probability of obtaining that outcome. Thus, after the first
weighing of eight coins against eight coins in the previously mentioned
one-heavy-coin problem, the optimal choice in the second weighing is
either to weigh two coins against two coins or three coins against three
coins. In either case, the three outcomes of the weighing (tilt left,
balance, tilt right) are not equally likely, nor does each outcome
result in an equivalent increase in the number of known coins. For
example, with the three-against-three weighing (out of the eight
remaining coins), the probability of their balancing evenly is
$\frac{1}{4}$, while the probability of tilt left is $\frac{3}{8}$, and
the probability of tilt right is $\frac{3}{8}$.
For simplicity, let us use as the evaluation function the number of
unknown coins, where the goal state has a value of zero unknown coins.
Thus, hill climbing, in this case, means attempting to minimize the
value of the evaluation function. Using this evaluation function, the
value of a balanced outcome in the three-against-three weighing is
two remaining unknown coins, while the value of tilt left is 3 and the
value of tilt right is also 3. The overall evaluation of the
three-against-three weighing, then, is
$(\frac{3}{8} \cdot 3) + (\frac{3}{8} \cdot 3) + (\frac{1}{4} \cdot 2) = 2\frac{3}{4}$.
The three-against-three weighing produces the minimum expected
value on the evaluation function. This fact can be seen by computing
the expected value for the other three plausible weighings, namely,
one against one, two against two, and four against four. The
two-against-two weighing is almost as good as the three-against-three
weighing, by this evaluation function. The two-against-two weighing
has an expected value of
$(\frac{1}{4} \cdot 2) + (\frac{1}{4} \cdot 2) + (\frac{1}{2} \cdot 4) = 3$.
The four-against-four weighing has an expected value of
$(\frac{1}{2} \cdot 4) + (\frac{1}{2} \cdot 4) = 4$. The
one-against-one weighing has the poorest expected value of all, namely,
$(\frac{1}{8} \cdot 0) + (\frac{1}{8} \cdot 0) + (\frac{3}{4} \cdot 6) = 4\frac{1}{2}$.
In terms of achieving the goal of determining the one heavy coin out
of 24 in the minimum number of weighings, either the
three-against-three weighing or the two-against-two weighing is optimal
on the second weighing. Thus, in this case, hill climbing is a
successful problem-solving method, since it chooses one of the two
actions that will lead to the goal with the minimum number of total
actions (weighings).
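
The expected values for the second weighing can be reproduced with
exact fractions. The sketch below is a hedged illustration: it assumes
that eight equally likely candidates remain after the first weighing,
that a group narrowed to a single coin counts as zero unknown coins, and
the function names are hypothetical.

```python
from fractions import Fraction


def unknown_coins(group_size):
    """Unknown coins left when the heavy coin is confined to a group.

    Assumption: a group of one coin is fully identified, so it counts as 0.
    """
    return 0 if group_size <= 1 else group_size


def expected_unknown(k, candidates=8):
    """Expected number of unknown coins after weighing k against k."""
    p_tilt = Fraction(k, candidates)                      # for each direction
    p_balance = Fraction(candidates - 2 * k, candidates)  # heavy coin set aside
    return (2 * p_tilt * unknown_coins(k)
            + p_balance * unknown_coins(candidates - 2 * k))


results = {k: expected_unknown(k) for k in (1, 2, 3, 4)}
best_k = min(results, key=results.get)
print(results, best_k)
```

The four values come out as 9/2, 3, 11/4, and 4 for the one-, two-,
three-, and four-coin weighings, so three against three has the lowest
expected number of unknown coins, with two against two close behind.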
Solving simple linear equations provides another example of the
possibility of successful use of hill climbing in problem solving.
Consider the linear equation 9x + 7 = 5x + 15 as the given, with an
expression of the form x = ____ being the goal. The blank, ____,
represents some currently unknown real number that constitutes the value
of x in the solution to the equation.
Initially, we might define a four-valued vector evaluation function
for this problem, consisting of the coefficients of the x and numerical
terms on the left-hand side of the equation and the x and numerical
terms on the right-hand side. For the linear equation above, then,
the value of the evaluation function at the given state would be (9, 7,
5, 15). The value of the evaluation function for the goal state is (1, 0,
0, ____), where ____ again indicates that we do not currently
know what real number is acceptable in this position. We might choose
actions at each step designed to increase the number of terms of this
four-valued vector evaluation function that are in agreement with
the corresponding terms of the evaluation function for the goal. Thus,
if we subtract 5x from both sides of the equation, the evaluation
function is changed to (4, 7, 0, 15), which is known to disagree with
the evaluation function for the goal in only the first two positions (the
agreement of the value in the fourth position with the desired value in
the goal expression cannot be determined). Subsequently, subtracting
7 from both sides of the equation changes the evaluation function to
(4, 0, 0, 8), which disagrees with the goal expression in only one
position (the first). Finally, dividing both sides of the equation by 4
yields the evaluation function value (1, 0, 0, 2), which is known to
disagree with the evaluation function for the goal in zero positions.
The state achieved at this point that includes the expression x = 2
constitutes the solution to the problem.
Rather than think of this at all in terms of a four-valued vector
evaluation function, we can simply think of the number of "bad"
terms in the expression. Initially there are three bad terms. After
subtracting 5x from both sides of the equation (obtaining 4x + 7 = 15),
there are only two known bad terms. After subtracting 7 from both sides
(obtaining 4x = 8), there is only one known bad term. Finally, after
dividing both sides of the equation by 4 (obtaining x = 2), there are
no bad terms, and the problem is solved.
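
The "bad terms" count lends itself to a small hill-climbing sketch in
Python. The representation is an assumption made for illustration: a
state (a, b, c, d) stands for the equation ax + b = cx + d, the three
action functions and their names are hypothetical, and ties between
equally good actions are broken by the order in which the actions are
listed.

```python
from fractions import Fraction


def bad_terms(state):
    """Count positions that disagree with the goal form (1, 0, 0, ____)."""
    a, b, c, _ = state
    return (a != 1) + (b != 0) + (c != 0)


def subtract_x_term(state):
    a, b, c, d = state
    return (a - c, b, Fraction(0), d)      # subtract cx from both sides


def subtract_constant(state):
    a, b, c, d = state
    return (a, Fraction(0), c, d - b)      # subtract b from both sides


def divide_by_coefficient(state):
    a, b, c, d = state
    if a == 0:
        return state                        # nothing sensible to divide by
    return (a / a, b / a, c / a, d / a)


ACTIONS = [subtract_x_term, subtract_constant, divide_by_coefficient]


def solve(state):
    """Strict hill climbing: always take an action that lowers bad_terms."""
    state = tuple(Fraction(v) for v in state)
    trace = [state]
    while bad_terms(state) > 0:
        best = min((act(state) for act in ACTIONS), key=bad_terms)
        if bad_terms(best) >= bad_terms(state):
            break                           # no action improves; give up
        state = best
        trace.append(state)
    return trace


trace = solve((9, 7, 5, 15))                # 9x + 7 = 5x + 15
print([tuple(str(v) for v in s) for s in trace])
```

For the equation above, the trace passes through (4, 7, 0, 15) and
(4, 0, 0, 8) before reaching (1, 0, 0, 2), with the number of bad terms
falling from three to zero one step at a time.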
It may be somewhat difficult for someone experienced in solving
such simple linear equations to imagine that anyone actually uses this
sort of evaluation function and hill climbing in order to solve so
simple a problem. However, this approach could be used, and, very
likely, many beginning algebra students unconsciously use just such
a method in solving their initial linear-equation problems.
The more experienced linear-equation solver very likely thinks of
the problem in terms of three subgoals, namely, getting all the x terms
on the left side of the equation, getting all the numerical terms on the
right side of the equation, and dividing through by the coefficient of
the x term. However, this subgoal method (to be described in detail
in the following chapter) uses the same sort of evaluation function as
used by the hill-climbing approach to linear-equation problems.
Once you are an experienced solver of linear equations you probably
never think of evaluation functions, subgoals, or hill climbing at all
but simply solve the problem using the same type of action sequence
you have used in solving other such problems, namely, subtract the
x term on the right-hand side of the equation from the x term on the
left-hand side of the equation, then subtract the numerical term on
the left-hand side of the equation from the numerical term on the
right-hand side of the equation, and finally divide through by the
coefficient of the x term. (This problem-solving method, knowing how
to solve a problem because you recognize its relationship to other
problems you solved previously, will be discussed in Chapter 9.)
Thus, there are many different problem-solving methods that can all
lead to roughly the same sequence of actions in solving a simple
linear-equation problem. This simple problem is discussed primarily to
communicate what is meant by such concepts as evaluation functions,
hill climbing, subgoals, relations between problems, and the like.
Furthermore, hill climbing is frequently used to solve more complex
equations or sets of equations, using as an evaluation function some
measure or measures of the discrepancy in form between some equation
you have produced and the goal equation. Thus, in solving
equations involving exponential terms with the unknown in the
exponent, a solver often takes logs of both sides of the equation to
increase the similarity of the resulting equation to the goal equation
(since in the goal equation the unknown is not in the exponent). In
solving differential equations, you can integrate to get rid of the
differential terms, and in solving integral equations, you can solve
for the integrals or else differentiate in order to get rid of integrals,
and so on.
The six-arrow problem discussed in Chapter 4 to illustrate the power
of noticing equivalence classes of action sequences provides a very
good example of a problem in which you can define multiple (at least
three) different evaluation functions. The three evaluation functions
differ considerably in their effectiveness for a hill-climbing approach.
Recall that the six-arrow problem is as follows:
You are given six arrows in a row, the left three of which are pointing up
and the right three of which are pointing down. The goal is to transform
these arrows into an alternating sequence such that the left-most arrow
points up, the next arrow to it points down, the next up, then down, then
up, and then down. The actions allowed are to simultaneously invert
(turn upside down) any two adjacent arrows. Note that you may not
invert one arrow at a time but must invert two arrows at a time, and the
two must be adjacent. The given and goal states are illustrated in Fig. 5-3.
You are to achieve the solution using the minimum number of actions
(inversions of adjacent pairs).
Stop reading and try to define three different evaluation functions
that might be relevant to solving this problem by hill climbing, then
read on.
1 2 3 4 5 6        1 2 3 4 5 6
↑ ↑ ↑ ↓ ↓ ↓        ↑ ↓ ↑ ↓ ↑ ↓
   Given               Goal
FIGURE 5-3
The six-arrow problem.
The most obvious evaluation function is probably the number of
arrows that are in the same position as in the goal state. This
evaluation function starts out at four in the given state and ends at
six in the goal state. However, this most obvious evaluation function
turns out to be of no help whatsoever in solving the problem at any of
the early stages. For example, of the five alternative actions you might
take at the beginning state, four leave the evaluation function
unchanged at four and only one action, inverting arrows 3 and 4,
decreases the evaluation (from four to two). Even this limited degree of
discrimination among actions is of negative value in solving the
problem, since inverting arrows 3 and 4 is an action that is, in fact,
desirable to perform at some stage in solving the problem, whereas
inverting arrows 1 and 2 and inverting arrows 5 and 6 are actions that
should not be performed at any stage. Even if you choose to invert
arrows 2 and 3 or invert arrows 4 and 5 at the first step, this
evaluation function again provides no assistance in choosing the correct
action at the second step. It is only when you have chosen the correct
two beginning actions that the evaluation function could immediately
tell you which action to choose at the third step, a fact that would be
obvious in any event. Having read this discussion of one evaluation
function, you might stop reading for a bit and try to generate some
additional evaluation functions, if you are not satisfied with the ones
you have thought of so far.
A somewhat different evaluation function that is considerably more
useful in solving the problem is to count the number of runs of arrows
(consecutive arrows with identical orientation). This evaluation
function starts out at two runs for the beginning state and ends at six
runs for the goal state. In the solution shown in Fig. 5-4, this
evaluation function was not increased in going from the beginning state
to the next state, but was increased at each of the two remaining
states. In other solutions to the problem, the number of runs might be
increased at the first step, held constant at the second step, and
finally increased again at the third step. Thus, the number of runs is a
more useful evaluation function in conjunction with the hill-climbing
approach than is the number of arrows in the goal position.

State            No. of arrows in   No. of runs   Distance between
                 goal position      of arrows     two incorrect arrows
Beginning              4                 2                 3
(second state)         4                 2                 2
(third state)          4                 4                 1
Goal                   6                 6                 -

FIGURE 5-4
The values of three different evaluation functions for each successive state
in a solution to the six-arrow problem.
However, the evaluation function that is optimal in conjunction with
the hill-climbing approach to this problem is to consider the distance
between the two incorrectly placed arrows and attempt to reduce that
distance. Probably you would arrive at such an evaluation function,
in essence, by working backward (see Chapter 7) and noting that you
could solve the problem if you had all the arrows correctly positioned,
except two incorrectly positioned arrows that were adjacent to each
other. In fact, you might note that, since an action always changes
the position of two arrows, the final step must necessarily be to change
two arrows (both of which are incorrectly positioned) to being
correctly positioned. Thus, if more than one step is required, you know
for certain that you could not get five arrows correctly positioned
and be able to solve the problem. Hence, there is no point to the first
evaluation function; you should focus instead on what you need to do
in order to achieve the subgoal of putting the two incorrectly
positioned arrows adjacent to each other. In essence, this procedure
defines the third evaluation function, which is the distance between the
two incorrectly positioned arrows. Note that, in the given state, the
value of this evaluation function is 3, and the successive actions in
a correct solution to the problem can reduce this to 2 and then to 1,
from which the final action is obvious.
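
The three evaluation functions can be compared directly in a short
Python sketch. The encoding is an assumption for illustration: arrows
are booleans (True for up), positions are 0-based, the function names
are hypothetical, and states with a number of incorrect arrows other
than zero or two are simply given an infinite distance so that hill
climbing avoids them.

```python
UP, DOWN = True, False
GIVEN = (UP, UP, UP, DOWN, DOWN, DOWN)
GOAL = (UP, DOWN, UP, DOWN, UP, DOWN)


def invert_pair(state, i):
    """Invert the adjacent arrows at 0-based positions i and i + 1."""
    arrows = list(state)
    arrows[i] = not arrows[i]
    arrows[i + 1] = not arrows[i + 1]
    return tuple(arrows)


def arrows_in_goal_position(state):
    return sum(a == g for a, g in zip(state, GOAL))


def runs(state):
    """Number of maximal blocks of consecutive identical arrows."""
    return 1 + sum(state[i] != state[i + 1] for i in range(len(state) - 1))


def distance_between_incorrect(state):
    wrong = [i for i, (a, g) in enumerate(zip(state, GOAL)) if a != g]
    if not wrong:
        return 0
    if len(wrong) != 2:
        return float("inf")   # assumption: treat these states as worst
    return wrong[1] - wrong[0]


def hill_climb(state, evaluate):
    """Repeatedly take the action that most reduces `evaluate`."""
    path = [state]
    while state != GOAL:
        nxt = min((invert_pair(state, i) for i in range(5)), key=evaluate)
        if evaluate(nxt) >= evaluate(state):
            break             # no improving action
        state = nxt
        path.append(state)
    return path


path = hill_climb(GIVEN, distance_between_incorrect)
for s in path:
    print(arrows_in_goal_position(s), runs(s), distance_between_incorrect(s))
```

Hill climbing on the distance reaches the goal in three actions, while
along the same path the count of arrows in goal position stays at four
until the last step and the number of runs stays at two after the first
step.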
Another problem that illustrates the possibility of defining several
plausible evaluation functions for the solution of a problem by hill
climbing is the following discrimination reversal problem:
In the one-dimensional world of Lineland, there are two races of "people":
whites and blacks. As in our three-dimensional world, the whites have
for a very long time discriminated against the blacks. However, of late,
the blacks have been making some gains in the area of social justice, in
some cases obtaining judgments from courts and legislatures that a
certain degree of reverse discrimination should obtain for a period of
time, as symbolic retribution to blacks and as a lesson to whites
concerning the evils of discrimination. One of the areas in which the
blacks have just now achieved a court decision ordering discrimination
reversal is in the matter of bus travel. In the past, whites have always
ridden in the front of the bus and blacks in the back. Now the court has
just ordered that for a time blacks will ride in the front and whites in
the back.
When the order took effect, there was one seven-passenger bus that was
already loaded with three blacks in the last three seats, three whites in
the next three seats, and the front seat empty. The bus is automatic,
requiring no driver (steering is not required in Lineland). All this is
illustrated in Fig. 5-5. Since the order had already gone into effect,
the police insisted that the blacks and the whites must reverse
positions completely.
Of course, people in Lineland are able to move to adjacent positions in
their linear world, so a person could move to an empty adjacent seat in
the bus. However, in addition, Linelanders have invented a special device
that allows them to pass through two-dimensional space for a very limited
distance, hopping over intervening persons and objects in either direction
along their linear world. This hopping ability has a maximum limit equal
to two seats in the bus. Thus, either a white or a black could jump over
one or two adjacent seats in the bus, provided the target seat was empty.
For example, in the given state of Fig. 5-5, the first white could move
into the front seat, or the second white could hop over one white into the
front seat, or the third white could hop over two whites into the front
seat. But the first black could not hop over all three whites into the
front seat. Using these movement properties of whites and blacks in
Lineland, solve the problem of reversing the relative positions of the
blacks and the whites so that all three blacks are in front of all three
whites in the bus. Do this in the minimum number of moves. In solving
the problem, note that it is irrelevant to the satisfaction of the court
order where the empty seat occurs in the bus, so long as all three blacks
are sitting in front of all three whites.

FIGURE 5-5
The discrimination reversal problem for a bus in Lineland.
It is not possible in this problem to write down a single goal state,
since a variety of possible goal states will satisfy the problem. All we
can say for sure is that, in the goal state, the first four positions in
the bus will contain all three blacks and the last four seats in the bus
will contain all three whites, but one does not know where the empty seat
will occur. This limited specification of the goal state, however, is
quite adequate for defining a variety of evaluation functions that appear
relevant to the solution of the problem. At this point, stop reading
and, using a hill-climbing approach, try to define explicitly some
evaluation functions that might prove useful for solving the problem.
There appear to me to be two obvious types of evaluation functions
we can define for this problem, both of which are quite satisfactory for
hill climbing. One evaluation function involves numbering the positions
in the bus from 1 at the front to 7 at the back. The evaluation function
would be something like the average position of the whites minus the
average position of the blacks. We could then attempt to maximize this
number. With this evaluation function, a detour is required on the first
move, but thereafter every move in the optimal sequence increases
this evaluation function by an amount that is greater than or equal to
the increase from every other alternative action (usually greater).
Explicit computation of the value of this function is somewhat more
difficult than that of the second evaluation function to be discussed,
but all that really counts is the relative difference between the current
state and every alternative state that can be achieved by taking any
admissible action. This difference is relatively easy to determine, and
so this evaluation function proves quite helpful in conjunction with a
hill-climbing approach to solving the problem.
If you have not yet thought of a second evaluation function for the
solution of this problem, stop reading and try to think of another one.
A second evaluation function, which is even easier to compute than
the first, is the sum of the number of blacks in front of each white,
summed across all whites. The value of this evaluation function for
every state, starting with the beginning state and ending with the goal
state in the optimal solution of the problem, is shown in Fig. 5-6.
Clearly, this evaluation function represents most closely what we are
trying to achieve in the problem. In the given state, there are zero
blacks in front of the whites, yielding an evaluation of zero. In the
goal state, there will be all three blacks in front of all three whites,
yielding an evaluation of 9. Thus, with this evaluation function, we know
exactly what value is possessed by the goal state.

State       Configuration     Evaluation
Given (0)   . W W W B B B     0
1           W . W W B B B     0
2           W B W W . B B     2
3           W B W W B B .     2
4           W B W . B B W     4
5           . B W W B B W     5
6           B . W W B B W     5
7           B B W W . B W     7
8           B B . W W B W     7
Goal (9)    B B B W W . W     9

FIGURE 5-6
The optimal solution to the discrimination reversal problem, using hill
climbing on the evaluation function of the number of Bs to the left of
each W, summed over all three Ws.

With the first evaluation function, increases in the evaluation can be
made, after the goal is achieved, by moving whites further to the back
of the bus and positioning the empty seat in the middle. Since this
additional move is not required for the solution of the problem, it is
unnecessary and nonoptimal. However, this is a trivial matter, and, in
fact, both evaluation functions serve almost equally well in the solution
of the problem. Actually, the first evaluation function provides
increases in the evaluation between states 2 and 3, states 5 and 6, and
states 7 and 8 (see Fig. 5-6), where the second evaluation function
cannot be changed by any action. Thus, in some ways, the first
evaluation function is superior, though if you do some limited looking
ahead (two-step hill climbing) with the second evaluation function, you
will have the optimal choice at each of these three nodes.
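
The two evaluation functions can be checked against the states of Fig. 5-6 with a short Python sketch. The string encoding of a state, the function names, and the legality check are assumptions introduced here for illustration, not the author's notation. State 8 is taken to be B B . W W B W, which appears to be the only arrangement that connects states 7 and 9 by single legal moves.

```python
from fractions import Fraction

# States of Fig. 5-6, front of the bus on the left; "." is the empty seat.
# State 8 ("BB.WWBW") is an assumed reading of the figure (see lead-in).
STATES = [
    ".WWWBBB", "W.WWBBB", "WBWW.BB", "WBWWBB.", "WBW.BBW",
    ".BWWBBW", "B.WWBBW", "BBWW.BW", "BB.WWBW", "BBBWW.W",
]

def relative_eval(state):
    """Second function: blacks in front of each white, summed over whites."""
    total, blacks_seen = 0, 0
    for seat in state:
        if seat == "B":
            blacks_seen += 1
        elif seat == "W":
            total += blacks_seen
    return total

def absolute_eval(state):
    """First function: mean white position minus mean black position (1 = front)."""
    whites = [i for i, s in enumerate(state, start=1) if s == "W"]
    blacks = [i for i, s in enumerate(state, start=1) if s == "B"]
    return Fraction(sum(whites), len(whites)) - Fraction(sum(blacks), len(blacks))

def is_legal_move(before, after):
    """One person moves 1 to 3 seats (a step, or a hop over one or two seats)
    into the empty seat; nothing else changes."""
    changed = [i for i in range(len(before)) if before[i] != after[i]]
    if len(changed) != 2:
        return False
    i, j = changed
    swapped = before[i] == after[j] and before[j] == after[i]
    return swapped and "." in (before[i], before[j]) and abs(i - j) <= 3

relative_values = [relative_eval(s) for s in STATES]
absolute_values = [absolute_eval(s) for s in STATES]
```

Under these assumptions, `relative_values` reproduces the evaluation column of Fig. 5-6, while `absolute_values` drops on the first move and rises on every move after it, which is the detour-then-climb pattern described for the first function.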
The optimal solution to the discrimination reversal problem is
achieved by using (a) the second (relative position) evaluation function
with a limited degree of two-step hill climbing to decide among the
equivalently valued actions in states 0 and 1, 2 and 3, 5 and 6, and 7
and 8, or (b) the first (absolute position) evaluation function with the
trivial modification that you stop when the relative positions are
correct, even though the absolute-position evaluation function could
still be increased. It is a remarkable fact concerning the
absolute-position evaluation function that it permits choice of the
optimal action at each state, using a hill-climbing approach. That is,
hill climbing using the absolute-position evaluation function will
produce the solution the very first time by choosing the action that
maximizes the evaluation on the next state (bearing in mind that, at the
first move, you must choose the action that reduces the
absolute-position evaluation by the least amount).
The definition of evaluation functions and the use of hill climbing
have a substantial role in playing chess games and in the solution of
many chess problems. However, there are at least two major difficulties
in the use of hill climbing in chess. First, even the immediate
evaluation of any move you take must depend to some extent on your
opponent's immediately following move. This fact leads to a certain
degree of uncertainty, but it can be resolved in at least two ways:
(a) by assigning subjective probabilities to your opponent's different
moves and accordingly determining the expected values of your own
possible moves (as was done in the "game against nature" illustrated
by the coin-weighing problem) and (b) by assigning an evaluation to
your move consistent with the best next move your opponent could
produce (where this move is in any sense determinable). Since your
subjective probabilities for your opponent's moves are likely to be
only approximately accurate at best and since your ability to judge your
opponent's best response is limited as well, there are substantial
difficulties in applying hill climbing, no matter which approach is taken.
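
The difference between approaches (a) and (b) can be seen in a small Python sketch. The move names, probabilities, and evaluations below are purely hypothetical numbers chosen for illustration; nothing here models real chess positions.

```python
# Hypothetical data: for each of our candidate moves, a list of
# (subjective probability of an opponent reply, our evaluation after that reply).
CANDIDATES = {
    "move_a": [(0.5, 4.0), (0.5, -2.0)],
    "move_b": [(0.75, 1.0), (0.25, 0.0)],
}

def expected_value(replies):
    """Approach (a): weight each outcome by its subjective probability."""
    return sum(p * v for p, v in replies)

def worst_case(replies):
    """Approach (b): assume the opponent finds the reply that is best for
    the opponent, that is, worst for us."""
    return min(v for _, v in replies)

best_by_expectation = max(CANDIDATES, key=lambda m: expected_value(CANDIDATES[m]))
best_by_worst_case = max(CANDIDATES, key=lambda m: worst_case(CANDIDATES[m]))
```

With these made-up numbers the two approaches disagree: the expected-value rule prefers `move_a`, while the best-reply rule prefers `move_b`. Both depend on estimates that, as noted above, are only approximately accurate.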
The second principal difficulty of using hill climbing in chess
concerns the very large variety of different evaluation functions that are
relevant to playing a good game of chess or solving many chess
problems. For example, you are concerned with moving your own pieces
to favorable positions on the board (where they control large numbers
of squares), the subsidiary goal of control of the center, preventing
your opponent's piece development, ensuring the safety of your king,
jeopardizing the safety of your opponent's king, and many, many
others. It is not at all obvious what all the relevant evaluation
functions might be in chess nor how to weight them in different situations
to come up with some overall evaluation of your next move.
Despite all these problems, the defining of evaluation functions and
the use of hill climbing (looking ahead one or more steps) are important
problem-solving methods in chess. To illustrate, consider the following
very simple end-game problem involving black's rook and king against
white's king, with the positions as illustrated in Fig. 5-7 and black to
move. Black's objective is to checkmate the white king in the minimum
number of moves. Stop reading and try to think of at least one
evaluation function that is relevant to this objective and that might
dictate the choice for black's first move in the present instance.

FIGURE 5-7
Black to move in a manner that maximally restricts the squares of the
board to which the white king might eventually move.

Although black must be continually concerned with avoiding stalemate
(putting the white king in a position where he is not in check
but has no move except one that would put him in check), the most
obvious objective of black is to minimize the number of squares to
which the white king can move without being in check. Although
minimizing the possible moves of the white king might be considered to
apply only to the next move, it is more useful to consider the evaluation
function to refer to minimizing the number of squares to which the
white king might eventually be able to move (that is, minimizing the
number of squares reachable by a sequence of several moves). By the
former evaluation function, moving the black rook to either the fifth,
sixth, or seventh rank (row) would be equally good. However, by the
second, more adequate evaluation function, only the move of black's rook
to his fifth rank maximally restricts the number of squares on the board
to which the white king might ultimately move (if the rook merely
stayed on that rank). Thus, this is the solution to the problem, and it is
relatively straightforward in terms of a hill-climbing approach using
the second evaluation function.
DIFFICULTIES WITH HILL CLIMBING

Local Maximum
One of the most common applications of hill climbing comes in
optimization problems, where the evaluation function is already given in
the first part of the problem. For example, suppose you are attempting
to determine the maximum value of some function defined over a 1-, 2-,
3-, . . . , or n-dimensional space. That is, you have some complex
function of several variables for which it is possible to compute the
value of the function, given any particular set of values for the
individual variables, but for which it is not possible to determine by
analytic means what the maximum of the function might be. To solve such
problems, it is best to begin with a particular set of values for the
independent variables, compute the value of the function (dependent
variable) for that set of independent variables, then determine the
value of the function for points that are nearby in the space of the
independent variables. That is, you make small variations in the values
of each of the independent variables in turn and determine the value
of the dependent variable (evaluation function) for each new set of
independent variables. Whatever direction of movement in the space
of independent variables produces the greatest increase in the value of
the dependent variable is chosen as the new focus for exploration.
Proceed in this manner until you find a point for which no movement
in any direction produces an increase in the evaluation function. At
that point, you have climbed to the top of some local peak (local
maximum) in the space, and hill climbing is no longer of any value
in searching for the highest peak (absolute maximum) in the space
of the independent variables.
The most frequently discussed difficulty with the hill-climbing
approach in such optimization problems is that you can only reach a
local maximum and have no guarantee that it is the absolute maximum
(highest value of the dependent variable, defined over the space of
the independent variables). The only real solution to the problem is to
try a large number of widely dispersed starting points in the space of
the independent variables and choose the maximum of the local
maxima reached by hill climbing. Assuming that the givens of the
problem do not include information concerning the optimum value of
the evaluation function, you can never be absolutely sure that you
have found the absolute maximum.
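
The procedure just described, including the remedy of trying several dispersed starting points, can be sketched in Python on a one-dimensional example. The landscape `HEIGHTS` is a made-up list of values with one local and one absolute peak, and the function names are illustrative assumptions.

```python
# Hypothetical one-dimensional landscape: a local peak at index 2 (height 5)
# and the absolute peak at index 9 (height 9).
HEIGHTS = [1, 3, 5, 4, 2, 1, 2, 4, 7, 9, 8, 6]

def hill_climb(start, heights=HEIGHTS):
    """Move to the better neighbor until no neighbor is higher."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if 0 <= n < len(heights)]
        best = max(neighbors, key=lambda n: heights[n])
        if heights[best] <= heights[x]:
            return x  # a local maximum, not necessarily the absolute one
        x = best

def best_of_restarts(starts, heights=HEIGHTS):
    """Climb from each starting point and keep the highest peak reached."""
    peaks = [hill_climb(s, heights) for s in starts]
    return max(peaks, key=lambda p: heights[p])
```

Starting at index 0 the climb stops on the lower peak, whereas `best_of_restarts([0, 5, 11])` reaches index 9. Even so, without knowing the optimum value in advance, there is no guarantee that the starting points chosen cover the highest peak.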
However, it should be pointed out that the application of hill
climbing to most problems other than this class of optimization problems
includes information concerning the value of the evaluation function
at the goal. In such cases, you can always know whether or not you
have reached the goal by hill climbing. This greatly attenuates the
seriousness of the "local maximum" difficulty.
End-bunching
Hill climbing is often used in construction problems, where you start
putting together some of the materials to result in a state that is closer
to the goal (more similar to the object being constructed) than was the
original given state. In some construction problems, this method works
well, but in others it is not useful. For example, consider the Instant
Insanity problem described in Chapter 2. Stop reading and define
some possible relevant evaluation functions for Instant Insanity.
One rather natural four-dimensional evaluation function might be
the number of different colors you achieved on each side of the tower.
The number of blocks already placed in the tower is included in this
evaluation function in what appears to be a completely satisfactory
way, since to achieve the goal of having four different colors on each
side, you would have to have a tower of four blocks. If you were given
more than four blocks to work with, this would not be a satisfactory
evaluation function unless you considered the number of blocks, since
you could achieve four different colors on each side by using more
than four blocks. However, in the present problem, the goal state
could be characterized exactly as having each of the four colors
represented on each of the four vertical sides. Hence, we may consider
the beginning state to have the evaluation vector (0, 0, 0, 0) and the goal
to have the evaluation vector (4, 4, 4, 4). The four dimensions could
rather naturally be combined into a one-dimensional evaluation
function simply by summing the four components. In obtaining this sum,
it seems natural to give equal weight to each component, since each
dimension has the same range of values and an analogous meaning.
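
As a minimal sketch of this evaluation vector in Python, suppose a partial tower is recorded as a list of blocks, each given by the four colors currently showing on its vertical sides. This representation and the color letters are assumptions made for illustration; the sketch does not model the actual cubes of Chapter 2 or which orientations they allow.

```python
def evaluation_vector(tower):
    """Number of different colors showing on each of the four vertical sides.

    tower: list of placed blocks, each a 4-tuple of side colors
    (a hypothetical representation).
    """
    return tuple(len({block[side] for block in tower}) for side in range(4))

def evaluation_sum(tower):
    """One-dimensional evaluation: equal-weight sum of the four components."""
    return sum(evaluation_vector(tower))

# Three placed blocks giving the evaluation (3, 3, 3, 3).
three_blocks = [("R", "G", "B", "W"), ("G", "B", "W", "R"), ("B", "W", "R", "G")]
good_fourth = ("W", "R", "G", "B")   # completes every side
bad_fourth = ("W", "R", "G", "G")    # repeats a color on the last side
```

Here `evaluation_vector(three_blocks + [good_fourth])` is (4, 4, 4, 4), while the other fourth block leaves one side at 3, so two towers with the same evaluation after three blocks can differ in whether any fourth placement reaches the goal.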
Although few people are explicitly aware of it, virtually everyone
who works on Instant Insanity attempts to use some form of hill
climbing, using something like the above evaluation function.
Systematic use of this kind of hill climbing greatly reduces the search
space (number of alternative towers to be investigated), but the method
still leaves a very large number of alternatives to investigate. There
are many equivalent options at each of the four nonterminal nodes of
the state-action tree for Instant Insanity, so hill climbing with this
evaluation function hardly yields the answer with a single series of
four choices. The difficulty with this state evaluation function applied
to this problem is that it is much harder to increase the evaluation
function by the required amount at the last (fourth) choice node than
at earlier nodes. At most of the last nodes, no action will achieve the
goal, even though the solver is currently at a node that has the
evaluation (3, 3, 3, 3). Whether or not you can solve the problem is
determined by the existence of such an action at the fourth node, but the
evaluation function for the states that could be achieved at earlier
nodes gives very inadequate information concerning the "correct" fourth
node at which to be. That is, there are many fourth nodes with the
evaluation (3, 3, 3, 3), and very few of these have any action that leads
to a terminal node with the evaluation (4, 4, 4, 4).
There are many problems like this, where the restrictions bunch up
at the end of the problem. It is as if you had many easy trails to climb
most of the way up a mountain, but the summit was attainable from
only a few of these trails, with the rest running into unscalable
precipices. Hill climbing (in the problem-solving sense) is often not a
very good method to use in such cases, though it may considerably reduce
the amount of trial-and-error search.
The astute reader might note that the end-bunching of restrictions is
a difficulty with hill climbing that is somewhat analogous to the local
maximum difficulty.
Detours and Circling
Problems with multiple equivalently valued paths at the early nodes
can be difficult to solve with hill climbing, but perhaps the greatest
frustration in using the method comes in detour problems, where at
some node you must actually choose an action that decreases the
evaluation. Somewhat less difficulty is encountered in what might be
called circling problems, where at one or more nodes you must take
actions that do not increase the evaluation. If the nodes where you
must detour or circle have no better choices (that is, no choices that
increase the evaluation), then you are more likely to try detouring or
circling than if the critical nodes have better choices. When better
choices are available, you tend to just choose them and go on without
considering the possibility of detouring or circling. If the path you
choose does not lead to the goal, you might go back and investigate
alternative paths, but the first ones to be investigated will be those that
were equivalent or almost equivalent at some previous node. Only
after all of this fails should you try detouring, that is, choosing an
action at some node that produces a state that has a lower evaluation
than the previous state had.
The missionaries-and-cannibals problem is a famous example of
the difficulties encountered by hill climbing in a detour problem. The
problem is as follows:
On one side of a river there are three missionaries and three cannibals.
They have a boat on their side that is capable of carrying two people at
a time across the river. The goal is to transport all six people across to
the other side of the river. At no point can the cannibals on either side
of the river outnumber the missionaries on that side of the river (or the
cannibals would eat the outnumbered missionaries). This constraint only
holds when there is at least one missionary on the side of the river
where there are more cannibals. That is, it is all right to have one, two,
or three cannibals on the same side of the river with zero missionaries,
because then they would have no missionaries to eat.
Stop reading and try to solve the problem by explicitly defining some
evaluation function and using a hill-climbing approach, then see Fig.
5-8 for a sequence of states that solves the problem.
Offhand, you might think this was an absolutely trivial problem,
since the state-action tree for the problem is rather small, and hill
climbing on an evaluation function such as "the number of people on
the other side of the river" reduces the number of paths to search to
a very small number. But that is just the trouble! Hill climbing on this
obvious evaluation function reduces the search space in such a way as
to eliminate the path that leads to the goal. Given this evaluation
function (the number of people on the other side of the river) for each
state, there is a critical node at which you must detour (more than
usual) to solve the problem by taking two people back across the river
to the original side. Of course, at every alternate node in the problem,
there is a necessary detour, when one person must row the boat back to
the original side. But, as I mentioned before, necessary detours often
cause little difficulty, especially in a problem like this one, where
they are so obviously necessary on any path to the goal. But taking two
people back to the original side is a detour that just does not occur to
many people who work on this problem. If they were consciously aware
that they had defined an evaluation function and were hill climbing
using that evaluation function, then it would quickly occur to them
that a detour might be necessary or that a new evaluation function
was in order, and so on.
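
For readers who want to check the detour mechanically, here is a minimal breadth-first search sketch in Python. The state encoding (counts on the original bank plus a boat flag) and all names are assumptions introduced for illustration; the search is ordinary systematic search, not hill climbing, and is used only to recover a shortest solution and the evaluation "number of people on the other side" along it.

```python
from collections import deque

TOTAL = 3  # three missionaries and three cannibals
# Possible boat loads as (missionaries, cannibals); the boat holds one or two.
LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]

def is_safe(m, c):
    """m, c = missionaries and cannibals on the original bank.
    Safe if, on each bank, missionaries are absent or not outnumbered."""
    if not (0 <= m <= TOTAL and 0 <= c <= TOTAL):
        return False
    original_ok = m == 0 or m >= c
    far_ok = (TOTAL - m) == 0 or (TOTAL - m) >= (TOTAL - c)
    return original_ok and far_ok

def solve():
    """Breadth-first search over states (m, c, boat); boat = 1 on original bank."""
    start, goal = (TOTAL, TOTAL, 1), (0, 0, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, boat = state
        sign = -1 if boat == 1 else 1  # leaving or returning to the original bank
        for dm, dc in LOADS:
            nxt = (m + sign * dm, c + sign * dc, 1 - boat)
            if is_safe(nxt[0], nxt[1]) and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

path = solve()
# Evaluation used in the text: number of people on the far bank.
evaluations = [2 * TOTAL - m - c for m, c, _ in path]
```

The shortest solution found takes 11 crossings, and its evaluation sequence falls from 4 to 2 at the sixth crossing, where two people return together. That is the critical detour that plain hill climbing on this evaluation function would never choose.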
Incidentally, one reason why people have no difficulty with the
necessary detour on every other node of the state-action tree for
the missionaries-and-cannibals problem is that solvers usually
automatically use two-step hill climbing; that is, they maximize the
number of people on the other side of the river after a trip across the
river and the return trip as well. Using this two-step hill climbing, no
detours at all are necessary, and at the one crucial node, all that is
necessary is a circling action (not increasing the number of people on
the other side as a result of the round-trip voyage of the boat).

Node
level   Far side      Original side   Evaluation
0       (none)        MMMCCC b        0*
1       MC b          MMCC            2
2       C             MMMCC b         1*
3       CCC b         MMM             3
4       CC            MMMC b          2*
5       MMCC b        MC              4
6       MC            MMCC b          2*
7       MMMC b        CC              4
8       MMM           CCC b           3*
9       MMMCC b       C               5
10      MMMC          CC b            4*
11      MMMCCC b      (none)          6*
(Critical detour step: node 5 to node 6)

FIGURE 5-8
A diagram of the successive states in a solution to the
missionaries-and-cannibals problem, where M = missionary, C = cannibal,
and b = boat; the far (goal) bank and the original bank of the river are
listed in separate columns. If two-step hill climbing is used until the
last step (to ignore the necessary detour on every alternate action), the
evaluation numbers considered are those marked with an asterisk.

You could define an evaluation function different from "the number
of people on the goal side of the river." Obviously, you could use the
two-dimensional vector of the number of missionaries and the number
of cannibals on the goal side of the river, starting with (0, 0) and the
goal being (3, 3). This process does not avoid the necessity of detouring
(or circling in two-step hill climbing).
Somehow we would like to have the constraint regarding cannibals
outnumbering missionaries reflected in the evaluation function.
If that were done in the proper way, we would suspect that, by that
evaluation function, it would not be necessary to make anything but
the obviously necessary detour involved in getting the boat back to
the original side of the river. If you were to consider the number of
missionary-cannibal pairs on the goal side of the river to be your
evaluation of the state, then the only detours that would have to be
taken would be those necessary to bring the boat back across the river.
This evaluation function does not distinguish between states as finely
as do the previously mentioned evaluation functions. That is, at
any given node, there are more actions with the same evaluation than
is the case with the previously mentioned evaluation functions.
However, this does not cause difficulties, since most of the actions at
any given node are eliminated from consideration by the constraint that
the cannibals cannot outnumber the missionaries on either side of the
river. I must say, though, that I am sure that someone who was
thoroughly familiar with evaluation functions and hill climbing would
look for a detour before defining some new evaluation function in
so simple a problem.
Inference versus Action Problems
Most of the problems discussed in the present chapter as examples
of the more or less successful use of hill climbing were action
problems; only a few were inference problems. The best formal definition
of the distinction between these two classes of problems is that action
problems involve only destructive operations, whereas inference
problems involve primarily or exclusively nondestructive operations.
Action problems are concerned with achieving changes in some physical
world via constructions, movements, or the like. By contrast,
inference problems are concerned with our knowledge of something
(whether or not one thinks of there being any physical referent). In
inference problems, the objective is to expand the set of true
statements to include the desired goal statement.
By this definition, trip-planning problems, maze problems, the
six-arrow problem, the discrimination reversal problem, Instant Insanity,
and the missionaries-and-cannibals problem are all action problems.
The coin-weighing problem, the linear and other equation-solving
problems, and the function-optimization problem are all inference problems
(though other optimization problems might be action problems).
Obviously, the hill-climbing method is not restricted to action
problems and excluded from inference problems. However, there is almost
always a substantially greater economy (parsimony) when you
describe your state at any point in the solution of an action problem.
Since action problems involve only destructive operations, the state
description can usually be achieved by a single simple expression.
Furthermore, the complexity of the expression does not usually grow
enormously with increases in the number of actions that have been
taken in the attempt to solve the problem. By contrast, in inference
problems the number of expressions generated increases with every
action. In inference problems, you are continually increasing the
number of statements known to be true. The description of the problem
state must generally be considered to include the entire set of
expressions given or derived up to that point. Since the goal is usually a
single expression, it is generally much more difficult to define an
evaluation function, useful for hill climbing, that compares
the current state with the goal state.
Another reason for the greater difficulty in using hill climbing in
inference problems is that the nondestructive operations frequently
found in such problems are often not one-to-one operations, that is,
operations that take one expression as input and produce one
expression as output. There are such one-to-one operations, of course.
However, in addition, inference problems usually contain a variety of
two-to-one, three-to-one, or even more complex operations, that
is, operations that take two or three or more expressions as input and
produce one expression as the output (the inferred expression).
One-to-one operations are usually called unary operations; two-to-one
and three-to-one operations are usually called binary and ternary
operations. By and large, problems with only unary operations are
more susceptible to a hill-climbing approach than are problems
containing binary and ternary operations.
Of course, there is always the trivial evaluation function of how
much knowledge you have obtained from the given information.
However, sheer amount of knowledge (for example, the number of derived
expressions), while positively correlated with achieving the goal
expression, may not be closely related to the achievement of the goal if
the inferences are proceeding in the wrong direction (a direction not
related to inferring the goal expression).
We need to define an evaluation function that measures the relevant
progress toward achieving the goal expression in inference problems,
and such evaluation functions can frequently be found. However,
when they are found, they are usually more useful in conjunction with
the subgoal method (to be discussed in the following chapter) than
they are with hill climbing. The reason is that usually a sequence of
several actions is required to achieve an expression that moves the
evaluation function in the direction of the evaluation characteristic
of the goal. Most of the single actions taken in reaching each successive
subgoal cannot themselves be identified as reducing the distance to
the goal in terms of the goal evaluation function.
However, in these cases, if we define evaluation functions relevant
for reaching each successive subgoal, then conceivably in a very large
proportion of inference problems it is possible to use the hill-climbing
method to achieve various subgoals. Some examples of this combined
use of hill climbing and subgoal methods will be discussed in the
mathematics, science, and engineering problems of Chapter 11.

Subgoals
THEORY
A problem-solving method that is important but difficult to master
is that of defining subgoals in order to facilitate solving the original
problem. This method is sometimes called "analyzing a problem into
subproblems," or "breaking up a problem into parts." In essence, the
purpose is to replace a single difficult problem with two or more simpler
problems.
Of course, if you already know how to solve some of the
subproblems, or if some of them are analogous to problems you already
know how to solve, then obviously it might be easier to solve the set
of simpler problems than the single original problem.
However, the fact that it is advantageous to break up a problem into
subproblems does not mean you must be more familiar with the
subproblems than with the original problem. One way to see the advantage
of defining subgoals is to look at the following analysis of the
state-action tree for a problem (1) with $M$ alternative actions at each
node and (2) a sequence of $n$ actions being necessary for solution.
Let us assume that we know that the solution to some problem will
require a sequence of $n$ actions (or less). By systematic trial and error
there are $M^n$ alternative paths (action sequences) to be investigated
in the original problem. Now assume that you can define a subgoal
state that is known to be on the correct path to the goal and, let us
say, halfway from the beginning to the goal. Defining one subgoal
divides the problem into two subproblems: first, getting from the given
state to the subgoal and, second, getting from the subgoal to the goal.
In this case, there are $M^{n/2}$ paths to investigate in attempting to
get from the givens to the subgoal, and there are the same number
($M^{n/2}$) of paths to investigate to get from the subgoal to the goal.
Thus with the single subgoal, the number of action sequences to be
investigated is $2M^{n/2}$ action sequences that are $n/2$ steps long,
versus $M^n$ action sequences that are $n$ steps long in the original
problem without a subgoal.
To get some concrete notion of the advantages of reducing the
exponent of $M$ in this manner, consider the case where $M = 10$ and
$n = 10$. In this case, $M^n = 10^{10}$ and $2M^{n/2} = 2(10^5)$. In this
case, a single subgoal has reduced the search by a factor of 50,000,
which is, of course, a staggering reduction. In addition, with the
subgoal, the action sequences are only half as long. A state-action tree
of a very simple problem, which vastly underestimates the power of the
subgoal method, is shown in Fig. 6-1.
If we defined four subgoals (five subproblems) in the problem, with
m = 10 and n = 10, then the number of two-step paths involved in
achieving all the subgoals plus the final goal is $5(10^2) = 500$, which is
a reduction of the search by a factor of $2(10^7)$, or 20,000,000.
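The arithmetic above can be checked with a minimal sketch. It assumes, as the text does, m alternative actions at every state, a solution exactly n steps long, and subgoals that are known to lie on the correct path and are evenly spaced; the function names are illustrative only.

```python
def paths_without_subgoals(m: int, n: int) -> int:
    """Number of n-step action sequences when each state offers m actions."""
    return m ** n


def paths_with_subgoals(m: int, n: int, k: int) -> int:
    """Sequences to examine when k evenly spaced subgoals split the
    problem into k + 1 subproblems of n / (k + 1) steps each."""
    if n % (k + 1) != 0:
        raise ValueError("this sketch assumes n is divisible by k + 1")
    return (k + 1) * m ** (n // (k + 1))


for k in (1, 4):
    total = paths_without_subgoals(10, 10)
    reduced = paths_with_subgoals(10, 10, k)
    print(k, reduced, total // reduced)
```

With m = 10 and n = 10, one subgoal leaves 200,000 sequences (a factor of 50,000), and four subgoals leave 500 (a factor of 20,000,000).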
To be sure, a number of simplifying assumptions were made in
computing the comparative advantages of defining a series of subgoals.
However, the primary assumption, which overestimates the advantages
of the subgoal method, is that you could be sure the subgoals you
defined were states on a path that led to the goal. In some cases, you
can be sure of this, but in many other cases you cannot. Nevertheless,
if you could find a true subgoal by making 5, 10, or even 100 guesses
of it, you would still be reducing the search space by extremely large
factors in all of the many problems that require more than a few steps
to solve.
The subgoal method is advantageous for attacking problems that
require a sequence of more than two or three actions to solve, which
is what most nontrivial problems require. Still, some problems are not
simplified appreciably by this method; they are sometimes called
insight problems because they require few steps to solve once the
critical insight has been achieved. These include problems in which
one must represent the components of the problem in some suitable
way, guess the correct set of givens (where there are multiple given
states), or choose a solution approach that violates hill climbing but
that requires choosing from among only a small number of action
sequences, once the insight has been achieved. Examples of insight
problems, for which the subgoal method has little to offer, are many
implicit information problems, such as the notched-checkerboard and
block-cutting problems of Chapter 3, and the detour problems, such
as the missionaries-and-cannibals problem of Chapter 5.
FIGURE 6-1
State-action tree for a very simple problem (from Given to Goal),
showing how defining a subgoal on the correct path (action sequence)
to the goal can reduce the search. In this case, the search is limited
to the region inside the two boxes, which is eight action sequences each
two steps long, instead of 16 action sequences each four steps long.
Some simplifying assumptions are made, such as that one knows
that the subgoal is two steps from the beginning and two steps
from the end. However, the average problem is much longer, and
the degree of reduction in search by defining subgoals is far
greater than in this simple example.
The subgoal method also does not work, of course, if you cannot
think of any plausible subgoals. However, if the problem seems likely
to be a multistep rather than insight problem, it is usually advantageous
to spend some time trying to generate plausible subgoals, because of
the enormous power of the method.
How do you try to define a subgoal with reasonable hopes that you
are on a path to the goal? Although there is no method of defining
plausible subgoals that is mathematically precise and that applies to
every type of problem, you can take the first step by defining an
evaluation function over different problem states, as was done as a
necessary precondition in applying the hill-climbing method. Having
done this, you can recognize a plausible subgoal as a problem state
with an evaluation part way between the given state and the goal state.
If the evaluation function is multidimensional, then such a subgoal
might have the goal values on some, but not all, of the dimensions.
Or it might just have values closer to the goal values on some, or even
all, of the dimensions.
Defining an evaluation function over problem states provides not
only a way to recognize a plausible subgoal but also a way to generate
or define plausible subgoals. Evaluation functions are single valued
in the direction from a problem state to an evaluation vector for the
problem state. (Multidimensional evaluation functions are still single
valued so long as only one distinct evaluation vector is associated with
each distinct problem state.) The inverse function may be multivalued,
but that does not seriously reduce the value of this approach to
generating plausible subgoals.
It may be that subgoals can always be determined to have
intermediate values between the given and goal state according to some
explicitly defined evaluation function. However, a person may
frequently define subgoals without being able to state explicitly any
relevant evaluation function. Thus, this book will generally discuss
the application of the subgoal method to problems without attempting
to describe any formal evaluation function.
In general, there is a multiplicity of plausible subgoals, some on
a correct path to the goal and some not. As mentioned, even if the
probability of a plausible subgoal being a true subgoal is only 0.1 or
0.01, the method is still reducing the search by an enormous factor in
most problems. In any event, it is usually not very difficult to
conjecture a variety of reasonably plausible subgoals, but the likelihood
of defining a good subgoal will depend upon how good an evaluation
function you have defined over problem states. In turn, how good your
evaluation function is, how suitable it is to solving the problem at
hand, often depends upon how adequately you represented the
information in the problem (discussed in Chapters 3 and 10), the defining
of macroactions, and the use of various other problem-solving methods.
Problem-solving methods are generally used in combination, and the
combined power of several methods in reducing the search space can
result in very fast solution of many problems with little trial-and-error
search being required.
When you have defined two or more subgoals to be achieved in
getting from the given state to the goal, you can make a logical distinction
as to whether the subgoals must necessarily be achieved in a certain
order or whether they can be achieved in any order. This simple logical
distinction between ordered and unordered subgoals is illustrated in
Fig. 6-2.
In some problems it is obvious that one of the subgoals (SG1) is
closer to the given in terms of the evaluation function than is the other
subgoal (SG2), while the latter subgoal is closer to the goal than is
the former subgoal. In such instances, the subgoals clearly should be
achieved in a particular order.
In other cases, while the achievement of two or more subgoals may
constitute two components that are necessary in order to get to the
goal from the givens, it is not obvious which subgoal is easier to achieve
from the givens. In the latter case, you have a choice of the order in
which to arrange the subgoals on a path from the givens to the goal. In
these cases, if your first choice is not working out well, then you should
switch to some other choice in ordering the subgoals. In some cases,
where the ordering of subgoals is not immediately apparent at the
outset of the problem, some orders of achieving subgoals may be easier
to accomplish than others. Being aware of the distinction between
ordered and unordered subgoals permits greater flexibility in the
solution of problems involving unordered subgoals.
FIGURE 6-2
An example of ordered subgoals (SGi), where there is a unique
order in which the different subgoals must be achieved in getting
from the givens to the goal, versus an example of unordered
subgoals, where any order of achievement of the subgoals can
lead to the goal. (Panels: Ordered subgoals; Unordered subgoals.)
If you have defined n subgoals (whether ordered or unordered), you
have automatically defined n + 1 subproblems to be solved: namely,
getting from the givens to one of the subgoals, getting from the first
subgoal to the second subgoal, and so on, from the nth subgoal to the
goal. Whether any particular ordering of the subgoals is required by
the original evaluation function or is optional by that evaluation
function, you may often be free to choose to work on any link in the
chain first, second, and so on. By and large, it is advantageous to start
with a subproblem of getting from the givens to one of the subgoals,
or else to work on the problem of getting from one of the subgoals to
the goal, because subgoals are frequently not fully defined problem
states. It is usually preferable to work on a subproblem in which either
the beginning or end state is completely specified. In most problems,
this means starting from the givens is preferable, though in some
cases working backward from the goal may be just as good or better.
An additional advantage of starting with the givens is that you are
simultaneously drawing inferences (see discussion of this method in
Chapter 3) that you know are expanding the information you have
available for the solution of the problem, no matter what the ultimate
success of the particular subgoal approach that you have taken. If
you begin working on other subproblems that do not start from the
given state, then a failure using the subgoal approach may not have
generated information as useful as that which would have
been generated by starting from the givens. All this is perhaps
relatively obvious, but it is nevertheless important to bear in mind.
APPLICATIONS
Planning a trip across the country is a problem to which the subgoal
method can be applied in a relatively trivial manner. If you wanted to
travel from San Francisco to New York City, you might select Denver
and Chicago as subgoals. Selection of such reasonable subgoals
depends upon having an evaluation function defined over such cities
(such as their two-dimensional coordinates on a map), which indicates
that Denver and Chicago have intermediate values on the east-west
coordinate compared to San Francisco and New York. In this case,
it is primarily one of the two dimensions of the evaluation function
that needs to be altered to get from the given state to the goal. But in
going from Spokane, Washington, to Miami, Florida, by way of Denver
and Memphis, Tennessee, you are substantially altering the values on
both dimensions in going from the given state to the goal. Obviously,
the subgoals are ordered in these trip-planning problems.
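The idea of recognizing a plausible subgoal as a state whose evaluation lies part way between the given and the goal can be sketched for the trip-planning example. The coordinates below are rough approximations supplied for illustration (they are not from the text), and the function names are hypothetical.

```python
LAT, LON = 0, 1

# Approximate (latitude, longitude) in degrees; rough values, illustration only.
CITIES = {
    "San Francisco": (37.8, -122.4),
    "Denver": (39.7, -105.0),
    "Chicago": (41.9, -87.6),
    "New York": (40.7, -74.0),
    "Spokane": (47.7, -117.4),
    "Memphis": (35.1, -90.0),
    "Miami": (25.8, -80.2),
}


def is_between(value, start, end):
    """True if value lies strictly between start and end (in either order)."""
    return min(start, end) < value < max(start, end)


def plausible_subgoal(state, given, goal, dims):
    """A state is a plausible subgoal if its evaluation is intermediate
    between the given and the goal on every listed dimension."""
    s, a, b = CITIES[state], CITIES[given], CITIES[goal]
    return all(is_between(s[d], a[d], b[d]) for d in dims)


# East-west coordinate only: Denver and Chicago both qualify.
print(plausible_subgoal("Denver", "San Francisco", "New York", [LON]))
print(plausible_subgoal("Chicago", "San Francisco", "New York", [LON]))
# Both coordinates change on the Spokane-to-Miami trip.
print(plausible_subgoal("Memphis", "Spokane", "Miami", [LAT, LON]))
```

On the San Francisco to New York trip only the east-west dimension is checked, which matches the remark that primarily one dimension needs to be altered there.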
As an example of a problem that is extremely easy to solve using
the subgoal method (with unordered subgoals), consider the following:
A light plane carrying three men crashes in the desert. The men decide
that their best chance for survival consists of each of them setting out across
the desert in different directions in hopes that one of the directions will
pass by a sufficient number of oases to permit that man to reach
civilization and get help for the others. Before going their separate ways across
the desert, they are faced with the problem of achieving an equal division
of their stock of water and canteens. They have in their possession five
canteens full of water, five canteens half-full of water, and five empty
canteens. All canteens are the same size. Since water-carrying capacity
is important should a man reach an oasis, they wish to divide both their
supply of water and the number of canteens equally among themselves.
How can they achieve this?
Stop reading and attempt to solve this problem.
If you were unable to solve it, consider the following hints. You
might originally define three unordered subgoals consisting of
attempting to divide the full canteens evenly among the three men, dividing
the half-full canteens evenly among the three men, and dividing the
empty canteens evenly among the three. It is immediately obvious that
this subgoal approach will not work, since five canteens of one kind
cannot be divided evenly among three men. An alternative definition of
subgoals involves first making the inferences that the total quantity of
water is 5 + 5/2, or 7 1/2, canteens full of water and that the total number of
canteens equals 15. From this you can conclude that, in the goal state,
each person will have 2 1/2 canteens full of water and 5 canteens. In
essence, this defines a six-dimensional vector evaluation function such
that in the given state each of the three persons has zero water and
zero canteens and in the goal state each person has 2 1/2 canteens full of
water and 5 canteens. If you have not solved the problem, stop
reading and attempt to define relevant subgoals.
The relevant subgoals are to attempt to give first one of the men (it
obviously does not matter which one) 2 1/2 canteens full of water
distributed among five canteens. There are a number of ways of doing
this, only one of which will make it impossible to achieve the other
two subgoals of giving 2 1/2 canteens full of water in 5 canteens to each of
the other two men. The only way that prevents achievement of the goal
is to give the first man the entire set of half-full canteens. Any other
method of giving the first man 2 1/2 canteens full of water and 5 canteens
will permit achievement of the remaining two subgoals: namely, give
the first man 1 full canteen of water, 3 half-full canteens of water, and
1 empty canteen, or give the first man 2 full canteens of water, 1
half-full canteen of water, and 2 empty canteens. Once the first subgoal has
been achieved, it is trivially obvious whether or not it is possible to
achieve the second and third subgoals.
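The claim about which first shares leave the problem solvable can be checked by enumeration. The sketch below assumes, as the solution in the text implicitly does, that water is not poured from one canteen to another; the names are illustrative.

```python
from itertools import product

FULL, HALF, EMPTY = 5, 5, 5  # canteens available of each kind


def shares():
    """All (full, half, empty) bundles of 5 canteens holding 2 1/2 canteens of water."""
    return [(f, h, e)
            for f in range(FULL + 1)
            for h in range(HALF + 1)
            for e in range(EMPTY + 1)
            if f + h + e == 5 and 2 * f + h == 5]  # water counted in half-canteens


def divisions():
    """Ordered triples of shares that together use exactly the available canteens."""
    return [trio for trio in product(shares(), repeat=3)
            if tuple(map(sum, zip(*trio))) == (FULL, HALF, EMPTY)]


print(shares())
print(sorted({trio[0] for trio in divisions()}))
```

Three bundles satisfy the first subgoal taken alone, but only (1, 3, 1) and (2, 1, 2) ever appear as the first man's share in a complete division; the all-half-full bundle (0, 5, 0) does not.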
Another problem that illustrates the power of the repeated use of
the subgoal method is the following:
Nine men and two boys want to cross a river, using an inflatable raft that
will carry either one man or the two boys. How many times must the
boat cross the river in order to accomplish this goal? (A round trip
equals two crossings.)
Stop reading and try to solve the problem, using the subgoal method.
Define as a subgoal the problem of getting one man across the river
and getting the boat back to the starting side. Stop reading and attempt
to solve the problem, if you have not already.
It takes exactly four crossings to get one man across the river and
return the boat to the original side. First, the two boys cross the river
in the boat, then one boy takes the boat back to the original side of the
river, then a man takes the boat across the river, and then the second
boy takes the boat back to the original side of the river. These four
crossings put both boys and the boat in the same position they were in
before they transported the first man across the river. Thus, to transport
all nine men across the river will require 9 × 4, or 36, one-way crossings.
At that point, both boys will be on the original side of the river with
the boat, and one additional crossing will be required for them to get
to the goal side of the river with the boat. Thus, a total of 37 one-way
crossings are required in all.
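As a check that the repeated four-crossing subgoal gives the fewest crossings, the river problem can be searched exhaustively. This is a sketch under one assumption beyond the problem statement: a single boy may take the raft alone, which the solution above already relies on. The function name is hypothetical.

```python
from collections import deque


def min_crossings(men: int, boys: int = 2) -> int:
    """Breadth-first search for the fewest one-way crossings.

    A state is (men on the starting bank, boys on the starting bank,
    boat side), where boat side 0 is the starting bank.
    """
    start, goal = (men, boys, 0), (0, 0, 1)
    loads = [(1, 0), (0, 1), (0, 2)]  # one man, one boy, or both boys
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (m, b, side), steps = queue.popleft()
        if (m, b, side) == goal:
            return steps
        for dm, db in loads:
            # Leaving the starting bank removes people from it; returning adds them.
            nm, nb = (m - dm, b - db) if side == 0 else (m + dm, b + db)
            state = (nm, nb, 1 - side)
            if 0 <= nm <= men and 0 <= nb <= boys and state not in seen:
                seen.add(state)
                queue.append((state, steps + 1))
    raise ValueError("no sequence of crossings reaches the goal")


print(min_crossings(9))
```

For nine men the search agrees with the subgoal argument: 9 × 4 + 1 = 37 crossings.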
An example of a somewhat similar subgoal problem in a probability
context is provided by the following example:
The ace, 2, 3, 4, 5, 6, 7, and 8 of hearts are placed face up in a row on the
table. Then a pack of eight cards containing the ace, 2, 3, 4, 5, 6, 7, and
8 of spades is shuffled and placed in front of the player. As each
successive spade is turned over, the corresponding heart is removed from the
row. What is the probability that all the hearts can be removed without a
break (hole) ever occurring in the row of hearts?
Stop reading and try to solve the problem.
Consider as a subgoal the probability that the first removed heart
does not cause a break in the row. Stop reading and try to solve the
problem, using this subgoal.
The probability of achieving the first subgoal (removing one heart
without producing a break in the row) is exactly 2/8. This probability
results from the fact that there are two end positions to the row, and
only the two cards in these end positions may be removed without
causing a break. Since there are eight cards in the row, the probability
is 2/8. If you did not solve the problem before, stop reading and try to
solve the original problem, having achieved the solution to the subgoal.
Once the first subgoal has been achieved of drawing one card from
an end of the row (and its probability has been determined), the
second subgoal should be to compute the probability that the second
card will be removed from an end of the row. If the first subgoal has
been successfully achieved, there are still two cards at the ends, but
now only seven cards in toto. Thus, the probability of successfully
removing a second card from the ends of the row is 2/7. If you have not yet
solved the entire problem, stop reading and try to complete the rest of
the solution on your own.
Continue in this way, defining successive subgoals of removing cards
from either end of the row until all cards have been removed. The
probability that each subgoal will be successfully achieved with a
random shuffling of the pack of spades is evidently
$\frac{2}{8} \cdot \frac{2}{7} \cdot \frac{2}{6} \cdot \frac{2}{5} \cdot \frac{2}{4} \cdot \frac{2}{3} \cdot \frac{2}{2}$, or
$\frac{1}{315}$. The probability of successful achievement of the entire set of
necessary subgoals is simply the product of the probabilities of
achieving each successive subgoal. Note that this probability problem
represents a rather interesting variation in the use of the subgoal
method, since there are, in essence, two parallel sets of subgoals
involved in the problem. On the one hand, there is the series of subgoals
of removing cards from an end of the row, progressively reducing the
row in length without creating a hole. On the other hand, there is the
series of subgoals of computing the probabilities of achieving each
of these subgoals.
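The product of the successive subgoal probabilities can be compared with a brute-force count over all orderings of the shuffled pack. The sketch below is only a check of the arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def no_break(order, n):
    """True if removing row positions in this order never leaves a hole."""
    left, right = 0, n - 1
    for pos in order:
        if pos == left:
            left += 1
        elif pos == right:
            right -= 1
        else:
            return False  # an interior card was removed, creating a break
    return True


def brute_force(n):
    """Probability of no break, counting every ordering of n cards."""
    good = sum(no_break(order, n) for order in permutations(range(n)))
    return Fraction(good, factorial(n))


def product_of_subgoals(n):
    """Product of the chances 2/n, 2/(n-1), ..., 2/2 of drawing an end card."""
    p = Fraction(1)
    for remaining in range(n, 1, -1):
        p *= Fraction(2, remaining)
    return p


print(brute_force(8), product_of_subgoals(8))
```

Both computations give 1/315 for the eight-card row.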
Now consider this rather different problem (previously discussed
in Chapter 3), which also illustrates the use of the subgoal method:
Wanda the witch agrees to trade one of her magic broomsticks to Gaspar
the ghost in exchange for one of his gold chains. Gaspar is somewhat
skeptical that the broomstick is in working order and insists on a
guarantee equal to the number of links in the gold chain. As a guarantee, he
insists on paying by the installment plan, one gold link per day until the
end of the 63-day period, with the balance to be forfeit if the broomstick
malfunctions during the guarantee period. Wanda agrees to this
arrangement, but insists that the installment payment be effected by cutting no
more than three links in the gold chain. Can this cutting be done, and, if
so, what links in the chain should be cut? The chain initially consists of
63 closed gold links arranged in a simple linear order (not closed into
a circle).
Assume it is possible to solve the problem by making only three cuts.
Obviously, if it is possible, then Gaspar and Wanda will have to make
change on various days during the 63-day period. That is, they must
exchange various lengths of chain on the different days, so that Wanda
acquires one extra link each day, since it surely is not possible to
separate the chain into 63 individual links by making only three cuts.
If you have not solved the problem, stop reading and try again.
Still assuming that it is possible to solve the problem, note that, if
it is possible, the solution will result in creating at least three single
links of chain, as well as various other longer lengths of chain. If you
have not solved the problem so far, stop reading and try again.
Having three individual chain links will permit payment of one link
per day from days 1 to 3. Now as the first subproblem, you should
determine the longest chain that can be used, with Wanda making
change, in order to permit payment of an additional link on the fourth
day. Obviously, the solution to this problem is to cut a chain that is
four links long, since Wanda can return the three individual links.
Then the second problem is to cut the longest chain that will
permit payment when the three individual chain links and the four-link
chain have been used up. Obviously, this chain would consist of
eight links. Continue in this manner, defining as subgoals the making
of change, using lengths of chain known to be part of the solution, until
these are all given over to Wanda and a longer chain is required. Then
determine what longer chain is required on that day. Thus, the solution
of the problem is to have 3 individual links of chain, then a chain each
of 4 links, 8 links, 16 links, and 32 links. Since, by inspection, this will
require only three cuts (separating the 4 from the 8, the 8 from the 16,
and the 16 from the 32), the problem is solved.
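The proposed pieces can be checked directly: every total from 1 to 63 links must be payable from some combination of the pieces Gaspar holds. The sketch below also derives where the three cut links would fall if the segments lie along the chain in the order 4, 8, 16, 32, which is an assumption about layout consistent with the separations listed above.

```python
def reachable_totals(pieces):
    """All totals that can be formed from some subset of the pieces."""
    totals = {0}
    for piece in pieces:
        totals |= {t + piece for t in totals}
    return totals


def cut_positions(segments):
    """1-based positions of the single links cut between consecutive segments."""
    positions, end = [], 0
    for length in segments[:-1]:
        end += length + 1  # skip the segment, then land on the cut link
        positions.append(end)
    return positions


PIECES = [1, 1, 1, 4, 8, 16, 32]  # three cut links plus the four segments

print(sum(PIECES))
print(reachable_totals(PIECES) == set(range(64)))
print(cut_positions([4, 8, 16, 32]))
```

Under that layout the cut links are the 5th, 14th, and 31st links of the chain.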
Note that the essential insight for solving the problem is to consider
how to make change on each day of the 63-day period, starting from
the first day and continuing through to the 63rd day, achieving these
subgoals in order.
In addition to the subgoal method, it is also important to the solution
of this problem for you to make the inference from the goal that, when
you have determined where to place the cuts in creating the larger
lengths of chain, you will have also achieved three individual links.
Thus, you should start the process of change making on the initial
days, using the individual links until they are inadequate, and
thereafter continue to use all the known lengths of chain until they are
inadequate, at that point cutting off the largest length of chain that will solve
the problem on that day. In a sense, the original problem is divided
into 63 subgoals, that is, making correct change on each of the 63
days, though only a few of these days are special in that they require
you to exchange one long piece of chain for all previously given shorter
pieces of chain.
A very simple puzzle problem illustrates the use of the subgoal
method in an entirely different context:
Five squares are inserted into a three-by-two rectangle, as illustrated
in the given of Fig. 6-3. Three of the squares have a label A, one square
has label B, and one square has label C. Any square may be moved within
the rectangle to an adjacent square, provided that the square moved into
is empty. The problem is to make a sequence of moves so as to achieve
the goal state, as illustrated in Fig. 6-3.
FIGURE 6-3
The given and goal states for the ABC puzzle. (Panels: Given; Goal.)
Now make up five little squares of paper (or other tokens) that will
fit in the rectangle in Fig. 6-3, and, by moving them around in the
rectangle, attempt to solve the problem. In attempting to solve the
problem, you will find it helpful to try to define a subgoal state that is
on the path from the givens to the goal by some evaluation function.
Moving the ABC squares in the right four cells of the rectangle will
not solve the problem, for the three squares can only be moved in a
cyclic manner within the four squares, which will never change the
relative ordering of the B and C squares, and changing that ordering is
precisely what is required in the goal state. At some point, the B and C
squares must be separated to the first and third columns of the rectangle
in order to achieve a change in the cyclic order of the B and C squares.
With this somewhat vague idea for a subgoal in mind, stop reading,
attempt to define the subgoal more precisely, and then solve the problem,
if you have not done so already.
A more specific definition of the subgoal of separating the B and C
squares to opposite sides of the rectangle is illustrated in Fig. 6-4.
Note that if the B and C squares are separated as in subgoal 1, it is
relatively easy to move B next to C (subgoal 2) and then move the
A's around in such a way that B could be on top of C in the third
column. Thus, in the case of this particular subgoal, you can quickly
verify that you could get from the subgoals to the goal and then
attempt to reach subgoal 1 from the given state. The latter problem is
not too difficult and is left to you as an exercise.
FIGURE 6-4
A useful set of two subgoals for the solution of the ABC puzzle.
(Panels: Subgoal 1; Subgoal 2.)
The relevant evaluation function for defining subgoal 2 is probably
whether the cyclic order of B and C within the right four cells of the
rectangle is BC or CB. Subgoal 2 shares the BC order with the goal
state, whereas the cyclic order in the given state is CB. Since the BC
ordering illustrated in subgoal 2 cannot be achieved from the prior
state by moving B unless B is at some time moved out of the right
four squares, you know that the preceding subgoal must have B in
the extreme left-hand column of the rectangle. Subgoal 1 illustrates
the simplest such possibility in relation to subgoal 2.
Surely one of the most remarkable simple examples of the use of
the subgoal method comes in the Tower of Hanoi (disk transfer)
problem. One version of the problem can be stated as follows:
There are three identical spikes and six disks, each with a different
diameter but each having a hole in the center large enough for a spike to
go through. At the beginning of the problem, the six disks are placed on
one spike, one on top of another, with the largest disk on the bottom, then
the next largest, and so on, in order of decreasing size until the smallest
disk, which is on top. (See Fig. 6-5.) You are permitted to move only one
disk at a time from one spike to another spike, with the restriction that a
larger disk must never be moved on top of a smaller disk. The goal is to
transfer all six disks to one of the other two spikes (without ever
permitting a larger disk to rest on top of a smaller disk).
I believe I was once told some relatively routine mechanical
procedure for solving this problem, but I do not remember it, since I was
not given a proof that indicated why it worked. However, a beautiful
repeated hierarchical (recursive) use of the subgoal method provides
a solution to this problem that dramatically illustrates the power of
the subgoal method. Stop reading and try to solve the problem.
FIGURE 6-5
The given and goal states for the Tower of Hanoi
(disk transfer) problem. (Panels: Given state; Goal state.)
Apropos the problem-solving method (discussed in Chapters 3 and
10) of complete representation (naming) of all the concepts in a
problem, the first step in solving this problem might be to give names (in
this case numbers) to the disks in a manner that easily represents the
one way in which they differ from each other, namely, in diameter. So
let us number the disks 1 to 6 from smallest to largest. In addition,
let us label the problem of transferring six disks from one spike to
another a six-problem, which implicitly recognizes this problem as a
particular case of a larger class of disk-transfer problems (five-problems,
seven-problems, and so on). For convenience in verbal descriptions,
let us also label the spikes A, B, and C. This representation of the
problem is shown in Fig. 6-6. Now stop reading and try again to solve
the problem, if you failed before.
Having represented the problem in this way, I think that it is
reasonably likely that one would think of the following elegant way to
divide the problem into subgoals. Solving a six-problem from A onto C
is equivalent to solving a five-problem from A onto B, moving the
six-disk from A to C, and solving a five-problem from B onto C. The
five-problems are each equivalent to a four-problem, a move of the five-disk,
and another four-problem. In turn, the four-problems can be subgoaled
into two three-problems and a move, and so on. Thus, the entire
problem can be solved by a recursive use of the subgoal method. To actually
implement the method, in this case, you must have the ability to
remember what level of subgoal you are currently working on, but this
problem can easily be solved by making some notes on a piece of paper.
In any event, it is clear that this method solves the problem and, in
addition, this subgoal method gives an excellent insight into the
structure of the Tower of Hanoi problem.
FIGURE 6-6
The given and goal states for the Tower of Hanoi (disk transfer) problem,
with numerical representation. (Disks are numbered 1 to 6; spikes are
labeled A, B, and C.)
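The recursive division into subgoals translates almost word for word into a short program, in which the call stack does the remembering of subgoal levels that the text suggests doing on paper. This is a sketch of the decomposition described above, not the mechanical procedure the author says he was once told; the function name is illustrative.

```python
def transfer(n, source, target, spare, moves):
    """Append to moves the (disk, from, to) steps that solve an n-problem."""
    if n == 0:
        return
    transfer(n - 1, source, spare, target, moves)  # (n-1)-problem onto the spare spike
    moves.append((n, source, target))              # move the n-disk
    transfer(n - 1, spare, target, source, moves)  # (n-1)-problem onto the target spike


moves = []
transfer(6, "A", "C", "B", moves)
print(len(moves))
print(moves[:3])
```

Each n-problem costs two (n - 1)-problems plus one move, so the six-problem takes 2^6 - 1 = 63 moves.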
Story-algebra problems frequently illustrate the usefulness of the
subgoal method. Consider the following simple problem:
Each day, Abe either walks to work and rides his bicycle home or rides
his bicycle to work and walks home. Either way, the round trip takes one
hour. If he were to ride both ways, it would take 30 minutes. How long
would a round trip take, if Abe walked both ways?
Stop reading and try to solve the problem by defining some simple
subgoals.
The first subgoal you might define is to determine how long it takes
Abe to ride one way. You can then determine, as a second subgoal,
how long it takes to walk one way. Then it is trivially easy to determine
how long it takes to walk both ways and to solve the problem. You note
that the time to ride both ways is 30 minutes, from which it is obvious
that the one-way riding trip requires 15 minutes. If these 15 minutes
are subtracted from the one-hour round trip for walking plus riding,
45 minutes remain for the one-way walking trip. Doubling this yields
a round-trip walking time of 90 minutes, which is the solution to
the problem.
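The chain of subgoals in this solution can be written as one intermediate quantity per line. The function and variable names are illustrative, and times are in minutes.

```python
def walk_round_trip(mixed_round_trip, ride_round_trip):
    """Round-trip walking time, reached through two subgoals."""
    ride_one_way = ride_round_trip / 2              # subgoal 1: one-way riding time
    walk_one_way = mixed_round_trip - ride_one_way  # subgoal 2: one-way walking time
    return 2 * walk_one_way                         # goal: walking both ways


print(walk_round_trip(60, 30))
```

With a 60-minute mixed trip and a 30-minute riding trip, the intermediate values are 15 and 45 minutes and the result is 90 minutes.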
Most story-algebra problems are amenable to the subgoal approach.
Instead of working directly to determine the value of the unknown
quantity, you set subgoals of determining various other unknown
quantities that are related to the goal quantity by some known relation.
When all the unknown quantities except the goal quantity have been
determined in the known relation, you can then use the known relation
to solve for the goal quantity. You must also be able to represent the
elements expressed in the story problem in algebraic (equation) form.
However, after skill in algebraic representation, skill in defining
subgoals is probably the next most important element in the solution of
story-algebra problems.
For another simple example of the subgoal method applied to a
simple story-algebra problem, consider the following:
Ingrid brings a quantity of hats to sell at the Saturday market. In the
morning, she sells her hats for $3 each, grossing $18. In the afternoon,
she reduces her price to $2 each and sells twice as many. What was
Ingrid's gross income for the day from the sale of hats?
Stop reading and solve the probl em by defni ng a si mpl e sequence
of subgoal s.
The frst subgoal i s t o determine how many hats are sold i n the
morning. From thi s, it i s tri vi al l y easy to determine the number of
hats sold i n the afernoon, whi ch i s the second subgoal , and then the
gross i ncome for the day.
The specifc sol uti on to the probl em is as fol l ows : I f I ngrid grossed
$ 1 8 i n the morni ng by sel l i ng hats at $3 per hat , she evi dentl y sol d
6 hat s. This impl i es that she sol d 1 2 hats i n the afernoon. Therefore,
I ngrid grossed 6 $3 + 1 2 $2, or $42 for the day.
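
The chain of subgoals in the hat problem can be written out as a short computation. This is only a sketch; the variable names are illustrative and are not part of the original problem.

```python
# Givens from the story problem.
morning_price = 3      # dollars per hat in the morning
morning_gross = 18     # dollars grossed in the morning
afternoon_price = 2    # dollars per hat in the afternoon

# Subgoal 1: number of hats sold in the morning.
morning_hats = morning_gross // morning_price

# Subgoal 2: number of hats sold in the afternoon (twice as many).
afternoon_hats = 2 * morning_hats

# Goal: gross income for the day.
day_gross = morning_hats * morning_price + afternoon_hats * afternoon_price

print(morning_hats, afternoon_hats, day_gross)
```

Each intermediate quantity is a subgoal that feeds the next, so the goal quantity is never attacked directly.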
The subgoal method is also frequently useful in the solution of
geometry problems, as in the following example:

Given the parallelogram ABCD illustrated in Fig. 6-7, prove that the
perpendiculars AE and CF drawn to the diagonal BD are equal.

FIGURE 6-7
Given AB = CD, BC = AD, AE ⊥ BD, and CF ⊥ BD, prove that AE = CF.

Stop reading and try to solve the problem by defining a relevant
subgoal.

One very common way to prove two lines are equal is to prove that
they are corresponding parts of congruent triangles. In the present
case, this could mean either proving that triangle ABE is congruent
to triangle CDF or proving that triangle AED is congruent to triangle
CFB. These two alternative subgoals appear to be equivalent, and
therefore we may arbitrarily choose to work on the subgoal of proving
triangle ABE congruent to triangle CDF. Stop reading and attempt to
solve the problem, possibly by defining a further subgoal.

To prove triangle ABE congruent to triangle CDF, it is helpful to
define a prior subgoal of proving that triangle ABD is congruent to
triangle CDB. Stop reading and attempt to solve the problem, using
this sequence of two subgoals.
Triangle ABD is evidently congruent to triangle CDB, since the
three corresponding sides of each are equal. From this, we can
conclude that angle ABD equals angle CDB in Fig. 6-7. From this, we can
conclude that triangle ABE is congruent to triangle CDF, since both are
right triangles, they have corresponding angles (besides the right
angles) that are equal, and their hypotenuses AB and CD are equal. Now
that these triangles have been proved congruent, side AE equals side CF
by corresponding parts of congruent triangles, and the problem is solved.
Another geometry problem that is quickly solved by use of the
subgoal method is the following:

Given the circle illustrated in Fig. 6-8, proceed from the circumference
along a diameter of the circle for an arbitrary unknown distance, to point
A, then turn perpendicular to the radius and draw a line connecting the
radius to the circumference, point B. Then erect another perpendicular
at B until it intersects, at point C, the diameter perpendicular to the
original diameter. The diameter of the circle is 100 feet. Determine the
length of the line AC.

Stop reading and try to solve the problem by defining relevant
subgoals.
At first, this problem may seem extremely difficult to solve, since
very little numerical information is given in the problem. However, a
reasonable subgoal for determining the length of the line AC is to
determine some triangle to which ABC is congruent, where the length of
one or more of the sides of the second triangle is known. Alternatively,
you could attempt to determine some triangle congruent to the triangle
AOC. Using this subgoal, attempt to solve the problem.

Of course, triangle ABC is congruent to triangle AOC, but this
knowledge is not much help, since you do not know the lengths of any
of the lines in either of these triangles. There are no other triangles
drawn explicitly in Fig. 6-8. Thus, you will have to draw additional
lines in order to define new triangles that may be congruent to the
triangles already given in Fig. 6-8. Clearly, you should draw lines that
result in triangles with one or more sides of known length. The only
known lengths are the diameters and radii of the circle. Thus, the
constructed triangle should evidently include a diameter or radius.
Given this line of reasoning, sooner or later you should hit upon the
idea of drawing the radius OB to define the triangles BOA and BOC,
both of which are congruent to the original triangles ABC and AOC
(easily proved). From this, we conclude that line AC equals line OB,
by corresponding parts of congruent triangles, and, since line OB is
a radius of the circle, we know that line AC equals the radius of the
circle, which is 50 feet.
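
The conclusion that AC equals the radius, however far point A lies from the center, can be spot-checked numerically. The sketch below assumes a coordinate system that is not in the original problem: the center O is at the origin, A lies on the horizontal diameter at distance `a` from O, B is directly above A on the circle, and C is on the vertical diameter at the height of B. The function name `length_ac` is illustrative.

```python
import math

def length_ac(radius, a):
    """Length of AC when A is at distance `a` from the center O (0 <= a <= radius)."""
    # B lies on the circle directly above A, so OA^2 + AB^2 = radius^2.
    ab = math.sqrt(radius**2 - a**2)
    # With O = (0, 0), A = (a, 0), and C = (0, ab), AC is a diagonal of rectangle OABC.
    return math.hypot(a, ab)

for a in (0.0, 10.0, 30.0, 49.9):
    print(a, length_ac(50.0, a))
```

Every choice of `a` gives the same length, which is what the congruent-triangle argument predicts: AC and OB are the two diagonals of the same rectangle.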
The common practice in mathematics of conjecturing and proving
one or more lemmas as subgoals on the way to proving some major
theorem is a good example of the use of the subgoal method. The
skillful definition of lemmas (subgoals) to aid in proving a difficult
theorem depends on having simple and complete representations of the
relevant mathematical concepts and very good evaluation functions
based on such elegant representations and experience in theorem
proving. Some of this ability to represent concepts elegantly and
define good evaluation functions can be gained by studying general
problem-solving methods and applying them to problems requiring no
specialized knowledge. However, you cannot expect to be able to
prove difficult theorems in some area of mathematics without extensive
study of the concepts, assumptions, operations, and so on, in that
area. To solve problems in a specialized area of knowledge, one must
have learned certain elegant ways of representing concepts in that
area. This knowledge is often required in order to define good
evaluation functions for use in hill climbing and the subgoal method.

FIGURE 6-8
Given the circle illustrated above with a diameter of 100 feet,
determine the length of the line AC.
Another mathematical proof technique that is an ingenious use of
the subgoal method is mathematical induction. Mathematical induction
can be used to prove theorems that in some way involve natural
numbers (positive integers). Let the goal expression you are trying
to prove be represented by E(n), where n stands for any natural number
(n ≥ 1). The problem of proving E(n) true for any natural number,
n, can be divided into two subproblems: first, proving E(n) true for
n = 1, and, second, proving that if E(n - 1) is true, then E(n) is true.
For example, consider a mathematical induction proof of the
theorem that the sum of the first n positive integers,
$\sum_{i=1}^{n} i$, is $n(n+1)/2$:

(1) $\sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}$, so true for n = 1.
(2) Assume true for (n - 1): $\sum_{i=1}^{n-1} i = \frac{(n-1)n}{2}$
(3) Add n to both sides: $\sum_{i=1}^{n} i = \frac{(n-1)n}{2} + n$
(4) Put over common denominator: $\sum_{i=1}^{n} i = \frac{(n-1)n + 2n}{2}$
(5) Factor: $\sum_{i=1}^{n} i = \frac{n(n-1+2)}{2}$
(6) Simplify: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$
Q.E.D.

Step (1) establishes that the theorem is true for n = 1, and steps (2)
through (6) establish that if the theorem is true for n - 1, it is true for n.
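
The two subproblems of the induction can also be spot-checked by machine. The following sketch is not a proof, since it only tests finitely many values of n, but it mirrors the structure of the argument: a base case, and a step from n - 1 to n. The function name `closed_form` is illustrative.

```python
def closed_form(n):
    """The claimed value of 1 + 2 + ... + n."""
    return n * (n + 1) // 2

# Base case, step (1): the formula holds for n = 1.
assert closed_form(1) == 1

# Inductive step, steps (2)-(6): adding n to the formula for n - 1
# gives the formula for n. Checked here only for n up to 1000.
for n in range(2, 1001):
    assert closed_form(n - 1) + n == closed_form(n)
    assert closed_form(n) == sum(range(1, n + 1))

print(closed_form(100))
```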

Contradiction
As mentioned in Chapter 3, amateur problem solvers often do not pay
enough attention to the goal or the set of possible goals as part of the
information in a problem. They apply operations to the givens in an
attempt to get to the goal, but they frequently do not consider applying
operations to the possible goals in order to get to the givens or to
meet the givens halfway.

In Chapter 3 we were concerned with inferences about the goal
that could be made primarily from the partial information the solver
already possessed about the goal, but also from given information or
from information about both givens and the goal. Here we are also
concerned with inferences that can be made from the goal in conjunction
with the givens. However, the purpose of the types of inferences
I will discuss here is quite different from the purpose of those in
Chapter 3, where the purpose was to clearly specify the parts of the goal,
so that we could more easily see exactly what was to be derived from
the givens. By contrast, the purpose of the types of inferences to be
discussed now is to derive an inference that contradicts some piece
of given information.

Deriving a contradiction proves that the goal could not possibly be
obtained from the givens, since it is inconsistent with the givens. This
method of contradiction is appropriate for several types of problems in
which you must decide which of two or more goals could be derived
from the givens. The method of contradiction only tells you which
goals cannot be derived from the givens. However, in some problems,
the ability to decide whether a possible goal does or does not
contradict the givens may be all that is required to solve the problem.

Problems for which the method of contradiction is appropriate
include those where you must only determine whether a goal is
consistent with the given information, not necessarily whether it could be
derived from the given information using some particular set of
operations. The method is also appropriate for problems that guarantee that
exactly one of several alternative goals can be derived from the given
information. Here, if all alternatives but one can be ruled out, then
that one must be derivable from the givens, and the method of
contradiction constitutes a sufficient proof of it.

Many problems make the guarantee that one out of several alternative
goals can be derived from the givens. In this chapter, I will discuss
in four sections the problems that illustrate how the method of
contradiction can be applied. The four sections are as follows.
Indirect Proof. The first section will be concerned with problems with
only two or three alternative goals. It will focus on the method of
indirect proof in mathematics, where the two alternative goals are
usually that some statement is either true or false. We are not interested
in whether we can always say a statement is either true or false but
rather in the use of the method of contradiction in those cases where
it is assumed that only these two alternatives can hold. In such cases,
if a person can show that one of the two alternatives leads to a
contradiction, then the other alternative has been proved.

Multiple Choice - Small Search Space. The second section will be
concerned with the method of contradiction in problems involving a
small (from two to 10) set of alternative goals that are mutually
inconsistent (only one of the goals can be derived from the givens). In
problems involving a small set of alternative goals, it is feasible to
systematically apply the method of contradiction to every alternative
goal. Examples of such problems include multiple-choice examination
problems and certain logic problems.

Classificatory Contradiction - Large Search Space. The third section
will be concerned with the use of the method of contradiction in
problems in which there is a large, but discrete and finite, population
of alternative goals. In these problems, it is generally not feasible to
systematically search every alternative. It is necessary to devise some
more efficient search procedure that contradicts large classes of
alternative goals simultaneously. Problems in this category include the
coin-weighing problem discussed earlier, many concept-attainment
problems, and letter-arithmetic problems.

Classificatory Contradiction - Infinite Search Space. The fourth
section will be concerned with the use of the method of contradiction
in problems involving infinite (often continuous) populations of
goals. In these problems, it is clearly impossible to contradict each
goal individually; the solver must contradict infinitely large classes
on the basis of some common property. An example of this case is
provided by the half-interval search technique in the numerical
solution for roots of equations.
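
As a preview of the last category, here is a minimal sketch of half-interval search. It assumes a continuous function whose values at the two ends of the starting interval have opposite signs; the function name `half_interval_root` and the tolerance `tol` are illustrative choices, not taken from the text.

```python
def half_interval_root(f, lo, hi, tol=1e-10):
    """Approximate a root of a continuous f on [lo, hi], given a sign change."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            # A sign change (or an exact zero) lies in [lo, mid]:
            # the whole class of candidates in (mid, hi] is set aside at once.
            hi = mid
        else:
            # No sign change in [lo, mid], so keep searching in [mid, hi].
            lo = mid
    return (lo + hi) / 2

print(half_interval_root(lambda x: x * x - 2, 0.0, 2.0))
```

Each pass through the loop discards an infinite set of candidate values on the basis of one common property, the absence of a sign change, rather than testing candidates one at a time.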
INDIRECT PROOF
The method of indirect proof in mathematics is an extremely important
example of the problem-solving method of contradiction. To prove that
a statement follows from certain givens, the method of indirect proof
is to assume the contrary is true and show that the contrary statement,
in combination with the givens, results in a contradiction. Therefore,
since the contrary statement is false, the original statement must be
true. You will note that, in this case, for the method of indirect proof
to be valid there must be only two possible alternatives: either the
goal statement is true (can be derived from the givens) or it is false
(the contradiction of the statement can be derived from the givens, but
the original statement cannot). For the method of contradiction to be
valid, the statement must be either true or false. The truth value of
the statement cannot be undecidable. In addition, the set of given
statements must themselves be free of internal contradiction;
otherwise, contradictions could be derived from a possible goal in
combination with the givens, not because of a contradiction between the
goal and the givens, but because of a contradiction within the givens.
However, the beginning student need not be very concerned with these
limitations on the use of the method of contradiction. By and large,
whenever it appears reasonable to use the method of contradiction,
it is valid to use it.
There are, of course, innumerable examples of the use of indirect
proof in every area of mathematics. Here is one example:

Given that you have already proved the theorem that all squares of
nonzero integers are positive, prove that the equation $x^2 + 1 = 0$
has no integer solution.

Stop reading and attempt to prove the theorem, using the method
of contradiction (indirect proof).

The first step in applying the method of contradiction to this problem
is to assume the contradiction of the theorem, namely, that
$x^2 + 1 = 0$ has an integer solution, x = a, where a is an integer.
If you have so far not solved the problem, stop reading and try again.

Given that $x^2 + 1 = 0$, subtract 1 from both sides of the equation to
get $x^2 = -1$. Now substitute a for x, getting $a^2 = -1$. This result
is a contradiction to the already proved theorem that the square of any
nonzero integer must be positive (and a = 0 gives $a^2 = 0$, not -1).
A famous proof of the existence of irrational numbers also uses the
method of contradiction. Rational numbers are numbers expressible
by simple fractions, m/n, of integers m and n, where n is nonzero.
To show the existence of irrational numbers, we need to show that
there exists at least one such number, for example, $\sqrt{2}$.

Given an isosceles right triangle with sides of unit length, the Pythagorean
Theorem asserts that the length of the hypotenuse equals the square root
of the sum of the squares of the lengths of the sides; namely,
$c = \sqrt{1^2 + 1^2} = \sqrt{2}$. Prove that the length of the
hypotenuse of this triangle, namely $\sqrt{2}$, is irrational.

Stop reading and try to solve the problem, using the method of
contradiction.

The contradiction of the theorem is to assume that $\sqrt{2}$ is rational
and therefore can be expressed as the ratio of two integers, m/n, where
both m and n are integers (greater than zero). Also, when we assume
$\sqrt{2}$ equals m/n, we can assume that m and n have no common factors,
since these common factors could already have been canceled out.
If you have not solved the problem so far, stop reading and try again.

From the above, we derive that $2n^2 = m^2$, which implies that $m^2$ is
even. This result in turn implies that m is even (m = 2p, where p is
an integer). If you have still not solved the problem, stop reading and
try again.

If m = 2p, we can substitute 2p for m in the equation $2n^2 = m^2$,
obtaining $2n^2 = (2p)^2 = 4p^2$. From this result we obtain
$n^2 = 2p^2$, which implies that $n^2$ is even. Again, this result
implies that n is even (contains a factor of 2). However, we have now
derived that both m and n are even (contain a common factor of 2),
contradicting the hypothesis that m and n have no common factors. Thus,
the contradiction of the theorem is false, and $\sqrt{2}$ must be
irrational.
A common feature of both these examples of indirect proof, and one
that is characteristic of most proof problems to which the method of
indirect proof is well suited, is that the contradiction of the theorem
permits a larger number of specific consequences to be derived from
it than does the original statement of the theorem. This feature gives
you a great deal more to work with by concentrating on the contradiction
of the theorem than you would have by concentrating on the
original statement of the theorem. It should be clear, then, why the
method of contradiction is so useful in cases like this.
The method of contradiction is also used in proof problems where
there are two or more incorrect alternatives to the correct theorem,
each of which must be disproved by contradiction when combined
with the givens. For example, consider a proof of the following theorem:

You are given three assumptions or previously proved theorems. (1) If
c > 0 and a = b, then ac = bc. (2) If c > 0 and a > b, then ac > bc. (3) The
law of trichotomy obtains: for any a and b, one and only one of three
alternatives holds: a < b, a = b, or a > b. Using these givens, prove that,
if c > 0 and ac < bc, then a < b.

Stop reading and try to prove the theorem, using the method of
contradiction.

To prove this theorem, you must test two incorrect alternatives to
show that they result in contradictions, namely, a > b and a = b. If
you have not already proved the theorem, stop reading and try again.

First, let us assume a > b. If a > b, then with c > 0, we know that
ac > bc. But this result is a contradiction to the given information that
ac < bc, by the law of trichotomy. Similarly, to rule out the alternative
that a = b, we derive from a = b and c > 0 that ac = bc, which
contradicts the given information that ac < bc, by the law of trichotomy.
Therefore, the only remaining possibility, by the law of trichotomy, is
that a < b, which was to be proved. Note that we had to rule out two
alternatives before we could conclude the theorem proved, although
in this case the method of contradicting each alternative was
extremely similar.
The method of indirect proof shows up in an enormous variety of
problems. For example, recall that one essential part of the solution to
the notched-checkerboard problem in Chapter 3 was to assume that
there was a method of covering 62 squares with 31 dominoes. These
31 dominoes must cover 31 black squares and 31 white squares. From
the given information, we can derive that removing the two diagonally
opposite squares of the checkerboard will produce 32 squares of one
color and 30 squares of the other color, resulting in a contradiction.
Thus, there is no method of covering the 62 remaining squares with
31 dominoes.
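
The color count that produces the contradiction is easy to confirm by enumeration. In this sketch the board is represented as (row, column) pairs, and a square is called "dark" when row plus column is even; both conventions are assumptions made for illustration.

```python
# All 64 squares of the checkerboard as (row, column) pairs.
squares = {(r, c) for r in range(8) for c in range(8)}

# Remove two diagonally opposite corner squares.
squares -= {(0, 0), (7, 7)}

# Count the two colors among the remaining squares.
dark = sum((r + c) % 2 == 0 for r, c in squares)
light = len(squares) - dark

print(len(squares), dark, light)
```

Since each domino covers one square of each color, 31 dominoes would need 31 squares of each color, which the counts rule out.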
Similarly, in the method used in Chapter 3 to establish the minimum
number of cuts needed to solve various cube-cutting problems, we
implicitly ruled out a smaller number of cuts by contradiction with the
implicit given information that not more than one face of a subcube
could be cut at a time.
The method of indirect proof is often useful in plane geometry proof
problems. Indeed, high school plane geometry is usually the first
opportunity for most students to become acquainted with the method of
indirect proof. One reason why most people are so suspicious of the
method of indirect proof when they first encounter it is that they
encounter it so late. I suspect they unconsciously feel that so basic
a method of proof should have been explained to them much earlier
in their lives, as indeed it should have been. Be that as it may, plane
geometry proof problems often demonstrate the method of indirect
proof, and the following is a particularly simple example:

Given the assumption that two distinct points determine one and only
one straight line, prove that two lines can intersect at no more than
one point.

Stop reading and try to prove this theorem, using the method of
contradiction.

First, assume the contrary, namely, that there exist two lines that
intersect in at least two points, A and B. If you did not solve the
problem thus far, stop reading and try again.

Since the two straight lines intersect in points A and B, there are
two distinct straight lines passing through the points A and B.
However, this is contrary to the assumption that two points determine one
and only one straight line. Thus, the contrary of the theorem is
contradicted, and so the theorem is proved.
Finally, consider the following plane geometry problem as an
example of the use of indirect proof:

You are given two assumptions or previously proved theorems. (1) A
straight line is a 180° angle. (2) Two lines are perpendicular if they
make a 90° angle where they intersect. From these assumptions, prove
that from a point on a line, only one perpendicular line can be erected.

Stop reading and try to prove this theorem, using the method of
contradiction.

To prove this geometric theorem it is useful, as it almost invariably
is in plane geometry problems, to construct a figure. Consider Fig. 7-1.
To prove that at most one line can be perpendicular to another line at
a given point, assume the contrary, namely, that at least two lines
can be drawn perpendicular to a given line through a given point. In
Fig. 7-1 this assumption is represented by the two lines drawn through
point A and represented as being perpendicular to line C. If you have
not yet proved the theorem, stop reading and try again.

According to the hypothesis that the two perpendiculars are distinct,
there is some angle x between them, x > 0. Each of the perpendiculars
forms a 90° angle with line C. Thus, the angle along the straight line C
equals 90° + x + 90° > 180°, which is contrary to the assumption that a
straight line is a 180° angle.

FIGURE 7-1
Figure to prove by the method of contradiction that, at a given point, A,
on a line, C, only one perpendicular can be erected.
MULTIPLE CHOICE-SMALL SEARCH SPACE
Besides being useful for proving theorems, the method of contradiction
is useful in the solution of a wide variety of other problems where there
are usually more than two alternatives. Whenever you are guaranteed
that exactly one of a small set of alternative goals is consistent with
(or follows from) the given information, it is possible to determine
which goal, by systematically examining each and deriving a
contradiction from all but one of them.

Problems given on tests with multiple-choice answers have so few
choices (five or fewer) that contradiction is frequently the ideal solution
method. Simply take each alternative answer in turn and determine
whether it is consistent with the given information. That is, combine
each possible answer with the given information to attempt to derive
a contradiction. If you can derive contradictions for all the answers
except one, then that remaining answer is correct. For example,
consider the following potential exam problem:
The sol uti on of V7x 3 + v 2 i s: (A) x
3 , (B) x
g (C) x 2,
( D) x I , ( E) x O.
Stop reading and try to solve this problem, using the method of
contradiction.

The slow way to solve the problem is to perform various operations
on both sides of the equation (adding, subtracting, multiplying, dividing,
squaring both sides of the equation). The fast way is to substitute each
of the alternative values of x into the equation and see which ones
work. In this case, only x = 1 is consistent with the equation, so (D)
is the answer.
Solving for the values of one or more variables that satisfy one or
more equations is a primary example of problems where often the
givens and the goal should be combined. In these problems, there may
be several values of a variable, or several sets of values of the several
variables, that satisfy the equation or equations. You are being asked
to determine one or more such solutions that satisfy the equations,
which is in essence saying that consistency of the givens and goal
statements is all that is demanded.

When you encounter a problem like this on a multiple-choice test,
with a small number of choices for solutions of the equations, then
the ideal problem-solving method is contradiction. You simply try
each of the alternative solutions in turn to see if it satisfies the
equations, that is, yields an identity, with both sides equal, for every
equation when the set of values of variables is substituted into the
equations. All of the incorrect answers will produce contradictions
such as 5 = 3. For example, consider the following problem:

Which of the following is a solution of the cubic equation
$x^3 + 4x^2 - 7x - 10 = 0$? x equals: (A) -2, (B) -5, (C) 4, (D) 3,
(E) none of these.

Stop reading and try to solve this problem, using the method of
contradiction.

In this instance, you can factor the cubic into three linear factors,
yielding three real roots. However, a much faster way to solve the
problem is to check each of the first four specific alternative answers
for consistency with the cubic equation.
Since one of the answers, namely x = -5, is consistent with the
cubic equation, and all the rest are not, we know that (B) is the
correct answer. Since all you are asked is whether a particular alternative
goal is consistent with the given information (the cubic equation),
determining one answer that is consistent is sufficient to solve the
problem. You need not actually derive contradictions in the case of
alternatives (C) and (D), except as a check that your determination of
consistency in the case of alternative (B) was correct. If alternative
(E) were an expression such as "several of these," it would be
necessary to derive contradictions to all but one of the first four
alternative answers in order to rule out this alternative.
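
Checking each alternative against the cubic is a purely mechanical substitution, as the following sketch shows. The names `cubic`, `choices`, and `consistent` are illustrative.

```python
def cubic(x):
    """Left-hand side of the equation x^3 + 4x^2 - 7x - 10 = 0."""
    return x**3 + 4 * x**2 - 7 * x - 10

# The four specific alternatives offered in the problem.
choices = {"A": -2, "B": -5, "C": 4, "D": 3}

# An alternative is consistent with the equation only if the cubic vanishes there;
# any other value is a contradiction of the form "nonzero = 0".
consistent = [label for label, x in choices.items() if cubic(x) == 0]

for label, x in choices.items():
    print(label, x, cubic(x))
print(consistent)
```

Only one alternative leaves the left-hand side equal to zero; each of the others yields a nonzero value and is thereby contradicted.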
Contradiction is useful in examinations on a very wide variety of
problems. For example, consider the following:

The base of our number system is 10. If the base were changed to four,
you would count as follows: 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30,
and so on. The 22nd number in the base-four system is: (A) 22, (B) 37,
(C) 64, (D) 104, (E) 112.

Stop reading and try to answer this question, using the method of
contradiction.

In a number system with a base of four, the digits 4, 5, 6, 7, 8, and 9
cannot appear. Thus, alternative answers (B), (C), and (D) could not
possibly be correct. In addition, the number 22 has already been used
prior to achieving the 22nd number in the base-four system. Thus, the
only possible answer of the five that could be correct is answer (E),
which is 112.
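
The two eliminations used above, invalid digits and a numeral that occurs too early in the count, can be sketched as follows. The helper `to_base` and the variable names are illustrative; `int(text, 4)` is Python's built-in conversion of a base-four numeral to an integer.

```python
def to_base(n, base):
    """Write the positive integer n as a numeral in the given base (2 to 10)."""
    digits = ""
    while n > 0:
        digits = str(n % base) + digits
        n //= base
    return digits or "0"

choices = {"A": "22", "B": "37", "C": "64", "D": "104", "E": "112"}

# First elimination: a base-four numeral may use only the digits 0, 1, 2, 3.
valid_digits = {k: v for k, v in choices.items() if all(ch in "0123" for ch in v)}

# Second elimination: the numeral must be the 22nd one in the count.
survivors = [k for k, v in valid_digits.items() if int(v, 4) == 22]

print(sorted(valid_digits), survivors, to_base(22, 4))
```

The direct conversion `to_base(22, 4)` is included only as a cross-check; the point of the example is that the answer can be found without it.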
Some problems are necessarily solved by contradiction. For
example, consider the following problem:

The formula expressing the relationship between x and y in the table is:

x    1    2    3    4    5
y    2    3    6   11   18

(A) $y = 2x + 1$, (B) $y = -x^3 + 2x^2 + 1$,
(C) $y = x^4 - 2x^3 + 3x^2 - x + 1$, (D) $y = x^2 - 2x + 3$,
(E) $y = x^2 + 1$.

Stop reading and try to solve this problem, using the method of
contradiction.
The correct answer is alternative (D), and there is really no other
way to determine that (D) is the correct answer except by using the
method of contradiction. Infinitely many different functions are
consistent with any finite number of points, and infinitely many different
functions are inconsistent with any finite number of points. All we can
do is to determine which functions are consistent and which are
inconsistent by checking for contradictions between the proposed
functions (goals) and the given information about points on the function.
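
A sketch of this check follows. The table values and the candidate formulas are transcribed from the problem as best they can be read; formulas (B) and (C) in particular should be treated as assumptions. The names `table`, `candidates`, and `first_contradiction` are illustrative.

```python
# Points given in the table: x -> y.
table = {1: 2, 2: 3, 3: 6, 4: 11, 5: 18}

# Candidate formulas (goals); (B) and (C) are assumed readings.
candidates = {
    "A": lambda x: 2 * x + 1,
    "B": lambda x: -x**3 + 2 * x**2 + 1,
    "C": lambda x: x**4 - 2 * x**3 + 3 * x**2 - x + 1,
    "D": lambda x: x**2 - 2 * x + 3,
    "E": lambda x: x**2 + 1,
}

def first_contradiction(formula):
    """Return the first x at which the formula disagrees with the table, or None."""
    for x, y in table.items():
        if formula(x) != y:
            return x
    return None

contradictions = {label: first_contradiction(f) for label, f in candidates.items()}
survivors = [label for label, x in contradictions.items() if x is None]

print(contradictions, survivors)
```

Note that a candidate can agree with the first point and still be contradicted by a later one, so every point must be tried before a formula is accepted.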
No one would object to your using the method of contradiction to
solve such problems in an examination situation and, of course, to
using it in the case of indirect proof. However, I have occasionally
heard objections to its being used in some of the above examples of
multiple-choice problems, where there are direct methods for
determining the goal from the given information. Some teachers have even
protested that for students to use the answers and look for
contradictions is mildly immoral, that it "educates people to be test takers."
It is true that such students are not demonstrating their knowledge
of the direct algorithmic methods for obtaining the goal from the given
information. However, I think we must face the fact that these types
of test questions simply do not adequately assess a student's knowledge
of algorithmic methods, because the method of contradiction can be
used in place of the algorithms. Some students are inevitably going to
use the method of contradiction whether anybody tells them about it
or not, and this only introduces an extra source of noise in the
relative assessment of the understanding of different students. Test
questions must be made foolproof; it will not do to ask students to be fools.
Furthermore, there are many times when a teacher wants to assess
students' knowledge of certain specific mathematical concepts with
questions that can be answered using the method of contradiction,
and it is either not possible or the teacher does not care to assess their
understanding of any algorithmic method for generating the solution.
Another class of problems that involves a search among a small
population of alternative goals is the recreational logic problems
that make up such a large part of problem books. In many of these
problems it is difficult to make inferences from the givens to the goal,
but it is generally quite easy to test any given assumption about the
goal for consistency with the given information. Since the number of
alternative goals is often quite small, the method of contradiction is
ideally suited to the solution of such problems. Some of the most
interesting recreational logic problems, and the most difficult to
solve by inferences from the givens to the goal, are problems
involving the possibility that some of the given information is false. These
are the famous liar and truth-teller (truar) problems. We discussed
one such problem in Chapter 3 in connection with the need for having
a clear understanding of the goal; let us consider it again here from
the standpoint of the method of contradiction:

The country of Marr is inhabited by two types of people, liars and truars
(truth tellers). Liars always lie and truars always tell the truth. As the
newly appointed United States ambassador to Marr, you have been
invited to a local cocktail party. While consuming some of the native spirits,
you are engaged in conversation with three of Marr's most prominent
citizens: Joan Landill, Shawn Farrar, and Peter Gant. At one point in
the conversation Joan remarks that Shawn and Peter are both liars. Shawn
vehemently denies that he is a liar, but Peter replies that Shawn is indeed
a liar. From this information, can you determine how many of the three
are liars and how many are truars?
Stop reading and try to solve this problem, using the method of
contradiction. You will recall from Chapter 3 that all we need to
determine is how many of the three are liars, namely, whether there are
zero, one, two, or three liars among the three people. However, in
order to determine this number, it is useful to consider all eight
possibilities for the liar-versus-truar status of Joan, Shawn, and Peter:
all three are truars; all three are liars; Joan is a liar, but Shawn and
Peter are truars; Shawn is a liar, but Joan and Peter are truars; Peter
is a liar, but Joan and Shawn are truars; Joan and Shawn are liars, but
Peter is a truar; Joan and Peter are liars, but Shawn is a truar; and,
finally, Shawn and Peter are liars, but Joan is a truar. It is easy to
test the consistency of each of these eight possibilities with the given
information. For example, all three cannot
be truars, since Joan would not then say that both Shawn and Peter
were liars. All three cannot be lying, since Peter would not then say
that Shawn was a liar. We can also rule out each of the three
possibilities in which there are one liar and two truars in the group. Of
the remaining three possibilities, we can rule out the possibility that
Joan is a truar and both Shawn and Peter are liars, but we cannot find a
contradiction to either of the other two possibilities, namely, that Joan
and Shawn are liars and Peter is a truar, or that Joan and Peter are liars
and Shawn is a truar. As discussed in Chapter 3, the inability to decide
between these two possibilities is of no consequence to the solution
of the original problem, since all we were asked to determine was how
many of the three are liars. Under either of the two possibilities that
are not contradictory with the given information, there are exactly two
liars and one truar, which is the answer to the problem.
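The eight-way test just described can also be carried out mechanically. The following is only a minimal sketch in Python, not part of the original discussion; the function name `consistent` and the convention that `True` stands for truar are assumptions made for illustration. A possibility survives only if each person's status agrees with the truth of what that person said.

```python
from itertools import product

def consistent(joan, shawn, peter):
    """Each argument is True for a truar, False for a liar (our convention)."""
    joan_says = (not shawn) and (not peter)  # "Shawn and Peter are both liars"
    shawn_says = shawn                       # "I am not a liar"
    peter_says = not shawn                   # "Shawn is a liar"
    # A truar's statement must be true and a liar's statement must be false.
    return joan == joan_says and shawn == shawn_says and peter == peter_says

# Test all eight possibilities and keep those that are not contradicted.
survivors = [c for c in product((True, False), repeat=3) if consistent(*c)]
liar_counts = {c.count(False) for c in survivors}

for joan, shawn, peter in survivors:
    print("Joan truar:", joan, "| Shawn truar:", shawn, "| Peter truar:", peter)
print("Possible numbers of liars:", liar_counts)
```

Two possibilities survive, and both contain exactly two liars, in agreement with the reasoning above.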
Instead of examining all eight combinations of liar-and-truar status
for each of the three people, it is possible to use the method of
contradiction somewhat more efficiently by considering various classes of
the eight alternatives. For instance, a judicious choice would be to
consider the class of possibilities in which Joan is a truar. All four
members of this class can be shown to be contradictory to the
given information, since then both Shawn and Peter must be liars and
Peter would then be telling the truth in calling Shawn a liar, which is a
contradiction. This classificatory use of the method of contradiction is
discussed more extensively in the next section. In problems involving
search through only a small set of alternatives, it is usually quickest
to test each of the possibilities individually for consistency with the
given information.
Another liar-truar problem that simply illustrates the usefulness of
the method of contradiction is as follows:
The Nelsons have gone out for the evening, leaving their four children
with a new babysitter, Nancy Wiggens. Among the many instructions the
Nelsons gave Nancy before they left was that three of their children were
consistent liars and only one of them consistently told the truth, and told
her which one. But in the course of receiving so much other information,
Nancy forgot which child was the truar. As she was preparing dinner for
the children, one of them broke a vase in the next room. Nancy rushed
in and asked who broke the vase. These were the children's statements:
Betty:  Steve broke the vase.
Steve:  John broke it.
Laura:  I didn't break it.
John:   Steve lied when he said I broke it.
Knowing that only one of these statements was true, Nancy quickly
determined which child broke the vase. Who was it?
Stop reading and try to solve the problem, using the method of
contradiction.
There are two possible approaches to this problem. First, we might
try to test each of the four possibilities for who broke the vase. This
approach appears to be the most direct way to the goal; however, it
will not work until we first determine which of the four is telling the
truth and which three are lying. When the liar-versus-truar status of
the four children has been determined, it is trivial to determine who
broke the vase. Thus, to apply the method of contradiction successfully
to the problem, we should test the four possibilities in regard to
which of the children is a truar. If you did not solve the problem before,
stop reading and try again, using this indirect application of the method
of contradiction.
Betty cannot be the truar, since then both Betty and Laura would be
telling the truth, contrary to the information that only one can be
telling the truth. For the same reason, Steve could not be telling the
truth, since then both Steve and Laura would be telling the truth. Laura
cannot be telling the truth because then, if John is lying, Steve is
telling the truth, contrary to the information that only one child can be
telling the truth. The only possibility that is consistent with the given
information is that John is telling the truth and Betty, Steve, and Laura
are lying. Given this, it is trivial to determine that Laura must be the
one who broke the vase.
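As a check on this argument, the hypotheses can be enumerated by machine. The sketch below is an illustration only (the helper `statements` and the variable names are assumptions, not part of the original text): it pairs each candidate truar with each candidate culprit and keeps the pairs in which the truar's statement is true and the other three statements are false.

```python
CHILDREN = ("Betty", "Steve", "Laura", "John")

def statements(culprit):
    """Truth value of each child's statement if `culprit` broke the vase."""
    steve_says = culprit == "John"       # "John broke it."
    return {
        "Betty": culprit == "Steve",     # "Steve broke the vase."
        "Steve": steve_says,
        "Laura": culprit != "Laura",     # "I didn't break it."
        "John": not steve_says,          # "Steve lied when he said I broke it."
    }

# Keep each (truar, culprit) hypothesis that produces no contradiction:
# the truar's statement must be true, every other statement false.
solutions = [
    (truar, culprit)
    for truar in CHILDREN
    for culprit in CHILDREN
    if all(statements(culprit)[child] == (child == truar) for child in CHILDREN)
]
print(solutions)
```

Only one of the sixteen pairs survives: John as the truar and Laura as the culprit.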
One of my all-time favorite recreational logic problems is the famous
Smith, Jones, and Robinson problem:
Smith, Jones, and Robinson are the brakeman, fireman, and engineer of
a train, not necessarily respectively. Today only three passengers are
riding this train, and, by an extraordinary coincidence, their last names
are the same as the last names of the brakeman, fireman, and engineer.
To distinguish the passengers from the trainmen, let us refer to the
passengers with the title Mr.: Mr. Smith, Mr. Jones, and Mr. Robinson.
Here is some other relevant information:
(A) Mr. Robinson lives in Detroit.
(B) The brakeman lives halfway between Chicago and Detroit.
(C) The passenger who lives in Chicago has the same name as the
brakeman.
(D) The brakeman's next-door neighbor, one of the passengers, earns
exactly three times as much as the brakeman.
(E) Mr. Jones earns exactly $2,000 a year (and collects a lot of food
stamps and welfare payments).
(F) Smith beat the fireman at billiards.
Who is the engineer?
Stop reading and try to solve the problem, using the method of
contradiction.
The most direct application of the method of contradiction to this
problem would be to test the three possibilities for the name of the
engineer (Smith, Jones, or Robinson) against the given information.
As in the previous problem, this most direct approach is not the best,
since none of the six statements of information includes any reference
to the engineer. Thus, it is obvious that if we are to determine the name
of the engineer, we must consider some more indirect approach, which
first involves determining who might be the brakeman or the fireman,
who might live next door to whom, who might live in what city, and
so on. If you did not solve the problem thus far, stop reading and
try again.
A minimal expansion of the search space of alternatives, using the
method of contradiction, is to consider each of the six possibilities
for the assignment of names to the brakeman, fireman, and engineer,
as illustrated in the following table (S = Smith, J = Jones, R = Robinson):

                   Hypotheses
Person        1    2    3    4    5    6
Brakeman      S    S    J    J    R    R
Fireman       J    R    S    R    S    J
Engineer      R    J    R    S    J    S

Now we examine the six pieces of information to determine which of
these six possibilities produces a contradiction and therefore can be
eliminated from consideration. In the first place, hypotheses 3 and 5
can be eliminated, because condition (F) says that Smith beat the
fireman at billiards; assuming a person cannot beat himself, then, Smith
cannot be the fireman. All but one of the remaining four possibilities
can be eliminated by verbal reasoning, but it can be a little confusing.
A great deal of the information in the problem concerns the passengers
and, in particular, where they live. Thus, it probably would be
helpful to go a step further away from the direct approach to the goal
and to try to test various possibilities for the assignment of passengers'
names to locations. If you have not solved the problem already, stop
reading and try again, using the method of contradiction as applied to
the various possibilities for home addresses of the three different
passengers.
There are three home addresses for the passengers: Chicago, Detroit,
and halfway between Chicago and Detroit. Furthermore, one
and only one passenger lives in each of the three locations. Since the
given information says that Mr. Robinson lives in Detroit, there are
only two remaining possibilities for the complete assignment of
passengers to home addresses: either Mr. Jones lives in Chicago and Mr.
Smith lives between Chicago and Detroit or else Mr. Smith lives in
Chicago and Mr. Jones lives between Detroit and Chicago. Since Mr.
Jones earns exactly $2,000, and $2,000 is not divisible by 3, and the
brakeman's next-door neighbor earns exactly three times as much as
the brakeman, Mr. Jones cannot live next door to the brakeman (halfway
between Chicago and Detroit). Thus, Mr. Jones must live in Chicago
and Mr. Smith must live halfway between Detroit and Chicago.
Since Mr. Jones lives in Chicago, the brakeman is Jones by statement
(C). This result eliminates, by contradiction, alternatives 1, 2, 5, and 6
in the assignment of names to the three positions of brakeman, fireman,
and engineer. Since we already ruled out alternative 3, we are left with
only alternative 4, consistent with the given information. Thus, Smith
is the engineer (Jones is the brakeman and Robinson the fireman).
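The two-stage elimination above can be mirrored by a short exhaustive search. The following Python sketch is illustrative only: the variable names are invented here, and it adopts the same assumption as the text that exactly one passenger lives in each of the three locations. It tries every assignment of names to jobs together with every assignment of passengers to homes and discards any combination that contradicts statements (A) through (F).

```python
from itertools import permutations

NAMES = ("Smith", "Jones", "Robinson")
PLACES = ("Chicago", "Detroit", "Halfway")

solutions = []
for brakeman, fireman, engineer in permutations(NAMES):
    if fireman == "Smith":                  # (F) Smith beat the fireman
        continue
    for homes in permutations(PLACES):
        home = dict(zip(NAMES, homes))      # home["Jones"] is Mr. Jones's home
        if home["Robinson"] != "Detroit":   # (A)
            continue
        # (B) and (D): the brakeman's neighbor is the passenger living halfway.
        # (E): $2,000 is not three times a whole salary, so it is not Mr. Jones.
        if home["Jones"] == "Halfway":
            continue
        chicago_passenger = next(n for n in NAMES if home[n] == "Chicago")
        if chicago_passenger != brakeman:   # (C)
            continue
        solutions.append((brakeman, fireman, engineer))

print(solutions)
```

The single surviving assignment makes Jones the brakeman, Robinson the fireman, and Smith the engineer.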
It often facilitates work on recreational logic problems of this type
to set up various tables representing what goes with what. In the
present instance, there are two useful tables. First is a table such as
the following, involving the assignment of the names (Smith, Jones,
Robinson) to the positions (brakeman, fireman, engineer):

              Smith    Jones    Robinson
Brakeman
Fireman
Engineer

In addition, it is useful to set up a table assigning passengers' names to
home addresses, as follows:

                   Mr. Smith    Mr. Jones    Mr. Robinson
Chicago
Detroit
Halfway between

When you acquire a piece of information such as that Smith cannot
be a fireman, you enter no in the box of the table appropriate to Smith
being a fireman. When you know from given information that Mr.
Robinson lives in Detroit, you enter yes in that box of the relevant
table; you also enter no in every other box in the same row or column
of the table, since there can be only one yes in each row or column of
such logic tables. It is the restriction to only one yes in a row or column
that permits rather powerful use of this tabular representation:
whenever you have a yes in a row or column, you can fill in the rest of both
the row and the column with nos; whenever you have two nos in a row
or column, you can fill in a yes in the remaining position in that row or
column. Tabular representation permits us to draw inferences quite
mechanically from previous inferences that are recorded in the table,
avoiding complicated verbal reasoning and the possibility of memory
loss. The finished versions of these two tables for solution of this
Smith, Jones, Robinson problem are shown in Fig. 7-2.

              Smith    Jones    Robinson
Brakeman      No       Yes      No
Fireman       No       No       Yes
Engineer      Yes      No       No

              Mr. Smith    Mr. Jones    Mr. Robinson
Chicago       No           Yes          No
Detroit       No           No           Yes
Between       Yes          No           No

FIGURE 7-2
Final tables for solution of Smith, Jones, and Robinson problem.
A final problem of a completely different kind that illustrates the
usefulness of the method of contradiction is a spatial-puzzle problem
that I have called the bowling-pin reversal problem:
Six-year-old Heather Phillips set up the ten pins for her bowling game at
the end of the hall in a manner exactly opposite to the correct
configuration. Before Heather could throw the bowling ball down the hall, her
father informed her that she had set up the pins in the wrong manner
and that the pins should have the row of one pin in front, followed by the
row of two pins, followed by the row of three, and, finally, the row of
four in the back. Although Heather is given to childish reversal errors of
this type when she forgets to put on her thinking cap, she is actually a
budding mathematical genius. So, upon being informed of her error,
Heather quickly put her thinking cap back on, ran down the hall, and, by
moving just three pins, was able to reverse the configuration from the
given state to the goal state, as illustrated in Fig. 7-3. How did she do it?
(By the way, Heather assumed that the exact placement of the pins on the
floor was not important, so long as the relative placement of the pins with
respect to each other was correct. You should assume this also.)
Stop reading, put on your thinking cap (if you do not have it on
already), and try to solve the problem, using the method of contradiction
(not just random trial and error).

      Given                    Goal

        @                  @   @   @   @
      @   @                  @   @   @
    @   @   @                  @   @
  @   @   @   @                  @

FIGURE 7-3
Given and goal states for the bowling-pin reversal problem.

To apply the method of contradiction, we need to have a well-defined
set of possibilities. The smallest such set of possibilities is to ask where
the row of four pins will be in the goal state with respect to its position
in the given state. There are six logical possibilities for this: the row of
four pins in the goal state will be above the row with one pin in the given
state; the row with one pin in the given
state will become the row with four pins in the goal state; the row with
two pins in the given state will become the row with four pins in the
goal state; the row with three pins in the given state will become the
row with four pins in the goal state; the row with four pins in the given
state will remain the row with four pins in the goal state; or the row of
four pins in the goal state will be below the row with four pins in the
given state. If you have not solved the problem thus far, stop reading
and try again, using the method of contradiction to eliminate all but
one of the six possibilities.
Clearly, if the row of four pins were either above the row of one pin
or below the row of four pins in the given state, we would have to move
more than three pins in order to achieve this aspect of the goal state
alone. Thus, we have contradicted these two possibilities. To make the
row of one pin in the given state the row of four pins in the goal state
would require a minimum of three moves to achieve that subgoal alone,
plus the row of two pins would then have to become the row of three
pins, producing already more than three moves. This arrangement
cannot be the desired solution. If the row of four pins in the given state
is to remain the row of four pins in the goal state, all six pins above
it would have to be moved, contradicting the requirement of the
proposed solution. Finally, if the row of three pins were to become the
row of four pins in the goal state, all three pins above the row of three
would have to be removed, plus one of the pins in the row of four would
have to be moved, contradicting the restriction to only three moves.
This result leaves only the possibility of making the row of two pins
in the given state into the row of four pins in the goal state, producing
the solution in a fairly direct manner: the two extreme pins in the row
of four are moved to the two extreme positions in the row of two, and
the top pin in the given state is moved to the middle of the row, below
all the other pins, achieving the goal state.
Probably the most common way of solving this problem is not to
use the method of contradiction but rather to look for a subset of seven
pins in the given state in identical positions to seven pins in the goal
state. In general, if you did not know the minimum number of moves
to transform the given state into the goal state in a problem of this type,
you might look for the maximum subset of entities in the given state
that were in identical positions relative to one another to positions
in the goal state. Implementation of this method is largely a
perceptual method of scanning the given and goal states looking for matches
of (usually compact) subsets.
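The search for a maximum common subset can also be done by brute force rather than by eye. The sketch below is a hypothetical illustration, not a method given in the text: it places the pins on a grid in which neighboring pins in a row are two units apart (an assumed coordinate scheme), slides the given triangle over the goal triangle, and counts coinciding pins. Only sliding is considered, since only the relative placement of the pins matters.

```python
def triangle(point_up):
    """Ten pins as (x, row) pairs; pins in a row are 2 units apart (assumed)."""
    pins = set()
    for row in range(4):
        count = row + 1 if point_up else 4 - row
        for k in range(count):
            pins.add((2 * k - (count - 1), row))
    return pins

given = triangle(point_up=True)    # rows of 1, 2, 3, 4 pins from the top
goal = triangle(point_up=False)    # rows of 4, 3, 2, 1 pins from the top

def best_overlap(given, goal):
    """Largest number of pins that coincide under any sliding of `given`."""
    best = 0
    for dx in range(-8, 9):
        for dy in range(-4, 5):
            shifted = {(x + dx, y + dy) for x, y in given}
            best = max(best, len(shifted & goal))
    return best

max_fixed = best_overlap(given, goal)
min_moves = len(given) - max_fixed
print(max_fixed, "pins can stay put;", min_moves, "must be moved")
```

Under these assumptions the largest matching subset has seven pins, so three pins must be moved, which agrees with the solution found by contradiction.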
CLASSIFICATORY CONTRADICTION
LARGE SEARCH SPACE
In discussing the method of contradiction above, we were able to
conceptualize the problems so that there was a relatively small
population of alternative goals to decide among by the method. In the
problems here, however, the number of alternative specific goals is
so large that contradicting them one at a time would be impractical.
In such cases, we must use some efficient search strategy for
contradicting large subgroups of alternative goals at a time. To implement
this more efficient search, some explicit or implicit classification must
be imposed on the alternative goals, and classes of goals must be
contradicted on the basis of common properties possessed by all of them.
In addition, there is often some natural ordering for the contradiction
of different classes of goals such that it is easiest to rule out a
particular class in the beginning, some other class next, another class next,
and so on. The ruling out of earlier classes of goals provides the
additional information necessary for contradicting subsequent classes of
goals. Attempts to rule out classes of goals in orders other than the
natural or easiest ordering will usually be extremely difficult or
impossible.
The coin-weighing problems discussed in Chapters 3 and 5 are
examples of the method of classificatory contradiction. Recall that in
the simplest of these problems, you must determine which of n coins
is the heavy coin, using a beam balance. Weighing one group of coins
against another provides information that contradicts a large class of
possibilities with respect to which coin is the heavy coin. Whether you
consider these problems to exemplify the contradiction of one set
of alternatives or the implication of the complementary set of
alternatives is obviously completely arbitrary.
A class of problems somewhat similar to the coin-weighing problems
in the need for classificatory contradiction of alternative goals are the
concept-attainment problems, of which the following is one example:
You are given a set of six-place numbers (for example, 792,674, which
is to be read as 7 in place 1, 9 in place 2, and so on), some of which are
examples of the concept and some of which are not. Concepts are either
simple concepts of the form "concept is d in place p" (that is, a particular
digit d in a particular place p) or conjunctive concepts of the form
"concept is d1 in place p1 and . . . and di in place pi" (that is, a conjunction
of digits in particular places). If the concept were 9 in place 2 and 7 in
place 5, then 792,674 would be an example of the concept, because it
meets both necessary conditions. On the other hand, 722,674 would not
be an example of the concept, because it lacks one of the necessary
conditions: it does not have a 9 in place 2. Now, determine the conjunctive
concept that is implied by the following information concerning some
six-place numbers that are known to be examples and nonexamples of the
concept:
107,254 is an example of the concept.
157,254 is an example of the concept.
937,254 is an example of the concept.
867,184 is an example of the concept.
295,684 is not an example of the concept.
367,497 is not an example of the concept.
Stop reading and try to solve the problem, making classificatory use
of the method of contradiction.
From the first piece of information that 107,254 is an example of
the concept, we can determine that the concept will include some
combination of the following six restrictions: 1 in place 1, 0 in place 2,
7 in place 3, 2 in place 4, 5 in place 5, and 4 in place 6. Rather than
test all 63 different subsets of combinations of from one to six of these
restrictions, it is much more efficient to test each of the six restrictions
individually, namely, test whether a concept must involve the
restriction of 1 in place 1, and so on. This procedure in essence amounts to
testing the class of all concepts that involve the restriction 1 in place 1.
Stop reading and solve the problem, if you have not done so already.
Clearly, the set of concepts that involve the restriction 1 in place 1
is contradicted, because some of the examples do not have the 1 in
place 1. Proceeding in the same manner, we can rule out all concepts
except those that require 7 in place 3 or 4 in place 6. What information
tells you that both of these restrictions are necessary in order for a
six-place number to be an example of the concept? This information
comes from the two nonexamples of the concept, each of which
illustrates that either of the restrictions in isolation is not sufficient to make
a six-place number an example of the concept. Both are required.
Classificatory contradiction in concept-attainment problems is
equivalent to deriving some rules of inference as to which dimensions
or places of the examples of the concept are relevant to the concept
and which are irrelevant. For simple and conjunctive concepts of this
type, there are two very simple rules: (a) If two examples of the
concept differ in the values or digits they have on one or more dimensions
(places), all of these dimensions are irrelevant to the concept (that is,
not involved in the necessary conditions specified by the concept).
(b) If an example of the concept and a nonexample differ on one and
only one possibly relevant dimension, then that dimension is relevant
(and the value of that dimension in the example is the necessary value).
Having derived these rules of inference, we can now solve all simple
and conjunctive concept-attainment problems in a very straightforward
manner. Classificatory contradiction is essentially equivalent to this
inference method.
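Rules (a) and (b) are mechanical enough to be written out as a short program. The following Python sketch is one possible rendering, offered as an illustration rather than as the author's procedure; the variable names and the representation of each six-place number as a string are assumptions. It applies the rules to the six numbers of the problem above.

```python
examples = ["107254", "157254", "937254", "867184"]
nonexamples = ["295684", "367497"]

# Rule (a): any place on which two examples differ is irrelevant, so only
# places on which every example agrees remain possibly relevant.
candidate = {
    place: digit
    for place, digit in enumerate(examples[0], start=1)
    if all(e[place - 1] == digit for e in examples)
}

# Rule (b): a possibly relevant place is confirmed as relevant when some
# nonexample differs from the examples on that place and on no other
# possibly relevant place.
required = {
    p: d
    for p, d in candidate.items()
    if any(
        n[p - 1] != d
        and all(n[q - 1] == candidate[q] for q in candidate if q != p)
        for n in nonexamples
    )
}

def matches(number, concept):
    """True if `number` has the required digit in every required place."""
    return all(number[p - 1] == d for p, d in concept.items())

print(required)
```

For this problem both surviving restrictions are confirmed: 7 in place 3 and 4 in place 6.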
Letter-arithmetic problems, such as that below, nicely illustrate how
classificatory contradiction frequently can be combined with drawing
inferences to provide a solution. I got the following problem from
Bartlett (1958) and Simon and Newell (1971), who have studied how
people solve this problem:

  DONALD
+ GERALD
  ------
  ROBERT

This problem is to be treated as an exercise in simple addition. All that
is known is the following: (1) D = 5, (2) every number from 0 to 9 has its
corresponding letter, (3) each letter must be assigned a number different
from that given for any other letter. The goal is to find a number for each
letter, stating the steps of the process and their order.
Here you should use classificatory contradiction, which was
suggested as an optimal method for the concept-attainment problems. You
should test hypotheses concerning the values of each letter, which is,
in essence, the testing of classes of hypotheses about how all the
letters are assigned to different numbers. That is, in testing the hypothesis
that a particular letter equals 3, you are, in essence, testing the entire
set of possible solutions to the problem in which that letter equals 3 and
the other letters equal various other digits. If you did not solve the
problem, stop reading and try again.
By knowing that D = 5, we can infer that T = 0. Thus, from the
O + E = O column, we know that E must equal 9 (E = 0 is ruled out
because T = 0), there having been a carry of 1 from the previous
N + R = B column. Since there is a carry
from D + D = T to the next column, we know that L + L + 1 = R must
be an odd number. From the D + G = R column, we know that R is a
number greater than 5. Thus, R could only be the number 7 (9 is already
assigned to E), since we have ruled out every other possible hypothesis.
Now, since E = 9 and
A + A must be an even number, we know there had to be a carry from
the L + L = R column to the A + A = E column. Therefore, either A
could be 4, so that 4 + 4 + 1 = 9, or A could be 9, so that 9 + 9 + 1 = 19,
that is, 9 plus a carry. However, 9 is already used. Thus, we know that
A can only be 4. As Simon and Newell (1971) point out, you can
proceed to determine a unique number for each letter except N, B, and O.
For these numbers, you must actually try out the various combinations
of remaining digits, 6, 3, and 2, assigning them to the three letters
in each of the six possible ways and testing whether each assignment
is consistent with the information given in the problem. If it is not,
you must try a new assignment of the three remaining letters to the
three remaining numbers, until an assignment is found that works.
Clearly, this last stage is contradiction pure and simple. However,
in many of the preceding inferences you used the method of
contradiction: you determined which digit worked by ruling out all possible
alternative assignments of digits to that letter. If you had not done so,
you would not have known for sure that the digit that seemed to work
was the only digit that would work when assigned to that letter. Notice,
however, that an efficient solution to the problem requires
classificatory contradiction. In this problem, then, you must determine as nearly
as possible what number to assign to a given letter, independently of
testing hypotheses about the numbers to be assigned to other letters.
This procedure dictates that the values of different letters must be
specified in a certain order because only in certain orders is it possible
to determine a unique number to be assigned to each letter.
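For contrast with the ordered inferences above, the same problem can be attacked by contradiction "pure and simple": try every assignment of the nine remaining digits to the nine letters other than D and discard each one that fails the addition. The Python sketch below is an illustration under stated assumptions (the function names are invented here, and leading letters G and R are assumed not to be zero); it examines several hundred thousand assignments, which shows why classificatory contradiction is so much more efficient for a person working by hand.

```python
from itertools import permutations

def value(word, digits):
    """Numerical value of `word` under a letter-to-digit assignment."""
    return int("".join(str(digits[ch]) for ch in word))

def solve():
    letters = "ONALGERBT"                       # the nine letters other than D
    free_digits = [d for d in range(10) if d != 5]
    found = []
    for perm in permutations(free_digits):
        digits = dict(zip(letters, perm), D=5)  # D = 5 is given
        if digits["G"] == 0 or digits["R"] == 0:
            continue                            # assumed: no leading zeros
        total = value("DONALD", digits) + value("GERALD", digits)
        if total == value("ROBERT", digits):
            found.append(digits)
    return found

for s in solve():
    print(value("DONALD", s), "+", value("GERALD", s), "=", value("ROBERT", s))
```

The search recovers the assignment in which 526485 + 197485 = 723970, with T = 0, E = 9, R = 7, and A = 4 as inferred in the text.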
As illustrated in letter-arithmetic problems and concept-attainment
problems, classificatory contradiction is somewhat analogous to the
problem-solving method of defining subgoals in those problems
requiring the construction of a long sequence of actions in order to
achieve the goal. In contradiction problems, the difficulty is not in the
long sequence of operations but in the large class of possible
hypotheses. But either way, you have in essence a large set of alternatives to
search through. To the extent that you can reduce that search space
by considering large classes of alternatives at one time, it is
advantageous to do so.
Another problem that exemplifies the combined use of inferences
and contradiction to reduce a large number of hypotheses to a small
number is the integer-path-addition problem:
Put the digits 1, 2, . . . , 9 into a 3 × 3 matrix, one digit into each cell,
as shown in Fig. 7-4. Your assignment of digits to cells must satisfy two
conditions: (1) Row 1 plus row 2 must equal row 3 (considering each row
as a three-digit number). (2) The digit i must be located immediately next
to (above, below, to the right, or to the left) the digit i - 1, for i = 2, . . . , 9.
This second condition means you may place the digit 1 anywhere, but 2
must be placed next to 1 along a row or column (not diagonally), 3 must
be placed next to 2, and so on. This is what is meant by calling this
problem an integer-path problem.
You could begin by trying various hypotheses as to sequences of
filling the digits 1, 2, . . . , 9 in the nine cells, but this process is long,
slow, and chancy. A few inferences that can be made from the above
information allow one to greatly reduce the number of hypotheses that
must be tested. Probably the most important insight to gain at an early
stage in working on the problem is to notice that, of two cells adjacent
along a row or column, one must be filled with an odd number and the
other with an even number. This arrangement can easily be proved by
taking any two cells adjacent along a row or column and considering
all possible paths from one cell to the other, realizing that an odd
number of steps is required to reach the adjacent cell no matter what path
is taken within the matrix. However, it is really not essential to prove
the theorem that cells adjacent along a row or column must have one
even number and one odd number. A large number of trial-and-error
substitutions executed within a few minutes will show that the theorem
is almost certainly true. Stop reading and try to solve the problem, if
you could not do so before.

Row 1    [ ]  [ ]  [ ]
Row 2    [ ]  [ ]  [ ]
Row 3    [ ]  [ ]  [ ]

FIGURE 7-4
The 3 × 3 matrix for the integer-path-addition problem.

Given this theorem, you can now easily determine that the corner
cells and the center cell must be filled with odd numbers and the other
four cells with even numbers, since there are five corner plus center
cells and five odd numbers. This determination cuts the number of
alternatives per cell of the matrix approximately in half, greatly
reducing the search space. The restrictions might well be represented
in a figure such as Fig. 7-5.
If you have not yet solved the problem, stop reading and try again.

              No carry   Carry
Row 1    Odd      Even     Odd
Row 2    Even     Odd      Even
Row 3    Odd      Even     Odd

FIGURE 7-5
Restrictions on the digits that can fill cells in the
integer-path-addition problem. The restrictions come from considering that the
four corner cells and the center cell must have odd digits in them.

The astute problem solver might infer that in the right-hand column
an odd plus an even number is equal to an odd number, but in the
middle column an even number plus an odd number is equal to an even
number. This event can only happen if there was a carry of 1 from the
right-hand column to the middle column. Thus, we know that the sum
of the two upper digits in the right-hand column must be greater than
10. Furthermore, in the left-hand column, an odd number plus an even
number is equal to an odd number. Thus, there can be no carryover
from the middle column to the left-hand column. Hence, the sum of the
top two digits in the middle column, together with the carry of 1, is 9 or
less. Also, we know that the
sum of the top two digits in the left-hand column is 9 or less, since by
given information there can be no carry from the left-hand column. With all
these restrictions, it is a relatively simple matter to consider the
small number of hypotheses that are consistent with these restrictions.
Perhaps the easiest way to proceed is to focus on the middle column
and test all the hypotheses that are consistent with the information.
This procedure means actually testing only six possible assignments of
digits to the top two cells in the middle column, namely, 6-1, 4-3,
4-1, 2-5, 2-3, and 2-1. It is a relatively simple matter to check each
of these assignments to see if any path of integers consistent with the
assignment could solve the problem. It turns out that only the 2-3
assignment of digits to the top two cells in the middle column will
solve the problem. Furthermore, this assignment will solve the problem
in only one way, namely, the way shown in Fig. 7-6.

Row 1    1    2    9
Row 2    4    3    8
Row 3    5    6    7

FIGURE 7-6
The solution to the integer-path-addition problem.
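The "long, slow, and chancy" alternative of trying every path is tedious by hand but easy for a machine, and it gives an independent check on Fig. 7-6. The sketch below is an illustration only; its function names and the (row, column) cell representation are assumptions. It generates every path that visits all nine cells by horizontal and vertical steps, writes the digits 1 through 9 along the path, and keeps the fillings in which row 1 plus row 2 equals row 3.

```python
def neighbours(cell):
    """Cells next to `cell` along a row or column of the 3 x 3 matrix."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < 3 and 0 <= c + dc < 3:
            yield (r + dr, c + dc)

def paths(path):
    """Extend `path` in every possible way until it covers all nine cells."""
    if len(path) == 9:
        yield path
        return
    for nxt in neighbours(path[-1]):
        if nxt not in path:
            yield from paths(path + [nxt])

def row_value(grid, r):
    """Read row `r` of the filled matrix as a three-digit number."""
    return 100 * grid[(r, 0)] + 10 * grid[(r, 1)] + grid[(r, 2)]

solutions = []
for start in [(r, c) for r in range(3) for c in range(3)]:
    for path in paths([start]):
        grid = {cell: i + 1 for i, cell in enumerate(path)}  # digit i+1 on step i
        rows = tuple(row_value(grid, r) for r in range(3))
        if rows[0] + rows[1] == rows[2]:
            solutions.append(rows)

print(solutions)
```

The filling 129 + 438 = 567 of Fig. 7-6 appears among the results.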
Another example of classificatory contradiction combined with
inferences based on numerical properties is provided by the lonesome
eight problem, which was originated by Chessin (1954):
Determine all of the digits represented by X in the following long division
and also determine the remaining four digits of the five-digit answer of
which 8 is the third digit, as shown in Fig. 7-7.
Stop reading and try to solve the problem, using the method of
contradiction to draw inferences.

                  8
X X X ) X X X X X X X X
          X X X
            X X X X
              X X X
                X X X X
                X X X X

FIGURE 7-7
The lonesome-eight problem.
Since 8 times the divisor is a three-digit number, we know that the
divisor must be 124 or less, because 8(125 + z) = 1,000 + 8z, which
is a four-digit number (a contradiction) for all z ≥ 0. We can also
determine that the last digit of the quotient must be 9: the last digit
times the divisor is a four-digit number, whereas 8 (or any smaller
digit) times the divisor would equal at most a three-digit number
(a contradiction). The initial digit of the quotient must be
greater than 7, because 7 times any number less than or equal to 124
would leave a remainder that was greater than a two-digit number when
subtracted from the dividend (a contradiction). Now stop reading and
try to solve for the rest of the unknown digits, if you did not before.
Since the first digit of the quotient multiplied by the divisor equals a
three-digit number, we know it cannot be 9. Thus, the first digit of the
quotient must be 8. The second and fourth digits of the quotient must
be zero, because in both cases two digits from the dividend were
brought down in the work below. So we have the quotient, 80809.
Making use of the two places in the work below the dividend in the
long division where differences are 99 or less (two-digit numbers),
we can determine that 8 times the divisor must be a number between
990 and 999. (The difference left after subtracting 8 times the divisor,
followed by the last two digits brought down, equals 9 times the
divisor, which is at most 1,116; so that difference is at most 11, and it
is taken from a number of at least 1,000.) The only divisor that will
multiply by 8 to give a number between 990 and 999 is the divisor 124.
Numbers 123 and less are rejected by contradiction. Thus, we have the
quotient and the divisor, and from them we can determine the dividend
and all of the values of X in the work underneath the dividend (see
Fig. 7-8). Note that in solving this problem, we contradicted large
classes of hypotheses (solutions) in making each inference.

              8 0 8 0 9
1 2 4 ) 1 0 0 2 0 3 1 6
          9 9 2
            1 0 0 3
              9 9 2
                1 1 1 6
                1 1 1 6

FIGURE 7-8
The solution to the lonesome-eight problem.
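
The chain of contradictions above can be cross-checked by exhaustive search. The following is only a sketch: the function name and the way the layout of Fig. 7-7 is encoded (three-digit and four-digit products, zero second and fourth quotient digits, a four-digit partial dividend facing the 8) are assumptions made for illustration, not part of the original problem statement.

```python
def lonesome_eight_solutions():
    """Return every (divisor, quotient, dividend) matching the Fig. 7-7 layout."""
    solutions = []
    for divisor in range(100, 1000):              # three-digit divisor
        if 8 * divisor > 999:                     # 8 x divisor has three digits
            continue
        for first in range(1, 10):
            if first * divisor > 999:             # first product has three digits
                continue
            for last in range(1, 10):
                if last * divisor < 1000:         # last product has four digits
                    continue
                # Second and fourth quotient digits are 0, because two
                # digits are brought down at each of those places.
                quotient = first * 10000 + 800 + last
                dividend = divisor * quotient
                if not 10_000_000 <= dividend <= 99_999_999:   # eight digits
                    continue
                # Partial dividend facing the 8: the first six digits of the
                # dividend minus the first product shifted two places.
                second_partial = dividend // 100 - first * divisor * 100
                if not 1000 <= second_partial <= 9999:
                    continue
                solutions.append((divisor, quotient, dividend))
    return solutions


print(lonesome_eight_solutions())
```

Each `continue` plays the role of one contradiction in the text: it discards a whole class of hypotheses at once rather than testing complete solutions one by one.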
ITERATIVE CONTRADICTION IN
INFINITE SEARCH SPACES
Occasionally, the method of contradiction can be used in problems
that have (initially, at least) an infinite number of possible solutions.
Naturally, in such cases, it is necessary to rule out large or infinite
classes of alternatives by the method of contradiction. A particularly
simple example of the use of classificatory contradiction is afforded
by the following problem adapted from Polya (1957).

In numbering the pages of a book, a printer used 3,289 digits. How many
pages were in the book, assuming that the first page in the book was
numbered 1?

Stop reading and try to solve the problem, using the method of
classificatory contradiction to rule out all but one of the infinity of
positive integer answers.
Of course, there are not an infinite number of solutions to the
problem, once one draws a relatively trivial inference. The inference is
that the number of pages cannot possibly be greater than the number
of digits (3,289), since at least one digit has to be used to number each
page. Thus, we might well regard this problem as an example of
classificatory contradiction in a large, but finite, search space. However,
since an inference using the method of contradiction was necessary
in order to make the search space finite, it seems appropriate to
consider the problem as having an initially infinite search space.

The problem provides a particularly simple example of the use of
the iterative method of taking a preliminary estimate of the goal,
determining the magnitude of the error of the estimate from the goal,
then moving in the direction of the goal to obtain another estimate
along with a magnitude of error, and so on, hoping ultimately to
converge upon the goal. Since the preliminary estimates are contradicted
by the given information, I think it is useful to consider iterative
methods as a subclass of the method of contradiction, a subclass that
is particularly useful in solving problems with infinite (but ordered)
search spaces. Now stop reading, and try again to solve the problem,
if you could not before.
To make the iterative solution to this problem clear, imagine that
we start with a preliminary estimate of nine pages for the book. Each
page number is a single digit. Thus, nine digits in all would be used to
number the book. This number is clearly too small, so we move up to the
end of the two-digit numbers, namely, the number 99. Numbering 99 pages
requires 9 single-digit numbers plus 90 two-digit numbers, for a total of
9 + 180, or 189 digits. The number 189 is still substantially below the
number 3,289. Thus, for our next estimate we will take the end of the
three-digit numbers, namely, 999. Numbering 999 pages uses 9 single-digit
numbers plus 90 two-digit numbers plus 900 three-digit numbers,
for a total of 9 + 180 + 2,700, or 2,889 digits. This figure is quite close
to the target of 3,289 digits, so we are encouraged perhaps to try a
more direct analytic method at this point, namely, subtracting 2,889
from 3,289 to obtain an additional 400 digits that are needed to achieve
the goal. (However, note that one could continue to use the iterative
method.) Since we are now in the four-digit numbers, each page will
require four digits. Thus, to use 400 more digits will require 100
additional pages. Thus, we can infer that we must add 100 to 999 to
obtain an answer of 1,099 pages in the book, in order to use up 3,289
digits.
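
The estimate-and-correct procedure just described can be sketched in a few lines. The function names are hypothetical, and the sketch starts from an estimate of zero pages rather than nine so that very small digit counts are also handled; from there it passes through the same estimates of 9, 99, and 999 pages before finishing analytically.

```python
def digits_used(pages):
    """Total digits needed to number pages 1 through `pages`."""
    total = 0
    width = 1
    first = 1                          # smallest number having `width` digits
    while first <= pages:
        last = min(pages, first * 10 - 1)
        total += (last - first + 1) * width
        first *= 10
        width += 1
    return total


def pages_for_digits(target):
    """Raise the estimate through 9, 99, 999, ... and then finish analytically."""
    estimate = 0
    width = 0                          # digits per page number at the estimate
    # Move to the end of the next block of equal-length page numbers as long
    # as doing so does not overshoot the target digit count.
    while digits_used(estimate * 10 + 9) <= target:
        estimate = estimate * 10 + 9
        width += 1
    shortfall = target - digits_used(estimate)
    extra_pages, leftover = divmod(shortfall, width + 1)
    if leftover:
        raise ValueError("no whole number of pages uses exactly that many digits")
    return estimate + extra_pages


print(digits_used(9), digits_used(99), digits_used(999))
print(pages_for_digits(3289))
```

Every estimate whose digit count falls short of the target is contradicted by the given information, which is what licenses the move to the next estimate.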
Iterative methods are frequently used in the numerical solution for
roots of equations. For example, consider the following problem:

Determine the roots (permissible values of x) that satisfy the equation
x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30 = 0.

Stop reading and attempt to specify an iterative method of
contradiction, by which one might determine each root (permissible value of
x) for the preceding equation. You can assume that you have a
computer at your disposal to carry out the large number of steps that might
be required in order to converge upon each real solution for this
equation.
For large enough positive or negative values of x, the x^6 term in the
expression must dominate the rest of the expression (be greater than
the sum of all the other terms in the expression). Thus, for sufficiently
large positive or negative values of x, the expression
x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30 must be greater than zero and
monotonically increasing for more extreme positive or negative values of
x. Thus, there can be no real solutions for values of x more extreme than
these points at which the expression
x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30 begins from a positive value
monotonically to increase without limit. You can either determine these
points or else you can make a safe guess, based on the values of the
coefficients of the terms in the expression. In this case, you might
assume the function to be monotonically increasing above zero for values
of x greater than 1,000 or less than -1,000. Assuming that the function
is monotonically increasing from positive values beyond this range
(-1,000 ≤ x ≤ +1,000), we know that there can be no zero crossings
(roots) beyond the interval -1,000 ≤ x ≤ +1,000. Thus, by contradiction,
we have ruled out all values of x greater than 1,000 or less than -1,000.
In the present case, we have not done this in a careful way, but we
could. Now stop reading and try to solve the problem again, if you failed
to sketch an iterative solution method before.
To determine solutions for x within the interval from x = -1,000 to
x = +1,000, you may define a step size such that you think it unlikely
that there would be two different solutions within a single step. In
the present instance, we might choose a step size of unity, though if
we wish to be more careful we might pick steps of less than unity.
Now we can evaluate the function
x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30
at all integer values of x from -1,000 to +1,000 to see if the value of
the function changes sign (goes from minus to plus or plus to minus)
over the step. If the function does change sign, we know that there
is a solution (zero crossing) within the interval defined by that step.
We can then proceed to use essentially the same method (but more
efficiently dividing the remaining interval in half each time) to
determine the value of the solution to as fine a degree of approximation as
we wish. If the function does not change sign over a step, we assume
that there is no solution within that interval.
This numerical method for solving higher order equations is called
the half-interval search technique and is an excellent example of the
iterative use of the method of contradiction. The half-interval search
technique uses contradiction, because we consider all intervals over
which the function does not change sign not to contain a solution to
the equation, since a solution involves a zero crossing (passage from
a plus value of the function to a minus value of the function, or the
reverse). The absence of a change of sign over some interval contradicts
the possibility of a solution in that interval, provided we are justified
in assuming that the function cannot have two zero crossings over
that interval. If there were some reason to doubt the validity of this
assumption for the chosen step size, you could always choose a smaller
step size to see if more real roots would be discovered, using the
smaller step size. Naturally, if you already found n real roots for an
equation of degree n, then you know you have obtained all the roots that
it is possible to obtain, since there are at most n real roots for an
equation of degree n.
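
A minimal sketch of the half-interval search for the equation above follows. The function names, the tolerance, and the use of Horner's rule to evaluate the polynomial are assumptions made for illustration. The scan takes unit steps over the interval from -1,000 to +1,000, as in the text, and therefore inherits the same caveat: two roots inside one step would go unnoticed.

```python
# Coefficients of x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30, highest power first.
COEFFS = [1, -4, 2, 3, -7, 13, -30]


def f(x):
    """Evaluate the polynomial by Horner's rule."""
    value = 0
    for c in COEFFS:
        value = value * x + c
    return value


def half_interval(lo, hi, tol=1e-10):
    """Shrink [lo, hi], across which f changes sign, down to a root estimate."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid      # no sign change in the left half: rule it out
        else:
            hi = mid                   # no sign change in the right half: rule it out
    return (lo + hi) / 2


def real_roots(low=-1000, high=1000):
    """Scan unit steps for sign changes, then refine each one by halving."""
    roots = []
    for x in range(low, high):
        a, b = f(x), f(x + 1)
        if a == 0:
            roots.append(float(x))     # an integer that is exactly a root
        elif b != 0 and (a > 0) != (b > 0):
            roots.append(half_interval(x, x + 1))
    if f(high) == 0:
        roots.append(float(high))
    return roots


for root in real_roots():
    print(round(root, 6), f(root))
```

With a unit step, the scan finds sign changes only between -2 and -1 and between 3 and 4. Every other step is ruled out by the absence of a sign change, which is the contradiction the text describes.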
Obviously, the same iterative method can be used to solve other
types of equations involving logs, exponents, trig functions, and the
like. In addition, you can define multidimensional analogs of the above
iterative method to solve sets of several equations with several
unknowns. However, it clearly gets more and more time consuming and
more and more complicated, the greater the number of unknowns
or the more complex the functions.

Working Backward
THEORY
The method of working backward is similar to the method of
contradiction (Chapter 7) and the method of drawing inferences about the
goal (Chapter 3) in that all three focus on the goal to a great extent
and consider it rather than the givens as the starting point for the
problem-solving process. However, working backward differs in the
way the goal is considered in relation to the givens. With the methods
of contradiction and drawing inferences from the goal, the goal is
considered to be part of the given information, and we attempt to
derive consequences from the goal in conjunction with the givens.
Thus, the direction of inference is from the goal statement to some new
statements. In working backward, the goal is not considered to be a
piece of given information. We start with the goal, but instead of
drawing inferences from it, we try to guess a preceding statement or
statements that, taken together, would imply the goal statement. Hence, the
direction of inference is the same as in working forward, namely,
from the given information to the goal. We start at the end point and
try to determine preceding statements, which need not necessarily
be given statements but which, when taken together, will produce the
goal. Then we try to determine other statements that will imply those
statements, gradually working our way back. We hope to arrive at given
information that is sufficient to derive everything in between the
givens and the goal.
Why should we want to reverse direction like this, proceeding from
the goal to the givens rather than from the givens to the goal? When is
this method more appropriate than working forward, and why? That
is, which problems are appropriate for working backward and which
for working forward?
The method of working backward is likely to be useful if a problem
satisfies two criteria. The first is that the problem should have a
uniquely specified goal, as is the case for all proof problems.
Whenever there is a single, clearly and completely specified goal stated in
the problem, you should seriously consider the possibility of working
backward. This is particularly true if, in contrast to the single
goal statement, there are large numbers of given statements. Newell,
Shaw, and Simon (1962) have clearly stated the advantage to working
backward in such problems: namely, there is no ambiguity as to what
statement to start with when you work backward, whereas such
ambiguity is considerable when you work forward. As they so aptly put
it, working forward in such a problem is analogous to finding a needle
in a haystack, whereas working backward is analogous to the needle
finding its way out of the haystack. You can start from many places
outside the haystack in trying to find the single location of the needle.
By contrast, the needle starting in a single location can solve the
problem of getting out of the haystack by getting to any of an extremely
large number of alternative locations outside the haystack.
In the needle-in-the-haystack problem, the large number of givens
have a disjunctive relationship to one another. That is, to solve the
problem, you need to get from any one of these givens to the goal or,
by the method of working backward, from the goal to any one of the
large number of different givens. In many problems to which the
method of working backward is appropriate, such as proof problems,
the givens have a conjunctive relationship to one another. That is,
you must use several of the givens to derive the goal. Thus, in the
method of working backward, it will usually be necessary to work
backward from the goal to get to several of the givens rather than to
only one of the givens. Nevertheless, the method is frequently very
useful in such problems, because the unique starting point so
frequently directs you to just those aspects of the given information that
are relevant to the solution.
In problems where the goal is not so clearly and completely specified
and there are, in fact, a variety of possible alternative goals, the
advantages of working backward are often largely eliminated. In the case
of inference problems (problems with nondestructive operations), the
method of working backward is suitable for proof problems but not
generally for find problems. In the case of action problems (problems
with destructive operations), essentially the same distinction applies:
namely, problems with uniquely specified goals are appropriate for
the method of working backward, and problems in which only certain
characteristics of the goal are specified are generally not so suitable.
The second criterion of a problem as to whether the method of
working backward is applicable concerns the nature of the operations
specified in the problem. If all operations are unary and one-to-one
operations, the method of working backward is likely to be helpful.
Unary operations are operations that take one given input statement
and produce one output statement. One-to-one operations are
operations for which it is possible to uniquely determine what input
statements produce the output statement. (These concepts will be discussed
in more detail in Chapter 10.) Since a well-defined unary operation
yields a unique output statement for each input statement, there is
no ambiguity concerning the result of an operation when applied to
some state working forward. However, a well-defined unary operation
need not be one-to-one; several different input statements could
produce the same output statement. Thus, when operations are not
one-to-one, working backward may lead to a more rapidly branching tree
of possible action sequences than will working forward. In such cases,
it would generally be preferable to work forward.
The unary property of an operation is not as important as the
one-to-one property for the applicability of working backward. Binary or
ternary operations take two or three input statements and produce
one output statement. Using the method of working backward, it
would be necessary, given the output statement, to produce two or
three input statements. Superficially, this might seem similar to what
happens using the method of working backward when operations are
not one-to-one. However, there is an important difference: with binary
or ternary operations, the inference process is essentially as
complicated when working forward as when working backward. A
conjunction of inputs is related to a single output statement with binary or
ternary operations, and this is equally true working forward or
backward. From one problem to another, it may be more or less difficult
to work in the forward or the backward direction, but the existence
of binary or ternary operations does not necessarily invalidate
working backward, and examples of working backward in just such problems
will be given later in the chapter. In such problems, working backward
generally results in a set of subgoal statements, and then the subgoal
statements are frequently derived from the givens by working either
forward or backward.
By contrast, with operations that are not one-to-one, working
backward generates a multiplicity of alternative input statements that are
disjunctively related one to another (rather than conjunctively related).
Thus, you are generating a large set of alternative prior statements,
using the method of working backward, which would never be present
using the method of working forward, since only one of these
statements is necessary in order to derive the goal (not a conjunction of
several or all of them). Thus, the more critical property of an operation
that is beneficial for working backward is the one-to-one property,
while the unary property facilitates problem solving working either
forward or backward.
Another way to state the critical one-to-one property of operations
desirable for working backward is to say that the operations in the
problem should admit of the possibility of defining inverse operations.
Inverse operations are operations that go from the output statement
back to the input statement and reverse the effect of some given
operation. Whenever operations are one-to-one, it is possible to specify
well-defined inverse operations that will uniquely produce the input
statement from the output statement. Clearly, if the originally specified
operations were not one-to-one, a multiplicity of different input
statements would produce the same output statement, and you would have
no way of defining an inverse operation that uniquely specified a single
input statement that produced some output statement.
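
The difference between operations that are one-to-one and those that are not can be seen in a toy example. The operations chosen here (doubling and squaring over a small range of integers) and the function names are hypothetical illustrations, not examples taken from the text.

```python
def predecessors(output, operation, candidates):
    """All candidate inputs that the operation maps onto `output`."""
    return [x for x in candidates if operation(x) == output]


def double(x):
    """One-to-one: every output has a single input, so an inverse exists."""
    return 2 * x


def square(x):
    """Not one-to-one: x and -x give the same output."""
    return x * x


candidates = range(-10, 11)
print(predecessors(8, double, candidates))   # a single preceding statement
print(predecessors(9, square, candidates))   # two alternative preceding statements
```

Working backward through `double` follows a single chain, whereas each backward step through `square` branches, which is the multiplicity of disjunctively related inputs described above.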
If either action or inference problems specify a single goal and if
the operations specified in the problem are one-to-one (admit the
definition of inverse operations), working backward will often, though not
always, be preferable to working forward. However, where one or both
of these criteria are not satisfied by a problem, working backward will
likely be inferior to working forward. Hence, working backward is
by no means universally preferable to working forward. In fact, in
my experience, it is generally more difficult to work backward than to
work forward. Nevertheless, there is a large class of problems to which
the method of working backward is appropriate, and some examples
are given in the following section.
APPLICATIONS
Action Problems
The following clever little doubling-game problem illustrates the
usefulness of working backward:

Three people play a game in which one person loses and two people win
each game. The one who loses must double the amount of money that
each of the other two players has at that time. The three players agree
to play three games. At the end of the three games, each player has lost
one game and each person has $8. What was the original stake of each
player?

Offhand, it seems as if there is insufficient information to determine
the answer. However, because the players all finish with the same
amount of money, $8, it is possible to compute their original stake by
working backward. We will label the first losing player P1, the second
P2, and the third P3. Stop reading and try to solve the problem, if you
did not do so before.
At the end of game 3, P1, P2, and P3 each had $8. Working backward
to the end of game 2, P1 must have had $4 and P2 $4, since both won
in game 3 (P3 lost), and thus both had their stakes doubled by the
results of game 3. Since P1 and P2 each gained $4 in game 3, P3 must have
lost $8 in game 3, so P3 had $16 at the end of game 2. Now work
backward to determine the stakes of each player at the beginning of game 1,
if you did not solve the problem before.
The complete solution obtained by working backward is shown in
Fig. 8-1, where we observe that in the beginning P1 had $13, P2 had $7,
and P3 had $4.

                     P1     P2     P3
End of game 3        $8     $8     $8
End of game 2         4      4     16
End of game 1         2     14      8
Beginning            13      7      4

FIGURE 8-1
Working backward to solve a doubling-game problem.

Note that if the players did not all end with the same amount of
money, it would be impossible to determine what each player started
with, because the order in which the players won and lost would make
a difference. However, in the present instance, the order in which the
players won or lost games makes no difference to determining the
initial stake of each player.
Also, if you had names for the players, you could not tell which
player started with $13, which with $7, and which with $4, unless you
knew the order in which they won. Here I simply named the first losing
player P1, the second P2, and the third P3. This was completely
adequate, since the stated goal did not require pairing stakes with named
players.
This doubling-game problem is an extreme example of the
usefulness of working backward, since it is essentially impossible to solve
the problem, except by working backward. The reason is that there is
a uniquely defined goal, but no given state is specified at all. In fact,
the problem is to derive a given state that, in conjunction with the
operations, will produce the goal state. Although the operations are stated
in a forward direction, they easily admit the definition of unique
inverse operations (in which two people have their stakes cut in half and
the other person has his stake increased by the sum of the amounts
the others were decreased). Thus, it is clear we must use the method
of working backward to solve this problem. Of course, it is equally
correct to say that what we have done is to transform the operations
into inverse operations and reverse the goal and the givens, taking the
goal as the givens and attempting to derive the given state from the
goal. Obviously, it makes little difference which way we describe what
was done in this problem, since what was done is precisely the same
under either description.
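
The inverse operation described above can be written out directly. This is a minimal sketch: the function names are hypothetical, players are indexed 0, 1, 2 for P1, P2, P3, and the stakes are assumed to be whole dollars.

```python
def play_game(stakes, loser):
    """Forward operation: the loser doubles each winner's money."""
    after = list(stakes)
    for player, amount in enumerate(stakes):
        if player != loser:
            after[player] = 2 * amount
            after[loser] -= amount
    return after


def undo_game(stakes, loser):
    """Inverse operation: halve each winner's money and refund the loser."""
    before = list(stakes)
    for player, amount in enumerate(stakes):
        if player != loser:
            if amount % 2:
                raise ValueError("a winner cannot hold an odd amount after doubling")
            before[player] = amount // 2
            before[loser] += amount // 2
    return before


def original_stakes(final, losers_in_order):
    """Work backward from the final stakes through the games in reverse order."""
    stakes = list(final)
    for loser in reversed(losers_in_order):
        stakes = undo_game(stakes, loser)
    return stakes


# P1 loses game 1, P2 loses game 2, P3 loses game 3; everyone ends with $8.
print(original_stakes([8, 8, 8], [0, 1, 2]))
```

Because `undo_game` is a well-defined inverse of `play_game`, each backward step yields exactly one preceding state, reproducing the rows of Fig. 8-1 from the bottom of the game sequence upward.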
Nim games are games in which each player takes away tokens
subject to a variety of restrictions and tries to be the last (or not the
last) to take a token. Nim games provide excellent examples of the
usefulness of working backward to determine optimal strategy. One
example is the following:

Fifteen pennies are placed on a table in front of two players. Each
player is allowed to remove at least one penny but not more than five
pennies at his turn. The players alternate turns, each removing from one
to five pennies, for a number of turns, until one player takes the last
penny on the table and wins all 15 pennies. Is there a method of play
that will guarantee victory? If so, what is it?

Stop reading and try to determine the optimal strategy by working
backward.
If you conjecture yourself to be in the goal state, this state would
clearly consist of being the player whose turn it is to move, and there
being anywhere from one to five pennies on the table. In this state,
you could take all of the pennies left on the table and be the winner.
Now, can you work backward from this set of possible goal states to
conjecture a preceding state for the other player in which, no matter
what that player does, you will be in one of these desirable goal states?
Stop reading and try to solve the problem, if you did not before.

It is clear that if you confronted the opposing player with six pennies
on the preceding turn, no matter how many pennies he or she took
(from one to five), there would still be from one to five pennies on the
table when it was your turn, giving you the victory. Thus, you should
try to confront your opponent with six pennies after your move. But
you cannot do this on your first move, so you must work backward
again and ask what previous position you would have to put your
opponent in so that, no matter what he did, you could have six pennies
on the table after your move. Now stop reading and try to solve the
rest of the problem, if you did not before.
Some thought reveals that if you could confront the opposing player
with 12 pennies after your preceding move, then no matter how many
pennies he took (from 1 to 5), you would be able to take enough pennies
to confront him with 6 pennies on the next turn. Thus, you want to
confront your opponent with 12 pennies, and you can do that on your
first move by removing 3 pennies from the board.
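
The backward reasoning can be mechanized by starting from the empty table and building up the pile sizes that are fatal to whoever must move from them. This is a sketch under the rules stated in the problem; the function names are hypothetical.

```python
def losing_positions(total, max_take):
    """Pile sizes at which the player about to move loses (last penny wins)."""
    # Facing an empty table means the opponent has just taken the last penny.
    losing = [0]
    for pile in range(1, total + 1):
        # A pile is losing when no legal move leaves the opponent a losing pile.
        if not any(pile - take in losing for take in range(1, max_take + 1)):
            losing.append(pile)
    return losing


def best_move(pile, max_take=5):
    """Number of pennies to take so the opponent faces a losing pile, if any."""
    losing = losing_positions(pile, max_take)
    for take in range(1, min(max_take, pile) + 1):
        if pile - take in losing:
            return take
    return None                        # every move leaves the opponent winning


print(losing_positions(15, 5))
print(best_move(15))
```

The list of losing piles is generated outward from the goal, first 6 and then 12, in the same order as the two backward steps in the text.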
Note that in this nim problem, there was no uniquely defined goal
state, since the goal is to take the last one, two, three, four, or five
pennies on the table, and we cannot know which of these moves would
constitute the goal in any nim game we might win. Of course, you could
easily transform this into the unique goal of facing your opponent with
a table having zero pennies on it on the other player's next move.
Obviously, it makes no difference which way you look at this nim problem.
The point is that working backward will frequently generate many
possible preceding states, and this fact does not necessarily invalidate
the method of working backward. In the present instance, working
backward two steps yields a unique preceding number of pennies that
you should confront your opponent with on the turn prior to your
opponent's last turn, namely, six pennies. Thus, considering only
your own sequence of moves (rather than all the different moves that
might be made by your opponent), we see that the method of working
backward in the present problem proceeds to get one preceding state
from one succeeding state. Thus, working in the backward direction,
the tree of possible states is continually being pinched back to a single
state at every alternate move. The lesson in this problem is that you
should not be too easily discouraged from working backward by a
multiplicity of preceding states, if this multiplicity is only a temporary
phenomenon or a one-time sprouting of branches of the tree followed
by no further increase. We cannot expect the method of working
backward to always produce a single one-to-one chain of states back to
the givens from the goal with no alternatives to investigate.
Working backward was preferable to working forward in the
preceding problem because the number of different action sequences that had
to be considered working backward was considerably smaller than
the number that had to be considered working forward. You could,
at every alternate move, determine a unique state in which you should
be. This is the most general criterion for the applicability of working
backward, namely, that it produce a smaller space of alternative action
sequences than would be produced by working forward. Sometimes
working backward is preferable to working forward because it
produces a smaller set of action sequences to consider when combined
with a hill-climbing approach. An example is provided by the following
checker-rearrangement problem:

On an infinitely extended checkerboard, one is given three black checkers
and two white checkers initially placed in immediately adjacent squares
on a single row, proceeding from left to right, as shown in Fig. 8-2: black
(B), white (W), black, white, black. The problem is to transform this
arrangement of alternating black and white checkers into an
arrangement in which all three black checkers are on the left and both white
checkers are on the right (BBBWW), with all checkers being in adjacent
squares and in the same row (see Fig. 8-2). The allowable operation is to
move two adjacent checkers at a time, one of which must be a black
checker and one a white checker. During a move, the two checkers being
moved must remain together at all times, with no reversal of their
left-to-right order. You are permitted to move a white-black or black-white
pair of checkers to any adjacent pair of unoccupied squares along the
same line. Note that there is no need to keep the checkers that are not
being moved in immediately adjacent squares at any time. That is, there
may be unoccupied squares between checkers at various stages between
the givens and the goal. Also note that the five checkers in the goal state
need not occupy the same five squares on the checkerboard as they did
in the given state. They may occupy any immediately adjacent five squares
in the same row.

Stop reading and try to solve the problem by defining an evaluation
function and then using the method of working backward in
conjunction with a hill-climbing approach on this evaluation function.

Given: B W B W B          Goal: B B B W W

FIGURE 8-2
Given and goal states for the checker-rearrangement problem.

This problem is frustrating to solve by working forward because
there are many possible moves at each point, and it is not at all clear
how to hill-climb in order to get to the goal. Nor is it clear what
subgoals you ought to set on the way to the goal statement. By contrast,
working in the backward direction, there is only one pair of checkers
that can be moved initially, namely, the third and fourth checkers in
the row. After the first move, the number of possibilities at each move
is also more limited than would happen if you moved in the forward
direction. The solution is relatively easy to obtain working backward
because of this much greater restriction in the number of possible initial
moves. Now stop reading and try to solve the problem, if you did not
do so already.
In addition to working backward, it is helpful to define as an
evaluation function the number of immediately adjacent black-white and
white-black pairs of checkers. This evaluation function has a value of
1 in the goal state and a value of 4 in the given state. Choosing
actions that increase this evaluation function is of some assistance in
narrowing the space of possible moves in working backward from the goal
state to the given state. Of course, it is also of some help in working
forward. However, the problem is considerably easier working backward,
because of the greater restriction of initial moves from the goal
state, as well as for psychological reasons peculiar to this problem.
An optimal solution to the problem, along with the values of the
evaluation function for each state, is shown in Fig. 8-3.
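The evaluation function just described can be sketched in a few lines of Python. This is only an illustration: the two rows below are hypothetical strings chosen to match the values quoted above (1 and 4), not the rows of the figure, and the sketch assumes a row with no empty positions between checkers.

```python
def transitions(row):
    """Count adjacent black-white or white-black pairs in a row of checkers."""
    return sum(1 for left, right in zip(row, row[1:]) if left != right)

# Hypothetical rows (assumptions for illustration, not taken from the figure).
goal_like = "BBBWW"    # colors grouped together
given_like = "BWBWB"   # colors alternating

print(transitions(goal_like), transitions(given_like))
```

Working backward from the goal, a move that raises this count brings the row closer to the alternating given state.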
Because of the greater restriction of initial moves starting from the
goal and working backward than starting from the givens and working
forward, working backward was clearly indicated in this
checker-rearrangement problem. However, even if there is no reason to
prefer the method of working backward in a problem, you should always
consider its use whenever there is no reason to prefer working forward.
That is, there are many problems in which it may not be obvious a priori
which method, working backward or working forward, is superior; in
such cases, you might well try working forward and, if it did not seem
to be working out well, then try working backward.
FIGURE 8-3
Working backward to solve the checker-rearrangement problem. [Figure:
the sequence of states from goal to given, each labeled with its
evaluation function value (number of BW or WB transitions).]
As a final example of the method of working backward, let us
consider a water jar problem:
Given a jar that will hold exactly 7 quarts of water, a jar that will hold
exactly 3 quarts of water, no other containers holding water, but an
infinite supply of water, describe a sequence of fillings and emptyings of
water jars that will result in achieving 5 quarts of water.
Stop reading and try to solve the problem, working backward.
Obviously, at the goal state we will have 5 quarts of water in the
7-quart jar. There are several ways this might be achieved from a
preceding state, working backward. First, we might have 2 quarts of water
in the 7-quart jar and pour in 3 quarts from the 3-quart jar. Second, we
might have 3 quarts of water in the 7-quart jar and pour in 2 quarts from
the 3-quart jar (this seems less likely than the first alternative). Third,
we might have 4 quarts of water in the 7-quart jar and pour in 1 quart
from the 3-quart jar. Fourth, we might have 7 quarts in the 7-quart
jar, 1 quart in the 3-quart jar, and pour off 2 quarts into
the 3-quart jar. Fifth, we might have 6 quarts in the 7-quart jar, 2
quarts in the 3-quart jar, and pour off 1 quart into the 3-quart jar.
Now stop reading, and try to solve the problem, if you have not done
so already.
Of the five alternatives for the state preceding the goal, the first
and the fourth are the most plausible, since they involve quantities
of water in one or the other jar that are easy to achieve, namely,
7 quarts in the 7-quart jar and 3 quarts in the 3-quart jar. Thus, we
might well confine our attention to these two possibilities, at least
initially. Now stop reading and try to solve the problem, if you have
not done so already.
Although it is possible to achieve 2 quarts in the 7-quart jar as a
subgoal and fill up the 3-quart jar as specified in the first alternative,
the fourth alternative is actually optimal. Working backward from the
fourth alternative (7 quarts in the 7-quart jar and 1 quart in the 3-quart
jar), we set as the subgoal the achievement of 1 quart in either jar.
It is probably not particularly useful to continue working backward
any longer after having defined these five alternative preceding states.
Rather, in attempting to achieve any one of these states, such as 7
quarts in the 7-quart jar and 1 quart in the 3-quart jar, it is probably
most useful to set this as a subgoal and work forward from the given
information in order to achieve it. In the present instance, it is quite
easy to achieve the state of having 7 quarts in the 7-quart jar and 1
quart in the 3-quart jar. To achieve 1 quart in the 3-quart jar, fill the
7-quart jar, pour off 3 quarts into the 3-quart jar two successive times
(emptying the 3-quart jar after each pour) to achieve 1 quart
in the 7-quart jar, then transfer this 1 quart to the 3-quart jar. Now fill
up the 7-quart jar, and the subgoal is achieved. After this, of course,
it is simple to pour off 2 quarts from the 7-quart jar into the 3-quart
jar, which already contains 1 quart. This leaves 5 quarts in the 7-quart
jar, which is the goal.
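The sequence of fillings, pourings, and emptyings described above can be checked by simulating it. The following is a minimal sketch; the jar names "big" and "small" and the helper functions are assumptions introduced only for this illustration.

```python
CAPACITY = {"big": 7, "small": 3}  # quarts; jar names are hypothetical labels

def fill(state, jar):
    """Fill a jar to the top from the unlimited water supply."""
    return {**state, jar: CAPACITY[jar]}

def empty(state, jar):
    """Pour a jar's contents away."""
    return {**state, jar: 0}

def pour(state, src, dst):
    """Pour from src into dst until src is empty or dst is full."""
    amount = min(state[src], CAPACITY[dst] - state[dst])
    return {**state, src: state[src] - amount, dst: state[dst] + amount}

def run_solution():
    """Replay the steps described in the text and return every state visited."""
    steps = [
        (fill, "big"),
        (pour, "big", "small"), (empty, "small"),   # 4 quarts left in big jar
        (pour, "big", "small"), (empty, "small"),   # 1 quart left in big jar
        (pour, "big", "small"),                     # move that 1 quart over
        (fill, "big"),                              # subgoal: 7 and 1
        (pour, "big", "small"),                     # pours off 2 quarts
    ]
    state = {"big": 0, "small": 0}
    history = [state]
    for action, *args in steps:
        state = action(state, *args)
        history.append(state)
    return history

for state in run_solution():
    print(state)
```

The next-to-last state is the subgoal found by working backward (7 quarts and 1 quart); the last state has 5 quarts in the 7-quart jar.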
This problem nicely illustrates how working backward can permit
you to define a subgoal, which you can then achieve by working
forward. This pattern is typical of working backward in both action
problems such as this one and the inference problems to be discussed next.
Inference Problems
As a simple example of the method of working backward in inference
problems, consider the following proof problem:
If A > 0 and B > 0, then A^2 - AB + B^2 > 0. The theorem is actually true
for all A and B, but it needlessly complicates the example of the method
of working backward to consider the more general case. Thus, restrict
the proof to the case where A > 0 and B > 0.
Stop reading and try to solve the problem by working backward.
To apply the method of working backward, we first state the
conclusion A^2 - AB + B^2 > 0. A preceding statement that would imply that
conclusion can be obtained by factoring the expression A^2 - AB + B^2
into A(A - B) + B^2 > 0. If we could show this expression to be true,
then it would imply the desired conclusion. Stop reading and try to
solve the problem, if you did not do so before.
By working backward we note that we could derive the expression
A(A - B) + B^2 > 0 from three previous expressions: first, A > 0,
which is given information; second, B^2 > 0, which is true for all nonzero
real numbers, including B; and, third, A - B > 0. We cannot derive A - B > 0
from the given information, but we could just assume it as one case.
Obviously, in some cases where both A > 0 and B > 0, A will be greater
than B. So in the case where A > B, then A - B > 0, and the theorem
is proved. Stop reading and try to solve the problem, if you did not
before.
Now, work backward from the goal statement A^2 - AB + B^2 > 0 to
try to derive the goal expression in the case where B > A. In this case,
we factor the goal expression into A^2 + B(B - A) > 0, which will be
true if, first, B > 0 (given information); second, A^2 > 0 (true for all
nonzero real A); and, third, B - A > 0. Now, B - A > 0 follows from B > A.
Thus, we have established that the conclusion follows where A > B
or B > A. In addition, we have to show that the conclusion follows
when A = B, but this matter is trivial, since the expression then
reduces to A^2. This problem is so short and
relatively trivial that many people may not notice how much they are
working backward in solving it. However, a careful examination
reveals that the critical insights come from focusing on the goal statement
and noticing what it can be factored into.
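The three cases of the argument can be collected in one display. This is only a compact restatement of the steps above, under the stated restriction A > 0 and B > 0:

```latex
\begin{align*}
A > B &: \quad A^2 - AB + B^2 = A(A - B) + B^2 > 0, \\
B > A &: \quad A^2 - AB + B^2 = A^2 + B(B - A) > 0, \\
A = B &: \quad A^2 - AB + B^2 = A^2 > 0.
\end{align*}
```

In each line the factored form is the subgoal reached by working backward one step from the goal statement.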
Frequently we use the method of working backward for only a few
steps in order to derive some more congenial formulation of the
statement to be proved, then we proceed to work in a forward direction in
trying to derive this more congenial formulation. Working backward
may simply result in the substitution of a single subgoal (directly
related to the goal) for the original goal, or it may result in the
substitution of two or more subgoals in place of the original goal.
An extremely simple proof of the Pythagorean Theorem can be
obtained by initially working backward from the algebraic goal
statement to obtain a single, more geometric subgoal.
As you may remember, the Pythagorean Theorem states that, for any
right triangle, c^2 = a^2 + b^2, where c is the length of the hypotenuse. Prove
this theorem, where the givens are, first, the axioms of Euclidean
geometry; second, the definition of the area of a rectangular figure (length
times width); and, third, the assumption that the areas of nonoverlapping
figures are additive.
Stop reading and try to prove the theorem by working backward to
obtain a single subgoal.
FIGURE 8-4
Squares erected on the sides of a right triangle
by working backward one step from the algebraic
formulation of the Pythagorean Theorem.
Instead of trying to use the givens in an attempt to derive the goal
expression, it is far simpler to look at the goal expression and note
that it is asserting that the area of a square with side c erected on the
hypotenuse is equal to the sum of the areas of the squares with sides
a and b, respectively, erected on the other sides of the right triangle.
This situation is shown graphically in Fig. 8-4.
Thus, by working backward from the goal expression c^2 = a^2 + b^2,
we have obtained a subgoal that might prove more tractable than the
original goal, namely, to show that the area of the large square with
side c is equal to the sum of the areas of the smaller squares with
sides a and b. Now stop reading and try to prove the subgoal, if you
have not done so already.
To prove that the subgoal statement is true, you need to get
expressions for the areas of the three squares that are in the same terms, so
that you can determine whether the sum of the two smaller areas equals
the largest area. Since the original triangle is the basis for any relation
among the areas of these three squares, it seems natural to try to
express the area of each square in terms of the area of the original triangle,
T. Now stop reading and try to formulate this expression, if you have
not done so already.
It is quite straightforward to reflect the triangle, T, onto the squares
erected on the nonhypotenuse sides, a and b. Assuming that a > b, we
can lay out two triangles on the square with side a and have a rectangle
with area a(a - b) left over in the square. In the case of the smaller
square with side b, two T triangles will use up a rectangle that has an
area greater than the area of the square by an amount equal to a
rectangle with area b(a - b). All this is shown in Fig. 8-5. Thus, we can
replace the terms a^2 + b^2 by 4T + a(a - b) - b(a - b) = 4T + (a - b)^2.
Figuring out how to lay out T triangles inside the largest square
(with side c) is more of a challenge. However, with the idea of
reflecting the original triangle about the side it shares with the various
squares erected on its sides, we should eventually wind up laying out four
T triangles inside the square with side c, as shown in Fig. 8-5. To lay out
the four T triangles within the largest square, the most critical
property of the original triangle to note is that its two acute angles sum
to 90 degrees. Taking four T triangles out of the square with side c leaves
a square with side (a - b) inside the four triangles. Then the area of the
large square is c^2 = 4T + (a - b)^2, exactly what was obtained for the sum
of the areas of the other two squares. Thus, the area of the square erected
on side c is equal to the sum of the areas of the squares erected on sides
a and b, and the Pythagorean Theorem is proved.
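The area bookkeeping in this proof can be checked numerically. The sketch below assumes that the triangle's area is T = ab/2 (half of an a-by-b rectangle, which follows from the givens) and uses a 3-4-5 triangle as an arbitrary example; the function name is hypothetical.

```python
from fractions import Fraction

def square_areas_via_triangles(a, b):
    """Express the three squares' areas using T, a, and b (assumes a > b)."""
    T = Fraction(a * b, 2)             # area of the right triangle (assumption: T = ab/2)
    square_a = 2 * T + a * (a - b)     # two triangles plus the leftover rectangle
    square_b = 2 * T - b * (a - b)     # two triangles minus the overhanging rectangle
    square_c = 4 * T + (a - b) ** 2    # four triangles plus the inner square
    return square_a, square_b, square_c

sa, sb, sc = square_areas_via_triangles(4, 3)
print(sa, sb, sc)  # 16 9 25
```

For legs 4 and 3 the three expressions give 16, 9, and 25, so the two smaller squares together match the square on the hypotenuse of length 5.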
An example of working backward to find several subgoals is
provided by the following proof problem:
You are given the following four assumptions: (1) Multiplication is
commutative; that is, AB = BA. (2) Equals added to equals are equal; that is,
if A = A' and B = B', then A + B = A' + B'. (3) The left distributive law
applies; that is, C(A + B) = CA + CB. (4) The transitive law applies;
that is, if A = B and B = C, then A = C. From these four givens, prove the
right distributive law, that is, (A + B)C = AC + BC.
Stop reading and try to solve the problem, working backward to
derive subgoals.
To prove this theorem, a good beginning point would be to start at
the goal statement (A + B)C = AC + BC, and write down one or more
preceding statements from which you could derive the goal statement
as a conclusion. One pair of preceding statements from which you
could derive the goal statement is the following: (A + B)C = X and
X = AC + BC. From these two preceding statements you could derive
the goal statement, using the transitive law. Thus, we have subdivided
the goal into two subgoals that are, however, somewhat dependent one
upon another in that we must try to transform each of the expressions
that are considered to be equal in the theorem into expressions that are
identical (indicated by X). Stop reading and try to solve the problem.
FIGURE 8-5
Expressing the areas of all three squares in terms
of T (the area of the original triangle), a, and b,
so that c terms are eliminated from the expression
for the area of the large square.
In this problem, just applying legitimate operations to the
expressions (A + B)C and AC + BC will easily result in deriving expressions
that are identical in each case. So (A + B)C = C(A + B), by the
commutative law for multiplication. C(A + B) = CA + CB, by the left
distributive law. Therefore, (A + B)C = CA + CB, by the transitive
law of equality. Now CA = AC and CB = BC, by the commutative
law for multiplication. Therefore, CA + CB = AC + BC, because equals
added to equals are equal. Thus, (A + B)C = AC + BC, by the
transitive law for equality, and the theorem is proved.
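For readers who like to see which givens are actually used, the same chain can be written as a short formal proof. The following Lean 4 sketch is an illustration only: `add`, `mul`, `comm`, and `ldist` are hypothetical names for the operations and for givens (1) and (3), and givens (2) and (4) do not appear as hypotheses because rewriting inside an equation relies on them implicitly.

```lean
-- A sketch: right distributivity from commutativity and left distributivity.
-- `add` and `mul` are arbitrary binary operations on a type α (assumed names).
example {α : Type} (add mul : α → α → α)
    (comm : ∀ a b, mul a b = mul b a)
    (ldist : ∀ c a b, mul c (add a b) = add (mul c a) (mul c b))
    (A B C : α) :
    mul (add A B) C = add (mul A C) (mul B C) := by
  -- (A + B)C = C(A + B) = CA + CB = AC + BC
  rw [comm (add A B) C, ldist, comm C A, comm C B]
```

Each rewrite corresponds to one sentence of the proof above: commutativity, the left distributive law, and commutativity applied to each of the two products.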

Relations Between Problems
When mankind has a satisfactory theory of problems, it will be
possible to state many deep and detailed relations between different types
of problems. But even without such a theory, we can still state certain
basic types of relations between different problems.
In particular, five fundamental types of relations can obtain between
two problems, a and b: First, problem a is unrelated to problem b
(problem a and problem b have no common elements). Second,
problem a is equivalent to problem b (a and b have the same problem
elements, a and b are completely analogous, a and b are isomorphic).
Third, problem a is similar to problem b (problems a and b have some
common elements, problems a and b are partially analogous). Fourth,
problem a is a special case of problem b (problem a is included in
problem b). Fifth, problem a is a generalization of problem b (problem
b is included in problem a). When problems a and b are similar, they
may be of approximately equivalent difficulty, b may be simpler than
a, or b may be more complex than a.
EQUIVALENT PROBLEMS
In determining whether any of these five relations holds between two
problems, it is important to note that the critical problem elements
concern the types of operations and the relations that can obtain
between different expressions or things, not the specific expressions or
things themselves. For example, in the checker-rearrangement
problem described in Chapter 8, it would make no difference in the elements
of that problem, from a problem-solving viewpoint, if the black
checkers were changed to quarters and the white checkers were
changed to pennies, provided that all the same restrictions and
operations still applied. Alternatively, we could replace the black checkers
by red poker chips and the white checkers by blue poker chips, and,
if everything else remained the same, the problem would be equivalent
to the original problem. Similarly, in the nim problem of Chapter 8,
in which from one to five pennies were removed by a player on each
turn, the pennies could be replaced by any token such as marbles,
poker chips, buttons, or stones. In the Tower of Hanoi problem in
Chapter 6, the disks of decreasing size could be replaced by any set
of tokens having a simple order relation among them.
Problems that differ only with respect to the names attached to
different elements of the problem, but all of whose relations and
operations are identical, are considered equivalent, meaning they are
completely analogous or isomorphic. Recognizing that two problems
are equivalent may sometimes involve realizing that many of the
implied properties indicated by the different names attached to
corresponding elements in the two problems are completely irrelevant to
the solution of the problem. However, it is usually relatively trivial
to recognize such irrelevancies of the differing properties of
corresponding elements, and consequently the recognition of equivalent
problems is frequently trivial.
SIMILAR PROBLEMS
Two problems can be extremely similar and yet not be equivalent in
every respect. For example, in the Tower of Hanoi problem, you may
start with 5, 6, 7, or 10 disks on a single spike; the number of disks
you begin with is in no way critical to the method of solving the
problem as outlined in Chapter 6. You can, in fact, state a solution to the
general Tower of Hanoi problem in which n disks must be transferred
from one spike to another. Thus, any particular Tower of Hanoi
problem (a problem involving the transfer of some particular number of
disks from one spike to another) is a special case of the general Tower
of Hanoi problem. Any two special cases of the general Tower of
Hanoi problem are similar problems, though I would hesitate to call
them equivalent problems.
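One way to state a solution to the general n-disk problem is the familiar recursive procedure sketched below. It is offered as an illustration of how a single method covers every special case, not as a transcription of Chapter 6; the spike labels "A", "B", and "C" are hypothetical, and three spikes are assumed.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the moves (disk, from_spike, to_spike) that transfer n disks.

    Disk 1 is the smallest; a larger disk is never placed on a smaller one.
    """
    if n == 0:
        return []
    return (
        hanoi(n - 1, source, spare, target)    # clear the n - 1 smaller disks away
        + [(n, source, target)]                # move the largest disk
        + hanoi(n - 1, spare, target, source)  # restack the smaller disks on top
    )

for disks in (5, 6, 7, 10):
    print(disks, "disks:", len(hanoi(disks)), "moves")
```

Only the argument n changes from one special case to the next; the procedure itself is the same, which is the sense in which the problems are similar.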
Similarly, in the nim problem, involving one to five pennies that can
be removed at each turn, you can construct a large variety of problems
that differ in the maximum number of pennies that may be removed on
each turn or the number of pennies originally placed on the table.
Each of these problems can be solved by essentially the same
problem-solving method, though the specific number of pennies that the player
should take on each turn will differ from problem to problem. Again, all
of these problems are extremely similar, but not completely equivalent.
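The way one method covers the whole family of nim problems can be sketched as follows. The winning rule is not restated in this section, so the sketch assumes that the player who removes the last penny wins; under a different rule the arithmetic would change. The function name and parameters are hypothetical.

```python
def pennies_to_take(pennies, max_take):
    """Return a winning number of pennies to remove, or None if there is none.

    Assumption: players alternate, each removes 1..max_take pennies, and the
    player who removes the last penny wins. The idea is to leave the opponent
    a multiple of (max_take + 1) pennies.
    """
    remainder = pennies % (max_take + 1)
    return remainder if remainder != 0 else None

print(pennies_to_take(15, 5))  # leave 12 for the opponent
print(pennies_to_take(15, 3))  # same method, different maximum
print(pennies_to_take(12, 5))  # no winning move from a multiple of 6
```

Changing the starting number of pennies or the maximum per turn changes only the numbers passed in, not the method.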
The preceding two examples of extremely similar problems were
examples in which the problems differed only in the quantities of
certain elements of the problem. In each case, all of the qualitative or
structural characteristics of the problems were identical.
Equivalent Difficulty
There are other partly analogous problems in which the structure is
somewhat different in the two problems being compared but still
highly similar. For example, consider the following problem called
the fox, goose, corn problem:
A man (M), a fox (F), a goose (G), and some corn (C) are together on
one side of the river (straight line) with a boat (B), as illustrated in the
given state of Fig. 9-1. The goal is to transfer all of these entities to the
other side of the river by means of the boat, which will carry the man and
one other entity. The fox and the goose cannot be left alone together, nor
can the goose and the corn.
Stop reading and try to solve the problem by recalling the methods
used to solve a similar (partly analogous) problem.
FIGURE 9-1
Given and goal states for the fox, goose, corn problem.
The fox, goose, corn problem is similar to the
missionaries-and-cannibals problem discussed in Chapter 5. Now stop reading and try
again to solve the problem, if you did not before.
The solution to this problem is shown in Fig. 9-2. You will note that
at one critical stage you must make an apparent detour in order to solve
the problem, just as in the missionaries-and-cannibals problem. You
might well have suspected that something of this character would be
required, since the two problems are similar in having restrictions
regarding what entities can be on the same side of the river with what
other entities at the same time. Note, however, that instead of two
types of entities, as in the missionaries-and-cannibals problem, this
problem has four types. Furthermore, there is only one example of
each entity, whereas in the missionaries-and-cannibals problem, any
one of the missionaries or cannibals could row the boat. With all these
differences, it is surprising that the one primary similarity is
nevertheless the dominant element of the problem with respect to
problem-solving methods: In each case, we find an implicit evaluation function
in terms of the number of entities on the goal side of the river and take
actions that increase that evaluation function. In each case, it is
necessary to make a brief detour in terms of that evaluation function in
order to solve the problem.

FIGURE 9-2
Solution to the fox, goose, corn problem.
  Starting side | Goal side
  M F G C B     |              (Given)
  F C           | M G B
  M F C B       | G
  C             | M F G B
  M G C B       | F
  G             | M F C B
  M G B         | F C
                | M F G C B    (Goal)
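The detour in the solution can also be found mechanically. The sketch below searches the states of the fox, goose, corn problem breadth first, a technique swapped in here for illustration rather than the hill-climbing-with-detour reasoning of the text; the bank names "near" and "far" and the function names are assumptions.

```python
from collections import deque

ITEMS = frozenset({"fox", "goose", "corn"})
FORBIDDEN = [{"fox", "goose"}, {"goose", "corn"}]  # pairs that cannot be left alone

def is_safe(bank):
    """A bank without the man must not contain a forbidden pair."""
    return not any(pair <= bank for pair in FORBIDDEN)

def solve():
    """Return a shortest list of crossings as (cargo, bank the man arrives at)."""
    start = (ITEMS, "near")        # (items on the near bank, man's bank)
    goal = (frozenset(), "far")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, man), path = queue.popleft()
        if (near, man) == goal:
            return path
        here = near if man == "near" else ITEMS - near
        for cargo in [None, *sorted(here)]:   # None means the man crosses alone
            moved = frozenset() if cargo is None else frozenset({cargo})
            if man == "near":
                new_near, new_man = near - moved, "far"
                unattended = new_near             # the bank the man just left
            else:
                new_near, new_man = near | moved, "near"
                unattended = ITEMS - new_near
            state = (new_near, new_man)
            if is_safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [(cargo, new_man)]))
    return None

for cargo, bank in solve():
    print("man crosses to", bank, "with", cargo or "nothing")
```

The fourth crossing carries the goose back to the starting side, which temporarily lowers the number of entities on the goal side; this is the brief detour in the evaluation function.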
A spatial reasoning problem similar to a problem previously
considered in this book is the following:
You are given six coins arranged in two rows (as shown on the left side
of Fig. 9-3) so that each coin touches the coins immediately above or
below it and to the left or right of it. Specify a procedure for moving
exactly two coins so as to achieve the hexagonal arrangement shown on
the right side of Fig. 9-3.
Stop reading and try to solve this problem by considering a
previously solved related problem.
The most closely related problem is the bowling-pin reversal problem
discussed in Chapter 7. Both problems involve spatially distributed
objects that must be rearranged in a minimum number of moves to
achieve some new configuration. Stop reading and try again to solve
the problem, if you did not before.
By analogy to the bowling-pin reversal problem, it is useful to ask
which four of the coins will remain in the same position and which
two will be moved, going from the given to the goal. Since there are
only (6 x 5)/2, or 15, combinations of two "moved" coins, it would be
relatively simple to investigate all the possibilities, using the method
of contradiction.
However, if you recall that in the bowling-pin reversal problem an
effective strategy was to look for maximum subgroups in the given and
goal states that occupy the same relative positions to one another,
then this perceptual strategy can be applied to quickly give the answer
to the present problem. Now stop reading and try to solve the problem,
if you have not done so already.

FIGURE 9-3
Coin-rearrangement problem.
Clearly, the coins in positions 1, 2, 4, and 6 are in precisely the
same relative configuration to one another as the top four coins in the
goal state. Thus, you can achieve a solution by moving coins 3 and 5
to the two bottom positions. A symmetrically opposite solution can be
achieved by keeping coins 1, 5, 6, and 3 in the same positions (forming
the bottom of the goal hexagon) and moving coins 2 and 4 to the top
two positions of the goal. However, these moves are the only two of
the 15 alternative moves of two coins that solve the problem.
Recall the one-heavy-coin problem discussed in Chapters 3 and 5.
In that problem, we had to determine which of 24 coins was heaviest,
using a beam balance. Obviously, similar principles of problem
solving are likely to be involved in problems where the original set of coins
is some number other than 24. Furthermore, it is immediately apparent
that making the odd coin lighter than the normal coins, rather than
heavier, would not change the method of solution in any respect. That
is, the one-light-coin problem is equivalent to the one-heavy-coin
problem.
Simpler Problems
What happens when we know the odd coin is either heavier or lighter
than the normal coins but do not know which of the two relations the
odd coin has to a normal coin? This problem is obviously rather similar
to the previous problems, but it is different in a much more profound
respect than simply a variation in the number of coins of the original
set or the heavy versus light nature of the odd coin. In this new
problem, where the heavier versus lighter nature of the odd coin is
ambiguous, one principle of problem solving still applies; that is, you
still get an answer to a three-way question from the balance beam.
However, the logical character of the reasoning, given the different
outcomes on the balance beam, is much more complicated.
Nevertheless, noticing the analogy to these other problems would
enormously aid solution of the heavier-lighter-coin problem. Even if
you had not previously solved the one-heavy-coin or one-light-coin
problem, it might still be good strategy to pose and solve either of
these simpler problems before you attempt to solve the more complex
problem.
This strategy of posing similar, simpler problems before working on
a complex problem is very useful, since many of the methods of
representing information or of solving the problem are common to
both. To be sure, the complex problem will invariably have some
additional complications. However, if when solving the simpler
problem you discovered some of the methods for solving the complex
problem, it will be easier to discover the remaining methods of
solving the more complex problem than if you had to solve the complex
problem as a whole.
Along the same lines, when you originally confronted the
one-heavy-coin problem with a set of 24 coins, you might have tried solving a
simpler problem, with the odd coin embedded in a set of only three or
four coins. With a set of three coins, you have the best opportunity
to realize that the balance beam can provide a three-way partitioning
of the set of alternatives.
Often, as in the heavy-coin problems, you judge the simplicity of
the problem by the number of different elements or complications in it.
In the one-heavy-coin problems, which differ only in the total size of
the original set of coins, the complication of the problem seems to be
variable in a simple way, namely, the change in the number of coins
in the total set.
However, we have also noted that a problem in which the odd coin
might be either heavier or lighter was a substantially more complicated
problem than the problem in which it was known definitely which
weight relation the odd coin had to a normal coin. Simplicity in a
problem is by no means a simple quantitative concept.
Another problem similar to the one-heavy-coin problem, but simpler
to solve because it has one less restriction, is the problem of three-way
question information theory. The problem is to determine which
element is the unique element, in a set of n possible elements, by
successively partitioning the set into three subsets, then asking which of
the three subsets contains the unique element.
In ordinary (two-way-question) information theory, the optimal
strategy is to divide the total set into two equal or nearly equal parts
and to continue dividing the remaining set into two equal parts until
the unique element is determined. Interestingly, in three-way-question
information theory, the optimal strategy is not always to divide into
three equal (or nearly equal) parts, though this is not a bad strategy.
If the objective is to minimize the expected number of questions to be
asked in order to determine the unique element, then other kinds of
partitions besides the equal partition are optimal for some set sizes.
For example, if there are six elements in the original total set, the
optimal strategy with three-way-question information theory is not to
divide into 2, 2, and 2 on the first partition. Instead, the optimal strategy
is to divide into 3, 2, and 1, because one-sixth of the time this will
give you the answer in one question and five-sixths of the time it will
give you the answer in two questions. By contrast, the 2-2-2 split
will give you the unique element in two questions in every case.
Similarly, if there were seven elements in the original set, you should
divide into 3-3-1 rather than 3-2-2, and so on.
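The expected number of questions for any set size can be computed by trying every three-way partition. The sketch below assumes that each element is equally likely to be the unique one and that a partition may leave one subset empty; the function name is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def best_split(n):
    """Return (expected questions, best first partition) for n candidates.

    Assumption: every element is equally likely to be the unique element.
    """
    if n <= 1:
        return Fraction(0), None       # nothing left to ask
    best = None
    for a in range(1, n):              # largest part; a < n guarantees progress
        for b in range(0, a + 1):
            c = n - a - b
            if c < 0 or c > b:         # keep a >= b >= c >= 0
                continue
            cost = 1 + sum(Fraction(s, n) * best_split(s)[0] for s in (a, b, c))
            if best is None or cost < best[0]:
                best = (cost, (a, b, c))
    return best

for size in (6, 7):
    expected, split = best_split(size)
    print(size, "elements: split", split, "expected questions", expected)
```

For six elements the computation gives 11/6 questions for the 3-2-1 split, which is (1/6)(1) + (5/6)(2), and for seven elements it gives 13/7 for the 3-3-1 split, in agreement with the text.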
Posing and solving the somewhat simpler three-way-question
information-theory problem provides the surprising information that
dividing into equal thirds is not necessarily the optimal solution. This
principle can be applied in some cases in the one-heavy-coin problem,
though it is limited by the restriction of having two equal subsets in
that problem.
Sometimes when you pose a simpler problem you lose all the difficult
aspects of the original problem. In that case, solving the simpler
problem provides no help at all in solving the original, more complex
problem. For example, in the one-heavy-coin problem involving 24
coins, if you chose to investigate the simpler problem involving only
2 coins, you would draw the same type of wrong conclusion regarding
the weighing operation that many people fall into when working on
the original problem, namely, dividing in half.
More serious than the danger that posing simpler problems will lose
the complexity of the original problem is the danger of posing
apparently simpler problems that are really more difficult to solve than
the original problem. Although it is generally true that reducing the
number of elements in a problem reduces the complexity of the
problem, it is not always so. Sometimes reducing the number of elements
of a particular kind in a problem, or eliminating some of the features
of the problem, results in a problem that is more difficult to solve than
the original problem. Sometimes the supposedly simpler problem is
impossible to solve. The following coin-weighing problem illustrates
the danger involved in posing simpler problems:
You have 10 stacks of quarters with 10 quarters in each stack. One entire
stack is composed of quarters, each of which weighs 2 grams less than it
should. You know the correct weight of a quarter. You may weigh the
coins on a pointer scale, which tells you how many grams a set of objects
placed on it weighs. What procedure will determine the light stack in the
smallest number of weighings?
This coin-weighing problem is different from the previous
coin-weighing problems, primarily because the weighing operation is
different. That is, this problem uses a pointer scale, whereas previous
problems used a beam balance. A weighing operation using a beam
balance provides an answer to a three-way question, but a single
weighing on a pointer scale provides an enormously greater amount
of information (limited only by the accuracy of the pointer scale).
Because of the great difference in the amount of information provided
by the pointer versus beam-balance weighings, there is virtually no
similarity between the solutions to these two types of coin-weighing
problems. Thus, these two types of coin-weighing problems are not
really related at all, in the problem-solving sense. The presence or
absence of concrete similarities, such as two problems being both
concerned with coins and weighing operations, is less important than
more abstract similarities concerned with the relationships among
givens or between givens and operations. If you were misled into
trying to apply similar methods to those used in solving the previous
coin-weighing problems, stop reading and try again to solve this
problem.
Another approach that fails is to try to simplify the problem by
reducing the number of coins in each stack to one coin and simply
determining which of the 10 remaining coins is light. This actually makes
the number of weighings vastly greater than in the original problem,
where one had 10 stacks of 10 coins each. What other way is there to
simplify the problem? Stop reading and try again to solve the problem.
The other obvious way to simplify the problem is to reduce the
number of stacks. Solving a problem with a reduced number of stacks
could facilitate solution of the problem. Now stop reading and try to
solve the problem, using this method of simplification, if you did not
solve it before.
The simplest problem that can be posed, reducing the number of stacks, is
to decide which of two stacks is light. Evidently, this could be done in one
weighing by weighing a single coin from one of the two stacks and determining
that it is either the correct weight for a quarter or 2 grams less than the
correct weight. Of itself, this solution to the two-stack problem does not
indicate how you should solve the 10-stack problem. However, it does provide
you with a basic familiarity regarding the nature of the information provided
by the pointer scale. You should then try to solve a three-stack problem.
Now stop reading and try to solve the three-stack problem in a way that will
allow generalization to a 10-stack problem, if you have not solved the
problem so far.
The three-stack problem can be solved in a single weighing, as can the
10-stack problem. However, an insight is required to accomplish this. I know
of no general problem-solving method that would automatically provide you
with the critical insight. Attempting to solve the simpler three-stack
problem makes it more likely that you would achieve that insight, but it does
not guarantee it. If you have not yet solved the problem, stop reading and
try to determine what combination of coins from the different stacks would
allow you to determine which of the three stacks was the light stack in a
single weighing. Once the three-stack problem is solved, the same procedure
will immediately generalize to the 10-stack problem.
The three-stack problem can be solved in a single weighing only by including
some number of coins from each stack and using the amount of underweight as
measured by the pointer scale to determine which of the stacks is light. To
make use of the information concerning the number of grams by which the
weighing is underweight (from what it would be if the coins were all true
quarters), we clearly need to have some way of associating the amount of
underweight with each of the three stacks. This type of reasoning is an
illustration of drawing inferences from the goal (determining the light stack
in a single weighing). Again stop reading and try to solve the problem, if
you have not done so already.
The procedure required to associate each stack with an amount of underweight
is to take one coin from the first stack, two from the second stack, and
three from the third. If the pointer scale reads 2 grams underweight, you
know that the first stack is light. If it reads 4 grams underweight, the
second stack is light. If it reads 6 grams underweight, the third stack is
light. Generalize the solution to the 10-stack problem, if you have not done
so already.
The original 10-stack problem is solved in a single weighing as follows:
You take one coin from stack 1, two coins from stack 2, three coins from
stack 3, and so on, up to 10 coins from stack 10. Now weigh this entire set
of coins and determine by how many grams it is underweight. The number of
grams of underweight divided by 2 is the number of the stack that is light.
Thus, the solution can be achieved with only a single weighing, when you have
a sufficiently large number of coins available in each stack. Reducing the
number of coins in each stack does not simplify the problem; indeed, it makes
the problem much more difficult, to the point of completely preventing you
from seeing the elegant solution to the original problem. When you reduce the
number of stacks rather than the number of coins in each stack, you obtain
problems that are in some sense simpler, though, of course, you cannot reduce
the number of weighings below 1.
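The single-weighing procedure can be checked with a short simulation. The sketch below is only an illustration: the function names and the true weight of 10 grams per coin are assumptions made for the example (the argument needs only that each light coin is 2 grams under the true weight), and the pointer scale is assumed to be exact.

```python
def expected_reading(num_stacks, true_weight):
    """Pointer-scale reading if every sampled coin were a true quarter."""
    coins_taken = num_stacks * (num_stacks + 1) // 2  # 1 + 2 + ... + num_stacks
    return coins_taken * true_weight


def simulate_reading(light_stack, num_stacks=10, true_weight=10, deficit=2):
    """Weigh k coins from stack k, where only `light_stack` holds light coins."""
    reading = 0
    for stack in range(1, num_stacks + 1):
        coin_weight = true_weight - deficit if stack == light_stack else true_weight
        reading += stack * coin_weight
    return reading


def find_light_stack(reading, num_stacks=10, true_weight=10, deficit=2):
    """Recover the number of the light stack from one reading."""
    underweight = expected_reading(num_stacks, true_weight) - reading
    return underweight // deficit  # grams of underweight divided by 2


# Try every possible light stack in the 10-stack problem.
recovered = [find_light_stack(simulate_reading(k)) for k in range(1, 11)]
```

Each possible light stack produces a different amount of underweight (2, 4, ..., 20 grams), which is why one reading is enough.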
Reducing the number of elements in a problem is not the only way to make it
simpler. Another way is to change the problem so as to allow you to use an
already proved theorem in the solution. That is, you change the problem so
that it permits you to use a theorem or knowledge that you do not already
know how to use in the original problem. You then hope that applying this
theorem to the simpler problem will give you an idea of how to use the
theorem in the original problem. A good example of this technique is provided
by the following distance problem:
A ray of light travels from point A to point B in Fig. 9-4 by bouncing off
a mirror represented by the line CD. Determine the point X on the mirror such
that the distance traveled from point A to point B is a minimum. What is the
relationship between the angles $\theta$ and $\phi$?
Stop reading and try to solve this problem.
Probably the most salient piece of knowledge we all have about minimum
distance is the geometric assumption that the shortest distance between two
points is the straight line connecting them. However, it is not immediately
apparent how to apply this knowledge in the present problem, since we are
constrained to connect A and B with a bent line that touches the line CD at
some point X. If you have not solved the problem, stop reading and try to
define a simpler problem that allows you to apply the principle that the
shortest distance between two points is the straight line connecting them.
FIGURE 9-4
Minimum-distance problem.
The difficulty in applying this principle is that the points A and B lie on
the same side of the line CD (which the shortest-distance line is required to
intersect). This fact prevents the shortest line from being a straight line.
However, if the points A and B lay on opposite sides of the line CD, it would
then be possible to intersect the line CD with a straight line connecting the
points A and B. This statement suggests the definition of a simpler problem
in which the points A and B lie on opposite sides of the line CD. Try to
define such a problem, if you have not done so already.
A simpler problem that permits the application of the shortest-distance
principle is the following:
Find the point X such that the distance from point A to point E passing
through the line CD in Fig. 9-5 is a minimum.
Since points A and E lie on opposite sides of the line CD in this new
problem, the shortest distance between points A and E will be the straight
line connecting them. The point X will be the intersection of line AE with
line CD. Now if point E is constructed as indicated in Fig. 9-5 to be the
same distance from the line CD, and at a point symmetrically opposite point
B, then it is obvious that the distance XB will be equal to the distance XE
(since these are corresponding parts of congruent triangles). This indicates
that the solution for point X in the simpler problem is actually the solution
for point X in the original problem. Furthermore, $\theta$ equals the angle
that XE makes with the line CD, since they are opposite angles of
intersecting straight lines, and $\phi$ equals that same angle, since they
are corresponding angles of congruent triangles. Thus, $\theta = \phi$ (the
angle of incidence equals the angle of reflection), and the original problem
is entirely solved.
FIGURE 9-5
Simpler minimum-distance problem.
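The reflection construction is easy to try numerically. The following is a minimal sketch, assuming the mirror CD lies along the x-axis and using made-up positions for A and B; the function names are hypothetical and the figure's actual dimensions are not used.

```python
import math


def bounce_point(a, b):
    """x-coordinate of X on the mirror y = 0 that minimizes |AX| + |XB|.

    Both a and b are (x, y) points with y > 0, so they lie on the same side.
    """
    ax, ay = a
    bx, by = b
    ex, ey = bx, -by            # E: mirror image of B on the other side
    t = ay / (ay - ey)          # fraction of segment AE that lies above the mirror
    return ax + t * (ex - ax)   # where the straight line AE crosses y = 0


def path_length(a, b, x):
    """Length of the bent path A -> (x, 0) -> B."""
    return math.dist(a, (x, 0.0)) + math.dist((x, 0.0), b)


A, B = (0.0, 3.0), (4.0, 1.0)         # hypothetical positions of A and B
x = bounce_point(A, B)
theta = math.atan2(A[1], x - A[0])    # angle between AX and the mirror
phi = math.atan2(B[1], B[0] - x)      # angle between XB and the mirror
```

For these positions the bent path through X has the same length as the straight segment from A to the mirror image of B, and the two angles come out equal, as the argument above predicts.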
Now that you have had some experience with minimum-distance problems,
perhaps you would like to try your hand at my version of the classic
walking-fly problem:
Billy Smith smudged his lollypop at a point on the wall of the living room
1 foot from the floor and 6 feet from each corner. A fly with a broken wing
is standing on the opposite side wall 1 foot from the ceiling and 6 feet
from each corner. If the living room is 30 feet long, 12 feet high, and
12 feet wide, what is the shortest path along which the fly should walk
to get from where he is to the lollypop smudge?
Since this is obviously a minimum-distance problem, it is similar in that
respect to the preceding problem and thus perhaps may be approached in a
similar way. In deciding how to apply the methods used in a previous problem
to a new problem, it is important to state what you did in the preceding
problem at some level of abstraction that is general enough to apply to both
problems. In trying to state such an appropriate level of generality, you may
begin by stating what you did in whatever manner comes to mind most quickly.
This statement may well be too specific to apply to the present problem, but
you might then try to state your methods in progressively more and more
abstract form, until you reach some statement that applies to the present
problem. Now stop reading and try to formulate (at perhaps several levels of
generality) what was done in the preceding minimum-distance problem, in order
to get ideas for the present minimum-distance problem.
One thing that was done in the preceding problem was to reflect a point
about a line in order to construct an equivalent distance for which the
solution was a straight line. You might investigate the possibility of
reflecting the starting point, goal point, or other points along the walls,
floor, and ceiling of the room in the walking-fly problem, but this procedure
will not produce a solution. Thus, although reflection about an axis is an
operation that can be performed in the walking-fly problem, this operation
will not help solve the problem. Can you think of a more general way to state
what was done in the preceding problem that may suggest other operations to
apply to the walking-fly problem and produce a solution? Stop reading and try
to solve the problem, if you have not done so already.
A simple way to make something more general is to strip it of some of its
properties, leaving these properties unspecified. To solve the preceding
minimum-distance problem, we performed some operations so as to construct an
equivalent problem for which the solution is a straight line. Now stop
reading, and try to perform some operation such that you obtain an equivalent
problem to which the solution is a straight line, if you have not solved the
walking-fly problem already.
The only way the fly could follow a straight line in the original room is
to fly across the room, which he cannot do because of his broken wing. Thus,
we must construct a new medium through which the fly can walk an equivalent
distance in a straight line from the starting point to the goal. The
reflecting of a point in the previous minimum-distance problem could also be
considered to be a rotation of a strip of paper containing the point 180°
around the line axis shown in Fig. 9-5. In order to rotate only one and not
both points, it would obviously be necessary to cut the paper at an angle
along a perpendicular to the axis of rotation. Can you think of some way of
cutting up the room and rotating some combination of walls, floor, and
ceiling that would result in a completely flat two-dimensional surface?
Having achieved an equivalent flat surface, you could then connect the
starting and finishing points by a straight line. Stop reading and try to
solve the problem, using this hint, if you have not done so already.
As a child, you may have made boxes out of flat pieces of paper. Since the
living room in the fly problem is equivalent to a rectangular box, it is just
as possible to cut along various edges and flatten it out as it is to
construct it from an originally flat piece of paper. Thus, we can obtain a
flattened analog of the living room as shown in Fig. 9-6.
Having flattened out the room in the manner shown in Fig. 9-6, it is a
simple matter to determine that a straight line connecting the given and the
goal is the hypotenuse of a right triangle whose sides are 24 feet and 32
feet. Thus, using the Pythagorean Theorem, we find the length of the
hypotenuse equals $\sqrt{24^2 + 32^2} = \sqrt{576 + 1024} = \sqrt{1600} = 40$
feet. Thus, the fly must travel a distance of 40 feet, and the path he must
follow involves traveling across his own end wall, a portion of the ceiling,
a portion of one of the long side walls, a portion of the floor, and a
portion of the opposite end wall.
FIGURE 9-6
Flattened living room for the walking-fly problem.
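There is more than one way to cut the room open, and each flattening gives a different straight-line distance. The sketch below compares four of them. Only the last pair of legs (32 and 24 feet) comes from the text; the other three pairs are my own working from the stated room dimensions and should be read as assumptions of this example, not as figures from the book.

```python
import math

# Legs (in feet) of the right triangle joining fly and smudge in each
# flattened layout. Each layout is named by the faces the path crosses.
unfoldings = {
    "end wall, ceiling, end wall": (42, 0),             # 1 + 30 + 11, straight across
    "end wall, side wall, end wall": (42, 10),          # 6 + 30 + 6 along, 10 down
    "end wall, ceiling, side wall, end wall": (37, 17),
    "end wall, ceiling, side wall, floor, end wall": (32, 24),  # layout of Fig. 9-6
}

# Straight-line length in each layout, by the Pythagorean theorem.
lengths = {faces: math.hypot(dx, dy) for faces, (dx, dy) in unfoldings.items()}
shortest = min(lengths, key=lengths.get)
```

Under these assumptions the five-face layout gives the shortest route, in agreement with the 40-foot path described above, even though it crosses the most faces.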
More Complex Problems
Posing a problem that is more complex than the given problem is the logical
inverse of posing a problem that is simpler than the given problem. Offhand,
it would seem that posing a more complex problem would hardly be a useful
problem-solving technique, and in general this is true. However, if all else
fails, you might attempt to pose a more complex problem in which the elements
of the given problem are included with additional complications, just on the
chance that it would give you some ideas. This method is clearly a last
resort and unlikely to be beneficial, but it is worth considering for one
reason, namely, you may already have solved a more complex problem in which
your present problem is essentially embedded. If this is the case, then
thinking of a more complex related problem that you have already solved will
provide you with all the ideas necessary for the solution of your present,
simpler problem. I do not know of many examples of this, but here is one:
Given a five-by-five checkerboard, as shown in Fig. 9-7, try to draw a
line through all the squares of the checkerboard, starting from the square
with the dot in it on the left side and passing through each box once and
only once, without ever lifting pencil from paper and without ever passing
outside of the checkerboard. Show how to do this or prove it impossible.
Stop reading and try to solve the problem by considering a previously
solved related problem.
One problem that certainly has some relation is the notched-checkerboard
problem discussed in Chapter 3. If you recall how the notched-checkerboard
problem was solved, it might well provide you with all the ideas needed to
solve the present problem. Now stop reading and try again to solve the
problem, if you have not done so already.
The problem that is most closely similar is the integer-path addition
problem discussed in Chapter 7. There the problem was to place the integers
1 to 9 in a continuous path over a three-by-three matrix such that the
three-digit number in the first row plus the three-digit number in the second
row summed to the three-digit number in the third row. Essentially the
integer-path addition problem involves drawing a line starting from one cell
of a three-by-three matrix (checkerboard) in precisely the same way as is
involved in the present problem using a five-by-five matrix (checkerboard).
The integer-path addition problem also involved another restriction, making
it a more complex problem than the present one. However, the solution to the
integer-path addition problem involved a consideration of the restriction on
possible solutions placed by the path (continuous line) aspect of the
problem. Now stop reading and try to solve the problem, if you have not done
so already.
FIGURE 9-7
The five-by-five checkerboard.
In both the notched-checkerboard and the integer-path addition problems
the critical property is the imposition of a checkerboard coloring pattern
on the five-by-five checkerboard. Now note that every time you draw a line
through two squares, you necessarily draw a line through one white and one
black square. Alternatively, you could impose a two-dimensional coordinate
labeling scheme from (1, 1) to (5, 5). In that case, notice that, whenever
you leave a square with an odd coordinate sum, you pass into a square with an
even coordinate sum, and whenever you leave a square with an even coordinate
sum, you pass into a square with an odd coordinate sum. Now stop reading and
try again to solve the problem, if you did not before.
Consideration of the implications of checkerboard coloring patterns for
the present problem yields the following inference: If you start in a white
square and must draw a line through an odd number of squares in total, then
the color of the last square you pass through must be the same color as the
square you started from. In the present instance, there are 25 squares in the
five-by-five checkerboard. Thus, if you begin in a white square, you must end
in a white square, and there must be exactly 13 white squares and 12 black
squares in the checkerboard. However, in the checkerboard shown in Fig. 9-7,
there are 13 black squares and 12 white squares. Thus, starting from any
white square on the board, it will be impossible to solve the problem of
drawing a continuous line through each square once and only once.
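The counting argument can be written out with the coordinate labeling from (1, 1) to (5, 5). This is a sketch of the parity test only, with hypothetical function names. Because Fig. 9-7 is not reproduced here, the starting squares used in the test are assumed examples, one with an odd coordinate sum and one with an even coordinate sum.

```python
def color_counts(n):
    """Return (even_count, odd_count) of coordinate sums on an n-by-n board."""
    even = sum(
        1
        for row in range(1, n + 1)
        for col in range(1, n + 1)
        if (row + col) % 2 == 0
    )
    return even, n * n - even


def passes_parity_test(n, start):
    """Necessary (not sufficient) condition for a path through every square.

    The path alternates colors, so the starting color must account for
    ceil(n*n / 2) squares and the other color for floor(n*n / 2).
    """
    row, col = start
    even, odd = color_counts(n)
    start_color_count = even if (row + col) % 2 == 0 else odd
    other_color_count = n * n - start_color_count
    total = n * n
    return (start_color_count == (total + 1) // 2
            and other_color_count == total // 2)
```

A start that passes this test is not guaranteed to have a solution; a start that fails it cannot have one, which is all the impossibility proof needs.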
SPECIAL CASE
Particularly in proof problems, it often happens that the theorem to be
proved states a general relation that holds over a number of special cases or
entities. In such problems, it is often useful to try to prove the theorem
first for one or more of these special cases before an attempt is made to
prove the theorem in general. The reason is that it is usually easier to
prove the theorem for a special case than for the theorem in general. This
argument is precisely the same one made for the advantages of posing and
solving simpler problems in general. However, not all simpler problems are
special cases of the problem you are trying to solve. The reverse, however,
is almost invariably the case: special cases are simpler problems than the
general problem.
Proving a theorem true for one or more special cases increases the
probability that the theorem is true in general, but unless you can prove the
theorem true for all special cases, proving the theorem in a particular case
does not, of course, prove the theorem in general. However, disproving a
special case of a conjectured theorem does disprove the theorem in general.
When you are uncertain about the truth of the theorem, it can be particularly
useful to investigate the theorem in some special case, since a quick
disproof of the theorem for the special case disproves the theorem in
general. This exercise may save you considerable time that otherwise might be
spent in fruitless attempts to prove a false theorem.
When the theorem is true, proving it true for one or more special cases
may provide you with many of the elements needed in order to prove the
theorem in general. This reason is perhaps the primary one for posing and
solving special cases of general problems.
One use of the method of special case was discussed already in Chapter 6 on
subgoals as a part of the method of mathematical induction. Recall that, in
the method of mathematical induction, we had to first prove the theorem true
for $n = 1$ (a special case) and then show that if the theorem was true for
$n$ it was true for $n + 1$. Thus, in proving that the sum of the first $n$
integers equals $n(n + 1)/2$, we initially established that this was true
for $n = 1$.
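For reference, the two steps of that induction can be written out as follows. This is the standard derivation, given here as a reminder rather than quoted from Chapter 6.

```latex
\begin{align*}
\text{Special case } (n = 1):\quad & 1 = \frac{1(1 + 1)}{2}. \\
\text{Assume for } n:\quad & 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}. \\
\text{Then for } n + 1:\quad & 1 + 2 + \cdots + n + (n + 1)
  = \frac{n(n + 1)}{2} + (n + 1)
  = \frac{(n + 1)(n + 2)}{2}.
\end{align*}
```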
Another use of the method of special case occurs sometimes in
multiple-choice examination questions. For example, if you were asked to
choose one of five formulas for the sum of the first $n$ integers, the
fastest method might be to investigate each formula on some special case,
such as $n = 5$, very likely determining that all but one of the answers
produced a contradiction in that special case. Note that this is, in essence,
a combination of the use of two problem-solving methods, namely, special case
and the method of contradiction.
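A small sketch of this elimination procedure follows. The five candidate formulas are invented for the illustration (the text does not list any), and the helper name `survivors` is likewise hypothetical.

```python
# Five hypothetical answer choices for "the sum of the first n integers".
candidates = {
    "n(n + 1)/2": lambda n: n * (n + 1) / 2,
    "n(n - 1)/2": lambda n: n * (n - 1) / 2,
    "n^2/2": lambda n: n * n / 2,
    "n(n + 1)": lambda n: n * (n + 1),
    "(n^2 + 1)/2": lambda n: (n * n + 1) / 2,
}


def survivors(n):
    """Names of the formulas that agree with the directly computed sum at n."""
    actual = sum(range(1, n + 1))  # 1 + 2 + ... + n, added up directly
    return [name for name, formula in candidates.items() if formula(n) == actual]


remaining_at_5 = survivors(5)   # special case n = 5
remaining_at_1 = survivors(1)   # special case n = 1
```

With these made-up choices, the special case n = 5 leaves a single formula, whereas n = 1 leaves two, so a poorly chosen special case may need to be followed by a second one.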
A similar problem often arises when you try to remember some formula you
learned previously and think you recall it but are not sure. For example, in
trying to recall the formula for the sum of the first $n$ integers, you might
erroneously recall something such as $n(n - 1)/2$. Such erroneous conjectures
can easily be tested and rejected by investigating their truth in one or more
special cases. Since you often have a reasonably good idea of what the
correct formula is, a few rejections of incorrect statements of the formula
will usually be followed by a correct statement, which might simply be
verified by mathematical induction.
Deriving a formula for the number of combinations of m things taken n at a
time provides another good example of the use of the method of special case.
Undoubtedly you the reader have encountered this formula in the past;
however, in my experience, many students fail to remember the formula and
most do not know how to derive it. Even if you do know how to solve the
problem, it is useful to think of how you would go about applying the method
of special case to derive the formula. Thus, consider the following:
Derive a formula for the number of combinations of m things taken n at a
time ($m \ge n$). Combinations refer to the number of different unordered
sets of elements. That is, the set of two elements obtained by drawing X and
then drawing Y is equivalent to the set obtained by drawing Y and then X.
The set XYZ is equivalent to the set YXZ or the set ZYX. The ordering of the
elements in the set is irrelevant. Furthermore, you are restricted to drawing
an element only once from the underlying population of m elements. That is,
you may sample from the underlying population n times without replacing the
elements you sampled (sampling without replacement).
Stop reading and try to solve this problem, making use of the method of
special case.
There are four specific aspects to the problem. First, there is an
underlying population of m elements. Second, you are picking a sample of n of
these elements. Third, the sampling is done without replacement; that is,
every time you pick an element for the sample, you do not put it back in the
population, so the population is reduced by one element every time you choose
an element for the sample. Fourth, you are concerned with the number of
different unordered sets obtained by this sampling procedure, rather than the
number of different ordered sets.
Each of these four aspects could be changed to pose a problem related to
the present one, some of them simpler than the present problem, which might
facilitate its solution. If you have not yet solved the problem, stop reading
and think how you might change one or more of the four aspects to derive a
related problem or a special case that is simpler to solve than the original
problem.
You could reduce the size of the underlying population of m elements to,
say, the special case of two elements. You would then also have to reduce the
size of the sample to either one or two elements (n = 1 or 2). However, it is
probably unnecessary to reduce both m and n in this way. It is quite possible
to leave m as it is and reduce the sample size to two elements (n = 2). You
would now be considering the special case of the present problem where n = 2,
namely, where one is selecting an unordered pair without replacement from the
underlying population of m elements. Stop reading and try to solve the
problem, if you have not done so already.
It will be easier to solve for the special case of the number of unordered
pairs of elements if you consider the related problem of determining the
number of ordered pairs of elements that can be selected from the population
of m elements. This latter problem is quite trivial to solve: there are m
ways to select the first element, and for each of these m ways, there are
$(m - 1)$ ways to select the second element; thus, there are $m(m - 1)$ ways
to select an ordered pair of elements. Having determined the number of
ordered pairs, it is quite possible to determine the number of unordered
pairs. Stop reading and consider how you would do this, then generalize your
answer to solve the original problem, if you have not done so already.
The critical difference between ordered and unordered pairs is that the
ordered pair XY is considered equivalent to the ordered pair YX when
unordered pairs are being considered. Thus, there are exactly two different
ordered pairs of elements for each unordered pair. Knowing this, how can you
solve the problem of determining the number of unordered pairs, if you know
how many ordered pairs there are? Stop reading and solve this problem and
then generalize your answer to solve the original problem, if you have not
done so already.
Clearly, if there are two ordered pairs for each unordered pair and
$m(m - 1)$ ordered pairs, then there are $m(m - 1)/2$ unordered pairs. Now
generalize this answer to the solution of the original problem, where you are
selecting not a pair of elements but a set of n elements at a time.
The relevant generalization of the solution to the special case is that the
problem should be broken into two parts. First, the number of ordered sets of
elements should be determined, then how many different orderings there are
for each unordered set. Now use this analysis of the two subproblems to solve
the original problem, if you have not done so already.
If you are not able immediately to generalize the solution of the special
case to obtain a solution to the general case, then consider another special
case, where $n = 3$. Here the number of ordered sets obtained by sampling
without replacement is $m(m - 1)(m - 2)$. The number of different orderings
of each sample of three elements is the number of permutations of three
things. The number of permutations of a set of three things is
$3 \cdot 2 \cdot 1$, or 6, since there are three ways to pick the first
element from the sample, two ways to pick the second, and one way to pick the
third. Thus, the number of different unordered sets of three elements equals
$m(m - 1)(m - 2)/(3 \cdot 2 \cdot 1) = m(m - 1)(m - 2)/3!$. Stop reading and
generalize the formula to the case of an unordered set of n elements, if you
have not done so already.
Clearly, the general formula for the number of unordered sets of n elements
selected from a population of m elements without replacement is
$m(m - 1) \cdots (m - n + 1)/n! = m!/[n!\,(m - n)!]$. The general principle
for solving this problem comes from breaking it into two parts in order to
determine, first, the number of ordered sets and, second, the number of
distinct ways of ordering (reordering) the elements in a particular ordered
set (sample). The number of unordered sets is equal to the number of ordered
sets divided by the number of ways of ordering the elements in a particular
set. This general principle is essentially present in all special cases where
n = 2, 3, and so on. Thus, solving the problem for one or two special cases
provides all the essential ingredients for solving the problem in general.
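The two-part principle translates directly into a short computation. The sketch below uses hypothetical function names and cross-checks the result against Python's standard `math.comb` and against an explicit listing of the unordered sets.

```python
import math
from itertools import combinations


def ordered_count(m, n):
    """Number of ordered samples of size n drawn without replacement from m."""
    count = 1
    for k in range(n):
        count *= m - k          # m, then m - 1, ..., down to m - n + 1
    return count


def unordered_count(m, n):
    """Ordered samples divided by the n! orderings of each sample."""
    return ordered_count(m, n) // math.factorial(n)


pairs_from_5 = unordered_count(5, 2)                      # special case n = 2
triples_from_6 = unordered_count(6, 3)                    # special case n = 3
listed_triples = len(list(combinations(range(6), 3)))     # brute-force listing
```

The division is always exact, because every unordered set of n elements corresponds to exactly n! ordered samples.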
The method of special case is also frequently useful in geometric problems.
Consider the following proof problem in Euclidean geometry:
You are given the following: (a) A straight line equals an angle of 180°.
(b) A right angle equals 90°. (c) If two parallel lines are cut by a
transversal, the alternate interior angles are equal. Prove that the sum of
the angles of any triangle equals 180°.
Stop reading and try to prove this theorem, making use of the method of
special case.
Since you are given information on the number of degrees in the right
angle, it is reasonable to consider the special case where one of the angles
of the triangle is a right angle. For example, consider the right triangle
shown on the left in Fig. 9-8.
Stop reading and try to solve the problem for this special case, then
generalize your answer to a triangle without any right angle, if you have
not done so already.
FIGURE 9-8
Proof of the theorem in Euclidean geometry that the sum of the angles of a
triangle is 180°. Left is for the special case of a right triangle. Right is
for the general case of any triangle.
Given the right triangle shown in Fig. 9-8, it would be reasonable to
construct a line at point A parallel to the opposite side of the triangle.
Having done this, the alternate angles are equal: $\phi' = \phi$, and the
angle at A that is alternate to the right angle is itself a right angle.
Since that angle equals $\theta + \phi'$, then $\theta + \phi' = 90^\circ$;
this establishes that $\theta + \phi = 90^\circ$. Thus, $\theta + \phi$ plus
the right angle equals $180^\circ$, and the theorem is proved for the special
case of a right triangle. Now stop reading and try to generalize your
solution to the case of any triangle, if you have not already solved the
problem.
In solving the special case, we have essentially mapped the angles of the
triangle onto three angles that, taken together, form a straight line. This
approach extends in a direct way to the general case for any triangle, not
just a right triangle. Thus, for the general triangle, we would be led to
construct a line at A parallel to the opposite side. Then, we would go
through exactly the same reasoning as in the special case. We might not
recognize in the special case the general principle of mapping the angles of
the triangle onto a straight line. However, it is likely we would consider
erecting a line at A parallel to the opposite side of the triangle in the
special case, noticing that $\theta + \phi' = 90^\circ$ and that
$\phi = \phi'$. This procedure includes the essential notion of constructing
an additional line at A parallel to the opposite side, and thus, you might
think of the more general principle.
In some instances, it turns out that proving the theorem for a small number
of special cases constitutes the proof of the entire problem. Sometimes this
result is obvious in advance, and sometimes it only becomes obvious after
considering a special case. For example, in Chapter 7 we considered the
problem of proving that $A^2 - AB + B^2 > 0$. We considered the proof of this
theorem only for the case where $A > 0$ and $B > 0$ but noted that the
theorem was actually true for all A and B. Proving the theorem for all A and
B essentially requires us to prove it for four cases: (a) $A > 0$ and
$B > 0$; (b) $A > 0$ and $B < 0$; (c) $A < 0$ and $B > 0$; (d) $A < 0$ and
$B < 0$. The proof of the theorem for each of these four special cases
involves breaking the problem up into three more special cases within each of
the four previously mentioned special cases, namely, $A > B$, $A = B$, and
$A < B$. Thus, in all, we have 12 special cases for which to prove that
$A^2 - AB + B^2 > 0$ is true. But for each of these 12 special cases, the
theorem is rather simple to prove.
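A numerical spot check of this case breakdown is sketched below. It is not a proof, and the sample range of nonzero integers from -3 to 3 and the helper names are assumptions made for the example. One thing the sketch makes visible is that some of the twelve nominal cases contain no values at all: when A > 0 and B < 0, for instance, only A > B can occur.

```python
from itertools import product


def sign_case(a, b):
    """Which of the four sign cases (a) to (d) the pair falls in."""
    return ("A > 0" if a > 0 else "A < 0", "B > 0" if b > 0 else "B < 0")


def order_case(a, b):
    """Which of the three ordering subcases the pair falls in."""
    if a > b:
        return "A > B"
    if a == b:
        return "A = B"
    return "A < B"


values = [v for v in range(-3, 4) if v != 0]   # nonzero sample values only

# Group the value of A^2 - AB + B^2 by (sign case, ordering subcase).
cases = {}
for a, b in product(values, repeat=2):
    key = sign_case(a, b) + (order_case(a, b),)
    cases.setdefault(key, []).append(a * a - a * b + b * b)

all_positive = all(v > 0 for results in cases.values() for v in results)
nonempty_cases = len(cases)
```

On this sample the expression is positive in every case that actually occurs, and eight of the twelve nominal cases are nonempty.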
An example in geometry where the general problem can be divided into two
essentially identical special cases is provided by the proof of the following
theorem:
You are given the following: (a) The measure of an intercepted arc in
degrees is the same as the measure of its corresponding central angle
(namely, the angle determined by drawing the radii from the center of the
circle to the ends of the intercepted arc). (b) The sum of the angles of a
triangle equals 180°. (c) The angles opposite the equal sides of an isosceles
triangle are equal. Prove that an angle inscribed in a circle has half as
many degrees as its intercepted arc.
Stop reading and try to solve the problem by first considering a special
case.
As is frequently true, there are many ways of formulating special
cases of the present theorem. For example, we might consider the
special case where the inscribed angle is a right angle and try to prove
that its intercepted arc is 180° (that the chord for this arc is a diameter
of the circle). Conversely, we might investigate the special case in which
the intercepted arc is 180° and try to prove that the inscribed angle
is 90°. Another type of special case would be to assume that the
chords composing the inscribed angle are equal, and so on. We might
consider many special cases before we hit on the special case that is
most useful in solving the general problem. No previously mentioned
special case is optimal for the solution of the present problem, but the
alternate special cases may suggest such a one. Stop reading and try
to solve the problem, if you have not done so already, by using the
method of special case.
The optimal special case to consider is that where one of the sides
of the inscribed angle is a diameter of the circle. This special case is
illustrated in the upper section of Fig. 9-9. Now stop reading and
try to solve the special case, then extend your answer to a proof of
the general theorem, if you have not proved the theorem already.
FIGURE 9-9
Diagrams for the proof of the theorem in Euclidean geometry that an
inscribed angle has half as many degrees as the intercepted arc (which
equals the central angle). Panels: special case, first subcase, second
subcase.
Proving the theorem for the special case is relatively straightforward.
First, draw in the dashed line shown in the upper circle of Fig. 9-9, to
obtain the central angle $\beta$, which we know is equal to the
intercepted arc. We can easily verify that the triangle shown in the
figure is isosceles, since two of its sides are radii of the circle. Now
making use of the assumption that the angles opposite equal sides of an
isosceles triangle are equal, we know that the inscribed angle
$\alpha = \alpha'$. The third angle of the triangle is the supplement of
the central angle, $180^\circ - \beta$. Therefore, by the given that the
sum of the angles of a triangle equals 180°, we know that
$\alpha + \alpha' + (180^\circ - \beta) = 180^\circ$. Therefore,
$2\alpha = \beta$ or $\alpha = \beta/2$, and the theorem is proved for the
special case. Now stop reading and extend your solution to the general
case.
Actually, it is probably simpler to consider two types of general
cases. In essence, we are subdividing the general case into two special
cases that exhaust the entire category. Let us call these two cases the
first subcase and the second subcase. In the first subcase, the inscribed
angle includes the diameter of the circle drawn from the vertex of the
angle. In the second subcase, the inscribed angle does not include the
diameter drawn from the vertex of the inscribed angle. Now stop
reading and try to solve the problem for each of the two subcases of
the general case, if you have not done so already.
In the first subcase, we can divide the angle $\alpha$ into two
components, $\alpha_1$ and $\alpha_2$, with $\alpha_1 + \alpha_2 = \alpha$,
such that the dividing line for the two component angles is a diameter
of the circle. Now for each of these component angles,
$\alpha_1 = \beta_1/2$ and $\alpha_2 = \beta_2/2$, as shown in the lower
left diagram of Fig. 9-9. Thus,
$\alpha = \alpha_1 + \alpha_2 = (\beta_1 + \beta_2)/2 = \beta/2$, and the
theorem is proved for the first subcase. Now stop reading and solve for
the second subcase, if you have not already done so.
In the second subcase, we can consider the inscribed angle $\alpha$ to
be equal to the difference between angle $\alpha_1$ and angle $\alpha_2$,
as illustrated in the right-hand diagram of Fig. 9-9
($\alpha = \alpha_1 - \alpha_2$). Since each of the component angles,
$\alpha_1$ and $\alpha_2$, satisfies the requirement of the special case
(that one of the chords of the angle be a diameter of the circle), we
know that $\alpha_1 = \beta_1/2$ and $\alpha_2 = \beta_2/2$. Thus,
$\alpha = \alpha_1 - \alpha_2 = \beta_1/2 - \beta_2/2 = (\beta_1 - \beta_2)/2 = \beta/2$,
and the theorem is proved for the second subcase.
This problem provides a beautiful example of the multiple use of the
method of special case. An extremely special case was first investigated
to get the basic idea for the solution. Then the general case was
subdivided into two special subcases, which were nevertheless more
general than the original special case. In proving the theorem for each
of these two more general special cases, the truth of the theorem for
the special case was used as an integral part of the proof.
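As a numerical companion to the proof, the theorem can be spot-checked on random inscribed angles. This is a sketch under stated assumptions: the circle is taken to be the unit circle, points are given by their angular position in radians, and all function names are illustrative.

```python
import math
import random

TWO_PI = 2 * math.pi


def point(t):
    """Point on the unit circle at angular position t (radians)."""
    return (math.cos(t), math.sin(t))


def inscribed_angle(v, p, q):
    """Angle at vertex V between chords VP and VQ, in radians."""
    vx, vy = point(v)
    px, py = point(p)
    qx, qy = point(q)
    ux, uy = px - vx, py - vy
    wx, wy = qx - vx, qy - vy
    cross = ux * wy - uy * wx
    dot = ux * wx + uy * wy
    return math.atan2(abs(cross), dot)


def intercepted_arc(v, p, q):
    """Measure (radians) of the arc PQ that does not contain the vertex V."""
    ccw = (q - p) % TWO_PI              # arc from P counterclockwise to Q
    v_on_ccw = (v - p) % TWO_PI < ccw   # does V lie on that arc?
    return TWO_PI - ccw if v_on_ccw else ccw


def spot_check(trials=1000, seed=0):
    """Largest observed gap between the inscribed angle and half its arc."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        v, p, q = (rng.uniform(0, TWO_PI) for _ in range(3))
        gap = abs(inscribed_angle(v, p, q) - intercepted_arc(v, p, q) / 2)
        worst = max(worst, gap)
    return worst
```

A small value returned by `spot_check()` is consistent with the theorem but, of course, does not replace the proof by subcases.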
In some cases, the solution of a single special case may provide the
solution to the general problem. One example of this is provided by
the following problem:
A cylindrical hole 10 inches long is drilled through the center of a solid
sphere, as shown in Fig. 9-10. What volume remains in the sphere?
Stop reading and try to solve the problem, using the method of
special case.
FIGURE 9-10
The hole in the sphere problem.
The problem implies that the volume remaining in the sphere is
independent of the diameter of the cylindrical hole, provided that the
hole is 10 inches long. Assuming that the problem has a unique
solution, independent of the diameter of the hole, we can get a solution
for the general problem very simply by considering a special case. What
is this special case? Stop reading and try to solve the problem, if you
have not done so already.
Consider the special case where the cylindrical hole has a diameter
of zero. In this case, since the cylindrical hole is 10 inches long, the
sphere must have a diameter of 10 inches, and the volume of a solid
sphere with a diameter of 10 inches equals
$\frac{4}{3}\pi r^3 = \frac{4}{3}\pi \cdot 5^3 = 500\pi/3$. Making the
assumption (which we have certainly not proved) that this problem has a
unique solution, independent of the width of the hole, $500\pi/3$ must
be the answer. Naturally, if we were conjecturing that the validity of
this theorem was uncertain, we could not use this line of reasoning to
solve the problem.
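The uniqueness assumption can be checked directly. The sketch below computes the remaining volume for an arbitrary sphere radius by subtracting the cylinder and the two spherical caps removed by the drill. The function name and the decomposition into cylinder plus caps are my own framing of the problem, and the cap formula $\pi h^2 (3R - h)/3$ is the standard one for a cap of height h on a sphere of radius R.

```python
import math


def remaining_volume(sphere_radius, hole_length=10.0):
    """Volume left after drilling a centered hole of the given length."""
    half = hole_length / 2
    if sphere_radius < half:
        raise ValueError("sphere is too small for a hole of this length")
    # Radius of the hole follows from the Pythagorean theorem.
    hole_radius_sq = sphere_radius ** 2 - half ** 2
    cap_height = sphere_radius - half
    sphere = 4 / 3 * math.pi * sphere_radius ** 3
    cylinder = math.pi * hole_radius_sq * hole_length
    cap = math.pi * cap_height ** 2 * (3 * sphere_radius - cap_height) / 3
    return sphere - cylinder - 2 * cap


if __name__ == "__main__":
    for radius in (5.0, 6.0, 13.0, 100.0):
        print(radius, remaining_volume(radius), 500 * math.pi / 3)
```

For every radius tried, the result agrees with the special-case answer $500\pi/3$ up to rounding, which supports (but does not by itself prove) the independence assumed above.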
A second example of a general problem that can be solved by solving
a single special case is the following:
In this two-person game the players alternately place poker chips on a
circular table. The chips must not overlap and must be completely on the
table; that is, no poker chip may stick out over the edge of the table.
The last player to play a chip on the table is the winner. If each player
makes the optimal move on his turn, will the first player or the second
player be the winner?
Stop reading and try to solve the problem by considering a special
case.
The problem suggests that optimal strategy will produce a forced
win for either player 1 or player 2, independent of the size of the table.
Assuming this, what special case yields a quick solution?
Consider a table that is big enough to accommodate only one poker
chip (when placed in the center of the table). In such a case, the player
who goes first will be able to place the first and the last poker chip on
the table and will therefore be the winner. This case suggests that, if
one of the players has a forced win playing by optimal strategy for all
sizes of tables, then that player is the first player. Verifying this
hypothesis for a table of any size requires a further clever insight beyond
that provided by the special case. However, the special case does
suggest that we should test the hypothesis that it is the first player who
can force a win for himself by optimal play. Also, the winning first
move for the first player in the special case might suggest the winning
first move for the first player in the general case. Stop reading and try
to solve the general case, if you have not done so already.
The insight involved in solving the general problem is that the first
player initially places a poker chip in the center of the table (as in the
special case) and thereafter plays chips in a symmetrically opposite
position to that played by the second player. Clearly, if the second
player has any place on the table available to place a poker chip, there
will still then be a symmetrically opposite place on the table for the
first player to place a chip, so that the first player must be the last to
play a chip on the table, independent of the size of the table. Note that
the initial, unique move by the first player (namely, placing a poker
chip in the center of the table) is exactly the same move that the first
player should make in the special case.
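The symmetry strategy can be exercised in a small simulation. This is a sketch under several assumptions of mine: chips are circles of a fixed radius, positions are chip centers relative to the center of the table, and the second player is a hypothetical opponent who picks random legal positions. The simulation stops when random sampling finds no spot for the second player, which only approximates "no legal move exists." The assertion is the interesting part: the mirrored reply is always legal.

```python
import math
import random


def is_legal(pos, chips, table_radius, chip_radius):
    """A chip must lie fully on the table and must not overlap any other chip."""
    x, y = pos
    if math.hypot(x, y) > table_radius - chip_radius:
        return False
    return all(math.hypot(x - cx, y - cy) >= 2 * chip_radius for cx, cy in chips)


def play_game(table_radius=10.0, chip_radius=1.0, seed=0, attempts=2000):
    """Return the number of chips played; the first player uses the mirror strategy."""
    rng = random.Random(seed)
    chips = [(0.0, 0.0)]  # first player's opening move: the center of the table
    moves = 1
    while True:
        # Second player (hypothetical): try random positions until one is legal.
        reply = None
        for _ in range(attempts):
            candidate = (rng.uniform(-table_radius, table_radius),
                         rng.uniform(-table_radius, table_radius))
            if is_legal(candidate, chips, table_radius, chip_radius):
                reply = candidate
                break
        if reply is None:
            return moves  # second player found no move; first player played last
        chips.append(reply)
        moves += 1
        # First player: the symmetrically opposite position must still be free.
        mirror = (-reply[0], -reply[1])
        assert is_legal(mirror, chips, table_radius, chip_radius)
        chips.append(mirror)
        moves += 1
```

Because every move by the second player is answered immediately, the total number of chips returned is always odd, meaning the first player placed the last chip.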
A third example of a general problem that can be solved by solving
a single special case is the following:
Triangle ABC is formed by three tangents to a circle, as shown in Fig.
9-11. Angle DAE = 26°. Solve for angle COB.
Stop reading and try to solve the problem.
Angle DAE (which is the same as angle BAC) is completely determined
by two of the three tangents. It is impossible from the information
given in the problem to determine the location of the tangent BC,
since it may touch the circle anywhere within the arc ED. Thus, assuming
that the problem has a unique solution for angle COB (which is
certainly implied by the statement of the problem), we can solve for angle
COB by considering any special case of the placement of tangent CB.
Stop reading and try to solve the problem, if you have not done so
already.
FIGURE 9-11
Diagrams for solution of the problem to find the angle COB. Panels:
special case, general case.
The obvious special case to consider is for tangent CB to touch
the circle at the same point as does the line from the center of the
circle to point A (as shown in Fig. 9-11). Having chosen this
special case, it is now relatively easy to solve the problem. Since angle
AED and angle ADE intersect the same arc of the circle, these two
angles are equal. Thus, AE = AD, since the sides opposite equal
angles of a triangle are equal. Line OE = line OD, since both are radii
of the same circle. Line OA = line OA, since they are the same line.
Thus, triangle AOE is congruent to triangle AOD, by virtue of having
the three corresponding sides equal. Thus, angle EAO = angle DAO
= 13°, since angle DAE = 26°. Since there are 180° in a triangle, and
angle AEO = angle ADO = 90°, we know that angle AOE = angle
AOD = 90° - 13° = 77°. Let F be the point at which CB touches the
circle. We can easily prove that triangle OEC is congruent to triangle
OFC by having the same hypotenuse and one equal side (radii of the
circle). Similarly, triangle ODB is congruent to triangle OFB. Thus,
angle COF = 1/2 angle EOF and angle FOB = 1/2 angle FOD. Putting all
this together implies that angle COB = 1/2 angle EOD
= 1/2 (77° + 77°) = 77°.
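The claim that angle COB does not depend on where the third tangent touches the circle can be checked with coordinates. In this sketch (my own construction, not the book's) the circle is the unit circle centered at O, point A lies on the positive x-axis, the two fixed tangents touch the circle at plus and minus 77° from OA, and the third tangent touches it at an adjustable angle `theta_deg` between them. Which fixed tangent point is called E and which D is an assumption; by symmetry it does not affect the result.

```python
import math


def tangent_intersection(t1, t2):
    """Intersection of the unit-circle tangents touching at angles t1 and t2."""
    # The tangent touching at angle t is the line x*cos(t) + y*sin(t) = 1.
    a1, b1 = math.cos(t1), math.sin(t1)
    a2, b2 = math.cos(t2), math.sin(t2)
    det = a1 * b2 - a2 * b1  # zero only if the two tangents are parallel
    return ((b2 - b1) / det, (a1 - a2) / det)  # Cramer's rule


def angle_cob(theta_deg, dae_deg=26.0):
    """Angle COB in degrees when the third tangent touches at theta_deg."""
    half = math.radians(90.0 - dae_deg / 2.0)  # angle AOE = angle AOD
    theta = math.radians(theta_deg)
    c = tangent_intersection(half, theta)    # C: third tangent meets tangent AE
    b = tangent_intersection(-half, theta)   # B: third tangent meets tangent AD
    cross = c[0] * b[1] - c[1] * b[0]
    dot = c[0] * b[0] + c[1] * b[1]
    return math.degrees(math.atan2(abs(cross), dot))


if __name__ == "__main__":
    for theta in (-60.0, -10.0, 0.0, 35.0, 70.0):
        print(theta, angle_cob(theta))
```

The value `theta_deg = 0` corresponds to the special case chosen in the text; other values strictly between -77 and 77 correspond to the general case.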
A fourth example of solution of a general problem by means of a
single special case is the following:
In Fig. 9-12 angle BAD = 20°, AB = AC, and AD = AE. Solve for angle
CDE.
Stop reading and try to solve the problem, making use of the method
of special case.
This problem at first seems to lack enough data to solve it. The given
information in the problem is not adequate to specify a single unique
triangle with these properties. There is a large variety of different
(noncongruent) triangles consistent with the information that AB = AC,
AD = AE, and angle BAD = 20°. Furthermore, these triangles are not
even similar to one another; that is, the angle DAE can assume a
variety of different values. Clearly, the absolute lengths of the sides
AB, AC, AE, and AD are not relevant to determining the angle CDE.
FIGURE 9-12
Geometry problem.
However, it is surprising to be told implicitly by the problem that the
angle DAE is irrelevant to the value of the angle CDE. Assuming that
within some range of values the magnitude of angle DAE is irrelevant
to the magnitude of the angle CDE, then how might we go about solving
for the magnitude of angle CDE? Stop reading and try to solve the
problem, if you have not done so already.
Clearly, we can solve the problem for a special case of the value of
angle DAE and determine the magnitude of the angle CDE. According
to the implicit information stated in the problem, we should obtain the
same solution for the angle CDE irrespective of our choice of angle
DAE (over some range). Thus, let us pick angle DAE = 20°. Now stop
reading and solve for angle CDE in this special case, if you have not
already solved the problem.
If angle DAE = 20°, then angle BAC = 40° and angle ABD = angle ACD
= 70°, since these angles are opposite the equal sides of an isosceles
triangle and there are 180° in a triangle. By the same reasoning, angle
ADE = angle AED = 80°. Thus, angle DEC = 100°, and therefore angle
CDE = 180° - 100° - 70° = 10°. It turns out that an infinity of other
values substituted for the angle DAE yield the same value (10°) for angle
CDE, which is the solution to the problem. In general, we could simply
substitute some arbitrary value x for angle DAE and solve the set of
equations to determine the value of angle CDE in a manner that would
be independent of x. However, it is considerably simpler to solve the
problem by choosing a single special case, since the given information
implies that the solution to the special case is equivalent to the
solution to the general problem.
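For readers who want to see the general computation mentioned above, here is a sketch of it. It assumes, as the special-case computation implies, that D lies on side BC and E lies on side AC; the symbol $x$ for angle DAE is my own choice of name.

```latex
\begin{align*}
\angle BAC &= 20^\circ + x \\
\angle ABD = \angle ACD &= \tfrac{1}{2}\bigl(180^\circ - (20^\circ + x)\bigr) = 80^\circ - \tfrac{x}{2} \\
\angle ADE = \angle AED &= \tfrac{1}{2}\bigl(180^\circ - x\bigr) = 90^\circ - \tfrac{x}{2} \\
\angle DEC &= 180^\circ - \angle AED = 90^\circ + \tfrac{x}{2} \\
\angle CDE &= 180^\circ - \angle DEC - \angle ACD
            = 180^\circ - \Bigl(90^\circ + \tfrac{x}{2}\Bigr) - \Bigl(80^\circ - \tfrac{x}{2}\Bigr)
            = 10^\circ
\end{align*}
```

The terms in $x$ cancel in the last line, which is why any admissible special case, including $x = 20^\circ$, gives the same answer.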
GENERALIZATION
Just as it is sometimes useful to solve a special case prior to working
on a more general problem, it is also frequently useful to do the opposite
and generalize the problem somewhat.
Generalization plays a role in problem solving in at least three
different ways. First, as a necessary part of problem solving, we
usually abstract from a problem certain properties belonging to a more
general class of problems and thus relevant for determining the
previously established principles needed for solving our present problem.
Second, after we have solved the problem, it is often useful to consider
whether we could generalize a solution from it to a wider class of
problems in order to derive a more general conclusion or one or more
corollaries of the principle established in the problem we just solved. Third,
occasionally (though in my experience not too frequently), it may be
useful to pose and attempt to solve a more general problem prior to
working on the current problem, even when the solution to that more
general problem is not yet known to us.
The first role that generalization plays in problem solving has really
already been discussed in Chapter 3 in connection with the
representation of information. Recall that a critical aspect of solving many
problems consists in retrieving from memory the relevant previously
established relations and principles with common properties needed
to solve the present problem.
It may be that the current problem is really a special case of a general
class of problems for which we already know a simple rule for solution.
For example, if the present problem is the linear equation $2x + 5 = 13$,
we know that the solution to this particular linear equation can be
achieved by using the general methods for solving any linear equation
of the form $ax + b = c$. Similarly, if the equation were a quadratic of
the form $7x^2 + 2x - 4 = 0$, we have a formula for solving any equation
of the form $ax^2 + bx + c = 0$. A broad range of higher order equations
can be solved by certain types of numerical methods. If we have a
particular equation that falls within the scope of a numerical method, we
know we can apply this method to solve the particular problem.
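Treating a particular equation as an instance of a general form is exactly what a small general-purpose routine does. The sketch below is illustrative only: the function names are my own, and reading the quadratic example as $7x^2 + 2x - 4 = 0$ (with a minus sign before the 4) is an assumption about the printed equation.

```python
import math


def solve_linear(a, b, c):
    """Solve a*x + b = c for x; a must be nonzero."""
    if a == 0:
        raise ValueError("not a linear equation in x")
    return (c - b) / a


def solve_quadratic(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0, smallest first (empty if none)."""
    if a == 0:
        raise ValueError("not a quadratic equation in x")
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))


if __name__ == "__main__":
    print(solve_linear(2, 5, 13))     # the special case 2x + 5 = 13
    print(solve_quadratic(7, 2, -4))  # the special case 7x^2 + 2x - 4 = 0
```

Each call substitutes the particular coefficients into the general rule, which is the sense of generalization described in the paragraph above.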
In a geometric context, if a problem gives two sides of a right
triangle and we are asked to solve for the third side, we know a general
method that is applicable to solving all such problems, namely, use
of the Pythagorean Theorem, $c^2 = a^2 + b^2$.
If we are given a problem in which we must determine the number
of combinations of seven things taken four at a time, we need only
retrieve the formula for the number of combinations of n things taken
r at a time and substitute in the appropriate values for n and r in
order to solve the problem.
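Both examples amount to retrieving a general rule and substituting particular values. A minimal sketch follows; the wrapper names and the sample triangles are illustrative assumptions, while `math.hypot` and `math.comb` are standard-library functions (the latter requires Python 3.8 or later).

```python
import math


def hypotenuse(a, b):
    """Third side of a right triangle when the two legs are given."""
    return math.hypot(a, b)


def missing_leg(c, a):
    """Third side of a right triangle when the hypotenuse and one leg are given."""
    return math.sqrt(c * c - a * a)


def combinations_count(n, r):
    """Number of combinations of n things taken r at a time: n! / (r! (n - r)!)."""
    return math.comb(n, r)


if __name__ == "__main__":
    print(hypotenuse(3, 4))          # legs 3 and 4 (a hypothetical example)
    print(missing_leg(13, 5))        # hypotenuse 13, leg 5 (also hypothetical)
    print(combinations_count(7, 4))  # seven things taken four at a time
```

For the problem stated in the text, the formula gives $7!/(4!\,3!) = 35$ combinations.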
Ordinarily, to solve problems we must combine use of more than
one previously established principle. Thus, in all proof problems,
whether algebraic, geometric, or logical, the proof invariably requires
a sequential application of several previously established principles.
In a story-algebra problem, first the methods of representing the
information algebraically must be applied, then the methods appropriate
for solving whatever algebraic equations are derived from the story.
The examples of generalization in this most important context could
be extended indefinitely. Suffice it to say that, in this sense,
generalization plays an enormously important role in problem solving. However,
as discussed in Chapter 3, this use of the method of generalization
depends critically on the degree of understanding you have of the
previously established principles in the areas relevant to the current
problem. A few general principles regarding representation of information
are discussed in Chapter 3 and Chapter 10; however, the field is just
too vast and thus is outside the scope of this book.
The second role of generalization in problem solving has little use for
a student in a course but is frequently valuable for a mathematician or
scientist solving a new problem to see if the solution can be generalized
to a larger class of problems. Along the same line, we may try to derive
some additional consequences as relatively straightforward corollaries
of the solution to the present problem. For example, if we had
established that the two diagonals of a rectangle were equal, we might ask
whether this result could be generalized to a larger class of situations.
It would then be relatively straightforward to notice that the solution
generalizes to diagonals connecting equal and parallel sides of any
polygon, including regular hexagons or octagons and parallelograms.
Another example in a geometric context is provided by the theorem
that the alternate interior angles formed by a transversal intersecting
two parallel lines are equal. This result easily generalizes to establish
that the other pair of alternate interior angles are equal and that both
pairs of corresponding exterior angles are equal as well.
A final example, again in a geometric context: Assume that we have
already established that an inscribed angle has half as many degrees
as its intercepted arc. From this result it is relatively trivial to show
that the angle formed by a tangent and a chord meeting it at the point
of contact has also half as many degrees as its intercepted arc. In fact,
the latter theorem could be thought of as simply being a limiting case
of the former theorem.
The third possible role of generalization in problem solving is, in a
sense, the inverse of the previously described role of the method of
special case; namely, it might facilitate solution of a specific problem
to formulate a more general problem that had not been previously
solved. Then we might either solve the more general problem or,
in any event, work on the solution of the more general problem for a
time, before going back to working on the specific problem.
Polya (1957, pp. 108-109) argues that this problem-solving technique
is quite useful and he gives the following example:
The problem is to find a plane that passes through a given line and bisects
the volume of a given octahedron.
Polya asserts that it would be useful to formulate the more general
problem of finding a plane that passes through a straight line and bisects
the volume of any solid with a center of symmetry. The solution to
this problem is fairly obvious, namely, a plane determined by the given
line and the center of the solid with a center of symmetry. Since an
octahedron is a special case of a solid with a center of symmetry, the
original, specific problem is solved. Polya asserts that the value in
formulating the more general previously unsolved problem is that it
can focus the problem solver's attention on the necessary properties
in the original problem that must be used in order to solve it. As Polya
himself points out, however, the primary function of generalization was
in the formulation of the more general problem. If we had generalized
the problem in an inappropriate way (that is, using some property
that was not in fact central to the solution of the original problem),
then the formulation of the general problem would likely have been
of no help and might even have been a hindrance in the solution of the
original problem.
A two-dimensional analog of the previous example would be to
determine the line that passes through a given point and bisects the area
of, say, a given square. A generalization of this problem
would be to determine a line that passed through a given point and
bisected the area of a given plane figure with a center of symmetry.
Again, formulating the more general problem directly suggests the
solution, namely, that the line passes through the given point and the
center of symmetry of the figure.
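The two-dimensional claim is easy to test for a convex figure. The sketch below clips a polygon against the line through two points and measures the area on one side with the shoelace formula. The function names, the sample square, and the sample point are my own illustrative choices, and the clipping step assumes a convex polygon.

```python
def clip_left(poly, p, q):
    """Part of a convex polygon lying on or to the left of the directed line p -> q."""
    def side(pt):
        # Positive on the left of the line, negative on the right.
        return (q[0] - p[0]) * (pt[1] - p[1]) - (q[1] - p[1]) * (pt[0] - p[0])

    out = []
    for i in range(len(poly)):
        a, b = poly[i], poly[(i + 1) % len(poly)]
        sa, sb = side(a), side(b)
        if sa >= 0:
            out.append(a)
        if (sa > 0 and sb < 0) or (sa < 0 and sb > 0):
            t = sa / (sa - sb)  # where the edge a -> b crosses the line
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out


def polygon_area(poly):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def area_left_of_line(poly, p, q):
    """Area of the convex polygon on the left of the line through p and q."""
    return polygon_area(clip_left(poly, p, q))


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    center = (1.0, 1.0)
    given_point = (5.0, 3.0)  # a hypothetical given point outside the square
    print(area_left_of_line(square, given_point, center), polygon_area(square) / 2)
```

A line through the given point that misses the center generally leaves unequal parts, which is what singles out the center of symmetry as the property that matters.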
Personally, I am somewhat skeptical of the alleged benefits of trying
to solve a more general problem, when the solution to the more
general problem is not known to the problem solver. However, I am
sure that thinking of possible generalizations of the current problem
does often aid the problem solver in realizing all of the properties of
the problem, some of which may be the critical properties in order to
solve the problem. In this sense, the alleged third role of
generalization is very similar to the first role of thinking of generalizations of
the problem when the solution of the more general problems is already
known. It is a question of representing the important properties or
principles that are present in a special problem, and, to do so, an
abstraction process is involved. Abstracting the properties from a
problem is, in essence, generalization. Thus, once again we see that
the role of generalization and the role of representation of information
(as discussed in Chapters 3 and 10) are very closely linked and perhaps
identical.
10
Topics in
Mathematical Representation
As stated in Chapter 2, problems contain information concerning
givens, actions, and goals. The first and most basic step in problem
solving is to represent this information in either symbolic or
diagrammatic form. Symbolic form refers to the expression of information in
words, letters, numbers, mathematical symbols, symbolic logic
notation, and so on. Diagrammatic form refers to the expression of
information by a collection of points, lines, angles, figures, directed lines
(vectors), matrices, plots of functions, graphs, and the like. Often the
same information should be represented using a variety of symbolic
or diagrammatic notations. In fact, diagrammatic representation is
generally labeled; for example, points, lines, and cells in a matrix have
symbols attached to them in the diagram. Of course, problems are
stated originally in some form, often relying heavily upon verbal
language. The first step in solving such a problem is to translate from
the representation given explicitly or implicitly in the original
statement of the problem to a more adequate representation.
This chapter is concerned with selected topics in the mathematical
or precise representation of information in problems. Although precise
representation of the information in a problem is the first step to take
in trying to solve a problem, I deferred discussing this important topic
to this late chapter of the book for two reasons.
First, although some general statements can be made about the
representation of information in a large variety of problems, most of
the principles of representation are specific to particular problem areas.
Effective representation for problems from some area of mathematics,
science, or engineering depends upon knowing centuries of conceptual
development in the relevant areas of mathematics, science, and
engineering. I doubt that mankind will ever develop a general method
for determining what are the useful concepts to define in any particular
area. Certainly, no such general principles of how to define good
concepts are presented in this book. The best I can do is to present those
types of concepts and the principles for representing them that have
proved the most useful in a wide variety of areas of formal problem
solving. This is what is done in the present chapter, without any claim
to completeness (which would be preposterous) and with only minimal
claim to logical organization of the concepts and the principles of
mathematical representation.
Second, although some of the principles of mathematical
representation are reasonably simple and can be communicated to even the most
minimally prepared student, some of the principles discussed in the
latter half of this chapter are concerned with concepts from various
areas of mathematics with which some readers will be unfamiliar.
I hope that these readers will profit from the sections on sets, relations,
operations, mappings, functions, and real-valued functions of a real
variable. However, it seemed wisest to put this material near the end
of the book so as not to discourage readers with less mathematical
sophistication.
The material in the latter portion of this chapter is really a brief,
simple discussion of selected mathematical topics, largely modern
algebra and combinatorial mathematics. This material is primarily
intended for students who have some background in these topics in
college, high school, or grade school new math courses. For such
students, these sections are intended as review of the relation of certain
mathematical concepts to the general methods of problem solving
discussed in this book. For students with no background in set theory,
modern algebra, and combinatorial mathematics, these sections may
be rather hard going and require considerable study. Such students
should consult regular mathematics books concerned with these topics,
rather than try to master the material on the basis of the rather brief
discussion presented here.
The primary basis for selecting the mathematical concepts discussed
in this chapter is their applicability to the puzzle-type problems
characteristic of recreational mathematics, which constitute the
primary example problems in this book. A large subclass of all recreational
mathematics problems consists of "insight" problems, where a major
difficulty may be to recognize the important concepts for representing
the information in the problem.
REPRESENTATION ON PAPER OR IN THE HEAD
This section has a simple message: use pencil and paper extensively
when you are trying to solve problems. Of course, the primary
representation of information is in your head, but virtually all problems can
be solved faster by representing some of the information on paper
(or a blackboard or other writing surface) than they can without a
written graphic aid. Written representation of information is useful for
both verbal symbolic information and visual diagrammatic information.
To try to solve problems without using pencil and paper is to subject
yourself to an unnecessary handicap. Although an occasional problem
may be solved faster purely "in the head," the vast majority of all
problems will be more quickly solved by representing information on
paper at an early stage in working on the problem. No one can say
for sure why this is so, but there are at least four plausible reasons.
First, writing down the components of a problem focuses your
attention on the need to give names (symbols, diagrammatic representation)
to each of the important concepts in the problem.
Second, it automatically draws your attention to the information
stated in the problem as you attempt to represent that information
on paper.
Third, as you derive inferences or get to intermediate stages in the
solution of the problem, writing aids your memory for these inferences
or intermediate stages at later stages in the solution of the problem.
After working on a problem for some time, it is easy to forget some of
the given information or inferences you drew from the given
information, and some of this information may be helpful later. Having this
information written down allows you to use rapid visual scanning to
jog your memory for prior concepts and facts that might usefully be
combined with the concepts and facts to which you are currently
paying attention.
Fourth, problems that involve tables or matrices of information are
especially difficult to retain as a visual image purely in the mind. Such
information is very efficiently represented by means of a table written
on paper. For an example of the importance of constructing tables to
represent information, see the Smith, Jones, Robinson problem in
Chapter 7. Similar conclusions apply to graphs and other figures, which
may be difficult to accurately imagine and remember purely mentally,
without graphic aids.
Whatever the reason, experience indicates that pencil and paper
representation of information is very useful in problem solving. So
do not be lazy. Always have pencil and paper ready when you start
to work on problems, and make extensive use of them through all
stages of problem solving.
DIAGRAMMATIC REPRESENTATION
When a problem in some way involves spatial concepts (points, lines,
angles, directions, vectors, surfaces or plane figures, solids, contiguity,
connectedness, inside, outside, around), diagrammatic representation
may be an extremely useful aid to symbolic representation, whether
verbal, logical, or algebraic. Even when the problem does not seem to
involve any spatial concepts, it sometimes happens that you can form
an analogy between the concepts in the problem and spatial concepts,
so that you could draw a diagram that might be of some aid in solving
the problem. For example, overlapping circles might be used to
represent overlapping sets, points to represent elements of a set, and sets
of arrows to represent mappings from one set to another.
Verbal symbolic representation is probably somewhat more
important than visual diagrammatic representation in problem solving
and in abstract thinking in general. The communication of the givens,
operations, and goals of a problem is largely in verbal symbolic terms.
Even when we employ diagrams in the solution of problems, they are
usually labeled; that is, symbols are attached to the points, lines, and
angles. For example, in solving for the lengths of lines or the
magnitudes of the angles between lines in geometric figures, we invariably
make extensive use of symbols attached to various points, line
segments, or angles in the diagram (see Fig. 10-1).
Furthermore, all the spatial information represented by a diagram
like Fig. 10-1 can be represented symbolically without having to
employ diagrammatic representation. For example, the spatial
information represented in Fig. 10-1 can be represented symbolically as
follows: lines a, b, and h meet at common vertex B, lines a and d meet
at vertex A, lines d, h, and c meet at vertex D, lines b and c meet at
vertex C, lines c and d are collinear, and line h is perpendicular to
lines d and c. If we wished to suppress the symbols for line segments,
we could represent lines by unordered pairs of points: for example,
(A, B) for line a.
FIGURE 10-1
A diagrammatic representation of the spatial information
in some geometric problem.
By adopting some conventions regarding symbolic representation
of spatial or geometric information, the above symbolic representation
can be shortened considerably. For example, let the unordered sets
of lines meeting at different vertices (points) be represented as
follows: (a, b, h), (a, d), (d, h, c), (b, c). The fact that d and c are
collinear (K) and h is perpendicular (⊥) to them could be represented
by something like d K c and h ⊥ d and c. So there is nothing
uneconomical about the symbolic representation in terms of time to
write the information.
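The symbolic description of Fig. 10-1 can also be written down as a small data structure. The following Python sketch is only an illustration: the names `lines_at_vertex` and `endpoints` are hypothetical, and the incidence data is copied from the verbal description above. It recovers each line as an unordered pair of points, such as (A, B) for line a.

```python
# Lines meeting at each vertex, copied from the symbolic description of Fig. 10-1.
lines_at_vertex = {
    "A": {"a", "d"},
    "B": {"a", "b", "h"},
    "C": {"b", "c"},
    "D": {"d", "h", "c"},
}

def endpoints(incidence):
    """Return each line as an unordered pair (frozenset) of the vertices it touches."""
    ends = {}
    for vertex, lines in incidence.items():
        for line in lines:
            ends.setdefault(line, set()).add(vertex)
    return {line: frozenset(vertices) for line, vertices in ends.items()}

# Each line is now named by its two end points, with no order implied.
line_points = endpoints(lines_at_vertex)
```

A frozenset is used here because it ignores order, which matches the idea of an unordered pair of points.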
However, to say that, compared to symbolic representation,
diagrammatic representation is less important is not to say it is
unimportant. People could probably learn to solve problems involving
spatial concepts with purely symbolic representations such as those
just presented, but it is doubtful that they would solve them as
efficiently. Current evidence indicates that there is a modality of the
mind concerned with spatial concepts that functions differently from
the modalities of the mind concerned with verbal symbolic concepts.
The symbolic modalities are much more generally useful (for
example, even the spatial concepts can be represented in nonspatial
symbolic terms), but it is very likely that the spatial modality of the
mind is particularly well suited to reasoning about spatial concepts and
problems involving those concepts. Representing the spatial
information in a problem in diagrammatic terms probably brings a new
part of your mind to work on the problem. Furthermore, that part of your
mind is probably very well designed for reasoning regarding the spatial
aspects of the problem that are represented in diagrams. Finally, much
of your prior knowledge regarding spatial concepts, relations, and so
on, is probably stored in the mind's spatial modality. Since such prior
knowledge is often assumed implicitly to be part of the givens in a
problem, it is clearly important that you have access to your memory
for such information.
SYMBOLIC REPRESENTATION
General Principles of Concept Representation
The simplest and most frequent step in symbolic representation of the
information in problems is to choose some symbol (or sequence of
symbols) to stand for a concept. The concept represented by a symbol can
be anything the human mind can conceive. Let us take the symbol x and
examine a few of the many concepts it can represent in various
problems. The symbol x can represent any real number or it can
represent a particular, but (in the present problem state) unknown, real
number. Alternatively, x can range over the integers (..., -2, -1,
0, 1, 2, ...), positive integers, negative integers, rationals,
irrationals, complex numbers (y + zi), the elements of some set, subsets
of three elements from some given larger set, and so on. The symbol x
can be the label of a particular point, line, or figure in a geometric
problem, the label of some particular element in a practical
construction problem (such as a gate, a piece of a fence, or a stump),
or one of the tokens in a puzzle or game (such as ticktacktoe).
Any symbol can be used to represent any concept, subject to one
and only one restriction: the same symbol should almost never be used
to represent two concepts that are not known to be equivalent
throughout the present problem. There are many other principles of
effective representation of concepts, but we can violate them without
the risk of producing conclusions that contradict the given information
in the problem. We cannot represent by the same symbol two concepts that
are not equivalent in a problem, without running a substantial risk
of generating contradictions to the given information and deriving
incorrect answers.
By contrast, it is perfectly safe to use different symbols for concepts
or quantities that may later prove to be equivalent or equal. However,
it reduces the load on the memory if you notice such equivalences or
equalities before you assign symbols to concepts or quantities, and
assign the same symbol to concepts that must be equivalent.
Nevertheless, in view of the grave danger involved in mistakenly using
the same symbol for nonequivalent concepts, it is best to always use
different symbols for different concepts, unless you are absolutely
certain that the concepts are equivalent throughout the problem.
Mnemonics and Symbol Conventions
Although you may use any symbol you like to represent any concept,
subject to the above-mentioned restriction, people tend to develop
habits regarding the types of symbols used for different types of
concepts. For example, in mathematics, i, j, k, l, m, and n tend to be
used for variables that range over the integers; u, v, w, x, y, and z
tend to be used for variables that range over real numbers; p and q are
usually used for probabilities (ranging over the real numbers between 0
and 1); and f, g, and h tend to be used primarily to represent unknown
functions.
In scientific and engineering problems, symbols chosen to represent
different concepts tend to have some easy mnemonic relation to a
longer name for the concept in verbal language, usually being the first
letter of the name (or the first letter of the key word, if the name of
the concept is some phrase containing several words). For example,
f and F might be used for forces, A for area, p and P for pressures,
t and T for times, r and R for rates, and w and W for work. Other
symbolic naming conventions in science and engineering may be purely
arbitrary, such as using Greek letters (such as θ) for angles; but
adhering to such conventions (whether mnemonic or arbitrary) makes it
easy to remember in any particular problem what concepts the symbols
represent. Maintaining consistency across different problems in the
types of symbols you use to represent types of concepts brings long-term
memory to the aid of short-term memory in recalling what is meant
by each symbol you are using in any problem.
Single Symbols
If ease of remembering what a symbol represents is so important for
efficient problem solving, why not use two or three letters from the
full name or even the full name for the concept? Something like this is
occasionally done in problems where the names of two or more concepts
have the same first letter; for example, using to for total, te for
tension, and ti for time, when all three concept names appear in some
problem. This sort of multisymbol naming of a concept is useful in
some cases, and these cases will be discussed in subsequent sections
on the use of subscripts, vectors, and functions. However, in cases
where the mnemonic advantages of using several symbols to represent
a single concept are the only advantages, this practice is almost always
a mistake. Unless there are a very large number of concept names that
all have the same first letter, it is always possible to think of
different single symbols to represent each concept, such that each
symbol has an adequate mnemonic relation to the concept name so
abbreviated. For instance, you could use some letter in the name other
than the first, capital letters as well as small letters (for example,
T and t), corresponding Greek letters (for example, τ for t),
phonetically similar letter names if you know phonetics (for example,
d for t), or change the concept name to some functional synonym (for
this problem) that starts with a different first letter (for example,
sum for total, force for tension, duration for time).
For the purposes of unique representation in any one problem, it is
unnecessary to represent concepts by a string of several symbols (as
we do in verbal language), because so few different concepts are
involved in any given problem. You do want to maintain a strong
mnemonic relation between your chosen symbol and the concept name it
abbreviates, because that provides you with access to the information
stored in your long-term memory concerning the concept and its
previously established relations to other concepts. However, we have
just seen that a single symbol is usually adequate to accomplish this
relation without a lot of table lookup.
This being the case, there are substantial cognitive advantages to
using single symbols to represent concepts in problem solving. It is
fair to say that psychology does not know for sure the exact reasons
why this is so, but the experience of problem solvers establishes that
it is easier to work with single-symbol names for concepts. Many
plausible reasons can be given. Single symbols probably place less of
a load on short-term and long-term verbal and visual memory. So, for
example, it is easier to remember the visual image or verbal statement
of an expression, formula, or equation that uses single symbols for
each concept than one which uses more symbols to represent each
concept. Single symbols take less time to write on paper and generally
have less potential for erroneous writing or reading.
Expressions
To reduce the cognitive and memory load in solving problems, you can
also reduce the number of different symbols used to represent
concepts by recognizing the relations between concepts right away when
you are deciding upon representation. For example, if John is 30 pounds
heavier than Bill, let b represent Bill's weight and let b + 30
represent John's weight, to avoid assigning any new symbol at all to
represent John's weight. This use of expressions involving a small
number of symbols to represent all the concepts in the problem can speed
up the solution to the problem by immediately reducing the number of
unknowns to those having only nontrivial relations to one another.
However, it must be recognized that you are performing two steps at
once: representing the concepts in the problem and expressing some
relatively simple relations between concepts. Trying to do these two
steps at once increases the probability of error, though when
performed successfully, it usually allows you to solve the problem
faster. My advice is to try to combine these two steps and use
expressions to represent concepts. However, if you find yourself making
lots of errors, go back to the more compulsive and systematic procedure
of first assigning symbols to all different and important concepts in
the problem, and only then start expressing the relations between these
concepts: for example, by equations such as j = b + 30, where j
represents John's weight.
Subscripts
In a problem with many different quantities of the same type (such as
many different times, rates, distances, volumes, heights, or radii of
circular bases), it is often desirable to use two-symbol codes for
representing these concepts. In these problems, the type of concept
can be thought of as a property of some object, activity, or
object-performing-an-activity. One of the two symbols in the two-symbol
code is used to represent the property and the other symbol to represent
the object, activity, or object-performing-an-activity. In addition, the
mnemonic convention is that the property is represented by the main
symbol, with the object, activity, or object-performing-an-activity
represented by a subscript. So, for example, the height of cylinder
A could be represented by $h_A$ and the radius of its base by $r_A$.
Similarly, $h_B$ and $r_B$ could be used for cylinder B, and so on.
Subscripts usually appear to the right and somewhat below the main
symbol, but they occasionally appear to the lower left of the main
symbol. So long as there are no exponents involved in solving a problem,
you can use superscripts for the second symbol just as well as
subscripts. Superscripts can appear to the upper left or upper right.
However, since there is very frequently a danger that superscripts will
be confused with exponents, it is good practice to avoid using them.
The only possible excuse for using a superscript occurs in problems
where the objects to which a property applies are differentiated along
two or more dimensions, and the values of the various dimensions are
completely noncomparable. For example, imagine that in some problem
you have various complex containers, each of which has several
component containers with different shapes and dimensions. Container
A might be composed of a cylindrical subcontainer, two cubical
subcontainers, and a rectangular subcontainer. There are also complex
containers B and C, each composed of subcontainers. How should
you represent the volume of each subcontainer in container A, for
example? For ease in remembering what symbols represent what and
the associated ease in retrieving formulas for volumes from your
long-term memory, you might try using a notation such as the following:
Let $V_c^A$ be the volume of the cylindrical subcontainer of A; let
${}_1V_s^A$ be the volume of the first cubical (square base)
subcontainer of A; let ${}_2V_s^A$ be the volume of the second cubical
subcontainer of A; let $V_r^A$ be the volume of the rectangular
subcontainer of A.
You encounter notations like the above, with both left and right
subscripts and a superscript (or even two!), but I think they are poor
notations. Superscripts are dangerous because they can be confused
with exponents. If a superscript is put to the upper left of the main
symbol, it eliminates the possibility of its being confused with an
exponent for the main symbol; but it is then apt to be mistaken for an
exponent for the symbol to the left in a multiplication problem (for
example, $y\,{}^{A}V$ for $y^{A}$ times $V$). If you are always careful
to put the entire three- or four-symbol complex into parentheses, for
example $({}_2V_s^A)$, you will avoid these confusions. However, you
will then have a complex symbol with as many as six component symbols;
indeed, even four component symbols are too many.
In cases like the above, you should probably scrap the entire effort
and use a single letter for each concept or, at most, a two-symbol
code for each concept. If you can easily think of some single symbols
that have some reasonably good mnemonic relation to the concepts
they represent, use such symbols. Or, failing this, make an arbitrary
assignment of single symbols to concepts and write down the
assignment in a table or diagram, for consulting when necessary. The
time saved in the repeated writing of the symbols in various statements
or equations usually more than compensates for the extra time spent
in table or diagram lookup. Furthermore, it is rather hard to visualize,
verbalize, or otherwise think about such complex symbols as the ones
in the above example.
Multiple Subscripts
One important exception to the above advice regarding complex
symbols is the case where you have an entire matrix (two-dimensional or
higher dimensional) of objects or activities, each entry of which has
one or more properties. In such cases, use multiple subscripts (possibly
separated by commas), all on the lower right of the main symbol (for
example, $x_{ijk}$ or $x_{i,j,k}$). Such cases arise frequently in
statistics, where $x_{ijk}$ might represent the wheat yield on the kth
plot of land, subjected to the ith value on one treatment dimension (for
example, the amount of some kind of fertilizer), and subjected to the
jth value on another treatment dimension (for example, the amount of
water).
Why is use of a complex symbol with multiple subscripts recommended
in the statistics example and not in the previous example?
In the statistics example there are a number of dimensions, and every
combination of values on every dimension (that is, every entry in the
matrix) has a defined value of the property in question (for instance,
wheat yield). In the problem concerning volumes of complex containers,
every complex container could have different shapes of subcontainers
and different numbers of each shape. Suppose you defined dimensions
like the "shape of a subcontainer"; the complex container to which
the subcontainer belonged; and whether it was the first, second, and
so on, subcontainer of this shape to be a part of the complex container.
You would then have lots of cells in the matrix with no objects
corresponding to them in the problem (empty cells in the matrix). This
is more trouble than it is worth.
When you actually have a large number of objects in some
multidimensional matrix, there is really no feasible alternative to
using multiple subscripts. Furthermore, in matrix problems like the
statistics example, we often do operations such as summing over all the
entries in some row or column without ever having to look up in any
table or diagram the meaning of each complex symbol. When subscript
notation is used, a convenient notation exists for indicating sums or
products, for instance:
$$\sum_{i} x_{ijk} \quad \text{and} \quad \prod_{i} x_{ijk}$$
Subscript notation is always called for in problems where multiple
sums or products of this type are likely to be used in the solution.
Where such multiple sums and products or some other computations
or relations involving the subscripts are not likely to appear anywhere
in the solution, it is questionable whether you should ever use a
symbol with more than one subscript.
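To see why multiple subscripts pay off when every cell of the matrix is filled, consider the following Python sketch. It is only an illustration: the yield figures are made-up numbers, and the names `x`, `mean_over_plots`, and `total_over_all` are hypothetical. The three list indices play the roles of the subscripts i, j, and k.

```python
# Hypothetical wheat yields x[i][j][k]:
# i = fertilizer level, j = water level, k = plot of land.
x = [
    [[10, 12], [11, 13]],  # i = 0
    [[14, 15], [16, 18]],  # i = 1
]

def mean_over_plots(x, i, j):
    """Average the yield over all plots k for treatment combination (i, j)."""
    plots = x[i][j]
    return sum(plots) / len(plots)

def total_over_all(x):
    """Triple sum over i, j, and k."""
    return sum(x_ijk for x_i in x for x_ij in x_i for x_ijk in x_ij)
```

Because every combination of i, j, and k has an entry, the sums can be written without consulting any table of symbol meanings, which is the situation described above.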
Example Problem
Tom, Dick, and Harry mow lawns in the summer to earn money. They
each have a lawn mower, and one Saturday they decide to mow a 5,900
square foot lawn together, using all three lawn mowers. Tom mows 70
square feet per minute, Dick 50, and Harry 40. Dick and Harry start
mowing the lawn at the same time, but Tom has trouble starting his mower
and is delayed for 30 minutes. All three boys stop mowing at the same
time, when the lawn is finished. How long does Tom mow?
In solving this problem, the principal step is to represent the
information in algebraic notation and set up the equations. After this,
the sequence of algebraic actions (operations) is trivial. Stop reading,
and represent the information in this problem, using the principles
discussed in this chapter; then solve the problem.
There are several steps involved in representing the information,
and you should be aware of them, even though you may be able to solve
such problems quickly and easily. The same types of steps are involved
in all story-algebra problems, and similar representational steps are
involved in many other types of problems as well.
First we have to represent the unknown quantities for which we are
to solve. Here that means having some expressions represent (stand
for) the times that Tom, Dick, and Harry work. It is economical and
an aid to visual and verbal memory to choose a single symbol (often
a letter) to stand for an unknown quantity. You may choose any
symbols you like for the unknown quantities represented, provided that
you do not use the same symbol to represent quantities that might not
be equal.
It is perfectly safe to use different symbols for quantities that may
later prove to be equal. However, it aids the memory if you notice
such equalities before you assign symbols to quantities and assign the
same symbol to quantities that must be equal. For example, in the
present problem, we can use $t$ for the time that Dick mowed and also
for the time that Harry mowed. We know that these times are equal
since Dick and Harry start and stop at the same time. It would place
an unnecessary strain on the memory to use $t_1$, $t_2$, and $t_3$ to
stand for the mowing times of Tom, Dick, and Harry, respectively, though
there is no mathematical reason why numerical subscripts cannot be
used for these purely nominal or naming purposes.
Note that there is a good mathematical reason for not choosing to
represent Tom's mowing time by the symbol 8, because all number
symbols, including 8, are already implicitly given as concepts in a
story-algebra problem, and Tom's mowing time is not known to be
equal to 8. This sort of objection does not apply to the use of numbers
in a purely nominal way in subscripts, and, of course, nominal numerical
subscripts are used frequently in mathematics, science, and
engineering. Often the problem gives only the names "first force" or
"second force," and in such cases the obvious representation is $f_1$
and $f_2$.
In problems with many unknown quantities, it is desirable, for the
same reason of minimizing the load on human memory, to choose
letters that have easy mnemonic relations to the full names of the
quantities represented. For example, we choose t or T for time
quantities and r or R for rate quantities. If there are many different
quantities of the same type, such as several different times, then it
may be best to represent them by a two-symbol code, such as $t_T$,
$t_D$, and $t_H$, where the t indicates a time and the subscript
indicates which time. The subscript should also have some easy mnemonic
relation to the full name of the quantity represented. In the present
problem, we might initially have chosen $t_T$, $t_D$, and $t_H$ to stand
for the times that Tom, Dick, and Harry mow, though, as already noted,
this is unnecessary.
Now here is the solution. This is a simple work-rate problem. The
primary equation to use in such problems is that work equals the sum
of all the rate-times-time components. In this problem, that means
setting up an equation such as
$$70(t - 30) + 50t + 40t = 5{,}900$$
where
$t$ = time Dick and Harry mow
$t - 30$ = time Tom mows
Solving the above equation gives $t = 50$, and therefore Tom mows for
20 minutes.
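The algebra in this solution can be checked mechanically. The Python sketch below is one way to do so, not part of the original solution; the function name `mowing_times` and its parameter names are hypothetical. It collects the terms in t, giving (70 + 50 + 40)t = 5,900 + 70 · 30, and solves for t with exact fractions.

```python
from fractions import Fraction

def mowing_times(area, rate_tom, rate_dick, rate_harry, delay):
    """Solve rate_tom*(t - delay) + rate_dick*t + rate_harry*t = area for t.

    Returns (t, t - delay): the minutes mowed by Dick and Harry, and by Tom.
    """
    total_rate = rate_tom + rate_dick + rate_harry
    # Moving the constant term rate_tom*delay to the right-hand side:
    t = Fraction(area + rate_tom * delay, total_rate)
    return t, t - delay

# Values taken from the problem statement.
t, tom = mowing_times(5900, 70, 50, 40, 30)
```

With the numbers from the problem, `t` comes out as 50 and `tom` as 20, in agreement with the solution above.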
SOME IMPORTANT MATHEMATICAL CONCEPTS
Ordered and Unordered Pairs
Ordered pairs without replacement. Each of the possible permutations
of m things taken two at a time is just another name for an ordered pair
of elements (i, j), such that both i and j are members of the set of m
things and i and j are not the same thing or element of the set. This is
sampling twice from a set of m elements without replacement of the
element already sampled. Ordered pairs of this type (permutations)
are frequently involved in problems.
For example, consider a problem in which some group has the ritual
of everyone kissing everyone else on the forehead, and you are
supposed to determine the number of people in the group from the number
of kisses or vice versa. Kissing on the forehead is represented by an
ordered pair of persons in which the first member of the pair is the
kisser and the second member is the person kissed. Having recognized
that this is an ordered pair (permutation) problem, you can solve the
problem in a manner similar to the way you solved other problems:
namely, by determining how many ways there are to fill each position
in the ordered pair. In this problem, there are m ways to fill the first
position, and, with it filled, there are m - 1 ways to fill the second
position. Hence, there are m(m - 1) ways to fill both positions (with
different elements). Thus, the m persons exchange m(m - 1) forehead
kisses. Of course, to determine m from m(m - 1), you have to solve a
quadratic equation, but in the average problem of this type, that would
be very simple.
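Both directions of the forehead-kiss problem can be sketched in Python. This is an illustration under stated assumptions, not a method taken from the text: the function names `forehead_kisses` and `group_size` are hypothetical, and the second function solves m(m - 1) = k through the quadratic formula m = (1 + √(1 + 4k))/2, accepting only whole-number answers.

```python
from itertools import permutations
from math import isqrt

def forehead_kisses(m):
    """Count ordered pairs (kisser, kissed) of distinct people by listing them."""
    return sum(1 for _ in permutations(range(m), 2))

def group_size(kisses):
    """Solve m*(m - 1) = kisses for a positive integer m, or return None."""
    discriminant = 1 + 4 * kisses
    root = isqrt(discriminant)
    if root * root != discriminant:
        return None  # no whole number of people gives this many kisses
    m = (1 + root) // 2
    return m if m * (m - 1) == kisses else None
```

For instance, five people exchange 20 forehead kisses, and 20 kisses lead back to a group of five.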
Ordered pairs with replacement. A known or unknown point in a
plane is often best represented by an ordered pair of symbols
representing its distances from each of two (usually perpendicular)
axes, though, of course, in many problems a point can be represented by
just a single symbol such as A or a dot on a page. However, frequently
the representation of a point in a plane should be by an ordered pair
of symbols: an ordered pair of numbers for a known point, an ordered
pair of letters for an unknown point. Since the values of the two
coordinates can be equal, the representation of a point in a plane by an
ordered pair is an example of sampling twice with replacement from the
entire population of coordinate values (whether finite or infinite).
Sampling with replacement means that when you have chosen the value
for the first coordinate, you put that value back into the population
so that it can be drawn again as the value for the second coordinate.
Thus, if there were m possible coordinate values, there would be $m^2$
possible points, that is, $m^2$ ordered pairs of coordinate values.
Unordered pairs without replacement. Although forehead kisses are
represented by ordered pairs of persons, lip kisses (assuming mutual
kissing) are represented by unordered pairs. That is, if A kisses B,
B is assumed also to be kissing A. Thus, there is no basis for
distinguishing between (A, B) and (B, A).
This is a combinations problem rather than a permutations problem.
As we discussed in Chapter 9, the number of combinations of m things
taken two at a time is m(m - 1)/2. We derived this figure by reasoning
as follows: The number of permutations is m(m - 1), and there are
two permutations for every combination (two ordered pairs for each
unordered pair). Thus, we should divide the number of ordered pairs
(permutations) by 2 to get the number of unordered pairs (combinations).
A line segment is determined by its two end points. Often a line
(segment) will be defined as an unordered pair of (distinct) points,
with every different unordered pair of points in the total set of points
representing a different line. Notice that a line is indeed an unordered
pair of points, not an ordered pair, in most cases. However, it is
perfectly possible to have directed line segments in a problem (that is,
lines with arrows on one end, or vectors), in which case the line (A, B)
is different from the line (B, A), where A and B are two different
points in the set of points.
The above two examples of unordered pairs are instances of
unordered pairs where the sampling is without replacement. A person is
not assumed to be able to kiss himself on the lips. A line segment is
determined by two end points that are distinct, that is, two different
points.
Unordered pairs with replacement. We can also obtain unordered
pairs by sampling from a population with replacement. An example
might be the distinct combinations obtained by throwing two dice,
which might produce these results: 6-6, 6-5, 6-4, 6-3, 6-2, 6-1, 5-5,
5-4, ..., 2-2, 2-1, 1-1. To compute the number of distinguishable
throws of two dice, we reason that the order of the two dice is not
important and thus 6-5 is the same outcome as 5-6.
Offhand you might think that the number of unordered pairs obtained
by sampling with replacement would be equal to the number of ordered
pairs sampling with replacement divided by 2, as was the case for
sampling without replacement (permutations and combinations).
However, this is not the case. Most of the unordered pairs of outcomes
obtained by sampling with replacement do indeed have two distinct
ordered-pair counterparts, but there is a subset of the unordered
pairs each of which has only a single ordered-pair counterpart. These
latter pairs are 6-6, 5-5, 4-4, 3-3, 2-2, and 1-1.
In general, if you are sampling with replacement two times from a
population of m elements, the number of different unordered pairs
will be
$$m + \frac{m^2 - m}{2} = \frac{m(m + 1)}{2}$$
This result is obtained by reasoning as follows: exactly m of the
unordered pairs are of the form i-i, where i = 1, ..., m. Each of these
has a single ordered-pair counterpart. The remaining $m^2 - m$ ordered
pairs supply exactly two ordered-pair counterparts for each remaining
unordered pair. Thus, the quantity ($m^2 - m$) should be divided by
2 to get the number of those unordered pairs, and m should be added to
this total to get the total number of different unordered pairs.
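The four kinds of pairs discussed so far can be counted by brute-force listing, which gives a check on the formulas m(m - 1), $m^2$, m(m - 1)/2, and m(m + 1)/2. The Python sketch below uses the standard `itertools` module; the function name `pair_counts` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)

def pair_counts(m):
    """Count the four kinds of pairs drawn from m elements by listing them all."""
    items = range(m)
    return {
        "ordered, without replacement": len(list(permutations(items, 2))),
        "ordered, with replacement": len(list(product(items, repeat=2))),
        "unordered, without replacement": len(list(combinations(items, 2))),
        "unordered, with replacement": len(
            list(combinations_with_replacement(items, 2))
        ),
    }

# Two dice: m = 6 faces on each die.
counts = pair_counts(6)
```

For two dice the listing gives 30, 36, 15, and 21 pairs respectively, so the number of distinguishable throws is 21, not 36/2.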
Systematic listing of unordered pairs. Occasionally we need to list
all the unordered pairs that can be obtained by sampling, either with
or without replacement, from some population. The efficient way to
accomplish such a listing is to put all of the elements in the
population into an ordering, whether they have any natural ordering or
not. Having ordered the elements from 1 to m, we can then list all of
the unordered pairs by proceeding as follows: If the unordered pairs are
being obtained by sampling with replacement, we take each element from
the population and pair it with itself and every element below it in the
ordering. Thus, we would obtain pairs such as 5-5, 5-4, 5-3, 5-2, 5-1,
4-4, 4-3, 4-2, 4-1, 3-3, 3-2, 3-1, 2-2, 2-1, 1-1. If the unordered pairs
are being obtained by sampling without replacement, we take each
element in the population and pair it with each of the elements below
it in the ordering. Thus, we would have listings such as 5-4, 5-3, 5-2,
5-1, 4-3, 4-2, 4-1, 3-2, 3-1, 2-1.
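The listing procedure just described translates directly into two nested loops. The following Python sketch is one possible rendering, with the hypothetical function name `unordered_pairs`; it assumes the elements have already been numbered 1 to m and produces the pairs in the same order as the listings above.

```python
def unordered_pairs(m, with_replacement):
    """List unordered pairs of the elements 1..m.

    Each element is paired with every element below it in the ordering,
    and also with itself when sampling is with replacement.
    """
    pairs = []
    for high in range(m, 0, -1):
        # Start at the element itself only if it may be drawn twice.
        start = high if with_replacement else high - 1
        for low in range(start, 0, -1):
            pairs.append((high, low))
    return pairs
```

Calling `unordered_pairs(5, True)` yields the fifteen pairs from 5-5 down to 1-1, and `unordered_pairs(5, False)` yields the ten pairs from 5-4 down to 2-1.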
Importance of ordered and unordered pairs. Ordered and unordered
pairs are very common concepts in problems, and you should be alert
for the possibility of representing concepts as ordered or unordered
pairs of other concepts, frequently an important step in solving the
problem. In a sense, this representation involves your recognizing a
relation between different concepts and incorporating that relation into
your representation of concepts in the problem, which is similar to
representing concepts by expressions, as we discussed previously.
You should also be careful to note whether the ordered and unordered
pairs are being obtained with or without replacement, for, as stated
earlier, it makes a considerable difference in the number of such pairs.
Often you will realize that some concept can be represented by a
pair of other concepts before you realize whether the pair should be
considered an ordered pair or an unordered pair and whether the
sampling is with or without replacement. However, by being explicitly
aware of the distinctions, you will try to decide whether the pair is
ordered or unordered and whether the sampling is with or without
replacement, before trying to use the concept in solving the problem.
Ordered and Unordered Sets
Ordered and unordered pairs generalize easily to ordered and unordered
sets of more than two elements. So we can have an ordered or unordered
set of three elements (A, B, C), four elements (w, x, y, z), and so on.
Ordered sets without replacement. Each permutation of m things taken
n at a time (n ≤ m) is an ordered set of n elements. In getting any
particular permutation, we are sampling n times without replacement
from a set of m elements. Thus, there are m possible ways to fill the
first position (m possible elements that could be selected first), (m - 1)
ways to fill the second position, ..., and (m - n + 1) ways to fill the
nth (last) position, or m(m - 1) ... (m - n + 1) = m!/(m - n)!
different ways to select all n elements (assuming all m elements are
distinct). An example of a permutation problem (ordered sets, sampling
without replacement) is as follows:
A gym teacher wishes to put on a balancing demonstration in which one
of the stunts will be to have four boys stand on each other's shoulders
in a single tower. Out of the class of 20 boys, the gym teacher wishes to
select the most stable tower of four boys. To do this he plans to try each
possible tower of four boys once and time how long they are able to
balance successfully on each other's shoulders without falling over. How
many such towers of four boys must the gym teacher investigate?

Since there are 20 boys in the gym class and a tower of four boys
constitutes an ordered set of four elements sampled from the class
without replacement, this is a permutations problem. Therefore, the
number of possible towers is 20!/(20 - 4)! = 20!/16! = 20 × 19 × 18 × 17
= 116,280.
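As a quick check of the permutation count, here is a minimal Python sketch. It multiplies out the m(m - 1) ... (m - n + 1) choices position by position and compares the result with the factorial form; the function name `count_permutations` is an assumption of this sketch, not notation from the text.

```python
import math

def count_permutations(m, n):
    """Ordered sets of n elements sampled without replacement from m."""
    count = 1
    for choices in range(m, m - n, -1):  # m, m-1, ..., m-n+1
        count *= choices
    return count

towers = count_permutations(20, 4)
# Same value from the factorial form m!/(m - n)!
factorial_form = math.factorial(20) // math.factorial(16)
print(towers, factorial_form)  # 116280 116280
```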
Ordered sets with replacement. A point or vector in n-dimensional
space can be represented by an ordered set of its n coordinates
(x1, x2, ..., xn). Such an ordered set of n elements is obtained by
sampling with replacement from a population of, say, m possible
coordinate values exactly n times. The number of possible ordered sets
obtained by sampling n times from a population of m elements is equal
to m^n.
Unordered sets without replacement. A triangle is an unordered
set of three different points (sampling three times without replacement
from the set of all points). A quadrilateral is an unordered set of four
different points in a plane. Each combination of m things taken n at
a time (n ≤ m) is an unordered set of elements such that none of the
elements is identically the same element (that is, one is sampling
n times without replacement from a set of m elements). The number of
combinations of m things taken n at a time is simply the number of
permutations of m things taken n at a time, divided by the number of
different permutations for the same combination of n elements. Since
there are n! different permutations for each combination of n elements,
there are m!/[n!(m - n)!] combinations of m things taken n at a time.
An example of a combinations problem is as follows:

How many different bridge hands (13 cards) can be obtained by dealing
13 cards out of a standard 52-card deck?

Dealing 13 cards from the standard 52-card deck is sampling without
replacement. The order in which the cards are dealt to someone
makes no difference in defining a bridge hand. Therefore, this is a
combinations problem, that is, a problem involving an unordered set
obtained by sampling without replacement. Thus, the number of bridge
hands is 52!/[13!(52 - 13)!] = 52!/(13! 39!).
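The argument above (permutations divided by the n! orderings of each combination) can be sketched directly in Python. The helper name `count_combinations` is invented for this illustration; `math.comb` from the standard library (Python 3.8 or later) is used only as a cross-check.

```python
import math

def count_combinations(m, n):
    """Unordered sets of n elements sampled without replacement from m."""
    ordered = math.factorial(m) // math.factorial(m - n)  # permutations
    return ordered // math.factorial(n)                   # collapse n! orderings

bridge_hands = count_combinations(52, 13)
print(bridge_hands)  # 635013559600
```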
Unordered sets with replacement. Thus far we have discussed ordered
sets obtained by sampling with and without replacement and unordered
sets obtained by sampling without replacement. Computing the number
of unordered sets obtained by sampling with replacement is a far
less trivial problem. The problem has an extremely elegant solution,
which I found in Feller (1957, p. 38) and which I think provides a good
example of how clever representation of the information in the problem
can facilitate its solution. Thus, we will examine the solution from
two viewpoints: that of determining the formula for the number of
unordered sets obtained by sampling with replacement, and that of
having an elegant example of how to represent information in a
particular class of problems.
The basic problem is to determine how many unordered sets we can
obtain by sampling n times with replacement from a population of m
elements. The solution is obtained easily by considering the m elements
of the population to be represented by the spaces between m + 1
boundary markers ordered along a line. That is, we will imagine we
have a line with m + 1 interval boundaries marked off along that line
defining m intervals, as shown in Fig. 10-2. In the figure, the n elements
sampled are represented by circles between various boundary markers.
The number of circles between the first and second boundary marker
represents the number of times the first element was sampled. The
number of circles between the second and third boundary marker
represents the number of times the second element in the population
was sampled, and so on. With this representation, we can easily
compute the number of different unordered sets that can be obtained
by sampling n times with replacement from a population of m elements.
The two end boundary markers out of the m + 1 boundary
markers must remain fixed at the ends. The remaining m - 1 boundary
markers and n elements sampled can be rearranged at will. If we
consider any particular rearrangement to be obtained by sampling n times
without replacement from the population of n + m - 1 elements (that is,
by choosing which n of the n + m - 1 positions hold circles), then
the number of such rearrangements of the lines and circles is easily
determined, namely, (n + m - 1)!/[n!(m - 1)!].

| o o | o o o | | o o o o | ... | o |
FIGURE 10-2
Clever reformulation of the information in the problem of
determining the number of unordered sets that can be obtained
by sampling n times from a population of m elements.
Thus, we have transformed the problem of sampling n times with
replacement from a population of m elements to the problem of
sampling n times without replacement from a population of n + m - 1
elements. In both cases, we are trying to compute the number of unordered
sets obtained by such a sampling. Since we know the solution to the
problem of obtaining the number of unordered sets obtained by
sampling without replacement (combinations), we now know the
answer to the problem of determining the number of unordered sets
obtained by sampling with replacement.
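The boundary-marker representation can be checked by brute force for small cases. The sketch below is illustrative only: the function names are assumptions of this example, the population is labeled 0 to m - 1 for convenience, and `math.comb` requires Python 3.8 or later.

```python
import math
from itertools import combinations_with_replacement

def multiset_count(m, n):
    """Formula from the text: (n + m - 1)! / [n! (m - 1)!]."""
    return math.comb(n + m - 1, n)

def brute_force_count(m, n):
    """Enumerate every unordered sample of size n drawn with replacement."""
    return sum(1 for _ in combinations_with_replacement(range(m), n))

def to_bars_and_circles(sample, m):
    """Draw a sample as circles (o) between m + 1 boundary markers (|)."""
    cells = ["o" * sample.count(element) for element in range(m)]
    return "|" + "|".join(cells) + "|"

print(multiset_count(5, 2), brute_force_count(5, 2))  # 15 15
print(to_bars_and_circles((0, 0, 2), 3))              # |oo||o|
```

The value 15 for m = 5 and n = 2 agrees with the count of unordered pairs with replacement listed earlier in the chapter.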
Relations
Relations are labeled connections between concepts. Examples of
relational concepts include, among others: father of, brother of, sibling
of, descendant of, prior to, less than, equal to, identical to, heavier than,
older than, beside, includes, is a member of. Relations can be written
as "a R b" (meaning perhaps "a is the father of b") or as R(a, b).
In the latter case, R(a, b) is true (equals 1) if and only if the relation R
obtains for the ordered pair (a, b), and R(a, b) is false (equals zero)
if and only if the relation R does not obtain for the ordered pair (a, b).

Relations can be classified according to whether they satisfy certain
properties (axioms). For example, a relation can be reflexive
(a R a, for all a in some set), antireflexive (a not-R a, for all a in some
set), or neither. Relations can be symmetric or commutative (a R b
implies b R a, for all a and b in some set), antisymmetric (a R b
implies b not-R a, for all a and b in some set), or neither. Relations can
be transitive (a R b and b R c imply a R c, for all a, b, and c in some
set) or not. Relations that are reflexive, symmetric, and transitive
form an especially important class of relations known as equivalence
relations. "Equal to" and "identical to" are equivalence relations, but
so are "has the same weight as," "is the same color as," and "is just
as good as."
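For a finite set, the properties just defined can be tested mechanically. The following Python sketch treats a relation as a function R(a, b) returning True or False; the names and the weights are hypothetical data chosen only for illustration.

```python
def is_reflexive(R, S):
    return all(R(a, a) for a in S)

def is_symmetric(R, S):
    return all(R(b, a) for a in S for b in S if R(a, b))

def is_transitive(R, S):
    return all(R(a, c) for a in S for b in S for c in S
               if R(a, b) and R(b, c))

def is_equivalence(R, S):
    return is_reflexive(R, S) and is_symmetric(R, S) and is_transitive(R, S)

# Hypothetical weights in pounds, invented for this example.
weights = {"Ann": 120, "Bea": 120, "Cal": 150}
people = set(weights)

def same_weight(a, b):
    return weights[a] == weights[b]

def heavier_than(a, b):
    return weights[a] > weights[b]

print(is_equivalence(same_weight, people))   # True
print(is_equivalence(heavier_than, people))  # False
```

"Heavier than" is transitive but neither reflexive nor symmetric, so it fails the test, whereas "has the same weight as" passes all three checks.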
Some students find it difficult to distinguish the concepts of
equivalence, equality, and identity, and indeed the meanings of these
concepts are somewhat variable, especially in going from science to
mathematics. To get an idea of why we sometimes need to distinguish
them, consider the following examples.
Suppose you were thinking about your prospects for immortality, and
you imagined that it might be possible to form a complete duplicate of
yourself that was the same configuration of molecules as yourself but,
of course, used different molecules. Such a duplicate would be equal
to you in every respect, but the duplicate would not be identical to
you, since the two of you in fact would be two different parts of the
universe. Identical twins are considered to be genetically equal, but
they are not identical, since they are two different entities. All the
molecules of some class are considered to be equal, but since there is
more than one such molecule in each class, the molecules of a given
class are not identical. A thing is identical only to itself, but it can
be equal to all duplicates. Of course, many people feel that it is likely
that no two entities can be exact duplicates in every respect, but this
is not too important for the definition of equality. We can simply say
that two things are equal if there is no property that we can currently
determine to distinguish them (except of course the fact that they are
not the identically same entity).
Two entities may be equivalent to each other in some one respect
(one property) without being equal to each other (equivalent to each
other in all respects). This is pretty obvious when it is put this way.
Two girls can have the same weight (to the nearest pound), have the
same shooting percentage in basketball, have the same number of
points on a test, and so on. Thus, it is clear that (a) two names of
objects that are identically the same really refer to the same object,
(b) two names of objects that are equal refer to two different objects
that are equivalent in all respects, and (c) two names of objects that
are equivalent in some respect are not necessarily identical or equal
in all respects.
Sometimes there is no need to distinguish all three types of
"sameness" concepts. For example, in real-variable mathematics, where a
symbol has only one property, its numerical value, there is no need to
distinguish equivalence and equality. Furthermore, our definition of
identity is not quite the same intuitively as it is when we discuss
objects presumed to exist in the real world. In mathematics not viewed
as applying to the real world, two expressions are identically equal
(≡) if and only if they have the same values across all substitutions
for free variables (for example, x^2 - 1 ≡ [x + 1][x - 1]). Two
expressions are often said to be equal (=) if and only if they have the same
values for at least one substitution for free variables (for example,
x^2 + 2 = 6 for the substitution x = 2).
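The difference between "identically equal" and "equal for some substitution" can be illustrated by trying a range of substitutions. This is only a sketch: testing finitely many integers cannot prove an identity, and the helper name `substitutions_where_equal` is made up for this example.

```python
def substitutions_where_equal(lhs, rhs, values):
    """Return the substitutions for x at which the two expressions agree."""
    return [x for x in values if lhs(x) == rhs(x)]

values = list(range(-10, 11))

# x^2 - 1 and (x + 1)(x - 1) agree at every value tried.
identity_hits = substitutions_where_equal(
    lambda x: x**2 - 1, lambda x: (x + 1) * (x - 1), values)

# x^2 + 2 and 6 agree only at particular values of x.
equation_hits = substitutions_where_equal(
    lambda x: x**2 + 2, lambda x: 6, values)

print(identity_hits == values)  # True
print(equation_hits)            # [-2, 2]
```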
Operations
In Chapter 2, operations were contrasted with givens. Intuitively,
operations were the things that you could do to change the state of
the problem and givens were the materials you had to work with (the
starting point or initial problem state). You could attach a symbol to
each operation that you could perform to change the problem state,
but what would that accomplish? For one thing, it might allow you
more conveniently to list the alternative actions that could be
performed at each node in the state-action tree. However, this is not the
primary reason for attaching symbols to operations.
The primary reason for using a symbol to represent an operation is
that you can formulate problems in which the givens are composed
of statements that involve action concepts as well as object or property
concepts. Consider a problem to solve two linear equations in the
variables x and y for the values of x and y, for example, 2x + y = 1
and x - 6y = 20. Addition, subtraction, and multiplication operations
are indicated in each of these statements, but the particular actions
they indicate (for example, multiply y by 6) are not the actions you
take in solving the problem. The actions you take are to add, subtract,
multiply, or divide both sides of some equation by the same quantity
and to substitute equals for equals.
By contrast, there are problems in which you might take the
operations of addition, subtraction, multiplication, and so on, of two
quantities as the operations used at various nodes of the state-action tree.
For example, water-jar problems (such as the one discussed in Chapter
8) involve presenting several jars that hold different quantities of
liquid. One is asked to produce some quantity of liquid that is different
from the capacity of any jar. The method of solution involves, in
essence, adding and subtracting the capacities of each of the jars.
Imagine that you have a five-quart jar and a three-quart jar and are
attempting to obtain exactly one quart of water. You could fill the
three-quart jar, pour it into the five-quart jar, fill the three-quart jar
again, pour two quarts from it into the five-quart jar (which is then
full), and have exactly one quart left in the three-quart jar, as required.
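The four actions in the water-jar example can be traced with a small simulation. This is a minimal sketch, assuming a state is simply the amount of water in each jar; the names `fill`, `pour`, and the jar labels are invented for the illustration.

```python
CAPACITY = {"five": 5, "three": 3}  # jar capacities in quarts

def fill(state, jar):
    """Fill one jar to its capacity."""
    new_state = dict(state)
    new_state[jar] = CAPACITY[jar]
    return new_state

def pour(state, src, dst):
    """Pour from src into dst until src is empty or dst is full."""
    amount = min(state[src], CAPACITY[dst] - state[dst])
    new_state = dict(state)
    new_state[src] -= amount
    new_state[dst] += amount
    return new_state

state = {"five": 0, "three": 0}
state = fill(state, "three")          # five: 0, three: 3
state = pour(state, "three", "five")  # five: 3, three: 0
state = fill(state, "three")          # five: 3, three: 3
state = pour(state, "three", "five")  # five: 5, three: 1
print(state)                          # {'five': 5, 'three': 1}
```

Each call corresponds to one action at a node of the state-action tree, and each returned dictionary is the resulting problem state.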
In water-jar problems, adding and subtracting quantities are the
operations in the state-action tree. In solving equations, adding and
subtracting quantities are operations used in constructing statements;
adding and subtracting the same quantities on both sides of an equation
are operations in the state-action tree. Perhaps there should be two
different names to distinguish operations at these two different levels
in a problem. In any event, it is important for clear thinking to keep
operations at the two levels distinct in your mind.
Mappings and Functions
Imagine that you have the elements of some set A (the argument set),
the elements of another set T (the target set), and a set of arrows, each
going from one member of the argument set to one member of the target
set. Not all of the elements of set A need have arrows going from them.
Some elements of set A may have several arrows going from them to
different elements in set T. Some of the elements of set T may have no
arrows going into them, and some may have several arrows going into
them. Any such system of arrows linking two sets is called a mapping.

Although it is easier to explain mappings in diagrammatic (spatial)
terms, a mapping can be represented verbally as a set of ordered pairs
where the first member of the pair is an element from set A and the
second is a corresponding element in set T. The number of ordered
pairs in the set is the same as the number of arrows in the diagrammatic
representation. Of course, most mappings of interest are represented
more simply by a rule that allows us to compute the element
in T associated with each element in A.
One example of a mapping is illustrated in Fig. 10-3, which suggests
that sets A and T are completely distinct, that is, have no elements
in common (are nonoverlapping). This need not be the case. Sets A and
T could be identical sets, either one could be completely included in
the other, they could be overlapping (some elements common to both
sets, but some elements in each set being not contained in the other
set), or they could be nonoverlapping. In short, any set relation is
possible for sets A and T. An example of a mapping involving two
overlapping sets is shown in Fig. 10-4.

FIGURE 10-3
A mapping from set A to set T, where
set A and set T are nonoverlapping.
In addition to the relation between the argument set (set A) and the
target set (set T) as just discussed, mappings have other properties.
A mapping is complete if and only if it is defined over all members of
the argument set (that is, all members of set A have an arrow going
from them); otherwise it is incomplete. Often we assume that all
mappings are complete, since the argument set could always be reduced to
those elements for which the mapping is defined, with no loss of
information regarding the mapping other than that regarding its
incompleteness.

FIGURE 10-4
A mapping from set A to set T, where set A and set T
are overlapping. In this case, the sets have two common
elements. Sets are enclosed in circles. Common elements
are in the overlapping part of the two circles.
A mapping is single valued (or is a function) if and
only if each element in the argument set maps into no more than one
element in the target set (that is, there is no more than one arrow going
out of each element in set A). If a mapping is both single valued and
complete, then exactly one arrow goes out of each element in set A.
The term function is sometimes reserved only for single-valued
mappings, but frequently the expression multiple-valued function is also
heard, which indicates that there is not complete consistency in the
restriction of the term function to single-valued mappings. An example of
a complete and single-valued mapping is shown in Fig. 10-5.
A mapping is an onto mapping if and only if every element of the
target set has an arrow going into it. Another way to say this is that the
inverse mapping (going backward along the arrows from set T to set A)
is complete (that is, defined over every element in set T). An example of
an onto mapping is shown in Fig. 10-6.

FIGURE 10-5
A single-valued mapping from set A to
set T. The mapping is also complete
since all members of set A have arrows
going from them. Note that the
mapping is single valued in going from
set A to set T but not in the inverse
direction, nor is the inverse mapping
even complete (arrows do not go into
every element in set T).
A mapping is one-to-one if and only if every element of the target
set has no more than one arrow going into it. Another way to say
this is that the inverse mapping (going backward along the arrows from
set T to set A) is single valued. Note that a one-to-one mapping need
not be onto, just as a single-valued mapping need not be complete.
That is, the one-to-one property requires that no more than one arrow
go into each element in the target set, whereas the onto property
requires that at least one arrow go into each element in the target set.
Similarly, the single-valued property requires that no more than one
arrow go out from each element in the argument set, whereas the
completeness property requires that at least one arrow go out from each
element in the argument set. An example of a one-to-one mapping that
is not single valued, onto, or complete is shown in Fig. 10-7.
FIGURE 10-6
An onto mapping from set A to set T,
meaning that the inverse mapping
from set T to set A is complete
(defined over all elements in set T).
The mapping from set A to set T is
also complete but is not single valued.

FIGURE 10-7
A one-to-one mapping from set A to set
T that is not complete (two elements of
set A have no arrows going from them),
not onto (one element of set T has no
arrow going into it), and not single
valued (some elements of set A have
more than one arrow going from them
to different members of set T).
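The four properties of mappings can be summarized in code by counting arrows out of each argument element and into each target element. The sketch below represents a mapping as a set of ordered pairs, as described in the text; the function name and the two small example mappings are assumptions made for illustration and are not the mappings drawn in the figures.

```python
from collections import Counter

def mapping_properties(pairs, A, T):
    """Classify a mapping given as a set of (argument, target) pairs."""
    arrows_out = Counter(a for a, _ in pairs)  # arrows leaving each element of A
    arrows_in = Counter(t for _, t in pairs)   # arrows entering each element of T
    return {
        "complete": all(arrows_out[a] >= 1 for a in A),
        "single_valued": all(arrows_out[a] <= 1 for a in A),
        "onto": all(arrows_in[t] >= 1 for t in T),
        "one_to_one": all(arrows_in[t] <= 1 for t in T),
    }

A = {1, 2, 3}
T = {"p", "q", "r"}

# Complete and single valued, but neither onto nor one-to-one.
f = {(1, "p"), (2, "p"), (3, "q")}
print(mapping_properties(f, A, T))

# One-to-one, but not complete, not single valued, and not onto.
g = {(1, "p"), (1, "q")}
print(mapping_properties(g, A, T))
```

Note the symmetry: "onto" and "one-to-one" are the "complete" and "single valued" tests applied to the arrows read in the inverse direction.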
Real-Valued Functions of a Real Variable
The most commonly encountered functions are real-valued functions
of one or more real variables (real arguments). A statement such as
y = f(x) means that an element x is mapped into an element y according
to the rule (function) represented by the symbol f. In this section, we
are assuming that x and y are real numbers (the ordinary positive and
negative integers, fractions, square roots of positive integers, and so on,
with which we are familiar). Examples of functions that f could
represent include polynomials with known coefficients (y = 7x + 2,
y = 4x^3 - 3x^2 + 37), polynomials with unknown coefficients
(y = ax^2 + bx + c), and trigonometric functions (y = sin x + a tan^3 x).
In some problems, you may know that one variable is a function of
one or more other variables, but it may take some problem solving
to determine exactly what the function is. In such cases, it is generally
helpful to assign some symbol like f to the unknown function and
write equations involving the unknown function [for example, y = f(x)].
In some problems, you can reduce the number of symbols you have
to remember by just writing an expression such as y = y(x), meaning
that the variable y is a function of the value of x, but without bothering
to give a separate name to the function (separate from the value of the
function when x is the argument). This is a useful mnemonic trick in
simplifying notation in problems where there is no possibility of
confusing the concept of the function (f) with the concept of the dependent
variable (y). However, when such confusion is possible, this trick
should be avoided.
11
Problems from Mathematics,
Science, and Engineering
This chapter is designed to establish the generality of the
problem-solving methods discussed throughout the book. In previous chapters,
the problems used to illustrate the methods were deliberately selected
so that they could be solved by the reader with no more background
than a high school student with one year of algebra and one year of
plane geometry. Many of the problems were of the puzzle (or brain
teaser or recreational mathematics) variety, which require no
specialized knowledge of mathematics, science, or engineering. Although
methods for solving such problems have some recreational interest,
there is also a serious practical reason for mastering them, for they
are also useful for solving serious problems in all areas of mathematics,
science, and engineering. This chapter is designed to demonstrate this
applicability and to give the reader some experience in it.
ALGEBRA
The solution of systems of simultaneous linear equations provides a
simple example of the use of evaluation functions, hill climbing, and
subgoals. As an example, consider the following system of three
linear equations:

    2x +  y -  3z = 1    (E1)
     x + 2y +  5z = 9    (E2)
    3x - 3y - 10z = 4    (E3)
The operations available for solving such a system are essentially the
following. We can (a) multiply both sides of an equation by the same
number, (b) add equals to equals (or subtract equals from equals), and
(c) substitute equals for equals. As an example of the first, consider the
action of multiplying both sides of equation (E2) by the number -2. This
yields the equation -2x - 4y - 10z = -18. As an example of the second
operation, consider the action of adding the equation just derived from
(E2) to (E1):

     2x +  y -  3z =   1    (E1)
    -2x - 4y - 10z = -18    (-2)·(E2)
        - 3y - 13z = -17
          3y + 13z =  17    (E4)

As an example of the third operation, consider the following substitution
of an expression for x, derived from (E2), into (E1):

    x = -2y - 5z + 9                (E2)
    2(-2y - 5z + 9) + y - 3z = 1    (E1)

If we were given a particular numerical value for x, we could, of course,
also substitute that particular value for x anywhere it appeared in any
equation. The goal is to derive three expressions of the form x = __,
y = __, and z = __, where specific numbers appear in the blanks.

Now stop reading and try to solve the problem.
The solution of such a problem involves primarily the use of an
evaluation function and the subgoal method, with perhaps a little use
of hill climbing. The evaluation function is concerned with the number
of variables (unknowns) in each equation and the number of
independent equations involving any particular set of variables (unknowns).
The original system of equations consists of three equations, each of
which has three variables (unknowns). From this starting point, a
more highly valued state would be one in which we had two equations
involving the same two unknowns. An even more highly valued state
would be one in which we had one equation involving one unknown.
Still more highly valued would be a state in which we had two
equations, each of which involved a different, single unknown. The most
highly valued state of all, short of solution, would be one in which
we had three equations, each involving a different, single unknown.
For the purpose of defining the present state-evaluation function, note
that we have ignored the subproblem of solving a single linear
equation with one variable for the value of the unknown, since we assume
that to be a trivial subproblem whose method of solution is already
well known. We have not bothered to assign numbers to states that
have the above-mentioned properties, because this is unnecessary for
solving this problem. There are several ways to assign specific
numbers to these states, and any of them would be equally satisfactory
as a guide to the definition of successive subgoals in solving the problem.
In learning to solve systems of linear equations by means of the
above three operations, you should first master the solution of linear
equations involving one unknown, then systems with two independent
equations involving two unknowns, then three independent equations
involving three unknowns, and so on. You should learn that the first
subgoal to achieve in a system of M equations with M unknowns is to
derive a system of M - 1 equations involving M - 1 unknowns. The
next subgoal is to derive a system of M - 2 equations involving M - 2
unknowns, and so forth. Occasionally, it is possible to jump several
levels at once, and this is even better, but in general you must proceed
one step at a time. Now stop reading and solve the problem, if you did
not do so before.
To solve the above problem, we should first set a subgoal that we
must achieve a system involving two equations and two unknowns.
Somehow, then, we must derive two equations, from each of which we
have eliminated the same unknown. Since two such equations must
be derived, there are two parts to this first subgoal (two subgoals
of the first subgoal). There are a variety of ways to accomplish the
first subgoal, one of which is as follows:

     2x +  y -  3z =   1    (E1)
    -2x - 4y - 10z = -18    (-2)·(E2)
        - 3y - 13z = -17
          3y + 13z =  17    (E4)

     3x - 3y - 10z =   4    (E3)
    -3x - 6y - 15z = -27    (-3)·(E2)
        - 9y - 25z = -23    (E5)
Having achieved the first subgoal, the next subgoal is to solve this
system of two equations and two unknowns to derive a single equation
involving one unknown, as follows:

    -9y - 25z = -23    (E5)
     9y + 39z =  51    (3)·(E4)
          14z =  28
            z =   2

Having achieved the second subgoal (including finding the value of
one of the unknowns), it is time to proceed to the third subgoal of
deriving another single equation involving a single unknown. This
derivation can be done by using the substitution operation, as follows:

    3y + 13·2 = 17    (E4)
    3y + 26 = 17
    3y = -9
    y = -3

Finally, the fourth subgoal and the final component in the solution of
the problem is as follows:

    x + [(2)·(-3)] + [(5)·(2)] = 9    (E2)
    x - 6 + 10 = 9
    x = 5
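The sequence of subgoals can be retraced with a short Python sketch. It is not a general solver: it simply repeats the particular eliminations chosen in the text, storing each equation as a list of coefficients [x, y, z, right-hand side]. The helper names `scale` and `add` are assumptions of this illustration.

```python
from fractions import Fraction

def scale(eq, k):
    """Multiply both sides of an equation by the same number."""
    return [k * c for c in eq]

def add(eq1, eq2):
    """Add equals to equals."""
    return [a + b for a, b in zip(eq1, eq2)]

E1 = [2, 1, -3, 1]
E2 = [1, 2, 5, 9]
E3 = [3, -3, -10, 4]

# First subgoal: two equations in y and z only.
E4 = scale(add(E1, scale(E2, -2)), -1)  # 3y + 13z = 17
E5 = add(E3, scale(E2, -3))             # -9y - 25z = -23

# Second subgoal: one equation in z only.
z_eq = add(E5, scale(E4, 3))            # 14z = 28
z = Fraction(z_eq[3], z_eq[2])

# Remaining subgoals: substitute back to get y, then x.
y = (E4[3] - E4[2] * z) / E4[1]
x = (E2[3] - E2[1] * y - E2[2] * z) / E2[0]
print(x, y, z)  # 5 -3 2
```

Substituting the three values back into (E1), (E2), and (E3) is a cheap final check that no arithmetic slip occurred along the way.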
To solve the above problem, you have to know where you want to
go at all stages in its solution. That is, you must have an evaluation
function similar to that discussed here. The evaluation function
provides the means for defining a series of subgoals (subproblems) that
lead to the solution of the entire problem. Along the way, in the
achievement of some of the subgoals, one equation might be multiplied by a
number to yield an equation with the same coefficient for a particular
unknown as some other equation already obtained. This action
illustrates the use of another evaluation function, namely, getting two
equations to involve the same coefficient for a particular variable.
Since achieving this subgoal is relatively simple, we might view the
selection of the appropriate action to achieve this subgoal as hill
climbing. However, I think that viewing the solution in terms of the
subgoal method is far more accurate and important.
Now let us consider the solution of a very different type of equation:

    4^(x - 3) = 2^x · 3^(x + 1)

The goal is to derive an equation of the form x = __. The operations
available include all those specified in the previous problem. Also
available are operations that may be stated generally as "doing the same
thing to both sides of an equation": adding the same number to both sides,
subtracting the same number from both sides, multiplying or dividing both
sides of the equation by the same number, raising both sides to the same
power, taking the same root of both sides, or taking logs of both sides.
(Remember, however, that operations that increase the degree of an
equation will add roots and operations that reduce the degree of an
equation will subtract roots.) For the purposes of solving this problem,
the only properties of logarithms that we need to know are that
log(a^b) = b log a and that log(a·b) = log a + log b.

Now stop reading and try to solve the problem.
Since one property of the goal expression is that the x does not
appear in an exponent, one subgoal that can be defined immediately is
to derive an equation in which x does not appear in an exponent. Stop
reading and try to solve the problem, if you did not before.

This sort of problem would appear in conjunction with a chapter on
logarithms, since to achieve the subgoal we must take logarithms of
both sides of the equation, as follows:

    (x - 3) log 4 = x log 2 + (x + 1) log 3
    (log 4 - log 2 - log 3) x = log 3 + 3 log 4
    x = (log 3 + 3 log 4) / (log 4 - log 2 - log 3)
Of course, we must know the relevant properties of logarithms in
order to solve this problem. In addition, we must define the subgoal
of achieving an equation that is in a form for which we know the
appropriate solution methods, just as in the case of simultaneous linear
equations. In virtually all problems from mathematics, science, and
engineering, there is an interplay between the use of specialized
knowledge and the use of general problem-solving methods. Either the lack
of specialized knowledge or the failure to use general
problem-solving methods will result in failure to solve problems.
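The closed-form answer can be checked numerically. The sketch below assumes that the equation being solved is 4^(x - 3) = 2^x · 3^(x + 1), which is the equation implied by the first logarithmic line of the derivation; natural logarithms are used, though any base gives the same value of x.

```python
import math

# x = (log 3 + 3 log 4) / (log 4 - log 2 - log 3)
x = (math.log(3) + 3 * math.log(4)) / (math.log(4) - math.log(2) - math.log(3))

# Substitute back into the assumed original equation.
lhs = 4 ** (x - 3)
rhs = 2 ** x * 3 ** (x + 1)

print(x)
print(math.isclose(lhs, rhs, rel_tol=1e-9))  # True
```

The denominator is log(4/6), which is negative, so the solution turns out to be a negative number.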
In a similar vein, consider the following logarithmic equation:

    log10 (x - 1) + log10 5x = 1

Stop reading and try to solve the above equation for the value of
the variable x.

The solution is analogous to the solution of the previous problem;
namely, we set as a subgoal the derivation of an equation that is a
simple polynomial in x, for which we may know a solution method
(for example, factoring or substitution into the quadratic formula).
In this instance, we set as an initial subgoal the derivation of an
equation involving no log terms. That is, we attempt to eliminate logs
in the above equation. Stop reading and try to solve the problem,
making use of this subgoal.

Eliminating logarithms from the equation can be achieved by
exponentiating each side of the equation, as follows:
$$10^{\log(x - 1) + \log 5x} = 10^1$$
or
$$10^{\log(x - 1)} \cdot 10^{\log 5x} = 10$$
$$(x - 1) \cdot 5x = 10$$
$$5x^2 - 5x - 10 = 0$$
$$x^2 - x - 2 = 0$$
$$(x - 2)(x + 1) = 0$$
$$x = 2$$
$$x = -1$$
From the above, two roots for the equation were derived: x = 2 and x = -1. The latter root is not a solution to the original equation (substituting x = -1 would require the logarithm of a negative number); it is a root that was added via the exponentiating process, since exponentiating increased the degree of the equation. Operations that result in equations with added roots are not as dangerous to use as operations that result in equations that eliminate roots of the original equation. When roots are added, it is easy enough to determine the correct roots by substitution in the original equation. When roots are subtracted, there may be no way to determine the value of the eliminated root, which may not be a serious problem if you do not need to get that eliminated root.
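Checking candidate roots by substitution is mechanical enough to sketch in a few lines of Python. The function name and tolerance below are illustrative assumptions, not part of the original discussion.

```python
import math

def satisfies_original(x, tol=1e-9):
    """Return True if x solves log10(x - 1) + log10(5x) = 1."""
    # Real logarithms are undefined for non-positive arguments,
    # so such a candidate cannot be a root of the original equation.
    if x - 1 <= 0 or 5 * x <= 0:
        return False
    return abs(math.log10(x - 1) + math.log10(5 * x) - 1) < tol

candidates = [2, -1]  # roots of the derived quadratic (x - 2)(x + 1) = 0
valid = [x for x in candidates if satisfies_original(x)]
print(valid)
```

Only x = 2 survives the substitution; x = -1 is rejected because both logarithms would have negative arguments.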
In the present problem, exponentiating both sides of the equation resulted in a quadratic that was factorable, permitting easy solution. Of course, any quadratic equation can be solved by the quadratic formula, which anyone who has mastered high school algebra should have memorized or be able to look up. In this and the preceding problem, the state evaluation function being used was that equations with logs or exponentials of polynomials in x are less highly valued than equations that are simple polynomials in x, regardless of the degree. The reason is that we do not know any direct algorithm for solving an exponential or logarithmic equation. Thus, we are required to transform the equation into some form for which we know a method of solution that works at least in some cases. In the last two problems, then, we had to transform the exponential or logarithmic equations into polynomial equations, hoping that the polynomial equations so derived would be solvable by factoring or substitution into the quadratic formula.
TRIGONOMETRY
Determine the altitude, h, of a general scalene triangle, given the length of one side (its base b) and the angles made by the two other sides with the base (the two base angles, $\alpha$ and $\gamma$), as illustrated in Fig. 11-1.
FIGURE 11-1. Altitude of a triangle problem.
Knowing the base and the angles $\alpha$ and $\gamma$, we have determined a specific (unique) triangle. Thus, in principle, the altitude and every other property of the triangle is specified. However, to compute h, we need to have relationships that link h to the values of the known quantities $\alpha$, $\gamma$, and b. We might consult a trigonometry text to determine whether there was any simple formula involving the unknown h and the three known quantities $\alpha$, $\gamma$, and b. Suppose the trigonometry text listed no such formula. How might we proceed to determine the value of h? Stop reading and try to solve the problem.
We might define a subgoal of determining the area of the triangle in terms of the known quantities $\alpha$, $\gamma$, and b. This subgoal is extremely useful since we already should know that an equation exists that relates h and b to the area (A) of the triangle, namely, $A = \tfrac{1}{2}bh$. Stop reading and try again to solve the problem, if you did not do so before.
Suppose the trigonometry text does indeed list several formulas for the area of a triangle, one of which looks somewhat similar to the equation we are after, namely, $A = (b^2 \sin\alpha \sin\gamma)/(2 \sin\beta)$. The only problem with this equation is that it involves the additional quantity $\beta$. However, if we remember that the sum of the angles of a triangle equals 180°, then knowing two angles of a triangle allows us to compute the value of the third angle of the triangle, namely, $\beta = 180^\circ - \alpha - \gamma$. Thus, using this equation for the area of a triangle in terms of the three known quantities $\alpha$, $\gamma$, and b, we can determine the area, A. From knowledge of the area, A, and the base, b, we can compute the height, h, which was to be determined.
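The chain of subgoals (third angle, then area, then altitude) can be written out as a short Python sketch. It assumes that the two given angles are the base angles, measured in degrees, and that the third angle is the one opposite the base; the function and parameter names are illustrative.

```python
import math

def altitude(b, alpha_deg, gamma_deg):
    """Altitude on base b of a triangle with base angles alpha and gamma."""
    alpha = math.radians(alpha_deg)
    gamma = math.radians(gamma_deg)
    # Subgoal 1: the third angle, from the 180-degree angle sum.
    beta = math.pi - alpha - gamma
    if alpha <= 0 or gamma <= 0 or beta <= 0:
        raise ValueError("the base angles must be positive and sum to less than 180 degrees")
    # Subgoal 2: the area, from the looked-up formula A = b^2 sin(alpha) sin(gamma) / (2 sin(beta)).
    area = b ** 2 * math.sin(alpha) * math.sin(gamma) / (2 * math.sin(beta))
    # Goal: solve A = (1/2) b h for h.
    return 2 * area / b

h = altitude(2.0, 45.0, 45.0)
print(h)
```

For a base of 2 with two 45° base angles (an isosceles right triangle), the altitude comes out as 1, which agrees with elementary geometry.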
Once again, note that specific knowledge of trigonometric and geometric relations is critical for solving the problem. However, the more complex formula for the area of the triangle in terms of three angles and one side is a relation that we need not have memorized but only be capable of looking up in a trigonometry text. Even the simple formula for finding the area of the triangle in terms of its height and its base might be looked up in such a text, though it is likely to be remembered by a student who has understood geometry and trigonometry. The specific geometric knowledge that the sum of the angles of the triangle is 180° probably must be known in order to solve this problem.
Solving this problem requires more than knowledge of the relevant geometric and trigonometric facts. We must also know which of all of the relevant facts should be selected to use in the solution. This selection requires the use of general problem-solving methods. The goal is to find the value of h. We might work backward from this statement of the goal to an equation in which h is involved along with some other quantities. Such an equation might be $A = \tfrac{1}{2}bh$, the most commonly known formula for the area of a triangle. Since this equation involves the area that is also unknown, it immediately suggests the subgoal of finding the value of the area in terms of known quantities. This subgoal probably requires us to examine a table of trigonometric formulas to determine if there is a formula that relates the known quantities $\alpha$, $\gamma$, and b to the area of the triangle. If there is, then we have all we need to solve the problem. For us to solve this problem, having a specific knowledge of trigonometric formulas is less important than having access to books containing such information in conveniently usable form. What is critical is the use of the general problem-solving methods of working backward and defining subgoals. These methods provide the framework within which we can proceed in a goal-directed manner to solve the problem.
Hill climbing was also used to some extent to solve this trigonometry problem. In working backward, the selection of an equation relating b to h and A is superior to an equation relating h to quantities none of which is known. Here, the value of the evaluation function is greater, the more known quantities appear in the expression and the fewer unknown quantities appear in the expression. The same principle applies in working forward in trying to define as a subgoal another formula for the area of the triangle that involves as many known quantities as possible. In this case, it was possible to find a formula that involved all three known quantities and an additional quantity that was not originally specified in the givens as known but that could be trivially derived from the given information by means of a well-known geometric theorem (that the sum of the angles of a triangle is 180°).
ANALYTIC GEOMETRY
Determine the location and geometric properties of the figure specified by the equation $x^2 + y^2 - 5x + 7y = 3$.
The specifically relevant knowledge from the field of analytic geometry is that any equation of the form $x^2 + y^2 + Ax + By = C$ is the equation of a circle, and this equation can always be transformed to an equivalent equation of the form $(x - a)^2 + (y - h)^2 = c^2$, where (a, h) represents the coordinates of the center of the circle and c represents the radius of the circle. Retrieving this relevant knowledge is surely essential to solving the problem, and it gives part of the answer already, namely, that the form of the figure specified by the above equation is the form for a circle. All that remains is to determine the coordinates of the center of the circle and its radius (a, h, and c). It is this problem to which general problem-solving methods are applicable. Stop reading and try to solve the problem.
The given expression is $x^2 + y^2 - 5x + 7y = 3$, and the goal expression is of the form $(x - a)^2 + (y - h)^2 = c^2$. Once again, we might define a subgoal by means of working backward from the goal expression. Starting with the goal expression, we know we can rewrite the goal expression in the form $(x^2 - 2ax + a^2) + (y^2 - 2hy + h^2) = c^2$. Now we need to work forward from the given expression to the subgoal expression. Stop reading and try again to solve the problem, if you did not do so before.
We know that $-2a = -5$, from which $a = 5/2$. Similarly, $-2h = 7$ or $h = -7/2$. This means that we must add $a^2 = 25/4$ and $h^2 = 49/4$ to the left side of the given equation in order to complete squares to get to the left-hand side of the subgoal equation. The same $25/4 + 49/4$ must, therefore, be added to the right side of the given equation, yielding $3 + 74/4$, or $86/4$. Thus, $c^2 = 86/4$ and $c = \sqrt{86}/2$. Therefore, the coordinates of the center of the circle are $(5/2, -7/2)$, and the radius of the circle is $\sqrt{86}/2$.
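The arithmetic of completing the square is easy to get wrong by a sign or a digit, so it can help to check it with exact fractions. The following Python sketch is one way to do so; the function name is an illustrative assumption, and it returns the squared radius so that the result stays exact.

```python
from fractions import Fraction

def circle_from_general(A, B, C):
    """Center (a, h) and squared radius c^2 of x^2 + y^2 + A x + B y = C."""
    # Matching x^2 + A x against x^2 - 2 a x gives a = -A/2; likewise h = -B/2.
    a = Fraction(-A) / 2
    h = Fraction(-B) / 2
    # Adding a^2 and h^2 to both sides completes the two squares.
    c_squared = Fraction(C) + a * a + h * h
    if c_squared <= 0:
        raise ValueError("the equation does not describe a real circle")
    return a, h, c_squared

a, h, c_squared = circle_from_general(-5, 7, 3)
print(a, h, c_squared)
```

Here the squared radius 43/2 is the same number as 86/4, so the radius is the square root of 86 divided by 2.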
Completing the square becomes a rather familiar specific technique in and of itself, once you have a certain degree of experience in mathematics. However, to the beginning mathematics student, it is at some point a new, unknown technique. To the same student, the technique of expanding a term of the form $(x - a)^2$ to get $(x^2 - 2ax + a^2)$ is familiar. By using the general problem-solving method of working backward to get a subgoal, we can get the idea of completing the square in a completely natural way and know exactly how to do it.
Determine equations for the new coordinates of a point in a plane when the new coordinate system is obtained by translation and rotation from an old coordinate system. Translation of a coordinate system means the origin is changed to a new point, and rotation of a coordinate system means both axes are turned through the same angle in the same direction, pivoting about the origin.
It makes no difference in which order the translation and rotation operations are performed; the same new coordinate system is obtained in either case. Stop reading and try to solve the problem.
The solution of this problem results simply from breaking the problem into parts, that is, setting subgoals. First, solve the problem of characterizing the new coordinates obtained by simple translation alone. Having achieved this subgoal, then solve the second subgoal of characterizing the final set of coordinates after a rotation has been applied to the coordinate system previously derived from the translation. Stop reading and try to solve the problem again.
If x and y are the original coordinates, and the coordinates of the new origin in terms of the original (x, y) axes are $(x_0, y_0)$, then let the new coordinates obtained by translation be represented by $x_1$, $y_1$. The formulas for simple translation of coordinates are as follows:
$$x = x_1 + x_0 \qquad y = y_1 + y_0$$
$$x_1 = x - x_0 \qquad y_1 = y - y_0$$
Stop reading and try to solve the rest of the problem, if you have not done so already.
Let the coordinates obtained by rotation through the angle $\alpha$ be $x_2$ and $y_2$. Formulas relating the new $(x_2, y_2)$ coordinates to the previous $(x_1, y_1)$ coordinates are as follows:
$$x_1 = x_2 \cos\alpha - y_2 \sin\alpha$$
$$y_1 = x_2 \sin\alpha + y_2 \cos\alpha$$
To get formulas for the combined translation and rotation transformation, simply combine the above formulas to obtain the following:
$$x = x_2 \cos\alpha - y_2 \sin\alpha + x_0$$
$$y = x_2 \sin\alpha + y_2 \cos\alpha + y_0$$
To express the new coordinates in terms of the old coordinates requires some algebra, which yields the following:
$$x_2 = x \cos\alpha + y \sin\alpha - (x_0 \cos\alpha + y_0 \sin\alpha)$$
$$y_2 = -x \sin\alpha + y \cos\alpha - (-x_0 \sin\alpha + y_0 \cos\alpha)$$
With the problem being broken into two subgoals, each of which was simpler to obtain, it was possible to obtain the solution to the original problem by simple algebraic combination of the solutions to the two subproblems.
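The two sets of combined formulas are inverses of each other, which gives a convenient check on the algebra. The Python sketch below implements both directions; the function names are illustrative, and the angle is assumed to be in radians.

```python
import math

def old_to_new(x, y, x0, y0, alpha):
    """New coordinates (x2, y2) after moving the origin to (x0, y0)
    and rotating the axes through the angle alpha (radians)."""
    dx, dy = x - x0, y - y0              # subgoal 1: translation
    x2 = dx * math.cos(alpha) + dy * math.sin(alpha)   # subgoal 2: rotation
    y2 = -dx * math.sin(alpha) + dy * math.cos(alpha)
    return x2, y2

def new_to_old(x2, y2, x0, y0, alpha):
    """Old coordinates (x, y) recovered from the new coordinates (x2, y2)."""
    x = x2 * math.cos(alpha) - y2 * math.sin(alpha) + x0
    y = x2 * math.sin(alpha) + y2 * math.cos(alpha) + y0
    return x, y

alpha = math.radians(30)
x2, y2 = old_to_new(3.0, 4.0, 1.0, 2.0, alpha)
x, y = new_to_old(x2, y2, 1.0, 2.0, alpha)
print((x2, y2), (x, y))
```

Transforming a point to the new system and back returns the original point, up to floating-point error, and the new origin itself maps to (0, 0).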
Now let us consider the following problem involving the transformation of coordinate systems:
Is it possible to transform axes such that the straight line $4x - 3y + 2 = 0$ will have the form $x_1 = 0$ and such that the straight line $2x + y = 4$ will have the form $x_1 = ay_1$? If such is possible, derive the transformation.
Probably the first thing to note is that we have two equations and two unknowns. If the equations are independent, which they are, then it will be possible to solve the equations for the values of x and y. From a geometric point of view, then, you must find the point of intersection of the two straight lines represented by these linear equations. Solving for the point of intersection of the two straight lines will prove to be an important subgoal in solving the problem, but you need not even realize that in the beginning. Since solving these two linear equations in two unknowns is so simple, you should probably simply draw these inferences from the given information, without regard to the goal, at the outset of the problem (as discussed in Chapter 3). This assumes, of course, that you are familiar with the process of solving two linear equations with two unknowns, so that this is a trivial inference. When it is easy to represent explicitly the information that is given implicitly in the problem, you should undoubtedly do so in the beginning, before even thinking about how to reach the goal from the given information. This initial step will yield the information that the solution of the two equations (point of intersection of the two straight lines) is x = 1, y = 2.
Stop reading and try to solve the problem, if you did not before.
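The inference that the two lines meet at (1, 2) can be confirmed with a few lines of Python. The sketch below uses Cramer's rule, which is one of several equivalent methods and is not prescribed by the text; the function name is illustrative.

```python
from fractions import Fraction

def intersect(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the lines are parallel or coincident")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# 4x - 3y + 2 = 0 is rewritten as 4x - 3y = -2; the second line is 2x + y = 4.
point = intersect(4, -3, -2, 2, 1, 4)
print(point)
```

A nonzero determinant is the algebraic counterpart of the statement that the two equations are independent.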
Consider the goal and derive a suitable plan for achieving the goal from the given information. Having drawn the inference that the two straight lines intersect at a particular known point, it is now clear that the goal is achievable. Why? Stop reading and try to solve the problem, if you still have not done so.
The problem obviously indicates a division into two subproblems: (a) making the first line have the form $x_1 = 0$ and (b) making the second line have the form $x_1 = ay_1$. Drawing inferences about these subgoals will point out the general types of transformation necessary to achieve each subgoal. Stop reading and try to solve the problem, if you did not do so.
It is somewhat easier to consider the achievement of the second subgoal first. The second subgoal equation, $x_1 = ay_1$, asserts that, in the new coordinate system, the second line will have zero $x_1$ and $y_1$ intercepts (being a strict proportionality). Having zero $x_1$ and $y_1$ intercepts means that the second line must pass through the origin of the coordinate system. Having drawn this inference from the subgoal (or worked backward from the subgoal, if you will), what transformation of the original coordinate system will achieve this new subgoal? Stop reading and try to solve the problem, if you did not before.
Clearly, we can achieve the second subgoal by a simple translation of the coordinate system from the origin to any point on the second line ($2x + y = 4$). Our preceding inference concerning the point of intersection of the two straight lines might bias us to translate the origin of the coordinate system to the point of intersection of the two straight lines, but at this stage of working on the problem we do not know for sure that this is the correct point of origin for the new coordinate system.
Now, how do we achieve the other subgoal, namely, that of transforming the first line so that it has the form $x_1 = 0$? Stop reading and try to solve the problem, if you have not done so already.
Again drawing inferences from the subgoal (working backward), we see that to transform the first line into a line with the equation $x_1 = 0$, we must transform the first line so that it is coincident with the $y_1$ axis in the new coordinate system. Such a line will always have the $x_1$ coordinate equal to zero for any value of the $y_1$ coordinate. Achieving this goal requires what type of transformation of the original coordinate system? Stop reading and try to complete the solution of the problem, if you have not done so already.
Clearly, to achieve this subgoal we must, first, translate the origin of the coordinate system to some point along the first line and, second, rotate the axes of the coordinate system so that the $y_1$ axis coincides with the first line. The first aspect interacts with the transformation necessary to achieve the other subgoal, so we are restricted to locating the origin of the new coordinate system at the point of intersection of the two straight lines (since the origin of the new coordinate system must lie on both straight lines). A second restriction in achieving the present subgoal is that the axes of the coordinate system must be rotated around this new origin until the $y_1$ axis coincides with the first straight line ($4x - 3y + 2 = 0$). Geometric intuition indicates that this can be done and so achieving the goal is clearly possible.
Now the problem is to derive the nature of the rotation, since the translation is already obvious, that is, to move the origin to the point (1, 2). To solve the rest of the problem, we now need to know the angle of rotation of the coordinate system required to line up the $y_1$ axis with the straight line represented by $4x - 3y + 2 = 0$. Stop reading and solve the rest of the problem, if you have not done so already.
To solve this subproblem let us write down the formulas for the new coordinates in terms of the old coordinates (noting that $x_1$ and $y_1$ here play the role of $x_2$ and $y_2$ in the preceding problem), namely, $x_1 = x \cos\alpha + y \sin\alpha - k$, where $k = x_0 \cos\alpha + y_0 \sin\alpha$. We might also go ahead and write down the equation for $y_1$ in terms of x and y, but this actually is unnecessary to the solution of the problem. If we are continually aware of what terms represent constants and what terms represent variables in any expression, then we should note that, in the right side of the equation we have just written down, x and y are the only variables, and $\cos\alpha$, $\sin\alpha$, and k are all constants (albeit unknown to us at present). Since given information specifies that $x_1 = 0$ and that $4x - 3y + 2 = 0$, we can equate $x \cos\alpha + y \sin\alpha - k$ to the expression $4x - 3y + 2$ (up to a constant factor).
Now if we take the equations for the general transformation of coordinates involving both translation and rotation and substitute them into the equation $4x - 3y + 2 = 0$ we obtain
$$4(x_1 \cos\alpha - y_1 \sin\alpha + x_0) - 3(x_1 \sin\alpha + y_1 \cos\alpha + y_0) + 2 = 0.$$
Since we wish to find an $\alpha$ for which $x_1 = 0$, we can substitute $x_1 = 0$ into the above equation and also substitute the known values of $x_0 = 1$ and $y_0 = 2$. This yields the equation
$$-(4 \sin\alpha + 3 \cos\alpha)\, y_1 = 0.$$
This equation implies that $4 \sin\alpha + 3 \cos\alpha = 0$. Thus, $4 \sin\alpha = -3 \cos\alpha$. To determine $\alpha$ from this equation, simply remember the trigonometric identity that $\sin\alpha/\cos\alpha = \tan\alpha$. We can then derive from the above that $\tan\alpha = -3/4 = -0.75$. Using the tables, this indicates that $\alpha = -36°52'$. Thus, the solution to the problem is to translate the origin of the coordinate system to the point (1, 2) and rotate the coordinate system through an angle of 36°52' in the negative direction.
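The derived transformation can be tested directly by transforming sample points of each line. The Python sketch below assumes the combined translation and rotation formulas of the preceding problem, with the new coordinates called $x_1$ and $y_1$; the helper name and the sample points are illustrative.

```python
import math

alpha = math.atan2(-3, 4)   # tan(alpha) = -3/4, about -36.87 degrees
x0, y0 = 1.0, 2.0           # new origin: the intersection of the two lines

def new_coords(x, y):
    """Translate the origin to (x0, y0), then rotate the axes through alpha."""
    dx, dy = x - x0, y - y0
    x1 = dx * math.cos(alpha) + dy * math.sin(alpha)
    y1 = -dx * math.sin(alpha) + dy * math.cos(alpha)
    return x1, y1

# Sample points on the first line, 4x - 3y + 2 = 0, i.e. y = (4x + 2)/3.
first_line = [new_coords(x, (4 * x + 2) / 3) for x in (-2.0, 0.0, 1.0, 5.0)]
# Sample points on the second line, 2x + y = 4, i.e. y = 4 - 2x.
second_line = [new_coords(x, 4 - 2 * x) for x in (-2.0, 0.0, 3.0)]

print(math.degrees(alpha))
print(first_line)
print(second_line)
```

Every point of the first line gets $x_1 = 0$ to within rounding, and the points of the second line satisfy a strict proportionality $x_1 = ay_1$, with a = -2 for this transformation.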
One slightly tricky aspect of the problem is the use of the equation $x_1 = 0$ in conjunction with the equation derived by substituting into $4x - 3y + 2 = 0$. After we have established that it is indeed possible to derive a transformation of the coordinates so that the equation $4x - 3y + 2 = 0$ can be transformed into an equation of the form $x_1 = 0$, we are not really trying to achieve the goal expression $x_1 = 0$. Rather, $x_1 = 0$ is part of the given information that we are using to achieve the subgoal of determining the angle of rotation $\alpha$. This problem clearly points up the need to carefully define and redefine what is given information and what is the goal at different stages in the solution of the problem.
CALCULUS
Prove that, within the set of triangles having a constant base and constant perimeter, the isosceles triangle has the maximum area.
The specific calculus necessary to solve this problem is to know that we can often find the maximum or minimum of a function by differentiating it with respect to the variable(s) of which it is a function. Clearly, in the present instance, the function (dependent variable) is the area of a triangle. However, the area of a triangle can be expressed as a function of a number of different independent variables. Therefore, let us set the subgoal of finding a formula for the area of a triangle that involves those quantities that are specifically given in this problem (either constants or the variables with respect to which we wish to find maximum area). Stop reading and try again to solve the problem, if you did not do so before.
In this problem, the constants and independent variables are evidently the sides of the triangle (including the sum of the sides, which is the perimeter). Thus, we need to find a formula that involves only these quantities. Such a formula, which can be looked up in a book, is Heron's Formula: $A = [s(s - a)(s - b)(s - c)]^{1/2}$, where A is the area of a triangle; s is the semiperimeter, which equals $\tfrac{1}{2}(a + b + c)$; and a, b, and c are the lengths of the sides. Stop reading and try to solve the problem, if you have not done so already.
Having achieved the first subgoal of finding a formula that involves the relevant constants and variables, we should note that the formula contains the semiperimeter (which is a constant, since the perimeter is a constant) and the length of one side (which is a constant; let it be side a). Two variable sides remain, namely, b and c, and we might achieve the solution by simply differentiating A with respect to both b and c. However, suppose we are familiar only with differentiating functions of single variables with respect to the single variable in order to obtain the maximum or minimum of the function with respect to that variable. In this case, we must set a second subgoal to reduce the number of independent variables from two to one. Stop reading and try to solve the problem, if you have not already done so.
We can achieve the second subgoal, since with a constant perimeter and a constant base (a), the sum of the other two sides must be equal to a constant. Thus, $b + c = k$ and $c = k - b$. Substituting $c = k - b$ into Heron's Formula for the area of a triangle, we obtain the area as a function of a single variable, namely, the length of side b. This can be differentiated with respect to b and the derivative set equal to zero to determine that $k = 2b$. A trick that makes this a bit easier is to note that, if the area A is a maximum, then $A^2$ is a maximum and vice versa. Since it is somewhat easier to differentiate $A^2$ with respect to b than to differentiate A with respect to b, this little trick saves some work. In either case, we solve for $k = 2b$, from which it follows that $b = c$, and the theorem is proved. The work is given below:
$$A^2 = s(s - a)(s - b)(s - c)$$
$$A^2 = s(s - a)(s - b)(s - k + b)$$
$$dA^2/db = s(s - a)[-(s - k + b) + (s - b)] = 0$$
$$-s + k - b + s - b = 0$$
$$k - 2b = 0$$
$$k = 2b$$
$$b + c = 2b$$
$$c = b$$
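A numerical check of this result takes only a few lines. The Python sketch below fixes an arbitrary base a = 6 and an arbitrary sum k = 10 for the other two sides (both values are assumptions chosen only for illustration) and searches a grid of values of b for the largest squared area.

```python
import math

def area_squared(a, b, k):
    """Heron's formula squared, with the third side written as c = k - b."""
    c = k - b
    s = (a + b + c) / 2           # semiperimeter; constant because a and k are fixed
    return s * (s - a) * (s - b) * (s - c)

a, k = 6.0, 10.0
# The triangle inequality requires |b - c| < a, i.e. 2 < b < 8 for these values.
candidates = [i / 100 for i in range(201, 800)]
best_b = max(candidates, key=lambda b: area_squared(a, b, k))
print(best_b, k - best_b, math.sqrt(area_squared(a, best_b, k)))
```

The search picks out b = c = 5, the isosceles triangle with sides 5, 5, and 6, whose area is 12.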
Incidentally, the above problem can be solved entirely without calculus, using the method of contradiction in conjunction with Heron's Formula. To use the method of contradiction, we assume that the squared area ($A^2$) in the case where $b \neq c$ is greater than the squared area ($A^2$) in the case where $b = c = d$. Without loss of generality, assume $b > c$. Since $b + c = 2d$, then $b > d > c$. Using these equalities and inequalities and some algebraic manipulation of the inequality $(s - b)(s - c) > (s - d)^2$, we can eventually derive that the square of a real number is less than 0, which is a contradiction since the square of any real number cannot be negative. The algebra is given below for the interested reader:
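One way the algebra can go is sketched here. It is a reconstruction under the assumptions already stated (s is the semiperimeter, a is the fixed base, and $b + c = 2d$), and not necessarily the sequence of steps the author had in mind.

```latex
\begin{align*}
s(s-a)(s-b)(s-c) &> s(s-a)(s-d)^2 && \text{assumption to be contradicted} \\
(s-b)(s-c) &> (s-d)^2 && \text{divide by } s(s-a) > 0 \\
s^2 - (b+c)s + bc &> s^2 - 2ds + d^2 \\
bc &> d^2 && \text{since } b + c = 2d \\
4bc &> (b+c)^2 && \text{since } 4d^2 = (b+c)^2 \\
0 &> (b-c)^2
\end{align*}
```

The last line asserts that a square is negative, which is the contradiction.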
Derive the form of the following indefinite integral:
$$y = \int \frac{dx}{1 + x^{1/2}}$$
The background information that is assumed to be given includes knowledge of the integrals of elementary functions (such as $x^n$, $e^x$, $\log x$, $\sin x$, and $\cos x$). Other important background information includes the techniques of integration by substitution and by parts, and differentiating a function of a function.
The first major choice in attacking an integration problem of this kind is whether to use the method of substitution or the method of integration by parts. Sometimes both methods must be used, but, in any event, you still have to decide which to apply first. Since integration by substitution is the more useful technique, it is to be preferred as an initial choice of integration method, unless there is some special reason for preferring integration by parts. Integration by parts is useful primarily when the function to be integrated is a product of two functions, $f(x) = g(x)h(x)$. Although all functions of x can be written as a product of two functions, namely, $f(x) = f(x) \cdot 1$, this is a trivial type of product to which the application of integration by parts is only occasionally useful. Thus, in the present problem, there is no reason to use integration by parts, so we adopt integration by substitution as our initial operation. Stop reading and try to solve the problem.
Note that the problem-solving method considerations discussed so far in this problem are all specific to calculus. Substitution and integration by parts are not general problem-solving methods. However, in deciding what type of substitution to make, general problem-solving methods play some role. In particular, hill climbing is useful. The hill climbing uses an evaluation function concerned roughly with simplicity of functional form and the likelihood of your knowing an integral for the function resulting from this substitution. At the same time, another evaluation function is generally working at cross purposes with the first one, namely, the simplicity of the functional form for the substitution $u = g(x)$. A good rule of thumb is to try a substitution whose functional form is less complicated than that of the original function and results in a function to be integrated that is also less complicated than the original function. Thus, in the present problem, although a substitution of the form $u = 1/(1 + x^{1/2})$ would greatly simplify the original problem, the form of the substitution function would be as complicated as the original function to be integrated. Stop reading and try again to solve the problem, if you did not before.
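Better substitutions would be $u = x^{1/2}$ or $u = 1 + x^{1/2}$. Using the latter substitution, $u = 1 + x^{1/2}$ and $dx = 2(u - 1)\,du$, yields
$$y = \int \frac{2(u - 1)}{u}\,du = 2\int du - 2\int \frac{du}{u} = 2(u - \log u) + C$$
$$= 2\left[1 + x^{1/2} - \log(1 + x^{1/2})\right] + C = 2\left[x^{1/2} - \log(1 + x^{1/2})\right] + C'$$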
Frequently, several substitutions will be required in order to solve the problem, and, at each step you are essentially using the method of hill climbing on an evaluation function concerned with simplicity of functional form. There is no precise definition of simplicity of functional form, but that lack should not prevent you from explicitly recognizing that this is what you are doing and that you have rather good judgment as to what functions are simpler than other functions (in the sense of being closer to functions for which you know the integral). As long as you are able to decide that the functions resulting from certain substitutions are simpler than functions resulting from other substitutions, you are in a position to make good use of the hill-climbing method, whether or not you can explicitly define the evaluation function.
Derive the functional form of the following integral: $\int x^2 e^x\,dx$. The specific background knowledge includes knowledge of the integrals of the elementary functions plus the integration by parts formula, namely, $\int uv\,dx = uV - \int u'V\,dx$, where $V = \int v\,dx$ and $u' = du/dx$.
Since the function to be integrated is an obvious product of two simpler functions, the method of integration by parts is suggested. Whether integration by parts is making progress toward the goal is determined a great deal by the general problem-solving method of hill climbing on an evaluation function of simplicity of functional form and ease of integration. Stop reading and try to solve the problem.
In the present case, two applications of the method of integration by parts are necessary in order to solve the problem. At each stage the application of integration by parts results in functions to be integrated that are simpler than the functions to be integrated prior to the application of integration by parts. The specific solution is as follows:
$$\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx$$
$$= x^2 e^x - 2x e^x + 2e^x + C$$
$$= (x^2 - 2x + 2)e^x + C$$
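An antiderivative can always be checked by differentiating it. The Python sketch below does this numerically with a central difference; the sample points and step size are arbitrary choices made for illustration, and the function names are not from the text.

```python
import math

def antiderivative(x):
    """Result of integrating by parts twice, with the constant C set to 0."""
    return (x * x - 2 * x + 2) * math.exp(x)

def integrand(x):
    return x * x * math.exp(x)

def central_difference(f, x, h=1e-5):
    """Numerical estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

points = [-2.0, -0.5, 0.0, 1.0, 2.5]
errors = [abs(central_difference(antiderivative, x) - integrand(x)) for x in points]
print(max(errors))
```

The largest discrepancy is at the level of the numerical error of the difference quotient, so the derivative of the result reproduces the integrand.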
In the present instance, it would be possible to give a precise definition of the evaluation function on which the hill climbing is occurring, namely, the exponent of x in the product $x^n e^x$ when this product is the function to be integrated. Repeated application of the method of integration by parts results in reduction of the exponent, eventually to $x^0 e^x$ or $e^x$. However, whether or not it is possible to explicitly define the evaluation function being used, hill-climbing methods can be extremely useful in solving such a problem, so long as your judgment of simplicity is reasonably accurate.
Find the values of x for which the function $y = f(x)$ is a maximum or minimum. The function is defined by the equation $x^2 + xy + y^2 = 27$. Relevant background information includes the chain rule for differentiating the function of a function, the rule for differentiating the product of two functions, and the theorem that the derivative of a function equals zero at a minimum or maximum.
When we are finding the maxima or minima of even a function of a single variable, $y = f(x)$, we are essentially solving two equations for the values of two unknowns, x and y. This fact is often not apparent to students when they originally learn the method of finding maxima and minima by differentiating $f(x)$, setting it equal to zero, and solving for x, because the original equation was already solved for y as a function of x. In such a case, the derivative will involve only a single variable x. When the derivative is set equal to zero, the resulting equation is solved for the value of x for which the function is a maximum or minimum. In the present problem, the initial function $y = f(x)$ is defined implicitly by the equation $x^2 + xy + y^2 = 27$. In this case, it is necessary to take a more general approach to the problem, in which finding the derivative and setting it equal to zero allows us to obtain a second equation, in addition to the equation $x^2 + xy + y^2 = 27$. We hope that these two equations will permit us to solve for the (x, y) points for which the function has a maximum or minimum. Stop reading and try to solve the problem.
To solve the problem, we should initially set a subgoal: to obtain an equation that involves the derivative $y' = dy/dx$. Stop reading and try again to solve the problem, if you did not before.
This subgoal can be achieved by differentiating the given equation with respect to $x$ (employing the product and chain rules for differentiation). The resulting equation is $2x + xy' + y + 2yy' = 0$. This equation can be solved for $y'$ by algebraic manipulation, yielding the equation
\[
y' = -\frac{2x + y}{x + 2y}.
\]
When this expression is set equal to zero, we obtain $y = -2x$. Substituting this equation into the original equation, we obtain $x^2 - 2x^2 + 4x^2 = 27$, or $3x^2 = 27$, or $x = (+3, -3)$, and the problem is solved. Once again, a simple definition of a single subgoal (namely, obtaining an expression for $y'$ in terms of $x$ and $y$) resulted in straightforward solution of the problem.
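As a quick sanity check on this kind of result, the candidate points can be substituted back into both equations. The sketch below assumes the defining equation is x^2 + xy + y^2 = 27 and that setting the derivative to zero gives y = -2x; the helper names are illustrative only.

```python
def on_curve(x, y, tol=1e-12):
    """True if (x, y) satisfies the assumed defining equation x^2 + xy + y^2 = 27."""
    return abs(x * x + x * y + y * y - 27) < tol

def implicit_slope(x, y):
    """y' obtained from 2x + x*y' + y + 2y*y' = 0."""
    return -(2 * x + y) / (x + 2 * y)

# Candidate extrema: x = +3 and x = -3, each with y = -2x.
candidates = [(x, -2 * x) for x in (3, -3)]
for x, y in candidates:
    print((x, y), on_curve(x, y), implicit_slope(x, y) == 0)
```

Both candidates lie on the curve and have zero slope, which is exactly the pair of conditions the two equations were meant to enforce.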
DIFFERENTIAL EQUATIONS
The solutions of differential equations provide particularly good examples of the use of the problem-solving methods of hill climbing, subgoals, and analogy to similar problems. Perhaps the most important specific training is the ability to place a differential equation in the proper class. Once you note what other differential equations the one in front of you is similar to (what class it belongs to), you can then apply the techniques associated with the solution of that class of differential equations. You need not even have much specific knowledge of how to solve equations of a particular class, so long as you can identify the class and look up in a book how to solve equations of that class. Thus, analogy to similar problems is the crucial first step in the solution of many differential equations.
When the given differential equation is a member of a class for which solution methods are known, the methods of hill climbing and subgoals (using evaluation functions) are also quite important. For example, in solving differential equations of different forms, we often proceed by setting as a subgoal the transformation of the differential equation into another differential equation of simpler form, and then by using the known solution methods for the simpler form.
In grab-bag classes of differential equations (such as miscellaneous nonlinear differential equations), we may attempt to define subgoals such as transforming the equation to linear form or reducing the order or degree of the equation, but frequently we simply try out various operations on the given nonlinear differential equations to see which ones result in an equation of the simplest form. The latter is clearly an example of hill climbing, using an evaluation function that somehow weights different features of a differential equation for overall ease of solution.
For equations with order and degree both greater than 1, the relevant evaluation function is frequently the vector (order, degree), with lower values of either component being more highly valued. Differential equations provide good examples of vector evaluation functions, where there are many different properties on which hill climbing might be tried to see which, if any, approach would solve the problem. Frequently, the solution of a differential equation requires a sequence of steps in which the degree and order of the equation are progressively reduced, finally resulting in a differential equation of the first order and first degree. The sequence in which the degree and order of the differential equation are progressively reduced may vary from problem to problem.
Once the stage is reached where you have a nonlinear differential equation of the first order and first degree to solve, a variety of potential solution sequences can follow, again depending on the type of first-order, first-degree differential equation.
The nonlinear differential equation may be reducible to linear form by some suitable transformation. No general rules exist for determining such transformations nor the types of nonlinear equations to which they apply, but experience with a wide variety of such problems may indicate that the present problem is similar to some problem already solved in this way. Having achieved a linear equation of the first order and first degree, you then apply solution methods appropriate to this type of equation (for example, using Laplace transforms or integrating factors).
Another solution sequence starting with a nonlinear, first-order, first-degree differential equation (which is appropriate in some cases) is to try to transform the equation to be a member of a particular class of differential equations known as exact differential equations. Achieving this subgoal may require you to find an appropriate integrating factor to transform the given differential equation into an exact differential equation. Once an exact differential equation has been obtained, you simply follow solution methods appropriate for this type of equation.
Another solution sequence is appropriate to differential equations of the form
\[
\frac{dy}{dx} = \frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2},
\]
where $a_1 b_2 - a_2 b_1 \neq 0$. To solve such equations, we set the subgoal of transforming this inhomogeneous equation into a homogeneous equation by making a substitution. The next subgoal is to transform this homogeneous equation into a differential equation in which the variables are separated, which is then solved by direct integration.
There are other solution sequences appropriate to other types of nonlinear differential equations of the first order and first degree. However, just considering the solution sequences discussed here, note that an experienced solver of differential equations has established an evaluation function for first-order, first-degree differential equations, which is essentially a partial ordering of a variety of different forms of such differential equations. In this partial ordering, equations with separated variables are more highly valued than homogeneous equations in which the variables are not separated, the latter being more highly valued than the type of inhomogeneous equations described above, which in turn are more highly valued than many nonlinear differential equations not of this or any other identifiable type. At the same time, exact differential equations are more highly valued than these miscellaneous nonlinear differential equations, but there is no ordering of exact differential equations relative to equations within another solution sequence, such as that appropriate for the inhomogeneous equations of the previously specified type. Along the same lines, linear differential equations are more highly valued than the miscellaneous nonlinear differential equations, but they are not necessarily more highly valued than types of equations within some other solution sequence. This is what we mean by saying that these different types of first-order, first-degree differential equations have an evaluation function in the form of a partial ordering, rather than in a complete or simple rank ordering of all the different types of such equations.
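One way to picture such a partial ordering is as a small set of "more highly valued than" pairs, with comparability decided by following chains of pairs. The sketch below is only an illustration of the idea: the short labels and the helper names are assumptions introduced here, and the pairs encode just the relations described in the preceding paragraph.

```python
# Each pair (a, b) means "form a is more highly valued (simpler) than form b".
MORE_VALUED = {
    ("separated", "homogeneous"),
    ("homogeneous", "inhomogeneous"),
    ("inhomogeneous", "miscellaneous"),
    ("exact", "miscellaneous"),
    ("linear", "miscellaneous"),
}

def is_more_valued(a, b, relation=MORE_VALUED):
    """True if a chain of pairs leads from a down to b."""
    frontier, seen = [a], set()
    while frontier:
        current = frontier.pop()
        for higher, lower in relation:
            if higher == current and lower not in seen:
                if lower == b:
                    return True
                seen.add(lower)
                frontier.append(lower)
    return False

def comparable(a, b):
    """In a partial ordering, some pairs of forms are simply not ranked."""
    return a == b or is_more_valued(a, b) or is_more_valued(b, a)

print(is_more_valued("separated", "miscellaneous"))  # True
print(comparable("exact", "homogeneous"))            # False
```

The second result is the point of the passage: exact equations and homogeneous equations belong to different solution sequences, so neither is ranked above the other.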
If we know a variety of such types of differential equations and the appropriate partial ordering type of evaluation function defined over them (that is, know the variety of different solution sequences), we are in a good position either to define subgoals or to recognize progress in the use of the hill-climbing method. However, there is still the problem of determining the proper operation (substitution, integrating factor, and the like) to take in order to achieve a differential equation of the more highly evaluated (simpler) form. Perhaps general problem-solving methods are applicable to this aspect of solving differential equations, but, frankly, I have so little experience in solving differential equations that I feel incompetent to discuss the matter further.
In any event, once again, a solution of mathematical problems requires a mixture of specific knowledge of mathematics and the use of general problem-solving methods. You can, of course, learn how to solve differential equations and other mathematical problems without appreciating that you are thereby applying general problem-solving methods. However, understanding general problem-solving methods probably facilitates your understanding the variety of techniques applicable to such mathematical problems. A generally accepted dogma in educational psychology is that the more you can relate new knowledge to old knowledge, the faster and more complete your learning will be (though how good the evidence for this is I certainly do not know). So, if you know general problem-solving methods you should be able to quickly organize many specific methods for solving differential equations when these methods are introduced in terms of defining classes of similar problems and defining evaluation functions that permit you to use the subgoal and hill-climbing methods.
To complement this rather abstract discussion, let us consider the solution of the following differential equation, which was produced by Al Stevens, a student in one of my problem-solving classes:
\[
2y^2 y'' + 2y\,(y')^2 = 1
\]
Stop reading and try to solve this differential equation.
Stevens first defined as a subgoal the transformation of this equation into a linear, second-order differential equation, but he quickly replaced this subgoal with a different subgoal: namely, that of reducing the equation from second order, nonlinear to first order, nonlinear. Stop reading and try again to solve the problem, if you did not do so before.
The second subgoal is easily achieved because the differential equation is of the form $y'' = f(y, y')$, with $f$ not being a function of $x$. By recognizing the differential equation as a member of this class, Stevens made available his knowledge that a standard substitution (namely, $p = dy/dx$) would transform the second-order differential equation into a first-order differential equation. Since $y'' = dp/dx$ and $dp/dx = (dp/dy)(dy/dx)$, then $y'' = p\,(dp/dy)$. Substituting into the original equation yields the first-order nonlinear differential equation $2y^2 p\,(dp/dy) + 2yp^2 = 1$. Stop reading and try again to solve the problem, if you did not do so before.
Algebraic manipulation of the equation yields
\[
\frac{dp}{dy} + \frac{p}{y} = \frac{1}{2y^2}\,p^{-1}.
\]
Such an equation belongs to another specific class of differential equations, namely Bernoulli equations, which are of the form
\[
\frac{dp}{dy} + P(y)\,p = Q(y)\,p^{n}.
\]
You can look up in a book that such equations can be reduced to linearity by the substitution $w = p^{1-n}$. In this case, that means the substitution $w = p^2$, for which $dw/dy = 2p\,(dp/dy)$. Substitution then yields
\[
\frac{1}{2p}\,\frac{dw}{dy} + \frac{p}{y} = \frac{1}{2y^2 p}.
\]
Multiplying through by $2p$ yields
\[
\frac{dw}{dy} + \frac{2w}{y} = \frac{1}{y^2},
\]
which is a linear equation in $w$ and $y$. From this point, the solution is straightforward by methods that can be looked up in a book.
In retrospect, Stevens noticed that his first-order, nonlinear differential equation, $2y^2 p\,(dp/dy) + 2yp^2 = 1$, had the form $d(y^2p^2)/dy = 1$, in which the variables $y$ and $y^2p^2$ are trivially separable. Thus, the equation is easily solvable without its having to be reduced to linear form. The solution is as follows:
Integrating,
\[
y^2 p^2 = y + c_1.
\]
Since $p = dy/dx$,
\[
y\,\frac{dy}{dx} = (y + c_1)^{1/2}, \qquad \frac{y}{(y + c_1)^{1/2}}\,dy = dx.
\]
From a table of integrals,
\[
\frac{-2\,(2c_1 - y)\,(y + c_1)^{1/2}}{3} = x + c_2.
\]
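A table-of-integrals step like the last one is easy to check numerically. The sketch below assumes the separated equation is y dy / (y + c1)^(1/2) = dx and that the tabulated antiderivative is -(2/3)(2 c1 - y)(y + c1)^(1/2); it differentiates the antiderivative by central differences and compares the slope with the integrand. The function names, the value of c1, and the sample points are arbitrary illustrative choices.

```python
import math

def integrand(y, c1):
    """Left-hand side of the separated equation: y / sqrt(y + c1)."""
    return y / math.sqrt(y + c1)

def antiderivative(y, c1):
    """Candidate closed form: -(2/3) * (2*c1 - y) * sqrt(y + c1)."""
    return -2.0 * (2.0 * c1 - y) * math.sqrt(y + c1) / 3.0

def max_mismatch(c1=1.0, points=(0.5, 1.0, 2.0, 5.0), h=1e-5):
    """Largest gap between the numerical derivative of the antiderivative
    and the integrand over the sample points."""
    worst = 0.0
    for y in points:
        slope = (antiderivative(y + h, c1) - antiderivative(y - h, c1)) / (2 * h)
        worst = max(worst, abs(slope - integrand(y, c1)))
    return worst

print(max_mismatch())  # a very small number if the closed form is right
```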
PROBABILITY AND STATISTICS

A sample of two observations, x and y, is drawn from the uniform distribution on the interval from zero to 1: $f(x) = f(y) = 1$, $0 \le x, y \le 1$. Find the rth raw moment of z, $\mu_{r:z}$, where $z = xy$. Note that
\[
\mu_{r:z} = \int z^r g(z)\,dz.
\]
There are at least three different ways of solving this problem. Stop reading and try to think of as many ways as you can.
The most obvious (but also the most difficult) way is to set two subgoals: (a) that of finding the probability density function, $g(z)$, for the new random variable $z = xy$ and (b) that of plugging this probability density function into the definition for the rth raw moment of $z$.
Alternatively, let us assume that we know a simple generalization of the above formula that extends it in order to find the rth moment of a function of random variables, $h(x, y)$, where the joint probability density function for the random variables $x$ and $y$ is represented by $f(x, y)$. The formula is
\[
\mu_{r:h(x,y)} = \iint [h(x, y)]^r f(x, y)\,dx\,dy.
\]
Now stop reading and try to solve the problem, if you did not do so before.
In this case, we can compute the rth moment of the function $h(x, y) = xy$, provided we know the probability density function of the joint distribution of $x$ and $y$. Thus, the first subgoal is to determine this function, $f(x, y)$. We must assume as implicit information (though it was not specifically stated in the problem) that the two sample observations ($x$ and $y$) are independent. Knowing these observations are independent and knowing the density functions for each, we get the joint density function $f(x, y) = f(x)\,f(y) = 1$ for $0 \le x, y \le 1$, and $f(x, y) = 0$ elsewhere. From this point on, the solution is a simple integration, as follows:
\[
\mu_{r:z} = \mu_{r:h(x,y)} = \int_0^1\!\!\int_0^1 (xy)^r\,dx\,dy = \int_0^1 x^r\,dx \int_0^1 y^r\,dy = \frac{1}{(r + 1)^2}.
\]
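The closed form 1/(r + 1)^2 can be checked against a direct numerical evaluation of the double integral. The sketch below uses a plain midpoint rule on the unit square; the function name and the grid size are arbitrary choices made for illustration, and the joint density is taken to be 1 on the square, as assumed above.

```python
def raw_moment_of_product(r, n=200):
    """Midpoint-rule estimate of the double integral of (x*y)**r over the
    unit square, with joint density equal to 1 there."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in mids:
        for y in mids:
            total += (x * y) ** r
    return total * h * h

for r in (1, 2, 3):
    # numerical estimate next to the closed form 1 / (r + 1)^2
    print(r, raw_moment_of_product(r), 1.0 / (r + 1) ** 2)
```

The two columns agree to several decimal places, which is what the factorization of the double integral into two identical single integrals predicts.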
Finally, one can set a totally different subgoal of finding the moment generating function for the new variable $z$ and differentiating the moment generating function r times to find the rth moment of $z$. In solving the problem by this method, we need to know more specific background information (such as the definition of a moment generating function and the Taylor series expansion for the exponential function), but otherwise the problem is solved in a straightforward manner by this method as well.
The principal general problem-solving method used to solve this problem was the setting of subgoals. A variety of such subgoals were logically related to the solution of the problem: namely, that of deriving the probability density function $g(z)$, that of deriving the joint distribution function $f(x, y)$, or that of deriving the moment generating function for $z$. Setting the subgoals in each case is a part of an overall calculative plan for solving the problem in each of the three cases. For example, a first step in solving the problem might be to write down the formula for the rth moment of $z$ in terms of a double integral of $(xy)^r$ and the joint probability density function $f(x, y)$. This would suggest that we needed to determine the joint probability density function as a subgoal in order to do the integration and solve the problem.
The lengths of two parts, A and B, are normally distributed with means 2 centimeters and 4 centimeters and standard deviations $\sigma_A = 0.03$ centimeter and $\sigma_B = 0.04$ centimeter. One A piece and one B piece are randomly assembled and laid end to end to form a length about 6 centimeters long. If the assembly is to fit certain quality control standards, it must be between 5.91 and 6.09 centimeters long. What percentage of such assemblies will fail to fall within these limits?
We set as a subgoal that we must determine the distribution function for the sum of two random variables, A and B. We know from background statistical knowledge that, if A and B are independent, normally distributed random variables, then A + B will be a normally distributed random variable with a mean equal to the sum of the means of the component random variables and a variance equal to the sum of the variances. Thus, $\mu_{A+B} = \mu_A + \mu_B = 6$ centimeters and $\sigma_{A+B} = (\sigma_A^2 + \sigma_B^2)^{1/2} = 0.05$ centimeter. Having achieved the subgoal of determining the distribution function for the random variable A + B, we now set the second and final subgoal to be to determine what percentage of the distribution lies outside a region of 0.09 centimeter on either side of the mean. This subgoal can be achieved from a table of the normal distribution, provided we know how many standard deviations are represented by 0.09 centimeter. To determine this amount we simply divide 0.09 by 0.05 to get 1.8 standard deviation units, telling us that we are asking for the percentage of cases falling in the two tails of a normal distribution 1.8 standard deviation units out from the mean. Looking up the value $z = 1.8$ in a table of the normal distribution gives the figure of 3.6 percent in one tail or 7.2 percent in both tails. Once again, the solution of the problem proceeds from our setting a series of one or more subgoals that, taken together, constitute the solution to the entire problem.
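The two subgoals translate directly into a few lines of code. This is a minimal sketch, assuming the two lengths are independent (so the variances add) and using the standard library's complementary error function in place of a printed normal table; the function name and argument names are illustrative.

```python
import math

def two_tail_fraction(mean_a, sd_a, mean_b, sd_b, low, high):
    """Fraction of A + B assemblies falling outside [low, high],
    assuming A and B are independent normal random variables."""
    # Subgoal 1: distribution of the sum.
    mean = mean_a + mean_b
    sd = math.sqrt(sd_a ** 2 + sd_b ** 2)

    # Subgoal 2: area in the two tails, via the normal CDF.
    def cdf(x):
        return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))

    return cdf(low) + (1.0 - cdf(high))

frac = two_tail_fraction(2.0, 0.03, 4.0, 0.04, 5.91, 6.09)
print(round(100 * frac, 1))  # about 7.2 percent
```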
Determine one way in which the random variable z might have been formed, where the moment generating function of z is
\[
M_z(\theta) = \frac{e^{4\theta + (\theta^2/2)}}{(1 - 2\theta)^{3/2}}.
\]
In contrast to earlier problems, in this one the goal is specified but the givens are not, and we must determine some set of givens such that the goal (namely, the moment generating function) can be derived as a consequence. The obvious way to check out any hypothesized set of givens is to use the method of contradiction. In addition, we should use the method of working backward from the goal expression, since this is a unique starting point in the problem. The most relevant piece of background information is that the moment generating function of a sum of independent random variables is the product of the moment generating functions of each component random variable. Stop reading and try again to solve the problem, if you did not before.
By examining a table of such moment generating functions, we can quickly exclude the possibility that $z$ is itself a random variable with a simple standard distribution function.
The next simplest hypothesis would be that $z$ is the sum of two random variables, each of which has a simple familiar distribution function. This being the case, we should work backward from the goal expression by factoring it into two components, each of which is a moment generating function for a familiar distribution function. The most obvious split of the goal moment generating function is probably to multiply the numerator times the reciprocal of the denominator. It turns out that the numerator is the moment generating function for a normal distribution with mean 4 and standard deviation 1, and the reciprocal of the denominator is the moment generating function for a random variable of the $\chi^2$ distribution on 3 degrees of freedom, which means that $z$ is the sum of these two random variables.
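The proposed factorization can be checked numerically. The sketch below assumes the goal function is exp(4t + t^2/2) / (1 - 2t)^(3/2), writes the two standard moment generating functions as separate functions (the names are illustrative), confirms that their product reproduces the goal at a few points, and estimates the mean of z from the slope of the moment generating function at zero.

```python
import math

def mgf_z(t):
    """Goal expression, assumed to be exp(4t + t^2/2) / (1 - 2t)^(3/2)."""
    return math.exp(4 * t + t * t / 2) / (1 - 2 * t) ** 1.5

def mgf_normal(t, mean=4.0, sd=1.0):
    """Moment generating function of a normal distribution."""
    return math.exp(mean * t + (sd * t) ** 2 / 2)

def mgf_chi_square(t, dof=3):
    """Moment generating function of a chi-square distribution (t < 1/2)."""
    return (1 - 2 * t) ** (-dof / 2)

def factorization_holds(points=(-0.3, -0.1, 0.0, 0.1, 0.3)):
    return all(
        math.isclose(mgf_z(t), mgf_normal(t) * mgf_chi_square(t))
        for t in points
    )

def mean_from_mgf(h=1e-5):
    """First moment = slope of the moment generating function at zero."""
    return (mgf_z(h) - mgf_z(-h)) / (2 * h)

print(factorization_holds())
print(mean_from_mgf())  # close to 4 + 3 = 7
```

The estimated mean, about 7, matches the sum of the normal mean (4) and the chi-square mean (its 3 degrees of freedom), as it should for a sum of two random variables.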
Had this particular factorization not worked, there are a number of other simple factorizations of the moment generating function that might have been matched for form against our table of moment generating functions for familiar distributions. In many ways, the solution of the problem is terribly simple. We can ask, "How can we not start with the goal expression, which is the only given in the problem other than implicit given information?" Indeed, if we start to manipulate the goal expression and know the relevant background information about moment generating functions, it is difficult to see how we can fail to solve the problem. Nevertheless, many people do fail to solve this problem and other equally simple problems, because they have no idea what to do. In many cases, they are genuinely deficient in important background information, but those who knew they had to work backward from the goal expression in the present problem would likely look up in books the relevant information about moment generating functions that was needed in order to solve the problem. Those who have a thorough knowledge of the specific subject matter probably need to have no conscious understanding of general problem-solving methods in order to solve this and many other problems. However, those who are learning the specific subject matter will be aided in this learning by a thorough knowledge of general problem-solving methods, which suggest what types of information are needed in order to solve problems.
Again, in the problem with the two pieces A and B that are joined end-to-end to form a new combined piece that must fall within certain tolerance limits, students might lack the specific background information about the distribution function of the sum of two normally distributed random variables. However, having clearly defined the subgoal of determining such a distribution function in order to determine the percentage of cases that lie in its tails, it is likely that students would look for the directly relevant piece of information they lacked.
Formulas for getting certain information from other information often automatically provide you with a set of subgoals: namely, that of determining the values of the various components of these formulas. Thus, if you have enough specific background information to know the appropriate general formulas, you can often substitute that information for an understanding of general problem-solving methods in those cases where you know some general formula that encompasses all the aspects of the problem. However, if no such formula exists or if you do not know it, understanding general problem-solving techniques can be quite crucial in devising an adequate plan to solve the problem.
COMBINATORIAL ANALYSIS
How many ways can a set of contestants consisting of four men, three women, two boys, and three girls be selected from an audience consisting of eight men, nine women, six boys, and six girls?
Stop reading and try to solve the problem.
To solve the problem, we might set a series of subgoals; that is, we might determine how many ways there are to pick first the men alone, then the women alone, then the boys alone, and then the girls alone. Let us call the solutions to these four subproblems $N_M$, $N_W$, $N_B$, and $N_G$. Stop reading and try again to solve the problem, if you did not before.
The number of ways to pick an entire set of contestants is simply the product $N_M \cdot N_W \cdot N_B \cdot N_G$. Each of the subgoals is a simple combinations problem (unordered sets obtained by sampling without replacement). Thus, the total number of ways is simply
\[
\binom{8}{4}\binom{9}{3}\binom{6}{2}\binom{6}{3}.
\]
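The product of the four subgoal counts can be computed directly. A minimal sketch using the standard library's `math.comb`; the dictionary layout and the function name are illustrative choices.

```python
from math import comb

audience = {"men": 8, "women": 9, "boys": 6, "girls": 6}
wanted = {"men": 4, "women": 3, "boys": 2, "girls": 3}

def ways_to_select(audience, wanted):
    """Multiply the number of unordered selections for each group."""
    total = 1
    for group, k in wanted.items():
        total *= comb(audience[group], k)  # one subgoal per group
    return total

print(ways_to_select(audience, wanted))  # 70 * 84 * 15 * 20 = 1764000
```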
John and Fred agree to play a tennis match, with the winner to be the person who first wins two sets in a row or a total of three sets. Find the number of ways the match can occur.
Stop reading and try to solve the problem.
The most straightforward way to determine the number of ways the match can occur is to construct a tree diagram, marking all terminals of the tree where either one of the conditions is first satisfied and stopping the growth of the tree from that point on. The tree has two branches at each node: namely, A wins or B wins (writing A and B for the two players).
Alternatively, we can determine the answer without explicitly constructing the tree, by the following line of reasoning. We first make certain inferences from the information given in the problem: namely, that the match cannot end before two sets have been played and must end after a maximum of five sets have been played (since out of five sets one player must win at least three sets). Having made these inferences, the problem of determining the number of ways the match can occur can be reduced to a set of four subproblems: namely, we must determine how many ways the match can end after two sets, three sets, four sets, or five sets. Stop reading and try again to solve the problem, if you did not before.
Clearly, there are only two ways the match can end after two sets: namely, A wins both sets or B wins both sets. There are also only two ways the match can end after three sets: A wins the first set and B wins the next two, or B wins the first set and A wins the next two. Now we might note that, in general, at each level of the tree from the second up to the last, there will be exactly two terminal nodes and two nonterminal nodes under the rule that the winning player must win two sets in a row. At the last level, there will be four terminal nodes, since, by the three-set rule, all nodes must be terminal once five sets have been played. Thus, there are two terminal nodes after two sets, two terminal nodes after three sets, two terminal nodes after four sets, and four terminal nodes after five sets, or 10 terminal nodes in all (and so 10 ways the match can occur).
The principal general problem-solving methods used in solving the problem were inference and the subgoal method. Note that although the specialized method of explicitly constructing a tree diagram will also solve this problem without the need for using the more general subgoal method, the subgoal method in combination with certain inferences generalizes easily to problems in which to construct an explicit tree would be extremely laborious.
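For a tree this small, the explicit construction can also be handed to a short recursive function. The sketch below grows the tree one set at a time and stops a branch as soon as either stopping rule is met; the function name and the string encoding of a match (one letter per set) are illustrative assumptions.

```python
def count_matches(history=""):
    """Count the distinct set-by-set sequences of the tennis match.

    history holds one letter per set played, "A" or "B". A branch is
    terminal when the last two sets went to the same player or when
    either player has won three sets in total.
    """
    if len(history) >= 2 and history[-1] == history[-2]:
        return 1
    if history.count("A") == 3 or history.count("B") == 3:
        return 1
    # Nonterminal node: two branches, A wins the next set or B does.
    return sum(count_matches(history + winner) for winner in "AB")

print(count_matches())  # 10
```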
John pays a quarter to play a simple coin-flipping game against a gambling casino. The quarter entitles him to play a maximum of five coin flips against the house. John wins $1 every time he calls the coin correctly (heads or tails) and loses $1 every time he calls the coin incorrectly. John begins with $3 and will stop playing whenever he loses his entire stake or wins $3 (that is, has a total of $6). Of course, he must quit after playing a maximum of five coin flips. Find the number of ways the playing can occur.
A first general problem-solving method we might use is to note the similarity to the previous problem. This similarity leads to the conjecture that we could solve the problem either by constructing an explicit tree diagram or by making certain inferences about the tree diagram and then breaking up the problem into subproblems to determine how many ways the playing can occur, stopping after n coin flips, for all n up to 5. Stop reading and try again to solve the problem, if you did not before.
One simple inference is that John must play for at least three coin flips, since, at worst, he will lose $1 on each coin flip, and he has $3 to play with. Since he can play at most for five coin flips for his original quarter, we know that we can break the problem into three subproblems: namely, to determine how many ways the playing will stop after three, four, and five coin flips. Clearly, there are exactly two ways the playing will stop after three coin flips (two terminal nodes), leaving six nonterminal nodes after three coin flips. Another relevant inference is that it is impossible to be ahead or behind by an even number of dollars after any odd number of coin flips (such as three flips). Thus, it is impossible for John to be either even or ahead or behind by $2 after three flips. Hence, the six nonterminal nodes must all involve different sequences of winnings and losings that total either +$1 or -$1 and, by symmetry, there must be three of each type. From this we can conclude that there are no terminal nodes after four coin flips and 12 nonterminal nodes. The 12 nonterminal nodes at level 4 lead to 24 nodes at level 5, all of which, by definition, must be terminal. Thus, there are precisely 26 different ways the playing can occur.
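The same count can be obtained by letting a short recursion walk the tree. A minimal sketch, with illustrative names: `stake` is John's current holding in dollars, `flips_left` is the number of flips he may still play, and `goal` is the holding at which he stops because he has won.

```python
def count_plays(stake=3, flips_left=5, goal=6):
    """Count the distinct win/lose sequences of the coin-flipping game."""
    # Terminal node: stake lost, goal reached, or no flips remaining.
    if stake == 0 or stake == goal or flips_left == 0:
        return 1
    # Nonterminal node: John either wins $1 or loses $1 on the next flip.
    return (count_plays(stake + 1, flips_left - 1, goal)
            + count_plays(stake - 1, flips_left - 1, goal))

print(count_plays())  # 26
```

Changing the arguments shows how the count depends on the stake and the flip limit, cases in which drawing the tree by hand would quickly become laborious.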
Binomial theorem. Prove that
\[
(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^{r}, \qquad \text{where } \binom{n}{r} = \frac{n!}{r!\,(n - r)!}.
\]
The basic problem-solving method to be used is mathematical induction, which we have already noted involves a combination of the general problem-solving methods of special case (proving the theorem true for $n = 1$) and the subgoal method (dividing the proof of the theorem into two parts: proving it true for $n = 1$ and showing that, if it is true for $n = k$, the theorem is true for $n = k + 1$).
Stop reading and try to solve the problem.
The theorem is trivially true for $n = 1$, so the crux of the proof consists of assuming that the theorem holds for $n = k$ and proving it is true for $n = k + 1$. To prove this lemma, we assume the theorem is true for $n = k$ and multiply both sides of the equation by $(a + b)$. This operation yields the term $(a + b)^{k+1}$ on the left side of the equation, as desired. The right side of the equation will clearly involve exactly $k + 2$ terms of the form $a^{k+1-r} b^{r}$, where $r$ goes from zero to $k + 1$, as desired. This is obvious by inspection. What remains is to prove that the coefficients of each $a^{k+1-r} b^{r}$ term have the form $\binom{k+1}{r}$. Except for the terms $a^{k+1}$ and $b^{k+1}$, which arise only once in the multiplication of $(a + b) \sum_{r=0}^{k} \binom{k}{r} a^{k-r} b^{r}$, every other $a^{k+1-r} b^{r}$ term arises in two places. The term in the product that contains $b^{r}$ is obtained from
\[
a \cdot \binom{k}{r} a^{k-r} b^{r} + b \cdot \binom{k}{r-1} a^{k-r+1} b^{r-1} = \left[ \binom{k}{r} + \binom{k}{r-1} \right] a^{k+1-r} b^{r}.
\]
All that remains is to show that
\[
\frac{k!}{r!\,(k - r)!} + \frac{k!}{(r - 1)!\,(k - r + 1)!} = \frac{(k + 1)!}{r!\,(k + 1 - r)!}.
\]
This remaining subgoal is trivially established by algebraic combination of the two fractions, and the theorem is proved.
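The inductive step can be mirrored in a few lines of code: multiplying a row of coefficients by (a + b) adds each coefficient to its neighbor, and the result can be compared with the factorial formula. This is a sketch for illustration only; the function name and the list representation (position r holds the coefficient of the term containing b to the power r) are assumptions made here.

```python
from math import comb

def multiply_by_a_plus_b(coeffs):
    """Given coefficients of (a + b)^n, indexed by the power of b,
    return the coefficients of (a + b)^(n + 1)."""
    n = len(coeffs) - 1
    out = [0] * (n + 2)            # the product has n + 2 distinct terms
    for r, c in enumerate(coeffs):
        out[r] += c                # multiplying by a keeps the power of b
        out[r + 1] += c            # multiplying by b raises it by one
    return out

coeffs = [1, 1]                    # the special case n = 1
rows_match = True
for n in range(1, 10):
    rows_match = rows_match and coeffs == [comb(n, r) for r in range(n + 1)]
    coeffs = multiply_by_a_plus_b(coeffs)

print(rows_match)
```

Each interior entry of the new row is the sum of two adjacent entries of the old row, which is the coefficient identity that the final subgoal of the proof establishes.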
Besides using the method of mathematical induction, which we have noted is an application of two general problem-solving methods, our implementation of the proof involved breaking the problem into two parts: first, determining that there were the correct number and type of terms $a^{k+1-r} b^{r}$ on the right side of the equation and, second, determining that the coefficients of each such term were of the proper form, namely, $\binom{k+1}{r}$. Achieving the final subgoal of showing that $\binom{k}{r} + \binom{k}{r-1} = \binom{k+1}{r}$ could be said to involve hill climbing on a two-dimensional evaluation function consisting of the number of terms in the numerator and the denominator of the coefficient of $a^{k+1-r} b^{r}$. In the goal, the coefficient of $a^{k+1-r} b^{r}$ is a simple fraction consisting of one factorial in the numerator divided by two factorials in the denominator. The expression $\frac{k!}{r!\,(k-r)!} + \frac{k!}{(r-1)!\,(k-r+1)!}$ clearly involves two separate factorial fractions that must be combined into a single factorial fraction with two factorials in the denominator and one in the numerator. By analogy to similar problems, we first express both fractions in terms of the same denominator by multiplying numerator and denominator of each fraction by the appropriate numbers. Then we add the numerators, putting them over the common denominator, and factor the numerator to obtain a coefficient of $a^{k+1-r} b^{r}$ of the desired form. Of course, most problem solvers engaged in solving problems of this type will have practiced the latter sequence of operations so well in hundreds of preceding problems that they will hardly need to think of applying any general problem-solving method in order to implement the algebraic solution.
The evaluation function used in defining the initial breakup into subproblems (determining whether there were the right number of $a^{k+1-r} b^{r}$ terms and determining whether the coefficients matched) comes straight from characterization of the right side of the goal expression, namely $\sum_{r=0}^{k+1} \binom{k+1}{r} a^{k+1-r} b^{r}$. After multiplying $(a + b) \sum_{r=0}^{k} \binom{k}{r} a^{k-r} b^{r}$, we obtain $2(k + 1)$ terms, and these terms must be reduced to $k + 2$ terms of the proper form, in order to achieve the goal. This reduction can be subdivided into two parts: first we achieve the right number of terms having the proper $a^{k+1-r} b^{r}$ components and then we determine if the coefficients of these terms match the desired coefficients in the goal expression. Note that in order to define the subgoals, it is not necessary to explicitly define any single numerical or vector-valued evaluation function. All that is necessary is that we have a more or less explicit awareness of some of the dimensions on which the goal expression differs from the given expression and define subgoals on the basis that they match the goal expression on more dimensions than the given expression.
NUMBER THEORY
Prove that if $(2^N - 1)$ is a prime number, then $N$ is a prime number.
This problem and its proof were given to me by Al Stevens. Stop
reading and try to solve the problem.
A good general problem-solving method to apply initially is the
method of contradiction. This method is suggested by the existence
of two simple alternatives for $N$: either it is prime or it is not
prime. If it is not prime, it is expressible as the product of two
integer factors, neither of which equals unity. If $N$ being not prime
in conjunction with $2^N - 1$ being prime can be shown to yield the
contradictory conclusion that $2^N - 1$ is not prime, then the original
theorem will be established. Stop reading and try again to solve the
problem, if you did not do so before.
To implement the method we must show that $N$ not prime implies that
$2^N - 1$ is not prime. If $N$ is not prime, then $N = jk$, where $j > 1$
and $k > 1$. Under these circumstances $2^N - 1$ can be factored into
$(2^j - 1)(2^{j(k-1)} + 2^{j(k-2)} + \cdots + 2^j + 1)$.
This latter is established by simple division of $(2^j - 1)$ into
$(2^N - 1)$. Since $j > 1$ and $k > 1$, both factors exceed 1.
Thus, $2^N - 1$ is not prime, contradicting the given information.
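The factorization used in the proof can be checked numerically for small cases. The following is a minimal sketch, not part of the original proof; the function name `mersenne_factor_pair` and the ranges tested are chosen here for illustration only.

```python
def mersenne_factor_pair(j: int, k: int) -> tuple[int, int]:
    """Return the two factors of 2**(j*k) - 1 that appear in the proof."""
    first = 2**j - 1
    # Geometric sum 2**(j*(k-1)) + ... + 2**j + 1
    second = sum(2 ** (j * i) for i in range(k))
    return first, second


# Check the identity for every composite exponent N = j*k with 2 <= j, k <= 6.
for j in range(2, 7):
    for k in range(2, 7):
        first, second = mersenne_factor_pair(j, k)
        assert first * second == 2 ** (j * k) - 1
        assert first > 1 and second > 1  # so 2**N - 1 is not prime
```

A finite check of this kind does not replace the division argument, but it is a quick way to catch an algebra slip in the second factor.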
About the only general problem-solving method I can suggest that
might give you the idea of trying to use the factor $(2^j - 1)$ would be
general experience with problems involving similar expressions,
namely, those of the form $a^n - b^m$. Of course, there are not too many
obvious factors to try to use other than either $(2^j - 1)$ or
$(2^k - 1)$, either of which will do. Thus, once you decide to use the
method of contradiction, the rest of the problem is relatively
straightforward.
MODERN ALGEBRA
Given that the positive integers are well ordered (each nonempty subset
of the integers contains exactly one smallest integer), prove that there
is no integer between 0 and 1.
Use the method of contradiction. Now try again to solve the problem,
if you did not do so before.
Assume that there are one or more integers between 0 and 1. By the
well-ordering property, there is in this nonempty set of integers
between 0 and 1 some least integer, $M$, for which $0 < M < 1$.
Multiplying both sides of these inequalities by the number $M$ we have
$0 < M^2 < M$. Thus, $M^2$ must be another integer in the class of
integers between 0 and 1 and, furthermore, $M^2 < M$, which contradicts
the assumption that $M$ was the least integer between 0 and 1. Since
the contradiction was reached by assuming the existence of integers
between 0 and 1, this implies that there is no integer between 0 and 1.
You are given a set of elements, $G$, with a binary operation, $*$,
defined over $G$ such that $G$ is a group. The definition of the group is
a set of elements with a binary operation such that all of the following
four properties hold. (1) Closure: for all $a$ and $b$ in $G$, $a * b$ is
a member of $G$. (2) Associativity: for all $a$, $b$, and $c$ in $G$,
$(a * b) * c = a * (b * c)$. (3) Left identity: there exists an $e$ in $G$
such that $e * a = a$ for all $a$ in $G$. (4) Left inverse: for all $a$
in $G$, there exists an $a^{-1}$ in $G$ such that $a^{-1} * a = e$. For
such a system prove the following theorem: Unique left inverse: for all
$a$ in $G$, there exists a unique $a^{-1}$ in $G$ such that
$a^{-1} * a = e$.
Stop reading and try to prove the above theorem.
Whenever we encounter a uniqueness proof, the method of contradiction
is immediately suggested. That is, we should assume that there exist
two different left inverses $a_1^{-1}$ and $a_2^{-1}$ such that
$a_1^{-1} * a = e$ and $a_2^{-1} * a = e$ and attempt to show that
$a_1^{-1} = a_2^{-1}$ (contradicting the assumption that $a_1^{-1}$ and
$a_2^{-1}$ are different). Stop reading and try again to solve the
problem.
It is trivial to conclude that, if $a_1^{-1} * a = e$ and
$a_2^{-1} * a = e$, then $a_1^{-1} * a = a_2^{-1} * a$. However, it may
appear somewhat difficult to peel off the identical $a$'s from the
right-hand end of each side of the equation (since we have not already
proved any right cancellation law). Therefore, we might set as a subgoal
(lemma) to prove the right cancellation law, namely, that for all $a$,
$b$, and $c$ in $G$, $b * a = c * a$ implies that $b = c$. However, this
law is certainly not easier to prove than the unique left inverse
theorem itself. Thus, this subgoal appears unlikely to be useful in the
present problem. However, there is another subgoal (conjectured lemma)
that is easier to establish and thus facilitates solution of the present
problem. This lemma does allow us in essence to peel off the $a$ from
the equation $a_1^{-1} * a = a_2^{-1} * a$. Try to conjecture this lemma
(subgoal) and then prove it, if you have not done so already.
The useful lemma (subgoal) is that the left inverse of an element
in a group is the same as the right inverse, namely, $a^{-1} * a = e$
implies that $a * a^{-1} = e$. Clearly, if this lemma were true, it would
permit us to multiply both sides of the equation
$a_1^{-1} * a = a_2^{-1} * a$ on the right by the quantity $a^{-1}$ and
change the identical $a$'s into $e$'s on both sides of the equation.
Since we have special given information regarding $e$'s, they might be
easier to peel off than $a$'s. Stop reading and try to prove the lemma
that the left inverse equals the right inverse.
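Proof of this lemma involves use of the inference method; that is,
we simply perform substitution operations on the quantity $a * a^{-1}$
to attempt to show that $a * a^{-1} = e$. The exact proof of this lemma
follows.
Let $a^{-1} * a = e$; then there exists $(a^{-1})^{-1}$ in $G$ such that
$(a^{-1})^{-1} * a^{-1} = e$.
$a * a^{-1} = e * (a * a^{-1})$
$= ((a^{-1})^{-1} * a^{-1}) * (a * a^{-1})$
$= (a^{-1})^{-1} * ((a^{-1} * a) * a^{-1})$
$= (a^{-1})^{-1} * (e * a^{-1})$
$= (a^{-1})^{-1} * a^{-1}$
$= e$
Thus, $a * a^{-1} = e$ whenever $a^{-1} * a = e$. Q.E.D.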
Now stop reading and try to prove the rest of the theorem, if you
have not done so already.
The first lemma does not quite permit us to prove the theorem in a
straightforward way, since what we obtain is an expression of the form
$a_1^{-1} * e = a_2^{-1} * e$ and we are not yet justified in dropping
the $e$'s from both sides of the equation. We have been given the left
identity property but not the right identity property. Therefore, it is
necessary to set a second subgoal (lemma) of proving the right identity
property for a group, namely, that $e * a = a$ implies that $a * e = a$
for all $a$ in $G$. Stop reading and prove this second lemma and then
continue to prove the rest of the theorem, if you have not done so
already.
The proof of the second lemma is quite trivial, again by the inference
method, and is given below:
$a * e = a * (a^{-1} * a)$
$= (a * a^{-1}) * a$
$= e * a = a$. Q.E.D.
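Given the above two lemmas, the proof of the original theorem
concerning the uniqueness of the left inverse is quite trivial and is
given below:
$a_1^{-1} * a = e$ and $a_2^{-1} * a = e$
$a_1^{-1} * a = a_2^{-1} * a$
$(a_1^{-1} * a) * a^{-1} = (a_2^{-1} * a) * a^{-1}$
$a_1^{-1} * (a * a^{-1}) = a_2^{-1} * (a * a^{-1})$
But, by the first lemma, $a * a^{-1} = e$. Therefore,
$a_1^{-1} * e = a_2^{-1} * e$ and, by the second lemma,
$a_1^{-1} = a_2^{-1}$ (Q.E.D.).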
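The theorem and its two lemmas can be spot-checked on a concrete finite group. The sketch below assumes the symmetric group on three objects as the example; the names `ELEMENTS`, `op`, `IDENTITY`, and `left_inverses` are hypothetical helpers introduced here, and an exhaustive check on one group is an illustration, not a proof.

```python
from itertools import permutations

# Elements of the symmetric group on {0, 1, 2}: permutation p maps i to p[i].
ELEMENTS = list(permutations(range(3)))
IDENTITY = (0, 1, 2)


def op(p, q):
    """Group operation p * q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))


def left_inverses(a):
    """All x in the group with x * a equal to the identity."""
    return [x for x in ELEMENTS if op(x, a) == IDENTITY]


for a in ELEMENTS:
    candidates = left_inverses(a)
    assert len(candidates) == 1         # theorem: the left inverse is unique
    inverse = candidates[0]
    assert op(a, inverse) == IDENTITY   # first lemma: it is also a right inverse
    assert op(a, IDENTITY) == a         # second lemma: e is also a right identity
```

The group is non-commutative, so the agreement of left and right inverses here is not an accident of commutativity.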
MECHANICS
What constant force will cause a mass of 3 kilograms to achieve the speed
of 20 meters per second in 6 seconds starting from rest? Relevant
background information is Newton's second law: $f = ma$, where
$a = dv/dt = d^2x/dt^2$.
Also relevant is some very elementary knowledge of calculus.
Since we know a formula for the goal quantity (force), the first step
is to work backward from the goal and write down the known formula
for force, namely, $f = ma$. Since we know the mass ($m$), we
immediately define as a subgoal the determination of the acceleration
$a$. Stop reading and try again to solve the problem, if you did not do
so before.
By definition, $a = dv/dt$ and $a = d^2x/dt^2$. We choose to work with
the former formula, since the given information involves velocities
($v$) and not positions ($x$). Since we do not know $dv/dt$, but only
certain values of the velocity at the beginning and end of the motion
($v_0 = 0$ and $v_6 = 20$ meters per second), we set another subgoal of
transforming the equation $a = dv/dt$ into an equation relating $a$ to
the known quantities $v_0$ and $v_6$. Stop reading and try again to
solve the problem, if you have not done so already.
Elementary knowledge of calculus tells us that this solution is
achieved by use of the integration operation, yielding $dv = a\,dt$,
$\int dv = \int a\,dt$ and, since $a$ is a constant over time,
$v = at + C$. The constant of integration $C = v_0 = 0$, since $v = 0$
at $t = 0$. Thus, we have the formula $v = at$, which implies that
$a = v/t = 20/6 = 10/3$ meters/sec$^2$. Having achieved the subgoal of
determining the acceleration, the rest is simple. Since the mass equals
3 kilograms, force equals $3 \times 10/3 = 10$ newtons.
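The working-backward chain (force from acceleration, acceleration from the two known velocities) can be written as a short calculation. This is a sketch assuming constant acceleration; the function name `constant_force` and its parameter names are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def constant_force(mass_kg, v_final, t_final, v_initial=0):
    """Force f = m*a, with a = (v_final - v_initial) / t_final for constant a."""
    acceleration = Fraction(v_final - v_initial, t_final)  # subgoal: find a
    return mass_kg * acceleration                          # goal: f = m*a


force = constant_force(mass_kg=3, v_final=20, t_final=6)
print(force)  # 10
```

Exact fractions are used so that the intermediate value 10/3 is not rounded before it is multiplied by the mass.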
The principal problem-solving methods used were (a) that of working
backward to determine the principal subgoal in the problem, namely,
determining the (constant) acceleration, and (b) that of hill climbing in
the determination of acceleration, using an evaluation function
concerned with how close to given quantities the quantities were on the
right side of the equation. By this latter evaluation function, we choose
$dv/dt$ over $d^2x/dt^2$, since the former is at least in some way
concerned with a quantity (velocity) that is known at some points. By
contrast, the latter expression is concerned with position, about which
nothing is known at all in the given information. Furthermore, we choose
to transform $dv/dt$, which is a statement about the derivative of
velocity, into a statement about $v$'s at certain points in time, since
the latter are directly known from the given information and the former
is not.
HEAT
A calorimeter contains 500 grams of water and 300 grams of ice, all at a
temperature of 0°C. A 1,000 gram mass of an unknown substance is
taken from a furnace where its temperature was 240°C and is dropped
immediately into the calorimeter. As a consequence, all the ice is just
melted with the temperature of the water remaining at 0°C. What would be
the final temperature of the water had the mass of the unknown substance
been 2,000 grams? Neglect heat loss from the calorimeter and the heat
capacity of the calorimeter. The relevant background information for
solving this problem consists of the following. The heat of fusion of
water equals 80 cal/gm, which means that 80 calories of heat must be
supplied to convert 1 gram of ice at 0°C to 1 gram of water at 0°C.
Materials are considered to have an approximately constant specific heat
capacity ($c$) over modest ranges of temperatures (consider the ranges
discussed in the present experiment to be "modest"). When a body changes
temperature, the heat gained or lost equals the mass of the body times
the specific heat capacity times the difference in temperature (in
degrees C). Finally, under the conditions of calorimeter experiments, the
law of conservation of heat holds, namely, heat lost equals heat gained.
It is desirable to introduce efficient symbolic notation to represent
the unknown quantities in the present problem. Let $t_2$ be the
temperature of the system (the 2,000 gram substance and the water in the
calorimeter) after the 2,000 gram substance has been dropped into the
water and allowed to reach equilibrium. Let $c_x$ be the specific heat
capacity of the unknown substance. The goal is to solve for $t_2$, but to
do so, we must set a subgoal. What is it? Stop reading and try to solve
the problem, if you have not done so already.
The obvious subgoal is to determine the specific heat capacity of the
unknown substance. This subgoal can evidently be achieved by using
the results from the first experiment, where a 1,000 gram mass of the
substance was just sufficient to melt 300 grams of ice without changing
its temperature. Using the formula that heat lost equals heat gained,
we know that $1000\,c_x\,(240 - 0) = 80 \times 300$ or $c_x = 0.1$. Stop
reading and solve the rest of the problem, if you have not done so
already.
Having solved for the specific heat capacity of the unknown substance,
it is now possible to apply the heat-lost-equals-heat-gained formula to
the results of the second experiment in order to derive the final
temperature of the system after the second experiment. The calculation
is as follows (the specific heat capacity of water is taken to be
1 cal/gm per degree C):
$(2000)(0.1)(240 - t_2) = 500(t_2 - 0) + 300(t_2 - 0) + (80)(300)$
This simple linear equation in one unknown is trivially solved to yield
$t_2 = 24$°C, which is the final temperature of the system.
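The two-step calculation (subgoal, then goal) can be sketched in code. The function and constant names below are hypothetical, and the specific heat capacity of water, 1 cal/gm per degree C, is an assumption that the book's equation uses implicitly rather than states.

```python
from fractions import Fraction

HEAT_OF_FUSION = 80  # cal/gm, from the problem statement
C_WATER = 1          # cal/(gm * deg C); assumed, implicit in the book's equation


def specific_heat(mass, t_start, ice_mass):
    """Subgoal: heat lost cooling to 0 deg C equals heat needed to melt the ice."""
    return Fraction(HEAT_OF_FUSION * ice_mass, mass * t_start)


def final_temperature(mass, c, t_start, water_mass, ice_mass):
    """Goal: solve mass*c*(t_start - t) = (water + ice)*C_WATER*t + fusion heat."""
    numerator = mass * c * t_start - HEAT_OF_FUSION * ice_mass
    denominator = mass * c + (water_mass + ice_mass) * C_WATER
    return numerator / denominator


c_x = specific_heat(mass=1000, t_start=240, ice_mass=300)
t_2 = final_temperature(mass=2000, c=c_x, t_start=240, water_mass=500, ice_mass=300)
print(c_x, t_2)  # 1/10 24
```

The second function is just the linear equation above solved for the unknown temperature, so the printed values can be compared directly with the hand calculation.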
Principal general problem-solving methods used in the present problem
were to label unknown quantities and to set a subgoal. Also, in a
problem of this type presented in a physics book, you would have to
generate the relevant background information, since it would not be
stated explicitly in the problem.
ELECTRICITY
Derive a formula for the electric-field intensity, $E$, established by a
charge distributed uniformly along an infinitely long line with a linear
charge density $\lambda$. The important background information includes
the following: The magnitude of the electric field produced by a point
charge of magnitude $q$ at a distance $r$ from the point charge is
$q/4\pi\epsilon_0 r^2$, where $\epsilon_0$ is a known universal constant
that is dependent upon the measuring units, and $\lambda = dq/dl$, where
$l$ represents position along the line. The direction of the
electric-field vector is radially out from the point charge. The electric
field produced at a point by a set of point charges is equal to the
vector sum of the electric field produced by all component point charges
at that point. Also relevant is some knowledge of elementary
trigonometry, vectors, and calculus.
The most relevant general problem-solving method to the solution
of this problem is for us to define subgoals (break up the problem into
parts). What is the first subgoal we might consider? Stop reading and
try again to solve the problem, if you did not before.
Although the problem asks us to describe the entire electric field
produced by the line (at an infinity of points in space), we know by
analogy to similar problems that this statement means we must derive
a formula for the electric field at some arbitrary point in space. Thus,
the problem is simplified by considering the electric field at only a
single (variable) point in space. Furthermore, symmetry indicates
that the only relevant information is the distance of the point from the
line, represented by $h$ in Fig. 11-2. Clearly, the electric field set up
by a charge distributed along a straight line must have cylindrical
symmetry (be equal in magnitude at all points at the same distance from
the line), since there is nothing different about the given information
for any such point. Stop reading and try again to solve the problem, if
you did not do so before.
Another useful general problem-solving technique would be to draw
a diagram representing the important information in the problem.
The problem can be broken into parts by defining another subgoal.
What subgoal might this be? Stop reading and try again to solve
the problem.
[Figure: diagram with an h axis and an l axis, marking a segment dl of
the line.]
FIGURE 11-2
Electric field produced by an infinite line
with linear charge density $\lambda$.
Since we know from a physical assumption that the electric field
at a point is equal to the sum of the contributions of the electric field
produced by all charges, it is relevant to attempt to determine the
individual contribution to the electric field at a point $h$ distant from
the line due to any little piece of charge along the line. Consider the
electric field produced at the point by the amount of charge present
along an infinitesimally small segment of the line $dl$. The charge in
this segment is $dq = \lambda\,dl$. Stop reading and try again to solve
the problem by first solving the subgoal, if you have not done so
already.
The contribution to the electric field $dE$ produced by the charge
$dq$ in an infinitesimally small segment of a line $dl$ is given by the
formula (which we know from background information)
$dE = dq/4\pi\epsilon_0 r^2 = \lambda\,dl/4\pi\epsilon_0 r^2$, where $r$
is the distance from the segment to the point.
Note that $dE$ should be a vector quantity, and we have only obtained
an expression for the magnitude of the vector. It is also necessary to
state the direction of the vector. According to background information,
this direction is evidently radially out from the point charge at $dl$,
as shown in Fig. 11-2. Having achieved the first subgoal, our next
subgoal is to combine the contributions to the field from all segments
$dl$ along the infinitely long line. This combining will evidently
involve an integration, since the segments are infinitesimally small,
rather than a summation where the contributions to the field are finite
in number. Stop reading and try to solve the rest of the problem, if you
have not done so already.
In attempting to combine separate contributions of each $dl$ along
the line, it is necessary to note that the directions of the vectors $dE$
produced by each $dl$ are different. Thus, we cannot simply integrate
the magnitudes of these vectors with respect to $l$ from minus infinity
to plus infinity. Instead, we must resolve each vector into $l$ and $h$
components and integrate each separately with respect to $l$ from minus
infinity to plus infinity. Thus, the next subgoal is to resolve the
electric field produced by each $dl$ into two components. Elementary
trigonometry applied to the previous formula for the electric field
produced by $dl$ yields the following components:
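$dE_l = dE \cos\theta = \frac{\lambda\,dl}{4\pi\epsilon_0 r^2} \cdot \frac{l}{r} = \frac{\lambda\,l\,dl}{4\pi\epsilon_0 (l^2 + h^2)^{3/2}}$
$dE_h = dE \sin\theta = \frac{\lambda\,dl}{4\pi\epsilon_0 r^2} \cdot \frac{h}{r} = \frac{\lambda\,h\,dl}{4\pi\epsilon_0 (l^2 + h^2)^{3/2}}$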
Note that in the final expressions for $dE_l$ and $dE_h$ we substituted
$(l^2 + h^2)^{1/2}$ for $r$ because we are intending to integrate with
respect to $l$, and $r$ is a function of $l$. Thus, we must express $r$
in terms of $l$. This sort of manipulation to eliminate unnecessary terms
by expressing them in other necessary terms is a form of hill climbing on
an evaluation function concerned with the number of unknown terms. Stop
reading and try to solve the rest of the problem, if you have not done so
already.
Of course, all that remains now is to actually perform the two
integrations to determine the $E_l$ and $E_h$ components of the field at
the point $h$ distant from the line. This is shown in the work below:

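$E_l = \int_{-\infty}^{\infty} \frac{\lambda\,l\,dl}{4\pi\epsilon_0 (l^2 + h^2)^{3/2}} = \frac{\lambda}{4\pi\epsilon_0} \left[ \frac{-1}{(l^2 + h^2)^{1/2}} \right]_{-\infty}^{\infty} = [0 + 0] = 0$
$E_h = \int_{-\infty}^{\infty} \frac{\lambda\,h\,dl}{4\pi\epsilon_0 (l^2 + h^2)^{3/2}} = \frac{\lambda h}{4\pi\epsilon_0} \left[ \frac{l}{h^2 (l^2 + h^2)^{1/2}} \right]_{-\infty}^{\infty} = \frac{\lambda}{4\pi\epsilon_0 h} [1 + 1] = \frac{\lambda}{2\pi\epsilon_0 h}$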
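The two integrations can be checked numerically. The sketch below is an illustration under stated assumptions: SI units with the standard value of the vacuum permittivity, a midpoint-rule sum, and the substitution l = h tan(phi) to map the infinite range onto a finite one. The names `field_components`, `lam`, and `EPS0` are introduced here and do not come from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (SI units assumed)


def field_components(lam, h, steps=20_000):
    """Midpoint-rule estimate of (E_l, E_h) for an infinite line charge.

    The substitution l = h*tan(phi) maps -inf < l < inf onto
    -pi/2 < phi < pi/2, with dl = h / cos(phi)**2 * dphi.
    """
    k = lam / (4 * math.pi * EPS0)
    dphi = math.pi / steps
    e_l = e_h = 0.0
    for i in range(steps):
        phi = -math.pi / 2 + (i + 0.5) * dphi
        l = h * math.tan(phi)
        dl = h / math.cos(phi) ** 2 * dphi
        r_cubed = (l * l + h * h) ** 1.5
        e_l += k * l * dl / r_cubed   # component along the line
        e_h += k * h * dl / r_cubed   # component perpendicular to the line
    return e_l, e_h


e_l, e_h = field_components(lam=1e-9, h=0.5)
expected = 1e-9 / (2 * math.pi * EPS0 * 0.5)
print(abs(e_l) < 1e-6 * expected, abs(e_h - expected) < 1e-6 * expected)
```

The component along the line cancels by symmetry, and the perpendicular component agrees with a closed form proportional to the charge density divided by the distance from the line.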

Principal problem-solving methods used in the solution of this
electrostatics problem were subgoals, representing information by
symbols and diagrams, symmetry (noticing equivalence classes), similarity
to previous problems, and perhaps some limited use of hill climbing.
We might even contend that the subgoal of computing the contribution
to the electric field at a point produced by a small quantity of charge
$dq$ distributed over a small segment of the line $dl$ constituted the
solution of a simpler problem and thus was an example of that general
problem-solving method, in addition to representing the subgoal method.
ELECTRICAL ENGINEERING
Givens: You have a limited supply of 2-input AND gates and 2-input OR
gates to use in constructing a variety of control circuits. A 2-input AND
or OR gate has two input wires and one output wire. All input levels and
output levels are either 0 or 1 (binary digital-logic circuits). A 2-input
AND gate has a 1 on the output wire, if and only if both input wires are
at the 1 level. A 2-input OR gate has a 1 on the output wire, if and only
if either one or both of its inputs is at the 1 level. In constructing
control circuits, it is important to know that you may connect the same
(source) wire to many different input wires of many different gates.
Also, the output wire of one gate may be connected to one or more input
wires of one or more gates in chains and even loops. In particular, the
output of a gate may be connected to one of its own inputs. However, you
may not connect two outputs. A somewhat related restriction is that you
must not connect two wires to the same input wire, whenever doing so
would complete an undesirable circuit between the two wires. In the
present problem, assume that all such circuits are undesirable and do not
connect two wires to the same input wire of a gate.
Goal: In the part of the circuit you are now constructing, there are 6
input wires and 15 output wires, each of which can be at the 0 level or
the 1 level. Only 15 patterns of 0's and 1's will ever occur on the 6
input wires. Your task is to choose the set of 15 input patterns and
construct a decoding circuit, using the minimum number of 2-input AND or
OR gates, such that when any one of these 15 input patterns occurs, one
and only one of the 15 output wires will be at the 1 level (the rest
being at 0). Naturally, a different output wire should be at the 1 level
for each of the 15 different input patterns.
The first step here, as in any problem, is to explore the problem,
deriving whatever conclusions can be derived easily. For example,
with 6 binary inputs, there are 64 possible input patterns, only 15 of
which are being used. Four binary input wires would suffice to present
15 different input patterns, so there must be some advantage in using
more input wires. It would be a good guess that it simplifies the
decoding circuit and minimizes the number of gates to employ 6 input
wires, rather than 4. Along the same line, it would be reasonable to
conjecture that the problem would be essentially solved if we knew which
15 input patterns to use.
If we were inclined toward number theory, we might inquire about
the properties of the numbers 6 or 15. In this particular problem, such
an inquiry could yield an immediate idea for the correct solution,
especially if the problem solver were already sufficiently familiar with
the use of AND and OR gates in circuit problems. However, let us not
follow up this specific approach to the problem now. You can go back
to consider this approach, after we have gone through more
straightforward and more general methods.
Another thing we might derive is the conclusion that the solution of
the problem must require at least 15 gates, one for each different
output wire. If the problem can be solved with 15 gates, this number must
be the minimum. Of course, we do not know yet whether a greater
number of gates than 15 will be required. Stop reading and try again
to solve the problem, if you did not do so before.
So much for deriving quick conclusions. If you are inexperienced in
using AND and OR gates in circuit problems, you will probably want to
spend some time thinking about their properties and using them in a
more or less random way, unrelated to the problem. You are probably
somewhat familiar with "and" and "or" as used in logical expressions, but
psychologically that is not quite the same as using AND and OR gates
as transformations (operators) in circuit-design problems.
Near the beginning of your work on the problem, you should develop
useful representations of concepts in it. Vector notation for the
six-bit binary input patterns will probably aid your thinking, for
example, 110100 or 001000. Some sort of spatial representation of the two
different kinds of gates (labeled boxes), and the input and output wires
(lines) might also be helpful to you. Stop reading and try again to solve
the problem.
These somewhat ponderous preliminaries to real work on the problem
may seem completely unnecessary to some, but those for whom
the preliminaries are unnecessary are either lucky in solving this
particular problem or else are consciously or unconsciously
accomplishing these preliminaries very quickly in their brains. Once a
person becomes skilled at problem solving, these preliminaries, which
take several paragraphs to explain, can be accomplished in seconds in
the head.
Having accomplished the preliminaries of fully understanding the
problem, deriving quick conclusions, and developing some useful
verbal and spatial representations, it is time to see if some solution to
the problem just pops into your head, probably because it is analogous
to similar problems you have solved in the past. If nothing comes to
mind, you might try more actively to think about whether you have
solved similar problems in the past and what specific or general
methods you used then. Let us assume that this is a failure; you never
encountered a circuit problem before in your life or, in any event,
you have not remembered anything that seems useful from previous
problems.
What next? You might try breaking the problem into parts (subproblems
or subgoals). For example, you could note that three input
wires can have eight different input patterns, which means that perhaps
the problem could be broken down into two subgoals: that of
mapping eight input patterns on wires 1, 2, and 3 onto eight of the
output wires, and that of mapping seven input patterns on wires 4, 5,
and 6 onto the remaining seven output wires. It seems a trifle inelegant
to have been given 15, instead of 16, codes, but in a real-world problem,
nothing guarantees this kind of elegance. Of course, this is a
made-up problem, and it is more elegant than this. Ignoring the question
of inelegance, we might spend some time trying this approach
based on the subgoal method. However, it happens in this case that
the analysis into subproblems is not helpful. There is a power in the
combinations across the two sets of three input lines that is being lost
by this analysis into subproblems. I have not thought of any other
analysis of this problem into subproblems that is helpful, either. Thus,
the subgoal method is a complete bust on this problem, but if you tried
this method you would be making a relatively intelligent error. What
other general problem-solving method might you use? Stop reading
and try again to solve the problem, if you have not already.
Many other general problem-solving methods could be tried, but the
one that really cracks the problem open in a systematic,
straightforward, though somewhat time-consuming, manner is to solve
simpler problems. There are a large number of simpler problems. You can
start as simple as you wish and work your way up through more
complicated problems, and hope the general principle of the solution to
the original problem becomes clear. Stop reading and try again to solve
the problem.
A good subproblem to start with would be five input patterns on
four input wires, to be decoded onto five output wires. There are only
16 possible binary input patterns on four wires, and you are to select
five of the 16 to achieve a circuit using the smallest number of gates.
Presumably, in working on this subproblem, you learn a number of
principles that will be useful in solving the original problem. For
example, you learn to focus on the input wires that are at the 1
level in any particular input pattern, because, without any inverters,
it is only the 1-level inputs that can be used to turn on the correct
output. Also, presumably you realize (if you did not already) that some
circuits will turn on the correct output but also turn on some incorrect
output wires, in violation of the requirements of the problem. This
type of difficulty might incline you against selecting binary input codes
that had too many 1's in them.
Note that we avoided choosing a simpler problem that had no more
input patterns than it had input lines, because such a problem permits
the trivial one-to-one solution that obviously will not work in the
original problem and will give no insights into the original problem.
Thus, we may already have realized that many or all of the input
patterns must have more than a single 1 in them. In the present simpler
problem, the combination of not wanting too many 1's and wanting
more than a single 1 in many or all of the input patterns essentially
forces us to use different combinations of two 1's as inputs to AND
gates, for example, 1100, 1010, 1001, 0110, 0101. At this point, we
might see that this type of solution generalizes directly to the original
problem, or we might solve another, slightly more difficult problem
before seeing that the solution generalizes to the original problem.
Stop reading and try again to solve the problem, if you did not before.
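The difficulty with input patterns that contain too many 1's can be made concrete with a small sketch. It assumes one 2-input AND gate for every pair of input wires, which goes slightly beyond the five gates the simpler problem needs; the function name `and_gate_outputs` and the 0-based wire numbering are illustrative choices.

```python
from itertools import combinations


def and_gate_outputs(pattern: str) -> list[tuple[int, int]]:
    """List the wire pairs whose 2-input AND gate would output 1."""
    wires = range(len(pattern))
    return [(i, j) for i, j in combinations(wires, 2)
            if pattern[i] == "1" and pattern[j] == "1"]


print(and_gate_outputs("1100"))  # [(0, 1)]
print(and_gate_outputs("1110"))  # [(0, 1), (0, 2), (1, 2)]
```

A pattern with exactly two 1's turns on a single AND gate, whereas a pattern with three 1's turns on three, which is the kind of incorrect extra output the problem forbids.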
Let us back up a little. Maybe we never thought explicitly of the
principle that too many 1's in an input pattern are no good.
Nevertheless, we would be apt to obtain the solution to the simpler
problem because the range of possible solutions is so much reduced. To be
sure, there are a lot of different combinations of 16 patterns taken five
at a time, and we must also describe the decoding circuit for any five we
select. However, the number of logically different classes of potential
solutions is much smaller than this. Without trying to enumerate all
of the logically different classes of sets of five input patterns, we can
indicate the nature of the features used to define these classes
according to whether the pattern used consists of all 1's (1111); three
1's (such as 1110); two 1's (such as 1100); one 1 (such as 0100); all 0's
(0000); or whether the same wire is at the 1 level in all of the five
patterns, four of the five, three of the five, and so on. If you use the
method of classificatory trial and error (being systematic about noting
the features of the types of solutions you have considered and rejected),
it should not take too long to hit upon the optimal solution to the
simpler problem. Stop reading and try again to solve the problem, if
you have not done so already.
The solution to the original problem is to choose the following set
of 15 input patterns: 110000, 101000, 100100, 100010, 100001,
011000, 010100, 010010, 010001, 001100, 001010, 001001, 000110,
000101, 000011.
The decoding circuit uses just 15 AND gates with input lines 1 and 2
connected to the first AND gate, input lines 1 and 3 to the second,
input lines 1 and 4 to the third, and so on, up to input lines 5 and 6
to the 15th.
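The stated solution can be generated and verified mechanically. The following is a minimal sketch, with `GATES`, `pattern_for`, `decode`, and `PATTERNS` as hypothetical names; it models each AND gate as a pair of input-line numbers and confirms that every chosen pattern turns on exactly one output.

```python
from itertools import combinations

N_INPUTS = 6

# One AND gate per unordered pair of input lines: (1, 2), (1, 3), ..., (5, 6).
GATES = list(combinations(range(1, N_INPUTS + 1), 2))


def pattern_for(gate):
    """Input pattern with 1's on exactly the two lines feeding this gate."""
    return "".join("1" if line in gate else "0"
                   for line in range(1, N_INPUTS + 1))


def decode(pattern):
    """Output levels of all AND gates for a six-character input pattern."""
    return [int(pattern[a - 1] == "1" and pattern[b - 1] == "1")
            for a, b in GATES]


PATTERNS = [pattern_for(gate) for gate in GATES]

for index, pattern in enumerate(PATTERNS):
    outputs = decode(pattern)
    assert sum(outputs) == 1 and outputs[index] == 1  # one and only one output

print(len(GATES), PATTERNS[0], PATTERNS[-1])  # 15 110000 000011
```

The count of 15 is the number of ways to choose 2 lines out of 6, which is the number-theoretic observation about 6 and 15 mentioned earlier in the discussion.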
What if you chose so trivial a simpler problem that no useful insights were obtained? An example would be three input patterns on three input wires to be decoded onto three output wires. If you avoided the trivial one-to-one solution, you could still learn the necessary principles from this simpler problem. If you did not avoid the trivial solution, you could then pose a somewhat more complex problem, continuing this process until a problem was posed that was simple enough to solve easily but hard enough to involve some of the important principles of the solution to the original problem.
What if your judgment regarding simpler problems was faulty, and a harder problem was selected? For example, you might think that eight input patterns on four input wires was a simpler problem than the original problem (15 input patterns on six wires), but it is not. Furthermore, a solution to this problem will tend to lead you away from the optimal solution to the original problem. You must honestly face the fact that this is a potential trap that often accompanies the use of the simpler-problem method. Your criteria for judging the simplicity of a problem are vitally important for the success of the method. If you know this, explicitly specify criteria for problem simplicity within any class of problems, and continually question these criteria for problem simplicity when the supposedly simpler problem proves difficult, then you can avoid being trapped by the method.
What if, in working on the original problem or some simpler problem, you develop mental sets (unconscious assumptions) about the solution to the problem that are wrong and that prevent you from obtaining the necessary ideas for solving the problem? This often happens, especially when a person does not have a habit of continually trying to specify the methods being used and the assumptions being made. For example, in the present problem, we can develop the working hypothesis that somehow the six input wires should be considered in three groups of two wires each. This sort of crudely formulated working hypothesis could be very helpful, if it were correct. In this problem it is not correct, and it can be distinctly deleterious for getting the necessary ideas.
As usual, an ounce of prevention is worth a pound of cure. If you are careful to note the working assumptions you make, it will be easy to question those assumptions and think of other ideas (working assumptions) that violate them. However, sometimes even the most analytical problem solvers make unconscious working assumptions and then find themselves going around in circles, that is, repeatedly trying out the same incorrect solutions within a limited set that does not contain the correct solution. If you are aware of this possibility, then you can try to characterize your implicit assumptions and make one or more contrary assumptions.
COMPUTER PROGRAMMING
Computer programming problems provide particularly good examples of subgoals, the representation of information (naming), inference (representation of implicit information), analogy, and special case.
Computer programming problems frequently involve the solution of one or more mathematical problems, such as deriving an algorithm for the solution of an equation, in addition to the definition of a sequence of instructions to achieve the solution of the problem by the computer. This problem solving already provides an example of the subgoal method, with one subgoal being the mathematical solution of one or more problems and a second subgoal being the representation of this solution in a programming language.
Another basic application of the subgoal method to virtually all computer programming problems is the division of the problem into three parts: input, computation, and output. In addition to the input of the program itself, most programming problems require the values of certain variables to be input to the machine. The computer must be told from what source to expect the input (cards, magnetic tape, paper tape, or whatever) and the format of the input (alphabetic, alphanumeric, numerical, two-column fields, three-column fields, and so on). In addition, the computer must be told where to store this data and what names to give to the various subsets of input data. These instructions represent further subsubgoals of the input phase of the programming problem. Similarly, the computer must be told what values from what arrays to output, on what output medium (printer, cards, magnetic tape, paper tape, and so on), the output format, and alphanumeric headings for various portions of the output.
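The three-part division into input, computation, and output can be sketched in a modern language. The example below is an illustration under stated assumptions, not a program from the text: the input source is an in-memory text stream with one number per line, and the function names `read_input`, `compute`, and `write_output` are hypothetical.

```python
import io

def read_input(source):
    """Input subgoal: state the source, the format (one number per line),
    and the name under which the data is stored."""
    return [float(line) for line in source if line.strip()]

def compute(values):
    """Computation subgoal: derive the quantities of interest."""
    return {
        "count": len(values),
        "mean": sum(values) / len(values),
        "maximum": max(values),
    }

def write_output(results, sink):
    """Output subgoal: choose the medium, the format, and the headings."""
    sink.write("SUMMARY\n")
    for name, value in results.items():
        sink.write(f"{name:<10}{value:>10.2f}\n")

source = io.StringIO("3\n1.5\n4.5\n")   # stands in for cards or tape
sink = io.StringIO()                     # stands in for the printer
write_output(compute(read_input(source)), sink)
print(sink.getvalue())
```

Each function can be written and tested separately, which is the practical benefit of treating the three phases as distinct subgoals.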
The computational portion of any large program must frequently be divided further into subgoals, the solution to each of these subgoals being called a subroutine. For example, it may be that a portion of the computation involved in a computer program is to find the value of a function such as a log x + sin x + ... . Without considering exactly where in the program this subroutine will be used, a computer programmer might write up a program for the computation of this function and give that portion of the program (subroutine) a name, so that it might be called at any point during the execution of the main program. The computation of values of functions constitutes but one relatively trivial example of the application of the subgoal method to computer programming problems. Other examples of subroutines include random-number generators, shuffling programs, finding the maximum value in an array of values, ordering or ranking a set of numbers, and searching for a particular alphanumeric label. Frequently, when the programmer knows a number of subroutines that will be required to solve a computer programming problem, programs to achieve these subroutines can be developed relatively independently of one another and of the main program. Because of the large degree of independence that can be achieved in writing a computer program to achieve various subgoals, it is possible for a team of programmers to divide the work of writing a large program. However, it is frequently necessary for certain common naming conventions to be observed and for the writer of the main program to specify to the writers of each of the component subroutines the form and location of the input to their subroutines and the desired form and location for the output from the subroutines.
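A brief sketch may help show how named subroutines serve as independent subgoals. It is only an illustration: the function `f` is a hypothetical stand-in (a log x + sin x, with `a` an assumed parameter), and `maximum` and `rank` are simple versions of two of the subroutines listed above. The agreed form of input and output for each subroutine is stated in its docstring, which plays the role of the convention the main programmer must specify.

```python
import math

def f(x, a=1.0):
    """Hypothetical function subroutine: takes a positive number, returns a number."""
    return a * math.log(x) + math.sin(x)

def maximum(values):
    """Takes a non-empty list of numbers, returns the largest one."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best

def rank(values):
    """Takes a list of numbers, returns each item's rank (1 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def main(xs):
    """Main program: calls the subroutines by name wherever they are needed."""
    ys = [f(x) for x in xs]
    return maximum(ys), rank(ys)

print(main([1.0, 2.0, 0.5]))
```

Because each subroutine depends only on its stated inputs and outputs, different programmers could write `f`, `maximum`, and `rank` without seeing one another's code.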
The importance of giving names to all the important concepts in a programming problem is so obviously forced upon any programmer by the necessity of representing every important aspect of the problem in a computer program that it bears no extensive discussion.
The frequent need to represent implicit information in the solution of programming problems is almost, but not quite, as obvious as the necessity of naming important concepts. For example, assume that one required subroutine is to sample randomly from a set without replacement. We might achieve this by simulating card shuffling in the computer. To simulate card shuffling in a computer it is necessary to represent explicitly what we know implicitly to be involved in shuffling a deck of cards. Shuffling a deck of cards involves first making a single partition at some random point near the middle of the deck in order to divide the set into two subsets (two interval subsets). The cut might easily be achieved in a computer by picking a random number between 0 and 10 and adding this to a number that is 5 less than half of the number of cards in the deck. Having simulated the cut, we are now faced with simulating the actual shuffle. What is evidently involved in this shuffle is that the top card from one of the two subsets is inserted at some random point within the top few cards of the other subset, the second card is inserted randomly a few cards below the first card, and so on. This can be easily simulated on a computer by picking a random number between 0 and 2 for the number of cards from the other subset to intervene between any two adjacent cards from the first subset. This is not necessarily the best shuffling routine, but it vividly illustrates the process of explicitly representing implicit information as a component of the solution of a programming problem.
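The cut-and-shuffle procedure described above can be written out as a short sketch. Several details are assumptions made for illustration: the cut offset is drawn from 0 to 10 so that the cut is centered on the middle of the deck, the cut is clamped so that small decks still work, leftover cards are placed at the end, and the names `riffle_shuffle` and `sample_without_replacement` (with its `passes` parameter) are hypothetical.

```python
import random

def riffle_shuffle(deck, rng):
    """One simulated cut and riffle of `deck` (a list); returns a new list."""
    n = len(deck)
    # Cut near the middle: (half the deck minus 5) plus a random 0..10.
    cut = n // 2 - 5 + rng.randint(0, 10)
    cut = max(0, min(n, cut))            # keep the cut inside small decks
    first, other = deck[:cut], deck[cut:]

    shuffled = []
    taken = 0                            # cards already used from `other`
    for card in first:
        shuffled.append(card)
        # 0, 1, or 2 cards from the other subset intervene before the next card.
        k = rng.randint(0, 2)
        shuffled.extend(other[taken:taken + k])
        taken += k
    shuffled.extend(other[taken:])       # whatever remains goes at the end
    return shuffled

def sample_without_replacement(items, k, rng, passes=7):
    """Shuffle several times, then deal the first k items."""
    deck = list(items)
    for _ in range(passes):
        deck = riffle_shuffle(deck, rng)
    return deck[:k]

rng = random.Random(0)
print(sample_without_replacement(range(52), 5, rng))
```

A single pass leaves much of the original order intact, which is why the sampling helper repeats the shuffle several times; as the text notes, this is not necessarily the best shuffling routine.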
Analogy is widely used in the solution of programming problems in the relatively trivial sense that whenever a problem can be identified as being essentially identical to a problem for which a program already exists, a programmer will obtain that program from some library and incorporate it into his own program to solve that portion of the problem. This method plays an important role in solving computer programming problems, but its use is so widely understood that it hardly deserves much comment here.
Finally, the method of special case often plays an important role in the solution of computer programming problems. Programming problems frequently involve doing certain computational jobs over and over again for some multidimensional array of values or vectors of values as the input. It simplifies the problem greatly to first write a program to solve the problem in a special case and then extend this solution to the entire multidimensional array. Very frequently, this method amounts to little more than getting a subroutine for doing a particular job (such as computing the value of a function) and then embedding that subroutine within a set of control loops that iterate the subroutine through all of the values in an input matrix and output the results of the computation into the proper places in an output matrix. The method of special case is sometimes equivalent to the subgoal method.
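The special-case method described here can be sketched as follows: first a subroutine that handles one value, then control loops that carry it through a whole matrix. The particular computation (`scale_and_clip`, with its `factor` and `limit` parameters) is a hypothetical example chosen only to have something to iterate.

```python
def scale_and_clip(value, factor=2.0, limit=10.0):
    """Special case: do the job for a single input value."""
    return min(value * factor, limit)

def apply_to_matrix(func, matrix):
    """General case: control loops that run `func` over every entry and
    store each result in the matching place of an output matrix."""
    output = [[None] * len(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            output[i][j] = func(value)
    return output

inputs = [[1, 2], [6, 3]]
print(apply_to_matrix(scale_and_clip, inputs))
```

The single-value subroutine can be tested on its own before the loops are written, so an error in the general program can be traced either to the special case or to the iteration around it.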
References
Bartlett, F. Thinking: An experimental and social study. New York: Basic Books, 1958.
Chessin, P. L. Problem for solution. American Mathematical Monthly, 1954, 61, 258-59.
Duncker, K. On problem solving. Psychological Monographs, 1945, 58 (5, Whole No. 270).
Feller, W. An introduction to probability theory and its applications. (3rd ed.) Vol. I. New York: John Wiley & Sons, 1957.
Newell, A., Shaw, J. C., & Simon, H. A. The processes of creative thinking. In H. E. Gruber, G. Terrell, & M. Wertheimer (Eds.), Contemporary Approaches to Creative Thinking. New York: Atherton Press, 1962. Pp. 63-110.
Polya, G. How to solve it. Garden City, N.Y.: Doubleday & Company, 1957.
Polya, G. Mathematical discovery. Vol. I. On understanding, learning, and teaching problem solving. New York: John Wiley & Sons, 1962.
Simon, H. A., & Newell, A. Human problem solving. American Psychologist, 1971, 26, 145-159.
