
Paradoxes

Situations that seem to defy intuition


Contents

Articles

Introduction
  Paradox
  List of paradoxes
  Paradoxical laughter

Decision theory
  Abilene paradox
  Chainstore paradox
  Exchange paradox
  Kavka's toxin puzzle
  Necktie paradox

Economy
  Allais paradox
  Arrow's impossibility theorem
  Bertrand paradox
  Demographic-economic paradox
  Dollar auction
  Downs–Thomson paradox
  Easterlin paradox
  Ellsberg paradox
  Green paradox
  Icarus paradox
  Jevons paradox
  Leontief paradox
  Lucas paradox
  Metzler paradox
  Paradox of thrift
  Paradox of value
  Productivity paradox
  St. Petersburg paradox

Logic
  All horses are the same color
  Barbershop paradox
  Carroll's paradox
  Crocodile Dilemma
  Drinker paradox
  Infinite regress
  Lottery paradox
  Paradoxes of material implication
  Raven paradox
  Unexpected hanging paradox
  What the Tortoise Said to Achilles

Mathematics
  Accuracy paradox
  Apportionment paradox
  Banach–Tarski paradox
  Berkson's paradox
  Bertrand's box paradox
  Bertrand paradox
  Birthday problem
  Borel–Kolmogorov paradox
  Boy or Girl paradox
  Burali-Forti paradox
  Cantor's paradox
  Coastline paradox
  Cramer's paradox
  Elevator paradox
  False positive paradox
  Gabriel's Horn
  Galileo's paradox
  Gambler's fallacy
  Gödel's incompleteness theorems
  Interesting number paradox
  Kleene–Rosser paradox
  Lindley's paradox
  Low birth weight paradox
  Missing square puzzle
  Paradoxes of set theory
  Parrondo's paradox
  Russell's paradox
  Simpson's paradox
  Skolem's paradox
  Smale's paradox
  Thomson's lamp
  Two envelopes problem
  Von Neumann paradox

Miscellaneous
  Bracketing paradox
  Buridan's ass
  Buttered cat paradox
  Lombard's Paradox
  Mere addition paradox
  Navigation paradox
  Paradox of the plankton
  Temporal paradox
  Tritone paradox
  Voting paradox

Philosophy
  Fitch's paradox of knowability
  Grandfather paradox
  Liberal paradox
  Moore's paradox
  Moravec's paradox
  Newcomb's paradox
  Omnipotence paradox
  Paradox of hedonism
  Paradox of nihilism
  Paradox of tolerance
  Predestination paradox
  Zeno's paradoxes

Physics
  Algol paradox
  Archimedes paradox
  Aristotle's wheel paradox
  Bell's spaceship paradox
  Bentley's paradox
  Black hole information paradox
  Braess's paradox
  Cool tropics paradox
  D'Alembert's paradox
  Denny's paradox
  Ehrenfest paradox
  Elevator paradox
  EPR paradox
  Faint young Sun paradox
  Fermi paradox
  Feynman sprinkler
  Gibbs paradox
  Hardy's paradox
  Heat death paradox
  Irresistible force paradox
  Ladder paradox
  Loschmidt's paradox
  Mpemba effect
  Olbers' paradox
  Ontological paradox
  Painlevé paradox
  Physical paradox
  Quantum pseudo-telepathy
  Schrödinger's cat
  Supplee's paradox
  Tea leaf paradox
  Twin paradox

Self-reference
  Barber paradox
  Berry paradox
  Epimenides paradox
  Grelling–Nelson paradox
  Intentionally blank page
  Liar paradox
  Opposite Day
  Paradox of the Court
  Petronius
  Quine's paradox
  Richard's paradox
  Self-reference
  Socratic paradox
  Yablo's paradox

Vagueness
  Absence paradox
  Bonini's paradox
  Code-talker paradox
  Ship of Theseus

References
  Article Sources and Contributors
  Image Sources, Licenses and Contributors

Article Licenses
  License

Introduction
Paradox
For other uses, see Paradox (disambiguation).
A paradox is a statement that apparently contradicts itself and yet might be true. Most logical paradoxes are known
to be invalid arguments but are still valuable in promoting critical thinking.
Some paradoxes have revealed errors in definitions assumed to be rigorous, and have caused axioms of mathematics
and logic to be re-examined. One example is Russell's paradox, which questions whether a "list of all lists that do not
contain themselves" would include itself, and showed that attempts to found set theory on the identification of sets
with properties or predicates were flawed. Others, such as Curry's paradox, are not yet resolved.
Examples outside logic include the Ship of Theseus from philosophy (questioning whether a ship repaired over time
by replacing each of its wooden parts would remain the same ship). Paradoxes can also take the form of images or
other media. For example, M.C. Escher featured perspective-based paradoxes in many of his drawings, with walls
that are regarded as floors from other points of view, and staircases that appear to climb endlessly.
In common usage, the word "paradox" often refers to statements that are ironic or unexpected, such as "the paradox
that standing is more tiring than walking".

Logical paradox
See also: List of paradoxes
Common themes in paradoxes include self-reference, infinite regress, circular definitions, and confusion between
different levels of abstraction.
Patrick Hughes outlines three laws of the paradox:
Self-reference
An example is "This statement is false", a form of the liar paradox. The statement is referring to itself. Another
example of self-reference is the question of whether the barber shaves himself in the barber paradox. One more
example would be "Is the answer to this question 'No'?"
Contradiction
"This statement is false"; the statement cannot be false and true at the same time. Another example of
contradiction is if a man talking to a genie wishes that wishes couldn't come true. This contradicts itself
because if the genie grants his wish he did not grant his wish, and if he refuses to grant his wish then he did
indeed grant his wish, therefore making it impossible to either grant or not grant his wish because his wish
contradicts itself.
Vicious circularity, or infinite regress
"This statement is false"; if the statement is true, then the statement is false, thereby making the statement true.
Another example of vicious circularity is the following group of statements:
"The following sentence is true."
"The previous sentence is false."
"What happens when Pinocchio says, 'My nose will grow now'?"

Other paradoxes involve false statements ("impossible is not a word in my vocabulary", a simple paradox) or
half-truths and the resulting biased assumptions. This form is common in howlers.
For example, consider a situation in which a father and his son are driving down the road. The car crashes into a tree
and the father is killed. The boy is rushed to the nearest hospital where he is prepared for emergency surgery. On
entering the surgery suite, the surgeon says, "I can't operate on this boy. He's my son."
The apparent paradox is caused by a hasty generalization, for if the surgeon is the boy's father, the statement cannot
be true. The paradox is resolved if it is revealed that the surgeon is a woman: the boy's mother.
Paradoxes which are not based on a hidden error generally occur at the fringes of context or language, and require
extending the context or language in order to lose their paradoxical quality. Paradoxes that arise from apparently
intelligible uses of language are often of interest to logicians and philosophers. "This sentence is false" is an example
of the well-known liar paradox: it is a sentence which cannot be consistently interpreted as either true or false,
because if it is known to be false, then it is known that it must be true, and if it is known to be true, then it is known
that it must be false. Russell's paradox, which shows that the notion of the set of all those sets that do not contain
themselves leads to a contradiction, was instrumental in the development of modern logic and set theory.
Thought experiments can also yield interesting paradoxes. The grandfather paradox, for example, would arise if a
time traveller were to kill his own grandfather before his mother or father had been conceived, thereby preventing his
own birth. This is a specific example of the more general observation of the butterfly effect, or that a time-traveller's
interaction with the past, however slight, would entail making changes that would, in turn, change the future in
which the time-travel was yet to occur, and would thus change the circumstances of the time-travel itself.
Often a seemingly paradoxical conclusion arises from an inconsistent or inherently contradictory definition of the
initial premise. In the case of that apparent paradox of a time traveler killing his own grandfather it is the
inconsistency of defining the past to which he returns as being somehow different from the one which leads up to the
future from which he begins his trip but also insisting that he must have come to that past from the same future as the
one that it leads up to.

Quine's classification of paradoxes


W. V. Quine (1962) distinguished between three classes of paradoxes:
A veridical paradox produces a result that appears absurd but is demonstrated to be true nevertheless. Thus, the
paradox of Frederic's birthday in The Pirates of Penzance establishes the surprising fact that a
twenty-one-year-old would have had only five birthdays, if he had been born on a leap day (a small calculation at
the end of this section checks this). Likewise, Arrow's
impossibility theorem demonstrates difficulties in mapping voting results to the will of the people. The Monty
Hall paradox demonstrates that a decision which has an intuitive 50-50 chance in fact is heavily biased towards
making a decision which, given the intuitive conclusion, the player would be unlikely to make. In 20th century
science, Hilbert's paradox of the Grand Hotel and Schrödinger's cat are famously vivid examples of a theory being
taken to a logical but paradoxical end.
A falsidical paradox establishes a result that not only appears false but actually is false, due to a fallacy in the
demonstration. The various invalid mathematical proofs (e.g., that 1 = 2) are classic examples, generally relying
on a hidden division by zero. Another example is the inductive form of the horse paradox, which falsely
generalizes from true specific statements.
A paradox that is in neither class may be an antinomy, which reaches a self-contradictory result by properly
applying accepted ways of reasoning. For example, the Grelling–Nelson paradox points out genuine problems in
our understanding of the ideas of truth and description.
A fourth kind has sometimes been described since Quine's work.
A paradox that is both true and false at the same time and in the same sense is called a dialetheia. In Western
logics it is often assumed, following Aristotle, that no dialetheia exist, but they are sometimes accepted in Eastern
traditions and in paraconsistent logics. It would be mere equivocation or a matter of degree, for example, to both
affirm and deny that "John is here" when John is halfway through the door, but it is self-contradictory to
simultaneously affirm and deny the event in some sense.
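
As a small check of the veridical example above (Frederic's five birthdays), the following sketch counts leap days.
It is a minimal illustration: the birth year 1856 is an assumption chosen for concreteness, and any leap-year birth
whose first 21 years avoid a skipped century year such as 1900 gives the same count.

    import calendar

    # Frederic is born on 29 February (here assumed to be 1856);
    # count the actual 29 Februaries in his first 21 years
    birth_year = 1856  # assumed for illustration
    birthdays = sum(calendar.isleap(y) for y in range(birth_year + 1, birth_year + 22))
    print(birthdays)  # 5: a twenty-one-year-old with only five birthdays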

Paradox in philosophy
A taste for paradox is central to the philosophies of Laozi, Heraclitus, Meister Eckhart, Hegel, Kierkegaard,
Nietzsche, and G.K. Chesterton, among many others. Søren Kierkegaard, for example, writes, in the Philosophical
Fragments, that
But one must not think ill of the paradox, for the paradox is the passion of thought, and the thinker
without the paradox is like the lover without passion: a mediocre fellow. But the ultimate potentiation of
every passion is always to will its own downfall, and so it is also the ultimate passion of the
understanding to will the collision, although in one way or another the collision must become its
downfall. This, then, is the ultimate paradox of thought: to want to discover something that thought itself
cannot think.

Paradox in medicine
A paradoxical reaction to a drug is the opposite of what one would expect, such as becoming agitated by a sedative
or sedated by a stimulant. Some are common and are used regularly in medicine, such as the use of stimulants such
as Adderall and Ritalin in the treatment of attention deficit disorder, while others are rare and can be dangerous as
they are not expected, such as severe agitation from a benzodiazepine.

References
External links
Cantini, Andrea (Winter 2012). "Paradoxes and Contemporary Logic" (http://plato.stanford.edu/entries/paradoxes-contemporary-logic/). In Zalta, Edward N. Stanford Encyclopedia of Philosophy.
Spade, Paul Vincent (Fall 2013). "Insolubles" (http://plato.stanford.edu/entries/insolubles). In Zalta, Edward N. Stanford Encyclopedia of Philosophy.
Paradoxes (http://www.dmoz.org/Society/Philosophy/Philosophy_of_Logic/Paradoxes/) at DMOZ
"Zeno and the Paradox of Motion" (http://www.mathpages.com/rr/s3-07/3-07.htm) at MathPages.com.
"Logical Paradoxes" (http://www.iep.utm.edu/par-log) entry in the Internet Encyclopedia of Philosophy

List of paradoxes
This is a list of paradoxes, grouped thematically. The grouping is approximate, as paradoxes may fit into more than
one category. Because of varying definitions of the term paradox, some of the following are not considered to be
paradoxes by everyone. This list collects only scenarios that have been called a paradox by at least one source and
have their own article.
Although considered paradoxes, some of these are based on fallacious reasoning, or incomplete/faulty analysis.
Informally, the term is often used to describe a counter-intuitive result.

Logic
Barbershop paradox: The supposition that if one of two simultaneous assumptions leads to a contradiction, the
other assumption is also disproved leads to paradoxical consequences. Not to be confused with the Barber
paradox.
What the Tortoise Said to Achilles: "Whatever Logic is good enough to tell me is worth writing down...", also
known as Carroll's paradox, not to be confused with the physical paradox of the same name.
Catch-22: A situation in which someone is in need of something that can only be had by not being in need of it.
Drinker paradox: In any pub there is a customer of whom it is true to say: if that customer drinks, everybody in
the pub drinks (a brute-force check appears after this list).
Paradox of entailment: Inconsistent premises always make an argument valid.
Lottery paradox: There is one winning ticket in a large lottery. It is reasonable to believe of a particular lottery
ticket that it is not the winning ticket, since the probability that it is the winner is so very small, but it is not
reasonable to believe that no lottery ticket will win.
Raven paradox (or Hempel's Ravens): Observing a green apple increases the likelihood of all ravens being black.
Ross's paradox: Disjunction introduction poses a problem for imperative inference by seemingly permitting
arbitrary imperatives to be inferred.
Unexpected hanging paradox: The day of the hanging will be a surprise, so it cannot happen at all, so it will be a
surprise. The surprise examination and Bottle Imp paradox use similar logic.
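
The Drinker paradox above is a theorem of classical predicate logic, and for a finite pub it can be confirmed by
brute force. A minimal sketch in Python; the pub size of four is arbitrary:

    from itertools import product

    def witness_exists(drinks):
        # is there a patron x such that "if x drinks, everybody drinks"?
        return any((not drinks[x]) or all(drinks) for x in range(len(drinks)))

    # exhaustively check every drinking pattern in a 4-person pub
    assert all(witness_exists(d) for d in product([False, True], repeat=4))
    print("a witness exists for every pattern")

The witness always exists because either everybody drinks (any patron will do) or somebody does not drink (for that
patron the implication is vacuously true). Note that the pub must be non-empty.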

Self-reference
These paradoxes have in common a contradiction arising from self-reference.
Barber paradox: A barber (who is a man) shaves all and only those men who do not shave themselves. Does he
shave himself? (Russell's popularization of his set theoretic paradox.)
Berry paradox: The phrase "the first number not nameable in under ten words" appears to name it in nine words.
Crocodile dilemma: If a crocodile steals a child and promises its return if the father can correctly guess exactly
what the crocodile will do, how should the crocodile respond in the case that the father correctly guesses that the
child will not be returned?
Paradox of the Court: A law student agrees to pay his teacher after winning his first case. The teacher then sues
the student (who has not yet won a case) for payment.
Curry's paradox: "If this sentence is true, then Santa Claus exists."
Epimenides paradox: A Cretan says: "All Cretans are liars". This paradox works in mainly the same way as the
Liar paradox.
Exception paradox: "If there is an exception to every rule, then every rule must have at least one exception; the
exception to this one being that it has no exception." "There's always an exception to the rule, except to the
exception of the rule, which is, in and of itself, an accepted exception of the rule." "In a world with no rules, there
should be at least one rule: a rule against rules."
Grelling–Nelson paradox: Is the word "heterological", meaning "not applicable to itself", a heterological word?
(Another close relative of Russell's paradox.)
Kleene–Rosser paradox: By formulating an equivalent to Richard's paradox, untyped lambda calculus is shown to
be inconsistent.
Liar paradox: "This sentence is false." This is the canonical self-referential paradox. Also "Is the answer to this
question no?", "I'm lying", and "Everything I say is a lie."
Card paradox: "The next statement is true. The previous statement is false." A variant of the liar paradox that
does not use self-reference.
The Pinocchio paradox: What would happen if Pinocchio said "My nose will be growing"?[2]
Quine's paradox: "'Yields a falsehood when appended to its own quotation' yields a falsehood when appended
to its own quotation." Shows that a sentence can be paradoxical even if it is not self-referring and does not use
demonstratives or indexicals.
Yablo's paradox: An ordered infinite sequence of sentences, each of which says that all following sentences are
false. Uses neither self-reference nor circular reference.
Opposite Day: "It is opposite day today." Therefore it is not opposite day, but if you say it is a normal day it
would be considered a normal day.
Petronius's paradox: "Moderation in all things, including moderation" (unsourced quotation sometimes attributed
to Petronius).
Richard's paradox: We appear to be able to use simple English to define a decimal expansion in a way that is
self-contradictory.
Russell's paradox: Does the set of all those sets that do not contain themselves contain itself?
Socratic paradox: "I know that I know nothing at all."

Vagueness
Ship of Theseus (a.k.a. George Washington's axe or Grandfather's old axe): It seems like you can replace any
component of a ship, and it is still the same ship. So you can replace them all, one at a time, and it is still the same
ship. However, you can then take all the original pieces, and assemble them into a ship. That, too, is the same ship
you began with.
Sorites paradox (also known as the paradox of the heap): If you remove a single grain of sand from a heap, you
still have a heap. Keep removing single grains, and the heap will disappear. Can a single grain of sand make the
difference between heap and non-heap?

Mathematics
See also: Category:Mathematics paradoxes and Paradoxes of set theory
All horses are the same color: A proof by induction that all horses have the same color.
Cramer's paradox: The number of points of intersection of two higher-order curves can be greater than the number
of arbitrary points needed to define one such curve.
Elevator paradox: Elevators can seem to be mostly going in one direction, as if they were being manufactured in
the middle of the building and being disassembled on the roof and basement.
Interesting number paradox: The first number that can be considered "dull" rather than "interesting" becomes
interesting because of that fact.
Nontransitive dice: You can have three dice, called A, B, and C, such that A is likely to win in a roll against B, B
is likely to win in a roll against C, and C is likely to win in a roll against A (an exact enumeration appears after
this list).
Potato paradox: If you let potatoes consisting of 99% water dry so that they are 98% water, they lose 50% of their
weight.

Russell's paradox: Does the set of all those sets that do not contain themselves contain itself?
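
The nontransitive dice in the list above can be checked by exact enumeration. A minimal sketch; these particular
face values are one commonly quoted example, and each die beats the next in the cycle with probability 5/9:

    from itertools import product

    A = (2, 2, 4, 4, 9, 9)
    B = (1, 1, 6, 6, 8, 8)
    C = (3, 3, 5, 5, 7, 7)

    def p_beats(x, y):
        # exact win probability: count the 36 equally likely face pairs
        return sum(a > b for a, b in product(x, y)) / 36

    print(p_beats(A, B), p_beats(B, C), p_beats(C, A))  # each 5/9, about 0.556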

Statistics
See also: Category:Statistical paradoxes
Abelson's paradox: Effect size may not be indicative of practical meaning.
Accuracy paradox: Predictive models with a given level of accuracy may have greater predictive power than
models with higher accuracy.
Benford's law: Numbers starting with lower digits appear disproportionately often in seemingly random data sets.
Berkson's paradox: A complicating factor arising in statistical tests of proportions.
Freedman's paradox: A problem in model selection where predictor variables with no explanatory power
can appear artificially important.
Friendship paradox: For almost everyone, their friends have more friends than they do.
Inspection paradox: Why one will wait longer for a bus than one should.
Lindley's paradox: Tiny errors in the null hypothesis are magnified when large data sets are analyzed, leading to
false but highly statistically significant results.
Low birth weight paradox: Low birth weight and mothers who smoke contribute to a higher mortality rate. Babies
of smokers have lower average birth weight, but low birth weight babies born to smokers have a lower mortality
rate than other low birth weight babies. This is a special case of Simpson's paradox.
Simpson's paradox, or the Yule–Simpson effect: A trend that appears in different groups of data disappears when
these groups are combined, and the reverse trend appears for the aggregate data (a numerical example appears after
this list).
Will Rogers phenomenon: The mathematical concept of an average, whether defined as the mean or median, leads
to apparently paradoxical results; for example, it is possible that moving an entry from an encyclopedia to a
dictionary would increase the average entry length of both books.
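
Simpson's paradox is easy to reproduce numerically. The counts below are illustrative, modeled on the kidney-stone
treatment comparison often used to present the paradox; only the reversal matters, not the exact figures:

    # (successes, patients) for two treatments, split by stone size
    a_small, a_large = (81, 87), (192, 263)
    b_small, b_large = (234, 270), (55, 80)

    def rate(group):
        successes, patients = group
        return successes / patients

    def pooled(g1, g2):
        return (g1[0] + g2[0]) / (g1[1] + g2[1])

    print(rate(a_small) > rate(b_small))   # True: A wins on small stones
    print(rate(a_large) > rate(b_large))   # True: A wins on large stones
    print(pooled(a_small, a_large) > pooled(b_small, b_large))  # False: B wins overall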

Probability
See also: Category:Probability theory paradoxes
Bertrand's box paradox: A paradox of conditional probability
closely related to the Boy or Girl paradox.
Bertrand's paradox: Different common-sense definitions of
randomness give quite different results.
Birthday paradox: What is the chance that two people in a room
have the same birthday?
Borel's paradox: Conditional probability density functions are not
invariant under coordinate transformations.

Boy or Girl paradox: A two-child family has at least one boy. What is the probability that it has a girl?
False positive paradox: A test that is accurate the vast majority of the time could show you have a disease, but the
probability that you actually have it could still be tiny.
Grice's paradox: Shows that the exact meaning of statements involving conditionals and probabilities is more
complicated than may be obvious on casual examination.
Monty Hall problem: An unintuitive consequence of conditional probability (a short simulation appears after this list).
Necktie paradox: A wager between two people seems to favour them both. Very similar in essence to the
Two-envelope paradox.
Proebsting's paradox: The Kelly criterion is an often optimal strategy for maximizing profit in the long run.
Proebsting's paradox apparently shows that the Kelly criterion can lead to ruin.
Sleeping Beauty problem: A probability problem that can be correctly answered as one half or one third
depending on how the question is approached.

Three cards problem: When pulling a random card, how do you determine the color of the underside?
Three Prisoners problem: A variation of the Monty Hall problem.
Two-envelope paradox: You are given two indistinguishable envelopes, each of which contains a positive sum of
money. One envelope contains twice as much as the other. You may pick one envelope and keep whatever
amount it contains. You pick one envelope at random but before you open it you are given the chance to take the
other envelope instead.
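
The Monty Hall problem from the list above yields to a short simulation. A minimal sketch of the standard
three-door game; it uses the fact that, after the host reveals a goat, switching wins exactly when the first pick
was not the car:

    import random

    def win_rate(switch, trials=100_000):
        wins = 0
        for _ in range(trials):
            car = random.randrange(3)
            pick = random.randrange(3)
            # the host opens a goat door; switching then wins
            # if and only if the original pick was not the car
            wins += (pick != car) if switch else (pick == car)
        return wins / trials

    print("stay:  ", win_rate(switch=False))  # about 1/3
    print("switch:", win_rate(switch=True))   # about 2/3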

Infinity and infinitesimals


Burali-Forti paradox: If the ordinal numbers formed a set, it would be an ordinal number that is smaller than
itself.
Cantor's paradox: There is no greatest cardinal number.
Galileo's paradox: Though most numbers are not squares, there are no more numbers than squares. (See also
Cantor's diagonal argument)
Hilbert's paradox of the Grand Hotel: If a hotel with infinitely many rooms is full, it can still take in more guests.
Russell's paradox: Does the set of all those sets that do not contain themselves contain itself?
Skolem's paradox: Countably infinite models of set theory contain uncountably infinite sets.
Zeno's paradoxes: "You will never reach point B from point A as you must always get half-way there, and half of
the half, and half of that half, and so on." (This is also a physical paradox; a numerical check appears after this
list.)
Supertasks may result in paradoxes such as:
Benardete's paradox: Apparently, a man can be "forced to stay where he is by the mere unfulfilled intentions of
the gods".
RossLittlewood paradox: After alternatively adding and removing balls to a vase infinitely often, how many
balls remain?
Thomson's lamp: After flicking a lamp on and off infinitely often, is it on or off?
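
Zeno's halving argument above trades on the assumption that infinitely many stages must take infinitely long; the
partial sums of the halved distances, however, converge. A quick numerical check:

    # partial sums of 1/2 + 1/4 + 1/8 + ... approach 1, so infinitely
    # many ever-smaller steps cover a finite distance
    total, step = 0.0, 0.5
    for n in range(1, 11):
        total += step
        step /= 2
        print(n, total)  # 0.5, 0.75, 0.875, ... approaching 1.0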

Geometry and topology


Banach–Tarski paradox: Cut a ball into a finite number of pieces, re-assemble the pieces to get two balls, both of
equal size to the first. The von Neumann paradox is a two-dimensional analogue.
Paradoxical set: A set that can be partitioned into two sets, each of which is equivalent to the original.
Coastline paradox: the perimeter of a landmass is in general ill-defined.
Gabriel's Horn or Torricelli's trumpet: A simple object with finite volume but infinite surface area (a numerical
check appears after this list). Also, the Mandelbrot set and various other fractals are covered by a finite area, but
have an infinite perimeter (in fact, there are no two distinct points on the boundary of the Mandelbrot set that can
be reached from one another by moving a finite distance along that boundary, which also implies that in a sense you
go no further if you walk "the wrong way" around the set to reach a nearby point). This can be represented by a
Klein bottle.
Hausdorff paradox: There exists a countable subset C of the sphere S such that S\C is equidecomposable with two
copies of itself.
Missing square puzzle: Two similar-looking figures appear to have different areas while built from the same
pieces.
Nikodym set: A set contained in and with the same Lebesgue measure as the unit square, yet for every one of its
points there is a straight line intersecting the Nikodym set only in that point.
Smale's paradox: A sphere can, topologically, be turned inside out.
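
The Gabriel's Horn entry above can be made quantitative. Rotating y = 1/x (for x >= 1) about the x-axis, the
volume out to x = L is pi*(1 - 1/L), which converges to pi, while the surface area is at least 2*pi*ln(L),
which diverges. A minimal numerical sketch:

    import math

    for L in (10, 1_000, 100_000):
        volume = math.pi * (1 - 1 / L)          # converges to pi
        surface_lb = 2 * math.pi * math.log(L)  # a lower bound that diverges
        print(L, round(volume, 6), round(surface_lb, 1))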

Decision theory
Abilene paradox: People can make decisions based not on what they actually want to do, but on what they think
that other people want to do, with the result that everybody decides to do something that nobody really wants to
do, but only what they thought that everybody else wanted to do.
Apportionment paradox: Some systems of apportioning representation can have unintuitive results due to
rounding:

Alabama paradox: Increasing the total number of seats might shrink one block's seats.
New states paradox: Adding a new state or voting block might increase the number of votes of another.
Population paradox: A fast-growing state can lose votes to a slow-growing state.
Arrow's paradox: Given more than two choices, no system can have all the attributes of an ideal voting system at
once.
Buridan's ass: How can a rational choice be made between two outcomes of equal value?
Chainstore paradox: Even those who know better play the so-called chain store game in an irrational manner.
Decision-making paradox: Selecting the best decision-making method is a decision problem in itself.
Fenno's paradox: The belief that people generally disapprove of the United States Congress as a whole, but
support the Congressman from their own Congressional district.
Green paradox: Policies intending to reduce future CO2 emissions may lead to increased emissions in the present.

Hedgehog's dilemma (Lover's paradox): Despite goodwill, human intimacy cannot occur without substantial
mutual harm.
Inventor's paradox: It is easier to solve a more general problem that covers the specifics of the sought-after
solution.
Kavka's toxin puzzle: Can one intend to drink the non-deadly toxin, if the intention is the only thing needed to get
the reward?
Morton's fork: Choosing between unpalatable alternatives.
Navigation paradox: Increased navigational precision may result in increased collision risk.
Newcomb's paradox: How do you play a game against an omniscient opponent?
Paradox of tolerance: Should one tolerate intolerance if intolerance would destroy the possibility of tolerance?
Paradox of voting: Also known as the Downs paradox. For a rational, self-interested voter the costs of voting will
normally exceed the expected benefits, so why do people keep voting?
Parrondo's paradox: It is possible to play two losing games alternately to eventually win (a simulation appears after this list).
Prevention paradox: For one person to benefit, many people have to change their behavior even though they
receive no benefit, or even suffer, from the change.
Prisoner's dilemma: Two people might not cooperate even if it is in both their best interests to do so.
Relevance paradox: Sometimes relevant information is not sought out because its relevance only becomes clear
after the information is available.
Voting paradox: Also known as Condorcet's paradox and paradox of voting. A group of separately rational
individuals may have preferences that are irrational in the aggregate.
Willpower paradox: Those who kept their minds open were more goal-directed and more motivated than those
who declared their objective to themselves.
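
Parrondo's paradox from the list above can be reproduced with the standard pair of coin games due to Harmer and
Abbott: game A is a coin flip with a small house edge, game B is a capital-dependent game that is also losing on its
own, yet picking between them at random wins. A minimal simulation; the bias eps = 0.005 is the value usually quoted,
and the upward drift is small but clearly visible over many plays:

    import random

    def parrondo(steps=1_000_000, eps=0.005, seed=1):
        rng = random.Random(seed)
        capital = 0
        for _ in range(steps):
            if rng.random() < 0.5:        # play game A: a plain losing coin flip
                p_win = 0.5 - eps
            elif capital % 3 == 0:        # play game B, unfavourable branch
                p_win = 0.1 - eps
            else:                         # play game B, favourable branch
                p_win = 0.75 - eps
            capital += 1 if rng.random() < p_win else -1
        return capital

    print(parrondo())  # positive: a random mixture of two losing games wins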

Physics
For more details on this topic, see Physical paradox.
Cool tropics paradox: A contradiction between modelled estimates
of tropical temperatures during warm, ice-free periods of the
Cretaceous and Eocene, and the lower temperatures that proxies
suggest were present.
The holographic principle: The amount of information that can be
stored in a given volume is not proportional to the volume but to the
area that bounds that volume.
Irresistible force paradox: What would happen if an unstoppable
force hit an immovable object?

Astrophysics
Algol paradox: In some binaries the partners seem to have different
ages, even though they're thought to have formed at the same time.

Faint young Sun paradox: The apparent contradiction between observations of liquid water early in the Earth's
history and the astrophysical expectation that the output of the young sun would have been insufficient to melt ice
on earth.
The GZK paradox: High-energy cosmic rays have been observed that seem to violate the
Greisen-Zatsepin-Kuzmin limit, which is a consequence of special relativity.

Classical mechanics
Archer's paradox: An archer must, in order to hit his target, not aim directly at it, but slightly to the side.
Archimedes paradox (Hydrostatic paradox): A massive battleship can float in a few litres of water.
Aristotle's wheel paradox: Rolling joined concentric wheels seem to trace the same distance with their
circumferences, even though the circumferences are different.
Carroll's paradox: The angular momentum of a stick should be zero, but is not.
D'Alembert's paradox: Flow of an inviscid fluid produces no net force on a solid body.
Denny's paradox: Surface-dwelling arthropods (such as the water strider) should not be able to propel themselves
horizontally.
Elevator paradox: Even though hydrometers are used to measure fluid density, a hydrometer will not indicate
changes of fluid density caused by changing atmospheric pressure.
Feynman sprinkler: Which way does a sprinkler rotate when submerged in a tank and made to suck in the
surrounding fluid?
Painlevé paradox: Rigid-body dynamics with contact and friction is inconsistent.
Tea leaf paradox: When a cup of tea is stirred, the leaves assemble in the center, even though centrifugal force
pushes them outward.

Cosmology
Bentley's paradox: In a Newtonian universe, gravitation should pull all matter into a single point.
Fermi paradox: If there are, as probability would suggest, many other sentient species in the Universe, then where
are they? Shouldn't their presence be obvious?
Heat death paradox: Since the universe is not infinitely old, it cannot be infinite in extent.
Olbers' paradox: Why is the night sky black if there is an infinity of stars?

Electromagnetism
Faraday paradox: An apparent violation of Faraday's law of electromagnetic induction.

Quantum mechanics
Aharonov–Bohm effect: a charged particle is affected by an electromagnetic field even though it has no local
contact with that field.
Bell's theorem: Why do measured quantum particles not satisfy mathematical probability theory?
Double-slit experiment: Matter and energy can act as a wave or as a particle depending on the experiment.
Einstein–Podolsky–Rosen paradox: Can far away events influence each other in quantum mechanics?
Extinction paradox: In the small wavelength limit, the total scattering cross section of an impenetrable sphere is
twice its geometrical cross-sectional area (which is the value obtained in classical mechanics).
Hardy's paradox: How can we make inferences about past events that we haven't observed while at the same time
acknowledging that the act of observation affects the reality we are inferring?
Klein paradox: When the potential of a potential barrier becomes similar to the mass of the impinging particle, it
becomes transparent.
The Mott problem: spherically symmetric wave functions, when observed, produce linear particle tracks.
Quantum LC circuit paradox: Energies stored on capacitance and inductance are not equal to the ground state
energy of the quantum oscillator.
Quantum pseudo-telepathy: Two players who can not communicate accomplish tasks that seemingly require
direct contact.
Quantum Zeno effect or Turing paradox: echoing the Zeno paradox, a quantum particle that is continuously
observed cannot change its state.
Schrödinger's cat paradox: According to the Copenhagen interpretation of quantum mechanics, a cat could be
simultaneously alive and dead, as long as we don't look.
Uncertainty principle: Attempts to determine position must disturb momentum, and vice versa.

Relativity
Bell's spaceship paradox: concerning relativity.
Black hole information paradox: Black holes violate a commonly assumed tenet of science that information
cannot be destroyed.
Ehrenfest paradox: On the kinematics of a rigid, rotating disk.
Ladder paradox: A classic relativity problem.
Mocanu's velocity composition paradox: a paradox in special relativity.
Supplee's paradox: the buoyancy of a relativistic object (such as a bullet) appears to change when the reference
frame is changed from one in which the bullet is at rest to one in which the fluid is at rest.
Trouton-Noble or Right-angle lever paradox: Does a torque arise in static systems when changing frames?
Twin paradox: The theory of relativity predicts that a person making a round trip will return younger than his or
her identical twin who stayed at home.

Thermodynamics
Gibbs paradox: In an ideal gas, is entropy an extensive variable?
Loschmidt's paradox: Why is there an inevitable increase in entropy when the laws of physics are invariant under
time reversal? The time reversal symmetry of physical laws appears to contradict the second law of
thermodynamics.
Maxwell's demon: The second law of thermodynamics seems to be violated by a cleverly operated trapdoor.
Mpemba effect: Hot water can, under certain conditions, freeze faster than cold water, even though it must pass
the lower temperature on the way to freezing.

Biology
Antarctic paradox: In some areas of the oceans, phytoplankton concentrations are low despite there apparently
being sufficient nutrients.
C-value enigma: Genome size does not correlate with organismal complexity. For example, some unicellular
organisms have genomes much larger than that of humans.
Cole's paradox: Even a tiny fecundity advantage of one additional offspring would favor the evolution of
semelparity.
French paradox: The observation that the French suffer a relatively low incidence of coronary heart disease,
despite having a diet relatively rich in saturated fats.
Glucose paradox: The large amount of glycogen in the liver cannot be explained by its small glucose absorption.
Gray's paradox: Despite their relatively small muscle mass, dolphins can swim at high speeds and obtain large
accelerations.
Hispanic paradox: The finding that Hispanics in the U.S. tend to have substantially better health than the average
population in spite of what their aggregate socio-economic indicators predict.
Lombard's paradox: When rising to stand from a sitting or squatting position, both the hamstrings and quadriceps
contract at the same time, despite their being antagonists to each other.
Meditation paradox: The amplitude of heart rate oscillations during meditation was significantly greater than in
the pre-meditation control state and also in three non-meditation control groups.
Mexican paradox: Mexican children tend to have higher birth weights than can be expected from their
socio-economic status.
Obesity survival paradox: Although the negative health consequences of obesity in the general population are
well supported by the available evidence, health outcomes in certain subgroups seem to be improved at an
increased BMI.
Paradox of enrichment: Increasing the food available to an ecosystem may lead to instability, and even to
extinction.
Paradox of the pesticides: Applying pesticide to a pest may increase the pest's abundance.
Paradox of the plankton: Why are there so many different species of phytoplankton, even though competition for
the same resources tends to reduce the number of species?
Peto's paradox: Humans get cancer with high frequency, while larger mammals, like whales, do not. If cancer is
essentially a negative outcome lottery at the cell level, and larger organisms have more cells, and thus more
potentially cancerous cell divisions, one would expect larger organisms to be more predisposed to cancer.
Pulsus paradoxus: A pulsus paradoxus is a paradoxical decrease in systolic blood pressure during inspiration. It
can indicate certain medical conditions in which there is reduced venous return of blood to the heart, such as
cardiac tamponade or constrictive pericarditis. Also known as the Pulse Paradox.
Sherman paradox: An anomalous pattern of inheritance in the fragile X syndrome.
Temporal paradox (paleontology): When did the ancestors of birds live?

Chemistry
Faraday paradox (electrochemistry): Diluted nitric acid will corrode steel, while concentrated nitric acid doesn't.
Levinthal paradox: The length of time that it takes for a protein chain to find its folded state is many orders of
magnitude shorter than it would be if it freely searched all possible configurations.
SAR paradox: Exceptions to the principle that a small change in a molecule causes a small change in its chemical
behaviour are frequently profound.

Time
Bootstrap paradox: Can a time traveler send himself information with no outside source?
Polchinski's paradox: A billiard ball can be thrown into a wormhole in such a way that it would emerge in the past
and knock its incoming past self away from the wormhole entrance, creating a variant of the grandfather paradox.
Predestination paradox:[3] A man travels back in time to discover the cause of a famous fire. While in the building
where the fire started, he accidentally knocks over a kerosene lantern and causes a fire, the same fire that would
inspire him, years later, to travel back in time. The bootstrap paradox is closely tied to this, in which, as a result of
time travel, information or objects appear to have no beginning.
Temporal paradox: What happens when a time traveler does things in the past that prevent him from doing them
in the first place?
Grandfather paradox: You travel back in time and kill your grandfather before he conceives one of your
parents, which precludes your own conception and, therefore, you couldn't go back in time and kill your
grandfather.
Hitler's murder paradox: You travel back in time and kill a famous person in history before they become
famous; but if the person had never been famous then he could not have been targeted as a famous person.

Linguistics and Artificial Intelligence


Bracketing paradox: Is an "historical linguist" a linguist who is historical, or someone who studies "historical
linguistics"?
Code-talker paradox: How can a language both enable communication and block communication?
Moravec's paradox: Logical thought is hard for humans and easy for computers, but picking a screw from a box of
screws is an unsolved problem.
Movement paradox: In transformational linguistics, there are pairs of sentences in which the sentence without
movement is ungrammatical while the sentence with movement is not.

Philosophy
Paradox of analysis: It seems that no conceptual analysis can meet the requirements both of correctness and of
informativeness.
Buridan's bridge: Will Plato throw Socrates into the water or not?
Paradox of fiction: How can people experience strong emotions from purely fictional things?
Fitch's paradox: If all truths are knowable, then all truths must in fact be known.
Paradox of free will: If God knew how we will decide when he created us, how can there be free will?
Goodman's paradox: Why can induction be used to confirm that things are "green", but not to confirm that things
are "grue"?
Paradox of hedonism: When one pursues happiness itself, one is miserable; but, when one pursues something
else, one achieves happiness.
Hutton's Paradox: If asking oneself "Am I dreaming?" in a dream proves that one is, what does it prove in waking
life?

Liberal paradox: "Minimal Liberty" is incompatible with Pareto optimality.
Meno's paradox (Learner's paradox): A man cannot search either for what he knows or for what he does not
know.
Mere addition paradox, also known as Parfit's paradox: Is a large population living a barely tolerable life better
than a small, happy population?
Moore's paradox: "It's raining, but I don't believe that it is."
Newcomb's paradox: A paradoxical game between two players, one of whom can predict the actions of the other.
Paradox of nihilism: Several distinct paradoxes share this name.
Omnipotence paradox: Can an omnipotent being create a rock too heavy for itself to lift?
Preface paradox: The author of a book may be justified in believing that all his statements in the book are correct,
at the same time believing that at least one of them is incorrect.
Problem of evil (Epicurean paradox): The existence of evil seems to be incompatible with the existence of an
omnipotent, omniscient, and morally perfect God.
Rule-following paradox: Even though rules are intended to determine actions, "no course of action could be
determined by a rule, because any course of action can be made out to accord with the rule".
When a white horse is not a horse: White horses are not horses because white and horse talk about different
things.
Zeno's paradoxes: "You will never reach point B from point A as you must always get half-way there, and half of
the half, and half of that half, and so on ..." (This is also a paradox of the infinite)

Mysticism
Tzimtzum: In Kabbalah, how to reconcile self-awareness of finite Creation with Infinite Divine source, as an
emanated causal chain would seemingly nullify existence. Luria's initial withdrawal of God in Hasidic
panentheism involves simultaneous illusionism of Creation (Upper Unity) and self-aware existence (Lower
Unity), God encompassing logical opposites.

Economics
See also: Category:Economics paradoxes
Allais paradox: A change in a possible outcome that is shared by different alternatives affects people's choices
among those alternatives, in contradiction with expected utility theory.
The Antitrust Paradox: A book arguing that antitrust enforcement artificially raised
prices by protecting inefficient competitors from competition.
Arrow information paradox: To sell information you need to give it away before the sale.
Bertrand paradox: Two players reaching a state of Nash equilibrium both find themselves with no profits.
Braess's paradox: Adding extra capacity to a network can reduce overall performance.
Deaton paradox: Consumption varies surprisingly smoothly despite sharp variations in income.
Demographic-economic paradox: nations or subpopulations with higher GDP per capita are observed to have
fewer children, even though a richer population can support more children.
Downs–Thomson paradox: Increasing road capacity at the expense of investments in public transport can make
overall congestion on the road worse.
Easterlin paradox: For countries with income sufficient to meet basic needs, the reported level of happiness does
not correlate with national income per person.
Edgeworth paradox: With capacity constraints, there may not be an equilibrium.
Ellsberg paradox: People exhibit ambiguity aversion (as distinct from risk aversion), in contradiction with
expected utility theory.

European paradox: The perceived failure of European countries to translate scientific advances into marketable
innovations.
Gibson's paradox: Why were interest rates and prices correlated?
Giffen paradox: Increasing the price of bread makes poor people eat more of it.
Icarus paradox: Some businesses bring about their own downfall through their own successes.
Jevons paradox: Increases in efficiency lead to even larger increases in demand.
Leontief paradox: Some countries export labor-intensive commodities and import capital-intensive commodities,
in contradiction with Heckscher–Ohlin theory.
Lucas paradox: Capital is not flowing from developed countries to developing countries despite the fact that
developing countries have lower levels of capital per worker, and therefore higher returns to capital.
Mandeville's paradox: Actions that may be vicious to individuals may benefit society as a whole.
Mayfield's paradox: Keeping everyone out of an information system is impossible, but so is getting everybody in.
Metzler paradox: The imposition of a tariff on imports may reduce the relative internal price of that good.
Paradox of prosperity: Why do generations that significantly improve the economic climate seem to generally rear
a successor generation that consumes rather than produces?
Paradox of thrift: If everyone saves more money during times of recession, then aggregate demand will fall and
will in turn lower total savings in the population.
Paradox of toil: If everyone tries to work during times of recession, lower wages will reduce prices, leading to
more deflationary expectations, leading to further thrift, reducing demand and thereby reducing employment.
Paradox of value, also known as diamond-water paradox: Water is more useful than diamonds, yet is a lot
cheaper.
Productive failure: Providing less guidance and structure and thereby causing more failure is likely to promote
better learning.
Productivity paradox (also known as Solow computer paradox): Worker productivity may go down, despite
technological improvements.
Scitovsky paradox: Using the Kaldor–Hicks criterion, an allocation A may be more efficient than allocation B,
while at the same time B is more efficient than A.
Service recovery paradox: Successfully fixing a problem with a defective product may lead to higher consumer
satisfaction than in the case where no problem occurred at all.
St. Petersburg paradox: People will only offer a modest fee for a reward of infinite expected value (a simulation appears after this list).
Paradox of Plenty (resource curse): Countries and regions with an abundance of natural resources, specifically
point-source non-renewable resources like minerals and fuels, tend to have less economic growth and worse
development outcomes than countries with fewer natural resources.
Tullock paradox: Bribing politicians costs less than one would expect, considering how much profit it can yield.
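
The St. Petersburg game listed above pays 2 if the first head appears on toss one, 4 if on toss two, and so on,
doubling each time, so its expected payout (1 + 1 + 1 + ...) is infinite, while sample averages grow only very
slowly. A minimal simulation:

    import random

    def payout(rng):
        pot = 2
        while rng.random() < 0.5:  # keep tossing tails, doubling the pot
            pot *= 2
        return pot

    rng = random.Random(0)
    for n in (100, 10_000, 1_000_000):
        mean = sum(payout(rng) for _ in range(n)) / n
        print(n, round(mean, 2))  # the sample mean creeps upward, in jumps, as n grows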

Perception
For more details on this topic, see Perceptual paradox.
Tritone paradox: An auditory illusion in which a sequentially played pair of Shepard tones is heard as ascending
by some people and as descending by others.
Blub paradox: Cognitive lock of some experienced programmers that prevents them from properly evaluating the
quality of programming languages which they do not know.[4]


Politics
Stability–instability paradox: When two countries each have nuclear weapons, the probability of a direct war
between them greatly decreases, but the probability of minor or indirect conflicts between them increases.

History
Georg Wilhelm Friedrich Hegel: We learn from history that we do not learn from history. (paraphrased)

Psychology/Sociology
Gender paradox: Women conform more closely than men to sociolinguistic norms that are overtly prescribed,
but conform less than men when they are not.
Moral paradox: A situation in which moral imperatives clash without clear resolution.
Outcomes paradox: Schizophrenia patients in developing countries seem to fare better than their Western
counterparts.[5]
Status paradox: Several paradoxes involve the concept of medical or social status.
The Paradox of Anti-Semitism: A book arguing that the lack of external persecutions and antagonisms results in
the dissolution of Jewish identity, a theory that resonates in works of Dershowitz and Sartre.
Region-beta paradox: People can sometimes recover more quickly from more intense emotions or pain than from
less distressing experiences.
Self-absorption paradox: The contradictory association whereby higher levels of self-awareness are
simultaneously associated with higher levels of psychological distress and with psychological well-being.[6]
Stapp's ironical paradox: "The universal aptitude for ineptitude makes any human accomplishment an incredible
miracle."
Stockdale paradox: "You must never confuse faith that you will prevail in the end, which you can never afford to
lose, with the discipline to confront the most brutal facts of your current reality, whatever they might be."
Ironic process theory: Ironic processing is the psychological process whereby an individual's deliberate attempts
to suppress or avoid certain thoughts (thought suppression) render those thoughts more persistent.

Miscellaneous
Absence paradox: No one is ever "here".
Ant on a rubber rope: An ant crawling on a rubber rope can reach the end even when the rope stretches much
faster than the ant can crawl (a numerical check appears after this list).
Bonini's paradox: Models or simulations that explain the workings of complex systems are seemingly impossible
to construct. As a model of a complex system becomes more complete, it becomes less understandable; for it to be
more understandable it must be less complete and therefore less accurate. When the model becomes accurate, it is
just as difficult to understand as the real-world processes it represents.
Buttered cat paradox: Humorous example of a paradox from contradicting proverbs.
Intentionally blank page: Many documents contain pages on which the text "This page is intentionally left blank"
is printed, thereby making the page not blank.
Observer's paradox: The outcome of an event or experiment is influenced by the presence of the observer.
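
The ant-on-a-rubber-rope entry above has a short quantitative resolution: with crawl speed c and a rope of initial
length L0 stretching uniformly so its far end recedes at speed v, the fraction of rope covered obeys
d(frac)/dt = c/(L0 + v*t), so frac(t) = (c/v)*ln(1 + v*t/L0), which grows without bound; the ant arrives at
t = (L0/v)*(e^(v/c) - 1). A minimal check with illustrative numbers:

    import math

    c, L0, v = 1.0, 10.0, 10.0             # ant 1 m/s, rope 10 m growing at 10 m/s

    t_closed = (L0 / v) * math.expm1(v / c)
    print(t_closed)                         # ~22025 s: late, but the ant arrives

    frac, t, dt = 0.0, 0.0, 0.01            # crude forward-Euler check
    while frac < 1.0:
        frac += c / (L0 + v * t) * dt
        t += dt
    print(t)                                # agrees to within the step size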


Notes
[2] An image of Pinocchio with a speech bubble "My nose will grow now!" has become a minor Internet phenomenon (Google search (http://www.google.com/search?q="pinocchio+paradox"), Google image search (http://www.google.com/images?q="pinocchio+paradox")). It seems likely that this paradox has been independently conceived multiple times.
[3] See also Predestination paradoxes in popular culture
[4] Chapter 1, Introduction.
[5] Developing countries: The outcomes paradox. Nature.com (http://www.nature.com/nature/journal/v508/n7494_supp/full/508S14a.html)
[6] Trapnell, P. D., & Campbell, J. D. (1999). "Private self-consciousness and the Five-Factor Model of Personality: Distinguishing rumination
from reflection". Journal of Personality and Social Psychology, 76, 284-304.

Paradoxical laughter
Paradoxical laughter is an exaggerated expression of humour which is unwarranted by external events. It may be
uncontrollable laughter which may be recognised as inappropriate by the person involved. It is associated with
altered mental states or mental illness, such as mania, hypomania or schizophrenia, and can have other causes.
Paradoxical laughter is indicative of an unstable mood, often caused by the pseudobulbar affect, which can quickly
change to anger and back again, on minor external cues.
This type of laughter can also occur at times when the fight-or-flight response may otherwise be evoked.

References
Frijda, Nico H. (1986). The Emotions (http://books.google.com/books?id=QkNuuVf-pBMC&pg=PA52&lpg=PA52&dq="paradoxical+laughter"). Cambridge University Press. p. 52. ISBN 0-521-31600-6. Retrieved 14 November 2009.
Rutkowski, Anne-Françoise; Rijsman, John B.; Gergen, Mary (2004). "Paradoxical Laughter at a Victim as Communication with a Non-victim" (http://dbiref.uvt.nl/iPort?request=full_record&db=wo&language=eng&query=142993). International Review of Social Psychology 17 (4): 511. ISSN 0992-986X (http://www.worldcat.org/issn/0992-986X). Retrieved 2009-11-14. (French bibliographical record (http://cat.inist.fr/?aModele=afficheN&cpsidt=16370783) with French translation of abstract)


Decision theory
Abilene paradox
In an Abilene paradox a group of people collectively decide on a course of action that is counter to the preferences
of many of the individuals in the group. It involves a common breakdown of group communication in which each
member mistakenly believes that their own preferences are counter to the group's and, therefore, does not raise
objections. A common phrase relating to the Abilene paradox is a desire to not "rock the boat".

Explanation
The term was introduced by management expert Jerry B. Harvey in his article The Abilene Paradox: The
Management of Agreement. The name of the phenomenon comes from an anecdote in the article which Harvey uses
to elucidate the paradox:
On a hot afternoon visiting in Coleman, Texas, the family is comfortably playing dominoes on a porch, until
the father-in-law suggests that they take a trip to Abilene [53 miles north] for dinner. The wife says, "Sounds
like a great idea." The husband, despite having reservations because the drive is long and hot, thinks that his
preferences must be out-of-step with the group and says, "Sounds good to me. I just hope your mother wants to
go." The mother-in-law then says, "Of course I want to go. I haven't been to Abilene in a long time."
The drive is hot, dusty, and long. When they arrive at the cafeteria, the food is as bad as the drive. They arrive
back home four hours later, exhausted.
One of them dishonestly says, "It was a great trip, wasn't it?" The mother-in-law says that, actually, she would
rather have stayed home, but went along since the other three were so enthusiastic. The husband says, "I wasn't
delighted to be doing what we were doing. I only went to satisfy the rest of you." The wife says, "I just went
along to keep you happy. I would have had to be crazy to want to go out in the heat like that." The
father-in-law then says that he only suggested it because he thought the others might be bored.
The group sits back, perplexed that they together decided to take a trip which none of them wanted. They each
would have preferred to sit comfortably, but did not admit to it when they still had time to enjoy the afternoon.
This anecdote was also made into a short film for management education.
Ronald Sims writes that the Abilene paradox is similar to groupthink, but differs in significant ways, including that
in groupthink individuals are not acting contrary to their conscious wishes and generally feel good about the
decisions the group has reached. According to Sims, in the Abilene paradox, the individuals acting contrary to their
own wishes are more likely to have negative feelings about the outcome. In Sims' view, groupthink is a
psychological phenomenon affecting clarity of thought, where in the Abilene paradox thought is unaffected.
Like groupthink theories, the Abilene paradox theory is used to illustrate that groups not only have problems
managing disagreements, but that agreements may also be a problem in a poorly functioning group.


Research
The phenomenon is explained by social psychology theories of social conformity and social influence which suggest
human beings are often very averse to acting contrary to the trend of a group. According to Harvey, the phenomenon
may occur when individuals experience "action-anxiety": stress concerning the possibility that the group may express
negative attitudes towards them if they do not go along. This action-anxiety arises from what Harvey termed
"negative fantasies": unpleasant visualizations of what the group might say or do if individuals are honest about
their opinions, when there is "real risk" of displeasure and negative consequences for not going along. The
individual may also experience "separation anxiety", fearing exclusion from the group.

Applications of the theory


The theory is often used to help explain extremely poor group decisions, especially notions of the superiority of "rule
by committee." For example, Harvey himself cited the Watergate scandal as a potential instance of the Abilene
paradox in action. The Watergate scandal occurred in the United States in the 1970s when many high officials of the
administration of then President Richard Nixon, a Republican, colluded in the cover-up and perhaps the execution of
a break-in at the Democratic National Committee headquarters in Washington, D.C. Harvey quotes several people
indicted for the cover-up as indicating that they had personal qualms about the decision but feared to voice them. For
one instance, campaign aide Herbert Porter said that he "was not one to stand up in a meeting and say that this should
be stopped", a decision he then attributed to "the fear of the group pressure that would ensue, of not being a team
player".
A technique mentioned in the study and training of management, as well as in practical guidance by consultants,
to help reduce the chance of making a poor decision: when the time comes for a group to make a decision, group
members should ask each other, "Are we going to Abilene?" to determine whether their decision is legitimately
desired by the group's members.

References
Further reading
Harvey, Jerry B. (1988). The Abilene Paradox and Other Meditations on Management. Lexington, Mass:
Lexington Books. ISBN 0-669-19179-5
Harvey, Jerry B. (1996). The Abilene Paradox and Other Meditations on Management (paperback). San
Francisco: Jossey-Bass. ISBN 0-7879-0277-2
Harvey, Jerry B. (1999). How Come Every Time I Get Stabbed in the Back, My Fingerprints Are on the Knife?.
San Francisco: Jossey-Bass. ISBN 0-7879-4787-3

External links
How to Identify Groupthink: an Introduction to the Abilene Paradox (http://mystrategicplan.com/resources/
how-to-identify-groupthink-an-introduction-to-the-abilene-paradox/)
How to Avoid Bad Decisions and Why Not to Go to Abilene (http://mystrategicplan.com/resources/
how-to-avoid-bad-decisions-and-why-not-to-go-to-abilene/)
Combat the Abilene Paradox by Identifying Groupthink (http://mystrategicplan.com/resources/
combat-the-abilene-paradox-by-promoting-individual-thought/)


Chainstore paradox
Chainstore paradox (or "Chain-Store paradox") is a concept that purports to refute standard game theory reasoning.

The chain store game


A monopolist (Player A) has branches in 20 towns. He faces 20 potential competitors, one in each town, who will be
able to choose IN or OUT. They do so in sequential order and one at a time. If a potential competitor chooses OUT,
he receives a payoff of 1, while A receives a payoff of 5. If he chooses IN, he will receive a payoff of either 2 or 0,
depending on the response of Player A to his action. Player A, in response to a choice of IN, must choose one of two
pricing strategies, COOPERATIVE or AGGRESSIVE. If he chooses COOPERATIVE, both player A and the
competitor receive a payoff of 2, and if A chooses AGGRESSIVE, each player receives a payoff of 0.
These outcomes lead to two theories for the game, the induction (game theoretically correct version) and the
deterrence theory (weakly dominated theory):

Induction theory
Consider the decision to be made by the 20th and final competitor, of whether to choose IN or OUT. He knows that
if he chooses IN, Player A receives a higher payoff from choosing cooperate than aggressive, and being the last
period of the game, there are no longer any future competitors whom Player A needs to intimidate from the market.
Knowing this, the 20th competitor enters the market, and Player A will cooperate (receiving a payoff of 2 instead of
0).
The outcome in the final period is set in stone, so to speak. Now consider period 19, and the potential competitor's
decision. He knows that A will cooperate in the next period, regardless of what happens in period 19. Thus, if player
19 enters, an aggressive strategy will be unable to deter player 20 from entering. Player 19 knows this and chooses
IN. Player A chooses cooperate.
Of course, this process of backward induction holds all the way back to the first competitor. Each potential
competitor chooses IN, and Player A always cooperates. A receives a payoff of 40 (2×20) and each competitor
receives 2.

Deterrence theory
This theory states that Player A will be able to get payoff of higher than 40. Suppose Player A finds the induction
argument convincing. He will decide how many periods at the end to play such a strategy, say 3. In periods 1-17, he
will decide to always be aggressive against the choice of IN. If all of the potential competitors know this, it is
unlikely potential competitors 1-17 will bother the chain store, thus risking the safe payout of 1 ("A" will not
retaliate if they choose "OUT"). If a few do test the chain store early in the game, and see that they are greeted with
the aggressive strategy, the rest of the competitors are likely not to test any further. Assuming all 17 are deterred,
Player A receives 91 (17×5 + 2×3). Even if as many as 10 competitors enter and test Player A's will, Player A will
still receive a payoff of 41 (10×0 + 7×5 + 3×2), which is better than the induction (game theoretically correct) payoff.
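These payoffs can be tallied with a short script (a minimal sketch; the 20-town setup, the 3-period endgame and the number of early testers follow the figures quoted above):

    def deterrence_payoff(testers, towns=20, endgame=3):
        # Player A plays AGGRESSIVE in the first towns - endgame periods and
        # COOPERATIVE in the last 'endgame' periods. 'testers' early
        # competitors enter anyway and are fought (payoff 0 to A); the other
        # early competitors stay OUT (payoff 5 to A); the endgame competitors
        # all enter and are met cooperatively (payoff 2 to A).
        early = towns - endgame
        deterred = early - testers
        return testers * 0 + deterred * 5 + endgame * 2

    print(deterrence_payoff(0))   # 91 = 17*5 + 3*2
    print(deterrence_payoff(10))  # 41 = 10*0 + 7*5 + 3*2
    print(2 * 20)                 # 40, the induction payoff, for comparison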


The chain store paradox


If Player A follows the game theory payoff matrix to achieve the optimal payoff, he or she will have a lower payoff
than with the "deterrence" strategy. This creates an apparent game theory paradox: game theory states that induction
strategy should be optimal, but it looks like "deterrence strategy" is optimal instead.
The "deterrence strategy" is not a Nash equilibrium: It relies on the non-credible threat of responding to IN with
AGGRESSIVE. A rational player will not carry out a non-credible threat, but the paradox is that it nevertheless
seems to benefit Player A to carry out the threat.

Selten's response
Reinhard Selten's response to this apparent paradox is to argue that the idea of "deterrence", while irrational by the
standards of Game Theory, is in fact an acceptable idea by the rationality that individuals actually employ. Selten
argues that individuals can make decisions of three levels: Routine, Imagination, and Reasoning.

Complete information?
If we stand by game theory, then the initial description given for the game theory payoff matrix in the chain store
game is not in fact the complete payoff matrix. The "deterrence strategy" is a valid strategy for Player A, but it is
missing in the initially presented payoff matrix. Game theory is based on the idea that each matrix is modeled with
the assumption of complete information: that "every player knows the payoffs and strategies available to other
players."
The initially presented payoff matrix is written for one payoff round instead of for all rounds in their entirety. As
described in the "deterrence strategy" section (but not in the induction section), Player A's competitors look at Player
A's actions in previous game rounds to determine what course of action to take - this information is missing from the
payoff matrix. In this case, backwards induction seems like it will fail, because each individual round payoff matrix
is dependent on the previous round. In fact, by doubling the size of the payoff matrix on each round (or, equivalently,
quadrupling the number of choices: there are two choices and four possibilities per round), we can find the optimal
strategy for all players before the first round is played.

Selten's levels of decision making


The routine level
The individuals use their past experience of the results of decisions to guide their response to choices in the present.
"The underlying criteria of similarity between decision situations are crude and sometimes inadequate". (Selten)

The imagination level


The individual tries to visualize how the selection of different alternatives may influence the probable course of
future events. This level employs the routine level within the procedural decisions. This method is similar to a
computer simulation.


The reasoning level


The individual makes a conscious effort to analyze the situation in a rational way, using both past experience and
logical thinking. This mode of decision uses simplified models whose assumptions are products of imagination, and
is the only method of reasoning permitted and expected by game theory.

Decision-making process
The predecision
One chooses which method (routine, imagination or reasoning) to use for the problem, and this decision itself is
made on the routine level.

The final decision


Depending on which level is selected, the individual begins the decision procedure. The individual then arrives at a
(possibly different) decision for each level available (if we have chosen imagination, we would arrive at a routine
decision and possibly an imagination decision). Selten argues that individuals can always reach a routine decision,
but perhaps not the higher levels. Once the individuals have all their levels of decision, they can decide which
answer to use: the final decision. The final decision is made on the routine level and governs actual behavior.

The economy of decision effort


Decision effort is a scarce commodity, being both time-consuming and mentally taxing. Reasoning is more costly
than Imagination, which, in turn, is more costly than Routine. The highest level activated is not always the most
accurate, since the individual may be able to reach a good decision on the routine level, but make serious
computational mistakes on higher levels, especially Reasoning.
Selten finally argues that strategic decisions, like those made by the monopolist in the chainstore paradox, are
generally made on the level of Imagination, where deterrence is a reality, due to the complexity of Reasoning, and
the great inferiority of Routine (it does not allow the individual to see herself in the other player's position). Since
Imagination cannot be used to visualize more than a few stages of an extensive form game (like the Chain-store
game) individuals break down games into "the beginning" and "towards the end". Here, deterrence is a reality, since
it is reasonable "in the beginning", yet is not convincing "towards the end".

References
Selten, Reinhard (1978). "The chain store paradox". Theory and Decision 9 (2): 127–159.
doi:10.1007/BF00131770 [1]. ISSN 0040-5833 [2].

Further reading
Relation of Chain Store Paradox to Constitutional Politics in Canada [3]

References
[1] http:/ / dx. doi. org/ 10. 1007%2FBF00131770
[2] http:/ / www. worldcat. org/ issn/ 0040-5833
[3] http:/ / jtp. sagepub. com/ cgi/ content/ abstract/ 11/ 1/ 5


Exchange paradox
The two envelopes problem, also known as the exchange paradox, is a brain teaser, puzzle, or paradox in logic,
probability, and recreational mathematics. It is of special interest in decision theory, and for the Bayesian
interpretation of probability theory. Historically, it arose as a variant of the necktie paradox.
The problem:
You have two indistinguishable envelopes that each contain money. One contains twice as much as the other.
You may pick one envelope and keep the money it contains. You pick at random, but before you open the
envelope, you are offered the chance to take the other envelope instead.
It can be argued that it is to your advantage to swap envelopes by showing that your expected return on swapping
exceeds the sum in your envelope. This leads to the paradoxical conclusion that it is beneficial to continue to swap
envelopes indefinitely.
Example
Assume the amount in a selected envelope is $20. If the envelope happens to be the larger of the two envelopes
("larger" meaning the one with the larger amount of money), that would mean that the amount in the envelope is
twice the amount in the other envelope. So in this case the amount in the other envelope would be $10.
However if the selected envelope is the smaller of the two envelopes, that would mean that the amount in the other
envelope is twice the amount in the selected envelope. In this second scenario the amount in the other envelope
would be $40.
The probability of either of these scenarios is one half, since there is a 50% chance that the larger envelope was
selected and a 50% chance that the smaller envelope was selected. The expected value calculation for how much
money is in the other envelope would be the amount in the first scenario times the probability of the first scenario
plus the amount in the second scenario times the probability of the second scenario, which is $10 * 1/2 + $40 * 1/2.
The result of this calculation is that the expected value of money in the other envelope is $25. Since this is greater
than the $20 in the selected envelope, it would appear to be to the advantage of the person holding the envelope to
always switch. However, this calculation describes the singular situation of a gambler whom the house offers the
chance to double or halve his money (as opposed to the balanced "double or nothing"). In the two envelopes problem,
the person who switches is, by the same act, also playing the house for the other party, handing over an envelope
worth double or half the other amount; the starting values of the two envelopes are linked at double or half of one
another and are randomly assigned. What this means is that the person who selected the envelope is giving up an
envelope with an expected value of $25 to receive an envelope with an expected value of $25.
A large number of solutions have been proposed. The usual scenario is that one writer proposes a solution that solves
the problem as stated, but then another writer discovers that altering the problem slightly revives the paradox. In this
way, a family of closely related formulations of the problem has been created, and these are discussed in the literature.
No proposed solution is widely accepted as correct. Despite this it is common for authors to claim that the solution to
the problem is easy, even elementary. However, when investigating these elementary solutions they often differ from
one author to the next. Since 1987 new papers have been published every year.[1]


Problem
Basic setup: You are given two indistinguishable envelopes, each of
which contains a positive sum of money. One envelope contains twice
as much as the other. You may pick one envelope and keep whatever
amount it contains. You pick one envelope at random but before you
open it you are given the chance to take the other envelope instead.
The switching argument: Now suppose you reason as follows:
1. I denote by A the amount in my selected envelope.
2. The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
3. The other envelope may contain either 2A or A/2.
4. If A is the smaller amount, then the other envelope contains 2A.
5. If A is the larger amount, then the other envelope contains A/2.
6. Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2.
7. So the expected value of the money in the other envelope is
   (1/2)(2A) + (1/2)(A/2) = 5A/4.
8. This is greater than A, so I gain on average by swapping.
9. After the switch, I can denote that content by B and reason in exactly the same manner as above.
10. I will conclude that the most rational thing to do is to swap back again.
11. To be rational, I will thus end up swapping envelopes indefinitely.
12. As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.

The puzzle: The puzzle is to find the flaw in the very compelling line of reasoning above.

Common resolution
A common way to resolve the paradox, both in popular literature and academic literature, is to observe that A stands
for different things at different places in the expected value calculation, step 7 above. In the first term A is the
smaller amount while in the second term A is the larger amount. To mix different instances of a variable in the same
formula like this is said to be illegitimate, so step 7 is incorrect, and this is the cause of the paradox.
According to this analysis, a correct argument runs on the following lines. By definition the amount in one envelope
is twice as much as in the other. Denoting the lower of the two amounts by X, we write the expected value
calculation as

   (1/2)(X) + (1/2)(2X) = 3X/2.

Here X stands for the same thing in every term of the equation. We learn that 1.5X is the average expected value in
either of the envelopes. Being less than 2X, the value in the greater envelope, there is no reason to swap the
envelopes.
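The common resolution can be checked by simulation (a minimal sketch; the smaller amount X = 10 and the trial count are arbitrary illustrative choices):

    import random

    def trial(x=10.0):
        # One round: the two envelopes hold x and 2x; one is picked at random.
        envelopes = [x, 2 * x]
        random.shuffle(envelopes)
        kept, other = envelopes  # 'other' is what switching would have yielded
        return kept, other

    random.seed(1)
    n = 100_000
    kept_total = other_total = 0.0
    for _ in range(n):
        kept, other = trial()
        kept_total += kept
        other_total += other

    print(kept_total / n)   # ~15.0, i.e. 1.5X with X = 10
    print(other_total / n)  # ~15.0 as well: switching gains nothing on average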


Mathematical details
Let us rewrite the preceding calculations in a more detailed notation that explicitly distinguishes random from
not-random quantities (a different distinction from the usual in ordinary, deterministic mathematics between
variables and constants). This is a useful way to compare with the next, alternative, resolution. So far, we were
thinking of the two amounts of money in the two envelopes as fixed. The only randomness lies in which one goes
into which envelope. We called the smaller amount X, let us denote the larger amount by Y. Given the values x and y
of X and Y, where y = 2x and x > 0, the problem description tells us (whether or not x and y are known)

   P(A = x | X = x, Y = y) = P(A = y | X = x, Y = y) = 1/2

for all possible values x of the smaller amount X; there is a corresponding definition of the probability distribution of
B given X and Y. In our resolution of the paradox, we guessed that in Step 7 the writer was trying to compute the
expected value of B given X = x. Splitting the calculation over the two possibilities for which envelope contains the
smaller amount, it is certainly correct to write

   E(B | X = x) = E(B | X = x, A < B) P(A < B | X = x) + E(B | X = x, A > B) P(A > B | X = x)

At this point, the writer correctly substitutes the value 1/2 for both of the conditional probabilities on the right hand
side of this equation (Step 2). At the same time he correctly substitutes the random variable B inside the first
conditional expectation by 2A, when taking its expectation value given B > A and X = x, and he similarly correctly
substitutes the random variable B by A/2 when taking its expectation value given B < A and X = x (Steps 4 and 5).
He would then arrive at the completely correct equation

   E(B | X = x) = (1/2) · 2 E(A | X = x, A < B) + (1/2) · (1/2) E(A | X = x, A > B)

However he now proceeds, in the first of the two terms on the right hand side, to replace the expectation value of A
given that Envelope A contains the smaller amount and given that the amounts are x and 2x, by the random quantity
A itself. Similarly, in the second term on the right hand side he replaces the expectation value of A given now that
Envelope A contains the larger amount and given that the amounts are x and 2x, also by the random quantity A itself.
The correct substitutions would have been, of course, x and 2x respectively, leading to the correct conclusion

   E(B | X = x) = (1/2) · 2x + (1/2) · (1/2) · 2x = (3/2) x.
Naturally this coincides with the expectation value of A given X=x.
Indeed, in the two contexts in which the random variable A appears on the right hand side, it is standing for two
different things, since its distribution has been conditioned on different events. Obviously, A tends to be larger, when
we know that it is greater than B and when the two amounts are fixed, and it tends to be smaller, when we know that
it is smaller than B and the two amounts are fixed, cf. Schwitzgebel and Dever (2007, 2008). In fact, it is exactly
twice as large in the first situation as in the second situation.
The preceding resolution was first noted by Bruss in 1996. Falk gave a concise exposition in 2009.

Alternative interpretation
The first solution above does not explain what is wrong if the player is allowed to open the first envelope before
being offered the option to switch. In this case, A stands throughout all subsequent calculations for the particular
amount that is then seen (it is a mathematical variable, a generic possible value of a random variable). The
reasoning appears to show that whatever amount he would see
there, he would decide to switch. Hence, he does not need to look in the envelope at all: he knows that if he looks,

Exchange paradox
and goes through the calculations, they will tell him to switch, whatever he saw in the envelope.
In this case, at Steps 6, 7 and 8 of the reasoning, A is any fixed possible value of the amount of money in the first
envelope.
Thus, the proposed "common resolution" above breaks down, and another explanation is needed.
Panagiotis Tsikogiannopoulos (2014) proposed the following explanation of the paradox in his paper. Suppose that
both players see that the envelope of player A contains 100 euros and let's make a calculation using only numerical
values: Once we know the amount of 100 euros, we conclude that the other envelope can contain either 50 or 200
euros. Now there are two possible events for the two fixed amounts that the game can be played with:
Event 1: Amounts of 100 and 200 euros
Event 2: Amounts of 50 and 100 euros
We assume that both players have no prior beliefs for the total amount of money they are going to share, so Events 1
and 2 must be considered to have equal probabilities to occur. In every variation of two fixed amounts where the
players want to make their calculations based on the numerical value of the amount revealed to them, they will have
to weigh the return derived from each event with the average fixed amount by which the game is played in this
event. In Event 1, player A will have a profit of 100 euros by exchanging his envelope, while in Event 2 he will
have a loss of 50 euros. The average fixed amount in Event 1 is 150 euros, while in Event 2 it is 75 euros.
The formula of expected return in case of exchange for player A that summarizes the above remarks is the following:

   E = (1/2) × (+100/150) + (1/2) × (−50/75) = 1/3 − 1/3 = 0
So, the decision of the exchange or not of his envelope is indifferent. Below is an explanation of why the players will
have to weigh the return derived from each event with the average fixed amount:
In Event 1, the player who switches the amount of 100 euros for the amount of 200 euros has a profit of 100 euros
in a game that shares 300 euros in total. So we can say that this player played the game with a success factor of
100 euros / 300 euros = +1/3. Similarly, the other player played the game with a success factor of −1/3. In Event 2,
the player who switches the amount of 50 euros for the amount of 100 euros has a profit of 50 euros in a game that
shares 150 euros in total. This player also played the game with a success factor of +1/3, and his opponent with a
success factor of −1/3. In reality, when a player sees that his envelope contains 100 euros, he doesn't know whether
he is in the game of Event 1 or of Event 2. If he is in Event 1 and switches he will have a success factor of +1/3, and
if he is in Event 2 and switches he will have a success factor of −1/3. As we mentioned above, these two events have
equal probability of 1/2 to occur, so the total success factor of player A considering both possible events is zero. This
means that the decision of a player to switch or not switch his envelope is indifferent, even when he makes his
calculations based on the amount of money that is revealed to him.
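The bookkeeping in this argument can be checked with exact arithmetic (a minimal sketch; the euro amounts follow the example above):

    from fractions import Fraction as F

    # Player A's envelope shows 100 euros. Two equally likely events:
    # Event 1: amounts {100, 200}; Event 2: amounts {50, 100}.
    profit = {1: F(100), 2: F(-50)}  # A's gain or loss from switching
    total = {1: F(300), 2: F(150)}   # total money shared in each event

    # success factor as defined in the text: profit divided by the total shared
    success = {e: profit[e] / total[e] for e in (1, 2)}
    print(success)  # {1: Fraction(1, 3), 2: Fraction(-1, 3)}

    # weighting each event's return by its average fixed amount (half the total)
    expected = sum(F(1, 2) * profit[e] / (total[e] / 2) for e in (1, 2))
    print(expected)  # 0: switching is indifferent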
This interpretation of the two envelopes problem appears in the first publications in which the paradox was
introduced, Gardner (1989) and Nalebuff (1989). It is common in the more mathematical literature on the problem.
The "common resolution" above depends on a particular interpretation of what the writer of the argument is trying to
calculate: namely, it assumes he is after the (unconditional) expectation value of what's in Envelope B. In the
mathematical literature on Two Envelopes Problem (and in particular, in the literature where it was first introduced
to the world), another interpretation is more common, involving the conditional expectation value (conditional on
what might be in Envelope A). To solve this and related interpretations or versions of the problem, most authors use
the Bayesian interpretation of probability.


Introduction to resolutions based on Bayesian probability theory


Here the ways the paradox can be resolved depend to a large degree on the assumptions that are made about the
things that are not made clear in the setup and the proposed argument for switching. The most usual assumption
about the way the envelopes are set up is that a sum of money is in one envelope, and twice that sum is in another
envelope. One of the two envelopes is randomly given to the player (envelope A). It is not made clear exactly how
the smaller of the two sums is determined, what values it could possibly take and, in particular, whether there is a
maximum sum it might contain. It is also not specified whether the player can look in Envelope A before deciding
whether or not to switch. A further ambiguity in the paradox is that it is not made clear in the proposed argument
whether the amount A in Envelope A is intended to be a constant, a random variable, or some other quantity.
If it is assumed that there is a maximum sum that can be put in the first envelope, then a simple and mathematically
sound resolution is possible within the second interpretation. Step 6 in the proposed line of reasoning is not always
true, since if the player holds more than the maximum sum that can be put into the first envelope they must hold the
envelope containing the larger sum, and are thus certain to lose by switching. This may not occur often, but when it
does, the heavy loss the player incurs means that, on average, there is no advantage in switching. This resolves all
practical cases of the problem, whether or not the player looks in their envelope.
It can be envisaged, however, that the sums in the two envelopes are not limited. This requires a more careful
mathematical analysis, and also uncovers other possible interpretations of the problem. If, for example, the smaller
of the two sums of money is considered equally likely to be one of infinitely many positive integers, without upper
limit, then the probability that it is any given number is always zero. This absurd situation exemplifies an improper prior,
and this is generally considered to resolve the paradox in this case.
It is possible to devise a distribution for the sums possible in the first envelope such that the maximum value is
unlimited, computation of the expectation of what is in B given what is in A seems to dictate you should switch, and
the distribution constitutes a proper prior. In these cases it can be shown that the expected sum in both envelopes is
infinite. There is no gain, on average, in swapping.
The first two resolutions we present correspond, technically speaking, first to A being a random variable, and
secondly to it being a possible value of a random variable (and the expectation being computed is a conditional
expectation). At the same time, in the first resolution the two original amounts of money seem to be thought of as
being fixed, while in the second they are also thought of as varying. Thus there are two main interpretations of the
problem, and two main resolutions.

Proposed resolutions to the alternative interpretation


Nalebuff (1989), Christensen and Utts (1992), Falk and Konold (1992), Blachman, Christensen and Utts (1996),
Nickerson and Falk (2006), pointed out that if the amounts of money in the two envelopes have any proper
probability distribution representing the player's prior beliefs about the amounts of money in the two envelopes, then
it is impossible that whatever the amount A=a in the first envelope might be, it would be equally likely, according to
these prior beliefs, that the second contains a/2 or 2a. Thus step 6 of the argument, which leads to always switching,
is a non-sequitur.

Mathematical details
According to this interpretation, the writer is carrying out the following computation, where he is conditioning now
on the value of A, the amount in Envelope A, not on the pair of amounts in the two envelopes X and Y:

   E(B | A = a) = E(B | A = a, A < B) P(A < B | A = a) + E(B | A = a, A > B) P(A > B | A = a)

Completely correctly, and according to Step 5, the two conditional expectation values are evaluated as

   E(B | A = a, A < B) = 2a,   E(B | A = a, A > B) = a/2

However in Step 6 the writer is invoking Steps 2 and 3 to get the two conditional probabilities, and effectively
replacing the two conditional probabilities of Envelope A containing the smaller and larger amount, respectively,
given the amount actually in that envelope, both by the unconditional probability 1/2: he makes the substitutions

   P(A < B | A = a) = P(A > B | A = a) = 1/2
But intuitively, we would expect that the larger the amount in A, the more likely it is the larger of the two, and
vice-versa. And it is a mathematical fact, as we will see in a moment, that it is impossible that both of these
conditional probabilities are equal to 1/2 for all possible values of a. In fact, for step 6 to be true, whatever a might
be, the smaller amount of money in the two envelopes must be equally likely to be between 1 and 2, as between 2
and 4, as between 4 and 8, ... ad infinitum. But there is no way to divide total probability 1 into an infinite number of
pieces that are not only all equal to one another, but also all larger than zero. Yet the smaller amount of money in
the two envelopes must have probability larger than zero to be in at least one of the just mentioned ranges.
To see this, suppose that the chance that the smaller of the two envelopes contains an amount between 2^n and
2^(n+1) is p(n), where n is any whole number, positive or negative, and for definiteness we include the lower limit
but exclude the upper in each interval. It follows that the conditional probability that the envelope in our hands
contains the smaller amount of money of the two, given that its contents are between 2^n and 2^(n+1), is

   p(n) / (p(n) + p(n−1))

If this is equal to 1/2, it follows by simple algebra that

   2 p(n) = p(n) + p(n−1),

or p(n) = p(n−1). This must be true for all n, an impossibility.
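The failure of step 6 is easy to exhibit for any concrete proper prior. For instance, under the hypothetical prior p(n) = (1/2)^(n+1) on the interval index n = 0, 1, 2, ..., the conditional probability just derived equals 1/3 for every n, never 1/2 (a minimal sketch):

    from fractions import Fraction

    def p(n):
        # hypothetical proper prior on the index n: p(n) = (1/2)^(n+1), n >= 0
        return Fraction(1, 2 ** (n + 1))

    for n in range(1, 6):
        # P(we hold the smaller amount | contents lie in [2^n, 2^(n+1)))
        print(n, p(n) / (p(n) + p(n - 1)))  # 1/3 every time, not 1/2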

New variant
Though Bayesian probability theory can resolve the alternative interpretation of the paradox above, it turns out that
examples can be found of proper probability distributions, such that the expected value of the amount in the second
envelope given that in the first does exceed the amount in the first, whatever it might be. The first such example was
already given by Nalebuff (1989). See also Christensen and Utts (1992).
Denote again the amount of money in the first envelope by A and that in the second by B. We think of these as
random. Let X be the smaller of the two amounts and Y=2X be the larger. Notice that once we have fixed a
probability distribution for X then the joint probability distribution of A,B is fixed, since A,B = X,Y or Y,X each with
probability 1/2, independently of X,Y.
The bad step 6 in the "always switching" argument led us to the finding E(B|A=a)>a for all a, and hence to the
recommendation to switch, whether or not we know a. Now, it turns out that one can quite easily invent proper
probability distributions for X, the smaller of the two amounts of money, such that this bad conclusion is still true.
One example is analysed in more detail, in a moment.
It cannot be true that whatever a, given A=a, B is equally likely to be a/2 or 2a, but it can be true that whatever a,
given A=a, B is larger in expected value than a.


Suppose for example (Broome, 1995)[2] that the envelope with the smaller amount actually contains 2^n dollars with
probability 2^n/3^(n+1), where n = 0, 1, 2, .... These probabilities sum to 1, hence the distribution is a proper prior (for
subjectivists) and a completely decent probability law also for frequentists.
Imagine what might be in the first envelope. A sensible strategy would certainly be to swap when the first envelope
contains 1, as the other must then contain 2. Suppose on the other hand the first envelope contains 2. In that case
there are two possibilities: the envelope pair in front of us is either {1, 2} or {2, 4}. All other pairs are impossible.
The conditional probability that we are dealing with the {1, 2} pair, given that the first envelope contains 2, is

   P({1, 2} | A = 2) = (1/6) / (1/6 + 1/9) = 3/5,

and consequently the probability it's the {2, 4} pair is 2/5, since these are the only two possibilities. In this
derivation, 1/6 = (1/3) · (1/2) is the probability that the envelope pair is the pair 1 and 2, and Envelope A happens to
contain 2; 1/9 = (2/9) · (1/2) is the probability that the envelope pair is the pair 2 and 4, and (again) Envelope A
happens to contain 2. Those are the only two ways that Envelope A can end up containing the amount 2.
It turns out that these proportions hold in general unless the first envelope contains 1. Denote by a the amount we
imagine finding in Envelope A, if we were to open that envelope, and suppose that a = 2^n for some n ≥ 1. In that case
the other envelope contains a/2 with probability 3/5 and 2a with probability 2/5.
So either the first envelope contains 1, in which case the conditional expected amount in the other envelope is 2, or
the first envelope contains a > 1, and though the second envelope is more likely to be smaller than larger, its
conditionally expected amount is larger: the conditionally expected amount in Envelope B is

   E(B | A = a) = (3/5) · (a/2) + (2/5) · 2a = 11a/10,
which is more than a. This means that the player who looks in Envelope A would decide to switch whatever he saw
there. Hence there is no need to look in Envelope A to make that decision.
This conclusion is just as clearly wrong as it was in the preceding interpretations of the Two Envelopes Problem. But
now the flaws noted above do not apply; the a in the expected value calculation is a constant and the conditional
probabilities in the formula are obtained from a specified and proper prior distribution.
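The 3/5 proportion and the 11a/10 conditional expectation can be verified exactly from the Broome prior (a minimal sketch using exact rational arithmetic):

    from fractions import Fraction

    def p(n):
        # Broome prior: the smaller amount is 2^n with probability 2^n / 3^(n+1)
        return Fraction(2 ** n, 3 ** (n + 1))

    for n in range(1, 6):          # Envelope A contains a = 2^n with n >= 1
        a = 2 ** n
        w_lower = p(n - 1) / 2     # pair {a/2, a}: A holds the larger amount
        w_upper = p(n) / 2         # pair {a, 2a}: A holds the smaller amount
        pr_lower = w_lower / (w_lower + w_upper)
        e_b = pr_lower * Fraction(a, 2) + (1 - pr_lower) * 2 * a
        print(a, pr_lower, e_b / a)  # always 3/5 and 11/10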

Proposed resolutions
Some writers think that the new paradox can be defused.[3] Suppose E(B | A = a) > a for all a. As remarked
before, this is possible for some probability distributions of X (the smaller amount of money in the two envelopes).
Averaging over a, it follows either that E(B) > E(A), or alternatively that E(A) = E(B) = ∞. But A and
B have the same probability distribution, and hence the same expectation value, by symmetry (each envelope is
equally likely to be the smaller of the two). Thus both have infinite expectation values, and hence so must X too.
Thus if we switch for the second envelope because its conditional expected value is larger than what actually is in
the first, whatever that might be, we are exchanging an unknown amount of money whose expectation value is
infinite for another unknown amount of money with the same distribution and the same infinite expected value. The
average amount of money in both envelopes is infinite. Exchanging one for the other simply exchanges an average of
infinity with an average of infinity.
Probability theory therefore tells us why and when the paradox can occur and explains to us where the sequence of
apparently logical steps breaks down. In this situation, Steps 6 and Steps 7 of the standard Two Envelopes argument
can be replaced by correct calculations of the conditional probabilities that the other envelope contains half or twice


what's in A, and a correct calculation of the conditional expectation of what's in B given what's in A. Indeed, that
conditional expected value is larger than what's in A. But because the unconditional expected amount in A is infinite,
this does not provide a reason to switch, because it does not guarantee that on average you'll be better off after
switching. One only has this mathematical guarantee in the situation that the unconditional expectation value of
what's in A is finite. But then the reason for switching without looking in the envelope, E(B | A = a) > a
for all a, simply cannot arise.
Many economists prefer to argue that in a real-life situation, the expectation of the amount of money in an envelope
cannot be infinity, for instance, because the total amount of money in the world is bounded; therefore any probability
distribution describing the real world would have to assign probability 0 to the amount being larger than the total
amount of money on the world. Therefore the expectation of the amount of money under this distribution cannot be
infinity. The resolution of the second paradox, for such writers, is that the postulated probability distributions cannot
arise in a real-life situation. These are similar arguments as used to explain the St. Petersburg Paradox.

Foundations of mathematical economics


In mathematical economics and the theory of utility, which explains economic behaviour in terms of expected utility,
there remains a problem to be resolved. In the real world we presumably would not indefinitely exchange one
envelope for the other (and probability theory, as just discussed, explains quite well why calculations of conditional
expectations might mislead us). Yet the expected utility based theory of economic behaviour assumes that people do
(or should) make economic decisions by maximizing expected utility, conditional on present knowledge. If the utility
function is unbounded above, then the theory can still predict infinite switching.
Fortunately for mathematical economics and the theory of utility, it is generally agreed that as an amount of money
increases, its utility to the owner increases less and less, and ultimately there is a finite upper bound to the utility of
all possible amounts of money. We can pretend that the amount of money in the whole world is as large as we like,
yet the utility that the owner of all that money experiences, while rising further and further, will never rise beyond a
certain point no matter how much is in his possession. For decision theory and utility theory, the two envelope
paradox illustrates that unbounded utility does not exist in the real world, so fortunately there is no need to build a
decision theory that allows unbounded utility, let alone utility of infinite expectation.

Controversy among philosophers


As mentioned above, any distribution producing this variant of the paradox must have an infinite mean. So before
the player opens an envelope the expected gain from switching is "∞ − ∞", which is not defined. In the words of
Chalmers this is "just another example of a familiar phenomenon, the strange behaviour of infinity". Chalmers
suggests that decision theory generally breaks down when confronted with games having a diverging expectation,
and compares it with the situation generated by the classical St. Petersburg paradox.
However, Clark and Shackel argue that this blaming it all on "the strange behaviour of infinity" does not resolve the
paradox at all; neither in the single case nor the averaged case. They provide a simple example of a pair of random
variables both having infinite mean but where it is clearly sensible to prefer one to the other, both conditionally and
on average. They argue that decision theory should be extended so as to allow infinite expectation values in some
situations.


Non-probabilistic variant
The logician Raymond Smullyan questioned if the paradox has anything to do with probabilities at all. He did this by
expressing the problem in a way that does not involve probabilities. The following plainly logical arguments lead to
conflicting conclusions:
1. Let the amount in the envelope chosen by the player be A. By swapping, the player may gain A or lose A/2. So the
potential gain is strictly greater than the potential loss.
2. Let the amounts in the envelopes be X and 2X. Now by swapping, the player may gain X or lose X. So the
potential gain is equal to the potential loss.

Proposed resolutions
A number of solutions have been put forward. Careful analyses have been made by some logicians. Though solutions
differ, they all pinpoint semantic issues concerned with counterfactual reasoning. We want to compare the amount
that we would gain by switching if we would gain by switching, with the amount we would lose by switching if we
would indeed lose by switching. However, we cannot both gain and lose by switching at the same time. We are
asked to compare two incompatible situations. Only one of them can factually occur, the other is a counterfactual
situation, somehow imaginary. To compare them at all, we must somehow "align" the two situations, providing
some definite points in common.
James Chase (2002) argues that the second argument is correct because it corresponds to the way of aligning the two
situations (one in which we gain, the other in which we lose) that is preferentially indicated by the problem
description. Bernard Katz and Doris Olin (2007) also argue this point of view. In the second argument, we consider
the amounts of money in the two envelopes as being fixed; what varies is which one is first given to the player.
Because that was an arbitrary and physical choice, the counterfactual world in which the player, counterfactually,
got the other envelope to the one he was actually (factually) given is a highly meaningful counterfactual world and
hence the comparison between gains and losses in the two worlds is meaningful. This comparison is uniquely
indicated by the problem description, in which two amounts of money are put in the two envelopes first, and only
after that is one chosen arbitrarily and given to the player. In the first argument, however, we consider the amount of
money in the envelope first given to the player as fixed and consider the situations where the second envelope
contains either half or twice that amount. This would only be a reasonable counterfactual world if in reality the
envelopes had been filled as follows: first, some amount of money is placed in the specific envelope that will be
given to the player; and secondly, by some arbitrary process, the other envelope is filled (arbitrarily or randomly)
either with double or with half of that amount of money.
Byeong-Uk Yi (2009), on the other hand, argues that comparing the amount you would gain if you would gain by
switching with the amount you would lose if you would lose by switching is a meaningless exercise from the outset.
According to his analysis, all three implications (switch, indifferent, do not switch) are incorrect. He analyses
Smullyan's arguments in detail, showing that intermediate steps are being taken, and pinpointing exactly where an
incorrect inference is made according to his formalization of counterfactual inference. An important difference with
Chase's analysis is that he does not take account of the part of the story where we are told that the envelope called
Envelope A is decided completely at random. Thus, Chase puts probability back into the problem description in
order to conclude that arguments 1 and 3 are incorrect, argument 2 is correct, while Yi keeps "two envelope problem
without probability" completely free of probability, and comes to the conclusion that there are no reasons to prefer
any action. This corresponds to the view of Albers et al., that without probability ingredient, there is no way to argue
that one action is better than another, anyway.
In perhaps the most recent paper on the subject, Bliss argues that the source of the paradox is that when one
mistakenly believes in the possibility of a larger payoff that does not, in actuality, exist, one is mistaken by a larger
margin than when one believes in the possibility of a smaller payoff that does not actually exist. If, for example, the
envelopes contained $5.00 and $10.00 respectively, a player who opened the $10.00 envelope would expect the


possibility of a $20.00 payout that simply does not exist. Were that player to open the $5.00 envelope instead, he
would believe in the possibility of a $2.50 payout, which constitutes a smaller deviation from the true value.
Albers, Kooi, and Schaafsma (2005) consider that without adding probability (or other) ingredients to the problem,
Smullyan's arguments do not give any reason to swap or not to swap, in any case. Thus there is no paradox. This
dismissive attitude is common among writers from probability and economics: Smullyan's paradox arises precisely
because he takes no account whatever of probability or utility.

Extensions to the problem


Since the two envelopes problem became popular, many authors have studied the problem in depth in the situation in
which the player has a prior probability distribution of the values in the two envelopes, and does look in Envelope A.
One of the most recent such publications is by McDonnell and Abbott (2009), who also consider some further
generalizations.
If a priori we know that the amount in the smaller envelope is a whole number of some currency units, then the
problem is determined, as far as probability theory is concerned, by the probability mass function p(x) describing
our prior beliefs that the smaller amount is any number x = 1, 2, ...; the summation over all values of x being equal to
1. It follows that given the amount a in Envelope A, the amount in Envelope B is certainly 2a if a is an odd number.
However, if a is even, then the amount in Envelope B is 2a with probability p(a) / (p(a/2) + p(a)), and a/2
with probability p(a/2) / (p(a/2) + p(a)). If one would like to switch envelopes if the expectation value of
what is in the other is larger than what we have in ours, then a simple calculation shows that one should switch if
p(a/2) < 2 p(a), and keep Envelope A if p(a/2) > 2 p(a).
If on the other hand the smaller amount of money can vary continuously, and we represent our prior beliefs about it
with a probability density f(x), thus a function that integrates to one when we integrate over x running from zero
to infinity, then given the amount a in Envelope A, the other envelope contains 2a with probability
f(a) / (f(a) + f(a/2)/2), and a/2 with probability (f(a/2)/2) / (f(a) + f(a/2)/2). If again we decide to
switch or not according to the expectation value of what's in the other envelope, the criterion for switching now
becomes f(a/2) < 4 f(a).
The difference between the results for discrete and continuous variables may surprise many readers. Speaking
intuitively, this is explained as follows. Let h be a small quantity and imagine that the amount of money we see when
we look in Envelope A is rounded off in such a way that differences smaller than h are not noticeable, even though
actually it varies continuously. The probability that the smaller amount of money is in an interval around a of length
h, and Envelope A contains the smaller amount, is approximately (1/2) f(a) h. The probability that the larger
amount of money is in an interval around a of length h corresponds to the smaller amount being in an interval of
length h/2 around a/2. Hence the probability that the larger amount of money is in a small interval around a of length
h and Envelope A contains the larger amount is approximately (1/2) f(a/2) h/2 = (1/4) f(a/2) h. Thus, given Envelope A
contains an amount about equal to a, the probability it is the smaller of the two is roughly f(a) / (f(a) + f(a/2)/2).
If the player only wants to end up with the larger amount of money, and does not care about expected amounts, then
in the discrete case he should switch if a is an odd number, or if a is even and p(a) > p(a/2). In the continuous
case he should switch if f(a) > f(a/2)/2.
Some authors prefer to think of probability in a frequentist sense. If the player knows the probability distribution
used by the organizer to determine the smaller of the two values, then the analysis would proceed just as in the case
when p or f represents subjective prior beliefs. However, what if we take a frequentist point of view, but the player
does not know what probability distribution is used by the organiser to fix the amounts of money in any one
instance? Thinking of the arranger of the game and the player as two parties in a two person game, puts the problem
into the range of game theory. The arranger's strategy consists of a choice of a probability distribution of x, the
smaller of the two amounts. Allowing the player also to use randomness in making his decision, his strategy is

determined by his choosing a probability of switching q(a) for each possible amount of money a he might see in
Envelope A. In this section we so far only discussed fixed strategies, that is strategies for which q only takes the
values 0 and 1, and we saw that the player is fine with a fixed strategy, if he knows the strategy of the organizer. In
the next section we will see that randomized strategies can be useful when the organizer's strategy is not known.

Randomized solutions
Suppose as in the previous section that the player is allowed to look in the first envelope before deciding whether to
switch or to stay. We'll think of the contents of the two envelopes as being two positive numbers, not necessarily two
amounts of money. The player is allowed either to keep the number in Envelope A, or to switch and take the number
in Envelope B. We'll drop the assumption that one number is exactly twice the other; we'll just suppose that they are
different and positive. On the other hand, instead of trying to maximize expectation values, we'll just try to maximize
the chance that we end up with the larger number.
In this section we ask the question, is it possible for the player to make his choice in such a way that he goes home
with the larger number with probability strictly greater than half, however the organizer has filled the two envelopes?
We are given no information at all about the two numbers in the two envelopes, except that they are different, and
strictly greater than zero. The numbers were written down on slips of paper by the organiser and put into the two
envelopes. The envelopes were then shuffled; the player picks one, calls it Envelope A, and opens it.
We are not told any joint probability distribution of the two numbers. We are not asking for a subjectivist solution.
We must think of the two numbers in the envelopes as chosen by the arranger of the game according to some
possibly random procedure, completely unknown to us, and fixed. Think of each envelope as simply containing a
positive number and such that the two numbers are not the same. The job of the player is to end up with the envelope
with the larger number. This variant of the problem, as well as its solution, is attributed by McDonnell and Abbott,
and by earlier authors, to information theorist Thomas M. Cover.
Counter-intuitive though it might seem, there is a way that the player can decide whether to switch or to stay so that
he has a larger chance than 1/2 of finishing with the bigger number, however the two numbers are chosen by the
arranger of the game. However, it is only possible with a so-called randomized algorithm: the player must be able to
generate his own random numbers. Suppose he is able to produce a random number, let's call it Z, such that the
probability that Z is larger than any particular quantity z is exp(-z). Note that exp(-z) starts off equal to 1 at z=0 and
decreases strictly and continuously as z increases, tending to zero as z tends to infinity. So the chance is 0 that Z is
exactly equal to any particular number, and there is a positive probability that Z lies between any two particular
different numbers. The player compares his Z with the number in Envelope A. If Z is smaller he keeps the envelope.
If Z is larger he switches to the other envelope.
Think of the two numbers in the envelopes as fixed (though of course unknown to the player). Think of the player's
random Z as a probe with which he decides whether the number in Envelope A is small or large. If it is small
compared to Z he switches, if it is large compared to Z he stays.
If both numbers are smaller than the player's Z, his strategy does not help him. He ends up with Envelope B,
which is equally likely to be the larger or the smaller of the two. If both numbers are larger than Z his strategy does
not help him either, he ends up with the first Envelope A, which again is equally likely to be the larger or the smaller
of the two. However if Z happens to be in between the two numbers, then his strategy leads him correctly to keep
Envelope A if its contents are larger than those of B, but to switch to Envelope B if A has smaller contents than B.
Altogether, this means that he ends up with the envelope with the larger number with probability strictly larger than
1/2. To be precise, the probability that he ends with the "winning envelope" is 1/2 + P(Z falls between the two
numbers)/2.
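This strategy is easy to simulate (a minimal sketch; the exponential probe and the organiser's two numbers, 3 and 7, are arbitrary illustrative choices):

    import random

    random.seed(2)
    x, y = 3.0, 7.0                  # the organiser's two distinct positive numbers
    n = 100_000
    wins = 0
    for _ in range(n):
        envelopes = [x, y]
        random.shuffle(envelopes)
        a, b = envelopes             # a is the opened Envelope A
        z = random.expovariate(1.0)  # probe Z with P(Z > z) = exp(-z)
        final = b if z > a else a    # switch exactly when Z exceeds a
        wins += final == max(x, y)

    # strictly above 1/2: equals 1/2 + P(x < Z < y)/2 = 0.5 + (e^-3 - e^-7)/2
    print(wins / n)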
In practice, the number Z we have described could be determined to the necessary degree of accuracy as follows.
Toss a fair coin many times, and convert the sequence of heads and tails into the binary representation of a number U

between 0 and 1: for instance, HTHHTH... becomes the binary representation of u = 0.101101... In this way, we
generate a random number U, uniformly distributed between 0 and 1. Then define Z = −ln(U), where "ln" stands for
natural logarithm, i.e., logarithm to base e. Note that we just need to toss the coin long enough to verify whether Z is
smaller or larger than the number a in the first envelope; we do not need to go on forever. We only need to toss the
coin a finite (though random) number of times: at some point we can be sure that the outcomes of further coin tosses
would not change the outcome.
The particular probability law (the so-called standard exponential distribution) used to generate the random number
Z in this problem is not crucial. Any probability distribution over the positive real numbers that assigns positive
probability to any interval of positive length does the job.
This problem can be considered from the point of view of game theory, where we make the game a two-person
zero-sum game with outcomes win or lose, depending on whether the player ends up with the higher or lower
amount of money. The organiser chooses the joint distribution of the amounts of money in both envelopes, and the
player chooses the distribution of Z. The game does not have a "solution" (or saddle point) in the sense of game
theory. This is an infinite game and von Neumann's minimax theorem does not apply.

History of the paradox


The envelope paradox dates back at least to 1953, when Belgian mathematician Maurice Kraitchik proposed a puzzle
in his book Recreational Mathematics concerning two equally rich men who meet and compare their beautiful
neckties, presents from their wives, wondering which tie actually cost more money. It is also mentioned in a 1953
book on elementary mathematics and mathematical puzzles by the mathematician John Edensor Littlewood, who
credited it to the physicist Erwin Schrödinger. Martin Gardner popularized Kraitchik's puzzle in his 1982 book Aha!
Gotcha, in the form of a wallet game:
Two people, equally rich, meet to compare the contents of their wallets. Each is ignorant of the contents of the
two wallets. The game is as follows: whoever has the least money receives the contents of the wallet of the
other (in the case where the amounts are equal, nothing happens). One of the two men can reason: "I have the
amount A in my wallet. That's the maximum that I could lose. If I win (probability 0.5), the amount that I'll
have in my possession at the end of the game will be more than 2A. Therefore the game is favourable to me."
The other man can reason in exactly the same way. In fact, by symmetry, the game is fair. Where is the
mistake in the reasoning of each man?
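The symmetry claim can be checked by simulation (a minimal sketch; the uniform distribution of wallet contents is an illustrative assumption not given in the original puzzle):

    import random

    random.seed(3)
    n = 100_000
    net = 0.0
    for _ in range(n):
        a = random.uniform(0, 100)  # my wallet
        b = random.uniform(0, 100)  # the other man's wallet
        if a < b:
            net += b                # I have less, so I win his contents
        elif b < a:
            net -= a                # he has less, so he wins my contents
    print(net / n)                  # ~0: the game is fair, as symmetry dictates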
In 1988 and 1989, Barry Nalebuff presented two different two-envelope problems, each with one envelope
containing twice what's in the other, and each with computation of the expectation value 5A/4. The first paper just
presents the two problems, the second paper discusses many solutions to both of them. The second of his two
problems is nowadays the most common, and is presented in this article. According to this version, the two envelopes
are filled first, then one is chosen at random and called Envelope A. Martin Gardner independently mentioned this
same version in his 1989 book Penrose Tiles to Trapdoor Ciphers and the Return of Dr Matrix. Barry Nalebuff's
asymmetric variant, often known as the Ali Baba problem, has one envelope filled first, called Envelope A, and
given to Ali. Then a fair coin is tossed to decide whether Envelope B should contain half or twice that amount, and
only then given to Baba.


Notes and references


[1] A complete list of published and unpublished sources in chronological order can be found in the talk page.
[2] A famous example of a proper probability distribution of the amounts of money in the two envelopes, for which
E(B | A = a) > a for all a.
[3] Letters to the editor, comment on Christensen and Utts (1992).

Kavka's toxin puzzle


Kavka's toxin puzzle is a thought experiment about the possibility of forming an intention to perform an act that,
following from reason, one would not actually perform. It was presented by moral and political philosopher
Gregory S. Kavka in "The Toxin Puzzle" (1983), and grew out of his work in deterrence theory and mutual assured
destruction. Kavka is also well known for his Paradox of Future Individuals, which addresses our moral obligation
to future persons to plan for the future now. His slave child example also illustrates the deontological concept that
holds humans to be "beyond price": they should never be used as a mere means, but rather as an end.

The puzzle
Kavka's original version of the puzzle is the following:
An eccentric billionaire places before you a vial of toxin that, if you drink it, will make you painfully ill
for a day, but will not threaten your life or have any lasting effects. The billionaire will pay you one
million dollars tomorrow morning if, at midnight tonight, you intend to drink the toxin tomorrow
afternoon. He emphasizes that you need not drink the toxin to receive the money; in fact, the money will
already be in your bank account hours before the time for drinking it arrives, if you succeed. All you
have to do is... intend at midnight tonight to drink the stuff tomorrow afternoon. You are perfectly free
to change your mind after receiving the money and not drink the toxin.
A possible interpretation: Can you intend to drink the toxin if you also intend to change your mind at a later time?

The paradox
The paradoxical nature can be stated in many ways, which may be useful for understanding the analysis proposed by
philosophers:
• In line with Newcomb's paradox, an omniscient pay-off mechanism makes a person's decision known to him
before he makes the decision, but it is also assumed that the person may change his decision afterwards, of free
will.
• Similarly in line with Newcomb's paradox, Kavka's claim that one cannot intend what one will not do makes the
pay-off mechanism an example of reverse causation.
• The pay-off for the decision to drink the poison is ambiguous.
• There are two decisions for one event with different pay-offs.
Since the pain caused by the poison would be more than offset by the money received, we can sketch the pay-off
table as follows.

Pay-offs (Initial analysis)

                 Intend    Do not intend
Drink              90           -10
Do not drink      100             0

According to Kavka: Drinking the poison is never to your advantage regardless of whether you are paid. A rational
person would know he would not drink the poison and thus could not intend to drink it.

Pay-offs (According to Kavka)

                 Intend        Do not intend
Drink           Impossible         -10
Do not drink    Impossible           0

David Gauthier argues that once a person intends to drink the poison, one cannot entertain ideas of not drinking it.
The rational outcome of your deliberation tomorrow morning is the action that will be part of your life
going as well as possible, subject to the constraint that it be compatible with your commitment; in this
case, compatible with the sincere intention that you form today to drink the toxin. And so the rational
action is to drink the toxin.

Pay-offs (According to Gauthier)

                 Intend        Do not intend
Drink              90              -10
Do not drink    Impossible           0

One of the central tenets of the puzzle is that, for a reasonable person:
• There are reasonable grounds for that person to drink the toxin, since some reward may be obtained.
• Having come to the above conclusion, there are no reasonable grounds for that person to drink the toxin, since no
further reward may be obtained, and no reasonable person would partake in self-harm for no benefit.
Thus a reasonable person must intend to drink the toxin by the first argument, yet if that person intends to drink the
toxin, he is being irrational by the second argument.

References


Necktie paradox
The necktie paradox is a puzzle or paradox within the subjectivistic interpretation of probability theory. It is a
variation (and historically, the origin) of the two-envelope paradox.
Two men are each given a necktie by their respective wives as a Christmas present. Over drinks they start arguing
over who has the cheaper necktie. They agree to have a wager over it. They will consult their wives and find out
which necktie is more expensive. The terms of the bet are that the man with the more expensive necktie has to give it
to the other as the prize.
The first man reasons as follows: winning and losing are equally likely. If I lose, then I lose the value of my necktie.
But if I win, then I win more than the value of my necktie. Therefore the wager is to my advantage. The second man
can consider the wager in exactly the same way; thus, paradoxically, it seems both men have the advantage in the
bet. This is obviously not possible.
The paradox can be resolved by giving more careful consideration to what is lost in one scenario ("the value of my
necktie") and what is won in the other ("more than the value of my necktie"). If we assume for simplicity that the
only possible necktie prices are $20 and $30, and that a man has equal chances of having a $20 or $30 necktie, then
four outcomes (all equally likely) are possible:
Price of 1st man's tie   Price of 2nd man's tie   1st man's gain/loss
$20                      $20                      $0
$20                      $30                      gain $30
$30                      $20                      lose $30
$30                      $30                      $0
We see that the first man has a 50% chance of a neutral outcome, a 25% chance of gaining a necktie worth $30, and
a 25% chance of losing a necktie worth $30. Turning to the losing and winning scenarios: if the man loses $30, then
it is true that he has lost the value of his necktie; and if he gains $30, then it is true that he has gained more than the
value of his necktie. The win and the loss are equally likely; but what we call the value of his necktie in the losing
scenario is the same amount as what we call more than the value of his necktie in the winning scenario.
Accordingly, neither man has the advantage in the wager.
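The table can also be checked mechanically. The brief Python sketch below (illustrative only; the price list is the simplified $20/$30 assumption made above) enumerates the four equally likely cases and confirms that the first man's expected gain is zero.

    from itertools import product

    prices = [20, 30]                 # the two possible necktie prices, equally likely
    outcomes = []
    for mine, theirs in product(prices, repeat=2):   # four equally likely cases
        if mine > theirs:
            gain = -mine              # I had the dearer tie, so I hand it over
        elif mine < theirs:
            gain = theirs             # I win the other man's dearer tie
        else:
            gain = 0                  # equal prices: nothing happens
        outcomes.append(gain)

    print(outcomes)                       # [0, 30, -30, 0]
    print(sum(outcomes) / len(outcomes))  # expected gain: 0.0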
In general, what goes wrong is that when the first man is imagining the scenario that his necktie is actually worth
less than the other, his beliefs as to its value have to be revised (downwards) relative to what they are a priori,
without such additional information. Yet in the apparently logical reasoning leading him to take the wager, he is
behaving as if his necktie is worth the same when it is worth less than the other, as when it is worth more than the
other. Of course, the price his wife actually paid for it is fixed, and doesn't change if it is revealed which tie is worth
more. The point is that this price, whatever it was, is unknown to him. It is his beliefs about the price which could not
be the same if he was given further information as to which tie was worth more. And it is on the basis of his prior
beliefs about the prices that he has to make his decision whether or not to accept the wager.
On a technical note, if the prices of the ties could in principle be arbitrarily large, then it is possible to have beliefs
about their values, such that learning which was the larger would not cause any change to one's beliefs about the
value of one's own tie. However, if one is 100% certain that neither tie can be worth more than, say $100, then
knowing which is worth more changes one's expected value of both (one goes up, the other goes down).
This paradox is a rephrasing of the simplest case of the two envelopes problem, and the explanation of "what goes
wrong" is essentially the same.


References
Brown, Aaron C. "Neckties, wallets, and money for nothing." Journal of Recreational Mathematics 27.2 (1995):
116–122.
Maurice Kraitchik, Mathematical Recreations, George Allen & Unwin, London 1943


Economy
Allais paradox
The Allais paradox is a choice problem designed by Maurice Allais to show an inconsistency of actual observed
choices with the predictions of expected utility theory.

Statement of the Problem


The Allais paradox arises when comparing participants' choices in two different experiments, each of which
consists of a choice between two gambles, A and B. The payoffs for each gamble in each experiment are as follows:
Experiment 1                                    Experiment 2

Gamble 1A              Gamble 1B                Gamble 2A              Gamble 2B
Winnings    Chance     Winnings    Chance       Winnings    Chance     Winnings    Chance
$1 million  100%       $1 million   89%         Nothing      89%       Nothing      90%
                       Nothing       1%         $1 million   11%       $5 million   10%
                       $5 million   10%

Several studies involving hypothetical and small monetary payoffs, and recently involving health outcomes, have
supported the assertion that when presented with a choice between 1A and 1B, most people would choose 1A.
Likewise, when presented with a choice between 2A and 2B, most people would choose 2B. Allais further asserted
that it was reasonable to choose 1A alone or 2B alone.
However, that the same person (who chose 1A alone or 2B alone) would choose both 1A and 2B together is
inconsistent with expected utility theory. According to expected utility theory, the person should choose either 1A
and 2A or 1B and 2B.
The inconsistency stems from the fact that in expected utility theory, equal outcomes added to each of the two
choices should have no effect on the relative desirability of one gamble over the other; equal outcomes should
"cancel out". Each experiment gives the same outcome 89% of the time (starting from the top row and moving down,
both 1A and 1B give an outcome of $1 million, and both 2A and 2B give an outcome of nothing). If this 89%
common consequence is disregarded, then the gambles will be left offering the same choice.
It may help to re-write the payoffs. After disregarding the 89% chance of winning the same outcome, 1B
is left offering a 1% chance of winning nothing and a 10% chance of winning $5 million, while 2B is also left
offering a 1% chance of winning nothing and a 10% chance of winning $5 million. Hence, choice 1B and 2B can be
seen as the same choice. In the same manner, 1A and 2A should also now be seen as the same choice.


Experiment 1                                    Experiment 2

Gamble 1A              Gamble 1B                Gamble 2A              Gamble 2B
Winnings    Chance     Winnings    Chance       Winnings    Chance     Winnings    Chance
$1 million   89%       $1 million   89%         Nothing      89%       Nothing      89%
$1 million   11%       Nothing       1%         $1 million   11%       Nothing       1%
                       $5 million   10%                                $5 million   10%

Allais presented his paradox as a counterexample to the independence axiom.


Independence means that if an agent is indifferent between simple lotteries L1 and L2, the agent is also indifferent
between L1 mixed with an arbitrary simple lottery L3 with probability p and L2 mixed with L3 with the same
probability p. Violating this principle is known as the "common consequence" problem (or "common consequence"
effect). The idea of the common consequence problem is that as the prize offered by L3 increases, L1 and L2
become consolation prizes, and the agent will modify preferences between the two lotteries so as to minimize risk
and disappointment in case they do not win the higher prize offered by L3.
Difficulties such as this gave rise to a number of alternatives to, and generalizations of, the theory, notably including
prospect theory, developed by Daniel Kahneman and Amos Tversky, weighted utility (Chew), and rank-dependent
expected utility by John Quiggin. The point of these models was to allow a wider range of behavior than was
consistent with expected utility theory.
Also relevant here is the framing theory of Daniel Kahneman and Amos Tversky. Identical items will result in
different choices if presented to agents differently (i.e. a surgery with a 70% survival rate vs. a 30% chance of death).
The main point Allais wished to make is that the independence axiom of expected utility theory may not be a valid
axiom. The independence axiom states that two identical outcomes within a gamble should be treated as irrelevant to
the analysis of the gamble as a whole. However, this overlooks the notion of complementarities, the fact your choice
in one part of a gamble may depend on the possible outcome in the other part of the gamble. In the above choice, 1B,
there is a 1% chance of getting nothing. However, this 1% chance of getting nothing also carries with it a great sense
of disappointment if you were to pick that gamble and lose, knowing you could have won with 100% certainty if you
had chosen 1A. This feeling of disappointment, however, is contingent on the outcome in the other portion of the
gamble (i.e. the feeling of certainty). Hence, Allais argues that it is not possible to evaluate portions of gambles or
choices independently of the other choices presented, as the independence axiom requires, and thus is a poor judge
of our rational action (1B cannot be valued independently of 1A as the independence or sure thing principle requires
of us). We don't act irrationally when choosing 1A and 2B; rather expected utility theory is not robust enough to
capture such "bounded rationality" choices that in this case arise because of complementarities.

Mathematical proof of inconsistency


Using the values above and a utility function U(W), where W is wealth, we can demonstrate exactly how the paradox
manifests.
Because the typical individual prefers 1A to 1B and 2B to 2A, the expected utility of each preferred gamble must
exceed that of the alternative:

Experiment 1: U($1M) > 0.89 U($1M) + 0.01 U($0) + 0.10 U($5M)

Experiment 2: 0.89 U($0) + 0.11 U($1M) < 0.90 U($0) + 0.10 U($5M)

We can rewrite the latter inequality (Experiment 2) as

0.11 U($1M) < 0.01 U($0) + 0.10 U($5M),

and adding 0.89 U($1M) to both sides gives

U($1M) < 0.89 U($1M) + 0.01 U($0) + 0.10 U($5M),

which contradicts the first bet (Experiment 1), which shows the player prefers the sure thing over the gamble.
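The contradiction can also be seen numerically: for every utility assignment whatsoever, the expected-utility gap between 1A and 1B equals the gap between 2A and 2B, so no expected-utility maximiser can prefer 1A and 2B together. The Python sketch below (an illustration; the two sample utility assignments are arbitrary choices of ours) verifies this identity.

    def eu(lottery, u):
        """Expected utility of a lottery given as {outcome: probability}."""
        return sum(p * u[outcome] for outcome, p in lottery.items())

    g1a = {1: 1.00}                      # outcomes measured in millions of dollars
    g1b = {1: 0.89, 0: 0.01, 5: 0.10}
    g2a = {0: 0.89, 1: 0.11}
    g2b = {0: 0.90, 5: 0.10}

    # Whatever utilities we pick, the two differences are always equal, so
    # preferring 1A to 1B (d1 > 0) forces preferring 2A to 2B (d2 > 0):
    # choosing 1A together with 2B is ruled out for an expected-utility maximiser.
    for u in [{0: 0.0, 1: 1.0, 5: 2.0}, {0: -3.0, 1: 10.0, 5: 11.0}]:
        d1 = eu(g1a, u) - eu(g1b, u)     # positive iff 1A is preferred to 1B
        d2 = eu(g2a, u) - eu(g2b, u)     # positive iff 2A is preferred to 2B
        print(round(d1, 10), round(d2, 10))   # the two numbers always coincide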

References
Machina, Mark (1987). "Choice Under Uncertainty: Problems Solved and Unsolved". The Journal of Economic
Perspectives 1 (1): 121–154. doi: 10.1257/jep.1.1.121 (http://dx.doi.org/10.1257/jep.1.1.121).
Allais, M. (1953). "Le comportement de l'homme rationnel devant le risque: critique des postulats et axiomes de
l'école Américaine". Econometrica 21 (4): 503–546. JSTOR 1907921 (http://www.jstor.org/stable/1907921).
Chew Soo Hong; Mao, Jennifer; Nishimura, Naoko (2005). Preference for Longshot: An Experimental Study of
Demand for Sweepstakes (http://cebr.ust.hk/conference/2ndconference/nishimura.htm).
Kahneman, Daniel; Tversky, Amos (1979). "Prospect Theory: An Analysis of Decision under Risk".
Econometrica 47 (2): 263–291. JSTOR 1914185 (http://www.jstor.org/stable/1914185).
Oliver, Adam (2003). "A quantitative and qualitative test of the Allais paradox using health outcomes". Journal of
Economic Psychology 24 (1): 35–48. doi: 10.1016/S0167-4870(02)00153-8 (http://dx.doi.org/10.1016/
S0167-4870(02)00153-8).
Quiggin, J. (1993). Generalized Expected Utility Theory: The Rank-Dependent Expected Utility Model.
Amsterdam: Kluwer-Nijhoff. Review (http://www.uq.edu.au/economics/johnquiggin/Books/Machina.html)


Arrow's impossibility theorem


In social choice theory, Arrow's impossibility theorem, the General Possibility Theorem, or Arrow's paradox,
states that, when voters have three or more distinct alternatives (options), no rank order voting system can convert
the ranked preferences of individuals into a community-wide (complete and transitive) ranking while also meeting
a specific set of criteria. These criteria are called unrestricted domain, non-dictatorship, Pareto efficiency, and
independence of irrelevant alternatives. The theorem is often cited in discussions of election theory as it is further
interpreted by the Gibbard–Satterthwaite theorem.
The theorem is named after economist Kenneth Arrow, who demonstrated the theorem in his Ph.D. thesis and
popularized it in his 1951 book Social Choice and Individual Values. The original paper was titled "A Difficulty in
the Concept of Social Welfare".[1]
In short, the theorem states that no rank-order voting system can be designed that satisfies these three "fairness"
criteria:
• If every voter prefers alternative X over alternative Y, then the group prefers X over Y.
• If every voter's preference between X and Y remains unchanged, then the group's preference between X and Y
will also remain unchanged (even if voters' preferences between other pairs like X and Z, Y and Z, or Z and W
change).
• There is no "dictator": no single voter possesses the power to always determine the group's preference.
Voting systems that use cardinal utility (which conveys more information than rank orders; see the subsection
discussing the cardinal utility approach to overcoming the negative conclusion) are not covered by the theorem.[2]
The theorem can also be sidestepped by weakening the notion of independence. Arrow rejected cardinal utility as a
meaningful tool for expressing social welfare,[3] and so focused his theorem on preference rankings.
The axiomatic approach Arrow adopted can treat all conceivable rules (that are based on preferences) within one
unified framework. In that sense, the approach is qualitatively different from the earlier one in voting theory, in
which rules were investigated one by one. One can therefore say that the contemporary paradigm of social choice
theory started from this theorem.[4]

Statement of the theorem


The need to aggregate preferences occurs in many disciplines: in welfare economics, where one attempts to find an
economic outcome which would be acceptable and stable; in decision theory, where a person has to make a rational
choice based on several criteria; and most naturally in voting systems, which are mechanisms for extracting a
decision from a multitude of voters' preferences.
The framework for Arrow's theorem assumes that we need to extract a preference order on a given set of options
(outcomes). Each individual in the society (or equivalently, each decision criterion) gives a particular order of
preferences on the set of outcomes. We are searching for a ranked voting system, called a social welfare function
(preference aggregation rule), which transforms the set of preferences (profile of preferences) into a single global
societal preference order. The theorem considers the following properties, assumed to be reasonable requirements of
a fair voting method:
Non-dictatorship
The social welfare function should account for the wishes of multiple voters. It cannot simply mimic the
preferences of a single voter.
Unrestricted domain
(or universality) For any set of individual voter preferences, the social welfare function should yield a unique
and complete ranking of societal choices. Thus:

• It must do so in a manner that results in a complete ranking of preferences for society.
• It must deterministically provide the same ranking each time voters' preferences are presented the same way.
Independence of irrelevant alternatives (IIA)
The social preference between x and y should depend only on the individual preferences between x and y
(Pairwise Independence). More generally, changes in individuals' rankings of irrelevant alternatives (ones
outside a certain subset) should have no impact on the societal ranking of the subset. (See Remarks below.)
Positive association of social and individual values
(or monotonicity) If any individual modifies his or her preference order by promoting a certain option, then
the societal preference order should respond only by promoting that same option or not changing, never by
placing it lower than before. An individual should not be able to hurt an option by ranking it higher.
Non-imposition
(or citizen sovereignty) Every possible societal preference order should be achievable by some set of
individual preference orders. This means that the social welfare function is surjective: It has an unrestricted
target space.
Arrow's theorem says that if the decision-making body has at least two members and at least three options to decide
among, then it is impossible to design a social welfare function that satisfies all these conditions at once.
A later (1963) version of Arrow's theorem can be obtained by replacing the monotonicity and non-imposition criteria
with:
Pareto efficiency
(or unanimity) If every individual prefers a certain option to another, then so must the resulting societal
preference order. This, again, is a demand that the social welfare function will be minimally sensitive to the
preference profile.
The later version of this theorem is stronger (it has weaker conditions), since monotonicity, non-imposition, and
independence of irrelevant alternatives together imply Pareto efficiency, whereas Pareto efficiency and independence
of irrelevant alternatives together do not imply monotonicity. (Incidentally, Pareto efficiency on its own implies
non-imposition.)
Remarks on IIA
The IIA condition can be justified for three reasons (Mas-Colell, Whinston, and Green, 1995, page 794): (i)
normative (irrelevant alternatives should not matter), (ii) practical (use of minimal information), and (iii) strategic
(providing the right incentives for the truthful revelation of individual preferences). Though the strategic property
is conceptually different from IIA, it is closely related.
Arrow's death-of-a-candidate example (1963, page 26) suggests that the agenda (the set of feasible alternatives)
shrinks from, say, X = {a, b, c} to S = {a, b} because of the death of candidate c. This example is misleading
since it can give the reader an impression that IIA is a condition involving two agenda and one profile. The fact is
that IIA involves just one agendum ({x, y} in case of Pairwise Independence) but two profiles. If the condition is
applied to this confusing example, it requires this: Suppose an aggregation rule satisfying IIA chooses b from the
agenda {a, b} when the profile is given by (cab, cba), that is, individual 1 prefers c to a to b, 2 prefers c to b to a.
Then, it must still choose b from {a, b} if the profile were, say, (abc, bac) or (acb, bca) or (acb, cba) or (abc, cba).


Formal statement of the theorem


Let A be a set of outcomes, N a number of voters or decision criteria. We shall denote the set of all full linear
orderings of A by L(A).
A (strict) social welfare function (preference aggregation rule) is a function F : L(A)^N → L(A) which
aggregates voters' preferences into a single preference order on A.[5] The N-tuple (R_1, ..., R_N) of voters'
preferences is called a preference profile. In its strongest and simplest form, Arrow's impossibility theorem states
that whenever the set A of possible alternatives has more than 2 elements, then the following three conditions
become incompatible:
unanimity, or Pareto efficiency
If alternative a is ranked above b for all orderings R_1, ..., R_N, then a is ranked higher than b by
F(R_1, ..., R_N). (Note that unanimity implies non-imposition).
non-dictatorship
There is no individual i whose preferences always prevail. That is, there is no i in {1, ..., N} such that
F(R_1, ..., R_N) = R_i for all profiles (R_1, ..., R_N).
independence of irrelevant alternatives
For two preference profiles (R_1, ..., R_N) and (S_1, ..., S_N) such that for all individuals i, alternatives a
and b have the same order in R_i as in S_i, alternatives a and b have the same order in F(R_1, ..., R_N) as in
F(S_1, ..., S_N).

Informal proof
Based on two proofs[6][7] appearing in Economic Theory. For simplicity we have presented all rankings as if ties are
impossible. A complete proof taking possible ties into account is not essentially different from the one below, except
that one ought to say "not above" instead of "below" or "not below" instead of "above" in some cases. Full details are
given in the original articles.
We will prove that any social choice system respecting unrestricted domain, unanimity, and independence of
irrelevant alternatives (IIA) is a dictatorship. The key idea is to identify a pivotal voter whose ballot swings the
societal outcome. We then prove that this voter is a partial dictator (in a specific technical sense, described below).
Finally we conclude by showing that all of the partial dictators are the same person, hence this voter is a dictator.


Part One: There is a "pivotal" voter for B over A


Say there are three choices for society, call them A, B, and C. Suppose first that everyone prefers option B the least.
That is, everyone prefers every other option to B. By unanimity, society must prefer every option to B. Specifically,
society prefers A and C to B. Call this situation Profile 0.
On the other hand, if everyone preferred B to everything else, then society would have to prefer B to everything else
by unanimity.
Now arrange all the voters in some arbitrary but fixed order, and for each i let Profile i be the same as Profile 0, but
move B to the top of the ballots for voters 1 through i. So Profile 1 has B at the top of the ballot for voter 1, but not
for any of the others. Profile 2 has B at the top for voters 1 and 2, but no others, and so on.
Since B eventually moves to the top of the societal preference, there must be some profile, number k, for which B
moves above A in the societal rank. We call the voter whose ballot change causes this to happen the pivotal voter
for B over A. Note that the pivotal voter for B over A is not, a priori, the same as the pivotal voter for A over B. In
Part Three of the proof we will show that these do turn out to be the same.

[Figure: Successively move B from the bottom to the top of voters' ballots. The voter whose change results in B
being ranked over A is the pivotal voter for B over A.]

Also note that by IIA the same argument applies if Profile 0 is any profile in which A is ranked above B by every
voter, and the pivotal voter for B over A will still be voter k. We will use this observation below.
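The construction in Part One can be run on any concrete rule satisfying unanimity. The Python sketch below is illustrative only: Borda count is our stand-in rule (it violates IIA, so it is not a counterexample to the theorem, but unanimity is all Part One uses). Starting from a Profile 0 in which all five voters rank B last, it moves B to the top one ballot at a time until the pivotal voter appears.

    def borda_order(profile):
        """Social ranking by Borda count (top choice scores 2, middle 1, last 0)."""
        score = {c: 0 for c in "ABC"}
        for ballot in profile:
            for points, cand in enumerate(reversed(ballot)):
                score[cand] += points
        return sorted("ABC", key=lambda c: -score[c])

    n = 5
    profile = [("A", "C", "B")] * n           # Profile 0: everyone ranks B last

    for k in range(1, n + 1):
        profile[k - 1] = ("B", "A", "C")      # move B to the top of voter k's ballot
        order = borda_order(profile)
        if order.index("B") < order.index("A"):
            print("pivotal voter for B over A:", k)   # prints k = 4 for this profile
            break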

Part Two: The pivotal voter for B over A is a dictator for B over C
In this part of the argument we refer to voter k, the pivotal voter for B over A, as Pivotal Voter for simplicity. We
will show that Pivotal Voter dictates society's decision for B over C. That is, we show that no matter how the rest of
society votes, if Pivotal Voter ranks B over C, then that is the societal outcome. Note again that the dictator for B
over C is not a priori the same as that for C over B. In Part Three of the proof we will see that these turn out to be
the same too.
In the following, we call voters 1 through k-1 "Segment One", and voters k+1 through N "Segment Two". To begin,
suppose that the ballots are as follows:

• Every voter in Segment One ranks B above C and C above A.
• Pivotal Voter ranks A above B and B above C.
• Every voter in Segment Two ranks A above B and B above C.

[Figure: Switching A and B on the ballot of voter k causes the same switch to the societal outcome, by Part One of
the argument. Making any or all of the indicated switches to the other ballots has no effect on the outcome.]

Then by the argument in Part One (and the last observation in that part), the societal outcome must rank A above B.
This is
because, except for a repositioning of C, this profile is the same as Profile k-1 from Part One. Furthermore, by
unanimity the societal outcome must rank B above C. Therefore we know the outcome in this case completely.
Now suppose that Pivotal Voter moves B above A, but keeps C in the same position and imagine that any number
(or all!) of the other voters change their ballots to move C above B, without changing the position of A. Then aside
from a repositioning of C this is the same as Profile k from Part One and hence the societal outcome ranks B above
A. Furthermore, by IIA the societal outcome must rank A above C, as in the previous case. In particular, the
societal outcome ranks B above C, even though Pivotal Voter may have been the only voter to rank B above C. By
IIA this conclusion holds independently of how A is positioned on the ballots, so Pivotal Voter is a dictator for B
over C.

Part Three: There can be at most one dictator


In this part of the argument we refer back to the original ordering of voters, and compare the positions of the
different pivotal voters (identified by applying Parts One and Two to the other pairs of candidates). First, the pivotal
voter for B over C must come earlier in the line than the dictator for B over C (or at the same position): as we
consider the argument of Part One applied to B and C, successively moving B to the top of voters' ballots, the pivot
point where society ranks B above C must come at or before we reach the dictator for B over C. Likewise, reversing
the roles of B and C, the pivotal voter for C over B must come at or later in line than the dictator for B over C. In
short, if kX/Y denotes the position of the pivotal voter for X over Y (for any two candidates X and Y), then we have
shown

kB/C ≤ kB/A ≤ kC/B.

[Figure: Since voter k is the dictator for B over C, the pivotal voter for B over C must appear among the first k
voters, that is, outside of Segment Two. Likewise, the pivotal voter for C over B must appear among voters k
through N, that is, outside of Segment One.]

Now repeating the entire argument above with B and C switched, we also have

kC/B ≤ kB/C.

Therefore we have

kB/C = kB/A = kC/B
and the same argument for other pairs shows that all the pivotal voters (and hence all the dictators) occur at the same
position in the list of voters. This voter is the dictator for the whole election.


Interpretations of the theorem


Although Arrow's theorem is a mathematical result, it is often expressed in a non-mathematical way with a statement
such as "No voting method is fair," "Every ranked voting method is flawed," or "The only voting method that isn't
flawed is a dictatorship". These statements are simplifications of Arrow's result which are not universally considered
to be true. What Arrow's theorem does state is that a deterministic preferential voting mechanism - that is, one where
a preference order is the only information in a vote, and any possible set of votes gives a unique result - cannot
comply with all of the conditions given above simultaneously.
Various theorists have suggested weakening the IIA criterion as a way out of the paradox. Proponents of ranked
voting methods contend that the IIA is an unreasonably strong criterion. It is the one breached in most useful voting
systems. Advocates of this position point out that failure of the standard IIA criterion is trivially implied by the
possibility of cyclic preferences. If voters cast ballots as follows:
1 vote for A > B > C
1 vote for B > C > A
1 vote for C > A > B
then the pairwise majority preference of the group is that A wins over B, B wins over C, and C wins over A: these
yield rock-paper-scissors preferences for any pairwise comparison. In this circumstance, any aggregation rule that
satisfies the very basic majoritarian requirement that a candidate who receives a majority of votes must win the
election, will fail the IIA criterion, if social preference is required to be transitive (or acyclic). To see this, suppose
that such a rule satisfies IIA. Since majority preferences are respected, the society prefers A to B (two votes for A>B
and one for B>A), B to C, and C to A. Thus a cycle is generated, which contradicts the assumption that social
preference is transitive.
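The cycle is easy to verify by brute force. The Python sketch below (illustrative only) tallies the three ballots pair by pair and prints the three majority verdicts, which together form the rock-paper-scissors cycle described above.

    from itertools import combinations

    ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

    def majority_prefers(x, y):
        """True when a strict majority of ballots ranks x above y."""
        x_over_y = sum(b.index(x) < b.index(y) for b in ballots)
        return x_over_y > len(ballots) - x_over_y

    for x, y in combinations("ABC", 2):
        for a, b in [(x, y), (y, x)]:
            if majority_prefers(a, b):
                print(a, "beats", b, "by a 2-to-1 majority")
    # Prints: A beats B, C beats A, B beats C. Together these form a cycle,
    # so no transitive social ranking can respect all three pairwise majorities.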
So, what Arrow's theorem really shows is that any majority-wins voting system is a non-trivial game, and that game
theory should be used to predict the outcome of most voting mechanisms.[8] This could be seen
as a discouraging result, because a game need not have efficient equilibria, e.g., a ballot could result in an alternative
nobody really wanted in the first place, yet everybody voted for.
Remark: Scalar rankings from a vector of attributes and the IIA property. The IIA property might not be
satisfied in human decision-making of realistic complexity because the scalar preference ranking is effectively
derived from the weighting (not usually explicit) of a vector of attributes (one book dealing with the Arrow
theorem invites the reader to consider the related problem of creating a scalar measure for the track and field
decathlon event, e.g. how does one make scoring 600 points in the discus event "commensurable" with scoring 600
points in the 1500m race) and this scalar ranking can depend sensitively on the weighting of different attributes,
with the tacit weighting itself affected by the context and contrast created by apparently "irrelevant" choices. Edward
MacNeal discusses this sensitivity problem with respect to the ranking of "most livable city" in the chapter
"Surveys" of his book MathSemantics: making numbers talk sense (1994).

Other possibilities
In an attempt to escape from the negative conclusion of Arrow's theorem, social choice theorists have investigated
various possibilities ("ways out"). These investigations can be divided into the following two:
• those investigating functions whose domain, like that of Arrow's social welfare functions, consists of profiles of
preferences;
• those investigating other kinds of rules.


Approaches investigating functions of preference profiles


This section includes approaches that deal with
• aggregation rules (functions that map each preference profile into a social preference), and
• other functions, such as functions that map each preference profile into an alternative.
Since these two approaches often overlap, we discuss them at the same time. What is characteristic of these
approaches is that they investigate various possibilities by eliminating or weakening or replacing one or more
conditions (criteria) that Arrow imposed.
Infinitely many individuals
Several theorists (e.g., Kirman and Sondermann, 1972) point out that when one drops the assumption that there are
only finitely many individuals, one can find aggregation rules that satisfy all of Arrow's other conditions.
However, such aggregation rules are practically of limited interest, since they are based on ultrafilters, highly
nonconstructive mathematical objects. In particular, Kirman and Sondermann argue that there is an "invisible
dictator" behind such a rule. Mihara (1997, 1999) shows that such a rule violates algorithmic computability.[9] These
results can be seen to establish the robustness of Arrow's theorem.[10]
Limiting the number of alternatives
When there are only two alternatives to choose from, May's theorem shows that only simple majority rule satisfies a
certain set of criteria (e.g., equal treatment of individuals and of alternatives; increased support for a winning
alternative should not make it into a losing one). On the other hand, when there are at least three alternatives,
Arrow's theorem points out the difficulty of collective decision making. Why is there such a sharp difference
between the case of less than three alternatives and that of at least three alternatives?
Nakamura's theorem (about the core of simple games) gives an answer more generally. It establishes that if the
number of alternatives is less than a certain integer called the Nakamura number, then the rule in question will
identify "best" alternatives without any problem; if the number of alternatives is greater or equal to the Nakamura
number, then the rule will not always work, since for some profile a voting paradox (a cycle such as alternative A
socially preferred to alternative B, B to C, and C to A) will arise. Since the Nakamura number of majority rule is 3
(except the case of four individuals), one can conclude from Nakamura's theorem that majority rule can deal with up
to two alternatives rationally. Some super-majority rules (such as those requiring 2/3 of the votes) can have a
Nakamura number greater than 3, but such rules violate other conditions given by Arrow.[11]
Remark. A common way "around" Arrow's paradox is limiting the alternative set to two alternatives. Thus,
whenever more than two alternatives should be put to the test, it seems very tempting to use a mechanism that pairs
them and votes by pairs. As tempting as this mechanism seems at first glance, it is generally far from satisfying even
Pareto efficiency, not to mention IIA. The specific order by which the pairs are decided strongly influences the
outcome. This is not necessarily a bad feature of the mechanism. Many sports use the tournament
mechanism, essentially a pairing mechanism, to choose a winner. This gives considerable opportunity for weaker
teams to win, thus adding interest and tension throughout the tournament. This means that the person controlling the
order by which the choices are paired (the agenda maker) has great control over the outcome. In any case, when
viewing the entire voting process as one game, Arrow's theorem still applies.



Domain restrictions
Another approach is relaxing the universality condition, which means restricting the domain of aggregation rules.
The best-known result along this line assumes "single peaked" preferences.
Duncan Black has shown that if there is only one dimension on which every individual has a "single-peaked"
preference, then all of Arrow's conditions are met by majority rule. Suppose that there is some predetermined linear
ordering of the alternative set. An individual's preference is single-peaked with respect to this ordering if he has
some special place that he likes best along that line, and his dislike for an alternative grows larger as the alternative
goes further away from that spot (i.e., the graph of his utility function has a single peak if alternatives are placed
according to the linear ordering on the horizontal axis). For example, if voters were voting on where to set the
volume for music, it would be reasonable to assume that each voter had their own ideal volume preference and that
as the volume got progressively too loud or too quiet they would be increasingly dissatisfied. If the domain is
restricted to profiles in which every individual has a single peaked preference with respect to the linear ordering,
then simple aggregation rules, which include majority rule, have an acyclic (defined below) social preference,
hence "best" alternatives.[12] In particular, when there is an odd number of individuals, then the social preference
becomes transitive, and the socially "best" alternative is equal to the median of all the peaks of the individuals
(Black's median voter theorem). Under single-peaked preferences, the majority rule is in some respects the most
natural voting mechanism.
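Black's result can be illustrated with a short simulation. In the Python sketch below (illustrative; the volume-setting example follows the paragraph above, and the function names are our own), each voter's preference is single-peaked, with alternatives liked less the further they lie from the voter's ideal volume. For an odd number of voters, the pairwise-majority winner always coincides with the median peak.

    import random
    from statistics import median_low

    def majority_winner(alternatives, peaks):
        """Pairwise-majority (Condorcet) winner under single-peaked preferences."""
        def prefers(peak, x, y):
            return abs(x - peak) < abs(y - peak)   # closer to one's peak is better
        for x in alternatives:
            if all(sum(prefers(p, x, y) for p in peaks) >
                   sum(prefers(p, y, x) for p in peaks)
                   for y in alternatives if y != x):
                return x
        return None

    alternatives = list(range(11))        # volume settings 0, 1, ..., 10
    peaks = [random.choice(alternatives) for _ in range(7)]   # 7 voters (odd)
    print(majority_winner(alternatives, peaks), median_low(peaks))   # always equal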
One can define the notion of "single-peaked" preferences on higher-dimensional sets of alternatives. However, one
can identify the "median" of the peaks only in exceptional cases. Instead, we typically have the destructive situation
suggested by McKelvey's Chaos Theorem (1976): for any x and y, one can find a sequence of alternatives
z1, ..., zk such that x is beaten by z1 by a majority, z1 by z2, ..., and zk by y.
Relaxing transitivity
By relaxing the transitivity of social preferences, we can find aggregation rules that satisfy Arrow's other conditions.
If we impose neutrality (equal treatment of alternatives) on such rules, however, there exists an individual who has a
"veto". So the possibility provided by this approach is also very limited.
First, suppose that a social preference is quasi-transitive (instead of transitive); this means that the strict preference
("better than") is transitive: if x is strictly preferred to y and y to z, then x is strictly preferred to z. Then, there do
exist non-dictatorial
aggregation rules satisfying Arrow's conditions, but such rules are oligarchic (Gibbard, 1969). This means that there
exists a coalition L such that L is decisive (if every member in L prefers x to y, then the society prefers x to y), and
each member in L has a veto (if she prefers x to y, then the society cannot prefer y to x).
Second, suppose that a social preference is acyclic (instead of transitive): there do not exist alternatives
x1, ..., xk that form a cycle (x1 preferred to x2, x2 to x3, ..., xk-1 to xk, and xk to x1). Then, provided that there
are at least as many alternatives as individuals, an aggregation rule satisfying Arrow's other conditions is collegial
(Brown, 1975). This means that there are individuals who belong to the intersection ("collegium") of all decisive
coalitions. If there is someone who has a veto, then he belongs to the collegium. If the rule is assumed to be neutral,
then it does have someone who has a veto.
Finally, Brown's theorem left open the case of acyclic social preferences where the number of alternatives is less
than the number of individuals. One can give a definite answer for that case using the Nakamura number. See
#Limiting the number of alternatives.



Relaxing IIA
There are numerous examples of aggregation rules satisfying Arrow's conditions except IIA. The Borda rule is one
of them. These rules, however, are susceptible to strategic manipulation by individuals (Blair and Muller, 1983).
See also Interpretations of the theorem above.
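A concrete failure of IIA under the Borda rule takes only two voters and three candidates. In the Python sketch below (illustrative only), every voter keeps the same relative order of A and B in both profiles, yet moving C around flips the social ranking of A and B.

    def borda_scores(profile):
        """Borda scores over candidates A, B, C; each ballot lists most to least preferred."""
        score = {c: 0 for c in "ABC"}
        for ballot in profile:
            for pts, cand in zip((2, 1, 0), ballot):
                score[cand] += pts
        return score

    p1 = [("A", "B", "C"), ("B", "C", "A")]   # voter 1: A>B>C, voter 2: B>C>A
    p2 = [("A", "C", "B"), ("B", "A", "C")]   # same A-vs-B order on every ballot;
                                              # only C has been moved around

    print(borda_scores(p1))   # {'A': 2, 'B': 3, 'C': 1} -> society ranks B above A
    print(borda_scores(p2))   # {'A': 3, 'B': 2, 'C': 1} -> society ranks A above B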
Relaxing the Pareto criterion
Wilson (1972) shows that if an aggregation rule is non-imposed and non-null, then there is either a dictator or an
inverse dictator, provided that Arrow's conditions other than Pareto are also satisfied. Here, an inverse dictator is an
individual i such that whenever i prefers x to y, then the society prefers y to x.
Remark. Amartya Sen offered both relaxation of transitivity and removal of the Pareto principle. He demonstrated
another interesting impossibility result, known as the "impossibility of the Paretian Liberal". (See liberal paradox for
details). Sen went on to argue that this demonstrates the futility of demanding Pareto optimality in relation to voting
mechanisms.
Social choice instead of social preference
In social decision making, to rank all alternatives is not usually a goal. It often suffices to find some alternative. The
approach focusing on choosing an alternative investigates either social choice functions (functions that map each
preference profile into an alternative) or social choice rules (functions that map each preference profile into a subset
of alternatives).
As for social choice functions, the Gibbard–Satterthwaite theorem is well-known, which states that if a social choice
function whose range contains at least three alternatives is strategy-proof, then it is dictatorial.
As for social choice rules, we should assume there is a social preference behind them. That is, we should regard a
rule as choosing the maximal elements ("best" alternatives) of some social preference. The set of maximal elements
of a social preference is called the core. Conditions for existence of an alternative in the core have been investigated
in two approaches. The first approach assumes that preferences are at least acyclic (which is necessary and sufficient
for the preferences to have a maximal element on any finite subset). For this reason, it is closely related to #Relaxing
transitivity. The second approach drops the assumption of acyclic preferences. Kumabe and Mihara (2011) adopt this
approach. They make a more direct assumption that individual preferences have maximal elements, and examine
conditions for the social preference to have a maximal element. See Nakamura number for details of these two
approaches.

Rated voting systems and other approaches


Arrow's framework assumes that individual and social preferences are "orderings" (i.e., satisfy completeness and
transitivity) on the set of alternatives. This means that if the preferences are represented by a utility function, its
value is an ordinal utility in the sense that it is meaningful so far as the greater value indicates the better alternative.
For instance, having ordinal utilities of 4, 3, 2, 1 for alternatives a, b, c, d, respectively, is the same as having 1000,
100.01, 100, 0, which in turn is the same as having 99, 98, 1, 0.997. They all represent the ordering in which a is
preferred to b to c to d. The assumption of ordinal preferences, which precludes interpersonal comparisons of utility,
is an integral part of Arrow's theorem.
For various reasons, an approach based on cardinal utility, where the utility has a meaning beyond just giving a
ranking of alternatives, is not common in contemporary economics. However, once one adopts that approach, one
can take intensities of preferences into consideration, or one can compare (i) gains and losses of utility or (ii) levels
of utility, across different individuals. In particular, Harsanyi (1955) gives a justification of utilitarianism (which
evaluates alternatives in terms of the sum of individual utilities), originating from Jeremy Bentham. Hammond
(1976) gives a justification of the maximin principle (which evaluates alternatives in terms of the utility of the
worst-off individual), originating from John Rawls.


Not all voting methods use, as input, only an ordering of all candidates.[13] Methods which don't, often called "rated"
or "cardinal" (as opposed to "ranked", "ordinal", or "preferential") voting systems, can be viewed as using
information that only cardinal utility can convey. In that case, it is not surprising if some of them satisfy all of
Arrow's conditions that are reformulated.[14] Range voting is such a method.[15] Whether such a claim is correct
depends on how each condition is reformulated.[16] Other rated voting systems which pass certain
generalizations of Arrow's criteria include Approval voting and Majority Judgment. Note that although Arrow's
theorem does not apply to such methods, the Gibbard–Satterthwaite theorem still does: no system is fully
strategy-free, so the informal dictum that "no voting system is perfect" still has a mathematical basis.
Finally, though not an approach investigating some kind of rules, there is a criticism by James M. Buchanan and
others. It argues that it is silly to think that there might be social preferences that are analogous to individual
preferences. Arrow (1963, Chapter 8) answers this sort of criticism seen in the early period, which comes at least
partly from misunderstanding.

Notes
[1] Arrow, K.J., "A Difficulty in the Concept of Social Welfare" (http://gatton.uky.edu/Faculty/hoytw/751/articles/arrow.pdf), Journal of
Political Economy 58(4) (August, 1950), pp. 328–346.
[2] Interview with Dr. Kenneth Arrow (http://electology.org/interview-with-dr-kenneth-arrow/): "CES: Now, you mention that your theorem
applies to preferential systems or ranking systems. Dr. Arrow: Yes. CES: But the system that you're just referring to, Approval Voting, falls
within a class called cardinal systems. So not within ranking systems. Dr. Arrow: And as I said, that in effect implies more information."
[3] "Modern economic theory has insisted on the ordinal concept of utility; that is, only orderings can be observed, and therefore no measurement
of utility independent of these orderings has any significance. In the field of consumer's demand theory the ordinalist position turned out to
create no problems; cardinal utility had no explanatory power above and beyond ordinal. Leibniz' Principle of the Identity of the
Indiscernables demanded then the excision of cardinal utility from our thought patterns." Arrow (1967), as quoted on p.33 (http:/ / books.
google. com/ books?id=7ECXDjlCpB0C& pg=PA33) by .
[4] Suzumura (2002), Introduction, page 10.
[5] Note that by definition, a social welfare function as defined here satisfies the Unrestricted domain condition. Restricting the range to the
social preferences that are never indifferent between distinct outcomes is probably a very restrictive assumption, but the goal here is to give a
simple statement of the theorem. Even if the restriction is relaxed, the impossibility result will persist.
[6] Three Brief Proofs of Arrow's Impossibility Theorem (http://ideas.repec.org/p/cwl/cwldpp/1123r3.html)
[7] Yu, Ning Neil (2012). A One-shot Proof of Arrow's Impossibility Theorem (https://sites.google.com/site/neilningyu/publications/doc/preprint 3 - economic theory.pdf?attredirects=0)
[8] This does not mean various normative criteria will be satisfied if we use equilibrium concepts in game theory. Indeed, the mapping from
profiles to equilibrium outcomes defines a social choice rule, whose performance can be investigated by social choice theory. See
Austen-Smith and Banks (1999), Section 7.2.
[9] Mihara's definition of a computable aggregation rule is based on computability of a simple game (see Rice's theorem).
[10] See Taylor (2005, Chapter 6) for a concise discussion of social choice for infinite societies.
[11] Austen-Smith and Banks (1999, Chapter 3) gives a detailed discussion of the approach trying to limit the number of alternatives.
[12] Indeed, many different social welfare functions can meet Arrow's conditions under such restrictions of the domain. It has been proved,
however, that under any such restriction, if there exists any social welfare function that adheres to Arrow's criteria, then the majority rule will
adhere to Arrow's criteria. See Campbell, D. E.; Kelly, J. S. (2000). "A simple characterization of majority rule". Economic Theory 15 (3):
689–700. doi: 10.1007/s001990050318 (http://dx.doi.org/10.1007/s001990050318).
[13] It is sometimes asserted that such methods may trivially fail the universality criterion. However, it is more appropriate to consider that such
methods fail Arrow's definition of an aggregation rule (or that of a function whose domain consists of preference profiles), if preference
orderings cannot uniquely translate into a ballot.
[14] However, a modified version of Arrow's theorem may still apply to such methods (e.g., Brams and Fishburn, 2002, Chapter 4,
Theorem 4.2).
[15] New Scientist, 12 April 2008, pages 30–33.
[16] No voting method that nontrivially uses cardinal utility satisfies Arrow's IIA (in which preference profiles are replaced by lists of ballots or
lists of utilities). For this reason, a weakened notion of IIA is proposed (e.g., Sen, 1979, page 129). The weakened notion requires that the
social ranking of two alternatives depend only on the levels of utility attained by individuals at the two alternatives; formally, a social welfare
functional maps each list of individual utility functions into a social preference. Many cardinal voting methods (including Range voting)
satisfy this weakened version of IIA.

References
Campbell, D.E. and Kelly, J.S. (2002) Impossibility theorems in the Arrovian framework, in Handbook of Social
Choice and Welfare (ed. by Kenneth J. Arrow, Amartya K. Sen and Kotaro Suzumura), volume 1, pages 35–94,
Elsevier. Surveys many of the approaches discussed in #Approaches investigating functions of preference profiles.
The Mathematics of Behavior by Earl Hunt, Cambridge University Press, 2007. The chapter "Defining
Rationality: Personal and Group Decision Making" has a detailed discussion of the Arrow Theorem, with proof.
URL to CUP information on this book (http://www.cambridge.org/9780521850124)
Why flip a coin? : the art and science of good decisions by Harold W. Lewis, John Wiley, 1997. Gives explicit
examples of preference rankings and apparently anomalous results under different voting systems. States but does
not prove Arrow's theorem. ISBN 0-471-29645-7
Sen, A. K. (1979) Personal utilities and public judgements: or what's wrong with welfare economics? The
Economic Journal, 89, 537–558, arguing that Arrow's theorem was wrong because it did not incorporate
non-utility information and the utility information it did allow was impoverished. http://www.jstor.org/stable/2231867
Yu, Ning Neil (2012) A one-shot proof of Arrow's theorem. Economic Theory, volume 50, issue 2, pages
523–525, Springer. http://link.springer.com/article/10.1007%2Fs00199-012-0693-3

External links

Three Brief Proofs of Arrow's Impossibility Theorem (http://ideas.repec.org/p/cwl/cwldpp/1123r3.html)
A Pedagogical Proof of Arrow's Impossibility Theorem (http://repositories.cdlib.org/ucsdecon/1999-25/)
Another Graphical Proof of Arrow's Impossibility Theorem (http://www.jstor.org/stable/1183438)
A One-Shot Proof of Arrow's Impossibility Theorem (http://www.springerlink.com/content/
v00202437u066604/)
Computer-aided Proofs of Arrow's and Other Impossibility Theorems (http://www.sciencedirect.com/science/
article/pii/S0004370209000320/)


Bertrand paradox
For other paradoxes by Joseph Bertrand, see Bertrand's paradox (disambiguation).
In economics and commerce, the Bertrand paradox, named after its creator, Joseph Bertrand, describes a
situation in which two players (firms) reach a state of Nash equilibrium where both firms charge a price equal to
marginal cost ("MC"). The paradox is that in models such as Cournot competition, an increase in the number of
firms is associated with a convergence of prices to marginal costs. In these alternative models of oligopoly a small
number of firms earn positive profits by charging prices above cost. Suppose two firms, A and B, sell a
homogeneous commodity, each with the same cost of production and distribution, so that customers choose the
product solely on the basis of price. It follows that demand is infinitely price-elastic. Neither A nor B will set a
higher price than the other because doing so would yield the entire market to their rival. If they set the same price,
the companies will share both the market and profits.
On the other hand, if either firm were to lower its price, even a little, it would gain the whole market and
substantially larger profits. Since both A and B know this, they will each try to undercut their competitor until the
product is selling at zero economic profit. This is the pure-strategy Nash equilibrium. Recent work has shown that
there may be an additional mixed-strategy Nash equilibrium with positive economic profits.
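The undercutting logic is easy to animate. The Python sketch below is a minimal illustration (the starting price, the marginal cost and the one-cent step are assumed values of ours, not part of Bertrand's model as stated): two firms alternately play best responses, and their prices are driven down to marginal cost, the pure-strategy Nash equilibrium described above.

    MC = 10.0       # common marginal cost (assumed)
    STEP = 0.01     # smallest possible price cut, one cent

    def best_response(rival_price):
        """Undercut the rival by one cent while the price stays above cost."""
        if rival_price > MC + STEP:
            return rival_price - STEP   # capture the whole market
        return MC                       # no profitable undercut remains

    p_a, p_b = 50.0, 50.0               # both firms start at a high price
    for _ in range(10_000):
        p_a = best_response(p_b)
        p_b = best_response(p_a)
    print(round(p_a, 2), round(p_b, 2)) # both prices driven down to MC = 10.0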
The Bertrand paradox rarely appears in practice because real products are almost always differentiated in some way
other than price (brand name, if nothing else); firms have limitations on their capacity to manufacture and distribute;
and two firms rarely have identical costs.
Bertrand's result is paradoxical because if the number of firms goes from one to two, the price decreases from the
monopoly price to the competitive price and stays at the same level as the number of firms increases further. This is
not very realistic, as in reality, markets featuring a small number of firms with market power typically charge a price
in excess of marginal cost. The empirical analysis shows that in most industries with two competitors, positive
profits are made. Solutions to the paradox attempt to derive outcomes that are more in line with solutions from the
Cournot model of competition, where two firms in a market earn positive profits that lie somewhere between the
perfectly competitive and monopoly levels.
Some reasons the Bertrand paradox does not strictly apply:
Capacity constraints. Sometimes firms do not have enough capacity to satisfy all demand. This was a point first
raised by Francis Edgeworth[1] and gave rise to the Bertrand-Edgeworth model.
Integer pricing. Prices higher than MC are ruled out because one firm can undercut another by an arbitrarily small
amount. If prices are discrete (for example, have to take integer values), then one firm has to undercut the other by
at least one cent. This implies that a price one cent above MC is now an equilibrium: if either firm set its price
one cent above MC, the other firm could undercut it and capture the whole market, but doing so would earn it no
profit. Each firm will therefore prefer to share the market 50/50 with the other and earn strictly positive profits.
Product differentiation. If products of different firms are differentiated, then consumers may not switch
completely to the product with lower price.
Dynamic competition. Repeated interaction or repeated price competition can lead to the price above MC in
equilibrium.
More money for higher price. This follows from repeated interaction: if one company sets its price slightly higher,
it will still keep roughly the same number of sales but earn more profit on each sale, so the other company will
raise its price in turn, and so on (only in repeated games; otherwise the price dynamics run in the other direction).
Oligopoly. If the two companies can agree on a price, it is in their long-term interest to keep the agreement: the
revenue from cutting prices is less than twice the revenue from keeping the agreement, and lasts only until the
other firm cuts its own prices.


References
[1] Edgeworth, Francis (1889). "The pure theory of monopoly". Reprinted in Collected Papers Relating to Political Economy, vol. 1, Macmillan, 1925.

Demographic-economic paradox
The demographic-economic paradox is the inverse correlation found between wealth and fertility within and
between nations. The higher the degree of education and GDP per capita of a human population, subpopulation or
social stratum, the fewer children are born in any industrialized country. In a 1974 UN population conference in
Bucharest, Karan Singh, a former minister of population in India, illustrated this trend by stating "Development
is the best contraceptive."

[Figure: Graph of Total Fertility Rate vs. GDP per capita of the corresponding country, 2009. Only countries with
over 5 million population were plotted, to reduce outliers. Source: CIA World Fact Book. For details, see List of
countries and territories by fertility rate.]

The term "paradox" comes from the notion that greater means would necessitate the production of more offspring,
as suggested by the influential Thomas Malthus. Roughly speaking, nations or subpopulations with higher GDP per
capita are observed to have fewer children, even though a richer population can support more children. Malthus
held that in order to prevent widespread suffering, from famine for example, what he called "moral restraint"
(which included abstinence) was required. The demographic-economic paradox suggests that reproductive restraint
arises naturally as a consequence of economic progress.

It is hypothesized that the observed trend has come about as a response to increased life expectancy, reduced
childhood mortality, improved female literacy and independence, and urbanization that all result from increased
GDP per capita, consistent with the demographic transition model.
According to the UN, "[a]mong the 201 countries or areas with at least 90,000 inhabitants in 2013, 50 countries in
1990-1995 and 71 countries in 2005-2010 had below-replacement fertility. In 2005-2010, 27 countries had very low
fertility, below 1.5 children per woman, and all of these countries are located in Eastern Asia or Europe." [1]


Demographic transition
Before the 19th century demographic transition of the western world, a minority of children would survive to the
age of 20, and life expectancies were short even for those who reached adulthood. For example, in the 17th century
in York, England, 15% of children were still alive at age 15 and only 10% of children survived to age 20.

Birth rates were correspondingly high, resulting in slow population growth. The second agricultural revolution and
improvements in hygiene then brought about dramatic reductions in mortality rates in wealthy industrialized
countries, initially without affecting birth rates. In the 20th century, birth rates of industrialized countries
began to fall, as societies became accustomed to the higher probability that their children would survive them.
Cultural value changes were also contributors, as urbanization and female employment rose.

[Figure: United Nations' population summary and projections by location. Note that the vertical axis is logarithmic
and represents millions of people.]

Since wealth is what drives this demographic transition, it follows that nations that lag behind in wealth also lag
behind in this demographic transition. The developing world's equivalent Green Revolution did not begin until the
mid-twentieth century. This creates the existing spread in fertility rates as a function of GDP per capita.

Religion
See also: Religious views on birth control
Another contributor to the demographic-economic paradox may be religion. Religious societies tend to have higher
birth rates than secular ones, and richer, more educated nations tend to advance secularization. This may help explain
the Israeli and Saudi Arabian exceptions, the two notable outliers in the graph of fertility versus GDP per capita at
the top of this article. The role of different religions in determining family size is complex. For example, the Catholic
countries of southern Europe traditionally had a much higher fertility rate than was the case in Protestant northern
Europe. However, economic growth in Spain, Italy, Poland etc., has been accompanied by a particularly sharp fall in
the fertility rate, to a level below that of the Protestant north. This suggests that the demographic-economic paradox
applies more strongly in Catholic countries, although Catholic fertility started to fall when the liberalizing reforms of
Vatican II were implemented. It remains to be seen if the fertility rate among (mostly Catholic) Hispanics in the U.S.
will follow a similar pattern.

United States
In his book America Alone: The End of the World as We Know It, Mark Steyn asserts that the United States has
higher fertility rates because of its greater economic freedom compared to other industrialized countries. However,
the countries with the highest assessed economic freedom, Hong Kong and Singapore, have significantly lower
birthrates than the United States. According to the Index of Economic Freedom, Hong Kong is the most
economically free country in the world. Hong Kong also has one of the world's lowest birth rates.


Fertility and population density


Studies have also suggested a correlation between population density and fertility rate. Hong Kong and Singapore
have the third- and fourth-highest population densities in the world. This may account for their very low birth rates
despite high economic freedom. By contrast, the United States ranks 180 out of 241 countries and dependencies by
population density.

Consequences
Main articles: Sub-replacement fertility, Dependency ratio and Pensions crisis
A reduction in fertility can lead to an aging population, which leads to a variety of problems; see for example the
Demographics of Japan.
A related concern is that high birth rates tend to place a greater burden of child rearing and education on populations
already struggling with poverty. Consequently, inequality lowers average education and hampers economic
growth.[2] Also, in countries with a high burden of this kind, a reduction in fertility can hamper economic growth as
well as the other way around.[3]

References
[1] http://www.un.org/en/development/desa/population/publications/pdf/fertility/world-fertility-patterns-2013.pdf
[2] de la Croix, David and Matthias Doepke: Inequality and growth: why differential fertility matters. American Economic Review 93 (4) (2003), 1091–1113. (http://www.econ.ucla.edu/workingpapers/wp803.pdf)
[3] UNFPA: Population and poverty. Achieving equity, equality and sustainability. Population and development series no. 8, 2003. (http://www.unfpa.org/upload/lib_pub_file/191_filename_PDS08.pdf)

External links
Macleod, Mairi (29 October 2013), Population paradox: Why richer people have fewer kids (http://www.
newscientist.com/article/mg22029401.000-population-paradox-why-richer-people-have-fewer-kids.html)
(2940), New Scientist


Dollar auction
The dollar auction is a non-zero sum sequential game designed by economist Martin Shubik to illustrate a paradox
brought about by traditional rational choice theory in which players with perfect information in the game are
compelled to make an ultimately irrational decision based completely on a sequence of rational choices made
throughout the game.[1]

Setup
The setup involves an auctioneer who volunteers to auction off a dollar bill with the following rule: the bill goes to
the highest bidder; however, both the highest and the second-highest bidder must pay their final bids. The winner can
get a dollar for a mere five cents, but only if no one else enters the bidding war. The second-highest bidder is the
biggest loser, paying the top amount he/she bid without getting anything back. The game begins with one of the players bidding 5
paying the top amount he/she bid without getting anything back. The game begins with one of the players bidding 5
cents (the minimum), hoping to make a 95 cent profit. He can be outbid by another player bidding 10 cents, as a 90
cent profit is still desirable. Similarly, another bidder may bid 15 cents, making an 85 cent profit. Meanwhile, the
second bidder may attempt to convert his loss of 10 cents into a gain of 80 cents by bidding 20 cents, and so on.
Every player has a choice of either paying for nothing or bidding five cents more on the dollar. Any bid beyond the
value of a dollar is a loss for all bidders alike. Only the auctioneer gets to profit in the end.
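The escalation logic can be traced mechanically. The sketch below is a minimal illustration (two myopic bidders, 5-cent increments, and an artificial cap to terminate the demo are all assumptions): at every step the trailing bidder compares the sure loss of their last bid against raising once more and winning.

```python
# Minimal sketch of dollar-auction escalation (assumptions: two myopic
# bidders and 5-cent increments; each bidder compares only "quit now,
# losing my last bid" with "raise once more and win at that price").

PRIZE = 100   # value of the bill, in cents
STEP = 5      # bid increment, in cents

bids = [STEP]                            # the first bidder opens at 5 cents
while bids[-1] + STEP <= 2 * PRIZE:      # cap only so the demo terminates
    top = bids[-1]
    second = bids[-2] if len(bids) > 1 else 0
    # Quitting costs the trailing bidder `second`; raising and winning at
    # top + STEP yields PRIZE - (top + STEP). Raising is preferred whenever
    # PRIZE - (top + STEP) > -second, which under these rules is always true.
    if PRIZE - (top + STEP) > -second:
        bids.append(top + STEP)
    else:
        break

print(f"bidding has escalated to {bids[-1]} cents for a {PRIZE}-cent bill;")
print("the myopic comparison never favors quitting, so only the demo cap stops it")
```

Once both leading bids exceed a dollar, each step is a choice between a certain loss and a slightly larger possible loss, so the sequence of locally rational choices never terminates on its own.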
This game can be "beaten" in a sense if all players are superrational. In that case, each of the n players bids 5
cents only if an independent random event with probability 1/n comes up in their favor. If it ends up that no bids are made this
way, the person holding the auction would either abort or restart the game. If the game continues, one player (the
quickest to bid, possibly) will be the highest bidder at 5 cents. Once that player has locked in the 5 cent bid, the
other players could bid 10 cents, but they will not, because they recognize the inevitable standoff that would result from their
superrationality. Hence, the auctioneer loses 95 cents, and the lucky player wins that much, with no second bidder to
take from.
This game can also be "beaten" in a second way, provided that all the bidders are bidding in a rational, but not
necessarily superrational fashion. If the first bidder bids 95 cents, for a 5 cent profit, none of the other bidders will
follow it up with a bid, because there is nothing to gain. However, this works only if it is the first bid; otherwise,
the second-highest bidder will be pushed toward bidding.

Notes
[1] Shubik: 1971. Page 109

References
Shubik, Martin (1971). "The Dollar Auction Game: A Paradox in Noncooperative Behavior and Escalation"
(http://www.math.toronto.edu/mpugh/Teaching/Sci199_03/dollar_auction_1.pdf) (PDF, direct download, 274 KB).
Journal of Conflict Resolution 15 (1): 109–111. doi:10.1177/002200277101500111 (http://dx.doi.org/10.1177/002200277101500111).
Poundstone, William (1993). "The Dollar Auction". Prisoner's Dilemma: John Von Neumann, Game Theory, and
the Puzzle of the Bomb. New York: Oxford University Press. ISBN 0-19-286162-X.


Downs–Thomson paradox
The Downs–Thomson paradox (named after Anthony Downs and J. M. Thomson), also referred to as the
Pigou–Knight–Downs paradox (after Arthur Cecil Pigou and Frank Knight), states that the equilibrium speed of
car traffic on the road network is determined by the average door-to-door speed of equivalent journeys by (rail-based
or otherwise segregated) public transport.
It follows that increasing road capacity can make traffic congestion worse, when the shift from public transport
causes a disinvestment in that mode such that the operator reduces frequency of service or raises fares to cover costs.
This shifts additional passengers into cars. Ultimately the system may be eliminated and traffic congestion is worse
than before.
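A toy numerical model can exhibit this mechanism end to end. Everything below (the congestion curve, the ridership-dependent transit time, and all parameters) is an assumed illustration, not an empirical model: commuters switch to whichever mode is currently faster, and we compare the resulting equilibrium before and after a road expansion.

```python
# Stylized Downs-Thomson sketch. All functional forms and numbers are
# illustrative assumptions chosen only to exhibit the mechanism.

def car_time(n_car, capacity):
    # Door-to-door car time rises with congestion (a BPR-style delay curve).
    return 20 * (1 + (n_car / capacity) ** 2)

def transit_time(n_transit):
    # Fewer riders -> lower frequency / higher fares -> slower effective trips.
    return 40 + 20000 / max(n_transit, 1)

def equilibrium(capacity, commuters=10000):
    # Commuters switch modes one at a time until neither mode is faster.
    n_car = commuters // 2
    for _ in range(commuters):
        tc = car_time(n_car, capacity)
        tt = transit_time(commuters - n_car)
        if tc < tt and n_car < commuters:
            n_car += 1
        elif tc > tt and n_car > 0:
            n_car -= 1
        else:
            break
    return n_car, car_time(n_car, capacity)

for cap in (5000, 7000):   # before and after a road expansion
    n_car, t = equilibrium(cap)
    print(f"road capacity {cap}: {n_car} drivers, equilibrium trip ~{t:.0f} min")
```

Under these toy numbers the expansion tips the system past the point where transit can retain riders: the service empties out entirely and everyone ends up on a roughly 61-minute drive, versus an equilibrium of about 44 minutes before the expansion.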
The general conclusion, if the paradox applies, is that expanding a road system as a remedy to congestion is not only
ineffective but often counterproductive. This is known as the Lewis–Mogridge position and was extensively
documented by Martin Mogridge in a case study of London in his book Travel in towns: jam yesterday, jam today
and jam tomorrow?
A 1968 article by Dietrich Braess pointed out the existence of this counter-intuitive occurrence on networks: the
Braess' paradox states that adding extra capacity to a network, when the moving entities selfishly choose their route,
can in some cases reduce overall performance.
There has been recent interest in the study of this phenomenon, since the same may happen in computer networks as well
as traffic networks. Users of a growing network, like travelers on transportation networks, act independently and in a
decentralized manner in choosing optimal routes between origin and destination.
This is an extension of the induced demand theory and consistent with Downs (1992) theory of "triple convergence",
formulated to explain the difficulty of removing peak-hour congestion from highways. In response to a capacity
addition three immediate effects occur: drivers using alternative routes begin to use the expanded highway; those
previously traveling at off-peak times (either immediately before or after the peak) shift to the peak (rescheduling
behavior as defined previously); and public transport users shift to driving.

Restrictions on validity
According to Downs the link between average speeds on public transport and private transport "only applies to
regions in which the vast majority of peak-hour commuting is done on rapid transit systems with separate rights of
way. Central London is an example, since in 2001 around 85 percent of all morning peak-period commuters into that
area used public transit (including 77 percent on separate rights of way) and only 11 percent used private cars. When
peak-hour travel equilibrium has been reached between the subway system and the major commuting roads, then the
travel time required for any given trip is roughly equal on both modes."


References
On a Paradox of Traffic Planning, translated from the 1968 D. Braess paper from German to English by D.
Braess, A. Nagurney, and T. Wakolbinger (2005), Transportation Science 39/4, 446–450.
Downs, Anthony, Stuck in Traffic: Coping with Peak-Hour Traffic Congestion, The Brookings Institution:
Washington, DC. 1992. ISBN 0-8157-1923-X
Mogridge, Martin J.H. Travel in towns: jam yesterday, jam today and jam tomorrow? Macmillan Press: London,
1990. ISBN 0-333-53204-X
Thomson, J. M. (1972), Methods of traffic limitation in urban areas. Working Paper 3, Paris, OECD.

Easterlin paradox
The Easterlin Paradox is a key concept in happiness economics. It is named for the economist and USC professor
Richard Easterlin, who discussed the factors contributing to happiness in a 1974 book chapter.[1] According to the
University of Kent, the paradox explains that, "high incomes do correlate with happiness, but long term, increased
income doesn't correlate with increased happiness".
Easterlin found that within a given country people with higher incomes were more likely to report being happy.
However, in international comparisons, the average reported level of happiness did not vary much with national
income per person, at least for countries with income sufficient to meet basic needs. Similarly, although income per
person rose steadily in the United States between 1946 and 1970, average reported happiness showed no long-term
trend and declined between 1960 and 1970. The difference in international and micro-level results fostered an
ongoing body of research.[2]
Recent research has utilised several measures of happiness, including biological measures, showing similar patterns
of results. This goes some way to answering the problems of self-rated happiness. The claim was later taken up by
Andrew Oswald of the University of Warwick in 1997,[3] driving media interest in the topic.
If true (see below), one possible implication for government policy is said to be that, once basic needs are met,
policy should focus not on economic growth or GDP, but rather on increasing life satisfaction or Gross national
happiness (GNH).

Controversy
In 2003 Ruut Veenhoven and Michael Hagerty published a new analysis drawing on various sources of data,
and their conclusion was that there is no paradox and countries did indeed get happier with increasing income. In his
reply Easterlin maintained his position, suggesting that his critics were using inadequate data.
In 2008, economists Betsey Stevenson and Justin Wolfers, both of the University of Pennsylvania, published a paper
where they reassessed the Easterlin paradox using new time-series data. Like Veenhoven and Hagerty, they conclude that,
contrary to Easterlin's claim, increases in absolute income are clearly linked to increased self-reported happiness, for
both individual people and whole countries. The statistical relationship demonstrated is between happiness and the
logarithm of absolute income, suggesting that happiness increases more slowly than income, but that no "saturation
point" is ever reached. The study provides evidence that absolute income, in addition to relative income, determines
happiness. That is in contrast to an extreme understanding of the hedonic treadmill theory, where "keeping up with
the Joneses" is the only determinant of behavior.
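The logarithmic form of that result can be stated compactly. As a minimal formalization (the linear-in-logs shape is what the regressions estimate; the symbols $\alpha$ and $\beta$ are generic placeholders):

$$H = \alpha + \beta \ln y, \qquad \frac{dH}{dy} = \frac{\beta}{y} > 0 \quad \text{for all } y > 0.$$

Each doubling of income adds the same fixed increment $\beta \ln 2$ to reported happiness, so the marginal gain shrinks as income grows but never reaches zero: happiness rises ever more slowly, yet no saturation point is reached.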
In 2010 Easterlin published data reaffirming the paradox with data from a sample of 37 countries.[4] In a report
prepared for the United Nations in 2012 [5] Richard Layard, Andrew Clark and Claudia Senik point out that other
variables covary with wealth, including social trust, and that these, and not income, may drive much of the
association of GDP per capita with well-being.

References
[1] Easterlin (1974). "Does Economic Growth Improve the Human Lot? Some Empirical Evidence". In Paul A. David and Melvin W. Reder, eds.,
Nations and Households in Economic Growth: Essays in Honor of Moses Abramovitz, New York: Academic Press, Inc. pdf (http://graphics8.nytimes.com/images/2008/04/16/business/Easterlin1974.pdf)
[2] Diane J. Macunovich and Richard A. Easterlin, 2008 [1987], "Easterlin hypothesis," The New Palgrave Dictionary of Economics, 2nd
Edition. Abstract. (http://www.dictionaryofeconomics.com/article?id=pde2008_E000002&edition=current&q=)
Andrew E. Clark, Paul Frijters, and Michael A. Shields (2008). "Relative Income, Happiness, and Utility: An Explanation for the Easterlin
Paradox and Other Puzzles," Journal of Economic Literature, 46(1), pp. 95–144. (http://ibe.eller.arizona.edu/docs/2010/martinsson/happiness_jel_2008.pdf)
[3] Oswald, A. (2006). "The Hippies Were Right all Along about Happiness". Financial Times, January 19, 2006. pdf (http://www2.warwick.ac.uk/fac/soc/economics/staff/faculty/oswald/fthappinessjan96.pdf)
[4] Alok Jha: 13 December 2010
[5] http://earth.columbia.edu/articles/view/2960

External links
It's experts that make us miserable (http://observer.guardian.co.uk/comment/story/0,,2000672,00.html) Nick Cohen - The Guardian - January 28, 2007.
Andrew Oswald's Website (http://www.andrewoswald.com/).
Happiness Is Increasing in Many Countries -- But Why? (http://pewglobal.org/commentary/display.
php?AnalysisID=1020) - Bruce Stokes - July 24, 2007.

Ellsberg paradox
The Ellsberg paradox is a paradox in decision theory in which people's choices violate the postulates of subjective
expected utility. It is generally taken to be evidence for ambiguity aversion. The paradox was popularized by Daniel
Ellsberg, although a version of it was noted considerably earlier by John Maynard Keynes.
The basic idea is that people overwhelmingly prefer taking on risk in situations where they know specific odds rather
than an alternative risk scenario in which the odds are completely ambiguous: they will always choose a known
probability of winning over an unknown probability of winning, even if the known probability is low and the
unknown probability could be a guarantee of winning. That is, given a choice of risks to take (such as bets), people
"prefer the devil they know" rather than assuming a risk where odds are difficult or impossible to calculate.[1]
Ellsberg actually proposed two separate thought experiments, in both of which the proposed choices contradict subjective
expected utility. The 2-color problem involves bets on two urns, both of which contain balls of two different colors.
The 3-color problem, described below, involves bets on a single urn, which contains balls of three different colors.


The 1 urn paradox


Suppose you have an urn containing 30 red balls and 60 other balls that are either black or yellow. You don't know
how many black or how many yellow balls there are, but you know that the total number of black balls plus the total number of
yellow balls equals 60. The balls are well mixed so that each individual ball is as likely to be drawn as any other. You are
now given a choice between two gambles:

Gamble A: You receive $100 if you draw a red ball.
Gamble B: You receive $100 if you draw a black ball.

Also you are given the choice between these two gambles (about a different draw from the same urn):

Gamble C: You receive $100 if you draw a red or yellow ball.
Gamble D: You receive $100 if you draw a black or yellow ball.

This situation poses both Knightian uncertainty (how many of the non-red balls are yellow and how many are
black, which is not quantified) and probability (whether the ball is red or non-red, which is 1/3 vs. 2/3).

Utility theory interpretation


Utility theory models the choice by assuming that in choosing between these gambles, people assume a probability
that the non-red balls are yellow versus black, and then compute the expected utility of the two gambles.
Since the prizes are exactly the same, it follows that you will prefer Gamble A to Gamble B if and only if you
believe that drawing a red ball is more likely than drawing a black ball (according to expected utility theory). Also,
there would be no clear preference between the choices if you thought that a red ball was as likely as a black ball.
Similarly it follows that you will prefer Gamble C to Gamble D if, and only if, you believe that drawing a red or
yellow ball is more likely than drawing a black or yellow ball. It might seem intuitive that, if drawing a red ball is
more likely than drawing a black ball, then drawing a red or yellow ball is also more likely than drawing a black or
yellow ball. So, supposing you prefer Gamble A to Gamble B, it follows that you will also prefer Gamble C to
Gamble D. And, supposing instead that you prefer Gamble B to Gamble A, it follows that you will also prefer
Gamble D to Gamble C.
When surveyed, however, most people strictly prefer Gamble A to Gamble B and Gamble D to Gamble C.
Therefore, some assumptions of the expected utility theory are violated.

Mathematical demonstration
Mathematically, your estimated probabilities of each color ball can be represented as: R, Y, and B. If you strictly
prefer Gamble A to Gamble B, by utility theory, it is presumed this preference is reflected by the expected utilities of
the two gambles: specifically, it must be the case that

where

is your utility function. If

(you strictly prefer $100 to nothing), this simplifies to:

If you also strictly prefer Gamble D to Gamble C, the following inequality is similarly obtained:

This simplifies to:

Ellsberg paradox
This contradiction indicates that your preferences are inconsistent with expected-utility theory.
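The same contradiction can be verified by brute force. The sketch below is a minimal illustration (utilities are normalized to U($100) = 1 and U($0) = 0, as in the derivation above): it enumerates every possible black/yellow composition of the urn and confirms that none of them rationalizes both observed preferences.

```python
# Brute-force check of the 3-color Ellsberg urn (a minimal illustration;
# utilities are normalized to U($100) = 1 and U($0) = 0, as above).
# No belief about the black/yellow split rationalizes both preferences.

RED = 30
consistent = []
for black in range(61):          # every possible number of black balls
    yellow = 60 - black
    p = lambda n: n / 90         # chance of drawing one of n favorable balls
    eu_A, eu_B = p(RED), p(black)                 # win on red vs. on black
    eu_C, eu_D = p(RED + yellow), p(black + yellow)
    if eu_A > eu_B and eu_D > eu_C:               # the observed preferences
        consistent.append(black)

print(consistent)   # [] -- empty: A > B forces R > B, while D > C forces B > R
```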

Generality of the paradox


Note that the result holds regardless of your utility function. Indeed, the amount of the payoff is likewise irrelevant.
Whichever gamble you choose, the prize for winning it is the same, and the cost of losing it is the same (no cost), so
ultimately, there are only two outcomes: you receive a specific amount of money, or you receive nothing. Therefore
it is sufficient to assume that you prefer receiving some money to receiving nothing (and in fact, this assumption is
not necessary in the mathematical treatment above, it was assumed U($100) > U($0), but a contradiction can still
be obtained for U($100) < U($0) and for U($100) = U($0)).
In addition, the result holds regardless of your risk aversion. All the gambles involve risk. By choosing Gamble D,
you have a 1 in 3 chance of receiving nothing, and by choosing Gamble A, you have a 2 in 3 chance of receiving
nothing. If Gamble A was less risky than Gamble B, it would follow that Gamble C was less risky than Gamble D
(and vice versa), so, risk is not averted in this way.
However, because the exact chances of winning are known for Gambles A and D, and not known for Gambles B and
C, this can be taken as evidence for some sort of ambiguity aversion which cannot be accounted for in expected
utility theory. It has been demonstrated that this phenomenon occurs only when the choice set permits comparison of
the ambiguous proposition with a less vague proposition (but not when ambiguous propositions are evaluated in
isolation).

Possible explanations
There have been various attempts to provide decision-theoretic explanations of Ellsberg's observation. Since the
probabilistic information available to the decision-maker is incomplete, these attempts sometimes focus on
quantifying the non-probabilistic ambiguity which the decision-maker faces see Knightian uncertainty. That is,
these alternative approaches sometimes suppose that the agent formulates a subjective (though not necessarily
Bayesian) probability for possible outcomes.
One such attempt is based on info-gap decision theory. The agent is told precise probabilities of some outcomes,
though the practical meaning of the probability numbers is not entirely clear. For instance, in the gambles discussed
above, the probability of a red ball is 30/90, which is a precise number. Nonetheless, the agent may not distinguish,
intuitively, between this and, say, 30/91. No probability information whatsoever is provided regarding other
outcomes, so the agent has very unclear subjective impressions of these probabilities.
In light of the ambiguity in the probabilities of the outcomes, the agent is unable to evaluate a precise expected
utility. Consequently, a choice based on maximizing the expected utility is also impossible. The info-gap approach
supposes that the agent implicitly formulates info-gap models for the subjectively uncertain probabilities. The agent
then tries to satisfice the expected utility and to maximize the robustness against uncertainty in the imprecise
probabilities. This robust-satisficing approach can be developed explicitly to show that the choices of
decision-makers should display precisely the preference reversal which Ellsberg observed.
Another possible explanation is that this type of game triggers a deceit aversion mechanism. Many humans naturally
assume in real-world situations that if they are not told the probability of a certain event, it is to deceive them. People
make the same decisions in the experiment that they would about related but not identical real-life problems where
the experimenter would be likely to be a deceiver acting against the subject's interests. When faced with the choice
between a red ball and a black ball, the probability of 30/90 is compared to the lower part of the 0/90-60/90 range
(the probability of getting a black ball). The average person expects there to be fewer black balls than yellow balls
because in most real-world situations, it would be to the advantage of the experimenter to put fewer black balls in the
urn when offering such a gamble. On the other hand, when offered a choice between red and yellow balls and black
and yellow balls, people assume that there must be fewer than 30 yellow balls as would be necessary to deceive
them. When making the decision, it is quite possible that people simply forget to consider that the experimenter does
not have a chance to modify the contents of the urn in between the draws. In real-life situations, even if the urn is not
to be modified, people would be afraid of being deceived on that front as well.
A modification of utility theory to incorporate uncertainty as distinct from risk is Choquet expected utility, which
also proposes a solution to the paradox.

Alternative explanations
Other alternative explanations include the competence hypothesis and comparative ignorance hypothesis. These
theories attribute the source of the ambiguity aversion to the participant's pre-existing knowledge.

References
[1] EconPort discussion of the paradox (http://www.econport.org/econport/request?page=man_ru_experiments_ellsberg)

Anand, Paul (1993). Foundations of Rational Choice Under Risk. Oxford University Press. ISBN 0-19-823303-5.
Keynes, John Maynard (1921). A Treatise on Probability. London: Macmillan.
Schmeidler, D. (1989). "Subjective Probability and Expected Utility without Additivity". Econometrica 57 (3):
571–587. doi:10.2307/1911053 (http://dx.doi.org/10.2307/1911053). JSTOR 1911053 (http://www.jstor.org/stable/1911053).

Green paradox

The Green Paradox is a phrase coined by German economist Hans-Werner Sinn to describe the fact that an
environmental policy that becomes greener with the passage of time acts like an announced expropriation for the
owners of fossil fuel resources, inducing them to anticipate resource extraction and hence to accelerate global
warming.

Main line of reasoning


The Green Paradox's line of reasoning starts by recognizing a fundamental, unavoidable fact: every carbon atom in
the gas, coal or oil extracted from the ground to be used as fuel ends up in the atmosphere, in particular if high
efficiency combustion processes ensure that no part of it ends up as soot. About a quarter of the emitted carbon will
stay in the atmosphere practically forever, contributing to the greenhouse effect that causes global warming.[2]
Apart from afforestation, only two things can mitigate the accumulation of carbon in the atmosphere: either less
carbon is extracted from the ground, or it is injected back underground after harvesting its energy.
Environmental policy efforts, however, in particular European ones, go in neither of these two directions, aiming
instead at the promotion of alternative, CO2-free energy sources and a more efficient use of energy. In other words,
they only address the demand side of the carbon market, neglecting the supply side. Despite considerable investment,
the efforts to curtail demand have not reduced the aggregate amount of CO2 emitted globally, which continues to
increase unabated.[3]
The reason behind this, according to Sinn, is that green policies, by heralding a gradual tightening of policy over the
coming decades, exert a stronger downward pressure on future prices than on current ones, thus decreasing the rate
of capital appreciation of the fossil fuel deposits. The owners of these resources regard this development with
concern and react by increasing extraction volumes, converting the proceeds into investments in the capital markets,
which offer higher yields. That is the green paradox: environmental policy slated to become greener over time acts as
an announced expropriation that provokes owners to react by accelerating the rate of extraction of their fossil fuel
stocks,[4] thus accelerating climate change.
Countries that do not partake of the efforts to curb demand have a double advantage. They burn the carbon set free
by the green countries (leakage effect) and they also burn the additional carbon extracted as a reaction to the
announced and expected price cuts resulting from the gradual greening of environmental policies (green paradox).[5]
Sinn writes in his abstract that: "[Demand reduction strategies] simply depress the world price of carbon and induce
the environmental sinners to consume what the Kyoto countries have economized on. Even worse, if suppliers feel
threatened by a gradual greening of economic policies in the Kyoto countries that would damage their future prices,
they will extract their stocks more rapidly, thus accelerating global warming." [6]
Sinn emphasizes that a condition for the green paradox is that the resource be scarce in the sense that its price will
always be higher than the unit extraction and exploration costs combined. He points out that this condition is likely
to be satisfied as backstop technologies will at best offer a perfect substitute for electricity, but not for fossil fuels.
The prices of coal and crude oil are currently many times higher than the corresponding exploration and extraction
costs combined.
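The supply-side logic can be reduced to a stylized two-period extraction problem. The sketch below is only an illustration (the linear demand, interest rate, stock size and tax figures are assumed toy values, not taken from Sinn's papers): a resource owner splits a fixed stock between "today" and "tomorrow" to maximize present value, and an announced future tax that lowers tomorrow's net price tilts extraction toward today.

```python
# Two-period green-paradox sketch (all numbers are illustrative assumptions).
# Inverse demand each period: p = a - b*q; extraction costs zero; an
# announced future tax `tau` lowers tomorrow's net price to (a - tau) - b*q.

def extraction_today(tau, a=100.0, b=1.0, stock=60.0, r=0.05):
    # The owner maximizes (a - b*q1)*q1 + ((a - tau) - b*q2)*q2 / (1 + r)
    # subject to q1 + q2 = stock. The first-order condition
    #   a - 2*b*q1 = ((a - tau) - 2*b*(stock - q1)) / (1 + r)
    # solves to:
    return (r * a + tau + 2 * b * stock) / (2 * b * (2 + r))

for tau in (0.0, 10.0, 20.0):
    print(f"announced future tax {tau:4.1f} -> extraction today {extraction_today(tau):.2f}")
```

The stricter the announced future policy, the more is pumped today: under these toy numbers, current extraction rises from about 30.5 to about 35.4 units as the announced tax goes from 0 to 20.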

Practicable solutions
An effective climate policy must perforce focus on the hitherto neglected supply side of the carbon market in
addition to the demand side. The ways proposed as practicable by Sinn to do this include levying a withholding tax
on the capital gains on the financial investments of fossil fuel resource owners, or the establishment of a seamless
global emissions trading system that would effectively put a cap on worldwide fossil fuel consumption, thereby
achieving the desired reduction in carbon extraction rates.

Works on the subject


Hans-Werner Sinn's ideas on the green paradox have been presented in detail in a number of scientific articles,[7][8]
his 2007 Thünen Lecture at the annual meeting of the German Economic Association (Verein für Socialpolitik), his
2007 presidential address to the International Institute of Public Finance in Warwick, two working papers,[9][10] and
a German-language book, Das Grüne Paradoxon (2008).[11] They build on his earlier studies on supply reactions of
the owners of natural resources to announced price changes.[12]

Notes and references


[2] D. Archer, Fate of Fossil Fuel CO2 in Geologic Time, Journal of Geophysical Research 110, 2005, p. 511; D. Archer and V. Brovkin,
Millennial Atmospheric Lifetime of Anthropogenic CO2, Climate Change, mimeo, 2006; G. Hoos, R. Voss, K. Hasselmann, E. Meier-Reimer and F. Joos, A Nonlinear Impulse Response Model of the Coupled Carbon Cycle-Climate System (NICCS), Climate Dynamics 18,
2001, p. 189–202.
[3] International Energy Agency (IEA), IEA Database, CO2 Emissions from Fuel Combustion 2007. Accessible online at: www.sourceoecd.org
(http://www.sourceoecd.org); Netherlands Environmental Assessment Agency, Global CO2 Emissions: Increase Continued in 2007,
Bilthoven, June 13, 2008. Accessible online at: (http://www.mnp.nl/en/publications/2008/GlobalCO2emissionsthrough2007.html)
[4] N.V. Long, Resource Extraction under the Uncertainty about Possible Nationalization, Journal of Economic Theory 10, 1975, p. 42–53. K.
A. Konrad, T. E. Olson and R. Schöb, Resource Extraction and the Threat of Possible Expropriation: The Role of Swiss Bank Accounts,
Journal of Environmental Economics and Management 26, 1994, p. 149–162.
[5] S. Felder and T. F. Rutherford, Unilateral CO2 Reductions and Carbon Leakage: The Consequences of International Trade in Oil and Basic
Materials, Journal of Environmental Economics and Management 25, 1993, p. 162–176, and J.-M. Burniaux and J. Oliveira Martins, Carbon
Emission Leakages: A General Equilibrium View, OECD Working Paper No. 242, 2000.
[6] Sinn, H.W. (2008). Public policies against global warming, International Tax and Public Finance, 15, 4, 360–394. Accessible online at:
(http://www.cesifo-group.de/portal/page/portal/ifoContent/N/rts/rts-mitarbeiter/IFOMITARBSINNCV/CVSinnPDF/CVSinnPDFrefjournals2007/ITAX-hws-2008.pdf)
[7] Public Policies against Global Warming: A Supply Side Approach, International Tax and Public Finance 15, 2008, p. 360–394.
[8] H.-W. Sinn, Das grüne Paradoxon: Warum man das Angebot bei der Klimapolitik nicht vergessen darf, Perspektiven der Wirtschaftspolitik
9, 2008, p. 109–142.
[9] H.-W. Sinn, Public Policies against Global Warming, CESifo Working Paper No. 2087 (http://www.cesifo-group.de/portal/page/portal/ifoHome/b-publ/b3publwp/_wp_abstract?p_file_id=14563), August 2007
[10] H.-W. Sinn, Pareto Optimality in the Extraction of Fossil Fuels and the Greenhouse Effect: A Note, CESifo Working Paper No. 2083 (http://www.cesifo-group.de/portal/page/portal/ifoHome/b-publ/b3publwp/_wp_abstract?p_file_id=14562), August 2007
[11] Das grüne Paradoxon - Plädoyer für eine illusionsfreie Klimapolitik (http://www.cesifo-group.de/link/_publsinnparadoxon), Econ:
Berlin, 2008, 480 pages.
[12] H.-W. Sinn, Absatzsteuern, Ölförderung und das Allmendeproblem (Sales Taxes, Oil Extraction and the Common Pool Problem) (http://www.cesifo-group.de/link/Sinn_Abs_Oel_Allmend_1982.pdf), in: H. Siebert, ed., Reaktionen auf Energiepreisänderungen, Lang:
Frankfurt and Bern 1982, pp. 83–103; N.V. Long and H.-W. Sinn, Surprise Price Shifts, Tax Changes and the Supply Behaviour of Resource
Extracting Firms (http://www.cesifo-group.de/link/Sinn_Surpr_Price_Shift_AEP_1985.pdf), Australian Economic Papers 24, 1985, pp.
278–289.

Icarus paradox
The Icarus paradox is a neologism coined by Danny Miller, and popularized by his 1990 book by the same name,
for the observed phenomenon of businesses that fail abruptly after a period of apparent success. In a 1992 article,
Miller noted that some businesses bring about their own downfall through their own successes, whether through
over-confidence, exaggeration, or complacency. The term refers to Icarus of Greek mythology, who flew too close to the Sun
and melted his own wings. The book is a key source of insight in Escaping the Progress Trap by Daniel O'Leary.


Jevons paradox
In economics, the Jevons paradox (/ˈdʒɛvənz/; sometimes Jevons effect) is the proposition that as technology
progresses, the increase in efficiency with which a resource is used tends to increase (rather than decrease) the rate of
consumption of that resource. In 1865, the English economist William Stanley Jevons observed that technological
improvements that increased the efficiency of coal-use led to the increased consumption of coal in a wide range of
industries. He argued that, contrary to common intuition, technological improvements could not be relied upon to
reduce fuel consumption.

[Figure: Coal-burning factories in 19th-century Manchester, England. Improved technology allowed coal to fuel the
Industrial Revolution, greatly increasing the consumption of coal.]

The issue has been re-examined by modern economists studying consumption rebound effects from improved energy
efficiency. In addition to reducing the amount needed for a given use, improved efficiency lowers the relative cost of
using a resource, which tends to increase the quantity of the resource demanded, potentially counteracting any
savings from increased efficiency. Additionally, increased efficiency accelerates economic growth, further increasing
the demand for resources. The Jevons paradox occurs when the effect from increased demand predominates, causing
resource use to increase.

The Jevons paradox has been used to argue that energy conservation may be futile, as increased efficiency may
increase fuel use. Nevertheless, increased efficiency can improve material living standards. Further, fuel use declines
if increased efficiency is coupled with a green tax or other conservation policies that keep the cost of use the same
(or higher). As the Jevons paradox applies only to technological improvements that increase fuel efficiency, policies
that impose conservation standards and increase costs do not display the paradox.


History
The Jevons paradox was first described by the English economist William Stanley Jevons in his 1865 book The
Coal Question. Jevons observed that England's consumption of coal soared after James Watt introduced his coal-fired
steam engine, which greatly improved the efficiency of Thomas Newcomen's earlier design. Watt's innovations made
coal a more cost-effective power source, leading to the increased use of the steam engine in a wide range of industries.
This in turn increased total coal consumption, even as the amount of coal required for any particular application fell.
Jevons argued that improvements in fuel efficiency tend to increase, rather than decrease, fuel use: "It is a confusion
of ideas to suppose that the economical use of fuel is equivalent to diminished consumption. The very contrary is
the truth."

[Portrait: William Stanley Jevons]

At that time many in Britain worried that coal reserves were rapidly dwindling, but some experts opined that
improving technology would reduce coal consumption. Jevons argued that this view was incorrect, as further
increases in efficiency would tend to increase the use of coal. Hence, improving technology would tend to increase,
rather than reduce, the rate at which England's coal deposits were being depleted.

Cause
Rebound effect
Main article: Rebound effect (conservation)

One way to understand the Jevons paradox is to observe that an increase in the efficiency with which a resource
(e.g., fuel) is used causes a decrease in the price of that resource when measured in terms of what it can achieve
(e.g., work). Generally speaking, a decrease in the price of a good or service will increase the quantity demanded
(see supply and demand, demand curve). Thus with a lower price for work, more work will be "purchased"
(indirectly, by buying more fuel). The resulting increase in the demand for fuel is known as the rebound effect. This
increase in demand may or may not be large enough to offset the original drop in demand from the increased
efficiency. The Jevons paradox occurs when the rebound effect is greater than 100%, exceeding the original
efficiency gains. This effect has been called "backfire".

[Figure: Elastic demand for work: a doubling of fuel efficiency more than doubles the work demanded, increasing
the amount of fuel used. The Jevons paradox occurs.]

Consider a simple case: a perfectly competitive market where fuel is the sole input used, and the only determinant of
the cost of work. If the price of fuel remains constant but the efficiency of its conversion into work is doubled, the
effective price of work is halved and twice as much work can be purchased for the same amount of money. If the
amount of work purchased more than doubles (i.e., demand for work is elastic, the price elasticity is greater than 1),
then the quantity of fuel used would increase, not decrease. If, however, the demand for work is inelastic (price
elasticity is less than 1), the amount of work purchased would less than double, and the quantity of fuel used would
decrease.

[Figure: Inelastic demand for work: a doubling of fuel efficiency does not double the work demanded; the amount of
fuel used decreases. The Jevons paradox does not occur.]

A full analysis would also have to take into account the fact that products (work) use more than one type of input
(e.g., fuel, labour, machinery), and that other factors besides input cost (e.g., a non-competitive market structure)
may also affect the price of work. These factors would tend to decrease the effect of fuel efficiency on the price of
work, and hence reduce the rebound effect, making the Jevons paradox less likely to occur. Additionally, any change
in the demand for fuel would have an effect on the price of fuel, and also on the effective price of work.
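The dividing line at unit elasticity can be checked directly with a constant-elasticity demand for work. A minimal sketch (the demand scale, fuel price and the two elasticities are assumed toy values): work costs p_fuel/efficiency per unit, and the fuel burned is the work demanded divided by efficiency.

```python
# Rebound-effect sketch with constant-elasticity demand for work
# (the parameters A, p_fuel and the elasticities are toy assumptions).

def fuel_use(efficiency, eps, A=100.0, p_fuel=1.0):
    price_of_work = p_fuel / efficiency          # fuel cost per unit of work
    work = A * price_of_work ** (-eps)           # constant-elasticity demand
    return work / efficiency                     # fuel burned to supply it

for eps in (0.5, 1.5):          # inelastic vs. elastic demand for work
    before, after = fuel_use(1.0, eps), fuel_use(2.0, eps)
    verdict = "Jevons paradox" if after > before else "fuel use falls"
    print(f"elasticity {eps}: fuel {before:.0f} -> {after:.0f} ({verdict})")
```

Doubling efficiency here is exactly the halving of the price of work described in the text: fuel use falls when demand for work is inelastic and rises ("backfire") when it is elastic.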

Khazzoom–Brookes postulate
Main article: Khazzoom–Brookes postulate
In the 1980s, economists Daniel Khazzoom and Leonard Brookes revisited the Jevons paradox in the case of a
society's energy use. Brookes, then chief economist at the UK Atomic Energy Authority, argued that attempts to
reduce energy consumption by increasing energy efficiency would simply raise demand for energy in the economy
as a whole. Khazzoom focused on the narrower point that the potential for rebound was ignored in mandatory
performance standards for domestic appliances being set by the California Energy Commission.
In 1992, the economist Harry Saunders dubbed the hypothesis that improvements in energy efficiency work to
increase, rather than decrease, energy consumption the Khazzoom–Brookes postulate. Saunders showed that the
Khazzoom–Brookes postulate was consistent with neo-classical growth theory (the mainstream economic theory of
capital accumulation, technological progress and long-run economic growth) under a wide range of assumptions.[1]
According to Saunders, increased energy efficiency tends to increase energy consumption by two means. First,
increased energy efficiency makes the use of energy relatively cheaper, thus encouraging increased use (the direct
rebound effect). Second, increased energy efficiency leads to increased economic growth, which pulls up energy use
for the whole economy. At the microeconomic level (looking at an individual market), even with the rebound effect,
improvements in energy efficiency usually result in reduced energy consumption. That is, the rebound effect is
usually less than 100 percent. However, at the macroeconomic level, more efficient (and hence comparatively
cheaper) energy leads to faster economic growth, which in turn increases energy use throughout the economy.
Saunders concludes that, taking into account both microeconomic and macroeconomic effects, technological
progress that improves energy efficiency will tend to increase overall energy use.


Energy conservation policy


Jevons warned that fuel efficiency gains tend to increase fuel use, but this does not imply that increased fuel
efficiency is worthless. Increased fuel efficiency enables greater production and a higher quality of material life. For
example, a more efficient steam engine allowed the cheaper transport of goods and people that contributed to the
Industrial Revolution. However, if the Khazzoom–Brookes postulate is correct, increased fuel efficiency will not
reduce the rate of depletion of fossil fuels.
The Jevons paradox is sometimes used to argue that energy conservation efforts are futile, for example, that more
efficient use of oil will lead to increased demand, and will not slow the arrival or the effects of peak oil. This
argument is usually presented as a reason not to impose environmental policies, or to increase fuel efficiency (e.g. if
cars are more efficient, it will simply lead to more driving). Several points have been raised against this argument.
First, in the context of a mature market such as for oil in developed countries, the direct rebound effect is usually
small, and so increased fuel efficiency usually reduces resource use, other conditions remaining constant. Second,
even if increased efficiency does not reduce the total amount of fuel used, there remain other benefits associated with
improved efficiency. For example, increased fuel efficiency may mitigate the price increases, shortages and
disruptions in the global economy associated with peak oil. Third, environmental economists have pointed out that
fuel use will unambiguously decrease if increased efficiency is coupled with an intervention (e.g. a green tax) that
keeps the cost of fuel use the same or higher.
The Jevons paradox indicates that increased efficiency by itself is unlikely to reduce fuel use, and that sustainable
energy policy must rely on other types of government interventions. As the Jevons paradox applies only to
technological improvements that increase fuel efficiency, the imposition of conservation standards that
simultaneously increase costs does not cause an increase in fuel use. To ensure that efficiency-enhancing
technological improvements reduce fuel use, efficiency gains must be paired with government intervention that
reduces demand (e.g., green taxes, a cap and trade programme, or higher fuel taxes). The ecological economists
Mathis Wackernagel and William Rees have suggested that any cost savings from efficiency gains be "taxed away or
otherwise removed from further economic circulation. Preferably they should be captured for reinvestment in natural
capital rehabilitation." By mitigating the economic effects of government interventions designed to promote
ecologically sustainable activities, efficiency-improving technological progress may make the imposition of these
interventions more palatable, and more likely to be implemented.

References
[1] Saunders, Harry D., "The Khazzoom–Brookes postulate and neoclassical growth." (http://www.jstor.org/discover/10.2307/41322471?uid=3738776&uid=2&uid=4&sid=21101524317221) The Energy Journal, October 1, 1992.

Further reading
Jevons, William Stanley (1866). The Coal Question (http://www.econlib.org/library/YPDBooks/Jevons/
jvnCQ.html) (2nd ed.). London: Macmillan and Co.
Lords Select Committee on Science and Technology (5 July 2005). "3: The economics of energy efficiency"
(http://www.publications.parliament.uk/pa/ld200506/ldselect/ldsctech/21/2106.htm). Select Committee on
Science and Technology Second Report. Session 2005-06. House of Lords.
Herring, Horace (19 July 1999). "Does energy efficiency save energy? The debate and its consequences". Applied
Energy 63 (3): 209–226. doi:10.1016/S0306-2619(99)00030-6 (http://dx.doi.org/10.1016/S0306-2619(99)00030-6). ISSN 0306-2619 (http://www.worldcat.org/issn/0306-2619).
Owen, David (December 20, 2010). "Annals of Environmentalism: The Efficiency Dilemma" (http://www.newyorker.com/reporting/2010/12/20/101220fa_fact_owen). The New Yorker. p. 78.

External links
Rocky Mountain Institute (May 1, 2008). "Beating the Energy Efficiency Paradox (Part I)" (http://www.
treehugger.com/files/2008/05/beating-energy-efficiency-paradox.php). TreeHugger.



Leontief paradox
Leontief's paradox in economics is that the country with the world's highest capital per worker has a lower
capital/labor ratio in exports than in imports.
This econometric finding was the result of Wassily W. Leontief's attempt to test the Heckscher–Ohlin theory
empirically. In 1954, Leontief found that the United States, the most capital-abundant country in the
world, exported labor-intensive commodities and imported capital-intensive commodities, in contradiction with the
Heckscher–Ohlin theory ("H–O theory").
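The test itself boils down to a single ratio computed from input-output tables: the capital/labor bundle embodied in a representative million dollars of exports versus import replacements. A minimal sketch of the computation (the four input figures are invented placeholders, not Leontief's 1947 data):

```python
# Leontief-statistic sketch: capital and labor embodied in $1M of exports
# versus $1M of import-competing output. The four numbers are invented
# placeholders purely to show how the statistic is formed.

k_exports, l_exports = 2.55, 1.82   # capital ($M) and labor (person-years)
k_imports, l_imports = 3.09, 1.70

statistic = (k_exports / l_exports) / (k_imports / l_imports)
print(f"(K/L exports) / (K/L imports) = {statistic:.2f}")
# A value below 1 says exports are LESS capital-intensive than imports:
# for a capital-abundant country, that is the paradox.
```

Leontief's own computation produced a ratio below one for the United States; the measurements below track how later studies revised it.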

Measurements
In 1971 Robert Baldwin showed that U.S. imports were 27% more capital-intensive than U.S. exports in the 1962
trade data, using a measure similar to Leontief's.
In 1980 Edward Leamer questioned Leontief's original methodology on real exchange rate grounds, but
acknowledged that the U.S. paradox still appears in the data (for years other than 1947).
A 1999 survey of the econometric literature by Elhanan Helpman concluded that the paradox persists, but some
studies in non-US trade were instead consistent with the HO theory.
In 2005 Kwok & Yu used an updated methodology to argue for a lower or zero paradox in U.S. trade statistics,
though the paradox is still derived in other developed nations.

Responses to the paradox


For many economists, Leontief's paradox undermined the validity of the Heckscher–Ohlin (H–O) theory,
which predicted that trade patterns would be based on countries' comparative advantage in certain factors of
production (such as capital and labor). Many economists have dismissed the H–O theory in favor of a more Ricardian
model where technological differences determine comparative advantage. These economists argue that the United
States has an advantage in highly skilled labor more so than capital. This can be seen as viewing "capital" more
broadly, to include human capital. Using this definition, the exports of the United States are very (human)
capital-intensive, and not particularly intensive in (unskilled) labor.
Some explanations for the paradox dismiss the importance of comparative advantage as a determinant of trade. For
instance, the Linder hypothesis states that demand plays a more important role than comparative advantage as a
determinant of trade, with the hypothesis that countries which share similar demands will be more likely to trade.
For instance, both the United States and Germany are developed countries with a significant demand for cars, so
both have large automotive industries. Rather than one country dominating the industry with a comparative
advantage, both countries trade different brands of cars between them. Similarly, New Trade Theory argues that
comparative advantages can develop separately from factor endowment variation (e.g., in industrial increasing
returns to scale).


Lucas paradox
In economics, the Lucas paradox or the Lucas puzzle is the observation that capital does not flow from developed
countries to developing countries despite the fact that developing countries have lower levels of capital per worker.
Classical economic theory predicts that capital should flow from rich countries to poor countries, due to the effect of
diminishing returns of capital. Poor countries have lower levels of capital per worker which explains, in part, why
they are poor. In poor countries, the scarcity of capital relative to labor should mean that the returns related to the
infusion of capital are higher than in developed countries. In response, savers in rich countries should look at poor
countries as profitable places in which to invest. In reality, things do not seem to work that way. Surprisingly little
capital flows from rich countries to poor countries. This puzzle, famously discussed in a paper by Robert Lucas in
1990, is often referred to as the "Lucas Paradox."
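The size of the puzzle is easiest to see in the back-of-the-envelope calculation associated with Lucas's paper: under a common Cobb-Douglas technology y = A·k^α, observed output gaps imply enormous return gaps. The sketch below reproduces the flavor of that computation; the round numbers (capital share 0.4, a 15-fold output-per-worker gap, identical A in both countries) are illustrative assumptions, not Lucas's exact data.

```python
# Back-of-the-envelope Lucas-style calculation (illustrative assumptions:
# capital share alpha = 0.4, poor-country output per worker one fifteenth
# of the rich country's, identical technology A in both countries).

alpha = 0.4
y_ratio = 1 / 15        # poor / rich output per worker

# With y = A * k**alpha and common A: k = (y/A)**(1/alpha), so the
# marginal product of capital MPK = alpha * A * k**(alpha - 1) scales
# as y ** ((alpha - 1) / alpha).
mpk_ratio = y_ratio ** ((alpha - 1) / alpha)

print(f"implied return to capital: {mpk_ratio:.0f}x higher in the poor country")
```

Classical theory predicts capital should chase a return premium of that size; the heart of the paradox is that it does not, and the explanations grouped below are attempts to identify which assumption in this calculation fails.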
The theoretical explanations for the Lucas Paradox can be grouped into two categories.
1. The first group attributes the limited amount of capital received by poorer nations to differences in fundamentals
that affect the production structure of the economy, such as technological differences, missing factors of
production, government policies, and the institutional structure.
2. The second group of explanations focuses on international capital market imperfections, mainly sovereign risk
(risk of nationalization) and asymmetric information. Although the expected return on investment might be high
in many developing countries, it does not flow there because of the high level of uncertainty associated with those
expected returns.

Examples of the Lucas paradox: 20th century development of Third World nations
Lucas's seminal paper was a reaction to observed trends in international development efforts during the 20th century.
Regions characterized by poverty, such as India, China and Africa, have received particular attention with regard to
the underinvestment predicted by Lucas. African nations, with their impoverished populace and rich natural
resources, have been upheld as exemplifying the type of nations that would, under neoclassical assumptions, be able
to offer extremely high returns to capital. The meager foreign capital African nations receive outside of the charity of
multinational corporations reveals the extent to which Lucas captured the realities of today's global capital flows.
Authors more recently have focused their explanations for the paradox on Lucas's first category of explanation, the
difference in fundamentals of the production structure. Some have pointed to the quality of institutions as the key
determinant of capital inflows to poorer nations. As evidence for the central role played by institutional stability, it
has been shown that the amount of foreign direct investment a country receives is highly correlated to the strength of
infrastructure and the stability of government in that country.

Counterexample of the Lucas paradox: American economic development

Although Lucas' original hypothesis has widely been accepted as descriptive of the modern period in history, the
paradox does not emerge as clearly before the 20th century. The colonial era, for instance, stands out as an age of
unimpeded capital flows. The system of imperialism produced economic conditions particularly amenable to the
movement of capital according to the assumptions of classical economics. Britain, for instance, was able to design,
impose, and control the quality of institutions in its colonies to capitalize on the high returns to capital in the New
World.
Jeffrey Williamson has explored in depth this reversal of the Lucas Paradox in the colonial context. Although not
emphasized by Lucas himself, Williamson maintains that unimpeded labor migration is one way that capital flows to
the citizens of developing nations. The empire structure was particularly important for facilitating low-cost
international migration, allowing wage rates to converge across the regions of the British Empire. For instance, in
the 17th and 18th centuries, England incentivized its citizens to move to labor-scarce America, endorsing a system of
indentured servitude to make overseas migration affordable.
While Britain enabled free capital flow from the Old World to the New, the success of the American enterprise after
the American Revolution is a good example of the role of institutional and legal frameworks in facilitating a
continued flow of capital. The American Constitution's commitment to private property rights, rights of personal
liberty, and strong contract law enabled investment from Britain to America to continue even without the incentives
of the colonial relationship. In these ways, early American economic development, both pre- and post-revolution,
provides a case study for the conditions under which the Lucas Paradox is reversed. Even after the average income
level in America exceeded that of Britain, the institutions exported under imperialism and the legal frameworks
established after independence enabled long-term capital flows from Europe to America.

Metzler paradox
In economics, the Metzler paradox (named after the American economist Lloyd Metzler) is the theoretical
possibility that the imposition of a tariff on imports may reduce the relative internal price of that good. It was
proposed by Lloyd Metzler in 1949 upon examination of tariffs within the HeckscherOhlin model. The paradox has
roughly the same status as immiserizing growth and a transfer that makes the recipient worse off.[1]
The strange result could occur if the exporting country's offer curve is very inelastic. In this case, the tariff improves
the tariff-imposing country's terms of trade so strongly that the fall in the duty-free (world) price of the import
exceeds the amount of the tariff, lowering the good's relative internal price. Such a tariff would not protect the
industry competing with the imported goods.
It is deemed to be unlikely in practice.[2]

References
[1] Krugman and Obstfeld (2003), p. 112
[2] Krugman and Obstfeld (2003), p. 113

Further reading
Krugman, Paul R.; Obstfeld, Maurice (2003). "Chapter 5: The Standard Trade Model". International Economics:
Theory and Policy (6th ed.). Boston: Addison-Wesley. p. 112. ISBN 0-321-11639-9.

Paradox of thrift
The paradox of thrift (or paradox of saving) is a paradox of economics, popularized by John Maynard Keynes,
though it had been stated as early as 1714 in The Fable of the Bees,[1] and similar sentiments date to antiquity.[2] The
paradox states that if everyone tries to save more money during times of economic recession, then aggregate demand
will fall and will in turn lower total savings in the population because of the decrease in consumption and economic
growth. The paradox is, narrowly speaking, that total savings may fall even when individual savings attempt to rise,
and, broadly speaking, that an increase in savings may be harmful to an economy.[3] Both the narrow and broad claims
are paradoxical within the assumption underlying the fallacy of composition, namely that what is true of the parts
must be true of the whole. The narrow claim transparently contradicts this assumption, and the broad one does so by
implication, because while individual thrift is generally averred to be good for the economy, the paradox of thrift
holds that collective thrift may be bad for the economy.
The paradox of thrift is a central component of Keynesian economics, and has formed part of mainstream economics
since the late 1940s, though it is criticized on a number of grounds.

Overview
The argument is that, in equilibrium, total income (and thus demand) must equal total output, and that total
investment must equal total saving. Assuming that saving rises faster as a function of income than the relationship
between investment and output, then an increase in the marginal propensity to save, other things being equal, will
move the equilibrium point at which income equals output and investment equals savings to lower values.
In this form it represents a prisoner's dilemma as saving is beneficial to each individual but deleterious to the general
population. This is a "paradox" because it runs contrary to intuition. Someone unaware of the paradox of thrift would
fall into a fallacy of composition and assume that what seems to be good for an individual within the economy will
be good for the entire population. However, exercising thrift may be good for an individual by enabling that
individual to save for a "rainy day", and yet not be good for the economy as a whole.
This paradox can be explained by analyzing the place, and impact, of increased savings in an economy. If a
population saves more money (that is the marginal propensity to save increases across all income levels), then total
revenues for companies will decline. This decrease in economic growth means fewer salary increases and perhaps
downsizing. Eventually the population's total savings will have remained the same or even declined because of lower
incomes and a weaker economy. This paradox is based on the proposition, put forth in Keynesian economics, that
many economic downturns are demand-based. Hypothetically, if everyone saves their money, total savings will rise,
but there is a tendency for macroeconomic conditions to deteriorate.
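To see the mechanism in miniature, consider a minimal Keynesian-cross sketch (the function and numbers below are hypothetical, chosen only for illustration): households consume C = a + (1 - s)Y, investment I is fixed, and equilibrium income solves Y = C + I, giving Y = (a + I)/s. Raising the saving rate s lowers equilibrium income while realized saving stays pinned at I.

    # Minimal Keynesian-cross sketch of the paradox of thrift (hypothetical).
    # Consumption C = a + (1 - s) * Y with fixed investment I, so the
    # equilibrium condition Y = C + I gives Y = (a + I) / s.
    def equilibrium_income(a, s, I):
        return (a + I) / s

    a, I = 100.0, 50.0                # autonomous consumption, fixed investment
    for s in (0.10, 0.15, 0.20):      # rising marginal propensity to save
        Y = equilibrium_income(a, s, I)
        S = s * Y - a                 # realized saving = Y - C
        print(f"s = {s:.2f}: income Y = {Y:7.1f}, realized saving S = {S:.1f}")
    # Income falls as s rises while realized saving stays equal to I: the
    # collective attempt to save more fails to raise total saving.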

History
While the paradox of thrift was popularized by Keynes, and is often attributed to him, it was stated by a number of
others prior to Keynes, and the proposition that spending may help and saving may hurt an economy dates to
antiquity; similar sentiments occur in the Bible verse:
There is that scattereth, and yet increaseth; and there is that withholdeth more than is meet, but it tendeth to
poverty.
Proverbs 11:24
which has found occasional use as an epigram in underconsumptionist writings.[4][5][6]
Keynes himself notes the appearance of the paradox in The Fable of the Bees: or, Private Vices, Publick Benefits
(1714) by Bernard Mandeville, the title itself hinting at the paradox, and Keynes citing the passage:

As this prudent economy, which some people call Saving, is in private families the most certain method to
increase an estate, so some imagine that, whether a country be barren or fruitful, the same method if generally
pursued (which they think practicable) will have the same effect upon a whole nation, and that, for example,
the English might be much richer than they are, if they would be as frugal as some of their neighbours. This, I
think, is an error.
Keynes suggests Adam Smith was referring to this passage when he wrote "What is prudence in the conduct of every
private family can scarce be folly in that of a great Kingdom."
The problem of underconsumption and oversaving, as they saw it, was developed by underconsumptionist
economists of the 19th century, and the paradox of thrift in the strict sense that "collective attempts to save yield
lower overall savings" was explicitly stated by John M. Robertson in his 1892 book The Fallacy of Saving, writing:
Had the whole population been alike bent on saving, the total saved would positively have been much less,
inasmuch as (other tendencies remaining the same) industrial paralysis would have been reached sooner or
oftener, profits would be less, interest much lower, and earnings smaller and more precarious. This ... is no idle
paradox, but the strictest economic truth.
John M. Robertson, The Fallacy of Saving, pp. 131-132
Similar ideas were forwarded by William Trufant Foster and Waddill Catchings in the 1920s in The Dilemma of
Thrift.
Keynes distinguished between business activity/investment ("Enterprise") and savings ("Thrift") in his Treatise on
Money (1930):
...mere abstinence is not enough by itself to build cities or drain fens. ... If Enterprise is afoot, wealth
accumulates whatever may be happening to Thrift; and if Enterprise is asleep, wealth decays whatever
Thrift may be doing. Thus, Thrift may be the handmaiden of Enterprise. But equally she may not. And,
perhaps, even usually she is not.
and stated the paradox of thrift in The General Theory, 1936:
For although the amount of his own saving is unlikely to have any significant influence on his own income,
the reactions of the amount of his consumption on the incomes of others makes it impossible for all individuals
simultaneously to save any given sums. Every such attempt to save more by reducing consumption will so
affect incomes that the attempt necessarily defeats itself. It is, of course, just as impossible for the community
as a whole to save less than the amount of current investment, since the attempt to do so will necessarily raise
incomes to a level at which the sums which individuals choose to save add up to a figure exactly equal to the
amount of investment.
John Maynard Keynes, The General Theory of Employment, Interest and Money, Chapter 7, p. 84
The theory is referred to as the "paradox of thrift" in Samuelson's influential Economics of 1948, which popularized
the term.


Related concepts
The paradox of thrift has been related to the debt deflation theory of economic crises, being called "the paradox of
debt":[7] people save not to increase savings, but rather to pay down debt. As well, a paradox of toil and a paradox
of flexibility have been proposed: a willingness to work more in a liquidity trap and wage flexibility after a debt
deflation shock may lead not only to lower wages, but lower employment.
During April 2009, U.S. Federal Reserve Vice Chair Janet Yellen discussed the "paradox of deleveraging" described
by economist Hyman Minsky: "Once this massive credit crunch hit, it didn't take long before we were in a recession.
The recession, in turn, deepened the credit crunch as demand and employment fell, and credit losses of financial
institutions surged. Indeed, we have been in the grips of precisely this adverse feedback loop for more than a year. A
process of balance sheet deleveraging has spread to nearly every corner of the economy. Consumers are pulling back
on purchases, especially on durable goods, to build their savings. Businesses are cancelling planned investments and
laying off workers to preserve cash. And, financial institutions are shrinking assets to bolster capital and improve
their chances of weathering the current storm. Once again, Minsky understood this dynamic. He spoke of the
paradox of deleveraging, in which precautions that may be smart for individuals and firms, and indeed essential to
return the economy to a normal state, nevertheless magnify the distress of the economy as a whole."[8]

Criticisms
Within mainstream economics, non-Keynesian economists, particularly neoclassical economists, criticize this theory
on three principal grounds.
The first criticism is that, following Say's law and the related circle of ideas, if demand slackens, prices will fall
(barring government intervention), and the resulting lower price will stimulate demand (though at lower profit or
cost possibly even lower wages). This criticism in turn has been questioned by Keynesian economists, who reject
Say's law and instead point to evidence of sticky prices as a reason why prices do not fall in recession; this remains a
debated point.
The second criticism is that savings represent loanable funds, particularly at banks, assuming the savings are held at
banks, rather than currency itself being held ("stashed under one's mattress"). Thus an accumulation of savings yields
an increase in potential lending, which will lower interest rates and stimulate borrowing. So a decline in consumer
spending is offset by an increase in lending, and subsequent investment and spending.
Two caveats are added to this criticism. Firstly, if savings are held as cash, rather than being loaned out (directly by
savers, or indirectly, as via bank deposits), then loanable funds do not increase, and thus a recession may be caused;
but this is due to holding cash, not to saving per se.[9] Secondly, banks themselves may hold cash, rather than
loaning it out, which results in the growth of excess reserves: funds on deposit but not loaned out. This is argued to
occur in liquidity trap situations, when interest rates are at a zero lower bound (or near it) and savings still exceed
investment demand. Within Keynesian economics, the desire to hold currency rather than loan it out is discussed
under liquidity preference.
Third, the paradox assumes a closed economy in which savings are not invested abroad (to fund exports of local
production abroad). Thus, while the paradox may hold at the global level, it need not hold at the local or national
level: if one nation increases savings, this can be offset by trading partners consuming a greater amount relative to
their own production, i.e., if the saving nation increases exports, and its partners increase imports. This criticism is
not very controversial, and is generally accepted by Keynesian economists as well,[10] who refer to it as "exporting
one's way out of a recession". They further note that this frequently occurs in concert with currency devaluation[11]
(hence increasing exports and decreasing imports), and cannot work as a solution to a global problem, because the
global economy is a closed system: not every nation can increase net exports.


Austrian School criticism


The paradox was criticized by the Austrian School economist Friedrich Hayek in a 1929 article, "The 'Paradox' of
Savings", questioning the paradox as proposed by Foster and Catchings.[12] Hayek and later Austrian School
economists agree that if a population saves more money, total revenues for companies will decline, but they deny
the assertion that lower revenues lead to lower economic growth, understanding that the additional savings are used
to create more capital to increase production. Once the new, more productive structure of capital has been organized
within the current structure, the real costs of production are reduced for most firms.

References and sources


References
[1] Keynes, The General Theory of Employment, Interest and Money, Chapter 23. Notes on Mercantilism, the Usury Laws, Stamped Money
and Theories of Under-consumption (http://www.marxists.org/reference/subject/economics/keynes/general-theory/ch23.htm)
[2] See history section for further discussion.
[3] These two formulations are given in Campbell R. McConnell (1960: 261-62), emphasis added: "By attempting to increase its rate of saving,
society may create conditions under which the amount it can actually save is reduced. This phenomenon is called the paradox of
thrift....[T]hrift, which has always been held in high esteem in our economy, now becomes something of a social vice."
[4] English, Irish and Subversives Among the Dismal Scientists, Noel Thompson, Nigel Allington, 2010, p. 122 (http://books.google.com/
books?id=4fjwxnH8VPcC&pg=PA122&dq="scattereth,+and+yet+increaseth"):
"A suggestion that a more equal distribution of income might be a remedy for general stagnation and that excess saving can be harmful is
implicit in the quotation from the Old Testament on the Reply to Mr. Say [by John Cazenove (1788-1879)]."
[5] A Reply to Mr. Say's Letters to Mr. Malthus, by John Cazenove, uses the verse as an epigram.
[6] Studies in Economics, William Smart, 1895, p. 249 (http://books.google.com/books?id=YwUPAAAAQAAJ&pg=PA94)
[7] Paradox of thrift (http://krugman.blogs.nytimes.com/2009/02/03/paradox-of-thrift/), Paul Krugman
[8] Federal Reserve-Janet Yellen-A Minsky Meltdown-April 2009 (http://www.frbsf.org/news/speeches/2009/0416.html)
[9] See sections 9.9 and 9.11, http://www.auburn.edu/~garriro/cbm.htm
[10] The paradox of thrift for real (http://krugman.blogs.nytimes.com/2009/07/07/the-paradox-of-thrift-for-real/), Paul Krugman, July 7,
2009
[11] Devaluing History (http://krugman.blogs.nytimes.com/2010/11/24/devaluing-history/), Paul Krugman, November 24, 2010
[12] Hayek on the Paradox of Saving (http://mises.org/story/2804)

Sources
Samuelson, Paul & Nordhaus, William (2005). Economics (18th ed.). New York: McGraw-Hill.
ISBN 0-07-123932-4.

External links
The paradox of thrift explained (http://ingrimayne.saintjoe.edu/econ/Keynes/Paradox.html)

Criticisms
The Paradox of Thrift: RIP (http://www.cato.org/pubs/journal/cj16n1-7.html), by Clifford F. Thies, The Cato
Journal, Volume 16, Number 1
Consumers don't cause recessions (http://mises.org/story/3194) by Robert P. Murphy (an Austrian School
critique of the paradox of thrift)

Paradox of value
The paradox of value (also known as the diamondwater paradox)
is the apparent contradiction that, although water is on the whole more
useful, in terms of survival, than diamonds, diamonds command a
higher price in the market. The philosopher Adam Smith is often
considered to be the classic presenter of this paradox. Nicolaus
Copernicus, John Locke, John Law and others had previously tried to
explain the disparity.

Labor theory of value

Main article: Labor theory of value

(Image caption: Water and diamonds.)

In a passage of Adam Smith's An Inquiry into the Nature and Causes of
the Wealth of Nations, he discusses the concepts of value in use and value in exchange, and notices how they tend to
differ:
What are the rules which men naturally observe in exchanging them [goods] for money or for one another, I
shall now proceed to examine. These rules determine what may be called the relative or exchangeable value of
goods. The word VALUE, it is to be observed, has two different meanings, and sometimes expresses the utility
of some particular object, and sometimes the power of purchasing other goods which the possession of that
object conveys. The one may be called "value in use;" the other, "value in exchange." The things which have
the greatest value in use have frequently little or no value in exchange; on the contrary, those which have the
greatest value in exchange have frequently little or no value in use. Nothing is more useful than water: but it
will purchase scarcely anything; scarcely anything can be had in exchange for it. A diamond, on the contrary,
has scarcely any use-value; but a very great quantity of other goods may frequently be had in exchange for it.
Furthermore, he explained the value in exchange as being determined by labor:
The real price of every thing, what every thing really costs to the man who wants to acquire it, is the toil and
trouble of acquiring it.
Hence, Smith denied a necessary relationship between price and utility. Price on this view was related to a factor of
production (namely, labor) and not to the point of view of the consumer.[1] The best practical example of this is
saffron, the most expensive spice; much of its value derives from both the low yield from growing it and the
disproportionate amount of labor required to extract it. Proponents of the labor theory of value saw that as the
resolution of the paradox.
The labor theory of value has lost popularity in mainstream economics and has been replaced by the theory of
marginal utility.

Marginalism

Main article: Marginalism

The theory of marginal utility, which is based on the subjective theory of value, says that the price at which an
object trades in the market is determined neither by how much labor was exerted in its production, as in the labor
theory of value, nor by how useful it is on the whole (total utility). Rather, its price is determined by its marginal
utility. The marginal utility of a good is derived from its most important use to a person. So, if someone possesses a
good, he will use it to satisfy some need or want. Which one? Naturally, the one that takes highest priority. Eugen
von Böhm-Bawerk illustrated this with the example of a farmer having five sacks of grain.

(Image caption: At low levels of consumption, water has a much higher marginal utility than diamonds and thus is
more valuable. People usually consume water at much higher levels than they do diamonds, and thus the marginal
utility and price of water are lower than those of diamonds.)

With the first, he will make bread to survive. With the second, he will make more bread, in order to be strong enough
to work. With the next, he will feed his farm animals. The next is used to make whisky, and the last one he feeds to
the pigeons. If one of those bags is stolen, he will not reduce each of those activities by one-fifth; instead he will stop
feeding the pigeons.
So the value of the fifth bag of grain is equal to the satisfaction he gets from feeding the pigeons. If he sells that bag
and neglects the pigeons, his least productive use of the remaining grain is to make whisky, so the value of a fourth
bag of grain is the value of his whisky. Only if he loses four bags of grain will he start eating less; that is the most
productive use of his grain. The last bag of grain is worth his life.
In explaining the diamond-water paradox, marginalists explain that it is not the total usefulness of diamonds or water
that matters, but the usefulness of each unit of water or diamonds. It is true that the total utility of water to people is
tremendous, because they need it to survive. However, since water is in such large supply in the world, the marginal
utility of water is low. In other words, each additional unit of water that becomes available can be applied to less
urgent uses as more urgent uses for water are satisfied.
Therefore, any particular unit of water becomes worth less to people as the supply of water increases. On the other
hand, diamonds are in much lower supply. They are of such low supply that the usefulness of one diamond is greater
than the usefulness of one glass of water, which is in abundant supply. Thus, diamonds are worth more to people.
Therefore, those who want diamonds are willing to pay a higher price for one diamond than for one glass of water,
and sellers of diamonds ask a price for one diamond that is higher than for one glass of water.
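Böhm-Bawerk's ranking argument is mechanical enough to express in a few lines. The sketch below attaches hypothetical utility numbers to the five uses (illustrative assumptions, not figures from the text) and shows that one marginal sack is worth only the least important use currently served.

    # Boehm-Bawerk's five sacks of grain, with hypothetical utility numbers.
    # Each sack serves the highest-priority unmet use, so a marginal sack is
    # worth the LEAST important use it serves, not a fifth of the total.
    uses = [("bread to survive", 100),
            ("bread for strength", 80),
            ("feeding farm animals", 60),
            ("making whisky", 40),
            ("feeding pigeons", 20)]

    def marginal_value(n_sacks):
        # Utility lost if one of n_sacks were removed: the least urgent use.
        return uses[n_sacks - 1][1] if n_sacks else 0

    for n in range(5, 0, -1):
        use, _ = uses[n - 1]
        print(f"with {n} sacks, one sack is worth {marginal_value(n)} ({use})")
    # Losing one of five sacks costs only the pigeons' 20 units of utility.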


Efficiency model
The Chinese economist Tan Lidong addresses the question through relative economic efficiency, noting that
although water is in large supply in the world as a whole, water-taking efficiency in the desert is very low, so the
value of water there is high. If someone invented high-efficiency equipment for obtaining water in the desert, water
would become cheap. He avers that value is determined by efficiency, with efficiency in turn affected by tools, labor
and resources.
He suggests we can calculate the exact value of the water or the diamond by using efficiency. He takes a historical
approach to value: exchange ratios have been known for many products for a very long time, established by custom
and practice. Technological change alters the efficiency of production, thus changing relative values.[2]

Criticisms
George Stigler has argued that Smith's statement of the paradox is flawed, since it consisted of a comparison between
heterogeneous goods, and such comparison would have required using the concept of marginal utility of income.
And since this concept was not known in Smith's time, then the value in use and value in exchange judgement may
be meaningless:
The paradox - that value in exchange may exceed or fall short of value in use - was, strictly speaking, a
meaningless statement, for Smith had no basis (i.e., no concept of marginal utility of income or marginal price
of utility) on which he could compare such heterogeneous quantities. On any reasonable interpretation,
moreover, Smith's statement that value in use could be less than value in exchange was clearly a moral
judgment, not shared by the possessors of diamonds. To avoid the incomparability of money and utility, one
may interpret Smith to mean that the ratio of values of two commodities is not equal to the ratio of their total
utilities. Or, alternatively, that the ratio of the prices of two commodities is not equal to the ratio of their total
utilities; but this also requires an illegitimate selection of units: the price of what quantity of diamonds is to be
compared with the price of one gallon of water?
George Stigler, The development of Utility Theory. I [3]

References
[1] Dhamee, Yousuf (1996?), Adam Smith and the division of labour (http://www.victorianweb.org/economics/division.html), accessed
09/08/06
[2] Tan Lidong, The Economics of Happiness, publishing house of China University of Political Science and Law (January 2012). ISBN 9787562040675
[3] Stigler, George (1950). "The Development of Utility Theory. I". Journal of Political Economy 58 (4), p. 308.

Productivity paradox
The productivity paradox was analyzed and popularized in a widely cited article by Erik Brynjolfsson, which noted
the apparent contradiction between the remarkable advances in computer power and the relatively slow growth of
productivity at the level of the whole economy, individual firms and many specific applications. The concept is
sometimes referred to as the Solow computer paradox in reference to Robert Solow's 1987 quip, "You can see the
computer age everywhere but in the productivity statistics."[1] The paradox has been defined as the discrepancy
between measures of investment in information technology and measures of output at the national level.
It was widely believed that office automation was boosting labor productivity (or total factor productivity). However,
the growth accounts did not seem to confirm the idea. From the early 1970s to the early 1990s there was a massive
slow-down in growth as the machines were becoming ubiquitous. (Other variables in countries' economies were
changing simultaneously; growth accounting separates out the improvement in production output achieved using the
same capital and labour resources as input by calculating growth in total factor productivity, also known as the
"Solow residual".)
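As a rough sketch of the growth-accounting arithmetic referred to above (the growth rates and capital share below are hypothetical, for illustration only), total factor productivity growth is whatever output growth remains after share-weighted input growth is subtracted:

    # Growth-accounting sketch under a Cobb-Douglas assumption:
    # Y = A * K**alpha * L**(1 - alpha), so in growth rates
    # gA = gY - alpha * gK - (1 - alpha) * gL   (the "Solow residual").
    alpha = 0.3     # capital's share of income (a common assumption)
    gY = 0.025      # output growth per year (hypothetical)
    gK = 0.030      # capital input growth, e.g. heavy IT investment
    gL = 0.010      # labor input growth

    solow_residual = gY - alpha * gK - (1 - alpha) * gL
    print(f"TFP (Solow residual) growth: {solow_residual:.3%} per year")
    # If IT investment raises gK while gY barely moves, measured TFP growth
    # stagnates: the productivity paradox as it shows up in growth accounts.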
The productivity paradox has attracted a lot of attention because technology seems no longer to be able to create the
kind of productivity gains that occurred until the early 1970s. Some, such as economist Robert J. Gordon, are now
arguing that technology in general is subject to diminishing returns in its ability to increase economic growth.

Explanations
Different authors have explained the paradox in different ways. In his original article, Brynjolfsson (1993) identified
five possible explanations:
Mismeasurement: the gains are real, but our current measures miss them;
Redistribution: there are private gains, but they come at the expense of other firms and individuals, leaving little
net gain;
Time lags: the gains take a long time to show up;
Mismanagement: there are no gains because of the unusual difficulties in managing IT or information itself; and
Feedback effects: lower labor requirements lead to fewer customers, negating any economies of scale achievable
with computers.
He stressed the first explanation, noting weaknesses with then-existing studies and measurement methods, and
pointing out that "a shortfall of evidence is not evidence of a shortfall."
Turban et al. (2008) mention that understanding the paradox requires an understanding of the concept of
productivity. Pinsonneault et al. (1998) state that untangling the paradox requires an understanding of how IT usage
relates to the nature of managerial work and the context in which it is deployed.
One hypothesis to explain the productivity paradox is that computers are productive, yet their productive gains are
realized only after a lag period, during which complementary capital investments must be developed to allow for the
use of computers to their full potential.[2]
The diminishing-marginal-returns hypothesis, the opposite of the time-lag hypothesis, holds that computers, in the
form of mainframes, were used in the most productive areas, like high-volume transactions in banking, accounting
and airline reservations, for over two decades before personal computers appeared. Also, computers replaced a
sophisticated system of data processing that used unit record equipment. Therefore the important productivity
opportunities were exhausted before computers were everywhere; we have been looking at the wrong time period.
Another hypothesis states that computers are simply not very productivity-enhancing because they require time, a
scarce complementary human input. This theory holds that although computers perform a variety of tasks, these
tasks are not done in any particularly new or efficient manner, but rather they are only done faster. Current data does
not confirm the validity of either hypothesis. It could very well be that increases in productivity due to computers are
not captured in GDP measures, but rather in quality changes and new products.

Economists have researched the productivity issue and concluded that the possible explanations for the paradox fall
into three categories:
Data and analytical problems hide "productivity-revenues". The ratios for input and output are sometimes difficult
to measure, especially in the service sector.
Revenues gained by a company through productivity will be hard to notice because there might be losses in other
divisions/departments of the company. So it is again hard to measure the profits made only through investments in
productivity.
There is complexity in designing, administering and maintaining IT systems. IT projects, especially software
development, are notorious for cost overruns and schedule delays. Adding to cost are rapid obsolescence of
equipment and software, incompatible software and network platforms and issues with security such as data theft
and viruses. This causes constant spending for replacement. One-time changes also occur, such as the Year 2000
problem and the changeover from Novell NetWare by many companies.
Other economists have made a more controversial charge against the utility of computers: that they pale into
insignificance as a source of productivity advantage when compared to the industrial revolution, electrification,
infrastructures (canals and waterways, railroads, highway system), Fordist mass production and the replacement of
human and animal power with machines. High productivity growth occurred from the last decades of the 19th century
until 1973, with a peak from 1929 to 1973, then declined to levels of the early 19th century. There was a rebound
in productivity after 2000. Much of the productivity from 1985 to 2000 came in the computer and related industries.
A number of explanations of this have been advanced, including:
The tendency, at least initially, of computer technology to be used for applications that have little impact on
overall productivity, e.g. word processing.
Inefficiencies arising from running manual paper-based and computer-based processes in parallel, requiring two
separate sets of activities and human effort to mediate between them; usually considered a technology alignment
problem.
Poor user interfaces that confuse users, prevent or slow access to time-saving facilities, and are internally
inconsistent both with each other and with terms used in work processes; a concern addressed in part by enterprise
taxonomy.
Extremely poor hardware and related boot image control standards that forced users into endless "fixes" as
operating systems and applications clashed; addressed in part by single-board computers and simpler, more
automated re-install procedures, and the rise of software specifically to solve this problem, e.g. Norton Ghost.
Technology-driven change driven by companies such as Microsoft which profit directly from more rapid
"upgrades".
An emphasis on presentation technology, and even persuasion technology such as PowerPoint, at the direct
expense of core business processes and learning; addressed in some companies, including IBM and Sun
Microsystems, by creating a PowerPoint-free zone.
The blind assumption that introducing new technology must be good
The fact that computers handle office functions that, in most cases, are not related to the actual production of
goods and services.
Factories were automated decades before computers. Adding computer control to existing factories resulted in
only slight productivity gains in most cases.
A paper by Triplett (1999) reviews Solow's paradox through seven other often-given explanations. They are:
You don't see computers everywhere, in a meaningful economic sense
You only think you see computers everywhere
You may not see computers everywhere, but in the industrial sectors where you most see them, output is poorly
measured
Whether or not you see computers everywhere, some of what they do is not counted in economic statistics
You don't see computers in the productivity statistics yet, but wait a bit and you will
You see computers everywhere but in the productivity statistics because computers are not as productive as you
think
There is no paradox: some economists are counting innovations and new products on an arithmetic scale when
they should count on a logarithmic scale.

Effects of economic sector share changes


Gordon J. Bjork points out that manufacturing productivity gains continued, though at a lower rate than in decades
past; however, the cost reductions in manufacturing shrank the sector's size. The services and government sectors,
where productivity growth is very low, gained in share, dragging down the overall productivity number. Because
government services are priced at cost with no value added, government productivity growth is near zero as an
artifact of the way in which it is measured. Bjork also points out that manufacturing uses more capital per unit of
output than government or services.

Miscellaneous causes

Before computers: Data processing with unit record equipment
Main articles: Unit record equipment and Plugboard

(Image caption: Early IBM tabulating machine. Common applications were accounts receivable, payroll and billing.)

When computers for general business applications appeared in the 1950s, a sophisticated industry for data
processing existed in the form of unit record equipment. These systems processed data on punched cards by running
the cards through tabulating machines, the holes in the cards allowing electrical contact to activate relays and
solenoids to keep a count. The flow of punched cards could be arranged in various program-like sequences to allow
sophisticated data processing. Some unit record equipment was programmable by wiring a plug board, with the plug
boards being removable, allowing for quick replacement with another pre-wired program.

In 1949 vacuum tube calculators were added to unit record equipment. In 1955 the first completely transistorized
calculator with magnetic cores for dynamic memory, the IBM 608, was introduced.

(Image caption: Control panel for an IBM 402 Accounting Machine.)

The first computers were an improvement over unit record equipment, but not by a great amount. This was partly
due to the low-level software used, low performance capability, and failure of vacuum tubes and other components.
Also, the data input to early computers used punched cards. Most of these hardware and software shortcomings
were solved by the late 1960s, but punched cards were not fully displaced until the 1980s.


Analog process control


Computers did not revolutionize manufacturing because automation, in the form of control systems, had already
been in existence for decades, although computers did allow more sophisticated control, which led to improved
product quality and process optimization. Pre-computer control was known as analog control and computerized
control is called digital.

Parasitic losses of cashless transactions


Credit card transactions now represent a large percentage of low-value transactions, on which credit card companies
charge merchants fees. Most such credit card transactions reflect habit rather than an actual need for credit, and to
the extent that such purchases represent convenience or a lack of planning to carry cash on the part of consumers,
these transactions add a layer of unnecessary expense. However, debit or check card transactions are cheaper than
processing paper checks.

Online commerce
Despite high expectations for online retail sales, individual-item and small-quantity handling and transportation
costs may offset the savings of not having to maintain "bricks and mortar" stores. Online retail sales have proven
successful in specialty items, collectibles and higher-priced goods. Some airline and hotel retailers and aggregators
have also witnessed great success.
Online commerce has been extremely successful in banking, airline, hotel, and rental car reservations, to name a few.

Restructured office
The personal computer restructured the office by reducing the secretarial and clerical staffs. Prior to computers,
secretaries transcribed Dictaphone recordings or live speech into shorthand, and typed the information, typically a
memo or letter. All filing was done with paper copies.
A new position in the office staff was the information technologist, or department. With networking came
information overload in the form of e-mail, with some office workers receiving several hundred each day, most of
which are not necessary information for the recipient.
Some hold that one of the main productivity boosts from information technology is still to come: large-scale
reductions in traditional offices as home offices become widespread. This, however, requires major changes in work
culture and remains to be proven.

Cost overruns of software projects


It is well known by software developers that projects typically run over budget and finish behind schedule.
Software development is typically for new applications that are unique. The project's analyst is responsible for
interviewing the stakeholders, individually and in group meetings, to gather the requirements and incorporate them
into a logical format for review by the stakeholders and developers. This sequence is repeated in successive
iterations, with partially completed screens available for review in the latter stages.
Unfortunately, stakeholders often have a vague idea of what the functionality should be, and tend to add a lot of
unnecessary features, resulting in schedule delays and cost overruns.


Qualifications
By the late 1990s there were some signs that productivity in the workplace had been improved by the introduction of
IT, especially in the United States. In fact, Erik Brynjolfsson and his colleagues found a significant positive
relationship between IT investments and productivity, at least when these investments were made to complement
organizational changes.[3][4][5] A large share of the productivity gains outside the IT-equipment industry itself have
been in retail, wholesale and finance. A major advance was computerized stock market transaction processing,
which replaced the paper-based system in place since the Civil War; the paperwork backlog under the old system
had forced the U.S. stock market to close most Wednesday afternoons during the last half of 1968.
Computers revolutionized accounting, billing, record keeping and many other office functions; however, early
computers used punched cards for data and programming input. Until the 1980s it was common to receive monthly
utility bills printed on a punched card that was returned with the customer's payment.
In 1973 IBM introduced point of sale (POS) terminals in which electronic cash registers were networked to the store
mainframe computer. By the 1980s bar code readers were added. These technologies automated inventory
management. Wal-Mart Stores was an early adopter of POS.
Computers also greatly increased productivity of the communications sector, especially in areas like the elimination
of telephone operators. In engineering, computers replaced manual drafting with CAD and software was developed
for calculations used in electronic circuits, stress analysis, heat and material balances, etc.
Automated teller machines (ATMs) became popular in recent decades and self checkout at retailers appeared in the
1990s.
The Airline Reservations System and banking are areas where computers are practically essential. Modern military
systems also rely on computers.

References
[1] Robert Solow, "We'd better watch out", New York Times Book Review, July 12, 1987, page 36. See here (http://www.standupeconomist.com/pdf/misc/solow-computer-productivity.pdf).
[2] David, P. A., "The Dynamo and the Computer: A Historical Perspective on the Modern Productivity Paradox", American Economic Review
Papers and Proceedings, 1990, 355-61
[3] E. Brynjolfsson and L. Hitt, "Beyond the Productivity Paradox: Computers are the Catalyst for Bigger Changes", CACM, August 1998
[4] E. Brynjolfsson, S. Yang, The Intangible Costs and Benefits of Computer Investments: Evidence from the Financial Markets, MIT Sloan
School of Management, December 1999
[5] Paolo Magrassi, A. Panarella, B. Hayward, The 'IT and Economy' Discussion: A Review, GartnerGroup, Stamford (CT), USA, June 2002 [1]

Further reading
Brynjolfsson, Erik, and Lorin Hitt (June 2003). "Computing Productivity: Firm Level Evidence" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=290325). MIT Sloan Working Paper No. 4210-01.
Brynjolfsson, Erik, and Adam Saunders (2010). Wired for Innovation: How Information Technology is Reshaping
the Economy (http://digital.mit.edu/erik/Wired4innovation.html). MIT Press.
Greenwood, Jeremy (1997). The Third Industrial Revolution: Technology, Productivity and Income Inequality
(http://www.econ.rochester.edu/Faculty/GreenwoodPapers/third.pdf). AEI Press.
Landauer, Thomas K. (1995). The Trouble with Computers: Usefulness, Usability and Productivity. Cambridge,
Massachusetts: MIT Press. ISBN 0-262-62108-8.
Pinsonneault, Alain & Suzanne Rivard (1998). "Information Technology and the Nature of Managerial Work:
From the Productivity Paradox to the Icarus Paradox". MIS Quarterly 22 (3): 287-311.
Triplett, Jack E. (1999). "The Solow productivity paradox: what do computers do to productivity?" (http://www.csls.ca/journals/sisspp/v32n2_04.pdf). Canadian Journal of Economics 32 (2): 309-334.

Stratopoulos, Theophanis, and Bruce Dehning (2000). "Does successful investment in information technology
solve the productivity paradox?". Information & Management: 113.

St. Petersburg paradox


The St. Petersburg lottery or St. Petersburg paradox[1] is a paradox related to probability and decision theory in
economics. It is based on a particular (theoretical) lottery game that leads to a random variable with infinite expected
value (i.e., infinite expected payoff) but nevertheless seems to be worth only a very small amount to the participants.
The St. Petersburg paradox is a situation where a naive decision criterion which takes only the expected value into
account predicts a course of action that presumably no actual person would be willing to take. Several resolutions are
possible.
The paradox is named from Daniel Bernoulli's presentation of the problem and his solution, published in 1738 in the
Commentaries of the Imperial Academy of Science of Saint Petersburg (Bernoulli 1738). However, the problem was
invented by Daniel's cousin Nicolas Bernoulli who first stated it in a letter to Pierre Raymond de Montmort on
September 9, 1713 (de Montmort 1713).

The paradox
A casino offers a game of chance for a single player in which a fair coin is tossed at each stage. The pot starts at 2
dollars and is doubled every time a head appears. The first time a tail appears, the game ends and the player wins
whatever is in the pot. Thus the player wins 2 dollars if a tail appears on the first toss, 4 dollars if a head appears on
the first toss and a tail on the second, 8 dollars if a head appears on the first two tosses and a tail on the third, 16
dollars if a head appears on the first three tosses and a tail on the fourth, and so on. In short, the player wins 2^k
dollars, where k is the number of tosses. What would be a fair price to pay the casino for entering the game?
To answer this, we need to consider what the average payout would be: with probability 1/2, the player wins 2
dollars; with probability 1/4 the player wins 4 dollars; with probability 1/8 the player wins 8 dollars, and so on. The
expected value is thus

E = \frac{1}{2}\cdot 2 + \frac{1}{4}\cdot 4 + \frac{1}{8}\cdot 8 + \cdots = \sum_{k=1}^{\infty} \frac{1}{2^k}\cdot 2^k = 1 + 1 + 1 + \cdots
Assuming the game can continue as long as the coin toss results in heads and in particular that the casino has
unlimited resources, this sum grows without bound and so the expected win for repeated play is an infinite amount of
money. Considering nothing but the expected value of the net change in one's monetary wealth, one should
therefore play the game at any price if offered the opportunity. Yet, in published descriptions of the game, many
people expressed disbelief in the result. Martin quotes Ian Hacking as saying "few of us would pay even $25 to enter
such a game" and says most commentators would agree. The paradox is the discrepancy between what people seem
willing to pay to enter the game and the infinite expected value.
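The discrepancy is easy to exhibit by simulation; the sketch below (an illustrative script, not part of the original discussion) plays the game many times and prints the sample-average payout, which stays modest and grows only slowly with the number of games played.

    import random

    def play_once():
        # One game: the pot starts at $2 and doubles on each head;
        # the first tail ends the game and pays out the pot.
        pot = 2
        while random.random() < 0.5:
            pot *= 2
        return pot

    for n in (1_000, 100_000, 1_000_000):
        average = sum(play_once() for _ in range(n)) / n
        print(f"average payout over {n:>9,} games: ${average:8.2f}")
    # Despite the infinite expected value, sample averages grow only roughly
    # logarithmically in the number of games (compare Feller's answer below).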


Solutions of the paradox


Several approaches have been proposed for solving the paradox.

Expected utility theory


The classical resolution of the paradox involved the explicit introduction of a utility function, an expected utility
hypothesis, and the presumption of diminishing marginal utility of money.
In Daniel Bernoulli's own words:
The determination of the value of an item must not be based on the price, but rather on the utility it yields.
There is no doubt that a gain of one thousand ducats is more significant to the pauper than to a rich man
though both gain the same amount.
A common utility model, suggested by Bernoulli himself, is the logarithmic function U(w) = ln(w) (known as log
utility [2]). It is a function of the gambler's total wealth w, and the concept of diminishing marginal utility of money
is built into it. The expected utility hypothesis posits that a utility function exists whose expected net change is a
good criterion for real people's behavior. For each possible event, the change in utility
ln(wealth after the event) - ln(wealth before the event) will be weighted by the probability of that event occurring.
Let c be the cost charged to enter the game. The expected utility of the lottery now converges to a finite value:

\Delta E[U] = \sum_{k=1}^{\infty} \frac{1}{2^k} \left[ \ln(w + 2^k - c) - \ln(w) \right] < \infty
This formula gives an implicit relationship between the gambler's wealth and how much he should be willing to pay
to play (specifically, any c that gives a positive expected utility). For example, with log utility a millionaire should
be willing to pay up to $10.94, a person with $1000 should pay up to $5.94, a person with $2 should pay up to $2,
and a person with $0.60 should borrow $0.87 and pay up to $1.47.
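These dollar figures can be reproduced numerically. The sketch below (hypothetical helper functions) bisects for the largest entry price that leaves the expected change in log utility non-negative; note that the quoted figures correspond to a payout of 2^{k-1} dollars when the first tail occurs on toss k, and the code assumes that convention.

    import math

    def delta_expected_log_utility(w, c, terms=200):
        # Expected change in ln(wealth) from paying c to play, assuming a
        # payout of 2**(k - 1) dollars when the first tail occurs on toss k.
        return sum((0.5 ** k) * (math.log(w - c + 2 ** (k - 1)) - math.log(w))
                   for k in range(1, terms + 1))

    def max_acceptable_price(w):
        # Bisect for the largest c with a non-negative expected utility
        # change; c stays below w + 1, the bankruptcy bound for a $1 payout.
        lo, hi = 0.0, w + 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if delta_expected_log_utility(w, mid) >= 0:
                lo = mid
            else:
                hi = mid
        return lo

    for wealth in (1_000_000, 1_000, 2):
        print(f"wealth ${wealth:>9,}: pay up to ${max_acceptable_price(wealth):.2f}")
    # Prints approximately $10.94, $5.94 and $2.00, matching the figures above.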
Before Daniel Bernoulli published, in 1728, another Swiss mathematician, Gabriel Cramer, had already found parts
of this idea (also motivated by the St. Petersburg Paradox) in stating that
the mathematicians estimate money in proportion to its quantity, and men of good sense in proportion to the
usage that they may make of it.
He demonstrated in a letter to Nicolas Bernoulli [3] that a square root function describing the diminishing marginal
benefit of gains can resolve the problem. However, unlike Daniel Bernoulli, he did not consider the total wealth of a
person, but only the gain by the lottery.
This solution by Cramer and Bernoulli, however, is not completely satisfying, since the lottery can easily be changed
in a way such that the paradox reappears. To this aim, we just need to change the game so that it gives the (even
larger) payoff e^{2^k}. Again, the game should be worth an infinite amount. More generally, one can find a lottery
that allows for a variant of the St. Petersburg paradox for every unbounded utility function, as was first pointed out
by Menger (Menger 1934).
Recently, expected utility theory has been extended to arrive at more behavioral decision models. In some of these
new theories, as in cumulative prospect theory, the St. Petersburg paradox again appears in certain cases, even when
the utility function is concave, but not if it is bounded (Rieger & Wang 2006).

Probability weighting
Nicolas Bernoulli himself proposed an alternative idea for solving the paradox. He conjectured that people will
neglect unlikely events (de Montmort 1713). Since in the St. Petersburg lottery only unlikely events yield the high
prizes that lead to an infinite expected value, this could resolve the paradox. The idea of probability weighting
resurfaced much later in the work on prospect theory by Daniel Kahneman and Amos Tversky. However, their
experiments indicated that, very much to the contrary, people tend to overweight small probability events. Therefore
the proposed solution by Nicolas Bernoulli is nowadays not considered to be satisfactory.
Cumulative prospect theory is one popular generalization of expected utility theory that can predict many behavioral
regularities (Tversky & Kahneman 1992). However, the overweighting of small probability events introduced in
cumulative prospect theory may restore the St. Petersburg paradox. Cumulative prospect theory avoids the St.
Petersburg paradox only when the power coefficient of the utility function is lower than the power coefficient of the
probability weighting function (Blavatskyy 2005). Intuitively, the utility function must not simply be concave, but it
must be concave relative to the probability weighting function to avoid the St. Petersburg paradox.

Rejection of mathematical expectation


Various authors, including Jean le Rond d'Alembert and John Maynard Keynes, have rejected maximization of
expectation (even of utility) as a proper rule of conduct. Keynes, in particular, insisted that the relative risk of an
alternative could be sufficiently high to reject it even were its expectation enormous.

Answer by sampling
A mathematically rigorous answer using sampling was given by William Feller in 1937. Sufficient knowledge of
probability theory and statistics is necessary to fully understand Feller's answer, but it can be understood intuitively
because the technique is "to play this game with a great number of people, and to then calculate the expectation from
the sample". According to this technique, if the expectation of a game diverges, the assumption that the game can be
played for an infinite time is necessary; if the number of games is limited, the expectation converges to a much
smaller value.

Finite St. Petersburg lotteries


The classical St. Petersburg lottery assumes that the casino has infinite resources. This assumption is unrealistic,
particularly in connection with the paradox, which involves the reactions of ordinary people to the lottery. Of course,
the resources of an actual casino (or any other potential backer of the lottery) are finite. More importantly, the
expected value of the lottery only grows logarithmically with the resources of the casino. As a result, the expected
value of the lottery, even when played against a casino with the largest resources realistically conceivable, is quite
modest. If the total resources (or total maximum jackpot) of the casino are W dollars, then L = 1 + floor(log2(W)) is
the maximum number of times the casino can play before it no longer covers the next bet. The expected value E of
the lottery then becomes:

The following table shows the expected value E of the game with various potential bankers and their bankroll W
(with the assumption that if you win more than the bankroll you will be paid what the bank has):

Banker              Bankroll               Expected value of lottery
Friendly game       $100                   $4.28
Millionaire         $1,000,000             $10.95
Billionaire         $1,000,000,000         $15.93
Bill Gates (2008)   $67,000,000,000 [4]    $21.77
U.S. GDP (2007)     $13.8 trillion [5]     $22.78
World GDP (2007)    $54.3 trillion [5]     $23.77
Googolaire          $10^100                $166.50
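A few lines of code reproduce the table's arithmetic (a hypothetical helper, using the same 2^{k-1}-dollar payout convention as the formula above, with the full bankroll W paid out whenever the pot would exceed it):

    import math

    def finite_expected_value(W):
        # L rounds can be fully covered by a bankroll of W; beyond that the
        # player is simply paid W, contributing W / 2**L in expectation.
        L = 1 + math.floor(math.log2(W))
        return L / 2 + W / 2 ** L

    for banker, W in [("Friendly game", 100),
                      ("Millionaire", 10 ** 6),
                      ("Billionaire", 10 ** 9)]:
        print(f"{banker:<13} ${finite_expected_value(W):.2f}")
    # Prints $4.28, $10.95 and $15.93, matching the corresponding rows.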

A rational person might not find the lottery worth even the modest amounts in the above table, suggesting that the
naive decision model of the expected return causes essentially the same problems as for the infinite lottery. Even so,
the possible discrepancy between theory and reality is far less dramatic.
The assumption of infinite resources can produce other apparent paradoxes in economics. In the martingale betting
system, a gambler betting on a tossed coin doubles his bet after every loss, so that an eventual win would cover all
losses; in practice, this requires the gambler's bankroll to be infinite. The gambler's ruin concept shows a gambler
playing a negative expected value game will eventually go broke, regardless of his betting system.
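The martingale system's reliance on an unbounded bankroll can likewise be checked by simulation; in the sketch below (hypothetical stake and bankroll sizes) the gambler doubles the bet after each loss on a fair coin until a win arrives or the next bet can no longer be covered.

    import random

    def martingale_round(bankroll, base_bet=1):
        # Double the bet after each loss; a single win recovers all previous
        # losses plus base_bet. Return False if the next bet is unaffordable.
        bet = base_bet
        while bet <= bankroll:
            if random.random() < 0.5:   # win
                return True
            bankroll -= bet             # loss: the stake is gone
            bet *= 2
        return False                    # ruin: cannot cover the next bet

    trials = 100_000
    ruined = sum(not martingale_round(100) for _ in range(trials))
    print(f"ruined in {ruined / trials:.2%} of sequences "
          f"(theory: 1/64 = {1 / 64:.2%} with a $100 bankroll)")
    # Only an infinite bankroll makes eventual ruin impossible.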

Recent discussions
Although this paradox is three centuries old, papers proposing new ways to resolve it continue to appear.

Samuelson
Samuelson resolves the paradox by arguing that, even if an entity had infinite resources, the game would never be
offered. If the lottery represents an infinite expected gain to the player, then it also represents an infinite expected
loss to the host. No one could be observed paying to play the game because it would never be offered. As Paul
Samuelson describes the argument:
Paul will never be willing to give as much as Peter will demand for such a contract; and hence the indicated
activity will take place at the equilibrium level of zero intensity. (Samuelson 1960)

Peters
Ole Peters thinks that the St. Petersburg paradox can be solved by using concepts and ideas from ergodic theory
(Peters 2011a). In statistical mechanics it is a central problem to understand whether time averages resulting from a
long observation of a single system are equivalent to expectation values. This is the case only for a very limited class
of systems that are called "ergodic" there. For non-ergodic systems there is no general reason why expectation values
should have any relevance.
Peters points out that computing the naive expected payout is mathematically equivalent to considering multiple
outcomes of the same lottery in parallel universes. This is irrelevant to the individual considering whether to buy a
ticket since he exists in only one universe and is unable to exchange resources with the others. It is therefore unclear
why expected wealth should be a quantity whose maximization should lead to a sound decision theory. Indeed, the
St. Petersburg paradox is only a paradox if one accepts the premise that rational actors seek to maximize their
expected wealth. The classical resolution is to apply a utility function to the wealth, which reflects the notion that the
"usefulness" of an amount of money depends on how much of it one already has, and then to maximise the
expectation of this. The choice of utility function is often framed in terms of the individual's risk preferences and
may vary between individuals: it therefore provides a somewhat arbitrary framework for the treatment of the
problem.
An alternative premise, which is less arbitrary and makes fewer assumptions, is that the performance over time of an
investment better characterises an investor's prospects and, therefore, better informs his investment decision. In this
case, the passage of time is incorporated by identifying as the quantity of interest the average rate of exponential
growth of the player's wealth in a single round of the lottery,

\bar{g} = \sum_{k=1}^{\infty} p_k \ln\left(\frac{w - c + d_k}{w}\right)

per round, where d_k is the k-th (positive finite) payout and p_k is the (non-zero) probability of receiving it. In the
standard St. Petersburg lottery, d_k = 2^k and p_k = 2^{-k}.

Although this is an expectation value of a growth rate, and may therefore be thought of in one sense as an average
over parallel universes, it is in fact equivalent to the time average growth rate that would be obtained if repeated
lotteries were played over time (Peters 2011a). While is identical to the rate of change of the expected logarithmic
utility, it has been obtained without making any assumptions about the player's risk preferences or behaviour, other
than that he is interested in the rate of growth of his wealth.
Under this paradigm, an individual with wealth w should buy a ticket at a price c provided ⟨g⟩ > 0.
This strategy counsels against paying any amount of money for a ticket that admits the possibility of bankruptcy, i.e.
w - c + x_k ≤ 0 for any k, since this generates a negatively divergent logarithm in the sum for ⟨g⟩ which can be shown
to dominate all other terms in the sum and guarantee that ⟨g⟩ = -∞. If we assume the smallest payout is x_min, then
the individual will always be advised to decline the ticket at any price greater than or equal to w + x_min,
regardless of the payout structure of the lottery. The ticket price for which the expected growth rate falls to zero will
be less than w + x_min but may be greater than w, indicating that borrowing money to purchase a ticket for more than
one's wealth can be a sound decision. This would be the case, for example, where the smallest payout exceeds the
player's current wealth, as it does in Menger's game.
It should also be noted in the above treatment that, contrary to Menger's analysis, no higher-paying lottery can
generate a paradox which the time resolution (or, equivalently, Bernoulli's or Laplace's logarithmic resolutions) fails
to resolve, since there is always a price at which the lottery should not be entered, even though for especially
favourable lotteries this may be greater than one's worth.
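The break-even ticket price under this criterion can be found numerically. A minimal sketch, assuming the standard lottery with x_k = 2^(k-1) and p_k = 2^(-k); the truncation depth and bisection bounds are choices of this example, not of Peters' paper:

    import math

    def growth_rate(wealth: float, price: float, terms: int = 200) -> float:
        """Time-average exponential growth rate <g> per round."""
        if price >= wealth + 1:  # smallest payout is $1, so bankruptcy is possible
            return float("-inf")
        return sum(2.0**-k * math.log((wealth - price + 2.0**(k - 1)) / wealth)
                   for k in range(1, terms + 1))

    def break_even_price(wealth: float) -> float:
        """Largest price with <g> >= 0, by bisection on a decreasing function."""
        lo, hi = 0.0, wealth + 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if growth_rate(wealth, mid) > 0 else (lo, mid)
        return lo

    print(break_even_price(100.0))  # maximum advisable price for wealth w = 100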

Further discussions
The St. Petersburg paradox and the theory of marginal utility have been highly disputed in the past. For a discussion
from the point of view of a philosopher, see (Martin 2004).

References and notes


Citations
[1] Conceptual foundations of risk theory. By Michael D. Weiss, United States Dept. of Agriculture, Economic Research Service. p. 36 (http://books.google.com/books?id=v33W8lKfOo0C&pg=PA36)
[2] http://www.econterms.com/glossary.cgi?query=log+utility
[3] http://www.cs.xu.edu/math/Sources/Montmort/stpetersburg.pdf#search=%22Nicolas%20Bernoulli%22
[4] The estimated net worth of Bill Gates is from Forbes.
[5] The GDP data are as estimated for 2007 by the International Monetary Fund, where one trillion dollars equals $10^12 (one million times one million dollars).

Works cited

Arrow, Kenneth J. (February 1974). "The use of unbounded utility functions in expected-utility maximization: Response" (http://ideas.repec.org/a/tpr/qjecon/v88y1974i1p136-38.html) (PDF). Quarterly Journal of Economics (The MIT Press) 88 (1): 136–138. doi: 10.2307/1881800 (http://dx.doi.org/10.2307/1881800). JSTOR 1881800 (http://www.jstor.org/stable/1881800). Handle: RePEc:tpr:qjecon:v:88:y:1974:i:1:p:136-38.
Bernoulli, Daniel; originally published in 1738; translated by Dr. Louise Sommer (January 1954). "Exposition of a New Theory on the Measurement of Risk" (http://www.math.fau.edu/richman/Ideas/daniel.htm). Econometrica (The Econometric Society) 22 (1): 22–36. doi: 10.2307/1909829 (http://dx.doi.org/10.2307/1909829). JSTOR 1909829 (http://www.jstor.org/stable/1909829). Retrieved 2006-05-30.
Blavatskyy, Pavlo (April 2005). "Back to the St. Petersburg Paradox?". Management Science 51 (4): 677–678. doi: 10.1287/mnsc.1040.0352 (http://dx.doi.org/10.1287/mnsc.1040.0352).
de Montmort, Pierre Remond (1713). Essay d'analyse sur les jeux de hazard [Essays on the analysis of games of chance] (Reprinted in 2006) (in French) (Second ed.). Providence, Rhode Island: American Mathematical Society. ISBN 978-0-8218-3781-8. As translated and posted at Pulskamp, Richard J. "Correspondence of Nicolas Bernoulli concerning the St. Petersburg Game" (http://www.cs.xu.edu/math/Sources/Montmort/stpetersburg.pdf) (PDF, 88 KB). Retrieved July 22, 2010.
Laplace, Pierre Simon (1814). Théorie analytique des probabilités [Analytical theory of probabilities] (in French) (Second ed.). Paris: Ve. Courcier.
Martin, Robert (2004). "The St. Petersburg Paradox" (http://plato.stanford.edu/archives/fall2004/entries/paradox-stpetersburg/). In Edward N. Zalta. The Stanford Encyclopedia of Philosophy (Fall 2004 ed.). Stanford, California: Stanford University. ISSN 1095-5054 (http://www.worldcat.org/issn/1095-5054). Retrieved 2006-05-30.
Menger, Karl (August 1934). "Das Unsicherheitsmoment in der Wertlehre. Betrachtungen im Anschluß an das sogenannte Petersburger Spiel". Zeitschrift für Nationalökonomie 5 (4): 459–485. doi: 10.1007/BF01311578 (http://dx.doi.org/10.1007/BF01311578). ISSN 0931-8658 (http://www.worldcat.org/issn/0931-8658).
Peters, Ole (October 2011b). "Menger 1934 revisited" (http://arxiv.org/pdf/1110.1578v1.pdf). Journal of Economic Literature.
Peters, Ole (2011a). "The time resolution of the St Petersburg paradox" (http://rsta.royalsocietypublishing.org/content/369/1956/4913.full.pdf). Philosophical Transactions of the Royal Society 369: 4913–4931. doi: 10.1098/rsta.2011.0065 (http://dx.doi.org/10.1098/rsta.2011.0065).
Pianca, Paolo (September 2007). "The St. Petersburg Paradox: Historical Exposition, an Application to Growth Stocks and Some Simulation Approaches" (http://www.dma.unive.it/quaderni/QD24-2007.pdf). Quaderni di Didattica, Department of Applied Mathematics, University of Venice 24: 1–15.
Rieger, Marc Oliver; Wang, Mei (August 2006). "Cumulative prospect theory and the St. Petersburg paradox". Economic Theory 28 (3): 665–679. doi: 10.1007/s00199-005-0641-6 (http://dx.doi.org/10.1007/s00199-005-0641-6). ISSN 0938-2259 (http://www.worldcat.org/issn/0938-2259). (Publicly accessible older version: http://www.sfb504.uni-mannheim.de/publications/dp04-28.pdf)
Samuelson, Paul (January 1960). "The St. Petersburg Paradox as a Divergent Double Limit". International Economic Review (Blackwell Publishing) 1 (1): 31–37. doi: 10.2307/2525406 (http://dx.doi.org/10.2307/2525406). JSTOR 2525406 (http://www.jstor.org/stable/2525406).
Samuelson, Paul (March 1977). "St. Petersburg Paradoxes: Defanged, Dissected, and Historically Described". Journal of Economic Literature (American Economic Association) 15 (1): 24–55. JSTOR 2722712 (http://www.jstor.org/stable/2722712).
Todhunter, Isaac (1865). A history of the mathematical theory of probabilities. Macmillan & Co.
Tversky, Amos; Kahneman, Daniel (1992). "Advances in prospect theory: Cumulative representation of uncertainty". Journal of Risk and Uncertainty 5: 297–323. doi: 10.1007/bf00122574 (http://dx.doi.org/10.1007/bf00122574).


External links
Online simulation of the St. Petersburg lottery (http://www.mathematik.com/Petersburg/Petersburg.html)


Logic
All horses are the same color
The horse paradox is a falsidical paradox that arises from flawed demonstrations, which purport to use
mathematical induction, of the statement All horses are the same color. There is no actual contradiction, as these
arguments have a crucial flaw that makes them incorrect. This example was used by Joel E. Cohen as an example of
the subtle errors that can occur in attempts to prove statements by induction.[1]

The argument
The argument is a proof by induction. First we establish a base case for one horse (n = 1). We then prove that if n
horses have the same color, then n + 1 horses must also have the same color.

Base case: One horse


The case with just one horse is trivial. If there is only one horse in the "group", then clearly all horses in that group
have the same color.

Inductive step
Assume that n horses always are the same color. Let us consider a group consisting of n + 1 horses.

First, exclude the last horse and look only at the first n horses; all these are the same color since n horses always
are the same color. Likewise, exclude the first horse and look only at the last n horses. These too must all be of
the same color. Therefore, the first horse in the group is of the same color as the horses in the middle, who in turn are
of the same color as the last horse. Hence the first horse, middle horses, and last horse are all of the same color, and
we have proven that:

If n horses have the same color, then n + 1 horses will also have the same color.

We already saw in the base case that the rule ("all horses have the same color") was valid for n = 1. The inductive
step showed that since the rule is valid for n = 1, it must also be valid for n = 2, which in turn implies that the
rule is valid for n = 3 and so on.

Thus in any group of horses, all horses must be the same color.

Explanation
The argument above makes the implicit assumption that the two subsets of horses to which the induction assumption
is applied have a common element. This is not true when the original set (prior to either removal) only contains two
horses.
Let the two horses be horse A and horse B. When horse A is removed, it is true that the remaining horses in the set
are the same color (only horse B remains). If horse B is removed instead, this leaves a different set containing only
horse A, which may or may not be the same color as horse B.
The problem in the argument is the assumption that because each of these two sets contains only one color of horses,
the original set also contained only one color of horses. Because there are no common elements (horses) in the two
sets, it is unknown whether the two horses share the same color. The proof forms a falsidical paradox; it seems to
show something manifestly false by valid reasoning, but in fact the reasoning is flawed.
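The missing overlap can be exhibited directly. In this small illustrative sketch (not from the source), horses are numbered 0 to n in a group of n + 1:

    def overlap(n: int) -> set:
        """Common members of the two subgroups used in the inductive step."""
        without_last = set(range(0, n))        # horses 0 .. n-1
        without_first = set(range(1, n + 1))   # horses 1 .. n
        return without_last & without_first

    print(overlap(1))  # set(): no shared horse, so the step fails for n = 1
    print(overlap(2))  # {1}: for n >= 2 the two subgroups genuinely overlap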


References
[1] Cohen, Joel E. Reprinted in A Random Walk in Science (R. L. Weber, ed.), Crane, Russak & Co., 1973.


Barbershop paradox
This article is about a paradox in the theory of logical conditionals introduced by Lewis Carroll in "A Logical
Paradox [1]". For an unrelated paradox of self-reference with a similar name, attributed to Bertrand Russell, see
Barber paradox.
The Barbershop paradox was proposed by Lewis Carroll in a three-page essay titled "A Logical Paradox", which
appeared in the July 1894 issue of Mind. The name comes from the "ornamental" short story that Carroll uses to
illustrate the paradox (although it had appeared several times in more abstract terms in his writing and
correspondence before the story was published). Carroll claimed that it illustrated "a very real difficulty in the
Theory of Hypotheticals" in use at the time. Modern logicians would not regard it as a paradox but simply as a
logical error on the part of Carroll.

The paradox
Briefly, the story runs as follows: Uncle Joe and Uncle Jim are walking to the barber shop. There are three barbers
who live and work in the shopAllen, Brown, and Carrbut not all of them are always in the shop. Carr is a good
barber, and Uncle Jim is keen to be shaved by him. He knows that the shop is open, so at least one of them must be
in. He also knows that Allen is a very nervous man, so that he never leaves the shop without Brown going with him.
Uncle Joe insists that Carr is certain to be in, and then claims that he can prove it logically. Uncle Jim demands the
proof. Uncle Joe reasons as follows.
Suppose that Carr is out. If Carr is out, then if Allen is also out Brown would have to be in, since someone must be in
the shop for it to be open. However, we know that whenever Allen goes out he takes Brown with him, and thus we
know as a general rule that if Allen is out, Brown is out. So if Carr is out then the statements "if Allen is out then
Brown is in" and "if Allen is out then Brown is out" would both be true at the same time.
Uncle Joe notes that this seems paradoxical; the two "hypotheticals" seem "incompatible" with each other. So, by
contradiction, Carr must logically be in.
However, the correct conclusion to draw from the incompatibility of the two "hypotheticals" is that what is
hypothesised in them (that Allen is out) must be false under our assumption that Carr is out. Then our logic simply
allows us to arrive at the conclusion "If Carr is out, then Allen must necessarily be in".


Simplification
Carroll wrote this story to illustrate a controversy in the field of logic that was raging at the time. His vocabulary
and writing style can easily add to the confusion of the core issue for modern readers.

Notation
When reading the original it may help to keep the following in mind:
What Carroll called "hypotheticals" modern logicians call "logical conditionals".
Whereas Uncle Joe concludes his proof reductio ad absurdum, modern mathematicians would more commonly
claim "proof by contradiction".
What Carroll calls the protasis of a conditional is now known as the antecedent, and similarly the apodosis is now
called the consequent.
Symbols can be used to greatly simplify logical statements such as those inherent in this story:

Operator (Name)             Colloquial     Symbolic
Negation (NOT)              not X          ¬X
Conjunction (AND)           X and Y        X ∧ Y
Disjunction (OR)            X or Y         X ∨ Y
Conditional (IF ... THEN)   if X then Y    X ⇒ Y

Note: X ⇒ Y (also known as "Implication") can be read many ways in English, from "X is sufficient for Y" to "Y
follows from X." (See also Table of mathematical symbols.)

Restatement
To aid in restating Carroll's story more simply, we will take the following atomic statements:
A = Allen is in the shop
B = Brown is in
C = Carr is in
So, for instance (¬A ∧ B) represents "Allen is out and Brown is in".
Uncle Jim gives us our two axioms:
1. There is at least one barber in the shop now (A ∨ B ∨ C)
2. Allen never leaves the shop without Brown (¬A ⇒ ¬B)
Uncle Joe presents a proof:

Abbreviated English with logical markers                    Mainly symbolic
Suppose Carr is NOT in.                                     H0: ¬C
Given NOT C, IF Allen is NOT in THEN Brown must be in,
to satisfy Axiom 1 (A1).                                    By H0 and A1, ¬A ⇒ B
But Axiom 2 (A2) gives that it is universally true that
IF Allen is not in THEN Brown is not in
(it's always true that if ¬A then ¬B).                      By A2, ¬A ⇒ ¬B
So far we have that NOT C yields both
(NOT A THEN B) AND (NOT A THEN NOT B).                      Thus ¬C ⇒ ( (¬A ⇒ B) ∧ (¬A ⇒ ¬B) )
Uncle Joe claims that these are contradictory.
Therefore Carr must be in.

Uncle Joe basically makes the argument that (¬A ⇒ B) and (¬A ⇒ ¬B) are contradictory, saying that the same
antecedent cannot result in two different consequents.


This purported contradiction is the crux of Joe's "proof." Carroll presents this intuition-defying result as a paradox,
hoping that the contemporary ambiguity would be resolved.

Discussion
In modern logic theory this scenario is not a paradox. The law of implication reconciles what Uncle Joe claims are
incompatible hypotheticals. This law states that "if X then Y" is logically identical to "X is false or Y is true"
(¬X ∨ Y). For example, given the statement "if you press the button then the light comes on", it must be true at any
given moment that either you have not pressed the button, or the light is on.
In short, what obtains is not that ¬C yields a contradiction, only that it necessitates A, because ¬A is what actually
yields the contradiction.
In this scenario, that means Carr doesn't have to be in, but that if he isn't in, Allen has to be in.
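This can be checked by brute force over the eight truth assignments. The sketch below (an illustration, not from Carroll's essay) lists every situation satisfying the two axioms in which Carr is out:

    from itertools import product

    # A = Allen is in, B = Brown is in, C = Carr is in.
    # Axiom 1: someone is in (A or B or C).
    # Axiom 2: if Allen is out then Brown is out, i.e. (A or not B).
    models = [(a, b, c) for a, b, c in product([True, False], repeat=3)
              if (a or b or c) and (a or not b)]

    for a, b, c in models:
        if not c:
            print(f"Carr out: Allen {'in' if a else 'out'}, Brown {'in' if b else 'out'}")
    # Both surviving cases have Allen in, matching the conclusion ¬C ⇒ A.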

Simplifying to Axiom 1
Applying the law of implication to the offending conditionals shows that, rather than contradicting each other, one
simply reiterates the fact that since the shop is open one or more of Allen, Brown or Carr is in, and the other puts very
little restriction on who can or cannot be in the shop.
To see this let's attack Jim's large "contradictory" result, mainly by applying the law of implication repeatedly. First
let's break down one of the two offending conditionals:

"If Allen is out, then Brown is out"    (¬A ⇒ ¬B)
"Allen is in or Brown is out"           (A ∨ ¬B)

Substituting this into

"IF Carr is out, THEN If Allen is also out Then Brown is in AND If Allen is out Then Brown is
out."    ¬C ⇒ ( (¬A ⇒ B) ∧ (¬A ⇒ ¬B) )

Which yields, with continued application of the law of implication,

"IF Carr is out, THEN if Allen is also out, Brown is in AND either Allen is in OR Brown is out."    ¬C ⇒ ( (¬A ⇒ B) ∧ (A ∨ ¬B) )
"IF Carr is out, THEN both of these are true: Allen is in OR Brown is in AND Allen is in OR Brown
is out."    ¬C ⇒ ( (A ∨ B) ∧ (A ∨ ¬B) )
"Carr is in OR both of these are true: Allen is in OR Brown is in AND Allen is in OR Brown is out."    C ∨ ( (A ∨ B) ∧ (A ∨ ¬B) )

And finally, (on the right we are distributing over the parentheses)

"Carr is in OR Either Allen is in OR Brown is in, AND Carr is in OR Either Allen is in OR Brown is
out."    ( C ∨ (A ∨ B) ) ∧ ( C ∨ (A ∨ ¬B) )
"Inclusively, Carr is in OR Allen is in OR Brown is in, AND Inclusively, Carr is in OR Allen is in OR
Brown is out."    (C ∨ A ∨ B) ∧ (C ∨ A ∨ ¬B)

So the two statements which become true at once are: "One or more of Allen, Brown or Carr is in," which is simply
Axiom 1, and "Carr is in or Allen is in or Brown is out." Clearly one way that both of these statements can become
true at once is in the case where Allen is in (because Allen's house is the barber shop, and at some point Brown left
the shop).
Another way to describe how (X ⇒ Y) ⇔ (¬X ∨ Y) resolves this into a valid set of statements is to rephrase Jim's
statement that "If Allen is also out ..." into "If Carr is out and Allen is out then Brown is in" ( (¬C ∧ ¬A) ⇒ B ).


Showing conditionals compatible


The two conditionals are not logical opposites: to prove by contradiction Jim needed to show ¬C ⇒ (Z ∧ ¬Z), where
Z happens to be a conditional.
The opposite of (¬A ⇒ B) is ¬(¬A ⇒ B), which, using De Morgan's Law, resolves to (¬A ∧ ¬B), which is not at all the
same thing as (A ∨ ¬B), which is what (¬A ⇒ ¬B) reduces to.
This confusion about the "compatibility" of these two conditionals was foreseen by Carroll, who includes a mention
of it at the end of the story. He attempts to clarify the issue by arguing that the protasis and apodosis of the
implication "If Carr is in ..." are "incorrectly divided." However, application of the Law of Implication removes the
"If ..." entirely (reducing to disjunctions), so no protasis and apodosis exist and no counter-argument is needed.

Notes
[1] http://fair-use.org/mind/1894/07/notes/a-logical-paradox

Further reading
Russell, Bertrand (1903). "Chapter II. Symbolic Logic" (http://fair-use.org/bertrand-russell/the-principles-of-mathematics/s.19#s19n1). The Principles of Mathematics. p. 19 n. 1. ISBN 0-415-48741-2.
Russell suggests a truth-functional notion of logical conditionals, which (among other things) entails that a false
proposition will imply all propositions. In a note he mentions that his theory of implication would dissolve
Carroll's paradox, since it not only allows, but in fact requires that both "p implies q" and "p implies not-q" be
true, so long as p is not true.

Carroll's paradox
This article is about the physical paradox. For the Lewis Carroll dialogue, see What the Tortoise Said to Achilles.
In physics, Carroll's paradox arises when considering the motion of a falling rigid rod that is specially constrained.
Considered one way, the angular momentum stays constant; considered in a different way, it changes. It is named
after Michael M. Carroll who first published it in 1984.

Explanation
Consider two concentric circles of radius r₁ and r₂, as might be drawn on the face of a wall clock. Suppose a
uniform rigid heavy rod of length ℓ = r₂ - r₁ is somehow constrained between these two circles so that one end
of the rod remains on the inner circle and the other remains on the outer circle. Motion of the rod along these circles,
acting as guides, is frictionless. The rod is held in the three o'clock position so that it is horizontal, then released.
Now consider the angular momentum about the centre of the rod:
1. After release, the rod falls. Being constrained, it must rotate as it moves. When it gets to a vertical six o'clock
position, it has lost potential energy and, because the motion is frictionless, will have gained kinetic energy. It
therefore possesses angular momentum.
2. The reaction force on the rod from either circular guide is frictionless, so it must be directed along the rod; there
can be no component of the reaction force perpendicular to the rod. Taking moments about the center of the rod,
there can be no moment acting on the rod, so its angular momentum remains constant. Because the rod starts with
zero angular momentum, it must continue to have zero angular momentum for all time.
An apparent resolution of this paradox is that the physical situation cannot occur. To maintain the rod in a radial
position the circles have to exert an infinite force. In real life it would not be possible to construct guides that do not
exert a significant reaction force perpendicular to the rod. Victor Namias, however, disputed that infinite forces
occur, and argued that a finitely thick rod experiences torque about its center of mass even in the limit as it
approaches zero width.

References
Carroll, Michael M. (November 1984). "Singular constraints in rigid-body dynamics". American Journal of Physics 52 (11): 1010–1012. Bibcode: 1984AmJPh..52.1010C (http://adsabs.harvard.edu/abs/1984AmJPh..52.1010C). doi: 10.1119/1.13777 (http://dx.doi.org/10.1119/1.13777).
Namias, Victor (May 1986). "On an apparent paradox in the motion of a smoothly constrained rod". American Journal of Physics 54 (5): 440–445. Bibcode: 1986AmJPh..54..440N (http://adsabs.harvard.edu/abs/1986AmJPh..54..440N). doi: 10.1119/1.14610 (http://dx.doi.org/10.1119/1.14610).
Felszeghy, Stephen F. (1986). "On so-called singular constraints in rigid-body dynamics". American Journal of Physics 54: 585–586. Bibcode: 1986AmJPh..54..585F (http://adsabs.harvard.edu/abs/1986AmJPh..54..585F). doi: 10.1119/1.14533 (http://dx.doi.org/10.1119/1.14533).

Crocodile Dilemma
The crocodile paradox is a paradox in logic in the same family of paradoxes as the liar paradox. The premise states
that a crocodile, who has stolen a child, promises the father that his son will be returned if and only if he can
correctly predict whether or not the crocodile will return the child.
The transaction is logically smooth but unpredictable if the father guesses that the child will be returned, but a
dilemma arises for the crocodile if he guesses that the child will not be returned. In the case that the crocodile
decides to keep the child, he violates his terms: the father's prediction has been validated, and the child should be
returned. However, in the case that the crocodile decides to give back the child, he still violates his terms, even if this
decision is based on the previous result: the father's prediction has been falsified, and the child should not be
returned. The question of what the crocodile should do is therefore paradoxical, and there is no justifiable solution.
The crocodile dilemma serves to expose some of the logical problems presented by metaknowledge. In this regard, it
is similar in construction to the unexpected hanging paradox, which Richard Montague (1960) used to demonstrate
that the following assumptions about knowledge are inconsistent when tested in combination:
(i) If φ is known to be true, then φ.
(ii) It is known that (i).
(iii) If φ implies ψ, and φ is known to be true, then ψ is also known to be true.
It also bears similarities to the liar paradox. Ancient Greek sources were the first to discuss the crocodile dilemma.

Drinker paradox
The drinker paradox (also known as the drinker's principle or (the) drinking principle) is a theorem of classical
predicate logic, usually stated in natural language as: There is someone in the pub such that, if he is drinking,
everyone in the pub is drinking. The actual theorem is

∃x∈P. [ D(x) → ∀y∈P. D(y) ]

where D is an arbitrary predicate and P is an arbitrary set. The paradox was popularised by the mathematical logician
Raymond Smullyan, who called it the "drinking principle" in his 1978 book What Is the Name of this Book?

Proofs of the paradox


The proof begins by recognizing it is true that either everyone in the pub is drinking, or at least one person in the pub
is not drinking. Consequently, there are two cases to consider:
1. Suppose everyone is drinking. For any particular person, it cannot be wrong to say that if that particular person is
drinking, then everyone in the pub is drinking, because everyone is, in fact, drinking. The conditional holds because
its consequent is true, and since "everybody" includes that person, he is drinking as well.
2. Suppose that at least one person is not drinking. For any particular nondrinking person, it still cannot be wrong to
say that if that particular person is drinking, then everyone in the pub is drinking because that person is, in
fact, not drinking. In this case the condition is false, so the statement is vacuously true due to the nature of
material implication in formal logic, which states that "If P, then Q" is always true if P (the condition or
antecedent) is false.
Either way, there is someone in the pub such that, if he is drinking, everyone in the pub is drinking. A slightly more
formal way of expressing the above is to say that if everybody drinks then anyone can be the witness for the validity
of the theorem. And if someone does not drink, then that particular non-drinking individual can be the witness to the
theorem's validity.
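Since the pub is a finite domain, the theorem can also be verified exhaustively for small pubs. A minimal sketch (an illustration, not part of the article):

    from itertools import product

    def has_witness(drinks) -> bool:
        """True if some person x satisfies: if x drinks, everyone drinks."""
        everyone = all(drinks)
        return any((not d) or everyone for d in drinks)

    # Check every drinking pattern for pubs of 1 to 6 people.
    assert all(has_witness(pattern)
               for n in range(1, 7)
               for pattern in product([True, False], repeat=n))
    print("A witness exists in every non-empty pub checked.")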
The proof above is essentially model-theoretic (it can be formalized as such). A purely syntactic proof is possible and
can even be mechanized (in Otter for example), but only for an equisatisfiable rather than an equivalent negation of
the theorem.[1] Namely, the negation of the theorem is

¬∃x∈P. [ D(x) → ∀y∈P. D(y) ]

which is equivalent to the prenex normal form

∀x∈P. ∃y∈P. [ D(x) ∧ ¬D(y) ]

By Skolemization the above is equisatisfiable with

D(x) ∧ ¬D(f(x))

The resolution of the two clauses D(x) and ¬D(f(x)) results in an empty set of clauses (i.e. a contradiction),
thus proving the negation of the theorem is unsatisfiable. The resolution is slightly non-straightforward because it
involves a search based on Herbrand's theorem for ground instances that are propositionally unsatisfiable. The bound
variable x is first instantiated with a constant d (making use of the assumption that the domain is non-empty),
resulting in the Herbrand universe {d, f(d), f(f(d)), ...}.
The natural deduction can be spelled out as follows:
1. Instantiating x with d yields D(d) ∧ ¬D(f(d)), which implies ¬D(f(d)).
2. x is then instantiated with f(d), yielding D(f(d)) ∧ ¬D(f(f(d))), which implies D(f(d)).
Observe that D(f(d)) and ¬D(f(d)) unify syntactically in their predicate arguments. An (automated) search
thus finishes in two steps:
1. ¬D(f(d))
2. D(f(d))
The proof by resolution given here uses the law of excluded middle, the axiom of choice, and non-emptiness of the
domain as premises.

Discussion
This proof illustrates several properties of classical predicate logic that do not always agree with ordinary language.

Excluded middle
The above proof begins by saying that either everyone is drinking, or someone is not drinking. This uses the validity
of excluded middle for the statement
"everyone is drinking", which is always available in classical logic. If the
logic does not admit arbitrary excluded middlefor example if the logic is intuitionisticthen the truth of
must first be established, i.e.,

must be shown to be decidable.

Material versus indicative conditional


Most important to the paradox is that the conditional in classical (and intuitionistic) logic is the material conditional.
It has the property that A → B is true if B is true or if A is false (in classical logic, but not intuitionistic logic, this
is also a necessary condition).
So as it was applied here, the statement "if he is drinking, everyone is drinking" was taken to be correct in one case,
if everyone was drinking, and in the other case, if he was not drinking, even though his drinking may not have had
anything to do with anyone else's drinking.
In natural language, on the other hand, typically "if ... then ..." is used as an indicative conditional.

Non-empty domain
It is not necessary to assume there was anyone in the pub. The assumption that the domain is non-empty is built into
the inference rules of classical predicate logic. We can deduce D(x) from ∀x. D(x), but of course if the domain
were empty (in this case, if there were nobody in the pub), the proposition D(x) is not well-formed for any closed
expression x.
Nevertheless, if we allow empty domains we still have something like the drinker paradox in the form of the
theorem:

( ∃x. x∈P ) → ∃x∈P. [ D(x) → ∀y∈P. D(y) ]

Or in words:
If there is anyone in the pub at all, then there is someone such that, if he is drinking, then everyone in the pub
is drinking.


Temporal aspects
Although not discussed in formal terms by Smullyan, he hints that the verb "drinks" is also ambiguous by citing a
postcard written to him by two of his students, which contains the following dialogue (emphasis in original):
Logician / I know a fellow who is such that whenever he drinks, everyone does.
Student / I just don't understand. Do you mean, everyone on earth?
Logician / Yes, naturally.
Student / That sounds crazy! You mean as soon as he drinks, at just that moment, everyone does?
Logician / Of course.
Student / But that implies that at some time, everyone was drinking at once. Surely that never happened!

History and variations


Smullyan in his 1978 book attributes the naming of "The Drinking Principle" to his graduate students. He also
discusses variants (obtained by substituting D with other, more dramatic predicates):
"there is a woman on earth such that if she becomes sterile, the whole human race will die out." Smullyan writes
that this formulation emerged from a conversation he had with philosopher John Bacon.
A "dual" version of the Principle: "there is at least one person such that if anybody drinks, then he does."
As "Smullyan's Drinkers principle" or just "Drinkers' principle" it appears in H.P. Barendregt's "The quest for
correctness" (1996), accompanied by some machine proofs. Since then it has made regular appearance as an example
in publications about automated reasoning; it is sometimes used to contrast the expressiveness of proof assistants.[2]

References
[1] Marc Bezem, Dimitri Hendriks (2008) Clausification in Coq (http://igitur-archive.library.uu.nl/lg/2008-0402-200713/preprint187.pdf)
[2] Freek Wiedijk. 2001. Mizar Light for HOL Light (http://www.cs.ru.nl/~freek/mizar/miz.pdf). In Proceedings of the 14th International Conference on Theorem Proving in Higher Order Logics (TPHOLs '01), Richard J. Boulton and Paul B. Jackson (Eds.). Springer-Verlag, London, UK, 378-394.

Infinite regress
For the Star Trek: Voyager episode, see Infinite Regress.
An infinite regress in a series of propositions arises if the truth of proposition P1 requires the support of proposition
P2, the truth of proposition P2 requires the support of proposition P3, ... , and the truth of proposition Pn-1 requires
the support of proposition Pn and n approaches infinity.
Distinction is made between infinite regresses that are "vicious" and those that are not.

Aristotle
Aristotle argued that knowing does not necessitate an infinite regress because some knowledge does not depend on
demonstration:
Some hold that, owing to the necessity of knowing the primary premises, there is no scientific knowledge.
Others think there is, but that all truths are demonstrable. Neither doctrine is either true or a necessary
deduction from the premises. The first school, assuming that there is no way of knowing other than by
demonstration, maintain that an infinite regress is involved, on the ground that if behind the prior stands no
primary, we could not know the posterior through the prior (wherein they are right, for one cannot traverse an
infinite series): if on the other hand they say the series terminates and there are primary premises, yet these
are unknowable because incapable of demonstration, which according to them is the only form of knowledge.
And since thus one cannot know the primary premises, knowledge of the conclusions which follow from them
is not pure scientific knowledge nor properly knowing at all, but rests on the mere supposition that the
premises are true. The other party agree with them as regards knowing, holding that it is only possible by
demonstration, but they see no difficulty in holding that all truths are demonstrated, on the ground that
demonstration may be circular and reciprocal.
Our own doctrine is that not all knowledge is demonstrative: on the contrary, knowledge of the immediate
premises is independent of demonstration. (The necessity of this is obvious; for since we must know the prior
premises from which the demonstration is drawn, and since the regress must end in immediate truths, those
truths must be indemonstrable.) Such, then, is our doctrine, and in addition we maintain that besides scientific
knowledge there is its original source which enables us to recognize the definitions.
Aristotle, Posterior Analytics I.3 72b1-15

Consciousness
Infinite regress in consciousness is the formation of an infinite series of "inner observers" as we ask the question of
who is observing the output of the neural correlates of consciousness in the study of subjective consciousness.

Optics
Infinite regress in optics is the formation of an infinite series of receding images created in two parallel facing
mirrors. See optical feedback.

Lottery paradox
Henry E. Kyburg, Jr.'s lottery paradox (1961, p. 197) arises from considering a fair 1000-ticket lottery that has
exactly one winning ticket. If this much is known about the execution of the lottery it is therefore rational to accept
that some ticket will win. Suppose that an event is very likely only if the probability of it occurring is greater than
0.99. On these grounds it is presumed rational to accept the proposition that ticket 1 of the lottery will not win. Since
the lottery is fair, it is rational to accept that ticket 2 won't win either; indeed, it is rational to accept for any
individual ticket i of the lottery that ticket i will not win. However, accepting that ticket 1 won't win, accepting that
ticket 2 won't win, and so on until accepting that ticket 1000 won't win: that entails that it is rational to accept that no
ticket will win, which entails that it is rational to accept the contradictory proposition that one ticket wins and no
ticket wins.
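The numbers are easy to check. A small illustrative sketch (the 0.99 threshold is the one assumed above):

    threshold = 0.99
    p_ticket_i_loses = 999 / 1000                # fair lottery, exactly one winner
    print(p_ticket_i_loses > threshold)          # True for every individual ticket

    p_every_ticket_loses = 0.0                   # some ticket certainly wins
    print(p_every_ticket_loses > threshold)      # False: the conjunction fails

Each conjunct clears the acceptance threshold while their conjunction has probability zero, which is exactly the aggregation failure the paradox exploits.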
The lottery paradox was designed to demonstrate that three attractive principles governing rational acceptance lead
to contradiction, namely that
1. it is rational to accept a proposition that is very likely true,
2. it is irrational to accept a proposition that is known to be inconsistent, and
3. if it is rational to accept a proposition A and it is rational to accept another proposition A', then it is rational to
accept A & A',
are jointly inconsistent.
The paradox remains of continuing interest because it raises several issues at the foundations of knowledge
representation and uncertain reasoning: the relationships between fallibility, corrigible belief and logical
consequence; the roles that consistency, statistical evidence and probability play in belief fixation; the precise
normative force that logical and probabilistic consistency have on rational belief.

History
Although the first published statement of the lottery paradox appears in Kyburg's 1961 Probability and the Logic of
Rational Belief, the first formulation of the paradox appears in his "Probability and Randomness," a paper delivered
at the 1959 meeting of the Association for Symbolic Logic, and the 1960 International Congress for the History and
Philosophy of Science, but published in the journal Theoria in 1963. This paper is reprinted in Kyburg (1987).

Smullyan's variation
Raymond Smullyan presents the following variation on the lottery paradox: You are either inconsistent or conceited.
Since the human brain is finite, there are a finite number of propositions p₁, …, pₙ that you believe. But unless you are
conceited, you know that you sometimes make mistakes, and that not everything you believe is true. Therefore, if
you are not conceited, you know that at least some of the pᵢ are false. Yet you believe each of the pᵢ individually. This
is an inconsistency. (Smullyan 1978, p. 206)

A short guide to the literature


The lottery paradox has become a central topic within epistemology, and the enormous literature surrounding this
puzzle threatens to obscure its original purpose. Kyburg proposed the thought experiment to get across a feature of
his innovative ideas on probability (Kyburg 1961, Kyburg and Teng 2001), which are built around taking the first
two principles above seriously and rejecting the last. For Kyburg, the lottery paradox isn't really a paradox: his
solution is to restrict aggregation.

Even so, for orthodox probabilists the second and third principles are primary, so the first principle is rejected. Here
too you'll see claims that there is really no paradox but an error: the solution is to reject the first principle, and with it
the idea of rational acceptance. For anyone with basic knowledge of probability, the first principle should be
rejected: for a very likely event, the rational belief about that event is just that it is very likely, not that it is true.
Most of the literature in epistemology approaches the puzzle from the orthodox point of view and grapples with the
particular consequences faced by doing so, which is why the lottery is associated with discussions of skepticism
(e.g., Klein 1981), and conditions for asserting knowledge claims (e.g., J. P. Hawthorne 2004). It is common to also
find proposed resolutions to the puzzle that turn on particular features of the lottery thought experiment (e.g., Pollock
1986), which then invites comparisons of the lottery to other epistemic paradoxes, such as David Makinson's preface
paradox, and to "lotteries" having a different structure. This strategy is addressed in (Kyburg 1997) and also in
(Wheeler 2007). An extensive bibliography is included in (Wheeler 2007).
Philosophical logicians and AI researchers have tended to be interested in reconciling weakened versions of the three
principles, and there are many ways to do this, including Jim Hawthorne and Luc Bovens's (1999) logic of belief,
Gregory Wheeler's (2006) use of 1-monotone capacities, Bryson Brown's (1999) application of preservationist
paraconsistent logics, Igor Douven and Timothy Williamson's (2006) appeal to cumulative non-monotonic logics,
Horacio Arlo-Costa's (2007) use of minimal model (classical) modal logics, and Joe Halpern's (2003) use of
first-order probability.
Finally, philosophers of science, decision scientists, and statisticians are inclined to see the lottery paradox as an
early example of the complications one faces in constructing principled methods for aggregating uncertain
information, which is now a discipline of its own, with a dedicated journal, Information Fusion, in addition to
continuous contributions to general area journals.

Selected References
Arlo-Costa, H. (2005). "Non-Adjunctive Inference and Classical Modalities", The Journal of Philosophical Logic, 34, 581–605.
Brown, B. (1999). "Adjunction and Aggregation", Nous, 33(2), 273–283.
Douven and Williamson (2006). "Generalizing the Lottery Paradox", The British Journal for the Philosophy of Science, 57(4), pp. 755–779.
Halpern, J. (2003). Reasoning about Uncertainty, Cambridge, MA: MIT Press.
Hawthorne, J. and Bovens, L. (1999). "The Preface, the Lottery, and the Logic of Belief", Mind, 108: 241–264.
Hawthorne, J.P. (2004). Knowledge and Lotteries, New York: Oxford University Press.
Klein, P. (1981). Certainty: a Refutation of Scepticism, Minneapolis, MN: University of Minnesota Press.
Kyburg, H.E. (1961). Probability and the Logic of Rational Belief, Middletown, CT: Wesleyan University Press.
Kyburg, H. E. (1983). Epistemology and Inference, Minneapolis, MN: University of Minnesota Press.
Kyburg, H. E. (1997). "The Rule of Adjunction and Reasonable Inference", Journal of Philosophy, 94(3), 109–125.
Kyburg, H. E., and Teng, C-M. (2001). Uncertain Inference, Cambridge: Cambridge University Press.
Lewis, D. (1996). "Elusive Knowledge", Australasian Journal of Philosophy, 74, pp. 549–67.
Makinson, D. (1965). "The Paradox of the Preface", Analysis, 25: 205–207.
Pollock, J. (1986). "The Paradox of the Preface", Philosophy of Science, 53, pp. 246–258.
Smullyan, Raymond (1978). What is the name of this book?. Prentice-Hall. p. 206. ISBN 0-13-955088-7.
Wheeler, G. (2006). "Rational Acceptance and Conjunctive/Disjunctive Absorption", Journal of Logic, Language, and Information, 15(1-2): 49–53.
Wheeler, G. (2007). "A Review of the Lottery Paradox", in William Harper and Gregory Wheeler (eds.) Probability and Inference: Essays in Honour of Henry E. Kyburg, Jr., King's College Publications, pp. 1–31.


External links
Links to James Hawthorne's papers on the logic of nonmonotonic conditionals (and Lottery Logic) [1]

References
[1] http://faculty-staff.ou.edu/H/James.A.Hawthorne-1/

Paradoxes of material implication


The paradoxes of material implication are a group of formulas which are truths of classical logic, but which are
intuitively problematic. One of these paradoxes is the paradox of entailment.
The root of the paradoxes lies in a mismatch between the interpretation of the validity of logical implication in
natural language, and its formal interpretation in classical logic, dating back to George Boole's algebraic logic. In
classical logic, implication describes conditional if-then statements using a truth-functional interpretation, i.e. "p
implies q" is defined to be "it is not the case that p is true and q false". Also, "p implies q" is equivalent to "p is false
or q is true". For example, "if it is raining, then I will bring an umbrella", is equivalent to "it is not raining, or I will
bring an umbrella, or both". This truth-functional interpretation of implication is called material implication or
material conditional.
The paradoxes are logical statements which are true but whose truth is intuitively surprising to people who are not
familiar with them. If the terms 'p', 'q' and 'r' stand for arbitrary propositions then the main paradoxes are given
formally as follows:
1. (p ∧ ¬p) → q : p and its negation imply q. This is the paradox of entailment.
2. p → (q → p) : if p is true then it is implied by every q.
3. ¬p → (p → q) : if p is false then it implies every q. This is referred to as 'explosion'.
4. p → (q ∨ ¬q) : either q or its negation is true, so their disjunction is implied by every p.
5. (p → q) ∨ (q → r) : if p, q and r are three arbitrary propositions, then either p implies q or q implies r. This is
because if q is true then p implies it, and if it is false then q implies any other statement. Since r can be p, it
follows that given two arbitrary propositions, one must imply the other, even if they are mutually contradictory.
For instance, "Nadia is in Barcelona implies Nadia is in Madrid, or Nadia is in Madrid implies Nadia is in
Barcelona." This truism sounds like nonsense in ordinary discourse.
6. ¬(p → q) → (p ∧ ¬q) : if p does not imply q then p is true and q is false. NB if p were false then it would
imply q, so p is true. If q were also true then p would imply q, hence q is false. This paradox is particularly
surprising because it tells us that if one proposition does not imply another then the first is true and the second
false.
The paradoxes of material implication arise because of the truth-functional definition of material implication, which
is said to be true merely because the antecedent is false or the consequent is true. By this criterion, "If the moon is
made of green cheese, then the world is coming to an end," is true merely because the moon isn't made of green
cheese. By extension, any contradiction implies anything whatsoever, since a contradiction is never true. (All
paraconsistent logics must, by definition, reject (1) as false.) Also, any tautology is implied by anything whatsoever,
since a tautology is always true.
To sum up, although it is deceptively similar to what we mean by "logically follows" in ordinary usage, material
implication does not capture the meaning of "if... then".
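Each of the six formulas can be verified mechanically with a truth table. A short sketch (not part of the article):

    from itertools import product

    def implies(a: bool, b: bool) -> bool:
        """Material implication: 'a implies b' is 'not a, or b'."""
        return (not a) or b

    formulas = {
        "(p & ~p) -> q":         lambda p, q, r: implies(p and not p, q),
        "p -> (q -> p)":         lambda p, q, r: implies(p, implies(q, p)),
        "~p -> (p -> q)":        lambda p, q, r: implies(not p, implies(p, q)),
        "p -> (q | ~q)":         lambda p, q, r: implies(p, q or not q),
        "(p -> q) | (q -> r)":   lambda p, q, r: implies(p, q) or implies(q, r),
        "~(p -> q) -> (p & ~q)": lambda p, q, r: implies(not implies(p, q),
                                                         p and not q),
    }

    for name, f in formulas.items():
        assert all(f(p, q, r) for p, q, r in product([True, False], repeat=3))
    print("All six formulas are classical tautologies.")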


Paradox of entailment
As the best known of the paradoxes, and most formally simple, the paradox of entailment makes the best
introduction.
In natural language, an instance of the paradox of entailment arises:
It is raining
And
It is not raining
Therefore
George Washington was made of rakes.
This arises from the principle of explosion, a law of classical logic stating that inconsistent premises always make an
argument valid; that is, inconsistent premises imply any conclusion at all. This seems paradoxical, as it suggests that
the above is a valid argument.

Understanding the paradox of entailment


Validity is defined in classical logic as follows:
An argument (consisting of premises and a conclusion) is valid if and only if there is no possible situation in
which all the premises are true and the conclusion is false.
For example a valid argument might run:
If it is raining, water exists (1st premise)
It is raining (2nd premise)
Water exists (Conclusion)
In this example there is no possible situation in which the premises are true while the conclusion is false. Since there
is no counterexample, the argument is valid.
But one could construct an argument in which the premises are inconsistent. This would satisfy the test for a valid
argument since there would be no possible situation in which all the premises are true and therefore no possible
situation in which all the premises are true and the conclusion is false.
For example an argument with inconsistent premises might run:
Matter has mass (1st premise; true)
Matter does not have mass (2nd premise; false)
All numbers are equal to 12 (Conclusion)
As there is no possible situation where both premises could be true, then there is certainly no possible situation in
which the premises could be true while the conclusion was false. So the argument is valid whatever the conclusion
is; inconsistent premises imply all conclusions.

Explaining the paradox


The strangeness of the paradox of entailment comes from the fact that the definition of validity in classical logic does
not always agree with the use of the term in ordinary language. In everyday use validity suggests that the premises
are consistent. In classical logic, the additional notion of soundness is introduced. A sound argument is a valid
argument with all true premises. Hence a valid argument with an inconsistent set of premises can never be sound. A
suggested improvement to the notion of logical validity to eliminate this paradox is relevant logic.


Simplification
The classical paradox formulas are closely tied to the formula

(p ∧ q) → p,

the principle of Simplification, which can be derived from the paradox formulas rather easily (e.g. from (1) by
Importation). In addition, there are serious problems with trying to use material implication as representing the
English "if ... then ...". For example, the following are valid inferences:
1. (p → q) ∧ (r → s) ⊢ (p → s) ∨ (r → q)
2. (p ∧ q) → r ⊢ (p → r) ∨ (q → r)
but mapping these back to English sentences using "if" gives paradoxes. The first might be read "If John is in
London then he is in England, and if he is in Paris then he is in France. Therefore, it is either true that (a) if John is in
London then he is in France, or (b) that if he is in Paris then he is in England." Using material implication, if John
really is in London, then (since he is not in Paris) (b) is true; whereas if he is in Paris, then (a) is true. Since he
cannot be in both places, the conclusion that at least one of (a) or (b) is true is valid.
But this does not match how "if ... then ..." is used in natural language: the most likely scenario in which one would
say "If John is in London then he is in England" is if one does not know where John is, but nonetheless knows that if
he is in London, he is in England. Under this interpretation, both premises are true, but both clauses of the
conclusion are false.
The second example can be read "If both switch A and switch B are closed, then the light is on. Therefore, it is either
true that if switch A is closed, the light is on, or that if switch B is closed, the light is on." Here, the most likely
natural-language interpretation of the "if ... then ..." statements would be "whenever switch A is closed, the light is
on", and "whenever switch B is closed, the light is on". Again, under this interpretation both clauses of the
conclusion may be false (for instance in a series circuit, with a light that only comes on when both switches are
closed).
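The divergence between the material reading and the "whenever" reading of the switch example can itself be checked by enumeration. A sketch under the stated series-circuit assumption (the light is on exactly when both switches are closed):

    from itertools import product

    def light(a: bool, b: bool) -> bool:
        return a and b  # series circuit: on iff both switches are closed

    states = list(product([True, False], repeat=2))

    # Material reading: in every state, (A -> light) or (B -> light) holds.
    material = all((not a or light(a, b)) or (not b or light(a, b))
                   for a, b in states)

    # "Whenever" reading: the conditional quantifies over all states.
    whenever_a = all(light(a, b) for a, b in states if a)
    whenever_b = all(light(a, b) for a, b in states if b)

    print(material)                  # True: classically valid, state by state
    print(whenever_a or whenever_b)  # False: both natural readings fail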

References

Bennett, J. A Philosophical Guide to Conditionals. Oxford: Clarendon Press. 2003.


Conditionals, ed. Frank Jackson. Oxford: Oxford University Press. 1991.
Etchemendy, J. The Concept of Logical Consequence. Cambridge: Harvard University Press. 1990.
Hazewinkel, Michiel, ed. (2001), "Strict implication calculus" [1], Encyclopedia of Mathematics, Springer,
ISBN 978-1-55608-010-4
Sanford, D. If P, Then Q: Conditionals and the Foundations of Reasoning. New York: Routledge. 1989.
Priest, G. An Introduction to Non-Classical Logic, Cambridge University Press. 2001.

References
[1] http://www.encyclopediaofmath.org/index.php?title=p/s090470

Raven paradox

[Images: a black raven, and non-black non-ravens. The raven paradox suggests that both of these images contribute
evidence to the supposition that all ravens are black.]
The raven paradox, also known as Hempel's paradox or Hempel's ravens, is a paradox arising from the question
of what constitutes evidence for a statement. Observing objects that are neither black nor ravens may formally
increase the likelihood that all ravens are black even though, intuitively, these observations are unrelated.
This problem was proposed by the logician Carl Gustav Hempel in the 1940s to illustrate a contradiction between
inductive logic and intuition. A related issue is the problem of induction and the gap between inductive and
deductive reasoning.[1]

The paradox
Hempel describes the paradox in terms of the hypothesis:[2][3]
(1) All ravens are black.
In strict logical terms, via contraposition, this statement is equivalent to:
(2) Everything that is not black is not a raven.
It should be clear that in all circumstances where (2) is true, (1) is also true; and likewise, in all circumstances where
(2) is false (i.e. if a world is imagined in which something that was not black, yet was a raven, existed), (1) is also
false. This establishes logical equivalence.
Given a general statement such as all ravens are black, a form of the same statement that refers to a specific
observable instance of the general class would typically be considered to constitute evidence for that general
statement. For example,
(3) Nevermore, my pet raven, is black.
is evidence supporting the hypothesis that all ravens are black.
The paradox arises when this same process is applied to statement (2). On sighting a green apple, one can observe:
(4) This green (and thus not black) thing is an apple (and thus not a raven).
By the same reasoning, this statement is evidence that (2) everything that is not black is not a raven. But since (as
above) this statement is logically equivalent to (1) all ravens are black, it follows that the sight of a green apple is
evidence supporting the notion that all ravens are black. This conclusion seems paradoxical, because it implies that
information has been gained about ravens by looking at an apple.


Proposed resolutions
Nicod's criterion says that only observations of ravens should affect one's view as to whether all ravens are black.
Observing more instances of black ravens should support the view, observing white or coloured ravens should
contradict it, and observations of non-ravens should not have any influence.[4]
Hempel's equivalence condition states that when a proposition, X, provides evidence in favor of another proposition
Y, then X also provides evidence in favor of any proposition which is logically equivalent to Y.[5]
With normal real-world expectations, the set of ravens is finite, while the set of non-black things is either infinite or
beyond human enumeration. To confirm the statement 'All ravens are black', it would be necessary to observe all
ravens; this is possible in principle. To confirm the statement 'All non-black things are non-ravens', it would be
necessary to examine all non-black things; this is not possible. Observing a black raven could be considered a
finite amount of confirmatory evidence, but observing a non-black non-raven would be an infinitesimal amount of
evidence.
The paradox shows that Nicod's criterion and Hempel's equivalence condition are not mutually consistent. A
resolution to the paradox must reject at least one out of:
1. negative instances having no influence (!PC),
2. equivalence condition (EC), or,
3. validation by positive instances (NC).
A satisfactory resolution should also explain why there naively appears to be a paradox. Solutions which accept the
paradoxical conclusion can do this by presenting a proposition which we intuitively know to be false but which is
easily confused with (PC), while solutions which reject (EC) or (NC) should present a proposition which we
intuitively know to be true but which is easily confused with (EC) or (NC).

Accepting non-ravens as relevant


Although this conclusion of the paradox seems counter-intuitive, some approaches accept that observations of
(coloured) non-ravens can in fact constitute valid evidence in support for hypotheses about (the universal blackness
of) ravens.
Hempel's resolution
Hempel himself accepted the paradoxical conclusion, arguing that the reason the result appears paradoxical is
because we possess prior information without which the observation of a non-black non-raven would indeed provide
evidence that all ravens are black.
He illustrates this with the example of the generalization "All sodium salts burn yellow", and asks us to consider the
observation which occurs when somebody holds a piece of pure ice in a colorless flame which does not turn yellow:
This result would confirm the assertion, "Whatever does not burn yellow is not sodium salt", and
consequently, by virtue of the equivalence condition, it would confirm the original formulation. Why does this
impress us as paradoxical? The reason becomes clear when we compare the previous situation with the case of
an experiment where an object whose chemical constitution is as yet unknown to us is held into a flame and
fails to turn it yellow, and where subsequent analysis reveals it to contain no sodium salt. This outcome, we
should no doubt agree, is what was to be expected on the basis of the hypothesis ... thus the data here obtained
constitute confirming evidence for the hypothesis.
In the seemingly paradoxical cases of confirmation, we are often not actually judging the relation of the given
evidence, E alone to the hypothesis H ... we tacitly introduce a comparison of H with a body of evidence
which consists of E in conjunction with an additional amount of information which we happen to have at our
disposal; in our illustration, this information includes the knowledge (1) that the substance used in the
experiment is ice, and (2) that ice contains no sodium salt. If we assume this additional information as given,


then, of course, the outcome of the experiment can add no strength to the hypothesis under consideration. But
if we are careful to avoid this tacit reference to additional knowledge ... the paradoxes vanish.
Standard Bayesian solution
One of the most popular proposed resolutions is to accept the conclusion that the observation of a green apple
provides evidence that all ravens are black but to argue that the amount of confirmation provided is very small, due
to the large discrepancy between the number of ravens and the number of non-black objects. According to this
resolution, the conclusion appears paradoxical because we intuitively estimate the amount of evidence provided by
the observation of a green apple to be zero, when it is in fact non-zero but extremely small.
I. J. Good's presentation of this argument in 1960[6] is perhaps the best known, and variations of the argument have
been popular ever since,[7] although it had been presented in 1958[8] and early forms of the argument appeared as
early as 1940.[9]
Good's argument involves calculating the weight of evidence provided by the observation of a black raven or a white
shoe in favor of the hypothesis that all the ravens in a collection of objects are black. The weight of evidence is the
logarithm of the Bayes factor, which in this case is simply the factor by which the odds of the hypothesis changes
when the observation is made. The argument goes as follows:
... suppose that there are $N$ objects that might be seen at any moment, of which $r$ are ravens and $b$ are black, and that the $N$ objects each have probability $1/N$ of being seen. Let $H_i$ be the hypothesis that there are $i$ non-black ravens, and suppose that the hypotheses $H_0, H_1, \ldots, H_r$ are initially equiprobable. Then, if we happen to see a black raven, the Bayes factor in favour of $H_0$ is

$\dfrac{P(\text{black raven} \mid H_0)}{\frac{1}{r}\sum_{i=1}^{r} P(\text{black raven} \mid H_i)} = \dfrac{r/N}{\frac{1}{r}\sum_{i=1}^{r} (r-i)/N} = \dfrac{2r}{r-1},$

i.e. about 2 if the number of ravens in existence is known to be large. But the factor if we see a white shoe is only

$\dfrac{P(\text{non-black non-raven} \mid H_0)}{\frac{1}{r}\sum_{i=1}^{r} P(\text{non-black non-raven} \mid H_i)} = \dfrac{N-b}{N-b-(r+1)/2},$

and this exceeds unity by only about $r/(2(N-b))$ if $N-b$ is large compared to $r$. Thus the weight of evidence provided by the sight of a white shoe is positive, but is small if the number of ravens is known to be small compared to the number of non-black objects.[10]
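The two factors can be checked numerically. The sketch below is an editor's illustration of the model just quoted, not Good's own calculation, and the values of N, r and b are invented for the example:

```python
from fractions import Fraction

def bayes_factor(lik_h0, lik_alternatives):
    # Bayes factor for H0 against the equiprobable mixture of H1..Hr.
    return lik_h0 / (sum(lik_alternatives, Fraction(0)) / len(lik_alternatives))

N, r, b = 10**6, 100, 5 * 10**5   # invented counts: objects, ravens, black things

# P(see a black raven | Hi) = (r - i)/N;  P(see a non-black non-raven | Hi) = (N - b - i)/N
black_raven = bayes_factor(Fraction(r, N), [Fraction(r - i, N) for i in range(1, r + 1)])
white_shoe = bayes_factor(Fraction(N - b, N), [Fraction(N - b - i, N) for i in range(1, r + 1)])

print(float(black_raven))   # ~2.02, i.e. about 2 = 2r/(r-1)
print(float(white_shoe))    # ~1.0001, exceeding unity by roughly r/(2(N-b))
```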
Many of the proponents of this resolution and variants of it have been advocates of Bayesian probability, and it is
now commonly called the Bayesian Solution, although, as Chihara[11] observes, "there is no such thing as the
Bayesian solution. There are many different 'solutions' that Bayesians have put forward using Bayesian techniques."
Noteworthy approaches using Bayesian techniques include Earman,[12] Eells,[13] Gibson,[14] Hosiasson-Lindenbaum,[15] Howson and Urbach,[16] Mackie,[17] and Hintikka,[18] who claims that his approach is "more Bayesian than the so-called 'Bayesian solution' of the same paradox." Bayesian approaches which make use of Carnap's theory of inductive inference include Humburg,[19] Maher,[20] and Fitelson et al.[21] Vranas[22] introduced the term "Standard Bayesian Solution" to avoid confusion.


Carnap approach
Maher[23] accepts the paradoxical conclusion, and refines it:
A non-raven (of whatever color) confirms that all ravens are black because
(i) the information that this object is not a raven removes the possibility that this object is a
counterexample to the generalization, and
(ii) it reduces the probability that unobserved objects are ravens, thereby reducing the probability that
they are counterexamples to the generalization.
In order to reach (ii), he appeals to Carnap's theory of inductive probability, which is (from the Bayesian point of
view) a way of assigning prior probabilities which naturally implements induction. According to Carnap's theory, the
posterior probability, $P(Fa \mid E)$, that an object, $a$, will have a predicate, $F$, after the evidence $E$ has been observed, is:

$P(Fa \mid E) = \dfrac{n_F + \lambda P(Fa)}{n + \lambda}$

where $P(Fa)$ is the initial probability that $a$ has the predicate $F$; $n$ is the number of objects which have been examined (according to the available evidence $E$); $n_F$ is the number of examined objects which turned out to have the predicate $F$; and $\lambda$ is a constant which measures resistance to generalization.

If $\lambda$ is close to zero, $P(Fa \mid E)$ will be very close to one after a single observation of an object which turned out to have the predicate $F$, while if $\lambda$ is much larger than $n$, $P(Fa \mid E)$ will be very close to $P(Fa)$ regardless of the fraction of observed objects which had the predicate $F$.
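The role of $\lambda$ is easy to see numerically. A minimal sketch (an editor's illustration of the formula above; the function name and the numbers are arbitrary):

```python
def carnap_posterior(prior, n, n_f, lam):
    # Carnap-style posterior that the next object has predicate F:
    # (n_F + lambda * initial probability) / (n + lambda)
    return (n_f + lam * prior) / (n + lam)

# lambda near zero: a single positive instance pushes the posterior close to 1
print(carnap_posterior(prior=0.5, n=1, n_f=1, lam=0.01))    # ~0.995
# lambda much larger than n: the posterior stays close to the prior
print(carnap_posterior(prior=0.5, n=10, n_f=10, lam=1000))  # ~0.505
```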


Using this Carnapian approach, Maher identifies a proposition which we intuitively (and correctly) know to be false,
but which we easily confuse with the paradoxical conclusion. The proposition in question is the proposition that
observing non-ravens tells us about the color of ravens. While this is intuitively false and is also false according to
Carnap's theory of induction, observing non-ravens (according to that same theory) causes us to reduce our estimate
of the total number of ravens, and thereby reduces the estimated number of possible counterexamples to the rule that
all ravens are black.
Hence, from the Bayesian-Carnapian point of view, the observation of a non-raven does not tell us anything about
the color of ravens, but it tells us about the prevalence of ravens, and supports "All ravens are black" by reducing our
estimate of the number of ravens which might not be black.
Role of background knowledge
Much of the discussion of the paradox in general and the Bayesian approach in particular has centred on the
relevance of background knowledge. Surprisingly, Maher shows that, for a large class of possible configurations of
background knowledge, the observation of a non-black non-raven provides exactly the same amount of confirmation
as the observation of a black raven. The configurations of background knowledge which he considers are those
which are provided by a sample proposition, namely a proposition which is a conjunction of atomic propositions,
each of which ascribes a single predicate to a single individual, with no two atomic propositions involving the same
individual. Thus, a proposition of the form "A is a black raven and B is a white shoe" can be considered a sample
proposition by taking "black raven" and "white shoe" to be predicates.
Maher's proof appears to contradict the result of the Bayesian argument, which was that the observation of a
non-black non-raven provides much less evidence than the observation of a black raven. The reason is that the
background knowledge which Good and others use can not be expressed in the form of a sample proposition - in
particular, variants of the standard Bayesian approach often suppose (as Good did in the argument quoted above) that
the total numbers of ravens, non-black objects and/or the total number of objects, are known quantities. Maher
comments that, "The reason we think there are more non-black things than ravens is because that has been true of the
things we have observed to date. Evidence of this kind can be represented by a sample proposition. But ... given any


sample proposition as background evidence, a non-black non-raven confirms A just as strongly as a black raven does
... Thus my analysis suggests that this response to the paradox [i.e. the Standard Bayesian one] cannot be correct."
Fitelson et al.[24] examined the conditions under which the observation of a non-black non-raven provides less evidence than the observation of a black raven. They show that, if $a$ is an object selected at random, $Ba$ is the proposition that the object is black, and $Ra$ is the proposition that the object is a raven, then a condition relating the probabilities of these propositions and their negations (a line over a proposition indicating its logical negation) is sufficient for the observation of a non-black non-raven to provide less evidence than the observation of a black raven.

This condition does not tell us how large the difference in the evidence provided is, but a later calculation in the same paper shows that the weight of evidence provided by a black raven exceeds that provided by a non-black non-raven by about $-\log P(Ba \mid Ra \wedge \overline{H})$, where $H$ is the hypothesis that all ravens are black. This is equal to the amount of additional information (in bits, if the base of the logarithm is 2) which is provided when a raven of unknown color is discovered to be black, given the hypothesis that not all ravens are black.

Fitelson et al. explain that:

Under normal circumstances, $P(Ba \mid Ra \wedge \overline{H})$ may be somewhere around 0.9 or 0.95; so $1/P(Ba \mid Ra \wedge \overline{H})$ is somewhere around 1.11 or 1.05. Thus, it may appear that a single instance of a black raven does not yield much more support than would a non-black non-raven. However, under plausible conditions it can be shown that a sequence of instances (i.e. of $n$ black ravens, as compared to $n$ non-black non-ravens) yields a ratio of likelihood ratios on the order of $\left[1/P(Ba \mid Ra \wedge \overline{H})\right]^n$, which blows up significantly for large $n$.
The authors point out that their analysis is completely consistent with the supposition that a non-black non-raven
provides an extremely small amount of evidence although they do not attempt to prove it; they merely calculate the
difference between the amount of evidence that a black raven provides and the amount of evidence that a non-black
non-raven provides.
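Writing $q$ for $P(Ba \mid Ra \wedge \overline{H})$, the compounding is straightforward to tabulate. A quick sketch, taking the quoted values of $q$ as given:

```python
# Ratio of likelihood ratios after n instances: (1/q)^n, with q = P(Ba | Ra, not-H)
for q in (0.9, 0.95):
    for n in (1, 10, 50):
        print(f"q={q}, n={n}: {(1 / q) ** n:.2f}")
# q=0.9 gives ~1.11, ~2.87, ~194; q=0.95 gives ~1.05, ~1.67, ~13
```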

Disputing the induction from positive instances


Some approaches for resolving the paradox focus on the inductive step. They dispute whether observation of a
particular instance (such as one black raven) is the kind of evidence that necessarily increases confidence in the
general hypothesis (such as that ravens are always black).
The red herring
Good gives an example of background knowledge with respect to which the observation of a black raven decreases
the probability that all ravens are black:
Suppose that we know we are in one or other of two worlds, and the hypothesis, H, under consideration is that
all the ravens in our world are black. We know in advance that in one world there are a hundred black ravens,
no non-black ravens, and a million other birds; and that in the other world there are a thousand black ravens,
one white raven, and a million other birds. A bird is selected equiprobably at random from all the birds in our
world. It turns out to be a black raven. This is strong evidence ... that we are in the second world, wherein not
all ravens are black.
Good concludes that the white shoe is a "red herring": Sometimes even a black raven can constitute evidence against
the hypothesis that all ravens are black, so the fact that the observation of a white shoe can support it is not surprising
and not worth attention. Nicod's criterion is false, according to Good, and so the paradoxical conclusion does not
follow.
Hempel rejected this as a solution to the paradox, insisting that the proposition 'c is a raven and is black' must be
considered "by itself and without reference to any other information", and pointing out that it "... was emphasized in

section 5.2(b) of my article in Mind ... that the very appearance of paradoxicality in cases like that of the white shoe
results in part from a failure to observe this maxim."[25]
The question which then arises is whether the paradox is to be understood in the context of absolutely no background
information (as Hempel suggests), or in the context of the background information which we actually possess
regarding ravens and black objects, or with regard to all possible configurations of background information.
Good had shown that, for some configurations of background knowledge, Nicod's criterion is false (provided that we
are willing to equate "inductively support" with "increase the probability of" - see below). The possibility remained
that, with respect to our actual configuration of knowledge, which is very different from Good's example, Nicod's
criterion might still be true and so we could still reach the paradoxical conclusion. Hempel, on the other hand, insists
that it is our background knowledge itself which is the red herring, and that we should consider induction with
respect to a condition of perfect ignorance.
Good's baby
In his proposed resolution, Maher implicitly made use of the fact that the proposition "All ravens are black" is highly
probable when it is highly probable that there are no ravens. Good had used this fact before to respond to Hempel's
insistence that Nicod's criterion was to be understood to hold in the absence of background information:
...imagine an infinitely intelligent newborn baby having built-in neural circuits enabling him to deal with formal logic, English syntax, and subjective probability. He might now argue, after defining a raven in detail, that it is extremely unlikely that there are any ravens, and therefore it is extremely likely that all ravens are black, that is, that $H$ is true. 'On the other hand', he goes on to argue, 'if there are ravens, then there is a reasonable chance that they are of a variety of colours. Therefore, if I were to discover that even a black raven exists I would consider $H$ to be less probable than it was initially.'
This, according to Good, is as close as one can reasonably expect to get to a condition of perfect ignorance, and it
appears that Nicod's condition is still false. Maher made Good's argument more precise by using Carnap's theory of
induction to formalize the notion that if there is one raven, then it is likely that there are many.[26]
Maher's argument considers a universe of exactly two objects, each of which is very unlikely to be a raven (a one in
a thousand chance) and reasonably unlikely to be black (a one in ten chance). Using Carnap's formula for induction,
he finds that the probability that all ravens are black decreases from 0.9985 to 0.8995 when it is discovered that one
of the two objects is a black raven.
Maher concludes that not only is the paradoxical conclusion true, but that Nicod's criterion is false in the absence of
background knowledge (except for the knowledge that the number of objects in the universe is two and that ravens
are less likely than black things).
Distinguished predicates
Quine[27] argued that the solution to the paradox lies in the recognition that certain predicates, which he called
natural kinds, have a distinguished status with respect to induction. This can be illustrated with Nelson Goodman's
example of the predicate grue. An object is grue if it is blue before (say) 2015 and green afterwards. Clearly, we
expect objects which were blue before 2015 to remain blue afterwards, but we do not expect the objects which were
found to be grue before 2015 to be blue after 2015, since after 2015 they would be green. Quine's explanation is that
"blue" is a natural kind; a privileged predicate which can be used for induction, while "grue" is not a natural kind and
using induction with it leads to error.
This suggests a resolution to the paradox - Nicod's criterion is true for natural kinds, such as "blue" and "black", but
is false for artificially contrived predicates, such as "grue" or "non-raven". The paradox arises, according to this
resolution, because we implicitly interpret Nicod's criterion as applying to all predicates when in fact it only applies
to natural kinds.

Another approach which favours specific predicates over others was taken by Hintikka.[28] Hintikka was motivated
to find a Bayesian approach to the paradox which did not make use of knowledge about the relative frequencies of
ravens and black things. Arguments concerning relative frequencies, he contends, cannot always account for the
perceived irrelevance of evidence consisting of observations of objects of type A for the purposes of learning about
objects of type not-A.
His argument can be illustrated by rephrasing the paradox using predicates other than "raven" and "black". For
example, "All men are tall" is equivalent to "All short people are women", and so observing that a randomly selected
person is a short woman should provide evidence that all men are tall. Despite the fact that we lack background
knowledge to indicate that there are dramatically fewer men than short people, we still find ourselves inclined to
reject the conclusion. Hintikka's example is: "... a generalization like 'no material bodies are infinitely divisible'
seems to be completely unaffected by questions concerning immaterial entities, independently of what one thinks of
the relative frequencies of material and immaterial entities in one's universe of discourse."
His solution is to introduce an order into the set of predicates. When the logical system is equipped with this order, it
is possible to restrict the scope of a generalization such as "All ravens are black" so that it applies to ravens only and
not to non-black things, since the order privileges ravens over non-black things. As he puts it:
If we are justified in assuming that the scope of the generalization 'All ravens are black' can be restricted to
ravens, then this means that we have some outside information which we can rely on concerning the factual
situation. The paradox arises from the fact that this information, which colors our spontaneous view of the
situation, is not incorporated in the usual treatments of the inductive situation.

Rejections of Hempel's equivalence condition


Some approaches for the resolution of the paradox reject Hempel's equivalence condition. That is, they may not
consider evidence supporting the statement all non-black objects are non-ravens to necessarily support
logically-equivalent statements such as all ravens are black.
Selective confirmation
Scheffler and Goodman[29] took an approach to the paradox which incorporates Karl Popper's view that scientific
hypotheses are never really confirmed, only falsified.
The approach begins by noting that the observation of a black raven does not prove that "All ravens are black" but it
falsifies the contrary hypothesis, "No ravens are black". A non-black non-raven, on the other hand, is consistent with
both "All ravens are black" and with "No ravens are black". As the authors put it:
... the statement that all ravens are black is not merely satisfied by evidence of a black raven but is favored by
such evidence, since a black raven disconfirms the contrary statement that all ravens are not black, i.e. satisfies
its denial. A black raven, in other words, satisfies the hypothesis that all ravens are black rather than not: it
thus selectively confirms that all ravens are black.
Selective confirmation violates the equivalence condition since a black raven selectively confirms "All ravens are
black" but not "All non-black things are non-ravens".


Probabilistic or non-probabilistic induction


Scheffler and Goodman's concept of selective confirmation is an example of an interpretation of "provides evidence
in favor of," which does not coincide with "increase the probability of". This must be a general feature of all
resolutions which reject the equivalence condition, since logically equivalent propositions must always have the
same probability.
It is impossible for the observation of a black raven to increase the probability of the proposition "All ravens are
black" without causing exactly the same change to the probability that "All non-black things are non-ravens". If an
observation inductively supports the former but not the latter, then "inductively support" must refer to something
other than changes in the probabilities of propositions. A possible loophole is to interpret "All" as "Nearly all": "Nearly all ravens are black" is not equivalent to "Nearly all non-black things are non-ravens", and these
propositions can have very different probabilities.
This raises the broader question of the relation of probability theory to inductive reasoning. Karl Popper argued that probability theory alone cannot account for induction. His argument involves splitting a hypothesis, $H$, into a part which is deductively entailed by the evidence, $E$, and another part. This can be done in two ways.

First, consider the splitting:[30]

$H = A \wedge B$

where $A$ and $B$ are probabilistically independent: $P(A \wedge B) = P(A)P(B)$, and so on. The condition which is necessary for such a splitting of $H$ and $E$ to be possible is $P(H \mid E) > P(H)$, that is, that $H$ is probabilistically supported by $E$.

Popper's observation is that the part, $A$, of $H$ which receives support from $E$ actually follows deductively from $E$, while the part of $H$ which does not follow deductively from $E$ receives no support at all from $E$ - that is, $P(B \mid E) = P(B)$.

Second, the splitting:[31]

$H = (H \vee E) \wedge (H \vee \neg E)$

separates $H$ into $H \vee E$, which as Popper says, "is the logically strongest part of $H$ (or of the content of $H$) that follows [deductively] from $E$," and $H \vee \neg E$, which, he says, "contains all of $H$ that goes beyond $E$." He continues:

Does $E$, in this case, provide any support for the factor $H \vee \neg E$, which in the presence of $H \vee E$ is alone needed to obtain $H$? The answer is: No. It never does. Indeed, $E$ countersupports $H \vee \neg E$, unless either $P(H \mid E) = 1$ or $P(E) = 1$ (which are possibilities of no interest). ...
This result is completely devastating to the inductive interpretation of the calculus of probability. All
probabilistic support is purely deductive: that part of a hypothesis that is not deductively entailed by the
evidence is always strongly countersupported by the evidence ... There is such a thing as probabilistic support;
there might even be such a thing as inductive support (though we hardly think so). But the calculus of
probability reveals that probabilistic support cannot be inductive support.
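Popper and Miller's claim about the second splitting can be spot-checked numerically: for any joint distribution over $H$ and $E$, the support which $E$ gives to $H \vee \neg E$, namely $P(H \vee \neg E \mid E) - P(H \vee \neg E)$, is never positive. A minimal simulation (an editor's sketch of that one claim, not code from the cited papers):

```python
import random

random.seed(0)
for _ in range(10_000):
    w = [random.random() for _ in range(4)]
    s = sum(w)
    p_he, p_hne, p_nhe, p_nhne = (x / s for x in w)   # P(H,E), P(H,~E), P(~H,E), P(~H,~E)
    posterior = p_he / (p_he + p_nhe)   # P(H or ~E | E) = P(H | E)
    prior = p_he + p_hne + p_nhne       # P(H or ~E) = 1 - P(~H and E)
    # Support is zero only when P(H|E) = 1 or P(E) = 1, and is never positive:
    assert posterior - prior <= 1e-12
print("no counterexample found")
```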
The orthodox approach
The orthodox Neyman-Pearson theory of hypothesis testing considers how to decide whether to accept or reject a
hypothesis, rather than what probability to assign to the hypothesis. From this point of view, the hypothesis that "All
ravens are black" is not accepted gradually, as its probability increases towards one when more and more
observations are made, but is accepted in a single action as the result of evaluating the data which has already been
collected. As Neyman and Pearson put it:
Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern
our behaviour with regard to them, in following which we insure that, in the long run of experience, we shall
not be too often wrong.[32]


According to this approach, it is not necessary to assign any value to the probability of a hypothesis, although one
must certainly take into account the probability of the data given the hypothesis, or given a competing hypothesis,
when deciding whether to accept or to reject. The acceptance or rejection of a hypothesis carries with it the risk of
error.
This contrasts with the Bayesian approach, which requires that the hypothesis be assigned a prior probability, which
is revised in the light of the observed data to obtain the final probability of the hypothesis. Within the Bayesian
framework there is no risk of error since hypotheses are not accepted or rejected; instead they are assigned
probabilities.
An analysis of the paradox from the orthodox point of view has been performed, and leads to, among other insights,
a rejection of the equivalence condition:
It seems obvious that one cannot both accept the hypothesis that all P's are Q and also reject the contrapositive, i.e. that all non-Q's are non-P. Yet it is easy to see that on the Neyman-Pearson theory of testing, a test of "All P's are Q" is not necessarily a test of "All non-Q's are non-P" or vice versa. A test of "All P's are Q" requires reference to some alternative statistical hypothesis of the form a fraction $r$ of all P's are Q, $0 < r < 1$, whereas a test of "All non-Q's are non-P" requires reference to some statistical alternative of the form a fraction $r$ of all non-Q's are non-P, $0 < r < 1$. But these two sets of possible alternatives are different ... Thus one could have a test of $H$ without having a test of its contrapositive.[33]
Rejecting material implication
The following propositions all imply one another: "Every object is either black or not a raven", "Every raven is black", and "Every non-black object is a non-raven." They are therefore, by definition, logically equivalent. However, the three propositions have different domains: the first proposition says something about "Every object", while the second says something about "Every raven".

The first proposition is the only one whose domain of quantification is unrestricted ("all objects"), so this is the only one which can be expressed in first order logic. It is logically equivalent to:

$\forall x\, (Bx \vee \neg Rx)$

and also to

$\forall x\, (Rx \rightarrow Bx)$

where $\rightarrow$ indicates the material conditional, according to which "If $X$ then $Y$" can be understood to mean "$Y$ or not $X$".

It has been argued by several authors that material implication does not fully capture the meaning of "If $X$ then $Y$" (see the paradoxes of material implication). "For every object $x$, $x$ is either black or not a raven" is true when there are no ravens. It is because of this that "All ravens are black" is regarded as true when there are no ravens. Furthermore, the arguments which Good and Maher used to criticize Nicod's criterion (see Good's Baby, above) relied on this fact - that "All ravens are black" is highly probable when it is highly probable that there are no ravens. To say that all ravens are black in the absence of any ravens is an empty statement; it refers to nothing. "All ravens are white" is equally relevant and true, if this statement is considered to have any truth or relevance.
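That the equivalence is forced in classical two-valued logic can be verified by brute force over small finite domains. A quick sketch (a toy enumeration by the editor, not drawn from the literature):

```python
from itertools import product

def all_ravens_black(world):
    return all(black for raven, black in world if raven)          # vacuously true if no ravens

def all_nonblack_nonravens(world):
    return all(not raven for raven, black in world if not black)  # vacuously true if all black

kinds = list(product([False, True], repeat=2))   # (is_raven, is_black) for a single object
for world in product(kinds, repeat=3):           # every possible three-object world
    assert all_ravens_black(world) == all_nonblack_nonravens(world)
print("contraposition holds in every finite world")
```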
Some approaches to the paradox have sought to find other ways of interpreting "If $A$ then $B$" and "All $A$ are $B$" which would eliminate the perceived equivalence between "All ravens are black" and "All non-black things are non-ravens."

One such approach involves introducing a many-valued logic according to which "If $A$ then $B$" has the truth-value $I$, meaning "Indeterminate" or "Inappropriate", when $A$ is false.[34] In such a system, contraposition is not automatically allowed: "If $A$ then $B$" is not equivalent to "If $\neg B$ then $\neg A$". Consequently, "All ravens are black" is not equivalent to "All non-black things are non-ravens".


In this system, when contraposition occurs, the modality of the conditional involved changes from the indicative ("If that piece of butter has been heated to 32 °C then it has melted") to the counterfactual ("If that piece of butter had been heated to 32 °C then it would have melted"). According to this argument, this removes the alleged equivalence
which is necessary to conclude that yellow cows can inform us about ravens:
In proper grammatical usage, a contrapositive argument ought not to be stated entirely in the indicative. Thus:
From the fact that if this match is scratched it will light, it follows that if it does not light it was not
scratched.
is awkward. We should say:
From the fact that if this match is scratched it will light, it follows that if it were not to light it would not
have been scratched. ...
One might wonder what effect this interpretation of the Law of Contraposition has on Hempel's paradox of confirmation. "If $x$ is a raven then $x$ is black" is equivalent to "If $x$ were not black then $x$ would not be a raven". Therefore whatever confirms the latter should also, by the Equivalence Condition, confirm the former. True, but yellow cows still cannot figure into the confirmation of "All ravens are black" because, in science, confirmation is accomplished by prediction, and predictions are properly stated in the indicative mood. It is senseless to ask what confirms a counterfactual.[35]
Differing results of accepting the hypotheses
Several commentators have observed that the propositions "All ravens are black" and "All non-black things are
non-ravens" suggest different procedures for testing the hypotheses. E.g. Good writes:[36]
As propositions the two statements are logically equivalent. But they have a different psychological effect on
the experimenter. If he is asked to test whether all ravens are black he will look for a raven and then decide
whether it is black. But if he is asked to test whether all non-black things are non-ravens he may look for a
non-black object and then decide whether it is a raven.
More recently, it has been suggested that "All ravens are black" and "All non-black things are non-ravens" can have
different effects when accepted.[37] The argument considers situations in which the total numbers or prevalences of
ravens and black objects are unknown, but estimated. When the hypothesis "All ravens are black" is accepted,
according to the argument, the estimated number of black objects increases, while the estimated number of ravens
does not change.
It can be illustrated by considering the situation of two people who have identical information regarding ravens and
black objects, and who have identical estimates of the numbers of ravens and black objects. For concreteness,
suppose that there are 100 objects overall, and, according to the information available to the people involved, each
object is just as likely to be a non-raven as it is to be a raven, and just as likely to be black as it is to be non-black:

$P(Ra) = P(Ba) = \tfrac{1}{2},$

and the propositions $Ra$, $Ba$ are independent for different objects $a$, $b$, and so on. Then the estimated number of ravens is 50; the estimated number of black things is 50; the estimated number of black ravens is 25, and the
estimated number of non-black ravens (counterexamples to the hypotheses) is 25.
One of the people performs a statistical test (e.g. a Neyman-Pearson test or the comparison of the accumulated
weight of evidence to a threshold) of the hypothesis that "All ravens are black", while the other tests the hypothesis
that "All non-black objects are non-ravens". For simplicity, suppose that the evidence used for the test has nothing to
do with the collection of 100 objects dealt with here. If the first person accepts the hypothesis that "All ravens are
black" then, according to the argument, about 50 objects whose colors were previously in doubt (the ravens) are now
thought to be black, while nothing different is thought about the remaining objects (the non-ravens). Consequently,
he should estimate the number of black ravens at 50, the number of black non-ravens at 25 and the number of


non-black non-ravens at 25. By specifying these changes, this argument explicitly restricts the domain of "All ravens
are black" to ravens.
On the other hand, if the second person accepts the hypothesis that "All non-black objects are non-ravens", then the
approximately 50 non-black objects about which it was uncertain whether each was a raven, will be thought to be
non-ravens. At the same time, nothing different will be thought about the approximately 50 remaining objects (the
black objects). Consequently, he should estimate the number of black ravens at 25, the number of black non-ravens
at 25 and the number of non-black non-ravens at 50. According to this argument, since the two people disagree about
their estimates after they have accepted the different hypotheses, accepting "All ravens are black" is not equivalent to
accepting "All non-black things are non-ravens"; accepting the former means estimating more things to be black,
while accepting the latter involves estimating more things to be non-ravens. Correspondingly, the argument goes, the
former requires as evidence ravens which turn out to be black and the latter requires non-black things which turn out
to be non-ravens.[38]
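The bookkeeping in this argument can be laid out explicitly. A sketch of the two people's estimates, using the 50/50 numbers of the example above (an editor's restatement, not code from the cited paper):

```python
n = 100
before = {"black ravens": 25, "non-black ravens": 25,
          "black non-ravens": 25, "non-black non-ravens": 25}

# Person 1 accepts "All ravens are black": the ~50 ravens are now all thought black.
person1 = {"black ravens": 50, "black non-ravens": 25, "non-black non-ravens": 25}

# Person 2 accepts "All non-black things are non-ravens": the ~50 non-black
# objects are now all thought to be non-ravens.
person2 = {"black ravens": 25, "black non-ravens": 25, "non-black non-ravens": 50}

# Same information, logically equivalent hypotheses, yet different estimates:
assert sum(person1.values()) == sum(person2.values()) == n
assert person1 != person2
```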
Existential presuppositions
A number of authors have argued that propositions of the form "All $A$ are $B$" presuppose that there are objects which are $A$.[39][40] This analysis has been applied to the raven paradox:

$H_1$: "All ravens are black" and $H_2$: "All nonblack things are nonravens" are not strictly equivalent ... due to their different existential presuppositions. Moreover, although $H_1$ and $H_2$ describe the same regularity - the nonexistence of nonblack ravens - they have different logical forms. The two hypotheses have different senses and incorporate different procedures for testing the regularity they describe.

A modified logic can take account of existential presuppositions using the presuppositional operator, '*'. For example,

$\forall x\, (*Rx \rightarrow Bx)$

can denote "All ravens are black" while indicating that it is ravens and not non-black objects which are presupposed to exist in this example.

... the logical form of each hypothesis distinguishes it with respect to its recommended type of supporting evidence: the possibly true substitution instances of each hypothesis relate to different types of objects. The fact that the two hypotheses incorporate different kinds of testing procedures is expressed in the formal language by prefixing the operator '*' to a different predicate. The presuppositional operator thus serves as a relevance operator as well. It is prefixed to the predicate '$x$ is a raven' in $H_1$ because the objects relevant to the testing procedure incorporated in "All ravens are black" include only ravens; it is prefixed to the predicate '$x$ is nonblack', in $H_2$, because the objects relevant to the testing procedure incorporated in "All nonblack things are nonravens" include only nonblack things. ... Using Fregean terms: whenever their presuppositions hold, the two hypotheses have the same referent (truth-value), but different senses; that is, they express two different ways to determine that truth-value.[41]


Notes
[1] http://plato.stanford.edu/entries/hempel/
[2] LINK (http://www.philoscience.unibe.ch/documents/TexteHS10/Hempel1945.pdf)
[3] LINK (http://www.collier.sts.vt.edu/5305/hempel-II.pdf)
[4] Nicod had proposed that, in relation to conditional hypotheses, instances of their antecedents that are also instances of their consequents confirm them; instances of their antecedents that are not instances of their consequents disconfirm them; and non-instantiations of their antecedents are neutral, neither confirming nor disconfirming. Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/entries/hempel/)
[5] LINK (http://www.sts-biu.org/images/file/Swinburne - paradoxes.pdf)
[6] Good, I. J. (1960) The Paradox of Confirmation, The British Journal for the Philosophy of Science, Vol. 11, No. 42, 145-149 JSTOR (http://links.jstor.org/sici?sici=0007-0882(196008)11:42<145:TPOC>2.0.CO;2-5)
[7] Fitelson, B. and Hawthorne, J. (2006) How Bayesian Confirmation Theory Handles the Paradox of the Ravens, in Probability in Science, Chicago: Open Court LINK (http://fitelson.org/ravens.pdf)
[8] Alexander, H. G. (1958) The Paradoxes of Confirmation, The British Journal for the Philosophy of Science, Vol. 9, No. 35, p. 227 JSTOR (http://www.jstor.org/stable/685654?origin=JSTOR-pdf)
[9] JSTOR (http://www.jstor.org/action/showArticle?doi=10.2307/2268173) LINK (http://fitelson.org/confirmation/lindenbaum.pdf)
[10] Note: Good used "crow" instead of "raven", but "raven" has been used here throughout for consistency.
[11] Chihara (1987) Some Problems for Bayesian Confirmation Theory, British Journal for the Philosophy of Science, Vol. 38, No. 4 LINK (http://bjps.oxfordjournals.org/cgi/reprint/38/4/551)
[12] Earman, 1992 Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory, MIT Press, Cambridge, MA.
[13] Eells, 1982 Rational Decision and Causality. New York: Cambridge University Press
[14] Gibson, 1969 On Ravens and Relevance and a Likelihood Solution of the Paradox of Confirmation, LINK (http://www.jstor.org/stable/686720)
[15] Hosiasson-Lindenbaum 1940
[16] Howson, Urbach, 1993 Scientific Reasoning: The Bayesian Approach, Open Court Publishing Company
[17] Mackie, 1963 The Paradox of Confirmation, Brit. J. Phil. Sci. Vol. 13, No. 52, p. 265 LINK (http://bjps.oxfordjournals.org/cgi/content/citation/XIII/52/265)
[18] Hintikka, J. 1969, Inductive Independence and the Paradoxes of Confirmation LINK (http://books.google.com/books?id=pWtPcRwuacAC&pg=PA24&lpg=PA24&ots=-1PKZt0Jbz&lr=&sig=EK2qqOZ6-cZR1P1ZKIsndgxttMs)
[19] Humburg 1986, The solution of Hempel's raven paradox in Rudolf Carnap's system of inductive logic, Erkenntnis, Vol. 24, No. 1
[20] Maher 1999
[21] Fitelson 2006
[22] Vranas (2002) Hempel's Raven Paradox: A Lacuna in the Standard Bayesian Solution LINK (http://philsci-archive.pitt.edu/archive/00000688/00/hempelacuna.doc)
[23] Maher, 1999
[24] Fitelson, 2006
[25] Hempel 1967, The White Shoe - No Red Herring, The British Journal for the Philosophy of Science, Vol. 18, No. 3, p. 239 JSTOR (http://www.jstor.org/stable/686596)
[26] LINK (http://patrick.maher1.net/pctl.pdf)
[27] Reprinted in:
[28] Hintikka, 1969
[29] Scheffler, I., Goodman, N. J., Selective Confirmation and the Ravens, Journal of Philosophy, Vol. 69, No. 3, 1972 JSTOR (http://www.jstor.org/stable/2024647)
[30] Popper, K. Realism and the Aim of Science, Routledge, 1992, p. 325
[31] Popper, K., Miller, D. (1983) A Proof of the Impossibility of Inductive Probability, Nature, Vol. 302, p. 687 LINK (http://www.nature.com/nature/journal/v302/n5910/abs/302687a0.html)
[32] JSTOR (http://www.jstor.org/stable/91247) LINK (http://www.stats.org.uk/statistical-inference/NeymanPearson1933.pdf)
[33] Giere, R. N. (1970) An Orthodox Statistical Resolution of the Paradox of Confirmation, Philosophy of Science, Vol. 37, No. 3, p. 354 JSTOR (http://www.jstor.org/stable/186464)
[34] LINK (http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.ndjfl/1093882546)
[35] Farrell (1979)
[36] Good (1960)
[37] LINK (http://philsci-archive.pitt.edu/archive/00003932/01/judgment6.pdf)
[38] O'Flanagan (2008)
[39] Strawson, P. F. (1952) Introduction to Logical Theory, Methuen & Co. London, John Wiley & Sons, New York
[40] Cohen, Y. (1987) Ravens and Relevance, Erkenntnis LINK (http://www.springerlink.com/content/hnn2lutn1066xw47/fulltext.pdf)
[41] Cohen (1987)


References
Franceschi, P. The Doomsday Argument and Hempel's Problem (http://paulfranceschi.com/?p=9), English translation of a paper initially published in French in the Canadian Journal of Philosophy 29, 139-156, 1999, under the title "Comment l'urne de Carter et Leslie se déverse dans celle de Hempel".
Hempel, C. G. A Purely Syntactical Definition of Confirmation (http://fitelson.org/confirmation/hempel_1943.pdf). J. Symb. Logic 8, 122-143, 1943.
Hempel, C. G. Studies in the Logic of Confirmation (I). Mind 54, 1-26, 1945.
Hempel, C. G. Studies in the Logic of Confirmation (II). Mind 54, 97-121, 1945.
Hempel, C. G. Studies in the Logic of Confirmation. In Marguerite H. Foster and Michael L. Martin (http://www.bu.edu/philo/faculty/martin.html), eds. Probability, Confirmation, and Simplicity. New York: Odyssey Press, 1966. 145-183.
Whiteley, C. H. Hempel's Paradoxes of Confirmation. Mind 54, 156-158, 1945.

External links
"Hempels Ravens Paradox," PRIME (Platonic Realms Interactive Mathematics Encyclopedia). (http://www.
mathacademy.com/pr/prime/articles/paradox_raven/index.asp) Retrieved November 29, 2010.

Unexpected hanging paradox


The unexpected hanging paradox, hangman paradox, unexpected exam paradox, surprise test paradox or
prediction paradox is a paradox about a person's expectations about the timing of a future event (e.g. a prisoner's
hanging, or a school test) which he is told will occur at an unexpected time.
Despite significant academic interest, there is no consensus on its precise nature and consequently a final 'correct'
resolution has not yet been established.[1] One approach, offered by the logical school of thought, suggests that the
problem arises in a self-contradictory self-referencing statement at the heart of the judge's sentence. Another
approach, offered by the epistemological school of thought, suggests the unexpected hanging paradox is an example
of an epistemic paradox because it turns on our concept of knowledge.[2] Even though it is apparently simple, the
paradox's underlying complexities have even led to it being called a "significant problem" for philosophy.

Description of the paradox


The paradox has been described as follows:
A judge tells a condemned prisoner that he will be hanged at noon on one weekday in the following week but
that the execution will be a surprise to the prisoner. He will not know the day of the hanging until the
executioner knocks on his cell door at noon that day.
Having reflected on his sentence, the prisoner draws the conclusion that he will escape from the hanging. His
reasoning is in several parts. He begins by concluding that the "surprise hanging" can't be on Friday, as if he
hasn't been hanged by Thursday, there is only one day left - and so it won't be a surprise if he's hanged on
Friday. Since the judge's sentence stipulated that the hanging would be a surprise to him, he concludes it
cannot occur on Friday.
He then reasons that the surprise hanging cannot be on Thursday either, because Friday has already been
eliminated and if he hasn't been hanged by Wednesday night, the hanging must occur on Thursday, making a
Thursday hanging not a surprise either. By similar reasoning he concludes that the hanging can also not occur
on Wednesday, Tuesday or Monday. Joyfully he retires to his cell confident that the hanging will not occur at
all.



The next week, the executioner knocks on the prisoner's door at noon on Wednesday which, despite all the
above, was an utter surprise to him. Everything the judge said came true.
Other versions of the paradox replace the death sentence with a surprise fire drill, examination, pop quiz, or a lion
behind a door.
The informal nature of everyday language allows for multiple interpretations of the paradox. In the extreme case, a
prisoner who is paranoid might feel certain in his knowledge that the executioner will arrive at noon on Monday,
then certain that he will come on Tuesday and so forth, thus ensuring that every day he is not hanged really is a
"surprise" to him, but that the day of his hanging he was indeed expecting to be hanged. But even without adding this
element to the story, the vagueness of the account prohibits one from being objectively clear about which
formalization truly captures its essence. There has been considerable debate between the logical school, which uses
mathematical language, and the epistemological school, which employs concepts such as knowledge, belief and
memory, over which formulation is correct.
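The prisoner's elimination argument can be replayed mechanically. The sketch below is a literal restatement of his (flawed) backward induction, not a resolution of the paradox:

```python
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
possible = list(days)
while possible:
    last = possible[-1]
    # If he is still alive on the morning of the last remaining day, the hanging
    # would be deducible, hence no surprise -- so the prisoner strikes it out.
    possible.pop()
    print(f"{last} ruled out; still possible: {possible if possible else 'none'}")
# He concludes that no surprise hanging can occur -- and is then surprised on Wednesday.
```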

The logical school


Formulation of the judge's announcement into formal logic is made difficult by the vague meaning of the word
"surprise". An attempt at formulation might be:
The prisoner will be hanged next week and the date (of the hanging) will not be deducible in advance from the
assumption that the hanging will occur during the week (A).
Given this announcement the prisoner can deduce that the hanging will not occur on the last day of the week.
However, in order to reproduce the next stage of the argument, which eliminates the penultimate day of the week,
the prisoner must argue that his ability to deduce, from statement (A), that the hanging will not occur on the last day,
implies that a last-day hanging would not be surprising. But since the meaning of "surprising" has been restricted to
not deducible from the assumption that the hanging will occur during the week instead of not deducible from
statement (A), the argument is blocked.
This suggests that a better formulation would in fact be:
The prisoner will be hanged next week and its date will not be deducible in advance using this statement as an
axiom (B).
Some authors have claimed that the
self-referential nature of this statement is the source of the paradox. Fitch has shown that this statement can still be
expressed in formal logic. Using an equivalent form of the paradox which reduces the length of the week to just two
days, he proved that although self-reference is not illegitimate in all circumstances, it is in this case because the
statement is self-contradictory.

Objections
The first objection often raised to the logical school's approach is that it fails to explain how the judge's
announcement appears to be vindicated after the fact. If the judge's statement is self-contradictory, how does he
manage to be right all along? This objection rests on an understanding of the conclusion to be that the judge's
statement is self-contradictory and therefore the source of the paradox. However, the conclusion is more precisely
that in order for the prisoner to carry out his argument that the judge's sentence cannot be fulfilled, he must interpret
the judge's announcement as (B). A reasonable assumption would be that the judge did not intend (B) but that the
prisoner misinterprets his words to reach his paradoxical conclusion. The judge's sentence appears to be vindicated
afterwards but the statement which is actually shown to be true is that "the prisoner will be psychologically surprised
by the hanging". This statement in formal logic would not allow the prisoner's argument to be carried out.
A related objection is that the paradox only occurs because the judge tells the prisoner his sentence (rather than
keeping it secret) which suggests that the act of declaring the sentence is important. Some have argued that since



this action is missing from the logical school's approach, it must be an incomplete analysis. But the action is included
implicitly. The public utterance of the sentence and its context changes the judge's meaning to something like "there
will be a surprise hanging despite my having told you that there will be a surprise hanging". The logical school's
approach does implicitly take this into account.

The epistemological school


Various epistemological formulations have been proposed that show that the prisoner's tacit assumptions about what
he will know in the future, together with several plausible assumptions about knowledge, are inconsistent.
Chow (1998) provides a detailed analysis of a version of the paradox in which a surprise examination is to take place
on one of two days. Applying Chow's analysis to the case of the unexpected hanging (again with the week shortened
to two days for simplicity), we start with the observation that the judge's announcement seems to affirm three things:
S1: The hanging will occur on Monday or Tuesday.
S2: If the hanging occurs on Monday, then the prisoner will not know on Sunday evening that it will occur on
Monday.
S3: If the hanging occurs on Tuesday, then the prisoner will not know on Monday evening that it will occur on
Tuesday.
As a first step, the prisoner reasons that a scenario in which the hanging occurs on Tuesday is impossible because it
leads to a contradiction: on the one hand, by S3, the prisoner would not be able to predict the Tuesday hanging on
Monday evening; but on the other hand, by S1 and process of elimination, the prisoner would be able to predict the
Tuesday hanging on Monday evening.
Chow's analysis points to a subtle flaw in the prisoner's reasoning. What is impossible is not a Tuesday hanging.
Rather, what is impossible is a situation in which the hanging occurs on Tuesday despite the prisoner knowing on
Monday evening that the judge's assertions S1, S2, and S3 are all true.
The prisoner's reasoning, which gives rise to the paradox, is able to get off the ground because the prisoner tacitly
assumes that on Monday evening, he will (if he is still alive) know S1, S2, and S3 to be true. This assumption seems
unwarranted on several different grounds. It may be argued that the judge's pronouncement that something is true
can never be sufficient grounds for the prisoner knowing that it is true. Further, even if the prisoner knows something
to be true in the present moment, unknown psychological factors may erase this knowledge in the future. Finally,
Chow suggests that because the statement which the prisoner is supposed to "know" to be true is a statement about
his inability to "know" certain things, there is reason to believe that the unexpected hanging paradox is simply a
more intricate version of Moore's paradox. A suitable analogy can be reached by reducing the length of the week to
just one day. Then the judge's sentence becomes: You will be hanged tomorrow, but you do not know that.

References
[1] T. Y. Chow, "The surprise examination or unexpected hanging paradox," The American Mathematical Monthly, Jan 1998 (http://www-math.mit.edu/~tchow/unexpected.pdf)
[2] Stanford Encyclopedia of Philosophy discussion of the hanging paradox together with other epistemic paradoxes (http://plato.stanford.edu/entries/epistemic-paradoxes/)

Further reading
O'Connor, D. J. (1948). "Pragmatic Paradoxes". Mind 57: 358–359. doi: 10.1093/mind/lvii.227.358 (http://dx.doi.org/10.1093/mind/lvii.227.358). The first appearance of the paradox in print. The author claims that certain contingent future tense statements cannot come true.
Scriven, M. (1951). "Paradoxical Announcements". Mind 60: 403–407. The author critiques O'Connor and discovers the paradox as we know it today.
Shaw, R. (1958). "The Unexpected Examination". Mind 67: 382–384. doi: 10.1093/mind/lxvii.267.382 (http://dx.doi.org/10.1093/mind/lxvii.267.382). The author claims that the prisoner's premises are self-referring.
Wright, C. & Sudbury, A. (1977). "The Paradox of the Unexpected Examination". Australasian Journal of Philosophy 55: 41–58. doi: 10.1080/00048407712341031 (http://dx.doi.org/10.1080/00048407712341031). The first complete formalization of the paradox, and a proposed solution to it.
Margalit, A. & Bar-Hillel, M. (1983). "Expecting the Unexpected". Philosophia 13: 337–344. A history and bibliography of writings on the paradox up to 1983.
Chihara, C. S. (1985). "Olin, Quine, and the Surprise Examination". Philosophical Studies 47: 19–26. The author claims that the prisoner assumes, falsely, that if he knows some proposition, then he also knows that he knows it.
Kirkham, R. (1991). "On Paradoxes and a Surprise Exam". Philosophia 21: 31–51. doi: 10.1007/bf02381968 (http://dx.doi.org/10.1007/bf02381968). The author defends and extends Wright and Sudbury's solution. He also updates the history and bibliography of Margalit and Bar-Hillel up to 1991.
Chow, T. Y. (1998). "The surprise examination or unexpected hanging paradox" (http://www-math.mit.edu/~tchow/unexpected.pdf). The American Mathematical Monthly.
Franceschi, P. (2005). "Une analyse dichotomique du paradoxe de l'examen surprise". Philosophiques 32 (2): 399–421. doi: 10.7202/011875ar (http://dx.doi.org/10.7202/011875ar). English translation (http://www.paulfranceschi.com/index.php?option=com_content&view=article&id=6:a-dichotomic-analysis-of-the-surprise-examination-paradox&catid=1:analytic-philosophy&Itemid=2).
Gardner, M. (1969). "The Paradox of the Unexpected Hanging". The Unexpected Hanging and Other Mathematical Diversions. Completely analyzes the paradox and introduces other situations with similar logic.
Quine, W. V. O. (1953). "On a So-called Paradox". Mind 62: 65–66. doi: 10.1093/mind/lxii.245.65 (http://dx.doi.org/10.1093/mind/lxii.245.65).
Sorensen, R. A. (1982). "Recalcitrant versions of the prediction paradox". Australasian Journal of Philosophy 69: 355–362.
Kacser, Claude (1986). "On the unexpected hanging paradox" (http://dx.doi.org/10.1119/1.14658). American Journal of Physics 54 (4): 296.
Shapiro, Stuart C. (1998). "A Procedural Solution to the Unexpected Hanging and Sorites Paradoxes" (http://www.jstor.org/stable/2659782). Mind 107: 751–761. doi: 10.1093/mind/107.428.751 (http://dx.doi.org/10.1093/mind/107.428.751).

External links
"The Surprise Examination Paradox and the Second Incompleteness Theorem" (http://www.ams.org/notices/
201011/rtx101101454p.pdf) by Shira Kritchman and Ran Raz, at ams.org
"The Surprise Examination Paradox: A review of two so-called solutions in dynamic epistemic logic" (http://
staff.science.uva.nl/~grossi/DyLoPro/StudentPapers/Final_Marcoci.pdf) by Alexandru Marcoci, at Faculty
of Science: University of Amsterdam


What the Tortoise Said to Achilles


"What the Tortoise Said to Achilles", written by Lewis Carroll in 1895 for the philosophical journal Mind, is a
brief dialogue which problematises the foundations of logic. The title alludes to one of Zeno's paradoxes of motion,
in which Achilles could never overtake the tortoise in a race. In Carroll's dialogue, the tortoise challenges Achilles to
use the force of logic to make him accept the conclusion of a simple deductive argument. Ultimately, Achilles fails,
because the clever tortoise leads him into an infinite regression.

Summary of the dialogue


The discussion begins by considering the following logical argument:
A: "Things that are equal to the same are equal to each other" (Euclidean relation, a weakened form of the
transitive property)
B: "The two sides of this triangle are things that are equal to the same"
Therefore Z: "The two sides of this triangle are equal to each other"
The Tortoise asks Achilles whether the conclusion logically follows from the premises, and Achilles grants that it
obviously does. The Tortoise then asks Achilles whether there might be a reader of Euclid who grants that the
argument is logically valid, as a sequence, while denying that A and B are true. Achilles accepts that such a reader
might exist, and that he would hold that if A and B are true, then Z must be true, while not yet accepting that A and B
are true. (A reader who denies the premises.)
The Tortoise then asks Achilles whether a second kind of reader might exist, who accepts that A and B are true, but
who does not yet accept the principle that if A and B are both true, then Z must be true. Achilles grants the Tortoise
that this second kind of reader might also exist. The Tortoise, then, asks Achilles to treat the Tortoise as a reader of
this second kind. Achilles must now logically compel the Tortoise to accept that Z must be true. (The tortoise is a
reader who denies the argument itself; the syllogism's conclusion, structure, or validity.)
After writing down A, B, and Z in his notebook, Achilles asks the Tortoise to accept the hypothetical:
C: "If A and B are true, Z must be true"
The Tortoise agrees to accept C, if Achilles will write down what it has to accept in his notebook, making the new
argument:

A: "Things that are equal to the same are equal to each other"
B: "The two sides of this triangle are things that are equal to the same"
C: "If A and B are true, Z must be true"
Therefore Z: "The two sides of this triangle are equal to each other"

But now that the Tortoise accepts premise C, it still refuses to accept the expanded argument. When Achilles
demands that "If you accept A and B and C, you must accept Z," the Tortoise remarks that that's another hypothetical
proposition, and suggests even if it accepts C, it could still fail to conclude Z if it did not see the truth of:
D: "If A and B and C are true, Z must be true"
The Tortoise continues to accept each hypothetical premise once Achilles writes it down, but denies that the
conclusion necessarily follows, since each time it denies the hypothetical that if all the premises written down so far
are true, Z must be true:
"And at last we've got to the end of this ideal racecourse! Now that you accept A and B and C and D, of course
you accept Z."
"Do I?" said the Tortoise innocently. "Let's make that quite clear. I accept A and B and C and D. Suppose I still
refused to accept Z?"



"Then Logic would take you by the throat, and force you to do it!" Achilles triumphantly replied. "Logic
would tell you, 'You can't help yourself. Now that you've accepted A and B and C and D, you must accept Z!'
So you've no choice, you see."
"Whatever Logic is good enough to tell me is worth writing down," said the Tortoise. "So enter it in your
notebook, please. We will call it
(E) If A and B and C and D are true, Z must be true.
Until I've granted that, of course I needn't grant Z. So it's quite a necessary step, you see?"
"I see," said Achilles; and there was a touch of sadness in his tone.
Thus, the list of premises continues to grow without end, leaving the argument always in the form:

(1): "Things that are equal to the same are equal to each other"
(2): "The two sides of this triangle are things that are equal to the same"
(3): (1) and (2) ⇒ (Z)
(4): (1) and (2) and (3) ⇒ (Z)
...
(n): (1) and (2) and (3) and (4) and ... and (n − 1) ⇒ (Z)
Therefore (Z): "The two sides of this triangle are equal to each other"

At each step, the Tortoise argues that even though he accepts all the premises that have been written down, there is
some further premise (that if all of (1)(n) are true, then (Z) must be true) that it still needs to accept before it is
compelled to accept that (Z) is true.

Explanation
Lewis Carroll was showing that there is a regress problem that arises from modus ponens deductions.

The regress problem arises because, in order to explain the logical principle, we have to propose a prior principle.
And, once we explain that principle, we have to introduce another principle to explain it, and so on: if the chain is
to continue, we fall into infinite regress. However, if we introduce a formal system in which modus ponens is simply
an axiom, then we abide by it simply because it is so. For example, a chess game has particular rules, and the rules
go without question; as players, we simply follow them. Likewise, if we are engaging in a formal system of logic, we
simply follow the rules without question. Introducing the formal system thus stops the infinite regression, because
the regress stops at the axioms or rules of the given game, system, etc. There are problems with this move as well,
however: within the system, no proposition or variable carries any semantic content. The moment any proposition or
variable is given semantic content, the problem arises again, because propositions and variables with semantic
content run outside the system. Thus, if the solution is said to work, it works solely within the given formal
system, and not otherwise.
Some logicians (Kenneth Ross, Charles Wright) draw a firm distinction between the conditional connective (the
syntactic sign "→") and the implication relation (the formal object denoted by the double arrow symbol "⇒"). These
logicians use the phrase not p or q for the conditional connective and the term implies for the implication relation.
Some explain the difference by saying that the conditional is the contemplated relation while the implication is the
asserted relation. In most fields of mathematics, it is treated as a variation in the usage of
the single sign "⇒", not requiring two separate signs. Not all of those who use the sign "→" for the conditional
connective regard it as a sign that denotes any kind of object, but treat it as a so-called syncategorematic sign, that is,
a sign with a purely syntactic function. For the sake of clarity and simplicity in the present introduction, it is
convenient to use the two-sign notation, but allow the sign "→" to denote the boolean function that is associated with
the truth table of the material conditional.
These considerations result in the following scheme of notation: "p → q" expresses the conditional connective, while "p ⇒ q" asserts the implication relation.

The paradox ceases to exist the moment informal logic is replaced with propositional logic. The Tortoise and
Achilles don't agree on any definition of logical implication. In propositional logic the logical implication is
defined as follows:

P ⇒ Q if and only if the proposition P → Q is a tautology.

Hence modus ponens, [P ∧ (P → Q)] ⇒ Q, is a valid logical implication according to the definition of logical
implication just stated. There is no need to recurse, since the logical implication can be translated into symbols and
propositional operators such as ∧ and →. Demonstrating the logical implication simply translates into verifying that
the compound truth table produces a tautology.
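
To make the last point concrete, the tautology check is a finite computation. Here is a minimal sketch in Python (the helper names are illustrative, not taken from any source) that enumerates the four truth assignments and confirms that [P ∧ (P → Q)] → Q is true in every one, which is all that the propositional-logic definition of implication requires:

from itertools import product

def implies(p, q):
    # Material conditional: p -> q is false only when p is true and q is false.
    return (not p) or q

def is_tautology(f):
    # Check a two-variable compound proposition under every truth assignment.
    return all(f(p, q) for p, q in product([False, True], repeat=2))

# Modus ponens written as a single compound proposition: [P and (P -> Q)] -> Q
modus_ponens = lambda p, q: implies(p and implies(p, q), q)

print(is_tautology(modus_ponens))  # True, so P, (P -> Q) entail Q by definition

No regress appears: the verification is a fixed truth-table computation, not a further inference that itself needs licensing.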

Discussion
Several philosophers have tried to resolve Carroll's paradox. Bertrand Russell discussed the paradox briefly in §38
of The Principles of Mathematics [1] (1903), distinguishing between implication (associated with the form "if p, then
q"), which he held to be a relation between unasserted propositions, and inference (associated with the form "p,
therefore q"), which he held to be a relation between asserted propositions; having made this distinction, Russell
could deny that the Tortoise's attempt to treat inferring Z from A and B is equivalent to, or dependent on, agreeing to
the hypothetical "If A and B are true, then Z is true."
The Wittgensteinian philosopher Peter Winch discussed the paradox in The Idea of a Social Science and its Relation
to Philosophy (1958), where he argued that the paradox showed that "the actual process of drawing an inference,
which is after all at the heart of logic, is something which cannot be represented as a logical formula ... Learning to
infer is not just a matter of being taught about explicit logical relations between propositions; it is learning to do
something" (p.57). Winch goes on to suggest that the moral of the dialogue is a particular case of a general lesson, to
the effect that the proper application of rules governing a form of human activity cannot itself be summed up with a
set of further rules, and so that "a form of human activity can never be summed up in a set of explicit precepts"
(p.53).
According to Penelope Maddy, Carroll's dialogue is apparently the first description of an obstacle to
Conventionalism about logical truth, later reworked in more sober philosophical terms by W. V. Quine.

References
[1] http://fair-use.org/bertrand-russell/the-principles-of-mathematics/s.38

Where to find the article


Carroll, Lewis (1995). "What the Tortoise Said to Achilles". Mind 104 (416): 691–693. JSTOR 2254477 (http://www.jstor.org/stable/2254477).
Hofstadter, Douglas. Gödel, Escher, Bach: an Eternal Golden Braid. See the second dialogue, entitled "Two-Part
Invention." Dr. Hofstadter appropriated the characters of Achilles and the Tortoise for other, original, dialogues in
the book which alternate contrapuntally with prose chapters. Hofstadter's Tortoise is of the male sex, though the
Tortoise's sex is never specified by Carroll. The French translation of the book rendered the Tortoise's name as
"Madame Tortue."
A number of websites, including "What the Tortoise Said to Achilles" (http://www.lewiscarroll.org/achilles.html) at the Lewis Carroll Society of North America (http://www.lewiscarroll.org), "What the Tortoise Said to Achilles" (http://www.ditext.com/carroll/tortoise.html) at Digital Text International (http://www.ditext.com/), and "What the Tortoise Said to Achilles" (http://fair-use.org/mind/1895/04/what-the-tortoise-said-to-achilles) at Fair Use Repository (http://fair-use.org).


Mathematics
Accuracy paradox
The accuracy paradox for predictive analytics states that predictive models with a given level of accuracy may have
greater predictive power than models with higher accuracy. It may be better to avoid the accuracy metric in favor of
other metrics such as precision and recall.
Accuracy is often the starting point for analyzing the quality of a predictive model, as well as an obvious criterion for
prediction. Accuracy measures the ratio of correct predictions to the total number of cases evaluated. It may seem
obvious that the ratio of correct predictions to cases should be a key metric. A predictive model may have high
accuracy, but be useless.
In an example predictive model for an insurance fraud application, all cases that are predicted as high-risk by the
model will be investigated. To evaluate the performance of the model, the insurance company has created a sample
data set of 10,000 claims. All 10,000 cases in the validation sample have been carefully checked and it is known
which cases are fraudulent. To analyze the quality of the model, the insurance company uses a table of confusion
(confusion matrix). The definition of accuracy, the table of confusion for model M1Fraud, and the calculation of
accuracy for model M1Fraud are shown below.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where
TN is the number of true negative cases
FP is the number of false positive cases
FN is the number of false negative cases
TP is the number of true positive cases

Formula 1: Definition of Accuracy
                 Predicted Negative   Predicted Positive
Negative Cases         9,700                 150
Positive Cases            50                 100

Table 1: Table of Confusion for Fraud Model M1Fraud.

Accuracy(M1Fraud) = (100 + 9,700) / 10,000 = 98.0%

Formula 2: Accuracy for model M1Fraud


With an accuracy of 98.0%, model M1Fraud appears to perform fairly well. The paradox lies in the fact that accuracy
can be easily improved to 98.5% by always predicting "no fraud". The table of confusion and the accuracy for this
trivial always-predict-negative model M2Fraud are shown below.

                 Predicted Negative   Predicted Positive
Negative Cases         9,850                   0
Positive Cases           150                   0

Table 2: Table of Confusion for Fraud Model M2Fraud.

Accuracy(M2Fraud) = (0 + 9,850) / 10,000 = 98.5%

Formula 3: Accuracy for model M2Fraud


Model M2Fraud reduces the rate of inaccurate predictions from 2% to 1.5%, an apparent improvement of 25%.
The new model M2Fraud shows fewer incorrect predictions and markedly improved accuracy compared to the
original model M1Fraud, but is obviously useless: it does not offer any value to the company for preventing fraud.
The less accurate model is more useful than the more accurate model.
Model improvements should not be measured in terms of accuracy gains. It may be going too far to say that accuracy
is irrelevant, but caution is advised when using accuracy in the evaluation of predictive models.
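
The contrast between the two models is easy to reproduce from the tables above. The following minimal Python sketch (the metric definitions are the standard ones; nothing here is specific to the insurance example beyond the counts) recomputes accuracy alongside precision and recall, which expose the uselessness of M2Fraud:

def metrics(tp, fn, fp, tn):
    # Standard confusion-matrix metrics; precision is undefined when the
    # model never predicts positive (tp + fp == 0).
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# M1Fraud from Table 1 (TP=100, FN=50, FP=150, TN=9,700) and the trivial
# "always predict no fraud" model M2Fraud from Table 2.
for name, counts in [("M1Fraud", (100, 50, 150, 9700)),
                     ("M2Fraud", (0, 150, 0, 9850))]:
    print(name, metrics(*counts))
# M1Fraud: accuracy 0.980, precision 0.400, recall about 0.667
# M2Fraud: accuracy 0.985, precision undefined, recall 0.000

M2Fraud wins on accuracy but catches no fraud at all (recall 0), which is precisely the paradox described above.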

Bibliography
Zhu, Xingquan (2007), Knowledge Discovery and Data Mining: Challenges and Realities [1], IGI Global,
pp. 118–119, ISBN 978-1-59904-252-7
doi:10.1117/12.785623 [2]
pp. 86–87 of this Master's thesis [3]

References
[1] http://books.google.com/?id=zdJQAAAAMAAJ&q=data+mining+challenges+and+realities&dq=data+mining+challenges+and+realities
[2] http://dx.doi.org/10.1117%2F12.785623
[3] http://www.utwente.nl/ewi/trese/graduation_projects/2009/Abma.pdf

Apportionment paradox
An apportionment paradox exists when the rules for apportionment in a political system produce results which are
unexpected or seem to violate common sense.
To apportion is to divide into parts according to some rule, the rule typically being one of proportion. Certain
quantities, like milk, can be divided in any proportion whatsoever; others, such as horses, cannot: only whole
numbers will do. In the latter case, there is an inherent tension between our desire to obey the rule of proportion as
closely as possible and the constraint restricting the size of each portion to discrete values. This results, at times, in
unintuitive observations, or paradoxes.
Several paradoxes related to apportionment, also called fair division, have been identified. In some cases, simple
adjustments to an apportionment methodology can resolve observed paradoxes. Others, such as those relating to the
United States House of Representatives, call into question notions that mathematics alone can provide a single, fair
resolution.

History
The Alabama paradox was discovered in 1880, when it was found that increasing the total number of seats in the
House of Representatives would decrease Alabama's share from 8 to 7. There was more to come: when Oklahoma
became a state in 1907, a recomputation of apportionment showed that the number of seats due to other states would
be affected even though Oklahoma would be given a fair share of seats and the total number of seats increased by
that number.
The method for apportionment used during this period, originally put forth by Alexander Hamilton but not adopted
until 1852, was as follows (after meeting the requirements of the United States Constitution, wherein each state must
be allocated at least one seat in the House of Representatives, regardless of population):
First, the fair share of each state, i.e. the proportional share of seats that each state would get if fractional values
were allowed, is computed.
Next, the fair shares are rounded down to whole numbers, resulting in unallocated "leftover" seats. These seats are
allocated, one each, to the states whose fair share exceeds the rounded-down number by the highest
amount.

Impossibility result
In 1982 two mathematicians, Michel Balinski and Peyton Young, proved that any method of apportionment will
result in paradoxes whenever there are three or more parties (or states, regions, etc.). More precisely, their theorem
states that there is no apportionment system that has all of the following properties (as an example, take the division
of seats between parties in a system of proportional representation):
It follows the quota rule: Each of the parties gets one of the two numbers closest to its fair share of seats (if the
party's fair share is 7.34 seats, it gets either 7 or 8).
It does not have the Alabama paradox: If the total number of seats is increased, no party's number of seats
decreases.
It does not have the population paradox: If party A gets more votes and party B gets fewer votes, no seat will be
transferred from A to B.


Examples of paradoxes
Alabama paradox
The Alabama paradox was the first of the apportionment paradoxes to be discovered. The US House of
Representatives is constitutionally required to allocate seats based on population counts, which are required every 10
years. The size of the House is set by statute.
After the 1880 census, C. W. Seaton, chief clerk of the United States Census Bureau, computed apportionments for
all House sizes between 275 and 350, and discovered that Alabama would get 8 seats with a House size of 299 but
only 7 with a House size of 300. In general the term Alabama paradox refers to any apportionment scenario where
increasing the total number of items would decrease one of the shares. A similar exercise by the Census Bureau after
the 1900 census computed apportionments for all House sizes between 350 and 400: Colorado would have received
three seats in all cases, except with a House size of 357 in which case it would have received two.[1]
The following is a simplified example (following the largest remainder method) with three states and 10 and 11
seats respectively.

State   Population   Fair share (10 seats)   Seats   Fair share (11 seats)   Seats
A            6              4.286              4            4.714              5
B            6              4.286              4            4.714              5
C            2              1.429              2            1.571              1
Observe that state C's share decreases from 2 to 1 with the added seat.
This occurs because increasing the number of seats increases the fair share faster for the large states than for the
small states. In particular, large A and B had their fair share increase faster than small C. Therefore, the fractional
parts for A and B increased faster than those for C. In fact, they overtook C's fraction, causing C to lose its seat, since
the Hamilton method examines which states have the largest fraction.
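
The paradox is mechanical enough to reproduce directly. Below is a minimal sketch of the largest remainder (Hamilton) method in Python, run on the three-state example above; the function name and data layout are illustrative choices, not a standard API:

from math import floor

def hamilton(populations, seats):
    # Largest remainder method: floor each fair share, then hand the
    # leftover seats to the states with the largest fractional remainders.
    total = sum(populations.values())
    shares = {s: p * seats / total for s, p in populations.items()}
    alloc = {s: floor(q) for s, q in shares.items()}
    leftover = seats - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda s: shares[s] - alloc[s],
                          reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

states = {"A": 6, "B": 6, "C": 2}
print(hamilton(states, 10))  # {'A': 4, 'B': 4, 'C': 2}
print(hamilton(states, 11))  # {'A': 5, 'B': 5, 'C': 1} -- C loses a seat

Raising the house size from 10 to 11 costs state C a seat, exactly as in the table.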

New states paradox


Given a fixed number of total representatives (as determined by the United States House of Representatives), adding
a new state would in theory reduce the number of representatives for existing states, as under the United States
Constitution each state is entitled to at least one representative regardless of its population. However, because of how
the particular apportionment rules deal with rounding methods, it is possible for an existing state to get more
representatives than if the new state were not added.

Population paradox
The population paradox is a counterintuitive result of some procedures for apportionment. When two states have
populations increasing at different rates, a small state with rapid growth can lose a legislative seat to a big state with
slower growth.
The paradox arises because of rounding in the procedure for dividing the seats. See the apportionment rules for the
United States Congress for an example.


External links
The Constitution and Paradoxes [2]
Alabama Paradox [3]
New States Paradox [4]
Population Paradox [5]
Apportionment: Balinski and Young's Contribution [6]

References
[1] Cut-the-knot: The Constitution and Paradoxes (http://www.cut-the-knot.org/ctk/Democracy.shtml)
[2] http://www.cut-the-knot.org/ctk/Democracy.shtml
[3] http://www.cut-the-knot.org/ctk/Democracy.shtml#alabama
[4] http://www.cut-the-knot.org/ctk/Democracy.shtml#new-states
[5] http://www.cut-the-knot.org/ctk/Democracy.shtml#population
[6] http://www.ams.org/featurecolumn/archive/apportionII3.html

Banach–Tarski paradox
The Banach–Tarski paradox is a theorem in set-theoretic geometry, which states the following: Given a solid ball in
3-dimensional space, there exists a decomposition of the ball into a finite number[1] of non-overlapping pieces (i.e.,
disjoint subsets), which can then be put back together in a different way to yield two identical copies of the original
ball. Indeed, the reassembly process involves only moving the pieces around and rotating them, without changing
their shape. However, the pieces themselves are not "solids" in the usual sense, but infinite scatterings of points.

[Figure: Can a ball be decomposed into a finite number of point sets and reassembled into two balls identical to the original?]

A stronger form of the theorem implies that given any two "reasonable" solid objects (such as a small ball and a huge
ball), either one can be reassembled into the other. This is often stated informally as "a pea can be chopped up and
reassembled into the Sun" and called the "pea and the Sun paradox".

The reason the Banach–Tarski theorem is called a paradox is that it contradicts basic geometric intuition. "Doubling
the ball" by dividing it into parts and moving them around by rotations and translations, without any stretching,
bending, or adding new points, seems to be impossible, since all these operations ought to, intuitively speaking,
preserve the volume, but they don't necessarily all do that, and the volume is doubled in the end.

Unlike with most theorems in geometry, the proof of this result depends in a critical way on the choice of axioms for
set theory. It can be proven only by using the axiom of choice,[2] which allows for the construction of nonmeasurable
sets, i.e., collections of points that do not have a volume in the ordinary sense and that for their construction would
require performing an uncountably infinite number of choices.

It was shown in 2005 that the pieces in the decomposition can be chosen in such a way that they can be moved
continuously into place without running into one another.


Banach and Tarski publication


In a paper published in 1924, Stefan Banach and Alfred Tarski gave a construction of such a paradoxical
decomposition, based on earlier work by Giuseppe Vitali concerning the unit interval and on the paradoxical
decompositions of the sphere by Felix Hausdorff, and discussed a number of related questions concerning
decompositions of subsets of Euclidean spaces in various dimensions. They proved the following more general
statement, the strong form of the Banach–Tarski paradox:

Given any two bounded subsets A and B of a Euclidean space in at least three dimensions, both of which have
a nonempty interior, there are partitions of A and B into a finite number of disjoint subsets, A = A1 ∪ ... ∪ Ak,
B = B1 ∪ ... ∪ Bk, such that for each i between 1 and k, the sets Ai and Bi are congruent.

Now let A be the original ball and B be the union of two translated copies of the original ball. Then the proposition
means that you can divide the original ball A into a certain number of pieces and then rotate and translate these
pieces in such a way that the result is the whole set B, which contains two copies of A.

The strong form of the Banach–Tarski paradox is false in dimensions one and two, but Banach and Tarski showed
that an analogous statement remains true if countably many subsets are allowed. The difference between the
dimensions 1 and 2 on the one hand, and three and higher, on the other hand, is due to the richer structure of the
group E(n) of the Euclidean motions in the higher dimensions, which is solvable for n = 1, 2 and contains a free
group with two generators for n ≥ 3. John von Neumann studied the properties of the group of equivalences that
make a paradoxical decomposition possible and introduced the notion of amenable groups. He also found a form of
the paradox in the plane which uses area-preserving affine transformations in place of the usual congruences.

Tarski proved that amenable groups are precisely those for which no paradoxical decompositions exist.

Formal treatment
The Banach–Tarski paradox states that a ball in the ordinary Euclidean space can be doubled using only the
operations of partitioning into subsets, replacing a set with a congruent set, and reassembly. Its mathematical
structure is greatly elucidated by emphasizing the role played by the group of Euclidean motions and introducing the
notions of equidecomposable sets and paradoxical sets. Suppose that G is a group acting on a set X. In the most
important special case, X is an n-dimensional Euclidean space, and G consists of all isometries of X, i.e. the
transformations of X into itself that preserve the distances, usually denoted E(n). Two geometric figures that can be
transformed into each other are called congruent, and this terminology will be extended to the general G-action. Two
subsets A and B of X are called G-equidecomposable, or equidecomposable with respect to G, if A and B can be
partitioned into the same finite number of respectively G-congruent pieces. This defines an equivalence relation
among all subsets of X. Formally, if

A = A1 ∪ ... ∪ Ak,  B = B1 ∪ ... ∪ Bk  (both unions disjoint)

and there are elements g1, ..., gk of G such that gi(Ai) = Bi for each i,

then we will say that A and B are G-equidecomposable using k pieces. If a set E has two disjoint subsets A and B
such that A and E, as well as B and E, are G-equidecomposable, then E is called paradoxical.

Using this terminology, the Banach–Tarski paradox can be reformulated as follows:

A three-dimensional Euclidean ball is equidecomposable with two copies of itself.

In fact, there is a sharp result in this case, due to Robinson:[3] doubling the ball can be accomplished with five pieces,
and fewer than five pieces will not suffice.

The strong version of the paradox claims:

Any two bounded subsets of 3-dimensional Euclidean space with non-empty interiors are equidecomposable.

While apparently more general, this statement is derived in a simple way from the doubling of a ball by using a
generalization of the Bernstein–Schroeder theorem due to Banach that implies that if A is equidecomposable with a
subset of B and B is equidecomposable with a subset of A, then A and B are equidecomposable.

The Banach–Tarski paradox can be put in context by pointing out that for two sets in the strong form of the paradox,
there is always a bijective function that can map the points in one shape into the other in a one-to-one fashion. In the
language of Georg Cantor's set theory, these two sets have equal cardinality. Thus, if one enlarges the group to allow
arbitrary bijections of X, then all sets with non-empty interior become congruent. Likewise, we can make one ball
into a larger or smaller ball by stretching, in other words, by applying similarity transformations. Hence, if the group
G is large enough, we may find G-equidecomposable sets whose "size" varies. Moreover, since a countable set can
be made into two copies of itself, one might expect that somehow, using countably many pieces could do the trick.

On the other hand, in the Banach–Tarski paradox the number of pieces is finite and the allowed equivalences are
Euclidean congruences, which preserve the volumes. Yet, somehow, they end up doubling the volume of the ball!
While this is certainly surprising, some of the pieces used in the paradoxical decomposition are non-measurable sets,
so the notion of volume (more precisely, Lebesgue measure) is not defined for them, and the partitioning cannot be
accomplished in a practical way. In fact, the Banach–Tarski paradox demonstrates that it is impossible to find a
finitely-additive measure (or a Banach measure) defined on all subsets of a Euclidean space of three (and greater)
dimensions that is invariant with respect to Euclidean motions and takes the value one on a unit cube. In his later
work, Tarski showed that, conversely, non-existence of paradoxical decompositions of this type implies the existence
of a finitely-additive invariant measure.

The heart of the proof of the "doubling the ball" form of the paradox presented below is the remarkable fact that by a
Euclidean isometry (and renaming of elements), one can divide a certain set (essentially, the surface of a unit sphere)
into four parts, then rotate one of them to become itself plus two of the other parts. This follows rather easily from an
F2-paradoxical decomposition of F2, the free group with two generators. Banach and Tarski's proof relied on an
analogous fact discovered by Hausdorff some years earlier: the surface of a unit sphere in space is a disjoint union of
three sets B, C, D and a countable set E such that, on the one hand, B, C, D are pairwise congruent, and, on the other
hand, B is congruent with the union of C and D. This is often called the Hausdorff paradox.

Connection with earlier work and the role of the axiom of choice
Banach and Tarski explicitly acknowledge Giuseppe Vitali's 1905 construction of the set bearing his name,
Hausdorff's paradox (1914), and an earlier (1923) paper of Banach as the precursors to their work. Vitali's and
Hausdorff's constructions depend on Zermelo's axiom of choice ("AC"), which is also crucial to the Banach–Tarski
paper, both for proving their paradox and for the proof of another result:
Two Euclidean polygons, one of which strictly contains the other, are not equidecomposable.
They remark:
Le rôle que joue cet axiome dans nos raisonnements nous semble mériter l'attention
(The role this axiom plays in our reasoning seems to us to deserve attention)
and point out that while the second result fully agrees with our geometric intuition, its proof uses AC in an even
more substantial way than the proof of the paradox. Thus Banach and Tarski imply that AC should not be rejected
simply because it produces a paradoxical decomposition, for such an argument also undermines proofs of
geometrically intuitive statements.
However, in 1949 A.P. Morse showed that the statement about Euclidean polygons can be proved in ZF set theory
and thus does not require the axiom of choice. In 1964, Paul Cohen proved that the axiom of choice cannot be
proved from ZF. A weaker version of an axiom of choice is the axiom of dependent choice, DC. It has been shown
that
The Banach–Tarski paradox is not a theorem of ZF, nor of ZF+DC.[4]
Large amounts of mathematics use AC. As Stan Wagon points out at the end of his monograph, the Banach–Tarski
paradox has been more significant for its role in pure mathematics than for foundational questions: it motivated a
fruitful new direction for research, the amenability of groups, which has nothing to do with the foundational
questions.
In 1991, using then-recent results by Matthew Foreman and Friedrich Wehrung, Janusz Pawlikowski proved that the
Banach–Tarski paradox follows from ZF plus the Hahn–Banach theorem. The Hahn–Banach theorem doesn't rely
on the full axiom of choice but can be proved using a weaker version of AC called the ultrafilter lemma. So
Pawlikowski proved that the set theory needed to prove the Banach–Tarski paradox, while stronger than ZF, is
weaker than full ZFC.

A sketch of the proof


Here we sketch a proof which is similar but not identical to that given by Banach and Tarski. Essentially, the
paradoxical decomposition of the ball is achieved in four steps:
1. Find a paradoxical decomposition of the free group in two generators.
2. Find a group of rotations in 3-d space isomorphic to the free group in two generators.
3. Use the paradoxical decomposition of that group and the axiom of choice to produce a paradoxical decomposition
of the hollow unit sphere.
4. Extend this decomposition of the sphere to a decomposition of the solid unit ball.
We now discuss each of these steps in more detail.

Step 1
The free group with two generators a and b consists of all finite strings that can be formed from the four symbols a,
a⁻¹, b and b⁻¹ such that no a appears directly next to an a⁻¹ and no b appears directly next to a b⁻¹. Two such strings
can be concatenated and converted into a string of this type by repeatedly replacing the "forbidden" substrings with
the empty string. For instance: abab⁻¹a⁻¹ concatenated with abab⁻¹a yields abab⁻¹a⁻¹abab⁻¹a, which contains the
substring a⁻¹a, and so gets reduced to abaab⁻¹a. One can check that the set of those strings with this operation forms
a group with identity element the empty string e. We will call this group F2.

The group F2 can be "paradoxically decomposed" as follows: let S(a) be the set of all non-forbidden strings
that start with a and define S(a⁻¹), S(b) and S(b⁻¹) similarly. Clearly,

F2 = {e} ∪ S(a) ∪ S(a⁻¹) ∪ S(b) ∪ S(b⁻¹)

but also

F2 = aS(a⁻¹) ∪ S(a)  and  F2 = bS(b⁻¹) ∪ S(b).

The notation aS(a⁻¹) means take all the strings in S(a⁻¹) and concatenate them on the left with a.

[Figure: The sets S(a⁻¹) and aS(a⁻¹) in the Cayley graph of F2]

Make sure that you understand this last line, because it is at the core of the proof. For example, there may be a
string aa⁻¹b in the set aS(a⁻¹) which, because of the rule that a⁻¹ must not appear next to a, reduces
to the string b. Similarly, aS(a⁻¹) contains all the strings that start with a⁻¹ (for example the string aa⁻¹a⁻¹, which
reduces to a⁻¹). In this way, aS(a⁻¹) contains all the strings that start with b, b⁻¹ and a⁻¹.

We have cut our group F2 into four pieces (plus the singleton {e}), then "shifted" two of them by multiplying with a
or b, then "reassembled" two pieces to make one copy of F2 and the other two to make another copy of F2. That is
exactly what we want to do to the ball.
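
The bookkeeping in this step is concrete enough to check by machine. The following Python sketch (an illustrative encoding: lower-case letters are the generators and upper-case letters their inverses, so "A" plays the role of a⁻¹) implements word reduction in F2 and spot-checks the identity F2 = S(a) ∪ aS(a⁻¹) on all reduced words up to length 4:

from itertools import product

def reduce_word(w):
    # Cancel adjacent inverse pairs such as "aA" or "Bb" until none remain.
    out = []
    for ch in w:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

print(reduce_word("abaBA" + "abaBa"))  # 'abaaBa', the example from the text

def in_S(w, ch):
    # S(ch): non-empty reduced words whose first symbol is ch.
    return w != "" and w[0] == ch

words = {reduce_word("".join(p)) for n in range(5)
         for p in product("abAB", repeat=n)}
# w lies in aS(a^-1) exactly when a^-1 w (here "A" + w, reduced) is in S(a^-1).
assert all(in_S(w, "a") or in_S(reduce_word("A" + w), "A") for w in words)
print("every reduced word lies in S(a) or in aS(a^-1)")

The passing assertion is of course a finite spot-check of the decomposition, not a proof.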

Step 2
In order to find a free group of rotations of 3D space, i.e. one that behaves just like (or "is isomorphic to") the free
group F2, we take two orthogonal axes, e.g. the x and z axes, and let A be a rotation of arccos(1/3) about the
x axis, and B be a rotation of arccos(1/3) about the z axis (there are many other suitable pairs of irrational
multiples of π that could be used here as well).[5]

The group of rotations generated by A and B will be called H. Let ρ be an element of H which starts with a rotation
about the z axis, that is, a word of the form ρ = ⋯ A^(k2) B^(k1) in which B^(k1) acts first. It can be shown by
induction that ρ maps the point (1, 0, 0) to a point of the form (a, b√2, c)/3^N, where a, b, c and N are integers.
Analysing a, b and c modulo 3, one can show that b ≠ 0. The same argument repeated (by symmetry of the
problem) is valid for the opposite angle, as well as on the x axis. This shows that for any non-trivial word ρ in
H, ρ ≠ e. Therefore the group H is a free group, isomorphic to F2.

The two rotations behave just like the elements a and b in the group F2: we now have a paradoxical decomposition
of H.

This step cannot be performed in two dimensions since it involves rotations in three dimensions. If we take two
rotations about the same axis, the resulting group is commutative and doesn't have the property required in step 1.

An alternate arithmetic proof of the existence of free groups in some special orthogonal groups using integral
quaternions leads to paradoxical decompositions of the rotation group.[6]
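
The freeness claim can be spot-checked numerically with exact integer arithmetic. With cos θ = 1/3 and sin θ = 2√2/3, every point reachable from (1, 0, 0) has the form (a, b√2, c)/3^N; the update rules below for the integer triple (a, b, c) were worked out by hand from the two rotation matrices, so treat them as an assumption of this sketch rather than formulas from the text. Since an element of H fixing (1, 0, 0) would have to be a rotation about the x axis, and hence a power of A, any reduced word containing B or B⁻¹ must move the point:

import random

GEN = {
    "A": lambda a, b, c: (3 * a, b - 2 * c, 4 * b + c),  # rotation about x
    "a": lambda a, b, c: (3 * a, b + 2 * c, c - 4 * b),  # its inverse
    "B": lambda a, b, c: (a - 4 * b, 2 * a + b, 3 * c),  # rotation about z
    "b": lambda a, b, c: (a + 4 * b, b - 2 * a, 3 * c),  # its inverse
}

def apply_word(word):
    a, b, c = 1, 0, 0              # the point (1, 0, 0), i.e. N = 0
    for g in word:                 # leftmost letter acts first
        a, b, c = GEN[g](a, b, c)  # each step also multiplies 3^N by 3
    return a, b, c

def random_reduced_word(n):
    inverse = {"A": "a", "a": "A", "B": "b", "b": "B"}
    w = []
    while len(w) < n:
        g = random.choice("AaBb")
        if w and w[-1] == inverse[g]:
            continue               # would cancel, so the word stays reduced
        w.append(g)
    return "".join(w)

for _ in range(1000):
    w = random_reduced_word(random.randint(1, 12))
    if "B" not in w and "b" not in w:
        continue                   # powers of A fix (1, 0, 0): skip them
    # A fixed point would reappear as (3^n, 0, 0) after n scalings.
    assert apply_word(w) != (3 ** len(w), 0, 0), w
print("no tested word containing B or B^-1 fixed the point (1, 0, 0)")

This is numerical evidence for the free action, not a substitute for the modulo-3 induction sketched above.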

Step 3
The unit sphere S² is partitioned into orbits by the action of our group H: two points belong to the same orbit if and
only if there is a rotation in H which moves the first point into the second. (Note that the orbit of a point is a dense set
in S².) We can use the axiom of choice to pick exactly one point from every orbit; collect these points into a set M.
Now (almost) every point in S² can be reached in exactly one way by applying the proper rotation from H to the
proper element from M, and because of this, the paradoxical decomposition of H then yields a paradoxical
decomposition of S² into four pieces A1, A2, A3, A4 as follows:

A1 = S(a)M,  A2 = S(a⁻¹)M,  A3 = S(b)M,  A4 = S(b⁻¹)M

where we use the notation

S(a)M = { s(x) : s ∈ S(a), x ∈ M }

and likewise for the other sets. (We didn't use the five "paradoxical" parts of F2 directly, as they would leave us with
M as an extra piece after doubling, owing to the presence of the singleton {e}!)

The (majority of the) sphere has now been divided into four sets (each one dense on the sphere), and when two of
these are rotated, we end up with double what we had before:

aA2 = M ∪ A2 ∪ A3 ∪ A4  and  bA4 = M ∪ A4 ∪ A1 ∪ A2,

so that S² = A1 ∪ aA2 and S² = A3 ∪ bA4 (up to the countable set of fixed points dealt with below).


Step 4
Finally, connect every point on S² with a ray to the origin; the paradoxical decomposition of S² then yields a
paradoxical decomposition of the solid unit ball minus the point at the ball's centre (this center point needs a bit more
care; see below).

N.B. This sketch glosses over some details. One has to be careful about the set of points on the sphere which happen
to lie on the axis of some rotation in H. However, there are only countably many such points, and like the point at
the centre of the ball, it is possible to patch the proof to account for them all (see below).

Some details, fleshed out


In Step 3, we partitioned the sphere into orbits of our group H. To streamline the proof, we omitted the discussion of
points that are fixed by some rotation; since the paradoxical decomposition of F2 relies on shifting certain subsets,
the fact that some points are fixed might cause some trouble. Since any rotation of S² (other than the null rotation)
has exactly two fixed points, and since H, which is isomorphic to F2, is countable, there are countably many points
of S² that are fixed by some rotation in H. Denote this set of fixed points D. Step 3 proves that S² ∖ D admits a
paradoxical decomposition.

What remains to be shown is the Claim: S² ∖ D is equidecomposable with S².

Proof. Let λ be some line through the origin that does not intersect any point in D; this is possible since D is
countable. Let J be the set of angles, α, such that for some natural number n and some P in D, r(nα)P is also in D,
where r(nα) is a rotation about λ of nα. Then J is countable, so there exists an angle θ not in J. Let ρ be the rotation
about λ by θ. Then ρ acts on S² with no fixed points in D, i.e., ρⁿ(D) is disjoint from D, and for natural m < n, ρⁿ(D) is
disjoint from ρᵐ(D). Let E be the disjoint union of ρⁿ(D) over n = 0, 1, 2, .... Then S² = E ∪ (S² ∖ E) ~ ρ(E) ∪ (S² ∖
E) = (E ∖ D) ∪ (S² ∖ E) = S² ∖ D, where ~ denotes "is equidecomposable to".

For step 4, it has already been shown that the ball minus a point admits a paradoxical decomposition; it remains to be
shown that the ball minus a point is equidecomposable with the ball. Consider a circle within the ball, containing the
point at the centre of the ball. Using an argument like that used to prove the Claim, one can see that the full circle is
equidecomposable with the circle minus the point at the ball's centre. (Basically, a countable set of points on the
circle can be rotated to give itself plus one more point.) Note that this involves the rotation about a point other than
the origin, so the Banach–Tarski paradox involves isometries of Euclidean 3-space rather than just SO(3).

We are using the fact that if A ~ B and B ~ C, then A ~ C. The decomposition of A into C can be done using a number
of pieces equal to the product of the numbers needed for taking A into B and for taking B into C.

The proof sketched above requires 2 × 4 × 2 + 8 = 24 pieces: a factor of 2 to remove fixed points, a factor 4 from
step 1, a factor 2 to recreate fixed points, and 8 for the center point of the second ball. But in step 1 when moving {e}
and all strings of the form aⁿ into S(a⁻¹), do this to all orbits except one. Move {e} of this last orbit to the center
point of the second ball. This brings the total down to 16 + 1 pieces. With more algebra, one can also decompose
fixed orbits into 4 sets as in step 1. This gives 5 pieces and is the best possible.

Obtaining infinitely many balls from one


Using the Banach–Tarski paradox, it is possible to obtain k copies of a ball in the Euclidean n-space from one, for
any integers n ≥ 3 and k ≥ 1, i.e. a ball can be cut into k pieces so that each of them is equidecomposable to a ball of
the same size as the original. Using the fact that the free group F2 of rank 2 admits a free subgroup of countably
infinite rank, a similar proof yields that the unit sphere S^(n−1) can be partitioned into countably infinitely many pieces,
each of which is equidecomposable (with two pieces) to the S^(n−1) using rotations. By using analytic properties of the
rotation group SO(n), which is a connected analytic Lie group, one can further prove that the sphere S^(n−1) can be
partitioned into as many pieces as there are real numbers (that is, 2^ℵ₀ pieces), so that each piece is
equidecomposable with two pieces to S^(n−1) using rotations. These results then extend to the unit ball deprived of the
origin. A 2010 article by Valeriy Churkin gives a new proof of the continuous version of the Banach–Tarski
paradox.[7]

The von Neumann paradox in the Euclidean plane


Main article: von Neumann paradox
In the Euclidean plane, two figures that are equidecomposable with respect to the group of Euclidean motions are
necessarily of the same area; therefore, a paradoxical decomposition of a square or disk of Banach–Tarski type that
uses only Euclidean congruences is impossible. A conceptual explanation of the distinction between the planar and
higher-dimensional cases was given by John von Neumann: unlike the group SO(3) of rotations in three dimensions,
the group E(2) of Euclidean motions of the plane is solvable, which implies the existence of a finitely-additive
measure on E(2) and R² which is invariant under translations and rotations, and rules out paradoxical decompositions
of non-negligible sets. Von Neumann then posed the following question: can such a paradoxical decomposition be
constructed if one allows a larger group of equivalences?

It is clear that if one permits similarities, any two squares in the plane become equivalent even without further
subdivision. This motivates restricting one's attention to the group SA2 of area-preserving affine transformations.
Since the area is preserved, any paradoxical decomposition of a square with respect to this group would be
counterintuitive for the same reasons as the Banach–Tarski decomposition of a ball. In fact, the group SA2 contains
as a subgroup the special linear group SL(2,R), which in its turn contains the free group F2 with two generators as a
subgroup. This makes it plausible that the proof of the Banach–Tarski paradox can be imitated in the plane. The main
difficulty here lies in the fact that the unit square is not invariant under the action of the linear group SL(2,R), hence
one cannot simply transfer a paradoxical decomposition from the group to the square, as in the third step of the
above proof of the Banach–Tarski paradox. Moreover, the fixed points of the group present difficulties (for example,
the origin is fixed under all linear transformations). This is why von Neumann used the larger group SA2 including
the translations, and he constructed a paradoxical decomposition of the unit square with respect to the enlarged group
(in 1929). Applying the Banach–Tarski method, the paradox for the square can be strengthened as follows:

Any two bounded subsets of the Euclidean plane with non-empty interiors are equidecomposable with respect
to the area-preserving affine maps.
As von Neumann notes,[8]
"Infolgedessen gibt es bereits in der Ebene kein nichtnegatives additives Ma (wo das Einheitsquadrat das
Ma 1 hat), das gegenber allen Abbildungen von A2 invariant wre."
"In accordance with this, already in the plane there is no nonnegative additive measure (for which the unit
square has a measure of 1), which is invariant with respect to all transformations belonging to A2 [the group of
area-preserving affine transformations]."
To explain this a bit more, the question of whether a finitely additive measure exists, that is preserved under certain
transformations, depends on what transformations are allowed. The Banach measure of sets in the plane, which is
preserved by translations and rotations, is not preserved by non-isometric transformations even when they do
preserve the area of polygons. The points of the plane (other than the origin) can be divided into two dense sets
which we may call A and B. If the A points of a given polygon are transformed by a certain area-preserving
transformation and the B points by another, both sets can become subsets of the A points in two new polygons. The
new polygons have the same area as the old polygon, but the two transformed sets cannot have the same measure as
before (since they contain only part of the A points), and therefore there is no measure that "works".
The class of groups isolated by von Neumann in the course of study of the Banach–Tarski phenomenon turned out to be
very important for many areas of mathematics: these are amenable groups, or groups with an invariant mean, and
include all finite and all solvable groups. Generally speaking, paradoxical decompositions arise when the group used
for equivalences in the definition of equidecomposability is not amenable.


Recent progress
2000. Von Neumann's paper left open the possibility of a paradoxical decomposition of the interior of the unit
square with respect to the linear group SL(2,R) (Wagon, Question 7.4). In 2000, Miklós Laczkovich proved that
such a decomposition exists. More precisely, let A be the family of all bounded subsets of the plane with
non-empty interior and at a positive distance from the origin, and B the family of all planar sets with the property
that a union of finitely many translates under some elements of SL(2,R) contains a punctured neighbourhood of
the origin. Then all sets in the family A are SL(2,R)-equidecomposable, and likewise for the sets in B. It follows
that both families consist of paradoxical sets.

2003. It had been known for a long time that the full plane was paradoxical with respect to SA2, and that the
minimal number of pieces would equal four provided that there exists a locally commutative free subgroup of
SA2. In 2003 Kenzi Satô constructed such a subgroup, confirming that four pieces suffice.

Notes
[1] Five pieces will suffice.
[2] Wagon, Corollary 13.3
[3] Robinson, R. M. (1947). "On the Decomposition of Spheres." Fund. Math. 34: 246–260. This article, based on an analysis of the Hausdorff
paradox, settled a question put forth by von Neumann in 1929.
[4] Wagon, Corollary 13.3
[5] Wagon, p. 16.
[6] Bergeron, Maxime. "Invariant Measures, Expanders and Property T".
[7] Full text in Russian is available from the Mathnet.ru page (http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=al&paperid=431&option_lang=eng).
[8] On p. 85.

References
Banach, Stefan; Tarski, Alfred (1924). "Sur la décomposition des ensembles de points en parties respectivement
congruentes" (http://matwbn.icm.edu.pl/ksiazki/fm/fm6/fm6127.pdf) (PDF). Review at JFM (http://www.emis.de/cgi-bin/JFM-item?50.0370.02). Fundamenta Mathematicae 6: 244–277.
Churkin, V. A. (2010). "A continuous version of the Hausdorff–Banach–Tarski paradox". Algebra and Logic 49
(1): 91–98. doi: 10.1007/s10469-010-9080-y (http://dx.doi.org/10.1007/s10469-010-9080-y).
Edward Kasner & James Newman (1940) Mathematics and the Imagination, pp. 205–7, Simon & Schuster.
Kuro5hin. "Layman's Guide to the Banach–Tarski Paradox" (http://www.kuro5hin.org/story/2003/5/23/134430/275).
Stromberg, Karl (March 1979). "The Banach–Tarski paradox". The American Mathematical Monthly
(Mathematical Association of America) 86 (3): 151–161. doi: 10.2307/2321514 (http://dx.doi.org/10.2307/2321514). JSTOR 2321514 (http://www.jstor.org/stable/2321514).
Su, Francis E. "The Banach–Tarski Paradox" (http://www.math.hmc.edu/~su/papers.dir/banachtarski.pdf) (PDF).
von Neumann, John (1929). "Zur allgemeinen Theorie des Masses" (http://matwbn.icm.edu.pl/ksiazki/fm/fm13/fm1316.pdf) (PDF). Fundamenta Mathematicae 13: 73–116.
Wagon, Stan (1994). The Banach–Tarski Paradox. Cambridge: Cambridge University Press.
ISBN 0-521-45704-1.
Wapner, Leonard M. (2005). The Pea and the Sun: A Mathematical Paradox (http://gen.lib.rus.ec/get?md5=59f223482492f1644b1023fccd4968f1). Wellesley, Mass.: A.K. Peters. ISBN 1-56881-213-2.


External links
The Banach–Tarski Paradox (http://demonstrations.wolfram.com/TheBanachTarskiParadox/) by Stan Wagon
(Macalester College), the Wolfram Demonstrations Project.
Irregular Webcomic! #2339 (http://www.irregularwebcomic.net/2339.html) by David Morgan-Mar provides a
non-technical explanation of the paradox. It includes a step-by-step demonstration of how to create two spheres
from one.

Berkson's paradox
Berkson's paradox, also known as Berkson's bias or Berkson's fallacy, is a result in conditional probability and
statistics which is counterintuitive for some people, and hence a veridical paradox. It is a complicating factor arising
in statistical tests of proportions. Specifically, it arises when there is an ascertainment bias inherent in a study design.
It is often described in the fields of medical statistics or biostatistics, as in the original description of the problem by
Joseph Berkson.

Statement
The result is that two independent events become conditionally dependent (negatively dependent) given that at least
one of them occurs. Symbolically:
if 0 < P(A) < 1 and 0 < P(B) < 1,
and P(A|B) = P(A), i.e. they are independent,
then P(A|B,C) < P(A|C), where C = A∪B (i.e. A or B).
In words, given two independent events, if you only consider outcomes where at least one occurs, then they become
negatively dependent.

Explanation
The cause is that the conditional probability of event A occurring, given that it or B occurs, is inflated: it is higher
than the unconditional probability, because we have excluded cases where neither occurs.

P(A|A∪B) > P(A)
conditional probability inflated relative to unconditional

One can see this in tabular form as follows: the gray regions are the outcomes where at least one event occurs (and
~A means "not A").

          A          ~A
B       A & B      ~A & B
~B      A & ~B     ~A & ~B

For instance, if one has a sample of 100, and both A and B occur independently half the time (so P(A) = P(B) = 1/2),
one obtains:

        A    ~A
B      25    25
~B     25    25

So in 75 outcomes, either A or B occurs, of which 50 have A occurring, so

P(A|A∪B) = 50/75 = 2/3 > 1/2 = 50/100 = P(A).

Thus the probability of A is higher in the subset (of outcomes where it or B occurs), 2/3, than in the overall
population, 1/2.

Berkson's paradox arises because the conditional probability of A given B within this subset equals the conditional
probability in the overall population, but the unconditional probability within the subset is inflated relative to the
unconditional probability in the overall population; hence, within the subset, the presence of B decreases the
conditional probability of A (back to its overall unconditional probability):

P(A|B, A∪B) = P(A|B) = P(A)
P(A|A∪B) > P(A).

Examples
Berkson's original illustration involves a retrospective study examining a risk factor for a disease in a statistical
sample from a hospital in-patient population. If a control group is also ascertained from the in-patient population, a
difference in hospital admission rates for the control sample and case sample can result in a spurious negative
association between the disease and the risk factor. For example, a hospital patient without diabetes is more likely to
have cholecystitis, since they must have had some non-diabetes reason to enter the hospital in the first place.
An example presented by Jordan Ellenberg: Suppose you will only date a man if his niceness plus his handsomeness
exceeds some threshold. Then nicer men do not have to be as handsome in order to qualify for your dating pool. So,
among the men that you date, you may observe that the nicer ones are less handsome on average (and vice-versa),
even if these traits are uncorrelated in the general population.
Note that this does not mean that men in your dating pool compare unfavorably with men in the population. On the
contrary, your selection criterion means that you have high standards. The average nice man that you date is actually
more handsome than the average man in the population (since even among nice men, you skip the ugliest portion of
the population). Berkson's negative correlation is an effect that arises within your dating pool: the rude men that you
date must have been even more handsome to qualify.
As a quantitative example, suppose a collector has 1000 postage stamps, of which 300 are pretty and 100 are rare,
with 30 being both pretty and rare. 10% of all her stamps are rare and 10% of her pretty stamps are rare, so prettiness
tells nothing about rarity. She puts the 370 stamps which are pretty or rare on display. Just over 27% of the stamps
on display are rare, but still only 10% of the pretty stamps are rare (and 100% of the 70 not-pretty stamps on display
are rare). If an observer only considers stamps on display, he will observe a spurious negative relationship between
prettiness and rarity as a result of the selection bias (that is, not-prettiness strongly indicates rarity in the display, but
not in the total collection).


References
Berkson, Joseph (June 1946). "Limitations of the Application of Fourfold Table Analysis to Hospital Data".
Biometrics Bulletin 2 (3): 47–53. doi:10.2307/3002000 [1]. JSTOR 3002000 [2]. (The paper is frequently miscited
as Berkson, J. (1949) Biological Bulletin 2, 47–53.)

External links
Jordan Ellenberg, "Why are handsome men such jerks? [3]"

References
[1] http://dx.doi.org/10.2307%2F3002000
[2] http://www.jstor.org/stable/3002000
[3] http://www.slate.com/blogs/how_not_to_be_wrong/2014/06/03/berkson_s_fallacy_why_are_handsome_men_such_jerks.html

Bertrand's box paradox


For other paradoxes by Joseph Bertrand, see Bertrand's paradox (disambiguation).
Bertrand's box paradox is a classic paradox of elementary probability theory. It was first posed by Joseph Bertrand
in his Calcul des probabilités, published in 1889.
There are three boxes:
1. a box containing two gold coins,
2. a box containing two silver coins,
3. a box containing one gold coin and one silver coin.
After choosing a box at random and withdrawing one coin at random, if that happens to be a gold coin, it may seem
that the probability that the remaining coin is gold is 1/2; in fact, the probability is actually 2/3. Two problems that are
very similar are the Monty Hall problem and the Three Prisoners problem.
These simple but slightly counterintuitive puzzles are used as a standard example in teaching probability theory.
Their solution illustrates some basic principles, including the Kolmogorov axioms.

Box version
There are three boxes, each with one drawer on each of two sides. Each drawer contains a coin. One box has a gold
coin on each side (GG), one a silver coin on each side (SS), and the other a gold coin on one side and a silver coin
on the other (GS). A box is chosen at random, a random drawer is opened, and a gold coin is found inside it. What is
the chance of the coin on the other side being gold?
The following reasoning appears to give a probability of 1/2:

Originally, all three boxes were equally likely to be chosen.


The chosen box cannot be box SS.
So it must be box GG or GS.
The two remaining possibilities are equally likely. So the probability that the box is GG, and the other coin is
also gold, is 1/2.

The flaw is in the last step. While those two cases were originally equally likely, the fact that you are certain to find a
gold coin if you had chosen the GG box, but are only 50% sure of finding a gold coin if you had chosen the GS box,
means they are no longer equally likely given that you have found a gold coin. Specifically:
The probability that GG would produce a gold coin is 1.



The probability that SS would produce a gold coin is 0.
The probability that GS would produce a gold coin is 1/2.
Initially GG, SS and GS are equally likely. Therefore, by Bayes' rule, the conditional probability that the chosen box
is GG, given we have observed a gold coin, is:

P(GG | gold seen) = (1 × 1/3) / (1 × 1/3 + 0 × 1/3 + 1/2 × 1/3) = 2/3.
The correct answer of 2/3 can also be obtained as follows:

Originally, all six coins were equally likely to be chosen.

The chosen coin cannot be from drawer S of box GS, or from either drawer of box SS.
So it must come from the G drawer of box GS, or either drawer of box GG.
The three remaining possibilities are equally likely, so the probability that the drawer is from box GG is 2/3.

Alternatively, one can simply note that the chosen box has two coins of the same type 2/3 of the time. So, regardless
of what kind of coin is in the chosen drawer, the box has two coins of that type 2/3 of the time. In other words, the
problem is equivalent to asking the question "What is the probability that I will pick a box with two coins of the
same color?".
Bertrand's point in constructing this example was to show that merely counting cases is not always proper. Instead,
one should sum the probabilities that the cases would produce the observed result; and the two methods are
equivalent only if this probability is either 1 or 0 in every case. This condition is correctly applied in the second
solution method, but not in the first.
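
The 2/3 answer is also easy to confirm by simulation. A minimal Python sketch (the box encoding is an arbitrary illustrative choice) draws a random box and a random drawer, conditions on seeing gold, and tallies how often the remaining coin is gold:

import random

BOXES = [("G", "G"), ("S", "S"), ("G", "S")]

def trial():
    box = random.choice(BOXES)           # choose a box at random
    drawer = random.randrange(2)         # open one of its two drawers
    return box[drawer], box[1 - drawer]  # (coin seen, coin remaining)

gold_seen = other_gold = 0
for _ in range(100_000):
    seen, other = trial()
    if seen == "G":                      # keep only trials where gold appears
        gold_seen += 1
        other_gold += (other == "G")

print(other_gold / gold_seen)  # about 0.667, i.e. 2/3 rather than 1/2

The conditioning happens by discarding trials in which a silver coin was drawn, which is exactly the step the naive 1/2 argument fails to perform.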

The paradox as stated by Bertrand


It can be easier to understand the correct answer if you consider the paradox as Bertrand originally described it. After
a box has been chosen, but before a box is opened to let you observe a coin, the probability is 2/3 that the box has
two of the same kind of coin. If the probability of "observing a gold coin" in combination with "the box has two of
the same kind of coin" is 1/2, then the probability of "observing a silver coin" in combination with "the box has two
of the same kind of coin" must also be 1/2. And if the probability that the box has two like coins changes to 1/2 no
matter what kind of coin is shown, the probability would have to be 1/2 even if you hadn't observed a coin this way.
Since we know this probability is 2/3, not 1/2, we have an apparent paradox. It can be resolved only by recognizing
how the combination of "observing a gold coin" with each possible box can only affect the probability that the box
was GS or SS, but not GG.

Card version
Suppose there are three cards:
A black card that is black on both sides,
A white card that is white on both sides, and
A mixed card that is black on one side and white on the other.
All the cards are placed into a hat and one is pulled at random and placed on a table. The side facing up is black.
What are the odds that the other side is also black?
The answer is that the other side is black with probability 2/3. However, common intuition suggests a probability of
1/2, either because there are two cards with black on them that this card could be, or because there are 3 white and 3
black sides and many people forget to eliminate the possibility of the "white card" in this situation (i.e. the card they
flipped CANNOT be the "white card" because a black side was turned over).
In a survey of 53 Psychology freshmen taking an introductory probability course, 35 incorrectly responded 1/2; only
3 students correctly responded 2/3.[1]



Another presentation of the problem is to say: pick a random card out of the three; what are the odds that it has the
same color on the other side? Since only one card is mixed and two have the same color on their sides, it is easier to
understand that the probability is 2/3. Also note that saying that the color is black (or the coin is gold) instead of
white doesn't matter, since it is symmetric: the answer is the same for white. So is the answer for the generic question
"same color on both sides".

Preliminaries
To solve the problem, either formally or informally, one must assign probabilities to the events of drawing each of
the six faces of the three cards. These probabilities could conceivably be very different; perhaps the white card is
larger than the black card, or the black side of the mixed card is heavier than the white side. The statement of the
question does not explicitly address these concerns. The only constraints implied by the Kolmogorov axioms are that
the probabilities are all non-negative, and they sum to 1.
The custom in problems when one literally pulls objects from a hat is to assume that all the drawing probabilities are
equal. This forces the probability of drawing each side to be 1/6, and so the probability of drawing a given card is 1/3.
In particular, the probability of drawing the double-white card is 1/3, and the probability of drawing a different card is
2/3.
In question, however, one has already selected a card from the hat and it shows a black face. At first glance it
appears that there is a 50/50 chance (i.e. probability 1/2) that the other side of the card is black, since there are two
cards it might be: the black and the mixed. However, this reasoning fails to exploit all of the information; one knows
not only that the card on the table has at least one black face, but also that in the population it was selected from,
only 1 of the 3 black faces was on the mixed card.
An easy explanation is to name the black sides as x, y and z, where x and y are on the same card while z is on the
mixed card; then the probability is divided over the 3 black sides, with 1/3 each. Thus the probability that we chose
either x or y is the sum of their probabilities, 2/3.

Solutions
Intuition
Intuition tells one that one is choosing a card at random. However, one is actually choosing a face at random. There
are 6 faces, of which 3 faces are white and 3 faces are black. Two of the 3 black faces belong to the same card. The
chance of choosing one of those 2 faces is 2/3. Therefore, the chance of flipping the card over and finding another
black face is also 2/3. Another way of thinking about it is that the problem is not about the chance that the other side
is black, it's about the chance that you drew the all-black card. If you drew a black face, it is twice as likely that
that face belongs to the black card as to the mixed card.
Alternately, it can be seen as a bet not on a particular color, but a bet that the sides match. Betting on a particular
color, regardless of the face shown, will always have a chance of 1/2. However, betting that the sides match wins
with probability 2/3, because 2 cards match and 1 does not.
Labels
One solution method is to label the card faces, for example with the numbers 1 through 6.[2] Label the faces of the black card
1 and 2; label the faces of the mixed card 3 (black) and 4 (white); and label the faces of the white card 5 and 6. The
observed black face could be 1, 2, or 3, all equally likely; if it is 1 or 2, the other side is black, and if it is 3, the other
side is white. The probability that the other side is black is 2/3. This probability can be derived as follows: let B be
the event that the hidden face is black (a "success", since a black face is what we are looking for). Given that the
shown face is black, it is face 1, 2 or 3, each with probability 1/3, and B occurs exactly when the shown face is 1 or
2, so P(B) = P(1) + P(2) = 1/3 + 1/3 = 2/3. Likewise, by Kolmogorov's axiom that the probabilities must sum to 1,
P(white face) = 1 − P(B) = 1 − 2/3 = 1/3.
Bayes' theorem
Given that the shown face is black, the other face is black if and only if the card is the black card. If the black card is
drawn, a black face is shown with probability 1. The total probability of seeing a black face is 1/2; the total
probability of drawing the black card is 1/3. By Bayes' theorem, the conditional probability of having drawn the black
card, given that a black face is showing, is

P(black card | black face) = P(black face | black card) × P(black card) / P(black face) = (1 × 1/3) / (1/2) = 2/3.
It can be more intuitive to present this argument using Bayes' rule rather than Bayes' theorem[3]. Having seen a
black face we can rule out the white card. We are interested in the probability that the card is black given a black
face is showing. Initially it is equally likely that the card is black and that it is mixed: the prior odds are 1:1. Given
that it is black we are certain to see a black face, but given that it is mixed we are only 50% certain to see a black
face. The ratio of these probabilities, called the likelihood ratio or Bayes factor, is 2:1. Bayes' rule says "posterior
odds equals prior odds times likelihood ratio". Since the prior odds are 1:1 the posterior odds equals the likelihood
ratio, 2:1. It is now twice as likely that the card is black as that it is mixed.
Eliminating the white card
Although the incorrect solution reasons that the white card is eliminated, one can also use that information in a
correct solution. Modifying the previous method, given that the white card is not drawn, the probability of seeing a
black face is 3/4, and the probability of drawing the black card is 1/2. The conditional probability of having drawn the
black card, given that a black face is showing, is

P(black card | black face) = (1 × 1/2) / (3/4) = 2/3.
Symmetry
The probability (without considering the individual colors) that the hidden color is the same as the displayed color is
clearly 2/3, as this holds if and only if the chosen card is black or white, which chooses 2 of the 3 cards. Symmetry
suggests that the probability is independent of the color chosen, so that the information about which color is shown
does not affect the odds that both sides have the same color.
This argument is correct and can be formalized as follows. By the law of total probability, the probability that the
hidden color is the same as the displayed color equals the weighted average of the probabilities that the hidden color
is the same as the displayed color, given that the displayed color is black or white respectively (the weights are the
probabilities of seeing black and white respectively). By symmetry, the two conditional probabilities that the colours
are the same given we see black and given we see white are the same. Since they moreover average out to 2/3 they
must both be equal to 2/3.
Experiment
Using specially constructed cards, the choice can be tested a number of times. By constructing a fraction with the
denominator being the number of times "B" is on top, and the numerator being the number of times both sides are
"B", the experimenter will probably find the ratio to be near 2/3.
Note that the B/B card contributes significantly more (in fact twice as much) to the number of times "B" is on
top. With the B/W card there is always a 50% chance of W being on top, so in half of the draws of that card the draw
affects neither numerator nor denominator and effectively does not count (this is also true for every draw of the W/W
card, so that card might as well be removed from the set altogether). In conclusion, the cards B/B and B/W do not
have equal chances of being counted, because in the 50% of cases where B/W shows white, that draw is simply
"disqualified".
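For readers who would rather test this numerically than with physical cards, here is a minimal simulation sketch in Python (an illustration added here, not part of the original text; the trial count is an arbitrary choice):

    import random

    CARDS = [("B", "B"), ("B", "W"), ("W", "W")]

    def simulate(trials=100_000):
        black_on_top = 0   # denominator: draws that show a black face
        both_black = 0     # numerator: of those, the hidden face is also black
        for _ in range(trials):
            card = random.choice(CARDS)             # draw a card uniformly
            shown, hidden = random.sample(card, 2)  # pick which face is up
            if shown == "B":
                black_on_top += 1
                both_black += hidden == "B"
        return both_black / black_on_top

    print(simulate())  # tends to land near 2/3, not 1/2

Draws of W/W, and the half of the B/W draws that show white, fall out of the ratio exactly as described above.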


Related problems

Boy or Girl paradox
Three Prisoners problem
Two envelopes problem
Sleeping Beauty problem

Notes and references


1. ^ Bar-Hillel and Falk (page 119)
2. ^ Nickerson (page 158) advocates this solution as "less confusing" than other methods.
3. ^ Bar-Hillel and Falk (page 120) advocate using Bayes' Rule.
Bar-Hillel, Maya; Falk, Ruma (1982). "Some teasers concerning conditional probabilities". Cognition 11 (2):
109–122. doi:10.1016/0010-0277(82)90021-X. PMID 7198956 (http://www.ncbi.nlm.nih.gov/pubmed/7198956).
Nickerson, Raymond (2004). Cognition and Chance: The psychology of probabilistic reasoning. Lawrence
Erlbaum. Ch. 5, "Some instructive problems: Three cards", pp. 157–160. ISBN 0-8058-4898-3.
Michael Clark, Paradoxes from A to Z, p. 16.
Howard Margolis, Wason, Monty Hall, and Adverse Defaults (http://harrisschool.uchicago.edu/About/publications/working-papers/pdf/wp_05_14.pdf).


Bertrand paradox
For other paradoxes by Joseph Bertrand, see Bertrand's paradox (disambiguation).
The Bertrand paradox is a problem within the classical interpretation of probability theory. Joseph Bertrand
introduced it in his work Calcul des probabilités (1889) as an example to show that probabilities may not be well
defined if the mechanism or method that produces the random variable is not clearly defined.

Bertrand's formulation of the problem


The Bertrand paradox goes as follows: Consider an equilateral triangle inscribed in a circle. Suppose a chord of the
circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?
Bertrand gave three arguments, all apparently valid, yet yielding different results.
The "random endpoints" method: Choose two random points on the
circumference of the circle and draw the chord joining them. To
calculate the probability in question imagine the triangle rotated so its
vertex coincides with one of the chord endpoints. Observe that if the
other chord endpoint lies on the arc between the endpoints of the
triangle side opposite the first point, the chord is longer than a side of
the triangle. The length of the arc is one third of the circumference of
the circle, therefore the probability that a random chord is longer than a
side of the inscribed triangle is 1/3.

[Figure: Random chords, selection method 1; red = longer than triangle side, blue = shorter.]

The "random radius" method: Choose a radius of the circle, choose a


point on the radius and construct the chord through this point and
perpendicular to the radius. To calculate the probability in question
imagine the triangle rotated so a side is perpendicular to the radius. The
chord is longer than a side of the triangle if the chosen point is nearer
the center of the circle than the point where the side of the triangle
intersects the radius. The side of the triangle bisects the radius,
therefore the probability a random chord is longer than a side of the
inscribed triangle is 1/2.

Random chords, selection method 2


The "random midpoint" method: Choose a point anywhere within the


circle and construct a chord with the chosen point as its midpoint. The
chord is longer than a side of the inscribed triangle if the chosen point
falls within a concentric circle of radius 1/2 the radius of the larger
circle. The area of the smaller circle is one fourth the area of the larger
circle, therefore the probability a random chord is longer than a side of
the inscribed triangle is 1/4.
The selection methods can also be visualized as follows. A chord is
uniquely identified by its midpoint. Each of the three selection methods
presented above yields a different distribution of midpoints. Methods 1
and 2 yield two different nonuniform distributions, while method 3
yields a uniform distribution. On the other hand, if one looks at the
images of the chords below, the chords of method 2 give the circle a
homogeneously shaded look, while method 1 and 3 do not.

Random chords, selection method 3

[Figures: Midpoints of chords chosen at random, methods 1–3; chords chosen at random, methods 1–3.]

Other distributions can easily be imagined, many of which will yield a different proportion of chords which are
longer than a side of the inscribed triangle.
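The divergence of the three answers is easy to reproduce numerically. The following Monte Carlo sketch (in Python; an illustration added here with arbitrary function names and trial count, not part of Bertrand's treatment) samples chords by each method and estimates the probability that a chord is longer than the triangle's side:

    import math
    import random

    R = 1.0
    SIDE = math.sqrt(3) * R   # side of the inscribed equilateral triangle

    def chord_endpoints():
        # Method 1: two uniform points on the circumference.
        a = random.uniform(0, 2 * math.pi)
        b = random.uniform(0, 2 * math.pi)
        return 2 * R * abs(math.sin((a - b) / 2))

    def chord_radius():
        # Method 2: uniform point along a radius, chord perpendicular to it.
        d = random.uniform(0, R)
        return 2 * math.sqrt(R * R - d * d)

    def chord_midpoint():
        # Method 3: uniform midpoint in the disk; the distance of such a
        # point from the centre has density 2r/R^2, i.e. r = R * sqrt(u).
        d = R * math.sqrt(random.random())
        return 2 * math.sqrt(R * R - d * d)

    def estimate(method, trials=100_000):
        return sum(method() > SIDE for _ in range(trials)) / trials

    for name, method in [("endpoints", chord_endpoints),
                         ("radius", chord_radius),
                         ("midpoint", chord_midpoint)]:
        print(name, estimate(method))
    # Prints roughly 0.333, 0.500 and 0.250 respectively.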


Classical solution
The problem's classical solution thus hinges on the method by which a chord is chosen "at random". It turns out that
the problem has a well-defined solution if, and only if, the method of random selection is specified. There is
no unique selection method, so there cannot be a unique solution. The three solutions presented by Bertrand
correspond to different selection methods, and in the absence of further information there is no reason to prefer one
over another.
This and other paradoxes of the classical interpretation of probability justified more stringent formulations, including
frequentist probability and subjectivist Bayesian probability.

Jaynes' solution using the "maximum ignorance" principle


In his 1973 paper "The Well-Posed Problem", Edwin Jaynes proposed a solution to Bertrand's paradox, based on the
principle of "maximum ignorance": we should not use any information that is not given in the statement of the
problem. Jaynes pointed out that Bertrand's problem does not specify the position or size of the circle, and argued
that therefore any definite and objective solution must be "indifferent" to size and position. In other words: the
solution must be both scale and translation invariant.
To illustrate: assume that chords are laid at random onto a circle with a diameter of 2, for example by throwing
straws onto it from far away. Now another circle with a smaller diameter (e.g., 1.1) is laid into the larger circle. Then
the distribution of the chords on that smaller circle needs to be the same as on the larger circle. If the smaller circle is
moved around within the larger circle, the probability must not change either. It can be seen very easily that there
would be a change for method 3: the chord distribution on the small red circle looks qualitatively different from the
distribution on the large circle.

The same occurs for method 1, though it is harder to see in a graphical representation. Method 2 is the only one that
is both scale invariant and translation invariant; method 3 is just scale invariant, method 1 is neither.
However, Jaynes did not just use invariances to accept or reject given methods: this would leave the possibility that
there is another not yet described method that would meet his common-sense criteria. Jaynes used the integral
equations describing the invariances to directly determine the probability distribution. In this problem, the integral
equations indeed have a unique solution, and it is precisely what was called "method 2" above, the random radius
method.

Physical experiments
"Method 2" is the only solution that fulfills the transformation invariants that are present in certain physical
systemssuch as in statistical mechanics and gas physicsas well as in Jaynes's proposed experiment of throwing
straws from a distance onto a small circle. Nevertheless, one can design other practical experiments that give
answers according to the other methods. For example, in order to arrive at the solution of "method 1", the random
endpoints method, one can affix a spinner to the center of the circle, and let the results of two independent spins
mark the endpoints of the chord. In order to arrive at the solution of "method 3", one could cover the circle with
molasses and mark the first point that a fly lands on as the midpoint of the chord. Several observers have designed
experiments in order to obtain the different solutions and verified the results empirically.


Birthday problem
In probability theory, the birthday problem or birthday paradox[1] concerns the probability that, in a set of n
randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability
reaches 100% when the number of people reaches 367 (since there are 366 possible birthdays, including February
29). However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. These
conclusions include the assumption that each day of the year (except February 29) is equally probable for a birthday.
The history of the problem is obscure, but W. W. Rouse Ball indicated (without citation) that it was first discussed
by an"H. Davenport", almost certainly Harold Davenport.[2]
The mathematics behind this problem led to a well-known cryptographic attack called the birthday attack, which
uses this probabilistic model to reduce the complexity of cracking a hash function.


Understanding the problem


The birthday problem is to find the probability that, in a group of N people, there is at least one pair of people who
have the same birthday. See "Same birthday as you" further below for an analysis of the case of finding the
probability of a given, fixed person having the same birthday as any of the remaining N − 1.
[Figure: A graph showing the computed probability of at least two people sharing a birthday amongst a certain number of people.]
In the example given earlier, a list of 23 people, comparing the birthday of the first person on the list to the others
allows 22 chances for a matching birthday, the second person on the list to the others allows 21 chances for a
matching birthday (in fact the 'second' person also has a total of 22 chances of matching birthday with the others,
but his/her chance of matching birthday with the 'first' person, one chance, has already been counted with the first
person's 22 chances and shall not be duplicated), the third person has 20 chances, and so on. Hence total chances
are: 22 + 21 + 20 + ... + 1 = 253, so comparing every person to all of the others allows 253 distinct chances
(combinations): in a group of 23 people there are C(23, 2) = 23 × 22 / 2 = 253 distinct possible combinations of
pairing.
Presuming all birthdays are equally probable,[3] the probability of a given birthday for a person chosen from the
entire population at random is 1/365 (ignoring "leap day", February 29). Although the number of pairings in a group
of 23 people is not statistically equivalent to 253 pairs chosen independently, the birthday paradox becomes less
surprising if a group is thought of in terms of the number of possible pairs, rather than as the number of individuals.

Calculating the probability


The problem is to compute the approximate probability that in a room of n people, at least two have the same
birthday. For simplicity, disregard variations in the distribution, such as leap years, twins, seasonal or weekday
variations, and assume that the 365 possible birthdays are equally likely. Real-life birthday distributions are not
uniform since not all dates are equally likely.
If P(A) is the probability of at least two people in the room having the same birthday, it may be simpler to calculate
P(A'), the probability of there not being any two people having the same birthday. Then, because A and A' are the
only two possibilities and are also mutually exclusive, P(A) = 1 − P(A').
In deference to widely published solutions concluding that 23 is the number of people necessary to have a P(A) that
is greater than 50%, the following calculation of P(A) will use 23 people as an example.
When events are independent of each other, the probability of all of the events occurring is equal to a product of the
probabilities of each of the events occurring. Therefore, if P(A') can be described as 23 independent events, P(A')
could be calculated as P(1) × P(2) × P(3) × ... × P(23).
The 23 independent events correspond to the 23 people, and can be defined in order. Each event can be defined as
the corresponding person not sharing his/her birthday with any of the previously analyzed people. For Event 1, there
are no previously analyzed people. Therefore, the probability, P(1), that person number 1 does not share his/her
birthday with previously analyzed people is 1, or 100%. Ignoring leap years for this analysis, the probability of 1 can
also be written as 365/365, for reasons that will become clear below.


For Event 2, the only previously analyzed people is Person 1. Assuming that birthdays are equally likely to happen
on each of the 365 days of the year, the probability, P(2), that Person 2 has a different birthday than Person 1 is
364/365. This is because, if Person 2 was born on any of the other 364 days of the year, Persons 1 and 2 will not
share the same birthday.
Similarly, if Person 3 is born on any of the 363 days of the year other than the birthdays of Persons 1 and 2, Person 3
will not share their birthday. This makes the probability P(3)=363/365.
This analysis continues until Person 23 is reached, whose probability of not sharing his/her birthday with people
analyzed before, P(23), is 343/365.
P(A') is equal to the product of these individual probabilities:
(1) P(A') = 365/365 × 364/365 × 363/365 × 362/365 × ... × 343/365
The terms of equation (1) can be collected to arrive at:
(2) P(A') = (1/365)^23 × (365 × 364 × 363 × ... × 343)
Evaluating equation (2) gives P(A') ≈ 0.492703.
Therefore, P(A) ≈ 1 − 0.492703 = 0.507297 (50.7297%).
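As a quick check, equation (2) is easy to evaluate directly; a short Python sketch (added here for illustration, with a function name of our choosing):

    from math import prod

    def p_shared_birthday(n, days=365):
        # P(A') = product of (days - i)/days for i = 0 .. n-1; P(A) = 1 - P(A').
        return 1 - prod((days - i) / days for i in range(n))

    print(p_shared_birthday(23))  # ~0.507297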
This process can be generalized to a group of n people, where p(n) is the probability of at least two of the n people
sharing a birthday. It is easier to first calculate the probability p̄(n) that all n birthdays are different. According to the
pigeonhole principle, p̄(n) is zero when n > 365. When n ≤ 365:

p̄(n) = 365/365 × 364/365 × 363/365 × ... × (365 − n + 1)/365 = 365! / (365^n × (365 − n)!)

where '!' is the factorial operator; the expression can also be written as n! × C(365, n) / 365^n using the binomial
coefficient, and 365!/(365 − n)! denotes the number of permutations.
The equation expresses the fact that the first person has no one to share a birthday, the second person cannot have the
same birthday as the first (364/365), the third cannot have the same birthday as the first two (363/365), and in
general the nth birthday cannot be the same as any of the n − 1 preceding birthdays.
The event of at least two of the n persons having the same birthday is complementary to all n birthdays being
different. Therefore, its probability p(n) is

p(n) = 1 − p̄(n).

This probability surpasses 1/2 for n = 23 (with value about 50.7%). The following table shows the probability for
some other values of n (this table ignores the existence of leap years, as described above, as well as assumes each
birthday is equally likely):

[Figure: The probability that no two people share a birthday in a group of n people. Note that the vertical scale is logarithmic (each step down is 10^20 times less likely).]

n     p(n)
5     2.7%
10    11.7%
20    41.1%
23    50.7%
30    70.6%
40    89.1%
50    97.0%
60    99.4%
70    99.9%
100   99.99997%
200   99.9999999999999999999999999998%
300   (100 − 6×10^−80)%
350   (100 − 3×10^−129)%
365   (100 − 1.45×10^−155)%
366   100%


Abstract proof
Here we prove the same result as above, but with results about sets and functions to provide a simpler proof.
Firstly, define S to be a set of N people and let D be the set of dates in a year, so that |D| = 365. Define the
birthday function f : S → D to be the map that sends a person to their birthdate. So everyone in S has a unique
birthday if and only if the birthday function is injective.
Now we consider how many functions, and how many injective functions, exist between S and D. Since |S| = N and
|D| = 365, it follows that there are 365^N possible functions, and 365!/(365 − N)! possible injective functions (see
Twelvefold way#case i).
Let A be the statement "Everybody in the set S has a unique birthday" (so P(A') is what we are actually looking
for). By definition, P(A) is the fraction of injective functions out of all possible functions (i.e., the probability of the
birthday function being one that assigns only one person to each birthdate), which gives

P(A) = 365! / (365^N × (365 − N)!).

Hence,

P(A') = 1 − 365! / (365^N × (365 − N)!).

Approximations
The Taylor series expansion of the exponential function (the constant e ≈ 2.718281828)

e^x = 1 + x + x²/2! + ...

provides a first-order approximation for e^x for x ≪ 1:

e^x ≈ 1 + x.

To apply this approximation to the first expression derived for p̄(n), set x = −i/365. Thus,

e^(−i/365) ≈ 1 − i/365.

Then, replace i with non-negative integers for each term in the formula of p̄(n) until i = n − 1; for example, when
i = 1,

e^(−1/365) ≈ 1 − 1/365.

[Figure: Graphs showing the approximate probabilities of at least two people sharing a birthday (red) and its complementary event (blue).]

The first expression derived for p̄(n) can be approximated as

p̄(n) ≈ 1 × e^(−1/365) × e^(−2/365) × ... × e^(−(n−1)/365) = e^(−(1+2+...+(n−1))/365) = e^(−n(n−1)/730).

[Figure: A graph showing the accuracy of the approximation 1 − e^(−n²/730) (white).]

Therefore,

p(n) = 1 − p̄(n) ≈ 1 − e^(−n(n−1)/730).

An even coarser approximation is given by

p(n) ≈ 1 − e^(−n²/730),

which, as the graph illustrates, is still fairly accurate.


According to the approximation, the same approach can be applied to any number of "people" and "days". If rather
than 365 days there are d, if there are n persons, and if n ≪ d, then using the same approach as above we achieve the
result that if p(n, d) is the probability that at least two out of n people share the same birthday from a set of d
available days, then:

p(n, d) ≈ 1 − e^(−n(n−1)/(2d)).
A simple exponentiation
The probability of any two people not having the same birthday is 364/365. In a room containing n people, there are
C(n, 2) = n(n − 1)/2 pairs of people, i.e. C(n, 2) events. The probability of no two people sharing the same birthday
can be approximated by assuming that these events are independent and hence by multiplying their probability
together. In short, 364/365 can be multiplied by itself C(n, 2) times, which gives us

p̄(n) ≈ (364/365)^(n(n−1)/2).

Since this is the probability of no one having the same birthday, the probability of someone sharing a birthday is

p(n) ≈ 1 − (364/365)^(n(n−1)/2).
Poisson approximation
Applying the Poisson approximation for the binomial on the group of 23 people, the number of shared-birthday
pairs is approximately Poisson with mean

C(23, 2)/365 = 253/365 ≈ 0.6932,

so the probability of at least one shared birthday is approximately 1 − e^(−0.6932) ≈ 0.500. The result is over 50%,
as in the previous descriptions.

Square approximation
A good rule of thumb which can be used for mental calculation is the relation

p(n) ≈ n² / (2m)

(where m is the number of possible days), which can also be written as

n ≈ √(2m × p(n)),

and which works well for probabilities less than or equal to 0.5.
For instance, to estimate the number of people required for a 0.5 chance of a shared birthday, we get

n ≈ √(2 × 365 × 0.5) = √365 ≈ 19,

which is not too far from the correct answer of 23.


This approximation scheme is especially easy to use when working with exponents. For instance, suppose you
are building 32-bit hashes (m = 2^32) and want the chance of a collision to be at most one in a million (p = 10^−6).
How many documents could we have at the most?

n ≈ √(2 × 2^32 × 10^−6) ≈ 92.7,

which is close to the correct answer of 93.
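A short sketch of the rule of thumb (illustrative only; the function name is our own choice):

    from math import sqrt

    def n_for_collision_prob(m, p):
        # Rule of thumb n ~ sqrt(2 m p), reasonable for p <= 0.5.
        return sqrt(2 * m * p)

    print(n_for_collision_prob(365, 0.5))     # ~19.1 (exact answer: 23)
    print(n_for_collision_prob(2**32, 1e-6))  # ~92.7 (exact answer: 93)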

Approximation of number of people


This can also be approximated using the following formula for the number of people necessary to have at least a
50% chance of matching:

n ≈ √(2 × 365 × ln 2) = √(730 ln 2) ≈ 22.49.

This is a result of the good approximation that an event with 1-in-k probability will have a 50% chance of occurring
at least once if it is repeated k ln 2 times.

Probability table
Main article: Birthday attack


Number of hashed elements such that probability of at least one hash collision = p

#bits  hex chars  hash space (2^#bits)  p=10^−18   p=10^−15   p=10^−12   p=10^−9    p=10^−6    p=0.1%     p=1%       p=25%      p=50%      p=75%
32     8          4.3×10^9              <2         <2         <2         2.9        93         2.9×10^3   9.3×10^3   5.0×10^4   7.7×10^4   1.1×10^5
64     16         1.8×10^19             6.1        1.9×10^2   6.1×10^3   1.9×10^5   6.1×10^6   1.9×10^8   6.1×10^8   3.3×10^9   5.1×10^9   7.2×10^9
128    32         3.4×10^38             2.6×10^10  8.2×10^11  2.6×10^13  8.2×10^14  2.6×10^16  8.3×10^17  2.6×10^18  1.4×10^19  2.2×10^19  3.1×10^19
256    64         1.2×10^77             4.8×10^29  1.5×10^31  4.8×10^32  1.5×10^34  4.8×10^35  1.5×10^37  4.8×10^37  2.6×10^38  4.0×10^38  5.7×10^38
(384)  (96)       (3.9×10^115)          8.9×10^48  2.8×10^50  8.9×10^51  2.8×10^53  8.9×10^54  2.8×10^56  8.9×10^56  4.8×10^57  7.4×10^57  1.0×10^58
512    128        1.3×10^154            1.6×10^68  5.2×10^69  1.6×10^71  5.2×10^72  1.6×10^74  5.2×10^75  1.6×10^76  8.8×10^76  1.4×10^77  1.9×10^77

The fields in this table show the number of hashes needed to achieve the given probability of collision
(column) given a hash space of a certain size in bits (row). Using the birthday analogy: the "hash space size"
resembles the "available days", the "probability of collision" resembles the "probability of shared birthday", and the
"required number of hashed elements" resembles the "required number of people in a group". One could of course
also use this chart to determine the minimum hash size required (given upper bounds on the hashes and probability
of error), or the probability of collision (for fixed number of hashes and probability of error).
For comparison, 10^−18 to 10^−15 is the uncorrectable bit error rate of a typical hard disk.[4] In theory, 128-bit hash
functions, such as MD5, should stay within that range until about 820 billion documents, even if its possible outputs
are many more.

An upper bound
The argument below is adapted from an argument of Paul Halmos.[5]
As stated above, the probability that no two birthdays coincide is

1 − p(n) = p̄(n) = (1 − 1/365)(1 − 2/365) ... (1 − (n−1)/365).

As in earlier paragraphs, interest lies in the smallest n such that p(n) > 1/2; or equivalently, the smallest n such that
p̄(n) < 1/2.
Using the inequality 1 − x < e^(−x) in the above expression, we replace 1 − k/365 with e^(−k/365). This yields

p̄(n) < e^(−1/365) × e^(−2/365) × ... × e^(−(n−1)/365) = e^(−n(n−1)/730).

Therefore, the expression above is not only an approximation, but also an upper bound of p̄(n). The inequality

e^(−n(n−1)/730) < 1/2

implies p̄(n) < 1/2. Solving for n gives

n² − n > 730 ln 2.

Now, 730 ln 2 is approximately 505.997, which is barely below 506, the value of n² − n attained when n = 23.
Therefore, 23 people suffice. Solving n² − n = 2 × 365 × ln 2 for n gives, by the way, the approximate formula of
Frank H. Mathis cited above.
This derivation only shows that at most 23 people are needed to ensure a birthday match with even chance; it leaves
open the possibility that n = 22 or less could also work.


Generalizations
The generalized birthday problem
Given a year with d days, the generalized birthday problem asks for the minimal number n(d) such that, in a set of
n(d) randomly chosen people, the probability of a birthday coincidence is at least 50%. In other words, n(d) is the
minimal integer n such that

1 − (1 − 1/d)(1 − 2/d) ... (1 − (n−1)/d) ≥ 1/2.

The classical birthday problem thus corresponds to determining n(365). The first 99 values of n(d) are given here:

d:    1–2  3–5  6–9  10–16  17–23  24–32  33–42  43–54  55–68  69–82  83–99
n(d): 2    3    4    5      6      7      8      9      10     11     12
A number of bounds and formulas for n(d) have been published. For any d ≥ 1, the number n(d) satisfies

√(2d ln 2) + (3 − 2 ln 2)/6 ≤ n(d) ≤ √(2d ln 2) + 9 − √(86 ln 2).

These bounds are optimal in the sense that the sequence n(d) − √(2d ln 2) gets arbitrarily close to
(3 − 2 ln 2)/6 ≈ 0.27, while it has 9 − √(86 ln 2) ≈ 1.28 as its maximum, taken for d = 43. The bounds are
sufficiently tight to give the exact value of n(d) in 99% of all cases, for example n(365) = 23. In general, it follows
from these bounds that n(d) always equals either ⌈√(2d ln 2)⌉ or ⌈√(2d ln 2)⌉ + 1, where ⌈·⌉ denotes the ceiling
function. The formula

n(d) = ⌈√(2d ln 2)⌉

holds for 73% of all integers d. The formula

n(d) = ⌈√(2d ln 2) + (3 − 2 ln 2)/6⌉

holds for almost all d, i.e., for a set of integers d with asymptotic density 1. The formula

n(d) = ⌈√(2d ln 2) + (3 − 2 ln 2)/6 + (9 − 4 ln² 2)/(72 √(2d ln 2))⌉

holds for all d up to 10^18, but it is conjectured that there are infinitely many counter-examples to this formula. The
formula

n(d) = ⌈√(2d ln 2) + (3 − 2 ln 2)/6 + (9 − 4 ln² 2)/(72 √(2d ln 2)) − 2 ln² 2/(135 d)⌉

holds too for all d up to 10^18, and it is conjectured that this formula holds for all d.
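The exact value n(d) is also easy to compute by direct search, which makes the formulas above simple to spot-check; a small Python sketch (illustrative, with helper names of our own choosing):

    from math import ceil, log, sqrt

    def n_exact(d):
        # Smallest n such that P(some coincidence among n people) >= 1/2.
        p_distinct, n = 1.0, 1
        while p_distinct > 0.5:
            n += 1
            p_distinct *= (d - (n - 1)) / d
        return n

    def n_formula(d):
        return ceil(sqrt(2 * d * log(2)))

    for d in (5, 43, 365):
        print(d, n_exact(d), n_formula(d))
    # The simple formula matches for d = 5 and d = 365, but not for d = 43.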

Cast as a collision problem


The birthday problem can be generalized as follows: given n random integers drawn from
a discrete uniform distribution with range [1, d], what is the probability p(n; d) that at least two numbers are the same?
(d = 365 gives the usual birthday problem.)
The generic results can be derived using the same arguments given above.

Conversely, if n(p; d) denotes the number of random integers drawn from [1, d] to obtain a probability p that at least
two numbers are the same, then

n(p; d) ≈ √(2d × ln(1/(1 − p))).
The birthday problem in this more generic sense applies to hash functions: the expected number of N-bit hashes that
can be generated before getting a collision is not 2^N, but rather only 2^(N/2). This is exploited by birthday attacks on
cryptographic hash functions and is the reason why a small number of collisions in a hash table are, for all practical
purposes, inevitable.
The theory behind the birthday problem was used by Zoe Schnabel[6] under the name of capture-recapture statistics
to estimate the size of fish populations in lakes.
Generalization to multiple types
The basic problem considers all trials to be of one "type". The birthday problem has been generalized to consider an
arbitrary number of types.[7] In the simplest extension there are two types of people, say m men and n women, and
the problem becomes characterizing the probability of a shared birthday between at least one man and one woman.
(Shared birthdays between, say, two women do not count.) The probability p0 of no (i.e. zero) shared birthdays here
is expressed in terms of d = 365 and the Stirling numbers of the second kind S2. Consequently, the desired
probability is 1 − p0.
This variation of the birthday problem is interesting because there is not a unique solution for the total number of
people m+n. For example, the usual 0.5 probability value is realized for both a 32-member group of 16 men and 16
women and a 49-member group of 43 women and 6 men.

Other birthday problems


Reverse problem
For a fixed probability p:
Find the greatest n for which the probability p(n) is smaller than the given p, or
Find the smallest n for which the probability p(n) is greater than the given p.
Taking the above formula for d = 365, we have:

n(p; 365) ≈ √(730 × ln(1/(1 − p))).
Sample calculations


p      n ≈ √(2×365×ln(1/(1−p)))      ⌊n⌋   p(⌊n⌋)    ⌈n⌉   p(⌈n⌉)
0.01   0.14178 × √365 = 2.70864     2     0.00274   3     0.00820
0.05   0.32029 × √365 = 6.11916     6     0.04046   7     0.05624
0.1    0.45904 × √365 = 8.77002     8     0.07434   9     0.09462
0.2    0.66805 × √365 = 12.76302    12    0.16702   13    0.19441
0.3    0.84460 × √365 = 16.13607    16    0.28360   17    0.31501
0.5    1.17741 × √365 = 22.49439    22    0.47570   23    0.50730
0.7    1.55176 × √365 = 29.64625    29    0.68097   30    0.70632
0.8    1.79412 × √365 = 34.27666    34    0.79532   35    0.81438
0.9    2.14597 × √365 = 40.99862    40    0.89123   41    0.90315
0.95   2.44775 × √365 = 46.76414    46    0.94825   47    0.95477
0.99   3.03485 × √365 = 57.98081    57    0.99012   58    0.99166

Note: some values fall outside the bounds, showing that the approximation is not always exact.

First match
A related question is, as people enter a room one at a time, which one is most likely to be the first to have the same
birthday as someone already in the room? That is, for what n is p(n) − p(n−1) maximum? The answer is 20: if
there is a prize for the first match, the best position in line is 20th.

Same birthday as you


Note that in the birthday problem, neither of the two people is chosen in advance. By way of contrast, the probability
q(n) that someone in a room of n other people has the same birthday as a particular person (for example, you) is
given by

q(n) = 1 − (364/365)^n

and for general d by

q(n; d) = 1 − ((d − 1)/d)^n.

[Figure: Comparing p(n) = probability of a birthday match with q(n) = probability of matching your birthday.]

In the standard case of d = 365, substituting n = 23 gives about 6.1%, which is less than 1 chance in 16. For a greater
than 50% chance that one person in a roomful of n people has the same birthday as you, n would need to be at least
253. Note that this number is significantly higher than 365/2 = 182.5: the reason is that it is likely that there are
some birthday matches among the other people in the room.
It is not a coincidence that 253 ≈ 365 ln 2; a similar approximate pattern can be found using a number of
possibilities different from 365, or a target probability different from 50%.
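A two-line check of q(n) in Python (illustrative; the search bound is an arbitrary choice):

    def q(n, d=365):
        # Chance that at least one of n other people matches your birthday.
        return 1 - ((d - 1) / d) ** n

    print(q(23))                                            # ~0.061
    print(next(n for n in range(1, 1000) if q(n) >= 0.5))   # 253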


Near matches
Another generalization is to ask what is the probability of finding at least one pair in a group of n people with
birthdays within k calendar days of each other, if there are m equally likely birthdays.[8]
The number of people required so that the probability that some pair will have a birthday separated by k days or
fewer will be higher than 50% is:

k    # people required (i.e. n) when m = 365
0    23
1    14
2    11
...
7    7

Thus in a group of just seven random people, it is more likely than not that two of them will have a birthday within a
week of each other.
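A Monte Carlo sketch of the near-match probability (illustrative only; day distances are measured around the m-day circle, and the trial count is arbitrary):

    import random

    def near_match_prob(n, k, m=365, trials=50_000):
        hits = 0
        for _ in range(trials):
            days = [random.randrange(m) for _ in range(n)]
            hits += any(
                min((days[i] - days[j]) % m, (days[j] - days[i]) % m) <= k
                for i in range(n) for j in range(i + 1, n))
        return hits / trials

    print(near_match_prob(7, 7))   # above 0.5: a within-a-week match among 7 people
    print(near_match_prob(23, 0))  # ~0.507: the classical problem again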

Collision counting
The probability that the kth integer randomly chosen from [1, d] will repeat at least one previous choice equals
q(k − 1; d) above. The expected total number of times a selection will repeat a previous selection as n such integers
are chosen equals

n − d + d × ((d − 1)/d)^n.

Average number of people


In an alternative formulation of the birthday problem, one asks the average number of people required to find a pair
with the same birthday. If we consider the probability function Pr[n people have at least one shared birthday], this
average is determining the mean of the distribution, as opposed to the customary formulation, which determines the
median. The problem is relevant to several hashing algorithms analyzed by Donald Knuth in his book The Art of
Computer Programming. It may be shown[9][10] that if one samples uniformly, with replacement, from a population
of size M, the number of trials required for the first repeated sampling of some individual has expected value
1 + Q(M), where

Q(M) = ∑ (k = 1 to M) M! / ((M − k)! × M^k).

The function Q(M) has been studied by Srinivasa Ramanujan and has asymptotic expansion:

Q(M) ≈ √(πM/2) − 1/3 + (1/12)√(π/(2M)) − 4/(135M) + ...

With M = 365 days in a year, the average number of people required to find a pair with the same birthday is
1 + Q(365) ≈ 24.61659, somewhat more than 23, the number required for a 50% chance. In the best case, two people
will suffice; at worst, the maximum possible number of M + 1 = 366 people is needed; but on average, only 25
people are required.
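The mean is easy to estimate by simulation; an illustrative sketch (the sample size is an arbitrary choice):

    import random

    def people_until_repeat(m=365):
        seen = set()
        while True:
            day = random.randrange(m)
            if day in seen:
                return len(seen) + 1  # count includes the person who repeats
            seen.add(day)

    samples = [people_until_repeat() for _ in range(100_000)]
    print(sum(samples) / len(samples))  # ~24.6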
An informal demonstration of the problem can be made from the list of Prime Ministers of Australia, of which there
have been 27, in which Paul Keating, the 24th Prime Minister, and Edmund Barton, the first Prime Minister, share
the same birthday, 18 January.
In the 2014 FIFA World Cup, each of the 32 squads had 23 players. An analysis of the official squad lists suggested
that 16 squads had pairs of players sharing birthdays, and of these, 5 squads had two pairs: Argentina, France, Iran,
South Korea and Switzerland each had two pairs, while Australia, Bosnia Herzegovina, Brazil, Cameroon, Colombia,
Honduras, Netherlands, Nigeria, Russia, Spain and USA each had one pair.[11]

Partition problem
A related problem is the partition problem, a variant of the knapsack problem from operations research. Some
weights are put on a balance scale; each weight is an integer number of grams randomly chosen between one gram
and one million grams (one metric ton). The question is whether one can usually (that is, with probability close to 1)
transfer the weights between the left and right arms to balance the scale. (In case the sum of all the weights is an odd
number of grams, a discrepancy of one gram is allowed.) If there are only two or three weights, the answer is very
clearly no; although there are some combinations which work, the majority of randomly selected combinations of
three weights do not. If there are very many weights, the answer is clearly yes. The question is, how many are just
sufficient? That is, what is the number of weights such that it is equally likely for it to be possible to balance them as
it is to be impossible?
Some people's intuition is that the answer is above 100,000. Most people's intuition is that it is in the thousands or
tens of thousands, while others feel it should at least be in the hundreds. The correct answer is approximately 23.
The reason is that the correct comparison is to the number of partitions of the weights into left and right. There are
2^(N−1) different partitions for N weights, and the left sum minus the right sum can be thought of as a new random
quantity for each partition. The distribution of the sum of weights is approximately Gaussian, with a peak at
1,000,000 × N and width 1,000,000 × √N, so that the transition occurs when 2^(N−1) is approximately equal to
1,000,000 × √N. 2^(23−1) is about 4 million, while the width of the distribution is only 5 million.[12]

In fiction
Arthur C. Clarke's novel A Fall of Moondust, published in 1961, contains a section where the main characters,
trapped underground for an indefinite amount of time, are celebrating a birthday and find themselves discussing the
validity of the Birthday problem. As stated by a physicist passenger: "If you have a group of more than twenty-four
people, the odds are better than even that two of them have the same birthday." Eventually, out of 22 present, it is
revealed that two characters share the same birthday, May 23.


Notes and references


[1] This is not a paradox in the sense of leading to a logical contradiction, but is called a paradox because the mathematical truth contradicts
naïve intuition: an intuitive guess would suggest that the chance of two individuals sharing the same birthday in a group of 23 is much lower
than 50%, but the birthday problem demonstrates that this is not the case.
[2] W. W. Rouse Ball, 1960, Other Questions on Probability, in Mathematical Recreations and Essays, Macmillan, New York, p. 45.
[3] In reality, birthdays are not evenly distributed throughout the year; there are more births per day in some seasons than in others, but for the
purposes of this problem the distribution is treated as uniform. In particular, many children are born in the summer, especially the months of
August and September (for the northern hemisphere) (http://scienceworld.wolfram.com/astronomy/LeapDay.html), and in the U.S. it has
been noted that many children are conceived around the holidays of Christmas and New Year's Day. Also, because hospitals rarely schedule
C-sections and induced labor on the weekend, more Americans are born on Mondays and Tuesdays than on weekends; where many of the
people share a birth year (e.g. a class in a school), this creates a tendency toward particular dates. In Sweden 9.3% of the population is born in
March and 7.3% in November, when a uniform distribution would give 8.3% (Swedish statistics board,
http://www.scb.se/statistik/BE/BE0101/2006A01a/BE0101_2006A01a_SM_BE12SM0701.pdf). These factors tend to increase the chance of identical birth dates,
since a denser subset has more possible pairs (in the extreme case when everyone was born on three days, there would obviously be many
identical birthdays). The problem of a non-uniform number of births occurring during each day of the year was first understood by Murray
Klamkin in 1967. A formal proof that the probability of two matching birthdays is least for a uniform distribution of birthdays was given by D.
Bloom (1973).
[4] Jim Gray, Catharine van Ingen. Empirical Measurements of Disk Failure Rates and Error Rates (http://arxiv.org/abs/cs/0701166)
[5] In his autobiography, Halmos criticized the form in which the birthday paradox is often presented, in terms of numerical computation. He
believed that it should be used as an example in the use of more abstract mathematical concepts. He wrote:

The reasoning is based on important tools that all students of mathematics should have ready access to.
The birthday problem used to be a splendid illustration of the advantages of pure thought over
mechanical manipulation; the inequalities can be obtained in a minute or two, whereas the
multiplications would take much longer, and be much more subject to error, whether the instrument is a
pencil or an old-fashioned desk computer. What calculators do not yield is understanding, or
mathematical facility, or a solid basis for more advanced, generalized theories.
[6] Z. E. Schnabel (1938) The Estimation of the Total Fish Population of a Lake, American Mathematical Monthly 45, 348–352.
[7] M. C. Wendl (2003) Collision Probability Between Sets of Random Variables (http://dx.doi.org/10.1016/S0167-7152(03)00168-8),
Statistics and Probability Letters 64(3), 249–254.
[8] M. Abramson and W. O. J. Moser (1970) More Birthday Surprises, American Mathematical Monthly 77, 856–858.
[9] D. E. Knuth; The Art of Computer Programming. Vol. 3, Sorting and Searching (Addison-Wesley, Reading, Massachusetts, 1973).
[10] P. Flajolet, P. J. Grabner, P. Kirschenhofer, H. Prodinger (1995), On Ramanujan's Q-Function, Journal of Computational and Applied
Mathematics 58, 103–116.
[11] http://www.bbc.co.uk/news/magazine-27835311
[12] C. Borgs, J. Chayes, and B. Pittel (2001) Phase Transition and Finite Size Scaling in the Integer Partition Problem, Random Structures and
Algorithms 19(3–4), 247–288.

Bibliography
M. Abramson and W. O. J. Moser (1970) More Birthday Surprises, American Mathematical Monthly 77, 856–858.
D. Bloom (1973) A Birthday Problem, American Mathematical Monthly 80, 1141–1142.
John G. Kemeny, J. Laurie Snell, and Gerald Thompson, Introduction to Finite Mathematics. The first edition, 1957.
M. Klamkin and D. Newman (1967) Extensions of the Birthday Surprise, Journal of Combinatorial Theory 3, 279–282.
E. H. McKinney (1966) Generalized Birthday Problem, American Mathematical Monthly 73, 385–387.
Leila Schneps and Coralie Colmez, Math on Trial. How Numbers Get Used and Abused in the Courtroom, Basic
Books, 2013. ISBN 978-0-465-03292-1. (Fifth chapter: "Math error number 5. The case of Diana Sylvester: cold
hit analysis").


External links
Coincidences: the truth is out there (http://www.rsscse-edu.org.uk/tsj/wp-content/uploads/2011/03/matthews.pdf) Experimental test of the Birthday Paradox and other coincidences
http://www.efgh.com/math/birthday.htm
http://planetmath.org/encyclopedia/BirthdayProblem.html
Weisstein, Eric W., "Birthday Problem" (http://mathworld.wolfram.com/BirthdayProblem.html), MathWorld.
A humorous article explaining the paradox (http://www.damninteresting.com/?p=402)
SOCR EduMaterials activities birthday experiment (http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_BirthdayExperiment)
Understanding the Birthday Problem (Better Explained) (http://betterexplained.com/articles/understanding-the-birthday-paradox/)
Eurobirthdays 2012. A birthday problem. (http://www.matifutbol.com/en/eurobirthdays.html) A practical
football example of the birthday paradox.
Grime, James. "23: Birthday Probability" (http://www.numberphile.com/videos/23birthday.html).
Numberphile. Brady Haran.

BorelKolmogorov paradox
In probability theory, the Borel–Kolmogorov paradox (sometimes known as Borel's paradox) is a paradox relating
to conditional probability with respect to an event of probability zero (also known as a null set). It is named after
Émile Borel and Andrey Kolmogorov.

A great circle puzzle


Suppose that a random variable has a uniform distribution on a unit sphere. What is its conditional distribution on a
great circle? Because of the symmetry of the sphere, one might expect that the distribution is uniform and
independent of the choice of coordinates. However, two analyses give contradictory results. First, note that choosing
a point uniformly on the sphere is equivalent to choosing the longitude λ uniformly from [−π, π] and choosing the
latitude φ from [−π/2, π/2] with density (1/2) cos φ. Then we can look at two different great circles:
1. If the coordinates are chosen so that the great circle is an equator (latitude φ = 0), the conditional density for
a longitude λ defined on the interval [−π, π] is

f(λ | φ = 0) = 1/(2π).

2. If the great circle is a line of longitude with λ = 0, the conditional density for φ on the interval [−π/2, π/2] is

f(φ | λ = 0) = (1/2) cos φ.
One distribution is uniform on the circle, the other is not. Yet both seem to be referring to the same great circle in
different coordinate systems.
Many quite futile arguments have raged - between otherwise competent probabilists - over which of these
results is 'correct'.
E.T. Jaynes
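The two answers can be made concrete by conditioning on thin bands instead of on the measure-zero circle itself; a small Python simulation sketch (illustrative only, with an arbitrary band width eps and sample count):

    import math
    import random

    def uniform_sphere_point():
        # Longitude uniform; sin(latitude) uniform gives the cos density.
        lam = random.uniform(-math.pi, math.pi)
        phi = math.asin(random.uniform(-1.0, 1.0))
        return lam, phi

    eps, ring, lune = 0.01, [], []
    for _ in range(1_000_000):
        lam, phi = uniform_sphere_point()
        if abs(phi) < eps:   # thin ring around the equator
            ring.append(lam)
        if abs(lam) < eps:   # thin lune around the zero meridian
            lune.append(phi)

    # Longitudes in the ring look uniform; latitudes in the lune follow
    # the (1/2) cos(phi) density and concentrate near the equator.
    print(sum(abs(l) > math.pi / 2 for l in ring) / len(ring))  # ~0.50
    print(sum(abs(p) > math.pi / 4 for p in lune) / len(lune))  # ~0.29 = 1 - sin(pi/4)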


Explanation and implications


In case (1) above, the conditional probability that the longitude λ lies in a set E given that φ = 0 can be written
P(λ ∈ E | φ = 0). Elementary probability theory suggests this can be computed as P(λ ∈ E and φ = 0)/P(φ = 0), but that
expression is not well-defined since P(φ = 0) = 0. Measure theory provides a way to define a conditional probability,
using the family of events Rab = {φ : a < φ < b} which are horizontal rings consisting of all points with latitude
between a and b.
The resolution of the paradox is to notice that in case (2), P(φ ∈ F | λ = 0) is defined using the events Lab = {λ : a <
λ < b}, which are lunes (vertical wedges), consisting of all points whose longitude varies between a and b. So although
P(λ ∈ E | φ = 0) and P(φ ∈ F | λ = 0) each provide a probability distribution on a great circle, one of them is defined
using rings, and the other using lunes. Thus it is not surprising after all that P(λ ∈ E | φ = 0) and P(φ ∈ F | λ = 0) have
different distributions.
The concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is
inadmissible. For we can obtain a probability distribution for [the latitude] on the meridian circle only if we
regard this circle as an element of the decomposition of the entire spherical surface onto meridian circles with
the given poles
Andrey Kolmogorov[1]
the term 'great circle' is ambiguous until we specify what limiting operation is to produce it. The intuitive
symmetry argument presupposes the equatorial limit; yet one eating slices of an orange might presuppose the
other.
E.T. Jaynes

Mathematical explication
To understand the problem we need to recognize that a distribution on a continuous random variable is described by
a density f only with respect to some measure μ. Both are important for the full description of the probability
distribution. Or, equivalently, we need to fully define the space on which we want to define f.
Let Φ and Λ denote two random variables taking values in Ω1 = [−π/2, π/2] respectively Ω2 = [−π, π]. An event
{Φ = φ, Λ = λ} gives a point on the sphere S(r) with radius r. We define the coordinate transform

x = r cos φ cos λ,  y = r cos φ sin λ,  z = r sin φ,

for which we obtain the volume element

ω_r(φ, λ) = r² cos φ.

Furthermore, if either φ or λ is fixed, we get the volume elements

ω_r(λ) = r cos φ (φ fixed), respectively ω_r(φ) = r (λ fixed).

Let μ_{Φ,Λ} denote the joint measure on Ω1 × Ω2, which has a density f_{Φ,Λ} with respect to ω_r(φ, λ) dφ dλ,
and let μ_Φ and μ_Λ denote the marginal measures.
If we assume that the density f_{Φ,Λ} is uniform, then the conditional distribution of Φ given Λ = λ has a uniform
density with respect to the measure r cos φ dφ, but not with respect to the Lebesgue measure; with respect to the
Lebesgue measure its density is (1/2) cos φ, in agreement with the calculation above. On the other hand, the
conditional distribution of Λ given Φ = φ has a uniform density with respect to both the volume element along the
circle of latitude and the Lebesgue measure, since that volume element does not depend on λ.

Notes
[1] Originally Kolmogorov (1933), translated in Kolmogorov (1956). Sourced from Pollard (2002)

References and further reading


Jaynes, E.T. (2003). "15.7 The Borel-Kolmogorov paradox". Probability Theory: The Logic of Science.
Cambridge University Press. pp. 467–470. ISBN 0-521-59271-2. MR 1992316 (http://www.ams.org/mathscinet-getitem?mr=1992316).
Fragmentary Edition (1994) (pp. 1514–1517) (http://omega.math.albany.edu:8008/ETJ-PS/cc15w.ps) (PostScript format)
Kolmogorov, Andrey (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung (in German). Berlin: Julius Springer.
Translation: Kolmogorov, Andrey (1956). "Chapter V, §2. Explanation of a Borel Paradox" (http://www.mathematik.com/Kolmogorov/0029.html).
Foundations of the Theory of Probability (http://www.mathematik.com/Kolmogorov/index.html) (2nd ed.). New York: Chelsea.
pp. 50–51. ISBN 0-8284-0023-7.
Pollard, David (2002). "Chapter 5. Conditioning, Example 17". A User's Guide to Measure Theoretic
Probability. Cambridge University Press. pp. 122–123. ISBN 0-521-00289-3. MR 1873379 (http://www.ams.org/mathscinet-getitem?mr=1873379).
Mosegaard, K., & Tarantola, A. (2002). 16 Probabilistic approach to inverse problems. International Geophysics,
81, 237–265.

Boy or Girl paradox


The Boy or Girl paradox surrounds a well-known set of questions in probability theory which are also known as
The Two Child Problem, Mr. Smith's Children and the Mrs. Smith Problem. The initial formulation of the question
dates back to at least 1959, when Martin Gardner published one of the earliest variants of the paradox in Scientific
American. Titled The Two Children Problem, he phrased the paradox as follows:
Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?
Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?
Gardner initially gave the answers 1/2 and 1/3, respectively; but later acknowledged that the second question was
ambiguous. Its answer could be 1/2, depending on how you found out that one child was a boy. The ambiguity,
depending on the exact wording and possible assumptions, was confirmed by Bar-Hillel and Falk, and Nickerson.
Other variants of this question, with varying degrees of ambiguity, have been recently popularized by Ask Marilyn in
Parade Magazine, John Tierney of The New York Times, and Leonard Mlodinow in Drunkard's Walk. One scientific
study showed that when identical information was conveyed, but with different partially ambiguous wordings that
emphasized different points, the percentage of MBA students who answered 1/2 changed from 85% to 39%.
The paradox has frequently stimulated a great deal of controversy. Many people argued strongly for both sides with a
great deal of confidence, sometimes showing disdain for those who took the opposing view. The paradox stems from
whether the problem setup is similar for the two questions. The intuitive answer is 1/2. This answer is intuitive if the
question leads the reader to believe that there are two equally likely possibilities for the sex of the second child (i.e.,
boy and girl), and that the probability of these outcomes is absolute, not conditional.

Common assumptions
The two possible answers share a number of assumptions. First, it is assumed that the space of all possible events can
be easily enumerated, providing an extensional definition of outcomes: {BB, BG, GB, GG}. This notation indicates
that there are four possible combinations of children, labeling boys B and girls G, and using the first letter to
represent the older child. Second, it is assumed that these outcomes are equally probable. This implies the following
model, a Bernoulli process with p = 1/2:
1. Each child is either male or female.
2. Each child has the same chance of being male as of being female.
3. The sex of each child is independent of the sex of the other.
In reality, this is a rather inaccurate model, since it ignores (amongst other factors) the fact that the ratio of boys to
girls is not exactly 50:50, the possibility of identical twins (who are always the same sex), and the possibility of an
intersex child. However, this problem is about probability and not biology. The mathematical outcome would be the
same if it were phrased in terms of a coin toss.

First question
Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?
Under the aforementioned assumptions, in this problem, a random family is selected. In this sample space, there are
four equally probable events:


Older child   Younger child
Girl          Girl
Girl          Boy
Boy           Girl
Boy           Boy

Only two of these possible events meet the criteria specified in the question (e.g., GG, GB). Since both of the two
possibilities in the new sample space {GG, GB} are equally likely, and only one of the two, GG, includes two girls,
the probability that the younger child is also a girl is 1/2.

Second question
Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?
This question is identical to question one, except that instead of specifying that the older child is a boy, it is specified
that at least one of them is a boy. In response to reader criticism of the question posed in 1959, Gardner agreed that a
precise formulation of the question is critical to getting different answers for question 1 and 2. Specifically, Gardner
argued that a "failure to specify the randomizing procedure" could lead readers to interpret the question in two
distinct ways:
From all families with two children, at least one of whom is a boy, a family is chosen at random. This would yield
the answer of 1/3.
From all families with two children, one child is selected at random, and the sex of that child is specified. This
would yield an answer of 1/2.
Grinstead and Snell argue that the question is ambiguous in much the same way Gardner did.
For example, suppose you see the children in the garden and one of them, a boy, is visible while the other child is
hidden behind a tree. In this case, the statement "at least one of them is a boy" is established in the same way as in
the second interpretation: the child that you can see is a boy. It does not match the first interpretation, because in a
one-boy, one-girl family the girl might have been the one who was visible, whereas the first interpretation counts
every such family.
While it is certainly true that every possible Mr. Smith has at least one boy - i.e., the condition is necessary - it is not
clear that every Mr. Smith with at least one boy is intended. That is, the problem statement does not say that having a
boy is a sufficient condition for Mr. Smith to be identified as having a boy this way.
Commenting on Gardner's version of the problem, Bar-Hillel and Falk note that "Mr. Smith, unlike the reader, is
presumably aware of the sex of both of his children when making this statement", i.e. that 'I have two children and at
least one of them is a boy.' If it is further assumed that Mr Smith would report this fact if it were true then the correct
answer is 1/3 as Gardner intended.

Analysis of the ambiguity


If it is assumed that this information was obtained by looking at both children to see if there is at least one boy, the
condition is both necessary and sufficient. Three of the four equally probable events for a two-child family in the
sample space above meet the condition:


Older child   Younger child
Girl          Girl
Girl          Boy
Boy           Girl
Boy           Boy

Thus, if it is assumed that both children were considered while looking for a boy, the answer to question 2 is 1/3.
However, if the family was first selected and then a random, true statement was made about the gender of one child
(whether or not both were considered), the correct way to calculate the conditional probability is not to count the
cases that match. Instead, one must add the probabilities that the condition will be satisfied in each case:
Older child  Younger child  P(this case)  P("at least one boy" given this case)  P(both this case, and "at least one boy")
Girl         Girl           1/4           0                                      0
Girl         Boy            1/4           1/2                                    1/8
Boy          Girl           1/4           1/2                                    1/8
Boy          Boy            1/4           1                                      1/4
The answer is found by adding the numbers in the last column wherever you would have counted that case:
(1/4)/(0+1/8+1/8+1/4)=1/2. Note that this is not necessarily the same as reporting the gender of a specific child,
although doing so will produce the same result by a different calculation. For instance, if the younger child is picked,
the calculation is (1/4)/(0+1/4+0+1/4)=1/2. In general, 1/2 is a better answer any time a Mr. Smith with a boy and a
girl could have been identified as having at least one girl.

Bayesian analysis
Following classical probability arguments, we consider a large urn containing two children. We assume equal
probability that either is a boy or a girl. The three discernible cases are thus: 1. both are girls (GG) with
probability P(GG) = 0.25, 2. both are boys (BB) with probability of P(BB) = 0.25, and 3. one of each (G.B)
with probability of P(G.B) = 0.50. These are the prior probabilities.
Now we add the additional assumption that "at least one is a boy" = B. Using Bayes' Theorem, we find
P(BB|B) = P(B|BB) × P(BB) / P(B) = (1 × 1/4) / (3/4) = 1/3.
where P(A|B) means "probability of A given B". P(B|BB) = probability of at least one boy given both are boys = 1.
P(BB) = probability of both boys = 1/4 from the prior distribution. P(B) = probability of at least one being a boy,
which includes cases BB and G.B = 1/4 + 1/2 = 3/4.
Note that although the derived value of 1/3 may seem low against the natural guess of 1/2, the unconditional
("normal") value of P(BB) is 1/4, so conditioning on "at least one boy" in fact raises the probability.
The paradox arises because the second assumption is somewhat artificial, and when describing the problem in an
actual setting things get a bit sticky. Just how do we know that "at least" one is a boy? One description of the
problem states that we look into a window, see only one child and it is a boy. This sounds like the same assumption.
However, this one is equivalent to "sampling" the distribution (i.e. removing one child from the urn, ascertaining that
it is a boy, then replacing). Let's call the statement "the sample is a boy" proposition "b". Now we have:
P(BB|b) = P(b|BB) × P(BB) / P(b) = (1 × 1/4) / (1/2) = 1/2.
The difference here is the P(b), which is just the probability of drawing a boy from all possible cases (i.e. without the
"at least"), which is clearly 0.5.

The Bayesian analysis generalizes easily to the case in which we relax the 50/50 population assumption. If we have
no information about the populations then we assume a "flat prior", i.e. P(GG) = P(BB) = P(G.B) = 1/3. In this case
the "at least" assumption produces the result P(BB|B) = 1/2, and the sampling assumption produces P(BB|b) = 2/3, a
result also derivable from the Rule of Succession.
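These posteriors are small enough to compute exactly. The following sketch (illustrative only) re-derives the four values quoted above, 1/3 and 1/2 under the 50/50 prior and 1/2 and 2/3 under the flat prior, with exact rational arithmetic:

from fractions import Fraction as F

def posterior_bb(priors, likelihoods):
    # P(BB | evidence) = P(evidence|BB) P(BB) / sum_c P(evidence|c) P(c)
    evidence = sum(priors[c] * likelihoods[c] for c in priors)
    return priors["BB"] * likelihoods["BB"] / evidence

at_least = {"GG": F(0), "BB": F(1), "GB": F(1)}     # "at least one is a boy"
sampled  = {"GG": F(0), "BB": F(1), "GB": F(1, 2)}  # one child sampled, it is a boy

for name, priors in [("50/50 prior", {"GG": F(1, 4), "BB": F(1, 4), "GB": F(1, 2)}),
                     ("flat prior",  {"GG": F(1, 3), "BB": F(1, 3), "GB": F(1, 3)})]:
    print(name, "| P(BB|B) =", posterior_bb(priors, at_least),
                "| P(BB|b) =", posterior_bb(priors, sampled))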

Martingale analysis
Suppose you had wagered that Mr Smith had two boys, and received fair odds: you paid $1 and you will receive $4
if he has two boys. Think of your wager as an investment that will increase in value as good news arrives. Which
evidence would make you happier about your investment: learning that at least one child out of two is a boy, or
learning that at least one child out of one is a boy?
The latter is a priori less likely, and therefore better news. That is why the two answers cannot be the same.
Now for the numbers. If you bet on one child and win, the value of your investment has doubled to $2. It must
double again to get to $4, so the odds are 1 in 2.
On the other hand, if you learn that at least one of two children is a boy, your investment increases as if you had
wagered on this question: your one dollar is now worth $4/3. To get to $4 your wealth still has to triple. So the
answer is 1 in 3.

Variants of the question


Following the popularization of the paradox by Gardner it has been presented and discussed in various forms. The
first variant presented by Bar-Hillel & Falk is worded as follows:
Mr. Smith is the father of two. We meet him walking along the street with a young boy whom he proudly
introduces as his son. What is the probability that Mr. Smith's other child is also a boy?
Bar-Hillel & Falk use this variant to highlight the importance of considering the underlying assumptions. The
intuitive answer is 1/2 and, when making the most natural assumptions, this is correct. However, someone may argue
that ...before Mr. Smith identifies the boy as his son, we know only that he is either the father of two boys, BB, or of
two girls, GG, or of one of each in either birth order, i.e., BG or GB. Assuming again independence and
equiprobability, we begin with a probability of 1/4 that Smith is the father of two boys. Discovering that he has at
least one boy rules out the event GG. Since the remaining three events were equiprobable, we obtain a probability of
1/3 for BB.
The natural assumption is that Mr. Smith selected the child companion at random. If so, as combination BB has
twice the probability of either BG or GB of having resulted in the boy walking companion (and combination GG has
zero probability, ruling it out), the union of events BG and GB becomes equiprobable with event BB, and so the
chance that the other child is also a boy is 1/2. Bar-Hillel & Falk, however, suggest an alternative scenario. They
imagine a culture in which boys are invariably chosen over girls as walking companions. In this case, the
combinations of BB, BG and GB are assumed equally likely to have resulted in the boy walking companion, and
thus the probability that the other child is also a boy is 1/3.
In 1991, Marilyn vos Savant responded to a reader who asked her to answer a variant of the Boy or Girl paradox that
included beagles. In 1996, she published the question again in a different form. The 1991 and 1996 questions,
respectively were phrased:
A shopkeeper says she has two new baby beagles to show you, but she doesn't know whether they're male,
female, or a pair. You tell her that you want only a male, and she telephones the fellow who's giving them a bath.
"Is at least one a male?" she asks him. "Yes!" she informs you with a smile. What is the probability that the other
one is a male?
Say that a woman and a man (who are unrelated) each has two children. We know that at least one of the woman's
children is a boy and that the man's oldest child is a boy. Can you explain why the chances that the woman has

two boys do not equal the chances that the man has two boys?
With regard to the second formulation Vos Savant gave the classic answer that the chances that the woman has two
boys are about 1/3 whereas the chances that the man has two boys are about 1/2. In response to readers who
questioned her analysis, vos Savant conducted a survey of readers with exactly two children, at least one of whom is
a boy. Of 17,946 responses, 35.9% reported two boys.
Vos Savant's articles were discussed by Carlton and Stansfield in a 2005 article in The American Statistician. The
authors do not discuss the possible ambiguity in the question and conclude that her answer is correct from a
mathematical perspective, given the assumptions that the likelihood of a child being a boy or girl is equal, and that
the sex of the second child is independent of the first. With regard to her survey they say it "at least validates vos
Savant's correct assertion that the chances posed in the original question, though similar-sounding, are different,
and that the first probability is certainly nearer to 1 in 3 than to 1 in 2".
Carlton and Stansfield go on to discuss the common assumptions in the Boy or Girl paradox. They demonstrate that
in reality male children are actually more likely than female children, and that the sex of the second child is not
independent of the sex of the first. The authors conclude that, although the assumptions of the question run counter
to observations, the paradox still has pedagogical value, since it "illustrates one of the more intriguing applications of
conditional probability." Of course, the actual probability values do not matter; the purpose of the paradox is to
demonstrate seemingly contradictory logic, not actual birth rates.

Information about the child


Suppose we were told not only that Mr. Smith has two children, and one of them is a boy, but also that the boy was
born on a Tuesday: does this change our previous analyses? Again, the answer depends on how this information
comes to us - what kind of selection process brought us this knowledge.
Following the tradition of the problem, let us suppose that out there in the population of two-child families, the sex
of the two children is independent of one another, equally likely boy or girl, and that each child is independently of
the other children born on any of the seven days of the week, each with equal probability 1/7. In that case, the chance
that a two-child family consists of two boys, one (at least) born on a Tuesday, is equal to 1/4 (the probability of two
boys) times one minus (6/7)² = 1 − 36/49 = 13/49 (one minus the probability that neither child is born on a
Tuesday). 1/4 times 13/49 equals 13/196.
The probability that a two child family consists of a boy and a girl, the boy born on a Tuesday, equals 2 (boy-girl or
girl-boy) times 1/4 (the two specified sexes) times 1/7 (the boy born on Tuesday) = 1/14. Therefore, among all two
child families with at least one boy born on a Tuesday, the fraction of families in which the other child is a girl is
1/14 divided by the sum of 1/14 plus 13/196 = 0.5185185.
It seems that we introduced quite irrelevant information, yet the probability of the sex of the other child has changed
dramatically from what it was before (the chance the other child was a girl was 2/3, when we didn't know that the
boy was born on Tuesday).
This is still a bit bigger than a half, but close! It is not difficult to check that as we specify more and more details
about the boy child (for instance: born on January 1), the chance that the other child is a girl approaches one half.
However, is it really plausible that our two-child family with at least one boy born on a Tuesday was delivered to us
by choosing just one such family at random? It is much easier to imagine the following scenario. We know Mr.
Smith has two children. We knock at his door and a boy comes and answers the door. We ask the boy on what day of
the week he was born. Let's assume that which of the two children answers the door is determined by chance. Then
the procedure was (1) pick a two-child family at random from all two-child families, (2) pick one of the two children
at random, (3) see that it is a boy and ask on what day he was born. The chance the other child is a girl is 1/2. This is
a very different procedure from (1) picking a two-child family at random from all families with two children, at
least one of whom is a boy born on a Tuesday. Under that procedure the chance the family consists of a boy and a
girl is 0.5185185...

This variant of the boy and girl problem is discussed on many recent internet blogs and is the subject of a paper by
Ruma Falk, [1]. The moral of the story is that these probabilities don't just depend on the information we have in
front of us, but on how we came by that information.

Psychological investigation
From the position of statistical analysis the relevant question is often ambiguous, and as such there is no single
correct answer. However, this does not exhaust the boy or girl paradox, for it is not necessarily the ambiguity that
explains how the intuitive probability is derived. A survey such as vos Savant's suggests that the majority of people
adopt an understanding of Gardner's problem that, if they were consistent, would lead them to the 1/3 probability
answer, but overwhelmingly people intuitively arrive at the 1/2 probability answer. Ambiguity notwithstanding, this
makes the problem of interest to psychological researchers who seek to understand how humans estimate probability.
Fox & Levav (2004) used the problem (called the Mr. Smith problem, credited to Gardner, but not worded exactly
the same as Gardner's version) to test theories of how people estimate conditional probabilities. In this study, the
paradox was posed to participants in two ways:
"Mr. Smith says: 'I have two children and at least one of them is a boy.' Given this information, what is the
probability that the other child is a boy?"
"Mr. Smith says: 'I have two children and it is not the case that they are both girls.' Given this information, what is
the probability that both children are boys?"
The authors argue that the first formulation gives the reader the mistaken impression that there are two possible
outcomes for the "other child", whereas the second formulation gives the reader the impression that there are four
possible outcomes, of which one has been rejected (resulting in 1/3 being the probability of both children being boys,
as there are 3 remaining possible outcomes, only one of which is that both of the children are boys). The study found
that 85% of participants answered 1/2 for the first formulation, while only 39% responded that way to the second
formulation. The authors argued that the reason people respond differently to each question (along with other similar
problems, such as the Monty Hall Problem and the Bertrand's box paradox) is because of the use of naive heuristics
that fail to properly define the number of possible outcomes.

References
[1] http://www.tandfonline.com/doi/abs/10.1080/13546783.2011.613690

External links

Boy or Girl: Two Interpretations (http://mathforum.org/library/drmath/view/52186.html)


At Least One Girl (http://www.mathpages.com/home/kmath036.htm) at MathPages
A Problem With Two Bear Cubs (http://www.cut-the-knot.org/bears.shtml)
Lewis Carroll's Pillow Problem (http://www.cut-the-knot.org/carroll.shtml)
When intuition and math probably look wrong (http://www.sciencenews.org/view/generic/id/60598/title/
Math_Trek__When_intuition_and_math_probably_look_wrong)
The Boy or Girl Paradox: A Martingale Perspective (http://finmathblog.blogspot.com/2013/07/
the-boy-or-girl-paradox-martingale.html) at FinancialMathematics.com

Burali-Forti paradox
In set theory, a field of mathematics, the Burali-Forti paradox demonstrates that naïvely constructing "the set of all
ordinal numbers" leads to a contradiction and therefore shows an antinomy in a system that allows its construction. It
is named after Cesare Burali-Forti, who in 1897 published a paper proving a theorem which, unknown to him,
contradicted a previously proved result. Bertrand Russell subsequently noticed the contradiction, and when he
published it, he stated that it had been suggested to him by Burali-Forti's paper, with the result that it came to be
known by Burali-Forti's name.[1]

Stated in terms of von Neumann ordinals


Let Ω be the "set" of all ordinals. Since Ω carries all the properties of an ordinal number, it is an ordinal number
itself. We can therefore construct its successor Ω + 1, which is strictly greater than Ω. However, this ordinal number
must be an element of Ω, since Ω contains all ordinal numbers. Finally, we arrive at

Ω < Ω + 1 and Ω + 1 ≤ Ω,

which is a contradiction.

Stated more generally


The version of the paradox above is anachronistic, because it presupposes the definition of the ordinals due to John
von Neumann, under which each ordinal is the set of all preceding ordinals, which was not known at the time the
paradox was framed by Burali-Forti. Here is an account with fewer presuppositions: suppose that we associate with
each well-ordering an object called its "order type" in an unspecified way (the order types are the ordinal numbers).
The "order types" (ordinal numbers) themselves are well-ordered in a natural way, and this well-ordering must have
an order type Ω. It is easily shown in naïve set theory (and it remains true in ZFC but not in New Foundations) that
the order type of all ordinal numbers less than a fixed α is α itself. So the order type of all ordinal numbers less
than Ω is Ω itself. But this means that Ω, being the order type of a proper initial segment of the ordinals, is strictly
less than the order type of all the ordinals, but the latter is Ω itself by definition. This is a contradiction.

If we use the von Neumann definition, under which each ordinal is identified as the set of all preceding ordinals, the
paradox is unavoidable: the offending proposition that the order type of all ordinal numbers less than a fixed α is
α itself must be true. The collection of von Neumann ordinals, like the collection in the Russell paradox, cannot be
a set in any set theory with classical logic. But the collection of order types in New Foundations (defined as
equivalence classes of well-orderings under similarity) is actually a set, and the paradox is avoided because the order
type of the ordinals less than Ω turns out not to be Ω.

Resolution of the paradox


Modern axiomatic set theory such as ZF and ZFC circumvents this antinomy by simply not allowing construction of
sets with unrestricted comprehension terms like "all sets with the property P", as was possible, for example, in
Gottlob Frege's axiom system. New Foundations uses a different solution.

References
[1] https://www.cs.auckland.ac.nz/~chaitin/lowell.html

Burali-Forti, Cesare (1897), "Una questione sui numeri transfiniti", Rendiconti del Circolo Matematico di
Palermo 11: 154–164, doi: 10.1007/BF03015911 (http://dx.doi.org/10.1007/BF03015911)

External links
Stanford Encyclopedia of Philosophy: " Paradoxes and Contemporary Logic (http://plato.stanford.edu/entries/
paradoxes-contemporary-logic/)" -- by Andrea Cantini.

Cantor's paradox
In set theory, Cantor's paradox is derivable from the theorem that there is no greatest cardinal number, so that the
collection of "infinite sizes" is itself infinite. The difficulty is handled in axiomatic set theory by declaring that this
collection is not a set but a proper class; in von Neumann–Bernays–Gödel set theory it follows from this and the
axiom of limitation of size that this proper class must be in bijection with the class of all sets. Thus, not only are
there infinitely many infinities, but this infinity is larger than any of the infinities it enumerates.
This paradox is named for Georg Cantor, who is often credited with first identifying it in 1899 (or between 1895 and
1897). Like a number of "paradoxes" it is not actually contradictory but merely indicative of a mistaken intuition, in
this case about the nature of infinity and the notion of a set. Put another way, it is paradoxical within the confines of
naïve set theory and therefore demonstrates that a careless axiomatization of this theory is inconsistent.

Statements and proofs


In order to state the paradox it is necessary to understand that the cardinal numbers admit an ordering, so that one
can speak about one being greater or less than another. Then Cantor's paradox is:
Theorem: There is no greatest cardinal number.
This fact is a direct consequence of Cantor's theorem on the cardinality of the power set of a set.
Proof: Assume the contrary, and let C be the largest cardinal number. Then (in the von Neumann formulation
of cardinality) C is a set and therefore has a power set 2^C which, by Cantor's theorem, has cardinality strictly
larger than that of C. Demonstrating a cardinality (namely that of 2^C) larger than C, which was assumed to be
the greatest cardinal number, falsifies the definition of C. This contradiction establishes that such a cardinal
cannot exist.
Another consequence of Cantor's theorem is that the cardinal numbers constitute a proper class. That is, they cannot
all be collected together as elements of a single set. Here is a somewhat more general result.
Theorem: If S is any set then S cannot contain elements of all cardinalities. In fact, there is a strict upper
bound on the cardinalities of the elements of S.
Proof: Let S be a set, and let T be the union of the elements of S. Then every element of S is a subset of T, and
hence is of cardinality less than or equal to the cardinality of T. Cantor's theorem then implies that every
element of S is of cardinality strictly less than the cardinality of 2^T.
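For finite sets, both Cantor's theorem and the diagonal argument behind it can be illustrated concretely. A small Python sketch (an illustration only; the paradox itself concerns infinite collections and proper classes, and the map f below is an arbitrary, hypothetical attempt):

from itertools import combinations

def powerset(s):
    s = list(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

S = {0, 1, 2}
P = powerset(S)
print(len(P), ">", len(S))  # 8 > 3: the power set is strictly bigger

# Diagonal argument: no map f: S -> 2^S is onto, because the set
# D = {x in S : x not in f(x)} differs from every f(x) at the element x.
f = {0: {1, 2}, 1: set(), 2: {0, 1, 2}}   # an arbitrary attempt (hypothetical)
D = {x for x in S if x not in f[x]}
print(D, "is missed by f:", all(f[x] != D for x in S))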

Discussion and consequences


Since the cardinal numbers are well-ordered by indexing with the ordinal numbers (see Cardinal number, formal
definition), this also establishes that there is no greatest ordinal number; conversely, the latter statement implies
Cantor's paradox. By applying this indexing to the Burali-Forti paradox we obtain another proof that the cardinal
numbers are a proper class rather than a set, and (at least in ZFC or in von Neumann–Bernays–Gödel set theory) it
follows from this that there is a bijection between the class of cardinals and the class of all sets. Since every set is a
subset of this latter class, and every cardinality is the cardinality of a set (by definition!) this intuitively means that
the "cardinality" of the collection of cardinals is greater than the cardinality of any set: it is more infinite than any
true infinity. This is the paradoxical nature of Cantor's "paradox".

Historical note
While Cantor is usually credited with first identifying this property of cardinal sets, some mathematicians award this
distinction to Bertrand Russell, who defined a similar theorem in 1899 or 1901.

References
Anellis, I.H. (1991). Drucker, Thomas, ed. "The first Russell paradox," Perspectives on the History of
Mathematical Logic. Cambridge, Mass.: Birkhäuser Boston. pp. 33–46.
Moore, G.H. and Garciadiego, A. (1981). "Burali-Forti's paradox: a reappraisal of its origins". Historia Math 8
(3): 319–350. doi:10.1016/0315-0860(81)90070-7 [1].

External links
An Historical Account of Set-Theoretic Antinomies Caused by the Axiom of Abstraction [2]: report by Justin T.
Miller, Department of Mathematics, University of Arizona.
PlanetMath.org [3]: article.

References
[1] http://dx.doi.org/10.1016%2F0315-0860%2881%2990070-7
[2] http://citeseer.ist.psu.edu/496807.html
[3] http://planetmath.org/encyclopedia/CantorsParadox.html

Coastline paradox

[Figure: An example of the coastline paradox. If the coastline of Great Britain is measured using units 100 km
(62 mi) long, then the length of the coastline is approximately 2,800 km (1,700 mi). With 50 km (31 mi) units, the
total length is approximately 3,400 km (2,100 mi), approximately 600 km (370 mi) longer.]
The coastline paradox is the counterintuitive observation that the coastline of a landmass does not have a
well-defined length. This results from the fractal-like properties of coastlines. The first recorded observation of this
phenomenon was by Lewis Fry Richardson.
More concretely, the length of the coastline depends on the method used to measure it. Since a landmass has features
at all scales, from hundreds of kilometers in size to tiny fractions of a millimeter and below, there is no obvious size
of the smallest feature that should be measured around, and hence no single well-defined perimeter to the landmass.
Various approximations exist when specific assumptions are made about minimum feature size.

Mathematical aspects
The basic concept of length originates from Euclidean distance. In the familiar Euclidean geometry, a straight line
represents the shortest distance between two points; this line has only one length. The geodesic length on the surface
of a sphere, called the great circle length, is measured along the surface curve which exists in the plane containing
both end points of the path and the center of the sphere. The length of basic curves is more complicated but can also
be calculated. Measuring with rulers, one can approximate the length of a curve by adding the sum of the straight
lines which connect the points:

L ≈ Σᵢ |Pᵢ − Pᵢ₋₁|
Using a few straight lines to approximate the length of a curve will produce a low estimate. Using shorter and shorter
lines will produce sums that approach the curve's true length. A precise value for this length can be established using
calculus, a branch of mathematics which enables calculation of infinitely small distances. [Animation: successively
finer polygonal approximations assign a smooth curve a precise length.]

However, not all curves can be measured in this way. A fractal is by definition a curve whose complexity changes
with measurement scale. Whereas approximations of a smooth curve get closer and closer to a single value as
measurement precision increases, the measured value of fractals may change wildly.

This Sierpiński curve, which repeats the same pattern on a smaller and smaller scale, continues to increase in length.
If understood to iterate within an infinitely subdivisible geometric space, its length approaches infinity. At the same
time, the area enclosed by the curve does converge to a precise figure, just as, analogously, the land mass of an
island can be calculated more easily than the length of its coastline.
The length of a "true fractal" always diverges to infinity.[1] However, this figure relies on the assumption that space
can be subdivided indefinitely. The truth value of this assumptionwhich underlies Euclidean geometry and serves
as a useful model in everyday measurementis a matter of philosophical speculation, and may or may not reflect the
changing realities of 'space' and 'distance' on the atomic level.
Coastlines differ from mathematical fractals because they are formed by numerous small events, which create
patterns only statistically.[2]
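The scale dependence can be reproduced numerically by walking a divider of fixed opening along a fractal curve. The sketch below does this for a Koch curve standing in for a coastline (an idealized illustration, not real coastline data; the divider walk over the vertices is deliberately crude). Shrinking the ruler keeps increasing the measured length rather than letting it converge:

import math

def koch(p, q, depth):
    # Polyline points of a Koch curve from p to q (excluding q).
    if depth == 0:
        return [p]
    (x0, y0), (x1, y1) = p, q
    dx, dy = (x1 - x0) / 3, (y1 - y0) / 3
    a = (x0 + dx, y0 + dy)
    b = (x0 + 2 * dx, y0 + 2 * dy)
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    peak = (mx - dy * math.sqrt(3) / 2, my + dx * math.sqrt(3) / 2)
    pts = []
    for s, t in [(p, a), (a, peak), (peak, b), (b, q)]:
        pts += koch(s, t, depth - 1)
    return pts

def ruler_length(points, ruler):
    # Crude divider walk: step to the next vertex at least one ruler away.
    total, anchor = 0.0, points[0]
    for pt in points[1:]:
        if math.dist(anchor, pt) >= ruler:
            total += ruler
            anchor = pt
    return total

curve = koch((0.0, 0.0), (1.0, 0.0), 7) + [(1.0, 0.0)]
for ruler in [0.3, 0.1, 0.03, 0.01, 0.003]:
    print(f"ruler {ruler:>6}: measured length {ruler_length(curve, ruler):.3f}")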

Practical
For practical considerations, an appropriate choice of minimum feature size is on the order of the units being used to
measure. If a coastline is measured in kilometers, then small variations much smaller than one kilometer are easily
ignored. To measure the coastline in centimeters, tiny variations the size of centimeters must be considered.
However, at scales on the order of centimeters various arbitrary and non-fractal assumptions must be made, such as
where an estuary joins the sea, or where in a broad tidal flat the coastline measurements ought to be taken. Using
different measurement methodologies for different units also destroys the usual certainty that units can be converted
by a simple multiplication.
Extreme cases of the coastline paradox include the fjord-heavy coastlines of Norway, Chile and the Pacific
Northwest of North America. From the southern tip of Vancouver Island northwards to the southern tip of the Alaska

Panhandle, the convolutions of the coastline of the Canadian province of British Columbia make it over 10% of the
entire Canadian coastline (25,725 km or 15,985 mi out of 243,042 km or 151,019 mi) over a linear distance of only
965 km (600 mi), including the maze of islands of the Arctic archipelago.[3]

Notes
[1] Post & Eisen, p. 550.
[2] Heinz-Otto Peitgen, Hartmut Jürgens, Dietmar Saupe, Chaos and Fractals: New Frontiers of Science; Springer, 2004; p. 424
(http://books.google.com/books?id=LDCkFOkD2IIC&lpg=PP1&pg=PA424).
[3] Sebert, L.M., and M. R. Munro. 1972. Dimensions and Areas of Maps of the National Topographic System of Canada. Technical Report
72-1. Ottawa: Department of Energy, Mines and Resources, Surveys and Mapping Branch.

Bibliography
Post, David G., and Michael Eisen. " How Long is the Coastline of Law? Thoughts on the Fractal Nature of Legal
Systems (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=943509)". Journal of Legal Studies XXIX(1),
January 2000.

External links
" Coastlines (http://classes.yale.edu/fractals/Panorama/Nature/Coastlines/Coastlines.html)" at Fractal
Geometry (http://classes.yale.edu/fractals/) (ed. Michael Frame, Benoit Mandelbrot, and Nial Neger;
maintained for Math 190a at Yale University)
The Atlas of Canada Coastline and Shoreline (http://atlas.nrcan.gc.ca/site/english/learningresources/facts/
coastline.html)
NOAA GeoZone Blog on Digital Coast (http://www.csc.noaa.gov/digitalcoast/geozone/
how-much-length-do-you-really-need-ahhh-shoreline-length-that-is)

Cramer's paradox
In mathematics, Cramer's paradox is the statement that the number of points of intersection of two higher-order
curves can be greater than the number of arbitrary points that are usually needed to define one such curve. It is
named after the Swiss mathematician Gabriel Cramer.
This paradox is the result of a naive understanding or a misapplication of two theorems:
Bézout's theorem (the number of points of intersection of two algebraic curves is equal to the product of their
degrees, provided that certain necessary conditions are met).
Cramer's theorem (a curve of degree n is determined by n(n+3)/2 points, again assuming that certain conditions
hold).
Observe that for all n ≥ 3, n² ≥ n(n+3)/2, so two curves of degree n may intersect in at least as many points as are
needed to determine one such curve.
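The tension between the two counts is easy to tabulate; for every degree n ≥ 3, the n² intersection points allowed by Bézout's theorem are at least as many as the n(n+3)/2 points of Cramer's theorem. A quick check:

for n in range(1, 8):
    bezout = n * n             # intersection points of two degree-n curves
    cramer = n * (n + 3) // 2  # points that determine a degree-n curve
    flag = "  <- at least as many intersections as determining points" if bezout >= cramer else ""
    print(f"n={n}: n^2={bezout:2d}, n(n+3)/2={cramer:2d}{flag}")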

History
The paradox was first published by Maclaurin. Cramer and Euler corresponded on the paradox in letters of 1744 and
1745 and Euler explained the problem to Cramer.
It has become known as Cramer's paradox after featuring in his 1750 book Introduction à l'analyse des lignes
courbes algébriques, although Cramer quoted Maclaurin as the source of the statement.
At about the same time, Euler published examples showing a cubic curve which was not uniquely defined by 9
points[1] and discussed the problem in his book Introductio in analysin infinitorum.
The result was publicized by James Stirling and explained by Julius Plücker.

No paradox for lines and nondegenerate conics


For first order curves (that is lines) the paradox does not occur. In general two lines L1 and L2 intersect at a single
point P unless the lines are of equal gradient. A single point is not sufficient to define a line (two are needed);
through the point P there pass not only the two given lines but an infinite number of other lines as well.
Similarly two nondegenerate conics intersect at most at 4 points, and 5 points are needed to define a nondegenerate
conic.

Cramer's example for cubic curves


In a letter to Euler, Cramer pointed out that the cubic curves x³ − x = 0 and y³ − y = 0 intersect in precisely 9 points
(the equations represent two sets of three parallel lines x = −1, x = 0, x = +1; and y = −1, y = 0, y = +1,
respectively). Hence 9 points are not sufficient to uniquely determine a cubic curve.
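Cramer's example can be checked by brute force, since the real solutions of x³ − x = 0 and y³ − y = 0 are just {−1, 0, 1} in each variable:

from itertools import product

roots = [-1, 0, 1]                    # real solutions of t**3 - t == 0
points = list(product(roots, roots))  # common zeros of the two cubics
assert all(x**3 - x == 0 and y**3 - y == 0 for x, y in points)
print(len(points), "intersection points:", points)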

References
[1] Euler, L. "Sur une contradiction apparente dans la doctrine des lignes courbes." Mémoires de l'Académie des Sciences de Berlin 4, 219–233,
1750

External links
Ed Sandifer "Cramers Paradox" (http://eulerarchive.maa.org/hedi/HEDI-2004-08.pdf)
Cramer's Paradox (http://www.mathpages.com/home/kmath207/kmath207.htm) at MathPages

Elevator paradox
This article refers to the elevator paradox for the transport device. For the elevator paradox for the
hydrometer, see elevator paradox (physics).
The elevator paradox is a paradox first noted by Marvin Stern and George Gamow, physicists who had offices on
different floors of a multi-story building. Gamow, who had an office near the bottom of the building noticed that the
first elevator to stop at his floor was most often going down, while Stern, who had an office near the top, noticed that
the first elevator to stop at his floor was most often going up.
At first sight, this created the impression that perhaps elevator cars were being manufactured in the middle of the
building and sent upwards to the roof and downwards to the basement to be dismantled. Clearly this was not the
case. But how could the observation be explained?

Modeling the elevator problem


Several attempts (beginning with Gamow and Stern) were made to analyze the reason for this phenomenon: the
basic analysis is simple, while detailed analysis is more difficult than it would at first appear.

Simply, if one is on the top floor of a building, all elevators will come from below (none can come from above),
and then depart going down, while if one is on the second from top floor, an elevator going to the top floor will pass
first on the way up, and then shortly afterward on the way down; thus, while an equal number will pass going up as
going down, downwards elevators will generally shortly follow upwards elevators (unless the elevator idles on the
top floor), and thus the first elevator observed will usually be going up. The first elevator observed will be going
down only if one begins observing in the short interval after an elevator has passed going up, while the rest of the
time the first elevator observed will be going up.

[Figure: Near the top floor, elevators to the top come down shortly after they go up.]
In more detail, the explanation is as follows: a single elevator spends most of its time in the larger section of the
building, and thus is more likely to approach from that direction when the prospective elevator user arrives. An
observer who remains by the elevator doors for hours or days, observing every elevator arrival, rather than only
observing the first elevator to arrive, would note an equal number of elevators traveling in each direction. This then
becomes a sampling problem the observer is sampling stochastically a non uniform interval.
To help visualize this, consider a thirty-story building, plus lobby, with only one slow elevator. The elevator is so
slow because it stops at every floor on the way up, and then on every floor on the way down. It takes a minute to
travel between floors and wait for passengers. Here is the arrival schedule for people unlucky enough to work in this
building; as depicted above, it forms a triangle wave:

Floor        Time on way up    Time on way down
Lobby        8:00, 9:00, ...   n/a
1st floor    8:01, 9:01, ...   8:59, 9:59, ...
2nd floor    8:02, 9:02, ...   8:58, 9:58, ...
...          ...               ...
29th floor   8:29, 9:29, ...   8:31, 9:31, ...
30th floor   n/a               8:30, 9:30, ...

If you were on the first floor and walked up to the elevator at a random time, chances are the next elevator would be
heading down. The next elevator would be heading up only during the first two minutes of each hour, e.g., at 9:00
and 9:01. The number of elevator stops going upwards and downwards is the same, but the probability that the next
elevator is going up is only 2 in 60.
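The 2-in-60 figure can be reproduced by simulating the triangle-wave schedule above. A minimal sketch with assumed parameters (30 floors above the lobby, one minute per floor, all names illustrative):

import random

FLOORS = 30          # floors above the lobby
PERIOD = 2 * FLOORS  # minutes for one full up-and-down cycle

def next_direction(floor, t):
    # Direction of the next elevator arrival at `floor` after time `t`
    # (minutes past the hour), for the single slow elevator above:
    # it passes floor k going up at minute k and going down at minute PERIOD - k.
    up_time, down_time = floor, PERIOD - floor
    waits = [((up_time - t) % PERIOD, "up"), ((down_time - t) % PERIOD, "down")]
    return min(waits)[1]

def p_up(floor, trials=100_000):
    ups = sum(next_direction(floor, random.uniform(0, PERIOD)) == "up"
              for _ in range(trials))
    return ups / trials

print(f"1st floor:  P(next is up) ~ {p_up(1):.3f}   (expect 2/60 ~ 0.033)")
print(f"29th floor: P(next is up) ~ {p_up(29):.3f}   (expect 58/60 ~ 0.967)")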
A similar effect can be observed in railway stations where a station near the end of the line will likely have the next
train headed for the end of the line. Another visualization is to imagine sitting in bleachers near one end of an oval
racetrack: if you are waiting for a single car to pass in front of you, it will be more likely to pass on the straight-away
before entering the turn.

More than one elevator


Interestingly, if there is more than one elevator in a building, the bias decreases, since there is a greater chance
that the intending passenger will arrive at the elevator lobby during the time that at least one elevator is below them;
with an infinite number of elevators, the probabilities would be equal.
In the example above, if there are 30 floors and 58 elevators, so that at every minute there are 2 elevators on each
floor, one going up and one going down (save at the top and bottom), the bias is eliminated: every minute, one
elevator arrives going up and another going down. This also occurs with 30 elevators spaced 2 minutes apart: on odd
floors they alternate up/down arrivals, while on even floors they arrive simultaneously every two minutes.
Watching cars pass on an oval racetrack, one perceives little bias if the time between cars is small compared to the
time required for a car to return past the observer.

The real-world case


In a real building, there are complicated factors such as the tendency of elevators to be frequently required on the
ground or first floor, and to return there when idle. These factors tend to shift the frequency of observed arrivals, but
do not eliminate the paradox entirely. In particular, a user very near the top floor will perceive the paradox even
more strongly, as elevators are infrequently present or required above their floor.
There are other complications of a real building: such as lopsided demand where everyone wants to go down at the
end of the day; the way full elevators skip extra stops; or the effect of short trips where the elevator stays idle. These
complications make the paradox harder to visualize than the race track examples.


References
Martin Gardner, Knotted Doughnuts and Other Mathematical Entertainments, chapter 10. W H Freeman & Co.;
(October 1986). ISBN 0-7167-1799-9.
Martin Gardner, Aha! Gotcha, page 96. W H Freeman & Co.; 1982. ISBN 0-7167-1414-0

External links
A detailed treatment, part 1 (http://www.kwansei.ac.jp/hs/z90010/english/sugakuc/toukei/elevator/
elevator.htm) by Tokihiko Niwa
Part 2: the multi-elevator case (http://www.kwansei.ac.jp/hs/z90010/english/sugakuc/toukei/elevator2/
elevator2.htm)
MathWorld article (http://mathworld.wolfram.com/ElevatorParadox.html) on the elevator paradox

False positive paradox


The false positive paradox is a statistical result where false positive tests are more probable than true positive tests,
occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the
false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but by
the characteristics of the sampled population. When the incidence, the proportion of those who have a given
condition, is lower than the test's false positive rate, even tests that have a very low chance of giving a false positive
in an individual case will give more false than true positives overall.[1] So, in a society with very few infected
people (fewer proportionately than the test gives false positives) there will actually be more who test positive for a
disease incorrectly and don't have it than those who test positive accurately and do. The paradox has surprised many.
It is especially counter-intuitive when interpreting a positive result in a test on a low-incidence population after
having dealt with positive results drawn from a high-incidence population. If the false positive rate of the test is
higher than the proportion of the new population with the condition, then a test administrator whose experience has
been drawn from testing in a high-incidence population may conclude from experience that a positive test result
usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.
Not adjusting to the scarcity of the condition in the new population, and concluding that a positive test result
probably indicates a positive subject even though population incidence is below the false positive rate, is a "base
rate fallacy".

Example
High-incidence population


Number of people   Infected              Uninfected            Total
Test positive      400 (true positive)   30 (false positive)   430
Test negative      0 (false negative)    570 (true negative)   570
Total              400                   600                   1000

Imagine running an HIV test on population A of 1000 persons, in which 40% are infected. The test has a false
positive rate of 5% (0.05) and no false negative rate. The expected outcome of the 1000 tests on population A would
be:
Infected and test indicates disease (true positive)
1000 × 40/100 = 400 people would receive a true positive
Uninfected and test indicates disease (false positive)
1000 × (100 − 40)/100 × 0.05 = 30 people would receive a false positive
The remaining 570 tests are correctly negative.
So, in population A, a person receiving a positive test could be over 93% confident (400/(30 + 400)) that it correctly
indicates infection.

Low-incidence population
Number of people   Infected             Uninfected            Total
Test positive      20 (true positive)   49 (false positive)   69
Test negative      0 (false negative)   931 (true negative)   931
Total              20                   980                   1000

Now consider the same test applied to population B, in which only 2% is infected. The expected outcome of 1000
tests on population B would be:
Infected and test indicates disease (true positive)
1000 × 2/100 = 20 people would receive a true positive
Uninfected and test indicates disease (false positive)
1000 × (100 − 2)/100 × 0.05 = 49 people would receive a false positive
The remaining 931 tests are correctly negative.
In population B, only 20 of the 69 total people with a positive test result are actually infected. So, the probability of
actually being infected after one is told that one is infected is only 29% (20/(20 + 49)) for a test that otherwise
appears to be "95% accurate".
A tester with experience of group A might find it a paradox that in group B, a result that had usually correctly
indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the
prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.
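Both confidence figures follow from one application of Bayes' theorem; this sketch (illustrative helper name) reproduces the 93% and 29% values from the stated incidence and error rates:

def ppv(incidence, false_positive_rate, false_negative_rate=0.0):
    # Positive predictive value: P(infected | positive test).
    true_pos = incidence * (1 - false_negative_rate)
    false_pos = (1 - incidence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

print(f"population A (40% infected): {ppv(0.40, 0.05):.1%}")  # ~93%
print(f"population B ( 2% infected): {ppv(0.02, 0.05):.1%}")  # ~29%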


Discussion
Cory Doctorow discusses this paradox in his book Little Brother.
If you ever decide to do something as stupid as build an automatic terrorism detector, here's a math
lesson you need to learn first. It's called "the paradox of the false positive," and it's a doozy.
Number (rounded)   Has Super-AIDS       Does not have Super-AIDS   Total
Test positive      1 (true positive)    10,000 (false positive)    10,001
Test negative      0 (false negative)   989,999 (true negative)    989,999
Total              1                    999,999                    1,000,000

Say you have a new disease, called Super-AIDS. Only one in a million people gets Super-AIDS. You
develop a test for Super-AIDS that's 99 percent accurate. I mean, 99 percent of the time, it gives the
correct result -- true if the subject is infected, and false if the subject is healthy. You give the test to a
million people. One in a million people have Super-AIDS. One in a hundred people that you test will
generate a "false positive" -- the test will say he has Super-AIDS even though he doesn't. That's what
"99 percent accurate" means: one percent wrong. What's one percent of one million? 1,000,000/100 =
10,000 One in a million people has Super-AIDS. If you test a million random people, you'll probably
only find one case of real Super-AIDS. But your test won't identify one person as having Super-AIDS. It
will identify 10,000 people as having it. Your 99 percent accurate test will perform with 99.99 percent
inaccuracy. That's the paradox of the false positive. When you try to find something really rare, your
test's accuracy has to match the rarity of the thing you're looking for. If you're trying to point at a single
pixel on your screen, a sharp pencil is a good pointer: the pencil-tip is a lot smaller (more accurate) than
the pixels. But a pencil-tip is no good at pointing at a single atom in your screen. For that, you need a
pointer -- a test -- that's one atom wide or less at the tip.
Number (rounded)   Is a terrorist        Is not a terrorist            Total
Test positive      10 (true positive)    200,000 (false positive)      200,010
Test negative      0 (false negative)    19,799,990 (true negative)    19,799,990
Total              10                    19,999,990                    20,000,000

This is the paradox of the false positive, and here's how it applies to terrorism: Terrorists are really rare.
In a city of twenty million like New York, there might be one or two terrorists. Maybe ten of them at the
outside. 10/20,000,000 = 0.00005 percent. One twenty-thousandth of a percent. That's pretty rare all
right. Now, say you've got some software that can sift through all the bank-records, or toll-pass records,
or public transit records, or phone-call records in the city and catch terrorists 99 percent of the time. In a
pool of twenty million people, a 99 percent accurate test will identify two hundred thousand people as
being terrorists. But only ten of them are terrorists. To catch ten bad guys, you have to haul in and
investigate two hundred thousand innocent people. Guess what? Terrorism tests aren't anywhere close to
99 percent accurate. More like 60 percent accurate. Even 40 percent accurate, sometimes.


References
[1] - Citing:

External links
The false positive paradox explained visually (http://www.youtube.com/watch?v=D8VZqxcu0I0) (video)

Gabriel's Horn
Gabriel's Horn (also called Torricelli's trumpet) is a geometric
figure which has infinite surface area but finite volume. The name
refers to the tradition identifying the Archangel Gabriel as the angel
who blows the horn to announce Judgment Day, associating the divine,
or infinite, with the finite. The properties of this figure were first
studied by Italian physicist and mathematician Evangelista Torricelli in
the 17th century.

[Figure: 3D illustration of Gabriel's Horn.]

Mathematical definition
Gabriel's horn is formed by taking the graph of y = 1/x, with the domain x ≥ 1 (thus avoiding the asymptote at
x = 0) and rotating it in three dimensions about the x-axis. The discovery was made using Cavalieri's principle
before the invention of calculus, but today calculus can be used to calculate the volume and surface area of the horn
between x = 1 and x = a, where a > 1. Using integration (see Solid of revolution and Surface of revolution for
details), it is possible to find the volume V and the surface area A:

$V = \pi \int_1^a \frac{1}{x^2}\,dx = \pi \left(1 - \frac{1}{a}\right)$

$A = 2\pi \int_1^a \frac{1}{x} \sqrt{1 + \frac{1}{x^4}}\,dx > 2\pi \int_1^a \frac{dx}{x} = 2\pi \ln a$

[Figure: Graph of y = 1/x.]

The value of a can be as large as required, but it can be seen from the equation that the volume of the part of the
horn between x = 1 and x = a will never exceed π; however, it will get closer and closer to π as a becomes larger.
Mathematically, the volume approaches π as a approaches infinity. Using the limit notation of calculus:

$\lim_{a \to \infty} V = \lim_{a \to \infty} \pi \left(1 - \frac{1}{a}\right) = \pi$

As for the area, the above shows that the area is greater than 2π times the natural logarithm of a. There is no upper
bound for the natural logarithm of a as it approaches infinity. That means, in this case, that the horn has an infinite
surface area. That is to say,

$\lim_{a \to \infty} A \ge \lim_{a \to \infty} 2\pi \ln a = \infty$
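Both limits are easy to check numerically from the closed forms above: the volume approaches π while the lower bound on the area grows without bound.

import math

for a in [10, 100, 1000, 10**6]:
    volume = math.pi * (1 - 1 / a)          # pi * integral of 1/x^2 from 1 to a
    area_lower = 2 * math.pi * math.log(a)  # 2*pi*ln(a) < true surface area
    print(f"a = {a:>7}: volume = {volume:.6f}, area > {area_lower:.2f}")
print(f"volume limit: pi = {math.pi:.6f}")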


Apparent paradox
When the properties of Gabriel's Horn were discovered, the fact that the rotation of an infinitely large section of the
x-y plane about the x-axis generates an object of finite volume was considered paradoxical.
Actually, the section lying in the x-y plane is the only one which has an infinite area, while any other, parallel to it,
has a finite area. The volume, being calculated from the 'weighted sum' of sections, is finite.
The more obvious approach is to treat the horn as a stack of disks with diminishing radii. As their shape is identical,
one is tempted to calculate just the sum of radii, which produces the harmonic series that goes to infinity. A more
careful consideration shows that one should calculate the sum of their squares. Every disk has a radius r = 1/x and
an area πr² = π/x². The series Σ 1/x is divergent, but for any real ε > 0, Σ 1/x^(1+ε) converges.
The apparent paradox formed part of a great dispute over the nature of infinity involving many of the key thinkers of
the time including Thomas Hobbes, John Wallis and Galileo Galilei.

Painter's Paradox
Since the Horn has finite volume but infinite surface area, it seems that it could be filled with a finite quantity of
paint, and yet that paint would not be sufficient to coat its inner surface, an apparent paradox. In fact, in a
theoretical mathematical sense, a finite amount of paint can coat an infinite area, provided the thickness of the coat
becomes vanishingly small "quickly enough" to compensate for the ever-expanding area, which in this case is forced
to happen to an inner-surface coat as the horn narrows. However, to coat the outer surface of the horn with a constant
thickness of paint, no matter how thin, would require an infinite amount of paint.
Of course, in reality, paint is not infinitely divisible, and at some point the horn would become too narrow for even
one molecule to pass.

Converse
The converse phenomenon of Gabriel's horn, a surface of revolution which has a finite surface area but an infinite
volume, cannot occur:

Theorem:
Let $f : [1, \infty) \to [0, \infty)$ be a continuously differentiable function. Write $S$ for the solid of revolution of
the graph $y = f(x)$ about the $x$-axis. If the surface area of $S$ is finite, then so is the volume.

Proof:
Since the lateral surface area $A$ is finite, note the limit superior:

$\limsup_{t \to \infty} f(t)^2 - f(1)^2 = \limsup_{t \to \infty} \int_1^t \left(f(x)^2\right)'\,dx \le \int_1^\infty \left|\left(f(x)^2\right)'\right|\,dx = \int_1^\infty 2 f(x) \left|f'(x)\right|\,dx \le \int_1^\infty 2 f(x) \sqrt{1 + f'(x)^2}\,dx = \frac{A}{\pi} < \infty.$

Therefore, there exists a $t_0$ such that the supremum $\sup\{ f(t) : t \ge t_0 \}$ is finite. Hence,
$M = \sup\{ f(t) : t \ge 1 \}$ must be finite, since $f$ is a continuous function, which implies that $f$ is bounded on
the interval $[1, \infty)$.

Finally, note that the volume:

$V = \int_1^\infty \pi f(x)^2\,dx \le \int_1^\infty \frac{M}{2}\,2\pi f(x)\,dx \le \frac{M}{2} \int_1^\infty 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx = \frac{M}{2} A.$

Therefore: if the area $A$ is finite, then the volume $V$ must also be finite.

Further reading
Gabriel's Other Possessions, Melvin Royer, doi:10.1080/10511970.2010.517601 [1]
Gabriel's Wedding Cake, Julian F. Fleron, http://people.emich.edu/aross15/math121/misc/
gabriels-horn-ma044.pdf
A Paradoxical Paint Pail, Mark Lynch, http://www.maa.org/programs/faculty-and-departments/
classroom-capsules-and-notes/a-paradoxical-paint-pail
Supersolids: Solids Having Finite Volume and Infinite Surfaces, William P. Love, JSTOR 27966098 [2]

References
[1] http://dx.doi.org/10.1080%2F10511970.2010.517601
[2] http://www.jstor.org/stable/27966098

External links

Information and diagrams about Gabriel's Horn (http://curvebank.calstatela.edu/torricelli/torricelli.htm)


Torricelli's Trumpet at PlanetMath (http://planetmath.org/encyclopedia/TorricellisTrumpet.html)
Weisstein, Eric W., "Gabriel's Horn" (http://mathworld.wolfram.com/GabrielsHorn.html), MathWorld.
"Gabriel's Horn" (http://demonstrations.wolfram.com/GabrielsHorn/) by John Snyder, the Wolfram
Demonstrations Project, 2007.
Gabriel's Horn: An Understanding of a Solid with Finite Volume and Infinite Surface Area (http://www.
palmbeachstate.edu/honors/Documents/jeansergejoseph.pdf) by Jean S. Joseph.

Galileo's paradox
Galileo's paradox is a demonstration of one of the surprising properties of infinite sets. In his final scientific work,
Two New Sciences, Galileo Galilei made apparently contradictory statements about the positive integers. First, some
numbers are squares, while others are not; therefore, all the numbers, including both squares and non-squares, must
be more numerous than just the squares. And yet, for every square there is exactly one positive number that is its
square root, and for every number there is exactly one square; hence, there cannot be more of one than of the other.
This is an early use, though not the first, of the idea of one-to-one correspondence in the context of infinite sets.
Galileo concluded that the ideas of less, equal, and greater apply to finite sets, but not to infinite sets. In the
nineteenth century, using the same methods, Cantor showed that this restriction is not necessary. It is possible to
define comparisons amongst infinite sets in a meaningful way (by which definition the two sets he considers,
integers and squares, have "the same size"), and that by this definition some infinite sets are strictly larger than
others.
Galileo also worked on Zeno's paradoxes in order to open the way for his mathematical theory of motion.[1]
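The one-to-one correspondence Galileo describes is concrete: pairing each positive integer n with its square n² matches the two collections element for element, even though the squares thin out in any initial segment. A small illustration:

import math

n = 10
pairs = [(k, k * k) for k in range(1, n + 1)]  # the one-to-one correspondence
print(pairs)  # no positive integer and no square is left unmatched

# Yet in the initial segment 1..n*n, only n of the n*n integers are squares:
squares = [k for k in range(1, n * n + 1) if math.isqrt(k) ** 2 == k]
print(len(squares), "squares among the first", n * n, "numbers")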

Galileo on infinite sets


The relevant section of Two New Sciences is excerpted below:
Simplicio: Here a difficulty presents itself which appears to me insoluble. Since it is clear that we may
have one line greater than another, each containing an infinite number of points, we are forced to admit
that, within one and the same class, we may have something greater than infinity, because the infinity of
points in the long line is greater than the infinity of points in the short line. This assigning to an infinite
quantity a value greater than infinity is quite beyond my comprehension.
Salviati: This is one of the difficulties which arise when we attempt, with our finite minds, to discuss the
infinite, assigning to it those properties which we give to the finite and limited; but this I think is wrong,
for we cannot speak of infinite quantities as being the one greater or less than or equal to another. To
prove this I have in mind an argument which, for the sake of clearness, I shall put in the form of
questions to Simplicio who raised this difficulty.
I take it for granted that you know which of the numbers are squares and which are not.
Simplicio: I am quite aware that a squared number is one which results from the multiplication of
another number by itself; thus 4, 9, etc., are squared numbers which come from multiplying 2, 3, etc., by
themselves.
Salviati: Very well; and you also know that just as the products are called squares so the factors are
called sides or roots; while on the other hand those numbers which do not consist of two equal factors
are not squares. Therefore if I assert that all numbers, including both squares and non-squares, are more
than the squares alone, I shall speak the truth, shall I not?
Simplicio: Most certainly.
Salviati: If I should ask further how many squares there are one might reply truly that there are as many
as the corresponding number of roots, since every square has its own root and every root its own square,
while no square has more than one root and no root more than one square.
Simplicio: Precisely so.
Salviati: But if I inquire how many roots there are, it cannot be denied that there are as many as the
numbers because every number is the root of some square. This being granted, we must say that there
are as many squares as there are numbers because they are just as numerous as their roots, and all the
numbers are roots. Yet at the outset we said that there are many more numbers than squares, since the

larger portion of them are not squares. Not only so, but the proportionate number of squares diminishes
as we pass to larger numbers. Thus up to 100 we have 10 squares, that is, the squares constitute 1/10 part
of all the numbers; up to 10000, we find only 1/100 part to be squares; and up to a million only 1/1000
part; on the other hand in an infinite number, if one could conceive of such a thing, he would be forced
to admit that there are as many squares as there are numbers taken all together.
Sagredo: What then must one conclude under these circumstances?
Salviati: So far as I see we can only infer that the totality of all numbers is infinite, that the number of
squares is infinite, and that the number of their roots is infinite; neither is the number of squares less
than the totality of all the numbers, nor the latter greater than the former; and finally the attributes
"equal," "greater," and "less," are not applicable to infinite, but only to finite, quantities. When therefore
Simplicio introduces several lines of different lengths and asks me how it is possible that the longer ones
do not contain more points than the shorter, I answer him that one line does not contain more or less or
just as many points as another, but that each line contains an infinite number.
Galileo, Two New Sciences

References
[1] Alfred Renyi, Dialogs on Mathematics, Holden-Day, San Francisco, 1967.

External links
Philosophical Method and Galileo's Paradox of Infinity (http://philsci-archive.pitt.edu/archive/00004276/),
Matthew W. Parker, in the PhilSci Archive

Gambler's fallacy
The gambler's fallacy, also known as the Monte Carlo fallacy or the fallacy of the maturity of chances, is the
mistaken belief that if something happens more frequently than normal during some period, then it will happen less
frequently in the future, or that if something happens less frequently than normal during some period, then it will
happen more frequently in the future (presumably as a means of balancing nature). In situations where what is being
observed is truly random (i.e. independent trials of a random process), this belief, though appealing to the human
mind, is false. This fallacy can arise in many practical situations although it is most strongly associated with
gambling where such mistakes are common among players.
The use of the term Monte Carlo fallacy originates from the most famous example of this phenomenon, which
occurred in a Monte Carlo Casino in 1913.[1]


An example: coin-tossing
The gambler's fallacy can be illustrated by considering the repeated toss of a fair coin. With a fair coin, the outcomes
in different tosses are statistically independent and the probability of getting heads on a single toss is exactly 1/2
(one in two). It follows that the probability of getting two heads in two tosses is 1/4 (one in four) and the probability
of getting three heads in three tosses is 1/8 (one in eight). In general, if we let A_i be the event that toss i of a fair
coin comes up heads, then we have

$\Pr\left(\bigcap_{i=1}^n A_i\right) = \prod_{i=1}^n \Pr(A_i) = \frac{1}{2^n}.$

Now suppose that we have just tossed four heads in a row, so that if the next coin toss were also to come up heads, it
would complete a run of five successive heads. Since the probability of a run of five successive heads is only 1/32
(one in thirty-two), a person subject to the gambler's fallacy might believe that this next flip was less likely to be
heads than to be tails. However, this is not correct, and is a manifestation of the gambler's fallacy; the event of 5
heads in a row and the event of "first 4 heads, then a tails" are equally likely, each having probability 1/32. Given
that the first four tosses turn up heads, the probability that the next toss is a head is in fact

$\Pr\left(A_5 \mid A_1 \cap A_2 \cap A_3 \cap A_4\right) = \Pr(A_5) = \frac{1}{2}.$

While a run of five heads is only 1/32 = 0.03125, it is only that before the coin is first tossed. After the first four
tosses the results are no longer unknown, so their probabilities are 1. Reasoning that it is more likely that the next
toss will be a tail than a head due to the past tosses, that a run of luck in the past somehow influences the odds in the
future, is the fallacy.

[Figure: Simulation of coin tosses. Each frame, a coin is flipped which is red on one side and blue on the other. The
result of each flip is added as a colored dot in the corresponding column. As the pie chart shows, the proportion of
red versus blue approaches 50-50 (the law of large numbers), but the difference between red and blue does not
systematically decrease to zero.]

Explaining why the probability is 1/2 for a fair coin


We can see from the above that, if one flips a fair coin 21 times, then the probability of 21 heads is 1 in 2,097,152.
However, the probability of flipping a head after having already flipped 20 heads in a row is simply 1/2. This is an
application of Bayes' theorem.
This can also be seen without knowing that 20 heads have occurred for certain (without applying Bayes' theorem).
Consider the following two probabilities, assuming a fair coin:
probability of 20 heads, then 1 tail = 0.5^20 × 0.5 = 0.5^21
probability of 20 heads, then 1 head = 0.5^20 × 0.5 = 0.5^21
The probability of getting 20 heads then 1 tail, and the probability of getting 20 heads then another head, are both 1 in
2,097,152. Therefore, it is equally likely to flip 21 heads as it is to flip 20 heads and then 1 tail when flipping a fair
coin 21 times. Furthermore, these two probabilities are just as likely as any other of the 2,097,152 possible 21-flip
combinations; every 21-flip combination has probability 0.5^21, or 1 in 2,097,152. From these observations, there is
no reason to assume at any point that a change of luck is warranted based on prior trials (flips), because every
outcome observed will always have been as likely as the other outcomes that were not observed for that particular
trial, given a fair coin. Therefore, just as Bayes' theorem shows, the result of each trial comes down to the base
probability of the fair coin: 1/2.
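This can be checked numerically. The following short Python sketch (an illustration, not part of the original analysis; the trial count is arbitrary) estimates the probability of heads on the fifth toss, conditioned on the first four tosses having come up heads:

import random

random.seed(0)
runs_of_four = 0
fifth_is_heads = 0
for _ in range(1_000_000):
    tosses = [random.random() < 0.5 for _ in range(5)]
    if all(tosses[:4]):              # keep only trials whose first four tosses were heads
        runs_of_four += 1
        fifth_is_heads += tosses[4]  # True counts as 1: heads on the fifth toss
print(fifth_is_heads / runs_of_four)  # prints a value close to 0.5, not below it

The estimate comes out near 0.5, matching the conditional probability computed above.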


Other examples
There is another way to emphasize the fallacy. As already mentioned, the fallacy is built on the notion that previous
failures indicate an increased probability of success on subsequent attempts. This is, in fact, the inverse of what
actually happens, even for a fair chance of a successful event, given a set number of iterations. Assume a fair
16-sided die, where a win is defined as rolling a 1. Assume a player is given 16 rolls to obtain at least one win
(1 − Pr(rolling no 1's in 16 rolls)). The low winning odds are just to make the change in probability more noticeable.
The probability of having at least one win in the 16 rolls is:
1 − (15/16)^16 ≈ 64.4%
However, assume now that the first roll was a loss (a 93.75% chance of that, 15/16). The player now only has 15 rolls
left and, according to the fallacy, should have a higher chance of winning since one loss has occurred. His chances of
having at least one win are now:
1 − (15/16)^15 ≈ 62.0%
Simply by losing one toss the player's probability of winning dropped by about 2 percentage points. By the time this
reaches 5 losses (11 rolls left), his probability of winning on one of the remaining rolls will have dropped to ~50%.
The player's odds for at least one win in those 16 rolls have not increased given a series of losses; his odds have
decreased because he has fewer iterations left to win. In other words, the previous losses in no way contribute to the
odds of the remaining attempts; there are simply fewer remaining attempts in which to gain a win, which results in a
lower probability of obtaining it.
The player becomes more likely to lose in a set number of iterations as he fails to win, and eventually his probability
of winning will again equal the probability of winning a single toss, when only one toss is left: 6.25% in this
instance.
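The numbers in this example can be reproduced with a few lines of Python (a sketch of the computation described above, not part of the original text):

# Chance of at least one win (rolling a 1 on a fair 16-sided die)
# in the remaining rolls, after 0, 1, 5, and 15 losses respectively.
for rolls_left in (16, 15, 11, 1):
    p_win = 1 - (15 / 16) ** rolls_left
    print(rolls_left, f"{p_win:.2%}")
# Output: 16 64.39%, 15 62.02%, 11 50.83%, 1 6.25%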
Some lottery players will choose the same numbers every time, or intentionally change their numbers, but both are
equally likely to win any individual lottery draw. Copying the numbers that won the previous lottery draw gives an
equal probability, although a rational gambler might attempt to predict other players' choices and then deliberately
avoid these numbers. Low numbers (below 31 and especially below 12) are popular because people play birthdays as
their so-called lucky numbers; hence a win in which these numbers are over-represented is more likely to result in a
shared payout.
A joke told among mathematicians demonstrates the nature of the fallacy. When flying on an aircraft, a man decides
to always bring a bomb with him. "The chances of an aircraft having a bomb on it are very small," he reasons, "and
certainly the chances of having two are almost none!" A similar example is in the book The World According to
Garp when the hero Garp decides to buy a house a moment after a small plane crashes into it, reasoning that the
chances of another plane hitting the house have just dropped to zero.

Reverse fallacy
The reversal can also be a fallacy in which a gambler may instead decide, after a consistent tendency towards tails,
that tails are more likely out of some mystical preconception that fate has thus far allowed for consistent results of
tails. Believing the odds to favor tails, the gambler sees no reason to change to heads. Again, the fallacy is the belief
that the "universe" somehow carries a memory of past results which tend to favor or disfavor future outcomes.
However, it is not necessarily a fallacy as a consistent observed tendency towards one outcome may rationally be
taken as evidence that the coin is not fair.
Ian Hacking's unrelated inverse gambler's fallacy describes a situation where a gambler entering a room and seeing a
person rolling a double-six on a pair of dice may erroneously conclude that the person must have been rolling the
dice for quite a while, as they would be unlikely to get a double-six on their first attempt.


Caveats
In most illustrations of the gambler's fallacy and the reversed gambler's fallacy, the trial (e.g. flipping a coin) is
assumed to be fair. In practice, this assumption may not hold.
For example, if one flips a fair coin 21 times, then the probability of 21 heads is 1 in 2,097,152 (see above). If the coin
is fair, then the probability of the next flip being heads is 1/2. However, because the odds of flipping 21 heads in a row
are so slim, it may well be that the coin is somehow biased towards landing on heads, or that it is being controlled by
hidden magnets, or similar.[2] In this case, the smart bet is "heads", because the Bayesian inference from the empirical
evidence (21 "heads" in a row) suggests that the coin is likely to be biased toward "heads", contradicting the
general assumption that the coin is fair.
The opening scene of the play Rosencrantz and Guildenstern Are Dead by Tom Stoppard discusses these issues as
one man continually flips heads and the other considers various possible explanations.

Childbirth
Instances of the gambler's fallacy being applied to childbirth can be traced all the way back to 1796, in Pierre-Simon
Laplace's A Philosophical Essay on Probabilities. Laplace wrote of the ways in which men calculated their
probability of having sons: "I have seen men, ardently desirous of having a son, who could learn only with anxiety of
the births of boys in the month when they expected to become fathers. Imagining that the ratio of these births to
those of girls ought to be the same at the end of each month, they judged that the boys already born would render
more probable the births next of girls." In short, the expectant fathers feared that if more sons were born in the
surrounding community, then they themselves would be more likely to have a daughter.[3]
Some expectant parents believe that, after having multiple children of the same sex, they are "due" to have a child of
the opposite sex. While the Trivers–Willard hypothesis predicts that birth sex is dependent on living conditions (i.e.
more male children are born in "good" living conditions, while more female children are born in poorer living
conditions), the probability of having a child of either gender is still generally regarded as near 50%.

Monte Carlo Casino


The most famous example of the gambler's fallacy occurred in a game of roulette at the Monte Carlo Casino on
August 18, 1913,[4] when the ball fell in black 26 times in a row. This was an extremely uncommon occurrence,
although no more or less common than any of the other 67,108,863 sequences of 26 red or black. Gamblers lost
millions of francs betting against black, reasoning incorrectly that the streak was causing an "imbalance" in the
randomness of the wheel, and that it had to be followed by a long streak of red.
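As a quick arithmetic check (treating red and black as equally likely and ignoring the green zero, as the figure of 67,108,863 above does), any particular colour sequence of 26 spins is one of 2^26 equally likely possibilities:

print(2 ** 26)      # 67108864 equally likely red/black sequences of length 26
print(2 ** 26 - 1)  # 67108863 sequences other than "black 26 times in a row"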

Non-examples of the fallacy


There are many scenarios where the gambler's fallacy might superficially seem to apply, when it actually does not.
When the probability of different events is not independent, the probability of future events can change based on the
outcome of past events (see statistical permutation). Formally, the system is said to have memory. An example of this
is cards drawn without replacement. For example, if an ace is drawn from a deck and not reinserted, the next draw is
less likely to be an ace and more likely to be of another rank. The odds for drawing another ace, assuming that it was
the first card drawn and that there are no jokers, have decreased from 4/52 (7.69%) to 3/51 (5.88%), while the odds for
each other rank have increased from 4/52 (7.69%) to 4/51 (7.84%). This type of effect is what allows card counting
systems to work (for example in the game of blackjack).
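The arithmetic of this drawing-without-replacement example can be verified exactly with Python's standard fractions module (a minimal sketch, not from the original text):

from fractions import Fraction

print(Fraction(4, 52), float(Fraction(4, 52)))  # 1/13, ~0.0769: an ace on the first draw
print(Fraction(3, 51), float(Fraction(3, 51)))  # 1/17, ~0.0588: a second ace after one ace is removed
print(Fraction(4, 51), float(Fraction(4, 51)))  # 4/51, ~0.0784: any other fixed rank on the second draw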
The reversed gambler's fallacy may appear to apply in the story of Joseph Jagger, who hired clerks to record the
results of roulette wheels in Monte Carlo. He discovered that one wheel favored nine particular numbers, and was
able to win large sums of money by betting on them until the casino started rebalancing the roulette wheels daily. In
this situation, the observation of the wheel's behavior provided information about the physical properties of the
wheel rather than its "probability" in some abstract sense, a concept which is the basis of both the gambler's fallacy
and its reversal. Even a biased wheel's past results will not affect future results, but the results can provide
information about what sort of results the wheel tends to produce. However, if it is known for certain that the wheel
is completely fair, then past results provide no information about future ones.
The outcome of future events can be affected if external factors are allowed to change the probability of the events
(e.g., changes in the rules of a game affecting a sports team's performance levels). Additionally, an inexperienced
player's success may decrease after opposing teams discover his weaknesses and exploit them. The player must then
attempt to compensate and randomize his strategy. Such analysis is part of game theory.

Non-example: unknown probability of event


When the probabilities of repeated events are not known, outcomes may not be equally probable. In the case of coin
tossing, as a run of heads gets longer and longer, the likelihood that the coin is biased towards heads increases. If one
flips a coin 21 times in a row and obtains 21 heads, one might rationally conclude a high probability of bias towards
heads, and hence conclude that future flips of this coin are also highly likely to be heads. In fact, Bayesian inference
can be used to show that when the long-run proportions of the different outcomes are unknown but exchangeable
(meaning that the random process from which they are generated may be biased but is equally likely to be biased in
any direction) and previous observations demonstrate the likely direction of the bias, the outcome which has
occurred the most in the observed data is the most likely to occur again.[5]
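A minimal Bayesian sketch of this reasoning, assuming (for illustration only; the article does not fix a prior) a uniform Beta(1, 1) prior over the coin's unknown probability of heads:

# Beta-binomial update: a Beta(a, b) prior combined with h heads and t tails
# gives a Beta(a + h, b + t) posterior over the probability of heads.
heads, tails = 21, 0
a, b = 1 + heads, 1 + tails
print(a / (a + b))  # posterior mean ~0.957: the next flip is most likely heads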

Psychology behind the fallacy


Origins
The gambler's fallacy arises out of a belief in a "law of small numbers", or the erroneous belief that small samples must
be representative of the larger population. According to the fallacy, "streaks" must eventually even out in order to be
representative.[6] Amos Tversky and Daniel Kahneman first proposed that the gambler's fallacy is a cognitive bias
produced by a psychological heuristic called the representativeness heuristic, which states that people evaluate the
probability of a certain event by assessing how similar it is to events they have experienced before, and how similar
the events surrounding those two processes are. According to this view, "after observing a long run of red on the
roulette wheel, for example, most people erroneously believe that black will result in a more representative sequence
than the occurrence of an additional red",[7] so people expect that a short run of random outcomes should share
properties of a longer run, specifically in that deviations from average should balance out. When people are asked to
make up a random-looking sequence of coin tosses, they tend to make sequences where the proportion of heads to
tails stays closer to 0.5 in any short segment than would be predicted by chance (insensitivity to sample size);
Kahneman and Tversky interpret this to mean that people believe short sequences of random events should be
representative of longer ones.[8] The representativeness heuristic is also cited behind the related phenomenon of the
clustering illusion, according to which people see streaks of random events as being non-random when such streaks
are actually much more likely to occur in small samples than people expect.
The gambler's fallacy can also be attributed to the mistaken belief that gambling (or even chance itself) is a fair
process that can correct itself in the event of streaks, otherwise known as the just-world hypothesis.[9] Other
researchers believe that individuals with an internal locus of control (i.e., people who believe that the gambling
outcomes are the result of their own skill) are more susceptible to the gambler's fallacy because they reject the idea
that chance could overcome skill or talent.[10]


Variations of the gambler's fallacy


Some researchers believe that there are actually two types of gambler's fallacy: Type I and Type II. Type I is the
"classic" gambler's fallacy, when individuals believe that a certain outcome is "due" after a long streak of another
outcome. Type II gambler's fallacy, as defined by Gideon Keren and Charles Lewis, occurs when a gambler
underestimates how many observations are needed to detect a favorable outcome (such as watching a roulette wheel
for a length of time and then betting on the numbers that appear most often). Detecting a bias that will lead to a
favorable outcome takes an impractically large amount of time and is very difficult, if not impossible, to do;
therefore people fall prey to the Type II gambler's fallacy.[11] The two types are different in that Type I wrongly
assumes that gambling conditions are fair and perfect, while Type II assumes that the conditions are biased, and that
this bias can be detected after a certain amount of time.
Another variety, known as the retrospective gambler's fallacy, occurs when individuals judge that a seemingly rare
event must come from a longer sequence than a more common event does. For example, people believe that an
imaginary sequence of die rolls is more than three times as long when a set of three 6's is observed as opposed to
when there are only two 6's. This effect can be observed in isolated instances, or even sequentially. A real world
example is that when a teenager becomes pregnant after having unprotected sex, people assume that she has been
engaging in unprotected sex for longer than someone who has been engaging in unprotected sex and is not
pregnant.[12]

Relationship to hot-hand fallacy


Another psychological perspective states that the gambler's fallacy can be seen as the counterpart to basketball's
hot-hand fallacy, in which people tend to predict the same outcome as the last event (positive recency): that a high
scorer will continue to score. In the gambler's fallacy, however, people predict the opposite outcome of the last event
(negative recency): that, for example, since the roulette wheel has landed on black the last six times, it is due to land
on red next. Ayton and Fischer have theorized that people display positive recency for the hot-hand fallacy
because the fallacy deals with human performance, and that people do not believe that an inanimate object can
become "hot." Human performance is not perceived as "random," and people are more likely to continue streaks
when they believe that the process generating the results is nonrandom. Usually, when a person exhibits the
gambler's fallacy, they are more likely to exhibit the hot-hand fallacy as well, suggesting that one construct is
responsible for the two fallacies.
The difference between the two fallacies is also represented in economic decision-making. A study by Huber,
Kirchler, and Stöckl (2010) examined how the hot hand and the gambler's fallacy are exhibited in the financial
market. The researchers gave their participants a choice: they could either bet on the outcome of a series of coin
tosses, use an "expert" opinion to sway their decision, or choose a risk-free alternative instead for a smaller financial
reward. Participants turned to the "expert" opinion to make their decision 24% of the time based on their past
experience of success, which exemplifies the hot-hand. If the expert was correct, 78% of the participants chose the
expert's opinion again, as opposed to 57% doing so when the expert was wrong. The participants also exhibited the
gambler's fallacy, with their selection of either heads or tails decreasing after noticing a streak of that outcome. This
experiment helped bolster Ayton and Fischer's theory that people put more faith in human performance than they do
in seemingly random processes.

Neurophysiology
While the representativeness heuristic and other cognitive biases are the most commonly cited cause of the gambler's
fallacy, research suggests that there may be a neurological component to it as well. Functional magnetic resonance
imaging has revealed that, after losing a bet or gamble ("risk-loss"), the frontoparietal network of the brain is
activated, resulting in more risk-taking behavior. In contrast, there is decreased activity in the amygdala, caudate,
and ventral striatum after a risk-loss. Activation in the amygdala is negatively correlated with the gambler's fallacy: the
more activity exhibited in the amygdala, the less likely an individual is to fall prey to the gambler's fallacy. These
results suggest that gambler's fallacy relies more on the prefrontal cortex (responsible for executive, goal-directed
processes) and less on the brain areas that control affective decision-making.
The desire to continue gambling or betting is controlled by the striatum, which supports a choice-outcome
contingency learning method. The striatum processes the errors in prediction and the behavior changes accordingly.
After a win, the positive behavior is reinforced and after a loss, the behavior is conditioned to be avoided. In
individuals exhibiting the gambler's fallacy, this choice-outcome contingency method is impaired, and they continue
to take risks after a series of losses.

Possible solutions
The gambler's fallacy is a deep-seated cognitive bias and therefore very difficult to eliminate. For the most part,
educating individuals about the nature of randomness has not proven effective in reducing or eliminating any
manifestation of the gambler's fallacy. Participants in an early study by Beach and Swensson (1967) were shown a
shuffled deck of index cards with shapes on them, and were told to guess which shape would come next in a
sequence. The experimental group of participants was informed about the nature and existence of the gambler's
fallacy, and were explicitly instructed not to rely on "run dependency" to make their guesses. The control group was
not given this information. Even so, the response styles of the two groups were similar, indicating that the
experimental group still based their choices on the length of the run sequence. Clearly, instructing individuals about
randomness is not sufficient in lessening the gambler's fallacy.
It does appear, however, that an individual's susceptibility to the gambler's fallacy decreases with age. Fischbein and
Schnarch (1997) administered a questionnaire to five groups: students in grades 5, 7, 9, 11, and college students
specializing in teaching mathematics. None of the participants had received any prior education regarding
probability. The question was, "Ronni flipped a coin three times and in all cases heads came up. Ronni intends to flip
the coin again. What is the chance of getting heads the fourth time?" The results indicated that the older the
students were, the less likely they were to answer with "smaller than the chance of getting tails", which would indicate
a negative recency effect. 35% of the 5th graders, 35% of the 7th graders, and 20% of the 9th graders exhibited the
negative recency effect. Only 10% of the 11th graders answered this way, however, and none of the college students
did. Fischbein and Schnarch therefore theorized that an individual's tendency to rely on the representativeness
heuristic and other cognitive biases can be overcome with age.
Another possible solution that could be seen as more proactive comes from Roney and Trick, Gestalt psychologists
who suggest that the fallacy may be eliminated as a result of grouping. When a future event (e.g., a coin toss) is
described as part of a sequence, no matter how arbitrarily, a person will automatically consider the event as it relates
to the past events, resulting in the gambler's fallacy. When a person considers every event as independent, however,
the fallacy can be greatly reduced.
In their experiment, Roney and Trick told participants that they were betting on either two blocks of six coin tosses,
or on two blocks of seven coin tosses. The fourth, fifth, and sixth tosses all had the same outcome, either three heads
or three tails. The seventh toss was grouped with either the end of one block, or the beginning of the next block.
Participants exhibited the strongest gambler's fallacy when the seventh trial was part of the first block, directly after
the sequence of three heads or tails. Additionally, the researchers pointed out how insidious the fallacy can be: the
participants that did not show the gambler's fallacy showed less confidence in their bets and bet fewer times than the
participants who picked "with" the gambler's fallacy. However, when the seventh trial was grouped with the second
block (and was therefore perceived as not being part of a streak), the gambler's fallacy did not occur.
Roney and Trick argue that a solution to gambler's fallacy could be, instead of teaching individuals about the nature
of randomness, training people to treat each event as if it is a beginning and not a continuation of previous events.
This would prevent people from gambling when they are losing in the vain hope that their chances of winning are
due to increase.


References
[1] "Fallacy Files" blog (http://www.fallacyfiles.org/gamblers.html). What happened at Monte Carlo in 1913.
[2] Martin Gardner, Entertaining Mathematical Puzzles, Dover Publications, pp. 69-70.
[3] Barron, G. and Leider, S. (2010). "The role of experience in the gambler's fallacy". Journal of Behavioral Decision Making, 23, 117-129.
[4] "Roulette", in The Universal Book of Mathematics: From Abracadabra to Zeno's Paradoxes, by David Darling (John Wiley & Sons, 2004), p. 278.
[5] O'Neill, B. and Puza, B.D. (2004). "Dice have no memories but I do: A defence of the reverse gambler's belief" (http://cbe.anu.edu.au/research/papers/pdf/STAT0004WP.pdf). Reprinted in abridged form as O'Neill, B. and Puza, B.D. (2005). "In defence of the reverse gambler's belief". The Mathematical Scientist, 30(1), pp. 13-16.
[6] Burns, B. D. and Corpus, B. (2004). "Randomness and inductions from streaks: 'Gambler's fallacy' versus 'hot hand'". Psychonomic Bulletin and Review, 11, 179-184.
[7] Tversky & Kahneman, 1974.
[8] Tversky & Kahneman, 1971.
[9] Rogers, P. (1998). "The cognitive psychology of lottery gambling: A theoretical review". Journal of Gambling Studies, 14, 111-134.
[10] Sundali, J. and Croson, R. (2006). "Biases in casino betting: The hot hand and the gambler's fallacy". Judgment and Decision Making, 1, 1-12.
[11] Keren, G. and Lewis, C. (1994). "The two fallacies of gamblers: Type I and Type II". Organizational Behavior and Human Decision Processes, 60, 75-89.
[12] Oppenheimer, D. M. and Monin, B. (2009). "The retrospective gambler's fallacy: Unlikely events, constructing the past, and multiple universes". Judgment and Decision Making, 4, 326-334.

Gödel's incompleteness theorems


Gödel's incompleteness theorems are two theorems of mathematical logic that establish inherent limitations of all
but the most trivial axiomatic systems capable of doing arithmetic. The theorems, proven by Kurt Gödel in 1931, are
important both in mathematical logic and in the philosophy of mathematics. The two results are widely, but not
universally, interpreted as showing that Hilbert's program to find a complete and consistent set of axioms for all
mathematics is impossible, giving a negative answer to Hilbert's second problem.
The first incompleteness theorem states that no consistent system of axioms whose theorems can be listed by an
"effective procedure" (e.g., a computer program, but it could be any sort of algorithm) is capable of proving all truths
about the relations of the natural numbers (arithmetic). For any such system, there will always be statements about
the natural numbers that are true, but that are unprovable within the system. The second incompleteness theorem, an
extension of the first, shows that such a system cannot demonstrate its own consistency.

Background
Because statements of a formal theory are written in symbolic form, it is possible to verify mechanically that a
formal proof from a finite set of axioms is valid. This task, known as automatic proof verification, is closely related
to automated theorem proving. The difference is that instead of constructing a new proof, the proof verifier simply
checks that a provided formal proof (or, in some systems, instructions that can be followed to create a formal proof) is correct. This
process is not merely hypothetical; systems such as Isabelle or Coq are used today to formalize proofs and then
check their validity.
Many theories of interest include an infinite set of axioms, however. To verify a formal proof when the set of axioms
is infinite, it must be possible to determine whether a statement that is claimed to be an axiom is actually an axiom.
This issue arises in first order theories of arithmetic, such as Peano arithmetic, because the principle of mathematical
induction is expressed as an infinite set of axioms (an axiom schema).
A formal theory is said to be effectively generated if its set of axioms is a recursively enumerable set. This means
that there is a computer program that, in principle, could enumerate all the axioms of the theory without listing any
statements that are not axioms. This is equivalent to the existence of a program that enumerates all the theorems of
the theory without enumerating any statements that are not theorems. Examples of effectively generated theories
with infinite sets of axioms include Peano arithmetic and Zermelo–Fraenkel set theory.
In choosing a set of axioms, one goal is to be able to prove as many correct results as possible, without proving any
incorrect results. A set of axioms is complete if, for any statement in the axioms' language, either that statement or its
negation is provable from the axioms. A set of axioms is (simply) consistent if there is no statement such that both
the statement and its negation are provable from the axioms. In the standard system of first-order logic, an
inconsistent set of axioms will prove every statement in its language (this is sometimes called the principle of
explosion), and is thus automatically complete. A set of axioms that is both complete and consistent, however,
proves a maximal set of non-contradictory theorems. Gödel's incompleteness theorems show that in certain cases it is
not possible to obtain an effectively generated, complete, consistent theory.

First incompleteness theorem


Gödel's first incompleteness theorem first appeared as "Theorem VI" in Gödel's 1931 paper On Formally
Undecidable Propositions of Principia Mathematica and Related Systems I.
The formal theorem is written in highly technical language. It may be paraphrased in English as:
Any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and
complete. In particular, for any consistent, effectively generated formal theory that proves certain basic
arithmetic truths, there is an arithmetical statement that is true,[1] but not provable in the theory (Kleene 1967,
p.250).
The true but unprovable statement referred to by the theorem is often referred to as "the Gödel sentence" for the
theory. The proof constructs a specific Gödel sentence for each consistent effectively generated theory, but there are
infinitely many statements in the language of the theory that share the property of being true but unprovable. For
example, the conjunction of the Gödel sentence and any logically valid sentence will have this property.
For each consistent formal theory T having the required small amount of number theory, the corresponding Gödel
sentence G asserts: "G cannot be proved within the theory T". This interpretation of G leads to the following
informal analysis. If G were provable under the axioms and rules of inference of T, then T would have a theorem, G,
which effectively contradicts itself, and thus the theory T would be inconsistent. This means that if the theory T is
consistent then G cannot be proved within it, and so the theory T is incomplete. Moreover, the claim G makes about
its own unprovability is correct. In this sense G is not only unprovable but true, and provability-within-the-theory-T
is not the same as truth. This informal analysis can be formalized to make a rigorous proof of the incompleteness
theorem, as described in the section "Proof sketch for the first theorem" below. The formal proof reveals exactly the
hypotheses required for the theory T in order for the self-contradictory nature of G to lead to a genuine contradiction.
Each effectively generated theory has its own Gödel statement. It is possible to define a larger theory T′ that contains
the whole of T, plus G as an additional axiom. This will not result in a complete theory, because Gödel's theorem
will also apply to T′, and thus T′ cannot be complete. In this case, G is indeed a theorem in T′, because it is an axiom.
Since G states only that it is not provable in T, no contradiction is presented by its provability in T′. However,
because the incompleteness theorem applies to T′, there will be a new Gödel statement G′ for T′, showing that T′ is
also incomplete. G′ will differ from G in that G′ will refer to T′, rather than T.
To prove the first incompleteness theorem, Gödel represented statements by numbers. Then the theory at hand,
which is assumed to prove certain facts about numbers, also proves facts about its own statements, provided that it is
effectively generated. Questions about the provability of statements are represented as questions about the properties
of numbers, which would be decidable by the theory if it were complete. In these terms, the Gödel sentence states
that no natural number exists with a certain, strange property. A number with this property would encode a proof of
the inconsistency of the theory. If there were such a number then the theory would be inconsistent, contrary to the
consistency hypothesis. So, under the assumption that the theory is consistent, there is no such number.


Meaning of the first incompleteness theorem


Gödel's first incompleteness theorem shows that any consistent effective formal system that includes enough of the
theory of the natural numbers is incomplete: there are true statements expressible in its language that are unprovable
within the system. Thus no formal system (satisfying the hypotheses of the theorem) that aims to characterize the
natural numbers can actually do so, as there will be true number-theoretical statements which that system cannot
prove. This fact is sometimes thought to have severe consequences for the program of logicism proposed by Gottlob
Frege and Bertrand Russell, which aimed to define the natural numbers in terms of logic (Hellman 1981,
pp. 451–468). Bob Hale and Crispin Wright argue that it is not a problem for logicism because the incompleteness
theorems apply equally to first order logic as they do to arithmetic. They argue that only those who believe that the
natural numbers are to be defined in terms of first order logic have this problem.
The existence of an incomplete formal system is, in itself, not particularly surprising. A system may be incomplete
simply because not all the necessary axioms have been discovered. For example, Euclidean geometry without the
parallel postulate is incomplete; it is not possible to prove or disprove the parallel postulate from the remaining
axioms.
Gödel's theorem shows that, in theories that include a small portion of number theory, a complete and consistent
finite list of axioms can never be created, nor even an infinite list that can be enumerated by a computer program.
Each time a new statement is added as an axiom, there are other true statements that still cannot be proved, even with
the new axiom. If an axiom is ever added that makes the system complete, it does so at the cost of making the system
inconsistent.
There are complete and consistent lists of axioms for arithmetic that cannot be enumerated by a computer program.
For example, one might take all true statements about the natural numbers to be axioms (and no false statements),
which gives the theory known as "true arithmetic". The difficulty is that there is no mechanical way to decide, given
a statement about the natural numbers, whether it is an axiom of this theory, and thus there is no effective way to
verify a formal proof in this theory.
Many logicians believe that Gödel's incompleteness theorems struck a fatal blow to David Hilbert's second problem,
which asked for a finitary consistency proof for mathematics. The second incompleteness theorem, in particular, is
often viewed as making the problem impossible. Not all mathematicians agree with this analysis, however, and the
status of Hilbert's second problem is not yet decided (see "Modern viewpoints on the status of the problem").

Relation to the liar paradox


The liar paradox is the sentence "This sentence is false." An analysis of the liar sentence shows that it cannot be true
(for then, as it asserts, it is false), nor can it be false (for then, it is true). A Gödel sentence G for a theory T makes a
similar assertion to the liar sentence, but with truth replaced by provability: G says "G is not provable in the theory
T." The analysis of the truth and provability of G is a formalized version of the analysis of the truth of the liar
sentence.
It is not possible to replace "not provable" with "false" in a Gödel sentence because the predicate "Q is the Gödel
number of a false formula" cannot be represented as a formula of arithmetic. This result, known as Tarski's
undefinability theorem, was discovered independently by Gödel (when he was working on the proof of the
incompleteness theorem) and by Alfred Tarski.


Extensions of Gödel's original result


Gödel demonstrated the incompleteness of the theory of Principia Mathematica, a particular theory of arithmetic, but
a parallel demonstration could be given for any effective theory of a certain expressiveness. Gödel commented on
this fact in the introduction to his paper, but restricted the proof to one system for concreteness. In modern
statements of the theorem, it is common to state the effectiveness and expressiveness conditions as hypotheses for
the incompleteness theorem, so that it is not limited to any particular formal theory. The terminology used to state
these conditions was not yet developed in 1931 when Gödel published his results.
Gödel's original statement and proof of the incompleteness theorem requires the assumption that the theory is not just
consistent but ω-consistent. A theory is ω-consistent if it is not ω-inconsistent, and is ω-inconsistent if there is a
predicate P such that for every specific natural number m the theory proves ~P(m), and yet the theory also proves
that there exists a natural number n such that P(n). That is, the theory says that a number with property P exists while
denying that it has any specific value. The ω-consistency of a theory implies its consistency, but consistency does
not imply ω-consistency. J. Barkley Rosser (1936) strengthened the incompleteness theorem by finding a variation
of the proof (Rosser's trick) that only requires the theory to be consistent, rather than ω-consistent. This is mostly of
technical interest, since all true formal theories of arithmetic (theories whose axioms are all true statements about
natural numbers) are ω-consistent, and thus Gödel's theorem as originally stated applies to them. The stronger
version of the incompleteness theorem that only assumes consistency, rather than ω-consistency, is now commonly
known as Gödel's incompleteness theorem and as the Gödel–Rosser theorem.

Second incompleteness theorem


Gödel's second incompleteness theorem first appeared as "Theorem XI" in Gödel's 1931 paper On Formally
Undecidable Propositions of Principia Mathematica and Related Systems I.
As with the first incompleteness theorem, Gödel wrote this theorem in highly technical formal mathematics. It may
be paraphrased in English as:
For any formal effectively generated theory T including basic arithmetical truths and also certain truths about
formal provability, if T includes a statement of its own consistency then T is inconsistent.
This strengthens the first incompleteness theorem, because the statement constructed in the first incompleteness
theorem does not directly express the consistency of the theory. The proof of the second incompleteness theorem is
obtained by formalizing the proof of the first incompleteness theorem within the theory itself.
A technical subtlety in the second incompleteness theorem is how to express the consistency of T as a formula in the
language of T. There are many ways to do this, and not all of them lead to the same result. In particular, different
formalizations of the claim that T is consistent may be inequivalent in T, and some may even be provable. For
example, first-order Peano arithmetic (PA) can prove that the largest consistent subset of PA is consistent. But since
PA is consistent, the largest consistent subset of PA is just PA, so in this sense PA "proves that it is consistent".
What PA does not prove is that the largest consistent subset of PA is, in fact, the whole of PA. (The term "largest
consistent subset of PA" is technically ambiguous, but what is meant here is the largest consistent initial segment of
the axioms of PA ordered according to specific criteria; i.e., by "Gödel numbers", the numbers encoding the axioms
as per the scheme used by Gödel mentioned above).
For Peano arithmetic, or any familiar explicitly axiomatized theory T, it is possible to canonically define a formula
Con(T) expressing the consistency of T; this formula expresses the property that "there does not exist a natural
number coding a sequence of formulas, such that each formula is either of the axioms of T, a logical axiom, or an
immediate consequence of preceding formulas according to the rules of inference of first-order logic, and such that
the last formula is a contradiction".
The formalization of Con(T) depends on two factors: formalizing the notion of a sentence being derivable from a set
of sentences and formalizing the notion of being an axiom of T. Formalizing derivability can be done in canonical
fashion: given an arithmetical formula A(x) defining a set of axioms, one can canonically form a predicate ProvA(P)
which expresses that a sentence P is provable from the set of axioms defined by A(x).
In addition, the standard proof of the second incompleteness theorem assumes that ProvA(P) satisfies the
Hilbert–Bernays provability conditions. Letting #(P) represent the Gödel number of a formula P, the derivability
conditions say:
1. If T proves P, then T proves ProvA(#(P)).
2. T proves 1.; that is, T proves that if T proves P, then T proves ProvA(#(P)). In other words, T proves that
ProvA(#(P)) implies ProvA(#(ProvA(#(P)))).
3. T proves that if T proves that (P → Q) and T proves P then T proves Q. In other words, T proves that ProvA(#(P →
Q)) and ProvA(#(P)) imply ProvA(#(Q)).

Implications for consistency proofs


Gödel's second incompleteness theorem also implies that a theory T1 satisfying the technical conditions outlined
above cannot prove the consistency of any theory T2 which proves the consistency of T1. This is because such a
theory T1 can prove that if T2 proves the consistency of T1, then T1 is in fact consistent. For the claim that T1 is
consistent has form "for all numbers n, n has the decidable property of not being a code for a proof of contradiction
in T1". If T1 were in fact inconsistent, then T2 would prove for some n that n is the code of a contradiction in T1. But
if T2 also proved that T1 is consistent (that is, that there is no such n), then it would itself be inconsistent. This
reasoning can be formalized in T1 to show that if T2 is consistent, then T1 is consistent. Since, by the second
incompleteness theorem, T1 does not prove its consistency, it cannot prove the consistency of T2 either.
This corollary of the second incompleteness theorem shows that there is no hope of proving, for example, the
consistency of Peano arithmetic using any finitistic means that can be formalized in a theory the consistency of
which is provable in Peano arithmetic. For example, the theory of primitive recursive arithmetic (PRA), which is
widely accepted as an accurate formalization of finitistic mathematics, is provably consistent in PA. Thus PRA
cannot prove the consistency of PA. This fact is generally seen to imply that Hilbert's program, which aimed to
justify the use of "ideal" (infinitistic) mathematical principles in the proofs of "real" (finitistic) mathematical
statements by giving a finitistic proof that the ideal principles are consistent, cannot be carried out.
The corollary also indicates the epistemological relevance of the second incompleteness theorem. It would actually
provide no interesting information if a theory T proved its consistency. This is because inconsistent theories prove
everything, including their consistency. Thus a consistency proof of T in T would give us no clue as to whether T
really is consistent; no doubts about the consistency of T would be resolved by such a consistency proof. The interest
in consistency proofs lies in the possibility of proving the consistency of a theory T in some theory T′ which is in
some sense less doubtful than T itself, for example weaker than T. For many naturally occurring theories T and T′,
such as T = Zermelo–Fraenkel set theory and T′ = primitive recursive arithmetic, the consistency of T′ is provable in
T, and thus T′ can't prove the consistency of T by the above corollary of the second incompleteness theorem.
The second incompleteness theorem does not rule out consistency proofs altogether, only consistency proofs that
could be formalized in the theory that is proved consistent. For example, Gerhard Gentzen proved the consistency of
Peano arithmetic (PA) in a different theory which includes an axiom asserting that the ordinal called ε0 is
well-founded; see Gentzen's consistency proof. Gentzen's theorem spurred the development of ordinal analysis in
proof theory.


Examples of undecidable statements


See also: List of statements undecidable in ZFC
There are two distinct senses of the word "undecidable" in mathematics and computer science. The first of these is
the proof-theoretic sense used in relation to Gödel's theorems, that of a statement being neither provable nor
refutable in a specified deductive system. The second sense, which will not be discussed here, is used in relation to
computability theory and applies not to statements but to decision problems, which are countably infinite sets of
questions each requiring a yes or no answer. Such a problem is said to be undecidable if there is no computable
function that correctly answers every question in the problem set (see undecidable problem).
Because of the two meanings of the word undecidable, the term independent is sometimes used instead of
undecidable for the "neither provable nor refutable" sense. The usage of "independent" is also ambiguous, however.
Some use it to mean just "not provable",
leaving open whether an independent statement might be refuted.
Undecidability of a statement in a particular deductive system does not, in and of itself, address the question of
whether the truth value of the statement is well-defined, or whether it can be determined by other means.
Undecidability only implies that the particular deductive system being considered does not prove the truth or falsity
of the statement. Whether there exist so-called "absolutely undecidable" statements, whose truth value can never be
known or is ill-specified, is a controversial point in the philosophy of mathematics.
The combined work of Gödel and Paul Cohen has given two concrete examples of undecidable statements (in the
first sense of the term): The continuum hypothesis can neither be proved nor refuted in ZFC (the standard
axiomatization of set theory), and the axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC
axioms except the axiom of choice). These results do not require the incompleteness theorem. Gödel proved in 1940
that neither of these statements could be disproved in ZF or ZFC set theory. In the 1960s, Cohen proved that neither
is provable from ZF, and the continuum hypothesis cannot be proven from ZFC.
In 1973, the Whitehead problem in group theory was shown to be undecidable, in the first sense of the term, in
standard set theory.
Gregory Chaitin produced undecidable statements in algorithmic information theory and proved another
incompleteness theorem in that setting. Chaitin's incompleteness theorem states that for any theory that can represent
enough arithmetic, there is an upper bound c such that no specific number can be proven in that theory to have
Kolmogorov complexity greater than c. While Gödel's theorem is related to the liar paradox, Chaitin's result is
related to Berry's paradox.

Undecidable statements provable in larger systems


These are natural mathematical equivalents of the Gödel "true but undecidable" sentence. They can be proved in a
larger system which is generally accepted as a valid form of reasoning, but are undecidable in a more limited system
such as Peano Arithmetic.
In 1977, Paris and Harrington proved that the Paris-Harrington principle, a version of the Ramsey theorem, is
undecidable in the first-order axiomatization of arithmetic called Peano arithmetic, but can be proven in the larger
system of second-order arithmetic. Kirby and Paris later showed Goodstein's theorem, a statement about sequences
of natural numbers somewhat simpler than the Paris-Harrington principle, to be undecidable in Peano arithmetic.
Kruskal's tree theorem, which has applications in computer science, is also undecidable from Peano arithmetic but
provable in set theory. In fact Kruskal's tree theorem (or its finite form) is undecidable in a much stronger system
codifying the principles acceptable based on a philosophy of mathematics called predicativism. The related but more
general graph minor theorem (2003) has consequences for computational complexity theory.


Limitations of Gödel's theorems


The conclusions of Gödel's theorems are only proven for the formal theories that satisfy the necessary hypotheses.
Not all axiom systems satisfy these hypotheses, even when these systems have models that include the natural
numbers as a subset. For example, there are first-order axiomatizations of Euclidean geometry, of real closed fields,
and of arithmetic in which multiplication is not provably total; none of these meet the hypotheses of Gödel's
theorems. The key fact is that these axiomatizations are not expressive enough to define the set of natural numbers or
develop basic properties of the natural numbers. Regarding the third example, Dan Willard (2001) has studied many
weak systems of arithmetic which do not satisfy the hypotheses of the second incompleteness theorem, and which
are consistent and capable of proving their own consistency (see self-verifying theories).
Gödel's theorems only apply to effectively generated (that is, recursively enumerable) theories. If all true statements
about natural numbers are taken as axioms for a theory, then this theory is a consistent, complete extension of Peano
arithmetic (called true arithmetic) for which none of Gödel's theorems apply in a meaningful way, because this
theory is not recursively enumerable.
The second incompleteness theorem only shows that the consistency of certain theories cannot be proved from the
axioms of those theories themselves. It does not show that the consistency cannot be proved from other (consistent)
axioms. For example, the consistency of Peano arithmetic can be proved in Zermelo–Fraenkel set theory (ZFC),
or in theories of arithmetic augmented with transfinite induction, as in Gentzen's consistency proof.

Relationship with computability


The incompleteness theorem is closely related to several results about undecidable sets in recursion theory.
Stephen Cole Kleene (1943) presented a proof of Gödel's incompleteness theorem using basic results of
computability theory. One such result shows that the halting problem is undecidable: there is no computer program
that can correctly determine, given a program P as input, whether P eventually halts when run with a particular given
input. Kleene showed that the existence of a complete effective theory of arithmetic with certain consistency
properties would force the halting problem to be decidable, a contradiction. This method of proof has also been
presented by Shoenfield (1967, p.132); Charlesworth (1980); and Hopcroft and Ullman (1979).
Franzén (2005, p. 73) explains how Matiyasevich's solution to Hilbert's 10th problem can be used to obtain a proof
of Gödel's first incompleteness theorem. Matiyasevich proved that there is no algorithm that, given a multivariate
polynomial p(x1, x2, ..., xk) with integer coefficients, determines whether there is an integer solution to the equation p
= 0. Because polynomials with integer coefficients, and integers themselves, are directly expressible in the language
of arithmetic, if a multivariate integer polynomial equation p = 0 does have a solution in the integers then any
sufficiently strong theory of arithmetic T will prove this. Moreover, if the theory T is ω-consistent, then it will never
prove that a particular polynomial equation has a solution when in fact there is no solution in the integers. Thus, if T
were complete and ω-consistent, it would be possible to determine algorithmically whether a polynomial equation
has a solution by merely enumerating proofs of T until either "p has a solution" or "p has no solution" is found, in
contradiction to Matiyasevich's theorem. Moreover, for each consistent effectively generated theory T, it is possible
to effectively generate a multivariate polynomial p over the integers such that the equation p = 0 has no solutions
over the integers, but the lack of solutions cannot be proved in T (Davis 2006:416, Jones 1980).
Smorynski (1977, p.842) shows how the existence of recursively inseparable sets can be used to prove the first
incompleteness theorem. This proof is often extended to show that systems such as Peano arithmetic are essentially
undecidable (see Kleene 1967, p.274).
Chaitin's incompleteness theorem gives a different method of producing independent sentences, based on
Kolmogorov complexity. Like the proof presented by Kleene that was mentioned above, Chaitin's theorem only
applies to theories with the additional property that all their axioms are true in the standard model of the natural
numbers. Gödel's incompleteness theorem is distinguished by its applicability to consistent theories that nonetheless
include statements that are false in the standard model; these theories are known as ω-inconsistent.

Proof sketch for the first theorem


Main article: Proof sketch for Gödel's first incompleteness theorem
The proof by contradiction has three essential parts. To begin, choose a formal system that meets the proposed
criteria:
1. Statements in the system can be represented by natural numbers (known as Gödel numbers). The significance of
this is that properties of statements, such as their truth and falsehood, will be equivalent to determining whether
their Gödel numbers have certain properties, and that properties of the statements can therefore be demonstrated
by examining their Gödel numbers. This part culminates in the construction of a formula expressing the idea that
"statement S is provable in the system" (which can be applied to any statement "S" in the system).
2. In the formal system it is possible to construct a number whose matching statement, when interpreted, is
self-referential and essentially says that it (i.e. the statement itself) is unprovable. This is done using a technique
called "diagonalization" (so-called because of its origins as Cantor's diagonal argument).
3. Within the formal system this statement permits a demonstration that it is neither provable nor disprovable in the
system, and therefore the system cannot in fact be ω-consistent. Hence the original assumption that the proposed
system met the criteria is false.

Arithmetization of syntax
The main problem in fleshing out the proof described above is that it seems at first that to construct a statement p
that is equivalent to "p cannot be proved", p would somehow have to contain a reference to p, which could easily
give rise to an infinite regress. Gödel's ingenious technique is to show that statements can be matched with numbers
(often called the arithmetization of syntax) in such a way that "proving a statement" can be replaced with "testing
whether a number has a given property". This allows a self-referential formula to be constructed in a way that avoids
any infinite regress of definitions. The same technique was later used by Alan Turing in his work on the
Entscheidungsproblem.
In simple terms, a method can be devised so that every formula or statement that can be formulated in the system
gets a unique number, called its Gödel number, in such a way that it is possible to mechanically convert back and
forth between formulas and Gödel numbers. The numbers involved might be very long indeed (in terms of number of
digits), but this is not a barrier; all that matters is that such numbers can be constructed. A simple example is the way
in which English is stored as a sequence of numbers in computers using ASCII or Unicode:
The word HELLO is represented by 72-69-76-76-79 using decimal ASCII, i.e., the number 7269767679.
The logical statement x=y => y=x is represented by 120-061-121-032-061-062-032-121-061-120 using
three-digit decimal ASCII, i.e., the number 120061121032061062032121061120.
In principle, proving a statement true or false can be shown to be equivalent to proving that the number matching the
statement does or doesn't have a given property. Because the formal system is strong enough to support reasoning
about numbers in general, it can support reasoning about numbers which represent formulae and statements as well.
Crucially, because the system can support reasoning about properties of numbers, the results are equivalent to
reasoning about provability of their equivalent statements.
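The ASCII-based numbering above is easy to make mechanical. The following Python sketch (an illustration of the idea only, not Gödel's own encoding) converts a statement to a single number using fixed three-digit decimal ASCII codes, and converts it back:

def encode(statement: str) -> int:
    # Concatenate the three-digit decimal ASCII code of each character.
    return int("".join(f"{ord(c):03d}" for c in statement))

def decode(number: int) -> str:
    digits = str(number)
    digits = digits.zfill(len(digits) + (-len(digits)) % 3)  # restore dropped leading zeros
    return "".join(chr(int(digits[i:i + 3])) for i in range(0, len(digits), 3))

n = encode("x=y => y=x")
print(n)          # 120061121032061062032121061120, as in the example above
print(decode(n))  # x=y => y=x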


Construction of a statement about "provability"


Having shown that in principle the system can indirectly make statements about provability, by analyzing properties
of those numbers representing statements, it is now possible to show how to create a statement that actually does this.
A formula F(x) that contains exactly one free variable x is called a statement form or class-sign. As soon as x is
replaced by a specific number, the statement form turns into a bona fide statement, and it is then either provable in
the system, or not. For certain formulas one can show that for every natural number n, F(n) is true if and only if it
can be proven (the precise requirement in the original proof is weaker, but for the proof sketch this will suffice). In
particular, this is true for every specific arithmetic operation between a finite number of natural numbers, such as
"2 × 3 = 6".
Statement forms themselves are not statements and therefore cannot be proved or disproved. But every statement
form F(x) can be assigned a Gödel number denoted by G(F). The choice of the free variable used in the form F(x) is
not relevant to the assignment of the Gödel number G(F).
Now comes the trick: the notion of provability itself can also be encoded by Gödel numbers, in the following way.
Since a proof is a list of statements which obey certain rules, the Gödel number of a proof can be defined. Now, for
every statement p, one may ask whether a number x is the Gödel number of its proof. The relation between the Gödel
number of p and x, the potential Gödel number of its proof, is an arithmetical relation between two numbers.
Therefore there is a statement form Bew(y) that uses this arithmetical relation to state that a Gödel number of a proof
of y exists:
Bew(y) = ∃x (y is the Gödel number of a formula and x is the Gödel number of a proof of the formula
encoded by y).
The name Bew is short for beweisbar, the German word for "provable"; this name was originally used by Gödel to
denote the provability formula just described. Note that "Bew(y)" is merely an abbreviation that represents a
particular, very long, formula in the original language of T; the string "Bew" itself is not claimed to be part of this
language.
An important feature of the formula Bew(y) is that if a statement p is provable in the system then Bew(G(p)) is also
provable. This is because any proof of p would have a corresponding Gödel number, the existence of which causes
Bew(G(p)) to be satisfied.

Diagonalization
The next step in the proof is to obtain a statement that says it is unprovable. Although Gödel constructed this
statement directly, the existence of at least one such statement follows from the diagonal lemma, which says that for
any sufficiently strong formal system and any statement form F there is a statement p such that the system proves
p ↔ F(G(p)).
By letting F be the negation of Bew(x), we obtain the theorem
p ↔ ~Bew(G(p)),
and the p defined by this roughly states that its own Gödel number is the Gödel number of an unprovable formula.
The statement p is not literally equal to ~Bew(G(p)); rather, p states that if a certain calculation is performed, the
resulting Gödel number will be that of an unprovable statement. But when this calculation is performed, the resulting
Gödel number turns out to be the Gödel number of p itself. This is similar to the following sentence in English:
", when preceded by itself in quotes, is unprovable.", when preceded by itself in quotes, is unprovable.
This sentence does not directly refer to itself, but when the stated transformation is made the original sentence is
obtained as a result, and thus this sentence asserts its own unprovability. The proof of the diagonal lemma employs a
similar method.
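The transformation described by this sentence can be checked mechanically. The following sketch (an illustration only; the function name is ours) builds the sentence from its quoted fragment and verifies that applying the stated operation reproduces the sentence itself:

# Illustration of the diagonalization trick in the English sentence above:
# the sentence applies an operation ("precede by itself in quotes") to a
# fragment, and the result of that operation is the sentence itself.

def precede_by_itself_in_quotes(fragment: str) -> str:
    return '"' + fragment + '"' + fragment

fragment = ', when preceded by itself in quotes, is unprovable.'
sentence = precede_by_itself_in_quotes(fragment)
print(sentence)

# The sentence talks about precede_by_itself_in_quotes(fragment),
# which is exactly the sentence itself:
print(sentence == precede_by_itself_in_quotes(fragment))  # True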
Now, assume that the axiomatic system is ω-consistent, and let p be the statement obtained in the previous section.


If p were provable, then Bew(G(p)) would be provable, as argued above. But p asserts the negation of Bew(G(p)).
Thus the system would be inconsistent, proving both a statement and its negation. This contradiction shows that p
cannot be provable.
If the negation of p were provable, then Bew(G(p)) would be provable (because p was constructed to be equivalent
to the negation of Bew(G(p))). However, for each specific number x, x cannot be the Gödel number of the proof of p,
because p is not provable (from the previous paragraph). Thus on one hand the system proves there is a number with
a certain property (that it is the Gödel number of the proof of p), but on the other hand, for every specific number x,
we can prove that it does not have this property. This is impossible in an ω-consistent system. Thus the negation of p
is not provable.
Thus the statement p is undecidable in our axiomatic system: it can neither be proved nor disproved within the
system.
In fact, to show that p is not provable only requires the assumption that the system is consistent. The stronger
assumption of ω-consistency is required to show that the negation of p is not provable. Thus, if p is constructed for a
particular system:
If the system is ω-consistent, it can prove neither p nor its negation, and so p is undecidable.
If the system is consistent, it may have the same situation, or it may prove the negation of p. In the latter case, we
have a statement ("not p") which is false but provable, and the system is not ω-consistent.
If one tries to "add the missing axioms" to avoid the incompleteness of the system, then one has to add either p or
"not p" as axioms. But then the definition of "being a Gödel number of a proof" of a statement changes, which means
that the formula Bew(x) is now different. Thus when we apply the diagonal lemma to this new Bew, we obtain a new
statement p, different from the previous one, which will be undecidable in the new system if it is ω-consistent.

Proof via Berry's paradox


George Boolos (1989) sketches an alternative proof of the first incompleteness theorem that uses Berry's paradox
rather than the liar paradox to construct a true but unprovable formula. A similar proof method was independently
discovered by Saul Kripke (Boolos 1998, p.383). Boolos's proof proceeds by constructing, for any computably
enumerable set S of true sentences of arithmetic, another sentence which is true but not contained in S. This gives the
first incompleteness theorem as a corollary. According to Boolos, this proof is interesting because it provides a
"different sort of reason" for the incompleteness of effective, consistent theories of arithmetic (Boolos 1998, p.388).

Formalized proofs
Formalized proofs of versions of the incompleteness theorem have been developed by Natarajan Shankar in 1986
using Nqthm (Shankar 1994) and by Russell O'Connor in 2003 using Coq (O'Connor 2005).

Proof sketch for the second theorem


The main difficulty in proving the second incompleteness theorem is to show that various facts about provability
used in the proof of the first incompleteness theorem can be formalized within the system using a formal predicate
for provability. Once this is done, the second incompleteness theorem follows by formalizing the entire proof of the
first incompleteness theorem within the system itself.
Let p stand for the undecidable sentence constructed above, and assume that the consistency of the system can be
proven from within the system itself. The demonstration above shows that if the system is consistent, then p is not
provable. The proof of this implication can be formalized within the system, and therefore the statement "p is not
provable", or "not P(p)" can be proven in the system.
But this last statement is equivalent to p itself (and this equivalence can be proven in the system), so p can be proven
in the system. This contradiction shows that the system must be inconsistent.

Discussion and implications


The incompleteness results affect the philosophy of mathematics, particularly versions of formalism, which use a
single system of formal logic to define their principles. One can paraphrase the first theorem as saying the following:
An all-encompassing axiomatic system can never be found that is able to prove all mathematical truths, but no
falsehoods.
On the other hand, from a strict formalist perspective this paraphrase would be considered meaningless because it
presupposes that mathematical "truth" and "falsehood" are well-defined in an absolute sense, rather than relative to
each formal system.
The following rephrasing of the second theorem is even more unsettling to the foundations of mathematics:
If an axiomatic system can be proven to be consistent from within itself, then it is inconsistent.
Therefore, to establish the consistency of a system S, one needs to use some other system T, but a proof in T is not
completely convincing unless T's consistency has already been established without using S.
Theories such as Peano arithmetic, for which any computably enumerable consistent extension is incomplete, are
called essentially undecidable or essentially incomplete.

Minds and machines


Main article: Mechanism (philosophy) – Gödelian arguments
Authors including the philosopher J. R. Lucas and physicist Roger Penrose have debated what, if anything, Gödel's
incompleteness theorems imply about human intelligence. Much of the debate centers on whether the human mind is
equivalent to a Turing machine, or by the Church–Turing thesis, any finite machine at all. If it is, and if the machine
is consistent, then Gödel's incompleteness theorems would apply to it.
Hilary Putnam (1960) suggested that while Gödel's theorems cannot be applied to humans, since they make mistakes
and are therefore inconsistent, they may be applied to the human faculty of science or mathematics in general.
Assuming that it is consistent, either its consistency cannot be proved or it cannot be represented by a Turing
machine.
Avi Wigderson (2010) has proposed that the concept of mathematical "knowability" should be based on
computational complexity rather than logical decidability. He writes that "when knowability is interpreted by modern
standards, namely via computational complexity, the Gödel phenomena are very much with us."

Paraconsistent logic
Although Gödel's theorems are usually studied in the context of classical logic, they also have a role in the study of
paraconsistent logic and of inherently contradictory statements (dialetheia). Graham Priest (1984, 2006) argues that
replacing the notion of formal proof in Gödel's theorem with the usual notion of informal proof can be used to show
that naive mathematics is inconsistent, and uses this as evidence for dialetheism. The cause of this inconsistency is
the inclusion of a truth predicate for a theory within the language of the theory (Priest 2006:47). Stewart Shapiro
(2002) gives a more mixed appraisal of the applications of Gödel's theorems to dialetheism. Carl Hewitt (2008) has
proposed that (inconsistent) paraconsistent logics that prove their own Gödel sentences may have applications in
software engineering.

Appeals to the incompleteness theorems in other fields


Appeals and analogies are sometimes made to the incompleteness theorems in support of arguments that go beyond
mathematics and logic. Several authors have commented negatively on such extensions and interpretations, including
Torkel Franzén (2005); Alan Sokal and Jean Bricmont (1999); and Ophelia Benson and Jeremy Stangroom (2006).
Benson and Stangroom (2006, p.10), for example, quote from Rebecca Goldstein's comments on the disparity
between Gödel's avowed Platonism and the anti-realist uses to which his ideas are sometimes put. Sokal and
Bricmont (1999, p.187) criticize Régis Debray's invocation of the theorem in the context of sociology; Debray has
defended this use as metaphorical (ibid.).

Role of self-reference
Torkel Franzén (2005, p.46) observes:
Gödel's proof of the first incompleteness theorem and Rosser's strengthened version have given many
the impression that the theorem can only be proved by constructing self-referential statements [...] or
even that only strange self-referential statements are known to be undecidable in elementary arithmetic.
To counteract such impressions, we need only introduce a different kind of proof of the first
incompleteness theorem.
He then proposes the proofs based on computability, or on information theory, as described earlier in this article, as
examples of proofs that should "counteract such impressions".

History
After Gödel published his proof of the completeness theorem as his doctoral thesis in 1929, he turned to a second
problem for his habilitation. His original goal was to obtain a positive solution to Hilbert's second problem (Dawson
1997, p.63). At the time, theories of the natural numbers and real numbers similar to second-order arithmetic were
known as "analysis", while theories of the natural numbers alone were known as "arithmetic".
Gödel was not the only person working on the consistency problem. Ackermann had published a flawed consistency
proof for analysis in 1925, in which he attempted to use the method of ε-substitution originally developed by Hilbert.
Later that year, von Neumann was able to correct the proof for a theory of arithmetic without any axioms of
induction. By 1928, Ackermann had communicated a modified proof to Bernays; this modified proof led Hilbert to
announce his belief in 1929 that the consistency of arithmetic had been demonstrated and that a consistency proof of
analysis would likely soon follow. After the publication of the incompleteness theorems showed that Ackermann's
modified proof must be erroneous, von Neumann produced a concrete example showing that its main technique was
unsound (Zach 2006, p.418, Zach 2003, p.33).
In the course of his research, Gödel discovered that although a sentence which asserts its own falsehood leads to
paradox, a sentence that asserts its own non-provability does not. In particular, Gödel was aware of the result now
called Tarski's indefinability theorem, although he never published it. Gödel announced his first incompleteness
theorem to Carnap, Feigl and Waismann on August 26, 1930; all four would attend a key conference in Königsberg
the following week.

Announcement
The 1930 Königsberg conference was a joint meeting of three academic societies, with many of the key logicians of
the time in attendance. Carnap, Heyting, and von Neumann delivered one-hour addresses on the mathematical
philosophies of logicism, intuitionism, and formalism, respectively (Dawson 1996, p.69). The conference also
included Hilbert's retirement address, as he was leaving his position at the University of Göttingen. Hilbert used the
speech to argue his belief that all mathematical problems can be solved. He ended his address by saying,


For the mathematician there is no Ignorabimus, and, in my opinion, not at all for natural science either. ... The
true reason why [no one] has succeeded in finding an unsolvable problem is, in my opinion, that there is no
unsolvable problem. In contrast to the foolish Ignorabimus, our credo avers: We must know. We shall know!
This speech quickly became known as a summary of Hilbert's beliefs on mathematics (its final six words, "Wir
müssen wissen. Wir werden wissen!", were used as Hilbert's epitaph in 1943). Although Gödel was likely in
attendance for Hilbert's address, the two never met face to face (Dawson 1996, p.72).
Gödel announced his first incompleteness theorem at a roundtable discussion session on the third day of the
conference. The announcement drew little attention apart from that of von Neumann, who pulled Gödel aside for
conversation. Later that year, working independently with knowledge of the first incompleteness theorem, von
Neumann obtained a proof of the second incompleteness theorem, which he announced to Gödel in a letter dated
November 20, 1930 (Dawson 1996, p.70). Gödel had independently obtained the second incompleteness theorem
and included it in his submitted manuscript, which was received by Monatshefte für Mathematik on November 17,
1930.
Gödel's paper was published in the Monatshefte in 1931 under the title "Über formal unentscheidbare Sätze der
Principia Mathematica und verwandter Systeme I" ("On Formally Undecidable Propositions in Principia Mathematica
and Related Systems I"). As the title implies, Gödel originally planned to publish a second part of the paper; it was
never written.

Generalization and acceptance


Gödel gave a series of lectures on his theorems at Princeton in 1933–1934 to an audience that included Church,
Kleene, and Rosser. By this time, Gödel had grasped that the key property his theorems required is that the theory
must be effective (at the time, the term "general recursive" was used). Rosser proved in 1936 that the hypothesis of
ω-consistency, which was an integral part of Gödel's original proof, could be replaced by simple consistency if the
Gödel sentence was changed in an appropriate way. These developments left the incompleteness theorems in
essentially their modern form.
Gentzen published his consistency proof for first-order arithmetic in 1936. Hilbert accepted this proof as "finitary",
although (as Gödel's theorem had already shown) it cannot be formalized within the system of arithmetic that is
being proved consistent.
The impact of the incompleteness theorems on Hilbert's program was quickly realized. Bernays included a full proof
of the incompleteness theorems in the second volume of Grundlagen der Mathematik (1939), along with additional
results of Ackermann on the ε-substitution method and Gentzen's consistency proof of arithmetic. This was the first
full published proof of the second incompleteness theorem.

Criticisms
Finsler
Paul Finsler (1926) used a version of Richard's paradox to construct an expression that was false but unprovable in a
particular, informal framework he had developed. Gödel was unaware of this paper when he proved the
incompleteness theorems (Collected Works Vol. IV., p.9). Finsler wrote to Gödel in 1931 to inform him about this
paper, which Finsler felt had priority for an incompleteness theorem. Finsler's methods did not rely on formalized
provability, and had only a superficial resemblance to Gödel's work (van Heijenoort 1967:328). Gödel read the paper
but found it deeply flawed, and his response to Finsler laid out concerns about the lack of formalization
(Dawson:89). Finsler continued to argue for his philosophy of mathematics, which eschewed formalization, for the
remainder of his career.


Zermelo
In September 1931, Ernst Zermelo wrote Gödel to announce what he described as an "essential gap" in Gödel's
argument (Dawson:76). In October, Gödel replied with a 10-page letter (Dawson:76, Grattan-Guinness:512-513).
But Zermelo did not relent and published his criticisms in print with "a rather scathing paragraph on his young
competitor" (Grattan-Guinness:513). Gödel decided that to pursue the matter further was pointless, and Carnap
agreed (Dawson:77). Much of Zermelo's subsequent work was related to logics stronger than first-order logic, with
which he hoped to show both the consistency and categoricity of mathematical theories.
Wittgenstein
Ludwig Wittgenstein wrote several passages about the incompleteness theorems that were published posthumously
in his 1956 Remarks on the Foundations of Mathematics. Gödel was a member of the Vienna Circle during the
period in which Wittgenstein's early ideal language philosophy and Tractatus Logico-Philosophicus dominated the
circle's thinking. Writings in Gödel's Nachlass express the belief that Wittgenstein deliberately misread his ideas.
Multiple commentators have read Wittgenstein as misunderstanding Gödel (Rodych 2003), although Juliet Floyd and
Hilary Putnam (2000), as well as Graham Priest (2004), have provided textual readings arguing that most
commentary misunderstands Wittgenstein. On their release, Bernays, Dummett, and Kreisel wrote separate reviews
of Wittgenstein's remarks, all of which were extremely negative (Berto 2009:208). The unanimity of this criticism
caused Wittgenstein's remarks on the incompleteness theorems to have little impact on the logic community. In
1972, Gödel stated: "Has Wittgenstein lost his mind? Does he mean it seriously?" (Wang 1996:197), and wrote to
Karl Menger that Wittgenstein's comments demonstrate a willful misunderstanding of the incompleteness theorems,
writing:
"It is clear from the passages you cite that Wittgenstein did not understand [the first incompleteness
theorem] (or pretended not to understand it). He interpreted it as a kind of logical paradox, while in fact it is just
the opposite, namely a mathematical theorem within an absolutely uncontroversial part of mathematics
(finitary number theory or combinatorics)." (Wang 1996:197)
Since the publication of Wittgenstein's Nachlass in 2000, a series of papers in philosophy have sought to evaluate
whether the original criticism of Wittgenstein's remarks was justified. Floyd and Putnam (2000) argue that
Wittgenstein had a more complete understanding of the incompleteness theorem than was previously assumed. They
are particularly concerned with the interpretation of a Gödel sentence for an ω-inconsistent theory as actually saying
"I am not provable", since the theory has no models in which the provability predicate corresponds to actual
provability. Rodych (2003) argues that their interpretation of Wittgenstein is not historically justified, while Bays
(2004) argues against Floyd and Putnam's philosophical analysis of the provability predicate. Berto (2009) explores
the relationship between Wittgenstein's writing and theories of paraconsistent logic.

Notes
[1] The word "true" is used disquotationally here: the Gödel sentence is true in this sense because it "asserts its own unprovability and it is indeed
unprovable" (Smoryński 1977 p. 825; also see Franzén 2005 pp. 28–33). It is also possible to read "GT is true" in the formal sense that
primitive recursive arithmetic proves the implication Con(T) → GT, where Con(T) is a canonical sentence asserting the consistency of T
(Smoryński 1977 p. 840, Kikuchi and Tanaka 1994 p. 403). However, the arithmetic statement in question is false in some nonstandard
models of arithmetic.

References
Articles by Gdel
1931, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I." Monatshefte für
Mathematik und Physik 38: 173–198.


1931, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I." and "On formally
undecidable propositions of Principia Mathematica and related systems I" in Solomon Feferman, ed., 1986. Kurt
Gödel: Collected Works, Vol. I. Oxford University Press: 144–195. The original German with a facing English
translation, preceded by a very illuminating introductory note by Kleene.
Hirzel, Martin, 2000, "On formally undecidable propositions of Principia Mathematica and related systems I."
(http://www.research.ibm.com/people/h/hirzel/papers/canon00-goedel.pdf). A modern translation by
Hirzel.
1951, "Some basic theorems on the foundations of mathematics and their implications" in Solomon Feferman, ed.,
1995. Kurt Gödel: Collected Works, Vol. III. Oxford University Press: 304–323.

Translations, during his lifetime, of Gdel's paper into English


None of the following agree in all translated words and in typography. The typography is a serious matter, because
Gödel expressly wished to emphasize "those metamathematical notions that had been defined in their usual sense
before . . ." (van Heijenoort 1967:595). Three translations exist. Of the first, John Dawson states that "the Meltzer
translation was seriously deficient and received a devastating review in the Journal of Symbolic Logic"; Gödel also
complained about Braithwaite's commentary (Dawson 1997:216). "Fortunately, the Meltzer translation was soon
supplanted by a better one prepared by Elliott Mendelson for Martin Davis's anthology The Undecidable . . . he
found the translation "not quite so good" as he had expected . . . [but because of time constraints he] agreed to its
publication" (ibid). (In a footnote Dawson states that "he would regret his compliance, for the published volume was
marred throughout by sloppy typography and numerous misprints" (ibid).) Dawson states that "the translation that
Gödel favored was that by Jean van Heijenoort" (ibid). For the serious student another version exists as a set of
lecture notes recorded by Stephen Kleene and J. B. Rosser "during lectures given by Gödel at the Institute for
Advanced Study during the spring of 1934" (cf. commentary by Davis 1965:39 and beginning on p.41); this version
is titled "On Undecidable Propositions of Formal Mathematical Systems". In their order of publication:
B. Meltzer (translation) and R. B. Braithwaite (Introduction), 1962. On Formally Undecidable Propositions of
Principia Mathematica and Related Systems, Dover Publications, New York (Dover edition 1992), ISBN
0-486-66980-7 (pbk.). This contains a useful translation of Gödel's German abbreviations on pp. 33–34. As noted
above, typography, translation and commentary is suspect. Unfortunately, this translation was reprinted with all
its suspect content by
Stephen Hawking, editor, 2005. God Created the Integers: The Mathematical Breakthroughs That Changed
History, Running Press, Philadelphia, ISBN 0-7624-1922-9. Gödel's paper appears starting on p. 1097, with
Hawking's commentary starting on p. 1089.
Martin Davis, editor, 1965. The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems
and Computable Functions, Raven Press, New York, no ISBN. Gödel's paper begins on page 5, preceded by one
page of commentary.
Jean van Heijenoort, editor, 1967, 3rd edition 1967. From Frege to Gödel: A Source Book in Mathematical Logic,
1879–1931, Harvard University Press, Cambridge Mass., ISBN 0-674-32449-8 (pbk). van Heijenoort did the
translation. He states that "Professor Gödel approved the translation, which in many places was accommodated to
his wishes." (p.595). Gödel's paper begins on p.595; van Heijenoort's commentary begins on p.592.
Martin Davis, editor, 1965, ibid. "On Undecidable Propositions of Formal Mathematical Systems." A copy with
Gödel's corrections of errata and Gödel's added notes begins on page 41, preceded by two pages of Davis's
commentary. Until Davis included this in his volume this lecture existed only as mimeographed notes.


Articles by others
George Boolos, 1989, "A New Proof of the Gödel Incompleteness Theorem", Notices of the American
Mathematical Society v. 36, pp. 388–390 and p. 676, reprinted in Boolos, 1998, Logic, Logic, and Logic, Harvard
Univ. Press. ISBN 0-674-53766-1
Arthur Charlesworth, 1980, "A Proof of Gödel's Theorem in Terms of Computer Programs", Mathematics
Magazine, v. 54 n. 3, pp. 109–121. JStor (http://links.jstor.org/
sici?sici=0025-570X(198105)54:3<109:APOGTI>2.0.CO;2-1&size=LARGE&origin=JSTOR-enlargePage)
Martin Davis, "The Incompleteness Theorem" (http://www.ams.org/notices/200604/fea-davis.pdf), in
Notices of the AMS vol. 53 no. 4 (April 2006), p. 414.
Jean van Heijenoort, 1963. "Gödel's Theorem" in Edwards, Paul, ed., Encyclopedia of Philosophy, Vol. 3.
Macmillan: 348–357.
Geoffrey Hellman, "How to Gödel a Frege-Russell: Gödel's Incompleteness Theorems and Logicism", Noûs, Vol.
15, No. 4, Special Issue on Philosophy of Mathematics (Nov., 1981), pp. 451–468.
David Hilbert, 1900, "Mathematical Problems." (http://aleph0.clarku.edu/~djoyce/hilbert/problems.
html#prob2) English translation of a lecture delivered before the International Congress of Mathematicians at
Paris, containing Hilbert's statement of his Second Problem.
Kikuchi, Makoto; Tanaka, Kazuyuki (1994), "On formalization of model-theoretic proofs of Gödel's theorems",
Notre Dame Journal of Formal Logic 35 (3): 403–412, doi: 10.1305/ndjfl/1040511346 (http://dx.doi.org/10.
1305/ndjfl/1040511346), ISSN 0029-4527 (http://www.worldcat.org/issn/0029-4527), MR 1326122 (http://www.ams.org/mathscinet-getitem?mr=1326122)
Stephen Cole Kleene, 1943, "Recursive predicates and quantifiers", reprinted from Transactions of the American
Mathematical Society, v. 53 n. 1, pp. 41–73, in Martin Davis 1965, The Undecidable (loc. cit.) pp. 255–287.
John Barkley Rosser, 1936, "Extensions of some theorems of Gödel and Church", reprinted from the Journal of
Symbolic Logic vol. 1 (1936) pp. 87–91, in Martin Davis 1965, The Undecidable (loc. cit.) pp. 230–235.
John Barkley Rosser, 1939, "An Informal Exposition of Proofs of Gödel's Theorem and Church's Theorem",
reprinted from the Journal of Symbolic Logic, vol. 4 (1939) pp. 53–60, in Martin Davis 1965, The Undecidable
(loc. cit.) pp. 223–230.
C. Smoryński, "The incompleteness theorems", in J. Barwise, ed., Handbook of Mathematical Logic,
North-Holland 1982, ISBN 978-0-444-86388-1, pp. 821–866.
Dan E. Willard (2001), "Self-Verifying Axiom Systems, the Incompleteness Theorem and Related Reflection
Principles" (http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.jsl/
1183746459), Journal of Symbolic Logic, v. 66 n. 2, pp. 536–596. doi: 10.2307/2695030 (http://dx.doi.org/
10.2307/2695030)
Zach, Richard (2003), "The Practice of Finitism: Epsilon Calculus and Consistency Proofs in Hilbert's Program"
(http://www.ucalgary.ca/~rzach/static/conprf.pdf), Synthese (Berlin, New York: Springer-Verlag) 137 (1):
211–259, doi: 10.1023/A:1026247421383 (http://dx.doi.org/10.1023/A:1026247421383), ISSN 0039-7857
(http://www.worldcat.org/issn/0039-7857)
Richard Zach, 2005, "Paper on the incompleteness theorems" in Grattan-Guinness, I., ed., Landmark Writings in
Western Mathematics. Elsevier: 917–925.

Books about the theorems


Francesco Berto. There's Something about Gödel: The Complete Guide to the Incompleteness Theorem. John
Wiley and Sons. 2010.
Domeisen, Norbert, 1990. Logik der Antinomien. Bern: Peter Lang. 142 pp. ISBN 3-261-04214-1.
Zentralblatt MATH (http://www.zentralblatt-math.org/zbmath/search/?q=an:0724.03003)
Torkel Franzén, 2005. Gödel's Theorem: An Incomplete Guide to Its Use and Abuse. A.K. Peters. ISBN
1-56881-238-8, MR 2007d:03001 (http://www.ams.org/mathscinet-getitem?mr=2007d:03001)
Douglas Hofstadter, 1979. Gödel, Escher, Bach: An Eternal Golden Braid. Vintage Books. ISBN 0-465-02685-0.
1999 reprint: ISBN 0-465-02656-7. MR 80j:03009 (http://www.ams.org/mathscinet-getitem?mr=80j:03009)
Douglas Hofstadter, 2007. I Am a Strange Loop. Basic Books. ISBN 978-0-465-03078-1. ISBN 0-465-03078-5.
MR 2008g:00004 (http://www.ams.org/mathscinet-getitem?mr=2008g:00004)
Stanley Jaki, OSB, 2005. The Drama of the Quantities. Real View Books. (http://www.realviewbooks.com/)
Per Lindström, 1997. Aspects of Incompleteness (http://projecteuclid.org/DPubS?service=UI&version=1.0&
verb=Display&handle=euclid.lnl/1235416274), Lecture Notes in Logic v. 10.
J.R. Lucas, FBA, 1970. The Freedom of the Will. Clarendon Press, Oxford, 1970.
Ernest Nagel, James Roy Newman, Douglas Hofstadter, 2002 (1958). Gödel's Proof, revised ed. ISBN
0-8147-5816-9. MR 2002i:03001 (http://www.ams.org/mathscinet-getitem?mr=2002i:03001)
Rudy Rucker, 1995 (1982). Infinity and the Mind: The Science and Philosophy of the Infinite. Princeton Univ.
Press. MR 84d:03012 (http://www.ams.org/mathscinet-getitem?mr=84d:03012)
Smith, Peter, 2007. An Introduction to Gödel's Theorems. (http://www.godelbook.net/) Cambridge University
Press. MathSciNet (http://www.ams.org/mathscinet/search/publdoc.html?arg3=&co4=AND&co5=AND&
co6=AND&co7=AND&dr=all&pg4=AUCN&pg5=AUCN&pg6=PC&pg7=ALLF&pg8=ET&s4=Smith,
Peter&s5=&s6=&s7=&s8=All&yearRangeFirst=&yearRangeSecond=&yrop=eq&r=2&mx-pid=2384958)
N. Shankar, 1994. Metamathematics, Machines and Gödel's Proof, Volume 38 of Cambridge Tracts in Theoretical
Computer Science. ISBN 0-521-58533-3
Raymond Smullyan, 1991. Gödel's Incompleteness Theorems. Oxford Univ. Press.
Raymond Smullyan, 1994. Diagonalization and Self-Reference. Oxford Univ. Press. MR 96c:03001 (http://www.ams.org/
mathscinet-getitem?mr=96c:03001)
Hao Wang, 1997. A Logical Journey: From Gödel to Philosophy. MIT Press. ISBN 0-262-23189-1, MR
97m:01090 (http://www.ams.org/mathscinet-getitem?mr=97m:01090)

Miscellaneous references
Francesco Berto. "The Gdel Paradox and Wittgenstein's Reasons" Philosophia Mathematica (III) 17. 2009.
John W. Dawson, Jr., 1997. Logical Dilemmas: The Life and Work of Kurt Gdel, A. K. Peters, Wellesley Mass,
ISBN 1-56881-256-6.
Goldstein, Rebecca, 2005, Incompleteness: the Proof and Paradox of Kurt Gdel, W. W. Norton & Company.
ISBN 0-393-05169-2
Juliet Floyd and Hilary Putnam, 2000, "A Note on Wittgenstein's 'Notorious Paragraph' About the Gdel
Theorem", Journal of Philosophy v. 97 n. 11, pp.624632.
Carl Hewitt, 2008, "Large-scale Organizational Computing requires Unstratified Reflection and Strong
Paraconsistency", Coordination, Organizations, Institutions, and Norms in Agent Systems III, Springer-Verlag.
David Hilbert and Paul Bernays, Grundlagen der Mathematik, Springer-Verlag.
John Hopcroft and Jeffrey Ullman 1979, Introduction to Automata Theory, Languages, and Computation,
Addison-Wesley, ISBN 0-201-02988-X
James P. Jones, Undecidable Diophantine Equations (http://www.ams.org/bull/1980-03-02/
S0273-0979-1980-14832-6/S0273-0979-1980-14832-6.pdf), Bulletin of the American Mathematical Society v. 3
n. 2, 1980, pp.859862.


Stephen Cole Kleene, 1967. Mathematical Logic. Reprinted by Dover, 2002. ISBN 0-486-42533-9
Russell O'Connor, 2005, "Essential Incompleteness of Arithmetic Verified by Coq" (http://arxiv.org/abs/cs/
0505034), Lecture Notes in Computer Science v. 3603, pp. 245–260.
Graham Priest, 2006. In Contradiction: A Study of the Transconsistent, Oxford University Press, ISBN
0-19-926329-9
Graham Priest, 2004, "Wittgenstein's Remarks on Gödel's Theorem" in Max Kölbel, ed., Wittgenstein's Lasting
Significance, Psychology Press, pp. 207–227.
Graham Priest, 1984, "Logic of Paradox Revisited", Journal of Philosophical Logic, v. 13, n. 2, pp. 153–179
Hilary Putnam, 1960, "Minds and Machines" in Sidney Hook, ed., Dimensions of Mind: A Symposium. New York
University Press. Reprinted in Anderson, A. R., ed., 1964. Minds and Machines. Prentice-Hall: 77.
Rautenberg, Wolfgang (2010), A Concise Introduction to Mathematical Logic (http://www.springerlink.com/
content/978-1-4419-1220-6/) (3rd ed.), New York: Springer Science+Business Media, doi:
10.1007/978-1-4419-1221-3 (http://dx.doi.org/10.1007/978-1-4419-1221-3), ISBN 978-1-4419-1220-6.
Victor Rodych, 2003, "Misunderstanding Gödel: New Arguments about Wittgenstein and New Remarks by
Wittgenstein", Dialectica v. 57 n. 3, pp. 279–313. doi: 10.1111/j.1746-8361.2003.tb00272.x (http://dx.doi.org/
10.1111/j.1746-8361.2003.tb00272.x)
Stewart Shapiro, 2002, "Incompleteness and Inconsistency", Mind, v. 111, pp. 817–832. doi:
10.1093/mind/111.444.817 (http://dx.doi.org/10.1093/mind/111.444.817)
Alan Sokal and Jean Bricmont, 1999. Fashionable Nonsense: Postmodern Intellectuals' Abuse of Science,
Picador. ISBN 0-312-20407-8
Joseph R. Shoenfield (1967), Mathematical Logic. Reprinted by A.K. Peters for the Association for Symbolic
Logic, 2001. ISBN 978-1-56881-135-2
Jeremy Stangroom and Ophelia Benson, Why Truth Matters, Continuum. ISBN 0-8264-9528-1
George Tourlakis, Lectures in Logic and Set Theory, Volume 1, Mathematical Logic, Cambridge University Press,
2003. ISBN 978-0-521-75373-9
Wigderson, Avi (2010), "The Gödel Phenomena in Mathematics: A Modern View" (http://www.math.ias.edu/
~avi/BOOKS/Godel_Widgerson_Text.pdf), Kurt Gödel and the Foundations of Mathematics: Horizons of
Truth, Cambridge University Press
Hao Wang, 1996. A Logical Journey: From Gödel to Philosophy, The MIT Press, Cambridge MA, ISBN
0-262-23189-1.
Richard Zach, 2006, "Hilbert's program then and now" (http://www.ucalgary.ca/~rzach/static/hptn.pdf), in
Philosophy of Logic, Dale Jacquette (ed.), Handbook of the Philosophy of Science, v. 5, Elsevier, pp. 411–447.

External links
Gödel's Incompleteness Theorems (http://www.bbc.co.uk/programmes/b00dshx3) on In Our Time at the
BBC. (listen now (http://www.bbc.co.uk/iplayer/console/b00dshx3/
In_Our_Time_Godel's_Incompleteness_Theorems))
Stanford Encyclopedia of Philosophy: "Kurt Gödel" (http://plato.stanford.edu/entries/goedel/) by Juliette
Kennedy.
MacTutor biographies:
Kurt Gödel. (http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Godel.html)
Gerhard Gentzen. (http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Gentzen.html)
What is Mathematics: Gödel's Theorem and Around (http://podnieks.id.lv/gt.html) by Karlis Podnieks. An
online free book.
World's shortest explanation of Gödel's theorem (http://blog.plover.com/math/Gdl-Smullyan.html) using a
printing machine as an example.


October 2011 RadioLab episode (http://www.radiolab.org/2011/oct/04/break-cycle/) about/including
Gödel's incompleteness theorem
Hazewinkel, Michiel, ed. (2001), "Gödel incompleteness theorem" (http://www.encyclopediaofmath.org/index.
php?title=p/g044530), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4

Interesting number paradox


The interesting number paradox is a semi-humorous paradox which arises from the attempt to classify natural
numbers as "interesting" or "dull". The paradox states that all natural numbers are interesting. The "proof" is by
contradiction: if there existed a non-empty set of uninteresting numbers, there would be a smallest uninteresting
number; but the smallest uninteresting number is itself interesting because it is the smallest uninteresting number,
producing a contradiction.

Paradoxical nature
Attempting to classify all numbers this way leads to a paradox or an antinomy of definition. Any hypothetical
partition of natural numbers into interesting and dull sets seems to fail. Since the definition of interesting is usually a
subjective, intuitive notion of "interesting", it should be understood as a half-humorous application of self-reference
in order to obtain a paradox.
The paradox is alleviated if "interesting" is instead defined objectively: for example, the smallest natural number that
does not appear in an entry of the On-Line Encyclopedia of Integer Sequences (OEIS) was originally found to be
11630 on 12 June 2009. The number fitting this definition later became 12407, from November 2009 until at least
November 2011, then 13794 as of April 2012, until it appeared in sequence OEIS:A218631 as of 3 November 2012.
Since November 2013, that number was 14228, at least until 14 April 2014. Depending on the sources used for the
list of interesting numbers, a variety of other numbers can be characterized as uninteresting in the same way. This
might be better described as "not known to be interesting".
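As an illustration of the objective definition, here is a minimal sketch; the catalog of "numbers appearing in OEIS entries" is invented toy data, not the real database:

# Find the smallest number absent from a catalog of "interesting" numbers.
# The catalog below is fabricated purely for illustration.
def smallest_uninteresting(catalog, limit=100000):
    for n in range(limit):
        if n not in catalog:
            return n
    return None

catalog = set(range(100000)) - {11630, 12407, 13794, 14228}  # toy data
print(smallest_uninteresting(catalog))  # 11630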
However, as there are many significant results in mathematics that make use of self-reference (such as Gdel's
Incompleteness Theorem), the paradox illustrates some of the power of self-reference, and thus touches on serious
issues in many fields of study.
This version of the paradox applies only to well-ordered sets with a natural order, such as the natural numbers; the
argument would not apply to the real numbers.
One proposed resolution of the paradox asserts that only the first uninteresting number is made interesting by that
fact. For example, if 39 and 41 were the first two uninteresting numbers, then 39 would become interesting as a
result, but 41 would not, since it is not the first uninteresting number.[1] However, this resolution is invalid, since the
paradox is proved by contradiction: assuming that there is any uninteresting number, we arrive at the fact that that
same number is interesting, hence no number can be uninteresting; its aim is not in particular to identify the
interesting or uninteresting numbers, but to speculate whether any number can in fact exhibit such properties.
An obvious weakness in the proof is that what qualifies as "interesting" is not defined. However, assuming this
predicate is defined with a finite, definite list of "interesting properties of positive integers", and is defined
self-referentially to include the smallest number not in such a list, a paradox arises. The Berry paradox is closely
related, since it arises from a similar self-referential definition. As the paradox lies in the definition of "interesting",
it applies only to persons with particular opinions on numbers: if one's view is that all numbers are boring, and one
finds uninteresting the observation that 0 is the smallest boring number, there is no paradox.

Notes
[1] Clark, M., 2007, Paradoxes from A to Z, Routledge, ISBN 0-521-46168-5.

Further reading
Gardner, Martin (1959). Mathematical Puzzles and Diversions. ISBN0-226-28253-8.
Gleick, James (2010). The Information (chapter 12). New York: Pantheon Books. ISBN978-0-307-37957-3.

External links
Most Boring Day in History (http://articles.timesofindia.indiatimes.com/2010-11-26/uk/
28233422_1_search-true-knowledge-20th-century)
A list of "special properties" for each of the first 10000 natural numbers (http://www2.stetson.edu/~efriedma/
numbers.html)

KleeneRosser paradox
In mathematics, the KleeneRosser paradox is a paradox that shows that certain systems of formal logic are
inconsistent, in particular the version of Curry's combinatory logic introduced in 1930, and Church's original lambda
calculus, introduced in 1932–1933, both originally intended as systems of formal logic. The paradox was exhibited
by Stephen Kleene and J. B. Rosser in 1935.

The paradox
Kleene and Rosser were able to show that both systems are able to characterize and enumerate their provably total,
definable number-theoretic functions, which enabled them to construct a term that essentially replicates the Richard
paradox in formal language.
Curry later managed to identify the crucial ingredients of the calculi that allowed the construction of this paradox,
and used this to construct a much simpler paradox, now known as Curry's paradox.

References
Andrea Cantini, "The inconsistency of certain formal logics [1]", in the Paradoxes and Contemporary Logic entry
of Stanford Encyclopedia of Philosophy (2007).
Kleene, S. C. & Rosser, J. B. (1935). "The inconsistency of certain formal logics". Annals of Mathematics 36 (3):
630636. doi:10.2307/1968646 [2].

References
[1] http:/ / plato. stanford. edu/ entries/ paradoxes-contemporary-logic/ #IncCerForLog
[2] http:/ / dx. doi. org/ 10. 2307%2F1968646


Lindley's paradox
Lindley's paradox is a counterintuitive situation in statistics in which the Bayesian and frequentist approaches to a
hypothesis testing problem give different results for certain choices of the prior distribution. The problem of the
disagreement between the two approaches was discussed in Harold Jeffreys' 1939 textbook; it became known as
Lindley's paradox after Dennis Lindley called the disagreement a paradox in a 1957 paper.
Although referred to as a paradox, the differing results from the Bayesian and frequentist approaches can be
explained as the two methods answering fundamentally different questions, rather than actually disagreeing with
each other.

Description of the paradox


Consider the result x of some experiment, with two possible explanations, hypotheses H0 and H1, and some prior
distribution π representing uncertainty as to which hypothesis is more accurate before taking into account x.
Lindley's paradox occurs when
1. The result x is "significant" by a frequentist test of H0, indicating sufficient evidence to reject H0, say, at the
5% level, and
2. The posterior probability of H0 given x is high, indicating strong evidence that H0 is in better agreement with x
than H1.
These results can occur at the same time when H0 is very specific, H1 more diffuse, and the prior distribution does
not strongly favor one or the other, as seen below.

Numerical example
We can illustrate Lindley's paradox with a numerical example. Imagine a certain city where 49,581 boys and 48,870
girls have been born over a certain time period. The observed proportion of male births is thus 49,581/98,451 ≈
0.5036. We assume the number of male births is a binomial variable with parameter θ. We are interested in testing
whether θ is 0.5 or some other value. That is, our null hypothesis is H0: θ = 0.5, and the alternative is H1: θ ≠ 0.5.

Frequentist approach
The frequentist approach to testing H0 is to compute a p-value, the probability of observing a fraction of boys at
least as large as x assuming H0 is true. Because the number of births is very large, we can use a normal
approximation for the number of male births, X ~ N(μ, σ²), with μ = nθ = 98,451 × 0.5 = 49,225.5 and
σ² = nθ(1 − θ) = 98,451 × 0.5 × 0.5 = 24,612.75, to compute

    P(X ≥ 49,581 | μ = 49,225.5, σ² = 24,612.75) ≈ 0.0117.

We would have been equally surprised if we had seen 49,581 female births, i.e. x ≈ 0.4964, so a frequentist
would usually perform a two-sided test, for which the p-value would be p ≈ 2 × 0.0117 = 0.0235. In both
cases, the p-value is lower than the significance level of 5%, so the frequentist approach rejects H0 as disagreeing
with the observed data.

Bayesian approach
Assuming no reason to favor one hypothesis over the other, the Bayesian approach would be to assign prior
probabilities P(H0) = P(H1) = 0.5, and then to compute the posterior probability of H0 using Bayes'
theorem:

    P(H0 | k) = P(k | H0) P(H0) / [P(k | H0) P(H0) + P(k | H1) P(H1)].

After observing k = 49,581 boys out of n = 98,451 births, we can compute the posterior probability of each
hypothesis using the probability mass function for a binomial variable:

    P(k | H0) = C(n, k) (1/2)^n ≈ 1.95 × 10^-4,
    P(k | H1) = ∫₀¹ C(n, k) θ^k (1 − θ)^(n−k) dθ = C(n, k) B(k + 1, n − k + 1) = 1/(n + 1) ≈ 1.02 × 10^-5,

where B is the Beta function.
From these values, we find the posterior probability P(H0 | k) ≈ 0.95, which strongly favors H0 over H1.
The two approaches, the Bayesian and the frequentist, appear to be in conflict, and this is the "paradox".
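Both computations can be checked directly; the following sketch (using SciPy, not part of the original article; values are rounded as quoted above) reproduces the frequentist p-values and the Bayesian posterior:

# Reproduces the frequentist p-value and the Bayesian posterior for the
# birth-ratio example (a sketch; the printed values are approximate).
from scipy.stats import binom

n, k = 98451, 49581

# Frequentist: exact one-sided and two-sided p-values under H0: theta = 0.5.
p_one_sided = binom.sf(k - 1, n, 0.5)          # P(X >= k), ~ 0.0117
p_two_sided = 2 * p_one_sided                  # ~ 0.0235

# Bayesian: likelihoods under H0 and under H1 with a uniform prior on theta.
like_h0 = binom.pmf(k, n, 0.5)                 # ~ 1.95e-4
like_h1 = 1.0 / (n + 1)                        # integral of the binomial pmf
posterior_h0 = like_h0 / (like_h0 + like_h1)   # ~ 0.95 with equal priors

print(p_one_sided, p_two_sided, posterior_h0)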

The lack of an actual paradox


The apparent disagreement between the two approaches is caused by a combination of factors. First, the frequentist
approach above tests H0 without reference to H1. The Bayesian approach evaluates H0 as an alternative to H1,
and finds the first to be in better agreement with the observations. This is because the latter hypothesis is much more
diffuse, as θ can be anywhere in [0, 1], which results in it having a very low posterior probability. To understand
why, it is helpful to consider the two hypotheses as generators of the observations:
Under H0, we choose θ = 0.5, and ask how likely it is to see 49,581 boys in 98,451 births.
Under H1, we choose θ randomly from anywhere within 0 to 1, and ask the same question.
Most of the possible values for θ under H1 are very poorly supported by the observations. In essence, the apparent
disagreement between the methods is not a disagreement at all, but rather two different statements about how the
hypotheses relate to the data:
The frequentist finds that H0 is a poor explanation for the observation.
The Bayesian finds that H0 is a far better explanation for the observation than H1.
The ratio of the sex of newborns is improbably 50/50 male/female, according to the frequentist test. Yet 50/50 is a
better approximation than most, but not all, other ratios. The hypothesis θ ≈ 0.5036 would have fit the observation
much better than almost all other ratios, including θ = 0.5.
For example, this choice of hypotheses and prior probabilities implies the statement: "if θ > 0.49 and θ < 0.51,
then the prior probability of θ being exactly 0.5 is 0.50/0.51 ≈ 98%." Given such a strong preference for θ = 0.5,
it is easy to see why the Bayesian approach favors H0 in the face of x ≈ 0.5036, even though the observed
value of x lies over 2σ away from 0.5. The deviation of over 2 sigma from H0 is considered significant in the
frequentist approach, but its significance is overruled by the prior in the Bayesian approach.
Looking at it another way, we can see that the prior distribution is essentially flat with a delta function at θ = 0.5.
Clearly this is dubious. In fact, if you were to picture real numbers as being continuous, then it would be more logical
to assume that it is impossible for any given number to be exactly the parameter value, i.e., we should assume
P(θ = 0.5) = 0.
A more realistic distribution for θ in the alternative hypothesis produces a less surprising result for the posterior of
H0. For example, if we replace H1 with H2: θ = x, i.e., the maximum likelihood estimate for θ, the
posterior probability of H0 would be only 0.07, compared to 0.93 for H2. (Of course, one cannot actually use the
MLE as part of a prior distribution.)

Reconciling the Bayesian and Frequentist approaches


If one uses an uninformative prior and tests a hypothesis more similar to the one in the frequentist approach, the
paradox disappears.
For example, if we calculate the posterior distribution P(θ | x, n) using a uniform prior distribution on θ (i.e.,
π(θ) = 1 on [0, 1]), we find the posterior is the Beta distribution

    P(θ | k, n) = Beta(k + 1, n − k + 1).

If we use this to check the probability that a newborn is more likely to be a boy than a girl, i.e., P(θ > 0.5 | k, n),
we find this probability is about 0.98. In other words, it is very likely that the proportion of male births is above 0.5.
Neither analysis gives an estimate of the effect size directly, but both could be used to determine, for instance, if the
fraction of boy births is likely to be above some particular threshold.
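This posterior tail probability is easy to check numerically (again a sketch using SciPy):

# Posterior tail P(theta > 0.5 | k, n) under a uniform prior: Beta(k+1, n-k+1).
from scipy.stats import beta

n, k = 98451, 49581
print(beta.sf(0.5, k + 1, n - k + 1))  # ~ 0.988: boys are very likely the majority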

References
Shafer, Glenn (1982). "Lindley's paradox". Journal of the American Statistical Association 77 (378): 325–334.
doi: 10.2307/2287244 (http://dx.doi.org/10.2307/2287244). JSTOR 2287244 (http://www.jstor.org/stable/
2287244). MR 664677 (http://www.ams.org/mathscinet-getitem?mr=664677).

Low birth weight paradox


The low birth-weight paradox is an apparently paradoxical observation relating to the birth weights and mortality
rate of children born to tobacco smoking mothers. Low birth-weight children born to smoking mothers have a lower
infant mortality rate than the low birth weight children of non-smokers. It is an example of Simpson's paradox.

History
Traditionally, babies weighing less than a certain amount (which varies between countries) have been classified as
having low birth weight. In a given population, low birth weight babies have a significantly higher mortality rate
than others; thus, populations with a higher rate of low birth weights typically also have higher rates of child
mortality than other populations.
Based on prior research, the children of smoking mothers are more likely to be of low birth weight than children of
non-smoking mothers. Thus, by extension the child mortality rate should be higher among children of smoking
mothers. So it is a surprising real-world observation that low birth weight babies of smoking mothers have a lower
child mortality than low birth weight babies of non-smokers.


Explanation
At first sight these findings seemed to suggest that, at least for some babies, having a smoking mother might be
beneficial to one's health. However, the paradox can be explained statistically by uncovering a lurking variable
between smoking and the two key variables: birth weight and risk of mortality. Both variables are acted on
independently by smoking and other adverse conditions: birth weight is lowered and the risk of mortality
increases. However, each condition does not necessarily affect both variables to the same extent.
The birth weight distribution for children of smoking mothers is shifted to lower weights by their mothers' actions.
Therefore, otherwise healthy babies (who would weigh more if it were not for the fact their mother smoked) are born
underweight. However, they still have a lower mortality rate than children who have other, more severe, medical
reasons why they are born underweight.
In short, smoking may be harmful in that it contributes to low birth weight, but other causes of low birth
weight are generally more harmful.
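The mechanism can be reproduced with a toy simulation; all rates below are invented solely to exhibit the qualitative effect, not taken from real birth data:

# Toy simulation of the lurking-variable explanation (all rates invented).
# Low birth weight can be caused by smoking (mildly harmful) or by severe
# medical conditions (very harmful). Among low-weight babies, smokers' babies
# are more often the "mild cause" kind, so their group mortality looks lower.
import random

random.seed(0)

def simulate(smoker):
    severe = random.random() < 0.02            # rare, severe condition
    low_weight = (severe
                  or (smoker and random.random() < 0.30)
                  or random.random() < 0.03)
    mortality = 0.20 if severe else (0.02 if smoker else 0.01)
    died = random.random() < mortality
    return low_weight, died

for smoker in (False, True):
    babies = [simulate(smoker) for _ in range(200000)]
    low = [died for lw, died in babies if lw]
    print("smoker" if smoker else "non-smoker",
          "low-weight mortality:", round(sum(low) / len(low), 4))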

Evidence
If one adjusts for the confounding by smoking, via stratification or multivariable regression modelling to
statistically control for smoking, one finds that the association between birth weight and mortality may be
attenuated towards the null. Nevertheless, most epidemiologic studies of birth weight and mortality have controlled
for maternal smoking, and the adjusted results, although attenuated after adjusting for smoking, still indicated a
significant association.
Additional support for the hypothesis that birth weight and mortality can be acted on independently came from the
analysis of birth data from Colorado: compared with the birth weight distribution in the US as a whole, the
distribution curve in Colorado is also shifted to lower weights. However, the overall child mortality of Colorado
children is the same as that for US children, and if one corrects for the lower weights as above, one finds that babies
of a given (corrected) weight are just as likely to die, whether they are from Colorado or not. The likely explanation
here is that the higher altitude of Colorado affects birth weight, but not mortality.

References
Wilcox, Allen (2001). "On the importance and the unimportance of birthweight" [1]. International Journal
of Epidemiology. 30: 1233–1241.
Wilcox, Allen (2006). "The Perils of Birth Weight – A Lesson from Directed Acyclic Graphs" [2]. American
Journal of Epidemiology. 164(11): 1121–1123.

External links
The Analysis of Birthweight [3], by Allen Wilcox

References
[1] http:/ / eb. niehs. nih. gov/ bwt/ V0M3QDQU. pdf
[2] http:/ / aje. oxfordjournals. org/ cgi/ content/ abstract/ 164/ 11/ 1121
[3] http:/ / eb. niehs. nih. gov/ bwt/ index. htm


Missing square puzzle


The missing square puzzle is an optical illusion used in mathematics classes to help students reason about
geometrical figures (or rather to teach them not to reason using figures, but only using the textual description
thereof and the axioms of geometry: a figure is not a proof!). It depicts two arrangements made of similar shapes in
slightly different configurations. Each apparently forms a 13×5 right-angled triangle, but one has a 1×1 hole in it.

Missing square puzzle animation

Solution
The key to the puzzle is the fact that neither of the 13×5 "triangles" is truly a triangle, because what appears to be
the hypotenuse is bent. In other words, the "hypotenuse" does not maintain a consistent slope, even though it may
appear that way to the human eye. A true 13×5 triangle cannot be created from the given component parts. The four
figures (the yellow, red, blue and green shapes) total 32 units of area. The apparent triangles formed from the figures
are 13 units wide and 5 units tall, so it appears that the area should be S = (13 × 5)/2 = 32.5 units. However, the blue
triangle has a ratio of 5:2 (= 2.5:1), while the red triangle has the ratio 8:3 (≈ 2.667:1), so the apparent combined
hypotenuse in each figure is actually bent. So with the bent hypotenuse, the first figure actually occupies a combined
32 units, while the second figure occupies 33, including the "missing" square.
The amount of bending is approximately 1/28th of a unit (1.245364267°), which is difficult to see on the diagram of
this puzzle. Note the grid point where the red and blue triangles in the lower image meet (5 squares to the right and
two units up from the lower left corner of the combined figure), and compare it to the same point on the other
figure; the edge is slightly under the mark in the upper image, but goes through it in the lower. Overlaying the
hypotenuses from both figures results in a very thin parallelogram with an area of exactly one grid square, the same
area "missing" from the second figure.

The missing square shown in the lower triangle, where both triangles are in a perfect grid
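The arithmetic behind the solution can be verified directly; a short sketch comparing the two slopes and the piece areas:

# Check the missing-square arithmetic: the two "hypotenuse" segments have
# different slopes, and the pieces sum to 32, not 32.5.
from math import atan, degrees

slope_blue = 2 / 5   # blue triangle: 2 units rise over 5 run
slope_red = 3 / 8    # red triangle: 3 units rise over 8 run
print(slope_blue == slope_red)                    # False: the edge is bent
print(degrees(atan(2/5)) - degrees(atan(3/8)))    # ~1.2454 degree kink

# Areas: the four pieces sum to 32, the enclosing 13x5 triangle to 32.5.
pieces = 5*2/2 + 8*3/2 + 7 + 8   # blue + red + the two L-shaped pieces
print(pieces, 13 * 5 / 2)        # 32.0 32.5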


Principle
According to Martin Gardner, this particular puzzle was invented by a New York City amateur magician, Paul
Curry, in 1953. However, the principle of a dissection paradox has been known since the start of the 16th century.
The integer dimensions of the parts of the puzzle (2, 3, 5, 8, 13) are successive Fibonacci numbers. Many other
geometric dissection puzzles are based on a few simple properties of the Fibonacci sequence.

Missing square puzzle dimensions

Similar puzzles
Sam Loyd's paradoxical dissection. In the "larger" rearrangement,
the gaps between the figures have a combined unit square more
area than their square gaps counterparts, creating an illusion that
the figures there take up more space than those in the square
figure. In the "smaller" rearrangement, each quadrilateral needs to
overlap the triangle by an area of half a unit for its top/bottom
edge to align with a grid line.

Sam Loyd's paradoxical dissection

Mitsunobu Matsuyama's "Paradox" uses four congruent quadrilaterals and a small square, which form a larger
square. When the quadrilaterals are rotated about their centers they fill the space of the small square, although the
total area of the figure seems unchanged. The apparent paradox is explained by the fact that the side of the new
large square is a little smaller than the original one. If a is the side of the large square and θ is the angle between
two opposing sides in each quadrilateral, then the quotient between the two areas is given by sec²θ. For θ = 5°,
this is approximately 1.00765, which corresponds to a difference of about 0.8%.

A variant of Mitsunobu Matsuyama's "Paradox"


References
External links
A printable Missing Square variant (http://www.archimedes-lab.org/workshop13skulls.html) with a video
demonstration.
Curry's Paradox: How Is It Possible? (http://www.cut-the-knot.org/Curriculum/Fallacies/CurryParadox.
shtml) at cut-the-knot
Triangles and Paradoxes (http://www.archimedes-lab.org/page3b.html) at archimedes-lab.org
The Triangle Problem or What's Wrong with the Obvious Truth (http://www.marktaw.com/blog/
TheTriangleProblem.html)
Jigsaw Paradox (http://www.mathematik.uni-bielefeld.de/~sillke/PUZZLES/jigsaw-paradox.html)
The Eleven Holes Puzzle (http://www.slideshare.net/sualeh/the-eleven-holes-puzzle)
Very nice animated Excel workbook of the Missing Square Puzzle (http://www.excelhero.com/blog/2010/09/
excel-optical-illusions-week-30.html)
A video explaining Curry's Paradox and Area (http://www.youtube.com/watch?v=eFw0878Ig-A&
feature=related) by James Tanton

Paradoxes of set theory


This article contains a discussion of paradoxes of set theory. As with most mathematical paradoxes, they generally
reveal surprising and counter-intuitive mathematical results, rather than actual logical contradictions within modern
axiomatic set theory.

Basics
Cardinal numbers
Set theory as conceived by Georg Cantor assumes the existence of infinite sets. As this assumption cannot be proved
from first principles it has been introduced into axiomatic set theory by the axiom of infinity, which asserts the
existence of the set N of natural numbers. Every infinite set which can be enumerated by natural numbers is the same
size (cardinality) as N, and is said to be countable. Examples of countably infinite sets are the natural numbers, the
even numbers, the prime numbers, and also all the rational numbers, i.e., the fractions. These sets have in common
the cardinal number |N| = ℵ0 (aleph-nought), a number greater than every natural number.
Cardinal numbers can be defined as follows. Define two sets to have the same size by: there exists a bijection
between the two sets (a one-to-one correspondence between the elements). Then a cardinal number is, by definition,
a class consisting of all sets of the same size. To have the same size is an equivalence relation, and the cardinal
numbers are the equivalence classes.

Ordinal numbers
Besides the cardinality, which describes the size of a set, ordered sets also form a subject of set theory. The axiom of
choice guarantees that every set can be well-ordered, which means that a total order can be imposed on its elements
such that every nonempty subset has a first element with respect to that order. The order of a well-ordered set is
described by an ordinal number. For instance, 3 is the ordinal number of the set {0, 1, 2} with the usual order 0 < 1 <
2; and ω is the ordinal number of the set of all natural numbers ordered the usual way. Neglecting the order, we are
left with the cardinal number |N| = |ω| = ℵ0.



Ordinal numbers can be defined with the same method used for cardinal numbers. Define two well-ordered sets to
have the same order type by: there exists a bijection between the two sets respecting the order: smaller elements are
mapped to smaller elements. Then an ordinal number is, by definition, a class consisting of all well-ordered sets of
the same order type. To have the same order type is an equivalence relation on the class of well-ordered sets, and the
ordinal numbers are the equivalence classes.
Two sets of the same order type have the same cardinality. The converse is not true in general for infinite sets: it is
possible to impose different well-orderings on the set of natural numbers that give rise to different ordinal numbers.
There is a natural ordering on the ordinals, which is itself a well-ordering. Given any ordinal α, one can consider the
set of all ordinals less than α. This set turns out to have ordinal number α. This observation is used for a different
way of introducing the ordinals, in which an ordinal is equated with the set of all smaller ordinals. This form of
ordinal number is thus a canonical representative of the earlier form of equivalence class.

Power sets
By forming all subsets of a set S (all possible choices of its elements), we obtain the power set P(S). Georg Cantor
proved that the power set is always larger than the set, i.e., |P(S)| > |S|. A special case of Cantor's theorem proves that
the set of all real numbers R cannot be enumerated by natural numbers. R is uncountable: |R| > |N|.
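The diagonal construction behind Cantor's theorem can be made concrete for finite sets. In the sketch below (illustrative names, not from the article), for any function f from S into its power set, the set D disagrees with f(x) about the element x itself, so f can never be onto P(S):

    def diagonal_witness(S, f):
        # x is in D exactly when x is not in f(x), so D differs from
        # every f(x) at the element x; hence f cannot be onto P(S).
        return {x for x in S if x not in f(x)}

    S = {0, 1, 2}
    f = {0: {0, 1}, 1: set(), 2: {0, 2}}
    D = diagonal_witness(S, f.get)
    print(D)                          # {1}
    print(any(f[x] == D for x in S))  # False: no x is mapped to D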

Paradoxes of the infinite set


Instead of relying on ambiguous descriptions such as "that which cannot be enlarged" or "increasing without bound",
set theory provides definitions for the term infinite set to give an unambiguous meaning to phrases such as "the set of
all natural numbers is infinite". Just as for finite sets, the theory makes further definitions which allow us to
consistently compare two infinite sets as regards whether one set is "larger than", "smaller than", or "the same size
as" the other. But not every intuition regarding the size of finite sets applies to the size of infinite sets, leading to
various apparently paradoxical results regarding enumeration, size, measure and order.

Paradoxes of enumeration
Before set theory was introduced, the notion of the size of a set had been problematic. It had been discussed by
Galileo Galilei and Bernard Bolzano, among others. Are there as many natural numbers as squares of natural
numbers when measured by the method of enumeration?
The answer is yes, because for every natural number n there is a square number n2, and likewise the other way
around.
The answer is no, because the squares are a proper subset of the naturals: every square is a natural number but
there are natural numbers, like 2, which are not squares of natural numbers.
By defining the notion of the size of a set in terms of its cardinality, the issue can be settled. Since there is a bijection
between the two sets involved, this follows in fact directly from the definition of the cardinality of a set.
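The pairing behind the first answer is completely explicit, as a small illustrative snippet shows for a finite prefix of the naturals:

    # n -> n*n pairs every natural number with exactly one square, so by
    # the cardinality definition the two sets are the same size.
    print([(n, n * n) for n in range(1, 8)])
    # [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]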
See Hilbert's paradox of the Grand Hotel for more on paradoxes of enumeration.

Je le vois, mais je ne crois pas


"I see it but I don't believe," Cantor wrote to Richard Dedekind after proving that the set of points of a square has the
same cardinality as that of the points on just an edge of the square: the cardinality of the continuum.
This demonstrates that the "size" of sets as defined by cardinality alone is not the only useful way of comparing sets.
Measure theory provides a more nuanced theory of size that conforms to our intuition that length and area are
incompatible measures of size.
The evidence strongly suggests that Cantor was quite confident in the result itself and that his comment to Dedekind
refers instead to his then-still-lingering concerns about the validity of his proof of it.[1] Nevertheless, Cantor's remark
would also serve nicely to express the surprise that so many mathematicians after him have experienced on first
encountering a result that's so counterintuitive.

Paradoxes of well-ordering
In 1904 Ernst Zermelo proved by means of the axiom of choice (which was introduced for this reason) that every set
can be well-ordered. In 1963 Paul J. Cohen showed that using the axiom of choice is essential to well-ordering the
real numbers; no weaker assumption suffices.
However, the ability to well order any set allows certain constructions to be performed that have been called
paradoxical. One example is the BanachTarski paradox, a theorem widely considered to be nonintuitive. It states
that it is possible to decompose a ball of a fixed radius into a finite number of pieces and then move and reassemble
those pieces by ordinary translations and rotations (with no scaling) to obtain two copies from the one original copy.
The construction of these pieces requires the axiom of choice; the pieces are not simple regions of the ball, but
complicated subsets.

Paradoxes of the Supertask


Main article: supertask
In set theory, an infinite set is not considered to be created by some mathematical process such as "adding one
element" that is then carried out "an infinite number of times". Instead, a particular infinite set (such as the set of all
natural numbers) is said to already exist, "by fiat", as an assumption or an axiom. Given this infinite set, other
infinite sets are then proven to exist as well, as a logical consequence. But it is still a natural philosophical question
to contemplate some physical action that actually completes after an infinite number of discrete steps; and the
interpretation of this question using set theory gives rise to the paradoxes of the supertask.

The diary of Tristram Shandy


Tristram Shandy, the hero of a novel by Laurence Sterne, writes his autobiography so conscientiously that it takes
him one year to lay down the events of one day. If he is mortal he can never terminate; but if he lived forever then no
part of his diary would remain unwritten, for to each day of his life a year devoted to that day's description would
correspond.

The Ross-Littlewood paradox


Main article: Ross–Littlewood paradox
A sharpened version of this type of paradox shifts the infinitely remote finish to a finite time. Fill a huge reservoir
with balls enumerated by numbers 1 to 10 and take off ball number 1. Then add the balls enumerated by numbers 11
to 20 and take off number 2. Continue to add balls enumerated by numbers 10n − 9 to 10n and to remove ball number
n for all natural numbers n = 3, 4, 5, .... Let the first transaction last half an hour, let the second transaction last a
quarter of an hour, and so on, so that all transactions are finished after one hour. Obviously the set of balls in the
reservoir increases without bound. Nevertheless, after one hour the reservoir is empty because for every ball the time
of removal is known.
The paradox is further increased by the significance of the removal sequence. If the balls are not removed in the
sequence 1, 2, 3, ... but in the sequence 1, 11, 21, ... after one hour infinitely many balls populate the reservoir,
although the same amount of material as before has been moved.
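The bookkeeping can be checked with a short sketch (illustrative code, finite prefix only): after n transactions the reservoir holds 9n balls, yet any particular ball k has already been removed at transaction k:

    def reservoir_after(n):
        # Transaction t adds balls 10t-9 .. 10t and removes ball t.
        present = set()
        for t in range(1, n + 1):
            present.update(range(10 * t - 9, 10 * t + 1))
            present.discard(t)
        return present

    print(len(reservoir_after(100)))  # 900: the count grows without bound
    print(5 in reservoir_after(100))  # False: ball 5 left at transaction 5
    # Ball k is removed at transaction k, so no ball survives every step.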


Paradoxes of proof and definability


For all its usefulness in resolving questions regarding infinite sets, naive set theory has some fatal flaws. In
particular, it is prey to logical paradoxes such as those exposed by Russell's paradox. The discovery of these
paradoxes revealed that not all sets which can be described in the language of naive set theory can actually be said to
exist without creating a contradiction. The 20th century saw a resolution to these paradoxes in the development of
the various axiomatizations of set theories such as ZFC and NBG in common use today. However, the gap between
the very formalized and symbolic language of these theories and our typical informal use of mathematical language
results in various paradoxical situations, as well as the philosophical question of exactly what it is that such formal
systems actually propose to be talking about.

Early paradoxes: the set of all sets


Main article: Russell's paradox
In 1897 the Italian mathematician Cesare Burali-Forti discovered that there is no set containing all ordinal numbers.
As every ordinal number is defined by a set of smaller ordinal numbers, the well-ordered set Ω of all ordinal
numbers (if it exists) fits the definition and is itself an ordinal. On the other hand, no ordinal number can contain
itself, so Ω cannot be an ordinal. Therefore, the set of all ordinal numbers cannot exist.
By the end of the 19th century Cantor was aware of the non-existence of the set of all cardinal numbers and the set of
all ordinal numbers. In letters to David Hilbert and Richard Dedekind he wrote about inconsistent sets, the elements
of which cannot be thought of as being all together, and he used this result to prove that every consistent set has a
cardinal number.
After all this, the version of the "set of all sets" paradox conceived by Bertrand Russell in 1903 led to a serious crisis
in set theory. Russell recognized that the statement x = x is true for every set, and thus the set of all sets is defined by
{x | x = x}. In 1906 he constructed several paradox sets, the most famous of which is the set of all sets which do not
contain themselves. Russell himself explained this abstract idea by means of some very concrete pictures. One
example, known as the Barber paradox, states: The male barber who shaves all and only men who don't shave
themselves has to shave himself only if he does not shave himself.
There are close similarities between Russell's paradox in set theory and the Grelling–Nelson paradox, which
demonstrates a paradox in natural language.

Paradoxes by change of language


König's paradox
In 1905, the Hungarian mathematician Julius König published a paradox based on the fact that there are only
countably many finite definitions. If we imagine the real numbers as a well-ordered set, those real numbers which
can be finitely defined form a subset. Hence in this well-order there should be a first real number that is not finitely
definable. This is paradoxical, because this real number has just been finitely defined by the last sentence. This leads
to a contradiction in naive set theory.
This paradox is avoided in axiomatic set theory. Although it is possible to represent a proposition about a set as a set,
by a system of codes known as Gödel numbers, there is no formula φ(a, x) in the language of set theory which
holds exactly when a is a code for a finite description of a set and this description is a true description of the set x.
This result is known as Tarski's indefinability theorem; it applies to a wide class of formal systems including all
commonly studied axiomatizations of set theory.



Richard's paradox
Main article: Richard's paradox
In the same year the French mathematician Jules Richard used a variant of Cantor's diagonal method to obtain
another contradiction in naive set theory. Consider the set A of all finite agglomerations of words. The set E of all
finite definitions of real numbers is a subset of A. As A is countable, so is E. Let p be the nth decimal of the nth real
number defined by the set E; we form a number N having zero for the integral part and p + 1 for the nth decimal if p
is not equal either to 8 or 9, and unity if p is equal to 8 or 9. This number N is not defined by the set E because it
differs from any finitely defined real number, namely from the nth number by the nth digit. But N has been defined
by a finite number of words in this paragraph. It should therefore be in the set E. That is a contradiction.
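The diagonal construction can be illustrated with a toy version that diagonalizes over a finite, hypothetical list of decimal digit strings (actual definability, of course, cannot be captured by any such program):

    def richard_diagonal(decimals):
        # decimals[n] holds the decimal digits of the nth listed number.
        # Take its nth digit p and use p + 1, or 1 if p is 8 or 9, so the
        # result differs from the nth number in the nth decimal place.
        digits = []
        for n, d in enumerate(decimals):
            p = int(d[n])
            digits.append("1" if p in (8, 9) else str(p + 1))
        return "0." + "".join(digits)

    print(richard_diagonal(["123", "456", "789"]))  # 0.261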
As with König's paradox, this paradox cannot be formalized in axiomatic set theory because it requires the ability to
tell whether a description applies to a particular set (or, equivalently, to tell whether a formula is actually the
definition of a single set).

Paradox of Löwenheim and Skolem


Main article: Skolem's paradox
Based upon work of the German mathematician Leopold Löwenheim (1915), the Norwegian logician Thoralf Skolem
showed in 1922 that every consistent theory of first-order predicate calculus, such as set theory, has an at most
countable model. However, Cantor's theorem proves that there are uncountable sets. The root of this seeming
paradox is that the countability or noncountability of a set is not always absolute, but can depend on the model in
which the cardinality is measured. It is possible for a set to be uncountable in one model of set theory but countable
in a larger model (because the bijections that establish countability are in the larger model but not the smaller one).

Notes
[1] F. Q. Gouvêa, "Was Cantor Surprised?" (http://www.maa.org/pubs/AMM-March11_Cantor.pdf), American Mathematical Monthly, 118,
March 2011, 198–209.

References
G. Cantor: Gesammelte Abhandlungen mathematischen und philosophischen Inhalts, E. Zermelo (Ed.), Olms,
Hildesheim 1966.
H. Meschkowski, W. Nilson: Georg Cantor - Briefe, Springer, Berlin 1991.
A. Fraenkel: Einleitung in die Mengenlehre, Springer, Berlin 1923.
A. A. Fraenkel, A. Levy: Abstract Set Theory, North Holland, Amsterdam 1976.
F. Hausdorff: Grundzüge der Mengenlehre, Chelsea, New York 1965.
B. Russell: The principles of mathematics I, Cambridge 1903.
B. Russell: On some difficulties in the theory of transfinite numbers and order types, Proc. London Math. Soc. (2)
4 (1907) 29-53.
P. J. Cohen: Set Theory and the Continuum Hypothesis, Benjamin, New York 1966.
S. Wagon: The Banach-Tarski-Paradox, Cambridge University Press, Cambridge 1985.
A. N. Whitehead, B. Russell: Principia Mathematica I, Cambridge Univ. Press, Cambridge 1910, p. 64.
E. Zermelo: Neuer Beweis für die Möglichkeit einer Wohlordnung, Math. Ann. 65 (1908) p. 107-128.


External links
(http://www.hti.umich.edu/cgi/t/text/pageviewer-idx?c=umhistmath;cc=umhistmath;rgn=full
text;idno=AAT3201.0001.001;didno=AAT3201.0001.001;view=pdf;seq=00000086)PDF
(http://dz-srv1.sub.uni-goettingen.de/sub/digbib/loader?ht=VIEW&did=D38183&p=125)
Definability paradoxes (http://www.dpmms.cam.ac.uk/~wtg10/richardsparadox.html) by Timothy Gowers

Parrondo's paradox
Parrondo's paradox, a paradox in game theory, has been described as: A combination of losing strategies becomes
a winning strategy. It is named after its creator, Juan Parrondo, who discovered the paradox in 1996. A more
explanatory description is:
There exist pairs of games, each with a higher probability of losing than winning, for which it is possible to
construct a winning strategy by playing the games alternately.
Parrondo devised the paradox in connection with his analysis of the Brownian ratchet, a thought experiment about a
machine that can purportedly extract energy from random heat motions popularized by physicist Richard Feynman.
However, the paradox disappears when rigorously analyzed.

Illustrative examples
The saw-tooth example
Consider an example in which there are two points A and B having the same altitude, as shown in Figure 1. In the
first case, we have a flat profile connecting them. Here, if we leave some round marbles in the middle that move back
and forth in a random fashion, they will roll around randomly but towards both ends with an equal probability. Now
consider the second case where we have a saw-tooth-like region between them. Here also, the marbles will roll
towards either end with equal probability (if there were a tendency to move in one direction, marbles in a ring of this
shape would tend to spontaneously extract thermal energy to revolve, violating the second law of thermodynamics).
Now if we tilt the whole profile towards the right, as shown in Figure 2, it is quite clear that both these cases will
become biased towards B.
Now consider the game in which we alternate the two profiles while judiciously choosing the time between
alternating from one profile to the other.
When we leave a few marbles on the first profile at point E, they distribute themselves on the plane showing
preferential movements towards point B. However, if we apply the second profile when some of the marbles have
crossed the point C, but none have crossed point D, we will end up having most marbles back at point E (where we
started from initially) but some also in the valley towards point A given sufficient time for the marbles to roll to the
valley. Then we again apply the first profile and repeat the steps (points C, D and E now shifted one step to refer to
the final valley closest to A). If no marbles cross point C before the first marble crosses point D, we must apply the
second profile shortly before the first marble crosses point D, to start over.


It easily follows that eventually we will have marbles at point A, but none at point B. Hence, if we define having
marbles at point A as a win and having marbles at point B as a loss, we clearly win by playing two losing games.

The coin-tossing example


A second example of Parrondo's paradox is drawn from the field of gambling. Consider playing two games, Game A
and Game B, with the following rules. For convenience, define C(t) to be our capital at time t, immediately before we
play a game.
1. Winning a game earns us $1 and losing requires us to surrender $1. It follows that C(t+1) = C(t) + 1 if we win at
step t and C(t+1) = C(t) − 1 if we lose at step t.
2. In Game A, we toss a biased coin, Coin 1, with probability of winning P1 = 1/2 − ε. If ε > 0, this is
clearly a losing game in the long run.
3. In Game B, we first determine if our capital is a multiple of some integer M. If it is, we toss a biased coin, Coin 2,
with probability of winning P2 = 1/10 − ε. If it is not, we toss another biased coin, Coin 3, with probability of
winning P3 = 3/4 − ε. The role of modulo M provides the periodicity as in the ratchet teeth.
It is clear that by playing Game A, we will almost surely lose in the long run. Harmer and Abbott[1] show via
simulation that if M = 3 and ε = 0.005, Game B is an almost surely losing game as well. In fact, Game B is a
Markov chain, and an analysis of its state transition matrix (again with M = 3) shows that the steady state probability
of using coin 2 is 0.3836, and that of using coin 3 is 0.6164.[2] As coin 2 is selected nearly 40% of the time, it has a
disproportionate influence on the payoff from Game B, and results in it being a losing game.
However, when these two losing games are played in some alternating sequence - e.g. two games of A followed by
two games of B (AABBAABB...), the combination of the two games is, paradoxically, a winning game. Not all
alternating sequences of A and B result in winning games. For example, one game of A followed by one game of B
(ABABAB...) is a losing game, while one game of A followed by two games of B (ABBABB...) is a winning game.
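The effect is easy to reproduce with a short Monte Carlo sketch of the rules above (parameter values as in Harmer and Abbott; the function and variable names are ours):

    import random

    EPS, M = 0.005, 3

    def play_a(capital):
        # Coin 1: win with probability 1/2 - eps
        return capital + 1 if random.random() < 0.5 - EPS else capital - 1

    def play_b(capital):
        # Coin 2 if the capital is a multiple of M, otherwise Coin 3
        p = (0.10 - EPS) if capital % M == 0 else (0.75 - EPS)
        return capital + 1 if random.random() < p else capital - 1

    def mean_final_capital(pattern, rounds=100000, trials=20):
        total = 0
        for _ in range(trials):
            capital = 0
            for t in range(rounds):
                play = play_a if pattern[t % len(pattern)] == "A" else play_b
                capital = play(capital)
            total += capital
        return total / trials

    for pattern in ("A", "B", "AABB", "AB"):
        print(pattern, mean_final_capital(pattern))
    # Typically A, B and AB drift negative while AABB ends positive.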
This coin-tossing example has become the canonical illustration of Parrondo's paradox: two games, both losing
when played individually, become a winning game when played in a particular alternating sequence. The apparent
paradox has been explained using a number of sophisticated approaches, including Markov chains,[3] flashing
ratchets,[4] simulated annealing[5] and information theory.[6] One way to explain the apparent paradox is as follows:
While Game B is a losing game under the probability distribution that results for C(t) modulo M when it is
played individually (C(t) modulo M is the remainder when C(t) is divided by M), it can be a winning game
under other distributions, as there is at least one state in which its expectation is positive.
As the distribution of outcomes of Game B depends on the player's capital, the two games cannot be independent.
If they were, playing them in any sequence would lose as well.
The role of C(t) modulo M now comes into sharp focus. It serves solely to induce a dependence between Games A
and B, so that a player is more likely to enter states in which Game B has a positive expectation, allowing it to
overcome the losses from Game A. With this understanding, the paradox resolves itself: the individual games are
losing only under a distribution that differs from that which is actually encountered when playing the compound
game. In summary, Parrondo's paradox is an example of how dependence can wreak havoc with probabilistic
computations made under a naive assumption of independence. A more detailed exposition of this point, along with
several related examples, can be found in Philips and Feldman.[7]


A simplified example
For a simpler example of how and why the paradox works, again consider two games, Game A and Game B, this
time with the following rules:
1. In Game A, you simply lose $1 every time you play.
2. In Game B, you count how much money you have left. If it is an even number, you win $3. Otherwise you lose
$5.
Say you begin with $100 in your pocket. If you start playing Game A exclusively, you will obviously lose all your
money in 100 rounds. Similarly, if you decide to play Game B exclusively, you will also lose all your money in 100
rounds.
However, consider playing the games alternately, starting with Game B, followed by A, then by B, and so on
(BABABA...). It should be easy to see that you will steadily earn a total of $2 for every two games.
Thus, even though each game is a losing proposition if played alone, because the results of Game B are affected by
Game A, the sequence in which the games are played can affect how often Game B earns you money, and
subsequently the result is different from the case where either game is played by itself.
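A five-line check of the $2-per-pair claim above (illustrative code):

    capital = 100
    for _ in range(100):
        capital += 3 if capital % 2 == 0 else -5  # Game B: even wins $3
        capital -= 1                              # Game A: always lose $1
    print(capital)  # 300: the capital stays even, each BA pair nets +$2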

Application
Parrondo's paradox is used extensively in game theory, and its applications in engineering, population dynamics,[8]
financial risk, etc. are also being looked into, as demonstrated by the reading lists below. Parrondo's games are of
little practical use, for example for investing in stock markets,[9] as the original games require the payoff from at least one
of the interacting games to depend on the player's capital.
form and work continues in generalizing the phenomenon. Similarities to volatility pumping and the two-envelope
problem[10] have been pointed out. Simple finance textbook models of security returns have been used to prove that
individual investments with negative median long-term returns may be easily combined into diversified portfolios
with positive median long-term returns.[11] Similarly, a model that is often used to illustrate optimal betting rules has
been used to prove that splitting bets between multiple games can turn a negative median long-term return into a
positive one.[12]

Name
In the early literature on Parrondo's paradox, it was debated whether the word 'paradox' is an appropriate description
given that the Parrondo effect can be understood in mathematical terms. The 'paradoxical' effect can be
mathematically explained in terms of a convex linear combination.
However, Derek Abbott, a leading Parrondo's paradox researcher, provides the following answer regarding the use of
the word 'paradox' in this context:
Is Parrondo's paradox really a "paradox"? This question is sometimes asked by mathematicians, whereas
physicists usually don't worry about such things. The first thing to point out is that "Parrondo's paradox"
is just a name, just like the "Braess paradox" or "Simpson's paradox." Secondly, as is the case with most
of these named paradoxes they are all really apparent paradoxes. People drop the word "apparent" in
these cases as it is a mouthful, and it is obvious anyway. So no one claims these are paradoxes in the
strict sense. In the wide sense, a paradox is simply something that is counterintuitive. Parrondo's games
certainly are counterintuitive, at least until you have intensively studied them for a few months. The
truth is we still keep finding new surprising things to delight us, as we research these games. I have had
one mathematician complain that the games always were obvious to him and hence we should not use
the word "paradox." He is either a genius or never really understood it in the first place. In either case, it
is not worth arguing with people like that.

Parrondo's paradox does not seem that paradoxical if one notes that it is actually a combination of three simple
games: two of which have losing probabilities and one of which has a high probability of winning. To suggest that
one can create a winning strategy with three such games is neither counterintuitive nor paradoxical.

Further reading
John Allen Paulos, A Mathematician Plays the Stock Market [13], Basic Books, 2004, ISBN 0-465-05481-1.
Neil F. Johnson, Paul Jefferies, Pak Ming Hui, Financial Market Complexity [14], Oxford University Press, 2003,
ISBN 0-19-852665-2.
Ning Zhong and Jiming Liu, Intelligent Agent Technology: Research and Development, [15] World Scientific,
2001, ISBN 981-02-4706-0.
Elka Korutcheva and Rodolfo Cuerno, Advances in Condensed Matter and Statistical Physics [16], Nova
Publishers, 2004, ISBN 1-59033-899-5.
Maria Carla Galavotti, Roberto Scazzieri, and Patrick Suppes, Reasoning, Rationality, and Probability [17],
Center for the Study of Language and Information, 2008, ISBN 1-57586-557-2.
Derek Abbott and Laszlo B. Kish, Unsolved Problems of Noise and Fluctuations [18], American Institute of
Physics, 2000, ISBN 1-56396-826-6.
Visarath In, Patrick Longhini, and Antonio Palacios, Applications of Nonlinear Dynamics: Model and Design of
Complex Systems [19], Springer, 2009, ISBN 3-540-85631-5.
Marc Moore, Constance van Eeden, Sorana Froda, and Christian Léger, Mathematical Statistics and Applications:
Festschrift for Constance van Eeden [20], IMS, 2003, ISBN 0-940600-57-9.
Ehrhard Behrends, Fünf Minuten Mathematik: 100 Beiträge der Mathematik-Kolumne der Zeitung Die Welt [21],
Vieweg+Teubner Verlag, 2006, ISBN 3-8348-0082-1.
Lutz Schimansky-Geier, Noise in Complex Systems and Stochastic Dynamics [22], SPIE, 2003, ISBN
0-8194-4974-1.
Susan Shannon, Artificial Intelligence and Computer Science [23], Nova Science Publishers, 2005, ISBN
1-59454-411-5.
Eric W. Weisstein, CRC Concise Encyclopedia of Mathematics [24], CRC Press, 2003, ISBN 1-58488-347-2.
David Reguera, José M. G. Vilar, and José-Miguel Rubí, Statistical Mechanics of Biocomplexity [25], Springer,
1999, ISBN 3-540-66245-6.
Sergey M. Bezrukov, Unsolved Problems of Noise and Fluctuations [26], Springer, 2003, ISBN 0-7354-0127-6.
Julian Chela-Flores, Tobias C. Owen, and F. Raulin, First Steps in the Origin of Life in the Universe [27],
Springer, 2001, ISBN 1-4020-0077-4.
Tõnu Puu and Irina Sushko, Business Cycle Dynamics: Models and Tools [28], Springer, 2006, ISBN
3-540-32167-5.
Andrzej S. Nowak and Krzysztof Szajowski, Advances in Dynamic Games: Applications to Economics, Finance,
Optimization, and Stochastic Control [29], Birkhäuser, 2005, ISBN 0-8176-4362-1.
Cristel Chandre, Xavier Leoncini, and George M. Zaslavsky, Chaos, Complexity and Transport: Theory and
Applications [30], World Scientific, 2008, ISBN 981-281-879-0.
Richard A. Epstein, The Theory of Gambling and Statistical Logic (Second edition), Academic Press, 2009, ISBN
0-12-374940-9.
Clifford A. Pickover, The Math Book, [31] Sterling, 2009, ISBN 1-4027-5796-4.


References
[1] G. P. Harmer and D. Abbott, "Losing strategies can win by Parrondo's paradox", Nature 402 (1999), 864
[2] D. Minor, "Parrondo's Paradox - Hope for Losers!", The College Mathematics Journal 34(1) (2003) 15-20
[3] G. P. Harmer and D. Abbott, "Parrondo's paradox", Statistical Science 14 (1999) 206-213
[4] G. P. Harmer, D. Abbott, P. G. Taylor, and J. M. R. Parrondo, in Proc. 2nd Int. Conf. Unsolved Problems of Noise and Fluctuations, D.
Abbott, and L. B. Kish, eds., American Institute of Physics, 2000
[5] G. P. Harmer, D. Abbott, and P. G. Taylor, The Paradox of Parrondo's games, Proc. Royal Society of London A 456 (2000), 1-13
[6] G. P. Harmer, D. Abbott, P. G. Taylor, C. E. M. Pearce and J. M. R. Parrondo, Information entropy and Parrondo's discrete-time ratchet, in
Proc. Stochastic and Chaotic Dynamics in the Lakes, Ambleside, U.K., P. V. E. McClintock, ed., American Institute of Physics, 2000
[7] Thomas K. Philips and Andrew B. Feldman, Parrondo's Paradox is not Paradoxical (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=581521), Social Science Research Network (SSRN) Working Papers, August 2004
[8] V. A. A. Jansen and J. Yoshimura, "Populations can persist in an environment consisting of sink habitats only", Proceedings of the National
Academy of Sciences USA, 95 (1998), 3696-3698.
[9] R. Iyengar and R. Kohli, "Why Parrondo's paradox is irrelevant for utility theory, stock buying, and the emergence of life," Complexity, 9(1),
pp. 23-27, 2004
[10] Winning While Losing: New Strategy Solves 'Two-Envelope' Paradox (http://www.physorg.com/pdf169811689.pdf) at Physorg.com
[11] M. Stutzer, The Paradox of Diversification, The Journal of Investing, Vol. 19, No. 1, 2010.
[12] M. Stutzer, "A Simple Parrondo Paradox", Mathematical Scientist, V. 35, 2010.
[13] http://books.google.com.au/books?id=FUGI7KDTkTUC
[14] http://books.google.com.au/books?id=8jfV6nntNPkC&pg=PA74&dq=parrondo*
[15] http://books.google.com.au/books?id=eZ6YCz5NamsC&pg=PA150
[16] http://books.google.com.au/books?id=lIoZeb_domwC&pg=PA103
[17] http://books.google.com.au/books?id=ZuMQAQAAIAAJ&q=parrondo*
[18] http://books.google.com.au/books?id=ePoaAQAAIAAJ
[19] http://books.google.com.au/books?id=FidKZcUqdIQC&pg=PA307
[20] http://books.google.com.au/books?id=SJsDHpgsVgsC&pg=PA185
[21] http://books.google.com.au/books?id=liNP2CpsU8EC&pg=PA10
[22] http://books.google.com.au/books?id=WgJTAAAAMAAJ&q=parrondo*
[23] http://books.google.com.au/books?id=PGtGAAAAYAAJ&q=parrondo*
[24] http://books.google.com.au/books?id=UDk8QARabpwC&pg=PA2152&dq=parrondo*
[25] http://books.google.com.au/books?id=0oMp60wubKIC&pg=PA95
[26] http://books.google.com.au/books?id=soGS-YcwvxsC&pg=PA82
[27] http://books.google.com.au/books?id=q8JwN_1p78UC&pg=PA17&dq=parrondo*
[28] http://books.google.com.au/books?id=cTfwjzihuiIC&pg=PA148&dq=parrondo*
[29] http://books.google.com.au/books?id=l5W20mVBeT4C&pg=PA650&dq=parrondo*
[30] http://books.google.com.au/books?id=md092lhGSOQC&pg=PA107&dq=parrondo*
[31] http://sprott.physics.wisc.edu/pickover/math-book.html

External links
J. M. R. Parrondo, Parrondo's paradoxical games (http://seneca.fis.ucm.es/parr/GAMES/index.htm)
Google Scholar profiling of Parrondo's paradox (http://scholar.google.com.au/citations?hl=en&
user=aeNdbrUAAAAJ)
Nature news article on Parrondo's paradox (http://www.nature.com/news/1999/991223/full/news991223-13.
html)
Alternate game play ratchets up winnings: It's the law (http://www.eleceng.adelaide.edu.au/Groups/
parrondo/articles/sandiego.html)
Official Parrondo's paradox page (http://www.eleceng.adelaide.edu.au/Groups/parrondo)
Parrondo's Paradox - A Simulation (http://www.cut-the-knot.org/ctk/Parrondo.shtml)
The Wizard of Odds on Parrondo's Paradox (http://wizardofodds.com/askthewizard/149)
Parrondo's Paradox at Wolfram (http://mathworld.wolfram.com/ParrondosParadox.html)
Online Parrondo simulator (http://hampshire.edu/lspector/parrondo/parrondo.html)
Parrondo's paradox at Maplesoft (http://www.maplesoft.com/applications/view.aspx?SID=1761)
Donald Catlin on Parrondo's paradox (http://catlin.casinocitytimes.com/article/parrondos-paradox-46851)


Parrondo's paradox and poker (http://emergentfool.com/2008/02/16/parrondos-paradox-and-poker/)


Parrondo's paradox and epistemology (http://www.fil.lu.se/HommageaWlodek/site/papper/
StjernbergFredrik.pdf)
A Parrondo's paradox resource (http://pagesperso-orange.fr/jean-paul.davalan/proba/parr/index-en.html)
Optimal adaptive strategies and Parrondo (http://www.molgen.mpg.de/~rahmann/parrondo/parrondo.shtml)
Behrends on Parrondo (http://www.math.uni-potsdam.de/~roelly/WorkshopCDFAPotsdam09/Behrends.pdf)
God doesn't shoot craps (http://www.goddoesntshootcraps.com/paradox.html)
Parrondo's paradox in chemistry (http://www.fasebj.org/cgi/content/meeting_abstract/23/
1_MeetingAbstracts/514.1)
Parrondo's paradox in genetics (http://www.genetics.org/cgi/content/full/176/3/1923)
Parrondo effect in quantum mechanics (http://www.ingentaconnect.com/content/els/03784371/2003/
00000324/00000001/art01909)
Financial diversification and Parrondo (http://leeds.colorado.edu/uploadedFiles/Centers_of_Excellence/
Burridge_Center/Working_Papers/ParadoxOfDiversification.pdf)

Russell's paradox

In the foundations of mathematics, Russell's paradox (also known as Russell's antinomy), discovered by Bertrand
Russell in 1901, showed that the naive set theory created by Georg Cantor leads to a contradiction. The same
paradox had been discovered a year before by Ernst Zermelo but he did not publish the idea, which remained known
only to Hilbert, Husserl and other members of the University of Göttingen.
According to naive set theory, any definable collection is a set. Let R be the set of all sets that are not members of
themselves. If R is not a member of itself, then its definition dictates that it must contain itself, and if it contains
itself, then it contradicts its own definition as the set of all sets that are not members of themselves. This
contradiction is Russell's paradox. Symbolically: let R = { x | x ∉ x }; then R ∈ R ⟺ R ∉ R.
In 1908, two ways of avoiding the paradox were proposed, Russell's type theory and the Zermelo set theory, the first
constructed axiomatic set theory. Zermelo's axioms went well beyond Frege's axioms of extensionality and unlimited
set abstraction, and evolved into the now-canonical Zermelo–Fraenkel set theory (ZF).[2]

Informal presentation
Let us call a set "abnormal" if it is a member of itself, and "normal" otherwise. For example, take the set of all
squares in the plane. That set is not itself a square, and therefore is not a member of the set of all squares. So it is
"normal". On the other hand, if we take the complementary set that contains all non-squares, that set is itself not a
square and so should be one of its own members. It is "abnormal".
Now we consider the set of all normal sets, R. Determining whether R is normal or abnormal is impossible: if R were
a normal set, it would be contained in the set of normal sets (itself), and therefore be abnormal; and if R were
abnormal, it would not be contained in the set of all normal sets (itself), and therefore be normal. This leads to the
conclusion that R is neither normal nor abnormal: Russell's paradox.

Formal presentation
Define Naive Set Theory (NST) as the theory of predicate logic with a binary predicate ∈ and the following axiom
schema of unrestricted comprehension:

    ∃y ∀x (x ∈ y ⟺ P(x))

for any formula P with only the variable x free. Substitute x ∉ x for P(x). Then by existential instantiation
(reusing the symbol y) and universal instantiation we have

    y ∈ y ⟺ y ∉ y

a contradiction. Therefore NST is inconsistent.

Set-theoretic responses
In 1908, Ernst Zermelo proposed an axiomatization of set theory that avoided the paradoxes of naive set theory by
replacing arbitrary set comprehension with weaker existence axioms, such as his axiom of separation
(Aussonderung). Modifications to this axiomatic theory proposed in the 1920s by Abraham Fraenkel, Thoralf
Skolem, and by Zermelo himself resulted in the axiomatic set theory called ZFC. This theory became widely
accepted once Zermelo's axiom of choice ceased to be controversial, and ZFC has remained the canonical axiomatic
set theory down to the present day.
ZFC does not assume that, for every property, there is a set of all things satisfying that property. Rather, it asserts
that given any set X, any subset of X definable using first-order logic exists. The object R discussed above cannot be
constructed in this fashion, and is therefore not a ZFC set. In some extensions of ZFC, objects like R are called
proper classes. ZFC is silent about types, although some argue that Zermelo's axioms tacitly presuppose a
background type theory.
In ZFC, given a set A, it is possible to define a set B that consists of exactly the sets in A that are not members of
themselves. B cannot be in A by the same reasoning in Russell's Paradox. This variation of Russell's paradox shows
that no set contains everything.
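The "same reasoning" can be spelled out in one line: with B = { x ∈ A : x ∉ x }, assuming B ∈ A gives

    B ∈ B  ⟺  (B ∈ A and B ∉ B)  ⟺  B ∉ B,

a contradiction; hence B ∉ A, and since A was arbitrary, no set can contain every set.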
Through the work of Zermelo and others, especially John von Neumann, the structure of what some see as the
"natural" objects described by ZFC eventually became clear; they are the elements of the von Neumann universe, V,
built up from the empty set by transfinitely iterating the power set operation. It is thus now possible again to reason
about sets in a non-axiomatic fashion without running afoul of Russell's paradox, namely by reasoning about the
elements of V. Whether it is appropriate to think of sets in this way is a point of contention among the rival points of
view on the philosophy of mathematics.
Other resolutions to Russell's paradox, more in the spirit of type theory, include the axiomatic set theories New
Foundations and Scott-Potter set theory.

History
Russell discovered the paradox in May or June 1901. By his own account in his 1919 Introduction to Mathematical
Philosophy, he "attempted to discover some flaw in Cantor's proof that there is no greatest cardinal".[3] In a 1902
letter,[4] he announced the discovery to Gottlob Frege of the paradox in Frege's 1879 Begriffsschrift and framed the
problem in terms of both logic and set theory, and in particular in terms of Frege's definition of function; in the
following, p.17 refers to a page in the original Begriffsschrift, and page 23 refers to the same page in van Heijenoort
1967:
There is just one point where I have encountered a difficulty. You state (p. 17 [p. 23 above]) that a
function, too, can act as the indeterminate element. This I formerly believed, but now this view seems
doubtful to me because of the following contradiction. Let w be the predicate: to be a predicate that
cannot be predicated of itself. Can w be predicated of itself? From each answer its opposite follows.
Therefore we must conclude that w is not a predicate. Likewise there is no class (as a totality) of those
classes which, each taken as a totality, do not belong to themselves. From this I conclude that under
certain circumstances a definable collection [Menge] does not form a totality.[5]
Russell would go on to cover it at length in his 1903 The Principles of Mathematics, where he repeated his first
encounter with the paradox:[6]
Before taking leave of fundamental questions, it is necessary to examine more in detail the singular
contradiction, already mentioned, with regard to predicates not predicable of themselves. ... I may
mention that I was led to it in the endeavour to reconcile Cantor's proof...."
Russell wrote to Frege about the paradox just as Frege was preparing the second volume of his Grundgesetze der
Arithmetik.[7] Frege responded to Russell very quickly; his letter dated 22 June 1902 appeared, with van Heijenoort's
commentary, in van Heijenoort 1967:126–127. Frege then wrote an appendix admitting to the paradox,[8] and proposed a
solution that Russell would endorse in his Principles of Mathematics,[9] but was later considered by some to be
unsatisfactory.[10] For his part, Russell had his work at the printers and he added an appendix on the doctrine of
types.[11]
Ernst Zermelo in his (1908) A new proof of the possibility of a well-ordering (published at the same time he
published "the first axiomatic set theory")[12] laid claim to prior discovery of the antinomy in Cantor's naive set
theory. He states: "And yet, even the elementary form that Russell9 gave to the set-theoretic antinomies could have
persuaded them [J. König, Jourdain, F. Bernstein] that the solution of these difficulties is not to be sought in the
surrender of well-ordering but only in a suitable restriction of the notion of set".[13] Footnote 9 is where he stakes his
claim:
9 1903, pp. 366–368. I had, however, discovered this antinomy myself, independently of Russell, and
had communicated it prior to 1903 to Professor Hilbert among others.[14]
A written account of Zermelo's actual argument was discovered in the Nachlass of Edmund Husserl.[15]


It is also known that unpublished discussions of set theoretical paradoxes took place in the mathematical community
at the turn of the century. van Heijenoort in his commentary before Russell's 1902 Letter to Frege states that
Zermelo "had discovered the paradox independently of Russell and communicated it to Hilbert, among others, prior
to its publication by Russell".[16]
In 1923, Ludwig Wittgenstein proposed to "dispose" of Russell's paradox as follows:
The reason why a function cannot be its own argument is that the sign for a function already contains the
prototype of its argument, and it cannot contain itself. For let us suppose that the function F(fx) could be
its own argument: in that case there would be a proposition 'F(F(fx))', in which the outer function F and
the inner function F must have different meanings, since the inner one has the form φ(fx) and the outer
one has the form ψ(φ(fx)). Only the letter 'F' is common to the two functions, but the letter by itself
signifies nothing. This immediately becomes clear if instead of 'F(Fu)' we write '(∃φ):F(φu).φu = Fu'.
That disposes of Russell's paradox. (Tractatus Logico-Philosophicus, 3.333)
Russell and Alfred North Whitehead wrote their three-volume Principia Mathematica hoping to achieve what Frege
had been unable to do. They sought to banish the paradoxes of naive set theory by employing a theory of types they
devised for this purpose. While they succeeded in grounding arithmetic in a fashion, it is not at all evident that they
did so by purely logical means. While Principia Mathematica avoided the known paradoxes and allows the
derivation of a great deal of mathematics, its system gave rise to new problems.
In any event, Kurt Gödel in 1930–31 proved that while the logic of much of Principia Mathematica, now known as
first-order logic, is complete, Peano arithmetic is necessarily incomplete if it is consistent. This is very widely
though not universally regarded as having shown the logicist program of Frege to be impossible to complete.
In 2001, a Centenary International Conference celebrating the first hundred years of Russell's paradox was held in
Munich; its proceedings have been published.

Applied versions
There are some versions of this paradox that are closer to real-life situations and may be easier to understand for
non-logicians. For example, the barber paradox supposes a barber who shaves all men who do not shave themselves
and only men who do not shave themselves. When one thinks about whether the barber should shave himself or not,
the paradox begins to emerge.
As another example, consider five lists of encyclopedia entries within the same encyclopedia:
List of articles about people:
  Ptolemy VII of Egypt
  Hermann Hesse
  Don Nix
  Don Knotts
  Nikola Tesla
  Sherlock Holmes
  Emperor Kōnin

List of articles starting with the letter L:
  L
  L!VE TV
  L&H
  Leivonmäki
  ...
  List of articles starting with the letter K
  List of articles starting with the letter L (itself; OK)
  List of articles starting with the letter M

List of articles about places:
  Leivonmäki
  Katase River
  Enoshima
  ...

List of articles about Japan:
  Emperor Shōwa
  Katase River
  Enoshima
  ...

List of all lists that do not contain themselves:
  List of articles about Japan
  List of articles about places
  List of articles about people
  List of articles starting with the letter K
  List of articles starting with the letter M
  ...
  List of all lists that do not contain themselves?

If the "List of all lists that do not contain themselves" contains itself, then it does not belong to itself and should be
removed. However, if it does not list itself, then it should be added to itself.

While appealing, these layman's versions of the paradox share a drawback: an easy refutation of the barber paradox
seems to be that such a barber does not exist, or at least does not shave (a variant of which is that the barber is a
woman). The whole point of Russell's paradox is that the answer "such a set does not exist" means the definition of
the notion of set within a given theory is unsatisfactory. Note the difference between the statements "such a set does
not exist" and "it is an empty set". It is like the difference between saying, "There is no bucket", and saying, "The
bucket is empty".
A notable exception to the above may be the GrellingNelson paradox, in which words and meaning are the
elements of the scenario rather than people and hair-cutting. Though it is easy to refute the barber's paradox by
saying that such a barber does not (and cannot) exist, it is impossible to say something similar about a meaningfully
defined word.
One way that the paradox has been dramatised is as follows:
Suppose that every public library has to compile a catalogue of all its books. Since the catalogue is itself one
of the library's books, some librarians include it in the catalogue for completeness, while others leave it out, as
it is self-evident that the catalogue is one of the library's books.
Now imagine that all these catalogues are sent to the national library. Some of them include themselves in
their listings, others do not. The national librarian compiles two master catalogues: one of all the catalogues
that list themselves, and one of all those that don't.
The question is: should these catalogues list themselves? The 'Catalogue of all catalogues that list themselves'
is no problem. If the librarian doesn't include it in its own listing, it is still a true catalogue of those catalogues
that do include themselves. If he does include it, it remains a true catalogue of those that list themselves.
However, just as the librarian cannot go wrong with the first master catalogue, he is doomed to fail with the
second. When it comes to the 'Catalogue of all catalogues that don't list themselves', the librarian cannot
include it in its own listing, because then it would include itself. But in that case, it should belong to the other
catalogue, that of catalogues that do include themselves. However, if the librarian leaves it out, the catalogue is
incomplete. Either way, it can never be a true catalogue of catalogues that do not list themselves.

Applications and related topics


Russell-like paradoxes
As illustrated above for the Barber paradox, Russell's paradox is not hard to extend. Take:
A transitive verb <V>, that can be applied to its substantive form.
Form the sentence:
The <V>er that <V>s all (and only those) who don't <V> themselves,
Sometimes the "all" is replaced by "all <V>ers".
An example would be "paint":
The painter that paints all (and only those) that don't paint themselves.
or "elect"
The elector (representative), that elects all that don't elect themselves.
Paradoxes that fall in this scheme include:
The barber with "shave".
The original Russell's paradox with "contain": The container (Set) that contains all (containers) that don't contain
themselves.
The Grelling–Nelson paradox with "describer": The describer (word) that describes all words, that don't describe
themselves.

Richard's paradox with "denote": The denoter (number) that denotes all denoters (numbers) that don't denote
themselves. (In this paradox, all descriptions of numbers get an assigned number. The term "that denotes all
denoters (numbers) that don't denote themselves" is here called Richardian.)

Related paradoxes
The liar paradox and Epimenides paradox, whose origins are ancient
The Kleene–Rosser paradox, showing that the original lambda calculus is inconsistent, by means of a
self-negating statement
Curry's paradox (named after Haskell Curry), which does not require negation
The smallest uninteresting integer paradox

Notes
[2] Set theory paradoxes (http://www.suitcaseofdreams.net/Set_theory_Paradox.htm), by Tetyana Butler, 2006, Suitcase of Dreams
[3] Russell 1920:136
[4] Also van Heijenoort 1967:124–125
[5] Remarkably, this letter was unpublished until van Heijenoort 1967; it appears with van Heijenoort's commentary at van Heijenoort
1967:124–125.
[6] Russell 1903:101
[7] cf van Heijenoort's commentary before Frege's Letter to Russell in van Heijenoort 1967:126.
[8] van Heijenoort's commentary, cf van Heijenoort 1967:126 ; Frege starts his analysis by this exceptionally honest comment : "Hardly anything
more unfortunate can befall a scientific writer than to have one of the foundations of his edifice shaken after the work is finished. This was the
position I was placed in by a letter of Mr Bertrand Russell, just when the printing of this volume was nearing its completion" (Appendix of
Grundgesetze der Arithmetik, vol. II, in The Frege Reader, p. 279, translation by Michael Beaney).
[9] cf van Heijenoort's commentary, cf van Heijenoort 1967:126. The added text reads as follows: "Note. The second volume of Gg., which
appeared too late to be noticed in the Appendix, contains an interesting discussion of the contradiction (pp. 253–265), suggesting that the
solution is to be found by denying that two propositional functions that determine equal classes must be equivalent. As it seems very likely
that this is the true solution, the reader is strongly recommended to examine Frege's argument on the point" (Russell 1903:522); The
abbreviation Gg. stands for Frege's Grundgesetze der Arithmetik. Begriffsschriftlich abgeleitet. Vol. I. Jena, 1893. Vol. II. 1903.
[10] Livio states that "While Frege did make some desperate attempts to remedy his axiom system, he was unsuccessful. The conclusion
appeared to be disastrous...." Livio 2009:188. But van Heijenoort in his commentary before Frege's (1902) Letter to Russell describes Frege's
proposed "way out" in some detail; the matter has to do with the "transformation of the generalization of an equality into an equality of
courses-of-values. For Frege a function is something incomplete, 'unsaturated'"; this seems to contradict the contemporary notion of a
"function in extension"; see Frege's wording at page 128: "Incidentally, it seems to me that the expression 'a predicate is predicated of itself' is
not exact. ...Therefore I would prefer to say that 'a concept is predicated of its own extension' [etc]". But he waffles at the end of his suggestion
that a function-as-concept-in-extension can be written as predicated of its function. van Heijenoort cites Quine: "For a late and thorough study
of Frege's "way out", see Quine 1955": "On Frege's way out", Mind 64, 145–159; reprinted in Quine 1955b: Appendix. Completeness of
quantification theory. Loewenheim's theorem, enclosed as a pamphlet with part of the third printing (1955) of Quine 1950 and incorporated in
the revised edition (1959), 253–260" (cf REFERENCES in van Heijenoort 1967:649)
[11] Russell mentions this fact to Frege, cf van Heijenoort's commentary before Frege's (1902) Letter to Russell in van Heijenoort 1967:126
[12] van Heijenoort's commentary before Zermelo (1908a) Investigations in the foundations of set theory I in van Heijenoort 1967:199
[13] van Heijenoort 1967:190–191. In the section before this he objects strenuously to the notion of impredicativity as defined by Poincaré (and
soon to be taken by Russell, too, in his 1908 Mathematical logic as based on the theory of types, cf van Heijenoort 1967:150–182).
[14] Ernst Zermelo (1908) A new proof of the possibility of a well-ordering in van Heijenoort 1967:183–198. Livio 2009:191 reports that
Zermelo "discovered Russell's paradox independently as early as 1900"; Livio in turn cites Ewald 1996 and van Heijenoort 1967 (cf Livio
2009:268).
[15] B. Rang and W. Thomas, "Zermelo's discovery of the 'Russell Paradox'", Historia Mathematica, v. 8 n. 1, 1981, pp. 15–22.
[16] van Heijenoort 1967:124


References
Potter, Michael (15 January 2004), Set Theory and its Philosophy, Clarendon Press (Oxford University Press),
ISBN 978-0-19-926973-0
van Heijenoort, Jean (1967, third printing 1976), From Frege to Gödel: A Source Book in Mathematical Logic,
1879-1931, Cambridge, Massachusetts: Harvard University Press, ISBN 0-674-32449-8
Livio, Mario (6 January 2009), Is God a Mathematician?, New York: Simon & Schuster,
ISBN 978-0-7432-9405-8

External links
Weisstein, Eric W., "Russell's Antinomy" (http://mathworld.wolfram.com/RussellsAntinomy.html),
MathWorld.
Russell's Paradox (http://www.cut-the-knot.org/selfreference/russell.shtml) at Cut-the-Knot
Stanford Encyclopedia of Philosophy: " Russell's Paradox (http://plato.stanford.edu/entries/russell-paradox/)"
by A. D. Irvine.

Simpson's paradox
In probability and statistics, Simpson's paradox, or the Yule–Simpson effect, is a paradox in which a trend that
appears in different groups of data disappears when these groups are combined, and the reverse trend appears for
the aggregate data. This result is often encountered in social-science and medical-science statistics, and is
particularly confounding when frequency data are unduly given causal interpretations.[1] Simpson's paradox
disappears when causal relations are brought into consideration. Many statisticians believe that the mainstream
public should be informed of the counter-intuitive results in statistics such as Simpson's paradox.[2][3]

[Figure: Simpson's paradox for continuous data: a positive trend appears for two separate groups (blue and red); a negative trend (black, dashed) appears when the data are combined.]

Edward H. Simpson first described this phenomenon in a technical paper in 1951, but the statisticians Karl Pearson,
et al., in 1899, and Udny Yule, in 1903, had mentioned similar effects earlier. The name Simpson's paradox was
introduced by Colin R. Blyth in 1972. Since Edward Simpson did not actually discover this statistical paradox (an
instance of Stigler's law of eponymy), some writers, instead, have used the impersonal names reversal paradox and
amalgamation paradox in referring to what is now called Simpson's paradox and the Yule–Simpson effect.

Examples
Kidney stone treatment
This is a real-life example from a medical study comparing the success rates of two treatments for kidney stones.
The table below shows the success rates and group sizes for treatments involving both small and large kidney
stones, where Treatment A includes all open surgical procedures and Treatment B is percutaneous nephrolithotomy
(which involves only a small puncture). The numbers in parentheses indicate the number of successes over the total
size of the group. (For example, 93% equals 81 divided by 87.)

               Treatment A              Treatment B
Small stones   Group 1: 93% (81/87)     Group 2: 87% (234/270)
Large stones   Group 3: 73% (192/263)   Group 4: 69% (55/80)
Both           78% (273/350)            83% (289/350)

The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on
large stones, yet treatment B appears more effective when both sizes are considered together. In this example, the
"lurking" variable (or confounding variable) of stone size was not known to be important until its effects were
included.
Which treatment is considered better is determined by an inequality between two ratios (successes/total). The
reversal of the inequality between the ratios, which creates Simpson's paradox, happens because two effects occur
together:
1. The sizes of the groups, which are combined when the lurking variable is ignored, are very different. Doctors
tend to give the severe cases (large stones) the better treatment (A), and the milder cases (small stones) the
inferior treatment (B). Therefore, the totals are dominated by groups 3 and 2, and not by the two much smaller
groups 1 and 4.
2. The lurking variable has a large effect on the ratios, i.e. the success rate is more strongly influenced by the
severity of the case than by the choice of treatment. Therefore, the group of patients with large stones using
treatment A (group 3) does worse than the group with small stones, even if the latter used the inferior treatment B
(group 2).
Based on these effects, the paradoxical result can be rephrased more intuitively as follows: Treatment A, when
applied to a patient population consisting mainly of patients with large stones, is less successful than Treatment B
applied to a patient population consisting mainly of patients with small stones.
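
As a quick sanity check on the arithmetic, here is a minimal Python sketch using only the counts from the table
above (the variable names are ours, chosen for illustration):

# Success counts and group sizes from the kidney stone table above.
a_small = (81, 87)    # Treatment A, small stones: (successes, total)
b_small = (234, 270)  # Treatment B, small stones
a_large = (192, 263)  # Treatment A, large stones
b_large = (55, 80)    # Treatment B, large stones

def rate(successes, total):
    return successes / total

# Within each stone size, Treatment A wins...
assert rate(*a_small) > rate(*b_small)  # 0.93 > 0.87
assert rate(*a_large) > rate(*b_large)  # 0.73 > 0.69

# ...but pooling the two sizes reverses the comparison.
a_all = (a_small[0] + a_large[0], a_small[1] + a_large[1])  # 273/350
b_all = (b_small[0] + b_large[0], b_small[1] + b_large[1])  # 289/350
assert rate(*a_all) < rate(*b_all)  # 0.78 < 0.83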

Berkeley gender bias case


One of the best-known real-life examples of Simpson's paradox occurred when the University of California,
Berkeley was sued for bias against women who had applied for admission to graduate schools there. The admission
figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the
difference was so large that it was unlikely to be due to chance.[4]
        Applicants   Admitted
Men     8442         44%
Women   4321         35%

But when examining the individual departments, it appeared that no department was significantly biased against
women. In fact, most departments had a "small but statistically significant bias in favor of women." The data from
the six largest departments are listed below.

Department   Men                      Women
             Applicants   Admitted    Applicants   Admitted
A            825          62%         108          82%
B            560          63%         25           68%
C            325          37%         593          34%
D            417          33%         375          35%
E            191          28%         393          24%
F            373           6%         341           7%

The research paper by Bickel et al. concluded that women tended to apply to competitive departments with low rates
of admission even among qualified applicants (such as the English department), whereas men tended to apply to
less-competitive departments with high rates of admission among the qualified applicants (such as engineering
and chemistry). The conditions under which the admissions frequency data from specific departments constitute a
proper defense against charges of discrimination are formulated in the book Causality by Pearl.
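
The reversal can be checked directly from the table. The Python sketch below uses the rounded rates above, so the
pooled figures are approximate, and it covers only these six departments (not the campus-wide totals quoted earlier);
the department labels follow Bickel et al.:

# Rounded admission rates and applicant counts for the six largest departments.
depts = {
    "A": ((825, 0.62), (108, 0.82)),   # (men applicants, rate), (women applicants, rate)
    "B": ((560, 0.63), (25, 0.68)),
    "C": ((325, 0.37), (593, 0.34)),
    "D": ((417, 0.33), (375, 0.35)),
    "E": ((191, 0.28), (393, 0.24)),
    "F": ((373, 0.06), (341, 0.07)),
}

favor_women = [d for d, ((_, m), (_, w)) in depts.items() if w > m]
print("Departments where women's rate is higher:", favor_women)  # A, B, D, F

def pooled(side):
    # Aggregate admission rate across the six departments for one gender.
    groups = [g[side] for g in depts.values()]
    admitted = sum(n * r for n, r in groups)
    return admitted / sum(n for n, _ in groups)

print(f"Pooled: men {pooled(0):.1%}, women {pooled(1):.1%}")  # ~44.5% vs ~30.3%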

Low birth weight paradox


Main article: Low birth weight paradox
The low birth weight paradox is an apparently paradoxical observation relating to the birth weights and mortality of
children born to tobacco smoking mothers. As a usual practice, babies weighing less than a certain amount (which
varies between different countries) have been classified as having low birth weight. In a given population, babies
with low birth weights have had a significantly higher infant mortality rate than others. Normal birth weight infants
of smokers have about the same mortality rate as normal birth weight infants of non-smokers, and low birth weight
infants of smokers have a much lower mortality rate than low birth weight infants of non-smokers, but infants of
smokers overall have a much higher mortality rate than infants of non-smokers. This is because many more infants
of smokers are low birth weight, and low birth weight babies have a much higher mortality rate than normal birth
weight babies.

Batting averages
A common example of Simpson's Paradox involves the batting averages of players in professional baseball. It is
possible for one player to hit for a higher batting average than another player during a given year, and to do so again
during the next year, but to have a lower batting average when the two years are combined. This phenomenon can
occur when there are large differences in the number of at-bats between the years. (The same situation applies to
calculating batting averages for the first half of the baseball season, and during the second half, and then combining
all of the data for the season's batting average.)
A real-life example is provided by Ken Ross[5] and involves the batting average of two baseball players, Derek Jeter
and David Justice, during the baseball years 1995 and 1996:[6]

               1995            1996            Combined
Derek Jeter    12/48    .250   183/582  .314   195/630  .310
David Justice  104/411  .253   45/140   .321   149/551  .270

In both 1995 and 1996, Justice had a higher batting average than Jeter did. However, when the two baseball seasons
are combined, Jeter shows a higher batting average than Justice. According to Ross, this phenomenon would be
observed about once per year among the possible pairs of interesting baseball players. In this particular case,
Simpson's paradox can still be observed if the year 1997 is also taken into account:
               1995            1996            1997            Combined
Derek Jeter    12/48    .250   183/582  .314   190/654  .291   385/1284  .300
David Justice  104/411  .253   45/140   .321   163/495  .329   312/1046  .298
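
A few lines of Python (a minimal sketch using only the season counts from the tables above) verify both the
per-season comparisons and the reversal in the combined figures:

# (hits, at-bats) per season, from the tables above.
jeter   = {"1995": (12, 48),   "1996": (183, 582), "1997": (190, 654)}
justice = {"1995": (104, 411), "1996": (45, 140),  "1997": (163, 495)}

def avg(hits, at_bats):
    return hits / at_bats

# Justice out-hits Jeter in every individual season...
for year in jeter:
    assert avg(*justice[year]) > avg(*jeter[year])

# ...yet Jeter's combined average is higher.
def combined(player):
    hits = sum(h for h, _ in player.values())
    at_bats = sum(n for _, n in player.values())
    return hits / at_bats

print(f"{combined(jeter):.3f} vs {combined(justice):.3f}")  # 0.300 vs 0.298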

The Jeter and Justice example of Simpson's paradox was referred to in the "Conspiracy Theory" episode of the
television series Numb3rs, though a chart shown omitted some of the data and listed the 1996 averages as
1995's.[citation needed]
If each season is weighted equally, the phenomenon disappears. In the table below, each player's record in each
season has been normalized to the larger of the two at-bat totals for that season, so that comparable quantities are
compared.
               1995                              1996                             Combined
Derek Jeter    (12/48) × 411 = 102.75/411  .250  (183/582) × 582 = 183/582  .314  285.75/993  .288
David Justice  (104/411) × 411 = 104/411   .253  (45/140) × 582 ≈ 187/582   .321  291/993     .293
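
Here is a sketch of that normalization in Python, under the same convention as the table (each season is scaled to
the larger of the two at-bat counts for that season):

seasons = {
    "1995": {"Jeter": (12, 48),   "Justice": (104, 411)},
    "1996": {"Jeter": (183, 582), "Justice": (45, 140)},
}

totals = {"Jeter": [0.0, 0], "Justice": [0.0, 0]}
for year, players in seasons.items():
    target = max(ab for _, ab in players.values())  # larger at-bat count
    for name, (hits, ab) in players.items():
        totals[name][0] += hits / ab * target  # scale hits to the target at-bats
        totals[name][1] += target

for name, (hits, ab) in totals.items():
    print(f"{name}: {hits / ab:.3f}")  # Jeter ≈ .288, Justice ≈ .293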

Correlation between variables


Simpson's paradox can also arise in correlations, in which two variables appear to have (say) a positive correlation
towards one another, when in fact they have a negative correlation, the reversal having been brought about by a
lurking confounder. Berman et al.[7] give an example from economics, where a dataset suggests overall demand is
positively correlated with price (that is, higher prices lead to more demand), in contradiction of expectation. Analysis
reveals time to be the confounding variable: plotting both price and demand against time reveals the expected
negative correlation over various periods, which then reverses to become positive if the influence of time is ignored
by simply plotting demand against price.
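
The mechanism is easy to reproduce. The following Python sketch uses hypothetical price/demand numbers (not
Berman et al.'s actual dataset): within each period demand falls as price rises, but both drift upward over time, so
the pooled correlation is positive.

import numpy as np

periods = []
for t in range(5):
    price = np.linspace(10, 14, 20) + 5 * t               # prices rise over time
    demand = 100 + 20 * t - 2.0 * (price - price.mean())  # local slope is negative
    periods.append((price, demand))

within = [np.corrcoef(p, d)[0, 1] for p, d in periods]
print(within)  # all -1.0: negative correlation inside each period

price_all = np.concatenate([p for p, _ in periods])
demand_all = np.concatenate([d for _, d in periods])
print(np.corrcoef(price_all, demand_all)[0, 1])  # strongly positive overall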

Description
Suppose two people, Lisa and Bart, each edit articles for two weeks. In the first week, Lisa improves 0 of the 3
articles she edited, and Bart improves 1 of the 7 articles he edited. In the second week, Lisa improves 5 of the 7
articles she edited, while Bart improves all 3 of the articles he edited.

[Figure: Illustration of Simpson's paradox. The first graph (top) represents Lisa's contributions, the second Bart's.
The blue bars represent the first week, the red bars the second week, and the triangles indicate the combined
percentage of good contributions (weighted average). While both of Bart's bars show a higher rate of success than
Lisa's, Lisa's combined rate is higher because she improved a far greater fraction of the larger batch of articles she
edited.]

       Week 1   Week 2   Total
Lisa   0/3      5/7      5/10
Bart   1/7      3/3      4/10

Both times Bart improved a higher percentage of articles than Lisa, but the actual number of articles each edited
(the denominator of the ratio, i.e. the sample size) was not the same for both of them in either week. When the totals
for the two weeks are added together, Bart's and Lisa's work can be judged from an equal sample size, i.e. the same
number of articles edited by each. Looked at in this more accurate manner, Lisa's ratio is higher and, therefore, so is
her percentage. Equivalently, when the two weeks are combined using a weighted average, Lisa's overall success
rate is higher because her high-success week carried far more weight in her total than Bart's did in his. Therefore,
like other paradoxes, it only appears to be a paradox because of incorrect assumptions, incomplete information, or a
lack of understanding of a particular concept.
       Week 1 rate   Week 2 rate   Overall rate (weighted)
Lisa   0%            71.4%         50%
Bart   14.2%         100%          40%

The appearance of paradox arises when percentages are reported without the underlying ratios. In this example, if
only Bart's first-week figure of 14.2% were provided, but not the ratio (1/7), the information would be distorted and
the apparent paradox created. Even though Bart's percentage is higher in both the first and the second week, when
the two weeks of articles are combined, Lisa improved the greater proportion: 5 of the 10 total articles, or 50%.
Lisa's proportional total of articles improved exceeds Bart's.
Here are the results again, in summary form:
In the first week:
Lisa improved 0% of the articles she edited.
Bart had a 14.2% success rate.
Success is associated with Bart.
In the second week:
Lisa managed 71.4%.
Bart achieved a 100% success rate.
Success is associated with Bart.
On both occasions Bart's edits were more successful than Lisa's. But when the two weeks are combined, we see that
Lisa and Bart both edited 10 articles, and:
Lisa improved 5 articles.
Bart improved only 4.
Success is now associated with Lisa.
Bart is better in each week but worse overall.


The paradox stems from the intuition that Bart could not possibly be a better editor in each week but worse overall.
Pearl showed that this is possible when "better editor" is taken in the counterfactual sense: "Were Bart to edit all
items in a set he would do better than Lisa would, on those same items." Clearly, frequency data cannot support this
sense of "better editor", because they do not tell us how Bart would perform on items edited by Lisa, and vice versa.
In the back of our minds, though, we assume that the articles were assigned at random to Bart and Lisa, an
assumption which (for a large sample) would support the counterfactual interpretation of "better editor". However,
under random assignment conditions, the data given in this example are unlikely, which accounts for our surprise
when confronting the rate reversal.
The arithmetical basis of the paradox is uncontroversial. If a/b > A/B and c/d > C/D, we feel that (a + c)/(b + d)
must be greater than (A + C)/(B + D). However, if different weights are used to form each person's overall score,
this feeling may be disappointed: the combined score (a + c)/(b + d) is a weighted average of a/b and c/d, with the
first week weighted b/(b + d), and the two people's weights need not match. Here, with Bart's weekly rates a/b = 1/7
and c/d = 3/3 against Lisa's A/B = 0/3 and C/D = 5/7, the first week is weighted 7/10 for Bart but only 3/10 for
Lisa, and the weights are reversed in the second week, giving combined scores of 4/10 for Bart and 5/10 for Lisa.
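
To make the weights explicit, a small Python sketch computes each person's overall rate as a weighted average of
the weekly rates, where each week's weight is its share of that person's total attempts:

# Weekly (successes, attempts) from the Lisa/Bart example above.
lisa = [(0, 3), (5, 7)]
bart = [(1, 7), (3, 3)]

def overall(weeks):
    total = sum(n for _, n in weeks)
    # Weighted average of the weekly rates; each week is weighted by
    # its share of that person's total attempts.
    return sum((s / n) * (n / total) for s, n in weeks)

print(overall(lisa))  # 0.5  (weights 3/10 and 7/10)
print(overall(bart))  # 0.4  (weights 7/10 and 3/10)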

Lisa is a better editor on average, as her overall success rate is higher. But it is possible to have told the story in a
way which would make it appear obvious that Bart is more diligent.
Simpson's paradox shows us an extreme example of the importance of including data about possible confounding
variables when attempting to calculate causal relations. Precise criteria for selecting a set of "confounding
variables" (i.e., variables that yield correct causal relationships if included in the analysis) are given in Pearl using
causal graphs. While Simpson's paradox often refers to the analysis of count tables, as shown in this example, it
also occurs with continuous data:[8] for example, if one fits separate regression lines through two sets of data, the
two regression lines may each show a positive trend, while a regression line fitted through all the data together
shows a negative trend, as in the first figure; a minimal numerical sketch follows.
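
The Python sketch below uses toy numbers chosen for illustration: each group has a clear positive internal trend,
but the groups are offset so that a single line fitted to all of the data slopes downward.

import numpy as np

x1, y1 = np.array([0., 1., 2., 3.]), np.array([8., 9., 10., 11.])
x2, y2 = np.array([6., 7., 8., 9.]), np.array([2., 3., 4., 5.])

print(np.polyfit(x1, y1, 1)[0])  # +1.0 (slope within group 1)
print(np.polyfit(x2, y2, 1)[0])  # +1.0 (slope within group 2)
print(np.polyfit(np.concatenate([x1, x2]),
                 np.concatenate([y1, y2]), 1)[0])  # about -0.76 (pooled slope)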

Vector interpretation
Simpson's paradox can also be illustrated using a 2-dimensional vector space. A success rate of p/q can be
represented by the vector (q, p), with slope p/q; a steeper vector then represents a greater success rate. If two rates
p1/q1 and p2/q2 are combined, as in the examples given above, the result can be represented by the sum of the
vectors (q1, p1) and (q2, p2), which according to the parallelogram rule is the vector (q1 + q2, p1 + p2), with slope
(p1 + p2)/(q1 + q2).

[Figure: Vector interpretation of Simpson's paradox.]

Simpson's paradox says that even if a vector L1 (in blue in the figure) has a smaller slope than another vector B1
(in red), and L2 has a smaller slope than B2, the sum of the two vectors L1 + L2 (indicated by "+" in the figure)
can still have a larger slope than the sum of the two vectors B1 + B2, as shown in the example.
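
A short Python sketch makes this concrete using the Lisa/Bart counts from the Description section, with each
weekly record written as a vector (attempts, successes):

L1, L2 = (3, 0), (7, 5)   # Lisa, weeks 1 and 2
B1, B2 = (7, 1), (3, 3)   # Bart, weeks 1 and 2

def slope(v):
    attempts, successes = v
    return successes / attempts

def add(u, v):
    # Parallelogram rule: component-wise vector sum.
    return (u[0] + v[0], u[1] + v[1])

assert slope(L1) < slope(B1)                     # 0.0   < 0.143
assert slope(L2) < slope(B2)                     # 0.714 < 1.0
assert slope(add(L1, L2)) > slope(add(B1, B2))   # 0.5   > 0.4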

Implications for decision making


The practical significance of Simpson's paradox surfaces in decision-making situations, where it poses the
following dilemma: which data should we consult in choosing an action, the aggregated or the partitioned? In the
kidney stone example above, it is clear that if one is diagnosed with "small stones" or "large stones", the data for
the respective subpopulation should be consulted, and Treatment A would be preferred to Treatment B. But what if
a patient is not diagnosed, and the size of the stone is not known; would it be appropriate to consult the aggregated
data and administer Treatment B? This would stand contrary to common sense; a treatment that is preferred both
under one condition and under its negation should also be preferred when the condition is unknown.
On the other hand, if the partitioned data are to be preferred a priori, what prevents one from partitioning the data
into arbitrary sub-categories (say, based on eye color or post-treatment pain) artificially constructed to yield wrong
choices of treatments? Pearl shows that, indeed, in