
Rice University

ECO 501

Lecture Notes: Microeconomic Theory I

Christian Roessler
Fall 2008
Contents
1 Preference
1.1 Consumption Set
1.2 Rational Preference
1.3 Utility Functions
2 Utility
2.1 Continuity
2.2 Quasiconcavity
3 Demand I: Utility Maximization Problem
3.1 Budgets
3.2 The Utility Maximization Problem (UMP)
3.3 Indirect Utility Function
4 Demand II: Expenditure Minimization Problem
4.1 EMP and Hicksian Demand
4.2 Expenditure Function
4.3 Duality
5 Comparative Statics
5.1 Wealth Effects
5.2 Price Effects
5.3 Law of Demand
5.4 Elasticity
5.5 Money Metric
5.6 Welfare Comparisons
6 Choice-Based Approach
6.1 Choice Structures
6.2 Weak Axiom
6.3 Relationship with the Law of Demand
6.4 Strong Axiom
7 Integrability
7.1 Slutsky and Hicks Compensation
7.2 Aside: Dot Product
7.3 Substitution Matrix
7.4 Substitution Matrix with Preference
7.5 Integrability
8 Aggregation
8.1 Aggregate Demand Function
8.2 Representative Consumer
8.3 Failure of the Weak Axiom
9 Expected Utility
9.1 Lotteries
9.2 Preference over Lotteries
9.3 Expected Utility Theorem
9.4 Paradoxes
9.5 State-Space Approaches
10 Risk
10.1 Money Lotteries
10.2 Risk Attitude
10.3 Stochastic Dominance
11 Profit Maximization Problem
11.1 Production Set
11.2 Transformation Function
11.3 Profit Maximization
11.4 Law of Supply
12 Efficiency of Aggregate Supply
12.1 Efficient Production
12.2 Cost Minimization
12.3 Aggregate Supply
13 Partial Competitive Equilibrium
13.1 Competitive Equilibrium
13.2 Partial Equilibrium
13.3 The Long Run
14 Welfare Analysis
14.1 Pareto Efficiency and Surplus
14.2 Efficiency of Competitive Equilibrium
14.3 Efficient Allocations through the Market Mechanism
15 Externalities
15.1 Externalities
15.2 Inefficiency
15.3 Remedies
15.4 Public Goods
16 Monopoly and Product Differentiation
16.1 Monopoly
16.2 Bertrand Price Competition
16.3 Repetition
16.4 Product Differentiation
17 Capacity Constraints
17.1 Capacity-Constrained Pricing
17.2 Cournot Quantity Competition
17.3 Competitive Limit
18 Precommitment and Entry
18.1 Precommitment
18.2 Entry Equilibrium
18.3 Socially Optimal Entry
1 Preference
1.1 Consumption Set
In the first part of these lectures, we consider the consumer decision
problem in a market economy, where goods are offered at posted prices.
Most of our attention will be on the primitive approach that starts with
"rational" individual preferences and derives optimal choices for given
prices and endowments. We will also look at connections with the
"revealed preference" approach that makes assumptions directly about
choices.
Let there be $L$ different commodities. A consumption bundle is a list
$x_1, x_2, \dots, x_L$ of quantities of each commodity, represented by a vector
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_L \end{pmatrix}$$
in the commodity space $\mathbb{R}^L$.
Note that $\mathbb{R}^L$ includes consumption bundles that list negative quantities
of some goods. These can be interpreted as debts or as giving some of one's
endowment to others.
Any two items that could sell at different prices should be modeled as
distinct commodities. Strictly speaking, the description of a commodity
may have to include detailed context information, such as "a diet coke
can from a vending machine at Grand Central Station in the summer
of 2008."
The definition of the commodity space tells us what kind of object a
consumption bundle is. It may be the case that not all such objects
(i.e. not all vectors) can be feasibly consumed, or we may wish to
impose restrictions in a particular model. Physical constraints include:
we need time to consume, and time is limited; we cannot consume in
two places simultaneously; we have to consume enough of certain things
(food, shelter) to survive.
A restriction we will impose for modeling purposes is that the set of
feasible consumption bundles, or the consumption set, is convex. That
is, if $x$ and $y$ are feasible, then any mixture $z = \alpha x + (1-\alpha) y$ with
$\alpha \in (0,1)$ is also feasible. This assumption rules out indivisible
commodities.
Example. Suppose the commodities are time (in minutes) spent watching
television, and time (in minutes) spent in a rollercoaster car on "The
Beast" (Kings Island near Cincinnati), a ride which takes approximately five
minutes. It is possible to watch no television and ride "The Beast" once (call
this bundle $x$), and it is also possible to watch television for one hour and
not ride "The Beast" (call this bundle $y$). But one cannot watch television
for half an hour and ride "The Beast" for two-and-a-half minutes (unless it
is a one-period model ...). Since this bundle (call it $z$) is a mixture of $x$ and
$y$, the consumption set for these two commodities is not convex.
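The failure of convexity in this example can be checked mechanically. The following is a minimal sketch, assuming (as a simplification introduced here) that ride time is feasible only in whole multiples of five minutes; the names `feasible` and `mix` are hypothetical helpers, not part of the notes.

```python
from fractions import Fraction

RIDE_MINUTES = 5  # assumed length of one ride on "The Beast"

def feasible(tv_minutes, ride_minutes):
    """Feasible if both times are non-negative and ride time is a whole number of rides."""
    return tv_minutes >= 0 and ride_minutes >= 0 and ride_minutes % RIDE_MINUTES == 0

def mix(x, y, alpha):
    """The mixture alpha*x + (1 - alpha)*y of two bundles."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(x, y))

x = (Fraction(0), Fraction(5))    # no television, one ride
y = (Fraction(60), Fraction(0))   # one hour of television, no ride
z = mix(x, y, Fraction(1, 2))     # half an hour of television, half a ride

print(feasible(*x), feasible(*y), feasible(*z))
```

Both endpoints pass the feasibility test while their midpoint does not, which is exactly what a non-convex consumption set means.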
Any economic model is an idealized representation of reality. We will
make many assumptions that are in some sense too strong to ever hold
exactly. In some cases, they preserve the spirit of the problem, and
propositions for the idealized world are also informative about reality.
In others, the assumptions limit the applicability of the model to a more
specific problem. Convexity of the consumption set is a convenience
restriction that should not alter the thrust of our results, even though
many commodities are actually indivisible.
Most of the time, we will assume that the consumption set $X$ contains
all bundles with only non-negative quantities:
$$X = \mathbb{R}^L_+ = \{x \in \mathbb{R}^L \text{ s.t. } x_\ell \ge 0 \text{ for } \ell = 1, \dots, L\},$$
which is convex.
1.2 Rational Preference
The first step in analyzing individual choices from the consumption
set is to define preference over its elements. For the moment, we can
think of $X$ more generally as a choice set, where the elements may in
particular be consumption bundles. Or they may be something else
one can express a preference about, for example sports teams.
A preference $\succsim$ is a binary relation on $X$: a subset of $X \times X$. We
interpret $(x, y) \in \succsim$, which is commonly denoted $x \succsim y$, as "$x$ is at least
as good as $y$."
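The preference $\succsim$ associates with every $x \in X$ its better-than (or upper
contour) set and worse-than (or lower contour) set:
$$\succsim(x) = \{a \in X \text{ s.t. } a \succsim x\}$$
$$\precsim(x) = \{b \in X \text{ s.t. } x \succsim b\}.$$
The intersection of the upper and lower contour sets is the indifference
set at $x$:
$$\sim(x) = \{b \in X \text{ s.t. } x \sim b\}.$$
From $\succsim$ derive the strict preference and indifference relations $\succ$ and $\sim$:
$$x \succ y \text{ iff } x \succsim y \text{ and not } y \succsim x$$
$$x \sim y \text{ iff } x \succsim y \text{ and also } y \succsim x.$$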
In almost all economic theory, preferences are assumed to be rational
in the following sense.
Definition. Preference $\succsim$ is rational iff
(i) $\succsim$ is complete: $\forall x, y \in X$, $x \succsim y$ or $y \succsim x$ (or both).
(ii) $\succsim$ is transitive: $\forall x, y, z \in X$, $x \succsim y \succsim z \implies x \succsim z$.
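On a finite choice set, both conditions can be verified by brute force. The sketch below is only an illustration: the relation is stored as a set of ordered pairs (matching the "subset of the product set" view above), and the alternatives `a`, `b`, `c` and the three example relations are hypothetical.

```python
from itertools import product

def is_complete(X, R):
    """Every pair (including x with itself) is ranked in at least one direction."""
    return all((x, y) in R or (y, x) in R for x, y in product(X, repeat=2))

def is_transitive(X, R):
    """Whenever x R y and y R z hold, x R z must hold as well."""
    return all((x, z) in R
               for x, y, z in product(X, repeat=3)
               if (x, y) in R and (y, z) in R)

def is_rational(X, R):
    return is_complete(X, R) and is_transitive(X, R)

X = ["a", "b", "c"]  # hypothetical alternatives
reflexive = {(x, x) for x in X}
ranked = reflexive | {("a", "b"), ("b", "c"), ("a", "c")}   # a over b over c
cyclic = reflexive | {("a", "b"), ("b", "c"), ("c", "a")}   # complete but cyclical
partial = reflexive | {("a", "b")}                           # c is never compared

for name, R in [("ranked", ranked), ("cyclic", cyclic), ("partial", partial)]:
    print(name, is_complete(X, R), is_transitive(X, R), is_rational(X, R))
```

Only the first relation is rational: the cyclical one fails transitivity and the partial one fails completeness.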
Completeness says that any two elements of $X$ can be compared: one
is always preferred, or else they are indifferent. But it is never the case
that the agent could not say whether he prefers $x$ or $y$ (or finds them
indifferent), for example because he is seeking more information. This
scenario could be accommodated by state-dependent preference, where
the state is defined by what the agent learns.
Transitivity is most closely related to the usual notion of rationality: it
requires the agent to rank alternatives consistently and predictably. In
practice, people often violate transitivity unwittingly, especially when
the choice objects are complex and unfamiliar. However, most would
revise a stated preference when one points out to them that it is in-
transitive.
Moreover, a transitivity violator is exposed to "Dutch book" trades.
Suppose the agent initially has $x$ and the preference is cyclical, e.g.
$x \succ y \succsim z \succsim x$. Since the agent should be willing to pay some non-zero
amount to exchange $y$ for $x$, one could offer $z$ for $x$, then $y$ for $z$, then
$x$ for $y$ (at a price). This leaves the agent with the initial bundle $x$,
but he has made a payment. If these trades are repeated often enough,
they will bankrupt the agent.
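The sequence of trades can be traced step by step. The following is a rough sketch under assumptions made here for illustration: the agent accepts every offered trade in the cycle, the first two swaps are free, the last one costs a fee of 1, and initial wealth is 10. The function name `money_pump` is hypothetical.

```python
def money_pump(start, offers, wealth):
    """Run the agent through the cycle of trades until a full cycle is unaffordable.

    offers is a list of (give, get, price) triples that the agent accepts in order.
    Returns the final holding, remaining wealth, and number of completed cycles.
    """
    cycle_cost = sum(price for _, _, price in offers)
    if cycle_cost <= 0:
        raise ValueError("the cycle must cost the agent something")
    holding, rounds = start, 0
    while wealth >= cycle_cost:
        for give, get, price in offers:
            if holding != give:
                raise ValueError("offers do not form a cycle from the start bundle")
            holding, wealth = get, wealth - price
        rounds += 1
    return holding, wealth, rounds

# z for x and y for z are accepted for free; x for y is accepted at a fee.
offers = [("x", "z", 0), ("z", "y", 0), ("y", "x", 1)]
print(money_pump("x", offers, wealth=10))
```

After every cycle the agent holds the same bundle as at the start, but with less wealth.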
Exercise 1 (MWG 1.B.1, 1.B.2). Show: if $\succsim$ is rational, then $\forall x, y, z \in X$
(i) $x \sim x$, (ii) relations $\succ$ and $\sim$ are transitive, (iii) $x \succ y \succsim z \implies x \succ z$.
1.3 Utility Functions
To apply tools from calculus in analyzing choices, it is useful to repre-
sent the preference relation by a function.
Definition. The function $u : X \to \mathbb{R}$ is a utility function that represents
preference relation $\succsim$ if $\forall x, y \in X$,
$$x \succsim y \iff u(x) \ge u(y).$$
If $u$ represents $\succsim$, then any strictly increasing function $f : \mathbb{R} \to \mathbb{R}$ can
be composed with $u$ to give a new function $v : X \to \mathbb{R}$ (where $v(x) = f(u(x))$
for all $x \in X$), with the property that $v(x) \ge v(x') \iff u(x) \ge u(x')$.
Thus $v$ also represents $\succsim$. Clearly, utility functions are
non-unique.
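A quick numerical illustration of non-uniqueness follows. It is a sketch with made-up ingredients: the utility function (a product of two quantities), the increasing transformation (a cubic plus one), and the four bundles are all hypothetical choices for this example.

```python
def ranking(utility, bundles):
    """Bundles sorted from most to least preferred according to the given utility."""
    return sorted(bundles, key=utility, reverse=True)

def u(x):
    return x[0] * x[1]            # hypothetical utility function

def increasing(t):
    return t ** 3 + 1             # strictly increasing on the real line

def not_monotone(t):
    return -(t - 3) ** 2          # rises, then falls: not an admissible transformation

bundles = [(1, 1), (2, 3), (4, 1), (0, 5)]
by_u = ranking(u, bundles)
by_v = ranking(lambda x: increasing(u(x)), bundles)
by_g = ranking(lambda x: not_monotone(u(x)), bundles)

print(by_u)
print(by_v)
print(by_g)
```

The strictly increasing transformation leaves the ranking intact, while the non-monotone one reorders the bundles and therefore represents a different preference.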
On the other hand, a utility function may fail to exist. It never exists
for a preference that is not rational.
Proposition. Only a rational preference relation can be represented by a
utility function.
Proof. Suppose a utility function $u : X \to \mathbb{R}$ represents preference $\succsim$
on $X$. Then $\succsim$ is complete: for any $x, y \in X$, we have $u(x) \ge u(y)$ or
$u(y) \ge u(x)$, so $x \succsim y$ or $y \succsim x$. And $\succsim$ is transitive: when $x \succsim y \succsim z$, we
have $u(x) \ge u(y) \ge u(z)$, hence $u(x) \ge u(z)$ and then $x \succsim z$. So existence
of a utility function implies that $\succsim$ is rational. The contrapositive, "if $\succsim$ is
not rational, then there does not exist a utility function," follows.

Exercise 2 (MWG 1.B.5). Show: if $X$ is finite and $\succsim$ is a rational
preference relation on $X$, then there exists a utility function $u : X \to \mathbb{R}$ that
represents $\succsim$.
2 Utility
2.1 Continuity
In this lecture, we discuss which properties of preferences ensure the
existence of the type of utility function to which the standard optimization
techniques apply. When is $\succsim$ represented by a utility function $u$
that is twice differentiable and has a (unique) maximum? Conditions
for differentiability are fairly difficult to establish, so we focus on the
related, but weaker, property of continuity.
In addition to continuity, differentiability requires the absence of kinks
in the utility function.
Definition. Preference relation $\succsim$ is continuous if $x \succsim y$ whenever $x$ and
$y$ are the limits of sequences $\{x^n\}_{n=1}^{\infty}$ and $\{y^n\}_{n=1}^{\infty}$ such that
$x^n \succsim y^n$ for all $n$.
Informally, a continuous preference ranks $x$ and $y$ the same as it ranks
objects that are very similar to $x$ and $y$.
In particular, suppose $\{y^n\}_{n=1}^{\infty}$ converges to $y$, and $y^n \succsim x$ for all $n$.
Then continuity requires $y \succsim x$. Hence the limit of every convergent
sequence in $\succsim(x)$, the upper contour set, is also in $\succsim(x)$, so that $\succsim(x)$
is closed. By the analogous argument, $\precsim(x)$, the lower contour set, is
also closed.
The converse is true, too. Thus, a continuous preference is equivalently
defined by the closedness of all lower and upper contour sets.
Informally, $x$ is weakly preferred to $y$ if objects that are arbitrarily
similar to $x$ are weakly preferred to $y$.
In the context of functions, continuity preserves closeness, rather than
preference, under limits: whenever $x$ is very close to $y$ in the domain,
$f(x)$ is very close to $f(y)$ in the image.
Definition. The function $f : X \to \mathbb{R}$ is continuous if $f(x)$ is the limit of
sequence $\{f(x^n)\}_{n=1}^{\infty}$ whenever $x$ is the limit of sequence
$\{x^n\}_{n=1}^{\infty} \subseteq X$.
The proof of existence of a continuous utility function is simplified by
(but actually valid without) imposing a property called monotonicity
on preferences. Its definition presupposes that the consumption set is
$X = \mathbb{R}^L_+$.
The notation $x \gg y$ stands for $x_\ell > y_\ell$ for $\ell = 1, \dots, L$.
Definition. Preference relation $\succsim$ is monotone if $x \gg y \implies x \succ y$.
Monotonicity is a fairly strong assumption. It rules out bads, but this
is not a real problem, since bads can be relabeled as goods (e.g. replace
"waste" with "waste disposal"). Many things are, however, desirable
only up to a point (you may like some ice cream, perhaps a lot, but
not tons of it delivered to your home). Monotonicity welcomes more of
anything; it is inconsistent with limited wants.
Proposition. Any continuous rational preference relation can be represented
by a continuous utility function.
Proof. Assume that preference $\succsim$ is continuous (and monotone) on $\mathbb{R}^L_+$.
We will construct utility values and show that the resulting function
represents $\succsim$ and is continuous.
Let $e = (1, \dots, 1) \in \mathbb{R}^L_+$ be a vector of 1s. Monotonicity implies: for every
consumption bundle $x$, there exists $\underline{\alpha} \ge 0$ such that $x \succsim \underline{\alpha} e$ and $\bar{\alpha} < \infty$ such
that $\bar{\alpha} e \succsim x$ (just let $\underline{\alpha} = 0$ and $\bar{\alpha}$ large enough so that all quantities in
$\bar{\alpha} e$ are greater than the corresponding quantities in $x$). This means that the
sets $A^+(x) = \{\alpha \in \mathbb{R}_+ \text{ s.t. } \alpha e \succsim x\}$ and $A^-(x) = \{\alpha \in \mathbb{R}_+ \text{ s.t. } x \succsim \alpha e\}$ are
nonempty. Given $x$, we have $\alpha \in A^-(x)$ or $\alpha \in A^+(x)$ for any $\alpha \in \mathbb{R}_+$
by completeness, so $\mathbb{R}_+ \subseteq A^-(x) \cup A^+(x)$. Moreover, $A^-(x)$ and $A^+(x)$
are closed, since $\precsim(x)$ and $\succsim(x)$ are closed, so that preference must be
preserved under limits of sequences $\{\alpha^n e\}_{n=1}^{\infty}$ (i.e. sequences of bundles that
have equal quantities of all commodities). To sum up, $A^-(x)$ and $A^+(x)$ are
nonempty, closed, and together cover $\mathbb{R}_+$, which is a connected set. Hence
$$A^-(x) \cap A^+(x) = \{\alpha \in \mathbb{R}_+ \text{ s.t. } x \sim \alpha e\}$$
is nonempty for all $x \in \mathbb{R}^L_+$.
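Monotonicity implies further that there is only one $\alpha$ such that $x \sim \alpha e$,
since $\alpha' e \succ x$ for all $\alpha' > \alpha$ and $x \succ \alpha'' e$ for all $\alpha'' < \alpha$. We may therefore
define a function $u : X \to \mathbb{R}_+$ by letting $u(x)$ be this unique $\alpha$, i.e.
$$x \sim u(x)\, e$$
for all $x \in X$. It remains to be shown that $u$ represents $\succsim$ and is continuous.
Suppose $x \succsim y$ and $x \sim \alpha e$ and $y \sim \beta e$. Then $\alpha \ge \beta$ by monotonicity,
so $u(x) \ge u(y)$. Conversely, if $u(x) \ge u(y)$, then $\alpha e \ge \beta e$, so $x \succsim y$ by
monotonicity.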
Continuity of $u$ requires: for any sequence $\{x^n\}_{n=1}^{\infty}$ with limit $x$, the
sequence $\{u(x^n)\}_{n=1}^{\infty}$ converges to $u(x)$. Note that, because $\{x^n\}_{n=1}^{\infty}$ converges,
there exists for any $\varepsilon > 0$ a number $n(\varepsilon)$ such that $\|x^n - x\| < \varepsilon$ for $n \ge n(\varepsilon)$.
Hence $\{u(x^n)\}_{n=n(\varepsilon)}^{\infty}$ lies in a compact set: namely, in the interval $[\underline{\alpha}, \bar{\alpha}]$
where $\underline{\alpha} = 0$ and $\bar{\alpha}$ is the highest quantity of any commodity in $\{x^n\}_{n=n(\varepsilon)}^{\infty}$.
Therefore $\{u(x^n)\}_{n=1}^{\infty}$ must have a convergent subsequence.
Furthermore, every convergent subsequence has limit $u(x)$, which is therefore
the limit of the sequence $\{u(x^n)\}_{n=1}^{\infty}$. To see this, suppose there is a
convergent subsequence $\{u(x^{m(n)})\}_{n=1}^{\infty}$ ($m$ is an increasing function) that
converges to $\alpha \ne u(x)$. If e.g. $\alpha > u(x)$, then $\alpha e \succ u(x) e$ by monotonicity,
and also $\hat{\alpha} e \succ u(x) e$, where $\hat{\alpha} = \frac{1}{2}(\alpha + u(x))$ lies between $\alpha$ and $u(x)$.
Because $u(x^{m(n)}) \to \alpha > \hat{\alpha}$, we have, for some $N$, $u(x^{m(n)}) > \hat{\alpha}$ for all
$n > N$. Then $x^{m(n)} \sim u(x^{m(n)}) e \succ \hat{\alpha} e$, which implies $x \succsim \hat{\alpha} e$, because $\succsim$ is
continuous and the upper contour set $\succsim(\hat{\alpha} e)$ is therefore closed. But $x \sim u(x) e$,
which conflicts with $\hat{\alpha} e \succ u(x) e$. Analogously, $\alpha < u(x)$ leads to a contradiction.
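The construction in the proof can be imitated numerically: given only a way to compare bundles, search along the diagonal for the point that is indifferent to the bundle in question. The sketch below is an illustration under stated assumptions. The comparison function `weakly_prefers` is hypothetical (it ranks two-commodity bundles by the first quantity times the square of the second, which is monotone and continuous), and `diagonal_utility` is a name introduced here. Bisection is used as the search method; the proof itself only asserts existence and uniqueness.

```python
def weakly_prefers(x, y):
    """Hypothetical preference: x is at least as good as y when x1*x2^2 >= y1*y2^2."""
    return x[0] * x[1] ** 2 >= y[0] * y[1] ** 2

def diagonal_utility(x, prefers, tol=1e-10):
    """Find the scalar a such that x is indifferent to the diagonal bundle (a, ..., a)."""
    lo, hi = 0.0, max(x)   # x is at least as good as 0*e; max(x)*e is at least as good as x
    while hi - lo > tol:
        mid = (lo + hi) / 2
        diagonal_bundle = (mid,) * len(x)
        if prefers(x, diagonal_bundle):
            lo = mid       # mid*e is no better than x: the indifferent point lies above
        else:
            hi = mid       # mid*e is strictly better than x: it lies below
    return (lo + hi) / 2

x, y = (8.0, 1.0), (1.0, 4.0)
print(diagonal_utility(x, weakly_prefers), diagonal_utility(y, weakly_prefers))
```

For this particular preference the constructed utility is the cube root of the ranking index, so the two printed values should be close to 2 and to the cube root of 16, and their order agrees with the direct comparison of the two bundles.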

Exercise 3 (MWG 3.C.2). Prove the converse: if a continuous utility
function represents $\succsim$, then $\succsim$ is continuous.
Example. Lexicographic preferences are not continuous and do not admit
a utility representation. With lexicographic preferences, there exists an
ordering of commodities, so that $x_{(1)} > y_{(1)} \implies x \succ y$ (where $(1)$ refers to
the highest-priority commodity), $x_{(1)} = y_{(1)}$ and $x_{(2)} > y_{(2)} \implies x \succ y$, etc.
(As in alphabetic entries in a telephone book or dictionary.)
These preferences violate continuity: e.g. every bundle in the sequence
$\{x^n\}_{n=1}^{\infty}$, where $x^n_{(1)} = \frac{1}{n}$ and $x^n_{(2)} = 0$ for all $n$, is preferred to $y$ such that
$y_{(1)} = 0$ and $y_{(2)} = 1$, but the limit $x = (0, \dots, 0)$ is worse than $y$.
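The discontinuity can be reproduced with exact arithmetic. This is a small sketch assuming two commodities with commodity 1 having priority; it relies on the fact that Python compares tuples lexicographically, and the helper name `lex_prefers` is introduced here for illustration.

```python
from fractions import Fraction

def lex_prefers(x, y):
    """x is at least as good as y under lexicographic preference (first entry has priority)."""
    return tuple(x) >= tuple(y)   # Python tuples are compared entry by entry

y = (Fraction(0), Fraction(1))
sequence = [(Fraction(1, n), Fraction(0)) for n in range(1, 1001)]
limit = (Fraction(0), Fraction(0))

print(all(lex_prefers(x_n, y) for x_n in sequence))  # every term beats y
print(lex_prefers(limit, y))                          # but the limit does not
```

Every term of the sequence has a strictly positive amount of the priority commodity and so beats the fixed bundle, yet the limit ties on the priority commodity and loses on the second.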
Suppose $u$ is a utility function representing lexicographic preference $\succsim$.
Let $x_{(1)} = y_{(1)} = \lambda$ and $x_{(2)} = 1$, $y_{(2)} = 2$. Whatever values $u(x)$ and
$u(y) > u(x)$ the utility function assigns to $x$ and $y$, we can find a rational
number $r(\lambda) \in (u(x), u(y))$. Lexicographic preference implies $\lambda > \lambda' \implies
r(\lambda) > r(\lambda')$ (since $x$ with $x_{(1)} = \lambda$ is preferred to $y'$ with $y'_{(1)} = \lambda'$, i.e.
$r(\lambda) > u(x) > u(y') > r(\lambda')$). Hence the function $r$ maps one-to-one from $\mathbb{R}$ (an
uncountable set) to $\mathbb{Q}$ (a countable set), a contradiction.
One can understand the argument intuitively by noting that, with
lexicographic preferences, the indifference sets are singletons, so that a different
utility value has to be assigned for every bundle, i.e. every point of $\mathbb{R}^L$. But
the utility values come from the (in some sense smaller) set $\mathbb{R}$.
Exercise 4 (MWG 3.C.4). Find a preference relation that is not contin-
uous but has a utility function that represents it.
2.2 Quasiconcavity
Besides continuity (differentiability), which allows us to apply calculus
in solving the utility maximization problem, we would like to know
whether the utility function has a maximum. This property
(quasiconcavity) relates to the convexity of preference.
Definition. Preference relation $\succsim$ is convex if $\forall x \in X$ the upper contour
set $\succsim(x)$ is convex: $y, z \in \succsim(x) \implies \forall \alpha \in [0,1]$,
$\alpha y + (1-\alpha) z \in \succsim(x)$.
I.e. $\succsim$ is convex if $y \succsim x$ and $z \succsim x$ imply $\forall \alpha \in [0,1]$, $\alpha y + (1-\alpha) z \succsim x$.
Preference relation $\succsim$ is strictly convex if $y \succsim x$ and $z \succsim x$ imply
$\forall \alpha \in (0,1)$, $\alpha y + (1-\alpha) z \succ x$, provided $y \ne z$.
Convex preference can be interpreted as a desire for variety: if I like
two bundles equally, then I find a mixture of the two (a more balanced
bundle) more appealing. By the same token, I dislike extremes. If I
have a lot of one commodity and little of the others, I am willing to
trade aggressively (give up a lot) for a more balanced bundle. This
results in a "diminishing marginal rate of substitution": a high relative
valuation for commodities of which I have little, which decreases as I
acquire more.
Exercise 5 (MWG 3.C.1). Show that lexicographic preference is rational,
monotone and strictly convex.
Definition. Utility function $u$ is quasiconcave if $\forall x \in X$ the upper contour
set $\succsim(x) = \{a \in X \text{ s.t. } u(a) \ge u(x)\}$ is convex.
Quasiconcavity is weaker than concavity, which is easily apparent from
an equivalent definition: $u$ is quasiconcave if $\forall x, y \in X$ and $\forall \alpha \in [0,1]$
$$u(\alpha x + (1-\alpha) y) \ge \min\{u(x), u(y)\}.$$
(Why are the definitions equivalent? Suppose $x \succsim y$, then convexity of
$\succsim(y)$ implies $\alpha x + (1-\alpha) y \in \succsim(y)$, hence $u(\alpha x + (1-\alpha) y) \ge u(y)$.
Conversely, suppose $x, y \in \succsim(z)$, and note that $u(\alpha x + (1-\alpha) y) \ge
\min\{u(x), u(y)\} \ge u(z)$ implies $\alpha x + (1-\alpha) y \in \succsim(z)$.)
Of course, a concave function satisfies: $\forall x, y \in X$ and $\forall \alpha \in (0,1)$
$$u(\alpha x + (1-\alpha) y) \ge \alpha u(x) + (1-\alpha) u(y) \ge \min\{u(x), u(y)\}.$$
Utility function $u$ is strictly quasiconcave if $\forall x, y \in X$ and $\forall \alpha \in (0,1)$
$$u(\alpha x + (1-\alpha) y) > \min\{u(x), u(y)\},$$
provided $x \ne y$.
The graph of a concave function always lies above a line segment between
two of its points. The graph of a quasiconcave function only has to lie
above the lower of the two points. Figure 1 illustrates the concave and
quasiconcave cases in panels (a) and (b).
Figure 1: Concave and quasiconcave functions
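The gap between the two properties can also be seen on a grid of points. The sketch below is a numerical illustration only (a grid check cannot prove either property); the three one-dimensional test functions and the helper names are hypothetical choices made here.

```python
import math

def is_quasiconcave_on_grid(u, grid, alphas, tol=1e-12):
    """Check u(mix) >= min(u(x), u(y)) for all grid pairs and mixing weights."""
    return all(u(a * x + (1 - a) * y) >= min(u(x), u(y)) - tol
               for x in grid for y in grid for a in alphas)

def is_concave_on_grid(u, grid, alphas, tol=1e-12):
    """Check u(mix) >= a*u(x) + (1-a)*u(y) for all grid pairs and mixing weights."""
    return all(u(a * x + (1 - a) * y) >= a * u(x) + (1 - a) * u(y) - tol
               for x in grid for y in grid for a in alphas)

grid = [i / 4 for i in range(13)]      # points 0, 0.25, ..., 3
alphas = [0.25, 0.5, 0.75]

def cube(x):
    return x ** 3                      # increasing, hence quasiconcave, but not concave

def trough(x):
    return (x - 1.5) ** 2              # dips in the middle: not quasiconcave

for name, f in [("sqrt", math.sqrt), ("cube", cube), ("trough", trough)]:
    print(name, is_concave_on_grid(f, grid, alphas), is_quasiconcave_on_grid(f, grid, alphas))
```

The square root passes both checks, the cubic passes only the quasiconcavity check, and the function with a trough fails both.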
Quasiconcavity is sufficient to guarantee that a local extremum is a
global maximum. If there existed a greater maximum or a minimum,
then the utility function would have a trough somewhere and increase
on both sides of it, resulting in a "gap" (non-convexity) in the upper
contour set. However, the global maximum may not be unique if it is
part of a plateau.
Proposition. If a utility function represents a (strictly) convex preference
relation, then it is (strictly) quasiconcave.
Proof. Consider a weakly convex preference relation; I demonstrate that
it implies quasiconcavity (the strict case is analogous). To show:
$$u(\alpha x + (1-\alpha) y) \ge u(x)$$
or
$$u(\alpha x + (1-\alpha) y) \ge u(y)$$
for all $x, y \in X$, with $\alpha \in (0,1)$. With convex preference, $\alpha x + (1-\alpha) y \succsim x$
whenever $y \succsim x$, and $\alpha x + (1-\alpha) y \succsim y$ whenever $x \succsim y$. If $u$ represents
$\succsim$, it must satisfy $u(\alpha x + (1-\alpha) y) \ge u(x)$ (i.e. the first inequality) if
$u(y) \ge u(x)$, and $u(\alpha x + (1-\alpha) y) \ge u(y)$ (i.e. the second inequality) if
$u(x) \ge u(y)$.
Exercise 6 (MWG 3.B.1). Show: if $\succsim$ is monotone, then $\succsim$ is locally
nonsatiated.
Exercise 7 (MWG 3.C.5).
(a) Preference relation $\succsim$ is homothetic if $\forall x, y \in X$, $y \gg x \implies y \succ x$,
and $x \sim y \implies \forall \alpha \ge 0$, $\alpha x \sim \alpha y$. Utility function $u$ is homogeneous
of degree 1 if $\forall \alpha > 0$, $u(\alpha x) = \alpha u(x)$. Show: a continuous preference is
homothetic iff it admits a utility function that is homogeneous of degree 1.
(b) Let $e_1 = (1, 0, \dots, 0)$ denote the bundle that contains 1 unit of
commodity 1 and none of the other commodities. Preference relation $\succsim$ is
quasilinear with respect to good 1 if $\forall x, y \in X$, $\forall \alpha > 0$, $x + \alpha e_1 \succ x$, and
$x \sim y \implies \forall \alpha \in \mathbb{R}$, $(x + \alpha e_1) \sim (y + \alpha e_1)$. Show: a continuous preference
is quasilinear with respect to commodity 1 if it admits a utility function of
the form $u(x) = x_1 + \phi(x_2, \dots, x_L)$. (No need to show a converse.)
Exercise 8 (MWG 3.C.6). Consider preferences in a two-commodity
world represented by the CES (constant elasticity of substitution) utility
function
$$u(x) = (\alpha_1 x_1^{\rho} + \alpha_2 x_2^{\rho})^{1/\rho}.$$
Show:
(a) When $\rho = 1$, the graphs of the indifference sets are linear.
(b) As $\rho \to 0$, the CES utility function represents in the limit the same
preferences as the Cobb-Douglas utility function $u(x) = x_1^{\alpha_1} x_2^{\alpha_2}$.
(c) As $\rho \to -\infty$, the CES utility function has in the limit the same
indifference sets as the Leontief utility function $u(x) = \min\{x_1, x_2\}$.
3 Demand I: Utility Maximization Problem
3.1 Budgets
A consumer is constrained in her choices by a limited budget. If she
has wealth $w$, and commodities sell at market prices
$$p = \begin{pmatrix} p_1 \\ \vdots \\ p_L \end{pmatrix} \in \mathbb{R}^L_{++},$$
then her chosen bundle $x \in \mathbb{R}^L_+$ must satisfy
$$p \cdot x = p_1 x_1 + \dots + p_L x_L \le w.$$
The set of bundles that meet the budget constraint, for given prices and
wealth, is called the budget set.
Definition. The Walrasian budget set is the set of all bundles $x \in \mathbb{R}^L_+$
affordable with wealth $w$ at prices $p$:
$$B_{p,w} = \{x \in \mathbb{R}^L_+ \text{ s.t. } p \cdot x \le w\}.$$
Notice that a decline in any price enlarges the budget set: more bundles
satisfy the constraint.
In a two-commodity world, the budget set is the area under the budget
line $p_1 x_1 + p_2 x_2 = w$.
More generally, the budget set is bounded by an $(L-1)$-dimensional
hyperplane, defined by $p \cdot x = w$.
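The budget set is convex: if $p \cdot x \le w$ and $p \cdot x' \le w$, then $\forall \alpha \in [0,1]$
the mixture $x'' = \alpha x + (1-\alpha) x'$ satisfies
$p \cdot x'' = \alpha\, p \cdot x + (1-\alpha)\, p \cdot x' \le w$. It is also compact: closed
and bounded, since $x_\ell \le w/p_\ell$ for $\ell = 1, \dots, L$ (you can at most spend
all your wealth on one commodity).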
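The properties of the budget set noted above are easy to confirm on a small example. The sketch below uses hypothetical numbers (two commodities priced at 2 and 5, wealth of 20) and hypothetical helper names; exact fractions avoid rounding on the budget line.

```python
from fractions import Fraction

def cost(p, x):
    """Expenditure on bundle x at prices p (the dot product of p and x)."""
    return sum(p_l * x_l for p_l, x_l in zip(p, x))

def in_budget_set(x, p, w):
    """Membership in the Walrasian budget set: non-negative and affordable."""
    return all(x_l >= 0 for x_l in x) and cost(p, x) <= w

def mix(x, y, alpha):
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(x, y))

p, w = (2, 5), 20
x, x_prime = (10, 0), (0, 4)                      # all wealth on one commodity each
x_mixed = mix(x, x_prime, Fraction(1, 4))
bounds = tuple(Fraction(w, p_l) for p_l in p)     # largest affordable quantity of each good

print(in_budget_set(x_mixed, p, w), bounds)
print(in_budget_set((15, 1), p, w), in_budget_set((15, 1), (1, 5), w))
```

The mixture of two affordable bundles is affordable, no bundle can contain more of a good than wealth divided by its price, and a bundle that is out of reach at the original prices becomes affordable once the first price falls.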
Exercise 9 (MWG 2.D.2). For an individual who consumes amount $x$ of
a commodity priced at $p$, and $h$ hours of leisure, when the hourly wage is 1,
what is the Walrasian budget set?
3.2 The Utility Maximization Problem (UMP)
Let the consumer preferences be rational, continuous, and locally
nonsatiated. From the previous lecture, rationality and continuity imply
that there is a continuous utility function representing them; it is
quasiconcave if preferences are in addition convex. The consumption
set is $X = \mathbb{R}^L_+$.
The "utility maximization problem" is one of two equivalent ways to
frame the consumer choice problem (the other is the "expenditure
minimization problem," which we will get to later on). In the UMP, the
consumer picks the most-preferred bundle in her Walrasian budget set,
which gives utility
$$\max_{x \ge 0} u(x) \text{ s.t. } p \cdot x \le w.$$
By the Weierstrass theorem, a continuous real-valued function on a
compact, nonempty set has a maximum. Therefore the UMP has a
solution as long as preferences are continuous.
A solution to the UMP is in principle a set of consumption bundles
in the budget set, all of which give maximal utility. This set depends
on the parameters of the budget constraint: commodity prices and
individual wealth. The map from prices and wealth to the set of utility-
maximizing consumption bundles in the associated budget set is called
the Walrasian demand correspondence.
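In general, the solution can be stated in terms of Kuhn-Tucker
conditions: for $\ell = 1, \dots, L$,
$$\frac{\partial u(x)}{\partial x_\ell} = \lambda \frac{\partial g(x)}{\partial x_\ell} + \sum_{k=1}^{L} \mu_k \frac{\partial h_k(x)}{\partial x_\ell},$$
where $\lambda \ge 0$ and the $\mu_k \ge 0$ are Lagrange multipliers (shadow prices),
$$g(x) = p \cdot x - w \le 0$$
is the budget constraint, and the
$$h_k(x) = -x_k \le 0,$$
$k = 1, \dots, L$, are the non-negativity constraints. If the relevant
constraint is non-binding, $\lambda = 0$, respectively $\mu_k = 0$ (i.e. the interior
first-order condition holds).
The conditions reduce to
$$\frac{\partial u(x)}{\partial x_\ell} = \lambda p_\ell - \mu_\ell,$$
or equivalently
$$\frac{\partial u(x)}{\partial x_\ell} \le \lambda p_\ell$$
with equality if $x_\ell > 0$. More concisely we can write this as $\nabla u(x) \le \lambda p$
(with equalities where $x_\ell > 0$) in terms of the gradient vector
$$\nabla u(x) = \begin{pmatrix} \partial u(x)/\partial x_1 \\ \vdots \\ \partial u(x)/\partial x_L \end{pmatrix}.$$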
In a two-commodity world, where the budget constraint binds and both
goods are consumed, these conditions imply
$$\frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2} = \frac{p_1}{p_2},$$
which is incidentally the tangency condition familiar from diagrams in
intermediate micro books. The left side is the marginal rate of
substitution, i.e. the slope (in absolute value) of the indifference curve
(solve $du = MU_1\, dx_1 + MU_2\, dx_2 = 0$ for $dx_2/dx_1$). The right side is the
slope (in absolute value) of the budget line (found by totally differentiating
the budget line constraint, i.e. $p_1\, dx_1 + p_2\, dx_2 = 0$, and solving for
$dx_2/dx_1$). To be tangent, the slopes have to be equal. See Figure 2.
Tangency is necessary for an interior optimum, but there are infinitely
many points satisfying it for different levels of wealth. Therefore, the
budget constraint is needed to fix the solution.
Figure 2: Optimal choice at the point of tangency
Example. Consider preferences represented by the Cobb-Douglas utility
function $u(x_1, x_2) = x_1^{\alpha} x_2^{1-\alpha}$. Note that, if $u$ represents preferences, then so
must
$$\tilde{u}(x_1, x_2) = \ln u(x_1, x_2) = \alpha \ln x_1 + (1-\alpha) \ln x_2$$
represent them, since the logarithm is an increasing transformation. Because
$\tilde{u}(x_1, x_2)$ is strictly increasing in $x_1$ and $x_2$, the budget constraint
$$p_1 x_1 + p_2 x_2 \le w$$
must bind.
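Thus, the constrained choice is the solution to the Lagrangean problem
$$\max_{x_1, x_2} \mathcal{L}(x_1, x_2, \lambda) = \alpha \ln x_1 + (1-\alpha) \ln x_2 + \lambda (w - p_1 x_1 - p_2 x_2),$$
which satisfies first-order conditions
$$\frac{\partial \mathcal{L}(x_1, x_2, \lambda)}{\partial x_1} = \frac{\alpha}{x_1} - \lambda p_1 = 0$$
$$\frac{\partial \mathcal{L}(x_1, x_2, \lambda)}{\partial x_2} = \frac{1-\alpha}{x_2} - \lambda p_2 = 0$$
$$\frac{\partial \mathcal{L}(x_1, x_2, \lambda)}{\partial \lambda} = w - p_1 x_1 - p_2 x_2 = 0.$$
The first two reduce to
$$\frac{\alpha}{1-\alpha} \frac{x_2}{x_1} = \frac{p_1}{p_2}.$$
Thus
$$\alpha p_2 x_2 = (1-\alpha) p_1 x_1.$$
From the budget constraint (the third first-order condition), we have
$$w - p_1 x_1 = p_2 x_2,$$
hence
$$x_1 = \alpha \frac{w}{p_1} \text{ and } x_2 = (1-\alpha) \frac{w}{p_2}.$$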
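The closed-form solution can be cross-checked against a brute-force search along the budget line. This is a rough numerical sketch: the parameter values (weight 0.3 on good 1, prices 2 and 4, wealth 100) are hypothetical, and the grid search is a check introduced here, not the method used in the derivation.

```python
import math

def cobb_douglas_demand(alpha, p1, p2, w):
    """Closed-form demand from the first-order conditions above."""
    return alpha * w / p1, (1 - alpha) * w / p2

def log_utility(alpha, x1, x2):
    """The log transformation of Cobb-Douglas utility."""
    return alpha * math.log(x1) + (1 - alpha) * math.log(x2)

def grid_search_demand(alpha, p1, p2, w, steps=10000):
    """Best bundle among evenly spaced interior points of the budget line."""
    best = None
    for i in range(1, steps):
        x1 = (w / p1) * i / steps
        x2 = (w - p1 * x1) / p2          # spend the remaining wealth on good 2
        value = log_utility(alpha, x1, x2)
        if best is None or value > best[0]:
            best = (value, x1, x2)
    return best[1], best[2]

alpha, p1, p2, w = 0.3, 2.0, 4.0, 100.0   # hypothetical parameters
exact = cobb_douglas_demand(alpha, p1, p2, w)
approx = grid_search_demand(alpha, p1, p2, w)
print(exact, approx)
print(p1 * exact[0] / w, p2 * exact[1] / w)   # expenditure shares
```

The search lands next to the closed-form bundle, and the expenditure shares equal the exponents of the utility function regardless of prices.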
If preference is monotonic, the budget will always be exhausted, since
all commodities are valuable. But a weaker property, local
nonsatiation, actually suffices.
Definition. Preference relation $\succsim$ is locally nonsatiated if $\forall x \in X$ and
$\forall \varepsilon > 0$, $\exists y \in X$ such that $\|y - x\| \le \varepsilon$ and $y \succ x$.
Local nonsatiation (which also has the effect of ruling out thick
indifference sets) is a more plausible property than monotonicity, since it
allows for limited wants.
Definition. The Walrasian demand correspondence $x(p, w)$ satisfies Walras'
law if $\forall p \gg 0$, $\forall w > 0$ and $\forall x \in x(p, w)$, $p \cdot x = w$.
Walras' law is satisfied if the consumer's choice exhausts the budget.
The demand correspondence is homogeneous of degree $k$ if, for all $\alpha > 0$,
$x(\alpha p, \alpha w) = \alpha^k x(p, w)$.
Proposition. If $u$ is a continuous utility function that represents locally
non-satiated preference relation $\succsim$ on $X = \mathbb{R}^L_+$, then $\forall p \gg 0$, $\forall w > 0$,
the Walrasian demand correspondence $x(p, w)$ is homogeneous of degree
zero and satisfies Walras' law. If $u$ is in addition quasiconcave, then $x(p, w)$
is convex, and if $u$ is strictly quasiconcave, then $x(p, w)$ is a singleton.
Proof. Homogeneity of degree zero follows from the fact that the budget
constraint is unchanged when prices and wealth are scaled by $\alpha > 0$:
$$\alpha p \cdot x \le \alpha w \iff p \cdot x \le w.$$
Walras' law reflects non-satiation. If $p \cdot x < w$ for $x \in x(p, w)$, then by
non-satiation there exists in every neighborhood of $x$ an $x'$ such that $x' \succ x$.
If we pick a sufficiently small neighborhood, then $p \cdot x' < w$, so that $x'$ is in
the budget set. But then $x$ is not a utility-maximizing choice, a contradiction.
If $u$ is quasiconcave, then its upper contour sets are convex. Since all
elements of $x(p, w)$ are equally preferred and no bundle in the budget set is
strictly better, $x(p, w)$ is the part of the upper contour set of any of its
members that lies in the budget set: if $x \in x(p, w)$, then
$$x(p, w) = \succsim(x) \cap B_{p,w} = \{a \in B_{p,w} \text{ s.t. } u(a) \ge u(x)\}.$$
Hence $x(p, w)$ is the intersection of two convex sets and therefore convex.
If $u$ is strictly quasiconcave and $x(p, w)$ has two distinct elements $x$ and $x'$,
then for any $\alpha \in (0,1)$
$$u(x'') = u(\alpha x + (1-\alpha) x') > \min(u(x), u(x')).$$
But since $u(x) = u(x') = \min(u(x), u(x'))$, this implies $x''$ (which is in the
budget set, because it is convex) is strictly preferred to both $x$ and $x'$, which
contradicts $x, x' \in x(p, w)$.
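The difference between a convex demand set and a singleton can be seen by comparing two utility functions on the same budget line. The sketch below makes several assumptions for illustration: two commodities with equal prices of 1, wealth of 10, a linear utility (quasiconcave but not strictly so) and a product utility (strictly quasiconcave on strictly positive bundles). Searching only the budget line relies on Walras' law, which holds for both. All function names are hypothetical.

```python
from fractions import Fraction

def budget_line(p, w, steps):
    """Evenly spaced bundles that spend exactly w at prices p = (p1, p2)."""
    points = []
    for i in range(steps + 1):
        x1 = Fraction(w * i, p[0] * steps)
        x2 = (w - p[0] * x1) / p[1]
        points.append((x1, x2))
    return points

def maximizers(u, p, w, steps=100):
    """All grid points on the budget line that attain the highest utility."""
    points = budget_line(p, w, steps)
    best = max(u(x) for x in points)
    return [x for x in points if u(x) == best]

def linear(x):
    return x[0] + x[1]        # perfect substitutes

def cobb_douglas(x):
    return x[0] * x[1]        # monotone transformation of symmetric Cobb-Douglas

p, w = (1, 1), 10
flat = maximizers(linear, p, w)
peak = maximizers(cobb_douglas, p, w)
print(len(flat), len(peak), peak)
```

With perfect substitutes and equal prices every point on the budget line is optimal, so the demand set is the whole (convex) line segment; with the product utility a single bundle is optimal.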

Exercise 10 (MWG 3.D.1). Verify that the above proposition holds for
the Walrasian demand function with Cobb-Douglas utility.
The following is a natural generalization of the continuity concept for
point-valued functions to set-valued correspondences. Informally it says
that solutions in one constraint set should still be solutions at a very
similar constraint set.
Definition. The Walrasian demand correspondence is upper hemi-continuous
if $\forall (p, w)$, $x \in x(p, w)$ whenever $(p, w)$ and $x$ are limits of sequences
$\{(p^n, w^n)\}_{n=1}^{\infty}$ and $\{x^n\}_{n=1}^{\infty}$ such that $x^n \in x(p^n, w^n)$ for all $n$.
Proposition. If $u$ is a continuous utility function that represents locally
non-satiated preference relation $\succsim$ on $X = \mathbb{R}^L_+$, then $\forall p \gg 0$, $\forall w > 0$ the
Walrasian demand correspondence is upper hemi-continuous.
Proof. Suppose sequences $\{(p^n, w^n)\}_{n=1}^{\infty}$ and $\{x^n\}_{n=1}^{\infty}$ converge to $(p, w)$
and $x$, and we have $x^n \in x(p^n, w^n)$ for all $n$, but $x \notin x(p, w)$. Then there
exists $x'$ in $B_{p,w}$ such that $u(x') > u(x)$. By continuity of $u$, there exists
also $y \in B_{p,w}$ with $p \cdot y < w$ (for instance, $x'$ scaled down slightly) such that
$u(y) > u(x)$. Since $(p^n, w^n)$ converges to $(p, w)$, it
must be the case for all sufficiently large $n$ that $y \in B_{p^n, w^n}$. This implies
$u(x^n) \ge u(y)$, since $x^n$ is a utility-maximizing choice in $B_{p^n, w^n}$. But as
$\{x^n\}_{n=1}^{\infty}$ converges to $x$ and $u$ is continuous, this argument leads to
$u(x) \ge u(y)$, a contradiction.
Of course, the result implies that, if the Walrasian demand
correspondence is in fact a function, then this function is continuous.
3.3 Indirect Utility Function
The value of the utility function at a solution to the UMP is the highest it can attain on the budget set, i.e. for a particular set of prices and wealth. The map from prices and wealth to the highest attainable utility value is called the indirect utility function and denoted $v(p, w)$.
The indirect utility function is quasiconvex if, $\forall \bar{v}$, the lower contour set $\{(p, w) \text{ s.t. } v(p, w) \le \bar{v}\}$ is convex.
Proposition. If $u$ is a continuous utility function that represents a locally nonsatiated preference relation $\succsim$ on $X = \mathbb{R}^L_+$, then $\forall p \gg 0$, $\forall w > 0$, the indirect utility function is homogeneous of degree zero, i.e. $v(\alpha p, \alpha w) = v(p, w)$ for all $\alpha > 0$, strictly increasing in $w$ and non-increasing in $p_\ell$ for $\ell = 1, \ldots, L$, quasiconvex, and continuous in $p$ and $w$.
Proof. The indirect utility function is homogeneous of degree zero because the Walrasian demand correspondence is: since the scaling of prices and wealth does not affect the set of utility-maximizing choices, it cannot affect the utility derived from them.
Since wealth increases and price decreases enlarge the budget set, the best available choice can only improve, so that the utility associated with it cannot decrease. Because of local nonsatiation, which implies that the utility-maximizing choice lies on the boundary of the budget set, the indirect utility function must strictly increase in wealth.
To establish quasiconvexity, suppose $v(p, w) \le \bar{v}$ and $v(p', w') \le \bar{v}$ and consider $v(p'', w'')$, the maximal utility attainable in the budget set defined by prices
$$p'' = \alpha p + (1 - \alpha) p'$$
and wealth
$$w'' = \alpha w + (1 - \alpha) w'.$$
For any $x$ in this budget set,
$$\alpha p \cdot x + (1 - \alpha) p' \cdot x \le \alpha w + (1 - \alpha) w',$$
so $p \cdot x \le w$ or $p' \cdot x \le w'$ must be true. In the first case, $x \in B_{p,w}$, so $u(x) \le v(p, w)$. In the second case, $x \in B_{p',w'}$, so $u(x) \le v(p', w')$. Thus $u(x) \le \bar{v}$ in either case. Because $x$ was arbitrary, every bundle in $B_{p'',w''}$ yields utility at most $\bar{v}$, so $v(p'', w'') \le \bar{v}$, i.e. $(p'', w'')$ belongs to the lower contour set at $\bar{v}$. Because $\bar{v}$ was arbitrary, $v$ is quasiconvex.
In the more restrictive case that $u$ is strictly quasiconcave, $x(p, w)$ is a continuous function, and $v(p, w) = u(x(p, w))$ is a composition of continuous functions, so that it is also continuous.
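The properties in the proposition can be spot-checked numerically for a Cobb-Douglas consumer. The sketch below is only a sanity check, not a proof. It assumes the closed form $v(p, w) = (\alpha/p_1)^{\alpha} ((1-\alpha)/p_2)^{1-\alpha} w$ for $u(x) = x_1^{\alpha} x_2^{1-\alpha}$; the function names, the share `ALPHA`, and the test points are illustrative choices.

```python
import math

ALPHA = 0.3  # illustrative Cobb-Douglas share (assumption)

def indirect_utility(p1, p2, w, a=ALPHA):
    """v(p, w) for u(x) = x1**a * x2**(1 - a)."""
    return (a / p1) ** a * ((1 - a) / p2) ** (1 - a) * w

def quasiconvex_on_segment(pw0, pw1, steps=10):
    """Check v(mix) <= max(v(pw0), v(pw1)) along the segment between two (p1, p2, w) points."""
    bound = max(indirect_utility(*pw0), indirect_utility(*pw1))
    for i in range(steps + 1):
        t = i / steps
        mix = tuple(t * s + (1 - t) * r for s, r in zip(pw0, pw1))
        if indirect_utility(*mix) > bound + 1e-9:
            return False
    return True

pw0 = (2.0, 5.0, 10.0)  # (p1, p2, w)
pw1 = (4.0, 1.0, 6.0)

# Homogeneity of degree zero: tripling prices and wealth leaves v unchanged.
homogeneous = math.isclose(indirect_utility(6.0, 15.0, 30.0), indirect_utility(*pw0))
# Strictly increasing in wealth, non-increasing in a price.
increasing_in_w = indirect_utility(2.0, 5.0, 11.0) > indirect_utility(*pw0)
nonincreasing_in_p = indirect_utility(3.0, 5.0, 10.0) <= indirect_utility(*pw0)
quasiconvex = quasiconvex_on_segment(pw0, pw1)
print(homogeneous, increasing_in_w, nonincreasing_in_p, quasiconvex)
```

Checking a single segment can only refute quasiconvexity, never establish it; the proof above does the real work.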

Exercise 11 (MWG 3.D.2). Verify that the above proposition holds for the indirect utility function with the log transformation of Cobb-Douglas utility.
Exercise 12 (MWG 3.D.6). In a three-commodity setting, let preferences be represented by the utility function
$$u(x) = (x_1 - b_1)^{\alpha} (x_2 - b_2)^{\beta} (x_3 - b_3)^{\gamma}.$$
(a) Why is there no loss of generality from imposing $\alpha + \beta + \gamma = 1$? Assume this for the remaining parts.
(b) What are the first-order conditions for the UMP? Derive Walrasian demand and the indirect utility function.
(c) Verify that the general properties of Walrasian demand and indirect utility functions hold in this case.
4 Demand II: Expenditure Minimization Problem
4.1 EMP and Hicksian Demand
In the UMP ("utility maximization problem"), an optimal choice maximizes utility on a fixed budget set. Analogously, we can define optimal choice as minimizing expenditure on a fixed upper contour set (i.e. a set in which all bundles yield at least a certain utility). This approach is called the EMP ("expenditure minimization problem").
In the EMP, a consumer picks the consumption bundle for which expenditure is
$$\min_{x \ge 0} \; p \cdot x \ \text{ s.t. } u(x) \ge u \quad = \quad -\max_{x \ge 0} \; (-p \cdot x) \ \text{ s.t. } u(x) \ge u.$$
We assume $p \gg 0$ and $u > u(0)$ throughout (so that at least one commodity must be consumed, and none in infinite quantity). Furthermore, we assume that $u$ represents a continuous, locally non-satiated preference on $\mathbb{R}^L_+$, and is differentiable.
The set of consumption bundles that are solutions to the EMP at prices $p$ and required utility $u$ is denoted as $h(p, u) \subseteq \mathbb{R}^L_+$ and called the Hicksian demand correspondence (or function, if single-valued).
Exercise 13 (MWG 3.E.3). Argue that a solution to the EMP exists if $p \gg 0$ and $u(x) \ge u$ for some $x \in \mathbb{R}^L_+$.
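In parallel to the UMP, we apply the Kuhn-Tucker conditions: at a solution $x^*$, we have for $\ell = 1, \ldots, L$,
$$-\frac{\partial (p \cdot x^*)}{\partial x_\ell} = \lambda \frac{\partial (u - u(x^*))}{\partial x_\ell} + \sum_{k=1}^{L} \mu_k \frac{\partial (-x_k^*)}{\partial x_\ell},$$
where $\lambda \ge 0$ and the $\mu_k \ge 0$ are Lagrange multipliers (shadow prices). If the utility constraint is non-binding at $x^*$, i.e. $u(x^*) > u$, we have $\lambda = 0$. If the $\ell$th non-negativity constraint is non-binding at $x^*$, i.e. $x_\ell^* > 0$, then $\mu_\ell = 0$.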
Resolving the derivatives,
$$p_\ell = \lambda \frac{\partial u(x^*)}{\partial x_\ell} + \mu_\ell,$$
i.e. (after rearranging and relabeling the multiplier $\tilde{\lambda} = 1/\lambda$)
$$\frac{\partial u(x^*)}{\partial x_\ell} \le \tilde{\lambda} p_\ell, \quad \text{with equality if } x_\ell^* > 0 \text{ (so that } \mu_\ell = 0 \text{)}.$$
This is exactly analogous to the first-order conditions in the UMP. In a two-commodity world, where the utility constraint binds (so that $\tilde{\lambda} < \infty$) and both goods are consumed, the solution $x^*$ is again characterized by the tangency condition:
$$\frac{\partial u(x^*) / \partial x_1}{\partial u(x^*) / \partial x_2} = \frac{p_1}{p_2}.$$
The only difference is that the remaining constraint that fixes $x^*$ is now not the budget constraint, but the utility constraint $u(x^*) = u$.
Example. Reconsider preferences represented by the Cobb-Douglas utility function $u(x_1, x_2) = x_1^{\alpha} x_2^{1-\alpha}$. The utility constraint has to bind, since expenditure $p \cdot x$ is strictly increasing in $x_1$ and $x_2$. (If it did not bind, you could slightly lower consumption of one or both of the goods and attain a lower expenditure within the utility constraint.) Moreover, $u(x) = u(0)$ if $x_1$ or $x_2$ is zero, so they must be strictly positive for $x$ to attain $u > u(0)$.
The constrained choice therefore satisfies the "tangency" condition, which specializes to
$$\frac{\alpha}{1 - \alpha} \frac{x_2^*}{x_1^*} = \frac{p_1}{p_2},$$
given the Cobb-Douglas marginal utilities.
From the utility constraint, we have
$$(x_1^*)^{\alpha} (x_2^*)^{1-\alpha} = u,$$
hence
$$x_2^* = \left( \frac{x_2^*}{x_1^*} \right)^{\alpha} u$$
and
$$x_2^* = \left( \frac{1 - \alpha}{\alpha} \frac{p_1}{p_2} \right)^{\alpha} u \quad \text{and} \quad x_1^* = \left( \frac{\alpha}{1 - \alpha} \frac{p_2}{p_1} \right)^{1-\alpha} u.$$
Notice that $x_1^*$ and $x_2^*$ are functions of prices and the required utility (not of wealth) in the EMP. The wealth that is needed to attain $u$ is allowed to vary. In this sense, Hicksian demand is also called compensated demand, because if prices increase, expenditure is implicitly adjusted as needed in order to keep utility constant. But the consumption bundle $x$ may change so as to make the increase in expenditure as small as possible.
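The Cobb-Douglas Hicksian demand derived in the example can be checked numerically. The following sketch evaluates the two closed-form expressions and confirms that the resulting bundle satisfies the utility constraint with equality and the tangency condition. The function name and the parameter values are illustrative assumptions.

```python
import math

def hicksian_demand(p1, p2, u, a):
    """Hicksian demand for u(x) = x1**a * x2**(1 - a), using the closed forms from the example."""
    x1 = (a / (1 - a) * p2 / p1) ** (1 - a) * u
    x2 = ((1 - a) / a * p1 / p2) ** a * u
    return x1, x2

a, p1, p2, u = 0.4, 2.0, 3.0, 5.0  # illustrative values
x1, x2 = hicksian_demand(p1, p2, u, a)

# The utility constraint binds: the bundle attains exactly the required utility.
utility_binds = math.isclose(x1 ** a * x2 ** (1 - a), u)

# Tangency: the marginal rate of substitution equals the price ratio.
tangency_holds = math.isclose(a / (1 - a) * x2 / x1, p1 / p2)

print(utility_binds, tangency_holds)
```

Note that wealth never enters `hicksian_demand`; the required expenditure `p1 * x1 + p2 * x2` is an output of the problem, not an input.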
Hicksian demand has properties that correspond to those of Walrasian demand. The Hicksian demand correspondence is said to have no excess utility if $\forall x \in h(p, u)$, $u(x) = u$.
Proposition. If $u$ represents continuous, locally non-satiated preferences on $\mathbb{R}^L_+$, and $p \gg 0$, then the Hicksian demand correspondence $h(p, u)$ is homogeneous of degree zero in $p$, has no excess utility, is convex-valued if preference is convex, and single-valued if preference is strictly convex.
Proof. Homogeneity of degree zero follows from the fact that the upper contour set at $u$ (the constraint set), and therefore any solution to EMP, is unaffected by scaling prices: for all $\alpha > 0$,
$$x \in h(\alpha p, u) \iff x \in h(p, u).$$
(It is only the expenditure that changes, not the bundle that minimizes it; the scaling leaves relative prices unaltered.)
No excess utility reflects continuity of the utility function, which is inherited from preferences. If there were a solution $x \in h(p, u)$ such that $u(x) > u$, then we could construct a scaled-down bundle $x' = \alpha x$ with $\alpha \in (0, 1)$ that satisfies $p \cdot x' < p \cdot x$ (since $x' < x$) and $u(x') \ge u$ (for $\alpha$ close enough to 1, by continuity). But this means $x'$ is feasible and cheaper than $x$, so $x$ cannot be a solution to EMP at $u$, a contradiction.
Let $x, x' \in h(p, u)$, so that $p \cdot x = p \cdot x'$. If preference is convex (utility quasiconcave), then upper contour sets are convex, so
$$x'' = \alpha x + (1 - \alpha) x'$$
attains utility at least $u$. Moreover,
$$p \cdot x'' = p \cdot \alpha x + p \cdot (1 - \alpha) x' = \alpha p \cdot x + (1 - \alpha) p \cdot x' = p \cdot x.$$
It follows that $x''$ also minimizes expenditure and belongs to $h(p, u)$. If preference is strictly convex (utility strictly quasiconcave) and $x \ne x'$, then $x'' \succ x$ (and $x'' \succ x'$, since $h(p, u)$ satisfies no excess utility, so that $x' \sim x$). Continuity implies there exists $\beta \in (0, 1)$ such that $\beta x'' \succ x$, i.e. $u(\beta x'') > u(x)$. But since $\beta x'' < x''$ and $p \cdot x'' = p \cdot x$, we have $p \cdot \beta x'' < p \cdot x$, which implies that $x$ does not minimize expenditure on the constraint set, i.e. $x \notin h(p, u)$, a contradiction. Hence two such distinct elements $x, x'$ of $h(p, u)$ cannot exist, and $h(p, u)$ must be a singleton.

4.2 Expenditure Function


The value of $p \cdot x^*$ at a solution $x^*$ to the EMP is denoted as $e(p, u)$ and called the (minimum) expenditure function.
The expenditure function is homogeneous of degree one in $p$ if $\forall \alpha > 0$, $e(\alpha p, u) = \alpha e(p, u)$.
Proposition. If $u$ represents a continuous, locally non-satiated preference on $\mathbb{R}^L_+$, and $p \gg 0$, then the expenditure function $e(p, u)$ is homogeneous of degree one in $p$, strictly increasing in $u$ and non-decreasing in $p_\ell$ for $\ell = 1, \ldots, L$, concave in $p$, and continuous in $p$ and $u$.
Proof. Since scaling the prices by $\alpha > 0$ does not change the constraint set (i.e. the upper contour set at $u$), it does not affect the solution $x^*$. Hence $e(\alpha p, u) = \alpha p \cdot x^* = \alpha e(p, u)$, so that the expenditure function is homogeneous of degree one in $p$.
If $e(p, u)$ were not strictly increasing in $u$, then there would exist solutions $x'$ and $x''$, respectively at $u'$ and $u'' > u'$, such that $p \cdot x' \ge p \cdot x'' > 0$. (We maintain $u > u(0)$, so $x', x'' \ne 0$.) Continuity implies there is a bundle $x = \alpha x''$ for some $\alpha \in (0, 1)$ that satisfies $u(x) > u'$ and $p \cdot x < p \cdot x'' \le p \cdot x'$ (since $x < x''$). But then $x'$ cannot be a solution to EMP at $u'$.
Suppose $e(p, u)$ were strictly decreasing in $p_\ell$ for some $\ell$, i.e. there are two price vectors $p'$ and $p''$ differing only in that $p''_\ell \ge p'_\ell$, with $e(p'', u) < e(p', u)$. Let $x''$ be a solution at prices $p''$. Since the constraint set is not affected by the price difference, $x''$ is also feasible at prices $p'$, so
$$e(p', u) \le p' \cdot x'' \le p'' \cdot x'' = e(p'', u),$$
a contradiction.
If $x''$ solves EMP at $u$ with prices $p'' = \alpha p + (1 - \alpha) p'$ for $\alpha \in [0, 1]$, then
$$e(p'', u) = p'' \cdot x'' = \alpha p \cdot x'' + (1 - \alpha) p' \cdot x'' \ge \alpha e(p, u) + (1 - \alpha) e(p', u),$$
since $x''$ is available at prices $p$ and $p'$ (so expenditure with prices $p$ and $p'$ at the minimizing bundles $x$ and $x'$ cannot be larger than at $x''$). Hence the expenditure function is concave in $p$.
We do not prove continuity.

Concavity follows from the fact that, if $p$ is increased to $p'$ and $x$ kept fixed, expenditure (to maintain utility level $u$) increases linearly to $p' \cdot x$. The consumer can always attain $u$ at this expenditure, but may be able to adjust $x$ to $x'$ to attain $u$ at reduced expenditure, so $e(p', u) \le p' \cdot x$. Since this is true for all $p$ and $p'$, $e(p, u)$ increases at most linearly in $p$, thus is concave.
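Homogeneity of degree one and concavity in prices can be illustrated with the Cobb-Douglas case. The sketch below builds the expenditure function as $p \cdot h(p, u)$ from the Hicksian demand of the Cobb-Douglas example in Section 4.1, then compares the function along a price segment with the chord between its endpoints. Names and numbers are illustrative assumptions, and a passed check on one segment is evidence, not a proof.

```python
import math

def hicksian_demand(p1, p2, u, a=0.4):
    """Cobb-Douglas Hicksian demand for u(x) = x1**a * x2**(1 - a)."""
    x1 = (a / (1 - a) * p2 / p1) ** (1 - a) * u
    x2 = ((1 - a) / a * p1 / p2) ** a * u
    return x1, x2

def expenditure(p1, p2, u, a=0.4):
    """e(p, u) = p . h(p, u)."""
    x1, x2 = hicksian_demand(p1, p2, u, a)
    return p1 * x1 + p2 * x2

def concave_on_segment(p, q, u, steps=10):
    """Check e(t*p + (1-t)*q, u) >= t*e(p, u) + (1-t)*e(q, u) on a grid of t in [0, 1]."""
    for i in range(steps + 1):
        t = i / steps
        mix = expenditure(t * p[0] + (1 - t) * q[0], t * p[1] + (1 - t) * q[1], u)
        chord = t * expenditure(*p, u) + (1 - t) * expenditure(*q, u)
        if mix < chord - 1e-9:
            return False
    return True

p, q, u = (2.0, 3.0), (5.0, 1.0), 4.0  # illustrative prices and utility level
degree_one = math.isclose(expenditure(3 * p[0], 3 * p[1], u), 3 * expenditure(*p, u))
concave = concave_on_segment(p, q, u)
print(degree_one, concave)
```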
Exercise 14 (MWG 3.E.2). Confirm that the general properties of the Hicksian demand function and the expenditure function hold with Cobb-Douglas preferences.
Exercise 15 (MWG 3.E.6). For the CES (constant elasticity of substitution) utility function
$$u(x) = (x_1^{\rho} + x_2^{\rho})^{1/\rho},$$
derive the Hicksian demand function and the expenditure function, and verify their general properties in this case.
Exercise 16 (MWG 3.E.7). If preferences are quasilinear with respect to the first good, show that the Hicksian demand functions for the remaining goods are invariant to $u$, and find the form of the expenditure function.
4.3 Duality
The connection between Walrasian and Hicksian demand is that UMP and EMP have the same solutions when wealth $w$ in the UMP is fixed at the level of minimized expenditure $p \cdot x^* = e(p, u)$ in the EMP, and (equivalently) when utility $u$ in the EMP is fixed at the level of maximized (or indirect) utility $u(x^*) = v(p, w)$ in the UMP. I.e.
$$h(p, u) = x(p, e(p, u)) \quad \text{and} \quad x(p, w) = h(p, v(p, w)).$$
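Example. Compare the Walrasian and Hicksian demands for Cobb-Douglas preferences:
$$x_1^w = \alpha \frac{w}{p_1} \quad \text{and} \quad x_2^w = (1 - \alpha) \frac{w}{p_2}$$
and
$$x_1^h = \left( \frac{\alpha}{1 - \alpha} \frac{p_2}{p_1} \right)^{1-\alpha} u \quad \text{and} \quad x_2^h = \left( \frac{1 - \alpha}{\alpha} \frac{p_1}{p_2} \right)^{\alpha} u.$$
Equating either the $x_1$'s or the $x_2$'s gives
$$w = \left( \frac{p_1}{\alpha} \right)^{\alpha} \left( \frac{p_2}{1 - \alpha} \right)^{1-\alpha} u.$$
Since
$$u(x^w) = \left( \alpha \frac{w}{p_1} \right)^{\alpha} \left( (1 - \alpha) \frac{w}{p_2} \right)^{1-\alpha} = \left( \frac{\alpha}{p_1} \right)^{\alpha} \left( \frac{1 - \alpha}{p_2} \right)^{1-\alpha} w = v(p, w)$$
in the UMP, we see that $x^w = x^h$ if $u = u(x^w)$. On the other hand, since
$$p \cdot x^h = p_1 \left( \frac{\alpha}{1 - \alpha} \frac{p_2}{p_1} \right)^{1-\alpha} u + p_2 \left( \frac{1 - \alpha}{\alpha} \frac{p_1}{p_2} \right)^{\alpha} u$$
$$= \left[ \left( \frac{\alpha}{1 - \alpha} \right)^{1-\alpha} + \left( \frac{1 - \alpha}{\alpha} \right)^{\alpha} \right] p_1^{\alpha} p_2^{1-\alpha} u$$
$$= \left( \frac{p_1}{\alpha} \right)^{\alpha} \left( \frac{p_2}{1 - \alpha} \right)^{1-\alpha} u = e(p, u)$$
in the EMP, $x^w = x^h$ if $w = p \cdot x^h$.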
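The two duality identities can be verified numerically with the four Cobb-Douglas closed forms from this example. The sketch below is a hedged illustration: function names, the share `A`, and the numerical values are assumptions chosen for convenience.

```python
import math

A = 0.4  # illustrative Cobb-Douglas share (assumption)

def walrasian(p1, p2, w):
    return (A * w / p1, (1 - A) * w / p2)

def indirect_utility(p1, p2, w):
    return (A / p1) ** A * ((1 - A) / p2) ** (1 - A) * w

def hicksian(p1, p2, u):
    return ((A / (1 - A) * p2 / p1) ** (1 - A) * u,
            ((1 - A) / A * p1 / p2) ** A * u)

def expenditure(p1, p2, u):
    return (p1 / A) ** A * (p2 / (1 - A)) ** (1 - A) * u

def same_bundle(x, y):
    return all(math.isclose(s, t) for s, t in zip(x, y))

p1, p2, w, u = 2.0, 3.0, 12.0, 5.0

# x(p, w) = h(p, v(p, w)): fix utility at the maximized level.
ump_to_emp = same_bundle(walrasian(p1, p2, w), hicksian(p1, p2, indirect_utility(p1, p2, w)))

# h(p, u) = x(p, e(p, u)): fix wealth at the minimized expenditure.
emp_to_ump = same_bundle(hicksian(p1, p2, u), walrasian(p1, p2, expenditure(p1, p2, u)))

# e(p, v(p, w)) = w: the two value functions invert each other.
round_trip = math.isclose(expenditure(p1, p2, indirect_utility(p1, p2, w)), w)
print(ump_to_emp, emp_to_ump, round_trip)
```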
The UMP and EMP are generally equivalent in the following sense.
Proposition. If $u$ represents continuous, locally non-satiated preferences on $\mathbb{R}^L_+$, and $p \gg 0$, then:
(i) if $x^*$ solves the UMP at wealth $w > 0$, then $x^*$ solves the EMP at $u = u(x^*)$, and $e(p, u) = w$;
(ii) if $x^*$ solves the EMP at utility $u > u(0)$, then $x^*$ solves the UMP at $w = e(p, u)$, and $u(x^*) = u$.
Proof. (i) Suppose $x^*$ is a solution to UMP at $w$, but not to EMP at $u = u(x^*)$. Let instead $x'$ be a solution to EMP at $u = u(x^*)$, so that $p \cdot x' < p \cdot x^*$ and $u(x') \ge u$. Local nonsatiation implies that there exists $x''$ with $u(x'') > u(x')$, sufficiently close to $x'$ for $p \cdot x'' < p \cdot x^*$ to still hold. But then $x'' \in B_{p,w}$ and $u(x'') > u(x^*)$, so that $x^*$ was not a solution to UMP. By contradiction, $x^*$ solves EMP, and therefore $e(p, u) = p \cdot x^* = w$ (the last equality by Walras' law).
(ii) Conversely, suppose $x^*$ is a solution in EMP at $u > u(0)$, but not in UMP at $w = p \cdot x^*$. Let instead $x'$ be a solution to UMP at $w$, so that $u(x') > u(x^*)$ and $p \cdot x' \le p \cdot x^*$. By continuity, we can scale $x'$ down to $x'' = \alpha x'$, with $\alpha \in (0, 1)$ sufficiently close to 1, such that $u(x'') > u(x^*)$ still holds. (Note that $u > u(0)$ implies $x^* \ne 0$ and $x' \ne 0$, so $p \cdot x^* > 0$.) Since $x'' < x'$, we have $p \cdot x'' < p \cdot x' \le p \cdot x^*$, and $x^*$ cannot be a solution to EMP. By contradiction, $x^*$ solves UMP, and therefore $u(x^*) = u$ (by no excess utility).

Exercise 17 (MWG 3.E.9). Using the equivalence of the UMP and EMP, show that the general properties of the indirect utility function (homogeneous of degree zero, strictly increasing in $w$, non-increasing in prices, quasiconvex, continuous in $p$ and $w$) imply the general properties of the expenditure function (homogeneous of degree one in $p$, strictly increasing in $u$, non-decreasing in prices, concave in $p$, continuous in $p$ and $u$), and vice versa.
Exercise 18 (MWG 3.E.10). Using the equivalence of the UMP and EMP and the properties of indirect utility and expenditure functions, show that properties of the Walrasian demand function for continuous, locally non-satiated preferences (homogeneous of degree zero, Walras' law, convex / single-valued if preferences are convex / strictly convex) imply properties of the Hicksian demand function (homogeneous of degree zero in $p$, no excess utility, convex / single-valued if preferences are convex / strictly convex), and vice versa. (Note there is a typo in the book: you are to show that Proposition 3.D.2 implies Proposition 3.E.3, not 3.E.4.)
The relationship between UMP and EMP is just a special case of a far
more general theory of duality. In this connection, duality means that
a constrained maximization problem can be expressed as a constrained
minimization problem, swapping objective and constraint.
5 Comparative Statics
5.1 Wealth Effects
In this lecture, we examine the behavior of the demand function as
prices and wealth change. We assume for now that demand is single-
valued (i.e. preferences are strictly convex).
The wealth effect for the $\ell$th good is $\partial x_\ell(p, w) / \partial w$.
If the wealth effect is non-negative, i.e. $\partial x_\ell(p, w) / \partial w \ge 0$, we say that the good is normal. If the wealth effect is negative, i.e. $\partial x_\ell(p, w) / \partial w < 0$, we say that the good is inferior.
Goods are inferior when there are higher-quality, costlier substitutes
for the agent to switch to as wealth increases. (E.g. supermarket bread
vs. fresh bread from a bakery.)
The total wealth effect is given by the derivative vector with respect to $w$:
$$D_w x(p, w) = \begin{pmatrix} \partial x_1(p, w) / \partial w \\ \vdots \\ \partial x_L(p, w) / \partial w \end{pmatrix} \in \mathbb{R}^L.$$
5.2 Price Effects
The price effect of $p_k$ on the $\ell$th good is $\partial x_\ell(p, w) / \partial p_k$ (the own-price effect when $k = \ell$).
Typically, we expect the own-price effect to be negative. A good for which the own-price effect is positive, so that the agent consumes more of it after its price increases, is called a Giffen good.
A Giffen good is similar to an inferior good; in fact, it is just a very inferior good. As price increases, the agent's budget set shrinks, and if the wealth effect (having more of an inferior good when you are poorer) is sufficiently powerful, we have a Giffen good.
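The total price effect is given by the derivative matrix with respect to $p$:
$$D_p x(p, w) = \begin{pmatrix} \dfrac{\partial x_1(p, w)}{\partial p_1} & \cdots & \dfrac{\partial x_1(p, w)}{\partial p_L} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_L(p, w)}{\partial p_1} & \cdots & \dfrac{\partial x_L(p, w)}{\partial p_L} \end{pmatrix}.$$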
We consider now some relationships between price and wealth effects. Recall that Walrasian demand is homogeneous of degree zero. An implication is that changes to the consumption bundle in response to simultaneous (and proportionate) increases in prices and wealth cancel. This merely reflects the invariance of the budget constraint.
Proposition. If Walrasian demand $x(p, w)$ is homogeneous of degree zero, then $\forall p$ and $\forall w$,
$$D_p x(p, w) \, p + D_w x(p, w) \, w = 0.$$
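Proof. By zero-homogeneity, $\forall \alpha > 0$,
$$x(\alpha p, \alpha w) - x(p, w) = 0,$$
i.e.
$$\begin{pmatrix} x_1(\alpha p, \alpha w) \\ \vdots \\ x_L(\alpha p, \alpha w) \end{pmatrix} - \begin{pmatrix} x_1(p, w) \\ \vdots \\ x_L(p, w) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}.$$
Totally differentiating with respect to $\alpha$, we have for $\ell = 1, \ldots, L$,
$$\sum_{k=1}^{L} \frac{\partial x_\ell(\alpha p, \alpha w)}{\partial p_k} p_k + \frac{\partial x_\ell(\alpha p, \alpha w)}{\partial w} w = 0.$$
This is true, in particular, for $\alpha = 1$, where it corresponds to the claim in matrix notation.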

Walras' law implies that an increase in prices (while wealth remains fixed) must be accompanied by an offsetting decrease in consumption, and that an increase in wealth (while prices remain fixed) brings with it a corresponding increase in consumption.
To understand how the following results relate to this intuition, notice that a price increase for all commodities would increase the required wealth in proportion to $x(p, w)$, and therefore the decrease in required wealth from reduced consumption, $p \cdot D_p x(p, w)$, has to be of the same magnitude. Similarly, an increase in available wealth has to be matched by the increase in required wealth from greater consumption, $p \cdot D_w x(p, w)$.
Proposition. If Walrasian demand $x(p, w)$ satisfies Walras' law, then $\forall p$ and $\forall w$,
$$x(p, w)^T + p^T D_p x(p, w) = 0^T$$
and
$$p^T D_w x(p, w) = 1.$$
Proof. Differentiating Walras' law,
$$p \cdot x(p, w) = w,$$
first with respect to prices, we have for $\ell = 1, \ldots, L$
$$x_\ell(p, w) + \sum_{k=1}^{L} p_k \frac{\partial x_k(p, w)}{\partial p_\ell} = 0,$$
which corresponds to the first claim in matrix notation.
Differentiating Walras' law with respect to wealth, we have
$$\sum_{\ell=1}^{L} p_\ell \frac{\partial x_\ell(p, w)}{\partial w} = 1,$$
which corresponds to the second claim in matrix notation.
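The three derivative conditions (one from homogeneity of degree zero, two from Walras' law) can be checked with finite differences. The sketch below assumes a three-good Cobb-Douglas demand $x_\ell = a_\ell w / p_\ell$ with shares summing to one; the helper names, the step size `h`, and the tolerance are illustrative assumptions.

```python
def demand(p, w, a=(0.2, 0.3, 0.5)):
    """Cobb-Douglas Walrasian demand x_l = a_l * w / p_l (shares a_l sum to one)."""
    return [al * w / pl for al, pl in zip(a, p)]

def d_price(l, k, p, w, h=1e-6):
    """Central finite-difference estimate of d x_l / d p_k."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    return (demand(up, w)[l] - demand(dn, w)[l]) / (2 * h)

def d_wealth(l, p, w, h=1e-6):
    """Central finite-difference estimate of d x_l / d w."""
    return (demand(p, w + h)[l] - demand(p, w - h)[l]) / (2 * h)

p, w = [2.0, 3.0, 4.0], 10.0
x = demand(p, w)
goods = range(len(p))

# Homogeneity: sum_k (dx_l/dp_k) p_k + (dx_l/dw) w = 0 for every good l.
homogeneity = [sum(d_price(l, k, p, w) * p[k] for k in goods) + d_wealth(l, p, w) * w
               for l in goods]

# Walras' law, differentiated in p_l: x_l + sum_k p_k (dx_k/dp_l) = 0.
price_condition = [x[l] + sum(p[k] * d_price(k, l, p, w) for k in goods) for l in goods]

# Walras' law, differentiated in w: sum_l p_l (dx_l/dw) = 1.
wealth_condition = sum(p[l] * d_wealth(l, p, w) for l in goods)

print(homogeneity, price_condition, wealth_condition)
```

The printed residuals should be zero (and the last value one) up to finite-difference error.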

Exercise 19 (MWG 2.E.3). If $x(p, w)$ is homogeneous of degree zero, i.e. $\forall \alpha > 0$, $x(\alpha p, \alpha w) = x(p, w)$, and satisfies Walras' law, show that
$$p \cdot D_p x(p, w) \, p = -w.$$
Interpret.
Exercise 20 (MWG 2.E.4). If $x(p, w)$ is homogeneous of degree one with respect to $w$, i.e. $\forall \alpha > 0$, $x(p, \alpha w) = \alpha x(p, w)$, and satisfies Walras' law, show that $\varepsilon_{\ell w}(p, w) = 1$ for $\ell = 1, \ldots, L$. Interpret.
Exercise 21 (MWG 2.E.6). In the case of the demand function
$$x_1(p, w) = \frac{p_2}{p_1 + p_2 + p_3} \frac{w}{p_1}, \quad x_2(p, w) = \frac{p_3}{p_1 + p_2 + p_3} \frac{w}{p_2}, \quad x_3(p, w) = \frac{p_1}{p_1 + p_2 + p_3} \frac{w}{p_3},$$
verify the three derivative conditions for a demand function $x(p, w)$ that is homogeneous of degree zero and satisfies Walras' law.
5.3 Law of Demand
The "law of demand" refers to the intuitive property that a price increase should reduce demand for a commodity. This statement is, however, not generally valid for Walrasian demand.
Exceptions are the Giffen goods. They arise because a price increase effectively makes the agent poorer and may lead to an increase in the consumption of relatively cheap commodities.
In the Hicksian definition of demand, a price increase is accompanied by an increase in expenditure, so it does not "impoverish," and the Giffen effect is absent. Hence the two definitions of demand differ in comparative statics.
Hicksian demand is said to obey the "compensated law of demand," i.e. (single-valued) demand for any commodity does not increase if its price increases.
Proposition. If $u$ represents continuous, locally non-satiated preferences on $\mathbb{R}^L_+$, and the Hicksian demand correspondence $h(p, u)$ is single-valued for all $p \gg 0$, then $\forall p', p''$,
$$(p'' - p') \cdot (h(p'', u) - h(p', u)) \le 0.$$
Proof. Simply observe that
$$p'' \cdot h(p'', u) \le p'' \cdot h(p', u)$$
$$p' \cdot h(p', u) \le p' \cdot h(p'', u),$$
since $h(p'', u)$ minimizes expenditure when prices are $p''$, and $h(p', u)$ minimizes expenditure when prices are $p'$. Subtracting $p' \cdot h(p', u)$ on the right and $p' \cdot h(p'', u)$ on the left of the first inequality preserves signs and gives the result.
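The compensated law of demand can be spot-checked on a grid of price pairs. The sketch below uses the Cobb-Douglas Hicksian demand from Section 4.1 and evaluates $(p'' - p') \cdot (h(p'', u) - h(p', u))$ for every pair on a small grid. The grid, the utility level, and the function names are illustrative assumptions.

```python
def hicksian_demand(p, u, a=0.4):
    """Cobb-Douglas Hicksian demand for u(x) = x1**a * x2**(1 - a)."""
    p1, p2 = p
    return ((a / (1 - a) * p2 / p1) ** (1 - a) * u,
            ((1 - a) / a * p1 / p2) ** a * u)

def compensated_change(p_old, p_new, u):
    """(p_new - p_old) . (h(p_new, u) - h(p_old, u)), which should be non-positive."""
    h_old = hicksian_demand(p_old, u)
    h_new = hicksian_demand(p_new, u)
    return sum((pn - po) * (hn - ho)
               for po, pn, ho, hn in zip(p_old, p_new, h_old, h_new))

grid = [0.5, 1.0, 2.0, 4.0]
prices = [(s, t) for s in grid for t in grid]
worst = max(compensated_change(p, q, 3.0) for p in prices for q in prices)
print(worst)
```

The largest value over all pairs should not exceed zero beyond rounding error; it is attained when the two price vectors are proportional, because relative prices and hence the chosen bundle are then the same.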

5.4 Elasticity
The partial derivatives are not unit-free measures. E.g. $\partial x_\ell(p, w) / \partial p_k$ would depend on the currency in which we quote prices: clearly, a one-dollar increase in the price of petrol would cause a greater response in demand than a one-yen increase (which translates into about one cent). Moreover, $\partial x_\ell(p, w) / \partial p_k$ depends in this case on whether petrol is sold by the gallon or by the liter (about a quarter gallon).
An alternative way to describe price and wealth eects is in terms of
elasticities, which are unit-free.
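Definition. The price elasticity of demand $x_\ell(p, w)$ with respect to the price of the $k$th good is:
$$\varepsilon_{\ell k}(p, w) = \frac{\partial x_\ell(p, w)}{\partial p_k} \frac{p_k}{x_\ell(p, w)}.$$
If $\ell = k$, $\varepsilon_{\ell k}(p, w)$ is called "own-price elasticity." If $\ell \ne k$, $\varepsilon_{\ell k}(p, w)$ is called "cross-price elasticity." The wealth elasticity of demand $x_\ell(p, w)$ is:
$$\varepsilon_{\ell w}(p, w) = \frac{\partial x_\ell(p, w)}{\partial w} \frac{w}{x_\ell(p, w)}.$$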
Elasticities can be interpreted as the approximate relative percent change in two variables (this relationship holds exactly only in the limit, when the change is very small): e.g. the percent change in $x_\ell(p, w)$ in response to a small percent change in $p_k$.
Exercise 22 (MWG 2.E.8). Demonstrate that the price elasticity of demand $x_\ell(p, w)$ with respect to the price of the $k$th good can be expressed as
$$\varepsilon_{\ell k}(p, w) = \frac{\partial \ln (x_\ell(p, w))}{\partial \ln p_k},$$
and derive a corresponding expression for $\varepsilon_{\ell w}(p, w)$.
In terms of elasticities, we can restate the relationship between the responses in demand for commodity $\ell$ to changes in prices and wealth (as implied by homogeneity) as:
$$\sum_{k=1}^{L} \varepsilon_{\ell k}(p, w) + \varepsilon_{\ell w}(p, w) = 0.$$
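The elasticity identity can be illustrated numerically. The sketch below takes the three-good demand function stated in Exercise 21 (which is homogeneous of degree zero), estimates price and wealth elasticities by central finite differences, and sums them for each good. Helper names, step size, and the evaluation point are illustrative assumptions.

```python
def demand(p, w):
    """Demand function of Exercise 21: each good's budget share depends on another good's price."""
    total = sum(p)
    return [p[1] / total * w / p[0],
            p[2] / total * w / p[1],
            p[0] / total * w / p[2]]

def price_elasticity(l, k, p, w, h=1e-6):
    """Elasticity of x_l with respect to p_k, via a central finite difference."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    slope = (demand(up, w)[l] - demand(dn, w)[l]) / (2 * h)
    return slope * p[k] / demand(p, w)[l]

def wealth_elasticity(l, p, w, h=1e-6):
    """Elasticity of x_l with respect to w, via a central finite difference."""
    slope = (demand(p, w + h)[l] - demand(p, w - h)[l]) / (2 * h)
    return slope * w / demand(p, w)[l]

p, w = [2.0, 3.0, 4.0], 10.0
sums = [sum(price_elasticity(l, k, p, w) for k in range(3)) + wealth_elasticity(l, p, w)
        for l in range(3)]
print(sums)
```

Because elasticities are unit-free, rescaling `p` and `w` by a common factor leaves every term of each sum unchanged.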
5.5 Money Metric
In the remainder, we consider the welfare implications of price changes. First, we derive a quantitative measure of welfare change in terms of the expenditure function. Since data availability is normally a significant constraint, we are also interested in conditions under which a price change is welfare-improving that are based on minimal information (such as the initial price and demand and the new price).
Let the consumer's preference relation be rational, continuous, locally nonsatiated and the expenditure and indirect utility functions differentiable. Suppose wealth is fixed at $w > 0$, and the price changes from $p^0$ to $p^1$.
If the consumer's preferences are known, then she is (weakly) better off after the price change if and only if $v(p^1, w) - v(p^0, w) \ge 0$.
Due to the duality of the UMP and EMP, we can express the indirect utility at $p^1$ and $p^0$ in terms of the expenditure required to attain it. Recall that the expenditure function $e(p, u)$ is strictly increasing in $u$. Hence, at given prices, the expenditure function is itself an indirect utility function when $u$ is evaluated at $u = v(p, w)$.
I.e. we have, for any fixed price vector $\bar{p} \gg 0$,
$$e(\bar{p}, v(p, w)) \ge e(\bar{p}, v(p', w')) \iff v(p, w) \ge v(p', w'),$$
and $e(\bar{p}, v(p, w)) - e(\bar{p}, v(p', w'))$ is a meaningful quantitative measure of welfare change. It is the change in wealth needed, at prices $\bar{p}$, to attain the utility of the optimal bundle when the budget set changes from $B_{p',w'}$ to $B_{p,w}$.
The expenditure function evaluated at $u = v(p, w)$ is called the money metric (indirect utility function). It is independent of the utility representation we choose for the agent's preference, as all utility representations select the same consumption bundle at given prices and wealth. Hence it is unique up to the choice of $\bar{p}$.
Note that because indirect utility is decreasing in prices, so is the money metric. (Expenditure increases in its price arguments, but those prices are fixed in the money metric.)
Clearly, the agent is better off at prices $p^1$ than at prices $p^0$ if and only if $e(\bar{p}, v(p^1, w)) - e(\bar{p}, v(p^0, w)) \ge 0$ (for any indirect utility function), i.e. if a bundle that yields the maximal utility attainable at prices $p^1$ is more expensive at $\bar{p}$ than a bundle that yields the maximal utility attainable at prices $p^0$.
5.6 Welfare Comparisons
Let $u^0 = v(p^0, w)$ and $u^1 = v(p^1, w)$. Then we can define the change in welfare by
$$e(p^0, u^1) - e(p^0, u^0) = e(p^0, u^1) - e(p^1, u^1),$$
since $e(p^0, u^0) = e(p^0, v(p^0, w)) = w = e(p^1, v(p^1, w)) = e(p^1, u^1)$ through the equivalence of UMP and EMP.
This is the additional wealth the agent would have needed at the old prices $p^0$ in order to attain the level of utility $u^1$ that is available under the new prices $p^1$.
The following is also a plausible way to define the change in welfare from the money metric:
$$e(p^1, u^1) - e(p^1, u^0) = e(p^0, u^0) - e(p^1, u^0).$$
This is the wealth $w = e(p^0, u^0)$ less what the agent has to spend at the new prices $p^1$ in order to maintain the original utility $u^0$ that was available at the old prices $p^0$.
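Both welfare measures are easy to compute once $v$ and $e$ are available in closed form. The sketch below does so for a Cobb-Douglas consumer facing a fall in the price of good 1. The function names (including the conventional labels "equivalent variation" and "compensating variation" for the first and second measure), the share `A`, and the numbers are illustrative assumptions.

```python
import math

A = 0.4  # illustrative Cobb-Douglas share (assumption)

def indirect_utility(p, w):
    return (A / p[0]) ** A * ((1 - A) / p[1]) ** (1 - A) * w

def expenditure(p, u):
    return (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A) * u

def equivalent_variation(p0, p1, w):
    """First measure: e(p0, u1) - e(p1, u1), evaluated at the new utility level."""
    u1 = indirect_utility(p1, w)
    return expenditure(p0, u1) - expenditure(p1, u1)

def compensating_variation(p0, p1, w):
    """Second measure: e(p0, u0) - e(p1, u0), evaluated at the old utility level."""
    u0 = indirect_utility(p0, w)
    return expenditure(p0, u0) - expenditure(p1, u0)

p0, p1, w = (2.0, 3.0), (1.0, 3.0), 12.0  # price of good 1 falls from 2 to 1
ev = equivalent_variation(p0, p1, w)
cv = compensating_variation(p0, p1, w)
print(ev, cv)
```

Both numbers are positive for a price cut, but they differ: with Cobb-Douglas preferences good 1 is normal, so demand depends on the utility level at which the comparison is made.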
Note, however, that if there were wealth effects, then these measures would not coincide. Consider a price change in good 1 only. If $p_1^0 > p_1^1$, so that $u^1 > u^0$, then $u^1$ is associated with greater wealth at given prices. If good 1 is normal, this means more of it is demanded at the optimum. If good 1 is inferior, less of it is demanded. Hence the chosen bundles and their associated expenditures would depend on how $u$ is fixed.
Exercise 23 (MWG 3.I.3). Welfare change as measured by
$$EV(p^0, p^1, w) = e(p^0, u^1) - e(p^1, u^1)$$
is called the "equivalent variation" (EV), and
$$CV(p^0, p^1, w) = e(p^0, u^0) - e(p^1, u^0)$$
is called the "compensating variation" (CV). (Respectively, these tell us the change in wealth that is required to maintain utility at $u^1$ and $u^0$ after a price change.) Suppose the price of good $\ell$ falls (other prices remain fixed), giving the new price vector $p^1 \le p^0$. Demonstrate that $CV(p^0, p^1, w) > EV(p^0, p^1, w)$ if good $\ell$ is inferior.
Exercise 24 (MWG 3.I.5). Suppose $u(x)$ is quasilinear with respect to the first good (and $p_1 = 1$ is fixed). Show that $CV(p^0, p^1, w) = EV(p^0, p^1, w)$ for any prices $p^0$ and $p^1$, at all wealth levels $w$.
Exercise 25 (MWG 3.I.6). Let there be a population of consumers indexed by $i = 1, \ldots, I$, with utility functions $u_i(x)$ and individual wealths $w_i$. For any change in prices from $p^0$ to $p^1$ such that $\sum_i CV_i(p^0, p^1, w_i) > 0$, show that it is possible to compensate everyone for lost utility. I.e. there exist wealth levels $\{w_i'\}_{i=1}^{I}$ such that $\sum_i w_i' \le \sum_i w_i$ and $v_i(p^1, w_i') \ge v_i(p^0, w_i)$ for all $i$.
Absent wealth effects, the two definitions of welfare change are equivalent. We can use the fact that, for $\ell = 1, \ldots, L$,
$$\frac{\partial e(p, u)}{\partial p_\ell} = h_\ell(p, u)$$
(by the envelope theorem, since $e(p, u) = p \cdot h(p, u)$ is locally invariant to changes in its minimizer $h(p, u)$) to derive
$$e(p^0, u) - e(p^1, u) = \sum_{\ell=1}^{L} \int_0^{p_\ell^0} h_\ell(p, u) \, dp_\ell - \sum_{\ell=1}^{L} \int_0^{p_\ell^1} h_\ell(p, u) \, dp_\ell = \sum_{\ell=1}^{L} \int_{p_\ell^1}^{p_\ell^0} h_\ell(p, u) \, dp_\ell.$$
This is the change in consumer surplus as a result of the price change. See Figure 3 for illustrations of the consumer surplus from good 1 before and after a price cut.
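The integral formula can be checked numerically when only the price of good 1 changes. The sketch below integrates the Cobb-Douglas Hicksian demand for good 1 over the price interval with a simple midpoint rule and compares the area with the difference in expenditure. The integration routine, function names, and numbers are illustrative assumptions.

```python
def hicksian_good1(p1, p2, u, a=0.4):
    return (a / (1 - a) * p2 / p1) ** (1 - a) * u

def hicksian_good2(p1, p2, u, a=0.4):
    return ((1 - a) / a * p1 / p2) ** a * u

def expenditure(p1, p2, u, a=0.4):
    """e(p, u) = p . h(p, u) for Cobb-Douglas preferences."""
    return p1 * hicksian_good1(p1, p2, u, a) + p2 * hicksian_good2(p1, p2, u, a)

def midpoint_integral(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

p1_old, p1_new, p2, u = 2.0, 1.0, 3.0, 4.0  # price of good 1 is cut from 2 to 1

# Area under the compensated demand curve for good 1 between the two prices.
area = midpoint_integral(lambda q: hicksian_good1(q, p2, u), p1_new, p1_old)

# Difference in minimum expenditure at the fixed utility level u.
gap = expenditure(p1_old, p2, u) - expenditure(p1_new, p2, u)
print(area, gap)
```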
What if only partial information is available about demand, such as the initial price $p^0$ and choice $x^0 = x(p^0, w)$, and we wish to evaluate the impact of a change in prices to $p^1$?
Proposition. If preference is locally nonsatiated, then the agent is strictly better off under $(p^1, w)$ than under $(p^0, w)$ if $(p^1 - p^0) \cdot x^0 < 0$.
Proof. By Walras' law, $p^0 \cdot x^0 = w$, so $(p^1 - p^0) \cdot x^0 < 0$ implies $p^1 \cdot x^0 < w$. But then $x^0$ is in the interior of the budget set at $p^1$, and by local nonsatiation there exists a bundle $x \in B_{p^1,w}$ that is strictly preferred to $x^0$.

Exercise 26 (MWG 3.I.12). Extend this test of welfare improvement to changes in prices and wealth from $(p^0, w^0)$ to $(p^1, w^1)$, where it is now not necessarily the case that $w^1 = w^0$. (No need to do this for equivalent and compensating variation.)
Figure 3: Change in consumer surplus
The converse is not always true: $(p^1 - p^0) \cdot x^0 > 0$ does not imply that the agent is worse off after the price change. To appreciate the difference, look at Figure 4, where the two scenarios are depicted in price space.
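A small numerical example shows both the sufficient test and the failure of its converse. The sketch below assumes a Cobb-Douglas consumer; the two alternative price vectors are hypothetical values chosen so that the first passes the test $(p^1 - p^0) \cdot x^0 < 0$ and the second fails it even though welfare still improves.

```python
def demand(p, w, a=0.4):
    """Cobb-Douglas Walrasian demand."""
    return (a * w / p[0], (1 - a) * w / p[1])

def indirect_utility(p, w, a=0.4):
    return (a / p[0]) ** a * ((1 - a) / p[1]) ** (1 - a) * w

def passes_test(p0, p1, x0):
    """Sufficient condition for a welfare improvement: (p1 - p0) . x0 < 0."""
    return sum((new - old) * x for old, new, x in zip(p0, p1, x0)) < 0

p0, w = (2.0, 3.0), 12.0
x0 = demand(p0, w)

p1_small = (1.5, 3.2)  # old bundle becomes strictly cheaper
p1_large = (0.5, 4.6)  # old bundle becomes more expensive, yet substitution helps

small_passes = passes_test(p0, p1_small, x0)
large_passes = passes_test(p0, p1_large, x0)
small_better = indirect_utility(p1_small, w) > indirect_utility(p0, w)
large_better = indirect_utility(p1_large, w) > indirect_utility(p0, w)
print(small_passes, small_better, large_passes, large_better)
```

In the second scenario the old bundle is no longer affordable, but the large cut in $p_1$ lets the consumer substitute toward good 1 and reach a higher utility, which is exactly why failing the test is inconclusive.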
The set of prices that keep expenditure constant at $e(p^0, x^0)$ is drawn as a convex curve, because the expenditure function is concave in each price. (I.e. keeping $p_1$ fixed, expenditure increases at a diminishing rate as $p_2$ increases, hence smaller reductions in $p_1$ are required to offset increases of given size in $p_2$.)
Since $x^0$ is an optimal choice at $p^0$, the price vector $p^0$ attains $e(p^0, x^0)$, and therefore lies on the curve. The gradient of the fixed-expenditure curve at this point is $\nabla_p e(p^0, x^0) = x^0$, by the envelope theorem: since $x^0$ minimizes $e(p^0, x^0)$ at $p^0$, i.e. $\nabla_x e(p^0, x^0) = 0$, we have for $\ell = 1, 2$,
$$\frac{d e(p^0, x^0)}{d p_\ell} = \frac{\partial e(p^0, x^0)}{\partial p_\ell} + \frac{\partial e(p^0, x^0)}{\partial x_\ell} \frac{\partial x_\ell}{\partial p_\ell} = \frac{\partial e(p^0, x^0)}{\partial p_\ell} = x_\ell^0.$$
The gradient vector $x^0$ of $e(p, u^0)$ is orthogonal to the tangent of the level set $\{p \text{ s.t. } e(p, u^0) = e(p^0, u^0)\}$ at $p^0$. To see this, think of a vector in the tangent space (i.e. a vector in the direction of the tangent). It must be a multiple of $y = (1, -x_1^0 / x_2^0)$, since a one-unit increase in $p_1$ requires $p_2$ to be reduced by $x_1^0 / x_2^0$ to keep expenditure constant. Clearly, $x^0 \cdot y = 0$.
Figure 4: Scenarios $(p^1 - p^0) \cdot x^0 < 0$ (left) and $(p^1 - p^0) \cdot x^0 > 0$ (right)
Since $p^1 - p^0$ is the vector "from" $p^0$ to $p^1$, it must lie below the tangent to the level curve, which is orthogonal to $x^0$, when $(p^1 - p^0) \cdot x^0 < 0$, and above when $(p^1 - p^0) \cdot x^0 > 0$. The concavity of the expenditure function therefore implies that $e(p^1, u^0) < e(p^0, u^0)$ in the first case (so that there is a welfare improvement), but not necessarily $e(p^1, u^0) > e(p^0, u^0)$ in the second. As drawn in the right panel of Figure 4, welfare decreases, since $p^1$ lies above the level curve at $e(p^0, u^0)$, but if the price change (in the same direction) were sufficiently large, $p^1$ could lie beneath the level curve.
Exercise 27 (MWG 3.I.11). Suppose $x^1 = x(p^1, w)$ is known in addition to $p^0$, $p^1$ and $x^0$. Argue that the agent is worse off at $p^1$ than at $p^0$ if $(p^1 - p^0) \cdot x^1 > 0$, or equivalently if $p^0 \cdot (x^1 - x^0) < 0$ (wealth is fixed). Give graphic intuition for these results. (They can be established via first-order approximation of the expenditure function at $p^1$, but direct proofs are enough.)
6 Choice-Based Approach
6.1 Choice Structures
So far we have assumed that choice behavior is consistent with an underlying rational preference ordering. Since preferences are unobservable, their existence and properties are not directly testable. In this lecture, we will build a largely parallel theory of demand that starts from the properties of observable choices.
A choice structure $(\mathcal{B}, C(\cdot))$ consists of a family $\mathcal{B}$ of budget sets and a choice rule $C(\cdot)$ that assigns a nonempty subset $C(B) \subseteq B$ to every $B \in \mathcal{B}$.
In this context, a budget set $B \in \mathcal{B}$ may be thought of as a specific decision problem, one of many possible problems the agent may face. The problem the agent solves is to choose one or more elements from this set.
Since existence of a preference is not assumed, it is not meaningful to say that objects in $C(B)$ are indifferent to each other and preferred to everything else in $B$. The set $C(B)$ simply describes the objects the agent might be observed to choose when presented with budget set $B$, whatever the reasons.
One may define a "revealed preference" relation from a choice rule as follows.
Definition. $\succsim^*$ is the "revealed preference" derived from choice structure $(\mathcal{B}, C(\cdot))$ if $x \succsim^* y$ ("$x$ is revealed preferred to $y$") if and only if $x \in C(B)$ for some $B \in \mathcal{B}$ such that $x, y \in B$.
The relation $\succsim^*$ need have none of the rationality properties we assumed for primitive preferences. For example, $x$ and $y$ are only comparable if $x \in C(B)$ or $y \in C(B)$ for some $B \in \mathcal{B}$ such that $x, y \in B$. If $x$ and $y$ are never chosen, then we have no information about them. Hence $\succsim^*$ is not necessarily complete.


There is also no guarantee of transitivity. For example, if
$C(\{x, y\}) = \{x\}$, $C(\{y, z\}) = \{y\}$, and $C(\{x, z\}) = \{z\}$,
then $x \succ^* y$ and $y \succ^* z$, but $z \succ^* x$.
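The definition of revealed preference and the cyclic example can be checked mechanically. The following is a minimal sketch, assuming a finite choice structure encoded as a dictionary from budget sets to chosen sets; the names `choice_rule`, `revealed_preference`, `is_complete` and `is_transitive` are illustrative and not part of the notes.

```python
from itertools import product

# Hypothetical encoding: budget set (frozenset) -> set of chosen objects.
choice_rule = {
    frozenset("xy"): {"x"},
    frozenset("yz"): {"y"},
    frozenset("xz"): {"z"},
}

def revealed_preference(choice_rule):
    """Pairs (a, b) such that a is chosen from some budget set containing b."""
    return {(a, b)
            for budget, chosen in choice_rule.items()
            for a in chosen
            for b in budget}

def is_complete(relation, objects):
    """Every pair of objects (including a with itself) is comparable."""
    return all((a, b) in relation or (b, a) in relation
               for a, b in product(objects, repeat=2))

def is_transitive(relation):
    """a R b and b R d imply a R d."""
    return all((a, d) in relation
               for (a, b) in relation
               for (c, d) in relation
               if b == c)

R = revealed_preference(choice_rule)
print("complete:", is_complete(R, "xyz"))
print("transitive:", is_transitive(R))

# Dropping two budget sets leaves y and z incomparable with each other.
partial = revealed_preference({frozenset("xy"): {"x"}})
print("complete on fewer budgets:", is_complete(partial, "xyz"))
```

In this sketch the cyclic choices produce a relation that happens to be complete (every pair appears in some budget set) but fails transitivity, while the smaller family of budget sets fails completeness.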
Exercise 28 (MWG 2.F.4). Laspeyres and Paasche indices measure the
change in consumption between two points in time at fixed prices. Let
$p^0$ and $x^0$ denote prices and quantities at time 0, and let $p^1$ and
$x^1$ denote the new prices and quantities at time 1. The Laspeyres
index, $L_Q = (p^0 \cdot x^1) / (p^0 \cdot x^0)$, is based on initial
prices. The Paasche index, $P_Q = (p^1 \cdot x^1) / (p^1 \cdot x^0)$,
uses new prices. Consider also the expenditure change
$E_Q = (p^1 \cdot x^1) / (p^0 \cdot x^0)$, which allows prices to vary.
(If demands refer to aggregate consumption, this is the percent change
in GDP.) Argue:
(a) If $L_Q < 1$, then $x^0$ is revealed preferred to $x^1$.
(b) If $P_Q > 1$, then $x^1$ is revealed preferred to $x^0$.
(c) If $E_Q < 1$ or $E_Q > 1$, no revealed preference between $x^0$ and
$x^1$ can be established.
Given a family of budget sets and a preference relation $\succsim$, one
may derive a choice rule as follows.
Definition. $(\mathcal{B}, C^*(\cdot, \succsim))$ is the choice structure
generated by preference $\succsim$ if and only if
$\forall B \in \mathcal{B}$,
$$C^*(B, \succsim) = \{x \in B \text{ s.t. } \forall y \in B,\ x \succsim y\}.$$
In principle, the generated choice set could be empty for some
$B \in \mathcal{B}$. (I.e. there is no most-preferred element in the
budget set.) As we know, rationality and continuity of $\succsim$ ensure
that there exists a continuous utility representation and a solution to
the UMP, hence that $C^*(B, \succsim) \neq \emptyset$. Whenever we refer
to a generated choice structure in this lecture, we will implicitly
assume that it satisfies the definition of a choice structure, i.e. that
the generated choices are always nonempty.
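To make the generated choice rule concrete, here is a small sketch for finite budget sets. The preferences used (a utility index on three objects and a strict cycle) are hypothetical illustrations; `generated_choice` is an assumed helper name.

```python
def generated_choice(budget, weakly_prefers):
    """C*(B, R): elements of B that are weakly preferred to everything in B."""
    return {x for x in budget if all(weakly_prefers(x, y) for y in budget)}

# Hypothetical rational preference, represented by a utility index.
utility = {"x": 2, "y": 1, "z": 1}

def rational(a, b):
    return utility[a] >= utility[b]

# Hypothetical non-transitive preference: x over y, y over z, z over x.
cycle_pairs = {("x", "y"), ("y", "z"), ("z", "x")}

def cyclic(a, b):
    return a == b or (a, b) in cycle_pairs

print(generated_choice({"y", "z"}, rational))       # y and z are tied
print(generated_choice({"x", "y", "z"}, rational))  # unique best element
print(generated_choice({"x", "y", "z"}, cyclic))    # no best element
```

The last call returns the empty set, which is the case the text rules out by assumption: without transitivity a budget set need not contain a most-preferred element.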
6.2 Weak Axiom
In order to have anything substantive to say about observed choices, we
need to impose some minimal consistency. The weak axiom of revealed
preference says that, if $x$ is ever a choice when $y$ is available, then
$x$ must be a choice whenever $x$ is available and $y$ is a choice. In
other words, if we ever observe a preference for $x$ over $y$, then we
can never observe a strict preference for $y$ over $x$.
Definition. Choice structure $(\mathcal{B}, C(\cdot))$ satisfies the weak
axiom (WA) if, whenever $x \in C(B)$ for $B \in \mathcal{B}$ with
$x, y \in B$, and $y \in C(B')$ for $B' \in \mathcal{B}$ with
$x, y \in B'$, we also have $x \in C(B')$.
Exercise 29 (MWG 2.F.3). The following is partial information about a
consumer's purchases:
               Year 1               Year 2
          Quantity   Price     Quantity   Price
Good 1      100       100        120       100
Good 2      100       100         ?         80
Give the range of quantities of good 2, consumed in year 2, such that
(a) the choices violate the weak axiom,
(b) the consumption bundle in year 1 is revealed preferred to that in
year 2,
(c) the consumption bundle in year 2 is revealed preferred to that in
year 1,
(d) neither (a), (b) nor (c) can be concluded based on the data,
(e) given WA holds, good 1 is inferior at some price,
(f) given WA holds, good 2 is inferior at some price.
Exercise 30 (MWG 1.C.2). Argue that WA is equivalent to the following
property. If $B, B' \in \mathcal{B}$, with $x, y \in B \cap B'$, then
$x \in C(B)$ and $y \in C(B')$ imply
$\{x, y\} \subseteq C(B) \cap C(B')$.
Example. In the non-transitive scenario above, WA is not violated. Since
each pair ($\{x, y\}$, $\{y, z\}$, $\{x, z\}$) belongs to only one
$B \in \mathcal{B}$, the axiom is not tested. It would be a different
matter if we added a fourth budget set $\{x, y, z\}$ to $\mathcal{B}$.
Now any choice rule violates WA. If $x \in C(\{x, y, z\})$, then WA
stipulates that $x$ is a choice for any subset of $\{x, y, z\}$ that
contains $x$ and where $y$ or $z$ is a choice. But this was not the case,
since $C(\{x, z\}) = \{z\}$. The same argument applies to
$y \in C(\{x, y, z\})$ and $z \in C(\{x, y, z\})$, so that
$C(\{x, y, z\})$ must be empty (which is impossible by definition).
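The claim in this example can be verified by brute force. The sketch below assumes the same dictionary encoding of a finite choice structure as before (budget set mapped to chosen set); `satisfies_wa` is an illustrative helper, not notation from the notes.

```python
from itertools import combinations, product

def satisfies_wa(choice_rule):
    """Check the weak axiom on a finite choice structure.

    choice_rule maps each budget set (frozenset) to its set of chosen objects.
    """
    for (B1, C1), (B2, C2) in product(choice_rule.items(), repeat=2):
        common = B1 & B2
        for x in C1 & common:          # x chosen from B1 and available in B2
            for y in C2 & common:      # y chosen from B2 and available in B1
                if x not in C2:        # WA requires x to be chosen from B2 too
                    return False
    return True

pairwise = {
    frozenset("xy"): {"x"},
    frozenset("yz"): {"y"},
    frozenset("xz"): {"z"},
}
print("pairwise cycle satisfies WA:", satisfies_wa(pairwise))

# Try every nonempty choice from {x, y, z} as a fourth budget set.
triple = frozenset("xyz")
candidates = [set(c) for r in (1, 2, 3) for c in combinations("xyz", r)]
results = [satisfies_wa({**pairwise, triple: c}) for c in candidates]
print("some choice from {x, y, z} satisfies WA:", any(results))
```

The first check passes because no two budget sets share a pair of objects; once the three-element set is added, every one of the seven candidate choice sets triggers a violation.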
Hence, for an arbitrary family of budget sets $\mathcal{B}$, it is not
the case that the revealed preference $\succsim^*$ derived from a choice
structure $(\mathcal{B}, C(\cdot))$ which satisfies WA is necessarily
rational. The choice rule needs to be defined on a sufficiently
comprehensive family of budget sets. (This makes it more restrictive, in
the sense that it has to "commit" to a choice in more decision
situations.)
Proposition. Let $\mathcal{B}$ include all subsets of $X$ that contain
one, two or three elements. Then, if the choice structure
$(\mathcal{B}, C(\cdot))$ satisfies WA, the revealed preference
$\succsim^*$ derived from it is rational.
Proof. The relation $\succsim^*$ derived from $(\mathcal{B}, C(\cdot))$
is complete: $\forall x, y \in X$, we have $\{x, y\} \in \mathcal{B}$.
Either $x \in C(\{x, y\})$, i.e. $x \succsim^* y$, or
$y \in C(\{x, y\})$, i.e. $y \succsim^* x$, or both. Moreover,
$\succsim^*$ derived from $(\mathcal{B}, C(\cdot))$ is transitive:
suppose $x \succsim^* y$ and $y \succsim^* z$. Then there exists $B$ such
that $x, y \in B$ and $x \in C(B)$, and there exists $B'$ such that
$y, z \in B'$ and $y \in C(B')$. Consider the budget set $\{x, y, z\}$.
If $z \in C(\{x, y, z\})$, then WA requires $y \in C(\{x, y, z\})$. If
$y \in C(\{x, y, z\})$, then WA requires $x \in C(\{x, y, z\})$. Hence
$x \in C(\{x, y, z\})$, whereby $x \succsim^* z$. ∎
Since the revealed preference ordering is fully determined by choices
from two-object sets, it is in fact unique. (Provided every possible set
of two objects is included in $\mathcal{B}$.)
Exercise 31 (MWG 1.C.3). Let choice structure $(\mathcal{B}, C(\cdot))$
satisfy WA, and define revealed strict preference $\succ^*$ such that
$x \succ^* y \iff \exists B \in \mathcal{B}$ with $x, y \in B$,
$x \in C(B)$ and $y \notin C(B)$.
(a) Compare $\succ^*$ to $\succ^{**}$ defined such that
$x \succ^{**} y \iff x \succsim^* y$ and not $y \succsim^* x$ (where
$\succsim^*$ is the revealed preference derived from the choice
structure). Show that the two definitions are equivalent. Does this
depend on WA?
(b) Give an example where $\succ^*$ is not transitive.
(c) Argue that $\succ^*$ is transitive if $\mathcal{B}$ includes all
three-element subsets of $X$.
When the choice rule $C(\cdot)$ in choice structure
$(\mathcal{B}, C(\cdot))$ is generated by a rational preference
$\succsim$, i.e. $\forall B \in \mathcal{B}$, $C(B) = C^*(B, \succsim)$,
we say that $\succsim$ rationalizes $C(\cdot)$.
One can think of generating a choice structure from a rational
preference as simulating empirical choice data with a rational preference
model. If the model exactly predicts choices, then the choices are
explained (rationalized) by the model (preference).
If the revealed preference $\succsim^*$ derived from choice structure
$(\mathcal{B}, C(\cdot))$ is rational, then it rationalizes
$(\mathcal{B}, C(\cdot))$. This is not quite obvious; we must verify that
$\succsim^*$ generates $(\mathcal{B}, C(\cdot))$. In fact, that statement
is true only if WA holds for $(\mathcal{B}, C(\cdot))$.
Proposition. The revealed preference $\succsim^*$ derived from a choice
structure $(\mathcal{B}, C(\cdot))$ that satisfies WA generates
$(\mathcal{B}, C(\cdot))$.
Proof. Our task is to show that $\forall B \in \mathcal{B}$,
$C(B) = C^*(B, \succsim^*)$. Let $x \in C(B)$. Since $x \succsim^* y$,
$\forall y \in B$, we have $x \in C^*(B, \succsim^*)$. Thus
$C(B) \subseteq C^*(B, \succsim^*)$. In the other direction, let
$x \in C^*(B, \succsim^*)$, i.e. $x \succsim^* y$, $\forall y \in B$.
Then there must exist, for every $y \in B$, a budget set $B_y$ such that
$x, y \in B_y$ and $x \in C(B_y)$. Now, some such $y \in B$ must be
chosen from $B$, i.e. $y \in C(B)$. By WA, this implies $x \in C(B)$.
Thus $C^*(B, \succsim^*) \subseteq C(B)$. Combining the inclusions,
$C(B) = C^*(B, \succsim^*)$. ∎
Hence, if WA holds for the choice structure, then the revealed preference
derived from it is rational and generates the choice structure. I.e. WA
implies that the choice structure is rationalizable (provided the choice
rule covers all sets of up to three objects).
It turns out that WA is not only sufficient (with restrictions on
$\mathcal{B}$), but also necessary for a choice structure to be
rationalizable.
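The sufficiency result can be illustrated on a small finite example: derive the revealed preference from a choice structure and confirm that it regenerates the original choices. This is only a sketch under assumed encodings (objects as characters, a hypothetical ranking `rank`); the helper names are not from the notes.

```python
from itertools import combinations

def revealed(choice_rule):
    """Revealed preference: (a, b) if a is chosen from a budget set containing b."""
    return {(a, b) for B, C in choice_rule.items() for a in C for b in B}

def generated(budget, relation):
    """C*(B, R) for a relation given as a set of ordered pairs."""
    return {x for x in budget if all((x, y) in relation for y in budget)}

# Hypothetical ranking (lower is better) on X = {x, y, z}; the choice rule
# picks the best-ranked object from every subset with one to three elements.
rank = {"x": 0, "y": 1, "z": 2}
consistent = {frozenset(B): {min(B, key=rank.get)}
              for r in (1, 2, 3) for B in combinations("xyz", r)}

R = revealed(consistent)
regenerated = all(generated(B, R) == C for B, C in consistent.items())
print("revealed preference regenerates the choices:", regenerated)

# A structure that violates WA: cyclic pairwise choices plus a choice from X.
cyclic = {frozenset("xy"): {"x"}, frozenset("yz"): {"y"},
          frozenset("xz"): {"z"}, frozenset("xyz"): {"x"}}
R_cyc = revealed(cyclic)
mismatch = [sorted(B) for B, C in cyclic.items() if generated(B, R_cyc) != C]
print("budget sets not regenerated:", mismatch)
```

In the second structure the derived relation treats all three objects as mutually revealed preferred, so the regenerated choice sets are too large; this is the sense in which the statement fails without WA.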
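Proposition. A choice structure $(\mathcal{B}, C^*(\cdot, \succsim))$
that is generated by a rational preference $\succsim$ satisfies WA.
Proof. Suppose $x \in C^*(B, \succsim)$ for $B \in \mathcal{B}$ and
$y \in B$. Since the choice structure
$(\mathcal{B}, C^*(\cdot, \succsim))$ is generated by $\succsim$,
$x \succsim y$. Let $y \in C^*(B', \succsim)$ and $x \in B'$. WA requires
that $x \in C^*(B', \succsim)$. Suppose now that $\succsim$ is rational,
i.e. in particular transitive. Because $y \in C^*(B', \succsim)$, we have
$\forall z \in B'$, $y \succsim z$ and by transitivity $x \succsim z$.
Then $x \in C^*(B', \succsim)$, so WA holds. ∎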
Put together, these results establish that a choice rule, defined (at
least) on all sets with up to three elements, reflects a rational
preference if and only if it satisfies WA.
One might be tempted to conclude that the preference- and choice-based
approaches are basically equivalent, since we could include all possible
budget sets in $\mathcal{B}$. But in the theory of demand, budget sets
have a special form (they satisfy $p \cdot x \leq w$), which is
restrictive (e.g. convex). Arbitrary budget sets may not make sense, nor
does a solution to the UMP necessarily exist for them.
In a meaningful sense, the preference-based approach (using rational-
ity) is less general than the choice-based approach (using the weak
axiom). Rational preference always gives us the weak axiom in a gen-
erated choice structure, but choices that satisfy the weak axiom need
not be consistent with rational preference.
Exercise 32 (MWG 1.D.4). If choice structure $(\mathcal{B}, C(\cdot))$ is
rationalizable, show that it satisfies path-invariance:
$\forall \{B_1, B_2\} \subseteq \mathcal{B}$ such that
$B_1 \cup B_2 \in \mathcal{B}$ and $C(B_1) \cup C(B_2) \in \mathcal{B}$,
it is the case that $C(B_1 \cup B_2) = C(C(B_1) \cup C(B_2))$.
Exercise 33 (MWG 1.D.5). On the set of objects $X = \{x, y, z\}$, define
the family of budget sets
$\mathcal{B} = \{\{x, y\}, \{y, z\}, \{z, x\}\}$. Think of the choice
rule $C(\cdot)$ as assigning, to each budget set $B \in \mathcal{B}$, a
probability distribution $C(B)$ over objects in $B$ (listed in the order
in which the objects are written). This stochastic choice rule
$C(\cdot)$ is said to be rationalizable by preferences if there exists a
probability distribution over (strict) preference relations on $X$ (here
there are six such relations) that $\forall B \in \mathcal{B}$ induces
$C(B)$.
(a) Show that $C(\{x, y\}) = C(\{y, z\}) = C(\{z, x\}) = (1/2, 1/2)$ is
rationalizable in this sense.
(b) Show that $C(\{x, y\}) = C(\{y, z\}) = C(\{z, x\}) = (1/4, 3/4)$ is
not rationalizable in this sense.
(c) Find $\underline{\alpha}, \bar{\alpha} \in [0, 1]$ such that
$C(\{x, y\}) = C(\{y, z\}) = C(\{z, x\}) = (\alpha, 1 - \alpha)$ is
rationalizable if and only if
$\alpha \in [\underline{\alpha}, \bar{\alpha}]$.
6.3 Relationship with the Law of Demand
It seems plausible that Walrasian demand could satisfy a version of the
law of demand if we mimic the compensation for price changes that is
implicit in Hicksian demand by adjusting wealth with prices.
This intuition is correct. In addition to homogeneity of degree zero and
Walras' law, the weak axiom is the minimal property we need to impose
on (single-valued) Walrasian demand to make it satisfy a compensated
law of demand.
In the context of (single-valued) demand, WA has the following specific
form: $\forall (p, w)$ and $\forall (p', w')$,
$$p \cdot x(p', w') \leq w \text{ and } x(p, w) \neq x(p', w') \implies p' \cdot x(p, w) > w'.$$
I.e. if $x(p', w')$ was available at prices $p$ and wealth $w$, but
$x(p, w)$ was chosen instead, then $x(p, w)$ must be unavailable at
prices $p'$ and wealth $w'$, when $x(p', w')$ is chosen.
Notice how the single-valuedness of choices affects WA: since we cannot
require that $x \in C(B')$ when $x \in C(B)$ with $y \in B$, and
$y \in C(B')$ with $x \in B'$, we stipulate instead that $x \notin B'$.
Single-valuedness implies, when $x$ is chosen over $y$ in $B$, that $x$
is revealed strictly preferred to $y$, and strict preference is
asymmetric.
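For a finite set of observed (price, wealth, bundle) triples, the demand version of WA can be tested pairwise. The following sketch uses made-up observations; `wa_violations` and the data are hypothetical and only illustrate the inequality stated above.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def wa_violations(observations):
    """Index pairs (i, j), i < j, that violate WA for single-valued demand.

    Each observation is a tuple (prices, wealth, chosen bundle). A violation
    occurs when the two bundles differ and each is affordable in the other's
    budget set.
    """
    bad = []
    for i, (p, w, x) in enumerate(observations):
        for j in range(i + 1, len(observations)):
            p2, w2, x2 = observations[j]
            if x != x2 and dot(p, x2) <= w and dot(p2, x) <= w2:
                bad.append((i, j))
    return bad

# Hypothetical data: both bundles exhaust wealth 10 at their own prices.
violating = [((1, 2), 10, (2, 4)), ((2, 1), 10, (4, 2))]
consistent = [((1, 2), 10, (4, 3)), ((2, 1), 10, (3, 4))]

print(wa_violations(violating))
print(wa_violations(consistent))
```

In the first data set each bundle costs 8 at the other observation's prices, so each was affordable when the other was chosen; in the second, each bundle costs 11 at the other prices and WA is not tested.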
Exercise 34 (MWG 2.F.12). Verify that a Walrasian demand function
$x(p, w)$ which is generated by a rational preference satisfies WA.
Exercise 35 (MWG 2.F.14). Argue that a Walrasian demand function
$x(p, w)$ that satisfies WA is homogeneous of degree zero.
Figure 5 illustrates how WA restricts choices given budget sets
$B_{p,w}$ and $B_{p',w'}$. The top left panel depicts the budget sets;
the other panels show all possible locations for $x = C(B_{p,w})$ and
$x' = C(B_{p',w'})$ that satisfy WA. In the top right panel,
$x \in B_{p,w} \cap B_{p',w'}$. Since $x$ is available in both budget
sets, the (single) choice of $x'$ in $B_{p',w'}$ reveals it (strictly)
preferred to $x$. Then $x' \notin B_{p,w}$, else not choosing $x'$
violates WA. The same type of argument applies in the other cases.
Before I give an analytical proof of the equivalence of WA and the
compensated law of demand (CLD), a graphic sketch will be helpful.
Suppose we start with bundle $x$, which lies on the boundary of the
budget set $B_{p,w}$, in accordance with Walras' law. Consider a
compensated price change to $p'$, where wealth is adjusted to $w'$ such
that $x$ remains affordable. The pivoting of the budget line through $x$
is visible in Figure 6.
In the left panel, the set of bundles that satisfies WA is dotted. Since
$x$ is affordable in both $B_{p,w}$ and $B_{p',w'}$ (by the construction
of the compensation), and was revealed preferred to everything in
$B_{p,w}$, the choice $x'$ in $B_{p',w'}$ must lie outside $B_{p,w}$ if
it is distinct from $x$. (Otherwise, $x'$ would have been chosen prior to
the compensated price change.) If we assume that $x'$ also satisfies
Walras' law, then it lies on the bold-striped portion of the boundary of
$B_{p',w'}$.
In the right panel, the set of bundles that satisfies CLD is dotted.
Since $p'_2 < p_2$ as drawn, and the price change is compensated, CLD
requires that $x'_2 > x_2$. Then $x'$ must lie in the triangle above $x$
and below the new budget line. By Walras' law, $x'$ belongs to the
bold-striped portion of the boundary of $B_{p',w'}$. This is exactly the
same set that satisfies WA.
Are we done? Actually no. We have only shown that WA is equivalent to CLD
for compensated price changes. But WA is a property that pertains to
arbitrary (possibly uncompensated) price changes. We must still
demonstrate that such price changes will also satisfy WA.
This point can be made via the contrapositive: if a price change violates
WA, then we can construct a compensated price change that violates CLD,
contrary to the assumption. (Hence no price change can violate WA.)
Figure 5: WA for single-valued demand
Figure 6: Equivalence of WA and CLD for compensated price changes
In Figure 7, we have a situation that is at odds with WA: $x$ and $x'$
both belong to $B_{p,w}$ and $B_{p',w'}$ (other violations are
straightforward to deal with). We can construct a new budget set
$B_{p'',w''}$ that represents a compensated price change for both $x$ and
$x'$ (graphically, the boundary of $B_{p'',w''}$ pivots through both $x$
and $x'$).
One can see that $p''_2 < p_2$ and $p''_2 > p'_2$. Hence CLD requires,
for these compensated changes in demand, that $x''_2 > x_2$ and
$x''_2 < x'_2$. I.e. $x''$ must lie in both dotted triangles, above $x$
and below $x'$. But these are disjoint sets, so CLD is violated. Since
failures of WA (with arbitrary price changes) lead to contradiction when
CLD holds for compensated price changes, CLD is in fact sufficient for
WA.
Proposition. For a Walrasian demand function $x(p, w)$ that is
homogeneous of degree zero and satisfies Walras' law, WA holds if and
only if $\forall (p, w)$ and $\forall (p', w')$ such that
$w' = p' \cdot x(p, w)$ and $x(p, w) \neq x(p', w')$,
$$(p' - p) \cdot (x(p', w') - x(p, w)) < 0.$$
Figure 7: Impossibility of WA failures
Proof. Since $x(p', w')$ satisfies Walras' law, we have
$p' \cdot x(p', w') = w' = p' \cdot x(p, w)$, so that the CLD inequality
reduces to the equivalent inequality
$$p \cdot (x(p', w') - x(p, w)) > 0.$$
(If) WA holds vacuously if $x(p, w) = x(p', w')$ and is satisfied if
$p' \cdot x(p, w) > w'$ or $p \cdot x(p', w') > w$. Consider therefore
$(p, w)$ and $(p', w')$ where $x(p, w) \neq x(p', w')$ and
$w' \geq p' \cdot x(p, w)$. If $w' = p' \cdot x(p, w)$, then the
inequality applies, and $p \cdot x(p', w') \leq w$ would imply
$p \cdot x(p, w) < w$, violating Walras' law. Hence
$p \cdot x(p', w') > w$, so that WA is not tested. Similarly for the case
$w = p \cdot x(p', w')$.
Thus we can concentrate on $p' \cdot x(p, w) < w'$ and
$p \cdot x(p', w') < w$. This appears to be a violation of WA, but we
will show that there exists a compensated price change for which the CLD
cannot hold. Hence the scenario does not satisfy the assumptions.
Given the strict inequalities, there exists $\alpha \in (0, 1)$ such that
$$(\alpha p + (1 - \alpha) p') \cdot x(p, w) = (\alpha p + (1 - \alpha) p') \cdot x(p', w')$$
(because at $\alpha$ sufficiently close to 1, the left side is close to
$p \cdot x(p, w) = w$ and exceeds the right side, which is close to
$p \cdot x(p', w') < w$; and at $\alpha$ sufficiently close to 0, the
right side is close to $p' \cdot x(p', w') = w'$ and exceeds the left
side, which is close to $p' \cdot x(p, w) < w'$. The equalities are due
to Walras' law, and both sides are continuous in $\alpha$).
Let $p'' = \alpha p + (1 - \alpha) p'$ and
$w'' = (\alpha p + (1 - \alpha) p') \cdot x(p', w') = (\alpha p + (1 - \alpha) p') \cdot x(p, w)$.
Since
$$p'' \cdot x(p, w) = w''$$
and
$$p'' \cdot x(p', w') = w'',$$
we have constructed compensated price changes from $p$ to $p''$ and from
$p'$ to $p''$.
By Walras' law, $p'' \cdot x(p'', w'') = w''$, so
$$p'' \cdot (x(p'', w'') - x(p, w)) = 0$$
and
$$p'' \cdot (x(p'', w'') - x(p', w')) = 0.$$
Because $p$ to $p''$ and $p'$ to $p''$ are compensated price changes, CLD
(in its reduced form) must hold for both:
$$p \cdot (x(p'', w'') - x(p, w)) > 0$$
and
$$p' \cdot (x(p'', w'') - x(p', w')) > 0.$$
From the definition of $p''$, we have
$p = (1/\alpha) p'' - ((1 - \alpha)/\alpha) p'$. Therefore, CLD for $p$
to $p''$ implies (after multiplying by $\alpha > 0$)
$$(p'' - (1 - \alpha) p') \cdot (x(p'', w'') - x(p, w)) > 0,$$
and, since $p'' \cdot (x(p'', w'') - x(p, w)) = 0$, this means
$$p' \cdot x(p, w) > p' \cdot x(p'', w'').$$
Now, since $p' \cdot x(p', w') = w' > p' \cdot x(p, w)$ by assumption, it
follows that
$$p' \cdot x(p', w') > p' \cdot x(p'', w'').$$
But then the CLD for $p'$ to $p''$ cannot hold. Hence
$p' \cdot x(p, w) < w'$ and $p \cdot x(p', w') < w$ is not possible, and
we have shown that WA holds for all $(p, w)$ and $(p', w')$ if CLD holds
for all compensated price changes (and Walras' law is in effect).
(Only if) Because $p' \cdot x(p, w) = w'$, the bundle $x(p, w)$ is
available at $(p', w')$. But since $x(p', w')$ is chosen instead at
$(p', w')$ (revealing it preferred), WA requires that $x(p', w')$ is
unavailable at $(p, w)$, i.e. $p \cdot x(p', w') > w$. From Walras' law,
$p \cdot x(p, w) = w$, which gives the strict inequality
$$p \cdot (x(p', w') - x(p, w)) > 0.$$ ∎
Observe that we are not comparing initial demand $x(p, w)$ to the
uncompensated demand $x(p', w)$ after the price change to $p'$. Instead,
wealth is adjusted to $w' = p' \cdot x(p, w)$, so that the initial bundle
is still in the budget set at the new prices.
The remarkable aspect of the result is that nothing more than WA is
needed for the compensated law of (Walrasian) demand (other than
homogeneity of degree zero and Walras' law, which require only that
preference is locally nonsatiated). In particular, preference does not
have to be rational or continuous.
Since it is an "if and only if" proposition, WA can be said to be
equivalent to the compensated law of demand under local nonsatiation.
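A quick numerical illustration of the compensated law of demand: the sketch below uses a Cobb-Douglas demand function (an assumed example with expenditure share 0.3 on good 1, not taken from this passage), applies the Slutsky compensation $w' = p' \cdot x(p, w)$, and evaluates $(p' - p) \cdot (x(p', w') - x(p, w))$ on a few hand-picked price changes.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def demand(p, w, share=0.3):
    """Cobb-Douglas Walrasian demand (assumed example): fixed budget shares."""
    return (share * w / p[0], (1 - share) * w / p[1])

def cld_term(p, w, p_new):
    """(p' - p) . (x(p', w') - x(p, w)) with Slutsky-compensated wealth w'."""
    x = demand(p, w)
    w_new = dot(p_new, x)          # the old bundle stays exactly affordable
    x_new = demand(p_new, w_new)
    dp = [b - a for a, b in zip(p, p_new)]
    dx = [b - a for a, b in zip(x, x_new)]
    return dot(dp, dx)

p, w = (1.0, 1.0), 10.0
new_prices = [(0.5, 1.0), (2.0, 1.0), (1.0, 0.5), (1.0, 3.0), (2.0, 3.0)]
terms = [cld_term(p, w, q) for q in new_prices]
for q, t in zip(new_prices, terms):
    print(q, round(t, 4))
```

Because this demand is generated by a rational preference, it satisfies WA, and every term should come out strictly negative for price changes that are not proportional to the original prices.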
Exercise 36 (MWG 2.F.5). Let a differentiable Walrasian demand function
$x(p, w)$ satisfy homogeneity of degree zero, Walras' law and the weak
axiom. Suppose $x(\cdot, \cdot)$ is also homogeneous of degree one with
respect to wealth $w$, so that consumption of all goods increases
proportionately: $\forall p, w$ and $\forall \alpha > 0$,
$x(p, \alpha w) = \alpha x(p, w)$. Show that the law of demand holds also
for uncompensated price changes: $\forall p, p'$,
$(p' - p) \cdot (x(p', w) - x(p, w)) \leq 0$.
Exercise 37 (MWG 2.F.13). Consider a multivalued Walrasian demand
correspondence $x(p, w)$.
(a) Give the Walrasian version of WA in this case, generalized to choice
sets.
(b) If $x(p, w)$ satisfies WA and Walras' law, show that $x(p, w)$ also
has the following property: $\forall x \in x(p, w)$ and
$\forall x' \in x(p', w')$,
$$p \cdot x' < w \implies p' \cdot x > w'.$$
(c) Moreover, show that if $x(p, w)$ satisfies WA and Walras' law, then
the following compensated law of demand holds: $\forall (p, w)$ and
$\forall x \in x(p, w)$, and $\forall (p', w')$ such that
$w' = p' \cdot x$, and $\forall x' \in x(p', w')$,
$$(p' - p) \cdot (x' - x) \leq 0.$$
6.4 Strong Axiom
A choice-based theory that is fully equivalent to the preference-based
approach with rationality can be based on the strong axiom of revealed
preference.
Definition. Choice structure $(\mathcal{B}, C(\cdot))$ satisfies the
strong axiom (SA) if, for any collection of budget sets
$\{B_1, \ldots, B_N\} \subseteq \mathcal{B}$, and $x_1, \ldots, x_N$ such
that $x_n \in C(B_n)$ for $n = 1, \ldots, N$ and $x_{n+1} \in B_n$ for
$n = 1, \ldots, N - 1$, we have $x_1 \in C(B_N)$ if $x_1 \in B_N$.
SA is just WA if we consider only collections of $N = 2$ budget sets
$\{B, B'\} \subseteq \mathcal{B}$: then $x \in C(B)$ and $y \in C(B')$
with $y \in B$ and $x \in B'$ implies $x \in C(B')$. I.e. if $x_1$ is
chosen when $x_2$ is available, then $x_1$ must be chosen whenever it is
available and $x_2$ is chosen.
Beyond WA, SA says, if $x_1$ is chosen when $x_2$ is available, and $x_2$
is chosen when $x_3$ is available (so that $x_1$ is indirectly revealed
preferred to $x_3$), then $x_1$ must be chosen whenever it is available
and $x_3$ is chosen. (And we could make the chain arbitrarily long.) You
will recognize the flavor of transitivity in SA.
Transitivity is precisely what WA failed to guarantee in a revealed
preference relation derived from it. SA, however, is sufficient without
any requirements on the domain of the choice rule. Thus, if the choice
structure $(\mathcal{B}, C(\cdot))$ satisfies SA, then the revealed
preference $\succsim^*$ derived from it is rational. (The proof is
non-elementary.)
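On a finite choice structure, SA can be tested by closing the direct revealed preference relation under chains and then checking that anything indirectly revealed preferred to a chosen object is itself chosen when available. The sketch below assumes the dictionary encoding of choice structures used for illustration earlier; `transitive_closure` and `satisfies_sa` are hypothetical helper names.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def satisfies_sa(choice_rule):
    """Strong axiom: if x is (indirectly) revealed preferred to some y chosen
    from B, and x is available in B, then x must be chosen from B as well."""
    direct = {(a, b) for B, C in choice_rule.items() for a in C for b in B}
    indirect = transitive_closure(direct)
    return all(x in C
               for B, C in choice_rule.items()
               for x in B
               for y in C
               if (x, y) in indirect)

cyclic = {frozenset("xy"): {"x"}, frozenset("yz"): {"y"},
          frozenset("xz"): {"z"}}
acyclic = {frozenset("xy"): {"x"}, frozenset("yz"): {"y"},
           frozenset("xz"): {"x"}}

print("cyclic pairwise choices satisfy SA:", satisfies_sa(cyclic))
print("acyclic pairwise choices satisfy SA:", satisfies_sa(acyclic))
```

The cyclic pairwise choices from Section 6.1, which did not test WA, fail SA: $x$ is indirectly revealed preferred to $z$ through $y$, yet $z$ alone is chosen from $\{x, z\}$.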
Since SA implies WA, it inherits all properties of WA. Therefore, a
choice structure is generated by a rational preference if it satisfies SA
and only if it satisfies WA.
Exercise 38 (MWG 3.J.1). Show for the consumption set $\mathbb{R}^2_+$
that Walrasian demand satisfies WA if and only if it satisfies SA.
7 Integrability
7.1 Slutsky and Hicks Compensation
If choices are generated from rational preferences, then they satisfy the
weak axiom, which is equivalent to the compensated law of demand (as long
as choices are homogeneous of degree zero and satisfy Walras' law):
$\forall (p, w)$ and $\forall (p', w')$ such that
$w' = p' \cdot x(p, w)$,
$$(p' - p) \cdot (x(p', w') - x(p, w)) \leq 0.$$
Hence rational preferences imply the version of the compensated law of
demand we posed in the discussion of the choice-based approach. The
reverse conclusion is problematic: while the compensated law of demand
implies the weak axiom, the weak axiom is not sufficient for the
existence of a rational preference that generates the choices.
In the context of the expenditure minimization problem, we also derived a
compensated law of demand (for continuous and locally nonsatiated
preferences): $\forall p$ and $\forall p'$,
$$(p' - p) \cdot (h(p', u) - h(p, u)) \leq 0.$$
Are the two statements of the compensated law of demand equivalent? In
the choice-based approach, the consumer's wealth is explicitly adjusted
so that the initial choice $x$ remains affordable at prices $p'$.
Graphically, a price increase (inward pivot of the budget line) is
accompanied by a wealth increase (outward shift of the budget line, which
passes through $x$), as shown in the left panel of Figure 8. This type of
compensation is called Slutsky compensation.
In the preference-based approach, the consumer's utility is held fixed
and expenditure is allowed to vary, so that the initial utility is still
attained at the optimal choice $h(p', u)$ after the price increase.
Graphically, a price increase is accompanied by a wealth increase
(outward shift of the budget line, which remains tangent to the
indifference curve at $u$), as shown in the right panel of Figure 8. This
is known as Hicks compensation.
Figure 8: Compensation in the choice- and preference approaches
Visually, it is clear that the two versions of the compensated law of
demand are not the same. However, for infinitesimal price changes, they
do lead to the same adjustments in choices.
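The difference between the two compensations, and the fact that it vanishes faster than the price change itself, can be seen numerically. The sketch below assumes a Cobb-Douglas utility $u(x) = x_1^a x_2^{1-a}$ with $a = 0.5$ (an illustrative choice), for which Walrasian and Hicksian demand have closed forms; all function names are hypothetical.

```python
A = 0.5  # assumed Cobb-Douglas exponent on good 1

def utility(x):
    return x[0] ** A * x[1] ** (1 - A)

def walrasian(p, w):
    """Walrasian demand for the assumed Cobb-Douglas utility."""
    return (A * w / p[0], (1 - A) * w / p[1])

def hicksian(p, u):
    """Hicksian demand for the assumed Cobb-Douglas utility."""
    ratio = (A * p[1]) / ((1 - A) * p[0])
    return (u * ratio ** (1 - A), u * ratio ** (-A))

def compensation_gap(t, p=(1.0, 1.0), w=10.0):
    """Gap in good-1 demand between Slutsky and Hicks compensation
    when the price of good 1 rises by t."""
    x0 = walrasian(p, w)
    u0 = utility(x0)
    p_new = (p[0] + t, p[1])
    w_slutsky = p_new[0] * x0[0] + p_new[1] * x0[1]   # old bundle affordable
    x_slutsky = walrasian(p_new, w_slutsky)
    x_hicks = hicksian(p_new, u0)                      # old utility attained
    return abs(x_slutsky[0] - x_hicks[0])

for t in (1.0, 0.1, 0.01):
    gap = compensation_gap(t)
    print(t, gap, gap / t)
```

The ratio of the gap to the size of the price change shrinks as the change gets smaller, which is the numerical counterpart of the statement that the two compensations coincide for infinitesimal price changes.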
Our interest in this lecture is in whether the preference-based
compensated law of demand implies stronger restrictions on choices that
guarantee rationalizability. I.e. if we observe that choices satisfy it,
can they always be constructed from a rational preference?
To this end, we will characterize the compensated law of demand in the
choice-based and preference-based approaches in terms of the substitution
matrix, which contains price effects. Negative semidefiniteness of the
substitution matrix is (almost) equivalent to the choice-based
compensated law of demand and necessary (but not sufficient) for the
preference-based compensated law of demand.
The additional property that the substitution matrix must have if the
preference-based compensated law of demand holds is symmetry. We then
argue that symmetry is enough to recover a rational preference from
choices, so that the preference-based compensated law of demand indeed
guarantees rationalizability.
7.2 Aside: Dot Product
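Because we will make heavy use of vector notation and transformations of
vector equations in this lecture, I briefly review the properties of the
dot product.
The dot product of vectors $x$ and $y$ is the sum of the products of
corresponding elements (that have the same indices):
$$x \cdot y = \sum_{\ell=1}^{L} x_\ell y_\ell.$$
Because multiplication is commutative, so is the dot product:
$x \cdot y = y \cdot x$ since
$$x \cdot y = \sum_{\ell=1}^{L} x_\ell y_\ell = \sum_{\ell=1}^{L} y_\ell x_\ell = y \cdot x.$$
Because multiplication is distributive, so is the dot product:
$x \cdot (y + z) = x \cdot y + x \cdot z$ since
$$x \cdot (y + z) = \sum_{\ell=1}^{L} x_\ell (y_\ell + z_\ell) = \sum_{\ell=1}^{L} x_\ell y_\ell + \sum_{\ell=1}^{L} x_\ell z_\ell = x \cdot y + x \cdot z.$$
The dot product is not associative:
$(x \cdot y) \cdot z \neq x \cdot (y \cdot z)$ since, in general,
$$(x \cdot y) \cdot z = \sum_{k=1}^{L} \Big( \sum_{\ell=1}^{L} x_\ell y_\ell \Big) z_k = \Big( \sum_{\ell=1}^{L} x_\ell y_\ell \Big) z_1 + \cdots + \Big( \sum_{\ell=1}^{L} x_\ell y_\ell \Big) z_L$$
$$\neq \Big( \sum_{\ell=1}^{L} y_\ell z_\ell \Big) x_1 + \cdots + \Big( \sum_{\ell=1}^{L} y_\ell z_\ell \Big) x_L.$$
The dot product of $x$ and $y$ can be written in matrix notation:
$$x \cdot y = x^T y.$$
Commutativity, $x^T y = y^T x$, and distributivity,
$x^T (y + z) = x^T y + x^T z$, are of course inherited.
The outer product of vectors $x$ and $y$ is a matrix (as opposed to the
inner product $x^T y$, which is a scalar):
$$x y^T = \begin{pmatrix} x_1 y_1 & \cdots & x_1 y_L \\ \vdots & \ddots & \vdots \\ x_L y_1 & \cdots & x_L y_L \end{pmatrix}.$$
In matrix notation, there is a version of associativity, which involves a
switch from outer product to inner product:
$(x y^T) z = x (y^T z)$ since
$$(x y^T) z = \begin{pmatrix} x_1 y_1 & \cdots & x_1 y_L \\ \vdots & \ddots & \vdots \\ x_L y_1 & \cdots & x_L y_L \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_L \end{pmatrix} = \begin{pmatrix} \big( \sum_{\ell=1}^{L} y_\ell z_\ell \big) x_1 \\ \vdots \\ \big( \sum_{\ell=1}^{L} y_\ell z_\ell \big) x_L \end{pmatrix}$$
$$= \begin{pmatrix} x_1 \\ \vdots \\ x_L \end{pmatrix} \left( \begin{pmatrix} y_1 & \cdots & y_L \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_L \end{pmatrix} \right) = x (y^T z).$$
Notice that the matrix $x y^T$ cannot be expressed in terms of the dot
product between two vectors, so there is no conflict with the claim that
the dot product is not associative.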
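The identities in this aside are easy to confirm on a concrete example. The sketch below uses plain Python lists and small integer vectors chosen arbitrarily for illustration; the helper names are not from the notes.

```python
def dot(x, y):
    """Inner product x^T y."""
    return sum(a * b for a, b in zip(x, y))

def outer(x, y):
    """Outer product x y^T as a list of rows."""
    return [[a * b for b in y] for a in x]

def matvec(M, z):
    """Matrix-vector product M z."""
    return [dot(row, z) for row in M]

def scale(c, x):
    """Scalar multiple c x."""
    return [c * a for a in x]

x, y, z = [1, 2, 3], [4, 5, 6], [7, 8, 9]

lhs = matvec(outer(x, y), z)    # (x y^T) z
rhs = scale(dot(y, z), x)       # x (y^T z)
print(lhs, rhs)

# Moving the parentheses on plain dot products changes the result:
print(scale(dot(x, y), z), scale(dot(y, z), x))
```

The first two vectors agree, which is the matrix version of associativity; the last line shows that $(x \cdot y) z$ and $x (y \cdot z)$ are different vectors, one a multiple of $z$ and the other a multiple of $x$.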
7.3 Substitution Matrix
The substitution matrix (alternatively known as the Slutsky matrix)
$S(p, w)$ lists, for every two commodities $\ell$ and $k$, how much more
(or less) is chosen of commodity $\ell$ per differential increase in the
price of commodity $k$ at $(p, w)$. The entry in row $\ell$ and column
$k$ is
$$s_{\ell k}(p, w) = \frac{d x_\ell(p, w)}{d p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} \frac{\partial (p \cdot x(p', w'))}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} x_k(p, w),$$
where $x(p', w')$ is the fixed initial demand.
The first term measures the direct effect of the price change on
$x_\ell(p, w)$ and is called the (pure) substitution effect. The second
term measures the effect of the Slutsky compensation on $x_\ell(p, w)$
and is called the wealth effect.
Even though we denote these choices as Walrasian demands, we do not
necessarily assume here that they arise from the utility maximization
problem. These are observed choices from the budget set $B_{p,w}$ that
may or may not be rationalizable by an underlying preference. (They are
Walrasian in the sense that there is no implicit adjustment in
expenditure after a price change, via a fixed utility level. Any such
compensation is explicit.)
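Hence
$$S(p, w) = \begin{pmatrix} s_{11}(p, w) & \cdots & s_{1L}(p, w) \\ \vdots & \ddots & \vdots \\ s_{L1}(p, w) & \cdots & s_{LL}(p, w) \end{pmatrix}$$
$$= \begin{pmatrix} \frac{\partial x_1(p,w)}{\partial p_1} + \frac{\partial x_1(p,w)}{\partial w} x_1(p, w) & \cdots & \frac{\partial x_1(p,w)}{\partial p_L} + \frac{\partial x_1(p,w)}{\partial w} x_L(p, w) \\ \vdots & \ddots & \vdots \\ \frac{\partial x_L(p,w)}{\partial p_1} + \frac{\partial x_L(p,w)}{\partial w} x_1(p, w) & \cdots & \frac{\partial x_L(p,w)}{\partial p_L} + \frac{\partial x_L(p,w)}{\partial w} x_L(p, w) \end{pmatrix}$$
$$= D_p x(p, w) + D_w x(p, w) \, x(p, w)^T.$$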
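The formula $S(p, w) = D_p x(p, w) + D_w x(p, w) \, x(p, w)^T$ can be evaluated numerically with finite differences. The sketch below assumes a three-good Cobb-Douglas demand with budget shares (0.2, 0.3, 0.5), chosen only for illustration; `slutsky` and the step size are hypothetical choices, and the derivatives are central-difference approximations rather than exact values.

```python
SHARES = (0.2, 0.3, 0.5)  # assumed Cobb-Douglas budget shares

def demand(p, w):
    """Cobb-Douglas Walrasian demand: good l receives share s_l of wealth."""
    return [s * w / pl for s, pl in zip(SHARES, p)]

def slutsky(p, w, h=1e-6):
    """Approximate S(p, w) = D_p x + D_w x x^T by central differences."""
    L = len(p)
    x = demand(p, w)
    hi, lo = demand(p, w + h), demand(p, w - h)
    dx_dw = [(a - b) / (2 * h) for a, b in zip(hi, lo)]
    S = [[0.0] * L for _ in range(L)]
    for k in range(L):
        p_hi, p_lo = list(p), list(p)
        p_hi[k] += h
        p_lo[k] -= h
        x_hi, x_lo = demand(p_hi, w), demand(p_lo, w)
        for l in range(L):
            dx_dp = (x_hi[l] - x_lo[l]) / (2 * h)
            S[l][k] = dx_dp + dx_dw[l] * x[k]   # price effect + wealth effect
    return S

p, w = [1.0, 2.0, 4.0], 12.0
S = slutsky(p, w)
for row in S:
    print([round(v, 4) for v in row])

# S p should be (approximately) the zero vector.
print([round(sum(S[l][k] * p[k] for k in range(3)), 6) for l in range(3)])
```

For this demand the computed matrix has negative diagonal entries, is symmetric up to approximation error, and satisfies $S(p, w) p \approx 0$, in line with the properties derived in this section.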
The following is a characterization of the choice-based compensated law
of demand.
Proposition. If Walrasian demand function $x(p, w)$ is differentiable,
homogeneous of degree zero and satisfies Walras' law and the weak axiom,
then $\forall (p, w)$ the substitution matrix $S(p, w)$ is negative
semidefinite, i.e. $\forall v \in \mathbb{R}^L$,
$v^T S(p, w) v \leq 0$.
Proof. Under the assumptions, the compensated law of demand holds:
$\forall (p, w)$ and $\forall (p', w')$ such that
$w' = p' \cdot x(p, w)$,
$$(p' - p) \cdot (x(p', w') - x(p, w)) = dp \cdot dx(p, w) \leq 0.$$
This inequality implies negative semidefiniteness of the substitution
matrix. From the total derivative of $x$, and the fact that, for a
compensated price change,
$$dw = w' - w = (p' - p) \cdot x(p, w) = dp \cdot x(p, w),$$
we have
$$dx = D_p x(p, w) \, dp + D_w x(p, w) \, dw$$
$$= D_p x(p, w) \, dp + D_w x(p, w) \, (dp \cdot x(p, w))$$
$$= D_p x(p, w) \, dp + D_w x(p, w) \left( x(p, w)^T dp \right)$$
$$= \left( D_p x(p, w) + D_w x(p, w) \, x(p, w)^T \right) dp$$
(using commutativity of the dot product and, after switching to matrix
notation, associativity and distributivity).
Substituting into the first inequality,
$$dp \cdot \left( D_p x(p, w) + D_w x(p, w) \, x(p, w)^T \right) dp \leq 0.$$
Because the price change $dp \in \mathbb{R}^L$ was unrestricted, this
implies $dx(p, w) / dp = S(p, w)$ is negative semidefinite. ∎
The converse, that a negative semidefinite substitution matrix implies
the weak axiom, is true provided that $S(p, w)$ is in addition negative
definite, i.e. $v^T S(p, w) v < 0$, whenever $v$ is not proportional to
$p$ ($v \neq \alpha p$ for any $\alpha$).
As we will see in a moment, the substitution matrix cannot be negative
definite for all $v \in \mathbb{R}^L$ (i.e. it is in fact never definite)
since, letting $v = p$, we get $p^T S(p, w) p = 0$.
An implication of negative semidefiniteness is that all diagonal entries
(the own-price effects) are non-positive:
$$\frac{d x_\ell(p, w)}{d p_\ell} = \frac{\partial x_\ell(p, w)}{\partial p_\ell} + \frac{\partial x_\ell(p, w)}{\partial w} x_\ell(p, w) \leq 0.$$
Therefore, a Giffen good is necessarily inferior, since
$\partial x_\ell(p, w) / \partial p_\ell > 0$ only if
$\partial x_\ell(p, w) / \partial w < 0$.
Proposition. If Walrasian demand function $x(p, w)$ is differentiable,
homogeneous of degree zero and satisfies Walras' law, then
$\forall (p, w)$, $S(p, w) p = 0$ and $p^T S(p, w) = 0$.
Proof. Previously (in Lecture 5), we established
$$D_p x(p, w) \, p + D_w x(p, w) \, w = 0$$
when $x(p, w)$ is homogeneous of degree zero, and
$$p^T D_p x(p, w) + x(p, w)^T = 0$$
as well as $p^T D_w x(p, w) = 1$ when $x(p, w)$ satisfies Walras' law.
Then
$$S(p, w) \, p = D_p x(p, w) \, p + D_w x(p, w) \left( x(p, w)^T p \right) = D_p x(p, w) \, p + D_w x(p, w) \, w = 0$$
(where the second equality uses Walras' law), and furthermore
$$p^T S(p, w) = p^T D_p x(p, w) + p^T D_w x(p, w) \, x(p, w)^T = p^T D_p x(p, w) + x(p, w)^T = 0.$$ ∎
Hence negative semidefiniteness is exactly what is implied by the
choice-based compensated law of demand (equivalently the weak axiom).
Since $S(p, w) p = 0$, zero is an eigenvalue of $S(p, w)$, which means
its determinant $|S(p, w)| = 0$. (Recall that $\lambda$ is an eigenvalue
if $S(p, w) v = \lambda v$ for some nonzero vector $v$ or if
$|S(p, w) - \lambda I| = 0$. The latter implies $|S(p, w)| = 0$ if
$\lambda = 0$.) This is useful to know when checking negative
semidefiniteness from the signs of the principal minors. We will see an
example later on.
Exercise 39 (MWG 2.F.17). Let the Walrasian demand function $x(p, w)$
have the form: for $k = 1, \ldots, L$,
$$x_k(p, w) = \frac{w}{\sum_{\ell=1}^{L} p_\ell}.$$
Determine whether $x(p, w)$
(a) is homogeneous of degree zero;
(b) satisfies Walras' law;
(c) is consistent with the weak axiom;
(d) has a negative semidefinite, symmetric substitution matrix.
Exercise 40 (MWG 2.F.10). Compute the substitution matrix for Walrasian
demand function $x(p, w)$ where
$$x_1(p, w) = \frac{w}{p_1 + p_2 + p_3} \frac{p_2}{p_1},$$
$$x_2(p, w) = \frac{w}{p_1 + p_2 + p_3} \frac{p_3}{p_2},$$
$$x_3(p, w) = \frac{w}{p_1 + p_2 + p_3} \frac{p_1}{p_3}.$$
Demonstrate that the substitution matrix is negative semidefinite, but
not symmetric, at $p = (1, 1, 1)$. Show that the demand function does not
satisfy the weak axiom.
Exercise 41 (MWG 2.F.16). Let the Walrasian demand function $x(p, w)$ be
as follows:
$$x_1(p, w) = \frac{p_2}{p_3},$$
$$x_2(p, w) = -\frac{p_1}{p_3},$$
$$x_3(p, w) = \frac{w}{p_3}.$$
(a) Confirm that $x(p, w)$ is homogeneous of degree zero and that Walras'
law holds.
(b) Demonstrate that $x(p, w)$ does not satisfy the weak axiom.
(c) Show that $\forall v \in \mathbb{R}^3$, $v \cdot S(p, w) v = 0$.
7.4 Substitution Matrix with Preference
If we allow choices to arise from a standard preference, then the duality
between utility maximization and expenditure minimization lets us express
the substitution matrix in terms of Hicksian demand at $u = v(p, w)$:
$$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{d h_\ell(p, v(p, w))}{d p_k} = \frac{d x_\ell(p, w)}{d p_k} = s_{\ell k}(p, w)$$
for $\ell, k = 1, \ldots, L$.
In order for a differentiable Hicksian demand function $h(p, u)$ to exist
as a solution to the expenditure minimization problem, $u(\cdot)$ has to
represent a continuous, locally nonsatiated, strictly convex preference
on $X = \mathbb{R}^L_+$. I emphasize that the equivalence with a given
Walrasian demand function is only valid if the conditions for the
existence of a Hicksian demand function hold.
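Rewriting the substitution matrix accordingly as
$$S(p, w) = \begin{pmatrix} \frac{\partial h_1(p,u)}{\partial p_1} & \cdots & \frac{\partial h_1(p,u)}{\partial p_L} \\ \vdots & \ddots & \vdots \\ \frac{\partial h_L(p,u)}{\partial p_1} & \cdots & \frac{\partial h_L(p,u)}{\partial p_L} \end{pmatrix} = D_p h(p, u),$$
a further property can be derived, because $S(p, w)$ is now implicitly
confined to choices that reflect an underlying standard preference.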
Proposition. If n() represents a continuous, locally nonsatiated, strictly
convex preference on A = R
1
+
, and the Hicksian demand function /(j. n)
is continuously dierentiable at n, then the matrix 1
j
/(j. n) = o (j. n) is
negative semidenite and symmetric (i.e. d/

(j. n) ,dj
I
= d/
I
(j. n) ,dj

for
/. / = 1. . . . . 1).
Proof. Recall rst that /(j. n) = \
j
c (j. n) by the envelope theorem, so
that
1
j
/(j. n) = 1
2
j
c (j. n) .
Since c (j. n) is a concave function, we can appeal to the fact that the Hessian
(i.e. the second-derivative matrix) of any concave function is negative semi-
denite.
I will prove this general result. The Taylor expansion of $e(p, u)$ around $\delta = 0$ in displacement direction $v \in \mathbb{R}^L$ is
\[
e(p + \delta v, u) = e(p, u) + \nabla_p e(p, u)^T (\delta v) + \frac{\delta^2}{2}\, v^T D^2_p e(p + \beta v, u)\, v
\]
for some $\beta \in [0, \delta]$. Then
\[
v^T D^2_p e(p + \beta v, u)\, v \le 0,
\]
because the concavity of the expenditure function implies
\[
e(p + \delta v, u) \le e(p, u) + \nabla_p e(p, u)^T (\delta v).
\]
This is apparent when, in the definition of concavity, i.e. $\forall p$, $\forall p'$ and $\forall \alpha \in [0, 1]$,
\[
e(\alpha p + (1 - \alpha) p', u) \ge \alpha\, e(p, u) + (1 - \alpha)\, e(p', u),
\]
we substitute $\delta v = p' - p$ for $p'$, so that (after rearrangement)
\[
e(p + \delta v, u) \le e(p, u) + \frac{e(p + (1 - \alpha) \delta v, u) - e(p, u)}{1 - \alpha},
\]
and we apply the limit as $\alpha \to 1$ (making the last term a derivative with respect to $p$, in direction $\delta v$).
Now, if the magnitude $\delta$ of the displacement in the Taylor expansion is chosen sufficiently small, then $\beta$ is very close to zero, and we have
\[
v^T D^2_p e(p, u)\, v \le 0,
\]
where the direction $v \in \mathbb{R}^L$ was arbitrary. This establishes that $D^2_p e(p, u) = D_p h(p, u)$ is negative semidefinite.
Symmetry comes from Clairaut's theorem, which states that the Hessian of any function that has continuous second partial derivatives at all points in the domain is symmetric. Since this is true for $e(p, u)$ by virtue of the fact that its second derivatives are the first derivatives of $h(p, u)$, which were assumed to be continuous, $D^2_p e(p, u) = S(p, w)$ must be symmetric. ∎
The preference-based compensated law of demand, which is implicit in Hicksian demand, therefore requires both negative semidefiniteness and symmetry of the substitution matrix. Symmetry was not an implication of the choice-based compensated law of demand.

Before we examine this difference in examples, we confirm that the substitution matrix will not be negative definite.
Proposition. If $u(\cdot)$ represents a continuous, locally nonsatiated, strictly convex preference on $X = \mathbb{R}^L_+$, and $h(p, u)$ is differentiable at $u$, then
\[
D_p h(p, u)\, p = S(p, w)\, p = 0.
\]
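Proof. Under these conditions, Hicksian demand is homogeneous of degree zero in $p$, so that $h(\alpha p, u) = h(p, u)$. Differentiating both sides with respect to $\alpha$, we have for $\ell = 1, \ldots, L$,
\[
\sum_{k=1}^{L} \frac{\partial h_\ell(\alpha p, u)}{\partial (\alpha p_k)} \frac{\partial (\alpha p_k)}{\partial \alpha} = \sum_{k=1}^{L} \frac{\partial h_\ell(\alpha p, u)}{\partial (\alpha p_k)}\, p_k = 0,
\]
which, evaluated at $\alpha = 1$, corresponds to the claim. ∎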
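Example. Suppose choices have the Cobb-Douglas form:
\[
x_1(p, w) = \frac{1}{3} \frac{w}{p_1}, \quad x_2(p, w) = \frac{1}{3} \frac{w}{p_2}, \quad x_3(p, w) = \frac{1}{3} \frac{w}{p_3}.
\]
Observe that the choices satisfy homogeneity of degree zero and Walras' law (expenditures add up to $w$). The substitution matrix is easily calculated from partial derivatives to be
\[
S(p, w) = \frac{1}{9} w
\begin{pmatrix}
-\frac{2}{p_1^2} & \frac{1}{p_1 p_2} & \frac{1}{p_1 p_3} \\
\frac{1}{p_1 p_2} & -\frac{2}{p_2^2} & \frac{1}{p_2 p_3} \\
\frac{1}{p_1 p_3} & \frac{1}{p_2 p_3} & -\frac{2}{p_3^2}
\end{pmatrix}.
\]
Clearly, this is symmetric. One can check directly that it is negative semidefinite because $\forall v \in \mathbb{R}^L$,
\[
v^T S(p, w)\, v = \frac{1}{9} w \left[ -2 \frac{v_1^2}{p_1^2} + 2 \frac{v_1 v_2}{p_1 p_2} + 2 \frac{v_1 v_3}{p_1 p_3} - 2 \frac{v_2^2}{p_2^2} + 2 \frac{v_2 v_3}{p_2 p_3} - 2 \frac{v_3^2}{p_3^2} \right]
\]
\[
= -\frac{1}{9} w \left[ \left( \frac{v_1}{p_1} - \frac{v_2}{p_2} \right)^2 + \left( \frac{v_1}{p_1} - \frac{v_3}{p_3} \right)^2 + \left( \frac{v_2}{p_2} - \frac{v_3}{p_3} \right)^2 \right] \le 0
\]
(with equality only if $v = \alpha p$ for some $\alpha$). Hence Cobb-Douglas choices satisfy both the choice-based and the preference-based compensated law of demand.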
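The properties claimed for the Cobb-Douglas substitution matrix can be checked numerically. The following is a minimal sketch, not part of the original notes: it approximates $s_{\ell k} = \partial x_\ell / \partial p_k + x_k\, \partial x_\ell / \partial w$ by central finite differences. The helper names (`demand`, `slutsky`, `quad_form`) and the price and wealth values are illustrative assumptions.

```python
def demand(p, w):
    """Cobb-Douglas demand from the example: one third of wealth per good."""
    return [w / (3.0 * pk) for pk in p]


def slutsky(demand_fn, p, w, h=1e-5):
    """Finite-difference Slutsky matrix, s_lk = dx_l/dp_k + (dx_l/dw) * x_k."""
    n = len(p)
    x = demand_fn(p, w)
    x_hi, x_lo = demand_fn(p, w + h), demand_fn(p, w - h)
    dx_dw = [(a - b) / (2 * h) for a, b in zip(x_hi, x_lo)]
    S = [[0.0] * n for _ in range(n)]
    for k in range(n):
        p_hi = list(p)
        p_hi[k] += h
        p_lo = list(p)
        p_lo[k] -= h
        x_hi, x_lo = demand_fn(p_hi, w), demand_fn(p_lo, w)
        for l in range(n):
            S[l][k] = (x_hi[l] - x_lo[l]) / (2 * h) + dx_dw[l] * x[k]
    return S


def quad_form(S, v):
    """Return the quadratic form v' S v."""
    n = len(v)
    return sum(v[l] * S[l][k] * v[k] for l in range(n) for k in range(n))


# Illustrative prices and wealth (assumed values, not from the notes).
p, w = [1.0, 2.0, 4.0], 12.0
S = slutsky(demand, p, w)

asymmetry = max(abs(S[l][k] - S[k][l]) for l in range(3) for k in range(3))
S_times_p = [sum(S[l][k] * p[k] for k in range(3)) for l in range(3)]

print("max |s_lk - s_kl| :", asymmetry)
print("S p               :", S_times_p)
print("v'Sv at v=(1,0,0) :", quad_form(S, [1.0, 0.0, 0.0]))
print("v'Sv at v=p       :", quad_form(S, p))
```

Up to finite-difference error, the matrix should come out symmetric, $S(p, w)\, p$ should vanish, and the quadratic form should be negative except in the direction of $p$.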
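Example. Consider now the following modification:
\[
x_1(p, w) = \frac{1}{3} \frac{w}{p_1} + \frac{p_2}{p_1}, \quad x_2(p, w) = \frac{1}{3} \frac{w}{p_2} - 1, \quad x_3(p, w) = \frac{1}{3} \frac{w}{p_3}.
\]
Neither homogeneity of degree zero nor Walras' law is violated, e.g.
\[
\sum_{\ell=1}^{3} p_\ell\, x_\ell(p, w) = \frac{1}{3} w + p_2 + \frac{1}{3} w - p_2 + \frac{1}{3} w = w.
\]
The substitution matrix (with $s_{\ell k} = \partial x_\ell / \partial p_k + x_k\, \partial x_\ell / \partial w$ in row $\ell$ and column $k$)
\[
S(p, w) = \frac{1}{9} w
\begin{pmatrix}
-\frac{1}{p_1^2} \left( 2 + 6 \frac{p_2}{w} \right) & \frac{1}{p_1 p_2} \left( 1 + 6 \frac{p_2}{w} \right) & \frac{1}{p_1 p_3} \\
\frac{1}{p_1 p_2} \left( 1 + 3 \frac{p_2}{w} \right) & -\frac{1}{p_2^2} \left( 2 + 3 \frac{p_2}{w} \right) & \frac{1}{p_2 p_3} \\
\frac{1}{p_1 p_3} \left( 1 + 3 \frac{p_2}{w} \right) & \frac{1}{p_2 p_3} \left( 1 - 3 \frac{p_2}{w} \right) & -\frac{2}{p_3^2}
\end{pmatrix}
\]
is obviously not symmetric.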
In this case, it is more convenient to check negative semidefiniteness indirectly. One technique is to compute the principal minors, which are determinants of the $m \times m$ matrices derived from an $L \times L$ matrix by deleting $L - m$ corresponding rows and columns. A matrix is negative semidefinite if all odd principal minors (i.e. with $m$ odd) are non-positive and all even principal minors (i.e. with $m$ even) are non-negative.
The first-order principal minors are the diagonal entries, which are negative. The second-order minors have the following form. If row $r$ and column $c$ are deleted, and $i$ and $j$ are the lowest and highest non-deleted rows, and $k$ and $l$ are the lowest and highest non-deleted columns,
\[
M_{rc} = s_{ik}(p, w)\, s_{jl}(p, w) - s_{il}(p, w)\, s_{jk}(p, w) = (-1)^{I(r, c)} \frac{w^2}{9\, p_i p_j p_k p_l} \left( \frac{1}{3} + \frac{p_2}{w} \right),
\]
where $I(r, c) = 1$ if $r + c$ is odd, and $I(r, c) = 0$ if $r + c$ is even. Since $M_{rc}$ is negative if and only if $r + c$ is odd, and $r = c$ (i.e. $M_{rc}$ is a principal minor) implies $r + c$ is even, the second-order principal minors are all positive.
The third-order principal minor is the determinant of the matrix, which we know is always zero for $S(p, w)$. It is worth checking once that, indeed,
\[
|S(p, w)| = s_{11}(p, w)\, M_{11} - s_{12}(p, w)\, M_{12} + s_{13}(p, w)\, M_{13} = 0.
\]
Hence the matrix is negative semidefinite, which implies that the weak axiom holds (because all lower-order principal minors have strict signs, the negative definiteness for $v \ne \alpha p$ obtains). These choices do not satisfy the preference-based compensated law of demand, since the substitution matrix fails to be symmetric.
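The claims about the modified demand system can also be checked numerically. The sketch below is an illustration under assumed prices and wealth, not part of the original notes. It rebuilds the substitution matrix by finite differences, reports the asymmetry $s_{12} - s_{21}$, the second-order principal minors and the determinant, and samples the quadratic form $v^T S v$ directly. Sampling the quadratic form is a cautious complement to the minor test, because for a nonsymmetric matrix the sign of $v^T S v$ is governed by the symmetric part $(S + S^T)/2$. All helper names are hypothetical.

```python
import random


def demand(p, w):
    """Modified demand system from the example."""
    p1, p2, p3 = p
    return [w / (3 * p1) + p2 / p1, w / (3 * p2) - 1.0, w / (3 * p3)]


def slutsky(demand_fn, p, w, h=1e-5):
    """Finite-difference Slutsky matrix, s_lk = dx_l/dp_k + (dx_l/dw) * x_k."""
    n = len(p)
    x = demand_fn(p, w)
    x_hi, x_lo = demand_fn(p, w + h), demand_fn(p, w - h)
    dx_dw = [(a - b) / (2 * h) for a, b in zip(x_hi, x_lo)]
    S = [[0.0] * n for _ in range(n)]
    for k in range(n):
        p_hi = list(p)
        p_hi[k] += h
        p_lo = list(p)
        p_lo[k] -= h
        x_hi, x_lo = demand_fn(p_hi, w), demand_fn(p_lo, w)
        for l in range(n):
            S[l][k] = (x_hi[l] - x_lo[l]) / (2 * h) + dx_dw[l] * x[k]
    return S


def det3(M):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def quad_form(S, v):
    """Return the quadratic form v' S v."""
    n = len(v)
    return sum(v[l] * S[l][k] * v[k] for l in range(n) for k in range(n))


# Illustrative values (assumed); they keep all three demands positive.
p, w = [1.0, 2.0, 4.0], 12.0
S = slutsky(demand, p, w)

gap = S[0][1] - S[1][0]
principal_2x2 = [S[i][i] * S[j][j] - S[i][j] * S[j][i]
                 for i, j in [(0, 1), (0, 2), (1, 2)]]
determinant = det3(S)

random.seed(0)
worst = max(quad_form(S, [random.uniform(-1, 1) for _ in range(3)])
            for _ in range(1000))

print("s_12 - s_21             :", gap)
print("2x2 principal minors    :", principal_2x2)
print("determinant             :", determinant)
print("largest sampled v'Sv    :", worst)
```

At these values the asymmetry should be close to $1/(3 p_1)$, the second-order principal minors positive, the determinant numerically zero, and no sampled direction should produce a positive quadratic form.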
Exercise 42 (MWG 3.G.14). The following are Walrasian substitution effects for an agent who has rational preferences and faces prices $p = (1, 2, 6)$:
\[
S(p, w) =
\begin{pmatrix}
-10 & ? & ? \\
? & -4 & ? \\
3 & ? & ?
\end{pmatrix}.
\]
Find the missing entries.
Exercise 43 (MWG 2.F.11). Show that, in a two-commodity world, where $x(p, w)$ is differentiable, homogeneous of degree zero and satisfies Walras' law, $S(p, w)$ is always symmetric.
7.5 Integrability
There is a deeper significance to the observation that the substitution matrix is symmetric in the derivation from preference-based demand, but not in the choice-based approach using the weak axiom. We know that the weak axiom guarantees that the choice structure is rationalizable (i.e. admits an underlying preference), and is also equivalent to the compensated law of demand.

The preference-based approach is more special in nature, as is evident from the fact that it implies both the compensated law of demand and symmetry of the substitution matrix. Hence there is hope that a demand function that has both properties is rationalizable.

From a modeling point of view, when we build a theory from a demand function with certain properties, rather than from preferences, it is important to know whether the demand function could in fact occur if the agent were optimizing with respect to some rational preference.

This is known as the integrability problem: are homogeneity of degree zero, Walras' law, the compensated law of demand and symmetry of the substitution matrix sufficient for rationalizability?
The answer is yes. I.e. choices are rationalizable (given that they satisfy homogeneity of degree zero, Walras' law and the compensated law of demand) if the substitution matrix is symmetric. (We also presume that preferences are convex and continuous on $X$ in this discussion, so rationalizability refers here to the existence of a generating preference that has all the standard properties, not just completeness and transitivity. The symmetry of the substitution matrix is also necessary for rationalizability in terms of such preferences.)
To understand the connection, we must think about how a preference would be recovered from choices $x(p, w)$. A preference is fully described by its upper contour sets
\[
\succsim(x) = \left\{ z \in \mathbb{R}^L_+ \text{ s.t. } z \succsim x \right\} = \left\{ z \in \mathbb{R}^L_+ \text{ s.t. } u(z) \ge u(x) \right\}
\]
for some utility function that represents $\succsim$.

Equivalently we can express $\succsim(x)$, using the duality between utility maximization and expenditure minimization, as
\[
\succsim_{u(x)} = \left\{ z \in \mathbb{R}^L_+ \text{ s.t. } \forall p \gg 0,\ e(p, u(x)) \le p \cdot z \right\}.
\]
This is the set of bundles that cost at least as much as the cheapest bundle required to attain $u(x)$, at any given prices. To see why these bundles form the upper contour set of $x$, we need insights from abstract duality theory.
Duality theory revolves around the idea that any closed, convex set
can be described in terms of its linear approximation by hyperplanes
that separate the set from points not in the set. The sets of interest
to us are the upper contour sets of a continuous and convex preference
relation.
Fix a utility function $u(\cdot)$ that represents $\succsim$, and a level $u$, and denote the upper contour set
\[
\succsim(x) = \left\{ z \in \mathbb{R}^L_+ \text{ s.t. } u(z) \ge u(x) \right\}
\]
by $\succsim_u$. A hyperplane that separates the bundle $y \notin \succsim_u$ from the set $\succsim_u$ is parameterized by some vector $p$ and scalar $c_u$ such that
\[
p \cdot y < c_u \le p \cdot x
\]
for all $x \in \succsim_u$.
Given $\succsim_u$, the separating hyperplane theorem says that such a hyperplane (i.e. $p$ and $c_u$) exists for every $y \notin \succsim_u$. Each hyperplane separates $\mathbb{R}^L$ into two halfspaces, one containing the point $y$ and the other containing the set $\succsim_u$. The intersection of the latter halfspaces is the original set $\succsim_u$, since it excludes all points not in the set.
For every $p$, there is an element $x^* \in \succsim_u$ that generates the smallest value of $p \cdot x$ attainable in $\succsim_u$. The function that maps $p$ to this value at $x^*$ is called the support function of $\succsim_u$:
\[
e_u(p) = \inf\ p \cdot x \quad \text{s.t.} \quad x \in \succsim_u.
\]
This function is necessarily concave in $p$ (by the argument we already made for the concavity of the expenditure function - it increases at most linearly in $p$).
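Clearly, we have $e_u(p) \le p \cdot x$ for all $x \in \succsim_u$, at all $p$. It follows from the separating hyperplane theorem that there exists some $p \gg 0$ for every $y \notin \succsim_u$ such that $e_u(p) > p \cdot y$. Hence, no $y \notin \succsim_u$ can satisfy $e_u(p) \le p \cdot y$ for all $p \gg 0$. Then $\succsim_u$ can be expressed as
\[
\succsim_u = \left\{ x \in \mathbb{R}^L \text{ s.t. } \forall p \gg 0,\ e_u(p) \le p \cdot x \right\}.
\]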
The upper contour set at $u$ in the expenditure minimization problem is convex and closed if preferences are convex and continuous. This set can be approximated by tangent budget hyperplanes that contain bundles $x^*$ in the upper contour set at which $p \cdot x$ is minimized, given prices. By finding such a budget hyperplane for every $p$, we trace the boundary of the upper contour set. Knowledge of the minimum expenditure $p \cdot x^*$ at every price, i.e. the expenditure function (which is the support function) would allow us to recover the upper contour set. See Figure 9.
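The recovery of an upper contour set from the expenditure function can be illustrated with a small numerical sketch. It is not from the original notes and rests on stated assumptions: the utility function $u(x) = \sqrt{x_1 x_2}$ is chosen for illustration, its expenditure function is $e(p, u) = 2u\sqrt{p_1 p_2}$, and the condition $e(p, u) \le p \cdot z$ is tested only on a finite grid of normalized prices $p = (t, 1 - t)$, so the test is approximate near the boundary. The function names are hypothetical.

```python
import math


def utility(z):
    """Illustrative utility function u(x) = sqrt(x1 * x2)."""
    return math.sqrt(z[0] * z[1])


def expenditure(p, u):
    """Closed-form expenditure function for u(x) = sqrt(x1 * x2)."""
    return 2.0 * u * math.sqrt(p[0] * p[1])


def in_upper_contour_set(z, u, grid=200, tol=1e-12):
    """Test e(p, u) <= p.z on a grid of normalized prices p = (t, 1 - t)."""
    for i in range(1, grid):
        t = i / grid
        p = (t, 1.0 - t)
        if expenditure(p, u) > p[0] * z[0] + p[1] * z[1] + tol:
            return False
    return True


# Compare the direct test u(z) >= 1 with the support-function test.
for z in [(1.0, 1.0), (4.0, 0.3), (0.5, 1.5), (0.9, 0.9)]:
    print(z, utility(z) >= 1.0, in_upper_contour_set(z, 1.0))
```

For the four sample bundles, the two tests should agree: the first two bundles reach utility level 1 and pass at every grid price, while the last two are separated by at least one price vector.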
Duality theory lets us infer the upper contour set at every utility level $u$ and therefore recover the preference relation, once we have an expenditure function. What remains is to obtain the expenditure function from choices.

Figure 9: Upper contour set approximated by $e(p, u)$
This is the proper integrability problem, the question whether the system of partial differential equations $\nabla_p e(p, v(p, w)) = h(p, v(p, w)) = x(p, w)$ has a solution. The Frobenius theorem gives necessary and sufficient conditions for integrability that are, in particular, satisfied if the derivative matrix is symmetric.

As we have seen, the derivative matrix of $\nabla_p e(p, v(p, w))$ is the substitution matrix $S(p, w) = D_p h(p, v(p, w))$. Hence symmetry is exactly what is needed for recovering preferences from choices, i.e. for choices to be rationalizable.
Exercise 44 (MWG 3.H.6). Derive expenditure and utility function from the Walrasian demand function $x_\ell(p, w) = \alpha_\ell w / p_\ell$ for $\ell = 1, \ldots, L$ with $\sum_{\ell=1}^{L} \alpha_\ell = 1$.
Exercise 45 (MWG 3.H.5). How can expenditure and utility function be
recovered from the indirect utility function?
8 Aggregation
8.1 Aggregate Demand Function
In general, aggregate demand is not a function of aggregate wealth, but rather a correspondence even if individual demands are functions. At a given level of aggregate wealth, different wealth distributions lead to different choices, individually and in the aggregate. In this lecture, we consider the special circumstances under which aggregate demand is a well-defined function of aggregate wealth.

We also ask whether an aggregate demand function has welfare content. Individual demands are outcomes of utility maximization and therefore represent the best choices available to an agent. Can we say that aggregate demand reflects the best consumption choices available to society?

Finally, we consider whether aggregate demand inherits the weak axiom from individual demands, which is an important condition for the uniqueness of general equilibria.
Let there be $I$ consumers $i = 1, \ldots, I$ with rational preference relations $\succsim_i$ and individual wealths $w_i$. Their Walrasian demands are denoted by $x_i(p, w_i)$. Then aggregate demand is:
\[
x(p, w_1, \ldots, w_I) = \sum_{i=1}^{I} x_i(p, w_i).
\]
Exercise 46 (MWG 4.C.11). Let two agents have identical wealths $w_1 = w_2 = w/2$, and let their preferences over bundles of two goods have utility representations
\[
u_1(x_{11}, x_{21}) = x_{11} + 4 \sqrt{x_{21}}
\]
and
\[
u_2(x_{12}, x_{22}) = 4 \sqrt{x_{12}} + x_{22}.
\]
(a) Derive the individual demand functions and the aggregate demand function.
(b) Find the individual Slutsky matrices $S_i(p, w/2)$ for $i = 1, 2$ and the aggregate Slutsky matrix $S(p, w)$. (With two goods, the entire matrix is determined by one element.) Demonstrate that $dp \cdot S(p, w)\, dp < 0$ for all $dp \ne 0$ that are not proportional to $p$. Does aggregate demand satisfy the weak axiom?
In general, aggregate demand depends on prices and all individual wealth levels. Can we build a theory where only the aggregate wealth $\sum_{i=1}^{I} w_i = w$ affects aggregate demand? I.e. when is $\sum_{i=1}^{I} x_i(p, w_i) = \sum_{i=1}^{I} x_i(p, w_i')$ for all $(w_1, \ldots, w_I)$ and $(w_1', \ldots, w_I')$ that distribute the same aggregate wealth $w$, so that we can write
\[
x(p, w_1, \ldots, w_I) = \sum_{i=1}^{I} x_i(p, w_i) = x\left( p, \sum_{i=1}^{I} w_i \right)?
\]
Consider a wealth distribution $(w_1, \ldots, w_I)$ and some differential change in wealth $(dw_1, \ldots, dw_I)$ such that $\sum_{i=1}^{I} dw_i = 0$. Since $dw$ is a redistribution of wealth that does not affect total wealth, it cannot affect aggregate consumption of any commodity if $x$ is to be a function of aggregate wealth only. I.e.
\[
\sum_{i=1}^{I} \frac{\partial x_{\ell i}(p, w_i)}{\partial w_i}\, dw_i = 0
\]
for commodity $\ell = 1, \ldots, L$.
Because $dw$ might affect only two individuals, and leave everyone else's wealth unchanged, the wealth effect for each commodity must be the same for any two individuals, i.e. $\forall i, j \in I$,
\[
\frac{\partial x_{\ell i}(p, w_i)}{\partial w_i} = \frac{\partial x_{\ell j}(p, w_j)}{\partial w_j}
\]
for $\ell = 1, \ldots, L$.
Demand functions have this property at any prices and wealth distribution if and only if preferences admit an indirect utility function of the Gorman form:
\[
v_i(p, w_i) = a_i(p) + b(p)\, w_i.
\]
For the "if" part, we need Roy's identity.
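Proposition. If $u$ represents a continuous, locally nonsatiated and strictly convex preference on $X = \mathbb{R}^L_+$, and the indirect utility function $v(\cdot, \cdot)$ is differentiable at $(\bar{p}, \bar{w}) \gg 0$, then
\[
x(\bar{p}, \bar{w}) = -\frac{\nabla_p v(\bar{p}, \bar{w})}{\partial v(\bar{p}, \bar{w}) / \partial w}.
\]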
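Proof. Suppose $v(\bar{p}, \bar{w}) = \bar{u}$. By duality of the utility maximization and expenditure minimization problems, $\forall p$, $v(p, e(p, \bar{u})) = \bar{u}$. Hence
\[
\nabla_p v(p, e(p, \bar{u})) + \frac{\partial v(p, e(p, \bar{u}))}{\partial e(p, \bar{u})}\, \nabla_p e(p, \bar{u}) = 0.
\]
Evaluating at $p = \bar{p}$, and using $\nabla_p e(\bar{p}, \bar{u}) = h(\bar{p}, \bar{u}) = x(\bar{p}, \bar{w})$, and replacing $e(\bar{p}, \bar{u})$ with $\bar{w}$, we have
\[
\nabla_p v(\bar{p}, \bar{w}) + \frac{\partial v(\bar{p}, \bar{w})}{\partial w}\, x(\bar{p}, \bar{w}) = 0
\]
as claimed. ∎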
Substituting the Gorman form of the indirect utility function into Roy's identity, we have
\[
x_i(p, w_i) = -\frac{\nabla_p v_i(p, w_i)}{\partial v_i(p, w_i) / \partial w_i} = -\frac{\nabla_p a_i(p)}{b(p)} - \frac{\nabla_p b(p)}{b(p)}\, w_i.
\]
Differentiating with respect to wealth,
\[
\nabla_{w_i} x_i(p, w_i) = -\frac{\nabla_p b(p)}{b(p)},
\]
which is independent of $w_i$. Since the wealth effect is the same for all individuals and all wealth levels, aggregate demand depends only on aggregate wealth.

This argument only provides sufficiency, but the Gorman form is in fact necessary for the existence of an aggregate demand function.
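The sufficiency argument can be illustrated numerically. The sketch below is an illustration, not part of the original notes. It assumes a particular Gorman-form indirect utility, $v_i(p, w_i) = k_i\, p_2 / p_1 + w_i / p_2$ (so $a_i(p) = k_i\, p_2 / p_1$ differs across consumers while $b(p) = 1 / p_2$ is common), derives demand from Roy's identity by central finite differences, and compares aggregate demand across several distributions of the same total wealth. The functional form, parameter values and helper names are all assumptions made for the example.

```python
def indirect_utility(p, w, k):
    """Gorman form v_i = a_i(p) + b(p) * w_i with a_i = k * p2 / p1, b = 1 / p2."""
    return k * p[1] / p[0] + w / p[1]


def roy_demand(p, w, k, h=1e-6):
    """Demand via Roy's identity, x = -grad_p v / (dv/dw), by central differences."""
    dv_dw = (indirect_utility(p, w + h, k) - indirect_utility(p, w - h, k)) / (2 * h)
    x = []
    for l in range(len(p)):
        p_hi = list(p)
        p_hi[l] += h
        p_lo = list(p)
        p_lo[l] -= h
        dv_dp = (indirect_utility(p_hi, w, k) - indirect_utility(p_lo, w, k)) / (2 * h)
        x.append(-dv_dp / dv_dw)
    return x


def aggregate_demand(p, wealths, ks):
    """Sum individual Roy demands over consumers."""
    total = [0.0] * len(p)
    for w_i, k_i in zip(wealths, ks):
        x_i = roy_demand(p, w_i, k_i)
        total = [a + b for a, b in zip(total, x_i)]
    return total


p = [2.0, 1.0]          # illustrative prices
ks = [1.0, 3.0]         # consumer-specific parameters of a_i(p)

# Three distributions of the same aggregate wealth w = 10.
for wealths in ([3.0, 7.0], [5.0, 5.0], [8.0, 2.0]):
    print(wealths, aggregate_demand(p, wealths, ks))
```

Under these assumptions, each distribution should produce approximately the same aggregate bundle, because the wealth effect $-\nabla_p b(p) / b(p)$ is common to both consumers.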
Example. Preference $\succsim$ is homothetic if $x \sim y$ implies $tx \sim ty$ for all $x, y$ and all $t > 0$. If $u$ is a utility function that represents homothetic preference $\succsim$, then $u(x) = u(y) \implies u(tx) = u(ty)$ for all $t > 0$. It follows that the utility function can be chosen to be homogeneous of degree one, i.e. $u(tx) = t\, u(x)$. Furthermore, the expenditure function is homogeneous of degree one in $u$:
\[
t\, e(p, u) = t\, (p \cdot x) = p \cdot (tx) = e(p, tu).
\]
Denote the expenditure required to reach $u = 1$, given prices, by a new function $e(p, 1) = \tilde{b}(p)$. Now one can write the expenditure at an arbitrary utility $u$ as $e(p, u) = u\, e(p, 1) = u\, \tilde{b}(p)$. Since $e(p, u) = w$ and $u = v(p, w)$ in the corresponding utility maximization problem, we have
\[
v(p, w) = b(p)\, w
\]
where $b(p) = 1 / \tilde{b}(p)$, for a homothetic preference. Hence the indirect utility function has the Gorman form.
We can see directly how the Gorman form relates to linear wealth expansion paths in this case. Using Roy's identity,
\[
x(p, w) = -\frac{\nabla_p b(p)}{b(p)}\, w.
\]
Hence, at fixed prices, demand increases linearly (and proportionately for all commodities) in wealth. Thus $\nabla_w x(p, w) = x(p, w) / w$, and the income elasticity of demand is, for commodity $\ell = 1, \ldots, L$,
\[
\varepsilon_{\ell w}(p, w) = \frac{\partial x_\ell(p, w)}{\partial w} \frac{w}{x_\ell(p, w)} = 1.
\]
This means a fixed share of the budget is spent on each commodity, a property that you know from demand functions associated with Cobb-Douglas utility functions, which belong to the class of homothetic preferences.
Example. Preference is quasilinear in good $k$ if $x \sim y$ implies $x + \alpha e_k \sim y + \alpha e_k$, where $\alpha > 0$ and $e_k$ is a bundle of one unit of commodity $k$ (and zero of any other commodity). I.e. consuming commodity $k$ does not affect how the agent values other commodities. As a consequence, $x_k(p, w)$ must enter additively into the utility function, e.g. $u(x) = x_k(p, w) + \phi(x_{-k}(p, w))$. (The notation $x_{-k}(p, w)$ refers to quantities of all commodities other than $k$.) The function $\phi$ is nonlinear - if other commodities enter additively, then it is a condition of optimality that only one of these is consumed.
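Recall that utility maximization entails the "tangency"
\[
\frac{\partial u(x) / \partial x_\ell}{\partial u(x) / \partial x_k} = \frac{p_\ell}{p_k}
\]
for all $\ell$ (at an interior solution). Since
\[
\frac{\partial u(x)}{\partial x_k} = 1,
\]
all marginal utilities at the solution are pinned down by prices, $\partial u(x) / \partial x_\ell = p_\ell / p_k$, and do not vary with wealth.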
For all commodities that enter nonlinearly, such a condition fixes the quantity at a specific level. Only commodity $k$ has a constant marginal utility at all levels of $x_k(p, w)$, hence only $x_k(p, w)$ can adjust to a wealth change at a solution. It follows that
\[
x_k(p, w) = \frac{1}{p_k} \left[ w - \sum_{\ell \ne k} p_\ell\, x_\ell(p) \right] = \frac{w}{p_k} - g(p)
\]
(all remaining wealth is spent on commodity $k$). Letting $a(p) = \phi(x_{-k}(p)) - g(p)$ and $b(p) = 1 / p_k$, the indirect utility function therefore has the Gorman form:
\[
v(p, w) = a(p) + b(p)\, w.
\]
The wealth effects of quasilinear demands are constant:
\[
\frac{\partial x_k(p, w)}{\partial w} = \frac{1}{p_k}, \quad \frac{\partial x_\ell(p, w)}{\partial w} = 0
\]
for all $\ell \ne k$. Thus, the wealth expansion path is in this case parallel to the $k$-axis. With homothetic preference, it is a straight line through the origin (i.e. the zero bundle).
In some cases, there may be a fixed wealth distribution rule $(w_1(p, w), \ldots, w_I(p, w))$ that depends only on prices and aggregate wealth $w$. Then aggregate demand can be written as a function of aggregate wealth,
\[
\sum_{i=1}^{I} x_i(p, w_i(p, w)) = \sum_{i=1}^{I} \tilde{x}_i(p, w),
\]
without imposing uniform wealth effects. (Recall that we normally need them because aggregate wealth could be distributed in many ways. If we know what the distribution is, then multiplicity is not an issue.)
If wealth effects are non-uniform, and the distribution of wealth is not fixed, aggregate demand may depend on certain statistics of the wealth distribution. One statistic is aggregate wealth (i.e. the mean); we may also have to observe variance and higher-order moments. Then aggregate demand may be a function of prices and distributional statistics (rather than full information linking each preference to a particular individual wealth).
8.2 Representative Consumer
Given that we have an aggregate demand function, we ask now whether
it is rationalizable in the sense that there exists a preference based on
which a hypothetical consumer could make these choices on behalf of
the entire population.
This is a precondition for the aggregate demand function to have some
welfare content. Without it, one cannot talk about improving on a
given aggregate bundle, e.g. by redistributing wealth.
Definition. The aggregate demand function $x(p, w)$ admits a positive representative consumer if there exists a preference $\succsim$ such that $x(p, w)$ is the Walrasian demand function generated by $\succsim$. I.e. $\forall (p, w)$ and $\forall x$, $x \ne x(p, w)$ and $p \cdot x \le w \implies x(p, w) \succ x$.

To actually compare aggregate demands, we need to specify how we would evaluate a particular list of individual outcomes. The rule could be utilitarian (adding up individual utilities) or egalitarian (preferring less variation between individual utilities), or something else.
Definition. A Bergson-Samuelson social welfare function $W : \mathbb{R}^I \to \mathbb{R}$ assigns a value to every utility vector $(u_1, \ldots, u_I)$ for the $I$ agents.
The optimal (feasible) distribution of wealth $(w_1, \ldots, w_I)$, given a social welfare function $W(\cdot)$, is that which attains
\[
v(p, w) = \max_{(w_1, \ldots, w_I)} W(v_1(p, w_1), \ldots, v_I(p, w_I)) \quad \text{s.t.} \quad \sum_{i=1}^{I} w_i \le w.
\]
The aggregate indirect utility function $v(p, w)$ (given social welfare function $W(\cdot)$) arises as follows. For every price vector $p$ and aggregate wealth level $w$, we record the value of $W(\cdot)$ for every possible distribution of $w$ among the $I$ individuals, who are assumed to maximize their utility from consumption within their budget sets $B_{p, w_1}, \ldots, B_{p, w_I}$ as determined by the distribution and prices. The highest achievable value of $W(\cdot)$ among all distributions, i.e. the maximum at given prices and aggregate wealth, is the indirect utility at $(p, w)$.
Exercise 47 (MWG 4.D.2). Confirm that $v(p, w)$, thus constructed, has the usual properties of an indirect utility function (i.e. homogeneous of degree zero, increasing in $w$, decreasing in $p$ and quasiconvex).
Suppose $W(\cdot)$ is increasing, concave and differentiable, and the distribution rule that solves the maximization of $W(\cdot)$ is such that individual wealth $w_i(p, w)$ is differentiable in price and homogeneous of degree one for all $i$.

It can be shown that $v(p, w)$ is the maximum of a utility function that represents the preference of a positive representative consumer. I.e. the Walrasian demand function derived from $v(p, w)$ via Roy's identity is the aggregate of the individual demands underlying $(v_1(p, w_1), \ldots, v_I(p, w_I))$.
Since the aggregate demand function attains maximal utility $v(p, w)$ at all $(p, w)$, it is chosen by the preference underlying $v(p, w)$ (we could construct a complete utility function from all values of $W(\cdot)$). Hence there exists a preference with respect to which the aggregate demand at every $(p, w)$ is the utility-maximizing bundle. This means we have a positive representative consumer.
In this particular case, the positive representative consumer is moreover a normative representative consumer. The normative representative consumer's preference chooses an aggregate demand function $x(p, w)$ that is $\forall (p, w)$ consistent with individual choices $x_1(p, w_1), \ldots, x_I(p, w_I)$ at the wealth distribution $(w_1(p, w), \ldots, w_I(p, w))$ that maximizes $W(\cdot)$.

Note that a positive representative consumer could choose some other aggregate demand function, which arises from suboptimal distributions of wealth (according to $W(\cdot)$). There may well be a preference that rationalizes it. The normative requirement is much stronger.
Definition. The aggregate demand function $x(p, w)$ admits a normative representative consumer with respect to welfare function $W(\cdot)$ if there exists a preference $\succsim$ that generates the welfare-maximizing aggregate demand function $x(p, w)$ (which reflects at every $(p, w)$ the wealth distribution that maximizes $W(\cdot)$).
Example. Suppose all agents have homothetic preferences and the social welfare function is
\[
W(u_1, \ldots, u_I) = \sum_{i=1}^{I} \alpha_i \ln u_i
\]
with $\alpha_i > 0$ for all $i$, and $\sum_{i=1}^{I} \alpha_i = 1$. What is the aggregate demand function of a normative representative consumer? The first-order conditions to maximize $W(\cdot)$, subject to $\sum_{i=1}^{I} w_i = w$, are, for $i = 1, \ldots, I$,
\[
\frac{\partial W(u_1, \ldots, u_I)}{\partial u_i} \frac{\partial v_i(p, w_i)}{\partial w_i} = \lambda,
\]
where $\lambda$ is the Lagrange multiplier and derivatives are evaluated at the welfare-maximizing wealth distribution $(w_1, \ldots, w_I)$.

Since
\[
\frac{\partial W(u_1, \ldots, u_I)}{\partial u_i} = \frac{\alpha_i}{u_i} = \frac{\alpha_i}{v_i(p, w_i)}
\]
and
\[
\frac{\partial v_i(p, w_i)}{\partial w_i} = \frac{v_i(p, w_i)}{w_i}
\]
(follows from $v(p, w) = b(p)\, w$ as derived above for homothetic preference), we have $\alpha_i / w_i$ on the left side of the first-order conditions. Summing $w_i = \alpha_i / \lambda$ over $i$,
\[
w = \sum_{i=1}^{I} w_i = \frac{\sum_{i=1}^{I} \alpha_i}{\lambda} = \frac{1}{\lambda}.
\]
So the optimal wealth distribution is $w_i(p, w) = \alpha_i w$ for all $i$. Then the welfare-maximizing aggregate demand function is
\[
x(p, w) = \sum_{i=1}^{I} x_i(p, \alpha_i w)
\]
(remember, the $\alpha_i$ are given by the social welfare function).
Exercise 48 (MWG 4.D.8). Suppose for any distribution $(w_1, \ldots, w_I)$ of $w$ there is a distribution $(w_1', \ldots, w_I')$ of $w'$ such that $v_i(p', w_i') > v_i(p, w_i)$ for all $i$. Argue that any normative representative consumer must then prefer $(p', w')$ to $(p, w)$.
Exercise 49 (MWG 4.D.1). Show that $v(p, w)$ can alternatively be derived by solving the problem
\[
\max_{x_1, \ldots, x_I} W(u_1(x_1), \ldots, u_I(x_I)) \quad \text{s.t.} \quad p \cdot \sum_{i=1}^{I} x_i \le w
\]
for $(x_1(p, w_1(p, w)), \ldots, x_I(p, w_I(p, w)))$, the individual demands at the optimal wealth distribution rule $(w_1(p, w), \ldots, w_I(p, w))$. Since this formalization, where a planner chooses consumption bundles, is equivalent to the one where the planner chooses wealth levels (and lets consumers make their consumption decisions in the market), we have a version of the second welfare theorem.
8.3 Failure of the Weak Axiom
Suppose aggregate demand is a well-defined function $x(p, w)$ of prices and aggregate wealth.

It is clear that aggregate demand inherits homogeneity of degree zero and Walras' law from individual demand functions.

The definition of the weak axiom in the Walrasian demand setting extends directly from individual to aggregate demands: it says if $p \cdot x(p', w') \le w$ and $x(p, w) \ne x(p', w')$, then $p' \cdot x(p, w) > w'$. (If aggregate bundle $x(p, w)$ was chosen over $x(p', w')$ in one situation, then $x(p', w')$ can only be chosen if $x(p, w)$ is unavailable.)

In general, the weak axiom does not survive aggregation.
Example. Let wealth $w = 10$ be distributed equally ($w_1 = w_2 = 5$) and consider demands
\[
x_1(p, w_1) = \left( 0, \tfrac{5}{2} \right), \quad x_2(p, w_2) = (3, 1)
\]
at prices $(p_1, p_2) = (1, 2)$, as well as
\[
x_1(p', w_1) = \left( \tfrac{3}{2}, 2 \right), \quad x_2(p', w_2) = (2, 1)
\]
at prices $(p_1', p_2') = (2, 1)$. (Here, the subscripts on $x$ and $w$ refer to individuals, not commodities.)

These demands satisfy the weak axiom, since bundle $x_1(p', w/2)$ costs more than $w_1 = 5$ at prices $p$, and bundle $x_2(p, w/2)$ costs more than $w_2 = 5$ at prices $p'$. (We have well-defined orderings $x_1(p', w_1) \succ^* x_1(p, w_1)$ and $x_2(p, w_2) \succ^* x_2(p', w_2)$.)

Aggregate demands are
\[
x(p, w) = \left( 3, \tfrac{7}{2} \right), \quad x(p', w) = \left( \tfrac{7}{2}, 3 \right).
\]
Since both aggregate bundles cost $w = 10$ at the prices at which they are chosen and less than 10 when they are not chosen, they violate the weak axiom: each is available in both budget sets $B_{p, w}$ and $B_{p', w}$, but $x(p, w) \ne x(p', w)$. (There is no well-defined ordering, since $x(p, w) \succ^* x(p', w)$ on $B_{p, w}$, and $x(p', w) \succ^* x(p, w)$ on $B_{p', w}$.) As can be seen in Figure 10, the scaled-down aggregate bundles lie inside both individual budget sets (which are, in this case, scaled-down versions of the aggregate budget sets).

Figure 10: Failure of the aggregate weak axiom
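The arithmetic of this example is easy to verify mechanically. The sketch below uses the bundles, prices and wealth levels from the example; the helper names (`dot`, `violates_weak_axiom`) are hypothetical. A pair of observations is flagged as a violation when the two bundles differ and each is affordable in the situation where the other was chosen.

```python
def dot(p, x):
    """Cost of bundle x at prices p."""
    return sum(a * b for a, b in zip(p, x))


def violates_weak_axiom(p, x, w, p_alt, x_alt, w_alt):
    """True if x and x_alt differ and each is affordable where the other is chosen."""
    return x != x_alt and dot(p, x_alt) <= w and dot(p_alt, x) <= w_alt


p, p_alt = (1.0, 2.0), (2.0, 1.0)

# Individual demands (agent 1, agent 2) at p and at p_alt, each with wealth 5.
x1, x1_alt = (0.0, 2.5), (1.5, 2.0)
x2, x2_alt = (3.0, 1.0), (2.0, 1.0)

# Aggregate demands with wealth 10.
agg = tuple(a + b for a, b in zip(x1, x2))
agg_alt = tuple(a + b for a, b in zip(x1_alt, x2_alt))

print("agent 1 violates WA  :", violates_weak_axiom(p, x1, 5.0, p_alt, x1_alt, 5.0))
print("agent 2 violates WA  :", violates_weak_axiom(p, x2, 5.0, p_alt, x2_alt, 5.0))
print("aggregate violates WA:", violates_weak_axiom(p, agg, 10.0, p_alt, agg_alt, 10.0))
print("cross costs          :", dot(p, agg_alt), dot(p_alt, agg))
```

Both individual checks should come out negative and the aggregate check positive, with each aggregate bundle costing 9.5 at the prices where the other one is chosen.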
Even though the weak axiom fails in general, it does hold when individual demands are always decreasing in prices (for compensated and uncompensated price changes). We know from individual consumer theory that this is not a compelling property. It requires (pure) substitution effects to be sufficiently large to cancel out any income effects of inferior goods.
Definition. Individual demand function $x_i(p, w_i)$ satisfies the uncompensated law of demand (ULD) if $\forall p, p'$ and $\forall w_i$,
\[
(p' - p) \cdot (x_i(p', w_i) - x_i(p, w_i)) < 0
\]
when $x_i(p, w_i) \ne x_i(p', w_i)$. ULD for the aggregate demand function $x(p, w)$ is the same property without subscripts.
Unlike the weak axiom, ULD does survive aggregation.
Proposition. If every individual Walrasian demand function $x_i(p, w_i)$ satisfies ULD, then aggregate demand $x(p, w)$ satisfies ULD.

Proof. If $x(p, w) \ne x(p', w)$, then for some $i$,
\[
(p' - p) \cdot (x_i(p', w_i) - x_i(p, w_i)) < 0
\]
(for all other agents, the inequality is non-positive). Summing over $I$, we have $\forall p, p'$ and $\forall w$,
\[
(p' - p) \cdot (x(p', w) - x(p, w)) < 0.
\]
∎
The aggregate version of ULD, in combination with the other standard properties, implies the weak axiom. Thus, we can give conditions under which aggregate demand satisfies the weak axiom, but these conditions do not have choice-theoretic foundations.
Proposition. If aggregate demand $x(p, w)$ satisfies homogeneity of degree zero, Walras' law and ULD, then it satisfies the weak axiom.
Proof. Given two price-wealth pairs $(p, w)$ and $(p', w')$ with $x(p, w) \ne x(p', w')$, let $p \cdot x(p', w') \le w$. The weak axiom requires $p' \cdot x(p, w) > w'$, so that $x(p, w)$ is revealed preferred to $x(p', w')$. Let $p'' = (w / w')\, p'$ and note that, because $p'' / w = p' / w'$, homogeneity of degree zero implies $x(p'', w) = x(p', w')$. Thus, $p'' \cdot x(p'', w) = (w / w') (p' \cdot x(p', w')) = w$ (by Walras' law).

From ULD,
\[
(p'' - p) \cdot (x(p'', w) - x(p, w)) < 0.
\]
Since $p \cdot x(p'', w) = p \cdot x(p', w') \le w$ and $p \cdot x(p, w) = w$, it is necessary that
\[
p'' \cdot (x(p'', w) - x(p, w)) = w - p'' \cdot x(p, w) < 0,
\]
i.e. $p'' \cdot x(p, w) > w$. Then (substituting for $p''$), $p' \cdot x(p, w) > w'$. ∎
The compensated law of demand holds if the price derivative matrix of Hicksian demand, $D_p h_i(p, u_i)$, which is equal to the substitution matrix $S_i(p, w_i)$ of compensated price effects, is negative semidefinite (and definite when pre- and post-multiplied by vectors that are not proportional to $p$).

Similarly, choices satisfy ULD if the price derivative matrix of Walrasian demand, $D_p x_i(p, w_i)$, has the same properties.
Exercise 50 (MWG 4.C.1). Show that $x_i(p, w_i)$ satisfies ULD only if $D_p x_i(p, w_i)$ is negative semidefinite, and conversely, if $D_p x_i(p, w_i)$ is negative definite (except $v^T D_p x_i(p, w_i)\, v = 0$ when $v = \alpha p$ for some $\alpha$), then $x_i(p, w_i)$ satisfies ULD.
Homothetic preferences are a special case where choices respect ULD
(and therefore the weak axiom).
Example. With a homothetic preference, we saw that $\nabla_{w_i} x_i(p, w_i) = x_i(p, w_i) / w_i$. Rearranging $S_i(p, w_i) = D_p x_i(p, w_i) + \nabla_{w_i} x_i(p, w_i)\, x_i(p, w_i)^T$ gives
\[
D_p x_i(p, w_i) = S_i(p, w_i) - \frac{1}{w_i}\, x_i(p, w_i)\, x_i(p, w_i)^T.
\]
Since $v^T S_i(p, w_i)\, v \le 0$ (strictly if $v$ is not proportional to $p$) and $v^T x_i(p, w_i)\, x_i(p, w_i)^T v = \left( v^T x_i(p, w_i) \right)^2 \ge 0$, we have $\forall v \in \mathbb{R}^L$, $v^T D_p x_i(p, w_i)\, v \le 0$ (strictly if $v$ is not proportional to $p$). This makes $D_p x_i(p, w_i)$ negative definite for every individual, so that ULD holds in the aggregate and implies the weak axiom.
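As a numerical illustration of the uncompensated law of demand under homothetic preferences, the sketch below uses two Cobb-Douglas consumers (a homothetic special case) with different budget shares and wealth levels, and evaluates $(p' - p) \cdot (x(p', w) - x(p, w))$ for randomly drawn price pairs. The shares, wealth levels, price range and helper names are assumptions made for the example, not taken from the notes.

```python
import random


def cobb_douglas(shares):
    """Return a Cobb-Douglas demand function x_l = share_l * w / p_l."""
    return lambda p, w: [a * w / pk for a, pk in zip(shares, p)]


# Two consumers with different (assumed) budget shares and wealth levels.
consumers = [
    (cobb_douglas([0.2, 0.3, 0.5]), 4.0),
    (cobb_douglas([0.6, 0.3, 0.1]), 6.0),
]


def aggregate(p):
    """Aggregate demand at prices p, holding individual wealth levels fixed."""
    total = [0.0, 0.0, 0.0]
    for demand, w_i in consumers:
        total = [t + x for t, x in zip(total, demand(p, w_i))]
    return total


def uld_product(p, p_alt):
    """Return (p_alt - p) . (x(p_alt) - x(p)) for aggregate demand."""
    x, x_alt = aggregate(p), aggregate(p_alt)
    return sum((pb - pa) * (xb - xa)
               for pa, pb, xa, xb in zip(p, p_alt, x, x_alt))


random.seed(1)
products = []
for _ in range(1000):
    p = [random.uniform(0.5, 3.0) for _ in range(3)]
    p_alt = [random.uniform(0.5, 3.0) for _ in range(3)]
    products.append(uld_product(p, p_alt))

print("largest product over 1000 draws:", max(products))
```

Every sampled product should be strictly negative, in line with ULD surviving aggregation.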
Exercise 51 (MWG 4.C.6). Verify that the following claim is true in the case of a homothetic preference $\succsim_i$. If $\succsim_i$ can be represented by a twice continuously differentiable, concave utility function $u_i(\cdot)$, and $\forall x_i$,
\[
-\frac{x_i \cdot D^2 u_i(x_i)\, x_i}{x_i \cdot \nabla u_i(x_i)} < 4,
\]
then $x_i(p, w_i)$ satisfies ULD.
Exercise 52 (MWG 4.C.7). If every individual has the same consumption function $\tilde{x}(p, w)$, and individual wealth $w$ is distributed on $[0, \bar{w}]$ with density non-increasing in wealth, show that the aggregate demand function
\[
x(p) = \int_0^{\bar{w}} \tilde{x}(p, w)\, dw
\]
satisfies ULD. On the other hand, demonstrate that there are unimodal distributions of wealth (where the density function is single-peaked, first increasing and then decreasing) for which $\tilde{x}(p, w)$ does not satisfy ULD.
9 Expected Utility
9.1 Lotteries
In this lecture, we introduce risk into the choice framework.
The set of $N$ possible outcomes will be denoted by $C$ (for consequences, e.g. $C = X$ could be the set of consumption bundles). The decision maker's choice induces a probability distribution on $C$: which of the consequences will be realized is not certain at the time the choice is made.
Definition. A simple lottery is a probability distribution $L = (p_1, \ldots, p_N)$ on the set of consequences $C$. I.e. $p_n \ge 0$ for $n = 1, \ldots, N$, and $\sum_{n=1}^{N} p_n = 1$.
A simple lottery is an element of the $(N-1)$-dimensional simplex, i.e. the set
$$\Delta = \left\{ p \in \mathbb{R}^N_+ \ \text{s.t.} \ \sum_{n=1}^{N} p_n = 1 \right\}.$$
(It is not $N$-dimensional because $p_N$ is determined by $p_1, \ldots, p_{N-1}$ and the requirement that probabilities sum to 1. E.g. if $N = 2$, the distribution can be represented by a point $p \in [0, 1]$ on the one-dimensional line.)
Figure 11: Simplex
Figure 11 depicts such a simplex for three consequences. For each consequence, there is a vertex. The simplex is drawn with height 1, and a lottery $L = (p_1, p_2, p_3)$ is mapped to the unique point whose perpendicular distances from the three edges reflect the probabilities of the consequences.
Specifically, the distance of point $L$ from the edge opposite vertex 1, along the perpendicular line labeled $p_1$, is equal to the probability of consequence 1. The distance from the edge opposite vertex 2, which is labeled $p_2$, is the probability of consequence 2, etc. This technique takes advantage of the fact that the lengths of these perpendicular lines from a point to the edges always sum to 1. Hence the points in the simplex can represent probability distributions.
Definition. A compound lottery is a probability distribution $(\alpha_1, \ldots, \alpha_K)$ on a set of $K$ simple lotteries $L_1, \ldots, L_K$, where $\alpha_k \ge 0$ for $k = 1, \ldots, K$, and $\sum_{k=1}^{K} \alpha_k = 1$.
It is then an easy matter to reduce a compound lottery on $L_1, \ldots, L_K$, where $L_k = (p_1^k, \ldots, p_N^k)$, to a simple lottery $L = (p_1, \ldots, p_N)$ such that
$$p_n = \sum_{k=1}^{K} \alpha_k p_n^k$$
for $n = 1, \ldots, N$. I.e. the probability of consequence $n$ in the reduced lottery is the result of adding probabilities $p_n^1, \ldots, p_n^K$ over simple lotteries $L_1, \ldots, L_K$, where each $p_n^k$ is weighted by the probability $\alpha_k$ that $L_k$ is realized in the compound lottery.
We can write the reduced lottery as a vector sum:
$$L = \alpha_1 L_1 + \cdots + \alpha_K L_K.$$
Our focus will be entirely on simple lotteries, on the premise that the decision maker views any compound lottery as equivalent to the simple lottery it reduces to. Denote the set of all simple lotteries by $\mathcal{L}$.
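The reduction of a compound lottery is just a probability-weighted average of vectors, which a few lines of Python can illustrate. This is a minimal sketch: the function name `reduce_compound` and the numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def reduce_compound(weights: Sequence[float],
                    lotteries: Sequence[Sequence[float]]) -> List[float]:
    """Reduce a compound lottery to the simple lottery p_n = sum_k alpha_k * p_n^k."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights alpha_k must sum to 1")
    n_outcomes = len(lotteries[0])
    return [
        sum(alpha * lottery[n] for alpha, lottery in zip(weights, lotteries))
        for n in range(n_outcomes)
    ]


# Hypothetical example: K = 2 simple lotteries over N = 3 consequences.
L1 = [0.5, 0.5, 0.0]
L2 = [0.0, 0.2, 0.8]
reduced = reduce_compound([0.25, 0.75], [L1, L2])
print(reduced)  # approximately [0.125, 0.275, 0.6]
```

The reduced vector again sums to 1, so it is itself a point in the simplex.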
9.2 Preference over Lotteries
Analogously to the setting under certainty, we start by endowing the agent with a rational preference relation $\succsim$ that ranks all elements of $\mathcal{L}$, i.e. all simple lotteries. Then we add further axioms.
From two lotteries $L$ and $L'$ we can obtain a new lottery $\alpha L + (1-\alpha) L'$, where the probability of consequence $n$ in $\alpha L + (1-\alpha) L'$ is a weighted average of $n$'s probabilities in $L$ and $L'$, and $\alpha \in [0, 1]$ is the weight of $L$.
Definition. Preference $\succsim$ on $\mathcal{L}$ is mixture-continuous if, for all $L, L' \in \mathcal{L}$ and all $L'' \in \mathcal{L}$, the set of mixtures of $L$ and $L'$ that are preferred to $L''$,
$$\{ \alpha \in [0, 1] \ \text{s.t.} \ \alpha L + (1-\alpha) L' \succsim L'' \},$$
is closed, as is the set of mixtures of $L$ and $L'$ to which $L''$ is preferred,
$$\{ \alpha \in [0, 1] \ \text{s.t.} \ L'' \succsim \alpha L + (1-\alpha) L' \}.$$
Mixture continuity is a weaker property than continuity of $\succsim$ (i.e. upper and lower contour sets of $L''$ are closed). Whereas continuity implies that all convergent sequences in $\succsim(L'')$ have limits in $\succsim(L'')$, mixture continuity only requires this of a restricted set of sequences (those that can be constructed by mixing two lotteries and varying the weight $\alpha$). The same applies, of course, to $\precsim(L'')$.
The flavor is, however, much the same. Mixture continuity means that preference between two lotteries is robust to sufficiently small changes in their probabilities (in certain directions): if $\tilde{L}$ is very similar to $L$, then $L \succsim L'$ only if $\tilde{L} \succsim L'$.
One can think of some instances where people may violate continuity for extreme $\alpha$ (because their preferences over consequences are essentially lexicographic). E.g. most of us would never commit a violent crime (lottery $L$). We might prefer a completely law-abiding life (lottery $L'$) to small-scale tax evasion (lottery $L''$), and prefer small-scale tax evasion to a mostly law-abiding life with a small chance of committing a violent crime (mixing $L$ and $L'$), however small that chance is.
Such violations sound plausible, but they are often sensitive to framing. We have trouble imagining very small probabilities and treat them as if they were substantial. If you ask, "would you risk your life to watch a sports game?", you are likely to get a different answer than if you ask, "would you drive to your friend's house to watch the game?"
Continuity plays the same role here as under certainty: it guarantees that there exists a utility representation $U : \mathcal{L} \to \mathbb{R}$ such that $L \succsim L' \iff U(L) \ge U(L')$. Note that it is conventional to use the capital letter $U$ to distinguish a utility representation for a preference over lotteries from a utility representation for a preference over consumption bundles or consequences.
There is an obvious relationship between consequence $n$ and the degenerate lottery $L^n$ that assigns probability 1 to consequence $n$ (and zero to all others). The utility value of such a degenerate lottery will be denoted by $u_n = U(L^n)$.
Definition. Preference $\succsim$ on $\mathcal{L}$ satisfies independence if, for all $L, L' \in \mathcal{L}$,
$$L \succsim L' \iff \alpha L + (1-\alpha) L'' \succsim \alpha L' + (1-\alpha) L''$$
for all $L'' \in \mathcal{L}$ and all $\alpha \in (0, 1)$.
The independence axiom says that preference between $L$ and $L'$ should not be affected by rescaling the probabilities in both lotteries in a given direction.
A more intuitive rationale can be given by thinking of $\alpha$ as a randomization. Suppose you prefer $L$ to $L'$ and are offered compound lotteries that result in (1) $L$ with probability $\alpha$ and $L''$ with probability $1-\alpha$ and (2) $L'$ with probability $\alpha$ and $L''$ with probability $1-\alpha$. If $\alpha$ is the probability of some state of the world (e.g. the outcome of a coin toss), then whichever state is realized, you will be at least as happy with the outcome of compound lottery (1). Hence you should prefer (1).
The independence axiom is the fundamental difference between the formalizations of lottery preferences and consumption preferences. It is primarily responsible for the stronger results we will derive in the present context.
It is important to understand that the independence axiom only makes sense in the absence of complementarities, hence it could not be imposed on preferences over consumption bundles. While the commodities that make up a bundle are consumed together, the consequences of lotteries are mutually exclusive. I.e. mixing lotteries does not change the nature of the consequences in any way, only their probabilities.
9.3 Expected Utility Theorem
What the independence axiom gives us, in conjunction with mixture continuity, goes beyond the existence of a continuous utility function that represents lottery preference. It implies that the utility function is linear in character. This statement is the expected utility theorem. We will get to the actual theorem in a few steps.
Definition. Utility function $U : \mathcal{L} \to \mathbb{R}$ has the expected utility form if there exists $(u_1, \ldots, u_N) \in \mathbb{R}^N$ such that, for all $L = (p_1, \ldots, p_N) \in \mathcal{L}$,
$$U(L) = \sum_{n=1}^{N} p_n u_n.$$
A utility function with the expected utility form is called a von Neumann-Morgenstern (vNM) utility function.
Notice that the definition requires $(u_1, \ldots, u_N) \in \mathbb{R}^N$ to be a utility function for the restriction of the lottery preference to degenerate lotteries. For if $L = L^n$ (the degenerate lottery that assigns probability 1 to consequence $n$), then $U(L^n) = p_n u_n = u_n$ if $U(\cdot)$ has the expected utility form.
Exercise 53 (MWG 6.B.2). If $U(\cdot)$ represents a preference $\succsim$ on simple lotteries $\mathcal{L}$ and has the expected utility form, demonstrate that $\succsim$ satisfies the independence axiom.
Proposition. Utility function $U : \mathcal{L} \to \mathbb{R}$ has the expected utility form if and only if it is linear:
$$U\left( \sum_{k=1}^{K} \alpha_k L_k \right) = \sum_{k=1}^{K} \alpha_k U(L_k)$$
for any $K$ lotteries $L_1, \ldots, L_K \in \mathcal{L}$ and weights $\alpha_1, \ldots, \alpha_K \in [0, 1]$ such that $\sum_{k=1}^{K} \alpha_k = 1$.
Proof. (If) Any lottery $L = (p_1, \ldots, p_N)$ can be written in terms of degenerate lotteries $L^1, \ldots, L^N$ as $L = \sum_{n=1}^{N} p_n L^n$. If $U(\cdot)$ has the linearity property, then
$$U(L) = U\left( \sum_{n=1}^{N} p_n L^n \right) = \sum_{n=1}^{N} p_n U(L^n) = \sum_{n=1}^{N} p_n u_n,$$
so that $U(\cdot)$ has the expected utility form.
(Only if) Consider the compound lottery that assigns probabilities $(\alpha_1, \ldots, \alpha_K)$ to lotteries $L_1, \ldots, L_K$, where $L_k = (p_1^k, \ldots, p_N^k)$ for $k = 1, \ldots, K$. If $U(\cdot)$ has the expected utility property, then it assigns to the corresponding reduced lottery $\sum_{k=1}^{K} \alpha_k L_k = \left( \sum_{k=1}^{K} \alpha_k p_1^k, \ldots, \sum_{k=1}^{K} \alpha_k p_N^k \right)$ the value
$$U\left( \sum_{k=1}^{K} \alpha_k L_k \right) = \sum_{n=1}^{N} \left( \sum_{k=1}^{K} \alpha_k p_n^k \right) u_n = \sum_{k=1}^{K} \alpha_k \sum_{n=1}^{N} p_n^k u_n = \sum_{k=1}^{K} \alpha_k U(L_k)$$
(swapping the order of summation is permitted by distributivity). Hence $U(\cdot)$ is linear. ∎
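The equivalence between the expected utility form and linearity can be checked numerically. The sketch below uses hypothetical utility values $(u_1, u_2, u_3)$ and two hypothetical lotteries; it compares the utility of a mixture with the mixture of the utilities.

```python
def expected_utility(lottery, u):
    """U(L) = sum_n p_n * u_n for a lottery given as a probability vector."""
    return sum(p * un for p, un in zip(lottery, u))


def mix(weights, lotteries):
    """Reduced lottery sum_k alpha_k * L_k, computed coordinate by coordinate."""
    n_outcomes = len(lotteries[0])
    return [
        sum(a * lot[n] for a, lot in zip(weights, lotteries))
        for n in range(n_outcomes)
    ]


# Hypothetical utilities of the three degenerate lotteries.
u = [1.0, 0.4, 0.0]
L1 = [0.5, 0.5, 0.0]
L2 = [0.0, 0.2, 0.8]
alpha = [0.25, 0.75]

utility_of_mixture = expected_utility(mix(alpha, [L1, L2]), u)
mixture_of_utilities = sum(
    a * expected_utility(lot, u) for a, lot in zip(alpha, [L1, L2])
)
print(utility_of_mixture, mixture_of_utilities)  # both approximately 0.235
```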
Figure 12: Indifference curves with the expected utility form
Linearity of the utility function implies linear indifference curves for lotteries in the simplex, as in Figure 12. Suppose $L \sim L'$, so that $U(L) = U(L') = \alpha U(L) + (1-\alpha) U(L')$. If $U(\cdot)$ is linear, then $\alpha U(L) + (1-\alpha) U(L') = U(\alpha L + (1-\alpha) L')$, i.e. any convex combination of $L$ and $L'$ is indifferent to $L$ and $L'$.
A utility function that has the expected utility form is therefore associated with linear indifference curves.
According to the expected utility theorem, the independence axiom essentially implies the expected utility form (the only other property needed is continuity). The connection is easy to see graphically, since nonlinear indifference curves violate the independence axiom.
Figure 13: Independence induces linear and parallel indifference curves
Consider the left panel of Figure 13. The curved indifference set contains $L$ and $L'$, but not $L'' = \alpha L + (1-\alpha) L'$. However, since $L \sim L'$, the independence axiom says $\alpha L + (1-\alpha) L' \sim \alpha L' + (1-\alpha) L'$, i.e. $L'' \sim L'$. Such a contradiction arises whenever an indifference curve is nonlinear.
Furthermore, the independence axiom requires the indifference curves to be parallel (as in Figure 12). The right panel of Figure 13 depicts how nonparallel indifference curves cause a contradiction. Lotteries $L_1$ and $L_1'$ yield utility $U_1$, and we construct lotteries $L_2$ and $L_2'$ by respectively mixing $L_1$ and $L_1'$ identically with a lottery $L$. I.e. $L_2 = \alpha L_1 + (1-\alpha) L$ and $L_2' = \alpha L_1' + (1-\alpha) L$ for some $\alpha \in (0, 1)$. Since $L_1 \sim L_1'$, the independence axiom says $\alpha L_1 + (1-\alpha) L \sim \alpha L_1' + (1-\alpha) L$, i.e. $L_2 \sim L_2'$. But if the indifference line through $L_2$ is not parallel to that for $L_1$ and $L_1'$, then it cannot contain $L_2'$.
Here is the expected utility theorem.
Proposition. If rational preference $\succsim$ on $\mathcal{L}$ satisfies continuity and independence, then $\succsim$ admits a utility representation that has the expected utility form.
Proof. I start with a few observations about $\succsim$ that are intuitive, but should and can be proven formally from the axioms. There exist lotteries $\bar{L}$ and $\underline{L}$ that are, respectively, preferred to all and to none of the $L \in \mathcal{L}$. If $\alpha, \beta \in [0, 1]$, then $\beta > \alpha \iff \beta \bar{L} + (1-\beta) \underline{L} \succ \alpha \bar{L} + (1-\alpha) \underline{L}$.
For any $L \in \mathcal{L}$ it is possible to find a unique $\alpha_L \in [0, 1]$ such that $L \sim \alpha_L \bar{L} + (1-\alpha_L) \underline{L}$. (We verified it in constructing a utility function for a continuous rational preference. This is where the continuity axiom is needed.) Now we will show that $U(L) = \alpha_L$ is a utility function for $\succsim$ and has the expected utility form.
By the definition of $\alpha_L$, $L \succsim L'$ if and only if
$$L \sim \alpha_L \bar{L} + (1-\alpha_L) \underline{L} \succsim \alpha_{L'} \bar{L} + (1-\alpha_{L'}) \underline{L} \sim L',$$
i.e. if and only if $\alpha_L \ge \alpha_{L'}$. Therefore, if $U(L) = \alpha_L$ for all $L \in \mathcal{L}$, then $L \succsim L' \iff U(L) \ge U(L')$, so that $U(\cdot)$ represents $\succsim$.
The expected utility form is equivalent to linearity, i.e. $U\left( \sum_{k=1}^{K} \beta_k L_k \right) = \sum_{k=1}^{K} \beta_k U(L_k)$ for all $L_1, \ldots, L_K$ and all $\beta_1, \ldots, \beta_K \ge 0$ such that $\sum_{k=1}^{K} \beta_k = 1$. Since $L_k \sim \alpha_{L_k} \bar{L} + (1-\alpha_{L_k}) \underline{L}$, induction on the independence axiom gives
$$L = \sum_{k=1}^{K} \beta_k L_k \sim \sum_{k=1}^{K} \beta_k \left[ \alpha_{L_k} \bar{L} + (1-\alpha_{L_k}) \underline{L} \right] = \left( \sum_{k=1}^{K} \beta_k \alpha_{L_k} \right) \bar{L} + \left( 1 - \sum_{k=1}^{K} \beta_k \alpha_{L_k} \right) \underline{L}.$$
Thus $\alpha_L = \sum_{k=1}^{K} \beta_k \alpha_{L_k}$, so by the construction of $U(\cdot)$,
$$U\left( \sum_{k=1}^{K} \beta_k L_k \right) = \sum_{k=1}^{K} \beta_k \alpha_{L_k} = \sum_{k=1}^{K} \beta_k U(L_k).$$
∎
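Exercise 54 (MWG 6.B.3). If the set of outcomes $C$ is finite and rational preference $\succsim$ on $\mathcal{L}$ satisfies independence, demonstrate that there are best and worst lotteries $\bar{L}$ and $\underline{L}$ in $\mathcal{L}$, such that $\bar{L} \succsim L \succsim \underline{L}$ for all $L \in \mathcal{L}$.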
A utility function that represents a preference over consumption bundles is unique up to strictly increasing transformation. Hence it reflects the ordinal nature of the preference: there is no significance in the magnitude $|u(x) - u(x')|$, only in the fact that $u(x) \ge u(x')$ or $u(x) \le u(x')$. If $|u(x) - u(x')| > |u(x) - u(x'')|$, we can typically find another utility function that represents the same preference, but where $|\tilde{u}(x) - \tilde{u}(x')| < |\tilde{u}(x) - \tilde{u}(x'')|$.
In contrast, the next result indicates that a vNM expected utility function is cardinal in nature, because it is unique only up to positive linear transformation, which preserves the (relative) magnitude of differences. (If $|U(L) - U(L')| > |U(L) - U(L'')|$, then for all $\beta > 0$, $|\beta U(L) - \beta U(L')| > |\beta U(L) - \beta U(L'')|$.)
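Proposition. If utility functions $U : \mathcal{L} \to \mathbb{R}$ and $\tilde{U} : \mathcal{L} \to \mathbb{R}$ represent preference $\succsim$ on $\mathcal{L}$, and $U(\cdot)$ has the expected utility form, then $\tilde{U} : \mathcal{L} \to \mathbb{R}$ has the expected utility form if and only if it is a positive linear transformation of $U(\cdot)$, i.e. there exist $\beta > 0$ and $\gamma \in \mathbb{R}$ such that, for all $L \in \mathcal{L}$, $\tilde{U}(L) = \beta U(L) + \gamma$.
Proof. (If) If $U(L)$ has the expected utility form and $\tilde{U}(L) = \beta U(L) + \gamma$, then $\tilde{U}(L)$ is linear: since $L = \sum_{k=1}^{K} \alpha_k L_k$ for some set of lotteries $L_1, \ldots, L_K$ and weights $\alpha_1, \ldots, \alpha_K$,
$$\tilde{U}(L) = \tilde{U}\left( \sum_{k=1}^{K} \alpha_k L_k \right) = \beta U\left( \sum_{k=1}^{K} \alpha_k L_k \right) + \gamma = \beta \left( \sum_{k=1}^{K} \alpha_k U(L_k) \right) + \gamma = \sum_{k=1}^{K} \alpha_k \left( \beta U(L_k) + \gamma \right) = \sum_{k=1}^{K} \alpha_k \tilde{U}(L_k),$$
where the third and fourth equalities reflect linearity of $U(\cdot)$ and that $\sum_{k=1}^{K} \alpha_k = 1$. Linearity of $\tilde{U}(\cdot)$ implies the expected utility form.
(Only if) Suppose $U(\cdot)$ and $\tilde{U}(\cdot)$ both have the expected utility form (thus, both are linear). We will construct constants $\beta > 0$ and $\gamma$ such that, for all $L \in \mathcal{L}$, $\tilde{U}(L) = \beta U(L) + \gamma$. Unless all lotteries are indifferent (in which case $U(\cdot)$ and $\tilde{U}(\cdot)$ are constant functions, so that the property holds), there are best and worst lotteries $\bar{L}$ and $\underline{L}$ in $\mathcal{L}$ such that $\bar{L} \succ \underline{L}$. (Since $U(\cdot)$ is continuous, it has a maximizer and minimizer on the probability simplex $\mathcal{L}$, which is a compact set.)
For every $L \in \mathcal{L}$, define
$$\lambda_L = \frac{U(L) - U(\underline{L})}{U(\bar{L}) - U(\underline{L})}.$$
Rearrangement gives
$$U(L) = \lambda_L U(\bar{L}) + (1-\lambda_L) U(\underline{L})$$
and, because $U(\cdot)$ is linear, $U(L) = U\left( \lambda_L \bar{L} + (1-\lambda_L) \underline{L} \right)$. Thus $L \sim \lambda_L \bar{L} + (1-\lambda_L) \underline{L}$.
Because $\tilde{U}(\cdot)$ represents $\succsim$ and is linear,
$$\tilde{U}(L) = \tilde{U}\left( \lambda_L \bar{L} + (1-\lambda_L) \underline{L} \right) = \lambda_L \tilde{U}(\bar{L}) + (1-\lambda_L) \tilde{U}(\underline{L}) = \beta U(L) + \gamma,$$
where
$$\beta = \frac{\tilde{U}(\bar{L}) - \tilde{U}(\underline{L})}{U(\bar{L}) - U(\underline{L})} > 0, \qquad \gamma = \tilde{U}(\underline{L}) - \beta U(\underline{L}).$$
∎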
Exercise 55 (MWG 6.B.4). A safety agency is looking for an evacuation criterion for an area that has a 1% probability of flooding. Four things can happen: (A) No evacuation is necessary, and none is performed. (B) An unnecessary evacuation is performed. (C) A necessary evacuation is performed. (D) A necessary evacuation is not performed.
The agency is indifferent between sure outcome (B) and a scenario where (A) occurs with probability $p \in (0, 1)$ and (D) occurs with probability $1 - p$. The agency is also indifferent between sure outcome (C) and a scenario where (A) occurs with probability $q \in (0, 1)$ and (D) occurs with probability $1 - q$. Moreover, it prefers (A) to (D). Suppose the expected utility theorem applies.
(a) Construct an expected utility function for the agency.
(b) Compare the following policy criteria: (1) Evacuate in 90% of flooding instances, and evacuate unnecessarily in 10% of the instances where no flooding occurs. (2) Evacuate in 95% of flooding instances, and evacuate unnecessarily in 15% of instances where no flooding occurs. Derive probability distributions over the four outcomes under these criteria and decide, based on the expected utility function, which criterion the agency prefers.
(Note that this version of the problem corrects two typos in MWG.)
9.4 Paradoxes
Expected utility (or, rather, the underlying independence axiom) entails some specific predictions about choices. These predictions frequently fail in some famous experiments, which has led to various alternative axiomatizations that try to explain the observed behavior.
Consider four lotteries that offer the following probabilities over three prizes:
            \$2.5 million    \$0.5 million    \$0 million
$L_1$           0                1                0
$L_1'$          0.10             0.89             0.01
$L_2$           0                0.11             0.89
$L_2'$          0.10             0                0.90
The decision maker is asked to compare $L_1$ to $L_1'$ and $L_2$ to $L_2'$.
It is often observed (roughly half the time) that $L_1 \succ L_1'$ (giving up the chance to win a greater prize to avoid a small risk of zero), while $L_2' \succ L_2$ (accepting a slightly increased risk of zero for the prospect of a greater prize).
These preferences are not necessarily irrational, but they are inconsistent with expected utility theory. Notice that in both cases $L'$ is obtainable from $L$ by taking 0.11 out of the probability of winning \$0.5 million and distributing 0.01 to the probability of \$0 and 0.10 to the probability of \$2.5 million. Only the initial probabilities differ, but according to the independence axiom they should not affect preference for the adjustment.
Explicitly, if $L_1 \succ L_1'$, then $U(L_1) = u_{0.5} > 0.10\, u_{2.5} + 0.89\, u_{0.5} + 0.01\, u_0$, which implies, after adding $0.89\, (u_0 - u_{0.5})$ on both sides, $0.11\, u_{0.5} + 0.89\, u_0 > 0.10\, u_{2.5} + 0.90\, u_0$, i.e. $L_2 \succ L_2'$. But $L_2' \succ L_2$ was observed. This is known as the Allais paradox.
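The algebra behind the Allais paradox can be checked numerically: for any utilities assigned to the three prizes, the expected utility gap between $L_1$ and $L_1'$ equals the gap between $L_2$ and $L_2'$, so the two observed strict preferences cannot both hold. The following is a minimal sketch; the randomly drawn utility values are purely illustrative.

```python
import random

# Probabilities over the prizes ($2.5 million, $0.5 million, $0), from the table.
L1 = (0.00, 1.00, 0.00)
L1_PRIME = (0.10, 0.89, 0.01)
L2 = (0.00, 0.11, 0.89)
L2_PRIME = (0.10, 0.00, 0.90)


def expected_utility(lottery, u):
    """U(L) = sum_n p_n * u_n."""
    return sum(p * un for p, un in zip(lottery, u))


def allais_gaps(u):
    """Return (U(L1) - U(L1'), U(L2) - U(L2')) for prize utilities u."""
    gap_1 = expected_utility(L1, u) - expected_utility(L1_PRIME, u)
    gap_2 = expected_utility(L2, u) - expected_utility(L2_PRIME, u)
    return gap_1, gap_2


def rationalizes_allais(u):
    """True if u yields L1 strictly over L1' AND L2' strictly over L2."""
    gap_1, gap_2 = allais_gaps(u)
    return gap_1 > 1e-12 and gap_2 < -1e-12


random.seed(0)
# Hypothetical utility vectors (u_2.5, u_0.5, u_0), sorted so more money is better.
trials = [sorted((random.random() for _ in range(3)), reverse=True)
          for _ in range(1000)]
max_gap_difference = max(abs(g1 - g2) for g1, g2 in map(allais_gaps, trials))
any_rationalizing = any(rationalizes_allais(u) for u in trials)
print(max_gap_difference, any_rationalizing)
```

Both gaps reduce to $0.11\, u_{0.5} - 0.10\, u_{2.5} - 0.01\, u_0$, which is why they always share the same sign.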
Perhaps people worry about regrets they might have if a bad outcome occurs. (I.e. if $L_1'$ results in zero, it is viewed as a loss of \$0.5 million that the agent would have had with $L_1$, and therefore seems somehow worse than zero. In contrast, if $L_2'$ results in zero, there was no way to guarantee a non-zero prize by choosing $L_2$.)
Exercise 56 (MWG 6.B.5). The following property is known as the "betweenness axiom": for all $L, L' \in \mathcal{L}$ and all $\lambda \in (0, 1)$, if $L \sim L'$, then $\lambda L + (1-\lambda) L' \sim L$. Suppose there are three possible outcomes.
(a) Show that a preference relation on lotteries satisfies independence only if it satisfies betweenness.
(b) Depict in the simplex that, if the continuity and betweenness axioms hold, then indifference curves of a lottery preference must be straight lines. Conversely, depict that straight-line indifference curves imply betweenness.
(c) Argue (from a graphic comparison) that betweenness is weaker than independence.
(d) Draw an indifference map that satisfies betweenness and produces the choices from the Allais paradox.
Consider now the following experiment. The alternatives are lotteries over outcomes: go to Venice, watch a movie about Venice, stay at home. Even though you prefer the outcomes in this order, you may prefer to randomize between going to Venice and staying at home, rather than between going to Venice and watching the movie. This would make sense if you expect that, in case you cannot go to Venice (but were hoping to), you will no longer enjoy the movie.
But it clearly violates the independence axiom. This is known as Machina's paradox. It reminds us that preference need not be fixed, but may depend on realizations of events. If you want to think about it in terms of regrets, the regret here is not over a choice the decision maker failed to make, but over the outcome "nature" chose (which is beyond the agent's control).
Perhaps the most influential critique of expected utility theory is based on the Ellsberg paradox. Suppose there are two urns, $R$ and $H$, that each contain 100 balls of white or black color. The proportion of colors is known for $R$ (there are 49 white and 51 black balls), but not for $H$. The decision maker will win a \$1,000 prize if he can pick a ball of a specified color (i.e. either white or black) from one of the urns. All he has to do is choose the urn from which to take the ball.
Many people will always take the ball from $R$ in successive experiments where first a black ball and then a white ball wins. If a black ball wins, then $R$ induces a lottery $R_B$ over outcomes \$1,000 and \$0 with probabilities 0.51 and 0.49, whereas $H$ induces a lottery $H_B$ with unknown probabilities $\pi$ and $1 - \pi$. If a white ball wins, $R$ induces $R_W = (0.49, 0.51)$ and $H$ induces $H_W = (1 - \pi, \pi)$.
The problem is that there is no way to assign a probability $\pi$ (that a black ball is chosen from $H$) that could justify choosing $R$ in both experiments. Since the agent wants to maximize expected winnings, he can prefer $R_B = (0.51, 0.49)$ to $H_B$ only if $\pi < 0.51$. But then $1 - \pi > 0.49$, so that he must prefer $H_W$ to $R_W$.
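The impossibility can also be seen by brute force. The sketch below scans a grid of candidate beliefs $\pi$ (the believed probability of drawing a black ball from $H$) and looks for one under which urn $R$ is strictly preferred in both experiments; the grid resolution and the function name are arbitrary choices made for illustration.

```python
def prefers_R_in_both(pi):
    """Check whether a single belief pi makes R strictly better in both experiments.

    pi is the believed probability that a ball drawn from urn H is black.
    """
    r_better_when_black_wins = 0.51 > pi        # R_B versus H_B
    r_better_when_white_wins = 0.49 > 1.0 - pi  # R_W versus H_W
    return r_better_when_black_wins and r_better_when_white_wins


# Candidate beliefs 0.000, 0.001, ..., 1.000.
grid = [k / 1000 for k in range(1001)]
rationalizing_beliefs = [pi for pi in grid if prefers_R_in_both(pi)]
print(rationalizing_beliefs)  # []
```

The two conditions require $\pi < 0.51$ and $\pi > 0.51$ at the same time, so the list is empty for any grid.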
This failure of expected utility theory is fundamental: behavior appears to be at times inconsistent with the notion that individuals choose between known (or even estimated) probability distributions over outcomes.
As we will see, expected utility theory can be extended to replace the objective probabilities in the definition of lotteries with subjective (implicitly believed) probabilities of events. But this does not address the Ellsberg paradox, which contradicts the existence of unique probabilities altogether.
An important current research area in decision theory is therefore non-expected utility theory, where agents may e.g. have ambiguous beliefs (allowing for multiple probability distributions), and choices could reflect optimistic or pessimistic expectations. These situations, where agents do not use simple probabilistic information or beliefs, are said to be characterized by (Knightian) uncertainty, rather than risk.
Exercise 57 (MWG 6.F.2). In the setting of the Ellsberg paradox, let $u(0) = 0$ and $u(1000) = 1$ represent the decision maker's preferences over sure amounts of money. A probabilistic belief that the color of the $H$-ball is white can be expressed as $\pi \in [0, 1]$. Suppose, however, that the decision maker has a set $P \subseteq [0, 1]$ of such beliefs. The available actions are $R$ and $H$ (respectively, picking the ball from urn $R$ and $H$). Denote by $W$ the choice situation where the \$1000 prize is won if the ball is white and \$0 otherwise. In choice situation $B$, a black ball wins \$1000, and the decision maker gets \$0 otherwise.
For each choice situation, let the utility function over actions $R$ and $H$ be as follows. In $W$, $U_W : \{R, H\} \to \mathbb{R}$ is such that $U_W(R) = 0.49$ and $U_W(H) = \min \{ \pi \ \text{s.t.} \ \pi \in P \}$. In $B$, $U_B : \{R, H\} \to \mathbb{R}$ is such that $U_B(R) = 0.51$ and $U_B(H) = \min \{ 1 - \pi \ \text{s.t.} \ \pi \in P \}$. I.e. $U_W(R)$ is the expected utility of \$1000 given the (objective) probability implied by the number of white and black balls in urn $R$. But $U_W(H)$ is the expected utility of \$1000 based on the most pessimistic probability in $P$. (This is an instance of Gilboa and Schmeidler's theory of nonunique prior beliefs.)
(a) Show that if $P$ consists of a single belief, then $U_W$ and $U_B$ are derived from a vNM utility function, and $U_W(R) > U_W(H) \iff U_B(R) < U_B(H)$.
(b) Find a set $P$ such that $U_W(R) > U_W(H)$ and $U_B(R) > U_B(H)$.
Exercise 58 (MWG 6.B.6). Sometimes, an agent's preference over lotteries depends on a prior action $a$ (e.g. when you have to bring wine to dinner, you would like to know what kind of meat will be served, so you prefer a degenerate lottery). Such a preference has an induced utility representation
$$U(L) = \max_{a \in A} \sum_{n=1}^{N} p_n u_n(a)$$
for all $L = (p_1, \ldots, p_N) \in \mathcal{L}$, where $u_n(a)$ is the utility assigned to degenerate lottery $L^n$ if action $a$ is taken. Show that $U(\cdot)$ is convex, but (by example) not necessarily linear.
9.5 State-Space Approaches
It is often possible and useful to impose a bit more structure where we have so far assumed given probabilities. Let $S$ be a set of states (decision-relevant situations that may materialize), where the probability $\pi_s$ of state $s \in S$ is objectively known (we will relax this to subjective knowledge later).
The choice objects are now taken to be random variables $x : S \to C$, i.e. functions that determine which consequence will occur in each state. (These functions are also called acts.)
Such random variables are closely related to lotteries via the probability distribution over states. While a lottery assigns probabilities to consequences, a random variable, or act, assigns consequences to events that are characterized by their probabilities. Both types of choice objects therefore induce a probability distribution over consequences.
Preference is defined on the space of random variables $\mathcal{X}$ and, if it satisfies continuity and a variant of independence called the sure-thing principle, can be represented by a function that has a modified expected utility form.
Definition. Utility function $U : \mathcal{X} \to \mathbb{R}$ has the extended expected utility form if there exists, for every $s \in S$, a function $u_s : C \to \mathbb{R}$ such that, for all $x = (x_1, \ldots, x_S) \in \mathcal{X}$,
$$U(x) = \sum_{s=1}^{S} \pi_s u_s(x_s).$$
Comparing this to the expected utility form, $U(L) = \sum_{n=1}^{N} p_n u_n$, the probability $p_n$ of consequence $n$ (that is determined by the choice of lottery $L$) is replaced by state probability $\pi_s$. And utility $u_n$ of the lottery that is degenerate in consequence $n$ is replaced by utility $u_s(x_s)$ of consequence $x_s$ (which the act $x$ associates with state $s$).
An extended expected utility function represents preferences on $\mathcal{X}$ that satisfy continuity and the sure-thing principle (provided there are at least three states). Informally, the sure-thing principle says that preference between $x$ and $x'$ should be determined on states in which they disagree.
To express the sure-thing principle formally, I introduce some new language. Denote by $x_E$ the restriction of random variable $x$ on $S$ to the event $E \subseteq S$, and by $x_{-E}$ the restriction of $x$ to the complement of $E$. (I.e. $x_E$ assigns states $s \in E$ to consequences.)
Say that $x \succsim_E x'$ if $x_E x'_{-E} \succsim x'$ (i.e. $x$ is preferred to $x'$ when $x_{-E}$ is replaced by $x'_{-E}$, so that $x$ differs from $x'$ only on $E$). One can read $x \succsim_E x'$ as "$x$ is preferred to $x'$ on $E$."
Definition. Preference $\succsim$ on $\mathcal{X}$ satisfies the sure-thing principle if, for all $E \subseteq S$, $x \succsim_E x' \iff \tilde{x} \succsim_E \tilde{x}'$ whenever $x_E = \tilde{x}_E$ and $x'_E = \tilde{x}'_E$.
The axiom applies when consequences are identical for states outside the event $E$, so that the choice between $x$ and $x'$, respectively $\tilde{x}$ and $\tilde{x}'$, matters only if $E$ occurs. (Hence the name "sure thing" - if what happens outside $E$ cannot be changed, it should not affect preference.) So if $x \succsim x'$, and $x_E = \tilde{x}_E$, and $x'_E = \tilde{x}'_E$, then it should be the case that $\tilde{x} \succsim \tilde{x}'$.
The sure-thing principle takes the place of the independence axiom in the present setting. It is analogous in that it requires preference to be independent of events on which the acts do not differ. (Or, in alternative formulations, events that have zero probability of occurring.)
So far, we have taken for granted that there are objective probabilities for the states. It is, of course, rare to have such information. In fact, nothing forces us to interpret the $\pi_1, \ldots, \pi_S$ in the extended expected utility function as objective probabilities.
Suppose preference $\succsim$ on $\mathcal{X}$ respects continuity and the sure-thing principle, so that it has an extended expected utility representation $U(x) = \sum_{s=1}^{S} \pi_s u_s(x_s)$. This tells us that, for every state $s$ and act $x$, the value $\pi_s u_s(x_s)$ is uniquely determined up to positive linear transformation. If $\pi_s$ is not given objectively, then it is arbitrary.
Is there a compelling way to "disentangle" $\pi_s$ from $u_s(x_s)$, i.e. fix $\pi_s$ so that we can interpret it as the subjective probability the agent assigns implicitly to state $s$? If we are willing to require that preference is state-uniform (depends only on the consequence chosen in state $s$, but ranks consequences the same in every state), then $u_s(x_s) = u(x_s)$ for all $s \in S$, and $\pi_s$ is determined through $\pi_s u(x_s)$ up to scaling. If the $\pi_1, \ldots, \pi_S$ are to be interpreted as probabilities, then $\sum_{s=1}^{S} \pi_s = 1$, so that they are completely determined.
The result that a preference on $\mathcal{X}$ that satisfies continuity, the sure-thing principle and state uniformity admits a utility representation of the form $U(x) = \sum_{s=1}^{S} \pi_s u(x_s)$, with unique probabilities $\pi_1, \ldots, \pi_S$, is known as the subjective expected utility theorem, due to Savage. It implies that expected utility theory does not depend on factual knowledge of probabilities, but can be built around personal beliefs that are probabilistic.
10 Risk
10.1 Money Lotteries
A lottery over continuous amounts of money $x \in \mathbb{R}$ can be described most generally in terms of its cumulative distribution function $F : \mathbb{R} \to [0, 1]$. (The more direct approach would be to use density functions $f(\cdot)$, but these do not always exist and exclude the case of discrete outcomes. If $f(\cdot)$ does exist, then $F(x) = \int_{-\infty}^{x} f(t)\, dt$.)
We now take the lottery space $\mathcal{L}$ to be the set of distribution functions on $\mathbb{R}$. The continuous version of the expected utility theorem guarantees that a continuous preference $\succsim$ on $\mathcal{L}$ that satisfies the independence axiom can be represented by
$$U(F) = \int u(x)\, dF(x).$$
(Note that $f(x) = dF(x)/dx$, so that $dF(x) = f(x)\, dx$, and thus $\int u(x)\, dF(x) = \int u(x) f(x)\, dx$ whenever the density exists.)
The function $u(\cdot)$, which records the values of degenerate lotteries, i.e. certain amounts of money, is called the Bernoulli utility function. Assume that $u(\cdot)$ is increasing, continuous and bounded. (If it were unbounded, small probability events could make a lottery infinitely desirable - the St. Petersburg paradox.)
Exercise (MWG 6.C.2). Suppose an individual's Bernoulli utility function $u(\cdot)$ has the quadratic form $u(x) = \beta x^2 + \gamma x$. Show that utility from a distribution is determined by the mean and variance of the distribution, and only these moments. (No need to do part b.)
10.2 Risk Attitude
A risk-averse agent is someone who rejects fair gambles (that have neither an expected gain nor loss).
Definition. An agent is risk-averse if she prefers the expected money value of a lottery to the lottery itself: for all $F \in \mathcal{L}$,
$$u\left( \int x\, dF(x) \right) \ge U(F).$$
(Strictly risk-averse if this is an equality only if $F(\cdot)$ is degenerate.) The agent is risk-neutral if always indifferent between the expected value of a lottery and the lottery itself, i.e. the above is an equality.
The criterion for risk aversion implies, when $U(\cdot)$ has the expected utility form,
$$u\left( \int x\, dF(x) \right) \ge \int u(x)\, dF(x).$$
This is Jensen's inequality, which characterizes a concave function $u(\cdot)$. (If you think of the integral over the probability distribution $F(\cdot)$ as a weighted sum, the inequality relates to the basic definition of a concave function, $u(\alpha x + (1-\alpha) x') \ge \alpha u(x) + (1-\alpha) u(x')$.)
Hence the Bernoulli utility function of a risk-averse agent is concave, and anyone with a concave Bernoulli function is risk-averse. This fact has a straightforward explanation: the risk-averse agent's Bernoulli utility increases more slowly with a gain than it decreases with a loss. Since the agent has, in utility terms, more to lose than gain from a lottery that is fair in money terms, she declines the lottery unless the odds are strictly favorable.
In Figure 14, the expected utility of a fair lottery, i.e. random variable $X : \{x_1, x_2\} \to [0, 1]$ with equal probabilities, is labeled $U(X)$. The Bernoulli utility of certain amount $E(X) = (1/2)\, x_1 + (1/2)\, x_2$ is labeled $u(E(X))$. The individual in the left panel is risk-averse, and $U(X) \le u(E(X))$. (Observe how $x_1$ and $x_2$ along the $x$-axis are equidistant from $E(X)$, while the corresponding utility value $u(E(X))$ is closer to the utility of the better state, $u(x_2)$.)
On the right panel, we have the contrasting case of a risk-seeker, whose Bernoulli utility is convex.
Figure 14: Bernoulli utility functions for risk-averter (left) and risk-seeker (right)
Definition. The certainty equivalent $c(F, u)$ of lottery $F(\cdot)$ is its money value to an agent whose preference is represented by Bernoulli utility function $u(\cdot)$:
$$u(c(F, u)) = \int u(x)\, dF(x).$$
It is intuitive that a risk-averse person has a certainty equivalent less than the expected value of the lottery, i.e. she only values the lottery the same if it produces on average a gain relative to the certainty equivalent. Indeed, concavity of the (increasing) Bernoulli utility function implies:
$$\int u(x)\, dF(x) \le u\left( \int x\, dF(x) \right) \iff u(c(F, u)) \le u\left( \int x\, dF(x) \right) \iff c(F, u) \le \int x\, dF(x).$$
Figure 15 shows the certainty equivalent for risk-averse and risk-seeking agents.
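A small numerical illustration of the certainty equivalent may help. The sketch below assumes a discrete two-outcome lottery and the concave Bernoulli utility $u(x) = \sqrt{x}$; both are hypothetical choices made for illustration (and $\sqrt{x}$ is not bounded, unlike the $u(\cdot)$ assumed in the text).

```python
import math


def certainty_equivalent(outcomes, probs, u, u_inverse):
    """Solve u(c) = sum_i p_i * u(x_i) for c, using the inverse of u."""
    expected_u = sum(p * u(x) for x, p in zip(outcomes, probs))
    return u_inverse(expected_u)


# Hypothetical lottery: $0 or $100 with equal probability.
outcomes = [0.0, 100.0]
probs = [0.5, 0.5]

expected_value = sum(p * x for x, p in zip(outcomes, probs))
ce = certainty_equivalent(outcomes, probs, math.sqrt, lambda v: v ** 2)
print(expected_value, ce)  # 50.0 25.0
```

With this concave utility the agent values the lottery at 25, half of its expected value of 50, in line with $c(F, u) \le \int x\, dF(x)$.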
Beyond these formal characterizations, risk-averse behavior is evident in the propensity to buy insurance, even when premia are not "actuarially fair" (i.e. the expected payout is less than the premium).
Figure 15: Certainty equivalent for risk-averter (left) and risk-seeker (right)
Example. A strictly risk-averse agent with initial wealth $w$ faces a possible damage of $D$ dollars with probability $\pi$. The agent is offered insurance at a fair premium $\pi$ per dollar-payout in the event of a loss. If $\alpha$ dollars of insurance (i.e. conditional payout) are purchased at this premium, the agent's wealth will be either $w - \pi\alpha$ (if no damage occurs) or $w - \pi\alpha - D + \alpha = w + (1 - \pi)\alpha - D$ (if there is damage). Expected utility from a choice of $\alpha$, which induces a lottery over $w + (1 - \pi)\alpha - D$ and $w - \pi\alpha$ with probabilities $(\pi, 1 - \pi)$, is then
$$U(\alpha) = \pi\, u(w + (1 - \pi)\alpha - D) + (1 - \pi)\, u(w - \pi\alpha).$$
The first-order condition with respect to $\alpha$,
$$u'(w + (1 - \pi)\alpha^* - D) = u'(w - \pi\alpha^*),$$
can be solved for $w + (1 - \pi)\alpha^* - D = w - \pi\alpha^*$, i.e. $\alpha^* = D$, because $u'(\cdot)$ is strictly decreasing from strict concavity. Thus, a risk-averse agent insures fully if the premium is actuarially fair.
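The full-insurance result can be illustrated with a small numerical search. This is a sketch under assumed values (log utility, $w = 10$, $D = 4$, $\pi = 0.25$, and a grid search in place of solving the first-order condition); none of these numbers come from the notes.

```python
import math

def expected_utility(alpha, w, D, pi, u):
    """Expected utility of alpha dollars of coverage at premium pi per dollar."""
    wealth_if_damage = w + (1 - pi) * alpha - D
    wealth_if_no_damage = w - pi * alpha
    return pi * u(wealth_if_damage) + (1 - pi) * u(wealth_if_no_damage)

def best_coverage(w, D, pi, u, steps=1000):
    """Grid search for the coverage level in [0, 2D] with highest expected utility."""
    grid = [2 * D * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: expected_utility(a, w, D, pi, u))

# Assumed parameters: strictly concave u = ln, fair premium equal to the loss probability.
w, D, pi = 10.0, 4.0, 0.25
alpha_star = best_coverage(w, D, pi, math.log)
print(alpha_star)
```

The search range extends to $2D$ so that over-insurance is allowed; the maximizer still sits at $\alpha = D$.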
Example. Suppose an amount of wealth can be invested in a safe asset, which yields $1, and a risky asset with earnings distribution $F(z)$ such that the expected return $\int z\, dF(z)$ is greater than 1. The portfolio choice problem is to determine the optimal shares $\alpha$ and $\beta$ to invest in these assets, such that $\alpha + \beta = 1$. Since the random return, given $\alpha$ and $\beta$, is $\alpha + \beta z$, utility is a random variable $u(\alpha + \beta z) = u(1 - \beta + \beta z)$. An expected utility maximizer solves:
$$\max_{\beta \in [0,1]} \int u(1 + \beta(z - 1))\, dF(z),$$
which has first-order condition, with respect to $\beta$,
$$\int (z - 1)\, u'(1 + \beta^*(z - 1))\, dF(z) = 0$$
(at an interior solution). Since the left side is greater than zero at $\beta = 0$, given $\int z\, dF(z) > 1$ and that $u(\cdot)$ is increasing everywhere, we must have $\beta^* > 0$, whether or not the individual is risk-averse. The general principle is that an agent will always invest some share of wealth in an actuarially favorable asset.
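A discrete version of the portfolio problem shows the same conclusion. The sketch below assumes a two-point return distribution (0.5 or 1.6 with equal probability, so the expected return is 1.05) and compares a log-utility investor with a risk-neutral one; the numbers and function names are illustrative only.

```python
import math

def expected_utility(beta, returns, probs, u):
    """E[u(1 + beta (z - 1))] for a discrete distribution of the risky return z."""
    return sum(p * u(1 + beta * (z - 1)) for z, p in zip(returns, probs))

def best_share(returns, probs, u, steps=1000):
    """Grid search for the risky share beta in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda b: expected_utility(b, returns, probs, u))

# Assumed risky asset: expected return 0.5 * 0.5 + 0.5 * 1.6 = 1.05 > 1.
returns, probs = [0.5, 1.6], [0.5, 0.5]

beta_log = best_share(returns, probs, math.log)          # risk-averse investor
beta_linear = best_share(returns, probs, lambda x: x)    # risk-neutral investor
print(beta_log, beta_linear)
```

For the log investor the first-order condition gives $\beta^* = 1/6$, so the risky share is positive but small; the risk-neutral investor puts everything in the risky asset.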
Exercise (MWG 6.C.19).
Risk aversion can be quantified and compared by means of the absolute and relative coefficients of risk aversion.
Definition. The Arrow-Pratt coefficient of absolute risk aversion is, for a twice differentiable Bernoulli utility function $u(\cdot)$ at $x$,
$$r_A(x, u) = -\frac{u''(x)}{u'(x)}.$$
This is essentially a measure of the concave curvature of the Bernoulli utility function (for an increasing concave function, $u'(x) > 0$ and $u''(x) < 0$). Note that we cannot compare the degree of concavity based on $u''(x)$ alone, since this derivative can be scaled by a positive linear transformation (which yields another utility function that represents the preference). But such a transformation would also scale $u'(x)$, so it cannot affect $r_A(x, u)$.
For a risk-neutral agent, $u'(x)$ is constant and $u''(x) = 0$, so $r_A(x, u) = 0$.
The more risk-averse of two agents has a lower certainty equivalent for any given lottery $F(\cdot)$. Moreover, 2 is more risk-averse than 1 in the sense that $r_A(x, u_2) \ge r_A(x, u_1)$ if and only if $u_2(\cdot)$ is a concave transformation of $u_1(\cdot)$. I.e. there exists an increasing concave function $\psi(\cdot)$ such that $\forall x$, $u_2(x) = \psi(u_1(x))$.
Exercise (MWG 6.C.6, C.7).
Example. It is possible to recover the preference from the Arrow-Pratt coefficient. Suppose $r_A(x, u) = a$ for all $x$. Integrating $-u''(x) = a u'(x)$ on both sides, we have $u'(x)/u(x) = \partial \ln u(x)/\partial x = -a$, and integrating once more on both sides, $u(x) = e^{-ax}$, i.e. the utility function is exponential when the Arrow-Pratt coefficient is constant. I have constructed one particular solution, assuming the integration constants are zero, but others are still exponential (solve the differential equation to see this): the general solution is $u(x) = A - B e^{-ax}$, and $B > 0$ is needed for $u(\cdot)$ to be increasing, as in $u(x) = -e^{-ax}$. Exponential utility functions therefore constitute the CARA (constant absolute risk aversion) class.
The DARA class of Bernoulli utility functions has the plausible property that wealthier people tend to be less risk-averse.
Definition. Bernoulli utility function $u(\cdot)$ exhibits decreasing absolute risk aversion (DARA) if $r_A(x, u)$ is a decreasing function of $x$, i.e. $\forall x$, $r_A(x, u) > r_A(x', u)$ whenever $x < x'$.
Exercise (MWG 6.C.8).
Definition. The coefficient of relative risk aversion is, for a twice differentiable Bernoulli utility function $u(\cdot)$ at $x$,
$$r_R(x, u) = -x\, \frac{u''(x)}{u'(x)}.$$
If $r_R(x, u) = x\, r_A(x, u)$ is decreasing in $x$, then clearly $r_A(x, u)$ must be decreasing in $x$. The converse is not true. Therefore, non-increasing relative risk aversion is a stronger property than decreasing absolute risk aversion.
Example. Consider the Bernoulli utility function $u(x) = x^{1-\rho}/(1 - \rho)$, where $\rho \in (0, 1)$. Since $r_A(x, u) = \rho/x$ is decreasing in $x$, this function is in the DARA class. But $r_R(x, u) = x\, r_A(x, u) = \rho$ is constant, so DARA does not imply DRRA.
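The two coefficients can be approximated by finite differences, which gives a quick check of the CARA and DARA examples. This is a sketch: the step size, the evaluation points and the parameter values $a = \rho = 0.5$ are assumptions made for illustration.

```python
import math

def absolute_risk_aversion(u, x, h=1e-4):
    """Central finite-difference approximation of r_A(x, u) = -u''(x) / u'(x)."""
    first = (u(x + h) - u(x - h)) / (2 * h)
    second = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return -second / first

def relative_risk_aversion(u, x, h=1e-4):
    """r_R(x, u) = x * r_A(x, u)."""
    return x * absolute_risk_aversion(u, x, h)

a, rho = 0.5, 0.5                                # assumed parameter values
cara = lambda x: -math.exp(-a * x)               # exponential utility
crra = lambda x: x ** (1 - rho) / (1 - rho)      # power utility from the example

points = [1.0, 2.0, 4.0]
cara_abs = [absolute_risk_aversion(cara, x) for x in points]
crra_abs = [absolute_risk_aversion(crra, x) for x in points]
crra_rel = [relative_risk_aversion(crra, x) for x in points]
print(cara_abs, crra_abs, crra_rel)
```

The exponential function returns approximately $a$ at every point, while the power function has absolute risk aversion $\rho/x$ falling in $x$ and relative risk aversion approximately $\rho$ throughout.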
Exercise (MWG 6.C.18).
Exercise (MWG 6.C.12).
10.3 Stochastic Dominance
Up to now, we have compared agents in terms of the risk aversion exhibited by their utility functions. Now we are interested in comparing lotteries and finding criteria by which they can be ranked, given properties of preference, such as risk attitude.
If distribution $F$ yields a higher expected utility than lottery $G$, regardless of risk attitude (i.e. the specific form of the Bernoulli utility function), $F$ is said to first-order stochastically dominate $G$.
Definition. Distribution $F(\cdot)$ first-order stochastically dominates $G(\cdot)$, written $F \succsim_{FOSD} G$, if
$$\int u(x)\, dF(x) \ge \int u(x)\, dG(x)$$
for any non-decreasing function $u : \mathbb{R} \to \mathbb{R}$.
If distribution $F$ yields a higher expected utility than lottery $G$ for a risk-averse agent (i.e. a concave Bernoulli utility function), then $F$ is said to second-order stochastically dominate $G$.
Definition. Distribution $F(\cdot)$ second-order stochastically dominates $G(\cdot)$, written $F \succsim_{SOSD} G$, if $F(\cdot)$ and $G(\cdot)$ have the same expectation of $x$, i.e. $\int x\, dF(x) = \int x\, dG(x)$, and
$$\int u(x)\, dF(x) \ge \int u(x)\, dG(x)$$
for any non-decreasing concave function $u : \mathbb{R}_+ \to \mathbb{R}$.
$F \succsim_{FOSD} G$ is equivalent to the property: for any $x$, getting more than $x$ is more likely under $F$ than under $G$.
A related statement can be made for second-order dominance. $F \succsim_{SOSD} G$ is equivalent to the property: for any $x$, probability mass accumulates faster toward $x$ under $G$ than under $F$. (I.e. $G$ spreads the probability mass more evenly and gives more weight to the extremes. I will give a precise characterization in a moment.)
The next results make use of some integral relationships that may be found through "integration by parts." The technique is based on the product rule applied to $u(x) F(x)$:
$$\frac{d}{dx}\left[u(x) F(x)\right] = u'(x) F(x) + u(x) f(x)$$
implies
$$u(x) f(x) = \frac{d}{dx}\left[u(x) F(x)\right] - u'(x) F(x)$$
and, integrating both sides,
$$\int u(x) f(x)\, dx = c - \int u'(x) F(x)\, dx,$$
where $c$ is a constant (since $d[u(x) F(x)]/dx$ integrated on $(-\infty, \infty)$ is $u(\infty) F(\infty) - u(-\infty) F(-\infty)$, and any distribution function satisfies $F(-\infty) = 0$ and $F(\infty) = 1$).
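Applied to $u'(x) \int_{-\infty}^{x} F(t)\, dt$, the product rule gives
$$\frac{d}{dx}\left[u'(x) \int_{-\infty}^{x} F(t)\, dt\right] = u''(x) \int_{-\infty}^{x} F(t)\, dt + u'(x) F(x)$$
(since the derivative of $\int_{-\infty}^{x} F(t)\, dt$ with respect to $x$ is $F(x)$ by Leibniz rule). Thus
$$u'(x) F(x) = \frac{d}{dx}\left[u'(x) \int_{-\infty}^{x} F(t)\, dt\right] - u''(x) \int_{-\infty}^{x} F(t)\, dt$$
and
$$\int u'(x) F(x)\, dx = \gamma - \int \left( u''(x) \int_{-\infty}^{x} F(t)\, dt \right) dx$$
(where $\gamma$ collects the boundary term $u'(x) \int_{-\infty}^{x} F(t)\, dt$ evaluated at the upper limit of integration; it is treated as a constant, and it coincides for two distributions that have the same mean and a common bounded support).
Hence
$$\int u(x)\, dF(x) = c - \int u'(x) F(x)\, dx = c - \gamma + \int \left( u''(x) \int_{-\infty}^{x} F(t)\, dt \right) dx.$$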
Proposition. The payoff distribution $F(\cdot)$ first-order stochastically dominates $G(\cdot)$ if and only if $\forall x$, $F(x) \le G(x)$.
Proof. (If) Let $F(x) \le G(x)$ for all $x$. From integration by parts, we have $\int u(x)\, dF(x) = c - \int u'(x) F(x)\, dx$ and $\int u(x)\, dG(x) = c - \int u'(x) G(x)\, dx$, so
$$\int u'(x) F(x)\, dx \le \int u'(x) G(x)\, dx \iff \int u(x)\, dF(x) \ge \int u(x)\, dG(x).$$
If $u(\cdot)$ is an increasing function, the first inequality holds, given that $F(x) \le G(x)$ for all $x$. Hence $F$ first-order dominates $G$.
(Only if) Suppose $F(\bar{x}) > G(\bar{x})$ for some $\bar{x}$. To show that $F$ fails to first-order dominate $G$, we need to find a non-decreasing function $u(\cdot)$ such that $\int u(x)\, dF(x) < \int u(x)\, dG(x)$. Consider $u(x) = 1$ for $x > \bar{x}$ and $u(x) = 0$ for $x \le \bar{x}$, which is non-decreasing. Then
$$\int u(x)\, dF(x) = \int_{\bar{x}}^{\infty} dF(x) = 1 - F(\bar{x}) < 1 - G(\bar{x}) = \int_{\bar{x}}^{\infty} dG(x) = \int u(x)\, dG(x).$$
∎
Exercise (MWG 6.D.1).


Exercise (MWG 6.D.2).
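For lotteries with finitely many outcomes, the criterion $F(x) \le G(x)$ for all $x$ is easy to test, because both distribution functions are step functions and only the jump points matter. The sketch below uses two made-up lotteries (stored as outcome-to-probability dictionaries, an assumed representation) and compares expected utilities for a few non-decreasing functions.

```python
from fractions import Fraction as Fr

def cdf(lottery, x):
    """P(outcome <= x) for a lottery given as {outcome: probability}."""
    return sum(p for outcome, p in lottery.items() if outcome <= x)

def fosd(F, G):
    """True if F(x) <= G(x) at every jump point of either distribution."""
    points = sorted(set(F) | set(G))
    return all(cdf(F, x) <= cdf(G, x) for x in points)

def expected_utility(lottery, u):
    return sum(p * u(x) for x, p in lottery.items())

# Hypothetical lotteries: F shifts probability mass of G toward higher payoffs.
G = {0: Fr(1, 2), 1: Fr(1, 2)}
F = {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)}

# A few non-decreasing utility functions, including the step function from the proof.
utilities = [lambda x: x, lambda x: 1 if x > 1 else 0, lambda x: min(x, 1)]
comparisons = [expected_utility(F, u) >= expected_utility(G, u) for u in utilities]
print(fosd(F, G), fosd(G, F), comparisons)
```

Exact fractions avoid rounding issues when the two distribution functions touch.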
Proposition. The payoff distribution $F(\cdot)$ second-order stochastically dominates $G(\cdot)$ if and only if $\forall x$,
$$\int_{-\infty}^{x} G(t)\, dt \ge \int_{-\infty}^{x} F(t)\, dt.$$
Proof. (If) Let $\int_{-\infty}^{x} F(t)\, dt \le \int_{-\infty}^{x} G(t)\, dt$ for all $x$. From integrating by parts, we have $\int u(x)\, dF(x) = c - \gamma + \int \left( u''(x) \int_{-\infty}^{x} F(t)\, dt \right) dx$ and $\int u(x)\, dG(x) = c - \gamma + \int \left( u''(x) \int_{-\infty}^{x} G(t)\, dt \right) dx$. If $u(\cdot)$ is concave, then $u''(x) \le 0$, so
$$\int_{-\infty}^{x} F(t)\, dt \le \int_{-\infty}^{x} G(t)\, dt \text{ for all } x \implies \int u(x)\, dF(x) \ge \int u(x)\, dG(x).$$
Hence $F$ second-order dominates $G$.
(Only if) Suppose $\int_{-\infty}^{\bar{x}} F(t)\, dt > \int_{-\infty}^{\bar{x}} G(t)\, dt$ for some $\bar{x}$. To show that $F$ fails to second-order dominate $G$, we need to find a concave function $u(\cdot)$ such that $\int u(t)\, dF(t) < \int u(t)\, dG(t)$. Let $u(\cdot)$ be linear, except in an interval $[a, b]$ containing $\bar{x}$ where $\int_{-\infty}^{x'} F(t)\, dt > \int_{-\infty}^{x'} G(t)\, dt$ for all $x' \in [a, b]$. On this interval, let $u(\cdot)$ be strictly concave.
From integration by parts,
$$\int u(t)\, dF(t) \ge \int u(t)\, dG(t) \iff \int \left( u''(x) \int_{-\infty}^{x} F(t)\, dt \right) dx \ge \int \left( u''(x) \int_{-\infty}^{x} G(t)\, dt \right) dx.$$
Since $u''(x) = 0$ for all $x \notin [a, b]$ and $u''(x) < 0$ for all $x \in [a, b]$, where $\int_{-\infty}^{x} F(t)\, dt > \int_{-\infty}^{x} G(t)\, dt$, the inequality on the right fails, hence so does the one on the left, i.e. $\int u(x)\, dF(x) < \int u(x)\, dG(x)$. This means $F$ does not second-order dominate $G$, a contradiction.
∎
$F \succsim_{SOSD} G$ is also equivalent to the property that $G$ is a "mean-preserving spread" of $F$. I.e. $G$ is obtainable from $F$ by replacing every certain outcome $x$ with a lottery that yields $x + z$, where $z$ is a zero-mean random variable, distributed according to $H_x(z)$ with $\int z\, dH_x(z) = 0$.
Example. Consider two lotteries that reward outcomes of rolling a fair die. $F$ gives $1 if a number up to 3 is rolled and $2 if the number is greater than 3. $G$ pays nothing for a 1 and $5 for a 6, and $1 otherwise. These lotteries have the same mean, 3/2, and probabilities (1/2, 1/2) over ($1, $2), respectively (1/6, 2/3, 1/6) over ($0, $1, $5). To obtain $G$ from $F$, replace the $1 and $2 wins in $F$ with lotteries that give ($0, $1, $5) respectively with probabilities (1/3, 7/12, 1/12) and (0, 3/4, 1/4). Observe that the expected values of these lotteries are $1 and $2. The compound lottery over ($0, $1, $5) that plays (1/3, 7/12, 1/12) and (0, 3/4, 1/4) with equal probability reduces to (1/6, 2/3, 1/6) = $G$. So we have constructed $G$ as a mean-preserving spread of $F$, i.e. $F \succsim_{SOSD} G$.
Example. Continuing in the previous scenario, distribution $H$ is called an "elementary increase in risk" from $G$ if it redistributes all probability mass to the extreme points in $G$'s domain (while preserving the mean). I.e. $H$ is the lottery (7/10, 0, 3/10). This is a mean-preserving spread via lotteries (1, 0, 0), (4/5, 0, 1/5) and (0, 0, 1) in place of the $0, $1 and $5 wins.
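The arithmetic in the two die examples can be verified with exact fractions. The following sketch reduces the compound lotteries to simple lotteries over ($0, $1, $5); the variable names and the tuple representation are illustrative choices, while the probabilities are those stated in the examples.

```python
from fractions import Fraction as Fr

outcomes = (0, 1, 5)

# F pays $1 or $2 with equal probability; each win is replaced by a lottery over (0, 1, 5).
F = {1: Fr(1, 2), 2: Fr(1, 2)}
spreads = {1: (Fr(1, 3), Fr(7, 12), Fr(1, 12)),
           2: (Fr(0), Fr(3, 4), Fr(1, 4))}

def mean(probs):
    """Expected payoff of a lottery with the given probabilities over (0, 1, 5)."""
    return sum(p * x for p, x in zip(probs, outcomes))

# Each replacement lottery has the same mean as the outcome it replaces.
means_preserved = all(mean(spreads[x]) == x for x in F)

# Reduce the compound lottery to a simple lottery: this should reproduce G.
G = tuple(sum(F[x] * spreads[x][i] for x in F) for i in range(3))

# Elementary increase in risk: replace the $0, $1, $5 wins in G as in the second example.
risk = ((Fr(1), Fr(0), Fr(0)),
        (Fr(4, 5), Fr(0), Fr(1, 5)),
        (Fr(0), Fr(0), Fr(1)))
H = tuple(sum(G[k] * risk[k][i] for k in range(3)) for i in range(3))
print(G, H, mean(G), mean(H))
```

Both reductions keep the mean at 3/2, as a mean-preserving spread requires.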
Exercise (MWG 6.D.4).
Exercise (MWG 6.D.3).
11 Profit Maximization Problem
11.1 Production Set
Firms exist in order to transform some goods (inputs) into other goods (outputs). Which goods are inputs and which are outputs is a matter of choice for every firm within the technological constraints. Feasible production plans are described by the production set $Y$, which lists all possible combinations of input and output quantities. These are simply bundles, or vectors, in the commodity space.
However, unlike consumption bundles, production vectors necessarily contain negative entries (every technology requires inputs). The production set can therefore not be restricted to $\mathbb{R}^L_+$. A typical element is $(y_1, \dots, y_L) \in \mathbb{R}^L$, where $y_\ell < 0$ identifies the $\ell$th commodity as an input for the particular firm, and $y_\ell > 0$ means $\ell$ is an output. The set of all such feasible vectors is the production set $Y \subseteq \mathbb{R}^L$.
It is commonly assumed that $Y$ is nonempty and closed (i.e. there are efficient production plans), and that there can be "no free lunch" (no output without input, $y \in Y$ and $y \ge 0 \implies y = 0$).
The following are typical properties that may fail to apply in special circumstances. If the firm is able to shut down its operations without "sunk costs," it has the option of inaction ($0 \in Y$). If the firm can always use more inputs without reducing output, free disposal applies ($y \in Y$ and $y' \le y \implies y' \in Y$). If it is impossible to fully recover inputs from outputs, we have irreversibility ($y \in Y$ and $y \ne 0 \implies -y \notin Y$).
Exercise (MWG 5.B.5).
Finally, one often assumes some form of convexity, which can be broken down into additivity and returns-to-scale properties. Additivity says that two feasible production plans can be combined ($y, y' \in Y \implies y + y' \in Y$). (Free entry would imply additivity.) With returns to scale, we have $\forall y \in Y$, $\alpha y \in Y$ for $\alpha \in [0, 1]$ (non-increasing), $\alpha \ge 1$ (non-decreasing) or $\alpha \ge 0$ (constant). (Note these are all versions of what is conventionally called "constant returns to scale;" they restrict the scale parameter to different intervals.)
Proposition. The production set $Y$ is additive and exhibits non-increasing returns to scale if and only if it is a convex cone, i.e. $\forall y, y' \in Y$ and $\forall \alpha, \beta \ge 0$, $\alpha y + \beta y' \in Y$.
Proof. (If) Take any $y, y' \in Y$ and $\alpha, \beta > 0$. Suppose $Y$ has the additivity and non-increasing returns properties. By $k$ times adding $y$, we have $ky \in Y$. This can be done for any integer $k$, so let $k > \max\{\alpha, \beta\}$. Because $\alpha/k < 1$, nonincreasing returns implies $(\alpha/k)(ky) = \alpha y \in Y$. By the analogous construction, $\beta y' \in Y$. Now additivity gives $\alpha y + \beta y' \in Y$, so that $Y$ is a convex cone.
(Only if) With $\alpha = 1$ and $\beta = 1$, the convex cone is seen to satisfy additivity. Similarly, with $\beta = 0$, it satisfies nonincreasing returns. Hence $Y$ is a convex cone only if $Y$ has the additivity and non-increasing returns properties.
∎
Figure 16: Typical convex cone production set (shaded)
If $Y$ is a convex cone, then $Y$ is convex ($y, y' \in Y$ and $\alpha \in [0, 1] \implies \alpha y + (1 - \alpha) y' \in Y$). The converse is true only if $0 \in Y$ (inactivity is possible), so that non-increasing returns hold (let $y' = 0$), but this is not sufficient (for additivity).
We will typically assume that $Y$ is convex, which is a weaker property than convex cone, or that $Y$ is strictly convex ($y \ne y'$ and $\alpha \in (0, 1) \implies \alpha y + (1 - \alpha) y'$ is in the interior of $Y$). (A convex cone is not strictly convex.)
Exercise (MWG 5.B.2).
Exercise (MWG 5.B.3).
Exercise (MWG 5.C.8).
11.2 Transformation Function
The production set can be expressed in terms of a "transformation function" $F : \mathbb{R}^L \to \mathbb{R}$ that assigns values $F(y) \le 0$ to $y \in Y$ and $F(y) > 0$ to $y \notin Y$, with $F(y) = 0$ if and only if $y$ is a boundary point of $Y$. (Such a function could be defined for any set, including the consumption set, which is however easy enough to express as $X = \mathbb{R}^L_+$.)
Example. Suppose each of goods
1
and
2
can be made from 1 and
1. Technologies are Cobb-Douglas:
1
= /
c
1
|
1c
1
and
2
= /
o
2
|
1o
2
, where
/
1
+ /
2
= 1 or |
1
+ |
2
= 1. Given /
1
and |
1
, the maximal output of
2
is

2
= (1 /
1
)
o
(1 |
1
)
1o
. The (production possibility) frontier is therefore
described by
1 =
_

1
|
1c
1
_
1c
+
_

2
(1 |
1
)
1o
_
1o
(rearrange the production functions for /
1
and 1 /
1
, add up). A transfor-
mation function is:
1 (
1
.
2
. /. |) = /
1
+/
2
1 =
_

1
|
1c
1
_
1c
+
_

2
|
1o
2
_
1o
/
1
/
2
.
Notice that 1 (
1
.
2
. /. |) = 0 if and only if / = /
1
+ /
2
= 1 and | =
|
1
+ |
2
= 1, and else 1 (
1
.
2
. /. |) < 0 unless /
1
+ /
2
1 or |
1
+ |
2
1
(which are infeasible).
The transformation function contains all relevant information about production possibilities. You can recognize outputs and inputs by the fact that an increase in $y_1$ and $y_2$ increases $F(\cdot)$, and an increase in a $k$ or $l$ decreases $F(\cdot)$. (When inputs are fixed, an increase in output indicates greater efficiency. When outputs are fixed, an increase in input means lower efficiency.) The production plan $(y_1, y_2, 1, 1)$, where $F(y_1, y_2, 1, 1) = 0$, is fully efficient.
Suppose $F(\cdot)$ is differentiable at a boundary point $\bar{y}$. Then, holding $\bar{y}_\ell$ fixed for $\ell = 3, \dots, L$,
$$dF(\bar{y}) = \frac{\partial F(\bar{y})}{\partial y_1}\, dy_1 + \frac{\partial F(\bar{y})}{\partial y_2}\, dy_2 = 0,$$
and the slope of the transformation function is
$$\frac{dy_2}{dy_1} = -\frac{\partial F(\bar{y})/\partial y_1}{\partial F(\bar{y})/\partial y_2}.$$
If $y_1$ and $y_2$ are outputs, this ratio is called the marginal rate of transformation (MRT). It measures the amount by which, at fixed input levels, one output has to be reduced in order to produce more of the other. If $y_1$ and $y_2$ are inputs, the ratio is the marginal rate of technical substitution (MRTS). Then it measures the amount by which, at fixed output levels, one input can be reduced when using more of the other.
Example. In the Cobb-Douglas case above, where the transformation func-
tion was
1 (
1
.
2
. 1. 1) =
_

1
|
1c
1
_
1c
+
_

2
(1 |
1
)
1o
_
1o
1.
we nd
`11
21
=
d
2
d
1
=
,
c
(
1
,|
1
)
(1c)c
(
2
, (1 |
1
))
(1o)o
=
,
c
(/
1
,|
1
)
1c
((1 /
1
) , (1 |
1
))
1o
and
`11o
11
=
d1
d1
=
,
1 ,
_
1 |
1

2
_
1o
=
,
1 ,
1 |
1
1 /
1
.
11.3 Profit Maximization
The standard behavioral premise about firms is that they maximize profits. This standpoint is not as immediately compelling as maximization with respect to consumer preferences. However, if firms are owned by consumers, and fixed shares of profit accrue to the owners, higher profits enlarge the owners' budget sets and therefore increase indirect utilities.
However, this argument relies on the "price-taking" assumption that each firm regards $p = (p_1, \dots, p_L) \gg 0$ as independent of its production plan. Else, an owner may find it optimal to manipulate prices by increasing the production of the goods she likes. We do assume price-taking behavior throughout.
Profit maximization also requires that the technology is certain: else, a risk-averse owner may favor less risky (but potentially less profitable) production plans.
Finally, if firms are operated by agents instead of owners, they may pursue other objectives than profit maximization, since the profit does not accrue to them.
Exercise (MWG 5.G.1).
Profit, in the conventional sense, is $p \cdot y = \sum_{\ell=1}^{L} p_\ell y_\ell$ (recall that inputs enter negatively, as costs). From a nonempty and closed production set, the firm chooses production plan $y$ at $p \gg 0$ that attains
$$\max_{y \in Y} p \cdot y = \max_{y \in \mathbb{R}^L} p \cdot y \quad \text{s.t. } F(y) \le 0.$$
Clearly, the firm must choose a production vector in the boundary of $Y$, else it is possible to reduce inputs without reducing outputs, which at strictly positive prices increases profit. Hence the constraint specializes to $F(y) = 0$.
A solution to the firm's problem is the supply correspondence
$$y(p) = \{ y \in Y \text{ s.t. } p \cdot y = \pi(p) \}.$$
We define the profit function as $\pi(p) = \max_{y \in Y} p \cdot y$, i.e. it is the value function associated with the firm's maximization problem (analogous to the indirect utility function and the expenditure function in demand theory). It depends only on the parameter $p$; the optimal choice $y(p)$ is implicit.
Suppose the transformation function is differentiable. Then the Lagrangean first-order conditions for profit-maximization are, for $\ell = 1, \dots, L$,
$$\lambda \frac{\partial F(y^*)}{\partial y_\ell} = p_\ell,$$
or in matrix notation, $\lambda \nabla F(y^*) = p$. (Since the gradient of the transformation function is proportional to $p$ at a solution $y^*$, the profit-maximizing direction to adjust $y$, if the technological constraint is relaxed, is in the direction of prices, such that expensive goods are produced using cheap goods.)
For any pair of commodities $k, l$, the condition implies
$$\frac{\partial F(y^*)/\partial y_k}{\partial F(y^*)/\partial y_l} = \frac{p_k}{p_l}.$$
Example. The rm from the previous examples uses only inputs 1 and 1,
at prices : and n. Thus, a condition for prot-maximization is
`11o
11
=
,
1 ,
1 |
1
1 /
1
=
:
n
.
and another is
`11
21
=
,
c
(/
1
,|
1
)
1c
((1 /
1
) , (1 |
1
))
1o
=
j
1
j
2
.
(Additional rst-order conditions place implicit restrictions on prices.)
Solving jointly, we obtain the optimal input ratios in the production of
each output:
i
1
=
/
1
|
1
=
_
c
,
j
1
j
2
_
1(1c)
_
,
1 ,
n
:
_
(1o)(1c)
.
i
2
=
1 /
1
1 |
1
=
,
1 ,
n
:
.
The scale of production is not determined, since the technology exhibits
constant returns. We can express the outputs in terms of the levels of one
input

1
= /
c
1
|
1c
1
= i
1
|
1
.

2
= (1 /
1
)
o
(1 |
1
)
1o
= i
2
(1 |
1
) .
or use the identity |
1
= (i
2
1 1) , (i
2
i
1
) to express outputs in terms of
the total use of both inputs:

1
=
i
1
i
2
i
1
(i
2
1 1) .

2
=
i
2
i
2
i
1
(1 i
1
1) .
Since i
1
increases in j
1
, it makes sense that
1
increases, and
2
decreases,
in i
1
.
Exercise (MWG 5.C.9).
Exercise (MWG 5.C.12).
Proposition. The supply correspondence $y(\cdot)$ for a production set $Y$ that is closed and satisfies free disposal is homogeneous of degree zero, convex if $Y$ is convex, and single-valued if $Y$ is strictly convex. The profit function $\pi(\cdot)$ is homogeneous of degree one and convex.
Proof. Since a price increase from $p$ to $kp$ by $k > 0$ does not affect the production set and can be factored out of $p \cdot y$, it does not change the maximizers, i.e. the supply correspondence $y(\cdot)$. Then it is clear that $\pi(p) = p \cdot y^*$ with $y^* \in y(p)$ is homogeneous of degree one.
Suppose $Y$ is convex. If $y \in y(p)$ and $y' \in y(p)$, then $p \cdot y = p \cdot y'$ (since all elements of the supply correspondence maximize profit at $p$), so
$$p \cdot y = \alpha\, p \cdot y + (1 - \alpha)\, p \cdot y' = p \cdot (\alpha y + (1 - \alpha) y'),$$
i.e. $\alpha y + (1 - \alpha) y' \in y(p)$. Moreover, if $Y$ is strictly convex, then $\forall y, y' \in Y$, where $y \ne y'$, and $\forall \alpha \in (0, 1)$, the convex combination $\alpha y + (1 - \alpha) y'$ is in the interior of $Y$. Then it is possible to reduce inputs and increase outputs in $Y$, which must lead to strictly higher profit than $p \cdot (\alpha y + (1 - \alpha) y')$. Therefore, if $y \in y(p)$ and $y' \in y(p)$, then $\alpha y + (1 - \alpha) y' \notin y(p)$, which contradicts convexity of $y(p)$. Hence $y(p)$ cannot contain distinct $y$ and $y'$, and must be single-valued.
To see that $\pi(\cdot)$ is convex, let $\alpha \in [0, 1]$ and $p, p' \gg 0$. For any $y \in y(\alpha p + (1 - \alpha) p')$, by definition of $\pi(\cdot)$ as a maximum, $p \cdot y \le \pi(p)$ and $p' \cdot y \le \pi(p')$. Therefore,
$$\pi(\alpha p + (1 - \alpha) p') = (\alpha p + (1 - \alpha) p') \cdot y \le \alpha \pi(p) + (1 - \alpha) \pi(p'),$$
which means $\pi(\cdot)$ is convex.
∎
11.4 Law of Supply


The law of supply says that more is produced of an output, less used of an input, whose price increases. It arises from the convexity of the profit function, which is an intuitive property: a price increase causes a linear increase in profit at fixed supply, so the firm's ability to adjust supply can only further enhance profit.
Proposition. If the supply function $y(\cdot)$ for a production set $Y$ that is closed and satisfies free disposal is differentiable at $\bar{p}$, then $y(\bar{p}) = \nabla \pi(\bar{p})$ and $Dy(\bar{p}) = D^2 \pi(\bar{p})$ is a symmetric and positive semidefinite matrix, i.e. $\forall v \in \mathbb{R}^L$, $v^T Dy(\bar{p})\, v \ge 0$ (equality if $v = \bar{p}$).
Proof. That $y(\bar{p}) = \nabla \pi(\bar{p})$ follows immediately from the envelope theorem ($y(\bar{p})$ is a maximizer at $\pi(\bar{p}) = \bar{p} \cdot y(\bar{p})$, hence locally constant). Positive semidefiniteness of $D^2 \pi(\bar{p})$ is also immediate from the fact that $\pi(\cdot)$ is convex. Since $y(\cdot)$ is homogeneous of degree zero, $y(\alpha \bar{p}) = y(\bar{p})$ for $\alpha > 0$. Differentiating on the left and on the right with respect to $\alpha$, we have for $\ell = 1, \dots, L$,
$$\sum_{k=1}^{L} \frac{\partial y_\ell(\alpha \bar{p})}{\partial (\alpha p_k)} \frac{\partial (\alpha p_k)}{\partial \alpha} = \sum_{k=1}^{L} \frac{\partial y_\ell(\alpha \bar{p})}{\partial p_k}\, p_k = 0.$$
At $\alpha = 1$, this is the $\ell$th entry in the vector $Dy(\bar{p})\, \bar{p}$. Therefore, $\bar{p}^T Dy(\bar{p})\, \bar{p} = 0$.
∎
The statement $y(\bar{p}) = \nabla \pi(\bar{p})$, that the supply function can be recovered from the (maximal) profit function, is known as Hotelling's lemma.
Positive semidefiniteness of $Dy(\bar{p})$ is an expression of the law of supply and implies, in particular, that own-price effects $\partial y_\ell(p)/\partial p_\ell$ are positive. Its non-differential equivalent can be understood directly:
$$(p - p') \cdot (y - y') = p \cdot (y - y') + p' \cdot (y' - y) \ge 0$$
for all $p$, $p'$ and $y \in y(p)$, $y' \in y(p')$. This is true because $p \cdot y \ge p \cdot y'$ and $p' \cdot y' \ge p' \cdot y$ (by the optimality of the supply correspondence).
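Both the law of supply inequality and Hotelling's lemma can be checked on a simple technology with a closed-form solution. The sketch below assumes a single output produced from a single input according to $q = 2\sqrt{z}$ (a hypothetical technology, not from the notes), so that at output price $p$ and input price $w$ the optimum is $z = (p/w)^2$, $q = 2p/w$ and profit is $p^2/w$.

```python
import random

def supply(p, w):
    """Profit-maximizing plan y = (q, -z) for the assumed technology q = 2 * sqrt(z)."""
    z = (p / w) ** 2
    return (2 * p / w, -z)

def profit(p, w):
    """Maximal profit p*q - w*z at prices (p, w)."""
    q, neg_z = supply(p, w)
    return p * q + w * neg_z

def law_of_supply_gap(prices_a, prices_b):
    """(p - p') . (y - y'), which the law of supply says is non-negative."""
    ya, yb = supply(*prices_a), supply(*prices_b)
    return sum((pa - pb) * (a - b)
               for pa, pb, a, b in zip(prices_a, prices_b, ya, yb))

# Random pairs of price vectors (output price, input price).
random.seed(0)
draws = [(random.uniform(0.5, 2), random.uniform(0.5, 2)) for _ in range(200)]
gaps = [law_of_supply_gap(a, b) for a, b in zip(draws[:100], draws[100:])]

# Hotelling's lemma: the gradient of the profit function equals the supply vector.
p, w, h = 1.5, 0.8, 1e-6
gradient = ((profit(p + h, w) - profit(p - h, w)) / (2 * h),
            (profit(p, w + h) - profit(p, w - h)) / (2 * h))
print(min(gaps), gradient, supply(p, w))
```

The input enters the plan with a negative sign, so a higher input price makes the second component of $y$ less negative, which is how "less used of an input" shows up in the inequality.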
12 Efficiency of Aggregate Supply
12.1 Efficient Production
The focus of this lecture is the efficiency properties of supply and aggregate supply. First, we look at a basic notion of efficiency that is defined from the properties of the production plan alone. Second, we discuss the stronger notion of cost minimization, which incorporates prices. Both are essentially implied by profit maximization and survive aggregation.
Efficient production is non-wasteful.
Definition. Production plan $y \in Y$ is efficient if there exists no $y' \in Y$ such that $y' \ne y$ and $y' \ge y$.
While all efficient production plans are boundary points of the production set $Y$, not all boundary points are efficient. (Consider a natural resource that cannot be produced. Any feasible production plan that does not use the resource is a boundary point, but it may be possible to reduce waste in other commodities.)
Proposition. Production plan $y \in Y$ maximizes profit at $p \gg 0$ only if it is efficient.
Proof. If $y$ is not efficient, i.e. there exists $y' \ge y$ in $Y$ and $y' \ne y$, then $p \cdot y' > p \cdot y$ (given strictly positive prices), so that $y$ cannot be profit-maximizing.
∎
This result becomes the first welfare theorem under aggregation. The partial converse corresponds to the second welfare theorem.
Proposition. If the production set $Y$ is convex, then there exists for every efficient production plan $y \in Y$ a nonzero price vector $p \ge 0$, given which $y$ is profit-maximizing on $Y$.
Proof. If $y \in Y$ is efficient, then $Y$ must be disjoint from the set
$$P_y = \{ y' \in \mathbb{R}^L \text{ s.t. } y' \gg y \}$$
of production plans that contain more of each output and less of each input. Since $P_y$ is convex (if $y' \gg y$ and $y'' \gg y$, then any convex combination strictly exceeds $y$), there exists a hyperplane that separates $Y$ and $P_y$, i.e. there is some $p \ne 0$ such that $p \cdot y' \ge p \cdot y''$ for any $y' \in P_y$ and any $y'' \in Y$.
We need to show that $p \ge 0$ and that $y$ is profit-maximizing at prices $p$. Suppose $p_\ell < 0$ for some commodity $\ell$. Then one can find $y' \gg y$ with $y'_\ell$ sufficiently large that $p \cdot y' < p \cdot y$. But this was ruled out by the construction of $p$.
Suppose there is a feasible production plan $y'' \in Y$ such that $p \cdot y'' > p \cdot y$. Then $p \cdot y'' > p \cdot y'$ for every $y'$ in some (sufficiently close) neighborhood of $y$. Because such a neighborhood contains $y' \gg y$, the construction of $p$ such that $p \cdot y' \ge p \cdot y''$ is again violated. It follows that $p \cdot y \ge p \cdot y''$ for any $y'' \in Y$.
∎
Example. Consider a production plan $(q, -z)$ that is efficient on the single-output production set $Y$ given by concave production function $f(\cdot)$, i.e. $f(z) = q$. From the first-order conditions, we can construct prices $(p, w)$ at which $(q, -z)$ is profit-maximizing. Represent $Y$ by the transformation function $F(q, -z) = q - f(z)$. Fix the output price $p$, say at $p = 1$. Then for every $\ell$, divide the first-order condition $\lambda\, \partial F(q, -z^*)/\partial(-z_\ell) = \lambda\, \partial f(z^*)/\partial z_\ell = w_\ell$ by the first-order condition for the optimal output $q$, $\lambda\, \partial F(q, -z^*)/\partial q = \lambda = p = 1$: for $\ell = 1, \dots, L - 1$,
$$w_\ell = \frac{\partial f(z^*)}{\partial z_\ell}.$$
This determines the desired price vector at which the efficient production plan $(q, -z)$ is profit-maximizing.
The definition of production efficiency makes no reference to prices. A stronger criterion that selects among efficient production plans is cost minimization.
12.2 Cost Minimization
I focus on the single-output case, where a quantity $q$ is produced using input vector $z \in \mathbb{R}^{L-1}$, and the technological constraint is expressed by a production function $f(z)$. The production function can be viewed as a transformation function $F(q, -z) = q - f(z)$, since $\forall (q, -z) \notin Y$, $q > f(z)$ and $\forall (q, -z) \in Y$, $q \le f(z)$ with equality if and only if $(q, -z)$ is in the boundary of $Y$.
Denote by $w \gg 0$ the vector of input prices. The cost-minimizing choice of inputs, at a given output, defines the cost function:
$$c(w, q) = \min_{z \ge 0} w \cdot z \quad \text{s.t. } f(z) \ge q.$$
Notice that the optimal choice of inputs is implicit in the cost function; its value $c(w, q)$ is the lowest cost required to produce $q$ at prices $w$.
A solution to the cost minimization problem at different $(w, q)$ is the (conditional) factor demand correspondence $z(w, q)$. (The qualifier "conditional" refers to the fact that the factor demand depends on output.)
The factor demands at any solution $(q^*, z^*)$ to the profit maximization problem must be cost-minimizing given $q^*$, since profit at $q^*$, i.e. $p q^* - w \cdot z$, is strictly decreasing in the cost $w \cdot z$. Hence profit maximization implies not just efficiency, but also that factor demand $z^*$ solves the cost minimization problem.
If the production set is convex (i.e. the production function is concave), the following first-order conditions (together with the production function) characterize $z^* \in z(w, q)$. Letting the $L$th commodity be the output, for all $\ell = 1, \dots, L - 1$,
$$\lambda \frac{\partial f(z^*)}{\partial z_\ell} \le w_\ell$$
with equality if $z^*_\ell > 0$. In matrix notation, $\lambda \nabla f(z^*) \le w$.
Exercise (MWG 5.C.10).
Since $q$ must be the profit-maximizing output, given the cost function, i.e. solve $\max_q (pq - c(w, q))$, the first-order condition for profit maximization, $p = \partial c(w, q)/\partial q$, must hold. The multiplier $\lambda$ in the cost-minimization problem is the marginal value of relaxing the technological constraint, i.e. $\lambda = \partial c(w, q)/\partial q$. Therefore, $\lambda = p$ whenever cost is minimized at the profit-maximizing output level.
The direction of the inequality in the first-order condition for cost-minimization is then intuitive: whenever $\lambda\, (\partial f(z^*)/\partial z_\ell) > w_\ell$, the use of input $\ell$ should be increased, because $\ell$ adds value $p\, MP_\ell$ that is greater than its price $w_\ell$. Hence $z^*_\ell > 0$ at an optimum (and $\lambda\, (\partial f(z^*)/\partial z_\ell) = w_\ell$) unless $\lambda\, (\partial f(z^*)/\partial z_\ell) < w_\ell$ at $z^*_\ell = 0$.
If $z^*_1, z^*_2 \ne 0$, the first-order conditions imply $MP_1/MP_2 = w_1/w_2$ and, since $MP_1/MP_2$ is the slope of $f(z) = q$, and $w_1/w_2$ is the slope of $c(w, q) = c$, tangency of the "isoquant" (graph of $f(z) = q$) and the "isocost" (graph of $c(w, q) = c$). See Figure 17.
Figure 17: Tangency solution to cost minimization
This picture is reminiscent of the expenditure minimization problem if you think of $q$ (or the implied profit $\pi$) as the firm owner's "utility." In fact, at given prices and costs, the owner's wealth and indirect utility $v(p, w)$ are increasing in $q$, so there is a direct connection.
This parallel implies that $c(w, q)$ inherits properties of the expenditure function (homogeneous of degree one and concave in input prices $w$, and non-decreasing in $q$). Moreover, the factor demand correspondence has properties of the Hicksian demand correspondence for commodity bundles (homogeneous of degree zero in $w$, $z(w, q) = \nabla_w c(w, q)$ where differentiable, $D_w z(w, q)$ is symmetric and negative semidefinite, i.e. satisfies the law of demand). (The identity $z(w, q) = \nabla_w c(w, q)$ is called Shephard's lemma.)
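Shephard's lemma and the homogeneity properties can be checked numerically for a technology where the conditional factor demands have a closed form. The sketch below assumes a two-input Cobb-Douglas production function $f(z) = z_1^a z_2^{1-a}$ with $a = 0.3$ (an illustrative choice); the tangency condition then gives $z_1/z_2 = a w_2 / ((1 - a) w_1)$.

```python
def factor_demand(w1, w2, q, a=0.3):
    """Cost-minimizing inputs for the assumed technology f(z) = z1**a * z2**(1 - a)."""
    ratio = a * w2 / ((1 - a) * w1)          # optimal z1 / z2 from the tangency condition
    return (q * ratio ** (1 - a), q * ratio ** (-a))

def cost(w1, w2, q, a=0.3):
    """Cost function c(w, q) = w . z(w, q)."""
    z1, z2 = factor_demand(w1, w2, q, a)
    return w1 * z1 + w2 * z2

w1, w2, q, h = 2.0, 3.0, 5.0, 1e-6
z1, z2 = factor_demand(w1, w2, q)

# Shephard's lemma: the gradient of c with respect to input prices is the factor demand.
dc_dw1 = (cost(w1 + h, w2, q) - cost(w1 - h, w2, q)) / (2 * h)
dc_dw2 = (cost(w1, w2 + h, q) - cost(w1, w2 - h, q)) / (2 * h)

# Homogeneity of degree one in input prices, and the output constraint f(z) = q.
scaled_cost = cost(2 * w1, 2 * w2, q)
output = z1 ** 0.3 * z2 ** 0.7
print(dc_dw1, z1, dc_dw2, z2, scaled_cost, output)
```

Because the assumed technology has constant returns, the cost function is also linear in $q$ here, which matches the first claim in the exercise that follows.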
Exercise (MWG 5.C.3). Consider a single-output technology $Y$, with free disposal, that is closed and characterized by production function $f(\cdot)$. Show: if $f(\cdot)$ is homogeneous of degree one (constant returns to scale), then $c(\cdot)$ and $z(\cdot)$ are homogeneous of degree one in $q$. If $f(\cdot)$ is concave (non-increasing returns to scale), then $c(\cdot)$ is a convex function of $q$, so that marginal cost is non-decreasing in $q$.
The (minimum) cost function is better behaved than the profit function under non-decreasing returns to scale (when it is optimal to produce infinite or zero output). Moreover, the cost function always exists, whereas the profit function only exists if firms are price-takers.
Exercise (MWG 5.D.4).
Exercise (MWG 5.D.5).
12.3 Aggregate Supply
Suppose now there are $J$ firms with production sets $Y_1, \dots, Y_J$ (each nonempty, closed and with free disposal). Denote firm $j$'s profit function and supply correspondence by $\pi_j(p)$ and $y_j(p)$.
The aggregate supply correspondence is
$$y(p) = \left\{ y \in \mathbb{R}^L \text{ s.t. } \exists\, y_1, \dots, y_J \text{ with } y_j \in y_j(p) \text{ and } \sum_{j=1}^{J} y_j = y \right\}.$$
It includes every sum of production plans $y_1, \dots, y_J$ that are individually optimal for the firms.
The aggregated supply correspondence admits a representative firm in the sense that such a firm, faced with the aggregated production set $Y_1 + \dots + Y_J$, would choose a vector in $y(p)$, i.e. an aggregate of plans that firms $j = 1, \dots, J$ would choose in the individual production sets $Y_1, \dots, Y_J$.
Let $y^*(p)$ denote the supply correspondence on $Y^* = Y_1 + \dots + Y_J$, and call the associated profit function $\pi^*(p)$.
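Proposition. If $p \gg 0$, then any $y \in y^*(p)$ satisfies $y = \sum_{j=1}^{J} y_j$ for some $y_1, \dots, y_J$ such that $y_j \in y_j(p)$ for $j = 1, \dots, J$. Moreover, $\pi^*(p) = \sum_{j=1}^{J} \pi_j(p)$.
Proof. It is helpful to establish the second part first: $\pi^*(p) = \sum_{j=1}^{J} \pi_j(p)$. Let $y_j \in Y_j$ for $j = 1, \dots, J$. Then $\sum_{j=1}^{J} y_j \in Y^* = Y_1 + \dots + Y_J$ implies $\pi^*(p) \ge p \cdot \sum_{j=1}^{J} y_j = \sum_{j=1}^{J} (p \cdot y_j)$. Choosing each $y_j$ to be profit-maximizing on $Y_j$, this gives $\pi^*(p) \ge \sum_{j=1}^{J} \pi_j(p)$. On the other hand, if $y \in Y^*$, there are $y_j \in Y_j$ for $j = 1, \dots, J$ such that $\sum_{j=1}^{J} y_j = y$. Then $p \cdot y = p \cdot \sum_{j=1}^{J} y_j = \sum_{j=1}^{J} (p \cdot y_j) \le \sum_{j=1}^{J} \pi_j(p)$, i.e. $\pi^*(p) \le \sum_{j=1}^{J} \pi_j(p)$.
For the first part, we need $y^*(p) \subseteq y(p)$, where $y(p) = \left\{ \sum_{j=1}^{J} y_j \text{ s.t. } y_j \in y_j(p) \text{ for } j = 1, \dots, J \right\}$ is the aggregate supply correspondence, and also $y(p) \subseteq y^*(p)$. Let $y \in y^*(p)$ and $y = \sum_{j=1}^{J} y_j$ for some $y_j \in Y_j$, $j = 1, \dots, J$. Because $y$ is profit-maximizing, $p \cdot \sum_{j=1}^{J} y_j = \pi^*(p) = \sum_{j=1}^{J} \pi_j(p)$. I.e. $p \cdot y_j < \pi_j(p)$ for some $j$ is only possible if $p \cdot y_k > \pi_k(p)$ for some $k$, which conflicts with the definition of $\pi_k(p)$ as a maximum value function. It follows that $p \cdot y_j = \pi_j(p)$, hence $y_j \in y_j(p)$, for all $j$. I.e. $y^*(p) \subseteq y(p)$. Now let $y_j \in y_j(p) \subseteq Y_j$, $j = 1, \dots, J$. Since $p \cdot y_j = \pi_j(p)$ for all $j$, $p \cdot \sum_{j=1}^{J} y_j = \sum_{j=1}^{J} (p \cdot y_j) = \pi^*(p)$. Thus $\sum_{j=1}^{J} y_j \in y^*(p)$, i.e. $y(p) \subseteq y^*(p)$.
∎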
A key implication of this result is that efficiency aggregates: since every firm produces a given output at minimal cost, and a hypothetical planner, who has the economy's combined production possibilities available, can do no better than to replicate individual choices, aggregate supply must be cost-minimizing.
Recall that the "law of supply" (optimal production plans increase in commodities whose prices increase) applies under mild conditions ($Y$ is closed and has free disposal).
If every $y_j(p)$ satisfies the law of supply, then $y(p)$ also does. This is so because symmetry and positive semidefiniteness are properties that are preserved under matrix addition.
In fact, the law of supply inequality for firm $j$, $(p - p') \cdot (y_j(p) - y_j(p')) \ge 0$, sums over $j = 1, \ldots, J$ to $(p - p') \cdot (y(p) - y(p')) \ge 0$, its aggregate version.
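The aggregation result can be checked numerically in a small case. The sketch below is only an illustration under assumptions that are not in the notes: two single-output firms with hypothetical quadratic costs $c_j(q) = a_j q^2 / 2$, so that each firm supplies $q_j = p / a_j$. A brute-force "planner" searches the aggregate production set on a grid and should find the same profit as the sum of the individual maximized profits.

```python
from itertools import product

COEFFS = [1.0, 2.0]  # hypothetical cost parameters a_j in c_j(q) = a_j * q**2 / 2


def cost(q, a):
    return a * q * q / 2


def firm_supply(p, a):
    # First-order condition p = c_j'(q) = a * q
    return p / a


def firm_profit(p, a):
    q = firm_supply(p, a)
    return p * q - cost(q, a)


def planner_profit(p, coeffs, steps=200, q_max=10.0):
    # Representative firm: search all output combinations on a grid
    # and keep the highest joint profit.
    grid = [q_max * i / steps for i in range(steps + 1)]
    return max(
        p * sum(qs) - sum(cost(q, a) for q, a in zip(qs, coeffs))
        for qs in product(grid, repeat=len(coeffs))
    )


def aggregate_supply(p, coeffs):
    return sum(firm_supply(p, a) for a in coeffs)


p = 4.0
sum_of_profits = sum(firm_profit(p, a) for a in COEFFS)
joint_profit = planner_profit(p, COEFFS)
print(sum_of_profits, joint_profit)

# Aggregate law of supply between two hypothetical prices
p_new = 5.0
law_of_supply = (p_new - p) * (aggregate_supply(p_new, COEFFS) - aggregate_supply(p, COEFFS))
print(law_of_supply >= 0)
```

The grid is chosen so that the individually optimal outputs lie on it; with a coarser grid the planner's profit would only approximate the sum from below.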
13 Partial Competitive Equilibrium
13.1 Competitive Equilibrium
A private ownership economy is populated by consumers with preferences, endowments and ownership stakes in firms that have production possibilities. We have discussed the consumer problem of choosing a bundle $x \in X$ from the consumption set $X$, and the producer problem of choosing a plan $y \in Y$ from the production set $Y$. Aggregate demand and supply are functions of prices that we took to be exogenous. Now we consider how prices are determined through the interaction of demand and supply.
Let there be $I$ consumers $i = 1, \ldots, I$, $J$ firms $j = 1, \ldots, J$, and $L$ goods $\ell = 1, \ldots, L$. Consumer $i$'s preference over consumption bundles $x_i \in X_i \subseteq \mathbb{R}^L$ is represented by the utility function $u_i(\cdot)$. Initially, $i$ is endowed with a bundle $\omega_i = (\omega_{1i}, \ldots, \omega_{Li}) \in X_i$. The total endowed quantity of commodity $\ell$ in the economy is $\omega_\ell \equiv \sum_{i=1}^I \omega_{\ell i}$.
Firm $j$ implements the production plan $y_j \in Y_j \subseteq \mathbb{R}^L$ that maximizes profit. Consumer $i$ holds a share $\theta_{ij}$ in firm $j$, which is a proportional right to the firm's net output (i.e. the consumer provides a fraction $\theta_{ij}$ of the inputs used by firm $j$ and receives a fraction $\theta_{ij}$ of the outputs). Given the production decisions, the total available quantity of commodity $\ell$ in the economy is $\omega_\ell + \sum_{j=1}^J y_{\ell j}$.
An allocation describes, in this context, the consumption bundle each consumer ends up with and the production activity of each firm.
Definition. An allocation $(x_1, \ldots, x_I, y_1, \ldots, y_J)$ is a list of consumption vectors $x_i \in X_i \subseteq \mathbb{R}^L$ for all consumers $i = 1, \ldots, I$ and production vectors $y_j \in Y_j \subseteq \mathbb{R}^L$ for all firms $j = 1, \ldots, J$.
An allocation is said to be feasible, given the economy's endowments $\omega_1, \ldots, \omega_L$ of commodities $\ell = 1, \ldots, L$, if for each $\ell$,
$$\sum_{i=1}^I x_{\ell i} \le \omega_\ell + \sum_{j=1}^J y_{\ell j}$$
(i.e. no more than the available quantity of each commodity is allocated for consumption).
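Definition. A competitive equilibrium is an allocation $(x_1^*, \ldots, x_I^*, y_1^*, \ldots, y_J^*)$ and price vector $p^* \in \mathbb{R}^L$ such that consumers and firms optimize, and markets clear:
(i) for all $i = 1, \ldots, I$, $u_i(x_i^*) = \max_{x_i \in X_i} u_i(x_i)$ s.t. $p^* \cdot x_i \le p^* \cdot \omega_i + \sum_{j=1}^J \theta_{ij} (p^* \cdot y_j^*)$,
(ii) for all $j = 1, \ldots, J$, $p^* \cdot y_j^* = \max_{y_j \in Y_j} p^* \cdot y_j$,
(iii) for $\ell = 1, \ldots, L$, $\sum_{i=1}^I x_{\ell i}^* = \omega_\ell + \sum_{j=1}^J y_{\ell j}^*$.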
It is implicit in the definition of competitive equilibrium that consumers and firms treat prices as independent of their choices. However, in the aggregate these choices determine prices through the market clearing conditions.
Notice that a proportional change in all prices, from $p$ to $\alpha p$ with $\alpha > 0$, has no effect on any of the three aspects of competitive equilibrium, since demand and supply are homogeneous of degree zero. Therefore, we can arbitrarily assign a price 1 to some commodity (which is then called the numeraire) without changing the equilibrium allocation.
Suppose markets clear under an allocation for all commodities except $k$. Then, if consumers exhaust their budgets, the market for $k$ must clear, too. This is apparent from adding up the budget constraints,
$$\sum_{i=1}^I (p \cdot x_i) - \sum_{i=1}^I (p \cdot \omega_i) - \sum_{i=1}^I \sum_{j=1}^J \theta_{ij} (p \cdot y_j) = \sum_{i=1}^I \Big( p \cdot x_i - p \cdot \omega_i - p \cdot \sum_{j=1}^J \theta_{ij} y_j \Big) = p \cdot \sum_{i=1}^I \Big( x_i - \omega_i - \sum_{j=1}^J \theta_{ij} y_j \Big) = 0$$
(factoring out the price vector is possible by the distributivity of the dot product).
Now if markets for commodities $\ell \ne k$ clear, i.e.
$$\sum_{i=1}^I \Big( x_{\ell i} - \omega_{\ell i} - \sum_{j=1}^J \theta_{ij} y_{\ell j} \Big) = 0,$$
then
$$p_k \sum_{i=1}^I \Big( x_{ki} - \omega_{ki} - \sum_{j=1}^J \theta_{ij} y_{kj} \Big) = 0,$$
which (provided $p_k > 0$) implies that the market for commodity $k$ clears.
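The adding-up argument can be illustrated with a minimal sketch. It assumes a pure exchange economy (no firms) with two goods and two consumers with Cobb-Douglas preferences; the weights and endowments are hypothetical and serve only to show that the value of excess demand is zero at any prices, so that clearing in one market implies clearing in the other.

```python
ALPHAS = [0.5, 0.25]                     # hypothetical Cobb-Douglas weights on good 1
ENDOWMENTS = [(1.0, 0.0), (0.0, 1.0)]    # hypothetical endowments (good 1, good 2)


def excess_demand(p1, p2):
    # Cobb-Douglas demand spends the share alpha of wealth on good 1.
    z1 = z2 = 0.0
    for alpha, (w1, w2) in zip(ALPHAS, ENDOWMENTS):
        wealth = p1 * w1 + p2 * w2
        z1 += alpha * wealth / p1 - w1
        z2 += (1 - alpha) * wealth / p2 - w2
    return z1, z2


def clearing_price_good1(p2=1.0):
    # Solves z1 = 0 for p1, treating good 2 as the numeraire.
    num = sum(a * w2 for a, (_, w2) in zip(ALPHAS, ENDOWMENTS))
    den = sum((1 - a) * w1 for a, (w1, _) in zip(ALPHAS, ENDOWMENTS))
    return p2 * num / den


# Budget exhaustion: the value of excess demand vanishes at arbitrary prices.
z1, z2 = excess_demand(2.0, 3.0)
value_of_excess_demand = 2.0 * z1 + 3.0 * z2

# Clearing the market for good 1 also clears the market for good 2.
p1_star = clearing_price_good1()
z1_star, z2_star = excess_demand(p1_star, 1.0)
print(value_of_excess_demand, p1_star, z1_star, z2_star)
```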
13.2 Partial Equilibrium
Often we are interested in the market for a particular good $\ell$ and would like to treat the remaining markets as "everything else," absorbing whatever wealth the consumer does not spend on $\ell$. Strictly speaking, that is a quasilinear scenario, but one may argue that it applies approximately whenever only a small portion of the consumer's wealth is spent on $\ell$.
The main prerequisites for such a "partial equilibrium" analysis are that changes in the price of $\ell$ do not create wealth effects and that changes in the quantity demanded of $\ell$ do not cause price adjustments in other markets. If they did, then the marginal value of residual wealth would not be fixed, and it is then not clear what it means to measure the price of a unit of $\ell$ in terms of what is given up of other goods. If money existed, the value of a dollar spent on $\ell$ would not be constant; it would depend on how much of $\ell$ is consumed.
We will assume a quasilinear environment, but note that it may generalize a bit beyond to situations where demand for $\ell$ is just not significant enough to cause non-negligible wealth effects and price changes elsewhere. Clearly, this argument is violated if $\ell$ has close substitutes or complements.
Suppose then that all goods other than $\ell$, bundled together, enter consumer $i$'s utility function linearly, and denote the quantity of this composite commodity by $m_i$ ("money" left over). It can be treated as the numeraire, i.e. we normalize its price to 1.
Let the quantity of good $\ell$ be $x_i$, and let $i$'s quasilinear preferences be represented by
$$u_i(m_i, x_i) = m_i + \phi_i(x_i),$$
where $\phi_i(\cdot)$ is increasing and strictly concave, i.e. $\phi_i'(x_i) > 0$, $\phi_i''(x_i) < 0$ at all $x_i \ge 0$, and $\phi_i(0) = 0$ (a normalization). The price of $\ell$ is $p$.
Firms require $c_j(q_j)$ of numeraire commodity to produce $q_j \ge 0$ units of good $\ell$. Let $c_j(q_j)$ be increasing, convex and twice differentiable, i.e. $c_j'(q_j) > 0$, $c_j''(q_j) \ge 0$ at all $q_j \ge 0$. (Note that $c(0) = 0$ only makes sense in the long run, when there are no fixed costs.)
Suppose there is no endowment of $\ell$, and an endowment $\omega_{mi} > 0$ of the numeraire.
Firm $j$ sets $q_j^*$ so that $p^* q_j^* - c_j(q_j^*)$ attains
$$\max_{q_j \ge 0} \; p^* q_j - c_j(q_j),$$
which has first-order conditions $p^* \le c_j'(q_j^*)$, with equality if $q_j^* > 0$.
Consumer $i$ solves
$$\max_{m_i \in \mathbb{R},\, x_i \in \mathbb{R}_+} m_i + \phi_i(x_i) \quad \text{s.t.} \quad m_i + p^* x_i \le \omega_{mi} + \sum_{j=1}^J \theta_{ij} \big( p^* q_j^* - c_j(q_j^*) \big)$$
$$= \max_{x_i \in \mathbb{R}_+} \Big[ \phi_i(x_i) - p^* x_i + \omega_{mi} + \sum_{j=1}^J \theta_{ij} \big( p^* q_j^* - c_j(q_j^*) \big) \Big],$$
since the budget constraint holds with equality at a solution (so that we can substitute for $m_i$).
The first-order conditions, $\phi_i'(x_i^*) \le p^*$ (equality if $x_i^* > 0$), determine the equilibrium $p^*$ together with market clearing, $\sum_{i=1}^I x_i^* = \sum_{j=1}^J q_j^*$.
Note that none of the equilibrium conditions depend on endowments
or ownership stakes. This is a property of quasilinear preferences.
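To see the equilibrium conditions at work, here is a small numerical sketch. The functional forms are assumptions made only for illustration: $\phi_i(x) = a_i \ln(1 + x)$, which is strictly concave with $\phi_i(0) = 0$, and $c_j(q) = b_j q^2 / 2$. The price is found by bisection on excess demand; note that endowments and ownership shares never enter the computation.

```python
import math

A = [2.0, 3.0]   # hypothetical taste parameters a_i in phi_i(x) = a_i * log(1 + x)
B = [1.0, 2.0]   # hypothetical cost parameters b_j in c_j(q) = b_j * q**2 / 2


def demand(p):
    # phi_i'(x) = a_i / (1 + x) <= p, with equality if x > 0
    return sum(max(a / p - 1.0, 0.0) for a in A)


def supply(p):
    # p <= c_j'(q) = b_j * q, with equality if q > 0
    return sum(p / b for b in B)


def equilibrium_price(lo=1e-9, hi=100.0, tol=1e-12):
    # Excess demand is decreasing in p, so bisection finds the clearing price.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if demand(mid) > supply(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_star = equilibrium_price()
print(p_star, demand(p_star), supply(p_star))
```

With these particular parameters both consumers buy a positive amount, so market clearing reduces to $5/p - 2 = 1.5p$, which gives a closed form to compare against.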
Exercise (MWG 10.C.2).
Exercise (MWG 10.G.2).
Exercise (MWG 10.C.6).
Exercise (MWG 10.G.5). (Only part a.)
13.3 The Long Run
Let there be infinitely many firms that could potentially produce commodity $\ell$ with the same cost function. Firms are identical, so we may denote any individual firm's output as $q$ (since it is unique if costs are strictly convex).
Given enough time, a firm can attain zero profit by shutting down completely ($c(0) = 0$). Assuming free entry and exit, firms will then enter as long as positive profit can be made, and exit if only negative profit can be made. Therefore, all individual firms must have zero profit in long-run equilibrium, else (if positive) more would enter or (if negative) more would exit, so some firms could not be optimizing.
This leads to the following special version of competitive equilibrium.
Definition. A long-run competitive equilibrium, given aggregate demand function $x(p)$ and cost function $c(\cdot)$ for potentially active firms (where $c(0) = 0$), is a price $p^*$, a per-firm production level $q^*$, and a number of firms $J^*$ such that firms optimize, profitable entry is not possible, and markets clear:
(i) $p^* q^* - c(q^*) = \max_{q \ge 0} \; p^* q - c(q)$,
(ii) $p^* q^* - c(q^*) = 0$,
(iii) $x(p^*) = J^* q^*$.
Assume strictly positive demand at a price equal to marginal cost, $x(c'(0)) > 0$, and constant returns to scale: $c(q) = cq$ for some $c > 0$. Then we must have $p^* \le c$, else every firm would want to produce infinitely by (i). Hence $\max_{q \ge 0} (p^* - c) q = 0$. Since $x(c'(0)) = x(c) > 0$, market clearing (iii) requires $q^* > 0$, which is optimal only if $p^* = c$. But then $q^*$ is indeterminate by (ii), and therefore $J^*$ is indeterminate.
There is no long-run equilibrium with increasing and strictly convex costs. In this case, if $p > c'(0)$, firms make a positive profit at some level of production, so total supply is infinite. Therefore $p \le c'(0)$, but then supply is zero and demand is positive by assumption, $x(c'(0)) > 0$, i.e. market clearing fails.
To get a determinate long-run number of firms, there must exist an efficient scale $\bar{q}$ that minimizes long-run average cost, $\bar{c} = c(\bar{q}) / \bar{q}$. If $p > \bar{c}$, output is infinite. If $p < \bar{c}$, then no nonzero output level is profitable (since $\bar{c}$ is the minimized average cost). So we must have $p = \bar{c}$, and $J = x(\bar{c}) / \bar{q}$.
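A short numerical sketch of the efficient-scale case follows. The cost function $c(q) = F + q^2$ for $q > 0$ (with $c(0) = 0$, i.e. an avoidable fixed cost) and the linear demand $x(p) = A - Bp$ are hypothetical choices, picked so that average cost has an interior minimum.

```python
import math

F = 4.0            # hypothetical avoidable fixed cost
A, B = 20.0, 2.0   # hypothetical linear demand x(p) = A - B * p


def cost(q):
    # c(0) = 0: a firm that shuts down pays nothing in the long run.
    return 0.0 if q == 0 else F + q * q


def demand(p):
    return max(A - B * p, 0.0)


q_bar = math.sqrt(F)              # minimizes average cost F / q + q
c_bar = cost(q_bar) / q_bar       # minimized average cost = long-run price
n_firms = demand(c_bar) / q_bar   # J = x(c_bar) / q_bar

# At price c_bar no output level on the grid earns a positive profit.
max_profit = max(c_bar * q - cost(q) for q in (i / 100 for i in range(1001)))
print(q_bar, c_bar, n_firms, max_profit)
```

In this example the number of firms happens to be an integer; in general $x(\bar{c}) / \bar{q}$ need not be, which the notes do not address and the sketch ignores.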
Exercise (MWG 10.F.2). Long-run and short-run competitive equilibrium price
Exercise (MWG 10.F.3). Tax impact in short run and long run
Exercise (MWG 10.F.6). Short-run and long-run supply function with fixed factor
14 Welfare Analysis
14.1 Pareto Efficiency and Surplus
We now consider the welfare properties of competitive equilibrium.
Definition. A feasible allocation $(x_1, \ldots, x_I, y_1, \ldots, y_J)$ is Pareto efficient with respect to preferences represented by utility functions $u_1(\cdot), \ldots, u_I(\cdot)$ if there exists no other feasible allocation $(x_1', \ldots, x_I', y_1', \ldots, y_J')$ such that $u_i(x_i') \ge u_i(x_i)$ for all $i = 1, \ldots, I$ (and the inequality is strict for someone).
This is a minimal welfare criterion: there is no distributional fairness implied (allocating everything to one person is Pareto efficient). However, it makes sense to at least require that not everyone can be made better off.
Consider a partial equilibrium setting where individual utility functions are quasilinear, i.e.
$$u_i(m_i, x_i) = m_i + \phi_i(x_i)$$
for $i = 1, \ldots, I$. As in the previous lecture, $m_i$ is $i$'s consumption of a numeraire commodity, which is a composite of all goods other than $\ell$. Commodity $\ell$ is consumed by individual $i$ in quantity $x_i$ and produced by firm $j$ in quantity $q_j$ at cost $c_j(q_j)$.
In the quasilinear setting, there is a convenient welfare test. An allocation is Pareto efficient if and only if it maximizes the aggregate surplus
$$S(x_1, \ldots, x_I, q_1, \ldots, q_J) = \sum_{i=1}^I \phi_i(x_i) - \sum_{j=1}^J c_j(q_j).$$
To understand the connection, consider the utility possibility set:
$$U = \Big\{ (u_1, \ldots, u_I) \in \mathbb{R}^I \ \text{s.t.}\ \exists \text{ feasible allocation } (x_1, \ldots, x_I, q_1, \ldots, q_J) \text{ with } u_i \le u_i(m_i, x_i) \text{ for } i = 1, \ldots, I \Big\},$$
i.e. the set of all attainable "utility allocations" $u_1, \ldots, u_I$.
If individual consumption of good $\ell$ and production plans are fixed at $(\bar{x}_1, \ldots, \bar{x}_I, \bar{q}_1, \ldots, \bar{q}_J)$, then $\omega_m - \sum_{j=1}^J c_j(\bar{q}_j)$ is available to be spent on the numeraire good. Since the total utility from consuming $\ell$ is $\sum_{i=1}^I \phi_i(\bar{x}_i)$, the feasible utility constraint is
$$\sum_{i=1}^I u_i \le \omega_m + \sum_{i=1}^I \phi_i(\bar{x}_i) - \sum_{j=1}^J c_j(\bar{q}_j) = \omega_m + S(\bar{x}_1, \ldots, \bar{x}_I, \bar{q}_1, \ldots, \bar{q}_J).$$
This means that the boundary of the utility possibility set is linear in the consumption of the numeraire (with slope $-1$). Altering production or consumption of good $\ell$ leads to parallel shifts.
Optimal consumption and production levels for good $\ell$ are those where the boundary of the utility possibility set is as far out as possible. Then Pareto-optimal allocations can only differ in the distribution of the numeraire among consumers (given the $\phi_i(\cdot)$ are strictly concave and $c_j(\cdot)$ strictly convex, their levels are uniquely determined at optimal consumption and production choices).
An increase in aggregate surplus (in the quasilinear setting) expands the utility possibility set and is therefore welfare-improving with respect to any other reasonable measure (that respects the Pareto principle).
Hence, in the quasilinear environment, Pareto-optimal and surplus-maximizing are equivalent.
14.2 Efficiency of Competitive Equilibrium
Recall that, in the competitive equilibrium problem, $p = c_j'(q_j)$ for all $j$, and $p = \phi_i'(x_i)$ for all $i$ at a solution. Defining the inverse demand function as $x^{-1}(\cdot)$ such that $x^{-1}(x(p)) = p$, and the inverse supply function as $q^{-1}(\cdot)$ such that $q^{-1}(q(p)) = p$, we have $q^{-1}(q(p)) = p = c_j'(q_j)$.
Note that marginal cost and marginal utility, at a solution, are the same across all consumers and firms (since they are both equal to the constant $p$). Hence we can replace $c_j'(\cdot)$ with the "industry marginal cost" $C'(\cdot)$, and $x^{-1}(x(p))$ with the price $P(\cdot)$ at which a particular quantity is demanded. Then $\phi_i'(x_i) = P(x)$ for every $i$, and $c_j'(q_j) = C'(q)$ for every $j$.
Consider a differential change in aggregate surplus:
$$dS = \sum_{i=1}^I \phi_i'(x_i)\, dx_i - \sum_{j=1}^J c_j'(q_j)\, dq_j = P(x) \sum_{i=1}^I dx_i - C'(q) \sum_{j=1}^J dq_j = \big( P(x) - C'(x) \big)\, dx.$$
(Because $P(x)$ and $C'(x)$ are constants at a competitive equilibrium, they do not vary with individual or firm. Apply feasibility $q = x$ and $\sum_{i=1}^I dx_i = dx = dq = \sum_{j=1}^J dq_j$ in the last equality.)
Integrating over quantity consumed of $\ell$, we get
$$S(x) = S_0 + \int_{s=0}^{x} \big( P(s) - C'(s) \big)\, ds,$$
which is graphically the area between demand and supply curve below $x$, the standard depiction of aggregate surplus. ($S_0$ equals aggregate surplus when $\ell$ is not consumed, i.e. in this case the endowment of the numeraire.)
It is clear from the expression that aggregate surplus is maximized at $x$ such that $P(x) = C'(x)$, i.e. the competitive equilibrium consumption level $x$ where price equals marginal cost of $\ell$.
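The surplus integral can be evaluated numerically to confirm where it peaks. The sketch below assumes hypothetical linear curves, $P(x) = A - Bx$ and $C'(x) = C + Dx$, integrates $P(s) - C'(s)$ with the midpoint rule, and compares the grid maximizer with the intersection of the two curves.

```python
A, B = 10.0, 1.0   # hypothetical inverse demand P(x) = A - B * x
C, D = 2.0, 1.0    # hypothetical industry marginal cost C'(x) = C + D * x


def inverse_demand(x):
    return A - B * x


def marginal_cost(x):
    return C + D * x


def surplus_gain(x, n=1000):
    # Midpoint rule for the integral of P(s) - C'(s) from 0 to x,
    # i.e. S(x) - S_0 in the notation of the text.
    step = x / n
    return sum(
        (inverse_demand((k + 0.5) * step) - marginal_cost((k + 0.5) * step)) * step
        for k in range(n)
    )


grid = [i / 10 for i in range(81)]
best_x = max(grid, key=surplus_gain)
print(best_x, surplus_gain(best_x), inverse_demand(best_x), marginal_cost(best_x))
```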
Note that aggregate surplus is the area between the inverse demand function $P(x) = x^{-1}(x(p)) = p = \phi_i'(x_i)$ and the inverse supply function $C'(q) = q^{-1}(q(p)) = p = c_j'(q_j)$. I.e. the inverse demand function indicates marginal utility of consumption, and the inverse supply function indicates marginal cost of production. (In a well-defined sense, since these are equalized across individuals in competitive equilibrium.)
At the competitive equilibrium solution, we therefore have $P(x) = C'(x) = \phi_i'(x_i) = c_j'(q_j)$. I.e. the marginal utility of another unit of $\ell$ equals its opportunity cost in terms of the numeraire. (All this presumes consumers and firms are price takers.)
Thus, the competitive equilibrium allocation is Pareto-optimal. This is an instance of the first welfare theorem, which says that a competitive equilibrium allocation is efficient.
Exercise (MWG 10.D.3).
Exercise (MWG 10.D.4).
14.3 Efficient Allocations through the Market Mechanism
Since the competitive equilibrium is surplus-maximizing in the quasilinear case, it can be thought of as the solution to the problem
$$\max_{\substack{(x_1, \ldots, x_I) \ge 0 \\ (q_1, \ldots, q_J) \ge 0}} \Big[ \omega_m + \sum_{i=1}^I \phi_i(x_i) - \sum_{j=1}^J c_j(q_j) \Big] \quad \text{s.t.} \quad \sum_{i=1}^I x_i = \sum_{j=1}^J q_j.$$
First-order conditions are: for $i = 1, \ldots, I$,
$$\phi_i'(x_i) \le \mu$$
and for $j = 1, \ldots, J$,
$$c_j'(q_j) \ge \mu.$$
(With equality if $x_i > 0$, respectively $q_j > 0$.) And the feasibility constraint.
To equate these to first-order conditions of the competitive equilibrium problem, we must have $\mu = p$. I.e. the shadow price $\mu$ of the resource constraint for good $\ell$ is its price $p$, i.e. price reflects the marginal social value of the good (this also implies, through first-order conditions, $\phi_i'(x_i) = p = c_j'(q_j)$: marginal utility from consumption equals marginal cost). Then the problem jointly solves for a competitive equilibrium and Pareto efficient allocation.
Because the amount of the numeraire commodity $\omega_m$ can be allocated in any conceivable way among individuals without affecting the solution (but it does, of course, affect individual utility), every possible utility distribution can be implemented and remains Pareto optimal. This is the second welfare theorem.
The second welfare theorem says that any Pareto-efficient allocation is a competitive equilibrium allocation for some specification of zero-sum transfers, i.e. $T_1, \ldots, T_I$ such that $\sum_{i=1}^I T_i = 0$, between individual endowments of the numeraire commodity.
Exercise (MWG 10.C.3).
Exercise (MWG 10.C.4).
15 Externalities
15.1 Externalities
Up until now, we have assumed that individuals care only about their own consumption - they are indifferent to the consumption choices of others or production choices of firms, except insofar as these affect one's own budget set. One can, however, think of many examples where other people's actions directly impact one's personal welfare (traffic congestion, noise, pollution, to name a few). These effects are called externalities.
Definition. An externality is a welfare effect one agent's actions have on another.
To get a sense of the issue, consider initially a bilateral externality that is imposed by one person on another (as opposed to a multilateral externality that affects many others). The two concerned individuals $i = 1, 2$ are a small part of the economy, i.e. their choices do not affect prices $p \in \mathbb{R}^L$. Their wealths, given these prices, are $w_1$ and $w_2$.
Besides having preferences over their consumption of goods $x_i = (x_{1i}, \ldots, x_{Li})$, both individuals also care about an action $h \in \mathbb{R}_+$ taken by 1. Thus, 1 imposes a (positive or negative) externality on 2.
Since we wish to focus on the role of the externality, it is convenient to define a "derived" preference over levels of $h$, given that consumption of the $L$ goods is optimized. The associated utility function is
$$v_i(p, w_i, h) = \max_{x_i \ge 0} u_i(x_i, h) \quad \text{s.t.} \quad p \cdot x_i \le w_i.$$
Let preferences be quasilinear with respect to a numeraire (good 1), so that
$$u_i(x_i, h) = g_i(x_{-1i}, h) + x_{1i}$$
(where $x_{-1i} = (x_{2i}, \ldots, x_{Li})$ denotes consumption of goods $2, \ldots, L$).
Evaluating at the optimal demands $x_{-1i}(p, h)$, which are independent of wealth, we have
$$v_i(p, w_i, h) = g_i(x_{-1i}(p, h), h) + w_i - p_{-1} \cdot x_{-1i}(p, h) = \phi_i(p, h) + w_i,$$
where $\phi_i(p, h) = g_i(x_{-1i}(p, h), h) - p_{-1} \cdot x_{-1i}(p, h)$.
Since commodity prices $p$ are fixed throughout the discussion, we can treat $\phi_i(\cdot, \cdot)$ as a function of $h$ only. Let $\phi_i(\cdot)$ be strictly concave (i.e. $\phi_i''(\cdot) < 0$), and normalize to $\phi_i(0) = 0$.
We should, however, note that concavity may well be violated in practice. For example, if a firm is able to shut down operations when a sufficiently high negative externality is imposed on it, then its profit function cannot be forever decreasing in $h$, making it convex somewhere. Or if a firm's constant-returns-to-scale production function is affected by a positive externality (which acts like another input), then the firm effectively operates an increasing returns technology and therefore has a convex profit function.
15.2 Inefficiency
Denote by $h^*$ the equilibrium level of action $h$, which must be optimal for 1, and by $h^\circ$ the (possibly different) socially optimal level. Recall that, in the quasilinear setting, the socially optimal allocation is unambiguously the one that maximizes aggregate surplus
$$S(h) = \phi_1(h) + \phi_2(h) + w_1 + w_2.$$
(Pareto efficiency holds if and only if surplus is maximized).
The first-order condition that the socially optimal externality $h^\circ$ must satisfy is therefore
$$\phi_1'(h^\circ) + \phi_2'(h^\circ) \le 0,$$
with equality if $h^\circ > 0$.
However, 1's choice of $h$ maximizes 1's derived utility function, which has the Kuhn-Tucker first-order condition
$$\phi_1'(h^*) \le 0,$$
with equality if $h^* > 0$.
Suppose $h^* > 0$, so that 1 in fact imposes the externality on 2, and say that $h^\circ > 0$ (otherwise, the externality is obviously inefficient). If $h$ is a negative externality, i.e. $\phi_2'(\cdot) < 0$ everywhere, then optimality of $h^\circ$ requires
$$\phi_1'(h^\circ) = -\phi_2'(h^\circ) > 0.$$
Since $\phi_1'(h)$ was assumed to be decreasing in $h$ (by concavity), and $\phi_1'(h^\circ) > 0 = \phi_1'(h^*)$, we must have $h^\circ < h^*$ (too much of the externality is imposed).
If $h$ is a positive externality, i.e. $\phi_2'(\cdot) > 0$ everywhere, then optimality of $h^\circ$ requires
$$\phi_1'(h^\circ) = -\phi_2'(h^\circ) < 0.$$
Again, since $\phi_1'(h)$ is decreasing in $h$ and $\phi_1'(h^\circ) < 0 = \phi_1'(h^*)$, it follows that $h^\circ > h^*$ (too little of the externality is imposed).
Exercise (MWG 11.D.1).
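The comparison between the equilibrium and the socially optimal level of a negative externality can be made concrete with a small sketch. The derived utilities $\phi_1(h) = 10h - h^2$ and $\phi_2(h) = -4h - h^2/2$ are hypothetical; both are strictly concave with $\phi_i(0) = 0$, and $\phi_2' < 0$ everywhere on $h \ge 0$.

```python
def dphi1(h):
    # Marginal benefit to agent 1 of the action, for phi_1(h) = 10h - h**2
    return 10.0 - 2.0 * h


def dphi2(h):
    # Marginal effect on agent 2, for phi_2(h) = -4h - h**2 / 2 (negative externality)
    return -4.0 - h


def root_of_decreasing(f, lo=0.0, hi=100.0, tol=1e-12):
    # Bisection for a decreasing function with f(lo) > 0 > f(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


h_eq = root_of_decreasing(dphi1)                              # phi_1'(h*) = 0
h_opt = root_of_decreasing(lambda h: dphi1(h) + dphi2(h))     # phi_1' + phi_2' = 0
print(h_eq, h_opt)
```

Here agent 1 chooses $h^* = 5$ while the surplus-maximizing level is $h^\circ = 2$, in line with the over-provision result for negative externalities.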
15.3 Remedies
One way to eliminate the inefficiency is by quota, i.e. by mandating agent 1 to provide no more or no less than the socially efficient level $h^\circ$, depending on whether the externality is negative or positive.
Another option is to tax or subsidize the externality. The optimal tax (subsidy) has to achieve $h^* = h^\circ$, i.e. $\phi_1'(h^*) = -\phi_2'(h^\circ)$. Since the tax subtracts from what agent 1 is able to spend on the numeraire commodity, it leaves derived utility
$$v_1(p, w_1, h) = \phi_1(h) - th + w_1,$$
which is maximized by $h^*$ such that
$$\phi_1'(h^*) = t$$
(given that $h^\circ > 0$, so that the government wants to induce $h^* > 0$).
One can see that $h^* = h^\circ$ if
$$t = -\phi_2'(p, h^\circ).$$
The interpretation is that 1 is made to bear the cost (receives the benefit) bestowed on 2. Such corrective taxes are called Pigouvian taxes.
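As a minimal illustration of the Pigouvian tax, the sketch below uses hypothetical derived utilities $\phi_1(h) = 10h - h^2$ and $\phi_2(h) = -4h - h^2/2$ (a negative externality). It computes the socially optimal level, sets $t = -\phi_2'(h^\circ)$, and checks that agent 1's taxed choice coincides with $h^\circ$.

```python
def dphi1(h):
    # phi_1'(h) for the hypothetical phi_1(h) = 10h - h**2
    return 10.0 - 2.0 * h


def dphi2(h):
    # phi_2'(h) for the hypothetical phi_2(h) = -4h - h**2 / 2
    return -4.0 - h


def root_of_decreasing(f, lo=0.0, hi=100.0, tol=1e-12):
    # Bisection for a decreasing function with f(lo) > 0 > f(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


h_opt = root_of_decreasing(lambda h: dphi1(h) + dphi2(h))
tax = -dphi2(h_opt)                                    # t = -phi_2'(h_opt)
h_taxed = root_of_decreasing(lambda h: dphi1(h) - tax)  # phi_1'(h) = t
print(h_opt, tax, h_taxed)
```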
Exercise (MWG 11.D.2).
Exercise (MWG 11.B.4).
Bargaining can also resolve the externality problem. Say, 1 has the right to choose any level of $h$, but can sign a contract with 2 to set $h$ to a specific level in return for compensation. If 2 offers a payment of $T$ for the implementation of $\hat{h}$ that is sufficient for 1, then 2 maximizes derived utility
$$v_2(p, w_2, \hat{h}) = \phi_2(\hat{h}) - T + w_2 \quad \text{s.t.} \quad \phi_1(\hat{h}) + T + w_1 = \phi_1(h^*) + w_1$$
$$= \phi_2(\hat{h}) - \big( \phi_1(h^*) - \phi_1(\hat{h}) \big) + w_2,$$
after substituting $T = \phi_1(h^*) - \phi_1(\hat{h})$ from the constraint.
Because $\phi_1(h^*)$ is a constant that does not depend on $\hat{h}$, the first-order condition is
$$\phi_2'(\hat{h}) = -\phi_1'(\hat{h}),$$
hence $\hat{h} = h^\circ$.
Alternatively, if 2 has the right to choose $h$, then 2 can let 1 purchase the right to implement $\hat{h}$ at a price $T$. The optimal offer maximizes
$$v_2(p, w_2, \hat{h}) = \phi_2(\hat{h}) + T + w_2 \quad \text{s.t.} \quad \phi_1(\hat{h}) - T + w_1 = \phi_1(0) + w_1$$
$$= \phi_2(\hat{h}) - \big( \phi_1(0) - \phi_1(\hat{h}) \big) + w_2,$$
which is identical to the case where 1 owns the right to the externality up to a constant (since $\phi_1(0) = 0$) and has the same solution.
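The two bargaining problems can be compared numerically. The sketch below again assumes hypothetical derived utilities $\phi_1(h) = 10h - h^2$ and $\phi_2(h) = -4h - h^2/2$, lets agent 2 choose the contracted level on a grid under either assignment of rights, and reports the transfer implied by agent 1's participation constraint.

```python
def phi1(h):
    return 10.0 * h - h * h          # hypothetical derived utility of agent 1


def phi2(h):
    return -4.0 * h - h * h / 2.0    # hypothetical derived utility of agent 2


GRID = [i / 100 for i in range(1001)]
h_star = max(GRID, key=phi1)         # what agent 1 would choose without a contract

# Case 1: agent 1 holds the right; agent 2 pays T = phi1(h_star) - phi1(h).
h_hat_1 = max(GRID, key=lambda h: phi2(h) - (phi1(h_star) - phi1(h)))
transfer_1 = phi1(h_star) - phi1(h_hat_1)

# Case 2: agent 2 holds the right; agent 1 pays T = phi1(h) - phi1(0).
h_hat_2 = max(GRID, key=lambda h: phi2(h) - (phi1(0.0) - phi1(h)))
transfer_2 = phi1(h_hat_2) - phi1(0.0)

print(h_star, h_hat_1, transfer_1, h_hat_2, transfer_2)
```

Both assignments lead to the same level of the externality; only the direction and size of the transfer differ, which is the distributional side of the argument.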
While it does not matter who has the right to the externality, it is crucial that the right is allocated to someone in advance and tradable. This argument is known as the "Coase theorem." In the case of firms, the transfer of rights may occur through a merger or acquisition, which creates value by internalizing the externality.
Exercise (MWG 11.B.1).
Related is the "missing markets" view of externalities. Instead of bargaining over the amount of the externality, a market for units of the externality would also fix the problem. An externality is then simply a commodity, and inefficiency arises when it is not traded (the market is missing).
We can generalize to a model where $J$ firms produce amounts $h_1, \ldots, h_J$ of the externality and $I$ individuals experience amounts $h_1, \ldots, h_I$ of it. (To connect with the previous two-person environment, you can think of 1 as a firm and 2 as a person.)
If firms must buy the right to impose a unit of the externality on someone at a price $p_h$, then firm $j$ demands $h_j$ units that maximize its profit
$$\pi_j(p, h_j) = f_j(h_j) - p_h h_j,$$
where $f_j(h_j)$ is $j$'s (concave) production function for the numeraire, which requires $h_j$ as an input. (This set-up reflects the interpretation that the externality is negative. If it is positive, think of $f_j(h_j) < 0$ as a convex cost function, and of $p_h < 0$, i.e. firms must be paid to produce the externality. Conclusions are the same.)
Consumer $i$ supplies $h_i$ units of the externality, maximizing
$$v_i(p, w_i, h_i) = \phi_i(h_i) + p_h h_i + w_i.$$
First-order conditions imply
$$f_j'(h_j) = p_h = -\phi_i'(h_i)$$
for firms that produce, and individuals who consume, a positive amount of the externality, $h_j > 0$, respectively $h_i > 0$. The market price of the externality turns out to be the Pigouvian tax.
Socially optimal provision of the externality maximizes aggregate surplus
$$S(h) = \sum_{i=1}^I \big( \phi_i(h_i) + w_i \big) + \sum_{j=1}^J f_j(h_j).$$
(Note that transfers $\sum_{i=1}^I p_h h_i = \sum_{j=1}^J p_h h_j$ drop out of the social objective due to market clearing.)
First-order conditions for
$$\max_{\substack{h_1, \ldots, h_I \ge 0 \\ h_1, \ldots, h_J \ge 0}} S(h) \quad \text{s.t.} \quad \sum_{i=1}^I h_i = \sum_{j=1}^J h_j$$
give
$$f_j'(p, h_j^\circ) = \mu = -\phi_i'(p, h_i^\circ)$$
for every $i$ and $j$ such that $h_i^\circ > 0$ and $h_j^\circ > 0$. This implies $h_j^* = h_j^\circ$, i.e. the market provision of the externality is Pareto-efficient.
Exercise (MWG 11.D.5). (Assume $f(0) = 0$.)
15.4 Public Goods
The market solution relied on the depletable nature of the externality: a firm could directly sell a unit to a consumer without affecting others. This is a reasonable assumption for some types of externalities (say, construction noise next to a single house), but not for others (for example, air pollution). The latter type of externality is called a public good (or public bad, as the case may be).
Definition. A public good is a non-rivalrous (i.e. non-depletable) commodity: it can be consumed simultaneously by all agents.
Suppose firms $j = 1, \ldots, J$ produce quantities $h_1, \ldots, h_J$ of a public good at convex cost $c_j(h_j)$ in terms of the numeraire. They sell units of it to individuals at price $p_h$. Individuals $i = 1, \ldots, I$ demand quantities $h_1, \ldots, h_I$, but in fact consume the entire amount of the public good that firms supply, i.e. $\sum_{j=1}^J h_j = \sum_{i=1}^I h_i = h$.
Then
$$v_i(p, w_i, h_i) = \phi_i(h) - p_h h_i + w_i,$$
and
$$\pi_j(h_j) = p_h h_j - c_j(h_j).$$
The aggregate surplus is
$$S(h) = \sum_{i=1}^I \big( \phi_i(h) + w_i \big) - c(h),$$
where $c(h) = \sum_{j=1}^J c_j(h_j)$ is the total cost of providing amount $h$ of the public good (payments $\sum_{i=1}^I p_h h_i = \sum_{j=1}^J p_h h_j$ canceled out).
Surplus is maximized (Pareto efficiency attained) only if
$$\sum_{i=1}^I \phi_i'(h^\circ) = c'(h^\circ)$$
(provided $h^\circ > 0$). This is the Samuelson condition for the optimal provision of a public good. (It says that the sum of marginal rates of substitution between a public and private good, in this case the numeraire, should equal the marginal rate of transformation.)
If the public good is excludable, then government can achieve efficiency by imposing personal prices $(p_1, \ldots, p_I)$ on individuals for a unit of the public good: so-called Lindahl prices $p_i = \phi_i'(p, h^\circ)$ per unit of the public good consumed, for $i = 1, \ldots, I$.
These prices are first-order conditions for consumers, hence all individuals optimize by demanding $h^\circ$ units. A single firm (or a consortium of $J$ firms) maximizes profit $\pi(h) = \big( \sum_{i=1}^I p_i \big) h - c(h)$ by setting $\sum_{i=1}^I p_i = c'(h^*)$. Hence $\sum_{i=1}^I \phi_i'(h^\circ) = c'(h^*)$, which implies $h^* = h^\circ$.
The classic (or pure) public good is, however, non-excludable in addition to being non-rivalrous (i.e. non-depletable). Then individual $i$ can choose to pay for amount $h_i$ of the public good, but consume $h = \sum_{i=1}^I h_i$. If the good is sold in the market at price $p_h$, the first-order conditions for individuals and firms are
$$\phi_i'(h^*) - p_h \le 0, \qquad p_h - c_j'(h_j^*) \le 0,$$
with equality if $h_i^* > 0$, respectively $h_j^* > 0$. Note that, in a competitive equilibrium, marginal cost of production is equalized across firms, so that $c_j'(h_j^*) = c'(h^*)$ when $h_j^* > 0$.
If at least one unit of the public good is provided, then $\phi_i'(h^*) = p_h = c'(h^*)$ for some individual $i$ and firm $j$. Given that $\phi_i'(h) > 0$ for all $i$, and $c_j'(h_j) > 0$ for all $j$, this implies
$$\sum_{i=1}^I \phi_i'(h^*) > c'(h^*).$$
For example, if every firm produced the public good, and every individual purchased it, then $\phi_i'(h^*) = p_h = c'(h^*)$ for every $i$, so that $\sum_{i=1}^I \phi_i'(h^*) = I c'(h^*) > c'(h^*)$.
Because $\phi_i'(\cdot)$ is a decreasing function of $h$, and $c'(\cdot)$ is an increasing function of $h$, it follows that $h^* < h^\circ$, i.e. the public good is underprovided. This is an instance of the free-rider problem: individuals do not pay for the benefits they receive from purchases by others, so the price firms receive for producing a unit of the public good understates the social value.
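The gap between private provision and the Samuelson condition can be illustrated with a small sketch. It assumes hypothetical benefits $\phi_i(h) = a_i \ln(1 + h)$ and a single firm with cost $c(h) = h^2/2$, so that $c'(h) = h$. Under private provision only the individual with the highest marginal benefit buys; the Lindahl prices at the optimum are also reported.

```python
A = [2.0, 4.0]   # hypothetical parameters a_i in phi_i(h) = a_i * log(1 + h)


def marginal_benefit(a, h):
    return a / (1.0 + h)


def marginal_cost(h):
    # c(h) = h**2 / 2 for a single (assumed) producer
    return h


def root_of_decreasing(f, lo=0.0, hi=100.0, tol=1e-12):
    # Bisection for a decreasing function with f(lo) > 0 > f(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Samuelson condition: sum of marginal benefits equals marginal cost.
h_opt = root_of_decreasing(
    lambda h: sum(marginal_benefit(a, h) for a in A) - marginal_cost(h)
)
# Private provision: only the highest marginal benefit is equated to the price.
h_eq = root_of_decreasing(
    lambda h: max(marginal_benefit(a, h) for a in A) - marginal_cost(h)
)
# Lindahl prices: each individual's marginal benefit at the optimum.
lindahl_prices = [marginal_benefit(a, h_opt) for a in A]
print(h_eq, h_opt, lindahl_prices)
```

The Lindahl prices sum to the marginal cost at the optimum, which is why a producer facing them would supply the efficient quantity.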
In fact, if there is someone who has a higher marginal utility for the public good than everyone else, then $\phi_1'(h) = p_h$ for this individual entails $\phi_i'(h) < p_h$ for all others, i.e. no one else pays for a unit of the good.
The government can either provide a public good directly or use (minimum) quotas or subsidies to induce the efficient amount. For example, a per-unit subsidy of the form
$$s_i = \sum_{k \ne i} \phi_k'(p, h^\circ)$$
would cause consumers to take the total benefit of their public good purchases into account.
Exercise (MWG 11.B.5).
Exercise (MWG 11.D.3).
In general, whenever government must intervene, it faces the fundamental problem that it has no first-hand information about the effect of the public good on individuals. Thus, it may over- or underprovide the costly public good. A central issue is how to design a financing mechanism that elicits the correct information from individuals.
A solution is a Groves-Clarke (or pivotal) mechanism, which specifies that every agent pays the "costs" he inflicts on others. Specifically, the government could ask consumers how their well-being is affected by different levels of pollution and ask firms how beneficial it is for them to be able to pollute. Based on the responses, the government implements the level of pollution that is socially optimal, given the reports. In addition, it pays consumers the benefits reported by firms, and collects damages from firms equal to the loss in well-being reported by consumers.
No one has an incentive to misrepresent their true costs or benefits because one's own report only affects payments the other side makes or receives (not one's own liabilities). It also determines the level of pollution the government sets, but neither firms nor individuals have an interest to manipulate it. If firms report excessive costs, then more pollution will be allowed, but they will also have to pay higher damages. If consumers exaggerate their loss in well-being, pollution will be more restricted, but they receive less compensation.
Because a Groves-Clarke mechanism completely internalizes externalities - everyone bears the social cost of their actions (or reaps the social benefits) - it achieves efficiency.
16 Monopoly and Product Differentiation
16.1 Monopoly
For many industries, the price-taking assumption that is fundamental to competitive equilibrium is unrealistic. The notion that many small firms produce a particular good can be relaxed to varying extents. Most dramatically, to firms whose products have no close substitutes (monopolists). Then, to unique products that are, however, imperfectly substitutable (differentiation). Finally, to perfectly substitutable products provided by a limited number of firms.
We begin with a monopolist that faces differentiable demand $x(\cdot)$ and cost $c(\cdot)$ for its product (and we also assume that profit is quasiconcave, i.e. first-order conditions identify a unique maximum).
The monopolist sets its price to maximize profit
$$\pi(p) = p\, x(p) - c(x(p)).$$
The first-order condition
$$x(p^*) + p^* \frac{\partial x(p^*)}{\partial p} = \frac{\partial c(x(p^*))}{\partial x} \frac{\partial x(p^*)}{\partial p}$$
can be restated in terms of the inverse demand function $p(x(p^*))$ as
$$\frac{\partial p(x(p^*))}{\partial x}\, x(p^*) + p^* = \frac{\partial c(x(p^*))}{\partial x},$$
i.e. marginal revenue equals marginal cost. This is a general optimization principle for the firm, even in competitive environments, where its marginal impact on price is zero: $\partial p(x) / \partial x = 0$, and thus $p = \partial c(x) / \partial x$.
Note on inverting the derivative $\partial x(p^*) / \partial p$: it is generally the case that
$$\frac{\partial x(p)}{\partial p} \cdot \frac{\partial x^{-1}(x(p))}{\partial x} = 1.$$
This can be seen by differentiating the left and right side of $x^{-1}(x(p)) = p$ with respect to $p$ (using the chain rule on the left). Then, if we denote the inverse demand function $x^{-1}(x(p))$ by $p(x)$,
$$\frac{\partial p(x)}{\partial x} = \frac{1}{\partial x(p) / \partial p}.$$
Alternatively, one can come to the same conclusion by writing profit as $p(x)\, x - c(x)$ and differentiating with respect to $x$.
Exercise (MWG 12.B.1).
Rearranging the first-order condition to
$$p^* = \frac{\partial c(x(p^*))}{\partial x} - \frac{\partial p(x(p^*))}{\partial x}\, x(p^*),$$
we see that the monopoly price strictly exceeds marginal cost, provided $\partial p(x(p)) / \partial x < 0$. The reason is that the monopolist reduces sales from the socially optimal level where $p = \partial c(x) / \partial x$, in order to increase the price consumers are willing to pay per unit. (A competitive firm, on the other hand, is too small relative to the market to affect the price.)
Exercise (MWG 12.B.6).
Figure 18: Surplus sharing under perfect competition (left) and monopoly
(right)
The restraint in sales causes a "deadweight loss" to society, since the
foregone units cost less to produce than people are willing to pay for
them. From society's point of view, only the allocation matters, and
the price is irrelevant. However, the price determines how surplus from
trade is divided between the firm and consumers, so it is not irrelevant
to the monopolist. Figure 18 illustrates.
Example. Let inverse demand be linear, $p(x) = a - bx$, and marginal cost
constant at $c$. The optimal price and sales quantity satisfy marginal
revenue equals marginal cost:
$$\frac{\partial}{\partial x}\bigl[p(x^*)\,x^*\bigr] = a - 2bx^* = c,$$
i.e.
$$x^* = \frac{a - c}{2b}, \qquad p^* = p(x^*) = \frac{a + c}{2}.$$
You can see from the inverse demand function that the firm would not
produce unless $a > c$, so $p^* > c = p^{\circ}$ in the relevant
circumstances, where $p^{\circ}$ denotes the competitive price. Since
$x(p) = (a - p)/b$, the competitive output is
$x^{\circ} = x(c) = (a - c)/b > x^*$.
The deadweight loss is the area of the triangle with height
$\Delta p = p^* - c = (a - c)/2$ and base length
$\Delta x = x^{\circ} - x^* = (a - c)/(2b)$:
$$D = \frac{1}{2}\,(\Delta x)(\Delta p) = \frac{(a - c)^2}{8b} > 0.$$
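The linear example can be checked numerically. The following is a minimal sketch, not part of the original notes: it writes the intercept and slope of inverse demand as `a` and `b` (names chosen here), works in exact fractions, and computes the deadweight loss directly as the area of the triangle between the monopoly and the competitive output. The parameter values are arbitrary.

```python
from fractions import Fraction

def linear_monopoly(a, b, c):
    """Monopoly outcome for inverse demand a - b*x and constant marginal cost c."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x_m = (a - c) / (2 * b)        # marginal revenue a - 2*b*x equals c
    p_m = a - b * x_m              # price read off the inverse demand curve
    x_comp = (a - c) / b           # competitive output, where price equals c
    # Triangle with base (x_comp - x_m) and height (p_m - c).
    dwl = Fraction(1, 2) * (x_comp - x_m) * (p_m - c)
    return x_m, p_m, x_comp, dwl

x_m, p_m, x_comp, dwl = linear_monopoly(a=10, b=1, c=2)
print(x_m, p_m, x_comp, dwl)
```

With these values the monopolist sells 4 units at a price of 6, the competitive output is 8, and the deadweight loss is 8.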
Exercise (MWG 12.B.9).
Exercise (MWG 12.B.10).
If the monopolist were able to (perfectly) price-discriminate, i.e. charge
every consumer her valuation of each unit sold, then it would be
profit-maximizing to sell the socially optimal quantity, and no deadweight
loss would occur. This is immediately apparent from the fact that the
monopolist then captures the entire surplus and directly maximizes it.
There are, however, important practical obstacles to price
discrimination, from limited information to the possibility of resale by
consumers who obtain the product at lower prices.
Exercise (MWG 12.B.5).
16.2 Bertrand Price Competition
The two workhorse oligopoly models are Bertrand and Cournot duopoly.
Both extend straightforwardly to more players. In this lecture, we
develop Bertrand competition from its pure case, where firms interact
once with identical products, to repetition and to differentiated
products.
Two firms operate with constant marginal cost $c$ and face total demand
$x(p)$ for their joint output. Each firm $i = 1, 2$ faces demand
$$x_i(p_i, p_j) = \begin{cases} 0 & \text{if } p_i > p_j \\ \tfrac{1}{2}\,x(p_i) & \text{if } p_i = p_j \\ x(p_i) & \text{if } p_i < p_j \end{cases}$$
individually.
The firms simultaneously set prices. The solution of interest is a Nash
equilibrium. I will not give a formal definition of strategy here: in
words, it specifies an action for every possible information state the
player may find himself in. For our purposes, think of a player's strategy
as a direct action or, if there is a sequence of moves, as an action
conditional on the prior moves. Let $S_i$ be the set of such actions or
conditional actions available to player $i$.
Definition. A Nash equilibrium is a strategy profile $s = (s_1, s_2)$ such
that, for $i = 1, 2$, $s_i \in S_i$ and
$$\pi_i(s_i, s_j) \geq \pi_i(s_i', s_j)$$
for all $s_i' \in S_i$.
A Nash equilibrium specifies the strategy chosen by each player. It is
stable in the sense that, given $j$'s strategy, $i$ cannot gain by
changing hers, and vice versa. Since the payoffs are in this context
profits $\pi_i(p_i) = (p_i - c)\,x_i(p_i, p_j)$, a Nash equilibrium is a
price for each firm that is profit-maximizing, given the other firm's
price.
Pure Bertrand duopoly has a unique Nash equilibrium $(p_1^*, p_2^*)$ in
which $p_1^* = p_2^* = c$. Given $p_j^* = c$, a deviation to $p_i < c$
gains all sales, but makes a negative profit per sale, whereas a deviation
to $p_i > c$ loses all sales. Hence neither improves $i$'s payoff. This
means $p_i^* = c$ for $i = 1, 2$ is a Nash equilibrium.
Suppose there existed a distinct other Nash equilibrium $(p_1', p_2')$. If
$p_i' < p_j'$, then $i$ can strictly increase profit by raising the price
(regardless of whether $p_i' \geq c$ or $p_i' < c$). Hence none of the
prices can be larger than the other; we must have $p_i' = p_j'$. Now if
$p_i' < c$, then increasing the price would strictly increase $i$'s
profit. On the other hand, if $p_i' > c$, decreasing the price slightly
would gain all sales and strictly increase $i$'s profit. It follows that
only $p_i' = p_j' = c$ qualifies as a Nash equilibrium. (Hence the Nash
equilibrium is unique.)
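The equilibrium argument can be illustrated by brute force on a price grid. This is only a sketch under assumptions that are not in the notes: a hypothetical linear demand function, an arbitrary marginal cost, and a grid of one-cent prices. It checks that no price on the grid beats marginal cost pricing when the rival prices at marginal cost, while a common price above marginal cost invites undercutting.

```python
def bertrand_profit(p_i, p_j, c, demand):
    """Profit of firm i in pure Bertrand duopoly with total demand function `demand`."""
    if p_i > p_j:
        share = 0.0      # undercut: no sales
    elif p_i == p_j:
        share = 0.5      # tie: split the market
    else:
        share = 1.0      # lowest price: all sales
    return (p_i - c) * share * demand(p_i)

def profitable_deviations(p_i, p_j, c, demand, grid):
    """Prices on `grid` that give firm i strictly more profit than p_i, given p_j."""
    base = bertrand_profit(p_i, p_j, c, demand)
    return [p for p in grid if bertrand_profit(p, p_j, c, demand) > base + 1e-12]

demand = lambda p: max(10.0 - p, 0.0)      # hypothetical demand, for illustration only
c = 2.0                                    # arbitrary marginal cost
grid = [k / 100 for k in range(0, 1001)]   # prices 0.00, 0.01, ..., 10.00

print(profitable_deviations(c, c, c, demand, grid))              # empty list
print(len(profitable_deviations(6.0, 6.0, c, demand, grid)) > 0)  # undercutting pays
```

A discrete grid is only an approximation of the continuous price space used in the argument, so the check is illustrative rather than a proof of uniqueness.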
The finding that interaction of even just two firms reverts to marginal
cost pricing is striking, but typically not observed in reality. Next, we
consider two departures from pure Bertrand competition that restore
higher prices and positive profits.
Exercise (MWG 12.C.1).
Exercise (MWG 12.C.4). (Only part a.)
16.3 Repetition
One factor the pure Bertrand model ignores is that most firms know
that they will face the joint price-setting problem repeatedly over time.
This circumstance greatly expands the firms' options. They can now
condition their pricing on the rival's past actions and react aggressively
to price cuts.
Above marginal cost pricing may become viable because firms maximize
intertemporal, rather than immediate, profit and will consider the
effect of setting a low price today on the rival's behavior tomorrow.
Intertemporal profits are
$$\sum_{t=1}^{\infty} \delta^{t-1} \pi_{it},$$
where $\pi_{it}$ is $i$'s profit in period $t$ and $\delta \in (0, 1)$ is a
discount factor. The discount factor may be interpreted in various ways,
e.g. as the constant probability (at each point in time) that the firms
will compete again in the following period.
Exercise (MWG 12.D.1).
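Note that, if $\pi_{it}$ is a constant value $\pi_i$, then the infinite sum
can be reduced to
$$\sum_{t=1}^{\infty} \delta^{t-1} \pi_i = \frac{1}{1 - \delta}\,\pi_i,$$
since
$$(1 - \delta) \sum_{t=1}^{\infty} \delta^{t-1} \pi_i = \sum_{t=1}^{\infty} \delta^{t-1} \pi_i - \sum_{t=1}^{\infty} \delta^{t} \pi_i = \sum_{t=1}^{\infty} \delta^{t-1} \pi_i - \sum_{t=2}^{\infty} \delta^{t-1} \pi_i = \pi_i.$$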
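As a quick numerical check of the closed form for a constant profit stream, the sketch below compares a long partial sum with the per-period profit divided by one minus the discount factor. The function name and the numbers are illustrative choices, not taken from the notes.

```python
def discounted_sum(profit, delta, periods):
    """Partial sum of delta**(t-1) * profit over t = 1, ..., periods."""
    return sum(delta ** (t - 1) * profit for t in range(1, periods + 1))

profit, delta = 3.0, 0.9                 # arbitrary illustrative values
closed_form = profit / (1 - delta)       # value of the infinite sum
approx = discounted_sum(profit, delta, periods=500)

print(closed_form, approx)
```

After 500 periods the remaining tail is negligible, so the two numbers agree up to floating-point error.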
Cooperation on high prices (for example, the monopoly price $p^m$) might
be sustainable in Nash equilibrium if both firms play the trigger
strategy
$$p_i(t) = \begin{cases} p^m & \text{if } t = 1 \text{ or } p_j(\tau) = p^m \text{ for } \tau = 1, \ldots, t-1 \\ c & \text{otherwise} \end{cases}$$
for $i = 1, 2$. I.e. at any given time, firm $i$ sets a high price $p^m$
only if $j$ has set $p^m$ at all previous times. Otherwise, $i$ "punishes"
$j$ by pricing at marginal cost, so that $j$ can make no more than zero
profit.
Such a strategy is called Nash reversion, because it switches from the
best to the worst Nash equilibrium in the punishment phase. It is also
a "grim" strategy in that it never forgives the rival for setting a low
price.
In theory, if monopoly prices can be maintained forever in equilibrium,
punishment will never actually occur - all that matters is the threat.
This is why an unforgiving trigger strategy such as the one above can
be optimal: the threat is never tested.
More pragmatic trigger strategies may maintain low prices for a
sufficiently long time to deter the rival in the future. In case the
punishment is ever triggered, say because the rival makes a mistake,
marginal cost pricing hurts the firm in the present, but can be understood
as an investment in reputation that permits profitable cooperation in the
future.
If $\delta \geq 1/2$ (i.e. there is not too much discounting of future
income), then $p_i(t)$ for $i = 1, 2$ is a Nash equilibrium of an
infinitely repeated Bertrand game. The reasoning is inductive. In the
first period, the strategies prescribe $(p_1(1), p_2(1)) = (p^m, p^m)$.
Now suppose $(p_1(t-1), p_2(t-1)) = (p^m, p^m)$ and $p_1(t) = p^m$. The
best alternative for 2 to $p_2(t) = p^m$ is to slightly undercut and earn
a profit of $(p^m - c)\,q^m$, instead of $(p^m - c)\,(q^m/2)$, where $q^m$
is the quantity demanded at the monopoly price.
Undercutting triggers $p_1(\tau) = c$ for $\tau = t+1, \ldots$ (so that the
maximal profit 2 can receive in future periods is zero), so it is optimal
for 2 in period $t$ if and only if
$$(p^m - c)\,q^m > \frac{1}{1 - \delta}\,(p^m - c)\,\frac{q^m}{2},$$
i.e. if the profit from capturing all sales today exceeds the discounted
value of future profit streams when sharing the sales.
Because the inequality holds only when $\delta < 1/2$, it is optimal for 2
to set $p_2(t) = p^m$ if $\delta \geq 1/2$. Since the firms share the
market at $t = 1$, and continue to do so whenever they have shared it
previously, induction implies that they share the market in all periods.
Hence the trigger strategies, which lead to
$(p_1(t), p_2(t)) = (p^m, p^m)$ in every period $t$, form a Nash
equilibrium of the infinitely repeated game.
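The threshold of one half for the discount factor can be verified with a small calculation. The sketch below compares the one-time gain from undercutting with the discounted value of sharing the monopoly profit forever; the function name and the numerical values for price, cost and quantity are assumptions made for illustration.

```python
def collusion_sustainable(delta, p_m, c, q_m):
    """True if undercutting the monopoly price does not pay under the grim trigger strategy."""
    deviate = (p_m - c) * q_m                          # whole market once, zero profit afterwards
    cooperate = (p_m - c) * (q_m / 2) / (1 - delta)    # half the monopoly profit in every period
    return cooperate >= deviate

for delta in (0.3, 0.5, 0.7):
    print(delta, collusion_sustainable(delta, p_m=6.0, c=2.0, q_m=4.0))
```

The monopoly margin and quantity cancel out of the comparison, which is why the threshold does not depend on demand or cost.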
Since $\delta < 1/2$ implies that the optimal coordination on
$(p_1(t), p_2(t)) = (p^m, p^m)$ cannot be sustained, it is intuitively
clear that no other kind of coordination can work either. I.e. either
$p_1(t) = c$ or $p_2(t) = c$ at all times $t$.
However, $\delta \geq 1/2$ admits many other cooperative equilibria besides
the profit-maximizing one. This is an instance of the Folk Theorem
from game theory, which says that equilibria leading to any payoffs that
exceed the best each player can achieve independently (the minmax, i.e.
the best attainable in the worst-case scenario) exist for some
sufficiently large $\delta$.
Exercise (MWG 12.D.4).
Here, the minmax profits are zero, since every player can always
guarantee zero by setting price to marginal cost, but no more than that.
Therefore, any combination of profits exceeding zero is possible (to see
this, just replace $p^m$ above with any other prices, possibly asymmetric,
and consider $\delta$ very close to 1, so that threats are arbitrarily
damaging in the long run). Which equilibrium occurs often depends on
history and focal points.
Collusion is often not maintained as smoothly as suggested here. In
practice, demands and costs fluctuate, and it may not be obvious to
other parties whether a firm is ceasing to cooperate, or simply reacting
to environmental changes. To prevent firms from taking advantage of
this ambiguity, it becomes necessary to punish deviations, even if it is
not clear why they happened, and this leads to intermittent price wars
between periods of high prices.
Exercise (MWG 12.D.5).
16.4 Product Differentiation
When products are differentiated, firm $i$'s demand is not perfectly
price-elastic. It may instead be a continuously decreasing function
$x_i(p_i, p_j)$ of firm $i$'s price, given the price $p_j$ of the
competitor.
If firms face constant marginal cost $c > 0$, they choose their prices to
maximize
$$\pi_i(p_i) = (p_i - c)\,x_i(p_i, p_j).$$
Typically, firms retain some positive demand when pricing above marginal
cost, from consumers who value their unique products. Hence
pricing above marginal cost will be the profit-maximizing strategy, even
if it means getting undercut.
There are two main approaches to product differentiation: in one, the
Hotelling model, a consumer demands only one of the products; in
the other, the Dixit-Stiglitz model, products are used together, and
there is an explicit preference for variety. Monopolistic competition is
the special case of the Dixit-Stiglitz model where variety is infinite
(there is a continuum of differentiated products with small market
shares, but some market power). I focus on the Hotelling model.
Consider duopolists that serve a continuum of consumers whose demands
arise in the following manner. The firms are associated with
points 0 and 1 in a spectrum $[0, 1]$ of possible products. Every
individual occupies a point $z \in [0, 1]$ that reflects her most
preferred product. In purchasing either 0 or 1, the individual incurs a
"travel cost" $zt$ or $(1 - z)\,t$, which indicates lost satisfaction from
consuming a non-ideal product.
Someone who prefers $z$ is therefore willing to pay at most
$$V_0(z) = v - zt$$
for product 0, and at most
$$V_1(z) = v - (1 - z)\,t$$
for product 1 ($v$ is the good's undiscounted value).
Individual demand is for either one or zero units, hence no one consumes
both products, and some might consume neither. If the firms charge
$p_0$ and $p_1$ for a unit of their products, then the person who prefers
$z$ will consume product 0 if $V_0(z) \geq p_0$, i.e.
$$z \leq \frac{v - p_0}{t},$$
and $V_0(z) - p_0 \geq V_1(z) - p_1$, i.e.
$$z \leq \frac{1}{2} + \frac{p_1 - p_0}{2t} = \hat{z}.$$
On the other hand, this person will consume product 1 if
$$z \geq 1 - \frac{v - p_1}{t}$$
and the second inequality is reversed.
The point $\hat{z}$, where $V_0(\hat{z}) - p_0 = V_1(\hat{z}) - p_1$,
belongs to the "marginal consumer," who is just indifferent between the
products.
Suppose $z$ is uniformly distributed on $[0, 1]$, so that every possible
product is preferred by the same mass of consumers. Then the total demand
for product 0 is the length of the interval $[0, z_0]$, where
$$z_0 = \min\left\{ \frac{v - p_0}{t},\ \hat{z} \right\},$$
and total demand for product 1 is the length of the interval $[z_1, 1]$,
i.e. $1 - z_1$, where
$$z_1 = \max\left\{ 1 - \frac{v - p_1}{t},\ \hat{z} \right\}.$$
(Assuming, of course, that prices are such that $z_0, z_1 \in (0, 1)$.
Whenever this is violated, one firm makes no sales, which cannot be
profit-maximizing unless the other firm prices below marginal cost.)
For simplicity, let $v$ be sufficiently large that the market is "covered"
in equilibrium: every consumer purchases one of the products. Then
the marginal consumer determines the firms' demand functions
$$x_0(p_0, p_1) = \hat{z}, \qquad x_1(p_0, p_1) = 1 - \hat{z}.$$
Profits
$$\pi_0(p_0, p_1) = (p_0 - c)\,\hat{z}, \qquad \pi_1(p_0, p_1) = (p_1 - c)\,(1 - \hat{z})$$
are maximized when the first-order conditions with respect to $p_0$ and
$p_1$,
$$p_0^*(p_1) = \frac{p_1 + c + t}{2}, \qquad p_1^*(p_0) = \frac{p_0 + c + t}{2},$$
are met.
These are the best responses to the other firm's price. A joint solution,
where prices are mutual best responses, is a Nash equilibrium:
$$p_0^*(p_1^*) = c + t = p_1^*(p_0^*).$$
I.e. product differentiation allows the firms to price above marginal
cost in Bertrand equilibrium, and increasingly so the more consumers
discount for differences from their most preferred products. In the
limiting case, where consumers do not care about such differences and
thus $t = 0$, prices equal marginal cost, and profits are zero.
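The Hotelling equilibrium price of marginal cost plus the travel-cost parameter can be recovered by iterating the two best responses. This is a sketch under the covered-market assumption made in the text; the function names, starting prices and parameter values are illustrative assumptions.

```python
def hotelling_best_response(p_other, c, t):
    """Best-response price in the covered-market Hotelling duopoly."""
    return (p_other + c + t) / 2

def hotelling_equilibrium(c, t, p0=0.0, p1=0.0, iterations=200):
    """Iterate both best responses simultaneously, starting from (p0, p1)."""
    for _ in range(iterations):
        p0, p1 = hotelling_best_response(p1, c, t), hotelling_best_response(p0, c, t)
    return p0, p1

p0, p1 = hotelling_equilibrium(c=2.0, t=1.5)
print(p0, p1)
```

Each round halves the distance to the fixed point, so the iteration settles on a price of 3.5 for both firms (marginal cost 2.0 plus travel cost 1.5) regardless of the starting prices.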
This simple case can be extended to multiple firms, arbitrary consumer
taste distributions, individual demands other than zero-one, asymmetric
costs, as well as endogenous firm locations (strategic product
positioning). Above marginal cost pricing is not merely owed to the fact
that few firms compete in the market. In fact, entry may not lower
prices because it tends to attract consumers on the margin (who have
the lowest willingness to pay for the product). Once the incumbent
firm loses these price-sensitive consumers, it may actually increase its
price further.
Exercise (MWG 12.C.17). Linear city with different costs
Exercise (MWG 12.C.16). Circular city with quadratic cost
17 Capacity Constraints
17.1 Capacity-Constrained Pricing
An assumption that was implicit in Bertrand oligopoly, and turns out
to be crucial for its marginal cost pricing equilibrium, is that firms
can serve arbitrarily large demands (i.e. capacity is free or can be
increased instantly). Then, no firm can afford to be undercut, since all
sales would be lost to competitors. These circumstances lead to very
aggressive pricing.
With limited production capacities, the lower-priced of two identical
products is sold to early buyers, and the higher-priced alternative may
still be demanded by latecomers. This tends to soften pricing, because
a firm can make a profit despite being undercut. Capacity constraints
lead to the Cournot model of oligopoly, which can be viewed as quantity,
rather than price, competition in the sense that the firms' fundamental
choices are their production capacities.
For simplicity, consider again a duopoly in which firms $i = 1, 2$
simultaneously choose capacities $\bar{q}_1$ and $\bar{q}_2$ (at a positive
per-unit cost), and subsequently engage in price competition. At the
latter point, each firm is able to sell up to its capacity at a constant
marginal cost $c \geq 0$. Both capacities are known to both firms at the
time when prices are set.
We assume that the products are identical, and demand for the total
output is a continuous, strictly decreasing and concave function $x(p)$.
Denote the inverse demand by $p(\cdot)$. Which of the two prices
effectively constrains demand depends on the capacities. If $p_i < p_j$
and $\bar{q}_i \geq x(p_i)$, then firm $i$ serves everyone, so $p_i$
determines how much is demanded. If $\bar{q}_i < x(p_i)$, then firm $j$
serves those who cannot buy from firm $i$, which sells at capacity, so
$p_j$ determines how much is demanded.
We must be specific about who buys from $i$ at the lower price when
demand exceeds $i$'s capacity. This is determined through a rationing
rule. We will impose that consumers with the highest willingness to
pay are at the head of the queue. Then demand is, for $i = 1, 2$,
$$x_i(p_1, p_2) = \begin{cases} \min\{\bar{q}_i,\ x(p_i)\} & \text{if } p_i < p_j \\ \min\bigl\{\bar{q}_i,\ \max\{x(p_i) - \bar{q}_j,\ \tfrac{1}{2}\,x(p_i)\}\bigr\} & \text{if } p_i = p_j \\ \min\bigl\{\bar{q}_i,\ \max\{x(p_i) - \bar{q}_j,\ 0\}\bigr\} & \text{if } p_i > p_j. \end{cases}$$
This rationing rule where "highest valuations served first" is known in
industrial organization as the "efficient rationing rule." (An alternative
is, for example, the proportional rationing rule, where anyone is equally
likely to be at the head of the queue, so that firm $j$'s customers have
the same expected willingness to pay as firm $i$'s. This would increase
$j$'s demand, and decrease $i$'s demand, at any given prices.)
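The three-case demand under efficient rationing translates directly into a small function. The sketch below is illustrative only: firms are indexed 0 and 1, and the linear demand function, capacities and prices are hypothetical values chosen so that both capacities together just clear the market at a price of 4.

```python
def rationed_demand(i, prices, capacities, demand):
    """Sales of firm i (index 0 or 1) under the efficient rationing rule."""
    j = 1 - i
    p_i, p_j = prices[i], prices[j]
    cap_i, cap_j = capacities[i], capacities[j]
    x = demand(p_i)                       # total demand at firm i's own price
    if p_i < p_j:
        return min(cap_i, x)              # cheaper firm is served first
    if p_i == p_j:
        return min(cap_i, max(x - cap_j, x / 2))
    return min(cap_i, max(x - cap_j, 0.0))   # residual demand after j sells its capacity

demand = lambda p: max(10.0 - p, 0.0)     # hypothetical demand, for illustration only
capacities = (3.0, 3.0)

print(rationed_demand(0, (4.0, 4.0), capacities, demand))   # both at the market-clearing price
print(rationed_demand(0, (3.0, 4.0), capacities, demand))   # undercutting cannot raise sales
print(rationed_demand(1, (3.0, 4.0), capacities, demand))   # the undercut rival still sells
```

In this example a firm that undercuts the market-clearing price still sells only its capacity, which is the reason pricing is softer than in pure Bertrand competition.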
We begin by analyzing the pricing game for some given capacities
$\bar{q}_1 > 0$ and $\bar{q}_2 > 0$. (If one firm does not invest in
capacity, we simply have a monopoly.) Denoting by $b(\bar{q}_j)$ the
optimal quantity firm $i$ would sell if it were not capacity-constrained
(and firm $j$ sold $\bar{q}_j$), let $\bar{q}_1 \leq b(\bar{q}_2)$ and
$\bar{q}_2 \leq b(\bar{q}_1)$. (Hence, the capacity constraints bind.)
Since capacity is costly to build, both firms must sell a nonzero quantity
in equilibrium: else they could avoid losses by not investing in capacity.
This means prices must be equal. Otherwise, if $p_i^* < p_j^*$, then firm
$i$ would sell at capacity (given that consumers first go to the
lower-priced seller, and firm $j$ still manages to sell something). But
then $i$ could slightly increase its price and still sell at capacity,
giving it a strictly higher profit.
Similarly, it is not possible that
$p_1^* = p_2^* < p(\bar{q}_1 + \bar{q}_2)$, in which case market demand
would exceed both capacities, so that each firm could profitably increase
its price. On the other hand, if
$p_1^* = p_2^* > p(\bar{q}_1 + \bar{q}_2)$, then market demand does not
cover both capacities. At least one firm has spare capacity and will want
to slightly lower its price (i.e. undercut like a pure Bertrand
competitor) as long as $p_1^* = p_2^* > c$ (otherwise, if
$p_1^* = p_2^* \leq c$, it is not optimal to invest in capacity).
The only alternative is $p_1^* = p_2^* = p(\bar{q}_1 + \bar{q}_2)$. It
remains to be shown that this is a Nash equilibrium given $\bar{q}_1$ and
$\bar{q}_2$. Suppose therefore that $p_j = p(\bar{q}_1 + \bar{q}_2)$.
Neither of $i$'s deviations from $p_i = p(\bar{q}_1 + \bar{q}_2)$ is
profitable. Namely, $p_i < p(\bar{q}_1 + \bar{q}_2)$ simply lowers the
price, but cannot increase sales, since firm $i$ is already at capacity.
And $p_i > p(\bar{q}_1 + \bar{q}_2)$ is undesirable by the assumption
$b(\bar{q}_j) \geq \bar{q}_i$, which implies that firm $i$'s profit
increases in sales when $j$ is at, and $i$ is below, capacity. Hence $i$
should lower its price while $p_i > p(\bar{q}_1 + \bar{q}_2)$.
We conclude that $p_1^* = p_2^* = p(\bar{q}_1 + \bar{q}_2)$ is the unique
Nash equilibrium of the pricing game with capacities
$\bar{q}_i \in (0, b(\bar{q}_j)]$. (Incidentally, there is no equilibrium
if $\bar{q}_i > b(\bar{q}_j)$ and $p(\bar{q}_1 + \bar{q}_2) > c$, since $i$
then sells below capacity, but at symmetric prices it would always be
profitable to slightly undercut.)
This insight motivates the Cournot model of oligopoly, which takes as
given that each firm will produce up to capacity and set the
market-clearing price $p(\bar{q}_1 + \bar{q}_2)$. The Cournot model
focuses on the capacity- or quantity-setting game preceding price
formation.
17.2 Cournot Quantity Competition
Let the inverse demand $p(\cdot)$ be a decreasing, differentiable function
of $q_i + q_j$ with $p(0) > c$. Duopolist $i = 1, 2$ solves
$$\max_{q_i \geq 0}\ \bigl(p(q_i + q_j) - c\bigr)\,q_i,$$
which has first-order condition
$$p'(q_i^* + q_j)\,q_i^* + p(q_i^* + q_j) - c \leq 0$$
(equality if $q_i^* > 0$).
Suppose $q_i^* = 0$. Then the first-order condition requires
$p(q_i^* + q_j^*) = p(q_j^*) \leq c$, and $q_j^* = 0$ contradicts
$p(0) > c$. On the other hand, $q_j^* > 0$ would imply that $j$'s
first-order condition holds with equality,
$p'(q_j^*)\,q_j^* + p(q_j^*) = c$, i.e. $p(q_j^*) > c$ (since
$p'(\cdot) < 0$ everywhere). This contradicts $p(q_j^*) \leq c$. Under the
assumptions, we must have $q_i^* > 0$ for $i = 1, 2$, so that the
first-order conditions apply with equality.
Then they are functions
$$q_i^* = -\frac{p(q_i^* + q_j) - c}{p'(q_i^* + q_j)}$$
that implicitly determine the best quantity choice for firm $i$, given
firm $j$'s quantity choice.
A Nash equilibrium $(q_i^*, q_j^*)$ jointly solves these best-response
functions, thus:
$$p(q_i^* + q_j^*) = c - p'(q_i^* + q_j^*)\,\frac{q_i^* + q_j^*}{2} = c - p'(q_i^* + q_j^*)\,q_i^*.$$
Proposition. Given identical, constant marginal costs $c > 0$, and inverse
demand for the industry's combined output such that $p(0) > c$ and
$p'(q_1 + q_2) < 0$ whenever $q_1 + q_2 \geq 0$, the Cournot equilibrium
price satisfies
$$c < p(q_1^* + q_2^*) < p(q^m),$$
where $q^m$ is the optimal monopoly output.
Proof. From the Nash equilibrium condition above, it is immediate that
$p(q_1^* + q_2^*) > c$. So we show $p(q_1^* + q_2^*) < p(q^m)$. Since
$p'(\cdot) < 0$, this will be the case if $q_1^* + q_2^* > q^m$. If
$q_1^* + q_2^* < q^m$, then firm $i$ can increase its production to
$q_i = q^m - q_j^*$, in which case the industry supplies the monopoly
quantity, at the monopoly price and monopoly profit. Because the increase
in $q_i$ lowers the price, while $q_j^*$ remains fixed, this must strictly
reduce $j$'s profit. But combined profit cannot decrease, since the
monopoly profit is an upper bound on industry profit. Hence $i$'s profit
strictly increases. This violates Nash equilibrium, hence we must have
$q_1^* + q_2^* \geq q^m$.
If $q_1^* + q_2^* = q^m$, then the industry acts like a monopolist, and the
first-order condition for Cournot equilibrium should be identical to the
first-order condition for a monopoly equilibrium at
$q_1^* + q_2^* = q^m$, namely $p(q) = c - p'(q)\,q$, where
$q = q_1 + q_2$. You can check that it is not so. Thus
$q_1^* + q_2^* > q^m$. $\square$
Note that $q_1^* = b(q_2^*)$ and $q_2^* = b(q_1^*)$ in Cournot equilibrium:
the quantity choices can be interpreted as binding capacity choices that
lead to $p(q_1^* + q_2^*)$ as a price equilibrium in a Bertrand game with
capacity constraints.
Unlike pure Bertrand oligopoly, Cournot oligopoly does not lead to
competitive (marginal cost) pricing. The reason is that, given capacity
constraints, neither firm can capture the entire market by undercutting.
Since each firm's demand is less than perfectly elastic, it can make a
positive profit by pricing above marginal cost.
Yet the firms are unable to maximize industry profit, i.e. set the
monopoly price. In determining its capacity, each firm only takes into
account how an increase in sales, and therefore a lower market-clearing
price, affects its own revenue. It ignores the effect on the rival's
revenue. Hence the firms do not respond to the full benefit (to the
industry) of keeping sales low and price high as a monopolist would; they
sell "too much."
This is easy to see from the Nash equilibrium condition: Cournot
duopolists mark up on marginal cost by $-p'(q_i^* + q_j^*)\,q_i^*$, i.e.
the effect of the price reduction on own revenue. By contrast, a
monopolist marks up by $-p'(q)\,q$, with $q = q_i + q_j$, i.e. the effect
on industry revenue.
Example. If inverse demand is linear, $p(q_1 + q_2) = a - b\,(q_1 + q_2)$
(where $a > c$ and $b > 0$), then profits are
$(a - bq_i - bq_j - c)\,q_i$, resulting in first-order conditions
(best-response functions)
$$q_1^*(q_2) = \frac{a - bq_2 - c}{2b} \quad \text{and} \quad q_2^*(q_1) = \frac{a - bq_1 - c}{2b}.$$
(In this context, best-response functions are often referred to as
reaction functions.) Solving jointly, we obtain the Nash equilibrium
$$q_1^*(q_2^*) = \frac{a - c}{3b} = q_2^*(q_1^*),$$
with market-clearing price
$$p(q_1^* + q_2^*) = \frac{1}{3}\,a + \frac{2}{3}\,c.$$
Since $a > c$, this means $p(q_1^* + q_2^*) > c$. The monopoly quantity is
firm $i$'s best response to $q_j = 0$. Hence
$q^m = q_i^*(0) = (a - c)/(2b)$ and
$p(q^m) = a/2 + c/2 > p(q_1^* + q_2^*)$.
Exercise (MWG 12.C.9).
Exercise (MWG 12.C.20).
Exercise (MWG 12.D.3).
17.3 Competitive Limit
In the generalization to $J$ firms producing nonzero quantities with
identical marginal costs $c$, first-order conditions imply that the
outputs are equal, and adding up over $j = 1, \ldots, J$, i.e.
$$Jq^* = -J\,\frac{p(Jq^*) - c}{p'(Jq^*)},$$
can be arranged for the equilibrium price
$$p(Jq^*) = c - p'(Jq^*)\,q^*.$$
Intuitively, as $J \to \infty$, each firm's production level becomes very
small, so that $p(Jq^*) \to c$. (I.e. competitive pricing is approached in
the limit.) If the number of firms in the industry is determined by free
entry in response to profit opportunities, the competitive limit is
obtained as demand increases.
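To see the competitive limit at work, the sketch below evaluates the symmetric Cournot price for a growing number of firms. It assumes linear inverse demand with intercept `a` and slope `b`, which is a special case chosen here for illustration (the argument in the text holds for general decreasing demand); the parameter values are arbitrary.

```python
def cournot_price(J, a, b, c):
    """Equilibrium price with J symmetric Cournot firms and inverse demand a - b*Q."""
    q = (a - c) / (b * (J + 1))   # per-firm output from the symmetric first-order condition
    return a - b * J * q

a, b, c = 10.0, 1.0, 4.0          # arbitrary values with a > c
for J in (1, 2, 10, 100, 1000):
    print(J, cournot_price(J, a, b, c))
```

The printed prices fall monotonically from the monopoly price toward marginal cost as the number of firms grows.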
Exercise (MWG 12.C.7).
Consider parameterized demand functions $x_\alpha(p) = \alpha\,x(p)$, where
$x(p)$ is some aggregate demand function, and $\alpha > 0$ is the "market
size." Let $p_\alpha$ be the equilibrium price, and $Q_\alpha$ the
equilibrium total output (after all profitable entry has occurred), when
the market size is $\alpha$.
Assume that each firm has the same minimum average cost $\bar{c}$ at some
level of output $\bar{q} > 0$ (we do not impose constant marginal cost
here). Then $Q_\alpha + \bar{q} \geq \alpha\,x(\bar{c})$ because otherwise,
if demand exceeded supply (plus an additional quantity $\bar{q}$) at
$p = \bar{c}$, the market-clearing price after a further entry (at the
minimum average cost level $\bar{q}$) would be greater than $\bar{c}$, and
entry would be profitable.
This imposes a lower bound on the equilibrium supply $Q_\alpha$: it cannot
fall short by more than $\bar{q}$ of the quantity that would be supplied
if $p = \bar{c}$. It follows that the equilibrium price $p_\alpha$ is
bounded above: it cannot exceed $\bar{c}$ by so much that the post-entry
price, after another firm enters and supplies $\bar{q}$, is still greater
than $\bar{c}$.
Let $p_\alpha(\cdot)$ be the inverse demand function at market size
$\alpha$, and define the difference between this upper bound on
$p_\alpha$ and $\bar{c}$ (which is a lower bound such that no firm exits)
as
$$\Delta p_\alpha = p_\alpha\bigl(\alpha\,x(\bar{c}) - \bar{q}\bigr) - p_\alpha\bigl(\alpha\,x(\bar{c})\bigr).$$
In order to see what happens to the equilibrium price interval as market
size increases from $\alpha = 1$ toward infinity, rewrite
$\Delta p_\alpha$ in terms of $p(\cdot)$, the inverse demand function at
$\alpha = 1$. Since $q_\alpha = x_\alpha = \alpha x = \alpha q$, we have
$p_\alpha(q_\alpha) = p(q_\alpha/\alpha)$, i.e.
$$\Delta p_\alpha = p\left(\frac{\alpha\,x(\bar{c}) - \bar{q}}{\alpha}\right) - p\bigl(x(\bar{c})\bigr),$$
which goes to zero as $\alpha \to \infty$.
This means the equilibrium price converges to
$p_\alpha(x_\alpha(\bar{c})) = \bar{c}$, the long-run competitive price, as
market size gets large. The industry output is then also competitive (and
firms are small relative to the market, since they produce at the fixed
minimum average cost level $\bar{q}$). With constant marginal cost $c$,
$\bar{c} = c$.
Exercise (MWG 12.F.3).
Exercise (MWG 12.F.2).
Exercise (MWG 12.F.4).
18 Precommitment and Entry
18.1 Precommitment
In some industries, firms act sequentially, rather than simultaneously
as assumed so far. In fact, it may superficially appear that sequential
moves are always more realistic, but this view misses the point of the
distinction between the game forms. It is a matter of information,
rather than timing.
In a sequential game, strategy sets differ in that some players are able
to condition their actions on what other players are observed to do.
This is not necessarily a beneficial power to have: first movers can
potentially exploit it by manipulating the incentives for late movers.
The effect of an observed action by firm $i$ on the behavior of another
firm $j$ is captured by $j$'s best response function. If $j$
best-responds to a change in $i$'s strategy with a change in the same
direction, i.e. $db_j(s_i)/ds_i > 0$, then $s_i$ and $s_j = b_j(s_i)$ are
strategic complements. If $j$'s best response is in the opposite
direction, i.e. $db_j(s_i)/ds_i < 0$, then we have strategic substitutes.
For instance, in pure Bertrand competition, a price cut by one firm
(above marginal cost) is answered by a price cut by the other firm, so
these are strategic complements. In a Cournot setting, a sales increase
by one firm often creates incentives for the other firm to reduce its
sales, so these are strategic substitutes.
Example. Suppose costs depend on an investment $k$ that firm $i$ can make
(for example, in process innovation), i.e. $c'(k) < 0$. Let the investment
stage be followed by Cournot competition with linear inverse demand
$p(q_i + q_j) = a - b\,(q_i + q_j)$ (where $a > c \geq 0$ and $b > 0$) and
constant marginal costs (namely, $c(k)$ for $i$, and $c$ for $j$). With
firm-specific marginal costs, the best-response functions are
$$q_i^*(q_j, k) = \frac{a - c(k)}{2b} - \frac{1}{2}\,q_j, \qquad q_j^*(q_i(k)) = \frac{a - c}{2b} - \frac{1}{2}\,q_i(k),$$
leading to equilibrium
$$q_i^*(q_j^*, k) = \frac{a - 2c(k) + c}{3b}, \qquad q_j^*(q_i^*(k)) = \frac{a - 2c + c(k)}{3b}.$$
The best-response level of output for $i$ is increasing in the cost
reduction: for any $q_j$, $q_i'(k) > 0$. Differentiating $j$'s best
response with respect to $k$ gives
$$\frac{\partial q_j^*(q_i(k))}{\partial k} = -\frac{1}{2}\,q_i'(k) < 0,$$
so output levels are strategic substitutes, and $i$'s investment in cost
reduction leads to less aggressive behavior from $j$, which reinforces the
inherent advantage of lower cost.
Exercise (MWG 12.G.1).
A Cournot duopolist may take advantage of the strategic substitutes
relationship by precommitting to high sales, so that the competitor will
concede a large share of the market.
The sequential version of Cournot duopoly is called Stackelberg duopoly.
One firm (the "leader") sets its quantity first, and the choice is
observed by the other firm (the "follower"). The follower's optimal
reaction is anticipated by the leader. Given the leader's choice $q_1$,
the follower maximizes
$$\pi_2(q_1, q_2) = \bigl(p(q_1 + q_2) - c\bigr)\,q_2.$$
The resulting best-response function for the follower is
$$q_2^*(q_1) = -\frac{p(q_1 + q_2^*) - c}{p'(q_1 + q_2^*)}.$$
This is identical to a Cournot firm's best response function.
The leader, however, maximizes against the follower's best response
function (rather than a fixed value):
$$\pi_1(q_1, q_2^*(q_1)) = \bigl(p(q_1 + q_2^*(q_1)) - c\bigr)\,q_1.$$
This leads to the leader's best response
$$q_1^*(q_2^*) = -\frac{p(q_1^* + q_2^*(q_1^*)) - c}{p'(q_1^* + q_2^*(q_1^*))}\cdot\frac{1}{1 + q_2^{*\prime}(q_1^*)} = \frac{q_2^*(q_1^*)}{1 + q_2^{*\prime}(q_1^*)}.$$
If $q_1$ and $q_2$ are strategic substitutes, i.e.
$q_2^{*\prime}(q_1^*) < 0$, then $q_1^*(q_2^*) > q_2^*(q_1^*)$. If $q_1$ and
$q_2$ are strategic complements, i.e. $q_2^{*\prime}(q_1^*) > 0$, then
$q_1^*(q_2^*) < q_2^*(q_1^*)$.
Since $q_2^*(q_1)$ is the Cournot best response function, it is clear that
total output increases in the case of strategic substitutes, and total
output decreases in the case of strategic complements. If inverse demand
$p(\cdot)$ is decreasing in combined sales, then the industry price
decreases with strategic substitutes and increases with strategic
complements relative to Cournot competition.
Example. With linear inverse demand, $p(q_1 + q_2) = a - b\,(q_1 + q_2)$,
the follower's best response in the Stackelberg model is
$$q_2^*(q_1) = \frac{a - c}{2b} - \frac{q_1}{2}.$$
Since $1 + q_2^{*\prime}(q_1^*) = 1/2$,
$$q_1^*(q_2^*) = 2\,q_2^*(q_1^*) = \frac{a - c}{b} - q_1^* = \frac{a - c}{2b},$$
whereas
$$q_2^*(q_1^*) = \frac{a - c}{4b}.$$
This is the strategic substitutes case ($q_2^{*\prime}(q_1^*) = -1/2$); the
leader sells a larger quantity, knowing that the follower will practice
restraint in order to avoid too much price deterioration. Compared to the
simultaneous Cournot outcome
$q_1^*(q_2^*) = q_2^*(q_1^*) = (a - c)/(3b)$, the overall quantity sold
increases from $q_1^* + q_2^* = (2/3)\,(a - c)/b$ to
$q_1^* + q_2^* = (3/4)\,(a - c)/b$ under Stackelberg competition. As a
result, price decreases from $p(q_1^* + q_2^*) = a/3 + (2/3)\,c$ to
$p(q_1^* + q_2^*) = a/4 + (3/4)\,c$ (keeping in mind that $a > c$). You
can check that the leader's profit is larger, and the follower's profit
smaller, than in Cournot competition.
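The profit comparison left to the reader can be checked directly. The following sketch is illustrative rather than part of the notes: it uses `a` and `b` for the intercept and slope of inverse demand (names chosen here), picks arbitrary parameter values, finds the leader's quantity by searching a grid of candidate quantities against the follower's best response, and compares the resulting profits with the Cournot profit.

```python
from fractions import Fraction

a, b, c = Fraction(10), Fraction(1), Fraction(4)   # arbitrary values with a > c

def follower(q1):
    """Follower's best response to the leader's quantity q1."""
    return max((a - c) / (2 * b) - q1 / 2, Fraction(0))

def leader_profit(q1):
    """Leader's profit when the follower best-responds to q1."""
    price = a - b * (q1 + follower(q1))
    return (price - c) * q1

grid = [Fraction(k, 100) for k in range(0, 601)]   # candidate quantities 0.00 .. 6.00
q1 = max(grid, key=leader_profit)                  # leader's optimum on the grid
q2 = follower(q1)
price = a - b * (q1 + q2)
profit_leader, profit_follower = (price - c) * q1, (price - c) * q2

q_cournot = (a - c) / (3 * b)
profit_cournot = (a - b * 2 * q_cournot - c) * q_cournot

print(q1, q2, price, profit_leader, profit_follower, profit_cournot)
```

For these values the leader sells 3 units and the follower 3/2, the price is 11/2, and the profits are 9/2 for the leader and 9/4 for the follower, compared with 4 for each firm under simultaneous Cournot competition.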
In the Stackelberg model, both firms produce a nonzero quantity, as
long as price exceeds marginal cost. (There is always at least a small
strictly positive profit to be made.) When production requires a fixed
initial outlay (e.g. in product development or a plant), the leader might
deter the follower from incurring the start-up cost by committing to an
aggressive response (high sales at a low price). This intention can be
signaled credibly (in the sense that it becomes the best response) by
investing in high capacity and low marginal cost.
The leader might then be able to operate as a monopolist, possibly
under constraints to keep the price low enough to continually discourage
entry.
18.2 Entry Equilibrium
In the remainder of the lecture, we return to a symmetric environment
and consider how much entry will occur in equilibrium with a fixed
entry cost $K > 0$. There is an infinite number of potential entrants
who decide, at stage 1 of the game, whether to invest $K$ or not, and
the entrants compete at stage 2 as oligopolists.
Assume that, for any number $J$ of entrants, there is in stage 2 a unique,
symmetric Nash equilibrium, yielding profit $\pi_J$ for each entrant (this
excludes the entry cost $K$). The equilibrium number of entrants $J^*$
is such that no firm wants to either enter or exit, given the prevailing
profit $\pi_{J^*}$:
$$\pi_{J^*} \geq K \quad \text{and} \quad \pi_{J^*+1} < K.$$
(We assume firms enter if indifferent, i.e. a further firm would enter if
$\pi_{J+1} = K$.)
If $\pi_J$ is decreasing in $J$, and $\pi_J \to 0$ as $J \to \infty$, then
the equilibrium $J^*$ is unique.
Example. Suppose stage 2 is a pure Bertrand game, where firms have
constant marginal costs $c$, and inverse demand is linear,
$p(q) = a - bq$ with $a > c \geq 0$ and $b > 0$. We know that $\pi_J = 0$
when $J > 1$. If the monopoly profit $\pi^m$ exceeds entry cost $K$, a
single firm will enter (i.e. $J^* = 1$) and set the monopoly price. Since
$\pi^m(q) = (p(q) - c)\,q$ is maximized by $q^* = (a - c)/(2b)$, the
optimal profit is $\pi^m(q^*) = (a - c)^2/(4b)$. The entry criterion
$\pi^m(q^*) \geq K$ is therefore equivalent to $K \leq (a - c)^2/(4b)$.
Exercise (MWG 12.E.1).
In an alternative set-up, entry cost may be incurred only if the firm
makes non-zero sales. Thus, a firm can observe how many other firms
enter, and bear the fixed cost only if it can make a non-negative profit.
This approach restores something close to marginal cost pricing in the
Bertrand entry game, since firms enter if and only if there is positive
profit to be made by undercutting the prevailing price.
A monopolist could in this case not set a price above
$$p^* = \frac{K + c\,x(p^*)}{x(p^*)},$$
which approaches $c$ when industry demand $x(p)$ is large. Else, another
firm could enter without pre-paying the entry cost $K$. In other words,
the entrant would not have to worry that it may be undercut in the
ensuing Bertrand competition and be left with a loss of $K$. It can simply
respond to the incumbent's current price, since it can exit freely
if the incumbent's behavior changes. Industries where such "hit and
run" entry is possible are called contestable markets.
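To make the break-even price concrete, here is a small Python sketch under an added assumption: demand is linear, $x(p) = (a - p)/b$, as in the Bertrand example. The notes state the price bound only for general $x(p)$, so the closed form below and the name `contestable_price` are illustrative, not the author's. With linear demand, $(p - c)\,x(p) = K$ is a quadratic in $p$, and the relevant solution is the lower root.

```python
import math


def contestable_price(a: float, b: float, c: float, K: float) -> float:
    """Lowest price p solving (p - c) * x(p) = K with x(p) = (a - p) / b.

    Assumption (not from the notes): linear demand. The break-even condition
    becomes p^2 - (a + c) p + (a c + b K) = 0, whose discriminant is
    (a - c)^2 - 4 b K; a solution exists only if monopoly profit covers K.
    """
    disc = (a - c) ** 2 - 4 * b * K
    if disc < 0:
        raise ValueError("entry cost exceeds monopoly profit; no break-even price")
    return ((a + c) - math.sqrt(disc)) / 2


if __name__ == "__main__":
    # Illustrative (assumed) parameters.
    p = contestable_price(a=10, b=1, c=2, K=7)
    x = (10 - p) / 1
    print(p, (7 + 2 * x) / x)  # the price and (K + c x(p)) / x(p) coincide
    # A flatter demand curve (smaller b, hence larger demand) pushes the price toward c.
    print(contestable_price(a=10, b=0.01, c=2, K=7))
```

In the first call the break-even price is 3, and $(K + c\,x(p))/x(p)$ evaluated at that price returns the same value; in the second, demand is much larger and the price is within about one percent of marginal cost.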
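Example. Under the same cost and demand conditions, consider a Cournot
game in stage 2. Each entrant chooses $q_J$, taking the quantity $\bar{q}_J$ of
each of the other $J - 1$ firms as given, to maximize
$$\pi_J = \left(p\big((J-1)\bar{q}_J + q_J\big) - c\right) q_J = \left(a - (J-1)\,b\,\bar{q}_J - b\,q_J - c\right) q_J,$$
so that
$$q_J = \frac{a-c}{2b} - \frac{J-1}{2}\,\bar{q}_J = \frac{a-c}{b}\,\frac{1}{J+1},$$
since all firms symmetrically set $q_J = \bar{q}_J$. Thus
$$\pi_J = \frac{1}{b}\left(\frac{a-c}{J+1}\right)^2$$
is strictly decreasing in $J$, and $\pi_J \to 0$ as $J \to \infty$.
Since $\pi_{\tilde{J}} = K$ at
$$\tilde{J} = \frac{a-c}{\sqrt{bK}} - 1,$$
the equilibrium number of entrants $J^*$ is the greatest integer not
exceeding $\tilde{J}$ (so that $\pi_{J^*} \geq K > \pi_{J^*+1}$). As
$K \to 0$ or $b \to 0$, $J^* \to \infty$ and the industry price
$$p(J^* q_{J^*}) = a - J^* b\,q_{J^*} = a - \frac{J^*}{J^*+1}\,(a-c) = \frac{1}{J^*+1}\,a + \frac{J^*}{J^*+1}\,c$$
approaches marginal cost.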
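The Cournot entry example lends itself to a short numerical check. The Python sketch below is an illustration only: the function names and parameter values are hypothetical, and the formulas are the closed forms $\pi_J = \frac{1}{b}\left(\frac{a-c}{J+1}\right)^2$ and $\tilde{J} = (a-c)/\sqrt{bK} - 1$ from the example.

```python
import math


def cournot_profit(J: int, a: float, b: float, c: float) -> float:
    """Per-firm Cournot profit with J entrants, excluding the entry cost."""
    return ((a - c) / (J + 1)) ** 2 / b


def cournot_entrants(a: float, b: float, c: float, K: float) -> int:
    """Equilibrium number of entrants: largest J with profit(J) >= K."""
    if K <= 0 or b <= 0:
        raise ValueError("requires K > 0 and b > 0")
    J_tilde = (a - c) / math.sqrt(b * K) - 1
    J = max(math.floor(J_tilde), 0)
    # Guard against floating-point rounding near integer values of J_tilde.
    while cournot_profit(J + 1, a, b, c) >= K:
        J += 1
    while J > 0 and cournot_profit(J, a, b, c) < K:
        J -= 1
    return J


def cournot_price(J: int, a: float, c: float) -> float:
    """Industry price with J entrants: a / (J + 1) + J * c / (J + 1)."""
    return a / (J + 1) + J * c / (J + 1)


if __name__ == "__main__":
    # Illustrative (assumed) parameters: a = 10, b = 1, c = 2.
    for K in (4.0, 1.0, 0.25):
        J = cournot_entrants(10, 1, 2, K)
        print(K, J, cournot_price(J, 10, 2))
```

As the assumed entry cost falls from 4 to 1 to 0.25, the number of entrants rises and the price moves toward the marginal cost of 2, in line with the limit argument above.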
18.3 Socially Optimal Entry
Entry in an industry provides, on the one hand, valued goods to consumers
and, on the other, duplicates entry cost for firms. Taking as given that
firms choose their production levels to maximize profit at the post-entry
stage under the resulting market structure, there exists an efficient number of
entrants from a social perspective that optimally resolves this tradeoff.
This number $J^\circ$ maximizes consumption benefits (as measured by the
willingness to pay for each unit) less production cost (which includes
each firm's entry cost $K$):
$$W(J) = \int_0^{Jq_J} p(s)\,ds - J\left(c(q_J) + K\right),$$
where $c(\cdot)$ denotes each firm's production cost function.
Example. Returning to the pure Bertrand example, the socially optimal
number of entrants can be no more than two, since price equals marginal
cost if two firms compete (hence surplus is maximized, and it cannot be
socially desirable to incur further entry costs). Since a single firm enters in
equilibrium, the number is either socially optimal, or one fewer than socially
optimal.
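Example. In the Cournot example that was already described,
$$W(J) = J\left[(a-c)\,q_J - \frac{1}{2}\,J b\,q_J^2 - K\right] = \left[\frac{J}{J+1} - \frac{1}{2}\left(\frac{J}{J+1}\right)^2\right]\frac{(a-c)^2}{b} - JK.$$
The welfare-maximizing number of firms $J^\circ$ satisfies
$$W'(J^\circ) = \frac{1}{(J^\circ+1)^3}\,\frac{(a-c)^2}{b} - K = 0,$$
so
$$(J^\circ+1)^3 = \left(\frac{a-c}{\sqrt{bK}}\right)^2.$$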
Recall that $\tilde{J} = (a-c)/\sqrt{bK} - 1$ was the equilibrium number of entrants
(before rounding down to an integer).
It exceeds the socially optimal number, since
$J^\circ + 1 = (\tilde{J} + 1)^{2/3} < \tilde{J} + 1$ whenever $\tilde{J} > 0$.
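The comparison between equilibrium and socially optimal entry in the Cournot example can also be illustrated over integer numbers of firms. The Python sketch below is hypothetical (function names and parameter values are assumptions for illustration); it uses the profit expression $\frac{1}{b}\left(\frac{a-c}{J+1}\right)^2$ and the welfare expression $\left[\frac{J}{J+1} - \frac{1}{2}\left(\frac{J}{J+1}\right)^2\right]\frac{(a-c)^2}{b} - JK$ from the examples, and relies on welfare being single-peaked in $J$ (its derivative $\frac{(a-c)^2}{b(J+1)^3} - K$ is decreasing).

```python
def cournot_profit(J: int, a: float, b: float, c: float) -> float:
    """Per-firm Cournot profit with J entrants, excluding the entry cost."""
    return ((a - c) / (J + 1)) ** 2 / b


def cournot_welfare(J: int, a: float, b: float, c: float, K: float) -> float:
    """Total surplus net of production and entry costs with J entrants."""
    share = J / (J + 1)
    return (share - 0.5 * share ** 2) * (a - c) ** 2 / b - J * K


def equilibrium_entrants(a: float, b: float, c: float, K: float) -> int:
    """Largest J such that the J-th entrant still covers the entry cost."""
    if K <= 0:
        raise ValueError("requires K > 0")
    J = 0
    while cournot_profit(J + 1, a, b, c) >= K:
        J += 1
    return J


def socially_optimal_entrants(a: float, b: float, c: float, K: float) -> int:
    """Integer J maximizing welfare (welfare is single-peaked in J)."""
    if K <= 0:
        raise ValueError("requires K > 0")
    J = 0
    while cournot_welfare(J + 1, a, b, c, K) > cournot_welfare(J, a, b, c, K):
        J += 1
    return J


if __name__ == "__main__":
    # Illustrative (assumed) parameters: a = 10, b = 1, c = 2.
    for K in (4.0, 1.0):
        print(K, equilibrium_entrants(10, 1, 2, K),
              socially_optimal_entrants(10, 1, 2, K))
```

For these assumed values, equilibrium entry exceeds the welfare-maximizing number of firms in both cases, which is the overentry pattern described in the example.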
Exercise (MWG 12.E.3).
Entry may be inefficient with Bertrand as well as with Cournot competition:
in the first case, we have (possibly) too little, in the second
generally too much, entry. There are two kinds of failures.
No firm may find it profitable to enter an industry as a monopolist
because the maximal profit the firm can make with per-unit pricing
does not recover the cost of entry. However, from a social perspective
it may be desirable that the monopolist operate, since the consumer
surplus may offset losses in profit. Of course, the potential monopolist
ignores these benefits to consumers, unless it can extract them through
price discrimination or is offered a subsidy.
The same logic could apply to an additional potential entrant in an
industry with some number of incumbents: it may be socially valuable,
but not privately optimal, for the entry to occur. This reasoning
suggests that underentry by one firm (relative to $J^\circ$) is possible.
More typically, there is overentry because firms consider only whether
their profits can cover the cost of entry, but ignore the reduction in
incumbents' profits. Because the incumbents sell fewer units after an
additional entry, their contribution to surplus declines, while the combined
entry cost they incurred remains fixed. From a social point of
view, the net increase in total output (and consumption benefits) may
not justify the duplication of entry costs.
But for the firm, entry may still be profitable because its sales exceed
the increase in total output (the difference being output reductions
by incumbents) and could therefore cover the entry cost. In short,
overentry is possible due to the "business-stealing" effect.
These insights can be stated as a general result: under standard conditions,
equilibrium entry may well exceed the socially optimal level, but
may fall short by at most one firm.
Proposition. If marginal firm profits are nonnegative for any number of entrants
$J$ (i.e. $p(Jq_J) - c'(q_J) \geq 0$), and more entry increases industry output
($J > J' \implies Jq_J \geq J'q_{J'}$) but decreases firm output and profit
($J > J' \implies q_J \leq q_{J'}$), and if furthermore $p'(\cdot) < 0$ and
$c''(\cdot) \geq 0$, then the equilibrium number of entrants satisfies
$J^* \geq J^\circ - 1$, where $J^\circ$ is the socially optimal number
of entrants.
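Proof. Since the claim is obviously true when $J^\circ = 1$, let $J^\circ > 1$. By
definition of $J^\circ$ as the maximizer of welfare $W$, $W(J^\circ) \geq W(J^\circ - 1)$, hence:
$$\int_{(J^\circ-1)q_{J^\circ-1}}^{J^\circ q_{J^\circ}} p(s)\,ds - J^\circ\left(c(q_{J^\circ}) - c(q_{J^\circ-1})\right) - c(q_{J^\circ-1}) \geq K.$$
Under the assumptions, $J^\circ q_{J^\circ} \geq (J^\circ-1)q_{J^\circ-1}$, and price is a decreasing
function of industry supply, so $p((J^\circ-1)q_{J^\circ-1})$ is the maximum in the
price interval from $p(J^\circ q_{J^\circ})$ to $p((J^\circ-1)q_{J^\circ-1})$. Therefore,
$$\int_{(J^\circ-1)q_{J^\circ-1}}^{J^\circ q_{J^\circ}} p(s)\,ds \leq p((J^\circ-1)q_{J^\circ-1})\left(J^\circ q_{J^\circ} - (J^\circ-1)q_{J^\circ-1}\right).$$
(Think of the fact that the area under the curve $p(\cdot)$ between
$Q_0 = (J^\circ-1)q_{J^\circ-1}$ and $Q_1 = J^\circ q_{J^\circ}$ is contained in the
rectangle formed by $Q_1 - Q_0$ and the maximal
height of the function $p(\cdot)$.)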
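Then, by substituting this bound into the first inequality and rearranging,
$$J^\circ\left(p((J^\circ-1)q_{J^\circ-1})\,(q_{J^\circ-1} - q_{J^\circ}) + c(q_{J^\circ}) - c(q_{J^\circ-1})\right) \leq p((J^\circ-1)q_{J^\circ-1})\,q_{J^\circ-1} - c(q_{J^\circ-1}) - K.$$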
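Now, the convexity of the cost function ($c''(\cdot) \geq 0$) implies
$$c(q_{J^\circ}) - c(q_{J^\circ-1}) = \int_{q_{J^\circ-1}}^{q_{J^\circ}} c'(s)\,ds \geq c'(q_{J^\circ-1})\,(q_{J^\circ} - q_{J^\circ-1}).$$
Indeed, since $q_{J^\circ} \leq q_{J^\circ-1}$ and $c'$ is nondecreasing, $c'(q_{J^\circ-1})$ is the
largest value of marginal cost on the interval between $q_{J^\circ}$ and $q_{J^\circ-1}$;
because the integral runs from the larger to the smaller quantity, bounding
the integrand above by $c'(q_{J^\circ-1})$ bounds the integral below. Thus,
$$J^\circ\left(p((J^\circ-1)q_{J^\circ-1}) - c'(q_{J^\circ-1})\right)(q_{J^\circ-1} - q_{J^\circ}) \leq p((J^\circ-1)q_{J^\circ-1})\,q_{J^\circ-1} - c(q_{J^\circ-1}) - K.$$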
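Since $q_{J^\circ-1} \geq q_{J^\circ}$ and $p((J^\circ-1)q_{J^\circ-1}) - c'(q_{J^\circ-1}) \geq 0$ by assumption,
the left side is nonnegative, and hence so is the right side, i.e.
$$p((J^\circ-1)q_{J^\circ-1})\,q_{J^\circ-1} - c(q_{J^\circ-1}) = \pi_{J^\circ-1} \geq K.$$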
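Then firms must enter at least until their number is $J^\circ - 1$, because $\pi_J$ is
decreasing in $J$:
$$\begin{aligned}
\pi_J - \pi_{J-1} &= p(Jq_J)\,q_J - p((J-1)q_{J-1})\,q_{J-1} - \left(c(q_J) - c(q_{J-1})\right) \\
&\leq p(Jq_J)\,q_J - p((J-1)q_{J-1})\,q_{J-1} - c'(q_{J-1})\,(q_J - q_{J-1}) \\
&= \left(p(Jq_J) - c'(q_{J-1})\right) q_J - \left(p((J-1)q_{J-1}) - c'(q_{J-1})\right) q_{J-1} \\
&\leq 0,
\end{aligned}$$
since $q_J \leq q_{J-1}$, $p((J-1)q_{J-1}) - c'(q_{J-1}) \geq 0$ by assumption, and
$p(Jq_J) \leq p((J-1)q_{J-1})$ from $Jq_J \geq (J-1)q_{J-1}$ and
$p'(\cdot) < 0$. Hence $\pi_J \geq \pi_{J^\circ-1} \geq K$ for every $J \leq J^\circ - 1$.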

Exercise (MWG 12.E.2).


Exercise (MWG 12.E.4).