
The Microeconomics of Risk and Information



The Microeconomics of
Risk and Information
Richard Watt
© Richard Watt 2011
All rights reserved. No reproduction, copy or transmission of this
publication may be made without written permission.
No portion of this publication may be reproduced, copied or transmitted
save with written permission or in accordance with the provisions of the
Copyright, Designs and Patents Act 1988, or under the terms of any licence
permitting limited copying issued by the Copyright Licensing Agency,
Saffron House, 6-10 Kirby Street, London EC1N 8TS.
Any person who does any unauthorized act in relation to this publication
may be liable to criminal prosecution and civil claims for damages.
The author has asserted his right to be identified as the author of this work
in accordance with the Copyright, Designs and Patents Act 1988.
First published in 2011 by
PALGRAVE MACMILLAN
Palgrave Macmillan in the UK is an imprint of Macmillan Publishers Limited,
registered in England, company number 785998, of Houndmills, Basingstoke,
Hampshire RG21 6XS.
Palgrave Macmillan in the US is a division of St Martin's Press LLC,
175 Fifth Avenue, New York, NY 10010.
Palgrave Macmillan is the global academic imprint of the above companies
and has companies and representatives throughout the world.
Palgrave and Macmillan are registered trademarks in the United States,
the United Kingdom, Europe and other countries.
ISBN 9780230280793 hardback
ISBN 9780230280809 paperback
This book is printed on paper suitable for recycling and made from fully
managed and sustained forest sources. Logging, pulping and manufacturing
processes are expected to conform to the environmental regulations of the
country of origin.
A catalogue record for this book is available from the British Library.
A catalog record for this book is available from the Library of Congress.
10 9 8 7 6 5 4 3 2 1
20 19 18 17 16 15 14 13 12 11
Printed and bound in Great Britain by
CPI Antony Rowe, Chippenham and Eastbourne
This book is dedicated, with my deepest gratitude, to my
wife Marta and our children, Daniel and Olivia, for making
life a pleasure, and hard work worthwhile.
Contents

List of figures

Preface

1 Introduction
1.1 Focus of the book
1.2 Basic objectives
1.3 Content and structure
1.4 Some advice

Part I Individual decision making under risk

2 Risk and preferences
2.1 Historical antecedents
2.2 Expected utility theory
2.3 Alternative decision criteria

3 Risk aversion
3.1 Marschak-Machina triangle
3.2 Contingent claims
3.3 Measures of risk aversion
3.4 Slope of risk aversion

4 Applications
4.1 Portfolio choice
4.2 The demand for insurance
4.3 Precautionary savings
4.4 Theory of production under risk

Part II Risk sharing environments

5 Perfect information
5.1 The contract curve
5.2 Constant proportional risk sharing
5.3 Increases in aggregate wealth

6 Adverse selection
6.1 Preliminary comments
6.2 Adverse selection without risk
6.3 Principal-agent setting

7 Moral hazard
7.1 Perfect competition
7.2 A monopolistic principal

Part III Appendices

A Mathematical toolkit
A.1 The implicit function theorem
A.2 Concavity and convexity
A.3 Kuhn-Tucker optimisation
A.4 Probability and lotteries

B A primer on consumer theory
B.1 The basic microeconomic problem
B.2 Utility maximisation under certainty

Index
List of Figures

2.1 A traditional utility function and a prospect theory utility function

3.1 A Marschak-Machina triangle
3.2 Concave utility function
3.3 Expected value and expected utility in the Marschak-Machina triangle under concave utility
3.4 Allais paradox in the Marschak-Machina triangle
3.5 Contingent claims space
3.6 Expected value and variance lines in the contingent claims graph
3.7 Expected utility indifference curves with risk averse preferences
3.8 Optimal choice between a risky and a risk-free asset
3.9 An acceptance set
3.10 Greater risk aversion
3.11 Graphical construction of the maximum level of risk premium
3.12 Effect of greater risk aversion and greater risk upon the risk premium

4.1 Optimal portfolio demand
4.2 A short position in firm 1
4.3 Zone of mutually beneficial insurance contracts
4.4 Perfectly competitive and monopoly insurer equilibria
4.5 Optimal savings under certainty compared to optimal savings with a risky second period income
4.6 Effect of the value of prudence on the savings decision under a risky interest rate
4.7 Feasible set for the risky production problem
4.8 Optimal production choice under risk aversion, and under risk neutrality
4.9 Newsboy expected utility assuming that pc > (1 − p)(q − c)
4.10 Newsboy expected utility assuming that pc = (1 − p)(q − c)
4.11 Newsboy expected utility assuming that pc < (1 − p)(q − c)

5.1 An Edgeworth box under risk
5.2 Two feasible types of contract curve
5.3 Contract curve with two decreasing relative risk averse players
5.4 Possible contract curves with two constant relative risk averse players

6.1 Separating equilibrium in the Spence signalling model
6.2 Type-1 and type-2 agent indifference curves
6.3 Expected profit lines when the principal contracts with a type-1 or a type-2 agent
6.4 A pooling contract with a competitive principal
6.5 Negative expected profits from points B and C
6.6 Separating equilibrium in the adverse selection problem with a competitive principal
6.7 Zone of rebel contracts
6.8 Optimal type-1 contract, for a given type-2 contract

7.1 The incentive compatibility constraint of the agent
7.2 Two expected profit lines of equal value
7.3 Optimal contract for low effort
7.4 Optimal contract for high effort
7.5 Special case of high and low effort equally preferred by the agent
7.6 Optimal contracts for high and low effort with a monopolistic principal
7.7 A case in which the equilibrium contract is high effort

A.1 A concave function
A.2 A convex indifference curve

B.1 Roy's identity

Preface

This is a book about microeconomic theory. More specifically, it is a book concerning the way the presence of risk affects optimal decision making. The study of decision making under risk is certainly not new, but due to certain mathematical complexities that it involves, it is often left out of undergraduate microeconomics programs on the grounds of being too difficult for students to manage. However, the increase in complexity that is introduced in models with risk is often as much due to increases in dimensionality as to the addition of risk to the modelling environment. The basic modelling that is required in order to solve problems in decision making under risk is no different to what is done under an assumption of certainty, so long as the dimensionality of the problem is not altered. All that needs to be done is to re-interpret the basic elements of the model: the variables, and the graphical curves and lines that are used to analyse the problem. With that in mind, this book offers a short course in choice under risk, packaged in exactly the same environment as a typical undergraduate course in choice under certainty, that is, an environment with two choice variables (at most). The principal idea is to show students how to handle scenarios with risk, and to point out some of the mathematical toolkit that is useful in that environment (indeed, in microeconomic theory generally), without actually leaving the comfort zone of a simple two-dimensional graphical setting.

As with most text books, the material presented here derives from a fairly long history of teaching the subject. Over the past 20 years or so, I have taught this material to students in their final year of undergraduate economics on two continents. The problems that arise are the same everywhere, and those problems generally involve a difficulty in visualising microeconomic problems in mathematical guise. In essence, what is at issue is not a lack of mathematical ability or knowledge, but rather a shortfall in the understanding of microeconomic processes. Once a student can see what a problem involves, and
how it should be tackled graphically, then it is a relatively easy step
to apply the correct mathematical techniques to it. The underlying
theme of the present book is to attempt to achieve this by sticking
rigorously with problems in only two dimensions, and showing as much
as possible both mathematical and graphical treatments side by side.
I am indebted to a great many individuals both for fostering my
own interest in the topic of the microeconomics of decision making
under risk, and for turning my rough-and-ready lecture notes into
what I hope is now a coherent and sensible treatment of the topic.
I was initially lured into problems in choice under risk by the late
Prof. Richard Manning in classes that were taught at the University
of Canterbury some 25 years ago. Since then, the main impetus to
my interest in the topic has come from the many vibrant discussions
that are so typical at the annual meetings of the European Group of
Risk and Insurance Economists (EGRIE), which I habitually attend.
I owe a huge debt of gratitude to Jasper Mackenzie who took on the
arduous task of preparing so professionally the graphs that appear in
the book. I also thank Nick Sanders who helped me with an earlier set
of graphs, which allowed deadlines to be reached. Aleta Bezuidenhout
and Jaime Marshall at Palgrave Macmillan have been a pleasure to
work with.
Chapter 1

Introduction

The standard theory of choice that is taught in all introductory and intermediate microeconomic theory courses posits a consumer who would like to make a choice of how much of each of two goods to consume, given the prices of those goods and a level of wealth that can be dedicated to the purchase. This is a typical constrained optimisation problem: the choice variables are the amounts of each good to consume, the objective is to maximise welfare, and the constraints are determined by, on the one hand, the fact that neither good can be consumed in negative amounts and, on the other, the restriction that the cost of the choice (quantities demanded multiplied by prices) cannot exceed the financial resources available.
In spite of its radical simplicity, as a theoretical construction, this
standard consumer choice model is able to provide logically persuasive
solutions to questions of some importance. For example, the model
predicts (always) that welfare is increasing in wealth and decreasing
in prices, and that if wealth is increased in compensation for a price
increase then the demand for each good is decreasing in its own price
and increasing in the price of the other good. The model also predicts
(usually, but not always) that the uncompensated demand for each
good is increasing in wealth and decreasing in its own price. Any
number of other results can also be obtained, related to such things
as changes in preferences, introduction of taxes of different types, and
even non-linear pricing.
However, there are aspects of the model that many students find to be overly simplified. Perhaps the simplification that is most often noted is the fact that the model is usually presented in only two dimensions. That is, the assumption is that there are only two goods
present in the choice problem. This, however, should not be a concern.
The restricted number of dimensions is in place only in order that the
visual apparatus of a graph can be used. There is no doubt that a
graphical exposition of the solution helps enormously to capture most
of the essential elements of the solution, and for that reason two-
dimensional analysis is often used. But the model itself is robust to
an extension to any number of goods, and indeed it is often solved in
its multi-dimensional version in more advanced courses.
The other most often cited simplification that is important for the
model to be a faithful representation of real-world decision making
is the fact that everything that the decision maker needs to know,
he does know. In particular, he is fully informed of the availability
of all goods, of the prices of all goods, and, of course, of his own
income and preferences. It is likely that none of these things are really
quite so certain. Prices and availability of goods differ over sellers, and it is often very difficult (or at least, very costly) to know exactly
where to go to get any particular item at any particular price. Even
personal attributes such as the disposable income and preferences
of the decision maker are known only approximately. One way to
deal with income uncertainty might be to set a budget for purchases
that is small enough to be guaranteed to be available, and then any
surplus income that results is simply retained as a random element of
savings. But then, we should ask what would be the optimal size of
the consumption budget that should be established? More generally,
we would do better to enquire about how the risks and uncertainties
that undoubtedly surround a decision-making environment can be
best catered for. This is the underlying theme of this book.
Risk is an ever-present element in decision making. It is often
related to time, because the nal consequences of the decisions that
we make often do not occur simultaneously with the decision. Between
the moment of the decision, and the moment of the consequence,
other random elements in the problem environment might be playing
out, affecting the consequences of our decisions. That is, a given
decision can, feasibly, lead to more than one outcome or consequence,
depending on the outcome of other relevant stochastic elements. What
we need, therefore, is a convincing theory of how to best take such
stochastic elements into account when the decisions are made.
One obvious way in which the existence of risk aects economic
transactions is the existence of markets and institutions in which
risk itself can be traded. Take, for example, the insurance industry,
which clearly offers a service designed to shift risk from insurance consumers to insurance companies, in exchange for a premium payment. However, many other examples exist, including (but certainly not restricted to) markets for financial products like shares in businesses, and, of course, options and futures on those shares, contracts between employers and employees that shift risks from the former to the latter, fixed rather than variable interest rate contracts that shift risk from borrowers to lenders, and so on. Achieving an understanding of how such markets and institutions work to the mutual benefit of all concerned, and how they affect decision making, is a fundamental
purpose of this book.

1.1 Focus of the book


This book contains a short course, designed to be completed in a single
semester of study, in the economic theory of risk and information.
These two intimately related topics are now standard inclusions in the
economic theory curriculum at universities all over the world. This,
of course, reflects the now generally recognised importance of risk
and information as integral aspects of almost any economic analysis.
Knowing how to handle risky, or stochastic, environments, and above
all, how to deal with scenarios in which the parties to a transaction
have different information sets, is of primordial importance in the
education of economists.
That said, the supply of specialist text books designed to cater
to the need to learn about risk and information has typically been
restricted to texts at the post-graduate level. This is a natural course
of events since the norm has been to teach economics undergraduates
the standard theory of consumer and producer choice, equilibrium
and markets, all under certainty, and then to move onto the extension
to stochastic environments only in post-graduate courses. At most,
undergraduates will have seen a single chapter in their general microe-
conomics text on choices under risk, and another for the economics of
asymmetric information.
While it is true that risk should be studied only after successfully
following a course in choice under certainty, it has increasingly been
the case that final year undergraduates are offered a one-semester elective in the economics of risk and information, but as yet there has been no specific text book that caters to such a course. The present text is an attempt to fill that gap.

1.2 Basic objectives


The book has several objectives. First and foremost, it offers students
a minimal content of topics in the economics of risk and information.
However, the book has also been designed to be able to be studied in a
single semester course, with perhaps between 24 and 36 lecture hours
only. Thus some selection of possible topics has taken place, and I
hope that the final choice of included topics is a fair reflection of what
the profession has deemed, by revealed preference, to be important.
The primary objective of the book is to provide understanding
rather than to simply inform. This is a very difficult task, as anyone
who has ever attempted to lecture a theory topic will attest. How-
ever, in order to fulfil the objective of comprehension, with only one
or two exceptions, the book sticks entirely with a two-dimensional
setting, one that should be intimately familiar to any student of
microeconomics who has completed at least an introductory 101-
type course. In particular, I have purposefully avoided the technique
of providing mathematical analysis at the n-dimensional level and only
illustrative examples in two dimensions. By having full correspondence
between the two-dimensional analysis and the two dimensional graphs
throughout, a student gets two looks at each and every critical point
that is brought up, and it is hoped that this eventually leads to greater
understanding rather than just learning.
The second fundamental objective of the book is comprehension
of the use of constrained optimisation techniques in microeconomic
theory generally. Again, by retaining a strict two dimensional analysis
throughout, it is hoped that students will ultimately see that what is
being taught in this book is really no different to what was taught in,
say, consumer theory under certainty. All that has happened is that
the axes of the graphical environment have been re-labeled to measure different (albeit very similar) variables, the budget constraint has been re-interpreted, and the indifference curves correspond to a
particular case of utility. Noting the huge similarity between how risk
is handled and how certainty is analysed is an important step in fully
understanding the robustness of the standard consumer theory setting
to the analysis of different problems. What is more, it is hoped that
from the present text, if it had not already been noted, the student
can clearly relate a graph of indierence curves and a choice set to
a problem of constrained optimisation, and the characteristics of the
solution point (a tangency on the frontier of the choice set) to the first and second order conditions of that optimisation problem.


Third, the book works simultaneously with mathematical treat-
ments of problems and their graphical representations. Once again,
this is achievable only by retaining a two dimensional setting through-
out. All too often it is the case that economics students are asked
to provide mathematical analyses of problems without really fully
understanding what that mathematics is doing. Students are taught
that utility functions should be concave, however the relationship
between concavity of utility and the shape of the indierence curves
is often not understood. Indeed, second-order conditions of maximi-
sation problems generally are poorly covered and often not always
grasped by students. It is hoped that the technique used in this book
will help to ease the burden of moving into a fully mathematical
analysis of economic problems, by providing such an analysis side-
by-side with a full graphical intuition for a series of models.
Given the objective of providing a gentle mathematical treat-
ment of microeconomic problems, a certain degree of mathematical
sophistication is assumed throughout. However, at no point will any
mathematical technique be used that is not a standard inclusion in
high school mathematics the world over. Indeed, in terms of mathe-
matics, all that is needed is the ability to differentiate (first and second derivatives), and to manipulate equations algebraically.
In any case, in Appendix A, a set of important mathematical tools
is carefully explained. It is recommended that readers who are not
confident at applying mathematics to problem solving should begin
by carefully reading the mathematical appendix (Appendix A), and
that this appendix be consulted whenever an unfamiliar mathematical
technique appears. Above all, this is a thinking course, and as such
you can get a very long way if you search for economic intuition and
logic in the results that are derived mathematically.

1.3 Content and structure


The book is divided into two main parts and several chapters. Part
I deals with individual decision making (i.e., scenarios in which there
is a single decision maker active in the model) and Part II deals
with scenarios with two decision makers. The chapters themselves
are ordered such that a coherent story is able to be told. The story
in question is about a decision maker who is exposed to scenarios of
ever-increasing complexity in which choices must be made. To start
with, the decision maker lives in a world in which the only person
of relevance is himself, just like Robinson Crusoe living alone on his
tropical island. His choices and decisions are made in an environment
in which other important things may change (the weather, the tides, the appearance of ships on the horizon), but those other changes are not controlled directly by any other decision maker. They are, as it
were, acts of nature. The objective of these decisions is to provide the
decision maker with the greatest possible welfare (or utility), given
the fact that some other important values have yet to be fixed. Part
I of this book deals with this type of single-person decision problem.
Then, another decision maker turns up. Just like Friday, whom
Robinson Crusoe meets on the island. Now, with two decision makers
on the island, a small economy emerges in which meaningful transac-
tions can take place between the two. In Part II of this book we look
at how these small economies may work as far as risk sharing goes. In particular, we are interested in how the two individuals can join together in an effort to confront the risks that they face, the risks posed by the whims of nature. The main thing at this intermediate stage of existence is that Robinson and Friday are both fully
informed about the exact nature of the risks that they face. They
both know what outcomes would result under each and every feasible
state of nature, and (importantly) the likelihood of each and every
state of nature.
The climax of the story is when, perhaps after some time on
the island, Friday begins to understand that there is a fundamental
dierence between himself and Robinson. Their information sets are
dierent, and this will have a profound eect upon the way they work
together. Perhaps we can think that Robinson (as the master) is busy
writing the story of his adventures, and so he sends Friday (as the
servant) o to labour each day in the jungles and oceans to get food
for the two of them. Assume, for example, that Robinson really likes
to eat sh, and that Friday is happy to eat only fruit. Fruit is easily
available and in plentiful supply all over the island, and so there is no
problem about gathering all the fruit that the two may ever require.
But shing is dierent. It is inherently risky, and the outcome of how
many sh are caught depends upon many random factors. Maybe it
turns out that the best place to sh is a cove that is very far away,
and Friday would rather not walk so far, and instead he prefers to sh
at a closer location in spite of it not being such a plentiful supplier
of sh. Robinson, who does not want to have to accompany Friday
fishing each day to see where he goes, must think up an agreement with Friday that convinces Friday that indeed he should go to the far away fishing spot, in spite of the personal costs to him of getting there. After all, when Friday comes back in the evening with only one fish rather than many after having fished at the closer spot, he could just tell Robinson that he was at the far away cove but that it was just a poor day for fishing. Working out exactly how Robinson should go about convincing Friday to fish where Robinson would like him to rather than where Friday would prefer is the final part of our book. It is the point to which all of the earlier work leads, as it will happen that Robinson cannot solve the informational problem with Friday without appealing to his previous experiences, first alone and then with Friday but under common information sets.
With that in mind, in the next chapter, a detailed investigation
into the very concept of risk is provided, together with what we
know about how the existence of risk should be incorporated into
preferences, and ultimately into decision making. Chapter 3 sets out
a more detailed analysis of the concept of risk aversion, which shapes all of the rest of the topics that are analysed in the book. Once risk aversion is clearly analysed, Chapter 4 moves on to look at a series of applications of decision making under risk that have been the subject of economists' attention. Specifically, in Chapter 4 the reader will find the applications of portfolio demand, insurance
demand, precautionary savings, and producer theory.
These first four chapters of the book, grouped together in Part
I, deal with situations in which there is only one active party in the
model, making optimal decisions in a risky environment.1 The last
three chapters (grouped as Part II) bring a second player into the
model. It is at this point that the assumption of risk rather than
uncertainty becomes more important. The risk assumption implies
that both of the parties to the transactions that we analyse agree
upon the probabilities of the different outcomes. This assumption is
made only because the models then become more user-friendly, in
the sense that otherwise they would become excessively cluttered. If
the two parties had different probability beliefs in a two-dimensional

1
The possible exception is the case of insurance demand, where one could argue
that not only the insurance consumer, but also the insurer is present. However,
when we analyse the insurance demand model, our primary attention is placed
upon the decision of the consumer, and the insurer is really present only as a
parameter set in the demander's decision problem.

stochastic problem, then we have to bring both of their probabilities into the analysis rather than just one. And this would be assuming
that each party is fully informed of the probability belief of the other.
Though asymmetric beliefs can be analysed, doing so does not add
anything of any importance to the understanding of how risk sharing
would work, and yet the mathematical complexity would be greatly
increased.
In Chapter 5, the scene is set by taking a look at risk sharing
under symmetric information in the Edgeworth box diagram, which
should be familiar to all students who have undergone an intermediate
microeconomics course. The principal aspect of risk sharing is the
location and shape of the contract curve, something that is discussed
in Chapter 5. Finally, Chapters 6 and 7 analyse decision making and
risk sharing under asymmetric information: one of the parties to the
transaction is uninformed of an important parameter (Chapter 6) or
variable (Chapter 7). All of the analysis in Chapters 6 and 7 is also
carried out in the Edgeworth box, although the upper axes of the
box are omitted in the graphs, mainly because doing so has become
standard in most texts and articles that are written on this topic.
It is important to note that asymmetric information models make
sense only in stochastic environments. Without a random element in
the models, there can be no asymmetric information, as the outcome
of the transaction would reveal all information about the environment
to both parties. To go back to our analogy with Robinson and Friday,
if there were no uncertainty or risk about fishing, then it could be that one fish for sure would be caught in the relatively poor fishing area, and three would be caught for sure in the better area. If that were so, when Friday returns home with a single fish rather than three, Robinson knows for sure where Friday was fishing, even though
Robinson stayed home all day. Therefore, our analysis of asymmetric
information should be seen to be the culmination of the story being
told in this book on decision making under risk. It is a model in which
risk is present, which conditions the decision making of two individuals
simultaneously, under an assumption of different information sets.
The first four chapters look at decision making under risk with a
single individual, Chapter 5 adds a second individual but under an
assumption of symmetric information, and Chapters 6 and 7 add the
final ingredient of asymmetric information to the two-player model.
Throughout the chapters, specic exercises together with their
solutions are used to illustrate particular ideas in the main text. At
the end of each chapter, a set of problems without solutions is offered to serve as practice for what is covered in the text. The exercises are designed to give students closely guided practice at problem solving. In some cases, the exercises are used to show particular aspects, or applications, of what is in the text generally, and indeed the results obtained in some exercises are incorporated into the main story that a chapter is telling.

On the other hand, the end-of-chapter problems are offered without solutions. It is now very well known that if all of the problems that are offered to a student are accompanied by their solutions, checking the solution before tackling the problem is just too much of a temptation for all but a tiny minority of readers. Also, getting to grips with a problem by looking at its solution is a very dangerous strategy. All problems look simple once you are told how to solve them, but without the solutions at hand, they can be much more testing. If a student were to only look at problems together with their solutions, there is a very real danger that the student thinks he/she could solve such problems when in fact this is not true. You will know whether you can solve problems only if you try to do it without looking at the solutions first. It is very rewarding to be able to solve a problem all by oneself: by solving a problem you confirm to yourself that you have understood something new. By not including the solutions to the problems I hope to encourage students to try to solve the problems by themselves. Many students often ask for solutions to problems in order that they can check that they have solved them correctly, but again this is not really needed. I am very confident that you will know when your answer is good, and when it is not. It was (originally) Confucius who recognised that "I hear, I forget. I see, I remember. I do, I understand." The fact that the answers to the problems are not given in this book is intended as strong encouragement to follow Confucius' wise words of advice: try to do the problems by yourself.

1.4 Some advice


The best piece of advice that I can give readers of this book is to
stick to the chapter order, at least as far as first covering the chapters
on choice under risk before attempting to study the chapters on the
applications of that theory, including the chapters on asymmetric in-
formation. The book consistently re-applies concepts as they come up,
and once they have been introduced, they will be used often. If those
concepts are not properly understood, it becomes increasingly more difficult to understand the rest of the material. The exception to the general rule regarding the order in which the chapters should be read is that I suggest that the first part of the book that should be looked
at by all readers is Part III, which contains the appendices, where
the microeconomic and mathematical concepts that are necessary to
follow the book are set out. Regardless of where you stand in terms of
math and economics, it is a good idea to start by familiarising yourself
with the appendices to ensure that you are able to read through the
main text with little interruption.
Finally, let me offer a simple way to follow the way in which the
analysis contained in this book proceeds (indeed, I would argue that
it is how almost all of microeconomic analysis proceeds). Consider the
flow-chart below. Start at the top, defining the parameters, variables
and mathematical functions that are important for the problem at
hand, and ensuring that we put all these pieces together correctly.
In essence, at step 1, what we are doing is to correctly set up a
constrained optimisation problem, establishing the function that is to
be optimised (the objective function), identifying the variables that
we can choose in order to maximise the objective function (the choice
variables), and properly dening the set of available choices (the
choice set). Once that is done, we have a constrained optimisation
problem appropriately dened and set up, and so we can move on to
step 2 which is where we solve the problem. At step 2 we typically
need to locate a set of first-order conditions, along with a set of
complementary slackness conditions (which indicate which of the
constraints are actually binding), and solve them out to nd the
optimal solution.2 Finally, we move on to step 3, which is arguably
the most important step in the process. Step 3 is the analysis of com-
parative statics, which is looking at how the solution would change
should any of the elements identified at step 1 be altered. In this step
of the process, we are considering the question of the dependence of
the optimal solution to the problem on the initial setting into which
it was placed. It may be that not all alterations in the initial elements
will affect the solution, but more interestingly, when something is modified at step 1, we should expect that there will be some impact
upon what the optimisation process gives us in the end. A comparative
statics exercise keeps track of these effects, and, of course, it is useful

2
If you are unsure about how this is done, check Appendix A.

because it may point to which initial parameters we should use, if


we have some idea of what we would like to see happening at the
end of the maximisation process. This, for example, is the basic idea
behind incentives, which are so fundamental to economics. Assume,
for example, that you would like someone to take a particular decision
as the result of his own personal maximisation problem, but where you
have some control over the parameters defining that person's decision
problem. Using comparative statics you can back-track to nd the
best parameters to input to the problem so that the person's optimal
decision coincides with what you would most prefer.

1. Set the problem up


Identify the choice variables, identify all relevant param-
eters, identify the objective function and the constraining
functions, check for concavity of the objective and convexity
of the feasible set.

2. Solve the constrained optimisation problem


Write out the Lagrangean, find the first order conditions
and the complementary slackness conditions, solve out to
get the solution values of the choice variables as functions
of the system's parameters.

3. Do the comparative statics


Carry out an analysis of the solution functions found at step
2; how do the optimal values of the choice variables change
when the initial parameter values are changed? Normally,
this is done using derivatives (rather than re-doing the
maximisation problem at step 2 with the new parameter
values).
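As a concrete illustration of the three steps, the following sketch in Python uses the sympy library on a two-good consumer problem. The Cobb-Douglas utility function, the parameter names and the assumption of a binding budget constraint are illustrative choices for this sketch, not taken from the book:

```python
# A minimal sketch of steps 1-3 using sympy and a Cobb-Douglas example
# (the utility function and parameter names are illustrative assumptions).
import sympy as sp

# Step 1: set the problem up -- choice variables, parameters, objective, constraint
x1, x2, lam = sp.symbols('x1 x2 lam', positive=True)
p1, p2, m, a = sp.symbols('p1 p2 m a', positive=True)
utility = a * sp.log(x1) + (1 - a) * sp.log(x2)   # objective function
budget = m - p1 * x1 - p2 * x2                    # budget constraint, assumed binding

# Step 2: solve the constrained optimisation problem via the Lagrangean
L = utility + lam * budget
focs = [sp.diff(L, v) for v in (x1, x2, lam)]
solution = sp.solve(focs, (x1, x2, lam), dict=True)[0]
x1_star = sp.simplify(solution[x1])   # demand for good 1: a*m/p1
x2_star = sp.simplify(solution[x2])   # demand for good 2: (1-a)*m/p2

# Step 3: comparative statics -- how does optimal x1 respond to p1 and to m?
print('x1* =', x1_star)
print('dx1*/dp1 =', sp.simplify(sp.diff(x1_star, p1)))  # negative: demand falls in own price
print('dx1*/dm  =', sp.simplify(sp.diff(x1_star, m)))   # positive: good 1 is normal
```

The point of the sketch is simply that step 2 produces the solution functions, and step 3 is nothing more than differentiating those functions with respect to the parameters of interest.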

I have found that many students of microeconomics consider that the most important thing is to be able to solve for the optimal solution
(i.e., to complete step 2), but that is really only a minor part of what
microeconomics is about. Really, it is the analysis of that optimal
solution, above all how it changes as we alter the parameters that define the problem, that is of primordial importance. Governments do not really want to know about the equilibrium level of demand for a certain product, they want to know what would happen to that demand (and perhaps what would happen to the market price of the product) if a sales tax were to be introduced. This is an analysis in comparative statics. Even a demand curve itself is an analysis in comparative statics: it shows how an optimal choice changes as price is altered, for all feasible prices. Generally comparative statics can be carried out using simple derivatives and the implicit function theorem (see Appendix A if you are unsure what the implicit function theorem is, or how it relates to comparative statics). This is true when the source of the alteration is a parameter value (a number), which can be increased or decreased. However, in other instances we are interested in how a change in a function will affect the problem. A commonly studied functional change asks: how does a change in preferences (the shape of the objective function) alter the solution? For example, often we are interested in the effects of risk aversion upon decision making: how does the demand for particular goods change as people become more averse to risk? In order to do this, one needs to be a little more adventurous than simple derivatives for comparative statics.
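As a reminder of the parametric case (a generic sketch in standard notation, not the book's own): if the first-order condition $F(x^*, \theta) = 0$ implicitly defines the optimal choice $x^*$ as a function of a parameter $\theta$, then the implicit function theorem gives

$$\frac{dx^*}{d\theta} = -\frac{\partial F/\partial \theta}{\partial F/\partial x}\Bigg|_{x = x^*}$$

and since the denominator is negative at an interior maximum (the second-order condition), the sign of the comparative statics effect is simply the sign of $\partial F/\partial \theta$.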
Part I

Individual decision making


under risk
Chapter 2

Risk and preferences

It is an obvious fact of life that decisions are taken in an environment of uncertainty. That is, when a decision is made, some of the relevant parameters are stochastic random variables. This, of course, contrasts to the setting (so often assumed in elementary models) in which the values of all parameters are known to the decision maker. For example, consider the elementary undergraduate model of consumer choice. In the simplest setting all parameters (income, all prices and the utility function) are known with certainty by the individual. Here,
we wonder what occurs when uncertainty or risk impinges upon such a
problem, that is, when some of these parameters are random variables.
First and foremost, we must clarify exactly what is meant by the terms "risk" and "uncertainty". Really, uncertainty by definition is any situation that is not fully certain, that is, a stochastic situation involving some kind of randomness. However, it has now become common to give specific definitions to both risk and uncertainty. Here we shall follow the traditional meanings, first suggested by Frank Knight in his doctoral thesis.1 Both risk and uncertainty refer to situations in which at least one relevant parameter is a random variable (i.e., it is a variable that can take on more than one possible value, depending on a probability density function). When the probabilities with which this variable takes on each of its possible values are known (objective probabilities), then we say that the situation is one of risk. On the other hand, if the probabilities are not known (subjective probabilities) then we say that the situation is one of uncertainty. For
1
See the book by Knight titled Risk, Uncertainty and Profit, published by
Century Press (New York) in 1964 (originally published in 1921).


example, consider the case of an individual who owns a ticket in the lottery and must decide today (before the lottery is drawn) how much of his current income to save. If, by the mechanism under which the winner of the lottery is determined, the probability of winning can be calculated, and if the prize for winning is a pre-established amount, then we have a situation of risk. Simple cases of risk are lotteries determined by the throw of a coin, or by the roll of a die.

On the other hand, consider the case of a person who needs to decide which of two modes of transport to use (say bus or train) to travel between two cities on a given day. Say the option of the train is more reliable in terms of exact travel time (buses are more subject to traffic jams, breakdowns, etc.) but somewhat more expensive. If we assume that he is interested in only the time it takes to travel and the price of the ticket, then we have a decision that must be taken under uncertainty, since the exact probabilities of each possible duration for
the bus trip are unknown and unknowable. They must be estimated
by the individual subjectively. Similarly, lotteries like the outcomes of
sporting events are situations of uncertainty rather than risk.
However, given this, we can in fact notice that a situation of risk is
really a special case of a situation of uncertainty. When a person must
make a decision, and a random variable is present, an estimation of
the probabilities must be made somehow. Independently of whether or
not there exists a mathematically correct way to determine the exact
probabilities, an individual will simply use what he considers to be the
most adequate probabilities. Only when the individual actually uses the objective probabilities, when these are available, do we have a situation of risk. Thus, a situation of pure uncertainty is one in which no objective probabilities exist, and the choice of what probabilities to use must be made using other criteria that may differ across individuals. On the other hand, when the criterion used to establish the probabilities is to simply use the objective probabilities (whenever they exist) then we have a situation of risk (and there would be no difference across individuals). Really the only difference between risk and uncertainty is the specific criterion used to arrive at the relevant probabilities: in both cases any individual is free to use those probabilities that he feels fit (or in other words, probabilities are always subjective), and a situation is only one of risk when the criterion is to set the subjective probabilities equal to the objective ones (if they exist).2
2
Clearly, when the objective probabilities are very complex and difficult to determine, even though they exist (and so we can talk of a situation of risk), they will almost certainly not be used. This is an example of what is known by economists as bounded rationality.

Notwithstanding the fact that risk is a subset of all cases of uncertainty, the only types of situation that will be discussed in this book are situations of risk. In essence, for all of the models in which there is a single decision maker, it makes no difference whether or not the probabilities used are objective or subjective, but when there are two players in the model, it does. It is much simpler if we stick with common probability beliefs in such a case, and the most reasonable situation in which common beliefs hold is one of pure risk rather than uncertainty. We shall normally discuss situations of risk as choices between different lotteries, and so above all we will be interested in studying preferences over lotteries.

2.1 Historical antecedents


The study of decision making under risk and uncertainty is now quite
old. The first steps in the right direction were taken by amateur mathematicians about 500 years ago with the objective of analysing games of chance.3 Nevertheless, the early analyses were purely statistical, concerned above all with the calculation of expected values of monetary lotteries. In fact, at the time it was generally accepted that the value of a lottery was given by the mathematical expectation of the prizes. This idea led to what is now known as the Saint Petersburg Paradox, which we shall now discuss.

Consider a game of chance in which a fair coin is tossed repeatedly, until it comes up heads, in which case the game ends. The bettor receives a monetary prize that depends upon the number of tosses that occur before heads appears for the first time. Concretely, if heads appears for the first time on toss $n$ then the prize is the amount $2^{n-1}$. In this way, the prize list is 1 (heads appears on the very first toss), 2 (a head on the second toss), 4, 8, 16 and so on. The question is, how much is this lottery worth?
If we were to follow the advice of the mathematicians of the early
eighteenth century4 (which is when the problem of valuing this lottery

3
For an excellent account of this history, see Against the Gods: The Remarkable
Story of Risk, by Peter Bernstein, published by John Wiley & Sons (New York) in
1996.
4
And we are not dealing with just any mathematicians. The two names most
often associated with this idea are Blaise Pascal and Pierre de Fermat.

was first suggested), then we would value the lottery at its expected
value. However, the expected value of the lottery5 is
$$E\tilde{x} = \frac{1}{2}\cdot 1 + \left(\frac{1}{2}\right)^2 \cdot 2 + \left(\frac{1}{2}\right)^3 \cdot 4 + \ldots = \sum_{i=1}^{\infty}\left(\frac{1}{2}\right)^i 2^{i-1} = \sum_{i=1}^{\infty}\frac{2^{i-1}}{2^i} = \sum_{i=1}^{\infty}\frac{1}{2} = \infty$$
The expected monetary prize turns out to be infinite! However, it
appears quite clear that no sane bettor would place a very high value
on the lottery, since the most likely outcome is that the prize that will
end up being won is on the order of 1 or 2 monetary units, or perhaps
4 if we are quite lucky. So how can this dilemma be resolved?
It was the famous Swiss mathematician Daniel Bernoulli6 who first proposed a solution. Bernoulli postulated that what was important in a risky situation such as that proposed was the "moral value" of the prizes, rather than their pure monetary values. His analysis rests on the recognition that the loss of a certain amount of money, say $x$, implies a change in happiness that is, in absolute value, greater than the change that would occur if that same amount of money was earned. These days, we use the term "utility" of the prizes rather than the "moral value" of the prizes. In simple mathematical terms, if we use $u(\cdot)$ to denote a utility function and $w$ as the initial (riskless) wealth, then what Bernoulli recognises is that for any given $x$ we should have $u(w) - u(w-x) > u(w+x) - u(w)$.
Using a clear and logical argument, based principally on intuition,
Bernoulli concludes that the bettor in the Saint Petersburg lottery
5
Note the $\sim$ that has appeared above the random variable $x$. This tilde is often
used to distinguish a random variable from a deterministic one, and we shall follow
that custom throughout this text. The E is the expectations operator.
6
The Bernoulli family was full of famous mathematicians. It was a cousin of
Daniel, called Nicolas, who suggested the Saint Petersburg problem in the first
place. Even though history credits Daniel Bernoulli as being the author of the
solution that we shall analyse here, he himself recognised that Gabriel Cramer had
discussed the same solution some years earlier.

should act in accordance with maximising the expected value of a concave function of the prizes (the utility of the prizes), and not the expected value of the monetary value of the prizes. In fact, Bernoulli suggested that the appropriate function to use is the natural log function. In this case, it is relatively simple to prove that the bettor would value the lottery at the modest amount of $\ln(2)$ units of utility. That is, the bettor would be indifferent between playing the lottery, and having a sure wealth of only 2 monetary units. In that sense the lottery is worth only 2, a far cry from the initial infinite worth.
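A quick numerical check of both claims is easy to carry out. The sketch below (not part of the original text; it truncates the infinite sums at an arbitrary number of terms) shows the expected monetary prize growing without bound while the expected log utility converges to $\ln(2)$, so that the certainty equivalent wealth is 2:

```python
# Sketch: truncated sums for the Saint Petersburg lottery.
# Prize on toss n is 2**(n-1), received with probability (1/2)**n.
import math

N = 60  # truncation point for the infinite sums (an arbitrary choice)

expected_value = sum((0.5 ** n) * 2 ** (n - 1) for n in range(1, N + 1))
expected_log_utility = sum((0.5 ** n) * math.log(2 ** (n - 1)) for n in range(1, N + 1))

print(expected_value)                      # N/2 = 30: grows linearly in N, i.e. diverges
print(expected_log_utility, math.log(2))   # converges to ln(2), about 0.6931
print(math.exp(expected_log_utility))      # certainty equivalent wealth, about 2
```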

Exercise 1.1: What does Bernoulli's idea that the loss of a certain amount of money leads to a change in utility that is greater than the change from a gain of the same amount of money imply for the shape of the utility function?

Answer: Bernoulli's idea is that, for any positive numbers $w$ and $x$ we have $u(w) - u(w-x) > u(w+x) - u(w)$. This can be rearranged to read $2u(w) > u(w+x) + u(w-x)$, or $u(w) > \frac{1}{2}u(w+x) + \frac{1}{2}u(w-x)$. Notice that this is nothing more than a special case of Jensen's inequality for concave functions. Thus, Bernoulli's inequality holds true if (and only if) the utility function is everywhere concave.
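The inequality is also easy to verify numerically for any particular concave utility. The short sketch below (an illustration, not from the text) checks it for $u = \ln$ at a few arbitrary wealth and stake values:

```python
# Sketch: check u(w) > 0.5*u(w+x) + 0.5*u(w-x) for the concave utility u = ln.
import math

u = math.log
for w, x in [(10, 1), (10, 5), (100, 30)]:        # arbitrary pairs with w > x > 0
    lhs = u(w)
    rhs = 0.5 * u(w + x) + 0.5 * u(w - x)
    print(w, x, lhs > rhs, round(lhs - rhs, 6))   # always True for a strictly concave u
```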

Bernoulli's hypothesis was generally accepted by his peers as the solution to the problem, and then it appears to have been practically ignored. This could have been due, at least in part, to the fact that Bernoulli published his paper in a rather specialised journal, and in Latin, and so it may not have been easily accessed by the economists that worked on the idea of utility in the late nineteenth century (Jevons, Edgeworth, etc.). However, it was in the 1920s, with the invention of the mathematical theory of games (mainly due to John von Neumann), and the publication of the doctoral thesis of Frank Knight on the subject of risk and uncertainty in economics, that renewed interest in Bernoulli's hypothesis appeared. It was von Neumann himself, together with his colleague the economist Oskar Morgenstern, who provided the first formal and convincing proof of the Bernoulli hypothesis, thereby converting the hypothesis into a theorem. Since the result states that preferences over lotteries should be represented by the expected value of the utility of the prizes, the theorem has gone down in history with the name of "expected utility theory".

2.2 Expected utility theory


In this section, we shall give a short introduction to the theory of preferences between discrete lotteries (i.e., lotteries whose prize set is discrete). In order to do this, it will be necessary for us to use lotteries with many prizes, that is, strictly speaking this is not a 2-dimensional analysis. However, once the principal theorem has been proved, we will move directly to our two dimensional models.

The initial assumption is that there exists a well-defined preference relation over such lotteries, and that relation is complete and transitive. As is usual, we shall indicate this preference relationship by the symbol $\succsim$, and our final objective is to find an explicit utility function for lotteries that represents these preferences. This is done by introducing some reasonable assumptions on human preferences called axioms.

A lottery can be defined by two vectors: the prize vector $x \in \mathbb{R}^n$, and the vector of probabilities $p \in \mathbb{R}^n$. By the definition of numerical probabilities, in any lottery we have $0 \leq p_i \leq 1$ for all $i = 1, \ldots, n$ and $\sum_{i=1}^{n} p_i = 1$. Given this, we can represent a lottery by the notation
$$\ell(x, p) = (x_1, x_2, \ldots, x_n, p_1, p_2, \ldots, p_n)$$

Naturally, the bettor will receive only one prize, and the probability of prize $x_i$ is $p_i$. We shall indicate different lotteries by different vectors, in this case different probability vectors, expressed by the introduction of a super-index. That is, two different lotteries are expressed as $\ell^1(x, p^1)$ and $\ell^2(x, p^2)$. Note that this implies that all lotteries share the same prize vector, something that will be altered later on. However, for now it is sufficient to note that, since in any probability vector we can incorporate a 0 in any place that we desire, two lotteries with different prizes can still be captured by our notation. For example, if lottery 1 has the prizes $x_1$ and $x_2$, and lottery 2 has the prizes $x_3$ and $x_4$, then we can simply write $\ell^1(x_1, x_2, x_3, x_4, p^1_1, p^1_2, 0, 0)$ and $\ell^2(x_1, x_2, x_3, x_4, 0, 0, p^2_3, p^2_4)$.
We shall indicate the utility function for lotteries by $U(\ell) = U(x, p)$. With this notation we are explicitly assuming that such a utility function actually exists, and recognising that the utility of a lottery will depend upon both the set of possible prizes, and the associated set of probabilities. Our initial objective is to find a particular functional form for $U(\cdot)$, using the definition of a utility function, that is $U(\ell^h) \geq U(\ell^k)$ if and only if $\ell^h \succsim \ell^k$. In other words, we
would like to find out exactly how prizes and probabilities should be combined for a reasonable preference functional.

First, we need to assume that the individual is able to order the prizes themselves according to a preference relationship defined over the prizes. This preference relationship over the prizes is assumed to be (at least) complete and transitive. In this case, there will exist a transitive relationship $\succsim$ such that for any two prizes $x_i$ and $x_j$, either $x_i \succsim x_j$, or $x_j \succsim x_i$, or both are true. Given this, there will exist a utility function for the prizes, $u(x)$, such that $u(x_i) \geq u(x_j)$ if and only if $x_i \succsim x_j$. Without loss of generality, it is useful for us to order the prizes such that $x_1 \succsim x_2 \succsim \ldots \succsim x_n$. Furthermore, although it is not strictly necessary, we shall only consider situations in which there are no two indifferent prizes,7 that is $x_1 \succ x_2 \succ \ldots \succ x_n$, which we can also express as $x_i \succ x_j$ if and only if $i < j$.

The first task is to find a utility function for the prizes, $u(x)$. To do this, we shall use the following axiom:

Axiom 1 (first-order stochastic dominance): Assuming that $x_i \succ x_j$ whenever $i < j$, then if $\sum_{i=1}^{m} p^h_i \geq \sum_{i=1}^{m} p^k_i$ for all $m = 1, \ldots, n-1$, with $>$ for at least one value of $m$, then we have $\ell^h(x, p^h) \succ \ell^k(x, p^k)$.

First-order stochastic dominance is the risky environment equivalent to the assumption of monotone preferences under certainty, since it indicates that lotteries that weigh the better prizes relatively more are more preferred. For lotteries with only two prizes (which is really all we need for what follows) $x_i$ and $x_j$, where $x_i \succ x_j$, first-order stochastic dominance implies that
$$p^h_i > p^k_i \iff \ell^h(x_i, x_j, p^h_i, 1-p^h_i) \succ \ell^k(x_i, x_j, p^k_i, 1-p^k_i)$$
Note that with only two prizes, the implication goes in both directions. With more than two prizes, the implication goes only from left to right. You should think carefully about why this is so. That is, first-order stochastic dominance implies that a greater probability on the better prize leads to a more preferred lottery, and if there are only two prizes, then it is also true that the more preferred lottery must have a greater probability on the better prize.
7
This assumption is further justified below, in footnote 10.
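In computational terms, the cumulative-sum condition in Axiom 1 can be checked directly. The sketch below is illustrative only; it assumes, as in the text, a common prize vector ordered from best to worst:

```python
# Sketch: first-order stochastic dominance over a common prize vector
# ordered from best (index 0) to worst (index n-1), as in Axiom 1.
from itertools import accumulate

def fosd(p_h, p_k):
    """True if lottery h dominates lottery k: every cumulative probability on
    the best m prizes is at least as large, and strictly larger for some m."""
    cum_h, cum_k = list(accumulate(p_h)), list(accumulate(p_k))
    weakly = all(ch >= ck - 1e-12 for ch, ck in zip(cum_h, cum_k))
    strictly = any(ch > ck + 1e-12 for ch, ck in zip(cum_h[:-1], cum_k[:-1]))
    return weakly and strictly

# Example: lottery h puts more weight on the best prize than lottery k does.
print(fosd([0.5, 0.3, 0.2], [0.3, 0.3, 0.4]))  # True
print(fosd([0.3, 0.3, 0.4], [0.5, 0.3, 0.2]))  # False
```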

Now, note that we can associate with each one of the prizes a number, $\pi_i$, such that
$$x_i \sim \ell_i(x_1, x_n, \pi_i, 1-\pi_i) \quad i = 1, 2, \ldots, n$$
$\pi_i$ is the probability8 that the individual requires on the prize $x_1$, in order that he is indifferent between receiving the prize $x_i$ with certainty and receiving the lottery between the best and the worst possible prizes, $x_1$ and $x_n$. To see why there should be such a number $\pi_i$, think about the two numbers, $\pi_1$ and $\pi_n$, corresponding to the two extreme prizes. Clearly it must be true that $\pi_1 = 1$ (in order to be indifferent between receiving the prize $x_1$ with certainty, and receiving the lottery between this same prize and a worse one, it is necessary that the lottery also allocates the prize $x_1$ with probability 1). After all, by the very definition of probability it cannot happen that $\pi_1 > 1$, and if we use $\pi_1 < 1$, then the individual has on the one hand a situation in which the very best prize, $x_1$, is consumed for sure, and on the other hand a situation in which it may happen that $x_1$ is consumed and it may also happen that $x_n$, the worst possible prize, is consumed. The non-zero chance of getting the worst prize rather than the best one makes this lottery less appealing than having $x_1$ for sure. Similarly, we must have $\pi_n = 0$. Indeed, under first-order stochastic dominance, as $\pi$ is reduced from 1 to 0, it defines lotteries $\ell(x_1, x_n, \pi, 1-\pi)$ that are progressively less preferred, and since $\pi$ is a continuous variable, by the Intermediate Value Theorem we should be able to find a $\pi_i$ that equates the level of preference that would be gained by having any intermediate prize $x_i$ for sure and the level of preference that would be gained by having the lottery $\ell_i(x_1, x_n, \pi_i, 1-\pi_i)$.

Now, since $x_i \succ x_j$ if and only if $i < j$, by transitive preferences it holds that $\ell_i(x_1, x_n, \pi_i, 1-\pi_i) \succ \ell_j(x_1, x_n, \pi_j, 1-\pi_j)$ if and only if $i < j$. But then under the axiom of first-order stochastic dominance, it must be true that $\pi_i > \pi_j$ if and only if $i < j$. In short, we have the result that $\pi_i > \pi_j$ if and only if $x_i \succ x_j$, and so we can simply define the utility function over the prizes as $u(x_i) = \pi_i$ for $i = 1, \ldots, n$.

8 Using the same notation as up to now, we should really write $\pi_{i1}$ instead of $\pi_i$. However, when there are only two prizes there is no need to continue with the subindex that indicates the corresponding prize, and so we eliminate it in the interests of simplifying the notation.
We now require a second axiom:9


Axiom 2 (independence of irrelevant alternatives): Consider a second vector of $n$ prizes, $z$. If it turns out that $x_i \sim z_i$ for all $i$, then $\ell_k(x, p^k) \sim \ell_k(z, p^k)$. That is, in any particular lottery $\ell_k(x, p^k)$, we can substitute any of the prizes for another prize that is indifferent to the one that is removed, without altering the utility of the lottery itself.10
Axiom 2 gets its name from the fact that it implies that there is
no relationship in preferences between the prizes that are altered and
those that are not. That is, there is a preference independence between
the different prizes on offer, and so altering some of the prizes does not affect the way one thinks about the prizes that are not altered.
Under axiom 2, in any lottery we can remove each prize $x_i$ and in its place substitute the lottery $\ell_i(\cdot)$ without altering the utility of the initial lottery $\ell$. That is, $\ell_k(x, p^k) \sim \ell_k(\ell, p^k)$, where obviously $\ell = (\ell_1, \ell_2, \ldots, \ell_n)$. However, the lotteries $\ell_i(\cdot)$ all have the same two prizes, $x_1$ and $x_n$, and so the individual can only receive one of these two prizes. Given that, we can write11 $\ell_k(\ell, p^k) = \ell_k(x_1, x_n; q_k, 1 - q_k)$, and the only question left to resolve is the value of $q_k$.
Now, note that $q_k$ is just the probability of receiving the prize $x_1$ from the lottery that allocates as its prizes the $n$ sub-lotteries. Therefore, from elementary statistics, we know that the probability of receiving $x_1$ is just the sum of the probabilities of winning that prize in each sub-lottery, weighted by the probabilities of receiving each of the sub-lotteries, that is, $q_k = p_1^k \pi_1 + p_2^k \pi_2 + \ldots + p_n^k \pi_n$. Finally, by the axiom of first-order stochastic dominance, a greater value of $q_k$ must indicate a more preferred lottery:
9 This axiom can be expressed in a number of different, but equivalent, manners. While the one used here is not the most common, I find that it is the most useful for the purpose of arriving most easily at the expected utility theorem.
10 Of course, since it is trivially true that $x_i \sim x_i$, it is not really necessary that the new vector of prizes, $z$, is different to the original vector, $x$, in each and every element. Note also that the axiom of independence of irrelevant alternatives justifies our earlier assumption that we only consider situations in which there are no indifferent prizes. Indifferent prizes can be considered, from the point of view of utility, to be the same prize.
11 Actually, the equals sign in the next expression is a slight exaggeration. The lottery of lotteries (i.e. the compound lottery) $\ell_k$ is only equivalent to a lottery in which prize $x_1$ is gained with probability $q_k$ and $x_n$ is gained with probability $1 - q_k$. The two are equal only if the decision maker does not care about the exact process by which a given prize is won.
$$\ell_h(x_1, x_n; q_h, 1-q_h) \succ \ell_k(x_1, x_n; q_k, 1-q_k) \quad \text{whenever } q_h > q_k$$
with indifference only in the case of the same value of $q$. In short, we have arrived at the conclusion that
$$q_h \geq q_k \text{ if and only if } \ell_h \succeq \ell_k$$
and so we can simply take $U(\ell_k) = q_k = \sum_i p_i^k \pi_i = \sum_i p_i^k u(x_i)$. That is, the utility of the lottery is equal to the expected value of the utility of its prizes.
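The reduction step above is easy to verify numerically. The following Python sketch uses made-up prizes, probabilities and indifference numbers $\pi_i$ (all hypothetical values, not taken from the text) to show that the utility of a lottery computed this way is just $\sum_i p_i \pi_i$:

# Sketch of the construction: each prize x_i is replaced by its indifferent
# lottery over the best and worst prizes, so the utility of any lottery is
# the total probability of ending up with the best prize, sum(p_i * pi_i).
prizes = [100, 50, 0]        # x1 > x2 > x3 (hypothetical values)
pi = [1.0, 0.7, 0.0]         # pi_1 = 1 and pi_n = 0; pi_2 is an arbitrary guess

def lottery_utility(p):
    return sum(pk * pik for pk, pik in zip(p, pi))

p_h = [0.5, 0.3, 0.2]
p_k = [0.4, 0.4, 0.2]
print(lottery_utility(p_h), lottery_utility(p_k))   # 0.71 > 0.68, so lottery h is preferred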

2.3 The Allais paradox and alternative decision criteria
It is interesting to note that the expected value of a lottery is linear
in both its prizes and its probabilities (the derivative with respect
to any of these elements is a constant). On the other hand, however,
expected utility is linear in probabilities but not in prizes. Shortly,
we shall see graphical representations of both of these ideas. The idea
that utility is non-linear in prizes is not at all controversial, and dates
back to the very early work on utility theory. However, the fact that
the utility of a lottery should be linear in probability has proven to
be a rather more controversial topic.
In the early 1950s, French economist Maurice Allais (winner of the
Nobel Prize in economics in 1988) studied choices between two sets
of two lotteries each. Concretely, the first choice is between lottery A,
which gives a prize of 5 with probability 1, and lottery B, which gives
a prize of 25 with probability 0.1, a prize of 5 with probability 0.89,
and a prize of 0 with probability 0.01. The second choice is between
lottery C, which gives a prize of 5 with probability 0.11 and a prize of
0 with probability 0.89, and lottery D, which gives a prize of 25 with
probability 0.1 and a prize of 0 with probability 0.9. Note that over
these four lotteries, three different prizes can be won, namely x1 = 25, x2 = 5 and x3 = 0, and the only difference between the lotteries is the
probability with which each of the prizes can be won. The relevant
probabilities for the four lotteries are summarised in Table 2.1, where
for each lottery the probability of prize xi is indicated by the value of
pi .
It turns out that many people choose lottery A in the rst choice
(a prize of 5 with certainty) and then lottery D in the second choice
(a prize of 25 with probability 0.1). However, if lottery A is preferred to lottery B, then it must be that the expected utility of the former is greater than the expected utility of the latter:
$$u(5) > 0.1u(25) + 0.89u(5) + 0.01u(0)$$
$$\Rightarrow 0.11u(5) > 0.1u(25) + 0.01u(0)$$
But if lottery D is preferred to lottery C, then we must have
$$0.1u(25) + 0.9u(0) > 0.11u(5) + 0.89u(0)$$
$$\Rightarrow 0.1u(25) + 0.01u(0) > 0.11u(5)$$
These two conclusions are clearly inconsistent with each other.

Table 2.1 The Allais paradox lotteries

Lottery p1 p2 p3
A 0 1 0
B 0.1 0.89 0.01
C 0 0.11 0.89
D 0.1 0 0.9
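The inconsistency can also be checked numerically. The sketch below (Python, with the utility of the worst and best prizes normalised to 0 and 1, a harmless assumption for expected-utility rankings) searches for any intermediate utility value for the prize 5 that would rationalise choosing A over B and D over C, and finds none:

import random

def eu(p, u):                        # expected utility of a lottery
    return sum(pi * ui for pi, ui in zip(p, u))

A, B = [0, 1, 0], [0.1, 0.89, 0.01]  # probabilities of (25, 5, 0), as in Table 2.1
C, D = [0, 0.11, 0.89], [0.1, 0, 0.9]

random.seed(0)
for _ in range(100000):
    u5 = random.uniform(0, 1)        # candidate utility of the prize 5
    u = [1.0, u5, 0.0]               # u(25) = 1, u(0) = 0 (normalisation)
    if eu(A, u) > eu(B, u) and eu(D, u) > eu(C, u):
        print("consistent utility found:", u5)
        break
else:
    print("no value of u(5) rationalises both choices")   # this branch runs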

The Allais paradox relies upon the decision maker showing preferences that are inconsistent with the axiom of independence of irrelevant alternatives. To see this, we can note that a common restatement of the independence axiom is the following. Say a decision maker is asked to rank a lottery $a$, denoted $\ell_a$, and a lottery $b$, denoted $\ell_b$, and the ranking is that $\ell_a \succ \ell_b$. Then the decision maker is asked to rank the lotteries $\ell_c = (\ell_a, x; p, 1-p)$ and $\ell_d = (\ell_b, x; p, 1-p)$. That is, lottery $c$ gives a probability of $p$ of winning lottery $a$ and a probability of $1-p$ of winning some other prize $x$. Lottery $d$ gives a probability of $p$ of winning lottery $b$ and a probability of $1-p$ of winning the same other prize $x$. The independence axiom, together with the revealed preference $\ell_a \succ \ell_b$, then implies that $\ell_c \succ \ell_d$. This can be easily checked by recalling that under the independence axiom, we know that the preference functional is expected utility. The expected utility of $\ell_c$ is $U(\ell_c) = pU(\ell_a) + (1-p)U(x)$, while the expected utility of $\ell_d$ is $U(\ell_d) = pU(\ell_b) + (1-p)U(x)$. Since (by revealed preference) we have $U(\ell_a) > U(\ell_b)$, and since clearly $U(x) = U(x)$, when we compare $U(\ell_c)$ and $U(\ell_d)$ we must conclude that $U(\ell_c) > U(\ell_d)$, irrespective of what $x$ is.
Table 2.2 Alternative representation of the Allais paradox lotteries

Lottery x1 x2 x3 p1 p2 p3
A 5 5 5 0.1 0.89 0.01
B 25 5 0 0.1 0.89 0.01
C 25 5 0 0 0.11 0.89
D 25 0 0 0.1 0.01 0.89

So the choice between the two compound lotteries $\ell_c$ and $\ell_d$ is made independently of the common outcome, $x$.
Note that the Allais lotteries shown in Table 2.1 can be described
with an alternative representation. We can also represent each of the
two choices (the choice between lotteries A and B on the one hand,
and the choice between lotteries C and D on the other) with one of
the prizes having a common probability. This is set out in Table 2.2.
In the first choice, the sure-thing lottery is now re-phrased as a
prize of 5 with probability 0.1, a prize of 5 with probability 0.89, and
a prize of 5 with probability 0.01. In this way, both of the lotteries A
and B involve the prize 5 with probability 0.89. So if A is preferred
to B, then under the independence axiom this preference must be
independent of the 0.89 probability of prize 5 which is common to both
as prize x2 . So, the independence axiom implies that a preference for
lottery A over lottery B indicates how the remaining parts of these two lotteries compare: a prize of 5 with probability 0.11 (i.e., $1 - 0.89$) is better than a 0.1 probability of 25. But then, when the second two
lotteries are compared, lottery D is re-phrased as a probability of 0.1
of 25, a probability of 0.01 of 0 and a probability of 0.89 of 0. In this
case, the 0.89 probability of 0 is shared with lottery C and should
be irrelevant to the choice between the two lotteries. But we already
know (from the choice made between A and B) that a 0.11 probability
of 5 is preferred to a 0.1 probability of 25, in which case C should be
preferred to D.

The Ellsberg paradox


Given results like those of the Allais choices, economists began to
re-think the theory of expected utility, above all, the aspect that
preferences should satisfy the independence axiom. More concretely,
they have questioned such aspects of preferences as the idea that they
should be linear in probabilities, or that preferences are defined over
absolute values of wealth attained. This literature is now known as
the theory of generalised expected utility, and it is founded largely
on behavioural principles rather than axiomatic ones. That is, the
alterations to strict expected utility are an attempt to better model
observed behaviour, either in experimental settings or sometimes from
field data.
Take, for example, the situation known as the Ellsberg paradox. A decision maker is shown an urn containing 90 balls, of which it is
known that 30 are red. The other 60 are either white or black, but
the exact number of white (and thus black) balls is unknown. The
decision maker is asked to rank lotteries that involve a payoff that depends upon the colour of a randomly drawn ball. Specifically, the individual is offered the choice of two lotteries: lottery r is "you get x dollars if the randomly drawn ball is red and 0 dollars otherwise", and lottery w is "you get x dollars if the randomly drawn ball is white and 0 dollars otherwise". Then, the decision maker is asked to choose between lottery rb, which is "you get x dollars if the randomly drawn ball is either red or black and 0 dollars otherwise", and lottery bw, which is "you get x dollars if the randomly drawn ball is either black or white and 0 dollars otherwise". A preference for lottery r
over lottery w implies that the subject estimates that the probability of a white ball is less than that of a red ball, which is $\frac{1}{3}$. Critically, this implies that the probability of a black ball must be greater than $\frac{1}{3}$, and so the probability of either a red or a black ball is estimated to be greater than $\frac{2}{3}$, while it is known that the probability of either a white or a black ball is exactly $\frac{2}{3}$. Therefore, a preference for lottery r over lottery w would imply (under expected utility) a preference for lottery rb over lottery bw. However, subjects are often observed to choose lottery r over lottery w, and then lottery bw over lottery rb. Such a preference is typically ascribed to what has become known as "ambiguity aversion", or a preference for outcomes with known probabilities over outcomes with unknown (or ambiguous) probabilities.12 Since expected utility is linear in probabilities, there is no place for ambiguity aversion in expected utility. A decision maker should be indifferent between two scenarios with the same expected value of
12 In reality, given the Knightian definitions of risk and uncertainty, ambiguity aversion might be better called "uncertainty aversion". But perhaps the term "ambiguity" was used to avoid confusion between uncertainty aversion and risk aversion, which is a completely different concept, as we shall see later on.
probabilities over the same prize set, but where in one scenario the
probabilities are known and in the other they are not.
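A small numerical sketch (Python, with $u(x)$ normalised to 1 and $u(0)$ to 0, and a hypothetical subjective probability $q$ of drawing a white ball) makes the point concrete: whatever $q$ the decision maker uses, expected utility forces the two Ellsberg choices to line up, so the commonly observed pattern cannot be accommodated:

# For any subjective probability q of white (black then has probability 2/3 - q),
# r is preferred to w exactly when rb is preferred to bw under expected utility.
def consistent(q):
    p_red, p_white, p_black = 1/3, q, 2/3 - q
    r, w = p_red, p_white                       # expected utilities with u(x)=1, u(0)=0
    rb, bw = p_red + p_black, p_black + p_white
    return (r > w) == (rb > bw)

print(all(consistent(k / 300) for k in range(0, 201)))   # True for every q in [0, 2/3]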

Prospect theory
Another well-known alternative decision criterion is known as prospect
theory, which hypothesises that utility should be dened relative
to a particular wealth level (often taken to be the perceived initial
wealth), and what is important are changes in wealth from that level
rather than the levels of wealth that are attained. Above all, it is often
hypothesised that utility may be convex below the critical wealth
level (the domain of losses) and concave above the critical wealth
level (the domain of gains). It is also often assumed that the utility
function may have a non-differentiable kink at the critical wealth level.
When the critical wealth level is indeed taken as perceived initial
wealth, clearly it will change over time as lotteries play out. A small
risk involves comparing falling below the initial wealth into a zone of
convex utility (the loss domain) and going above the initial wealth into
a zone of concave utility (the gains domain). In such a setting small
risks become disproportionately important compared to the smooth,
everywhere concave, kind of utility function that is typically used in
expected utility theory. An example of two utility functions, one for
traditional utility and the other for prospect theory, together with
a perceived initial wealth of $w_0$, is shown in Figure 2.1. Specifically, Figure 2.1 assumes a kink at the initial wealth $w_0$, and that the two functions coincide above $w_0$, but they are clearly different below $w_0$.
Convexity of utility below the perceived initial wealth and concavity of utility above it imply what has become known as loss aversion. Loss aversion is simply a situation in which the effect upon utility of a loss of a small amount of wealth is greater than the effect of a gain of the same small amount of wealth. Clearly, any concave function is loss averse in this sense, but when the function is drawn with a kink at perceived initial wealth and with convex utility below the initial wealth, the loss aversion effect for small losses is greatly amplified, both by the kink and by the change from convexity to concavity as we move from left to right. This is shown in Figure 2.1, where the small change in wealth is $x$. Because Figure 2.1 assumes the two functions to be equal above $w_0$, the welfare effect of the gain ($+x$) is the same for both functions, the distance $a$. But the welfare effect of the loss ($-x$) is greater (in
[Figure: a traditional utility function and a prospect-theory utility function plotted against wealth $w$; the two coincide above $w_0$, and the distances $a$, $b$ and $c$ measure the utility changes produced by gains and losses of $x$ around $w_0$.]
Figure 2.1 A traditional utility function and a prospect theory utility function

absolute value) for the prospect theory utility function (distance $b$) than for the traditional utility function (distance $c$).
As a simple example, assume a decision maker has an initial wealth
of $w_0$, and a lottery that gives a loss of $x$ with probability $p$. The final
wealth options of this situation can be framed in more than one way:

1. initial wealth is $w_0$, together with a probability $p$ of having $w_0 - x$.
2. initial wealth is $w_0 - x$, together with a probability $1-p$ of having $w_0$.

Under expected utility, both of these descriptions are exactly equivalent; under prospect theory they are not.
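A minimal sketch of this framing effect, using an assumed prospect-theory value function (concave over gains, convex over losses, with a kink at the reference point; the functional form and all numbers are illustrative only, not taken from the text):

import math

def value(change):
    # Hypothetical value function defined over changes in wealth:
    # concave for gains, convex for losses, with losses weighted more heavily.
    if change >= 0:
        return math.sqrt(change)
    return -2.0 * math.sqrt(-change)

w0, x, p = 100.0, 36.0, 0.5
# Framing 1: reference point w0, a loss of x with probability p.
frame1 = (1 - p) * value(0) + p * value(-x)
# Framing 2: reference point w0 - x, a gain of x with probability 1 - p.
frame2 = (1 - p) * value(x) + p * value(0)
print(frame1, frame2)   # -6.0 versus 3.0: the same final wealth distribution, valued differently

Under expected utility defined over final wealth, by contrast, the two framings give identical values by construction.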

Discussion
Economic theorists have reacted in very different ways to the general-
isation of expected utility that is suggested by making the preferences
non-linear in probabilities. In reality, the debate relates to positive
and normative economics. If the final objective is to describe and to predict decision making (which is a positive viewpoint), in an attempt to accommodate real-life evidence, then the accumulation of experimental evidence on lotteries like those of Allais may make a persuasive case for abandoning expected utility. However, if the objective is to provide counsel, good advice, and in general guidance so that
decision makers can make better choices, and so that they can correct
logical errors, then we should not introduce any alteration to expected
utility theory that is not in accordance with the basic axioms upon
which it is formulated, so long as the decision maker agrees to the
axioms. If a decision maker declares that he is in full agreement with
the axioms (and the axioms can normally be presented in a way that
is far easier to understand than the structure of some of the lotteries
that have been used to discredit the theoretical validity of expected
utility), then we should not abandon strictly expected utility, as it has
been mathematically proven to be the rational preference functional
subject to acceptance of the axioms.
The alternative decision criteria that have been suggested rely
heavily upon modelling preferences that are inconsistent with the
independence axiom. In essence, the independence axiom requires
that there are no complementarities between prizes, but observed
behaviour from experiments such as the Allais paradox and the Ellsberg paradox indicates that this may not be the case. However, while
complementarities may be possible when the prizes are general baskets
of goods and services, for the case at hand, where prizes are all just
different sums of money, it is very hard to believe that there can be
such complementarities. Indeed, Nobel Prize winner Paul Samuelson
has argued convincingly that even with a general basket of goods as
prizes, since one and only one of the baskets will actually be received,
taking into account complementarities over different prizes is a fallacy.
This will be the posture taken for the entirety of the present
text, and so from now on, all of the analysis in the text is limited
to expected utility.

Summary
In this chapter, you should have learned the following:

1. The basic difference between risk (a random environment in which objective probabilities exist and are used) and
uncertainty (a random environment in which only subjective probabilities exist). This book concentrates on risk for the simple reason that everything is simplified in that way when there is more than one decision in the models later on.
2. Based upon a few reasonable axioms, the rational preference
functional for choices under risk is expected utility; that is, the
expected value of the utility of the prizes in a lottery determines
the utility of that lottery.
3. Expected utility has not been universally accepted as being
a reasonable description of preferences. Behavioral economists
have found many instances of choices in real-world environments
(many of which are experimental) in which subjects display
preferences that are inconsistent with expected utility. The principal
departure from expected utility appears to be the independence
axiom.
4. Notwithstanding these difficulties, the expected utility axioms
are persuasive and appealing, and there is still a very strong
case for studying choice under expected utility, above all if the
researcher has a normative rather than a positive approach to
decision theory.

Problems
1. Show mathematically that, when the utility function for wealth
of w is equal to ln(w), then the expected utility of the Saint
Petersburg lottery is equal to ln(2).
2. Work out the value of the expected utility of the Saint Peters-

burg lottery when the utility function for wealth of $w$ is $\sqrt{w}$.
3. Really, the analysis of Daniel Bernoulli asks the wrong question.
Bernoulli is interested in the number w for which the utility of
w for sure is equal to the utility of the posed lottery. While this
is an interesting question, an even more interesting one from
the point of view of economics is the following. Given an initial
wealth of w0 , what price q would an individual be willing to pay
to purchase the St. Petersburg lottery? Write down the equation
that would define $q$ for the St. Petersburg lottery, assuming that
utility is the log function. Can this equation be solved exactly
for any given w0 ? How about for w0 = 2?
4. Use your equation from the previous problem, establishing the
limit price q that would be paid by an individual with risk-
free wealth of w0 to participate in the St. Petersburg lottery,


to consider the effect upon $q$ of an increase in $w_0$, under the assumption that $w_0 \geq 2$. Retain the assumption that utility is
the log function.
5. Say that in the St. Petersburg lottery, instead of allowing the
feasible coin throws to be innite, they are limited to some
number $n$. Now write down the equation that defines the price
q that is alluded to in problem 3. How do you think the value
of q would change if n increases?
6. Jack and Jill both have the same initial wealth $w_0$ and the same utility function for final wealth $w$, which is $u(w) = (w - w_0)^{\frac{1}{3}}$. One of the two is given a lottery ticket that pays $r$ dollars (where $r > 0$) or 0 dollars, each with probability equal to $\frac{1}{2}$. Show that, regardless of which of the two receives the gifted lottery ticket, there is a strictly positive price $b$ at which the lottery ticket could be sold to the other, such that both are made better off.
7. An expected utility maximising decision maker declares that he prefers a lottery that pays $5 and $10 with equal probability to a lottery that pays $10 with probability $\frac{3}{4}$ and $0 with probability $\frac{1}{4}$. Which would the decision maker choose out of a lottery that pays $5 and $0 with equal probability, or a lottery that pays $10 with probability $\frac{1}{4}$ and $0 with probability $\frac{3}{4}$?
8. Assume an individual with initial risk-free wealth of $w_0 = 25$ and a lottery that pays $x_1 = -16$ or $x_2 = 16$, each with equal probability. His utility function for levels of wealth $w$ is given by the piecewise function
$$u(w) = \begin{cases} \bar{w} - \sqrt{\bar{w} - w} & \text{for } w < \bar{w} \\ \bar{w} + \sqrt{w - \bar{w}} & \text{for } w \geq \bar{w} \end{cases}$$
where $\bar{w}$ is a reference level of wealth. Draw a graph of this utility function for $w > 0$. Assuming that the individual assigns $\bar{w} = w_0$, what is the expected utility of his initial situation? Assuming instead that he assigns $\bar{w} = w_0 + x_2$, recalculate his expected utility. In what sense can the reference level of wealth be used to describe the individual's pessimism?
Chapter 3

Risk aversion

3.1 The Marschak-Machina triangle and risk aversion
One of the earliest, and most useful, graphical tools used to analyse
choice under uncertainty was a triangular graph that was proposed by
Jacob Marschak in 1950. The graph was later re-used extensively by
Mark Machina during the 1980s to understand the results of experi-
ments designed to find out whether real-life decisions can be explained
by expected utility theory or not. Since this is a graphical exercise,
it is necessary to study a reduced set of lotteries only. Concretely, we
need to reduce the number of prizes down to n = 3, in order that the
analysis can be carried out in an entirely two-dimensional environ-
ment. The assumption of only 3 prizes in any lottery is the greatest
dimensionality that can be studied in a two-dimensional graph. If
we place probabilities on the axes, lotteries can be represented by
only two probabilities, since the third is the difference between the number 1 and the sum of the other two. Concretely, it is convenient to eliminate the probability of the intermediate prize, writing it as $p_2 = 1 - p_1 - p_3$, thereby maintaining on the axes of the graph the
probabilities of the two extreme prizes.
It is also necessary at this point to limit our lotteries to prizes over
different quantities of a single good. Strictly speaking, this was not
required in the previous chapter, so expected utility is valid for a wider
range of options, but it is useful from here on. Given this, we shall
simply assume that the only good in the model is money itself, and so
all lotteries allocate prizes of dierent amounts of money. We shall use

33
34 3. Risk aversion

the generic variable $w$ to represent such monetary amounts (and $\tilde{w}$ to represent the corresponding random variable). The implication is that
the utility function for prizes, u(w), is just the indirect utility function
from neoclassical demand theory (see Appendix B if you are unsure
what indirect utility is or what properties it has). Finally, since we
are assuming monetary prizes, the assumption of $w_i \succ w_j$ for $i < j$ can be expressed using ordinary inequalities as $w_1 > w_2 > w_3$.

[Figure: a triangle in $(p_1, p_3)$ space with $p_1$ on the horizontal axis and $p_3$ on the vertical axis; four lotteries $\ell_1$, $\ell_2$, $\ell_3$ and $\ell_4$ are plotted, with $p_3^1 = p_3^2$ and $p_1^1 = p_1^3$.]
Figure 3.1 A Marschak-Machina triangle

Recall that, at least throughout this chapter, the three numbers $w_i$, $i = 1, 2, 3$, are fixed parameters at all times, and that different lotteries are represented by different probability vectors, $p^i \neq p^j$. Therefore, with $n = 3$, which as we have just mentioned allows us to write $p_2 = 1 - p_1 - p_3$, we can represent any given lottery as a
point in the graphical space (p1 , p3 ). This is done in Figure 3.1. Any
lottery that lies on the horizontal axis has p3 = 0, so the only possible
prizes are w1 and w2 . Similarly, any lottery that lies on the vertical
axis indicates that the only possible prizes are w2 and w3 . Finally, any
lottery located on the hypotenuse of the triangle, where p1 + p3 = 1,
and so p2 = 0, implies that the only possible prizes are w1 and w3 .
Thus, only when the lottery is located at a strictly interior point in the triangle are all three prizes possible (as is, for example, the case with lottery $\ell_1$ in Figure 3.1).

Exercise 3.1. Indicate as a distance in a Marschak-Machina triangle the probability $p_2$ for a strictly interior lottery.

Answer. Draw a Marschak-Machina triangle, and place a dot somewhere in its interior. Label the coordinates of your dot, as read from the axes, as $(p_1, p_3)$. If we draw a horizontal line from your dot across until it touches the hypotenuse of the triangle, and then look at the value of the $p_1$-axis at that point, it must be the number $1 - p_3$. This is just because the hypotenuse of the triangle defines the points such that $p_1 + p_3 = 1$, or $p_1 = 1 - p_3$. Now, you have two points indicated on the horizontal axis: the point directly below your dot, which is the point $p_1$, and the point just located as $1 - p_3$. Along the horizontal axis, the distance from the origin to the point directly below your dot is the value of $p_1$, and the distance from the point located as $1 - p_3$ to the number 1 on the axis is the measure of $p_3$. And since the three probabilities must sum to 1, the value of $p_2$ is just the distance between the two points on your horizontal axis. If you like, from your dot, move directly to the right until you reach the hypotenuse; the distance travelled is $p_2$. Alternatively, from your dot move directly upwards until you reach the hypotenuse; again, the distance travelled is $p_2$.

In order to understand the direction of preferences in the triangle, we need to use first-order stochastic dominance. Consider the two lotteries $\ell_2$ and $\ell_1$ in Figure 3.1. Since $p_3^1 = p_3^2$, it must be true that $p_1^1 + p_2^1 = p_1^2 + p_2^2$. But since $p_1^1 > p_1^2$, it turns out that lottery $\ell_1$ first-order stochastically dominates $\ell_2$, and so $\ell_1 \succ \ell_2$. Now consider $\ell_1$ and $\ell_3$. Since $p_3^1 < p_3^3$, we have $p_1^1 + p_2^1 > p_1^3 + p_2^3$, and again since $p_1^1 = p_1^3$, lottery $\ell_1$ first-order stochastically dominates $\ell_3$, and so it follows that $\ell_1 \succ \ell_3$. Finally, consider the lottery $\ell_4$. Since $p_3^1 < p_3^4$ we have $p_1^1 + p_2^1 > p_1^4 + p_2^4$. But we also have $p_1^1 > p_1^4$, and so by first-order stochastic dominance, $\ell_1 \succ \ell_4$.
In short, first-order stochastic dominance indicates that more preferred lotteries in the triangle lie to the south-east. Of course, this also indicates that if we have two lotteries in the triangle that are indifferent to each other, then a straight line joining them must have strictly
positive slope, and so in the triangle preferences can be represented by indifference curves that have strictly positive slope.

Exercise 3.2. Consider a lottery that pays $1 with probability $(1-p)$ and $0 with probability $p$. Assume that a bettor is offered either one or two independent trials of this lottery. Call a single trial of the lottery $L_1$ and two independent trials $L_2$. Locate both $L_1$ and $L_2$ in a single Marschak-Machina triangle. Can you determine which of the two options is the more preferred for a risk averse bettor?

Answer. The Marschak-Machina triangle would have the best prize equal to $2, the intermediate prize equal to $1 and the worst prize equal to $0. Since $L_1$ offers a zero probability of the best prize, it is located upon the vertical axis of the triangle, at a height of $p$. Lottery $L_2$ offers a probability of $(1-p)^2$ of the best prize, and a probability of $p^2$ of the worst prize. Thus $L_2$ is located at the strictly interior point defined by $p_1 = (1-p)^2$ and $p_3 = p^2$. Since $p^2 < p$, $L_2$ is located below and to the right of $L_1$. Under first-order stochastic dominance, $L_2$ is preferred to $L_1$.

Indeed, it is easy to get the exact equation for the slope of an indifference curve in the triangle, at least under expected utility. An indifference curve is defined as the set of points $(p_1, p_3)$ such that $Eu(\tilde{w}) = p_1 u(w_1) + (1 - p_1 - p_3)u(w_2) + p_3 u(w_3) = C$, where $C$ is a constant and $E$ is the expectations operator. Then, from the implicit function theorem, we have
$$\left.\frac{dp_3}{dp_1}\right|_{dEu(\tilde{w})=0} = \frac{u(w_1) - u(w_2)}{u(w_2) - u(w_3)} > 0$$

Note that, since $w_i$, $i = 1, 2, 3$, are constants, $u(w_i)$, $i = 1, 2, 3$, are also constants, and so the slope of an indifference curve is a positive constant (independent of the particular point $(p_1, p_3)$ chosen). In other words, indifference curves in the Marschak-Machina triangle are straight lines, with higher valued curves lying to the south-east.
It is also interesting to compare indifference curves with the curves along which expected value is constant, $E\tilde{w} = p_1 w_1 + (1 - p_1 - p_3)w_2 + p_3 w_3 = V$. In exactly the same way as above, we get
$$\left.\frac{dp_3}{dp_1}\right|_{dE\tilde{w}=0} = \frac{w_1 - w_2}{w_2 - w_3} > 0$$
That is, the curves that maintain expected value constant (from now on, iso-expected value curves) are also straight lines with positive slope. The interesting question is: how do the indifference curves and the iso-expected value curves compare to each other? The answer depends entirely upon the concavity of the utility function, $u(w)$. Let's see how.

[Figure: a concave utility function $u(w)$ with the points $a = (w_3, u(w_3))$, $b = (w_2, u(w_2))$ and $c = (w_1, u(w_1))$ marked; $d$ is the point $(w_2, u(w_3))$, and $\theta_1$ and $\theta_2$ are the angles that the chords $ab$ and $bc$ make with the horizontal.]
Figure 3.2 Concave utility function

Figure 3.2 shows a typical concave utility function, along with the three levels of wealth $w_1 > w_2 > w_3$. If we draw the line segments joining point $a$ to point $b$, and point $b$ to point $c$, then due to the concavity of the utility function, the slope of the line joining $a$ to $b$ must be greater than the slope of the line joining $b$ to $c$. That is, $\theta_1 > \theta_2$. We can measure these two slopes using some simple geometry. Consider the triangle formed by the three points $a$, $b$ and $d$. The slope of the line joining $a$ to $b$ (actually, the tangent of the angle $\theta_1$) is equal to the length of the opposite side (the distance from $d$ to $b$) divided by the length of the adjacent side (the distance from $a$ to $d$). But these two distances are, respectively, $u(w_2) - u(w_3)$ and $w_2 - w_3$. Thus, we have $\theta_1 = \frac{u(w_2) - u(w_3)}{w_2 - w_3}$. In exactly the same way,
we have $\theta_2 = \frac{u(w_1) - u(w_2)}{w_1 - w_2}$. And since $\theta_1 > \theta_2$, we get $\frac{u(w_2) - u(w_3)}{w_2 - w_3} > \frac{u(w_1) - u(w_2)}{w_1 - w_2}$, which rearranges directly to $\frac{w_1 - w_2}{w_2 - w_3} > \frac{u(w_1) - u(w_2)}{u(w_2) - u(w_3)}$.
In words, if $u(w)$ is strictly concave, then the iso-expected value lines are steeper than the indifference curves. Such a situation is drawn in Figure 3.3. Clearly, if the utility function is convex rather than concave, then the iso-expected value lines would work out to be less steep than the indifference curves, and if the utility function were linear, then the two sets of curves in the Marschak-Machina graph would coincide exactly.
would coincide exactly.

[Figure: a Marschak-Machina triangle in $(p_1, p_3)$ space showing iso-expected value lines ($E\tilde{w}$ = constant) that are steeper than the expected-utility indifference curves ($Eu(\tilde{w})$ = constant); the text refers to two lotteries $\ell_1$ and $\ell_2$ lying on a common iso-expected value line.]
Figure 3.3 Expected value and expected utility in the Marschak-Machina triangle under concave utility

Since the most reasonable assumption on the utility function is that it is concave (decreasing marginal utility of wealth), from now on we shall assume that this is so, and so we shall be dealing exclusively with situations like that of Figure 3.3.
Now, consider two lotteries with the same expected value, say $\ell_1$ and $\ell_2$ in Figure 3.3. Since $u(w)$ is concave, we get $\ell_1 \succ \ell_2$, that is, $U(\ell_1) > U(\ell_2)$. Apart from the difference in expected utility, the two lotteries also differ as far as their statistical variance is concerned.

Variance is defined as $\mathrm{var}(\ell) = \sigma^2(\ell) = \sum_i p_i (w_i - E\tilde{w})^2$. In fact, it turns out that $\sigma^2(\ell_2) > \sigma^2(\ell_1)$. To see why, it is necessary to consider the derivative of $\sigma^2(\ell)$ with respect to $p_1$ conditional upon $E\tilde{w}$ remaining constant (this is suggested as problem 1 at the end of the chapter). Nevertheless, note that as we increase $p_1$ along a particular iso-expected value line, we need to increase $p_3$ and therefore decrease $p_2$. This corresponds to a displacement of probability weight from the centre of the distribution to the extremes, which implies an increase in variance.
In short, we have reached the following important conclusion: if
the utility function is strictly concave, then an increase in variance
while holding the expected value constant implies a decrease in expected
utility.1 Economists say that such preferences display risk aversion,
since it is normal to associate variance with risk. Thus, concavity of
the utility function is equivalent to risk aversion. Of course, if $u(w)$ were linear, then we would have risk neutral preferences, and if it were convex, we would have a preference for risk (sometimes referred to as risk loving).
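The conclusion can be illustrated with a short sketch (Python, arbitrary prizes and an assumed square-root utility): shifting probability weight from the middle prize to the two extremes along an iso-expected value line raises variance and lowers expected utility:

import math

w = [100.0, 64.0, 16.0]                  # prizes w1 > w2 > w3 (hypothetical values)
u = math.sqrt                            # an assumed concave utility

def stats(p):
    ev = sum(pi * wi for pi, wi in zip(p, w))
    var = sum(pi * (wi - ev) ** 2 for pi, wi in zip(p, w))
    eu = sum(pi * u(wi) for pi, wi in zip(p, w))
    return ev, var, eu

p_before = [0.2, 0.6, 0.2]
# Move 0.1 from p2 to p1 and 0.075 from p2 to p3; the expected value is
# unchanged because 0.1*(w1 - w2) = 0.075*(w2 - w3) = 3.6.
p_after = [0.3, 0.425, 0.275]
print(stats(p_before))   # (61.6, 714.24, 7.6)
print(stats(p_after))    # (61.6, 1016.64, 7.5): same EV, higher variance, lower EU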

Exercise 3.3. Draw in a Marschak-Machina triangle the lotteries corresponding to the Allais paradox, and show how the paradox cannot be consistent with expected utility.
Answer. If we set $w_1 = 25$, $w_2 = 5$ and $w_3 = 0$, the four lotteries of the Allais paradox can be easily located in the Marschak-Machina triangle. This has been done in Figure 3.4. What we should note is that a straight line that connects the two lotteries in the first choice ($\ell_A$ and $\ell_B$) will have exactly the same slope as a straight line that connects the two lotteries in the second choice ($\ell_C$ and $\ell_D$). Concretely, the slope of the two connecting lines is 0.1. However, if the individual making the choices is an expected utility maximiser, then we know that his indifference curves over the entire probability space are straight lines with a common slope, and so if $\ell_A \succ \ell_B$, then these indifference curves must have a slope that is less than 0.1. But then, we would have to conclude that $\ell_C \succ \ell_D$. One possible way to explain the apparent paradox is for the individual to have a preference over lotteries that corresponds to indifference curves that
1 Although it may not be so obvious, it is also true that a decrease in expected value while maintaining variance constant will reduce expected utility. Proving this in the triangle is not easy, since the iso-variance curves are conics.
"fan out", in the sense that they are steeper and steeper as we move upwards and to the west in the triangle. Such preferences cannot correspond to expected utility.

[Figure: the four Allais lotteries plotted in $(p_1, p_3)$ space: A at $(0, 0)$, B at $(0.1, 0.01)$, C at $(0, 0.89)$ and D at $(0.1, 0.9)$; the segments AB and CD are parallel.]
Figure 3.4 Allais paradox in the Marschak-Machina triangle

3.2 The contingent claims environment


In short, expected utility theory asserts that, so long as the individual's preferences satisfy a short list of very reasonable assumptions, then the utility that should be attached to a lottery is nothing more than the mathematical expectation of the utility of each prize. If we restrict our attention (as we will from now on) to lotteries with only two prizes, each of which is an amount of wealth, say prize $w_1$ with probability $1-p$ and $w_2$ with probability $p$, then the utility of this situation is
$$Eu(\tilde{w}) = (1-p)u(w_1) + pu(w_2) \qquad (3.1)$$
Of course, the utility function in question would really be the indirect utility function, and so $u(w)$ denotes the utility of wealth, or the utility of money.
This is a simple case of a utility function that is separable, and this makes it easy to study in the same type of graphical environment that is typically used in undergraduate consumer theory under certainty. The graph in question is often called the contingent claims graph, which assumes probabilities to be fixed and prizes to be variable monetary amounts.
The first important analysis based on variable prizes and fixed probabilities is the model of Nobel Laureates Ken Arrow and Gerard Debreu, where general equilibrium is extended to account for uncertainty. In that model, different states of nature are thought of as different markets for contingencies. The model extends the space of goods by understanding that there is no formal difference between two different goods on the one hand, and the same good at two different locations, or in two different states of nature, on the other. Thus, uncertainty is just an increase in the number of different goods present in a model. The Arrow-Debreu model is known as the contingent claims environment.2
The fundamental idea is simple, even more so in two-dimensional
space. We begin by establishing a set of possible states of nature,
where a state of nature is simply a full and complete description of
all relevant aspects contingent upon a given outcome of a stochastic
process. We also need to establish the probability density over the pos-
sible states of nature. For example, an investor in the stock exchange
knows that the price of his shares may go up (state 1) or go down
(state 2).3 As soon as we establish the probability that the shares will
increase in price, then we have a properly defined contingent claims environment. For the type of problem that we will be interested in here, we shall simply consider an individual's wealth, $w$. We shall assume that there are only two possible states of nature, state 1 and state 2, and that in state $i$ the individual's wealth is $w_i$, $i = 1, 2$. We shall denote the probability of state 2 as $p$, and thus the probability of state 1 is $1-p$. The relevant utility function for this type of problem is the individual's indirect utility, which we shall denote by $u(w)$.
2 The contingent claims model is useful when choices can lead to alterations in the set of prizes. For example, take the case of a person who faces an initial lottery in which he can lose $x$ with a given probability $p$, and lose nothing with probability $1-p$. If he insures half of this loss at a premium of $q$, then he now faces a loss of $0.5x + q$ with probability $p$ and a loss of $q$ with probability $1-p$. Same probabilities, different prizes.
3 Of course, with this example it is possible to define a much richer set of states of nature: the price of one share goes up by 1% and that of another goes down by 2%, and so on.
Continuing from what we have already done in the earlier sections of this book, we shall always assume (unless otherwise stated) that this function is strictly increasing and concave, $u'(w) > 0$ and $u''(w) < 0$.

[Figure: contingent claims space with $w_1$ on the horizontal axis and $w_2$ on the vertical axis; the certainty line $w_1 = w_2$ passes through the endowment point $(w_0, w_0)$, and the point $(w_0 - q + x, w_0 - q)$ lies below the certainty line.]
Figure 3.5 Contingent claims space

In a two-dimensional graph, we can represent the individual's wealth in each state of nature on the axes (see Figure 3.5). The initial endowment of the individual is often referred to as the initial risk allocation. It is customary that, when initially $w_1 \neq w_2$, we define the state of nature with less wealth to be state 2, that is, we would define our states such that $w_2 < w_1$. The diagonal line passing through the origin of the graph (the line with slope equal to 1) is known as the certainty line, since along it are all the vectors of wealth such that $w_1 = w_2$. Clearly, independently of the probabilities of receiving the two different wealth levels, if they are both equal then final wealth is known with probability 1 (i.e., with certainty) as $w = w_1 = w_2$.
As an example, in Figure 3.5 two situations are shown. On the one
hand, we show the case of an individual with certain wealth of w0 ,
and on the other hand the situation of an individual with a certain
wealth of $w_0$ plus a lottery ticket that costs $q$ to purchase and that pays a prize of $x > 0$ with probability $1-p$ and a prize of 0 with probability $p$. We assume that $q < x$ so that the lottery ticket is a logically valid option to consider. The wealth vector contingent upon the outcome of the lottery is $(w_1, w_2) = (w_0 - q + x, w_0 - q)$. Since
q < x, even though the risk distribution before purchasing the lottery
is on the certainty line, the distribution achieved after purchasing it
lies beneath the certainty line. The important point to note about the
contingent claims environment is simply that the individual will only
actually receive the wealth indicated on one of the axes, that is, the
wealth level of only one of the components of the vector w, rather
than both components as is the case in traditional two-dimensional
consumer theory under certainty.
To begin with, let us reconsider the expected value and variance of
any particular point in contingent claims space. By definition, where $E$ represents the expectations operator, the expected value of a point $w$ is
$$E\tilde{w} = pw_2 + (1-p)w_1 \qquad (3.2)$$

Clearly, this equation presents a structure that is identical to a budget constraint in traditional consumer theory, but where now instead of prices we have probabilities. Using this comparison (or, if you like, just using the implicit function theorem) it is immediate that, in contingent claims space, the slope of a line that holds expected value constant is just
$$\left.\frac{dw_2}{dw_1}\right|_{dE\tilde{w}=0} = -\frac{(1-p)}{p} < 0$$

So an iso-expected value line in this space is a straight line with negative slope. Any particular one of them (different lines corresponding to different levels of expected value) will cut through the certainty line at only one point. Let us identify this point as $w_1 = w_2 = w$, and then we have $E\tilde{w} = pw + (1-p)w = w$. This tells us that the further an iso-expected value line lies from the origin of the graph, the greater is the expected value it represents.
Second, the variance of a point $w$ is defined by
$$\sigma^2(\tilde{w}) = p(w_2 - E\tilde{w})^2 + (1-p)(w_1 - E\tilde{w})^2$$
Using (3.2), the variance $\sigma^2(\tilde{w})$ is equal to
$$p(w_2 - pw_2 - (1-p)w_1)^2 + (1-p)(w_1 - pw_2 - (1-p)w_1)^2$$
$$= p((1-p)(w_2 - w_1))^2 + (1-p)(p(w_1 - w_2))^2$$
$$= p(1-p)^2(w_2 - w_1)^2 + (1-p)p^2(w_1 - w_2)^2$$
And since $(w_2 - w_1)^2 = (w_1 - w_2)^2$, it turns out that
$$\sigma^2(\tilde{w}) = p(1-p)\left[(1-p)(w_1 - w_2)^2 + p(w_1 - w_2)^2\right] = p(1-p)(w_1 - w_2)^2$$
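A direct numerical check of this simplification, with arbitrary values for $p$, $w_1$ and $w_2$:

p, w1, w2 = 0.3, 120.0, 80.0                              # hypothetical values
ew = p * w2 + (1 - p) * w1                                # expected value, equation (3.2)
var_direct = p * (w2 - ew) ** 2 + (1 - p) * (w1 - ew) ** 2
var_formula = p * (1 - p) * (w1 - w2) ** 2
print(var_direct, var_formula)                            # both equal 336.0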

Therefore, the iso-variance curves in contingent claims space are also straight lines, but with slope equal to 1, since directly from the implicit function theorem we can calculate that
$$\left.\frac{dw_2}{dw_1}\right|_{d\sigma^2=0} = \frac{2p(1-p)(w_1 - w_2)}{2p(1-p)(w_1 - w_2)} = 1$$

The certainty line is a particular example of an iso-variance line. It is the line along which variance is equal to 0. Iso-variance lines that lie further away from the certainty line, in either direction, indicate a greater variance, since they correspond to a greater value of $(w_1 - w_2)^2$, as is shown in Figure 3.6.
Now we can consider preferences. As we have already argued, preferences in the model are given by expected utility. Therefore, the utility of a vector $w$ is $Eu(\tilde{w}) = pu(w_2) + (1-p)u(w_1)$. An indifference curve maintains expected utility constant, $dEu(\tilde{w}) = 0$, and so again we only need to apply the implicit function theorem to see that
$$\left.\frac{dw_2}{dw_1}\right|_{dEu(\tilde{w})=0} = -\frac{(1-p)u'(w_1)}{pu'(w_2)} \equiv MRS(w) \qquad (3.3)$$

Here, $MRS(w)$ indicates the marginal rate of substitution at the point $w$.
Since we are assuming that the utility function is increasing and concave, it is not difficult to prove that expected utility is concave in the vector $(w_1, w_2)$ (see problem 5 at the end of the chapter). So, for any two points in contingent claims space, $w^1$ and $w^2$, and for any $\lambda$ that satisfies $0 < \lambda < 1$, it is true that
$$Eu(\lambda w^1 + (1-\lambda)w^2) > \lambda Eu(w^1) + (1-\lambda)Eu(w^2)$$
[Figure: contingent claims space showing the certainty line ($\sigma^2 = 0$) and parallel iso-variance lines of slope 1 on either side of it, labelled $\sigma^2 = \sigma_1^2$ and $\sigma^2 = \sigma_2^2$ with $\sigma_2^2 > \sigma_1^2$, together with downward-sloping iso-expected value lines.]
Figure 3.6 Expected value and variance lines in the contingent claims graph

Given that, we know that the corresponding indifference curves are strictly convex.
Finally, note that since the indifference curves have negative slope, each one must cut the certainty line at exactly one point. If we denote this point by $w_1 = w_2 = \hat{w}$, then we have
$$Eu(\tilde{w}) = pu(\hat{w}) + (1-p)u(\hat{w}) = u(\hat{w})$$
Since we have assumed $u'(w) > 0$, it is now evident that indifference curves that are further from the origin indicate more preferred vectors, since they are consistent with a greater level of expected utility.
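The following sketch (Python, with an assumed utility $u(w) = \sqrt{w}$ and a hypothetical probability $p$) verifies two of the claims just made: on the certainty line the MRS of (3.3) equals $-(1-p)/p$, and below the certainty line (where $w_1 > w_2$) the indifference curve is flatter, which is consistent with convexity:

import math

p = 0.25
u_prime = lambda w: 0.5 / math.sqrt(w)      # marginal utility of the assumed u(w) = sqrt(w)

def mrs(w1, w2):                            # equation (3.3)
    return -((1 - p) * u_prime(w1)) / (p * u_prime(w2))

print(mrs(50.0, 50.0), -(1 - p) / p)        # both -3.0 on the certainty line
print(mrs(80.0, 20.0))                      # -1.5: flatter below the certainty line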
If we draw some indifference curves corresponding to a strictly concave utility function together with iso-expected value and iso-variance lines, then it becomes immediate that the individual displays what is known as risk aversion (see Figure 3.7). Risk is taken as being analogous to variance, and so risk aversion is the characteristic that leads individuals to dislike variance at any given expected value. First, note that from the equation for the marginal rate of substitution (3.3), the slope of an indifference curve at the point at which it cuts
[Figure: convex expected-utility indifference curves in contingent claims space, together with the certainty line $w_1 = w_2$, an iso-expected value line of slope $-(1-p)/p$ and an iso-variance line with $\sigma^2 > 0$.]
Figure 3.7 Expected utility indifference curves with risk averse preferences

the certainty line is equal to $-\frac{(1-p)}{p}$, which is the same slope as an iso-expected value line. Therefore, we can directly deduce that the unique solution to the problem of choosing freely from all lotteries with an expected value that is less than or equal to some particular amount, say $w$, is the lottery that gives an expected value of exactly $w$ but with zero variance: our decision maker is clearly showing a dislike for variance, that is, he is risk averse. In order to see this in another way, consider a movement along an iso-expected value line towards lotteries of ever greater variance (i.e., movements away from the certainty line). When the indifference curves are convex, each such successive movement implies moving to a lower indifference curve, which again directly implies risk aversion as defined above. In the contingent claims environment, it is also immediate to see that an increase in expected value that holds variance constant will always increase expected utility.

Exercise 3.4. Consider your own preferences for simple lotteries. Imagine you are offered the choice of a coin-toss where the outcome is "win a dollar on heads, lose a dollar on
tails". Would you voluntarily accept this lottery? How about "win two dollars on heads, lose a dollar on tails"? Try to answer the following question honestly. You are offered the chance to voluntarily play a lottery in which on heads you win $x$ dollars, and on tails you lose one dollar. What is the smallest value of $x$ for which you would play this lottery? What is the expected value of the lottery, and what is its variance? Think about what your answers imply for your own preference towards risk.

Answer. This question relates to your own personal preferences, so there is no one correct answer here. Different people will answer the question differently. However, most people would not voluntarily accept the coin-toss lottery that pays one dollar on heads and that costs one dollar on tails. If that is true for you, then your preferences display risk aversion, at least for this small-stakes range of wealth. If I am asked about a coin-toss lottery in which I lose a dollar on tails and gain $x$ dollars on heads, I would set my minimum value of $x$ at something around $1.50. The expected value of the lottery is $0.5x + 0.5(-1) = 0.75 - 0.5 = 0.25$. The variance is $0.5 \times 0.5 \times (1.5 + 1)^2 = \frac{2.5^2}{4} = 1.5625$. My preferences display risk aversion since, in order to be indifferent between playing (having a variance of 1.5625) and not playing (having a variance of 0), I require a strictly positive gain in expected value. Graphically, the minimally acceptable lottery is located below the certainty line, and above the expected value line of not playing. Since my indifference curve for not playing cuts the certainty line at the same place as the expected value line for not playing, it must be a convex curve in order to also go through the lottery point (recall, I am indifferent whether I play or not).

Exercise 3.5. A classic problem in the economics of risk is the choice of the split of initial risk-free wealth between an asset with a risk-free return and one with a risky return. Each dollar invested in the risk-free asset yields, say, $(1+t)$ dollars for sure, while each dollar invested in the risky asset yields, say, $(1+r)$ dollars with probability $(1-p)$ and $(1-r)$ dollars with probability $p$. Assume that $r$ and $t$ are both positive numbers, and that the investor can split his money, putting some in the risk-free asset and the rest in the risky asset. Assume that the risky asset has a higher expected return than the risk-free
asset. Can the risk-free asset ever dominate the risky one, in the
sense that the investor would invest only in the risk-free asset
and not in the risky asset at all?

Answer. Whether or not the risky asset will be purchased at all depends upon the relationship between the expected value of the risky asset and that of the risk-free asset. The risky asset will only be included in the optimal portfolio if it has a strictly greater expected value than the risk-free asset. This happens if $(1-p)(1+r) + p(1-r) > 1 + t$. When this happens, it is always optimal to include some of the risky asset in the optimal portfolio, regardless of risk aversion. It may also happen that the risky asset is the only one in the optimal portfolio (a corner solution). The problem is relatively simple to solve graphically (see Figure 3.8).

[Figure: contingent claims space with the certainty line $w_1 = w_2$; the point $(w_0(1+t), w_0(1+t))$ represents all wealth in the risk-free asset and the point $(w_0(1+r), w_0(1-r))$ all wealth in the risky asset; the market opportunities line joining them is flatter than the iso-expected value line of slope $-(1-p)/p$, and the tangency with an indifference curve lies below the certainty line.]
Figure 3.8 Optimal choice between a risky and a risk-free asset
Since the expected return on the risky asset is greater than the
expected return on the risk-free asset, we know that the point
corresponding to all wealth invested in the risky asset, which lies
below the certainty line, must lie above the expected value line
passing through the point on the certainty line corresponding to
all wealth invested in the risk-free asset. Thus, the straight line
joining these two points (the line showing all possible investment
opportunities as wealth is spread over the two investments) is
less steep (flatter) than the expected value line of the risk-free investment. But since the slope of the risk-free expected value line is simply the ratio of state-contingent probabilities, we also know that the indifference curve at the risk-free investment is steeper than the market opportunities line. Thus the tangency between the market opportunities line and the indifference curve
must occur below the certainty line, that is, some money is
always invested in the risky asset. Curiously, the result that
some risk will always be purchased is independent of exactly
how risk averse the individual is, and how slight might be the
expected value advantage of the risky asset.
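A numerical version of this exercise, under assumed parameter values and an assumed logarithmic utility (none of these numbers come from the text), confirms the graphical conclusion that some money is always placed in the risky asset when its expected return is higher:

import math

w0, t, r, p = 100.0, 0.02, 0.5, 0.45      # risky expected return 1.05 > 1.02 (assumed values)
u = math.log                              # an assumed concave utility

def expected_utility(a):                  # a = share of wealth placed in the risky asset
    good = w0 * ((1 - a) * (1 + t) + a * (1 + r))
    bad = w0 * ((1 - a) * (1 + t) + a * (1 - r))
    return (1 - p) * u(good) + p * u(bad)

best_eu, best_a = max((expected_utility(k / 1000), k / 1000) for k in range(0, 1001))
print(best_a)                             # roughly 0.12: strictly positive, an interior optimum here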

3.3 Measures of risk aversion


Now that the concept of risk aversion has been formally introduced, it makes sense to analyse it in greater detail. One of the most interesting questions about risk aversion is whether or not we can characterise different individuals according to their risk aversion, that is, whether any two individuals can be ranked, or ordered, according to who is more risk averse. In order to do this, consider Figure 3.9, in which we represent the situation of an individual with an initial situation of pure risk-free wealth of $w_0$. The initial wealth endowment is given by the point $w^0 = (w_1^0, w_2^0) = (w_0, w_0)$. The indifference curve that passes through this point cuts the contingent claims space into two separate parts: the points that lie below the endowed indifference curve (lotteries that are less preferred to the endowment point $w^0$, i.e. $\{w : w^0 \succ w\}$) and the points that are on or above the endowed indifference curve (lotteries that are at least as preferred as $w^0$, i.e. $\{w : w \succeq w^0\}$). We shall refer to the set $A(w^0) = \{w : w \succeq w^0\}$ as the acceptance set, since it indicates all the lotteries that the individual would accept, voluntarily, in exchange for his endowment.
[Figure: contingent claims space with the certainty line $w_1 = w_2$; the acceptance set $A(w^0)$ is the region on or above the indifference curve through the endowment point $(w_0, w_0)$.]
Figure 3.9 An acceptance set

Absolute risk aversion


Now, consider two individuals who are identical in all but their utility function. In particular, the two individuals share the same endowment point, and the same probabilities of the two states of nature. We have just seen that, independently of the particular utility function, the slope of an indifference curve as it crosses the certainty line is always equal to $-\frac{(1-p)}{p}$, and so the frontiers of the two acceptance sets are necessarily tangent to each other at the common endowment point $w^0$. Assume now that one of the acceptance sets is a sub-set of the other, say $A_i(w^0) \subset A_j(w^0)$. Then all of the lotteries that are acceptable to individual $i$ are also acceptable to individual $j$, but the opposite is not true. There exist lotteries that are accepted by $j$ but that are rejected by $i$ in a proposed exchange for the endowment point $w^0$. Concretely, for any particular expected value, the lotteries that are acceptable to $j$ but that are rejected by $i$ are those with the greatest variance within $j$'s acceptance set. They are the lotteries that correspond to greater risk. In this case, it is natural to say that $i$ is, locally (i.e., around $w^0$), more risk averse than $j$.
[Figure: contingent claims space with the certainty line $w_1 = w_2$; the indifference curves of individuals $i$ and $j$ through the common endowment point $(w_0, w_0)$ are tangent there, with $i$'s curve the more convex of the two.]
Figure 3.10 Greater risk aversion

Graphically, if individual $i$ is more risk averse than individual $j$, then the indifference curve of $i$ that passes through the endowment point will be, at least locally, more convex than the indifference curve of $j$ passing through the same point (Figure 3.10). Let's formalise this idea.
To begin with, from equation (3.3), the first derivative of an indifference curve at any point is
$$\left.\frac{dw_2}{dw_1}\right|_{dEu(\tilde{w})=0} = -\frac{(1-p)u'(w_1)}{pu'(w_2)}$$

Differentiating a second time with respect to $w_1$, we find
$$\left.\frac{d^2w_2}{d(w_1)^2}\right|_{dEu(\tilde{w})=0} = -\left(\frac{1-p}{p}\right)\frac{u''(w_1)u'(w_2) - u'(w_1)u''(w_2)\frac{dw_2}{dw_1}}{\left[u'(w_2)\right]^2}$$
Substituting in the point $w_2 = w_1 = w_0$ yields
$$\left.\frac{d^2w_2}{d(w_1)^2}\right|_{dEu(\tilde{w})=0} = -\left(\frac{1-p}{p}\right)\frac{u''(w_0)u'(w_0) - u'(w_0)u''(w_0)\left(-\frac{(1-p)}{p}\right)}{\left[u'(w_0)\right]^2}$$
$$= -\left(\frac{1-p}{p}\right)\frac{u''(w_0)\left(1 + \frac{1-p}{p}\right)}{u'(w_0)} = -\frac{u''(w_0)}{u'(w_0)}f(p) \equiv R^a(w_0)f(p)$$
where $f(p) = \frac{1-p}{p^2}$.


The important point to note is that, since the two individuals share the same probability $p$, if their endowed indifference curves have different second derivatives at the endowment point, then this difference is due entirely to the term $R^a(w_0)$. Given our assumptions that $u'(w) > 0$ and $u''(w) < 0$, it turns out that $R^a(w_0)$ is positive. $R^a(w_0)$ is known in the economics literature as the Arrow-Pratt measure of absolute risk aversion, and if $R_i^a(w_0) > R_j^a(w_0)$ then individual $i$ is more risk averse than individual $j$.
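The coefficient $R^a(w) = -u''(w)/u'(w)$ is easy to evaluate numerically. A small sketch with two assumed utility functions (logarithm and square root, chosen only for illustration):

import math

def R_a(u, w, h=1e-3):
    # Numerical Arrow-Pratt coefficient -u''(w)/u'(w) via central differences.
    u1 = (u(w + h) - u(w - h)) / (2 * h)
    u2 = (u(w + h) - 2 * u(w) + u(w - h)) / h ** 2
    return -u2 / u1

w0 = 50.0
print(R_a(math.log, w0))    # about 0.02 = 1/w: the more risk averse of the two at w0
print(R_a(math.sqrt, w0))   # about 0.01 = 1/(2w)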
We can use the measure of absolute risk aversion to point out an
important aspect of expected utility that was not mentioned earlier.
Clearly, if two utility functions, $u_i(w)$ and $u_j(w)$, are to represent the same preferences in an uncertain or risky environment, then they must share exactly the same set of indifference curves in contingent claims space. But this can happen only if both functions have the same measure of absolute risk aversion at any given wealth $w$, that is, we require $R_i^a(w) = R_j^a(w)$ for all $w$.
Now, traditional consumer theory under certainty teaches us that
a utility function is an ordinal concept, that is, it is only useful for
ordering consumption bundles from the least to the most preferred.
If, in a certainty environment, we have $w_i \succ w_j$, then in principle we can use a utility function that returns $u(w_i) = 4$ and $u(w_j) = 2$, or another that returns $u(w_i) = 37$ and $u(w_j) = 9.6$. The only important point is that $u(w_i) > u(w_j)$, and not the difference between the two utility values, $u(w_i) - u(w_j)$. In a model of choice under certainty, we say that if a utility function $u(w)$ represents preferences $\succeq$ in the sense that $u(w_i) \geq u(w_j)$ if and only if $w_i \succeq w_j$, then any $f(u(w))$
with $f'(u) > 0$ will also represent the same preferences. A composite function of the form $z(w) \equiv f(u(w))$ with $f'(u) > 0$ is known as a positive monotone transformation of $u(w)$.
Let's now go back to our uncertain environment (just for now, let us consider an $n$-state world, rather than a strictly 2-state world). If two utility functions for wealth $u_i(w)$ and $u_j(w)$ are to represent the same preferences over lotteries, then it must be true that the two functions always give the same ordering over lotteries, or in other words, that the two expected utilities are related by a positive monotone transformation:
$$\sum_{k=1}^n p_k u_i(w_k) = H\left(\sum_{k=1}^n p_k u_j(w_k)\right) \quad \text{with } H'(\cdot) > 0$$

Differentiating with respect to (any) $w_k$, we have
$$u_i'(w_k) = H'(\cdot)u_j'(w_k) \quad \forall w_k$$
and so
$$H'(\cdot) = \frac{u_i'(w_k)}{u_j'(w_k)} \quad \forall w_k \qquad (3.4)$$
Differentiating (3.4) yields
$$H''(\cdot) = \frac{u_i''(w_k)u_j'(w_k) - u_i'(w_k)u_j''(w_k)}{\left[u_j'(w_k)\right]^2} \quad \forall w_k$$

But recall that if the two functions are to represent the same prefer-
ences over lotteries, then it must hold that Ria (wk ) = Rja (wk ) for all
wk , that is
ui (wk ) uj (wk )
= ui (wk )uj (wk ) = ui (wk )uj (wk ) wk
ui (wk ) uj (wk )
and so clearly it would have to hold that

H  () = 0 wk

In words, if the two utility functions for wealth are to represent the same preferences for lotteries, then we can admit only functions H(·) relating the implied expected utilities that are linear. Therefore

Σ_(k=1)^n pk ui(wk) = H( Σ_(k=1)^n pk uj(wk) ) = a Σ_(k=1)^n pk uj(wk) + b

where a > 0 from (3.4). Now, since b = Σ_(k=1)^n pk b, we have

Σ_(k=1)^n pk ui(wk) = a Σ_(k=1)^n pk uj(wk) + Σ_(k=1)^n pk b = Σ_(k=1)^n pk (a uj(wk) + b)

that is

ui(w) = a uj(w) + b   with a > 0

Again, in words, if ui(w) and uj(w) represent the same preferences in a problem of choice under risk or uncertainty, then they must be related linearly. This implies that the incorporation of the dimension of uncertainty to a choice model reduces the set of admissible utility functions by reducing the generality of the type of transformation that can be used. Instead of any positive monotone transformation, we are now restricted to those that are linear. This important difference between utility representations in problems of choice under certainty and under uncertainty has led to the "uncertainty" utility function becoming known as a von Neumann-Morgenstern utility function, named after the economists who first formally proved the validity of expected utility theory.
In short, if we assume two individuals i and j with different utility functions in the sense that there are no two numbers a > 0 and b such that ui(w) = a uj(w) + b, then this difference implies that R_i^a(w) ≠ R_j^a(w). In that case we can name our individuals such that R_i^a(w) > R_j^a(w), that is, individual i is (locally in the neighbourhood of a level of wealth w) more risk averse than individual j.

Exercise 3.6. Prove that if f(u) is a strictly increasing and concave function (f'(u) > 0 and f''(u) < 0), then the utility function v(w) ≡ f(u(w)) is more risk averse than the utility function u(w).

Answer. The first derivative of v(w) = f(u(w)) with respect to w is v'(w) = f'(u)u'(w). Differentiating again with respect to w yields v''(w) = f''(u)u'(w)² + f'(u)u''(w). Thus, by construction the Arrow-Pratt measure of absolute risk aversion for utility function v(w) is

R_v^a(w) = -[f''(u)u'(w)² + f'(u)u''(w)] / [f'(u)u'(w)]

which simplifies to -[f''(u)/f'(u)]u'(w) - u''(w)/u'(w), or R_f^a(u)u'(w) + R_u^a(w). Since R_f^a(u)u'(w) > 0 it turns out that R_v^a(w) > R_u^a(w) for all w. So indeed v(w) is more risk averse than u(w).
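The reader who prefers a computational check can verify this result numerically. The following short Python sketch is entirely illustrative: it assumes u(w) = ln(w) and the increasing, concave transform f(u) = 1 - exp(-u), so that v(w) = 1 - 1/w, and estimates both Arrow-Pratt coefficients by finite differences.

import math

def arrow_pratt(util, w, h=1e-4):
    # absolute risk aversion -u''(w)/u'(w) by central finite differences
    first = (util(w + h) - util(w - h)) / (2 * h)
    second = (util(w + h) - 2 * util(w) + util(w - h)) / h**2
    return -second / first

u = math.log                                  # u(w) = ln(w), so R_u^a(w) = 1/w
v = lambda w: 1.0 - math.exp(-math.log(w))    # v(w) = f(u(w)) = 1 - 1/w, so R_v^a(w) = 2/w

for w in (0.5, 1.0, 2.0, 10.0):
    print(w, round(arrow_pratt(u, w), 4), round(arrow_pratt(v, w), 4))
# The second column (for v) is double the first (for u) at every w,
# so the concave transform has indeed increased risk aversion everywhere.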

Note that R^a(w) is a properly defined function that returns a value for any given scalar⁴ w, since we could have used any particular point on the certainty line as our endowment in the above argument. Shortly we shall discuss the derivatives of R^a(w), which are of immense importance to the economics of risk and uncertainty.

⁴ Earlier we used w to indicate a wealth vector, and now it is being used to indicate a scalar. From the context of the analysis it should always be clear what the exact dimensionality of w is being assumed.

Relative risk aversion

The word "absolute" in the name of R^a(w) is due to the fact that the lotteries used to find it are absolute, that is, they are lotteries whose prizes w1 and w2 are absolute quantities of money. There exists a second type of lottery, denominated as relative lotteries, whose prizes are expressed in relative terms to the initial situation. For example, the lottery defined by r̃ = (r1, r2, 1-p, p) is a relative lottery if the prizes are ri·w for i = 1, 2 and for any particular initial w.
In the space of the ri we can represent the indifference curves for relative lotteries, and these curves are closely related to those of absolute lotteries. To see this, note that the expected utility of a relative lottery is

Eu(r̃w) = p u(r2 w) + (1-p) u(r1 w)

By the implicit function theorem, we have

dr2/dr1 |_(dEu=0) = -(1-p)u'(r1 w)w / [p u'(r2 w)w] = -(1-p)u'(r1 w) / [p u'(r2 w)]

For any relative lottery that offers certainty (that is, r1 = r2), we get the result that the slope of the indifference curve is equal to -(1-p)/p, just as in the case of absolute lotteries. The second derivative of an indifference curve in the space of relative lotteries is

d²r2/d(r1)² |_(dEu=0) = -[(1-p)/p] · [w u''(r1 w)u'(r2 w) - u'(r1 w)w u''(r2 w)(dr2/dr1)] / [u'(r2 w)]²

Now, at any lottery such that r1 = r2 = r, we get

d²r2/d(r1)² |_(dEu=0) = -[(1-p)/p] · [w u''(rw)u'(rw) - u'(rw)w u''(rw)(-(1-p)/p)] / [u'(rw)]²
= -[(1-p)/p] · [1 + (1-p)/p] · w u''(rw)u'(rw) / [u'(rw)]²
= -[w u''(rw)/u'(rw)] f(p)
≡ R^r(w) f(p)

Note that when r = 1, we have R^r(w) = wR^a(w). But if we assume (as before) that our individual starts off with an initial wealth w that is risk-free, then the relevant point in the space of relative lotteries to represent such an endowment is exactly the certainty lottery with r = 1, and so this is the lottery that we should use to define the measure of risk aversion in the case of relative lotteries. For this reason, the Arrow-Pratt measure of relative risk aversion is defined as R^r(w) = -w u''(w)/u'(w) = wR^a(w). If an individual displays a greater value of relative risk aversion than another, then the former is more risk averse over relative lotteries than the latter.
The Arrow-Pratt measure of relative risk aversion shows up in a great many situations in microeconomic analysis, both in models of risk and uncertainty and in models of certainty. This is due to a simple fact, which can be noted by re-writing the measure of relative risk aversion in a slightly different way

R^r(w) = -w u''(w)/u'(w) = -[du'(w)/dw]·w / u'(w) = -[du'(w)/dw] / [u'(w)/w]

So the measure of relative risk aversion is nothing more than the (negative of the) elasticity of marginal utility with respect to wealth.

Risk premium

Let's go back to absolute lotteries. In what we have done above, we always began with a situation of certainty, that is, our endowment points were risk-free. Now let's consider what can be done when we start off from a wealth distribution that involves risk; concretely, we shall assume an endowment characterised by w1 > w2. In the same manner as previously, the indifference curve that passes through the endowment point defines the lower frontier of the acceptance set. This indifference curve cuts the certainty line at a point of wealth equal to, say, w* in either state. w* satisfies

u(w*) = p u(w2) + (1-p) u(w1)   (3.5)

and it is known as the certainty equivalent wealth.

Exercise 3.7. What is the certainty equivalent wealth for an individual with the lottery of the Saint Petersburg paradox, assuming that his utility function is u(w) = ln(w) and that his initial wealth (before the lottery prize is added) is the risk-free quantity 0?

Answer. When the utility function is the logarithmic function ln(w), we know that the expected utility of the St. Petersburg paradox lottery is just ln(2). But the St. Petersburg paradox question
is posed as if the bettor had no wealth other than what is
obtained via the lottery. Thus, the certainty equivalent wealth
for the lottery, under the assumption that the bettor has 0
wealth outside of the lottery, is the wealth of 2. Curiously then,
when Bernoulli posed his solution to the paradox, he anticipated
the concept of certainty equivalent wealth, but not the concept
of willingness-to-pay for participating in the lottery.
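As a quick computational companion to this answer, the following Python lines reproduce the numbers. They assume the version of the gamble in which the prize is 2^(k-1) when the first head appears on toss k, which is the convention consistent with the expected utility of ln(2) quoted above.

import math

# expected log utility of the St. Petersburg gamble, truncated at 200 tosses
expected_utility = sum(0.5**k * math.log(2**(k - 1)) for k in range(1, 201))
certainty_equivalent = math.exp(expected_utility)   # since u = ln, w* = exp(Eu)

print(round(expected_utility, 6), round(math.log(2), 6))   # both about 0.693147
print(round(certainty_equivalent, 6))                      # about 2.0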

Since the indifference curve is strictly convex, it is always true that E w̃ = w̄ > w*. Indeed, the difference between the expected value and the certainty equivalent, π ≡ w̄ - w*, gives us a second way to measure risk aversion. Clearly, π = 0 is possible only if the indifference curve is linear (it coincides with the iso-expected value line), that is, risk aversion is zero. Also, given an initial lottery, the more convex is the indifference curve (the greater is risk aversion), the lower will be w*, and so the greater will be π. The variable⁵ π is known as the risk premium.

⁵ Actually, rather than being a normal variable, π is a function. In principle, it changes with any of the system's parameters.

Exercise 3.8. Assume a strictly risk averse decision maker with a risky endowment such that his wealth is w1 with probability 1-p and w2 with probability p. Assume that w1 > w2. Write the equation that implicitly defines the risk premium as a function of w1, w2 and p. Use your equation to work out the value of the risk premium for the extreme points p = 0 and p = 1. Use this information to sketch a graph of how you think that the risk premium should look as a function of p. Now confirm mathematically whether or not the risk premium is convex or concave in p. Find the equation that characterises the turning point of the risk premium as a function of p. Draw a graph, with wealth on the horizontal axis and utility on the vertical, with a construction that indicates exactly this level of the risk premium.

Answer. The equation that defines the risk premium (π) is

(1-p)u(w1) + p u(w2) = u(E w̃ - π)

where of course E w̃ = (1-p)w1 + p w2. When p = 0, the above equation would read u(w1) = u(w1 - π), and this just says that with p = 0 we have π = 0. Likewise, with p = 1 the equation reads u(w2) = u(w2 - π), which again implies π = 0. Since for any other p (i.e., for 0 < p < 1) we have π > 0 due to risk aversion, you should sketch a graph that shows π as a concave function on the support [0,1], taking the value 0 at the two endpoints and taking positive values at all intermediate points. To confirm concavity of π in p we need to derive with respect to p the equation that defines the risk premium. Deriving once, we get

-u(w1) + u(w2) = u'(E w̃ - π) · ∂(E w̃ - π)/∂p = u'(E w̃ - π) · (∂E w̃/∂p - ∂π/∂p)


Since ∂E w̃/∂p = -w1 + w2, this is just

-u(w1) + u(w2) = u'(E w̃ - π) · (-w1 + w2 - ∂π/∂p)

Deriving a second time we get

0 = u''(E w̃ - π) · (-w1 + w2 - ∂π/∂p)² - u'(E w̃ - π) · ∂²π/∂p²

This says that

∂²π/∂p² = u''(E w̃ - π)(-w1 + w2 - ∂π/∂p)² / u'(E w̃ - π) < 0

So indeed the risk premium is strictly concave in p.



The turning point is that at which ∂π/∂p = 0. From the equation for the first derivative, write

-u(w1) + u(w2) = u'(E w̃ - π)(-w1 + w2) - u'(E w̃ - π) ∂π/∂p

u'(E w̃ - π) ∂π/∂p = u'(E w̃ - π)(-w1 + w2) + u(w1) - u(w2)

∂π/∂p = [u'(E w̃ - π)(-w1 + w2) + u(w1) - u(w2)] / u'(E w̃ - π)

So it turns out that we get ∂π/∂p = 0 at the point where u'(E w̃ - π)(-w1 + w2) + u(w1) - u(w2) = 0. This point is better identified as the point such that

u'(E w̃ - π) = [u(w2) - u(w1)] / (w2 - w1)

Multiply the right-hand-side by -1 both in the numerator and the denominator, so that this reads u'(E w̃ - π) = [u(w1) - u(w2)] / (w1 - w2). The point in question is identified in Figure 3.11. It is the point at which the slope of the utility function is equal to that of the line joining the two points corresponding to the two options of wealth.

[Figure 3.11 Graphical construction of the maximum level of risk premium: utility u(w) against wealth w, with w2, E w̃ and w1 marked on the horizontal axis; at the maximum, the slope u'(E w̃ - π) equals the slope of the chord joining (w2, u(w2)) and (w1, u(w1)).]
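A short numerical sketch can make the shape of π(p) concrete. The example below is not from the text: it assumes u(w) = ln(w), w1 = 10 and w2 = 2, computes π(p) = E w̃ - u^(-1)(Eu(w̃)) on a grid of probabilities, and displays the hump shape with π(0) = π(1) = 0 and an interior maximum.

import math

w1, w2 = 10.0, 2.0     # illustrative prizes with w1 > w2

def risk_premium(p):
    expected_wealth = (1 - p) * w1 + p * w2
    expected_utility = (1 - p) * math.log(w1) + p * math.log(w2)
    certainty_equivalent = math.exp(expected_utility)   # inverse of ln
    return expected_wealth - certainty_equivalent

for p in [i / 10 for i in range(11)]:
    print(round(p, 1), round(risk_premium(p), 4))
# pi is zero at p = 0 and p = 1 and strictly positive in between, rising and
# then falling exactly as the concavity argument above predicts.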

Arrow-Pratt approximation

A logical thing to think about is exactly how the risk premium relates to the Arrow-Pratt measure of absolute risk aversion. To begin with, note that from the definition of the certainty equivalent, and from the definition of the risk premium, we can write

p u(w2) + (1-p) u(w1) = Eu(w̃) = u(w*) = u(w̄ - π)

Thus, the risk premium is the maximum amount of wealth that the individual would be willing to pay to substitute his lottery for the one with no risk at all but with the same expected value. It is now useful to split the initial lottery into two parts:⁶ the risk-free part, w0, and a risky part which we denote by the random variable x̃, whose expected value is E x̃ = x̄, multiplied by a constant, k. In this way, the endowed expected utility is

p u(w0 + k x2) + (1-p) u(w0 + k x1) = Eu(w0 + k x̃)

⁶ There is no loss of generality here, since we can always have w0 = 0.

We can now study risk-free situations by simply using k = 0. Since the expected value of the initial situation is E(w0 + k x̃) = w0 + kE x̃ = w0 + k x̄, the risk premium corresponding to the initial situation is defined by π such that

Eu(w0 + k x̃) = u(w0 + k x̄ - π)   (3.6)

We shall study the behaviour of π as k changes, maintaining the rest of the parameters constant, and so it is easier if we write the risk premium as π = π(k). Above all, we are interested in the function π(k) around the point k = 0, that is, we are interested in small risks, in order to relate the risk premium with the absolute risk aversion function defined above, which you should recall is also defined around a point of certainty.
First, we take a second-order Taylor's expansion of π(k) around the point k = 0:

π(k) ≈ π(0) + k π'(0) + (k²/2) π''(0)   (3.7)
In this equation, we are going to substitute for the values of π(0), π'(0) and π''(0). We begin by noting that, if we set k = 0 in (3.6), then we directly obtain the result π(0) = 0. Second, derive (3.6) with respect to k to obtain

E[x̃ u'(w0 + k x̃)] = (x̄ - π'(k)) u'(w0 + k x̄ - π(k))

When k = 0, we get u'(w0)x̄ = (x̄ - π'(0))u'(w0 - π(0)) = (x̄ - π'(0))u'(w0), where we have used the fact that π(0) = 0. But then we must have x̄ = x̄ - π'(0), and so π'(0) = 0. Deriving (3.6) a second time with respect to k yields

E[x̃² u''(w0 + k x̃)] = (x̄ - π'(k))² u''(w0 + k x̄ - π(k)) - π''(k) u'(w0 + k x̄ - π(k))

But since (as we have just seen) π(0) = 0 and π'(0) = 0, setting k = 0 gives us the result that

E[x̃²] u''(w0) = x̄² u''(w0) - π''(0) u'(w0)

that is,

π''(0) = -[u''(w0)/u'(w0)] (E x̃² - x̄²)

Finally, substitute these three results into (3.7) to obtain

π(k) ≈ (k²/2)·[-u''(w0)/u'(w0)]·(E x̃² - x̄²) = [k²(E x̃² - x̄²)/2] R^a(w0)

Now, from the definition of variance, simple steps that the reader can (and should) check, reveal that

σ²(k x̃) = E(k x̃ - k x̄)² = k²(E x̃² - x̄²)

and so we end up with the equation

π(k) ≈ [σ²(k x̃)/2] R^a(w0)   (3.8)

Equation (3.8) is known as the Arrow-Pratt approximation to the risk premium. It shows how, as we have already indicated, the greater is the measure of absolute risk aversion, the greater is the risk premium, but it also clearly shows that for a given measure of risk aversion, the greater is the variance of the lottery, the greater is the risk premium.
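The quality of the approximation is easy to inspect numerically. The Python sketch below (values chosen only for illustration) takes u(w) = ln(w), w0 = 100 and a zero-mean risk k·x̃ that pays +10 or -10 with equal probability, and compares the exact risk premium with the right-hand-side of (3.8) as k shrinks.

import math

w0 = 100.0
Ra = 1.0 / w0                               # absolute risk aversion of ln(w) at w0
variance_x = 0.5 * 10**2 + 0.5 * (-10)**2   # variance of the unscaled (mean-zero) risk

for k in (1.0, 0.5, 0.1):
    expected_utility = 0.5 * math.log(w0 + 10 * k) + 0.5 * math.log(w0 - 10 * k)
    exact = w0 - math.exp(expected_utility)   # pi = expected wealth - certainty equivalent
    approx = (k**2 * variance_x / 2) * Ra     # Arrow-Pratt approximation (3.8)
    print(k, round(exact, 5), round(approx, 5))
# The two columns converge as k falls: the approximation is very accurate for small risks.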

[Figure 3.12 Effect of greater risk aversion and greater risk upon the risk premium: contingent claims space with the certainty line w1 = w2, two lotteries sharing the same expected value E w̃ but with different variances, and the indifference curves of the more risk averse individual i and the less risk averse individual j through each lottery; the corresponding certainty equivalents (points a, b, c, d on the horizontal axis) indicate the risk premia.]

In Figure 3.12 we have drawn two lotteries with the same expected value and different variances. The expected utility indifference curves through each of these two lotteries are also drawn for two individuals, one of whom (individual i) is more risk averse than the other (individual j). If we concentrate on either of the two different lotteries, then it is clear that the risk premium is greater for the individual with greater risk aversion, so the risk premium is increasing in risk aversion. On the other hand, if we concentrate on the indifference curves of any of the two individuals, then it is also clear that holding risk aversion constant and increasing variance also leads to an increase in the risk premium.⁷

⁷ Actually, this may not hold risk aversion constant, as the certainty equivalent wealth changes, which may imply that risk aversion also changes for a particular individual.

3.4 Slope of risk aversion

It is important to note that the Arrow-Pratt absolute risk aversion function is a properly defined function (as is the relative risk aversion function), and so it is natural to wonder how risk aversion is affected by an increase in w. First, since R^r(w) = wR^a(w), we get

R^r'(w) = R^a(w) + wR^a'(w)

Thus, always within the assumption that the utility function itself is increasing and concave (so that both relative and absolute risk aversion are positive) we can directly conclude that

1. If absolute risk aversion is not decreasing (R^a'(w) ≥ 0), then relative risk aversion is increasing (R^r'(w) > 0).
2. If relative risk aversion is not increasing (R^r'(w) ≤ 0), then absolute risk aversion is decreasing (R^a'(w) < 0).

Second, if we derive the definition of absolute risk aversion, and if we define P(w) ≡ -u'''(w)/u''(w), then we obtain

R^a'(w) = -[u'''(w)u'(w) - u''(w)u''(w)] / u'(w)²
= -u'''(w)/u'(w) + [u''(w)/u'(w)]²
= [u'''(w)/u''(w)]·[-u''(w)/u'(w)] + R^a(w)²
= -P(w)R^a(w) + R^a(w)²
= R^a(w)(R^a(w) - P(w))

At the second step we can note that u'''(w) ≥ 0 is a necessary (but not sufficient) condition for decreasing absolute risk aversion, R^a'(w) < 0. In words, a necessary condition for decreasing absolute risk aversion is that marginal utility is convex. But we have already assumed that u'(w) > 0 and that u''(w) < 0 for all w, that is, marginal utility is positive and decreasing. From that, we can directly conclude that, at least for very large values of w, marginal utility will indeed be convex (if not, it would either have to be negative or increasing; draw a graph of marginal utility if you are not convinced).
At the final step, we can also conclude that a necessary and sufficient condition for decreasing absolute risk aversion is that R^a(w) < P(w). The function P(w) as defined above is known as absolute prudence, and so absolute risk aversion is decreasing if (and only if) absolute risk aversion is less than absolute prudence. Another way of looking at prudence is to consider the utility function v(w) = -u'(w). Prudence of u(w) is then just the Arrow-Pratt measure of absolute risk aversion of v(w). So u(w) displays decreasing absolute risk aversion if u(w) is less risk averse than is -u'(w). The concept of prudence turns out to be important for decisions that involve savings as a hedge against risk, and it is normally accepted that risk averse individuals also display positive prudence, implying that indeed u'''(w) > 0. We study exactly this kind of problem in the next chapter.
In short, it is very often accepted that absolute risk aversion is in fact decreasing (indeed, a common assumption, which is also often found to correspond to real-life choices in empirical analyses, is that relative risk aversion is constant). In graphical terms, decreasing absolute risk aversion corresponds to a family of indifference curves that become more and more linear as we move away from the origin of the graph.
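The identity R^a'(w) = R^a(w)(R^a(w) - P(w)) derived above, and the DARA condition R^a < P, can also be checked numerically. The snippet below is only an illustration of the algebra: it assumes u(w) = ln(w), for which R^a(w) = 1/w and P(w) = 2/w (as computed in Exercise 3.9 below).

def Ra(w):
    return 1.0 / w      # absolute risk aversion of ln(w)

def P(w):
    return 2.0 / w      # absolute prudence of ln(w)

h = 1e-6
for w in (1.0, 5.0, 20.0):
    slope_numerical = (Ra(w + h) - Ra(w - h)) / (2 * h)   # dR^a/dw by finite differences
    slope_identity = Ra(w) * (Ra(w) - P(w))               # R^a (R^a - P)
    print(w, round(slope_numerical, 6), round(slope_identity, 6))
# Both columns agree and are negative: since R^a < P, absolute risk aversion is decreasing.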

Exercise 3.9. Calculate the Arrow-Pratt measures of absolute risk aversion, relative risk aversion, and absolute prudence, for the following utility functions: u(w) = ln(w), u(w) = -ae^(-bw), and u(w) = -aw² + bw + c, where a, b and c are all positive constants.

Answer. It is easiest to do each of these simply by construction. That is, work out the first and second derivatives, divide the second derivative by the first and multiply by -1 to get absolute risk aversion. Multiply the absolute risk aversion by w to get relative risk aversion. Calculate prudence by working out the third derivative and dividing that by the second derivative (and, of course, multiplying by -1). If you carry out each of these derivatives correctly, you should arrive at the following conclusions: (a) for the function u(w) = ln(w), absolute risk aversion is 1/w, relative risk aversion is 1, and absolute prudence is 2/w, (b) for the function u(w) = -ae^(-bw), absolute risk aversion is b, relative risk aversion is bw, and absolute prudence is b, (c) for the function u(w) = -aw² + bw + c, absolute risk aversion is 2a/(b - 2aw), relative risk aversion is 2aw/(b - 2aw), and absolute prudence is 0. Thus, the logarithmic function is decreasing absolute risk aversion (DARA) and constant relative risk aversion (CRRA), the exponential function is constant absolute risk aversion (CARA) and increasing (actually linear) relative risk aversion (IRRA), and the quadratic function has increasing absolute risk aversion (IARA) and increasing relative risk aversion (IRRA).
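If you have access to a computer algebra system, the whole exercise can be automated. The sketch below uses the sympy library (a study aid, not part of the text itself) to compute absolute risk aversion, relative risk aversion and prudence for the three utility functions.

import sympy as sp

w, a, b, c = sp.symbols('w a b c', positive=True)

def measures(u):
    u1, u2, u3 = sp.diff(u, w), sp.diff(u, w, 2), sp.diff(u, w, 3)
    absolute = sp.simplify(-u2 / u1)       # R^a(w)
    relative = sp.simplify(w * absolute)   # R^r(w)
    prudence = sp.simplify(-u3 / u2)       # P(w)
    return absolute, relative, prudence

for u in (sp.log(w), -a * sp.exp(-b * w), -a * w**2 + b * w + c):
    print(u, measures(u))
# These reproduce the closed forms in the answer above: (1/w, 1, 2/w) for the
# logarithm, (b, b*w, b) for the exponential, and (2a/(b - 2aw), 2aw/(b - 2aw), 0)
# for the quadratic.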

Summary
In this chapter you should have learned the following:

1. A principal aspect of expected utility preferences is that they are linear in probabilities.
2. If the utility function for money is concave (second derivative negative), then the decision maker suffers what is known as risk aversion. Risk aversion is a situation in which an increase in variance that leaves expected value unchanged leads to a less preferred outcome.
3. Risk aversion shows up in the Marschak-Machina triangle as indifference lines that are less steep than the iso-expected value lines. It shows up in the contingent claims environment as indifference curves that are convex.
4. The standard graphical environment for studying choice under risk is the contingent claims setting. In that setting, we represent the outcomes (prizes) of a lottery on the two axes. The probabilities of the outcomes then show up in the slopes of the indifference curves (marginal rates of substitution) and the slopes of expected value lines in the graph.
5. Different preferences are defined by different risk aversion. Risk aversion at any given level of wealth w can be measured by the function R^a(w) = -u''(w)/u'(w), the Arrow-Pratt measure of absolute risk aversion. If two utility functions are related by a positive linear transformation, they will show the same level of risk aversion for every level of wealth, and so they show the same preferences exactly. But if one utility function is a concave transform of another, then the former is more risk averse than the latter.
6. Other important functions that are relevant to risk aversion are relative risk aversion, R^r(w) = -wu''(w)/u'(w), and prudence, P(w) = -u'''(w)/u''(w). Relative risk aversion is a measure of the elasticity of marginal utility to wealth, and prudence is the absolute risk aversion of the utility function v(w) = -u'(w).
7. The slope of absolute risk aversion is an important ingredient to many problems in economics. It is often assumed that absolute risk aversion is decreasing (decision makers are less risk averse the wealthier they become). This is equivalent to saying that absolute risk aversion is less than prudence.
8. Two other important concepts for decision making under risk are the certainty equivalent wealth and the risk premium corresponding to a given risk. Certainty equivalent wealth is the level of wealth that generates exactly the same level of utility as a given lottery, and the risk premium is the difference between the expected value of wealth and the certainty equivalent wealth.
9. We can estimate the value of the risk premium, at least for small risks, using the Arrow-Pratt approximation. Under this approximation, the risk premium is (approximately) equal to half of the variance of the lottery multiplied by the level of absolute risk aversion measured at the expected level of wealth. This approximation confirms that the risk premium increases with risk aversion and with the risk of the lottery (its variance).

Problems
1. Prove mathematically that a movement upwards along a line
of constant expected value in the Marschak-Machina triangle
corresponds to an increase in variance of wealth.
2. Use Jensen's inequality to prove that if u(w) is concave, then the iso-expected value lines in the Marschak-Machina triangle are steeper than the indifference curves.

3. Assuming strictly risk averse preferences, indicate in a Marschak-Machina triangle a lottery, denoted by lottery A, between only the best and worst prizes that is indifferent to receiving the intermediate prize for sure. Draw the indifference curve going through lottery A, and evaluate its slope in terms of probabilities. Indicate on the graph the set of lotteries that is at least as good as lottery A. Is this a convex set?
4. In a variant of the two lotteries in exercise 3.2, assume now that each trial of the lotteries pays $1 with probability (1-p) and -$1 with probability p. Can the two lotteries implied by a single, and a repeated, trial of this be located in a single Marschak-Machina triangle? If p were equal to one-half, would you expect a risk averse bettor to accept a single trial of this game? How about the two trial version of the game? (Hint: try using Jensen's inequality for a concave utility function).
5. Use Jensen's inequality to prove that if u(w) is concave in the scalar w, then Eu(w̃) is concave in the vector (w1, w2).
6. Assume that an individual has risk-free wealth of $350,000. Find the limit value of relative risk aversion for which the individual should certainly reject a bet that involves winning $105 with probability one half and losing $100 with probability one half. (Clue: use the Arrow-Pratt approximation for the risk premium).
7. Consider a two-state problem in which a strictly risk averse expected utility maximiser must allocate an initial wealth of w0 over two states, where the state contingent claims (w1 and w2) are traded at prices q1 and q2 respectively. Assuming an interior solution, prove that

∂w1/∂w0 = [R^a(w2)/R^a(w1)] ∂w2/∂w0

where R^a(wi) is the Arrow-Pratt measure of absolute risk aversion in state i. What does this result imply for the signs of the effect of an increase in initial wealth on the demand for the two state contingent claims?
8. Continuing from problem 7, now derive the budget constraint, w0 = q1 w1 + q2 w2, with respect to initial wealth, and solve out for the values of ∂wi/∂w0 for i = 1, 2. Assuming that absolute risk aversion is constant in wealth, how would an increase in absolute risk aversion affect your solution?

9. Draw a graph in contingent claims space that represents the indifference curves corresponding to constant absolute risk aversion. Be careful to clearly indicate how CARA shows up in the graph. Show that if two different points in the graph have the same marginal rate of substitution, then they must lie on a straight line with slope equal to 1 (i.e., they must both have the same variance).
10. Repeat your graph of the previous problem, but this time for the case of constant relative risk aversion, CRRA. This time, you need to show that if two points have the same marginal rate of substitution, then they must lie on a straight line that is a ray from the origin.
11. In exercise 3.9 we saw that the utility function u(w) = -ae^(-bw) corresponds to constant absolute risk aversion. However, when both a and b are positive numbers, it also corresponds to negative utility. Do you think that negative utility is unreasonable for the analysis of choice? Explain why or why not.
12. Define the utility function v(w) ≡ -u'(w). What are the signs of the first and second derivatives of this function? What is the absolute risk aversion of the function? Assuming that u(w) displays decreasing absolute risk aversion, is v(w) more or less risk averse than u(w)?
13. Prove that the set of utility functions that display decreasing absolute risk aversion is a convex set. Is the same true for the set of constant absolute risk aversion functions?
Chapter 4

Applications

4.1 Portfolio choice


One of the most important types of markets in which individuals make decisions regarding risk and uncertainty are financial markets, in particular the share market where shares in companies are traded. Trading in shares offers individual investors the opportunity of both capital gains and dividend payments (a part of the profits of the companies that they own shares in). Share trading also offers significant options to diversify risk: the owners of a company may decide to sell a part of the company in order to use the funds to purchase parts of other companies, thereby diversifying their risk over more than one industry.
To begin with, let us assume as always that there are only two states of nature, and for this application, assume that there are also only two firms. We shall use subindexes to indicate the different states of nature and super-indexes to indicate the different firms. Each firm j = 1, 2 is made up of N^j parts, known as shares, that are traded at a fixed unit price of v^j, so that firm j is worth in total V^j = v^j N^j. We shall assume that being the owner of a proportion α^j of the shares gives the right to the same proportion of the profits of the firm. In state i the profits of firm j amount to π^j_i. As before, the probability of state 2 is denoted by p. Of course, it must always be true that α^j ≤ 1, j = 1, 2. In short, an individual who is the owner of a proportion α^1 of firm 1 and α^2 of firm 2 has wealth in state i of

w_i = α^1 π^1_i + α^2 π^2_i,   i = 1, 2


Instead of assuming an initial endowment of risk-free wealth, we shall assume that the individual is born with an endowment of shares in the firms. Concretely, we assume that initially, our individual has a proportion α^j_0 of firm j, where obviously j = 1, 2. Given that, the investor can finance the purchase of shares in one firm by selling shares in the other. His budget constraint is

v^1 α^1 N^1 + v^2 α^2 N^2 ≤ v^1 α^1_0 N^1 + v^2 α^2_0 N^2

or

v^1 N^1 (α^1 - α^1_0) + v^2 N^2 (α^2 - α^2_0) ≤ 0

We shall also add the restrictions that α^j ≥ 0, j = 1, 2, that is, it is impossible to be the owner of a negative share in a firm. In reality, this type of restriction does not necessarily need to hold, since owning a negative proportion of a firm is simply a situation in which, instead of owning shares, shares are owed. In many real-world markets this is possible, and is known as holding a short position in a firm. Selling more shares than what one owns is known as a short sale. Short positions are possible only when there exists a time dimension in share trading. An individual who believes that the price of a share will go down tomorrow can sell them today (although he does not actually have them) at the current market price, with the promise of delivering them the day after tomorrow. Then, with the money that he gets for the sale, he waits until the next day when he purchases the shares (at the lower price, if his belief has been fulfilled), and then he settles his share debt. The profit from such a trade (net of any transactions costs) is the difference in the price of the shares, multiplied by the number of shares involved. Of course, this can be a very dangerous strategy. If rather than going down, the shares increase in price, the investor makes a negative profit, and what is more, since (theoretically) the price can increase without bound, the negative profit can also become very large.¹ Many bankruptcies have occurred through betting on short sales. Our assumption of eliminating short sales avoids such a complication.
Our interest is in the optimal portfolio choice of the investor, that is, his optimal choices of α^j. Formally, the problem is to maximise Eu(w(α)) with respect to α, conditional upon v^1 N^1 (α^1 - α^1_0) + v^2 N^2 (α^2 - α^2_0) ≤ 0 and α^j ≥ 0, j = 1, 2. Since the objective function (expected utility) is concave in α, and since the restrictions are linear, we can rest assured that the problem has a unique solution.

¹ In comparison, holding only positive positions in firms limits losses to the amount invested (the scenario in which the prices of the shares held drop to 0).

We shall formulate the problem by ignoring the no-negativity constraints, since if they are satisfied in any solution found by not imposing them, we know that the same solution would be found by imposing them, and if one of them is not satisfied then we know that the optimal solution is to simply set that α equal to 0. The Lagrangean function for the problem is

L(α, λ) = p u(α^1 π^1_2 + α^2 π^2_2) + (1-p) u(α^1 π^1_1 + α^2 π^2_1) + λ[0 - v^1 N^1 (α^1 - α^1_0) - v^2 N^2 (α^2 - α^2_0)]

If we write w_i = α^1 π^1_i + α^2 π^2_i, i = 1, 2, then the first-order conditions are

p u'(w2) π^j_2 + (1-p) u'(w1) π^j_1 = λ v^j N^j,   j = 1, 2

and the complementary slackness condition is

λ [v^1 N^1 (α^1 - α^1_0) + v^2 N^2 (α^2 - α^2_0)] = 0

However, since the first-order conditions indicate that

λ = [p u'(w2) π^j_2 + (1-p) u'(w1) π^j_1] / (v^j N^j) > 0,   j = 1, 2

we know that in the solution the restriction must saturate, that is, the complementary slackness condition can be better written as

v^1 N^1 (α^1 - α^1_0) + v^2 N^2 (α^2 - α^2_0) = 0   (4.1)

Now, dividing the first first-order condition by the second, we get

[p u'(w2) π^1_2 + (1-p) u'(w1) π^1_1] / [p u'(w2) π^2_2 + (1-p) u'(w1) π^2_1] = v^1 N^1 / (v^2 N^2)   (4.2)

Together, the simultaneous solution of equations (4.1) and (4.2) gives the solution to the problem.
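To see the mechanics at work, the following Python sketch solves the problem for one fully made-up example (none of these numbers come from the text): p = 1/2, logarithmic utility, firm profits π^1 = (10, 2) and π^2 = (4, 8) across states 1 and 2, market values V^1 = V^2 = 6, and an endowment α^1_0 = α^2_0 = 1/2. Constraint (4.1) ties α^2 to α^1, so a one-dimensional search over α^1 is enough.

import math

p = 0.5
pi1 = (10.0, 2.0)      # profits of firm 1 in states 1 and 2
pi2 = (4.0, 8.0)       # profits of firm 2 in states 1 and 2
V1, V2 = 6.0, 6.0      # market values v^j N^j
a1_0, a2_0 = 0.5, 0.5  # endowed shareholdings

def expected_utility(a1):
    a2 = a2_0 - (V1 / V2) * (a1 - a1_0)      # budget constraint (4.1)
    w1 = a1 * pi1[0] + a2 * pi2[0]           # state-1 wealth
    w2 = a1 * pi1[1] + a2 * pi2[1]           # state-2 wealth
    return (1 - p) * math.log(w1) + p * math.log(w2)

a1_star = max((i / 10000 for i in range(1, 10000)), key=expected_utility)
a2_star = a2_0 - (V1 / V2) * (a1_star - a1_0)
print(round(a1_star, 4), round(a2_star, 4))
# With these symmetric values the optimum is (1/3, 2/3), which equalises wealth
# across the two states; condition (4.2) then holds with both sides equal to 1.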
What is not so obvious is the graphical representation of what we have just done. Consider the space of contingent wealth, in Figure 4.1. The two points π^j indicate the distributions of profits of the two firms, and the straight lines that join them to the origin indicate all of the wealth distributions that can be obtained from each firm with positions of α^j between 0 and 1 (α^j = 0 would indicate the origin of the graph, and α^j = 1 would give the point π^j).

[Figure 4.1 Optimal portfolio demand: contingent wealth space (w1, w2) showing the profit points π^1 and π^2, the rays from the origin through each of them, the endowment points A and B on those rays, their vector sum C, and the budget line through C.]

The initial endowment of the investor is indicated by point A on the line pertaining to firm 1 and point B on the line of firm 2. The vector sum of these two points indicates that the individual's initial point is found at C, and the optimal position of the individual is given by the tangency point between his indifference curve and the frontier of all feasible trades (the line passing through C).
The principal problem in working this through is simply to obtain the equation for the slope of the frontier of feasible trades in state contingent wealth space. Note carefully, since we have not depicted the two assets (the shares) on the axes, the slope of the budget constraint in state contingent wealth space is certainly not equal to the negative of the ratio of the prices of the assets, as one may be tempted into believing at first glance. Let's investigate.
We know that the slope of the individual's indifference curves in state contingent wealth space (his marginal rate of substitution) is

-(1-p)u'(w1) / [p u'(w2)]

With a minimal amount of effort, equation (4.2) reorders to give

-(1-p)u'(w1) / [p u'(w2)] = -[V^2 π^1_2 - V^1 π^2_2] / [V^1 π^2_1 - V^2 π^1_1]

where V^j, j = 1, 2, is the total market value of firm j, that is, V^j = v^j N^j. Since this is our equilibrium condition, the right-hand-side of this equation must be the slope of the budget constraint in state contingent wealth space. Let's just perform a check of that.
Consider the restriction V^1 (α^1 - α^1_0) + V^2 (α^2 - α^2_0) = 0 in the space (w2, w1). Define

g(α) = V^1 (α^1 - α^1_0) + V^2 (α^2 - α^2_0)

so that the restriction reads g(α) = 0. First, note that point C must necessarily lie on the implied restriction, since it corresponds to α^j = α^j_0, j = 1, 2, which clearly yields g(α) = 0. Second, from the implicit function theorem we have

dα^2/dα^1 |_(dg(α)=0) = -V^1/V^2

And since w_i = α^1 π^1_i + α^2 π^2_i, i = 1, 2, we also know that

dw_i/dα^1 |_(dg(α)=0) = π^1_i + π^2_i (dα^2/dα^1) = π^1_i - π^2_i (V^1/V^2),   i = 1, 2

Dividing the first of these by the second we get

dw2/dw1 |_(dg(α)=0) = [dw2/dα^1] / [dw1/dα^1] = [π^1_2 - (V^1/V^2)π^2_2] / [π^1_1 - (V^1/V^2)π^2_1]

Operating on this, we find that it reduces to

dw2/dw1 |_(dg(α)=0) = -[V^2 π^1_2 - V^1 π^2_2] / [V^1 π^2_1 - V^2 π^1_1]

Therefore, as noted above, the budget constraint in state contingent wealth space is a straight line passing through point C, and the optimal choice is the point on this line that is tangent to an indifference curve (Figure 4.1).
Notice that the slope of the budget line might feasibly be strange. As there are no restrictions that relate to the relative values of V^2 π^1_2 and V^1 π^2_2, and of V^1 π^2_1 and V^2 π^1_1, the budget constraint need not always have strictly negative slope. It may turn out to have positive slope, zero slope, or even infinite slope. However, consider what would be implied by, say, a positively sloped budget constraint. Since the budget constraint shows all of the trades that are feasibly possible, this indicates that the investor would never obtain a tangency solution. Instead he would go higher and higher along the budget constraint, purchasing shares in one firm and selling shares in the other until some other restriction is met (either he ends up owning all of the firm he is purchasing shares in, or he has nothing left to sell of the other firm). While a theoretical possibility, it is certainly not a logical outcome. The reason why this possibility exists (theoretically) in the model explained here and not in the real world is that we have assumed that the share prices are fixed at v^1 and v^2. If there were two prices such that the budget constraint were positively sloped, there would be a massive excess demand for the shares of one of the firms, and a massive excess supply of the other firm's shares, which would lead to a share price adjustment. Thus, the only stable equilibrium outcomes would indeed correspond to negatively sloped budget constraints.

Exercise 4.1. In this analysis we deliberately eliminated short sales as an outcome. Imagine that we did allow short sales. Draw a graph of a solution in which α^1 < 0.

Answer. The relevant graph is shown in Figure 4.2. Starting from an initially positive holding in each firm (points A and B), giving an initial portfolio of point C, the individual maximises utility at point D. The holding of firm 1 at point D can be found at point E, which is clearly a negative shareholding.

4.2 The demand for insurance


An insurance contract is an agreement to share risk in exchange for a premium payment. The insured individual sacrifices a certain amount of money in order to reduce the riskiness of his final allocation.

[Figure 4.2 A short position in firm 1: contingent wealth space showing the rays through π^1 and π^2, the endowment points A and B, the initial portfolio C, the optimum D, and the implied (negative) holding of firm 1 at point E.]

This is a standard type of problem that can be studied in the contingent claims environment that we have introduced above. For now, we shall assume that both the insurance company and the individual agree completely on the exact characteristics of the risk to be insured.
Assume that a strictly risk averse individual has an initial endowment comprising of risk-free wealth of w0 and a lottery defined by L̃ ≡ (0, -L, (1-p), p), where L > 0. That is, with probability p the individual suffers a loss of L. While it is not really necessary, we shall assume here that w0 ≥ L. The individual's initial situation can be described by risk-free wealth of w0 plus the lottery L̃ ≡ (0, -L, (1-p), p). His expected wealth is E w̃ = w0 - pL, and his expected utility is (1-p)u(w0) + pu(w0 - L). In a contingent claims graph, the indifference curve passing through the initial point cuts the graph into two parts: the points strictly below the initial curve, and the points on or above the initial indifference curve (the acceptance set). The individual would voluntarily exchange his initial situation for any other point within the acceptance set. Let us define a new lottery

by x̃ ≡ (x1, x2, (1-p), p) for particular values of x1 and x2. Now, the individual would always exchange his initial lottery, L̃, for this new lottery, x̃, if

(1-p)u(w0) + pu(w0 - L) ≤ (1-p)u(w0 + x1) + pu(w0 + x2)

We shall refer to x̃ ≡ (x1, x2, (1-p), p) as an insurance contract.
Now, consider the situation of the insurer. We shall assume that the insurer is risk neutral,² and that her initial situation is described by risk-free wealth of z0 and a lottery ỹ ≡ (y1, y2, (1-p), p). We can understand the lottery ỹ ≡ (y1, y2, (1-p), p) as the existing portfolio of clients of the insurer. Since she is assumed to be risk neutral, we are assuming that the insurer maximises her expected profit, which is initially z0 + (1-p)y1 + py2. If the insurer does offer the above contract to the individual, and if this contract is accepted, then the insurer adds the lottery L̃ ≡ (0, -L, (1-p), p) to her portfolio, in exchange for the individual taking the lottery x̃ ≡ (x1, x2, (1-p), p) from the portfolio. This implies that the expected profit of the insurer becomes

z0 + (1-p)(y1 - x1) + p(y2 - L - x2)

Of course, the net surplus that the insurer receives by participating in the exchange is given by

B(x̃) = [z0 + (1-p)(y1 - x1) + p(y2 - L - x2)] - [z0 + (1-p)y1 + py2]
= -[(1-p)x1 + px2] - pL

A logical condition for the insurer to participate in the exchange is that B(x̃) ≥ 0, that is

-pL ≥ (1-p)x1 + px2

Taking this condition into account, we note that the exchange leaves the individual with a situation of expected wealth equal to w0 + (1-p)x1 + px2 ≤ w0 - pL.

² Although risk neutrality is not really absolutely necessary, it is a common assumption and it certainly simplifies the analysis considerably. Actually, even if the insurer is assumed to be risk averse, it would not be very risk averse (especially compared to the individual) due to the very fact that its business is to collect risks, and to the fact that being a large corporation, it is likely to have very large resources, and if risk aversion is decreasing, then it could only have a very low degree of risk aversion.

[Figure 4.3 Zone of mutually beneficial insurance contracts: contingent claims space with the certainty line w1 = w2, the initial indifference curve Eu(w̃) = (1-p)u(w0) + pu(w0 - L), and the iso-expected value line E w̃ = w0 - pL; the shaded zone between them is the set of mutually beneficial contracts.]

That is, the only possible contracts correspond to expected value lines for the individual in the contingent claims graph that are no higher than his initial situation. This indicates a zone of possible contracts between the initial indifference curve of the individual and his initial expected value line. This is the zone of mutually beneficial contracts, in the sense that they increase the welfare of at least one agent without reducing the welfare of the other. This zone of mutually beneficial contracts is shown as the shaded zone in Figure 4.3.
Now, let's think about the general characteristics that the optimal insurance contract must satisfy. First, note that it must hold that x1 = x2. To see why, assume that this equality is not respected. In this case, the insurance contract must leave the individual with a risky wealth distribution (i.e., it leaves him at a point that is not on his certainty line). Assume, for example, that x1 > x2, so that the insurance contract leaves the individual below the certainty line. But then we have a situation that is formally identical to where we started, and so there still exists a zone of mutually beneficial contracts that

should be taken advantage of. In other words, if the insurance contract satisfies x1 > x2 then there will always exist a further contract that will increase the welfare of at least one of the two parties without decreasing the welfare of the other. Therefore, the contract must satisfy x1 = x2, and so the contract leaves the individual on the certainty line³ at some point between w* and w̄ in Figure 4.4. The question of exactly which point will be chosen will be tackled here only for two extreme cases; on one hand the case of an insurer that acts in a perfectly competitive environment, and on the other hand the case of an insurer who is a monopolist. We shall use x1 = x2 = x to indicate the contract, with x_c indicating a situation of perfect competition, and x_m indicating a situation of monopoly.

[Figure 4.4 Perfectly competitive and monopoly insurer equilibria: the certainty line w1 = w2, the endowment (w0, w0 - L), the competitive outcome at w̄ = w0 - pL, and the monopoly outcome at w* = w0 - pL - π.]

When the insurer is a competitive firm, we know that it must always earn an expected profit of 0. Thus both before and after adding the new client to its portfolio, the expected profit must be 0, and so the expected profit added by this client must also be exactly 0.

³ The fact that the insured ends up on the certainty line is known as full coverage.

Thus the contract must satisfy (1-p)x1 + px2 = -pL, and since x1 = x2 = x_c, we have in this case x_c = -pL. Thus the individual's final wealth ends up at the point w̄ in Figure 4.4, and his expected utility is the greatest possible within the set of possibilities that is offered by the zone of feasible contracts. Note that under this contract, the individual's wealth in both states ends up at w0 - pL, so in state 1 the contract asks him to pay pL, and in state 2 the contract gives him a payment of L - pL. In other words, the contract asks for a premium payment from the insured to the insurer of pL in both states, and offers an indemnity payment from the insurer to the insured in state 2 of L. The fact that the indemnity is, in absolute value, equal to the size of the loss is the indication that the contract has full coverage, and the fact that the premium is equal to the expected value of the loss is known as a case of a fair premium.
Second, consider the case when the insurer is a monopolist, so that the contract is characterised by x1 = x2 = x_m, that is, again we know that the contract will still involve full coverage, and all that we need to find out is the amount of the premium. But clearly, the company will offer the contract that maximises her expected profit while still being accepted by the individual. If the contract does not leave the insured indifferent between accepting it or not, then the same indemnity payment can be made with a higher premium, which must increase the expected profit of the insurer. So the contract must lie on the individual's initial indifference curve. But since it offers full coverage, it offers full certainty to the individual. Thus

u(w0 + x_m) = (1-p)u(w0) + pu(w0 - L)

This is just the definition that we saw previously for the certainty equivalent wealth, w*, and so we have w0 + x_m = w*. But from the definition of the risk premium, we now know that w* = w̄ - π = w0 - pL - π, and so x_m = -(pL + π). A monopoly contract leaves the individual with wealth of w0 - (pL + π) in both states, and so it implies that the premium paid by the insured to the insurer is pL + π, and the indemnity coverage to be received if state 2 eventuates is equal to L. Hence, the only difference between a competitive contract and a monopoly contract is the premium to be paid; in both contracts the indemnity to be received in state 2 is always equal to the loss, L. When we have a monopoly insurer, the premium is equal to the competitive premium plus the individual's risk premium.
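A worked numerical example may help fix the two premiums. The values below are invented for illustration only: u(w) = ln(w), w0 = 100, L = 50 and p = 0.1. The competitive premium is pL, and the monopoly premium adds the risk premium π computed from the certainty equivalent.

import math

w0, L, p = 100.0, 50.0, 0.1

expected_utility = (1 - p) * math.log(w0) + p * math.log(w0 - L)   # no-insurance utility
certainty_equivalent = math.exp(expected_utility)                  # w*
risk_premium = (w0 - p * L) - certainty_equivalent                 # pi = w-bar - w*

print(round(p * L, 3))                     # competitive premium: 5.0
print(round(risk_premium, 3))              # pi: about 1.70
print(round(p * L + risk_premium, 3))      # monopoly premium: about 6.70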

Exercise 4.2. Analyse the cases of insurance demand under


both competitive and monopoly insurers using the Lagrange
method of constrained optimisation.

Answer. First, let's look at the competitive case. We know in this case that the insurance company offers insurance contracts that at most maintain expected value constant. Thus the insurance consumer's problem is to maximise (1-p)u(w1) + pu(w2) subject to the constraint (1-p)w1 + pw2 ≤ w0 - pL. The Lagrangean for this problem is L(w, λ) = (1-p)u(w1) + pu(w2) + λ[w0 - pL - ((1-p)w1 + pw2)], where, of course, λ is the multiplier. The first-order conditions are (1-p)u'(w1) - λ(1-p) = 0 and pu'(w2) - λp = 0, and the complementary slackness condition is λ[w0 - pL - ((1-p)w1 + pw2)] = 0. The two first-order conditions directly imply u'(w1) = u'(w2) = λ, from whence we know λ > 0, and so the complementary slackness condition implies w0 - pL = (1-p)w1 + pw2. Now, note that again since u'(w1) = u'(w2), concavity of the utility function implies that it must be true that w1 = w2 ≡ w_c. Substituting back into the complementary slackness condition, we now get w0 - pL = w_c. That is, the individual locates at the certain point with coordinates w1 = w0 - pL = w2.
Second, the monopoly case. Since now the insurer is able to maximise profits, which is the same as minimising the expected value of the contract for the insurance consumer, the problem can be studied as one in which we minimise (1-p)w1 + pw2 subject to the constraint that the consumer accepts the contract, that is (1-p)u(w1) + pu(w2) ≥ (1-p)u(w0) + pu(w0 - L). There are two small problems; instead of minimising we would prefer to maximise, and the inequality in the constraint goes the wrong way around. Both problems can be solved by using negatives. Minimising any function f(w) is exactly equivalent to maximising -f(w), so all we need to do is to maximise -((1-p)w1 + pw2). Second, if we multiply both sides of the constraint by -1, the constraint itself is unaltered, however it will now read

-[(1-p)u(w1) + pu(w2)] ≤ -[(1-p)u(w0) + pu(w0 - L)]

We can now go ahead with our maximisation via Lagrange. The Lagrangean is now

L(w, λ) = -((1-p)w1 + pw2) + λ{-[(1-p)u(w0) + pu(w0 - L)] + [(1-p)u(w1) + pu(w2)]}

The first-order conditions are -(1-p) + λ(1-p)u'(w1) = 0 and -p + λpu'(w2) = 0. The complementary slackness condition is λ{-[(1-p)u(w0) + pu(w0 - L)] + [(1-p)u(w1) + pu(w2)]} = 0. Notice that the two first-order conditions here are exactly the same in form as in the competitive insurer problem, and so they imply u'(w1) = u'(w2) = 1/λ > 0, from which we know w1 = w2 ≡ w_m and (1-p)u(w0) + pu(w0 - L) = (1-p)u(w1) + pu(w2). Substituting, we have (1-p)u(w0) + pu(w0 - L) = u(w_m), from which we can directly see that w_m = w0 - pL - π.

Insurance with a marginally loaded premium

In what we have just seen, the total premium payment for insurance coverage was either pL, or pL + π, depending on whether the insurer acts in a competitive environment or as a monopolist. Under either of these two premium types, the optimal demand for insurance (i.e., the indemnity in case of accident) is equal to the loss, that is, insurance involves full coverage, which is indicated by the fact that the consumer's final position involves no risk at all. This is a special demand for insurance function as it is independent of many of the variables that we might expect to affect demand: for example, the insured's risk-free wealth, the prices of other goods in the economy, and even, to a certain extent, the price of insurance itself, since one of the total premium payments is greater than the other but they both have the same coverage. Both of the scenarios that we have studied above can be seen to be cases in which the premium payment for insurance is defined as a linear function of coverage; pC + k, where C is coverage and k is a constant (equal to 0 in one case, and to π in the other). There are two important things to notice. First, this is a two-part tariff rather than a simple pricing arrangement, and second, the slope parameter is the same always, p. When the slope parameter is equal to the probability of loss, the premium is "marginally fair", and this is what is driving the result that optimal insurance coverage is always full.

It is worthwhile to take a closer look at how the demand for insurance works with a more general pricing arrangement. Concretely, let us assume that if C units of coverage are contracted, then the cost of insurance is qC + k, that is, the per-unit price of coverage is denoted by q, which may or may not be equal to p, and the entry price to the insurance market is k ≥ 0. It is important to note that here we will be concerned only with the insured's choice of coverage, and we will not discuss the insurer's optimal choice of pricing arrangements. In fact the only aspect of the supply of insurance that we will take into account is that the insurer's profits cannot be negative.
Using this pricing schedule, the individual's choice problem is the following:

max_C  p u(w0 - (qC + k) - L + C) + (1-p) u(w0 - (qC + k))

subject to

p u(w0 - (qC + k) - L + C) + (1-p) u(w0 - (qC + k)) ≥ p u(w0 - L) + (1-p) u(w0)

Of course, we should also restrict C ≥ 0, but as always we shall ignore such no-negativity constraints and deal with them only if it ever turns out that it is not satisfied in the unrestricted solution.
Let's start by looking quickly at the restriction. All it says is that the final insured situation must give a greater level of expected utility than the no-insurance option. Again, if the restriction is not satisfied in the unrestricted solution to the problem, then the best option is to not insure at all, saving on both the variable and the fixed cost of insurance. Thus, again we shall not deal with this restriction here explicitly, but we will leave that to an exercise later on (exercise 4.3).
So, we are interested in maximising Eu(C) = p u(w0 - (qC + k) - L + C) + (1-p) u(w0 - (qC + k)) with respect to the choice of coverage C. The first derivative of the objective function is

∂Eu(C)/∂C = p u'(w0 - (qC + k) - L + C)(1 - q) + (1-p) u'(w0 - (qC + k))(-q)

and the second derivative is

∂²Eu(C)/∂C² = p u''(w0 - (qC + k) - L + C)(1 - q)² + (1-p) u''(w0 - (qC + k))(-q)²

Due to the fact that utility is assumed to be strictly concave, we have ∂²Eu(C)/∂C² < 0. The negative sign on this second derivative indicates that the objective function is concave in the choice variable, and so the optimal solution satisfies the following first-order condition:

∂Eu(C*)/∂C = p u'(w0 - (qC* + k) - L + C*)(1 - q) - (1-p) u'(w0 - (qC* + k))q = 0   (4.3)

Re-ordering, this can be written as

p(1 - q) / [(1 - p)q] = u'(w0 - (qC* + k)) / u'(w0 - (qC* + k) - L + C*)   (4.4)

Some special cases are now quite evident. First, say q = p, which is the case studied in the previous section. In that case (and only in that case), p(1 - q)/[(1 - p)q] = 1, and so the first-order condition indicates that u'(w0 - (qC* + k)) = u'(w0 - (qC* + k) - L + C*). But since utility is by assumption strictly concave, this in turn indicates that w0 - (qC* + k) = w0 - (qC* + k) - L + C*, which easily reduces to C* = L, that is, full insurance is optimal.

Exercise 4.3. What is the expected utility of the insurance demander in the optimum when insurance coverage is priced according to pC + k? How does insurance demand behave with respect to the fixed cost element, k, of the price?

Answer. We know that, since coverage is priced in a marginally actuarially fair manner, if any insurance is purchased, it will be full coverage, C* = L. But it is possible that no coverage will be purchased. It will depend upon k. Concretely, expected utility under optimal coverage is p u(w0 - (pC* + k) - L + C*) + (1-p) u(w0 - (pC* + k)). Thus, if some insurance is purchased, C* = L and expected utility becomes u(w0 - (pL + k)). This will be optimal only if it provides for more utility than not purchasing insurance, which would give expected utility of p u(w0 - L) + (1-p) u(w0). So, optimal coverage is positive only if p u(w0 - L) + (1-p) u(w0) < u(w0 - (pL + k)). Finally, since by definition of the risk premium π, we have p u(w0 - L) + (1-p) u(w0) = u(w0 - pL - π), we know that optimal coverage is positive (and equal to L) only if u(w0 - pL - π) < u(w0 - (pL + k)). This is equivalent to w0 - pL - π < w0 - pL - k, or k < π. In short, the optimal insurance demand function is C* = L if k < π, and C* = 0 if k ≥ π.

Consider the other two options, q > p and q < p. In the first of these cases, we get p(1 - q)/[(1 - p)q] < 1, in which case u'(w0 - (qC* + k)) < u'(w0 - (qC* + k) - L + C*). Again, due to concavity of the utility function, this indicates that w0 - (qC* + k) > w0 - (qC* + k) - L + C*, or C* < L, that is, under-insurance. The other case, q < p, leads directly to over-insurance, C* > L.
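A numerical illustration of the under-insurance result is given below, with invented values: u(w) = ln(w), w0 = 100, L = 50, p = 0.1, k = 0 and a loaded price q = 0.15 > p. It solves the first-order condition (4.3) by bisection and finds an optimal coverage well below L.

def marginal_expected_utility(C, w0=100.0, L=50.0, p=0.1, q=0.15, k=0.0):
    # dEu/dC for u(w) = ln(w), so u'(w) = 1/w
    wealth_loss = w0 - (q * C + k) - L + C      # state-2 wealth (accident occurs)
    wealth_no_loss = w0 - (q * C + k)           # state-1 wealth (no accident)
    return p * (1 - q) / wealth_loss - (1 - p) * q / wealth_no_loss

lo, hi = 0.0, 50.0                              # search between no coverage and full coverage
for _ in range(100):                            # bisection on the first-order condition
    mid = 0.5 * (lo + hi)
    if marginal_expected_utility(mid) > 0:
        lo = mid
    else:
        hi = mid

print(round(0.5 * (lo + hi), 2))   # about 13.7: with q > p only partial coverage is bought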
Now, typically over-insurance is problematic, and is never a feature of the real-world insurance business. It is not difficult to see why. If, in case of accident, the insured receives back more money in indemnity than what he loses in the accident, he has a clear incentive to artificially boost the probability of accident, which is detrimental to the expected profit of the insurer. Normally, actions by the insured to attempt to create accidents (a well-known type of insurance fraud) cannot be easily monitored by the insurer, and so in order to avoid such a scenario, insurers do not offer contracts with q < p, and correspondingly, we shall ignore that option here.
As far as the comparative statics of insurance are concerned, the interesting case occurs when q > p, since the optimal coverage does not respond at all to changes in any parameter values (outside of moving k from below to above π) in the case when q = p. So from now on, let us consider only the case q > p, and we shall look at the effects of changing the parameter values on the optimal insurance choice. To do this, go back to the first version of the first-order condition (4.3), to which we can directly apply the implicit function theorem.
The most important result to note is what happens when the individual becomes independently wealthier, that is, w0 increases. From the implicit function theorem,

∂C*/∂w0 |_(dEu=0) = -[∂²Eu/∂C∂w0] / [∂²Eu/∂C²]

But since we know that ∂²Eu/∂C² < 0, the sign of ∂C*/∂w0 is the same as the sign of ∂²Eu/∂C∂w0. Differentiating (4.3) we have

∂²Eu/∂C∂w0 = p u''(w0 - (qC* + k) - L + C*)(1 - q) - (1-p) u''(w0 - (qC* + k))q   (4.5)

However, we can cancel the term p(1 - q) using the first-order condition itself, to get

∂²Eu/∂C∂w0 = [(1-p)q u'(w0 - (qC* + k)) / u'(w0 - (qC* + k) - L + C*)] u''(w0 - (qC* + k) - L + C*) - (1-p)q u''(w0 - (qC* + k))

= q(1-p) [ u''(w0 - (qC* + k) - L + C*)u'(w0 - (qC* + k)) / u'(w0 - (qC* + k) - L + C*) - u''(w0 - (qC* + k)) ]

Now, since q(1-p) > 0, the sign of ∂²Eu/∂C∂w0 is equal to the sign of the bracketed term

u''(w0 - (qC* + k) - L + C*)u'(w0 - (qC* + k)) / u'(w0 - (qC* + k) - L + C*) - u''(w0 - (qC* + k))

or more formally, ∂²Eu/∂C∂w0 ≷ 0 as

u''(w0 - (qC* + k) - L + C*)u'(w0 - (qC* + k)) / u'(w0 - (qC* + k) - L + C*) ≷ u''(w0 - (qC* + k))

Cross-multiplying, we can see that the condition for ∂²Eu/∂C∂w0 ≷ 0 is

u''(w0 - (qC* + k) - L + C*) / u'(w0 - (qC* + k) - L + C*) ≷ u''(w0 - (qC* + k)) / u'(w0 - (qC* + k))

But when we multiply each side of this by -1 (don't forget to switch the inequality direction when you do this), we end up with a statement on absolute risk aversion, that is,

R^a(w0 - (qC* + k) - L + C*) ≶ R^a(w0 - (qC* + k))

In short, we have shown that an increase in risk-free wealth will
increase (decrease) the optimal insurance purchase if absolute risk
aversion with wealth $w_0 - (qC^* + k) - L + C^*$ is smaller (greater)
than absolute risk aversion with wealth $w_0 - (qC^* + k)$. Recall that
one of the most logical assumptions we can make on absolute risk
aversion is that it decreases with wealth, in which case we would have
$R_a(w_0 - (qC^* + k) - L + C^*) > R_a(w_0 - (qC^* + k))$. This is due to the fact
that under a marginally loaded premium we know that only partial
insurance coverage will be purchased, and so we must have a greater
wealth in state 1 than in state 2, that is, $w_0 - (qC^* + k) - L + C^* <
w_0 - (qC^* + k)$. So, if absolute risk aversion is decreasing, then absolute
risk aversion would be greater in state 2 than it is in state 1, and
the condition would then indicate that an increase in risk-free wealth
should decrease the optimal insurance purchase.
For many, the result that insurance demand decreases as wealth
increases is somewhat baffling, as it seems to state that insurance
(something that is very commonly purchased all over the world, and
obviously of great value) is an inferior good! However, there is a
clear logic and intuition for the result. If, as we are supposing, our
individual becomes less risk averse as he gets wealthier, then he will
have less need for insurance the wealthier he gets (since, being less risk
averse, he is more willing to take on risks by himself). Of course, this
assumes that along with the increase in wealth there is no increase in
the risk itself. You might want to attempt to work out yourself what
happens if the size of the risk increases along with the individual's
wealth.
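This comparative static can be illustrated numerically. The sketch below is purely illustrative: it assumes logarithmic utility (which displays decreasing absolute risk aversion) and hypothetical values of $p$, $q$, $k$ and $L$, and solves the first-order condition (4.3) for several levels of risk-free wealth.

import numpy as np
from scipy.optimize import brentq

p, q, k, L = 0.25, 0.35, 1.0, 60.0   # hypothetical parameters, with q > p
u_prime = lambda w: 1.0 / w          # marginal utility for u = ln

def optimal_coverage(w0):
    # First-order condition (4.3): p(1-q)u'(loss state) - (1-p)q u'(no-loss state) = 0
    foc = lambda C: (p * (1 - q) * u_prime(w0 - (q * C + k) - L + C)
                     - (1 - p) * q * u_prime(w0 - (q * C + k)))
    return brentq(foc, 0.0, L)

for w0 in (100.0, 120.0, 140.0):
    print(f"w0 = {w0:.0f}, C* = {optimal_coverage(w0):.2f}")
# Under decreasing absolute risk aversion, C* falls as w0 rises.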
Given that, for the logical assumption on risk aversion, insurance
is an inferior good, we already know (from the Slutsky equation) that
a change in the unit price of insurance (here denoted by $q$) will also
give an ambiguous result: inferior goods may or may not be Giffen.
However, let us look quickly at the effect of an increase in the fixed
part of the price of insurance, $k$. Following the same initial steps as
before, we need to find the sign of
$$\frac{\partial^2 Eu}{\partial C\,\partial k} = -p(1-q)u''(w_0 - (qC^* + k) - L + C^*) + (1-p)qu''(w_0 - (qC^* + k)) = -\frac{\partial^2 Eu}{\partial C\,\partial w_0}$$
where we have used (4.5). Thus, if $\partial^2 Eu/\partial C\,\partial w_0 < 0$ due to decreasing
absolute risk aversion, then we have $\partial^2 Eu/\partial C\,\partial k > 0$, and an increase in
the fixed component of the price of insurance will actually increase the
demand for insurance! Why is this? Again, the logic is quite easy. Note
that the fixed component of the price of insurance is nothing more
than a loss in risk-free wealth regardless of whether or not an accident
occurs. Thus an increase in $k$ is exactly equivalent to a decrease in
$w_0$. Running our previous argument in reverse, this decrease in risk-free
wealth would make the individual more risk averse, and thereby
increase his insurance purchase.
In actual fact, it is somewhat unfair to label insurance as an
inferior good when absolute risk aversion is decreasing. In this
insurance model, the only decision that the insurance consumer takes
is how much insurance to purchase. This is a little different from the
traditional consumer model in which a decision is taken on at least
two goods. If we introduce a second good into the insurance model, it
becomes unclear whether or not insurance is inferior.

4.3 Precautionary savings

In all that we have done up to now, we have considered risks that
appear simultaneously with the choice. That is, we have been dealing
entirely with a single period. However, it is natural that we take a
look at a model that includes a time dimension, as time is the real
source of risk and uncertainty. Thus, in this sub-section we look at the
case of an individual who has two periods to live, and must choose
how to best consume the income that he earns over the two periods.
Specifically, there is the option of passing money from one period to
the other in the form of savings or loans (which are nothing more than
negative savings, and so we will only refer to savings from now on)
using a rudimentary financial system.
What we will do is to first briefly consider the optimal inter-temporal
choice under complete certainty, in order that we can then
look at how the introduction of risk affects that choice. Of course, we
will also be interested in how risk aversion affects the optimal choice.
There are a number of other comparative statics effects that would
be interesting to consider, although some of these are left to the end
of chapter problems.

Optimal savings choices under certainty


Assume that our subject has two periods in which to live: the current
period (denoted period 1) and the next one (period 2). He receives a
monetary income of $y_i$ in period $i = 1, 2$. There is a single consumption
good, the price of which is normalised to 1. Consumption in period
$i$ is denoted $c_i$. Utility is separable over the two periods, such that if
consumption is $c = (c_1, c_2)$ then total utility is given by $U(c) = u(c_1) +
\delta u(c_2)$, where $\delta$ is the inter-temporal discount factor ($0 \leq \delta \leq 1$).
Note that we are assuming that the utility function for consumption,
$u$, is the same in each of the two periods. We assume that $u$ is strictly
increasing and strictly concave. The consumer's objective is to allocate
consumption over the two periods so as to maximise $U(c)$. In order
to differentiate with the risky scenario that we shall study below (for
which we will indicate optima using asterisks), let's call the certainty
solution $c^0$.
There exists a financial system in the model, under which money
can be transferred over periods at an interest rate of $r$. The financial
system is perfectly competitive and functions without frictions, and
so the same interest rate is applied to savings as to loans. Specifically,
if an amount $s$ of period 1 income is saved (i.e., not consumed), then
on top of the period 2 income of $y_2$ the individual will have $(1+r)s$
dollars to spend in period 2. If savings are negative, that is, period
1 consumption is greater than period 1 income (i.e., a loan is taken),
then the loan principal and interest must be paid out of period 2
income in the same way as savings were liquidated. Since there are
no further periods after period 2 (the individual "dies" at the end
of period 2), it is impossible to loan money in period 2. That is, all
financial assets are liquidated in period 2. This is reflected in the inter-temporal
budget constraint that must be imposed on the problem,
which is that the present financial value of consumption must not
exceed the present financial value of income;
$$c_1 + \frac{c_2}{1+r} \leq y_1 + \frac{y_2}{1+r} \qquad\Longleftrightarrow\qquad c_2 \leq (1+r)(y_1 - c_1) + y_2$$
Given that, the problem can be expressed as
$$\max_{c}\; U(c) = u(c_1) + \delta u(c_2) \quad \text{subject to} \quad c_2 \leq (1+r)(y_1 - c_1) + y_2$$
The problem can be represented graphically in $(c_1, c_2)$ space. The
budget constraint is a straight line with slope equal to $-(1+r)$,
and the indifference curves are downward sloping convex curves. The
optimal choice is where an indifference curve is tangent to the budget
constraint. The marginal rate of substitution can be found from the
utility function using the implicit function theorem, and it is
$$\left.\frac{dc_2}{dc_1}\right|_{dU=0} = -\frac{u'(c_1)}{\delta u'(c_2)}$$
The solution vector is found as the solution to the two simultaneous
equations
$$\frac{u'(c_1^0)}{\delta u'(c_2^0)} = 1+r, \qquad c_2^0 = (1+r)(y_1 - c_1^0) + y_2$$
We shall write the first of these equations (the tangency condition) as
$$u'(c_1^0) = \delta(1+r)u'(c_2^0)$$
Exercise 4.4. Show that if $\delta = \frac{1}{1+r}$, the consumer will consume
the same in each period. Further, if the financial system is
costless ($r = 0$) and the individual is infinitely patient ($\delta = 1$),
show that consumption in each period is exactly the average
of total income. How do the two consumption choices relate to
each other when $\delta > \frac{1}{1+r}$ and when $\delta < \frac{1}{1+r}$? Can you give
some economic intuition for these results?

Answer. If $\delta = \frac{1}{1+r}$ the tangency condition becomes $\frac{u'(c_1^0)}{u'(c_2^0)} = 1$.
Given concave utility, this is just the same as saying $c_1^0 = c_2^0$,
that is, the consumer will consume the same in each period. In
this case, consumption $c^0$ can be calculated from the budget
constraint; $c^0 = (1+r)(y_1 - c^0) + y_2$, which solves out to $c^0 =
\frac{(1+r)y_1 + y_2}{2+r}$. With $r = 0$ and $\delta = 1$, again we get $\frac{u'(c_1^0)}{u'(c_2^0)} = 1$
and so again $c_1^0 = c_2^0 = c^0 = \frac{(1+r)y_1 + y_2}{2+r}$, but now substituting
$r = 0$ gives $c^0 = \frac{y_1 + y_2}{2}$. When $\delta > \frac{1}{1+r}$ we have $\frac{u'(c_1^0)}{u'(c_2^0)} > 1$, or
$c_1^0 < c_2^0$, and $\delta < \frac{1}{1+r}$ gives $c_1^0 > c_2^0$. The logic of these results
is that $\delta$ describes the individual's patience in waiting a period
to consume, while $\frac{1}{1+r}$ in a sense describes the patience of the
banking system in transferring money from one period to the
next. When the individual is more patient than is the banking
system, it is worthwhile for him to be the one that refrains from
consuming so much in the first period, and when the opposite
happens, it is worthwhile for him to bring money from period 2
into period 1 so he doesn't have to wait so long to consume.

The above description of the problem mimics the traditional
microeconomic framework for individual choice theory. However, the
inter-temporal problem with only two periods can be more easily
studied using a change of variable that reflects savings. This is because
the actual decision of the individual is how much to save in period 1,
and in period 2 no decision is actually taken since all income is simply
consumed. Thus by structuring the problem in terms of savings, we
can reduce the problem to one of a single choice variable rather than
one of two choice variables.
Specifically, define period 1 savings as $s = y_1 - c_1$. In that way,
the problem is now
$$\max_{s}\; U(s) = u(y_1 - s) + \delta u(y_2 + (1+r)s) \quad \text{subject to} \quad -\frac{y_2}{1+r} \leq s \leq y_1$$
The restrictions on $s$ reflect the maximum and minimum values of
savings given the amount of income available in each period (you
cannot save more than what you have in period 1, and you cannot
loan more than what you can pay back in period 2).
Given that this is now a problem with only one choice variable, we
only need to check that the objective function is concave in that choice
variable, and then look for the point at which the first derivative is 0.
We have
$$U'(s) = -u'(y_1 - s) + \delta u'(y_2 + (1+r)s)(1+r)$$
$$U''(s) = u''(y_1 - s) + \delta u''(y_2 + (1+r)s)(1+r)^2 < 0$$
Since the second derivative is negative, the solution (assuming that
it falls within the bounds of the restrictions on $s$) satisfies
$$-u'(y_1 - s^0) + \delta(1+r)u'(y_2 + (1+r)s^0) = 0 \tag{4.6}$$

Exercise 4.5. Check that this equilibrium condition is still the


same as what was obtained previously when the problem was
structured in terms of periodic consumption.

Answer. The equilibrium condition in the savings version of
the model can be reordered to read $\frac{u'(y_1 - s^0)}{\delta u'(y_2 + (1+r)s^0)} = 1+r$. To
be the same condition as in the periodic consumption model,
we require $\frac{u'(y_1 - s^0)}{\delta u'(y_2 + (1+r)s^0)} = \frac{u'(c_1^0)}{\delta u'(c_2^0)}$. But since $s = y_1 - c_1$, we
have $c_1^0 = y_1 - s^0$, so the two numerators are clearly equal. And
$y_2 + (1+r)s^0 = y_2 + (1+r)(y_1 - c_1^0)$, which is just the budget
constraint equation, so indeed it is equal to $c_2^0$.
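The equivalence in exercise 4.5 can also be confirmed with a few lines of code. The sketch below is only illustrative: it assumes logarithmic utility and hypothetical values of $y_1$, $y_2$, $r$ and $\delta$, solves the savings form (4.6), and then checks that the implied consumption pair satisfies both the tangency condition and the budget constraint.

import numpy as np
from scipy.optimize import brentq

y1, y2, r, delta = 100.0, 60.0, 0.10, 0.92   # hypothetical parameters
u_prime = lambda c: 1.0 / c                  # marginal utility for u = ln

# Savings form, condition (4.6): -u'(y1 - s) + delta*(1+r)*u'(y2 + (1+r)s) = 0
foc_s = lambda s: -u_prime(y1 - s) + delta * (1 + r) * u_prime(y2 + (1 + r) * s)
s0 = brentq(foc_s, -y2 / (1 + r) + 1e-9, y1 - 1e-9)

# Consumption form: tangency u'(c1) = delta*(1+r)*u'(c2) plus the budget line
c1, c2 = y1 - s0, y2 + (1 + r) * s0
print(f"s0 = {s0:.3f}, c1 = {c1:.3f}, c2 = {c2:.3f}")
print("tangency gap:", u_prime(c1) - delta * (1 + r) * u_prime(c2))  # ~0
print("budget gap:", c2 - ((1 + r) * (y1 - c1) + y2))                # ~0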

Optimal savings choices under risk


Now we shall introduce risk into the model, to see how this affects
the optimal choice of savings. We will study two types of risk, first
the case when second period income is risky, and second, the case in
which the interest rate itself is risky. In both cases, we shall make the
assumption that the risk is a pure variance increase over the case of
certainty that we have already studied, so that we are able to analyse
the effect of pure risk upon the decision maker.

Risky second period income

In this sub-section we will assume that instead of getting $y_2$ for sure,
the individual suffers some risk on his second period income. Specifically,
we shall assume that with probability $1-p$ second period income
is $y_{21}$ and with probability $p$ second period income is $y_{22}$. We shall
assume that $y_{21} > y_{22}$, although all that is really important is that
the two are not equal. We shall also assume that this is a pure risk
compared to the certainty case studied above, that is, we assume that
$py_{22} + (1-p)y_{21} = y_2$. The problem faced by the individual is

$$\max_{s}\; U(s) = u(y_1 - s) + \delta\left[pu(y_{22} + (1+r)s) + (1-p)u(y_{21} + (1+r)s)\right] \quad \text{subject to} \quad -\frac{E\tilde{y}_2}{1+r} \leq s \leq y_1$$
Again, the second-order condition is satisfied by concavity of the
utility function, and so (assuming an interior solution) the optimal
savings is the solution to
$$-u'(y_1 - s^*) + \delta\left[pu'(y_{22} + (1+r)s^*)(1+r) + (1-p)u'(y_{21} + (1+r)s^*)(1+r)\right] = 0$$
That is,
$$-u'(y_1 - s^*) + \delta(1+r)\left[pu'(y_{22} + (1+r)s^*) + (1-p)u'(y_{21} + (1+r)s^*)\right] = 0 \tag{4.7}$$
Now, we are interested in seeing how the solution to (4.7) compares
to the solution to (4.6), that is, what is the effect of the introduction
of pure income risk? In principle, we would most likely expect that the
risk will result in more savings, since by passing income into period
2, savings is a way in which the adverse outcome of low period 2
income can be insured against. Such a savings strategy is known as
precautionary savings.

Figure 4.5 Optimal savings under certainty compared to optimal savings with a risky second period income

In order to look at the relationship between the two solutions,
think about the graph of $U(s)$ under risk. Given that it reaches its
maximum at $s^*$, which is where its slope goes to 0, and since it is
strictly concave, if it is true that $s^* > s^0$ then the slope of $U(s)$ when
considered at the point $s^0$ must be strictly positive, that is, we would
have $U'(s^0) > 0$ (where of course $s^0$ is the solution to the savings
problem under certainty, equation (4.6)). Such a solution is shown in
Figure 4.5.
So, we write out the first derivative of the risky problem's utility
at the savings point $s^0$, and we consider if it is indeed positive;
$$-u'(y_1 - s^0) + \delta(1+r)\left[pu'(y_{22} + (1+r)s^0) + (1-p)u'(y_{21} + (1+r)s^0)\right] > 0$$
From (4.6), the first term of this is equal to $-\delta(1+r)u'(y_2 + (1+r)s^0)$, and so we get
$$\delta(1+r)\left[-u'(y_2 + (1+r)s^0) + pu'(y_{22} + (1+r)s^0) + (1-p)u'(y_{21} + (1+r)s^0)\right] > 0$$
But since $\delta(1+r) > 0$ this says that we need
$$pu'(y_{22} + (1+r)s^0) + (1-p)u'(y_{21} + (1+r)s^0) > u'(y_2 + (1+r)s^0)$$
This is Jensen's inequality for a convex function (recall that we are
assuming $py_{22} + (1-p)y_{21} = y_2$). So, our result is the following; if the
marginal utility function is convex, then the individual will respond
to second period income risk with precautionary savings. Since convex
marginal utility is the same as assuming $u''' > 0$, then the individual
will be a precautionary saver whenever he is prudent.
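A quick numerical illustration of this result follows. It is only a sketch: it assumes CRRA utility, $u(c) = c^{1-\gamma}/(1-\gamma)$ with $\gamma = 2$ (so that $u''' > 0$), and hypothetical incomes, interest rate, discount factor and a mean-preserving two-point risk on second period income, and compares the solutions to (4.6) and (4.7).

import numpy as np
from scipy.optimize import brentq

y1, r, delta, g = 100.0, 0.10, 0.92, 2.0      # hypothetical parameters
p, y2_low, y2_high = 0.5, 30.0, 90.0          # pure risk around y2 = 60
y2 = p * y2_low + (1 - p) * y2_high
u_prime = lambda c: c ** (-g)

# Certainty solution s0, condition (4.6)
foc0 = lambda s: -u_prime(y1 - s) + delta * (1 + r) * u_prime(y2 + (1 + r) * s)
s_cert = brentq(foc0, -y2 / (1 + r) + 1e-6, y1 - 1e-6)

# Risky-income solution s*, condition (4.7)
foc1 = lambda s: (-u_prime(y1 - s) + delta * (1 + r)
                  * (p * u_prime(y2_low + (1 + r) * s)
                     + (1 - p) * u_prime(y2_high + (1 + r) * s)))
s_risk = brentq(foc1, -y2_low / (1 + r) + 1e-6, y1 - 1e-6)

print(f"s under certainty: {s_cert:.3f}")
print(f"s under income risk: {s_risk:.3f}   (larger, since u''' > 0)")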

Risky interest rate


Now, instead of assuming that the second period income is risky, we
assume that it is the rate of interest that is risky. Again, in order
that this is a pure addition of risk, we assume that the expected
interest rate is the same as in our certainty model. Specifically, with
probability $1-p$ the interest rate is $r_1$ and with probability $p$ the
interest rate is $r_2$, where $pr_2 + (1-p)r_1 = r$, and of course $r_1 \neq r_2$.
Now, the relevant problem is the following;
$$\max_{s}\; U(s) = u(y_1 - s) + \delta\left[pu(y_2 + (1+r_2)s) + (1-p)u(y_2 + (1+r_1)s)\right] \quad \text{subject to} \quad -\frac{y_2}{1 + E\tilde{r}} \leq s \leq y_1$$
Again, since the objective function is concave in $s$, if we assume that
the solution is interior, then the optimal savings for this problem, $s^*$,
is given by the solution to the first-order condition;
$$-u'(y_1 - s^*) + \delta\left[pu'(y_2 + (1+r_2)s^*)(1+r_2) + (1-p)u'(y_2 + (1+r_1)s^*)(1+r_1)\right] = 0$$
We are interested in whether or not it will be true that the optimal
savings from this problem is greater than the optimal savings when
the interest rate risk did not exist, that is, if $s^* > s^0$. Using the same
argument as in the previous section, this will be the case if
$$-u'(y_1 - s^0) + \delta\left[pu'(y_2 + (1+r_2)s^0)(1+r_2) + (1-p)u'(y_2 + (1+r_1)s^0)(1+r_1)\right] > 0$$
Using (4.6), we write this as
$$\delta\left[-u'(y_2 + (1+r)s^0)(1+r) + pu'(y_2 + (1+r_2)s^0)(1+r_2) + (1-p)u'(y_2 + (1+r_1)s^0)(1+r_1)\right] > 0$$
which simplifies to
$$pu'(y_2 + (1+r_2)s^0)(1+r_2) + (1-p)u'(y_2 + (1+r_1)s^0)(1+r_1) > u'(y_2 + (1+r)s^0)(1+r)$$

Note that this equation is of the form $ph(r_2) + (1-p)h(r_1) > h(r)$,
where $r = pr_2 + (1-p)r_1$. Thus the requirement is that $h(r) =
u'(y_2 + (1+r)s^0)(1+r)$ is convex in $r$, that is, we require $h''(r) > 0$.
However, we can calculate
$$h'(r) = u''(y_2 + (1+r)s^0)(1+r)s^0 + u'(y_2 + (1+r)s^0)$$
and
$$\begin{aligned}
h''(r) &= u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2 + u''(y_2 + (1+r)s^0)s^0 + u''(y_2 + (1+r)s^0)s^0 \\
&= u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2 + 2u''(y_2 + (1+r)s^0)s^0
\end{aligned}$$
Thus, $h''(r) > 0$ if
$$2u''(y_2 + (1+r)s^0)s^0 > -u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2$$
or
$$2 < -\frac{u'''(y_2 + (1+r)s^0)(1+r)s^0}{u''(y_2 + (1+r)s^0)} \tag{4.8}$$
At the last step, be careful that you understand why the inequality
direction has changed: it is because we divided by $u''$, which is
negative.
Condition (4.8) tells us several things about savings under interest
rate risk. First, it is not true that positive prudence is sufficient
for the individual to save more under interest rate risk than under
certainty. Prudence must be sufficiently high. To see this, recall that
the coefficient of absolute prudence is the third derivative of utility
divided by the second derivative and multiplied by $-1$, and then
simply write our condition in the following ways
$$0 < \frac{2}{(1+r)s^0} < -\frac{u'''(y_2 + (1+r)s^0)}{u''(y_2 + (1+r)s^0)} \quad \text{if } s^0 > 0$$
$$0 > \frac{2}{(1+r)s^0} > -\frac{u'''(y_2 + (1+r)s^0)}{u''(y_2 + (1+r)s^0)} \quad \text{if } s^0 < 0$$
The second of these is impossible under positive prudence, and so
we should conclude that if it is optimal to loan rather than to save
under certainty, then the introduction of interest rate risk cannot lead
to a smaller loan. On the other hand, if it is optimal to save some
positive amount under certainty, then the individual would save more
when the interest rate becomes risky only if prudence is greater than
the limit indicated by $\frac{2}{(1+r)s^0}$.

Figure 4.6 Effect of the value of prudence on the savings decision under a risky interest rate

What is happening here is that an interest rate risk has two
effects; first, it makes the second period income risky, and so a prudent
individual would like to save more. But second, a risky interest rate is a
worse investment than a sure-thing interest rate, and so any risk averse
investor would like less of this savings investment. Thus, because of
prudence the individual would like to save more, but because of risk
aversion he would like to save less. He ends up saving more only when
prudence is sufficiently strong.
If we assume decreasing absolute prudence, then we can draw a
graph of what we have discovered (Figure 4.6).
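Condition (4.8) can also be explored numerically. The sketch below is purely illustrative: it assumes CRRA utility, for which absolute prudence equals $(\gamma+1)/w$, and compares the optimal savings with and without a two-point interest rate risk for a low and a high value of $\gamma$; all parameter values are hypothetical.

import numpy as np
from scipy.optimize import brentq

y1, y2, delta = 100.0, 20.0, 0.95          # hypothetical incomes and discount factor
p, r_low, r_high = 0.5, 0.0, 0.2           # pure risk around r = 0.10
r = p * r_low + (1 - p) * r_high

def savings(g, risky):
    up = lambda c: c ** (-g)
    if risky:
        foc = lambda s: (-up(y1 - s) + delta
                         * (p * up(y2 + (1 + r_low) * s) * (1 + r_low)
                            + (1 - p) * up(y2 + (1 + r_high) * s) * (1 + r_high)))
    else:
        foc = lambda s: -up(y1 - s) + delta * (1 + r) * up(y2 + (1 + r) * s)
    return brentq(foc, 1e-6, y1 - 1e-6)    # these parameters give positive savings

for g in (1.0, 5.0):
    s0, s_star = savings(g, False), savings(g, True)
    prudence = (g + 1) / (y2 + (1 + r) * s0)
    threshold = 2.0 / ((1 + r) * s0)
    print(f"g={g}: s0={s0:.3f}, s*={s_star:.3f}, "
          f"prudence={prudence:.4f}, threshold={threshold:.4f}")
# Savings rise with the interest-rate risk only when prudence exceeds
# 2/((1+r)s0), in line with condition (4.8).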

4.4 Theory of production under risk


Up to now our analysis has centred on the case of an individual
consumer making decisions in a risky world. However, the theory
of choice under risk also has a rich history regarding the decisions
of producers. In this sub-section we shall take a look at one of the
principal issues in the theory of production under risk: the case of a
producer facing a risky price.
The producer in question outputs a good, say $x$, using a production
technology that is described by a cost function $c(x)$. We assume that
the cost function is strictly increasing (the greater is $x$ the greater is
the cost), convex (marginal cost is increasing in $x$), and that $c(0) = 0$.
The price at which the producer can sell his output is given by a
demand function $d(x)$, plus a random perturbation $\tilde{\varepsilon}$. The demand
function will be assumed not to have positive slope, $d'(x) \leq 0$, which
conforms to most standard examples of demand functions in economics.
We shall assume that the perturbation can take either of two
values, $\varepsilon_1$ and $\varepsilon_2$, and that the probability of $\varepsilon_1$ is known to be
$1-p$, so that the probability of $\varepsilon_2$ is $p$. We shall also assume that
the perturbation is a pure risk, that is, it has an expected value of 0;
$(1-p)\varepsilon_1 + p\varepsilon_2 = 0$, so that one of the values is positive and the
other negative. We shall assume that $\varepsilon_1 > 0 > \varepsilon_2$. The producer's
income is the random variable $\tilde{y} = (d(x) + \tilde{\varepsilon})x - c(x)$. Since there are
two feasible values of $\tilde{\varepsilon}$, there are also two feasible values of income $y$,
one corresponding to each $\varepsilon$.
We assume that the producer values income according to a utility
function, $u(y)$, which is strictly increasing and concave, so that the
producer prefers greater income to less and he is risk averse.

Our objective is to consider how the risk aversion of the producer impacts
upon his optimal decision regarding output $x$.
In essence, at least as far as preferences are concerned, this problem
looks just like those studied previously. We can draw indifference
curves in the space defined by $(y_1, y_2)$, in which case we already know
that the indifference curves are downward sloping and convex. The
only difference here is that the producer chooses a single value of $x$,
which then indirectly leads to a vector of $y$ values. Since he does not
directly choose the point in $y$ space, we need to be a little more careful
about the feasible set for the problem in $y$ space.
First, let's find the solution to this problem mathematically, and
then we will work out the graphical representation of that solution.
Our main focus of attention is, as in the precautionary savings problem,
to see how the introduction of risk, and in this case of risk
aversion, alters the solution. Intuition suggests that a risk averse
producer facing a risk will decide to produce less than if either no
risk were present or if he were risk neutral. The reason for this is that
the greater is the output, the greater would be the risk suffered, since
the risk is on price, and price multiplies the output chosen.
The producer wants to choose $x$ so as to maximise his expected
utility, which is
$$Eu(\tilde{y}) = (1-p)u(y_1(x)) + pu(y_2(x))$$
where, of course, $y_i(x) = (d(x) + \varepsilon_i)x - c(x)$ for $i = 1, 2$. The first-order
condition for an optimal choice of $x$ is
$$\frac{\partial Eu(\tilde{y})}{\partial x} = (1-p)u'(y_1(x^*))y_1'(x^*) + pu'(y_2(x^*))y_2'(x^*) = 0 \tag{4.9}$$
where $y_i'(x) = d(x) + \varepsilon_i + d'(x)x - c'(x)$. The second-order condition,
that the second derivative of expected utility be negative, is
$$\frac{\partial^2 Eu(\tilde{y})}{\partial x^2} = (1-p)\left\{u''(y_1(x^*))[y_1'(x^*)]^2 + u'(y_1(x^*))y_1''(x^*)\right\} + p\left\{u''(y_2(x^*))[y_2'(x^*)]^2 + u'(y_2(x^*))y_2''(x^*)\right\}$$
The sign of this depends on the sign of $y_i''(x) = 2d'(x) + d''(x)x - c''(x)$.
Since we have already assumed that $c''(x) > 0$, and that $d'(x) \leq
0$, the second derivative of expected utility with respect to $x$ is likely
to be negative, but we do need to assume that $d''(x)$ is not too positive
for this to happen. We shall indeed make this assumption, as if it were
not to hold, all that happens is that we would get a corner solution
(either output of 0, or output going infinite), which is neither realistic
nor interesting.
Now, if the producer were risk neutral, then $u'(y_1(x)) = u'(y_2(x))$,
and the first-order condition (4.9) would simplify to (using super-indexes
of 0 to indicate the risk neutral solution)
$$(1-p)y_1'(x^0) + py_2'(x^0) = 0$$
Substituting for $y_i'(x)$, we get
$$(1-p)\left[d(x^0) + \varepsilon_1 + d'(x^0)x^0 - c'(x^0)\right] + p\left[d(x^0) + \varepsilon_2 + d'(x^0)x^0 - c'(x^0)\right] = 0$$
or
$$d(x^0) + d'(x^0)x^0 - c'(x^0) = -\left[(1-p)\varepsilon_1 + p\varepsilon_2\right] = 0$$
So in the end, the risk neutral solution is nothing more than the
condition that marginal revenue $d(x^0) + d'(x^0)x^0$ be equal to marginal
cost $c'(x^0)$. Of course, this is also the solution when no risk exists
($\varepsilon_1 = \varepsilon_2 = 0$).
Now, let's substitute that solution $x^0$ into the first-order condition
for the risk averse producer. If the sign of the first-order condition
becomes negative, we would then know that $x^* < x^0$ (draw a quick
graph of a concave function to help you see why).
When we use the condition for $x^0$, it turns out that
$$y_i'(x^0) = d(x^0) + \varepsilon_i + d'(x^0)x^0 - c'(x^0) = \varepsilon_i$$
and so when we substitute $x^0$ into (4.9), we get
$$\frac{\partial Eu(\tilde{y})}{\partial x} = (1-p)u'(y_1(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2 \tag{4.10}$$
Now, since our assumption is $\varepsilon_1 > 0 > \varepsilon_2$, it also happens that for
any $x$ we have $y_1(x) = (d(x)+\varepsilon_1)x - c(x) > y_2(x) = (d(x)+\varepsilon_2)x - c(x)$.
Thus $y_1(x^0) > y_2(x^0)$, and since utility is concave, this implies that
$u'(y_1(x^0)) < u'(y_2(x^0))$. We can use this in equation (4.10) as follows:
$$(1-p)u'(y_1(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2 < (1-p)u'(y_2(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2$$
But the right-hand side of this is equal to
$$u'(y_2(x^0))\left[(1-p)\varepsilon_1 + p\varepsilon_2\right] = 0$$
So, in the end we have indeed proved that the solution under risk aversion
(and risk), $x^*$, is smaller than the solution under risk neutrality
(or no risk), $x^0$.
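The comparison between $x^*$ and $x^0$ can be checked with a small numerical example. The sketch below is purely illustrative: it assumes a linear demand function, a quadratic cost function, a symmetric two-point price shock and logarithmic utility, none of which come from the text.

import numpy as np
from scipy.optimize import brentq

a, b, gamma = 10.0, 0.5, 1.0               # hypothetical demand and cost parameters
p, eps_low, eps_high = 0.5, -2.0, 2.0      # pure risk: (1-p)*eps_high + p*eps_low = 0
u_prime = lambda y: 1.0 / y                # marginal utility for u = ln

d = lambda x: a - b * x                    # demand d(x), with d'(x) = -b <= 0
cost_m = lambda x: gamma * x               # marginal cost c'(x) for c(x) = 0.5*gamma*x**2
income = lambda x, e: (d(x) + e) * x - 0.5 * gamma * x ** 2
dincome = lambda x, e: d(x) + e - b * x - cost_m(x)   # y_i'(x)

# Risk-neutral / no-risk benchmark: marginal revenue equals marginal cost
x0 = brentq(lambda x: d(x) - b * x - cost_m(x), 1e-6, 50.0)

# Risk-averse first-order condition (4.9)
foc = lambda x: ((1 - p) * u_prime(income(x, eps_high)) * dincome(x, eps_high)
                 + p * u_prime(income(x, eps_low)) * dincome(x, eps_low))
x_star = brentq(foc, (a + eps_low) / (2 * b + gamma) + 1e-6,
                (a + eps_high) / (2 * b + gamma) - 1e-6)

print(f"risk-neutral output x0 = {x0:.3f}, risk-averse output x* = {x_star:.3f}")
# With these numbers the risk-averse output comes out strictly below x0.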

Exercise 4.6. Go back to those last few steps, where we substitute
$u'(y_2)$ in the place of $u'(y_1)$ to prove that $(1-p)u'(y_1(x^0))\varepsilon_1 +
pu'(y_2(x^0))\varepsilon_2 < 0$. Is it true that doing the opposite (substituting
$u'(y_1)$ in the place of $u'(y_2)$) would not indicate that
$(1-p)u'(y_1(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2 < 0$?

Answer. Although quite a subtle point, we must be very aware
of negatives in any analysis we carry out. The substitution as
performed works because $\varepsilon_1$ is positive. In that case, $0 < (1-p)u'(y_1(x^0))\varepsilon_1 < (1-p)u'(y_2(x^0))\varepsilon_1$. If we attempt the other
suggested substitution, since $\varepsilon_2$ is negative, we get $u'(y_2(x^0)) >
u'(y_1(x^0)) \Rightarrow pu'(y_2(x^0)) > pu'(y_1(x^0)) \Rightarrow pu'(y_2(x^0))\varepsilon_2 <
pu'(y_1(x^0))\varepsilon_2$, where the inequality direction switches at the
last step since we are multiplying by a negative number. Given
that, the substitution of $u'(y_1)$ in the place of $u'(y_2)$ tells us
that $(1-p)u'(y_1(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2 < (1-p)u'(y_1(x^0))\varepsilon_1 +
pu'(y_1(x^0))\varepsilon_2$, which again leads us to the same conclusion, that
is, $(1-p)u'(y_1(x^0))\varepsilon_1 + pu'(y_2(x^0))\varepsilon_2 < 0$.

Now, let's try to set up a graphical analysis of what we have just
done. The graphical space to use is the space of income vectors $(y_1, y_2)$,
and our main problem is in establishing the feasible set. The problem is
that the producer is choosing a single value of $x$ which then determines
a point in $(y_1, y_2)$ space, but he is not directly able to choose the
$(y_1, y_2)$ point. We need to establish the relationship between the points
in $(y_1, y_2)$ space and the choice of $x$. Fortunately, this is not so difficult,
although it does lead to a feasible set that is unlike any other that we
have encountered up to now.
Recall that we have
$$y_1(x) = (d(x) + \varepsilon_1)x - c(x)$$
$$y_2(x) = (d(x) + \varepsilon_2)x - c(x)$$
and since $\varepsilon_1 > 0 > \varepsilon_2$, for any $x > 0$ we have $y_1(x) > y_2(x)$. Thus
the feasible set that we are looking for is located below the certainty
line in $(y_1, y_2)$ space. When $x = 0$, $y_1(x) = y_2(x) = 0$, so the feasible
set does contain the origin of $(y_1, y_2)$ space. Furthermore, we have
$y_1(x) - y_2(x) = (\varepsilon_1 - \varepsilon_2)x$, which is larger the larger is $x$. Thus, as $x$
grows, the upper frontier of the feasible set gets further and further
away from the certainty line.
Consider how $y_i(x)$ changes with $x$:
$$y_i'(x) = d(x) + \varepsilon_i + d'(x)x - c'(x)$$
The second derivative is just
$$y_i''(x) = 2d'(x) + d''(x)x - c''(x) < 0$$
So, $y_i(x)$ is (under the assumptions made on $d(x)$ and $c(x)$) a concave
function of $x$, and so it has a maximum. This means that the feasible
set that we are looking for must be bounded, since neither $y_1$ nor $y_2$
can exceed their maximum values. If you like, say the maximum value
of $y_i$ is denoted by $y_i^{max}$, then we can draw in our $(y_1, y_2)$ space a
rectangle with sides of $y_1^{max}$ and $y_2^{max}$, and the feasible set that we are
looking for must be everywhere contained within that rectangle.
Denote by $x_i$ the value of $x$ that maximises $y_i(x)$. Under our
assumptions on $d(x)$ and $c(x)$, $x_i$ satisfies
$$d(x_i) + \varepsilon_i + d'(x_i)x_i - c'(x_i) = 0$$
If we apply the implicit function theorem to this, we can see that
$$\frac{dx_i}{d\varepsilon_i} = -\frac{1}{2d'(x_i) + d''(x_i)x_i - c''(x_i)} > 0$$
where we have used the fact that $2d'(x_i) + d''(x_i)x_i - c''(x_i) < 0$ from
the second-order condition for our main maximisation problem. The
fact that $\frac{dx_i}{d\varepsilon_i} > 0$ indicates that as $\varepsilon$ increases, so does the value of $x$
that maximises $y(x)$. So it turns out that $x_1 > x_2$.
Now, all of this indicates that we know that for any $x < x_2$ both $y_1$
and $y_2$ are increasing with $x$. Thus the frontier of the feasible set over
this range of values of $x$ must be positively sloped. Since it started at
$y_1 = y_2 = 0$, and since it both lies below, and slopes away from, the
certainty line, over this range of values of $x$ the frontier of the feasible
set must be an increasing function of slope less than 1. But then, when
$x$ goes above $x_2$ but without yet reaching $x_1$, we know that $y_2$ is now
decreasing in $x$ while $y_1$ is still increasing in $x$. Thus over the range of
$x$ values such that $x_2 < x < x_1$, the frontier of the feasible set must
now take negative slope in $(y_1, y_2)$ space. Finally, when $x$ reaches $x_1$,
$y_1$ will have reached its maximum, and so for any larger values of $x$
both $y_1$ and $y_2$ are decreasing. This implies that the frontier of the
feasible set bends backwards (positive slope again). Such a frontier,
along with all of the relevant elements of this analysis, is shown in
Figure 4.7.

Figure 4.7 Feasible set for the risky production problem

The only really relevant section of the feasible set is the negatively
sloped part. This is because the indifference curves in $(y_1, y_2)$ space
are negatively sloped, and so when we maximise utility on this feasible
set the optimal point must turn out to be on the negatively sloped
section, that is, we know that whatever value of $x$ maximises utility,
it must satisfy $x_2 < x < x_1$. The solution is shown in Figure 4.8,
where the indifference curve is tangent to the feasible set boundary.

Figure 4.8 Optimal production choice under risk aversion, and under risk neutrality

Figure 4.8 also shows the risk neutral solution, $x^0$. The risk neutral
indifference curves are straight lines with slope $-\frac{1-p}{p}$, which, of course,
is the slope of the indifference curves of the risk averse problem as they
cross the certainty line. It is this property that leads to the risk averse
solution, $x^*$, locating to the north-west of the risk neutral solution $x^0$.
If you imagine the risk averse indifference curve going through $x^0$, it
would have to be less steep than the risk neutral indifference curve at
that point. Thus the risk averse problem must find its maximum at a
smaller value of $x$.

The newsboy problem


There are several ways in which the theory of the producer can be
affected by risk. We have studied above one option for the case of
price risk. Another option is the case of a risk on the production
technology (the cost function), which you are asked to look at in
problem 10. However, there is a particular problem that has been
well commented upon, and that in its two-dimensional version provides a
wonderful example of a problem in which the objective function is
piecewise. In this section we shall look at this problem, and provide
its solution.
The problem is as follows. The seller of a good, $x$, faces a risky
demand. Demand is either high, $x_1$, which occurs with probability
$1-p$, or demand is low, $x_2$, which occurs with probability $p$. Of
course, $x_1 > x_2$. The seller does not produce good $x$ but rather buys
units of it to then sell on to his customers. The cost of each unit of $x$
to the seller is $c$. The price at which the seller offers the good to his
customers is not a choice variable, and it is set at $q$, which is strictly
larger than $c$ and which is independent of the final value of $x$. The
good $x$ is perishable, in the sense that any stock that is left over at the
end of the day's trading becomes worthless. It is for this reason that
this problem is often called the "newsboy problem": newspapers are
brought in to sell, the demand is stochastic, the price is not decided
by the news-stand owner, and leftover newspapers are worthless the
next day. The problem is, how many newspapers should be ordered
in at the start of each day in order to maximise the expected utility
of the profits from the business?
To begin with, note that if the newsboy orders in $x < x_2$, then he
suffers no risk at all. Sales will always be at least $x_2$, and so in this
case the newsboy will certainly sell all the papers that he orders. On
this interval of newspaper orders, the expected utility of the newsboy
is
$$Eu(x)|_{x < x_2} = u(qx - cx) = u(x(q-c))$$
Since the derivative of this with respect to $x$ is strictly positive, over
the interval $x < x_2$ expected utility is strictly increasing.
Second, consider the interval $x > x_1$. Over this interval, the newsboy
will certainly sell $x_2$ newspapers, and may sell $x_1$ newspapers,
but any newspapers over $x_1$ will never sell. Expected utility on this
interval is
$$\begin{aligned}
Eu(x)|_{x > x_1} &= (1-p)u(qx_1 - cx_1 - c(x - x_1)) + pu(qx_2 - cx_2 - c(x - x_2)) \\
&= (1-p)u(qx_1 - cx) + pu(qx_2 - cx)
\end{aligned}$$
Since the derivative of this with respect to $x$ is
$$Eu'(x)|_{x > x_1} = -c\left[(1-p)u'(qx_1 - cx) + pu'(qx_2 - cx)\right] < 0$$
expected utility is decreasing over this interval.



The analysis of expected utility over the two intervals $x < x_2$
and $x > x_1$ tells us that it will never be optimal to set $x$ in either
of those intervals, since for an optimal solution we are looking for a
point at which marginal utility is 0. Thus it must be that whatever is
the optimum, it must satisfy $x_2 \leq x \leq x_1$.
Now, consider the interval between the two feasible demand levels,
$x_2 \leq x \leq x_1$. Over this interval, the newsboy would sell all of what he
orders in should the demand be $x_1$, and he would have left-over stock
if the demand turns out to be $x_2$. In this case, expected utility is
$$Eu(x)|_{x_2 \leq x \leq x_1} = (1-p)u(qx - cx) + pu(qx_2 - cx_2 - c(x - x_2)) = (1-p)u(x(q-c)) + pu(qx_2 - cx)$$
The marginal utility on this interval is
$$Eu'(x)|_{x_2 \leq x \leq x_1} = (1-p)u'(x(q-c))(q-c) + pu'(qx_2 - cx)(-c)$$
Notice that the first term on the right-hand side of this is positive,
and the second term is negative. So this could indeed be positive,
negative or equal to zero. Expected utility over this interval is concave
under the assumption that the newsboy is risk averse, $u'' < 0$ (you
can check this by calculating the second derivative and checking that
it is negative). In short, the optimal number of newspapers to order
in, $x^*$, must satisfy
$$(1-p)u'(x^*(q-c))(q-c) = pu'(qx_2 - cx^*)c$$
Let's write this as
$$\frac{u'(x^*(q-c))}{u'(qx_2 - cx^*)} = \frac{pc}{(1-p)(q-c)} \tag{4.11}$$

Now, three possibilities emerge. First, say $pc > (1-p)(q-c)$. In
this case, it must hold that $u'(x^*(q-c)) > u'(qx_2 - cx^*)$, or since
utility is concave, $x^*(q-c) < qx_2 - cx^*$. But this re-orders to $x^* < x_2$.
This is not even on the interval $x_1 > x > x_2$. What that implies
is that for all values of $x$ that are actually on the relevant interval,
expected utility is decreasing. In this case, the optimal number of
newspapers to order is $x^* = x_2$. This solution is shown in Figure 4.9.
In the figure you can see that outside of the zone between the two $x$
values the expected utility curve has been drawn as a dashed curve.
This is to indicate that in fact this curve would not correspond to
expected utility on those zones, since outside of the intermediate zone
we need to calculate expected utility differently. Indeed, we saw that
expected utility is strictly increasing below $x_2$ and strictly decreasing
above $x_1$. For example, the correct expected utility curve, with each
section correctly represented, would have a non-differentiable peak at $x_2$
in Figure 4.9.

Figure 4.9 Newsboy expected utility assuming that $pc > (1-p)(q-c)$

Second, say $pc = (1-p)(q-c)$. Following the same steps, it turns
out that the turning point for expected utility is exactly at $x = x_2$.
So again it is optimal to order in $x_2$ newspapers. This option is shown
in Figure 4.10.

Figure 4.10 Newsboy expected utility assuming that $pc = (1-p)(q-c)$

And third, if $pc < (1-p)(q-c)$ the same analysis tells us that
$x^* > x_2$, as is depicted in Figure 4.11.
What this tells us is that the choice of ordering $x^* = x_2$, or $x^* > x_2$,
hinges upon whether $pc \geq (1-p)(q-c)$ or not. But this inequality
simplifies to $c \geq (1-p)q$, or $p \geq \frac{q-c}{q}$. That is, for given values of $q$ and
$c$, the newsboy orders in only $x_2$ newspapers when the probability of the
low demand level is sufficiently high, and he would order in more than
$x_2$ newspapers only when the probability of the high demand state is
sufficiently high.
Notice that the analysis points to whether or not the newsboy
decides to bear risk. Choosing $x^* = x_2$ implies that no risk at all is
borne, and choosing $x^* > x_2$ implies that the newsboy bears some risk.
The choice of whether or not to bear risk depends only upon the size of
$c$ relative to $(1-p)q$, and not on the degree of risk aversion. However,
the degree of risk aversion will determine the size of $x^*$ for those cases
in which it is optimal to bear risk. For example, if the newsboy were
risk neutral, and if $p < \frac{q-c}{q}$, then he would order in $x_1$ newspapers
(his utility would be strictly increasing on the interval $x_2 \leq x \leq x_1$,
and since it is strictly decreasing on the interval $x > x_1$, the optimum
is at $x_1$, although it will not be a differentiable point). But if he is risk
averse, the optimal order of newspapers can certainly be less than $x_1$
even when $p < \frac{q-c}{q}$. In problem 11 you will be asked to consider the
optimal choice of $x^*$ as a function of $p$.
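The piecewise structure of the newsboy's objective lends itself to a direct numerical treatment. The sketch below is only illustrative: it assumes logarithmic utility and hypothetical values of $q$, $c$, $x_1$ and $x_2$, builds the piecewise expected utility and maximises it on a grid for a probability on each side of the cutoff $\frac{q-c}{q}$.

import numpy as np

q, c, x1, x2 = 1.0, 0.4, 120.0, 50.0   # hypothetical price, unit cost and demand levels
u = np.log                             # assumed utility function

def expected_utility(x, p):
    sale_high = min(x, x1)             # papers sold if demand is high
    sale_low = min(x, x2)              # papers sold if demand is low
    return (1 - p) * u(q * sale_high - c * x) + p * u(q * sale_low - c * x)

grid = np.linspace(1.0, x1, 5000)      # ordering beyond x1 is never optimal
for p in (0.3, 0.7):
    eu = [expected_utility(x, p) for x in grid]
    x_best = grid[int(np.argmax(eu))]
    print(f"p = {p}: cutoff (q-c)/q = {(q - c) / q:.2f}, optimal order = {x_best:.1f}")
# For p above the cutoff the order collapses to x2; below it, the newsboy
# bears some risk and orders more than x2.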
Figure 4.11 Newsboy expected utility assuming that $pc < (1-p)(q-c)$

Exercise 4.7. Assume that the newsboy's utility function is
the natural log function. Assuming that $p < \frac{q-c}{q}$, what is the
optimal number of newspapers to order? Can this ever be equal
to $x_1$?

Answer. Under logarithmic utility, the first-order condition (4.11)
becomes
$$\frac{qx_2 - cx^*}{x^*(q-c)} = \frac{pc}{(1-p)(q-c)}$$
After a minor amount of effort, this simplifies to
$$x^* = \frac{x_2 q(1-p)}{c}$$
Since the assumption of $p < \frac{q-c}{q}$ is just $c < q(1-p)$, clearly we
have $x^* > x_2$. We would only have $x^* = x_1$ if it were to be the
case that $x_1 \leq \frac{x_2 q(1-p)}{c}$, which is simply a matter of parameter
values, and so may or may not hold true.
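The closed form can be verified directly; the following few lines (with the same kind of hypothetical parameter values as in the sketch above) plug $x^* = x_2 q(1-p)/c$ back into (4.11).

q, c, x2, p = 1.0, 0.4, 50.0, 0.3                      # hypothetical values, p below (q-c)/q
x_star = x2 * q * (1 - p) / c
lhs = (q * x2 - c * x_star) / (x_star * (q - c))       # ratio of marginal utilities under log
rhs = p * c / ((1 - p) * (q - c))
print(f"x* = {x_star:.2f}, FOC (4.11) satisfied: {abs(lhs - rhs) < 1e-12}")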

Summary
In this chapter you should have learned the following:
1. The basic theory of choice under risk can be applied to many
specific questions, relating to consumers, investors, savers and
producers (to name a few).
2. The stock market, where shares in companies are traded, provides
a mechanism under which individuals can organise their
holdings of risk. In the model analysed in this chapter, there
was no price risk, only profit risk, and our investor was able to
spread his portfolio over companies with different (and risky)
profit outcomes. The main thing to note in this model is how its
solution conforms, almost exactly, with the kind of solution that
we get in any standard consumer theory model: the equilibrium
is at the point at which the indifference curve is tangent to a
budget line.
3. The classic model of transactions involving risk is found in the
insurance market. Here we have analysed the insurance decision
of an insurance consumer, that is, the demand for insurance.
We saw that insurance demand will always involve full coverage
whenever the premium is marginally fair, and so long as
there is no fixed-cost element to the premium that exceeds the
insurance consumer's risk premium. When the premium is no
longer marginally fair, we get partial coverage. In these cases it
also happens that, if absolute risk aversion is decreasing with
wealth, as the insurance consumer gets wealthier, less coverage
is demanded.
4. When more than one period is brought into the analysis, a
decision maker has the opportunity to pass money from one
period to the next in the form of savings. When it happens that
there are risks in the second period, then it may be optimal to
save from the first to the second period in order to mitigate
the effects of second period risks. Such a strategy is known
as precautionary savings. We showed that, when the second
period risk is upon income, then the decision maker will be a
precautionary saver only if she is prudent (i.e., if her marginal
utility is convex). When the second period risk is upon the
interest rate, prudence alone is a necessary but not sufficient
condition for precautionary savings. The sufficient condition is
that prudence must be sufficiently high.
5. In the chapter we also looked at production choices when the
price at which output is sold is risky. We showed that a risk
averse producer faced with such a risk will produce less output
than would a risk neutral producer (or a producer facing no
risk). The principal aspect of this problem to note is the fact that
the feasible set is not quite so trivially obtained as in some other
problems, since the producer's decision is not directly how much
income to earn in each state of the world, but rather how much
to produce, which then indirectly determines state contingent
income.
6. Finally, the newsboy problem was considered. This is an interesting
problem as when we have a two-dimensional environment,
the utility function becomes piecewise, that is, it takes different
functional forms over different intervals of the choice variable.
We showed that, depending upon the probability of low demand
being sufficiently low or not, the newsboy elects to either bear
risk or not.

Problems
1. Draw a graph of a situation of a strictly risk averse monopolist
insurer. Comment on the differences between this situation and
that of a risk neutral insurer.
2. Assume a model of a risk-neutral monopolistic insurer. What is
the expected profit that this insurer extracts from a risk averse
individual with a loss of $L$ that occurs with probability $p$? Would
the insurer prefer to insure a risk with higher or lower $p$?
3. An individual with strictly increasing and concave utility has a
lottery that pays $x_1$ with probability $1-p$ and $x_2$ with probability
$p$. Assume $0 < x_2 < x_1$. The individual can insure his
lottery with a perfectly competitive insurer. Write the equation
for the increase in expected utility that the individual receives
under the optimal insurance demand. This increase in expected
utility is a function of the probability $p$, so write the expected
utility increase as $H(p)$. Evaluate the concavity or convexity of
$H(p)$ and find the value of $p$ that would maximise $H(p)$.
4. Assume an individual with wealth $w_0$, and a risk on that wealth.
The risk is that with probability $p$, a fraction of the wealth
is lost. Assume that an insurer offers coverage such that if the
loss occurs, the indemnity paid to the individual is $C$ (which is
restricted to be no greater than the amount lost). The premium
for this contract is $qC$, where $q > p$. Analyse the optimal demand
for this individual, and the effect of an increase in wealth upon
the optimal coverage.
5. Insurance can also be studied in a graph that has the total
premium payment on one axis and the level of indemnity (or
coverage) on the other. Call the premium payment $Q$ and the
level of coverage $C$, and assume that both can be freely chosen
by the insured, subject to the insurer accepting. In your graph,
put $C$ on the horizontal axis and $Q$ on the vertical. Assume that
the insurable risk is identical to that studied in the main text.
(a) Write the expected utility of the insured individual as a
function of $Q$ and $C$. What is the marginal rate of transformation
between these two variables? Draw some indifference
curves for this utility function in the space defined
by $Q$ and $C$, taking care to show correctly their shape.
Indicate, specifically, the indifference curve corresponding
to no insurance.
(b) Repeat part (a) but for the case of the insurer (who we
are assuming is risk neutral). Indicate the zone of Pareto
improving points in the graph.
(c) Finally, locate graphically the equilibrium if the insurer
acts in a perfectly competitive market, and the equilibrium
if the insurer is a monopolist. Check (mathematically) that
the result is the same as what we obtained in the text in
the contingent claims graph.
6. Assume a two-state insurance demand scenario. The individual
in question, who, of course, is strictly risk averse, has risk-free
wealth plus a loss lottery that implies a loss only in state 2.
The insurer is risk neutral, and offers coverage against the loss
should it occur, at a constant per-unit coverage premium that is
no less than the probability of loss. That is, assuming that the
probability of loss is $p$, and that coverage of $C$ is demanded, the
premium would be equal to $qC$, where $q \geq p$. Note that there is
no fixed cost element in the premium.
(a) Draw a graph (in state-contingent claims space) of the
equilibrium point if $q = p$.
(b) Draw a graph in which $q$ is at the limit price for positive
coverage to be purchased.
(c) Now draw the locus of optimal points, one for each value of
$q$ between the two extremes of the previous two questions.
Locate (graphically) the premium that corresponds to the
maximal expected profit of the insurer.

7. Draw a graph of the solution to the savings problem under
certainty (with $s$ on the horizontal axis) for the case of positive
savings, and on a second graph draw an example of a solution
with negative savings. Indicate on your graphs the utility value
to the consumer of the existence of a financial system that allows
money to be transferred over periods at the interest rate $r$.
8. In the section on precautionary savings in the text, we specifically
assumed that the risk on the second period income was
such that the expected value of second period income was equal
to the deterministic second period income from the certainty
model ($y_2$) so that the two optimal choices can be studied. This
was done by assuming that the options for the second period
income, $y_{21}$ and $y_{22}$, are such that $py_{22} + (1-p)y_{21} = y_2$. We
can generalise this by setting the two options for second period
wealth equal to $w_2 = y_{22} - \frac{1-p}{p}z$ and $w_1 = y_{21} + z$, for some
$z \geq 0$. In such a formulation, the case studied in the text can be
found by setting $z = 0$. For the following questions, you should
assume that $y_{22} < y_{21}$, that $z \geq 0$, and that marginal utility is
convex.
(a) Check that, so long as $py_{22} + (1-p)y_{21} = y_2$, the expected
value of second period wealth is still equal to $y_2$ with this
general formulation, even if $z > 0$.
(b) What is the variance of second period wealth when we use
the general formulation? How is variance affected by an
increase in $z$?
(c) Try to calculate the effect upon the optimal level of savings
in the model with risk of an increase in $z$.

9. One can often find producers joining together in mutuals to
insure each other against price fluctuations. In order for this to
be optimal, they need to be risk averse for lotteries on price.
Consider a producer who earns profit of $B(p, x) = px - c(x)$,
where $p$ is the price at which output $x$ is sold, and the cost
function $c(x)$ is strictly increasing and concave. Assume that
the price is set in a perfectly competitive market, and so it is
independent of this producer's choice of $x$. Assume also that
output $x$ is produced using a strictly increasing and concave
production function of an input $y$, that is, we have $x = f(y)$.
Find the first- and second-order conditions for an optimal choice
of $y$. Since now we have $y$ and, therefore, $x$ as functions of $p$,
write the equation for the indirect profit function, $V(p)$. Is
this producer risk averse for lotteries on the market price?
10. A monopolist sells his output $x$ at a price that is determined
by the (inverse) demand function $D(x)$. Assume that $D'(x) <
0$. Assume that the cost function is linear, $c(x) = cx$. The
monopolist is risk averse, so that he wants to maximise the
utility of profits, under an increasing and strictly concave utility
function. Now, there exists a risk to the production technology
such that the marginal cost, $c$, is either low ($c_1$) or high ($c_2$).
Assume the probability of $c_2$ is $p$, so that the probability of
$c_1$ is $1-p$. Calculate the optimal production for this risky
environment. Consider how the optimal production compares
with what would be produced should the marginal cost be $pc_2 +
(1-p)c_1$ for sure.
11. Use the implicit function theorem in the newsboy problem to
prove that, conditional upon $x^* \geq x_2$, a decrease in the probability
of the low demand state, $p$, will lead to an increase in $x^*$.
Draw a graph of the optimal choice of newspapers to order as
a function of $p$. Consider carefully how your graph should look
at both $p = 0$ and $p = 1$, and thus make a conclusion about
whether or not $x^*$ can ever be equal to $x_1$. How does this relate
to what was done in exercise 4.7?
Part II

Risk sharing environments


Chapter 5

Risk sharing with perfect information

From now on we shall consider how risk and uncertainty can be dealt
with in a somewhat more general equilibrium setting. It will, however,
be a very simple general equilibrium, with only two economic agents
present at all times. We begin (in this chapter) with an analysis of risk
sharing between the two individuals under an assumption of perfect
information (all that is relevant to the situation is fully known by
both players), and then later (chapters 6 and 7) we shall consider
what happens when we relax the perfect information assumption.
Here then, we retain the contingent claims environment of the
previous chapter, but we adapt our graphical presentation to include
two individuals, both of whom are assumed to be strictly risk averse
(unless otherwise stated). The natural way to do this is by using the
well-known Edgeworth box diagram of intermediate microeconomics.
This is the principal graphical tool that is used throughout this
chapter. Again, the only good present in the model is money. The
initial endowment of individual $i$ is given by the vector $\bar{w}_i = (\bar{w}_{1i}, \bar{w}_{2i})$
for $i = 1, 2$. Here, $\bar{w}_{ji}$ represents the wealth of individual $i$ in state of
nature $j$, and if $\bar{w}_{1i} \neq \bar{w}_{2i}$ then individual $i$ has a risky endowment.
In what follows, we assume that state 2 is the unfavourable state for
both individuals, that is, $\bar{w}_{2i} < \bar{w}_{1i}$ for both $i = 1, 2$. This assumption
implies that total (or aggregate) wealth in state 2, $\bar{W}_2 = \bar{w}_{21} + \bar{w}_{22}$, is
strictly less than total (or aggregate) wealth in state 1, $\bar{W}_1 = \bar{w}_{11} + \bar{w}_{12}$.
Such a situation, $\bar{W}_2 < \bar{W}_1$, is known as characterising aggregate risk.
When there exists aggregate risk, the Edgeworth box is longer than
it is tall (see Figure 5.1).
Figure 5.1 An Edgeworth box under risk

In Figure 5.1, two straight lines are shown, labelled as $C_i$ for $i =
1, 2$. These are the two certainty lines for our two individuals. Any
point on the line $C_i$ offers absolute certainty to individual $i$ in the
sense that it indicates a point at which his wealth is the same in both
states of nature. The assumption that state 2 is the unfavourable state
for both individuals as far as their initial endowments are concerned
implies that the endowment point in the box, $\bar{w}$, lies strictly between
the two certainty lines. Figure 5.1 also shows the two indifference
curves corresponding to the endowment point. These two curves are
drawn not tangent to each other at the endowment, so that a mutually
beneficial area of trading opportunities exists (the lens-shaped area
between the two indifference curves).

5.1 The contract curve


As in any model of general equilibrium in an Edgeworth box, the
contract curve in contingent claims space is the set of all points that
are Pareto efficient. Graphically, these are all the points such that the
two indifference curves (one for each individual) are tangent to each
other (so that it becomes impossible to increase the utility of one
individual without decreasing the utility of the other). The fact that
both of the individuals are assumed to be risk averse has an important
consequence in the graph. It implies that at all interior points on it,
the contract curve must lie strictly between the two certainty lines,
without ever touching either. The reason for this is quite simple to
see. We know that the slope of an indifference curve where it passes
through the corresponding certainty line is equal to $-\frac{1-p}{p}$, where of
course, $p$ is the probability that state 2 occurs. The strict convexity of
the indifference curves of individual 1 then implies that it is impossible
that an indifference curve of individual 1 has slope equal to $-\frac{1-p}{p}$ at
the point at which it passes through the certainty line of individual 2
(and so the two indifference curves can never be tangent to each other
at a point on $C_2$). After all, the indifference curves of individual 1 take
the particular slope $-\frac{1-p}{p}$ only at points on $C_1$, and they never take
that slope again anywhere else. In the same way, the strict concavity
of the indifference curves of individual 2 (with respect to the origin
$O_1$), and the fact that their slope is equal to $-\frac{1-p}{p}$ at points on $C_2$,
imply that these curves can never have slope equal to $-\frac{1-p}{p}$ at points
on $C_1$. So the contract curve cannot ever touch either certainty line.
To see that in fact the contract curve lies between the two certainty
lines, consider one indifference curve of individual 2 in particular, $Eu_2(\tilde{x}) = \bar{U}$. This
curve must cut both certainty lines, but at the point at which it cuts
$C_1$ it must be less steep than it is at the point at which it cuts $C_2$.
When it cuts $C_1$, it is flatter than the indifference curve of individual 1
at that same point. But a similar argument suffices to show that where
this particular indifference curve cuts $C_2$, it must be steeper than the
indifference curve of individual 1 passing through that same point. So
as we move along that indifference curve of individual 2, the marginal
rate of substitution of individual 1 is less than that of individual 2
at $C_1$ (remember that the $MRS$ are negative numbers), and greater
than that of individual 2 at $C_2$. By the intermediate value theorem,
at some point between the two certainty lines, the two marginal rates
of substitution must be equal.
The logic of why the contract curve cannot touch either certainty
line (at interior points in the Edgeworth box) is also easy to appreciate.
Since both individuals are strictly risk averse, it cannot be efficient
for only one of them to accept all of the risk.
There are a couple of straightforward exceptions to the rule that
the contract curve cannot touch the certainty lines, but they must
imply a small change in the underlying assumptions. First, if one of
the individuals is risk neutral, then the contract curve will coincide
entirely with the certainty line of the other individual (the risk averse
one). The risk neutral player has linear indifference curves, whose slope
is equal to $-\frac{1-p}{p}$. Therefore, the tangencies between these lines and
the indifference curves of the other individual must occur along the
risk averse player's certainty line. Again, the intuition is easy: if
there is a risk neutral individual and the rest are risk averse, then it is
efficient that the risk neutral person accepts all the risk, since doing
so is costless to him. In essence, this is why the equilibria that we
obtained in the analysis of insurance with marginally fair premia, both
under competitive and monopolistic insurers, occur on the certainty
line of the insured individual. That certainty line is the contract curve
for the problem.
Second, even though both individuals are risk averse, there is still a
special case in which the contract curve touches the certainty lines. It
is the case in which there is no aggregate risk, that is, when $\bar{W}_1 = \bar{W}_2$.
In this case, the Edgeworth box becomes a perfect square, and the
two certainty lines coincide along the diagonal of the box. But then
any point on the certainty line of individual 1, where his indifference
curve has slope equal to $-\frac{1-p}{p}$, is also a point on the certainty line
of individual 2 where his indifference curve has that same slope, since
the two certainty lines have in fact become one and the same. Thus,
the contract curve coincides exactly with the certainty line. Again,
this is logical, since if there is no aggregate risk, it is always possible
to distribute the state contingent wealth among the two individuals
in such a way that neither faces any risk at all, which is efficient if
both are risk averse.
Aside from these two special cases, the rst one of which we
shall encounter again when we discuss asymmetric information, the
contract curve cannot touch either certainty line at an interior point.
From now on we shall assume that this is so (so we are assuming two
strictly risk averse individuals, and the existence of aggregate risk).
However, we should point out that the above result holds only for strictly interior points, that is, when both individuals consume a strictly positive amount of wealth in each state of nature. A logical question is, what happens at the origins of the box? It turns out that the contract curve may touch one or both of the origins ($O_1$ and $O_2$), but this cannot be guaranteed. This is somewhat curious, since it is very tempting to draw contract curves that pass through both origins.
Let's consider this aspect of the contract curve before moving on.

We know that the contract curve is formed by all the points at which the two marginal rates of substitution are equated. Since the marginal rate of substitution of individual $i$ is

$$MRS_i = -\frac{(1-p)\,u_i'(w_1^i)}{p\,u_i'(w_2^i)} \quad \text{for } i = 1, 2$$

it can be seen that the contract curve is the set of points with $w_j^1 + w_j^2 = \bar W_j$ for $j = 1, 2$ that satisfy

$$\frac{(1-p)\,u_1'(w_1^1)}{p\,u_1'(w_2^1)} = \frac{(1-p)\,u_2'(w_1^2)}{p\,u_2'(w_2^2)}$$

Note that the probabilities cancel, so that the contract curve satisfies

$$\frac{u_1'(w_1^1)}{u_1'(w_2^1)} = \frac{u_2'(w_1^2)}{u_2'(w_2^2)} \tag{5.1}$$

Curiously then, the position and slope of the contract curve are independent of the probabilities of the states of nature.1
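As a rough numerical illustration of condition (5.1), the following sketch traces the contract curve for one particular pair of utility functions. It is not from the text: the aggregate wealths and the two (logarithmic and CRRA) utilities are assumptions chosen only for the example. In the code, w1 and w2 denote individual 1's wealth in states 1 and 2, with individual 2 receiving the remainder of the aggregate wealth in each state.

```python
# A minimal sketch, not from the text: tracing the contract curve of equation (5.1).
# The aggregate wealths and the two utility functions are illustrative assumptions.

Wbar1, Wbar2 = 10.0, 6.0           # aggregate wealth in states 1 and 2 (Wbar1 > Wbar2)

u1p = lambda w: 1.0 / w            # u1'(w): logarithmic utility for individual 1
u2p = lambda w: w ** (-2.0)        # u2'(w): CRRA utility with R = 2 for individual 2

def gap(w1, w2):
    """Condition (5.1) written as a difference: it is zero on the contract curve."""
    return u1p(w1) * u2p(Wbar2 - w2) - u1p(w2) * u2p(Wbar1 - w1)

def w2_on_curve(w1, tol=1e-10):
    """Given individual 1's state-1 wealth, find his state-2 wealth by bisection."""
    lo, hi = 1e-9, Wbar2 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(w1, lo) * gap(w1, mid) <= 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

for w1 in [2.0, 4.0, 6.0, 8.0]:
    w2 = w2_on_curve(w1)
    # Each traced point lies strictly between the certainty lines: w2 < w1 and
    # Wbar1 - w1 > Wbar2 - w2.
    print(f"w1 = {w1:4.1f}  ->  w2 = {w2:6.3f}")
```

Changing the probabilities of the two states would leave the traced points unchanged, in line with the observation that the contract curve does not depend on them.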
Using (5.1) we can easily see what happens along the borders of the box, and in particular, what happens at the origins. Take, for example, the lower axis of the Edgeworth box, that is, the axis that sets individual 1's state 2 wealth equal to 0 (and thus individual 2's state 2 wealth equal to $\bar W_2$). The contract curve must establish exactly how state 1 wealth should be shared when state 2 wealth is shared in this way. When we look at interior points close to the lower axis, the contract curve will involve points at which the two marginal rates of substitution are equal, and that contract curve must touch the lower axis at some point. Let us call that point $w_1^1 = a$, as is the case for contract curve $a$ shown in Figure 5.2.

If point $a$ is not the origin $O_1$ (as in curve $a$ in Figure 5.2), then the contract curve will actually follow the lower border of the Edgeworth box until it reaches $O_1$, but those points will be corner solutions rather than tangency solutions between indifference curves. What is more interesting are the cases like contract curve $b$ in Figure 5.2, where there are no corner solutions, and the contract curve converges to the origin. We might wonder when this kind of contract curve eventuates.
1 Obviously, this is true only when both individuals share the same probabilities (i.e., the case is one of risk, rather than uncertainty).
[Figure: an Edgeworth box with origins $O_1$ and $O_2$ and certainty lines $C_1$ and $C_2$, showing two possible contract curves; contract curve $a$ meets the lower axis at $w_1 = a$, while contract curve $b$ converges to the origin $O_1$.]

Figure 5.2 Two feasible types of contract curve

To differentiate from the kind of contract curve that involves corner solutions, we shall say that a contract curve like curve $b$ in Figure 5.2 "converges" to the origin $O_1$. If a contract curve does converge to an origin, then (in limit, i.e., arbitrarily close to the origin) the marginal rates of substitution of the two individuals are equal.
First, consider the point $O_1$, which sets $w_1^1 = w_2^1 = 0$ and $w_j^2 = \bar W_j$ for $j = 1, 2$. Now, since $\bar W_1 > \bar W_2$, and utility is concave, it is clearly true that

$$\left.\frac{u_2'(w_1^2)}{u_2'(w_2^2)}\right|_{w_j^2 = \bar W_j} = \frac{u_2'(\bar W_1)}{u_2'(\bar W_2)} < 1$$

On the other hand, if $0 < u_1'(0) < \infty$, then

$$\left.\frac{u_1'(w_1^1)}{u_1'(w_2^1)}\right|_{w_1^1 = w_2^1 = 0} = 1$$

So, conditional upon $0 < u_1'(0) < \infty$, it becomes impossible that the contract curve converges to the origin $O_1$, since the marginal rates of substitution are different at that point. Following an identical argument, conditional upon $0 < u_2'(0) < \infty$, it turns out that the contract curve also cannot converge to the origin $O_2$. Given this, let's think a bit more about the condition, $0 < u_i'(0) < \infty$, that marginal utility with zero wealth is positive and finite.

Given that we are assuming the utility function to be concave, we know that the marginal utility is decreasing. But if it is also to be positive for positive wealth levels, we must directly eliminate the option that $u_i'(0) = 0$. So, as far as the relationship between the contract curve and the origins of the box is concerned, the only case that we need to consider is the possibility that2 $u_i'(0) = \infty$.
If $u_i'(0) = \infty$ then the contract curve converges to the origin $O_i$. To see why, assume that individual 1's utility function is characterised by $u_1'(0) = \infty$. Now, what is the value of $w_1^1$ that corresponds to a Pareto efficient allocation in which $w_2^1 = 0$ and $w_2^2 = \bar W_2$? Graphically, the question is, conditional upon being on the lower axis of the box, which point is Pareto efficient? Let the value of $w_1^1$ that we are searching for be denoted by $w_1^1 = a$. Whatever is the value of $a$, it is necessarily true that $a \le \bar W_1 - \bar W_2$, since we know that the point in question can never lie beneath the certainty line of individual 2 (recall, the contract curve always lies strictly between the two certainty lines), and this certainty line touches the lower axis at the point $w_1^1 = \bar W_1 - \bar W_2$.

Now, at the relevant point we get

$$\left.\frac{u_1'(w_1^1)}{u_1'(w_2^1)}\right|_{w_1^1 = a \le \bar W_1 - \bar W_2,\; w_2^1 = 0} = \frac{u_1'(a)}{u_1'(0)}$$

On the other hand, at this point individual 2 gets state 1 wealth of $w_1^2 = \bar W_1 - a \ge \bar W_1 - (\bar W_1 - \bar W_2) = \bar W_2 > 0$, and state 2 wealth of $\bar W_2 > 0$. Thus, at the point in question we get

$$\left.\frac{u_2'(w_1^2)}{u_2'(w_2^2)}\right|_{w_1^2 = \bar W_1 - a > 0,\; w_2^2 = \bar W_2} = \frac{u_2'(\bar W_1 - a)}{u_2'(\bar W_2)} > 0$$

Finally then, in order that the point be Pareto efficient, we must satisfy equality of the two ratios of marginal utilities, that is,

$$\frac{u_1'(a)}{u_1'(0)} = \frac{u_2'(\bar W_1 - a)}{u_2'(\bar W_2)} > 0$$
2 This should really be written as $\lim_{w \to 0} u_i'(w) = \infty$, but no confusion at all will result from the simpler expression used in the text.
But this is impossible whenever we choose $a > 0$, since it would imply that $u_1'(a) < \infty$, and thus

$$\frac{u_1'(a)}{u_1'(0)} = 0 < \frac{u_2'(\bar W_1 - a)}{u_2'(\bar W_2)}$$

So the only possible option is that, when $u_1'(0) = \infty$, we must set $a = 0$, in which case the contract curve converges3 to the origin $O_1$. Naturally, the very same argument reveals that when $u_2'(0) = \infty$ the contract curve must converge to the origin $O_2$. Of course, both could occur simultaneously.
In order to say more about the contract curve, we need to look at its slope. We will now do this, limiting ourselves to interior points only. Also, since any point on the contract curve is fully defined by the coordinates of individual 1's consumption, and since we know that whatever is individual 1's consumption, individual 2 consumes the rest of the wealth in each state, we can simplify our notation by eliminating the need to continue with the super-indexes that indicate which individual is which. Thus, we now use simply $w_i^1 = w_i$ and $w_i^2 = \bar W_i - w_i$ for $i = 1, 2$.
We begin with the equation that defines the contract curve itself, equation (5.1), which now reads as follows

$$u_1'(w_1)\,u_2'(\bar W_2 - w_2) = u_1'(w_2)\,u_2'(\bar W_1 - w_1)$$

Taking logarithms on each side, this equation is written as

$$\ln(u_1'(w_1)) + \ln(u_2'(\bar W_2 - w_2)) = \ln(u_1'(w_2)) + \ln(u_2'(\bar W_1 - w_1))$$

Now, let us define the function

$$h(w_1, w_2) \equiv \ln(u_1'(w_1)) + \ln(u_2'(\bar W_2 - w_2)) - \ln(u_1'(w_2)) - \ln(u_2'(\bar W_1 - w_1)) \tag{5.2}$$

So the contract curve, understood as a function in the space defined from the origin $O_1$, that is $w_2 = c(w_1)$, is given by the equation $h(w_1, w_2) = 0$.
3 Of course, we now get the ratio of marginal utilities of individual 1 as $\infty/\infty$, which is undefined. However, $a = 0$ is the only possible option, since a positive $a$ can never work.
From the implicit function theorem, we get the slope of the contract curve as

$$\left.\frac{dw_2}{dw_1}\right|_{dh(\cdot)=0} = -\frac{\partial h(\cdot)/\partial w_1}{\partial h(\cdot)/\partial w_2}$$

so long as $\partial h(\cdot)/\partial w_2 \ne 0$. However, it is evident that

$$\frac{\partial \ln(u'(w))}{\partial w} = \frac{u''(w)}{u'(w)} = -R^a(w)$$

where $R^a(w)$ is the Arrow-Pratt measure of absolute risk aversion. Carrying out the suggested derivatives we get4

$$\frac{\partial h(\cdot)}{\partial w_1} = -R_1^a(w_1) - R_2^a(\bar W_1 - w_1) < 0$$
$$\frac{\partial h(\cdot)}{\partial w_2} = R_2^a(\bar W_2 - w_2) + R_1^a(w_2) > 0$$

4 The subindex on $R_i^a$ corresponds to individual $i$.

Finally then, we find that the slope of the contract curve at any interior point is

$$\left.\frac{dw_2}{dw_1}\right|_{dh(\cdot)=0} = \left.\frac{dw_2}{dw_1}\right|_{cc} = \frac{R_1^a(w_1) + R_2^a(\bar W_1 - w_1)}{R_1^a(w_2) + R_2^a(\bar W_2 - w_2)} \tag{5.3}$$

Directly from (5.3) we can conclude the following.

1. Since both individuals are strictly risk averse, $R_i^a > 0$ for $i = 1, 2$, the contract curve has strictly positive slope at all interior points.

2. Since the contract curve lies between the two certainty lines, we have $w_1 > w_2$ and $\bar W_1 - w_1 > \bar W_2 - w_2$, and so the slope of the contract curve is less than 1 if both individuals have decreasing absolute risk aversion, greater than 1 if both individuals have increasing absolute risk aversion, and equal to 1 if both individuals have constant absolute risk aversion. A brief numerical illustration of (5.3) follows below.
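The following small numerical check is an illustration only, not part of the text. It evaluates (5.3) for assumed CRRA utilities, which display decreasing absolute risk aversion, and then for CARA utilities, where the slope is exactly 1.

```python
# A small numerical check (an illustration, not from the text) of equation (5.3).
# The utilities are assumed CRRA, so absolute risk aversion is Ra(w) = R / w,
# which is decreasing in wealth.

Wbar1, Wbar2 = 10.0, 6.0
R1, R2 = 2.0, 3.0                       # illustrative relative risk aversion coefficients

Ra1 = lambda w: R1 / w                  # absolute risk aversion of individual 1
Ra2 = lambda w: R2 / w                  # absolute risk aversion of individual 2

def slope_cc(w1, w2):
    """Slope of the contract curve, equation (5.3)."""
    return (Ra1(w1) + Ra2(Wbar1 - w1)) / (Ra1(w2) + Ra2(Wbar2 - w2))

# Any interior point between the certainty lines has w2 < w1 and
# Wbar2 - w2 < Wbar1 - w1; under DARA the numerator is then the smaller sum.
w1, w2 = 6.0, 3.5
s = slope_cc(w1, w2)
print(f"slope = {s:.3f}  (positive: {s > 0}, below one under DARA: {s < 1})")

# With CARA utilities Ra is a constant, and (5.3) gives a slope of exactly 1.
a1, a2 = 0.5, 1.2                       # illustrative CARA coefficients
print("CARA slope:", (a1 + a2) / (a1 + a2))
```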

5.2 Constant proportional risk sharing contracts
In real life situations that involve risk sharing, it is common to see contracts that involve constant proportional sharing. That is, given a random variable, it is agreed that a certain proportion (percentage if you like) will go to individual 1, and the rest to individual 2, whatever is the outcome of the random variable. For example, profits in businesses are often shared in this way between shareholders, where the proportion of profit that each shareholder gets is equal to the proportion of the total shares that each holds. Also, royalty contracts for writers and singers are often based on a proportional share of revenue (often the author gets about 10% of total revenue, and the distributor, i.e., the record label, book editor, etc., gets the rest).
Now clearly, proportional risk sharing is a special case, in that in general we could stipulate that the share of the random variable that each individual enjoys will be a function of the outcome itself. Simply put, in our 2-state 2-individual environment, the proportional share that individual 1 gets in state 1 need not be the same as the proportional share he gets in state 2.5 In this section we consider if proportional risk sharing is indeed a Pareto efficient mechanism, at least for the case of symmetric information under risk (rather than uncertainty).
In our state contingent claims environment, we can understand $\bar W_1$ and $\bar W_2$ as two possible outcomes of a random variable $\tilde W$, where $\bar W_2$ occurs with probability $p$. The two individuals should agree upon how the final value of $\tilde W$ is to be shared between them before the outcome is known. For example, we can understand $\tilde W$ to be the surplus that the individuals create from some sort of economic relationship in which they both participate. Given that in any efficient allocation we need to respect the condition that all of the surplus is shared, we only need to consider the share that goes to individual 1. In general, then, we are thinking about two numbers;

$$w_1 = k_1(\bar W_1) \quad \text{and} \quad w_2 = k_2(\bar W_2)$$

If the contract is to be of constant proportional sharing, then the two functions $k_j$ take a very special form. We would have

$$w_i = \alpha \bar W_i \quad \text{for } i = 1, 2$$

where $\alpha$ is the constant share going to individual 1.

5 This does in fact occur in some royalty type contracts. The author is often paid a larger share of the revenue when the revenue is large. We shall consider exactly this example later on.
But in this case, since $\alpha = \frac{w_1}{\bar W_1}$, it turns out that the contract stipulates an allocation that must satisfy

$$w_2 = \alpha \bar W_2 = \left(\frac{w_1}{\bar W_1}\right)\bar W_2 = \left(\frac{\bar W_2}{\bar W_1}\right) w_1$$

Clearly, this is a point on the diagonal of the Edgeworth box. Given this, our task is to wonder if such a point can ever be the result of an efficient risk sharing arrangement. This requires thinking about intersections between the contract curve and the diagonal passing through the Edgeworth box.
It is easy to see that, quite in general, the contract curve must have at least one point that coincides with the diagonal of the Edgeworth box. This is trivial when $u_i'(0) = \infty$ for some $i = 1, 2$, since in that case we know that the contract curve touches the diagonal at least at the origin $O_i$. For the case in which $u_i'(0) < \infty$ for both $i = 1, 2$, we need only recall that the contract curve cannot touch either certainty line. But since it also cannot pass through the origins, it must start out below the diagonal and end up above it. Thus there must be at least one internal point at which the contract curve crosses the diagonal.

So, we know that independently of the exact situation, there must always exist at least one point such that a constant proportional risk sharing contract is Pareto efficient. However, this is quite different to a statement to the effect that such a point will always be efficient. Indeed, for a very large set of logical cases, there is a single point of intersection between the contract curve and the diagonal. Simply using (5.3), we can see that if absolute risk aversion for each individual is non-decreasing, the contract curve must have a slope that is greater than or equal to 1, and so in all of those cases there can be only a single intersection with the diagonal of the box.
Other cases are also relatively easy to spot. For example, if we define $\delta \equiv \frac{\bar W_2}{\bar W_1} < 1$, then a contract on the diagonal of the Edgeworth box is $w_2 = \delta w_1$. Now, there can be only a single intersection between the contract curve and the diagonal if, at any such intersection, the contract curve has greater slope than the diagonal. Using (5.3) this requires that at the point in question

$$R_1^a(w_1) + R_2^a(\bar W_1 - w_1) > \delta\left[R_1^a(\delta w_1) + R_2^a(\delta(\bar W_1 - w_1))\right] \tag{5.4}$$

Now, define a function

$$f(\lambda) \equiv \lambda\left[R_1^a(\lambda w_1) + R_2^a(\lambda(\bar W_1 - w_1))\right]$$
Note that equation (5.4) is equivalent to $f(1) > f(\delta)$, and since $\delta < 1$, it is, therefore, sufficient that $f'(\lambda) > 0$ (for all $\lambda$ between $\delta$ and 1). Since

$$f'(\lambda) = \left[R_1^a(\lambda w_1) + R_2^a(\lambda(\bar W_1 - w_1))\right] + \lambda\left[w_1 R_1^{a\prime}(\lambda w_1) + (\bar W_1 - w_1)R_2^{a\prime}(\lambda(\bar W_1 - w_1))\right]$$

we require

$$\left[R_1^a(\lambda w_1) + R_2^a(\lambda(\bar W_1 - w_1))\right] + \lambda\left[w_1 R_1^{a\prime}(\lambda w_1) + (\bar W_1 - w_1)R_2^{a\prime}(\lambda(\bar W_1 - w_1))\right] > 0$$

which can be written as

$$\left[R_1^a(\lambda w_1) + \lambda w_1 R_1^{a\prime}(\lambda w_1)\right] + \left[R_2^a(\lambda(\bar W_1 - w_1)) + \lambda(\bar W_1 - w_1)R_2^{a\prime}(\lambda(\bar W_1 - w_1))\right] > 0$$

This equation is satisfied for sure if

$$R_i^a(z) + z R_i^{a\prime}(z) > 0 \quad i = 1, 2 \tag{5.5}$$

Again we see that if absolute risk aversion is non-decreasing then this is satisfied trivially. But even if absolute risk aversion is decreasing, the condition may still be satisfied, so long as the second term on the left-hand side is smaller in absolute value than the first term. Somewhat more directly, we can state that since $R_i^r(z) = z R_i^a(z)$, we have $R_i^{r\prime}(z) = R_i^a(z) + z R_i^{a\prime}(z)$, and so our condition is simply that

$$R_i^{r\prime}(z) > 0 \quad i = 1, 2$$

that is, that relative risk aversion be non-decreasing, a relatively common assumption.

Exercise 5.1. Reconsider the immediately preceding analysis, leading to equation (5.5). An identical analysis can be repeated under the assumption that, at any (internal) intersection between the contract curve and the diagonal of the Edgeworth box, the contract curve is less steep than the diagonal. What is the final condition that would result from such an analysis, and what would the contract curve look like in such a case if both individuals had the same utility function?
Answer. If the contract curve is less steep than the diagonal at any internal intersection, then it would hold that

$$R_1^a(w_1) + R_2^a(\bar W_1 - w_1) < \delta\left[R_1^a(\delta w_1) + R_2^a(\delta(\bar W_1 - w_1))\right]$$

Following the same steps as in the text leads us to the condition that $R_i^{r\prime}(z) < 0$ for $i = 1, 2$, that is, both players have decreasing relative risk aversion. If this were to hold, and if both individuals had the same utility function, then starting from the lower left-hand corner, the contract curve must (a) converge to origin $O_1$, (b) run above the diagonal until the mid-point of the box, whereupon it must cut through the diagonal, and finally (c) run below the diagonal but then converge to origin $O_2$. That is, the contract curve must be a sort of inverted S shape. The reason for this is that if both individuals have the same utility function then the contract curve must cut through the mid-point of the box (you will be asked to prove this in the problems below), and since the contract curve at all interior points must have positive slope, and cannot touch the certainty lines, and cannot re-intersect the diagonal, then as we go down the contract curve towards origin $O_1$, the only place for it to go is convergent towards the origin. A graph of such a contract curve is given in Figure 5.3.

Summing up, we know that if both individuals have utility functions that are characterised by non-decreasing absolute risk aversion, or by either increasing or decreasing relative risk aversion, then the contract curve has a single intersection with the diagonal of the Edgeworth box. Therefore, in those cases there exists a contract that is both Pareto efficient and that corresponds to constant proportional sharing. However, that contract is a unique point out of infinitely many. So it seems unlikely that it will be consistently chosen as the equilibrium contract. What would be far more useful is to find the case in which the contract curve is the diagonal line, so that whatever is the equilibrium contract, it will correspond to constant proportional sharing. From what we have just done, we know that the case we are looking for must correspond to decreasing absolute risk aversion, and must correspond to both non-increasing and non-decreasing relative risk aversion. It must also correspond to the marginal utility of each individual at wealth of 0 tending to infinity, since the diagonal line passes through each origin. Our analysis of the non-increasing and non-decreasing relative risk aversion cases points to the only logical candidate: the empirically relevant case of constant relative risk aversion, which we now go on to analyse.
[Figure: an Edgeworth box with origins $O_1$ and $O_2$ and certainty lines $C_1$ and $C_2$; the contract curve converges to both origins and crosses the diagonal once, at the mid-point of the box.]

Figure 5.3 Contract curve with two decreasing relative risk averse players

It can be shown that the only utility function that corresponds to constant relative risk aversion $R$ (up to positive affine transformations, and with logarithmic utility as the special case $R = 1$) is

$$u(w) = \frac{w^{1-R}}{1-R}$$

Using this function, marginal utility is

$$u'(w) = \frac{1}{w^R} \tag{5.6}$$

which clearly tends to $\infty$ as $w$ tends to 0. That is, if both individuals have constant relative risk aversion, then the contract curve converges to both origins. Clearly, then, here is a case in which the entire diagonal of the Edgeworth box can coincide with the contract curve. We now consider what happens at any strictly internal point (i.e., at points that are not at the origins).

Recall that the contract curve is defined by equation (5.2) as $h(w_1, w_2) = 0$. Substituting in the correct marginal utilities of the agents, when the relative risk aversion of agent $i$ is the constant $R_i$, this gives

$$\ln\left[\left(\frac{1}{w_1}\right)^{R_1}\right] + \ln\left[\left(\frac{1}{\bar W_2 - w_2}\right)^{R_2}\right] = \ln\left[\left(\frac{1}{w_2}\right)^{R_1}\right] + \ln\left[\left(\frac{1}{\bar W_1 - w_1}\right)^{R_2}\right]$$

which reduces directly to

$$R_1\left[\ln(w_1) - \ln(w_2)\right] = R_2\left[\ln(\bar W_1 - w_1) - \ln(\bar W_2 - w_2)\right] \tag{5.7}$$
Using this, we get the result that

$$R_1 \gtreqless R_2 \iff \ln(w_1) - \ln(w_2) \lesseqgtr \ln(\bar W_1 - w_1) - \ln(\bar W_2 - w_2)$$

that is,

$$R_1 \gtreqless R_2 \iff \frac{w_1}{w_2} \lesseqgtr \frac{\bar W_1 - w_1}{\bar W_2 - w_2}$$

or, cross-multiplying the second inequality, we get

$$R_1 \gtreqless R_2 \iff \frac{w_2}{w_1} \gtreqless \frac{\bar W_2 - w_2}{\bar W_1 - w_1}$$

But any point on the diagonal line in the Edgeworth box is defined by $w_2 = \delta w_1$, that is, the diagonal line satisfies

$$\frac{w_2}{w_1} = \frac{\delta w_1}{w_1} = \delta \quad \text{and} \quad \frac{\bar W_2 - w_2}{\bar W_1 - w_1} = \frac{\bar W_2 - \delta w_1}{\bar W_1 - w_1} = \delta$$

that is, if $(w_1, w_2)$ is on the diagonal line we get

$$\frac{w_2}{w_1} = \delta = \frac{\bar W_2 - w_2}{\bar W_1 - w_1}$$

while if $(w_1, w_2)$ is above the diagonal line we get

$$\frac{w_2}{w_1} > \delta > \frac{\bar W_2 - w_2}{\bar W_1 - w_1}$$
and if $(w_1, w_2)$ is below the diagonal line we get

$$\frac{w_2}{w_1} < \delta < \frac{\bar W_2 - w_2}{\bar W_1 - w_1}$$

In short, then, it turns out that if $R_1 > R_2$, then all points on the contract curve must satisfy

$$\frac{w_2}{w_1} > \frac{\bar W_2 - w_2}{\bar W_1 - w_1}$$

and so in this case the contract curve must lie above the diagonal line at all internal points. Likewise, if $R_2 > R_1$ then the contract curve must lie below the diagonal line at all internal points, and if $R_1 = R_2$ then the contract curve coincides with the diagonal line.
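A short sketch of this result, with purely illustrative parameter values (not from the text), solves (5.7) for $w_2$ at a given $w_1$ and reports whether the point lies above, on, or below the diagonal.

```python
# A sketch (illustrative parameters, not from the text) of the CRRA contract curve
# defined by equation (5.7), checked against the diagonal w2 = (Wbar2/Wbar1)*w1.
import math

Wbar1, Wbar2 = 10.0, 6.0
delta = Wbar2 / Wbar1                    # slope of the diagonal of the box

def w2_crra(w1, R1, R2, tol=1e-10):
    """Solve R1*(ln w1 - ln w2) = R2*(ln(Wbar1-w1) - ln(Wbar2-w2)) for w2."""
    g = lambda w2: R1 * (math.log(w1) - math.log(w2)) \
                   - R2 * (math.log(Wbar1 - w1) - math.log(Wbar2 - w2))
    lo, hi = 1e-9, Wbar2 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

w1 = 5.0
for R1, R2 in [(3.0, 1.5), (2.0, 2.0), (1.5, 3.0)]:
    w2 = w2_crra(w1, R1, R2)
    if w2 > delta * w1 + 1e-8:
        position = "above the diagonal"
    elif abs(w2 - delta * w1) < 1e-6:
        position = "on the diagonal"
    else:
        position = "below the diagonal"
    print(f"R1={R1}, R2={R2}: w2={w2:.4f}, diagonal value={delta*w1:.4f} -> {position}")
```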

[Figure: an Edgeworth box with origins $O_1$ and $O_2$ and certainty lines $C_1$ and $C_2$, showing the three possible contract curves: above the diagonal when $R_1 > R_2$, along the diagonal when $R_1 = R_2$, and below the diagonal when $R_1 < R_2$.]

Figure 5.4 Possible contract curves with two constant relative risk averse players

In Figure 5.4 we show the three options. Note that the contract curve bends towards the certainty line of the most risk averse individual. Thus the least risk averse of the two is insuring the position of the more risk averse, by accepting a larger share of the risk in any efficient contract. Note also that, in the case of two individuals with constant relative risk aversion, if one is more risk averse than the other, then there are no internal points on the diagonal that are Pareto efficient, and so in this rather likely case it is impossible that a contract with constant proportional risk sharing be Pareto efficient. Finally, as a limit case, if one of the individuals is risk neutral ($R = 0$), then the contract curve will coincide entirely with the certainty line of the other.

5.3 Sharing an increase in aggregate wealth
The final question to look at here is how exactly an increase in the aggregate amount of wealth available in one of the states of nature is shared among the two individuals. For argument's sake, let's assume that it is the aggregate wealth in state 1, $\bar W_1$, that increases, and so, starting off from a Pareto efficient risk sharing agreement (tangency of indifference curves in state contingent claims space), we wonder how the agreement will be altered by an increase in $\bar W_1$. We start by noting a most important result:

The Mutuality Principle: In a Pareto efficient risk allocation, the final wealth of each individual in each state will depend only upon the aggregate wealth in that state.

More specifically to our two-person model, the mutuality principle says that the wealth allocated to individual 1 in state $i$ can depend only upon $\bar W_i$ and not on $\bar W_j$. That is,

$$\frac{\partial w_i}{\partial \bar W_j} = 0 \quad \text{for } j \ne i$$

The mutuality principle states that in any Pareto efficient risk allocation, an increase in the aggregate wealth in one state can affect only how that state's wealth is allocated, and must leave the allocation of the aggregate wealth in the other state unchanged. We can see why this is so by reconsidering the contract curve. We can characterise any point on the contract curve using a simple constrained maximisation problem. Specifically, the problem is the following:

$$\max_{w^1, w^2} \; k_1 Eu_1(w^1) + k_2 Eu_2(w^2) \quad \text{subject to} \quad w_i^1 + w_i^2 \le \bar W_i \quad i = 1, 2$$

where $k_1$ and $k_2$ are positive arbitrary constants that capture the weighting of each individual in social welfare. This problem says that
we should maximise a weighted sum of the expected utilities of the two individuals, subject to the resource constraints for each state of nature.

Given the concavity of utility, the objective function for this problem is concave, and the feasible set is convex (since the equations defining it are linear), and so we know that a unique optimum exists. The Lagrangean for this problem is

$$\mathcal{L}(w, \lambda) = k_1 Eu_1(w^1) + k_2 Eu_2(w^2) + \lambda_1\left(\bar W_1 - w_1^1 - w_1^2\right) + \lambda_2\left(\bar W_2 - w_2^1 - w_2^2\right)$$

The first-order conditions are

$$\frac{\partial \mathcal{L}(w,\lambda)}{\partial w_1^1} = 0 \;\Longrightarrow\; k_1(1-p)\,u_1'(w_1^1) - \lambda_1 = 0$$
$$\frac{\partial \mathcal{L}(w,\lambda)}{\partial w_1^2} = 0 \;\Longrightarrow\; k_2(1-p)\,u_2'(w_1^2) - \lambda_1 = 0$$
$$\frac{\partial \mathcal{L}(w,\lambda)}{\partial w_2^1} = 0 \;\Longrightarrow\; k_1 p\,u_1'(w_2^1) - \lambda_2 = 0$$
$$\frac{\partial \mathcal{L}(w,\lambda)}{\partial w_2^2} = 0 \;\Longrightarrow\; k_2 p\,u_2'(w_2^2) - \lambda_2 = 0$$

These equations all imply that the two multipliers are strictly positive, thus as expected the solution allocates all the wealth in each state, and so $w_1^2 = \bar W_1 - w_1^1$ and $w_2^2 = \bar W_2 - w_2^1$. This allows us to go back to our original notation, so we can express the state contingent wealths of the two individuals as $w_1^1 = w_1$, $w_1^2 = \bar W_1 - w_1$, $w_2^1 = w_2$, and $w_2^2 = \bar W_2 - w_2$. Using this, the first-order conditions can be more easily combined and expressed as

$$k_1 u_1'(w_1) = k_2 u_2'(\bar W_1 - w_1) \tag{5.8}$$
$$k_1 u_1'(w_2) = k_2 u_2'(\bar W_2 - w_2) \tag{5.9}$$

Of course, dividing the first of these by the second confirms that we are looking at points such that the marginal rates of substitution of the two individuals are equal.
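To make conditions (5.8) and (5.9) concrete, here is a small worked example. It relies on the additional assumption, made here only to obtain a closed form, that both individuals have the same constant relative risk aversion $R$; the parameter values are illustrative.

```python
# An illustrative closed-form solution of the first-order conditions (5.8)-(5.9)
# under the assumption (made only for this sketch) of a common CRRA coefficient R.
# k1*w**(-R) = k2*(Wbar - w)**(-R)  implies  w = Wbar * theta / (1 + theta),
# with theta = (k1/k2)**(1/R): each state's wealth is split by the same ratio.

Wbar1, Wbar2 = 10.0, 6.0
R = 2.0                      # common coefficient of relative risk aversion
k1, k2 = 2.0, 1.0            # welfare weights

theta = (k1 / k2) ** (1.0 / R)
w1 = Wbar1 * theta / (1.0 + theta)   # individual 1's state-1 wealth, from (5.8)
w2 = Wbar2 * theta / (1.0 + theta)   # individual 1's state-2 wealth, from (5.9)
print(f"w1 = {w1:.4f}, w2 = {w2:.4f}")

# MRS equality: u1'(w1)/u1'(w2) should equal u2'(Wbar1-w1)/u2'(Wbar2-w2).
ratio_1 = (w1 ** -R) / (w2 ** -R)
ratio_2 = ((Wbar1 - w1) ** -R) / ((Wbar2 - w2) ** -R)
print(f"ratio for individual 1: {ratio_1:.6f}, ratio for individual 2: {ratio_2:.6f}")

# Mutuality principle in this example: w1 depends only on Wbar1 (through theta),
# so changing Wbar2 leaves w1 unchanged.
```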
Now consider what happens if, starting from a system in equilibrium (i.e., the first-order conditions all hold), the amount of aggregate wealth in state 1 is increased. Look at equation (5.9). If it were to happen that the increase in $\bar W_1$ were to increase the wealth allocation of individual 1 in state 2, $w_2$, then the left-hand side of (5.9) must get smaller (because the utility function is concave). But an increase in $w_2$ implies that the value of $\bar W_2 - w_2$ must get smaller, and so the right-hand side of (5.9) increases. The end result is that the left-hand side cannot equal the right-hand side, and we cannot be in an equilibrium. Of course, a similar argument holds for the case of $w_2$ getting smaller as $\bar W_1$ increases. Therefore, the only possibility is that $w_2$ is unaffected by an increase in aggregate wealth in state 1, just as the mutuality principle states. The same is true of the relationship between $w_1$ and $\bar W_2$.

Exercise 5.3. Use the two equations (5.8) and (5.9) to find exact equations for the effect of an increase in $\bar W_1$ upon the optimal allocation.

Answer. First, let's remove the utility weights by dividing (5.8) by (5.9) and cross-multiplying to get

$$u_1'(w_1)\,u_2'(\bar W_2 - w_2) = u_2'(\bar W_1 - w_1)\,u_1'(w_2)$$

As before, take logs so that this reads

$$\ln(u_1'(w_1)) + \ln(u_2'(\bar W_2 - w_2)) = \ln(u_1'(w_2)) + \ln(u_2'(\bar W_1 - w_1))$$

Now, differentiate this with respect to $\bar W_1$, to get

$$\left(-R_1^a(w_1)\right)\frac{\partial w_1}{\partial \bar W_1} + R_2^a(\bar W_2 - w_2)\frac{\partial w_2}{\partial \bar W_1} = \left(-R_1^a(w_2)\right)\frac{\partial w_2}{\partial \bar W_1} + \left(1 - \frac{\partial w_1}{\partial \bar W_1}\right)\left(-R_2^a(\bar W_1 - w_1)\right)$$

where we have used the fact that $\frac{\partial \ln(u'(w))}{\partial w} = \frac{u''(w)}{u'(w)} = -R^a(w)$. But from the mutuality principle, we have $\frac{\partial w_2}{\partial \bar W_1} = 0$, and so this simplifies to

$$\left(-R_1^a(w_1)\right)\frac{\partial w_1}{\partial \bar W_1} = \left(1 - \frac{\partial w_1}{\partial \bar W_1}\right)\left(-R_2^a(\bar W_1 - w_1)\right)$$

Multiply through by $-1$ and re-order to get

$$\frac{\partial w_1}{\partial \bar W_1} = \frac{R_2^a(\bar W_1 - w_1)}{R_1^a(w_1) + R_2^a(\bar W_1 - w_1)}.$$
The result found in Exercise 5.3 is often more usefully expressed in terms of risk tolerance, $T(w)$, which is just the inverse of risk aversion:

$$\frac{\partial w_1}{\partial \bar W_1} = \frac{\dfrac{1}{T_2(\bar W_1 - w_1)}}{\dfrac{1}{T_2(\bar W_1 - w_1)} + \dfrac{1}{T_1(w_1)}} = \frac{\dfrac{1}{T_2(\bar W_1 - w_1)}}{\dfrac{T_2(\bar W_1 - w_1) + T_1(w_1)}{T_2(\bar W_1 - w_1)\,T_1(w_1)}} = \frac{T_1(w_1)}{T_2(\bar W_1 - w_1) + T_1(w_1)}$$

Thus, an increase in aggregate state 1 wealth will be shared in such a way that individual 1 takes a proportion of the increase that is equal to the ratio of his absolute risk tolerance to the sum of the risk tolerances of both individuals in that state. Likewise, it is a simple matter to show that the mutuality principle also implies that a marginal increase in aggregate wealth in state 2 will be shared in such a way that individual 1 gets a share that is equal to the ratio of his absolute risk tolerance to the sum of the risk tolerances of the two individuals in that state.
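The sharing rule can be checked numerically. The sketch below, with illustrative CRRA utilities and welfare weights (assumptions for the example only), compares the analytic share $T_1(w_1)/[T_1(w_1) + T_2(\bar W_1 - w_1)]$ with a finite-difference estimate obtained by re-solving condition (5.8) after a small increase in $\bar W_1$.

```python
# A numerical sketch (illustrative parameters, not from the text) of the sharing
# rule: individual 1's share of a small increase in state-1 aggregate wealth is
# T1(w1) / (T1(w1) + T2(Wbar1 - w1)), where T(w) = 1/Ra(w) is risk tolerance.
# Both utilities are assumed CRRA, so Ti(w) = w / Ri.

Wbar1 = 10.0
R1, R2 = 2.0, 3.0
k1, k2 = 1.0, 1.0            # welfare weights in condition (5.8)

def w1_of(Wbar, tol=1e-12):
    """Solve (5.8): k1*w**(-R1) = k2*(Wbar-w)**(-R2) for individual 1's wealth."""
    f = lambda w: k1 * w ** (-R1) - k2 * (Wbar - w) ** (-R2)
    lo, hi = 1e-9, Wbar - 1e-9
    for _ in range(300):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

w1 = w1_of(Wbar1)
T1 = w1 / R1                         # individual 1's risk tolerance at w1
T2 = (Wbar1 - w1) / R2               # individual 2's risk tolerance at Wbar1 - w1
analytic_share = T1 / (T1 + T2)

dW = 1e-5
numeric_share = (w1_of(Wbar1 + dW) - w1) / dW
print(f"analytic share: {analytic_share:.6f}")
print(f"numeric  share: {numeric_share:.6f}")
```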

Summary

In this chapter, you should have learned the following.

1. Risk sharing between two individuals can be described in an Edgeworth box diagram.

2. The principal component of the diagram is the contract curve, the set of all points such that the marginal rates of substitution of the two individuals are equal. The contract curve traces out all of the Pareto efficient points in the graphical space.

3. If the two individuals are both strictly risk averse, and if there is aggregate risk (the aggregate amounts of wealth available over the two states of nature are not equal), then in any final equilibrium allocation that allocates strictly positive wealth to both individuals in both states of nature, both individuals must suffer some risk. That is, the contract curve cannot touch the certainty line of either individual at any strictly interior point of the box.

4. There are a great many cases in which the contract curve touches the diagonal of the Edgeworth box at only one interior point. In those cases, since the diagonal describes all allocations that involve constant proportional risk sharing, it is unlikely that constant proportional risk sharing will be consistently chosen as the equilibrium allocation.

5. Only when both individuals have constant and equal relative risk aversion does the contract curve coincide with the diagonal of the box, and so only in this case can we guarantee that a constant proportional risk sharing contract will be optimal. Specifically, if the two individuals have constant but different relative risk aversion, then a constant proportional risk sharing contract will never be optimal.

6. Aggregate risk will be shared between the two according to a sharing rule that depends upon absolute risk tolerances. Specifically, individual i will retain a proportion of any increase in aggregate wealth in state 1 that is equal to that individual's risk tolerance in state 1 divided by the sum of the two individuals' risk tolerances in state 1. A similar result holds for how increases in state 2 wealth are shared.

7. In particular, the way an increase in state i aggregate wealth is shared is independent of how much wealth is available for sharing in state j.

Problems

1. Assume that the two agents have different probability beliefs regarding the probability of the two states of nature. Specifically, assume that individual 1 believes that the probability of state 2 is $p_1$ while individual 2 believes it to be $p_2$, and, of course, $p_1 \ne p_2$. Each individual is fully informed of the probability belief of the other.

(a) Write out the equation that describes the contract curve, and evaluate its slope.
(b) Does it still hold true that the contract curve cannot touch the certainty lines of either of the two individuals at an interior point? Explain why or why not.
(c) How is the position of the contract curve affected by an increase in, say, $p_1$?

2. Assuming that both individuals have the same utility function, prove that the contract curve must pass through the point at the centre of the Edgeworth box. What is the slope of the contract curve at that point in this case?

3. Assume that a record company is contracting with a singer to record and distribute a record. Assume that both the recording company and the singer are risk averse with constant relative risk aversion, but that the singer is more risk averse than the recording company. There are two states of the world; either sales revenue of the record is high or it is low, and the contract between the two only stipulates how sales revenue is to be split between them. Is it efficient for the royalty contract to stipulate that the singer should receive a higher royalty commission for high sales than for low sales?

4. Again in the context of a singer and a recording company, many royalty contracts include an up-front payment from the company to the artist, as a payment against future royalties. That is, at the outset the company pays the artist a fee of $F$, and the contract stipulates future royalty payments as a function of sales revenue. But the company pays out royalties only if and when the total royalty payment due begins to exceed $F$. How does such a contract relate to insurance?
5. Consider an Edgeworth box under risk in which the utility function of player $i$ is $u_i(w) = w^{\alpha_i}$, for $0 < \alpha_i < 1$. Both players share the same beliefs regarding the probabilities of the states of nature. Write out the equation that defines the contract curve. Assume the special case of $\frac{\alpha_2}{\alpha_1} = \frac{1}{2}$, and solve out explicitly for the equation of the contract curve in $(w_1, w_2)$ space.
6. It is typical to draw S-shaped contract curves that go both sides of the diagonal of the Edgeworth box, with a single intersection with the diagonal, and yet still passing through the two corners. Give an example of two utility functions, one for each player, such that the contract curve would be S-shaped in an Edgeworth box under risk.

7. Assume that we are at a general equilibrium in the Edgeworth box, and then the amount of aggregate wealth available in state 1 increases marginally. Assume that the two players both have constant relative risk aversion, but that player 1 is more risk averse than player 2. Calculate the proportion of the change in aggregate state 1 wealth that is retained by player 1 as a function of the two levels of relative risk aversion and the original general equilibrium point. How is this proportion altered by changes in the levels of risk aversion of the two players?

8. Following on from problem 7, compare the proportion taken by player 1 of the increase in state 1 wealth with the proportion of total state 1 wealth that this player took at the original equilibrium. Which, if either, is greater? Sketch a graph of the proportion of state 1 wealth taken by player 1 as a function of the amount of state 1 wealth that is available.

9. Consider an Edgeworth box under risk, with aggregate risk and constant relative risk averse players. Imagine that both state 1 wealth and state 2 wealth were to increase in the same proportion. Will the allocations of each player in both states of nature increase in exactly that same proportion?
Chapter 6

Asymmetric information: Adverse selection

In the preceding chapter there were two active agents in the model, an insurer and an insured individual. Efficient risk sharing in the insurance model was dependent upon the assumption that both active economic agents have exactly the same information. Above all, we assumed that both agreed on the value of the probabilities of the states of nature. It is relatively simple to see that, if the two agents had different beliefs as to exactly what is the value of the probability of each state of nature, then nothing important changes, so long as each knows the probability belief of the other. Again, having different beliefs, where each agent is fully informed of the beliefs of the other, is not really an extension to the model, since it retains the symmetric information nature of the set-up. But what happens when at least one agent is totally ignorant of the probability belief of the other? Such a situation is known as asymmetric information.

6.1 Asymmetric information; some preliminary comments

Given the obvious importance and realism of an assumption of asymmetric information, it is clearly of interest to analyse how our economic model of risk sharing is affected by the two economic agents having different information sets. However, just before embarking upon this endeavour, it is probably worthwhile to clear up the meanings of a couple of terms that are important.
In all that follows, we are only interested in what people know and do not know, and not in whether there is disagreement as to true values. For example, if you are convinced that team A will win on Saturday with probability one half, and your friend is equally convinced that the probability of team A winning is only one quarter, and you are both informed as to the probability assessment of the other, we cannot speak of a case of asymmetric information. Everything that is relevant is known by all concerned. Thus asymmetric information as we shall study it involves situations in which at least one party is totally uninformed as to some relevant data point.

A perfect information setting is one in which all economic agents are fully informed of all relevant parameter values. On the other hand, imperfect information is a situation in which at least one agent is uninformed of the value of at least one relevant data point. If it turns out that both individuals are uninformed of the values of the same data, then we have a case of imperfect but symmetric information. Note that imperfect but symmetric information does not necessarily arise when there exists uncertainty as to a relevant data value, and both individuals estimate the probability density that they think should correspond to the unknown data. It depends on whether or not each individual knows the other's probability belief. Thus a model of pure uncertainty (or risk) is not generally a setting of imperfect information.
When imperfect information exists, it is possible that the two agents differ in what they each know. Such a scenario is a case of asymmetric information, and that is what we are interested in here. In order that things are as simple as possible, we shall only be considering here very simple asymmetric information problems, in which one agent is fully informed, and the other is informed of all relevant data except for one specific value.1

1 Cases in which the uninformed party is uninformed as to more than one data point, or when both parties are uninformed as to something, but where what is unknown to one is known to the other, are possible but unnecessarily complicated.
The model that will be used throughout this chapter is the Edgeworth box, although we will not be drawing the axes corresponding to person 2 (the top and right-hand side axes). In that way, a point on our graphs will represent the allocation from person 1's point of view (where person 1's origin is the origin of the graph). Person 1's more preferred allocations are to the north-east, while person 2's more preferred allocations are to the south-west. Our convention will always
be to represent the informed party as person 1, and the uninformed party as person 2. We shall study two important types of asymmetric information problems, where the difference concerns the nature of the data that person 2 is not informed about. First, if the relevant data is a parameter (i.e., its exact value is established exogenously, so it is not a choice variable) that is known by person 1 and not by person 2, then we say we have a problem of adverse selection. On the other hand, if the relevant data is the value of a variable that is chosen by person 1 but not observed by person 2, then we say we have a problem of moral hazard.
The basic model in which we shall analyse asymmetric information is known as the principal-agent model. The names "principal" and "agent" are borrowed from the legal literature, where they are often used to define the parties to a contract in which one person (the agent) acts upon another's (the principal's) behalf. Specific examples of principal-agent relationships are those of a lawyer (agent) and a client (principal), company executives (agents) and shareholders (principals), and author (principal) and publisher-distributor (agent).2

2 There are, of course, a great many other examples that can be cited. For some examples the relationship can be thought of in more than one way. For example, the case of a government and the public can be thought of as the public being the principal and the government being the agent, since the latter takes decisions that affect the former. However, when tax time comes around, the public becomes the agent who needs to declare earnings, and the government becomes the principal who benefits from the income declaration.

In all that follows, we take person 1 to be the agent and person 2 to be the principal, so in all of our analysis it is the agent who is fully informed, and the principal who is not. In the models that we shall study there will typically exist many different individual agents, and there may be either a single principal (in which case the principal is a monopolist) or there may be infinitely many individual principals (perfect competition). Intermediate cases of a limited number of principals can be studied using bargaining theory, but this will not be attempted here.
The word "contract" is used liberally in this, and in most other, analyses of asymmetric information. A contract can be thought of as an agreement between the two parties that captures each and every responsibility and right of the parties concerned. A contract in real life can be a complex document, but our simplified setting here implies that we can also use a very simple description of a contract. Here, since we shall restrict ourselves to the contingent claims environment, and
since in all cases it is the agent who is contracted to take actions on behalf of the principal but it is the principal who receives the financial rewards of that action, a contract need only stipulate the amount of money (commonly called the wage) that the agent will be paid in each state of nature as recompense for his services.

In all of the problems that we analyse, the principal acts first by making an offer of a set of contracts, and the agent then either accepts one of those contracts, or rejects them all. If the agent does not accept any of the contracts, then both the principal and the agent receive their endowed reservation utility. On the other hand, if a contract is accepted, then the agent carries out an action, the state of nature is revealed, and payoffs according to both the state of nature that occurs and the contracted wage for that state are realised.
When defining the equilibrium of an asymmetric information problem, it is customary to use the concept of Nash equilibrium from game theory. Concretely, we shall have an equilibrium when no principal has an incentive to alter the set of contracts that she offers, given the contracts offered by all other principals.
A problem of asymmetric information is interesting only when it is accompanied by an environment of risk or uncertainty. Here we stick with the risky environment studied in previous chapters, that is, the probabilities of the respective states of nature are known to all parties concerned. Consider for a moment what would happen if there were no risk as to the final state of nature that eventuates. In such a world, the principal contracts with an agent to carry out a well-defined task, and that task gives rise to an observable result. Assume, for example, that the effort with which the agent carries out the task is not observable to the principal, but that different effort levels give rise to different final results. Logically, the greater is the effort, the better is the final result obtained. Even though the principal cannot observe the level of effort used, in a riskless world the result obtained is sufficient for the principal to calculate, ex post, the level of effort that was used, and thus the agent can be paid accordingly. We say that in this case the result obtained is a perfect signal for the effort level, and so really we do not have a situation of asymmetric information. On the other hand, consider what happens when there is risk, in the sense that for each possible choice of effort all possible results are still feasible, according to a probability density. For example, when a salesman exerts high effort, it is likely that he makes good sales, but if he is unlucky it is still possible that he does not sell much in spite of his effort level. On the other hand, say the salesman exerts low effort, in which case it is likely that he does not sell much, and yet he may simply have a lucky day anyway and manage to make good sales in spite of his laziness.3 The point is that the result obtained becomes an imperfect signal for the level of effort used, and we have a legitimate situation of asymmetric information whose solution is no longer trivial.

3 It is easy to think of other examples; a stock agent may give good advice, and yet you still end up losing money; a football coach may make lots of mistakes in preparing the team, and yet lucky breaks during the game still give them the win.

In a situation of asymmetric information, the objective of the principal is to choose a set of contracts to offer such that the agent's best choice (or best response) among these contracts reveals the information that the principal lacks at the outset. In this chapter and the next, we consider the two basic problems in turn, to see how they are solved. However, before working through the principal-agent model proper, it is worthwhile to take a look at a couple of ideas related to adverse selection in risk-free situations.

6.2 Adverse selection without risk
An adverse selection problem occurs when the agent is informed as to the true value of some relevant parameter, and the principal is not. Consider the following simple game, first suggested by George Akerlof as a description of adverse selection. An individual has a car and he wishes to sell the vehicle in the second-hand market. The car may be of excellent quality, in which case it is worth $v_1$, or it may be faulty, in which case it is worth $v_2$, where obviously $v_2 < v_1$. Because the current owner of the car has had the opportunity to drive it and thus has learnt about the car, let us assume that he is fully informed about the true value of the car, which will be denoted by $v$, whether it is of high or low quality. Nevertheless, it is generally known that a certain proportion, say $q$, of all the cars in the second-hand market are in fact faulty. Now, a second individual enters the second-hand market looking to purchase a car, and so the two are negotiating the price, $p$, at which this particular car can be sold. We shall assume that both individuals are risk neutral.

Now, on the one hand, the current owner of the car will never accept a price that is less than the true value of the vehicle, thus we are restricted to thinking about prices that satisfy $p \ge v$. But on the other hand, due to the risk neutrality of the potential buyer, he is willing to pay, at most, the expected value of the car, and so the price must also satisfy $p \le (1 - q)v_1 + q v_2$.
But now think about the two options for the true value of the car. Say the car is really of high quality, and so its true value (known to the seller, unknown to the buyer) is actually $v_1$. But clearly it is always true that $(1 - q)v_1 + q v_2 < v_1$, and so combining all of our inequalities we see that we are looking for a price that satisfies

$$p \le (1 - q)v_1 + q v_2 < v_1 \le p$$

which is clearly impossible. Thus in this case no deal can ever be struck.

What happens if the car is really worth only $v_2$? In this case, since $(1 - q)v_1 + q v_2 > v_2$, the seller would be pleased to accept the price suggested by the buyer. But the buyer realises that it is now impossible for the car to be really worth $v_1$, since if it were, the seller would not have accepted the price $(1 - q)v_1 + q v_2$. So when the seller accepts the price $(1 - q)v_1 + q v_2$ we have a perfect signal that the car is really only worth $v_2$, and so the buyer will now only be willing to pay $v_2$. In this case, the car can be transacted, but at a price of $p = v_2$.

The fact that the only cars that can ever be sold in the second-hand market are those that are known to be of low quality is what motivated the term "adverse selection": asymmetric information can result in the market selecting only the most adverse quality as being able to be transacted.
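The logic of the example can be condensed into a few lines of code. The numbers below are illustrative assumptions, not values from the text.

```python
# A toy rendering (not from the text) of the Akerlof argument above: the buyer's
# maximum willingness to pay and the seller's reservation value determine which
# cars can actually trade.

v1, v2 = 10_000.0, 4_000.0           # value of a good car and of a faulty car
q = 0.3                              # proportion of faulty cars in the market

buyer_max = (1 - q) * v1 + q * v2    # expected value: most a risk-neutral buyer pays

for true_value, label in [(v1, "good car"), (v2, "faulty car")]:
    # The seller only accepts prices p >= true_value; the buyer only offers
    # prices p <= buyer_max (and infers quality from acceptance behaviour).
    if buyer_max >= true_value:
        # Acceptance of any price in [true_value, buyer_max] signals low quality,
        # so in equilibrium the car trades at p = v2.
        print(f"{label}: trades, but only at p = {v2:.0f}")
    else:
        print(f"{label}: no price satisfies p >= {true_value:.0f} and p <= {buyer_max:.0f}")
```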

Exercise 6.1. Can the buyer understand that when a seller rejects a price of $(1 - q)v_1 + q v_2$ then we have a perfect signal that the car is worth $v_1$? Would it be a reasonable solution to the adverse selection problem if a mechanic can be hired to reveal (with certainty) the true value of the car?

Answer. The rejection by a seller of a price that is equal to the expected value of a randomly chosen vehicle is not a signal that the car is worth $v_1$. To see why, simply consider what would happen if it were. Then it would be in the interests of sellers of $v_2$ cars to reject a price offer of $(1 - q)v_1 + q v_2$, and so at that price all sellers would reject, holding out for a price of $v_1$. If a mechanic can be hired to certify the true value of the car, then there remains the problem of who exactly would pay the mechanic. Say the mechanic's fee is $m$. If the seller pays this fee in order to sell the car along with the certificate, the seller would now require the buyer to pay $v_1 + m$, but the buyer is still only getting a vehicle of value $v_1$, and so would not be willing to pay any more than that. So the seller cannot afford to pay the mechanic. You should go through the case when the buyer pays the mechanic yourself, but the same outcome happens.

In the used car example, there is no uncertainty, and so if we are willing to think of a longer time span, then clearly we can arrive at simple mechanisms that solve the problem. For example, selling the car with a guarantee would be sufficient, so long as the true value of the car can be fully ascertained by a third party if need be (for example, a judge), and so long as the quality of the car cannot be altered by the buyer once he is the new owner. The use of a guarantee by the seller of a good quality product is an example of a signalling mechanism. Signalling is an important part of adverse selection models, and so before going ahead with the main model, it is worthwhile to look at a model of signalling.

Signalling

In a rather provocative and very influential paper in the early 1970s, Michael Spence considered the relationship between employers and employees when the latter are better informed of their underlying value to the firm (e.g., their productivity) than are the former. Specifically, say there are two types of workers, those with value $v_a$ and those with value $v_b$, where $v_a > v_b$. It is known that a certain proportion of all workers are of type $a$. Now, this is exactly the same type of adverse selection problem as in the Akerlof used car market: under perfect information (and assuming the employer is perfectly competitive, that is, the employer must earn zero profits) the firm would like to pay each worker their value, but under asymmetric information (the firm cannot observe the worker's type) the firm is afraid of paying the wage $v_a$, as it might be accepted by a type-$b$ worker, thereby generating negative profits. So without any further mechanisms in place, the high wage cannot be offered, and the upshot is that only the type-$b$ workers are employed.
However, Spence recognised that the value of a worker to a firm might very well be highly correlated with the worker's abilities in other areas of life. Specifically, Spence considered the possibility that high value workers might also be more capable students in an education environment. Given that, the level of education that a worker obtains before entering the labour market may serve as a signal of his value to a firm. To see how this might work, Spence assumed that the cost of acquiring a unit of education, $e$, is different over the two types of worker.4 Specifically, a unit of education costs $c_a$ for a type-$a$ worker and $c_b$ for a type-$b$ worker, where $c_a < c_b$. Thus, the more valuable workers are also those with the lowest costs of education.

So, given that there is assumed to be this perfect correlation between the worker's value to the firm and his costs of education, it now turns out that the level of education acquired can be used as a signal of value.5 The employer pays the wage $w_b = v_b$ to any worker who has a level of education below a specified level, $e < e^*$, and the wage $w_a = v_a$ to those workers who have a level of education that is no less than that level, $e \ge e^*$. The employer calculates $e^*$ such that the probability that $v = v_a$ conditional upon $e \ge e^*$ is equal to 1, and the probability that $v = v_b$ conditional upon $e < e^*$ is also equal to 1. Assuming that the worker's utility for money is linear (i.e. he is risk neutral), then the employer needs to set $e^*$ such that the following two conditions are satisfied:

$$v_b \le v_a - c_a e^*$$
$$v_b \ge v_a - c_b e^*$$
The first of these conditions ensures that type-$a$ workers would prefer a wage of $v_a$ together with the required level of education $e^*$ to a wage of $v_b$ with no such education requirement. The second condition ensures that type-$b$ workers have just the opposite preference: they prefer a wage of $v_b$ with no education requirement to a wage of $v_a$ together with the required education. But note that these two conditions can be re-written (respectively) as

$$e^* \le \frac{v_a - v_b}{c_a}$$
$$e^* \ge \frac{v_a - v_b}{c_b}$$

4 The costs of education are not supposed to represent the financial costs of enrolling for courses, but rather the effort costs of passing units of education. Education itself is here measured using some continuous scale. All that is really important is that two workers entering the workforce can be differentiated by who has managed to obtain the better education.

5 While education is correlated with a worker's value, the value of the worker is not in any way enhanced by education in the Spence model. In problem 2 you are asked to alter this assumption.
So, in the end, the employer needs to set an education requirement $e^*$ such that

$$\frac{v_a - v_b}{c_b} \le e^* \le \frac{v_a - v_b}{c_a}$$

Since $c_b > c_a$, there must exist levels of education that will satisfy this.
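A small sketch with illustrative numbers computes the interval of separating education requirements and confirms the self-selection of the two types. The specific values of $v_a$, $v_b$, $c_a$ and $c_b$ are assumptions made only for the example.

```python
# A short sketch (illustrative numbers, not from the text) of the separating
# condition just derived: any e* with (va - vb)/cb <= e* <= (va - vb)/ca
# separates the two worker types, and only type-a workers acquire e*.

va, vb = 100.0, 60.0         # worker values to the firm
ca, cb = 2.0, 5.0            # per-unit education costs, with ca < cb

e_low = (va - vb) / cb       # smallest requirement that deters type-b workers
e_high = (va - vb) / ca      # largest requirement type-a workers will still meet
print(f"separating requirements: {e_low:.2f} <= e* <= {e_high:.2f}")

e_star = 0.5 * (e_low + e_high)   # any value in the interval works; take the midpoint

def best_choice(cost):
    """Payoff-maximising education level for a worker with unit cost `cost`."""
    payoff_no_schooling = vb                  # wage vb with e = 0
    payoff_signalling = va - cost * e_star    # wage va after acquiring e*
    return e_star if payoff_signalling > payoff_no_schooling else 0.0

print("type-a chooses e =", best_choice(ca))   # invests in the signal
print("type-b chooses e =", best_choice(cb))   # does not invest
```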
The situation is summed up in Figure 6.1. The step function is the wage offering that is made by the firm, and the worker then looks at the vertical difference between this wage function and his own personal education cost line. The worker then selects that level of education which maximises this vertical difference. The distance $d$ is equal to $v_a - c_b e^*$, while the distance $g$ is equal to $v_a - c_a e^*$. By moving $e^*$ around, the employer alters these two distances. All he needs to do is to find a value of $e^*$ such that distance $d$ is smaller than the distance from the origin up the vertical axis to the point $v_b$, and such that $g$ is greater than the distance from the origin up the vertical axis to the point $v_b$.

[Figure: wage $w$ on the vertical axis and education $e$ on the horizontal axis, showing the education cost lines $c_a e$ and $c_b e$, the wage levels $v_a$ and $v_b$, the education requirement $e^*$, and the distance $d$.]

Figure 6.1 Separating equilibrium in the Spence signalling model

The final equilibrium result is that type-$b$ workers will invest in no education at all, $e_b = 0$, and so they will be employed at the wage $v_b$. On the other hand, type-$a$ workers will invest in exactly the level of education $e_a = e^*$, and will be employed at the wage $v_a$. No worker type has an incentive to alter his investment in education, and the employer's beliefs on who is who are confirmed at this equilibrium. In this way the signal has allowed the employer to sort the two types of workers into their correct wage categories, even though their underlying value was not observable.

Exercise 6.2. We have just seen that levels of education that would work as a signalling mechanism in the Spence model certainly exist, but is there any particular one that would be preferable in any sense?

Answer. Notice that the only players in the Spence model that are adversely affected by the existence of asymmetric information are the type-$a$ workers, since they are the ones who must now invest in a costly signal in order to obtain the wage that they would have been paid had information been symmetric. This is actually a common aspect of many of the models to follow. Since education in the Spence model has no productive element (it is merely a costly signal), there is a socially optimal level of the signal. There is no real need to get the type-$a$ workers to invest in any more education than is absolutely necessary for them to signal their type, and so the signal should be set at the minimum level that satisfies $\frac{v_a - v_b}{c_b} \le e^* \le \frac{v_a - v_b}{c_a}$, that is, the signal should be set at $e^* = \frac{v_a - v_b}{c_b}$. In this way, the type-$b$ workers will actually be indifferent between investing in education or not, but the type-$a$ workers will have a strict preference for investing in the signal. You should be unconcerned with the assumption that, even though the type-$b$ workers are indifferent between investing in the signal or not, they decide not to. Indeed, this is a common aspect of the asymmetric information models that follow. In essence, it is an argument in limits: if there really was any chance that they would resolve their indifference by investing in the signal, then we only need to set the required signal level at $\frac{v_a - v_b}{c_b} + \varepsilon$, where $\varepsilon$ is some arbitrarily small number. In the limit, we can set $\varepsilon$ to zero.

Both the Akerlof and Spence models are set in risk-free environments. That means that there is always another way to resolve the issue: we only need to pay at the end of the game rather than to contract for a guaranteed wage at the outset. For example, in the Spence model, education is required as a signal so that type-$a$ workers can contract to a wage of $v_a$ at the outset, and type-$b$ workers can contract to a wage of $v_b$ at the outset. However, since the model involves no risk, so long as the final output of any worker is known (i.e., is observable) at the end of the day, all that really needed to be contracted was a wage conditional upon the final output. Since there is no risk at all, type-$a$ workers will produce output of value $v_a$ and type-$b$ workers will produce output of value $v_b$. So, at the outset, the contract really needed only to stipulate that the worker would be paid a wage of $w = v$, where $v$ is the value of the output that the worker managed to produce at the end of the day, either $v_a$ or $v_b$, with no need for the education signal. In effect, in the same way that the level of education is a signal of a worker's type, so is the value of that worker's output. Of course, more realistic situations are cast in an environment of risk or uncertainty, and we now go on to consider exactly that type of problem.

6.3 Adverse selection with risk: The principal-agent setting
Assume that a principal would like to contract with agents to carry out a well-defined and totally observable task.6 There exists a whole population of agents that are differentiated by their type, which is a term used to summarise their talents and natural characteristics (for example, intelligence, ability to work in groups, likelihood of illness, etc.). An agent's type is not a choice variable for him, and is not observable by the principal. For simplicity, we assume that there are only two different types of agent in the population, referred to here as type-1 and type-2. The proportion of the entire population that are type-1 agents is assumed to be strictly between 0 and 1, and known by all parties concerned. Only an agent is informed of his individual type, and so an asymmetric information problem exists (since the principal only knows that there are two different agent types, the complete description of those two types, and the exact proportion of the population that each type represents, but she does not directly observe a particular agent's true type).

6 We do not restrict ourselves to cases in which the principal only wants to contract with a single agent. The idea is that all contracts are fully independent, so if one contract is profitable for a principal, then the principal would like to replicate that same contract as many times over as possible.
The proposed relationship will be carried out under risk, with
two states of nature. As always, we consider state 1 to be the better
state in the sense that whatever is the type of agent concerned, the
relationship yields a greater result for the principal in state 1 than in
state 2. Concretely, we assume that if state i occurs, then the contract
yields a payment for the principal of xi , where x1 > x2 , and where we
understand the variable x as a monetary amount.
The driving assumption in the model is that the underlying differences between the two types of agent result in them being differentiated only by the probability with which the states of nature occur. If the principal contracts with a type i agent, then the probability of state 2 is p_i, for i = 1, 2. We assume that p_1 < p_2, so that type-1 agents are better than type-2 agents, since type-1 agents manage to generate the better payoff for the principal (the more favourable state of nature) with a greater probability.
In this set-up, a contract offered by the principal consists of a vector of two numbers that indicate what the agent will receive in each of the two possible states of nature, once the state has been realised and the payoff to the principal (x) has been received. Thus a contract is a vector w = (w_1, w_2), where the agent is paid the wage w_i when the principal receives the result x_i for i = 1, 2. Notice that the contract shares the outcome of each state of nature between the two parties, the agent getting w_i and the principal getting x_i − w_i. Thus the contract is both a way to remunerate and provide incentives to the agent, and to share risk.
The principal may offer more than one contract, and allow each agent to choose between the contracts on offer. In general, we say that the set of contracts offered by a principal is the contract menu. However, it is very important to note that, since the principal cannot distinguish between different agent types, she must offer exactly the same contract menu to all agents, thus allowing all agents exactly the same choice.
Note that since there are only two types of agent, the principal will include at most two different contracts in the contract menu. The reason is clear: if she were to include more than two contracts in the menu, all but two will certainly be ignored by both types of agent. That is, since all agents of a given type are exactly identical, what appeals to one will appeal to all of them in the same way. So all type-1 agents will prefer the same contract, and all type-2 agents will also coincide as to which contract is the most preferred, although the type-2 agents may choose a different contract to the type-1s. So, there is never anything to be gained by offering more than two different contracts in the menu (although it may be useful to offer a single contract in the menu, something which we can interpret as a special case of offering two contracts: it is offering two contracts that are equal). Our objective is to find the coordinates of the two optimal contracts, which we shall denote by w^1 and w^2 respectively, without requiring that they necessarily be different.
If it turns out that all agents, irrespective of their type, choose the same contract, then we say that they have been pooled, and we speak of a pooling equilibrium. On the other hand, if type-1 agents choose a different contract to type-2 agents, then we say that the agents have been separated, and we talk of a separating equilibrium. This second situation (separating equilibria) is much more interesting, since it implies that the principal will be able to perfectly infer an agent's type by the choice of contract that he makes, and so contract choice is a perfect signal for agent type. For this reason, separating equilibria are also often known as self-selecting equilibria.
Since the principal is the contractor in the relationship, we can assume that she is some type of business person, and we shall assign her an objective function that is expected (monetary) profit. Thus we are assuming that the principal is risk neutral. On the other hand, the agents are the contracted parties (e.g., workers, or employees in general) and so will be assigned an objective function that is expected utility, where their utility function, u(w), is an increasing concave function of money; u'(w) > 0 and u''(w) < 0.
If the principal contracts with a type i agent, her expected profit is

E_i(x̃ − w̃) = (1 − p_i)(x_1 − w_1) + p_i(x_2 − w_2) = E_i x̃ − (1 − p_i)w_1 − p_i w_2

and the expected utility of the contracted type i agent is

E_i u(w̃) = (1 − p_i)u(w_1) + p_i u(w_2),   i = 1, 2

The indifference curves of the agents in the space of contracts are decreasing and convex to the origin, and the marginal rate of substitution at any given point w is

MRS_i(w) = −(1 − p_i)u'(w_1) / (p_i u'(w_2)),   i = 1, 2

Of course, at any point on the certainty line for the agent (w_1 = w_2),

MRS_i(w)|_{w_1 = w_2} = −(1 − p_i)/p_i,   i = 1, 2

Figure 6.2  Type-1 and type-2 agent indifference curves

Now, the assumption that p_1 < p_2 tells us that at any given point w in the contract space the indifference curve of a type-1 agent is steeper than the indifference curve of a type-2 agent at the same point. To see this, we simply need to derive MRS_i(w) with respect to p_i:

∂MRS_i(w)/∂p_i = [u'(w_1) p_i u'(w_2) + (1 − p_i)u'(w_1)u'(w_2)] / (p_i u'(w_2))²
              = (p_i + (1 − p_i)) u'(w_1)u'(w_2) / (p_i u'(w_2))²
              = u'(w_1) / (p_i² u'(w_2)) > 0
That is, the greater is p_i, the greater is the marginal rate of substitution. And since the marginal rate of substitution is a negative number, an increase in MRS corresponds to a less steep indifference curve. The economic intuition behind these slopes is not hard to see. Take the point at which the two curves intersect, and then consider the increase in w_2 that would keep each agent type indifferent to the loss of, say, 1 unit of w_1. The agent with the greatest probability of w_2 (the type-2 agent) will require a smaller increase in w_2 since it is received with greater probability. Therefore, to the left of the intersection point, the type-2 indifference curve must be below the type-1 indifference curve. A similar argument shows that to the right of the intersection point, the type-1 indifference curve is lower than the type-2 indifference curve. In short, the type-1 indifference curve is steeper at the intersection point (see Figure 6.2).
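A minimal numerical illustration of this slope ranking follows. The utility function and the probability values are assumptions made only for the sketch (square-root utility, p_1 = 0.2, p_2 = 0.6); they are not taken from the text.

# Slope of a type-i agent's indifference curve at a contract w = (w1, w2):
# MRS_i(w) = -(1 - p_i) u'(w1) / (p_i u'(w2)).
# Example utility: u(w) = sqrt(w), so u'(w) = 0.5 / sqrt(w).

def u_prime(w):
    return 0.5 / w ** 0.5

def mrs(p, w1, w2):
    return -(1 - p) * u_prime(w1) / (p * u_prime(w2))

p1, p2 = 0.2, 0.6          # p1 < p2: type-1 agents are the "better" type
w = (64.0, 36.0)           # an arbitrary contract with w1 > w2

print(mrs(p1, *w))         # approximately -3.0 (steeper: more negative slope)
print(mrs(p2, *w))         # approximately -0.5 (flatter)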

Figure 6.3  Expected profit lines when the principal contracts with a type-1 or a type-2 agent

In the same way, the principal's indifference curves in the space of contracts are linear, and their slope at any given point, when the contract is with a type i agent, is −(1 − p_i)/p_i; that is, the indifference curves of the principal are less steep when she contracts with a type-2 agent than when she contracts with a type-1 agent. In Figure 6.3 the indifference curves of a principal (expected profit lines) are shown for the cases of contracts signed with each type of agent.
Let the endowed, or reservation, utility of a type i agent be denoted by ū_i, and assume that the reservation utility of the principal is 0. We also assume that, since type-1 agents are in some way more prone to generate the good state of nature, they also have greater reservation utility than type-2 agents: ū_2 < ū_1. This assumption implies that the reservation utility indifference curve of a type-1 agent cuts the certainty line in the space of contracts above the point at which the reservation indifference curve of a type-2 agent cuts it. It also implies that the two reservation utility indifference curves intersect each other at a point characterised by w_1 > w_2.
A contract w will attract a type i agent voluntarily, and will be voluntarily offered by the principal conditional upon being accepted by a type i agent, if it satisfies the conditions

(1 − p_i)u(w_1) + p_i u(w_2) ≥ ū_i
E_i x̃ − (1 − p_i)w_1 − p_i w_2 ≥ 0

These two conditions are known as the participation conditions (of a type i agent, and of the principal, respectively).
Naturally, any given contract, w, will give the principal a different expected profit according to the type of agent that signs it. For that reason, if the principal does offer two different contracts, one designed with type-1 agents in mind, and the other designed for type-2 agents, which we denote by w^1 and w^2 respectively, then it is necessary to ensure that the first contract is signed only by type-1 agents, and the second contract is signed only by type-2 agents. Formally, and recalling that the principal must offer the same choice to all agents, this requires that the contracts respect what are known as the incentive compatibility conditions, which are

E_i u(w̃^i) ≥ E_i u(w̃^j),   i, j = 1, 2

The incentive compatibility conditions ensure that each agent type is most interested in the contract that the principal designed with his type in mind.
The objective that the principal has when considering what the optimal contracts are will depend upon the market conditions in which she operates. Here we consider only two extreme assumptions: the principal acts either in a perfectly competitive environment (i.e., there are infinitely many identical principals), or as a monopolist (there is a single principal). In what follows, these two cases will be discussed separately.
However, before going on to look at the two solutions, as a benchmark case consider what would happen under symmetric information (i.e., when the principal can fully observe the type of any given agent). In such a scenario, it is as if the principal were playing two separate games with the agents: the type-1 agents on the one hand, and the type-2 agents on the other. But since the principal is risk neutral and the agents are risk averse, we know that any solution will lie on the certainty line of the agent, and so really the graphical environment is exactly equivalent to that used for the insurance problem studied in Chapter 4. Therefore, if the principal acts in a perfectly competitive environment the equilibrium contract for each game is the certainty point (for the agent) that retains the expected value of the agent's endowment, and if the principal is a monopolist, the solution is that the agent gets his certainty equivalent wealth. But in each case (perfect competition and monopoly) the wealth that is to be received by a type-1 agent is greater than the wealth to be received by a type-2 agent. Thus if type-2 agents could somehow pass themselves off as type-1 agents, they would do so in order to get the higher payoff. This implies that the solution that we get in the symmetric information problem is not going to work under asymmetric information, since it will violate the incentive compatibility of the type-2 agents (type-2s would like to disguise themselves as type-1s).

Perfect competition
If the principal acts in a perfectly competitive environment, she is restricted to earning a non-positive profit, but since her participation condition requires that the expected profit also be non-negative, we have the result that the expected profit must be exactly equal to 0. In this case, efficiency demands that the principal searches for the two contracts w^1 and w^2, where the first is designed for type-1 agents and the second is designed for type-2 agents, that respectively maximise the expected utility of the two types of agent, subject to the condition that she earns an expected profit of 0, and subject to the participation and incentive compatibility constraints of the two types of agent. That is, the principal faces two simultaneous but interrelated maximisation problems, each with the same set of conditions:

max_{w^1}  (1 − p_1)u(w_1^1) + p_1 u(w_2^1)
max_{w^2}  (1 − p_2)u(w_1^2) + p_2 u(w_2^2)

subject to

γE_1(x̃ − w̃^1) + (1 − γ)E_2(x̃ − w̃^2) = 0        (6.1)

and

(1 − p_1)u(w_1^1) + p_1 u(w_2^1) ≥ ū_1
(1 − p_2)u(w_1^2) + p_2 u(w_2^2) ≥ ū_2
(1 − p_1)u(w_1^1) + p_1 u(w_2^1) ≥ (1 − p_1)u(w_1^2) + p_1 u(w_2^2)
(1 − p_2)u(w_1^2) + p_2 u(w_2^2) ≥ (1 − p_2)u(w_1^1) + p_2 u(w_2^1)

This is clearly a complex and large problem, with four choice variables (two components of each of the two wage vectors) and five restrictions (implying five Lagrange multipliers). In all, if we solve the problem using the Lagrange method, we would have to handle a system of nine simultaneous equations in nine unknowns. Fortunately, it is far easier to carry out a graphical analysis.
To begin with, we have the following result:

Result 6.1: Whatever is the solution to an adverse selection problem under perfect competition, it is characterised by w^1 ≠ w^2.
Result 6.1 indicates that it is impossible for the solution to involve a single contract for both types of agent, that is, there will never be a pooling equilibrium. To see why, assume that this were not true, that is, assume that we can have a solution with w^1 = w^2 = w, and define q ≡ γp_1 + (1 − γ)p_2. Now, if the solution were to imply a single contract for both types of agent, then to satisfy (6.1), we require

γ[E_1x̃ − (1 − p_1)w_1 − p_1w_2] = −(1 − γ)[E_2x̃ − (1 − p_2)w_1 − p_2w_2]

which re-orders to the equation of a straight line in contract space:

w_2 = [γE_1x̃ + (1 − γ)E_2x̃]/q − [(1 − q)/q] w_1
This line passes through the point of intersection of the two reservation indifference curves of the principal (one for each type of agent), identified as point w^0 in Figure 6.4. At points on this line to the left of w^0 the positive expected profit that the principal obtains from the type-1 agents is exactly offset by the negative expected profit obtained from the type-2 agents. We shall indicate this line by Ē(x̃ − w̃) = 0. However, the proposed solution to the problem at hand must be a point located between the two points at which the indifference curves of the two types of agent are tangent to the line Ē(x̃ − w̃) = 0, since under any other option we can increase the utility of both agents with a movement along the line (you should check that you understand why by drawing a quick graph). Thus the solution would have to correspond to a point at which the indifference curve of type-1 agents is steeper than the line Ē(x̃ − w̃) = 0, and the indifference curve of type-2 agents is less steep than the same line.

Figure 6.4  A pooling contract with a competitive principal

The situation has been drawn in Figure 6.4. The proposed equilibrium contract is the point w where the two indifference curves intersect (the steepest indifference curve at that point corresponds to a type-1 agent).
Now, note that this graph implies that we can always design a new contract, located below the indifference curve of the type-2 agent and above the indifference curve of the type-1 agent (so that it would be accepted only by type-1 agents), and yet that offers the principal a strictly positive expected profit. For example, in Figure 6.4 any contract located on the line Ē(x̃ − w̃) = 0 a little below the point w would be sufficient. But since all principals have the same incentive to offer this new contract given that (by assumption) the others are all offering the point w, we cannot have a Nash equilibrium at w.

Exercise 6.3. Draw a graph of the situation, described above, when the indifference curve of each type of agent is tangential to the line Ē(x̃ − w̃) = 0. This is a situation in which a principal offers the menu of contracts defined by this straight line, and allows each agent to freely choose which contract they prefer. Use your graph to show that a principal who does this would suffer negative expected profits.

Answer. The relevant graph is Figure 6.5. The type-2 agent wants to locate above the certainty line at point B, and the type-1 agent wants to locate below the certainty line at point C. Notice that in Figure 6.5 the expected value lines of each type of agent are drawn through the contract that each would like to have. Now, consider the point at which the line Ē(x̃ − w̃) = 0 intersects the certainty line. This point is labelled point A. If both agents were to be given point A, the principal would make exactly 0 expected profit (this is the very definition of the line Ē(x̃ − w̃) = 0; when both agent types locate at a single contract upon this line, expected profit is 0). But the expected value lines for each agent type independently going through point A (not drawn on the graph, to avoid excessive cluttering) are both lower down than are the expected value lines going through each of the two contracts that they would actually choose. This implies that, on both of the two contracts that are chosen, the principal would earn less expected profit than had both been given point A. And since point A gives exactly 0 expected profit, the freely chosen contracts must give negative expected profits. This result can also be proved mathematically. You might want to have a try at doing it.
Figure 6.5  Negative expected profits from points B and C

The result that there can never be a pooling equilibrium tells us that the principal will always design different contracts for each of the two types of agent. But in that case, it must be true that the contract designed for type-1 agents must lie on the line E_1(x̃ − w̃) = 0, and the contract designed for the type-2 agents must lie on the line E_2(x̃ − w̃) = 0. If one of the two contracts were not located on the relevant line for expected profit of 0, then the other contract also cannot be located on its relevant expected profit of 0 line, since the sum of expected profits must be 0 under perfect competition. Thus it is sufficient to show that one of the two contracts must always be located on the relevant expected profit equals 0 line. But we only need to note that if all principals were to offer two specific contracts, one of which were to earn a positive expected profit and the other an offsetting negative expected profit, then all principals would have an incentive to remove the contract that earns the negative expected profit, thereby specialising in agents of the type that offer positive expected profit. Again, the existence of such an incentive implies directly that the
situation assumed at the outset (neither contract earns expected profit of 0 alone) cannot be a Nash equilibrium.

Now, choose any particular point on the line E_2(x̃ − w̃) = 0 as the contract designed for type-2 agents, and call it w^2 (it is usual to assume that w^2 is located to the left of w^0). Going through this point is an indifference curve for a type-2 agent, E_2u(w̃^2), that must cut through the line E_1(x̃ − w̃) = 0 at some point, say w^1(w^2). In order to respect the type-2 agent incentive compatibility condition, the contract that we now design for type-1 agents (w^1) cannot be located above the indifference curve E_2u(w̃^2), or else type-2 agents would prefer contract w^1 over contract w^2. But we also know that w^1 must be located on the line E_1(x̃ − w̃) = 0 since the principal's expected profit must be 0. Given all of that, consider what is the point w^1 that maximises E_1u(w̃^1) subject to being on the line E_1(x̃ − w̃) = 0 and being on or under the indifference curve E_2u(w̃^2). Clearly, the point we are looking for is precisely w^1(w^2), the point of intersection of the curve E_2u(w̃^2) and the line E_1(x̃ − w̃) = 0, since if we move the contract upwards along the line E_1(x̃ − w̃) = 0 we would hold the expected value constant for a type-1 agent and we would reduce the variance, which implies that we would increase the expected utility of that agent. In short, by choosing any arbitrary point w^2 on the line E_2(x̃ − w̃) = 0 as the contract designed for type-2 agents, the corresponding contract that will be designed for the type-1 agents is the point of intersection between the indifference curve E_2u(w̃^2) and the line E_1(x̃ − w̃) = 0.
Now that we know which will be the contract designed for type-1 agents for any particular choice of contract for type-2s, all we need to do now is to find the optimal contract to offer the type-2 agents. But this is really trivial, since the initial objective was to maximise the utility of those agents, and knowing that we are restricted to contracts that keep their expected value constant, clearly the optimal contract is the one that offers certainty, that is, the contract lying at the intersection of the line E_2(x̃ − w̃) = 0 and the certainty axis (w_1 = w_2). The situation is represented in Figure 6.6.
In short, if the principal offers all agents the choice between the two contracts w^1 and w^2 as shown in Figure 6.6, then

1. type-1 agents will choose contract w^1,
2. type-2 agents will choose contract w^2,
3. the principal gets an expected profit of 0,
4. it is impossible to design another pair of contracts that are different to each other, such that the previous three points are achieved without reducing the utility of at least one type of agent.

Figure 6.6  Separating equilibrium in the adverse selection problem with a competitive principal

Therefore, the two contracts depicted in Figure 6.6 constitute the unique separating equilibrium for the problem of adverse selection. Since there are no pooling equilibria, it turns out that the contract menu of Figure 6.6 constitutes the only feasible equilibrium for the problem.
However, there may be a small problem. Consider the possibility that the indifference curve of a type-1 agent at the corresponding equilibrium contract cuts the line Ē(x̃ − w̃) = 0, as has been drawn in Figure 6.7. In this case, given the offer of the menu (w^1, w^2) by all the other principals, a "rebel" principal would appear to have an incentive to offer a single contract located in the shaded zone in Figure 6.7, which would attract all of the agents (because it is above both indifference curves) and at the same time would give the rebel principal a positive expected profit (because it is located below the line Ē(x̃ − w̃) = 0). In this case, the existence of the possibility of earning positive expected profit with a pooling contract that improves the expected utility of both types of agent appears to destroy the equilibrium nature of the original menu (w^1, w^2).

Figure 6.7  Zone of rebel contracts

Nevertheless, is it really possible that some principal would offer a contract in the shaded area? If she did, then she would immediately be vulnerable to a counter-offer by another principal that is located above the type-1 agent indifference curve at the rebel contract and below the type-2 agent indifference curve at the rebel contract (you should draw a quick graph to convince yourself that such a contract certainly exists). Such a counter-offer contract can be located below the line E_1(x̃ − w̃) = 0, and so would earn positive profits for any principal making the counter-offer. The counter-offer contract takes all of the type-1 agents from the rebel contract, and leaves the type-2 agents there, and since the rebel contract is located above the line E_2(x̃ − w̃) = 0, it will now imply a negative expected profit for the original rebel principal. In summary, this type of argument is sufficient to defend the situation drawn in Figure 6.6 as the equilibrium contract menu of the adverse selection problem in perfect competition.

A monopolistic principal

When the principal is a monopolist, her objective is to maximise expected profit. Naturally, when there is only one principal, we can safely ignore all the arguments in the previous sub-section based on rebel contracts that take one or another type of agent from the rest of the principals. In the monopoly problem, the principal need only search for the two contracts that maximise her expected profit (conditional upon that expected profit being non-negative) subject to the participation and incentive compatibility constraints of both types of agent. Indeed, since we know from the previous problem (perfect competition) that it is always possible for the principal to offer two contracts that give her an expected profit of 0, we can in fact also ignore the participation constraint of the principal (the restriction that in the solution to the problem her expected profit must be non-negative), since at least one contract menu exists that achieves this objective. So we know that whatever is the solution to the expected profit maximising problem, it can never end up giving a negative expected profit. Thus, the problem can be formulated as


max E1 x (1 p1 )w11 p1 w21 +
1
w ,w 2


(1 ) E2 x (1 p2 )w12 p2 w22
subject to
(1 p1 )u(w11 ) + p1 u(w21 ) u1 (6.2)
(1 p2 )u(w12 ) + p2 u(w22 ) u2 (6.3)
(1 p1 )u(w11 ) + p1 u(w21 ) (1 p1 )u(w12 ) + p1 u(w22 ) (6.4)
(1 p2 )u(w12 ) + p2 u(w22 ) c(1 p2 )u(w11 ) + p2 u(w21 ) (6.5)
Again, this is a rather large problem, with four variables and four restrictions, which implies four multipliers. A full mathematical treatment of the problem would require analysing the simultaneous solution to eight equations in eight unknowns. However, using some easy graphical analysis, we can reduce the problem down to an equivalent one with only two equations in two unknowns. Let's see how.
It turns out that in the solution to the problem the following is true:

Result 6.2: The following three characteristics hold in the solution to the problem of a monopolistic principal:

1. the participation condition of type-1 agents, (6.2), must bind,
2. the incentive compatibility condition of type-2 agents, (6.5), must bind,
3. the optimal contract designed for type-2 agents satisfies w_1^2 = w_2^2.

An explanation of the three statements is in order. First, whatever is the contract that is designed for the type-2 agents, it will provide them with some level of utility and, therefore, a particular indifference curve. Let's refer to that indifference curve as E_2u(w̃^2). Now, of all possible contracts that would give that level of utility to type-2 agents, which would maximise the expected profit of the principal, conditional upon it being signed by a type-2 agent? The answer is the contract on that indifference curve that gives the agent certainty, since that will be the point of tangency between the indifference curve and the iso-profit line of the principal, E_2(x̃ − w̃), that lies as close as possible to the origin. So effectively, for any given level of utility for type-2 agents, the most profitable manner to contract is with a contract characterised by w_1^2 = w_2^2.
Now, in the space of contracts, draw an indifference curve of a type-2 agent at an arbitrary level of utility, E_2u(w̃) = u(w^2), and then draw the reservation utility indifference curve of type-1 agents (see Figure 6.8), E_1u(w̃) = ū_1. The zone of points that are simultaneously above E_1u(w̃) = ū_1 and below E_2u(w̃) = u(w^2) corresponds to the set of contracts that simultaneously satisfy participation of type-1 agents and incentive compatibility of both (recall that the contract used to give the type-2 agents the indifference curve E_2u(w̃) = u(w^2) is the certainty contract w_1^2 = w_2^2 ≡ w^2), which is the shaded zone in Figure 6.8. Which of all of these options will maximise the expected profit that the principal gets from the contract, conditional upon it being signed by a type-1 agent? Well, by simply moving to different expected profit lines E_1(x̃ − w̃), each one closer to the origin than the one before, we can see that the contract at the intersection of the two indifference curves is the one we are searching for. Concretely then, the point is located on the reservation utility indifference curve of a type-1 agent, E_1u(w̃) = ū_1, so it binds the participation condition of the type-1 agent and the incentive compatibility condition of the type-2 agent. We call the point in question w^1(w^2).

Figure 6.8  Optimal type-1 contract, for a given type-2 contract

All that is required now is to find the optimal contract to design for the type-2 agents, w^2, since once we have that we can directly calculate the corresponding contract for the type-1 agents as w^1(w^2). Recall that we know the contract for the type-2 agents is a certainty contract, so it is characterised by a single number, and all that we need to take into account for that number is that it satisfies the participation constraint for type-2 agents. In that way, our problem has been reduced from one of eight unknowns in eight simultaneous equations, to one of only two unknowns (the type-2 agent contract, and the Lagrange multiplier corresponding to the type-2 agent participation constraint) in two equations. A far simpler matter.
Before going on to analyse this problem, it is worthwhile noting a bit of intuition concerning the optimum. Consider what happens when the principal increases the certain payment corresponding to the type-2 contract. Directly, she will lose some expected profit on the type-2s, but the increase also has the effect of pushing the type-2 indifference curve upwards, and forcing the optimal contract of the type-1 agents upwards around the type-1 reservation utility indifference curve. This implies an increase in the expected profit that is earned on the type-1 contract. So the principal will increase the payment to the type-2s until the marginal loss she suffers on that contract exactly equals the marginal gain she gets back on the type-1 contract. In general, then, it is certainly not true that we should conclude that the principal will keep the type-2 agents on their reservation utility indifference curve, as she would in a symmetric information problem. We shall now go on to look at this in a little more detail, but in order to simplify the notation, from now on we use the variable w to represent the wage that is paid to the type-2 agents (the same in each state of nature), and w_i to represent the wage of the type-1 agents in state i.
To begin with, note that so long as the principal sets w at a level that is less than the certainty equivalent wealth of type-1 agents (the point where their reservation utility indifference curve cuts the certainty axis), then we know that the type-1 incentive compatibility condition cannot bind, and that the type-1 optimal contract must be characterised by w_1 > w_2. In what follows, we shall make use of the general result that, outside of a very extreme case (which we will consider), the type-1 agent incentive compatibility condition will never bind, and so is irrelevant to the problem and can be ignored.
Now, we know that in all cases the type-1 agent participation condition binds, as does the type-2 agent incentive compatibility condition. Formally, these two ideas are written as

(1 − p_1)u(w_1) + p_1u(w_2) = ū_1
(1 − p_2)u(w_1) + p_2u(w_2) = u(w)

With a minimal amount of effort, we can use these two equations to define the coordinates of the type-1 contract as implicit functions of the type-2 contract wage. This is done by simply re-ordering the equations so that they read

u(w_1) − [p_2ū_1 − p_1u(w)] / (p_2 − p_1) = 0
u(w_2) − [(1 − p_1)u(w) − (1 − p_2)ū_1] / (p_2 − p_1) = 0
Now, apply the implicit function theorem to get

∂w_1/∂w = −[p_1/(p_2 − p_1)] [u'(w)/u'(w_1)] < 0        (6.6)

and

∂w_2/∂w = [(1 − p_1)/(p_2 − p_1)] [u'(w)/u'(w_2)] > 0        (6.7)
Since, as was noted above, we can safely ignore the incentive compatibility constraint of the type-1 agents (unless the equilibrium is pooling, which we shall consider shortly), our problem can now be expressed as

max_w  f(w) ≡ γE_1x̃ + (1 − γ)E_2x̃ − γ[(1 − p_1)w_1(w) + p_1w_2(w)] − (1 − γ)w

subject to
u(w) ≥ ū_2

If we write the restriction as
g(w) ≡ u(w) − ū_2
then we can use the Lagrange method, so long as the objective function is concave in the choice variable w, since the restriction set defined by g(w) ≥ 0 is convex by the assumption of concavity of the utility function.
The first derivative of the objective function with respect to w is

f'(w) = −γ[(1 − p_1) ∂w_1/∂w + p_1 ∂w_2/∂w] − (1 − γ)
      = −γ[(1 − p_1)(−p_1/(p_2 − p_1)) u'(w)/u'(w_1) + p_1((1 − p_1)/(p_2 − p_1)) u'(w)/u'(w_2)] − (1 − γ)
      = −γ[(1 − p_1)p_1/(p_2 − p_1)] u'(w) [u'(w_2)^{-1} − u'(w_1)^{-1}] − (1 − γ)        (6.8)

where we have used (6.6) and (6.7). The second derivative is

f''(w) = −γ[(1 − p_1)p_1/(p_2 − p_1)] u''(w) [u'(w_2)^{-1} − u'(w_1)^{-1}]
         − γ[(1 − p_1)p_1/(p_2 − p_1)] u'(w) ∂[u'(w_2)^{-1} − u'(w_1)^{-1}]/∂w
But since

∂[u'(w_2)^{-1} − u'(w_1)^{-1}]/∂w = −u'(w_2)^{-2}u''(w_2) ∂w_2/∂w + u'(w_1)^{-2}u''(w_1) ∂w_1/∂w > 0

the second term of the second derivative is certainly negative, and we only need concern ourselves with the first term. The first term of the second derivative is not positive if

u'(w_2)^{-1} − u'(w_1)^{-1} ≤ 0

that is, if

u'(w_2) ≥ u'(w_1)  ⟺  w_2 ≤ w_1

However, as will be shown below, since this will be true in all possible cases, it is indeed true that the objective function is concave in w and we can solve the principal's simplified problem using traditional maximisation techniques.
The Lagrangean for the problem is

L(w, λ) = γE_1x̃ + (1 − γ)E_2x̃ − γ[(1 − p_1)w_1(w) + p_1w_2(w)] − (1 − γ)w + λ[u(w) − ū_2]

And so the first-order condition is ∂L/∂w = 0, that is, f'(w) + λu'(w) = 0. Using (6.8) this is just

−γ[(1 − p_1)p_1/(p_2 − p_1)] u'(w) [u'(w_2)^{-1} − u'(w_1)^{-1}] − (1 − γ) + λu'(w) = 0        (6.9)

On the other hand, the complementary slackness condition is

λ[u(w) − ū_2] = 0

Now, note that (as mentioned above) it is never feasible to have a solution with w_2 > w_1. To see why, assume that w_2 > w_1. This then implies u'(w_2)^{-1} − u'(w_1)^{-1} > 0, which in turn from the first-order condition indicates that λ > 0, that is, the participation condition of type-2 agents would need to bind. But then the optimal contract for type-1 agents is located at the intersection of the two reservation utility indifference curves, which under the assumption of ū_2 < ū_1 must occur at a point where w_1 > w_2, which is clearly in contradiction to where we started.
Let us consider for a moment the special case of γ = 1. In this case the first-order condition becomes

−[(1 − p_1)p_1/(p_2 − p_1)] u'(w) [u'(w_2)^{-1} − u'(w_1)^{-1}] = −λu'(w) ≤ 0

from which clearly

u'(w_2)^{-1} − u'(w_1)^{-1} ≥ 0

that is,

w_2 ≥ w_1

However, since it is never feasible to have an equilibrium with w_2 > w_1, this case must correspond to w_2 = w_1, and so the type-1 contract is located on the certainty axis. But since the type-1 agent contract is also located on the indifference curve of the type-2 agents, we now know that when γ = 1 the equilibrium is pooling with w_2 = w_1 = w. Of course, this is not at all surprising: if there are no type-2 agents (which is basically what γ = 1 indicates) then the principal needs only to deal with the type-1 agents in an expected profit maximising way. Really, when γ = 1 there is no problem of asymmetric information.
Furthermore, in any other case (γ < 1) it must necessarily be true that w_2 < w_1, and so the equilibrium will be separating. To see this, just apply the implicit function theorem to the first-order condition:

∂w/∂γ = −(∂²L/∂w∂γ) / (∂²L/∂w²)

The sign of this is equal to the sign of the numerator, as the Lagrangean has already been shown to be concave in w. But since

∂²L/∂w∂γ = ∂²f/∂w∂γ

from (6.8) it turns out that

∂²L/∂w∂γ = −u'(w) [(1 − p_1)p_1/(p_2 − p_1)] [u'(w_2)^{-1} − u'(w_1)^{-1}] + 1

which is strictly positive whenever u'(w_2)^{-1} − u'(w_1)^{-1} ≤ 0, that is, whenever w_2 ≤ w_1. Beginning with γ = 1, where ∂²L/∂w∂γ = 1 > 0, we know that a marginal reduction in γ implies a reduction in w, and a corresponding movement to an equilibrium characterised by w_1 > w_2. Continuing the process, further reductions in γ must always reduce w until an equilibrium is reached in which the participation condition of type-2 agents binds.
It turns out that the equilibrium solution binds the participation condition for all levels of γ that are less than or equal to some particular level, say γ_0, where γ_0 > 0. You are asked to provide a proof of the fact that γ_0 > 0 in problem 9.
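The reduced one-variable problem can be illustrated numerically. The sketch below assumes square-root utility and hypothetical parameter values chosen only for the example (they are not taken from the text); it evaluates the monopolist's expected profit over a grid of certain type-2 wages w, imposing the two binding conditions and the type-2 participation constraint.

# Grid search for the monopolist's reduced problem in the single variable w
# (the certain type-2 wage).  Illustrative assumptions only: u(w) = sqrt(w),
# p1 = 0.25, p2 = 0.5, x1 = 120, x2 = 20, gamma = 0.8, reservation utilities
# u1_bar = 9 and u2_bar = 6.
from math import sqrt

p1, p2, gamma = 0.25, 0.5, 0.8
x1, x2 = 120.0, 20.0
u1_bar, u2_bar = 9.0, 6.0

E1x = (1 - p1) * x1 + p1 * x2
E2x = (1 - p2) * x1 + p2 * x2

def type1_contract(w):
    # Binding type-1 participation and type-2 incentive compatibility.
    uw = sqrt(w)
    u_w1 = (p2 * u1_bar - p1 * uw) / (p2 - p1)
    u_w2 = ((1 - p1) * uw - (1 - p2) * u1_bar) / (p2 - p1)
    return u_w1 ** 2, u_w2 ** 2

def expected_profit(w):
    w1, w2 = type1_contract(w)
    return (gamma * (E1x - (1 - p1) * w1 - p1 * w2)
            + (1 - gamma) * (E2x - w))

# Feasible range: from type-2 participation (w >= u2_bar**2) up to the type-1
# certainty-equivalent wealth (w <= u1_bar**2), at which point the menu pools.
w_min, w_max = u2_bar ** 2, u1_bar ** 2
grid = [w_min + k * (w_max - w_min) / 10000 for k in range(10001)]
w_star = max(grid, key=expected_profit)

print(w_star)                    # about 69: type-2 participation is slack here
print(type1_contract(w_star))    # risky type-1 contract, roughly (94, 48)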

Summary
From our analysis of the perfect competition case, we can conclude
that

1. The optimal contract designed for type-2 agents, denoted w^2, is found as the simultaneous solution to the two equations E_2(x̃ − w̃) = 0 and w_1 = w_2. That is, w_1^2 = w_2^2 = E_2x̃.
2. The optimal contract designed for type-1 agents, denoted w^1, is found as the simultaneous solution to the two equations E_1(x̃ − w̃) = 0 and E_2u(w̃) = u(E_2x̃).
3. The equilibrium is separating, that is, each type of agent chooses a different contract.
4. In the equilibrium, it happens that the incentive compatibility condition of type-2 agents binds, but the incentive compatibility condition of type-1 agents does not.
5. In the equilibrium, type-2 agents receive the same utility (and the same contract) that they would have received under conditions of symmetric information, but the type-1 agents receive less utility than what they would have received in a problem of symmetric information.
6. The reduction in utility suffered by the type-1 agents compared to a symmetric information scenario is due to an increase in risk that they must accept in order to signal their true type; that is, it is the cost that they must endure so that the contract designed for them is unattractive to type-2 agents.
From our analysis of the case of a monopolistic principal, we can conclude that

1. The equilibrium is always separating whenever the probability that any particular agent is type-2 is not 0.
2. Type-2 agents get a risk-free contract, and they may obtain a level of utility that is greater than their reservation level (and so the asymmetric information may provide them with a benefit over the symmetric information setting).
3. Type-1 agents get a risky contract (so long as γ < 1), and they always obtain exactly their reservation utility.
4. The principal must earn a lower expected profit than would be available had the information been symmetric.
5. At the extreme case of γ = 1, the equilibrium is pooling at a contract equal to the certainty equivalent wealth of type-1 agents. Then, as γ falls, the equilibrium wage for type-2 agents is reduced (along the certainty axis), and the equilibrium contract for type-1 agents involves more and more risk. When γ reaches some minimal level, denoted by γ_0, which is strictly greater than 0, the equilibrium for all γ from then on down sets the type-2 contract wage at the certainty equivalent wealth of type-2 agents.

Problems

1. In the model of Akerlof, of the second-hand car market, cars were defined to be of high or low quality, without really paying much attention to what quality actually means. Assume that any given car can either break down or not, and that the probability of breaking down is p. Good quality cars break down with probability p_1 and bad quality cars break down with probability p_2, where p_1 < p_2. For simplicity, assume that a broken-down car has value 0, and a non-broken-down car has value 1. Assume that sellers can offer their cars along with a guarantee. The guarantee stipulates that the seller will pay the purchaser an amount of money, g, should the vehicle break down. What is the cost to a seller of each quality of car of selling with the guarantee? Calculate the minimum size of the guarantee such that it signals a good quality car. Describe the final (separating) equilibrium.
2. In the model of Spence of education as a signal of labour quality, it is assumed that education is non-productive, that is, education itself does not improve the value of a worker to the firm. Try to re-do the model such that education does enhance the value of a worker. Assume that the utility of a type-i worker is u(w) − c_i(e), where u(w) is increasing and concave and c_i(e) is increasing and convex. Assume that the difference between type-1 and type-2 workers is that c_1(e) < c_2(e) for all values of e, so that the more able workers can obtain a marginal unit of education more cheaply (in terms of utility). Draw the indifference curves of the two types of worker in (e, w) space. Now, assume that the value of a worker to the firm is v_i e, where v_1 > v_2, so that education enhances the worker's value, but the more able workers are enhanced at a greater rate. The firm is perfectly competitive, and so must earn zero profits. Work through the options for contracting under separating equilibrium arrangements.
3. In this chapter, we saw models of adverse selection when the agents were all equal to each other in everything except the probabilities of the two states of nature. Now consider a model in which all agents have the same probabilities of the states of nature, but they are different with respect to their utility functions. Specifically, assume that there are only two types of agent, and that they have different values of absolute risk aversion. Let type-1 agents have a utility function with Arrow-Pratt measure of absolute risk aversion of R_a^1(w), and type-2 agents have a utility function with R_a^2(w), where w is wealth. Assume that R_a^1(w) < R_a^2(w) for all w. Assume that the reservation outcome for any agent (i.e., what they would get should they not end up contracted to a principal) is a state-contingent point w^0 = (w_1^0, w_2^0) such that w_1^0 > w_2^0. Solve graphically for the solutions to the implied adverse selection problem for both a perfectly competitive and a monopolistic principal.
4. Go back to the specification of adverse selection of the chapter (i.e., same utility functions, different probabilities). Now add a third type of agent with probability of state 2 equal to p_3, where p_1 < p_2 < p_3. Solve graphically for the solution to the adverse selection problem with a competitive principal.
5. Explain carefully who the winners and the losers are in a model of adverse selection with perfectly competitive principals as compared to the same setting but with perfect information.
6. In the equilibrium of an adverse selection problem with two types of agent, how would an increase in γ, the probability that any given agent is type-1, affect the final equilibrium when the principal is (a) a monopolist, and (b) perfectly competitive?
7. In a perfect competition model of adverse selection, how does an increase in the risk aversion of the agents affect the equilibrium contract menu and the welfare of all participants in the equilibrium?
8. Assume an adverse selection problem with two types of agent. All agents have utility u(w) = √w. Type-1 agents have a probability of state 2 of p_1 = 0.2, while type-2 agents have p_2 = 0.6. Type-1 agents have reservation utility of ū_1 = 9, and type-2 agents have ū_2 = 7. The proportion of type-1 agents in the economy is γ = 0.9. The principal earns x_1 = 100 should state 1 occur, and x_2 = 40 should state 2 occur. Calculate the equilibrium contracts for both the case of a perfectly competitive principal and for a monopolistic principal.
9. Prove the statement that there exists a limit value for γ, denoted by γ_0, which is strictly positive, γ_0 > 0, such that for all γ ≤ γ_0 the equilibrium contract for type-2 agents binds their participation condition.
Chapter 7

Asymmetric information:
Moral hazard

In a situation of moral hazard, rather than being uninformed about a particular parameter, the principal is uninformed as to a variable that is controlled by the agent. Contrary to adverse selection, where the principal could not observe the agent's identity, in moral hazard the principal cannot observe the agent's actions. In order to model this type of problem with a minimal change over what we have done for adverse selection, we will again assume that all agents are identical as far as utility is concerned (and, of course, we retain the assumptions that the utility function is increasing and concave in wealth), but now we allow the agent the possibility of choosing the probability of the states of nature. That is, now our agent can choose whether the probability of state of nature 2 is p_1 or whether it is p_2, where once again p_1 < p_2.
Of course, the agent does not directly set the probability; rather, the probability is determined by the agent's actions, which he directly and unilaterally chooses without the principal being able to see what particular choice is made. For example, consider the case of an individual who wants to insure his car against the risk of theft. The probability of theft (state 2) will clearly depend upon such things as whether or not he parks in well-lit streets or dark alleys, whether or not he leaves the keys in the ignition while going shopping, whether or not he has an anti-theft alarm installed, and so on. If, as is quite reasonable to assume, the insurer cannot directly observe (at least at a reasonable cost) this type of choice by the individual, we have a moral hazard problem.

The key point to see in a moral hazard problem is that there is a conflict of interests between the principal and the agent as far as the action that the agent should choose is concerned, and, therefore, as far as the probabilities of each state of nature are concerned. This happens whenever it is costly for the agent to choose actions that reduce the probability of the worst state (in everything we do here, state 2). Thus, if it is costly for the agent to take due care that his car is not stolen, and yet the insurer (the principal here) would like him to do so, the insurer must design a contract that provides the agent with the correct incentives to do as the principal would desire. Clearly, it is not sufficient to offer the agent money to carry out the required actions, since under a moral hazard situation the insurer cannot check that the actions were indeed carried out. For example, if the insurer simply offers a discount premium for cars with burglar alarms installed, it is necessary to check that those individuals claiming the discount actually do have alarms in their cars. If the insurer cannot do that,1 she needs to search for a better incentive mechanism.
When we think of a moral hazard problem, the first thing to notice is that now all agents are identical in all respects (same utility function, same set of feasible actions, same probabilities of states of nature). So it is now incorrect to speak of different agent types as was done in adverse selection. And since, as we noted in the previous section, the principal should design the same number of contracts in the menu as there are types of agent, in a moral hazard problem the principal need only design one contract, since identical agents will all respond in an identical manner to any contractual incentive.
The exact assumptions that we shall use are the following. In state i the relationship with the agent generates an income for the principal of x_i, where x_1 > x_2, that is, state 2 is again the unfavourable state. The probability with which state 2 occurs depends on a variable that we shall call effort, denoted by e, that is chosen by the agent without the principal being able to observe this choice. We assume that there are only two possible values that the agent can choose from, e_1 and e_2, with e_1 > e_2. So we are assuming that e_1 is high effort, and e_2 is low effort (often known as shirking).

1 Even if the insurer checks the car when the contract is signed, there is no way to know whether the alarm is taken out and sold to someone else after the contract is signed. The contract must provide an incentive to install and keep the alarm in the car.

The probability of state 2 is then determined as p(e), with p'(e) < 0, so that when the agent chooses high effort, the probability of the worst state is reduced, that is, p(e_1) < p(e_2).
Now, we need to assume that the choice of effort somehow affects the utility of the agent. The simplest way to do this is to assume that the utility function is separable in money and effort. So we assume that the agent's utility is U(w, e) = u(w) − d(e), where u'(w) > 0, u''(w) < 0 and d'(e) > 0. As always, we refer to u(w) as the utility of money, and the new function, d(e), is referred to as the disutility of effort. The separable nature of the utility function is purely for mathematical simplicity (it implies that the cross derivative with respect to money and effort is 0). Without this assumption the problem can still be tackled, but it becomes much more complex. All of the relevant intuition can be found in our simpler setting.
The agent's expected utility of a contract that offers him a wage of w_i in state i = 1, 2, is

E_e(u(w̃) − d(e)) = (1 − p(e))u(w_1) + p(e)u(w_2) − d(e) = E_e u(w̃) − d(e)
As before, we assume the principal is risk neutral, so that her objective is just expected monetary profit, E_e(x̃ − w̃).
Now, define the following function:

f(w) ≡ [(1 − p(e_2))u(w_1) + p(e_2)u(w_2) − d(e_2)] − [(1 − p(e_1))u(w_1) + p(e_1)u(w_2) − d(e_1)]
     = u(w_2)(p(e_2) − p(e_1)) + u(w_1)(1 − p(e_2) − 1 + p(e_1)) − d(e_2) + d(e_1)
     = (p(e_2) − p(e_1))(u(w_2) − u(w_1)) − d(e_2) + d(e_1)

Clearly, if the agent is offered a contract such that f(w) > 0, then he will have a strict preference for low effort, e_2, while if the contract gives f(w) < 0, then the agent will prefer e_1. Finally, if we have a contract such that f(w) = 0, then the agent is indifferent between the two effort levels. The function f(w) captures the incentive compatibility of the agent in this problem. Applying the implicit function theorem, we have

dw_2/dw_1 |_{f(w)=0} = u'(w_1)/u'(w_2) > 0

that is, the contours of the function f(w) have positive slope in the space of contracts.
Now, consider the particular contour corresponding to f(w) = 0. By definition, f(w) = 0 corresponds to all the contracts such that the individual is indifferent between high and low effort. We note that

(p(e_2) − p(e_1))(u(w_2) − u(w_1)) − d(e_2) + d(e_1) = 0

and so

(p(e_2) − p(e_1))(u(w_2) − u(w_1)) = d(e_2) − d(e_1) < 0

But since p(e_2) − p(e_1) > 0, it turns out that the vectors w such that f(w) = 0 must satisfy u(w_2) − u(w_1) < 0, that is, they have a higher wage in state 1 than in state 2, w_2 < w_1. But then, since the slope of the contour is nothing more than the ratio of marginal utilities, and recalling that the utility function is concave (marginal utility is decreasing), the fact that w_2 < w_1 implies that the slope of the contour is always less than 1.
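A small numerical illustration of this contour is sketched below; the square-root utility and the effort, probability and disutility values are assumptions made only for the example.

# Points on the contour f(w) = 0 and its slope, assuming u(w) = sqrt(w),
# p(e1) = 0.2, p(e2) = 0.5, d(e1) = 1.0, d(e2) = 0.4 (illustrative values).
from math import sqrt

p_hi, p_lo = 0.2, 0.5            # p(e1) < p(e2)
d_hi, d_lo = 1.0, 0.4            # d(e1) > d(e2)

# On f(w) = 0: u(w1) - u(w2) = (d(e1) - d(e2)) / (p(e2) - p(e1)) = delta.
delta = (d_hi - d_lo) / (p_lo - p_hi)          # = 2.0 here

for w1 in (25.0, 49.0, 100.0):
    w2 = (sqrt(w1) - delta) ** 2               # the contour point above w1
    slope = sqrt(w2) / sqrt(w1)                # = u'(w1)/u'(w2) for sqrt utility
    print(w1, w2, slope)                       # w2 < w1 and slope < 1 throughout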

Exercise 7.1. Consider the contour of the function that defines the set of points such that the agent's incentive compatibility condition binds, f(w) = 0. Assuming that utility is the logarithm of wealth, what is the slope of the contour? In this case, would the contour be linear, concave or convex? Assuming that the utility function displays constant absolute risk aversion, evaluate the concavity or convexity of the contour f(w) = 0.

Answer. The slope of the contour is the ratio of marginal utility in state 1 to marginal utility in state 2, u'(w_1)/u'(w_2). If utility is the logarithm of wealth, then u'(w) = 1/w, so in this case the slope of the contour is just w_2/w_1. To see if the contour is linear, concave or convex, derive its slope with respect to w_1. Using the quotient rule, this is

d²w_2/d(w_1)² |_{f(w)=0} = [(dw_2/dw_1) w_1 − w_2] / (w_1)²

But since in this case dw_2/dw_1 = w_2/w_1, this reduces to (w_2 − w_2)/(w_1)² = 0. So the contour under logarithmic utility is linear. What about under constant absolute risk aversion? In general the slope of the contour is dw_2/dw_1 |_{f(w)=0} = u'(w_1)/u'(w_2). Deriving again with respect to w_1 we get

d²w_2/d(w_1)² |_{f(w)=0} = [u''(w_1)u'(w_2) − u'(w_1)u''(w_2)(dw_2/dw_1)] / u'(w_2)²
                         = [u''(w_1)u'(w_2) − u'(w_1)u''(w_2) u'(w_1)/u'(w_2)] / u'(w_2)²

This is less than zero (i.e., the contour is concave) if

u''(w_1)u'(w_2) < u'(w_1)u''(w_2) [u'(w_1)/u'(w_2)]

which re-orders to

u''(w_1)/u'(w_1) < [u''(w_2)/u'(w_2)] [u'(w_1)/u'(w_2)]

or, if we multiply by −1, this becomes

R_a(w_1) > R_a(w_2) [u'(w_1)/u'(w_2)]

So if the utility function displays constant absolute risk aversion, R_a(w_1) = R_a(w_2), and the equation would read

1 > u'(w_1)/u'(w_2)

We know this to be true (recall that on the contour w_1 > w_2, so u'(w_1) < u'(w_2)), so it must also be true that, under the assumption of constant absolute risk aversion, the contour f(w) = 0 is concave.
f (w) = 0 is concave.

In Figure 7.1 we can see the curve f(w) = 0 together with two indifference curves of the agent passing through a point on the contour. It is important to note that the two indifference curves drawn represent only the part of the utility function that depends on money, that is, they are curves along which E_e u(w̃) is constant. Clearly, since d(e) is independent of the contingent wage vector whatever was the choice of e, along the curves that are drawn total utility E_e u(w̃) − d(e) is also constant. Since we know that an individual is indifferent between two situations when his indifference curves for total expected utility (utility of money less disutility of effort) intersect at the certainty axis, if we were to move the two curves that have been drawn in Figure 7.1 downwards by a distance equal to the corresponding utility cost d(e_i), they would intersect on the certainty axis (i.e., the curve in Figure 7.1 that cuts the line w_1 = w_2 highest would move downwards by a greater distance, since that is the curve corresponding to the greater effort disutility, d(e_1) > d(e_2)). The steepest indifference curve corresponds to the case of high effort, e_1, and the less steep indifference curve implies low effort has been chosen. Since we know that for the situation shown in the graph E_{e_1}u(w̃) − d(e_1) = E_{e_2}u(w̃) − d(e_2), and since d(e_1) > d(e_2), naturally it is true that E_{e_1}u(w̃) > E_{e_2}u(w̃).

Figure 7.1  The incentive compatibility constraint of the agent

Now, the higher up the curve f(w) = 0 is the point of intersection with the indifference curve, the greater is the expected utility of the agent. It is also true that if two indifference curves do not intersect on the curve f(w) = 0, then the agent prefers the one which cuts the f(w) = 0 line highest.

The agent is indifferent between the two effort levels that he can choose between for any contract located on the curve f(w) = 0, and so clearly the agent has a strict preference for one effort level over the other for any point not located on that curve. In order to see exactly what that preference is, consider a point characterised by w_1 = w_2 = w, which is to the left of the curve f(w) = 0. With a risk-free wage, the individual is indifferent between which state of nature occurs, and so will always choose the least costly effort level, that is, e_2. So at all points to the left of f(w) = 0 the individual prefers low effort to high effort, and at any point to the right of f(w) = 0 the preference is for high effort over low. The intuition is clear: the greater is the variance that the contract offers, the more state 1 is preferred to state 2 by the agent, and so the more reasonable it becomes that he is willing to suffer additional costs in terms of effort to increase the probability of occurrence of state 1.
Now consider the principal. Since the principal is risk neutral, her expected profit at any contract when the agent offers effort of e_i is

p(e_i)(x_2 − w_2) + (1 − p(e_i))(x_1 − w_1)

Thus, the principal is indifferent between the two effort levels if

p(e_1)(x_2 − w_2) + (1 − p(e_1))(x_1 − w_1) = p(e_2)(x_2 − w_2) + (1 − p(e_2))(x_1 − w_1)

that is, if (p(e_2) − p(e_1))[(x_1 − w_1) − (x_2 − w_2)] = 0, which requires

g(w) ≡ (x_2 − x_1) + w_1 − w_2 = 0

You should notice that g(w) = 0 is a linear contour with slope equal to 1. In Figure 7.2 this line is shown together with two expected profit contours of the principal, corresponding to the two possible effort levels. In the graph, we have E_{e_1}(x̃ − w̃) = E_{e_2}(x̃ − w̃). The further we move downwards along g(w) = 0, the greater is the expected profit of the principal. So if we take a point to the left of g(w) = 0, and draw the two expected profit lines going through it, we can see that the one corresponding to high effort will cut g(w) = 0 below the point where the low effort expected profit line does so. Therefore, at points to the left of g(w) = 0 the principal prefers high effort, and at points to the right the principal prefers low effort.2
2
Notice that there is now a clear conict in interests between the principal and
the agent. The former prefers high eort at contracts located towards the north-
west and low eort at contracts in the south-east of the graph. The agent has the
opposite preference.
180 7. Moral hazard

the higher is the wage to be paid in state 1 compared to state 2, the


less the principal is interested in state 1 occurring, and so low eort
would be preferred in that case.
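The same kind of check can be done for the principal. The sketch below (again with purely illustrative numbers: x1 = 100, x2 = 40, and the same hypothetical probabilities as in the previous code block) evaluates her expected profit under each effort level at a contract on either side of g(w) = 0.

```python
# Principal's expected profit under each effort level at a contract (w1, w2).
# Illustrative parameters: x1 = 100, x2 = 40, p(e1) = 0.25, p(e2) = 0.75,
# so g(w) = (x2 - x1) + w1 - w2 = 0 is the line w1 - w2 = 60.
p = {"e1": 0.25, "e2": 0.75}   # probability of state 2
x1, x2 = 100, 40

def expected_profit(w1, w2, e):
    return p[e] * (x2 - w2) + (1 - p[e]) * (x1 - w1)

for w1, w2 in [(60, 60), (90, 10)]:
    better = "e1" if expected_profit(w1, w2, "e1") > expected_profit(w1, w2, "e2") else "e2"
    print((w1, w2), "principal prefers", better)
# (60, 60): w1 - w2 = 0 < 60, to the left of g(w) = 0, so the principal prefers e1
# (90, 10): w1 - w2 = 80 > 60, to the right of g(w) = 0, so she prefers e2
```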

Figure 7.2 Two expected profit lines of equal value
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the line g(w) = 0, and the expected profit lines Ee2(x̃ − w̃) and Ee1(x̃ − w̃); x1 and x2 are marked on the horizontal axis.)

Now, we only need to superimpose Figure 7.2 onto Figure 7.1 in order to find the equilibrium contract for the two cases of a principal who acts in a perfectly competitive environment and of a monopolistic principal. Again, we will work through each of these two possibilities separately.

7.1 Perfect competition


When the principal acts in a perfectly competitive environment we know that she is restricted to earning an expected profit of 0, and so we can begin by considering the two expected profit lines Eei(x̃ − w̃) = 0. Now, the principal's problem is to find the contract that maximises the expected utility of the agent, subject to expected profits being equal to 0, to the agent's participation constraint, and to the agent's incentive compatibility constraint. In the same way as in the adverse selection problem, we can simply assume that the participation condition will be satisfied, since otherwise it is simply impossible to establish a relationship. Now, what is the point on the line Ee2(x̃ − w̃) = 0 that is most preferred by the agent? Obviously, the answer is the point of zero variance, that is, the point on the certainty axis. And since this point lies to the left of the curve f(w) = 0 it satisfies (but does not bind) the incentive compatibility constraint. So at that contract, the principal demands e2, and the agent willingly supplies e2. This is the best possible contract that can be offered conditional upon an objective of achieving effort level e2. In Figure 7.3, this point is indicated as point A.

Figure 7.3 Optimal contract for low effort
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the line g(w) = 0, and the expected profit lines Ee2(x̃ − w̃) and Ee1(x̃ − w̃); x1 and x2 are marked on the horizontal axis.)

But it is by no means clear that A is the best contract that the principal can offer. We still need to find the optimal contract that would ensure that the agent supplies the high level of effort e1, and then choose between this one and A whichever the agent most prefers (since the principal is restricted to earning an expected profit of 0 whatever the contract, she will be indifferent between the two contracts). The relevant contract (the one that achieves the high level of effort) must be located on the line Ee1(x̃ − w̃) = 0, and it must be located on or to the right of the curve f(w) = 0. It is evident that the agent prefers the contract on the line Ee1(x̃ − w̃) = 0 that gets him as close as possible to the certainty axis. Under the constraint that it is not to the left of f(w) = 0, the relevant contract is the point of intersection between Ee1(x̃ − w̃) = 0 and f(w) = 0, that is, point B in Figure 7.4.

Figure 7.4 Optimal contract for high effort
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the curve f(w) = 0, and the line Ee1(x̃ − w̃) = 0.)

Finally, which of the two candidate contracts, A and B, should the principal offer? One simple solution is to just offer both, and let the agent choose between them the one that he most prefers. However, we can evaluate that preference quite simply ourselves. We need only draw in the two indifference curves of the agent that pass through each of the two candidate contracts, and see which of them intersects the curve f(w) = 0 at the highest point. In Figure 7.5 a special case is drawn, from which all others can be inferred. Figure 7.5 shows the limit case in which the agent is exactly indifferent between the two contracts A and B. The indifference curve of the agent, conditional upon low effort, passing through the point A also passes through the point B on the curve f(w) = 0. So if contract A were slightly higher than the position indicated in Figure 7.5, then the indifference curve passing through A would cut through f(w) = 0 at a point above B, which indicates that the agent would have a strict preference for the low effort contract A. Since the principal is indifferent, the equilibrium contract in such a case would be A, which demands effort e2, and which incites this same effort from the agent. On the other hand, if A were slightly lower than the position indicated in Figure 7.5, then the agent would have a strict preference for contract B, which would then be the equilibrium contract, and the equilibrium effort level would be e1.

Figure 7.5 Special case of high and low effort equally preferred by the agent
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the curve f(w) = 0, the point B, and the indifference curves Ee2 u and Ee1 u.)
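The comparison between the two candidate contracts can also be illustrated numerically. The sketch below is a minimal example continuing with the same illustrative parameters as the earlier code blocks (√w utility, d(e1) = 1, d(e2) = 0, p(e1) = 0.25, p(e2) = 0.75, x1 = 100, x2 = 40, all hypothetical): it computes contract A on the certainty axis of the low-effort zero-profit line, contract B at the intersection of the high-effort zero-profit line with f(w) = 0, and then reports which one the agent prefers.

```python
# Competitive principal: compute the candidate contracts A (low effort) and
# B (high effort) and let the agent pick.  Illustrative parameters only.
from math import sqrt
from scipy.optimize import fsolve

p1, p2 = 0.25, 0.75            # probability of state 2 under e1 and e2
d1, d2 = 1.0, 0.0              # disutility of effort
x1, x2 = 100, 40               # principal's earnings in states 1 and 2

# Contract A: certain wage equal to expected output under low effort.
wA = (1 - p2) * x1 + p2 * x2
eu_A = sqrt(wA) - d2

# Contract B: zero expected profit under e1, with the incentive
# compatibility curve f(w) = 0 binding (agent indifferent between efforts).
def system(w):
    w1, w2 = w
    zero_profit = (1 - p1) * (x1 - w1) + p1 * (x2 - w2)
    indifference = (p2 - p1) * (sqrt(w1) - sqrt(w2)) - (d1 - d2)
    return [zero_profit, indifference]

w1B, w2B = fsolve(system, [90.0, 60.0])
eu_B = (1 - p1) * sqrt(w1B) + p1 * sqrt(w2B) - d1

print("A:", round(wA, 2), "agent utility", round(eu_A, 3))
print("B:", (round(w1B, 2), round(w2B, 2)), "agent utility", round(eu_B, 3))
print("equilibrium effort:", "e1" if eu_B > eu_A else "e2")
```

With these particular numbers the agent prefers B, so the equilibrium contract would demand high effort; other parameter choices can easily reverse the ranking, which is exactly the comparison made graphically in Figure 7.5.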

Exercise 7.2. It is always possible for the principal to design, and offer, only a contract that would result in high effort. Provide an intuitive reason why this might not be optimal.

Answer. High effort needs to be compensated. That is, the incentive that is provided to the agent to supply high effort is a higher wage in state 1 than in state 2. Whether or not this is worthwhile depends upon exactly how much this will end up costing the principal, relative to the other option, which is to pay a constant wage and receive low effort. In essence, whether or not the principal will indeed prefer to contract high effort will depend upon the probabilities of the states of nature under both types of effort, and the level of risk aversion of the agent. For example, the more risk averse is the agent, the less he likes a non-constant wage (like what is offered for high effort). So in order to provide the incentive for high effort, the contract needs to be particularly generous to the agent, or in other words, particularly costly to the principal.

7.2 A monopolistic principal


The objective of a monopolistic principal is to offer the contract that maximises her expected profit, conditional upon the agent accepting to participate with the desired effort level. In the same way as for a competitive principal, we begin by locating the optimal contract when the objective is to incite low effort, e2. Since the principal is interested in pushing the line Ee2(x̃ − w̃) as close as possible to the origin of the graph, the relevant contract is the one that offers the agent certainty at the reservation level of utility. Since this contract lies within the area of the graph in which the agent indeed prefers to supply low effort, incentives are aligned and so this is the optimal contract for low effort. Assuming that outside of the relationship with the principal the agent can obtain a level of utility of ū, the optimal contract for effort level e2 is the point w² that satisfies Ee2 u(w̃²) − d(e2) = ū, that is, Ee2 u(w̃²) = ū + d(e2). This contract is shown in Figure 7.6 as the point A.
Now, what is the optimal contract, from the principal's point of view, when the objective is to incite high effort, e1? To begin with, it must be on the reservation utility indifference curve of the agent conditional upon him supplying high effort (since if it did not bind the participation constraint, the principal could simply pay a lower wage in at least one state of nature without the agent declining to participate), and it cannot lie to the left of the curve f(w) = 0, otherwise the agent would supply e2 instead of e1. Since the principal is interested in moving as close as possible to the origin of the graph along the reservation utility indifference curve of the agent, the contract that will be chosen is the one that lies at the intersection of the indifference curve Ee1 u(w̃) = ū + d(e1) and the curve f(w) = 0. This contract is indicated in Figure 7.6 as the point B, which incidentally also lies on the indifference curve for low effort passing through the optimal contract for low effort, A, due to the fact that both are on indifference curves that represent the same reservation level of utility, ū.

Figure 7.6 Optimal contracts for high and low effort with a monopolistic principal
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the curve f(w) = 0, the point B, and the indifference curves Ee2 u = ū + d(e2) and Ee1 u = ū + d(e1).)

Finally, the principal needs to choose which of the two optimal contracts she will in fact offer. Fortunately it is a relatively simple matter to see which of the two contracts offers the greatest expected profit for the principal. All we need to do is to see which of the expected profit lines passing through the two candidate contracts intersects the line g(w) = 0 at the lowest point. This is the same as seeing where the two expected profit lines intersect in relation to the position of the line g(w) = 0; if the intersection occurs exactly on the line g(w) = 0, then the principal is indifferent; if the intersection occurs to the left of g(w) = 0 (as is the case depicted in Figure 7.7), then the principal prefers contract B (high effort is demanded and supplied); and if the intersection occurs to the right of g(w) = 0, then the principal prefers contract A (low effort is demanded and supplied).

Figure 7.7 A case in which the equilibrium contract is high effort
(The figure plots w2 against w1 and shows the certainty line w1 = w2, the curve f(w) = 0, the line g(w) = 0, the points A and B, and the indifference curves Ee2 u = ū + d(e2) and Ee1 u = ū + d(e1).)

Summary
To summarise the case of perfect competition, we can note the following points:

1. If the optimal contract demands effort e2, then it is characterised by a certain wage for the agent.
2. If the optimal contract demands effort e1, then it is characterised by a higher wage in state 1 than in state 2, that is, it is a risky contract for the agent.
3. If the equilibrium contract demands high effort, e1, then the incentive compatibility condition binds, while if the equilibrium contract demands low effort, e2, then the incentive compatibility condition is slack.
4. Compared to the same problem under symmetric information, the principal is indifferent to whether or not the information is asymmetric, independently of the effort level that is demanded. The agent is also indifferent if low effort is demanded, but the agent is worse off under asymmetric information if the equilibrium contract demands high effort.

As a summary, for a problem of moral hazard with a monopolistic principal, we have the following results:

1. Whatever is the level of effort demanded, the equilibrium contract binds the agent's participation condition.
2. The optimal contract for low effort is located at the intersection of the agent's reservation utility indifference curve conditional upon low effort, and the certainty axis.
3. The optimal contract for high effort is located at the intersection of the agent's reservation utility indifference curve conditional upon high effort and the curve f(w) = 0.
4. If, in the equilibrium, the principal demands high effort, then the agent's incentive compatibility constraint binds, but if low effort is demanded, the incentive compatibility condition is slack.
5. When compared to the same problem under symmetric information, the agent is indifferent to whether or not the information is asymmetric, independently of the effort level that is demanded. The principal is also indifferent if low effort is demanded, but the principal is worse off under asymmetric information than under symmetric information if high effort is demanded.

Problems
1. Assume that a perfectly competitive principal decides to contract low effort. Then the risk aversion of the agent increases. Is it in the best interests of the principal to change to contracting high effort? Explain.
2. Assume that you observe the effort (high or low) demanded by a monopolistic principal under symmetric information. Can you then know what effort (high or low) this principal should contract under asymmetric information? Explain.
3. Assume a perfectly competitive principal, and an agent with utility √w − ei. Effort can be either e1 = 1, in which case the probability of state 2 is p1 = 1/3, or e2 = 0, in which case the probability of state 2 is p2 = 2/3. The principal earns x1 = 13 in state 1 and x2 = 7 in state 2. Find the equilibrium contract.
4. Assume a model of a monopolistic principal under moral hazard. The agent's utility function is 2√w − ei, where effort is either e1 = 2e or e2 = e, for some parameter e > 0. The agent has a reservation utility of ū = 10. If the agent uses high effort, e1, the probability of state 2 is p1 = 0.2, and if low effort e2 is used, the probability of state 2 is p2 = 0.6. The principal earns x1 = 200 in state 1 and x2 = 50 in state 2. Work out the optimal contracts for low and high effort, as functions of the parameter e. For which values of e does the principal prefer e1, and for which values of e does the principal prefer e2?
Part III

Appendices
Appendix A

Mathematical toolkit

While the study of any microeconomic problem can normally be carried out using graphs and logic alone, the use of some simple mathematical analysis is now very standard. After all, both graphics and logic are ways in which mathematical ideas and implications can be represented: a graph is nothing more than a drawing of a mathematical function or relationship, and a logical deduction of one statement from another is really a form of algebraic manipulation. However, both graphics and logic, without explicit use of the underlying mathematical relationships themselves, can, unless one is very careful, often lead to incorrect deductions, as imprecisions and errors can easily creep in. Pure mathematical reasoning, on the other hand, follows simple and consistent rules, and so looking for answers and intuition directly in a mathematical framework can often significantly simplify, rather than complicate, our task. Once that is understood, any student of microeconomic theory, at least at any level beyond what is purely introductory, should soon recognise the enormous benefit that is gained by the ability to analyse a problem mathematically.
In the present book, as was already stated in the Introduction, a certain degree of mathematical sophistication by the reader is assumed, although in actual fact the level of math that is used is by no means advanced: most, if not all, of the mathematical techniques used in the book are normally covered in high school or, at most, in first year university courses. Really, all that a student of this text needs to be fluent in is simple algebra and calculus, only to first and second derivatives. However, it is of fundamental importance that, in order to make mathematics useful for microeconomic analysis, the student fully understands exactly what each piece of the mathematical toolkit is actually doing. This is a far different story than simply being able to do the maths when asked to. It is only when you understand what the maths is doing that you will know why each technique is useful, and when.
Given the above, in this appendix we set out the basic mathematical techniques that are used over and over again in the text. Really there are only a very few of them, but students are well advised to be very comfortable with each of them before moving forward into the text proper. The tools set out in this appendix are the following: the implicit function theorem, considerations of concavity of functions, the Kuhn-Tucker method of constrained optimisation, and some very basic ideas regarding probability.

A.1 The implicit function theorem


Take a function, say f(x) = y, where x is a vector, and y is a scalar (a number). For all of the material covered in the text, we can simply take x to be a two-dimensional vector, that is, x = (x1, x2). Now, the first thing to note is that many (perhaps infinitely many) different vectors x can be consistent with any given value y. For example, if f(x) is increasing in xi, i = 1, 2 (i.e., an increase in either of the elements of the vector x will increase the value of f(x)), then, in principle, for any particular increase in x1, which will increase the value of f(x), we should be able to find a corresponding decrease in x2, which will decrease the value of f(x), such that the increase and the decrease in the value of the function exactly cancel each other out. The implicit function theorem considers the particular relationship between the elements of the vector x such that changes in them leave the value of the function itself unaltered.
Given that, assume now that the function f is required to take a particular value, say ȳ, independently of the vector x that is chosen. Furthermore, without any loss of generality, let us write x2 as if it were a function of x1, that is, we shall write x2(x1). This is perfectly permissible, since either x2 depends on x1 (in which case writing the former as a function of the latter is clearly valid), or it doesn't. But when x2 does not depend upon x1, all we are really saying is that there is a special type of dependence, one that has zero slope, that is, a marginal change in x1 gives a zero change in x2. Thus, the case at hand can be summed up in the expression

f(x1, x2(x1)) = ȳ

and we are interested in finding out about the relationship between x2 and x1, given that the function f will always take the particular value ȳ. Using the chain rule, we simply need to differentiate the previous expression with respect to x1:

∂f(x)/∂x1 + (∂f(x)/∂x2)(∂x2/∂x1) = ∂ȳ/∂x1 = 0
Simply re-ordering this equation, we end up with what is known as the implicit function theorem:

∂x2/∂x1 |_{df=0} = − (∂f(x)/∂x1) / (∂f(x)/∂x2)

Clearly, this is valid only when ∂f(x)/∂x2 ≠ 0. The df = 0 that appears on the left-hand side of the above expression is just to remind us that the value of the function f is held constant (at ȳ) when the effect of x1 on x2 is calculated.
You may have noticed that I have used a partial derivative sign (∂) when looking at the effect that a change in x1 has upon x2, rather than a normal derivative (d). We would tend to use d rather than ∂ when the only variable that can affect x2 is x1, and ∂ when there may be other variables that will affect x2. The partial derivative notation is more general, and so I prefer to retain it here; indeed, for many of the applications that we make of the implicit function theorem it will be true that there are other variables in the implicit relationship between the two x variables.
Let us consider three simple examples of where the implicit function theorem comes in handy. The examples are all taken from second year (intermediate) microeconomics.

Example 1 The budget constraint (the upper frontier of the feasible set) in a problem of consumer choice, with two goods, and with fixed prices (p1 and p2) and income (w), is given by p1x1 + p2x2 = w. Define g(x) ≡ p1x1 + p2x2. Then since w is fixed, we can directly apply the implicit function theorem to find the slope of the budget constraint:

∂x2/∂x1 |_{dg=0} = − (∂g(x)/∂x1) / (∂g(x)/∂x2) = − p1/p2
Example 2 An indifference curve for the same problem as in the previous example is simply the set of vectors (consumption bundles) x that are all consistent with some given value of utility; {x : u(x1, x2) = k}. Again, we have a function that is restrained to a particular value, and so from the implicit function theorem we can directly calculate the slope of an indifference curve at any point (the marginal rate of substitution):

∂x2/∂x1 |_{du=0} = − (∂u(x)/∂x1) / (∂u(x)/∂x2) = MRS(x)

Example 3 Finally, consider the case of a firm in a short run perfectly competitive industry, with factors of production labour (L) and capital (K), but where, due to the assumption of short run, capital is held fixed at K̄.
We know that since we are in perfect competition, the price at which output is sold is also a constant, say p, and so the objective of the firm is to maximise its profits, pf(L, K̄) − wL − rK̄, with respect to the choice of L. In this profit function, the per-unit wage for labour is w, the per-unit cost of capital is r, and f(L, K) is the production function.
The optimal level of labour, L*, must satisfy the first-order condition, where marginal profit is zero, that is, p ∂f(L*, K̄)/∂L = w. This equation simply states that the optimal level of labour must satisfy the criterion that the marginal revenue product of labour is equal to the marginal cost of labour.
Now, how is the optimal level of labour affected by a change in its per-unit wage, w? To find out, note that the first-order condition is p ∂f(L*, K̄)/∂L − w = 0, which we can interpret as an equation of the type h(L, w) = 0, and so we can simply apply the implicit function theorem to conclude that so long as the marginal product of labour is decreasing, the demand curve for labour is negatively sloped (i.e., an increase in the wage rate will decrease the optimal employment of labour):

∂L*/∂w |_{dh=0} = − (∂h(L, w)/∂w) / (∂h(L, w)/∂L) = − (−1) / (p ∂²f/∂L²) = 1 / (p ∂²f/∂L²) < 0
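As a quick sanity check on Example 3, the following sketch (a minimal illustration using the hypothetical production function f(L, K̄) = √L and the sympy library) solves the first-order condition for L*, differentiates it directly with respect to w, and confirms that the answer coincides with the implicit-function-theorem expression 1/(p ∂²f/∂L²) evaluated at L*.

```python
# Verify the implicit function theorem for the labour demand of Example 3,
# using the hypothetical short-run production function f(L) = sqrt(L).
import sympy as sp

L, w, p = sp.symbols('L w p', positive=True)
f = sp.sqrt(L)                        # f(L, Kbar) with Kbar fixed
foc = p * sp.diff(f, L) - w           # h(L, w) = p*f_L - w = 0
Lstar = sp.solve(foc, L)[0]           # optimal labour demand L*(w, p)

direct = sp.diff(Lstar, w)                             # dL*/dw computed directly
via_ift = (1 / (p * sp.diff(f, L, 2))).subs(L, Lstar)  # 1 / (p * f_LL) at L*

print(sp.simplify(direct - via_ift))  # prints 0: the two expressions coincide
```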
A.2 Concavity of functions, convexity of sets and convex combinations
In most of economic theory, the concavity of certain functions turns
out to be very important. There are several types of concavity, but
here we shall mostly be interested in one particular type (strict con-
cavity), although we shall be interested in the concavity of functions
of two independent variables, and of functions of only one independent
variable. Let us start with the case of functions of one independent
variable. Take any general function, again let us take the expression
f (x) = y, but where both x and y are now scalars. The function f (x)
is concave if a straight line joining any two points on it lies entirely
below the function itself (see Figure A.1).

Figure A.1 A concave function
(The figure plots y = f(x) against x, marking the points (x1, y1) and (x2, y2) on the curve.)

There is a simple way of writing the requirement that a straight line between any two points on a function lies beneath the function itself, using what is known as a convex combination. Let's take a careful look at this. Take two given points on the function f(x), say those corresponding to x1 and x2. These two points correspond to the values y1 = f(x1) and y2 = f(x2). The straight line that joins these two points in (x, y) space is given by an equation of the type:

y = a + bx

where a and b are constants. Of course, given the two points (x1, y1) and (x2, y2), we can actually solve the two equations yi = a + bxi, i = 1, 2, in the two unknowns a and b, but we do not need to actually do that right now. All we need to note is that any value of x that lies between x1 and x2 can be written as a weighted average of the two extremes. That is, if we write x3 = λx1 + (1 − λ)x2, then so long as we take 0 < λ < 1, we get min{x1, x2} < x3 < max{x1, x2}. Now, consider the value of y that corresponds to our x3 thus defined. Using the equation for a straight line we have

y3 = a + bx3
   = a + b[λx1 + (1 − λ)x2]
   = [λa + (1 − λ)a] + λbx1 + (1 − λ)bx2
   = λ(a + bx1) + (1 − λ)(a + bx2)
   = λy1 + (1 − λ)y2

where (at the third step) we have used the obvious fact that λa + (1 − λ)a = a.
All of this tells us that, given any two points (x1, y1) and (x2, y2), any other point on the straight line that joins them can be defined as the point (x3, y3) = (λx1 + (1 − λ)x2, λy1 + (1 − λ)y2), so long as 0 < λ < 1. Therefore, the statement that the straight line joining any two points on a function like that shown in Figure A.1 lies beneath the function itself can be written mathematically as follows: for any x1 and x2, and for any λ : 0 < λ < 1,

f(λx1 + (1 − λ)x2) > λf(x1) + (1 − λ)f(x2)

This is known as Jensen's inequality for strictly concave functions. By simply changing the direction of the inequality in the above equation, we have the definition of a strictly convex function, and if the inequality is replaced by an equality, then the function f(x) is linear. In fact, if f(λx1 + (1 − λ)x2) ≥ λf(x1) + (1 − λ)f(x2), then f(x) is a concave function, but not necessarily strictly concave. Thus, linear functions, which still satisfy the inequality f(λx1 + (1 − λ)x2) ≥ λf(x1) + (1 − λ)f(x2), are still concave, just not strictly concave.
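As a quick numerical illustration (a minimal sketch using the strictly concave function f(x) = √x, which is only one possible choice), the following checks Jensen's inequality at randomly drawn points and weights.

```python
# Numerically check Jensen's inequality for the strictly concave function
# f(x) = sqrt(x): f(l*x1 + (1-l)*x2) > l*f(x1) + (1-l)*f(x2) for 0 < l < 1.
import math
import random

def f(x):
    return math.sqrt(x)

random.seed(1)
for _ in range(5):
    x1, x2 = random.uniform(0.1, 10), random.uniform(0.1, 10)
    lam = random.uniform(0.01, 0.99)
    lhs = f(lam * x1 + (1 - lam) * x2)      # value of f at the convex combination
    rhs = lam * f(x1) + (1 - lam) * f(x2)   # the chord (straight line) at the same point
    print(round(lhs, 4), ">", round(rhs, 4), lhs > rhs)
```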
For scalar functions, it is obvious that concavity is exactly equivalent to the statement that the first derivative of the function is decreasing: as x increases, the value of f′(x) decreases.¹ In short, if f(x) is a strictly concave scalar function (i.e., it satisfies Jensen's inequality for strictly concave functions), then it holds that f″(x) ≤ 0, and in practice we shall simply work with f″(x) < 0.
How about the case when x is a vector? It turns out that there is no formal difference when we use Jensen's inequality (Jensen's inequality is still how concavity is defined), but now we cannot affirm that a function is concave only by considering its second derivatives. Let us now consider that we are dealing with a function like f(x) = y, where x is a vector (for our purposes, again it is sufficient that it is a two-dimensional vector), and y is a scalar. As we have seen in the previous discussion on the implicit function theorem, any given value of y can be consistent with lots of different vectors x. Let's consider two such vectors,² say x^1 and x^2, that is, we have f(x^1) = f(x^2). Then the convex combination λx^1 + (1 − λ)x^2 defines a third vector that lies on the straight line that joins x^1 and x^2 in two-dimensional space. So the function f(x) is concave if, for any x^1 and x^2, and for any λ : 0 < λ < 1, it turns out that f(λx^1 + (1 − λ)x^2) > λf(x^1) + (1 − λ)f(x^2).
For example, go back to intermediate consumption theory. Recall that, in the (two-dimensional) space of consumption possibilities, an indifference curve (the curve that maintains utility constant) is often drawn convex to the origin (see Figure A.2). It turns out that, if the underlying utility function is concave, then we will get exactly such a representation for the indifference curves. To see why, take two points on an indifference curve (these are just two vectors in two-dimensional space), and by drawing the straight line that joins them we can represent all the possible convex combinations of these two initial vectors. But if the value of utility at any of these intermediate points is greater than at either extreme, then the initial indifference curve must lie beneath this straight line; that is, if the utility function is concave, then it generates convex indifference curves. Formally, looking at Figure A.2, we have u(x^1) = u(x^2). A point on the straight line between the vector points x^1 and x^2 is a point like λx^1 + (1 − λ)x^2.
¹ You may want to take a closer look at the case of a decreasing function, that is, one for which f′(x) < 0. Mathematically, there is no difference, but you have to remember that now a (graphically) steeper function corresponds to a smaller slope, since it corresponds to a negative number that is further from 0.
² Note that we are indicating different vectors by a super-index, and different elements of vectors by sub-indexes. Thus, x^1_2 is the second element of the vector indicated by x^1.
Then, if the indifference curve is convex, and assuming that utility is increasing in each good, the graph shows that u(λx^1 + (1 − λ)x^2) > λu(x^1) + (1 − λ)u(x^2), which is consistent with a concave utility function.

Figure A.2 A convex indifference curve
(The figure plots x2 against x1 and shows the indifference curve u(x) = constant, with the two points x^1 and x^2 marked on it.)

There are two important points to note here. First, the utility function and indifference curve example is special, since it corresponds to a specific assumption on the first derivatives of the function (utility is increasing in each argument). This is what leads to decreasing convex indifference curves. Mathematically, an indifference curve is really just a contour, or level set, of the underlying function, since it is the set of all vectors such that the function itself does not alter its value. Try to draw the graph of a contour of a concave function f(x1, x2) that is increasing in one argument and decreasing in the other. Or a contour of a concave function that is decreasing in both arguments.
The second important point to note is that you should never confuse a contour with the function that generates it. In terms of utility theory, an indifference curve is an entirely different concept to a utility function. We are interested in the concavity of the utility function, and such a function generates convex contours. Be very careful that you fully understand the difference before going on.

careful that you fully understand the dierence before going on.

Quasi-concavity and quasi-convexity

There is a second, more subtle, particularity of the above argument on indifference curves. It was assumed that the utilities of the two extreme points are equal, that is, the graph shows a situation for which u(x^1) = u(x^2). Looking into a more general case leads us directly to the concepts of quasi-concavity and quasi-convexity. Although the present textbook is not overly concerned with issues of quasi-concavity/convexity, it is nevertheless a very important concept in economics.
Before getting down to the task of defining and discussing quasi-concavity and quasi-convexity, it is worthwhile to look at something that is perhaps a bit more familiar: a convex set. Say X is a set of points, or vectors, x. The mathematical definition of a convex set is the following:

X is convex if, for all x^i, x^j ∈ X and all λ : 0 ≤ λ ≤ 1, we have x^k(λ) ∈ X

where, of course, x^k(λ) ≡ λx^i + (1 − λ)x^j. We have already used this type of thing above. x^k(λ) is a convex combination of the points x^i and x^j, and so it defines the set of points that lie on the straight line joining these two points. Thus, in the two-dimensional plane, if the straight line that joins any two points belonging to a given set lies entirely within the same set, then that set is convex.³

Example 4 Think of the set of consumption bundles that are at least as preferred as any given bundle. Graphically, this defines the set of points on or above a given indifference curve. If the consumer's indifference curve in question is convex (as would be normal), then the indicated set must be convex.

Now that we know what a convex set is, we can go on to look at quasi-concave and quasi-convex functions.

1. A function h(x) is quasi-concave if, for all c, the set X(c) ≡ {x : h(x) ≥ c} is a convex set.
2. A function h(x) is quasi-convex if, for all c, the set X(c) ≡ {x : h(x) ≤ c} is a convex set.

³ A set is either convex or non-convex. There is no such thing as a concave set.

From the definition of quasi-concavity, it can be seen that for any two vectors x^i, x^j, and for any λ satisfying 0 ≤ λ ≤ 1, if h(x) is quasi-concave then it is true that min{h(x^i), h(x^j)} ≤ h(x^k(λ)). In the same way, if h(x) is quasi-convex, then for any two vectors x^i, x^j, and for any λ satisfying 0 ≤ λ ≤ 1, it is true that max{h(x^i), h(x^j)} ≥ h(x^k(λ)). To see this, note that for any x^i and x^j, we can define min{h(x^i), h(x^j)} ≡ c. Then both x^i and x^j belong to the set X(c) ≡ {x : h(x) ≥ c}. But if h(x) is quasi-concave, then X(c) is a convex set, and so x^k(λ) must also belong to X(c), that is, min{h(x^i), h(x^j)} = c ≤ h(x^k(λ)). The case of quasi-convex functions can be proved in the same way.
Any concave function is quasi-concave (because it is always true that min{h(x^i), h(x^j)} ≤ λh(x^i) + (1 − λ)h(x^j)), and any convex function is quasi-convex (since max{h(x^i), h(x^j)} ≥ λh(x^i) + (1 − λ)h(x^j)), but neither of the reverse affirmations is true. When the strict inequality in any of the definitions of concavity/convexity is used, then the corresponding characteristic is strict (e.g., h(x^k(λ)) > λh(x^i) + (1 − λ)h(x^j) implies that h(x) is strictly concave).

Example 5 Consider the expenditure function of a two-dimensional consumer choice problem; g(x) = p1x1 + p2x2, where pi is the unit price of good xi. This is a linear function, and thus it is convex, and correspondingly it is also quasi-convex. Then, for any two points x^i and x^j we can define max{g(x^i), g(x^j)} ≡ w, where w is the consumer's wealth. Given that, we can now define the set of points X(p, w) ≡ {x : g(x) ≤ w}. In this way, both x^i and x^j belong to the set X(p, w). Now, since g(x) is quasi-convex, X(p, w) is a convex set. In short, since the expenditure function is quasi-convex, the budget set X(p, w) ≡ {x : g(x) ≤ w} is a convex set.
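A small numerical illustration of the quasi-concavity inequality is given below. It is a minimal sketch under the assumption that the (hypothetical) utility function is u(x1, x2) = x1·x2, which is quasi-concave but not concave on the positive orthant, so it is a case where the weaker property is exactly what matters.

```python
# Check the quasi-concavity inequality min{u(a), u(b)} <= u(convex combination)
# for the hypothetical function u(x1, x2) = x1 * x2 at random bundles.
import random

def u(x):
    return x[0] * x[1]

random.seed(0)
for _ in range(5):
    a = (random.uniform(0.1, 5), random.uniform(0.1, 5))
    b = (random.uniform(0.1, 5), random.uniform(0.1, 5))
    lam = random.uniform(0, 1)
    combo = (lam * a[0] + (1 - lam) * b[0], lam * a[1] + (1 - lam) * b[1])
    print(min(u(a), u(b)) <= u(combo))   # expected: True every time
```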

Now you should be able to see that the previous discussion concerning concave utility functions and convex indifference curves can also be framed in terms of quasi-concavity. The minimal concavity requisite on a consumer's utility function for each indifference curve to be convex in goods space is that the utility function be quasi-concave. Naturally, this (in its strict form) is also the minimal requirement for a choice problem with a convex budget set to be guaranteed to have a unique optimal point.
In short, what you should understand from the above is that quasi-concavity of the utility function is really what will generate convex indifference curves of the type that are so commonly drawn in intermediate microeconomics. However, any concave utility function is also quasi-concave, and so all concave utility functions will also generate convex indifference curves. As it happens, concavity is a much more useful characteristic in terms of mathematical representation (you only need to use either Jensen's inequality or, if the function is scalar, even only second derivatives), while quasi-concavity is somewhat less user-friendly as a mathematical expression. So, even though we are really interested only in convex indifference curves (since that is what we associate with the concept of decreasing marginal rate of substitution, and it is also what guarantees a unique optimum in a traditional consumer problem), in the present text we restrict ourselves to concave, rather than quasi-concave, utility forms.⁴

A.3 The Kuhn-Tucker method of constrained optimisation
In microeconomics, we are often interested in finding the solution to problems of the type: maximise, with respect to a choice of x, the value of the function f(x) subject to the constraints gi(x) ≤ bi, i = 1, ..., m, that is,

max_x f(x)  s.t.  gi(x) ≤ bi, i = 1, ..., m

where f(x) is increasing and concave and each gi(x) is increasing and convex. Certainly the most familiar example is maximising the utility of consumption subject to a budget constraint and non-negativity of the goods in question. You may have seen problems that look very similar to this, but where the ≤ that appears in the restrictions is written as an equality. There is a significant difference between problems with inequality restrictions and problems with equality restrictions, and here we will be interested only in the former.
⁴ Actually, you will see that when we study risky rather than certain environments, as is the case for the present text, we do indeed require concavity of utility in goods, not just quasi-concavity, in order to guarantee that our decision maker is what we call risk averse. This is shown in the next appendix.
Before doing anything general, consider the case of m = 1, that is, only one restriction. In this case, the problem is:

max_x f(x)  s.t.  g(x) ≤ b

Since we are assuming that f(x) is continuous, increasing and strictly concave, and that g(x) is increasing and convex (in which case the feasible set is compact and convex), we know that there exists a unique optimal vector,⁵ x*. Besides, we also know that in the solution the restriction must bind, otherwise there is room to increase at least one xi, which would increase the value of f(x) since it is an increasing function. Thus x* satisfies the equation g(x*) = b. By the implicit function theorem, the slope of the contour g(x) = b at the point x* is:

∂x2/∂x1 |_{dg(x)=0} = − (∂g(x*)/∂x1) / (∂g(x*)/∂x2)

On the other hand, the slope of the contour of the objective function passing through the point x* is given by

∂x2/∂x1 |_{df(x)=0} = − (∂f(x*)/∂x1) / (∂f(x*)/∂x2)

It is impossible that these two slopes not be equal at x*. To see why, simply note that if the two slopes were different at that point, then the two contours must cut each other at that point. But then the contour of f(x) must pass through some point, x′, that lies below the contour g(x) = b. That is, we would have x′ indifferent to x* with g(x′) < b. But then there must exist a third point, say x″, such that x″ is preferred to x′, and such that g(x″) ≤ b. Finally, by transitivity of preferences, we have the result that x″ is also preferred to x*, which contradicts the initial hypothesis that x* was the optimal vector, since x″ is both feasible and preferred to x*. Therefore, in any solution to the problem, the two contours must have the same slope:

(∂g(x*)/∂x1) / (∂g(x*)/∂x2) = (∂f(x*)/∂x1) / (∂f(x*)/∂x2)    (A.1)

⁵ We know this from the Weierstrass Theorem.
Together, equation (A.1) and the fact that g(x*) = b are two equations in the two unknowns, x1* and x2*, and so their simultaneous solution gives us the solution to the initial constrained optimisation problem.
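To make this concrete, the following sketch solves the tangency condition (A.1) together with g(x*) = b with sympy. It is a minimal example under assumed functional forms: the objective f(x) = √(x1·x2) and the constraint 2x1 + x2 ≤ 100 are hypothetical choices, not taken from the text.

```python
# Solve the two-equation system of the one-constraint case: the tangency
# condition (A.1) together with the binding constraint g(x*) = b, for the
# hypothetical problem  max sqrt(x1*x2)  s.t.  2*x1 + x2 <= 100.
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
f = sp.sqrt(x1 * x2)     # objective: increasing and concave for x > 0
g = 2 * x1 + x2          # constraint function, with b = 100
b = 100

tangency = sp.Eq(sp.diff(g, x1) / sp.diff(g, x2), sp.diff(f, x1) / sp.diff(f, x2))  # (A.1)
binding = sp.Eq(g, b)                                                                # g(x*) = b

print(sp.solve([tangency, binding], [x1, x2], dict=True))   # [{x1: 25, x2: 50}]
```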
It is worthwhile to point out clearly that, although the solution to the above problem involves g(x*) = b, this equality was not directly assumed at any point. The underlying restriction for the problem is g(x*) ≤ b, and the fact that this is solved with equality rather than with inequality appears endogenously as we solve the problem. You should see, from the above logic, that the equality has in fact been a direct result of the assumption that the objective function f(x) is increasing in the elements of the vector x.
In order to solve a more general problem, with any number of restrictions, we cannot fall back on the simple intuition that was just used. The reason is that, although it will always be true that at least one restriction binds (due to the fact that the objective function is increasing in all variables), we cannot know for sure which one or which ones. So it is impossible to know which of the equations gi(x) = bi are valid for obtaining the solution. In order to solve the problem, it is convenient to transform it into a second maximisation problem with no restrictions.
In a variant of the well-known Lagrange method of solving problems with equality constraints,⁶ Harold Kuhn and Albert Tucker proved that the solution to the general problem coincides with the solution to the alternative problem:

max_x L(x, λ) ≡ f(x) + Σ_{i=1}^{m} λi [bi − gi(x)]

where the vector λ = (λ1, ..., λm) contains non-negative numbers known as Lagrange multipliers, each of which is defined by

λi [bi − gi(x*)] = 0,  i = 1, ..., m    (A.2)

where x* is the vector that maximises the Lagrangean function, L(x, λ). Since the Lagrange programme is a maximisation problem with no restrictions, it is generally much easier to solve than the original problem, although the number of variables has increased with the addition of the m multipliers.
⁶ Named after the Italian-born mathematician Joseph-Louis Lagrange (1736–1813).
To see the logic underlying the Lagrange method, note that since f(x) is concave, and each gi(x) is convex (and so −gi(x) is concave), and since the multipliers are non-negative, it turns out that L(x, λ) is concave in x. Thus the global maximum of L(x, λ) is found where its first derivatives are 0. Call the point that achieves this x*. Now, by the very definition of a global maximum, it is true that

f(x*) + Σ_{i=1}^{m} λi [bi − gi(x*)] ≥ f(x) + Σ_{i=1}^{m} λi [bi − gi(x)]  for all x

But since the multipliers are defined such that Σ_{i=1}^{m} λi [bi − gi(x*)] = 0, it holds that

f(x*) ≥ f(x) + Σ_{i=1}^{m} λi [bi − gi(x)]  for all x

Finally, since the multipliers are non-negative, we get λi [bi − gi(x)] ≥ 0 whenever bi − gi(x) ≥ 0. And so we can conclude that:

f(x*) ≥ f(x)  for all x : gi(x) ≤ bi, i = 1, ..., m

which is what is required of the solution to the original problem.


The solution to the problem can be calculated by using the two
equations that guarantee that x is a critical point of L(x, ), and the
m equations that determine the multipliers

L(x, ) f (x )  m gj (x )
= i = 0 i = 1, 2 (A.3)
xi xi j=1 xi

i [bi gi (x )] = 0 i = 1, ..., m (A.4)


It is habitual to refer to the equations (A.3) as the rst-order
conditions, and to the equations (A.4) as the complementary slackness
conditions, although all together the set of equations (A.3) and (A.4)
are known as the Kuhn-Tucker conditions. Together they form a set
of m + 2 equations in m + 2 unknowns (the two variables in the vector
x and the m Lagrange multipliers).
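A numerical sanity check can be run with an off-the-shelf solver. The sketch below is a minimal, hypothetical example (maximise ln x1 + ln x2 subject to x1 + 2x2 ≤ 12 and x1 ≤ 8, solved with scipy's SLSQP routine by minimising the negative of the objective); it is not the book's own procedure, but it illustrates the Kuhn-Tucker logic: at the solution the budget-type constraint binds while the second constraint is slack, so its multiplier is zero.

```python
# Minimal Kuhn-Tucker illustration with a numerical solver (hypothetical
# problem): maximise ln(x1) + ln(x2) s.t. x1 + 2*x2 <= 12 and x1 <= 8.
# SLSQP minimises, so we pass the negative of the objective; 'ineq'
# constraints are expressed as fun(x) >= 0, i.e. b_i - g_i(x) >= 0.
import numpy as np
from scipy.optimize import minimize

objective = lambda x: -(np.log(x[0]) + np.log(x[1]))
constraints = [
    {"type": "ineq", "fun": lambda x: 12 - x[0] - 2 * x[1]},  # b1 - g1(x) >= 0
    {"type": "ineq", "fun": lambda x: 8 - x[0]},              # b2 - g2(x) >= 0
]
res = minimize(objective, x0=[1.0, 1.0], method="SLSQP",
               bounds=[(1e-6, None), (1e-6, None)], constraints=constraints)

x1, x2 = res.x
print(np.round(res.x, 4))              # approximately [6, 3]
print(round(12 - x1 - 2 * x2, 4))      # ~0: the first restriction binds
print(round(8 - x1, 4))                # ~2 > 0: the second is slack, so lambda_2 = 0
```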
In general, we can write the solution to the problem as

xi* = xi*(b, g),  i = 1, 2

where b is the vector with elements bi, i = 1, ..., m, and g is the vector of the restrictions gi, i = 1, ..., m, which would, of course, normally be described by a series of parameters. The value of the objective function at the optimal solution, f(x*) ≡ v(b, g), is known as the indirect objective function. Since the solution to the problem must satisfy the complementary slackness conditions (A.4), it turns out that v(b, g) = L(x*, λ).
Finally, let us consider the economic significance of the Lagrange multipliers. To do so, differentiate the Lagrangean function at the optimal vector with respect to one of the restraining parameters, bk:

∂L(·)/∂bk = Σ_{i=1}^{2} (∂f(x*)/∂xi)(∂xi*/∂bk) − Σ_{i=1}^{2} Σ_{j=1}^{m} λj (∂gj(x*)/∂xi)(∂xi*/∂bk) + Σ_{j=1}^{m} (∂λj/∂bk)[bj − gj(x*)] + λk

Joining together the first two terms, we get

∂L(·)/∂bk = Σ_{i=1}^{2} [∂f(x*)/∂xi − Σ_{j=1}^{m} λj ∂gj(x*)/∂xi] (∂xi*/∂bk) + Σ_{j=1}^{m} (∂λj/∂bk)[bj − gj(x*)] + λk

But, from the first-order conditions (A.3), the first term of this is exactly 0, and so we are left with

∂L(·)/∂bk = Σ_{j=1}^{m} (∂λj/∂bk)[bj − gj(x*)] + λk

However, from the complementary slackness conditions (A.4), in order that we have λj [bj − gj(x*)] = 0, it must be that either [bj − gj(x*)] = 0 and λj ≥ 0 (in which case, whatever the value of ∂λj/∂bk, we must have (∂λj/∂bk)[bj − gj(x*)] = 0) or [bj − gj(x*)] > 0 and λj = 0 (in which case again (∂λj/∂bk)[bj − gj(x*)] = 0, this time since ∂λj/∂bk = 0). In short, we always have

∂L(x*, λ)/∂bk = λk

So in the end the complementary slackness conditions imply that L(x*, λ) = f(x*), and the Lagrange multipliers measure the amount by which the objective function would increase if one of the restrictions is relaxed marginally. Note that if the restriction in question was binding in the optimum, then λk > 0, and relaxing the restriction would have the effect of increasing the value of the objective function. This is simply because if we relax a restriction that was binding, then clearly the optimal vector will change for the better. On the other hand, if the restriction in question was not binding, then we know (from the complementary slackness condition) that λk = 0, and relaxing that restriction has no effect at all on the value of the objective function (since it will not lead to a change in the optimal vector).
The fact that the Lagrange multipliers measure the increase that is obtained in the objective function when a restriction is relaxed has led to them becoming known as shadow prices of scarce resources. Recall that the values of b measure the amounts of scarce resources that are available to be dedicated to the maximisation problem; they are the resources that restrict the values that the functions gi(x) can take. If we were to ask our individual how much she is willing to sacrifice, in units of the objective function, to obtain a marginal additional unit of a scarce resource (one of the parameters bk), the answer would be that she would sacrifice at most λk units of the objective function, since that is exactly what she can expect to obtain in return.
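As a rough numerical illustration of the shadow-price interpretation, the sketch below re-uses the hypothetical problem from the previous code block: it relaxes the first restriction slightly, estimates dv/db by a finite difference, and compares it with the multiplier implied by the first-order condition (for that problem, λ1 = 1/x1*).

```python
# Shadow-price check for the hypothetical problem above: maximise
# ln(x1) + ln(x2) subject to x1 + 2*x2 <= b (the constraint x1 <= 8 is slack).
# The multiplier on the binding restriction should equal dv/db.
import numpy as np
from scipy.optimize import minimize

def value(b):
    res = minimize(lambda x: -(np.log(x[0]) + np.log(x[1])), x0=[1.0, 1.0],
                   method="SLSQP", bounds=[(1e-6, None), (1e-6, None)],
                   constraints=[{"type": "ineq", "fun": lambda x: b - x[0] - 2 * x[1]},
                                {"type": "ineq", "fun": lambda x: 8 - x[0]}])
    return -res.fun, res.x

eps = 0.01
v0, xstar = value(12.0)
v1, _ = value(12.0 + eps)
print(round((v1 - v0) / eps, 4))   # finite-difference estimate of dv/db
print(round(1 / xstar[0], 4))      # lambda_1 from the first-order condition f_x1 = lambda_1 * g_x1
```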

A.4 Probability and lotteries


Mathematicians have discussed the concept of probability for centuries. It seems to be basically accepted that by the term probability we mean a numerical representation of an estimation of the degree of faith in the truth of a statement. For example, when a fair six-sided dice is thrown, the objective probability that the outcome will be an odd number is the same as the objective probability that the outcome will be a number not greater than 3. In these cases, it is relatively simple to assign a probability measure, that is, to define probabilities numerically, and the characteristics of such probabilities are well known. However, it is far from clear that the same is true of subjective probabilities.
During the years 1920–1950, theoretical statisticians (in particular, Bruno de Finetti and Leonard Savage) studied decision making under uncertainty; above all, they thought about the problem of when it is possible to assign numerical probabilities, that is, subjective probabilities. In short,⁷ we can point out the result that if the set of possible outcomes can be divided into a sufficiently large number of independent events, then there will exist a probability measure that represents subjective probabilities, in the sense that if A is not less probable than B, then the corresponding probability measure assigns numbers p(A) and p(B) such that p(A) ≥ p(B). However, for our purposes, this result is not particularly useful, since we will typically be considering simple cases with a small number of possible outcomes (two, or at most three) that cannot be sub-divided. For us, a simple definition of probability will suffice.
Let x̃ be a random variable,⁸ and let X be the set of values that x̃ can take. Naturally, the set X cannot be empty, X ≠ ∅. We shall identify any general element of X by xi, and we shall assume that there are z different elements in X, that is, X can be thought of as a vector with z elements; X = (x1, x2, ..., xz). Now, if z = 1, that is, there is only one element in X, then we say that x̃ is a constant (it is deterministic). A deterministic variable is also sometimes referred to as a degenerate random variable. On the other hand, if z > 1, then we say that x̃ is a random variable (it is stochastic). We use the term lottery to describe the mechanism by which a particular element of X is assigned to x̃.
When a lottery is repeated many times independently, we obtain a list of the values that have been assigned to x̃ in each trial. Denote by ni(m) the number of times that the particular value xi was assigned to x̃ when the lottery is repeated m times. In this way we obtain the vector n(m) = (n1(m), n2(m), ..., nz(m)), where Σ_{i=1}^{z} ni(m) = m. On the other hand, we also obtain the relative frequencies of each xi, defined by the vector r(m) = (n1(m)/m, n2(m)/m, ..., nz(m)/m). Of course, the relative frequencies are numbers with the properties that, for any given m, we have 0 ≤ ni(m)/m ≤ 1 for all i, and Σ_{i=1}^{z} ni(m)/m = 1. It is important to note that the relative frequencies of X refer to the past, while the concept of probability that we are searching for refers to the future.
Now, we can use the following definition of probability: the probability of xi, denoted by pi, is the belief that the individual has for the relative frequency of xi that would be obtained if the lottery were repeated m independent times, where m → ∞. In mathematical terms, if we denote by ni^e(m) the number of times that the individual believes the value xi will be assigned in m independent repetitions of the lottery, then

pi = lim_{m→∞} ni^e(m)/m

⁷ For a more detailed account, see The Foundations of Statistics, by Leonard Savage, originally published by J. Wiley & Sons in 1954.
⁸ In all of the present text, random variables (those that can take on more than one final value) will be indicated by a curly line above the variable.
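The frequency interpretation is easy to illustrate by simulation. The following is a minimal sketch (the dice example is the standard one used in the text; the number of repetitions chosen is arbitrary) showing the relative frequency of a particular outcome settling down towards its objective probability as m grows.

```python
# Simulate repeated throws of a fair six-sided dice and watch the relative
# frequency of the outcome "6" approach the objective probability 1/6
# as the number of repetitions m grows.
import random

random.seed(42)
for m in (100, 1_000, 10_000, 100_000):
    n6 = sum(1 for _ in range(m) if random.randint(1, 6) == 6)  # n_i(m) for x_i = 6
    print(m, round(n6 / m, 4))                                  # relative frequency n_i(m)/m
print("objective probability:", round(1 / 6, 4))
```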
It is important to note that a probability is a belief, that is, in all cases it is a personal or subjective measure. Nevertheless, if there is a unanimously held belief on pi, that is, everyone is in complete agreement as to the value of the probability, then we say that the probability in question is objective (like, for example, the probability of throwing a 6 on a single toss of a dice). In any case, with the above definition of probability, which adequately covers our immediate requirements, we have a simple numerical measure for probability, which is all we need for a formal analysis of choice in a stochastic environment.
Given this discussion, it should be clear that the case of objective probability (choice under risk) is really a special case of uncertainty, in which our subject uses as his subjective probability measure the common objective probability. We directly assume, for everything that we study in this text, that numerical probabilities exist that describe the randomness of any stochastic parameters.
Appendix B

A primer on consumer
theory under certainty,
and indirect utility

B.1 The basic microeconomic problem


Pretty much all of intermediate undergraduate microeconomic theory can be reduced to the study of a single constrained maximisation problem:

max_x f(x)  subject to  gi(x) ≤ bi,  i = 1, 2, ..., m    (B.1)

For the case of consumer theory, x can be considered to be the commodity bundle, and f(x) the utility function. Assuming that there are only two different goods in the commodity bundle, x = [x1, x2], the constraints in the consumer case are, on the one hand, the budget constraint, g1(x) = p1x1 + p2x2 ≤ w = b1, where pi is the unit price of good i, and w is the consumer's wealth (or income, if we are modelling a choice in a given period). On the other hand, we have the non-negativity constraints, g2(x) = −x1 ≤ 0 = b2 and g3(x) = −x2 ≤ 0 = b3.
The case of producer theory is also simple to accommodate; x would be the bundle of produced outputs (often assumed to be a single product in the simple case), f(x) is the firm's profit function, and the constraints gi(x) are aspects such as the demand function for the product in question, and technological restrictions concerning the availability and pricing of inputs, efficient production and non-negativity of output.¹
Once certain assumptions are placed upon the functions f (x) and
gi (x), problem (B.1) has a unique solution, and the study of this
solution is what occupies much of intermediate microeconomics. Given
the similarities between consumer and producer theory, in the present
text we shall be mainly concerned with the former, though we shall
look into the complexities of the problem in greater detail than in
typical intermediate microeconomics courses. Specifically, we shall
interest ourselves in exactly what are the assumptions on the func-
tions involved that are required for the problem to have a unique
solution, and how these assumptions present themselves graphically.
The idea is to look more deeply into the inner workings of this very
standard problem for students of microeconomics, in order to glean
a full understanding of it. We will then go on to show how the very
same model can be re-applied to different settings, most importantly
for us, settings in which choices are made in risky environments.

B.2 Utility maximisation under certainty


In this appendix we shall look closely at the typical consumer maximisation problem under certainty:

max_x u(x)  subject to  p1x1 + p2x2 ≤ w  and  xi ≥ 0, i = 1, 2    (B.2)

where pi is the unit price of good i, and w is the consumer's fixed wealth. Note that the two non-negativity constraints can be expressed as −xi ≤ 0, so we have not strayed at all from the formulation discussed in the previous mathematical appendix. Now, we shall assume that the utility function for goods, u(x), is strictly increasing in both x1 and x2, and strictly concave in the vector x. We shall also assume that it is a continuous function,² with continuous derivatives, and so our assumptions on increasingness can be written more easily as ∂u(x)/∂xi > 0, i = 1, 2. Our assumption on concavity will be described, as was established in the previous appendix, by Jensen's inequality: for all x^1, x^2 and all λ ∈ (0, 1), u(λx^1 + (1 − λ)x^2) > λu(x^1) + (1 − λ)u(x^2).
¹ Of course, it is often the case that the cost scenario is modelled apart from the profit maximising one. But clearly the choice of efficient production arising from the cost minimisation problem is identical in nature to the utility maximisation problem of a consumer, and once efficient production has been established, the profit maximising choice can easily be established with the inclusion of a restriction amounting to efficient production.
² Note that this also implies that we are assuming, as is normal, that the two goods are perfectly divisible.
These assumptions imply that the indifference curves corresponding to u(x), drawn in (x1, x2) space, are decreasing and strictly convex, and that indifference curves located further from the origin correspond to greater levels of utility. From the implicit function theorem, the slope of the indifference curve passing through any given point x at that point is

∂x2/∂x1 |_{du(x)=0} = − (∂u(x)/∂x1) / (∂u(x)/∂x2)

This quantity is normally known as the marginal rate of substitution, or MRS(x) for short.
On the other hand, look at the budget constraint for the problem, p1x1 + p2x2 ≤ w. The expenditure function, g(p, x) = p1x1 + p2x2, is linear in both of the x variables, and so the expenditure function is linear in the x vector. Its contours, drawn in (x1, x2) space, holding g(p, x) constant, are straight lines. These lines are, of course, the exact analogy to indifference curves. Again, directly from the implicit function theorem, the slope of a contour of the expenditure function is

∂x2/∂x1 |_{dg(p,x)=0} = − (∂g(p, x)/∂x1) / (∂g(p, x)/∂x2) = − p1/p2

One of these contours is the one corresponding to g(p, x) = w, and this particular contour of the expenditure function is what is commonly known as the budget line in elementary microeconomics courses. The feasible set for our problem, as always viewed in (x1, x2) space, is simply the triangle formed by the two axes of the Cartesian coordinate graph and the budget line. This feasible set is commonly termed the budget set for the problem. Our problem is to determine the point x* located within the feasible set for which utility is maximised.
Of course, graphically, the solution is very simple to locate. We begin by noting that, whatever it is, the solution must saturate the budget constraint. That is, if we denote the solution by x*, then it must hold that p1x1* + p2x2* = w. If, contrary to this, the consumer were to select a point lying below the budget line, then his total expenditure would be less than w. Say the total expenditure were z < w, so that the unspent wealth is w − z > 0. But then this unspent wealth can be profitably allocated to the purchase of at least one of the two goods. Say it is all allocated to good 1; then the strictly positive additional amount (w − z)/p1 of good 1 can be purchased, and since utility is increasing in the consumption of good 1, adding this new quantity to the consumption bundle must increase utility. Thus, in our search for the optimal solution we need only consider points that lie on the budget line.
Next, note that unless the optimal point is at one of the extreme vertices of the budget set, it must correspond to a point of tangency between the budget line and an indifference curve. If this were not the case, then the budget line and the indifference curve would cut at the proposed point, which implies that some part of that indifference curve lies strictly within the budget set. In other words, the proposed point is indifferent to some other point for which not all wealth is spent. But since we have just shown that any point for which not all wealth is spent can be improved upon, no point that is indifferent to such a point can ever be optimal. Thus, outside of a corner solution, there must be a tangency between the budget line and an indifference curve, which is expressed as

(∂u(x*)/∂x1) / (∂u(x*)/∂x2) = p1/p2

This tangency condition can be thought of as one equation in


two unknowns, the two elements of the optimal x vector. The other
equation in the same two unknowns that we need is, of course, the
budget line p1 x1 + p2 x2 = w. The simultaneous solution to these two
equations will give a unique solution to the problem. If both of the
coordinates of the solution thus determined are non-negative, then we
are nished. However, it may turn out that one of the coordinates of
the solution of the two simultaneous equations is negative, in which
case clearly we have not found an optimal solution at all, since it falls
outside of the feasible set. However, in this case it is elementary to see
that the true solution to the problem is found by setting the negative
valued x coordinate to 0, and spending all wealth on the other good.
For example, if the simultaneous solution to the tangency condition
and the budget line gave us a point with x1 > 0 and x2 < 0, then the
true solution to the problem is $(x_1 = w/p_1, \ x_2 = 0)$. You should draw
a graph of a tangency that falls outside of the budget set and use it to convince yourself that the nearest corner to that tangency is the solution to the constrained problem.
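As a concrete illustration of the two-equation system just described, the short Python sketch below solves the tangency condition together with the budget line for an assumed Cobb-Douglas utility function u(x) = a ln x1 + (1 - a) ln x2; neither the functional form nor the numerical values come from the text, they are chosen purely to make the calculation explicit.

    # Tangency condition and budget line for u(x) = a*ln(x1) + (1 - a)*ln(x2):
    # MRS = (a/x1) / ((1 - a)/x2) = p1/p2, together with p1*x1 + p2*x2 = w,
    # gives the closed form x1 = a*w/p1 and x2 = (1 - a)*w/p2.

    a = 0.3                        # assumed preference parameter
    p1, p2, w = 2.0, 5.0, 100.0    # assumed prices and wealth

    x1 = a * w / p1                # a*w/p1 = 15
    x2 = (1 - a) * w / p2          # (1 - a)*w/p2 = 14

    mrs = (a / x1) / ((1 - a) / x2)
    print(abs(mrs - p1 / p2) < 1e-12)          # True: the tangency condition holds
    print(abs(p1 * x1 + p2 * x2 - w) < 1e-12)  # True: the budget line saturates

Note that for this particular utility function the tangency point always has both coordinates strictly positive, so the corner case discussed above never arises here.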
It is worthwhile to re-do the above problem using the Kuhn-
Tucker method that was discussed in the mathematical appendix. The
Lagrange function for the problem at hand is

$$\mathcal{L}(x, \lambda) = u(x) + \lambda_1 [x_1 - 0] + \lambda_2 [x_2 - 0] + \lambda_3 [w - p_1 x_1 - p_2 x_2]$$

where the $\lambda$ variables are the Lagrange multipliers. Denoting the solution vector by $x^*$, the first-order conditions are
$$\frac{\partial u(x^*)}{\partial x_i} + \lambda_i - \lambda_3 p_i = 0 \qquad i = 1, 2 \qquad (B.3)$$
and the complementary slackness conditions are

$$\lambda_i x_i^* = 0, \quad i = 1, 2; \qquad \lambda_3 [w - p_1 x_1^* - p_2 x_2^*] = 0 \qquad (B.4)$$
The first thing to notice is that again we can be sure that the budget constraint will saturate, $w = p_1 x_1^* + p_2 x_2^*$. To see why, note that if we can show that in any solution we always get $\lambda_3 > 0$, then directly from the third complementary slackness condition we would know that $w = p_1 x_1^* + p_2 x_2^*$. Given that, let's write the first-order conditions as:
$$\frac{\partial u(x^*)}{\partial x_i} + \lambda_i = \lambda_3 p_i \qquad i = 1, 2$$
and multiply this by $x_i^*$, so that from the complementary slackness conditions we can ignore the term $\lambda_i x_i^*$, and so we get
$$\frac{\partial u(x^*)}{\partial x_i} x_i^* = \lambda_3 p_i x_i^* \qquad i = 1, 2$$
Now, we sum these two equations to obtain

$$\sum_{i=1}^{2} \frac{\partial u(x^*)}{\partial x_i} x_i^* = \lambda_3 (p_1 x_1^* + p_2 x_2^*) \le \lambda_3 w$$
Thus, any solution must satisfy

$$\lambda_3 \ge \frac{1}{w} \sum_{i=1}^{2} \frac{\partial u(x^*)}{\partial x_i} x_i^* \qquad (B.5)$$
Now, the right-hand side of this inequality is necessarily positive, since given non-zero wealth at least one of the optimal quantities $x_i^*$ must be strictly positive (marginal utility is positive by assumption, and so any vector with at least one positive component will always give greater utility than the point $x_i = 0$ for $i = 1, 2$). Thus, we have indeed shown that, conditional upon the logical requisite that $w$ is strictly positive, $\lambda_3 > 0$ and so the budget constraint will always saturate. Again, it is important to realise that the budget constraint equality has not been assumed at the outset, but has been derived endogenously as we solve the problem. It is a direct consequence of the assumption of increasing utility.
It is worthwhile to notice that since all of the wealth will be spent,
we can in fact eliminate the inequality from equation (B.5), and so we
can calculate the exact value of the Lagrange multiplier corresponding
to the budget constraint as

$$\lambda_3 = \frac{1}{w} \sum_{i=1}^{2} \frac{\partial u(x^*)}{\partial x_i} x_i^* \qquad (B.6)$$
This argument implies that when we calculate the optimal vector, we can replace the third complementary slackness condition, $\lambda_3 [w - p_1 x_1^* - p_2 x_2^*] = 0$, with the equation $w = p_1 x_1^* + p_2 x_2^*$.
Now, since $\lambda_i \ge 0$, $i = 1, 2$, if the solution were to satisfy $x_i^* > 0$, $i = 1, 2$, then the complementary slackness conditions would indicate that $\lambda_i = 0$, $i = 1, 2$. In this case, the first-order conditions would be written as

$$\frac{\partial u(x^*)}{\partial x_i} = \lambda_3 p_i; \qquad i = 1, 2 \qquad (B.7)$$
Dividing the first equation of (B.7) by the second gives the tangency condition

$$\frac{\partial u(x^*)/\partial x_1}{\partial u(x^*)/\partial x_2} = \frac{p_1}{p_2} \qquad (B.8)$$
Of course, since the left-hand side of the tangency condition is the absolute value of the marginal rate of substitution, and the right-hand side is the absolute value of the slope of the budget line, we know that any interior solution must be a point of tangency between an indifference curve and the budget line.
Mathematically, for the case of an interior solution, the equations (B.8) and $p_1 x_1 + p_2 x_2 = w$ form a system of two simultaneous
equations in the two unknowns (x1 , x2 ), the solution of which is the
optimal vector for the problem.
If the solution is not interior, that is, if one of the two quantities $x_1$ or $x_2$ is equal to zero (recall that both cannot be zero with positive wealth, since the solution must lie on the budget line), then the solution will not in general be given by the tangency condition. These types of cases, known as corner solutions, can still be easily calculated from the tangency condition. If we denote by $\hat{x}$ the point that does satisfy (B.8), then the optimal vector (the point $x^*$ that simultaneously satisfies (B.3) and (B.4)) is found as

$$x^* = \begin{cases} \hat{x} & \text{if } \hat{x}_i \ge 0; \ i = 1, 2 \\ (x_i = 0, \ x_j = w/p_j) & \text{if } \hat{x}_i < 0; \ i, j = 1, 2; \ i \ne j \end{cases}$$
In all that follows, unless we specifically state otherwise, we shall
simply assume that the solution to the problem is interior, and so is
calculated directly from the tangency condition and the budget line.
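The corner-solution rule above can be sketched in a few lines of Python. The quasi-linear utility function u(x) = ln x1 + x2 used here is an assumption chosen only because it actually produces corner solutions at low wealth: the tangency condition gives x1 = p2/p1, and the budget line then gives x2 = w/p2 - 1, which is negative whenever w < p2.

    # Interior candidate from the tangency condition and the budget line for the
    # assumed utility u(x) = ln(x1) + x2, moving to the nearest corner whenever a
    # coordinate of the candidate is negative.

    def demand(p1, p2, w):
        x1, x2 = p2 / p1, w / p2 - 1.0   # candidate satisfying (B.8) and the budget line
        if x2 < 0:                       # corner: spend all wealth on good 1
            x1, x2 = w / p1, 0.0
        elif x1 < 0:                     # corner: spend all wealth on good 2
            x1, x2 = 0.0, w / p2         # (cannot occur for this utility function)
        return x1, x2

    print(demand(1.0, 2.0, 10.0))   # (2.0, 4.0): interior solution
    print(demand(1.0, 2.0, 1.0))    # (1.0, 0.0): corner solution, since w < p2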

Marshallian demand and indirect utility
The optimal vector is in effect a function of the price vector, $p = (p_1, p_2)$, and the consumer's wealth, $w$. We can write this as $x^* = (x_1(p, w), x_2(p, w))$. The functions $x_i(p, w)$ are known as the Marshallian demand curves for the two goods in question.
An important function that can be defined from the Marshallian demand curves is the indirect utility function, which is found by substituting the demand curves into the direct utility function; $u(x^*) = v(p, w)$. In order to differentiate between direct utility (the utility of goods), and indirect utility (a function of prices and wealth), it is customary to define the latter by $v$.
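For instance, under the Cobb-Douglas example assumed in the earlier sketch, both the Marshallian demands and the indirect utility function have simple closed forms; the following lines are again purely illustrative.

    import math

    # Marshallian demands and indirect utility for the assumed example
    # u(x) = a*ln(x1) + (1 - a)*ln(x2).

    def marshallian_demand(p1, p2, w, a=0.3):
        return a * w / p1, (1 - a) * w / p2

    def indirect_utility(p1, p2, w, a=0.3):
        x1, x2 = marshallian_demand(p1, p2, w, a)
        return a * math.log(x1) + (1 - a) * math.log(x2)   # v(p, w) = u(x*(p, w))

    print(marshallian_demand(2.0, 5.0, 100.0))   # approximately (15.0, 14.0)
    print(indirect_utility(2.0, 5.0, 100.0))     # approximately 2.66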
Since we are assuming a strictly interior solution, we have $\lambda_i = 0$, $i = 1, 2$, and so we can define $\lambda \equiv \lambda_3$. The three equations that determine the optimal values of the three unknowns, $x_1^*$, $x_2^*$ and $\lambda$, are

$$p_1 x_1^* + p_2 x_2^* = w; \qquad \frac{\partial u(x^*)}{\partial x_i} = \lambda p_i; \quad i = 1, 2 \qquad (B.9)$$
Consider the effect on the level of utility in the optimal solution of an increase in wealth. The derivative of $v(p, w) = u(x^*)$ with respect to $w$ is

$$\frac{\partial v(p, w)}{\partial w} = \frac{\partial u(x^*)}{\partial x_1} \frac{\partial x_1^*}{\partial w} + \frac{\partial u(x^*)}{\partial x_2} \frac{\partial x_2^*}{\partial w}$$

Using the first-order conditions (B.9), this can be written as

$$\frac{\partial v(p, w)}{\partial w} = \lambda \left( p_1 \frac{\partial x_1^*}{\partial w} + p_2 \frac{\partial x_2^*}{\partial w} \right)$$
However, in any optimal solution (i.e., both before and after the increase in $w$) we know that the budget constraint must saturate (B.9), and so we can differentiate this restriction with respect to $w$, which reveals the result

$$p_1 \frac{\partial x_1^*}{\partial w} + p_2 \frac{\partial x_2^*}{\partial w} = 1$$

Substituting this into the previous equation, it turns out that

$$\frac{\partial v(p, w)}{\partial w} = \lambda > 0$$
Note that this is exactly what was mentioned at the end of the
previous appendix, when optimisation was considered in general. How-
ever, now we can clearly refer to $\lambda$ as the marginal utility of wealth, an important concept in microeconomics. Since $\lambda$ is always strictly
positive, we know that an increase in wealth will always increase
utility.
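For instance, under the logarithmic Cobb-Douglas example assumed in the earlier sketches (a purely illustrative functional form), $v(p, w) = \ln w + a \ln(a/p_1) + (1 - a)\ln((1 - a)/p_2)$, so that $\partial v/\partial w = 1/w$; and equation (B.6) gives exactly $\lambda = \frac{1}{w}\left(a + (1 - a)\right) = 1/w$, in agreement with the result.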
Next, consider an increase in one of the prices, say $p_i$. Differentiating the indirect utility function, we get

$$\frac{\partial v(p, w)}{\partial p_i} = \frac{\partial u(x^*)}{\partial x_1} \frac{\partial x_1^*}{\partial p_i} + \frac{\partial u(x^*)}{\partial x_2} \frac{\partial x_2^*}{\partial p_i} \qquad i = 1, 2$$

Again, using the first-order conditions (B.9), we can write this as

$$\frac{\partial v(p, w)}{\partial p_i} = \lambda \left( p_1 \frac{\partial x_1^*}{\partial p_i} + p_2 \frac{\partial x_2^*}{\partial p_i} \right) \qquad i = 1, 2$$

Now, dierentiating the budget constraint with respect to pi reveals


the result
x x
xi + p1 1 + p2 2 = 0 i = 1, 2
pi pi
Substituting this into the previous equation, we get

v(p, w)
= xi < 0 i = 1, 2
pi

This is strictly negative since we are assuming a strictly interior solution, $x_i^* > 0$ for $i = 1, 2$. If we had a corner solution, this could, of course, be 0. The result indicates that, if the price of a good that is in positive demand rises, then utility is reduced, while utility is not affected by a marginal price rise of a good that is not demanded.
We can join the previous two results into a single equation:

$$\frac{\partial v(p, w)}{\partial p_i} = -x_i^* \frac{\partial v(p, w)}{\partial w} \quad \Longrightarrow \quad x_i^* = -\frac{\partial v(p, w)/\partial p_i}{\partial v(p, w)/\partial w}; \qquad i = 1, 2$$

This equation is known as Roy's Identity. It gives us a useful way to write the Marshallian demand functions.
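A quick numerical verification of Roy's identity (and of the two envelope results that produced it) for the assumed Cobb-Douglas example, using simple finite differences; the code is only a sanity check, not part of the argument.

    import math

    # Finite-difference check of dv/dw = lambda, dv/dp1 = -lambda*x1, and Roy's
    # identity, for the assumed example u(x) = a*ln(x1) + (1 - a)*ln(x2).

    def v(p1, p2, w, a=0.3):
        return a * math.log(a * w / p1) + (1 - a) * math.log((1 - a) * w / p2)

    a, p1, p2, w, h = 0.3, 2.0, 5.0, 100.0, 1e-5
    dv_dw  = (v(p1, p2, w + h) - v(p1, p2, w - h)) / (2 * h)
    dv_dp1 = (v(p1 + h, p2, w) - v(p1 - h, p2, w)) / (2 * h)

    x1 = a * w / p1                    # Marshallian demand for good 1 (= 15)
    print(round(dv_dw, 6))             # approximately 0.01 = lambda = 1/w
    print(round(dv_dp1, 6))            # approximately -0.15 = -lambda*x1
    print(round(-dv_dp1 / dv_dw, 4))   # approximately 15.0, recovering x1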
What about the convexity/concavity of $v(p, w)$ in prices and in wealth? Consider what happens when we look at the optimal choices under two different parameter sets. Denote by $x^i$ the optimal solution with prices $(p_1^i, p_2^i)$ and wealth $w^i$, for $i = 1, 2$. Further, denote by $x^3$ the optimal solution when prices are $(\alpha p_1^1 + (1 - \alpha)p_1^2, \ \alpha p_2^1 + (1 - \alpha)p_2^2) = (p_1^3, p_2^3)$, and wealth is $w^3 = \alpha w^1 + (1 - \alpha)w^2$, where, of course, $0 \le \alpha \le 1$. Now, note that it is impossible that $p_1^i x_1^3 + p_2^i x_2^3 > w^i$ for $i = 1, 2$ simultaneously (in words, the vector $x^3$ must be feasible under at least one of the original price vectors). To see why, simply think what would happen if indeed $x^3$ were not feasible under either of the original price vectors. We would have $\alpha p_1^1 x_1^3 + \alpha p_2^1 x_2^3 > \alpha w^1$ and $(1 - \alpha)p_1^2 x_1^3 + (1 - \alpha)p_2^2 x_2^3 > (1 - \alpha)w^2$. But summing these two inequalities indicates that

$$(\alpha p_1^1 + (1 - \alpha)p_1^2)x_1^3 + (\alpha p_2^1 + (1 - \alpha)p_2^2)x_2^3 > \alpha w^1 + (1 - \alpha)w^2 = w^3$$

That is, $p_1^3 x_1^3 + p_2^3 x_2^3 > w^3$, which is impossible since $x^3$ is the optimal vector under prices $p_1^3, p_2^3$ and wealth $w^3$ (and so it is necessarily feasible). Thus, it must be true that either $p_1^1 x_1^3 + p_2^1 x_2^3 \le w^1$, or $p_1^2 x_1^3 + p_2^2 x_2^3 \le w^2$, or both. In words, $x^3$ must be a feasible vector under at least one of the original parameter sets. But in turn, this implies that $x^i$ is weakly preferred to $x^3$ for at least one $i$, or

$$u(x^3) \le \max\{u(x^1), u(x^2)\}$$

Finally, from the definition of the indirect utility function, we now have

$$v(p^3, w^3) \le \max\{v(p^1, w^1), v(p^2, w^2)\}$$
That is, the indirect utility function is quasi-convex.
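This property can also be illustrated numerically. The sketch below checks, for the assumed Cobb-Douglas example and two arbitrary price-wealth pairs, that the indirect utility at any convex combination of the pairs never exceeds the larger of the two original values.

    import math

    # Quasi-convexity check: v at a convex combination of two (p, w) pairs is no
    # greater than the maximum of v at the original pairs. Assumed example:
    # u(x) = a*ln(x1) + (1 - a)*ln(x2) with a = 0.3.

    def v(p1, p2, w, a=0.3):
        return a * math.log(a * w / p1) + (1 - a) * math.log((1 - a) * w / p2)

    pair1 = (2.0, 5.0, 100.0)   # (p1, p2, w) under the first parameter set
    pair2 = (4.0, 1.0, 80.0)    # (p1, p2, w) under the second parameter set

    for k in range(11):
        t = k / 10.0
        mix = tuple(t * u1 + (1 - t) * u2 for u1, u2 in zip(pair1, pair2))
        assert v(*mix) <= max(v(*pair1), v(*pair2)) + 1e-12
    print("quasi-convexity holds at every tested combination")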
The quasi-convexity of the indirect utility function allows us to draw a nice picture that might help to clarify Roy's identity. If we keep fixed one of the prices, say $p_j$, then we can draw contours of $v(p, w)$ in the space defined by wealth and the other price (say $p_i$). Since the indirect utility function is increasing in $w$, decreasing in $p_i$, and quasi-convex, the implied graph is as is shown in Figure B.1. On the other hand, from the implicit function theorem, the slope of the contour at any given point is

$$\left.\frac{dw}{dp_i}\right|_{dv=0} = -\frac{\partial v(p,w)/\partial p_i}{\partial v(p,w)/\partial w}$$

But from Roy's identity, this is equal to $x_i$.
[Figure B.1  Roy's identity: a contour $v(p, w)$ = constant drawn in $(p_i, w)$ space, with slope $x_i$.]
The final result that we should look at here is also perhaps the
most important, at least for the subject matter of the main text;
the concavity of the indirect utility function in wealth. That utility
should be concave in wealth is an often assumed characteristic, and it
is most important to the economics of risk and uncertainty. It turns
out that it is true that indirect utility is concave in wealth, but only
conditional upon the direct utility function being concave in the vector
of goods. This might not seem to be a severe restriction, as indeed it
is very often assumed that $u(x)$ is concave in $x$, since among other things this implies that the indifference curves will be convex contours.
But as we have seen in the mathematical appendix, concavity of u(x)
is by no means necessary for convexity of indifference curves. What
is required is that utility be quasi-concave in the vector of goods,
a weaker requirement than strict concavity, and one that will not
necessarily generate concavity of indirect utility in wealth. However,
that said, it is still not too much of a compromise to assume strict
concavity of u(x), and so we shall.
The result can be proved as follows. Hold prices constant, and compare the optimal solution to the utility maximisation problem with two different levels of wealth, say $w^1$ and $w^2$. Call the solutions to these two problems, respectively, $x^1$ and $x^2$. Then, consider the utility maximisation problem with wealth equal to $\alpha w^1 + (1 - \alpha)w^2 = w^3$. Call the solution to that problem $x^3$. We know that $p_1 x_1^1 + p_2 x_2^1 = w^1$ and that $p_1 x_1^2 + p_2 x_2^2 = w^2$. Multiplying the first of these equations by $\alpha$ and the second by $(1 - \alpha)$, and summing them gives

$$\alpha(p_1 x_1^1 + p_2 x_2^1) + (1 - \alpha)(p_1 x_1^2 + p_2 x_2^2) = \alpha w^1 + (1 - \alpha)w^2 = w^3$$

Bringing together common terms,

$$p_1[\alpha x_1^1 + (1 - \alpha)x_1^2] + p_2[\alpha x_2^1 + (1 - \alpha)x_2^2] = w^3$$

However, this implies that the vector $\alpha x^1 + (1 - \alpha)x^2$ is feasible (but not necessarily optimal) when wealth is $w^3$. Thus, it must be true that $u(x^3) \ge u(\alpha x^1 + (1 - \alpha)x^2)$. Finally, then, if utility is strictly concave (as we have assumed), we have

$$\alpha u(x^1) + (1 - \alpha)u(x^2) < u(\alpha x^1 + (1 - \alpha)x^2) \le u(x^3)$$

But since, by definition, $u(x^i) = v(p, w^i)$, this reads

$$\alpha v(p, w^1) + (1 - \alpha)v(p, w^2) < v(p, w^3)$$
That is, the indirect utility function is concave in wealth.
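As a final numerical illustration, take the strictly concave utility function $u(x) = \sqrt{x_1} + \sqrt{x_2}$ (an assumption, not taken from the text), for which the indirect utility works out to $v(p, w) = \sqrt{w(1/p_1 + 1/p_2)}$. The sketch below checks the defining inequality of concavity in wealth directly.

    import math

    # Concavity-in-wealth check for the assumed strictly concave utility
    # u(x) = sqrt(x1) + sqrt(x2), whose indirect utility is
    # v(p, w) = sqrt(w * (1/p1 + 1/p2)).

    def v(p1, p2, w):
        return math.sqrt(w * (1.0 / p1 + 1.0 / p2))

    p1, p2 = 2.0, 5.0
    w1, w2, t = 50.0, 150.0, 0.4
    w3 = t * w1 + (1 - t) * w2

    average_of_v = t * v(p1, p2, w1) + (1 - t) * v(p1, p2, w2)
    print(average_of_v, v(p1, p2, w3))   # about 8.51 versus about 8.78
    assert average_of_v < v(p1, p2, w3)  # v at the averaged wealth is strictly larger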


Index

Acceptance set, 49
  and demand for insurance, 75
Adverse selection, 140, 142
  competitive principal, 154
  monopolistic principal, 162
Akerlof, George, 142
Allais, Maurice
  Allais paradox, 24
Arrow, Ken, 41
Asymmetric information, 138
Bernoulli, Daniel, 18
Certainty equivalent wealth, 57
Certainty line, 42
Concavity
  of functions, 195
Constant proportional risk sharing, 123
Contingent claims graph, 40
Contract, 3, 124, 140
  and adverse selection, 149
  and moral hazard, 174
  and Nash equilibrium, 141
  insurance, 74
  menu, 149
Contract curve, 8, 116
Convex combination, 195
Convex set, 199
Debreu, Gerard, 41
de Finetti, Bruno, 206
Edgeworth box, 115, 139
  contract curve, 116
Ellsberg paradox, 26
  ambiguity aversion, 27
Expected utility
  theory, axiomatic proof of, 20
First order stochastic dominance, 21, 35
Implicit function theorem, 12, 36, 43, 44, 55, 73, 84, 89, 100, 123, 166, 168, 175, 192, 197, 202, 211, 218
Incentive compatibility conditions, 153, 175
Independence of irrelevant alternatives, 23
Information
  asymmetric, 139
  imperfect, 139
  imperfect and symmetric, 139
  perfect, 139
Insurance
  competitive insurer, 78
  demand for, 74
  marginally loaded premium, 81
  monopolistic insurer, 79
Jensen's inequality, 19, 93, 196
Knight, Frank, 15
Kuhn, Harold, 203
Kuhn-Tucker optimisation, 201
Lagrange, Joseph-Louis, 203
Lagrange multipliers, 203
Lagrangean, 71, 80, 81, 132, 167, 168, 203, 205
Machina, Mark, 33
Marschak, Jacob, 33
Marshallian demand, 215
Moral hazard, 140, 173
  competitive principal, 180
  monopolistic principal, 184
Morgenstern, Oskar, 19
mutuality principle, 131
Nash equilibrium, 141
Newsboy problem, 102
Participation conditions, 153
Pooling equilibrium, 150
Portfolio choice, 69
Preferences, 20
  in the contingent claims graph, 44
  in the Marschak-Machina triangle, 35
Principal-agent model, 140
  and adverse selection, 148
  and moral hazard, 173
Probability
  and lotteries, 206
Production, 96
Prospect theory, 28
  loss aversion, 28
Prudence, 64
  and precautionary savings, 93
Quasi-concavity, 199
Quasi-convexity, 199
Risk, 15
  aggregate, 115
  sharing, 115
Risk aversion, 33
  absolute risk aversion, Arrow-Pratt measure, 52
  and comparative statics of insurance, 85
  constant relative, 127
  in the contingent claims graph, 45
  in the Marschak-Machina triangle, 39
  measures, 49
  relative risk aversion, Arrow-Pratt measure, 56
  slope of, 63
Risk premium, 58
  Arrow-Pratt approx., 60
Roy's identity, 217
Samuelson, Paul, 30
Savage, Leonard, 206
Savings, 87
  under certainty, 88
  under risk, 91
Separating equilibrium, 150
Shadow prices, 206
Signal
  imperfect, 142
  perfect, 141
Signalling, 144
Spence, Michael, 144
St. Petersburg paradox, 17
State of nature, definition, 41
Tucker, Albert, 203
Uncertainty, 15
Utility maximisation
  under certainty, 210
von Neumann, John, 19