Figure 1. Constrained Maximum with an Equality Constraint.
be tangent. Since the gradient vectors are always perpendicular to the tangent line, they must
be colinear. Algebraically, this means that there are coefficients $\lambda$ and $\mu$ (multipliers, if you
will), not both zero, satisfying
\[ \lambda \nabla f(x^*) + \mu \nabla g(x^*) = 0. \]
In general, this is all that can be said. But if the gradient $\nabla g$ is nonzero, then, as we shall see, the
multiplier on $\nabla f$ can be taken to be unity, and we get the more familiar condition $\nabla f + \lambda \nabla g = 0$.
Note that this does not imply that $\lambda$ itself is nonzero, since $\nabla f$ may be zero itself. Also note
that in general we cannot say anything about the sign of $\lambda$. That is, there is nothing to tell
us if $\nabla g$ points in the same direction as $\nabla f$, or the opposite direction. This changes when we
have an inequality constraint. If there is a local maximum of $f$ subject to $g(x) \geq \alpha$, then the
gradient of $g$ points into $[g > \alpha]$, and the gradient of $f$ points out. See Figure 2. This means
that we can take $\lambda \geq 0$. Even if $[g > \alpha]$ is empty, then $\nabla g = 0$ (why?), so we can take $\lambda \geq 0$.
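The tangency condition can be checked numerically. Here is a small Python sketch with an illustrative problem (not one taken from the text): maximizing $f(x, y) = xy$ on the line $x + y = 2$ gives the maximizer $(1, 1)$, where the two gradients are colinear.

```python
import numpy as np

# Maximize f(x, y) = x*y subject to g(x, y) = x + y - 2 = 0.
# The maximizer is (1, 1); there the gradients must be colinear.

def grad_f(x, y):
    return np.array([y, x])          # grad f = (y, x)

def grad_g(x, y):
    return np.array([1.0, 1.0])      # grad g = (1, 1)

gf = grad_f(1.0, 1.0)                # (1, 1)
gg = grad_g(1.0, 1.0)                # (1, 1)

# Colinearity: the matrix [grad f, grad g] is singular,
# and grad f + lam * grad g = 0 holds with lam = -1.
det = np.linalg.det(np.column_stack([gf, gg]))
residual = gf + (-1.0) * gg
```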
KC Border

Figure 2. Constrained maximum with an inequality constraint.
\[ \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0. \]
The next result provides a different version of the Lagrange Multiplier Theorem that
includes the first as a special case. A proof may be found in Carathéodory [3, Theorem 11.1,
pp. 175–177].

3 Lagrange Multiplier Theorem II Let $X \subset \mathbf{R}^n$, and let $f, g_1, \dots, g_m : X \to \mathbf{R}$ be continuous. Let $x^*$ be an interior constrained local maximizer of $f$ subject to $g(x) = 0$. Suppose $f$,
$g_1, \dots, g_m$ are differentiable at $x^*$.
Then there exist real numbers $\lambda, \lambda_1, \dots, \lambda_m$, not all zero, such that
\[ \lambda \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0. \]
v. 2015.11.02::14.01
4 Example (Multipliers are zero) The Lagrange Multiplier Theorem does not guarantee
that all the multipliers on the constraints will be nonzero. In fact the multipliers on the
constraints may all be zero. For instance, consider the constrained maximum of
\[ f(x, y) := -(x^2 + y^2) \]
subject to the single constraint
\[ g(x, y) := y = 0. \]
Observe that $\nabla g(x, y) = (0, 1) \neq 0$, so the gradient is linearly independent. The point $(0, 0)$ is a
constrained maximizer of $f$, but $\nabla f(x, y) = (-2x, -2y)$ is equal to zero at $(0, 0)$. Thus the only
way to solve $\nabla f(0, 0) + \lambda \nabla g(0, 0) = 0$ is to set $\lambda = 0$.
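The example can be verified mechanically. In this Python sketch the least-squares solve is just one convenient way to recover the unique multiplier:

```python
import numpy as np

# Example 4: f(x, y) = -(x**2 + y**2) and g(x, y) = y.
# At the constrained maximizer (0, 0), grad f vanishes, forcing lam = 0.
def grad_f(x, y):
    return np.array([-2.0 * x, -2.0 * y])

grad_g = np.array([0.0, 1.0])          # grad g = (0, 1) everywhere

gf = grad_f(0.0, 0.0)                  # (0, 0)

# Solve gf + lam * grad_g = 0 in the least-squares sense; the solution is lam = 0.
lam, *_ = np.linalg.lstsq(grad_g.reshape(2, 1), -gf, rcond=None)
```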
5 Example (Dependent constraint gradients) If you are like me, you may be tempted
to think that if the gradients of the constraints are linearly dependent, then one of them may
be redundant. This is not true. Consider the constrained maximum of
\[ f(x, y) := x \]
subject to the two constraints
\[ g_1(x, y) := y - x^2 = 0, \qquad g_2(x, y) := y + x^2 = 0. \]
It is easy to see that $(0, 0)$ is the only point satisfying both constraints, and
\[ \nabla g_1(0, 0) = (0, 1) = \nabla g_2(0, 0). \]
Thus the gradients of the constraints are dependent at the maximizer. Since $\nabla f = (1, 0)$, there
is no solution to $\nabla f(0, 0) + \lambda_1 \nabla g_1(0, 0) + \lambda_2 \nabla g_2(0, 0) = 0$. There is however a nonzero solution to
$\lambda_0 \nabla f(0, 0) + \lambda_1 \nabla g_1(0, 0) + \lambda_2 \nabla g_2(0, 0) = 0$, namely $\lambda_0 = 0$, $\lambda_1 = 1$, and $\lambda_2 = -1$.
Notice that neither constraint is redundant, since if one of them is dropped, there are no
constrained maxima.
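A short numerical check of the claims in this example (Python sketch):

```python
import numpy as np

# Example 5 at (0, 0): f(x, y) = x, g1 = y - x**2, g2 = y + x**2.
gf  = np.array([1.0, 0.0])    # grad f(0, 0)
gg1 = np.array([0.0, 1.0])    # grad g1(0, 0) = (-2x, 1) at x = 0
gg2 = np.array([0.0, 1.0])    # grad g2(0, 0) = ( 2x, 1) at x = 0

# The constraint gradients span only a one-dimensional space:
rank = np.linalg.matrix_rank(np.column_stack([gg1, gg2]))

# gf + l1*gg1 + l2*gg2 = 0 is unsolvable (first coordinate is always 1),
# but the version with a multiplier on f has the nonzero solution (0, 1, -1):
combo = 0.0 * gf + 1.0 * gg1 + (-1.0) * gg2
```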
\[ \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0. \]
Then
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \Bigl( D_{ij} f(x^*) + \sum_{k=1}^{m} \lambda_k D_{ij} g_k(x^*) \Bigr) v_i v_j \leq 0 \]
for every $v$ satisfying $\nabla g_i(x^*) \cdot v = 0$, $i = 1, \dots, m$.
Constrained Minimization

Since minimizing $f$ is the same as maximizing $-f$, we do not need any new results for minimization, but there are a few things worth pointing out.
The Lagrangean for maximizing $-f$ subject to $g_i = 0$, $i = 1, \dots, m$, is
\[ -f(x) + \sum_{i=1}^{m} \lambda_i g_i(x), \]
so the second order necessary condition reads
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \Bigl( -D_{ij} f(x^*) + \sum_{k=1}^{m} \lambda_k D_{ij} g_k(x^*) \Bigr) v_i v_j \leq 0, \]
or equivalently
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \Bigl( D_{ij} f(x^*) - \sum_{k=1}^{m} \lambda_k D_{ij} g_k(x^*) \Bigr) v_i v_j \geq 0, \]
which suggests that it is more convenient to define the Lagrangean for a minimization problem
as
\[ L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x). \]
The first order conditions will be exactly the same. For the second order conditions we have
the following.

7 Theorem (Necessary Second Order Conditions for a Minimum) Let $U \subset \mathbf{R}^n$ and
let $x^* \in \operatorname{int} U$. Let $f, g_1, \dots, g_m : U \to \mathbf{R}$ be $C^2$, and suppose $x^*$ is a local constrained minimizer
of $f$ subject to $g(x) = 0$. Define the Lagrangean
\[ L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x), \]
and suppose $\lambda_1, \dots, \lambda_m$ satisfy the first order conditions
\[ \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0. \]
Then
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} D_{ij} L(x^*, \lambda)\, v_i v_j \geq 0 \]
for every $v$ satisfying $\nabla g_i(x^*) \cdot v = 0$, $i = 1, \dots, m$.
\[ \frac{\partial V(p)}{\partial p_i} = \frac{\partial f}{\partial p_i}\bigg|_{x = x^*(p)}. \]
Proof: By definition $V(p) \geq f(x, p)$ for all $x, p$, and $V(p) = f(x^*(p), p)$. Thus the function
\[ h(p, x) = V(p) - f(x, p) \]
achieves a minimum whenever $(p, x) = \bigl(p, x^*(p)\bigr)$. The first order conditions for a minimum of
$h$ are:
\[ D_i h(p, x) = 0, \qquad i = 1, \dots, \ell + n. \]
But $D_i h(p, x) = D_i V(p) - D_{n+i} f(x, p)$ for $i = 1, \dots, \ell$.
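To see the envelope formula in action, here is a tiny numerical sketch (the quadratic objective is illustrative, not from the text): with $f(x, p) = px - x^2$ we get $x^*(p) = p/2$ and $V(p) = p^2/4$, so $V'(p)$ should equal $\partial f / \partial p = x$ evaluated at $x^*(p)$.

```python
# Envelope theorem check for f(x, p) = p*x - x**2, maximized over x.
def V(p):
    return p * p / 4.0            # value function: V(p) = p**2/4 since x*(p) = p/2

p, h = 1.5, 1e-6
dV = (V(p + h) - V(p - h)) / (2 * h)   # central difference for V'(p)

x_star = p / 2.0
df_dp = x_star                          # partial f / partial p = x, at x = x*(p)
```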
9 Theorem (Envelope Theorem for Constrained Maximization) Let $X \subset \mathbf{R}^n$ and
$P \subset \mathbf{R}^\ell$ be open, and let $f, g_1, \dots, g_m : X \times P \to \mathbf{R}$ be $C^1$. For each $p \in P$, let $x^*(p)$ be an
interior constrained local maximizer of $f(x, p)$ subject to $g(x, p) = 0$. Define the Lagrangean
\[ L(x, \lambda; p) = f(x, p) + \sum_{i=1}^{m} \lambda_i g_i(x, p), \]
and assume that the conclusion of the Lagrange Multiplier Theorem holds for each $p$, that is,
there exist real numbers $\lambda_1(p), \dots, \lambda_m(p)$, such that the first order conditions
\[ \frac{\partial L\bigl(x^*(p), \lambda(p), p\bigr)}{\partial x} = f_x\bigl(x^*(p), p\bigr) + \sum_{i=1}^{m} \lambda_i(p)\, g_{i\,x}\bigl(x^*(p), p\bigr) = 0 \]
are satisfied. Define $V(p) = f\bigl(x^*(p), p\bigr)$.
Then
\[ \frac{\partial V(p)}{\partial p_j} = \frac{\partial L\bigl(x^*(p), \lambda(p), p\bigr)}{\partial p_j} = \frac{\partial f(x^*, p)}{\partial p_j} + \sum_{i=1}^{m} \lambda_i(p)\, \frac{\partial g_i(x^*, p)}{\partial p_j}. \]
Proof: Since $g_i\bigl(x^*(p), p\bigr) = 0$ for each $i$ and $p$, we have
\[ V(p) = f\bigl(x^*(p), p\bigr) = f\bigl(x^*(p), p\bigr) + \sum_{i=1}^{m} \lambda_i(p)\, g_i\bigl(x^*(p), p\bigr). \]
Differentiating with respect to $p_j$ gives
\[ \frac{\partial V(p)}{\partial p_j} = \sum_{k=1}^{n} \frac{\partial f(x^*, p)}{\partial x_k} \frac{\partial x^*_k}{\partial p_j} + \frac{\partial f(x^*, p)}{\partial p_j} + \sum_{i=1}^{m} \left\{ \frac{\partial \lambda_i(p)}{\partial p_j}\, g_i(x^*, p) + \lambda_i(p) \left[ \sum_{k=1}^{n} \frac{\partial g_i(x^*, p)}{\partial x_k} \frac{\partial x^*_k}{\partial p_j} + \frac{\partial g_i(x^*, p)}{\partial p_j} \right] \right\} \]
\[ = \frac{\partial f(x^*, p)}{\partial p_j} + \sum_{i=1}^{m} \lambda_i(p) \frac{\partial g_i(x^*, p)}{\partial p_j} + \underbrace{\sum_{i=1}^{m} \frac{\partial \lambda_i(p)}{\partial p_j}\, g_i(x^*, p)}_{(1)} + \underbrace{\sum_{k=1}^{n} \left( \frac{\partial f(x^*, p)}{\partial x_k} + \sum_{i=1}^{m} \lambda_i(p) \frac{\partial g_i(x^*, p)}{\partial x_k} \right) \frac{\partial x^*_k}{\partial p_j}}_{(2)}. \]
The theorem now follows from the fact that both terms (1) and (2) are zero. Term (1) is zero
since $x^*$ satisfies the constraints, and term (2) is zero since the first order conditions imply
that each
\[ \frac{\partial f(x^*, p)}{\partial x_k} + \sum_{i=1}^{m} \lambda_i(p) \frac{\partial g_i(x^*, p)}{\partial x_k} = 0. \]
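A numerical sketch of the constrained envelope formula (the quadratic problem is illustrative, not from the text): maximize $f(x, p) = -(x_1^2 + x_2^2)$ subject to $x_1 + x_2 - p = 0$; then $x^*(p) = (p/2, p/2)$, $V(p) = -p^2/2$, and the multiplier from the first order conditions is $\lambda(p) = p$.

```python
# Constrained envelope check: V'(p) should equal the p-derivative of the
# Lagrangean, here df/dp + lam * dg/dp with g(x, p) = x1 + x2 - p.
def V(p):
    return -p * p / 2.0

p, h = 2.0, 1e-6
dV = (V(p + h) - V(p - h)) / (2 * h)   # approximately -p

lam = p                                 # from the FOC: -2*(p/2) + lam = 0
dL_dp = 0.0 + lam * (-1.0)             # df/dp + lam * dg/dp, both at x*(p)
```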
10 Theorem (Envelope Theorem for Minimization) Let $X \subset \mathbf{R}^n$ and $P \subset \mathbf{R}^\ell$ be open,
and let $f, g_1, \dots, g_m : X \times P \to \mathbf{R}$ be $C^1$. For each $p \in P$, let $x^*(p)$ be an interior constrained
local minimizer of $f(x, p)$ subject to $g(x, p) = 0$. Define the Lagrangean
\[ L(x, \lambda; p) = f(x, p) - \sum_{i=1}^{m} \lambda_i g_i(x, p), \]
and assume that the conclusion of the Lagrange Multiplier Theorem holds for each $p$, that is,
there exist real numbers $\lambda_1(p), \dots, \lambda_m(p)$, such that the first order conditions
\[ \frac{\partial L\bigl(x^*(p), \lambda(p), p\bigr)}{\partial x} = f_x\bigl(x^*(p), p\bigr) - \sum_{i=1}^{m} \lambda_i(p)\, g_{i\,x}\bigl(x^*(p), p\bigr) = 0 \]
are satisfied. Define $V(p) = f\bigl(x^*(p), p\bigr)$. Then
\[ \frac{\partial V(p)}{\partial p_j} = \frac{\partial L\bigl(x^*(p), \lambda(p), p\bigr)}{\partial p_j} = \frac{\partial f(x^*, p)}{\partial p_j} - \sum_{i=1}^{m} \lambda_i(p) \frac{\partial g_i(x^*, p)}{\partial p_j}. \]
Inequality constraints
The classical Lagrange Multiplier Theorem deals only with equality constraints. Now we take
up inequality constraints. I learned this approach from Quirk [10].
11 Theorem Let $U \subset \mathbf{R}^n$ be open, and let $f, g_1, \dots, g_m : U \to \mathbf{R}$ be twice continuously
differentiable on $U$. Let $x^*$ be a constrained local maximizer of $f$ subject to $g(x) \geq 0$ and
$x \geq 0$. Let $B = \{i : g_i(x^*) = 0\}$, the set of binding constraints, and let $Z = \{j : x^*_j = 0\}$,
the set of binding nonnegativity constraints. Assume that $\{\nabla g_i(x^*) : i \in B\} \cup \{e_j : j \in Z\}$
is linearly independent. Then there exists $\lambda \in \mathbf{R}^m$ such that
\[ \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \leq 0, \tag{3} \]
\[ x^* \cdot \Bigl( \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0, \tag{4} \]
\[ \lambda \geq 0, \tag{5} \]
\[ \lambda \cdot g(x^*) = 0. \tag{6} \]
Proof: Introduce $m + n$ slack variables $y_1, \dots, y_m$ and $z_1, \dots, z_n$, and set
\[ y_i = \sqrt{g_i(x^*)}, \tag{7} \]
\[ z_j = \sqrt{x^*_j}. \tag{8} \]
Define $\tilde f(x, y, z) = f(x)$, $\tilde g_i(x, y, z) = g_i(x) - y_i^2$, and $\tilde h_j(x, y, z) = x_j - z_j^2$. To see
that the constraint gradients of the revised equality problem are linearly independent at
$(x^*, y^*, z^*)$, suppose
\[ \sum_{i=1}^{m} \mu_i \nabla \tilde g_i + \sum_{j=1}^{n} \nu_j \nabla \tilde h_j = 0. \]
Now
\[ \nabla \tilde g_i(x^*, y^*, z^*) = \begin{bmatrix} \nabla g_i(x^*) \\ -2 y^*_i e_i \\ 0 \end{bmatrix} \qquad \text{and} \qquad \nabla \tilde h_j(x^*, y^*, z^*) = \begin{bmatrix} e_j \\ 0 \\ -2 z^*_j e_j \end{bmatrix}. \tag{9} \]
Examining the $y$ and $z$ components shows that $\mu_i = 0$ for $i \notin B$ and $\nu_j = 0$ for $j \notin Z$, so
\[ \sum_{i \in B} \mu_i \nabla g_i(x^*) + \sum_{j \in Z} \nu_j e_j = 0, \]
and the assumed linear independence forces the remaining $\mu_i$ and $\nu_j$ to be zero as well.
So form the Lagrangean
\[ \tilde L(x, y, z; \lambda, \mu) = \tilde f(x, y, z) + \sum_{i=1}^{m} \lambda_i \tilde g_i(x, y, z) + \sum_{j=1}^{n} \mu_j \tilde h_j(x, y, z) = f(x) + \sum_{i=1}^{m} \lambda_i \bigl( g_i(x) - y_i^2 \bigr) + \sum_{j=1}^{n} \mu_j \bigl( x_j - z_j^2 \bigr). \]
The first order conditions are
\[ D_j f(x^*) + \sum_{i=1}^{m} \lambda_i D_j g_i(x^*) + \mu_j = 0, \qquad j = 1, \dots, n, \tag{10} \]
\[ -2 \lambda_i y^*_i = 0, \qquad i = 1, \dots, m, \tag{11} \]
\[ -2 \mu_j z^*_j = 0, \qquad j = 1, \dots, n. \tag{12} \]
The Hessian of $\tilde L(x^*, y^*, z^*; \lambda, \mu)$ with respect to $(x, y, z)$ is the block diagonal matrix
\[ \operatorname{diag}\Bigl( \Bigl[ \tfrac{\partial^2 L}{\partial x_i \partial x_j} \Bigr]_{i,j=1}^{n},\; \operatorname{diag}(-2\lambda_1, \dots, -2\lambda_m),\; \operatorname{diag}(-2\mu_1, \dots, -2\mu_n) \Bigr). \]
From the second order conditions (Theorem 6) for the revised equality constrained problem, we know that this Hessian is negative semidefinite under constraint. That is, if a vector
is orthogonal to the gradients of the constraints, then the quadratic form in the Hessian is
nonpositive. In particular, consider a vector of the form $v = (0, e_k, 0) \in \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^n$. It
follows from (9) that for $i \neq k$ this vector $v$ is orthogonal to $\nabla \tilde g_i(x^*, y^*, z^*)$, and also orthogonal
to each $\nabla \tilde h_j(x^*, y^*, z^*)$, $j = 1, \dots, n$. The vector $v$ is orthogonal to $\nabla \tilde g_k(x^*, y^*, z^*)$ if and only if $y^*_k = 0$,
that is, when $k \in B$. Thus for $k \in B$ the second order conditions imply $-2\lambda_k \leq 0$, so
\[ g_i(x^*) = 0 \implies \lambda_i \geq 0. \]
Next consider a vector of the form $u = (0, 0, e_k) \in \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^n$. It follows from (9) that
this vector $u$ is orthogonal to each $\nabla \tilde g_i(x^*, y^*, z^*)$ and each $\nabla \tilde h_j(x^*, y^*, z^*)$ for $j \neq k$. The vector $u$
is orthogonal to $\nabla \tilde h_k(x^*, y^*, z^*)$ if and only if $z^*_k = 0$, that is, $k \in Z$. Again the second order
conditions imply that the quadratic form in $u$, which has value $-2\mu_k$, is nonpositive for $k \in Z$,
so
\[ x^*_j = 0 \implies \mu_j \geq 0. \]
Now if $i \notin B$, that is, $g_i(x^*) > 0$, so that $y^*_i > 0$, then from the first order condition (11) we
have $\lambda_i = 0$. Also, from (12), if $x^*_j > 0$, so that $z^*_j > 0$, then $\mu_j = 0$. That is,
\[ g_i(x^*) > 0 \implies \lambda_i = 0 \qquad \text{and} \qquad x^*_j > 0 \implies \mu_j = 0. \]
Combining this with the paragraph above we see that $\lambda \geq 0$ and $\mu \geq 0$. Thus (10) implies
conclusion (3). A little more thought will show you that we have just deduced conditions (4)
through (6) as well.
There is a simple variation on the slack variable approach that applies to mixed inequality
and equality constraints. To prove the next result, simply omit the slack variables for the
equality constraints and follow the same proof as in Theorem 11.
12 Corollary Let $U \subset \mathbf{R}^n$ be open, and let $f, g_1, \dots, g_m : U \to \mathbf{R}$ be twice continuously
differentiable on $U$. Let $x^*$ be a constrained local maximizer of $f$ subject to
\[ g_i(x) = 0, \quad i \in E, \qquad g_i(x) \geq 0, \quad i \in E^c, \qquad x_j \geq 0, \quad j \in N. \]
Let $B = \{i \in E^c : g_i(x^*) = 0\}$ (binding inequality constraints), and let $Z = \{j \in N : x^*_j = 0\}$
(binding nonnegativity constraints). Assume that
\[ \{\nabla g_i(x^*) : i \in E \cup B\} \cup \{e_j : j \in Z\} \text{ is linearly independent,} \]
then there exists $\lambda \in \mathbf{R}^m$ with $\lambda_i \geq 0$ for $i \in E^c$, such that
\[ x^* \cdot \Bigl( \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0 \qquad \text{and} \qquad \lambda \cdot g(x^*) = 0. \]
We now translate the result for minimization.
13 Theorem (Minimization) Let $U \subset \mathbf{R}^n$ be open, and let $f, g_1, \dots, g_m : U \to \mathbf{R}$ be
twice continuously differentiable on $U$. Let $x^*$ be a constrained local minimizer of $f$ subject to
$g(x) \geq 0$ and $x \geq 0$.
Let $B = \{i : g_i(x^*) = 0\}$, the set of binding constraints, and let $Z = \{j : x^*_j = 0\}$, the set
of binding nonnegativity constraints. Assume that $\{\nabla g_i(x^*) : i \in B\} \cup \{e_j : j \in Z\}$ is linearly
independent. Then there exists $\lambda \in \mathbf{R}^m$ such that
\[ \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \geq 0, \tag{13} \]
\[ x^* \cdot \Bigl( \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0, \tag{14} \]
\[ \lambda \geq 0, \tag{15} \]
\[ \lambda \cdot g(x^*) = 0. \tag{16} \]
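A concrete check of conditions (13)-(16) on a hand-worked example (not taken from the text): minimize $x_1 + x_2$ subject to $g(x) = x_1 x_2 - 1 \geq 0$ and $x \geq 0$; the minimizer is $x^* = (1, 1)$ with $\lambda = 1$.

```python
import numpy as np

# minimize x1 + x2  s.t.  g(x) = x1*x2 - 1 >= 0, x >= 0; x* = (1, 1), lam = 1.
x_star = np.array([1.0, 1.0])
lam = 1.0

grad_f = np.array([1.0, 1.0])
grad_g = np.array([x_star[1], x_star[0]])   # grad g = (x2, x1)
g_val = x_star[0] * x_star[1] - 1.0

stat = grad_f - lam * grad_g                # condition (13): should be >= 0
comp_x = x_star @ stat                      # condition (14): should be 0
comp_g = lam * g_val                        # condition (16): should be 0
```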
Proof: As in the proof of Theorem 11, introduce $m + n$ slack variables $y_1, \dots, y_m$ and $z_1, \dots, z_n$,
and define $\tilde f(x, y, z) = f(x)$, $\tilde g_i(x, y, z) = g_i(x) - y_i^2$, $i = 1, \dots, m$, and $\tilde h_j(x, y, z) = x_j - z_j^2$,
$j = 1, \dots, n$. Observe that $(x^*, y^*, z^*)$, where $y^*_i = \sqrt{g_i(x^*)}$ and $z^*_j = \sqrt{x^*_j}$, minimizes
$\tilde f(x, y, z)$ subject to $\tilde g_i(x, y, z) = 0$, $i = 1, \dots, m$, and $\tilde h_j(x, y, z) = 0$, $j = 1, \dots, n$.
The proof of the linear independence of the constraint gradients of the revised equality
problem is the same as in Theorem 11.
So form the Lagrangean
\[ \tilde L(x, y, z; \lambda, \mu) = f(x) - \sum_{i=1}^{m} \lambda_i \bigl( g_i(x) - y_i^2 \bigr) - \sum_{j=1}^{n} \mu_j \bigl( x_j - z_j^2 \bigr). \]
The first order conditions are
\[ D_j f(x^*) - \sum_{i=1}^{m} \lambda_i D_j g_i(x^*) - \mu_j = 0, \qquad j = 1, \dots, n, \tag{17} \]
\[ 2 \lambda_i y^*_i = 0, \qquad i = 1, \dots, m, \tag{18} \]
\[ 2 \mu_j z^*_j = 0, \qquad j = 1, \dots, n. \tag{19} \]
The Hessian of $\tilde L(x^*, y^*, z^*; \lambda, \mu)$ with respect to $(x, y, z)$ is the block diagonal matrix
\[ \operatorname{diag}\Bigl( \Bigl[ \tfrac{\partial^2 L}{\partial x_i \partial x_j} \Bigr]_{i,j=1}^{n},\; \operatorname{diag}(2\lambda_1, \dots, 2\lambda_m),\; \operatorname{diag}(2\mu_1, \dots, 2\mu_n) \Bigr). \]
From the second order conditions for minimization (Theorem 7) for the revised equality constrained problem, we know that this Hessian is positive semidefinite under constraint. In
particular, as in the proof of Theorem 11, we have that
\[ g_i(x^*) = 0 \implies \lambda_i \geq 0 \qquad \text{and} \qquad x^*_j = 0 \implies \mu_j \geq 0. \]
From the first order conditions, if $i \notin B$, that is, $g_i(x^*) > 0$, so that $y^*_i > 0$, then $\lambda_i = 0$.
Also, if $x^*_j > 0$, so that $z^*_j > 0$, then $\mu_j = 0$. That is,
\[ g_i(x^*) > 0 \implies \lambda_i = 0 \qquad \text{and} \qquad x^*_j > 0 \implies \mu_j = 0. \]
Combining this with the paragraph above we see that $\lambda \geq 0$ and $\mu \geq 0$. Thus (17) implies
conclusion (13). A little more thought will show you that we have just deduced conditions (14)
through (16) as well.
14 Corollary Let $U \subset \mathbf{R}^n$ be open, and let $f, g_1, \dots, g_m : U \to \mathbf{R}$ be twice continuously
differentiable on $U$. Let $x^*$ be a constrained local minimizer of $f$ subject to
\[ g_i(x) = 0, \quad i \in E, \qquad g_i(x) \geq 0, \quad i \in E^c, \qquad x_j \geq 0, \quad j \in N. \]
Let $B$ and $Z$ be as in Corollary 12, and assume the corresponding gradients are linearly
independent. Then there exists $\lambda \in \mathbf{R}^m$ with $\lambda_i \geq 0$ for $i \in E^c$, such that
\[ x^* \cdot \Bigl( \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0 \qquad \text{and} \qquad \lambda \cdot g(x^*) = 0. \]
Karush–Kuhn–Tucker Theory

A drawback of the slack variable approach is that it assumes twice continuous differentiability in
order to apply the second order conditions and thus conclude $\lambda \geq 0$ and $\mu \geq 0$. Fortunately,
Karush [5] and Kuhn and Tucker [8] provide another approach that remedies this defect. They
only assume differentiability, and replace the independence condition on gradients by a weaker
but more obscure condition called the Karush–Kuhn–Tucker Constraint Qualification.
15 Definition Let $f, g_1, \dots, g_m : \mathbf{R}^n_+ \to \mathbf{R}$. Let
\[ C = \{ x \in \mathbf{R}^n : x \geq 0,\; g_i(x) \geq 0,\; i = 1, \dots, m \}. \]
In other words, $C$ is the constraint set. Consider a point $x^* \in C$ and define
\[ B = \{ i : g_i(x^*) = 0 \} \qquad \text{and} \qquad Z = \{ j : x^*_j = 0 \}, \]
the set of binding constraints and binding nonnegativity constraints, respectively. The point $x^*$
satisfies the Karush–Kuhn–Tucker Constraint Qualification if $f, g_1, \dots, g_m$ are differentiable at $x^*$, and for every $v \in \mathbf{R}^n$ satisfying
\[ v_j = v \cdot e_j \geq 0, \quad j \in Z, \qquad v \cdot \nabla g_i(x^*) \geq 0, \quad i \in B, \]
there is a path $\xi$ starting at $x^*$ and lying in $C$ with
\[ D\xi(0) = v, \]
where $D\xi(0)$ is the one-sided directional derivative at $0$.
This condition is actually a little weaker than Kuhn and Tucker's condition. They assumed
that the functions $f, g_1, \dots, g_m$ were differentiable everywhere and required the path to be
differentiable everywhere. You can see that it may be difficult to verify in practice.
16 Theorem (Karush–Kuhn–Tucker) Let $f, g_1, \dots, g_m : \mathbf{R}^n_+ \to \mathbf{R}$ be differentiable at $x^*$,
and let $x^*$ be a constrained local maximizer of $f$ subject to $g(x) \geq 0$ and $x \geq 0$.
Let $B = \{i : g_i(x^*) = 0\}$, the set of binding constraints, and let $Z = \{j : x^*_j = 0\}$, the
set of binding nonnegativity constraints. Assume that $x^*$ satisfies the Karush–Kuhn–Tucker
Constraint Qualification. Then there exists $\lambda \in \mathbf{R}^m$ such that
\[ \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \leq 0, \]
\[ x^* \cdot \Bigl( \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0, \]
\[ \lambda \geq 0, \]
\[ \lambda \cdot g(x^*) = 0. \]
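These four conditions are easy to check numerically on a worked example (illustrative, not from the text): maximize $f(x) = -\bigl((x_1 - 1)^2 + (x_2 - 1)^2\bigr)$ subject to $g(x) = 1 - x_1 - x_2 \geq 0$ and $x \geq 0$, whose maximizer is $x^* = (1/2, 1/2)$ with $\lambda = 1$.

```python
import numpy as np

# Verify the Karush-Kuhn-Tucker conditions at x* = (1/2, 1/2), lam = 1.
x_star = np.array([0.5, 0.5])
lam = 1.0

grad_f = -2.0 * (x_star - 1.0)      # = (1, 1)
grad_g = np.array([-1.0, -1.0])
g_val = 1.0 - x_star.sum()          # binding: g(x*) = 0

L_x = grad_f + lam * grad_g         # should be <= 0 (here exactly 0)
```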
To better understand the hypotheses of the theorem, let's look at a classic example of its
failure (cf. Kuhn and Tucker [8]).

17 Example (Failure of the Karush–Kuhn–Tucker Constraint Qualification) Consider the functions $f : \mathbf{R}^2 \to \mathbf{R}$ via $f(x, y) = x$ and $g : \mathbf{R}^2 \to \mathbf{R}$ via $g(x, y) = (1 - x)^3 - y$. The
curve $g = 0$ is shown in Figure 3, and the constraint set in Figure 4.
Clearly $(x^*, y^*) = (1, 0)$ maximizes $f$ subject to $(x, y) \geq 0$ and $g \geq 0$. At this point we have
$\nabla g(1, 0) = (0, -1)$ and $\nabla f = (1, 0)$ everywhere. Note that no $\lambda$ (nonnegative or not) satisfies
\[ (1, 0) + \lambda (0, -1) \leq (0, 0). \]
Fortunately for the theorem, the Constraint Qualification fails at $(1, 0)$. To see this, note that
the constraint $g \geq 0$ binds, that is, $g(1, 0) = 0$, and the second coordinate of $(x^*, y^*)$ is zero.
Suppose $v = (v_x, v_y)$ satisfies
\[ v \cdot \nabla g(1, 0) = v \cdot (0, -1) = -v_y \geq 0 \]
and
\[ v \cdot e_2 = v_y \geq 0, \]
that is, $v_y = 0$. For instance, take $v = (1, 0)$. The constraint qualification requires that there is
a path starting at $(1, 0)$ in the direction $(1, 0)$ that stays in the constraint set. Clearly no such
path exists, so the constraint qualification fails.
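A small numerical sketch of this failure: no value of $\lambda$ can make the first coordinate of $\nabla f + \lambda \nabla g$ nonpositive at $(1, 0)$.

```python
import numpy as np

# Example 17 at (1, 0): grad f = (1, 0), grad g = (-3*(1-x)**2, -1) = (0, -1).
grad_f = np.array([1.0, 0.0])
grad_g = np.array([0.0, -1.0])

# For every lam, the first coordinate of grad_f + lam*grad_g is exactly 1,
# so the KKT inequality grad_f + lam*grad_g <= 0 can never hold.
lams = np.linspace(-10.0, 10.0, 201)
first_coords = np.array([(grad_f + l * grad_g)[0] for l in lams])
```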
Figure 3. The curve $g = 0$ separates the region $[g > 0]$ from $[g < 0]$.
Figure 4. This constraint set violates the Constraint Qualification. (Note: $\nabla f$ and $\nabla g$ are not to scale.)
18 Theorem (Karush–Kuhn–Tucker for Minimization) Let $f, g_1, \dots, g_m : \mathbf{R}^n_+ \to \mathbf{R}$
be differentiable at $x^*$, and let $x^*$ be a constrained local minimizer of $f$ subject to $g(x) \geq 0$
and $x \geq 0$.
Let $B = \{i : g_i(x^*) = 0\}$, the set of binding constraints, and let $Z = \{j : x^*_j = 0\}$, the
set of binding nonnegativity constraints. Assume that $x^*$ satisfies the Karush–Kuhn–Tucker
Constraint Qualification. Then there exists $\lambda \in \mathbf{R}^m$ such that
\[ \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \geq 0, \]
\[ x^* \cdot \Bigl( \nabla f(x^*) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0, \]
\[ \lambda \geq 0, \]
\[ \lambda \cdot g(x^*) = 0. \]
Proof: Minimizing $f$ is the same as maximizing $-f$. The Karush–Kuhn–Tucker conditions for
this imply that there exists $\lambda \in \mathbf{R}^m_+$ such that
\[ -\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \leq 0, \]
Quasiconcave functions
There are weaker notions of convexity properties that are commonly applied in economic theory.
19 Definition A function $f : C \to \mathbf{R}$ on a convex subset $C$ of a vector space is:
quasiconcave if for all $x, y \in C$ and all $0 \leq \lambda \leq 1$,
\[ f\bigl( \lambda x + (1 - \lambda) y \bigr) \geq \min\{ f(x), f(y) \}. \]
Of course, not every quasiconcave function is concave. For instance, the function in Figure 5
is quasiconcave but not concave.
Characterizations of quasiconcavity are given in the next lemma.
21 Lemma For a function $f : C \to \mathbf{R}$ on a convex set, the following are equivalent:
1. The function $f$ is quasiconcave.
2. For each $\alpha \in \mathbf{R}$, the strict upper contour set $[f > \alpha]$ is convex (but possibly empty).
3. For each $\alpha \in \mathbf{R}$, the upper contour set $[f \geq \alpha]$ is convex (but possibly empty).
Proof: (1) $\implies$ (2) If $f$ is quasiconcave and $x, y$ in $C$ satisfy $f(x) > \alpha$ and $f(y) > \alpha$, then for
each $0 \leq \lambda \leq 1$ we have
\[ f\bigl( \lambda x + (1 - \lambda) y \bigr) \geq \min\{ f(x), f(y) \} > \alpha. \]
(2) $\implies$ (3) This follows from the identity
\[ [f \geq \alpha] = \bigcap_{n=1}^{\infty} \bigl[ f > \alpha - \tfrac{1}{n} \bigr], \]
since the intersection of convex sets is convex. (3) $\implies$ (1) If, say, $f(x) \geq f(y)$, then
$x, y \in [f \geq f(y)]$, so $f\bigl( \lambda x + (1 - \lambda) y \bigr) \geq f(y) = \min\{ f(x), f(y) \}$.
\[ f(x) = \begin{cases} \dots & x \leq 0 \\ \dots & x \geq 0, \end{cases} \qquad \text{and} \qquad g(x) = \begin{cases} \dots & x \leq 0 \\ \dots & x \geq 0. \end{cases} \]
The next result has applications to production functions. (Cf. Jehle [4, Theorem 5.2.1,
pp. 224–225] and Shephard [11, pp. 5–7].)

23 Theorem Let $f : \mathbf{R}^n_+ \to \mathbf{R}_+$ be nonnegative, nondecreasing, quasiconcave, and positively
homogeneous of degree $k$, where $0 < k \leq 1$. Then $f$ is concave.
Proof: Let $x, y \in \mathbf{R}^n_+$ and suppose first that $f(x) = \alpha > 0$ and $f(y) = \beta > 0$. Then by
homogeneity,
\[ f\Bigl( \frac{x}{\alpha^{1/k}} \Bigr) = f\Bigl( \frac{y}{\beta^{1/k}} \Bigr) = 1. \]
By quasiconcavity,
\[ f\Bigl( \lambda \frac{x}{\alpha^{1/k}} + (1 - \lambda) \frac{y}{\beta^{1/k}} \Bigr) \geq 1 \]
for $0 \leq \lambda \leq 1$. So setting $\lambda = \dfrac{\alpha^{1/k}}{\alpha^{1/k} + \beta^{1/k}}$, we have
\[ f\Bigl( \frac{x + y}{\alpha^{1/k} + \beta^{1/k}} \Bigr) \geq 1. \]
By homogeneity,
\[ f(x + y) \geq \bigl( \alpha^{1/k} + \beta^{1/k} \bigr)^k = \Bigl[ f(x)^{1/k} + f(y)^{1/k} \Bigr]^k. \tag{20} \]
Observe that since $f$ is nonnegative and nondecreasing, (20) holds even if $f(x) = 0$ or $f(y) = 0$.
Now replace $x$ by $\lambda x$ and $y$ by $(1 - \lambda) y$ in (20), where $0 \leq \lambda \leq 1$, to get
\[ f\bigl( \lambda x + (1 - \lambda) y \bigr) \geq \Bigl[ f(\lambda x)^{1/k} + f\bigl( (1 - \lambda) y \bigr)^{1/k} \Bigr]^k = \Bigl[ \lambda f(x)^{1/k} + (1 - \lambda) f(y)^{1/k} \Bigr]^k \geq \lambda f(x) + (1 - \lambda) f(y), \]
where the middle equality uses homogeneity again, and the last inequality follows from the
concavity of $t \mapsto t^k$. Since $x$ and $y$ are arbitrary, $f$
is concave.
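A quick numerical spot-check of the theorem (Python sketch; the Cobb–Douglas function is an illustrative instance, not from the text): $f(x) = (x_1 x_2)^{1/4}$ is nonnegative, nondecreasing, quasiconcave, and homogeneous of degree $k = 1/2$, so it should satisfy the concavity inequality.

```python
import numpy as np

# f(x) = (x1*x2)**0.25: homogeneous of degree 1/2, so concave by Theorem 23.
rng = np.random.default_rng(0)

def f(v):
    return (v[0] * v[1]) ** 0.25

ok = True
for _ in range(1000):
    x = rng.uniform(0.1, 5.0, 2)
    y = rng.uniform(0.1, 5.0, 2)
    lam = rng.uniform()
    lhs = f(lam * x + (1 - lam) * y)
    rhs = lam * f(x) + (1 - lam) * f(y)
    ok = ok and (lhs >= rhs - 1e-9)     # concavity inequality at each sample
```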
\[ \frac{f\bigl( x + \lambda (y - x) \bigr) - f(x)}{\lambda}. \]

If $f$ is also $C^2$, then whenever $\nabla f(x) \cdot v = 0$ we have
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j} v_i v_j \leq 0. \]
Proof: Pick $v \in \mathbf{R}^n$ and define $g(\lambda) = f(x + \lambda v)$. Then $g(0) = f(x)$, $g'(0) = \nabla f(x) \cdot v$, and
$g''(0) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j} v_i v_j$. What we have to show is that if $g'(0) = 0$, then $g''(0) \leq 0$.
Assume for the sake of contradiction that $g'(0) = 0$ and $g''(0) > 0$. Then $g$ has a strict local
minimum at zero. That is, for $\varepsilon > 0$ small enough, $f(x + \varepsilon v) > f(x)$ and $f(x - \varepsilon v) > f(x)$.
But by quasiconcavity,
\[ f(x) = f\Bigl( \tfrac{1}{2} (x + \varepsilon v) + \tfrac{1}{2} (x - \varepsilon v) \Bigr) \geq \min\bigl\{ f(x + \varepsilon v), f(x - \varepsilon v) \bigr\} > f(x), \]
a contradiction.
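This second-order necessary condition can be spot-checked numerically (illustrative Python sketch): for the quasiconcave $f(x) = \sqrt{x_1 x_2}$ at $x = (1, 1)$, any $v$ orthogonal to $\nabla f(1, 1) = (1/2, 1/2)$ should give a nonpositive quadratic form.

```python
import numpy as np

# f(x) = sqrt(x1*x2); at (1, 1): grad f = (1/2, 1/2) and the Hessian is below.
grad = np.array([0.5, 0.5])
H = np.array([[-0.25,  0.25],
              [ 0.25, -0.25]])

v = np.array([1.0, -1.0])     # orthogonal to grad f(1, 1)
quad = v @ H @ v              # quadratic form; should be <= 0
```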
The following theorem and its proof may be found in Arrow and Enthoven [1].

26 Theorem (Arrow–Enthoven) Let $f, g_1, \dots, g_m : \mathbf{R}^n_+ \to \mathbf{R}$ be differentiable and quasiconcave. Suppose $x^* \in \mathbf{R}^n_+$ and $\lambda \in \mathbf{R}^m$ satisfy the constraints $g(x^*) \geq 0$ and $x^* \geq 0$ and the
Karush–Kuhn–Tucker–Lagrange first order conditions:
\[ \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \leq 0, \]
\[ x^* \cdot \Bigl( \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) \Bigr) = 0, \]
\[ \lambda \geq 0, \]
\[ \lambda \cdot g(x^*) = 0. \]
Say that a variable $x_j$ is relevant if it may take on a strictly positive value in the constraint
set. That is, if there exists some $\hat{x} \geq 0$ satisfying $\hat{x}_j > 0$ and $g(\hat{x}) \geq 0$.
Suppose one of the following conditions is satisfied:
1. $\dfrac{\partial f(x^*)}{\partial x_{j_0}} < 0$ for some relevant variable $x_{j_0}$.
2. $\dfrac{\partial f(x^*)}{\partial x_{j_1}} > 0$ for some relevant variable $x_{j_1}$.
References
[1] K. J. Arrow and A. C. Enthoven. 1961. Quasi-concave programming. Econometrica
29(4):779–800.
http://www.jstor.org/stable/1911819
[2] K. C. Border. Notes on the implicit function theorem.
http://www.hss.caltech.edu/~kcb/Notes/IFT.pdf
[3] C. Carathéodory. 1982. Calculus of variations, 2d. ed. New York: Chelsea. This was
originally published in 1935 in two volumes by B. G. Teubner in Berlin as Variationsrechnung und Partielle Differentialgleichungen erster Ordnung. In 1956 the first volume was
edited and updated by E. Hölder. The revised work was translated by Robert B. Dean and
Julius J. Brandstatter and published in two volumes as Calculus of variations and partial
differential equations of the first order by Holden-Day in 1965–66. The Chelsea second
edition combines and revises the 1967 edition.
[4] G. A. Jehle. 1991. Advanced microeconomic theory. Prentice-Hall.
[5] W. Karush. 1939. Minima of functions of several variables with inequalities as side conditions. Master's thesis, Department of Mathematics, University of Chicago.
[6] H. W. Kuhn. 1976. Nonlinear programming: A historical view. SIAM–AMS Proceedings
IX:1–26. Reprinted as [7].
[7]
[8] H. W. Kuhn and A. W. Tucker. 1951. Nonlinear programming. In J. Neyman, ed., Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability II,
Part I, pages 481–492. Berkeley: University of California Press. Reprinted in [9, Chapter
1, pp. 3–14].
http://projecteuclid.org/euclid.bsmsp/1200500249
[9] P. Newman, ed. 1968. Readings in mathematical economics I: Value theory. Baltimore:
Johns Hopkins Press.
[10] J. P. Quirk. 1976. Intermediate microeconomics: Mathematical notes. Chicago: Science
Research Associates.
[11] R. W. Shephard. 1981. Cost and production functions. Number 194 in Lecture Notes in
Economics and Mathematical Systems. Berlin: Springer–Verlag. Reprint of the 1953 edition
published by Princeton University Press.
[12] A. Takayama. 1974. Mathematical economics. Hinsdale, Illinois: Dryden Press.