Notations:
• A set is any collection of items. For example, R is the set of all real numbers.
• R² = {(x, y) : x ∈ R, y ∈ R}.
• R²₊ = {(x, y) : x ≥ 0, y ≥ 0}.
• R²₊₊ = {(x, y) : x > 0, y > 0}.
• A ∪ B = {x : either x ∈ A or x ∈ B}.
• A ∩ B = {x : x ∈ A and x ∈ B}.
Convex Sets
• A set X in Rⁿ is said to be a convex set if for any two points in X, all their
convex combinations are also included in X; that is, for any x, x′ ∈ X and any
λ ∈ [0, 1], λx + (1 − λ)x′ ∈ X. In other words, a set is convex if and only if for
any two given points in the set, the line segment joining the two points is
included in the set.
• A function f defined on a convex set X is concave if for every two points x, x′
in X and any λ ∈ [0, 1],
f(λx + (1 − λ)x′) ≥ λf(x) + (1 − λ)f(x′).
Geometrically, a function of one variable is concave if the secant line lies below the
graph of the function, or equivalently if the tangent line lies above the graph of the
function.
• A function f defined on a convex set X is strictly concave if for every two different
points x, x′ in X and any λ ∈ (0, 1),
f(λx + (1 − λ)x′) > λf(x) + (1 − λ)f(x′).
Geometrically, a function of one variable is strictly concave if the secant line lies
strictly below the graph of the function. If the graph of a function has a straight part,
then the function cannot be strictly concave.
• A strictly concave (convex) function must be concave (convex), but the converse may
not be true. For example, the function f(x) = |x| is convex but not strictly convex,
and f(x) = −|x| is concave but not strictly concave.
• The level set (or level curve, if the function has only two variables) of a function
y = f(x) with domain X in Rⁿ is the set
L(c) = {x ∈ X : f(x) = c}
for some constant c. If y = f(x) is a production function, then its level set is known
as an isoquant, and if it is a utility function, then its level set is known as an
indifference curve.
• The better set of a point x0 ∈ Rⁿ for a function y = f(x) with domain X in Rⁿ is
the region
B(x0) = {x ∈ X : f(x) ≥ f(x0)}.
• The worse set of a point x0 ∈ Rⁿ for a function y = f(x) with domain X in Rⁿ is
the region
W(x0) = {x ∈ X : f(x) ≤ f(x0)}.
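As a quick illustration, these three sets can be checked pointwise. The function f(x, y) = xy and all helper names below are our own example, not part of the notes:

```python
# Level, better, and worse sets for the example function f(x, y) = x*y.

def f(x, y):
    return x * y

def in_level_set(x, y, c, tol=1e-9):
    """x is in L(c) iff f(x) = c (up to a numerical tolerance)."""
    return abs(f(x, y) - c) < tol

def in_better_set(x, y, x0, y0):
    """x is in B(x0) iff f(x) >= f(x0)."""
    return f(x, y) >= f(x0, y0)

def in_worse_set(x, y, x0, y0):
    """x is in W(x0) iff f(x) <= f(x0)."""
    return f(x, y) <= f(x0, y0)

# (1, 4) and (2, 2) lie on the same level curve L(4); (3, 3) is in the
# better set of (2, 2), and (1, 1) is in its worse set.
```

For a utility function, `in_level_set` picks out an indifference curve and `in_better_set` the upper contour set.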
The first order differentials
• The first order total differential can be used to calculate the slope of a level curve
of a function, say z = f(x, y). The procedure is as follows: consider the equation
f(x, y) = c, where c is a constant. Take the total differential on both sides of the
equation f(x, y) = c. Since c is a constant, its total differential is zero, so we have
0 = fx dx + fy dy. Solving for dy/dx, we have dy/dx = −fx/fy.
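This formula is easy to verify numerically. Below is a small sketch with an assumed example, f(x, y) = x² + y² and the level curve f = 25, comparing dy/dx = −fx/fy against a finite difference:

```python
import math

# Example function f(x, y) = x^2 + y^2 and its partial derivatives.
fx = lambda x, y: 2 * x
fy = lambda x, y: 2 * y

def level_curve_slope(x, y):
    """Slope of the level curve through (x, y): dy/dx = -fx/fy."""
    return -fx(x, y) / fy(x, y)

# On the level curve x^2 + y^2 = 25 the upper branch is y = sqrt(25 - x^2);
# a central difference of that branch should match the formula at (3, 4).
y_of_x = lambda x: math.sqrt(25 - x * x)
h = 1e-6
numeric = (y_of_x(3 + h) - y_of_x(3 - h)) / (2 * h)
analytic = level_curve_slope(3, 4)   # -fx/fy = -6/8 = -0.75
```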
Quadratic Forms:
Given an n × n matrix A, the function
q(x) = xᵀAx
is a quadratic form. There is no loss of generality in assuming that the matrix which
generates the form is symmetric, since the matrix A∗ with elements
a∗ij = (aij + aji)/2
is symmetric and has the same quadratic form as A.
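The symmetrization step can be checked directly. The 2 × 2 matrix and test vector below are our own toy data:

```python
# For any matrix A, the symmetric part A* = (A + A^T)/2 generates the
# same quadratic form x^T A x.

def quad_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

A = [[1, 4],
     [0, 3]]                      # not symmetric
A_sym = [[(A[i][j] + A[j][i]) / 2 for j in range(2)] for i in range(2)]
# A_sym = [[1, 2], [2, 3]]

x = [2, -1]
# quad_form(A, x) and quad_form(A_sym, x) agree.
```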
• Given a matrix A, one can write the associated quadratic form q(x) = xᵀAx; conversely,
given a quadratic form q(x), one can find a symmetric matrix A such that q(x) is the
quadratic form associated with A.
• If
q(x) = xᵀAx > 0 (< 0) for all x ≠ 0,
then q(x) is said to be a positive (negative) definite quadratic form and A is said to
be a positive (negative) definite matrix.
• If
q(x) = xᵀAx ≥ 0 (≤ 0) for all x,
then q(x) is said to be a positive (negative) semidefinite quadratic form and A is said
to be a positive (negative) semidefinite matrix.
It is obvious that if x is the zero vector, then the quadratic form q(x) = xᵀAx = 0. The
quadratic form is positive definite if and only if the quadratic form as a function is strictly
convex and has a unique minimum at the origin. The shape of the surface is like a bowl.
The quadratic form is negative definite if and only if the quadratic form as a function is
strictly concave and has a unique maximum at the origin. The shape of the surface is like
a dome. If the quadratic form is positive (negative) semidefinite then the quadratic form
as a function is convex (concave) and has a minimum (maximum) at the origin but the
extremum may not be unique. The shape of the surface may look like a sheet of paper
rolled upwards (downwards). If the quadratic form is indefinite, then the quadratic form as
a function is neither convex nor concave. The shape of the surface is like a saddle.
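The bowl, dome, and saddle shapes can be seen by evaluating the corresponding quadratic forms at a few points. The three diagonal matrices below are our own stock examples:

```python
# Sign behaviour of definite and indefinite quadratic forms.

def quad_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

bowl   = [[1, 0], [0, 1]]    # positive definite: bowl-shaped surface
dome   = [[-1, 0], [0, -1]]  # negative definite: dome-shaped surface
saddle = [[1, 0], [0, -1]]   # indefinite: saddle-shaped surface

points = [(1, 0), (0, 1), (1, 1), (-2, 3)]
bowl_signs  = [quad_form(bowl, p) > 0 for p in points]
dome_signs  = [quad_form(dome, p) < 0 for p in points]
saddle_vals = [quad_form(saddle, p) for p in points]   # takes both signs
```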
Tests for definiteness in terms of determinants:
Let A be a symmetric matrix of size n. A leading principal submatrix of order k is
obtained by deleting the last n − k rows and columns. Let Ak denote the kth order leading
principal submatrix. Then the determinant |Ak| is called the kth order leading principal
minor.
(a) A is positive definite if and only if |A1| > 0, |A2| > 0, . . . , |An| > 0.
(b) A is negative definite if and only if |A1| < 0, |A2| > 0, |A3| < 0, · · ·.
(c) If |A1| > 0, |A2| > 0, . . . , |An−1| > 0 and |An| = 0, then A is positive semidefinite.
(d) If |A1| < 0, |A2| > 0, . . . , (−1)ⁿ⁻¹|An−1| > 0 and |An| = 0, then A is negative semidefinite.
For a symmetric 3 × 3 matrix A, in particular:
• A is positive definite if and only if |A1| > 0, |A2| > 0, |A3| > 0.
• A is negative definite if and only if |A1| < 0, |A2| > 0, |A3| < 0.
• If either a11a22 − a12² < 0, or a11a33 − a13² < 0, or a22a33 − a23² < 0, then A is indefinite.
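The leading-principal-minor test is easy to automate. A minimal sketch, with a determinant routine and a classic tridiagonal example of our own choosing:

```python
# Leading principal minors of a symmetric matrix, computed with a small
# Gaussian-elimination determinant (pure Python, no external libraries).

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]        # work on a copy
    n = len(A)
    d = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(A[r][k]))
        if abs(A[pivot][k]) < 1e-12:
            return 0.0               # singular leading block
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            d = -d                   # row swap flips the sign
        d *= A[k][k]
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= m * A[k][c]
    return d

def leading_minors(A):
    """|A_1|, |A_2|, ..., |A_n|."""
    n = len(A)
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
minors = leading_minors(A)   # [2, 3, 4]: all positive, so A is
                             # positive definite by test (a)
```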
For a function of n variables y = f(x), x ∈ Rⁿ, the first order differential is
df(x) = ∇f(x) · dx = f1 dx1 + f2 dx2 + · · · + fn dxn,
where ∇f(x) is the gradient vector and dx is the vector dx = (dx1, dx2, . . . , dxn). The
second order total differential is
d²f(x) = dxᵀ ∇²f(x) dx,
where ∇²f(x) is the Hessian matrix of second partial derivatives.
• If d²f(x) > 0 (< 0) for all x and all dx ≠ 0, then f is strictly convex (concave).
Note that the second order total differential is the quadratic form of the Hessian matrix.
Hence testing the definiteness of the Hessian matrix is equivalent to testing the signs of the
second order total differential. Hence we can use the Hessian matrix to test convexity
and concavity, just as we used the second derivative to test concavity in one-variable
calculus. For functions of more than one variable, the Hessian matrix replaces the
second derivative; indeed, when n = 1 the Hessian matrix of f is the second derivative
of f. So it is natural that we have the following tests:
• f is convex (concave) if and only if the Hessian matrix ∇²f(x) is positive (negative)
semidefinite for all x.
• If the Hessian matrix ∇²f(x) is positive (negative) definite for all x, then f is strictly
convex (concave).
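As a worked instance of this test, consider the assumed example f(x1, x2) = x1² + x1x2 + x2². Its Hessian is constant, so checking it once settles convexity everywhere:

```python
# Hessian test for f(x1, x2) = x1^2 + x1*x2 + x2^2: the Hessian is the
# constant matrix [[2, 1], [1, 2]].

H = [[2, 1],
     [1, 2]]    # [[f11, f12], [f21, f22]]

minor1 = H[0][0]                           # |H_1| = 2 > 0
minor2 = H[0][0] * H[1][1] - H[0][1] ** 2  # |H_2| = 3 > 0

positive_definite = minor1 > 0 and minor2 > 0  # hence f is strictly convex
```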
Tests of quasiconvexity and quasiconcavity by bordered Hessian:
The bordered Hessian of y = f(x1, x2) is

        | 0    f1   f2  |
H̄ =    | f1   f11  f12 |
        | f2   f12  f22 |

If det H̄ > 0 for all (x1, x2) in the domain, then f is quasiconcave; if det H̄ < 0
for all (x1, x2) in the domain, then f is quasiconvex.
• First order condition for optimality (FOC): A local extremum must be a stationary
point.
• Second order sufficient condition for global extremum (SOSC): Suppose that x∗ is a
stationary point, i.e., ∇f(x∗) = 0. If the Hessian matrix ∇²f(x) is negative semidefinite
for ALL x, then f is concave and x∗ is a global maximizer. If the Hessian matrix
∇²f(x) is negative definite for ALL x, then f is strictly concave and x∗ is the unique
global maximizer. If the Hessian matrix ∇²f(x) is positive semidefinite for all x, then
f is convex and x∗ is a global minimizer. If the Hessian matrix ∇²f(x) is positive
definite for all x, then f is strictly convex and x∗ is the unique global minimizer.
• Second order sufficient condition for local extremum (SOSC): Suppose that x∗ is a
stationary point, i.e., ∇f(x∗) = 0. If the Hessian matrix ∇²f(x∗) is negative (positive)
definite, then f is concave (convex) around x∗ and x∗ is a local maximizer (minimizer).
If the Hessian matrix ∇²f(x∗) is indefinite, then x∗ is a saddle point (i.e., neither a
local maximum nor a local minimum).
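The two-variable version of this classification can be written as a small routine. The helper name and the example f(x, y) = x² − y² are our own:

```python
# Classify a stationary point of a two-variable function from its second
# partials, using the leading principal minors of the Hessian.

def hessian_classify(f11, f12, f22):
    """Second order test at a stationary point (two-variable case)."""
    d1 = f11                      # |H_1|
    d2 = f11 * f22 - f12 ** 2     # |H_2|
    if d1 > 0 and d2 > 0:
        return "local minimizer"  # Hessian positive definite
    if d1 < 0 and d2 > 0:
        return "local maximizer"  # Hessian negative definite
    if d2 < 0:
        return "saddle point"     # Hessian indefinite
    return "test inconclusive"    # a leading minor vanishes

# f(x, y) = x^2 - y^2 has a stationary point at the origin, where
# f11 = 2, f12 = 0, f22 = -2.
kind = hessian_classify(2, 0, -2)
```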
FOC and SOSC for constrained optimization problems
Consider the constrained optimization problem
(P ) max(min) f (x1 , x2 )
s.t. g(x1 , x2 ) = 0.
• Critical points (or stationary points): A solution (x∗1, x∗2) to the following system of
equations is called a critical point, and λ∗ is called a Lagrange multiplier; here
L(x1, x2, λ) = f(x1, x2) + λg(x1, x2) is the Lagrange function.

∂L/∂x1 = f1(x∗1, x∗2) + λ∗g1(x∗1, x∗2) = 0
∂L/∂x2 = f2(x∗1, x∗2) + λ∗g2(x∗1, x∗2) = 0
∂L/∂λ = g(x∗1, x∗2) = 0.
• First order necessary condition for constrained optimization (FOC): If (x∗1, x∗2) is a
local extremum of the problem (P) and ∇g(x∗1, x∗2) ≠ 0, then (x∗1, x∗2) is a critical point.
• Second order sufficient condition (SOSC): The Hessian matrix of the Lagrange function is

                 | L11  L12  g1 |
H(x1, x2, λ) =   | L12  L22  g2 |
                 | g1   g2   0  |

It varies with (x1, x2, λ). If we fix (x∗1, x∗2, λ∗), then we denote

det H∗ = det H(x∗1, x∗2, λ∗) = 2g∗1 g∗2 L∗12 − (g∗1)² L∗22 − (g∗2)² L∗11,

where L∗11 := f11(x∗1, x∗2) + λ∗g11(x∗1, x∗2), etc. Assume that (x∗1, x∗2, λ∗) satisfies the
FOCs. Since the restricted second order total differential is defined by

“d²L∗” = [g∗2  −g∗1] | L∗11  L∗12 | | g∗2  |
                     | L∗12  L∗22 | | −g∗1 |
       = (g∗2)² L∗11 − 2g∗1 g∗2 L∗12 + (g∗1)² L∗22,

we have “d²L∗” = − det(H∗) (i.e. the signs of the determinant of H∗ and of “d²L∗” are
opposite). Therefore, if det(H∗) > 0 then “d²L∗” < 0 and x∗ is a local maximizer, while
if det(H∗) < 0 then “d²L∗” > 0 and x∗ is a local minimizer.
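These conditions can be traced through on a small worked example of our own: max x1x2 subject to x1 + x2 = 2, i.e. g(x1, x2) = x1 + x2 − 2 = 0, whose FOCs give x∗1 = x∗2 = 1 and λ∗ = −1:

```python
# FOC and SOSC check for max x1*x2 s.t. x1 + x2 - 2 = 0,
# with Lagrange function L = f + lam*g.

x1, x2, lam = 1.0, 1.0, -1.0   # candidate critical point from the FOCs

# FOC residuals (all should be zero):
foc1 = x2 + lam * 1            # dL/dx1 = f1 + lam*g1
foc2 = x1 + lam * 1            # dL/dx2 = f2 + lam*g2
foc3 = x1 + x2 - 2             # dL/dlam = g

# SOSC: L11 = 0, L12 = 1, L22 = 0, g1 = g2 = 1, so
# det H* = 2*g1*g2*L12 - g1^2*L22 - g2^2*L11 = 2 > 0  ->  local maximizer.
L11, L12, L22, g1, g2 = 0.0, 1.0, 0.0, 1.0, 1.0
detH = 2 * g1 * g2 * L12 - g1 ** 2 * L22 - g2 ** 2 * L11
```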
The meaning of Lagrange multiplier—shadow price
Consider the parametric constrained optimization problem
(P(α)) max(min) f(x1, x2)
       s.t. g(x1, x2) = α,
where α is a parameter.
The Lagrange function for P(α) is the function
L(x1, x2, λ) = f(x1, x2) + λ(α − g(x1, x2)).
Suppose that we can solve the problem P(α) and find a solution (x∗1(α), x∗2(α)) with La-
grange multiplier λ∗(α), and assume that dx∗1/dα and dx∗2/dα exist and are differentiable.
Then the value function V(α) = f(x∗1(α), x∗2(α)) is differentiable with derivative equal to
the Lagrange multiplier, i.e.,
V′(α) = λ∗(α).
Note that this is a very useful formula, since one does not need to find the value function
in order to find its derivative: one gets this information for free from solving the
optimization problem.
In particular, let α0 be a fixed number. Then λ∗(α0), the Lagrange multiplier for problem
P(α0), measures the rate of change of the maximum (minimum) value of the objective
function when α changes from α0 (i.e. when the constraint g(x1, x2) = α0 is relaxed or
tightened slightly). For this reason, λ∗(α) is referred to as the “shadow price” of α.
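The shadow-price formula can be verified numerically on an assumed example, max x1x2 s.t. x1 + x2 = α, whose solution is x∗1 = x∗2 = α/2 with λ∗(α) = α/2:

```python
# Envelope-theorem check: V'(alpha) should equal lam*(alpha).

def V(alpha):
    """Value function of max x1*x2 s.t. x1 + x2 = alpha."""
    return (alpha / 2) * (alpha / 2)   # x1* = x2* = alpha/2

def lam_star(alpha):
    """Lagrange multiplier from the FOCs: lam* = x2* = alpha/2."""
    return alpha / 2

alpha0, h = 3.0, 1e-6
numeric_slope = (V(alpha0 + h) - V(alpha0 - h)) / (2 * h)  # ~ V'(alpha0)
# numeric_slope agrees with lam_star(alpha0) = 1.5
```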
In the consumer problem (with p1 , p2 given):
max u(x1 , x2 )
s.t. p1 x1 + p2 x2 = m.
The value function V(m) is called the indirect utility function, and the Lagrange multiplier
λ∗(m0) is the rate of change of the optimal utility when the budget is tightened or relaxed
slightly from m0. It is “the marginal utility of income”.
In the expenditure minimization problem (with p1, p2 given)
min p1x1 + p2x2
s.t. u(x1, x2) = ū.
The value function E(ū) is called the expenditure function, and the Lagrange multiplier
λ∗(u0) is the rate of change of the cost when the utility constraint is tightened or relaxed
slightly from ū = u0. It is “the marginal cost of utility”.
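To close with a concrete consumer example of our own (the Cobb-Douglas utility and the demand formulas below are assumptions, not part of the notes): for max √(x1x2) s.t. p1x1 + p2x2 = m, the demands are xi = m/(2pi), and the marginal utility of income is 1/(2√(p1p2)):

```python
import math

# Cobb-Douglas consumer: u(x1, x2) = sqrt(x1*x2), budget p1*x1 + p2*x2 = m.

def demands(p1, p2, m):
    """Marshallian demands: each good gets half the budget."""
    return m / (2 * p1), m / (2 * p2)

def indirect_utility(p1, p2, m):
    x1, x2 = demands(p1, p2, m)
    return math.sqrt(x1 * x2)      # = m / (2*sqrt(p1*p2))

p1, p2, m, h = 1.0, 4.0, 8.0, 1e-6
# Marginal utility of income: numerical V'(m) vs the closed form.
marginal = (indirect_utility(p1, p2, m + h)
            - indirect_utility(p1, p2, m - h)) / (2 * h)
closed_form = 1 / (2 * math.sqrt(p1 * p2))   # 0.25
```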