A FAST ALGORITHM FOR NONLINEARLY CONSTRAINED OPTIMIZATION CALCULATIONS

M.J.D. Powell

1. Introduction
An algorithm for solving the general constrained optimization problem is
presented that combines the advantages of variable metric methods for unconstrained
optimization calculations with the fast convergence of Newton's method for solving
nonlinear equations.
The given method is very similar to the one suggested in Section 5 of Powell (1976).
The main progress that has been made since that paper was written is that calculation
and analysis have increased our understanding of the method.
We consider the problem of calculating the least value of a differentiable function F(x), x in R^n, subject to constraints of the form

    c_i(x) = 0,      i = 1, 2, ..., m',
    c_i(x) ≥ 0,      i = m'+1, ..., m,                          (1.1)

where each constraint function c_i is also differentiable. We let g(x) be the gradient vector

    g(x) = ∇F(x),                                               (1.2)
and we let N be the matrix whose columns are the normals, ∇c_i, of the "active
constraints".
The way in which the step-length is chosen to force convergence from poor starting approximations, and the device that controls the matrix B, are defined in Sections 3 and 4. The numerical results of Section 5 suggest that the number of function and gradient evaluations the algorithm requires compares favourably with what seems to be done by the best of the other published algorithms for constrained optimization. Its theoretical convergence properties are studied elsewhere (Powell, 1977).
2. The quadratic programming calculation

Each iteration has a starting point x for the vector of variables and a positive definite matrix B; B may be any positive definite matrix initially, for example the unit matrix, and it takes the place of the second derivative matrix that is used in the unconstrained case. The vector d that minimizes the quadratic function

    Q(d) = F(x) + d^T g + ½ d^T B d,                            (2.1)

where g is the gradient (1.2) at x, subject to linear approximations to the constraints, is calculated.
Then the variables are changed to

    x* = x + α d,                                               (2.2)

where α is a step-length, and another iteration is begun. In the unconstrained case α would be chosen to give a suitable reduction in the function of one variable

    φ(α) = F(x + α d);                                          (2.3)

the modification to this method in the constrained case is described in Section 4, because the choice of α must also take account of the constraints on the variables.
If the matrix of constraint normals

    N = { ∇c_1, ∇c_2, ..., ∇c_m' }                              (2.4)

has full rank, an obvious method of defining the search direction d is to satisfy
the equations

    N^T d + c = 0,                                              (2.5)

where c is the vector of constraint values c_i(x) (i = 1, 2, ..., m'), for then the step (2.2) with α = 1 is an iteration of Newton's method for solving the equations of the equality constraints.
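When m' < n the system (2.5) does not define d uniquely, and one natural choice is the solution of least length. The following sketch, in Python with NumPy, is our own illustration of that choice and is not part of the method of this paper:

    import numpy as np

    def newton_constraint_step(N, c):
        # Minimum-norm d satisfying N^T d + c = 0, equation (2.5).
        # N is the n by m' array whose columns are the normals grad c_i,
        # and c holds the constraint values c_i(x).
        d, *_ = np.linalg.lstsq(N.T, -c, rcond=None)  # solves N^T d = -c
        return d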
However, in order to gain the advantages of variable metric methods for unconstrained optimization, we let d be the vector that minimizes the function (2.1) subject to the linear conditions

    ∇c_i^T d + c_i = 0,      i = 1, 2, ..., m',
    ∇c_i^T d + c_i ≥ 0,      i = m'+1, m'+2, ..., m,            (2.6)

where c_i denotes the value c_i(x). Assuming that B is positive definite and that each new value of x has the form (2.2), we say that an algorithm that calculates d in this way is a "variable metric method for constrained optimization".
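For illustration, the subproblem (2.1), (2.6) can be posed with standard software. The sketch below uses the SLSQP method of scipy.optimize.minimize; the function and argument names are our own, and this is not the quadratic programming code that the algorithm of this paper uses:

    import numpy as np
    from scipy.optimize import minimize

    def search_direction(g, B, c, A, m_eq):
        # Minimize d^T g + 0.5 d^T B d subject to (2.6), where the rows of
        # A are the normals grad c_i^T and c holds the values c_i(x); the
        # first m_eq constraints are equalities.  The constant F(x) in
        # (2.1) does not affect the minimizer, so it is dropped.
        n = len(g)
        cons = []
        if m_eq > 0:
            cons.append({'type': 'eq',
                         'fun': lambda d: A[:m_eq] @ d + c[:m_eq]})
        if m_eq < len(c):
            cons.append({'type': 'ineq',
                         'fun': lambda d: A[m_eq:] @ d + c[m_eq:]})
        res = minimize(lambda d: g @ d + 0.5 * d @ B @ d, np.zeros(n),
                       jac=lambda d: g + B @ d,
                       constraints=cons, method='SLSQP')
        return res.x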
One might question the decision to keep B positive definite, because the second derivative matrix of the Lagrangian function need not be positive definite at the required minimum, so B cannot always be regarded as a second derivative approximation. For example, if F(x) = -x_1², the objective function has negative curvature. However, numerical experience and the theory in Powell (1977) indicate that fast convergence is usually obtained nevertheless; a related point of view underlies the "ideal penalty function" (Fletcher, 1975), which keeps the quadratic part of the penalty function positive even when the second derivative matrix of the Lagrangian function is indefinite. Our algorithm therefore works with a positive definite B throughout the calculation.
Keeping B positive definite not only helps the quadratic programming calculation that provides d, but it also allows a basic feature of variable metric
methods to be retained. This feature is that the steps of the algorithm do not require the user to scale the variables and the constraint functions carefully before starting the calculation, although some preliminary scaling may be helpful to the subroutines for matrix calculations that are
used by the algorithm.
It can happen that the linear conditions (2.6) on d are inconsistent. If this case
occurs it does not necessarily imply that the nonlinear constraints (1.1) are
inconsistent. Therefore we introduce an extra variable ξ into the quadratic programming calculation, and we replace the conditions (2.6) by the constraints

    ∇c_i^T d + ξ c_i = 0,        i = 1, 2, ..., m',
    ∇c_i^T d + ξ_i c_i ≥ 0,      i = m'+1, m'+2, ..., m,        (2.7)

where

    ξ_i = 1,    c_i > 0,
    ξ_i = ξ,    c_i < 0.                                        (2.8)
Thus we modify only the constraints that are not satisfied at the starting point
of the iteration, and ξ is restricted to the interval 0 ≤ ξ ≤ 1. We make ξ as close to one as the consistency of the constraints allows, while the objective function of the quadratic programming calculation is the same as before. Usually the value ξ = 1 is obtained, in which case d satisfies the original conditions (2.6) and the new vector of variables has the
form (2.2). However, if the value ξ = 0 has to be used, then all the components of d are zero and no progress can be made, so an error return is given; this happens when the linear approximations to the violations of the nonlinear constraints are
inconsistent for every positive ξ.
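One way of realizing the relaxation (2.7), (2.8) is to optimize over the extended vector (d, ξ). In the sketch below the small linear penalty that pushes ξ toward one is our own assumption, because the exact treatment of ξ in the objective function is not reproduced here:

    import numpy as np
    from scipy.optimize import minimize

    def relaxed_direction(g, B, c, A, m_eq, weight=1.0e3):
        # Unknowns are z = (d, xi) with 0 <= xi <= 1.  Following (2.8),
        # only the constraint values that are violated at x are scaled
        # by xi, so z = 0 (d = 0, xi = 0) is always feasible.
        n, m = len(g), len(c)
        idx = np.arange(m)
        scaled = np.where((idx < m_eq) | (c < 0), c, 0.0)
        fixed = np.where((idx >= m_eq) & (c > 0), c, 0.0)

        def cvals(z):
            return A @ z[:n] + z[n] * scaled + fixed

        cons = []
        if m_eq > 0:
            cons.append({'type': 'eq', 'fun': lambda z: cvals(z)[:m_eq]})
        if m_eq < m:
            cons.append({'type': 'ineq', 'fun': lambda z: cvals(z)[m_eq:]})
        obj = lambda z: (g @ z[:n] + 0.5 * z[:n] @ B @ z[:n]
                         + weight * (1.0 - z[n]))  # assumed penalty on 1 - xi
        res = minimize(obj, np.zeros(n + 1),
                       bounds=[(None, None)] * n + [(0.0, 1.0)],
                       constraints=cons, method='SLSQP')
        return res.x[:n], res.x[n]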
3. The revision of B
In order to achieve superlinear convergence, the second derivative information that is used for revising B should be information about the Lagrangian function, and not just about the curvature of F(x). The reason is that the solution and the multipliers λ_i of the active constraints satisfy the equation

    ∇F(x) - Σ_{i=1}^{m'} λ_i ∇c_i(x) = 0,                       (3.1)

which, with the equations of the active constraints, gives m'+n equations in m'+n unknowns; the fast convergence of the algorithm comes from treating this system in the way that Newton's method treats a square system of nonlinear equations. These conditions provide the second derivative information that is needed.
We note that equation (3.1) is the stationarity condition of the Lagrangian function

    L(x, λ) = F(x) - λ^T c(x),                                  (3.2)

and one consequence is that second derivatives of the constraint functions, as well as those of F, contribute to the curvature that B has to represent. Therefore a vector λ of multiplier estimates is needed on each iteration, and it is suitable to let λ be the vector of Lagrange multipliers at the solution of the quadratic programming problem that defines d.
Experience
with variable metric methods for unconstrained optimization suggests that B should
be replaced by a matrix, B* say, that depends on B and on the difference in gradients
    γ = ∇_x L(x + δ, λ) - ∇_x L(x, λ),                          (3.3)

where

    δ = x* - x.                                                 (3.4)

The standard updating formulae assume that the product δ^T γ is positive.
The formulae that are used for revising B in the unconstrained case suppose that this has been arranged, but, when there are constraints, it can happen that δ^T γ is not positive. In this case the usual methods for revising B would fail to make the matrix B*
positive definite.
Therefore we replace γ by the vector

    η = θ γ + (1 - θ) B δ,      0 ≤ θ ≤ 1,                      (3.5)

where θ is the number closest to one that satisfies the condition

    δ^T η ≥ 0.2 δ^T B δ.                                        (3.6)

Specifically, we use the value

    θ = 1,                                   δ^T γ ≥ 0.2 δ^T B δ,
    θ = 0.8 δ^T B δ / (δ^T B δ - δ^T γ),     δ^T γ < 0.2 δ^T B δ    (3.7)

(Powell, 1976).
We may use η in place of γ in any of the updating formulae of unconstrained optimization. We prefer the
BFGS formula

    B* = B - (B δ δ^T B) / (δ^T B δ) + (η η^T) / (δ^T η)        (3.8)
because positive
definiteness of B and condition (3.6) imply that B* is positive definite and because
it gives invariance under changes of scale of the variables.
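Formulae (3.3)-(3.8) are what is now usually called a damped BFGS update, and the whole revision fits in a few lines of NumPy. The sketch below is a minimal illustration with our own variable names:

    import numpy as np

    def damped_bfgs_update(B, delta, gamma):
        # Replace gamma by eta as in (3.5)-(3.7), so that
        # delta^T eta >= 0.2 delta^T B delta, then apply (3.8).
        Bd = B @ delta
        dBd = delta @ Bd              # positive when B is positive definite
        dg = delta @ gamma
        theta = 1.0 if dg >= 0.2 * dBd else 0.8 * dBd / (dBd - dg)  # (3.7)
        eta = theta * gamma + (1.0 - theta) * Bd                    # (3.5)
        return (B - np.outer(Bd, Bd) / dBd
                  + np.outer(eta, eta) / (delta @ eta))             # (3.8)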
This way of revising B is a development of the proposal in Powell (1976), which derives a similar condition from the Lagrangian function, but which can be unsatisfactory when ||δ|| is small. It owes much to Han's (1976) work, and it may be regarded as a scale-invariant implementation of his method.
The analysis of the quadratic programming calculation in Powell (1977) supports the modification (3.5)-(3.7). It suggests that what matters is not the closeness of B to the second derivative matrix of the Lagrangian function, G say, but the closeness of the projections of B and G, the projection being into the space that is the intersection of the hyperplanes that are tangent to the active constraints. In this case the restriction to positive definite matrices B is mild, and the extra work of the modification is worthwhile.

4. The line search

The choice of the step-length α in equation (2.2) is extremely important because it is the means of forcing convergence from poor starting approximations. However, the
choice of step-length is complicated by the fact that, not only do we wish to reduce
the objective function, but also we wish to satisfy the constraints. Therefore Han (1975) forces convergence by requiring every iteration to reduce a function of the form
    W(x) = F(x) + P[c(x)],                                      (4.1)

where the term P[c(x)] is zero if the constraints (1.1) are satisfied at x and is positive otherwise. Because algorithms for minimizing differentiable functions cannot be applied directly to the function (4.1), which usually is not differentiable at the solution of the constrained problem, the line search requires a special method. Therefore we let the step-length give a reduction in the function
    Ψ(x, μ) = F(x) + Σ_{i=1}^{m'} μ_i |c_i(x)|
                   + Σ_{i=m'+1}^{m} μ_i |min[0, c_i(x)]|,       (4.2)

where the components of μ are positive weights, and we require the condition

    Ψ(x*, μ) < Ψ(x, μ)                                          (4.3)

to hold on every iteration. Thus the line search works with the function of one variable

    Φ(α) = Ψ(x + α d, μ),                                       (4.4)

which reduces to expression (2.3) when there are no constraints, and which decreases initially
when α is made positive, provided that the weights satisfy the condition

    μ_i ≥ |λ_i|,      i = 1, 2, ..., m.                         (4.5)
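For illustration, the function (4.2) is easy to evaluate; in this Python sketch (the names are ours) F_val holds F(x), c the constraint values and mu the weights:

    def merit(F_val, c, mu, m_eq):
        # Psi(x, mu) of (4.2): F(x) plus weighted moduli of the equality
        # violations and of the negative parts of the inequalities.
        val = F_val
        for i in range(len(c)):
            if i < m_eq:
                val += mu[i] * abs(c[i])
            else:
                val += mu[i] * abs(min(0.0, c[i]))
        return val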
Satisfying condition (4.5) on every iteration may be inefficient, because it can happen that on
most iterations the moduli |λ_i| are much smaller than weights that were needed earlier in the calculation. Therefore on the first iteration we let μ_i be |λ_i| (i = 1, 2, ..., m), and on each later iteration we apply the formula

    μ_i = max { |λ_i|, ½ (μ̄_i + |λ_i|) },    i = 1, 2, ..., m,    (4.6)

where μ̄_i is the value of μ_i on the previous iteration.
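In code, formula (4.6) is a one-line update; in this sketch mu_prev holds the previous weights and lam the multipliers of the current quadratic programming calculation:

    def update_mu(mu_prev, lam):
        # (4.6): each weight satisfies (4.5) but may decrease gradually.
        return [max(abs(l), 0.5 * (m + abs(l))) for m, l in zip(mu_prev, lam)]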
Because the weights μ_i (i = 1, 2, ..., m) may change from iteration to iteration, convergence theorems like Han's for a fixed line search function do not apply directly to the given method. Therefore an extra check on the progress of the iterations is made, in the spirit of methods for unconstrained optimization. For each i (i = 1, 2, ..., m) we let μ̂_i be the greatest weight that has occurred so far, and we record the least value of Ψ(x, μ̂) that occurs during each
sequence of iterations on which μ̂ remains constant. An error return is made if
there is a run of five iterations where μ̂ remains constant and the minimum value
of Ψ(x, μ̂) fails to decrease, for in this case the iterations do not converge satisfactorily.
The procedure for choosing α is as follows. It depends on a sequence of trial values and on the derivative estimate Φ'(0), where Φ(α) is the function (4.4). All the trial step-lengths are positive.
The first term in the sequence is α_0 = 1 and, for k ≥ 1, the value of α_k depends on the quadratic approximation Φ_k(α) that is defined by the equations

    Φ_k(0) = Φ(0),    Φ_k'(0) = Φ'(0),    Φ_k(α_{k-1}) = Φ(α_{k-1}).    (4.7)

We let α_k be the greater of 0.1 α_{k-1} and the value of α that minimizes Φ_k(α), and the first trial value that satisfies the condition

    Φ(α_k) ≤ Φ(0) + 0.1 α_k Φ'(0)                               (4.8)

is accepted as the step-length. Conditions like (4.8) are usual in algorithms for unconstrained optimization.
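A sketch of this step-length loop in Python might read as follows; the safeguard factor 0.1 on successive trial values mirrors the text above, and everything else is our own notation:

    def step_length(phi, dphi0, max_trials=10):
        # phi(alpha) evaluates the function (4.4) and dphi0 estimates its
        # initial derivative; dphi0 should be negative along d.
        phi0 = phi(0.0)
        alpha = 1.0                              # alpha_0 = 1
        for _ in range(max_trials):
            val = phi(alpha)
            if val <= phi0 + 0.1 * alpha * dphi0:    # acceptance test (4.8)
                return alpha
            # Minimizer of the quadratic (4.7) through phi(0), dphi0 and
            # phi(alpha); curv > 0 whenever the test above fails.
            curv = val - phi0 - dphi0 * alpha
            alpha = max(0.1 * alpha, -0.5 * dphi0 * alpha ** 2 / curv)
        return alpha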
However, the moduli in the line search function cause the gradient of Φ(α) to have discontinuities at the values of α where any of the constraint functions c_i(x + α d) (i = 1, 2, ..., m) change sign. Therefore we define the number that takes the place of Φ'(0) differently if a discontinuity is expected to occur in Φ(α) on the interval [0, 1], particularly when the discontinuity occurs near α = 0, because then the true derivative would give misleading information about the difference [Φ(1) - Φ(0)]. We relate the estimate of Φ'(0) to this difference instead. This device helps the full step to satisfy the acceptance condition.
The numerical results of the next section show that on nearly every iteration
the step-length α = 1 is chosen.
5. Numerical Results
The given algorithm has been applied to several test problems, including some
whose constraints define surfaces that are highly nonlinear. We report results for Colville's (1968) first three problems, for the Post Office Parcel problem (Rosenbrock, 1960) and for the problem of Powell (1969). In the first of Colville's problems five constraints
are active at the solution, which are identified on every iteration by the quadratic
programming calculation that defines d. Throughout the second problem the calculation keeps the correct active set; what is surprising in this case is that the final convergence of the iterations is rather
slow.
Colville's third problem is started at the point that he recommends, and some of the constraints are active
at the solution. Our algorithm identifies that these constraints are active successfully on the
second iteration. There are four, they are all linear, while the number of variables
is only five.
In Colville's problems, therefore, good accuracy is obtained once the active constraints are known. Because of symmetry the Post Office Parcel problem is equivalent to a calculation with equality constraints and few degrees of freedom. Similarly, in this calculation the active constraints are found on an early iteration when the standard starting point (10, 10, 10) is used, and the final convergence is rapid.
The nonlinearity of the constraints is greatest in the problem of Powell (1969), so this example is a searching test of the treatment of constraint curvature in the calculation. Hence it also tries the adjustment of the weights μ_i during the calculation, which is analogous to choosing between small and large values of the terms of expression (4.2); it is a feature of the given method that we make this choice automatically. The amount of work that each algorithm requires is shown in the table near the end of the paper.
The given figures are the number of function and gradient evaluations
that are required to solve each constrained minimization problem, except that the
figures in brackets are the number of iterations. The starting points of the Post Office Parcel problem and of Powell's problem are (10, 10, 10) and (-2, 2, 2, -1, -1) respectively; the other starting points have been mentioned already. The first three columns of figures are taken from the published work of Colville (1968), Biggs (1972) and Fletcher (1975).
In the case of our algorithm we suppose that the solution is found when all
the constraints are satisfied closely and further iterations give no more improvement. The figures in the Colville column are the most favourable of the ones that he reports,
even though the three given numbers are obtained by three different algorithms.
The results due to Biggs (1972) were calculated by his REQP Fortran program
which is similar to the method that we prefer. The main differences are the
way he approaches constraint boundaries and his use of the objective function F(x)
instead of the Lagrangian function (3.2) in order to revise the matrix that we
call B.
He now uses the Lagrangian function to revise B (Biggs, 1975), but this
change influences the numerical results only when some of the active constraints
are nonlinear.
Fletcher's (1975) results were obtained by several versions of his ideal penalty function method.
He gives figures for each version and, as in the case of the Colville results, our
table shows the figures that are most favourable to Fletcher's work on each test
problem.
We do not claim that the table proves that variable metric methods for constrained optimization
are superior to the augmented Lagrangian method, because comparisons with most of the early algorithms
are rather inconsistent, both in the amount of work and in the final accuracy.
We do, however, claim that the table shows that the algorithm described in
this paper is a very good method of solving constrained optimization calculations
with nonlinear constraints.
It is usually
unnecessary to use the form (2.7) of the linear approximations to the constraints
instead of the form (2.6). A Fortran program that implements the given algorithm, using the quadratic programming subroutine that is described in Fletcher (1970), was applied to the test problems. Such a program
is usually satisfactory for small calculations and for calculations where most of
the computer time is spent on function and gradient evaluations. In other cases,
however, the matrix calculations of the algorithm may dominate the running time of
the program.
TABLE
Comparison of Algorithms

PROBLEM       COLVILLE    BIGGS    FLETCHER     PRESENT

COLVILLE 1        13                 39  (4)         (4)
COLVILLE 2       112        47      149  (3)     17 (16)
COLVILLE 3        23        10       64  (5)      3  (2)
POP               --        11       30  (4)      7  (5)
POWELL            --                 37  (5)      7  (6)
References
Biggs, M.C. (1972) "Constrained minimization using recursive equality quadratic
programming" in Numerical methods for nonlinear optimization, ed. F.A. Lootsma,
Academic Press (London).
Biggs, M.C. (1975) "Constrained minimization using recursive quadratic programming:
some alternative subproblem formulations" in Towards global optimization,
eds. L.C.W. Dixon and G.P. Szegö, North-Holland Publishing Co. (Amsterdam).
Colville, A.R. (1968) "A comparative study on nonlinear programming codes",
Report No. 320-2949 (IBM New York Scientific Center).
Dennis, J.E. and Moré, J. (1977) "Quasi-Newton methods, motivation and theory",
SIAM Review, Vol. 19, pp. 46-89.
Fletcher, R. (1970) "A Fortran subroutine for quadratic programming", Report
No. R6370 (A.E.R.E., Harwell).
Fletcher, R. (1975) "An ideal penalty function for constrained optimization",
J. Inst. Maths. Applics., Vol. 15, pp. 319-342.
Han, S-P. (1975) "A globally convergent method for nonlinear programming",
Report No. 75-257 (Dept. of Computer Science, Cornell University).
Han, S-P. (1976) "Superlinearly convergent variable metric algorithms for general
nonlinear programming problems", Mathematical Programming, Vol. 11, pp. 263-282.
Powell, M.J.D. (1969) "A method for nonlinear constraints in minimization
problems" in Optimization, ed. R. Fletcher, Academic Press (London).
Powell, M.J.D. (1976) "Algorithms for nonlinear constraints that use Lagrangian
functions", presented at the Ninth International Symposium on Mathematical
Programming, Budapest.
Powell, M.J.D. (1977) "The convergence of variable metric methods for nonlinearly
constrained optimization calculations", presented at Nonlinear Programming
Symposium 3, Madison, Wisconsin.
Rosenbrock, H.H. (1960) "An automatic method for finding the greatest or the
least value of a function", Computer Journal, Vol. 3, pp. 175-184.
Sargent, R.W.H. (1974) "Reduced-gradient and projection methods for nonlinear
programming" in Numerical methods for constrained optimization, eds. P.E. Gill
and W. Murray, Academic Press (London).