
Static optimization: unconstrained problems
Graduate course on Optimal and Robust Control (spring 12)

Zdeněk Hurák
Department of Control Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague

February 19, 2013

Lecture outline

  Derivative-free optimization
    Nelder-Mead simplex method
  Derivative-based optimization
    Line search methods
      Methods for line search (step length)
      Methods for descent direction search
    Trust region methods

Numerical algorithms for unconstrained optimization

  The key classification:
    Methods based on derivatives
    Derivative-free methods (Nelder-Mead)

Derivative-free methods: Nelder-Mead simplex method

  Not to be confused with the simplex method in linear programming!
  [Figure: a simplex in the plane, i.e. a triangle with vertices 1, 2, 3.]
  fminsearch() in Matlab
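A minimal illustration of derivative-free minimization in Python (not part of the slides): scipy.optimize.minimize with method="Nelder-Mead" plays the role of Matlab's fminsearch(); the Rosenbrock test function is only an example.

  import numpy as np
  from scipy.optimize import minimize

  # Rosenbrock test function (an arbitrary smooth, non-quadratic example)
  f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

  res = minimize(f, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
  print(res.x, res.fun, res.nfev)   # minimizer, minimum value, number of function evaluations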

Derivative-based methods

  Line search methods
  Trust region methods

Line search methods

    x_{k+1} = x_k + α_k d_k

  1. descent direction search ... d_k
  2. line search (step length determination) ... α_k
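The two-step structure above can be written as a generic loop; a sketch (not from the slides) in which direction and step_length stand for any of the methods discussed on the following slides:

  import numpy as np

  def line_search_minimize(f, grad, x0, direction, step_length, tol=1e-8, max_iter=200):
      """Generic line-search loop: x_{k+1} = x_k + alpha_k * d_k."""
      x = np.asarray(x0, dtype=float)
      for _ in range(max_iter):
          g = grad(x)
          if np.linalg.norm(g) < tol:          # terminal condition on the gradient
              break
          d = direction(x, g)                  # e.g. -g for steepest descent
          alpha = step_length(f, grad, x, d)   # e.g. backtracking, exact step, Newton
          x = x + alpha * d
      return x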

Methods for line search

  1. Fibonacci, golden section
  2. Bisection
  3. Newton
  4. Inexact line search

Fibonacci search

  Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, ...
  Fix the number of intervals at the beginning. Say, 13.
  [Figure: f(x) on [a, b] with the interval divided into Fibonacci proportions 1, 2, 3, 5, 8, 13.]
  Start by evaluating f(x) at x = 5 and x = 8; 4 further evaluations are then needed.
  In general, n − 2 steps reduce the uncertainty to (b − a)/F_n.
  Improvement in the uncertainty per step: F_{n−1}/F_n.

    lim_{n→∞} F_{n−1}/F_n = 2/(1 + √5) ≈ 0.618
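A sketch of Fibonacci search (not from the slides); all trial points stay on a grid of F_n cells, so only integer bookkeeping is needed, and f is assumed unimodal on [a, b]:

  def fibonacci_search(f, a, b, n=7):
      """Fibonacci search for a unimodal f on [a, b]; grid of F_n cells (F_7 = 13 as on the slide)."""
      F = [0, 1, 1]                       # F[1] = F[2] = 1, F[3] = 2, ...
      while len(F) <= n:
          F.append(F[-1] + F[-2])
      h = (b - a) / F[n]                  # width of one grid cell
      lo, hi = 0, F[n]                    # current bracket, in grid units
      i1, i2 = F[n - 2], F[n - 1]         # e.g. 5 and 8 when F_n = 13
      f1, f2 = f(a + i1 * h), f(a + i2 * h)
      for _ in range(n - 3):              # shrink 13 -> 8 -> 5 -> 3 -> 2 cells
          if f1 <= f2:                    # minimizer bracketed in [lo, i2]
              hi, i2, f2 = i2, i1, f1
              i1 = lo + hi - i2           # mirror the surviving interior point
              f1 = f(a + i1 * h)
          else:                           # minimizer bracketed in [i1, hi]
              lo, i1, f1 = i1, i2, f2
              i2 = lo + hi - i1
              f2 = f(a + i2 * h)
      return a + 0.5 * (lo + hi) * h      # centre of the final two-cell bracket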

Golden section search

    d_{k+1}/d_k ≈ 0.618

  [Figure: f(x) on [a, b] with interior points x_1 and x_2.]

Speed of convergence: order of convergence

  Order p of convergence of the sequence {r_k} → r*:

    0 ≤ lim sup_{k→∞} |r_{k+1} − r*| / |r_k − r*|^p < ∞

  Examples:
    r_k = a^k,      0 < a < 1   (order 1)
    r_k = a^(2^k),  0 < a < 1   (order 2)
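Golden-section search keeps the reduction ratio 0.618 at every step and reuses one interior point; a minimal sketch (not from the slides), again assuming f unimodal on [a, b]:

  import math

  def golden_section_search(f, a, b, tol=1e-6):
      """Golden-section search: each step shrinks [a, b] by the factor (sqrt(5)-1)/2."""
      r = (math.sqrt(5) - 1) / 2          # ≈ 0.618
      x1, x2 = b - r * (b - a), a + r * (b - a)
      f1, f2 = f(x1), f(x2)
      while b - a > tol:
          if f1 <= f2:                    # minimizer in [a, x2]
              b, x2, f2 = x2, x1, f1
              x1 = b - r * (b - a)
              f1 = f(x1)
          else:                           # minimizer in [x1, b]
              a, x1, f1 = x1, x2, f2
              x2 = a + r * (b - a)
              f2 = f(x2)
      return 0.5 * (a + b)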

Linear convergence

    lim_{k→∞} |r_{k+1} − r*| / |r_k − r*| = β < 1

  Geometric series: r_k = c β^k
  Comparison of two linearly converging algorithms is based on their convergence ratios β.
  For β = 0: superlinear convergence. For β = 1: sublinear convergence. Ex.: r_k = 1/k.

Bisection method

  [Figure: f(x) on [a, b] with the new point x_1 at the midpoint of the interval.]
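The bisection slide is purely graphical; one common reading for line search (an assumption about the figure) is to bisect on the sign of the derivative φ'(α) of the one-dimensional function φ(α) = f(x + α d):

  def bisection_line_search(dphi, lo, hi, tol=1e-8):
      """Bisection on dphi(alpha); assumes dphi(lo) < 0 < dphi(hi), i.e. a minimizer is bracketed."""
      while hi - lo > tol:
          mid = 0.5 * (lo + hi)
          if dphi(mid) > 0:     # phi still increases at mid, so the minimizer lies to the left
              hi = mid
          else:
              lo = mid
      return 0.5 * (lo + hi)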

Newton's method for line search

  Approximate the function by a parabola (use f(x_k), f'(x_k) and f''(x_k)):

    q(x) = f(x_k) + f'(x_k)(x − x_k) + (1/2) f''(x_k)(x − x_k)^2

  Find the minimum of the approximating function; this can be done analytically:

    0 = q'(x) = f'(x_k) + f''(x_k)(x − x_k)

    x_{k+1} = x_k − f'(x_k) / f''(x_k)

Newton's method for line search

  [Figure: f(x) together with its quadratic model f(x_0) + f'(x_0)(x − x_0) + (1/2) f''(x_0)(x − x_0)^2 around x_0.]
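The iteration derived above in code form (a sketch, not from the slides); fprime and fsecond are assumed to be callables returning f' and f'':

  def newton_1d(fprime, fsecond, x0, tol=1e-10, max_iter=50):
      """Newton's method for 1-D minimization: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
      x = x0
      for _ in range(max_iter):
          step = fprime(x) / fsecond(x)
          x = x - step
          if abs(step) < tol:
              break
      return x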

Another look at Newton's method: equation solving

  Solving g(x) = 0
  [Figure: g(x) with the tangent at x_k; its zero crossing gives x_{k+1}, the root is x*.]

    x_{k+1} = x_k − g(x_k) / g'(x_k)

Quadratic convergence of Newton's method

  Let's stay with the equation-solving formulation.

    x_{k+1} − x* = x_k − x* − g(x_k)/g'(x_k)
                 = [g'(x_k)(x_k − x*) − g(x_k)] / g'(x_k)

  Taylor expansion around x_k, using g(x*) = 0:

    0 = g(x*) = g(x_k) + g'(x_k)(x* − x_k) + (1/2) g''(ξ)(x* − x_k)^2

  hence

    x_{k+1} − x* = (1/2) [g''(ξ)/g'(x_k)] (x_k − x*)^2

    |x_{k+1} − x*| ≤ (k_1 / (2 k_2)) |x_k − x*|^2

  with |g''| ≤ k_1 and |g'(x_k)| ≥ k_2 near x*.
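A tiny numerical illustration of the quadratic rate (the test equation g(x) = x^2 − 2 is just an example, not from the slides); the error roughly squares each step, so the number of correct digits doubles:

  x, root = 1.0, 2 ** 0.5
  for k in range(6):
      x = x - (x**2 - 2) / (2 * x)     # x_{k+1} = x_k - g(x_k)/g'(x_k)
      print(k, abs(x - root))          # errors roughly 1e-1, 2e-3, 2e-6, 2e-12, ...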

Methods for descent direction search

  1. steepest descent
  2. Newton
  3. quasi-Newton
  4. conjugate direction / conjugate gradient method (CG)

Steepest descent

  Condition for a descent direction:

    d_k^T ∇f(x_k) < 0

  Recall the geometric interpretation of an inner product:

    d_k^T ∇f(x_k) = ‖d_k‖ ‖∇f(x_k)‖ cos θ

  The steepest descent:

    x_{k+1} = x_k − α_k ∇f(x_k)

Steepest descent applied to quadratic cost

    f(x) = (1/2) x^T Q x − b^T x

  Find the α that minimizes f(x_k − α ∇f_k):

    f(x_k − α ∇f_k) = (1/2) (x_k − α ∇f_k)^T Q (x_k − α ∇f_k) − b^T (x_k − α ∇f_k)

  Using that the gradient is ∇f(x) = Q x − b, we get upon differentiation with respect to α:

    α_k = (∇f_k^T ∇f_k) / (∇f_k^T Q ∇f_k)

  Hence the steepest descent method is

    x_{k+1} = x_k − [(∇f_k^T ∇f_k) / (∇f_k^T Q ∇f_k)] ∇f_k
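The exact-step steepest descent above as a sketch (not from the slides); the poorly scaled Q in the example is an arbitrary choice that makes the zigzagging on the next slide visible:

  import numpy as np

  def steepest_descent_quadratic(Q, b, x0, tol=1e-10, max_iter=500):
      """Steepest descent with the exact step length for f(x) = 0.5 x'Qx - b'x."""
      x = np.asarray(x0, dtype=float)
      for _ in range(max_iter):
          g = Q @ x - b                     # gradient of the quadratic
          if np.linalg.norm(g) < tol:
              break
          alpha = (g @ g) / (g @ Q @ g)     # exact minimizer along -g
          x = x - alpha * g
      return x

  Q = np.diag([1.0, 100.0])                 # poorly scaled example
  b = np.zeros(2)
  print(steepest_descent_quadratic(Q, b, np.array([100.0, 1.0])))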

Zigzagging of the steepest descent method

  [Figure: contour plot of a poorly scaled quadratic cost (x on the horizontal axis, x2 on the vertical axis) with the zigzagging steepest-descent iterates.]

  Poor convergence rate, depending on the scaling.

Newton's search (also Newton-Raphson)

  Idea: the function to be minimized is approximated locally by a quadratic function and this approximating function is minimized exactly.

    x_{k+1} = x_k − [∇²f(x_k)]^{-1} ∇f(x_k)        (Hessian, gradient)

  Local convergence guaranteed, but not global!

Solving symmetric positive definite linear equations

  Solve A x = b.
  Cholesky factorization A = X X^T (X triangular).
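One Newton step written with a Cholesky solve of the symmetric positive definite system above (a sketch, not from the slides; grad and hess are assumed callables for ∇f and ∇²f):

  import numpy as np
  from scipy.linalg import cho_factor, cho_solve

  def newton_step(grad, hess, x):
      """One Newton step: solve [hess(x)] p = -grad(x) via Cholesky, return x + p."""
      g, H = grad(x), hess(x)
      c, low = cho_factor(H)          # Cholesky factorization (H assumed positive definite)
      p = cho_solve((c, low), -g)     # Newton direction
      return x + p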

Modifications of Newton's search: damping

  A search parameter α_k is introduced:

    x_{k+1} = x_k − α_k [∇²f(x_k)]^{-1} ∇f(x_k)

  Interpretation in the scalar nonlinear equation case:
  [Figure: g(x) with a damped Newton step.]

Modifications of Newton's search: positive definiteness

  Positive definite matrix M_k instead of the Hessian:

    x_{k+1} = x_k − α_k M_k^{-1} ∇f(x_k)

  Another approach: B_k = ∇²f(x_k) + E_k, where E_k = 0 if ∇²f(x_k) is sufficiently positive definite, otherwise E_k is chosen so that B_k > 0.
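One common realization of the E_k idea (a sketch, not from the slides): add a multiple of the identity and increase it until the Cholesky factorization succeeds.

  import numpy as np

  def modified_hessian(H, beta=1e-3, max_tries=60):
      """Return B = H + tau*I with tau >= 0 chosen so that B is positive definite."""
      tau = 0.0 if np.min(np.diag(H)) > 0 else beta - np.min(np.diag(H))
      for _ in range(max_tries):
          try:
              np.linalg.cholesky(H + tau * np.eye(H.shape[0]))   # succeeds iff B > 0
              return H + tau * np.eye(H.shape[0])
          except np.linalg.LinAlgError:
              tau = max(2 * tau, beta)                           # try a larger shift
      raise RuntimeError("could not make the Hessian positive definite")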

Quasi-Newton

  From the definition of the Hessian:

    ∇²f_k (x_{k+1} − x_k) ≈ ∇f_{k+1} − ∇f_k,    s_k = x_{k+1} − x_k,    y_k = ∇f_{k+1} − ∇f_k

  Find a matrix B_{k+1} that mimics the Hessian behaviour above:

    B_{k+1} s_k = y_k

  Typically two requirements:
    symmetry (as the Hessian)
    low-rank update between the steps

Popular updates in quasi-Newton methods

  Symmetric-rank-one (SR1):

    B_{k+1} = B_k + (y_k − B_k s_k)(y_k − B_k s_k)^T / [(y_k − B_k s_k)^T s_k]

  BFGS:

    B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k)

  As the inverse of B_k is what is actually needed, the update can be applied to the inverse directly: DFP (Davidon, Fletcher and Powell).
  Other updates keep the Hessian in factored form H = R R^T (Matlab chol() function ... Cholesky factorization).
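The BFGS formula above transcribed directly (a sketch, not from the slides; in practice one would also check the curvature condition y_k^T s_k > 0 before updating):

  import numpy as np

  def bfgs_update(B, s, y):
      """BFGS update of the Hessian approximation B, given s_k and y_k."""
      Bs = B @ s
      return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)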

Conjugate gradient directions
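The slide's illustration of conjugate directions is not reproduced here; as a reminder, a minimal sketch (not from the slides) of the linear conjugate gradient method for the quadratic cost f(x) = (1/2) x^T Q x − b^T x used earlier:

  import numpy as np

  def conjugate_gradient(Q, b, x0, tol=1e-10):
      """Linear CG for f(x) = 0.5 x'Qx - b'x, i.e. for solving Qx = b."""
      x = np.asarray(x0, dtype=float)
      r = Q @ x - b                              # gradient / residual
      d = -r                                     # first direction: steepest descent
      while np.linalg.norm(r) > tol:
          Qd = Q @ d
          alpha = (r @ r) / (d @ Qd)             # exact step along d
          x = x + alpha * d
          r_new = r + alpha * Qd
          beta = (r_new @ r_new) / (r @ r)
          d = -r_new + beta * d                  # next direction, Q-conjugate to d
          r = r_new
      return x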

Inexact line search

  Armijo
  Goldstein
  Wolfe

Intuitive approach: step size reduction

  Start with an initial step size s, and if the corresponding vector x_k + s d does not yield an improved (smaller) value of f(·), that is, if f(x_k + s d) ≥ f(x_k), reduce the step size, possibly by a fixed factor. Repeat.

However, convergence to a minimum is not guaranteed:

    f(x) = { s(1 − x)²/4 − 2(1 − x)   if x > 1
           { x² − 1                   if −1 ≤ x ≤ 1
           { s(1 + x)²/4 − 2(1 + x)   if x < −1

  [Figure: graph of this piecewise f(x).]

    x_{k+1} = x_k − 1 · f'(x_k)
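The step-size-reduction scheme as a sketch (not from the slides); Armijo's condition on the next slides strengthens the acceptance test so that failures of the kind above are excluded:

  import numpy as np

  def backtracking(f, x, d, s=1.0, shrink=0.5, max_reductions=50):
      """Start from step size s and shrink by a fixed factor until f improves."""
      x, d = np.asarray(x, dtype=float), np.asarray(d, dtype=float)
      fx = f(x)
      alpha = s
      for _ in range(max_reductions):
          if f(x + alpha * d) < fx:    # improvement found, accept this step size
              return alpha
          alpha *= shrink              # reduce the step size by a fixed factor
      return alpha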

Armijo's condition

Goldstein's condition

Wolfe's condition

Terminal conditions
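These four slides are mostly graphical; as a reminder (not taken from the slides), the standard forms of the Armijo and Wolfe conditions with the usual constants 0 < c1 < c2 < 1, plus a typical terminal condition:

  import numpy as np

  def armijo_ok(f, g, x, d, alpha, c1=1e-4):
      """Armijo (sufficient decrease): f(x + a d) <= f(x) + c1 a g(x)'d."""
      return f(x + alpha * d) <= f(x) + c1 * alpha * (g(x) @ d)

  def wolfe_curvature_ok(g, x, d, alpha, c2=0.9):
      """Wolfe curvature condition: g(x + a d)'d >= c2 g(x)'d."""
      return g(x + alpha * d) @ d >= c2 * (g(x) @ d)

  def terminate(g, x, tol=1e-6):
      """A typical terminal condition: the gradient norm is (nearly) zero."""
      return np.linalg.norm(g(x)) < tol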

Trust region methods

  Recall

    f(x_k + p) ≈ f(x_k) + ∇f^T(x_k) p + (1/2) p^T ∇²f(x_k) p

  We seek the minimum of the quadratic model function

    m_k(p) = f(x_k) + ∇f^T(x_k) p + (1/2) p^T B_k p    subject to ‖p‖ ≤ Δ_k.

  For B_k = ∇²f(x_k): trust-region Newton method.

Trust region methods

  [Figure: contours of f(x) and of the model m_k(p), the trust region ‖p‖ ≤ Δ_k, the line search direction and the trust region step.]
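The slides do not spell out how the constrained model minimization is carried out; one simple possibility (an illustration, not necessarily the method used in the lecture) is the Cauchy point, the minimizer of m_k along −∇f_k within the trust region:

  import numpy as np

  def cauchy_point(g, B, delta):
      """Cauchy point: minimize m_k along -g subject to ||p|| <= delta."""
      gBg = g @ B @ g
      if gBg <= 0:                          # model not convex along -g: go to the boundary
          tau = 1.0
      else:
          tau = min(1.0, np.linalg.norm(g) ** 3 / (delta * gBg))
      return -tau * (delta / np.linalg.norm(g)) * g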

Software

  1. Optimization Toolbox for Matlab: fminunc() (trust-region Newton), fminsearch() (Nelder-Mead simplex)
  2. UnconstrainedProblems package for Mathematica: FindMinimumPlot, FindMinimum

Summary

  Line search methods: direction search, step length determination, Newton methods, quasi-Newton.
  Trust region methods.
