
OPTIMIZATION

Ingredients
• Objective function
• Variables
• Constraints

Find values of the variables that minimize or maximize the objective function while satisfying the constraints.
Different Kinds of Optimization

Different Optimization Techniques
• Algorithms have very different flavors depending on the specific problem
• Closed form vs. numerical vs. discrete
• Local vs. global minima
• Running times ranging from O(1) to NP-hard

Optimization problem

min f(x)
 x

subject to:  g(x) ≤ 0
             h(x) = 0
             x ∈ X ⊆ ℝⁿ
             x_l ≤ x ≤ x_u
Summary

Defining an optimization problem:

1. Choose design variables and their bounds
2. Formulate the objective (what is “best”?)
3. Formulate the constraints (what is restricted?)
4. Choose a suitable optimization algorithm
Standard forms
• Several standard forms exist:
  • Negative null form:  min f(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X ⊆ ℝⁿ, x_l ≤ x ≤ x_u
  • Positive null form:  g(x) ≥ 0, h(x) = 0
  • Negative unity form: g(x) ≤ 1, h(x) = 1
  • Positive unity form: g(x) ≥ 1, h(x) = 1
Multi-objective problems

• Minimize c(x) (a vector!)
  s.t. g(x) ≤ 0, h(x) = 0

• Input from the designer is required! A popular approach: replace the vector objective by a weighted sum:

  f(x) = Σᵢ wᵢ cᵢ(x)

● The optimum clearly depends on the choice of weights
● Pareto optimal point: “no other feasible point exists that has a smaller cᵢ without having a larger cⱼ”
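The weighted-sum idea can be sketched as follows; this is a minimal illustration (the objectives `c1`, `c2` and the brute-force search are my own example, not from the slides):

```python
def weighted_sum_optimum(weights, objectives, candidates):
    """Scalarize multiple objectives as f(x) = sum_i w_i * c_i(x) and
    minimize by brute-force search over the candidate points."""
    def f(x):
        return sum(w * c(x) for w, c in zip(weights, objectives))
    return min(candidates, key=f)

# Two competing objectives: c1 prefers x = 1, c2 prefers x = -1.
c1 = lambda x: (x - 1.0) ** 2
c2 = lambda x: (x + 1.0) ** 2
xs = [i / 100.0 for i in range(-200, 201)]

x_equal = weighted_sum_optimum([1.0, 1.0], [c1, c2], xs)  # compromise near 0
x_skew = weighted_sum_optimum([9.0, 1.0], [c1, c2], xs)   # pulled toward c1's optimum
```

Different weight choices trace out different Pareto optimal points: equal weights land midway between the two individual optima, while weighting c₁ heavily pulls the optimum toward x = 1.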
Optimization in 1-D
• Look for analogies to bracketing in root-finding

[Figure: points (xleft, f(xleft)), (xmid, f(xmid)), (xright, f(xright))]

Bracket conditions:
  xleft < xmid < xright
  f(xmid) < f(xleft)
  f(xmid) < f(xright)
Optimization in 1-D

• Once we have these properties, there is at least one local minimum between xleft and xright
• Establishing the bracket initially:
  • Given xinitial and an increment
  • Evaluate f(xinitial), f(xinitial + increment)
  • If decreasing, step until an increase is found
  • Else, step in the opposite direction until an increase is found
  • Grow the increment at each step
• For maximization: substitute −f for f
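The bracketing procedure above can be sketched as follows; the function names and defaults are my own choices, not from the slides:

```python
def bracket_minimum(f, x0, h=0.1, grow=2.0, max_steps=60):
    """Walk from x0 with a growing increment until f increases, returning
    (a, b, c) with a < b < c and f(b) < f(a), f(b) < f(c)."""
    a, b = x0, x0 + h
    fa, fb = f(a), f(b)
    if fb > fa:                 # not decreasing: step the other way instead
        a, b = b, a
        fa, fb = fb, fa
        h = -h
    for _ in range(max_steps):
        h *= grow               # grow the increment at each step
        c = b + h
        fc = f(c)
        if fc > fb:             # increase found: (a, b, c) brackets a minimum
            return (a, b, c) if a < c else (c, b, a)
        a, b, fa, fb = b, c, fb, fc
    raise RuntimeError("no bracket found")
```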


Optimization in 1-D
• Strategy: evaluate the function at some xnew

[Figure: bracket points (xleft, f(xleft)), (xmid, f(xmid)), (xright, f(xright)) and the new point (xnew, f(xnew))]
Optimization in 1-D
• Strategy: evaluate the function at some xnew
• Here, the new “bracket” points are xnew, xmid, xright
Optimization in 1-D
• Strategy: evaluate the function at some xnew
• Here, the new “bracket” points are xleft, xnew, xmid
Optimization in 1-D
• Unlike with root-finding, we can’t always guarantee that the interval will be reduced by a factor of 2
• Let’s find the optimal place for xmid, relative to left and right, that will guarantee the same factor of reduction regardless of outcome
Optimization in 1-D

With the bracket scaled to unit length and the interior points placed a fraction τ from each end:

  if f(xnew) < f(xmid):  new interval = τ
  else:                  new interval = 1 − τ²
Golden Section Search
• To ensure the same reduction in either case, want τ = 1 − τ²
• So

  τ = (√5 − 1) / 2

• This is the “golden ratio” ≈ 0.618…
• So the interval shrinks to ≈ 0.618 of its length (a reduction of ≈ 38%) per iteration
• Linear convergence
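A minimal golden-section search, written to reuse one interior evaluation per iteration (names and tolerances are my own choices):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Golden-section search for a minimum of a unimodal f on [a, b].
    Each iteration shrinks the bracket by the constant factor
    tau = (sqrt(5) - 1) / 2, reusing one interior evaluation."""
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = b - tau * (b - a)          # interior points, symmetric in the bracket
    x2 = a + tau * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                 # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - tau * (b - a)
            f1 = f(x1)
        else:                       # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + tau * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)

x_min = golden_section(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```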
Error Tolerance
• Around the minimum the derivative is 0, so

  f(x + Δx) ≈ f(x) + ½ f″(x) Δx² + …
  f(x + Δx) − f(x) ≈ ½ f″(x) Δx² < ε_machine  ⇒  Δx ~ √ε_machine

• Rule of thumb: pointless to ask for more accuracy than √ε
• Can use double precision if you want a single-precision result (and/or have single-precision data)
Faster 1-D Optimization
• Trade off super-linear convergence for worse robustness
  • Combine with golden section search for safety
• Usual bag of tricks:
  • Fit a parabola through 3 points, find its minimum
  • Compute derivatives as well as positions, fit a cubic
  • Use second derivatives: Newton’s method
Newton’s Method
• At each step:

  x_{k+1} = x_k − f′(x_k) / f″(x_k)

• Requires 1st and 2nd derivatives
• Quadratic convergence
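The Newton update above, as a minimal sketch (the test function and its derivatives are my own example):

```python
def newton_minimize(f1, f2, x0, tol=1e-10, max_iter=50):
    """Newton's method for 1-D minimization: x_{k+1} = x_k - f'(x_k)/f''(x_k).
    f1 and f2 are the first and second derivatives of the objective."""
    x = x0
    for _ in range(max_iter):
        step = f1(x) / f2(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Minimize f(x) = x^4 - 3x^2 + 2  (f' = 4x^3 - 6x, f'' = 12x^2 - 6);
# starting near x = 1 converges to the local minimum at sqrt(3/2).
x_star = newton_minimize(lambda x: 4 * x**3 - 6 * x,
                         lambda x: 12 * x**2 - 6, 1.0)
```

Note the fragility the slides warn about: starting near x = 0, where f″ < 0, the iteration would head toward the local maximum instead.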
Multi-Dimensional Optimization

• Important in many areas
  • Fitting a model to measured data
  • Finding the best design in some parameter space
• Hard in general
  • Weird shapes: multiple extrema, saddles, curved or elongated valleys, etc.
  • Can’t bracket
• In general, easier than root-finding
  • Can always walk “downhill”
Newton’s Method in Multiple Dimensions
• Replace the 1st derivative with the gradient, the 2nd derivative with the Hessian

  For f(x, y):

  ∇f = [ ∂f/∂x ]
       [ ∂f/∂y ]

  H = [ ∂²f/∂x²    ∂²f/∂x∂y ]
      [ ∂²f/∂x∂y   ∂²f/∂y²  ]
Newton’s Method in Multiple Dimensions
• Replace the 1st derivative with the gradient, the 2nd derivative with the Hessian
• So,

  x_{k+1} = x_k − H(x_k)⁻¹ ∇f(x_k)

• Tends to be extremely fragile unless the function is very smooth and the starting point is close to the minimum
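A 2-D sketch of the multidimensional Newton update, inverting the 2×2 Hessian in closed form (the quadratic test function is my own example):

```python
def newton_2d(grad, hess, x0, iters=20):
    """Newton's method in 2-D: x_{k+1} = x_k - H(x_k)^{-1} grad f(x_k)."""
    x, y = x0
    for _ in range(iters):
        gx, gy = grad(x, y)
        (a, b), (c, d) = hess(x, y)
        det = a * d - b * c
        # Solve H * step = grad via the closed-form 2x2 inverse.
        sx = ( d * gx - b * gy) / det
        sy = (-c * gx + a * gy) / det
        x, y = x - sx, y - sy
    return x, y

# Quadratic bowl f = (x-1)^2 + 2(y+2)^2: one Newton step reaches the minimum.
xm, ym = newton_2d(lambda x, y: (2 * (x - 1), 4 * (y + 2)),
                   lambda x, y: ((2, 0), (0, 4)),
                   (5.0, 5.0))
```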
Optimality Conditions
• Unconstrained optimization is a multivariate calculus problem. For y = f(X), an optimum occurs at a point where f′(X) = 0 and f″(X) meets second-order conditions
• A relative minimum occurs where f′(X) = 0 and f″(X) > 0
• A relative maximum occurs where f′(X) = 0 and f″(X) < 0
Concavity and Second Derivative

[Figure: a curve whose concavity alternates (f″(x) < 0, f″(x) > 0, f″(x) < 0, f″(x) > 0), marking two local maxima (one also the global max) and two local minima (one also the global min)]
Multivariate Case
• To find an optimum point, set the first partial derivatives (all of them) to zero.
• At the optimum point, evaluate the matrix of second partial derivatives (the Hessian matrix) to see whether it is positive definite (minimum) or negative definite (maximum).
• Check the characteristic roots, or apply the determinantal test to the principal minors.
Determinantal Test for a Maximum – Negative Definite Hessian

The leading principal minors must alternate in sign:

  | f11 | < 0

  | f11  f12 |
  | f21  f22 | > 0

  | f11  f12  f13 |
  | f21  f22  f23 | < 0
  | f31  f32  f33 |

These would all be positive for a minimum (matrix positive definite).
Global Optimum
A univariate function with a negative second derivative everywhere
guarantees a global maximum at the point (if there is one) where f′(X) = 0.
These functions are called “concave down,” or sometimes just “concave.”

A univariate function with a positive second derivative everywhere
guarantees a global minimum (if there is one) at the point where
f′(X) = 0. These functions are called “concave up,” or sometimes “convex.”
Multivariate Global Optimum
If the Hessian matrix is positive
definite (or negative definite) for all
values of the variables, then any
optimum point found will be a global
minimum (maximum).
Constrained Optimization
• Equality constraints – often solvable by calculus
• Inequality constraints – sometimes solvable by numerical methods
Equality Constraints

Maximize f(X)
s.t. gᵢ(X) = bᵢ

Set up the Lagrangian function:

L(X, λ) = f(X) − Σᵢ λᵢ (gᵢ(X) − bᵢ)

Constrained Optimization
• Equality constraints: optimize f(x) subject to gᵢ(x) = 0
• Method of Lagrange multipliers: convert to a higher-dimensional problem
• Minimize

  f(x) + Σᵢ λᵢ gᵢ(x)   w.r.t. (x₁ … xₙ; λ₁ … λₖ)
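A brief worked illustration of the multiplier method (this example is my own, not from the slides): maximize f = xy subject to x + y = 10.

```latex
\begin{aligned}
L(x, y, \lambda) &= xy - \lambda\,(x + y - 10) \\
\frac{\partial L}{\partial x} &= y - \lambda = 0, \qquad
\frac{\partial L}{\partial y} = x - \lambda = 0, \qquad
\frac{\partial L}{\partial \lambda} = -(x + y - 10) = 0 \\
&\Rightarrow\; x = y = \lambda = 5, \qquad f^{*} = 25 .
\end{aligned}
```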
Constrained Optimization
• Inequality constraints are harder…
• If the objective function and constraints are all linear, this is “linear programming”
• Observation: the minimum must lie at a corner of the region formed by the constraints
• Simplex method: move from vertex to vertex, minimizing the objective function
Constrained Optimization
• General “nonlinear programming” is hard
• Algorithms exist for special cases (e.g. quadratic programming)
Global Optimization

• In general, can’t guarantee that you’ve found the global (rather than a local) minimum
• Some heuristics:
  • Multi-start: try local optimization from several starting positions
  • Very slow: simulated annealing
  • Use analytical methods (or graphing) to determine behavior, and guide methods to the correct neighborhoods
Direct Search Methods for Nonlinear Optimization
• Cyclic Coordinate Search
• Simulated Annealing
• Genetic Algorithm
Integer and Discrete Programming
• Zero-One Programming
• Branch and Bound Algorithm for Mixed Integers
• Farkas’ Method for Discrete Nonlinear Monotone Structural Problems
• Genetic Algorithm for Discrete Programming
Direct Search Methods for Nonlinear Optimization
• These methods are referred to in the literature as zero-order methods, or minimization methods without derivatives.
Coordinate Descent Methods

• Basically, each coordinate axis is searched in turn, and a descent step is made only along a unit vector.

• The Cyclic Coordinate Descent method minimizes a function f(x₁, x₂, …, xₙ) cyclically with respect to the coordinate variables. That is, first x₁ is searched, then x₂, etc.

• Various variations are possible.
  • Can you think of some?

• Plus: generally attractive because of their easy implementation.

• Minus: their convergence properties are generally poorer than steepest descent.
Hooke and Jeeves Pattern Search
1. Define:
   • Starting point x0
   • Increment Δᵢ for all variables (i = 1, …, n)
   • Step reduction factor α
   • Termination parameter ε
2. Perform exploratory search
3. If the exploratory move is successful, go to 5. If not, continue
4. Check for termination: is ||Δᵢ|| < ε?
   • Yes: stop; the current point is x*
   • No: reduce the increments, Δᵢ = Δᵢ / α for i = 1, …, n. Go to 2
5. Perform pattern move: xpₖ₊₁ = xₖ + (xₖ − xₖ₋₁)
6. Perform an exploratory move with xp as the base point. Let the result be xₖ₊₁
7. Is f(xₖ₊₁) < f(xₖ)?
   • Yes: set xₖ₋₁ = xₖ and xₖ = xₖ₊₁. Go to 5
   • No: go to 4
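The pattern-search steps above can be sketched in code. This is a simplified version (exploratory moves probe ±step along each axis; function names and defaults are my own choices, not from the slides):

```python
def hooke_jeeves(f, x0, step=0.5, alpha=2.0, eps=1e-6):
    """Hooke-Jeeves pattern search (sketch). On exploratory success, take
    pattern moves x_p = x_k + (x_k - x_{k-1}); on failure, shrink the
    step by the factor alpha until it falls below eps."""
    def explore(base, fb):
        x, fx = list(base), fb
        for i in range(len(x)):
            for delta in (step, -step):       # probe +/- step on axis i
                old = x[i]
                x[i] = old + delta
                ft = f(x)
                if ft < fx:
                    fx = ft
                    break                     # keep the improving move
                x[i] = old
        return x, fx

    xk, fk = list(x0), f(x0)
    while step > eps:
        xe, fe = explore(xk, fk)
        if fe < fk:                           # exploratory success: pattern moves
            while True:
                xp = [2 * a - b for a, b in zip(xe, xk)]
                xk, fk = xe, fe
                xe2, fe2 = explore(xp, f(xp))
                if fe2 < fk:
                    xe, fe = xe2, fe2
                else:
                    break
        else:
            step /= alpha                     # shrink increments
    return xk, fk

xm, fm = hooke_jeeves(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2, [0.0, 0.0])
```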
Cyclic Coordinate Algorithm
Let x1 be the starting point for the cycle; set i = 1, x = x1, f = f1

Step 1: Determine λᵢ such that f(λ) = f(x + λ eᵢ) is a minimum
        Note: λ is permitted to take positive or negative values
Step 2: Move to the new point by setting x = x + λ eᵢ, f = f(x)
Step 3: If i = n, go to step 4; else set i = i + 1 and go to step 1
Step 4: Acceleration step: denote the direction d = x − x1.
        Find λ such that f(x + λ d) is a minimum.
        Move to the new point by setting x = x + λ d, f = f(x)
Step 5: If |x − x1| < εₓ or |f − f1| < ε_f, go to step 7
Step 6: Set x1 = x, f1 = f; go to step 1
Step 7: Converged
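The cyclic coordinate algorithm above can be sketched as follows, using a brute-force grid line search in place of an exact 1-D minimizer (the line-search grid and objective are my own assumptions):

```python
def line_search(f, x, d, lo=-10.0, hi=10.0, n=2001):
    """Brute-force 1-D minimization of f(x + lam*d) over a grid of lam."""
    best_lam, best_f = 0.0, f(x)
    for k in range(n):
        lam = lo + (hi - lo) * k / (n - 1)
        xt = [xi + lam * di for xi, di in zip(x, d)]
        ft = f(xt)
        if ft < best_f:
            best_lam, best_f = lam, ft
    return best_lam

def cyclic_coordinate(f, x0, cycles=30):
    """Line-search each unit direction e_i in turn (steps 1-3), then an
    acceleration step along d = x - x1, the net move of the cycle (step 4)."""
    x = list(x0)
    n = len(x)
    for _ in range(cycles):
        x_start = list(x)
        for i in range(n):                        # coordinate sweeps
            e = [1.0 if j == i else 0.0 for j in range(n)]
            lam = line_search(f, x, e)
            x = [xi + lam * ei for xi, ei in zip(x, e)]
        d = [a - b for a, b in zip(x, x_start)]   # acceleration direction
        if any(abs(di) > 0 for di in d):
            lam = line_search(f, x, d)
            x = [xi + lam * di for xi, di in zip(x, d)]
    return x

xm = cyclic_coordinate(
    lambda x: (x[0] - 2) ** 2 + (x[1] - 3) ** 2 + 0.5 * x[0] * x[1],
    [0.0, 0.0])
```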
Example

Given f = 100(x₂ − x₁²)² + (1 − x₁)², x⁰ = (−2, 1)ᵀ and d⁰ = (1, 0)ᵀ, develop an expression for the function

f(λ) = f(x⁰ + λ d⁰)
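One way the requested expression can be developed, assuming the Rosenbrock-type objective f = 100(x₂ − x₁²)² + (1 − x₁)² with x⁰ = (−2, 1)ᵀ and d⁰ = (1, 0)ᵀ (the starting point is partly garbled in the source, so its signs are an assumption):

```latex
\mathbf{x}^0 + \lambda \mathbf{d}^0 =
\begin{pmatrix} -2 + \lambda \\ 1 \end{pmatrix}
\quad\Rightarrow\quad
f(\lambda) = 100\bigl(1 - (\lambda - 2)^2\bigr)^2 + (3 - \lambda)^2 .
```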
Pattern Search Graphically

[Figure: sequence of base points x_base, temporary points x_temp, and accelerated points x_acc connected by exploratory moves (e) and pattern moves (p). Key: e = exploratory move, p = pattern move; solid arrows = improvement, dashed arrows = no improvement]
Pattern Search

Advantages:
• Simple
• Robust (relatively)
• No gradients necessary

Disadvantages:
• Can get stuck, and special “tricks” are needed to get the search going again
• May take a lot of calculations

• Nonlinear multi-objective multiplex implementations using pattern search algorithms have been made.
Exploratory Search

[Figure: (a) grid of points evaluated, and (b) contours of the function generated, plotted over BEAM (meters) vs. LBP (meters)]
Exploratory Search Algorithms

Two options currently exist to identify the vector of design variables, x.

1) Randomly select n points in the bounded space. A uniform distribution is assumed for the variables between the defined bounds. (An enhancement could be to allow for alternative distributions.) Monte Carlo methods are based upon this.

2) Use a systematic search algorithm such as that of Aird and Rice (1977). While they claim that their systematic algorithm provides a better way of searching a region (and provides better starting points) than a random algorithm, their method uses a small number of variable values repeatedly on each axis.

This can be a problem when considering a large number of variables and attempting to “visualize” the effect of changes. For a problem in 15 variables, 2 to the power 15, or 32,768, candidate designs would need to be evaluated to ensure just two sample values on each axis.
Cyclic coordinate search

• In this method, the search is conducted along each of the coordinate directions to find the minimum.
• Search alternatingly in each coordinate direction
• Perform single-variable optimization along each direction (line search):

  min over λ:  f(x + λ s)

● Directions are fixed: can lead to slow convergence
Biologically inspired methods

• Popular: inspiration for algorithms from biological processes:
  • Genetic algorithms / evolutionary optimization
  • Particle swarms / flocks
  • Ant colony methods

● Typically make use of a population (a collection of designs)
● Computationally intensive
● Stochastic nature, global optimization properties
Genetic algorithms

• Based on Darwin’s theory of evolution: survival of the fittest
• Objective = fitness function
• Designs are encoded in chromosomal strings, ~ genes, e.g. binary strings:

  1 1 0 1 0 0 1 0 1 1 0 0 1 0 1
  (substrings encoding x₁ and x₂)
Genetic programming

• Building mathematical functions using an evolution-like approach
• Approach a good fit by crossover and mutation of expressions

  Example expression tree for (x₁ / x₂)² + x₃:

        +
       / \
     ^2   x₃
      |
      ÷
     / \
    x₁   x₂
Genetic programming

• LS fitting with a population of analytic expressions
• Selection / evolution rules

Features:
– Can capture very complex behavior
– Danger of artifacts / overfitting
– Quite expensive procedure
GA flowchart

Create initial population → Evaluate fitness of all individuals → Test termination criteria
  → if not satisfied: Select individuals for reproduction → Create new population (crossover, mutation, reproduction) → Evaluate fitness again, and repeat
  → if satisfied: Quit
GA population operators

• Reproduction:
  • Exact copy/copies of an individual
• Crossover:
  • Randomly exchange genes of different parents
  • Many possibilities: how many genes, parents, children …
• Mutation:
  • Randomly flip some bits of a gene string
  • Used sparingly, but important to explore new designs
Population operators

• Crossover (single-point, after bit 4):

  Parent 1: 1 1 0 1 | 0 0 1 0 1 1 0 0 1 0 1
  Parent 2: 0 1 1 0 | 1 0 0 1 0 1 1 0 0 0 1

  Child 1:  0 1 1 0 | 0 0 1 0 1 1 0 0 1 0 1
  Child 2:  1 1 0 1 | 1 0 0 1 0 1 1 0 0 0 1

● Mutation (one bit flipped):

  1 1 0 1 0 0 1 0 1 1 0 0 1 0 1
  1 1 0 1 0 1 1 0 1 1 0 0 1 0 1
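These two operators can be sketched directly on bit lists (the function names and mutation rate are my own choices; the parent strings and crossover point match the example above):

```python
import random

def crossover(p1, p2, point):
    """Single-point crossover: swap the tails of two equal-length bit strings."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(bits, rate, rng):
    """Flip each bit independently with a small probability (used sparingly)."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

p1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
p2 = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
c2, c1 = crossover(p1, p2, 4)      # split after bit 4, as in the slide
child = mutate(p1, 0.05, random.Random(0))
```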
Unconstrained problems

• Transformation methods
• Existence of solutions, optimality conditions
• Nature of stationary points
• Global optimality
Unconstrained Optimization
• Why?
  • Elimination of active constraints → unconstrained problem
  • Develop basic understanding useful for constrained optimization
  • Transformation of constrained problems into unconstrained problems
  • Relevant engineering problems (potential energy minimization)
Transforming constrained problem
• Reformulation through barrier functions:

  min f = x   subject to   g = x² − 1 ≤ 0

  Transformation:

  f̃ = f − (1/r) ln(−g)

  f̃ = x − (1/r) ln(1 − x²)
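The barrier transformation can be sketched numerically; this assumes the transformed objective f̃(x) = x − (1/r) ln(1 − x²) reconstructed above, and the grid-search minimizer is my own simplification:

```python
import math

def barrier_minimize(r, lo=-0.999999, hi=0.999999, n=200001):
    """Minimize f~(x) = x - (1/r) ln(1 - x^2), the log-barrier
    transformation of  min x  s.t.  g = x^2 - 1 <= 0, by grid search
    over the interior of the feasible region."""
    best_x, best_f = None, float("inf")
    for k in range(n):
        x = lo + (hi - lo) * k / (n - 1)
        ft = x - math.log(1.0 - x * x) / r
        if ft < best_f:
            best_x, best_f = x, ft
    return best_x

# As r grows, the interior optimum approaches the constraint boundary x = -1
# from the feasible side.
x10, x100 = barrier_minimize(10.0), barrier_minimize(100.0)
```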
Transformed problem

[Figure: transformed objective f̃ for increasing barrier parameter r, together with f and the constraint g]
Transformed problem
• Barrier functions result in a feasible, interior optimum:

[Figure: optima of the transformed problem for r = 100, 200, 400, 800 approaching the constraint boundary from the feasible side]
Penalization

• Alternative reformulation: penalty functions

  min f = x   subject to   g = x² − 1 ≤ 0

  [Figure: f̃ for several penalty factors p, together with f and the constraint g]

  Transformation:

  f̃ = f + p [max(0, g)]²

  f̃ = x + p [max(0, x² − 1)]²
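The penalty transformation can be sketched the same way; this assumes the transformed objective f̃(x) = x + p·max(0, x² − 1)² shown above, with a grid-search minimizer as my own simplification:

```python
def penalty_minimize(p, lo=-2.0, hi=2.0, n=400001):
    """Minimize f~(x) = x + p * max(0, x^2 - 1)^2, the quadratic-penalty
    transformation of  min x  s.t.  g = x^2 - 1 <= 0, by grid search."""
    best_x, best_f = None, float("inf")
    for k in range(n):
        x = lo + (hi - lo) * k / (n - 1)
        ft = x + p * max(0.0, x * x - 1.0) ** 2
        if ft < best_f:
            best_x, best_f = x, ft
    return best_x

# The optimum is slightly infeasible (x < -1) and approaches the
# constraint boundary from outside as p grows.
x4, x40 = penalty_minimize(4.0), penalty_minimize(40.0)
```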
Penalization (2)
• Penalty functions result in an infeasible, exterior optimum:

[Figure: optima of the transformed problem for p = 4, 10, 20, 40 approaching the constraint boundary from the infeasible side]
Problem transformation summary

                           Barrier function       Penalty function
Need feasible              Yes                    No
starting point?
Nature of optimum          Interior (feasible)    Exterior (infeasible)
Type of constraints        g                      g, h
Unconstrained Optimization
• Why?
  • Elimination of active constraints → unconstrained problem
  • Develop basic understanding useful for constrained optimization
  • Transformation of constrained problems into unconstrained problems
  • Relevant engineering problems (potential energy minimization)
Unconstrained engineering problem

• Example: displacement of a loaded structure

[Figure: structure with two springs, k₁ = 8 N/cm and k₂ = 1 N/cm, each of length 10 cm, loaded by Fx = 5 N and Fy = 5 N; the node displacement is (x₁, x₂)]

● Equilibrium: minimum potential energy
Unconstrained engineering problem

• Potential energy:

  Π = ½ k₁ u₁² + ½ k₂ u₂² − Fx x₁ − Fy x₂

  Π = 4 [√(x₁² + (10 − x₂)²) − 10]²
    + 0.5 [√(x₁² + (10 + x₂)²) − 10]²
    − 5 x₁ − 5 x₂

● Equilibrium:  min Π over (x₁, x₂)
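The equilibrium can be found numerically with a simple direct search; this sketch assumes the spring elongations u₁ = √(x₁² + (10 − x₂)²) − 10 and u₂ = √(x₁² + (10 + x₂)²) − 10 (reconstructed from the garbled slide, so treat the signs as an assumption), and the compass-search minimizer is my own choice:

```python
import math

def potential(x1, x2, k1=8.0, k2=1.0, l0=10.0, fx=5.0, fy=5.0):
    """Total potential energy of the two-spring structure (assumed geometry)."""
    u1 = math.sqrt(x1 ** 2 + (l0 - x2) ** 2) - l0
    u2 = math.sqrt(x1 ** 2 + (l0 + x2) ** 2) - l0
    return 0.5 * k1 * u1 ** 2 + 0.5 * k2 * u2 ** 2 - fx * x1 - fy * x2

def minimize_2d(f, x, step=1.0, shrink=0.5, tol=1e-9):
    """Simple compass (coordinate) search: try +/- step along each axis,
    shrink the step when no move improves."""
    f_val = f(*x)
    while step > tol:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            trial = (x[0] + dx, x[1] + dy)
            ft = f(*trial)
            if ft < f_val:
                x, f_val, improved = trial, ft, True
                break
        if not improved:
            step *= shrink
    return x, f_val

(x1_eq, x2_eq), pi_min = minimize_2d(potential, (0.0, 0.0))
```

At the returned point the numerical gradient of Π vanishes, which is the equilibrium condition the slide states.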
Unconstrained engineering problem

[Figure: contour plots of Π over (x₁, x₂), showing the equilibrium at the minimum]
Contents

• Unconstrained problems
  • Transformation methods
  • Existence of solutions, optimality conditions
  • Nature of stationary points
  • Global optimality
Theory for solving unconstrained problems
• Assumptions:
  • Objective f continuous and differentiable (C¹) on ]a, b[
  • Domain closed and bounded (compact)
