
IOE 519: NLP, Winter 2012 © Marina A. Epelman

2 Examples of nonlinear programming problem formulations

2.1 Forms and components of a mathematical programming problem

A mathematical programming problem or, simply, a mathematical program is a mathematical formulation of an optimization problem.

Unconstrained Problem:

$$(P) \quad \min_x f(x) \quad \text{s.t. } x \in X,$$

where $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$, $f(x): \mathbb{R}^n \to \mathbb{R}$, and $X$ is an open set (usually, but not always, $X = \mathbb{R}^n$).

Constrained Problem:

$$(P) \quad \begin{array}{ll} \min_x & f(x) \\ \text{s.t.} & g_i(x) \le 0, \quad i = 1, \ldots, m \\ & h_i(x) = 0, \quad i = 1, \ldots, l \\ & x \in X, \end{array}$$

where $g_1(x), \ldots, g_m(x), h_1(x), \ldots, h_l(x): \mathbb{R}^n \to \mathbb{R}$.

Let $g(x) = (g_1(x), \ldots, g_m(x))^T: \mathbb{R}^n \to \mathbb{R}^m$ and $h(x) = (h_1(x), \ldots, h_l(x))^T: \mathbb{R}^n \to \mathbb{R}^l$. Then (P) can be written as

$$(P) \quad \begin{array}{ll} \min_x & f(x) \\ \text{s.t.} & g(x) \le 0 \\ & h(x) = 0 \\ & x \in X. \end{array} \qquad (1)$$

Some terminology:

• The function $f(x)$ is the objective function. Restrictions $h_i(x) = 0$ are referred to as equality constraints, while $g_i(x) \le 0$ are inequality constraints. Notice that we do not use constraints in the form $g_i(x) < 0$!

• A point $x$ is feasible for (P) if it satisfies all the constraints. (For an unconstrained problem, $x \in X$.) The set of all feasible points forms the feasible region, or feasible set (let us denote it by $S$).

• The goal of an optimization problem in minimization form, as above, is to find a feasible point $\bar{x}$ such that $f(\bar{x}) \le f(x)$ for any other feasible point $x$.
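To make the general form (1) concrete, here is a minimal sketch (not part of the original notes; the objective and constraint functions are illustrative choices) that solves a small instance of (P) numerically with scipy.optimize:

```python
# A small instance of (P): minimize f(x) subject to g(x) <= 0, h(x) = 0.
# Illustrative problem: min (x1 - 1)^2 + (x2 - 2)^2
#                       s.t. x1^2 + x2^2 - 4 <= 0   (inequality constraint g)
#                            x1 + x2 - 1 = 0        (equality constraint h)
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
g = lambda x: x[0]**2 + x[1]**2 - 4.0   # feasible when g(x) <= 0
h = lambda x: x[0] + x[1] - 1.0         # feasible when h(x) = 0

# SciPy's convention for inequality constraints is fun(x) >= 0,
# so we pass -g to express our g(x) <= 0 form.
constraints = [{"type": "ineq", "fun": lambda x: -g(x)},
               {"type": "eq", "fun": h}]

res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=constraints)
print(res.x, res.fun)  # a point of the feasible set S minimizing f
```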

2.2 Markowitz portfolio optimization model

Suppose one has the opportunity to invest in $n$ assets. Their future returns are represented by random variables, $R_1, \ldots, R_n$, whose expected values and covariances, $E[R_i]$, $i = 1, \ldots, n$, and $\mathrm{Cov}(R_i, R_j)$, $i, j = 1, \ldots, n$, respectively, can be estimated based on historical data and, possibly, other considerations. At least one of these assets is a risk-free asset. Suppose $x_i$, $i = 1, \ldots, n$, are the fractions of your wealth allocated to each of the assets (that is, $x \ge 0$ and $\sum_{i=1}^n x_i = 1$). The return of the resulting portfolio is a random variable $\sum_{i=1}^n x_i R_i$ with mean $\sum_{i=1}^n x_i E[R_i]$ and variance $\sum_{i=1}^n \sum_{j=1}^n x_i x_j \mathrm{Cov}(R_i, R_j)$. A portfolio is usually chosen to optimize some measure of a tradeoff between the expected return and the risk, such as

$$\begin{array}{ll} \max_x & \sum_{i=1}^n x_i E[R_i] - \lambda \sum_{i=1}^n \sum_{j=1}^n x_i x_j \mathrm{Cov}(R_i, R_j) \\ \text{s.t.} & \sum_{i=1}^n x_i = 1 \\ & x \ge 0, \end{array}$$

where $\lambda > 0$ is a (fixed) parameter reflecting the investor's preferences in the above tradeoff. Since it is hard to assess anybody's value of $\lambda$, the above problem can (and should) be solved for a variety of values of $\lambda$, thus generating a variety of portfolios on the efficient frontier.
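As an illustration (not from the notes; the return and covariance data below are made up), the tradeoff problem can be solved for several values of $\lambda$ to trace out points on the efficient frontier:

```python
# Markowitz tradeoff: max  r'x - lam * x'Cx  s.t. sum(x) = 1, x >= 0,
# solved (as a minimization) for several values of lam; r, C are toy data.
import numpy as np
from scipy.optimize import minimize

r = np.array([0.02, 0.07, 0.12])              # E[R_i], illustrative
C = np.array([[0.0001, 0.0,  0.0 ],           # Cov(R_i, R_j); the first
              [0.0,    0.04, 0.01],           # asset is nearly risk-free
              [0.0,    0.01, 0.09]])

n = len(r)
for lam in [0.5, 2.0, 10.0]:                  # investor's risk preference
    obj = lambda x: -(r @ x) + lam * x @ C @ x   # negate to minimize
    res = minimize(obj, x0=np.full(n, 1.0 / n), method="SLSQP",
                   bounds=[(0.0, None)] * n,
                   constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}])
    print(lam, res.x, r @ res.x)              # weights and expected return
```

As $\lambda$ grows, the optimal weights shift toward the low-variance asset, as the model predicts.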

2.3 Least squares problem (parameter estimation)

This problem has applications in model construction, statistics (e.g., linear regression), neural networks, etc. We consider a linear measurement model, i.e., we stipulate that an (output) quantity of interest $y \in \mathbb{R}$ can be expressed as a linear function $y \approx a^T x$ of input $a \in \mathbb{R}^n$ and model parameters $x \in \mathbb{R}^n$. Our goal is to find the vector of parameters $x$ which provides the best fit for the available set of input-output pairs $(a_i, y_i)$, $i = 1, \ldots, m$. If fit is measured by the sum of squared errors between estimated and measured outputs, the solution of the following optimization problem provides the best fit:

$$\begin{array}{ll} \min_{x, v} & \sum_{i=1}^m v_i^2 \\ \text{s.t.} & v_i = y_i - a_i^T x, \ i = 1, \ldots, m \end{array} \;=\; \min_{x \in \mathbb{R}^n} \sum_{i=1}^m (y_i - a_i^T x)^2 \;=\; \min_{x \in \mathbb{R}^n} \|Ax - y\|_2^2.$$

Here, $A$ is the matrix with rows $a_i^T$.
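A minimal numerical sketch (synthetic data, not from the notes) of the equivalence above; numpy's lstsq routine solves $\min_x \|Ax - y\|_2^2$ directly:

```python
# Best-fit parameters x minimizing ||Ax - y||_2^2, where A has rows a_i^T.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))                  # 50 input vectors a_i in R^3
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + 0.01 * rng.standard_normal(50)   # noisy outputs y_i

x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)     # solves min ||Ax - y||_2^2
print(x_hat)                                      # close to x_true
```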

2.4 Maximum likelihood estimation

Consider a family of probability distributions $p_\theta(\cdot)$ on $\mathbb{R}$, parameterized by a vector $\theta$. E.g., we could be considering the family of exponential distributions, which is parameterized by a single parameter $\theta = \lambda > 0$ and has the form

$$p_\lambda(t) = \begin{cases} \lambda e^{-\lambda t}, & t \ge 0 \\ 0, & t < 0. \end{cases}$$

Another example of a parametric family of probability distributions is the Normal distribution, parameterized by $\theta = (\mu, \sigma)$, where $\mu$ is the mean, and $\sigma$ the standard deviation, of the distribution.

When considered as a function of $\theta$ for a particular observation of a random variable $y \in \mathbb{R}$, the function $p_\theta(y)$ is called the likelihood function. It is more convenient to work with its logarithm, which is called the log-likelihood function: $l(\theta) = \log p_\theta(y)$.

Consider the problem of estimating the value of the parameter vector $\theta$ based on observing one sample point $y$ from the distribution. One possible method, maximum likelihood (ML) estimation, is to estimate $\theta$ as

$$\hat{\theta} = \arg\max_\theta p_\theta(y) = \arg\max_\theta l(\theta),$$
i.e., to choose as the estimate the value of the parameter that maximizes the likelihood (or the log-likelihood) function for the observed value of $y$. If there is prior information available about $\theta$, we can add the constraint $\theta \in C \subseteq \mathbb{R}^n$ explicitly, or impose it implicitly, by redefining $p_\theta(y) = 0$ for $\theta \notin C$ (note that in that case $l(\theta) = -\infty$ for $\theta \notin C$). For $m$ iid sample points $(y_1, \ldots, y_m)$, the log-likelihood function is

$$l(\theta) = \log\left(\prod_{i=1}^m p_\theta(y_i)\right) = \sum_{i=1}^m \log p_\theta(y_i).$$

The ML estimation is thus an optimization problem:

$$\max \ l(\theta) \quad \text{subject to } \theta \in C.$$
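To make this concrete, here is a sketch (not part of the original notes; the data are synthetic) that solves $\max l(\lambda)$ numerically for the exponential family introduced above, imposing the constraint $\lambda > 0$ through bounds:

```python
# ML estimation for the exponential family: maximize
# l(lam) = m*log(lam) - lam*sum(y_i) over lam > 0, done numerically
# (the closed-form answer is lam_hat = 1 / mean(y)).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=1000)   # iid samples, true lam = 0.5

def neg_log_likelihood(lam):
    # negative of l(lam); we minimize it to maximize l
    return -(len(y) * np.log(lam) - lam * y.sum())

res = minimize_scalar(neg_log_likelihood, bounds=(1e-9, 100.0),
                      method="bounded")
print(res.x, 1.0 / y.mean())   # numerical MLE vs. closed-form MLE
```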
For example, returning to the linear measurement model $y = a^T x + v$, let us now assume that the error $v$ is iid random noise with known density $p(\cdot)$. If there are $m$ measurement/output pairs $(a_i, y_i)$ available, then the likelihood function is

$$p_x(y) = \prod_{i=1}^m p(y_i - a_i^T x),$$

and the log-likelihood function is

$$l(x) = \sum_{i=1}^m \log p(y_i - a_i^T x).$$

For example, suppose the noise is Gaussian (or Normal) with mean 0 and standard deviation $\sigma$. Then

$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{z^2}{2\sigma^2}},$$

and the log-likelihood function is

$$l(x) = -\frac{m}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \|Ax - y\|_2^2.$$

Therefore, the ML estimate of $x$ is $\arg\min_x \|Ax - y\|_2^2$, the solution of the least squares approximation problem (note that the analysis is the same whether $\sigma$ is known or not).
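A quick numeric sanity check (a sketch with synthetic data, not from the notes) that maximizing the Gaussian log-likelihood over $x$ indeed recovers the least squares solution:

```python
# With Gaussian noise, argmax_x l(x) should equal argmin_x ||Ax - y||_2^2.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 2))
y = A @ np.array([3.0, -1.0]) + 0.1 * rng.standard_normal(40)

sigma = 0.1
neg_l = lambda x: (len(y) / 2) * np.log(2 * np.pi * sigma**2) \
                  + np.sum((A @ x - y)**2) / (2 * sigma**2)

x_ml = minimize(neg_l, x0=np.zeros(2)).x          # maximize l(x)
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)      # least squares solution
print(np.allclose(x_ml, x_ls, atol=1e-5))         # True, for any sigma
```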

2.5 Current in a resistive electric network

Consider a linear resistive electric network with node set $N$ and arc set $A$. Let $v_i$ be the voltage of node $i$ and let $x_{ij}$ be the current on arc $(i, j)$. Kirchhoff's current law says that for each node $i$, the total incoming current is equal to the total outgoing current:

$$\sum_{j: (j,i) \in A} x_{ji} = \sum_{j: (i,j) \in A} x_{ij}.$$

Ohm's law says that the current $x_{ij}$ and the voltage drop $v_i - v_j$ along each arc $(i, j)$ are related by

$$v_i - v_j = R_{ij} x_{ij} - t_{ij},$$
where $R_{ij} \ge 0$ is a resistance parameter, and $t_{ij}$ is another parameter that is nonzero when there is a voltage source along the arc $(i, j)$ ($t_{ij}$ is positive if the voltage source pushes current in the direction from $i$ to $j$). Given the arc resistance and arc voltage parameters $R_{ij}$ and $t_{ij}$ for all $(i, j) \in A$, the current in the system is distributed so as to minimize the energy loss in the system, while satisfying Kirchhoff's current law. This can be modeled as the following nonlinear programming problem:

$$\begin{array}{ll} \min & \sum_{(i,j) \in A} \left( \frac{1}{2} R_{ij} x_{ij}^2 - t_{ij} x_{ij} \right) \\ \text{s.t.} & \sum_{j: (i,j) \in A} x_{ij} = \sum_{j: (j,i) \in A} x_{ji} \quad \forall i \in N. \end{array}$$

It can be shown by studying the optimality conditions for this problem that the optimal solution of the above problem satisfies Ohm's law. Moreover, if a vector of current values $x^\star$ and a vector of node voltage values $v^\star$ together satisfy Kirchhoff's and Ohm's laws, then the vector $x^\star$ is an optimal solution of the above optimization problem.
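As an illustration (a toy three-node network, not from the notes), the energy-minimization problem can be solved directly, with Kirchhoff's current law imposed as equality constraints:

```python
# Minimize sum over arcs of (1/2) R_ij x_ij^2 - t_ij x_ij subject to
# flow conservation (Kirchhoff's current law) at every node.
import numpy as np
from scipy.optimize import minimize

arcs = [(0, 1), (1, 2), (2, 0)]          # a 3-node cycle
R = np.array([1.0, 2.0, 3.0])            # resistances R_ij per arc
t = np.array([6.0, 0.0, 0.0])            # voltage source on arc (0, 1)

def node_balance(x):
    # net (incoming minus outgoing) current at each node must be zero
    balance = np.zeros(3)
    for k, (i, j) in enumerate(arcs):
        balance[i] -= x[k]
        balance[j] += x[k]
    return balance

energy = lambda x: np.sum(0.5 * R * x**2 - t * x)
res = minimize(energy, x0=np.zeros(3), method="SLSQP",
               constraints=[{"type": "eq", "fun": node_balance}])
print(res.x)   # every arc carries t / sum(R) = 1.0 around the cycle
```

In this cycle, conservation forces a common current $x$ on all arcs, so the objective reduces to $3x^2 - 6x$, minimized at $x = 1$, matching the printed solution.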
