
Chapter 16

DESIGN OF RECURSIVE FILTERS USING OPTIMIZATION METHODS

16.1 Introduction
16.2 Problem Formulation
16.3 Newton’s Method

Copyright © 2005 Andreas Antoniou
Victoria, BC, Canada
Email: aantoniou@ieee.org

July 14, 2018

Introduction

• Chaps. 11 and 12 have dealt with several methods for the solution of the approximation problem in recursive filters.
• These methods lead to a complete description of the transfer function in closed form, either in terms of its zeros and poles or its coefficients.
• Consequently, they are very efficient and lead to very precise designs.
• Their main disadvantage is that they are applicable only for the design of classical-type filters such as Butterworth lowpass filters, elliptic bandpass filters, etc.

Introduction Cont’d

• If unusual amplitude or phase responses or delay characteristics are required, then the only available approach for the design of recursive filters is through the use of optimization methods.
• In these methods, a discrete-time transfer function is assumed and an error function is formulated on the basis of some desired amplitude or phase response or some specified group-delay characteristic.
• A norm of the error function is then minimized with respect to the transfer-function coefficients.
• Like the Remez algorithm described in Chap. 15, optimization methods for the design of recursive filters are iterative. As a result, they usually involve a large amount of computation.

Introduction Cont’d

• This presentation is concerned with the application of optimization methods for the design of recursive digital filters. Specific topics to be examined include:
  – formulation of an optimization problem
  – fundamentals of optimization
  – the basic Newton algorithm
• More sophisticated practical optimization algorithms, such as quasi-Newton and minimax algorithms, along with their application for the design of recursive digital filters, will be presented later.

Problem Formulation

The design of a recursive digital filter by optimization involves two general steps:
1. Construct an objective function that is proportional to the difference between the actual and specified amplitude or phase response.
2. Minimize the objective function with respect to the transfer-function coefficients.

Problem Formulation Cont’d

• Let us assume that we need to design an Nth-order recursive filter with a piecewise-linear amplitude response M0(ω) such as that shown in the figure.

[Figure: desired piecewise-linear amplitude response M0(ω); gain versus ω, rad/s]

Problem Formulation Cont’d

• The transfer function of the filter can be expressed as
  $$H(z) = H_0 \prod_{j=1}^{J} \frac{a_{0j} + a_{1j} z + z^2}{b_{0j} + b_{1j} z + z^2}$$
  where aij and bij are real coefficients, J = N/2, and H0 is a positive multiplier constant.
• As presented, H(z) would be of even order; however, an odd-order H(z) can be obtained by letting a0j = b0j = 0 for one value of j.

Problem Formulation Cont’d

• The amplitude response of an arbitrary recursive filter can be deduced as
  M(x, ω) = |H(e^{jωT})|
  where ω is the frequency, T is the sampling period, and
  x = [a01 a11 b01 b11 · · · b1J H0]^T
  is a column vector with 4J + 1 elements.
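
To make the formulation concrete, the sketch below evaluates M(x, ω) for a coefficient vector ordered as above. The function name, the default sampling period T = 1, and the example coefficients are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def amplitude_response(x, omegas, T=1.0):
    """Evaluate M(x, w) = |H(e^{jwT})| for the cascade of biquad sections
    H(z) = H0 * prod_j (a0j + a1j*z + z^2) / (b0j + b1j*z + z^2).

    x is ordered [a01, a11, b01, b11, ..., a0J, a1J, b0J, b1J, H0],
    i.e., 4J + 1 elements, matching the column vector defined above."""
    x = np.asarray(x, dtype=float)
    J = (len(x) - 1) // 4
    H0 = x[-1]
    z = np.exp(1j * np.asarray(omegas) * T)      # points on the unit circle
    H = np.full_like(z, H0, dtype=complex)
    for j in range(J):
        a0, a1, b0, b1 = x[4 * j: 4 * j + 4]
        H *= (a0 + a1 * z + z**2) / (b0 + b1 * z + z**2)
    return np.abs(H)

# Example: a single second-order section evaluated on a coarse frequency grid
omegas = np.linspace(0, np.pi, 8)
print(amplitude_response([0.81, -1.2, 0.64, -0.9, 0.1], omegas))
```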

Problem Formulation Cont’d

• An approximation error can be constructed as the difference between the actual amplitude response M(x, ω) and the desired amplitude response M0(ω) as
  e(x, ω) = M(x, ω) − M0(ω)
• By sampling e(x, ω) at frequencies ω1, ω2, . . . , ωK, the column vector
  E(x) = [e1(x) e2(x) · · · eK(x)]^T
  can be formed, where
  ei(x) = e(x, ωi)
  for i = 1, 2, . . . , K.
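
A minimal sketch of how the sampled error vector E(x) might be assembled on a frequency grid, assuming the desired response M0(ω) and an amplitude-response routine (such as the one sketched above) are supplied as callables; all names are illustrative.

```python
import numpy as np

def error_vector(x, omegas, desired_gain, amplitude_fn):
    """Form E(x) = [e_1(x), ..., e_K(x)]^T with e_i(x) = M(x, w_i) - M0(w_i).

    amplitude_fn(x, omegas) returns M(x, w) on the grid (e.g., the
    amplitude_response sketch above); desired_gain(omegas) returns M0(w)."""
    omegas = np.asarray(omegas)
    return amplitude_fn(x, omegas) - desired_gain(omegas)
```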

Problem Formulation Cont’d

[Figure: actual amplitude response M(x, ω), desired response M0(ω), and the sampled errors e(x, ωi) at frequencies ω1, ω2, . . . , ωK; gain versus ω, rad/s]

Problem Formulation Cont’d

• A recursive filter can be designed by finding a point x = x̂ such that
  ei(x̂) ≈ 0 for i = 1, 2, . . . , K
• Such a point can be obtained by minimizing the Lp norm of E(x), which is defined as
  $$\Psi(\mathbf{x}) = L_p(\mathbf{x}) = \|E(\mathbf{x})\|_p = \left[\sum_{i=1}^{K} |e_i(\mathbf{x})|^p\right]^{1/p}$$
  where p is a positive integer.

Problem Formulation Cont’d

• For filter design, the most important norms are the L2 and L∞ norms, which are defined as
  $$L_2(\mathbf{x}) = \left[\sum_{i=1}^{K} |e_i(\mathbf{x})|^2\right]^{1/2}$$
  and
  $$L_\infty(\mathbf{x}) = \lim_{p\to\infty}\left[\sum_{i=1}^{K} |e_i(\mathbf{x})|^p\right]^{1/p}
  = \hat{E}(\mathbf{x})\lim_{p\to\infty}\left[\sum_{i=1}^{K}\left(\frac{|e_i(\mathbf{x})|}{\hat{E}(\mathbf{x})}\right)^p\right]^{1/p}
  = \hat{E}(\mathbf{x})$$
  where $\hat{E}(\mathbf{x}) = \max_{1\le i\le K}|e_i(\mathbf{x})|$.
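
In code, the two norms of a sampled error vector reduce to one line each (a sketch assuming E is a NumPy array holding the values e_i(x)):

```python
import numpy as np

def l2_norm(E):
    """L2(x) = (sum_i |e_i(x)|^2)^(1/2)."""
    return np.sqrt(np.sum(np.abs(E) ** 2))   # same as np.linalg.norm(E, 2)

def linf_norm(E):
    """L_inf(x) = max_i |e_i(x)|, the limit of the Lp norm as p -> infinity."""
    return np.max(np.abs(E))
```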

Problem Formulation Cont’d

• In summary, a recursive filter with an amplitude response that approaches a specified amplitude response M0(ω) can be designed by solving the optimization problem
  $$\underset{\mathbf{x}}{\operatorname{minimize}}\ \Psi(\mathbf{x})$$
• If Ψ(x) = L2(x), a least-squares solution is obtained, and if Ψ(x) = L∞(x), the outcome will be a so-called minimax solution.
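
As an illustration, a least-squares design could be attempted with a general-purpose unconstrained optimizer. The sketch below uses scipy.optimize.minimize and assumes the amplitude-response and desired-response callables from the earlier sketches; minimizing the squared L2 norm is a convenience assumption (it has the same minimizer as L2 itself), and the minimax case is better handled by the specialized algorithms mentioned later.

```python
import numpy as np
from scipy.optimize import minimize

def design_least_squares(x0, omegas, desired_gain, amplitude_fn):
    """Minimize Psi(x) = L2(x)^2 over the transfer-function coefficients.

    x0 is an initial coefficient vector; amplitude_fn and desired_gain are
    the illustrative callables used in the earlier sketches."""
    def objective(x):
        e = amplitude_fn(x, omegas) - desired_gain(omegas)
        return np.sum(e ** 2)
    result = minimize(objective, x0, method="BFGS")
    return result.x, result.fun
```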

Problem Formulation Cont’d

• The optimization problem obtained can be solved by using a great variety of unconstrained optimization algorithms that have evolved since the invention of computers.
• In the next series of slides, various implementations of the most fundamental unconstrained optimization algorithm, namely, the Newton algorithm, will be presented.
• In due course, the family of quasi-Newton algorithms will be described which, as may be expected, are based on the Newton algorithm. These algorithms have been found to be quite robust and very efficient in many applications, including the design of recursive digital filters.

Newton Algorithm

• The Newton algorithm is based on Newton’s classical method for finding the minimum of a quadratic convex function.

[Figure: quadratic convex surface with its minimum point indicated]

Newton Algorithm Cont’d

• Consider a function f(x) of n variables, where x = [x1 x2 · · · xn]^T is a column vector, and let δ = [δ1 δ2 · · · δn]^T be a change in x.
• If f(x) has continuous second derivatives, its Taylor series at point x + δ is given by
  $$f(\mathbf{x}+\boldsymbol{\delta}) = f(\mathbf{x}) + \sum_{i=1}^{n}\frac{\partial f(\mathbf{x})}{\partial x_i}\,\delta_i + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2 f(\mathbf{x})}{\partial x_i\,\partial x_j}\,\delta_i\delta_j + o\!\left(\|\boldsymbol{\delta}\|_2^2\right)$$
  where the remainder o(‖δ‖₂²) approaches zero faster than ‖δ‖₂².

Newton Algorithm Cont’d

• If the remainder is negligible and a stationary point exists in the neighborhood of some point x, it can be determined by differentiating f(x + δ) with respect to the elements δk for k = 1, 2, . . . , n, and setting the result to zero, i.e.,
  $$\frac{\partial f(\mathbf{x}+\boldsymbol{\delta})}{\partial \delta_k} = \frac{\partial f(\mathbf{x})}{\partial x_k} + \sum_{i=1}^{n}\frac{\partial^2 f(\mathbf{x})}{\partial x_i\,\partial x_k}\,\delta_i = 0$$
  for k = 1, 2, . . . , n.
• The solution of the above equation can be expressed as
  δ = −H⁻¹ g
  where g is the gradient vector and H is the Hessian matrix.
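
In practice the Newton step is computed by solving the linear system Hδ = −g rather than by forming H⁻¹ explicitly. The sketch below illustrates this on a small quadratic example; the numbers are arbitrary.

```python
import numpy as np

def newton_step(gradient, hessian):
    """Return delta = -H^{-1} g by solving H * delta = -g."""
    return np.linalg.solve(hessian, -gradient)

# Example for a quadratic f(x) = 0.5 x^T H x + b^T x: one step lands on the minimum
H = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, -2.0])
x = np.zeros(2)
g = H @ x + b                              # gradient of the quadratic at x
x_min = x + newton_step(g, H)
print(x_min, np.allclose(H @ x_min + b, 0.0))   # gradient vanishes at the minimum
```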

Newton Algorithm Cont’d

δ = −H⁻¹ g

• The gradient and Hessian are given by
  $$\mathbf{g} = \nabla f(\mathbf{x}) = \left[\frac{\partial f(\mathbf{x})}{\partial x_1}\ \ \frac{\partial f(\mathbf{x})}{\partial x_2}\ \ \cdots\ \ \frac{\partial f(\mathbf{x})}{\partial x_n}\right]^T$$
  and
  $$\mathbf{H} = \begin{bmatrix}
  \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1\,\partial x_n} \\
  \dfrac{\partial^2 f(\mathbf{x})}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_2\,\partial x_n} \\
  \vdots & \vdots & & \vdots \\
  \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n^2}
  \end{bmatrix}$$
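
If analytical derivatives are inconvenient, g and H can be approximated by finite differences. The following sketch uses central differences; the step sizes are illustrative choices that trade off truncation against rounding error.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def numerical_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        for j in range(n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H
```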

Newton Algorithm Cont’d

δ = −H⁻¹ g

• This equation will give the solution if and only if the following two conditions hold:
  1. The remainder o(‖δ‖₂²) can be neglected.
  2. The Hessian is nonsingular.
• If f(x) is a quadratic function, its second partial derivatives are constants, i.e., H is a constant symmetric matrix, and its third and higher derivatives are zero, i.e., condition (1) holds.
• If f(x) has a minimum at a stationary point, then the Hessian matrix is positive definite at the minimum point. In such a case, the Hessian is nonsingular, i.e., condition (2) holds.

Newton Algorithm Cont’d

• When the two conditions apply, the solution of the optimization problem, i.e., the minimum point and the minimum value of the function, can be obtained directly (e.g., in MATLAB) as
  x̂ = x + δ,  f(x̂)
• No optimization algorithms are necessary.

Newton Algorithm Cont’d

• We need optimization algorithms to solve problems that are not quadratic. In these problems condition (1) and/or condition (2) may be violated.
• If condition (1) is violated, then the equation δ = −H⁻¹ g will not give the solution.
• If condition (2) is violated, the equation either has an infinite number of solutions or it has no solutions at all.

Newton Algorithm Cont’d

• If f(x) is a general nonquadratic convex function that has a minimum at point x̂, then in the neighborhood of point x̂ it can be approximated by a quadratic function. In such a case
  – the remainder o(‖δ‖₂²) becomes negligible, and
  – the second partial derivatives of f(x) become approximately constant.
• As a result, in the neighborhood of the solution, conditions (1) and (2) are again satisfied and the equation δ = −H⁻¹ g will yield an accurate estimate of the minimum point.

Newton Algorithm Cont’d

• Unfortunately, since we do not know the solution, we also do not know where the neighborhood of the solution might be located!
• However, if we could find a way of getting near the solution, we would immediately be able to get the solution itself by evaluating δ = −H⁻¹ g.
• Thus, in order to construct an unconstrained optimization algorithm for the solution of nonquadratic problems, we need a mechanism that will enable the algorithm to reach the locale of the solution starting from an arbitrary initial point.

Newton Algorithm Cont’d

• A basic strategy adopted in most unconstrained optimization algorithms is to apply a series of corrections to an initial point, ensuring that each correction is made in a direction that will reduce the objective function. Such directions are known as descent directions.
• It turns out that if the Hessian H is positive definite, then so is the inverse Hessian H⁻¹. In such a case, the direction vector
  d = −H⁻¹ g
  is a descent direction, as can be easily demonstrated.
• Since the above descent direction would give the solution in just one shot in a quadratic problem, it is commonly referred to as the Newton direction.

Newton Algorithm Cont’d

• If the problem is nonquadratic, the Newton direction will not give the solution. Nevertheless, it will reduce the value of the objective function by a certain amount and, therefore, a new point will be obtained that is closer to the solution.
• To maximize the reduction achieved in the objective function and thereby get closer to the solution, we let
  δ = αd
  and choose parameter α such that the objective function, f(x + αd), is minimized. We can do that by using a one-dimensional optimization algorithm, also known as a line search.

Newton Algorithm Cont’d

• Most of the elements of an unconstrained optimization algorithm are now in place, except that in a nonquadratic problem the Hessian may sometimes become nonpositive definite. In such a case, the Newton direction could become an ascent direction!
• To circumvent this problem, we simply force the Hessian to become positive definite; for example, we could assign
  H = I
  where I is the identity matrix.
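
One simple safeguard is sketched below: test positive definiteness with a Cholesky factorization and fall back to the identity matrix when the test fails (which turns the Newton step into a steepest-descent step). Adding a multiple of the identity to H is another common remedy; neither choice is prescribed by the text.

```python
import numpy as np

def make_positive_definite(H):
    """Return H if it is positive definite; otherwise fall back to I.

    Assumes H is symmetric, as a Hessian is; np.linalg.cholesky succeeds
    only for positive-definite matrices."""
    try:
        np.linalg.cholesky(H)
        return H
    except np.linalg.LinAlgError:
        return np.eye(H.shape[0])
```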

Newton Algorithm Cont’d

1. Input x_0 and ε, and set k = 0.
2. Compute the gradient g_k and Hessian H_k. If H_k is not positive definite, force it to become positive definite.
3. Compute H_k^{-1} and d_k = −H_k^{-1} g_k.
4. Find α_k, the value of α that minimizes f(x_k + α d_k), using a line search.
5. Set x_{k+1} = x_k + δ_k, where δ_k = α_k d_k, and compute f_{k+1} = f(x_{k+1}).
6. If ||α_k d_k||_2 < ε, then output x̂ = x_{k+1} and f(x̂) = f_{k+1}, and stop. Otherwise, set k = k + 1 and repeat from step 2.
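
The six steps translate into a short routine. The sketch below is not the author’s reference implementation: it takes the objective, gradient, and Hessian as callables, uses a Cholesky test to force positive definiteness as in Step 2, and borrows SciPy’s bounded scalar minimizer as the line search of Step 4 (the bracket [0, 4] is an illustrative choice).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def basic_newton(f, grad, hess, x0, eps=1e-6, max_iter=100):
    """Basic Newton algorithm with a line search (Steps 1-6 above)."""
    x = np.asarray(x0, dtype=float)                      # Step 1: x0, eps, k = 0
    for _ in range(max_iter):
        g = grad(x)                                      # Step 2: gradient g_k
        H = hess(x)                                      #         Hessian H_k
        try:
            np.linalg.cholesky(H)                        # positive definite?
        except np.linalg.LinAlgError:
            H = np.eye(x.size)                           # force positive definiteness
        d = np.linalg.solve(H, -g)                       # Step 3: d_k = -H_k^{-1} g_k
        alpha = minimize_scalar(lambda a: f(x + a * d),  # Step 4: line search
                                bounds=(0.0, 4.0), method="bounded").x
        delta = alpha * d                                # Step 5: x_{k+1} = x_k + delta_k
        x = x + delta
        if np.linalg.norm(delta) < eps:                  # Step 6: termination test
            break
    return x, f(x)

# Example: f(x) = (x1 - 1)^4 + (x2 + 2)^2, started from the origin
f = lambda x: (x[0] - 1) ** 4 + (x[1] + 2) ** 2
g = lambda x: np.array([4 * (x[0] - 1) ** 3, 2 * (x[1] + 2)])
h = lambda x: np.array([[12 * (x[0] - 1) ** 2, 0.0], [0.0, 2.0]])
print(basic_newton(f, g, h, [0.0, 0.0]))
```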

Newton Algorithm Cont’d

• In Step 4 of the algorithm, a line search is used to find the value of α_k that minimizes the value of the objective function, f(x_k + α d_k), along the Newton direction.
• Any one-dimensional optimization algorithm can be used as a line search. Therefore, a great variety of possible line searches are available. See Chap. 4 of A. Antoniou and W.-S. Lu, Practical Optimization: Algorithms and Engineering Applications, Springer, 2007, at the following link: http://www.ece.uvic.ca/~optimize

Newton Algorithm Cont’d

• In Step 6, the algorithm is terminated if the L2 norm of α_k d_k, i.e., the magnitude of the change in x, is less than ε. The parameter ε is known as the termination tolerance; it is a small positive constant whose value is determined by the application under consideration.
• In certain applications, a termination tolerance on the change in the objective function itself, e.g., |f_{k+1} − f_k| < ε, may be preferable, and sometimes termination tolerances are imposed on the magnitudes of both the change in x and the change in the objective function.

Newton Algorithm Cont’d

• For a nonquadratic convex problem, starting from an arbitrary initial point, the Newton algorithm computes a series of corrections to the initial point, in each case minimizing the objective function along a Newton direction.
• In this way, new points are generated that get closer and closer to the solution.
• As the solution is approached, the objective function behaves more and more like a quadratic function, parameter α assumes values closer and closer to unity, and the Newton directions give new points that are closer and closer to the solution.
• Eventually, a point is obtained that is sufficiently close to the solution and convergence is deemed to have been achieved.

Newton Algorithm Cont’d

• A characteristic feature of the basic Newton algorithm is that at the beginning it makes small but steady reductions in the objective function from one iteration to the next.
• As the solution is approached, progress becomes very rapid and the algorithm appears to zoom in on the solution.
• This feature is shared by all the algorithms based on Newton’s classical method for finding the minima of a function, including the family of quasi-Newton algorithms.

Newton Algorithm Cont’d

• So far, we have assumed that we are dealing with a convex optimization problem that has a unique global minimum.
• In practice, the problem may have more than one local minimum, sometimes a large number, and on occasion a well-defined minimum may not even exist.
• We must, therefore, abandon the expectation that we shall always be able to obtain the best solution available.
• The best we can hope for is a solution that satisfies the required specifications.
• This is usually achieved by using different initialization points, as in the sketch below.
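
A common way of doing this is a simple multi-start loop, sketched below; the optimizer and the initial-point generator are supplied by the caller, so the names and the number of starts are illustrative.

```python
import numpy as np

def multi_start(optimize_once, make_initial_point, n_starts=10, seed=0):
    """Run an optimizer from several initial points and keep the best result.

    optimize_once(x0) -> (x_hat, f_hat); make_initial_point(rng) -> x0.
    Both callables are supplied by the user, so the sketch stays generic."""
    rng = np.random.default_rng(seed)
    best_x, best_f = None, np.inf
    for _ in range(n_starts):
        x_hat, f_hat = optimize_once(make_initial_point(rng))
        if f_hat < best_f:                 # keep the lowest objective value seen
            best_x, best_f = x_hat, f_hat
    return best_x, best_f
```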

This slide concludes the presentation.
Thank you for your attention.
