
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 40, NO. 7, JULY 1992, p. 1633

A Variable Step Size LMS Algorithm


Raymond H. Kwong, Member, IEEE, and Edward W. Johnston

Abstract-A new LMS-type adaptive filter with a variable step size is introduced. The step size increases or decreases as the mean-square error increases or decreases, allowing the adaptive filter to track changes in the system as well as produce a small steady state error. The convergence and steady state behavior of the algorithm are analyzed. These results reduce to well-known ones when specialized to the constant step size case. Simulation results are presented to support the analysis and to compare the performance of the new algorithm with the usual LMS algorithm and another variable step algorithm. They show that the performance of the new algorithm compares favorably with these existing algorithms.

I. INTRODUCTION

ONE of the most popular algorithms in adaptive signal processing is the least mean square (LMS) algorithm of Widrow and Hoff [1]. It has been extensively analyzed in the literature, and a large number of results on its steady state misadjustment and its tracking performance have been obtained [2]-[8]. The majority of these papers examine the LMS algorithm with a constant step size. The choice of the step size reflects a tradeoff between misadjustment and the speed of adaptation. In [1], approximate expressions were derived which showed that a small step size gives small misadjustment but also a longer convergence time constant. Subsequent works have discussed the optimization of the step size, or methods of varying the step size to improve performance [9], [10]. It seems to us, however, that there is as yet no detailed analysis of a variable step size algorithm that is simple to implement and is capable of giving both fast tracking and small misadjustment. In this paper, we propose a variable step size LMS algorithm where the step size adjustment is controlled by the square of the prediction error. The motivation is that a large prediction error will cause the step size to increase to provide faster tracking, while a small prediction error will result in a decrease in the step size to yield smaller misadjustment. The adjustment equation is simple to implement, and its form is such that a detailed analysis of the algorithm is possible under the standard independence assumptions commonly made in the literature [1] to simplify the analysis of LMS algorithms.

The paper is organized as follows. In Section II, we formulate the adaptive system identification problem and describe the new variable step size LMS algorithm. Simplifying assumptions are introduced and their justification discussed. The analysis of the algorithm begins in Section III, where the convergence of the mean weight vector is treated. In Section IV, we study the behavior of the mean-square error. Section V contains the steady state results. Conditions for convergence of the mean-square error are given. Expressions for the steady state misadjustment are also derived. In Section VI, simulation results obtained using the new algorithm are described. They are compared to the results obtained for the fixed step size algorithm and the variable step algorithm described in [9]. The improvements in performance over the constant step size algorithm are clearly shown. The simulation results are also shown to correspond closely to the theoretical predictions. Section VII contains the conclusions.

II. A VARIABLE STEP SIZE LMS ALGORITHM


The adaptive filtering or system identification problem being considered is to adjust a set of filter weights so that the system output tracks a desired signal. Let the input vector to the system be denoted by X_k, and the desired scalar output by d_k. These processes are assumed to be related by the equation

  d_k = X_k^T W_k* + e_k   (1)

where e_k is a zero mean Gaussian independent sequence, independent of the input process X_k. Two cases will be considered: W_k* equals a constant W*, and W_k* is randomly varying according to the equation

  W*_{k+1} = a W_k* + Z_k   (2)

Manuscript received June 27, 1989; revised February 5, 1991. This work was supported by the Natural Sciences and Engineering Research Council of Canada under Grant A0875. R. H. Kwong is with the Department of Electrical Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada. E. W. Johnston is with Atomic Energy of Canada, Ltd. (AECL), Mississauga, Ontario L5K 1B2, Canada. IEEE Log Number 9200261.

where a is less than but close to 1, and Z_k is an independent zero mean sequence, independent of X_k and e_k, with covariance E{Z_k Z_l^T} = σ_z² I δ_{kl}, δ_{kl} being the Kronecker delta. The first case will be referred to as a stationary system or environment, the second as a nonstationary system or environment. They correspond to the models considered in [1]. The input process X_k is assumed to be a zero mean independent sequence with covariance E(X_k X_k^T) = R, a positive definite matrix. This simplifying assumption is often made in the literature [1], [5], [7]. While it is usually not met in practice, analyses based on this assumption give predictions which are often validated in applications and simulations. This will also be the case with our results.
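For readers who wish to reproduce the model numerically, (1) and (2) can be simulated directly. The following sketch is illustrative only and is not part of the original paper: the function name and default values are our own, and it takes E(X_k X_k^T) = I, the choice used in most of the simulations of Section VI.

```python
import math
import random

def generate_data(n_taps=4, n_samples=1000, a=0.999, sigma_z=0.01,
                  xi_min=1.0, seed=0):
    """Simulate d_k = X_k^T W_k* + e_k with the random-walk drift
    W*_{k+1} = a*W_k* + Z_k of (1)-(2); here E(X_k X_k^T) = I."""
    rng = random.Random(seed)
    w_star = [0.0] * n_taps
    data = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_taps)]
        e = rng.gauss(0.0, math.sqrt(xi_min))          # measurement noise e_k
        d = sum(xi * wi for xi, wi in zip(x, w_star)) + e
        data.append((x, d))
        # drift of the true weights (set sigma_z = 0 for the stationary case)
        w_star = [a * wi + rng.gauss(0.0, sigma_z) for wi in w_star]
    return data
```

Setting `sigma_z = 0` and `a = 1` recovers the stationary model; a nonzero initial `w_star` can be substituted for a more interesting identification task.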

1053-587X/92$03.00 © 1992 IEEE

Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on January 9, 2010 at 13:44 from IEEE Xplore. Restrictions apply.


The LMS-type adaptive algorithm is a gradient search algorithm which computes a set of weights W_k that seeks to minimize E(d_k − X_k^T W_k)². The algorithm is of the form

  W_{k+1} = W_k + μ_k X_k ε_k   (3)

where

  ε_k = d_k − X_k^T W_k   (4)

and μ_k is the step size. In the standard LMS algorithm [1], μ_k is a constant. In [9], μ_k is time varying, with its value determined by the number of sign changes of an error surface gradient estimate. Here, we propose a new algorithm, which we shall refer to as the variable step size (VSS) algorithm, for adjusting the step size μ_k:

  μ'_{k+1} = α μ_k + γ ε_k²   (5)

with 0 < α < 1 and γ > 0, and

  μ_{k+1} = μ_max     if μ'_{k+1} > μ_max
          = μ_min     if μ'_{k+1} < μ_min
          = μ'_{k+1}  otherwise   (6)

where 0 < μ_min < μ_max. The initial step size μ_0 is usually taken to be μ_max, although the algorithm is not sensitive to the choice. As can be seen from (5), the step size μ_k is always positive and is controlled by the size of the prediction error and the parameters α and γ. Intuitively speaking, a large prediction error increases the step size to provide faster tracking. If the prediction error decreases, the step size will be decreased to reduce the misadjustment. The constant μ_max is chosen to ensure that the mean-square error (mse) of the algorithm remains bounded. A sufficient condition on μ_max to guarantee bounded mse is [7]

  μ_max ≤ 2/(3 tr(R)).

μ_min is chosen to provide a minimum level of tracking ability. Usually, μ_min will be near the value of μ that would be chosen for the fixed step size (FSS) algorithm. α must be chosen in the range (0, 1) to provide exponential forgetting. A typical value of α that was found to work well in simulations is α = 0.97. The parameter γ is usually small (4.8 × 10⁻⁴ was used in most of our simulations) and may be chosen in conjunction with α to meet the misadjustment requirements according to formulas presented later. The additional overhead over the FSS algorithm is essentially one more weight update at each time step, so the increase in complexity is minimal.

III. CONVERGENCE OF THE MEAN WEIGHT VECTOR

The VSS algorithm given by (3)-(6) is difficult to analyze exactly. To make the analysis tractable, we introduce the following simplifying assumption.

Assumption 1: For the algorithm (3)-(6), E(μ_k X_k ε_k) = E(μ_k) E(X_k ε_k).

This assumption is of course true if μ_k is a constant, but cannot hold exactly for the VSS algorithm. However, we can say that it is approximately true: if γ is small, μ_k will vary slowly around its mean value. By writing

  E(μ_k X_k ε_k) = E(μ_k) E(X_k ε_k) + E{[μ_k − E(μ_k)] X_k ε_k}   (7)

we see that for γ sufficiently small, the second term on the right-hand side of (7) will be small compared to the first. Assumption 1 allows us to derive theoretical results whose predictions are borne out by simulations. Making such simplifying assumptions is not an uncommon practice in the adaptive signal processing literature [1], [5], [7].

We first study the convergence of the mean weight vector. Since the stationary case can be derived from the nonstationary one by setting a = 1, σ_z² = 0 (resulting in Z_k = 0 with probability one), and W_k* = W*, we shall give the derivation for the nonstationary case only. By Assumption 1,

  E(W_{k+1}) = E(W_k) − E(μ_k) R E(W_k − W_k*).

Now,

  E(W*_{k+1}) = a E(W*_k).

Thus the error weight vector W̃_k = W_k − W_k* satisfies the equation

  E(W̃_{k+1}) = [I − E(μ_k) R] E(W̃_k) + (1 − a) E(W_k*).   (8)

Equation (8) is stable if and only if

  Π_{k=0}^{n} [I − E(μ_k) R] → 0   as n → ∞.   (9)

A sufficient condition for (9) to hold is

  E(μ_k) < 2/λ_max(R)

where λ_max(R) is the maximum eigenvalue of the matrix R. Furthermore, for any |a| < 1, E(W_k*) → 0 as k → ∞. Hence under (9), E(W̃_k) → 0. The stationary case where W_k* = W* is even simpler in that (8) becomes a homogeneous difference equation; (9) is then a necessary and sufficient condition for E(W_k) → W*.

A stronger but simpler sufficient condition for the convergence of E(W_k) to W* is μ_max < 2/λ_max(R). This condition is the same as that for the constant step size LMS algorithm. The convergence of the mean weight vector is of course not sufficient to guarantee convergence of the mean-square error. In the next section, we shall derive equations which describe the behavior of the mean-square error.
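To make the definition concrete, here is a minimal sketch of the VSS update (3)-(6) in code. The parameter defaults follow the values quoted above (α = 0.97, γ = 4.8 × 10⁻⁴, μ_0 = μ_max); the function name and the particular μ_min and μ_max values are illustrative assumptions, not prescriptions from the paper.

```python
def vss_lms(data, n_taps, alpha=0.97, gamma=4.8e-4,
            mu_min=1e-4, mu_max=0.1):
    """Variable step size LMS, eqs. (3)-(6):
    eps_k = d_k - X_k^T W_k                      (4)
    W_{k+1} = W_k + mu_k * X_k * eps_k           (3)
    mu'_{k+1} = alpha*mu_k + gamma*eps_k^2       (5)
    mu_{k+1} clipped to [mu_min, mu_max]         (6)"""
    w = [0.0] * n_taps
    mu = mu_max                      # mu_0 = mu_max, as suggested in the text
    errors = []
    for x, d in data:
        eps = d - sum(xi * wi for xi, wi in zip(x, w))         # (4)
        w = [wi + mu * xi * eps for wi, xi in zip(w, x)]       # (3)
        mu_next = alpha * mu + gamma * eps * eps               # (5)
        mu = min(mu_max, max(mu_min, mu_next))                 # (6)
        errors.append(eps)
    return w, errors
```

Only one extra scalar recursion per sample is needed beyond fixed step size LMS, which is the minimal-overhead point made above.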



IV. MEAN-SQUARE ERROR BEHAVIOR

As is the case for the regular LMS algorithm, the covariance of the weight vector is directly related to the mean-square error. We therefore first analyze the covariance of the weight vector. Let W̃_k = W_k − W_k* be the error weight vector. In the nonstationary case, W̃_k satisfies the equation

  W̃_{k+1} = (I − μ_k X_k X_k^T) W̃_k + (1 − a) W_k* − Z_k + μ_k X_k e_k.

Since R, the covariance matrix of X_k, is symmetric, there exist matrices Q and Λ, with Λ diagonal, such that R = Q Λ Q^T and Q^T Q = I. Let V_k = Q^T W̃_k, X'_k = Q^T X_k, W'*_k = Q^T W_k*, and Z'_k = Q^T Z_k. Then

  E(V_{k+1} V_{k+1}^T) = E[(I − μ_k X'_k X'_k^T) V_k V_k^T (I − μ_k X'_k X'_k^T)]
    + (1 − a) E[(I − μ_k X'_k X'_k^T) V_k W'*_k^T]
    + (1 − a) E[W'*_k V_k^T (I − μ_k X'_k X'_k^T)]
    + (1 − a)² E[W'*_k W'*_k^T]
    + E(Z'_k Z'_k^T) + E(μ_k² e_k² X'_k X'_k^T).

To proceed further, we shall make the following additional assumptions.

Assumption 1': The step size μ_k is independent of X_k and V_k.

Assumption 2: The components of V_k are independent, conditionally Gaussian random variables given μ_{k−1}.

Assumption 1' is basically a strengthening of Assumption 1. Justification for Assumption 2 will be discussed at the end of this section. Now, assuming that a is close to 1 so that all terms with (1 − a) can be discarded, we have

  E{V_{k+1} V_{k+1}^T} ≈ E{V_k V_k^T} − Λ E{μ_k V_k V_k^T} − E{μ_k V_k V_k^T} Λ
    + E{μ_k² X'_k X'_k^T V_k V_k^T X'_k X'_k^T} + σ_z² I + E{μ_k² e_k² X'_k X'_k^T}.   (10)

Assume further that μ_{k+1} ≈ μ'_{k+1}, i.e., that the clipping in (6) is rarely active. Then, from (5),

  E(μ_{k+1}) = α E(μ_k) + γ E(ε_k²)   (11)
  E(μ_{k+1}²) = α² E(μ_k²) + 2αγ E(μ_k ε_k²) + γ² E(ε_k⁴).   (12)

We can now use the Gaussian moment factoring theorem (see Appendix A) to simplify some of the expressions in the above equations. We have

  E(μ_k² X'_k X'_k^T V_k V_k^T X'_k X'_k^T) = E(μ_k²) E(X'_k X'_k^T V_k V_k^T X'_k X'_k^T)
    = E(μ_k²)[2Λ E(V_k V_k^T) Λ + Λ tr(Λ E{V_k V_k^T})].   (13)

Putting (13) into (10) yields

  E(V_{k+1} V_{k+1}^T) = E(V_k V_k^T) − Λ E(μ_k) E(V_k V_k^T) − E(μ_k) E(V_k V_k^T) Λ
    + 2 E(μ_k²) Λ E(V_k V_k^T) Λ + E(μ_k²) Λ tr[Λ E(V_k V_k^T)]
    + σ_z² I + E(μ_k²) Λ ξ_min   (14)

where ξ_min = E(e_k²) is the minimum mse. From the definition of ε_k and the independence of e_k,

  E(ε_k²) = E(e_k² − 2 e_k X'_k^T V_k + V_k^T X'_k X'_k^T V_k) = ξ_min + tr(Λ E[V_k V_k^T]).   (15)

In Appendix B, we derive the following approximate expression for E(ε_k⁴):

  E(ε_k⁴) = 3ξ_min² + 6ξ_min tr[Λ E(V_k V_k^T)] + 3{tr[Λ E(V_k V_k^T)]}² + 6 tr{[Λ E(V_k V_k^T)]²}
    − 6 Σ_i E[(1 − μ_{k−1} λ_i)⁴][E(v_i(k − 1))]⁴ λ_i².   (16)

Now let G_k be a vector whose entries are the diagonal elements of E(Λ V_k V_k^T), and let 1 be a column vector of 1's of the same length as G_k. Then, using (14)-(16), we obtain the following equations describing G_k:

  G_{k+1} = [I − 2Λ E(μ_k) + Λ² E(μ_k²)(2I + 1 1^T)] G_k + E(μ_k²) Λ² 1 ξ_min + Λ 1 σ_z²   (17)

  E(μ_{k+1}) = α E(μ_k) + γ(ξ_min + 1^T G_k)   (18)

  E(μ_{k+1}²) = α² E(μ_k²) + 2αγ E(μ_k)(ξ_min + 1^T G_k) + 3γ²(ξ_min + 1^T G_k)² + 6γ² G_k^T G_k
    − 6γ² Σ_i E[(1 − μ_{k−1} λ_i)⁴][E(v_i(k − 1))]⁴ λ_i².   (19)

Note that since, under the assumption E(μ_k) < 2/λ_max(R), E(V_k) → 0 as k → ∞, the last term on the right-hand side of (19) will asymptotically be zero. In [5], the excess mse ξ_ex(k) = ξ(k) − ξ_min is shown to be given by

  ξ_ex(k) = 1^T G_k.   (20)

The mean-square error behavior is now completely described by (17)-(20).

The use of the Gaussian moment factoring theorem is made possible by Assumption 2. Assumption 2 is used only in the evaluation of E(ε_k⁴). Since E(ε_k⁴) is multiplied by the small quantity γ², we need only show that Assumption 2 holds approximately. We shall show that, under the conditions required for the convergence of the mean-square error, Assumption 2 will indeed hold asymptotically for small μ_k. First, the conditionally Gaussian assumption can be justified in the same way as in [11] for small μ_k. Now assume that the first two moments of the step size converge to steady state values:

  E{μ_k} → μ̄,   E{μ_k²} → μ̄₂   as k → ∞.

Then the off-diagonal elements of E(V_k V_k^T) are determined by a homogeneous equation of the form

  E(V_{k+1} V_{k+1}^T)_{ij} = ρ_{ij} E(V_k V_k^T)_{ij}   (21)

where the ρ_{ij} are given by

  ρ_{ij} = 1 − μ̄(λ_i + λ_j) + 2μ̄₂ λ_i λ_j = (1 − μ̄λ_i)(1 − μ̄λ_j) + (2μ̄₂ − μ̄²) λ_i λ_j.   (22)

But

  ρ_{ii} ρ_{jj} − ρ_{ij}² = [(1 − μ̄λ_i)² + (2μ̄₂ − μ̄²)λ_i²][(1 − μ̄λ_j)² + (2μ̄₂ − μ̄²)λ_j²]
    − [(1 − μ̄λ_i)(1 − μ̄λ_j) + (2μ̄₂ − μ̄²)λ_i λ_j]²
    = (2μ̄₂ − μ̄²)[λ_i(1 − μ̄λ_j) − λ_j(1 − μ̄λ_i)]² ≥ 0   (23)

since μ̄₂ ≥ μ̄². In the next section, we show that one of the conditions for convergence of the mse is that ρ_{ii} < 1, i = 1, ..., n. Therefore, under this condition and using (23),

  ρ_{ij}² ≤ ρ_{ii} ρ_{jj} < 1.   (24)

This means that the off-diagonal components of E(V_k V_k^T) decay to zero. Similarly, conditioned on the step size,

  [E(V_{k+1} V_{k+1}^T | μ_k)]_{ij} = [1 − μ_k(λ_i + λ_j) + 2μ_k² λ_i λ_j] E(V_k V_k^T)_{ij},   ∀k   (25)

and so the components of V_k are also conditionally asymptotically uncorrelated.

V. STEADY STATE MISADJUSTMENT

In this section, we examine the performance of our variable step size algorithm. The figure of merit we shall use is the steady state misadjustment M, which is defined to be

  M = ξ_ex / ξ_min

where ξ_ex is the steady state value of ξ_ex(k). Since ξ_ex(k) is given by 1^T G_k, we shall first study the steady state behavior of G_k. The technique of [7] will now be used to derive conditions under which the G_k equation tends to a steady state solution. Define

  ρ_j = 1 − 2μ̄λ_j + 2μ̄₂ λ_j²

and also define

  F = I − 2μ̄Λ + μ̄₂ Λ²(2I + 1 1^T).   (26)

The system matrix in (17) for G_k converges to F as k → ∞. Standard results on stability [12] then show that (17) is exponentially stable if the eigenvalues of F lie strictly inside the unit circle. Now,

  det(F − ζI) = det[diag(ρ_1 − ζ, ..., ρ_n − ζ) + μ̄₂ Λ² 1 1^T].

Evaluating this determinant shows that the eigenvalues of F lie strictly inside the unit circle if and only if

  ρ_j < 1,   j = 1, 2, ..., n   (27)

and

  Σ_{j=1}^n μ̄₂ λ_j² / (2μ̄λ_j − 2μ̄₂ λ_j²) < 1.   (28)

Assuming these two conditions to hold, the solution for G_k converges to G, where G is given by

  G = [2μ̄Λ − μ̄₂ Λ²(2I + 1 1^T)]^{-1} (σ_z² 1 + μ̄₂ Λ² 1 ξ_min).   (29)
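The transient recursions (17)-(19) can also be iterated directly to produce theoretical mse curves of the kind compared with simulation in Section VI. The sketch below is our numerical reading of those equations under stated simplifications: R is taken diagonal with eigenvalues `lam`, the asymptotically vanishing E(V_{k−1}) correction in (19) is dropped, and the clipping (6) is ignored, so the iteration is meaningful only while the predicted step size stays within [μ_min, μ_max]. All names are illustrative.

```python
def predict_moments(lam, xi_min, sigma_z2, alpha, gamma, n_steps,
                    mu0=0.1, v0=1.0):
    """Iterate (17)-(19): G[i] tracks lambda_i * E(v_i^2), together with
    E(mu_k) and E(mu_k^2).  Returns the predicted mse xi_min + 1^T G_k."""
    G = [li * v0 * v0 for li in lam]      # initial weight error v_i(0) = v0
    Emu, Emu2 = mu0, mu0 * mu0
    mse = []
    for _ in range(n_steps):
        s = sum(G)                                    # 1^T G_k
        mse.append(xi_min + s)                        # (15) and (20)
        G_next = [gi - 2 * li * Emu * gi
                  + Emu2 * li * li * (2 * gi + s)     # Lambda^2 (2I + 1 1^T) G
                  + Emu2 * li * li * xi_min
                  + li * sigma_z2
                  for li, gi in zip(lam, G)]          # (17)
        Emu_next = alpha * Emu + gamma * (xi_min + s)              # (18)
        Emu2_next = (alpha * alpha * Emu2
                     + 2 * alpha * gamma * Emu * (xi_min + s)
                     + 3 * gamma * gamma * (xi_min + s) ** 2
                     + 6 * gamma * gamma * sum(gi * gi for gi in G))  # (19)
        G, Emu, Emu2 = G_next, Emu_next, Emu2_next
    return mse, Emu, Emu2
```

Starting from a large initial weight error, the predicted mse decays quickly while E(μ_k) is large and then settles toward ξ_min as the step size shrinks, which is the qualitative behavior claimed for the VSS algorithm.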



Applying the matrix inversion lemma to (29), we obtain

  G = [I + (1 − y)^{-1} Y 1^T][(σ_z²/2)(μ̄Λ − μ̄₂Λ²)^{-1} 1 + Y ξ_min]   (30)

where

  Y = (μ̄₂/2)(μ̄ I − μ̄₂ Λ)^{-1} Λ 1,   y = 1^T Y.

Premultiplying (30) by 1^T and carrying out a little bit of algebra, the following set of equations is obtained:

  ξ_ex = [(σ_z²/2) Σ_{i=1}^n 1/(λ_i(μ̄ − μ̄₂ λ_i)) + y ξ_min]/(1 − y)   (31)

  μ̄ = γ(ξ_min + 1^T G)/(1 − α) = γ(ξ_min + ξ_ex)/(1 − α)   (32)

  μ̄₂ = [2αγ μ̄(ξ_min + 1^T G) + 3γ²(ξ_min + 1^T G)² + 6γ² G^T G]/(1 − α²).   (33)

For small values of misadjustment, 2 G^T G << (ξ_min + 1^T G)², so that

  μ̄₂ ≈ [2αγ μ̄(ξ_min + 1^T G) + 3γ²(ξ_min + 1^T G)²]/(1 − α²).   (34)

The choice of α and γ is clearly important for the convergence of G_k. Here we give a simple sufficient condition on α and γ to guarantee convergence of G_k when ξ_ex ≤ ξ_min, which is the usual situation. Using the results of [7], we see that the following condition is sufficient to guarantee that (27) and (28) are satisfied:

  μ̄₂/μ̄ ≤ 2/(3 tr(R)).   (35)

Combining (32) and (34) gives

  μ̄₂/μ̄ ≈ γ(3 − α)(ξ_min + ξ_ex)/(1 − α²)

since 0 < α < 1. Thus, a sufficient condition on α and γ for the convergence of G_k is

  γ(3 − α)(ξ_min + ξ_ex)/(1 − α²) ≤ 2/(3 tr(R)).   (36)

Using the above equations, we can now derive expressions for the misadjustment.

Stationary Misadjustment: For stationary systems (i.e., σ_z² = 0), we can write ξ_ex as

  ξ_ex = 1^T Y (ξ_min + ξ_ex)   (38)

with Y and y as defined in (30). Then

  ξ_ex = y ξ_min/(1 − y)   (39)

and the misadjustment M can be written as

  M = y/(1 − y).   (40)

Equation (40) does not give an explicit expression for the misadjustment, since y depends on M through μ̄ and μ̄₂. We shall discuss the solution of the nonlinear equation for ξ_ex later in connection with the nonstationary case. However, we note that if μ_k is fixed to be a constant, say 2μ', then y is given by

  y = Σ_{i=1}^n μ'λ_i/(1 − 2μ'λ_i)

which is the same result as that obtained in [5]. Also, we can get rather simple expressions for M, based on approximations valid for small M, as we now show. First, we observe that from the definition of Y, Y satisfies

  2μ̄ Y = μ̄₂ Λ(2Y + 1).

For small values of misadjustment, the components of Y are << 1, so that

  2μ̄ Y ≈ μ̄₂ Λ 1.

We therefore have the following approximate expression for y:

  2μ̄ y ≈ μ̄₂ tr(R).   (41)

Substituting the expression (34) for μ̄₂ and (32) for μ̄, together with the fact that

  ξ_min + 1^T G = ξ_min/(1 − y)

into (41), we find, after a little algebra, that y satisfies the quadratic equation

  2y(1 − y) = γ(3 − α) ξ_min tr(R)/(1 − α²).   (42)

Since y << 1, the correct root to take is

  y = ½[1 − √(1 − 2γ(3 − α) ξ_min tr(R)/(1 − α²))].

Finally, we arrive at the following approximate expression for the misadjustment M, valid for small misadjustments in the stationary case:

  M ≈ γ(3 − α) ξ_min tr(R)/(2(1 − α²)).   (43)
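As a numerical companion to (40)-(43), the sketch below solves the quadratic (42) for y and compares the resulting M = y/(1 − y) with the closed-form approximation (43). It assumes the stationary, small-misadjustment regime in which (41) holds; the function names are our own.

```python
import math

def misadjustment_stationary(lam, xi_min, alpha, gamma):
    """Solve 2y(1 - y) = C for y (eq. (42)) and return M = y/(1 - y)
    (eq. (40)), where C = gamma*(3 - alpha)*xi_min*tr(R)/(1 - alpha^2)."""
    C = gamma * (3 - alpha) * xi_min * sum(lam) / (1 - alpha * alpha)
    y = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 * C))   # root with y << 1
    return y / (1.0 - y)

def misadjustment_approx(lam, xi_min, alpha, gamma):
    """Small-misadjustment approximation (43): M ~= C/2."""
    return gamma * (3 - alpha) * xi_min * sum(lam) / (2 * (1 - alpha * alpha))
```

With α = 0.97, γ = 4.8 × 10⁻⁴, ξ_min = 1, and R = I with n = 4, both expressions give a misadjustment of roughly 0.033-0.035, i.e., a few percent.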




This expression can be used to determine, for a fixed value of α, the value of γ needed to achieve a desired level of misadjustment.

Nonstationary Misadjustment: We focus on the situation where (34) for μ̄₂ is valid. Substituting (32) into (34) gives

  μ̄₂ ≈ γ μ̄ (ξ_min + ξ_ex)(3 − α)/(1 − α²).   (44)

From (31), we have

  ξ_ex = [(σ_z²/2) Σ_{i=1}^n 1/(λ_i(μ̄ − μ̄₂ λ_i)) + y ξ_min]/(1 − y).   (45)

Equations (32), (44), and (45) can be solved by iteration to find ξ_ex. Note that condition (28), required for the stability of the G_k equation, also guarantees convergence of the iteration, starting say at ξ_ex = 0. We can once again note that for a constant step size μ_k = μ and equal eigenvalues, (45) simplifies to the same result as that in [13]. Also, if we assume that μ̄λ_i << 1 and μ̄ tr(R) << 2, (45) reduces to

  ξ_ex ≈ (σ_z²/(2μ̄)) Σ_{i=1}^n λ_i^{-1} + (μ̄₂/(2μ̄)) ξ_min tr(R).   (46)

If we now let μ_k be a constant, say 2μ, the above becomes

  ξ_ex ≈ (σ_z²/(4μ)) Σ_{i=1}^n λ_i^{-1} + μ ξ_min tr(R)   (47)

which is the result given in [1].

Fig. 1. System modeling setup.

VI. SIMULATION RESULTS

In this section, we describe simulations performed to verify the theory developed in the previous sections, and to compare experimentally the performance of the new variable step size (VSS) algorithm with that of the fixed step size (FSS) algorithm and Harris's VS (variable step) algorithm [9]. These simulations are system modeling experiments. In each case, an adaptive filter is placed in parallel with the system to be modeled. This setup is the same as that used in [1], and is shown in Fig. 1. The input x_k is assumed to be an independent, zero mean, Gaussian random sequence with correlation matrix R = I in all the simulations except the one presented in Fig. 8. In that simulation, x_k is a correlated sequence, and we shall supply the details for that case separately later. In addition, white noise with variance ξ_min is added to the output to prevent exact modeling. In order to simulate a nonstationary environment, the weights of the transversal filter to be modeled are determined by the outputs of a bank of low-pass filters driven by white noise with variance σ_z². In the simulations presented here, the number of weights is n = 4. Since time averages as a measure of statistical behavior are not useful in nonstationary situations, an ensemble of filters is simulated and averages are taken over the ensemble. Here, there are 100 filters in the ensemble, each with an independent input sequence.

For completeness, we summarize Harris's algorithm below. Each adaptive weight W_i(k) is adjusted according to the equation

  W_i(k + 1) = W_i(k) + μ_i(k) x_i(k) ε(k),   i = 0, ..., n.

Each of the step sizes μ_i is updated as follows: i) if m_0 consecutive sign changes in x_i(j)ε(j) have occurred, then μ_i(k) = μ_i(k − 1)/α; ii) if m_1 consecutive identical signs have occurred, then μ_i(k) = α μ_i(k − 1), where α > 1. In addition, μ_i is restricted to the range [μ_min, μ_max] to guarantee stability of the algorithm. It should be noted that although the FSS algorithm is the simplest to implement, neither the new VSS algorithm nor Harris's algorithm is significantly more complex to implement.

Fig. 2 shows the behavior of the FSS algorithm and the new VSS algorithm in a stationary environment. In order to show clearly the different mse characteristics in Fig. 2, the data have been smoothed using a first-order low-pass filter and plotted on a linear scale focused on the steady state region. (The mse data presented in later graphs are plotted on a semilog scale without smoothing.) The parameters used in the VSS algorithm are α = 0.97, γ = 4.8 × 10⁻⁴, μ_max = 0.1, and μ_min = 10⁻⁴. The same values for μ_max and μ_min are used in the simulations shown in Figs. 2-7, except in connection with Fig. 3, where μ_max = 8.25 × 10⁻². The value for α appears to be a good choice for all the experiments, while the value for γ was chosen arbitrarily. The step sizes for the two FSS simulations have been chosen to give misadjustment or convergence rate comparable to that of the VSS algorithm. Notice that the misadjustment level of the FSS algorithm with a small step size is achieved, but the convergence rate of the FSS algorithm with a large step size is also achieved
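Our reading of the sign-based rule just summarized can be sketched as follows; this is an illustrative reconstruction for comparison purposes, with hypothetical names and default values, not code from [9].

```python
def harris_step_update(mu, signs, m0, m1, alpha=2.0,
                       mu_min=1e-4, mu_max=0.1):
    """One update of Harris's variable step (VS) rule for a single weight.
    `signs` holds recent signs (+1/-1) of the gradient estimate
    x_i(j)*eps(j), oldest first.  m0 consecutive sign changes shrink mu by
    1/alpha; m1 consecutive identical signs grow it by alpha (alpha > 1);
    mu is then clipped to [mu_min, mu_max]."""
    if len(signs) > m0 and all(signs[-j] != signs[-j - 1]
                               for j in range(1, m0 + 1)):
        mu /= alpha
    elif len(signs) >= m1 and len(set(signs[-m1:])) == 1:
        mu *= alpha
    return min(mu_max, max(mu_min, mu))
```

An alternating gradient-sign history halves the step (for α = 2), while a constant-sign history doubles it, mirroring rules i) and ii).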


Fig. 2. Comparison of smoothed MSE of VSS and two FSS algorithms (step sizes 0.0165 and 0.1).

Fig. 3. Comparison of MSE of VSS, FSS, and Harris's algorithms in stationary environment.

Fig. 5. (a) Comparison of MSE from simulation and theoretical prediction in stationary environment. (b) Comparison of step size from simulation and theoretical prediction in stationary environment.

Fig. 4. Comparison of MSE of VSS, FSS, and Harris's algorithms in nonstationary environment.

by the VSS algorithm. The VSS algorithm has reduced the tradeoff between misadjustment and convergence rate.

Fig. 3 compares the VSS algorithm with Harris's algorithm and the FSS algorithm. Notice that both the VSS algorithm and Harris's algorithm provide much faster convergence than the FSS algorithm for the same level of misadjustment. The parameters used in Harris's algorithm are α = 2, m_0 = 4, m_1 = 5. The parameters for the other two algorithms are the same as those in Fig. 2.

Fig. 4 shows the behavior of the three algorithms in a nonstationary environment. The fixed step size μ and the α and γ parameters for the VSS algorithm have been chosen to give minimum misadjustment. The parameters used for the VSS algorithm are α = 0.97 and γ = 7.65 × 10⁻⁴. Since there is no theoretical analysis of the misadjustment in Harris's algorithm, there is no guidance available for choosing its design parameters; for this simulation, they are chosen to be α = 2, m_0 = 3, and m_1 = 3. As may be seen, the VSS algorithm performs better than, or at least as well as, the other two algorithms.

Table I shows that the new algorithm is also less sensitive to variations in the level of nonstationarity than the FSS algorithm. Optimal parameters for both the FSS and VSS algorithms are calculated for a given level of nonstationarity. The VSS algorithm gives a lower misadjustment than the FSS algorithm when the level of nonstationarity increases. The parameters used for the VSS algorithm are α = 0.97 and γ = 7.65 × 10⁻⁴; the FSS algorithm has μ = 0.029. However, the VSS algorithm is more sensitive to increases in the level of ξ_min. The FSS algorithm maintains a constant level of misadjustment independent of the value of ξ_min, while the misadjustment of the VSS algorithm increases and decreases with ξ_min. This is to be expected, since μ̄ for the VSS algorithm increases with ξ_min, resulting in a larger value of the misadjustment. If desired, the parameter γ can be decreased to reduce the level of misadjustment. These results are also shown in Table I, where the parameters used are α = 0.97, γ = 4.8 × 10⁻⁴, and μ = 0.0165.

TABLE I
SENSITIVITY COMPARISON (MISADJUSTMENT IN PERCENT)

  σ_z²      ξ_min   FSS (predicted)   VSS (predicted)   FSS (measured)   VSS (measured)
  0.0001    1.0          7.1               6.9               7.0              6.8
  0.001     1.0         13.9              14.1              14.2             14.4
  0.01      1.0         81.9              64.7              88.3             68.9
  0.0       0.5          3.47              1.7               4.41             2.65
  0.0       1.0          3.47              3.6               4.41             4.51
  0.0       2.0          3.47              8.0               4.41             8.80

Figs. 5(a), (b) and 6(a), (b) compare the theoretical predictions of the VSS algorithm for the mean-square error and the step size, described in the previous sections, to the results of simulations in both stationary and nonstationary environments. It can be seen that our analysis agrees well with the simulation results.

Fig. 6. (a) Comparison of MSE from simulation and theoretical prediction in nonstationary environment. (b) Comparison of step size from simulation and theoretical prediction in nonstationary environment.

One of the main features of the new algorithm is the ability to increase the step size for improved tracking when changes occur in the system to be modeled. Fig. 7 shows the ratio of the VSS mse to that of the FSS algorithm. Where changes occur in the system, we see that the ratio decreases to less than one; for example, when the mse ratio is 0.8, the VSS mse is only 80% of the FSS mse. The results displayed in Fig. 7 clearly show the responsiveness of the VSS algorithm.

Fig. 7. Ratio of MSE of VSS algorithm over FSS algorithm.

In Fig. 8, we show simulation results with a nonwhite input sequence in a stationary environment. The x_k input sequence is generated by passing a white sequence b_k through a first-order low-pass filter, where b_k is a zero mean independent Gaussian sequence with unit variance. The parameters used for Harris's algorithm are α = 2, m_0 = 2, m_1 = 3, μ_max = 0.01, and μ_0 = 0.01. The parameters used in the VSS algorithm are α = 0.97, μ_max = 0.01, and μ_0 = 0.01. The plot has again been smoothed for better contrast. Fig. 8 shows results similar to the previous simulations: the VSS algorithm has faster convergence and lower steady state mse than the FSS algorithm and Harris's algorithm. The difference in convergence speed is



not as dramatic in this example because μ_max is only twice the value of the fixed step size. While Harris's algorithm performed well initially, it was difficult to find values of m_0 and m_1 that would give good steady state results.

Fig. 8. Comparison of MSE of VSS, FSS, and Harris's algorithms in stationary environment with correlated input.

VII. CONCLUSIONS
A new LMS-type algorithm has been introduced which uses a variable step size to reduce the tradeoff between misadjustment and tracking ability of the fixed step size LMS algorithm. The variable step size algorithm also reduces the sensitivity of the misadjustment to the level of nonstationarity. A significant feature of the new algorithm is that approximate formulas can be derived to predict the misadjustment in both stationary and nonstationary environments. These theoretical predictions agree well with simulations of the algorithm. Comparison of the new algorithm with the fixed step size algorithm and another variable step algorithm due to Harris et al. shows that it has superior performance to the fixed step size algorithm, and performs at least as well as Harris's algorithm.

APPENDIX A
GAUSSIAN MOMENT FACTORING THEOREM

For zero mean Gaussian random variables x_i, i = 1, ..., 4, the following result holds:

  E(x_1 x_2 x_3 x_4) = E(x_1 x_2)E(x_3 x_4) + E(x_1 x_3)E(x_2 x_4) + E(x_1 x_4)E(x_2 x_3).

Details can be found in [5]. If x_i has mean x̄_i, then on letting x̃_i = x_i − x̄_i, we have

  E(x_1 x_2 x_3 x_4) = E(x̃_1 x̃_2)E(x̃_3 x̃_4) + E(x̃_1 x̃_3)E(x̃_2 x̃_4) + E(x̃_1 x̃_4)E(x̃_2 x̃_3)
    + x̄_3 x̄_4 E(x̃_1 x̃_2) + x̄_2 x̄_4 E(x̃_1 x̃_3) + x̄_2 x̄_3 E(x̃_1 x̃_4)
    + x̄_1 x̄_4 E(x̃_2 x̃_3) + x̄_1 x̄_3 E(x̃_2 x̃_4) + x̄_1 x̄_2 E(x̃_3 x̃_4)
    + x̄_1 x̄_2 x̄_3 x̄_4.

Applying the zero mean result to a zero mean Gaussian random vector X with E(X X^T) = Λ, Λ = diag(λ_1, λ_2, ..., λ_n), we obtain, for a matrix A,

  E(X X^T A X X^T) = Λ A Λ + Λ A^T Λ + Λ tr(AΛ).
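The zero mean factoring identity is easy to spot-check by Monte Carlo. In the sketch below, four correlated zero-mean Gaussians are built from two independent ones; the mixing coefficients are arbitrary illustrative choices.

```python
import random

def moment_factoring_check(n_samples=200000, seed=0):
    """Compare the sample fourth moment E(x1*x2*x3*x4) against the
    pairwise-factored sum E12*E34 + E13*E24 + E14*E23."""
    rng = random.Random(seed)
    s1234 = 0.0
    pairs = {(0, 1): 0.0, (2, 3): 0.0, (0, 2): 0.0,
             (1, 3): 0.0, (0, 3): 0.0, (1, 2): 0.0}
    for _ in range(n_samples):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # four correlated zero-mean Gaussians from two independent ones
        x = [u, 0.5 * u + v, u - 0.3 * v, v]
        s1234 += x[0] * x[1] * x[2] * x[3]
        for i, j in pairs:
            pairs[(i, j)] += x[i] * x[j]
    m = {k: s / n_samples for k, s in pairs.items()}
    lhs = s1234 / n_samples
    rhs = (m[(0, 1)] * m[(2, 3)] + m[(0, 2)] * m[(1, 3)]
           + m[(0, 3)] * m[(1, 2)])
    return lhs, rhs
```

The sample fourth moment and the pairwise-factored sum agree to within Monte Carlo error.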



APPENDIX B

Noting that

  E(x_i x_j)E(x_l x_m) + E(x_i x_l)E(x_j x_m) + E(x_i x_m)E(x_j x_l)
    = λ_i λ_l δ_{ij} δ_{lm} + λ_i λ_j δ_{il} δ_{jm} + λ_i λ_j δ_{im} δ_{jl}

for the components of X'_k, we find, after some tedious algebra,

  E_c(V_k^T X'_k X'_k^T V_k V_k^T X'_k X'_k^T V_k)
    = 3 Σ_{i,j} E_c(v_i²) E_c(v_j²)(λ_i λ_j + 2 λ_i² δ_{ij}) − 6 Σ_i [E_c(v_i)]⁴ λ_i²   (B.4)

where E_c denotes conditional expectation given the step size sequence. Discarding terms which contain (1 − a), it is readily seen that

  [E_c(v_i(k + 1))]⁴ ≈ (1 − μ_k λ_i)⁴ [E(v_i(k))]⁴

so that

  E(Σ_i [E_c(v_i)]⁴ λ_i²) ≈ Σ_i E[(1 − μ_{k−1} λ_i)⁴] [E(v_i(k − 1))]⁴ λ_i².   (B.5)

Next, we will show that

  E[E_c(v_i²) E_c(v_j²)] ≈ E(v_i²) E(v_j²).   (B.6)

First observe that

  E_c(v_i²(k + 1)) = (1 − 2λ_i μ_k + 2μ_k² λ_i²) E(v_i²(k)) + μ_k² λ_i Σ_j λ_j E(v_j²(k)) + σ_z² + μ_k² λ_i ξ_min.

Since μ_k is small, terms involving cubic or higher powers of μ_k are small compared to μ_k² and will be discarded.

REFERENCES

[I] B. Widrow, J . M. McCool, M. G. Larimore, and C. R. Johnson, Jr.. Stationary and nonstationary learning characteristics of the LMS adaptive filter, Proc. IEEE, vol. 64, pp. 1151-1162, Aug. 1976. 121 B. Widrow, J . R. Glover, Jr., J. M. McCool, J. Kaunitz. C. S . Williams, R. H . Hearn. I. R. Zeidler, E. Dong, Jr., and R . C. Goodlin, Adaptive noise cancelling: Principles and applications, Proc. IEEE, vol. 63. pp. 1692-1716, Dec. 1975. [3] B. Widrow and S. D. Steams, Aduprive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985. [4]M. M. Sandhi and D. A. Berkley, Silencing echoes on the telephone network, Proc. IEEE, vol. 68, pp. 948-963, Aug. 1980. [5] L. L. Horowitz and K. D. Senne, Performance advantage of coniplex LMS for controlling narrow-band adaptive arrays, IEEE Trans. Acousr., Speech, Signal Processing, vol. ASSP-29. pp. 722-736, June 1981. [6] 0 . Macchi and E. Eweda, Second-order convergence analysis of stochastic adaptive linear filtering, IEEE Trans. Automat. Contr., vol. AC-28, pp. 76-85. Jan. 1983. 171 A. Feuer and E. Weinstein, Convergence analysis of LMS filters with uncorrelated Gaussian data, IEEE Trans. Acousr., Speech, Signal Processing, vol. ASSP-33, pp. 222-230, Feb. 1985. [8] E. Eweda and 0. Macchi, Tracking error bounds of adaptive nonstationary filtering, Automarica. vol. 21, pp. 293-302, 1985. [9] R . W. Harris, D. M. Chabries. and F. A. Bishop, A variable step (VS) adaptive filter algorithm, IEEE Trans. Acousr., Speech, Signal Processing, vol. ASSP-34, pp. 309-316, Apr. 1986. [IO] N . J . Bershad, On the optimum gain parameter in LMS adaptation, IEEE Trans. Acoust., Specch. Signal Processing, vol. ASSP-35, pp. 1065-1068, July 1987. [ l l ] N . J . Bershad and L. Z. Qu, On the probability density function of the LMS adaptive filter weights, IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 43-56, Jan. 1989. [I21 G. C. Goodwin, D. J . 
Hill, and Xie Xianya, Stochastic adaptive control for exponentially convergent time-varying systems, SIAM J . C o n f r . Oprirnizafion. vol. 24, pp. 589-603, July 1986. [I31 S. Marcos and 0. Macchi, Tracking capability of the least mean square algorithm: Application to an asynchronous echo canceller, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp, 1570-1578, NOV.1987.

Raymond H. Kwong (S71-M75) was born in


Hong Kong in 1949. He received the S.B., S.M., and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1971, 1972, and 1975, respectively. From 1975 to 1977, he was a visiting Assistant Professor of Electrical Engineering at McGill University and a Research Associate at the Centre de Recherches Mathematiques, Universite de Montreal, Montreal, Canada. Since August 1977, he has been with the Department of Electrical Engineering at the University of Toronto, where he is now Professor. His current research interests are in the areas of estimation and stochastic control, system identification, adaptive signal processing and control, biological signal processing. and neural networks.

E [ E , ( d ( k + l))Ec(u?(k+ 1))l
-

E(uf(k

+ l))E(uP(k + 1))

~ [ E ( P ;) (E(P!J)~Ihih/E(u;(k>)E(o:(k)>
= 4E[(Pk

- E ( P k ) ) ( P k + E(Pk))l
(B. 7)

. xiX / E ( U ; ( ~ ) ) E ( u (k :) ) .
pk

As explained in remarks about assumption 1, Section 111, - E ( p k ) is small when y is small. Since ultimately these expressions are used in the evaluation of E ( E ~ ) , which in turn is multiplied by y 2 , we are justified in concluding (B. 6). Finally, since
E(V$X;X~I/kV$X;X;TVk)
=

Edward W. Johnston was born in Halifax, Nova Scotia, Canada, in 1962 He received the B A.Sc degree from the University of Waterloo in 1986, and the M A Sc degree from the University of Toronto in 1988. He is now with AECL in Mississauga, Ontario.

E[E,( V:X;X;Vk VlX;XLT Vk)]

(B.8)

combining (B.2), (B.4)-(B.6), and (B.8), we obtain (16).

