
Lecture 7: Least Squares Adaptive Filtering
N. Tangsangiumvisai, 2102876 Adaptive Signal Processing

Contents
- Revision on the adaptive filtering algorithm
- The Least Squares Method
- The RLS Algorithm
- Computational Complexity
- Performance Analysis
- Comparison between LMS and RLS
- The Fast RLS (FRLS) Algorithms
- Summary

Revision on the adaptive filtering algorithm

Fig.1: Generic form of an adaptive filter. The adaptive filter w processes the input signal x(n) to form an estimate \hat{d}(n) of the desired signal d(n); the error signal is e(n).

The error signal is
e(n) = d(n) - \hat{d}(n | X_n) = d(n) - w^H(n) x(n)

- The adaptive filtering algorithm is employed to control the filter coefficients so as to minimize some cost function.
- The usual form of the update equation is
w(n+1) = w(n) + \Delta w(n)    ... (1)

Revision (II)
If x(n) and d(n) are zero-mean and WSS random processes, the cost function (MSE) at time n is
J(n) = \sigma_d^2 - w^H(n) p - p^H w(n) + w^H(n) R w(n)    ... (2)
At the minimum point of the error-performance surface, w(n) = w_opt; hence the minimum MSE is
J_min = \sigma_d^2 - p^H w_opt    ... (3)
since
w_opt = R^{-1} p    ... (4)
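As a quick numerical companion to Eqs. (2)-(4), the sketch below (my own illustration with real-valued signals, not part of the slides) estimates R and p from synthetic data and computes the Wiener solution and the corresponding minimum MSE; the filter length, noise level and signal model are arbitrary assumptions.

```python
import numpy as np

# Illustrative check of w_opt = R^{-1} p (Eq. 4) on synthetic, real-valued data.
rng = np.random.default_rng(0)
M, N = 4, 20000
w_true = rng.standard_normal(M)                 # assumed "true" system (for the example only)

x = rng.standard_normal(N)                      # white input signal
# Rows of X are tap-input vectors [x(n), x(n-1), ..., x(n-M+1)]
X = np.column_stack([x[M - 1 - k : N - k] for k in range(M)])
d = X @ w_true + 0.01 * rng.standard_normal(len(X))   # desired signal with measurement noise

R = (X.T @ X) / len(d)       # sample estimate of the autocorrelation matrix
p = (X.T @ d) / len(d)       # sample estimate of the cross-correlation vector

w_opt = np.linalg.solve(R, p)        # Wiener solution, Eq. (4)
J_min = np.var(d) - p @ w_opt        # minimum MSE, Eq. (3)
print(w_opt, w_true, J_min)
```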

Revision (III)
Method of steepest descent:
- a recursive method to find the minimum-MSE coefficients (the Wiener solution) at the global minimum
- uses an exact measurement of the gradient vector

The update equation:
w(n+1) = w(n) + \frac{1}{2} \mu ( -\nabla_w J(n) )    ... (5)
where \mu is a positive constant. Since
\nabla_w J(n) = -2 p + 2 R w(n)    ... (6)
the update becomes
w(n+1) = w(n) + \mu ( p - R w(n) ),   n = 0, 1, 2, ...    ... (7)

Revision (IV)
Steepest Descent algorithm: uses the true gradient.
Least Mean Square algorithm: uses a stochastic gradient. The gradient vector is estimated from the available data, i.e. instantaneous estimates of R and p are used,
\hat{R}(n) = x(n) x^H(n),   \hat{p}(n) = x(n) d^*(n)
giving
\hat{\nabla} J(n) = -2 x(n) d^*(n) + 2 x(n) x^H(n) w(n)
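To make the recursion in Eq. (7) concrete, here is a minimal steepest-descent sketch (not from the slides). It assumes R and p are known exactly, which is precisely what distinguishes it from LMS; the example values of R, p and the step size are illustrative.

```python
import numpy as np

def steepest_descent(R, p, mu, n_iter=200):
    """Iterate w(n+1) = w(n) + mu*(p - R w(n)) toward w_opt = R^{-1} p (Eq. 7)."""
    w = np.zeros(len(p))
    for _ in range(n_iter):
        w = w + mu * (p - R @ w)    # exact-gradient update
    return w

# Convergence requires 0 < mu < 2 / lambda_max(R); pick mu accordingly.
R = np.array([[1.0, 0.5], [0.5, 1.0]])
p = np.array([0.7, 0.3])
mu = 1.0 / np.max(np.linalg.eigvalsh(R))
print(steepest_descent(R, p, mu), np.linalg.solve(R, p))   # both approach w_opt
```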

Revision (V)
The update equation of LMS:
w(n+1) = w(n) + \mu x(n) [ d^*(n) - x^H(n) w(n) ] = w(n) + \mu x(n) e^*(n)    ... (8)
The update equation of NLMS:
w(n+1) = w(n) + \frac{\tilde{\mu}}{\delta + \|x(n)\|^2} x(n) e^*(n)    ... (9)
where \delta is a small positive constant.

LMS / NLMS use an instantaneous estimate of the gradient vector, which gives rise to excess MSE (misadjustment).
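The following sketch shows one possible NumPy rendering of the LMS and NLMS updates in Eqs. (8)-(9) for real-valued signals. The function names, the step sizes and the regularization constant delta are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def lms_update(w, x_vec, d, mu):
    """One LMS iteration, Eq. (8), for real-valued data."""
    e = d - w @ x_vec                      # a priori error e(n)
    return w + mu * x_vec * e, e

def nlms_update(w, x_vec, d, mu_bar, delta=1e-6):
    """One NLMS iteration, Eq. (9); delta guards against division by zero."""
    e = d - w @ x_vec
    return w + (mu_bar / (delta + x_vec @ x_vec)) * x_vec * e, e

# Usage sketch: identify an (assumed) unknown 8-tap FIR system from noisy data.
rng = np.random.default_rng(1)
M, N = 8, 5000
w_true = rng.standard_normal(M)
w = np.zeros(M)
x = rng.standard_normal(N)
for n in range(M - 1, N):
    x_vec = x[n - M + 1 : n + 1][::-1]     # [x(n), x(n-1), ..., x(n-M+1)]
    d = w_true @ x_vec + 0.01 * rng.standard_normal()
    w, e = nlms_update(w, x_vec, d, mu_bar=0.5)
print(np.max(np.abs(w - w_true)))          # small residual after convergence
```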


Least Squares Method
Wiener filter theory:
- obtained from ensemble averages
- based on assumptions about the statistics of the input applied to the adaptive filter
Method of Least Squares:
- a model-dependent procedure
- a best fit obtained by minimizing the sum of squares of the differences between real-valued measurements and the points of a curve constructed to fit those measurements
- a deterministic approach: it involves time averages, and so depends on the number of samples used in the computation

Least Squares Method (II)
Two main families of adaptive filtering algorithms:
LMS family
- Gradient Descent method
- approximate minimization of the mean-squared error
- uses ensemble averages of data (instantaneous values)
RLS family
- Least Squares technique
- exact minimization of the sum of squared errors
- uses time averages of data

Least Squares Method (III)
Consider two sets of variables:
input signal:  x(n), x(n-1), ..., x(n-M+1)
desired signal:  d(n)
The desired signal is modeled as
d(n) = \sum_{k=0}^{M-1} w_k x(n-k) + e_m(n)    ... (10)
where e_m(n) is the measurement error and the unknown system is described by
w(n) = [ w_0  w_1  ...  w_{M-1} ]^T    ... (11)

Least Squares Method (IV)
Fig.2: Linear transversal filter model of the unknown system. The inputs x(n), x(n-1), ..., x(n-M+1) are weighted by w_0, w_1, ..., w_{M-1} and summed to give y(n), which is added to the measurement error e_m(n) to produce d(n).

Least Squares Method (V)
The measurement error e_m(n) is an unobservable random variable. It is assumed to be white with zero mean and variance \sigma_m^2, i.e.
E\{ e_m(n) \} = 0,  \forall n    ... (12)
and
E\{ e_m(n) e_m^*(k) \} = \sigma_m^2  for n = k,  and  0  for n \neq k    ... (13)

Least Squares Method (VI)
Fig.3: System identification. The input x(n) drives both the unknown system w(n), which produces d(n), and the adaptive filter \hat{w}(n), which produces \hat{d}(n); the error signal is e(n) = d(n) - \hat{d}(n).
To estimate the unknown parameter w(n) from the tap weights \hat{w}(n) = [ \hat{w}_0  \hat{w}_1  ...  \hat{w}_{M-1} ]^T, the estimated desired signal is
\hat{d}(n) = \sum_{k=0}^{M-1} \hat{w}_k x(n-k)    ... (14)

Least Squares Method (VII)
The adaptive filter employs the linear transversal filter model.
Fig.4: Linear transversal filter model. A tapped delay line (unit delays z^{-1}) produces x(n), x(n-1), ..., x(n-M+1); these are weighted by \hat{w}_0, \hat{w}_1, ..., \hat{w}_{M-1} and summed to form the filter output, which is compared with d(n).

Least Squares Method (VIII)
The error signal is therefore obtained as
e(n) = d(n) - \hat{d}(n)    ... (15)
     = d(n) - \sum_{k=0}^{M-1} \hat{w}_k x(n-k)    ... (16)
Cost function (based on the pre-windowing method): the error energy
J = \sum_{n=t_1}^{t_2} e^2(n)    ... (17)
where the limits t_1 and t_2 depend on the method of data windowing, described next.

Methods for data windowing

Covariance Method
- makes no assumption about the data outside the interval [1, N]
- t_1 = M,  t_2 = N
- the input data matrix is
  [ x(M)    x(M+1)  ...  x(N)
    x(M-1)  x(M)    ...  x(N-1)
    ...
    x(1)    x(2)    ...  x(N-M+1) ]
  (columns correspond to n = M, M+1, ..., N)

Methods for data windowing (II)

Autocorrelation Method
- assumes that the data prior to time n = 1 and after n = N are all zero
- t_1 = 1,  t_2 = N+M-1
- the input data matrix is
  [ x(1)  x(2)  ...  x(M)    ...  x(N)      0         ...  0
    0     x(1)  ...  x(M-1)  ...  x(N-1)    x(N)      ...  0
    ...
    0     0     ...  x(1)    ...  x(N-M+1)  x(N-M+2)  ...  x(N) ]
  (columns correspond to n = 1, 2, ..., N+M-1)

Methods for data windowing (III)

Pre-windowing Method
- assumes that the data prior to time n = 1 are zero
- t_1 = 1,  t_2 = N
- the input data matrix is
  [ x(1)  x(2)  ...  x(M)    ...  x(N)
    0     x(1)  ...  x(M-1)  ...  x(N-1)
    ...
    0     0     ...  x(1)    ...  x(N-M+1) ]
  (columns correspond to n = 1, 2, ..., N)

Methods for data windowing (IV)

Post-windowing Method
- makes no assumption about the data prior to time n = 1, but assumes that the data after n = N are zero
- t_1 = M,  t_2 = N+M-1
- the input data matrix is
  [ x(M)    x(M+1)  ...  x(N)      0         ...  0
    x(M-1)  x(M)    ...  x(N-1)    x(N)      ...  0
    ...
    x(1)    x(2)    ...  x(N-M+1)  x(N-M+2)  ...  x(N) ]
  (columns correspond to n = M, M+1, ..., N+M-1)
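As a cross-check of the windowing conventions above, the sketch below builds the pre-windowed input data matrix with NumPy; the column-per-time-index layout mirrors the slides, and the helper name is my own.

```python
import numpy as np

def prewindowed_data_matrix(x, M):
    """Build the M x N pre-windowed data matrix: column n holds
    [x(n), x(n-1), ..., x(n-M+1)]^T with x(k) = 0 for k < 1."""
    x = np.asarray(x, dtype=float)
    N = len(x)                      # samples x(1) ... x(N), stored at indices 0 ... N-1
    A = np.zeros((M, N))
    for n in range(1, N + 1):       # columns correspond to n = 1, ..., N
        for k in range(M):          # row k holds x(n - k)
            if n - k >= 1:
                A[k, n - 1] = x[n - k - 1]
    return A

print(prewindowed_data_matrix([1, 2, 3, 4, 5], M=3))
# Expected:
# [[1. 2. 3. 4. 5.]
#  [0. 1. 2. 3. 4.]
#  [0. 0. 1. 2. 3.]]
```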

Least Squares Method (IX)
The cost function of the LS method is
J(\hat{w}_0, \hat{w}_1, ..., \hat{w}_{M-1}) = \sum_{n=t_1}^{t_2} e^2(n)    ... (18)
The k-th component of the gradient vector is
\nabla_k J = -2 \sum_{n=t_1}^{t_2} x(n-k) e(n)    ... (19)
Since \nabla_k J = 0, k = 0, 1, ..., M-1 is required at the minimum, the principle of orthogonality follows:
\sum_{n=t_1}^{t_2} x(n-k) e(n) = 0,   k = 0, 1, ..., M-1    ... (20)

Least Squares Method (X)
By substituting Eq.(16) into Eq.(20), the normal equations of a linear least-squares filter are obtained as
\sum_{k=0}^{M-1} \hat{w}_k \sum_{n=t_1}^{t_2} x(n-i) x(n-k) = \sum_{n=t_1}^{t_2} x(n-i) d(n),   i = 0, 1, ..., M-1
Defining the time-averaged autocorrelation function \phi(k, i) = \sum_{n=t_1}^{t_2} x(n-i) x(n-k) and the time-averaged cross-correlation function z(i) = \sum_{n=t_1}^{t_2} x(n-i) d(n), this becomes
\sum_{k=0}^{M-1} \hat{w}_k \phi(k, i) = z(i),   i = 0, 1, ..., M-1    ... (21)
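A direct transcription of the normal equations (21) is sketched below: the time-averaged correlations \phi(k, i) and z(i) are accumulated by explicit sums over n, and the resulting M x M system is solved. The covariance-method limits t_1 = M, t_2 = N and the toy FIR system are assumptions for this illustration.

```python
import numpy as np

def ls_normal_equations(x, d, M):
    """Solve Eq. (21): sum_k w_k * phi(k, i) = z(i), i = 0..M-1,
    using covariance-method limits (1-based n = M..N on the slides,
    0-based n = M-1..N-1 here)."""
    N = len(x)
    phi = np.zeros((M, M))                 # time-averaged autocorrelation phi(k, i)
    z = np.zeros(M)                        # time-averaged cross-correlation z(i)
    for n in range(M - 1, N):
        for i in range(M):
            z[i] += x[n - i] * d[n]
            for k in range(M):
                phi[k, i] += x[n - i] * x[n - k]
    return np.linalg.solve(phi.T, z)       # weights w_0 ... w_{M-1}

# Tiny usage check against a known FIR system (no measurement noise).
rng = np.random.default_rng(6)
w_true = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(500)
d = np.convolve(x, w_true)[: len(x)]
print(np.round(ls_normal_equations(x, d, M=3), 6))   # approximately [0.5, -0.3, 0.1]
```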

Least Squares Method (XI)
Vector notation:
the desired response vector (n x 1)
d(n) = [ d(n)  d(n-1)  ...  d(1) ]^T    ... (22)
the data matrix (n x M)
X(n) = [ x^T(n)
         x^T(n-1)
         ...
         x^T(1) ]
     = [ x(n)    x(n-1)  ...  x(n-M+1)
         x(n-1)  x(n-2)  ...  x(n-M)
         ...
         x(1)    x(0)    ...  x(2-M)  ]    ... (23)
(under the pre-windowing assumption, x(k) = 0 for k < 1)

Least Squares Method (XII)
the parameter vector of the filter (M x 1)
w(n) = [ w_1(n)  w_2(n)  ...  w_M(n) ]^T    ... (24)
the error vector (n x 1)
e(n) = [ e(n)  e(n-1)  ...  e(1) ]^T = d(n) - X(n) w(n)    ... (25)

Least Squares Method (XIII)
There are three possibilities for this set of equations:
(1) under-determined, when n < M
(2) exactly determined, when n = M
(3) over-determined, when n > M
Assume that n > M. If w(n) is chosen so that e(n) is zero, then
e(n) = d(n) - X(n) w(n) = 0,   i.e.   d(n) = X(n) w(n)
Pre-multiplying both sides by X^T(n) gives
X^T(n) d(n) = X^T(n) X(n) w(n)

Least Squares Method (XIV)
Then,
( X^T(n) X(n) ) w(n) = X^T(n) d(n)
w(n) = ( X^T(n) X(n) )^{-1} X^T(n) d(n) = \Phi^{-1}(n) z(n)    ... (26)
where \Phi(n) = X^T(n) X(n) is the time-averaged autocorrelation matrix and z(n) = X^T(n) d(n) is the time-averaged cross-correlation vector. Although X(n) is not a square matrix, its pseudo-inverse ( X^T(n) X(n) )^{-1} X^T(n) exists, provided X^T(n) X(n) is nonsingular.

Least Squares Method (XV)
The orthogonality condition of least-squares estimation states that, at the optimal value of w(n), the error is orthogonal to the data on which the prediction is made:
e^T(n) X(n) = 0^T    ... (27)
Therefore
( d(n) - X(n) w(n) )^T X(n) = 0^T
d^T(n) X(n) = w^T(n) X^T(n) X(n)
and applying the transpose operator on both sides yields
X^T(n) d(n) = X^T(n) X(n) w(n)
w(n) = ( X^T(n) X(n) )^{-1} X^T(n) d(n) = \Phi^{-1}(n) z(n)    ... (28)
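A compact numerical sketch of Eqs. (26)-(28) is shown below: it forms \Phi(n) and z(n) from a data matrix and compares the normal-equation solution with NumPy's least-squares routine. The data are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, M = 100, 4                  # over-determined case: n > M
X = rng.standard_normal((n_samples, M))          # data matrix X(n), n x M
w_true = rng.standard_normal(M)
d = X @ w_true + 0.05 * rng.standard_normal(n_samples)   # desired response vector

Phi = X.T @ X                          # time-averaged autocorrelation matrix
z = X.T @ d                            # time-averaged cross-correlation vector
w_ls = np.linalg.solve(Phi, z)         # normal-equation solution, Eq. (28)

w_lstsq, *_ = np.linalg.lstsq(X, d, rcond=None)  # same solution via the pseudo-inverse
print(np.allclose(w_ls, w_lstsq))      # True (up to numerical precision)
```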

Recursive Least Squares (RLS)
Recursively compute the updated estimate of the tap-weight vector upon the arrival of new data. The RLS cost function requires no statistical information about x(n) or d(n), and is given by
J(n) = \sum_{i=1}^{n} e^2(i)    ... (29)
Note that the length of the data record increases with n.

RLS algorithm (II)
By partitioning the data matrix into the new data vector and the old data matrix,
X(n+1) = [ x^T(n+1) ;  X(n) ]    ((n+1) x M)    ... (30)
the sample autocorrelation matrix of the input (M x M) can now be written as
\Phi(n+1) = X^T(n+1) X(n+1)    ... (31)
          = \sum_{i=1}^{n+1} x(i) x^T(i) = \sum_{i=1}^{n} x(i) x^T(i) + x(n+1) x^T(n+1) = \Phi(n) + x(n+1) x^T(n+1)    ... (32)

RLS algorithm (III)
Matrix Inversion Lemma: if
A = B + C D C^T    ... (33)
then
A^{-1} = B^{-1} - B^{-1} C ( C^T B^{-1} C + D^{-1} )^{-1} C^T B^{-1}    ... (34)
If we select
A = \Phi(n+1),   B = \Phi(n),   C = x(n+1),   D = 1
then
\Phi^{-1}(n+1) = \Phi^{-1}(n) - \frac{ \Phi^{-1}(n) x(n+1) x^T(n+1) \Phi^{-1}(n) }{ x^T(n+1) \Phi^{-1}(n) x(n+1) + 1 }    ... (35)

RLS algorithm (IV)
Define the Kalman gain vector
k(n) = \Phi^{-1}(n) x(n)    ... (36)
and the likelihood variable
\alpha(n+1) = x^T(n+1) \Phi^{-1}(n) x(n+1)    ... (37)
From Eq.(35), we have
\Phi^{-1}(n+1) ( \alpha(n+1) + 1 ) = \Phi^{-1}(n) ( \alpha(n+1) + 1 ) - \Phi^{-1}(n) x(n+1) x^T(n+1) \Phi^{-1}(n)
Post-multiplying both sides by x(n+1) gives
k(n+1) ( \alpha(n+1) + 1 ) = \Phi^{-1}(n) ( \alpha(n+1) + 1 ) x(n+1) - \Phi^{-1}(n) x(n+1) \alpha(n+1)
so that
k(n+1) = \frac{ \Phi^{-1}(n) x(n+1) }{ \alpha(n+1) + 1 }    ... (38)
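As a quick sanity check of Eqs. (33)-(35) (my own illustration, not part of the slides), the snippet below verifies the rank-one form of the matrix inversion lemma numerically for a random \Phi(n) and x(n+1).

```python
import numpy as np

rng = np.random.default_rng(3)
M = 5
B = rng.standard_normal((M, M))
Phi_n = B @ B.T + M * np.eye(M)        # a symmetric positive-definite Phi(n)
x_new = rng.standard_normal(M)         # the new input vector x(n+1)

Phi_next = Phi_n + np.outer(x_new, x_new)          # rank-one update, Eq. (32)

P = np.linalg.inv(Phi_n)                           # Phi^{-1}(n)
alpha = x_new @ P @ x_new                          # likelihood variable, Eq. (37)
P_next = P - np.outer(P @ x_new, x_new @ P) / (alpha + 1.0)   # Eq. (35)

print(np.allclose(P_next, np.linalg.inv(Phi_next)))   # True: the lemma holds
```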

RLS algorithm (V)
Since
\alpha(n+1) = x^T(n+1) \Phi^{-1}(n) x(n+1)   and   k(n+1) = \frac{ \Phi^{-1}(n) x(n+1) }{ \alpha(n+1) + 1 }
Eq.(35) can be rewritten as
\Phi^{-1}(n+1) = \Phi^{-1}(n) - k(n+1) x^T(n+1) \Phi^{-1}(n)    ... (39)

RLS algorithm (VI)
The time-averaged cross-correlation vector is updated as
z(n+1) = \sum_{i=1}^{n+1} d(i) x(i) = z(n) + d(n+1) x(n+1)    ... (40)
To solve
w(n+1) = \Phi^{-1}(n+1) z(n+1)    ... (41)
substituting Eq.(39) and Eq.(40) into Eq.(41) yields
w(n+1) = w(n) + k(n+1) \xi(n+1)    ... (42)
where the a priori estimation error is given by
\xi(n+1) = d(n+1) - x^T(n+1) w(n)    ... (43)
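One possible NumPy transcription of the recursion in Eqs. (37)-(39) and (42)-(43) is sketched below for the growing-window case; the variable names follow the slide notation, and the wrapper function itself is my own.

```python
import numpy as np

def rls_step(w, P, x_new, d_new):
    """One RLS iteration (growing-window case).
    w     : current weight vector w(n)
    P     : current inverse correlation matrix Phi^{-1}(n)
    x_new : new input vector x(n+1)
    d_new : new desired sample d(n+1)
    """
    alpha = x_new @ P @ x_new                 # likelihood variable, Eq. (37)
    k = (P @ x_new) / (alpha + 1.0)           # Kalman gain vector, Eq. (38)
    xi = d_new - x_new @ w                    # a priori error, Eq. (43)
    w = w + k * xi                            # weight update, Eq. (42)
    P = P - np.outer(k, x_new @ P)            # inverse update, Eq. (39)
    return w, P, xi

# Usage: initialise with P = (1/delta) * I (see the RLS initialisation slide)
# and call rls_step(w, P, x_vec, d) once for each new sample.
```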

RLS Initialisation
Assumption on the input data: the pre-windowing method is used, i.e. the data prior to n = 0 are zero.
Re-define
\Phi(n) = \sum_{i=1}^{n} x(i) x^T(i) + \delta I    ... (44)
where \delta is a small positive constant; hence
\Phi^{-1}(0) = \delta^{-1} I    ... (45)

Computational Complexity
The per-iteration cost of RLS is counted, equation by equation, in numbers of multiplications (x) and additions/subtractions (+/-):
\alpha(n+1) = x^T(n+1) \Phi^{-1}(n) x(n+1)
k(n+1) = \Phi^{-1}(n) x(n+1) / ( \alpha(n+1) + 1 )
\xi(n+1) = d(n+1) - x^T(n+1) w(n)
w(n+1) = w(n) + k(n+1) \xi(n+1)
\Phi^{-1}(n+1) = \Phi^{-1}(n) - k(n+1) x^T(n+1) \Phi^{-1}(n)
The matrix-vector products in the first and last lines each require on the order of M^2 multiplications and additions, so the total complexity is O(M^2) operations per iteration.

Non-stationary environment
Previously, a statistically stationary environment was assumed for the RLS algorithm. In a non-stationary environment, the cost function is modified to
J(n) = \sum_{i=1}^{n} \lambda^{n-i} e^2(i)    ... (46)
where \lambda \in (0, 1] is a forgetting factor.

RLS algorithm (VII)
With the exponentially weighted cost function of Eq.(46), the recursion becomes
\alpha(n+1) = x^T(n+1) \Phi^{-1}(n) x(n+1)
k(n+1) = \frac{ \lambda^{-1} \Phi^{-1}(n) x(n+1) }{ 1 + \lambda^{-1} \alpha(n+1) }
\xi(n+1) = d(n+1) - x^T(n+1) w(n)
w(n+1) = w(n) + k(n+1) \xi(n+1)
\Phi^{-1}(n+1) = \lambda^{-1} \Phi^{-1}(n) - \lambda^{-1} k(n+1) x^T(n+1) \Phi^{-1}(n)
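The sketch below assembles the exponentially weighted recursion into a complete filter loop, including the \delta I initialisation of Eq. (45). The function name, the values of lam and delta, and the toy identification scenario are illustrative choices, not values from the lecture.

```python
import numpy as np

def rls_filter(x, d, M, lam=0.99, delta=1e-2):
    """Exponentially weighted RLS with forgetting factor lam and
    regularised initialisation Phi^{-1}(0) = (1/delta) I, Eq. (45)."""
    N = len(d)
    w = np.zeros(M)
    P = np.eye(M) / delta
    xi_hist = np.zeros(N)
    for n in range(N):
        # tap-input vector [x(n), x(n-1), ..., x(n-M+1)], pre-windowed (zeros before n = 0)
        x_vec = np.zeros(M)
        m = min(M, n + 1)
        x_vec[:m] = x[n::-1][:m]
        Px = P @ x_vec
        alpha = x_vec @ Px
        k = (Px / lam) / (1.0 + alpha / lam)          # gain vector
        xi = d[n] - x_vec @ w                         # a priori error, Eq. (43)
        w = w + k * xi                                # weight update, Eq. (42)
        P = (P - np.outer(k, Px)) / lam               # inverse-correlation update
        xi_hist[n] = xi
    return w, xi_hist

# Toy usage: identify an unknown 4-tap FIR system.
rng = np.random.default_rng(4)
w_true = np.array([0.8, -0.4, 0.2, 0.1])
x = rng.standard_normal(3000)
d = np.convolve(x, w_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))
w_hat, _ = rls_filter(x, d, M=4)
print(np.round(w_hat, 3))   # should be close to w_true
```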

Performance Analysis
By writing the desired signal as
d(n) = w_o^T x(n) + e_m(n)    ... (47)
where w_o denotes the true (unknown) system and e_m(n) is the measurement error, with zero mean and variance \sigma_m^2 and independent of the input signal, and by defining the weight-error vector as
\epsilon(n) = w(n) - w_o    ... (48)
the a priori estimation error can be expressed as
\xi(n) = d(n) - w^T(n-1) x(n)
       = e_m(n) - ( w(n-1) - w_o )^T x(n) = e_m(n) - \epsilon^T(n-1) x(n)    ... (49)

Performance Analysis (II)
For convenience, the MSE of RLS is evaluated in terms of the a priori estimation error,
J'(n) = E\{ \xi^2(n) \}    ... (50)
Hence, the MSE of RLS is equal to
J'(n) = \sigma_m^2 + tr\{ R K(n-1) \}    ... (51)
where R = E\{ x(n) x^T(n) \} is the autocorrelation matrix of the input and K(n) = E\{ \epsilon(n) \epsilon^T(n) \} is the weight-error correlation matrix.
Since RLS is based on a deterministic cost function, the steady-state MSE approaches the variance of the measurement error, \sigma_m^2, as n -> \infty (in a stationary environment).
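To illustrate the steady-state behaviour described by Eq. (51), here is an informal numerical check (not part of the slides): with lam = 1 in a stationary environment, the averaged squared a priori error settles near the measurement-noise variance \sigma_m^2. The system, noise level and run length are arbitrary assumptions.

```python
import numpy as np

# Informal check of Eq. (51): the a priori MSE of RLS settles at sigma_m^2.
rng = np.random.default_rng(5)
M, N, sigma_m = 4, 20000, 0.05
w_true = np.array([0.8, -0.4, 0.2, 0.1])
x = rng.standard_normal(N)
d = np.convolve(x, w_true)[:N] + sigma_m * rng.standard_normal(N)

w, P = np.zeros(M), np.eye(M) / 1e-2             # Phi^{-1}(0) = (1/delta) I
xi_sq = np.zeros(N)
for n in range(N):
    x_vec = np.zeros(M)
    m = min(M, n + 1)
    x_vec[:m] = x[n::-1][:m]                     # pre-windowed tap-input vector
    Px = P @ x_vec
    k = Px / (1.0 + x_vec @ Px)                  # gain, Eq. (38)
    xi = d[n] - x_vec @ w                        # a priori error, Eq. (43)
    w, P = w + k * xi, P - np.outer(k, Px)       # Eqs. (42) and (39)
    xi_sq[n] = xi**2

print(np.mean(xi_sq[-5000:]), sigma_m**2)        # both approximately 0.0025
```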

LMS vs. RLS
Convergence rate:
- LMS: on the order of 20M iterations
- RLS: on the order of 2M iterations
- The convergence rate of RLS does not depend on the condition number of the input data, as it does in the case of LMS.
Computational complexity:
- LMS: O(2M)
- RLS: O(M^2)

LMS vs. RLS (II)
Robustness:
- LMS: leakage can be applied, and LMS can operate in fixed-point implementations.
- RLS: numerically sensitive to rounding errors, particularly when \lambda < 1.
Tracking performance:
- LMS: controlled by the step size \mu.
- RLS: controlled by the forgetting factor, through (1 - \lambda).

Fast RLS (FRLS) algorithms
- The aim is to reduce the computational complexity of RLS to a cost linear in the filter length, O(\alpha L) with \alpha > 2, by the use of linear forward and backward prediction.
- Examples:
  - Fast Transversal Filters (FTF)
  - Fast Newton Transversal Filters (FNTF)
  - Fast Quasi-Newton (FQN)
  - Fast LMS/Newton
  - Fast Least Squares (FLS)
- Problem: numerical instability, particularly when \lambda < 1.

Summary
- The cost function of RLS requires no statistical information about the input or desired signals.
- RLS achieves improved performance compared with LMS, but at a higher computational cost.
Next lecture, Lecture 8: Frequency-domain algorithms.
