
IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-33, NO. 4, AUGUST 1985    933

Predictable Processes and Wold's Decomposition: A Review


A. PAPOULIS

Abstract-The concepts of predictability and band-limitedness are reexamined and a simple proof of Wold's decomposition is presented in the context of mean-square estimation.

Manuscript received January 31, 1984; revised November 8, 1984. This work was supported by the Joint Services Technical Advisory Committee under Contract F4620-82-C-0084. The author is with the Polytechnic Institute of New York, Farmingdale, NY 11735.

0096-3518/85/0800-0933$01.00 © 1985 IEEE

I. PREDICTABLE AND BAND-LIMITED PROCESSES

THE purpose of this paper is to clarify a number of concepts related to predictability and band-limitedness, and to give a simple proof of Wold's decomposition in the context of linear mean-square (MS) estimation. The paper is essentially tutorial and, unlike most other treatments of this topic, which appear mainly in the mathematical literature, the development is phrased in a language familiar to most readers of these TRANSACTIONS.

Consider a real, discrete-time, wide-sense stationary process $x[n]$ with autocorrelation

$$R[m] = E\{x[n+m]\,x[n]\}$$

and power spectrum

$$S(e^{j\omega}) = \sum_{m=-\infty}^{\infty} R[m]\,e^{-jm\omega}. \tag{1}$$

Clearly, $S(e^{j\omega})$ is a periodic function with Fourier series coefficients $R[m]$. Hence,

$$E\{x^2[n]\} = R[0] = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,d\omega. \tag{2}$$

If $x[n]$ is the input to a real, stable, linear system, then the resulting output

$$y[n] = \sum_{k=-\infty}^{\infty} h_k\,x[n-k]$$

is a wide-sense stationary process with power spectrum

$$S_y(z) = S(z)\,H(z)\,H(z^{-1}) \tag{3}$$

where

$$H(z) = \sum_{n=-\infty}^{\infty} h_n z^{-n}$$

is the system function. From (3) and (2) it follows that

$$E\{y^2[n]\} = R_y[0] = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,|H(e^{j\omega})|^2\,d\omega. \tag{4}$$

Band-Limited Processes

We shall say that a process $x[n]$ is band limited (BL) if

$$S(e^{j\omega}) = 0, \qquad \omega \in D \tag{5}$$

where $D$ is a set consisting of one or more intervals (a set of positive measure). An extreme case of a BL process is a process whose spectrum consists of lines only:

$$S(e^{j\omega}) = \sum_i \alpha_i\,\delta(\omega - \omega_i). \tag{6}$$

In this case, the complement $\bar{D}$ of $D$ is a countable set of points $\omega_i$.

We show next that the values of a BL process $x[n]$ for $n$ from $-\infty$ to $\infty$ are linearly dependent. For this purpose, we construct a continuous bounded function $f(\omega)$ such that

$$f(\omega) = 0, \qquad \omega \in \bar{D}. \tag{7}$$

Hence [see (5)],

$$f(\omega)\,S(e^{j\omega}) = 0. \tag{8}$$

We next expand $f(\omega)$ into a Fourier series in the interval $(-\pi, \pi)$:

$$f(\omega) = \sum_{n=-\infty}^{\infty} c_n e^{-jn\omega}. \tag{9}$$

The function $f(\omega)$ specifies a linear system with delta response $c_n$. We maintain that, if $x[n]$ is the input to this system, the resulting output $y[n]$ is identically zero¹:

$$y[n] = \sum_{k=-\infty}^{\infty} c_k\,x[n-k] \equiv 0. \tag{10}$$

Indeed, as we see from (4) and (8),

$$E\{y^2[n]\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,|f(\omega)|^2\,d\omega = 0 \tag{11}$$

and (10) results.

¹All identities in this paper involving random variables will be interpreted in the MS sense.

Conversely, if the values of a process $x[n]$ are linearly dependent, i.e., if there exists a set of constants $c_k$ such that the sum $y[n]$ in (10) is zero, then the process $x[n]$ is BL. To prove the above, we form a periodic function $f(\omega)$ as


in (9) with Fourier series coefficients $c_k$. This function is different from zero in a set $D$ of positive measure because the constants $c_k$ are not all zero. Furthermore, the process $y[n]$ is the output of a system with input $x[n]$ and system function $f(\omega)$. And since $y[n] = 0$ by assumption, it follows from (11) that

$$S(e^{j\omega})\,f(\omega) = 0$$

almost everywhere. This is possible only if $S(e^{j\omega}) = 0$ in $D$.

We have, thus, shown that the values of a process $x[n]$ are linearly dependent iff this process is BL. We can assume, introducing a shift if necessary, that $c_0 \neq 0$. It then follows from (10) that

$$x[n] = -\frac{1}{c_0}\sum_{k \neq 0} c_k\,x[n-k]. \tag{12}$$

Thus, the present value $x[n]$ of a BL process can be expressed in terms of its past and future values. This representation is not, of course, unique because the function $f(\omega)$ is arbitrary subject only to the condition (8). We shall presently show that a BL process can be approximated arbitrarily closely by a sum involving only its past values. Furthermore, if $S(e^{j\omega})$ consists of lines only as in (6), then $x[n]$ equals such a sum.

Regular Processes

We shall say that a process $x[n]$ is regular if

$$S(z) = L(z)\,L(z^{-1}) \tag{13}$$

where $L(z)$ is a function analytic for $|z| > 1$:

$$L(z) = \sum_{n=0}^{\infty} l_n z^{-n}. \tag{14}$$

It can be shown that a process $x[n]$ is regular iff it satisfies the Paley-Wiener condition [2], [3]

$$\int_{-\pi}^{\pi} |\ln S(e^{j\omega})|\,d\omega < \infty. \tag{15}$$

From (15) it follows that if a process is regular, it cannot be band-limited.²

²In the mathematical literature, the term "regular" means any process that is not BL. In our definition, we impose the additional condition that the corresponding spectrum does not contain any lines.

Predictable Processes

Given a process $x[n]$, we form the sum

$$\hat{x}[n] = \sum_{k=1}^{\infty} a_k\,x[n-k] \tag{16}$$

and the minimum MS value

$$P = \min_{a_k} E\{\varepsilon^2[n]\} \tag{17}$$

of the error $\varepsilon[n] = x[n] - \hat{x}[n]$. If a set of constants $a_k$ exists such that $P = E\{\varepsilon^2[n]\}$, then $\hat{x}[n]$ is called the predictor of $x[n]$.

The error $\varepsilon[n]$ is the output of the error filter

$$E(z) = 1 - \sum_{k=1}^{\infty} a_k z^{-k} \tag{18}$$

with input $x[n]$. Hence [see (4)],

$$P = E\{\varepsilon^2[n]\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,|E(e^{j\omega})|^2\,d\omega. \tag{19}$$

Definition 1: A process $x[n]$ will be called predictable if $P = 0$ or, equivalently, if

$$x[n] = \sum_{k=1}^{\infty} a_k\,x[n-k]. \tag{20}$$

Theorem 1: A process $x[n]$ is predictable iff its spectrum consists of lines

$$S(e^{j\omega}) = \sum_i \alpha_i\,\delta(\omega - \omega_i). \tag{21}$$

Proof:

(a) Sufficiency-Suppose that $S(e^{j\omega})$ equals the above sum. We form the polynomial

$$E(z) = \prod_i \left(1 - z^{-1} e^{j\omega_i}\right) \equiv 1 - \sum_k a_k z^{-k}. \tag{22}$$

From the above and (21) it follows that

$$S(e^{j\omega})\,E(e^{j\omega}) = \sum_i \alpha_i\,E(e^{j\omega_i})\,\delta(\omega - \omega_i) = 0. \tag{23}$$

If, therefore, we use in (16) the coefficients $a_k$ in (22), the resulting MS error $P$ is zero [see (19)].

(b) Necessity-Suppose that $P = 0$. Since $S(e^{j\omega}) \geq 0$, it follows from (19) that

$$S(e^{j\omega})\,|E(e^{j\omega})|^2 = 0. \tag{24}$$

The function $E(z)$ is a power series as in (14); therefore, $E(e^{j\omega})$ cannot equal zero for every $\omega$ in an interval (Paley-Wiener condition). Thus, (24) can hold only if $S(e^{j\omega}) = 0$ everywhere except at a countable set of points. From this it follows that $S(e^{j\omega})$ must be a sum of impulses as in (21), where $e^{j\omega_i}$ are the zeros of $E(z)$ on the unit circle.

Note: The predictor (20) of a predictable process $x[n]$ is not unique. Indeed, if $E(z)$ is the polynomial in (22) and $E_1(z)$ is an arbitrary function analytic for $|z| \geq 1$ and such that $E_1(\infty) = 1$, then [see (19) and (21)] the product $E(z)E_1(z)$ is also an error filter. Hence, the coefficients of the function


$$1 - E(z)\,E_1(z) \tag{25}$$

specify a family of predictors of $x[n]$ of the form (20).

If $S(e^{j\omega})$ is not a sum of impulses as in (21), $E\{\varepsilon^2[n]\}$ cannot be zero. We show next, however, that it can be arbitrarily small if the process $x[n]$ is BL.

Definition 2: Suppose that in (17) the minimum of the MS error does not exist. In this case, we define the MS prediction error as the greatest lower bound

$$\bar{P} = \inf_{a_k} E\{\varepsilon^2[n]\}. \tag{26}$$

We shall say that a process $x[n]$ is weakly predictable if $\bar{P} = 0$ or, equivalently, if the difference

$$x[n] - \sum_{k=1}^{\infty} a_k\,x[n-k]$$

can be made arbitrarily small. We note that, if the minimum $P$ in (17) exists, then $\bar{P} = P$. Thus, if a process is predictable, then it is also weakly predictable.

A process $x[n]$ will be called unpredictable if it is not predictable or weakly predictable.

Theorem 2: A process $x[n]$ is weakly predictable iff it is band limited.

Proof:

(a) Sufficiency-We must show that, if $S(e^{j\omega}) = 0$ for $\omega \in D$, then, given $\varepsilon > 0$, we can find a set of constants $a_k$ such that

$$E\Big\{\Big|x[n] - \sum_{k=1}^{\infty} a_k\,x[n-k]\Big|^2\Big\} < \varepsilon. \tag{27}$$

For this purpose, we form a continuous even function $q(\omega)$ consisting of straight line segments as in Fig. 1. The horizontal low-level segments are in the complement $\bar{D}$ of the set $D$ and their height equals $2\delta$. The horizontal high-level segments are in a subset $G$ of $D$ and their height equals $A + \delta$. The total length of the set $G$ equals $\theta$. Thus,

$$q(\omega) = \begin{cases} 2\delta, & \omega \in \bar{D} \\ A + \delta, & \omega \in G. \end{cases} \tag{28}$$

Fig. 1.

In the above, $\delta$ is an arbitrary positive constant and

$$A = \delta^{\,1 - 2\pi/\theta}. \tag{29}$$

From the continuity of $q(\omega)$, it follows that (Weierstrass theorem) we can find a function

$$p(\omega) = \sum_{n=-m}^{m} c_n e^{-jn\omega}$$

such that

$$|p(\omega) - q(\omega)| < \delta.$$

From this and (28) it follows that

$$\delta < p(\omega) < 3\delta, \quad \omega \in \bar{D}; \qquad p(\omega) > A, \quad \omega \in G; \qquad p(\omega) > \delta \ \text{elsewhere}.$$

Denoting by $I$ the average of $\ln p(\omega)$, we conclude from the above and (29) that

$$I = \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln p(\omega)\,d\omega > \frac{1}{2\pi}\left[\theta \ln A + (2\pi - \theta)\ln \delta\right] = 0.$$

Since $p(\omega) > 0$, it follows from the Fejér-Riesz theorem [3] that we can find a Hurwitz polynomial

$$E(z) = \sum_{n=0}^{m} b_n z^{-n}$$

such that

$$e^{-I} p(\omega) = |E(e^{j\omega})|^2 \tag{32}$$

where [see (48) and (32)]

$$b_0^2 = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\left[e^{-I} p(\omega)\right] d\omega\right\} = 1.$$

Thus, $E(z)$ is an error filter as in (18) with $a_n = -b_n$; hence [see (19)],

$$E\{\varepsilon^2[n]\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,|E(e^{j\omega})|^2\,d\omega < \frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,p(\omega)\,d\omega < 3\delta R[0] \tag{34}$$

because $S(e^{j\omega}) = 0$ for $\omega \in D$ and $e^{I} > 1$. And since $\delta$ is arbitrary, (27) results.

(b) Necessity-If (27) is true, then

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} S(e^{j\omega})\,|E(e^{j\omega})|^2\,d\omega < \varepsilon$$

for any $\varepsilon > 0$. Since $S(e^{j\omega}) \geq 0$, this is possible only if $x[n]$ is BL.
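The contrast between weakly predictable (band-limited) and regular processes in Theorem 2 can be illustrated numerically. The following is a minimal sketch, not part of the paper: it assumes an ideal low-pass spectrum ($S = 1$ for $|\omega| < \pi/2$, zero elsewhere, so $R[m] = \sin(m\pi/2)/(m\pi)$) as the band-limited example and $R[m] = 0.5^{|m|}$ as a regular example. It uses the Levinson-Durbin recursion, a standard technique swapped in here, to compute the MS error of finite-order predictors that use only the $p$ most recent past values. The names `levinson_errors`, `r_bl`, and `r_reg` are hypothetical.

```python
import numpy as np

def levinson_errors(r, max_order):
    """MS errors P_1..P_max_order of the best order-p one-step predictors,
    computed from the autocorrelation r[0..max_order] by Levinson-Durbin."""
    a = np.zeros(0)              # a_1..a_{p-1} of the current predictor
    P = float(r[0])              # order-0 error is R[0]
    errors = []
    for p in range(1, max_order + 1):
        # reflection coefficient: (R[p] - sum_j a_j R[p-j]) / P_{p-1}
        k = (r[p] - np.dot(a, r[p - 1:0:-1])) / P
        a = np.concatenate([a - k * a[::-1], [k]])
        P *= 1.0 - k * k
        errors.append(P)
    return np.array(errors)

max_order = 12
lags = np.arange(max_order + 1)

# Assumed band-limited example: S = 1 on |w| < pi/2, so R[m] = sin(m pi/2)/(m pi).
r_bl = 0.5 * np.sinc(lags / 2.0)
# Assumed regular example: R[m] = 0.5**|m| (strictly positive spectrum).
r_reg = 0.5 ** lags

bl_errors = levinson_errors(r_bl, max_order)
reg_errors = levinson_errors(r_reg, max_order)

for p in (1, 4, 8, 12):
    print(p, bl_errors[p - 1], reg_errors[p - 1])
```

Under these assumptions the band-limited error keeps shrinking as the order grows, while the error of the regular example stays at a positive floor. A finite order only approximates the infinite sums in the text, so this illustrates the trend rather than proving the limit.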


II. INNOVATIONS AND PREDICTION [4], [5]

The predictor

$$\hat{x}[n] = \sum_{k=1}^{\infty} a_k\,x[n-k] \tag{35}$$

of a process $x[n]$ can be considered as the projection of its present value on the space spanned by its past. To determine the coefficients $a_k$, we can, therefore, use the projection theorem: the MS prediction error is minimum if the error

$$\varepsilon[n] = x[n] - \hat{x}[n]$$

is orthogonal to the data. This yields

$$E\Big\{\Big(x[n] - \sum_{k=1}^{\infty} a_k\,x[n-k]\Big)\,x[n-m]\Big\} = 0, \qquad m \geq 1$$

from which it follows that

$$R[m] = \sum_{k=1}^{\infty} a_k\,R[m-k], \qquad m \geq 1. \tag{36}$$

This is the discrete-time form of the Wiener-Hopf equation.

We note for later use that the error $\varepsilon[n]$ is white noise with power spectrum

$$S_{\varepsilon\varepsilon}(z) = P. \tag{37}$$

Indeed, $\varepsilon[n-k]$ depends linearly on $x[n-k]$ and its past; hence [see (37)] it is orthogonal to $\varepsilon[n]$ for $k \geq 1$.

To determine the coefficients $a_k$ in (35), we must solve the system (36). We show next that, if $x[n]$ is a regular process, then (36) can be solved simply with the method of innovations. This method is an extension of the Gram-Schmidt orthonormalization to the space spanned by $x[n]$ and its past.

Innovations

As we noted, a process $x[n]$ is regular if its spectrum can be factored as in (13), where $L(z)$ is a function analytic for $|z| > 1$. Without loss of generality, we can assume also that $L(z)$ is minimum-phase, i.e., that it and its inverse

$$\Gamma(z) = \frac{1}{L(z)}$$

are analytic for $|z| > 1$. Indeed, if $z_i$ are the zeros of $L(z)$ outside the unit circle, then replacing all factors $z - z_i$ of $L(z)$ by the factors $z z_i^* - 1$, we obtain a minimum-phase function satisfying (13) because

$$|e^{j\omega} - z_i| = |e^{j\omega} z_i^* - 1|.$$

Thus, the functions $L(z)$ and $\Gamma(z)$ can be expanded into power series

$$L(z) = \sum_{n=0}^{\infty} l_n z^{-n}, \qquad \Gamma(z) = \sum_{n=0}^{\infty} \gamma_n z^{-n} \tag{38}$$

convergent for $|z| > 1$. They specify, therefore, two linear causal systems with delta responses $l_n$ and $\gamma_n$, respectively. The system $L(z)$ will be called the innovations filter and the system $\Gamma(z)$ the whitening filter. Using as input to the system $\Gamma(z)$ the process $x[n]$, we obtain as output another process

$$i[n] = \sum_{k=0}^{\infty} \gamma_k\,x[n-k]. \tag{39}$$

This process will be called the innovations of $x[n]$. Since

$$\Gamma(z)\,\Gamma(z^{-1}) = \frac{1}{L(z)\,L(z^{-1})} = \frac{1}{S(z)} \tag{40}$$

it follows from (3) that

$$S_{ii}(z) = S(z)\,\Gamma(z)\,\Gamma(z^{-1}) = 1. \tag{41}$$

Thus, $i[n]$ is unit-power white noise. As we see from Fig. 2, if $i[n]$ is the input to the innovations filter $L(z)$, then the resulting response equals $x[n]$. Thus,

$$x[n] = \sum_{k=0}^{\infty} l_k\,i[n-k]. \tag{42}$$

From (39) and (42) it follows that the processes $x[n]$ and $i[n]$ are linearly equivalent in the sense that each is linearly dependent on the other and its past. This shows that the predictor $\hat{x}[n]$ of $x[n]$ can be expressed in terms of the past of $i[n]$. We maintain, in fact, that

$$\hat{x}[n] = \sum_{k=1}^{\infty} l_k\,i[n-k]. \tag{43}$$

Indeed, the resulting error equals

$$\varepsilon[n] = x[n] - \hat{x}[n] = l_0\,i[n]. \tag{44}$$

This error is orthogonal to $i[n-k]$ for $k \geq 1$ because $i[n]$ is white noise. Hence (orthogonality principle), $\hat{x}[n]$ is the predictor of $x[n]$.

We have, thus, expressed $\hat{x}[n]$ in terms of the innovations of $x[n]$. As we see from (43) and (42), $\hat{x}[n]$ is the output of the system

$$L_0(z) = L(z) - l_0 \tag{45}$$

with input $i[n]$. To complete the specification of $\hat{x}[n]$, we must express $i[n]$ in terms of $x[n]$ using the whitening filter $\Gamma(z)$ [Fig. 3(a)]. The resulting system [Fig. 3(b)]

$$H(z) = \Gamma(z)\,L_0(z) = 1 - \frac{l_0}{L(z)} \tag{46}$$

is the Wiener predictor of $x[n]$, that is, its response to $x[n]$ equals $\hat{x}[n]$.
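The innovations construction above can be checked numerically on a simple regular process. The sketch below is an illustration under stated assumptions, not the paper's own example: it assumes a minimum-phase innovations filter $L(z) = l_0 (1 + b z^{-1})/(1 - \rho z^{-1})$ with hypothetical values $l_0 = 2$, $\rho = 0.8$, $b = 0.5$, and truncates all power series to finitely many terms. It builds the whitening filter $\Gamma(z) = 1/L(z)$ by series inversion, takes the predictor coefficients as $a_k = -l_0 \gamma_k$ (from $H(z) = 1 - l_0 \Gamma(z)$), and verifies that they satisfy the Wiener-Hopf equation (36) with prediction error $l_0^2$, as implied by (44).

```python
import numpy as np

# Hypothetical parameters of the assumed innovations filter
# L(z) = l0 * (1 + b z^-1) / (1 - rho z^-1), with |rho| < 1 and |b| < 1.
l0, rho, b = 2.0, 0.8, 0.5
N = 400   # number of retained terms of l_n (truncation)
K = 60    # number of retained predictor coefficients a_1..a_K

# Delta response l_n of L(z): l_0, then l_n = l0 * (rho + b) * rho**(n-1).
l = np.empty(N)
l[0] = l0
l[1:] = l0 * (rho + b) * rho ** np.arange(N - 1)

def inverse_series(l, n_terms):
    """Coefficients gamma_n of 1/L(z), from sum_k l_k gamma_{n-k} = delta[n]."""
    g = np.zeros(n_terms)
    g[0] = 1.0 / l[0]
    for n in range(1, n_terms):
        g[n] = -np.dot(l[1:n + 1], g[n - 1::-1]) / l[0]
    return g

gamma = inverse_series(l, K + 1)   # whitening filter coefficients
a = -l0 * gamma[1:]                # predictor coefficients a_1..a_K

# Autocorrelation of x[n] = sum_k l_k i[n-k] with unit-power white i[n].
R = np.array([np.dot(l[:N - m], l[m:]) for m in range(K + 1)])

k_idx = np.arange(1, K + 1)
# Residuals of the Wiener-Hopf equation R[m] = sum_k a_k R[m-k], m = 1..5.
residuals = np.array([R[m] - np.dot(a, R[np.abs(m - k_idx)]) for m in range(1, 6)])
# MS prediction error R[0] - sum_k a_k R[k], to be compared with l0**2.
P_check = R[0] - np.dot(a, R[k_idx])

print(residuals, P_check, l0 ** 2)
```

For this assumed filter the inversion has a closed form, $a_k = (\rho + b)(-b)^{k-1}$, which gives an independent check on the recursion.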


Fig. 3.

The Kolmogoroff-Szegö Error Formula

From (44) it follows that

$$P = E\{\varepsilon^2[n]\} = l_0^2. \tag{47}$$

We shall use the above to express $P$ directly in terms of the power spectrum $S(e^{j\omega})$ of $x[n]$. To do so, we show, first, that

$$\ln l_0^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln |L(e^{j\omega})|^2\,d\omega. \tag{48}$$

Proof [6], [7]: Clearly,

$$\int_{-\pi}^{\pi} \ln |L(e^{j\omega})|^2\,d\omega = \frac{1}{j}\oint_C \ln\left[L(z)\,L(z^{-1})\right]\frac{dz}{z}$$

where $C$ is the unit circle. Furthermore,

$$\oint_C \ln L(z^{-1})\,\frac{dz}{z} = \oint_C \ln L(z)\,\frac{dz}{z}.$$

To prove (48), it suffices, therefore, to show that

$$\ln |l_0| = \frac{1}{2\pi j}\oint_C \ln L(z)\,\frac{dz}{z}. \tag{49}$$

As we know, $L(z)$ is minimum phase; hence, the function $\ln L(z)$ is analytic for $|z| \geq 1$. We can, therefore, replace in (49) the circle $C$ with a circle whose radius is arbitrarily large (Cauchy's theorem). And since

$$L(z) \to l_0 \qquad \text{as } |z| \to \infty$$

the integral in (49) equals $2\pi j \ln |l_0|$. This yields (49).

Comparing (47) with (48), we obtain

$$P = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln S(e^{j\omega})\,d\omega\right\}. \tag{50}$$

This result is known as the Kolmogoroff-Szegö MS error formula.

III. WOLD'S DECOMPOSITION [8]-[11]

Wold's decomposition theorem states that an arbitrary unpredictable process $x[n]$ can be written as a sum

$$x[n] = x_r[n] + x_p[n] \tag{51}$$

where $x_r[n]$ is a regular process and $x_p[n]$ is a predictable process orthogonal to $x_r[n]$.

To prove this theorem, we form the predictor $\hat{x}[n]$ of $x[n]$ as in (35), and the normalized prediction error (Fig. 4)

$$\xi[n] = \frac{\varepsilon[n]}{\sqrt{P}} \tag{52}$$

where $P$ is the MS prediction error and $P > 0$ by assumption. As we know [see (37)], the process $\xi[n]$ is white noise and

$$S_{\xi\xi}(z) = 1. \tag{53}$$

We next form the MS estimate of $x[n]$ in terms of $\xi[n]$ and its past. This estimate is a sum

$$x_r[n] = \sum_{k=0}^{\infty} w_k\,\xi[n-k] \tag{54}$$

where the weights $w_k$ are such as to minimize the MS value of the error

$$x_p[n] = x[n] - x_r[n]. \tag{55}$$

From the orthogonality principle it follows that $w_k$ must be such that

$$x_p[n] \perp \xi[n-k], \qquad k \geq 0. \tag{56}$$

Furthermore,

$$\xi[n] \perp x[n-k],\ x_r[n-k],\ x_p[n-k], \qquad k \geq 1$$

because $x_r[n-k]$ depends linearly on $\xi[n-k]$ and its past, $\xi[n]$ is white noise, and $\xi[n] \perp x[n-k]$ for $k \geq 1$ (orthogonality principle). Hence [see (54)],

$$E\{x_r[n+m]\,x_p[n]\} = 0 \qquad \text{all } m. \tag{57}$$

Thus, the processes $x_r[n]$ and $x_p[n]$ are orthogonal and

$$E\{x^2[n]\} = E\{x_r^2[n]\} + E\{x_p^2[n]\}. \tag{58}$$

The above leads to the conclusion that [see (53)]

$$\sum_{k=0}^{\infty} w_k^2 = E\{x_r^2[n]\} \leq E\{x^2[n]\} = R[0] < \infty. \tag{59}$$


Therefore, the sum

$$W(z) = \sum_{n=0}^{\infty} w_n z^{-n} \tag{60}$$

converges for $|z| > 1$ and it defines a linear causal system. Clearly, if $\xi[n]$ is the input to this system, then the resulting output equals $x_r[n]$. Hence [see (53) and (3)], the power spectrum of $x_r[n]$ equals

$$S_r(e^{j\omega}) = \Big|\sum_{n=0}^{\infty} w_n e^{-jn\omega}\Big|^2. \tag{61}$$

We have, thus, shown that the MS estimate $x_r[n]$ of $x[n]$ in terms of $\xi[n]$ and its past is a regular process orthogonal to the error $x_p[n]$. To complete the proof of the theorem, we must show that $x_p[n]$ is a predictable process. We shall show, in fact, that

$$x_p[n] = \sum_{k=1}^{\infty} a_k\,x_p[n-k] \tag{62}$$

where $a_k$ are the coefficients of the predictor

$$H(z) = \sum_{n=1}^{\infty} a_n z^{-n} \tag{63}$$

of $x[n]$. To prove (62), we must show that, if

$$y[n] = x_p[n] - \sum_{k=1}^{\infty} a_k\,x_p[n-k] \tag{64}$$

then

$$E\{y^2[n]\} = 0. \tag{65}$$

Clearly, $y[n]$ is the output of the error filter

$$E(z) = 1 - \sum_{n=1}^{\infty} a_n z^{-n} = 1 - H(z)$$

with input $x_p[n] = x[n] - x_r[n]$. And since the response of $E(z)$ to $x[n]$ equals $\varepsilon[n]$, we conclude that

$$y[n] = \varepsilon[n] - y_r[n]$$

where $y_r[n]$ is the response of $E(z)$ to $x_r[n]$. From this it follows that the process $y[n]$ is the output of the causal filter (see Fig. 5)

$$\sqrt{P} - E(z)\,W(z)$$

with input $\xi[n]$. Hence $y[n]$ is linearly dependent on $\xi[n]$ and its past. But $y[n]$ is also orthogonal to $\xi[n]$ and its past [see (64) and (56)]; hence $y[n] = 0$ in the MS sense.

From the preceding discussion it follows that the spectrum $S(e^{j\omega})$ of an unpredictable process $x[n]$ is a sum

$$S(e^{j\omega}) = S_r(e^{j\omega}) + S_p(e^{j\omega}) \tag{66}$$

where $S_r(e^{j\omega})$ is the spectrum of $x_r[n]$ and $S_p(e^{j\omega})$ is the spectrum of $x_p[n]$. The first term is the sum in (61) and the second term is a sum of impulses as in (21).

We have thus shown that the processes $x[n]$ and $x_p[n]$ have the same predictors. However, whereas $H(z)$ is the unique predictor of $x[n]$, the process $x_p[n]$ has many predictors [see (25)]. In fact, $H(z)$ is not its simplest predictor. The minimum degree predictor of $x_p[n]$ is the polynomial

$$H_p(z) = 1 - \prod_{i=1}^{m} \left(1 - z^{-1} e^{j\omega_i}\right)$$

where $e^{j\omega_i}$ are the zeros of $E(z)$ on the unit circle [see (24)].

REFERENCES

[1] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 2nd ed. New York: McGraw-Hill, 1984.
[2] R. E. A. C. Paley and N. Wiener, "Fourier transforms in the complex domain," Amer. Math. Soc. Coll., vol. 19, 1934.
[3] A. Papoulis, Signal Analysis. New York: McGraw-Hill, 1977.
[4] T. Kailath, "An innovations approach to detection and estimation theory," Proc. IEEE, vol. 58, 1970.
[5] N. Wiener, Extrapolation, Interpolation and Smoothing of Stationary Time Series. New York: MIT Press and Wiley, 1950.
[6] G. Szegö, "Orthogonal polynomials," Amer. Math. Soc. Coll., 1939.
[7] A. Papoulis, "Maximum entropy and spectral estimation: A review," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, Dec. 1981.
[8] J. L. Doob, Stochastic Processes. New York: Wiley, 1953.
[9] J. Lamperti, Stochastic Processes. New York: Springer, 1977.
[10] M. B. Priestley, Spectral Analysis and Time Series. New York: Academic, 1982.
[11] S. Haykin, Nonlinear Methods of Spectral Analysis. Berlin, Germany: Springer, 1983.
