PREDICTABLE PROCESSES AND WOLD'S DECOMPOSITION

A. Papoulis
Abstract: The concepts of predictability and band-limitedness are reexamined, and a simple proof of Wold's decomposition is presented in the context of mean-square estimation.
I. PREDICTABLE AND BAND-LIMITED PROCESSES

The purpose of this paper is to clarify a number of concepts related to predictability and band-limitedness, and to give a simple proof of Wold's decomposition in the context of linear mean-square (MS) estimation. The paper is essentially tutorial and, unlike most other treatments of this topic, found mainly in the mathematical literature, the development is phrased in a language familiar to most readers of these Transactions.

Consider a real, discrete-time, wide-sense stationary process x[n] with autocorrelation

    R[m] = E{x[n + m] x[n]}                                            (1)

and power spectrum S(z), the z-transform of R[m]. Clearly, S(e^{jω}) is a periodic function with Fourier series coefficients R[m]. Hence,

    E{x²[n]} = R[0] = (1/2π) ∫_{-π}^{π} S(e^{jω}) dω                   (2)

If x[n] is the input to a real, stable, linear system, then the resulting output

    y[n] = Σ_{k=-∞}^{∞} h_k x[n - k]

has power spectrum

    S_y(z) = S(z) H(z) H(z^{-1})

where

    H(z) = Σ_{n=-∞}^{∞} h_n z^{-n}                                     (3)

Hence,

    E{y²[n]} = R_y[0] = (1/2π) ∫_{-π}^{π} S(e^{jω}) |H(e^{jω})|² dω    (4)

We shall say that a process x[n] is band-limited (BL) if

    S(e^{jω}) = 0,    ω ∈ D                                            (5)

where D is a set consisting of one or more intervals (a set of positive measure). An extreme case of a BL process is a process whose spectrum consists of lines only:

    S(e^{jω}) = Σ_i α_i δ(ω - ω_i)                                     (6)

In this case, the complement D̄ of D is a countable set of points ω_i.

We show next that the values of a BL process x[n] for n from -∞ to ∞ are linearly dependent. For this purpose, we construct a continuous bounded function f(ω), not identically zero, such that

    f(ω) = 0,    ω ∉ D                                                 (7)

Hence [see (5)],

    f(ω) S(e^{jω}) ≡ 0                                                 (8)

Clearly, f(ω) is a periodic function; it can therefore be expanded into a Fourier series

    f(ω) = Σ_{n=-∞}^{∞} c_n e^{-jnω}                                   (9)

The function f(ω) specifies a linear system with delta response c_n. We maintain that, if x[n] is the input to this system, the resulting output y[n] is identically zero¹:

    y[n] = Σ_{k=-∞}^{∞} c_k x[n - k] ≡ 0                               (10)

Indeed [see (4) and (8)],

    E{y²[n]} = (1/2π) ∫_{-π}^{π} S(e^{jω}) |f(ω)|² dω = 0              (11)

and (10) results.

¹All identities in this paper involving random variables will be interpreted in the MS sense.
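As an illustration of (10), consider the extreme case (6) with a single spectral line at ω₀: a random-phase sinusoid x[n] = A cos(ω₀ n + φ). Its samples satisfy x[n] - 2 cos(ω₀) x[n - 1] + x[n - 2] = 0 exactly, so the constants of (10) can be taken as c₀ = 1, c₁ = -2 cos ω₀, c₂ = 1. The following minimal numerical sketch (Python, with arbitrarily chosen amplitude and frequency) checks this dependence.

    import numpy as np

    rng = np.random.default_rng(0)
    w0 = 1.3                                   # line frequency (arbitrary)
    A = 2.0
    phi = rng.uniform(0, 2 * np.pi)            # random phase makes the sinusoid WSS
    n = np.arange(200)
    x = A * np.cos(w0 * n + phi)

    # c0 = 1, c1 = -2 cos(w0), c2 = 1 annihilate the process, as in (10)
    y = x[2:] - 2 * np.cos(w0) * x[1:-1] + x[:-2]
    print(np.max(np.abs(y)))                   # of the order of machine precision

Rearranged, the same relation expresses x[n] through its two preceding values only, which is the form (20) discussed below.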
Conversely, if the values of a process x[n] are linearly dependent, i.e., if there exists a set of constants c_k such that the sum y[n] in (10) is zero, then the process x[n] is BL. To prove the above, we form a periodic function f(ω) as in (9) with Fourier series coefficients c_k. This function is different from zero in a set D of positive measure because the constants c_k are not all zero. Furthermore, the process y[n] is the output of a system with input x[n] and system function f(ω). And since y[n] ≡ 0 by assumption, it follows from (11) that S(e^{jω}) f(ω) = 0 almost everywhere. This is possible only if S(e^{jω}) = 0 in D.

We have, thus, shown that the values of a process x[n] are linearly dependent iff this process is BL. We can assume, introducing a shift if necessary, that c₀ ≠ 0. It then follows from (10) that

    x[n] = -(1/c₀) Σ_{k≠0} c_k x[n - k]                                (12)

Thus, the present value x[n] of a BL process can be expressed in terms of its past and future values. This representation is not, of course, unique because the function f(ω) is arbitrary, subject only to the condition (8). We shall presently show that a BL process can be approximated arbitrarily closely by a sum involving only its past values. Furthermore, if S(e^{jω}) consists of lines only as in (6), then x[n] equals such a sum.

Regular Processes

We shall say that a process x[n] is regular if

    S(z) = L(z) L(z^{-1})                                              (13)

where L(z) is a function analytic for |z| > 1:

    L(z) = Σ_{n=0}^{∞} l_n z^{-n}                                      (14)

It can be shown that a process x[n] is regular iff it satisfies the Paley-Wiener condition [2], [3]

    ∫_{-π}^{π} |ln S(e^{jω})| dω < ∞                                   (15)

From (15) it follows that, if a process is regular, it cannot be band-limited.²

²If S(e^{jω}) = 0 in a set of positive measure, then |ln S(e^{jω})| is not integrable and (15) is violated.

Predictable Processes

Given a process x[n], we form the sum

    x̂[n] = Σ_{k=1}^{∞} a_k x[n - k]                                    (16)

and the MS value

    P = E{ε²[n]}                                                       (17)

of the error ε[n] = x[n] - x̂[n]. If a set of constants a_k exists such that P = 0, then x̂[n] is called the predictor of x[n]. The error ε[n] is the output of the error filter

    E(z) = 1 - Σ_{k=1}^{∞} a_k z^{-k}                                  (18)

with input x[n]. Hence [see (4)],

    P = E{ε²[n]} = (1/2π) ∫_{-π}^{π} S(e^{jω}) |E(e^{jω})|² dω         (19)

A process x[n] is predictable if P = 0 or, equivalently, if

    x[n] = Σ_{k=1}^{∞} a_k x[n - k]                                    (20)

Theorem 1: A process x[n] is predictable iff its spectrum consists of lines:

    S(e^{jω}) = Σ_i α_i δ(ω - ω_i)                                     (21)

Proof: (a) Sufficiency: Suppose that S(e^{jω}) equals the above sum. We form the polynomial

    E(z) = Π_i (1 - z^{-1} e^{jω_i}) = 1 - Σ_k a_k z^{-k}              (22)

Clearly,

    S(e^{jω}) |E(e^{jω})|² = Σ_i α_i |E(e^{jω_i})|² δ(ω - ω_i) = 0     (23)

If, therefore, we use in (16) the coefficients a_k in (22), the resulting MS error P is zero [see (19)].

(b) Necessity: Suppose that P = 0. Since S(e^{jω}) ≥ 0, it follows from (19) that

    S(e^{jω}) |E(e^{jω})|² = 0                                         (24)

The function E(z) is a power series as in (14); therefore, E(e^{jω}) cannot equal zero for every ω in an interval (Paley-Wiener condition). Thus, (24) can hold only if S(e^{jω}) = 0 everywhere except at a countable set of points. From this it follows that S(e^{jω}) must be a sum of impulses as in (21), where e^{jω_i} are the zeros of E(z) on the unit circle.

Note: The predictor (20) of a predictable process x[n] is not unique. Indeed, if E(z) is the polynomial in (22) and E₁(z) is an arbitrary function analytic for |z| ≥ 1 such that E₁(∞) = 1, then [see (19) and (21)] the product E(z)E₁(z) is also an error filter. Hence, the coefficients of the function

    1 - E(z) E₁(z)                                                     (25)

specify a family of predictors of x[n] of the form (20).
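Theorem 1 and the construction (22) are easy to verify numerically. The sketch below (Python, with two arbitrarily chosen line frequencies) places the zeros of E(z) at e^{±jω_i}, reads off the predictor coefficients a_k, and checks that the prediction error for a two-line process is zero up to round-off.

    import numpy as np

    rng = np.random.default_rng(1)
    w = np.array([0.7, 2.1])                          # line frequencies (arbitrary)
    n = np.arange(500)
    x = sum(np.cos(wi * n + rng.uniform(0, 2 * np.pi)) for wi in w)

    # E(z) = prod_i (1 - e^{jw_i} z^{-1}) with zeros on the unit circle, as in (22)
    zeros = np.concatenate([np.exp(1j * w), np.exp(-1j * w)])
    e = np.real(np.poly(zeros))                        # [1, -a_1, ..., -a_4]
    a = -e[1:]                                         # predictor coefficients of (20)

    m = len(a)
    xhat = np.array([a @ x[k - m:k][::-1] for k in range(m, len(x))])
    print(np.max(np.abs(x[m:] - xhat)))                # ~0: P = 0, x[n] is predictable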
If S(e^{jω}) is not a sum of impulses as in (21), E{ε²[n]} cannot be zero. We show next, however, that it can be arbitrarily small if the process x[n] is BL.

Definition 2: Suppose that in (17) the minimum of the MS error does not exist. In this case, we define the MS prediction error as the greatest lower bound
    P̄ = inf E{ε²[n]}                                                   (26)

where the infimum is over the coefficients a_k in (16). A process x[n] will be called weakly predictable if P̄ = 0 or, equivalently, if the difference

    x[n] - Σ_{k=1}^{∞} a_k x[n - k]                                    (27)

can be made arbitrarily small in the MS sense. We note that, if the minimum P in (17) exists, then P̄ = P. Thus, if a process is predictable, then it is also weakly predictable. A process x[n] will be called unpredictable if it is not predictable or weakly predictable.

Theorem 2: A process x[n] is weakly predictable iff it is band-limited.

Proof: (a) Sufficiency: We must show that, if S(e^{jω}) = 0 for ω ∈ D, then, given ε > 0, we can find a set of constants a_k such that E{ε²[n]} < ε. For this purpose, we form a continuous even function p(ω) consisting of straight-line segments as in Fig. 1. The horizontal low-level segments are in the complement D̄ of the set D and their height equals 2δ; the horizontal high-level segments are in a subset G of D and their height equals A; the total length of the set G equals θ. Thus,

    δ < p(ω) < 3δ,    ω ∈ D̄                                            (28)

In the above, δ is an arbitrary positive constant and

    A = δ^{1 - 2π/θ}                                                   (29)

Denoting by I the average of ln p(ω), we conclude from the above and (29) that

    I = (1/2π) ∫_{-π}^{π} ln p(ω) dω > (1/2π) [θ ln A + (2π - θ) ln δ] = 0    (30)

Since p(ω) > 0, it follows from the Fejér-Riesz theorem [3] that we can find a Hurwitz polynomial

    Ê(z) = Σ_{n=0}^{m} b_n z^{-n}                                      (31)

such that

    e^{-I} p(ω) = |Ê(e^{jω})|²                                         (32)

where [see (48) and (32)]

    b_0² = exp{ (1/2π) ∫_{-π}^{π} ln [e^{-I} p(ω)] dω } = 1            (33)

Thus, Ê(z) is an error filter as in (18) with a_n = -b_n; hence [see (19)],

    E{ε²[n]} = (1/2π) ∫_{-π}^{π} S(e^{jω}) |Ê(e^{jω})|² dω
             = (e^{-I}/2π) ∫_{-π}^{π} S(e^{jω}) p(ω) dω < 3δ E{x²[n]}  (34)

because S(e^{jω}) = 0 for ω ∈ D, p(ω) < 3δ on D̄, and e^{I} > 1. And since δ is arbitrary, (27) results.

(b) Necessity: If (27) is true, then, for any ε > 0, there is an error filter with frequency response Σ_n c_n e^{-jnω} for which the MS value of the difference (27) is less than ε. Reasoning as with (10) and (11), this is possible only if S(e^{jω}) = 0 in a set of positive measure, that is, only if x[n] is BL.
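Theorem 2 can also be illustrated numerically. In the sketch below (Python), white noise is passed through a long FIR lowpass filter with cutoff π/2, so the resulting process is only approximately band-limited, and finite-order predictors are fitted to a sample record by least squares rather than obtained from (16); with these caveats, the normalized prediction error is seen to fall rapidly as the order grows, as the theorem suggests.

    import numpy as np

    rng = np.random.default_rng(2)

    # Approximately band-limited process: white noise through a windowed-sinc
    # lowpass with cutoff pi/2 (the stopband is small but not exactly zero)
    M = 201
    k = np.arange(M) - (M - 1) / 2
    h = 0.5 * np.sinc(k / 2) * np.hamming(M)
    x = np.convolve(rng.standard_normal(20000), h, mode="valid")

    for N in (2, 8, 32):
        # least-squares fit of x[n] on x[n-1], ..., x[n-N] over the record
        X = np.column_stack([x[N - k : len(x) - k] for k in range(1, N + 1)])
        y = x[N:]
        a, *_ = np.linalg.lstsq(X, y, rcond=None)
        print(N, np.mean((y - X @ a) ** 2) / np.mean(y ** 2))

The error does not reach zero for any finite order; it is only the greatest lower bound (26) that vanishes.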
The predictor

    x̂[n] = Σ_{k=1}^{∞} a_k x[n - k]                                    (35)

of a process x[n] can be considered as the projection of its present value on the space spanned by its past. To determine the coefficients a_k, we can, therefore, use the projection theorem: the MS prediction error is minimum if the error ε[n] = x[n] - x̂[n] is orthogonal to the data. This yields

    E{ (x[n] - Σ_{k=1}^{∞} a_k x[n - k]) x[n - m] } = 0,    m ≥ 1

or, equivalently,

    R[m] = Σ_{k=1}^{∞} a_k R[m - k],    m ≥ 1                          (36)

This is the discrete-time form of the Wiener-Hopf equation. We note for later use that the error ε[n] is white noise with power spectrum

    S_εε(z) = P                                                        (37)

Indeed, ε[n - k] depends linearly on x[n - k] and its past; hence [see (36)], it is orthogonal to ε[n] for k ≥ 1.

To determine the coefficients a_k in (35), we must solve the system (36). We show next that, if x[n] is a regular process, then (36) can be solved simply with the method of innovations. This method is an extension of the Gram-Schmidt orthonormalization to the space spanned by x[n] and its past.
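When the sum (35) is truncated to a finite order N, (36) becomes N linear equations in a₁, ..., a_N. As a small check (Python, assuming the autocorrelation R[m] = a^{|m|} of a first-order autoregressive process; the choice a = 0.8 is arbitrary), the sketch below solves them and recovers the known answer a₁ = a, a_k = 0 for k > 1, with MS error P = R[0] - Σ a_k R[k] = 1 - a².

    import numpy as np

    a, N = 0.8, 6
    R = a ** np.arange(N + 1)                 # R[m] = a^|m| for an AR(1) process

    # finite-order form of (36): sum_k a_k R[m - k] = R[m],  m = 1, ..., N
    T = np.array([[R[abs(m - k)] for k in range(1, N + 1)] for m in range(1, N + 1)])
    coef = np.linalg.solve(T, R[1:N + 1])

    P = R[0] - coef @ R[1:N + 1]              # MS error of the order-N predictor
    print(np.round(coef, 6))                  # [0.8, 0, 0, 0, 0, 0]
    print(P, 1 - a ** 2)                      # both equal 0.36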
Innovations

As we noted, a process x[n] is regular if its spectrum can be factored as in (13), where L(z) is a function analytic for |z| > 1. Without loss of generality, we can assume also that L(z) is minimum-phase, i.e., that it and its inverse

    Γ(z) = 1 / L(z)

are analytic for |z| > 1. Indeed, if z_i are the zeros of L(z) outside the unit circle, then, replacing all factors z - z_i of L(z) by the factors z z_i* - 1, we obtain a minimum-phase function satisfying (13), because |e^{jω} - z_i| = |e^{jω} z_i* - 1|. Thus, the functions L(z) and Γ(z) can be expanded into power series

    L(z) = Σ_{n=0}^{∞} l_n z^{-n},    Γ(z) = Σ_{n=0}^{∞} γ_n z^{-n}    (38)

convergent for |z| > 1. They specify, therefore, two linear causal systems with delta responses l_n and γ_n, respectively. The system L(z) will be called the innovations filter and the system Γ(z) the whitening filter. Using as input to the system Γ(z) the process x[n], we obtain as output another process

    i[n] = Σ_{k=0}^{∞} γ_k x[n - k]                                    (39)

with power spectrum

    S_ii(z) = S(z) Γ(z) Γ(z^{-1})                                      (40)

Hence [see (13)],

    S_ii(z) = S(z) / [L(z) L(z^{-1})] = 1                              (41)

Thus, i[n] is unit-power white noise. As we see from Fig. 2, if i[n] is the input to the innovations filter L(z), then the resulting response equals x[n]. Thus,

    x[n] = Σ_{k=0}^{∞} l_k i[n - k]                                    (42)

From (39) and (42) it follows that the processes x[n] and i[n] are linearly equivalent in the sense that each is linearly dependent on the other and its past. This shows that the predictor x̂[n] of x[n] can be expressed in terms of the past of i[n]. We maintain, in fact, that

    x̂[n] = Σ_{k=1}^{∞} l_k i[n - k]                                    (43)

Indeed, the resulting error equals

    ε[n] = x[n] - x̂[n] = l_0 i[n]                                      (44)

This error is orthogonal to i[n - k] for k ≥ 1 because i[n] is white noise. Hence (orthogonality principle), x̂[n] is the predictor of x[n].

We have, thus, expressed x̂[n] in terms of the innovations of x[n]. As we see from (42) and (43), x̂[n] is the output of the system

    L̂(z) = L(z) - l_0                                                  (45)

with input i[n]. To complete the specification of x̂[n], we must express i[n] in terms of x[n] using the whitening filter Γ(z) [Fig. 3(a)]. The resulting system [Fig. 3(b)]

    H(z) = Γ(z) [L(z) - l_0] = 1 - l_0 Γ(z)                            (46)

is the Wiener predictor of x[n]; that is, its response to x[n] equals x̂[n].
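A concrete case: for the AR(1) spectrum S(z) = 1/[(1 - a z^{-1})(1 - a z)], the factorization (13) gives L(z) = 1/(1 - a z^{-1}), so l₀ = 1, Γ(z) = 1 - a z^{-1}, and (46) reduces to H(z) = 1 - Γ(z) = a z^{-1}, i.e., x̂[n] = a x[n - 1]. The short simulation below (Python, a = 0.8 chosen arbitrarily) checks that the whitened sequence i[n] of (39) is unit-power white noise and that the prediction error has MS value l₀² = 1.

    import numpy as np

    rng = np.random.default_rng(3)
    a, N = 0.8, 50000
    w = rng.standard_normal(N)
    x = np.zeros(N)
    for n in range(1, N):                     # AR(1): x[n] = a x[n-1] + w[n]
        x[n] = a * x[n - 1] + w[n]

    i = x[1:] - a * x[:-1]                    # whitening filter Gamma(z) = 1 - a z^{-1}
    print(np.var(i))                          # ~1: unit-power white noise, see (41)
    print(np.corrcoef(i[1:], i[:-1])[0, 1])   # ~0: consecutive samples uncorrelated

    eps = x[1:] - a * x[:-1]                  # error of the predictor H(z) = a z^{-1};
    print(np.mean(eps ** 2))                  # here eps[n] = l0 i[n] with l0 = 1, so
                                              # its MS value is ~1, as in (47) below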
The Kolmogorov-Szegő Error Formula

From (44) it follows that

    P = E{ε²[n]} = l_0²                                                (47)

where P is the MS prediction error. We shall use the above to express P directly in terms of the power spectrum S(e^{jω}) of x[n]. To do so, we show first that

    ln l_0² = (1/2π) ∫_{-π}^{π} ln |L(e^{jω})|² dω                     (48)

With z = e^{jω},

    (1/2π) ∫_{-π}^{π} ln |L(e^{jω})|² dω = (1/2πj) ∮_C ln [L(z) L(z^{-1})] dz/z

where C is the unit circle and the two factors of L(z) L(z^{-1}) contribute equally (substitute z → 1/z). To prove (48), it suffices, therefore, to show that

    (1/2πj) ∮_C ln L(z) dz/z = ln l_0                                  (49)

As we know, L(z) is minimum phase; hence, the function ln L(z) is analytic for |z| ≥ 1. We can, therefore, replace in (49) the circle C with a circle whose radius is arbitrarily large (Cauchy's theorem). And since

    L(z) → l_0    as |z| → ∞

the integral in (49) equals 2πj ln l_0. This yields (49). Comparing (47) with (48) and noting that S(e^{jω}) = |L(e^{jω})|² [see (13)], we obtain the Kolmogorov-Szegő error formula

    P = l_0² = exp{ (1/2π) ∫_{-π}^{π} ln S(e^{jω}) dω }                (50)
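Formula (50) is easy to check numerically. For the MA(1) factorization L(z) = σ(1 + b z^{-1}) with |b| < 1, we have l₀ = σ and S(e^{jω}) = σ²|1 + b e^{-jω}|²; the sketch below (Python, σ and b chosen arbitrarily) evaluates the right side of (50) on a frequency grid and recovers P = l₀² = σ².

    import numpy as np

    sigma, b = 1.5, 0.5                       # L(z) = sigma * (1 + b z^{-1}), l0 = sigma
    w = np.linspace(-np.pi, np.pi, 200000, endpoint=False)
    S = sigma ** 2 * np.abs(1 + b * np.exp(-1j * w)) ** 2

    P = np.exp(np.mean(np.log(S)))            # grid estimate of exp{(1/2pi) int ln S dw}
    print(P, sigma ** 2)                      # both 2.25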
III. WOLD'S DECOMPOSITION [8]-[11]

Wold's decomposition theorem states that an arbitrary unpredictable process x[n] can be written as a sum

    x[n] = x_r[n] + x_p[n]                                             (51)

where x_r[n] is a regular process and x_p[n] is a predictable process orthogonal to x_r[n]. To prove this theorem, we form the predictor x̂[n] of x[n] as in (35) and the normalized prediction error (Fig. 4)

    ξ[n] = (x[n] - x̂[n]) / √P = ε[n] / √P                              (52)

where P is the MS error, P > 0 by assumption. As we know [see (37)], the process ξ[n] is white noise and

    S_ξξ(z) = 1                                                        (53)

We next form the MS estimate of x[n] in terms of ξ[n] and its past. This estimate is a sum

    x_r[n] = Σ_{k=0}^{∞} w_k ξ[n - k]                                  (54)

and the error

    x_p[n] = x[n] - x_r[n]                                             (55)

is orthogonal to the data (orthogonality principle):

    x_p[n] ⊥ ξ[n - k],    k ≥ 0                                        (56)

Furthermore,

    ε[n] ⊥ x[n - k], x_r[n - k], x_p[n - k],    k ≥ 1

because x_r[n] depends linearly on ε[n - k] and its past, ε[n] is white noise, and ε[n] ⊥ x[n - k] for k ≥ 1 (orthogonality principle). Hence [see (54)],

    E{x_p[n + m] x_r[n]} = 0    for all m                              (57)

Thus, the processes x_r[n] and x_p[n] are orthogonal and

    E{x²[n]} = E{x_r²[n]} + E{x_p²[n]}                                 (58)
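Relations (57) and (58) can be seen on a synthetic example in which the decomposition is built directly: a regular AR(1) part plus a random-phase sinusoid (Python sketch, all parameters arbitrary). The sample cross-correlation of the two parts is negligible and their powers add.

    import numpy as np

    rng = np.random.default_rng(4)
    N, a, w0 = 100000, 0.7, 0.9
    w = rng.standard_normal(N)
    xr = np.zeros(N)
    for n in range(1, N):                     # regular part: AR(1)
        xr[n] = a * xr[n - 1] + w[n]
    t = np.arange(N)
    xp = np.sqrt(2) * np.cos(w0 * t + rng.uniform(0, 2 * np.pi))   # predictable part
    x = xr + xp                               # x[n] = xr[n] + xp[n], as in (51)

    print(np.mean(xr * xp))                   # ~0: the two parts are orthogonal (57)
    print(np.mean(x ** 2),
          np.mean(xr ** 2) + np.mean(xp ** 2))          # (58): the powers add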
Since ξ[n] is white noise with unit power [see (53)],

    Σ_{k=0}^{∞} w_k² = E{x_r²[n]}                                      (59)

and this sum is finite [see (58)]. Therefore, the sum

    W(z) = Σ_{n=0}^{∞} w_n z^{-n}                                      (60)

converges for |z| > 1 and it defines a linear causal system. Clearly, if ξ[n] is the input to this system, then the resulting output equals x_r[n]. Hence [see (53) and (3)], the power spectrum of x_r[n] equals

    S_r(e^{jω}) = | Σ_{n=0}^{∞} w_n e^{-jnω} |²                        (61)

We have, thus, shown that the MS estimate x_r[n] of x[n] in terms of ξ[n] and its past is a regular process orthogonal to the error x_p[n]. To complete the proof of the theorem, we must show that x_p[n] is a predictable process. We shall show, in fact, that

    x_p[n] = Σ_{k=1}^{∞} a_k x_p[n - k]                                (62)

where a_k are the coefficients of the predictor (35) of x[n], that is, of the filter

    H(z) = Σ_{m=1}^{∞} a_m z^{-m}                                      (63)

If

    y[n] = x_p[n] - Σ_{k=1}^{∞} a_k x_p[n - k]                         (64)

then it suffices to show that

    E{y²[n]} = 0                                                       (65)

Clearly, y[n] is the output of the error filter

    E(z) = 1 - Σ_{n=1}^{∞} a_n z^{-n} = 1 - H(z)

with input x_p[n] = x[n] - x_r[n]. And since the response of E(z) to x[n] equals ε[n], we conclude that

    y[n] = ε[n] - y_r[n]

where y_r[n] is the response of E(z) to x_r[n]. From this it follows that the process y[n] is the output of a causal filter (see Fig. 5) with input ε[n]. Hence, y[n] is linearly dependent on ε[n] and its past. But y[n] is also orthogonal to ε[n] and its past [see (64) and (56)]; hence, y[n] = 0 in the MS sense.

From the preceding discussion it follows that the spectrum S(e^{jω}) of an unpredictable process x[n] is a sum

    S(e^{jω}) = S_r(e^{jω}) + S_p(e^{jω})                              (66)

where S_r(e^{jω}) is the spectrum of x_r[n] and S_p(e^{jω}) is the spectrum of x_p[n]. The first term is the sum in (61) and the second term is a sum of impulses as in (21).

We have, thus, shown that the processes x[n] and x_p[n] have the same predictors. However, whereas H(z) is the unique predictor of x[n], the process x_p[n] has many predictors [see (25)]. In fact, H(z) is not its simplest predictor. The minimum-degree predictor of x_p[n] is the polynomial

    H_p(z) = 1 - Π_i (1 - z^{-1} β_i)                                  (67)

where β_i are the zeros of E(z) on the unit circle [see (24)].
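For instance, if x_p[n] consists of a single real line at ω₀, the zeros of E(z) on the unit circle include e^{±jω₀} and (67) gives H_p(z) = 2 cos(ω₀) z^{-1} - z^{-2}, a second-order predictor. The sketch below (Python, the same synthetic decomposition as above, parameters arbitrary) checks that this H_p(z) reproduces x_p[n] exactly while, applied to the full process x[n], it leaves the regular part unpredicted.

    import numpy as np

    rng = np.random.default_rng(5)
    N, a, w0 = 100000, 0.7, 0.9
    w = rng.standard_normal(N)
    xr = np.zeros(N)
    for n in range(1, N):                     # regular part: AR(1)
        xr[n] = a * xr[n - 1] + w[n]
    t = np.arange(N)
    xp = np.sqrt(2) * np.cos(w0 * t + rng.uniform(0, 2 * np.pi))   # predictable part
    x = xr + xp

    def ep(s):                                # error filter E_p(z) = 1 - H_p(z)
        return s[2:] - 2 * np.cos(w0) * s[1:-1] + s[:-2]

    print(np.mean(ep(xp) ** 2))               # ~0: H_p predicts xp exactly
    print(np.mean(ep(x) ** 2))                # > 0: the regular part is left over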
REFERENCES

[1] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 2nd ed. New York: McGraw-Hill, 1984.
[2] R. E. A. C. Paley and N. Wiener, "Fourier transforms in the complex domain," Amer. Math. Soc. Colloq. Publ., vol. 19, 1934.
[3] A. Papoulis, Signal Analysis. New York: McGraw-Hill, 1977.
[4] T. Kailath, "An innovations approach to detection and estimation theory," Proc. IEEE, vol. 58, 1970.
[5] N. Wiener, Extrapolation, Interpolation and Smoothing of Stationary Time Series. New York: MIT Press and Wiley, 1950.
[6] G. Szegő, "Orthogonal polynomials," Amer. Math. Soc. Colloq. Publ., 1939.
[7] A. Papoulis, "Maximum entropy and spectral estimation: A review," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, Dec. 1981.
[8] J. L. Doob, Stochastic Processes. New York: Wiley, 1953.
[9] J. Lamperti, Stochastic Processes. New York: Springer, 1977.
[10] M. B. Priestley, Spectral Analysis and Time Series. New York: Academic, 1982.
[11] S. Haykin, Nonlinear Methods of Spectral Analysis. Berlin, Germany: Springer, 1983.