
SPECTRAL ANALYSIS

OF SIGNALS
The Missing Data Case
Copyright 2005 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
Spectral Analysis of Signals, The Missing Data Case
Yanwei Wang, Jian Li, and Petre Stoica
www.morganclaypool.com
ISBN: 1598290002
Library of Congress Cataloging-in-Publication Data
First Edition
10 9 8 7 6 5 4 3 2 1
Printed in the United States of America
SPECTRAL ANALYSIS
OF SIGNALS
The Missing Data Case
Yanwei Wang
Diagnostic Ultrasound Corporation
Bothell, WA 98021
Jian Li
Department of Electrical and Computer Engineering,
University of Florida,
Gainesville, FL 32611, USA
Petre Stoica
Department of Information Technology,
Division of Systems and Control,
Uppsala University,
Uppsala, Sweden
Morgan & Claypool Publishers
ABSTRACT
Spectral estimation is important in many fields including astronomy, meteorology, seismology, communications, economics, speech analysis, medical imaging, radar, sonar, and underwater acoustics. Most existing spectral estimation algorithms are devised for uniformly sampled complete-data sequences. However, spectral estimation for data sequences with missing samples is also important in many applications ranging from astronomical time series analysis to synthetic aperture radar imaging with angular diversity. For spectral estimation in the missing-data case, the challenge is how to extend the existing spectral estimation techniques to deal with these missing data samples. Recently, nonparametric adaptive filtering based techniques have been developed successfully for various missing-data problems. Collectively, these algorithms provide a comprehensive toolset for the missing-data problem based exclusively on the nonparametric adaptive filter-bank approaches, which are robust and accurate, and can provide high resolution and low sidelobes. In this lecture, we present these algorithms for both one-dimensional and two-dimensional spectral estimation problems.
KEYWORDS
Adaptive filter-bank, APES (amplitude and phase estimation), Missing data, Nonparametric methods, Spectral estimation
Contents

1. Introduction
   1.1 Complete-Data Case
   1.2 Missing-Data Case
   1.3 Summary
2. APES for Complete Data Spectral Estimation
   2.1 Introduction
   2.2 Problem Formulation
   2.3 Forward-Only APES Estimator
   2.4 Two-Step Filtering-Based Interpretation
   2.5 Forward–Backward Averaging
   2.6 Fast Implementation
3. Gapped-Data APES
   3.1 Introduction
   3.2 GAPES
       3.2.1 Initial Estimates via APES
       3.2.2 Data Interpolation
       3.2.3 Summary of GAPES
   3.3 Two-Dimensional GAPES
       3.3.1 Two-Dimensional APES Filter
       3.3.2 Two-Dimensional GAPES
   3.4 Numerical Examples
       3.4.1 One-Dimensional Example
       3.4.2 Two-Dimensional Examples
4. Maximum Likelihood Fitting Interpretation of APES
   4.1 Introduction
   4.2 ML Fitting Based Spectral Estimator
   4.3 Remarks on the ML Fitting Criterion
5. One-Dimensional Missing-Data APES via Expectation Maximization
   5.1 Introduction
   5.2 EM for Missing-Data Spectral Estimation
   5.3 MAPES-EM1
   5.4 MAPES-EM2
   5.5 Aspects of Interest
       5.5.1 Some Insights into the MAPES-EM Algorithms
       5.5.2 MAPES-EM1 versus MAPES-EM2
       5.5.3 Missing-Sample Estimation
       5.5.4 Initialization
       5.5.5 Stopping Criterion
   5.6 MAPES Compared With GAPES
   5.7 Numerical Examples
6. Two-Dimensional MAPES via Expectation Maximization and Cyclic Maximization
   6.1 Introduction
   6.2 Two-Dimensional ML-Based APES
   6.3 Two-Dimensional MAPES via EM
       6.3.1 Two-Dimensional MAPES-EM1
       6.3.2 Two-Dimensional MAPES-EM2
   6.4 Two-Dimensional MAPES via CM
   6.5 MAPES-EM versus MAPES-CM
   6.6 Numerical Examples
       6.6.1 Convergence Speed
       6.6.2 Performance Study
       6.6.3 Synthetic Aperture Radar Imaging Applications
7. Conclusions and Software
   7.1 Concluding Remarks
   7.2 Online Software
References
The Authors
Preface
This lecture considers the spectral estimation problem in the case where some of the data samples are missing. The challenge is how to extend the existing spectral estimation techniques to deal with these missing data samples. Recently, nonparametric adaptive filtering based techniques have been developed successfully for various missing-data spectral estimation problems. Collectively, these algorithms provide a comprehensive toolset for the missing-data problem based exclusively on the nonparametric adaptive filter-bank approaches. They provide the main topic of this book.

The authors would like to acknowledge the contributions of several other people and organizations to the completion of this lecture. We are grateful to our collaborators on this topic, including Erik G. Larsson, Hongbin Li, and Thomas L. Marzetta, for their excellent work and support. In particular, we thank Erik G. Larsson for providing us the Matlab codes that implement the two-dimensional GAPES algorithm. Most of the topics described here are outgrowths of our research programs in spectral analysis. We would like to thank those who supported our research in this area: the National Science Foundation, the Swedish Science Council (VR), and the Swedish Foundation for International Cooperation in Research and Higher Education (STINT). We also wish to thank Jose M. F. Moura for inviting us to write this lecture and Joel Claypool for publishing our work.
List of Abbreviations
1-D one-dimensional
2-D two-dimensional
APES amplitude and phase estimation
AR autoregressive
ARMA autoregressive moving-average
CAD computer aided design
CM cyclic maximization
DFT discrete Fourier transform
EM expectation maximization
FFT fast Fourier transform
FIR finite impulse response
GAPES gapped-data amplitude and phase estimation
LS least squares
MAPES missing-data amplitude and phase estimation
MAPES-CM missing-data amplitude and phase estimation via cyclic
maximization
MAPES-EM missing-data amplitude and phase estimation via expectation
maximization
ML maximum likelihood
RCF robust Capon filter-bank
RF radio frequency
RMSEs root mean-squared errors
SAR synthetic aperture radar
WFFT windowed fast Fourier transform
CHAPTER 1
Introduction
Spectral estimation is important in many fields including astronomy, meteorology, seismology, communications, economics, speech analysis, medical imaging, radar, and underwater acoustics. Most existing spectral estimation algorithms are devised for uniformly sampled complete-data sequences. However, spectral estimation for data sequences with missing samples is also important in a wide range of applications. For example, sensor failure or outliers can lead to missing-data problems. In astronomical, meteorological, or satellite-based applications, weather or other conditions may disturb the sample-taking scheme (e.g., measurements are available only during nighttime for astronomical applications), which results in missing or gapped data [1]. In synthetic aperture radar imaging, missing-sample problems arise when the synthetic aperture is gapped to reduce the radar resources needed for the high-resolution imaging of a scene [2–4]. For foliage and ground penetrating radar systems, certain radar operating frequency bands are reserved for applications such as aviation and cannot be used, or they are under strong electromagnetic or radio frequency interference [5, 6] so that the corresponding samples must be discarded; both cases result in missing data. Similar problems arise in data fusion via ultrawideband coherent processing [7].
1.1 COMPLETE-DATA CASE
For complete-data spectral estimation, extensive work has already been carried out in the literature; see, e.g., [8]. The conventional discrete Fourier transform (DFT) or fast Fourier transform based methods have been widely used for spectral estimation tasks because of their robustness and high computational efficiency. However, they suffer from low resolution and poor accuracy. Many advanced spectral estimation methods have also been proposed, including parametric [9–11] and nonparametric adaptive filtering based approaches [12, 13]. One problem associated with the parametric methods is order selection. Even with a properly selected order, it is hard to compare parametric and nonparametric approaches since the parametric methods (except [11]) do not provide complex amplitude estimation. In general, the nonparametric approaches are less sensitive to data mismodeling than their parametric counterparts. Moreover, the adaptive filter-bank based nonparametric spectral estimators can provide high resolution, low sidelobes, and accurate spectral estimates while retaining the robust nature of the nonparametric methods [14, 15]. These include the amplitude and phase estimation (APES) method [13] and the Capon spectral estimator [12].
However, the complete-data spectral estimation methods do not work well in the missing-data case when the missing data samples are simply set to zero. For the DFT-based spectral estimators, setting the missing samples to zero corresponds to multiplying the original data by a windowing function that assumes a value of one whenever a sample is available, and zero otherwise. In the frequency domain, the resulting spectrum is the convolution between the Fourier transform of the complete data and that of the windowing function. Since the Fourier transform of the windowing function typically has an underestimated mainlobe and an extended pattern of undesirable sidelobes, the resulting spectrum will be poorly estimated and will contain severe artifacts. For the parametric and adaptive filtering based approaches, similar performance degradations will also occur.
1.2 MISSING-DATA CASE
For missing-data spectral estimation, various techniques have been developed previously. In [16] and [17], the Lomb–Scargle periodogram is developed for irregularly sampled (unevenly spaced) data. In the missing-data case, the Lomb–Scargle periodogram is nothing but the DFT with the missing samples set to zero. The CLEAN algorithm [18] is used to estimate the spectrum by deconvolving the missing-data DFT spectrum (the so-called dirty map) into the true signal spectrum (the so-called true clean map) and the Fourier transform of the windowing function (the so-called dirty beam) via an iterative approach. Although the CLEAN algorithm works for both missing and irregularly sampled data sequences, it cannot resolve closely spaced spectral lines, and hence it may not be a suitable tool for high-resolution spectral estimation. The multi-taper methods [19, 20] compute spectral estimates by assuming certain quadratic functions of the available data samples. The coefficients in the corresponding quadratic functions are optimized according to certain criteria, but it appears that this approach cannot overcome the resolution limit of the DFT. To achieve high resolution, several parametric algorithms, e.g., those based on autoregressive or autoregressive moving-average models, were used to handle the missing-data problem [21–24]. Although these parametric methods can provide improved spectral estimates, they are sensitive to model errors. Nonparametric adaptive filtering based techniques are promising for the missing-data problem, as we will show later.
1.3 SUMMARY
In this book, we present the recently developed nonparametric adaptive filtering based algorithms for the missing-data case, namely gapped-data APES (GAPES) and the more general missing-data APES (MAPES). The outlines of the remaining chapters are as follows:

Chapter 2: In this chapter, we introduce the APES filter for the complete-data case. The APES filter is needed for the missing-data algorithm developed in Chapter 3.

Chapter 3: We consider the spectral analysis of a gapped-data sequence where the available samples are clustered together in groups of reasonable size.
Following the filter design framework introduced in Chapter 2, GAPES is developed to iteratively interpolate the missing data and to estimate the spectrum. A two-dimensional extension of GAPES is also presented.

Chapter 4: In this chapter, we introduce a maximum likelihood (ML) based interpretation of APES. This framework lays the ground for the general missing-data problem discussed in the following chapters.

Chapter 5: Although GAPES performs quite well for gapped data, it does not work well for the more general problem of missing samples occurring in arbitrary patterns. In this chapter, we develop two MAPES algorithms by using an ML fitting criterion as discussed in Chapter 4. Then we use the well-known expectation maximization (EM) method to solve the so-obtained estimation problem iteratively. We also demonstrate the advantage of MAPES-EM over GAPES by comparing their design approaches.

Chapter 6: Two-dimensional extensions of the MAPES-EM algorithms are developed. However, because of the high computational complexity involved, the direct application of MAPES-EM to large data sets, e.g., two-dimensional data, is computationally prohibitive. To reduce the computational complexity, we develop another MAPES algorithm, referred to as MAPES-CM, by solving an ML fitting problem iteratively via cyclic maximization (CM). MAPES-EM and MAPES-CM possess similar spectral estimation performance, but the computational complexity of the latter is much lower than that of the former.

Chapter 7: We summarize the book and provide some concluding remarks. Additional online resources such as Matlab codes that implement the missing-data algorithms are also provided.
CHAPTER 2

APES for Complete Data Spectral Estimation
2.1 INTRODUCTION
Filter-bank approaches are commonly used for spectral analysis. As nonparametric spectral estimators, they attempt to compute the spectral content of a signal without using any a priori model information or making any explicit model assumption about the signal. For any of these approaches, the key element is to design narrowband filters centered at the frequencies of interest. In fact, the well-known periodogram can be interpreted as such a spectral estimator with a data-independent filter-bank. In general, data-dependent (or data-adaptive) filters outperform their data-independent counterparts and are hence preferred in many applications. A well-known adaptive filter-bank method is the Capon spectral estimator [12]. More recently, Li and Stoica [13] devised another adaptive filter-bank method with enhanced performance, which is referred to as the amplitude and phase estimation (APES) method. APES surpasses its rivals in several aspects [15, 25] and finds applications in various fields [1, 26–31].

In this chapter, we derive the APES filter from pure narrowband-filter design considerations [32]. It is useful as the initialization step of the algorithms in Chapter 3. The remainder of this chapter is organized as follows: The problem formulation is given in Section 2.2 and the forward-only APES filter is presented in Section 2.3. Section 2.4 provides a two-step filtering interpretation of the APES estimator. Section 2.5 shows how forward–backward averaging can be used to improve the performance of the estimator. A brief discussion about the fast implementation of APES appears in Section 2.6.
2.2 PROBLEM FORMULATION
Consider the problem of estimating the amplitude spectrum of a complex-valued uniformly sampled discrete-time signal $\{y_n\}_{n=0}^{N-1}$. For a frequency of interest, the signal $y_n$ is modeled as

$$y_n = \alpha(\omega)\, e^{j\omega n} + e_n(\omega), \quad n = 0, \ldots, N-1, \quad \omega \in [0, 2\pi), \qquad (2.1)$$

where $\alpha(\omega)$ denotes the complex amplitude of the sinusoidal component at frequency $\omega$, and $e_n(\omega)$ denotes the residual term (assumed zero-mean), which includes the unmodeled noise and interference from frequencies other than $\omega$. The problem of interest is to estimate $\alpha(\omega)$ from $\{y_n\}_{n=0}^{N-1}$ for any given frequency $\omega$.
2.3 FORWARD-ONLY APES ESTIMATOR
Let $h(\omega)$ denote the impulse response of an $M$-tap finite impulse response (FIR) filter-bank

$$h(\omega) = [h_0(\omega)\;\; h_1(\omega)\;\; \cdots\;\; h_{M-1}(\omega)]^T, \qquad (2.2)$$

where $(\cdot)^T$ denotes the transpose. Then the filter output can be written as $h^H(\omega)\,\bar y_l$, where

$$\bar y_l = [y_l\;\; y_{l+1}\;\; \cdots\;\; y_{l+M-1}]^T, \quad l = 0, \ldots, L-1, \qquad (2.3)$$

are the $M \times 1$ overlapping forward data subvectors (snapshots) and $L = N - M + 1$. Here $(\cdot)^H$ denotes the conjugate transpose.

For each $\omega$ of interest, we consider the following design objective:

$$\min_{\alpha(\omega),\, h(\omega)}\; \sum_{l=0}^{L-1} \left| h^H(\omega)\,\bar y_l - \alpha(\omega)\, e^{j\omega l} \right|^2 \quad \text{s.t.} \quad h^H(\omega)\, a(\omega) = 1, \qquad (2.4)$$

where $a(\omega)$ is an $M \times 1$ vector given by

$$a(\omega) = [1\;\; e^{j\omega}\;\; \cdots\;\; e^{j(M-1)\omega}]^T. \qquad (2.5)$$
In the above approach, the filter-bank $h(\omega)$ is designed such that

1. the filtered sequence is as close to a sinusoidal signal as possible in a least squares (LS) sense;
2. the complex spectrum $\alpha(\omega)$ is not distorted by the filtering.
Let $g(\omega)$ denote the normalized Fourier transform of $\bar y_l$:

$$g(\omega) = \frac{1}{L} \sum_{l=0}^{L-1} \bar y_l\, e^{-j\omega l} \qquad (2.6)$$

and define

$$\hat R = \frac{1}{L} \sum_{l=0}^{L-1} \bar y_l\, \bar y_l^H. \qquad (2.7)$$

A straightforward calculation shows that the objective function in (2.4) can be rewritten as

$$\frac{1}{L} \sum_{l=0}^{L-1} \left| h^H(\omega)\,\bar y_l - \alpha(\omega)\, e^{j\omega l} \right|^2 = h^H(\omega)\,\hat R\, h(\omega) - \alpha^*(\omega)\, h^H(\omega)\, g(\omega) - \alpha(\omega)\, g^H(\omega)\, h(\omega) + |\alpha(\omega)|^2$$
$$= \left| \alpha(\omega) - h^H(\omega)\, g(\omega) \right|^2 + h^H(\omega)\,\hat R\, h(\omega) - \left| h^H(\omega)\, g(\omega) \right|^2, \qquad (2.8)$$

where $(\cdot)^*$ denotes the complex conjugate. The minimizer of (2.8) with respect to $\alpha(\omega)$ is given by

$$\hat\alpha(\omega) = h^H(\omega)\, g(\omega). \qquad (2.9)$$

Insertion of (2.9) in (2.8) yields the following minimization problem for the determination of $h(\omega)$:

$$\min_{h(\omega)}\; h^H(\omega)\,\hat S(\omega)\, h(\omega) \quad \text{s.t.} \quad h^H(\omega)\, a(\omega) = 1, \qquad (2.10)$$

where

$$\hat S(\omega) \triangleq \hat R - g(\omega)\, g^H(\omega). \qquad (2.11)$$
The solution to (2.10) is readily obtained [33] as

$$\hat h(\omega) = \frac{\hat S^{-1}(\omega)\, a(\omega)}{a^H(\omega)\,\hat S^{-1}(\omega)\, a(\omega)}. \qquad (2.12)$$

This is the forward-only APES filter, and the forward-only APES estimator in (2.9) becomes

$$\hat\alpha(\omega) = \frac{a^H(\omega)\,\hat S^{-1}(\omega)\, g(\omega)}{a^H(\omega)\,\hat S^{-1}(\omega)\, a(\omega)}. \qquad (2.13)$$
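To make the estimator concrete, the following is a minimal NumPy sketch of the forward-only APES computation in (2.6)–(2.13). The function name and interface are our own illustration (they are not code from the book; the authors' Matlab implementations are referenced in Chapter 7), and no attempt is made at the fast implementation discussed in Section 2.6.

import numpy as np

def apes_forward(y, M, omegas):
    """Forward-only APES per (2.6)-(2.13).
    y      : length-N complex data vector
    M      : filter length (number of taps)
    omegas : frequencies in [0, 2*pi) at which to evaluate the spectrum
    """
    N = len(y)
    L = N - M + 1
    # M x L matrix whose columns are the forward snapshots y_bar_l of (2.3)
    Yb = np.column_stack([y[l:l + M] for l in range(L)])
    R = (Yb @ Yb.conj().T) / L                      # sample covariance (2.7)
    ell, m = np.arange(L), np.arange(M)
    alpha = np.zeros(len(omegas), dtype=complex)
    for i, w in enumerate(omegas):
        g = (Yb @ np.exp(-1j * w * ell)) / L        # normalized Fourier transform (2.6)
        a = np.exp(1j * w * m)                      # steering vector (2.5)
        S = R - np.outer(g, g.conj())               # S_hat(omega) of (2.11)
        Si_a = np.linalg.solve(S, a)
        Si_g = np.linalg.solve(S, g)
        alpha[i] = (a.conj() @ Si_g) / (a.conj() @ Si_a)   # APES estimate (2.13)
    return alpha

A typical call would evaluate alpha on a dense DFT grid, e.g., omegas = 2*np.pi*np.arange(4*N)/(4*N).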
2.4 TWO-STEP FILTERING-BASED INTERPRETATION
The APES spectral estimator has a two-step filtering interpretation: passing the data $\{y_n\}_{n=0}^{N-1}$ through a bank of FIR bandpass filters with varying center frequency $\omega$, and then obtaining the spectrum estimate $\hat\alpha(\omega)$ for $\omega \in [0, 2\pi)$ from the filtered data.

For each frequency $\omega$, the corresponding $M$-tap FIR filter-bank is given by (2.12). Hence the output obtained by passing $\bar y_l$ through the FIR filter $\hat h(\omega)$ can be written as

$$\hat h^H(\omega)\,\bar y_l = \alpha(\omega)\,[\hat h^H(\omega)\, a(\omega)]\, e^{j\omega l} + w_l(\omega) = \alpha(\omega)\, e^{j\omega l} + w_l(\omega), \qquad (2.14)$$

where $w_l(\omega) = \hat h^H(\omega)\,\bar e_l(\omega)$ denotes the residue term at the filter output and the second equality follows from the identity

$$\hat h^H(\omega)\, a(\omega) = 1. \qquad (2.15)$$

Thus, from the output of the FIR filter, we can obtain the LS estimate of $\alpha(\omega)$ as

$$\hat\alpha(\omega) = \hat h^H(\omega)\, g(\omega). \qquad (2.16)$$
2.5 FORWARD–BACKWARD AVERAGING
Forward–backward averaging has been widely used for enhanced performance in many spectral analysis applications. In the previous section, we obtained the APES filter by using only forward data vectors. Here we show that forward–backward averaging can be readily incorporated into the APES filter design by considering both the forward and the backward data vectors.

Let the backward data subvectors (snapshots) be constructed as

$$\tilde y_l = [y^*_{N-l-1}\;\; y^*_{N-l-2}\;\; \cdots\;\; y^*_{N-l-M}]^T, \quad l = 0, \ldots, L-1. \qquad (2.17)$$
We require that the outputs obtained by running the data through the filter both forward and backward are as close as possible to a sinusoid with frequency $\omega$. This design objective can be written as

$$\min_{h(\omega),\,\alpha(\omega),\,\tilde\alpha(\omega)}\; \frac{1}{2L} \sum_{l=0}^{L-1} \left[ \left| h^H(\omega)\,\bar y_l - \alpha(\omega)\, e^{j\omega l} \right|^2 + \left| h^H(\omega)\,\tilde y_l - \tilde\alpha(\omega)\, e^{j\omega l} \right|^2 \right] \quad \text{s.t.} \quad h^H(\omega)\, a(\omega) = 1. \qquad (2.18)$$

The minimization of (2.18) with respect to $\alpha(\omega)$ and $\tilde\alpha(\omega)$ gives $\hat\alpha(\omega) = h^H(\omega)\, g(\omega)$ and $\hat{\tilde\alpha}(\omega) = h^H(\omega)\,\tilde g(\omega)$, where $\tilde g(\omega)$ is the normalized Fourier transform of $\tilde y_l$:

$$\tilde g(\omega) = \frac{1}{L} \sum_{l=0}^{L-1} \tilde y_l\, e^{-j\omega l}. \qquad (2.19)$$
It follows that (2.18) leads to

$$\min_{h(\omega)}\; h^H(\omega)\,\hat S_{fb}(\omega)\, h(\omega) \quad \text{s.t.} \quad h^H(\omega)\, a(\omega) = 1, \qquad (2.20)$$

where

$$\hat S_{fb}(\omega) \triangleq \hat R_{fb} - \frac{g(\omega)\, g^H(\omega) + \tilde g(\omega)\,\tilde g^H(\omega)}{2} \qquad (2.21)$$

with

$$\hat R_f = \frac{1}{L} \sum_{l=0}^{L-1} \bar y_l\,\bar y_l^H, \qquad (2.22)$$

$$\hat R_b = \frac{1}{L} \sum_{l=0}^{L-1} \tilde y_l\,\tilde y_l^H, \qquad (2.23)$$

and

$$\hat R_{fb} = \frac{\hat R_f + \hat R_b}{2}. \qquad (2.24)$$
Note that here we use $\hat R_f$ instead of $\hat R$ to emphasize the fact that it is estimated from the forward-only approach. The solution of (2.20) is given by

$$\hat h_{fb}(\omega) = \frac{\hat S_{fb}^{-1}(\omega)\, a(\omega)}{a^H(\omega)\,\hat S_{fb}^{-1}(\omega)\, a(\omega)}. \qquad (2.25)$$
Because of the following readily verified relationship

$$\tilde y_l = J\,\bar y^*_{L-l-1}, \qquad (2.26)$$

we have

$$\tilde g(\omega) = J\, g^*(\omega)\, e^{-j\omega(L-1)}, \qquad (2.27)$$

$$\hat R_b = J\,\hat R_f^T\, J, \qquad (2.28)$$

and

$$\tilde g(\omega)\,\tilde g^H(\omega) = J\, \left[ g(\omega)\, g^H(\omega) \right]^T J, \qquad (2.29)$$

where $J$ denotes the exchange matrix whose antidiagonal elements are ones and whose remaining elements are zeros. So $\hat S_{fb}(\omega)$ can be conveniently calculated as

$$\hat S_{fb}(\omega) = \frac{\hat S_f(\omega) + J\,\hat S_f^T(\omega)\, J}{2}, \qquad (2.30)$$

where

$$\hat S_f(\omega) \triangleq \hat R_f - g(\omega)\, g^H(\omega). \qquad (2.31)$$
Given the forward–backward APES filter $\hat h_{fb}(\omega)$, the forward–backward spectral estimator can be written as

$$\hat\alpha_{fb}(\omega) = \frac{a^H(\omega)\,\hat S_{fb}^{-1}(\omega)\, g(\omega)}{a^H(\omega)\,\hat S_{fb}^{-1}(\omega)\, a(\omega)}. \qquad (2.32)$$
Note that, due to the above relationships, the forward–backward estimator of $\tilde\alpha(\omega)$ can be simplified as

$$\hat{\tilde\alpha}_{fb}(\omega) = \hat h_{fb}^H(\omega)\,\tilde g(\omega) = \hat\alpha^*_{fb}(\omega)\, e^{-j\omega(N-1)}, \qquad (2.33)$$

which indicates that from $\hat{\tilde\alpha}_{fb}(\omega)$ we will get the same forward–backward spectral estimator $\hat\alpha_{fb}(\omega)$.
In summary, the forward–backward APES filter and APES spectral estimator still have the same forms as in (2.12) and (2.13), but $\hat R$ and $\hat S(\omega)$ are replaced by $\hat R_{fb}$ and $\hat S_{fb}(\omega)$, respectively. Note that $\hat R_{fb}$ and $\hat S_{fb}(\omega)$ are persymmetric matrices. Compared with the non-persymmetric estimates $\hat R_f$ and $\hat S_f(\omega)$, they are generally better estimates of the true $R$ and $Q(\omega)$, where $R$ and $Q(\omega)$ are the ideal covariance matrices with and without the presence of the signal of interest, respectively. See Chapter 4 for more details about $R$ and $Q(\omega)$.

For simplicity, all the APES-like algorithms we develop in the subsequent chapters are based on the forward-only approach. For better estimation accuracy, forward–backward averaging is used in all numerical examples.
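As a small illustration of how (2.30) is exploited in practice, the sketch below (our own helper, with assumed inputs) forms the persymmetric estimate from a given forward-only estimate using the exchange matrix $J$; the forward–backward filter and spectrum then follow by plugging the result into (2.12) and (2.13).

import numpy as np

def s_forward_backward(S_f):
    """Forward-backward smoothing of S_hat_f(omega) as in (2.30):
    S_fb = (S_f + J S_f^T J) / 2, with J the exchange (anti-identity) matrix.
    Note the plain (non-conjugate) transpose, as in the formula."""
    M = S_f.shape[0]
    J = np.fliplr(np.eye(M))
    return (S_f + J @ S_f.T @ J) / 2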
2.6 FAST IMPLEMENTATION
The direct implementation of APES by simply computing (2.13) for many different $\omega$ of interest is computationally demanding. Several papers in the literature have addressed this problem [29, 34–36]. Here we give a brief discussion about implementing APES efficiently.

To avoid the inversion of the $M \times M$ matrix $\hat S(\omega)$ for each $\omega$, we use the matrix inversion lemma (see, e.g., [8]) to obtain

$$\hat S^{-1}(\omega) = \hat R^{-1} + \frac{\hat R^{-1} g(\omega)\, g^H(\omega)\,\hat R^{-1}}{1 - g^H(\omega)\,\hat R^{-1} g(\omega)}. \qquad (2.34)$$
Let $\hat R^{-1/2}$ denote the Cholesky factor of $\hat R^{-1}$, and let

$$\bar a(\omega) = \hat R^{-1/2} a(\omega), \quad \bar g(\omega) = \hat R^{-1/2} g(\omega), \quad \beta(\omega) = \bar a^H(\omega)\,\bar a(\omega), \quad \gamma(\omega) = \bar a^H(\omega)\,\bar g(\omega), \quad \delta(\omega) = \bar g^H(\omega)\,\bar g(\omega). \qquad (2.35)$$

Then we can write (2.12) and (2.13) as

$$\hat h(\omega) = \frac{\left[\hat R^{-1/2}\right]^H \left[ (1 - \delta(\omega))\,\bar a(\omega) + \gamma^*(\omega)\,\bar g(\omega) \right]}{\beta(\omega)\,(1 - \delta(\omega)) + |\gamma(\omega)|^2} \qquad (2.36)$$

and

$$\hat\alpha(\omega) = \frac{\gamma(\omega)}{\beta(\omega)\,(1 - \delta(\omega)) + |\gamma(\omega)|^2}, \qquad (2.37)$$

whose implementation requires only the Cholesky factorization of the matrix $\hat R$, which is independent of $\omega$.
This strategy can be readily generalized to the forward–backward averaging case. Since the complete-data case is not the focus of this book, we refer the readers to [29, 34–36] for more details about the efficient implementations of APES.
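For illustration, here is a hedged sketch of the fast evaluation described above: one Cholesky factorization of $\hat R$ is shared across all frequencies, and each $\omega$ only requires the scalars $\beta(\omega)$, $\gamma(\omega)$, $\delta(\omega)$ of (2.35) to obtain $\hat\alpha(\omega)$ via (2.37). The function below is our own simplified illustration, not the implementations of [29, 34–36].

import numpy as np

def apes_fast(y, M, omegas):
    """APES amplitude spectrum via the factored quantities of (2.35)-(2.37)."""
    N = len(y)
    L = N - M + 1
    Yb = np.column_stack([y[l:l + M] for l in range(L)])
    R = (Yb @ Yb.conj().T) / L                       # (2.7)
    # C plays the role of R^{-1/2}: with R = Lc Lc^H, C = Lc^{-1} gives R^{-1} = C^H C
    C = np.linalg.inv(np.linalg.cholesky(R))
    ell, m = np.arange(L), np.arange(M)
    alpha = np.zeros(len(omegas), dtype=complex)
    for i, w in enumerate(omegas):
        g = (Yb @ np.exp(-1j * w * ell)) / L         # (2.6)
        a = np.exp(1j * w * m)                       # (2.5)
        ab, gb = C @ a, C @ g                        # a_bar, g_bar of (2.35)
        beta = np.vdot(ab, ab).real                  # a_bar^H a_bar
        gamma = np.vdot(ab, gb)                      # a_bar^H g_bar
        delta = np.vdot(gb, gb).real                 # g_bar^H g_bar
        alpha[i] = gamma / (beta * (1 - delta) + abs(gamma) ** 2)   # (2.37)
    return alpha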
CHAPTER 3
Gapped-Data APES
3.1 INTRODUCTION
One special case of the missing-data problem is called gapped data, where the measurements during certain periods are not valid due to reasons such as interference or jamming. The difference between the gapped-data problem and the general missing-data problem, where the missing samples can occur at arbitrary places among the complete data set, is that for the gapped-data case there exist group(s) of available data samples such that within each group there are no missing samples.

Such scenarios exist in astronomical or radar applications where large segments of data are available in spite of the fact that the data between these segments are missing. For example, in radar signal processing, the problem of combining several sets of measurements made at different azimuth angle locations can be posed as a problem of spectral estimation from gapped data [2, 4]. Similar problems arise in data fusion via ultrawideband coherent processing [7]. In astronomy, data are often available as groups of samples with rather long intervals during which no measurements can be taken [17, 37–41].

The gapped-data APES (GAPES) algorithm considers using the APES filter (as introduced in Chapter 2) for the spectral estimation of gapped data. Specifically, the GAPES algorithm consists of two steps: (1) estimating the adaptive filter and the corresponding spectrum via APES and (2) filling in the gaps via an LS fit.

In the remainder of this chapter, one-dimensional (1-D) and two-dimensional (2-D) GAPES are presented in Sections 3.2 and 3.3, respectively. Numerical results are provided in Section 3.4.
3.2 GAPES
Assume that some segments of the 1-D data sequence $\{y_n\}_{n=0}^{N-1}$ are unavailable. Let

$$y \triangleq [y_0\;\; y_1\;\; \cdots\;\; y_{N-1}]^T \triangleq \left[ y_1^T\;\; y_2^T\;\; \cdots\;\; y_P^T \right]^T \qquad (3.1)$$

be the complete data vector, where $y_1, \ldots, y_P$ are subvectors of $y$ whose lengths are $N_1, \ldots, N_P$, respectively, with $N_1 + N_2 + \cdots + N_P = N$. A gapped-data vector is formed by assuming that $y_p$, for $p = 1, 3, \ldots, P$ ($P$ is always an odd number), are available:

$$\gamma \triangleq \left[ y_1^T\;\; y_3^T\;\; \cdots\;\; y_P^T \right]^T. \qquad (3.2)$$

Similarly,

$$\mu \triangleq \left[ y_2^T\;\; y_4^T\;\; \cdots\;\; y_{P-1}^T \right]^T \qquad (3.3)$$

denotes all the missing samples. Then $\gamma$ and $\mu$ have dimensions $g \times 1$ and $(N - g) \times 1$, respectively, where $g = N_1 + N_3 + \cdots + N_P$ is the total number of available samples.
3.2.1 Initial Estimates via APES
We obtain the initial APES estimates of $h(\omega)$ and $\alpha(\omega)$ from the available data $\gamma$ as follows.

Choose an initial filter length $M_0$ such that an initial full-rank covariance matrix $\hat R$ can be built with the filter length $M_0$ using only the available data segments. This requires

$$\sum_{p \in \{1,3,\ldots,P\}} \max(0,\, N_p - M_0 + 1) > M_0. \qquad (3.4)$$

Let $L_p = N_p - M_0 + 1$ and let $J$ be the subset of $\{1, 3, \ldots, P\}$ for which $L_p > 0$. Then the filter-bank $h(\omega)$ is calculated from (2.11) and (2.12) by using the following redefinitions:

$$\hat R = \frac{1}{\sum_{p \in J} L_p} \sum_{p \in J}\;\; \sum_{l = N_1 + \cdots + N_{p-1}}^{N_1 + \cdots + N_{p-1} + L_p - 1} \bar y_l\, \bar y_l^H, \qquad (3.5)$$

$$g(\omega) = \frac{1}{\sum_{p \in J} L_p} \sum_{p \in J}\;\; \sum_{l = N_1 + \cdots + N_{p-1}}^{N_1 + \cdots + N_{p-1} + L_p - 1} \bar y_l\, e^{-j\omega l}. \qquad (3.6)$$

Note that the data snapshots used above have a size of $M_0 \times 1$ and their elements come only from $\gamma$; hence they do not contain any missing samples. Correspondingly, the $\hat R$ and $g(\omega)$ estimated above have sizes of $M_0 \times M_0$ and $M_0 \times 1$, respectively.

Next, the filter-bank $\hat h(\omega)$ is applied to the available data and the LS estimate of $\alpha(\omega)$ from the filter output is calculated by using (2.16), where $g(\omega)$ is replaced by (3.6). Note that in the above filtering process, only the available samples are passed through the filter. The initial LS estimate of $\alpha(\omega)$ is based on these so-obtained filter outputs only.
3.2.2 Data Interpolation
Now we consider the estimation of $\mu$ based on the initial spectral estimates $\hat\alpha(\omega)$ and $\hat h(\omega)$ obtained as outlined above. Under the assumption that the missing data have the same spectral content as the available data, we can determine $\mu$ under the condition that the output of the filter $\hat h(\omega)$, fed with the complete data sequence made from $\gamma$ and $\mu$, is as close as possible (in the LS sense) to $\hat\alpha(\omega)\, e^{j\omega l}$, for $l = 0, \ldots, L-1$. Since usually we evaluate $\hat\alpha(\omega)$ on a $K$-point DFT grid, $\omega_k = 2\pi k / K$ for $k = 0, \ldots, K-1$ (usually we have $K > N$), we obtain $\mu$ as the solution to the following LS problem:

$$\min_{\mu}\; \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \left| \hat h^H(\omega_k)\,\bar y_l - \hat\alpha(\omega_k)\, e^{j\omega_k l} \right|^2. \qquad (3.7)$$

Note that by estimating $\mu$ this way, we remain in the LS fitting framework of APES.
The quadratic minimization problem (3.7) can be readily solved. Let $H(\omega_k)$ denote the $L \times N$ matrix whose $l$th row contains the filter coefficients $\hat h^H(\omega_k)$, shifted so that they start in the $(l+1)$st column:

$$H(\omega_k) = \begin{bmatrix} \hat h_0^* & \cdots & \hat h_{M_0-1}^* & 0 & \cdots & 0 \\ 0 & \hat h_0^* & \cdots & \hat h_{M_0-1}^* & \cdots & 0 \\ \vdots & & \ddots & & \ddots & \vdots \\ 0 & \cdots & 0 & \hat h_0^* & \cdots & \hat h_{M_0-1}^* \end{bmatrix} \in \mathbb{C}^{L \times N} \qquad (3.8)$$

(the dependence of the filter coefficients on $\omega_k$ is omitted for notational brevity), and let

$$\zeta(\omega_k) = \hat\alpha(\omega_k) \left[ 1\;\; e^{j\omega_k}\;\; \cdots\;\; e^{j\omega_k (L-1)} \right]^T \in \mathbb{C}^{L \times 1}. \qquad (3.9)$$
Using this notation we can write the objective function in (3.7) as

$$\sum_{k=0}^{K-1} \left\| H(\omega_k) \begin{bmatrix} y_0 \\ \vdots \\ y_{N-1} \end{bmatrix} - \zeta(\omega_k) \right\|^2. \qquad (3.10)$$

Define the $L \times g$ and $L \times (N-g)$ matrices $A(\omega_k)$ and $B(\omega_k)$ from $H(\omega_k)$ via the following equality:

$$H(\omega_k) \begin{bmatrix} y_0 \\ \vdots \\ y_{N-1} \end{bmatrix} = A(\omega_k)\,\gamma + B(\omega_k)\,\mu. \qquad (3.11)$$

Also, let

$$d(\omega_k) = \zeta(\omega_k) - A(\omega_k)\,\gamma. \qquad (3.12)$$

With this notation the objective function (3.10) becomes

$$\sum_{k=0}^{K-1} \left\| B(\omega_k)\,\mu - d(\omega_k) \right\|^2, \qquad (3.13)$$
whose minimizer with respect to $\mu$ is readily found to be

$$\hat\mu = \left[ \sum_{k=0}^{K-1} B^H(\omega_k)\, B(\omega_k) \right]^{-1} \left[ \sum_{k=0}^{K-1} B^H(\omega_k)\, d(\omega_k) \right]. \qquad (3.14)$$
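For illustration, a minimal sketch of the interpolation step (3.14), assuming the matrices $B(\omega_k)$ and vectors $d(\omega_k)$ of (3.11)–(3.12) have already been formed; the helper name and interface are ours.

import numpy as np

def interpolate_missing(B_list, d_list):
    """Solve (3.14) by accumulating the normal equations over the DFT grid.
    B_list : list of L x (N-g) matrices B(omega_k)
    d_list : list of length-L vectors d(omega_k) = zeta(omega_k) - A(omega_k) gamma
    """
    n_miss = B_list[0].shape[1]
    lhs = np.zeros((n_miss, n_miss), dtype=complex)
    rhs = np.zeros(n_miss, dtype=complex)
    for B, d in zip(B_list, d_list):
        lhs += B.conj().T @ B
        rhs += B.conj().T @ d
    return np.linalg.solve(lhs, rhs)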
3.2.3 Summary of GAPES
Once an estimate $\hat\mu$ has become available, the next logical step should consist of reestimating the spectrum and the filter-bank by applying APES to the data sequence made from $\gamma$ and $\hat\mu$. According to the discussion around (2.4), this entails the minimization with respect to $h(\omega_k)$ and $\alpha(\omega_k)$ of the function

$$\sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \left| h^H(\omega_k)\,\hat{\bar y}_l - \alpha(\omega_k)\, e^{j\omega_k l} \right|^2 \qquad (3.15)$$

subject to $h^H(\omega_k)\, a(\omega_k) = 1$, where $\hat{\bar y}_l$ is made from $\gamma$ and $\hat\mu$. Evidently, the minimization of (3.15) with respect to $\{h(\omega_k), \alpha(\omega_k)\}_{k=0}^{K-1}$ can be decoupled into $K$ minimization problems of the form of (2.4), yet we prefer to write the criterion function as in (3.15) to make the connection with (3.7). In effect, comparing (3.7) and (3.15) clearly shows that the alternating estimation of $\{\hat\alpha(\omega_k), \hat h(\omega_k)\}$ and $\hat\mu$ outlined above can be recognized as a cyclic optimization approach (see [42] for a tutorial on cyclic optimization) for solving the following minimization problem:

$$\min_{\mu,\, \{\alpha(\omega_k),\, h(\omega_k)\}}\; \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \left| h^H(\omega_k)\,\bar y_l - \alpha(\omega_k)\, e^{j\omega_k l} \right|^2 \quad \text{s.t.} \quad h^H(\omega_k)\, a(\omega_k) = 1. \qquad (3.16)$$

A step-by-step summary of GAPES is as follows:

Step 0: Obtain an initial estimate of $\{\hat\alpha(\omega_k), \hat h(\omega_k)\}$.

Step 1: Use the most recent estimate of $\{\hat\alpha(\omega_k), \hat h(\omega_k)\}$ in (3.16) to estimate $\mu$ by minimizing the so-obtained cost function, whose solution is given by (3.14).

Step 2: Use the latest estimate of $\mu$ to fill in the missing data samples and estimate $\{\hat\alpha(\omega_k), \hat h(\omega_k)\}_{k=0}^{K-1}$ by minimizing the cost function in (3.16) based on the interpolated data. (This step is equivalent to applying APES to the complete data.)

Step 3: Repeat steps 1-2 until practical convergence.

Practical convergence can be declared when the relative change of the cost function in (3.16) corresponding to the current and previous estimates is smaller than a preassigned threshold (e.g., $\epsilon = 10^{-3}$). After convergence, we have a final spectral estimate $\{\hat\alpha(\omega_k)\}_{k=0}^{K-1}$. If desired, we can use the final interpolated data sequence to compute the APES spectrum on a grid even finer than the one used in the aforementioned minimization procedure.

Note that usually the selected initial filter length satisfies $M_0 < M$ due to the missing data samples, so there are many practical choices for increasing the filter length after initialization, including, for example, increasing the filter length after each iteration until it reaches $M$. For simplicity, we choose to use filter length $M$ right after the initialization step.
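The following is a schematic, self-contained NumPy sketch of the 1-D GAPES iteration summarized above. It is our own simplified illustration: the missing samples are initialized with zeros and a single filter length M is used throughout, whereas the book initializes from the gap-free segments with a shorter filter of length M_0; the stopping rule is a fixed iteration count rather than the relative-change test described above.

import numpy as np

def apes_step(y, M, omegas):
    """Complete-data APES (2.6)-(2.13): filters h_hat (M x K) and amplitudes alpha_hat (K,)."""
    N = len(y)
    L = N - M + 1
    Yb = np.column_stack([y[l:l + M] for l in range(L)])    # snapshots (2.3)
    R = (Yb @ Yb.conj().T) / L                              # (2.7)
    ell, m = np.arange(L), np.arange(M)
    H = np.zeros((M, len(omegas)), dtype=complex)
    alpha = np.zeros(len(omegas), dtype=complex)
    for k, w in enumerate(omegas):
        g = (Yb @ np.exp(-1j * w * ell)) / L                # (2.6)
        a = np.exp(1j * w * m)                              # (2.5)
        S = R - np.outer(g, g.conj())                       # (2.11)
        Si_a = np.linalg.solve(S, a)
        Si_g = np.linalg.solve(S, g)
        H[:, k] = Si_a / (a.conj() @ Si_a)                  # (2.12)
        alpha[k] = (a.conj() @ Si_g) / (a.conj() @ Si_a)    # (2.13)
    return H, alpha

def gapes(y0, avail, M, K, n_iter=10):
    """Schematic 1-D GAPES loop (Section 3.2).  y0: length-N data with arbitrary
    entries at the missing positions; avail: boolean mask of available samples."""
    avail = np.asarray(avail, dtype=bool)
    y = np.where(avail, y0, 0).astype(complex)     # crude zero initialization (see caveat above)
    N = len(y)
    L = N - M + 1
    omegas = 2 * np.pi * np.arange(K) / K
    miss = ~avail
    for _ in range(n_iter):
        H, alpha = apes_step(y, M, omegas)         # step 2: APES on the current data
        lhs = np.zeros((miss.sum(), miss.sum()), dtype=complex)
        rhs = np.zeros(miss.sum(), dtype=complex)
        for k, w in enumerate(omegas):             # step 1: eqs (3.7)-(3.14)
            Hk = np.zeros((L, N), dtype=complex)   # shifted-filter matrix, cf. (3.8)
            for l in range(L):
                Hk[l, l:l + M] = H[:, k].conj()
            A, B = Hk[:, avail], Hk[:, miss]       # split per (3.11)
            d = alpha[k] * np.exp(1j * w * np.arange(L)) - A @ y[avail]   # (3.9), (3.12)
            lhs += B.conj().T @ B
            rhs += B.conj().T @ d
        y[miss] = np.linalg.solve(lhs, rhs)        # (3.14)
    return apes_step(y, M, omegas)[1], y           # final spectrum and interpolated data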
3.3 TWO-DIMENSIONAL GAPES
In this section, we extend the GAPES algorithm developed previously to 2-D data
matrices.
3.3.1 Two-Dimensional APES Filter
Consider the problem of estimating the amplitude spectrum of a complex-valued uniformly sampled 2-D discrete-time signal $\{y_{n_1,n_2}\}_{n_1=0,\,n_2=0}^{N_1-1,\,N_2-1}$, where the data matrix has dimension $N_1 \times N_2$.

For a 2-D frequency $(\omega_1, \omega_2)$ of interest, the signal $y_{n_1,n_2}$ is described as

$$y_{n_1,n_2} = \alpha(\omega_1, \omega_2)\, e^{j(\omega_1 n_1 + \omega_2 n_2)} + e_{n_1,n_2}(\omega_1, \omega_2), \quad n_1 = 0, \ldots, N_1-1, \quad n_2 = 0, \ldots, N_2-1, \quad \omega_1, \omega_2 \in [0, 2\pi), \qquad (3.17)$$

where $\alpha(\omega_1, \omega_2)$ denotes the complex amplitude of the 2-D sinusoidal component at frequency $(\omega_1, \omega_2)$ and $e_{n_1,n_2}(\omega_1, \omega_2)$ denotes the residual term (assumed zero-mean), which includes the unmodeled noise and interference from frequencies other than $(\omega_1, \omega_2)$. The 2-D APES algorithm derived below estimates $\alpha(\omega_1, \omega_2)$ from $\{y_{n_1,n_2}\}$ for any given frequency pair $(\omega_1, \omega_2)$.
Let $Y$ be the $N_1 \times N_2$ data matrix

$$Y \triangleq \begin{bmatrix} y_{0,0} & y_{0,1} & \cdots & y_{0,N_2-1} \\ y_{1,0} & y_{1,1} & \cdots & y_{1,N_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{N_1-1,0} & y_{N_1-1,1} & \cdots & y_{N_1-1,N_2-1} \end{bmatrix}, \qquad (3.18)$$

and let $H(\omega_1, \omega_2)$ be an $M_1 \times M_2$ matrix that contains the coefficients of a 2-D FIR filter

$$H(\omega_1, \omega_2) \triangleq \begin{bmatrix} h_{0,0}(\omega_1,\omega_2) & h_{0,1}(\omega_1,\omega_2) & \cdots & h_{0,M_2-1}(\omega_1,\omega_2) \\ h_{1,0}(\omega_1,\omega_2) & h_{1,1}(\omega_1,\omega_2) & \cdots & h_{1,M_2-1}(\omega_1,\omega_2) \\ \vdots & \vdots & \ddots & \vdots \\ h_{M_1-1,0}(\omega_1,\omega_2) & h_{M_1-1,1}(\omega_1,\omega_2) & \cdots & h_{M_1-1,M_2-1}(\omega_1,\omega_2) \end{bmatrix}. \qquad (3.19)$$
Let $L_1 \triangleq N_1 - M_1 + 1$ and $L_2 \triangleq N_2 - M_2 + 1$. Then we denote by

$$\bar X = H(\omega_1, \omega_2) \star Y \qquad (3.20)$$

the $L_1 \times L_2$ output data matrix obtained by filtering $Y$ through the filter determined by $H(\omega_1, \omega_2)$:

$$x_{l_1,l_2} = \sum_{m_1=0}^{M_1-1} \sum_{m_2=0}^{M_2-1} h^*_{m_1,m_2}(\omega_1,\omega_2)\, y_{l_1+m_1,\, l_2+m_2} = \mathrm{vec}^H(H(\omega_1, \omega_2))\,\bar y_{l_1,l_2}, \qquad (3.21)$$

where $\mathrm{vec}(\cdot)$ denotes the operation of stacking the columns of a matrix on top of each other. In (3.21), $\bar y_{l_1,l_2}$ is defined by

$$\bar y_{l_1,l_2} \triangleq \mathrm{vec}(\bar Y_{l_1,l_2}) \triangleq \mathrm{vec}\begin{bmatrix} y_{l_1,l_2} & y_{l_1,l_2+1} & \cdots & y_{l_1,l_2+M_2-1} \\ y_{l_1+1,l_2} & y_{l_1+1,l_2+1} & \cdots & y_{l_1+1,l_2+M_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{l_1+M_1-1,l_2} & y_{l_1+M_1-1,l_2+1} & \cdots & y_{l_1+M_1-1,l_2+M_2-1} \end{bmatrix}. \qquad (3.22)$$
The APES spectrum estimate $\hat\alpha(\omega_1, \omega_2)$ and the filter coefficient matrix $\hat H(\omega_1, \omega_2)$ are the minimizers of the following LS criterion:

$$\min_{\alpha(\omega_1,\omega_2),\, H(\omega_1,\omega_2)}\; \sum_{l_1=0}^{L_1-1} \sum_{l_2=0}^{L_2-1} \left| x_{l_1,l_2} - \alpha(\omega_1,\omega_2)\, e^{j(\omega_1 l_1 + \omega_2 l_2)} \right|^2 \quad \text{s.t.} \quad \mathrm{vec}^H(H(\omega_1,\omega_2))\, a_{M_1,M_2}(\omega_1,\omega_2) = 1. \qquad (3.23)$$

Here $a_{M_1,M_2}(\omega_1, \omega_2)$ is an $M_1 M_2 \times 1$ vector given by

$$a_{M_1,M_2}(\omega_1, \omega_2) \triangleq a_{M_2}(\omega_2) \otimes a_{M_1}(\omega_1), \qquad (3.24)$$

where $\otimes$ denotes the Kronecker matrix product and

$$a_{M_k}(\omega_k) \triangleq [1\;\; e^{j\omega_k}\;\; \cdots\;\; e^{j(M_k-1)\omega_k}]^T, \quad k = 1, 2. \qquad (3.25)$$

Substituting $\bar X$ into (3.23), we have the following design objective for 2-D APES:

$$\min_{\alpha(\omega_1,\omega_2),\, H(\omega_1,\omega_2)}\; \left\| \mathrm{vec}(H(\omega_1,\omega_2) \star Y) - \alpha(\omega_1,\omega_2)\, a_{L_1,L_2}(\omega_1,\omega_2) \right\|^2 \quad \text{s.t.} \quad \mathrm{vec}^H(H(\omega_1,\omega_2))\, a_{M_1,M_2}(\omega_1,\omega_2) = 1, \qquad (3.26)$$

where $a_{L_1,L_2}(\omega_1, \omega_2)$ is defined similarly to $a_{M_1,M_2}(\omega_1, \omega_2)$.
The solution to (3.26) can be readily derived. Define

$$\hat R = \frac{1}{L_1 L_2} \sum_{l_1=0}^{L_1-1} \sum_{l_2=0}^{L_2-1} \bar y_{l_1,l_2}\, \bar y^H_{l_1,l_2} \qquad (3.27)$$

and let $g(\omega_1, \omega_2)$ denote the normalized 2-D Fourier transform of $\bar y_{l_1,l_2}$:

$$g(\omega_1, \omega_2) = \frac{1}{L_1 L_2} \sum_{l_1=0}^{L_1-1} \sum_{l_2=0}^{L_2-1} \bar y_{l_1,l_2}\, e^{-j(\omega_1 l_1 + \omega_2 l_2)}. \qquad (3.28)$$

The filter $H(\omega_1, \omega_2)$ that minimizes (3.26) is given by

$$\mathrm{vec}(\hat H(\omega_1, \omega_2)) = \frac{\hat S^{-1}(\omega_1,\omega_2)\, a_{M_1,M_2}(\omega_1,\omega_2)}{a^H_{M_1,M_2}(\omega_1,\omega_2)\,\hat S^{-1}(\omega_1,\omega_2)\, a_{M_1,M_2}(\omega_1,\omega_2)} \qquad (3.29)$$

and the APES spectrum is given by

$$\hat\alpha(\omega_1, \omega_2) = \frac{a^H_{M_1,M_2}(\omega_1,\omega_2)\,\hat S^{-1}(\omega_1,\omega_2)\, g(\omega_1,\omega_2)}{a^H_{M_1,M_2}(\omega_1,\omega_2)\,\hat S^{-1}(\omega_1,\omega_2)\, a_{M_1,M_2}(\omega_1,\omega_2)}, \qquad (3.30)$$

where

$$\hat S(\omega_1, \omega_2) \triangleq \hat R - g(\omega_1,\omega_2)\, g^H(\omega_1,\omega_2). \qquad (3.31)$$
3.3.2 Two-Dimensional GAPES
Let $G$ be the set of sample indices $(n_1, n_2)$ for which the data samples are available, and $U$ be the set of sample indices $(n_1, n_2)$ for which the data samples are missing. The set of available samples $\{y_{n_1,n_2} : (n_1, n_2) \in G\}$ is denoted by the $g \times 1$ vector $\gamma$, whereas the set of missing samples $\{y_{n_1,n_2} : (n_1, n_2) \in U\}$ is denoted by the $(N_1 N_2 - g) \times 1$ vector $\mu$. The problem of interest is to estimate $\alpha(\omega_1, \omega_2)$ given $\gamma$.

Assume we consider a $K_1 \times K_2$-point DFT grid: $(\omega_{k_1}, \omega_{k_2}) = (2\pi k_1 / K_1,\; 2\pi k_2 / K_2)$, for $k_1 = 0, \ldots, K_1-1$ and $k_2 = 0, \ldots, K_2-1$ (with $K_1 > N_1$ and $K_2 > N_2$). The 2-D GAPES algorithm tries to solve the following minimization problem:

$$\min_{\mu,\, \{\alpha(\omega_{k_1},\omega_{k_2}),\, H(\omega_{k_1},\omega_{k_2})\}}\; \sum_{k_1=0}^{K_1-1} \sum_{k_2=0}^{K_2-1} \left\| \mathrm{vec}(H(\omega_{k_1},\omega_{k_2}) \star Y) - \alpha(\omega_{k_1},\omega_{k_2})\, a_{L_1,L_2}(\omega_{k_1},\omega_{k_2}) \right\|^2$$
$$\text{s.t.} \quad \mathrm{vec}^H(H(\omega_{k_1},\omega_{k_2}))\, a_{M_1,M_2}(\omega_{k_1},\omega_{k_2}) = 1, \qquad (3.32)$$

via cyclic optimization [42].
For the initialization step, we obtain the initial APES estimates of $H(\omega_1, \omega_2)$ and $\alpha(\omega_1, \omega_2)$ from the available data $\gamma$ in the following way. Let $S$ be the set of snapshot indices $(l_1, l_2)$ such that the corresponding initial data snapshot contains only available samples, i.e., $\{(l_1, l_2), \ldots, (l_1, l_2 + M_2^0 - 1), \ldots, (l_1 + M_1^0 - 1, l_2), \ldots, (l_1 + M_1^0 - 1, l_2 + M_2^0 - 1)\} \subset G$. Define the set of $M_1^0 M_2^0 \times 1$ vectors $\{\bar y_{l_1,l_2} : (l_1, l_2) \in S\}$, which contain only the available data samples, and let $|S|$ be the number of vectors in $S$. Furthermore, define the initial sample covariance matrix

$$\hat R = \frac{1}{|S|} \sum_{(l_1,l_2) \in S} \bar y_{l_1,l_2}\, \bar y^H_{l_1,l_2}. \qquad (3.33)$$

The size $M_1^0 \times M_2^0$ of the initial filter matrix must be chosen such that the $\hat R$ calculated in (3.33) has full rank. Similarly, the initial Fourier transform of the data snapshots is given by

$$g(\omega_1, \omega_2) = \frac{1}{|S|} \sum_{(l_1,l_2) \in S} \bar y_{l_1,l_2}\, e^{-j(\omega_1 l_1 + \omega_2 l_2)}. \qquad (3.34)$$

So the initial estimates of $H(\omega_1, \omega_2)$ and $\alpha(\omega_1, \omega_2)$ can be calculated by (3.29)-(3.31), but by using the $\hat R$ and $g(\omega_1, \omega_2)$ given above.
Next, we introduce some additional notation that will be used later for the step of interpolating the missing samples. Let the $L_1 L_2 \times (L_2 N_1 - M_1 + 1)$ matrix $T$ be defined by

$$T = \begin{bmatrix} I_{L_1}\;\; 0_{L_1,M_1-1} & & & \\ & I_{L_1}\;\; 0_{L_1,M_1-1} & & \\ & & \ddots & \\ & & & I_{L_1} \end{bmatrix}, \qquad (3.35)$$

i.e., a block-diagonal matrix with $L_2$ diagonal blocks, the first $L_2-1$ of which equal $[I_{L_1}\;\; 0_{L_1,M_1-1}]$ and the last of which equals $I_{L_1}$. Hereafter, $0_{K_1,K_2}$ denotes a $K_1 \times K_2$ matrix of zeros and $I_K$ stands for the $K \times K$ identity matrix. Furthermore, let $G(\omega_1, \omega_2)$ be the following $(L_2 N_1 - M_1 + 1) \times N_1 N_2$ Toeplitz matrix:

$$G(\omega_1, \omega_2) = \begin{bmatrix} h_1^H & 0_{1,L_1-1} & h_2^H & 0_{1,L_1-1} & \cdots & h_{M_2}^H & 0 & \cdots & 0 \\ 0 & h_1^H & 0_{1,L_1-1} & h_2^H & 0_{1,L_1-1} & \cdots & h_{M_2}^H & \cdots & 0 \\ \vdots & & \ddots & & \ddots & & & \ddots & \vdots \\ 0 & \cdots & 0 & h_1^H & 0_{1,L_1-1} & h_2^H & 0_{1,L_1-1} & \cdots & h_{M_2}^H \end{bmatrix}, \qquad (3.36)$$

where $\{h_{m_2}\}_{m_2=1}^{M_2}$ are the corresponding columns of $H(\omega_1, \omega_2)$. With these definitions, we have

$$\mathrm{vec}(\bar X) = \mathrm{vec}(H(\omega_1, \omega_2) \star Y) = T\, G(\omega_1, \omega_2)\, \mathrm{vec}(Y). \qquad (3.37)$$
By making use of (3.37), the estimate of $\mu$ based on the initial estimates $\hat\alpha(\omega_1, \omega_2)$ and $\hat H(\omega_1, \omega_2)$ is given by the solution to the following problem:

$$\min_{\mu}\; \left\| \begin{bmatrix} T\, G(\omega_0, \omega_0) \\ \vdots \\ T\, G(\omega_{K_1-1}, \omega_{K_2-1}) \end{bmatrix} \mathrm{vec}(Y) - \begin{bmatrix} \hat\alpha(\omega_0, \omega_0)\, a_{L_1,L_2}(\omega_0, \omega_0) \\ \vdots \\ \hat\alpha(\omega_{K_1-1}, \omega_{K_2-1})\, a_{L_1,L_2}(\omega_{K_1-1}, \omega_{K_2-1}) \end{bmatrix} \right\|^2, \qquad (3.38)$$

where the stacking runs over all $K_1 K_2$ grid points $(\omega_{k_1}, \omega_{k_2})$. To solve (3.38), let the matrices $G_{\gamma}(\omega_{k_1}, \omega_{k_2})$ and $G_{\mu}(\omega_{k_1}, \omega_{k_2})$ be defined implicitly by the following equality:

$$G(\omega_{k_1}, \omega_{k_2})\, \mathrm{vec}(Y) = G_{\gamma}(\omega_{k_1}, \omega_{k_2})\,\gamma + G_{\mu}(\omega_{k_1}, \omega_{k_2})\,\mu, \qquad (3.39)$$

where $\gamma$ and $\mu$ are the vectors containing the available samples and missing samples, respectively. In other words, $G_{\gamma}(\omega_{k_1}, \omega_{k_2})$ and $G_{\mu}(\omega_{k_1}, \omega_{k_2})$ contain the columns of $G(\omega_{k_1}, \omega_{k_2})$ that correspond to the indices in $G$ and $U$, respectively. By introducing the following matrices:

$$\bar G_{\gamma} \triangleq \begin{bmatrix} T\, G_{\gamma}(\omega_0, \omega_0) \\ \vdots \\ T\, G_{\gamma}(\omega_{K_1-1}, \omega_{K_2-1}) \end{bmatrix} \qquad (3.40)$$

and

$$\bar G_{\mu} \triangleq \begin{bmatrix} T\, G_{\mu}(\omega_0, \omega_0) \\ \vdots \\ T\, G_{\mu}(\omega_{K_1-1}, \omega_{K_2-1}) \end{bmatrix}, \qquad (3.41)$$

the criterion (3.38) can then be written as

$$\min_{\mu}\; \left\| \bar G_{\gamma}\,\gamma + \bar G_{\mu}\,\mu - \bar\alpha \right\|^2, \qquad (3.42)$$

where

$$\bar\alpha \triangleq \begin{bmatrix} \hat\alpha(\omega_0, \omega_0)\, a_{L_1,L_2}(\omega_0, \omega_0) \\ \vdots \\ \hat\alpha(\omega_{K_1-1}, \omega_{K_2-1})\, a_{L_1,L_2}(\omega_{K_1-1}, \omega_{K_2-1}) \end{bmatrix}. \qquad (3.43)$$

The closed-form solution of the quadratic problem (3.42) is easily obtained as

$$\hat\mu = \left( \bar G_{\mu}^H\, \bar G_{\mu} \right)^{-1} \bar G_{\mu}^H \left( \bar\alpha - \bar G_{\gamma}\,\gamma \right). \qquad (3.44)$$
A step-by-step summary of 2-D GAPES is as follows:

Step 0: Obtain an initial estimate of $\{\hat\alpha(\omega_1, \omega_2), \hat H(\omega_1, \omega_2)\}$.

Step 1: Use the most recent estimate of $\{\hat\alpha(\omega_1, \omega_2), \hat H(\omega_1, \omega_2)\}$ in (3.32) to estimate $\mu$ by minimizing the so-obtained cost function, whose solution is given by (3.44).

Step 2: Use the latest estimate of $\mu$ to fill in the missing data samples and estimate $\{\hat\alpha(\omega_{k_1}, \omega_{k_2}), \hat H(\omega_{k_1}, \omega_{k_2})\}_{k_1=0,\,k_2=0}^{K_1-1,\,K_2-1}$ by minimizing the cost function in (3.32) based on the interpolated data. (This step is equivalent to applying 2-D APES to the complete data.)

Step 3: Repeat steps 1-2 until practical convergence.
3.4 NUMERICAL EXAMPLES
We now present several numerical examples to illustrate the performance of GAPES for the spectral analysis of gapped data. We compare GAPES with the windowed FFT (WFFT). A Taylor window with order 5 and a -35 dB sidelobe level is used for the WFFT.
3.4.1 One-Dimensional Example
In this example, we consider 1-D gapped-data spectral estimation. To implement GAPES, we choose $K = 2N$ for the iteration steps, and the final spectrum is estimated on a finer grid with $K = 32$. The initial filter length is chosen as $M_0 = 20$, and we use $M = N/2 = 64$ after the initialization step. We calculate the corresponding WFFT spectrum via zero-padded FFT.
The true spectrum of the simulated signal is shown in Fig. 3.1(a), where we have four spectral lines located at $f_1 = 0.05$ Hz, $f_2 = 0.065$ Hz, $f_3 = 0.26$ Hz, and $f_4 = 0.28$ Hz with complex amplitudes $\alpha_1 = \alpha_2 = \alpha_3 = 1$ and $\alpha_4 = 0.5$. Besides these spectral lines, Fig. 3.1(a) also shows a continuous spectral component centered at 0.18 Hz with a width $b = 0.015$ Hz and a constant modulus of 0.25. The data sequence has $N = 128$ samples, where the samples 23-46 and 76-100 are missing. The data is corrupted by zero-mean circularly symmetric complex white Gaussian noise with variance $\sigma_n^2 = 0.01$.
In Fig. 3.1(b) the WFFT is applied to the data by filling in the gaps with zeros. Note that the artifacts due to the missing data are quite severe in the spectrum. Figs. 3.1(c) and 3.1(d) show the moduli of the WFFT and APES spectra of the complete data sequence, where the APES spectrum demonstrates superior resolution compared to that of the WFFT. Figs. 3.1(e) and 3.1(f) illustrate the moduli of the WFFT and APES spectra of the data sequence interpolated via GAPES. Comparing Figs. 3.1(e) and 3.1(f) with 3.1(c) and 3.1(d), we note that GAPES can effectively fill in the gaps and estimate the spectrum.
3.4.2 Two-Dimensional Examples
GAPES applied to simulated data with a line spectrum: In this example we consider a data matrix of size $32 \times 50$ consisting of three noisy sinusoids, with frequencies (1, 0.8), (1, 1.1), and (1.1, 1.3) and amplitudes 1, 0.7, and 2, respectively, embedded in white Gaussian noise with standard deviation 0.1. All samples in the columns 10-20 and 30-40 are missing. The true spectrum is shown in Fig. 3.2(a) and the missing-data pattern is shown in Fig. 3.2(b).
FIGURE 3.1: Modulus of the gapped-data spectral estimates [$N = 128$, $\sigma_n^2 = 0.01$, two gaps involving 49 (40%) missing samples]. (a) True spectrum, (b) WFFT, (c) complete-data WFFT, (d) complete-data APES, (e) WFFT with interpolated data via GAPES, and (f) GAPES.
FIGURE 3.2: Modulus of the 2-D spectra. (a) True spectrum, (b) 2-D missing-data pattern, where the black stripes indicate missing samples, (c) 2-D complete-data WFFT, (d) 2-D complete-data APES with a 2-D filter of size $16 \times 25$, (e) 2-D WFFT, and (f) 2-D GAPES with an initial 2-D filter of size $10 \times 8$.
FIGURE 3.3: Modulus of the SAR images of the backhoe data obtained from a $48 \times 48$ data matrix with missing samples. (a) 3-D CAD model and K-space data, (b) 2-D complete-data WFFT, (c) 2-D complete-data APES with a 2-D filter of size $24 \times 36$, (d) 2-D missing-data pattern, where the black stripes indicate missing samples, (e) 2-D WFFT, and (f) 2-D GAPES with an initial 2-D filter of size $20 \times 9$.
In Fig. 3.2(c) we show the $512 \times 512$ WFFT spectrum of the full data. In Fig. 3.2(d) we show the $512 \times 512$ APES spectrum of the full data obtained by using a 2-D filter matrix of size $16 \times 25$. Fig. 3.2(e) shows the WFFT spectrum obtained by setting the missing samples to zero. Fig. 3.2(f) shows the GAPES spectrum with an initial filter of size $10 \times 8$. Comparing Fig. 3.2(f) with 3.2(d), we can see that GAPES still gives very good spectral estimates, as if there were no missing samples.
GAPES applied to SAR data: In this example we apply the GAPES algorithm to SAR data. The Backhoe Data Dome, Version 1.0, consists of simulated wideband (7-13 GHz), full polarization, complex backscatter data from a backhoe vehicle in free space. The 3-D computer-aided design (CAD) model of the backhoe vehicle is shown in Fig. 3.3(a), with a viewing direction corresponding to (approximately) 45° elevation and 45° azimuth. The backscattered data have been generated over a full upper 2π steradian viewing hemisphere, which is also illustrated in Fig. 3.3(a). We consider a $48 \times 48$ HH polarization data matrix collected at 0° elevation, from approximately a 3° azimuth cut centered around 0° azimuth, covering approximately a 0.45 GHz bandwidth centered around 10 GHz. In Fig. 3.3(b) we show the SAR image obtained by applying WFFT to the full data. Fig. 3.3(c) shows the image obtained by the application of APES to the full data with a 2-D filter of size $24 \times 36$. Note that the two closely located vertical lines (corresponding to the loader bucket) are well resolved by APES because of its super resolution. To simulate the gapped data, we create artificial gaps in the phase history data matrix by removing the columns 10-17 and 30-37, as illustrated in Fig. 3.3(d). In Fig. 3.3(e) we show the result of applying WFFT to the data where the missing samples are set to zero. Significant artifacts due to the data gapping can be observed. Fig. 3.3(f) shows the resulting image of GAPES after one iteration. (Further iterations did not change the result visibly.) To perform the interpolation, we apply 2-D GAPES with an initial filter matrix of size $20 \times 9$ on a $96 \times 96$ grid. After the interpolation step, the spectrum of the so-obtained interpolated data matrix is computed via 2-D APES with the same filter size as that used in Fig. 3.3(c). We can see that GAPES can still resolve the two vertical spectral lines clearly.
CHAPTER 4

Maximum Likelihood Fitting Interpretation of APES
4.1 INTRODUCTION
In this chapter, we review the APES algorithm for complete-data spectral estimation following the derivations in [13], which provide a maximum likelihood (ML) fitting interpretation of the APES estimator. They pave the ground for the missing-data algorithms we will present in later chapters.
4.2 ML FITTING BASED SPECTRAL ESTIMATOR
Recall the problem of estimating the amplitude spectrum of a complex-valued uniformly sampled data sequence introduced in Section 2.2. The APES algorithm derived below estimates $\alpha(\omega)$ from $\{y_n\}_{n=0}^{N-1}$ for any given frequency $\omega$.

Partition the data vector

$$y = [y_0\;\; y_1\;\; \cdots\;\; y_{N-1}]^T \qquad (4.1)$$

into $L$ overlapping subvectors (data snapshots) of size $M \times 1$ with the following shifted structure:

$$\bar y_l = [y_l\;\; y_{l+1}\;\; \cdots\;\; y_{l+M-1}]^T, \quad l = 0, \ldots, L-1, \qquad (4.2)$$

where $L = N - M + 1$. Then, according to the data model in (2.1), the $l$th data snapshot $\bar y_l$ can be written as

$$\bar y_l = \alpha(\omega)\, a(\omega)\, e^{j\omega l} + \bar e_l(\omega), \qquad (4.3)$$

where $a(\omega)$ is an $M \times 1$ vector given by (2.5) and $\bar e_l(\omega) = [e_l(\omega)\;\; e_{l+1}(\omega)\;\; \cdots\;\; e_{l+M-1}(\omega)]^T$. The APES algorithm mimics an ML approach to estimate $\alpha(\omega)$ by assuming that $\bar e_l(\omega)$, $l = 0, 1, \ldots, L-1$, are zero-mean circularly symmetric complex Gaussian random vectors that are statistically independent of each other and have the same unknown covariance matrix

$$Q(\omega) = E\left\{ \bar e_l(\omega)\, \bar e_l^H(\omega) \right\}. \qquad (4.4)$$
Then the covariance matrix of $\bar y_l$ can be written as

$$R = |\alpha(\omega)|^2\, a(\omega)\, a^H(\omega) + Q(\omega). \qquad (4.5)$$

Since the vectors $\{\bar e_l(\omega)\}_{l=0}^{L-1}$ in our case are overlapping, they are not statistically independent of each other. Consequently, APES is not an exact ML estimator.
Using the above assumptions, we get the normalized surrogate log-likelihood function of the data snapshots $\{\bar y_l\}$ as follows:

$$\frac{1}{L} \ln p\left(\{\bar y_l\} \mid \alpha(\omega), Q(\omega)\right) = -M \ln\pi - \ln\left|Q(\omega)\right| - \frac{1}{L} \sum_{l=0}^{L-1} \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right]^H Q^{-1}(\omega) \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right] \qquad (4.6)$$

$$= -M \ln\pi - \ln\left|Q(\omega)\right| - \mathrm{tr}\left\{ Q^{-1}(\omega)\, \frac{1}{L} \sum_{l=0}^{L-1} \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right] \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right]^H \right\}, \qquad (4.7)$$

where $\mathrm{tr}\{\cdot\}$ and $|\cdot|$ denote the trace and the determinant of a matrix, respectively. For any given $\alpha(\omega)$, maximizing (4.7) with respect to $Q(\omega)$ gives

$$\hat Q_{\alpha}(\omega) = \frac{1}{L} \sum_{l=0}^{L-1} \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right] \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right]^H. \qquad (4.8)$$
Inserting (4.8) into (4.7) yields the following concentrated cost function (with changed sign)

$$G = \left| \hat Q_{\alpha}(\omega) \right| = \left| \frac{1}{L} \sum_{l=0}^{L-1} \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right] \left[ \bar y_l - \alpha(\omega)\, a(\omega)\, e^{j\omega l} \right]^H \right|, \qquad (4.9)$$

which is to be minimized with respect to $\alpha(\omega)$. By using the notation $g(\omega)$, $\hat R$, and $\hat S(\omega)$ defined in (2.6), (2.7), and (2.11), respectively, the cost function $G$ in (4.9) becomes

$$G = \left| \hat R + |\alpha(\omega)|^2\, a(\omega)\, a^H(\omega) - \alpha^*(\omega)\, g(\omega)\, a^H(\omega) - \alpha(\omega)\, a(\omega)\, g^H(\omega) \right|$$
$$= \left| \hat R - g(\omega)\, g^H(\omega) + \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right] \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right]^H \right| \qquad (4.10)$$
$$= \left| \hat S(\omega) \right| \cdot \left| I + \hat S^{-1}(\omega) \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right] \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right]^H \right|, \qquad (4.11)$$

where $\hat S(\omega)$ can be recognized as an estimate of $Q(\omega)$. Making use of the identity $|I + AB| = |I + BA|$, we get

$$G = \left| \hat S(\omega) \right| \left\{ 1 + \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right]^H \hat S^{-1}(\omega) \left[ \alpha(\omega)\, a(\omega) - g(\omega) \right] \right\}. \qquad (4.12)$$

Minimizing $G$ with respect to $\alpha(\omega)$ yields

$$\hat\alpha(\omega) = \frac{a^H(\omega)\,\hat S^{-1}(\omega)\, g(\omega)}{a^H(\omega)\,\hat S^{-1}(\omega)\, a(\omega)}. \qquad (4.13)$$

Making use of the calculation in (4.10), we get the estimate of $Q(\omega)$ as

$$\hat Q(\omega) = \hat S(\omega) + \left[ \hat\alpha(\omega)\, a(\omega) - g(\omega) \right] \left[ \hat\alpha(\omega)\, a(\omega) - g(\omega) \right]^H. \qquad (4.14)$$

In the APES algorithm, $\hat\alpha(\omega)$ is the sought spectral estimate and $\hat Q(\omega)$ is the estimate of the nuisance matrix parameter $Q(\omega)$.
4.3 REMARKS ON THE ML FITTING CRITERION
The phrase "ML fitting criterion" used above can be commented on as follows. In some estimation problems, using the exact ML method is computationally prohibitive or even impossible. In such problems one can make a number of simplifying assumptions and derive the corresponding ML criterion. The estimates that minimize the so-obtained surrogate ML fitting criterion are not exact ML estimates, yet usually they have good performance and generally they are by design much simpler to compute than the exact ML estimates. For example, even if the data are not Gaussian distributed, an ML fitting criterion derived under the Gaussian hypothesis will often lead to computationally convenient and yet accurate estimates. Another example here is sinusoidal parameter estimation from data corrupted by colored noise: the ML fitting criterion derived under the assumption that the noise is white leads to parameter estimates of the sinusoidal components whose accuracy asymptotically achieves the exact Cramér-Rao bound (derived under the correct assumption of colored noise); see [43, 44]. The APES method [13, 15] is another example where a surrogate ML fitting criterion, derived under the assumption that the data snapshots are Gaussian and independent, leads to estimates with excellent performance. We follow the same approach in the following chapters by extending the APES method to the missing-data case.
CHAPTER 5

One-Dimensional Missing-Data APES via Expectation Maximization
5.1 INTRODUCTION
In Chapter 3 we presented GAPES for gapped-data spectral estimation. GAPES iteratively interpolates the missing data and estimates the spectrum. However, GAPES can deal only with missing data occurring in gaps, and it does not work well for the more general problem of missing data samples occurring in arbitrary patterns.

In this chapter, we consider the problem of nonparametric spectral estimation for data sequences with missing data samples occurring in arbitrary patterns (including the gapped-data case) [45]. We develop two missing-data amplitude and phase estimation (MAPES) algorithms by using an ML fitting criterion as derived in Chapter 4. Then we use the well-known expectation maximization (EM) [42, 46] method to solve the so-obtained estimation problem iteratively. Through numerical simulations, we demonstrate the excellent performance of the MAPES algorithms for missing-data spectral estimation and missing-data restoration.

The remainder of this chapter is organized as follows: In Section 5.2, we give a brief review of the EM algorithm for the missing-data problem. In Sections 5.3 and 5.4, we develop two nonparametric MAPES algorithms for the missing-data spectral estimation problem via the EM algorithm. Some aspects of interest are discussed in Section 5.5. In Section 5.6, we compare MAPES with GAPES for the missing-data problem. Numerical results are provided in Section 5.7 to illustrate the performance of the MAPES-EM algorithms.
5.2 EM FOR MISSING-DATA SPECTRAL ESTIMATION
Assume that some arbitrary samples of the uniformly sampled data sequence $\{y_n\}_{n=0}^{N-1}$ are missing. Because of these missing samples, which can be treated as unknowns, the surrogate log-likelihood fitting criterion in (4.6) cannot be maximized directly. We show below how to tackle this general missing-data problem through the use of the EM algorithm.

Recall that the $g \times 1$ vector $\gamma$ and the $(N-g) \times 1$ vector $\mu$ contain all the available samples (incomplete data) and all the missing samples, respectively, of the $N \times 1$ complete data vector $y$. Then we have the following relationships:

$$\gamma \cup \mu = \{y_n\}_{n=0}^{N-1} \qquad (5.1)$$

$$\gamma \cap \mu = \varnothing, \qquad (5.2)$$

where $\varnothing$ denotes the empty set. Let $\theta = \{\alpha(\omega), Q(\omega)\}$. An estimate $\hat\theta$ of $\theta$ can be obtained by maximizing the following surrogate ML fitting criterion involving the available data vector $\gamma$:

$$\hat\theta = \arg\max_{\theta}\; \ln p(\gamma \mid \theta). \qquad (5.3)$$

If $\mu$ were available, the above problem would be easy to solve (as shown in the previous chapter). In the absence of $\mu$, however, the EM algorithm maximizes the conditional (on $\gamma$) expectation of the joint log-likelihood function of $\gamma$ and $\mu$. The algorithm is iterative. At the $i$th iteration, we use $\hat\theta_{i-1}$ from the previous iteration to update the parameter estimate by maximizing the conditional expectation:

$$\hat\theta_i = \arg\max_{\theta}\; E\left\{ \ln p(\gamma, \mu \mid \theta) \,\middle|\, \gamma,\, \hat\theta_{i-1} \right\}. \qquad (5.4)$$

It can be shown [42, 47] that at each iteration, the increase in the surrogate log-likelihood function is greater than or equal to the increase in the expected joint surrogate log-likelihood in (5.4), i.e.,

$$\ln p(\gamma \mid \hat\theta_i) - \ln p(\gamma \mid \hat\theta_{i-1}) \;\geq\; E\left\{ \ln p(\gamma, \mu \mid \hat\theta_i) \,\middle|\, \gamma,\, \hat\theta_{i-1} \right\} - E\left\{ \ln p(\gamma, \mu \mid \hat\theta_{i-1}) \,\middle|\, \gamma,\, \hat\theta_{i-1} \right\}. \qquad (5.5)$$
Since the data snapshots \{\bar{y}_l\} are overlapping, one missing sample may occur in many snapshots (note that there is only one new sample between two adjacent data snapshots). So two approaches are possible when we try to estimate the missing data: estimate the missing data separately for each snapshot \bar{y}_l by ignoring any possible overlapping, or jointly for all snapshots \{\bar{y}_l\}_{l=0}^{L-1} by observing the overlapping. In the following two sections, we make use of these ideas to develop two different MAPES-EM algorithms, namely MAPES-EM1 and MAPES-EM2.
5.3 MAPES-EM1
In this section we assume that the data snapshots \{\bar{y}_l\}_{l=0}^{L-1} are independent of each other, and hence we estimate the missing data separately for different data snapshots. For each data snapshot \bar{y}_l, let \gamma_l and \mu_l denote the vectors containing the available and missing elements of \bar{y}_l, respectively. In general, the indices of the missing components could be different for different l. Assume that \gamma_l has dimension g_l × 1, where 1 \leq g_l \leq M is the number of available elements in the snapshot \bar{y}_l. (Although g_l could be any integer that belongs to the interval 0 \leq g_l \leq M, we assume for now that g_l \neq 0. Later we will explain what happens when g_l = 0.)
Then \gamma_l and \mu_l are related to \bar{y}_l by unitary transformations as follows:

\gamma_l = \bar{S}_g^T(l)\,\bar{y}_l   (5.6)
\mu_l = \bar{S}_m^T(l)\,\bar{y}_l,   (5.7)

where \bar{S}_g(l) and \bar{S}_m(l) are M \times g_l and M \times (M - g_l) unitary selection matrices such that

\bar{S}_g^T(l)\,\bar{S}_g(l) = I_{g_l},   (5.8)
\bar{S}_m^T(l)\,\bar{S}_m(l) = I_{M-g_l},   (5.9)

and

\bar{S}_g^T(l)\,\bar{S}_m(l) = 0_{g_l \times (M-g_l)}.   (5.10)

For example, if M = 5 and we observe the first, third, and fourth components of \bar{y}_l, then g_l = 3,

\bar{S}_g(l) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}   (5.11)

and

\bar{S}_m(l) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}.   (5.12)
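As a concrete illustration of (5.6)-(5.12), the selection matrices can be built directly from a boolean availability mask of a snapshot. The following minimal NumPy sketch (the function name is ours, not from the text) constructs \bar{S}_g(l) and \bar{S}_m(l) and checks the properties (5.8)-(5.10):

```python
import numpy as np

def selection_matrices(available_mask):
    """Build the unitary selection matrices S_g (available) and S_m (missing)
    for one length-M snapshot, given a boolean mask of available samples."""
    mask = np.asarray(available_mask, dtype=bool)
    I = np.eye(mask.size)
    S_g = I[:, mask]       # M x g_l
    S_m = I[:, ~mask]      # M x (M - g_l)
    return S_g, S_m

# Example from the text: M = 5; first, third, and fourth components observed
mask = [True, False, True, True, False]
S_g, S_m = selection_matrices(mask)
assert np.allclose(S_g.T @ S_g, np.eye(3))          # (5.8)
assert np.allclose(S_m.T @ S_m, np.eye(2))          # (5.9)
assert np.allclose(S_g.T @ S_m, np.zeros((3, 2)))   # (5.10)
```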
Because we clearly have

\bar{y}_l = [\bar{S}_g(l)\bar{S}_g^T(l) + \bar{S}_m(l)\bar{S}_m^T(l)]\,\bar{y}_l = \bar{S}_g(l)\gamma_l + \bar{S}_m(l)\mu_l,   (5.13)

the joint normalized surrogate log-likelihood function of \{\gamma_l, \mu_l\} is obtained by substituting (5.13) into (4.7):

\frac{1}{L}\ln p(\{\gamma_l, \mu_l\} \,|\, \alpha(\omega), Q(\omega)) = -M\ln\pi - \ln|Q(\omega)| - \mathrm{tr}\Big\{ Q^{-1}(\omega)\,\frac{1}{L}\sum_{l=0}^{L-1} \big[\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\mu_l - \alpha(\omega)a(\omega)e^{j\omega l}\big]\big[\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\mu_l - \alpha(\omega)a(\omega)e^{j\omega l}\big]^H \Big\}.   (5.14)
Owing to the Gaussian assumption on \bar{y}_l, the random vectors

\begin{bmatrix} \mu_l \\ \gamma_l \end{bmatrix} = \begin{bmatrix} \bar{S}_m^T(l) \\ \bar{S}_g^T(l) \end{bmatrix} \bar{y}_l, \quad l = 0, \ldots, L-1   (5.15)

are also Gaussian with mean

\begin{bmatrix} \bar{S}_m^T(l) \\ \bar{S}_g^T(l) \end{bmatrix} a(\omega)\alpha(\omega)\, e^{j\omega l}, \quad l = 0, \ldots, L-1   (5.16)

and covariance matrix

\begin{bmatrix} \bar{S}_m^T(l) \\ \bar{S}_g^T(l) \end{bmatrix} Q(\omega) \begin{bmatrix} \bar{S}_m(l) & \bar{S}_g(l) \end{bmatrix}, \quad l = 0, \ldots, L-1.   (5.17)

From the Gaussian distribution of [\mu_l^T\ \gamma_l^T]^T, it follows that the probability density function of \mu_l conditioned on \gamma_l (for given \theta = \hat{\theta}_{i-1}) is a complex Gaussian with mean \hat{b}_l and covariance matrix \hat{K}_l [48]:

\mu_l \,|\, \gamma_l, \hat{\theta}_{i-1} \sim \mathcal{CN}(\hat{b}_l, \hat{K}_l),   (5.18)
where

\hat{b}_l = E\{\mu_l \,|\, \gamma_l, \hat{\theta}_{i-1}\} = \bar{S}_m^T(l)a(\omega)\hat{\alpha}_{i-1}(\omega)e^{j\omega l} + \bar{S}_m^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_g(l)\big[\bar{S}_g^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_g(l)\big]^{-1}\big[\gamma_l - \bar{S}_g^T(l)a(\omega)\hat{\alpha}_{i-1}(\omega)e^{j\omega l}\big]   (5.19)

and

\hat{K}_l = \mathrm{cov}\{\mu_l \,|\, \gamma_l, \hat{\theta}_{i-1}\} = \bar{S}_m^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_m(l) - \bar{S}_m^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_g(l)\big[\bar{S}_g^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_g(l)\big]^{-1}\bar{S}_g^T(l)\hat{Q}_{i-1}(\omega)\bar{S}_m(l).   (5.20)
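Equations (5.19) and (5.20) are the standard conditional mean and covariance of a partitioned complex Gaussian vector. A minimal NumPy sketch of this computation for one snapshot at one frequency, assuming the previous estimates alpha_prev and Q_prev are available (all names are ours, not the text's), could look as follows:

```python
import numpy as np

def conditional_moments(gamma_l, S_g, S_m, a, alpha_prev, Q_prev, omega, l):
    """Conditional mean (5.19) and covariance (5.20) of the missing part mu_l
    of snapshot l, given its available part gamma_l."""
    mean_full = a * alpha_prev * np.exp(1j * omega * l)   # mean of the full snapshot
    mean_g = S_g.T @ mean_full                            # mean of the available part
    mean_m = S_m.T @ mean_full                            # mean of the missing part
    Q_gg = S_g.T @ Q_prev @ S_g
    Q_mg = S_m.T @ Q_prev @ S_g
    Q_mm = S_m.T @ Q_prev @ S_m
    gain = Q_mg @ np.linalg.inv(Q_gg)
    b_l = mean_m + gain @ (gamma_l - mean_g)              # (5.19)
    K_l = Q_mm - gain @ Q_mg.conj().T                     # (5.20)
    return b_l, K_l
```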
Expectation: We evaluate the conditional expectation of the surrogate log-likelihood in (5.14) using (5.18)-(5.20), which is most easily done by adding and subtracting the conditional mean \hat{b}_l from \mu_l in (5.14) as follows:

\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\mu_l - \alpha(\omega)a(\omega)e^{j\omega l} = \big[\bar{S}_m(l)(\mu_l - \hat{b}_l)\big] + \big[\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\hat{b}_l - \alpha(\omega)a(\omega)e^{j\omega l}\big].   (5.21)
The cross-terms that result from the expansion of the quadratic term in (5.14) vanish when we take the conditional expectation. Therefore the expectation step yields

E\Big\{\frac{1}{L}\ln p(\{\gamma_l, \mu_l\} \,|\, \alpha(\omega), Q(\omega)) \,\Big|\, \{\gamma_l\}, \hat{\alpha}_{i-1}(\omega), \hat{Q}_{i-1}(\omega)\Big\}
= -M\ln\pi - \ln|Q(\omega)| - \mathrm{tr}\Big\{ Q^{-1}(\omega)\,\frac{1}{L}\sum_{l=0}^{L-1}\Big[ \bar{S}_m(l)\hat{K}_l\bar{S}_m^T(l) + \big(\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\hat{b}_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)\big(\bar{S}_g(l)\gamma_l + \bar{S}_m(l)\hat{b}_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)^H \Big] \Big\}.   (5.22)
Maximization: The maximization part of the EM algorithm produces updated estimates for α(ω) and Q(ω). The normalized expected surrogate log-likelihood (5.22) can be rewritten as

-M\ln\pi - \ln|Q(\omega)| - \mathrm{tr}\Big\{ Q^{-1}(\omega)\,\frac{1}{L}\sum_{l=0}^{L-1}\Big[ \Gamma_l + \big(z_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)\big(z_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)^H \Big] \Big\},   (5.23)
where we have defined

\Gamma_l \triangleq \bar{S}_m(l)\hat{K}_l\bar{S}_m^T(l)   (5.24)

and

z_l \triangleq \bar{S}_g(l)\gamma_l + \bar{S}_m(l)\hat{b}_l.   (5.25)
According to the derivation in Chapter 4, maximizing (5.23) with respect to α(ω) and Q(ω) gives

\hat{\alpha}_1(\omega) = \frac{a^H(\omega)\hat{S}^{-1}(\omega)\hat{Z}(\omega)}{a^H(\omega)\hat{S}^{-1}(\omega)a(\omega)}   (5.26)

and

\hat{Q}_1(\omega) = \hat{S}(\omega) + [\hat{\alpha}_1(\omega)a(\omega) - \hat{Z}(\omega)][\hat{\alpha}_1(\omega)a(\omega) - \hat{Z}(\omega)]^H,   (5.27)

where

\hat{Z}(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1} z_l\, e^{-j\omega l}   (5.28)

and

\hat{S}(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1}\Gamma_l + \frac{1}{L}\sum_{l=0}^{L-1} z_l z_l^H - \hat{Z}(\omega)\hat{Z}^H(\omega).   (5.29)
This completes the derivation of the MAPES-EM1 algorithm, a step-by-step summary of which is as follows:

Step 0: Obtain an initial estimate of {α(ω), Q(ω)}.
Step 1: Use the most recent estimate of {α(ω), Q(ω)} in (5.19) and (5.20) to calculate \hat{b}_l and \hat{K}_l, respectively. Note that \hat{b}_l can be regarded as the current estimate of the corresponding missing samples.
Step 2: Update the estimate of {α(ω), Q(ω)} using (5.26) and (5.27).
Step 3: Repeat steps 1 and 2 until practical convergence.

Note that when g_l = 0, which indicates that there is no available sample in the current data snapshot \bar{y}_l, \bar{S}_g(l) and \gamma_l do not exist and \bar{S}_m(l) is an M × M identity matrix; hence, the above algorithm can still be applied by simply removing any term that involves \bar{S}_g(l) or \gamma_l in the above equations.
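To make the E- and M-steps above concrete, here is a minimal NumPy sketch of one MAPES-EM1 iteration at a single frequency ω. It is a sketch under the assumptions of this section (snapshots treated as independent, every snapshot with at least one available sample), not the authors' implementation, and all names are ours:

```python
import numpy as np

def mapes_em1_iteration(snapshots, masks, a, alpha_prev, Q_prev, omega):
    """One MAPES-EM1 iteration at frequency omega.
    snapshots: (L, M) array of snapshot vectors, missing entries set to 0
    masks:     (L, M) boolean array, True where the sample is available
    Returns the updated estimates (alpha, Q) from (5.26)-(5.27)."""
    L, M = snapshots.shape
    z = np.zeros((L, M), dtype=complex)
    Gamma_sum = np.zeros((M, M), dtype=complex)

    # E-step: conditional mean/covariance of the missing samples, per snapshot
    for l in range(L):
        y_l = snapshots[l]
        mask = np.asarray(masks[l], dtype=bool)
        mean_full = a * alpha_prev * np.exp(1j * omega * l)
        g, m = np.where(mask)[0], np.where(~mask)[0]
        if m.size == 0:                  # nothing missing in this snapshot
            z[l] = y_l
            continue
        Q_gg = Q_prev[np.ix_(g, g)]
        Q_mg = Q_prev[np.ix_(m, g)]
        Q_mm = Q_prev[np.ix_(m, m)]
        gain = Q_mg @ np.linalg.inv(Q_gg)
        b_l = mean_full[m] + gain @ (y_l[g] - mean_full[g])   # (5.19)
        K_l = Q_mm - gain @ Q_mg.conj().T                     # (5.20)
        z[l] = y_l.copy()
        z[l, m] = b_l                                         # (5.25)
        Gamma = np.zeros((M, M), dtype=complex)
        Gamma[np.ix_(m, m)] = K_l                             # (5.24)
        Gamma_sum += Gamma

    # M-step: APES-like update (5.26)-(5.29)
    Z = (z * np.exp(-1j * omega * np.arange(L))[:, None]).mean(axis=0)   # (5.28)
    S = Gamma_sum / L + (z.T @ z.conj()) / L - np.outer(Z, Z.conj())     # (5.29)
    S_inv_a = np.linalg.solve(S, a)
    S_inv_Z = np.linalg.solve(S, Z)
    alpha = (a.conj() @ S_inv_Z) / (a.conj() @ S_inv_a)                  # (5.26)
    Q = S + np.outer(alpha * a - Z, (alpha * a - Z).conj())              # (5.27)
    return alpha, Q
```

In practice this iteration would be wrapped in the loop of steps 0-3 and repeated for every frequency of interest.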
5.4 MAPES-EM2
Following the observation that the same missing sample may enter many snapshots, we propose a second method to implement the EM algorithm by estimating the missing data simultaneously for all data snapshots.

Recall that the available and missing data vectors are denoted as γ (g × 1 vector) and μ [(N − g) × 1 vector], respectively. Let \bar{y} denote the LM × 1 vector obtained by concatenating all the snapshots:

\bar{y} \triangleq \begin{bmatrix} \bar{y}_0 \\ \vdots \\ \bar{y}_{L-1} \end{bmatrix} = S_g\gamma + S_m\mu,   (5.30)

where S_g (LM × g) and S_m (LM × (N − g)) are the corresponding selection matrices for the available and missing data vectors, respectively. Because of the overlapping of \{\bar{y}_l\}, S_g and S_m are not unitary, but they are still orthogonal to each other:

S_g^T S_m = 0_{g \times (N-g)}.   (5.31)
Instead of (5.6) and (5.7), we have from (5.30)

\gamma = \big(S_g^T S_g\big)^{-1} S_g^T \bar{y} = \tilde{S}_g^T \bar{y}   (5.32)

and

\mu = \big(S_m^T S_m\big)^{-1} S_m^T \bar{y} = \tilde{S}_m^T \bar{y}.   (5.33)

The matrices \tilde{S}_g and \tilde{S}_m introduced above are defined as

\tilde{S}_g \triangleq S_g\big(S_g^T S_g\big)^{-1}   (5.34)

and

\tilde{S}_m \triangleq S_m\big(S_m^T S_m\big)^{-1},   (5.35)

and they are also orthogonal to each other:

\tilde{S}_g^T \tilde{S}_m = 0_{g \times (N-g)}.   (5.36)
Note that S_g^T S_g and S_m^T S_m are diagonal matrices where each diagonal element indicates how many times the corresponding sample appears in \bar{y} owing to the overlapping of \{\bar{y}_l\}. Hence both S_g^T S_g and S_m^T S_m can be easily inverted.
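To see why these matrices are diagonal and trivially invertible, note that their diagonal entries are simply the number of overlapping snapshots each sample falls into. A short sketch (our own helper, not from the text) computes these counts directly from N and the filter length M:

```python
import numpy as np

def snapshot_counts(N, M):
    """Number of length-M overlapping snapshots that each of the N samples
    appears in; these are the diagonal entries of S^T S in (5.32)-(5.35)."""
    L = N - M + 1                    # number of snapshots
    counts = np.zeros(N, dtype=int)
    for l in range(L):               # snapshot l covers samples l, ..., l+M-1
        counts[l:l + M] += 1
    return counts

# Example: N = 8, M = 3 gives counts [1, 2, 3, 3, 3, 3, 2, 1]
print(snapshot_counts(8, 3))
```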
Now the normalized surrogate log-likelihood function in (4.6) can be written as

\frac{1}{L}\ln p(\bar{y} \,|\, \alpha(\omega), Q(\omega)) = -M\ln\pi - \frac{1}{L}\ln|D(\omega)| - \frac{1}{L}[\bar{y} - \rho(\omega)\alpha(\omega)]^H D^{-1}(\omega)[\bar{y} - \rho(\omega)\alpha(\omega)],   (5.37)

where ρ(ω) and D(ω) are defined as

\rho(\omega) \triangleq \begin{bmatrix} e^{j\omega\cdot 0}\, a(\omega) \\ \vdots \\ e^{j\omega(L-1)}\, a(\omega) \end{bmatrix}   (5.38)

and

D(\omega) \triangleq \begin{bmatrix} Q(\omega) & & 0 \\ & \ddots & \\ 0 & & Q(\omega) \end{bmatrix}.   (5.39)

Substituting (5.30) into (5.37), we obtain the joint surrogate log-likelihood of γ and μ:

\frac{1}{L}\ln p(\gamma, \mu \,|\, \alpha(\omega), Q(\omega)) = \frac{1}{L}\Big\{ -LM\ln\pi - \ln|D(\omega)| - [S_g\gamma + S_m\mu - \rho(\omega)\alpha(\omega)]^H D^{-1}(\omega)[S_g\gamma + S_m\mu - \rho(\omega)\alpha(\omega)] \Big\} + C_J,   (5.40)

where C_J is a constant that accounts for the Jacobian of the nonunitary transformation between \bar{y} and γ and μ in (5.30).
To derive the EM algorithm for the current set of assumptions, we note that for given \hat{\alpha}_{i-1}(\omega) and \hat{Q}_{i-1}(\omega), we have (as in (5.18)-(5.20))

\mu \,|\, \gamma, \hat{\theta}_{i-1} \sim \mathcal{CN}(b, K),   (5.41)

where

b = E\{\mu \,|\, \gamma, \hat{\theta}_{i-1}\} = \tilde{S}_m^T\rho(\omega)\hat{\alpha}_{i-1}(\omega) + \tilde{S}_m^T\hat{D}_{i-1}(\omega)\tilde{S}_g\big[\tilde{S}_g^T\hat{D}_{i-1}(\omega)\tilde{S}_g\big]^{-1}\big[\gamma - \tilde{S}_g^T\rho(\omega)\hat{\alpha}_{i-1}(\omega)\big]   (5.42)

and

K = \mathrm{cov}\{\mu \,|\, \gamma, \hat{\theta}_{i-1}\} = \tilde{S}_m^T\hat{D}_{i-1}(\omega)\tilde{S}_m - \tilde{S}_m^T\hat{D}_{i-1}(\omega)\tilde{S}_g\big[\tilde{S}_g^T\hat{D}_{i-1}(\omega)\tilde{S}_g\big]^{-1}\tilde{S}_g^T\hat{D}_{i-1}(\omega)\tilde{S}_m.   (5.43)
Expectation: Following the same steps as in (5.21) and (5.22), we obtain the conditional expectation of the surrogate log-likelihood function in (5.40):

E\Big\{\frac{1}{L}\ln p(\gamma, \mu \,|\, \alpha(\omega), Q(\omega)) \,\Big|\, \gamma, \hat{\alpha}_{i-1}(\omega), \hat{Q}_{i-1}(\omega)\Big\}
= -M\ln\pi - \frac{1}{L}\ln|D(\omega)| - \mathrm{tr}\Big\{ \frac{1}{L}D^{-1}(\omega)\Big[ S_m K S_m^T + \big[S_g\gamma + S_m b - \rho(\omega)\alpha(\omega)\big]\big[S_g\gamma + S_m b - \rho(\omega)\alpha(\omega)\big]^H \Big] \Big\} + C_J.   (5.44)
Maximization: To maximize the expected surrogate log-likelihood function in (5.44), we need to exploit the known structure of D(ω) and ρ(ω). Let

\begin{bmatrix} z_0 \\ \vdots \\ z_{L-1} \end{bmatrix} \triangleq S_g\gamma + S_m b   (5.45)

denote the data snapshots made up of the available and estimated data samples, where each z_l, l = 0, \ldots, L-1, is an M × 1 vector. Also let \Gamma_0, \ldots, \Gamma_{L-1} be the M × M blocks on the block diagonal of S_m K S_m^T. Then the expected surrogate log-likelihood function we need to maximize with respect to α(ω) and Q(ω) becomes (to within an additive constant)
-\ln|Q(\omega)| - \mathrm{tr}\Big\{ Q^{-1}(\omega)\,\frac{1}{L}\sum_{l=0}^{L-1}\Big[ \Gamma_l + \big(z_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)\big(z_l - \alpha(\omega)a(\omega)e^{j\omega l}\big)^H \Big] \Big\}.   (5.46)

The solution can be readily obtained by a derivation similar to that in Section 5.3:

\hat{\alpha}_2(\omega) = \frac{a^H(\omega)\hat{S}^{-1}(\omega)Z(\omega)}{a^H(\omega)\hat{S}^{-1}(\omega)a(\omega)}   (5.47)
and

\hat{Q}_2(\omega) = \hat{S}(\omega) + [\hat{\alpha}_2(\omega)a(\omega) - Z(\omega)][\hat{\alpha}_2(\omega)a(\omega) - Z(\omega)]^H,   (5.48)

where \hat{S}(\omega) and Z(\omega) are defined as

\hat{S}(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1}\Gamma_l + \frac{1}{L}\sum_{l=0}^{L-1} z_l z_l^H - Z(\omega)Z^H(\omega)   (5.49)

and

Z(\omega) \triangleq \frac{1}{L}\sum_{l=0}^{L-1} z_l\, e^{-j\omega l}.   (5.50)
The derivation of the MAPES-EM2 algorithm is thus complete, and a step-by-step summary of this algorithm is as follows:

Step 0: Obtain an initial estimate of {α(ω), Q(ω)}.
Step 1: Use the most recent estimates of {α(ω), Q(ω)} in (5.42) and (5.43) to calculate b and K. Note that b can be regarded as the current estimate of the missing sample vector.
Step 2: Update the estimates of {α(ω), Q(ω)} using (5.47) and (5.48).
Step 3: Repeat steps 1 and 2 until practical convergence.
5.5 ASPECTS OF INTEREST
5.5.1 Some Insights into the MAPES-EM Algorithms
Comparing \{\hat{\alpha}_1(\omega), \hat{Q}_1(\omega)\} in (5.26) and (5.27) [or \{\hat{\alpha}_2(\omega), \hat{Q}_2(\omega)\} in (5.47) and (5.48)] with \{\hat{\alpha}(\omega), \hat{Q}(\omega)\} in (4.13) and (4.14), we can see that the EM algorithms are doing some intuitively obvious things. In particular, the estimator of α(ω) estimates the missing data and then uses the estimate \{\hat{b}_l\} (or b) as though it were correct. The estimator of Q(ω) does the same thing, but it also adds an extra term involving the conditional covariance \hat{K}_l (or K), which can be regarded as a generalized diagonal loading operation to make the spectral estimate robust against estimation errors.

We stress again that the MAPES approach is based on a surrogate likelihood function that is not the true likelihood of the data snapshots. However, such surrogate likelihood functions (for instance, based on false uncorrelatedness or Gaussian assumptions) are known to lead to satisfactory fitting criteria under fairly reasonable conditions (see, e.g., [42, 49]). Furthermore, it can be shown that the EM algorithm applied to such a surrogate likelihood function (which is a valid probability distribution function) still has the key property in (5.5) to monotonically increase the function at each iteration.
5.5.2 MAPES-EM1 Versus MAPES-EM2
Because at each iteration and at each frequency of interest ω, MAPES-EM2 estimates the missing samples only once (for all data snapshots), it has a lower computational complexity than MAPES-EM1, which estimates the missing samples separately for each data snapshot.

It is also interesting to observe that MAPES-EM1 makes the assumption that the snapshots \{\bar{y}_l\} are independent when formulating the surrogate data likelihood function, and it maintains this assumption when estimating the missing data, hence a consistent ignoring of the overlapping. On the other hand, MAPES-EM2 makes the same assumption when formulating the surrogate data likelihood function, but in a somewhat inconsistent manner it observes the overlapping when estimating the missing data. This suggests that MAPES-EM2, which estimates fewer unknowns than MAPES-EM1, may not necessarily have a (much) better performance, as might be expected (see the examples in Section 5.7).
5.5.3 Missing-Sample Estimation
For many applications, such as data restoration, estimating the missing samples is needed and can be done via the MAPES-EM algorithms. For MAPES-EM2, at each frequency of interest ω, we take the conditional mean b as an estimate of the missing sample vector. The final estimate of the missing sample vector is the average of all b obtained from all frequencies of interest. For MAPES-EM1, at each frequency of interest, there are multiple estimates (obtained from different overlapping data snapshots) for the same missing sample. We calculate the mean of these multiple estimates before averaging once again across all frequencies of interest. We remark that we should not consider the \{\hat{b}_l\} (or b) at each frequency ω as an estimate of the ω-component of the missing data, because other frequency components contribute to the residue term as well, which determines the covariance matrix Q(ω) in the APES model.
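As a small illustration of this averaging step for MAPES-EM2, the sketch below (our own helper, not from the text) accumulates the conditional means b obtained at each evaluated frequency and returns their average as the final missing-sample estimate:

```python
import numpy as np

def average_missing_estimates(b_per_frequency):
    """b_per_frequency: list of conditional-mean vectors b, one per evaluated
    frequency omega_k, each of length (N - g).  The final missing-sample
    estimate is their average over all frequencies of interest."""
    return np.mean(np.stack(b_per_frequency, axis=0), axis=0)
```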
5.5.4 Initialization
Since in general there is no guarantee that the EM algorithm will converge to a global maximum, the MAPES-EM algorithms may converge to a local maximum, which depends on the initial estimate \hat{\theta}_0 used. To demonstrate the robustness of our MAPES-EM algorithms to the choice of the initial estimate, we will simply let the initial estimate of α(ω) be given by the WFFT with the missing data samples set to zero. The initial estimate of Q(ω) follows from (4.8), where again, the missing data samples are set to zero.
5.5.5 Stopping Criterion
We stop the iteration of the MAPES-EM algorithms whenever the relative change in the total power of the spectra corresponding to the current \{\hat{\alpha}_i(\omega_k)\} and previous \{\hat{\alpha}_{i-1}(\omega_k)\} estimates is smaller than a preselected threshold (e.g., \epsilon = 10^{-3}):

\Big| \sum_{k=0}^{K-1} |\hat{\alpha}_i(\omega_k)|^2 - \sum_{k=0}^{K-1} |\hat{\alpha}_{i-1}(\omega_k)|^2 \Big| \leq \epsilon \sum_{k=0}^{K-1} |\hat{\alpha}_{i-1}(\omega_k)|^2,   (5.51)

where we evaluate α(ω) on a K-point DFT grid: \omega_k = 2\pi k/K, for k = 0, \ldots, K-1.
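A direct implementation of the stopping rule (5.51) is straightforward; the following small sketch (names are ours) returns True when the iteration should stop:

```python
import numpy as np

def should_stop(alpha_current, alpha_previous, epsilon=1e-3):
    """Relative-change stopping criterion (5.51), with alpha_current and
    alpha_previous the spectral estimates on the K-point frequency grid."""
    power_current = np.sum(np.abs(alpha_current) ** 2)
    power_previous = np.sum(np.abs(alpha_previous) ** 2)
    return np.abs(power_current - power_previous) <= epsilon * power_previous
```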
5.6 MAPES COMPARED WITH GAPES
As explained above, MAPES is derived from a surrogate ML formulation of the APES algorithm; on the other hand, GAPES is derived from an LS formulation of APES [32]. In the complete-data case, these two approaches are equivalent in the sense that from either of them we can derive the same full-data APES spectral estimator. So at first, it might look counterintuitive that these two algorithms (MAPES and GAPES) will perform differently for the missing-data problem (see the numerical results in Section 5.7). We will now give a brief discussion about this issue.

The difference between MAPES and GAPES concerns the way they estimate μ when some data samples are missing. Although MAPES-EM estimates each missing sample separately for each frequency \omega_k (and for each data snapshot \bar{y}_l in MAPES-EM1) while GAPES estimates each missing sample by considering all K frequencies together, the real difference between them concerns the different criteria used in (3.16) and (5.3) for the estimation of μ: GAPES estimates the missing samples based on an LS fitting of the filtered data, h^H(\omega_k)\bar{y}_l. On the other hand, MAPES estimates the missing samples directly from \{\bar{y}_l\} based on an ML fitting criterion. Because the LS formulation of APES focuses on the output of the filter h(\omega_k) (which is supposed to suppress any frequency components other than \omega_k), the GAPES algorithm is sensitive to the errors in h(\omega_k) when it tries to estimate the missing data. This is why GAPES performs well in the gapped-data case, since there a good estimate of h(\omega_k) can be calculated during the initialization step. However, when the missing samples occur in an arbitrary pattern, the performance of GAPES degrades. Yet the MAPES-EM algorithms do not suffer from such a degradation.
5.7 NUMERICAL EXAMPLES
In this section we present detailed results of a few numerical examples to demonstrate the performance of the MAPES-EM algorithms for missing-data spectral estimation. We compare MAPES-EM with WFFT and GAPES. A Taylor window with order 5 and sidelobe level -35 dB is used for WFFT. We choose K = 32N to have a fine grid of discrete frequencies. We calculate the corresponding WFFT spectrum via zero-padded FFT. The so-obtained WFFT spectrum is used as the initial spectral estimate for the MAPES-EM and GAPES algorithms. The initial estimate of Q(ω) for MAPES-EM has been discussed before, and the initial estimate of h(ω) for GAPES is calculated from (2.12), where the missing samples are set to zero. We stop the MAPES-EM and the GAPES algorithms using the same stopping criterion in (5.51) with ε selected as 10^{-3} and 10^{-2}, respectively. The reason we choose a larger ε for GAPES is that it converges relatively slowly for the general missing-data problem and its spectral estimate would not improve much if we used an ε < 10^{-2}. All the adaptive filtering algorithms considered (i.e., APES, GAPES, and MAPES-EM) use a filter length of M = N/2 for achieving high resolution.

The true spectrum of the simulated signal is shown in Fig. 5.1(a), where we have four spectral lines located at f_1 = 0.05 Hz, f_2 = 0.065 Hz, f_3 = 0.26 Hz, and f_4 = 0.28 Hz with complex amplitudes α_1 = α_2 = α_3 = 1 and α_4 = 0.5. Besides these spectral lines, Fig. 5.1(a) also shows a continuous spectral component centered at 0.18 Hz with a width b = 0.015 Hz and a constant modulus of 0.25. The data sequence has N = 128 samples, among which 51 (40%) samples are missing; the locations of the missing samples are chosen arbitrarily. The data is corrupted by a zero-mean circularly symmetric complex white Gaussian noise with variance σ_n^2 = 0.01.

In Fig. 5.1(b), the APES algorithm is applied to the complete data and the resulting spectrum is shown. The APES spectrum will be used later as a reference for comparison purposes. The WFFT spectrum for the incomplete data is shown in Fig. 5.1(c), where the artifacts due to the missing data are readily observed. As expected, the WFFT spectrum has poor resolution and high sidelobes, and it underestimates the true spectrum. Note that the WFFT spectrum will be used as the initial estimate for the GAPES and MAPES algorithms. Fig. 5.1(d) shows the GAPES spectrum. GAPES also underestimates the sinusoidal components and gives some artifacts. Apparently, owing to the poor initial estimate of h(ω_k) for the incomplete data, GAPES converges to one of the local minima of the cost function in (3.16). Figs. 5.1(e) and 5.1(f) show the MAPES-EM1 and MAPES-EM2 spectral estimates.
FIGURE 5.1: Modulus of the missing-data spectral estimates versus frequency (Hz) [N = 128, σ_n^2 = 0.01, 51 (40%) missing samples]. (a) True spectrum, (b) complete-data APES, (c) WFFT, (d) GAPES with M = 64 and ε = 10^{-2}, (e) MAPES-EM1 with M = 64 and ε = 10^{-3}, and (f) MAPES-EM2 with M = 64 and ε = 10^{-3}.
Both MAPES algorithms perform quite well, and their spectral estimates are similar to the high-resolution APES spectrum in Fig. 5.1(b).

The MAPES-EM1 and MAPES-EM2 spectral estimates at different iterations are plotted in Figs. 5.2(a) and 5.2(b), respectively. Both algorithms converge quickly, with MAPES-EM1 converging after 10 iterations and MAPES-EM2 after only 6.

The data restoration performance of MAPES-EM is shown in Fig. 5.3. The missing samples are estimated using the averaging approach we introduced previously. Figs. 5.3(a) and 5.3(b) display the real and imaginary parts of the interpolated data, respectively, obtained via MAPES-EM1. Figs. 5.3(c) and 5.3(d) show the corresponding results for MAPES-EM2. The locations of the missing samples are also indicated in Fig. 5.3. The missing samples estimated via the MAPES-EM algorithms are quite accurate. More detailed results for MAPES-EM2 are shown in Fig. 5.4. (Those for MAPES-EM1 are similar.) For a clear visualization, only the estimates of the first three missing samples are shown in Fig. 5.4. The real and imaginary parts of the estimated samples as a function of frequency are plotted in Figs. 5.4(a) and 5.4(b), respectively. All estimates are close to the corresponding true values, which are also indicated in Fig. 5.4. It is interesting to note that larger variations occur at frequencies where strong signal components are present.

The results displayed so far were for one randomly picked realization of the data. Using 100 Monte Carlo simulations (varying the realizations of the noise, the initial phases of the different spectral components, and the missing-data patterns), we obtain the root mean-squared errors (RMSEs) of the magnitude and phase estimates of the four spectral lines at their true frequency locations. These RMSEs for WFFT, GAPES, and MAPES-EM are listed in Tables 5.1 and 5.2. Based on this limited set of Monte Carlo simulations, we can see that the two MAPES-EM algorithms perform similarly, and that they are much more accurate than WFFT and GAPES. A similar behavior has been observed in several other numerical experiments.
FIGURE 5.2: Modulus of the missing-data spectral estimates obtained via the MAPES-EM algorithms at different iterations [N = 128, σ_n^2 = 0.01, 51 (40%) missing samples]. (a) MAPES-EM1 and (b) MAPES-EM2.
FIGURE 5.3: Interpolation of the missing samples [N = 128, σ_n^2 = 0.01, 51 (40%) missing samples]. (a) Real part of the data interpolated via MAPES-EM1, (b) imaginary part of the data interpolated via MAPES-EM1, (c) real part of the data interpolated via MAPES-EM2, and (d) imaginary part of the data interpolated via MAPES-EM2.
FIGURE 5.4: Estimation of the first three missing samples versus frequency [N = 128, σ_n^2 = 0.01, 51 (40%) samples are missing; vertical dotted lines indicate the true frequency locations of the spectral components, with the closely spaced lines indicating the continuous spectral component]. (a) Real part and (b) imaginary part of the missing samples estimated via MAPES-EM2.
TABLE 5.1: RMSEs of the Magnitude Estimates Obtained via the WFFT, GAPES, and MAPES-EM Spectral Estimators

            WFFT    GAPES   MAPES-EM1   MAPES-EM2
Signal 1    0.420   0.175   0.010       0.011
Signal 2    0.417   0.205   0.010       0.010
Signal 3    0.419   0.169   0.010       0.009
Signal 4    0.205   0.164   0.009       0.011
Next, we increase the number of missing samples to 77 (60% of the original data). The results of WFFT, GAPES, MAPES-EM1, and MAPES-EM2 are shown in Figs. 5.5(a)-5.5(d), respectively. The signal amplitudes in the WFFT spectrum are low, presumably due to the small (40%) amount of available data samples. The artifacts are so high that we can hardly identify the signal components. GAPES also performs poorly [see Fig. 5.5(b)], as expected. The MAPES-EM algorithms provide excellent spectral estimates with relatively minor artifacts.

In our next experiment, we keep the 40% data missing rate but increase the noise variance to σ_n^2 = 0.1 (10 dB higher than in the previous experiments).
TABLE 5.2: RMSEs of the Phase (Radian) Estimates Obtained via the WFFT, GAPES, and MAPES-EM Spectral Estimators

            WFFT    GAPES   MAPES-EM1   MAPES-EM2
Signal 1    0.077   0.042   0.021       0.021
Signal 2    0.059   0.038   0.024       0.025
Signal 3    0.099   0.037   0.012       0.013
Signal 4    0.133   0.029   0.022       0.022
FIGURE 5.5: Modulus of the missing-data spectral estimates [N = 128, σ_n^2 = 0.01, 77 (60%) missing samples] obtained via (a) WFFT, (b) GAPES with M = 64 and ε = 10^{-2}, (c) MAPES-EM1 with M = 64 and ε = 10^{-3}, and (d) MAPES-EM2 with M = 64 and ε = 10^{-3}.
The corresponding moduli of the spectral estimates of complete-data WFFT, APES, missing-data WFFT, GAPES, MAPES-EM1, and MAPES-EM2 are plotted in Figs. 5.6(a)-5.6(f), respectively. Again, the performance of the MAPES-EM algorithms is excellent.

In our last experiment, we plot the RMSEs of the MAPES-EM1 estimates as functions of the missing-sample rate in Fig. 5.7. Only the RMSEs of the estimates of the first spectral line located at f_1 = 0.05 Hz are plotted, since the results for the others are similar.
FIGURE 5.6: Modulus of the missing-data spectral estimates [N = 128, σ_n^2 = 0.1, 51 (40%) missing samples] obtained via (a) complete-data WFFT, (b) complete-data APES, (c) WFFT, (d) GAPES with M = 64 and ε = 10^{-2}, (e) MAPES-EM1 with M = 64 and ε = 10^{-3}, and (f) MAPES-EM2 with M = 64 and ε = 10^{-3}.
FIGURE 5.7: RMSEs of the estimates of the first spectral line located at f_1 = 0.05 Hz obtained via MAPES-EM1, plotted versus the missing-sample rate for SNR = 0, 5, 10, 15, and 20 dB. (a) Amplitude and (b) phase (radian).
Each result is based on 100 Monte Carlo simulations (by varying, in each trial, the realization of the noise, the initial phases of the different spectral components, and the missing-data pattern). The signal-to-noise ratio (SNR) is defined as

\mathrm{SNR}_1 = 10\log_{10}\frac{|\alpha_1|^2}{\sigma_n^2}\ \ (\mathrm{dB}),   (5.52)

with α_1 being the complex amplitude of the first sinusoid. For each fixed SNR, the RMSEs increase as the number of missing samples increases. Also as expected, the RMSEs decrease when the SNR increases. Similar results can be observed for MAPES-EM2.
CHAPTER 6
Two-Dimensional MAPES via Expectation Maximization and Cyclic Maximization
6.1 INTRODUCTION
In Chapter 5, we proposed the 1-D MAPES-EM algorithms to deal with the general missing-data problem where the missing data samples occur in arbitrary patterns [45]. The MAPES-EM algorithms were derived following an ML fitting based approach in which an ML fitting problem was solved iteratively via the EM algorithm. MAPES-EM exhibits better spectral estimation performance than GAPES does. However, the MAPES-EM algorithms are computationally intensive because they estimate the missing samples separately for each frequency of interest. The direct application of MAPES-EM to large data sets, e.g., 2-D data, is computationally prohibitive.

Herein we consider the problem of 2-D nonparametric spectral estimation of data matrices with missing data samples occurring in arbitrary patterns [50]. First, we present the 2-D extensions of the MAPES-EM algorithms introduced in [45] for the 1-D case. Then we develop a new MAPES algorithm, referred to as MAPES-CM, by solving an ML problem iteratively via cyclic maximization (CM) [42]. MAPES-EM and MAPES-CM possess similar spectral estimation performance, but the computational complexity of the latter is much lower than that of the former.

The remainder of this chapter is organized as follows: In Section 6.2, we review the 2-D nonparametric APES algorithm. In Section 6.3, we present 2-D extensions of the MAPES-EM algorithms, and we develop the 2-D MAPES-CM algorithm in Section 6.4. In Section 6.5, we compare MAPES-CM with MAPES-EM from both theoretical and computational points of view. Numerical examples are provided in Section 6.6 to demonstrate the performance of the MAPES algorithms.
6.2 TWO-DIMENSIONAL ML-BASED APES
In this section we provide the 2-D extension of APES, devised via an ML fitting based approach, for complete-data spectral estimation.

Consider the 2-D problem introduced in Section 3.3.1. Partition the N_1 × N_2 data matrix

Y \triangleq \begin{bmatrix} y_{0,0} & y_{0,1} & \cdots & y_{0,N_2-1} \\ y_{1,0} & y_{1,1} & \cdots & y_{1,N_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{N_1-1,0} & y_{N_1-1,1} & \cdots & y_{N_1-1,N_2-1} \end{bmatrix}   (6.1)

into L_1 L_2 overlapping submatrices of size M_1 × M_2:

\bar{Y}_{l_1,l_2} = \begin{bmatrix} y_{l_1,l_2} & y_{l_1,l_2+1} & \cdots & y_{l_1,l_2+M_2-1} \\ y_{l_1+1,l_2} & y_{l_1+1,l_2+1} & \cdots & y_{l_1+1,l_2+M_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{l_1+M_1-1,l_2} & y_{l_1+M_1-1,l_2+1} & \cdots & y_{l_1+M_1-1,l_2+M_2-1} \end{bmatrix},   (6.2)

where l_1 = 0, \ldots, L_1-1, l_2 = 0, \ldots, L_2-1, L_1 \triangleq N_1 - M_1 + 1, and L_2 \triangleq N_2 - M_2 + 1. Increasing M_1 and M_2 typically increases the spectral resolution at the cost of reducing the statistical stability of the spectral estimates due to the reduced number of submatrices. Typically, we choose M_1 = N_1/2 and M_2 = N_2/2 [13, 15].
Recall that

y_{l_1,l_2} = \mathrm{vec}[\bar{Y}_{l_1,l_2}]   (6.3)

and

a(\omega_1, \omega_2) = a_{M_2}(\omega_2) \otimes a_{M_1}(\omega_1),   (6.4)

where ⊗ denotes the Kronecker matrix product, and

a_{M_k}(\omega_k) \triangleq \big[1 \;\; e^{j\omega_k} \;\; \cdots \;\; e^{j(M_k-1)\omega_k}\big]^T, \quad k = 1, 2.   (6.5)
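The constructions in (6.2)-(6.5) translate directly into code. The following minimal NumPy sketch (function names are ours, not from the text) extracts and vectorizes one M_1 × M_2 submatrix and builds the 2-D steering vector via the Kronecker product:

```python
import numpy as np

def snapshot_vector(Y, l1, l2, M1, M2):
    """Vectorized overlapping submatrix y_{l1,l2} = vec[Y_{l1,l2}] of (6.2)-(6.3)."""
    sub = Y[l1:l1 + M1, l2:l2 + M2]
    return sub.reshape(-1, order="F")     # column-wise stacking (vec)

def steering_vector_2d(omega1, omega2, M1, M2):
    """2-D steering vector a(omega1, omega2) = a_{M2}(omega2) kron a_{M1}(omega1), (6.4)-(6.5)."""
    a1 = np.exp(1j * omega1 * np.arange(M1))
    a2 = np.exp(1j * omega2 * np.arange(M2))
    return np.kron(a2, a1)
```

With column-major (vec) stacking, a noise-free sinusoid gives snapshot_vector(...) = α a(ω_1, ω_2) e^{j(ω_1 l_1 + ω_2 l_2)}, consistent with (6.6) below.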
Then, according to (3.17), the snapshot vector y_{l_1,l_2} corresponding to \bar{Y}_{l_1,l_2} can be written as

y_{l_1,l_2} = [\alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)]\, e^{j(\omega_1 l_1 + \omega_2 l_2)} + e_{l_1,l_2}(\omega_1,\omega_2),   (6.6)

where e_{l_1,l_2}(\omega_1,\omega_2) is formed from \{e_{n_1,n_2}(\omega_1,\omega_2)\} in the same way as y_{l_1,l_2} is made from \{y_{n_1,n_2}\}. To estimate α(ω_1, ω_2), the APES algorithm mimics an ML estimator by assuming that \{e_{l_1,l_2}(\omega_1,\omega_2)\}_{l_1=0,\,l_2=0}^{L_1-1,\,L_2-1} are zero-mean circularly symmetric complex Gaussian random vectors that are statistically independent of each other and have the same unknown covariance matrix

Q(\omega_1,\omega_2) = E\big\{ e_{l_1,l_2}(\omega_1,\omega_2)\, e_{l_1,l_2}^H(\omega_1,\omega_2) \big\}.   (6.7)
Using the above assumptions, we get the normalized surrogate log-likelihood function of the data snapshots \{y_{l_1,l_2}\} as follows:

\frac{1}{L_1 L_2}\ln p(\{y_{l_1,l_2}\} \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2))
= -M_1 M_2\ln\pi - \ln|Q(\omega_1,\omega_2)| - \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} \big[y_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]^H Q^{-1}(\omega_1,\omega_2) \big[y_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]   (6.8)

= -M_1 M_2\ln\pi - \ln|Q(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ Q^{-1}(\omega_1,\omega_2)\,\frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} \big[y_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]\big[y_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]^H \Big\}.   (6.9)
Just as in the 1-D case, the maximization of the above surrogate likelihood function gives the APES estimator

\hat{\alpha}(\omega_1,\omega_2) = \frac{a^H(\omega_1,\omega_2)\hat{S}^{-1}(\omega_1,\omega_2)\hat{g}(\omega_1,\omega_2)}{a^H(\omega_1,\omega_2)\hat{S}^{-1}(\omega_1,\omega_2)a(\omega_1,\omega_2)}   (6.10)

and

\hat{Q}(\omega_1,\omega_2) = \hat{S}(\omega_1,\omega_2) + [\hat{\alpha}_{\mathrm{ML}}(\omega_1,\omega_2)a(\omega_1,\omega_2) - \hat{g}(\omega_1,\omega_2)][\hat{\alpha}_{\mathrm{ML}}(\omega_1,\omega_2)a(\omega_1,\omega_2) - \hat{g}(\omega_1,\omega_2)]^H,   (6.11)

where \hat{R}, \hat{g}(\omega_1,\omega_2), and \hat{S}(\omega_1,\omega_2) are as defined in Section 3.3.1.
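As a reference point for the missing-data algorithms that follow, here is a minimal NumPy sketch of the complete-data 2-D APES estimate (6.10) at one frequency pair. It assumes that ĝ(ω_1, ω_2) is the normalized Fourier transform of the snapshots and that Ŝ = R̂ − ĝĝ^H; those definitions come from Section 3.3.1 and are not restated in this chapter, so treat them here as our assumption. All names are ours:

```python
import numpy as np

def apes_2d(snapshots, shifts, a, omega1, omega2):
    """Complete-data 2-D APES estimate (6.10) at (omega1, omega2).
    snapshots: (L1*L2, M1*M2) array, rows are y_{l1,l2} = vec[Y_{l1,l2}]
    shifts:    (L1*L2, 2) array, rows are the (l1, l2) of each snapshot
    a:         steering vector a(omega1, omega2) of length M1*M2."""
    phases = np.exp(-1j * (omega1 * shifts[:, 0] + omega2 * shifts[:, 1]))
    g_hat = (snapshots * phases[:, None]).mean(axis=0)        # assumed definition of g-hat
    R_hat = snapshots.T @ snapshots.conj() / snapshots.shape[0]
    S_hat = R_hat - np.outer(g_hat, g_hat.conj())             # assumed definition of S-hat
    S_inv_g = np.linalg.solve(S_hat, g_hat)
    S_inv_a = np.linalg.solve(S_hat, a)
    return (a.conj() @ S_inv_g) / (a.conj() @ S_inv_a)        # (6.10)
```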
6.3 TWO-DIMENSIONAL MAPES VIA EM
Assume that some arbitrary elements of the data matrix Y are missing. Because of these missing data samples, which can be treated as unknowns, the log-likelihood function (6.8) cannot be maximized directly. In this section, we will show how to tackle this missing-data problem, in the ML context, using the EM and CM algorithms. A comparison of these two approaches is also provided.

6.3.1 Two-Dimensional MAPES-EM1
We assume that the data snapshots \{\bar{Y}_{l_1,l_2}\} (or \{y_{l_1,l_2}\}) are independent of each other, and we estimate the missing data separately for different data snapshots. For each data snapshot y_{l_1,l_2}, let \gamma_{l_1,l_2} and \mu_{l_1,l_2} denote the vectors containing the available and missing elements of y_{l_1,l_2}, respectively. Assume that \gamma_{l_1,l_2} has dimension g_{l_1,l_2} × 1, where 1 \leq g_{l_1,l_2} \leq M_1 M_2 is the number of available elements in the snapshot y_{l_1,l_2}. Then \gamma_{l_1,l_2} and \mu_{l_1,l_2} are related to y_{l_1,l_2} by unitary transformations as follows:

\gamma_{l_1,l_2} = \bar{S}_g^T(l_1,l_2)\, y_{l_1,l_2}   (6.12)
\mu_{l_1,l_2} = \bar{S}_m^T(l_1,l_2)\, y_{l_1,l_2},   (6.13)

where \bar{S}_g(l_1,l_2) and \bar{S}_m(l_1,l_2) are M_1 M_2 \times g_{l_1,l_2} and M_1 M_2 \times (M_1 M_2 - g_{l_1,l_2}) unitary selection matrices such that \bar{S}_g^T(l_1,l_2)\bar{S}_g(l_1,l_2) = I_{g_{l_1,l_2}}, \bar{S}_m^T(l_1,l_2)\bar{S}_m(l_1,l_2) = I_{M_1 M_2 - g_{l_1,l_2}}, and \bar{S}_g^T(l_1,l_2)\bar{S}_m(l_1,l_2) = 0_{g_{l_1,l_2} \times (M_1 M_2 - g_{l_1,l_2})}. For example, let M_1 = 3, M_2 = 2, and let

\bar{Y}_{l_1,l_2} = \begin{bmatrix} y_{l_1,l_2} & \bullet \\ \bullet & \bullet \\ y_{l_1+2,l_2} & y_{l_1+2,l_2+1} \end{bmatrix},   (6.14)

where each \bullet indicates a missing sample. Then we have g_{l_1,l_2} = 3,

\gamma_{l_1,l_2} = [y_{l_1,l_2} \;\; y_{l_1+2,l_2} \;\; y_{l_1+2,l_2+1}]^T,   (6.15)

and

\bar{S}_g(l_1,l_2) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \bar{S}_m(l_1,l_2) = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.   (6.16)
Because we have

y_{l_1,l_2} = \big[\bar{S}_g(l_1,l_2)\bar{S}_g^T(l_1,l_2) + \bar{S}_m(l_1,l_2)\bar{S}_m^T(l_1,l_2)\big]\, y_{l_1,l_2} = \bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\mu_{l_1,l_2},   (6.17)

the joint normalized surrogate log-likelihood function of \{\gamma_{l_1,l_2}, \mu_{l_1,l_2}\} is obtained by substituting (6.17) into (6.9):

\frac{1}{L_1 L_2}\ln p(\{\gamma_{l_1,l_2}, \mu_{l_1,l_2}\} \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2))
= -M_1 M_2\ln\pi - \ln|Q(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ Q^{-1}(\omega_1,\omega_2)\,\frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} \big[\bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\mu_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]\big[\bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\mu_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]^H \Big\}.   (6.18)
Just as in the 1-D case, the probability density function of \mu_{l_1,l_2} conditioned on \gamma_{l_1,l_2} (for given \theta = \hat{\theta}_{i-1}) is complex Gaussian with mean \hat{b}_{l_1,l_2} and covariance matrix \hat{K}_{l_1,l_2}:

\mu_{l_1,l_2} \,|\, \gamma_{l_1,l_2}, \hat{\theta}_{i-1} \sim \mathcal{CN}(\hat{b}_{l_1,l_2}, \hat{K}_{l_1,l_2}),   (6.19)

where

\hat{b}_{l_1,l_2} = E\{\mu_{l_1,l_2} \,|\, \gamma_{l_1,l_2}, \hat{\theta}_{i-1}\}
= \bar{S}_m^T(l_1,l_2)a(\omega_1,\omega_2)\hat{\alpha}_{i-1}(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)} + \bar{S}_m^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_g(l_1,l_2)\big[\bar{S}_g^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_g(l_1,l_2)\big]^{-1}\big[\gamma_{l_1,l_2} - \bar{S}_g^T(l_1,l_2)a(\omega_1,\omega_2)\hat{\alpha}_{i-1}(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big]   (6.20)

and

\hat{K}_{l_1,l_2} = \mathrm{cov}\{\mu_{l_1,l_2} \,|\, \gamma_{l_1,l_2}, \hat{\theta}_{i-1}\}
= \bar{S}_m^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_m(l_1,l_2) - \bar{S}_m^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_g(l_1,l_2)\big[\bar{S}_g^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_g(l_1,l_2)\big]^{-1}\bar{S}_g^T(l_1,l_2)\hat{Q}_{i-1}(\omega_1,\omega_2)\bar{S}_m(l_1,l_2).   (6.21)
Expectation: The conditional expectation of the surrogate log-likelihood in (6.18) according to (6.19)-(6.21) is given by

E\Big\{\frac{1}{L_1 L_2}\ln p(\{\gamma_{l_1,l_2}, \mu_{l_1,l_2}\} \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2)) \,\Big|\, \{\gamma_{l_1,l_2}\}, \hat{\alpha}_{i-1}(\omega_1,\omega_2), \hat{Q}_{i-1}(\omega_1,\omega_2)\Big\}
= -M_1 M_2\ln\pi - \ln|Q(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ Q^{-1}(\omega_1,\omega_2)\,\frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1}\Big[ \bar{S}_m(l_1,l_2)\hat{K}_{l_1,l_2}\bar{S}_m^T(l_1,l_2) + \big(\bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\hat{b}_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)\big(\bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\hat{b}_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)^H \Big] \Big\}.   (6.22)
Maximization: The normalized expected surrogate log-likelihood (6.22) can be rewritten as

-M_1 M_2\ln\pi - \ln|Q(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ Q^{-1}(\omega_1,\omega_2)\,\frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1}\Big[ \Gamma_{l_1,l_2} + \big(z_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)\big(z_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)^H \Big] \Big\},   (6.23)

where we have defined

\Gamma_{l_1,l_2} \triangleq \bar{S}_m(l_1,l_2)\hat{K}_{l_1,l_2}\bar{S}_m^T(l_1,l_2)   (6.24)

and

z_{l_1,l_2} \triangleq \bar{S}_g(l_1,l_2)\gamma_{l_1,l_2} + \bar{S}_m(l_1,l_2)\hat{b}_{l_1,l_2}.   (6.25)
Maximizing (6.23) gives

\hat{\alpha}_1(\omega_1,\omega_2) = \frac{a^H(\omega_1,\omega_2)\hat{S}^{-1}(\omega_1,\omega_2)\hat{Z}(\omega_1,\omega_2)}{a^H(\omega_1,\omega_2)\hat{S}^{-1}(\omega_1,\omega_2)a(\omega_1,\omega_2)}   (6.26)

and

\hat{Q}_1(\omega_1,\omega_2) = \hat{S}(\omega_1,\omega_2) + [\hat{\alpha}_1(\omega_1,\omega_2)a(\omega_1,\omega_2) - \hat{Z}(\omega_1,\omega_2)][\hat{\alpha}_1(\omega_1,\omega_2)a(\omega_1,\omega_2) - \hat{Z}(\omega_1,\omega_2)]^H,   (6.27)

where

\hat{Z}(\omega_1,\omega_2) \triangleq \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} z_{l_1,l_2}\, e^{-j(\omega_1 l_1+\omega_2 l_2)}   (6.28)

and

\hat{S}(\omega_1,\omega_2) \triangleq \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1}\Gamma_{l_1,l_2} + \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} z_{l_1,l_2} z_{l_1,l_2}^H - \hat{Z}(\omega_1,\omega_2)\hat{Z}^H(\omega_1,\omega_2).   (6.29)
This completes the derivation of the 2-D MAPES-EM1 algorithm, a step-by-step summary of which is as follows:

Step 0: Obtain an initial estimate of {α(ω_1, ω_2), Q(ω_1, ω_2)}.
Step 1: Use the most recent estimate of {α(ω_1, ω_2), Q(ω_1, ω_2)} in (6.20) and (6.21) to calculate \hat{b}_{l_1,l_2} and \hat{K}_{l_1,l_2}, respectively. Note that \hat{b}_{l_1,l_2} can be regarded as the current estimate of the corresponding missing samples.
Step 2: Update the estimate of {α(ω_1, ω_2), Q(ω_1, ω_2)} using (6.26) and (6.27).
Step 3: Repeat steps 1 and 2 until practical convergence.
6.3.2 Two-Dimensional MAPES-EM2
MAPES-EM2 utilizes the EM algorithm by estimating the missing data simultaneously for all data snapshots. Let

y = \mathrm{vec}[Y]   (6.30)

denote the vector of all the data samples. Recall that γ and μ denote the vectors containing the available and missing elements of y, respectively, where γ has a size of g × 1.

We let \bar{y} denote the L_1 L_2 M_1 M_2 × 1 vector obtained by concatenating all the snapshots:

\bar{y} \triangleq \begin{bmatrix} y_{0,0} \\ \vdots \\ y_{L_1-1,L_2-1} \end{bmatrix} = S_g\gamma + S_m\mu,   (6.31)

where S_g (which has a size of L_1 L_2 M_1 M_2 × g) and S_m (which has a size of L_1 L_2 M_1 M_2 × (N_1 N_2 − g)) are the corresponding selection matrices for the available and missing data vectors, respectively. Because of the overlapping of the vectors \{y_{l_1,l_2}\}, S_g and S_m are not unitary, but they are still orthogonal to each other: S_g^T S_m = 0_{g \times (N_1 N_2 - g)}. So instead of (6.12) and (6.13), we have from (6.31)

\gamma = \big(S_g^T S_g\big)^{-1} S_g^T \bar{y} = \tilde{S}_g^T \bar{y}   (6.32)

and

\mu = \big(S_m^T S_m\big)^{-1} S_m^T \bar{y} = \tilde{S}_m^T \bar{y},   (6.33)

where the matrices \tilde{S}_g and \tilde{S}_m introduced above are defined as \tilde{S}_g \triangleq S_g(S_g^T S_g)^{-1} and \tilde{S}_m \triangleq S_m(S_m^T S_m)^{-1}, and they are also orthogonal to each other: \tilde{S}_g^T \tilde{S}_m = 0_{g \times (N_1 N_2 - g)}.
Now the normalized surrogate log-likelihood function in (6.8) can be written as

\frac{1}{L_1 L_2}\ln p(\bar{y} \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2))
= -M_1 M_2\ln\pi - \frac{1}{L_1 L_2}\ln|D(\omega_1,\omega_2)| - \frac{1}{L_1 L_2}[\bar{y} - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)]^H D^{-1}(\omega_1,\omega_2)[\bar{y} - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)],   (6.34)

where ρ(ω_1, ω_2) and D(ω_1, ω_2) are defined as

\rho(\omega_1,\omega_2) \triangleq \begin{bmatrix} e^{j(\omega_1\cdot 0 + \omega_2\cdot 0)}\, a(\omega_1,\omega_2) \\ \vdots \\ e^{j[\omega_1(L_1-1) + \omega_2(L_2-1)]}\, a(\omega_1,\omega_2) \end{bmatrix}   (6.35)

and

D(\omega_1,\omega_2) \triangleq \begin{bmatrix} Q(\omega_1,\omega_2) & & 0 \\ & \ddots & \\ 0 & & Q(\omega_1,\omega_2) \end{bmatrix}.   (6.36)
Substituting (6.31) into (6.34), we obtain the joint surrogate log-likelihood of γ and μ:

\frac{1}{L_1 L_2}\ln p(\gamma, \mu \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2))
= \frac{1}{L_1 L_2}\Big\{ -L_1 L_2 M_1 M_2\ln\pi - \ln|D(\omega_1,\omega_2)| - [S_g\gamma + S_m\mu - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)]^H D^{-1}(\omega_1,\omega_2)[S_g\gamma + S_m\mu - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)] \Big\} + C_J,   (6.37)

where C_J is a constant that accounts for the Jacobian of the nonunitary transformation between \bar{y} and γ and μ in (6.31).
To derive the EM algorithm for the current set of assumptions, we have

\mu \,|\, \gamma, \hat{\theta}_{i-1} \sim \mathcal{CN}(b, K),   (6.38)

where

b = E\{\mu \,|\, \gamma, \hat{\theta}_{i-1}\} = \tilde{S}_m^T\rho(\omega_1,\omega_2)\hat{\alpha}_{i-1}(\omega_1,\omega_2) + \tilde{S}_m^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_g\big[\tilde{S}_g^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_g\big]^{-1}\big[\gamma - \tilde{S}_g^T\rho(\omega_1,\omega_2)\hat{\alpha}_{i-1}(\omega_1,\omega_2)\big]   (6.39)

and

K = \mathrm{cov}\{\mu \,|\, \gamma, \hat{\theta}_{i-1}\} = \tilde{S}_m^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_m - \tilde{S}_m^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_g\big[\tilde{S}_g^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_g\big]^{-1}\tilde{S}_g^T\hat{D}_{i-1}(\omega_1,\omega_2)\tilde{S}_m.   (6.40)
Expectation: The conditional expectation of the surrogate log-likelihood function in (6.37) is given as

E\Big\{\frac{1}{L_1 L_2}\ln p(\gamma, \mu \,|\, \alpha(\omega_1,\omega_2), Q(\omega_1,\omega_2)) \,\Big|\, \gamma, \hat{\alpha}_{i-1}(\omega_1,\omega_2), \hat{Q}_{i-1}(\omega_1,\omega_2)\Big\}
= -M_1 M_2\ln\pi - \frac{1}{L_1 L_2}\ln|D(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ \frac{1}{L_1 L_2}D^{-1}(\omega_1,\omega_2)\Big[ S_m K S_m^T + \big[S_g\gamma + S_m b - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)\big]\big[S_g\gamma + S_m b - \rho(\omega_1,\omega_2)\alpha(\omega_1,\omega_2)\big]^H \Big] \Big\} + C_J.   (6.41)
Maximization: To maximize the expected surrogate log-likelihood function in (6.41), we again exploit the known structure of D(ω_1, ω_2) and ρ(ω_1, ω_2). Let

\begin{bmatrix} z_{0,0} \\ \vdots \\ z_{L_1-1,L_2-1} \end{bmatrix} \triangleq S_g\gamma + S_m b   (6.42)

denote the data snapshots made up of the available and estimated data samples, where each z_{l_1,l_2} (l_1 = 0, \ldots, L_1-1 and l_2 = 0, \ldots, L_2-1) is an M_1 M_2 × 1 vector. Also let \Gamma_{0,0}, \ldots, \Gamma_{L_1-1,L_2-1} be the M_1 M_2 × M_1 M_2 blocks on the block diagonal of S_m K S_m^T. Then the expected surrogate log-likelihood function we need to maximize with respect to α(ω_1, ω_2) and Q(ω_1, ω_2) becomes (to within an additive constant)

-\ln|Q(\omega_1,\omega_2)| - \mathrm{tr}\Big\{ Q^{-1}(\omega_1,\omega_2)\,\frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1}\Big[ \Gamma_{l_1,l_2} + \big(z_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)\big(z_{l_1,l_2} - \alpha(\omega_1,\omega_2)a(\omega_1,\omega_2)e^{j(\omega_1 l_1+\omega_2 l_2)}\big)^H \Big] \Big\}.   (6.43)
The solution becomes

\hat{\alpha}_2(\omega_1,\omega_2) = \frac{a^H(\omega_1,\omega_2)S^{-1}(\omega_1,\omega_2)Z(\omega_1,\omega_2)}{a^H(\omega_1,\omega_2)S^{-1}(\omega_1,\omega_2)a(\omega_1,\omega_2)}   (6.44)

and

\hat{Q}_2(\omega_1,\omega_2) = S(\omega_1,\omega_2) + [\hat{\alpha}_2(\omega_1,\omega_2)a(\omega_1,\omega_2) - Z(\omega_1,\omega_2)][\hat{\alpha}_2(\omega_1,\omega_2)a(\omega_1,\omega_2) - Z(\omega_1,\omega_2)]^H,   (6.45)

where S(ω_1, ω_2) and Z(ω_1, ω_2) are defined as

S(\omega_1,\omega_2) \triangleq \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1}\Gamma_{l_1,l_2} + \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} z_{l_1,l_2} z_{l_1,l_2}^H - Z(\omega_1,\omega_2)Z^H(\omega_1,\omega_2)   (6.46)

and

Z(\omega_1,\omega_2) \triangleq \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} z_{l_1,l_2}\, e^{-j(\omega_1 l_1+\omega_2 l_2)}.   (6.47)
The derivation of the MAPES-EM2 algorithm is thus complete, and a step-by-step summary of this algorithm is as follows:

Step 0: Obtain an initial estimate of {α(ω_1, ω_2), Q(ω_1, ω_2)}.
Step 1: Use the most recent estimates of {α(ω_1, ω_2), Q(ω_1, ω_2)} in (6.39) and (6.40) to calculate b and K. Note that b can be regarded as the current estimate of the missing sample vector.
Step 2: Update the estimates of {α(ω_1, ω_2), Q(ω_1, ω_2)} using (6.44) and (6.45).
Step 3: Repeat steps 1 and 2 until practical convergence.
6.4 TWO-DIMENSIONAL MAPES VIA CM
Next we consider evaluating the spectrum on the K_1 × K_2-point DFT grid. Instead of dealing with each individual frequency (ω_{k_1}, ω_{k_2}) separately, we consider the following maximization problem:

\max_{\mu,\,\{\alpha(\omega_{k_1},\omega_{k_2}),\,Q(\omega_{k_1},\omega_{k_2})\}}\; \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1}\Big\{ -\ln|Q(\omega_{k_1},\omega_{k_2})| - \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} \big[y_{l_1,l_2} - \alpha(\omega_{k_1},\omega_{k_2})a(\omega_{k_1},\omega_{k_2})e^{j(\omega_{k_1} l_1+\omega_{k_2} l_2)}\big]^H Q^{-1}(\omega_{k_1},\omega_{k_2}) \big[y_{l_1,l_2} - \alpha(\omega_{k_1},\omega_{k_2})a(\omega_{k_1},\omega_{k_2})e^{j(\omega_{k_1} l_1+\omega_{k_2} l_2)}\big] \Big\},   (6.48)
where the objective function is the summation over the 2-D frequency grid of all the frequency-dependent complete-data likelihood functions in (6.8) (within an additive constant). We solve the above optimization problem via a CM approach.

First, assuming that the previous estimate \hat{\theta}_{i-1}, formed from \{\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2}), \hat{Q}_{i-1}(\omega_{k_1},\omega_{k_2})\}, is available, we maximize (6.48) with respect to μ. This step can be reformulated as

\min_{\mu}\; \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} \big[\bar{y} - \hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})\rho(\omega_{k_1},\omega_{k_2})\big]^H \big[\hat{D}_{i-1}(\omega_{k_1},\omega_{k_2})\big]^{-1} \big[\bar{y} - \hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})\rho(\omega_{k_1},\omega_{k_2})\big],   (6.49)

where \bar{y}, \rho(\omega_{k_1},\omega_{k_2}), and \hat{D}_{i-1}(\omega_1,\omega_2) have been defined previously. Recalling that \bar{y} = S_g\gamma + S_m\mu, we can easily solve the optimization problem in (6.49) as its objective function is quadratic in μ:

\hat{\mu} = \Bigg[\sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} S_m^H\big[\hat{D}_{i-1}(\omega_{k_1},\omega_{k_2})\big]^{-1} S_m\Bigg]^{-1} \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} S_m^H\big[\hat{D}_{i-1}(\omega_{k_1},\omega_{k_2})\big]^{-1}\big[\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})\rho(\omega_{k_1},\omega_{k_2}) - S_g\gamma\big].   (6.50)

A necessary condition for the inverse in (6.50) to exist is that L_1 L_2 M_1 M_2 > N_1 N_2 - g, which is always satisfied.
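The update (6.50) is just a linear least-squares solve for the missing-sample vector. A minimal sketch of this step, assuming the selection matrices and the per-frequency quantities are already available (all names are ours, not the text's), could look as follows:

```python
import numpy as np

def cm_missing_sample_update(S_g, S_m, gamma, alpha_prev, rho_list, Dinv_list):
    """MAPES-CM missing-sample update (6.50).
    alpha_prev[k]: previous spectral estimate at grid frequency k
    rho_list[k]:   stacked steering vector rho(omega_k1, omega_k2)
    Dinv_list[k]:  inverse of the block-diagonal D_{i-1} at that frequency."""
    n_missing = S_m.shape[1]
    A = np.zeros((n_missing, n_missing), dtype=complex)
    rhs = np.zeros(n_missing, dtype=complex)
    for alpha_k, rho_k, Dinv_k in zip(alpha_prev, rho_list, Dinv_list):
        SmH_Dinv = S_m.conj().T @ Dinv_k
        A += SmH_Dinv @ S_m                                   # accumulates the bracketed sum
        rhs += SmH_Dinv @ (alpha_k * rho_k - S_g @ gamma)     # accumulates the right-hand side
    return np.linalg.solve(A, rhs)
```

In a practical implementation one would exploit the block-diagonal structure of D_{i-1} rather than form it densely; see Section 6.5.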
Once an estimate \hat{\mu} has become available, we reestimate \{\alpha(\omega_{k_1},\omega_{k_2})\} and \{Q(\omega_{k_1},\omega_{k_2})\} by maximizing (6.48) with μ replaced by \hat{\mu}. This can be done by maximizing each frequency term separately:

\max_{\alpha(\omega_{k_1},\omega_{k_2}),\,Q(\omega_{k_1},\omega_{k_2})}\; -\ln|Q(\omega_{k_1},\omega_{k_2})| - \frac{1}{L_1 L_2}\sum_{l_1=0}^{L_1-1}\sum_{l_2=0}^{L_2-1} \big[\hat{y}_{l_1,l_2} - \alpha(\omega_{k_1},\omega_{k_2})a(\omega_{k_1},\omega_{k_2})e^{j(\omega_{k_1} l_1+\omega_{k_2} l_2)}\big]^H Q^{-1}(\omega_{k_1},\omega_{k_2}) \big[\hat{y}_{l_1,l_2} - \alpha(\omega_{k_1},\omega_{k_2})a(\omega_{k_1},\omega_{k_2})e^{j(\omega_{k_1} l_1+\omega_{k_2} l_2)}\big],   (6.51)
which reduces to the 2-D APES problem.

A cyclic maximization of (6.48) can thus be implemented by alternating the maximization with respect to μ and, respectively, α(ω_{k_1}, ω_{k_2}) and Q(ω_{k_1}, ω_{k_2}). A step-by-step summary of 2-D MAPES-CM is as follows:
Step 0: Obtain an initial estimate of {α(ω_{k_1}, ω_{k_2}), Q(ω_{k_1}, ω_{k_2})}.
Step 1: Use the most recent estimates of {α(ω_{k_1}, ω_{k_2}), Q(ω_{k_1}, ω_{k_2})} in (6.50) to estimate the missing samples.
Step 2: Update the estimates of {α(ω_{k_1}, ω_{k_2}), Q(ω_{k_1}, ω_{k_2})} using 2-D APES applied to the data matrices with the missing-sample estimates from step 1 [see (6.10) and (6.11)].
Step 3: Repeat steps 1 and 2 until practical convergence.
6.5 MAPES-EM VERSUS MAPES-CM
Consider evaluating the spectrum for all three MAPES algorithms on the same DFT grid. Since all three algorithms iterate step 1 and step 2 until practical convergence, we can compare their computational complexity separately for each step.

In step 1, MAPES-CM estimates the missing samples via (6.50), which can be rewritten in a simplified form as

\hat{\mu} = \big(S_m^H D_s S_m\big)^{-1}\big(S_m^H D_\rho - S_m^H D_s S_g\gamma\big),   (6.52)

where

D_s \triangleq \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} \big[\hat{D}_{i-1}(\omega_{k_1},\omega_{k_2})\big]^{-1}   (6.53)

and

D_\rho \triangleq \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} \big[\hat{D}_{i-1}(\omega_{k_1},\omega_{k_2})\big]^{-1}\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})\rho(\omega_{k_1},\omega_{k_2}).   (6.54)

When computing D_s and D_\rho, the fact that \hat{D}_{i-1}(\omega_{k_1},\omega_{k_2}) is block diagonal can be exploited to reduce the computational complexity. Comparing (6.52) with (6.39) and (6.40) [or (6.20) and (6.21)], which have to be evaluated for each frequency (\omega_{k_1},\omega_{k_2}) [and for each snapshot (l_1, l_2) for (6.20) and (6.21)], we note that the computational complexity of MAPES-CM is much lower.
In step 2, MAPES-CM uses the standard APES algorithm, which can be efficiently implemented [34, 35] as discussed in Section 2.6. As for the MAPES-EM spectral estimators \hat{\alpha}_1(\omega_1,\omega_2) and \hat{\alpha}_2(\omega_1,\omega_2), they have the same structure as the APES estimator in (6.10), but with two differences. The first one is the use of the conditional mean (\hat{b}_{l_1,l_2} or b) of the missing samples as their estimates. The second one is the additional terms (\Gamma_{l_1,l_2}) that involve the covariance matrices (\hat{K}_{l_1,l_2} and K) of the missing data in the \hat{S} and S matrices. Because MAPES-EM uses different estimates for the missing data at different frequencies, the techniques used to efficiently implement APES cannot be applied here (see Section 2.6). As a result, no efficient algorithms are available to calculate the corresponding spectral estimate.

In summary, the MAPES-CM algorithm possesses a computational complexity that is much lower than that of MAPES-EM.
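To illustrate how the block-diagonal structure mentioned above reduces the cost of forming D_s in (6.53), note that D_{i-1}(ω) = blkdiag(Q(ω), ..., Q(ω)), so its inverse, and the sum of inverses over the grid, are block diagonal with identical M_1 M_2 × M_1 M_2 blocks. A short sketch (our own helper, not from the text) computes that common block without ever forming the large matrices:

```python
import numpy as np

def sum_inverse_blocks(Q_list):
    """Common diagonal block of D_s in (6.53): since D_{i-1}(omega) is block
    diagonal with Q(omega) repeated L1*L2 times, D_s is block diagonal with
    every block equal to the sum of the per-frequency inverses Q^{-1}(omega_k1, omega_k2)."""
    M = Q_list[0].shape[0]
    block = np.zeros((M, M), dtype=complex)
    for Q in Q_list:
        block += np.linalg.inv(Q)
    return block   # D_s = blkdiag(block, ..., block), repeated L1*L2 times
```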
6.6 NUMERICAL EXAMPLES
In this section, we present three numerical examples to illustrate the performance of the MAPES algorithms for the 2-D missing-data spectral estimation problem. We compare the MAPES algorithms with the WFFT and the GAPES [3]. A Taylor window with order 5 and sidelobe level -35 dB is used to obtain the WFFT spectral estimates. All the 2-D spectra are plotted with a dynamic range of 35 dB.

As in the 1-D case, we simply let the initial estimate of α(ω_1, ω_2) be given by the WFFT with the missing data samples set to zero. The initial estimate of Q(ω_1, ω_2) follows from the 2-D counterpart of (4.8), where again, the missing data samples are set to zero. For the initialization of GAPES, we consider two cases. If the data missing pattern is arbitrary and no initial filter with a proper size can be chosen, we use the same initial estimates of α(ω_1, ω_2) and Q(ω_1, ω_2) as for MAPES, and the initial estimate of H(ω_1, ω_2) follows from (3.29), where the missing samples are set to zero. If the data is gapped, as in the examples discussed in Sections 6.6.2 and 6.6.3, we follow the initialization step considered in Chapter 3.
We stop the iteration of all iterative algorithms whenever the relative change in the total power of the 2-D spectra corresponding to the current \{\hat{\alpha}_i(\omega_{k_1},\omega_{k_2})\} and previous \{\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})\} estimates is smaller than a preselected threshold (e.g., \epsilon = 10^{-2}):

\Big| \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} |\hat{\alpha}_i(\omega_{k_1},\omega_{k_2})|^2 - \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} |\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})|^2 \Big| \leq \epsilon \sum_{k_1=0}^{K_1-1}\sum_{k_2=0}^{K_2-1} |\hat{\alpha}_{i-1}(\omega_{k_1},\omega_{k_2})|^2.   (6.55)
6.6.1 Convergence Speed
In our first example, we study the convergence properties of the MAPES algorithms. We use a 1-D example for simplicity. (Note that a similar 1-D example was considered in Chapter 5, but without the MAPES-CM algorithm, which has been introduced in this chapter.)

The true spectrum of the simulated signal is shown in Fig. 5.1(a), where we have four spectral lines located at f_1 = 0.05 Hz, f_2 = 0.065 Hz, f_3 = 0.26 Hz, and f_4 = 0.28 Hz with complex amplitudes α_1 = α_2 = α_3 = 1 and α_4 = 0.5. Besides these spectral lines, Fig. 5.1(a) also shows a continuous spectral component centered at 0.18 Hz with a width of 0.015 Hz and a constant modulus of 0.25. The data sequence has N_1 = 128 (N_2 = 1) samples, out of which 51 (40%) samples are missing; the locations of the missing samples are chosen arbitrarily. The data are corrupted by a zero-mean circularly symmetric complex white Gaussian noise with standard deviation 0.1.

In Fig. 6.1(a), the APES algorithm is applied to the complete data and the resulting spectrum is shown. The APES spectrum will be used later as a reference for comparison purposes. The WFFT spectrum for the incomplete data is shown in Fig. 6.1(b). As expected, the WFFT spectrum has poor resolution and high sidelobes, and it underestimates the true spectrum. Note that the WFFT spectrum will be used as the initial estimate for both the GAPES and MAPES algorithms in this example.
FIGURE 6.1: Modulus of the missing-data spectral estimates [N_1 = 128, N_2 = 1, 51 (40%) missing samples]. (a) Complete-data APES, (b) WFFT, (c) GAPES with M_1 = 64 and M_2 = 1, (d) MAPES-CM with M_1 = 64 and M_2 = 1, (e) MAPES-EM1 with M_1 = 64 and M_2 = 1, and (f) MAPES-EM2 with M_1 = 64 and M_2 = 1.
Fig. 6.1(c) shows the GAPES spectrum, which also underestimates the sinusoidal components and gives some artifacts due to the poor initial estimate of H(ω_1, ω_2). The MAPES-CM spectrum is plotted in Fig. 6.1(d). Figs. 6.1(e) and 6.1(f) show the MAPES-EM1 and MAPES-EM2 spectra, respectively. All the MAPES algorithms perform quite well, and their missing-data spectral estimates are similar to the high-resolution complete-data APES spectrum in Fig. 6.1(a).

The MAPES and GAPES spectral estimates at different iterations are plotted in Figs. 6.2(a)-6.2(d). Among the MAPES algorithms, MAPES-CM is the fastest to converge, after three iterations. MAPES-EM converges more slowly, with MAPES-EM1 converging after 11 iterations and MAPES-EM2 after 9. Because of the relatively poor initial filter bank H(ω_1, ω_2) used in this example, the GAPES algorithm performs relatively poorly and converges relatively slowly, after ten iterations. In the following examples, where the gapped-data initialization step of Chapter 3 can be applied, GAPES usually converges faster, within a few iterations. For illustration purposes, in Fig. 6.2 we only plot the first four iterations of each method and note that the convergence behavior of each algorithm does not change significantly in the remaining iterations.

6.6.2 Performance Study
In this example, we illustrate the performance of the MAPES algorithms for 2-D spectral estimation. We consider a 16 × 16 data matrix consisting of three 2-D sinusoids (signals 1, 2, and 3) at normalized frequencies (4/16, 5/16), (6/16, 5/16), and (10/16, 9/16) and with complex amplitudes equal to 1, 0.7, and 2, respectively, embedded in zero-mean circularly symmetric complex Gaussian white noise with standard deviation 0.1. All the samples in rows 4, 8, 11, 14, and in columns 3, 6, 7, 11, 12, 14 are missing, which amounts to over 50% of the total number of data samples. The true amplitude spectrum is plotted in Fig. 6.3(a) with the amplitude values given next to each sinusoid. Each spectrum is obtained on a 64 × 64 grid. The WFFT spectrum of the complete data is shown in Fig. 6.3(b), along with the estimated magnitudes of the sinusoids at their true locations.
FIGURE 6.2: Modulus of the missing-data spectral estimates obtained at different iterations [N_1 = 128, N_2 = 1, 51 (40%) missing samples]. (a) MAPES-CM, (b) MAPES-EM1, (c) MAPES-EM2, and (d) GAPES.
80 SPECTRAL ANALYSIS OF SIGNALS
1.0000
0.7000
2.0000
0 0.2 0.4 0.6 0.8
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0.99808
0.68569
2.0014
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
1.005
0.70055
2.0017
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
(a) (b) (c)
2 4 6 8 10 12 14 16
2
4
6
8
10
12
14
16
0.42636
0.32908
0.84218
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
0.9768
0.64683
1.9866
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
(d) (e) (f)
1.0069
0.70073
1.9916
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
1.0136
0.69643
1.9936
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
0.99644
0.70751
1.9853
0 0.2 0.4 0.6 0.8
0
0.2
0.4
0.6
0.8
(g) (h) (i)
FIGURE 6.3: Modulus of the 2-D spectra. (a) True spectrum, (b) 2-D complete-data
WFFT, (c) 2-Dcomplete-data APESwith M
1
= M
2
= 8, (d) 2-Ddata missing pattern, the
black stripes indicate missing samples, (e) 2-D WFFT, (f ) 2-D GAPES with M
1
= M
2
=
8, (g) 2-DMAPES-EM1 with M
1
= M
2
= 8, (h) 2-DMAPES-EM2 with M
1
= M
2
= 8,
and (i) 2-D MAPES-CM with M
1
= M
2
= 8.
2-D MAPES VIA EM AND CYCLIC MAXIMIZATION 81
TABLE 6.1: Computational Complexities of the WFFT, GAPES, MAPES-EM,
and MAPES-CM Spectral Estimators
WFFT GAPES MAPES-EM1 MAPES-EM2 MAPES-CM
Flops 3 10
5
3 10
9
1 10
14
6 10
12
3 10
11
two closely spacedsinusoids are smearedtogether. Fig. 6.3(c) shows the APESspec-
trum, also constructed from the complete data, which has well-separated spectral
peaks and accurately estimated magnitudes. The data missing pattern is displayed in
Fig. 6.3(d). The WFFTspectrumfor the missing-data case is shown in Fig. 6.3(e),
which underestimates the sinusoids and contains strong artifacts due to the ze-
ros assumed for the missing samples. Fig. 6.3(f ) shows the spectrum estimated by
GAPES, with an initial lter of size 2 2. In Figs. 6.3(g)6.3(i), we showthe spec-
tra estimated by MAPES-EM1, MAPES-EM2, and MAPES-CM, respectively.
The GAPES algorithmestimates the signal magnitudes much more accurately than
the WFFT, but not as well as the MAPES algorithms. All MAPES algorithms
performwell giving accurate spectral estimates and clearly separated spectral peaks.
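For concreteness, a scene of the kind described above can be simulated as in the following Matlab-style sketch; the rectangular window used for the zero-filled 2-D WFFT is a simplification of the book's Taylor-windowed WFFT, so the resulting numbers will differ somewhat from those in Fig. 6.3.

    % Synthetic 16 x 16 missing-data scene (a sketch of the example above).
    N1 = 16; N2 = 16;
    [n1, n2] = ndgrid(0:N1-1, 0:N2-1);
    f  = [4 5; 6 5; 10 9] / 16;                       % (f1, f2) of the three sinusoids
    a  = [1; 0.7; 2];                                 % complex amplitudes (real-valued here)
    Y  = zeros(N1, N2);
    for k = 1:3
        Y = Y + a(k) * exp(1i*2*pi*(f(k,1)*n1 + f(k,2)*n2));
    end
    Y = Y + 0.1/sqrt(2) * (randn(N1,N2) + 1i*randn(N1,N2));  % noise, std 0.1
    mask = true(N1, N2);                              % true = sample available
    mask([4 8 11 14], :) = false;                     % missing rows
    mask(:, [3 6 7 11 12 14]) = false;                % missing columns
    fprintf('%.1f%% of the samples are missing\n', 100*mean(~mask(:)));
    Yz = Y; Yz(~mask) = 0;                            % zero-fill the missing samples
    A  = fft2(Yz, 64, 64) / (N1*N2);                  % crude 2-D WFFT on a 64 x 64 grid
                                                      % (rectangular window; the zeros
                                                      % bias the peaks low, as noted above)
    imagesc(abs(A)); axis xy; colorbar;               % modulus of the estimate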
Now we compare the computational complexities of these algorithms for
the example in Fig. 6.3. The numbers of floating point operations (Flops) needed
for different algorithms are given in Table 6.1. The WFFT algorithm is the most
efficient, and GAPES is more efficient than the MAPES algorithms. Among
MAPES, MAPES-CM is 20 times faster than MAPES-EM2, which is about 17
times faster than MAPES-EM1.
The results displayed so far were for one typical realization of the data. Using
100 Monte Carlo simulations (by varying the realizations of the noise), we obtain
the RMSEs of the magnitude and phase estimates of the three sinusoids at their true
locations for WFFT, GAPES, and MAPES. The RMSEs of the magnitude and
phase estimates are listed in Tables 6.2 and 6.3, respectively. MAPES-EM1 gives
the best accuracy followed closely by MAPES-EM2 and MAPES-CM. GAPES
is slightly less accurate, whereas the accuracy corresponding to the WFFT is much
lower.
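The Monte Carlo procedure behind Tables 6.2 and 6.3 can be outlined with the following Matlab-style sketch, which continues the data-generation sketch above; only the zero-filled 2-D WFFT is evaluated here, and the GAPES/MAPES estimates would be substituted at the indicated line, so the printed RMSEs only roughly track the WFFT column of the tables.

    % Monte Carlo evaluation of magnitude/phase RMSEs at the true sinusoid
    % locations (sketch; only the zero-filled 2-D WFFT is evaluated).
    N1 = 16; N2 = 16; [n1, n2] = ndgrid(0:N1-1, 0:N2-1);
    f  = [4 5; 6 5; 10 9] / 16;  a = [1; 0.7; 2];
    mask = true(N1, N2); mask([4 8 11 14], :) = false; mask(:, [3 6 7 11 12 14]) = false;
    nMC = 100;
    errMag = zeros(nMC, 3); errPh = zeros(nMC, 3);
    for m = 1:nMC
        Y = zeros(N1, N2);
        for k = 1:3
            Y = Y + a(k) * exp(1i*2*pi*(f(k,1)*n1 + f(k,2)*n2));
        end
        Y = Y + 0.1/sqrt(2) * (randn(N1,N2) + 1i*randn(N1,N2));  % new noise realization
        Yz = Y; Yz(~mask) = 0;
        A  = fft2(Yz, 64, 64) / (N1*N2);       % <-- replace by a GAPES/MAPES estimate
        for k = 1:3
            idx1 = round(f(k,1)*64) + 1;       % grid indices of the true frequency
            idx2 = round(f(k,2)*64) + 1;
            ahat = A(idx1, idx2);
            errMag(m,k) = abs(ahat) - abs(a(k));
            errPh(m,k)  = angle(ahat * conj(a(k)));   % phase error in radians
        end
    end
    rmseMag = sqrt(mean(errMag.^2))            % compare with Table 6.2 (WFFT row)
    rmsePh  = sqrt(mean(errPh.^2))             % compare with Table 6.3 (WFFT row)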
TABLE 6.2: RMSEs of the Magnitude Estimates Obtained via the WFFT, GAPES, MAPES-EM, and MAPES-CM Spectral Estimators

             WFFT     GAPES    MAPES-EM1   MAPES-EM2   MAPES-CM
Signal 1     0.577    0.021    0.010       0.013       0.014
Signal 2     0.372    0.053    0.009       0.014       0.014
Signal 3     1.156    0.011    0.008       0.011       0.015

TABLE 6.3: RMSEs of the Phase (Radian) Estimates Obtained via the WFFT, GAPES, MAPES-EM, and MAPES-CM Spectral Estimators

             WFFT     GAPES    MAPES-EM1   MAPES-EM2   MAPES-CM
Signal 1     0.148    0.010    0.009       0.009       0.009
Signal 2     0.145    0.013    0.012       0.015       0.016
Signal 3     0.057    0.006    0.004       0.004       0.006
6.6.3 Synthetic Aperture Radar Imaging Applications
In the following two examples, we illustrate the applications of the missing-data
algorithms to synthetic aperture radar (SAR) imaging using incomplete phase-
history data.
Two-dimensional high-resolution phase-history data of a Slicy object at 0°
azimuth angle were generated using XPATCH [51], a high-frequency electromagnetic
scattering prediction code for complex 3-D objects. A photo of the Slicy
object taken at 45° azimuth angle is shown in Fig. 6.4(a). The original data matrix
has a size of 288 × 288 with a resolution of 0.043 m in both range and cross-range.
Fig. 6.4(b) shows the modulus of the 2-D WFFT image of the original data. Here,
we consider only a 32 × 32 center block of the phase-history data. The APES
image of the complete 32 × 32 data is shown in Fig. 6.4(c). The data missing
pattern is displayed in Fig. 6.4(d), where the samples in rows 4, 14, 15, 22, 27,
28 and in columns 8, 9, 10, 20, 21, 22, 26, 27 are missing (possibly due to both
angular diversity and strong interferences), resulting in 40% missing data. Fig. 6.4(e)
shows the WFFT image, which has low resolution, high sidelobes, and smeared
features. By using an initial filter matrix of size 6 × 6, the GAPES image is shown
in Fig. 6.4(f), where strong artifacts around the dihedrals are readily observed.
The image reconstructed via MAPES-CM is shown in Fig. 6.4(g), which is quite
similar to the complete-data image in Fig. 6.4(c).

FIGURE 6.4: Modulus of the SAR images of the Slicy object obtained from a 32 × 32 data matrix with missing samples. (a) Photograph of the object (taken at 45° azimuth angle), (b) 2-D WFFT for a 288 × 288 (not 32 × 32) data matrix, (c) 2-D complete-data APES with M1 = M2 = 16, (d) 2-D data missing pattern, the black stripes indicate missing samples, (e) 2-D WFFT, (f) 2-D GAPES with M1 = M2 = 16, (g) 2-D MAPES-CM with M1 = M2 = 16, and (h) 2-D MAPES-CM followed by 2-D rank-deficient RCF with a 20 × 20 filter-bank and a unit radius spherical constraint.
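The missing-data geometry of Fig. 6.4 is easy to emulate; in the following Matlab-style sketch the 288 × 288 phase-history matrix is a random placeholder for the XPATCH data, and only the center-block extraction and the row/column masking follow the description in the text.

    % Extract a 32 x 32 center block of a 288 x 288 phase-history matrix and
    % remove the rows/columns listed in the text (placeholder data; the real
    % XPATCH phase histories are not reproduced here).
    P   = randn(288, 288) + 1i*randn(288, 288);    % placeholder for the XPATCH data
    c   = 288/2;                                   % center index
    blk = P(c-15:c+16, c-15:c+16);                 % one possible 32 x 32 center block
    mask = true(32, 32);
    mask([4 14 15 22 27 28], :) = false;           % missing rows
    mask(:, [8 9 10 20 21 22 26 27]) = false;      % missing columns
    fprintf('%.0f%% of the block is missing\n', 100*mean(~mask(:)));   % about 40%
    blk(~mask) = 0;                                % zero-fill, e.g., before the WFFT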
By observing that the missing-data algorithms developed previously estimate
the missing data samples, we can achieve better spectral estimation performance,
for example, higher resolution than APES, based on the (complete) data interpolated
via MAPES. It is known that the rank-deficient robust Capon filter-bank
(RCF) spectral estimator [52] has higher resolution than the existing nonparametric
spectral estimators. Hence, based on the data interpolated via MAPES-CM, we
apply rank-deficient RCF with a 20 × 20 filter-bank and a spherical steering vector
uncertainty set with unit radius. The resulting image is shown in Fig. 6.4(h), which
exhibits no sidelobe problem and retains all important features of Slicy. Compared
with Fig. 6.4(b), we note that although the data size was reduced from 288 × 288
to 32 × 32 and 40% of the samples of the reduced data matrix were omitted,
we can still obtain an image similar to that obtained by the WFFT applied to the
original high-resolution data. [Note that we cannot get a similar high-resolution
image with all the well-separated features and without sidelobe problems by simply
thresholding Fig. 6.4(f), 6.4(g), or even 6.4(c).]
Next, we consider a 32 × 32 data matrix from the Backhoe Data Dome,
Version 1.0. At 0° elevation, the data are collected from a 2° azimuth cut centered
around 90° azimuth, covering a 0.3 GHz bandwidth centered around 10 GHz.
Fig. 6.5(a) shows the WFFT image of the complete data matrix with smeared
features. Fig. 6.5(b) shows the APES image of the complete data with a 16 × 16
2-D filter. Some smeared features in Fig. 6.5(a) are clearly observed here, such as
the one located at row 26 and column 20. Fig. 6.5(c) illustrates the data missing
pattern, where the samples in rows 5, 13, 14, 21, 22, 27 and in columns 8, 9, 10, 18,
19, 20, 26, 27 are missing. Figs. 6.5(d) and 6.5(e) show the WFFT and MAPES-CM
images of the missing-data matrix. It can be observed that despite the missing
samples, MAPES-CM can still give a spectral estimate that has all the features
shown in Fig. 6.5(b).

FIGURE 6.5: Modulus of the SAR images of the Backhoe data obtained from a 32 × 32 data matrix with missing samples. (a) 2-D complete-data WFFT, (b) 2-D complete-data APES with a 2-D filter of size 16 × 16, (c) 2-D data missing pattern, the black stripes indicate missing samples, (d) 2-D WFFT, and (e) 2-D MAPES-CM with a 2-D filter of size 16 × 16.
C H A P T E R 7
Conclusions and Software
7.1 CONCLUDING REMARKS
We have presented some recent results on nonparametric spectral analysis with
missing samples. In particular, we have provided detailed discussions on using
GAPES for the gapped-data and the more general MAPES for the arbitrarily
missed-data spectral estimation problems. Both 1-D and 2-D applications are
considered.
Among these incomplete-data algorithms, GAPES has the least computational
complexity, while MAPES-EM tends to give the best performance. According
to their computational complexities, these algorithms can be arranged in
ascending order, starting from the most efficient one: GAPES, MAPES-CM,
MAPES-EM2, and MAPES-EM1. Clearly, there is a tradeoff between spectral
estimation performance and computational efficiency.
The reader needs to find out which algorithm is the best choice for each particular
application, in terms of estimation accuracy and computational complexity.
We now provide some general guidelines based on our own experience:
1. If the missing samples are grouped together and large continuous data segments
are available, GAPES is recommended due to its good performance
for the gapped-data problem and its low computational complexity.
2. If the missing samples occur in arbitrary patterns, the MAPES algorithms
should be used due to their excellent performance. MAPES-CM
is faster than MAPES-EM1 and MAPES-EM2, but with slightly
worse performance. Hence MAPES-CM may be used for long 1-D
data sequences or 2-D data matrices. MAPES-EM1 or its faster version,
MAPES-EM2, may be used when computation is not a serious concern.
7.2 ONLINE SOFTWARE
Several Matlab codes of the algorithms discussed within this book can be downloaded
from ftp://www.sal.ufl.edu/ywang. Here is a list of the available codes:
1. MAPES1D_EM1.m: This function implements 1-D MAPES-EM1.
2. MAPES1D_EM2.m: This function implements 1-D MAPES-EM2.
3. MAPES2D_CM.m: This function implements 2-D MAPES-CM.
4. MAPES2D_EM1.m: This function implements 2-D MAPES-EM1.
5. MAPES2D_EM2.m: This function implements 2-D MAPES-EM2.
6. GAPES1_GappedData.m: This function implements 1-D GAPES for gapped-data spectral analysis.
7. GAPES1D_MissingData.m: This function implements 1-D GAPES for arbitrarily missed-data spectral analysis.
8. GAPES2D_GappedData: This is a directory that contains 2-D GAPES software for gapped-data spectral analysis.
9. GAPES2D_GappedData.m: This is an example that demonstrates how to use 2-D GAPES for gapped-data spectral analysis.
10. dispimage.m: This is a function that plots a 2-D spectrum.
11. taylor1.m: This is a function that calculates a 1-D Taylor window with a sidelobe level of -35 dB.
12. taylor2.m: This is a function that calculates a 2-D Taylor window with a sidelobe level of -35 dB.

Note that the above Matlab codes are provided for verification purposes only. They
are not optimized for specific applications.
References
[1] P. Stoica, E. G. Larsson, and J. Li, "Adaptive filterbank approach to restoration and spectral analysis of gapped data," Astron. J., vol. 120, pp. 2163–2173, Oct. 2000. doi:10.1086/301572
[2] E. Larsson, G. Liu, P. Stoica, and J. Li, "High-resolution SAR imaging with angular diversity," IEEE Trans. Aerosp. Electron. Syst., vol. 37, no. 4, pp. 1359–1372, Oct. 2001. doi:10.1109/7.976971
[3] E. Larsson, P. Stoica, and J. Li, "Amplitude spectrum estimation for two-dimensional gapped data," IEEE Trans. Signal Processing, vol. 50, no. 6, pp. 1343–1354, June 2002. doi:10.1109/TSP.2002.1003059
[4] E. Larsson and J. Li, "Spectral analysis of periodically gapped data," IEEE Trans. Aerosp. Electron. Syst., vol. 39, no. 3, pp. 1089–1097, July 2003. doi:10.1109/TAES.2003.1238761
[5] J. Salzman, D. Akamine, R. Lefevre, and J. Kirk, Jr., "Interrupted synthetic aperture radar (SAR)," in Proc. 2001 IEEE Radar Conf., Atlanta, GA, May 2001, pp. 117–122. doi:10.1109/NRC.2001.922962
[6] J. Salzman, D. Akamine, and R. Lefevre, "Optimal waveforms and processing for sparse frequency UWB operation," in Proc. 2001 IEEE Radar Conf., Atlanta, GA, May 2001, pp. 105–110. doi:10.1109/NRC.2001.922960
[7] K. M. Cuomo, J. E. Piou, and J. T. Mayhan, "Ultrawide-band coherent processing," IEEE Trans. Antennas Propagat., vol. 47, no. 6, pp. 1094–1107, June 1999. doi:10.1109/8.777137
[8] P. Stoica and R. L. Moses, Introduction to Spectral Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[9] J. P. Burg, "Maximum entropy spectral analysis," presented at the Proc. 37th Meeting Society Exploration Geophys., Oklahoma City, OK, Oct. 1967.
[10] R. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford University, CA, Nov. 1981.
[11] J. Li and P. Stoica, "Efficient mixed-spectrum estimation with applications to target feature extraction," IEEE Trans. Signal Processing, vol. 44, pp. 281–295, Feb. 1996. doi:10.1109/78.485924
[12] J. Capon, "High resolution frequency-wavenumber spectrum analysis," Proc. IEEE, vol. 57, pp. 1408–1418, Aug. 1969.
[13] J. Li and P. Stoica, "An adaptive filtering approach to spectral estimation and SAR imaging," IEEE Trans. Signal Processing, vol. 44, no. 6, pp. 1469–1484, June 1996. doi:10.1109/78.506612
[14] P. Stoica, A. Jakobsson, and J. Li, "Capon, APES and matched-filterbank spectral estimation," Signal Process., vol. 66, no. 1, pp. 45–59, April 1998. doi:10.1016/S0165-1684(97)00239-9
[15] H. Li, J. Li, and P. Stoica, "Performance analysis of forward-backward matched-filterbank spectral estimators," IEEE Trans. Signal Processing, vol. 46, pp. 1954–1966, July 1998. doi:10.1109/78.700967
[16] N. Lomb, "Least-squares frequency analysis of unequally spaced data," Astrophys. Space Sci., pp. 447–462, 1976. doi:10.1007/BF00648343
[17] J. D. Scargle, "Studies in astronomical time series analysis II: Statistical aspects of spectral analysis of unevenly spaced data," Astrophys. J., vol. 263, pp. 835–853, Dec. 1982. doi:10.1086/160554
[18] J. A. Högbom, "Aperture synthesis with a non-regular distribution of interferometer baselines," Astron. Astrophys. Suppl., vol. 15, pp. 417–426, 1974.
[19] T. Bronez, "Spectral estimation of irregularly sampled multidimensional processes by generalized prolate spheroidal sequences," IEEE Trans. Acoust. Speech Signal Process., vol. 36, no. 12, pp. 1862–1873, Dec. 1988. doi:10.1109/29.9031
[20] I. Fodor and P. Stark, "Multitaper spectrum estimation for time series with gaps," IEEE Trans. Signal Processing, vol. 48, no. 12, pp. 3472–3483, Dec. 2000. doi:10.1109/78.887039
[21] R. H. Jones, "Maximum likelihood fitting of ARMA models to time series with missing observations," Technometrics, vol. 22, no. 3, pp. 389–395, Aug. 1980.
[22] B. Porat and B. Friedlander, "ARMA spectral estimation of time series with missing observations," IEEE Trans. Inform. Theory, vol. 30, no. 4, pp. 601–602, July 1986.
[23] Y. Rosen and B. Porat, "Optimal ARMA parameter estimation based on the sample covariances for data with missing observations," IEEE Trans. Inform. Theory, vol. 35, no. 2, pp. 342–349, Mar. 1989. doi:10.1109/18.32128
[24] P. Broersen, S. de Waele, and R. Bos, "Estimation of autoregressive spectra with randomly missing data," in Proc. 20th IEEE Instrument. Measure. Technol. Conf., Vail, CO, vol. 2, May 2003, pp. 1154–1159.
[25] P. Stoica, H. Li, and J. Li, "Amplitude estimation of sinusoidal signals: Survey, new results, and an application," IEEE Trans. Signal Processing, vol. 48, no. 2, pp. 338–352, Feb. 2000. doi:10.1109/78.823962
[26] M. R. Palsetia and J. Li, "Using APES for interferometric SAR imaging," IEEE Trans. Image Processing, vol. 7, no. 9, pp. 1340–1353, Sep. 1998. doi:10.1109/83.709665
[27] F. Gini and F. Lombardini, "Multilook APES for multibaseline SAR interferometry," IEEE Trans. Signal Processing, vol. 50, no. 7, pp. 1800–1803, July 2002. doi:10.1109/TSP.2002.1011219
[28] F. Gini and F. Lombardini, "Multibaseline post-processing for SAR interferometry," presented at the Proc. IEEE Sensor Array Multichannel Signal Process. Workshop (SAM), Sitges, Spain, July 2004.
[29] R. Wu, Z.-S. Liu, and J. Li, "Time-varying complex spectral analysis via recursive APES," IEE Proc. Radar Sonar Navigation, vol. 145, no. 6, pp. 354–360, Dec. 1998. doi:10.1049/ip-rsn:19982435
[30] D. J. Russell and R. D. Palmer, "Application of APES to adaptive arrays on the CDMA reverse channel," IEEE Trans. Veh. Technol., vol. 53, no. 1, pp. 3–17, Jan. 2004. doi:10.1109/TVT.2003.821991
[31] H. Li, W. Sun, P. Stoica, and J. Li, "Two-dimensional system identification using amplitude estimation," IEEE Signal Processing Lett., vol. 9, no. 2, pp. 61–63, Feb. 2002. doi:10.1109/97.991139
[32] P. Stoica, H. Li, and J. Li, "A new derivation of the APES filter," IEEE Signal Processing Lett., vol. 6, no. 8, pp. 205–206, Aug. 1999. doi:10.1109/97.774866
[33] H. L. Van Trees, Optimum Array Processing, Part IV of Detection, Estimation, and Modulation Theory. New York, NY: John Wiley and Sons, Inc., 2002.
[34] Z.-S. Liu, H. Li, and J. Li, "Efficient implementation of Capon and APES for spectral estimation," IEEE Trans. Aerosp. Electron. Syst., vol. 34, no. 4, pp. 1314–1319, Oct. 1998. doi:10.1109/7.722716
[35] E. Larsson and P. Stoica, "Fast implementation of two-dimensional APES and Capon spectral estimators," Multidimen. Syst. Signal Process., vol. 13, pp. 35–54, Jan. 2002. doi:10.1023/A:1013891327453
[36] E. G. Larsson, J. Li, and P. Stoica, "High-resolution nonparametric spectral analysis: Theory and applications," in High-Resolution and Robust Signal Processing, Y. Hua, A. B. Gershman, and Q. Cheng, Eds. New York: Marcel-Dekker, 2003.
[37] D. D. Meisel, "Fourier transforms of data sampled in unequally spaced segments," Astron. J., vol. 84, no. 1, pp. 116–126, Jan. 1979. doi:10.1086/112397
[38] J. D. Scargle, "Studies in astronomical time series analysis III: Fourier transforms, autocorrelation functions, and cross-correlation functions of unevenly spaced data," Astrophys. J., vol. 343, pp. 874–887, Aug. 1989. doi:10.1086/167757
[39] D. H. Roberts, J. Lehár, and J. W. Dreher, "Time series analysis with CLEAN I: Derivation of a spectrum," Astron. J., vol. 93, no. 4, pp. 968–989, April 1987. doi:10.1086/114383
[40] G. B. Rybicki and W. H. Press, "Interpolation, realization, and reconstruction of noisy, irregularly sampled data," Astrophys. J., vol. 398, pp. 169–176, Oct. 1992. doi:10.1086/171845
[41] W. Press and G. Rybicki, "Fast algorithm for spectral analysis of unevenly sampled data," Astrophys. J., pp. 277–280, Mar. 1989. doi:10.1086/167197
[42] P. Stoica and Y. Selen, "Cyclic minimizers, majorization techniques, and the expectation-maximization algorithm: A refresher," IEEE Signal Processing Mag., vol. 21, no. 1, pp. 112–114, Jan. 2004. doi:10.1109/MSP.2004.1267055
[43] P. Stoica and A. Nehorai, "Statistical analysis of two nonlinear least-squares estimators of sine-wave parameters in the colored-noise case," Circuits Syst. Signal Process., vol. 8, no. 1, pp. 3–15, 1989. doi:10.1007/BF01598742
[44] P. Stoica, A. Jakobsson, and J. Li, "Sinusoid parameter estimation in the colored noise case: Asymptotic Cramér-Rao bound, maximum likelihood and nonlinear least-squares," IEEE Trans. Signal Processing, vol. 45, no. 8, pp. 2048–2059, Aug. 1997. doi:10.1109/78.611203
[45] Y. Wang, P. Stoica, J. Li, and T. Marzetta, "Nonparametric spectral analysis with missing data via the EM algorithm," Digital Signal Processing, submitted for publication.
[46] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum-likelihood from incomplete data via the EM algorithm," J. Royal Stat. Soc., vol. 39, no. 1, pp. 1–38, 1977.
[47] G. Casella and R. L. Berger, Statistical Inference. Pacific Grove, CA: Duxbury, 2001.
[48] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Upper Saddle River, NJ: Prentice Hall, 1993.
[49] L. Ljung, System Identification: Theory for the User, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1999.
[50] Y. Wang, P. Stoica, and J. Li, "Two-dimensional nonparametric spectral analysis in the missing data case," IEEE Trans. Aerosp. Electron. Syst., submitted for publication.
[51] D. J. Andersh, M. Hazlett, S. W. Lee, D. D. Reeves, D. P. Sullivan, and Y. Chu, "XPATCH: A high-frequency electromagnetic scattering prediction code and environment for complex three-dimensional objects," IEEE Antennas Propagat. Mag., vol. 36, no. 1, pp. 65–69, Feb. 1994.
[52] Y. Wang, J. Li, and P. Stoica, "Rank-deficient robust Capon filter-bank approach to complex spectral estimation," IEEE Trans. Signal Processing, to be published.
The Authors
YANWEI WANG
Yanwei Wang received the B.Sc. degree in electrical engineering from the Beijing
University of Technology, China, in 1997 and the M.Sc. degree, also in electrical
engineering, from the University of Florida, Gainesville, in 2001. Since January
2000, he has been a research assistant with the Department of Electrical and Com-
puter Engineering, University of Florida, where he received the Ph.D. degree in
December 2004. Currently, he is with the R&D group of Diagnostic Ultrasound
Corp. His research interests include spectral estimation, medical tomographic
imaging, and radar/array signal processing.
JIAN LI
Jian Li received the M.Sc. and Ph.D. degrees in electrical engineering from The
Ohio State University, Columbus, in 1987 and 1991, respectively.
From April 1991 to June 1991, she was an Adjunct Assistant Professor with
the Department of Electrical Engineering, The Ohio State University, Columbus.
From July 1991 to June 1993, she was an Assistant Professor with the Department of
Electrical Engineering, University of Kentucky, Lexington. Since August 1993, she
has been with the Department of Electrical and Computer Engineering, University
of Florida, Gainesville, where she is currently a Professor. Her current research
interests include spectral estimation, array signal processing, and their applications.
Dr. Li is a member of Sigma Xi and Phi Kappa Phi. She received the 1994
National Science Foundation Young Investigator Award and the 1996 Office of
Naval Research Young Investigator Award. She was an Executive Committee
Member of the 2002 International Conference on Acoustics, Speech, and Signal
Processing, Orlando, Florida, May 2002. She has been an Associate Editor of the
IEEE Transactions on Signal Processing since 1999 and an Associate Editor of the
IEEE Signal Processing Magazine since 2003. She is presently a member of the
Signal Processing Theory and Methods (SPTM) Technical Committee of the
IEEE Signal Processing Society.
PETRE STOICA
Petre Stoica (F'94) received the D.Sc. degree in automatic control from the Polytechnic
Institute of Bucharest (BPI), Bucharest, Romania, in 1979 and an honorary
doctorate degree in science from Uppsala University (UU), Uppsala, Sweden, in
1993.
He is Professor of system modeling with the Department of Systems and
Control at UU. Previously, he was a Professor of system identification and signal
processing with the Faculty of Automatic Control and Computers at BPI. He held
longer visiting positions with Eindhoven University of Technology, Eindhoven,
The Netherlands; Chalmers University of Technology, Gothenburg, Sweden
(where he held a Jubilee Visiting Professorship); UU; The University of Florida,
Gainesville; and Stanford University, Stanford, CA. His main scientific interests are
in the areas of system identification, time series analysis and prediction, statistical
signal and array processing, spectral analysis, wireless communications, and radar
signal processing. He has published seven books, ten book chapters, and some 450
papers in archival journals and conference records on these topics. The most recent
book he coauthored, with R. Moses, is entitled Introduction to Spectral Analysis
(Englewood Cliffs, NJ: Prentice-Hall, 1997). He has also edited two books on signal
processing advances in wireless communications and mobile communications,
published by Prentice-Hall in 2001. He is on the editorial boards of five journals
in the field: Journal of Forecasting; Signal Processing; Circuits, Systems, and Signal
Processing; Digital Signal Processing: A Review Journal; and Multidimensional Systems
and Signal Processing. He was a Co-Guest Editor for several special issues on
system identification, signal processing, spectral analysis, and radar for some of the
aforementioned journals, as well as for the Proceedings of the IEEE.
Dr. Stoica was a corecipient of the IEEE ASSP Senior Award for a paper on
statistical aspects of array signal processing. He was also a recipient of the Technical
Achievement Award of the IEEE Signal Processing Society for fundamental con-
tributions to statistical signal processing with applications in time series analysis,
system identification, and array signal processing. In 1998, he was the recipient of a
Senior Individual Grant Award of the Swedish Foundation for Strategic Research.
He was also a corecipient of the 1998 EURASIP Best Paper Award for Signal Pro-
cessing for a work on parameter estimation of exponential signals with time-varying
amplitude, a 1999 IEEE Signal Processing Society Best Paper Award for a paper
on parameter and rank estimation of reduced-rank regression, a 2000 IEEE Third
Millennium Medal, and the 2000 W. R. G. Baker Prize Paper Award for a paper
on maximum likelihood methods for radar. He was a member of the international
program committees of many topical conferences. From 1981 to 1986, he was the
Director of the International Time Series Analysis and Forecasting Society, and he
has been a member of the IFAC Technical Committee on Modeling, Identification,
and Signal Processing since 1994. He is also a member of the Romanian Academy
and a Fellow of the Royal Statistical Society.