Abstract
1 Introduction
memory and used to recognise 3D objects under different poses. Von der Malsburg and co-workers [2, 3]
have used a relatively simple multi-channel representation to model the appearance of key facial features.
Rather than providing overall recognition themselves,
these so-called jets are used to control the fitting of an
active net to faces in different 3D poses. De Bonet and
Viola use a multiscale filter bank to extract 48,000 features. The features are used to perform content-based
image indexing [5]. Mel's channel model is perhaps
one of the most ambitious [6]. Here there are over
100 different channels specialised not only to local features, but also to colour, spatial or linear contiguity
(blobs and lines) and local curvature (corners).
One of the key issues that arises when such
a multi-channel feature or object representation is
used is that of how to learn the pattern of filter
responses. The literature here is relatively sparse.
Most of the work which exploits neural network architectures adopts the working model that the recognition process should be trained from a few examples
and that the generalisation properties of the network
should be exploited to accommodate variable object
appearance [1]. An example of a more principled approach is Bregler and Malik's [4] use of the expectation-maximisation [7] algorithm to learn the channel mixing proportions. This procedure has been demonstrated to work effectively on relatively noise-free and
uncluttered imagery.
In this paper we are interested in the challenging feature recognition problems posed by noisy radar
data. Here we wish to identify filters capable of characterising complex radar reflection patterns due to elevated features in the landscape. In particular we
are interested in learning filters for enhancing linear
features such as roads. We commence from a similar starting point to that of Bregler and Malik [4],
by adopting the EM algorithm as a learning engine.
However, our methodology differs in a number of important respects.
Authorized licensed use limited to: University of York. Downloaded on August 16,2010 at 20:37:33 UTC from IEEE Xplore. Restrictions apply.
In the first instance we commence by performing channel balancing so as to ensure that each of the components of the filter bank has an equal noise throughput. In order to model the unknown foreground feature distribution for the channel responses, we adopt
a radial-basis distribution. The basic idea is to fit
a series of Gaussian basis functions to the residue of
the probability distribution when the background process is subtracted. The parameterised distribution is
used to compute a posteriori feature probabilities in
the expectation step of the learning process. Finally,
once the a posteriori feature probabilities are to hand
they may be used to project out the optimal set of
channel combinations for the foreground or target features. Here we use the between-class covariance matrix as a foreground-background separation measure.
We project out linear filter combinations by applying
principal components analysis to the between-class covariance matrix.
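The channel-balancing step is not spelled out in this excerpt. One way to read "equal noise throughput" is the following: for white noise of variance $\sigma^2$, the response of a linear kernel has variance $\sigma^2$ times the sum of its squared coefficients, so scaling every kernel to unit L2 norm equalises the noise variance across channels. The sketch below illustrates that reading only. The function name `balance_channels` and the random stand-in kernels are assumptions, not the authors' procedure.

```python
import numpy as np

def balance_channels(kernels):
    """Scale each kernel in a stack of shape (n, h, w) to unit L2 norm."""
    norms = np.sqrt((kernels ** 2).sum(axis=(1, 2), keepdims=True))
    return kernels / norms

rng = np.random.default_rng(2)
# Hypothetical bank: four random kernels with very different gains.
gains = np.array([1.0, 2.0, 5.0, 10.0])[:, None, None]
kernels = rng.normal(size=(4, 7, 7)) * gains
balanced = balance_channels(kernels)

# Apply every balanced kernel to many unit-variance white-noise patches.
patches = rng.normal(size=(20000, 7, 7))
responses = np.einsum("pij,cij->pc", patches, balanced)
noise_variance = responses.var(axis=0)  # one value per channel
print(noise_variance)
```

After balancing, the per-channel noise variances should agree to within sampling error, whatever the original gains were.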
2 Channel Model
The overall aim in this paper is to describe a statistical methodology for learning combinations of channel filters for feature characterisation. The learning
procedure is non-linear and is based on the EM algorithm of Dempster, Laird and Rubin [7]. In this
Section we outline the statistical model that underpins
the learning algorithm.
We are interested in identifying an orientational filter basis consisting of odd and even symmetry kernels that can be used to characterise mixed-symmetry
variable-width hedge-features. Although there are
many alternatives available in the literature, here we
make use of the Gabor filter. If $x$ and $y$ denote the
spatial co-ordinates, then the so-called Gabor functions with horizontal orientation having spatial width
$w$ and frequency $\nu$ are as follows
\[
L_{w,0}(x, y) = \exp\left[ -\frac{x^2 + y^2}{2w^2} \right] \cos[2\pi\nu x] \tag{1}
\]
\[
E_{w,0}(x, y) = \exp\left[ -\frac{x^2 + y^2}{2w^2} \right] \sin[2\pi\nu x] \tag{2}
\]
Since it is of even symmetry, the cosine-phase Gabor kernel $L_{w,0}(x, y)$ operates as a line-enhancement
operator. The sine-phase kernel, on the other hand,
is appropriate to edge-detection. The filter pair described above is appropriate to the detection of intensity features aligned along the x-axis of the image
plane. Kernels appropriate to the detection of features
oriented along the vertical are obtained by rotating the
horizontal kernels by $\pi/2$. To make this angular dependence explicit, we let $L_{w,\theta}(x, y)$ and $E_{w,\theta}(x, y)$ denote
[...] vector of filter kernels and $\ast$ is the convolution operation with the image $I$.
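The Gabor pair of Equations (1) and (2), together with the rotation used to obtain other orientations, can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The function name `gabor_pair`, the grid size, and the parameter values are assumptions, and rotation is applied to the carrier only because the Gaussian envelope is isotropic.

```python
import numpy as np

def gabor_pair(size, w, nu, theta=0.0):
    """Return (even, odd) Gabor kernels sampled on an odd size x size grid.

    w is the Gaussian width, nu the carrier frequency in cycles per pixel,
    theta the rotation angle in radians (all hypothetical parameter names).
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotated coordinate along which the sinusoidal carrier varies.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * w ** 2))
    even = envelope * np.cos(2.0 * np.pi * nu * xr)  # cosine phase, Equation (1)
    odd = envelope * np.sin(2.0 * np.pi * nu * xr)   # sine phase, Equation (2)
    return even, odd

# Horizontal pair and the pair rotated by pi/2.
even0, odd0 = gabor_pair(size=21, w=4.0, nu=0.1)
even90, odd90 = gabor_pair(size=21, w=4.0, nu=0.1, theta=np.pi / 2)
```

On this symmetric grid the cosine-phase kernel is unchanged when $x$ is mirrored, while the sine-phase kernel changes sign, which is the even and odd symmetry the text relies on.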
2.2 Background
Our statistical modelling of the filter-bank output commences by considering the image background.
Here we assume that the filters are being applied to
locally uniform image regions containing no significant features or structure. We further assume that
the uniform structureless regions are subject to additive Gaussian noise with zero mean and variance
$\sigma^2$. Since the filter responses are obtained in a linear
fashion from the noisy image data, the channel
response vector $x_{i,j} = W \ast I$ follows a multivariate
Gaussian with zero mean. In other
words, the probability density function for the background distribution
of channel-vectors is
\[
p(x_{i,j} \mid \Sigma) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left[ -\frac{1}{2} x_{i,j}^T \Sigma^{-1} x_{i,j} \right] \tag{3}
\]
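Equation (3) can be evaluated directly once a background covariance is available. The sketch below works in the log domain for numerical stability and computes the squared Mahalanobis length $x^T \Sigma^{-1} x$ along the way. The function name, the choice of eight channels, and the simulated noise used to estimate the covariance are assumptions made for illustration.

```python
import numpy as np

def background_log_density(x, cov):
    """Log of the zero-mean Gaussian density for vectors x of shape (..., n)."""
    n = cov.shape[0]
    cov_inv = np.linalg.inv(cov)
    # Squared Mahalanobis length x^T cov^{-1} x for every channel vector.
    maha = np.einsum("...i,ij,...j->...", x, cov_inv, x)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + maha)

rng = np.random.default_rng(0)
n_channels = 8
# Stand-in for channel responses measured over featureless background.
noise = rng.normal(size=(5000, n_channels))
cov = np.cov(noise, rowvar=False)
log_p = background_log_density(noise, cov)
```

Channel vectors with a large Mahalanobis length receive a low background density, which makes them the candidates for foreground structure.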
2.3 Foreground
The statistical modelling of the foreground distribution of Mahalanobis length for feature detection
has proved to be an extremely elusive task. For instance, attempts at modelling the distribution of edge-gradient for automatic control of the Canny hysteresis
thresholds have confined their attention to the background or noise process [8, 9]. In this paper our aim is
to use the background model to assist in learning the
channel structure of the foreground. Specifically, we
augment the Gaussian background model with a radial basis expansion which we use to parameterise
foreground structure. We distinguish between the different basis kernels by assigning them a label $\omega$. The
complete set of foreground kernels is denoted by the
set $\Omega$. The kernel indexed $\omega$ has mean Mahalanobis
length $\mu_\omega$ and mixing proportion $\pi_\omega$. The basis functions are assumed to have a radial structure. In other
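\[
p(x_{i,j} \mid \omega, \Theta_\omega) = \frac{1}{\sqrt{(2\pi)^n |\Sigma_\omega|}} \exp\left[ -\frac{1}{2} (x_{i,j} - \mu_\omega)^T \Sigma_\omega^{-1} (x_{i,j} - \mu_\omega) \right] \tag{4}
\]
where $\Theta_\omega$ represents the set of basis parameters for
the kernel indexed $\omega$:
\[
\Theta_\omega = (\mu_\omega, \Sigma_\omega, \pi_\omega)^T \tag{5}
\]
Finally, the radial-basis approximation of the foreground is
\[
p(x_{i,j} \mid \Omega) = \sum_{\omega \in \Omega} \pi_\omega \, p(x_{i,j} \mid \omega, \Theta_\omega) \tag{6}
\]
3.2 Maximisation:
In the maximisation step we aim to recover maximum likelihood basis-parameters which satisfy the
condition
\[
\Theta^{(n+1)} = \arg\max_{\Theta} K(\Theta \mid \Theta^{(n)}) \tag{9}
\]
At iteration $n$ of the algorithm, the position of the
basis-function indexed $\omega$ is given by
\[
\mu_\omega^{(n+1)} = \frac{\sum_{x \in H} (1 - P(x \mid \Sigma)) \, P(\omega \mid x, \Theta^{(n)})}{\sum_{x \in H} P(\omega \mid x, \Theta^{(n)})} \tag{10}
\]
The corresponding basis-function width is equal to
\[
\left(\sigma_\omega^{(n+1)}\right)^2 = \frac{\sum_{x \in H} \left( (1 - P(x \mid \Sigma)) - \mu_\omega^{(n)} \right)^2 P(\omega \mid x, \Theta^{(n)})}{\sum_{x \in H} P(\omega \mid x, \Theta^{(n)})}
\]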
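The position and width updates of the maximisation step are weighted averages, with the a posteriori probabilities acting as weights. A minimal sketch follows, assuming the scalar quantity $1 - P(x \mid \Sigma)$ has already been computed for each sample and that the a posteriori probabilities come from an expectation step not reproduced here. The names `m_step`, `r`, `resp` and `mu_old`, and the toy numbers, are hypothetical.

```python
import numpy as np

def m_step(r, resp, mu_old):
    """One maximisation step for the basis-function positions and widths.

    r      : shape (N,), the per-sample value 1 - P(x | Sigma)
    resp   : shape (N, K), a posteriori probabilities of the K kernels
    mu_old : shape (K,), kernel positions from the previous iteration
    Returns (mu_new, var_new), each of shape (K,).
    """
    weight = resp.sum(axis=0)
    # Position update: posterior-weighted mean of r, as in Equation (10).
    mu_new = (resp * r[:, None]).sum(axis=0) / weight
    # Width update: posterior-weighted squared deviation about the old position.
    var_new = (resp * (r[:, None] - mu_old[None, :]) ** 2).sum(axis=0) / weight
    return mu_new, var_new

# Toy data: four samples and two kernels.
r = np.array([0.1, 0.2, 0.8, 0.9])
resp = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
mu_old = np.array([0.2, 0.8])
mu_new, var_new = m_step(r, resp, mu_old)
```

The width is measured about the previous position, following the iteration indices in the update above. Measuring it about the freshly updated position instead is the more common EM variant and would be a one-line change.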
5 Experiments
The experimental evaluation of our feature characterisation algorithm revolves around Millimetric
Doppler Beam Sharpened (MDBS) radar images.
These images differ from their SAR counterparts in a
number of important respects. Firstly, the frequency
of the radar is of the order of 100 GHz rather than
the 10 GHz which is typical of SAR. This means that
structures whose size is of the order of a few millimetres appear rough to the radar. The shorter wavelengths employed in the MDBS imagery are a consequence of physical constraints imposed upon the dimensions of the resonating cavities in airborne military radars. The second difficulty stems from the
imaging geometry. Since the radar is used to sense
objects in the line of flight from a low-flying aircraft,
the images are subject to small angle systematics.
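As a rough check on the length scales quoted above, the free-space wavelength follows from $\lambda = c/f$. Taking $c \approx 3 \times 10^{8}$ m/s (the subscripts below are labels introduced here for illustration):
\[
\lambda_{\mathrm{MDBS}} \approx \frac{3 \times 10^{8}\ \mathrm{m/s}}{100 \times 10^{9}\ \mathrm{Hz}} = 3\ \mathrm{mm}, \qquad \lambda_{\mathrm{SAR}} \approx \frac{3 \times 10^{8}\ \mathrm{m/s}}{10 \times 10^{9}\ \mathrm{Hz}} = 30\ \mathrm{mm}
\]
At 100 GHz, structure a few millimetres in size is therefore comparable to the wavelength, which is consistent with such surfaces appearing rough to the MDBS radar.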
leading principal component (the quantity of interest
is $y_{ij} = \sqrt{\sum_{k=2}^{8} (V_k^T x_{ij})^2}$). Here the contrast between the elevated structures and the background is
very poor. In other words, the filter combination obtained by principal components analysis extracts most
of the salient raised structures from the filtered intensity image.
Finally, Figure 2 shows the filter obtained by combining the raw Gabor kernels in the proportions
suggested by the leading principal components. In
other words the filter-kernel for hedge-enhancement
is $K(x, y) = V^T W(x, y)$. The combined kernel
is predominantly of even-symmetry structure. It is
this even-symmetry component which enhances line-structure. However, there is also an odd-symmetry
component which allows for an admixture of edge-like
structure in the detected features. It is the shadowing
of the elevated features which is responsible for the
small edge-component.
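Because filtering is linear, applying the single combined kernel $K = V^T W$ gives the same result as filtering with every channel and then projecting the channel responses onto $V$. The sketch below checks this numerically. The random kernels and the random unit vector are stand-ins for the Gabor bank and the learned eigenvector, and the helper names are assumptions.

```python
import numpy as np

def combine_kernels(kernels, v):
    """Form K(x, y) = v^T W(x, y) from a kernel stack of shape (n, h, w)."""
    return np.tensordot(v, kernels, axes=(0, 0))

def correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation; flip the kernel for true convolution."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

rng = np.random.default_rng(1)
kernels = rng.normal(size=(8, 5, 5))  # stand-in for the filter bank W
v = rng.normal(size=8)                # stand-in for the leading eigenvector V
v /= np.linalg.norm(v)
image = rng.normal(size=(32, 32))

# Route 1: filter once with the combined kernel.
combined = combine_kernels(kernels, v)
direct = correlate_valid(image, combined)

# Route 2: filter with each channel, then project the responses onto v.
per_channel = np.stack([correlate_valid(image, k) for k in kernels])
projected = np.tensordot(v, per_channel, axes=(0, 0))

print(np.allclose(direct, projected))
```

In practice this means the learned combination can be applied as one filter rather than as a full bank followed by a projection.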
6 Conclusions
References