Alessandro Neri
[…] employ a unified mathematical framework for representing the image and video contents and fulfill the above-mentioned tasks. Thus, this contribution explores the potential of hyper-complete image and video representations based on Circular Harmonic Functions (CHFs), [1], [2]. These representations, which are directly related to the Fourier series expansion of images represented in a polar domain, have many relevant mathematical properties; in addition, their basis set includes elements like edges, lines and corners, which are particularly relevant for the human visual system. Thus their adoption can reduce the effort paid by the user in taking control of the technology while still assuring the […]

where […] (1) is the radial profile corresponding to the $n$-th circular harmonic of $f$. We observe that every term of the series is a Circular Harmonic Function (CHF), i.e., a bi-dimensional polar separable function $\psi_n$ of the kind:

$$\psi\left(r e^{j\theta}\right) = \psi_n(r,\theta) = h_n(r)\, e^{jn\theta}.$$

The harmonic angular shape of $\psi_n$ implies that it can be steered in any direction $\varphi$ by simple multiplication by the complex factor $e^{-jn\varphi}$, i.e., it is self-steerable, [11], [12]. In fact

$$\psi_n(r,\theta-\varphi) = h_n(r)\, e^{jn\theta}\, e^{-jn\varphi}.$$

By virtue of their harmonic angular shape, CHFs are indeed natural detectors for different classes of features: CHFs […]

This work has been partially funded by the Radiolabs Consortium and the Fondazione Ugo Bordoni.
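The self-steering property can be checked numerically. The following is a minimal sketch, not the author's implementation: the function names `chf` and `steer` and the radial profile `gaussian_profile` are hypothetical choices made only for illustration.

```python
import cmath
import math

def chf(n, r, theta, radial_profile):
    """Evaluate psi_n(r, theta) = h_n(r) * exp(j*n*theta)."""
    return radial_profile(r) * cmath.exp(1j * n * theta)

def steer(value, n, phi):
    """Steer a CHF sample by the angle phi: multiply by exp(-j*n*phi)."""
    return value * cmath.exp(-1j * n * phi)

def gaussian_profile(r):
    """Hypothetical radial profile h_n(r), chosen only for this example."""
    return r * math.exp(-r * r)

n, r, theta, phi = 2, 0.8, 0.3, 1.1
# Sampling the function at the rotated angle ...
rotated = chf(n, r, theta - phi, gaussian_profile)
# ... gives the same value as multiplying the unrotated sample by exp(-j*n*phi).
steered = steer(chf(n, r, theta, gaussian_profile), n, phi)
assert abs(rotated - steered) < 1e-12
```

Under these assumptions, a filter response computed once can be re-oriented to any angle with a single complex multiplication per harmonic, without re-filtering the image.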
[…] $f(\cdot)$ and are obtained from the latter by projection onto the LG basis:

$$\hat f_k^n(\xi,\sigma) = \left\langle f(x)\, V\!\left(\frac{x-\xi}{\sigma}\right),\ \frac{1}{\sigma}\, L_k^n\!\left(\frac{|x-\xi|}{\sigma},\ \theta(x-\xi)\right) \right\rangle.$$

The set $\{\hat f_k^n(\xi,\sigma),\ k = 0, 1, 2, \ldots\}$ describes the local behavior of the $n$-th angular harmonic of the image. Consequently, for edges, lines, forks and crosses, the greatest part of the energy is respectively focused on the first, second, third and fourth harmonic.

[…] around $x_0$. Then the following property holds:

$${}^{\alpha}\hat f_k^n(x_0,\sigma) = e^{-jn\alpha}\, \hat f_k^n(x_0,\sigma). \tag{5}$$

In addition, as shown in [4], the LGT coefficients ${}^{a}\hat f_k^n$ of the scaled-windowed function $f(S_a(x))\, V(S_a(x/\sigma))$ can be directly computed as a linear combination of the LGT coefficients $\hat f_k^n$ of the unscaled windowed function $f(x)V(x/\sigma)$:

$${}^{a}\hat f_k^n = \sum_{l=k}^{\infty} B(a,n,k,l)\, \hat f_l^n, \tag{6}$$
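Two of the statements above, the concentration of edge energy on the first harmonic and the phase-only effect of a rotation in (5), can be illustrated with a simplified experiment. The sketch below is an assumption-laden stand-in: instead of the full Laguerre-Gauss projection it uses plain angular Fourier coefficients of a pattern sampled on a single circle, and the helper names `angular_harmonic` and `rotate` are hypothetical.

```python
import cmath
import math

def angular_harmonic(samples, n):
    """n-th angular Fourier coefficient of uniform samples on a circle."""
    count = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * n * i / count)
               for i, s in enumerate(samples)) / count

def rotate(samples, shift):
    """Rotate the circular pattern by `shift` samples: g[i] = f[i - shift]."""
    count = len(samples)
    return [samples[(i - shift) % count] for i in range(count)]

M = 360  # angular samples per circle (illustrative choice)
# Ideal step edge through the centre, seen along a circle: +1 on one side, -1 on the other.
edge = [1.0 if math.cos(2 * math.pi * i / M) > 0 else -1.0 for i in range(M)]

# Energy of harmonics 1..4: for an edge the first harmonic dominates.
energies = [abs(angular_harmonic(edge, n)) ** 2 for n in range(1, 5)]
assert energies[0] == max(energies)

# A rotation by alpha only multiplies the n-th coefficient by exp(-j*n*alpha).
shift = 30
alpha = 2 * math.pi * shift / M
for n in range(1, 5):
    expected = cmath.exp(-1j * n * alpha) * angular_harmonic(edge, n)
    assert abs(angular_harmonic(rotate(edge, shift), n) - expected) < 1e-9
```

The rotation is restricted to a whole number of samples so that the discrete shift is exact; for arbitrary angles the same relation holds for the continuous coefficients.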
3. RESTORATION AND ENHANCEMENT
where

$$\tilde z_{s_k}(x) = \gamma_k\, z_{s_k}(x), \tag{16}$$

[…]

$$[\ldots] \times \left[R^k_{N_j}(x) + R^k_{Z_i}(x)\right]^{-1} z_{s_k}(x), \tag{17}$$
where

$$w_{ij}[z_{s_k}(x)] = \frac{\beta_j^k \lambda_i^k\, N_2\!\left[z_{s_k}, 0, R^k_{N_j}(x) + R^k_{Z_i}(x)\right]}{\sum_{i}\sum_{j} \beta_j^k \lambda_i^k\, N_2\!\left[z_{s_k}, 0, R^k_{N_j}(x) + R^k_{Z_i}(x)\right]}.$$

Application of (17) requires the knowledge of the local noise statistics. Considering that the complex edge image is essentially a sample of a sparse random field, a simple Gaussian distribution can be successfully applied. Inspired by both parametric CFAR detectors and psychophysical and neurophysiological findings, we compute the noise variance as the local average of the energy of the weighted complex edge image on a neighborhood of the current site. To avoid self-masking, the noise variance is set to the minimum of the averages of the L2 norms of the complex edges computed on the sets given by two half rings centred on the current site and oriented along the edge direction, with the exclusion of a narrow strip of width $2a$ oriented along the edge, as illustrated in Fig. 2.

Fig. 2. CFAR sets for background activity estimation.

To this aim, we define two weighting functions $w^{+}_{s_k,\phi}(x)$ and $w^{-}_{s_k,\phi}(x)$:

$$w^{\pm}_{s_k,\phi}(x) = \frac{DoG^{\pm}_{s_k,\phi}(x)}{\sum_{n}\sum_{m} DoG^{\pm}_{s_k,\phi}(x)}, \tag{18}$$

where $DoG^{\pm}_{s_k,\phi}(x)$ is obtained as:

$$DoG_{s_k}(x) \cdot U_{-1}\left[\pm\left(x_1 \cos\phi + x_2 \sin\phi - a\right)\right], \tag{19}$$

where $U_{-1}$ is the step function defined as follows:

$$U_{-1}(\eta) = \begin{cases} 0, & \eta < 0 \\ 1, & \eta \ge 0 \end{cases}, \tag{20}$$

and $DoG(x)$ is the difference of two concentric Gaussian functions:

$$DoG_{\sigma}[x] = \left| N_2\!\left(x, 0, \sigma^2 I\right) - N_2\!\left(x, 0, l^2\sigma^2 I\right) \right|_{+}, \tag{21}$$

where $|\eta|_{+} = \eta\, U_{-1}(\eta)$ denotes half-wave rectification.

Then, we estimate the noise covariance matrix as follows:

$$R_N(x) = \alpha\, \min\left\{ |z_{s_k}(x)|^2 * w^{+}_{s_k,\theta}(x),\ |z_{s_k}(x)|^2 * w^{-}_{s_k,\theta}(x) \right\} I,$$

where $\theta(x)$ is the phase of $z_{s_k}(x)$.

In fact, the adaptive procedure acts as a surround inhibition term that takes into account the influence of the context surrounding each point. The inhibition is large in textured areas and small in flat regions, thus leading to the heaviest noise suppression inside textures while sharpening contours. We note that this inhibition has also been observed in the human visual system, where it originates from the regions flanking the receptive field of an orientation-selective neuron on both sides of the optimal stimulus for that neuron. The width of the central region excluded from the computation of the background statistics is equal to the width of the spatial support of the complex edge wavelet operator. By varying $a$, the inhibition term can partially or completely suppress the sharpening of textured areas.

With respect to the identification of the edge statistical model, we observe that object contours usually consist of long and wide connected components of nonzero pixels, while texture edges, especially after surround inhibition, lead to relatively short and thin components. Thus, to build the contour map we apply edge thinning by non-maxima suppression and binarization by thresholding. Specifically, we apply non-maxima suppression to $|\hat z_{s_k}(x)|$. Let $u_{s_k}(x)$ be the unit vector parallel to $\hat z_{s_k}(x)$ and $S_{s_k}$ the set of all points which are local maxima of $|\hat z_{s_k}(x)|$ in the direction of $u_{s_k}(x)$:

$$S_{s_k} = \left\{ x \ \middle|\ \frac{\partial |\hat z_{s_k}(x)|}{\partial u_{s_k}} = 0 \ \wedge\ \frac{\partial^2 |\hat z_{s_k}(x)|}{\partial u_{s_k}^2} < 0 \right\}.$$

Let us partition the set $S_{s_k}$ into $N_C$ connected subsets $C_l^{(s_k)}$, so that

$$S_{s_k} = \bigcup_{l=1}^{N_C} C_l^{(s_k)}, \qquad C_h^{(s_k)} \cap C_l^{(s_k)} = \emptyset, \quad h \ne l.$$

A morphological dilation is then applied to $C_l^{(s_k)}$, with a $3 \times 3$ square structuring element $q_3$, yielding a dilated subset $D_l^{(s_k)}$:

$$D_l^{(s_k)} = C_l^{(s_k)} \oplus q_3.$$

Then, we define as global contour weight $G$ of $C_l^{(s_k)}$ the sum of the values of $|\hat z_{s_k}(x)|$ over $D_l^{(s_k)}$:

$$G\left(C_l^{(s_k)}\right) = \sum_{[n,m] \in D_l^{(s_k)}} |\hat z_{s_k}[n,m]|.$$

Finally, a binary edge map $B_{s_k}$ is computed by thresholding the global contour weight map, i.e.,

$$B_{s_k} = \bigcup_{G\left(C_l^{(s_k)}\right) > G_{min}} C_l^{(s_k)}.$$

An example of this edge detection procedure in presence of heavy noise is illustrated in Fig. 3.
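The contour-map construction (connected subsets, 3×3 dilation, global contour weight, thresholding) can be sketched as follows. This is a minimal illustration under stated assumptions: the set of local maxima is taken as already computed by non-maxima suppression, 8-connectivity is assumed because the text does not specify the connectivity, and the function names are hypothetical.

```python
from collections import deque

def connected_components(points):
    """Split a set of (row, col) sites into 8-connected subsets (assumed connectivity)."""
    remaining = set(points)
    components = []
    while remaining:
        seed = remaining.pop()
        component, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbour = (r + dr, c + dc)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        component.add(neighbour)
                        queue.append(neighbour)
        components.append(component)
    return components

def dilate_3x3(component, shape):
    """Morphological dilation by a 3x3 square, clipped to the image support."""
    rows, cols = shape
    return {(r + dr, c + dc)
            for r, c in component
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if 0 <= r + dr < rows and 0 <= c + dc < cols}

def binary_edge_map(magnitude, maxima, g_min):
    """Keep the components whose global contour weight exceeds g_min."""
    shape = (len(magnitude), len(magnitude[0]))
    kept = set()
    for component in connected_components(maxima):
        weight = sum(magnitude[r][c] for r, c in dilate_3x3(component, shape))
        if weight > g_min:
            kept |= component
    return kept

# Toy example: one long contour (row 1) and one isolated texture response.
magnitude = [[0.0] * 6 for _ in range(6)]
for col in range(5):
    magnitude[1][col] = 2.0
magnitude[4][4] = 1.0
maxima = {(1, col) for col in range(5)} | {(4, 4)}
edges = binary_edge_map(magnitude, maxima, g_min=5.0)
assert edges == {(1, col) for col in range(5)}
```

In the toy example the long component accumulates a weight of 10 and survives the threshold, while the isolated response (weight 1) is discarded, which mirrors the intended separation between object contours and short texture edges.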
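The background noise estimate built from the weighting functions (18) to (21) can also be sketched on a pixel grid. The code below is a rough illustration, not the paper's implementation. It assumes that the two half rings lie at signed distance of at least $a$ on either side of the line through the current site, so that a strip of width $2a$ is excluded as the prose describes; the window radius, the parameter values and all function names are hypothetical.

```python
import math

def gauss2(x1, x2, var):
    """Isotropic zero-mean 2-D Gaussian density with covariance var * I."""
    return math.exp(-(x1 * x1 + x2 * x2) / (2.0 * var)) / (2.0 * math.pi * var)

def dog(x1, x2, sigma, l):
    """Half-wave rectified difference of two concentric Gaussians, as in (21)."""
    diff = gauss2(x1, x2, sigma ** 2) - gauss2(x1, x2, (l * sigma) ** 2)
    return max(diff, 0.0)

def half_weights(sigma, l, phi, a, radius, sign):
    """Normalized weights on one half ring (sign=+1 or -1), in the spirit of (18)-(19)."""
    weights = {}
    for n in range(-radius, radius + 1):
        for m in range(-radius, radius + 1):
            # Assumed half-plane test: keep sites at signed distance >= a.
            side = sign * (n * math.cos(phi) + m * math.sin(phi)) - a
            if side >= 0:
                weights[(n, m)] = dog(n, m, sigma, l)
    total = sum(weights.values())
    return {offset: value / total for offset, value in weights.items()}

def local_noise_variance(energy, site, sigma, l, phi, a, radius, alpha):
    """alpha * min of the two weighted averages of |z|^2 around `site`.

    `site` must lie at least `radius` pixels away from the image border.
    """
    r0, c0 = site
    averages = []
    for sign in (+1, -1):
        weights = half_weights(sigma, l, phi, a, radius, sign)
        # Discrete convolution evaluated at the current site only.
        averages.append(sum(w * energy[r0 - n][c0 - m]
                            for (n, m), w in weights.items()))
    return alpha * min(averages)

# Toy energy map: strong activity above the current site, weak activity below it.
energy = [[4.0 if row < 5 else 1.0 for _ in range(11)] for row in range(11)]
variance = local_noise_variance(energy, (5, 5), sigma=2.0, l=0.5,
                                phi=0.0, a=1.0, radius=4, alpha=1.0)
assert abs(variance - 1.0) < 1e-9
```

Taking the minimum of the two half-ring averages means that activity on only one side of the site does not inflate the noise estimate, which is the self-masking protection the text describes.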
Fig. 3. Edge extraction in presence of noise: (a) noisy input image, SNR = 13 dB; (b) extracted edges.

To restore an image affected by additive white Gaussian noise, the image is filtered with a bank of Gauss-Laguerre filters and decomposed at different scales. The single-scale complex edge image resulting from the wavelet filter passes through the adaptive MMSE complex edge estimator, which locally estimates the noise variance and produces surround inhibition where highly textured areas are suppressed. The estimate is further processed in the binarization block, where a binary contour map is created, as previously described. At this point, the iterative denoising process takes place. At each iteration, the sites belonging to the binary contour map are excluded from the computation of the observation noise variance. Moreover, these sites are used to refine the edge statistics. An example of the denoising procedure is shown in Fig. 4.

Fig. 4. Noisy and restored image: (a) noisy image; (b) restored image.

3.1. 3D image enhancement

The great popularity gained by 3D Hollywood productions, the imminent introduction into the consumer market of 3D TV sets announced by the major manufacturers, and the 3D sports broadcasting that will enter into service with the 2010 FIFA World Cup focus our attention on 3D image and video enhancement. In principle each frame of a stereo pair could be processed separately, but relevant benefits can be gained when the existing correlation is considered. With respect to restoration and enhancement, this correlation can be summarized by the disparity map, which provides information about the spatial relationships among objects and viewer, and by the motion field, which accounts for the temporal evolution.

Extraction of the disparity map and the motion field from a stereo pair or a stereo video can benefit from the adoption of a Gauss-Laguerre representation. In fact, as illustrated in the next section, in this way the implementation of the Maximum Likelihood pattern location and rotation estimators can be drastically simplified from a computational point of view. Nevertheless, in this section we focus our attention on the visual enhancements that can be introduced when the characteristics of the human visual system are combined with the possibility of locally controlling the spatial bandwidth of the reconstruction operator by acting on the $\gamma_k$ factors of (16), [17].

In fact, as stated by the principle of "Organization of space", well known to painters and well exemplified by many paintings by Leonardo da Vinci, like the famous "Mona Lisa", foreground figures possess well-defined contours, while objects in the background appear more and more blurred as their distance increases. In addition, this effect increases in presence of moving objects.

The visual appearance of 3D videos can be improved by adopting an adaptive multi-resolution enhancement technique [18] controlled by the disparity map, which provides the information on the objects' distances from the observer's point of view, and by the motion field.

Without loss of generality, let us assume that the stereo pair has been rectified, if needed, in order to comply with a simplified epipolar geometry characterized by a horizontal epipolar line, so that the disparity $d(x)$, i.e., the horizontal displacement between the left and right images of a given object, is proportional to the object depth $z(x)$.

In addition, let $t(x)$ be the motion field. The disparity map and the motion field are then combined in order to control the amount of enhancement performed over different regions of a given frame pair. The discussion of disparity map and motion field estimation and of the related regularization algorithms is beyond the scope of this paper. Here we assume that both the disparity and the motion field have been filtered, if needed, in order to remove outliers as well as meaningless details.

Let $d_{min}$ and $d_{max}$ respectively denote the minimum and the maximum disparity for a given imaging geometry and $t_{max}$ the maximum motion vector compatible with the expected object dynamics. Then, the normalized disparity map $d^{(o)}$ is defined as:

$$d^{(o)}(x) = \begin{cases} \dfrac{d(x) - d_{min}}{d_{max} - d_{min}}, & d(x) < d_{max} \\ 1, & d(x) \ge d_{max} \end{cases}, \tag{22}$$

Similarly, the normalized motion field $t^{(o)}$ is defined as:

$$t^{(o)}(x) = \begin{cases} \dfrac{t(x)}{t_{max}}, & t(x) < t_{max} \\ 1, & t(x) \ge t_{max} \end{cases}, \tag{23}$$
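The normalizations (22) and (23) amount to a clipped linear rescaling. A minimal sketch follows; it assumes that $t(x)$ is handled through its scalar magnitude, since the text compares it directly with $t_{max}$, and the function names are hypothetical.

```python
def normalized_disparity(d, d_min, d_max):
    """Eq. (22): rescale the disparity to [0, 1], saturating at d_max."""
    if d >= d_max:
        return 1.0
    return (d - d_min) / (d_max - d_min)

def normalized_motion(t, t_max):
    """Eq. (23): rescale the motion magnitude to [0, 1], saturating at t_max."""
    if t >= t_max:
        return 1.0
    return t / t_max

# Example on a small disparity map given as a list of rows.
disparity = [[0.0, 5.0], [10.0, 12.0]]
normalized = [[normalized_disparity(d, 0.0, 10.0) for d in row] for row in disparity]
assert normalized == [[0.0, 0.5], [1.0, 1.0]]
```

Values above the maximum saturate at 1, so isolated outliers that survive the filtering step cannot push the normalized maps outside the unit interval.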
Then we introduce the normalized enhancement factor map $e^{(o)}(x)$ as the linear combination of $d^{(o)}(x)$ and $t^{(o)}(x)$, namely,

$$e^{(o)}(x) = \alpha \cdot [1 - d^{(o)}(x)] + (1 - \alpha) \cdot [1 - t^{(o)}(x)]. \tag{24}$$

A greater enhancement factor is associated with those objects which are the slowest and closest to the observer. The parameter $\alpha$ controls the relative relevance of the objects' spatial and temporal behavior.

Therefore, Eqs. (15), (16) are modified as follows:

$$f'(x) = \sum_{k=1}^{L} \gamma_k[e^{(o)}(x)]\, z_{s_k}(x)\, g_{s_k}(x) + y_{s_0}(x). \tag{25}$$

In our experiments the design of the relationship between $\gamma_k$ and $e^{(o)}(x)$ has been inspired by the relationship between the signal bandwidth $B$ and the link length $L$ observed in optical communications, for which we have:

$$B = \frac{B_0}{L^m}, \tag{26}$$

where $0.5 < m \le 1$ depending on the nature of the light dispersion.

Thus the gain coefficients $\gamma$ are selected in such a way that the spatial bandwidth of the reconstruction filter is proportional to $[e^{(o)}]^{-m}$. To this aim, we observe that if a constant set of gains $\gamma_k$ is employed, the overall transfer function is

$$G_{tot}(\omega) = \frac{\sum_{k=1}^{L} \gamma_k\, |H_{s_k}(\omega)|^2}{\sum_{k=1}^{L} |H_{s_k}(\omega)|^2}, \quad k = 1, 2, \ldots, L. \tag{27}$$

The corresponding bandwidth versus $\gamma$ can then be easily computed.

Two examples of the 3D image enhancement based on the disparity map are illustrated in Fig. 5. Visual inspection confirms the larger spatial bandwidth associated with foreground objects. For a proper setup of the parameters, a subjective video quality assessment has been performed at the University of Roma TRE, [17]. In addition to an overall preference for the enhanced versions, the subjective tests indicated that the best fit of the Mean Opinion Score is obtained when $\alpha = 0.5$, so that the same importance is attributed to both factors.

4. PATTERN LOCALIZATION AND IMAGE RETRIEVAL

Local expansions based on Gauss-Laguerre CHFs can be effectively employed for Maximum Likelihood orientation-invariant pattern recognition, [4]. However, a rather large number of expansion terms is required when dealing with large objects containing many details. To reduce the computational complexity we may resort to the hyper-complete Riesz basis (4), so that a region of interest can be partitioned into smaller and smaller square blocks whose content is approximated by a truncated expansion making use of just a few CHFs.

In more detail, at the first step the basis set is empty and the current region $R^{(0)}$ is set equal to the given region of interest (ROI). At the $i$-th step of the recursion, the center $\xi_i$ of the current region $R^{(i)}$ is evaluated and the subset of functions

$$\left\{ \frac{1}{s_i}\, L_k^{(n)}\!\left(\frac{|x - \xi_i|}{s_i},\ \theta(x - \xi_i)\right),\ k = 1, \ldots, K,\ n = 1, \ldots, N \right\}$$

is considered as a potential candidate. Then the norm of the corresponding approximation error on $R^{(i)}$ between the image itself $f(x)$ and the reconstructed image $\hat f(x)$ using the current candidate basis subset is computed, namely

$$\|\Delta f\| = \left\| w_T\!\left(\frac{x - \xi_i}{\delta_i}\right) [f(x) - \hat f(x)] \right\|,$$

where $\delta_i$ denotes the width of $R^{(i)}$, and $w_T(x)$ is a square window of unitary width. If this norm exceeds a predefined threshold, the current region is split into four squares and the quadtree decomposition is recursively applied to each of them. Otherwise the current candidate subset is added to the basis and the next region is considered.

The properties of the quadtree decomposition can be further exploited in order to reduce the search time of a given pattern in a set of candidate images. In fact, a sequential detection and estimation procedure that verifies whether each candidate image contains each square of the quadtree can be adopted. To this aim, the template quadtree blocks can be ranked on the basis of the incremental Fisher's information on pattern location, rotation and scale estimation. However, as demonstrated in [4], this quantity is proportional to the magnitude of the energy of the derivatives along two orthogonal directions and to the energy of the angular derivative, or, equivalently, to the effective spatial and angular bandwidths.

Therefore, the template quadtree blocks are ranked on the basis of the energy of the mid and high, angular and radial, frequency components, computed directly from the Gauss-Laguerre expansion coefficients.

As an alternative, salient points based on invariants can be extracted and quadtree blocks can then be ranked based on the saliency of the key points falling inside them, [16].

When a ROI of a given image has to be searched in a database, the first block of the ranked list is considered. Due to the self-steering property, detection and localization of the pattern belonging to the first block can be performed by means of a quasi-Newton maximization procedure such as the Broyden-Fletcher-Goldfarb-Shanno algorithm maximizing,
for each $b$ the quantity

$$GLLF^{(1)}(b, a, \varphi) = -\frac{2}{N_0 \Gamma} \times \sum_{n=0}^{N} \sum_{k=0}^{K} \left| D_k^n(\xi_c) - \sum_{l=0}^{L} B(a; n, k, l)\, C_k^n(\xi_c - b)\, e^{-jn\varphi} \right|^2,$$

where $\xi_c$ denotes the center of the current region.

Thus, for each discrete location of a grid, the rotation and the scale maximizing $GLLF^{(1)}$ are determined and then a discrete direct search is performed to determine its absolute maximum. Thus, at the first step the parameter estimate is

$$[\hat b^{(1)}, \hat a^{(1)}, \hat\varphi^{(1)}] = \operatorname*{Arg\,max}_{b, a, \varphi} \left\{ GLLF^{(1)}(b, a, \varphi) \right\}.$$

Once the local maximum of $GLLF^{(1)}$ has been computed for each image of the dataset, the images are ranked on the basis of this absolute maximum. Then the image corresponding to the highest $GLLF^{(1)}$ is selected as the potential candidate for image matching, and $[\hat b^{(1)}, \hat a^{(1)}, \hat\varphi^{(1)}]$ is employed as a coarse estimate in order to verify whether the candidate image also contains the second block of the rank-ordered list of quadtree elements. Compared to the first block, the $GLLF^{(2)}$ map is built only for a limited set of possible locations, falling inside a small neighborhood of the site predicted on the basis of the coarse estimates. In addition, the quasi-Newton maximization of $GLLF^{(2)}$ is also initialized using the coarse estimate.

If the energy of the difference between the subset of the reference template, constituted by the first and the second square of the quadtree, and the current image falls below a predefined threshold, location and rotation of the image are refined and the next square is analyzed. In general, at the $h$-th stage the $GLLF^{(h)}$ map is computed using the first $h$ points of the lattice, ranked according to the saliency indicator, i.e.,

$$GLLF^{(h)}[b, a, \varphi] = -\frac{2}{N_0 \Gamma} \times \sum_{m=1}^{h} \sum_{n=0}^{N} \sum_{k=0}^{K} \left| D_k^n(\xi_m) - \sum_{l=0}^{L} B(a; n, k, l)\, C_k^n\!\left[ R_{\varphi}\!\left(\frac{\xi_m - b}{a}\right) \right] e^{-jn\varphi} \right|^2.$$

The procedure ends when the last block in the list has been processed. If at some stage the energy of the difference exceeds a predefined threshold, the current image is discarded and the next item of the dataset corresponding to the highest $GLLF^{(1)}$ is considered as candidate for pattern matching.

Extensive retrieval experiments making use of quadtree decomposition combined with Gauss-Laguerre CHFs, as well as with Zernike's CHFs, have been performed on the Corel-1000-A Database. The proposed technique has been compared to conventional ones, including those based on combination of both [27], those based on global search and regional search [28], and the method employed in SIMPLIcity. The performance assessment indicates that the average percentage of recovered relevant images is greater than 0.96, while the other methods attain at most 0.87 (global search).

5. CONCLUSIONS

CHF-based image representations constitute powerful tools for high-performance algorithms addressing a wide range of applications in the image processing domain, from restoration to image retrieval. Although the techniques illustrated in this contribution have been specified for the Gauss-Laguerre CHFs, they can be easily extended to other CHF families, like those based on Zernike's polynomials.

6. REFERENCES

[1] H. H. Arsenault, Y. Sheng, "Properties of the circular harmonic expansion for rotation invariant pattern recognition," Applied Optics, Vol. 25, No. 18, Sept. 1986, pp. 3225-3229.
[2] P. E. Danielsson, "Rotation invariant linear operators with directional response," Proc. of Fifth Int. Conf. on Pattern Recognition, 1980, pp. 1171-1176.
[3] G. Jacovitti, A. Neri, "Anisotropic Wavelet Thresholding For Bayesian Image Denoising," EUSIPCO 2002, Toulouse, France, September 3-6, 2002.
[4] A. Neri and G. Jacovitti, "Maximum Likelihood Localization of 2-D Patterns in the Gauss-Laguerre Transform Domain: Theoretic Framework and Preliminary Results," IEEE Trans. on Image Processing, vol. 13, pp. 72-86, Jan. 2004.
[5] G. Papari, P. Campisi, A. Neri, N. Petkov, "Contour detection by multiresolution surround inhibition," IEEE ICIP 2006, Atlanta, USA, 2006.
[6] G. Papari, P. Campisi, N. Petkov, "Multilevel surround inhibition. A biologically inspired contour detector," SPIE Electronic Imaging 2007, Image Processing: Algorithms and Systems VI, San Jose, CA, USA, 2007.
[7] G. Papari, P. Campisi, N. Petkov and A. Neri, "A biologically motivated multiresolution approach to contour detection," EURASIP Journal on Advances in Signal Processing, Vol. 2007, Article ID 71828, 28 pages, 2007.
[8] G. Jacovitti and A. Neri, "Multiresolution circular harmonic decomposition," IEEE Trans. Signal Processing, vol. 48, pp. 3242-3247, Nov. 2000.
[9] L. Sorgi, N. Cimminiello, and A. Neri, "Keypoint selection in the Laguerre-Gauss transformed domain," in Proc. of 2nd Workshop on Applications of Computer Vision, ECCV, May 2006.
[10] L. Capodiferro, M. Carli, L. Costantini, A. Neri, V. Palma, "Adaptive Riesz Basis Decomposition for Image Search," EUSIPCO 2009, Glasgow, Scotland, 24-28 August, 2009.
[11] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, D. J. Heeger, "Shiftable Multiscale Transforms," IEEE Trans. on Information Theory, Vol. 38, No. 2, pp. 587-607, March 1992.
[12] Z. Zalevsky, I. Ouzieli, and D. Mendlovic, "Wavelet-transform-based composite filters for invariant pattern recognition," Applied Optics, vol. 35, pp. 3141-3147, Jun. 1996.
[13] S. R. Deans, The Radon Transform and Some of Its Applications. Wiley, New York, 1983.
[14] W. F. McGee, "Complex Gaussian Noise Moments," IEEE Trans. on Information Theory, vol. IT-17, No. 2, pp. 149-157, March 1971.
[15] G. Jacovitti, A. Neri, "Multiscale image features analysis with circular harmonic wavelets," in Wavelet Applications in Signal and Image Processing III, Proc. SPIE 2569, San Diego, July 1995, pp. 363-375.
[16] L. Capodiferro, E. D. Di Claudio, G. Jacovitti and F. Mangiatordi, "Application of Local Fisher Information Analysis to Salient Points Extraction," Proceedings of the IASTED SPPRA 2008, 13-15 February, Innsbruck, Austria.
[17] A. Neri, P. Campisi, E. Maiorana, F. Battisti, "3D Video Enhancement based on Human Visual System characteristics," VPQM 2010, Scottsdale, Arizona, U.S.A., Jan. 13-15, 2010.
[18] C. Ercole, P. Campisi, A. Neri, "Bayesian anisotropic denoising in the Laguerre Gauss domain," SPIE Conference on Image Processing: Algorithms and Systems VI, March 2008.
[19] D. Comaniciu, P. Meer, "Mean Shift: A Robust Approach Toward Feature Space Analysis," IEEE Transactions on PAMI, Vol. 24, No. 5, pp. 603-619, May 2002.
[20] A. Neri, P. Campisi, F. Battisti, "Fuzzy Edge Enhancement in the Complex Wavelet Domain," International Workshop on Video Processing and Quality Metrics (VPQM), 2009.
[21] J. L. Starck, E. J. Candes, and D. L. Donoho, "The curvelet transform for image denoising," IEEE Trans. Image Processing, vol. 11, no. 6, pp. 670-684, Jun. 2002.
[22] D. L. Donoho, "De-noising by soft-thresholding," IEEE Trans. Inf. Theory, vol. 41, no. 3, pp. 613-627, May 1995.
[23] D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," Biometrika, vol. 81, pp. 425-455, 1994.
[24] S. G. Chang, Bin Yu, M. Vetterli, "Adaptive Wavelet Thresholding for Image Denoising and Compression," IEEE Trans. on Image Processing, Vol. 9, pp. 1532-1546, Sept. 2000.
[25] M. K. Mihcak, I. Kozintsev, K. Ramchandran, P. Moulin, "Low-Complexity Image Denoising based on Statistical Modeling of Wavelet Coefficients," IEEE Signal Processing Letters, Vol. 6, No. 12, pp. 300-303, December 1999.
[26] R. A. Gopinath, M. Lang, H. Guo, and J. E. Odegard, "Enhancement of decompressed images at low bit rates," Wavelet Applications in Signal and Image Processing, Proc. SPIE 2302, San Diego, July 1994.
[27] Z. Stejic, Y. Takama and K. Hirota, "Genetic Algorithm-based relevance feedback for image retrieval using local similarity patterns," in Information Proc. & Management, vol. 39, pp. 1-23, Jan. 2003.
[28] S. Rudinac, M. Rudinac, B. Reljin, M. Uscumlic and G. Zajic, "Global Image Search vs. Regional Search in CBIR Systems," in Proc. of VIII International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS'07), pp. 14-17, June 2007.

Fig. 5 panels: (a) Original image; (b) Enhanced image.
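For the disparity-driven enhancement shown in Fig. 5, the enhancement factor (24) and the bandwidth law (26) that inspired the gain design reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the inputs are assumed to be already normalized to [0, 1], and the mapping from the target bandwidth to the individual gains $\gamma_k$ is not reproduced because the text does not give it in closed form.

```python
def enhancement_factor(d_norm, t_norm, alpha=0.5):
    """Eq. (24): e = alpha * (1 - d_norm) + (1 - alpha) * (1 - t_norm)."""
    return alpha * (1.0 - d_norm) + (1.0 - alpha) * (1.0 - t_norm)

def link_bandwidth(b0, length, m):
    """Eq. (26): B = B0 / L**m, defined in the text for 0.5 < m <= 1."""
    if not 0.5 < m <= 1.0:
        raise ValueError("m must satisfy 0.5 < m <= 1")
    return b0 / length ** m

# A slow object with small normalized disparity gets a high enhancement factor.
near_and_slow = enhancement_factor(0.1, 0.2)
far_and_fast = enhancement_factor(0.9, 0.8)
assert near_and_slow > far_and_fast

# The default alpha = 0.5 is the value the subjective tests favoured.
assert abs(enhancement_factor(0.2, 0.6) - 0.6) < 1e-12
assert link_bandwidth(10.0, 4.0, 1.0) == 2.5
```

With $\alpha = 0.5$ the spatial and temporal cues contribute equally; moving $\alpha$ toward 1 makes the enhancement depend on disparity alone.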