
CHAPTER 1

INTRODUCTION

1.1 EMPIRICAL MODE DECOMPOSITION

Empirical Mode Decomposition (EMD) was first introduced by Huang et al. [1] and
provides a powerful tool for adaptive multi-scale analysis of non-stationary signals. It is a
non-parametric, data-driven analysis tool that decomposes non-linear, non-stationary signals
into Intrinsic Mode Functions (IMFs). The final representation of the signal is an energy-
frequency distribution, designated the Huang spectrum [1], that gives sharp identification of
salient information. With the Hilbert transform, the IMFs allow representation of
instantaneous frequencies as functions of time. The main conceptual benefits are the
decomposition of the parent signal into IMFs and the visualization of its time-frequency
characteristics.
Joint space-spatial-frequency representations have received special attention in the fields
of image processing, vision, and pattern recognition. The EMD method was originally
proposed for the study of ocean waves and has found potential applications in
geophysical exploration, underwater acoustic signals, noise-removal filtering,
biomedicine, audio source separation [2], denoising [3], and the analysis of Doppler
ultrasound signals. The major advantage of EMD is that the basis functions
are derived directly from the signal itself. Compared with Fourier
analysis, EMD is adaptive, whereas Fourier analysis represents signals as linear
combinations of fixed sinusoids.
The combination of EMD with Hilbert Spectral Analysis (HSA), designated the
Hilbert–Huang Transform (HHT) in five patents [1] by the National Aeronautics and Space
Administration (NASA), has provided an alternative paradigm in time–frequency analysis.
Though the Hilbert transform is well known and has been widely used in the signal
processing field since the 1940s, it has many drawbacks when applied to instantaneous
frequency computation; the most serious is that the derived instantaneous frequency of a
signal can lose its physical meaning when the signal is not a mono-component or
AM/FM-separable oscillatory function [4]. The EMD was developed, at its very beginning,
to overcome this drawback so that data can be examined in a physically meaningful
time–frequency–amplitude space. It is widely accepted that EMD,
with its recent improvements, has become a powerful tool in both signal processing and
scientific data analysis.

EMD is empirical, intuitive, direct, and adaptive, without pre-determined basis
functions, in contrast to almost all previous decomposition methods. The decomposition
is designed to seek the different simple intrinsic modes of oscillations in any data based on
local time scales. Although EMD is a simple decomposition method, it has many intriguing
but useful characteristics that other decomposition methods lack. Flandrin et al. [5] studied the
Fourier spectra of IMFs of fractional Gaussian noise, which are widely used in the signal
processing community and financial data simulation. They found that the spectra of all IMFs
except the first one of any fractional Gaussian noise collapse to a single shape along the axis
of the logarithm of frequency (or period) with appropriate amplitude scaling of the spectra.
The center frequencies of the spectra of neighboring IMFs are approximately
halved (and the center periods hence doubled); therefore, the EMD is essentially a dyadic
filter bank [6]. Flandrin et al. [5] also demonstrated that EMD behaves like a cubic spline
wavelet when applied to delta functions. Independently, Wu and Huang [1] found the same
result for white noise (which is a special case of fractional Gaussian noise).

In addition, Wu and Huang [1] argued using the central limit theorem that each
IMF of Gaussian noise is approximately Gaussian distributed and, therefore, that the energy
of each IMF must follow a χ² distribution. From the characteristics they obtained, Wu and
Huang further derived the expected energy distribution of the IMFs of white noise. By
determining the number of degrees of freedom of the χ² distribution for each IMF of noise,
they derived the analytic form of the spread function of the IMF energy. From these results,
one is able to discriminate an IMF of data containing signals from that of pure white noise
at any arbitrary statistical significance level. They verified their analytic results against a
Monte Carlo test and found them consistent.

The main difference between the wavelet and EMD fusion approaches lies in the
decomposition. The EMD method is widely used in one-dimensional signal processing as
well as in two-dimensional image processing. Wavelet decomposition relies on a predefined
wavelet basis, while EMD is a non-parametric, data-driven process that does not require a
predetermined basis. It is commonly understood that the Fourier transform is a useful
method for stationary signal analysis, while the DWT is more suitable for non-stationary
signal analysis. In effect, the DWT behaves like a windowed Fourier transform, and the
finite length of the DWT basis may cause energy leakage. Once the wavelet basis and
decomposition level are determined, the signal obtained lies within a frequency range that
depends only on the sampling rate and has no relationship to the signal itself. Therefore, this
method is not adaptive. Compared with the DWT, the EMD shows superior performance in
data analysis and data filtering. It is a powerful tool for adaptive multiscale analysis of
non-linear and non-stationary signals. These interesting characteristics of the EMD motivated
the extension of this method to the area of image processing [7].

1.2 OBJECTIVES OF THE WORK

The main objective of this work is to study Empirical Mode Decomposition (EMD) and to
perform image fusion using EMD.

 To study the EMD process, carry out image decomposition using Bidimensional
Empirical Mode Decomposition (BEMD) and Vectorized Empirical Mode
Decomposition (VEMD), and compare the results.
 To fuse the decomposed images from BEMD and VEMD using four different
methods.

 To compare the results and analyse the performance of each fusion method.

1.3 ORGANIZATION OF THESIS

The second chapter explains the properties of Intrinsic Mode Functions (IMFs) and the
process of one-dimensional EMD.

The third chapter explains Bidimensional Empirical Mode Decomposition (BEMD) and its
techniques in brief, and presents BEMD by the finite element method.

The fourth chapter deals with Vectorized Empirical Mode Decomposition (VEMD) and gives
the process of VEMD with a flow chart.

The fifth chapter deals with image fusion and explains four different image fusion
methods in detail.

The sixth chapter presents the comparative results and discussions with the help of subjective
and objective evaluation.

The seventh chapter deals with conclusions and the assumptions made for carrying out the
project.

CHAPTER 2

ONE DIMENSIONAL EMPIRICAL MODE DECOMPOSITION

Empirical Mode Decomposition (EMD) is a powerful tool for adaptive multi-scale
analysis of non-stationary signals [1]. It is a non-parametric, data-driven analysis tool that
decomposes non-linear, non-stationary signals into Intrinsic Mode Functions (IMFs). A
fundamental problem in data analysis is to obtain an adaptive, application-oriented
representation for a given data set, and EMD is an efficient method for such adaptive
representations. Indeed, the original purpose of EMD is to decompose a signal into
components, each of which has a meaningful instantaneous frequency, with different
components corresponding to different frequency scales. EMD decomposes a signal into a
finite sum of intrinsic mode functions (IMFs) based on the direct extraction of the energy
associated with various intrinsic time scales. Many examples of using EMD show that the
IMFs obtained provide physical insights that are crucial in engineering
applications. Due to the fully adaptive nature of the method, it is particularly suitable for
processing nonlinear and non-stationary signals.
EMD for 1D signal analysis is described in this chapter.

2.1 EMD ASSUMPTIONS

Contrary to many former decomposition methods, EMD is intuitive and direct,
with the basis functions based on and derived from the data itself. The assumptions of the
method are [1][4]:
1. The signal has at least one pair of extrema.
2. The characteristic time scale is defined by the time between successive extrema.
3. If there are no extrema, and only inflection points, then the signal can be
differentiated to reveal the extrema, whose IMFs can be extracted; integration may
be employed for reconstruction. The time between successive extrema was used
by Huang et al. [1] as it allowed the decomposition of signals that were all positive,
all negative, or both. This implied that the data did not have to have a zero mean, as
in the case of image data. It also allowed a finer resolution of the oscillatory
modes.

2.2 INTRINSIC MODE FUNCTION

A function is an intrinsic mode function if the number of extrema equals the number
of zero-crossings and if it has a zero local mean. EMD decomposes a signal into its IMFs
based on the local frequency or oscillation information. The first IMF contains the
highest local frequencies of oscillation, the final IMF contains the lowest, and the residue
contains the trend of the data. The IMFs obtained from EMD are
expected to have the following properties [8]:

i. in the whole data set, the number of extrema and the number of zero crossings must
be equal or differ by at most one.
ii. there should be only one mode of oscillation between two successive zero
crossings.
iii. at any point, the mean value of the envelope defined by the local maxima and the
envelope defined by the local minima is zero.
iv. the IMFs are locally orthogonal. In fact property (i) ensures property (ii), and vice
versa.
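As a quick illustration of properties (i) and (iii), the following sketch (assuming NumPy and evenly sampled data) checks whether a 1-D array behaves like an IMF. The moving-average test is only a crude stand-in for the mean of the true spline envelopes, so this is an indicative check, not a definitive one:

```python
import numpy as np

def is_imf(h, tol=0.05):
    """Rough check of IMF properties (i) and (iii) for a 1-D array h."""
    d = np.diff(h)
    n_extrema = int(np.sum(d[:-1] * d[1:] < 0))   # sign changes of the slope
    n_zc = int(np.sum(h[:-1] * h[1:] < 0))        # zero crossings
    if abs(n_extrema - n_zc) > 1:                 # property (i)
        return False
    # property (iii), approximated: a short moving average stands in for the
    # mean of the spline envelopes
    win = max(len(h) // 20, 3)
    local_mean = np.convolve(h, np.ones(win) / win, mode="same")
    return float(np.max(np.abs(local_mean))) < tol * float(np.max(np.abs(h)))
```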

2.2.1 The Sifting Process

As per the definition of an IMF, the decomposition method can simply employ the
envelopes defined by the local maxima and minima separately. The extrema are identified
and all local maxima are connected by a cubic spline to form the upper envelope. The
process is repeated for the local minima to construct the lower envelope. While
interpolating, care is taken that the upper and lower envelopes cover all the data between
them. The point-wise mean of the envelopes is called m_1 and is subtracted from the data
r_0 to give the first component h_1. For the first iteration, X(t) is used as the data (shown in
figure 2.1):

r_0 = X(t)

h_1 = r_0 - m_1

As per the mathematical definition, h_1 should be considered an IMF, since it seems to
satisfy all the requirements. But since the extrema are interpolated with numerical schemes,
overshoots and undershoots are bound to occur. These generate new
maxima and minima and distort the magnitude and phase of the existing extrema. These
effects do not affect the process directly, as it is the mean of the envelopes that passes on to
the next stage of the algorithm and not the envelopes themselves. The formation of false
extrema cannot be avoided easily and an interesting offshoot is that this procedure inherently
recovers the proper modes lost in the initial examination and recovers low-amplitude riding
waves on repeated sifting. The envelope means may be different from true local mean and
consequently some asymmetric waveforms may occur but they can be ignored as their effects
in the final reconstruction are very minimal. Apart from a few theoretical difficulties, in
practice, a ringing effect at the ends of the data array can occur. But even with these effects,
the sifting process still extracts the essential scales from the data set. The sifting process
eliminates riding waves and makes the signal symmetrical. In the second sifting pass, h_1 is
treated as the data, with m_{11} the mean of its envelopes:

h_{11} = h_1 - m_{11}

The sifting is continued k times until the first IMF is obtained:

h_{1k} = h_{1(k-1)} - m_{1k}   (2.1)

We designate c_1 as the first IMF:

c_1 = h_{1k}   (2.2)

2.2.2 Stopping Criteria

In sifting, the finest oscillatory modes are separated from the data, analogous to
separating fine particles through a set of fine to coarse sieves. As can be expected of such a
process, uneven amplitudes will be smoothened. But if performed too long, the sifting
process becomes invasive and destroys the physical meaning of the amplitude fluctuations.
On sifting too long, we get IMFs that are frequency modulated signals with constant
amplitude. To retain the physical meanings of an IMF, in terms of amplitude and frequency
modulation, a standard deviation based stopping criterion is used. The standard deviation,
SD, computed from two consecutive sifting results, is used as one of the stopping criteria.

SD = \sum_{t} \frac{\left[ h_{1(k-1)}(t) - h_{1k}(t) \right]^2}{h_{1(k-1)}^2(t)}   (2.3)

Sifting is stopped if SD falls below a threshold (generally the threshold is between 0.2 and 0.3).
The isolated intrinsic mode function c1 contains the finest scale of the signal and we separate
c1 from the data.

r_1 = r_0 - c_1   (2.4a)

The new signal called the residue r1 still holds lower frequency information. In the next
iteration, the residue r1 is treated as the new data in place of r0 and subjected to the sifting

process. This procedure is repeated on all subsequent residues (the r_j's) to realize a set of
IMFs:

r_2 = r_1 - c_2, \; \ldots, \; r_n = r_{n-1} - c_n   (2.4b)

The sifting through residues can be stopped by either of the following criteria: the
residue becomes too small to be of any practical importance, or the residue becomes a
monotonic function containing no more IMFs. One should not expect the final residue to
always have zero mean, because even for data with zero mean the final residue can differ
from zero; the final residue is the trend of the data. If the residue has more than three
extrema, it goes through the next sifting process; on the other hand, if it has fewer than two
extrema, it is generally considered to carry the lowest information, is taken as the final
residue, and no further sifting is required. Reconstruction of the signal is performed using
the relation

\hat{X}(t) = \sum_{i=1}^{n} c_i + r_n   (2.5)

Thus, the data is decomposed into n empirical modes and a residue r_n, which can be either
the mean trend or a DC offset. The flow process of EMD is described in figure 2.1.

[Flow diagram: starting from r_0 = X(t), the extrema are detected, the envelopes and their mean m_{ik} are computed, and h_{ik} = h_{i(k-1)} - m_{ik}; when SD falls below the threshold, c_i = h_{ik} and r_i = r_{i-1} - c_i; sifting stops when the residue has fewer than three extrema, leaving the trend r_n.]
Figure 2.1 Flow diagram of EMD
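For concreteness, here is a minimal sketch of the sifting loop of figure 2.1 in Python, assuming SciPy is available. The boundary treatment and end-effect handling discussed above are omitted, and the SD computed here is a simplified (ratio-of-sums) form of criterion (2.3):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(h):
    """One sifting pass: spline the maxima and minima, subtract the mean envelope."""
    t = np.arange(len(h))
    imax = argrelextrema(h, np.greater)[0]
    imin = argrelextrema(h, np.less)[0]
    if len(imax) < 4 or len(imin) < 4:      # too few extrema for a cubic spline
        return None
    upper = CubicSpline(imax, h[imax])(t)
    lower = CubicSpline(imin, h[imin])(t)
    return h - (upper + lower) / 2.0        # subtract the envelope mean

def emd(x, sd_thresh=0.3, max_imfs=10):
    """Decompose x into IMFs plus a residue, following equations (2.1)-(2.5)."""
    imfs, r = [], x.astype(float).copy()
    for _ in range(max_imfs):
        h = r.copy()
        while True:
            h_new = sift_once(h)
            if h_new is None:               # residue has too few extrema: stop
                return imfs, r
            sd = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h_new
            if sd < sd_thresh:              # simplified stopping criterion (2.3)
                break
        imfs.append(h)                      # c_i, equation (2.2)
        r = r - h                           # r_i = r_{i-1} - c_i, equation (2.4)
    return imfs, r
```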

[Plots of the IMFs and the final residue of the random signal; horizontal axis: number of samples (0-200), vertical axis: amplitude.]
Figure 2.2 IMFs of a random signal

Figure 2.2 shows the Intrinsic Mode Functions (IMFs) of a random signal generated with
200 samples. The last plot shows the residue of the signal, the lowest level of the
decomposition, which cannot be decomposed further.
The 1D-EMD is very powerful for various signal processing applications because it
decomposes a signal into spectral components called intrinsic mode functions (IMFs).
IMFs are adaptive, depend on the signal itself, and are a useful tool for analyzing
nonlinear and non-stationary signals. The 1D-EMD has been used for audio source
separation, denoising, and the analysis of Doppler ultrasound signals. Its practical value has
led researchers to investigate the extension of 1D-EMD to complex-valued signals [9], with
the overarching goal of connecting 1D-EMD to traditional filter bank theory [6].

CHAPTER 3

BIDIMENSIONAL EMPIRICAL MODE DECOMPOSITION

EMD has many interesting features, an important one being that it is a fully adaptive
multi-scale decomposition. This is because EMD operates on the local extremum sequence,
and the decomposition is carried out by direct extraction of the local energy associated with
the intrinsic time scales of the signal itself. This differs from wavelet-based multi-scale
analysis, which characterizes the scale of a signal event using pre-specified basis functions.
Owing to this feature, EMD is highly promising for other problems of a multi-scale
nature, and it can be useful for two-dimensional data analysis. Although this is quite
interesting, several challenging difficulties need to be overcome. One of them is
computational efficiency: for a medium-sized two-dimensional data set, e.g. a 512×512
image, the number of local extrema can be in the tens of thousands. Owing to the iterative
nature of the EMD method, the decomposition of such a dataset is rather time-consuming
and can be unacceptable for many applications. Various attempts have been made to
develop a fast bidimensional EMD useful for various applications. The 1D-EMD applied to
signals has been extended to 2D images, and 2D-EMD has recently been developed by a few
researchers based on cubic polynomial interpolation [10] and on a partial differential
equation [11]. 2D-EMD has been applied to applications such as image compression and
image fusion [12].

An image is a bidimensional IMF if it has a zero mean, if the maxima are positive
and the minima are negative, and if the number of maxima equals the number of minima.
The bidimensional empirical mode decomposition (BEMD) method is a relatively new but
promising image processing algorithm. BEMD decomposes an image into multiple
hierarchical components known as bidimensional intrinsic mode functions (BIMFs) and a
bidimensional residue, based on the local spatial variations or scales of the image. In each
iteration of the process, two-dimensional (2D) scattered-data surface interpolation is applied
to a set of arbitrarily distributed local maxima (minima) points to form the upper (lower)
envelope.

BEMD is a promising image processing algorithm that can be applied to various real-world
problems, e.g., medical image analysis, pattern analysis, and texture analysis. Both EMD and
BEMD require finding local maxima and minima points and then interpolating those points
at each iteration of the process. One-dimensional (1D) extrema points are obtained using
either a sliding window or a local derivative; 2D extrema points are obtained using a sliding
window or various morphological operations [13], as sketched below.
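A minimal sketch of the sliding-window/morphological 2D extrema detection just mentioned, assuming SciPy (the 3×3 window is an arbitrary choice):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_extrema_2d(img, win=3):
    """Coordinates of 2D local maxima and minima found with a sliding window
    (equivalently, grey-scale morphological dilation and erosion)."""
    is_max = img == maximum_filter(img, size=win)
    is_min = img == minimum_filter(img, size=win)
    return np.argwhere(is_max), np.argwhere(is_min)
```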

3.1 BIDIMENSIONAL INTRINSIC MODE FUNCTIONS

EMD as well as BEMD decomposes a signal into its IMFs based on the
local frequency or oscillation information. The definition and properties of BIMFs are
slightly different from those of IMFs: it is sufficient for BIMFs to satisfy only the final two
properties (iii and iv in section 2.2) given above for IMFs [14, 15]. In fact, due to the
properties of an image and of the BEMD process, the first two properties (i and ii in
section 2.2) cannot be satisfied by BIMFs, since the maxima and minima points are
defined in a 2D scenario for an image.
The first BIMF represents the finest-scale components, showing the most detailed
information. The second BIMF represents less detailed information than the first, and so on.
The detailed information describes the image texture at the finest scales, while the
remainder carries the basic trend and structure of the image.

3.2 BEMD PROCESS

The general BEMD process is the same as that of one-dimensional EMD, with the
signal replaced by an array, i.e. an image. The whole process of decomposing an image
into its 2D IMFs (BIMFs) in order of local spatial scale is known as sifting. The
decomposition of an image into BIMFs is not a unique process.

In developing the two-dimensional EMD method, while the sifting process of the
original EMD can be used directly, the key step is to generate the local mean surface of the
two-dimensional data. The BEMD decomposition and the resulting BIMFs are governed by
the method of extrema detection, the criteria for stopping the iterations for each BIMF, and
the interpolation technique. Though all of these factors are important for successful
decomposition, the interpolation method may be considered the most crucial. Most
scattered-data interpolation techniques for producing 2D surfaces are themselves iterative
processes. In the case of BEMD, it is very likely that the maxima or minima map does not
contain any interpolation centers in the boundary region, which may be more severe for the
later modes of the decomposition. Hence, some kind of boundary processing to introduce
additional interpolation centers at the boundary may also be required for successful
decomposition [13, 16]. Interpolation of the local maxima points is needed to form the upper
envelope, and interpolation of the local minima points is needed to form the lower envelope
of the data/image. The average of the upper and lower envelopes gives the
mean envelope. One purpose of the BEMD decomposition is to obtain BIMFs with zero
local mean, defined by the mean envelope, which further plays a significant role in
orthogonal decomposition. Hence, the accuracy of the envelopes in terms of shape and
smoothness is very important, which calls for an appropriate 2D scattered-data interpolation
technique for BEMD.
Owing to the above considerations, various methods are available for BEMD. The one-
dimensional EMD algorithm of Huang can be extended to the two-dimensional case by using
cubic interpolation on triangles to generate the upper and lower envelope surfaces [17]. This
decomposition is based on Delaunay triangulation and on piecewise cubic polynomial
interpolation. Particular attention is devoted to the boundary conditions, which are crucial
for the feasibility of the bidimensional EMD.
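SciPy's scattered-data interpolation can serve as a stand-in for this Delaunay-based cubic envelope construction: the 'cubic' method of griddata interpolates piecewise-cubically on a Delaunay triangulation of the given points. The nearest-neighbour fill-in below is our own crude substitute for the boundary processing discussed above, not the method of [17]:

```python
import numpy as np
from scipy.interpolate import griddata

def envelope_surface(points, values, shape):
    """Envelope surface through scattered extrema. points is an (n, 2) array
    of (row, col) coordinates and values holds the image values there."""
    grid_r, grid_c = np.mgrid[0:shape[0], 0:shape[1]]
    surf = griddata(points, values, (grid_r, grid_c), method="cubic")
    # outside the convex hull of the extrema griddata returns NaN; fill with
    # nearest-neighbour values as a crude boundary treatment
    nn = griddata(points, values, (grid_r, grid_c), method="nearest")
    surf[np.isnan(surf)] = nn[np.isnan(surf)]
    return surf
```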

3.2.1 BEMD by Finite Element Method

In BEMD using the finite element method, the local mean surface of a two-dimensional
dataset is generated directly from the characteristic data points rather than from the upper
and lower envelopes. This overcomes the problem of possible overshoots between the upper
and lower envelopes, and avoids constructing two different two-dimensional interpolating
surfaces (the upper and lower envelopes), which is normally a difficult task and
computationally costly. In addition, the characteristic data points in this method include not
only the local maxima and minima but also the saddle points, which are a distinct feature of
two-dimensional data. In this work, the finite element method is used for fusing images
using EMD [18].

The two-dimensional EMD by the finite element method is summarized for a
dataset I [19]. The first IMF of I is obtained as follows.
(i) Find the local extrema and saddle points of I.
(ii) Form a triangular mesh using the Delaunay method [20, 21].
(iii) Smooth the characteristic point set using the Laplacian operator [equation 10 in
[19]].
(iv) Pick the rows and columns on which the number of characteristic points is
greater than the value specified in equation 13 of [19], determine the knots for the bi-
cubic interpolation, and generate the local mean surface m(p).
(v) Compute h = I - m.
(vi) If

\sum_{p} \frac{\left[ h_{k-1}(p) - h_k(p) \right]^2}{h_{k-1}^2(p)} < SD

where h_k(p) is the sifting result of the k-th iteration and SD is typically set
between 0.2 and 0.3, stop; an IMF is obtained. Otherwise, treat h as the data and
iterate steps (i)–(vi) on h.

Denote by c_1 the first IMF and set

r_1 = I - c_1

which is the first residue. The algorithm proceeds to select the next IMF by applying the
above procedure to the first residue r_1. This process is repeated until the last residue r_n has
at most one extremum or becomes constant. The original data can then be represented as

I = \sum_{j=1}^{n} c_j + r_n   (3.1)

CHAPTER 4

VECTORIZED EMD (VEMD)

Bidimensional Empirical Mode Decomposition (BEMD) is computationally complex
and involves a series of steps. Being two-dimensional, every row and column has to be
processed separately, which increases the computation of the algorithm and makes it
time-consuming. BEMD by the finite element method is also not mathematically strong,
which adds to its disadvantages; interpolation error is present, and triangular mesh
formation using the Delaunay method introduces its own error. This paved the way for a
faster, computationally inexpensive algorithm called Vectorized Empirical Mode
Decomposition (VEMD) [21]. The algorithm of VEMD is simple and is explained below.
4.1 VEMD PROCESS
Converting two-dimensional data to one dimension and then employing one-
dimensional EMD can be an efficient approach to some image processing problems, and
this process is expected to be faster than the other EMD methods. An image is vectorized,
one-dimensional EMD is applied to the resulting vector, and the IMFs are mapped back;
this process is called VEMD.

I(x, y) =
    a  b  c  d
    e  f  g  h
    i  j  k  l
    m  n  o  p

Concatenate rows (reversing every second row):

I(x) = a b c d  h g f e  i j k l  p o n m

Figure 4.1 Concatenation of rows to convert 2D image into 1D vector data

The flowchart of Vectorized Empirical Mode Decomposition (VEMD) is shown in
figure 4.2. The image I(x, y) of size M×N is divided into rows, and these rows are
concatenated to form 1D vector data I(x) of size MN, as shown in figure 4.1. After
vectorization, one-dimensional EMD is applied to the resulting vector. For example, a
two-dimensional image of size 256×256 is converted into one-dimensional data of size
1×65536 and 1D-EMD is applied to the converted signal; a 512×512 image is converted
into 1×262144, a 64×64 image into 1×4096, and so on.

Figure 4.2 Flow chart of VEMD

The extrema of the converted signal are found, and the maxima and minima are
interpolated to form the upper and lower envelopes. The mean of the two envelopes is
calculated, the sifting process is applied, and the standard deviation of the sifted signal is
found. If the standard deviation is less than 0.3, no further sifting is needed and the result
is assigned as an intrinsic mode function; if it is greater than 0.3, the above procedure is
repeated until the standard deviation falls below 0.3.

The residue is calculated from the IMF. If the number of extrema is less than 3, no further
decomposition is possible and the result is assigned as the final residue; if there are more
than 3 extrema, the above procedure is repeated. The reconstructed signal is found by
summing all the IMFs and the residue. After performing 1D EMD, the reconstructed vector
is converted back to a 2D image by reversing the procedure shown in figure 4.1.
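A minimal sketch of the vectorization and its inverse, assuming the serpentine row ordering of figure 4.1 and reusing a 1-D emd() routine such as the sketch in chapter 2:

```python
import numpy as np

def vectorize(img):
    """Serpentine scan of figure 4.1: every second row is reversed before
    concatenation so neighbouring pixels stay adjacent in the 1-D vector."""
    v = img.copy()
    v[1::2] = v[1::2, ::-1]
    return v.ravel()

def unvectorize(vec, shape):
    """Inverse of vectorize: reshape, then undo the row reversals."""
    img = vec.reshape(shape).copy()
    img[1::2] = img[1::2, ::-1]
    return img

# VEMD in outline: imfs, r = emd(vectorize(img)); each 1-D IMF is then
# mapped back to an image with unvectorize(imf, img.shape).
```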

CHAPTER 5

IMAGE FUSION OF INTRINSIC MODE FUNCTIONS

5.1 IMAGE FUSION

Multi-sensor image fusion (MIF) is a technique to combine registered images,
increasing the spatial resolution of acquired low-detail multi-sensor images while
preserving their spectral information. MIF has emerged as a new and promising research
area; fields benefiting from it include the military, remote sensing, machine vision, robotics,
and medical imaging. Some generic requirements can be imposed on the fusion scheme: (a)
the fusion process should preserve all relevant information contained in the source images,
(b) the fusion process should not introduce any artifacts or inconsistencies that would
distract the human observer or subsequent processing stages, and (c) irrelevant features and
noise should be suppressed to the maximum extent.
The problem MIF tries to solve is to merge the information content of several
images of the same scene (possibly acquired from different imaging sensors) into a fused
image that contains the finest information from the original images [22]. Hence, the fused
image provides a higher-quality image than any of the original source images. Depending
on the merging stage, MIF can be performed at three different levels, viz. pixel level,
feature level, and decision level [23]. In this work, pixel-level MIF is presented,
representing a fusion process that generates a single combined image containing a more
faithful description of the scene than any individual source image.
The simplest MIF is to take the average of the grey-level source images pixel by
pixel; this, however, produces several undesired effects and reduced feature contrast.
To overcome this problem, multiscale transforms such as wavelets, Laplacian pyramids,
morphological pyramids, and gradient pyramids have been proposed. Multi-resolution
wavelet transforms provide good localization in both the spatial and frequency domains,
and the discrete wavelet transform provides directional information across decomposition
levels and contains unique information at different resolutions [24, 25].
Various image fusion algorithms exist in the literature, and here an attempt is made to
fuse images using empirical mode decomposition. The images to be fused are
decomposed into several IMFs using the BEMD or VEMD process described above. Fusion
is performed at the decomposition level, and the fused IMFs are reconstructed to realize the
fused image. The decomposed IMFs of the images are fused using the four methods
explained below, and the performance metrics are compared to find the best results.
An important point to note in image fusion using EMD is that the number of IMFs
must be fixed: the number of IMFs can differ between two images, in which case fusion of
corresponding IMFs is not possible. Hence the number of IMFs is fixed, and the images are
decomposed into that many IMFs and fused. An important prerequisite for applying fusion
techniques to source images is image registration, i.e., the information in the source images
needs to be adequately aligned and registered prior to fusion. In this thesis, it is assumed
that the source images are already registered.

5.2 IMAGE FUSION METHODS

There are many image fusion methods available. In this project we discuss four of
them, viz. simple averaging, principal component analysis, the discrete wavelet transform,
and the Laplacian pyramid.
5.2.1 SIMPLE AVERAGING

Simple averaging is a straightforward way of obtaining an output image with all
regions in focus. The value of the pixel I(x, y) of each image is taken and added, the sum is
divided by the number of images to obtain the average, and the average value is assigned to
the corresponding pixel of the output image; this is repeated for all pixels. Fusion is thus
achieved by simply averaging the corresponding pixels of the input images. The IMFs from
each image are averaged using the formula below (with a = b = 0.5), and the fused IMFs are
reconstructed to obtain the fused image.

I_f(x, y) = a \, I_1(x, y) + b \, I_2(x, y)   (5.1)
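Applied IMF by IMF, equation (5.1) amounts to the following one-liner (a sketch assuming the IMFs are NumPy arrays of equal shape):

```python
def fuse_average(imfs1, imfs2, a=0.5, b=0.5):
    """Equation (5.1) applied IMF by IMF; a = b = 0.5 gives the plain average."""
    return [a * c1 + b * c2 for c1, c2 in zip(imfs1, imfs2)]
```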

5.2.2 PRINCIPAL COMPONENT ANALYSIS

PCA involves a mathematical procedure that transforms a number of correlated
variables into a number of uncorrelated variables called principal components. It computes a
compact and optimal description of the data set. The first principal component accounts for
as much of the variance in the data as possible, and each succeeding component accounts
for as much of the remaining variance as possible. The first principal component lies along
the direction of maximum variance. The second principal component is constrained to lie
in the subspace perpendicular to the first, and within this subspace it points in the direction
of maximum variance. The third principal component is taken in the maximum-variance
direction in the subspace perpendicular to the first two, and so on. PCA is also called the
Karhunen-Loève transform or the Hotelling transform. PCA does not have a fixed set of
basis vectors like the FFT, DCT, or wavelets; its basis vectors depend on the data set [22].

5.2.2a PCA ALGORITHM

Let the source images (the images to be fused) be arranged in two column vectors. The
steps followed to project this data onto a 2-D subspace are:
1. Organise the data into column vectors. The resulting matrix Z is of dimension 2 x
n.
2. Compute the empirical mean along each column. The empirical mean vector M has
a dimension of 1 x 2.
3. Subtract the empirical mean vector M from each column of the data matrix Z. The
resulting matrix X is of dimension 2 x n.
4. Find the covariance matrix C of X, i.e. C = E[XXᵀ] = cov(X).
5. Compute the eigenvectors V and eigenvalues D of C and sort them by decreasing
eigenvalue. Both V and D are of dimension 2 x 2.
6. Use the column of V corresponding to the larger eigenvalue to compute P_1 and
P_2 as:

P_1 = \frac{V(1)}{\sum V}; \qquad P_2 = \frac{V(2)}{\sum V}   (5.2)

5.2.2b IMAGE FUSION USING PCA

The IMFs of each image to be fused, I_1(x, y) and I_2(x, y), are arranged
in two column vectors and their empirical means are subtracted. The resulting vector has
dimension n x 2, where n is the length of each image vector. The eigenvectors and
eigenvalues of this data are computed, and the eigenvector corresponding to the larger
eigenvalue is taken. The normalized components P_1 and P_2 (i.e., P_1 + P_2 = 1) are
computed from this eigenvector. The fused image is:

I_f(x, y) = P_1 I_1(x, y) + P_2 I_2(x, y)   (5.3)
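A sketch of this PCA fusion rule in Python, assuming two registered, equal-sized arrays (operating directly on the images rather than on individual IMFs, for brevity):

```python
import numpy as np

def fuse_pca(I1, I2):
    """Pixel-level PCA fusion: weights from the dominant eigenvector of the
    2x2 covariance matrix of the two (flattened) images."""
    Z = np.stack([I1.ravel(), I2.ravel()])     # 2 x n data matrix (step 1)
    X = Z - Z.mean(axis=1, keepdims=True)      # subtract empirical means
    C = np.cov(X)                              # 2 x 2 covariance (step 4)
    D, V = np.linalg.eigh(C)                   # eigenvalues in ascending order
    v = V[:, np.argmax(D)]                     # eigenvector of larger eigenvalue
    P1, P2 = v / v.sum()                       # normalise so P1 + P2 = 1 (5.2)
    return P1 * I1 + P2 * I2                   # equation (5.3)
```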

5.2.3 DISCRETE WAVELET TRANSFORM

Wavelet theory is an extension of Fourier theory in many aspects and was introduced
as an alternative to the short-time Fourier transform (STFT). In Fourier theory the signal is
decomposed into sines and cosines, but in wavelet analysis the signal is projected onto a set
of wavelet functions. The Fourier transform provides good resolution in the frequency
domain, while wavelets provide good resolution in both the time and frequency domains.
Wavelet-based methods are common in image fusion. The wavelet transform is a data
analysis tool that provides a multi-resolution decomposition of an image: the input image is
decomposed into a set of wavelet decomposition levels. The basis functions are generated
from one single basis function, popularly referred to as the mother wavelet, which is shifted
and scaled to obtain the basis functions. Wavelet decomposition can be applied to an image
in many ways. The fusion of images using wavelets follows a standard procedure and is
performed at the decomposition level: the input images are decomposed by a discrete
wavelet transform, the wavelet coefficients are selected using a set of fusion rules, and an
inverse discrete wavelet transform is performed to reconstruct the fused image. Wavelet
fusion methods differ mostly in the fusion rule used for selecting the wavelet coefficients.

Figure 5.1 Block diagram of Discrete Wavelet Transform.

The wavelet transform separately filters and downsamples the 2-D data (image) in the
vertical and horizontal directions (a separable filter bank). The input (source) image I(x, y)
is filtered by a low pass filter L and a high pass filter H in the horizontal direction and then
downsampled by a factor of two (keeping alternate samples) to create the coefficient
matrices I_L(x, y) and I_H(x, y). The coefficient matrices I_L(x, y) and I_H(x, y) are each
low pass and high pass filtered in the vertical direction and downsampled by a factor of two
to create the sub-bands (sub-images) I_LL(x, y), I_LH(x, y), I_HL(x, y) and I_HH(x, y).
The I_LL(x, y) band contains the average image information corresponding to the low-
frequency band of the multiscale decomposition; it can be considered a smoothed and
subsampled version of the source image I(x, y) and represents its approximation.
I_LH(x, y), I_HL(x, y) and I_HH(x, y) are detail sub-images containing the directional
(horizontal, vertical and diagonal) information of the source image I(x, y) due to spatial
orientation. This process is shown in figure 5.1. Multi-resolution is achieved by recursively
applying the same algorithm to the low pass coefficients of the previous decomposition [22].

Figure 5.2 Block diagram of Inverse Discrete Wavelet Transform.

The inverse 2-D wavelet transform is used to reconstruct the image I(x, y) from the
sub-images I_LL(x, y), I_LH(x, y), I_HL(x, y) and I_HH(x, y), as shown in figure 5.2. This
involves column upsampling (inserting zeros between samples) and filtering with a low pass
filter L̃ and a high pass filter H̃ for each sub-image, followed by row upsampling and
filtering of the resulting images with the low pass filter L̃ and high pass filter H̃; the
summation of all the resulting matrices reconstructs the image I(x, y).
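A sketch of such a wavelet fusion scheme, assuming the PyWavelets package and a maximum-absolute-coefficient fusion rule for the detail bands (one of many possible rules):

```python
import numpy as np
import pywt

def fuse_dwt(I1, I2, wavelet="db2", levels=3):
    """DWT fusion: average the approximation band, keep the larger-magnitude
    detail coefficient at each position, then invert the transform."""
    c1 = pywt.wavedec2(I1, wavelet, level=levels)
    c2 = pywt.wavedec2(I2, wavelet, level=levels)
    fused = [(c1[0] + c2[0]) / 2.0]                       # approximation band
    for bands1, bands2 in zip(c1[1:], c2[1:]):            # (cH, cV, cD) tuples
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(bands1, bands2)))
    return pywt.waverec2(fused, wavelet)
```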

5.2.4 LAPLACIAN PYRAMID

Image pyramids were initially described for multi-resolution image analysis
and as a model for binocular fusion in human vision. An image pyramid can be described
as a collection of low pass or band pass copies of an original image, in which both the band
limit and the sample density are reduced in regular steps. A multi-resolution pyramid
transformation decomposes an image into multiple resolutions at different scales. A pyramid
is a sequence of images in which each level is a filtered and subsampled copy of its
predecessor. The lowest level of the pyramid has the same scale as the original image and
contains the highest-resolution information; higher levels are reduced-resolution,
increased-scale versions of the original image.

5.2.4a IMAGE FUSION USING LAPLACIAN PYRAMID:
Several approaches to Laplacian fusion have been documented since Burt
and Adelson introduced the transform in 1983 [26]. The Laplacian pyramid
implements a "pattern selective" approach to image fusion, so that the composite image is
constructed not a pixel at a time but a feature at a time. The basic idea is to
perform a pyramid decomposition on each of the source images, integrate these
decompositions to form a composite representation, and finally reconstruct the fused image
by performing an inverse pyramid transform.
If the original image is denoted g0, the first step of the Laplacian pyramid
transform is to low-pass filter g0 to obtain g1, a "reduced" version of g0. In a similar way,
g2 is formed as a reduced version of g1, and so on. Image reconstruction involves the
opposite steps of the image decomposition explained above.
The first step is to construct a pyramid for each source image. The fusion is then
implemented at each level of the pyramid using a feature-selection decision mechanism.
Several modes of combination can be used, such as selection or averaging. In the first, the
combination process selects the most salient component pattern from the sources and copies
it to the composite pyramid, discarding the less salient pattern. In the second, the process
averages the source patterns; averaging reduces noise and provides stability where the source
images contain the same pattern information. The former is used in locations where the
source images are distinctly different, and the latter where they are similar. The approach
chosen in this work is to select the most salient component, following the equation

I_f(x, y) = \begin{cases} I_1(x, y), & \text{if } I_1(x, y) > I_2(x, y) \\ I_2(x, y), & \text{otherwise} \end{cases}   (5.4)

where I_1 and I_2 are the two input signals and I_f the fused signal, for levels 0 ≤ l ≤ N−1.
A consistency filter is then applied, whose aim is to eliminate isolated points. Finally, for
level N an average of both source components is taken:

I_f(x, y) = \frac{I_1(x, y) + I_2(x, y)}{2}   (5.5)
This method uses a recursive algorithm to achieve three main tasks: it constructs the
Laplacian pyramid of the source images [27], performs the fusion at each level of the
decomposition, and reconstructs the fused image from the fused pyramid. To implement
the Laplacian pyramid decomposition, two elementary scaling operations must first be
defined, usually referred to as reduce and expand. The reduce operation applies a low-pass
filter to the image and downsamples it by a factor of two; the expand operation applies a
predefined interpolation method and upsamples the image by a factor of two.
The block diagram of the Laplacian pyramid is shown in figure 5.3.
Given these two operations, the fusion proceeds through the following steps:
1. Generate a Laplacian pyramid L_i for each of the images I_i.

2. Merge the pyramids L_i by taking the maximum at each pixel of the pyramid, obtaining the
Laplacian pyramid representation L of the fusion result.
3. Reconstruct the fusion result I from its Laplacian pyramid representation.
4. Normalize the dynamic range of the result so that it resides within [0, 1], and
apply additional post-processing as necessary.

Figure 5.3 Block diagram of image fusion using the Laplacian pyramid.
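A sketch of the reduce/expand operations and the fusion steps above, assuming SciPy; the selection rule of equation (5.4) is used for the band-pass levels and equation (5.5) for the top level:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def reduce_(img):
    """Low-pass filter then downsample by two (the 'reduce' operation)."""
    return gaussian_filter(img, sigma=1.0)[::2, ::2]

def expand_(img, shape):
    """Upsample back to a target shape (the 'expand' operation)."""
    return zoom(img, (shape[0] / img.shape[0], shape[1] / img.shape[1]), order=1)

def laplacian_pyramid(img, levels):
    pyr, g = [], img.astype(float)
    for _ in range(levels):
        g_next = reduce_(g)
        pyr.append(g - expand_(g_next, g.shape))   # band-pass level
        g = g_next
    pyr.append(g)                                  # coarsest (top) level
    return pyr

def fuse_laplacian(I1, I2, levels=2):
    p1, p2 = laplacian_pyramid(I1, levels), laplacian_pyramid(I2, levels)
    fused = [np.where(a > b, a, b) for a, b in zip(p1[:-1], p2[:-1])]  # (5.4)
    fused.append((p1[-1] + p2[-1]) / 2.0)                              # (5.5)
    out = fused[-1]
    for band in reversed(fused[:-1]):              # inverse pyramid transform
        out = expand_(out, band.shape) + band
    return out
```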

27
CHAPTER 6

RESULTS AND DISCUSSION

6.1 OBJECTIVE EVALUATION


Quality measures are of crucial importance for assessing and monitoring signal
processing algorithms. Most objective quality measures for images are full-reference,
requiring a perfect-quality original as the basis for comparison. Objective quality measures
are convenient because they do not carry the costs associated with human subjects. Each
objective measure is designed to estimate specific quality aspects and is tuned to a particular
dataset; thus, an objective measure produces predictable results for the environment, error
conditions and impairments it was developed for. Laboratory studies are time-consuming
and expensive, so researchers often run informal studies or use objective quality measures,
though the results produced may not always correlate well with human perception [28]. The
most commonly used performance metrics for objective evaluation are the Root Mean
Square Error (RMSE) and the Peak Signal to Noise Ratio (PSNR). Various other
performance metrics are available for the objective evaluation of image fusion quality; some
of them are listed below.
Root Mean Square Error (RMSE):

RMSE is calculated from corresponding pixels in the reference image and the reconstructed
image; its value is 0 for a perfectly reconstructed image.

RMSE = \sqrt{ \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_t(i,j) - I_f(i,j) \right)^2 }   (6.1)

where I_t refers to the true/reference image, I_f refers to the reconstructed image, and M and
N are the numbers of rows and columns respectively.

Mean Absolute Error (MAE):

MAE is also calculated from corresponding pixels in the reference and reconstructed
images. Like RMSE, its ideal value is zero.

MAE = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left| I_t(i,j) - I_f(i,j) \right|   (6.2)

Bias of Mean (BM):

It is the difference between the means of the original reference image and the reconstructed
image, relative to the mean of the original image. Its ideal value is 0. BM is
calculated using the formula

BM = \frac{\bar{I}_t - \bar{I}_f}{\bar{I}_t}   (6.3)

where \bar{I}_t = \frac{1}{MN} \sum_{i,j=1}^{M,N} I_t(i,j) and \bar{I}_f = \frac{1}{MN} \sum_{i,j=1}^{M,N} I_f(i,j).

Standard Deviation (SD):

An important index for weighing the information content of an image, it reflects how
strongly the values deviate from the image mean. The greater the SD, the more dispersed the
gray-level distribution. Standard deviation is more meaningful in the absence of noise, and
an image with high contrast has a high standard deviation. It is calculated using the formula

SD = \sqrt{ \frac{1}{MN} \sum_{i,j=1}^{M,N} \left( I_f(i,j) - \bar{I}_f \right)^2 }   (6.4)

Peak Signal to Noise Ratio (PSNR):

PSNR is the ratio of the peak signal power to the power of the noise. Its value is high when
the reconstructed and reference images are similar; a higher value implies a better
reconstruction.

PSNR = 10 \log_{10} \left( \frac{L^2}{MSE} \right)   (6.5)

where L is the number of gray levels in the image and MSE is the mean squared error
between the reference and the reconstructed image.

Spectral Angle Mapper (SAM):

Spectral Angle Mapper (SAM) compares each pixel in the image with
every end member for each class and assigns a value between 0 (low
resemblance) and 1 (high resemblance) [29]. The formula for SAM at a
specific pixel is given by

\cos \alpha = \frac{ \sum_{i=1}^{N} A_i B_i }{ \sqrt{ \sum_{i=1}^{N} A_i^2 } \, \sqrt{ \sum_{i=1}^{N} B_i^2 } }   (6.6)

Here N is the number of bands, and A = (A_1, A_2, A_3, \ldots, A_N) and
B = (B_1, B_2, B_3, \ldots, B_N) are two spectral vectors at the same wavelength from the
multispectral image and the fused image, respectively. \alpha is the spectral angle at a
specific point; to compute the SAM for the entire image, we take the average of all \alpha
values.

Normalized Cross Correlation (NCC):

Normalized cross correlation is used to find the similarity between the fused image and the
registered reference image, and is given by the following equation:

NCC = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} I_t(i,j) \, I_f(i,j) }{ \sum_{i=1}^{M} \sum_{j=1}^{N} I_t(i,j)^2 }   (6.7)

Maximum Difference (MD):

The difference between every pixel of the original and the fused image is calculated, and the
maximum of these differences is taken as the value. It is calculated by the formula

MD = \max_{i,j} \left| I_t(i,j) - I_f(i,j) \right|   (6.8)

Correlation Coefficient (CC):

The correlation coefficient measures the closeness or similarity in small-size structures
between the original and the fused images. It can vary between -1 and +1: values close to +1
indicate that the images are highly similar, while values close to -1 indicate that they are
highly dissimilar. The ideal value is one, reached when the reference and fused images are
exactly alike, and it decreases as the dissimilarity increases.

CORR = \frac{2 C_{tf}}{C_t + C_f}   (6.9)

where

C_t = \sum_{i=1}^{M} \sum_{j=1}^{N} I_t(i,j)^2, \qquad
C_f = \sum_{i=1}^{M} \sum_{j=1}^{N} I_f(i,j)^2, \qquad
C_{tf} = \sum_{i=1}^{M} \sum_{j=1}^{N} I_t(i,j) \, I_f(i,j)

Spatial Frequency (SF):

This frequency in the spatial domain indicates the overall activity level in the fused image; a
good fused image should have a high spatial frequency. Spatial frequency is calculated from
the row frequency (RF) and column frequency (CF) of the image as follows:

SF = \sqrt{RF^2 + CF^2}   (6.10)

RF = \sqrt{ \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=2}^{N} \left[ I_f(i,j) - I_f(i,j-1) \right]^2 }

CF = \sqrt{ \frac{1}{MN} \sum_{i=2}^{M} \sum_{j=1}^{N} \left[ I_f(i,j) - I_f(i-1,j) \right]^2 }

Universal Quality Index (UQI):

This measures how much of the salient information contained in the reference image has
been transferred into the fused image. The index is calculated by modelling image distortion
as a combination of three factors: loss of correlation, luminance distortion, and contrast
distortion. Experiments on various images show that this index performs better than the
widely used mean squared error [30]. The range of this metric is -1 to 1, and the best value 1
is achieved if and only if the reference and fused images are alike.

QI = \frac{ 4 \sigma_{I_t I_f} \, \mu_{I_t} \mu_{I_f} }{ \left( \sigma_{I_t}^2 + \sigma_{I_f}^2 \right) \left( \mu_{I_t}^2 + \mu_{I_f}^2 \right) }   (6.11)

\mu_{I_t} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} I_t(i,j), \qquad
\mu_{I_f} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} I_f(i,j)

\sigma_{I_t}^2 = \frac{1}{MN-1} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_t(i,j) - \mu_{I_t} \right)^2, \qquad
\sigma_{I_f}^2 = \frac{1}{MN-1} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_f(i,j) - \mu_{I_f} \right)^2

\sigma_{I_t I_f} = \frac{1}{MN-1} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_t(i,j) - \mu_{I_t} \right) \left( I_f(i,j) - \mu_{I_f} \right)

Visual Information Fidelity for Fusion (VIFF):

VIFF is a multi-resolution image fusion metric based on visual information fidelity (VIF),
used to assess fusion performance objectively. Since subjective evaluation is not adequate
for assessing work in an automatic system, using an objective image fusion performance
metric is a common approach to evaluating the quality of different fusion schemes. This
method has four stages: (1) the source and fused images are filtered and divided into blocks;
(2) the visual information is evaluated with and without distortion information in each block;
(3) the visual information fidelity for fusion (VIFF) of each sub-band is calculated; and (4)
the overall quality measure is determined by weighting the VIFF of each sub-band. VIFF is
found to perform well in terms of both matching human perception and computational
complexity.

Edge-Strength Similarity Based Image Quality Metric (ESSIM):

With edge-strength defined, the visual fidelity between I_f and I_r can be predicted by the
similarity between their edge-strength maps. Specifically, the ESSIM index is defined as

ESSIM(I_f, I_r) = \frac{1}{N} \sum_{i=1}^{N} \frac{ 2 E(I_f, i) E(I_r, i) + C }{ E(I_f, i)^2 + E(I_r, i)^2 + C }   (6.12)

where N is the product of the numbers of rows and columns, and the parameter C has two
senses. Firstly, it is introduced to keep the denominator from being zero. Secondly, it can be
viewed as a scaling parameter: different magnitudes of C lead to different ESSIM scores.
Here we choose C = (BL)^2, where B is a predefined constant and L is the dynamic range
of the edge-strength.
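Several of the full-reference metrics above can be computed in a few lines; the sketch below (assuming 8-bit gray-scale NumPy arrays, with L = 255) implements RMSE, MAE, PSNR, NCC, MD, and UQI as numbered above:

```python
import numpy as np

def fusion_metrics(It, If, L=255):
    """A few of the full-reference metrics defined above, for 8-bit images."""
    It, If = It.astype(float), If.astype(float)
    err = It - If
    mse = np.mean(err ** 2)
    out = {
        "RMSE": np.sqrt(mse),                                        # (6.1)
        "MAE": np.mean(np.abs(err)),                                 # (6.2)
        "PSNR": 10 * np.log10(L ** 2 / mse) if mse > 0 else np.inf,  # (6.5)
        "NCC": np.sum(It * If) / np.sum(It ** 2),                    # (6.7)
        "MD": np.max(np.abs(err)),                                   # (6.8)
    }
    mu_t, mu_f = It.mean(), If.mean()
    var_t, var_f = It.var(ddof=1), If.var(ddof=1)
    cov = ((It - mu_t) * (If - mu_f)).sum() / (It.size - 1)
    out["UQI"] = (4 * cov * mu_t * mu_f /                            # (6.11)
                  ((var_t + var_f) * (mu_t ** 2 + mu_f ** 2)))
    return out
```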

6.2 COMPARISON OF RESULTS


6.2.1 BEMD Vs VEMD
A 256×256 image is taken, as shown in figure 6.1a, and the BEMD and VEMD
algorithms are applied to it. The results of both algorithms are discussed below. Figure 6.1b
shows a comparison of the BIMFs and residue formed using BEMD and VEMD; for VEMD
the IMF signals are shown in addition to the BIMFs, together with the reconstructed image
and the error image. The decomposition contains the image information: the first IMFs
capture the edge information with better accuracy, while the later IMFs recede into the
coarser details of the image. The residue is the low-pass component and contains less
information than the other IMFs.

[Signal and image panels of the test data]

Figure 6.1a True image (ground truth)

[Figure panels: IMF1–IMF5 and the residue from VEMD and BEMD; the images reconstructed from the IMFs; and the error image, whose amplitude is of order 10^-16.]
Figure 6.1b Comparison of BEMD and VEMD

The BEMD algorithm is applied to the image shown in figure 6.1a. Table 6.1 shows the
effect of increasing the number of iterations, Table 6.2 depicts the results for an increasing
number of IMFs, and Table 6.3 shows the time taken for various tolerance values along with
the corresponding RMSE values. From Tables 6.1-6.3 it is observed that the time increases
with the maximum number of iterations and with the number of IMFs; as the tolerance
decreases, the time increases (i.e., the more precise the tolerance, the more iterations are
needed).
Table 6.1 Time and RMSE values corresponding to number of iterations for tolerance=1 and
number of IMFs = 5
Imax   Time (sec.)   RMSE
1      32.681        1.8539e-014
2      41.605        2.4045e-014
3      46.275        2.4199e-014
4      46.257        2.4199e-014
5      46.285        2.4199e-014

Table 6.2 Time and RMSE values corresponding to number of IMFs for tolerance=1 and
number of iterations Imax = 5
No. of IMFs   Time (sec.)   RMSE
1             30.697        5.7629e-15
2             34.583        9.6615e-15
3             38.029        1.5727e-14
4             41.399        2.0036e-14
5             46.444        2.2712e-14

Table 6.3 Time and RMSE values corresponding to tolerance for No. of IMFs=1 and
maximum number of iterations = 5

Tolerance   Time (sec.)   RMSE
0.0001      75.174        2.9370e-14
0.001       75.032        2.9370e-14
0.01        74.958        2.9370e-14
0.1         74.935        2.9370e-14
1           46.346        2.2712e-14

Table 6.4 Performance evaluation of BEMD and VEMD for a given image with number of
iterations = 5, no of IMFs=5 and tolerance = 0.01
Image size   Method   RMSE         MAE          BM   SD        PSNR         Time (s)
512x512      BEMD     1.6400e-14   7.2106e-15   0    47.6085   2.102e+11    189.060
512x512      VEMD     1.6243e-16   1.1417e-16   0    68.6460   1.7102e+11   3.227
256x256      BEMD     1.0676e-16   5.5379e-17   0    47.7836   8.4654e+11   55.020
256x256      VEMD     1.5157e-16   1.0514e-16   0    68.7134   7.4746e+11   0.811
128x128      BEMD     9.0733e-17   4.6670e-17   0    47.8297   3.143e+12    14.407
128x128      VEMD     1.3370e-16   9.0151e-17   0    68.7990   2.8321e+12   0.254
64x64        BEMD     8.6847e-17   4.3128e-17   0    47.7588   2.1103e+13   5.016
64x64        VEMD     1.2354e-16   8.2694e-17   0    68.8755   1.1681e+13   0.140

Table 6.5 shows the relative time taken by BEMD and VEMD for various image sizes. From
the table it is inferred that VEMD has a much shorter response time than BEMD. Figure 6.2
shows the variation of the relative time with image size; fitting a polynomial to this data
gives -0.0004697x² + 0.3156x + 19.908.

[Graph: time (sec.) versus image size (64x64 to 512x512) for VEMD and BEMD.]

Figure 6.2 Graph showing the variation of time with the size of the image using BEMD and
VEMD
Table 6.5 Relative time of BEMD and VEMD for various image sizes

Image size   Method   Time (s)   Time difference
512x512      BEMD     189.060    58.5869
             VEMD     3.227
256x256      BEMD     55.020     67.8422
             VEMD     0.811
128x128      BEMD     14.407     56.7205
             VEMD     0.254
64x64        BEMD     5.016      35.8286
             VEMD     0.140

6.2.2 IMAGE FUSION USING BEMD Vs VEMD

The results of image fusion using VEMD and BEMD are discussed here. Two 256×256
images are taken (figures 6.3b and 6.3c), and the fused images obtained after applying
VEMD and BEMD are compared with the reference image shown in figure 6.3a.

The IMFs of each image after applying VEMD and BEMD are shown in figures 6.4a and
6.4b respectively. The signal corresponding to each image is also shown.

[Each figure shows the signal panel and the image panel.]

Figure 6.3a True image (ground truth)

Figure 6.3b First image to be fused (Image 1)

Figure 6.3c Second image to be fused (Image 2)

[Figure panels: IMF1–IMF5 and the residue of Image 1, from VEMD and BEMD.]
Figure 6.4a Intrinsic Mode Functions (IMFs) of first image (Image 1)

[Figure panels: IMF1–IMF5 and the residue of Image 2, from VEMD and BEMD.]
Figure 6.4b Intrinsic Mode Functions (IMFs) of second image (Image 2)


The IMFs of each images obtained using VEMD and BEMD are fused using four
methods viz. Simple averaging (SA), Principle Component Analysis (PCA), Discrete Wavelet

41
Transform (DWT) and Laplacian Pyramid (LAP.PYMD). The fused images are shown in the
figure 6.5a and their corresponding error images are shown in the figure 6.5b.

[Figure panels: fused images from VEMD and BEMD for SA, PCA, DWT, and LP.]
Figure 6.5a Figure showing the fused image using various fusion methods

[Figure panels: error images from VEMD and BEMD for SA, PCA, DWT, and LP.]
Figure 6.5b Figure showing the error image using various fusion methods

[Panels: Image 1 (FLIR image) and Image 2 (TV image)]
Figure 6.6a Image set 2 - FLIR image and low light visible Tv image.

[Figure panels: fused images from VEMD and BEMD for SA, PCA, DWT, and LP, for image set 2.]
Figure 6.6b Figure showing the fused image for image set – 2.
The performance values for the evaluation of the four fusion algorithms are given in
table 6.6. These values correspond to image set 1 (figure 6.5), and are given separately for
BEMD and VEMD.

Table 6.6a: Performance metrics for the evaluation of image fusion using BEMD and VEMD

Fusion   Method   RMSE     MAE      BM             SD        PSNR      SAM
SA       BEMD     8.4165   2.7002   -1.4261e-004   45.6566   38.9135   0.0361
SA       VEMD     8.4165   2.7002   -1.4261e-004   45.6566   38.9135   0.0361
PCA      BEMD     8.3929   2.6919   -1.4221e-004   45.6675   38.9257   0.0360
PCA      VEMD     8.3974   2.6999   -1.4421e-004   45.6679   38.9234   0.0360
DWT      BEMD     2.9671   1.4869   -1.4261e-04    48.8604   43.4414   0.0127
DWT      VEMD     3.8726   1.5309   -1.4261e-04    48.8903   42.2848   0.0166
LP       BEMD     4.7194   1.9834   -4.1577e-004   47.6800   41.4260   0.0202
LP       VEMD     5.1987   2.1014   -8.3367e-005   47.2011   41.0058   0.0223

Table 6.6b Performance metrics for the evaluation of image fusion using BEMD and VEMD

Fusion   Method   NCC      CC       MD        ESSIM    SF       FQI      VIFF
SA       BEMD     0.9963   0.9993   70.5000   0.9997   12.638   0.9081   0.8121
SA       VEMD     0.9963   0.9993   70.5000   0.9997   12.635   0.9083   0.8121
PCA      BEMD     0.9963   0.9993   69.8640   0.9997   12.652   0.9077   0.8128
PCA      VEMD     0.9963   0.9993   69.9986   0.9997   12.653   0.9097   0.8127
DWT      BEMD     0.9996   0.9999   30.9217   0.9997   21.51    0.8142   0.9538
DWT      VEMD     0.9996   0.9999   50.1299   0.9998   21.432   0.8357   0.9330
LP       BEMD     0.9987   0.9998   53.4656   0.9997   20.732   0.8169   0.9013
LP       VEMD     0.9979   0.9998   41.8321   0.9997   19.527   0.8248   0.8888

6.2.3 INFERENCE
From the results obtained it is found that the discrete wavelet transform gives the best
results for both BEMD and VEMD, with the BIMFs (BEMD) fused using the discrete
wavelet transform fetching better results than VEMD. BEMD using the Laplacian pyramid
performs well, next after the discrete wavelet transform. Both BEMD and VEMD give
exactly the same results for simple averaging, as expected. Among the four methods, BEMD
gives better fused results than VEMD, but the latter is much faster. The comparison of the
response times is shown in table 6.7.

Table 6.7 Relative time of BEMD and VEMD for different fusion methods

Fusion method   BEMD time (s)   VEMD time (s)   Speed-up (BEMD/VEMD)
SA              99.062          7.664           12.9256
PCA             84.948          7.606           11.1686
DWT             84.580          7.426           11.3897
LP              93.391          40.390          2.3122

An important point to note is that the Laplacian pyramid performs well only when the number of levels is 2. If the number of levels is increased, the performance of both BEMD and VEMD degrades in an exponential manner, as shown in figure 6.8 and reflected in the error values of table 6.8. The same holds for the Discrete Wavelet Transform: if the number of decomposition levels is increased, the performance degrades in an exponential manner (figure 6.9). This indicates that image fusion using more decomposition levels on the IMFs does not give good results.
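A minimal sketch of level-parameterised Laplacian pyramid fusion is given below (Python with OpenCV and NumPy). The max-magnitude selection of detail bands with an averaged low-pass residual is a common fusion rule and is assumed here, as are single-channel images whose dimensions are divisible by 2**levels:

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    """Build a Laplacian pyramid with the given number of levels."""
    pyramid, current = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)  # band-pass detail at this level
        current = down
    pyramid.append(current)           # low-pass residual
    return pyramid

def fuse_laplacian(img_a, img_b, levels=2):
    """Fuse two images by combining their Laplacian pyramids."""
    pa, pb = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    fused = [np.where(np.abs(a) > np.abs(b), a, b)  # keep the stronger detail
             for a, b in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))           # average the residuals
    out = fused[-1]
    for detail in reversed(fused[:-1]):             # collapse the pyramid
        out = cv2.pyrUp(out, dstsize=(detail.shape[1], detail.shape[0])) + detail
    return out
```

Varying the levels argument in such a sketch is the kind of experiment summarised in table 6.8.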
Table 6.8 Performance values of BEMD and VEMD for increasing Laplacian levels

No. of Laplacian levels   Method   RMSE      MAE
2                         BEMD     4.7194    1.9834
2                         VEMD     5.1987    2.1014
3                         BEMD     6.0586    2.8540
3                         VEMD     5.2807    2.6395
5                         BEMD     19.2803   10.7934
5                         VEMD     19.4812   10.5724
8                         BEMD     26.2856   14.7935
8                         VEMD     40.7498   24.5984
14                        BEMD     26.9435   14.9060
14                        VEMD     42.3029   26.9807
[Graph: Root Mean Squared Error versus number of Laplacian levels]
Figure 6.8 Graph showing the exponentially increasing pattern of RMSE with the increase of Laplacian levels
Table 6.9 Performance values of BEMD and VEMD for increasing decomposition levels in the Discrete Wavelet Transform

No. of levels   Method   RMSE      MAE
3               BEMD     2.9671    1.4869
3               VEMD     3.8726    1.5309
4               BEMD     7.9813    4.1048
4               VEMD     7.8451    3.5119
5               BEMD     13.8572   7.9762
5               VEMD     15.1549   7.7903
7               BEMD     22.2130   14.1147
7               VEMD     31.9629   18.8989
8               BEMD     23.4363   15.0642
8               VEMD     34.9700   21.0200
[Graph: Root Mean Square Error versus number of decomposition levels]
Figure 6.9 Graph showing the pattern of RMSE with the increase of decomposition levels in the Discrete Wavelet Transform
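For comparison, wavelet-domain fusion with a configurable decomposition level can be sketched as follows (Python with the PyWavelets package; the db2 wavelet and the max-magnitude detail / averaged approximation rule are assumptions, not necessarily the settings used in this work):

```python
import numpy as np
import pywt

def fuse_dwt(img_a, img_b, wavelet="db2", level=3):
    """Fuse two images in the wavelet domain: average the approximation
    coefficients and keep the max-magnitude detail coefficients."""
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [0.5 * (ca[0] + cb[0])]  # coarsest approximation band
    for bands_a, bands_b in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(x) > np.abs(y), x, y)
                           for x, y in zip(bands_a, bands_b)))
    return pywt.waverec2(fused, wavelet)
```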
6.3 SUBJECTIVE EVALUATION
Objective quality measures are convenient because they do not have the costs associated with human subjects. However, an objective measure will only produce predictable results for the environment, error conditions and impairments it was developed for. This sensitivity is especially severe for no-reference measures, which are typically developed to detect one specific impairment (image set 2 in our case). Objective evaluation also often depends on characteristics that are difficult to infer, even given an undistorted reference; for instance, visual attention and gaze direction are known to significantly influence subjective quality [33].
Since multimedia systems typically have human end-users, the definitive quality measure is given by human perception. Despite many efforts to design accurate objective measures, so far none have been able to account for all the peculiarities of human physiological and psychological responses. Subjective tests are therefore generally regarded as the most reliable and definitive methods for assessing image quality, although laboratory studies are time consuming and expensive.
Subjective assessment is a human visual analysis of the fused image; it is simple and intuitive. Among its advantages, it can be used to determine whether the image contains shadows, whether the texture or color information of the fused image is consistent, and whether the clarity has been reduced. The subjective assessment method is therefore often used to compare the edges of fused images, and it reveals differences in spatial detail and clarity intuitively.
The fused image sets, along with the true images, were circulated to 10 volunteers. Their ratings are given in table 6.10, together with the mean and standard deviation of the scores; the same procedure was repeated for image set 2. The volunteers include people both experienced and inexperienced in the image processing domain; they were asked to rate the images on a ten-point scale, and the repeatability of the scores was also verified. From the evaluation results, the Discrete Wavelet Transform obtained the best scores for image set 1, while for image set 2 (IR and TV images) fusion using the Laplacian pyramid scored best, followed by DWT and PCA.
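The mean and standard deviation rows of table 6.10a can be reproduced with a few lines of NumPy. The sketch below uses the BEMD ratings for image set 1; the sample standard deviation (ddof=1) matches the reported ± values:

```python
import numpy as np

# BEMD ratings for image set 1 (10 volunteers x SA, PCA, DWT, LP), from table 6.10a
ratings = np.array([
    [7.0, 8.0, 10.0, 9.0], [5.0, 5.0, 8.0, 9.0], [7.0, 6.0, 8.5, 9.0],
    [7.0, 7.0, 8.0, 8.0], [7.0, 7.0, 8.0, 8.0], [6.5, 6.0, 8.0, 8.0],
    [6.0, 7.0, 9.0, 8.0], [7.0, 7.0, 8.0, 8.0], [5.0, 6.5, 8.0, 8.0],
    [6.0, 7.0, 8.0, 8.0],
])
print("mean:", ratings.mean(axis=0))           # 6.35, 6.65, 8.35, 8.30
print("std :", ratings.std(axis=0, ddof=1))    # ~0.82, 0.82, 0.67, 0.48
```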
Table 6.10a Subjective evaluation scores for image set 1 using BEMD and VEMD

                 BEMD                          VEMD
Subject   IP*    SA     PCA    DWT    LP       SA     PCA    DWT    LP
1         No     7      8      10     9        7      7.5    10     9
2         Yes    5      5      8      9        6      5.5    9      8.5
3         Yes    7      6      8.5    9        6      7      8      9
4         No     7      7      8      8        6      7      8.5    8
5         No     7      7      8      8        6      7      8      8
6         Yes    6.5    6      8      8        7      6      8      8
7         Yes    6      7      9      8        6      7      8.5    8
8         No     7      7      8      8        7      7.5    8      8
9         Yes    5      6.5    8      8        6      6      8      8
10        Yes    6      7      8      8        5      6      8      8
Mean             6.35   6.65   8.35   8.30     6.20   6.65   8.40   8.25
Std              ±0.82  ±0.82  ±0.67  ±0.48    ±0.63  ±0.71  ±0.66  ±0.43

* IP - whether the volunteer works in image processing.
Table 6.10b Subjective evaluation scores for image set 2 using BEMD and VEMD

                 BEMD                          VEMD
Subject   IP*    SA     PCA    DWT    LP       SA     PCA    DWT    LP
1         No     7      8      8.5    9        7.5    7      8      8.5
2         Yes    8      9      7.5    7        8      9      7      7
3         Yes    7.5    7      8      8        7      7      8      9
4         No     5      6      7      8        5      6.5    7.5    8
5         No     7      8      6      8        7      7      8      8
6         Yes    9      7      5      9        8.5    7      5      8.5
7         Yes    6      7      9      8        6.5    7      8.5    8
8         No     8      8      7      8        7      7.5    7      8
9         Yes    6.5    7.5    8      8        6.5    7.5    8      8
10        Yes    7.5    9      6      8        7      8      6      8
Mean             7.15   7.65   7.20   8.10     7.00   7.35   7.30   8.10
Std              ±1.13  ±0.94  ±1.25  ±0.57    ±0.94  ±0.71  ±1.09  ±0.52

* IP - whether the volunteer works in image processing.
CHAPTER 7

CONCLUSION
Two types of Empirical Mode Decomposition, viz. BEMD and VEMD, are studied, and it is found that VEMD performs faster than BEMD. The Intrinsic Mode Functions produced by the BEMD and VEMD algorithms are fused separately using four different methods and evaluated both subjectively and objectively. From the objective evaluation, it is found that BIMFs fused using the Discrete Wavelet Transform give the best performance, and that image fusion using BEMD gives a comparatively smaller error than VEMD, although the latter gives faster results. It is also found that fusion using a larger number of decomposition levels on the IMFs degrades the performance.
In this work, the following assumptions have been made:

• The images to be fused are assumed to be already registered; image registration is not carried out in this work.
• Eleven IMFs are assumed for image fusion using both BEMD and VEMD.
REFERENCES
[1] N. Huang, Z. Shen, S. Long, M. Wu, H. Shih, Q. Zheng, N. Yen, C. Tung and H. Liu, "The empirical mode decomposition and Hilbert spectrum for non-linear and non-stationary time series analysis", Proceedings of the Royal Society A, Vol. 454, pp. 903–995, 1998.

[2] M. K. I. Molla and K. Hirose, "Single-mixture audio source separation by subspace decomposition of Hilbert spectrum", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 3, pp. 893–900, Mar. 2007.

[3] Y. Kopsinis and S. McLaughlin, "Development of EMD-based denoising methods inspired by wavelet thresholding", IEEE Transactions on Signal Processing, Vol. 57, No. 4, pp. 1351–1362, Apr. 2009.

[4] G. Rilling, P. Flandrin and P. Goncalves, "On empirical mode decomposition and its algorithms", IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP-03), 2003.

[5] G. Rilling, P. Flandrin and P. Goncalves, "Detrending and denoising with empirical mode decomposition", EUSIPCO-04, Vienna, 2004.

[6] G. Rilling, P. Flandrin and P. Goncalves, "Empirical mode decomposition as a filter bank", IEEE Signal Processing Letters, Vol. 11, No. 2, pp. 112–114, 2004.

[7] Wenzhong Shi, Yan Tian, Ying Huang, Haixia Mao and Kimfung Liu, "A two-dimensional empirical mode decomposition method with application for fusing panchromatic and multispectral satellite images", International Journal of Remote Sensing, Vol. 30, No. 10, pp. 2637–2652, May 2009.

[8] J. C. Nunes, Y. Bouaoune, E. Delechelle, O. Niang and Ph. Bunel, "Image analysis by bidimensional empirical mode decomposition", Image and Vision Computing, Vol. 21, pp. 1019–1026, 2003.

[9] G. Rilling, P. Flandrin, P. Goncalves and J. M. Lilly, "Bivariate empirical mode decomposition", IEEE Signal Processing Letters, Vol. 14, No. 12, pp. 936–939, Dec. 2007.

[10] Z. Liu and S. Peng, "Boundary processing of bidimensional EMD using texture synthesis", IEEE Signal Processing Letters, Vol. 12, No. 1, pp. 33–36, 2005.

[11] Z. Chen, C. A. Micchelli and Y. Xu, "Fast collocation methods for second kind integral equations", SIAM Journal on Numerical Analysis, Vol. 40, pp. 344–375, 2002.

[12] G. Rilling, P. Flandrin, P. Goncalves and J. M. Lilly, "Bivariate empirical mode decomposition", IEEE Signal Processing Letters, Vol. 14, No. 12, pp. 936–939, Dec. 2007.

[13] Sharif M. A. Bhuiyan, Jesmin F. Khan and Reza R. Adhami, "A novel approach of edge detection via a fast and adaptive bidimensional empirical mode decomposition method", Advances in Adaptive Data Analysis, Vol. 2, pp. 171–192, 2010.

[14] J. C. Nunes, S. Guyot and E. Delechelle, "Texture analysis based on local analysis of the bidimensional empirical mode decomposition", Machine Vision and Applications, Vol. 16, No. 3, pp. 177–188, 2005.

[15] A. Linderhed, "2D empirical mode decompositions in the spirit of image compression", Proceedings of SPIE, Vol. 4738, pp. 1–8, 2002.

[16] Z. Liu and S. Peng, "Boundary processing of bidimensional EMD using texture synthesis", IEEE Signal Processing Letters, Vol. 12, No. 1, pp. 33–36, 2005.

[17] C. Damerval, S. Meignen and V. Perrier, "A fast algorithm for bidimensional EMD", IEEE Signal Processing Letters, Vol. 12, No. 10, pp. 701–704, 2005.

[18] P. G. Ciarlet, "Numerical analysis of the finite element method", Seminaire de Mathematiques Superieures, No. 59, 1976.

[19] Y. Xu, B. Liu and S. Riemenschneider, "Two-dimensional empirical mode decomposition by finite elements", Proceedings of the Royal Society A, Vol. 462, pp. 3081–3096, 2006.

[20] L. Kobbelt, S. Campagna, J. Vorsatz and H.-P. Seidel, "Interactive multi-resolution modeling on arbitrary meshes", Proceedings of SIGGRAPH 98, pp. 105–114, 1998.

[21] N. Sapidis and R. Perucchio, "Delaunay triangulation of arbitrarily shaped domains", Computer Aided Geometric Design, Vol. 8, pp. 421–437, 1991.

[21] V. Vaidehi et al., "Fusion of multi-scale visible and thermal images using EMD for improved face recognition", Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS), Vol. I, 2011.

[22] V. P. S. Naidu and J. R. Raol, "Pixel-level image fusion using wavelets and principal component analysis", Defence Science Journal, Vol. 58, No. 3, pp. 338–352, May 2008.

[23] P. K. Varshney, "Multi-sensor data fusion", Electronics and Communication Engineering Journal, Vol. 9, No. 12, pp. 245–253, 1997.

[24] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674–693, 1989.

[25] H. Wang, J. Peng and W. Wu, "Fusion algorithm for multisensor image based on discrete multiwavelet transform", IEE Proceedings - Vision, Image and Signal Processing, Vol. 149, No. 5, 2002.

[26] P. J. Burt and R. J. Kolczynski, "Enhanced image capture through fusion", 4th International Conference on Computer Vision, Berlin, Germany, pp. 173–182, 1993.

[27] Vladimir S. Petrović and Costas S. Xydeas, "Gradient-based multiresolution image fusion", IEEE Transactions on Image Processing, Vol. 13, No. 2, pp. 228–237, 2004.

[28] A. K. Moorthy and A. C. Bovik, "Perceptually significant spatial pooling techniques for image quality assessment", Proceedings of SPIE, 2009.

[29] A. F. H. Goetz, J. W. Boardman and R. H. Yuhas, "Discrimination among semi-arid landscape endmembers using the Spectral Angle Mapper (SAM) algorithm", Summaries of the 3rd Annual JPL Airborne Geoscience Workshop, pp. 147–149, 1992.

[30] Z. Wang and A. C. Bovik, "A universal image quality index", IEEE Signal Processing Letters, Vol. 9, No. 3, pp. 81–84, March 2002.

[31] Yu Han, Yunze Cai, Yin Cao and Xiaoming Xu, "A new image fusion performance metric based on visual information fidelity", Information Fusion, Vol. 14, No. 2, pp. 127–135, April 2013.

[32] Xuande Zhang, Xiangchu Feng, Weiwei Wang and Wufeng Xue, "Edge strength similarity for image quality assessment", IEEE Signal Processing Letters, Vol. 20, No. 4, 2013.

[33] Flavio Ribeiro, Dinei Florencio and Vitor Nascimento, "Crowdsourcing subjective image quality evaluation", 18th IEEE International Conference on Image Processing, pp. 3158–3182, 2011.