Information Fusion
journal homepage: www.elsevier.com/locate/inffus

Article info
Article history: Received 17 May 2015; Revised 3 June 2016; Accepted 18 September 2016; Available online 20 September 2016.
Keywords: Multi-focus image fusion; Image decomposition; Cartoon content; Texture content.

Abstract
Multi-focus image fusion is an effective technique to integrate the relevant information from a set of images of the same scene into a comprehensive image. The fused image would be more informative than any of the source images. In this paper, a novel fusion scheme based on image cartoon-texture decomposition is proposed. Multi-focus source images are decomposed into cartoon content and texture content by an improved iterative re-weighted decomposition algorithm, which achieves rapid convergence and naturally approximates the morphological structure components. Proper fusion rules are constructed to fuse the cartoon content and the texture content, respectively. Finally, the fused cartoon and texture components are combined to obtain the all-in-focus image. This fusion processing preserves morphological structure information from the source images and introduces few artifacts or little additional noise. Our experimental results clearly show that the proposed algorithm outperforms many state-of-the-art methods in terms of visual and quantitative evaluations.

© 2016 Elsevier B.V. All rights reserved.
1. Introduction

Multi-focus image fusion could be defined as the process of fusing substantial information from multiple images of the same scene to generate a single composite image. The fused image would be more suitable for human visual perception. Currently, multi-focus image fusion technology has been widely used in computer vision, clinical medicine, remote sensing, military surveillance, digital imaging, and so on [1–7]. In recent years, a flurry of fusion algorithms has been introduced from many and diverse points of view; these can be categorized into three groups: pixel-level fusion, feature-level fusion and decision-level fusion [1,2,5]. In comparison to the latter two schemes, the main advantage of the pixel-level fusion domain is that the original information is directly involved. Although pixel-level fusion methods face open-ended problems (high computational complexity, poor fidelity and blocking artifacts), this promising area is receiving ever-increasing attention and research is still intensely ongoing. This work concentrates mainly on the pixel-level fusion domain.

The growing appeal of this research area can be observed from the large number of scientific papers [9–15], which can be categorized into two main groups: the multi-resolution scheme and the multi-spectral scheme. Multi-resolution domain fusion usually transforms the inputs into a multi-resolution representation and then selects the decomposed information to reconstruct the fused image. These methods enable the efficient combination of relevant information that is spectrally independent or spatially overlapped. Thus, multi-resolution-based fusion algorithms built on different multi-resolution decompositions, such as the discrete wavelet transform (DWT) [8], the gradient pyramid [9] and the contrast pyramid [10], have attracted great research attention. However, these methods are complicated, time-consuming to implement and sensitive to sensor noise. Unlike the former, multi-spectral domain fusion directly operates on the pixels or regions of the source images; its main principle is to select the clearer pixels or regions to construct the fused image. Averaging, intensity-hue-saturation [11], principal component analysis [12] and independent component analysis [13] based fusion algorithms fall under this category. To the best of our knowledge, the major drawback of multi-spectral domain fusion methods is that they introduce spatial distortions and blocking artifacts in the resultant fused images. Recently, a corpus of fusion algorithms in the pixel-level fusion domain has been

∗ Corresponding author.
E-mail address: yinhongpeng@gmail.com (H. Yin).
http://dx.doi.org/10.1016/j.inffus.2016.09.007
1566-2535/© 2016 Elsevier B.V. All rights reserved.
Z. Liu et al. / Information Fusion 35 (2017) 102–116 103
Fig. 1. The basic multi-focus image fusion scheme based on image cartoon-texture decomposition.
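The scheme of Fig. 1 can be sketched end to end as follows. This is a minimal illustration only: the box-filter cartoon-texture split and the averaging/absolute-maximum fusion rules are placeholder stand-ins I introduce for demonstration, not the decomposition algorithm or the fusion rules proposed in this paper.

```python
import numpy as np

def decompose(img, k=5):
    # Placeholder cartoon-texture split: a k-by-k box-filter smoothing stands in
    # for the iterative re-weighted decomposition proposed in the paper.
    kernel = np.ones((k, k)) / (k * k)
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    cartoon = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            cartoon[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return cartoon, img - cartoon          # (cartoon part, texture part)

def fuse(images):
    # Generic pipeline of Fig. 1: decompose each input, fuse the cartoon parts
    # and texture parts with separate rules, then recombine.
    parts = [decompose(im) for im in images]
    cartoons = np.stack([c for c, _ in parts])
    textures = np.stack([t for _, t in parts])
    fused_cartoon = cartoons.mean(axis=0)            # averaging stand-in
    idx = np.abs(textures).argmax(axis=0)            # absolute-max selection stand-in
    fused_texture = np.take_along_axis(textures, idx[None], axis=0)[0]
    return fused_cartoon + fused_texture             # all-in-focus estimate
```

The two fusion rules differ deliberately: smooth content tolerates averaging, while texture selection keeps the sharper (in-focus) detail at each pixel.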
reported to solve these crucial problems, but the topic is largely still open.

Note that the crucial issue is to effectively represent the foundational atoms of the inputs in the pixel-level fusion framework. More recently, a class of image decomposition-based fusion techniques [16–18] has attracted increasing attention. The key motivation lies in the observation that natural images can be efficiently decomposed into morphological structure components: texture contents and cartoon contents. The texture contents hold textures, oscillating patterns, fine details and noise, while the cartoon contents contain the geometric structures, isophotes and piecewise-smooth parts of the source images. To preserve textural information, a color-gray image fusion algorithm based on Morphological Component Analysis (MCA) was proposed to seek out the most important information [16]. To effectively exploit the morphological diversity of images and the advantages of MCA, a multi-component fusion method was presented to generate better fused images [17]. Specific to the problem of high computational complexity, a pixel-level fusion algorithm based on multi-level local extrema (MLE) was constructed to reflect the regional pattern and edge information of source images [18]. The generic fusion framework based on image cartoon-texture decomposition is shown in Fig. 1 [16–18]. Generally, the incoherent components of source images can be efficiently separated and represented by image decomposition algorithms, including MCA, Total-Variation (TV), TV-l1, TV-l2 and other extended methods [17,19–22]. According to the properties of the morphological structure components, proper fusion rules are constructed to extract complementary contents. Ultimately, the fused cartoon and texture components are combined to generate the final all-in-focus image.

Although the models described above are feasible, three main open-ended problems remain in the multi-focus image fusion framework based on image cartoon-texture decomposition: (1) how to construct a proper decomposition algorithm that achieves fast convergence; (2) how to naturally approximate the morphological components; (3) how to extract complementary information to generate the all-in-focus image.

In this work, we aim to solve these general problems: to obtain high-quality separated contents, achieve rapid convergence and obtain the all-in-focus image. In practice, an improved multi-focus image fusion approach is proposed specific to the mentioned problems. On the one hand, the texture and piecewise smooth (cartoon) parts are obtained by an improved iterative re-weighted image decomposition algorithm. From a practical point of view, it removes the problem of information distortion and achieves rapid convergence with an effective underlying representation of the different spatial morphologies. On the other hand, according to the properties of the cartoon and texture contents, proper fusion rules are constructed to fuse the cartoon component and the texture component, respectively. These combine the complementary information into the final fused image. Finally, the two fused components are integrated to generate the final all-in-focus image. This work generates an extended depth-of-focus image that is more suitable for human visual perception, and it overcomes the problems of poor fidelity and blocking artifacts in conventional fusion methods. The main contributions can be summarized as follows:

(1) In the pixel-level fusion domain, it is crucial to achieve high fused image quality by effectively representing the building atoms of multi-focus images. In this work, an iterative re-weighted image decomposition algorithm is proposed to precisely separate and approximate the morphological structure components. Furthermore, it can reach a globally optimal solution and diminishes part of the noise.

(2) From a practical point of view, high computational complexity imposes restrictions on putting fusion algorithms into practice. The proposed image texture-cartoon decomposition method greatly reduces computation complexity while maintaining convergence and robustness.

(3) According to the properties of the separated contents, proper fusion rules are presented to fuse the cartoon component and the texture component, respectively. The all-in-focus image is obtained by combining the fused cartoon and texture contents, and is more suitable for human visual perception. This fusion processing preserves the complementary information from the multi-focus source images and introduces as few artifacts and as little additional noise as possible.
The rest of this paper is organized as follows. In Section 2, the novel fusion approach based on image decomposition is proposed. Several experiments are conducted to verify the proposed method in Section 3. Discussions and conclusions are summarized in Section 4.

2. The proposed fusion approach based on image decomposition

To overcome the crucial open-ended research issues, a general fusion approach based on image cartoon-texture decomposition is proposed. First of all, an improved re-weighted decomposition algorithm is presented; it naturally approximates the cartoon and texture components with low computation complexity. Secondly, according to the properties of the separated contents, improved fusion rules are utilized to fuse the cartoon content and texture content, respectively. These extract more curves, edges, anisotropic structures and detailed information into the fused cartoon image and texture image. The all-in-focus image is obtained by combining the complementary information from these fused components. The detailed implementations are illustrated as follows.

2.1. The improved image cartoon-texture decomposition algorithm

The task of decomposing source images into their building atoms is of great interest in many fields. Many image cartoon-texture decomposition methods have been exploited, including MCA, TV and other extended algorithms. Although these well-known algorithms can achieve high performance, how to build a proper algorithm remains an open and difficult issue. The desired algorithm should converge fast and naturally approximate the incoherent components. In this work, a novel iterative re-weighted algorithm is proposed to effectively separate source images into incoherent morphological structural components. More specifically, this work provides a theoretical analysis of the decomposition idea, and reveals that a successful separation of image cartoon-texture contents can be found in principle. Simulation results for 1-dimensional signals and 2-dimensional images show that our proposed method achieves a feasible and optimal approximate solution with less time complexity.

2.1.1. The theoretical analysis of the image decomposition model

Suppose a multi-focus image X (X ∈ R^{N×N}); it can be processed as a 1-D vector of length N² by reordering. In general, a natural image can be separated into two images: a cartoon (piecewise smooth) image and a texture image. In other words, an image Xt contains only the texture component and no pure piecewise smooth component under an over-complete dictionary Dt. Similarly, a functional of an over-complete dictionary Dn can separate the related cartoon component Xn from the source image X. A solvable optimal functional principle for this problem can be defined as [23–33]:

{αt, αn} = Arg min_{αt, αn} ||αt||1 + ||αn||1 + λ||X − Dt αt − Dn αn||2² + γ TV{Dn αn},   (1)

where the parameter λ balances sparsity and reconstruction error [23]. αn and αt stand for the separated cartoon coefficients and the separated texture coefficients, respectively. In this solution, the natural content can be separated from the texture content, and the additional noise can be removed as a by-product. Furthermore, observe that the noise error norm in the above formula is the l2 norm; this corresponds to the assumption, made in many papers [32,34], that the additional component behaves like white zero-mean Gaussian noise. In general, other noise models can similarly be introduced through other norms, including l1, l∞ and others.

Given the above solution, high-quality results can be achieved. However, although the above-mentioned case is feasible, open-ended and difficult issues still exist. On the one hand, the high computational complexity, O(K N² log₂ 2N), where K is the number of iterations of the outer loop of the program and N is the number of iterations of the inner loop, imposes restrictions on putting this model into practice. On the other hand, how to naturally approximate the two components remains a crucial and open-ended problem. Therefore, it is desirable to construct an approximate model that: (1) reduces computation complexity with guaranteed convergence and robustness; (2) reduces the reconstruction error with the sparsest and unreduced representation. Naturally, this alternative convex formulation should converge rapidly and accurately approximate the texture and natural scene components. In this work, an improved algorithm is proposed to optimize the above constrained convex formulation.

Reviewing the decomposition formulation in Eq. (1), the objects are the sparse coefficient vectors αt, αn of far longer length N². In order to simplify the separation process, we rectify the objects as the texture image, the cartoon image and the unknown noise image, rather than the sparse coefficients. Therefore, the feasible solution for image decomposition can be reformulated as:

{Xt, Xn} = Arg min_{Xt, Xn} ||Dt† Xt + Rt||1 + ||Dn† Xn + Rn||1 + λ||X − Xt − Xn||2² + γ TV{Xn}
s.t. αt = Dt† Xt + Dt† rt = Dt† Xt + Rt,
     αn = Dn† Xn + Dn† rn = Dn† Xn + Rn,
     Rt = Dt† rt = 0,
     Rn = Dn† rn = 0,   (2)

where Rt and Rn are arbitrary matrices in the null-space of the sparse dictionaries Dt and Dn, and Dt† and Dn† are the Moore-Penrose pseudo-inverses of Dt and Dn. The terms Dt† Xt and Dn† Xn in Eq. (2) represent the linear transforms of the texture image and the natural scene image, respectively.

However, can we construct an alternative algorithm that achieves fast convergence and obtains an accurate and tractable solution upon l0 or l1 norm minimization? Reviewing the properties of the l0 and l1 norms, a critical distinction between them lies in how they treat magnitude. Naturally, is there a feasible way to penalize the larger coefficients more heavily than the smaller sparse coefficients in l1 norm minimization? In this work, an efficient re-weighted algorithm for separating the texture and cartoon contents based on l1 norm minimization is proposed. The optimal solution can be reformulated as:

{Xt, Xn} = Arg min_{Xt, Xn} ||Dt† Wt Xt + Rt||1 + ||Dn† Wn Xn + Rn||1 + λ||X − Xt − Xn||2² + γ TV{Xn}
s.t. Wt = diag(1 / (|Xt| + ε)), Wn = diag(1 / (|Xn| + ε)),   (3)

where Wt and Wn are positive weight matrices. In particular, Wt and Wn are diagonal matrices with non-zero diagonal values that hold the geometrical structure characteristics of the source multi-focus images. Naturally, the representation matrix of the source images is merged column by column. ε is a non-zero constant that enhances the stability of our proposed algorithm; furthermore, it prevents generating a non-zero estimate for truly zero-valued content in the texture image Xt or the cartoon image Xn.
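The diagonal weight matrices of Eq. (3) can be formed directly from the current component estimates. A minimal sketch (the vectorized helper below and its example values are mine, not from the paper):

```python
import numpy as np

def reweight(x, eps=1e-3):
    # W = diag(1 / (|x| + eps)): small weights for large-magnitude entries,
    # large weights for near-zero entries. eps keeps every weight finite and
    # stops a truly zero entry from being forced to a non-zero estimate.
    return np.diag(1.0 / (np.abs(np.ravel(x)) + eps))

xt = np.array([0.0, 2.0, -0.5])   # toy component estimate
Wt = reweight(xt)
```

The effect is exactly the re-weighting idea discussed above: the large entry (2.0) receives the smallest penalty weight, while the zero entry receives the largest, so significant structure is preserved and spurious small values are suppressed.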
Fig. 2. Reconstructing the signal X0 by ||·||1 and weighted ||W·||1 norm minimization. (a) A feasible set between y = DX and the ball of radius ||X0||1; (b) An infeasible set between y = DX and the ball of radius ||X0||1; (c) A commendable recovery by the re-weighted l1 norm ||W·||1.
In general, the parameter ε should take a smaller value than the expected error magnitudes.

Naturally, the above model immediately raises a question: can the proposed re-weighted algorithm achieve a high-quality reconstruction result, obtain a feasible and optimal approximate solution, and converge fast? For the sake of demonstration, a schematic instance is shown in Fig. 2.

As shown in Fig. 2, a simulated example is given in detail for the l1 relaxation and the re-weighted l1 relaxation. In comparison, the corresponding l1 relaxations can reach different solutions, both feasible and infeasible, as shown in Fig. 2(a) and (b). Fig. 2(c), however, demonstrates that the re-weighted l1 relaxation algorithm obtains a unique solution. Furthermore, it also indicates that large re-weight values can be utilized to damp the near-zero points, while small re-weight values can be exploited to preserve the significant non-zero points. In other words, the re-weighted l1 relaxation algorithm can preserve the complementary information of the source images. The proof of convergence of this algorithm is similar to that of the basis pursuit method given in [23–25], with slight modifications, and is thus omitted here. Notably, unlike MCA and TV, which loop many times, repeating this procedure twice is sufficient in our proposed method. In brief, the proposed algorithm preserves the structure information in the cartoon image, such as the main image regions and edges; it also protects the oscillating components and discourages additional noise in the texture image.

Reviewing the procedure of image decomposition, proper dictionaries Dt and Dn play an important role in representing the texture and piecewise smooth components. One practical way to choose Dt and Dn is to utilize well-known transforms. In this work, we select widely known transforms to represent both the cartoon content and the texture content well. To constrain our choice of Dt and Dn, it is desirable to meet two requirements: (1) the transform can be easily implemented with low computational complexity; (2) it should be effective at representing the texture component or the piecewise smooth component. Therefore, the Curvelet Transform and the local Discrete Cosine Transform (DCT) are used to represent the cartoon content and the texture content, respectively. The motivations for the transform basis selection are briefly illustrated below.

A: The Curvelet Transform for the cartoon component
Roughly speaking, the basic principle of the Curvelet Transform is to separate an image into a set of wavelet sub-bands and then to analyze each sub-band by means of a local ridgelet transform [33,34]. It is more efficient at representing source images with abundant structure information and edges than wavelets or other conventional representations. In particular, the Curvelet Transform achieves exact reconstruction and stability due to its invertibility and robustness. Therefore, the Curvelet Transform is used to represent the information of the cartoon components, which includes the anisotropic structures and profiles of the source images.

B: The local Discrete Cosine Transform for the texture component
The local DCT belongs to the extensive family of Discrete Fourier Transforms. Its basic principle is to transform the image from the spatial domain to the frequency domain [14,35]. It can commendably represent details, lines and periodic components of a natural image. Furthermore, the DCT simplifies computation by replacing complex analysis with real numbers. Thus, the local DCT is used to approximate the building atoms of the texture components, which consist of details and periodic structural contents.

In this work, when the Curvelet transform is used to represent the cartoon contents, ringing artifacts are caused by the oscillations of the Curvelet atoms. In much of the literature, TV is used to damp the ringing artifacts near edges [19,20,22]. As a matter of fact, the essential principle of the expression TV{Dn αn} is to compute the image Xn = Dn αn and apply the TV norm to it. The cartoon image can be brought closer to a piecewise smooth image by penalizing with TV. Therefore, to obtain better expected components Dt αt and Dn αn, a total-variation (TV) regularization scheme is utilized to make the cartoon image Dn αn meet the natural scene content model [35].

In brief, the chosen dictionaries are utilized to decompose the multi-focus images into texture content and cartoon content. With these, an efficient iterative re-weighted algorithm is presented to obtain a feasible and optimal approximate solution, whose basic mathematical model is formulated as Eq. (3). The main steps of the iterative re-weighted decomposition algorithm are as follows.

1. Initialize Lmax = 5, δ = λ Lmax, λ = 0.4, ε = 10⁻³, Xn = X, Xt = 0 (or Xt = 0.1·X), and Wt, Wn as identity matrices.
2. Solve
   {Xt, Xn} = Arg min_{Xt, Xn} ||Dt† Wt Xt + Rt||1 + ||Dn† Wn Xn + Rn||1 + λ||X − Xt − Xn||2² + γ TV{Xn}
   s.t. Wt = diag(1 / (|Xt| + ε)), Wn = diag(1 / (|Xn| + ε)).
3. Loop:
   (a) Update Xn, assuming Xt is fixed:
       rn = X − Xt − Xn
       αn = Dn† Wn (Xn + rn)
       Xn = Dn α̂n
       where α̂n are the approximate coefficients, αn are the sparse coefficients, Dn is the Curvelet Transform dictionary, Dn† is the Moore-Penrose pseudo-inverse of Dn, and rn is the residual.
   (b) Update Wn:
       Wn = diag(1 / (|Xn| + ε)).
   (c) Update Xt, assuming Xn is fixed:
       rt = X − Xt − Xn
       αt = Dt† Wt (Xt + rt)
       Xt = Dt α̂t
       where α̂t are the approximate coefficients, αt are the sparse coefficients, Dt is the local DCT dictionary, Dt† is the Moore-Penrose pseudo-inverse of Dt, and rt is the residual.
   (d) Update Wt:
       Wt = diag(1 / (|Xt| + ε)).
   (e) Damp ringing artifacts by TV:
       Xn = Xn − μγ (∂TV{Xn} / ∂Xn),
       then update Wn:
       Wn = diag(1 / (|Xn| + ε)),
       and update the threshold δ: δ = δ − λ.
4. Terminate when the threshold condition is satisfied (δ < λ), returning:
   Xn = Dn α̂n
   Xt = Dt α̂t

2.1.2. The experimental demonstration of the proposed decomposition model

As mentioned above, an underlying model to describe image content has been proposed. It can theoretically separate the source images into texture and piecewise smooth (cartoon) parts. It is natural to ask whether our presented algorithm achieves high-quality performance in real applications. In this subsection, comparative experiments on 1-D signals are sketched between the re-weighted l1 relaxation algorithm and other l1 relaxation algorithms. The experimental results show that the proposed method outperforms other classical methods in terms of reconstruction error; it also obtains a better tradeoff between computation complexity and reconstruction accuracy. When focusing on 2-D natural images, the presented algorithm achieves better performance in comparison to the known decomposition methods, and can completely represent and separate the morphological structure components.

To provide a more intuitive display, a practical example on a 1-dimensional signal is given to verify the performance of the re-weighted l1 relaxation algorithm in Fig. 3. The original signal contains 1000 discrete entries, which are marked with blue '∗' in Fig. 3(a) and (b). For the sake of illustrating the property, experiments are sketched here for the l1 relaxation method and the re-weighted l1 relaxation algorithm.

Fig. 3. The experimental results of the l1 relaxation method and the re-weighted l1 relaxation algorithm. (a) The original signal marked in blue '∗' and the recovered signal by the l1 relaxation method marked in green; (b) The original signal marked in blue '∗' and the recovered signal by the proposed re-weighted l1 relaxation algorithm. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In Fig. 3(a), the difference between the original signal and the signal recovered by l1 relaxation is clearly visible. Interestingly, the signal reconstructed by the novel re-weighted l1 relaxation algorithm approximates the sparse discrete entries dramatically better, as shown in Fig. 3(b). In a nutshell, re-weight values inversely proportional to the true signal magnitudes achieve a feasible and optimal approximate solution with less time complexity. More generally, there are many algorithms to approximate the l0 or l1 norm minimization, including Basis Pursuit (BP), Matching Pursuit (MP), Orthogonal Matching Pursuit (OMP) and Iteratively Re-weighted Least Squares (IRLS) [36–40]. However, these methods tend to either guarantee the reconstructed image quality or achieve a fast computation speed, but not both; a fast method may lead to high reconstruction error, as can be seen in Fig. 4.

As shown in Fig. 4, a lower probability of error is obtained by the proposed approach than by the conventional algorithms under different cardinalities of the true solution. Although the running time of the proposed approach is much higher than that of the MP and OMP methods, it guarantees the quality of the reconstructed images while greatly reducing the running time as an extension of the BP method. Therefore, the proposed approach achieves a feasible and optimal approximate solution with less time complexity.

In the image decomposition based fusion framework, the underlying representation of the inputs (the image cartoon-texture decomposition) is the key step. Namely, how to completely and accurately separate multi-focus images into cartoon and texture components is a crucial issue. In this section, some experiments are sketched to demonstrate the effectiveness and accuracy of the proposed image decomposition algorithm, as shown in Fig. 5.

The decomposition results of multi-focus images are shown in Fig. 5. The multi-focus images are shown in Fig. 5(a), while the texture components and cartoon components obtained by MCA, TV-l1 and the proposed iterative re-weighted decomposition algorithm, respectively, are shown in Fig. 5(b)–(g). As shown in Fig. 5(d) and (e), part of the texture component remains in the cartoon image. In comparison, the texture and cartoon contents acquired by MCA and by the proposed algorithm are well separated, as shown in Fig. 5(b), (c), (f) and (g). However, the MCA method does not have stability and robustness for all the multi-focus images.
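The benefit of re-weighting seen in the 1-D experiment can be illustrated in miniature. For the separable denoising problem min_x Σ wᵢ|xᵢ| + λ||y − x||², the solution is element-wise soft-thresholding with per-entry threshold wᵢ/(2λ), so re-weighting with wᵢ = 1/(|xᵢ| + ε) preserves large entries while suppressing small ones. This toy is not the paper's full solver (no dictionaries, no TV term), and the values λ = 0.5 and ε = 0.1 are my choices for the sketch, not the paper's settings:

```python
import numpy as np

def soft(y, t):
    # Proximal operator of t*|x|: element-wise soft-thresholding.
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def reweighted_shrink(y, lam=0.5, eps=0.1, passes=2):
    # Two re-weighting passes, echoing the paper's observation that repeating
    # the procedure twice is sufficient; weights are inverse current magnitudes.
    w = np.ones_like(y)
    x = y.copy()
    for _ in range(passes):
        x = soft(y, w / (2.0 * lam))     # weighted l1 proximal step
        w = 1.0 / (np.abs(x) + eps)      # re-weighting, as in Eq. (3)
    return x

y = np.array([5.0, -4.0, 0.3, -0.2, 0.1])   # sparse signal plus small spurious entries
x = reweighted_shrink(y)
```

After the second pass the large entries are shrunk far less than under the uniform-weight (plain l1) threshold, while the small entries stay exactly zero; this is the magnitude-dependent penalty the re-weighted relaxation provides.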
Fig. 4. The probability of error and running time under diverse methods. (a) The probability of error obtained by the proposed approach and the conventional method; (b)
The running time obtained by the proposed approach and the conventional method.
Fig. 5. The results of image decomposition by the proposed iterative re-weighted algorithm, MCA and TV-l1 methods. (a) The source multi-focus images; (b) The texture
image obtained by MCA; (c) The cartoon image obtained by MCA; (d) The texture image obtained by TV-l1 ; (e) The cartoon image obtained by TV-l1 ; (f) The texture image
obtained by the proposed iterative re-weighted algorithm; (g) The cartoon image obtained by the proposed iterative re-weighted algorithm.
Specifically, for a source image with more texture components, the corresponding cartoon image still contains definite texture components; and, observing the separated cartoon component of the source image with a ball, part of the cartoon component also remains in the texture image. In order to provide visually explicit evidence, the concept of the image spectrogram is adopted to demonstrate the difference between the separated texture image and the separated cartoon image. As is well known, the image spectrogram represents the intensity of variation between one point and its neighborhood points; in other words, it is the distribution of the image gradient (or image energy). According to this principle, the image spectrogram of the cartoon component should contain more energy content (the high-brightness part), while the image spectrogram of the texture component should show the abruptly varying part with low luminance (low energy). In this work, the image spectrogram for one of the experimental images is shown in Fig. 6.

The source image and its spectrogram are shown in Fig. 6(a) and (b). Comparatively speaking, the image spectrogram of the cartoon component in Fig. 6(c) indicates that the TV-l1 model cannot represent the cartoon component well, and the image spectrogram of the texture component in Fig. 6(d) illustrates that the separated texture image obtained by the TV-l1 model contains partial cartoon contents. In comparison, the image spectrograms of the cartoon contents and texture contents demonstrate that the separated components are well represented by the MCA model and by the proposed algorithm, as shown in Fig. 6(e)–(h). When carefully observing the image spectrograms of the texture components in Fig. 6(f) and (h), much more luminance information is present for the MCA model than for the proposed approach; in other words, the texture image obtained by the MCA model still contains partial cartoon components. In brief, the proposed algorithm can commendably separate the morphological structure components in comparison to the other two methods.

2.2. The improved multi-focus image fusion rules

As mentioned above, a multi-focus image can be completely and effectively separated into two incoherent components: cartoon content and texture content. The cartoon content contains the main geometry information and profile of the multi-focus images, obtained by the curvelet transform, while the texture content mainly consists of the periodic texture information and partial additional noise, obtained by the local DCT. Thus, different fusion rules should be constructed for the morphological contents to preserve as much structural information as possible; these are illustrated in detail as follows.
Fig. 6. The frequency spectrum of cartoon and texture images obtained by TV-l1 model, MCA model and the proposed algorithm on multifocus “Color_Book” image. (a) The
source “Color_Book” image; (b) The frequency spectrum of source image; (c) The frequency spectrum of cartoon image obtained by the proposed decomposition algorithm;
(d) The frequency spectrum of texture image obtained by the proposed decomposition algorithm; (e) The frequency spectrum of cartoon image obtained by MCA model; (f)
The frequency spectrum of texture image obtained by MCA model; (g) The frequency spectrum of cartoon image obtained by TV-l1 model; (h) The frequency spectrum of
texture image obtained by TV-l1 model.
A: Fusion rule for cartoon image

The separated cartoon part contains the main geometric structural behaviors, including object hues and sharp boundaries. The fundamental goal of cartoon component fusion is therefore to preserve the geometric structural information. Inspired by the "coefficient-abs-max" fusion rule, an extended fusion rule based on variance is constructed to fuse the cartoon images, since the variance, as an activity measure, can reflect the cartoon components in multi-focus images. In this work, the extension of variance for an m × n cartoon image X_n with mean μ is defined as follows:

V(X_n) = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( X_n(i,j) - \mu \right)^2,

where small constants C1 and C2 are used to avoid instability at a zero point of interest.

B: Fusion rule for texture image

In order to improve the fusion accuracy, energy and structural similarity are used to construct the fusion rule for the texture image. The energy is defined as:

En(t_1) = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{At}(i,j)^2,
En(n_1) = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{An}(i,j)^2,  i = 1, 2, \ldots, m,  j = 1, 2, \ldots, n,

and En(t_2), En(n_2) are obtained analogously from X_{Bt} and X_{Bn}.

According to the above description, two distinct combination modes (selection and averaging) are exploited to merge the texture components into the final fused version. The novel fusion rule for texture images is calculated as Eq. (8):

X_{Ft} = \frac{X_{At}\,\min(w_0(t_1), w_1(t_1)) + X_{Bt}\,\min(w_0(t_2), w_1(t_2))}{\min(w_0(t_1), w_1(t_1)) + \min(w_0(t_2), w_1(t_2))},

w_0(t_1) = \frac{En(t_1)}{En(t_1) + En(n_1)},  w_0(t_2) = \frac{En(t_2)}{En(t_2) + En(n_2)},

s.t.  w_1(t_1) = \frac{S(t_1)}{S(t_1) + S(n_1)},  w_1(t_2) = \frac{S(t_2)}{S(t_2) + S(n_2)},   (8)

where w_0(t_1), w_0(t_2) denote the weights between the separated texture components and the separated cartoon components, and w_1(t_1), w_1(t_2) represent the ratios of structural similarity S between the texture and cartoon images. min(w_0(t_1), w_1(t_1)) and min(w_0(t_2), w_1(t_2)) represent the importance of the coefficients in the final fused texture image.

The above formulas represent the fusion operator constructed for fusing the texture components. As is well known, a few cartoon components and partial additional noise may be introduced into the texture content. According to the properties of the texture contents, the "min" function, characterized by energy and structural similarity, removes the cartoon components and the noise information. Based on this operator, the fused texture components contain more significant features and introduce as few artifacts or inconsistencies as possible.

C: Fusion rule for cartoon image and texture image

In the fusion framework, the key issue in designing proper fusion rules is to integrate the important information into the final fused image. When a natural image is decomposed into two independent components with their inherent characteristics, the fused image cannot be obtained by directly adding and subtracting the different components: the cartoon part contains curves, edges and anisotropic structures, while the texture part mainly represents details, lines and periodic behaviors, as well as some degree of noise. Therefore, an improved fusion rule based on edge preservation is proposed to preserve the geometric structural information; the edge preservation represents the edge information from the separated images. For source images A, B and their cartoon and texture images X_{At}, X_{An}, X_{Bt}, X_{Bn}, the spatial information preservation measures Q^{A,X_{At}}, Q^{A,X_{An}}, Q^{B,X_{Bt}}, Q^{B,X_{Bn}} [41,42] are defined as:

Q_{n,m}^{A,X_{At}} = \frac{\Gamma}{\big(1 + e^{k_g (G_{n,m}^{A,X_{At}} - \sigma_g)}\big)\big(1 + e^{k_\alpha (H_{n,m}^{A,X_{At}} - \sigma_\alpha)}\big)},
Q_{n,m}^{A,X_{An}} = \frac{\Gamma}{\big(1 + e^{k_g (G_{n,m}^{A,X_{An}} - \sigma_g)}\big)\big(1 + e^{k_\alpha (H_{n,m}^{A,X_{An}} - \sigma_\alpha)}\big)},
Q_{n,m}^{B,X_{Bt}} = \frac{\Gamma}{\big(1 + e^{k_g (G_{n,m}^{B,X_{Bt}} - \sigma_g)}\big)\big(1 + e^{k_\alpha (H_{n,m}^{B,X_{Bt}} - \sigma_\alpha)}\big)},
Q_{n,m}^{B,X_{Bn}} = \frac{\Gamma}{\big(1 + e^{k_g (G_{n,m}^{B,X_{Bn}} - \sigma_g)}\big)\big(1 + e^{k_\alpha (H_{n,m}^{B,X_{Bn}} - \sigma_\alpha)}\big)},   (9)

In Eq. (9), a pair of factors G^{(\ast)} and H^{(\ast)} are used to define the relative strength (gradient information) and orientation "change", which are shown as:

G_{n,m}^{A,X_{At}} = \begin{cases} g_{n,m}^{X_{At}} / g_{n,m}^{A}, & g_{n,m}^{A} > g_{n,m}^{X_{At}} \\ g_{n,m}^{A} / g_{n,m}^{X_{At}}, & \text{otherwise} \end{cases}   (10)

H_{n,m}^{A,X_{At}} = \frac{2}{\pi} \Big| \big| \alpha_{n,m}^{X_{At}} - \alpha_{n,m}^{A} \big| - \pi/2 \Big|,   (11)

where the gradient strength g and the orientation α ∈ (0, π) are extracted at each location (n, m) from each image using the Sobel operator. The constants Γ, k_g, σ_g, k_α, σ_α determine the exact shape of the sigmoid functions used to form the edge strength and orientation preservation values.

As mentioned above, the fused cartoon image contains the piecewise smooth changes in luminous intensity, the marked contents and the edges, which should be preserved as much as possible. The fused texture image describes the detailed information in the regions enclosed by edges, including the additional noise, so this part should bring in as little additional noise as possible. Motivated by these principles, an extended weighted fusion rule is proposed to integrate the fused texture image X_{Ft} and the fused cartoon image X_{Fn}. The novel fusion rule is defined as:

X_F = \frac{1}{2} \Big[ \big( \bar{Q}_{(A,X_{An})} + \bar{Q}_{(B,X_{Bn})} \big) X_{Fn} + \big( \bar{Q}_{(A,X_{At})} + \bar{Q}_{(B,X_{Bt})} \big) X_{Ft} \Big],

\bar{Q}_{(A,X_{An})} = \frac{Q_{(A,X_{An})}}{Q_{(A,X_{An})} + Q_{(A,X_{At})}},  \bar{Q}_{(A,X_{At})} = \frac{Q_{(A,X_{At})}}{Q_{(A,X_{An})} + Q_{(A,X_{At})}},   (12)

s.t.  \bar{Q}_{(B,X_{Bn})} = \frac{Q_{(B,X_{Bn})}}{Q_{(B,X_{Bn})} + Q_{(B,X_{Bt})}},  \bar{Q}_{(B,X_{Bt})} = \frac{Q_{(B,X_{Bt})}}{Q_{(B,X_{Bn})} + Q_{(B,X_{Bt})}},

where each \bar{Q} is the normalized spatial information preservation measure from Eq. (9).

Through the above procedures, diverse improved fusion rules are utilized to fuse the cartoon and texture components, respectively, and the separated components are then effectively combined by a proper fusion rule. These procedures carry prominent information without introducing distortion and enhance the quality of the fused image. In the next section, several experiments are sketched to illustrate the performance of the novel approach.

3. Multi-focus image fusion experiments

Cameras with a finite depth-of-field cannot capture all relevant and complementary information in focus in light optical imaging systems. Multi-focus image fusion is an effective process of combining all the complementary information into a highly focused image, which can provide a suitable view for human or machine perception. In this work, a novel multi-focus image fusion approach based on image decomposition is proposed. Firstly, the multi-focus images are decomposed into cartoon and texture components by an iterative re-weighted decomposition algorithm. Secondly, the cartoon image and the texture image are fused by different fusion rules according to their properties. Finally, the fused images are obtained, and comparison experiments are constructed to illustrate the effectiveness of the proposed approach.

Objective evaluation is necessary for the fused images. The evaluation criteria generally include Average Gradient (AG), Mutual Information (MI), Edge-Intensity (EI), Relatively Warp (RW), Structural Similarity (SSIM), Edge Retention Q^{AB/F}, Cross-Entropy (CE) and Figure Definition (FD) [1–6,14,15]. The fused image is better with increasing values of AG, MI, FD and EI, while the opposite holds for CE and RW.
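Of the criteria just listed, the Average Gradient (AG) admits a particularly compact definition; the sketch below uses one common formulation (implementations differ slightly in how image boundaries are handled):

```python
import numpy as np

def average_gradient(img):
    """Average Gradient (AG): mean magnitude of the local intensity
    gradient; larger values indicate a sharper, more detailed image."""
    gx = np.diff(img.astype(float), axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(img.astype(float), axis=0)[:, :-1]   # vertical differences
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

sharp = np.add.outer(np.arange(8.0), np.arange(8.0)) % 2 * 10  # checker-like
flat = np.full((8, 8), 5.0)                                     # constant image
print(average_gradient(sharp) > average_gradient(flat))  # True: sharper image scores higher
```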
Table 1
The running time (Time/seconds) and PSNR by the proposed fusion approach and the
conventional fusion methods.
The Edge Retention Q^{AB/F} and the Structural Similarity (SSIM) describe the fused image better than the other criteria when their values approach 1. These evaluation criteria give an impersonal assessment of the fused images produced by the conventional fusion methods and the proposed fusion approach. All the source multi-focus images can be obtained at http://www.imagefusion.org. In addition, all the experiments are implemented in Matlab 2010a on a Pentium(R) 2.5 GHz PC with 2.00 GB RAM.

3.1. Fusion on multi-focus "Color_Flower" images

In this section, a pair of "Color_Flower" multi-focus images with a size of 384 × 512 pixels is utilized to evaluate the proposed algorithm and the comparative methods. Each "Color_Flower" image has different focused regions. Color_FlowerA focuses on the left side, on the big flower with different edge shapes in each petal (big flower clear; brick wall and small lobules fuzzy). In comparison, Color_FlowerB focuses on the right side, on the whole brick wall and the many small lobules of arbitrary shapes and sizes against the similar brick contours of the background wall (brick wall and small lobules clear; big flower fuzzy). This group of multi-focus images contains both rich cartoon components and abundant texture components. We sketch here some experiments to verify the effectiveness (low computational complexity and high-quality fused images) of the proposed fusion approach. In particular, the running time is obtained by averaging 20 runs of each algorithm (DCT, blockDCT, Curvelet, the TV-l1 model, the MCA model and the proposed algorithm); the quality of the fused images is estimated by the Peak Signal-to-Noise Ratio (PSNR). The experimental results are shown in Table 1.

As shown in Table 1, the proposed algorithm clearly takes less time than the TV-l1 model and the MCA model. In comparison to the classical methods (DCT, Curvelet), the presented algorithm takes more time; however, the results of the evaluation index (PSNR) demonstrate that the proposed algorithm obtains high-quality fused images. In brief, the improved image-decomposition-based fusion method can achieve better results with low computational complexity when compared with the conventional methods and the other image-decomposition-based fusion methods. To provide an intuitive result, the experiments are implemented on the "Color_Flower" multi-focus images as shown in Fig. 7.

The fusion results using the different fusion methods are shown in Fig. 7. The source images are shown in Fig. 7(a) and (b). The fused images, which combine the multi-focus inputs into a single content, are shown in Fig. 7(c)–(l); they are acquired by the average-based method (AV), the select-maximum-based method (Max), the PCA-based method (PCA), the Contrast Pyramid method (CP), the DWT-based method (DWT), the FSD Pyramid method (FSD), the Gradient Pyramid method (GP), the Morphological Pyramid method (MP), the Ratio Pyramid method (RP) and the Laplacian Pyramid method (LP), respectively. The fused images shown in Fig. 7(m)–(o) are obtained by integrating the cartoon and texture components, carried out by the TV-l1 model, the MCA model and the proposed algorithm, respectively. Although some fusion methods (such as AV, FSD, GP, PCA and MCA) produce high-quality fused images, there is luminance distortion to some extent compared with the source images. Furthermore, the experimental results also demonstrate that some noise information has been introduced into the fused images obtained by Max, MP, RP and TV-l1. In comparison, the fused images merged by DWT, CP, LP and the proposed method contain much more detail information, while the results of the other schemes show poor contrast in comparison to those of the above-mentioned four methods. However, it is difficult for the human visual system to distinguish the differences among the fused images acquired by DWT, CP, LP and the proposed method; thus, the objective indices are utilized to evaluate the fused images. The results of the assessment criteria are shown in Table 2.

As shown in Table 2, the proposed fusion approach outperforms the other methods in terms of the evaluation criteria CE, Q^{AB/F}, RW, SSIM and FD. These quantitative assessments indicate that the fused image obtained by the proposed approach contains more detail information. Although the values are smaller in terms of AG, EI and MI, the distinction is relatively tiny. In particular, the quantitative assessment of FD illustrates that the fused image produced by the proposed approach is more suitable for human visual perception. In conclusion, the proposed fusion approach is superior to the other methods in both subjective visual analysis and objective quantitative evaluation.

3.2. Fusion on multi-focus "Color_Book" images

In this section, fusion experiments on a set of multi-focus "Color_Book" images with a size of 320 × 240 pixels are sketched to illustrate the performance of the proposed fusion approach. Color_BookA focuses on the right side, on the whole book with some objects of arbitrary shapes and sizes (whole book clear; partial book fuzzy). In contrast, Color_BookB focuses on the left side, on the partial book with a sawtooth contour and some English words on the cover (partial book clear; whole book fuzzy). As mentioned above, this group of source images is abundant in both cartoon components and texture components. To show the effectiveness of the presented approach, some experiments are implemented to demonstrate the merits of the proposed algorithm, paying close attention to the running time and the image quality. The experimental results are shown in Table 3.

In Table 3, the distinction in running time between the proposed algorithm and the other image-decomposition-based methods (the TV and MCA models) is clear. These times are acquired by averaging 20 fusion processes. In comparison to the classical methods (DCT, Curvelet), the presented algorithm takes more time but produces higher-quality fused images; in particular, the Peak Signal-to-Noise Ratio (PSNR) results demonstrate that the presented method achieves better performance than the other conventional methods. In brief, the proposed method achieves promising performance while guaranteeing the processing speed. To offer an intuitive view, the fusion experiments are performed on a pair of multi-focus "Color_Book" images, as shown in Fig. 8.

The fused images obtained by the proposed fusion approach and the well-known methods are shown in Fig. 8. The source images provide dissimilar information, as shown in Fig. 8(a) and (b). From the fused results, careful manual inspection illustrates that several of the fused images are not satisfactory.
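The PSNR figures used in the running-time tables follow the standard definition; a minimal routine (an 8-bit peak of 255 is assumed here):

```python
import numpy as np

def psnr(reference, fused, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between a reference image and a
    fused image (standard definition; peak=255 assumes 8-bit data)."""
    mse = np.mean((reference.astype(float) - fused.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
noisy = ref + 5.0                     # uniform error of 5 grey levels
print(round(psnr(ref, noisy), 2))     # 10*log10(255^2/25), about 34.15 dB
```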
Fig. 7. The fused images obtained by different fusion methods for multi-focus "Color_Flower" images. (a) and (b) Multi-focus source images: Color_FlowerA and Color_FlowerB; (c) The fused image obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based
method; (f) The fused image obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid
method; (i) The fused image obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio
Pyramid method; (l) The fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1 ;
(n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition
by the proposed iterative re-weighted algorithm.
Table 2
The objective evaluation of the conventional methods and our novel approach.
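The edge-retention score Q^{AB/F} reported in the objective-evaluation tables is built, following [41], from Sobel gradient strength and orientation preservation as in Eqs. (9)–(11). The sketch below mirrors that construction; the constants `gamma`, `kg`, `sg`, `ka`, `sa` are illustrative values, not the ones used in the paper.

```python
import numpy as np

def sobel(img):
    """Sobel gradient strength g and orientation alpha for a 2-D array."""
    img = img.astype(float)
    p = np.pad(img, 1, mode="edge")
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    gx = sum(kx[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3))
    gy = sum(ky[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3))
    g = np.hypot(gx, gy)
    alpha = np.mod(np.arctan2(gy, gx), np.pi)   # orientation folded into [0, pi)
    return g, alpha

def preservation(src, comp, gamma=1.0, kg=-10.0, sg=0.5, ka=-20.0, sa=0.75):
    """Per-pixel spatial-preservation measure in the spirit of Eqs. (9)-(11).
    kg and ka are negative so Q grows with G and H; all constants here are
    illustrative, not the paper's."""
    g_s, a_s = sobel(src)
    g_c, a_c = sobel(comp)
    eps = 1e-12
    G = np.where(g_s > g_c, g_c / (g_s + eps), g_s / (g_c + eps))  # Eq. (10)
    H = (2.0 / np.pi) * np.abs(np.abs(a_c - a_s) - np.pi / 2)      # Eq. (11)
    return gamma / ((1 + np.exp(kg * (G - sg))) * (1 + np.exp(ka * (H - sa))))  # Eq. (9)

img = np.add.outer(np.arange(8.0), np.zeros(8))   # vertical ramp, uniform gradient
q_self = preservation(img, img)
print(q_self.mean() > 0.9)   # an image preserves its own edges almost perfectly
```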
The fused images in Fig. 8(c), (d), (e), (h), (i), (k), (m) and (n) show poor contrast; they are obtained by the average-based, Max-based, PCA-based, FSD-based, GP-based, RP-based, TV-l1-based and MCA-based methods, respectively. These fused images severely lose information in light intensity. In particular, the fused image obtained by the TV-l1 model contains a fraction of noise information, which may result from its poor separation of the cartoon and texture components. In contrast, as shown in Fig. 8(f), (g), (l) and (o), the fused images obtained by the CP-based, DWT-based and LP-based methods and the proposed algorithm, respectively, perform well in both light intensity and detail information (such as lines, shapes, edges and contours). However, it is not easy to distinguish the differences among those fused images by human visual perception alone, so the objective assessment criteria are needed to measure the quality of the fused images, as shown in Table 4.

Table 3
The running time (Time/seconds) and PSNR by the proposed fusion approach and the conventional fusion methods.

Methods  DCT      Curvelet  blockDCT  TV       MCA      Proposed
Time     15.4709  16.9969   20.8112   22.0445  33.5805  19.0649
PSNR     27.0834  29.8013   28.6027   24.6287  29.9067  31.8063

As shown in Table 4, the proposed fusion approach achieves better results than the other methods according to the evaluation criteria CE, Q^{AB/F}, RW, MI, SSIM and FD. The results of these assessment criteria demonstrate that the proposed fusion approach captures much more information from the source images. Although the values of AG and EI are not the largest, the distinction is very small. Therefore, the proposed fusion approach outperforms the other methods according to both visual analysis and quantitative evaluation.

3.3. Fusion on multi-focus "Ball" images

In this section, the corresponding experiments are implemented on a group of multi-focus "Ball" images for the proposed approach and the other methods. The source image Ball_A is the far-focused image, with many patches of arbitrary shapes and sizes: the patches are clear and the ball is fuzzy.
Fig. 8. The fused images obtained by different fusion methods for multi-focus "Color_Book" images. (a) and (b) Multi-focus source images: Color_BookA and Color_BookB;
(c) The fused image obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based method; (f)
The fused image obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid method; (i)
The fused image obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio Pyramid
method; (l) The fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1 ; (n) The
fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by
the proposed iterative re-weighted algorithm.
Table 4
The objective evaluation of the conventional methods and our novel approach.
Ball_B, in contrast, is the near-focused image: the ball, with many edge veins of arbitrary shapes and sizes, is clear, while the patches are fuzzy. In this work, we perform some experiments to elaborate the performance of the proposed approach. In these experiments, the running time is obtained by averaging 20 fusion procedures, and the quality of the fused images is estimated by the Peak Signal-to-Noise Ratio (PSNR). The experimental results are shown in Table 5.

Table 5
The running time (Time/seconds) and PSNR by the proposed fusion approach and the conventional fusion methods.

Methods  DCT      Curvelet  blockDCT  TV       MCA      Proposed
Time     1.9588   1.6655    5.3015    6.5781   6.8715   5.2444
PSNR     22.4146  25.2725   25.5751   21.5964  27.7943  30.5659

As shown in Table 5, the fusion approach based on the improved iterative re-weighted image decomposition algorithm has low time complexity when compared with the other image-decomposition-based fusion approaches (the TV and MCA models). The proposed method still takes more time than the conventional methods (DCT, blockDCT and Curvelet); however, the PSNR results illustrate that the presented algorithm outperforms the other fusion methods. For human visual perception, there is much more texture information in this set of multi-focus images. The related fusion experiments are performed on a pair of multi-focus "Ball" images with finite depth-of-field, as shown in Fig. 9.

The source images, with a size of 128 × 128 pixels, are shown in Fig. 9(a) and (b). From the perspective of the human visual perception mechanism, there are some losses of local information or luminance distortion in Fig. 9(c), (d), (e), (h), (i), (j), (k), (m) and (n), which are obtained by the average-based, Max-based, PCA-based, FSD-based, GP-based, MP-based, RP-based, TV-l1-based and MCA-based methods, respectively. In particular, noise information is introduced in Fig. 9(d), (j) and (m). By comparison, high-quality fused images are obtained in Fig. 9(f), (g), (l) and (o), which behave better in both light intensity and detail information, including shapes, edges and contours; these fused images are obtained by the CP-based, DWT-based and LP-based methods and the proposed method, respectively.
Fig. 9. The fused images obtained by different fusion methods for multi-focus “Ball” Images. (a) and (b) multi-focus source images: Ball_A and Ball_B; (c) The fused image
obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based method; (f) The fused image
obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid method; (i) The fused image
obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio Pyramid method; (l) The
fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1 ; (n) The fused image based
on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by the proposed iterative
re-weighted algorithm.
Table 6
The objective evaluation of the conventional methods and our novel approach.
However, the distinction between these methods is difficult to establish by human visual perception; thus, the objective quantitative criteria are needed for the fused images. The results of the quantitative assessments are shown in Table 6.

As shown in Table 6, the evaluation results in terms of AG, CE, Q^{AB/F}, RW, SSIM and FD indicate that the proposed fusion approach performs better than the other compared methods. In other words, the fused image obtained by the proposed fusion approach contains much more detail information, such as shapes, edges and contours. Although the objective quantitative assessments are not optimal in terms of EI and MI, the distinction is comparatively small. In particular, the quantitative assessments also demonstrate that the fused image produced by the proposed approach is more suitable for the human visual perception system. In conclusion, the proposed fusion approach outperforms the other methods through both subjective visual analysis and objective evaluation.

3.4. Fusion on multi-focus "Testna_Slika" images

In this section, fusion experiments on a group of multi-focus "Testna_Slika" images with a size of 160 × 160 pixels are implemented to verify the properties of the proposed fusion approach. As shown in Fig. 10(a), Testna_SlikaA is focused on the close foreground, an aircraft with some contours of general shapes (the aircraft clear, the environmental background fuzzy). In contrast, as shown in Fig. 10(b), Testna_SlikaB is focused on the far background, which contains an object of prominent and irregular shape (the environmental background clear, the front aircraft fuzzy). Some experiments are implemented to demonstrate the low time complexity and the high fused-image quality of the proposed approach. In the following experiments, the running time in Table 7 is obtained by averaging 20 fusion processes, and the PSNR criterion is utilized to evaluate the quality of the fused images.
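Each running time reported in this section is an average over 20 runs; a generic harness for that protocol looks as follows (`dummy_fusion` is a placeholder, not one of the compared methods):

```python
import time

def average_runtime(fn, *args, runs=20):
    """Average wall-clock time of fn(*args) over `runs` executions,
    mirroring the 20-run averaging protocol used for the tables."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs

# Placeholder standing in for a fusion routine.
def dummy_fusion(n):
    return sum(i * i for i in range(n))

t = average_runtime(dummy_fusion, 10_000)
print(t > 0.0)  # True
```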
Fig. 10. The fused images obtained by different fusion methods for multi-focus “Testna_Slika” Images. (a) and (b) multi-focus source images: Testna_SlikaA and Testna_
SlikaB; (c) The fused image obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based
method; (f) The fused image obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid
method; (i) The fused image obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio
Pyramid method; (l) The fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1 ;
(n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition
by the proposed iterative re-weighted algorithm.
Table 8
The objective evaluation of the conventional methods and our novel approach.
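The overall scheme evaluated throughout this section (decompose each source, fuse cartoon and texture contents with their own rules, then recombine) can be sketched end to end. This is a simplified illustration only: the box-blur split below stands in for the paper's iterative re-weighted decomposition, the cartoon rule keeps whichever component has the larger variance (rule A, applied globally rather than locally), the texture rule uses only the energy weights w0 of Eq. (8), and the recombination is a plain sum rather than the Q-weighted rule of Eq. (12).

```python
import numpy as np

def decompose(img, passes=8):
    """Stand-in cartoon-texture split: repeated 3x3 box blur as the cartoon
    part, residual as texture. (A placeholder only; the paper's iterative
    re-weighted decomposition is not reproduced here.)"""
    cartoon = img.astype(float)
    for _ in range(passes):
        p = np.pad(cartoon, 1, mode="edge")
        cartoon = sum(p[i:i + img.shape[0], j:j + img.shape[1]]
                      for i in range(3) for j in range(3)) / 9.0
    return cartoon, img - cartoon

def fuse(a, b):
    """Skeleton of the scheme: decompose, fuse each content, recombine."""
    ca, ta = decompose(a)
    cb, tb = decompose(b)
    cf = ca if ca.var() > cb.var() else cb            # variance activity (rule A)
    ea, eb = np.mean(ta ** 2), np.mean(tb ** 2)       # texture energies
    tf = (ea * ta + eb * tb) / (ea + eb + 1e-12)      # energy weights w0, Eq. (8)
    return cf + tf                                    # simplified recombination

a = np.random.default_rng(0).random((32, 32))
b = np.random.default_rng(1).random((32, 32))
fused = fuse(a, b)
print(fused.shape)  # (32, 32)
```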
4. Discussions and conclusions

Multi-focus image fusion plays a crucial role in digital image processing applications such as computer vision, clinical medicine, military surveillance, robotics and remote sensing. However, multi-focus images with many morphological structures are not well represented by a single sparse transform method. In this paper, an efficient multi-focus image fusion approach is proposed to overcome these limitations. Firstly, all the source images are decomposed into two components (cartoon content and texture content) by an improved iterative re-weighted image decomposition algorithm. Secondly, the cartoon component and the texture component are integrated by modified fusion rules that draw on the properties of the separated parts. Finally, the fused components are combined to generate the all-in-focus image. The results indicate that the proposed fusion approach achieves better quality in comparison to the existing state-of-the-art methods.

However, every image processing model has its own practical limitations, and several lines of follow-up work remain for decomposition-based multi-focus image fusion. First, in the sparse representation domain it is difficult to provide a dictionary that separates all kinds of source images, so constructing a self-adaptively trained dictionary for a wide family of multi-focus images is an essential challenge. Secondly, there is no quantitative evaluation criterion for the quality of the separated cartoon and texture components; designing general evaluation criteria is a meaningful research direction. In addition, although the proposed image decomposition algorithm requires less time than the current methods, reducing the running time further remains a key issue in practical applications, and one point of interest is to build a more efficient image decomposition algorithm to speed up the procedure. These problems will be investigated in our future work.

Acknowledgments

We would like to acknowledge the support of the National Natural Science Foundation of China (61374135, 61203321, 61302041), the Chongqing Nature Science Foundation for Fundamental Science and Frontier Technologies (cstc2015jcyjB0569), the China Central Universities Foundation (106112015CDJXY170003) and the Chongqing Graduate Student Research Innovation Project (CYB14023).

References

[1] H.F. Li, Y. Chai, R. Ling, H.P. Yin, Multifocus image fusion scheme using feature contrast of orientation information measure in lifting stationary wavelet domain, J. Inf. Sci. Eng. 29 (2) (2013) 227–247.
[2] A. Saha, G. Bhatnagar, Q.M.J. Wu, Mutual spectral residual approach for multifocus image fusion, Digit. Signal Process. 23 (4) (2013) 1121–1135.
[3] Y.A.V. Phamila, R. Amutha, Discrete cosine transform based fusion of multi-focus images for visual sensor networks, Signal Process. 95 (2014) 161–170.
[4] V.N. Gangapure, S. Banerjee, A.S. Chowdhury, Steerable local frequency based multispectral multifocus image fusion, Inf. Fusion 23 (2015) 99–115.
[5] T. Stathaki, Image Fusion: Algorithms and Applications, first ed., Elsevier, London, 2008.
[6] W.W. Kong, Y. Lei, H.X. Zhao, Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization, Infrared Phys. Techn. 67 (2014) 161–172.
[7] M.M. Subashini, S.K. Sahoo, Pulse coupled neural networks and its applications, Expert Syst. Appl. 41 (8) (2014) 3965–3974.
[8] Y. Liu, S.P. Liu, Z.F. Wang, A general framework for image fusion based on multi-scale transform and sparse representation, Inf. Fusion 24 (2015) 147–164.
[9] Z.Q. Zhou, S. Li, B. Wang, Multi-scale weighted gradient-based fusion for multi-focus images, Inf. Fusion 20 (1) (2014) 60–72.
[10] D.X. He, Y. Meng, C.Y. Wang, Contrast pyramid based image fusion scheme for infrared image and visible image, in: Proceedings of 2011 IEEE International Conference on Geoscience & Remote Sensing Symposium, 2011, pp. 597–600.
[11] S.T. Li, X.D. Kang, J.W. Hu, Image fusion with guided filtering, IEEE Trans. Image Process. 22 (7) (2013) 2864–2875.
[12] A.P. James, B.V. Dasarathy, Medical image fusion: A survey of the state of the art, Inf. Fusion 19 (3) (2014) 4–19.
[13] N. Mitianoudis, T. Stathaki, Pixel-based and region-based image fusion schemes using ICA bases, Inf. Fusion 8 (2) (2007) 131–142.
[14] Z. Liu, H.P. Yin, Y. Chai, S.X. Yang, A novel approach for multimodal medical image fusion, Expert Syst. Appl. 41 (16) (2014) 7425–7435.
[15] Z.D. Liu, H.P. Yin, B. Fang, Y. Chai, A novel fusion scheme for visible and infrared images based on compressive sensing, Opt. Commun. 335 (2015) 168–177.
[16] X.L. Zhang, Y.C. Feng, X.F. Li, S. Wang, The morphological component analysis and its application to color-gray image fusion, J. Comput. Inf. Syst. 9 (24) (2013) 9849–9856.
[17] Y. Jiang, M.H. Wang, Image fusion with morphological component analysis, Inf. Fusion 18 (7) (2014) 107–118.
[18] Z.P. Xu, Medical image fusion using multi-level local extrema, Inf. Fusion 19 (11) (2014) 38–48.
[19] M. Elad, J.-L. Starck, P. Querre, D.L. Donoho, Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA), Appl. Comput. Harmon. Anal. 19 (3) (2005) 340–358.
[20] J. Bobin, J.-L. Starck, J.M. Fadili, Y. Moudden, D.L. Donoho, Morphological component analysis: an adaptive thresholding strategy, IEEE Trans. Image Process. 16 (11) (2007) 2675–2681.
[21] A. Buades, T.M. Le, J.-M. Morel, L.A. Vese, Fast cartoon + texture image filters, IEEE Trans. Image Process. 19 (8) (2010) 1978–1986.
[22] J.-L. Starck, M. Elad, D.L. Donoho, Image decomposition via the combination of sparse representations and a variational approach, IEEE Trans. Image Process. 14 (10) (2005) 1570–1582.
[23] E.J. Candes, M.B. Wakin, S.P. Boyd, Enhancing sparsity by reweighted l1 minimization, J. Fourier Anal. Appl. 14 (5–6) (2008) 877–905.
[24] D. Giacobello, M.G. Christensen, M.N. Murthi, S.H. Jensen, M. Moonen, Enhancing sparsity in linear prediction of speech by iteratively reweighted 1-norm minimization, in: Proceedings of 2010 IEEE International Conference on Acoustics, Speech, & Signal Processing, 2010, pp. 4650–4653.
[25] S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by basis pursuit, SIAM Rev. 43 (1) (2001) 129–159.
[26] S. Haykin, Z. Chen, The cocktail party problem, Neural Comput. 17 (9) (2005) 1875–1902.
[27] M.A. Bee, C. Micheyl, The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it? J. Comp. Psychol. 122 (3) (2008) 235–251.
[28] H. Asari, B.A. Pearlmutter, A.M. Zador, Sparse representations for the cocktail party problem, J. Neurosci. 26 (28) (2006) 7477–7490.
[29] D.J. Luo, C. Ding, H. Huang, Toward structural sparsity: an explicit l2/l0 approach, Knowl. Inf. Syst. 36 (2) (2013) 411–438.
[30] K.P. Wang, Y. Chai, C.X. Su, Sparsely corrupted stimulated scattering signals recovery by iterative re-weighted continuous basis pursuit, Rev. Sci. Instrum. 84 (8) (2013) 083103-1–083103-7.
[31] M. Elad, A.M. Bruckstein, A generalized uncertainty principle and sparse representation in pairs of bases, IEEE Trans. Inf. Theor. 48 (9) (2002) 2558–2567.
[32] J.A. Tropp, Just relax: Convex programming methods for identifying sparse signals in noise, IEEE Trans. Inf. Theor. 52 (3) (2006) 1030–1051.
[33] R. Gribonval, M. Nielsen, Sparse representations in unions of bases, IEEE Trans. Inf. Theor. 49 (12) (2003) 3320–3325.
[34] G.G. Bhutada, R.S. Anand, S.C. Saxena, Edge preserved image enhancement using adaptive fusion of images denoised by wavelet and curvelet transform, Digit. Signal Process. 21 (1) (2011) 118–130.
[35] V.L. Guen, Cartoon + texture image decomposition by the TV-l1 model, Image Process. On Line 4 (2014) 204–219.
[36] E.V.D. Berg, M.P. Friedlander, Probing the pareto frontier for basis pursuit solutions, SIAM J. Sci. Comput. 31 (2) (2008) 890–912.
[37] V. Saligrama, M. Zhao, Thresholded basis pursuit: LP algorithm for order-wise optimal support recovery for sparse and approximately sparse signals from noisy random measurements, IEEE Trans. Inf. Theor. 57 (3) (2011) 1567–1586.
[38] M. Masood, T.Y. Al-Naffouri, Sparse reconstruction using distribution agnostic Bayesian matching pursuit, IEEE Trans. Signal Process. 61 (21) (2013) 5298–5309.
[39] S. Kunis, H. Rauhut, Random sampling of sparse trigonometric polynomials II - orthogonal matching pursuit versus basis pursuit, Found. Comput. Math. 8 (6) (2008) 737–763.
[40] M. Beister, D. Kolditz, W.A. Kalender, Iterative reconstruction methods in X-ray CT, Phys. Medica 28 (2) (2012) 94–108.
[41] C.S. Xydeas, V. Petrovic, Objective image fusion performance measure, Electron. Lett. 36 (4) (2000) 308–309.
[42] V. Petrovic, T. Cootes, R. Pavlovic, Dynamic image fusion performance evaluation, in: Proceedings of the 10th International Conference on Information Fusion, 2007, pp. 1154–1160.