
Information Fusion 35 (2017) 102–116


A novel multi-focus image fusion approach based on image decomposition

Zhaodong Liu a, Yi Chai a,b, Hongpeng Yin a,c,∗, Jiayi Zhou a, Zhiqin Zhu a

a College of Automation, Chongqing University, Chongqing City, 400030, China
b State Key Laboratory of Power Transmission Equipment and System Security and New Technology, College of Automation, Chongqing University, 400030, China
c Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing, 400030, China
∗ Corresponding author. E-mail address: yinhongpeng@gmail.com (H. Yin).

Article history:
Received 17 May 2015
Revised 3 June 2016
Accepted 18 September 2016
Available online 20 September 2016

Keywords: Multi-focus image fusion; Image decomposition; Cartoon content; Texture content

Abstract: Multi-focus image fusion is an effective technique for integrating the relevant information from a set of images of the same scene into one comprehensive image. The fused image is more informative than any of the source images. In this paper, a novel fusion scheme based on image cartoon-texture decomposition is proposed. Multi-focus source images are decomposed into cartoon content and texture content by an improved iterative re-weighted decomposition algorithm, which achieves rapid convergence and naturally approximates the morphological structure components. Proper fusion rules are constructed to fuse the cartoon content and the texture content, respectively. Finally, the fused cartoon and texture components are combined to obtain the all-in-focus image. This fusion process preserves the morphological structure information of the source images and introduces few artifacts and little additional noise. Our experimental results clearly show that the proposed algorithm outperforms many state-of-the-art methods in terms of visual and quantitative evaluations.

© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Multi-focus image fusion can be defined as the process of fusing substantial information from multiple images of the same scene to generate a single composite image. The fused image is more suitable for human visual perception. Currently, multi-focus image fusion technology is widely used in computer vision, clinical medicine, remote sensing, military surveillance, digital imaging, and so on [1–7]. In recent years, a flurry of fusion algorithms have been introduced from many and diverse points of view; they can be categorized into three groups: pixel-level fusion, feature-level fusion and decision-level fusion [1,2,5]. In comparison to the latter two schemes, the main advantage of the pixel-level fusion domain is that the original information is directly involved. Although pixel-level fusion methods suffer from open problems such as high computational complexity, poor fidelity and blocking artifacts, this promising area is receiving ever-increasing attention and remains intensely active. This work concentrates mainly on the pixel-level fusion domain.

The growing appeal of this research area can be observed from the large number of scientific papers [9–15], which can be categorized into two main groups: multi-resolution schemes and multi-spectral schemes. Multi-resolution fusion usually transforms the inputs into a multi-resolution representation and then selects the decomposed information to reconstruct the fused image. These methods enable the efficient combination of relevant information that is spectrally independent or spatially overlapped. Thus, multi-resolution-based fusion algorithms have attracted great research attention, building on different multi-resolution decompositions such as the discrete wavelet transform (DWT) [8], the gradient pyramid [9] and the contrast pyramid [10]. However, these methods are complicated, time-consuming to implement and sensitive to sensor noise. Unlike the former, multi-spectral fusion operates directly on the pixels or regions of the source images; its main principle is to select the clearer pixels or regions to construct the fused image. Averaging, intensity-hue-saturation [11], principal component analysis [12] and independent component analysis [13] based fusion algorithms fall under this category. To the best of our knowledge, the major drawback of multi-spectral fusion methods is that they introduce spatial distortions and blocking artifacts in the resulting fused images. Recently, a corpus of fusion algorithms in the pixel-level fusion domain has been reported to solve these crucial problems, but the topic is largely still open.

http://dx.doi.org/10.1016/j.inffus.2016.09.007

Fig. 1. The basic multi-focus image fusion scheme based on image cartoon-texture decomposition.

Note that the crucial issue is to effectively represent the foundational atoms of the inputs in the pixel-level fusion framework. More recently, a class of image decomposition-based fusion techniques [16–18] has attracted increasing attention. The key motivation lies in the observation that natural images can be efficiently decomposed into morphological structure components: texture contents and cartoon contents. The texture contents hold textures, oscillating patterns, fine details and noise, while the cartoon contents contain the geometric structures, isophotes and piecewise-smooth regions of the source images. To preserve textural information, a new color-gray image fusion algorithm based on Morphological Component Analysis (MCA) was proposed to seek out the most important information [16]. To effectively exploit the morphological diversity of images and the advantages of MCA, a novel multi-component fusion method was presented to generate better fused images [17]. Specific to the problem of high computational complexity, a novel pixel-level fusion algorithm based on multi-level local extrema (MLE) was constructed to reflect the regional pattern and edge information of source images [18]. The generic fusion framework based on image cartoon-texture decomposition is shown in Fig. 1 [16–18]. Generally, the incoherent components of source images can be efficiently separated and represented by image decomposition algorithms, including MCA, Total Variation (TV), TV-l1, TV-l2 and other extended methods [17,19–22]. According to the properties of the morphological structure components, proper fusion rules are constructed to extract complementary contents. Ultimately, the fused cartoon and texture components are combined to generate the final all-in-focus image.

Although the above models are feasible, three main open problems remain in the multi-focus image fusion framework based on image cartoon-texture decomposition: (1) how to construct a proper decomposition algorithm that achieves fast convergence; (2) how to naturally approximate the morphological components; (3) how to extract complementary information to generate the all-in-focus image.

In this work, we aim to solve these general problems: to obtain high-quality separated contents, achieve rapid convergence and obtain the all-in-focus image. In practice, an improved multi-focus image fusion approach is proposed specific to the mentioned problems. On the one hand, the texture and piecewise smooth (cartoon) parts are obtained by an improved iterative re-weighted image decomposition algorithm. From a practical point of view, it removes the problem of information distortion and achieves rapid convergence with an effective underlying representation of the different spatial morphologies. On the other hand, according to the properties of the cartoon and texture contents, proper fusion rules are constructed to fuse the cartoon component and the texture component, respectively; these combine the complementary information into the final fused image. Finally, the two fused components are integrated to generate the final all-in-focus image. This work generates an extended depth-of-focus image, which is more suitable for human visual perception, and overcomes the problems of poor fidelity and blocking artifacts in conventional fusion methods. The main contributions can be summarized as follows:

(1) In the pixel-level fusion domain, it is crucial to achieve high fused-image quality by effectively representing the building atoms of multi-focus images. In this work, an iterative re-weighted image decomposition algorithm is proposed to precisely separate and approximate the morphological structure components. Furthermore, it obtains a globally optimal solution and diminishes part of the noise.

(2) From a practical point of view, high computational complexity imposes restrictions on putting fusion algorithms into practice. The proposed image texture-cartoon decomposition method greatly reduces computational complexity while guaranteeing convergence and robustness.

(3) According to the properties of the separated contents, proper fusion rules are presented to fuse the cartoon component and the texture component, respectively. The all-in-focus image is obtained by combining the fused cartoon and texture contents, and is more suitable for human visual perception. This fusion process preserves the complementary information from the multi-focus source images and introduces as few artifacts and as little additional noise as possible.

The rest of this paper is organized as follows. In Section 2, the novel fusion approach based on image decomposition is proposed. Several experiments are conducted to verify the proposed method in Section 3. Discussions and conclusions are summarized in Section 4.

2. The proposed fusion approach based on image decomposition

To overcome these crucial open research issues, a general fusion approach based on image cartoon-texture decomposition is proposed. First, an improved re-weighted decomposition algorithm is presented; it naturally approximates the cartoon and texture components with low computational complexity. Second, according to the properties of the separated contents, improved fusion rules are utilized to fuse the cartoon content and the texture content, respectively. These extract more curves, edges, anisotropic structures and detailed information into the fused cartoon image and texture image. The all-in-focus image is obtained by combining the complementary information from these fused components. The detailed implementations are illustrated as follows.

2.1. The improved image cartoon-texture decomposition algorithm

The task of decomposing source images into their building atoms is of great interest in many fields. Many image cartoon-texture decomposition methods have been exploited, including MCA, TV and other extended algorithms. Although these well-known algorithms can achieve high performance, how to build a proper algorithm remains an open and difficult issue: the desired algorithm should converge fast and naturally approximate the incoherent components. In this work, a novel iterative re-weighted algorithm is proposed to effectively separate source images into incoherent morphological structure components. More specifically, this work provides a theoretical analysis of the decomposition idea and reveals that a successful separation of image cartoon-texture contents can be found in principle. Simulation results for a 1-dimensional signal and a 2-dimensional image show that the proposed method achieves a feasible and near-optimal approximate solution with less time complexity.

2.1.1. The theoretical analysis of the image decomposition model

Suppose a multi-focus image $X$ ($X \in \mathbb{R}^{N \times N}$); it can be processed as a 1-D vector of length $N^2$ by reordering. In general, a natural image can be separated into two images: a cartoon (piecewise smooth) image and a texture image. In other words, an image $X_t$ contains only the texture component and no pure piecewise smooth component under an over-complete dictionary $D_t$; similarly, an over-complete dictionary $D_n$ can separate the related cartoon component $X_n$ from the source image $X$. A solvable optimization principle for this problem can be defined as [23–33]:

$$\{\alpha_t, \alpha_n\} = \mathop{\mathrm{Arg\,min}}_{\alpha_t, \alpha_n} \;\|\alpha_t\|_1 + \|\alpha_n\|_1 + \lambda \|X - D_t \alpha_t - D_n \alpha_n\|_2^2 + \gamma\, TV\{D_n \alpha_n\}, \qquad (1)$$

where the parameter $\lambda$ balances sparsity against reconstruction error [23], and $\alpha_n$ and $\alpha_t$ stand for the separated cartoon coefficients and the separated texture coefficients, respectively. In this solution, the natural content can be separated from the texture content, and the additional noise is removed as a by-product. Furthermore, note that the noise error norm in the above formula is the $l_2$ norm; the underlying assumption, made in many papers [32,34], is that the additional component behaves like white zero-mean Gaussian noise. In general, other noise models are introduced similarly by other norms, including $l_1$, $l_\infty$ and others.

Given the above solution, high-quality results can be achieved. However, although the above-mentioned formulation is feasible, open and difficult issues remain. On the one hand, its high computational complexity, $O(KN^2 \log_2 2N)$, imposes restrictions on putting this model into practice, where $K$ is the number of iterations of the outer loop and $N$ is the number of iterations of the inner loop. On the other hand, how to naturally approximate the two components remains a crucial open problem. Therefore, it is desirable to construct an approximate model that: (1) reduces computational complexity with guaranteed convergence and robustness; (2) reduces the reconstruction error with the sparsest unreduced representation. Naturally, this alternative convex formulation should converge rapidly and accurately approximate the texture and natural scene components. In this work, an improved algorithm is proposed to optimize the above constrained convex formulation.

Reviewing the decomposition formulation of Eq. (1), the unknowns are the sparse coefficient vectors $\alpha_t$, $\alpha_n$, whose length is far greater than $N^2$. To simplify the separation process, we redefine the unknowns as the texture image, the cartoon image and the unknown noise image, rather than the sparse coefficients. The feasible solution for image decomposition can then be reformulated as:

$$\{X_t, X_n\} = \mathop{\mathrm{Arg\,min}}_{\{X_t, X_n\}} \;\|D_t^{\dagger} X_t + R_t\|_1 + \|D_n^{\dagger} X_n + R_n\|_1 + \lambda \|X - X_t - X_n\|_2^2 + \gamma\, TV\{X_n\}$$
$$\text{s.t.}\quad \alpha_t = D_t^{\dagger} X_t + D_t^{\dagger} r_t = D_t^{\dagger} X_t + R_t,\quad \alpha_n = D_n^{\dagger} X_n + D_n^{\dagger} r_n = D_n^{\dagger} X_n + R_n,\quad R_t = D_t^{\dagger} r_t = 0,\quad R_n = D_n^{\dagger} r_n = 0, \qquad (2)$$

where $R_t$ and $R_n$ are arbitrary matrices in the null-space of the sparse dictionaries $D_t$ and $D_n$, and $D_t^{\dagger}$ and $D_n^{\dagger}$ are the Moore-Penrose pseudo-inverses of $D_t$ and $D_n$. The terms $D_t^{\dagger} X_t$ and $D_n^{\dagger} X_n$ in Eq. (2) represent the linear transforms of the texture image and the natural scene image, respectively.

However, can we construct an alternative algorithm that achieves fast convergence and an accurate, tractable solution based on $l_0$ or $l_1$ norm minimization? Reviewing the properties of the $l_0$ and $l_1$ norms, a critical distinction between them lies in how they treat magnitude. Naturally, is there a feasible way to penalize larger coefficients more lightly than smaller sparse coefficients in $l_1$ norm minimization? In this work, an efficient re-weighted algorithm for separating the texture and cartoon contents based on $l_1$ norm minimization is proposed. The optimal solution can be reformulated as:

$$\{X_t, X_n\} = \mathop{\mathrm{Arg\,min}}_{\{X_t, X_n\}} \;\|D_t^{\dagger} W_t X_t + R_t\|_1 + \|D_n^{\dagger} W_n X_n + R_n\|_1 + \lambda \|X - X_t - X_n\|_2^2 + \gamma\, TV\{X_n\}$$
$$\text{s.t.}\quad W_t = \mathrm{diag}\!\left(\frac{1}{|X_t| + \varepsilon}\right),\quad W_n = \mathrm{diag}\!\left(\frac{1}{|X_n| + \varepsilon}\right), \qquad (3)$$

where $W_t$ and $W_n$ are positive weight matrices; each is a diagonal matrix with non-zero diagonal values that hold the geometrical structure characteristics of the source multi-focus images. Naturally, the representation matrix of the source images is formed by stacking its columns successively. $\varepsilon$ is a non-zero constant introduced to enhance the stability of the proposed algorithm; furthermore, it prevents a non-zero estimate for a truly zero-valued content in the texture image $X_t$ or the cartoon image $X_n$. In general, $\varepsilon$ should take a value smaller than the expected error magnitudes.
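To make the re-weighting idea of Eq. (3) concrete, the following sketch applies it to a small 1-D sparse recovery problem of the kind used later in Section 2.1.2. It is a minimal illustration under stated assumptions, not the authors' solver: the weighted problem is written in the Lagrangian form 0.5·‖y − Dx‖₂² + λ‖Wx‖₁ and minimized by plain iterative soft-thresholding (ISTA), with W = diag(1/(|x| + ε)) refreshed once per outer pass; the random dictionary, signal length and parameter values are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of the (weighted) l1 norm: element-wise shrinkage.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def reweighted_l1(D, y, n_outer=2, n_inner=300, lam=0.05, eps=1e-3):
    """Minimize 0.5*||y - D x||_2^2 + lam*||W x||_1 by ISTA, refreshing
    W = diag(1 / (|x| + eps)) after each outer pass (cf. Eq. (3))."""
    x = np.zeros(D.shape[1])
    w = np.ones_like(x)                      # unit weights on the first pass
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the data term
    for _ in range(n_outer):                 # two passes suffice, as the paper notes
        for _ in range(n_inner):
            x = soft_threshold(x - step * (D.T @ (D @ x - y)), step * lam * w)
        w = 1.0 / (np.abs(x) + eps)          # large |x|: light penalty, and vice versa
    return x

# Toy demo: an 8-sparse vector recovered from 64 random measurements.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
x0 = np.zeros(256)
x0[rng.choice(256, 8, replace=False)] = 3.0 * rng.standard_normal(8)
y = D @ x0
print("recovery error:", np.linalg.norm(reweighted_l1(D, y) - x0))
```

The second pass is where the re-weighting pays off: coefficients that survived the first pass are penalized lightly, while near-zero ones are pushed to exact zero, which mirrors the behavior illustrated in Fig. 2(c).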

Fig. 2. Reconstructing the signal X0 by ‖·‖1 and W‖·‖1 norm minimization. (a) A feasible set between y = DX and the ball of radius ‖X0‖1; (b) An infeasible set between y = DX and the ball of radius ‖X0‖1; (c) A commendable recovery by the re-weighted l1 norm ‖WX‖1.

Naturally, the above model immediately raises a question: can the proposed re-weighted algorithm achieve a high-quality reconstruction, obtain a feasible and near-optimal approximate solution, and converge fast? For the sake of demonstration, a schematic instance is shown in Fig. 2.

As shown in Fig. 2, a simulated example compares $l_1$ relaxation with re-weighted $l_1$ relaxation in detail. The plain $l_1$ relaxations can reach different solutions, both feasible and infeasible, as shown in Fig. 2(a) and (b), whereas Fig. 2(c) demonstrates that the re-weighted $l_1$ relaxation algorithm obtains a unique solution. It also indicates that large re-weighting values damp coefficients that should be zero, while small re-weighting values encourage genuinely non-zero coefficients. In other words, the re-weighted $l_1$ relaxation algorithm can preserve the complementary information of the source images. The proof of convergence of this algorithm is similar to that of the basis pursuit method given in [23–25], with slight modifications, and is thus omitted here. Notably, unlike MCA and TV, which loop many times, repeating this procedure twice is sufficient in the proposed method. In brief, the proposed algorithm preserves the structure information in the cartoon image, such as the main image regions and edges, while protecting the oscillating components and discouraging additional noise in the texture image.

Reviewing the procedure of image decomposition, proper dictionaries $D_t$ and $D_n$ play an important role in representing the texture and piecewise smooth components. One practical way to choose $D_t$ and $D_n$ is to use well-known transforms, and in this work we do so for both the cartoon content and the texture content. To constrain the choice of $D_t$ and $D_n$, two requirements should be met: (1) easy implementation with low computational complexity; (2) effectiveness in representing the texture component or the piecewise smooth component. Therefore, the Curvelet transform and the local Discrete Cosine Transform (DCT) are used to represent the cartoon content and the texture content, respectively. The motivations for the transform basis selection are briefly given below.

A: The Curvelet transform for the cartoon component

Roughly speaking, the basic principle of the Curvelet transform is to separate an image into a set of wavelet sub-bands, and then to analyze each sub-band by means of a local ridgelet transform [33,34]. It represents source images with abundant structure information and edges more efficiently than wavelets or other conventional representations. In particular, the Curvelet transform achieves exact reconstruction and stability owing to its invertibility and robustness. Therefore, the Curvelet transform is used to represent the cartoon components, which include the anisotropic structures and profiles of the source images.

B: The local Discrete Cosine Transform for the texture component

The local DCT is a member of the extended family of the Discrete Fourier Transform. Its basic principle is to transform the image from the spatial domain to the frequency domain [14,35]. It can commendably represent the details, lines and periodic components of a natural image. Furthermore, the DCT simplifies computation by replacing complex analysis with real numbers. Thus, the local DCT is used to approximate the building atoms of the texture components, which consist of details and periodic structural contents.

In this work, when the Curvelet transform is used to represent the cartoon contents, ringing artifacts are caused by the oscillations of the Curvelet atoms. In much of the literature, TV is used to damp the ringing artifacts near edges [19,20,22]. In fact, the essential principle of the expression $TV\{D_n \alpha_n\}$ is to compute the image $X_n = D_n \alpha_n$ and apply the TV norm to it; the cartoon image becomes closer to a piecewise smooth image when penalized with TV. Therefore, to obtain better expected components $D_t \alpha_t$ and $D_n \alpha_n$, a total variation (TV) regularization scheme is utilized to make the cartoon image $D_n \alpha_n$ match the natural scene content model [35].

In brief, the chosen dictionaries are utilized to decompose the multi-focus images into texture content and cartoon content, and an efficient iterative re-weighted algorithm is presented to obtain a feasible and near-optimal approximate solution, whose basic mathematical model is formulated as Eq. (3). The main steps of the iterative re-weighted decomposition algorithm are as follows.

1. Initialize $L_{max} = 5$, $\delta = \lambda L_{max}$, $\lambda = 0.4$, $\varepsilon = 10^{-3}$, $X_n = X$, $X_t = 0$ (or $X_t = 0.1X$), and $W_t$, $W_n$ as identity matrices.

2. Solve the re-weighted problem of Eq. (3) for $\{X_t, X_n\}$.

3. Loop:

(a) Update $X_n$, assuming $X_t$ is fixed:
$$r_n = X - X_t - X_n,\qquad \alpha_n = D_n^{\dagger} W_n (X_n + r_n),\qquad X_n = D_n \hat{\alpha}_n,$$
where $\hat{\alpha}_n$ denotes the approximate coefficients, $\alpha_n$ the sparse coefficients, $D_n$ the Curvelet transform dictionary, $D_n^{\dagger}$ its Moore-Penrose pseudo-inverse, and $r_n$ the residual.

(b) Update $W_n$: $W_n = \mathrm{diag}\!\left(\dfrac{1}{|X_n| + \varepsilon}\right)$.

(c) Update $X_t$, assuming $X_n$ is fixed:
$$r_t = X - X_t - X_n,\qquad \alpha_t = D_t^{\dagger} W_t (X_t + r_t),\qquad X_t = D_t \hat{\alpha}_t,$$
where $\hat{\alpha}_t$ denotes the approximate coefficients, $\alpha_t$ the sparse coefficients, $D_t$ the local DCT dictionary, $D_t^{\dagger}$ its Moore-Penrose pseudo-inverse, and $r_t$ the residual.

(d) Update $W_t$: $W_t = \mathrm{diag}\!\left(\dfrac{1}{|X_t| + \varepsilon}\right)$.

(e) Damp ringing artifacts by TV: $X_n = X_n - \mu\gamma\,(\partial\, TV\{X_n\}/\partial X_n)$; update $W_n$ accordingly; update the threshold: $\delta = \delta - \lambda$.

4. Terminate when the threshold condition is satisfied ($\delta < \lambda$), with outputs $X_n = D_n \hat{\alpha}_n$ and $X_t = D_t \hat{\alpha}_t$.
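The loop above can be sketched in a few lines. The fragment below is a loose re-implementation under explicit substitutions, not the authors' code: a Gaussian low-pass stands in for the Curvelet dictionary Dn, a global orthonormal DCT for the local DCT dictionary Dt, the re-weighting is applied in the coefficient domain of the texture branch only, and the TV step of (e) uses a crude sign-based subgradient. The parameter values follow step 1 but would need tuning to the intensity range of real images.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def shrink(c, tau):
    # Soft shrinkage of transform coefficients (the l1 proximal step).
    return np.sign(c) * np.maximum(np.abs(c) - tau, 0.0)

def decompose(X, L_max=5, lam=0.4, eps=1e-3, mu_gamma=0.05):
    """Alternating cartoon/texture updates with a decreasing threshold delta,
    re-weighted shrinkage on the texture branch, and TV damping (steps 1-4)."""
    X = X.astype(float)
    Xn, Xt = X.copy(), np.zeros_like(X)        # step 1: Xn = X, Xt = 0
    delta = lam * L_max
    while delta >= lam:                        # step 4: stop once delta < lam
        rn = X - Xt - Xn                       # (a) update the cartoon part
        Xn = gaussian_filter(Xn + rn, sigma=2.0)
        rt = X - Xt - Xn                       # (c) update the texture part
        ct = dctn(Xt + rt, norm="ortho")
        # (b)/(d) re-weighting: large coefficients get a near-zero threshold,
        # small (likely-noise) coefficients get the full threshold delta.
        w = 1.0 / (np.abs(ct) / (np.abs(ct).max() + eps) + eps)
        Xt = idctn(shrink(ct, delta * w / w.max()), norm="ortho")
        gx = np.sign(np.gradient(Xn, axis=0))  # (e) one TV descent step on Xn
        gy = np.sign(np.gradient(Xn, axis=1))
        tv_sub = -(np.gradient(gx, axis=0) + np.gradient(gy, axis=1))
        Xn = Xn - mu_gamma * tv_sub
        delta -= lam                           # anneal the threshold
    return Xn, Xt
```

With the stated defaults the loop runs exactly five passes, matching the small fixed iteration count that gives the method its speed advantage over MCA and TV in Tables 1, 3 and 5.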

Fig. 3. The experimental results by l1 relaxation method and the re-weighted l1 relaxation algorithm. (a) The original signal marked in blue ‘∗ ’ and the recovery signal by l1
relaxation method marked in green; (b) The original signal marked in blue ‘∗ ’ and the recovery signal by the proposed re-weighted l1 relaxation algorithm. (For interpretation
of the references to colour in this figure legend, the reader is referred to the web version of this article.)

2.1.2. The experimental demonstration of the proposed decomposition model

As mentioned above, an underlying model to describe image content is proposed; in theory, it separates the source images into texture and piecewise smooth (cartoon) parts. It is natural to ask whether the presented algorithm achieves high-quality performance in real applications. In this subsection, comparative experiments on a 1-D signal are sketched between the re-weighted $l_1$ relaxation algorithm and other $l_1$ relaxation algorithms. The experimental results show that the proposed method outperforms the classical methods in terms of reconstruction error, and obtains a better trade-off between computational complexity and reconstruction accuracy. On 2-D natural images, the presented algorithm achieves better performance than the known decomposition methods, completely representing and separating the morphological structure components.

To provide a more intuitive display, a practical example on a 1-dimensional signal is given to verify the performance of the re-weighted $l_1$ relaxation algorithm in Fig. 3. The original signal contains 1000 discrete entries, marked in blue '∗' in Fig. 3(a) and (b). To illustrate the property, the experiments are carried out with the $l_1$ relaxation method and the re-weighted $l_1$ relaxation algorithm.

In Fig. 3(a), the distinction between the original signal and the signal recovered by $l_1$ relaxation is clearly visible. Interestingly, the signal reconstructed by the novel re-weighted $l_1$ relaxation algorithm closely approximates the sparse discrete entries, as shown in Fig. 3(b). In a nutshell, re-weighting values inversely proportional to the true signal magnitudes achieve a feasible and near-optimal approximate solution with less time complexity. More generally, many algorithms approximate $l_0$ or $l_1$ norm minimization, including Basis Pursuit (BP), Matching Pursuit (MP), Orthogonal Matching Pursuit (OMP) and Iteratively Re-weighted Least Squares (IRLS) [36–40]. However, these methods typically either guarantee the reconstructed image quality or achieve a fast computation speed, but not both: the faster schemes may lead to a high reconstruction error, as can be seen in Fig. 4.

As shown in Fig. 4, the proposed approach obtains a lower probability of error than the conventional algorithms under different cardinalities of the true solution. Although the running time of the proposed approach is much higher than that of the MP and OMP methods, it guarantees the quality of the reconstructed images and, as an extension of the BP method, greatly reduces the running time. Therefore, the proposed approach achieves a feasible and near-optimal approximate solution with less time complexity.

In the image decomposition based fusion framework, the underlying representation of the inputs (the image cartoon-texture decomposition) is the key step. Namely, how to completely and accurately separate multi-focus images into cartoon and texture components is a crucial issue. In this section, experiments are sketched to demonstrate the effectiveness and accuracy of the proposed image decomposition algorithm, as shown in Fig. 5.

The decomposition results of multi-focus images are shown in Fig. 5. The multi-focus images are shown in Fig. 5(a), while the texture components and cartoon components obtained by MCA, TV-l1 and the proposed iterative re-weighted decomposition algorithm are shown in Fig. 5(b)–(g). As shown in Fig. 5(d) and (e), part of the texture component remains in the cartoon image. In comparison, the texture and cartoon contents acquired by MCA and by the proposed algorithm are well separated, as shown in Fig. 5(b), (c), (f) and (g). However, the MCA method is not stable and robust for all multi-focus images.

Fig. 4. The probability of error and running time under diverse methods. (a) The probability of error obtained by the proposed approach and the conventional method; (b)
The running time obtained by the proposed approach and the conventional method.

Fig. 5. The results of image decomposition by the proposed iterative re-weighted algorithm, MCA and TV-l1 methods. (a) The source multi-focus images; (b) The texture
image obtained by MCA; (c) The cartoon image obtained by MCA; (d) The texture image obtained by TV-l1 ; (e) The cartoon image obtained by TV-l1 ; (f) The texture image
obtained by the proposed iterative re-weighted algorithm; (g) The cartoon image obtained by the proposed iterative re-weighted algorithm.

For a source image with many texture components, the cartoon image produced by MCA still contains a definite texture component; observing the separated cartoon component of the source image with a ball, part of the cartoon component also remains in the texture image. To provide visually explicit evidence, the concept of the image spectrogram is adopted to demonstrate the difference between the separated texture image and the separated cartoon image. The image spectrogram represents the intensity of variation between one point and its neighborhood points; in other words, it is the distribution of the image gradient (or image energy). According to this principle, the image spectrogram of the cartoon component should concentrate energy at low frequencies (the bright central part), while the image spectrogram of the texture component should show the rapidly varying part with low luminance (low energy). In this work, the image spectrogram of one of the experimental images is shown in Fig. 6.

The source image and its spectrogram are shown in Fig. 6(a) and (b). Comparatively speaking, the image spectrogram of the cartoon component in Fig. 6(c) indicates that the TV-l1 model cannot properly represent the cartoon component, and the image spectrogram of the texture component in Fig. 6(d) illustrates that the texture image separated by the TV-l1 model contains partial cartoon contents. In comparison, the image spectrograms of the cartoon contents and texture contents demonstrate that the separated components are well represented by the MCA model and by the proposed algorithm, as shown in Fig. 6(e)–(h). Carefully observing the image spectrograms of the texture components in Fig. 6(f) and (h), much more luminance information is obtained by the MCA model than by the proposed approach; in other words, the texture image obtained by the MCA model also contains partial cartoon components. In brief, the proposed algorithm separates the morphological structure components commendably in comparison to the other two methods.
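The spectrogram comparison described above is easy to reproduce. The snippet below computes the centered log-magnitude spectrum of a component and a simple low-frequency energy share; both helper names (`log_spectrum`, `lowfreq_ratio`) are hypothetical, introduced only for illustration. A well-separated cartoon component should give a clearly higher low-frequency share than its texture counterpart.

```python
import numpy as np

def log_spectrum(img):
    # Centered log-magnitude spectrum (the "image spectrogram") of a component.
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))

def lowfreq_ratio(img, frac=0.1):
    # Share of spectral energy inside the central (low-frequency) window;
    # expected high for cartoon parts and low for texture parts.
    P = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = P.shape
    ch, cw = max(1, int(h * frac)), max(1, int(w * frac))
    return P[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw].sum() / P.sum()
```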
2.2. The improved multi-focus image fusion rules

As mentioned above, a multi-focus image can be completely and effectively separated into two incoherent components: cartoon content and texture content. The cartoon content contains the main geometry information and profile of the multi-focus images, obtained via the Curvelet transform, while the texture content mainly consists of the periodic texture information and partial additional noise, obtained via the local DCT. Thus, different fusion rules should be constructed for the morphological contents to preserve as much structural information as possible; they are illustrated in detail as follows.

Fig. 6. The frequency spectrum of cartoon and texture images obtained by TV-l1 model, MCA model and the proposed algorithm on the multi-focus "Color_Book" image. (a) The
source “Color_Book” image; (b) The frequency spectrum of source image; (c) The frequency spectrum of cartoon image obtained by the proposed decomposition algorithm;
(d) The frequency spectrum of texture image obtained by the proposed decomposition algorithm; (e) The frequency spectrum of cartoon image obtained by MCA model; (f)
The frequency spectrum of texture image obtained by MCA model; (g) The frequency spectrum of cartoon image obtained by TV-l1 model; (h) The frequency spectrum of
texture image obtained by TV-l1 model.

A: Fusion rule for the cartoon image

The separated cartoon part contains the main geometric structural behaviors, including object hues and sharp boundaries. The fundamental goal of cartoon component fusion is to preserve this geometric structural information. Inspired by the "coefficient-abs-max" fusion rule, an extended fusion rule based on variance is constructed to fuse the cartoon images. Variance, as an activity measure, can reflect the cartoon components in multi-focus images. Therefore, in this work, an extension of variance is defined as follows:

$$\sigma_{An}(i,j) = |X_{An}(i,j) - u_{An}|,\qquad \sigma_{Bn}(i,j) = |X_{Bn}(i,j) - u_{Bn}|,$$
$$\text{s.t.}\quad u_{An} = \frac{1}{N}\sum_{1 \le i,j \le N} X_{An}(i,j),\qquad u_{Bn} = \frac{1}{N}\sum_{1 \le i,j \le N} X_{Bn}(i,j). \qquad (4)$$

In general, the defined variance is large for the salient contents of the cartoon part. To transmit the complementary information into the fused cartoon image, an effective fusion rule based on this extension of variance is formulated as:

$$X_{Fn}(i,j) = \begin{cases} X_{An}(i,j), & \sigma_{An}(i,j) \ge \sigma_{Bn}(i,j), \\ X_{Bn}(i,j), & \sigma_{An}(i,j) < \sigma_{Bn}(i,j). \end{cases} \qquad (5)$$

With this fusion rule, the important information in the separated cartoon components is naturally selected and transmitted into the fused cartoon image. In this sense, a high-quality cartoon image can be obtained by this improved rule, which is called the "variance-abs-max" rule.
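The "variance-abs-max" rule translates directly into code. A minimal sketch, assuming the cartoon components are registered arrays of equal size:

```python
import numpy as np

def fuse_cartoon(XAn, XBn):
    """'Variance-abs-max' rule of Eqs. (4)-(5): at each pixel, keep the cartoon
    value whose absolute deviation from its own image mean is larger."""
    sigA = np.abs(XAn - XAn.mean())          # activity measure of Eq. (4)
    sigB = np.abs(XBn - XBn.mean())
    return np.where(sigA >= sigB, XAn, XBn)  # selection rule of Eq. (5)
```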
B: Fusion rule for the texture image

Texture is characterized as repeated and meaningful structures of small patterns, whereas noise is characterized as uncorrelated random patterns; a certain amount of noise is therefore naturally embedded in the texture image. In brief, a texture image has the following properties: (1) it obeys some statistical properties; (2) it owns similar structures repeated over and over again; (3) it carries some degree of randomness. Therefore, when constructing a proper fusion rule to extract the significant components from a texture image, the normal texture contents should be preserved and the noise contents should be weakened. Inspired by work on image assessment criteria, a novel fusion rule for texture components is proposed.

To improve the fusion accuracy, energy and structural similarity are used to construct the fusion rule for the texture image. The energy is defined as:

$$En(t1) = \frac{1}{m \times n}\sum_{i=1}^{m}\sum_{j=1}^{n} X_{At}(i,j)^2,\qquad En(n1) = \frac{1}{m \times n}\sum_{i=1}^{m}\sum_{j=1}^{n} X_{An}(i,j)^2,$$
$$En(t2) = \frac{1}{m \times n}\sum_{i=1}^{m}\sum_{j=1}^{n} X_{Bt}(i,j)^2,\qquad En(n2) = \frac{1}{m \times n}\sum_{i=1}^{m}\sum_{j=1}^{n} X_{Bn}(i,j)^2, \qquad (6)$$

where $En(t1)$, $En(n1)$, $En(t2)$ and $En(n2)$ represent the energies of the texture and cartoon components separated from source images A and B, respectively; $X_{At}$, $X_{An}$ denote the pixels of the texture and cartoon images of source image A, while $X_{Bt}$, $X_{Bn}$ denote those of source image B; and $m$, $n$ give the size of the source image.

In addition, natural images contain strong structural features and dependencies, and the human visual system is highly adapted to extracting structure information from natural images. Structural similarity is also a major measurement of structure distortion. In this work, the structural similarity $S$ between the source images A, B and the texture or cartoon images $X_{At}$, $X_{An}$, $X_{Bt}$, $X_{Bn}$ is defined as:

$$S(A, X_{At}) = \frac{(2\mu_A \mu_{X_{At}} + C_1)(2\sigma_{A,X_{At}} + C_2)}{(\mu_A^2 + \mu_{X_{At}}^2 + C_1)(\sigma_A^2 + \sigma_{X_{At}}^2 + C_2)},\qquad S(A, X_{An}) = \frac{(2\mu_A \mu_{X_{An}} + C_1)(2\sigma_{A,X_{An}} + C_2)}{(\mu_A^2 + \mu_{X_{An}}^2 + C_1)(\sigma_A^2 + \sigma_{X_{An}}^2 + C_2)},$$
$$S(B, X_{Bt}) = \frac{(2\mu_B \mu_{X_{Bt}} + C_1)(2\sigma_{B,X_{Bt}} + C_2)}{(\mu_B^2 + \mu_{X_{Bt}}^2 + C_1)(\sigma_B^2 + \sigma_{X_{Bt}}^2 + C_2)},\qquad S(B, X_{Bn}) = \frac{(2\mu_B \mu_{X_{Bn}} + C_1)(2\sigma_{B,X_{Bn}} + C_2)}{(\mu_B^2 + \mu_{X_{Bn}}^2 + C_1)(\sigma_B^2 + \sigma_{X_{Bn}}^2 + C_2)}, \qquad (7)$$

where $\mu$ denotes the average value of the image pixels, $\sigma_{*}^2$ stands for the variance, and $\sigma_{A,X_{At}}$, $\sigma_{A,X_{An}}$, $\sigma_{B,X_{Bt}}$, $\sigma_{B,X_{Bn}}$ represent the covariances. Small constants $C_1$, $C_2$ are used to avoid instability around zero.

variance. Small constants C1 , C2 are used to avoid the instability at which are shown as:
a zero interesting point. ⎧ A ⎫
⎪ gn,m A ⎪
According to the above description, two distinct combination ⎪
⎨ XAt , gXAt
n,m > g n,m ⎪

gn,m
modes (selection and averaging) are exploited to merge the tex- GA,X
n,m =
At
, (10)

⎪ gXAt ⎪
ture components into the final fused version. The novel fusion rule
⎩ An,m , otherwise ⎪

for texture images are calculated as Eq. (8). gn,m
XAt ∗ min(w0 (t1 ), w1 (t1 )) + XBt ∗ min(w0 (t2 ), w1 (t2 ))   
XF t =
A,XAt
Hn,m = 2π −1 αn,m
XAt A 
− αn,m − π /2, (11)
min(w0 (t1 ), w1 (t1 )) + XBt ∗ min(w0 (t2 ), w1 (t2 ))
En(t 1 ) where, gradient strength g and orientation α ∈ (0, π ) are extracted
w0 (t 1 ) = at each location n, m from each image using the Sobel operator.
En(t 1 ) + En(n1 )
The constants , kg , σ g , kα , σ α determine the exact shape of the
En(t 2 ) sigmoid functions used to form the edge strength and orientation
w0 (t 2 ) =
En(t 2 ) + En(n2 ) preservation values.
s.t .
S(t 1 ) As mentioned above, the fused cartoon image contains the
w1 (t 1 ) =
S(t 1 ) + S(n1 ) piecewise smooth changes in the luminous intensity or the marked
contents, and edges, which should be preserved as much as pos-
S(t 2 )
w1 (t 2 ) = , sible. The fused texture image describes the detailed information
S(t 2 ) + S(n2 )
in the regions enclosed by edges including the additional noise.
(8) Therefore, this part should bring in the additional noise as less
where, w0 (t1), w0 (t2) denote the weight between the sepa- as possible. Motivated by these principles, an extended version of
rated texture components and the separated cartoon components. weighted fusion rule is proposed to integrate the fused texture im-
w1 (t1), w1 (t2) represent the ratio of structural similarity between age and the fused cartoon image. The novel fusion rule is defined
the texture and cartoon images. min(w0 (t1), w1 (t1)), min(w0 (t2), as:
w1 (t2)) represent the importance of the coefficients in the final 1
XF = (Q(A,XAn ) + Q(B,XBn ) )XF n + (Q + Q(B,XBt ) )XF t
fused texture image. 2 (A,XAt )
The above formulas represent the fusion operator constructed
Q(A,XAn )
for fusing texture components. As is well-known, few cartoon com- Q(A,XAn ) =
ponents and partial additional noise may be introduced into the Q(A,XAn ) + Q(A,XAt )
texture content. According to the properties of the texture con- Q(A,XAt )
Q(A,XAt ) = (12)
tents, the “min” function is characterized by energy and structural Q(A,XAn ) + Q(A,XAt )
similarity to remove the cartoon comments and noise information. s.t.
Q(B,XBn )
Based on this operator, the fused texture components contain more Q(B,XBn ) =
Q(B,XBn ) + Q(B,XBt )
significant features and introduce as less artifacts or inconsistency
as possible. Q(B,XBt )
Q(B,XBt ) = ,
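A compact sketch of the texture rule follows. Two caveats: the denominator of Eq. (8) is read here as the sum of the two "min" weights (so the weights are normalized), since that line is garbled in the source; and the single-window SSIM of Eq. (7) is computed from global image statistics, with illustrative constants C1, C2.

```python
import numpy as np

def energy(img):
    # Mean squared intensity: the component energy of Eq. (6).
    return np.mean(img.astype(float) ** 2)

def ssim_global(x, y, C1=1e-4, C2=9e-4):
    # Single-window structural similarity of Eq. (7), from global statistics
    # (not the usual sliding-window SSIM).
    x, y = x.astype(float), y.astype(float)
    mx, my = x.mean(), y.mean()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (x.var() + y.var() + C2))

def fuse_texture(A, B, XAt, XAn, XBt, XBn):
    """Weighted texture fusion of Eq. (8): each texture component is weighted
    by min(energy share, SSIM share), suppressing cartoon leakage and noise."""
    w0_1 = energy(XAt) / (energy(XAt) + energy(XAn))
    w0_2 = energy(XBt) / (energy(XBt) + energy(XBn))
    w1_1 = ssim_global(A, XAt) / (ssim_global(A, XAt) + ssim_global(A, XAn))
    w1_2 = ssim_global(B, XBt) / (ssim_global(B, XBt) + ssim_global(B, XBn))
    wA, wB = min(w0_1, w1_1), min(w0_2, w1_2)
    return (wA * XAt + wB * XBt) / (wA + wB)
```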
C: Fusion rule for the cartoon image and the texture image

In the fusion framework, the key issue in designing proper fusion rules is to integrate the important information into the final fused image. When a natural image is decomposed into two independent components with their own inherent characteristics, the fused image cannot be obtained by directly adding or subtracting the components. In particular, the cartoon part contains curves, edges and anisotropic structures, while the texture part mainly represents details, lines and periodic behaviors, together with some degree of noise. Therefore, an improved fusion rule based on edge preservation is proposed to preserve the geometric structural information; the edge preservation measure stands for the edge information inherited from the separated image. For the source images A, B and the cartoon and texture images $X_{At}$, $X_{An}$, $X_{Bt}$, $X_{Bn}$, the spatial information preservation measures $Q^{A,X_{At}}$, $Q^{A,X_{An}}$, $Q^{B,X_{Bt}}$, $Q^{B,X_{Bn}}$ [41,42] are defined as:

$$Q^{S,Y}_{n,m} = \frac{\Gamma}{\bigl(1 + e^{k_g (G^{S,Y}_{n,m} - \sigma_g)}\bigr)\bigl(1 + e^{k_\alpha (H^{S,Y}_{n,m} - \sigma_\alpha)}\bigr)},\qquad (S,Y) \in \{(A, X_{At}), (A, X_{An}), (B, X_{Bt}), (B, X_{Bn})\}. \qquad (9)$$

In Eq. (9), the factor pair $G(\cdot)$ and $H(\cdot)$ defines the relative strength (gradient information) and the orientation "change", which are given as:

$$G^{A,X_{At}}_{n,m} = \begin{cases} g^{A}_{n,m} \big/ g^{X_{At}}_{n,m}, & g^{X_{At}}_{n,m} > g^{A}_{n,m}, \\[2pt] g^{X_{At}}_{n,m} \big/ g^{A}_{n,m}, & \text{otherwise}, \end{cases} \qquad (10)$$

$$H^{A,X_{At}}_{n,m} = \frac{2}{\pi}\,\Bigl|\,\bigl|\alpha^{X_{At}}_{n,m} - \alpha^{A}_{n,m}\bigr| - \frac{\pi}{2}\Bigr|, \qquad (11)$$

where the gradient strength $g$ and the orientation $\alpha \in (0, \pi)$ are extracted at each location $(n, m)$ of each image using the Sobel operator, and the constants $\Gamma$, $k_g$, $\sigma_g$, $k_\alpha$, $\sigma_\alpha$ determine the exact shapes of the sigmoid functions used to form the edge-strength and orientation preservation values.

As mentioned above, the fused cartoon image contains the piecewise smooth changes in luminous intensity, the salient contents and the edges, which should be preserved as much as possible, while the fused texture image describes the detailed information in the regions enclosed by edges, including the additional noise, and should therefore bring in as little additional noise as possible. Motivated by these principles, an extended version of the weighted fusion rule is proposed to integrate the fused texture image and the fused cartoon image. The novel fusion rule is defined as:

$$X_F = \frac{1}{2}\Bigl[\bigl(\bar{Q}_{(A,X_{An})} + \bar{Q}_{(B,X_{Bn})}\bigr) X_{Fn} + \bigl(\bar{Q}_{(A,X_{At})} + \bar{Q}_{(B,X_{Bt})}\bigr) X_{Ft}\Bigr],$$
$$\text{s.t.}\quad \bar{Q}_{(A,X_{An})} = \frac{Q_{(A,X_{An})}}{Q_{(A,X_{An})} + Q_{(A,X_{At})}},\quad \bar{Q}_{(A,X_{At})} = \frac{Q_{(A,X_{At})}}{Q_{(A,X_{An})} + Q_{(A,X_{At})}},\quad \bar{Q}_{(B,X_{Bn})} = \frac{Q_{(B,X_{Bn})}}{Q_{(B,X_{Bn})} + Q_{(B,X_{Bt})}},\quad \bar{Q}_{(B,X_{Bt})} = \frac{Q_{(B,X_{Bt})}}{Q_{(B,X_{Bn})} + Q_{(B,X_{Bt})}}. \qquad (12)$$

Through the above procedures, the improved fusion rules fuse the cartoon and texture components, respectively, and the fused components are in turn combined by a proper fusion rule. These procedures carry the prominent information without introducing distortion and enhance the quality of the fused image. In the next section, several experiments are sketched to illustrate the performance of the novel approach.
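Eqs. (9)–(12) can be sketched as follows. This is an approximation, not the reference implementation of [41,42]: the per-pixel measure Q is collapsed to its mean so that Eq. (12) uses scalar weights, and the sigmoid constants (Gamma, kg, sg, ka, sa) are placeholder values rather than the ones used in the paper.

```python
import numpy as np
from scipy.ndimage import sobel

def edge_preservation(src, comp, Gamma=1.0, kg=-10.0, sg=0.5, ka=-20.0, sa=0.75):
    """Mean edge-preservation measure Q of Eqs. (9)-(11), Xydeas-Petrovic style."""
    def grad(im):
        gx, gy = sobel(im, axis=0), sobel(im, axis=1)
        g = np.hypot(gx, gy)                      # gradient strength
        a = np.mod(np.arctan2(gy, gx), np.pi)     # orientation in (0, pi)
        return g, a
    g1, a1 = grad(src.astype(float))
    g2, a2 = grad(comp.astype(float))
    eps = 1e-12
    G = np.where(g2 > g1, g1 / (g2 + eps), g2 / (g1 + eps))    # Eq. (10)
    H = (2.0 / np.pi) * np.abs(np.abs(a2 - a1) - np.pi / 2)    # Eq. (11)
    Q = Gamma / ((1 + np.exp(kg * (G - sg))) * (1 + np.exp(ka * (H - sa))))
    return Q.mean()

def fuse_components(A, B, XFn, XFt, XAn, XAt, XBn, XBt):
    # Final combination of Eq. (12): Q-based shares decide how strongly the
    # fused cartoon and fused texture layers enter the all-in-focus image.
    QAn, QAt = edge_preservation(A, XAn), edge_preservation(A, XAt)
    QBn, QBt = edge_preservation(B, XBn), edge_preservation(B, XBt)
    wn = QAn / (QAn + QAt) + QBn / (QBn + QBt)
    wt = QAt / (QAn + QAt) + QBt / (QBn + QBt)
    return 0.5 * (wn * XFn + wt * XFt)
```

Note that wn + wt = 2 by construction, so the factor 1/2 in Eq. (12) keeps the fused image on the scale of the inputs.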
3. Multi-focus image fusion experiments

Cameras with a finite depth of field cannot capture all relevant and complementary information in focus in light optical imaging systems. Multi-focus image fusion is an effective process of combining all the complementary information into a highly focused image, which provides a suitable view for human or machine perception. In this work, a novel multi-focus image fusion approach based on image decomposition is proposed. Firstly, multi-focus images are decomposed into cartoon and texture components by an iterative re-weighted decomposition algorithm. Secondly, the cartoon image and the texture image are fused by different fusion rules according to their properties. Finally, the fused images are obtained, and comparison experiments are conducted to illustrate the effectiveness of the proposed approach.

Objective evaluation of the fused images is necessary. The evaluation criteria generally include Average Gradient (AG), Mutual Information (MI), Edge Intensity (EI), Relative Warp (RW), Structural Similarity (SSIM), Edge Retention QAB/F, Cross-Entropy (CE) and Figure Definition (FD) [1–6,14,15]. A fused image is better with increasing values of AG, MI, FD and EI, while the opposite holds for CE and RW; the Edge Retention QAB/F and the Structural Similarity (SSIM) describe the fused image better the closer their values are to 1. These criteria give an impartial assessment of the fused images produced by the conventional fusion methods and by the proposed fusion approach. All the source multi-focus images can be obtained at http://www.imagefusion.org. In addition, all the experiments are implemented in Matlab 2010a on a Pentium(R) 2.5 GHz PC with 2.00 GB RAM.

Table 1
The running time (seconds) and PSNR (dB) of the proposed fusion approach and the conventional fusion methods.

Methods   DCT       Curvelet   blockDCT   TV        MCA        Proposed
Time      20.9975   20.4073    55.0439    78.7868   107.4161   59.5550
PSNR      26.7090   26.4671    28.2163    21.8885   28.6485    30.3605
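The PSNR figures in Table 1 (and in Tables 3 and 5 below) can be reproduced with a few lines. The paper does not state its reference image or peak value, so an 8-bit peak of 255 is assumed here:

```python
import numpy as np

def psnr(reference, fused, peak=255.0):
    # Peak Signal to Noise Ratio in dB between a reference and a fused image.
    mse = np.mean((reference.astype(float) - fused.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```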

3.1. Fusion on multi-focus "Color_Flower" images

In this section, a pair of "Color_Flower" multi-focus images of size 384 × 512 pixels is used to evaluate the proposed algorithm and the comparative methods. Each "Color_Flower" image has a different focused region. Color_FlowerA focuses on the left side, on the big flower with differently shaped edges in each petal: the big flower is clear, while the brick wall and the small lobules are blurred. In comparison, Color_FlowerB focuses on the right side, on the whole brick wall and the many small lobules of arbitrary shapes and sizes against the brick contours of the background wall: the brick wall and the small lobules are clear, while the big flower is blurred. This group of multi-focus images contains both rich cartoon components and abundant texture components. In this section, we sketch some experiments to verify the effectiveness (low computational complexity and high-quality fused images) of the proposed fusion approach. In particular, the running time is obtained by averaging 20 runs of each algorithm, for DCT, blockDCT, Curvelet, the TV-l1 model, the MCA model and the proposed algorithm; the quality of the fused images is estimated by the Peak Signal to Noise Ratio (PSNR). The experimental results are shown in Table 1.

As shown in Table 1, the proposed algorithm clearly takes less time than the TV-l1 model and the MCA model. In comparison to the classical methods (DCT, Curvelet), the presented algorithm takes more time; however, the PSNR results demonstrate that it obtains higher-quality fused images. In brief, the improved image decomposition based fusion method achieves better results with low computational complexity when compared with the conventional methods and the other image decomposition based fusion methods. To provide an intuitive result, the experiments are performed on the "Color_Flower" multi-focus images, as shown in Fig. 7.

The fusion results using the different fusion methods are shown in Fig. 7. The source images are shown in Fig. 7(a) and (b). The fused images obtained by combining the multi-focus images into a single view are shown in Fig. 7(c)–(l); they are acquired by the average-based method (AV), the select-maximum method (Max), the PCA-based method (PCA), the Contrast Pyramid method (CP), the DWT-based method (DWT), the FSD Pyramid method (FSD), the Gradient Pyramid method (GP), the Morphological Pyramid method (MP), the Ratio Pyramid method (RP) and the Laplacian Pyramid method (LP), respectively. The fused images shown in Fig. 7(m)–(o) are obtained by integrating the cartoon and texture components, carried out with the TV-l1 model, the MCA model and the proposed algorithm, respectively. Although fusion methods such as AV, FSD, GP, PCA and MCA produce high-quality fused images, they show some luminance distortion compared with the source images. Furthermore, the experimental results also demonstrate that some noise has been introduced into the fused images obtained by Max, MP, RP and TV-l1. In comparison, the fused images merged by DWT, CP, LP and the proposed method contain much more detail information, and the results of the other schemes show poor contrast compared with these four methods. However, it is difficult for the human visual system to distinguish the differences among the fused images acquired by DWT, CP, LP and the proposed method. Thus, objective indices are utilized to evaluate the fused images; the results of the assessment criteria are shown in Table 2.

As shown in Table 2, the proposed fusion approach outperforms the other methods in terms of the evaluation criteria CE, QAB/F, RW, SSIM and FD. These quantitative assessments indicate that the fused image obtained by the proposed approach contains more detail information. Although its values of AG, EI and MI are smaller, the differences are relatively tiny. In particular, the FD assessment illustrates that the fused image produced by the proposed approach is more suitable for human visual perception. In conclusion, the proposed fusion approach is superior to the other methods in both subjective visual analysis and objective quantitative evaluation.

3.2. Fusion on multi-focus "Color_Book" images

In this work, fusion experiments on a set of multi-focus "Color_Book" images of size 320 × 240 pixels are sketched to illustrate the performance of the proposed fusion approach. Color_BookA focuses on the right side, on the whole book among objects of arbitrary shapes and sizes: the whole book is clear and the partial book is blurred. In proportion, Color_BookB focuses on the left side, on the partial book with its sawtooth contour and some English words on the cover: the partial book is clear and the whole book is blurred. As mentioned above, this group of source images is abundant in cartoon components and texture components. To show the effectiveness of the presented approach, some experiments focusing on running time and image quality are implemented to demonstrate the merits of the proposed algorithm. The experimental results are shown in Table 3.

In Table 3, the difference in running time between the proposed algorithm and the other image decomposition based methods (the TV and MCA models) is distinct; these results are obtained by averaging 20 fusion runs. In comparison to the classical methods (DCT, Curvelet), the presented algorithm takes more time but obtains higher-quality fused images. In particular, the Peak Signal to Noise Ratio (PSNR) results demonstrate that the presented method achieves better performance than the other conventional methods. In brief, the proposed method achieves promising performance while guaranteeing processing speed. To offer an intuitive view, the fusion experiments are performed on a pair of multi-focus "Color_Book" images, as shown in Fig. 8.

The fused images obtained by the proposed fusion approach and the well-known methods are shown in Fig. 8. The source images provide dissimilar information, as shown in Fig. 8(a) and (b).

Fig. 7. The fused images obtained by different fusion methods for multi-focus "Color_Flower" images. (a) and (b) Multi-focus source images: Color_FlowerA and Color_FlowerB; (c) The fused image obtained by the average-based method; (d) The fused image obtained by the Select maximum method; (e) The fused image obtained by the PCA-based method; (f) The fused image obtained by the Contrast Pyramid method; (g) The fused image obtained by the DWT-based method; (h) The fused image obtained by the FSD Pyramid method; (i) The fused image obtained by the Gradient Pyramid method; (j) The fused image obtained by the Morphological Pyramid method; (k) The fused image obtained by the Ratio Pyramid method; (l) The fused image obtained by the Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1; (n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by the proposed iterative re-weighted algorithm.

Table 2
The objective evaluation of the conventional methods and our novel approach.

Methods    AG       CE       EI        QAB/F    RW       MI       SSIM     FD
AV         6.2953   0.0369   68.5991   0.7047   0.0789   3.7443   0.9652   6.9865
Max        6.3223   0.0460   68.6829   0.6658   0.0612   5.7996   0.9754   7.1613
PCA        6.3624   0.0357   69.2656   0.7072   0.0764   3.7774   0.9615   7.0862
CP         8.5201   0.0101   91.1605   0.6874   0.0430   3.7747   0.8994   10.0316
DWT        8.6682   0.0094   92.6794   0.6731   0.0371   3.3350   0.8983   10.2627
FSD        6.3964   0.1256   66.9069   0.6729   0.2054   3.2162   0.9077   7.9090
GP         6.3416   0.1266   66.7888   0.6786   0.2055   3.2460   0.9084   7.8428
MP         9.2347   0.0476   99.1303   0.6975   0.0372   3.9294   0.8894   10.6845
RP         7.0050   0.0561   75.6401   0.6897   0.0198   3.8910   0.9523   8.0230
LP         8.3501   0.0088   89.4040   0.6877   0.0340   3.7338   0.9082   9.8070
TV-l1      8.3182   0.1827   99.6319   0.4862   0.0345   2.6306   0.7460   9.7687
MCA        6.8508   0.0050   74.7885   0.6851   0.0048   3.5951   0.9638   9.6447
Proposed   9.2295   0.0011   96.0415   0.7111   0.0012   5.5208   0.9767   10.7314

Table 3
The running time (seconds) and PSNR (dB) of the proposed fusion approach and the conventional fusion methods.

Methods   DCT       Curvelet   blockDCT   TV        MCA       Proposed
Time      15.4709   16.9969    20.8112    22.0445   33.5805   19.0649
PSNR      27.0834   29.8013    28.6027    24.6287   29.9067   31.8063

From the fused results, careful manual inspection shows that the fused images in Fig. 8(c), (d), (e), (h), (i), (k), (m) and (n) are not satisfactory, with poor contrast; they are obtained by the average-based method, the FSD-based method, the RP-based method, the Max-based method, the PCA-based method, the MCA-based method and the TV-l1-based method, respectively. These fused images severely lose light-intensity information. In particular, the fused image obtained by the TV-l1 model contains a fraction of noise, which may result from its poor separation of the cartoon component and the texture component. In contrast, as shown in Fig. 8(f), (g), (l) and (o), the fused images obtained by the CP-based method, the DWT-based method, the MP-based method, the LP-based method and the proposed algorithm perform well in both light intensity and detail information (such as lines, shapes, edges and contours). However, it is not easy to distinguish the differences among those fused images by human visual perception alone. Therefore, objective assessment criteria are needed to measure the quality of the fused images, as shown in Table 4.

As shown in Table 4, the proposed fusion approach achieves better results than the other methods according to the evaluation criteria CE, QAB/F, RW, MI, SSIM and FD. These results demonstrate that the proposed fusion approach captures much more information from the source images. In particular, although the AG and EI values are not the largest, the differences are very small. Therefore, the proposed fusion approach outperforms the other methods in both visual analysis and quantitative evaluation.

3.3. Fusion on multi-focus "Ball" images

In this section, the corresponding experiments comparing the proposed approach with the other methods are implemented on a group of multi-focus "Ball" images. The source image Ball_A is a far-focus image whose background contains many patches of arbitrary shapes and sizes: the patches are clear and the ball is blurred. Ball_B is a close front-focus image of a ball with many edge veins of arbitrary shapes and sizes: the ball is clear and the patches are blurred.

Fig. 8. The fused images obtained by different fusion methods for multi-focus "Color_Book" images. (a) and (b) Multi-focus source images: Color_BookA and Color_BookB; (c) The fused image obtained by the average-based method; (d) The fused image obtained by the Select maximum method; (e) The fused image obtained by the PCA-based method; (f) The fused image obtained by the Contrast Pyramid method; (g) The fused image obtained by the DWT-based method; (h) The fused image obtained by the FSD Pyramid method; (i) The fused image obtained by the Gradient Pyramid method; (j) The fused image obtained by the Morphological Pyramid method; (k) The fused image obtained by the Ratio Pyramid method; (l) The fused image obtained by the Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1; (n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by the proposed iterative re-weighted algorithm.

Table 4
The objective evaluation of the conventional methods and our novel approach.

Methods    AG       CE       EI        QAB/F    RW       MI       SSIM     FD
AV         6.8580   0.0535   69.9675   0.7978   0.0270   5.3814   0.9626   8.8379
Max        7.2250   0.0795   74.2480   0.7666   0.0205   5.8521   0.9547   9.2181
PCA        6.8696   0.0575   70.0491   0.7990   0.0266   5.4402   0.9587   8.8619
CP         8.7126   0.0123   85.4287   0.7712   0.0087   5.3669   0.9168   11.9958
DWT        8.7125   0.0142   85.0312   0.7621   0.0033   4.8076   0.9136   12.3349
FSD        7.0686   0.2074   66.5564   0.7310   0.0861   4.0011   0.8927   10.4932
GP         7.0223   0.2075   66.3621   0.7416   0.0862   4.0176   0.8954   10.4233
MP         9.1362   0.0626   92.0213   0.7296   0.0177   4.4438   0.9048   12.1859
RP         7.6367   0.1329   77.8730   0.7625   0.0097   5.6569   0.9436   10.0131
LP         8.5739   0.0104   85.4354   0.7746   0.0062   5.2433   0.9211   11.6845
TV-l1      9.5003   0.5917   94.4369   0.6386   0.1928   3.8879   0.7931   12.5483
MCA        5.8573   0.0719   60.9783   0.6329   0.0408   4.3807   0.8977   7.1332
Proposed   9.3671   0.0009   92.1207   0.7997   0.0010   6.0471   0.9777   12.7188

Table 5 tional methods (DCT, blockDCT and Curvelet). However, the results
The running time (Time/seconds) and PSNR by the proposed fusion approach and
of PSNR illustrate that our presented algorithm outperforms bet-
the conventional fusion methods.
ter than other fusion methods. For human visual perception, there
Mthods DCT Curvelet blockDCT TV MCA Proposed are much more texture information in this set of multi-focus im-
Time 1.9588 1.6655 5.3015 6.5781 6.8715 5.2444
ages. The related fusion experiments are performed on a pair of
PSNR 22.4146 25.2725 25.5751 21.5964 27.7943 30.5659 multi-focus “Ball” images with finite depth-of-field as shown in
Fig. 9.
As shown in Fig. 9, the source images with size 128∗128 pix-
els are shown in Fig. 9(a) and (b). From the perspective of human
clear front focus image of a ball with many edge veins of arbitrary visual perception mechanism, there are some losses of local infor-
shapes and sizes, the ball clear, the patches fuzzy. In this work, mation or luminance distortion in Fig. 9(c), (d), (e), (h), (i), (j), (k),
we enforce some experiments to elaborate the performance of the (m) and (n), which are obtained by average-based method, Max-
For human visual perception, there is much more texture information in this set of multi-focus images. The related fusion experiments are performed on a pair of multi-focus "Ball" images with finite depth-of-field, as shown in Fig. 9.

As shown in Fig. 9, the source images, with size 128 × 128 pixels, are given in Fig. 9(a) and (b). From the perspective of the human visual perception mechanism, there are losses of local information or luminance distortion in Fig. 9(c), (d), (e), (h), (i), (j), (k), (m) and (n), which are obtained by the average-based, Max-based, PCA-based, FSD-based, GP-based, MP-based, RP-based, TV-l1-based and MCA-based methods, respectively. In particular, noise is introduced in Fig. 9(d), (j) and (m). By comparison, high-quality fused images are obtained in Fig. 9(f), (g), (l) and (o), which behave better in both light intensity and detail information, including shapes, edges and contours; these are produced by the CP-based method, the DWT-based method, the LP-based method and the proposed method, respectively.
Fig. 9. The fused images obtained by different fusion methods for multi-focus "Ball" Images. (a) and (b) multi-focus source images: Ball_A and Ball_B; (c) The fused image obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based method; (f) The fused image obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid method; (i) The fused image obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio Pyramid method; (l) The fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1; (n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by the proposed iterative re-weighted algorithm.

Table 6
The objective evaluation of the conventional methods and our novel approach.

Methods    AG       CE      EI        QAB/F   RW      MI      SSIM    FD
AV         11.6868  0.1353  109.3928  0.7157  0.1066  4.0256  0.8401  15.5838
Max        11.5661  0.1858  106.7145  0.6093  0.0875  5.1235  0.8089  15.5647
PCA        11.7273  0.1514  109.6460  0.7163  0.1065  4.0199  0.8347  15.6653
CP         17.6785  0.0376  155.4416  0.7133  0.0270  4.6862  0.7349  25.7195
DWT        17.9944  0.0323  156.0707  0.7075  0.0248  3.5957  0.7223  25.7592
FSD        15.3586  0.0338  127.3360  0.7095  0.1445  3.1681  0.7136  23.9674
GP         15.2849  0.0368  127.0993  0.7218  0.1424  3.1794  0.7183  23.7937
MP         18.5008  0.0158  164.7448  0.7286  0.1553  4.2018  0.8986  26.2534
RP         11.5984  0.2118  107.1465  0.6052  0.0856  4.5801  0.8039  15.6013
LP         17.7272  0.0395  155.7446  0.7130  0.0278  4.6569  0.7347  25.8030
TV-l1      18.9009  0.3175  145.1572  0.6514  0.1025  4.9814  0.5817  25.4769
MCA        18.3047  0.3276  141.3437  0.7062  0.0201  4.6268  0.7487  20.7706
Proposed   19.0476  0.0017  158.8382  0.7251  0.0011  5.0670  0.8398  26.3816
However, the distinction between these methods is difficult to make according to human visual perception alone. Thus, objective quantitative criteria are needed for the fused images. The results of the quantitative assessments are shown in Table 6.

As shown in Table 6, the evaluation results in terms of AG, CE, QAB/F, RW, SSIM and FD indicate that the proposed fusion approach performs better than the other compared methods. In other words, the fused image obtained by the proposed fusion approach contains much more detail information, such as shapes, edges and contours. In addition, although the objective quantitative assessments are not optimal in terms of EI and MI, the differences are comparatively small. The quantitative assessments also demonstrate that the fused image produced by the proposed approach is more suitable for the human visual perception system. In conclusion, the proposed fusion approach outperforms the other methods in both subjective visual analysis and objective evaluation.
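Among the indices in Tables 4 and 6, MI is straightforward to state precisely; a minimal histogram-based sketch of the fusion MI score (the information the fused image shares with each source) is given below, assuming 8-bit grayscale inputs and a 256-bin joint histogram.

import numpy as np

def mutual_information(x, y, bins=256):
    """Histogram-based mutual information (bits) between two images."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint probability
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def fusion_mi(img_a, img_b, fused):
    """MI fusion metric: MI(F, A) + MI(F, B)."""
    return mutual_information(fused, img_a) + mutual_information(fused, img_b)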
3.4. Fusion on multi-focus "Testna_Slika" images

In this section, fusion experiments on a group of multi-focus "Testna_Slika" images with a size of 160 × 160 pixels are implemented to verify the performance of the proposed fusion approach. Observing the multi-focus source images, Testna_SlikaA in Fig. 10(a) is a close front-focus image of an aircraft with some contours of general shapes: the aircraft is clear while the environmental background is fuzzy. Correspondingly, Testna_SlikaB in Fig. 10(b) is a far back-focus image with an object of prominent and irregular shapes in the environmental background: the background is clear while the front aircraft is fuzzy. Some experiments are implemented here to demonstrate the low time complexity and the high fused-image quality of the proposed method. In the following experiments, the running time in Table 7 is obtained by averaging 20 runs of the fusion process, and the PSNR criterion is utilized to evaluate the performance of the fused images. The results are shown in Table 7.
Fig. 10. The fused images obtained by different fusion methods for multi-focus "Testna_Slika" Images. (a) and (b) multi-focus source images: Testna_SlikaA and Testna_SlikaB; (c) The fused image obtained by average-based method; (d) The fused image obtained by Select maximum method; (e) The fused image obtained by PCA-based method; (f) The fused image obtained by Contrast Pyramid method; (g) The fused image obtained by DWT-based method; (h) The fused image obtained by FSD Pyramid method; (i) The fused image obtained by Gradient Pyramid method; (j) The fused image obtained by Morphological Pyramid method; (k) The fused image obtained by Ratio Pyramid method; (l) The fused image obtained by Laplacian Pyramid method; (m) The fused image based on the proposed fusion rules and image decomposition by TV-l1; (n) The fused image based on the proposed fusion rules and image decomposition by MCA; (o) The fused image based on the proposed fusion rules and image decomposition by the proposed iterative re-weighted algorithm.
Table 7
The running time (Time/seconds) and PSNR of the proposed fusion approach and the conventional fusion methods.

Methods   DCT      Curvelet  blockDCT  TV       MCA      Proposed
Time      2.2158   2.3512    7.2152    11.7942  13.4317  7.3477
PSNR      28.7571  29.2838   31.0391   23.6145  31.8803  35.2433

As shown in Table 7, the running-time results clearly indicate that the presented method speeds up the fusion procedure in comparison to the other image-decomposition-based fusion methods. As described in the previous section, the PSNR results demonstrate that the quality of the fused image is guaranteed when compared with the conventional fusion methods. In particular, although the fusion methods based on a single transform basis require much less time than our proposed method, the presented method achieves better results for the human visual system. Recalling the properties of the cartoon and texture components, there is much more cartoon information in this pair of source images. The fusion experiments performed on the set of multi-focus "Testna_Slika" images are shown in Fig. 10.

The fused images obtained by the proposed fusion approach and the other comparative methods are shown in Fig. 10. The source images offer different information with sharp contrast variations due to the limited depth-of-focus of optical lenses, as shown in Fig. 10(a) and (b). Carefully observing the related experimental results, there are losses of information, introduction of noise or luminance distortion in Fig. 10(c), (d), (e), (g), (h), (i), (j), (k), (m) and (n), which are obtained by the average-based, Max-based, PCA-based, DWT-based, FSD-based, GP-based, MP-based, RP-based, TV-l1-based and MCA-based methods, respectively. Fortunately, the fused images in Fig. 10(f), (l) and (o), obtained by the CP-based method, the LP-based method and the proposed method, respectively, contain much more detail information, including edges and contours. In particular, these fused images are appropriate for human visual perception in terms of light intensity. However, the differences between these methods are difficult to distinguish by subjective observation with the naked eye. In order to evaluate the performance of the proposed fusion scheme, objective quantitative assessments are required. The results of the quantitative criteria are shown in Table 8.

As shown in Table 8, the assessment criteria including AG, CE, QAB/F, RW and FD demonstrate that the proposed fusion approach can capture much more information from the multi-focus source images. Although the objective criteria in Table 8 are not optimal in terms of EI, MI and SSIM, the differences are very small. From the scores of the related criteria in Tables 1–4, it can be seen that an individual classical method may provide higher scores on some metrics for the fused image. Different indexes consider the fused image containing diverse information as the ideal fusion result (for example, the index AG measures the contrast of minute details and the variation of texture features in the fused image), and different fusion methods aim to integrate different information into the all-in-focus image (for example, the Morphological Pyramid (MP) method aims to combine all the detail information, texture features and cartoon information into the fused image). It is therefore feasible and rational that some classical methods sometimes perform better than our proposed algorithm in terms of some objective evaluations. In brief, the results of the subjective and objective evaluations presented here illustrate that the proposed fusion approach performs better than the other methods.
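Since AG is singled out above, a minimal sketch of one common definition of the average gradient follows; normalizations vary across papers, so this is illustrative rather than the exact implementation behind Tables 4–8.

import numpy as np

def average_gradient(img):
    """Average gradient (AG): mean magnitude of local intensity changes,
    a proxy for minute-detail contrast and texture variation."""
    f = img.astype(np.float64)
    dx = f[:-1, 1:] - f[:-1, :-1]   # horizontal forward difference
    dy = f[1:, :-1] - f[:-1, :-1]   # vertical forward difference
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))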
Table 8
The objective evaluation of the conventional methods and our novel approach.

Methods    AG      CE      EI       QAB/F   RW      MI      SSIM    FD
AV         2.6902  0.0591  28.8511  0.7208  0.1068  5.2710  0.9556  2.9191
Max        2.8768  0.0943  30.3039  0.7080  0.1001  6.2193  0.9384  3.1678
PCA        2.7248  0.0554  29.1815  0.7258  0.1049  5.3379  0.9492  2.9677
CP         3.6275  0.0094  38.0391  0.7269  0.0027  4.9668  0.9111  4.1069
DWT        3.8207  0.0320  40.1576  0.6967  0.0245  4.4336  0.8949  4.3336
FSD        2.7779  0.1601  28.6364  0.7016  0.2382  4.1996  0.9174  3.2873
GP         2.7573  0.1635  28.5125  0.7045  0.2359  4.2016  0.9208  3.2497
MP         4.0540  0.0430  42.2950  0.6985  0.0438  4.5673  0.8913  4.6242
RP         2.9269  0.0931  30.8918  0.6885  0.0914  4.9043  0.9360  3.2361
LP         3.4233  0.0278  36.2670  0.7374  0.0370  4.8222  0.9258  3.9261
TV-l1      4.6670  0.3240  46.4589  0.5099  0.2602  3.1735  0.7787  5.2875
MCA        3.6310  0.3621  36.7707  0.6562  0.1549  3.7861  0.9124  4.0637
Proposed   4.6491  0.0009  44.2574  0.7453  0.0007  5.9682  0.9509  5.5291
4. Discussions and conclusions

Multi-focus image fusion plays a crucial role in digital image processing applications such as computer vision, clinical medicine, military surveillance, robotics and remote sensing. However, multi-focus images containing many morphological structures cannot be well represented by a single sparse transform. In this paper, an efficient multi-focus image fusion approach is proposed to overcome these limitations. First, all the source images are decomposed into two components (cartoon content and texture content) by an improved iterative re-weighted image decomposition algorithm. Second, the cartoon component and the texture component are integrated by modified fusion rules that draw on the properties of the separated parts. Finally, the fused components are combined to generate the all-in-focus image. The results indicate that the proposed fusion approach achieves better quality in comparison to the existing state-of-the-art methods.
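For readers who prefer a procedural summary, the three steps above can be sketched as follows; `decompose` stands in for the iterative re-weighted cartoon-texture decomposition, and the per-component rules shown are simplified placeholders rather than the exact fusion rules constructed in this paper.

import numpy as np

def fuse_multifocus(img_a, img_b, decompose):
    """Skeleton of the decomposition-based fusion pipeline."""
    # Step 1: split each source into cartoon and texture components.
    cartoon_a, texture_a = decompose(img_a)
    cartoon_b, texture_b = decompose(img_b)

    # Step 2: fuse each component with its own rule (placeholders here:
    # keep the coefficient with the larger magnitude at every pixel).
    cartoon_f = np.where(np.abs(cartoon_a) >= np.abs(cartoon_b),
                         cartoon_a, cartoon_b)
    texture_f = np.where(np.abs(texture_a) >= np.abs(texture_b),
                         texture_a, texture_b)

    # Step 3: recombine the fused components into the all-in-focus image.
    return cartoon_f + texture_f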
However, every image processing model has its own practical limitations, and several follow-up studies on decomposition-based multi-focus image fusion remain. In the sparse representation domain, it is difficult to provide a dictionary that can separate all kinds of source images, so constructing a self-adaptive trained dictionary for a wide family of multi-focus images is an essential challenge. Secondly, there is no quantitative evaluation criterion for the quality of the separated cartoon and texture components; designing general evaluation criteria is a meaningful research direction. In addition, although the proposed image decomposition algorithm requires less time than the current methods, reducing the running time further remains a key issue for practical applications, and one point of interest is to build a more efficient image decomposition algorithm to speed up the procedure. These problems will be further investigated in our future work.
Acknowledgments

We would like to thank the support of the National Natural Science Foundation of China (61374135, 61203321, 61302041), the Chongqing Nature Science Foundation for Fundamental Science and Frontier Technologies (cstc2015jcyjB0569), the China Central Universities Foundation (106112015CDJXY170003) and the Chongqing Graduate Student Research Innovation Project (CYB14023).

References

[1] H.F. Li, Y. Chai, R. Ling, H.P. Yin, Multifocus image fusion scheme using feature contrast of orientation information measure in lifting stationary wavelet domain, J. Inf. Sci. Eng. 29 (2) (2013) 227–247.
[2] A. Saha, G. Bhatnagar, Q.M.J. Wu, Mutual spectral residual approach for multifocus image fusion, Digit. Signal Process. 23 (4) (2013) 1121–1135.
[3] Y.A.V. Phamila, R. Amutha, Discrete cosine transform based fusion of multi-focus images for visual sensor networks, Signal Process. 95 (2014) 161–170.
[4] V.N. Gangapure, S. Banerjee, A.S. Chowdhury, Steerable local frequency based multispectral multifocus image fusion, Inf. Fusion 23 (2015) 99–115.
[5] T. Stathaki, Image Fusion: Algorithms and Applications, first ed., Elsevier, London, 2008.
[6] W.W. Kong, Y. Lei, H.X. Zhao, Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization, Infrared Phys. Technol. 67 (2014) 161–172.
[7] M.M. Subashini, S.K. Sahoo, Pulse coupled neural networks and its applications, Expert Syst. Appl. 41 (8) (2014) 3965–3974.
[8] Y. Liu, S.P. Liu, Z.F. Wang, A general framework for image fusion based on multi-scale transform and sparse representation, Inf. Fusion 24 (2015) 147–164.
[9] Z.Q. Zhou, S. Li, B. Wang, Multi-scale weighted gradient-based fusion for multi-focus images, Inf. Fusion 20 (1) (2014) 60–72.
[10] D.X. He, Y. Meng, C.Y. Wang, Contrast pyramid based image fusion scheme for infrared image and visible image, in: Proceedings of the 2011 IEEE International Geoscience & Remote Sensing Symposium, 2011, pp. 597–600.
[11] S.T. Li, X.D. Kang, J.W. Hu, Image fusion with guided filtering, IEEE Trans. Image Process. 22 (7) (2013) 2864–2875.
[12] A.P. James, B.V. Dasarathy, Medical image fusion: a survey of the state of the art, Inf. Fusion 19 (3) (2014) 4–19.
[13] N. Mitianoudis, T. Stathaki, Pixel-based and region-based image fusion schemes using ICA bases, Inf. Fusion 8 (2) (2007) 131–142.
[14] Z. Liu, H.P. Yin, Y. Chai, S.X. Yang, A novel approach for multimodal medical image fusion, Expert Syst. Appl. 41 (16) (2014) 7425–7435.
[15] Z.D. Liu, H.P. Yin, B. Fang, Y. Chai, A novel fusion scheme for visible and infrared images based on compressive sensing, Opt. Commun. 335 (2015) 168–177.
[16] X.L. Zhang, Y.C. Feng, X.F. Li, S. Wang, The morphological component analysis and its application to color-gray image fusion, J. Comput. Inf. Syst. 9 (24) (2013) 9849–9856.
[17] Y. Jiang, M.H. Wang, Image fusion with morphological component analysis, Inf. Fusion 18 (7) (2014) 107–118.
[18] Z.P. Xu, Medical image fusion using multi-level local extrema, Inf. Fusion 19 (11) (2014) 38–48.
[19] M. Elad, J.-L. Starck, P. Querre, D.L. Donoho, Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA), Appl. Comput. Harmon. Anal. 19 (3) (2005) 340–358.
[20] J. Bobin, J.-L. Starck, J.M. Fadili, Y. Moudden, D.L. Donoho, Morphological component analysis: an adaptive thresholding strategy, IEEE Trans. Image Process. 16 (11) (2007) 2675–2681.
[21] A. Buades, T.M. Le, J.-M. Morel, L.A. Vese, Fast cartoon + texture image filters, IEEE Trans. Image Process. 19 (8) (2010) 1978–1986.
[22] J.-L. Starck, M. Elad, D.L. Donoho, Image decomposition via the combination of sparse representations and a variational approach, IEEE Trans. Image Process. 14 (10) (2005) 1570–1582.
[23] E.J. Candes, M.B. Wakin, S.P. Boyd, Enhancing sparsity by reweighted l1 minimization, J. Fourier Anal. Appl. 14 (5–6) (2008) 877–905.
[24] D. Giacobello, M.G. Christensen, M.N. Murthi, S.H. Jensen, M. Moonen, Enhancing sparsity in linear prediction of speech by iteratively reweighted 1-norm minimization, in: Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, & Signal Processing, 2010, pp. 4650–4653.
[25] S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by basis pursuit, SIAM Rev. 43 (1) (2001) 129–159.
[26] S. Haykin, Z. Chen, The cocktail party problem, Neural Comput. 17 (9) (2005) 1875–1902.
[27] M.A. Bee, C. Micheyl, The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it?, J. Comp. Psychol. 122 (3) (2008) 235–251.
[28] H. Asari, B.A. Pearlmutter, A.M. Zador, Sparse representations for the cocktail party problem, J. Neurosci. 26 (28) (2006) 7477–7490.
[29] D.J. Luo, C. Ding, H. Huang, Toward structural sparsity: an explicit l2/l0 approach, Knowl. Inf. Syst. 36 (2) (2013) 411–438.
[30] K.P. Wang, Y. Chai, C.X. Su, Sparsely corrupted stimulated scattering signals recovery by iterative re-weighted continuous basis pursuit, Rev. Sci. Instrum. 84 (8) (2013) 083103-1–083103-7.
[31] M. Elad, A.M. Bruckstein, A generalized uncertainty principle and sparse representation in pairs of bases, IEEE Trans. Inf. Theory 48 (9) (2002) 2558–2567.
[32] J.A. Tropp, Just relax: convex programming methods for identifying sparse signals in noise, IEEE Trans. Inf. Theory 52 (3) (2006) 1030–1051.
[33] R. Gribonval, M. Nielsen, Sparse representations in unions of bases, IEEE Trans. Inf. Theory 49 (12) (2003) 3320–3325.
[34] G.G. Bhutada, R.S. Anand, S.C. Saxena, Edge preserved image enhancement using adaptive fusion of images denoised by wavelet and curvelet transform, Digit. Signal Process. 21 (1) (2011) 118–130.
[35] V.L. Guen, Cartoon + texture image decomposition by the TV-l1 model, Image Process. On Line 4 (2014) 204–219.
[36] E.V.D. Berg, M.P. Friedlander, Probing the Pareto frontier for basis pursuit solutions, SIAM J. Sci. Comput. 31 (2) (2008) 890–912.
[37] V. Saligrama, M. Zhao, Thresholded basis pursuit: LP algorithm for order-wise optimal support recovery for sparse and approximately sparse signals from noisy random measurements, IEEE Trans. Inf. Theory 57 (3) (2011) 1567–1586.
[38] M. Masood, T.Y. Al-Naffouri, Sparse reconstruction using distribution agnostic Bayesian matching pursuit, IEEE Trans. Signal Process. 61 (21) (2013) 5298–5309.
[39] S. Kunis, H. Rauhut, Random sampling of sparse trigonometric polynomials II: orthogonal matching pursuit versus basis pursuit, Found. Comput. Math. 8 (6) (2008) 737–763.
[40] M. Beister, D. Kolditz, W.A. Kalender, Iterative reconstruction methods in X-ray CT, Phys. Medica 28 (2) (2012) 94–108.
[41] C.S. Xydeas, V. Petrovic, Objective image fusion performance measure, Electron. Lett. 36 (4) (2000) 308–309.
[42] V. Petrovic, T. Cootes, R. Pavlovic, Dynamic image fusion performance evaluation, in: Proceedings of the 10th International Conference on Information Fusion, 2007, pp. 1154–1160.