
Information Sciences 272 (2014) 49–72


Interval Type-2 Relative Entropy Fuzzy C-Means clustering


M. Zarinbal a, M.H. Fazel Zarandi a,c,*, I.B. Turksen b,c
a Department of Industrial Engineering, Amirkabir University of Technology, 424 Hafez Ave, P.O. Box 15875-4413, Tehran, Iran
b TOBB Economics and Technology University, Ankara, Turkey
c Knowledge Intelligent Systems Laboratory, University of Toronto, Toronto, Canada

a r t i c l e   i n f o

Article history:
Received 21 April 2013
Received in revised form 31 October 2013
Accepted 9 February 2014
Available online 20 February 2014

Keywords:
Interval Type-2 fuzzy set theory
Interval arithmetic
Relative entropy
Fuzzy c-means clustering
Interval Type-2 Relative Entropy Fuzzy C-Means clustering

a b s t r a c t

Fuzzy set theory, especially Type-2 fuzzy set theory, provides an efficient tool for handling uncertainty and vagueness in real-world observations. Among the various clustering techniques, Type-2 fuzzy clustering methods are the most effective when no prior knowledge about the observations is available. While uncertainties in Type-2 fuzzy clustering parameters have been investigated by researchers, uncertainties associated with membership degrees are not well discussed in the literature. In this paper, we investigate the latter uncertainties and propose the Interval Type-2 Relative Entropy Fuzzy C-Means (IT2 REFCM) clustering method. The computational complexity of the proposed method is discussed and its performance is examined in several experiments. The obtained results show that the proposed method is very good at detecting noise and assigning suitable membership degrees to observations.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Pattern recognition methods are data analysis methods in which observations are investigated without assuming any underlying mathematical model. Their main concern is to discover regularities in observations and, perhaps, to classify them into different categories automatically. The motivation is to perform these tasks more accurately and more economically than humans can [14].
Supervised and unsupervised classification are the two main approaches in pattern recognition. In supervised classification, labeled observations provide the basis for learning, whereas, when no prior information is available, unsupervised classification or clustering methods are the most effective. In this case, the goal is to find the natural grouping that exists in the observations. High-dimensional observations along with an unknown number of clusters have given rise to thousands of automated clustering methods, such as K-means, Gaussian mixture models, density-based spatial clustering of applications with noise (DBSCAN), and Canopy methods. All these methods assume that each observation belongs to exactly one cluster, whereas in practice an observation could belong to more than one cluster with some degree of belonging. This gap between theory and practice can be handled effectively by fuzzy set theory.
However, words meaning different things to different people, disagreement in knowledge extracted from experts, noisy and uncertain measurements, and noisy data used to tune parameters may all cause uncertainties in fuzzy sets' parameters [23]. Type-2 fuzzy sets, in contrast with Type-1 fuzzy sets (classical fuzzy sets), are able to model such uncertainties, as their

* Corresponding author at: Department of Industrial Engineering, Amirkabir University of Technology, 424 Hafez Ave, P.O. Box 15875-4413, Tehran, Iran. Tel.: +98 2164545378.
E-mail addresses: mzarinbal@aut.ac.ir (M. Zarinbal), zarandi@aut.ac.ir (M.H. Fazel Zarandi), bturksen@etu.edu.tr (I.B. Turksen).

http://dx.doi.org/10.1016/j.ins.2014.02.066
0020-0255/© 2014 Elsevier Inc. All rights reserved.

membership functions are themselves fuzzy. These secondary membership functions can be Type-1 fuzzy sets (General Type-2 fuzzy sets) or crisp intervals in [0, 1] (Interval Type-2 fuzzy sets). Interval Type-2 fuzzy sets obviously have less computational complexity than General Type-2 fuzzy sets [23] and have been applied in many areas such as granular computing [31,32], pattern recognition [21,9], control [22,2], prediction [27,20], and decision making [37,13].
The combination of Interval Type-2 fuzzy set theory and clustering methods gives more flexibility for handling uncertainties in real observations and has resulted in many clustering methods, including Interval Type-2 fuzzy c-means clustering [15], Interval Type-2 fuzzy c-regression clustering [8], an Interval Type-2 approach to kernelized fuzzy c-means clustering [16], an Interval Type-2 approach to kernel possibilistic c-means clustering [34], Interval Type-2 possibilistic c-means clustering [10], and Interval-valued fuzzy relation-based clustering [12]. In addition, by combining General Type-2 fuzzy sets and fuzzy c-means (FCM), Linda and Manic [19] proposed the General Type-2 fuzzy c-means clustering method. These clustering methods have been applied in many areas such as image processing [9,16,30,33], performance evaluation [12], stock market prediction [7], and detection [6].
Uncertainty in clustering parameters, such as the degree of fuzziness, has been discussed in most Type-2 fuzzy clustering investigations. Although uncertainty in these parameters results in an uncertain shape of the membership functions, the uncertainty associated with the membership functions themselves has not been clearly investigated. In addition, some investigations use kernel functions, but the choice of kernel function and kernel width depends on the problem at hand, so these clustering methods cannot be easily generalized. Moreover, in some of these methods, the Type-2 fuzzy membership functions are defuzzified into Type-1 fuzzy membership functions during each iteration, and therefore some information is lost. On the other hand, almost all of the developed Type-2 fuzzy clustering methods are based either on FCM or on possibilistic c-means (PCM) clustering. FCM is a partitioning algorithm that may cause serious problems in the presence of noisy observations, and there is no interaction between clusters in PCM, which may lead to closely located or even coincident clusters [28]. Hence, a new generalized clustering method based on Interval Type-2 fuzzy sets is essential for real-world applications.
Therefore, in this paper, we propose a novel Interval Type-2 fuzzy clustering method in which the uncertainty associated with membership functions is the main concern. That is, in the proposed clustering method, Interval Type-2 fuzzy membership functions are directly modeled using interval arithmetic¹ theorems. Moreover, to overcome the shortcomings of FCM and PCM, the proposed clustering method is modeled using the Relative Entropy Fuzzy C-Means (REFCM) method proposed by Zarinbal et al. [36]. The performance of this new clustering method, Interval Type-2 Relative Entropy Fuzzy C-Means (IT2 REFCM), is investigated in several experiments. In the first four experiments, the dimension of the datasets is increased while the number of clusters is kept unchanged. In the fifth experiment, eight two-dimensional datasets with different numbers of clusters are provided. Finally, the performance of the proposed method in real applications is evaluated on four medical images. The obtained results show that the proposed method is very good at detecting noise and assigning suitable membership degrees to observations.
Based on the above discussion, the rest of this paper is organized as follows: Section 2, related works, discusses Type-2 fuzzy clustering methods in the literature. The Interval Type-2 Relative Entropy Fuzzy C-Means clustering method and its properties are presented in Section 3. The computational complexity and the performance of the proposed method are addressed in Section 4. Conclusions are stated in Section 5. Finally, Appendix A provides some properties of interval arithmetic and the proofs needed for the property and theorem of the proposed method.

2. Related works

As mentioned in the previous section, the membership functions of a Type-2 fuzzy set are themselves fuzzy. That is, the Type-2 fuzzy set $\tilde{F}$ is comprised of the membership function $U_{\tilde{F}}(x,u)$ as [23]:

$$\tilde{F} = \left\{\left((x,u),\ U_{\tilde{F}}(x,u)\right) \mid \forall x \in X,\ \forall u \in U_x \subseteq [0,1]\right\} \qquad (1)$$

where $X$ is the universe of discourse, and $0 \le u \le 1$ and $0 \le U_{\tilde{F}}(x,u) \le 1$ denote the primary and secondary membership functions, respectively. In fact, $U_{\tilde{F}}(x,u)$ can be a Type-1 fuzzy set (General Type-2 fuzzy set) or a crisp interval in $[0,1]$ (Interval Type-2 fuzzy set). $\tilde{F}$ can also be expressed as [23]:

$$\tilde{F} = \int_{x \in X}\int_{u \in J_x} U_{\tilde{F}}(x,u)\big/(x,u), \qquad J_x \subseteq [0,1] \qquad (2)$$

where $\int\!\!\int$ denotes union over all admissible $x$ and $u$ [23].
The combination of Type-2 fuzzy set theory and clustering methods gives these methods more flexibility in handling uncertainties and has resulted in many clustering methods. Interval Type-2 fuzzy c-means clustering [15], Interval Type-2 fuzzy c-regression clustering [8], the Interval Type-2 approach to kernelized fuzzy c-means clustering [16], the Interval Type-2 approach to kernel possibilistic c-means clustering [34], Interval Type-2 possibilistic c-means clustering [10], and Interval-valued fuzzy relation-based clustering [12] are fuzzy clustering methods that utilize Interval Type-2 fuzzy set theory.

¹ Interval arithmetic performs arithmetic operations on closed intervals, representing a real number by lower and upper endpoints such that the true result certainly lies within the interval [5].

In addition, the General Type-2 fuzzy c-means clustering method proposed by Linda and Manic [19] is an FCM-derived method that applies General Type-2 fuzzy set theory.
Uncertainty associated with the degree of fuzziness, $m$, results in an uncertain shape of the membership functions and has been investigated in some of the above methods. In the Interval Type-2 fuzzy c-means clustering method proposed by Hwang and Rhee [15], the lower and upper interval membership functions are defined using two different values of $m$, $(m_1, m_2)$. The same procedure is taken in Fazel Zarandi et al. [8]. Ozkan and Turksen [29] investigate the behavior of membership functions for different values of $m$. Linda and Manic [19] proposed a method for managing the uncertainty associated with $m$ in the FCM algorithm. The main purpose of these investigations is to find effective Type-2 fuzzy membership functions using proper values of $m$, but determining these values is highly dependent on the problem at hand. Moreover, during each iteration of these clustering algorithms, the membership functions have to be defuzzified. These two important drawbacks, along with the sorting of pattern indexes in the Hwang and Rhee [15] and Fazel Zarandi et al. [8] methods, make them inapplicable to all clustering problems, especially those with high-dimensional observations.
In the kernelized Type-2 fuzzy c-means clustering method proposed by Kaur et al. [16], the Euclidean norm is replaced with a kernel-induced metric in the clustering method, and the key point is to choose an appropriate value for the kernel width. In the other kernelized Interval Type-2 clustering method, proposed by Raza and Rhee [34], proper values for the kernel width and $m$ have to be determined beforehand. Clearly, determining these values depends on the problem at hand. Guh et al. [12] use subjective information in the form of an interval-valued proximity relation matrix to obtain an agglomerative hierarchical clustering. In this method, assigning a certain value of similarity is problem dependent, so generalizing the method is impossible.
In sum, in these clustering methods, Interval Type-2 fuzzy membership functions are modeled using different values of $m$, a kernel function, or a relation matrix. However, determining these parameters depends on the problem at hand and cannot be easily generalized. In addition, in some of these methods, the Type-2 fuzzy membership functions are defuzzified into Type-1 fuzzy membership functions during each iteration, and therefore some information is lost. On the other hand, all of these clustering methods are based either on FCM or on PCM. While FCM is a partitioning method that is sensitive to noisy observations, the clusters in PCM can be located very close to each other or even coincide [28].
Hence, introducing a new generalized clustering method based on Interval Type-2 fuzzy sets is essential. Therefore, in this
paper, a novel Interval Type-2 fuzzy clustering method is presented, in which the Interval Type-2 fuzzy membership
functions are directly modeled. The proposed method is based on REFCM method proposed by Zarinbal et al. [36]. Interval
arithmetic theorems are also applied for Interval Type-2 fuzzy arithmetic purposes. This new clustering method, Interval Type-2 Relative Entropy Fuzzy C-Means (IT2 REFCM), is discussed in the next section.

3. Interval Type-2 Relative Entropy Fuzzy C-Means clustering

As mentioned before, FCM and PCM, the two most used clustering methods, suffer from some major drawbacks: FCM is a partitioning method whose performance decreases with noisy observations, and in PCM there is no interaction between clusters. To overcome inaccurate results in the case of noisy observations and ill-located clusters, Zarinbal et al. [36] proposed a new clustering method, which is modeled as:
$$\min J(u,v,c) = \sum_{i=1}^{c}\sum_{j=1}^{N} u_{ij}^{m} d_{ij}^{2} \;-\; h \sum_{j=1}^{N}\sum_{i=1}^{c}\sum_{\substack{k=1 \\ k \neq i}}^{c} u_{ij}\ln\!\left(\frac{u_{ij}}{u_{kj}}\right)$$

$$\text{s.t.}\quad \sum_{i=1}^{c} u_{ij} = 1 \ \ \forall j \in \{1,\dots,N\}; \qquad 0 < \sum_{j=1}^{N} u_{ij} \ \ \forall i \in \{1,\dots,c\}; \qquad u_{ij} \in [0,1] \ \ \forall i,j \qquad (3)$$
where $u_{ij}$ is the membership degree of the $j$th observation in the $i$th cluster, $d_{ij}$ is the Euclidean distance of the $j$th observation from the $i$th cluster center, $m$ is the degree of fuzziness, $h$ is the positive coefficient of the relative entropy term, and $c$ and $N$ are the numbers of clusters and observations, respectively.
The first term of the objective function in Eq. (3) demands that the distances between observations and cluster centers be as low as possible, whereas the second term maximizes the dissimilarity between the membership functions of the $i$th and $k$th clusters. Hence, the second term avoids closely located or coincident clusters [36].
Considering $W_0(\cdot)$ as the principal branch of the Lambert-W function and $x_j$ as the $j$th observation in $n$-dimensional space, the membership degree of this observation in the $i$th cluster, $u_{ij}$, and the center point of the $i$th cluster, $v_i$, are obtained by Eqs. (4) and (5), respectively:

Fig. 1. FOU of an Interval Type-2 fuzzy set.

$$u_{ij} = \left[\left(\frac{m(m-1)d_{ij}^{2}/h}{W_{0}\!\left(\dfrac{m(m-1)d_{ij}^{2}}{h}\exp\!\left((m-1)\,\dfrac{\lambda_{j} - h\sum_{k=1,\,k\neq i}^{c}\ln(u_{kj}) + h}{h}\right)\right)}\right)^{1/(m-1)}\right]^{-1} \qquad (4)$$

$$v_{i} = \frac{\sum_{j=1}^{N} u_{ij}^{m}\, x_{j}}{\sum_{j=1}^{N} u_{ij}^{m}} \qquad (5)$$

where $\lambda_{j}$, $j = 1,\dots,N$, is the Lagrangian multiplier and the other parameters have the same definitions as before.
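Assuming the Lagrangian multiplier and the previous-iteration memberships are already available, the membership update of Eq. (4) can be sketched numerically with SciPy's `lambertw`; function and variable names here are illustrative, not from the paper:

```python
import math
from scipy.special import lambertw

def refcm_membership(d2, lam, ln_sum_others, m=2.5, h=1.5):
    """Sketch of Eq. (4): membership degree u_ij of observation j in cluster i.

    d2             -- squared Euclidean distance d_ij^2
    lam            -- Lagrangian multiplier lambda_j (assumed given)
    ln_sum_others  -- sum_{k != i} ln(u_kj) from the previous iteration
    """
    a = m * (m - 1) * d2 / h
    arg = a * math.exp((m - 1) * (lam - h * ln_sum_others + h) / h)
    w = lambertw(arg).real          # principal branch W_0
    # ((a / W_0(...))^{1/(m-1)})^{-1}
    return (a / w) ** (-1.0 / (m - 1))
```

As expected from Eq. (4), the returned degree decreases as the squared distance grows, all else being equal.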
Since words mean different things to different people, knowledge extracted from a group of experts is uncertain, and Type-1 fuzzy sets' training data might be noisy [23], assigning an exact degree of belonging to an observation seems impossible. Hence, using the membership degree in Eq. (4) may produce inaccurate results in real situations with many uncertainties. A Type-2 fuzzy set, by defining the secondary membership function, provides additional degrees of freedom in fuzzy logic systems [9] and a second-order approximation [35].
The Interval Type-2 fuzzy set is the most widely used kind of Type-2 fuzzy set, in which the secondary membership function, $U_{\tilde{F}}(x,u)$, is a crisp interval in $[0,1]$. Hence, this kind of Type-2 fuzzy set can be completely described by its lower and upper membership functions. That is, the Interval Type-2 membership degree of the $j$th observation in the $i$th class, $U_{ij}$, can be described by its lower and upper membership degrees, $\underline{u}_{ij}$ and $\overline{u}_{ij}$, respectively:
$$U_{ij} = [\underline{u}_{ij},\ \overline{u}_{ij}] \qquad (6)$$

Thus, the footprint of uncertainty (FOU) of each observation (Eq. (7)) [24] can be considered as an $n$-dimensional interval vector in which $0 \le \underline{u}_{\tilde{F}}(x) \le \overline{u}_{\tilde{F}}(x) \le 1$, as shown in Fig. 1:

$$\mathrm{FOU}(\tilde{F}) = \bigcup_{x \in X} [\underline{u}_{\tilde{F}}(x),\ \overline{u}_{\tilde{F}}(x)] \qquad (7)$$

From this viewpoint, it can be concluded that a Type-1 fuzzy set is a special case of an Interval Type-2 fuzzy set with degenerate lower and upper membership degrees.²
Using the above definition along with the REFCM clustering method proposed by Zarinbal et al. [36], the new Interval Type-2 Relative Entropy Fuzzy C-Means (IT2 REFCM) clustering method is modeled as follows:

$$\min J(U,V,c) = \sum_{i=1}^{c}\sum_{j=1}^{N} U_{ij}^{m} d_{ij}^{2} \;-\; h \sum_{j=1}^{N}\sum_{i=1}^{c}\sum_{\substack{k=1 \\ k \neq i}}^{c} U_{ij}\ln\!\left(\frac{U_{ij}}{U_{kj}}\right) \qquad (8)$$

where $d_{ij}$ is the Euclidean distance of the $j$th observation from the $i$th cluster center, $m$ is the degree of fuzziness, $h$ is the positive coefficient of the relative entropy term, and $c$ and $N$ are the numbers of clusters and observations, respectively. Clearly, $U_{ij}$ must satisfy the following conditions:

² An interval number $X = [\underline{x}, \overline{x}]$ is called degenerate if $\underline{x} = \overline{x}$ [26].

$$\sum_{i=1}^{c} U_{ij} = 1 \ \ \forall j \in \{1,\dots,N\}; \qquad 0 < \sum_{j=1}^{N} U_{ij} \ \ \forall i \in \{1,\dots,c\}; \qquad U_{ij} \in [0,1] \ \ \forall i,j \qquad (9)$$

Based on interval arithmetic, the IT2 REFCM model has the following property:

Property 1. The objective function (Eq. (8)) and the constraints (Eq. (9)) can be rewritten as Eqs. (10) and (11), respectively:

$$\min J(U,V,c) = \sum_{i=1}^{c}\sum_{j=1}^{N} \left[\underline{u}_{ij}^{m} d_{ij}^{2},\ \overline{u}_{ij}^{m} d_{ij}^{2}\right] \;-\; h \sum_{j=1}^{N}\sum_{i=1}^{c}\sum_{\substack{k=1 \\ k \neq i}}^{c} \left[\underline{u}_{ij}\ln(\min(Q_{ijk})),\ \overline{u}_{ij}\ln(\max(Q_{ijk}))\right] \qquad (10)$$

$$\sum_{i=1}^{c}\underline{u}_{ij} = 1,\ \sum_{i=1}^{c}\overline{u}_{ij} = 1 \ \ \forall j \in \{1,\dots,N\}; \qquad 0 < \sum_{j=1}^{N}\underline{u}_{ij},\ 0 < \sum_{j=1}^{N}\overline{u}_{ij} \ \ \forall i \in \{1,\dots,c\}; \qquad 0 \le \underline{u}_{ij} \le \overline{u}_{ij} \le 1 \ \ \forall i,j \qquad (11)$$

where $Q_{ijk} = \left\{\underline{u}_{ij}\cdot\frac{1}{\underline{u}_{kj}},\ \underline{u}_{ij}\cdot\frac{1}{\overline{u}_{kj}},\ \overline{u}_{ij}\cdot\frac{1}{\underline{u}_{kj}},\ \overline{u}_{ij}\cdot\frac{1}{\overline{u}_{kj}}\right\}$.

Proof. The proof can be found in Appendix A. □
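The endpoint-based interval operations used here (for example, $Q_{ijk}$ collects the four endpoint products of $u_{ij}$ and $1/u_{kj}$) can be sketched as follows; the function names are illustrative, not from the paper:

```python
def iv_add(a, b):
    """[a1, a2] + [b1, b2] = [a1 + b1, a2 + b2]."""
    return (a[0] + b[0], a[1] + b[1])

def iv_mul(a, b):
    """Product interval: min/max over the four endpoint products,
    exactly as Q_ijk collects the four products of membership endpoints."""
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(p), max(p))

def iv_div(a, b):
    """[a1, a2] / [b1, b2], assuming 0 is not inside [b1, b2]."""
    return iv_mul(a, (1.0 / b[1], 1.0 / b[0]))
```

For instance, `iv_div((1.0, 2.0), (2.0, 4.0))` returns `(0.25, 1.0)`, the interval of all ratios of a value in [1, 2] to a value in [2, 4].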

Theorem 1. The Interval Type-2 membership degree, $U_{ij} = [\underline{u}_{ij}, \overline{u}_{ij}]$, which optimizes the objective function (Eq. (8)) and satisfies the constraints (Eq. (9)), is obtained by:

$$U_{ij} = \left[\left(\frac{m(m-1)d_{ij}^{2}/h}{W_{0}\!\left(\dfrac{m(m-1)d_{ij}^{2}}{h}\exp\!\left((m-1)\,\dfrac{\Lambda_{j} - h\sum_{k=1,\,k\neq i}^{c}\ln(U_{kj}) + h}{h}\right)\right)}\right)^{1/(m-1)}\right]^{-1} \qquad (12)$$

or, equivalently,

$$\underline{u}_{ij} = \left[\left(\frac{m(m-1)d_{ij}^{2}/h}{W_{0}\!\left(\dfrac{m(m-1)d_{ij}^{2}}{h}\exp\!\left((m-1)\,\dfrac{\underline{\lambda}_{j} - h\sum_{k=1,\,k\neq i}^{c}\ln(\underline{u}_{kj}) + h}{h}\right)\right)}\right)^{1/(m-1)}\right]^{-1},$$

$$\overline{u}_{ij} = \left[\left(\frac{m(m-1)d_{ij}^{2}/h}{W_{0}\!\left(\dfrac{m(m-1)d_{ij}^{2}}{h}\exp\!\left((m-1)\,\dfrac{\overline{\lambda}_{j} - h\sum_{k=1,\,k\neq i}^{c}\ln(\overline{u}_{kj}) + h}{h}\right)\right)}\right)^{1/(m-1)}\right]^{-1} \qquad (13)$$

where $W_{0}(\cdot)$ is the principal branch of the Lambert-W function, $m$ is the degree of fuzziness, $d_{ij}$ is the Euclidean distance of the $j$th observation from the cluster center, and $\Lambda_{j} = [\underline{\lambda}_{j}, \overline{\lambda}_{j}]$, $j = 1,\dots,N$, is the Lagrangian multiplier, calculated by:

$$\max\left\{\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1)d_{ij}^{2}}{h}\right)+1\right)+h\sum_{\substack{k=1 \\ k\neq i}}^{c}\ln(U_{kj})-h,\ \ m\,d_{ij}^{2}+h\sum_{\substack{k=1 \\ k\neq i}}^{c}\ln(U_{kj})-h\right\} \le \Lambda_{j} \qquad (14)$$

or

$$\max\left\{\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1)d_{ij}^{2}}{h}\right)+1\right)+h\sum_{\substack{k=1 \\ k\neq i}}^{c}\ln(\overline{u}_{kj})-h,\ \ m\,d_{ij}^{2}+h\sum_{\substack{k=1 \\ k\neq i}}^{c}\ln(\overline{u}_{kj})-h\right\} \le \underline{\lambda}_{j} \le \overline{\lambda}_{j} \qquad (15)$$

The interval center point of the $i$th cluster, $V_{i} = [\underline{v}_{i}, \overline{v}_{i}]$, is updated by:

$$V_{i} = \frac{\sum_{j=1}^{N} U_{ij}^{m}\, x_{j}}{\sum_{j=1}^{N} U_{ij}^{m}} \qquad (16)$$

or

$$[\underline{v}_{i}, \overline{v}_{i}] = [\min(S_{i}),\ \max(S_{i})] \qquad (17)$$

where $S_{i} = \left\{\dfrac{\sum_{j=1}^{N}\underline{u}_{ij}^{m}x_{j}}{\sum_{j=1}^{N}\underline{u}_{ij}^{m}},\ \dfrac{\sum_{j=1}^{N}\underline{u}_{ij}^{m}x_{j}}{\sum_{j=1}^{N}\overline{u}_{ij}^{m}},\ \dfrac{\sum_{j=1}^{N}\overline{u}_{ij}^{m}x_{j}}{\sum_{j=1}^{N}\underline{u}_{ij}^{m}},\ \dfrac{\sum_{j=1}^{N}\overline{u}_{ij}^{m}x_{j}}{\sum_{j=1}^{N}\overline{u}_{ij}^{m}}\right\}$ and $x_{j}$ is the $j$th observation.

Proof. The proof of this theorem can be found in Appendix A. □
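The interval center update of Eq. (17) can be sketched for a single cluster by forming the four endpoint ratios of $S_i$ explicitly; the names and the NumPy formulation are illustrative assumptions, not from the paper:

```python
import numpy as np

def interval_center(lo, hi, X, m=2.5):
    """Sketch of Eq. (17): interval center [v_lower, v_upper] of one cluster.

    lo, hi -- length-N lower/upper membership endpoints for this cluster
    X      -- N x p matrix of observations
    """
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    X = np.asarray(X, float)

    def ratio(u_num, u_den):
        # weighted mean: numerator weights u_num^m, denominator sum of u_den^m
        return ((u_num ** m)[:, None] * X).sum(axis=0) / (u_den ** m).sum()

    # the four endpoint combinations forming S_i
    S = np.stack([ratio(lo, lo), ratio(lo, hi), ratio(hi, lo), ratio(hi, hi)])
    return S.min(axis=0), S.max(axis=0)
```

The componentwise min/max over the four combinations mirrors taking $\min(S_i)$ and $\max(S_i)$ in each dimension.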

Thus, the membership degree of each observation in each cluster depends on its distance from the clusters' centers and on the membership degrees obtained with respect to all clusters. Moreover, as $\Lambda_j$ is calculated based on the bounds defined in Eq. (14) or Eq. (15), for some observations the summation of membership degrees over all clusters is no longer equal to one, i.e. $\sum_{i=1}^{c} U_{ij} < 1$. Hence, some observations have a low degree of belonging to all clusters, which makes the proposed method sensitive to noisy observations and enables it to detect noise points more effectively. In addition, $h$ is the coefficient of the relative entropy term: the higher the value of $h$, the more the model concentrates on maximizing the dissimilarity between membership degrees ($U_{ij}$ and $U_{kj}$). This value depends on whether the user needs to focus more on dissimilarity or more on partitioning. Clearly, changing the values of $m$ and $h$ results in various shapes of membership functions. The FOUs of these functions are demonstrated in Fig. 2(a)-(d): Fig. 2(a) shows the FOU for $(m = 2.5, h = 1.5)$, and Fig. 2(b)-(d) show the FOUs for $(m = 2.5, h = 5)$, $(m = 5, h = 1.5)$, and $(m = 5, h = 5)$, respectively.
Based on the above discussions, the main steps of IT2 REFCM clustering method are:

Algorithm 1. IT2 REFCM clustering method

Initial parameters:
Step 1: Fix the number of clusters, $c$, the degree of fuzziness, $m$, and the relative entropy coefficient, $h$.
Step 2: Determine the initial membership degrees, $U^{(0)}$, and the center points, $V$.
Repeat until $\left|U_{ij}^{(t-1)} - U_{ij}^{(t)}\right| < \epsilon$, or equivalently $\left|\overline{u}_{ij}^{(t-1)} - \overline{u}_{ij}^{(t)}\right| < \epsilon$:
Step 3: Calculate the Euclidean distance of each observation from each cluster center, $d_{ij}$.
Step 4: Determine the Lagrangian multiplier, $\Lambda_j = [\underline{\lambda}_j, \overline{\lambda}_j]$, using Eq. (14) or Eq. (15).
Step 5: Determine the membership degree, $U_{ij}^{(t)} = [\underline{u}_{ij}^{(t)}, \overline{u}_{ij}^{(t)}]$, using Eq. (12) or Eq. (13).
Step 6: Update the center point of the $i$th cluster, $V_i = [\underline{v}_i, \overline{v}_i]$, using Eq. (16) or Eq. (17).
Step 7: Increment $t$.

Based on the obtained membership degrees, a hard partitioning can then be applied to assign the $j$th observation to the $i$th cluster; that is:

IF $U_i(x_j) > U_k(x_j)$, $k = 1, 2, \dots, c$, $k \neq i$, THEN $x_j$ is assigned to the $i$th cluster.

Or, equivalently,

IF $[\underline{u}_i(x_j), \overline{u}_i(x_j)] > [\underline{u}_k(x_j), \overline{u}_k(x_j)]$, $k = 1, 2, \dots, c$, $k \neq i$, THEN $x_j$ is assigned to the $i$th cluster.

This can be rewritten as (for more information about the inequality of intervals, see Appendix A):

IF $\underline{u}_i(x_j) > \overline{u}_k(x_j)$, $k = 1, 2, \dots, c$, $k \neq i$, THEN $x_j$ is assigned to the $i$th cluster.
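The last rule above can be sketched directly; using `-1` to label observations with no winning cluster (e.g. noise points with low membership everywhere) is an illustrative convention, not from the paper:

```python
import numpy as np

def hard_partition(lower, upper):
    """Assign x_j to cluster i iff its lower membership in i exceeds its
    upper membership in every other cluster; -1 marks no winner."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    c, N = lower.shape
    labels = np.full(N, -1)
    for j in range(N):
        for i in range(c):
            if all(lower[i, j] > upper[k, j] for k in range(c) if k != i):
                labels[j] = i
                break
    return labels
```

Observations whose lower membership never dominates any other cluster's upper membership keep the label `-1`, matching the paper's use of low membership in all clusters as a noise indicator.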
The performance of the proposed method in hard partitioning of given datasets is our concern in the next section. Therefore, the computational complexity of the IT2 REFCM clustering method is discussed and several experiments are provided. The clustering results are then compared with those obtained from the REFCM [36], Interval Type-2 FCM (IT2 FCM) [15], and Type-2 PCM (T2 PCM) [9,10] clustering methods. These comparisons use four well-known criteria: accuracy, precision, sensitivity, and specificity.

4. Performance evaluation

The performance of the proposed method is evaluated from two aspects: first, the computational complexity of the proposed method is discussed; second, its performance in different situations is evaluated. The results are then compared with those obtained from the REFCM [36], IT2 FCM [15], and T2 PCM [9,10] clustering methods.

Fig. 2. FOU for different values of m and h: (a) (m = 2.5, h = 1.5), (b) (m = 2.5, h = 5), (c) (m = 5, h = 1.5), and (d) (m = 5, h = 5).

4.1. Computational complexity

As mentioned in Kolen and Hutcheson [17] and Linda and Manic [19], FCM runs asymptotically in $O(Nc^2p)$ time, where $N$ is the number of $p$-dimensional observations and $c$ is the number of clusters. Adding relative entropy as a regularization function to FCM (the REFCM clustering method) gives an overall computational complexity of $O(Nc^2p) + O(2Nc\log(Nc))$. Moreover, as the theoretical complexity of fuzzy interval analysis is the same as that of standard interval analysis [11], the overall complexity of the proposed IT2 REFCM clustering method is $O(2Nc^2p) + O(4Nc\log(2Nc))$, which is convenient for $N \ge 100$. Thus, the proposed method increases the computational time only linearly compared with the FCM and REFCM methods.
On the other hand, calculating $W_0(\cdot)$ in Eq. (12) or Eq. (13) could be problematic. However, using the numerical approximations in software packages such as MATLAB, Maple, and Mathematica, the value of $W_0(\cdot)$ can easily be determined. There are also simple numerical and analytical approximations for this function in the literature, which can be used for computational purposes [3,1].
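As a sketch of the kind of simple numerical approximation mentioned above, $W_0$ can be computed for non-negative arguments by a short Newton iteration on $we^w = x$; this illustrative scheme is one possible choice, not the paper's:

```python
import math

def lambert_w0(x, tol=1e-12, max_iter=50):
    """Principal branch W_0 via Newton iteration on w*e^w = x, for x >= 0."""
    w = math.log1p(x)                 # cheap starting guess for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))   # Newton step for f(w) = w*e^w - x
        w -= step
        if abs(step) < tol:
            break
    return w
```

For example, `lambert_w0(1.0)` converges to the omega constant, approximately 0.5671.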

4.2. Experimental results

In this section, several multidimensional datasets are provided and the performance of the proposed method is compared against the three abovementioned clustering methods, with the following settings: (1) REFCM with $m = 2.5$, $h = 1.5$; (2) IT2 FCM with $m_L = 2$, $m_R = 3.5$; (3) T2 PCM with $m = 2.5$; and (4) IT2 REFCM with $m = 2.5$, $h = 1.5$. These comparisons use the accuracy (Eq. (18)), precision (Eq. (19)), sensitivity (Eq. (20)), and specificity (Eq. (21)) criteria:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (18)$$

$$\text{precision} = \frac{TP}{TP + FP} \qquad (19)$$

$$\text{sensitivity} = \frac{TP}{TP + FN} \qquad (20)$$

$$\text{specificity} = \frac{TN}{TN + FP} \qquad (21)$$
56 M. Zarinbal et al. / Information Sciences 272 (2014) 49–72

where $TP$ is the number of true positives, $TN$ the number of true negatives, $FP$ the number of false positives, and $FN$ the number of false negatives. Accuracy measures the proportion of true results, while precision (positive predictive value) is the proportion of true positives among all positive results. Sensitivity and specificity measure the ability to identify positive and negative results, respectively.
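Eqs. (18)-(21) translate directly into code; the confusion-matrix counts in the usage example below are hypothetical:

```python
def clustering_scores(tp, tn, fp, fn):
    """The four evaluation criteria for a binary confusion matrix."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),   # Eq. (18)
        "precision":   tp / (tp + fp),                    # Eq. (19)
        "sensitivity": tp / (tp + fn),                    # Eq. (20)
        "specificity": tn / (tn + fp),                    # Eq. (21)
    }
```

For example, `clustering_scores(80, 60, 20, 40)` gives an accuracy of 0.7, precision of 0.8, sensitivity of about 0.67, and specificity of 0.75.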
During the first four experiments, the dimension of the datasets is increased while the number of clusters is kept unchanged. Then, eight two-dimensional datasets are provided in Experiment 5 to evaluate the performance of the proposed method with different numbers of clusters. Finally, four real-world experiments are provided and discussed in Experiment 6.

4.2.1. Experiment 1
Consider a two-dimensional dataset consists of three clusters. The task is to find the optimum membership degrees of
each observation in each cluster. This is done in the following steps; in the first step, a noise-free dataset with 300 observa-
tions was considered (Fig. 3(a)), 90 normally distributed noise points were added to this dataset (Fig. 3(b)) in the second step
and in the third step, REFCM, IT2 FCM, T2 PCM and IT2 REFCM clustering methods are applied. The obtained membership
degrees are depicted in Fig. 4(a–d) and the results of the accuracy, precision, sensitivity and specificity criteria are reported
in Table 1. Moreover, based on having low membership degrees in all clusters, the noise points are detected and are
eliminated from the noisy dataset. These new de-noised datasets, obtained from each clustering method, are plotted in
Fig. 5(a–d), respectively.
According to the results reported in Table 1 and the de-noised datasets in Fig. 5, it is clear that IT2 REFCM could effectively
detect true results and positive predictive values. The proposed method could also identify the original observations in the
clusters and the noise points, more accurate than the other three methods.

Fig. 3. (a) Noise-free dataset and (b) noisy dataset.




Fig. 4. Obtained membership functions for (a) REFCM, (b) IT2 FCM, (c) T2 PCM, and (d) IT2 REFCM clustering methods.

Table 1
Performance of the clustering methods in the first experiment.

Clustering method | Accuracy | Precision | Sensitivity | Specificity
REFCM             | 71%      | 95%       | 70%         | 52%
IT2 FCM           | 51%      | 50%       | 96%         | 4%
T2 PCM            | 67%      | 63%       | 96%         | 26%
IT2 REFCM         | 89%      | 91%       | 96%         | 78%

4.2.2. Experiment 2
Now consider a three-dimensional dataset consisting of three clusters. As in Experiment 1, the task is to find the optimum membership degree of each observation in each cluster. Therefore, first a noise-free dataset with 300 observations was considered (Fig. 6(a)) and then 90 normally distributed noise points were added (Fig. 6(b)). The four clustering methods were then applied to this noisy dataset. The obtained membership degrees and the hard-partitioning performance of the four methods are reported in Fig. 7(a)-(d) and Table 2, respectively.

Fig. 5. De-noised datasets based on (a) REFCM, (b) IT2 FCM, (c) T2 PCM and (d) IT2 REFCM clustering methods.

According to the performance results in Table 2, it is clear that while REFCM identifies the positive predictive values more effectively, IT2 REFCM identifies and detects true results and noise points better than the other methods. Moreover, the results of Experiments 1 and 2 suggest that the IT2 FCM clustering method (Figs. 4(b) and 7(b)) divided the observations into the given number of partitions, regardless of whether they were noise points or not. In addition, the T2 PCM clustering method (Figs. 4(c) and 7(c)) divided the observations into two clusters, which confirms the closely located clustering behavior of PCM. In fact, these results are not appropriate, especially for the last 90 data points, i.e. the added noise points.

4.2.3. Experiment 3
For the third experiment, the Iris plants dataset, perhaps the best-known dataset in the pattern recognition literature, is used. This dataset contains three classes of 50 observations each, where each class refers to a type of iris plant and each observation has four attributes. The first class is linearly separable from the other two, but the latter two are not linearly separable from each other [39]. By applying the four abovementioned clustering methods, the optimum membership degrees of each observation are obtained and depicted in Fig. 8(a)-(d). The performance of each clustering method is also reported in Table 3.
Similar to the above results, it is clear that the IT2 REFCM method can effectively identify the true results and positive predictive values, and can correctly assign the observations to the clusters.

4.2.4. Experiment 4
The Wisconsin prognostic breast cancer dataset is considered in the fourth experiment; it contains 699 observations with 10 attributes in two classes, benign and malignant [39]. Similarly, the four clustering methods are applied and the performances are reported in Table 4.
From the above four experiments, it can be concluded that the membership degrees obtained by the proposed method are more appropriate than those of the other methods. IT2 FCM results are appropriate for low-dimensional datasets, but its performance decreases as the dimension increases, possibly because of the sorting step in its algorithm, in which the patterns have to be sorted in ascending order. On the other hand, as the T2 PCM method is a mode-seeking algorithm [18], its performance increases for high-dimensional datasets. Both the REFCM and IT2 REFCM clustering methods give appropriate results in all experiments; however, as the dimension of the observations increases, IT2 REFCM performs better at detecting noise and assigning suitable membership degrees.

Fig. 6. Three-dimensional: (a) noise-free dataset and (b) noisy dataset.

4.2.5. Experiment 5
In the four experiments above, the performance of the proposed method was evaluated by increasing the dimension of the datasets. In this experiment, we instead evaluate the performance of the IT2 REFCM method by increasing the complexity of the observations, i.e. the number of clusters (Fig. 9(a)-(h)), and the obtained accuracy, precision, and sensitivity values are compared with the results obtained from the REFCM, IT2 FCM, and T2 PCM clustering methods (Fig. 10(a)-(c)). As in the previous results, the proposed clustering method is the most effective in all cases.
All the abovementioned experiments used synthetic datasets. However, a good clustering method must handle the uncertainties and vagueness that exist in the real world. Hence, in the next experiment, four medical images are provided and IT2 REFCM is applied to segment the abnormalities in these images. The obtained results are then compared against the segmentation results obtained from the REFCM, IT2 FCM, and T2 PCM clustering methods.

4.2.6. Experiment 6
Images and visual understanding play a more basic role in everyday life than audible tones or smells. Therefore, analyzing, enhancing, compressing, and reconstructing images have been extensively applied in many areas, including astronomy, medicine, and industrial robotics [38]. In recent years, neurology and basic neuroscience have been significantly

Fig. 7. Obtained membership functions for (a) REFCM, (b) IT2 FCM, (c) T2 PCM, and (d) IT2 REFCM clustering methods.

Table 2
Performance of the clustering methods in the second experiment.

Clustering method   Accuracy   Precision   Sensitivity   Specificity
REFCM               56%        98%         48%           64%
IT2 FCM             83%        83%         91%           0%
T2 PCM              56%        56%         90%           0%
IT2 REFCM           88%        94%         91%           73%

advanced by imaging tools. Among these tools, Magnetic Resonance Imaging (MRI) provides rich information about human soft-tissue anatomy, a capability that makes MRI an indispensable tool for human medicine and medical diagnosis. Using this information frequently requires the MRI scan to be segmented into homogeneous regions of similar attributes, such as luminance. This task has proven to be problematic, due to the vast amount of data, the existing uncertainties, and the differing characteristics of human tissues. In this experiment, the four clustering methods are applied to four different brain MRI scans with confirmed abnormal lesions, Figs. 11(a), 12(a), 13(a) and 14(a). The task is to segment the MRI scans into different clusters based on intensity levels, to help physicians diagnose and differentiate abnormal lesions more efficiently. The obtained clusters containing the abnormal lesions are demonstrated in Figs. 11(b–e)–14(b–e).

Fig. 8. Obtained membership functions for Iris plants dataset using (a) REFCM, (b) IT2 FCM, (c) T2 PCM, and (d) IT2 REFCM clustering methods.

Table 3
Performance of clustering methods for Iris plants dataset.

Clustering method   Accuracy   Precision   Sensitivity   Specificity
REFCM               89%        89%         90%           –
IT2 FCM             69%        79%         80%           –
T2 PCM              76%        89%         88%           –
IT2 REFCM           90%        91%         90%           –

Table 4
Clustering accuracy rate for Wisconsin prognostic breast cancer.

Clustering method   Accuracy   Precision   Sensitivity   Specificity
REFCM               72%        90%         75%           –
IT2 FCM             69%        71%         62%           –
T2 PCM              63%        64%         56%           –
IT2 REFCM           97%        95%         97%           –

Fig. 9. Two-dimensional datasets with (a) 2 clusters, (b) 3 clusters, (c) 4 clusters, (d) 5 clusters, (e) 6 clusters, (f) 7 clusters, (g) 8 clusters, and (h) 9 clusters.

Fig. 10. Performance of each clustering method (REFCM, IT2 FCM, T2 PCM, IT2 REFCM) applied on each dataset, for cluster numbers 2–10, based on (a) accuracy, (b) precision and (c) sensitivity.

Moreover, in order to measure the similarity of the segmented abnormal lesions with the original abnormal lesions in each MRI scan, the following empirical ratio (Eq. (22)) is calculated for the REFCM, IT2 FCM, T2 PCM, and IT2 REFCM clustering methods. The results are reported in Table 5.

$$\text{Similarity} = \frac{\text{Abnormal lesion area in segmented image}}{\text{Abnormal lesion area in original image}} \times 100 \quad (22)$$

Comparing the similarity values in Table 5 and the segmented images in Figs. 11–14 with the original ones, it can be concluded that the proposed method detects and classifies these lesions more accurately. Although the IT2 FCM and T2 PCM methods have a high similarity ratio for Fig. 12 in this table, they could not recognize the abnormal lesion properly, as can be seen in Fig. 12(b and c). In short, in real-world experiments with many complexities and uncertainties, the proposed method's results are more accurate and more reliable.
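Because Eq. (22) compares lesion areas only, it can be computed directly from binary lesion masks. A minimal sketch (the function name and the toy masks below are ours, not from the paper); note that the ratio measures area agreement, not spatial overlap, which is why a method can score high while mislocating the lesion:

```python
import numpy as np

def similarity(segmented_mask, original_mask):
    """Empirical similarity ratio of Eq. (22): abnormal-lesion area in the
    segmented image over the lesion area in the original image, in percent.
    Area-based only: pixel positions are not compared."""
    return 100.0 * np.count_nonzero(segmented_mask) / np.count_nonzero(original_mask)

# toy 4x4 ground truth with a 4-pixel lesion; the segmentation recovers 3 pixels
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
seg = truth.copy()
seg[2, 2] = False
print(similarity(seg, truth))  # 75.0
```

A similarity of 100% therefore does not by itself imply a correct segmentation, consistent with the discussion of Fig. 12(b and c).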

4.3. Further discussions

In this section, several multidimensional datasets are provided and the performance of the proposed method is compared against the other clustering methods using the accuracy, precision, sensitivity, and specificity criteria. In the first two experiments, the performance of the proposed method is evaluated on two- and three-dimensional datasets with added noise points. According to the results in Tables 1 and 2, it is clear that the proposed clustering method identifies true results and detects noise points better and more effectively than the other methods. Two well-known datasets, the Iris plants dataset and the Wisconsin prognostic breast cancer dataset, are used as the third and fourth experiments, and the same results are obtained. In the fifth experiment, the proposed method is evaluated

Fig. 11. (a) Original MR image and the abnormality class obtained using (b) REFCM, (c) IT2 FCM, (d) T2 PCM, and (e) IT2 REFCM clustering methods.

under increasing complexity of the observations (number of clusters). Based on Fig. 10(a–c), it is concluded that the proposed clustering method is the most effective one in all cases, especially for observations with high levels of complexity. Four MRI scans are provided in the last experiment, and the segmented images (Figs. 11(b–e)–14(b–e)) and the similarity ratios (Table 5) suggest that the proposed method detects the abnormal lesions more accurately and more reliably.
In sum, the membership degrees obtained by the proposed method are more appropriate than those of the other methods. In fact, the proposed method effectively detects noise points and assigns suitable membership degrees to the observations.
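The four evaluation criteria used throughout Section 4 can be stated compactly from confusion-matrix counts. A sketch for reference (the function and the example counts are ours; the paper reports only the resulting percentages):

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and specificity from
    confusion-matrix counts -- the four criteria reported in Tables 1-4."""
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    precision   = tp / (tp + fp)          # positive predictive value
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    return accuracy, precision, sensitivity, specificity

# e.g. 45 true positives, 5 false positives, 40 true negatives, 10 false negatives
acc, prec, sens, spec = confusion_metrics(45, 5, 40, 10)
print(acc, prec, round(sens, 3), round(spec, 3))  # 0.85 0.9 0.818 0.889
```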

Fig. 12. (a) Original MR image and the abnormality class obtained using (b) REFCM, (c) IT2 FCM, (d) T2 PCM, and (e) IT2 REFCM clustering methods.

While the IT2 FCM method is effective on low-dimensional datasets, the T2 PCM method performs well on high-dimensional datasets. Moreover, as the complexity of the observations increases, the performance of both methods decreases. In the real-world experiment, although the IT2 FCM and T2 PCM methods have high similarity ratios, they could not recognize the abnormal lesions properly (Fig. 12(b and c)). REFCM obtains appropriate results on the synthetic datasets (Experiments 1–5). However, as there are many complexities and uncertainties in the real world, the performance of the REFCM method decreases in real-world experiments.

Fig. 13. (a) Original MR image and the abnormality class obtained using (b) REFCM, (c) IT2 FCM, (d) T2 PCM, and (e) IT2 REFCM clustering methods.

5. Conclusions

Type-1 fuzzy clustering methods have been considered by many researchers and used in many applications. However, as there are many uncertainties and much vagueness in the real world, assigning an exact membership degree to an observation is impossible. The membership functions of Type-2 fuzzy sets are themselves fuzzy, which empowers clustering methods to handle these uncertainties more appropriately. In this paper, in order to consider the uncertainties associated with membership functions, the Interval Type-2 Relative Entropy Fuzzy C-Means (IT2 REFCM) clustering method is proposed and interval arithmetic theorems are applied. Several synthetic datasets with different dimensions and different numbers of clusters are provided. In addition, a real-world problem in the form of four medical images is provided to show the performance of the proposed method. In all cases, the IT2 REFCM clustering method shows a very good performance in detecting noise and assigning suitable membership degrees to observations.

Fig. 14. (a) Original MR image and the abnormality class obtained using (b) REFCM, (c) IT2 FCM, (d) T2 PCM, and (e) IT2 REFCM clustering methods.

Table 5
Similarity rate for the clustering methods in the sixth experiment.

Clustering method   Fig. 11   Fig. 12   Fig. 13   Fig. 14
REFCM               50%       26%       69%       36%
IT2 FCM             67%       90%       81%       59%
T2 PCM              55%       81%       80%       43%
IT2 REFCM           81%       72%       81%       66%

Appendix A

In this section, we first review some basic properties of interval arithmetic and then prove the property and theorem of the proposed method.

A.1. Interval-valued arithmetic: brief reminder

Let $X = [\underline{x}, \bar{x}]$ and $Y = [\underline{y}, \bar{y}]$ be two interval vectors. Interval arithmetic has the following properties (proofs can be found in Moore [25] and Moore et al. [26]):

• Arithmetic operations:
$$X + Y = [\underline{x} + \underline{y},\ \bar{x} + \bar{y}] \quad (23)$$
$$X - Y = [\underline{x} - \bar{y},\ \bar{x} - \underline{y}] \quad (24)$$
$$X \cdot Y = \left[\min\{\underline{x}\underline{y}, \underline{x}\bar{y}, \bar{x}\underline{y}, \bar{x}\bar{y}\},\ \max\{\underline{x}\underline{y}, \underline{x}\bar{y}, \bar{x}\underline{y}, \bar{x}\bar{y}\}\right] \quad (25)$$
$$X / Y = X \cdot (1/Y) \quad (26)$$
• Equality:
$$[\underline{x}, \bar{x}] = [\underline{y}, \bar{y}] \iff (\underline{x} = \underline{y}) \wedge (\bar{x} = \bar{y}) \quad (27)$$
• Ordering:
$$[\underline{x}, \bar{x}] < [\underline{y}, \bar{y}] \iff \bar{x} < \underline{y}, \qquad [\underline{x}, \bar{x}] \subseteq [\underline{y}, \bar{y}] \iff \underline{y} \le \underline{x} \ \wedge\ \bar{x} \le \bar{y} \quad (28)$$
• Power function:
$$X^n = \begin{cases} [\underline{x}^n, \bar{x}^n] & \text{if } \underline{x} > 0 \\ [0, \max(\underline{x}^n, \bar{x}^n)] & \text{if } 0 \in [\underline{x}, \bar{x}] \text{ and } n \text{ is even} \\ [\underline{x}^n, \bar{x}^n] & \text{if } 0 \in [\underline{x}, \bar{x}] \text{ and } n \text{ is odd} \\ [\bar{x}^n, \underline{x}^n] & \text{if } \bar{x} < 0 \text{ and } n \text{ is even} \\ [\underline{x}^n, \bar{x}^n] & \text{if } \bar{x} < 0 \text{ and } n \text{ is odd} \end{cases} \quad (29)$$
• Let $f(X)$ be a monotonic (increasing) function; then:
$$f(X) = [f(\underline{x}), f(\bar{x})] \quad (30)$$
• Euclidean distance function:
$$d([\underline{x}, \bar{x}], [\underline{y}, \bar{y}]) = \left((\underline{x} - \underline{y})^2 + (\bar{x} - \bar{y})^2\right)^{1/2} \quad (31)$$
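The operations above can be sketched as a small interval class. This is an illustrative implementation of Eqs. (23)–(25) and (29)–(30) under the usual closed-interval semantics, not the authors' code:

```python
class Interval:
    """Minimal closed-interval arithmetic following Eqs. (23)-(30) (Moore [25])."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, o):                      # Eq. (23)
        return Interval(self.lo + o.lo, self.hi + o.hi)

    def __sub__(self, o):                      # Eq. (24)
        return Interval(self.lo - o.hi, self.hi - o.lo)

    def __mul__(self, o):                      # Eq. (25)
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))

    def __pow__(self, n):                      # Eq. (29), integer n >= 1
        if self.lo > 0 or n % 2 == 1:
            return Interval(self.lo ** n, self.hi ** n)
        if self.hi < 0:                        # n even, entirely negative interval
            return Interval(self.hi ** n, self.lo ** n)
        return Interval(0.0, max(self.lo ** n, self.hi ** n))  # 0 inside, n even

    def apply(self, f):                        # Eq. (30), f monotone increasing
        return Interval(f(self.lo), f(self.hi))

X, Y = Interval(1, 2), Interval(-1, 3)
print((X + Y).lo, (X + Y).hi)    # 0 5
print((X * Y).lo, (X * Y).hi)    # -2 6
print((Y ** 2).lo, (Y ** 2).hi)  # 0.0 9
```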

Proof of Property. For the first term in Eq. (8), recall that $U_{ij} \in [0, 1]$, $m > 1$ and $d_{ij}^2 > 0$. So, based on Eq. (29), $U_{ij}^m d_{ij}^2$ can be rewritten as $[\underline{u}_{ij}^m d_{ij}^2,\ \bar{u}_{ij}^m d_{ij}^2]$.

For the second term in Eq. (8), recall that $\ln(X)$ is a monotonic increasing function. So, based on Eqs. (25), (26) and (30), it can be rewritten as:

$$U_{ij} \ln\!\left(\frac{U_{ij}}{U_{kj}}\right) = [\underline{u}_{ij}, \bar{u}_{ij}] \cdot \ln\!\left(\frac{[\underline{u}_{ij}, \bar{u}_{ij}]}{[\underline{u}_{kj}, \bar{u}_{kj}]}\right) = \left[\underline{u}_{ij} \ln(\min(Q_{ijk})),\ \bar{u}_{ij} \ln(\max(Q_{ijk}))\right] \quad (32)$$

where $Q_{ijk} = \left\{\underline{u}_{ij} \cdot \frac{1}{\bar{u}_{kj}},\ \underline{u}_{ij} \cdot \frac{1}{\underline{u}_{kj}},\ \bar{u}_{ij} \cdot \frac{1}{\bar{u}_{kj}},\ \bar{u}_{ij} \cdot \frac{1}{\underline{u}_{kj}}\right\}$. So,

$$\min J(U, V, c) = \sum_{i=1}^{c} \sum_{j=1}^{N} U_{ij}^m d_{ij}^2 - h \sum_{j=1}^{N} \sum_{i=1}^{c} \sum_{k=1}^{c} U_{ij} \ln\!\left(\frac{U_{ij}}{U_{kj}}\right) = \sum_{i=1}^{c} \sum_{j=1}^{N} \left[\underline{u}_{ij}^m d_{ij}^2,\ \bar{u}_{ij}^m d_{ij}^2\right] - h \sum_{j=1}^{N} \sum_{i=1}^{c} \sum_{k=1}^{c} \left[\underline{u}_{ij} \ln(\min(Q_{ijk})),\ \bar{u}_{ij} \ln(\max(Q_{ijk}))\right] \quad (33)$$

Moreover, the constraints in Eq. (9) can be transformed into the constraints in Eq. (11) using the three following steps:

Step 1. Eq. (9)-1: based on Eq. (27) and the monotonicity of summation:

$$\sum_{i=1}^{c} U_{ij} = 1 \Rightarrow \sum_{i=1}^{c} [\underline{u}_{ij}, \bar{u}_{ij}] = [1, 1] \Rightarrow \left[\sum_{i=1}^{c} \underline{u}_{ij},\ \sum_{i=1}^{c} \bar{u}_{ij}\right] = [1, 1] \Rightarrow \sum_{i=1}^{c} \underline{u}_{ij} = 1 \ \wedge\ \sum_{i=1}^{c} \bar{u}_{ij} = 1 \quad (34)$$

Step 2. Eq. (9)-2: based on Eq. (28) and the monotonicity of summation:

$$0 < \sum_{j=1}^{N} U_{ij} \Rightarrow [0, 0] < \left[\sum_{j=1}^{N} \underline{u}_{ij},\ \sum_{j=1}^{N} \bar{u}_{ij}\right] \Rightarrow 0 < \sum_{j=1}^{N} \underline{u}_{ij},\ \ 0 < \sum_{j=1}^{N} \bar{u}_{ij} \quad (35)$$

Step 3. Eq. (9)-3:

$$U_{ij} \in [0, 1] \Rightarrow [\underline{u}_{ij}, \bar{u}_{ij}] \subseteq [0, 1] \Rightarrow 0 \le \underline{u}_{ij} \le \bar{u}_{ij} \le 1 \quad (36)$$

So,

$$\begin{cases} \sum_{i=1}^{c} U_{ij} = 1 & \forall j \in \{1, \ldots, N\} \\ 0 < \sum_{j=1}^{N} U_{ij} & \forall i \in \{1, \ldots, c\} \\ U_{ij} \in [0, 1] & \forall i, j \end{cases} \;\cong\; \begin{cases} \sum_{i=1}^{c} \underline{u}_{ij} = 1,\ \sum_{i=1}^{c} \bar{u}_{ij} = 1 & \forall j \in \{1, \ldots, N\} \\ 0 < \sum_{j=1}^{N} \underline{u}_{ij},\ 0 < \sum_{j=1}^{N} \bar{u}_{ij} & \forall i \in \{1, \ldots, c\} \\ 0 \le \underline{u}_{ij} \le \bar{u}_{ij} \le 1 & \forall i, j \end{cases} \quad (37)$$

Proof of Theorem 1. The constraint in Eq. (9)-1, $\sum_{i=1}^{c} U_{ij} = 1$, is included in the objective function (Eq. (8)) using the Lagrangian multiplier $\Lambda_j = [\underline{\lambda}_j, \bar{\lambda}_j]$. So, the following function must be minimized:

$$\min J(U, V, c) = \sum_{i=1}^{c} \sum_{j=1}^{N} U_{ij}^m d_{ij}^2 - h \sum_{j=1}^{N} \sum_{i=1}^{c} \sum_{\substack{k=1 \\ k \ne i}}^{c} U_{ij} \ln\!\left(\frac{U_{ij}}{U_{kj}}\right) - \sum_{j=1}^{N} \Lambda_j \left(\sum_{i=1}^{c} U_{ij} - 1\right) \quad (38)$$

Minimizing Eq. (38) with respect to $U$ is equivalent to minimizing the individual objective functions with respect to $U_{ij}$, as the membership degree of each observation in each cluster is independent of the other observations. So:

$$J_{ij} = U_{ij}^m d_{ij}^2 - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \left(U_{ij} \ln(U_{ij}) - U_{ij} \ln(U_{kj})\right) - \Lambda_j (U_{ij} - 1) \quad (39)$$

The two necessary conditions for optimality are (1) $\frac{\partial J_{ij}}{\partial U_{ij}} = 0$ and (2) $\frac{\partial J_{ij}}{\partial \Lambda_j} = 0$, for $i = 1, \ldots, c$ and $j = 1, \ldots, N$.

For the first necessary condition, $\frac{\partial J_{ij}}{\partial U_{ij}} = 0$, we have:

$$\frac{\partial J_{ij}}{\partial U_{ij}} = 0 \Rightarrow m U_{ij}^{m-1} d_{ij}^2 - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \left(\ln\!\left(\frac{U_{ij}}{U_{kj}}\right) + 1\right) - \Lambda_j = 0 \quad (40)$$

Or,

$$\frac{\partial J_{ij}}{\partial [\underline{u}_{ij}, \bar{u}_{ij}]} = \left[m \underline{u}_{ij}^{m-1} d_{ij}^2,\ m \bar{u}_{ij}^{m-1} d_{ij}^2\right] - h \left[\sum_{\substack{k=1 \\ k \ne i}}^{c} \left(\ln(\min(Q_{ijk})) + 1\right),\ \sum_{\substack{k=1 \\ k \ne i}}^{c} \left(\ln(\max(Q_{ijk})) + 1\right)\right] - [\underline{\lambda}_j, \bar{\lambda}_j] = \left[m \underline{u}_{ij}^{m-1} d_{ij}^2 - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \left(\ln(\max(Q_{ijk})) + 1\right) - \bar{\lambda}_j,\ \ m \bar{u}_{ij}^{m-1} d_{ij}^2 - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \left(\ln(\min(Q_{ijk})) + 1\right) - \underline{\lambda}_j\right] \quad (41)$$

where $Q_{ijk} = \left\{\underline{u}_{ij} \cdot \frac{1}{\bar{u}_{kj}},\ \underline{u}_{ij} \cdot \frac{1}{\underline{u}_{kj}},\ \bar{u}_{ij} \cdot \frac{1}{\bar{u}_{kj}},\ \bar{u}_{ij} \cdot \frac{1}{\underline{u}_{kj}}\right\}$.

Solving $\frac{\partial J_{ij}}{\partial U_{ij}} = 0$ yields the optimum value of $U_{ij}$. However, as there is no explicit solution for this equation, $U_{ij}$ and $U_{kj}$ can be substituted with $\exp(-Y_{ij})$ and $\exp(-Y_{kj})$, respectively. The reasons are:

1. $U_{ij} \in [0, 1]\ \forall i, j$, and using $Y_{ij} = -\ln(U_{ij})$ ensures the bounds.
2. As $\ln(\cdot)$ is a monotonic function, $Y_{ij}$ is an interval vector.

So Eq. (39) can be rewritten as:

$$m \exp(-(m-1) Y_{ij})\, d_{ij}^2 + h Y_{ij} - h - h \sum_{\substack{k=1 \\ k \ne i}}^{c} Y_{kj} - \Lambda_j = 0 \quad (42)$$

Solving this equation with respect to $Y_{ij}$ results in:

$$Y_{ij} = \frac{\Lambda_j + h \sum_{\substack{k=1 \\ k \ne i}}^{c} Y_{kj} + h}{h} + \frac{1}{m-1}\, W_0\!\left[\frac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \frac{\Lambda_j + h \sum_{\substack{k=1 \\ k \ne i}}^{c} Y_{kj} + h}{h}\right)\right] \quad (43)$$

Therefore, the optimum value of $U_{ij}$ is:

$$U_{ij} = \exp\!\left(-\frac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right) \cdot \exp\!\left(-\frac{1}{m-1}\, W_0\!\left[\frac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \frac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right)\right]\right) \quad (44)$$

where $W_0(\cdot)$ is the principal branch of the Lambert-W function, which satisfies $W(Z) \exp(W(Z)) = Z$ [4]. Thus, $U_{ij}$ in Eq. (44) can be rewritten as:

$$U_{ij} = \left(\frac{\dfrac{m(m-1) d_{ij}^2}{h}}{W_0\!\left[\dfrac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \dfrac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right)\right]}\right)^{-1/(m-1)} \quad (45)$$

Or equivalently,

$$[\underline{u}_{ij}, \bar{u}_{ij}] = [A^{-1}, B^{-1}] \quad (46)$$

where

$$A = \left(\frac{\dfrac{m(m-1) d_{ij}^2}{h}}{W_0\!\left[\dfrac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \dfrac{\underline{\lambda}_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \underline{u}_{kj} + h}{h}\right)\right]}\right)^{1/(m-1)}$$

and

$$B = \left(\frac{\dfrac{m(m-1) d_{ij}^2}{h}}{W_0\!\left[\dfrac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \dfrac{\bar{\lambda}_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} + h}{h}\right)\right]}\right)^{1/(m-1)}$$
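The update formulas above hinge on evaluating the principal branch $W_0$ of the Lambert-W function. A self-contained Newton-iteration sketch of $W_0$ (our own illustrative routine, not the authors'; production code might instead use a library routine such as SciPy's `scipy.special.lambertw`):

```python
import math

def lambert_w0(z, tol=1e-12):
    """Principal branch W0 of the Lambert-W function (real z >= -1/e),
    computed by Newton iteration on f(w) = w*exp(w) - z = 0.
    Illustrative only; cf. Corless et al. [4] for series and bounds."""
    if z < -1.0 / math.e:
        raise ValueError("W0 is real only for z >= -1/e")
    w = -0.5 if z < -0.25 else 0.0        # crude starting point per region
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            return w
    return w

# W(z) * exp(W(z)) = z: the defining identity used in deriving Eq. (45)
for z in (-0.3, -0.1, 0.5, 3.0):
    w = lambert_w0(z)
    assert abs(w * math.exp(w) - z) < 1e-9

# W0 is 0 at 0 and tends to -1 at the branch point -1/e: the range
# exploited when bounding the Lagrange multiplier in Eqs. (48)-(50).
print(lambert_w0(0.0))            # 0.0
print(round(lambert_w0(0.5), 4))  # 0.3517
```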

For the second necessary condition, $\frac{\partial J_{ij}}{\partial \Lambda_j} = 0$, we have:

$$\frac{\partial L}{\partial \Lambda_j} = 0 \Rightarrow \sum_{i=1}^{c} U_{ij} = 1 \Rightarrow \sum_{i=1}^{c} \left(\frac{\dfrac{m(m-1) d_{ij}^2}{h}}{W_0\!\left[\dfrac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \dfrac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right)\right]}\right)^{-1/(m-1)} = 1 \quad (47)$$

Clearly, solving this equation with respect to $\Lambda_j$ does not yield an exact solution, and hence the bounds have to be found. Recall that $0 \le U_{ij} \le 1\ \forall i, j$. First, consider $0 \le U_{ij}\ \forall i, j$: since $\frac{m(m-1) d_{ij}^2}{h} \le 0$, $W_0(\cdot)$ must be non-positive. Moreover, as discussed in Corless et al. [4], the principal branch of the Lambert-W function, $W_0(\cdot)$, has three characteristics: $W_0(\cdot) \ge -1$, $W_0\!\left(-\frac{1}{e}\right) = -1$ and $W_0(0) = 0$. So,

$$-1 \le W_0\!\left[\frac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \frac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right)\right] \le 0 \Rightarrow -\frac{1}{\exp(1)} \le \frac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \frac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right) \le 0 \quad (48)$$

Thus,

$$\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) - h \le \Lambda_j \quad (49)$$

Or equivalently,

$$\left[\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \underline{u}_{kj} - h,\ \ \frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h\right] \le [\underline{\lambda}_j, \bar{\lambda}_j] \Rightarrow \frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h \le \underline{\lambda}_j \le \bar{\lambda}_j \quad (50)$$

Now consider the condition $U_{ij} \le 1\ \forall i, j$, so:

$$\frac{m(m-1) d_{ij}^2}{h} \le W_0\!\left[\frac{m(m-1) d_{ij}^2}{h} \exp\!\left(-(m-1)\, \frac{\Lambda_j - h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) + h}{h}\right)\right] \quad (51)$$

Thus, the second lower bound for $\Lambda_j$ is calculated as:

$$m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) - h \le \Lambda_j \quad (52)$$

Or equivalently,

$$\left[m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \underline{u}_{kj} - h,\ \ m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h\right] \le [\underline{\lambda}_j, \bar{\lambda}_j] \Rightarrow m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h \le \underline{\lambda}_j \le \bar{\lambda}_j \quad (53)$$

Based on Eqs. (49) and (52), the lower bound for $\Lambda_j$ is:

$$\max\left\{\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) - h,\ \ m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln(U_{kj}) - h\right\} \le \Lambda_j \quad (54)$$

or,

$$\max\left\{\frac{h}{m-1}\left(\ln\!\left(\frac{m(m-1) d_{ij}^2}{h}\right) + 1\right) + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h,\ \ m d_{ij}^2 + h \sum_{\substack{k=1 \\ k \ne i}}^{c} \ln \bar{u}_{kj} - h\right\} \le \underline{\lambda}_j \le \bar{\lambda}_j \quad (55)$$

The center points of the clusters are obtained by differentiating Eq. (39) with respect to the center points $V_i$, $i = 1, \ldots, c$. As $d_{ij}^2$ is the Euclidean distance, the center points are updated by:

$$\frac{\partial L}{\partial V_i} = 0 \Rightarrow V_i = \frac{\sum_{j=1}^{N} U_{ij}^m x_j}{\sum_{j=1}^{N} U_{ij}^m} \quad (56)$$

Or,

$$[\underline{v}_i, \bar{v}_i] = [\min(S_i), \max(S_i)] \quad (57)$$

where $S_i = \left\{\dfrac{\sum_{j=1}^{N} \underline{u}_{ij}^m x_j}{\sum_{j=1}^{N} \underline{u}_{ij}^m},\ \dfrac{\sum_{j=1}^{N} \underline{u}_{ij}^m x_j}{\sum_{j=1}^{N} \bar{u}_{ij}^m},\ \dfrac{\sum_{j=1}^{N} \bar{u}_{ij}^m x_j}{\sum_{j=1}^{N} \underline{u}_{ij}^m},\ \dfrac{\sum_{j=1}^{N} \bar{u}_{ij}^m x_j}{\sum_{j=1}^{N} \bar{u}_{ij}^m}\right\}$ and $x_j$ is the $j$th observation. $\square$
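The interval center update of Eqs. (56) and (57) amounts to evaluating the crisp weighted-mean center at the four lower/upper membership combinations in $S_i$ and taking the componentwise envelope. A sketch under the assumption of Euclidean distance (the function name and the toy inputs are ours):

```python
import numpy as np

def interval_centers(u_low, u_high, X, m=2.0):
    """Interval cluster-centre update of Eqs. (56)-(57): evaluate the crisp
    weighted-mean centre at the four lower/upper membership combinations of
    S_i and take the componentwise envelope [min(S_i), max(S_i)].
    u_low, u_high: (c, N) membership bounds; X: (N, d) observations."""
    centers = []
    for ul, uh in zip(u_low, u_high):
        S = [num @ X / den.sum()              # one candidate centre per pair
             for num in (ul ** m, uh ** m)
             for den in (ul ** m, uh ** m)]
        S = np.stack(S)
        centers.append((S.min(axis=0), S.max(axis=0)))
    return centers

# one cluster, three 1-D observations, memberships bounded in [0.5, 1.0]
u_low  = np.array([[0.5, 0.5, 0.5]])
u_high = np.array([[1.0, 1.0, 1.0]])
X = np.array([[0.0], [2.0], [4.0]])
lo, hi = interval_centers(u_low, u_high, X)[0]
print(lo[0], hi[0])  # 0.5 8.0
```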

References

[1] D.A. Barry, J. Parlange, L. Li, H. Prommer, C.J. Cunningham, F. Stagnitti, Analytical approximations for real values of the lambert W-function, Math.
Comput. Simul. 53 (2000) 95–103.
[2] N.R. Cazarez-Castro, L.T. Aguilar, O. Castillo, Designing type-1 and type-2 fuzzy logic controllers via fuzzy Lyapunov synthesis for nonsmooth
mechanical systems, Eng. Appl. Artif. Intell. 25 (2012) 971–979.
[3] F. Chapeau-blondeau, A. Monir, Numerical evaluation of the lambert W function and application to generation of generalized, IEEE Trans. Signal
Process. 50 (2002) 2160–2165.
[4] R.M. Corless, G.H. Gonnet, D.E.G. Hare, D.J. Jeffrey, D.E. Knuth, On the Lambert-W function, Adv. Comput. Math. 5 (1996) 329–359.
[5] H. Dawood, Interval Mathematics, Master of Science, Department of Mathematics, Faculty of Science, Cairo University, 2012.
[6] H.D. Duong, D.D. Nguyen, L.T. Ngo, D.T. Tinh, On approach to vision based fire detection based on type-2 fuzzy clustering, in: A. Abraham, H. Liu, C. Guo,
S. McLoone, E. Corchado (Eds.), Proc. Int. Conf. of Soft Computing and Pattern Recognition, SoCPaR, Dalian, China, 2011, pp. 51–56.
[7] D. Enke, M. Grauer, N. Mehdiyev, Stock market prediction with multiple regression, fuzzy type-2 clustering and neural networks, Proc. Comput. Sci. 6
(2011) 201–206.

[8] M.H. Fazel Zarandi, R. Gamasaee, I.B. Turksen, A type-2 fuzzy c-regression clustering algorithm for Takagi–Sugeno system identification and its
application in the steel industry, Inform. Sci. 187 (2012) 179–203.
[9] M.H. Fazel Zarandi, M. Zarinbal, M. Izadi, Systematic image processing for diagnosing brain tumors: a Type-II fuzzy expert system approach, Appl. Soft
Comput. 11 (2011) 285–294.
[10] M.H. Fazel Zarandi, M. Zarinbal, I.B. Turksen, Type-II fuzzy possibilistic c-mean clustering, in: J.P. Carvalho, D. Dubois, U. Kaymak, J.M.C. Sousa (Eds.),
Proc. Joint Int. Fuzzy Systems Association World Congress and European Society of Fuzzy Logic and Technology, IFSA-EUSFLAT, Lisbon, Portugal, 2009,
pp. 30–35.
[11] J. Fortin, D. Dubois, H. Fargier, Gradual numbers and their application to fuzzy interval analysis, IEEE Trans. Fuzzy Syst. 16 (2008) 388–402.
[12] Y.Y. Guh, M.S. Yang, R.W. Po, E.S. Lee, Interval-valued fuzzy relation-based clustering with its application to performance evaluation, Comput. Math.
Appl. 57 (2009) 841–849.
[13] D. Hidalgo, P. Melin, O. Castillo, An optimization method for designing type-2 fuzzy inference systems based on the footprint of uncertainty using
genetic algorithms, Expert Syst. Appl. 39 (2012) 4590–4598.
[14] F. Hopner, F. Klawonn, R. Kruse, T.A. Runkler, Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition, John Wiley &
Sons, Inc., 1999.
[15] C. Hwang, F.C.H. Rhee, Uncertain fuzzy clustering: interval type-2 fuzzy approach to C-means, IEEE Trans. Fuzzy Syst. 15 (2007) 107–120.
[16] P. Kaur, I.M.S. Lamba, A. Gosain, Kernelized type-2 fuzzy c-means clustering algorithm in segmentation of noisy medical images, Recent Adv. Intell.
Comput. Syst. (RAICS) (2011) 493–498.
[17] J.F. Kolen, T. Hutcheson, Reducing the time complexity of the fuzzy c-means algorithm, IEEE Trans. Fuzzy Syst. 10 (2002) 263–267.
[18] R. Krishnapuram, J.M. Keller, The possibilistic c-means algorithm: insights and recommendations, IEEE Trans. Fuzzy Syst. 4 (1996) 385–393.
[19] O. Linda, M. Manic, General type-2 fuzzy C-means algorithm for uncertain fuzzy clustering, IEEE Trans. Fuzzy Syst. 20 (2012) 883–897.
[20] C.M. Lou, M.C. Dong, Modeling data uncertainty on electric load forecasting based on type-2 fuzzy logic set theory, Eng. Appl. Artif. Intell. 25 (2012)
1567–1576.
[21] P. Melin, O. Mendoza, O. Castillo, Face recognition with an improved interval type-2 fuzzy logic sugeno integral and modular neural networks, IEEE
Trans. Syst. Man Cybernet. – Part A: Syst. Humans 41 (2011) 1001–1012.
[22] P. Melin, L. Astudillo, O. Castillo, F. Valdez, M. Garcia, Optimal design of type-2 and type-1 fuzzy tracking controllers for autonomous mobile robots
under perturbed torques using a new chemical optimization paradigm, Expert Syst. Appl. 40 (2013) 3185–3195.
[23] J.M. Mendel, R.I.B. John, Type-2 fuzzy sets made simple, IEEE Trans. Fuzzy Syst. 10 (2002) 117–127.
[24] J.M. Mendel, On answering the question ''Where do I start in order to solve a new problem involving interval type-2 fuzzy sets?'', Inform. Sci. 179 (2009) 3418–3431.
[25] R.E. Moore, Interval Analysis, Prentice-Hall, New York, 1966.
[26] R.E. Moore, R.B. Kearfott, M.J. Cloud, Introduction to Interval Analysis, Society for Industrial Mathematics, 2009.
[27] S.O. Olatunji, A. Selamat, A. Abdul Raheem, Improved sensitivity based linear learning method for permeability prediction of carbonate reservoir using
interval type-2 fuzzy logic system, Appl. Soft Comput. 14 (2014) 144–155.
[28] J. de Oliveira, W. Pedrycz, Advances in Fuzzy Clustering and Its Applications, Wiley Online Library, 2007.
[29] I. Ozkan, I.B. Turksen, Upper and lower values for the level of fuzziness in FCM, Inform. Sci. 177 (2007) 5143–5152.
[30] M. Palanivelu, M. Duraisamy, Color textured image segmentation using ICICM – Interval Type-2 Fuzzy C-means clustering hybrid approach, Eng. J. 16
(2012) 115–126.
[31] W. Pedrycz, Granular Computing: Analysis and Design of Intelligent Systems, CRC Press/Francis Taylor, Boca Raton, 2013.
[32] W. Pedrycz, Knowledge-based Clustering: From Data to Information Granules, John Wiley & Sons, 2005.
[33] C. Qiu, J. Xiao, L. Yu, L. Han, An interval type-2 fuzzy c-means algorithm based on spatial information for image segmentation, in: Y. Ding, Y. Li, Z. Fan, S.
Li, L. Wang (Eds.), Proc. 8th Int. Conf. on Fuzzy Systems and Knowledge Discovery, FSKD, Shanghai, China, 2011, pp. 545–549.
[34] M.A. Raza, F.C.H. Rhee, Interval type-2 approach to kernel possibilistic C-means clustering, in: IEEE Int. Conf. on Fuzzy Systems, FUZZ-IEEE, Brisbane,
Australia, 2012, pp. 1–7.
[35] D. Wu, J.M. Mendel, Uncertainty measures for interval type-2 fuzzy sets, Inform. Sci. 177 (2007) 5378–5393.
[36] M. Zarinbal, M.H. Fazel Zarandi, I.B. Turksen, Relative entropy fuzzy c-means clustering, Inform. Sci. 260 (2014) 74–97.
[37] Z. Zhang, S. Zhang, A novel approach to multi attribute group decision making based on trapezoidal interval type-2 fuzzy soft sets, Appl. Math. Model.
37 (2013) 4948–4971.
[38] E. Britannica, Image Processing, 17 April 2013. <http://www.britannica.com/EBchecked/topic/283261/image-processing>.
[39] A. Frank, A. Asuncion, Machine Learning Repository, 17 April 2013. <http://archive.ics.uci.edu/ml>.
