
2010 IEEE International Conference on Granular Computing

Semi-supervised Fuzzy c-Means Clustering


Using Clusterwise Tolerance Based Pairwise Constraints

Yukihiro Hamasuna , Yasunori Endo and Sadaaki Miyamoto


Dept. of Risk Engineering, Fac. of Sys. and Info. Eng., University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan
Email : yhama@soft.risk.tsukuba.ac.jp.
Email : {endo, miyamoto}@risk.tsukuba.ac.jp.
Research Fellow of the Japan Society for the Promotion of Science
JSPS, 6 Ichibancho, Chiyoda-ku, Tokyo 102-8471, Japan

Abstract: Recently, semi-supervised clustering has attracted attention and has been discussed in many research fields. In semi-supervised clustering, prior knowledge or information is often formulated as pairwise constraints, that is, must-link and cannot-link. Such pairwise constraints are frequently used to improve clustering performance. In this paper, we propose a new semi-supervised fuzzy c-means clustering method using clusterwise tolerance and pairwise constraints. First, the concepts of clusterwise tolerance and pairwise constraints are introduced. Second, the optimization problem of fuzzy c-means clustering using clusterwise tolerance based pairwise constraints is formulated; in particular, the must-link constraint is considered and introduced. Third, a new clustering algorithm is constructed from the above discussion. Finally, the effectiveness of the proposed algorithm is verified through numerical examples.

Keywords: semi-supervised clustering, fuzzy c-means clustering, clusterwise tolerance, pairwise constraints.

I. INTRODUCTION

The aim of data analysis methods is to discover important properties or knowledge in large quantities of data. Recently, semi-supervised learning has also attracted attention in many studies [1]-[8]. In the field of clustering [12], [14], pairwise constraints are frequently used to improve clustering results by exploiting background knowledge or prior information [2], [3]. Pairwise constraint problems have also been addressed with probabilistic models [4], fuzzy clustering models [5], [8], and agglomerative hierarchical clustering [9]-[11]. In addition, soft constraints, which are introduced as penalty terms in the objective function, are another approach [7], [8]. With soft constraints, the pairwise constraints are not always satisfied. Both hard and soft constraints are frequently considered in semi-supervised learning methods.

In recent years, semi-supervised clustering based on k-means, fuzzy c-means, and kernel methods has been widely discussed [2], [5], [7], [8]. In these methods, pairwise constraints, referred to as must-link and cannot-link, are used as prior or background knowledge about which objects should be in the same or different clusters [2]. However, because the dissimilarity is defined by the squared Euclidean norm, it is difficult to introduce pairwise constraints directly in the Euclidean space. When a cannot-link constraint is imposed, the dissimilarity between the two objects becomes infinite, which in general breaks the Euclidean structure. To avoid such problems, methods based on kernel functions have been proposed [5], [8]; in these methods, pairwise constraints are considered not in the input space but in a high-dimensional feature space.

To handle clusters of different sizes or shapes, the concept of clusterwise tolerance has been proposed [15], building on the concept of tolerance [16]. The squared Euclidean norm is rewritten as the dissimilarity between a data point with a clusterwise tolerance vector and a cluster center. By using clusterwise tolerance, clusters of different sizes or shapes can be handled within the Euclidean space. In this spirit, we propose clusterwise tolerance based pairwise constraints, which introduce pairwise constraints into the Euclidean space in a natural way, and we propose a semi-supervised fuzzy c-means clustering method that uses them.

This paper is organized as follows. In the second section, we introduce notation, fuzzy c-means clustering (FCM), and pairwise constraints. In the third section, we propose clusterwise tolerance based pairwise constraints. In the fourth section, we propose semi-supervised fuzzy c-means clustering using clusterwise tolerance based pairwise constraints (SSFCMCT). In the fifth section, we show the effectiveness of the proposed method through numerical examples. In the last section, we conclude the paper.

978-0-7695-4161-7/10 $26.00 (c) 2010 IEEE
DOI 10.1109/GrC.2010.149

II. PRELIMINARIES

Let a data point, a cluster, and its cluster center be $x_k = (x_{k1}, \ldots, x_{kp})^T \in \mathbb{R}^p$ $(k = 1, \ldots, n)$, $C_i$ $(i = 1, \ldots, c)$, and $v_i = (v_{i1}, \ldots, v_{ip})^T \in \mathbb{R}^p$, respectively. Moreover, $u_{ki}$ is the membership grade of $x_k$ belonging to $C_i$, and we denote the partition matrix $U = (u_{ki})_{1 \le k \le n,\, 1 \le i \le c}$. The set of data points and the set of cluster centers are $X = \{x_1, \ldots, x_n\}$ and $V = \{v_1, \ldots, v_c\}$, respectively.

A. Fuzzy c-Means Clustering

Fuzzy c-means clustering (FCM) is based on optimizing an objective function under a constraint on the membership grades. We consider the following two objective functions $J_s$ and $J_e$:
$$J_s(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ki})^m d_{ki},$$
$$J_e(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} \left\{ u_{ki} d_{ki} + \lambda^{-1} u_{ki} \log u_{ki} \right\},$$
where $m$ and $\lambda$ are fuzzifier parameters. $J_s$ is the well-known objective function of standard fuzzy c-means clustering (sFCM) proposed by Bezdek [12], and $J_e$ is that of entropy-based fuzzy c-means clustering (eFCM) [13]. The constraint on $u_{ki}$ is as follows:
$$U_f = \left\{ (u_{ki}) : u_{ki} \in [0, 1],\ \sum_{i=1}^{c} u_{ki} = 1,\ \forall k \right\}. \quad (1)$$
The dissimilarity for clustering is generally the squared $L_2$-norm between each data point and a cluster center:
$$d_{ki} = \| x_k - v_i \|^2 = \sum_{j=1}^{p} (x_{kj} - v_{ij})^2.$$

B. Pairwise constraints

Typical examples of pairwise constraints are must-link and cannot-link [2]. These constraints represent prior or background knowledge about which objects should be in the same or different clusters. A set $ML = \{(x_k, x_l)\} \subset X \times X$ consists of must-link pairs, meaning that $x_k$ and $x_l$ should be in the same cluster, while another set $CL = \{(x_k, x_l)\} \subset X \times X$ consists of cannot-link pairs, meaning that $x_k$ and $x_l$ should be in different clusters. Obviously, $ML$ and $CL$ are assumed to be symmetric, that is, if $(x_k, x_l) \in ML$ then $(x_l, x_k) \in ML$, and if $(x_k, x_l) \in CL$ then $(x_l, x_k) \in CL$.

In many studies, these pairwise constraints are treated as hard or soft constraints. A hard constraint means that the pairwise constraints $ML$ and $CL$ are always satisfied in the clustering procedure and results, while they are not always satisfied in the case of soft constraints. Many semi-supervised clustering methods based on such hard or soft constraints have been proposed to improve clustering results by using background knowledge or prior information about the data set [2]-[8].

III. CLUSTERWISE TOLERANCE BASED PAIRWISE CONSTRAINTS

A. Clusterwise tolerance

First, we define the clusterwise tolerance and the clusterwise tolerance vector. A clusterwise tolerance $\kappa_{ki} \ge 0$ means the admissible range of each clusterwise tolerance vector. The set of clusterwise tolerance vectors is defined as $E = \{\varepsilon_{11}, \ldots, \varepsilon_{nc}\}$, in which each clusterwise tolerance vector $\varepsilon_{ki} = (\varepsilon_{ki1}, \ldots, \varepsilon_{kip})^T \in \mathbb{R}^p$ is a $p$-dimensional vector with real components lying within the range of the clusterwise tolerance. In conventional studies, a data point is represented as $x_k$; here, a data point with clusterwise tolerance is represented as $x_k + \varepsilon_{ki}$, and the dissimilarity is calculated as $\| x_k + \varepsilon_{ki} - v_i \|^2$.

The constraint on the clusterwise tolerance vectors is as follows:
$$\| \varepsilon_{ki} \|^2 \le (\kappa_{ki})^2 \quad (\kappa_{ki} \ge 0), \quad \forall k, i. \quad (2)$$
Figure 1 shows a clusterwise tolerance in $\mathbb{R}^2$.

Figure 1. An illustrative example of the concept of clusterwise tolerance in $\mathbb{R}^2$.

B. Clusterwise tolerance based pairwise constraints

As mentioned above, pairwise constraints represent prior or background knowledge about which objects should be in the same or different clusters. In this spirit, if $(x_k, x_l) \in ML$, $\varepsilon_{ki}$ and $\varepsilon_{li}$ are calculated so as to be near each other, while if $(x_k, x_l) \in CL$, $\varepsilon_{ki}$ and $\varepsilon_{li}$ are calculated so as to be distant from each other.

Here, we propose clusterwise tolerance based pairwise constraints. First, the sets of must-linked and cannot-linked objects are defined. A set $ML(x_k)$ consists of the must-linked objects that are linked with an object $x_k$, while $CL(x_k)$ consists of the cannot-linked ones:
$$ML(x_k) = \{ x \mid x \in X,\ (x_k, x) \in ML \}, \quad (3)$$
$$CL(x_k) = \{ x \mid x \in X,\ (x_k, x) \in CL \}. \quad (4)$$

The concept of clusterwise tolerance based pairwise constraints uses these sets to calculate the bound $|K(x_k, v_i)|$ on the clusterwise tolerance vector, defined between a data point and a cluster center. The value of $K(x_k, v_i)$ is calculated as the sum of the clusterwise tolerances $\kappa$ over the must- and cannot-linked objects.
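To make this construction concrete, the linked-object sets (3), (4) and the resulting tolerance bound $K$ can be sketched in Python. The dictionary layout, 1-based object labels, and 0-based cluster indices are our own illustrative choices, not from the paper; the $\kappa$ values reproduce the Figure 2 example in the text:

```python
from itertools import product

# Toy setup mirroring the Figure 2 example: 3 objects, 2 clusters,
# with clusterwise tolerances kappa[k][i] (i = 0, 1 for v_1, v_2).
kappa = {1: [1.0, 0.0], 2: [0.0, 0.0], 3: [0.0, 1.0]}
ML = {(1, 2), (2, 1)}            # must-link pairs (kept symmetric)
CL = {(2, 3), (3, 2)}            # cannot-link pairs (kept symmetric)

def ml_set(k):
    """ML(x_k): objects must-linked with x_k (Eq. (3))."""
    return {l for (a, l) in ML if a == k}

def cl_set(k):
    """CL(x_k): objects cannot-linked with x_k (Eq. (4))."""
    return {l for (a, l) in CL if a == k}

def K(k, i):
    """Tolerance bound K(x_k, v_i): kappa_ki plus must-linked
    tolerances minus cannot-linked tolerances."""
    return (kappa[k][i]
            + sum(kappa[q][i] for q in ml_set(k))
            - sum(kappa[r][i] for r in cl_set(k)))

for k, i in product([1, 2, 3], [0, 1]):
    print(f"K(x{k}, v{i + 1}) = {K(k, i):+.1f}")
```

Running this reproduces the signed bounds worked out for Figure 2 below, including the negative bound produced by the cannot-link pair.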
$$K(x_k, v_i) = \kappa_{ki} + \sum_{x_q \in ML(x_k)} \kappa_{qi} - \sum_{x_r \in CL(x_k)} \kappa_{ri}. \quad (5)$$

If $K(x_k, v_i) > 0$, $\varepsilon_{ki}$ is calculated so as to be near the cluster center $v_i$, while if $K(x_k, v_i) < 0$, $\varepsilon_{ki}$ is calculated so as to be distant from $v_i$. In the case $K(x_k, v_i) = 0$, $\varepsilon_{ki}$ has no effect. In this way, the prior or background knowledge can be handled as vectors in the Euclidean space by using clusterwise tolerance based pairwise constraints.

Next, we show an illustrative example of clusterwise tolerance based pairwise constraints. Figure 2 is a simple example of the proposed method.

Figure 2. An illustrative example of a clusterwise tolerance based pairwise constraint.

In this example, $(x_1, x_2) \in ML$ and $(x_2, x_3) \in CL$. Moreover, $\kappa_{11} = 1.0$, $\kappa_{12} = 0.0$, $\kappa_{21} = 0.0$, $\kappa_{22} = 0.0$, $\kappa_{31} = 0.0$, and $\kappa_{32} = 1.0$. Therefore, $ML(x_k)$, $CL(x_k)$, and $K(x_k, v_i)$ are as follows. For $x_1$,
$$ML(x_1) = \{x_2\}, \quad CL(x_1) = \emptyset,$$
$$K(x_1, v_1) = 1.0, \quad K(x_1, v_2) = 0.0.$$
For $x_2$,
$$ML(x_2) = \{x_1\}, \quad CL(x_2) = \{x_3\},$$
$$K(x_2, v_1) = 1.0, \quad K(x_2, v_2) = -1.0.$$
For $x_3$,
$$ML(x_3) = \emptyset, \quad CL(x_3) = \{x_2\},$$
$$K(x_3, v_1) = 0.0, \quad K(x_3, v_2) = 1.0.$$
In this way, pairwise constraints are handled by using clusterwise tolerance vectors.

IV. SEMI-SUPERVISED FUZZY c-MEANS CLUSTERING USING CLUSTERWISE TOLERANCE BASED PAIRWISE CONSTRAINTS

In this section, we consider semi-supervised fuzzy c-means clustering using clusterwise tolerance based pairwise constraints (SSFCMCT). In particular, we consider and introduce only the must-link constraint as a clusterwise tolerance based pairwise constraint. Therefore, (5) is rewritten as follows:
$$K(x_k, v_i) = \kappa_{ki} + \sum_{x_q \in ML(x_k)} \kappa_{qi}.$$

A. Standard model

The objective function of semi-supervised standard fuzzy c-means clustering using clusterwise tolerance based pairwise constraints (SSsFCMCT) is as follows:
$$J_{sct}(U, E, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ki})^m \| x_k + \alpha_{ki} \varepsilon_{ki} - v_i \|^2, \quad (6)$$
where $\alpha_{ki}$ is a parameter calculated as follows:
$$\alpha_{ki} = \begin{cases} 1 & (K(x_k, v_i) > 0), \\ 0 & (K(x_k, v_i) = 0). \end{cases}$$
In SSsFCMCT, the dissimilarity is as follows:
$$d_{ki} = \| x_k + \alpha_{ki} \varepsilon_{ki} - v_i \|^2.$$
The constraint on $u_{ki}$ remains the same as (1). The constraint (2) on $\varepsilon_{ki}$ is rewritten by using $K(x_k, v_i)$ as follows:
$$\| \varepsilon_{ki} \|^2 \le \{ \alpha_{ki} K(x_k, v_i) \}^2, \quad \forall k, i. \quad (7)$$

From the convexity of (6), the Lagrangian $L$ is as follows:
$$L = J_{sct}(U, E, V) + \sum_{k=1}^{n} \gamma_k \left( \sum_{i=1}^{c} u_{ki} - 1 \right) + \sum_{k=1}^{n} \sum_{i=1}^{c} \delta_{ki} \left( \| \varepsilon_{ki} \|^2 - \{ \alpha_{ki} K(x_k, v_i) \}^2 \right),$$
where $\gamma_k$ and $\delta_{ki}$ are Lagrange multipliers.

The optimal solutions for $u_{ki}$ and $v_i$ are derived from the Lagrangian as follows:
$$u_{ki} = \frac{\left( \dfrac{1}{d_{ki}} \right)^{\frac{1}{m-1}}}{\sum_{l=1}^{c} \left( \dfrac{1}{d_{kl}} \right)^{\frac{1}{m-1}}}, \quad (8)$$
$$v_i = \frac{\sum_{k=1}^{n} (u_{ki})^m (x_k + \alpha_{ki} \varepsilon_{ki})}{\sum_{k=1}^{n} (u_{ki})^m}. \quad (9)$$
If $x_k + \alpha_{ki} \varepsilon_{ki} = v_i$ for some $i$, we set $u_{ki} = 1/|C'|$, where $|C'|$ is the number of cluster centers that satisfy $d_{ki} = 0$.
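As a concreteness check on the update rules just derived, (8) and (9) can be sketched with NumPy. The $(n, c, p)$ array layout, the small random toy data, and the zero tolerance vectors are our own illustrative choices, not part of the paper:

```python
import numpy as np

def update_memberships(Xa, V, m=2.0):
    """Eq. (8): u_ki from d_ki = ||x_k + alpha_ki eps_ki - v_i||^2.

    Xa has shape (n, c, p): row k holds x_k + alpha_ki * eps_ki for
    every cluster i (an illustrative layout, not from the paper)."""
    d = np.sum((Xa - V[None, :, :]) ** 2, axis=2)      # (n, c)
    d = np.maximum(d, 1e-12)                           # guard d_ki = 0
    w = d ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)

def update_centers(U, Xa, m=2.0):
    """Eq. (9): v_i as the (u_ki)^m-weighted mean of x_k + alpha_ki eps_ki."""
    W = U ** m                                         # (n, c)
    return (W[:, :, None] * Xa).sum(axis=0) / W.sum(axis=0)[:, None]

rng = np.random.default_rng(0)
n, c, p = 7, 2, 2
X = rng.uniform(0, 10, size=(n, p))
Xa = np.repeat(X[:, None, :], c, axis=1)               # eps = 0 here
V = X[rng.choice(n, size=c, replace=False)]            # initial centers
U = update_memberships(Xa, V)
V = update_centers(U, Xa)
print(U.sum(axis=1))                                   # each row sums to 1
```

With all tolerance vectors set to zero, these two updates reduce to ordinary sFCM, which is a useful sanity check before the tolerance update is added.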

Next, we consider $\varepsilon_{ki}$. The Karush-Kuhn-Tucker (KKT) conditions are as follows:
$$\frac{\partial L}{\partial \varepsilon_{ki}} = 0, \quad \frac{\partial L}{\partial \delta_{ki}} \le 0, \quad \delta_{ki} \frac{\partial L}{\partial \delta_{ki}} = 0, \quad \delta_{ki} \ge 0. \quad (10)$$
From $\frac{\partial L}{\partial \varepsilon_{ki}} = 0$, we get
$$\varepsilon_{ki} = -\frac{(u_{ki})^m \alpha_{ki} (x_k - v_i)}{(u_{ki})^m + \delta_{ki}}. \quad (11)$$
From $\delta_{ki} \frac{\partial L}{\partial \delta_{ki}} = 0$, we get
$$\delta_{ki} \left( \| \varepsilon_{ki} \|^2 - \{ \alpha_{ki} K(x_k, v_i) \}^2 \right) = 0. \quad (12)$$
From (12), we should consider two cases, i.e., $\delta_{ki} = 0$ and $\| \varepsilon_{ki} \|^2 = \{ \alpha_{ki} K(x_k, v_i) \}^2$. First, we consider the case $\delta_{ki} = 0$. In this case, the constraint (7) is not active, and from (11) we get
$$\varepsilon_{ki} = -\alpha_{ki} (x_k - v_i).$$
On the other hand, in the case $\| \varepsilon_{ki} \|^2 = \{ \alpha_{ki} K(x_k, v_i) \}^2$,
$$\| \varepsilon_{ki} \|^2 = \left\| \frac{\alpha_{ki} (u_{ki})^m (x_k - v_i)}{(u_{ki})^m + \delta_{ki}} \right\|^2 = \{ \alpha_{ki} K(x_k, v_i) \}^2.$$
From $(\alpha_{ki})^2 = 1$ and $(u_{ki})^m + \delta_{ki} > 0$,
$$\frac{(u_{ki})^m}{(u_{ki})^m + \delta_{ki}} = \frac{\alpha_{ki} K(x_k, v_i)}{\| x_k - v_i \|}. \quad (13)$$
From (11) and (13),
$$\varepsilon_{ki} = -\frac{(\alpha_{ki})^2 K(x_k, v_i) (x_k - v_i)}{\| x_k - v_i \|}.$$
From the above, we get the optimal solution for $\varepsilon_{ki}$ as follows:
$$\varepsilon_{ki} = -\alpha_{ki} \beta_{ki} (x_k - v_i), \quad (14)$$
where
$$\beta_{ki} = \min \left\{ \frac{\alpha_{ki} K(x_k, v_i)}{\| x_k - v_i \|},\ 1 \right\}.$$

B. Entropy model

The objective function of semi-supervised entropy-based fuzzy c-means clustering using clusterwise tolerance based pairwise constraints (SSeFCMCT) is:
$$J_{ect}(U, E, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} \left\{ u_{ki} d_{ki} + \lambda^{-1} u_{ki} \log u_{ki} \right\}. \quad (15)$$
The constraints on $u_{ki}$ and $\varepsilon_{ki}$ remain the same as (1) and (7). By the same procedure as in the above section, the optimal solutions for $u_{ki}$, $v_i$, and $\varepsilon_{ki}$ are derived as follows:
$$u_{ki} = \frac{\exp(-\lambda d_{ki})}{\sum_{l=1}^{c} \exp(-\lambda d_{kl})}, \quad (16)$$
$$v_i = \frac{\sum_{k=1}^{n} u_{ki} (x_k + \alpha_{ki} \varepsilon_{ki})}{\sum_{k=1}^{n} u_{ki}}, \quad (17)$$
$$\varepsilon_{ki} = -\alpha_{ki} \beta_{ki} (x_k - v_i), \quad \text{where} \quad \beta_{ki} = \min \left\{ \frac{\alpha_{ki} K(x_k, v_i)}{\| x_k - v_i \|},\ 1 \right\}.$$

C. Algorithms

The algorithm of SSFCMCT is described as Algorithm 1. Equations A, B, and C follow Table I.

Algorithm 1 SSFCMCT
SSFCMCT1 Set the initial values and parameters.
SSFCMCT2 Calculate $u_{ki} \in U$ using Equation A.
SSFCMCT3 Calculate $v_i \in V$ using Equation B.
SSFCMCT4 Calculate $\varepsilon_{ki} \in E$ using Equation C.
SSFCMCT5 If the convergence criterion is satisfied, stop. Otherwise, go back to SSFCMCT2.

In these algorithms, the convergence criterion is convergence of each variable, convergence of the value of the objective function, or a maximum number of iterations.

Table I
THE OPTIMAL SOLUTIONS OF SSFCMCT.

Algorithm    Equation A    Equation B    Equation C
SSsFCMCT     (8)           (9)           (14)
SSeFCMCT     (16)          (17)          (14)

V. NUMERICAL EXAMPLES

In this section, we show numerical examples with a simple artificial data set. The data set consists of seven data points in a two-dimensional pattern space, described in Table II, and should be classified into two clusters. Fig. 3 is an illustration of the artificial data set. The cluster allocation of the data point $x_4 = (5.0, 5.0)$ depends strongly on the initial values. We therefore examine whether SSFCMCT can classify $x_4$ into an adequate cluster by using pairwise constraints. First, we set the following parameters: $\kappa_{11} = 1.0$, $\kappa_{12} = 0.0$, $\kappa_{21} = 1.0$, $\kappa_{22} = 0.0$, $\kappa_{31} = 1.0$, $\kappa_{32} = 0.0$, $\kappa_{41} = 0.0$, $\kappa_{42} = 0.0$, $\kappa_{51} = 0.0$, $\kappa_{52} = 1.0$, $\kappa_{61} = 0.0$, $\kappa_{62} = 1.0$, $\kappa_{71} = 0.0$, $\kappa_{72} = 1.0$.
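A minimal end-to-end sketch of Algorithm 1 for SSsFCMCT on the Table II data, using the must-link set of Fig. 6; the initial centers, iteration count, and 0-based indexing are our own illustrative choices, and $\alpha_{ki}$ is folded into $\varepsilon_{ki}$ (zero tolerance when $K \le 0$):

```python
import numpy as np

# Table II data set (seven points in R^2) and the must-link set of Fig. 6.
X = np.array([[0., 0.], [0., 10.], [2., 5.], [5., 5.],
              [8., 5.], [10., 0.], [10., 10.]])
ML = {(0, 2), (2, 0), (1, 2), (2, 1), (2, 3), (3, 2)}   # 0-based, symmetric
kappa = np.array([[1., 0.], [1., 0.], [1., 0.], [0., 0.],
                  [0., 1.], [0., 1.], [0., 1.]])        # kappa_ki from Sec. V

n, c, m = 7, 2, 2.0

def K_bound(k, i):
    """Must-link-only bound K(x_k, v_i) = kappa_ki + sum over ML(x_k)."""
    return kappa[k, i] + sum(kappa[q, i] for (a, q) in ML if a == k)

V = X[[0, 6]].astype(float)            # illustrative initial centers
eps = np.zeros((n, c, 2))
for _ in range(50):                    # Algorithm 1: repeat steps A, B, C
    Xa = X[:, None, :] + eps           # x_k + alpha_ki eps_ki
    d = np.maximum(((Xa - V[None]) ** 2).sum(axis=2), 1e-12)
    w = d ** (-1.0 / (m - 1.0))
    U = w / w.sum(axis=1, keepdims=True)                           # Eq. (8)
    W = U ** m
    V = (W[:, :, None] * Xa).sum(axis=0) / W.sum(axis=0)[:, None]  # Eq. (9)
    for k in range(n):                 # Eq. (14): eps = -alpha beta (x_k - v_i)
        for i in range(c):
            Kki = K_bound(k, i)
            if Kki <= 0:               # alpha_ki = 0: no tolerance
                eps[k, i] = 0.0
                continue
            diff = X[k] - V[i]
            beta = min(Kki / max(np.linalg.norm(diff), 1e-12), 1.0)
            eps[k, i] = -beta * diff

labels = U.argmax(axis=1)
print(labels)
```

By construction, $\|\varepsilon_{ki}\| \le K(x_k, v_i)$ always holds, and the must-linked points $x_1, x_2, x_3$ end up sharing a cluster, which is the qualitative behavior reported for Fig. 6.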

Table II
DATA SET $\{x_k \mid x_k \in \mathbb{R}^2,\ k = 1, \ldots, 7\}$.

k    $(x_{k1}, x_{k2})$
1    (0.0, 0.0)
2    (0.0, 10.0)
3    (2.0, 5.0)
4    (5.0, 5.0)
5    (8.0, 5.0)
6    (10.0, 0.0)
7    (10.0, 10.0)

Here we show the results of SSsFCMCT with $m = 2.0$. Figs. 4, 5, 6, and 7 are illustrative examples of classification results. In these figures, the two point markers denote the two clusters and a separate marker denotes the cluster centers; clusterwise tolerance vectors are drawn as arrows, and the range of $K(x_k, v_i)$ and the pairwise constraints are indicated by circles. In Fig. 4, we set $ML = \emptyset$. When $ML$ is not considered, the proposed method is equivalent to fuzzy c-means clustering for data with clusterwise tolerance [15]. In Fig. 5, we set $ML = \{(x_1, x_3), (x_2, x_3)\}$; therefore, $K(x_1, v_1) = 2.0$, $K(x_2, v_1) = 2.0$, and $K(x_3, v_1) = 3.0$. In Fig. 6, we set $ML = \{(x_1, x_3), (x_2, x_3), (x_3, x_4)\}$, while we set $ML = \{(x_1, x_3), (x_2, x_3), (x_4, x_5)\}$ in Fig. 7. Owing to the difference in $ML$, $x_4$ is classified into a different cluster than in the other examples. Moreover, the clusterwise tolerance vectors point to the opposite side in Fig. 7.

From these results, we can see that SSFCMCT classifies this data set into adequate clusters by using clusterwise tolerance based pairwise constraints.

VI. CONCLUSIONS

In this paper, we have proposed semi-supervised fuzzy c-means clustering using clusterwise tolerance based pairwise constraints. The proposed method can handle pairwise constraints without breaking the Euclidean space by using the concept of clusterwise tolerance. Moreover, we have shown the effectiveness of the proposed method through numerical examples. The proposed method differs from other semi-supervised clustering methods in that it handles pairwise constraints through clusterwise tolerance vectors.

In future work, we will consider how to handle the cannot-link constraint with the proposed method. Next, we will compare the proposed method with other semi-supervised clustering methods through numerical examples on various kinds of data sets. Moreover, we will apply the proposed method to fuzzy c-means clustering for data with clusterwise tolerance based on regularization [17], [18].

ACKNOWLEDGMENTS

This study is partly supported by Research Fellowships of the Japan Society for the Promotion of Science for Young Scientists and the Grant-in-Aid for Scientific Research (C) (Project No. 21500212) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

REFERENCES

[1] O. Chapelle, B. Scholkopf, A. Zien, eds., Semi-Supervised Learning, MIT Press, 2006.
[2] K. Wagstaff, C. Cardie, S. Rogers, S. Schroedl, "Constrained k-means clustering with background knowledge," Proc. of the 18th International Conference on Machine Learning (ICML 2001), pp. 577-584, 2001.
[3] S. Basu, A. Banerjee, R. J. Mooney, "Active semi-supervision for pairwise constrained clustering," Proc. of the SIAM International Conference on Data Mining (SDM 2004), pp. 333-344, 2004.
[4] S. Basu, M. Bilenko, R. J. Mooney, "A probabilistic framework for semi-supervised clustering," Proc. of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2004), pp. 59-68, 2004.
[5] S. Miyamoto, M. Yamazaki, A. Terami, "On semi-supervised clustering with pairwise constraints," Proc. of the 7th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2009), pp. 245-254, 2009 (CD-ROM).
[6] Y. Endo, Y. Hamasuna, M. Yamashiro, S. Miyamoto, "On semi-supervised fuzzy c-means clustering," Proc. of 2009 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2009), pp. 1119-1124, 2009.
[7] B. Yan, C. Domeniconi, "An adaptive kernel method for semi-supervised clustering," Proc. of the 17th European Conference on Machine Learning (ECML 2006), pp. 521-532, 2006.
[8] B. Kulis, S. Basu, I. Dhillon, R. Mooney, "Semi-supervised graph clustering: a kernel approach," Machine Learning, Vol. 74, No. 1, pp. 1-22, 2009.
[9] L. Talavera, J. Bejar, "Integrating declarative knowledge in hierarchical clustering tasks," Proc. of the Third International Symposium on Advances in Intelligent Data Analysis (IDA 99), pp. 211-222, 1999.
[10] D. Klein, S. Kamvar, C. Manning, "From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering," Proc. of the 19th International Conference on Machine Learning (ICML 2002), pp. 307-314, 2002.
[11] I. Davidson, S. S. Ravi, "Agglomerative hierarchical clustering with constraints: theoretical and empirical results," Proc. of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2005), pp. 59-70, 2005.
[12] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[13] S. Miyamoto, M. Mukaidono, "Fuzzy c-means as a regularization and maximum entropy approach," Proc. of the 7th International Fuzzy Systems Association World Congress (IFSA 97), Vol. 2, pp. 86-92, 1997.

[14] S. Miyamoto, H. Ichihashi, K. Honda, Algorithms for Fuzzy Clustering, Springer, Heidelberg, 2008.
[15] Y. Hamasuna, Y. Endo, S. Miyamoto, "On tolerant fuzzy c-means," Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII), Vol. 13, No. 4, pp. 421-427, 2009.
[16] Y. Endo, R. Murata, H. Haruyama, S. Miyamoto, "Fuzzy c-means for data with tolerance," Proc. of International Symposium on Nonlinear Theory and Its Applications (NOLTA 2005), pp. 345-348, 2005.
[17] Y. Hamasuna, Y. Endo, S. Miyamoto, "Comparison of tolerant fuzzy c-means clustering with L2- and L1-regularization," Proc. of 2009 IEEE International Conference on Granular Computing (GrC 2009), pp. 197-202, 2009.
[18] Y. Hamasuna, Y. Endo, S. Miyamoto, "On tolerant fuzzy c-means clustering with L1-regularization," Proc. of the International Fuzzy Systems Association World Congress and European Society for Fuzzy Logic and Technology Conference (IFSA-EUSFLAT 2009), pp. 1152-1157, 2009.

Figure 3. An illustrative example of an artificial data set.
Figure 4. Result with $ML = \emptyset$.
Figure 5. Result with $ML = \{(x_1, x_3), (x_2, x_3)\}$.
Figure 6. Result with $ML = \{(x_1, x_3), (x_2, x_3), (x_3, x_4)\}$.
Figure 7. Result with $ML = \{(x_1, x_3), (x_2, x_3), (x_4, x_5)\}$.