Fuzzy Subsethood - V. R Young - 1994

Fuzzy Sets and Systems 77 (1996) 371-384
Fuzzy subsethood
Virginia R. Young
School of Business, University of Wisconsin, Madison, WI 53706, USA
Received December 1993; revised April 1994
Abstract
Subsethood is an important concept in the area of fuzzy sets. It surfaces in fuzzy entropy, in the relationship of fuzzy set
theory and probability, and in tuning rules in fuzzy logic. Only a few authors (Sinha and Dougherty, 1993) have
considered axiomatizing the properties of a measure of fuzzy subsethood.
We offer fuzzy subsethood axioms as alternatives to those of Sinha and Dougherty. We show the significance of fuzzy
subsethood by demonstrating how it is connected with fuzzy entropy, probability, and fuzzy logic.
Keywords: Fuzzy subsethood; Inclusion grades; Entropy; Measures of fuzziness
1. Introduction
Subsethood is an important concept in the area of fuzzy sets. It surfaces in fuzzy entropy, in the
relationship of fuzzy set theory and probability, and in tuning rules in fuzzy logic. Only a few authors [12]
have considered axiomatizing the properties of a measure of fuzzy subsethood.
We axiomatize the properties of a measure of fuzzy subsethood so that such a measure reduces to an
entropy measure on a specific subset of $\mathcal{F}(X) \times \mathcal{F}(X)$, in which $\mathcal{F}(X)$ is the set of fuzzy sets in the universal
set X. We thus formalize the work of Kosko [8, 9], who connects fuzzy subsethood with fuzzy entropy. We
also consider the fuzzy subsethood axioms of Sinha and Dougherty [12] and offer our set of axioms as an
alternative to theirs.
Fuzzy subsethood allows a given fuzzy set to contain another to some degree between 0 and 1. This idea
fuzzifies Zadeh's fuzzy set containment [15], which is a crisp property: A fuzzy set B contains a fuzzy set A if
$m_A(x) \le m_B(x)$ for all x in X, in which $m_A$ and $m_B$ are the membership functions of A and B, respectively.
A measure of fuzzy subsethood is, thus, a fuzzy set in $\mathcal{F}(X) \times \mathcal{F}(X)$. Fuzzy subsethood measures are also
called inclusion grades in the literature [3].
The entropy of a fuzzy set is the fuzziness of that set. A measure of entropy indicates the degree to which
a set is fuzzy; an entropy measure is, therefore, a fuzzy set in $\mathcal{F}(X)$. Intuitively, it follows that a nonfuzzy, or
crisp, set has entropy 0. Also, P, the fuzzy set with all membership values identically 1/2, has maximum entropy
because it is maximally fuzzy in the standard ordering of fuzziness [2].
0165-0114/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0165-0114(95)00045-3
V.R. Young / Fuzzy Sets and Systems 77 (1996) 371-384
At first blush, fuzzy subsethood and fuzzy entropy do not seem related. However, with respect to a specific
pair of entropy and subsethood measures, Kosko [8] shows that the entropy of a fuzzy set A is the degree to
which $A \cup A^c$ is a subset of its complement $(A \cup A^c)^c = A \cap A^c$. He also shows that the classical probability of
a crisp event C is the degree to which the universal set X is a subset of C.
In this work, we extend Kosko's results by formalizing the concept of fuzzy subsethood. We link fuzzy
entropy, subsethood, and probability in this more general setting. See Section 2 for a description of the
axioms of entropy of DeLuca and Termini [2] and for a summary of Kosko's work. Sinha and Dougherty
[12] offer nine axioms for fuzzy subsethood measures, and we further motivate our subsethood axioms by
discussing theirs in Section 3.
In Section 4, we present three axioms for fuzzy subsethood measures and prove a theorem that relates
fuzzy subsethood with fuzzy entropy. We describe how to generate a subsethood measure from a given
probability distribution in Section 5. In Section 6, we briefly demonstrate how to fine tune rules in fuzzy logic
using fuzzy subsethood. Finally, we offer areas for further research in Section 7.
2. Fuzzy entropy and fuzzy subsethood

2.1. Notation
We write X to denote the universal set on which fuzzy sets are defined. Except in Section 6, we assume that
X is a finite set throughout this paper. One can readily obtain our results for X infinite - discrete or
continuous. We use capital letters A, B, C, etc. to denote fuzzy sets on X and write $m_A$, $m_B$, $m_C$, etc. for their
membership functions, respectively. In other words, the fuzzy set A is given by a function $m_A : X \to I = [0, 1]$.
Define $|A| = \sum_x m_A(x)$; we call $|A|$ the cardinality of the fuzzy set A. Let $\mathcal{F}(X)$ stand for the set of fuzzy sets
in X.
In this paper, a superscript c denotes the standard operation of complement; that is, the fuzzy set $A^c$ is
given by
$$m_{A^c}(x) = 1 - m_A(x).$$
We adopt the minimum and maximum operations for intersection and union, respectively, unless stated
otherwise. We write $A \subseteq B$ to mean that $m_A(x) \le m_B(x)$ for all x in X, or more simply $m_A \le m_B$; we say that
B contains A.
2.2. Fuzzy entropy
A measure of fuzzy entropy assesses the amount of vagueness, or fuzziness, in a fuzzy set. It is a function
$$E : \mathcal{F}(X) \to I, \qquad I = [0, 1],$$
from the set of fuzzy sets in X to the unit interval. An entropy function is, therefore, a fuzzy set in $\mathcal{F}(X)$.
DeLuca and Termini [2] formalize the properties of such measures through the following axioms:
(E1) E(A) = 0 if and only if A is nonfuzzy; that is, $m_A(x) \in \{0, 1\}$ for all $x \in X$;
(E2) E(A) = 1 if and only if A = P, in which P is the set whose membership function is the constant
function with value 1/2 at every $x \in X$;
(E3) $E(A) \le E(B)$ if A refines B; that is, $m_A(x) \le m_B(x)$ when $m_B(x) \le 1/2$ and $m_A(x) \ge m_B(x)$ when
$m_B(x) \ge 1/2$;
(E4) $E(A) = E(A^c)$.
Note that one can expand these axioms by using a more general complement and fuzziness ordering and
by setting P equal to a fixed point of that complement. Some authors have proposed additivity and valuation
constraints for entropy functions [4, 11] but we will not consider those properties here.
Kosko [8] defines an entropy function by
$$E_p(A) = \frac{L^p(A, A_{\mathrm{near}})}{L^p(A, A_{\mathrm{far}})}, \qquad p \ge 1,$$
in which $A_{\mathrm{near}}$ and $A_{\mathrm{far}}$ are the nearest and farthest nonfuzzy neighbors of A, respectively. More precisely, the
membership value of $A_{\mathrm{near}}$ at x is defined to be 0 if $m_A(x) < 1/2$ and 1 otherwise, and $A_{\mathrm{far}} = (A_{\mathrm{near}})^c$. $L^p$ is the
distance defined by
$$L^p(A, B) = \Bigl( \sum_x |m_A(x) - m_B(x)|^p \Bigr)^{1/p}.$$
Note that $E_p$ satisfies Axioms (E1)-(E4). For p = 1, Kosko shows that $E_1$ can be written as
$$E_1(A) = \frac{\sum_x \min(m_A(x), 1 - m_A(x))}{\sum_x \max(m_A(x), 1 - m_A(x))}. \tag{2.1}$$
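As an illustration, the following minimal Python sketch evaluates Kosko's entropy both from the distance ratio and from the closed form in Eq. (2.1). It assumes a finite X and represents a fuzzy set as a list of membership values; the function names `lp_entropy` and `kosko_entropy` are ours, not the paper's.

```python
def kosko_entropy(m):
    """Closed form of Eq. (2.1): sum of min(m, 1 - m) over sum of max(m, 1 - m)."""
    num = sum(min(v, 1.0 - v) for v in m)
    den = sum(max(v, 1.0 - v) for v in m)  # each term is >= 1/2, so den > 0 for nonempty X
    return num / den


def lp_entropy(m, p=1):
    """Ratio of L^p distances from A to its nearest and farthest crisp neighbors."""
    near = [0.0 if v < 0.5 else 1.0 for v in m]  # nearest nonfuzzy set
    far = [1.0 - v for v in near]                # its complement

    def dist(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

    return dist(m, near) / dist(m, far)


a = [0.2, 0.7, 0.5]
print(kosko_entropy(a), lp_entropy(a, p=1))  # the two values agree for p = 1
```

A crisp set such as `[0, 1, 1]` yields entropy 0 and the midpoint set `[0.5, 0.5]` yields 1, in line with Axioms (E1) and (E2).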
2.3. Fuzzy subsethood

Zadeh [15] defines fuzzy set containment: B contains A if $m_A(x) \le m_B(x)$ for all $x \in X$; that is, $m_A \le m_B$.
Kosko [8] contends that if this inequality holds for all but just a few x, one can still consider A to be a subset
of B to some degree. He then generalizes Zadeh's definition by using the following subsethood measure,
which was first defined by Sanchez [10]:
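$$S(A, B) = \begin{cases} \dfrac{\sum_x \min(m_A(x), m_B(x))}{\sum_x m_A(x)} = \dfrac{|A \cap B|}{|A|}, & A \ne \emptyset, \\[1ex] 1, & A = \emptyset. \end{cases} \tag{2.2}$$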
One says that A is a subset of B to degree S(A, B). S is, therefore, a fuzzy set in the product $\mathcal{F}(X) \times \mathcal{F}(X)$. In
essence, this function S measures how well A and B satisfy the inequality $m_A \le m_B$ relative to the size of A.
Kosko [8] then shows that S reduces to $E_1$, Eq. (2.1), on a particular subset of $\mathcal{F}(X) \times \mathcal{F}(X)$, namely the
set of ordered pairs $(A \cup A^c, A \cap A^c)$. Specifically,
$$S(A \cup A^c, A \cap A^c) = E(A \cup A^c) = E(A \cap A^c) = E(A). \tag{2.3}$$
For the remainder of the paper, we will write $E_K$ and $S_K$ for Kosko's entropy and subsethood measures, Eqs.
(2.1) and (2.2).
Remarking that this ratio looks much like relative probability, Kosko [9] confirms that $S_K(X, C)$ is the
relative frequency $|C|/|X|$, if C is a crisp set in X. He also proves a version of Bayes' theorem using this
subsethood measure.
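The two facts above, the reduction of subsethood to entropy in Eq. (2.3) and the relative-frequency reading of $S_K(X, C)$, can be checked numerically. The sketch below is only an illustration under the assumption that fuzzy sets on a finite X are stored as lists of membership values; the helper names are hypothetical.

```python
def subsethood(a, b):
    """Kosko's S(A, B) of Eq. (2.2); returns 1 when A is empty."""
    card_a = sum(a)
    if card_a == 0:
        return 1.0
    return sum(min(x, y) for x, y in zip(a, b)) / card_a


def entropy(a):
    """Kosko's E_1 of Eq. (2.1)."""
    return sum(min(v, 1 - v) for v in a) / sum(max(v, 1 - v) for v in a)


a = [0.2, 0.7, 0.5, 1.0]
union = [max(v, 1 - v) for v in a]  # A union A^c under max
inter = [min(v, 1 - v) for v in a]  # A intersect A^c under min

# Eq. (2.3): the degree to which A u A^c is contained in A n A^c is the entropy of A.
print(subsethood(union, inter), entropy(a))

# S(X, C) for a crisp C is the relative frequency |C| / |X|.
x_set = [1, 1, 1, 1]
c_set = [1, 0, 1, 0]
print(subsethood(x_set, c_set))
```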
Many researchers have contributed to the area of fuzzy subsethood. Goguen [5] and Sanchez [10] each
define specific inclusion grades. We note above that Kosko's and Sanchez' measures coincide; see Example
4.1 below for Goguen's subsethood function. Bandler and Kohout [1] and Willmott [13, 14] develop several
subsethood measures using fuzzy implication operators. Also, Sinha and Dougherty [12] present axioms for
fuzzy subsethood, which we discuss in the next section.
3. Subsethood axioms of Sinha and Dougherty

Sinha and Dougherty [12] present nine axioms for subsethood; we discuss the first four. We use the bold
letter I to represent a fuzzy set in $\mathcal{F}(X) \times \mathcal{F}(X)$ that satisfies their axioms.
Axiom 1. I(A, B) = 1 if and only if $A \subseteq B$ in the sense of Zadeh.

We agree that any measure of subsethood should generalize Zadeh's definition of containment. In fact, in
Section 4, we also state Axiom 1 as our first axiom. Their second axiom is a bit more controversial.
Axiom 2. I(A, B) = 0 if and only if $\exists x \in X$ such that $m_A(x) = 1$ and $m_B(x) = 0$.

This property allows the values of A and B at one point to make I(A, B) equal 0. For example, if
$m_A(x) \le m_B(x)$ for all x except for one point $x_0$, at which $m_A(x_0) = 1$ and $m_B(x_0) = 0$, then I(A, B) = 0,
despite the fact that $m_A \le m_B$ on the rest of X. Kosko's $S_K$ does not satisfy this axiom; neither does Goguen's
inclusion function [5]; see Example 4.1 below.
On the other hand, Axiom 2 is consistent with the work of Bandler and Kohout [1] and Willmott [13], in
which a subsethood measure is defined as the minimum value of an implication operator: $S(A, B) =
\min_x (m_A(x) \to m_B(x))$. Bandler and Kohout, however, acknowledge that taking the minimum is a "harsh"
criterion, and Willmott [14] defines a subsethood measure as a mean value of an implication operator.
By letting one point determine when I is 0, one loses much of the relative structure of A and B. In fact, if
I satisfies Axiom 2, then $E(A) \equiv I(A \cup A^c, A \cap A^c)$ is not an entropy function because Axiom (E1) does not
hold. Sinha and Dougherty assert that Axiom (E1) should be replaced with
(E1') E(A) = 0 if and only if A or $A^c$ is a normal fuzzy set. A is normal if $\sup_x m_A(x) = 1$.
They base this new axiom on a statement by Dubois and Prade [3] that "the maximum membership value
of a fuzzy number is interpreted as a grade of reliability". We agree that Axiom (E1') is consistent with what
Dubois and Prade write, but we suggest that much of the fuzziness of a set is missed by using such an entropy.
For example, let
$$E'(A) = \inf_{x \in X} \min(m_A(x), 1 - m_A(x));$$
then, E' satisfies Axioms (E1'), (E2)-(E4). Let A be given by
$$m_A(x) = \begin{cases} x, & 0 \le x < 1, \\ 2 - x, & 1 \le x \le 2, \\ 0, & \text{elsewhere}, \end{cases}$$
and B be the crisp set {1}. Then, E'(A) = E'(B) = 0; that is, E' cannot distinguish between these two sets; it is
counterintuitive that E'(A) = 0 because A is fuzzy.
This loss of information about the fuzziness of A is similar to the loss when one looks only at the mode of
a probability distribution and ignores the mean and higher moments. It is also comparable to defuzzifying
a fuzzy set A by using the point(s) at which A is maximum instead of using the centroid of A.
As for Axiom 1, we also present an axiom that is similar to the next two axioms of Sinha and Dougherty:
Axiom 3. If $B \subseteq C$, then $I(A, B) \le I(A, C)$.

Axiom 4. If $B \subseteq C$, then $I(C, A) \le I(B, A)$.
We note that $S_K$ does not satisfy Axiom 4. Their last five axioms further restrict I; we do not list
them here. They also present three optional axioms in addition to these nine. We note that Axiom 11,

Axiom 11. If A refines B, then $I(A \cup A^c, A \cap A^c) \le I(B \cup B^c, B \cap B^c)$,

follows from Axioms 3 and 4 if $\cup$ = max, $\cap$ = min, and c = negation.
4. Axioms for fuzzy subsethood

Suppose we are given a fuzzy set S in $\mathcal{F}(X) \times \mathcal{F}(X)$. What properties should S satisfy so that E is a fuzzy
entropy measure, when E is defined by $E(A) \equiv S(A \cup A^c, A \cap A^c)$, as in Eq. (2.3)? First and foremost, we want
S to generalize Zadeh's definition of containment:
(S1) S(A, B) = 1 if and only if $A \subseteq B$ as defined by Zadeh; that is, $m_A \le m_B$.
We receive a bonus from this axiom: It implies that E(A) = 1 if and only if A = P, which is Axiom (E2).
Indeed, if E(A) = 1, then $A \cup A^c \subseteq A \cap A^c$, which only occurs when A = P. Conversely, if A = P, then
E(P) = S(P,P) = 1.
If we compare A with $A^c$, then we want S to recognize some degree of overlap between the two sets as long
as they are truly fuzzy. As Kosko [9] notes, this overlap is what makes them fuzzy to begin with. This
observation leads to
(S2s) $S(A, A^c) = 0$ if and only if A is nonfuzzy, or crisp; that is, $m_A(x)$ equals 0 or 1 for all x in X.
This property implies that E(A) = 0 if and only if A is nonfuzzy, which is Axiom (E1). Indeed, if E(A) = 0,
then $A \cup A^c$ is nonfuzzy because $(A \cup A^c)^c = A \cap A^c$; thus, A is nonfuzzy. Conversely, if A is nonfuzzy, then
$A \cup A^c = X$, and $E(A) = S(X, \emptyset) = 0$.
In fact, Axiom (S2s) is stronger than needed for Axiom (E1) to hold. If the following axiom is true, then
so is (E1):
(S2) If $P \subseteq A$ in the sense of Zadeh, then $S(A, A^c) = 0$ if and only if A = X.
If E(A) = 0, then $A \cup A^c = X$, from which it follows that A is nonfuzzy. We have the converse as before.
Axiom (S2) follows the spirit of Willmott [14], in which he defines a subsethood measure as a mean value of
an implication operator. Please contrast Axiom (S2) with Axiom 2 of Sinha and Dougherty.
An intuitive property of a subsethood measure is that if $A_1 \subseteq A_2$, then $S(A_1, B)$ should be greater than or
equal to $S(A_2, B)$ because the target inequality $m_A \le m_B$ is violated more. Also, if $B_1 \subseteq B_2$, then $S(A, B_1)$
should be less than or equal to $S(A, B_2)$ because the inequality is violated less. Therefore, we have
(S3s) If $A_1 \subseteq A_2$, then $S(A_1, B) \ge S(A_2, B)$, and if $B_1 \subseteq B_2$, then $S(A, B_1) \le S(A, B_2)$.
This condition implies that E(A) decreases as A becomes less fuzzy, which is Axiom (E3). Indeed, if
$A_1$ refines $A_2$, then $A_1 \cup (A_1)^c \supseteq A_2 \cup (A_2)^c$ and $A_1 \cap (A_1)^c \subseteq A_2 \cap (A_2)^c$. Therefore, $S(A_1 \cup (A_1)^c, A_1 \cap (A_1)^c) \le
S(A_2 \cup (A_2)^c, A_2 \cap (A_2)^c)$, or $E(A_1) \le E(A_2)$.
There is a slight problem with Axiom (S3s): Kosko's subsethood measure does not satisfy it. For example,
let |X| = 2, $A_1 = (0.50, 0.75)$, $A_2 = (0.75, 0.75)$, and B = (0.75, 0.50); then, $S_K(A_1, B) = 0.80$, while $S_K(A_2, B) =
0.83$. If we require instead that S be decreasing in A when $B \subseteq A_1 \subseteq A_2$, then Axiom (E3) still follows and
Kosko's subsethood measure satisfies this weaker condition.
(S3) If $B \subseteq A_1 \subseteq A_2$, then $S(A_1, B) \ge S(A_2, B)$, and if $B_1 \subseteq B_2$, then $S(A, B_1) \le S(A, B_2)$.
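The counterexample to (S3s) is easy to reproduce. The following short sketch, which assumes Kosko's measure of Eq. (2.2) on a two-point universe with fuzzy sets stored as tuples, recomputes the two subsethood values quoted above.

```python
def kosko_subsethood(a, b):
    """Kosko's S_K(A, B): |A intersect B| / |A|, with value 1 for empty A."""
    card_a = sum(a)
    if card_a == 0:
        return 1.0
    return sum(min(x, y) for x, y in zip(a, b)) / card_a


a1 = (0.50, 0.75)
a2 = (0.75, 0.75)
b = (0.75, 0.50)

s1 = kosko_subsethood(a1, b)  # (0.50 + 0.50) / 1.25
s2 = kosko_subsethood(a2, b)  # (0.75 + 0.50) / 1.50

# A1 is contained in A2, yet S_K(A1, B) < S_K(A2, B), so (S3s) fails for S_K.
print(round(s1, 2), round(s2, 2))
```

Note that B is not contained in $A_1$ here (0.75 > 0.50 at the first point), so the example does not contradict the weaker Axiom (S3).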
Definition 4.1. If S is a fuzzy set in $\mathcal{F}(X) \times \mathcal{F}(X)$ which satisfies Axioms (S1)-(S3), then we say that S is
a fuzzy subsethood measure on X.
In our discussion above, we have demonstrated the following theorem:
Theorem 4.1. If S is a fuzzy subsethood measure on X, then E defined by
$$E(A) \equiv S(A \cup A^c, A \cap A^c), \qquad A \in \mathcal{F}(X),$$
is a fuzzy entropy measure on X.
Example 4.1. Goguen's inclusion grade. Goguen [5] defines an inclusion grade by
$$S_{Gog}(A, B) = \frac{1}{n} \sum_x \min(1, 1 - m_A(x) + m_B(x)), \qquad |X| = n.$$
$S_{Gog}$ satisfies the three subsethood axioms. The corresponding entropy function, given by Eq. (2.3), is
$$E_{Gog}(A) = \frac{2}{n} \sum_x \min(m_A(x), 1 - m_A(x)),$$
which is a measure of entropy used by Kaufmann [6].
Example 4.2. Weak inclusion. Weak inclusion [3] is given by
$$S_w(A, B) \equiv \frac{1}{n} \sum_x \max(1 - m_A(x), m_B(x)), \qquad |X| = n.$$
$S_w$ satisfies Axioms (S2) and (S3) but not (S1), although it is true that if $S_w(A, B) = 1$, then $A \subseteq B = X$. Also,
$S_w$ is strictly monotone with respect to Axiom (S3). If we define $E_w$ by Eq. (2.3), then
$$E_w(A) = \frac{1}{n} \sum_x \min(m_A(x), 1 - m_A(x)),$$
which is 1/2 times the entropy function in Example 4.1. Note that weak inclusion is closely tied to the
implication operator $m_{(A \Rightarrow B)}(x) = \max(1 - m_A(x), m_B(x))$.
Example 4.3. Inclusion from implication. Define an implication operator by
$$m_{(A \to B)}(x) \equiv 1 - m_A(x) + m_A(x) m_B(x),$$
and define a corresponding inclusion grade by
$$S_{\to}(A, B) = \frac{1}{n} |A \to B| = \frac{1}{n} \sum_x (1 - m_A(x) + m_A(x) m_B(x)).$$
$S_{\to}$ is similar to weak inclusion in that it satisfies (S2) and (S3) with strict monotonicity but satisfies only the
necessity portion of Axiom (S1). Also, in this case, the function given by Eq. (2.3) is
$$E_{\to}(A) = \frac{1}{n} \sum_x \bigl(1 - \max{}^2(m_A(x), 1 - m_A(x))\bigr).$$
Note that $(4/3) E_{\to}$ is an entropy function.
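The entropy formulas stated in Examples 4.1-4.3 can be verified numerically. The sketch below is an illustration only: it assumes finite X with membership lists, and the names `goguen`, `weak`, `product_impl`, and `induced` are ours. Each inclusion grade is evaluated on the pair $(A \cup A^c, A \cap A^c)$ as in Eq. (2.3).

```python
def goguen(a, b):
    """Goguen's inclusion grade: mean of min(1, 1 - m_A + m_B)."""
    return sum(min(1.0, 1.0 - x + y) for x, y in zip(a, b)) / len(a)


def weak(a, b):
    """Weak inclusion: mean of max(1 - m_A, m_B)."""
    return sum(max(1.0 - x, y) for x, y in zip(a, b)) / len(a)


def product_impl(a, b):
    """Inclusion from the implication 1 - m_A + m_A * m_B."""
    return sum(1.0 - x + x * y for x, y in zip(a, b)) / len(a)


def induced(s, a):
    """E(A) = S(A union A^c, A intersect A^c), as in Eq. (2.3)."""
    union = [max(v, 1.0 - v) for v in a]
    inter = [min(v, 1.0 - v) for v in a]
    return s(union, inter)


a = [0.2, 0.7, 0.5]
n = len(a)
mins = sum(min(v, 1 - v) for v in a)

print(induced(goguen, a), 2 * mins / n)   # Example 4.1
print(induced(weak, a), mins / n)         # Example 4.2
print(induced(product_impl, a),
      sum(1 - max(v, 1 - v) ** 2 for v in a) / n)  # Example 4.3

# At the midpoint set P the last measure equals 3/4, hence the factor 4/3.
print(induced(product_impl, [0.5] * n))
```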
Examples 4.2 and 4.3 lead us to define a weak subsethood measure:
Definition 4.2. If $S_w$ is a fuzzy set in $\mathcal{F}(X) \times \mathcal{F}(X)$ that satisfies Axioms (S2) and (S3) but not Axiom (S1),
then we say that $S_w$ is a weak fuzzy subsethood measure on X.
We obtain the following result:
Table 1.

Driver   A1 - rural   A2 - urban   C1 - low   C2 - high
d1       0.7          0.3          0.8        0.2
d2       0.9          0.1          0.7        0.3
d3       0.2          0.8          0.3        0.7
d4       0.3          0.7          0.3        0.7
d5       0.1          0.9          0.2        0.8

Table 2. $S_K(A_l, C_m)$

              C1 - low   C2 - high
A1 - rural    0.91       0.50
A2 - urban    0.43       0.89

Theorem 4.2. If $S_w$ is a weak subsethood measure such that $S_w$ is strictly monotone with respect to Axiom (S3),
then $E_w$ defined by
$$E_w(A) \equiv \frac{S_w(A \cup A^c, A \cap A^c)}{S_w(P, P)}, \qquad A \in \mathcal{F}(X),$$
is a fuzzy entropy measure on X.

Remark. We require $S_w$ to be strictly monotone with respect to Axiom (S3) so that $E_w$ will achieve its
maximum at A = P and only there.
Example 4.4. Application of subsethood to inference. Kosko [8] points out that his subsethood measure (up
to normalizing) is the Lukasiewicz implication operator. We have exploited similar relationships in Examples
4.2 and 4.3. We, therefore, interpret S(A, B), in general, as a measure of the degree to which A implies B. In
fact, Bandler and Kohout [1] and Willmott [13, 14] define fuzzy subsethood as the degree to which A implies
B and use implication operators to specify subsethood measures. Consider the following hypothetical
example of fuzzy clustering for risk classification in private automobile insurance:
Let $X = \{d_1, d_2, \ldots, d_5\}$ consist of five policyholders. Associate to each driver $d_i$ an ordered pair $(a_i, c_i)$, in
which $a_i$ represents the location and $c_i$, the automobile insurance claims of driver $d_i$, for i = 1, ..., 5. Suppose
we partition the locations into two fuzzy sets $A_1$, rural, and $A_2$, urban; similarly we partition the claims into
$C_1$, low claims, and $C_2$, high claims. The membership values of the $d_i$ are listed in Table 1, and the
subsethood values $S_K(A_l, C_m)$ are in Table 2, for l, m = 1, 2, using Kosko's subsethood measure $S_K$.
We see, from Table 2, that $A_l$ implies $C_l$, for l = 1, 2, to a higher degree than $A_l$ implies $C_m$, for l, m = 1, 2,
$l \ne m$. Based on this evidence, we could support the conclusions: Urban drivers have high automobile
insurance claims, while rural drivers have low claims.
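The entries of Table 2 can be re-derived from the membership values in Table 1. The sketch below applies Kosko's measure of Eq. (2.2) to the four (location, claims) pairs; the dictionary layout and variable names are assumptions made for illustration.

```python
def kosko_subsethood(a, b):
    """S_K(A, B) = sum of min(m_A, m_B) divided by sum of m_A."""
    return sum(min(x, y) for x, y in zip(a, b)) / sum(a)


# Membership values of drivers d1..d5, copied from Table 1.
locations = {
    "rural": [0.7, 0.9, 0.2, 0.3, 0.1],
    "urban": [0.3, 0.1, 0.8, 0.7, 0.9],
}
claims = {
    "low": [0.8, 0.7, 0.3, 0.3, 0.2],
    "high": [0.2, 0.3, 0.7, 0.7, 0.8],
}

table2 = {
    (loc, cl): round(kosko_subsethood(a, c), 2)
    for loc, a in locations.items()
    for cl, c in claims.items()
}
for key, value in table2.items():
    print(key, value)
```

The largest values sit on the (rural, low) and (urban, high) pairs, which is the basis for the conclusion drawn above.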
5. Fuzzy subsethood and probability

Kosko [9] notes that his subsethood measure has the form of conditional probability $P(B \mid A)$. In fact, one
can say that it is the conditional probability under the uniform distribution on X. Zadeh [16] defines the
probability of a fuzzy set A by
$$P(A) = \sum_x m_A(x)\, p(x),$$
in which p is a probability distribution on a finite set X. If the cardinality of X is n and if we set p(x) = 1/n, for
all x in X, then by defining
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)},$$
we have the following:
$$P(B \mid A) = \frac{\sum_x (1/n) \min(m_A(x), m_B(x))}{\sum_x (1/n)\, m_A(x)} = \frac{\sum_x \min(m_A(x), m_B(x))}{\sum_x m_A(x)},$$
which is Kosko's subsethood measure.

In general, given a probability distribution p on a finite set X, define a fuzzy set $S_{pd}$ in $\mathcal{F}(X) \times \mathcal{F}(X)$ by
$$S_{pd}(A, B) = \frac{\sum_x \min(m_A(x), m_B(x))\, p(x)}{\sum_x m_A(x)\, p(x)}.$$
It is a fun exercise to verify that $S_{pd}$ satisfies the subsethood axioms almost everywhere, that is, except on a set
of probability zero. By restricting X to the points at which p is nonzero, we can eliminate the qualifier "almost
everywhere". Alternatively, we can loosen the subsethood axioms to include the phrase "almost everywhere"
without any real loss of generality. It follows from Theorem 4.1 that
$$E_{pd}(A) \equiv S_{pd}(A \cup A^c, A \cap A^c) = \frac{\sum_x \min(m_A(x), 1 - m_A(x))\, p(x)}{\sum_x \max(m_A(x), 1 - m_A(x))\, p(x)}$$
is an entropy function on X. One can readily extend the work in this section to continuous probability
distributions.
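A small numerical sketch of the probability-weighted measure may help. It assumes a finite X, a list `p` of probabilities aligned with the membership lists, and, by analogy with the case A = ∅ in Eq. (2.2), returns 1 when the denominator vanishes; that convention and the function names are our assumptions.

```python
def subsethood_pd(a, b, p):
    """S_pd(A, B): probability-weighted |A intersect B| over probability-weighted |A|."""
    den = sum(x * w for x, w in zip(a, p))
    if den == 0:
        return 1.0  # assumed convention, mirroring A = empty set in Eq. (2.2)
    num = sum(min(x, y) * w for x, y, w in zip(a, b, p))
    return num / den


def entropy_pd(a, p):
    """E_pd(A) = S_pd(A union A^c, A intersect A^c)."""
    union = [max(v, 1.0 - v) for v in a]
    inter = [min(v, 1.0 - v) for v in a]
    return subsethood_pd(union, inter, p)


a = [0.50, 0.75]
b = [0.75, 0.50]

# Under the uniform distribution the measure coincides with Kosko's S_K.
print(subsethood_pd(a, b, [0.5, 0.5]))

# Points of probability zero do not influence the result.
print(subsethood_pd(a, b, [1.0, 0.0]))
print(entropy_pd(a, [1.0, 0.0]))
```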
Remark. Given a subsethood measure S, define $P(C) \equiv S(X, C)$, for nonfuzzy sets C. In order for P to be
a probability function, it is necessary and sufficient that S(X, C) be additive in C. Indeed, $P(C) \ge 0$, by
definition, and P(X) = 1, by Axiom (S1). Therefore, if S(X, C) were additive in C, then P would satisfy the
third and final Kolmogorov axiom for probability functions. The converse follows similarly. Note that
$S_K(X, C)$ is additive in C.
6. Tuning rules in fuzzy logic

In this section, we describe how one can use fuzzy subsethood to tune rules in fuzzy logic. This topic is the
subject of our continuing research, so we only outline one possible procedure and give an example showing
how to use it. Let $X \times Y = \{(x, y) : x \in X, y \in Y\}$, in which X represents the input space of our fuzzy system,
and Y, the output space. In other words, given a value $x_0$ in X, we will take some action $y_0$ in Y, according to
the fuzzy logic of our system. Assume that as $x_0$ increases, $y_0$ also increases. Assume that X and Y are
bounded, closed intervals in the real numbers R, and write $X = [a_0, a_m]$ and $Y = [b_0, b_m]$.
Partition X and Y into m subintervals: $a_0 < a_1 < \cdots < a_{m-1} < a_m$, and $b_0 < b_1 < \cdots < b_{m-1} < b_m$. To
every triple $(a_{i-1}, a_i, a_{i+1})$, i = 1, ..., m - 1, associate a triangular fuzzy set $A_i$ in $X = [a_0, a_m]$ defined by
$$m_{A_i}(x) = \begin{cases} (x - a_{i-1})/(a_i - a_{i-1}), & a_{i-1} \le x < a_i, \\ (a_{i+1} - x)/(a_{i+1} - a_i), & a_i \le x \le a_{i+1}, \\ 0, & \text{otherwise}. \end{cases}$$
The graph of $A_i$, therefore, is piecewise linear and joins $(a_{i-1}, 0)$ to $(a_i, 1)$ and $(a_i, 1)$ to $(a_{i+1}, 0)$. Define the
fuzzy set $A_0$ by
$$m_{A_0}(x) = \begin{cases} (a_1 - x)/(a_1 - a_0), & a_0 \le x \le a_1, \\ 0, & \text{otherwise}, \end{cases}$$
and define the fuzzy set $A_m$ by
$$m_{A_m}(x) = \begin{cases} (x - a_{m-1})/(a_m - a_{m-1}), & a_{m-1} \le x \le a_m, \\ 0, & \text{otherwise}. \end{cases}$$
Similarly define triangular fuzzy sets $B_i$, i = 0, ..., m, in $Y = [b_0, b_m]$.

Let $\{(A_i, B_i) : A_i \in \mathcal{F}(X), B_i \in \mathcal{F}(Y), i = 0, \ldots, m\}$ represent our fuzzy logic system, as follows: Define the
ith fuzzy rule by
$$m_i(x, y) = \min(m_{A_i}(x), m_{B_i}(y)). \tag{6.1}$$
The function $m_i$, thus, defines a fuzzy set on $X \times Y$, and we use it in the fuzzy modus ponens that follows.
Given a value $x_0$ in X, define $A_i'$ by
$$m_{A_i'}(x) = \min(m_{A_i}(x), m_{A_i}(x_0)),$$
and define the result of fuzzy modus ponens $B_i'$ by
$$\begin{aligned} m_{B_i'}(y) &= \max_{x \in X} \min(m_{A_i'}(x), m_i(x, y)) \\ &= \max_{x \in X} \min(m_{A_i}(x), m_{A_i}(x_0), m_{B_i}(y)) \\ &= \min(m_{A_i}(x_0), m_{B_i}(y)). \end{aligned}$$
In other words, $A_i'$ and $B_i'$ are the fuzzy sets $A_i$ and $B_i$ truncated at $m_{A_i}(x_0)$. This modus ponens is but one of
many that are used in fuzzy logic. We choose it for the sake of simplicity. The output value $y_0$ in Y is
calculated by
$$y_0 = \frac{\sum_{i=0}^{m} m_{B_i'}(x_0) \times (\text{center of } B_i)}{\sum_{i=0}^{m} m_{B_i'}(x_0)}. \tag{6.2}$$
For the center of $B_i$, we choose the point at which $B_i$ attains its maximum, namely $b_i$. This method of
defuzzifying the output is also one of many that are used in fuzzy logic; again, we choose it for its simplicity.
In our simplified fuzzy system with only one input variable and one output variable and with triangular
fuzzy sets partitioning X and Y, the output $y_0$ can be written
$$y_0 = m_{A_i}(x_0) \times b_i + m_{A_{i+1}}(x_0) \times b_{i+1},$$
in which i is such that $x_0$ lies between $a_i$ and $a_{i+1}$ for some i = 0, ..., m - 1. We can further simplify $y_0$ by
substituting the membership values of $x_0$:
$$y_0 = \frac{(a_{i+1} - x_0) \times b_i + (x_0 - a_i) \times b_{i+1}}{a_{i+1} - a_i}, \tag{6.3}$$
a weighted average of $b_i$ and $b_{i+1}$. Also note that $y_0$ is a piecewise linear function of $x_0$.
If we have more than one input variable, then we can readily generalize Eq. (6.2) to that situation. In that
case, however, the output $y_0$ will not necessarily be a piecewise linear function of $x_0$.
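To make the single-input system concrete, the sketch below builds the triangular fuzzy sets on a list of knots, computes the output as a membership-weighted average of the centers $b_i$ (taking the truncation heights $m_{A_i}(x_0)$ as the weights in Eq. (6.2)), and compares it with the closed form of Eq. (6.3). The function names and the example knots are hypothetical, and $x_0$ is assumed to lie in $[a_0, a_m]$.

```python
def memberships(knots, x):
    """Membership of x in each triangular set A_0..A_m built on a_0 < ... < a_m."""
    m = len(knots) - 1
    out = [0.0] * (m + 1)
    for i, a in enumerate(knots):
        if i > 0 and knots[i - 1] <= x < a:
            out[i] = (x - knots[i - 1]) / (a - knots[i - 1])   # rising edge
        elif i < m and a <= x <= knots[i + 1]:
            out[i] = (knots[i + 1] - x) / (knots[i + 1] - a)   # falling edge
        elif i == m and x == a:
            out[i] = 1.0                                       # right endpoint of X
    return out


def output_weighted(knots_a, knots_b, x0):
    """Weighted average of the centers b_i, weights m_{A_i}(x0)."""
    w = memberships(knots_a, x0)
    return sum(wi * bi for wi, bi in zip(w, knots_b)) / sum(w)


def output_closed(knots_a, knots_b, x0):
    """Closed form of Eq. (6.3): linear interpolation between b_i and b_{i+1}."""
    for i in range(len(knots_a) - 1):
        lo, hi = knots_a[i], knots_a[i + 1]
        if lo <= x0 <= hi:
            return ((hi - x0) * knots_b[i] + (x0 - lo) * knots_b[i + 1]) / (hi - lo)
    raise ValueError("x0 outside [a_0, a_m]")


a_knots = [0.0, 1.0, 3.0]
b_knots = [0.0, 2.0, 10.0]
print(output_weighted(a_knots, b_knots, 2.0), output_closed(a_knots, b_knots, 2.0))
```

Because adjacent triangular sets sum to 1 at every point of X, the denominator of the weighted average equals 1 and the two functions agree.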
Given training data $\{(x_\alpha, y_\alpha) : \alpha \in \mathcal{A}\}$, we may want to gauge how well an existing system $\{(A_i, B_i) :
i = 0, \ldots, m\}$ fits the data, or we may want to design a fuzzy system to fit the data. Here we develop a measure
based on fuzzy subsethood. We model our measure on Kosko's measure of subsethood and define
$$I(A, B) = \frac{\sum_\alpha \min(m_A(x_\alpha), m_B(y_\alpha))}{\sum_\alpha m_A(x_\alpha)}, \tag{6.4}$$
in which A is a fuzzy subset of X and B is a fuzzy subset of Y. Note that the numerator is the cardinality of the
fuzzy relation (restricted to the training data) between A and B defined by Eq. (6.1). Because A and B are not
subsets of the same universal set, I is not a measure of subsethood, but it is certainly connected with
subsethood. Indeed, one can consider A and B to be subsets of the universal set $X \times Y$ as follows:
$$m_A(x, y) = m_A(x), \qquad m_B(x, y) = m_B(y), \qquad \forall (x, y) \in X \times Y.$$
With the obvious extension from $\mathcal{F}(X) \times \mathcal{F}(Y)$ to $\mathcal{F}(X \times Y)$, we can, thus, regard I as a measure of fuzzy
subsethood on $\mathcal{F}(X \times Y)$.
Define a global measure of implication for the system $\{(A_i, B_i) : i = 0, \ldots, m\}$ by
$$I\{(A_i, B_i)\} = \frac{\sum_i \bigl[ I(A_i, B_i) \sum_\alpha m_{A_i}(x_\alpha) \bigr]}{\sum_i \sum_\alpha m_{A_i}(x_\alpha)}. \tag{6.5}$$
In Eq. (6.5), we write d for the denominator and substitute for $I(A_i, B_i)$ in the numerator to obtain
$$I\{(A_i, B_i)\} = \frac{1}{d} \sum_i \sum_\alpha \min(m_{A_i}(x_\alpha), m_{B_i}(y_\alpha)). \tag{6.6}$$
Our goal is, therefore, to create or modify a fuzzy logic system to maximize Eq. (6.6). One may use an
implication measure of the form of a sum of squared errors (based on statistical regression), but we believe
that our measure has merit because it is tied directly with the implication operator min. If we were to use
a different implication operator, then we would define a corresponding implication measure.
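The measures in Eqs. (6.4)-(6.6) translate directly into code. The following sketch assumes that each rule is a pair of membership functions (Python callables) and that the training data are a list of (x, y) pairs; the toy `low` and `high` sets on [0, 1] are hypothetical and serve only to exercise the functions.

```python
def implication(m_a, m_b, data):
    """I(A, B) of Eq. (6.4), computed over the training pairs."""
    den = sum(m_a(x) for x, _ in data)
    return sum(min(m_a(x), m_b(y)) for x, y in data) / den


def global_implication(rules, data):
    """Global measure of Eq. (6.6): pooled numerator over pooled denominator d."""
    d = sum(m_a(x) for m_a, _ in rules for x, _ in data)
    num = sum(min(m_a(x), m_b(y)) for m_a, m_b in rules for x, y in data)
    return num / d


def global_weighted(rules, data):
    """Same quantity written as the weighted average of Eq. (6.5)."""
    weights = [sum(m_a(x) for x, _ in data) for m_a, _ in rules]
    parts = [implication(m_a, m_b, data) * w
             for (m_a, m_b), w in zip(rules, weights) if w > 0]
    return sum(parts) / sum(weights)


def low(t):
    return max(0.0, 1.0 - t)       # hypothetical set: 1 at t = 0, 0 at t = 1


def high(t):
    return min(1.0, max(0.0, t))   # hypothetical set: 0 at t = 0, 1 at t = 1


rules = [(low, low), (high, high)]
data = [(0.0, 0.1), (0.5, 0.4), (1.0, 0.7)]
print(global_implication(rules, data), global_weighted(rules, data))
```

Rules whose input set receives zero total membership are skipped in `global_weighted`, since they contribute nothing to either sum.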
In the following example, we tune a fuzzy system to maximize the expression in (6.6):
Example 6.1. Suppose that $Y \mid (X = x)$, or more simply $Y \mid x$, is a random variable distributed according to
the exponential distribution with mean $x^2$. The probability density function of $Y \mid x$ is
$$f(y \mid x) = \frac{1}{x^2} \exp(-y/x^2), \qquad y \ge 0.$$
Let x take on the 100 values 0.1, 0.2, ..., 10.0, and generate values $y \mid x$ randomly according to the given
exponential probability distribution. Please see Table 3 for the noisy learning data $\{(x_\alpha, y_\alpha) : \alpha = 1, 2, \ldots, 100\}$
thus generated. Such data are called noisy because $Y \mid x$ is a random variable.
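Data of this kind can be simulated with the standard library alone. The sketch below is an assumption-laden illustration: the seed and the function name `generate_learning_data` are ours, and a fresh simulation will not reproduce the particular values listed in Table 3.

```python
import random


def generate_learning_data(seed=0):
    """Draw one y for each x in 0.1, 0.2, ..., 10.0 from Exp(mean = x**2)."""
    rng = random.Random(seed)  # seed chosen arbitrarily for repeatability
    xs = [round(0.1 * k, 1) for k in range(1, 101)]
    # random.expovariate takes the rate, i.e. 1 / mean.
    return [(x, rng.expovariate(1.0 / x ** 2)) for x in xs]


data = generate_learning_data()
print(len(data), data[0], data[-1])
```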

Table 3

X       Y         E(Y|x)    Pred Y(3)   Pred Y(10)
0.1     0.001     0.01      0.001       0.001
0.2     0.017     0.04      0.127       0.011
0.3     0.062     0.09      0.253       0.022
0.4     0.022     0.16      0.379       0.185
0.5     0.457     0.25      0.505       0.349
0.6     0.165     0.36      0.631       0.513
0.7     0.067     0.49      0.757       0.677
0.8     2.225     0.64      0.883       0.841
0.9     0.088     0.81      1.009       1.005
1.0     2.992     1.00      1.135       1.170
1.1     0.517     1.21      1.261       1.334
1.2     1.041     1.44      1.387       1.498
1.3     3.085     1.69      1.513       1.662
1.4     0.709     1.96      1.639       1.826
1.5     4.587     2.25      1.765       1.991
1.6     0.333     2.56      1.891       2.155
1.7     2.102     2.89      2.017       2.319
1.8     1.356     3.24      2.142       2.483
1.9     10.941    3.61      2.268       2.647
2.0     3.442     4.00      2.394       2.812
2.1     0.090     4.41      2.520       2.976
2.2     0.955     4.84      2.646       3.140
2.3     5.363     5.29      2.772       3.304
2.4     0.445     5.76      2.898       3.468
2.5     12.672    6.25      3.024       3.633
2.6     2.233     6.76      3.150       3.797
2.7     1.456     7.29      3.276       3.961
2.8     5.364     7.84      3.402       4.125
2.9     5.593     8.41      3.528       4.289
3.0     0.112     9.00      3.654       4.453
3.1     2.466     9.61      3.780       4.618
3.2     2.462     10.24     3.906       4.782
3.3     16.795    10.89     4.032       4.946
3.4     0.686     11.56     4.158       5.110
3.5     4.922     12.25     4.284       5.274
3.6     1.509     12.96     4.410       5.828
3.7     21.739    13.69     4.536       7.726
3.8     12.211    14.44     4.662       9.234
3.9     25.827    15.21     4.788       10.742
4.0     16.974    16.00     4.914       12.250
4.1     7.299     16.81     5.040       13.898
4.2     14.437    17.64     5.166       15.731
4.3     0.233     18.49     5.292       17.564
4.4     4.360     19.36     5.418       19.397
4.5     10.632    20.25     5.544       21.231
4.6     5.780     21.16     5.670       23.064
4.7     6.169     22.09     5.796       24.897
4.8     8.901     23.04     5.922       26.730
4.9     70.973    24.01     6.482       28.564
5.0     20.480    25.00     7.150       30.397
5.1     50.164    26.01     7.819       32.230
5.2     6.524     27.04     8.487       34.063
5.3     3.379     28.09     9.155       40.336
5.4     47.372    29.16     9.823       41.049
5.5     0.102     30.25     10.491      41.761
5.6     48.099    31.36     11.160      42.474
5.7     13.630    32.49     11.828      43.186
5.8     41.001    33.64     12.496      43.899
5.9     9.678     34.81     13.164      44.612
6.0     19.636    36.00     13.832      45.324
6.1     89.237    37.21     14.500      46.037
6.2     30.039    38.44     15.169      46.749
6.3     48.492    39.69     15.837      47.462
6.4     72.010    40.96     16.505      48.174
6.5     79.970    42.25     17.173      48.887
6.6     0.339     43.56     17.841      49.599
6.7     5.915     44.89     18.510      50.312
6.8     100.201   46.24     19.178      51.025
6.9     10.029    47.61     19.846      51.753
7.0     15.240    49.00     20.514      52.696
7.1     31.122    50.41     21.182      53.638
7.2     2.890     51.84     21.851      54.580
7.3     14.902    53.29     22.519      55.522
7.4     152.233   54.76     23.187      56.465
7.5     21.877    56.25     23.855      57.407
7.6     30.484    57.76     24.523      58.349
7.7     14.879    59.29     25.192      59.291
7.8     10.253    60.84     25.860      60.234
7.9     43.328    62.41     26.528      61.176
8.0     102.131   64.00     27.196      62.118
8.1     37.375    65.61     27.864      63.061
8.2     160.906   67.24     28.532      64.003
8.3     64.584    68.89     29.201      64.945
8.4     19.889    70.56     29.869      65.887
8.5     22.375    72.25     30.537      66.830
8.6     12.707    73.96     31.205      67.772
8.7     18.065    75.69     31.873      68.714
8.8     58.915    77.44     32.542      69.656
8.9     9.360     79.21     33.210      70.599
9.0     25.069    81.00     33.878      71.541
9.1     107.005   82.81     34.546      72.483
9.2     69.061    84.64     35.214      73.425
9.3     74.148    86.49     35.883      74.368
9.4     102.919   88.36     36.551      75.310
9.5     30.002    90.25     37.219      76.252
9.6     41.671    92.16     41.671      77.195
9.7     61.434    94.09     56.156      78.137
9.8     87.152    96.04     70.642      79.079
9.9     74.883    98.01     85.128      80.021
10.0    86.692    100.00    99.613      109.010

To obtain initial values for $a_0, a_1, \ldots, a_m$ and $b_0, b_1, \ldots, b_m$, perform the following steps:
1. Recall that we assume the $y_\alpha$'s are ordered as the $x_\alpha$'s. Thus, replace each $y_\alpha$ with $y'_\alpha$, the arithmetic average
of the maximum of all the preceding y's and the minimum of all the following y's:
$$y'_\alpha = \tfrac{1}{2} \Bigl( \max_{\beta \le \alpha} y_\beta + \min_{\beta \ge \alpha} y_\beta \Bigr).$$
2. Partition X into subintervals, thereby partitioning $\{x_\alpha\}$ and creating $\{A_i\}$. One natural way to partition
X is to subdivide $\{x_\alpha\}$ into approximately equal-sized subsets, say $10 = \sqrt{100}$ subsets of size 10:
$$\{x_1, x_2, \ldots, x_{10}\}, \{x_{11}, \ldots, x_{20}\}, \ldots, \{x_{91}, \ldots, x_{100}\}.$$
3. Choose $a_0 = x_1$ = left endpoint of data; $a_1 = \tfrac{1}{2}(x_{10} + x_{11})$; ...; $a_9 = \tfrac{1}{2}(x_{90} + x_{91})$; and $a_{10} = x_{100}$ =
right endpoint of data. We, thus, have our fuzzy sets $\{A_i, i = 0, \ldots, 10\}$.
4. Form the corresponding partition of Y, using the ordered $\{y'_\alpha\}$:
$$\{y'_1, y'_2, \ldots, y'_{10}\}, \{y'_{11}, \ldots, y'_{20}\}, \ldots, \{y'_{91}, \ldots, y'_{100}\}.$$
Similarly define $b_0 = y'_1$ = left endpoint of data; $b_1 = \tfrac{1}{2}(y'_{10} + y'_{11})$; ...; $b_9 = \tfrac{1}{2}(y'_{90} + y'_{91})$; and $b_{10} =
y'_{100}$ = right endpoint of data.
5. Using the Solver function on an Excel spreadsheet, maximize expression (6.6),
$$I\{(A_i, B_i)\} = \frac{1}{d} \sum_i \sum_\alpha \min(m_{A_i}(x_\alpha), m_{B_i}(y_\alpha)),$$
by allowing $\{a_1, \ldots, a_9\}$ and $\{b_1, \ldots, b_9\}$ to vary, subject to the constraints $a_0 \le a_1 \le \cdots \le a_9 \le a_{10}$ and
$b_0 \le b_1 \le \cdots \le b_9 \le b_{10}$.
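Steps 1-4 of the initialization can be sketched as follows. This is an illustration under stated assumptions: "preceding" and "following" are read as including the current index (so that the first and last points are well defined), the group size is passed explicitly, and the function names are ours. Step 5, the constrained maximization, is not reproduced here.

```python
def monotone_smooth(ys):
    """Step 1: average of the running maximum (from the left) and running minimum (from the right)."""
    run_max, cur = [], float("-inf")
    for y in ys:
        cur = max(cur, y)
        run_max.append(cur)
    run_min, cur = [0.0] * len(ys), float("inf")
    for k in reversed(range(len(ys))):
        cur = min(cur, ys[k])
        run_min[k] = cur
    return [0.5 * (hi, lo)[0] + 0.5 * (hi, lo)[1] for hi, lo in zip(run_max, run_min)]


def initial_knots(values, size):
    """Steps 2-4: endpoints of the data plus midpoints between consecutive groups of `size` points."""
    knots = [values[0]]
    for k in range(size, len(values), size):
        knots.append(0.5 * (values[k - 1] + values[k]))
    knots.append(values[-1])
    return knots


xs = [round(0.1 * k, 1) for k in range(1, 101)]
ys = [x ** 2 for x in xs]  # placeholder outputs; in practice use the learning data
a_knots = initial_knots(xs, 10)
b_knots = initial_knots(monotone_smooth(ys), 10)
print(len(a_knots), a_knots[:3], b_knots[:3])
```

The smoothed sequence is nondecreasing by construction, so the resulting knots $b_0 \le b_1 \le \cdots \le b_{10}$ satisfy the ordering constraint of step 5 from the start.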

To maximize expression (6.6), we used the values of $Y \mid x$ from the learning data, not the reordered values.
Our original value for I was approximately 0.23; after maximizing via Solver, it rose to roughly 0.37. By
graphing the predicted values of $y \mid x$ versus x, one can see that partitioning the intervals into three
subintervals, instead of 10, may be sufficient for this problem. (See Pred Y(10) in Fig. 1.)
After reanalysing the problem with three subintervals (four fuzzy rules), we obtain a value of 0.61 for I. The
predicted values of $Y \mid x$, based on three subintervals or four fuzzy rules, seem to match the noisy values of
$Y \mid x$. (Compare Y and Pred Y(3) in Fig. 1.) On the other hand, the predicted values of $Y \mid x$, based on 10
subintervals or eleven fuzzy rules, track the expected values of $Y \mid x$ more closely than the ones based on three
subintervals. (Compare $E[Y \mid x]$ and Pred Y(10) in Fig. 1.)
Remark. The maximum found by the software we used is certainly only a local maximum. On the other
hand, the corresponding partitions are, in some sense, "close" to the original ones. This could be an
advantage when one wishes to find a solution close to the current system; therefore, the new system would be
a blend of the old system and the training data.
7. Summary and areas of further research
We have presented a set of axioms for fuzzy subsethood that allows one to connect subsethood with fuzzy
entropy, thus extending the work of Kosko [8, 9]. Our axioms are somewhat different from the ones given by
Sinha and Dougherty [12], but we offer ours as alternative axioms. In addition, we have shown how to generate
subsethood measures from probability distributions, thereby expanding Kosko's research. We have demonstrated the importance of fuzzy subsethood by showing how one can use subsethood to fine tune fuzzy rules.
In further work, we will investigate the relationship between subsethood and inference as in Examples
4.2-4.4; in particular, we will examine how to generate subsethood measures from implication operators so
that Theorem 4.1 holds. We will, thus, continue the work that Willmott [14] and Kosko [8] initiated. We will
also further research how to use fuzzy subsethood in tuning rules in fuzzy logic as in Example 6.1.

Fig. 1.
References
[1] W. Bandler and L. Kohout, Fuzzy power sets and fuzzy implication operators, Fuzzy Sets and Systems 4 (1980) 13-30.
[2] A. DeLuca and S. Termini, A definition of a nonprobabilistic entropy in the setting of fuzzy sets theory, Inform. Control 20 (1972) 301-312.
[3] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, New York, 1980).
[4] B. Ebanks, On measures of fuzziness and their representations, J. Math. Anal. Appl. 94 (1983) 24-37.
[5] J.A. Goguen, The logic of inexact concepts, Synthese 19 (1969) 325-373.
[6] A. Kaufmann, Introduction to the Theory of Fuzzy Subsets, Vol. I: Fundamental Theoretical Elements (Academic Press, New York, 1975).
[7] G.J. Klir and T.A. Folger, Fuzzy Sets, Uncertainty, and Information (Prentice-Hall, Englewood Cliffs, NJ, 1988).
[8] B. Kosko, Fuzzy entropy and conditioning, Inform. Sci. 40 (1986) 165-174.
[9] B. Kosko, Fuzziness vs. probability, Internat. J. Gen. Systems 17 (1990) 211-240.
[10] E. Sanchez, Inverses of fuzzy relations: Applications to possibility distributions and medical diagnosis, Proc. IEEE Conf. Decision Control, Vol. 2 (New Orleans, Louisiana, 1977) 1384-1389.
[11] W. Sander, On measures of fuzziness, Fuzzy Sets and Systems 29 (1989) 49-55.
[12] D. Sinha and E.R. Dougherty, Fuzzification of set inclusion: theory and applications, Fuzzy Sets and Systems 55 (1993) 15-42.
[13] R. Willmott, Two fuzzier implication operators in the theory of fuzzy power sets, Fuzzy Sets and Systems 4 (1980) 31-36.
[14] R. Willmott, Mean measures of containment and equality between fuzzy sets, Proc. of the 11th Internat. Symp. on Multiple-Valued Logic (Oklahoma City, Oklahoma, 1981) 183-190.
[15] L.A. Zadeh, Fuzzy sets, Inform. Control 8 (1965) 338-353.
[16] L.A. Zadeh, Probability measures of fuzzy events, J. Math. Anal. Appl. 23 (1968) 421-427.
[17] L.A. Zadeh, Similarity relations and fuzzy orderings, Inform. Sci. 3 (1971) 177-200.
