
No.  Age    Weight  Height  BP          BP           Heart Rate
     (yrs)  (lbs)   (in)    (systolic)  (diastolic)  (bpm)
1    10     68      58      125         82           80
2    45     175     73      121         78           68
3    35     156     63      115         70           82
4    18     210     75      123         81           90
5    51     145     64      135         83           85
6    63     131     66      155         91           65
7    27     182     71      142         85           73
8    28     125     60      118         75           70
9    75     219     72      151         88           91
10   77     158     68      140         92           75

Table 1: Numeric Data Features

Q1) Determine dissimilarity matrices for the above numeric data.

Step 1: Data normalization
We need to normalize the above data to the range 0-1 using the formula:

$x_{norm} = \dfrac{x - x_{min}}{x_{max} - x_{min}}$

Let us consider the Age column (column 1): x_min = 10, x_max = 77.

x_11 = (10 - 10)/(77 - 10) = 0
x_21 = (45 - 10)/(77 - 10) = 0.522
x_31 = (35 - 10)/(77 - 10) = 0.373

Compute the other values in column 1 in a similar manner.

Considering the weight feature: x_min = 68, x_max = 219.

x_12 = (68 - 68)/(219 - 68) = 0
x_22 = (175 - 68)/(219 - 68) = 0.709
x_32 = (156 - 68)/(219 - 68) = 0.583

Normalizing all features in the above-mentioned manner, we get:

No.  Age    Weight  Height  BP          BP           Heart Rate
     (yrs)  (lbs)   (in)    (systolic)  (diastolic)  (bpm)
1    0      0       0       0.25        0.545        0.577
2    0.522  0.709   0.882   0.15        0.364        0.115
3    0.373  0.583   0.294   0           0            0.654
4    0.119  0.94    1       0.2         0.5          0.962
5    0.612  0.51    0.353   0.5         0.591        0.769
6    0.791  0.417   0.471   1           0.955        0
7    0.254  0.755   0.765   0.675       0.682        0.308
8    0.269  0.377   0.118   0.075       0.227        0.192
9    0.97   1       0.824   0.9         0.818        1
10   1      0.596   0.588   0.625       1            0.385

Normalized Numeric Data
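The min-max step can be checked with a few lines of Python. This is a minimal sketch and not part of the original notes: the helper name `min_max` is an assumption, and only the first three ages and weights are used, together with the column minima and maxima quoted in the worked example.

```python
def min_max(value, lo, hi):
    """Scale value into [0, 1] given the column minimum lo and maximum hi."""
    return (value - lo) / (hi - lo)

# Age column: min 10, max 77. Weight column: min 68, max 219 (as in the notes).
ages = [10, 45, 35]
weights = [68, 175, 156]

norm_ages = [round(min_max(a, 10, 77), 3) for a in ages]
norm_weights = [round(min_max(w, 68, 219), 3) for w in weights]

print(norm_ages)     # [0.0, 0.522, 0.373]
print(norm_weights)  # [0.0, 0.709, 0.583]
```

Passing the minimum and maximum explicitly keeps the sketch independent of the full table; with all ten rows available they could be taken with `min()` and `max()` per column.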

a) Euclidean distance

$d(i,j) = \sqrt{\sum_{f=1}^{p} (x_{if} - x_{jf})^2}$

where i = (x_i1, x_i2, ..., x_i6) and j = (x_j1, x_j2, ..., x_j6) are two objects described by p = 6 numeric attributes.

d(2,1) = sqrt((0.522-0)^2 + (0.709-0)^2 + (0.882-0)^2 + (0.15-0.25)^2 + (0.364-0.545)^2 + (0.115-0.577)^2)
       = sqrt((0.522)^2 + (0.709)^2 + (0.882)^2 + (0.1)^2 + (0.181)^2 + (0.462)^2) = 1.345

d(3,1) = sqrt((0.373-0)^2 + (0.583-0)^2 + (0.294-0)^2 + (0-0.25)^2 + (0-0.545)^2 + (0.654-0.577)^2)
       = sqrt((0.373)^2 + (0.583)^2 + (0.294)^2 + (0.25)^2 + (0.545)^2 + (0.077)^2) = 0.965

d(3,2) = sqrt((0.373-0.522)^2 + (0.583-0.709)^2 + (0.294-0.882)^2 + (0-0.15)^2 + (0-0.364)^2 + (0.654-0.115)^2)
       = sqrt((0.149)^2 + (0.126)^2 + (0.588)^2 + (0.15)^2 + (0.364)^2 + (0.539)^2) = 0.911

Similarly compute all values of d(i,j). The dissimilarity matrix is:

      1      2      3      4      5      6      7      8      9
2     1.345
3     0.965  0.911
4     1.432  0.983  1.037
5     0.928  0.964  0.824  0.991
6     1.444  1.188  1.604  1.666  1.014
7     1.221  0.709  1.143  0.891  0.781  0.874
8     0.713  0.888  0.595  1.341  0.915  1.349  1.068
9     1.815  1.357  1.552  1.162  0.925  1.235  1.061  1.749
10    1.444  1.015  1.394  1.352  0.738  0.617  0.85   1.319  0.841
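The three worked Euclidean distances can be reproduced as follows. This is an illustrative sketch: the function name `euclidean` and the variable names are assumptions, and the rows are the normalized values of patients 1 to 3 from the notes.

```python
import math

# Normalized rows for patients 1-3 (Age, Weight, Height, BP sys, BP dia, HR).
p1 = [0, 0, 0, 0.25, 0.545, 0.577]
p2 = [0.522, 0.709, 0.882, 0.15, 0.364, 0.115]
p3 = [0.373, 0.583, 0.294, 0, 0, 0.654]

def euclidean(a, b):
    """Square root of the summed squared attribute differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

d21 = euclidean(p2, p1)
d31 = euclidean(p3, p1)
d32 = euclidean(p3, p2)

# Should land close to the worked values 1.345, 0.965 and 0.911.
print(round(d21, 3), round(d31, 3), round(d32, 3))
```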

b) Manhattan distance

$d(i,j) = \sum_{f=1}^{p} |x_{if} - x_{jf}|$

d(2,1) = |0.522-0| + |0.709-0| + |0.882-0| + |0.15-0.25| + |0.364-0.545| + |0.115-0.577|
       = 0.522 + 0.709 + 0.882 + 0.1 + 0.181 + 0.462 ≈ 2.854

d(3,1) = |0.373-0| + |0.583-0| + |0.294-0| + |0-0.25| + |0-0.545| + |0.654-0.577|
       = 0.373 + 0.583 + 0.294 + 0.25 + 0.545 + 0.077 = 2.122

d(3,2) = |0.373-0.522| + |0.583-0.709| + |0.294-0.882| + |0-0.15| + |0-0.364| + |0.654-0.115|
       = 0.149 + 0.126 + 0.588 + 0.15 + 0.364 + 0.539 ≈ 1.915

Computing d(i,j), we get the below dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     2.854
3     2.122  1.915
4     2.54   1.785  2.325
5     1.963  2.049  1.577  2.153
6     3.415  2.528  3.368  3.94   2.022
7     2.604  1.468  2.465  1.866  1.742  2.075
8     1.642  1.638  1.25   2.761  2.077  2.76   2.209
9     4.139  2.887  3.608  2.143  2.177  2.351  2.074  4.253
10    3.206  2.265  2.828  3.139  1.628  1.31   1.524  2.936  1.742
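A lower-triangular Manhattan matrix can be built the same way. The sketch below is only an illustration restricted to patients 1 to 3; the names `rows`, `manhattan` and `lower` are assumptions. Results may differ from the hand-computed entries in the third decimal because the notes round intermediate terms.

```python
# Normalized rows for patients 1-3 (Age, Weight, Height, BP sys, BP dia, HR).
rows = {
    1: [0, 0, 0, 0.25, 0.545, 0.577],
    2: [0.522, 0.709, 0.882, 0.15, 0.364, 0.115],
    3: [0.373, 0.583, 0.294, 0, 0, 0.654],
}

def manhattan(a, b):
    """Sum of absolute attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Keep only pairs with i > j, i.e. the lower triangle of the matrix.
lower = {(i, j): manhattan(rows[i], rows[j]) for i in rows for j in rows if i > j}

for (i, j), d in sorted(lower.items()):
    print(f"d({i},{j}) = {d:.3f}")
```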

c) Supremum distance

$d(i,j) = \lim_{h \to \infty} \left( \sum_{f=1}^{p} |x_{if} - x_{jf}|^h \right)^{1/h} = \max_{f} |x_{if} - x_{jf}|$

Also known as Chebyshev distance, defined by the L∞ norm. It is the generalization of the Minkowski distance for h → ∞.

d(2,1) = max(|0.522-0|, |0.709-0|, |0.882-0|, |0.15-0.25|, |0.364-0.545|, |0.115-0.577|)
       = max(0.522, 0.709, 0.882, 0.1, 0.181, 0.462) = 0.882

d(3,1) = max(|0.373-0|, |0.583-0|, |0.294-0|, |0-0.25|, |0-0.545|, |0.654-0.577|)
       = max(0.373, 0.583, 0.294, 0.25, 0.545, 0.077) = 0.583

d(3,2) = max(0.149, 0.126, 0.588, 0.15, 0.364, 0.539) = 0.588

Computing d(i,j), we get the below dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.882
3     0.583  0.588
4     1      0.846  0.706
5     0.612  0.654  0.591  0.647
6     0.791  0.85   1      0.962  0.769
7     0.765  0.525  0.682  0.654  0.462  0.537
8     0.385  0.765  0.462  0.882  0.577  0.925  0.647
9     1      0.885  0.9    0.851  0.49   1      0.716  0.825
10    1      0.636  1      0.881  0.409  0.385  0.746  0.773  0.615

Q2) Determine dissimilarity matrix for binary data.

Contingency table for objects i and j:

                 Object j
                 1      0      sum
Object i   1     q      r      q+r
           0     s      t      s+t
           sum   q+s    r+t

Symmetric:   $d(i,j) = \dfrac{r + s}{q + r + s + t}$

Asymmetric:  $d(i,j) = \dfrac{r + s}{q + r + s}$

a)
Gender | Smoker | Diabetic | Fam His Heart Dis | Fam His Cancer | Fam His Diabetes | Fam His Stroke

Symmetric Binary Data (Y = 1, M = 1)

d(2,1) = (3 + 1)/(1 + 1 + 3 + 2) = 4/7 = 0.571

d(3,1) = (0 + 4)/(0 + 0 + 4 + 3) = 4/7 = 0.571

d(3,2) = (0 + 2)/(0 + 0 + 2 + 5) = 2/7 = 0.286

Computing d(i,j), we get the dissimilarity matrix as:

      1      2      3      4      5      6      7      8      9
2     0.571
3     0.571  0.286
4     0.571  0      0.286
5     0.286  0.571  0.286  0.571
6     0.857  0.571  0.286  0.571  0.571
7     0.571  0.286  0.286  0.286  0.571  0.286
8     0.571  0.286  0      0.286  0.286  0.286  0.286
9     0.571  0.286  0.571  0.286  0.857  0.571  0.286  0.571
10    0.571  0      0.286  0      0.571  0.571  0.286  0.286  0.286
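For the symmetric case, the counts q, r, s, t can be tallied from two 0/1 vectors and plugged into (r + s)/(q + r + s + t). The sketch below is illustrative only: the function names are assumptions, and the two vectors are hypothetical, chosen so that the counts match the terms of the d(2,1) example (one 1/1 match, two 0/0 matches, four mismatches over 7 attributes). They are not the patients' actual records.

```python
def contingency(a, b):
    """Return (q, r, s, t) for two equal-length 0/1 vectors."""
    q = sum(x == 1 and y == 1 for x, y in zip(a, b))  # both 1
    r = sum(x == 1 and y == 0 for x, y in zip(a, b))  # 1 in a, 0 in b
    s = sum(x == 0 and y == 1 for x, y in zip(a, b))  # 0 in a, 1 in b
    t = sum(x == 0 and y == 0 for x, y in zip(a, b))  # both 0
    return q, r, s, t

def symmetric_binary(q, r, s, t):
    """Mismatches divided by the total number of binary attributes."""
    return (r + s) / (q + r + s + t)

# Hypothetical vectors (assumption), 7 binary attributes each.
obj_a = [1, 1, 1, 1, 0, 0, 0]
obj_b = [1, 0, 0, 0, 1, 0, 0]

counts = contingency(obj_a, obj_b)
print(counts)                               # (1, 3, 1, 2)
print(round(symmetric_binary(*counts), 3))  # 0.571
```

Because both 1/1 and 0/0 agreements count as matches here, swapping the 0/1 coding of an attribute leaves the result unchanged, which is what makes the measure appropriate for symmetric attributes.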

b)
Smoker | Diabetic | Fam His Heart Dis | Fam His Cancer | Fam His Diabetes | Fam His Stroke

Asymmetric Binary Data (Y = 0, N = 1)

d(2,1) = (3 + 1)/(2 + 3 + 1) = 4/6 = 0.667

d(3,1) = (3 + 0)/(3 + 3 + 0) = 3/6 = 0.5

d(3,2) = (1 + 0)/(5 + 1 + 0) = 1/6 = 0.167

Computing d(i,j), we get the dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.667
3     0.5    0.167
4     0.667  0      0.167
5     0.25   0.5    0.333  0.5
6     0.833  0.5    0.333  0.5    0.667
7     0.667  0.333  0.333  0.333  0.5    0.2
8     0.5    0.167  0      0.167  0.333  0.333  0.333
9     [only five of eight entries legible: 0.4, 0.4, 0.833, 0.4, 0.5]
10    0.667  0      0.167  0      0.5    0.5    0.333  0.167  0.4
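The asymmetric coefficient drops the 0/0 agreements (t) from the denominator. A small sketch, with the function name `asymmetric_binary` as an assumption and the counts taken from the three worked examples (q = 2, 3, 5 respectively):

```python
def asymmetric_binary(q, r, s):
    """Mismatches divided by attributes where at least one object is 1."""
    return (r + s) / (q + r + s)

d21 = asymmetric_binary(q=2, r=3, s=1)
d31 = asymmetric_binary(q=3, r=3, s=0)
d32 = asymmetric_binary(q=5, r=1, s=0)

print(round(d21, 3), round(d31, 3), round(d32, 3))  # 0.667 0.5 0.167
```

The complement 1 - d(i,j) is the Jaccard coefficient q/(q + r + s), so the same helper can be reused when a similarity is needed instead of a dissimilarity.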


Q3) Determine dissimilarity matrix for the entire table: symmetric binary, and Manhattan distance for numeric data.

$d(i,j) = \dfrac{\sum_{f=1}^{p} d_{ij}^{(f)}}{p}$, with p = 13, i.e. $d(i,j) = \left( \sum_{f=1}^{13} d_{ij}^{(f)} \right) / 13$

- For the 6 numeric attributes, we use the dissimilarity matrices already computed in Q1.
- For the 7 symmetric binary attributes, d_ij^(f) = 0 if x_if = x_jf, else 1. Theoretically, it is the same as counting the no. of attributes for which the values present are different.

      1    2    3    4    5    6    7    8    9
2     4
3     4    2
4     4    0    2
5     2    4    2    4
6     6    4    2    4    4
7     4    2    2    2    4    2
8     4    2    0    2    2    2    2
9     4    2    4    2    6    4    2    4
10    4    0    2    0    4    4    2    2    2

Summed up dissimilarity matrix for 7 binary attributes

Adding the above computed matrix of binary symmetric attributes to the one computed in Q1 for numeric attributes with Manhattan distance, and dividing by 13, gives the overall dissimilarity matrix for the entire table.

d(2,1) = (2.854 + 4)/13 = 0.527
d(3,1) = (2.122 + 4)/13 = 0.471
d(3,2) = (1.915 + 2)/13 = 0.301

Computing d(i,j), we get the overall dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.527
3     0.471  0.301
4     0.503  0.137  0.333
5     0.305  0.465  0.275  0.473
6     0.724  0.502  0.413  0.611  0.463
7     0.508  0.267  0.343  0.297  0.442  0.313
8     0.434  0.28   0.096  0.366  0.314  0.366  0.324
9     0.626  0.376  0.585  0.319  0.629  0.489  0.313  0.635
10    0.554  0.174  0.371  0.241  0.433  0.408  0.271  0.38   0.288
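The combination step for Q3 is a plain sum of per-attribute dissimilarities divided by the attribute count. A minimal sketch, assuming the helper name `combine` and using the Manhattan distances and binary mismatch counts quoted in the three worked examples:

```python
def combine(numeric_manhattan, binary_mismatches, p=13):
    """Overall dissimilarity: summed per-attribute terms over p attributes."""
    return (numeric_manhattan + binary_mismatches) / p

# (Manhattan distance over 6 numeric attributes, mismatches over 7 binary ones)
pairs = {(2, 1): (2.854, 4), (3, 1): (2.122, 4), (3, 2): (1.915, 2)}

overall = {pair: round(combine(m, b), 3) for pair, (m, b) in pairs.items()}
print(overall)  # {(2, 1): 0.527, (3, 1): 0.471, (3, 2): 0.301}
```

This works without extra weighting because each normalized numeric term and each binary term already lies in [0, 1].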

Q4) Determine dissimilarity matrix for ordinal data.

- Replace x_if by its corresponding rank r_if ∈ {1, ..., M_f}.
- Normalize: $z_{if} = \dfrac{r_{if} - 1}{M_f - 1}$, where M_f is the no. of states of attribute f.
- Use, say, Manhattan distance for computing the dissimilarity matrix.

Rank the features: Never - 1, Seldom - 2, Sometimes - 3, Often - 4

No.  General        Sleep  Colds per  Exercise  Doctor's
     Feeling (1-5)         Year       (1-4)     visits
1    2              1      3          2         4
2    4              4      2          3         2
3    4              5      2          4         2
4    5              5      2          4         1
5    3              3      3          3         2
6    3              4      5          3         3
7    3              2      2          1         3
8    5              4      1          4         2
9    2              2      5          2         4
10   3              3      4          3         3

Table 2: Ordinal Data

Normalize data

General Feeling: z_11 = (2-1)/(5-1) = 0.25 ; z_21 = (4-1)/(5-1) = 0.75

Exercise: z_14 = (2-1)/(4-1) = 0.33 ; z_24 = (3-1)/(4-1) = 0.66

No.  Gen Feeling  Sleep  Colds/yr  Exercise  Doc visit
1    0.25         0      0.5       0.33      1
2    0.75         0.75   0.25      0.66      0.33
3    0.75         1      0.25      1         0.33
4    1            1      0.25      1         0
5    0.5          0.5    0.5       0.66      0.33
6    0.5          0.75   1         0.66      0.66
7    0.5          0.25   0.25      0         0.66
8    1            0.75   0         1         0.33
9    0.25         0.25   1         0.33      1
10   0.5          0.5    0.75      0.66      0.66
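The rank normalization z = (r - 1)/(M - 1) is easy to sketch in code. The names `LEVELS` and `rank_to_unit` are assumptions; the label-to-rank mapping is the one given in the notes (Never 1, Seldom 2, Sometimes 3, Often 4).

```python
# Rank mapping for the four-level frequency scale used in the notes.
LEVELS = {"Never": 1, "Seldom": 2, "Sometimes": 3, "Often": 4}

def rank_to_unit(rank, n_states):
    """Map a rank in 1..n_states onto the interval [0, 1]."""
    return (rank - 1) / (n_states - 1)

# General Feeling has 5 states: ranks 2 and 4 from the worked example.
gf = [rank_to_unit(2, 5), rank_to_unit(4, 5)]
print(gf)  # [0.25, 0.75]

# Exercise has 4 states: Seldom and Sometimes.
ex = [round(rank_to_unit(LEVELS[k], 4), 2) for k in ("Seldom", "Sometimes")]
print(ex)  # [0.33, 0.67]
```

The notes write 2/3 as 0.66 (truncated) where `round` gives 0.67; the distances later in Q4 are consistent with using the exact thirds.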
Assuming Manhattan distance:

d(2,1) = |0.75-0.25| + |0.75-0| + |0.25-0.5| + |0.66-0.33| + |0.33-1| = 2.5
d(3,1) = |0.75-0.25| + |1-0| + |0.25-0.5| + |1-0.33| + |0.33-1| = 3.083
d(3,2) = |0.75-0.75| + |1-0.75| + |0.25-0.25| + |1-0.66| + |0.33-0.33| = 0.583

Computing d(i,j), we get the below dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     2.5
3     3.083  0.583
4     3.667  1.167  0.583
5     1.75   0.75   1.33   1.917
6     2.167  1.33   1.917  2.5    1.083
7     1.417  1.75   2.33   2.917  1.5    1.917
8     3.33   0.833  0.75   0.833  1.583  2.167  2.583
9     0.75   2.75   3.33   3.917  2      1.417  1.667  3.583
10    1.667  1.33   1.917  2.5    0.583  0.5    1.417  2.167  1.417

Q5) Determine dissimilarity matrix for nominal data.

$d(i,j) = \dfrac{p - m}{p}$

m = no. of matches, p = no. of attributes.

Let's represent attributes by integers:
Lifestyle: Sporty - 1, [illegible] - 2, stress [illegible] - 3, sedentary - 4
Eyesight: [illegible] - 1, Far sight - 2, [illegible]
Eating Habit: Vegan - 1, [illegible] - 2, both - 3, [illegible] - 4
Alcohol: [illegible]
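Doctor's Visit | Lifestyle | Eyesight | Eating Habit | Alcohol Consumption

[table entries only partly legible in the source; legible fragments, in source order:]
4 4 3 4
2 4
4
2 2
1 3
3 3
4 3
1 [illegible] 4
3
4 4
3 2

Table 2: Nominal Data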
d(2,1) = (5 - 1)/5 = 0.8 (only eating habit matches)
d(3,1) = (5 - 0)/5 = 1 (nothing matches)
d(3,2) = (5 - 2)/5 = 3/5 = 0.6 (visits, eyesight match)

Computing d(i,j), we get the below dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.8
3     1      0.6
4     0.8    0.4    0.8
5     0.8    0.4    0.6    0.8
6     1      1      0.6    1      1
7     0.6    0.4    0.8    0.4    0.8    0.8
8     0.8    0.6    0.6    0.6    0.8    1      0.8
9     0.4    0.8    1      0.8    0.8    0.6    0.6    1
10    0.4    0.6    1      0.6    0.8    0.8    0.2    1      0.6
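The nominal measure d(i,j) = (p - m)/p only needs a count of matching attributes. In the sketch below the function name and the three coded rows are assumptions: the integer codes are hypothetical and were chosen only so that the match patterns agree with the worked examples (one match for objects 2 and 1, none for 3 and 1, two for 3 and 2). They are not the actual table entries.

```python
def nominal_dissimilarity(a, b):
    """(p - m) / p, with p attributes and m positions holding equal codes."""
    p = len(a)
    m = sum(x == y for x, y in zip(a, b))
    return (p - m) / p

# Hypothetical codes (assumption). Column order:
# doctor's visit, lifestyle, eyesight, eating habit, alcohol.
x1 = [4, 1, 2, 1, 3]
x2 = [2, 3, 1, 1, 1]
x3 = [2, 4, 1, 3, 2]

print(nominal_dissimilarity(x2, x1))  # 0.8 (only eating habit matches)
print(nominal_dissimilarity(x3, x1))  # 1.0 (nothing matches)
print(nominal_dissimilarity(x3, x2))  # 0.6 (visit and eyesight match)
```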
Q6) Determine dissimilarity matrix for the entire table 2.

$d(i,j) = \dfrac{\sum_{f=1}^{p} d_{ij}^{(f)}}{p}$, with p = 9, i.e. $d(i,j) = \left( \sum_{f=1}^{9} d_{ij}^{(f)} \right) / 9$

- For the first 5 ordinal attributes, we use the dissimilarity matrix already computed in Q4.
- For the last 4 nominal attributes, d_ij^(f) = 0 if x_if = x_jf, else 1. Same as counting the no. of attributes whose values present are different.

      1    2    3    4    5    6    7    8    9
2     3
3     4    3
4     3    1    3
5     3    2    3    3
6     4    4    2    4    4
7     2    1    3    1    3    4
8     3    3    3    2    4    4    3
9     2    3    4    3    3    2    2    4
10    1    2    4    2    3    4    1    4    2

Summed up dissimilarity matrix for 4 nominal attributes

Adding the above computed matrix for nominal attributes to the matrix computed in Q4 for ordinal attributes with Manhattan distance, and dividing by 9, gives the overall dissimilarity matrix for the entire table.

d(2,1) = (2.5 + 3)/9 = 0.611

d(3,1) = (3.083 + 4)/9 = 0.787

d(3,2) = (0.583 + 3)/9 = 0.398

Computing d(i,j), we get the overall dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.611
3     0.787  0.398
4     0.741  0.241  0.398
5     0.528  0.306  0.481  0.546
6     0.685  0.593  0.435  0.722  0.565
7     0.38   0.306  0.593  0.435  0.5    0.657
8     0.704  0.426  0.417  0.315  0.62   0.685  0.62
9     0.306  0.639  0.815  0.769  0.556  0.38   0.407  0.843
10    0.296  0.37   0.657  0.5    0.398  0.5    0.269  0.685  0.38
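The Q6 combination can be applied entry by entry to two lower-triangular matrices. A rough sketch restricted to the first three objects; the nested-list layout and the names `ordinal`, `nominal` and `overall_9` are assumptions, while the numbers are the Manhattan distances from Q4 and the nominal mismatch counts quoted in the worked examples.

```python
# Lower-triangular rows for objects 2 and 3 (row k holds d(k,1) .. d(k,k-1)).
ordinal = [[2.5], [3.083, 0.583]]   # Manhattan over 5 ordinal attributes
nominal = [[3], [4, 3]]             # mismatch counts over 4 nominal attributes

P = 9  # total number of attributes in table 2

overall_9 = [
    [round((o + n) / P, 3) for o, n in zip(o_row, n_row)]
    for o_row, n_row in zip(ordinal, nominal)
]
print(overall_9)  # [[0.611], [0.787, 0.398]]
```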

Q7) Determine dissimilarity matrix for the combination of both tables, assuming symmetric binary attributes and Manhattan distance for numeric and ordinal attributes.

$d(i,j) = \dfrac{\sum_{f=1}^{p} d_{ij}^{(f)}}{p}$, with p = 22, i.e. $d(i,j) = \left( \sum_{f=1}^{22} d_{ij}^{(f)} \right) / 22$

Adding the summed dissimilarities computed in Q3 and Q6 and dividing by the total no. of attributes (22) gives the overall dissimilarity matrix.

d(2,1) = (2.854 + 4 + 2.5 + 3)/22 = 0.562
d(3,1) = (2.122 + 4 + 3.083 + 4)/22 = 0.6
d(3,2) = (1.915 + 2 + 0.583 + 3)/22 = 0.341

Computing d(i,j), we get the overall dissimilarity matrix:

      1      2      3      4      5      6      7      8      9
2     0.562
3     0.6    0.341
4     0.6    0.18   0.359
5     0.396  0.4    0.36   0.503
6     0.708  0.539  0.422  0.656  0.505
7     0.455  0.283  0.445  0.354  0.466  0.454
8     0.544  0.34   0.227  0.345  0.439  0.497  0.445
9     0.495  0.484  0.679  0.503  0.599  0.444  0.352  0.72
10    0.449  0.254  0.488  0.347  0.419  0.446  0.27   0.505  0.325
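The 22-attribute result can be obtained either by summing the four partial sums directly or as a weighted average of the 13-attribute and 9-attribute results. The sketch below shows both routes for d(2,1); the variable and function names are assumptions and the inputs are the values quoted in the worked examples.

```python
def combined(numeric, binary, ordinal, nominal, p=22):
    """Sum of all per-attribute dissimilarities divided by p attributes."""
    return (numeric + binary + ordinal + nominal) / p

d21 = combined(2.854, 4, 2.5, 3)
d31 = combined(2.122, 4, 3.083, 4)
d32 = combined(1.915, 2, 0.583, 3)
print(round(d21, 3), round(d31, 3), round(d32, 3))  # 0.562 0.6 0.341

# Equivalent view: weight the Q3 (13 attributes) and Q6 (9 attributes) results.
d21_weighted = (13 * 0.527 + 9 * 0.611) / 22
print(round(d21_weighted, 3))  # agrees with d21 up to rounding of the inputs
```

The weighted form is convenient when only the already-divided Q3 and Q6 matrices are at hand, at the cost of carrying their rounding error into the result.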
