5. K-NEAREST NEIGHBOR
Pedro Larrañaga
Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country

Madrid, 25th of July, 2006

Outline

Introduction

The Basic K-NN

Extensions of the Basic K-NN

Prototype Selection

Summary

Basic Ideas

K-NN is also known as instance-based learning (IBL), case-based reasoning (CBR), or lazy learning
A new instance is classified as the most frequent class of its K nearest neighbors
Very simple and intuitive idea
Easy to implement
There is no explicit model (transduction)

Algorithm for the basic K-NN

BEGIN
  Input: D = {(x1, c1), ..., (xN, cN)}
         x = (x1, ..., xn) new instance to be classified
  FOR each labelled instance (xi, ci) calculate d(xi, x)
  Order d(xi, x) from lowest to highest, (i = 1, ..., N)
  Select the K nearest instances to x: Dx^K
  Assign to x the most frequent class in Dx^K
END

Figure: Pseudo-code for the basic K-NN classifier
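
A minimal Python sketch of the pseudo-code above, assuming numeric predictors and Euclidean distance (the slides leave the distance d unspecified; the name knn_classify is ours):

import math
from collections import Counter

def knn_classify(D, x, K):
    """D is a list of (features, label) pairs; x is the new instance."""
    # Calculate d(xi, x) for each labelled instance (xi, ci)
    distances = [(math.dist(xi, x), ci) for xi, ci in D]
    # Order d(xi, x) from lowest to highest
    distances.sort(key=lambda pair: pair[0])
    # Select the K nearest instances to x: Dx^K
    neighbors = distances[:K]
    # Assign to x the most frequent class in Dx^K
    votes = Counter(ci for _, ci in neighbors)
    return votes.most_common(1)[0][0]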

Algorithm for the basic K-NN

Example

Figure: Example for the basic 3-NN. m denotes the number of classes, n the number of predictor variables, and N the number of labelled cases
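
The figure itself is not reproduced here; a toy run of the sketch above in its spirit, with n = 2 predictors, m = 2 classes, and N = 6 labelled cases (the data are invented for illustration):

D = [((1.0, 1.0), 'a'), ((1.2, 0.8), 'a'), ((0.9, 1.1), 'a'),
     ((4.0, 4.2), 'b'), ((4.1, 3.9), 'b'), ((3.8, 4.0), 'b')]
print(knn_classify(D, (1.1, 1.0), K=3))  # -> 'a': the 3 nearest cases are all 'a'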

Algorithm for the basic K-NN


The accuracy is not monotonic with respect to K

Figure: Accuracy versus number of neighbors
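
Because accuracy is not monotonic in K, the number of neighbors is typically tuned empirically. A minimal sketch, assuming a held-out validation set and reusing knn_classify from above (select_k and the candidate list are illustrative choices):

def select_k(train, val, candidates=(1, 3, 5, 7, 9)):
    """Return the candidate K with the highest accuracy on the validation set."""
    def accuracy(K):
        hits = sum(knn_classify(train, x, K) == c for x, c in val)
        return hits / len(val)
    return max(candidates, key=accuracy)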

K-NN with rejection

Some guarantees are demanded before an instance is classified
If the guarantees are not verified, the instance remains unclassified
Usual guarantee: a threshold on the most frequent class in the neighborhood
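
A sketch of the rejection rule, assuming the guarantee is a minimum fraction of neighbors voting for the winning class (the 0.7 threshold is an illustrative choice, not from the slides; math and Counter as in the basic sketch):

def knn_with_rejection(D, x, K, threshold=0.7):
    neighbors = sorted((math.dist(xi, x), ci) for xi, ci in D)[:K]
    votes = Counter(ci for _, ci in neighbors)
    label, count = votes.most_common(1)[0]
    # Classify only if the guarantee is verified; otherwise leave unclassified
    return label if count / K >= threshold else None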

K-NN with average distance

Figure: K-NN with average distance
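
The figure is not reproduced here; one common reading of this variant, sketched under that assumption, assigns the class whose representatives among the K neighbors have the smallest average distance to x:

def knn_average_distance(D, x, K):
    neighbors = sorted((math.dist(xi, x), ci) for xi, ci in D)[:K]
    by_class = {}
    for d, ci in neighbors:
        by_class.setdefault(ci, []).append(d)
    # Winner: the class with the smallest mean distance among the K neighbors
    return min(by_class, key=lambda c: sum(by_class[c]) / len(by_class[c]))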

K-NN with weighted neighbors

Figure: K-NN with weighted neighbors

K-NN with weighted neighbors

instance   d(xi, x)   wi = 1/d(xi, x)
x1         2          0.5
x2         2          0.5
x3         2          0.5
x4         2          0.5
x5         0.7        1/0.7
x6         0.8        1/0.8

Figure: Weight to be assigned to each of the 6 selected instances
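
A sketch matching the table: each of the K selected instances votes with weight wi = 1/d(xi, x), so nearer neighbors count more (math and Counter as in the basic sketch):

def knn_weighted(D, x, K):
    neighbors = sorted((math.dist(xi, x), ci) for xi, ci in D)[:K]
    weights = Counter()
    for d, ci in neighbors:
        if d == 0:
            return ci           # exact match: that class wins outright
        weights[ci] += 1.0 / d  # wi = 1/d(xi, x), as in the table
    return weights.most_common(1)[0][0]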

K-NN with weighted variables

X1   X2   C
0    0    1
0    0    1
0    0    1
1    0    1
1    0    1
1    1    1
0    1    0
0    1    0
0    1    0
1    1    0
1    1    0
1    0    0

Figure: Variable X1 is not relevant for C

K-NN with weighted variables

The distance is computed with one weight per predictor variable, where each weight is the mutual information between that variable and the class:

d(x_l, x) = \sum_{i=1}^{n} w_i \, d_i(x_{l,i}, x_i), \qquad w_i = MI(X_i, C)

For the 12-case table above (base-2 logarithms):

MI(X_1, C) = \sum_{x, c} p_{(X_1,C)}(x, c) \log \frac{p_{(X_1,C)}(x, c)}{p_{X_1}(x) \, p_C(c)}
           = \frac{3}{12} \log \frac{3/12}{(6/12)(6/12)} + \frac{3}{12} \log \frac{3/12}{(6/12)(6/12)} + \frac{3}{12} \log \frac{3/12}{(6/12)(6/12)} + \frac{3}{12} \log \frac{3/12}{(6/12)(6/12)} = 0

since the ratio inside every logarithm equals 1. In contrast,

MI(X_2, C) = \frac{1}{12} \log \frac{1/12}{(6/12)(6/12)} + \frac{5}{12} \log \frac{5/12}{(6/12)(6/12)} + \frac{5}{12} \log \frac{5/12}{(6/12)(6/12)} + \frac{1}{12} \log \frac{1/12}{(6/12)(6/12)} \approx 0.35

so X_1 receives weight 0 and is effectively ignored, while X_2 drives the distance.
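
A sketch of this weighting for discrete predictors, estimating MI(X, C) from empirical counts (math and Counter as in the basic sketch; mi_weight is our name):

def mi_weight(values, labels):
    """Empirical mutual information between one predictor column and the class column."""
    N = len(values)
    p_x, p_c = Counter(values), Counter(labels)
    p_xc = Counter(zip(values, labels))
    mi = 0.0
    for (x, c), n in p_xc.items():
        # Each term: p(x,c) * log[ p(x,c) / (p(x) * p(c)) ]
        mi += (n / N) * math.log2((n / N) / ((p_x[x] / N) * (p_c[c] / N)))
    return mi

# For the 12-case table above, mi_weight gives 0.0 for X1 and about 0.35 for X2.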

Wilson editing

Eliminating noisy instances
The class of each labelled instance, (x_l, c^(l)), is compared with the label assigned by a K-NN built from all instances except itself
If both labels coincide, the instance is maintained in the file; otherwise it is eliminated
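
A sketch of the editing loop, reusing knn_classify from the basic algorithm:

def wilson_edit(D, K):
    edited = []
    for i, (xi, ci) in enumerate(D):
        rest = D[:i] + D[i + 1:]             # all instances except itself
        if knn_classify(rest, xi, K) == ci:  # labels coincide: keep it
            edited.append((xi, ci))
    return edited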

Hart condensation

Maintaining rare instances
For each labelled instance, following the storage ordering, consider a K-NN built with only the instances stored before the one under consideration
If the true class and the class predicted by the K-NN are the same, the instance is not selected
Otherwise (the true class and the predicted one differ), the instance is selected
The method depends on the storage ordering
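
A sketch of the condensation pass, again reusing knn_classify. One common formulation, assumed here, classifies each instance against the condensed set selected so far (K = 1 is the classical choice); as the slide notes, the result depends on the storage ordering:

def hart_condense(D, K=1):
    condensed = [D[0]]                        # the first instance is always kept
    for xi, ci in D[1:]:
        # Classify with only the previously selected instances
        if knn_classify(condensed, xi, K) != ci:
            condensed.append((xi, ci))        # misclassified: select it
    return condensed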

K-nearest neighbor

Intuitive and easy to understand
There is no explicit model: transduction instead of induction
Variants of the basic algorithm
Storage problems: prototype selection
