Quiz 3
Thursday, August 25th: holiday
Quiz 3 will be held on Wednesday, August 24th, during the
regular class time, 11:45-12:35
Instance Based Learning
CSL465/603 - Fall 2016
Narayanan C Krishnan
ckn@iitrpr.ac.in
Outline
K-nearest neighbor
Other forms of IBL
Nonparametric Methods
Radial Basis Functions
Key Ideas
Training: store all training examples (no explicit
learning)
Testing: compute the target function only locally
Advantages
Can learn very complex target function
Training is very fast
No loss of information
Disadvantages
Slow during testing
Easily fooled by irrelevant attributes
Example
Training set: $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$
K-nearest neighbor
Given the query instance $\mathbf{x}$,
take a vote among its $k$ nearest neighbors (if $y$ is discrete)
take the mean of the $y_i$ values of its $k$ nearest neighbors (if $y$ is real-valued)
$\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_i$
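The two rules above (majority vote for a discrete target, mean for a real-valued one) can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the toy data, and the choice of Euclidean distance are all assumptions made for the example, and ties in the vote are resolved arbitrarily.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest(train, query, k):
    """Return the k (x, y) pairs in train whose x is closest to query."""
    return sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]

def knn_classify(train, query, k):
    """Discrete target: majority vote among the k nearest neighbors."""
    votes = Counter(y for _, y in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k):
    """Real-valued target: mean of the y values of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(y for _, y in neighbors) / len(neighbors)

# Hypothetical toy data, made up for this sketch.
train_cls = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
             ((4.0, 4.2), "b"), ((4.1, 3.9), "b"), ((3.8, 4.0), "b")]
train_reg = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 4.0), ((3.0,), 9.0)]

label = knn_classify(train_cls, (1.1, 1.0), k=3)
value = knn_regress(train_reg, (1.4,), k=2)
```

Note that "training" here is just keeping the list of examples; all the work happens at query time, which is why testing is slow.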
Distance Measures
Numeric features
Manhattan, Euclidean, $L^p$ norm
$L^p(\mathbf{x}_1, \mathbf{x}_2) = \left( \sum_{d=1}^{D} |x_{1d} - x_{2d}|^p \right)^{1/p}$
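[Second formula on this slide: a sum over $k$ of absolute differences; its remaining symbols did not survive extraction]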
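The Manhattan and Euclidean distances are the p = 1 and p = 2 cases of the same norm-based distance. A small sketch, assuming features are plain numeric tuples (the function name `minkowski` is chosen for this example):

```python
def minkowski(a, b, p):
    """L^p distance: (sum over features of |a_d - b_d|^p) ^ (1/p)."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

manhattan = minkowski((0, 0), (3, 4), p=1)  # 7.0
euclid = minkowski((0, 0), (3, 4), p=2)     # 5.0
```

Because every feature enters the sum on its own scale, features with large numeric ranges dominate the distance unless the data are rescaled first.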
Illustrating k-NN
Voronoi Diagram
Voronoi cell of $\mathbf{x}$: all points in the input space closer to $\mathbf{x}$ than
to any other instance in the training set
Distance-Weighted k-NN
Simple refinement over k-NN
Might want to weight nearer neighbors more
heavily
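$\hat{y} = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}$
Where
$w_i = \frac{1}{d(\mathbf{x}, \mathbf{x}_i)^2}$
And $d(\mathbf{x}, \mathbf{x}_i)$ is the distance between $\mathbf{x}$ and $\mathbf{x}_i$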
With distance weighting, it makes sense to use all the training examples
instead of just the $k$ nearest
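A minimal sketch of distance-weighted prediction with inverse squared distance weights follows. The function name, the toy data, and the handling of an exact match (returning that instance's target to avoid dividing by zero) are assumptions of this example. Passing `k=None` uses every training example, as suggested above.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def weighted_knn_regress(train, query, k=None):
    """Weighted mean of neighbor targets with weights 1 / distance^2."""
    k = len(train) if k is None else k
    # (distance, target) pairs for the k nearest training instances
    nearest = sorted(((euclidean(x, query), y) for x, y in train),
                     key=lambda pair: pair[0])[:k]
    # Assumption: a query that coincides with a stored instance
    # simply takes that instance's target value.
    for d, y in nearest:
        if d == 0.0:
            return y
    weights = [1.0 / d ** 2 for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Hypothetical one-dimensional data.
train = [((0.0,), 0.0), ((1.0,), 10.0), ((3.0,), 30.0)]
prediction = weighted_knn_regress(train, (2.0,))   # uses all three examples
exact = weighted_knn_regress(train, (1.0,))        # query equals a stored point
```

For the query at 2.0 the weights are 0.25, 1 and 1, so the far point at 0.0 still contributes, but only a quarter as much as each of the two nearer points.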
Possible Solutions
Feature Selection
Filter Approach
Pre-select features individually using some measure
Wrapper approach
Experiment with different combinations of features using a
learner
Forward selection
Backward elimination
Feature Weighting
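The wrapper approach with forward selection can be sketched as a greedy loop that adds one feature at a time and keeps it only if a learner's validation score improves. In this illustration the learner is 1-NN scored by leave-one-out accuracy. That scoring choice, the function names, and the toy data (feature 0 informative, feature 1 irrelevant but large in scale) are all assumptions of the example.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def forward_selection(X, y):
    """Greedily add the feature that most improves the score; stop when none does."""
    remaining = set(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        candidates = [(loo_accuracy(X, y, selected + [f]), f)
                      for f in sorted(remaining)]
        score, feature = max(candidates, key=lambda c: c[0])
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
X = [(0.0, 5.0), (0.1, -3.0), (0.2, 8.0), (1.0, 4.9), (1.1, -3.1), (1.2, 8.1)]
y = [0, 0, 0, 1, 1, 1]
selected, score = forward_selection(X, y)
all_features_score = loo_accuracy(X, y, [0, 1])
```

On this toy data the irrelevant feature dominates the distance when both features are used, which is the "easily fooled by irrelevant attributes" problem in miniature. Backward elimination would run the same loop in reverse, starting from all features and removing one at a time.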
Curse of dimensionality
Sensitive to dimensionality of the data
Low dimension intuitions do not apply in high dimensions
Curse of Dimensionality
Examples
Normal Distribution
Points on a hyper grid
Approximation of a sphere by a cube
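One way to make the sphere-versus-cube example concrete is to compute the fraction of a cube's volume occupied by the largest ball that fits inside it. The sketch below uses the standard volume formula for a unit-radius ball in d dimensions. Reading the slide's example this way is an assumption, and the function name is chosen for illustration.

```python
import math

def ball_to_cube_ratio(d):
    """Volume of the unit-radius d-ball divided by the volume of the
    enclosing cube of side 2."""
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube_volume = 2.0 ** d
    return ball_volume / cube_volume

ratios = {d: ball_to_cube_ratio(d) for d in (1, 2, 3, 10, 20)}
```

The ratio is about 0.785 in two dimensions and about 0.524 in three, and it falls below 0.003 by ten dimensions: in high dimensions almost all of the cube's volume sits in its corners, far from the center, so "nearby" neighbors become rare.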
Curse of dimensionality
Computational cost
Distance computation while testing!
Forming prototypes
Edited k-NN
Remove instances that do not affect the decision
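One well-known way to discard instances that do not affect the decision is condensed nearest neighbor, which is related to but not the same as every "edited k-NN" variant. It keeps only the instances needed for 1-NN to classify the whole training set correctly. The sketch below is that swapped-in technique under simple assumptions (1-NN, squared Euclidean distance, made-up data); the slides do not specify which editing rule is meant.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_label(store, x):
    """Label of the stored instance closest to x (1-NN)."""
    return min(store, key=lambda pair: sq_dist(pair[0], x))[1]

def condense(train):
    """Grow a subset of train until 1-NN on the subset labels all of train correctly."""
    store = [train[0]]
    changed = True
    while changed:
        changed = False
        for x, y in train:
            # Add an instance only if the current subset misclassifies it.
            # The membership check guards against looping on duplicate
            # points that carry conflicting labels.
            if nearest_label(store, x) != y and (x, y) not in store:
                store.append((x, y))
                changed = True
    return store

# Hypothetical data: two well-separated one-dimensional clusters.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b")]
prototypes = condense(train)
```

Here two stored instances, one per cluster, reproduce the 1-NN decisions of all six, which also cuts the number of distance computations at test time.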
Overfitting
What parameter of the model can indicate
overfitting?
Set the parameter through validation experiments
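A validation experiment of this kind can be sketched as follows, assuming the parameter in question is the number of neighbors k and using leave-one-out error as the validation measure. Both are assumptions of this example, as are the function names and the toy data, which contain one deliberately mislabeled point.

```python
from collections import Counter

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def loo_error(train, k):
    """Leave-one-out error rate of k-NN majority voting."""
    errors = 0
    for i, (x, y) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        neighbors = sorted(rest, key=lambda pair: sq_dist(pair[0], x))[:k]
        predicted = Counter(label for _, label in neighbors).most_common(1)[0][0]
        errors += predicted != y
    return errors / len(train)

def choose_k(train, candidates):
    """Return the candidate k with the lowest leave-one-out error
    (the first such candidate if several tie)."""
    return min(candidates, key=lambda k: loo_error(train, k))

# Hypothetical data: two clusters plus one noisy "b" inside the "a" cluster.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((3.0,), "a"),
         ((1.5,), "b"),
         ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b"), ((13.0,), "b")]
best_k = choose_k(train, [1, 3, 5])
```

With k = 1 the noisy point misleads its two neighbors, so three of the nine instances are misclassified; with k = 3 only the noisy point itself is, so the larger k is preferred.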
Nonparametric methods
Form of underlying distributions unknown
Still want to perform classification (or regression)
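Naive estimator: the fraction of training instances that fall within a window of width $h$ centered at $x$, divided by $h$
$\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} w\left(\frac{x - x_i}{h}\right)$
$w(u) = \begin{cases} 1, & \text{if } |u| < 1/2 \\ 0, & \text{otherwise} \end{cases}$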
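The naive estimator counts the training instances within half a window width of the query point and normalizes by the sample size times the window width. A minimal one-dimensional sketch, with a made-up sample and a function name chosen for illustration:

```python
def naive_density(sample, x, h):
    """Naive density estimate at x: count of instances with |x - x_i| / h < 1/2,
    divided by (N * h)."""
    inside = sum(1 for xi in sample if abs((x - xi) / h) < 0.5)
    return inside / (len(sample) * h)

# Hypothetical one-dimensional sample.
sample = [1.0, 1.2, 1.4, 2.0, 3.5]
density = naive_density(sample, 1.2, h=1.0)
```

Three of the five instances fall inside the window around 1.2, giving an estimate of 0.6. The estimate jumps whenever an instance enters or leaves the window, so it is not a smooth function of the query point.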
$\hat{p}(x) = \frac{k}{2 N d_k(x)}$
Where
$d_k(x)$ - distance to the kth closest instance to $x$
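Instead of fixing the window width and counting instances, the k-nearest-neighbor density estimate fixes the count k and lets the window grow until it reaches the kth closest instance. A one-dimensional sketch under that reading, with a made-up sample and an illustrative function name:

```python
def knn_density(sample, x, k):
    """k-NN density estimate at x: k / (2 * N * d_k), where d_k is the
    distance from x to its kth closest instance."""
    d_k = sorted(abs(x - xi) for xi in sample)[k - 1]
    # Note: d_k is zero, and the estimate undefined, when x coincides
    # with at least k sample points.
    return k / (2 * len(sample) * d_k)

# Hypothetical one-dimensional sample.
sample = [1.0, 1.2, 1.4, 2.0, 3.5]
density = knn_density(sample, 1.5, k=2)
```

For the query at 1.5 the second closest instance is 1.2, at distance 0.3, so the estimate is 2 / (2 * 5 * 0.3), roughly 0.667. The window is narrow where instances are dense and wide where they are sparse.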
Summary
Lazy learning
K-NN for classification
Issues with K-NN and potential solutions
Curse of dimensionality
Density Estimation
Nonparametric methods