
Statistical Methods in Artificial Intelligence

CSE471 - Monsoon 2015 : Lecture 02

Avinash Sharma
CVIT, IIIT Hyderabad

Course Content

Introduction
Linear Classification
Neural Networks
Probability Densities
Bayesian Classifiers
Dimensionality Reduction
Support Vector Machines
Kernel Methods
Clustering Techniques
Decision Trees/Graphical Models

Reference Material
Books
Pattern Classification by Duda, Hart & Stork
The Elements of Statistical Learning by Hastie, Tibshirani and Friedman
Machine Learning: A Probabilistic Perspective by Kevin P. Murphy

Prerequisites
Basics of Linear Algebra, Probability Theory and Statistics.
Programming in Matlab and C/C++.

Course Website

http://courses.iiit.ac.in

Online Courses/Tutorials and Research Papers

Assessment & Attendance Policy

Assessment
27% Project
18% Assignments (best 6 out of 7)
30% Two mid-semester exams (15% each)
25% Final exam
Zero marks in an assignment for both parties if
copying is detected
Cooperation with TAs is expected

Zero tolerance: the institute attendance policy is strictly enforced

Expected Outcome
This course will enable students to understand
pattern recognition techniques in detail.
Both theoretical and practical aspects will be
learned simultaneously.
The project deliverables are expected to be
working systems tied to a practical application.

Lecture 02: Plan


Nearest Neighbor (NN) Classifier
KNN Classifier
Classifier Evaluation
Assignment
Basic Linear Algebra Operations
Linear Discriminant Functions

Nearest Neighbor (NN) Classifier
Two data points (samples) from the same
class should have similar features/attributes.
Similarity in feature space should align with
similarity between data points in real-world
scenarios.
The easiest way to classify a data point in
the test data is to find a very similar data
point in the training data.

Nearest Neighbor (NN) Classifier


Eager vs. lazy learning
Eager learning: learn a classifier from the training
data before receiving any test sample to classify.
Lazy learning: simply store the training data and wait
until given a test sample.

NN Classifier
A lazy learning approach where each data sample is
represented as a point in a Euclidean space.
Any new test sample is assigned the label of the
closest training point under the Euclidean distance
metric (see the sketch below).
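A minimal 1-NN sketch in C++ under assumed conventions: each sample is a feature vector plus an integer class label. The names Sample, squaredDistance, and classifyNN are illustrative, not from the slides.

    // Minimal 1-NN classifier: "training" just stores the data;
    // classification scans the stored samples for the closest point
    // and copies its label.
    #include <limits>
    #include <vector>

    struct Sample {
        std::vector<double> features;  // point in Euclidean feature space
        int label;                     // class label
    };

    // Squared Euclidean distance (the square root is unnecessary for ranking).
    double squaredDistance(const std::vector<double>& a,
                           const std::vector<double>& b) {
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            const double diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    int classifyNN(const std::vector<Sample>& train,
                   const std::vector<double>& query) {
        double best = std::numeric_limits<double>::max();
        int bestLabel = -1;
        for (const Sample& s : train) {
            const double d = squaredDistance(s.features, query);
            if (d < best) { best = d; bestLabel = s.label; }
        }
        return bestLabel;
    }

Note the lazy-learning trade-off this makes concrete: there is no training cost at all, but every query pays a full scan of the training set.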

Nearest Neighbor (NN) Classifier

[Figure: three test points (Test #1, #2, #3) classified among training points from Class A and Class B.]

Nearest Neighbor (NN) Classifier

The NN rule partitions the Euclidean space into
cells (Voronoi cells), one per training point.

K-Nearest Neighbor (KNN) Classifier

[Figure: the same three test points (Test #1, #2, #3) classified among Class A and Class B using the K nearest neighbors.]

Practical Aspects
K should be chosen empirically, and preferably odd to
avoid ties.
KNN can handle both discrete-valued and continuous-valued
target functions.
Weighted contributions from different neighbors can
be used to compute the final label.
Distance-based weighting gives higher importance to
closer data points (see the sketch after this list).
Performance of NN classification typically degrades
when data is high-dimensional.
This can be mitigated by assigning feature weights
inside the Euclidean distance computation.
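A distance-weighted KNN sketch in C++, reusing the Sample layout assumed in the earlier NN sketch. The 1/d^2 vote weight is one common illustrative choice, not mandated by the slides.

    #include <algorithm>
    #include <cmath>
    #include <map>
    #include <utility>
    #include <vector>

    struct Sample { std::vector<double> features; int label; };

    double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i)
            d += (a[i] - b[i]) * (a[i] - b[i]);
        return std::sqrt(d);
    }

    // Each of the k nearest neighbors votes with weight 1/d^2,
    // so closer points count more. Assumes k <= train.size().
    int classifyWeightedKNN(const std::vector<Sample>& train,
                            const std::vector<double>& query, int k) {
        std::vector<std::pair<double, int>> dist;   // (distance, label)
        for (const Sample& s : train)
            dist.push_back({euclidean(s.features, query), s.label});
        std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

        std::map<int, double> votes;                // label -> accumulated weight
        for (int i = 0; i < k; ++i) {
            const double d = dist[i].first;
            if (d == 0.0) return dist[i].second;    // exact match wins outright
            votes[dist[i].second] += 1.0 / (d * d);
        }
        int best = -1; double bestW = -1.0;
        for (const auto& [label, w] : votes)
            if (w > bestW) { bestW = w; best = label; }
        return best;
    }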

Effect of K on Decision Boundaries

[Figure: decision boundaries produced by 1-NN, 5-NN, and 20-NN classifiers.]

KNN Classifier
Advantages:
Can learn complex target functions
Training is very fast
Zero loss of information (all training data is retained)

Disadvantages:
Classification cost for new instances can be
very high
Most of the computation takes place at
classification time

Classifier Evaluation
Cross-validation is an important means of
evaluating classifiers. Common cross-validation
techniques are (a fold-splitting sketch follows
the list):
Random Subsampling
K-fold Cross-Validation
Leave-one-out Cross-Validation
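A sketch of the index bookkeeping for K-fold cross-validation in C++ (makeFolds is a hypothetical helper; the classifier is plugged in by the caller): each fold serves once as the test set while the remaining k-1 folds form the training set, and the k test accuracies are averaged. Leave-one-out is the special case k = n; random subsampling instead redraws an independent random split each round.

    #include <algorithm>
    #include <numeric>
    #include <random>
    #include <vector>

    // Shuffle the n sample indices once, then deal them
    // into k folds round-robin.
    std::vector<std::vector<int>> makeFolds(int n, int k, unsigned seed = 0) {
        std::vector<int> idx(n);
        std::iota(idx.begin(), idx.end(), 0);   // 0, 1, ..., n-1
        std::shuffle(idx.begin(), idx.end(), std::mt19937(seed));
        std::vector<std::vector<int>> folds(k);
        for (int i = 0; i < n; ++i)
            folds[i % k].push_back(idx[i]);     // fold sizes differ by at most 1
        return folds;
    }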

Assignment
Iris or another standard dataset
KNN classifier
Matlab/C++ pipeline
K-fold cross-validation
A 2-3 page report on the setup and experimental
evaluation

Basic Linear Algebra Operations
Vector
Vector Operations
Scaling
Transpose
Addition
Subtraction
Dot Product

Equation of a Plane

Vector Operations

Transpose
Scaling: only the magnitude changes
The dot product (inner product) of two vectors is a scalar.
The dot product of two perpendicular vectors is 0.
(These operations are restated in standard notation below.)
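The slide figures for these operations are not reproduced in this copy; in standard notation (a restatement, not slide content) they read:

    \mathbf{a} = (a_1, \dots, a_d)^T, \qquad
    c\,\mathbf{a} = (c a_1, \dots, c a_d)^T \quad \text{(scaling)}

    \mathbf{a} \pm \mathbf{b} = (a_1 \pm b_1, \dots, a_d \pm b_d)^T
    \quad \text{(addition/subtraction)}

    \mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b}
    = \sum_{i=1}^{d} a_i b_i
    = \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta, \qquad
    \mathbf{a} \perp \mathbf{b} \;\Longleftrightarrow\;
    \mathbf{a} \cdot \mathbf{b} = 0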

Equation of a Plane
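The slide's equation image is lost in this copy; the standard form, consistent with the perceptron notation later in the lecture, is:

    \mathbf{w}^T \mathbf{x} + w_0 = 0

where $\mathbf{w}$ is the normal vector to the plane and $w_0$ is the bias; $\mathbf{w}^T \mathbf{x} + w_0$ is positive on one side of the plane and negative on the other.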

Linear Discriminant Functions
Assumes a 2-class classification setup.
The decision boundary is represented explicitly in
terms of the components of the feature vector x.
The aim is to seek parameters of a linear discriminant
function that minimize the training error (the standard
form is given below).
Why linear?
Simplest possible form
Can be generalized
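In the standard two-class form (as in Duda, Hart & Stork, one of the course references), the linear discriminant is:

    g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0, \qquad
    \text{decide Class A if } g(\mathbf{x}) > 0, \quad
    \text{Class B if } g(\mathbf{x}) < 0

with the decision boundary given by $g(\mathbf{x}) = 0$.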

Linear Discriminant Functions & Decision Surfaces

[Figure: a linear decision surface (hyperplane) separating Class A from Class B.]

The Perceptron

Perceptron Decision Boundary

Perceptron Summary

The decision boundary surface (hyperplane)
$g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0 = 0$ divides the
feature space into two regions.
The orientation of the surface is decided by the normal
vector $\mathbf{w}$.
The location of the surface is determined by the bias
term $w_0$.
$g(\mathbf{x})$ is proportional to the distance of $\mathbf{x}$
from the surface: $g(\mathbf{x}) > 0$ on the positive side,
$g(\mathbf{x}) < 0$ on the negative side.
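A sketch of the classic perceptron update rule in C++ (the rule itself is standard; the function and parameter names are illustrative, and labels are assumed to be +1/-1):

    #include <vector>

    struct Sample { std::vector<double> x; int y; };  // y in {+1, -1}

    // One pass of the perceptron rule: on a misclassified sample, move w
    // toward (for y = +1) or away from (for y = -1) that sample's feature
    // vector and adjust the bias. Repeating passes until no sample is
    // misclassified converges only if the data is linearly separable.
    void perceptronEpoch(std::vector<double>& w, double& w0,
                         const std::vector<Sample>& data, double eta = 1.0) {
        for (const Sample& s : data) {
            double g = w0;                            // g(x) = w^T x + w0
            for (std::size_t i = 0; i < w.size(); ++i) g += w[i] * s.x[i];
            if (g * s.y <= 0) {                       // wrong side of the boundary
                for (std::size_t i = 0; i < w.size(); ++i)
                    w[i] += eta * s.y * s.x[i];
                w0 += eta * s.y;
            }
        }
    }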
