Classification examples (figure):
• Attributes (Age, Marital status, Health status, Salary) → Classifier → Issue loan? {Yes, No}
• Perceptive inputs → Classifier → Steer? {Left, Straight, Right}
• Textual features (n-grams) → Classifier → Category of document? {Politics, Movies, Biology}
A test instance is a tuple of attribute values (a1, a2, …, an); the classifier predicts a discrete-valued class label.
Classification learning
• Training phase: learning the classifier from the available (labeled) data, the 'training set'
• Testing phase: testing how well the classifier performs, on the 'testing set'
Generating datasets
• Methods:
– Holdout (2/3 for training, 1/3 for testing)
– Cross-validation (n-fold)
• Divide into n parts
• Train on (n-1), test on last
• Repeat for different combinations
– Bootstrapping
• Select random samples to form the training set
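The n-fold cross-validation procedure above can be sketched as follows (a minimal sketch; the function name and toy data are illustrative):

```python
import random

def n_fold_splits(data, n=3, seed=0):
    """Split `data` into n folds; yield (train, test) pairs,
    training on n-1 folds and testing on the held-out fold."""
    data = data[:]                      # copy so shuffling is safe
    random.Random(seed).shuffle(data)
    folds = [data[i::n] for i in range(n)]
    for k in range(n):
        test = folds[k]
        train = [x for j, f in enumerate(folds) if j != k for x in f]
        yield train, test

# Each instance appears in the test fold exactly once across the n rounds.
splits = list(n_fold_splits(list(range(9)), n=3))
```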
Evaluating classifiers
• Outcome:
– Accuracy
– Confusion matrix
– If cost-sensitive, the expected cost of classification (attribute test cost + misclassification cost), etc.
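A minimal sketch of the accuracy and confusion-matrix computations (function names and the toy labels are illustrative):

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """cm[(a, p)] counts instances of true class a predicted as p."""
    cm = Counter(zip(actual, predicted))
    return {(a, p): cm.get((a, p), 0) for a in labels for p in labels}

def accuracy(actual, predicted):
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

y_true = ["yes", "yes", "no", "no", "no"]
y_pred = ["yes", "no",  "no", "no", "yes"]
cm = confusion_matrix(y_true, y_pred, ["yes", "no"])
acc = accuracy(y_true, y_pred)   # 3 of 5 correct
```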
Decision Trees
Example tree (figure): intermediate nodes test attributes (a1, …, a6); leaf nodes carry the class labels (X, Y, Z).
• When a test instance arrives, it is routed down the tree to obtain the class-label prediction.
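Routing a test instance down such a tree can be sketched as follows (the nested-dict tree encoding and the attribute values are made up for illustration):

```python
# Each internal node tests one attribute; leaves hold class labels.
tree = {"attr": "a1",
        "branches": {"low":  {"label": "X"},
                     "high": {"attr": "a2",
                              "branches": {"yes": {"label": "Y"},
                                           "no":  {"label": "Z"}}}}}

def predict(node, instance):
    """Route the test instance down the tree to a leaf's class label."""
    while "label" not in node:
        node = node["branches"][instance[node["attr"]]]
    return node["label"]

label = predict(tree, {"a1": "high", "a2": "yes"})
```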
Decision List learning (figure):
A decision list R is a sequence of units (hi, ci): if hi = 1 then output ci, else fall through to the next unit.
• Start from a set of candidate feature functions hi.
• For each hi, let Pi and Ni be the positive and negative examples it covers, and Qi = Pi ∪ Ni.
• Utility: Ui = max { |Pi| − pn·|Ni| , |Ni| − pp·|Pi| }
• Select hk, the feature with the highest utility; append the unit (hk, 1/0) to R; remove the covered examples: S' = S − Qk.
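The greedy loop above can be sketched as follows (a sketch under assumptions: binary labels, features as boolean functions, and an added stop when no feature has positive utility; all names and data are illustrative):

```python
def learn_decision_list(S, features, pn=1.0, pp=1.0, max_len=10):
    """S: list of (instance, label) with label in {0, 1};
    features: dict name -> boolean function of an instance."""
    R = []
    S = list(S)
    while S and len(R) < max_len:        # upper threshold on |R|
        best = None
        for name, h in features.items():
            P = [(x, y) for x, y in S if h(x) and y == 1]
            N = [(x, y) for x, y in S if h(x) and y == 0]
            # U_i = max{ |P_i| - pn*|N_i| , |N_i| - pp*|P_i| }
            u, c = max((len(P) - pn * len(N), 1),
                       (len(N) - pp * len(P), 0))
            if best is None or u > best[0]:
                best = (u, name, h, c)
        u, name, h, c = best
        if u <= 0:                       # no useful feature left
            break
        R.append((name, c))              # unit (h_k, c_k)
        S = [(x, y) for x, y in S if not h(x)]   # S' = S - Q_k
    return R

S = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0), ((0, 0), 0)]
features = {"f0": lambda x: x[0] == 1, "f1": lambda x: x[1] == 1}
R = learn_decision_list(S, features)
```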
Decision list Issues
• What is the terminating condition?
1. Size of R reaches an upper threshold: Complexity (length of the list)
2. S' = null
3. S' contains examples of the same class only: Accuracy (purity)
• Pruning? hi is not required if:
1. Qi = null
2. ci = c(r+1) (the default class) and there is no hj (j > i) such that Qi ∩ Qj ≠ ∅
• What is the Accuracy / Complexity tradeoff?
Probabilistic classifiers
Probabilistic classifiers : NB
• Based on Bayes' rule
• Naïve Bayes: assumes conditional independence of the attributes given the class
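A minimal Naïve Bayes sketch (toy data; add-one smoothing is an assumption, as the slides do not specify one):

```python
import math
from collections import Counter, defaultdict

def train_nb(data):
    """data: list of (attribute-tuple, label)."""
    priors = Counter(label for _, label in data)
    cond = defaultdict(Counter)      # (attr_index, label) -> value counts
    for attrs, label in data:
        for i, v in enumerate(attrs):
            cond[(i, label)][v] += 1
    return priors, cond, len(data)

def predict_nb(model, attrs):
    priors, cond, n = model
    best_label, best_score = None, float("-inf")
    for label, prior in priors.items():
        # log P(label) + sum_i log P(attr_i | label):
        # the naive conditional-independence assumption
        score = math.log(prior / n)
        for i, v in enumerate(attrs):
            c = cond[(i, label)]
            score += math.log((c[v] + 1) / (sum(c.values()) + len(c) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

data = [(("young", "single"), "no"), (("young", "married"), "no"),
        (("old", "married"), "yes"), (("old", "single"), "yes")]
model = train_nb(data)
pred = predict_nb(model, ("old", "married"))
```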
Naïve Bayes Issues
(Figure: a network over the attributes Age, Marital status, Family status)
• Learning the structure of the network:
1. Construct a complete network
2. Prune using heuristics:
– Remove edges with weights nearly zero
– Remove edges if the removal does not affect accuracy
• Choosing the learning factor: a small learning factor means multiple iterations are required; a large learning factor means the learner may skip the global minimum. But why?
• Addition of momentum
• What are the types of learning approaches?
– Deterministic: update weights after summing up errors over all examples
– Stochastic: update weights per example
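The deterministic (batch) vs. stochastic update rules can be sketched on a simple linear unit with squared error (the toy data and learning factors are illustrative):

```python
def batch_epoch(w, data, lr):
    """Deterministic: sum gradient contributions over all examples,
    then update the weights once."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        grad = [g + err * xi for g, xi in zip(grad, x)]
    return [wi - lr * g for wi, g in zip(w, grad)]

def stochastic_epoch(w, data, lr):
    """Stochastic: update the weights after every single example."""
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Fit y = 2*x on toy data; a small lr needs many iterations,
# a large lr can overshoot the minimum.
data = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0)]
w = [0.0]
for _ in range(200):
    w = stochastic_epoch(w, data, lr=0.05)
```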
Support vector machines

Support vector machines
• Basic idea: the "maximum-margin classifier" (figure: margin between the +1 and −1 classes, support vectors on the margin, separating hyperplane w·x + b = 0)
SVM training
• Problem formulation: minimize (1/2)||w||² subject to yi(w·xi + b) − 1 ≥ 0 for all i
• In the Lagrangian dual, the multipliers are zero for data instances other than the support vectors
• The dual depends on the data only through the dot product of xk and xl; focusing on this dot product leads to kernel functions
• Test instance → SVM → class label
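Prediction depending on the data only through dot products can be sketched as f(x) = sign(Σ αi yi K(xi, x) + b); the multipliers and bias below are hand-picked toy values, not the output of a real solver:

```python
def linear_kernel(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def svm_predict(support_vectors, alphas, labels, b, x, kernel=linear_kernel):
    """f(x) = sign( sum_i alpha_i * y_i * K(x_i, x) + b );
    only support vectors (nonzero multipliers) contribute."""
    s = sum(a * y * kernel(sv, x)
            for sv, a, y in zip(support_vectors, alphas, labels)) + b
    return 1 if s >= 0 else -1

# Toy hyperplane x1 + x2 - 1 = 0: support vectors (1, 1) with label +1
# and (0, 0) with label -1, alpha = 1 each, giving w = (1, 1), b = -1.
svs, alphas, ys, b = [(1.0, 1.0), (0.0, 0.0)], [1.0, 1.0], [1, -1], -1.0
pred = svm_predict(svs, alphas, ys, b, (2.0, 0.0))
```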
SVM Issues

Ensembles (figure): sample training datasets D1, …, Dn from the total set D; learn classifier models M1, …, Mn; evaluate on the test set.

Boosting (figure):
• Initialize the weights of the instances to 1/d
• Selection of instances is based on weight; may use bootstrap sampling with replacement
• After each round, the weights of correctly classified instances are multiplied by error / (1 − error)
• What if error > 0.5?
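The boosting weight update can be sketched as follows (a sketch of the reweighting step only; the round's classifier is assumed given, and the toy numbers are illustrative):

```python
def update_weights(weights, correct, error):
    """Boosting-style reweighting: multiply the weights of correctly
    classified instances by error / (1 - error), then renormalize so
    misclassified instances gain relative weight."""
    assert 0 < error < 0.5, "a round with error > 0.5 is discarded"
    new = [w * (error / (1 - error)) if ok else w
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new]

d = 4
weights = [1.0 / d] * d                 # initialize weights to 1/d
correct = [True, True, True, False]     # one misclassified instance
weights = update_weights(weights, correct, error=0.25)
```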
The last slice
Data preprocessing
• Attribute subset selection
– Select a subset of total attributes to reduce
complexity
• Dimensionality reduction
– Transform instances into ‘smaller’ instances
Attribute subset selection
• Information gain measure for attribute
selection in decision trees
An instance x in p dimensions is transformed to s = Wx in k dimensions, where W is a k × p transformation matrix and k < p.
Principal Component Analysis
(Figure: the eigenvector matrix (p × p) multiplies the data (p × n); keeping the first k eigenvectors, the k PCs, gives (k × n) = (k × p)(p × n). Diagram from Han-Kamber.)
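The PCA pipeline in the diagram can be sketched with numpy (toy data; taking eigenvectors of the covariance matrix via `eigh` is one standard route to the PCs):

```python
import numpy as np

# Toy data in the slide's (p x n) layout: p = 2 attributes, n = 6 instances.
X = np.array([[2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
              [0.5, 1.1, 1.9, 2.9, 4.1, 5.2]])   # shape (p, n)

Xc = X - X.mean(axis=1, keepdims=True)   # center each attribute
C = Xc @ Xc.T / (X.shape[1] - 1)         # p x p covariance matrix
vals, vecs = np.linalg.eigh(C)           # eigen-decomposition (ascending)
order = np.argsort(vals)[::-1]           # sort by decreasing variance
W = vecs[:, order].T                     # rows are the principal components
k = 1
S = W[:k] @ Xc                           # (k x n) = (k x p)(p x n)
```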
Weka & Weka Demo
Weka & Weka Demo
• Collection of ML algorithms
• Get it from :
http://www.cs.waikato.ac.nz/ml/weka/
• ARFF Format
• ‘Weka Explorer’
ARFF file format
@RELATION nursery        % name of the relation
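A minimal illustrative ARFF file (hypothetical attributes, not the actual nursery data):

```
% Illustrative ARFF file
@RELATION nursery

@ATTRIBUTE age      NUMERIC
@ATTRIBUTE status   {single, married}
@ATTRIBUTE class    {yes, no}

@DATA
25, single,  no
40, married, yes
```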
Weka interfaces (figure):
• Explorer: basic interface to run ML algorithms
• Experimenter: comparing experiments on different algorithms
• Knowledge Flow: similar to a work flow, 'customized' to one's needs
Weka demo
Key References
• Data Mining: Concepts and Techniques; Han and Kamber; Morgan Kaufmann Publishers, 2006.