Condensed version of sections from the 2007 edition of this tutorial
Bag of Words Models
Object Bag of ‘words’
Bag of Words
• Independent features
• Histogram representation
1. Feature detection and representation
• Detect patches
  [Mikolajczyk and Schmid ’02]
  [Matas, Chum, Urban & Pajdla ’02]
  [Sivic & Zisserman ’03]
  …
• Normalize patch
• Compute descriptor, e.g. SIFT [Lowe ’99]
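The detect → normalize → describe pipeline above can be sketched in plain NumPy. This is a toy illustration (dense patch sampling instead of a real interest-point detector, and a single-cell orientation histogram instead of SIFT's 4x4 grid); all function names are invented for the sketch.

```python
import numpy as np

def detect_patches(image, patch_size=16, stride=16):
    """Toy detector: dense sampling of square patches (stand-in for a
    real detector such as DoG, Harris-affine, or MSER)."""
    patches = []
    h, w = image.shape
    for r in range(0, h - patch_size + 1, stride):
        for c in range(0, w - patch_size + 1, stride):
            patches.append(image[r:r + patch_size, c:c + patch_size])
    return patches

def normalize_patch(patch):
    """Zero mean, unit variance: crude illumination normalization."""
    p = patch.astype(float)
    return (p - p.mean()) / (p.std() + 1e-8)

def describe_patch(patch, n_bins=8):
    """SIFT-like descriptor: histogram of gradient orientations,
    weighted by gradient magnitude (one cell only, for brevity)."""
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    hist, _ = np.histogram(ang, bins=n_bins,
                           range=(0, 2 * np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-8)

# synthetic image, one descriptor row per detected patch
image = np.random.default_rng(0).random((64, 64))
descriptors = np.array([describe_patch(normalize_patch(p))
                        for p in detect_patches(image)])
```

Each row of `descriptors` is an L2-normalized orientation histogram; these rows are what gets quantized in the dictionary-formation step.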
2. Codewords dictionary formation
• Vector quantization of the descriptors (e.g. k-means): the cluster centers become the codewords
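Dictionary formation plus the histogram step can be sketched as plain k-means vector quantization in NumPy. This is a minimal sketch with synthetic 2-D "descriptors", not a production codebook builder.

```python
import numpy as np

def build_codebook(descriptors, k=3, n_iter=20, seed=0):
    """Vector quantization with plain k-means: centers = codewords."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(n_iter):
        # assign each descriptor to its nearest codeword
        d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # move each codeword to the mean of its assigned descriptors
        for j in range(k):
            if np.any(labels == j):
                centers[j] = descriptors[labels == j].mean(axis=0)
    return centers

def bow_histogram(descriptors, codebook):
    """Quantize descriptors against the codebook and count occurrences:
    this is the bag-of-words histogram representation."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    words = d.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))

# toy descriptors: three well-separated 2-D clusters, 50 points each
rng = np.random.default_rng(1)
descs = np.vstack([rng.normal(m, 0.1, size=(50, 2)) for m in (0.0, 1.0, 2.0)])
codebook = build_codebook(descs, k=3)
hist = bow_histogram(descs, codebook)
```

`hist` counts how often each codeword occurs, so its entries always sum to the number of descriptors.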
Uses of BoW representation
• Hierarchical models
– Decompose scene/object
BoW as input to classifier
• SVM for object classification
– Csurka, Bray, Dance & Fan, 2004
• Naïve Bayes
– See 2007 edition of this course
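As an illustration of the Naïve Bayes option, here is a minimal multinomial Naïve Bayes over BoW count histograms. This is not the code from the 2007 edition; the toy counts and vocabulary are invented.

```python
import numpy as np

def train_nb(histograms, labels, alpha=1.0):
    """Multinomial Naive Bayes over BoW counts, Laplace-smoothed."""
    classes = np.unique(labels)
    log_prior = np.log(np.array([(labels == c).mean() for c in classes]))
    log_like = []
    for c in classes:
        counts = histograms[labels == c].sum(axis=0) + alpha
        log_like.append(np.log(counts / counts.sum()))
    return classes, log_prior, np.array(log_like)

def predict_nb(hist, classes, log_prior, log_like):
    """Score each class by log P(c) + sum_w hist[w] * log P(w|c)."""
    scores = log_prior + log_like @ hist
    return classes[int(np.argmax(scores))]

# toy BoW histograms over a 3-codeword vocabulary (invented data):
# class 0 images mostly use word 0, class 1 images mostly word 1
X = np.array([[9, 1, 0], [8, 2, 0], [1, 9, 1], [0, 8, 2]])
y = np.array([0, 0, 1, 1])
classes, log_prior, log_like = train_nb(X, y)
pred = predict_nb(np.array([10, 0, 0]), classes, log_prior, log_like)
```

An SVM (as in Csurka et al. 2004) would replace this generative scoring with a discriminative margin on the same histogram vectors.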
Clustering BoW vectors
• Use models from text document literature
– Probabilistic latent semantic analysis (pLSA)
– Latent Dirichlet allocation (LDA)
– See 2007 edition for explanation/code
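pLSA can be fit by EM on the document-word count matrix (here, image-codeword counts). A compact NumPy sketch, assuming small dense counts and ignoring numerical niceties; this is not Hofmann's original implementation.

```python
import numpy as np

def plsa(counts, n_topics=2, n_iter=50, seed=0):
    """pLSA via EM. counts: (D, W) word counts per document.
    Returns P(w|z) with shape (Z, W) and P(z|d) with shape (D, Z)."""
    rng = np.random.default_rng(seed)
    D, W = counts.shape
    pw_z = rng.random((n_topics, W)); pw_z /= pw_z.sum(axis=1, keepdims=True)
    pz_d = rng.random((D, n_topics)); pz_d /= pz_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (D, W, Z)
        joint = pz_d[:, None, :] * pw_z.T[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + 1e-12)
        # M-step: re-estimate from expected counts
        weighted = counts[:, :, None] * post           # (D, W, Z)
        pw_z = weighted.sum(axis=0).T                  # (Z, W)
        pw_z /= pw_z.sum(axis=1, keepdims=True) + 1e-12
        pz_d = weighted.sum(axis=1)                    # (D, Z)
        pz_d /= pz_d.sum(axis=1, keepdims=True) + 1e-12
    return pw_z, pz_d

# toy corpus: docs 0-1 use words {0,2}, docs 2-3 use words {1,3}
counts = np.array([[10, 0, 10, 0],
                   [8, 1, 9, 0],
                   [0, 10, 0, 10],
                   [1, 9, 0, 8]])
pw_z, pz_d = plsa(counts, n_topics=2)
```

LDA adds Dirichlet priors over these multinomials; the EM skeleton above is the shared core idea.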
Adding spatial info. to BoW
• Feature level
– Spatial influence through correlogram features:
Savarese, Winn and Criminisi, CVPR 2006
• Generative models
  – Sudderth, Torralba, Freeman & Willsky, 2005, 2006
  – Hierarchical model of scene/objects/parts
  – Niebles & Fei-Fei, CVPR 2007
    [Figure: parts P1–P4, image, background]
• Discriminative methods
  – Lazebnik, Schmid & Ponce, 2006
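The discriminative approach of Lazebnik et al. can be sketched as a spatial pyramid: concatenate per-cell BoW histograms over increasingly fine grids, so word counts keep coarse location information. A simplified sketch (the level weighting of the original pyramid-match kernel is omitted).

```python
import numpy as np

def spatial_pyramid(positions, words, n_words, levels=2):
    """Spatial-pyramid BoW (after Lazebnik et al. 2006, simplified):
    concatenate word histograms over a 1x1, 2x2, 4x4, ... grid.
    positions: (N, 2) coords in [0, 1); words: (N,) codeword indices."""
    hists = []
    for level in range(levels + 1):
        g = 2 ** level                                # g x g grid
        cell = np.minimum((positions * g).astype(int), g - 1)
        for i in range(g):
            for j in range(g):
                in_cell = (cell[:, 0] == i) & (cell[:, 1] == j)
                hists.append(np.bincount(words[in_cell],
                                         minlength=n_words))
    return np.concatenate(hists)

# toy usage: 100 random feature positions over a 5-codeword vocabulary
rng = np.random.default_rng(0)
h = spatial_pyramid(rng.random((100, 2)), rng.integers(0, 5, 100), n_words=5)
```

With 3 levels the descriptor has 1 + 4 + 16 = 21 cells, so 21 × n_words bins, and every feature is counted once per level.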
Part-based Models
Problem with bag-of-words: the spatial layout of the features is discarded
• Model:
– Relative locations between parts
– Appearance of part
• Issues:
– How to model location
– How to represent appearance
– How to handle occlusion/clutter
Figure from [Fischler & Elschlager 73]
History of Parts and Structure
approaches
• Fischler & Elschlager 1973
• Yuille ‘91
• Brunelli & Poggio ‘93
• Lades, v.d. Malsburg et al. ‘93
• Cootes, Lanitis, Taylor et al. ‘95
• Amit & Geman ‘95, ‘99
• Perona et al. ‘95, ‘96, ’98, ’00, ’03, ‘04, ‘05
• Felzenszwalb & Huttenlocher ’00, ’04
• Crandall & Huttenlocher ’05, ’06
• Leibe & Schiele ’03, ’04
• N^P combinations!!!
Different connectivity structures
• Fully connected: Fergus et al. ’03; Fei-Fei et al. ’03
• Star: Crandall et al. ’05; Fergus et al. ’05 (O(N^2) inference)
• Tree: Crandall et al. ’05; Felzenszwalb & Huttenlocher ’00
• Bag of features: Csurka ’04; Vasconcelos ’00
• Hierarchy: Bouchard & Triggs ’05
• Sparse flexible model: Carneiro & Lowe ’06
• PCA
• 2-scale model
  – Whole object
  – Parts
• HOG representation + SVM training to obtain robust part detectors
• Distance transforms allow examination of every location in the image
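A naive 1-D generalized distance transform shows why every image location can be scored: the best part placement under a quadratic spring cost is computed for all positions at once. This sketch is O(N^2) for clarity; Felzenszwalb & Huttenlocher give a linear-time algorithm.

```python
import numpy as np

def distance_transform_1d(cost, weight=1.0):
    """Generalized distance transform:
    D(p) = min over q of [ cost(q) + weight * (p - q)^2 ].
    cost(q) is the (negated) part-detector score at location q;
    the quadratic term is the spring penalty for displacing the part."""
    n = len(cost)
    q = np.arange(n)
    out = np.empty(n)
    for p in range(n):
        out[p] = np.min(cost + weight * (p - q) ** 2)
    return out

# toy usage: the part detector fires strongly only at location 0
costs = np.array([0.0, 10.0, 10.0, 10.0])
d = distance_transform_1d(costs)
```

In 2-D the same transform is applied separably along rows then columns, which is what makes dense part-based matching tractable.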
Hierarchical Representations
• Pixels → Pixel groupings → Parts → Object
  – Pixels: e.g. discontinuities, gradient
  – Pixel groupings: e.g. linelets, curvelets, T-junctions
  – Parts: e.g. contours, intermediate objects
• Multi-scale approach increases the number of low-level features
Parts model
The architecture
Learned parts
Parts and Structure models
Summary
• Discriminative model (the lousy painter)
  [Plot: p(y = 1 | x) over x = data]
• Classification function
  [Plot: f(x) ∈ {−1, +1} over x = data]
Formulation
• Formulation: binary classification
  – Features: x = x1, x2, x3, …, xN, xN+1, xN+2, …, xN+M
  – Labels: y = −1, +1, −1, −1, ?, ?, ? (the first N examples are labeled; the remaining M are to be classified)
• Classification function F(x), where F belongs to some family of functions
Haar wavelets
Papageorgiou & Poggio (2000)
Features: Edges and chamfer distance
10^6 examples
HOG – Histogram of Oriented Gradients
• Learn weighting of the descriptor with a linear SVM
[Figure: image; HOG descriptor; HOG descriptor weighted by +ve and −ve SVM weights]
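A stripped-down HOG computation shows the structure of the descriptor: per-cell histograms of unsigned gradient orientation, weighted by gradient magnitude. Real HOG adds block normalization and interpolation, and the SVM weighting step is omitted here.

```python
import numpy as np

def hog_lite(image, cell=8, n_bins=9):
    """Minimal HOG-style descriptor: one orientation histogram per
    cell x cell block, magnitude-weighted, unsigned orientation."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned, [0, 180)
    h, w = image.shape
    rows, cols = h // cell, w // cell
    desc = np.zeros((rows, cols, n_bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell),
                  slice(c * cell, (c + 1) * cell))
            hist, _ = np.histogram(ang[sl], bins=n_bins,
                                   range=(0, 180), weights=mag[sl])
            desc[r, c] = hist
    return desc.ravel()

# toy usage: 32x32 image, 8x8 cells -> 4*4 cells * 9 bins = 144 dims
desc = hog_lite(np.random.default_rng(0).random((32, 32)))
```

A linear SVM trained on these vectors then assigns a positive or negative weight to every bin, which is what the weighted-descriptor visualization shows.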
Classifier: Boosting
• Viola & Jones 2001
• Haar features via Integral Image
• Cascade
• Real-time performance
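The integral-image trick behind the Haar features can be sketched as follows: once the summed-area table is built, any rectangle sum, and hence any Haar feature, costs a handful of lookups. A minimal sketch; the function names are illustrative.

```python
import numpy as np

def integral_image(image):
    """Summed-area table with a zero first row/column, so any
    rectangle sum needs only four lookups."""
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1))
    ii[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    """Sum of the h x w rectangle with top-left corner (r, c)."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def haar_two_rect_vertical(ii, r, c, h, w):
    """Two-rectangle Haar feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, r, c, h, half) - rect_sum(ii, r, c + half, h, half)

# toy usage: a flat image gives zero response, a left-bright image
# gives a positive one
ii = integral_image(np.ones((4, 6)))
left_bright = np.zeros((4, 6)); left_bright[:, :3] = 1.0
ii2 = integral_image(left_bright)
```

Constant-time feature evaluation is what lets the boosted cascade reject most windows after a few cheap tests, giving real-time performance.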