
CONVOLUTIONAL

NEURAL NETWORKS
RIDDHIMAN DASGUPTA & AYUSHI DALMIA
CSE577 TUTORIAL, IIIT HYDERABAD, MONSOON 2015

DEEP LEARNING: OVERVIEW

Feature visualization of convnet trained on ImageNet by Zeiler and Fergus, 2013.


Each layer of features is learned, and the layers form a hierarchy.

DEEP LEARNING: OVERVIEW


Each level transforms its input into a higher-level representation
Deep means more than one stage of non-linear feature transformation

High-level features are global and invariant

Low-level features are shared among categories

DEEP LEARNING: COMPARISON


Neural networks with one hidden layer are not deep
There is no feature hierarchy, just one non-linear feature transformation

SVMs are not deep
A fixed kernel followed by a linear classifier gives no feature hierarchy

Classification trees are not deep
All decisions are made in the input space; the depth of a tree does not have the same meaning

Graphical models are orthogonal to deep learning
The factors in a graphical model can come from deep networks

CONVNETS: HISTORY

Cognitron/Neocognitron
Fukushima, 1971

Hubel-Wiesel Architecture
Hubel & Wiesel, 1962
Simple cells for local features
Complex cells for pooling

Multistage Hubel-Wiesel Architecture
Yann LeCun, 1988

CONVNETS: OVERVIEW
Feed-forward pass:
Input image
Convolution (filtering, with learned kernels)
Non-linearity (activation)
Pooling (dimension reduction)
Feature maps

Back-propagation pass:
Supervised classification error
Update weights of the convolutional filters

CONVNETS: FULLY CONNECTED LAYERS

For a 200*200 image with 40,000 hidden nodes, 1.6 billion parameters are needed
Spatial correlation is locally concentrated
Fully connecting every pixel to every hidden node wastes resources

CONVNETS: LOCALLY CONNECTED LAYERS

For a 200*200 image and a filter of size 10*10, with 40,000 hidden nodes, 4 million parameters are needed (each hidden node connects to a single 10*10 patch)
Much more computationally efficient
Exploits locality and correlation in the image
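A quick sanity check on the parameter counts of the last two slides (plain Python; the layer sizes are the slides' examples, biases are ignored as on the slides):

```python
# Parameter counts for a 200x200 input image (40,000 pixels).
n_inputs = 200 * 200

# Fully connected: every pixel connects to every hidden node.
n_hidden = 40000
fully_connected = n_inputs * n_hidden        # 1,600,000,000 (~1.6 billion)

# Locally connected: each hidden node sees only a 10x10 patch,
# but every node still keeps its own private weights.
filter_size = 10 * 10
locally_connected = n_hidden * filter_size   # 4,000,000 (4 million)

# Convolutional: patches share weights, so only the filters are stored.
n_filters = 100
convolutional = n_filters * filter_size      # 10,000

print(fully_connected, locally_connected, convolutional)
```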

CONVNETS: CONVOLUTIONAL LAYERS

Share the same parameters across different locations
Similar to a sliding window
Each filter is akin to a convolutional kernel
The convolutional kernels need to be learned

CONVNETS: CONVOLUTIONAL LAYERS

Multiple filters can be learned
For a 200*200 image and a filter of size 10*10, with 100 filters, 10,000 parameters are needed
In general, for an N*N image, a K*K filter, and a stride of 1, each hidden-layer feature map will be of size (N - K + 1) * (N - K + 1); see the sketch below
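A minimal sketch of a "valid" 2D convolution illustrating the output-size formula above (NumPy; the function and variable names are mine, not from the slides):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid 2D convolution: slide the K x K kernel over the N x N image
    with stride 1, giving an (N - K + 1) x (N - K + 1) feature map.
    (Strictly this is cross-correlation, as in most deep learning code.)"""
    n, k = image.shape[0], kernel.shape[0]
    out = np.zeros((n - k + 1, n - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

image = np.random.rand(200, 200)   # the slides' 200*200 example
kernel = np.random.rand(10, 10)    # one 10*10 filter
print(conv2d_valid(image, kernel).shape)   # (191, 191)
```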

CONVNETS: CONVOLUTIONAL LAYERS

CONVNETS: NON LINEARITY


Tanh
f(x) = tanh(x) = (e^x - e^-x) / (e^x + e^-x)

Sigmoid
f(x) = 1 / (1 + e^-x)

ReLU
f(x) = max(0, x)
Preferred choice
Fast to compute
Simplifies back-propagation
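The three activations written out in NumPy (a small illustrative sketch, not from the slides):

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Element-wise max(0, x): cheap to compute, and its gradient is
    # simply 0 or 1, which keeps back-propagation simple.
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tanh(x), sigmoid(x), relu(x), sep="\n")
```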

CONVNETS: POOLING LAYERS

Pooling implies aggregation
Robustness to the exact spatial location of features
Additional inter-feature competition

CONVNETS: POOLING LAYERS

Max pooling
Average pooling
L2 pooling

For p*p pooling (non-overlapping, stride p) applied to an N*N input map, i.e. the hidden-layer size, the output map will be N/p * N/p units in size; see the sketch below

Computational cost is negligible compared to convolution
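A minimal non-overlapping max-pooling sketch matching the size formula above (NumPy; names are mine):

```python
import numpy as np

def max_pool2d(feature_map, p):
    """Non-overlapping p x p max pooling with stride p:
    an N x N input map becomes an (N // p) x (N // p) output map."""
    n = feature_map.shape[0]
    out = np.zeros((n // p, n // p))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = feature_map[i * p:(i + 1) * p, j * p:(j + 1) * p].max()
    return out

fmap = np.random.rand(8, 8)
print(max_pool2d(fmap, 2).shape)   # (4, 4)
```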

CONVNETS: NORMALISATION LAYERS

Local contrast normalisation
Over a local neighbourhood, subtract the mean (zero mean) and divide by the standard deviation (unit variance)
Additional inter-feature competition
Increases sparsity & improves invariance at very low cost

(Figure: an image patch before and after local contrast normalisation.)
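A rough sketch of local contrast normalisation over a square neighbourhood, assuming a simple box window for the local statistics (SciPy's uniform_filter; the window choice and names are mine):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_normalise(image, size=9, eps=1e-8):
    """Subtract the local mean and divide by the local standard deviation,
    both computed over a size x size neighbourhood around each pixel."""
    local_mean = uniform_filter(image, size)
    centred = image - local_mean
    local_var = uniform_filter(centred ** 2, size)
    return centred / np.sqrt(local_var + eps)

image = np.random.rand(200, 200)
print(local_contrast_normalise(image).shape)   # (200, 200)
```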

CONVNETS: PUTTING LAYERS TOGETHER


(Diagram: input pixels -> convolution -> non-linearity -> normalisation -> pooling -> features)

Stack multiple stages of this architecture one after another to form a convnet
The final layers are usually fully connected layers, acting as the hidden layers of an ordinary neural network
The final layer is a classifier, usually a softmax classifier, with a loss function suited to the task
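A compact sketch of one such stack written as a plain forward pass (NumPy; all function names are mine and the tiny layer sizes are illustrative only):

```python
import numpy as np

def conv(x, k):          # valid convolution, stride 1
    n, s = x.shape[0], k.shape[0]
    return np.array([[np.sum(x[i:i + s, j:j + s] * k)
                      for j in range(n - s + 1)] for i in range(n - s + 1)])

def relu(x):
    return np.maximum(0.0, x)

def pool(x, p):          # non-overlapping max pooling
    n = x.shape[0] // p
    return np.array([[x[i * p:(i + 1) * p, j * p:(j + 1) * p].max()
                      for j in range(n)] for i in range(n)])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# One stage: convolution -> non-linearity -> pooling, followed by a
# fully connected layer and a softmax classifier on top.
image = np.random.rand(28, 28)
kernel = np.random.rand(5, 5)
fc_weights = np.random.rand(10, 12 * 12)     # (28 - 5 + 1) / 2 = 12

features = pool(relu(conv(image, kernel)), 2)
probabilities = softmax(fc_weights @ features.ravel())
print(probabilities.shape)                   # (10,)
```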

CONVNETS: PUTTING LAYERS TOGETHER


(Diagram: input pixels -> convolution -> non-linearity -> normalisation -> pooling -> features)

A filter bank followed by a non-linearity gives a non-linear embedding in a high-dimensional space
Pooling is basically contraction and dimensionality reduction
Normalisation is simply smoothing
Equivalent to the simple-cell + complex-cell model of vision

CONVNETS: PUTTING LAYERS TOGETHER

Top: a single stage, zoomed in: convolution -> non-linearity -> normalisation -> pooling
Bottom: the usual convnet pipeline: input image -> stage 1 -> stage 2 -> stage 3 -> fully connected layers -> classifier

CONVNETS: TRICKS OF THE TRADE


Hyper-parameter selection is difficult
Grid search is highly inefficient due to the large number of hyper-parameters

Use stochastic gradient descent with mini-batches

Use momentum (see the update-rule sketch below):
v(t+1) = mu * v(t) - eta * dL/dw(t),  w(t+1) = w(t) + v(t+1)

Use weight decay:
w(t+1) = w(t) - eta * (dL/dw(t) + lambda * w(t))

Use weight sharing to reduce the number of parameters

Use unsupervised pre-training if data is limited
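A minimal sketch of one SGD step with momentum and weight decay, matching the update rules above (NumPy; the hyper-parameter values are illustrative, not from the slides):

```python
import numpy as np

def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9, weight_decay=1e-4):
    """One SGD update with momentum and (L2) weight decay.
    grad is the gradient of the loss with respect to the weights w."""
    grad = grad + weight_decay * w          # weight decay term
    v = momentum * v - lr * grad            # velocity update
    w = w + v                               # parameter update
    return w, v

w = np.random.rand(100)        # some weights
v = np.zeros_like(w)           # initial velocity
grad = np.random.rand(100)     # a (fake) mini-batch gradient
w, v = sgd_momentum_step(w, v, grad)
```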

CONVNETS: AUGMENTATION

Crops
Corners & centres
Reflections
Horizontal & vertical
Random perturbations
Translation
Rotation
Jitter / Noise
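A small sketch of two of these augmentations, random crops and horizontal flips (NumPy; the crop size and names are mine):

```python
import numpy as np

def random_crop(image, size):
    """Take a random size x size crop from a square image."""
    n = image.shape[0]
    i, j = np.random.randint(0, n - size + 1, 2)
    return image[i:i + size, j:j + size]

def random_horizontal_flip(image, p=0.5):
    """Mirror the image left-right with probability p."""
    return image[:, ::-1] if np.random.rand() < p else image

image = np.random.rand(256, 256)
augmented = random_horizontal_flip(random_crop(image, 224))
print(augmented.shape)   # (224, 224)
```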

CONVNETS: DROPOUT

Randomly omit each hidden unit with some probability, usually 0.5

Equivalent to sampling from an ensemble of 2^n models, where n is the number of hidden units
Fast and efficient regularisation, robust to noisy inputs
At test time, all units are kept and each hidden unit's output is scaled by the keep probability, usually 0.5
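A sketch of dropout at training and test time, under the output-scaling scheme described above (NumPy; names are mine):

```python
import numpy as np

def dropout_train(activations, p_drop=0.5):
    """Training: zero out each hidden unit independently with probability p_drop."""
    mask = (np.random.rand(*activations.shape) >= p_drop)
    return activations * mask

def dropout_test(activations, p_drop=0.5):
    """Test: keep every unit but scale its output by the keep probability,
    so the expected activation matches training."""
    return activations * (1.0 - p_drop)

h = np.random.rand(8)
print(dropout_train(h))
print(dropout_test(h))
```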

CONVNETS: SLIDING WINDOW


Traditionally, applying a sliding-window detector to an entire image is very expensive
Convolutional nets can easily be replicated over large images very cheaply
Simply apply the convolutions to the entire image and spatially replicate the fully connected layers

CONVNETS: FULLY CONVOLUTIONAL

Transform fixed-input-size models into any-size models by converting all inner products (fully connected layers) into 1*1 convolutions; see the sketch below
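A sketch of the equivalence being used here: an inner product over C input channels at one spatial location is the same computation as a 1x1 convolution, so the fully connected weights can be reused as 1x1 kernels and slid over an input of any size (NumPy; the shapes and names are mine):

```python
import numpy as np

c_in, c_out = 64, 10
fc_weights = np.random.rand(c_out, c_in)   # weights of an inner-product layer

# Feature maps of arbitrary spatial size (here 13 x 13) with c_in channels.
feature_maps = np.random.rand(c_in, 13, 13)

# "1x1 convolution": apply the same c_out x c_in weight matrix at every
# spatial position; einsum does this for all positions at once.
output_maps = np.einsum('oc,chw->ohw', fc_weights, feature_maps)
print(output_maps.shape)   # (10, 13, 13) -- one score map per class

# At a single position this reduces to the ordinary inner product.
assert np.allclose(output_maps[:, 0, 0], fc_weights @ feature_maps[:, 0, 0])
```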

CONVNETS: GENERIC FEATURES

REFERENCES
Slides:
Yann LeCun, CVPR 2014 Workshop on Deep Learning in Vision
Marc Ranzato, CVPR 2014 Workshop on Deep Learning in Vision
Rob Fergus, NIPS 2013

Papers:
LeCun, Bottou, Bengio and Haffner: Gradient-Based Learning Applied to Document Recognition
Krizhevsky, Sutskever and Hinton: ImageNet Classification with Deep Convolutional Neural Networks
Girshick, Donahue, Darrell and Malik: Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
Razavian, Azizpour, Sullivan and Carlsson: CNN Features Off-the-Shelf: An Astounding Baseline for Recognition

THANK YOU
