
CONVOLUTIONAL NEURAL NETWORKS
Dr Omar Arif
omar.arif@seecs.edu.pk
OUTLINE
Additional Reading: http://cs231n.github.io/convolutional-networks/

Visual Recognition
Image Representation
Challenges

Convolutional Neural Networks


Image Filtering
CNN Layer
Pooling Layer
ReLU Layer
Fully Connected Layer
Famous CNN Architectures
VISUAL OBJECT RECOGNITION

REPRESENTING IMAGES AS MATRICES

IMAGE SENSING:
A CONTINUOUS IMAGE PROJECTED ONTO A SENSOR ARRAY

REPRESENTING AN IMAGE AS A MATRIX
COMPUTER VISION – MAKE SENSE OF NUMBERS

255 255 240 ... 255
255 248 232 ... 255
252 247 238 ... 239
        ...
255 255 255 ... 255
7
VISUAL RECOGNITION
Design algorithms that are capable of
 Classifying images or videos
 Detect and localize image
 Estimate semantic and geometrical attributes
 Classify human activity and events

Why is this challenging?

8
HOW MANY OBJECT CATEGORIES ARE THERE?
CHALLENGES – SHAPE AND APPEARANCE VARIATIONS

CHALLENGES – VIEWPOINT VARIATIONS

CHALLENGES – ILLUMINATION

CHALLENGES – BACKGROUND CLUTTER

CHALLENGES – SCALE

CHALLENGES – OCCLUSION
CHALLENGES DO NOT APPEAR IN ISOLATION!
Task: Detect phones in this image

Appearance variations
Viewpoint variations
Illumination variations
Background clutter
Scale changes
Occlusion
CONVOLUTIONAL NEURAL NETWORK
A CNN (ConvNet) is a feed-forward neural network specially designed for images.

A two-dimensional array of pixels -> CNN -> X or O
FOR EXAMPLE

CNN -> X
CNN -> O

TRICKIER CASES

CNN -> X
CNN -> O
DECIDING IS HARD

?
=
WHAT COMPUTERS SEE

?
=
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 1 -1 1 1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1
-1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
COMPUTERS ARE LITERAL

-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 1 -1 1 1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1
-1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
CONVNETS MATCH PIECES OF THE IMAGE

PIECES OF THE IMAGE ARE CALLED FEATURES
The three 3x3 features of the X (the \ diagonal, the crossing, and the / diagonal):

 1 -1 -1      1 -1  1     -1 -1  1
-1  1 -1     -1  1 -1     -1  1 -1
-1 -1  1      1 -1  1      1 -1 -1
HOW COMPUTERS MATCH FEATURES:
CONVOLUTION (LINEAR FILTERING)

Convolution is a neighborhood operation in which each output pixel is the
weighted sum of neighboring input pixels. The matrix of weights is called the
convolution kernel, also known as the filter.

Kernel:
 1 -1 -1
-1  1 -1
-1 -1  1

Input image (9x9):
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1  1 -1 -1 -1 -1 -1  1 -1
-1 -1  1 -1 -1 -1  1 -1 -1
-1 -1 -1  1 -1  1 -1 -1 -1
-1 -1 -1 -1  1 -1 -1 -1 -1
-1 -1 -1  1 -1  1 -1 -1 -1
-1 -1  1 -1 -1 -1  1 -1 -1
-1  1 -1 -1 -1 -1 -1  1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
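The neighborhood operation above can be sketched in a few lines of NumPy. This is a toy illustration of the X/O example, not code from the lecture; note it implements cross-correlation (no kernel flipping), which is the usual CNN convention, and divides by 9 as the slides do:

```python
import numpy as np

def convolve_valid(image, kernel):
    """Slide the kernel over the image (no padding); each output pixel is the
    average of the element-wise products, as in the X/O example."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel) / kernel.size
    return out

# The 9x9 "X" image and the diagonal feature from the slides.
X = -np.ones((9, 9))
for i in range(1, 8):
    X[i, i] = X[i, 8 - i] = 1
diag = np.array([[ 1, -1, -1],
                 [-1,  1, -1],
                 [-1, -1,  1]], dtype=float)

out = convolve_valid(X, diag)
print(out.shape)   # (7, 7)
print(out[1, 1])   # 1.0 - a perfect match where the feature lies on the X
```

The corner value `out[0, 0]` comes out as 7/9, which the slides round to 0.77.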
CONVOLUTION

Slide the 3x3 kernel over the image; at each position, multiply each kernel
weight by the underlying pixel, sum the nine products, and divide by 9.
A perfect match gives 1.00; a perfect mismatch gives -1.00.

Resulting 7x7 activation map for the diagonal kernel:
 0.77 -0.11  0.11  0.33  0.55 -0.11  0.33
-0.11  1.00 -0.11  0.33 -0.11  0.11 -0.11
 0.11 -0.11  1.00 -0.33  0.11 -0.11  0.55
 0.33  0.33 -0.33  0.55 -0.33  0.33  0.33
 0.55 -0.11  0.11 -0.33  1.00 -0.11  0.11
-0.11  0.11 -0.11  0.33 -0.11  1.00 -0.11
 0.33 -0.11  0.55  0.33  0.11 -0.11  0.77
LINEAR FILTERS: EXAMPLES

1/9 *
1 1 1
1 1 1
1 1 1

Original -> Blur (with a mean filter)

Source: D. Lowe
PRACTICE WITH LINEAR FILTERS

0 0 0
0 1 0
0 0 0   ?

Original -> Filtered (no change)

Source: D. Lowe
PRACTICE WITH LINEAR FILTERS

0 0 0
0 0 1
0 0 0   ?

Original -> Shifted left by 1 pixel

Source: D. Lowe
Image from http://www.texasexplorer.com/austincap2.jpg
Showing magnitude of responses

Kristen Grauman
Fully Connected Layer
Example: 200x200 image, 40K hidden units -> ~2B parameters!!!

- Spatial correlation is local
- Waste of resources, and we do not have enough training samples anyway.

Ranzato
Locally Connected Layer

Example: 200x200 image
40K hidden units
Filter size: 10x10 -> 4M parameters

Note: this parameterization is good when the input image is registered
(e.g., face recognition).

Ranzato
Locally Connected Layer
Stationarity? Statistics are similar at different locations.

Example: 200x200 image
40K hidden units
Filter size: 10x10 -> 4M parameters

Ranzato
Convolutional Layer

Share the same parameters across different locations (assuming the input is
stationary): convolutions with learned kernels.

Ranzato
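The parameter counts quoted on the three slides above can be checked directly (a back-of-the-envelope sketch; bias terms are ignored):

```python
# Parameter counts for a 200x200 grayscale image with 40K hidden units,
# mirroring the fully connected / locally connected / convolutional slides.
inputs = 200 * 200          # 40,000 input pixels
hidden = 40_000             # 40K hidden units

fully_connected = inputs * hidden          # every unit sees every pixel
locally_connected = hidden * (10 * 10)     # every unit sees a 10x10 patch
convolutional = 10 * 10                    # one 10x10 kernel shared everywhere

print(f"{fully_connected:,}")    # 1,600,000,000  (~2B on the slide)
print(f"{locally_connected:,}")  # 4,000,000      (4M)
print(f"{convolutional:,}")      # 100 per filter
```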
CONVOLUTION

Border Handling:
Zero-Padding
CONVOLUTION

Convolving the 9x9 input with each of the three 3x3 features produces one
7x7 activation map per feature:

Feature 1 (\ diagonal):
 0.77 -0.11  0.11  0.33  0.55 -0.11  0.33
-0.11  1.00 -0.11  0.33 -0.11  0.11 -0.11
 0.11 -0.11  1.00 -0.33  0.11 -0.11  0.55
 0.33  0.33 -0.33  0.55 -0.33  0.33  0.33
 0.55 -0.11  0.11 -0.33  1.00 -0.11  0.11
-0.11  0.11 -0.11  0.33 -0.11  1.00 -0.11
 0.33 -0.11  0.55  0.33  0.11 -0.11  0.77

Feature 2 (X crossing):
 0.33 -0.55  0.11 -0.11  0.11 -0.55  0.33
-0.55  0.55 -0.55  0.33 -0.55  0.55 -0.55
 0.11 -0.55  0.55 -0.77  0.55 -0.55  0.11
-0.11  0.33 -0.77  1.00 -0.77  0.33 -0.11
 0.11 -0.55  0.55 -0.77  0.55 -0.55  0.11
-0.55  0.55 -0.55  0.33 -0.55  0.55 -0.55
 0.33 -0.55  0.11 -0.11  0.11 -0.55  0.33

Feature 3 (/ diagonal):
 0.33 -0.11  0.55  0.33  0.11 -0.11  0.77
-0.11  0.11 -0.11  0.33 -0.11  1.00 -0.11
 0.55 -0.11  0.11 -0.33  1.00 -0.11  0.11
 0.33  0.33 -0.33  0.55 -0.33  0.33  0.33
 0.11 -0.11  1.00 -0.33  0.11 -0.11  0.55
-0.11  1.00 -0.11  0.33 -0.11  0.11 -0.11
 0.77 -0.11  0.11  0.33  0.55 -0.11  0.33
CONVOLUTION LAYER

Stacking the three activation maps gives the output of the convolution layer:
one 7x7 map per filter, i.e. a 7x7x3 output volume.
CONVOLUTION LAYER
If we had 6 5x5 filters, we’ll get 6 separate activation maps:

We stack these up to get a “new image” of size 28x28x6!
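The stacking step can be sketched with a naive NumPy loop (an illustrative toy, not an efficient implementation; the random image and filters are placeholders for learned ones):

```python
import numpy as np

def conv_layer(volume, filters):
    """Apply each filter (valid, stride 1) across the input volume and stack
    the resulting activation maps along the depth axis."""
    H, W, _ = volume.shape
    K, fh, fw, _ = filters.shape
    out = np.empty((H - fh + 1, W - fw + 1, K))
    for k in range(K):
        for r in range(out.shape[0]):
            for c in range(out.shape[1]):
                out[r, c, k] = np.sum(volume[r:r+fh, c:c+fw, :] * filters[k])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
filters = rng.standard_normal((6, 5, 5, 3))   # 6 filters of size 5x5x3
maps = conv_layer(image, filters)
print(maps.shape)   # (28, 28, 6) - the "new image" of size 28x28x6
```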


CONVOLUTION LAYER
A ConvNet is a sequence of convolutional layers, interspersed with Rectified
Linear Units (ReLU).
CONVOLUTION LAYER
A closer look at spatial dimensions:

32x32x3 image, 5x5x3 filter
convolve (slide) over all spatial locations
=> 28x28x1 activation map

27 Jan 2016
A closer look at spatial dimensions:

7x7 input (spatially), assume 3x3 filter
=> 5x5 output
7x7 input (spatially), assume 3x3 filter applied with stride 2
=> 3x3 output!
7x7 input (spatially), assume 3x3 filter applied with stride 3?
Doesn't fit! Cannot apply a 3x3 filter on a 7x7 input with stride 3.
Output size: (N - F) / stride + 1

e.g. N = 7, F = 3:
stride 1 => (7 - 3)/1 + 1 = 5
stride 2 => (7 - 3)/2 + 1 = 3
stride 3 => (7 - 3)/3 + 1 = 2.33 :\
In practice: Common to zero pad the border

e.g. input 7x7, 3x3 filter applied with stride 1,
pad with 1 pixel border => what is the output?
(recall: (N - F) / stride + 1)
=> 7x7 output!

In general, it is common to see CONV layers with stride 1, filters of size
FxF, and zero-padding with (F-1)/2 (this preserves size spatially):
e.g. F = 3 => zero pad with 1
     F = 5 => zero pad with 2
     F = 7 => zero pad with 3
Remember back to...
E.g. a 32x32 input convolved repeatedly with 5x5 filters shrinks volumes
spatially (32 -> 28 -> 24 ...). Shrinking too fast is not good; it doesn't
work well.

32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6
        -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> ...
Examples time:

Input volume: 32x32x3
10 5x5 filters with stride 1, pad 2

Output volume size: (32+2*2-5)/1+1 = 32 spatially, so 32x32x10
Number of parameters in this layer?
Each filter has 5*5*3 + 1 = 76 params (+1 for bias) => 76*10 = 760
CONVOLUTION LAYER
N -> size of image
F -> size of filter
S -> stride
P -> padding

Output size: (N - F + 2P)/S + 1
e.g. (7 - 3 + 2)/2 + 1 = 4
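The two formulas above can be wrapped into small helpers (a sketch; the function names are my own):

```python
def conv_output_size(N, F, S=1, P=0):
    """Spatial output size of a conv layer: (N - F + 2P)/S + 1."""
    size, rem = divmod(N - F + 2 * P, S)
    if rem:
        raise ValueError("filter does not fit cleanly")   # e.g. N=7, F=3, S=3
    return size + 1

def conv_params(F, depth, K):
    """Weights in a layer: each of the K filters has F*F*depth weights + 1 bias."""
    return (F * F * depth + 1) * K

print(conv_output_size(7, 3, S=1))          # 5
print(conv_output_size(7, 3, S=2))          # 3
print(conv_output_size(7, 3, S=2, P=1))     # 4
print(conv_output_size(32, 5, S=1, P=2))    # 32
print(conv_params(5, 3, 10))                # 760
```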
Common settings:

K (number of filters) = powers of 2, e.g. 32, 64, 128, 512
- F = 3, S = 1, P = 1
- F = 5, S = 1, P = 2
- F = 5, S = 2, P = ? (whatever fits)
- F = 1, S = 1, P = 0
(btw, 1x1 convolution layers make perfect sense)

56x56x64 input -> 1x1 CONV with 32 filters -> 56x56x32
(each filter has size 1x1x64, and performs a 64-dimensional dot product)
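Because each filter is 1x1xD, a 1x1 convolution reduces to a matrix product over the depth axis; a minimal NumPy sketch (random data as a placeholder for a real volume and learned weights):

```python
import numpy as np

# A 1x1 convolution is a per-pixel dot product across depth: each output
# channel mixes the 64 input channels at every spatial location independently.
rng = np.random.default_rng(0)
volume = rng.standard_normal((56, 56, 64))   # 56x56x64 input
weights = rng.standard_normal((64, 32))      # 32 filters, each 1x1x64

out = volume @ weights          # matmul broadcast over the spatial axes
print(out.shape)                # (56, 56, 32)
```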
The brain/neuron view of CONV Layer

32x32x3 image, 5x5x3 filter

1 number: the result of taking a dot product between the filter and this
part of the image (i.e. a 5*5*3 = 75-dimensional dot product)

It's just a neuron with local connectivity...
The brain/neuron view of CONV Layer

An activation map is a 28x28 sheet of neuron outputs:
1. Each is connected to a small region in the input
2. All of them share parameters

"5x5 filter" -> "5x5 receptive field for each neuron"
The brain/neuron view of CONV Layer

E.g. with 5 filters, the CONV layer consists of neurons arranged in a 3D grid
(28x28x5). There will be 5 different neurons all looking at the same region
in the input volume.
Pooling Layer
Let us assume the filter is an "eye" detector.

Q: how can we make the detection robust to the exact location of the eye?

Ranzato
Pooling Layer
By "pooling" (e.g., taking the max of) filter responses at different
locations, we gain robustness to the exact spatial location of features.

Ranzato
Pooling layer
- makes the representations smaller and more manageable
- operates over each activation map independently
MAX POOLING

Slide a 2x2 window with stride 2 over the 7x7 activation map and keep the
maximum value in each window:

 0.77 -0.11  0.11  0.33  0.55 -0.11  0.33
-0.11  1.00 -0.11  0.33 -0.11  0.11 -0.11
 0.11 -0.11  1.00 -0.33  0.11 -0.11  0.55        1.00 0.33 0.55 0.33
 0.33  0.33 -0.33  0.55 -0.33  0.33  0.33   =>   0.33 1.00 0.33 0.55
 0.55 -0.11  0.11 -0.33  1.00 -0.11  0.11        0.55 0.33 1.00 0.11
-0.11  0.11 -0.11  0.33 -0.11  1.00 -0.11        0.33 0.55 0.11 0.77
 0.33 -0.11  0.55  0.33  0.11 -0.11  0.77
POOLING LAYER

Max pooling each of the three 7x7 activation maps gives three 4x4 maps:

1.00 0.33 0.55 0.33    0.55 0.33 0.55 0.33    0.33 0.55 1.00 0.77
0.33 1.00 0.33 0.55    0.33 1.00 0.55 0.11    0.55 0.55 1.00 0.33
0.55 0.33 1.00 0.11    0.55 0.55 0.55 0.11    1.00 1.00 0.11 0.55
0.33 0.55 0.11 0.77    0.33 0.11 0.11 0.33    0.77 0.33 0.55 0.33
POOLING LAYER
Summary:
Accepts a volume of size W1 x H1 x D1
Requires two hyper-parameters:
  Kernel size F
  Stride S
Produces a volume of size W2 x H2 x D2 where:
  W2 = (W1 - F)/S + 1
  H2 = (H1 - F)/S + 1
  D2 = D1
Introduces zero parameters since it computes a fixed function of the input
Note: zero padding is not common in the case of pooling
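The summary above can be sketched as a naive NumPy max-pool (assuming no padding, as the note says; shown here on the top-left 4x4 corner of the activation map from the earlier slides):

```python
import numpy as np

def max_pool(x, F=2, S=2):
    """Max pooling over one activation map. Output size follows the summary:
    W2 = (W1 - F)//S + 1, with zero parameters (a fixed function)."""
    H, W = x.shape
    out = np.empty(((H - F) // S + 1, (W - F) // S + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = x[r * S:r * S + F, c * S:c * S + F].max()
    return out

a = np.array([[ 0.77, -0.11,  0.11,  0.33],
              [-0.11,  1.00, -0.11,  0.33],
              [ 0.11, -0.11,  1.00, -0.33],
              [ 0.33,  0.33, -0.33,  0.55]])
print(max_pool(a))
# [[1.   0.33]
#  [0.33 1.  ]]
```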
RECTIFIED LINEAR UNITS (RELU)

ReLU replaces every negative value in an activation map with zero and leaves
positive values unchanged:

Input map:                                   After ReLU:
 0.77 -0.11  0.11  0.33  0.55 -0.11  0.33    0.77 0    0.11 0.33 0.55 0    0.33
-0.11  1.00 -0.11  0.33 -0.11  0.11 -0.11    0    1.00 0    0.33 0    0.11 0
 0.11 -0.11  1.00 -0.33  0.11 -0.11  0.55    0.11 0    1.00 0    0.11 0    0.55
 0.33  0.33 -0.33  0.55 -0.33  0.33  0.33    0.33 0.33 0    0.55 0    0.33 0.33
 0.55 -0.11  0.11 -0.33  1.00 -0.11  0.11    0.55 0    0.11 0    1.00 0    0.11
-0.11  0.11 -0.11  0.33 -0.11  1.00 -0.11    0    0.11 0    0.33 0    1.00 0
 0.33 -0.11  0.55  0.33  0.11 -0.11  0.77    0.33 0    0.55 0.33 0.11 0    0.77
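Element-wise, ReLU is just max(0, x); for example, on the first row of the activation map above:

```python
import numpy as np

# ReLU applied element-wise: negatives become 0, positives pass through.
row = np.array([0.77, -0.11, 0.11, 0.33, 0.55, -0.11, 0.33])
print(np.maximum(0, row))   # [0.77 0.   0.11 0.33 0.55 0.   0.33]
```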
RELU LAYER

Applying ReLU to each of the three activation maps zeroes out all negative
entries and leaves the positive match scores unchanged.
LAYERS GET STACKED

The 9x9 input image passes through CONV/ReLU and MAX POOL, producing three
4x4 maps (one per feature):

1.00 0.33 0.55 0.33    0.55 0.33 0.55 0.33    0.33 0.55 1.00 0.77
0.33 1.00 0.33 0.55    0.33 1.00 0.55 0.11    0.55 0.55 1.00 0.33
0.55 0.33 1.00 0.11    0.55 0.55 0.55 0.11    1.00 1.00 0.11 0.55
0.33 0.55 0.11 0.77    0.33 0.11 0.11 0.33    0.77 0.33 0.55 0.33
DEEP STACKING

Layers can be repeated: another CONV/ReLU/POOL pass shrinks the three 4x4
maps down to three 2x2 maps:

1.00 0.55    1.00 0.55    0.55 1.00
0.55 1.00    0.55 0.55    1.00 0.55
FULLY CONNECTED LAYER

The stacked 2x2 maps are flattened into a single vector
(1.00, 0.55, 0.55, 1.00, 1.00, 0.55, 0.55, 0.55, 0.55, 1.00, 1.00, 0.55),
and every element gets a weighted vote toward each output class, X and O.
FULLY CONNECTED LAYER

For a new input, the flattened feature vector
(0.9, 0.65, 0.45, 0.87, 0.96, 0.73, 0.23, 0.63, 0.44, 0.89, 0.94, 0.53)
is weighted into the X and O output neurons; the class receiving the
stronger total vote wins.
PUTTING IT ALL TOGETHER

Input image -> [CONV -> ReLU -> POOL] x N -> flatten -> FULLY CONNECTED -> X or O
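The whole pipeline can be strung together in a short NumPy sketch (a toy with random, untrained weights, just to show the shapes flowing through CONV -> ReLU -> POOL -> flatten -> FC):

```python
import numpy as np

def conv_valid(img, k):
    # Valid cross-correlation, normalized by the kernel size as in the slides.
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r+kh, c:c+kw] * k) / k.size
    return out

def max_pool(x, F=2, S=2):
    out = np.empty(((x.shape[0] - F) // S + 1, (x.shape[1] - F) // S + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = x[r*S:r*S+F, c*S:c*S+F].max()
    return out

def forward(image, kernels, W):
    maps = [np.maximum(0, conv_valid(image, k)) for k in kernels]  # CONV + ReLU
    pooled = [max_pool(m) for m in maps]                           # POOL
    features = np.concatenate([p.ravel() for p in pooled])         # flatten
    return features @ W                                            # FC scores

rng = np.random.default_rng(0)
image = rng.choice([-1.0, 1.0], size=(9, 9))        # placeholder input
kernels = rng.choice([-1.0, 1.0], size=(3, 3, 3))   # three 3x3 "features"
W = rng.standard_normal((3 * 3 * 3, 2))             # 2 classes: X and O
scores = forward(image, kernels, W)
print(scores.shape)   # (2,) - one score per class
```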
IMPLEMENTATION – CIFAR10
FAMOUS CNN ARCHITECTURES

IMAGENET
• The ImageNet project is a large visual database designed for use in visual
  object recognition software research. As of 2016, over ten million URLs of
  images have been hand-annotated by ImageNet to indicate what objects are
  pictured.
• Since 2010, the ImageNet project has run an annual software contest, the
  ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software
  programs compete to correctly classify and detect objects and scenes.
IMAGENET

VARIOUS CNN ARCHITECTURES – PERFORMANCE
Case Study: LeNet-5
[LeCun et al., 1998]

Conv filters were 5x5, applied at stride 1
Subsampling (pooling) layers were 2x2, applied at stride 2
i.e. architecture is [CONV-POOL-CONV-POOL-CONV-FC]
Case Study: AlexNet
[Krizhevsky et al. 2012]

Input: 227x227x3 images

First layer (CONV1): 96 11x11 filters applied at stride 4
Q: what is the output volume size? Hint: (227-11)/4+1 = 55
=> Output volume [55x55x96]
Parameters: (11*11*3)*96 = 35K

Second layer (POOL1): 3x3 filters applied at stride 2
Q: what is the output volume size? Hint: (55-3)/2+1 = 27
=> Output volume [27x27x96]
Q: what is the number of parameters in this layer? Parameters: 0!

After CONV1: 55x55x96
After POOL1: 27x27x96
...
Case Study: AlexNet
[Krizhevsky et al. 2012]

Full (simplified) AlexNet architecture:
[227x227x3] INPUT
[55x55x96] CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96] MAX POOL1: 3x3 filters at stride 2
[27x27x96] NORM1: Normalization layer
[27x27x256] CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256] MAX POOL2: 3x3 filters at stride 2
[13x13x256] NORM2: Normalization layer
[13x13x384] CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384] CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256] CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256] MAX POOL3: 3x3 filters at stride 2
[4096] FC6: 4096 neurons
[4096] FC7: 4096 neurons
[1000] FC8: 1000 neurons (class scores)

Details/Retrospectives:
- first use of ReLU
- used Norm layers (not common anymore)
- heavy data augmentation
- dropout 0.5
- batch size 128
- SGD Momentum 0.9
- Learning rate 1e-2, reduced by 10 manually when val accuracy plateaus
- L2 weight decay 5e-4
- 7 CNN ensemble: 18.2% -> 15.4%
VGGNET
Simonyan and Zisserman, 2014

Consists of only:
3x3 CONV stride 1, pad 1
2x2 MAX POOL, stride 2

11.2% top 5 error in ILSVRC 2013 => 7.3% top 5 error
INPUT: [224x224x3] memory: 224*224*3=150K params: 0 (not counting biases)
CONV3-64: [224x224x64] memory: 224*224*64=3.2M params: (3*3*3)*64 = 1,728
CONV3-64: [224x224x64] memory: 224*224*64=3.2M params: (3*3*64)*64 = 36,864
POOL2: [112x112x64] memory: 112*112*64=800K params: 0
CONV3-128: [112x112x128] memory: 112*112*128=1.6M params: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128] memory: 112*112*128=1.6M params: (3*3*128)*128 = 147,456
POOL2: [56x56x128] memory: 56*56*128=400K params: 0
CONV3-256: [56x56x256] memory: 56*56*256=800K params: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256] memory: 56*56*256=800K params: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256] memory: 56*56*256=800K params: (3*3*256)*256 = 589,824
POOL2: [28x28x256] memory: 28*28*256=200K params: 0
CONV3-512: [28x28x512] memory: 28*28*512=400K params: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512] memory: 28*28*512=400K params: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512] memory: 28*28*512=400K params: (3*3*512)*512 = 2,359,296
POOL2: [14x14x512] memory: 14*14*512=100K params: 0
CONV3-512: [14x14x512] memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
POOL2: [7x7x512] memory: 7*7*512=25K params: 0
FC: [1x1x4096] memory: 4096 params: 7*7*512*4096 = 102,760,448
FC: [1x1x4096] memory: 4096 params: 4096*4096 = 16,777,216
FC: [1x1x1000] memory: 1000 params: 4096*1000 = 4,096,000

TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters

Note: most memory is in the early CONV layers; most params are in the late FC layers.
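The 138M total can be re-derived from the layer list (weights only, biases excluded, matching the table's convention):

```python
# Recomputing the VGG-16 parameter total from the table above.
cfg = [64, 64, 'P', 128, 128, 'P', 256, 256, 256, 'P',
       512, 512, 512, 'P', 512, 512, 512, 'P']

params, depth, size = 0, 3, 224
for layer in cfg:
    if layer == 'P':
        size //= 2                       # 2x2 pool, stride 2
    else:
        params += 3 * 3 * depth * layer  # 3x3 conv filters
        depth = layer

params += size * size * depth * 4096     # FC6: 7*7*512 -> 4096
params += 4096 * 4096                    # FC7
params += 4096 * 1000                    # FC8

print(f"{params:,}")   # 138,344,128 ~ 138M, matching the slide
```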
GOOGLENET
Szegedy et al., 2014
Inception Module
ILSVRC 2014 winner (6.7% top 5 error)
(ImageNet Large Scale Visual Recognition Challenge)

RESNET
He et al., 2015
ILSVRC 2015 winner (3.6% top 5 error)

RESNET (CONTD.)
SUMMARY
Visual Recognition
Challenges

Convolutional Neural Networks


Image Filtering
CNN Layer
Pooling Layer
ReLU Layer
Fully Connected Layer
Famous CNN Architectures
Exercise: Find the total number of parameters/weights and the memory required
(in bytes) to hold all the intermediate hidden layers (including the input
and final output layer) in the following network. All convolution filters
have size 3x3, stride 1 and pad 1, and all pool layers are 2x2, stride 2.

Input layer: 224x224x3
Conv layer with 64 filters
Conv layer with 64 filters
Pool layer
Conv layer with 128 filters
Conv layer with 128 filters
Pool layer
Conv layer with 256 filters
Conv layer with 256 filters
Conv layer with 256 filters
Pool layer
Conv layer with 512 filters
Conv layer with 512 filters
Conv layer with 512 filters
Pool layer
Conv layer with 512 filters
Conv layer with 512 filters
Conv layer with 512 filters
Pool layer
FC – 4096 neurons
FC – 4096 neurons
FC – 1000 neurons
