[Figure: machine learning vs. deep learning. Source: https://www.xenonstack.com/blog/static/public/uploads/media/machine-learning-vs-deep-learning.png]
Neural Network Intro
Weights
h = σ(W₁x + b₁)
y = σ(W₂h + b₂)
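The two equations above can be sketched as a forward pass in NumPy; this is a minimal illustration with a sigmoid activation, and the layer sizes are arbitrary assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    h = sigmoid(W1 @ x + b1)   # hidden layer: h = sigma(W1 x + b1)
    y = sigmoid(W2 @ h + b2)   # output layer: y = sigma(W2 h + b2)
    return y

rng = np.random.default_rng(0)
x = rng.normal(size=3)                           # assumed input size 3
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # assumed hidden size 4
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)    # assumed output size 2
y = forward(x, W1, b1, W2, b2)                   # each output lies in (0, 1)
```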
Activation functions
How do we train?
Demo
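The question above ("How do we train?") is typically answered with gradient descent. Here is a minimal sketch, not the slides' demo: a single sigmoid neuron fit to a made-up 1-D binary task; the data, learning rate, and iteration count are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.0, 1.0, 2.0, 3.0])   # toy inputs (assumed)
t = np.array([0.0, 0.0, 1.0, 1.0])   # toy target labels (assumed)
w, b, lr = 0.0, 0.0, 0.5

for _ in range(2000):
    y = sigmoid(w * x + b)
    # Gradient of the cross-entropy loss for a sigmoid unit:
    # dL/dw = mean((y - t) * x), dL/db = mean(y - t).
    w -= lr * np.mean((y - t) * x)
    b -= lr * np.mean(y - t)
```

After training, the neuron separates the two classes: inputs near 0 map below 0.5 and inputs near 3 map above it.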
Element of Neural Network

Neuron f : R^K → R

z = a₁w₁ + a₂w₂ + … + a_K w_K + b
a = σ(z)

Here a₁, …, a_K are the inputs, w₁, …, w_K the weights, b the bias, and σ the activation function.
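As a small sketch of the single-neuron computation above (the input values, weights, bias, and the choice of tanh as activation are all assumptions for illustration):

```python
import numpy as np

def neuron(a, w, b, activation=np.tanh):
    # A single neuron f: R^K -> R: weighted sum of inputs plus bias,
    # passed through an activation function.
    z = np.dot(a, w) + b
    return activation(z)

out = neuron(a=np.array([1.0, -2.0, 0.5]),
             w=np.array([0.3, 0.1, -0.4]),
             b=0.2)
# z = 1.0*0.3 + (-2.0)*0.1 + 0.5*(-0.4) + 0.2 = 0.1, so out = tanh(0.1)
```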
Neural Network

Each node is a neuron. The input layer (x₁, …, x_N) feeds Layer 1, each layer feeds the next, and Layer L produces the outputs (y₁, …, y_M). The layers between the input layer and the output layer are the hidden layers.
Ordinary Layer

y_i = σ(z_i), for i = 1, 2, 3

In general, the output of the network can be any value, which may not be easy to interpret.
Softmax

Probability: use a softmax layer as the output layer, so the outputs are positive and sum to 1:

y_i = e^{z_i} / Σ_{j=1}^{3} e^{z_j}

Example: z = (3, 1, −3) gives e^{z} ≈ (20, 2.7, 0.05), so y ≈ (0.88, 0.12, ≈0).
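The softmax example above can be reproduced directly; this sketch subtracts the maximum before exponentiating, a standard numerical-stability step not shown on the slide.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

y = softmax(np.array([3.0, 1.0, -3.0]))
# y ≈ [0.88, 0.12, 0.00]: positive values summing to 1
```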
Why Deep?

Universality Theorem: any continuous function f : R^N → R^M can be realized by a network with one hidden layer, given enough hidden neurons.
Reference for the reason: http://neuralnetworksanddeeplearning.com/chap4.html

So why go deep? Compare a shallow network (fat + short) with a deep network (thin + tall) that has the same number of parameters.
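To make the fat+short vs. thin+tall comparison concrete, here is a small sketch that counts parameters (weights plus biases) of fully connected networks; the layer widths chosen are arbitrary assumptions, not figures from the slides.

```python
def n_params(widths):
    # widths = [input, hidden..., output]; each layer contributes
    # (fan_in * fan_out) weights plus fan_out biases.
    return sum(w_in * w_out + w_out
               for w_in, w_out in zip(widths, widths[1:]))

shallow = n_params([100, 2000, 10])                 # one wide hidden layer
deep    = n_params([100, 128, 128, 128, 128, 10])   # several narrow layers
# The deep net here uses far fewer parameters; widening its layers
# slightly would match the shallow net's budget.
```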
Why Deep?

Why can a deep network be trained with little data? Deep → Modularization.

Example: we want four image classifiers: (1) girls with long hair, (2) boys with long hair (little data available), (3) girls with short hair, (4) boys with short hair. Instead of training each directly from images, first train two basic classifiers ("Boy or girl?" and "Long or short hair?") and share them as modules across the four final classifiers. Each final classifier then needs only little data.

Deep learning also works on small data sets like TIMIT.
• Deep → Modularization → Less training data?

The modularization is automatically learned from data: the first layer learns the most basic classifiers, the second layer uses the first layer's outputs as modules to build classifiers, and each later layer builds on the modules below it.

[Figure: network x₁, …, x_N with successive hidden layers acting as modules.]
SVM: apply a hand-crafted kernel function, then a simple classifier.
Source of image: http://www.gipsa-lab.grenoble-inp.fr/transfert/seminaire/455_Kadri2013Gipsa-lab.pdf

Deep Learning: the hidden layers learn the feature transform φ(x), a learnable kernel, and the output layer is a simple classifier on φ(x).

[Figure: network mapping x₁, …, x_N through hidden layers φ(x) to outputs y₁, …, y_M.]
Supervised Learning vs Unsupervised Learning

What happens when our labels are noisy?
● Missing values.
● Labeled incorrectly.

What happens when we don't have labels for training at all?
Traditional Autoencoder