
Artificial Neural Network

Advance Topics in Mathematical Methods ME7100

Introduction
An Artificial Neural Network (ANN) is an information processing
paradigm that is inspired by the way biological nervous systems,
such as the brain, process information.

A neuron collects signals from other neurons through structures called dendrites.
The neuron sends out signals through a thin strand known as an axon.
A synapse converts the activity from the axon into electrical effects.
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html


Introduction
[Figure: model of an artificial neuron. Input values x1, x2, …, xm are multiplied by weights w1, w2, …, wm and combined with a bias b by a summing function; the result is passed through an activation function to give the output f(Σ_i w_i x_i + b).]

http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
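This neuron model can be sketched in a few lines of Python (the example inputs, weights, bias and activation are arbitrary illustrative values, not taken from the slides):

```python
import math

def neuron_output(inputs, weights, bias, activation=math.tanh):
    # Summing function: v = sum_i w_i * x_i + b
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function applied to the weighted sum
    return activation(v)

# Example: two inputs, two weights, a bias, and a linear activation
y = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, activation=lambda v: v)
```

With a linear activation the output is simply the weighted sum plus bias; swapping in `math.tanh` or a sigmoid gives the non-linear units discussed below.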

Introduction
An Artificial Neural Network encompasses

a neuron model: the type of activation function
an architecture: the network structure (number of neurons, number of layers, weights at each neuron)
a learning algorithm: training of the ANN by modifying the weights so as to mimic the known observations (input, output), such that unknown inputs can be predicted



Activation Function

sigmoid
rational function
hyperbolic tangent
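These activations can be written out directly. The slide gives only the names, so the exact formulas below are standard choices, not taken from the slides; in particular, the softsign form v / (1 + |v|) is assumed here for the "rational function":

```python
import math

def sigmoid(v):
    # Logistic sigmoid: maps any v into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

def rational(v):
    # Softsign, a rational activation mapping v into (-1, 1)
    # (assumed form; the slide does not give the formula)
    return v / (1.0 + abs(v))

def hyperbolic_tangent(v):
    # Hyperbolic tangent: maps v into (-1, 1)
    return math.tanh(v)
```

All three are smooth, monotonic "squashing" functions, which is what the training schemes later in these notes rely on.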


Network architectures
Three different classes of network architectures

single-layer feed-forward
multi-layer feed-forward
recurrent

[Figure: single-layer feed-forward and multi-layer feed-forward network topologies.]

http://codebase.mql4.com/5738


Network architectures
Three different classes of network architectures

Recurrent: in a recurrent network, the weight matrix for each layer contains input weights from all other neurons in the network, not just the neurons of the previous layer.

http://en.wikibooks.org/wiki/Artificial_Neural_Networks/Recurrent_Networks


Network architectures
single-layer feed-forward
e.g. the Perceptron (Rosenblatt, 1958): classification into one of two categories.
A perceptron uses a step function:

f(v) = a if v < c, b if v ≥ c

[Figure: perceptron with inputs x1, x2, …, xn, weights w1, w2, …, wn and bias b; the weighted sum v = Σ_i w_i x_i + b is passed through f(v).]

http://codebase.mql4.com/5738
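A sketch of this perceptron in Python (the output values a and b, the threshold c, and the example inputs and weights are arbitrary illustrative values):

```python
def step_perceptron(x, w, bias, a=0, b=1, c=0.0):
    # v = sum_i w_i x_i + bias; output a if v < c, b otherwise
    v = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return a if v < c else b

# With weights (1, -1) and threshold c = 0, the separating
# hyperplane v = c is the line x1 - x2 = 0
label = step_perceptron([2.0, 1.0], [1.0, -1.0], 0.0)
```

Points on one side of the hyperplane v = c are mapped to a, points on the other side to b, which is exactly the two-category classification described above.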


Network architectures
single-layer feed-forward
e.g. the Perceptron (Rosenblatt, 1958): classification into one of two categories.
Used for binary classification: geometrically, finding a hyperplane v = c that separates the examples into two classes.

http://codebase.mql4.com/5738


Network architectures
single-layer feed-forward
Case: can we predict heart disease on the basis of age, sex (M/F), smoking frequency, cholesterol, BP and weight?

Age  Sex (M=1, F=0)  Smoking frequency  Cholesterol  BP   Weight  Heart patient (0 = non-patient, 1 = patient)
55   0               3                  143          109  66      0
41   0               1                  145          91   43      0
45   1               1                  224          126  46      1
60   0               8                  237          83   85      1
22   0               3                  140          83   56      0
53   1               4                  163          94   73      1
34   0               5                  188          88   53      1
41   1               5                  192          120  46      1
39   1               6                  222          126  75      1
52   1               8                  179          99   72      1
58   0               7                  165          122  58      1
58   1               6                  182          117  47      1
37   1               3                  174          113  46      0
49   0               2                  190          126  45      1

Network architectures
single-layer feed-forward
e.g. the Perceptron (Rosenblatt, 1958): classification into one of two categories.
A perceptron uses the step function

f(v) = 0 if v < 250, 1 if v ≥ 250

[Figure: perceptron computing v = Σ_i w_i x_i + b with bias b = 1.55 and output Y = 0 or 1; input weights: Age 0.880052, sex (M=1, F=0) -1.13407, smoking frequency 1.275656, Cholesterol 0.870191, BP 0.124578, Weight 0.759339.]

http://codebase.mql4.com/5738
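Applying the slide's weights and threshold to the data table reproduces the heart-patient labels. A check in Python (assuming the extracted columns line up as age, sex, smoking frequency, cholesterol, BP, weight):

```python
# Perceptron weights from the slide, in the order:
# age, sex (M=1, F=0), smoking frequency, cholesterol, BP, weight
W = [0.880052, -1.13407, 1.275656, 0.870191, 0.124578, 0.759339]
B = 1.55

# (age, sex, smoking, cholesterol, BP, weight, label) rows from the table
DATA = [
    (55, 0, 3, 143, 109, 66, 0),
    (41, 0, 1, 145, 91, 43, 0),
    (45, 1, 1, 224, 126, 46, 1),
    (60, 0, 8, 237, 83, 85, 1),
    (22, 0, 3, 140, 83, 56, 0),
    (53, 1, 4, 163, 94, 73, 1),
    (34, 0, 5, 188, 88, 53, 1),
    (41, 1, 5, 192, 120, 46, 1),
    (39, 1, 6, 222, 126, 75, 1),
    (52, 1, 8, 179, 99, 72, 1),
    (58, 0, 7, 165, 122, 58, 1),
    (58, 1, 6, 182, 117, 47, 1),
    (37, 1, 3, 174, 113, 46, 0),
    (49, 0, 2, 190, 126, 45, 1),
]

def predict(features):
    # v = sum_i w_i x_i + b, then threshold at 250
    v = sum(w * x for w, x in zip(W, features)) + B
    return 1 if v >= 250 else 0

predictions = [predict(row[:-1]) for row in DATA]
```

With this column ordering, every one of the 14 rows is classified correctly by the given weights.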


Network Training

Method of Gradient Descent: an algorithm for finding the nearest local minimum or maximum of a function, which presupposes that the gradient of the function can be computed.

[Figure: a function with a local maximum at x = xmax; starting from an arbitrary point x = x1, gradient steps move towards it.]


Network Training

Method of Gradient Descent: an algorithm for finding the nearest local minimum or maximum of a function, which presupposes that the gradient of the function can be computed.

Choose an arbitrary point x = x1.
For maximization we need f(x2) − f(x1) > 0:

if f′(x1) < 0, choose x2 < x1
if f′(x1) > 0, choose x2 > x1

Thus, the following will always yield movement towards the maximum:

x2 = x1 + η f′(x1), η > 0

where η is the learning rate.

Network Training

Method of Gradient Descent: an algorithm for finding the nearest local minimum or maximum of a function, which presupposes that the gradient of the function can be computed.

Choose an arbitrary point x = x1.
For minimization we need f(x2) − f(x1) < 0:

if f′(x1) < 0, choose x2 > x1
if f′(x1) > 0, choose x2 < x1

Thus, the following will always yield movement towards the minimum:

x2 = x1 − η f′(x1), η > 0

where η is the learning rate.
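The minimization rule x2 = x1 − η f′(x1) can be sketched on a function whose minimum is known; f(x) = (x − 3)² is an arbitrary example chosen for illustration:

```python
def gradient_descent(f_prime, x, eta=0.1, steps=100):
    # Repeatedly apply x <- x - eta * f'(x) to move towards a minimum
    for _ in range(steps):
        x = x - eta * f_prime(x)
    return x

# f(x) = (x - 3)^2 has f'(x) = 2 * (x - 3) and its minimum at x = 3
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x=0.0)
```

Each step contracts the distance to the minimum by a constant factor here, so the iterate converges to x = 3 regardless of the (reasonable) starting point.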


Multi-variable, single response: x2 = x1 − η ∇f(x1)

Risk of trapping in a local maximum or minimum!

Heuristic methods

http://bayen.eecs.berkeley.edu/bayen/?q=webfm_send/246


Network Training
Method of Gradient Descent for ANN: the network is optimized by adjusting the weights such that the error in prediction is minimized.


Network Training
Linear Perceptron

[Figure: perceptron with inputs x1, x2, …, xn, weights w1, w2, …, wn and bias b; v = Σ_i w_i x_i + b, output f(v).]

For a linear unit, f(v) = v. For one training sample with target d:

y = Σ_i w_i x_i + b
E = ½ (d − y)²
∂E/∂w_i = −(d − y) x_i

Weight update with learning rate η > 0:

w_i ← w_i + η (d − y) x_i,  b ← b + η (d − y)

Repeat till solution converges.

Network Training
Linear Perceptron, s training samples

For training samples p = 1, …, s with targets d^(p):

y^(p) = Σ_i w_i x_i^(p) + b
E = ½ Σ_p (d^(p) − y^(p))²
∂E/∂w_i = −Σ_p (d^(p) − y^(p)) x_i^(p)

w_i ← w_i + η Σ_p (d^(p) − y^(p)) x_i^(p),  b ← b + η Σ_p (d^(p) − y^(p))

Repeat till solution converges.
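A sketch of this batch update in Python (the training set is a made-up toy problem generated from y = 2·x1 − x2 + 0.5, so the learned weights can be checked against the generating ones):

```python
def train_linear_perceptron(samples, targets, eta=0.02, epochs=2000):
    n = len(samples[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):  # repeat till solution converges
        # Accumulate the batch gradient over all s samples
        dw = [0.0] * n
        db = 0.0
        for x, d in zip(samples, targets):
            y = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear unit
            err = d - y
            for i in range(n):
                dw[i] += err * x[i]
            db += err
        # w_i <- w_i + eta * sum_p (d - y) x_i ;  b <- b + eta * sum_p (d - y)
        for i in range(n):
            w[i] += eta * dw[i]
        b += eta * db
    return w, b

# Toy data generated from y = 2*x1 - x2 + 0.5 (invented for illustration)
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
D = [2 * x1 - x2 + 0.5 for x1, x2 in X]
w, b = train_linear_perceptron(X, D)
```

Because the targets are generated exactly by a linear rule, the batch updates drive the weights and bias to the generating values.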



Network Training
Linear single-layer perceptron (m inputs & n outputs)

y_j = Σ_i w_ij x_i + b_j,  j = 1, …, n
E = ½ Σ_j (d_j − y_j)²
∂E/∂w_ij = −(d_j − y_j) x_i

w_ij ← w_ij + η (d_j − y_j) x_i,  b_j ← b_j + η (d_j − y_j)

Repeat till solution converges.

Network Training
Linear single-layer perceptron (m inputs & n outputs, s training samples)

y_j^(p) = Σ_i w_ij x_i^(p) + b_j
E = ½ Σ_p Σ_j (d_j^(p) − y_j^(p))²
∂E/∂w_ij = −Σ_p (d_j^(p) − y_j^(p)) x_i^(p)

w_ij ← w_ij + η Σ_p (d_j^(p) − y_j^(p)) x_i^(p),  b_j ← b_j + η Σ_p (d_j^(p) − y_j^(p))

Network Training
Non-linear Perceptron

y = f(v),  v = Σ_i w_i x_i + b
E = ½ (d − y)²
∂E/∂w_i = −(d − y) f′(v) x_i

w_i ← w_i + η (d − y) f′(v) x_i,  b ← b + η (d − y) f′(v)


Network Training
Linear multi-layer perceptron (2 inputs, 2 hidden neurons and one output)

[Figure: inputs X1, X2 connect to hidden neurons f1 (weights w11, w21, bias b1) and f2 (weights w12, w22, bias b2); the hidden outputs connect to the output neuron f through weights w1, w2 and bias b.]

f1 = f(w11 x1 + w21 x2 + b1)
f2 = f(w12 x1 + w22 x2 + b2)
y = f(w1 f1 + w2 f2 + b)

Output weight scheme (linear units, f(v) = v):

E = ½ (d − y)²
∂E/∂w1 = −(d − y) f1, and similarly ∂E/∂w2 = −(d − y) f2, ∂E/∂b = −(d − y)

w1 ← w1 + η (d − y) f1,  w2 ← w2 + η (d − y) f2,  b ← b + η (d − y)

Network Training
Linear multi-layer perceptron (2 inputs, 2 hidden neurons and one output)

f1 = f(w11 x1 + w21 x2 + b1)
f2 = f(w12 x1 + w22 x2 + b2)
y = f(w1 f1 + w2 f2 + b)

Input weight scheme (linear units, f(v) = v):

∂E/∂w11 = −(d − y) w1 x1, and similarly
∂E/∂w21 = −(d − y) w1 x2
∂E/∂w12 = −(d − y) w2 x1
∂E/∂w22 = −(d − y) w2 x2
∂E/∂b1 = −(d − y) w1,  ∂E/∂b2 = −(d − y) w2

Network Training

This scheme, in which the error at the output is propagated backwards through the network to update the input-layer weights, is known as Back Propagation.

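The back-propagation scheme for the 2-2-1 network can be sketched as a training loop (linear activations, as in these notes; the initial weights, learning rate, and the toy target y = x1 + 2·x2 are illustrative assumptions):

```python
def train_2_2_1(samples, targets, eta=0.01, epochs=5000):
    # Weights named as in the notes: w11, w12, w21, w22 feed the hidden
    # neurons, w1, w2 feed the output; small arbitrary initial values.
    w11, w12, w21, w22 = 0.1, -0.1, 0.05, 0.2
    w1, w2 = 0.3, 0.1
    b1, b2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), d in zip(samples, targets):
            # Forward pass (linear activation f(v) = v)
            f1 = w11 * x1 + w21 * x2 + b1
            f2 = w12 * x1 + w22 * x2 + b2
            y = w1 * f1 + w2 * f2 + b
            err = d - y
            # Output weight scheme: dE/dw1 = -(d - y) f1, etc.
            new_w1 = w1 + eta * err * f1
            new_w2 = w2 + eta * err * f2
            new_b = b + eta * err
            # Input weight scheme (back propagation):
            # dE/dw11 = -(d - y) w1 x1, etc.
            w11 += eta * err * w1 * x1
            w21 += eta * err * w1 * x2
            b1 += eta * err * w1
            w12 += eta * err * w2 * x1
            w22 += eta * err * w2 * x2
            b2 += eta * err * w2
            w1, w2, b = new_w1, new_w2, new_b

    def net(x1, x2):
        h1 = w11 * x1 + w21 * x2 + b1
        h2 = w12 * x1 + w22 * x2 + b2
        return w1 * h1 + w2 * h2 + b
    return net

# Toy target y = x1 + 2*x2 (invented for illustration)
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, -0.5)]
net = train_2_2_1(X, [x1 + 2 * x2 for x1, x2 in X])
```

The hidden-layer gradients reuse the output error scaled by the output weights, which is the back-propagation idea in its simplest form.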

Network Training
Linear multi-layer perceptron (m inputs, h hidden neurons and one output)

[Figure: m inputs (i-th input x_i) feed h hidden neurons (j-th neuron f_j, weights w_ij, biases b_j), which feed one output through weights w_j and bias b.]

f_j = f(Σ_i w_ij x_i + b_j),  j = 1, …, h
y = f(Σ_j w_j f_j + b)

For linear units:

∂E/∂w_j = −(d − y) f_j,  ∂E/∂b = −(d − y)
∂E/∂w_ij = −(d − y) w_j x_i,  ∂E/∂b_j = −(d − y) w_j

Network Training
Linear multi-layer perceptron (m inputs & n outputs)

[Figure: m inputs (i-th input) feed h hidden neurons (j-th neuron f_j, weights w_ij), which feed n outputs (k-th output y_k = f_k, weights w_jk).]

f_j = f(Σ_i w_ij x_i + b_j)
y_k = f(Σ_j w_jk f_j + b_k)

Network Training
Derive the weight change scheme for:

1. a sigmoidal multi-layer perceptron, s training samples
2. a sigmoidal single-layer perceptron (m inputs & n outputs)
3. a sigmoidal single-layer perceptron (m inputs & n outputs, s training samples)
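As a starting point for these derivations, the sigmoid satisfies f′(v) = f(v)(1 − f(v)), so the non-linear update rule takes a particularly simple form. A single-neuron sketch (the learning rate, inputs and target are arbitrary illustrative values):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sigmoidal_update(w, b, x, d, eta=0.5):
    # Forward pass through a sigmoidal unit
    v = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = sigmoid(v)
    # f'(v) = y * (1 - y) for the sigmoid, so the update is
    # w_i <- w_i + eta * (d - y) * y * (1 - y) * x_i
    grad = (d - y) * y * (1.0 - y)
    w = [wi + eta * grad * xi for wi, xi in zip(w, x)]
    b = b + eta * grad
    return w, b

# Repeated updates on one sample drive the output towards the target d = 1
w, b = [0.0, 0.0], 0.0
for _ in range(300):
    w, b = sigmoidal_update(w, b, [1.0, 0.5], d=1.0)
```

Note how the factor y(1 − y) shrinks as the output saturates near 0 or 1, which is characteristic of sigmoidal training.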


Optimal ANN Architecture

The network should:
map the known data
generalize to new data

[Figure: training error vs. number of iterations.]


Optimal ANN Architecture

Generalization Techniques

Splitting Technique: 70 % Training, 20 % Testing, 10 % Validation
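A sketch of the splitting technique (only the 70/20/10 proportions come from these notes; the shuffling seed and index-based split are implementation choices):

```python
import random

def split_data(data, seed=0):
    # Shuffle indices, then split 70 % / 20 % / 10 %
    # into training / testing / validation sets
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    n = len(data)
    n_train = round(0.7 * n)
    n_test = round(0.2 * n)
    train = [data[i] for i in idx[:n_train]]
    test = [data[i] for i in idx[n_train:n_train + n_test]]
    valid = [data[i] for i in idx[n_train + n_test:]]
    return train, test, valid

train, test, valid = split_data(list(range(100)))
```

Shuffling before splitting matters: without it, any ordering in the data set (e.g. by class) would bias the three subsets.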


Predictive Process Models: ANN

[Figure: three-layer network. The input layer (X1, X2, X3) connects through weights W1, W2, W3 and bias B to the hidden layer, which connects to the output layer; Output = F([W][X] + [B]).]

Optimal ANN Architecture

Generalization Techniques: Cross Validation

Divide the data set into n groups of size k each.
Train on (n − 1) of the groups and check the error on the remaining group.
Repeat the process with different initial weights and average the results (ensemble method).
Calculate the error with each of the n groups taken in turn as the testing set.
Calculate the error of cross validation:

E_cv = (1/N) Σ_{i=1}^{n} Σ_{j=1}^{k} (y_ij^actual − y_ij^calculated)²,  N = n·k
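A sketch of this procedure (the "model" here is a trivial mean predictor standing in for the ANN training step, and the repeat-and-average ensemble step is omitted for brevity):

```python
def cross_validation_error(data, n_groups, train_fn, predict_fn):
    # data: list of (x, y) pairs, divided into n groups of size k each
    k = len(data) // n_groups
    groups = [data[i * k:(i + 1) * k] for i in range(n_groups)]
    total, count = 0.0, 0
    for i in range(n_groups):
        # Train on the other (n - 1) groups, test on group i
        train = [pair for j, g in enumerate(groups) if j != i for pair in g]
        model = train_fn(train)
        for x, y_actual in groups[i]:
            y_calc = predict_fn(model, x)
            total += (y_actual - y_calc) ** 2
            count += 1
    # E_cv = (1/N) * sum over all held-out samples of (y_actual - y_calc)^2
    return total / count

def mean_train(pairs):
    # Placeholder "training": remember the mean target of the training pairs
    return sum(y for _, y in pairs) / len(pairs)

def mean_predict(model, x):
    # Placeholder "prediction": return the remembered mean, ignoring x
    return model

data = [(x, 2.0 * x) for x in range(10)]
e_cv = cross_validation_error(data, n_groups=5,
                              train_fn=mean_train, predict_fn=mean_predict)
```

Swapping `mean_train` / `mean_predict` for an ANN training routine gives the cross-validation error described above.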

