
JOURNAL OF COMPUTING, VOLUME 3, ISSUE 6, JUNE 2011, ISSN 2151-9617, WWW.JOURNALOFCOMPUTING.ORG

Analysis of Computing Algorithm Using Momentum in Neural Networks


Gnana Sheela K, Dr. S.N Deepa

Gnana Sheela K, Research Scholar, Anna University Coimbatore.
Dr. S.N Deepa, Assistant Professor, Dept. of EEE, Anna University, Coimbatore.
Abstract: The back-propagation algorithm is used in the majority of practical neural network applications. The objective of this paper is to analyze how parameters such as the momentum and the learning rate affect the back-propagation algorithm in artificial neural networks. The results are obtained using two variants of the back-propagation algorithm: simple back propagation and back propagation with momentum. Selecting appropriate parameter values for a particular problem is not easy: if the parameters are selected improperly, slow convergence and continuous instability result. Simulation results show how these parameters affect the performance of the ANN.

Index Terms: back propagation, learning rate, momentum factor, neural networks

1 INTRODUCTION

Artificial neural networks (ANNs) simulate the human brain in processing information through a series of interconnected neurons, and they have an excellent ability to map complex and highly nonlinear input-output patterns without knowledge of the actual model structure. Architectures and algorithms for ANNs have developed rapidly. Every NN architecture is affected by parameters such as the initial weights and biases, the learning rate, the activation function, and the momentum parameter. The objective of a neural network is to provide a computational device that models the brain and performs various computational tasks faster than other systems. The back-propagation algorithm (BPA) is one of the most important developments in neural networks [5]. The BPA differs from other networks in the process by which the weights are calculated during the learning period of the network: the main goal is to minimize the error using the gradient-descent rule. The back-propagation (BP) method with momentum has often been applied to adapt artificial neural networks to various pattern-classification problems. However, an important limitation of this method is that its learning performance depends heavily on the selected values of the momentum factor and the step size [4]. In this paper it is shown that a hierarchical structure can be used for finding appropriate values of the parameters involved in the BP method with momentum.

2 PARAMETERS OF BACK PROPAGATION

The training of the BPN is based on the choice of its parameters.

2.1 Learning rate

The learning rate, denoted by $\eta$, varies from 0 to 1. It controls the amount of weight adjustment at each step of training and therefore affects the convergence of the BPN. A large value of $\eta$ leads to rapid learning but to oscillation of the weights, while a lower learning rate leads to slower learning.

2.2 Momentum factor


The momentum factor, denoted by $\mu$, varies from 0 to 1 and is used to speed up the training process. Plain BPA has two disadvantages: (a) a long training time and (b) slow convergence, since training proceeds pattern by pattern without a momentum term. Adding momentum improves convergence, shortens the training time, and reduces oscillation; in short, it accelerates the learning process.
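In standard notation, with $\eta$ the learning rate and $\mu$ the momentum factor, the two update rules compared in this paper can be written as follows (a restatement of the formulas given in full in Section 4):

$$\Delta W_{jk}(t+1) = \eta\, e\, Z_j \qquad \text{(plain BPA)}$$

$$\Delta W_{jk}(t+1) = \eta\, e\, Z_j + \mu\, \Delta W_{jk}(t) \qquad \text{(BPA with momentum)}$$

The second term carries a fraction $\mu$ of the previous weight change into the current update, which is what smooths and accelerates the weight trajectory.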

3 IMPORTANCE OF PARAMETERS

The momentum factor is adjusted in order to cancel the introduced noise while retaining the speed-up as well as the convergence. The rate of convergence and the stability of the training algorithm can be improved if the momentum factor and the learning rate are adapted simultaneously. A high learning rate leads to rapid learning, but the weights may oscillate; a lower rate leads to slower learning. Starting with a high learning rate and then decreasing it keeps the later weight changes small, which reduces oscillation; the rate is increased when doing so improves performance and decreased when performance worsens. The selection of the learning rate is of critical importance in finding the global minimum of the error distance [7]. BPA with too small a learning rate makes slow progress, while too large a learning rate proceeds much faster but may produce oscillation between relatively poor solutions.

The momentum has the following effects. It smooths the weight changes and suppresses cross-stitching, that is, it cancels side-to-side oscillations across the error valley. When the weight changes are all in the same direction, the momentum amplifies the learning rate, causing a faster convergence.


It also enables the network to escape from small local minima on the error surface. With momentum, once the weights start moving in a particular direction in weight space, they tend to continue moving in that direction. Imagine a ball rolling down a hill that gets stuck in a depression halfway down the hill: if the ball has enough momentum, it rolls through the depression and continues down the hill. Similarly, when applied to the weights in a network, momentum can help the network roll past a local minimum, as well as speed up learning. In BrainWave, the default learning rate is 0.25 and the default momentum parameter is 0.9 [6].

Back-propagation training with too small a learning rate makes agonizingly slow progress; too large a learning rate proceeds much faster but may simply produce oscillations between relatively poor solutions. Momentum in the back-propagation algorithm can be helpful in speeding up convergence and avoiding local minima. The idea behind momentum is to stabilize the weight change by making non-radical revisions, using a combination of the gradient-descent term with a fraction of the previous weight change. The rate of convergence and the stability of the training algorithm can be improved if the momentum factor and the learning rate are adapted simultaneously, and the selection of the learning rate is of critical importance in finding the global minimum of the error distance.

We consider a standard multi-layer feed-forward neural network with an input layer, a hidden layer, and an output layer of neurons, in which every node in a layer is connected to every node in the adjacent forward layer. The back-propagation algorithm has been the most popular and most widely implemented algorithm for training this type of neural network. When using it to train a multilayer neural network, the designer is required to select, more or less arbitrarily, parameters such as the initial weights and biases, the learning rate, the activation function, and the momentum parameter. Improper selection of any of these parameters can result in slow convergence or even in network paralysis, where the training process comes to a virtual standstill. Another problem is that the steepest-descent technique used in the training process can easily get stuck at local minima. In recent years, a number of research studies have attempted to overcome these problems by developing heuristic techniques, based on studies of the properties of the conventional back-propagation algorithm, such as varying the learning rate and the momentum parameter. The momentum can be helpful in speeding up convergence and avoiding local minima.

4 LITERATURE REVIEW

Artificial Neural Network (ANN) is an efficient information-processing system that resembles biological neural networks in its characteristics. Architectures and algorithms for ANNs have developed rapidly; the McCulloch-Pitts neuron model, proposed in 1943, was the earliest NN. Every NN architecture is affected by parameters such as the initial weights and biases, the learning rate, the activation function, and the momentum parameter. The objective of an NN is to provide a computational device that models the brain and performs various computational tasks faster than other systems.

The training of the BPN is done in three stages: feed-forward of the input, error calculation, and weight update. In brief, the algorithm is:

Step 1: initialize the parameters.
Step 2: find the output, $Y = F(Y_{in})$.
Step 3: compute the error term, $e = (t - Y)\,F'(Y_{in})$.
Step 4: update the weights, $W_{jk}(\mathrm{new}) = W_{jk}(\mathrm{old}) + \Delta W_{jk}$, with $\Delta W_{jk} = \eta\, e\, Z_j$, where $\eta$ is the learning rate and $Z_j$ is the output of hidden neuron $j$ under the sigmoid activation.

Many researchers have devoted their efforts to developing speed-up techniques. Momentum is a standard technique used to speed up convergence and maintain generalization performance, and it is added to the BPA in order to reduce oscillation. Applied to back propagation, the concept of momentum is that previous changes in the weights should influence the current direction of movement in weight space. This concept is implemented by the revised weight-update rule

$$W_{jk}(\mathrm{new}) = W_{jk}(\mathrm{old}) + \Delta W_{jk}(t+1), \qquad \Delta W_{jk}(t+1) = \eta\, e\, Z_j + \mu\,[W_{jk}(t) - W_{jk}(t-1)],$$

where $\eta$ is the learning rate, $\mu$ is the momentum factor, and $Z_j$ is the output of hidden neuron $j$ under the sigmoid activation.
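As a concrete illustration of Steps 1-4 and of the revised update rule, the following minimal Python sketch trains the weights between a hidden layer and a single sigmoid output neuron. It is a sketch of the rule as stated above, not the authors' code: the toy data, the network size, and the parameter values (the learning rate 0.25 and momentum 0.9 quoted earlier as BrainWave defaults) are illustrative assumptions.

import numpy as np

# eta (learning rate) and mu (momentum factor) both lie in (0, 1);
# 0.25 and 0.9 are the BrainWave defaults quoted in Section 3.
eta, mu = 0.25, 0.9

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
Z = rng.random((100, 4))        # hidden-layer outputs Z_j (toy data, assumed)
t = rng.random(100)             # desired outputs

w = rng.uniform(-0.5, 0.5, 4)   # Step 1: initialize the parameters
dw_prev = np.zeros_like(w)      # previous weight change, Delta W_jk(t)

for epoch in range(50):
    for z, target in zip(Z, t):
        y = sigmoid(z @ w)                  # Step 2: output Y = F(Y_in)
        e = (target - y) * y * (1.0 - y)    # Step 3: error term, (t - Y) F'(Y_in)
        dw = eta * e * z + mu * dw_prev     # Step 4: gradient term + momentum term
        w += dw                             # W_jk(new) = W_jk(old) + Delta W_jk
        dw_prev = dw

Setting mu = 0.0 recovers plain BPA, so the same loop reproduces the comparison between the two variants studied in this paper.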

5 ANALYSIS OF VARIATION OF PARAMETERS


Researchers have developed alternative learning algorithms by employing better energy functions, by choosing the learning rate and momentum dynamically, or by employing rules other than gradient descent. The best choice of learning rate is problem dependent and may need some trial and error before a good choice is found; the value of the learning rate can be made a function of the error derivative on consecutive updates. The mean squared error (MSE) converges after many epochs, and the number of epochs required varies with the momentum and the learning rate. The BPA is modified by adding various parameters to the existing algorithm so as to increase the overall efficiency of the algorithm and the speed of convergence. The purpose of momentum is to keep the weight-change process moving and thereby avoid getting stuck in local minima; in some cases it also makes convergence faster and training more stable. Dynamically changing the learning rate and the momentum factor gives better performance, and small differences in these parameters can lead to large differences in training time.
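The passage above does not fix a particular adaptation rule. One simple heuristic consistent with it, making the learning rate a function of the error on consecutive updates (sometimes called the "bold driver" scheme), is sketched below; the growth and shrink factors are illustrative assumptions, not values from this paper.

# Hedged sketch of dynamic learning-rate/momentum adaptation:
# grow eta while the error keeps falling; shrink it and suppress the
# momentum term for one step when the error rises. The factors 1.05
# and 0.7 are illustrative, not taken from this paper.
def adapt(eta, mu, err, prev_err, grow=1.05, shrink=0.7):
    if err < prev_err:             # error fell on consecutive updates
        return eta * grow, mu      # cautiously speed up learning
    return eta * shrink, 0.0       # error rose: back off, cancel momentum

A training loop would call adapt once per epoch with the current and previous MSE and use the returned eta and mu for the next pass.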





Figure 1: Structure of BPN

The number of training cycles depends on the learning rate and the momentum, so it is necessary to choose the most suitable values for these parameters. By changing these values we searched for the values most suitable for learning [4]. As a result, the minimum number of training cycles behaves according to a constant rule, $\eta = K(1 - \mu)$, where the constant $K$ is decided by the ratio between the number of output units and hidden units. In our problem we fed input values to the BPA training network, compared the network's output with the desired values, and from this obtained the error values; after updating the weights, the resulting error values were determined. These values are plotted in the figures below.
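Reading the constant rule above as $\eta = K(1 - \mu)$ (the symbols are garbled in the source, so this reconstruction is an assumption), the search over candidate values can be organized as in the sketch below. Here train_mse stands for a single training run that returns the final mean squared error; it is a hypothetical helper, not a function defined in this paper.

# Sketch of the parameter search described above, assuming the constant
# rule is eta = K * (1 - mu). K and the mu grid are illustrative; in the
# text, K depends on the ratio of output units to hidden units.
K = 0.5
candidates = [(K * (1.0 - mu), mu) for mu in (0.1, 0.3, 0.5, 0.7, 0.9)]
best_eta, best_mu = min(candidates,
                        key=lambda p: train_mse(eta=p[0], mu=p[1]))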
Figure 3: Error variation in BPA (MSE versus learning rate)

Figure 4: Error variation in BPA with momentum (MSE versus momentum factor)

Figure 2: Error variation in BPA without momentum (MSE versus learning rate)

From the graph above (Figure 2) we see that the error does not decrease very much, so, as Figure 3 shows, we added a momentum factor to the training and varied the learning rate; the error is found to be reduced, and the convergence is also fast. Figure 4 is obtained by changing the momentum factor and the learning rate, and Figure 5 by increasing the number of iterations. From these we identified that the error is reduced.




Figure 5: Error variation in BPA with respect to epoch (MSE versus epoch)

6 CONCLUSION

This paper concludes that the learning rate and the momentum factor affect the performance of artificial neural networks. It is shown that varying the momentum and the learning rate improves convergence, shortens the training time, and minimizes the error, and that increasing the number of epochs further reduces the mean squared error. The selection of these parameters is of critical importance in finding the global minimum of the error distance. The results show the relationship between parameter variation and error.

REFERENCES

[1] Hamid Beigy, "Back Propagation Algorithm Adaptation Parameters using Learning Automata", International Journal of Neural Systems, Vol. 11, No. 3, pp. 219-228, World Scientific Publishing Company, 2001.
[2] Ernest Istook, Tony Martinez, "Improved Back propagation learning in neural networks with windowed momentum", International Journal of Neural Systems, Vol. 12, No. 3&4, pp. 303-318.
[3] Chien-Cheng Yu, Bin-Da Liu, "A Back propagation algorithm with adaptive learning rate and momentum coefficient", IEEE, 2003.
[4] Nazri Mohd Nawi, "The effect of gain variation in improving learning speed of back propagation neural network algorithm on classification problems", Symposium on Progress in Information & Communication Technology, 2009.
[5] Jun Xie, Peng Lin, Huizhen Liang, Minghui L., "The Improved Rapid Convergence Algorithm of the Connecting Rights in the BP Network", 2008 International Symposium on Knowledge Acquisition and Modeling.
[6] Qamar Abbas, "Analysis of Learning Rate using BP Algorithm for Hand Written Digit Recognition Application", IEEE, 2010.
[7] Chaoju Hu, Fen Zhao, "Improved Methods of BP Neural Network Algorithm and its Limitation", 2010 International Forum on Information Technology and Applications.
[8] Qamar Abbas, Jamil Ahmad, Waqas Haider Bangya, "Momentum term heals the performance of Back propagation Algorithm for Digit Recognition", 2010 6th International Conference on Emerging Technologies (ICET).
[9] S.N. Sivanandan, S.N. Deepa, Principles of Soft Computing, Wiley India Ltd, First Edition, 2007.
[10] Dai Bibo, "Research of Test Model of Innovative Enterprise Developments Inner Power Based on BP Neural Network", 2011.
[11] Nazri Mohd Nawi, "The effect of gain variation in improving learning speed of BPA", Symposium on Progress in Information & Communication Technology, 2009.