
Neural Network

The term neural network was traditionally used to refer to a network or circuit of biological neurons.

[1] The modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes. Thus the term may refer to either biological neural networks, which are made up of real biological neurons, or artificial neural networks, which are used for solving artificial intelligence problems. Unlike von Neumann model computations, artificial neural networks do not separate memory and processing; they operate via the flow of signals through the network connections, somewhat akin to biological networks. These artificial networks may be used for predictive modeling, adaptive control and other applications where they can be trained via a dataset.

Features of Neural Networks: the good news is that they exhibit some brain-like behaviours that are difficult to program directly, such as learning, association, categorization, generalization, feature extraction, optimization, noise immunity, input-output mapping and nonlinearity. (Nonlinearity is a highly important property, particularly if the underlying physical mechanism responsible for generation of the input signal is inherently nonlinear.)

The Structure of Biological Neural Networks

Most living creatures, which have the ability to adapt to a changing environment, need a controlling unit which is able to learn. Higher developed animals and humans use very complex networks of highly specialized neurons to perform this task. The control unit, or brain, can be divided into different anatomic and functional subunits, each having certain tasks like vision, hearing, motor and sensory control. The brain is connected by nerves to the sensors and actuators in the rest of the body. The brain consists of a very large number of neurons, about 10^11 on average. These can be seen as the basic building bricks for the central nervous system (CNS). The neurons are interconnected at points called synapses. The complexity of the brain is due to the massive number of highly interconnected simple units working in parallel, with an individual neuron receiving input from up to 10,000 others. The neuron contains all the structures of an animal cell. The complexity of the structure and of the processes in a simple cell is enormous. Even the most sophisticated neuron models in artificial neural networks seem comparatively toy-like. Structurally, the neuron can be divided into three major parts: the cell body (soma), the dendrites, and the axon; see Figure 1.1 for an illustration.

Figure 1.1: Simplified Biological Neurons.

The cell body contains the organelles of the neuron, and the dendrites also originate there. These are thin and widely branching fibers, reaching out in different directions to make connections to a larger number of cells within the cluster. Input connections are made from the axons of other cells to the dendrites or directly to the body of the cell. These are known as axodendritic and axosomatic synapses. There is only one axon per neuron. It is a single, long fiber, which transports the output signal of the cell as electrical impulses (action potentials) along its length. The end of the axon may divide into many branches, which are then connected to other cells. The branches have the function of fanning the signal out to many other inputs.

There are many different types of neuron cells found in the nervous system. The differences are due to their location and function. The neurons perform basically the following function: all the inputs to the cell, which may vary according to the strength of the connection or the frequency of the incoming signal, are summed up. The input sum is processed by a threshold function, which produces an output signal. The processing time of about 1 ms per cycle and the transmission speed of the neurons of about 0.6 to 120 m/s are comparatively slow next to a modern computer. The brain works in both a parallel and a serial way. The parallel and serial nature of the brain is readily apparent from the physical anatomy of the nervous system. That there is serial and parallel processing involved can also be seen from the time needed to perform tasks. Biological neural systems usually have a very high fault tolerance. Experiments with people with brain injuries have shown that damage to neurons up to a certain level does not necessarily influence the performance of the system, though tasks such as writing or speaking may have to be learned again. This can be regarded as re-training the network. In the following work no particular brain part or function will be modeled. Rather, the fundamental brain characteristics of parallelism and fault tolerance will be applied.

Architecture of Artificial Neural Networks


4.1 Feed-forward networks

Feed-forward ANNs (figure 4.1) allow signals to travel one way only: from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up or top-down.

4.2 Feedback networks

Feedback networks (figure 4.2) can have signals travelling in both directions by introducing loops in the network. Feedback networks are very powerful and can become extremely complicated. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations.

Figure 4.1 An example of a simple feedforward network

Figure 4.2 An example of a complicated network
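To make the idea of settling to an equilibrium point concrete, here is a small illustrative Python sketch (not taken from the original text); the weight matrix, constant input and tanh update rule are assumptions chosen for the example:

import numpy as np

def activation(x):
    # bounded squashing function keeps the state from blowing up
    return np.tanh(x)

# assumed feedback (recurrent) weights: every unit feeds back to the others
W = np.array([[0.0, 0.4, -0.2],
              [0.3, 0.0,  0.5],
              [-0.1, 0.2, 0.0]])
external_input = np.array([0.5, -0.3, 0.2])   # assumed constant input

state = np.zeros(3)
for step in range(100):
    new_state = activation(W @ state + external_input)
    if np.allclose(new_state, state, atol=1e-6):
        break                                  # equilibrium point reached
    state = new_state

print(step, state)   # the state no longer changes once equilibrium is found

With these small weights the loop stops after a handful of iterations; if the external input were changed, the same loop would settle into a new equilibrium.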

4.3 Network layers

The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units (see Figure 4.1). The activity of the input units represents the raw information that is fed into the network. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units. The behaviour of the output units depends on the activity of the hidden units and the weights between the hidden and output units. This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.
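As a rough illustration of this layered computation (an illustrative sketch, not part of the original text), the following Python fragment computes hidden and output activities as weighted sums passed through a squashing function; the layer sizes, weights and input values are arbitrary assumptions:

import numpy as np

def sigmoid(x):
    # squashes a weighted sum into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# assumed sizes: 3 input units, 4 hidden units, 2 output units
rng = np.random.default_rng(0)
W_ih = rng.normal(size=(3, 4))   # weights from input to hidden units
W_ho = rng.normal(size=(4, 2))   # weights from hidden to output units

x = np.array([0.5, -1.0, 2.0])   # raw information fed into the input units

hidden = sigmoid(x @ W_ih)       # activity of each hidden unit
output = sigmoid(hidden @ W_ho)  # behaviour of the output units
print(hidden, output)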

We also distinguish single-layer and multi-layer architectures. The single-layer organisation, in which all units are connected to one another, constitutes the most general case and has more potential computational power than hierarchically structured multi-layer organisations. In multi-layer networks, units are often numbered by layer, instead of following a global numbering.

4.4 Perceptrons

The most influential work on neural nets in the 1960s went under the heading of 'perceptrons', a term coined by Frank Rosenblatt. The perceptron (figure 4.4) turns out to be an MCP model (a neuron with weighted inputs) with some additional, fixed, pre-processing. Units labelled A1, A2, Aj, Ap are called association units and their task is to extract specific, localised features from the input images. Perceptrons mimic the basic idea behind the mammalian visual system. They were mainly used in pattern recognition, even though their capabilities extended a lot further.

Figure 4.4

In 1969 Minsky and Papert wrote a book in which they described the limitations of single-layer perceptrons. The impact the book had was tremendous and caused a lot of neural network researchers to lose interest. The book was very well written and showed mathematically that single-layer perceptrons could not do some basic pattern recognition operations like determining the parity of a shape or determining whether a shape is connected. What they did not realise, until the 80's, is that given appropriate training, multilevel perceptrons can perform these operations.
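To make the parity limitation concrete, here is a small illustrative Python sketch (not from the original text): no single threshold unit can compute two-bit parity (XOR), but a hand-wired two-layer arrangement of threshold units can. The particular weights and thresholds below are one well-known choice, assumed for the example:

def step(x, threshold):
    # simple threshold (hard-limit) unit
    return 1 if x >= threshold else 0

def two_layer_xor(x1, x2):
    # hidden unit 1 fires if at least one input is on (OR)
    h1 = step(x1 + x2, 0.5)
    # hidden unit 2 fires only if both inputs are on (AND)
    h2 = step(x1 + x2, 1.5)
    # output fires if OR is on but AND is off, i.e. the parity of the two bits
    return step(h1 - h2, 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, two_layer_xor(a, b))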

5. The Learning Process

The memorisation of patterns and the subsequent response of the network can be categorised into two general paradigms:

Associative mapping, in which the network learns to produce a particular pattern on the set of output units whenever another particular pattern is applied on the set of input units. Associative mapping can generally be broken down into two mechanisms:

auto-association: an input pattern is associated with itself and the states of input and output units coincide. This is used to provide pattern completion, i.e. to produce a pattern whenever a portion of it or a distorted pattern is presented.

hetero-association: here the network actually stores pairs of patterns, building an association between two sets of patterns. It is related to two recall mechanisms: nearest-neighbour recall, where the output pattern produced corresponds to the stored input pattern closest to the pattern presented, and interpolative recall, where the output pattern is a similarity-dependent interpolation of the stored patterns corresponding to the pattern presented. Yet another paradigm, which is a variant of associative mapping, is classification, i.e. when there is a fixed set of categories into which the input patterns are to be classified.
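As a rough illustration of nearest-neighbour recall (an illustrative Python sketch, not part of the original text), the stored associations are represented here simply as arrays, and recall returns the output paired with the stored input closest to the presented pattern; the pattern values are assumptions:

import numpy as np

# assumed stored associations: pairs of (input pattern, output pattern)
stored_inputs  = np.array([[1.0, 0.0, 1.0],
                           [0.0, 1.0, 1.0]])
stored_outputs = np.array([[1.0, 0.0],
                           [0.0, 1.0]])

def nearest_neighbour_recall(pattern):
    # find the stored input closest (Euclidean distance) to the presented pattern
    distances = np.linalg.norm(stored_inputs - pattern, axis=1)
    return stored_outputs[np.argmin(distances)]

# a distorted version of the first stored pattern still recalls its output
print(nearest_neighbour_recall(np.array([0.9, 0.1, 0.8])))  # -> [1. 0.]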

Regularity detection, in which units learn to respond to particular properties of the input patterns. Whereas in associative mapping the network stores the relationships among patterns, in regularity detection the response of each unit has a particular 'meaning'. This type of learning mechanism is essential for feature discovery and knowledge representation. Every neural network possesses knowledge which is contained in the values of the connection weights. Modifying the knowledge stored in the network as a function of experience implies a learning rule for changing the values of the weights.

Information is stored in the weight matrix W of a neural network. Learning is the determination of the weights. Following the way learning is performed, we can distinguish two major categories of neural networks:

fixed networks, in which the weights cannot be changed, i.e. dW/dt = 0. In such networks, the weights are fixed a priori according to the problem to solve.

adaptive networks, which are able to change their weights, i.e. dW/dt ≠ 0.
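For adaptive networks, a minimal sketch of such a learning rule (assuming a single linear unit trained with the delta rule, which is mentioned again in the backpropagation section; the data and learning rate are arbitrary assumptions, and this is only one of many possible rules):

import numpy as np

# assumed toy data: learn y = 2*x1 - x2 from examples
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([2.0, -1.0, 1.0, 3.0])

w = np.zeros(2)          # adaptive weights, dW/dt != 0
learning_rate = 0.1

for epoch in range(200):
    for x_i, t in zip(X, y):
        o = w @ x_i                          # output of the single linear unit
        w += learning_rate * (t - o) * x_i   # delta rule: adjust weights by the error
print(w)  # approaches [2, -1]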

Single-layer Neural Networks (Perceptrons)


Input is multi-dimensional (i.e. the input can be a vector): input x = (I1, I2, ..., In). Input nodes (or units) are connected (typically fully) to a node (or multiple nodes) in the next layer. A node in the next layer takes a weighted sum of all its inputs:

Summed input = w1 I1 + w2 I2 + ... + wn In

Example

input x = ( I1, I2, I3) = ( 5, 3.2, 0.1 ).

Summed input = w1 I1 + w2 I2 + w3 I3 = 5 w1 + 3.2 w2 + 0.1 w3

The rule: the output node has a threshold t. If the summed input is greater than or equal to t, the node fires and outputs 1; otherwise it outputs 0.
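A minimal Python sketch of this computation and rule, using the example input above; the weight and threshold values are arbitrary assumptions chosen for illustration:

# example input from the text: x = (I1, I2, I3) = (5, 3.2, 0.1)
inputs = [5.0, 3.2, 0.1]
weights = [0.5, -0.2, 1.0]   # assumed weights w1, w2, w3
threshold = 1.0              # assumed threshold t

# weighted sum: 5*w1 + 3.2*w2 + 0.1*w3
summed_input = sum(w * i for w, i in zip(weights, inputs))

# threshold rule: fire (output 1) if the summed input reaches the threshold
output = 1 if summed_input >= threshold else 0
print(summed_input, output)  # 1.96, 1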

Multilayer Feedforward Neural Networks


A multilayer feedforward neural network is an interconnection of perceptrons in which data and calculations flow in a single direction, from the input data to the outputs. The number of layers in a neural network is the number of layers of perceptrons. The simplest neural network is one with a single input layer and an output layer of perceptrons. The network in Figure 13-7 illustrates this type of network. Technically, this is referred to as a one-layer feedforward network with two outputs because the output layer is the only layer with an activation calculation.

Figure 13-7: A Single-Layer Feedforward Neural Net

In this single-layer feedforward neural network, the network's inputs are directly connected to the output layer perceptrons, Z1 and Z2. The output perceptrons use activation functions, g1 and g2, to produce the outputs Y1 and Y2: each output is its activation function applied to a weighted sum of the inputs, i.e. Y1 = g1(w11 X1 + w21 X2 + ... + wn1 Xn) and Y2 = g2(w12 X1 + w22 X2 + ... + wn2 Xn), where X1, ..., Xn are the network inputs and wij are the connection weights. When the activation functions g1 and g2 are identity activation functions, the single-layer neural network is equivalent to a linear regression model. Similarly, if g1 and g2 are logistic activation functions, then the single-layer neural network is equivalent to logistic regression. Because of this correspondence between single-layer neural networks and linear and logistic regression, single-layer neural networks are rarely used in place of linear and logistic regression.

The next most complicated neural network is one with two layers. This extra layer is referred to as a hidden layer. In general there is no restriction on the number of hidden layers. However, it has been shown mathematically that a two-layer neural network can accurately reproduce any differentiable function, provided the number of perceptrons in the hidden layer is unlimited. However, increasing the number of perceptrons increases the number of weights that must be estimated in the network, which in turn increases the execution time for the network. Instead of increasing the number of perceptrons in the hidden layers to improve accuracy, it is sometimes better to add additional hidden layers, which typically reduces both the total number of network weights and the computational time. However, in practice, it is uncommon to see neural networks with more than two or three hidden layers.
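The correspondence with regression can be sketched in a few lines of Python (an illustrative sketch, not from the original text; the weights and inputs are arbitrary assumptions): with identity activations each output is a plain weighted sum of the inputs, and with logistic activations it is the same weighted sum squashed through the logistic function.

import numpy as np

def identity(a):
    return a

def logistic(a):
    return 1.0 / (1.0 + np.exp(-a))

x = np.array([1.0, 2.0, 3.0])        # assumed network inputs
W = np.array([[0.2, -0.5],
              [0.1,  0.4],
              [-0.3, 0.7]])          # assumed weights to output perceptrons Z1, Z2

net_input = x @ W                    # weighted sums feeding Z1 and Z2

print(identity(net_input))           # identity g1, g2: a linear regression prediction
print(logistic(net_input))           # logistic g1, g2: a logistic regression prediction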

Different Algorithms in Neural Networks

Many advanced algorithms have been invented since the first simple neural network. Some algorithms are based on the same assumptions or learning techniques as the SLP (single-layer perceptron) and the MLP (multi-layer perceptron). A very different approach, however, was taken by Kohonen in his research on self-organising networks.

Backpropagation algorithm:
Backpropagation, an abbreviation for "backward propagation of errors", is a common method of training artificial neural networks. From a desired output, the network learns from many inputs, similar to the way a child learns to identify a dog from examples of dogs. Arthur E. Bryson and Yu-Chi Ho described it as a multi-stage dynamic system optimization method in 1969 [1][2]. It wasn't until 1974 and later, when it was applied in the context of neural networks through the work of Paul Werbos, David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams [3][4][5], that it gained recognition, and it led to a renaissance in the field of artificial neural network research. It is a supervised learning method and a generalization of the delta rule. It requires a dataset of the desired output for many inputs, making up the training set. It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop). Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable.

initialize the weights in the network (often small random values)
do
    for each example e in the training set
        O = neural-net-output(network, e)   // forward pass
        T = teacher output for e
        compute error (T - O) at the output units
        compute delta_wh for all weights from hidden layer to output layer   // backward pass
        compute delta_wi for all weights from input layer to hidden layer    // backward pass continued
        update the weights in the network
until all examples classified correctly or stopping criterion satisfied
return the network
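The pseudocode above can be fleshed out into a small illustrative Python implementation (a minimal sketch assuming one hidden layer, sigmoid activations and a squared-error objective; the layer sizes, learning rate and toy training set are arbitrary assumptions, not details from the original text). The training set is the two-bit parity task discussed earlier:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# assumed toy training set: two-bit parity (XOR)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # input  -> hidden weights and biases
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # hidden -> output weights and biases
lr = 0.5                                         # assumed learning rate

for epoch in range(20000):
    # forward pass
    H = sigmoid(X @ W1 + b1)     # hidden activations
    O = sigmoid(H @ W2 + b2)     # network outputs

    # error (T - O) at the output units
    error = T - O

    # backward pass: local gradients, using the sigmoid derivative a * (1 - a)
    delta_out = error * O * (1 - O)
    delta_hidden = (delta_out @ W2.T) * H * (1 - H)

    # update the weights in the network
    W2 += lr * H.T @ delta_out;     b2 += lr * delta_out.sum(axis=0)
    W1 += lr * X.T @ delta_hidden;  b1 += lr * delta_hidden.sum(axis=0)

print(np.round(O, 2))  # outputs move toward [[0], [1], [1], [0]] as training converges

In practice the loop would also check a stopping criterion, such as the total error falling below some tolerance, as in the pseudocode above.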

Kohonen Algorithm: The Kohonen self-organising networks have a two-layer topology. The first layer is the input layer; the second layer is itself a network arranged in a plane. Every unit in the input layer is connected to all the nodes in the grid in the second layer. Furthermore, the units in the grid function as the output nodes.

The nodes in the grid are only sparsely connected: in a rectangular grid, each node has four immediate neighbours.
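A rough sketch of how such a self-organising grid is commonly trained (an illustration under assumed details of the Kohonen learning rule, not a description taken from this text): each input is compared with every grid node's weight vector, the closest node wins, and the winner and its four immediate neighbours are nudged toward the input.

import numpy as np

rng = np.random.default_rng(0)

grid_h, grid_w, dim = 4, 4, 3                # assumed 4x4 output grid, 3-dimensional inputs
weights = rng.random((grid_h, grid_w, dim))  # one weight vector per grid node
lr = 0.1                                     # assumed learning rate

def train_step(x):
    # find the winning node: the grid unit whose weights are closest to the input
    dists = np.linalg.norm(weights - x, axis=2)
    wi, wj = np.unravel_index(np.argmin(dists), dists.shape)

    # move the winner and its four immediate neighbours toward the input
    for di, dj in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:
        i, j = wi + di, wj + dj
        if 0 <= i < grid_h and 0 <= j < grid_w:
            weights[i, j] += lr * (x - weights[i, j])

for _ in range(1000):
    train_step(rng.random(dim))              # assumed random training inputs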
Hopfield Nets: The Hopfield net is a fully connected, symmetrically weighted network where each node functions both as an input and an output node. The idea is that, depending on the weights, some states are unstable and the net will iterate a number of times until it settles into a stable state.
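A minimal sketch of this settling behaviour (illustrative only; the stored pattern, Hebbian-style weights and synchronous update schedule are assumptions): starting from a distorted state, each node repeatedly thresholds its weighted input until the state stops changing.

import numpy as np

# assumed stored pattern of +1/-1 states, one per node
pattern = np.array([1, -1, 1, -1, 1, -1])

# symmetric Hebbian weights, no self-connections
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0.0)

# start from a distorted (unstable) state and iterate until it stops changing
state = np.array([1, 1, 1, -1, 1, -1])
for _ in range(10):
    new_state = np.where(W @ state >= 0, 1, -1)   # each node thresholds its weighted input
    if np.array_equal(new_state, state):
        break                                     # reached a stable state
    state = new_state

print(state)  # settles into the stored pattern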

Different functions

The behaviour of an ANN (Artificial Neural Network) depends on both the weights and the input-output function (transfer function) that is specified for the units. This function typically falls into one of three categories: linear (or ramp), threshold (such as the hard-limit function hardlim described below), or sigmoid.
Hardlim functions: hardlim is a neural transfer function in MATLAB's Neural Network Toolbox. Transfer functions calculate a layer's output from its net input.

A = hardlim(N,FP) takes N and optional function parameters:
N - S-by-Q matrix of net input (column) vectors
FP - struct of function parameters (ignored)

and returns A, the S-by-Q Boolean matrix with 1s where N >= 0.

info = hardlim('code') returns information according to the code string specified:
hardlim('name') returns the name of this function.
hardlim('output',FP) returns the [min max] output range.
hardlim('active',FP) returns the [min max] active input range.
hardlim('fullderiv') returns 1 or 0, depending on whether dA_dN is S-by-S-by-Q or S-by-Q.
hardlim('fpnames') returns the names of the function parameters.

hardlim('fpdefaults') returns the default function parameters.

Examples

Here is how to create a plot of the hardlim transfer function:

n = -5:0.1:5;
a = hardlim(n);
plot(n,a)

Assign this transfer function to layer i of a network:

net.layers{i}.transferFcn = 'hardlim';

Sigmoid function:
A sigmoid function is a mathematical function having an "S"-shaped curve (sigmoid curve). Often, 'sigmoid function' refers to the special case of the logistic function, defined by the formula

S(t) = 1 / (1 + e^(-t))

A sigmoid function is a bounded, differentiable, real function that is defined for all real input values and has a positive derivative everywhere.
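As a small illustration (a sketch, not from the original text), the logistic sigmoid and its bounded, everywhere-positive derivative can be computed directly:

import math

def sigmoid(t):
    # logistic function: S(t) = 1 / (1 + e^(-t)), bounded between 0 and 1
    return 1.0 / (1.0 + math.exp(-t))

def sigmoid_derivative(t):
    # derivative S'(t) = S(t) * (1 - S(t)), positive for every real t
    s = sigmoid(t)
    return s * (1.0 - s)

for t in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(t, round(sigmoid(t), 4), round(sigmoid_derivative(t), 4))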
