
Chapter 6

Lab 6: MLP and Backpropagation


6.1 Goals

At the end of this week's laboratory you should:

1. Understand the basic principles of multi-layer perceptrons (MLP)
2. Understand the basic principles of backpropagation of errors (BP) training
3. Understand how to use MLP and BP to solve a problem in MATLAB

6.2 Information Contained in this Laboratory

1. Multi-Layer Perceptrons (MLP)
2. Creating feed-forward networks in MATLAB
3. Backpropagation training of feed-forward networks in MATLAB
4. Recalling and evaluating feed-forward networks in MATLAB

6.3 Introduction

Multi-layer perceptrons (MLP) are artificial neural networks of the general class known as feed-forward networks. They consist of layers of artificial neurons (neuron layers) connected together by layers of weighted connections (connection layers). MLP have at least three layers of neurons: the input layer, where the input signal is presented; the hidden, or intermediate, layer; and the output layer, which represents the output signal of the network. They are known as feed-forward networks because the input signal travels only forward through the neuron layers, from input to output, never backwards. MLP are most commonly trained by some variation of the backpropagation of errors (backprop) algorithm, which adjusts the weightings of the connections in response to the output errors of the network. See Appendix D for more information on MLP and backprop training.
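To make the feed-forward idea concrete, here is a minimal sketch of a single forward pass through a 4-10-3 MLP (the architecture used later in this lab), written in plain MATLAB rather than with the Neural Network Toolbox. The weight matrices W1 and W2 and bias vectors b1 and b2 are illustrative names, initialised randomly here:

sig = @(x) 1 ./ (1 + exp(-x));    % logistic sigmoid activation function

W1 = rand(10, 4);  b1 = rand(10, 1);   % input-to-hidden weights and biases
W2 = rand(3, 10);  b2 = rand(3, 1);    % hidden-to-output weights and biases

x      = rand(4, 1);                   % one input example (a column vector)
hidden = sig(W1 * x + b1);             % hidden layer activations
out    = sig(W2 * hidden + b2);        % output layer activations (one per class)

Note how the signal passes through each connection layer in turn, from input to output, with no backward paths.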


6.4 Data Preparation

To explore the MLP functions in MATLAB, we will be using the venerable iris classification data set. Download the file iris_data.dat from the Additional Material page for this laboratory. Load the file into MATLAB with the following command:

Iris = load('iris_data.dat');

This is the complete iris data set, with 50 examples of each class. This needs to be divided into training and testing data sets. Also, the input features and output classes need to be separated from one another. Training and testing sets are usually split in a roughly 3:1 ratio. For this data set, that means 35 examples of each class for the training set and 15 examples of each class for the testing set. This is done with the following commands. Firstly, we extract the training data into a matrix called TrainSet:

TrainSet = [Iris(1:35,:); Iris(51:85,:); Iris(101:135,:)];

Since we know which rows correspond to which class, we can simply drop the final column of the data:

TrainSet = TrainSet(:,1:4);

This is repeated for the testing data set with the following commands:

TestSet = [Iris(36:50,:); Iris(86:100,:); Iris(136:150,:)];
TestSet = TestSet(:,1:4);

We then need to create the output examples for the training set. Firstly, create a matrix of zeros, as follows:

TrainOut = zeros(105, 3);

Then, set the first column of the first 35 rows to one (this represents the first class):

TrainOut(1:35,1) = 1;

Set the second column of the second 35 rows, and the third column of the last 35 rows, to one also. This is done with the following commands:

TrainOut(36:70,2) = 1;
TrainOut(71:105,3) = 1;

This will associate the output vector 1 0 0 with the first class (Iris Setosa), 0 1 0 with the second class (Iris Virginica) and 0 0 1 with the third class (Iris Versicolour). Due to MATLAB's insistence on swapping rows and columns from what some would consider the logical arrangement, you need to transpose each matrix before continuing, as follows:

TrainSet = TrainSet';
TrainOut = TrainOut';
TestSet = TestSet';
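As a quick sanity check before moving on, the dimensions of the prepared matrices can be confirmed; after the transposes, examples run along the columns:

size(TrainSet)   % should be 4 x 105 (4 features, 105 training examples)
size(TrainOut)   % should be 3 x 105 (3 classes, one target column per example)
size(TestSet)    % should be 4 x 45  (4 features, 45 test examples)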


6.5 Creating a new MLP

With the data sets prepared, we are now ready to create an MLP to model the problem. This is done with the newff function, which is part of the Neural Network Toolbox. Examining the help entry for newff shows us that the function requires four arguments: the range of the input values, the architecture of the network (in terms of the number of neuron layers and the number of neurons in each layer), the activation functions used by each layer of neurons, and finally the training algorithm used. The ranges can be determined using the minmax function, as follows:

Iris = Iris(:,1:4);
Iris = Iris';
Ranges = minmax(Iris);

The next parameter is a one-row matrix specifying the number of neurons in the hidden and output neuron layers. The number of input neurons is inferred from the number of rows in the Ranges matrix. For this problem, there are three output neurons (one for each class). Ten hidden neurons should be ample for this problem. These parameters are encoded into the matrix as follows:

Arch = [10 3];

It is also necessary to specify the activation functions for each neuron layer. We will use a sigmoid function (in this case the logsig function) for the hidden and output neuron layers. These functions are specified in an array of strings (called a cell array), as follows:

ActFunc = {'logsig', 'logsig'};

The final parameter is the training method to be used on the network, which is specified as a string. In this case gradient descent learning is used, which is specified by 'traingd'. To create a new network with the above parameters, use the following command:

IrisNet = newff(Ranges, Arch, ActFunc, 'traingd');

This network can be saved as a binary .mat file as follows:

save IrisNet
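One detail worth noting: called with only a file name, save writes every variable in the workspace into IrisNet.mat. If you want the file to contain just the network object, name the variable explicitly, and restore it later with load:

save IrisNet IrisNet   % save only the IrisNet variable to IrisNet.mat
load IrisNet           % restores IrisNet into the workspace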

6.6 Training an MLP

Before the network can be trained, the training parameters must be set. The first of these is the number of training epochs to use. To set the number of epochs to 5000, we use the following command:

IrisNet.trainParam.epochs = 5000;

To set the default learning rate for the entire network to 0.5, this command is used:

IrisNet.trainParam.lr = 0.5;
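Two other trainParam fields that are often worth setting are the error goal, which stops training early once the error falls below a threshold, and the display interval. A minimal sketch, assuming the standard traingd parameter names:

IrisNet.trainParam.goal = 0.01;   % stop training once the MSE drops below 0.01
IrisNet.trainParam.show = 100;    % report progress every 100 epochs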


Training is done via the train function, which is called with three parameters: the network to train, the training data set, and the targets for each example in the training set. It will return a trained network object, and display in the command window the progress of the training, along with the Mean Squared Error (MSE) over the training set. It will also display a plot similar to that in Figure 6.1, which shows the MSE versus the epoch number (the exact plot will probably be different for different networks, because of the different starting connection weights). The training command is as follows:

IrisNet5000 = train(IrisNet, TrainSet, TrainOut);

Figure 6.1: MATLAB training plot

It is also possible to set the learning parameters for each layer of connections separately. To set the learning rate of the hidden-to-output connection layer of IrisNet to 0.3, the following command is used:

IrisNet.layerWeights{2,1}.learnParam.lr = 0.3;

You can even change the learning method used for each layer of connections: to use backpropagation with momentum for the above connection layer, use this command:

IrisNet.layerWeights{2,1}.learnFcn = 'learngdm';

This will require setting the momentum parameter of the layer, as follows:

IrisNet.layerWeights{2,1}.learnParam.mc = 0.5;


Note that you can change the settings for the entire network by modifying the fields of the IrisNet.trainParam structure. The initial training settings, however, should be good enough for this problem. For more information on the train function, type help train.

6.7 Recalling an MLP

The network is recalled with the sim function. This takes two arguments: the network to be recalled, and the data to recall the network with. The function returns a matrix of the activation values of each output neuron. Each row in the returned matrix corresponds to one output neuron, and the number of columns is equal to the number of examples (columns) in the input matrix. To recall the trained MLP with the test data, the following command is used:

out = sim(IrisNet5000, TestSet);

Since the IrisNet network has three output neurons (one for each class) and there are 45 examples in the test set matrix, the resulting matrix out will have three rows and 45 columns.
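The network can also be recalled with a single example: passing one column of the test set returns the three output activations for just that example. For instance (firstExample is an illustrative name):

firstExample = TestSet(:,1);      % one test example (a 4 x 1 column)
sim(IrisNet5000, firstExample)    % returns a 3 x 1 vector of output activations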

6.7.1 Analysing the Outputs

How well has the network learned? This question can also be stated as "how well does the network classify each example in the test set?". Since each output neuron corresponds to a separate class, the class the network thinks the example belongs to will be the class that corresponds to the most highly activated (winning) output neuron. Since we know that the first fifteen examples are of the first class, the second fifteen of the second class, and the third fifteen of the third class, the easiest way of examining the accuracy of the trained MLP is by plotting the output value of each output neuron for each example. For the first fifteen examples, the first output neuron should be the most highly activated, and so on for the other two. Plotting the output values for each output neuron is done with the following commands:

out1 = out(1,:);
out2 = out(2,:);
out3 = out(3,:);
plot(out1,'+');
hold on
plot(out2,'o');
plot(out3,'x');

This will produce a plot similar to Figure 6.2.

Figure 6.2: Output Plot

If the network has learned to classify the iris data, then each series of values should have peaks for the examples of its target class, with all other values being close to zero. For the network that produced the outputs in Figure 6.2, this is clearly the case. Different training algorithms, training parameters and network architectures (number of hidden neurons, number of hidden layers) can have a large effect upon the performance of an MLP. Overtraining, where a network is able to recognise only the training data and cannot generalise at all, is a common problem.

Task: experiment with the different learning algorithms available in MATLAB. Which one will yield an acceptable performance the most rapidly? What effect does changing the size of the hidden neuron layer have? How does changing the number of training epochs and learning rate affect learning?
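One way to compare the alternatives quantitatively during these experiments, rather than eyeballing the plot, is to compute the classification accuracy on the test set directly: take, for each example, the index of the most highly activated output neuron and compare it against the known class labels. A minimal sketch, using the out matrix from Section 6.7 (the variable names are illustrative):

[vals, predicted] = max(out);                        % winning output neuron for each of the 45 examples
targets = [ones(1,15), 2*ones(1,15), 3*ones(1,15)];  % known classes of the test examples
accuracy = sum(predicted == targets) / 45            % fraction of test examples classified correctly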

6.8 Additional Tasks

Download the files wine.data and wine.nam from the Additional Material page for this tutorial. Create, train and evaluate an MLP to perform the wine classification problem. There are three classes described in this data file, with the class label being the first column of the data file (the numbers 1, 2 or 3 refer to the class number). The file wine.nam lists the number of examples present for each class. Preparing the data (in the file wine.data) for this problem will involve the following (a sketch of these steps appears after the list):

• replacing the comma delimiters with white space
• loading the data into MATLAB
• removing the first column (the class label) from the data


• splitting the data into training and test sets (remember that each class should be represented in both sets, with each class being present in the same proportions)
• determining the maximum and minimum of each input feature
• creating, and setting the values in, a target matrix for training
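The following is a minimal sketch of these preparation steps, assuming the commas in wine.data have already been replaced with white space. The class counts are not hard-coded; instead the rows belonging to each class are found from the label column, so the sketch works whatever counts wine.nam reports. All variable names here are illustrative:

Wine = load('wine.data');      % after replacing commas with white space
labels = Wine(:,1);            % first column holds the class label (1, 2 or 3)
features = Wine(:,2:end);      % remaining columns are the input features

rows1 = find(labels == 1);     % row indices of each class
rows2 = find(labels == 2);
rows3 = find(labels == 3);

% roughly 3:1 training/testing split within each class
n1 = round(0.75 * length(rows1));
n2 = round(0.75 * length(rows2));
n3 = round(0.75 * length(rows3));

TrainSet = [features(rows1(1:n1),:); features(rows2(1:n2),:); features(rows3(1:n3),:)]';
TestSet  = [features(rows1(n1+1:end),:); features(rows2(n2+1:end),:); features(rows3(n3+1:end),:)]';

% build the target matrix: one row per class, one column per training example
TrainOut = zeros(3, n1 + n2 + n3);
TrainOut(1, 1:n1) = 1;
TrainOut(2, n1+1:n1+n2) = 1;
TrainOut(3, n1+n2+1:end) = 1;

Ranges = minmax(TrainSet);     % maximum and minimum of each input feature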

