
Object-Oriented Artificial Neural Network with C++

SSE550 Object-Oriented Programming I
Project I (Chapters 1-5)
February 13, 2012

Samuel Bixler

Table of Contents
Introduction
Basic Perceptron Theory
NeuralNet Class
    UML Diagram
    Headers
    Interface
    Implementation

Main Program
    Headers
    Instantiation
    Control Loop

Output
    Figure 4 - Options Menu
    Figure 5 - Initialize Weights
    Figure 6 - Refresh Menu
    Figure 7 - Display Weights
    Figure 8 - Input Training Set
    Figure 9 - Train Net, Default Learning Rate
    Figure 10 - Train Net, Learning Rate 10.0
    Figure 11 - Display Weights after Training
    Figure 12 - Test Net, Boolean Inputs
    Figure 13 - Test Net, Float Inputs
    Figure 14 - Weights Plotted on Input Space
    Figure 15 - Set Activation Function
    Figure 16 - Set Learning Rate
    Figure 17 - Exit

Conclusion

Index of Topics Covered

Chapter 1 - Introduction to Computers and C++
Chapter 2 - Introduction to C++ Programming
    Compiler directives
    The main function
    Input statements
    Output statements
    Stream insertion operator
    Escape sequences
    Return statement
    Variable declarations
    Fundamental types
    Identifiers
    Memory
    Arithmetic
    Operator precedence
    Relational operators
    User-defined classes
    Creating and using objects
    Declaring data members
    Defining member functions
    Calling member functions
    Passing data as arguments
    Local variables vs. data members
    Initial values via constructor
    Separating interface from implementation
    UML class diagrams
    Data member set methods
    Constructing an algorithm in pseudocode
    Selection statements
    Repetition statements
    Assignment operators
    More control statements
    Logical operators

Chapter 3 - Introduction to Classes, Objects and Strings

Chapter 4

Chapter 5

Introduction

This project explores the application of object-oriented programming techniques to the construction of a single-neuron artificial neural network (ANN). The framework is designed to be scalable so that it can also represent more complex networks. The focus of the project was the construction of an easy-to-use NeuralNet class with member functions that perform common ANN operations. The class can be used to create and manipulate complex network architectures, which would be useful for real-world applications. This paper first addresses the theory of operation of a single-neuron ANN, then describes the implementation of the NeuralNet class, and finally presents the results of using the class to perform the logical AND operation.

Basic Perceptron Theory

Artificial neural networks are mathematical models of biological neurons that are typically used to perform functions not easily achieved with traditional algorithms. Pattern recognition and classification are two tasks for which neural networks are especially well suited.

Figure 1 - Two Input/Single Output Neural Network

The simplest example of an ANN is the Rosenblatt Perceptron, the name given to both a single-neuron ANN and the algorithm used to train it. Figure 1 is a graphical representation of the functions and data that make up the Rosenblatt Perceptron; it is the model explored with the NeuralNet class in this project. The mathematical neuron functions similarly to a biological neuron: inputs, either from the environment (the user) or from other neurons (hidden layers), are summed in the body of the neuron, and if a certain threshold, called the activation potential, is reached, the output changes. The function that maps the weighted sum of the inputs to the output(s) is called the activation function. Several functions can be used for this step, depending on the specific network architecture and the data involved. The Rosenblatt Perceptron can also be viewed mathematically as a line in the 2D "input space" that is adjusted to divide the inputs according to the class each belongs to. In the general case of n inputs, the weights represent an n-dimensional hyperplane that can perfectly classify any linearly separable sets of inputs. Unfortunately, the Rosenblatt Perceptron performs very poorly at classifying inputs that are not linearly separable, and more advanced networks and training algorithms are needed for such problems.
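The summing-and-thresholding behavior described above can be sketched as a small free function in plain C++ (the function name and the use of std::vector are illustrative, not part of the project's NeuralNet class):

```cpp
#include <vector>
#include <numeric>

// Compute a single neuron's output: hard-limit the weighted sum of inputs.
// By convention here, weights[0] is the bias weight and inputs[0] is a
// constant 1, so the bias is folded into the dot product.
float neuronOutput(const std::vector<float>& weights,
                   const std::vector<float>& inputs)
{
    float sum = std::inner_product(weights.begin(), weights.end(),
                                   inputs.begin(), 0.0f);
    return (sum >= 0.0f) ? 1.0f : 0.0f;  // threshold activation
}
```

With weights of, say, {-1.5, 1, 1}, this neuron already computes AND: only the input pair (1, 1) pushes the weighted sum above zero.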

Figure 2 - Two Input Decision Boundary

To understand exactly how the Rosenblatt Perceptron classifies inputs, it is helpful to graph the line on the input space. If an input lies above the line, it belongs to one class; if it lies below, it belongs to the other. There are several ways to train a neural network; the method this project uses is called supervised training. A training set of inputs paired with the correct outputs is shown to the perceptron, and the weights are modified according to a learning rule, discussed below.
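The learning rule itself is compact: each weight moves in proportion to the learning rate, the error (target output minus actual output), and the input that fed it. A minimal sketch in plain C++ (names are illustrative; the NeuralNet class applies the same rule through Eigen):

```cpp
#include <vector>

// One application of the perceptron learning rule:
//   w_i <- w_i + learnRate * error * x_i,  where error = target - output.
// inputs[0] is the constant bias input, so the bias weight updates too.
void updateWeights(std::vector<float>& weights,
                   const std::vector<float>& inputs,
                   float target, float output, float learnRate)
{
    float error = target - output;
    for (std::size_t i = 0; i < weights.size(); ++i)
        weights[i] += learnRate * error * inputs[i];
}
```

When the output is already correct the error is zero and the weights are left untouched, which is exactly the stopping condition the training loop checks for.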

NeuralNet Class

The NeuralNet class is composed of a set of private data members that store the architecture parameters required to initialize and train the network. The class contains methods to initialize, train, and test the network, several mutators, and a function required by the learning algorithm. A UML diagram of the NeuralNet class is shown below.

NeuralNet
---------------------------------------
-numInput: int
-numHidden: int
-numOutput: int
-numTrainSets: int
-activationSelect: int
-learnRate: float
-sigmoidCoef: float
-weightMatrix: Eigen::MatrixXf
-trainingInputs: Eigen::MatrixXf
-trainingOutputs: Eigen::MatrixXf
-testInputs: Eigen::VectorXf
---------------------------------------
<<constructor>>+NeuralNet()
+refreshScreen(): void
+initializeWeights(): void
+displayWeights(): void
+inputTrainSet(): void
+trainNet(): void
+testNet(): void
+setActivationFunction(): void
+setLearningRate(): void
+activationFunction(: float): float

Figure 3 - UML Diagram NeuralNet Class

Interface
#include <Eigen/Dense>

class NeuralNet
{
public:
    NeuralNet();
    void refreshScreen();
    void initializeWeights();
    void displayWeights();
    void inputTrainSet();
    void trainNet();
    void testNet();
    void setActivationFunction();
    void setLearningRate();
    float activationFunction(float);
private:
    // Private data

    //Network architecture parameters
    int numInput, numHidden, numOutput, numTrainSets;
    int activationSelect;
    float learnRate, sigmoidCoef;

    //Eigen matrices and vectors
    Eigen::MatrixXf weightMatrix;
    Eigen::MatrixXf trainingInputs;
    Eigen::MatrixXf trainingOutputs;
    Eigen::VectorXf testInputs;
};

The contents of the header file NeuralNet.h, where the interface of the NeuralNet class is defined, are shown above. The implementation resides in NeuralNet.cpp and is covered piece by piece in the next section. The data members numInput, numHidden, and numOutput are integers used during the instantiation of a NeuralNet object to specify the architecture of the network. These parameters determine the size of the weight matrix and also control the for loops that initialize the weights and train the network. The numHidden parameter is not used in this project but is included for flexibility; its purpose is to specify the number of hidden layers of neurons in the network. In the single-neuron case there are no hidden layers, only the input and the output. The floating-point learnRate parameter is used by the training algorithm to vary how much the weight matrix is adjusted after each training iteration. The remaining data members use types from the Eigen matrix library. Eigen is an open-source template library that makes it easy to create, manipulate, and display matrices and vectors. The weightMatrix data member is a dynamically allocated single-precision floating-point matrix that stores the neural network weights.

Headers
//NeuralNet.cpp
#include <iostream>
#include "NeuralNet.h"
#include <cmath>
#include "stdlib.h"

using namespace std;

NeuralNet.cpp uses the #include preprocessor directive to pull in several required external libraries. The iostream header provides access to the system input/output streams. NeuralNet.h must be included since it contains the NeuralNet class interface definition as well as the member function prototypes. The cmath library provides the exponential function exp(), which the activationFunction method needs to generate the sigmoid and hyperbolic tangent outputs. The stdlib.h header is included so that system("cls") can be used to clear the console in the refreshScreen method.

Implementation
NeuralNet::NeuralNet()
{
    //Default parameters
    numInput = 2;
    numOutput = 1;
    numTrainSets = 4;
    learnRate = 0.1;
    activationSelect = 1;
    sigmoidCoef = 4.0;

    //Matrix and vector sizing
    weightMatrix.resize(numInput+1,numOutput);
    trainingInputs.resize(numTrainSets,numInput+1);
    trainingOutputs.resize(numTrainSets,numOutput);
    testInputs.resize(numInput+1);

    //Define training set for AND function (Default)
    trainingInputs << 1, 0, 0,
                      1, 0, 1,
                      1, 1, 0,
                      1, 1, 1;
    trainingOutputs << 0,
                       0,
                       0,
                       1;
}

The NeuralNet class has a default constructor that initializes the private data members to their default values, which can be seen in the code above. The constructor also resizes the matrices and vectors based on those defaults. This piece of code will eventually need to be moved once functionality is added that allows the user to define the network architecture. The trainingInputs and trainingOutputs matrices are populated with the data needed to teach the perceptron the logical AND function.
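The default AND training set the constructor loads can be written out as plain arrays (an Eigen-free sketch; the constant bias input occupies column 0 of each row):

```cpp
// AND training set. Each row of trainingInputs is {bias, x1, x2};
// trainingOutputs holds the desired output for the corresponding row.
const float trainingInputs[4][3] = {
    {1, 0, 0},
    {1, 0, 1},
    {1, 1, 0},
    {1, 1, 1}
};
const float trainingOutputs[4] = {0, 0, 0, 1};
```

Only the last row, where both x1 and x2 are 1, has a target output of 1.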

void NeuralNet::refreshScreen()
{
    system("cls");
    cout << endl;
    cout << " [1] Main menu" << endl;
    cout << " [2] Initialize weight matrix " << endl;
    cout << " [3] Display weights matrix " << endl;
    cout << " [4] Input training set " << endl;
    cout << " [5] Train network " << endl;
    cout << " [6] Test the net " << endl;
    cout << " [7] Set activation function " << endl;
    cout << " [8] Set learning rate " << endl;
    cout << " [9] Exit " << endl;
}

The refreshScreen method clears the console output using system("cls") mentioned in the headers section, and refreshes the main menu options using the stream insertion operator to send output to the cout stream. This method can be called by the user from the main program if the screen is cluttered or the user needs to know what the options are.
void NeuralNet::initializeWeights()
{
    //Initialize bias weight to +1
    for ( int b = 0; b < numOutput; b++ )
        weightMatrix(0,b) = 1;

    //Initialize weights to random values in [-1,+1] with mean 0
    for ( int out = 0; out < numOutput; out++ )
    {
        for ( int in = 1; in <= numInput; in++ )
            weightMatrix(in,out) = (float)rand()/(float)RAND_MAX*2 - 1;
    }
    cout << " The weight matrix has been initialized with random values.\n";
}

The initializeWeights method uses two for loops to initialize the synaptic weights and the bias. The weights are set to pseudorandom numbers uniformly distributed between -1 and +1 with a mean of 0. The C++ standard library provides the random number generator rand(), but because its return type is an integer in the range 0 to RAND_MAX, initializeWeights rescales it: the value is cast to float, divided by RAND_MAX, multiplied by 2, and offset by -1. In order for the Rosenblatt Perceptron to properly classify inputs that may not be centered about the origin of the input space, a bias is used to shift the decision boundary. This bias weight is part of the weight matrix and is initialized to +1.
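The rescaling step can be isolated in a small helper (the function name is illustrative; the class performs the same expression inline):

```cpp
#include <cstdlib>

// Map rand()'s integer range [0, RAND_MAX] onto floats in [-1, +1]
// with mean 0: divide by RAND_MAX (giving [0,1]), scale by 2, shift by -1.
float randomWeight()
{
    return static_cast<float>(std::rand()) /
           static_cast<float>(RAND_MAX) * 2.0f - 1.0f;
}
```

Every value this produces is guaranteed to lie in [-1, +1] regardless of the seed.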
void NeuralNet::displayWeights()
{
    cout << " The weights are: \n\n";
    cout.precision(2);
    for (int r = 0; r <= numInput; r++)
    {
        for (int c = 0; c < numOutput; c++)
        {
            if ( weightMatrix(r,c) < 0 )
                cout << " " << fixed << weightMatrix(r,c);
            else
                cout << "  " << fixed << weightMatrix(r,c);
        }
        cout << endl;
    }
}

It is interesting to see what is actually stored in the weight matrix, so the displayWeights method was created. The output format is set to fixed with two decimal places, then two for loops print each value in the weight matrix. The stream insertion operator could have printed weightMatrix directly, since the Eigen::MatrixXf type supports this, but because the values can be either negative or positive, an if statement was included to keep the decimal points lined up for readability. Time did not permit implementation of the inputTrainSet method; it was planned but not a top priority. The method should allow the user to input the training data either from a file or by entering it manually. At this point, if the method is called, it prints a message telling the user that the functionality is not yet available.

void NeuralNet::trainNet()
{
    //Local variables
    float activation, product, error;
    int epoch = 0, sumMisclass;

    //Loop until the neural network doesn't misclassify any of the training inputs
    do
    {
        //Initialize training metrics variables
        epoch++;
        sumMisclass = 0;

        //Calculate output and error for each set of training inputs
        for (int i = 0; i < numTrainSets; i++)
        {
            //Calculate error
            product = trainingInputs.row(i).dot(weightMatrix.transpose().row(0));
            activation = activationFunction(product);
            error = trainingOutputs(i,0) - activation;

            //Update weight matrix
            weightMatrix += trainingInputs.row(i).transpose()*learnRate*error;

            //Sum misclassified inputs
            if ( error != 0.0 )
                sumMisclass++;
        }
        cout << " " << sumMisclass << " misclassified inputs for epoch " << epoch << endl;
    } while (sumMisclass > 0);
    cout << " The network has finished training.\n";
}

The trainNet method is the most complex segment of code in the NeuralNet class. It executes the supervised Rosenblatt Perceptron training algorithm, manipulating the weight matrix using the trainingInputs and trainingOutputs data to teach the neuron the AND operator. The numTrainSets integer controls the looping in the algorithm. Local variables store the intermediate values (float product and activation), the output error (float error), the epoch number (int epoch), and the number of misclassified inputs in a given epoch (int sumMisclass).

The pseudocode algorithm is:

    Set epoch count to 0.
    While the number of misclassified inputs is greater than 0:
        Set misclassified inputs to 0.
        Increment epoch by 1.
        For every input in the training set:
            Compute the weighted sum of the inputs using the current weights.
            Compute the hard-limited output of the weighted sum.
            Compute the error by taking the difference of the training output and the neuron output.
            Update the weight matrix using the training rule: the new weight equals the old weight plus the product of the learning rate, the current input, and the error.
            If the error is not equal to 0:
                Increment misclassified inputs by 1.
        Display the epoch number and the number of misclassified inputs.
    Return to the calling function.
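The pseudocode above can be exercised end to end with an Eigen-free sketch that learns AND; the variable names mirror trainNet, but the fixed-size arrays, epoch cap, and function name are illustrative additions:

```cpp
#include <vector>
#include <cstddef>

// Train a single hard-limit neuron on the AND function with the perceptron
// rule until an epoch produces no misclassifications. weights must hold
// three entries {bias weight, w1, w2}; returns the number of epochs taken.
int trainAnd(std::vector<float>& weights, float learnRate)
{
    const float in[4][3] = {{1,0,0},{1,0,1},{1,1,0},{1,1,1}};
    const float out[4]   = {0, 0, 0, 1};
    int epoch = 0, sumMisclass;
    do {
        ++epoch;
        sumMisclass = 0;
        for (int i = 0; i < 4; ++i) {
            float product = 0.0f;                    // weighted sum of inputs
            for (std::size_t j = 0; j < 3; ++j)
                product += in[i][j] * weights[j];
            float activation = (product >= 0.0f) ? 1.0f : 0.0f; // hard limit
            float error = out[i] - activation;
            for (std::size_t j = 0; j < 3; ++j)      // perceptron update rule
                weights[j] += learnRate * error * in[i][j];
            if (error != 0.0f)
                ++sumMisclass;
        }
    } while (sumMisclass > 0 && epoch < 1000); // cap guards non-convergence
    return epoch;
}
```

Because AND is linearly separable, the loop converges in a handful of epochs for typical starting weights; the final epoch applies no updates, so the returned weights classify all four rows correctly.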
float NeuralNet::activationFunction(float x)
{
    float result;
    switch (activationSelect)
    {
        case 1: //Threshold function
            if (x >= 0)
                result = 1;
            else
                result = 0;
            break;
        case 2: //Sigmoid function
            result = 1/( 1 + exp(-sigmoidCoef*x) );
            break;
        case 3: //Hyperbolic tangent function
            result = (exp(x)-exp(-x))/(exp(x)+exp(-x));
            break;
        default: //Fall back to the threshold function
            result = (x >= 0) ? 1 : 0;
            break;
    }
    return result;
}

The activationFunction method performs, by default, a threshold operation on its floating-point input and returns a floating-point result. Several other functions, such as the sigmoid and the hyperbolic tangent, can also be used as the activation function, but the threshold works best for the Rosenblatt Perceptron.
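The three candidate activation functions can also be written as standalone sketches (the free-function names are illustrative; in the class they live behind the activationSelect switch, and coef stands in for the class's sigmoidCoef):

```cpp
#include <cmath>

// Hard-limit output: 1 if the weighted sum is non-negative, else 0.
float threshold(float x)  { return (x >= 0.0f) ? 1.0f : 0.0f; }

// Hyperbolic tangent, written with exp() as in the class: range (-1, 1).
float hyperTan(float x)   { return (std::exp(x) - std::exp(-x)) /
                                   (std::exp(x) + std::exp(-x)); }

// Logistic sigmoid with adjustable steepness coefficient: range (0, 1).
float sigmoid(float x, float coef) { return 1.0f / (1.0f + std::exp(-coef * x)); }
```

All three agree on the sign of the decision (a sum of 0 maps to the "middle" of each range), but only the threshold produces the exact 0/1 outputs the perceptron training rule expects.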
void NeuralNet::testNet()
{
    float activation, product; //Local variables

    //Set bias input
    testInputs(0) = 1;

    //Loop to fill the test input vector with user values
    for (int i = 1; i < testInputs.rows(); i++)
    {
        cout << " Enter input " << i << ": ";
        cin >> testInputs(i);
    }

    //Compute the neuron's output given the test inputs
    product = testInputs.dot(weightMatrix.transpose().row(0));
    activation = activationFunction(product);
    cout << "\n Given the inputs you entered,\n the Rosenblatt Perceptron ";
    cout << " says the correct answer is: " << activation << endl;
}

The testNet method fills an Eigen VectorXf with a user-specified set of inputs. It then presents the input set to the perceptron and computes the intermediate product and the activation value using the weight matrix produced by the trainNet method. The result is sent to the console.

The NeuralNet class contains two set methods to allow the user the option to change the learning rate and activation function.
void NeuralNet::setActivationFunction()
{
    int activationSelectTemp;

    //Activation function selection menu
    cout << " [1] Threshold" << endl;
    cout << " [2] Sigmoid" << endl;
    cout << " [3] Hyperbolic Tangent" << endl;
    cout << " Select an activation function: ";
    cin >> activationSelectTemp;

    switch (activationSelectTemp)
    {
        case 1:
            activationSelect = activationSelectTemp;
            cout << "\n The threshold function has been selected.\n";
            break;
        case 2:
            activationSelect = activationSelectTemp;
            cout << "\n The sigmoid function has been selected.\n";
            cout << "\n Enter the exponential coefficient (positive real): ";
            cin >> sigmoidCoef;
            if (sigmoidCoef > 0.0)
                cout << "\n The coefficient has been set to: " << sigmoidCoef << endl;
            else
            {
                sigmoidCoef = 4.0;
                cout << "\n Invalid entry!";
                cout << "\n The coefficient has been set to the default (4.0)\n";
            }
            break;
        case 3:
            activationSelect = activationSelectTemp;
            cout << "\n The hyperbolic tangent function has been selected.\n";
            break;
        default:
            activationSelect = 1;
            cout << "\n Invalid entry!";
            cout << "\n The activation function has been set to the default (Threshold).\n";
            break;
    }
}

void NeuralNet::setLearningRate()
{
    //Set a new learning rate
    cout << " Enter the new learning rate (positive real): ";
    cin >> learnRate;
    if ( learnRate > 0.0 )
    {
        cout << " The learning rate has been set to " << learnRate << endl;
    }
    else
    {
        learnRate = 0.1;
        cout << " Invalid entry!\n";
        cout << " The learning rate has been set to the default (0.1)\n";
    }
}

Main Program

The main.cpp file begins with #include directives to access the required libraries. The iostream header provides input/output and formatting capabilities. The ctime library is needed to generate a seed for the rand function. The compiler is informed that the std namespace is being used.
#include <iostream>
#include <ctime>
#include "NeuralNet.h"

using namespace std;

The main function takes no arguments; its return type is int, as standard C++ requires, and it returns 0 to indicate successful completion. The program begins by seeding the pseudo-random number generator with the current system time via the srand() function. The next step is creating an instance of the NeuralNet class called myNet. After the myNet object is created, a call to the refreshScreen method clears the screen and displays the options to the user. The option variable holds the user's choice and is used as the switch variable in the option selection statement. The Boolean exit is used to leave the do-while loop, and the program, if the user chooses to do so.
int main()
{
    //Seed the random number generator
    srand((unsigned)time(0));

    //Create a NeuralNet object
    NeuralNet myNet;

    //Clear the console and display the options
    myNet.refreshScreen();

    int option;
    bool exit = false;

Execution then enters a do-while loop and prompts the user to select an option. The input is saved in the option variable, and an if statement tests whether the entry is a value from 1 to 9, the valid options.
    do
    {
        //User interface
        cout << "\n Enter your selection: ";
        cin >> option;

        //Invalid input check
        if ( ( option > 0 ) && ( option < 10 ) )
        {
            cout << endl;

            //Menu choice selection switch
            switch (option)
            {
                case 1: myNet.refreshScreen();         break;
                case 2: myNet.initializeWeights();     break;
                case 3: myNet.displayWeights();        break;
                case 4: myNet.inputTrainSet();         break;
                case 5: myNet.trainNet();              break;
                case 6: myNet.testNet();               break;
                case 7: myNet.setActivationFunction(); break;
                case 8: myNet.setLearningRate();       break;
                case 9: exit = true;                   break;
                default:
                    cout << " Invalid input! Please enter an option (1 - 9):\n";
            }
        }
        else
        {
            cout << " Invalid entry!\n";
            cout << " Enter a number corresponding to one of the 9 options.";
        }
    } while (exit == false);

    return 0;
}

NeuralNet methods are called based on the user input via a switch control statement. If the user chooses option 9, exit is set to true; when the do-while condition is next tested, the loop ends and the program returns. If invalid data is entered into option, the else block executes and a message informs the user.

Output

The following pages show screenshots of the program's responses to user inputs and demonstrate its capability to learn the AND function.

Figure 4 - Options Menu

Figure 5 - Initialize Weights

Figure 6 - Refresh Menu

Figure 7 - Display Weights

Figure 8 - Input Training Set

Figure 9 - Train Net, Default Learning Rate (0.1)

Figure 10 - Train Net, Learning Rate Set to (10.0)

Figure 11 - Display Weights After Training

Figure 12 - Test Net, Boolean Inputs

Figure 13 - Test Net, Float Inputs

The final set of screenshots shows the perceptron's response to inputs that are not 0 or 1; Figure 13 shows several examples. This demonstrates that even though the hyperplane is trained to separate the four inputs in the training set, it finds only one of the infinitely many solutions to the problem. The results the neural network generates are fuzzy, and the neuron learns only as much as it needs to in order to meet the learning criterion.

Figure 14 - Weights Graphed in Input Space

The figure above shows the training set of inputs and the untrained and trained decision boundaries plotted on the 2D input space. The plot was produced from the weight matrix values observed in the program's output.

Figure 15 - Set Activation Function

Figure 16 - Set Learning Rate

Figure 17 - Exit

Conclusion

The goal of this project was to design an artificial neural network class using object-oriented C++ techniques and to verify the NeuralNet class's interface and implementation by creating and testing the Rosenblatt Perceptron case. As the results indicate, this was successfully accomplished. The class is very simple at this point and would need considerably more work to classify non-linearly separable patterns and to utilize the more advanced activation functions. The program has only minimal user input validation and exception handling, which is something that would need to be improved in the future.
