
DUBLIN CITY UNIVERSITY

SCHOOL OF ELECTRONIC ENGINEERING



Artificial Neural Network Identification &
Control of an Inverted Pendulum

Barry N. Sweeney
August 2004











MASTER OF ENGINEERING
IN
ELECTRONIC SYSTEMS


Supervised by Ms. Jennifer Bruton









Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my own work and has not
been submitted in whole or in part to any other university.


Signed: ...................................................................... Date: ...............................

Abstract
The purpose of this project is to illustrate the use of artificial neural networks (ANNs) in the
identification and control of a non-linear system. Non-linear systems are investigated with
respect to the dynamics of the inverted pendulum. The inverted pendulum is a classic
example of an unstable non-linear dynamic system. Consequently it has received much
attention, as it is an extremely complex and challenging control problem. An interesting
feature of neural networks is that they can learn from the environment in which the system is
being operated. The potential of ANNs for system identification and control is examined.
Subsequently feed-forward and recurrent neural networks are used to identify a robust
model of the inverted pendulum. Finally a neuro-controller is developed and implemented
using Borland C++ for control of the physical system.
























Acknowledgements
I would like to take this opportunity to thank all those who supported and helped me
throughout the research and development of this project. Firstly, I would like to thank my
project supervisor Ms. Jennifer Bruton, whose guidance and vast knowledge proved
invaluable, and all the staff in the Faculty of Engineering, with special thanks to Conor
Maguire for setting up the physical rig. Finally I would like to thank my family and
girlfriend for their support and patience throughout.

Table of Contents

Declaration
Abstract
Acknowledgements
Table of Contents
Table of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Outline of Report
Chapter 2 Inverted Pendulum
2.1 Mathematical Equations
2.2 Modelling of the Inverted Pendulum
2.3 Closed-loop Control
2.4 Summary
Chapter 3 Artificial Neural Networks
3.1 Artificial neuron model
3.2 Activation functions
3.3 Neural network architecture
3.4 Learning algorithms
3.5 Learning rules
3.6 Neural Network Limitations
3.7 Applications
3.8 Summary
Chapter 4 System Identification
4.1 System Identification Procedure
4.2 Conventional Linear System Identification
4.3 Non-linear System Identification using NARMAX
4.4 System Identification using Neural Networks
4.5 Javier's Linearised Model
4.6 Summary
Chapter 5 Real-Time Identification
5.1 Closed-loop controller
5.2 Identification of physical system
5.3 Summary
Chapter 6 Neuro Control
6.1 Supervised Control
6.2 Unsupervised Control
6.3 Adaptive neuro control
6.4 Model Reference Control
6.5 Direct inverse control
6.6 Neuro Control in Simulink
6.7 Real-time neuro-control
6.8 Project Plan
6.9 Summary
Chapter 7 Conclusions
7.1 Future Recommendations
References
Appendix 1
Appendix 2














Table of Figures
Figure 2.1 Inverted Pendulum
Figure 2.2 Simulink model of linear pendulum
Figure 2.3 Subsystem block of linear pendulum
Figure 2.4 Simulink model of non-linear pendulum
Figure 2.5 Subsystem block of non-linear pendulum
Figure 2.6 Open-loop response of inverted pendulum
Figure 2.7 Simulink model of linear pendulum and controller
Figure 2.8 Closed-loop response of linear pendulum with controller
Figure 2.9 Simulink model of non-linear pendulum and controller
Figure 2.10 Closed-loop response of non-linear pendulum with controller
Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs
Figure 3.1 Biological Neuron
Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943)
Figure 3.3 Perceptron Model
Figure 3.4 Activation functions
Figure 3.5 Multi-layer Feed-forward Network structure
Figure 3.6 Multi-layer Recurrent Network structure
Figure 3.7 Supervised Learning
Figure 3.8 Unsupervised Learning
Figure 3.9 Local & global minimum
Figure 4.1 Input, Output, Disturbance of a System
Figure 4.2 System Identification Procedure
Figure 4.3 ARX model output with measured output
Figure 4.4 ARX [4 3 1] model output and measured output
Figure 4.5 RARMAX [3 3 3 1] model output and process output
Figure 4.6 RARMAX [4 3 3 1] model output and validation data
Figure 4.7 Forward modelling of inverted pendulum using neural networks
Figure 4.8 Neural Network training
Figure 4.9 Model validation set-up in Simulink
Figure 4.10 Feed-forward network, 1 hidden layer, 4 hidden neurons
Figure 4.11 Feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively
Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively
Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively
Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons
Figure 4.15 Feed-forward network, 2 hidden layers, 30 & 20 neurons respectively
Figure 4.16 Difference between networks trained with scaled data
Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively
Figure 4.18 Javier's Linearised Model
Figure 4.19 A comparison of Javier's Model & Non-linear Model
Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively
Figure 5.1 Set-up of pendulum rig
Figure 5.2 Real-Time Task in Simulink environment
Figure 5.3 Zones of Control Algorithms
Figure 5.4 NN non-linear model output & physical system output
Figure 5.5 Pendulum angle of Real System
Figure 5.6 Validation set-up
Figure 5.7 Feed-forward NN, 1 hidden layer with 75 neurons
Figure 5.8 Feed-forward NN, 1 hidden layer with 75 neurons
Figure 5.9 System set-up with disturbance
Figure 5.10 Pendulum angle during disturbance
Figure 5.11 Pendulum angle, with large excitation signal
Figure 6.1 Supervised learning using existing controller
Figure 6.2 Adaptive neuro control
Figure 6.3 Model Reference Control
Figure 6.4 Direct inverse control
Figure 6.5 Neuro Controller in Simulink
Figure 6.6 Pendulum angle using neuro controller in Simulink
Figure 6.7 Neuro control of non-linear pendulum model
Figure 6.8 Neuro control of non-linear model
Figure 6.9 Validation set-up
Figure 6.10 Neuro-Controller output
Figure 6.11 Neuro-Controller Structure
Figure 6.12 Pendulum Angle
Figure 6.13 Pendulum Angle


Chapter 1
Introduction
Increasing technological demands and ever more complex systems require highly
sophisticated controllers to ensure that high performance can be achieved and maintained
under adverse conditions. There is therefore a demand for an alternative form of control, as
conventional approaches do not meet the requirements of these complex systems.
To achieve such highly autonomous behaviour for complex systems one can enhance
today's control methods using intelligent control systems and techniques. It is for this reason
that neural networks are of significant importance in the design and construction of the
overall intelligent controller for complex non-linear systems. Currently neural networks are
established in many application areas (expert systems, pattern recognition, system control,
etc.). These methods have received a lot of criticism during their existence (for example, see
Cheeseman, 1986). However this criticism has weakened as artificial neural networks have
been successfully applied to practical problems.
Artificial neural networks attempt to simulate the human brain. This simulation is
based on present knowledge of the brain, and this knowledge is, even at its best, primitive.
The operation of the brain is believed to be based on simple basic elements called neurons,
which are connected to each other with transmission lines, called axons, and receptive lines,
called dendrites. Learning may be based on two mechanisms: the creation of new
connections, and the modification of connections. Each neuron has an activation level
which, in contrast to Boolean logic, ranges between some minimum and maximum value.
Neural networks have several important characteristics which make them suitable for the
identification and control of a non-linear system; their features include:
- No need to know the data relationships.
- Self-learning capability.
- Self-tuning capability.
- Applicability to modelling various systems.
Further to this, neural networks contain non-linear elements that enable them to model and
control complex non-linear systems.
From a given transfer function the system response can be predicted. The reverse of
this process, i.e. calculating the transfer function from a measured response, is called system
identification. It is essentially a process of sampling the input and output signals of a
system, and subsequently using the respective data to generate a mathematical model of the
system to be controlled. System identification enables the real system to be altered without
the need to recalculate the dynamic equations and remodel the parameters. Knowledge
of the dynamics of the system is useful in determining the neural network
architecture, its inputs, outputs and training process for dynamic model identification
purposes [1].
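The sampling-and-fitting idea above can be sketched with a minimal example. The snippet below is illustrative only (it is not the thesis code): a simple first-order discrete-time system with assumed parameters a = 0.9, b = 0.5 is simulated, and the parameters are then recovered from the sampled input/output data by least squares.

```python
import numpy as np

# Illustrative sketch: identify y[k] = a*y[k-1] + b*u[k-1] from sampled
# input/output data.  The "true" parameters a = 0.9, b = 0.5 are
# assumptions chosen for the demonstration.
rng = np.random.default_rng(0)
a_true, b_true = 0.9, 0.5

u = rng.standard_normal(200)            # excitation (input) signal
y = np.zeros(200)
for k in range(1, 200):                 # simulate the "unknown" system
    y[k] = a_true * y[k - 1] + b_true * u[k - 1]

# Regression: stack rows [y[k-1], u[k-1]] and solve for [a, b]
phi = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(phi, y[1:], rcond=None)
a_est, b_est = theta
print(a_est, b_est)   # recovers approximately (0.9, 0.5)
```

With noise-free data the estimates are essentially exact; with measurement noise the same regression gives the best fit in the least-squares sense, which is the situation faced with real rig data.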

1.1 Motivation
The inverted pendulum problem is a classic example of an unstable non-linear dynamic
system [2]. Consequently it has received much attention, as it is an extremely complex and
challenging control problem. The dynamics of the inverted pendulum present great
difficulty in system identification and control. Conventional control systems have been
found wanting given raised demands for high performance, due to their inability to adapt to
new or unusual circumstances. Conventional control systems do not incorporate desirable
control features such as non-linear capability, adaptation, flexible control objectives and
multivariable capability [3]. Thus there is a need for a control method which addresses the
non-linearities of an operating system and incorporates an adaptation capability.
Considering these control issues, artificial neural networks emerge as a solution. ANNs have
several important characteristics that identify them as suitable for the identification and
control of non-linear systems: their ability to learn [4], their ability to approximate
non-linear functions and their inherent parallelism. The predominant goal of this project is
to identify an accurate model of the physical inverted pendulum; the majority of modelling
is first simulated using Matlab Simulink. Subsequently a suitable neuro-controller is
developed and implemented using Borland C++.

1.2 Outline of Report
Chapter 2 investigates non-linear systems with particular respect to the inverted pendulum
and its associated control difficulties. The dynamic equations are derived and subsequently
models developed for both the linear and non-linear model. The development of feedback
controllers is also detailed in this section. Chapter 3 introduces Artificial Neural Networks
detailing their basic components, structure, architecture and application in system
identification and control. Chapter 4 discusses the area of system identification and its
subsequent procedure. Traditional identification techniques are examined first with respect

to the linear pendulum. Non-linear identification is subsequently performed using neural
networks. Chapter 5 details the set-up of the physical inverted pendulum; following on from
this real-time identification is performed. Chapter 6 details different neuro-control
techniques; subsequently a neuro-controller is developed and implemented using Borland
C++. Finally in Chapter 7, conclusions are drawn and future recommendations detailed.





































Chapter 2
Inverted Pendulum
A dynamic system is a system that changes over time. The starting point for the system is
the initial state and the final point is the equilibrium. Most often a dynamic system is
described by differential or difference equations, where the rate of change is a function of
time or some parameter. Essentially all real systems are dynamic systems, and most
real-world dynamic processes are non-linear. Thus, non-linear mathematical models are the
most desired ones [5]. The inverted pendulum is an example of a highly non-linear and
unstable dynamic system. Pole-balancing is the task of keeping a rigid pole, hinged to a cart
and free to fall in a plane, in a roughly vertical orientation by moving the cart horizontally in
the plane while keeping the cart within some maximum distance of its starting position (see
Figure 2.1).
Despite the dynamics being well understood it is still a difficult process to accomplish, as
most people who have experimented with such devices will appreciate. Further still, if the
system parameters are not known precisely, then the task of constructing a suitable
controller is, accordingly, more difficult. Many researchers have restricted their
control-learning systems to simulations of the inverted pendulum, which can be attributed to
the level of difficulty associated with the control problem. In this chapter, the dynamic
equations will be derived for both the linear and non-linear pendulum. Using a mathematical
model of this form for a system, computer simulation is possible and subsequently the
respective models can be developed using Matlab simulink. The underlying aim of
modelling is that the developed models will have the same characteristics as the actual
process.













Figure 2.1 Inverted Pendulum
(The figure shows the cart and pole, the applied horizontal Force, the cart position y and the
pole coordinates y2, z2. L = length of pole, m = mass of pole, M = mass of cart, g = gravity.)


2.1 Mathematical Equations
Lagrange's equation of motion can be used for the analysis of mechanical systems:

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = F, \qquad (2.1)$$

with $y(t)$ the generalised position vector, $\dot{y}(t)$ the generalised velocity vector, and $F(t)$
the generalised force vector. The Lagrangian is $L = K - U$, the kinetic energy minus the
potential energy.

The kinetic energy of the cart is

$$K_1 = \frac{1}{2}M\dot{y}^2, \qquad (2.2)$$

The pole can move in both horizontal and vertical directions, therefore the kinetic energy of
the pole is

$$K_2 = \frac{1}{2}m\left(\dot{y}_2^2 + \dot{z}_2^2\right), \qquad (2.3)$$

where $y_2$ and $z_2$ are equal to

$$y_2 = y + L\sin\theta, \qquad z_2 = L\cos\theta, \qquad (2.4)$$

giving

$$\dot{y}_2 = \dot{y} + L\dot{\theta}\cos\theta, \qquad \dot{z}_2 = -L\dot{\theta}\sin\theta, \qquad (2.5)$$

Therefore the total kinetic energy $K$ of the system is

$$K = K_1 + K_2 = \frac{1}{2}M\dot{y}^2 + \frac{1}{2}m\left(\dot{y}^2 + 2L\dot{y}\dot{\theta}\cos\theta + L^2\dot{\theta}^2\right), \qquad (2.6)$$

The potential energy due to the pendulum, $U$, is

$$U = mgz_2 = mgL\cos\theta, \qquad (2.7)$$

The Lagrangian function is

$$L = K - U = \frac{1}{2}(M+m)\dot{y}^2 + mL\dot{y}\dot{\theta}\cos\theta + \frac{1}{2}mL^2\dot{\theta}^2 - mgL\cos\theta, \qquad (2.8)$$

The state-space variables of the system are $y$ and $\theta$, thus the Lagrange equations are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = f, \qquad (2.9)$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0, \qquad (2.10)$$

Substituting for $L$ and performing the partial differentiation produces

$$(M+m)\ddot{y} + mL\ddot{\theta}\cos\theta - mL\dot{\theta}^2\sin\theta = f, \qquad (2.11)$$

$$mL\ddot{y}\cos\theta + mL^2\ddot{\theta} - mgL\sin\theta = 0, \qquad (2.12)$$

The above dynamic equations can be placed into state-space form. This is achieved by
expressing the Lagrange equations in terms of matrices:

$$\begin{bmatrix} M+m & mL\cos\theta \\ mL\cos\theta & mL^2 \end{bmatrix}
\begin{bmatrix} \ddot{y} \\ \ddot{\theta} \end{bmatrix} =
\begin{bmatrix} f + mL\dot{\theta}^2\sin\theta \\ mgL\sin\theta \end{bmatrix}, \qquad (2.13)$$

This gives a mechanical system in typical Lagrangian form, i.e. the inertia matrix
multiplying the acceleration vector. Inverting the inertia matrix and simplifying, the
following non-linear equations describing the inverted pendulum are derived [7]:

$$\ddot{y} = \frac{f + mL\dot{\theta}^2\sin\theta - mg\sin\theta\cos\theta}{M + m\sin^2\theta}, \qquad (2.14)$$

$$\ddot{\theta} = \frac{(M+m)g\sin\theta - f\cos\theta - mL\dot{\theta}^2\sin\theta\cos\theta}{L\left(M + m\sin^2\theta\right)}, \qquad (2.15)$$

As some of the modelling involved in this project is linear, these equations must be
linearised. The simplest approach is to approximate $\cos\theta \approx 1$ and $\sin\theta \approx \theta$. In addition to
this, the quadratic terms are extremely small; consequently they are set to zero. This yields
the linear system equations

$$\ddot{y} = \frac{f}{M} - \frac{mg}{M}\theta, \qquad \ddot{\theta} = \frac{M+m}{ML}g\theta - \frac{f}{ML}, \qquad (2.16)$$
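As a numerical sanity check on equations (2.14)-(2.15), the sketch below (illustrative only, not the thesis code) integrates the non-linear equations with a simple Euler scheme using the rig parameters from Section 2.2, and shows that with zero applied force the upright pendulum falls away from $\theta = 0$.

```python
import numpy as np

# Illustrative Euler integration of equations (2.14)-(2.15); parameter
# values follow the physical rig described in Section 2.2.  With zero
# force the small initial tilt grows: the open-loop system is unstable.
m, M, L, g = 0.11, 1.2, 0.4, 9.8

def accelerations(theta, theta_dot, f):
    """Cart and pendulum accelerations from equations (2.14)-(2.15)."""
    d = M + m * np.sin(theta) ** 2                  # common denominator
    y_dd = (f + m * L * theta_dot**2 * np.sin(theta)
            - m * g * np.sin(theta) * np.cos(theta)) / d
    th_dd = ((M + m) * g * np.sin(theta) - f * np.cos(theta)
             - m * L * theta_dot**2 * np.sin(theta) * np.cos(theta)) / (L * d)
    return y_dd, th_dd

dt, theta, theta_dot, y_dot = 1e-3, 0.01, 0.0, 0.0   # 0.01 rad initial tilt
th_max = 0.0
for _ in range(2000):                                # 2 seconds, force f = 0
    y_dd, th_dd = accelerations(theta, theta_dot, 0.0)
    y_dot += y_dd * dt
    theta_dot += th_dd * dt
    theta += theta_dot * dt
    th_max = max(th_max, abs(theta))
print(th_max)   # far larger than the 0.01 rad start: the pendulum falls
```

This reproduces, in miniature, the open-loop instability observed with the Simulink models in Section 2.2 (Figure 2.6).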

2.2 Modelling of the Inverted Pendulum
Given the set of equations describing the linear and non-linear inverted pendulum, the next
step is modelling. The models are developed using Matlab simulink, which provides an
environment for computer simulation. The first model developed is the linear model of the
inverted pendulum, which for convenience is encapsulated in a subsystem block, see Figure
2.2 and Figure 2.3 respectively.
Figure 2.2 Simulink model of linear pendulum
Figure 2.3 Subsystem block of linear pendulum
The subsystem block is set up using a mask. This enables the parameters m, L, g and M to be
altered for different simulations. As it is desired to accurately model the physical rig, the
parameters are taken from this system: the mass of the pendulum m is set to 0.11 kg, the
mass of the cart M is set to 1.2 kg, the length of the pendulum L is set to 0.4 metres, and
gravity g is set to 9.8 m/s². The non-linear model is developed in a similar manner to the
linear system and the parameter values remain the same, see Figure 2.4 and Figure 2.5 on
the following page.



Figure 2.4 Simulink model of the non-linear pendulum
Figure 2.5 Subsystem block of non-linear pendulum
Both pendulum models are simulated with a step input to check the stability of the
systems. A system can be said to be input-output stable if it responds to every bounded
input variable with a bounded output variable [7]. Conversely, a system is input-output
unstable if, for some bounded input variable, the output variable goes unbounded. The
angle of the pendulum is shown in Figure 2.6, on the following page. The simulation shows
that the pendulum is open-loop unstable, i.e. the pendulum falls over.

Figure 2.6 Open loop response of the inverted pendulum
2.3 Closed-loop Control
System identification requires the collection of "interesting" data. In order to accurately
model the inverted pendulum and to generate suitable input/output data for system
identification it is necessary to stabilise it. This is achieved using a PID controller. Other
methods, such as full state feedback, could have been used to stabilise the linear pendulum;
however PID control is chosen for its ease of implementation. PID controllers have a long
history in control engineering, having been proven stable, simple and robust in many
real-life applications. The P action is related to the present error, the I action is based
on the past history of the error, while the D action relates to the future behaviour of the error.
These actions correspond roughly to filtering, smoothing and prediction respectively.
The equation of a PID controller is given by
$$u(t) = K_p e(t) + K_I \int_0^t e(\tau)\,d\tau + K_D \frac{de(t)}{dt}, \qquad (2.17)$$
There are several methods to design PID controllers. Initially the Ziegler-Nichols method
was tested; however the parameters obtained from this method gave a poor response.
Subsequently the parameters are chosen heuristically, that is, by trial-and-error testing. The
Simulink model with controller is shown in Figure 2.7 on the following page. The pendulum
is now in a closed loop, thus the PID controller is de-tuned so that the dynamics of the
pendulum are emphasised, and a band-limited white noise signal is used as the input to the
system to emulate the disturbances the physical rig would be subjected to. In the closed-loop
model in Figure 2.7 there is a graphic visualisation of the inverted pendulum when
compiled, which has been adapted from the Matlab demonstration slcp.mdl.

Figure 2.7 Simulink model of linear pendulum and controller
The linear pendulum is simulated, see Figure 2.8 for the angle of the pendulum. The closed
loop system with PID controller keeps the pendulum angle stable. This allows for longer
simulation of the pendulum and more importantly the generation of information rich data for
system identification purposes.
Figure 2.8 Closed loop response of linear pendulum with controller
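The effect of the PID law in equation (2.17) on the linearised pendulum can be sketched numerically. The snippet below is an illustrative discrete-time implementation with assumed gains chosen by the same trial-and-error approach described above; they are not the thesis gains.

```python
# Illustrative discrete PID control (equation 2.17) of the linearised
# pendulum theta_dd = a*theta - b*f.  The gains Kp, Ki, Kd below are
# assumptions picked by trial and error, not the thesis values.
m, M, L, g = 0.11, 1.2, 0.4, 9.8
a = (M + m) * g / (M * L)          # unstable open-loop coefficient (~26.8)
b = 1.0 / (M * L)

Kp, Ki, Kd = 30.0, 1.0, 3.0        # stabilisation requires b*Kp > a
dt, theta, omega, integral = 1e-3, 0.1, 0.0, 0.0

for _ in range(5000):              # 5 seconds of simulation
    # error e = 0 - theta; its sign is folded into the cart force f
    integral += theta * dt
    f = Kp * theta + Ki * integral + Kd * omega
    th_dd = a * theta - b * f
    omega += th_dd * dt
    theta += omega * dt
print(abs(theta))   # the angle settles close to zero
```

Starting from a 0.1 rad tilt, the closed loop drives the angle back towards zero, mirroring the stabilised response in Figure 2.8.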

The next step is to develop a controller for the non-linear pendulum. Similar to the control
of the linear pendulum, a PID controller was developed, see Figure 2.9. PID control of the
non-linear pendulum is possible because the simulation starts with the pendulum in a
linearised region, that is, with the pendulum in an upright position, see Figure 2.11.
Figure 2.9 Simulink model of non-linear pendulum and controller
The closed-loop response of the pendulum angle is shown in Figure 2.10. The closed-loop
response with PID control is stable, thus suitable input-output data for system identification
purposes is obtained. As with the linear closed-loop system, the PID controller has been
de-tuned so that the full dynamics of the pendulum are emphasised rather than the controller
dynamics.
Figure 2.10 Closed loop response of non-linear pendulum with controller




Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs.

2.4 Summary
The dynamic equations for both the linear and non-linear pendulum have been derived and
subsequently models for each developed using Matlab simulink. It was evident that the
system is open loop unstable, i.e. for a bounded input variable the output variable goes
unbounded. However a criterion for accurate system identification is that the process must
be stable. Consequently simple PID controllers were developed which stabilised the system.
They were de-tuned so that the dynamics of the pendulum is emphasised and interesting
data would be generated for system identification purposes. The subsequent chapter details
the theory, operation and structure of artificial neural networks.










Chapter 3
Artificial Neural Networks
A neural network is an information-processing paradigm inspired by the way the brain
processes information [8]. It is composed of a large number of highly interconnected
processing elements (neurons) working in parallel to solve a specific problem. ANNs learn
by example, trained using known input/output data sets to adjust the synaptic connections
that exist between the neurons. The biological brain is likewise composed of a large number
of highly interconnected neurons, tied together with weighted connections or synapses.
Learning in biological systems involves adjustments to the synaptic connections that exist
between the neurons. These connections store the knowledge necessary to solve specific
problems.

Figure 3.1 Biological Neuron [9]

One of the most interesting features of NNs is their learning ability. This is achieved by
presenting a training set of different examples to the network and using learning algorithms,
which change the weights (or parameters of the activation functions) in such a way that the
network will produce the correct output for the corresponding input values. One encountered
difficulty is how to guarantee generalisation and how to determine when the network is
sufficiently trained. Neural networks offer non-linearity, input-output mapping, adaptivity
and fault tolerance. Non-linearity is a desired property if the generator of the input signal is
inherently non-linear [10]. The high connectivity of the network ensures that the influence
of errors in a few terms will be minor, which ideally gives a high degree of fault tolerance.


3.1 Artificial neuron model
In ANNs the inputs are combined in a linear way with different weights. McCulloch and
Pitts developed the first model of an elementary computing neuron.











Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943)

Each neuron consists of a processing element with synaptic input connections and a single
output. The initial step is a process whereby the inputs $x_1, x_2, x_3, \ldots, x_n$ are multiplied by
their respective weights $w_1, w_2, w_3, \ldots, w_n$ and then summed by the neuron. The
summation process may be defined as

$$net = \sum_{i=1}^{n} w_i x_i, \qquad (3.1)$$

Further to this a threshold value or bias may be included; subsequently the summation
process may be rewritten as

$$net = \sum_{i=1}^{n} w_i x_i + b, \qquad (3.2)$$

A non-linear activation function $f$ is generally included in the neuron arrangement; this is
added to introduce non-linearities into the model. The output of the neuron can now be
expressed as (see Figure 3.3)

$$y = f(net), \qquad (3.3)$$
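A minimal numeric sketch of equations (3.1)-(3.3) is given below: a weighted sum plus bias, passed through a log-sigmoid activation. The input, weight and bias values are arbitrary illustrative choices.

```python
import math

# Sketch of the summation and activation steps in equations (3.1)-(3.3),
# with arbitrary illustrative inputs, weights and bias.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.1]
b = 0.2

net = sum(wi * xi for wi, xi in zip(w, x)) + b   # equation (3.2)
y = 1.0 / (1.0 + math.exp(-net))                 # log-sigmoid activation
print(net, y)   # net = -0.1, y ~ 0.475
```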


















Figure 3.3 Perceptron Model

Using the back-propagation algorithm the weights are dynamically updated. The error
between the target output and the actual output is calculated as

$$e(k) = t(k) - y(k), \qquad (3.4)$$

The error is then back-propagated through the layers and the weights adjusted accordingly
by the formula

$$w(k+1) = w(k) + \eta\, e(k)\, x(k), \qquad (3.5)$$

The feed-forward process is subsequently repeated. The weights are updated and adjusted
on each pass until the error between the target and the actual output is low, i.e. the model has
been sufficiently trained.
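The update loop of equations (3.4)-(3.5) can be sketched on a single linear neuron. In this illustrative example (not from the thesis) the neuron learns the mapping t = 2x; the learning rate and training pairs are assumptions.

```python
# Sketch of the error-driven update in equations (3.4)-(3.5): a single
# linear neuron learns t = 2*x.  The learning rate eta and the training
# pairs are illustrative assumptions.
eta, w = 0.1, 0.0
for epoch in range(100):
    for x, t in [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]:
        y = w * x                  # forward pass
        e = t - y                  # equation (3.4)
        w = w + eta * e * x        # equation (3.5)
print(w)   # converges to ~2.0
```

Each pass shrinks the error, exactly the repeat-until-low-error procedure described above.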










3.2 Activation functions
There are a number of different types of activation function, such as step, ramp and sigmoid.
However, the most commonly used activation functions are tan-sigmoid, log-sigmoid and
linear. The effect of the linear function is to multiply the input by a constant factor. The
sigmoid functions have an S-shaped curve, see Figure 3.4.







Figure 3.4 Log-sigmoid function Tan-sigmoid function Linear function
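The three functions of Figure 3.4 can be written out directly; the sketch below evaluates each at a few sample points to show their ranges (log-sigmoid in (0, 1), tan-sigmoid in (-1, 1), linear unbounded).

```python
import math

# Sketches of the three activation functions in Figure 3.4.
def logsig(n):  return 1.0 / (1.0 + math.exp(-n))   # output in (0, 1)
def tansig(n):  return math.tanh(n)                 # output in (-1, 1)
def purelin(n): return n                            # linear, unbounded

for n in (-2.0, 0.0, 2.0):
    print(logsig(n), tansig(n), purelin(n))
```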

3.3 Neural Network architecture
Network architectures can be categorised into two main types according to their connectivity:
feed-forward networks and recurrent (feedback) networks. A network is feed-forward if all
of the hidden and output neurons receive inputs from the preceding layer only. The input is
presented to the input layer and propagated forwards through the network. The output never
forms part of its own input, see Figure 3.5.











Figure 3.5 Multi-layer Feed-forward Network structure



In a multi-layer feed-forward network each layer has a weight matrix $W$, a bias vector $b$,
and an output vector $Y$. Thus the network in Figure 3.5 has an input vector

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad (3.6)$$

The weights of the network are the weight matrices

$$W^1 = \begin{bmatrix} w^1_{1,1} & w^1_{1,2} & w^1_{1,3} \\ w^1_{2,1} & w^1_{2,2} & w^1_{2,3} \end{bmatrix}; \qquad
W^2 = \begin{bmatrix} w^2_{1,1} & w^2_{1,2} & w^2_{1,3} \\ w^2_{2,1} & w^2_{2,2} & w^2_{2,3} \\ w^2_{3,1} & w^2_{3,2} & w^2_{3,3} \end{bmatrix}, \qquad (3.7)$$

where $w^l_{i,j}$ is the weight from neuron $i$ of the previous layer to neuron $j$ of layer $l$. The
biases are the bias vectors

$$b^1 = \begin{bmatrix} b^1_1 \\ b^1_2 \\ b^1_3 \end{bmatrix}; \qquad
b^2 = \begin{bmatrix} b^2_1 \\ b^2_2 \\ b^2_3 \end{bmatrix}, \qquad (3.8)$$

The output of the network can now be written as

$$y^1_1 = f^1\!\big(w^1_{1,1}x_1 + w^1_{2,1}x_2 + b^1_1\big), \qquad
y^2_1 = f^2\!\big(w^2_{1,1}y^1_1 + w^2_{2,1}y^1_2 + w^2_{3,1}y^1_3 + b^2_1\big), \qquad (3.9)$$

$$y^1_2 = f^1\!\big(w^1_{1,2}x_1 + w^1_{2,2}x_2 + b^1_2\big), \qquad
y^2_2 = f^2\!\big(w^2_{1,2}y^1_1 + w^2_{2,2}y^1_2 + w^2_{3,2}y^1_3 + b^2_2\big), \qquad (3.10)$$

$$y^1_3 = f^1\!\big(w^1_{1,3}x_1 + w^1_{2,3}x_2 + b^1_3\big), \qquad
y^2_3 = f^2\!\big(w^2_{1,3}y^1_1 + w^2_{2,3}y^1_2 + w^2_{3,3}y^1_3 + b^2_3\big), \qquad (3.11)$$
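The layer computations of equations (3.9)-(3.11) reduce to two matrix products. The sketch below is illustrative (random weights, tanh hidden units, linear output layer, an assumed choice consistent with the activation functions of Section 3.2).

```python
import numpy as np

# A numpy sketch of equations (3.9)-(3.11): hidden outputs
# y1 = f1(x @ W1 + b1) and network outputs y2 = f2(y1 @ W2 + b2),
# with tanh hidden units and a linear output layer.  The weight
# values are arbitrary illustrative numbers.
rng = np.random.default_rng(1)
x  = np.array([0.5, -0.3])          # input vector, equation (3.6)
W1 = rng.standard_normal((2, 3))    # w1[i, j]: input i -> hidden neuron j
b1 = rng.standard_normal(3)
W2 = rng.standard_normal((3, 3))    # w2[i, j]: hidden i -> output neuron j
b2 = rng.standard_normal(3)

y1 = np.tanh(x @ W1 + b1)           # hidden layer
y2 = y1 @ W2 + b2                   # linear output layer
print(y2.shape)                     # three network outputs
```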

The addition of extra layers or nodes enables the network to deal with more complex
problems and extract higher-order statistics. Cybenko proved that a feed-forward network
with a sufficient number of hidden neurons with continuous and differentiable transfer
functions can approximate any continuous function over a closed interval [11]. There is no
limit to the number of hidden layers. These hidden layers increase the non-linear complexity
of a network; however, Hornik and other researchers have shown that even a two-layer
network with a suitable number of nodes in the hidden layer can approximate any
continuous function over a compact subset [12]. Thus, generally just one or two hidden
layers are used. It follows that a feed-forward neural network with just one hidden layer is
suitable for the purpose of identification.

Recurrent networks have at least one feedback loop, i.e. a cyclic connection, which
means that at least one of the neurons feeds its signal back to the inputs of other neurons.
The behaviour of such networks may be extremely complex [13].



















Figure 3.6 Multi-layer Recurrent Network structure

The effect of the feedback loop enables the control of outputs through outputs, thus giving
recurrent networks "memory". This is especially meaningful if the network is
approximating functions dependent on time. Considering the dynamics of the inverted
pendulum, this is particularly applicable, i.e. there are several feedback loops in the
developed model of the pendulum. There are two main types of recurrent networks widely
used: Elman and Hopfield. The feedback loop in Elman networks enables them to learn to
recognise and generate spatial as well as temporal patterns [14]. An Elman network can
approximate any function (with a finite number of discontinuities) with arbitrary accuracy,
if the hidden layer has a sufficient number of neurons [15]. Hopfield networks are generally
used for the classification of feature vectors [16]. Subsequently Elman networks will be
chosen as the recurrent network architecture for identification purposes.
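The "memory" provided by the delayed feedback can be sketched with a single Elman-style recurrent layer. The example below is illustrative only (random weights, assumed sizes): the hidden state is fed back through a unit delay, corresponding to the z⁻¹ blocks in Figure 3.6, so the output stays non-zero after an impulse input has passed.

```python
import numpy as np

# Sketch of one Elman-style recurrent layer: the hidden state is fed
# back through a unit delay (the z^-1 blocks in Figure 3.6), giving the
# network memory of past inputs.  Sizes and weights are illustrative.
rng = np.random.default_rng(2)
Wx = rng.standard_normal((4, 1)) * 0.5   # input -> hidden
Wh = rng.standard_normal((4, 4)) * 0.5   # delayed hidden -> hidden
Wy = rng.standard_normal((1, 4)) * 0.5   # hidden -> output
h = np.zeros(4)                          # context (delayed) state

outputs = []
for u in [1.0, 0.0, 0.0, 0.0]:           # an impulse input sequence
    h = np.tanh(Wx @ np.array([u]) + Wh @ h)
    outputs.append(float(Wy @ h))
print(outputs)   # stays non-zero after the impulse: the state carries memory
```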


3.4 Learning Algorithms
In neural networks the learning ability is achieved by presenting a training set of different
examples to the network and using a learning algorithm to change the weights (or the
parameters of the activation functions) in such a way that the network will reproduce the
correct output for the corresponding input values. There are three main classes of learning:
reinforced, supervised and unsupervised. The latter two will be considered here.
In the supervised learning procedure a set of input-output pattern pairs is presented to
the network. The network propagates the pattern inputs to produce its own output pattern
and then compares this with the desired output. The difference is the error; if the error is
absent, learning is stopped, and if it is present, the error is back-propagated to change the
weights and biases. A supervised learning scheme is illustrated in Figure 3.7.










Figure 3.7 Supervised Learning

In unsupervised learning there is no external learning signal to adjust the network's weights.
The approach adopted is for the network to internally monitor its performance, seeking
trends in the input signals and adapting the network function accordingly. At present
unsupervised learning is not fully understood and is still the subject of much research.
An unsupervised learning scheme is illustrated in Figure 3.8.


Figure 3.8 Unsupervised Learning

3.5 Learning rules
Hebb's rule was the first rule developed. The rule declares that when a neuron receives an
input from another neuron, and both are highly active, the weight between the
neurons should be strengthened. Kohonen's learning rule is a procedure whereby
competing processing elements contend for the opportunity to learn. The only permitted
output is from the winning element; furthermore, this element plus its adjacent neighbours
are permitted to adjust their connection weights. It should also be noted that the size of the
neighbourhood may adjust during this training period. The back-propagation learning
algorithm is perhaps the most popular learning algorithm. The net simply propagates the
pattern inputs to outputs to produce its own patterns, comparing these with the desired
outputs, the difference being the error. If no error is present learning stops; however, if error
is present, it is back-propagated to change the weights and biases, and this recurs until no
error is present.
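Hebb's rule reduces to a one-line update: the weight grows in proportion to the joint activity of the two connected units. The values below are illustrative assumptions.

```python
# Sketch of Hebb's rule: the weight between two units is strengthened
# in proportion to their joint activity, w += eta * x * y.  The
# learning rate and activity pairs are illustrative.
eta, w = 0.5, 0.0
pairs = [(1.0, 1.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
for x, y in pairs:
    w += eta * x * y          # grows only when both units are active
print(w)   # 1.0: strengthened only by the two co-active presentations
```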

3.6 Neural Network Limitations
Neural networks do have a number of limiting factors, including training times, opacity and
local minima. Neural networks may require exhaustive training times, especially for
high-dimensional problems, due to the increased number of synaptic weights that must be
adjusted. However this problem is gradually diminishing with ever-increasing computer
processing capabilities. Opacity is associated with neural networks operating as black
boxes: only the inputs and outputs are visible, so it is difficult to relate the parameters
of the system under consideration to the internals of the network. Consequently it is
difficult to obtain an intuitive feel for its operation, as performance can only be measured
statistically. A major concern associated with training is the possibility of becoming
trapped in a local minimum, see Figure 3.9 below.

Figure 3.9 Local & global minimum

The global minimum represents the lowest point on the graph, which is the optimal solution
for the problem. The majority of training algorithms operate by travelling down these slopes
until they find the lowest point, and may become trapped in a non-optimal solution, i.e. a
local minimum. There are a number of ways to overcome this issue, such as the use of
momentum terms or Boltzmann annealing; however there is a trade-off with increased
training time.
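The momentum idea mentioned above can be illustrated on an invented one-dimensional error surface: a fraction of the previous weight update is carried forward, giving the search enough inertia to roll out of a shallow local minimum. This Python sketch is an illustration only; the surface, learning rate and momentum value are not from the thesis.

```python
# Gradient descent with a momentum term on a one-dimensional error
# surface (invented for illustration) with a local and a global minimum.
def error(w):                       # local minimum near w = -0.69, global near w = 2.19
    return w**4 - 2*w**3 - 3*w**2 + 5

def grad(w):
    return 4*w**3 - 6*w**2 - 6*w

w, velocity = -1.5, 0.0             # start on the slope above the local minimum
lr, momentum = 0.01, 0.9
for _ in range(500):
    velocity = momentum * velocity - lr * grad(w)   # carry a fraction of the last step
    w += velocity
```

With momentum the search overshoots the shallow minimum near w = -0.69 and settles in the global one; with momentum set to zero it stops in the local minimum instead.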

3.7 Applications

Neural networks are capable of adaptively controlling and modelling a wide range of
non-linear processes at a high level of performance, and in particular of obtaining models which
describe system behaviour, i.e. system identification. Modelling is extremely important as it
provides a tool for simulating the effects of different control strategies. Using the
back-propagation algorithm, a feed-forward network can be trained to approximate arbitrary
non-linear input-output mappings from a collection of example input and output patterns.
This learning technique has been applied in a wide variety of pattern classification and
modelling tasks. Neural networks have found application in several fields including
medicine, finance, management and other signal processing applications.



3.8 Summary
Undoubtedly neural networks provide an extremely powerful information-processing tool.
They are ideal for control systems because of their non-linear approximation capabilities,
adaptive control and computational efficiency due to their parallel architecture. Their ability
to learn by example makes them both flexible and powerful. They also remove the need to
explicitly know the internal structure of a specific task. Considering the specific control
problem of the inverted pendulum, it is apparent that neural networks have certain features
which make them extremely suitable for the identification and control of such a challenging
problem.











































Chapter 4

System Identification
The modelling and identification of linear and non-linear dynamic systems through the use
of measured experimental data is a problem of considerable importance in engineering [17],
and has duly received much attention. System identification is essentially a process of
sampling the input and output signals of a system and using the resulting data to
generate a mathematical model of the system to be controlled, i.e. it is a procedure whereby a
model is developed. Figure 4.1 depicts a system's inputs and outputs. The motivation behind
system identification is to obtain a model that enables the design and implementation of a
high-performance control system, while providing an insight into system behaviour,
prediction, state estimation, simulation etc. [18]






Figure 4.1 Input, Output, Disturbances of a System

Identification of multivariable systems is an extremely difficult problem due to the coupling
between various inputs and outputs, further complicated when systems are non-linear [19].
Neural networks are becoming increasingly recognised for this purpose due to their
parallelism, adaptability, robustness and inherent capability to handle non-linear systems.
Intuitively the inverted pendulum is extremely unstable; furthermore, from the modelling of
the pendulum in Chapter 2, the system is open-loop unstable. However stability is a
necessary criterion for system identification, so the system was placed in a closed loop and
stabilised using PID control. As the system is in a closed loop, it is desirable that little of the
dynamics of the controller be seen at the system output. To achieve this the controller was
de-tuned, i.e. the control was left loose; this ensures that the input/output data generated
emphasises the dynamics of the pendulum and is thus suitable for system identification.

Before neural networks are directly used for system identification, conventional linear
techniques such as autoregressive with exogenous input (ARX) and recursive autoregressive
moving average with exogenous input (RARMAX) will be investigated and
applied to the linear pendulum.

4.1 System Identification Procedure
System identification is essentially the process of adjusting the parameters of the model
until the model output resembles the output of the real system. The procedure for system
identification can be viewed graphically in Figure 4.2. The procedure can be categorised
into three main stages [20]:
1. Experimental input/output data from the process being modelled is required. With
respect to the inverted pendulum system this consists of the input force on the cart
and the pendulum angle.
2. The second stage is to choose which model structure to use.
3. Finally, the parameters of the model are adjusted until the model output
resembles the system output.






















Figure 4.2 System Identification Procedure


4.2 Conventional linear system identification
The ARX model structure is a simple linear difference equation which relates the current
output y(t) to a finite number of past outputs y(t-k) and inputs u(t-k).

y(t) + a_1 y(t-1) + ... + a_na y(t-na) = b_1 u(t-nk) + ... + b_nb u(t-nk-nb+1),   (4.1)

or in more compact form

y(t) = [ B(q) / A(q) ] u(t-nk) + [ 1 / A(q) ] e(t),   (4.2)

Thus the ARX structure is defined by the three integers na, nb and nk: na is the number of
poles, nb-1 is the number of zeros, and nk is the pure time delay in the system. For a
system under sampled-data control, nk is generally equal to 1. The main method used to
estimate the a and b coefficients in the ARX model structure is the least squares method,
which minimises the sum of squares of the right-hand side minus the left-hand side
of the expression above, with respect to a and b [21].
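The least squares estimate can be written out directly: each output sample is regressed on the delayed outputs and inputs, and the normal equations are solved in one step. The numpy sketch below is illustrative only (the second-order test system and its coefficients are invented); the thesis uses Matlab's arx function instead.

```python
import numpy as np

# Least-squares ARX estimation for
#   y(t) = -a1*y(t-1) - a2*y(t-2) + b1*u(t-1) + b2*u(t-2)
# The "true" coefficients are invented purely to exercise the estimator.
rng = np.random.default_rng(0)
a_true, b_true = [-1.5, 0.7], [1.0, 0.5]           # a stable 2nd-order system
u = rng.standard_normal(500)                        # excitation signal
y = np.zeros(500)
for t in range(2, 500):
    y[t] = (-a_true[0]*y[t-1] - a_true[1]*y[t-2]
            + b_true[0]*u[t-1] + b_true[1]*u[t-2])

# Regression matrix: columns are -y(t-1), -y(t-2), u(t-1), u(t-2)
phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
theta, *_ = np.linalg.lstsq(phi, y[2:], rcond=None)  # estimates [a1, a2, b1, b2]
```

Because the test data is noise-free, the least squares solution recovers the coefficients essentially exactly; with measurement noise present the estimate becomes the best fit in the squared-error sense, which is the situation the text describes.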

RARMAX is an extension of the ARMAX structure, which is in turn an extension of the
ARX model structure. RARMAX recursively estimates the a and b coefficients of the
ARMAX model structure. The ARMAX structure also includes an extra C polynomial in
the noise spectrum model; consequently RARMAX can provide greater accuracy.
y(t) = [ B(q) / A(q) ] u(t-nk) + [ C(q) / A(q) ] e(t),   (4.3)

The data from the model of the linear pendulum is exported to the Matlab workspace and
subsequently split into estimation and validation data. Changing the initial seed of the
excitation signal in the simulation creates the validation data. The ARX model is
implemented using Matlab functions as follows:

input_estim = force_1; %input data for estimation
output_estim= theta_1; %output data for estimation
input_val= force_2; %input data for validation
output_val= theta_2; %output data for validation
orders = [4 5 1]; % defines model structure

arx_model = arx([output_estim input_estim],orders); %arx function

compare([output_val input_val],arx_model); %compare function



The parameters of the ARX model are chosen heuristically and subjectively. The compare
function is used to compare the model output with the validation data. Initially the open-loop
unstable model of the inverted pendulum is modelled. A criterion for system
identification is that the system is stable; as the open-loop system is inherently unstable, the
ARX model completely fails to identify the pendulum, as expected, see Figure 4.3.


Figure 4.3 ARX model output with measured output

Subsequently input/output data is generated from the controlled closed-loop model of the
inverted pendulum. It is anticipated that the ARX model should identify the stable model;
furthermore, as the complexity of the models increases, the accuracy should also increase.
From the results obtained this proved correct. Table 4.1 shows the different
parameters tested and the resulting performance of these models.








ARX [na nb nk] ARX performance
1 1 1 21.94%
2 2 1 93.28%
3 2 1 93.28%
3 3 1 99.90%
4 2 1 93%
4 3 1 100%

Table 4.1 ARX model performance

Figure 4.4 shows the best model performance tracking both actual output from the system
and the model output. From the results the ARX model identifies the pendulum dynamics
with extremely good accuracy.


Figure 4.4 ARX [4 3 1] model output and measured output

Having identified the linear system using the arx model, the rarmax method is subsequently
tested. The results obtained are marginally better than those using the arx method, see
Table 4.2 and Figure 4.5. This is expected, as this method includes an additional C
parameter in the noise spectrum model. The Matlab script for this is implemented as follows:

orders = [3 3 3 1]; %model structure
[rarmax_model,yhat]=rarmax([output_estim input_estim],orders,'ff',0.99); % 'ff' = forgetting factor adaptation; 0.99 is a typical value (the value used originally was not recorded)




RARMAX [na nb nc nk ] RARMAX performance
2 1 1 1 98.98%
2 2 2 1 99.01%
3 2 2 1 99.34%
3 3 2 1 99.58%
3 3 3 1 100%

Table 4.2 RARMAX model performance


Figure 4.5 RARMAX [3 3 3 1] model output and process output

The results using the linear identification techniques (arx, rarmax) verify that the system must
be stabilised before identification can be performed. These conventional linear
identification techniques also performed extremely well in modelling the dynamics of the linear
pendulum. However their ability to represent non-linear systems is restricted; for
completeness the arx and rarmax models are tested to verify this. The input-output data for the
non-linear pendulum is generated in a similar manner as previously for the linear model. Again
parameters for both models are chosen heuristically and subjectively. Figure 4.6 shows the
best model identified. As expected, these conventional identification techniques cannot
identify the full dynamics of the non-linear pendulum.


Figure 4.6 RARMAX [4 3 3 1] model output and validation data

4.3 Non-linear System Identification using NARMAX
Linear identification techniques are well established. However their representation ability of
non-linear processes is clearly limited. Subsequently, non-linear black-box model structures
have been developed, however they are still the subject of much debate. One such technique
is non-linear auto regressive moving average with exogenous inputs or narmax. The narmax
model can be described by

y(t+1) = h( y(t), ..., y(t-n_y+1), u(t), ..., u(t-n_u+1) ) + e(t),   (4.4)

The main problem with narmax is how to construct a model that is easily estimated and used
to describe a system's dynamics in practical terms [22]. The main disadvantage of the
narmax estimation procedure is the need to select the most useful terms to be included in the
model, chosen from a large number of available model terms, usually running into
thousands. This is the most challenging step in the estimation of narmax
structures, since it depends on factors such as the sampling frequency and prior knowledge
of the system orders [23]. As such, non-linear identification techniques such as narmax
do not offer a suitable solution for system identification in practical terms.
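The size of the candidate term set can be illustrated with a quick count: a polynomial expansion of degree d over n lagged regressors yields C(n+d, d) candidate monomials. The lag orders and degree below are assumptions chosen purely for illustration, but they show how quickly the count runs into the thousands noted above.

```python
from math import comb

# Number of candidate monomials of degree <= d built from n lagged
# regressors (past y's, u's and e's) is C(n + d, d).
# The lag orders and polynomial degree are assumed for illustration.
n = 10 + 10 + 10     # e.g. n_y = n_u = n_e = 10 lagged terms each
d = 3                # cubic polynomial expansion
candidate_terms = comb(n + d, d)
```

Even this modest choice of lags and a cubic expansion already gives several thousand candidate terms, which is the selection burden the text describes.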






4.4 System Identification using Neural Networks
In this section neural networks are implemented for the identification of both the linear and
non-linear inverted pendulum models. In Chapter 3, different neural networks structures
were examined, consequently two structures were identified as suitable for the system
identification of the inverted pendulum, feed-forward and recurrent (Elman) networks. A
common structure for achieving system identification using neural networks is forward
modelling, Figure 4.7. This form of learning structure is a classic example of supervised
learning. The neural network model is placed in parallel with the system both receiving the
same input, the error between the system and network outputs is calculated and
subsequently used as the network training signal.
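The forward-modelling structure can be sketched in a few lines: the model sits in parallel with the plant, both receive the same input, and the output error drives the weight update. In this illustrative Python fragment the "plant" is an invented first-order system and the "network" is a simple linear-in-weights model trained by an LMS-style gradient rule, standing in for the Matlab networks used in the thesis.

```python
import numpy as np

# Forward modelling sketch: a linear-in-weights "network" is placed in
# parallel with an invented plant; the output error is the training signal.
rng = np.random.default_rng(1)

def plant(u_prev, y_prev):                 # stand-in system (invented)
    return 0.8 * y_prev + 0.4 * u_prev

w = np.zeros(2)                            # model weights for [y_prev, u_prev]
y_prev, lr = 0.0, 0.05
for _ in range(2000):
    u = rng.uniform(-1, 1)                 # the same input feeds plant and model
    y = plant(u, y_prev)                   # system output
    y_hat = w @ np.array([y_prev, u])      # network (model) output
    e = y - y_hat                          # error = training signal
    w += lr * e * np.array([y_prev, u])    # adjust weights from the error
    y_prev = y
```

Because the training signal is the plant/model output difference, the weights converge towards the plant's own coefficients; the same parallel arrangement, with a neural network in place of the linear model, is what Figure 4.7 depicts.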














Figure 4.7 Forward modelling of inverted pendulum using neural networks

In order to provide targets for the network, the previously developed Simulink models of the
inverted pendulum with feedback control are used. The control force is used as the input to
the neural network, while the target for the network is the pendulum angle theta
(radians). For completeness the linear model of the inverted pendulum is identified
first. The first type of neural network tested is the feed-forward network. Using Matlab it is
possible to develop multi-layer perceptrons; however this shall be restricted to either one or two
hidden layers, as previous research has shown this to be sufficient for the identification of
non-linear systems. It is also expected that increasing the number of neurons in the hidden
layer will improve the model's accuracy.

A feed-forward back propagation network is created using Matlab script as follows:
net = newff([-10 10],[10 1], {'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.0001;
net = train(net,in(1:2000)',theta(1:2000)'); % input and target vectors must be the same length

In the example above a two-layer feed-forward network is created. The network's input
ranges from -10 to 10. The first layer has ten tansig neurons and the second layer has one
purelin neuron. This is the standard set-up of activation functions for multi-layer perceptrons:
the hidden layers have non-linear functions while the output layer has a linear
activation function. The trainlm network training function is used, and back-propagation
updates the weights. The number of epochs and the learning rate can be set and adjusted. By
examining the training diagram it can be determined when the network is sufficiently
trained, and whether the convergence is too fast; this can sometimes account for getting
stuck in a local minimum, see Figure 4.8.
Figure 4.8 Neural network training

When the network is sufficiently trained, it is exported to the Simulink
environment using the gensim command. To ensure the network is adequately validated,
the initial seed of the input signal must be changed. The quality of each model is assessed
by the mean squared error and by a comparison of the system and network outputs. The
mean squared error alone is not sufficient to determine the quality of the model, as there
could be a low mean squared error and yet a poor prediction of the dynamics of the system.
The Simulink set-up for model validation may be viewed in Figure 4.9.




Figure 4.9 Model validation set-up in simulink


The following table 4.3 summarises the results obtained.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
feed-forward 1 4 400 0.000061404
feed-forward 1 10 400 0.0029
feed-forward 1 20 400 0.00007839
feed-forward 1 35 400 0.001
feed-forward 1 50 400 0.0013
feed-forward 2 [4 2] 400 0.000059481
feed-forward 2 [10 4] 400 0.000060253
feed-forward 2 [20 10] 400 0.00040555
feed-forward 2 [30 20] 400 0.0002063

Table 4.3 Summary of feed-forward neural network performance

A nominal number of neurons in the hidden layer was sufficient to identify the model
extremely well; above this threshold the performance decreased. Tim Callinan's work
on identification of the inverted pendulum using feed-forward neural networks attributed
this to the fact that the system is in a closed loop; as such, de-tuning the controller has a
greater effect on the model's performance than an increase in the number of neurons [24].
The lowest mean squared error and overall best performance was achieved using two hidden
layers with four and two neurons respectively. This is expected as, as previously discussed,
an increase in hidden layers should improve the model's performance. Figure 4.10 and 4.11 show the

optimum performance obtained using one and two hidden layer feed-forward networks
respectively, plotting the process output against the model output. Overall the feed-forward
networks model the process well, predicting the pendulum angle with a low mean squared error.

Figure 4.10 feed-forward network, 1 hidden layer, 4 hidden neurons


Figure 4.11 feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively




The next type of ANN tested is the Elman network. As discussed in Chapter 3, Elman
networks are expected to perform well due to their feedback loop providing dynamic
memory, making them suitable for the prediction of dynamic systems. The
Elman network is implemented using Matlab functions as follows:
net = newelm([-10 10],[10 5 1], {'tansig' 'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.0001;
net = train(net,in(1:2000)',theta(1:2000)');

In the example above, an Elman network with two hidden layers is created. The network's
input ranges from [-10 to 10]. The first layer has ten tansig neurons, the second layer has
five tansig neurons and finally the output layer has a single linear purelin neuron.

After initial testing it is found that the Elman models fail to predict the angle of the pendulum,
and the model predictions are completely out of range. Consequently several signal
pre-processing techniques for neural networks were implemented. It is found that scaling the
input/output data for training has a small filtering effect, subsequently improving the
network's performance during training. A scaling factor of ten was used, and all validation
data is also scaled accordingly. Filtering the data in this way decreases the possibility
of the network getting caught in a local minimum. The improvement in results was
extremely good and the models successfully predicted the pendulum angle. Table 4.4
summarises the results obtained.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Elman (Recurrent) 1 4 400 0.0334
Elman (Recurrent) 1 10 400 0.0033
Elman (Recurrent) 1 20 400 0.0014
Elman (Recurrent) 1 25 400 0.0011
Elman (Recurrent) 2 [4 2] 400 0.0023
Elman (Recurrent) 2 [10 4] 400 0.00359
Elman (Recurrent) 2 [15 10] 400 0.003

Table 4.4 Elman network performance
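The scaling pre-processing used above amounts to a pair of helper functions: scale the training and validation data by the chosen factor, and invert the scaling on the network's predictions. The Python sketch below is illustrative only; the thesis does this in Matlab, and whether the data was multiplied or divided by the factor is not stated, so division is assumed here.

```python
import numpy as np

# Pre-processing sketch: scale the data by a fixed factor before training
# and invert the scaling on the network's predictions afterwards.
# The factor of 10 follows the text; division (rather than multiplication)
# by the factor is an assumption made for this illustration.
SCALE = 10.0

def preprocess(theta):
    return theta / SCALE            # applied to training and validation data

def postprocess(theta_scaled):
    return theta_scaled * SCALE     # applied to the network's outputs

theta = np.array([0.12, -0.05, 0.31])   # example angle samples (invented)
recovered = postprocess(preprocess(theta))
```

The key point is that the transformation is invertible and applied consistently to training data, validation data and predictions, so the network sees a better-conditioned signal without the reported angles being altered.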

Despite some Elman models having a low MSE, Figure 4.12 shows that they still
failed to adequately identify the pendulum angle. It was found, however, that an increase in the
number of neurons and hidden layers significantly improved their performance, see Figure
4.13.


Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively


Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively

Overall both networks performed well. The feed-forward network slightly outperformed the
recurrent network with a lower mean squared error. However both models identified the
pendulum dynamics well. Having successfully identified the linear pendulum, the next step
is the identification of the non-linear model. Identification of the non-linear model proceeds
in the same manner as that for the linear model. Input-output data is generated, the network
is trained, imported into simulink and validation performed.

The first network tested is the feed-forward network. Table 4.5 summarises the models
tested and their performance. The mean squared error is low, and an increase in neurons in
the hidden layer does improve performance as expected. This was not the case when identifying
the linear model, as the controlled closed-loop model had a greater impact on the dynamics
of the pendulum seen at the output. Figure 4.14 and Figure 4.15 show the best models identified
using one and two hidden layer feed-forward neural networks respectively, plotting the
process output against the model output. From the graphs the models identify the pendulum
dynamics extremely well.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
feed-forward 1 4 400 0.000051028
feed-forward 1 10 400 0.000050314
feed-forward 1 20 400 0.000048409
feed-forward 1 35 400 0.00004851
feed-forward 1 50 400 0.000046509
feed-forward 2 [4 2] 400 0.000049856
feed-forward 2 [10 4] 400 0.00005027
feed-forward 2 [20 10] 400 0.000048053
feed-forward 2 [30 20] 400 0.000046268

Table 4.5 Summary of feed-forward network performance




Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons




Figure 4.15 Feed-forward network, 2 hidden layers, 30 and 20 neurons respectively

The next step is to identify the non-linear pendulum model using the Elman recurrent
network. As with identification of the linear pendulum, it is found that a scaling factor is
required for the training data. This value is chosen heuristically and subjectively; the ideal
scaling factor is determined to be 1000. This improves results dramatically, however the
network still performs poorly. Figure 4.16 shows the difference made by the scaling factor. The
models do not identify the dynamics with the same degree of accuracy as the feed-forward
models.
Figure 4.16 Difference between network trained with data scaled

Clearly the scaling of the training data vastly improves the model performance; yet on
examination of the best model identified, the Elman network's performance is still inferior to
the feed-forward networks. This is unexpected, as the feedback loop providing
dynamic memory in the Elman network should enhance its prediction of non-linear
systems. However, in cases where the required depth of memory is much larger than the size
of the tapped delay line, a recurrent network may operate poorly; essentially the information
needed to predict the future is not concentrated in the current sample's neighbourhood [25].
Thus the network is unable to fully identify the dynamics of the pendulum. Table 4.6
summarises the models tested and their performance, and Figure 4.17 shows the best model
identified.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Elman 1 4 400 0.1243
Elman 1 10 400 0.0535
Elman 1 20 400 0.3049
Elman 1 25 400 0.3868
Elman 2 [4 2] 400 0.6928
Elman 2 [10 4] 400 0.1017
Elman 2 [15 10] 400 0.05

Table 4.6 Elman network performance

Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively

The overall results show that the feed-forward network outperforms the recurrent Elman
network in identifying the non-linear model's dynamics. Subsequently the report shall proceed
using primarily feed-forward networks.

4.5 Javier's Linearised Model
Javier's linear model of the inverted pendulum is of considerable interest, as it is upon this
model that controllers for the physical pendulum were developed, thus giving an intuitive feel
for how accurate the models developed of the linear and non-linear inverted pendulum are.
The following is the transfer function of the system [26]:

G(s) = G1 G2 = [ k1 / ( s(s + a) ) ] · [ k2 s^2 / ( (s + b)(s - b) ) ]

using the following parameters:

F = 0.303 kg/s
M = 0.091 kg
L = 0.32898 m
g = 9.8 m/s^2

Thus giving the following linearised model, where a = F/M, b = sqrt(g/L) and k1 = 1/M:

a = 3.33
b = 5.46
k1 = 11
k2 = 30


This linearised system of the inverted pendulum is modelled in Matlab Simulink. As this
model proved extremely robust in the design of controllers for the physical system, it
provides a good insight into the behaviour of the physical rig. Figure 4.18 shows the
Simulink set-up. The system is stabilised using a PID controller.
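As a numerical check (in Python, for illustration only), the constants a, b and k1 quoted for the linearised model follow directly from the physical parameters: a = F/M, b = sqrt(g/L) and k1 = 1/M, consistent with the values given in the text. The constant k2 is not re-derived here.

```python
import math

# Reproducing the linearised-model constants from the physical parameters
# quoted in the text: a = F/M, b = sqrt(g/L), k1 = 1/M.
F = 0.303      # friction coefficient, kg/s
M = 0.091      # pendulum mass, kg
L = 0.32898    # pendulum length, m
g = 9.8        # gravitational acceleration, m/s^2

a = F / M              # pole from cart friction
b = math.sqrt(g / L)   # pendulum pole pair at +/- b
k1 = 1 / M             # cart gain
```

Evaluating these gives a = 3.33, b = 5.46 and k1 = 11 to the precision quoted in the text.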






Figure 4.18 Javier's Linearised Model

The input to the system is a dither signal in the form of a pseudo-random binary sequence
(PRBS). This signal has a spectral content rich in frequencies [27]. The objective of this
excitation signal is to generate input-output data which contains the process dynamics over
the entire operating range. Figure 4.19 shows the pendulum angle theta.
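A PRBS of the kind described can be generated with a linear feedback shift register. The sketch below is illustrative only; the register length, tap positions and seed are arbitrary choices, not taken from the rig software. It produces a maximal-length sequence of +/-1 values that repeats every 127 samples, giving the frequency-rich excitation described above.

```python
import numpy as np

# PRBS sketch via a maximal-length 7-bit linear feedback shift register.
# Register length, taps and seed are illustrative choices; the amplitude
# is mapped to +/-1 as a normalised dither signal.
def prbs(n_samples, seed=0b1010101):
    state, out = seed, []
    for _ in range(n_samples):
        bit = ((state >> 6) ^ (state >> 5)) & 1    # feedback taps giving period 127
        state = ((state << 1) | bit) & 0x7F        # shift in the feedback bit
        out.append(1.0 if bit else -1.0)
    return np.array(out)

signal = prbs(254)    # two full periods of the 127-sample sequence
```

Because 2^7 - 1 = 127 is prime, any non-zero seed yields the full-period sequence; in practice one period should be longer than the slowest dynamics of interest.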


Figure 4.19 A comparison of Javiers Model & Non-linear Model

Comparing the two models, although they do differ, they display the same dynamics.
Thus the next step is to identify Javier's model using neural networks. The same procedure
that is used to identify the non-linear pendulum is adopted, and input-output data for
training and validation is generated using the same method. For completeness recurrent
networks in the form of Elman networks are tested; a similar pattern emerges as with previous
testing: the training data had to be scaled and the feed-forward networks out-performed them.
Table 4.7 shows the different feed-forward networks tested and their respective performance.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Feed-forward 1 4 400 1.296E-08
Feed-forward 1 10 400 2.3718E-08
Feed-forward 1 20 400 1.2917E-08
Feed-forward 1 35 400 1.2901E-08
Feed-forward 1 50 400 1.2892E-08
Feed-forward 2 [4 2] 400 1.2931E-08
Feed-forward 2 [10 4] 400 1.2942E-08
Feed-forward 2 [20 10] 400 1.2917E-08
Feed-forward 2 [30 20] 400 1.2916E-08

Table 4.7 Summary of feed-forward network performance
Table 4.7 clearly shows that the mean squared error is extremely small. An increase in the
number of hidden layers and in the number of neurons in the hidden layer does improve
performance slightly. However, even one hidden layer with four neurons gives an
extremely small MSE. Figure 4.20 shows the best model identified, plotting the process
output against the model output. The model identifies the system dynamics extremely well.
Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively



4.6 Summary
In this chapter an overall feel for system identification has been presented and its associated
procedure explained. Conventional identification techniques were first applied to the linear
model of the inverted pendulum. These traditional techniques modelled the linear pendulum
with good accuracy; however they cannot capture the complexity of the non-linear
pendulum. Extensions have been made to these linear identification techniques in the form
of non-linear armax. The main problem with non-linear armax is how to construct a model
that is easily estimated and used to describe a system's dynamics in practical terms. As such,
system identification proceeded using neural networks. Two main network structures were
used: feed-forward and recurrent. Initially the linear pendulum model was identified, and
both structures performed well. However it was found necessary to pre-process the data for
the recurrent network by using a scaling factor, which consequently has a filtering effect on
the data; this helps prevent the network getting stuck in a local minimum during training.
The next step was identification of the non-linear pendulum. Again both network
architectures were tested, with results similar to the linear identification, pre-processing of
the training data being required for the recurrent Elman network. Overall the feed-forward
networks out-performed the recurrent networks. This is unexpected, as recurrent networks
with their dynamic memory should identify a dynamic system with good accuracy; it can be
attributed to the fact that in some cases recurrent networks perform poorly if the length of
the tapped delay line is smaller than the depth of memory required to predict the next
sample. Consequently the remainder of the report will use feed-forward networks
exclusively. Finally in this chapter Javier's linearised model of the inverted pendulum was
examined. This model is important because it is upon this model that controllers for the
physical pendulum were developed; as such it gives an intuitive feel for the accuracy of the
models identified. It is seen that overall the models contain largely the same dynamics. For
completeness Javier's model is also identified using both feed-forward and recurrent
networks. In the next chapter, identification of the physical pendulum is examined and
implemented.







Chapter 5

Real Time Identification
The pendulum rig comprises a pole mounted on a cart, free to swing only in a vertical
plane. The cart is driven by a DC motor and moves on a rail of limited length.
Two optical encoders are used to detect the pendulum angle and cart position. The two output
signals are received by a control algorithm via the interface card, which subsequently
determines the control action necessary to keep the pendulum upright. The control signal is
limited to a normalised range from -1 to 1. Figure 5.1 shows the pendulum control
system.










Figure 5.1 Set-up of pendulum rig

The control algorithm is implemented in Matlab. Figure 5.2 shows the real time kernel (RTK) in the
Matlab environment. The RTK is an encapsulated block implementing the control tasks.
The input to the RTK block can be in the form of an excitation signal or a desired cart
position. The outputs from the RTK contain all data regarding pendulum angle, angular
velocity, cart position, cart velocity and the control value. There is however no visible feedback
control loop, because the controller is embedded in the RTK.





Figure 5.2 Real Time Task in simulink environment

5.1 Closed-loop controller
The experiment starts with the pendulum in a downward position. The pendulum is steered
to its upright unstable position and subsequently kept erect by the linear-quadratic (LQ)
controller. As such two independent control algorithms are required:

- Swinging algorithm
- Stabilising algorithm

Only one control algorithm is active in each control zone. Figure 5.3 shows these zones.












Figure 5.3 Zones of Control Algorithms



The swinging control algorithm is a heuristic one, based on energy rules, and has the form

u = u_old + sign(u_old) · Friction,   (5.1)

where the control u is a normalised value between -1 and 1.

The linear-quadratic controller that keeps the inverted pendulum stabilised has the form

u = K1 x1 + K2 x2 + K3 x3 + K4 x4,   (5.2)

where

x1 = desired position of the cart - measured position of the cart,
x2 = desired angle of the pendulum - measured angle of the pendulum,
x3 = desired velocity of the cart - observed velocity of the cart,
x4 = desired angular velocity of the pendulum - observed angular velocity of the pendulum,

and K1...K4 are positive constants.

The optimal feedback gain vector K = [K1 ... K4] is calculated such that the feedback law

u = -Kx,   (5.3)

where x = [x1 ... x4], minimises the cost function

J = Integral{ x'Qx + u'Ru } dt,   (5.4)

where Q and R are the weighting matrices [28].
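The two control zones and their laws, equations (5.1) to (5.3), can be sketched as follows. All numeric values here (the gains K1..K4, the friction constant and the zone threshold) are invented placeholders; the real gains come from the LQ design and the real zone boundary is set by the rig software.

```python
import math

# Sketch of the two control zones: an energy-based swing-up rule (5.1)
# and a linear state-feedback law (5.2). Gains K, the Friction constant
# and the zone threshold below are invented for illustration.
K = [1.0, 25.0, 2.0, 3.0]           # hypothetical gains K1..K4
FRICTION = 0.05
STABILISATION_ZONE = 0.35           # rad from upright (assumed threshold)

def clamp(u):
    return max(-1.0, min(1.0, u))   # control normalised to [-1, 1]

def swing_up(u_old):
    # u = u_old + sign(u_old) * Friction   -- equation (5.1)
    return clamp(u_old + math.copysign(FRICTION, u_old))

def stabilise(errors):
    # u = K1*x1 + K2*x2 + K3*x3 + K4*x4    -- equation (5.2)
    return clamp(sum(k * x for k, x in zip(K, errors)))

def controller(theta_from_upright, errors, u_old):
    # only one algorithm is active, depending on the current zone
    if abs(theta_from_upright) < STABILISATION_ZONE:
        return stabilise(errors)
    return swing_up(u_old)
```

The zone test reproduces the structure of Figure 5.3: near upright the state-feedback law takes over; elsewhere the swing-up rule pumps energy into the pendulum by pushing in the direction of the previous control value.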

5.2 Identification of physical system
At this stage there is a closed loop stabilised system, this is necessary for identification of an
open-loop unstable system. The non-linear model identified using neural networks is first
compared with the physical rig. The input to both the physical system and the model is a
small dither signal is to generate input-output data, which contains the non-linear process
dynamics over the entire operating range. Both systems are in a controlled closed loop.
Comparing the pendulum angle of each the dynamics are similar thus the modelling of the
system in simulink has been accurate, see Figure 5.4











Figure 5.4 NN non-linear model output & physical system output

Having confirmed that the two systems' dynamics are similar, the next step is the
identification of the real rig. The data for this is generated in a similar manner: the
experiment starts with the pendulum in a downward position. Figure 5.5 shows the
pendulum angle during the test.

Figure 5.5 Pendulum angle of Real System

Having generated input-output data for the physical system, the next stage is to develop a
neural network that identifies the pendulum angle of the physical rig. A feed-forward
network is used to identify the pendulum angle. Training of the network is performed and
the network is subsequently imported into the Simulink environment. Validation of the
network is performed on-line, see Figure 5.6. A single hidden layer is adopted,
and different numbers of neurons in the hidden layer are tested.






Figure 5.6 Validation set-up

Several different models were tested but were unable to identify the pendulum angle, see
Figure 5.7. The problem is that two different controllers are used to swing up and stabilise
the pendulum; depending on what zone the pendulum is in, the output is calculated in a
different manner, thus affecting the data for training of the neural network and the
subsequent identification.
Figure 5.7 feed-forward NN, 1 hidden layer with 75 neurons

Given that a neural network can approximate data on which it has not been trained, it is
decided to train the neural network using the data from the stabilised zone and see if it can
approximate the swing-up action. Thus the neural network is re-trained using only the
input-output data from when the pendulum is in the stabilised zone under linear-quadratic
control, i.e. it is not trained during the swing-up action using the swinging algorithm.
Figure 5.8 shows the best model obtained.
Figure 5.8 feed-forward neural network, 1 hidden layer with 75 neurons

From the diagram a vast improvement in the model is observed: it identifies the pendulum
angle in the stabilised region extremely well and also models the pendulum swing-up
motion, on which it has not been trained. This is possible because of neural networks'
ability to approximate functions on which they have not been trained. At this stage a
suitable model has been developed for the physical rig. However, this must be subjected to
further tests: the model identified must be robust in order to achieve neuro control.
Subsequently a disturbance is added on-line to see how the model responds. See Figure 5.9
for the set-up.





Figure 5.9 System set-up with Disturbance
From Figure 5.10 it can be seen that the model correctly identifies the pendulum angle
during the disturbance.

Figure 5.10 Pendulum angle during disturbance

To further test the model, the dither signal is increased so that the system is unstable
throughout the experiment. Figure 5.11 on the following page shows that the model still
identifies the pendulum angle. Thus an accurate model of the inverted pendulum has been
developed.



Figure 5.11 Pendulum Angle, with large excitation signal

5.3 Summary
In this chapter the physical rig set-up and its closed-loop controllers are discussed. The
system comprises two controllers: one to swing up the pendulum and one to maintain it
in the stabilised region. Neural networks had difficulty in identifying the process since,
depending on what zone the pendulum is in, the output is calculated in a different manner;
as such the outputs of the two controllers do not relate to each other. Neural networks can
approximate data on which they have not been trained; consequently the network is trained
using only the data from the pendulum in the stabilised region. The neural network
subsequently identifies the pendulum angle with good accuracy and successfully predicts
the swing-up action. The model is then subjected to further experiments to test its
robustness, and it successfully identifies the process when subjected to a disturbance and a
larger excitation signal. Subsequently the next stage is neuro-control, which is dealt with in
the following chapter.











Chapter 6
Neuro Control
The inverted pendulum is open-loop unstable, non-linear and a multi-output system. The
physical rig has one input, a normalised control value between −1 and 1, and two outputs,
the pendulum angle theta and the cart position. A model of the physical rig has been
identified using static feed-forward networks modelling the pendulum angle. Thus the next
step is neuro control. However, before a neuro-controller is developed, a comparison
between standard linear control techniques such as PID and neuro control is made.
Subsequently the different techniques of neuro control are discussed and a controller
developed.
Standard linear control techniques such as PID cannot map the complex non-
linearities of the pendulum system. They have been used to control the physical rig, but
only on the condition that the experiment starts with the pendulum in the stabilised zone;
even then their control of the system is extremely limited. ANNs have the capability of
adaptively controlling and modelling a wide range of non-linear processes at a high level
of performance. The inverted pendulum is a SIMO system; in order to have full-state
feedback control several PID controllers would be necessary. However, due to neural
networks' parallel nature, a single neural network is sufficient.
Before a neuro-controller is developed, the merits of the main types of neuro-control
are discussed. The main types of neuro control include supervised, unsupervised, model
reference, direct inverse and adaptive.

6.1 Supervised Control
In supervised control the neural network uses an existing controller to learn the control
action. One may question why mimic an existing controller if it performs satisfactorily. The
problem is that traditional controllers may operate well around a specific operating point,
but if a disturbance or uncertainty occurs these traditional controllers fail. The advantage of
neuro-control is that the network can adjust and update its weights. A neuro-controller can
also approximate data on which it has never been trained. Supervised control proceeds with
a teacher providing the control output for the neural network to learn. The simplest
approach to this method is to teach the network off-line; subsequently the
neural network is placed in the feedback loop, see Figure 6.1.
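
Supervised control as described above can be reduced to a toy sketch: a one-weight linear "network" learns, by LMS updates, to reproduce a teacher proportional controller. The teacher gain, learning rate and error sweep are illustrative assumptions, not values from the project:

```c
#include <math.h>

/* A toy of supervised neuro-control: the "network" u = w*e learns to mimic
 * a teacher controller u_t = k*e from recorded (error, control) pairs. */
double train_mimic(double k_teacher, double lr, int epochs)
{
    double w = 0.0;
    for (int n = 0; n < epochs; n++) {
        for (int i = -10; i <= 10; i++) {
            double e = i / 10.0;               /* state error seen by both  */
            double u_teacher = k_teacher * e;  /* teacher's control action  */
            double u_net = w * e;              /* network's current attempt */
            w += lr * (u_teacher - u_net) * e; /* LMS step toward teacher   */
        }
    }
    return w;
}
```

Here `train_mimic(2.5, 0.1, 200)` converges to the teacher gain 2.5; the same principle, scaled up to a multi-layer network and a swing-up/stabilising teacher, is what the off-line training performs.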














Figure 6.1 Supervised learning using existing controller


6.2 Unsupervised Control
Unsupervised control does not require prior knowledge. However, the behaviour of such a
network may be extremely complex; at best unsupervised control is still not fully
understood. In unsupervised learning the neural network tests different states, determining
which produces the correct control action. The learning time is computationally inefficient.
However, an unsupervised neuro-controller can deal with complex non-linear control.
Anderson et al [29] developed an unsupervised controller for the inverted pendulum,
although a certain amount of prior knowledge is incorporated, in that a failure signal is
supplied to the neural network based on pendulum angle and cart position.


6.3 Adaptive neuro control
The main advantage of adaptive control is the ability to adapt on-line. This is achieved by
presenting the neuro controller with an error signal. This is calculated by subtracting the
actual output from the desired output. Subsequently the error is used to adjust the weights
on-line, see Figure 6.2.






















Figure 6.2 Adaptive neuro control

6.4 Model Reference Control
Model reference control differs from adaptive neuro control in that the desired closed-loop
response is specified through a reference model. Thus, the error signal is calculated using
the reference model. The neuro-controller forces the plant output to follow the reference
model output, see Figure 6.3.











Figure 6.3 Model Reference Control




6.5 Direct inverse control
In direct inverse control the neural network is trained to model the inverse of the plant. The
plant output is fed to the neuro controller; the neuro controller's output is then compared
with the plant input and the network trained, see Figure 6.4. The main difficulty with
this method is that the inverse model must be extremely accurate; as such this method is
limited to open-loop stable systems [30]. This can be attributed to the fact that in a closed
loop too much of the plant dynamics are removed, so an accurate model cannot be
identified.










Figure 6.4 Direct inverse control

Considering the different control techniques possible with respect to the inverted pendulum,
supervised control emerges as a suitable solution. Inverse control is not possible since, as
previously stated, an extremely accurate inverse model of the plant is required; this is
unobtainable because the inverted pendulum is open-loop unstable, and the closed-loop
model contains some of the dynamics of the controller. Anderson has proved unsupervised
control of the inverted pendulum possible; however, this is an extremely complex method
not yet fully understood, and given the time constraints of the project it is not a viable
technique. Similar to inverse model control, model reference control also requires an
accurate model of the plant, so this method too is limited to open-loop stable systems.
Given that there is an existing controller for the inverted pendulum rig, based on a swinging
algorithm and a stabilising algorithm, supervised control offers the best solution.
Consequently neuro-control proceeds using supervised neuro-control.






6.6 Neuro Control in Simulink
It is decided to develop a neuro-controller using supervised learning. Using the existing
feedback controller for the non-linear pendulum, a feed-forward network is trained to model
the controller. The model is developed and trained using the same techniques covered in
system identification, with the exception that the input used to train the network is the angle
theta and the target output is the existing controller's output. When training is complete the
model is imported into the Simulink environment and placed in the feedback loop, replacing
the existing controller, see Figure 6.5.


Figure 6.5 Neuro Controller in Simulink

Figure 6.6 shows the pendulum angle controlled using the neuro controller, clearly the
controller maintains stability, keeping the pendulum up-right.
Figure 6.6 Pendulum angle using neuro controller in Simulink


The next step is placing the controller in closed-loop with the non-linear pendulum model
identified using neural networks, see Figure 6.7.


Figure 6.7 neuro control of non-linear pendulum model

Figure 6.8 shows that the neuro-controller also stabilises the model identified using neural
networks.

Figure 6.8 neuro control of non-linear model

6.7 Real time neuro-control
At this stage neuro-control of the non-linear pendulum model and of the model identified
using neural networks has been carried out in the Simulink environment. The next step is
neuro control of the physical system. It is possible to test and develop controllers for the
physical rig using the external controller function. The external controller is a file
containing the control routine, which is accessed at the interrupt time. The control algorithm
must be written in C code and a dynamic link library created. Thus the steps to develop the
neuro-controller can be categorised into four main stages:

Train the neural network offline, import into simulink.
Validate model online.
Code the neuro-controller in C.
Test online.
Up until this point, system identification has primarily focused on the pendulum
angle. In order to achieve control of the physical system, both the pendulum angle theta and
the cart position must be taken into account, due to the limited length of the rail. The neural
network training procedure is carried out in a similar manner as previously, with the
exception that the inputs for training are the pendulum angle and the cart position, and the
target output is the control output. The Matlab script is as follows:

tempP = [angle';position'];    % training inputs: pendulum angle and cart position
net = newff([-3.5 3.5; -1 1],[20 1], {'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 1000;
net.trainParam.lr = 0.0001;
net = train(net,tempP,theta'); % training target: the recorded control output

Feed-forward networks are used and different parameters are tested, i.e. the number of
hidden layers and the number of neurons in each hidden layer. These are kept to a minimum
so that the subsequent C coding of the model is easier. Similar to identification of the
physical rig, the network is trained using only data from the stabilised zone, and the
network is allowed to approximate the swing-up action. The next step is to validate the
model before implementing it in C; Figure 6.9 shows the validation set-up. It is found that
the optimum structure is a single hidden layer with 20 neurons.



Figure 6.9 Validation set-up

The control output from the existing controller is compared with the output from the neuro-
controller; it is found that the two are similar, with a low mean squared error, see Figure
6.10.
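
The mean-squared-error figure used for this comparison is simply the average squared disagreement between the two control signals; a minimal sketch:

```c
#include <math.h>

/* Mean squared error between the teacher controller's recorded outputs and
 * the neuro-controller's outputs over an n-sample validation record. */
double mse(const double *teacher, const double *net, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double d = teacher[i] - net[i];  /* pointwise output disagreement */
        sum += d * d;
    }
    return sum / n;
}
```
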

Figure 6.10 Neuro Controller output
The neuro-controller structure is shown in Figure 6.11, on the following page.



































Figure 6.11 Neuro Controller Structure: the two inputs X1 (pendulum angle theta) and X2
(cart position) feed a hidden layer of twenty tansig neurons, each with its own weights and
bias; a single purelin output neuron, with its own weights and bias, combines the hidden
activations to produce the control output.

The neuro-controller is subsequently implemented in C code, see Appendix A. To create the
dynamic link library, the following commands are used, implemented via the mk_bc32.bat
file provided with the external controller software.

set BCR=c:\borland\bcc55
rem compile the controller source (%1.c) to an object file
%BCR%\bin\Bcc32 -P- -c -I%BCR%\INCLUDE -DDllMain=DllEntryPoint %1.c
rem link the object into a dynamic link library
%BCR%\bin\ilink32 -L%BCR%\LIB -Tpd -aa -c -x %BCR%\LIB\c0d32.obj %1.obj, %1, %1, %BCR%\LIB\import32.lib %BCR%\LIB\cw32.lib

The batch file is then invoked as:

mk_bc32 neuro_controller

The dynamic link library is created and the neuro-controller is tested. The first
experiment starts with the pendulum in a downward position. The neuro-controller swings
the pendulum into an upright position; however, from Figure 6.12 it can be seen that the
controller is unable to maintain it there. During the swing-up action the
controller over-compensates and the pendulum falls over. The controller repeatedly swings
the pendulum into an upright position but is unable to maintain it in the stabilised zone.

Figure 6.12 Pendulum Angle

The next experiment starts with the pendulum in an upright position, see Figure 6.13. From
the figure it is seen that the neuro controller maintains the pendulum in an upright position
for a limited period, approximately 5 secs. After this the pendulum falls over and the
controller swings the pendulum repeatedly into an upright position but is unable to maintain
it in the stabilised region.

Figure 6.13 Pendulum Angle

From the experiments, the main problem with the neuro controller is that during the swing-
up action the controller over-compensates and the pendulum falls over. If the experiment
starts with the pendulum in an upright position, the controller can successfully maintain the
pendulum in the stabilised region, albeit for a limited time frame.
Subsequently several modifications were made to the neuro controller: an increase in the
number of neurons in the hidden layer and in the number of hidden layers. However, no
improvement in performance was obtained.
Given that neural networks successfully identified the pendulum angle during the
swing-up action and stabilisation, this is unexpected. It can partly be attributed to the fact
that during identification of the physical system, a sampling period of only 0.1 secs is
possible, due to the limited processing capabilities and the configuration of the interrupt
routine between the embedded controllers and Matlab, whereas these controllers operate at
a sampling period of 0.01 secs. Consequently the full dynamics necessary to train the
network are not available. This needs to be investigated further and could be addressed in a
future project. Nonetheless the overall aims of the project have been achieved: Artificial
Neural Network identification and control of an inverted pendulum.

6.8 Project Plan
To successfully complete any project there is a basic requirement of a well-structured,
concise and organised plan. It aids the user in time and resource management, acting as a
benchmark against which progress can be measured, thus enabling the coordinator to work
in a logical and ordered progression, achieving maximum efficiency. It is also imperative
that sufficient background research is conducted before any design is attempted. Due to the
condensed nature of the project, time management is essential and its plan critical. The
initial project plan from the interim report was revised and may be viewed in appendix B.
During the course of the project few deviations were made from this, resulting in a well-
rounded, concise project.

6.9 Summary
In this chapter the testing and experimentation of the neuro-controller was conducted. The
main techniques of neuro control were examined, and it was decided to use
supervised control, as there are existing controllers that swing up and stabilise the
pendulum. Initially the controller was modelled using Matlab Simulink. Following on from
this, the controller was implemented using Borland C++ and a dynamic link library created;
this enables the real-time kernel to communicate with the external controller for
subsequent analysis of results. The results obtained from neuro control were promising and
the controller successfully maintained the pendulum in an upright position for a limited time
frame. Several adjustments were made to the controller; however, no substantial
improvement in the system stability was recorded, which may be addressed in a future
project.















Chapter 7
Conclusions
In this project, Artificial Neural Networks have been applied to the challenging task of
identification and control of a highly non-linear system, the inverted pendulum. Initially the
dynamical equations for both the linear and non-linear inverted pendulum were derived.
Subsequently models were developed of both systems using Matlab Simulink. The
motivation for modelling the linear system was to allow a comparison of linear and non-
linear identification techniques. It was not possible to perform open-loop identification, as
the inverted pendulum is open-loop unstable, i.e. for a bounded input the output goes
unbounded, and stability is a necessary criterion for accurate identification. Consequently
the inverted pendulum models, linear and non-linear, are placed in closed loop, stabilised
using PID controllers. It is possible to stabilise the non-linear inverted pendulum model
using PID control because the simulation starts in a linearised region, i.e. with the pendulum
in an upright position. The parameters for both controllers are chosen heuristically and
subjectively. Subsequently they are de-tuned so that the dynamics of the pendulum are
emphasised at the output. This is necessary so that an accurate and robust model of the
process can be identified. The models can subsequently be simulated for greater periods,
allowing the generation of large quantities of information-rich data.
In Chapter 3 neural network architecture was examined. Neural networks provide an
extremely powerful information-processing tool. They are ideal for control systems due to
their non-linear approximation capabilities, adaptability and the computational efficiency
attributable to their parallel nature. They remove the need to explicitly know the internal
structure of a specific task. Considering the specific process of the inverted pendulum, they
provide a suitable solution for identification and control. Before neural networks were
implemented, conventional identification techniques were investigated and applied to both
the linear and non-linear pendulum (ARX, RARMAX). These conventional identification
techniques modelled the linear pendulum with good accuracy. To obtain a robust model of
the inverted pendulum the non-linear system must be used; however, traditional linear
techniques are inadequate for identifying the complexity of the non-linear pendulum.
Extensions have been made to these linear identification techniques in the form of non-
linear ARMAX, and these were duly investigated. The main problem associated with non-
linear ARMAX is how to construct a model that is easily estimated and can be used to
describe a system's dynamics in practical terms. The non-linear properties of neural
networks give them a distinct advantage over conventional identification techniques such as
ARX and RARMAX. Further to this, the construction of a model using neural networks is
relatively simple in comparison to non-linear ARMAX.
Identification using neural networks focused primarily on the pendulum angle. Two
network architectures were used, feed-forward and recurrent. The parameters for both were
chosen heuristically and subjectively; however, the number of hidden layers was restricted
to two and the number of neurons in these hidden layers was restricted to seventy-five. Non-
linear tansig activation functions were used in the hidden layers and a linear purelin
activation function in the output layer. Generally, this is the accepted structure for a
neural network. For completeness, neural networks were first used to model the linear
pendulum. The first network tested was feed-forward. The feed-forward network modelled
the process well, identifying the pendulum dynamics with good accuracy and a low MSE.
Recurrent Elman networks were tested next. Initially they were found to perform
extremely poorly, due to the network getting caught in a local minimum during
training. Pre-processing techniques were examined and it was found that scaling the data
had a filtering effect. Thus the network was retrained, and the filtering of the training data
prevented the network from getting caught in a local minimum. Subsequently, Elman
networks modelled the process extremely well, identifying the pendulum angle with good
accuracy. The next stage was identification of the non-linear pendulum model. It was found
that the same pattern emerged, in that scaling, and consequently filtering, of the data was
required by the Elman network. Both network models identified the process with good
accuracy. Overall the feed-forward model out-performed the Elman recurrent model. This
was unexpected, as with their dynamic memory the Elman networks should have identified
the non-linear process with good accuracy. Some literature has suggested that in cases
where the length of the fixed line delay is smaller than the required length of memory,
recurrent networks perform poorly [25]. Consequently the remainder of the project
proceeded using feed-forward networks.
Javier's linearised model was examined next. This model gave a good intuitive
feel for the accuracy of the models developed. As this model proved extremely robust in the
design of controllers for the physical system in previous research, it provided a good insight
into the behaviour of the physical rig. A dither signal was applied to Javier's model to
generate input-output data that contained the process dynamics over the entire operating
range. The same dither signal was applied to the non-linear model of the pendulum and the
dynamics compared. It was found that the dynamics were similar; thus the modelling of the
inverted pendulum was accurate. For completeness, neural network identification of Javier's
model was performed.
Having successfully modelled and identified the non-linear pendulum using neural
networks, the next stage was real-time identification, and the pendulum rig was
discussed. The closed-loop system comprises two controllers, one to swing up the
pendulum and one to maintain it in the stabilised region. Consequently neural networks had
difficulty in identifying the process since, depending on what zone the pendulum is in, the
output is calculated in a different manner; as such the outputs of the two controllers do not
relate to each other. Given that neural networks have the ability to approximate data on
which they have not been trained, it was decided to train the network using data from the
pendulum in the stabilised region. The model subsequently identified the pendulum angle
with good accuracy and successfully predicted the swing-up action. The model was then
subjected to further tests and it proved extremely robust.
Following on from this, different techniques of neuro-control were examined.
It was decided to use supervised control, as there are existing controllers that swing up and
stabilise the pendulum. Initially the neuro controller was modelled using Matlab Simulink
and subsequently validated on-line. Following this, the controller was implemented
using Borland C++ and a dynamic link library created; this enabled the real-time kernel to
communicate with the external controller for subsequent analysis of results. The results
obtained from neuro control were promising and the controller successfully maintained the
pendulum in an upright position for a limited time frame. Several adjustments were made to
the controller; however, no substantial improvement in the system stability was recorded,
which may be addressed in a future project.

7.1 Future Recommendations
Due to time constraints the testing of the neural controller was limited; further work on this
is required. One major concern is that the controllers used to teach the neuro controller
operate with a sampling period of 0.01 secs, yet due to the configuration of the interrupt
routine between Matlab and the controllers, data can only be logged with a period of
0.1 secs; logging faster than this corrupts the data. Thus the data used to train the neuro
controller does not contain the full dynamics required to swing up and stabilise the
pendulum for an extended period, although some limited control was achieved. This needs
to be investigated in greater detail.

For a future project, several different intelligent optimisation techniques could also
be explored, which were beyond the scope of this project. Genetic algorithms have been
widely used for the tuning and optimisation of neural networks. Genetic algorithms are
general-purpose search algorithms that use principles inspired by natural population
genetics to evolve solutions to problems. Future work should include an investigation into
the possible use of genetic algorithms to tune and optimise the parameters of the neural
controller. Another consideration is the possible use of the radial basis function network,
together with a modified back-propagation algorithm for multi-layer feed-forward neural
networks, to optimise the neural network architecture. Concluding, there is potential for
further research to supplement the current work and further optimise performance and
results, which was beyond the scope of this project.


































References
[1] Jovanovic, Olivera. Identification of dynamic system using neural networks,
Department of Mechanical Engineering, University of Montenegro 81000
Podgorica, Yugoslavia.

[2] Q. Wu, N. Sepehri and S. He. On control of a base-excited inverted pendulum
using neural networks Journal of the Franklin Institute 337, 2000.pp. 267-286.

[3] Mills, Peter M. Zomaya, Albert Y. Tade, Moses O. Neuro-Adaptive Process Control:
A Practical Approach.

[4] S. Purwar, I.N. Kar and A.N. Jha. On-line System Identification Using Chebyshev
Neural Networks Indian Institute of Technology, Dehli. pp 1-4.

[5] www.control.hut.fi/Kurssit/AS-74.115/Material/SOFTCOMPCH06a.pdf, 15th May 2004.

[6] http://arri.uta.edu/acs/ee4343/lectures99/ReprSys.pdf, 2nd May 2004.


[7] Moscinski, Jerzy. Ogonowski, Zbigniew. Advanced Control with Matlab &
Simulink. Ellis Horwood.

[8] Chenyi, Hu. Training Feedforward Multilayer Interval Artificial Neural Networks,
University of Houston, USA

[9] http://hepunx.rl.ac.uk/~candreop/minos/NeuralNets/images/intro/biological_neuron.gif,
8th August 2004.

[10] Haykin, S. Neural Networks - A Comprehensive Foundation, Macmillan 1994.

[11] Cybenko, G. Approximation by superposition of a sigmoidal function,
Mathematics of Control Signals and Systems, Vol 2, 303-314


[12] K. Hornik, M. Stinchcombe and H. White. Multilayer feed-forward networks
are universal approximators. Department of Economics, University of California,
San Diego, La Jolla, CA, 1988.

[13] http://www.microcortex.com/InfoLinks/ANNDocumentation, Jan 2004.

[14] Elman, Jeffrey L. Finding Structure in Time. Cognitive Science, 14, 179-211
(1990). University of California, San Diego

[15] Howard Demuth, Howard. Beale, Mark. Neural Network Toolbox For Use with
MATLAB.

[16] Roelof, K Brouwer. An Integer Recurrent Artificial Neural Network for
Classifying Feature Vector, University College of the Cariboo, Canada

[17] Jovanovic, Olivera. Identification of dynamic system using neural networks,
Department of Mechanical Engineering, University of Montenegro 81000
Podgorica, Yugoslavia.

[18] Moscinski, Jerzy. Ogonowski, Zbigniew. Advanced Control with Matlab &
Simulink. Ellis Horwood (1995).

[19] Q. Wu, N. Sepehri, S. He. Neural inverse modelling and control of a base excited
inverted pendulum, Engineering Applications of Artificial Intelligence 15 (2002)
261-272. Department of Mechanical and Industrial Engineering. The University of
Manitoba, Winnipeg, Canada.

[20] Ljung, L. System Identification-Theory for the users, Prentice Hall.

[21] Demuth, Howard. Beale, Mark. Neural Network Toolbox For Use with
MATLAB.

[22] Sheng Lu and Ki H. Chon. Nonlinear Autoregressive and Nonlinear
Autoregressive Moving Average Model Parameter Estimation by Minimizing
Hypersurface Distance. IEEE Transactions on Signal Processing, Vol. 51, No. 12,
December 2003.

[23] Nonlinear Gas Turbine Modelling Using Polynomial NARMAX Structures.
agrino.org/nchiras/pubs/thesis/chapter7_1.pdf

[24] Callinan, T. ANN identification and control of the inverted pendulum.
http://www.eeng.dcu.ie/~brutonj/Reports/TCallinan_MEng_03.pdf

[25] Principe, Jose C. De Vries, Bert. Jyh-Ming Kuo. De Oliveria, Pedro Guedes
Modelling Applications with the Focused Gamma Net. Departamento
Eletronica, Universidade de Aveiro Gainesville, FL 32611 Aveiro, Portugal

[26] http://www.eeng.dcu.ie/~csg/javier/data/chapter2.pdf, 20th June 2004.

[27] Landau, Ioan Dore. System identification and control design. Prentice Hall
information and system sciences series.

[28] Digital Pendulum System. Reference manual, 33-005-2M5 Matlab 5 Version
Feedback Instruments Ltd, UK.

[29] Anderson, C.W. Learning to control an inverted pendulum using neural networks,
IEEE Controls Systems Magazine, 9:31-37, 1989.

[30] Moscinski, Jerzy. Ogonowski, Zbigniew. Advanced Control with Matlab &
Simulink. Ellis Horwood (1995).











Appendix A











#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <dos.h>
#include <math.h>


#define PI_CONST ( 3.14159265358979323846 )

#define IPC_GetParam "GetParam" /* Read parameters */
#define IPC_SetParam "SetParam" /* Set parameters */

#ifdef WIN32
#define DLLEXPORT __declspec(dllexport)

HINSTANCE hInstDLL;

#endif

double UMax;
double TSample;
double RefAngle, RefPosition;
double weight1_1 = -0.44195224318656268;
double weight1_2 = -2.217880994856216;
double weight2_1 = -1.053251000171985;
double weight2_2 = -1.9186843219749373;
double weight3_1 = 1.461844642801459;
double weight3_2 = 0.9856294483999463;
double weight4_1 = -3.200440635173567;
double weight4_2 = 2.5423505397671269;
double weight5_1 = 1.2219975449226481;
double weight5_2 = -3.3658342335514796;
double weight6_1 = 1.5505338340465635;
double weight6_2 = 1.5234352744715038;
double weight7_1 = 1.1800020623474832;
double weight7_2 = -1.4426523826006032;
double weight8_1 = 1.8561712783326201;
double weight8_2 = 1.4566441399253465;
double weight9_1 = 0.71593937106804838;
double weight9_2 = -2.5615984977398991;
double weight10_1 = -1.341916122599372;
double weight10_2 = 1.5135817655528541;
double weight11_1 = -5.1768414393473607;
double weight11_2 = 2.2480236392986224;
double weight12_1 = -0.35706415837594402;
double weight12_2 = 2.2384012130448077;
double weight13_1 = 0.4313667963850335;
double weight13_2 = -2.2206868975580752;
double weight14_1 = -1.7779654090527175;
double weight14_2 = -3.269159363677926;
double weight15_1 = 5.2217076108976805;

double weight15_2 = -0.8636564150197007;
double weight16_1 = -1.7009648315943895;
double weight16_2 = 0.846360928664881;
double weight17_1 = 7.5956370405328846;
double weight17_2 = 3.6941828431267112;
double weight18_1 = 1.836495348981;
double weight18_2 = 3.6712080388472144;
double weight19_1 = 1.7011556884168486;
double weight19_2 = 0.67955072474223339;
double weight20_1 = -0.83634278642244209;
double weight20_2 = 2.0582407024391589;

double weight21 = 0.51985134463873106;
double weight22 = 0.0012878801134575358;
double weight23 = -0.45048922321320034;
double weight24 = 1.4060465783745444;
double weight25 = 2.8684457667358663;
double weight26 = -0.0425805392205622;
double weight27 = 2.4761813752713584;
double weight28 = 0.83942357012230939;
double weight29 = 1.0211426315494958;
double weight30 = 0.17475932060846133;
double weight31 = 4.3593255325499829;
double weight32 = 0.036552659568254112;
double weight33 = 0.06277525580357642;
double weight34 = 1.564594423331879;
double weight35 = 2.850903026787866;
double weight36 = 0.34410155683438404;
double weight37 = 4.8115814047913181;
double weight38 = -4.8038033798195858;
double weight39 = -0.32563710646978754;
double weight40 = -0.18018759676221036;

double bias_1 = 11.251222364487161;
double bias_2 = 9.99190017095237533;
double bias_3 = -6.9629089158925055;
double bias_4 = 0.44246522978535567;
double bias_5 = 0.709342803860435567;
double bias_6 = -6.3769458025717212;
double bias_7 = -4.0000322342135544;
double bias_8 = -5.5592270293053545;
double bias_9 = 1.837519985326985;
double bias_10 = -3.32074096384392;
double bias_11 = 0.6412873382264335;
double bias_12 = -6.0252522592445601;
double bias_13 = 6.6442130061038265;
double bias_14 = 4.2036212978856344;
double bias_15 = 0.90293110585069425;
double bias_16 = -5.4956596512452043;
double bias_17 = -1.00789821770195;
double bias_18 = 0.75085469247973802;

double bias_19 = 4.1510124590884825;
double bias_20 = -10.892031944348444;
double bias_21 = 0.36805450826949149;


double N,DotProd1,DotProd2,DotProd3,DotProd4,DotProd5,DotProd6,DotProd7,
DotProd8,DotProd9,DotProd10,DotProd11,DotProd12,DotProd13, DotProd14,
DotProd15,DotProd16,DotProd17,DotProd18,DotProd19,DotProd20,
node_1,node_2,node_3,node_4,node_5,node_6,node_7,node_8,node_9,node_10,
node_11,node_12,node_13,node_14,node_15,node_16,node_17,node_18,node_19,
node_20;
double U; /* Calculated control value (network output) */

char ModuleName[ 128 ];


#ifdef WIN32 // 32-bit DLL

BOOL WINAPI DllMain(HINSTANCE hDLLInst, DWORD fdwReason, LPVOID
lpvReserved)
{
char msg_str[ 150 ];

switch (fdwReason)
{
case DLL_PROCESS_ATTACH:
// The DLL is being loaded for the first time by a given process.
// Perform per-process initialization here. If the initialization
// is successful, return TRUE; if unsuccessful, return FALSE.
hInstDLL = hDLLInst;

UMax = 0.0;
TSample = 0.001; /* Min sample period == 1ms */
RefAngle = RefPosition = 0.0; /* Upright pos. of the pendulum */

GetModuleFileName( hDLLInst, ModuleName, sizeof( ModuleName ) - 30 );
sprintf( msg_str, "External DLL Library: %s", ModuleName );
MessageBox( (HWND)NULL,
"Entry point to external DLL reached",
msg_str, MB_OK );
break;

case DLL_PROCESS_DETACH:
// The DLL is being unloaded by a given process. Do any
// per-process clean up here, such as undoing what was done in
// DLL_PROCESS_ATTACH. The return value is ignored.
// Unload the hook before returning..

sprintf( msg_str, "External DLL Library: %s", ModuleName );
MessageBox( (HWND)NULL,
"Exit point of external DLL reached",

msg_str, MB_OK );
break;

case DLL_THREAD_ATTACH:
// A thread is being created in a process that has already loaded
// this DLL. Perform any per-thread initialization here. The
// return value is ignored.

break;

case DLL_THREAD_DETACH:
// A thread is exiting cleanly in a process that has already
// loaded this DLL. Perform any per-thread clean up here. The
// return value is ignored.

break;
}
return TRUE;
}



#else // 16-bit DLL

int FAR PASCAL LibMain( HINSTANCE hInstance, WORD wDataSegment,
WORD wHeapSize, LPSTR lpszCmdLine )
{
char msg_str[ 150 ];

GetModuleFileName( hInstance, ModuleName, sizeof(ModuleName) - 30 );

sprintf( msg_str, "External DLL Library: %s", ModuleName );
MessageBox( (HWND)NULL, "Entry point to external DLL reached", msg_str, MB_OK
);

UMax = 0.0;
TSample = 0.001; /* Min sample period == 1ms */
RefAngle = RefPosition = 0.0; /* Upright pos. of the pendulum */

return 1; /* Indicate that the DLL was initialized successfully */
}


int FAR PASCAL WEP ( int bSystemExit )
{
char msg_str[ 150 ];
sprintf( msg_str, "External DLL Library: %s", ModuleName );
MessageBox( (HWND)NULL, "Exit point of external DLL reached", msg_str, MB_OK );

return 1;
}


#endif




#ifdef WIN32
DLLEXPORT void __stdcall ExternalController(
double *Param,
double PendPos,
double CartPos,
double Time,
double DesValue,
double *Control )
#else
void FAR PASCAL ExternalController(
double far *Param, /* Parameters of the controller */
double far PendPos, /* Pendulum angle [rad] */
double far CartPos, /* Cart's position [m] */
double far Time, /* Time [s] */
double far DesValue, /* Desired value of the cart [m] */
double far *Control ) /* Calculated control value */
#endif
{
/*-------------------- Neuro Controller ----------------------------------------*/

/* Inputs are multiplied by the hidden-layer weights and the biases are added */
DotProd1 = ((PendPos*weight1_1) + (CartPos*weight1_2) + bias_1);
DotProd2 = ((PendPos*weight2_1) + (CartPos*weight2_2) + bias_2);
DotProd3 = ((PendPos*weight3_1) + (CartPos*weight3_2) + bias_3);
DotProd4 = ((PendPos*weight4_1) + (CartPos*weight4_2) + bias_4);
DotProd5 = ((PendPos*weight5_1) + (CartPos*weight5_2) + bias_5);
DotProd6 = ((PendPos*weight6_1) + (CartPos*weight6_2) + bias_6);
DotProd7 = ((PendPos*weight7_1) + (CartPos*weight7_2) + bias_7);
DotProd8 = ((PendPos*weight8_1) + (CartPos*weight8_2) + bias_8);
DotProd9 = ((PendPos*weight9_1) + (CartPos*weight9_2) + bias_9);
DotProd10 = ((PendPos*weight10_1) + (CartPos*weight10_2) + bias_10);
DotProd11 = ((PendPos*weight11_1) + (CartPos*weight11_2) + bias_11);
DotProd12 = ((PendPos*weight12_1) + (CartPos*weight12_2) + bias_12);
DotProd13 = ((PendPos*weight13_1) + (CartPos*weight13_2) + bias_13);
DotProd14 = ((PendPos*weight14_1) + (CartPos*weight14_2) + bias_14);
DotProd15 = ((PendPos*weight15_1) + (CartPos*weight15_2) + bias_15);
DotProd16 = ((PendPos*weight16_1) + (CartPos*weight16_2) + bias_16);
DotProd17 = ((PendPos*weight17_1) + (CartPos*weight17_2) + bias_17);
DotProd18 = ((PendPos*weight18_1) + (CartPos*weight18_2) + bias_18);
DotProd19 = ((PendPos*weight19_1) + (CartPos*weight19_2) + bias_19);
DotProd20 = ((PendPos*weight20_1) + (CartPos*weight20_2) + bias_20);



/* Tansig activation: 2/(1+exp(-2*x)) - 1, mathematically equal to tanh(x) */

node_1 = 2/(1+exp(-2*DotProd1))-1;
node_2 = 2/(1+exp(-2*DotProd2))-1;
node_3 = 2/(1+exp(-2*DotProd3))-1;
node_4 = 2/(1+exp(-2*DotProd4))-1;
node_5 = 2/(1+exp(-2*DotProd5))-1;
node_6 = 2/(1+exp(-2*DotProd6))-1;
node_7 = 2/(1+exp(-2*DotProd7))-1;
node_8 = 2/(1+exp(-2*DotProd8))-1;
node_9 = 2/(1+exp(-2*DotProd9))-1;
node_10 = 2/(1+exp(-2*DotProd10))-1;
node_11 = 2/(1+exp(-2*DotProd11))-1;
node_12 = 2/(1+exp(-2*DotProd12))-1;
node_13 = 2/(1+exp(-2*DotProd13))-1;
node_14 = 2/(1+exp(-2*DotProd14))-1;
node_15 = 2/(1+exp(-2*DotProd15))-1;
node_16 = 2/(1+exp(-2*DotProd16))-1;
node_17 = 2/(1+exp(-2*DotProd17))-1;
node_18 = 2/(1+exp(-2*DotProd18))-1;
node_19 = 2/(1+exp(-2*DotProd19))-1;
node_20 = 2/(1+exp(-2*DotProd20))-1;

/* Purelin function output layer*/
U = ((node_1*weight21 + node_2*weight22 + node_3*weight23 +
node_4*weight24 + node_5*weight25 + node_6*weight26 +
node_7*weight27 + node_8*weight28 + node_9*weight29 +
node_10*weight30 + node_11*weight31 + node_12*weight32 +
node_13*weight33 + node_14*weight34 + node_15*weight35 +
node_16*weight36 + node_17*weight37 + node_18*weight38 +
node_19*weight39 + node_20*weight40) + bias_21);




/* Control limits */
if( U > fabs( UMax ) ) U = fabs( UMax );
if( U < -fabs( UMax ) ) U = -fabs( UMax );

/* Check limits of the control value */
if( U <= -1.0 ) U = -1.0;
if( U >= +1.0 ) U = +1.0;

*Control = U;
}








Appendix 2