
Wojciech Mitkowski and Janusz Kacprzyk (Eds.)
Modelling Dynamics in Processes and Systems
Studies in Computational Intelligence, Volume 180
Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: kacprzyk@ibspan.waw.pl

Further volumes of this series can be found on our homepage: springer.com

Vol. 156. Dawn E. Holmes and Lakhmi C. Jain (Eds.)
Innovations in Bayesian Networks, 2008
ISBN 978-3-540-85065-6

Vol. 157. Ying-ping Chen and Meng-Hiot Lim (Eds.)
Linkage in Evolutionary Computation, 2008
ISBN 978-3-540-85067-0

Vol. 158. Marina Gavrilova (Ed.)
Generalized Voronoi Diagram: A Geometry-Based Approach to Computational Intelligence, 2009
ISBN 978-3-540-85125-7

Vol. 159. Dimitri Plemenos and Georgios Miaoulis (Eds.)
Artificial Intelligence Techniques for Computer Graphics, 2009
ISBN 978-3-540-85127-1

Vol. 160. P. Rajasekaran and Vasantha Kalyani David
Pattern Recognition using Neural and Functional Networks, 2009
ISBN 978-3-540-85129-5

Vol. 161. Francisco Baptista Pereira and Jorge Tavares (Eds.)
Bio-inspired Algorithms for the Vehicle Routing Problem, 2009
ISBN 978-3-540-85151-6

Vol. 162. Costin Badica, Giuseppe Mangioni, Vincenza Carchiolo and Dumitru Dan Burdescu (Eds.)
Intelligent Distributed Computing, Systems and Applications, 2008
ISBN 978-3-540-85256-8

Vol. 163. Pawel Delimata, Mikhail Ju. Moshkov, Andrzej Skowron and Zbigniew Suraj
Inhibitory Rules in Data Analysis, 2009
ISBN 978-3-540-85637-5

Vol. 165. Djamel A. Zighed, Shusaku Tsumoto, Zbigniew W. Ras and Hakim Hacid (Eds.)
Mining Complex Data, 2009
ISBN 978-3-540-88066-0

Vol. 166. Constantinos Koutsojannis and Spiros Sirmakessis (Eds.)
Tools and Applications with Artificial Intelligence, 2009
ISBN 978-3-540-88068-4

Vol. 167. Ngoc Thanh Nguyen and Lakhmi C. Jain (Eds.)
Intelligent Agents in the Evolution of Web and Applications, 2009
ISBN 978-3-540-88070-7

Vol. 168. Andreas Tolk and Lakhmi C. Jain (Eds.)
Complex Systems in Knowledge-based Environments: Theory, Models and Applications, 2009
ISBN 978-3-540-88074-5

Vol. 169. Nadia Nedjah, Luiza de Macedo Mourelle and Janusz Kacprzyk (Eds.)
Innovative Applications in Data Mining, 2009
ISBN 978-3-540-88044-8

Vol. 170. Lakhmi C. Jain and Ngoc Thanh Nguyen (Eds.)
Knowledge Processing and Decision Making in Agent-Based Systems, 2009
ISBN 978-3-540-88048-6

Vol. 171. Chi-Keong Goh, Yew-Soon Ong and Kay Chen Tan (Eds.)
Multi-Objective Memetic Algorithms, 2009
ISBN 978-3-540-88050-9

Vol. 172. I-Hsien Ting and Hui-Ju Wu (Eds.)
Web Mining Applications in E-Commerce and E-Services, 2009
ISBN 978-3-540-88080-6

Vol. 173. Tobias Grosche
Computational Intelligence in Integrated Airline Scheduling, 2009
ISBN 978-3-540-89886-3

Vol. 174. Ajith Abraham, Rafael Falcón and Rafael Bello (Eds.)
Rough Set Theory: A True Landmark in Data Analysis, 2009
ISBN 978-3-540-89886-3

Vol. 175. Godfrey C. Onwubolu and Donald Davendra (Eds.)
Differential Evolution: A Handbook for Global Permutation-Based Combinatorial Optimization, 2009
ISBN 978-3-540-92150-9

Vol. 176. Beniamino Murgante, Giuseppe Borruso and Alessandra Lapucci (Eds.)
Geocomputation and Urban Planning, 2009
ISBN 978-3-540-89929-7

Vol. 177. Dikai Liu, Lingfeng Wang and Kay Chen Tan (Eds.)
Design and Control of Intelligent Robotic Systems, 2009
ISBN 978-3-540-89932-7

Vol. 178. Swagatam Das, Ajith Abraham and Amit Konar
Metaheuristic Clustering, 2009
ISBN 978-3-540-92172-1

Vol. 179. Mircea Gh. Negoita and Sorin Hintea
Bio-Inspired Technologies for the Hardware of Adaptive Systems, 2009
ISBN 978-3-540-76994-1

Vol. 180. Wojciech Mitkowski and Janusz Kacprzyk (Eds.)
Modelling Dynamics in Processes and Systems, 2009
ISBN 978-3-540-92202-5

Wojciech Mitkowski and Janusz Kacprzyk (Eds.)

Modelling Dynamics in Processes and Systems

Prof. Wojciech Mitkowski
Faculty of Electrical Engineering, Automatics
Computer Science and Electronics
AGH University of Science and Technology
Al. Mickiewicza 30
30-059 Krakow
Poland
Email: wmi36@op.pl

Prof. Janusz Kacprzyk


Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
Email: kacprzyk@ibspan.waw.pl

ISBN 978-3-540-92202-5 e-ISBN 978-3-540-92203-2

DOI 10.1007/978-3-540-92203-2

Studies in Computational Intelligence ISSN 1860-949X

Library of Congress Control Number: 2008942380


© 2009 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilm or in any other way,
and storage in data banks. Duplication of this publication or parts thereof is permitted
only under the provisions of the German Copyright Law of September 9, 1965, in
its current version, and permission for use must always be obtained from Springer.
Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.

Printed in acid-free paper


springer.com

Preface

Dynamics characterizes virtually all phenomena we face in the real world, and the processes that proceed in practically all kinds of inanimate and animate systems, notably social systems. For our purposes, dynamics is viewed as the time evolution of some characteristic features of the phenomena or processes under consideration. It is obvious that in virtually all non-trivial problems dynamics cannot be neglected and should be taken into account in the analyses, first, to gain insight into the problem considered and, second, to be able to obtain meaningful results.
A convenient tool for dealing with dynamics and its related evolution over time is the concept of a dynamic system which, for the purposes of this volume, can be characterized by the input (control), state and output spaces, and a state transition equation. Then, starting from an initial state, we can find a sequence of consecutive states (outputs) under consecutive inputs (controls). That is, we obtain a trajectory. The state transition equation may be given in various forms, exemplified by differential and difference equations, linear or nonlinear, deterministic or stochastic, or even fuzzy (imprecisely specified), fully or partially known, etc. These features can give rise to various problems the analysts may encounter, such as numerical difficulties, instability, strange forms of behavior (e.g. chaotic), etc.
This volume is concerned with some modern tools and techniques which can be useful for the modeling of dynamics. We focus our attention on two important areas which play a key role nowadays, namely automation and robotics, and biological systems. We also add some new applications which can greatly benefit from the availability of effective and efficient tools for modeling dynamics, exemplified by some applications in security systems.
The first part of the volume is concerned with more general tools and techniques for the modeling of dynamics. We are particularly interested in the case of complex systems which are characterized by a highly nonlinear dynamic behavior that can result in, for instance, chaotic behavior.
R. Porada and N. Mielczarek (Modeling of chaotic systems in program ChaoPhS) first consider some general issues related to non-linear dynamics, both from the perspective of gaining more knowledge on how to proceed in the case of such dynamics, and from that of the tools and techniques which can be used in practice. Notably, they deal with simulation tools, and propose a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), which is meant for studying chaotic phenomena in continuous and discrete systems, including systems used in practice. The structure of the program and the algorithms employed are presented, together with numerical tests on some models of chaotic systems known from the literature. Moreover, as an illustration, an example of using the proposed tools and techniques for the analysis of chaotic behavior in a power electronic system is presented.
V. Vladimirov and J. Wróbel (Oscillations of vertically hang elastic rod, contacting rotating disc) present an analysis of mechanical oscillations of an elastic rod forming a friction pair with a rotating disc. In the absence of friction the model is described by a two-dimensional Hamiltonian system of ordinary differential equations which is completely integrable. However, when a Coulomb type friction is added, the situation becomes more complicated. The authors use both qualitative methods and numerical simulation. They obtain a complete picture of the global behavior of the system, within a broad range of values of a driven parameter, for two principal types of modeling function simulating the Coulomb friction. A sequence of bifurcations (limit cycles, double-limit cycles, homoclinic bifurcations and other regimes) is observed as the driven parameter changes. The patterns of bifurcations depend essentially upon the model of the frictional force, and this dependence is analyzed in detail. Much more complicated regimes appear when one-dimensional oscillations of the rotating element are incorporated into the model. In this case the system possesses quasiperiodic, multiperiodic and, probably, chaotic solutions.
V.N. Sidorets (The bifurcations and chaotic oscillations in electric circuits with ARC) is concerned with autonomous electric circuits with an arc, governed by three ordinary differential equations. By varying two parameters, many kinds of bifurcations and of periodic and chaotic behaviors of this system are obtained. Bifurcation diagrams, which are a powerful tool for investigating bifurcations, have been used and studied. Routes to chaos have been considered using one-parameter bifurcation diagrams. Three basic patterns of bifurcation diagrams have been observed, possessing the properties of softness and reversibility, stiffness and irreversibility, and stiffness and reversibility, respectively.
The second section of the volume is devoted to a key problem of modeling dynamics in control and robotics, very relevant fields in which intelligent systems have found numerous applications.
Oscar Castillo and Patricia Melin (Soft computing models for intelligent control of non-linear dynamical systems) describe the application of soft computing techniques (fuzzy logic, neural networks, evolutionary computation and chaos theory) to controlling non-linear dynamical systems in real-world problems. Since control of real-world non-linear dynamical systems may require the use of several soft computing techniques to achieve a desired performance, several hybrid intelligent architectures have been developed. The basic idea of these hybrid architectures is to combine the advantages of each of the techniques involved. Moreover, this can also help in dealing with the fact that non-linear dynamical systems are difficult to control due to the unstable and even chaotic behaviors that may occur. Practical applications of the new control architectures proposed include robotics, aircraft systems, biochemical reactors, and manufacturing of batteries.
J. Garus (Model reference adaptive control of underwater robot in spatial motion) discusses nonlinear control of an underwater robot. Emphasis is on the tracking of a desired trajectory. Command signals are generated by an autopilot consisting of four controllers with an implemented parameter adaptation law. External disturbances are assumed, and an index of control quality is introduced. Results of computer simulations are provided to demonstrate the effectiveness, efficiency, correctness and robustness of the approach proposed.
P. Skruch (Feedback stabilization of distributed parameter gyroscopic systems) discusses feedback stabilization of distributed parameter gyroscopic systems described by second-order operator equations. It is shown that the closed-loop system which consists of the controlled system, a linear non-velocity feedback and a parallel compensator is asymptotically stable. In the case where velocity is available, the parallel compensator is not necessary to stabilize the system. Results for the multi-input multi-output case are presented. The stability results are proved by using the LaSalle theorem extended to infinite-dimensional systems. Numerical examples are given to illustrate the effectiveness and efficiency of the proposed controllers.
W. Mitkowski and P. Skruch (Stabilization results of second-order systems with delayed positive feedback) discuss issues related to oscillations in second-order systems with a delayed positive feedback, notably oscillation and non-oscillation criteria. The authors consider the stability conditions for the system without damping and with a gyroscopic effect. A general algorithm for determining the stability regions is proposed. Theoretical and numerical results are presented for the single-input single-output case. The results obtained improve on some oscillation criteria proposed so far in the literature.
The third part of the volume is concerned with the modeling of dynamics in various processes that occur in biological systems. This area has recently been gaining much popularity in the research community around the world, and it is hoped that a deeper understanding of the dynamics of such processes can be of great importance for solving many problems we face in the world related to, for instance, the propagation of various kinds of disease, epidemics, etc.
F.F. Matthäus (A comparison of modeling approaches for the spread of prion diseases in the brain) is concerned with prion-related diseases, exemplified by the well-known “mad cow disease” or the Creutzfeldt-Jakob disease. She presents and compares two different modeling approaches for the spread of prion diseases in the brain. The first is a reaction-diffusion model, which allows the description of prion spread in simple brain subsystems, like nerves or the spine. The second approach is the combination of epidemic models with transport on complex networks. With the help of these models, she studies the dependence of the disease progression on transport phenomena and the topology of the underlying network.
Ch. Merkwirth, J. Wichard and M. Ogorzałek (Ensemble modeling for bio-medical Applications) propose the use of ensembles of models constructed by using methods of statistical learning. The input data for model construction consist of real measurements taken in the physical system under consideration. The authors then propose a program toolbox which makes it possible to construct single models as well as heterogeneous ensembles of linear and nonlinear models. Several well-performing model types, among them ridge regression, k-nearest neighbor models and neural networks, have been implemented. Ensembles of heterogeneous models typically yield a better generalization performance than homogeneous ensembles. Additionally, the authors propose methods for model validation and assessment as well as adaptor classes performing a transparent feature selection or random subspace training on a large number of input variables. The toolbox is implemented in Matlab and C++ and is available under the GPL. Several applications of the described methods and of the numerical toolbox itself are described. These include ECG modeling, classification of activity in drug design, etc.
The fourth part of the volume is devoted to various issues related to the modeling
of dynamics in new application areas which have recently attracted much attention in
the research community and practice.
M. Hrebień and J. Korbicz (Automatic fingerprint identification based on minutiae points) deal with a problem that has recently attracted much attention and become of utmost importance, namely the use of some individual-specific features in human identification. In the paper, fingerprint identification is considered, specifically by considering local ridge characteristics called the minutiae points. Automatic fingerprint matching depends on the comparison of these minutiae and the relationships between them. The authors discuss several methods of fingerprint matching, namely the Hough transform, the structural global star method and the speeded-up correlation approach. Since there is still a need for finding the best matching approach, research on on-line fingerprints has been conducted to compare quality differences and time relations between the algorithms considered, and the experimental results are shown. Some issues related to image enhancement and the minutiae detection schemes employed are also dealt with.
Ł. Rauch and J. Kusiak (Image filtering using the dynamic particles method) consider holistic approaches for image processing and their use in various types of applications in the domain of applied computer science and pattern recognition. A new image filtering method based on the dynamic particles approach is presented. It employs physical principles for the 3D signal smoothing. The obtained results are compared with commonly used denoising techniques including the weighted average, Gaussian smoothing and wavelet analysis. The calculations are performed on two types of noise superimposed on the image data, i.e. the Gaussian noise and the salt-pepper noise. The algorithm of the dynamic particle method and the results of calculations are presented.
B. Ambrożek (The Simulation of cyclic thermal swing adsorption (TSA) process) deals with the prediction of the dynamic behavior of a cyclic thermal swing adsorption (TSA) system with a column packed with a fixed bed of adsorbent using a rigorous dynamic mathematical model. The set of partial differential equations representing the thermal swing adsorption is solved by using numerical procedures from the International Mathematical and Statistical Library (IMSL). The simulated thermal swing adsorption cycle is operated in three steps: (i) an adsorption step with a cold feed; (ii) a countercurrent desorption step with a hot inert gas; (iii) a countercurrent cooling step with a cold inert gas. Some examples of simulations are presented for propane adsorbed onto and desorbed from a fixed bed of activated carbon. Nitrogen is used as the carrier gas during adsorption and as the purge gas during desorption and cooling.
M. Danielewski, B. Wierzba and M. Pietrzyk (The stress field induced diffusion) present a mathematical description of the mass transport in a multi-component solution. The model is based on the Darken concept of the drift velocity. To be able to present an example of a real system, the authors restrict the analysis to isotropic solids and liquids for which the Navier equation holds. The diffusion of components depends on the chemical potential gradients and on the stress that can be induced by the diffusion and by the boundary and/or initial conditions. In such a quasi-continuum the energy, momentum and mass transport are diffusion controlled and the fluxes are given by the Nernst-Planck formulae. It is shown that the Darken method combined with the Navier equations is valid for solid solutions as well as multi-component liquids.
We hope that the particular chapters, written by leading experts in the field, can provide the interested readers with much information on topics which may be relevant for their research, and which are difficult to find in the vast scientific literature scattered over many fields and subfields of applied mathematics, control, robotics, security analysis, bioinformatics, mechanics, etc.
The idea of this volume resulted from very interesting discussions held during and after the well-attended Special Session on “Dynamical Systems – Modelling, Analysis and Synthesis” at the CMS – “Computer Methods and Systems” International Conference held on November 14–16, 2005 and organized by the AGH University of Science and Technology in Cracow, Poland. We wish to thank all the attendees and participants in the discussions for the support and encouragement we have experienced while preparing this publication.
We wish to thank the contributors for their excellent work and a great collaboration
in this challenging and interesting editorial project. Special thanks are due to Dr.
Thomas Ditzinger and Ms. Heather King from Springer for their constant help and
support.

October 2008
Wojciech Mitkowski
Janusz Kacprzyk

Contents

Basic Tools and Techniques for the Modelling of Dynamics

Modeling of Chaotic Systems in the ChaoPhS Program
Ryszard Porada, Norbert Mielczarek

Model of a Tribological Sensor Contacting Rotating Disc
Vsevolod Vladimirov, Jacek Wróbel

The Bifurcations and Chaotic Oscillations in Electric Circuits with Arc
V. Sydorets

Modelling Dynamics in Control and Robotics

Soft Computing Models for Intelligent Control of Non-linear Dynamical Systems
Oscar Castillo, Patricia Melin

Model Reference Adaptive Control of Underwater Robot in Spatial Motion
Jerzy Garus

Feedback Stabilization of Distributed Parameter Gyroscopic Systems
Pawel Skruch

Stabilization Results of Second-Order Systems with Delayed Positive Feedback
Wojciech Mitkowski, Pawel Skruch

Modelling Dynamics in Biological Processes

A Comparison of Modeling Approaches for the Spread of Prion Diseases in the Brain
Franziska Matthäus

Ensemble Modeling for Bio-medical Applications
Christian Merkwirth, Jörg Wichard, Maciej J. Ogorzalek

New Application Areas

Automatic Fingerprint Identification Based on Minutiae Points
Maciej Hrebień, Józef Korbicz

Image Filtering Using the Dynamic Particles Method
L. Rauch, J. Kusiak

The Simulation of Cyclic Thermal Swing Adsorption (TSA) Process
Bogdan Ambrożek

The Stress Field Induced Diffusion
Marek Danielewski, Bartlomiej Wierzba, Maciej Pietrzyk

Author Index


Modeling of Chaotic Systems in the ChaoPhS Program

Ryszard Porada and Norbert Mielczarek

Poznan University of Technology, Institute of Industrial Electrical Engineering
Piotrowo 3a, 61-138 Poznań, Poland
{Ryszard.Porada,Norbert.Mielczarek}@put.poznan.pl

Abstract. Modeling of chaotic phenomena in nonlinear dynamics requires more precise methods and simulation tools than the study of linear systems. Research on these phenomena, apart from its cognitive value, is also of technical importance. To obtain high-quality parameters of the output signals of practical systems it is necessary to control, or even eliminate, chaotic behaviour. Simulation programs used in practice, e.g. Matlab, do not always meet high requirements concerning the accuracy and speed of the simulation. In the paper we introduce a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), for investigating chaotic phenomena in continuous and discrete systems, and also in systems encountered in practice. We also present the structure of the program and the numerical algorithms used. The program was tested using models of chaotic systems well known from the literature. Some selected results of research on chaotic phenomena which appear in simple power electronic systems are also presented.

1 Introduction
In recent years a lot of interest in the theory of deterministic chaos has been observed, not only among mathematicians and physicists, but also among representatives of the technical sciences. This theory analyzes irregular motion in the state space of a nonlinear system. Classical dynamic laws describe unambiguously the evolution of the state of a system as a function of time when the initial conditions are known. The reason for the observed chaotic behaviour is not external noise, but a property of nonlinear systems that results in exponential divergence of initially close trajectories in a limited region of the phase space. The system behaves this way because of its sensitivity to initial conditions, which makes a long-term forecast of its trajectory impossible, because in practice we can establish initial conditions only with finite precision.
Research on deterministic chaos phenomena enables the identification of their causes and of means for their elimination, which is essential in practical applications. Over longer time horizons, the state vector of a nonlinear system depends on the initial conditions and, significantly, also on the numerical methods applied to solve the equations describing the system. Using one of the typical simulation programs, e.g. Matlab, often entails a very long computation time. In addition, the limited number of implemented numerical integration methods for the equations of dynamics, and the lack of numerical tools for determining the quantities that characterize nonlinear dynamics (e.g. the Poincaré section, Lyapunov exponents, etc.), contributed to our decision to write our own simulation program.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 1 – 20.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

The paper describes the concept of deterministic chaos and the mathematical tools used for its analysis. We introduce our own simulation program, ChaoPhS, report tests of this program, and present results for a simple power electronic system (an example of a typical, strongly nonlinear switching structure used in practice) operating in a closed system under various control and load conditions.

2 General Characteristic of Methods of Nonlinear Dynamics


The behaviour of a dynamical (evolutionary) system can usually be described [1,2] by differential equations in normal form:

$$\dot{x}(t) = F(t, x(t), u(t), \Lambda(t)), \qquad x(t_0) = x_0 \tag{1}$$

where: $x$ – state vector,
$u$ – control vector,
$\Lambda$ – vector of additional parameters,
specified on a manifold $M$, which forms the phase space of the system. The phase flow $g(t, x, \Lambda) \equiv g_\Lambda^t(x)$ generates the vector field $F$ specified on the manifold $M$.

The subset of points:

$$\gamma = \{ K \in M : K = g_\Lambda^t(x),\ t \in \mathbb{R}^1 \} \tag{2}$$

forms an orbit of the flow. The orbit is a curve lying on the manifold $M$ and is a trajectory of equations (1). If equation (1) has a periodic solution with period $T$, then $g_\Lambda^{t+T} = g_\Lambda^t$, $t \in \mathbb{R}^1$, and the orbit (2) is closed. The orbits of the flow $g_\Lambda^t$ are the integral curves of the vector field $F$.
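As a simple worked illustration of these notions (our own example, not taken from the chapter), consider the undamped harmonic oscillator with no control and no additional parameters. Its flow can be written explicitly, and it satisfies the periodicity condition with $T = 2\pi$:

$$
\begin{aligned}
\dot{x}_1 &= x_2, \qquad \dot{x}_2 = -x_1, \\
g^t(x) &= \begin{pmatrix} x_1\cos t + x_2\sin t \\ -x_1\sin t + x_2\cos t \end{pmatrix}, \\
g^{t+2\pi}(x) &= g^t(x) \quad \text{for all } t \in \mathbb{R}^1 .
\end{aligned}
$$

Every orbit other than the equilibrium at the origin is therefore a closed curve, here a circle $x_1^2 + x_2^2 = \text{const}$ in the phase plane.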
For a system with discrete time, given in the form of an algebraic representation, the evolution in time can be described by a general iterative formula:

$$x_{n+1} = f_p(x_n) \tag{3}$$

where $x_n$ and $x_{n+1}$ describe the state of the system at the $n$-th and $(n+1)$-th steps of the evolution.
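A minimal sketch of iterating formula (3) is given below. The logistic map is used as the map $f_p$ purely as an assumption for illustration (it is a standard textbook example, not necessarily one of the models treated in the chapter), and the names `logistic` and `iterate_map` are hypothetical.

```python
def logistic(x, p):
    """One step of the logistic map, used here as an example of f_p."""
    return p * x * (1.0 - x)


def iterate_map(f, x0, p, n_steps):
    """Return the orbit [x_0, x_1, ..., x_n] of x_{n+1} = f(x_n, p)."""
    orbit = [x0]
    for _ in range(n_steps):
        orbit.append(f(orbit[-1], p))
    return orbit


if __name__ == "__main__":
    orbit = iterate_map(logistic, 0.2, 2.5, 200)
    # For p = 2.5 the orbit settles on the fixed point 1 - 1/p = 0.6.
    print(orbit[-1])
```

For this parameter value the orbit approaches a fixed point regardless of the (non-degenerate) starting value in (0, 1); other values of `p` give periodic or irregular orbits.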
Among the basic notions of nonlinear dynamics [2,3,4] one can mention several mutually interrelated ones, such as fixed points, orbits, attractors, the Poincaré section, the Lyapunov exponents, the Hausdorff dimension, the correlation function and bifurcation [1,2,7,11].
An attractor is a region, trajectory or point in the phase space towards which trajectories starting in different regions of the phase space tend. The simplest attractor is a fixed point: the system has a distinguished state towards which it tends regardless of the initial conditions. In a two-dimensional phase space only one more type of attractor is possible, the limit cycle. Limit cycles appear in nonlinear systems that contain both elements dissipating energy and elements sustaining the motion.
The Poincaré section simplifies the search for an attractor by analyzing the points at which trajectories cut through a chosen plane. The Poincaré map arises from the orbits of the phase flow $g_\Lambda^t$, and its property, i.e. whether it is contracting or expanding, determines the behaviour of the system.
The Lyapunov exponents are used to estimate the convergence or divergence of the trajectories of the phase flow. Positive values of the exponents mean divergence of the orbits and chaos. The Lyapunov exponent is defined by the equation:

$$\lambda = \limsup_{t \to \infty} \frac{1}{t} \ln \lvert \xi(t) \rvert \tag{4}$$

where $\xi(t)$ is a phase trajectory; the exponent describes the exponential divergence or convergence of the trajectories surrounding the analyzed trajectory. In general, the number of Lyapunov exponents equals the number of dimensions of the phase space.
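The sign criterion can be illustrated on a one-dimensional discrete map, for which the discrete-time counterpart of definition (4) is the orbit average of $\ln|f'(x_n)|$. The sketch below is only an illustration under that assumption; it uses the logistic map as a stand-in system, and the function names are hypothetical rather than part of ChaoPhS.

```python
import math


def logistic(x, p):
    return p * x * (1.0 - x)


def logistic_derivative(x, p):
    return p * (1.0 - 2.0 * x)


def lyapunov_exponent(p, x0=0.2, n_transient=1000, n_steps=20000):
    """Average of ln|f'(x_n)| along an orbit, after discarding a transient."""
    x = x0
    for _ in range(n_transient):
        x = logistic(x, p)
    total = 0.0
    for _ in range(n_steps):
        # Guard against log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(abs(logistic_derivative(x, p)), 1e-300))
        x = logistic(x, p)
    return total / n_steps


if __name__ == "__main__":
    print(lyapunov_exponent(2.5))  # negative: orbits converge to a fixed point
    print(lyapunov_exponent(4.0))  # positive: nearby orbits diverge
```

A negative estimate corresponds to a regular (here, fixed-point) regime, while a positive estimate indicates exponential divergence of nearby orbits.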
In nonlinear dynamics, a bifurcation is a change in the stable operation of the system that occurs when a control parameter is modified. If we assume that the motion of a dynamic system is described by the structure of the partition of the phase space into trajectories, then the bifurcation values of the parameters are those for which this structure changes.

3 Numerical Modeling of Chaotic Systems


Sensitivity to initial conditions and unpredictability over long time horizons are of fundamental importance for the evolution of chaotic systems. Numerical determination of the trajectory of such systems is more difficult than in the case of linear systems. The mathematical models adopted and, above all, the numerical algorithms are decisive. Particularly careful control of the errors arising during the calculations is required.
It is often accepted [1,2] that for a preliminary evaluation of the system's behaviour one can use a simplified system containing only simplified models of the nonlinear elements that are the principal cause of the chaotic behaviour. Because of the high sensitivity of such a system, a quantitative analysis of its trajectory is useless. The results of such investigations are useful only for qualitative analysis, that is, for determining fixed points, bifurcations and the existence of chaotic attractors.
A trajectory of a linear or nonlinear system defined by formula (1) can be found by the use of iterative algorithms:

$$x(t+h) = x(t) + \int_t^{t+h} f(t, x)\,dt \tag{5}$$

or by expansion in a Taylor series. In this research we use several methods of solving equations (1); they are all discussed in a later part of this paper.
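One common way of approximating the integral in (5) over a single step is the classical fourth-order Runge-Kutta scheme. The sketch below is our own illustration, assuming this scheme; the source does not state at this point which methods ChaoPhS implements, and the names `rk4_step`, `integrate` and `decay` are hypothetical.

```python
import math


def rk4_step(f, t, x, h):
    """Advance the state x (a list of floats) from t to t + h with classical RK4."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def integrate(f, t0, x0, h, n_steps):
    """Return the list of (t, x) pairs obtained by repeating the one-step update."""
    t, x = t0, list(x0)
    trajectory = [(t, x)]
    for _ in range(n_steps):
        x = rk4_step(f, t, x, h)
        t += h
        trajectory.append((t, x))
    return trajectory


def decay(t, x):
    """Test right-hand side x' = -x, whose exact solution is exp(-t)."""
    return [-x[0]]


if __name__ == "__main__":
    t_end, x_end = integrate(decay, 0.0, [1.0], 0.01, 100)[-1]
    print(x_end[0], math.exp(-1.0))  # numerical and exact value at t = 1
```

A linear test equation with a known solution is used here only to check the step formula; for chaotic systems the step size `h` and the order of the method strongly affect how long the computed trajectory stays close to the true one.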
For a qualitative study of the chaotic model very helpful can be the Poincaré sec-
tion. It makes possible a simplification of the attractor search task by the analysis of
points appointed by trajectories which are cutting through chosen plane. Instead of
continuous lines we obtain a set of points situated on this plane (Fig. 1). The plane is
4 R. Porada and N. Mielczarek

Fig. 1. An exemple of the Poincaré section in autonomous systems

Fig. 2. Example of the Poincaré section in non-autonomous systems

selected so as to provide as much information as possible on whether this kind of motion has
an attractor and what its structure is. If the motion takes place on a closed trajectory,
then it intersects the Poincaré plane in one point and the regularity of the movement is
easy to notice. A chaotic motion gives irregular trajectories which cross the plane in ever
new points. If there is no regularity (i.e. no attractor), then the intersection points
migrate in an irregular way within a certain region of the plane, favouring none of its parts.
In non-autonomous systems (particularly for input signals with a constant period T) we often
apply the stroboscopic Poincaré section [1,2]. The points of the Poincaré section are then
taken at the moments nT, n = 1, 2, …, N.
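The stroboscopic sampling at the moments nT can be sketched as follows. The example system (a linear, damped, periodically forced oscillator), the forcing frequency `OMEGA` and all function names are assumptions chosen only for illustration; for such a regular system all section points collapse onto a single point once the transient has decayed.

```python
import math

def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step for a state given as a list of floats."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def stroboscopic_section(f, x0, period, n_periods, steps_per_period, skip):
    """Return the states sampled at t = nT for n = skip + 1, ..., n_periods."""
    h = period / steps_per_period
    x = list(x0)
    points = []
    for n in range(n_periods):
        for s in range(steps_per_period):
            x = rk4_step(f, n * period + s * h, x, h)
        if n >= skip:                      # discard the transient periods
            points.append(tuple(x))
    return points

OMEGA = 1.2                                # assumed forcing frequency
T = 2 * math.pi / OMEGA                    # period of the input signal

def forced_oscillator(t, x):
    # x'' + 0.5 x' + x = cos(OMEGA t), written as a first-order system
    return [x[1], -0.5 * x[1] - x[0] + math.cos(OMEGA * t)]

section = stroboscopic_section(forced_oscillator, [1.0, 0.0], T, 50, 200, 40)
```

A chaotic non-autonomous system sampled in the same way would instead fill a structured set of points on the section plane.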
In nonlinear dynamics the bifurcation diagram is used to evaluate how the stable operation of
a system changes under modifications of the value of a control parameter (Fig. 3). The change
of state occurs in the form of trajectory multiplications [1,2], leading as a result to
chaotic behaviour of the system. This diagram can be obtained by putting on the x axis the
value of the control parameter and on the y axis the points of the Poincaré section found for
different initial conditions, after the elimination of transient states.
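The construction of such a diagram can be sketched for the logistic map, one of the test models used later in the paper. The function names, the number of discarded transient iterations and the rounding used to count distinct states are assumptions made for this illustration.

```python
def logistic(r, x):
    """Logistic map x -> r x (1 - x) with control parameter r."""
    return r * x * (1.0 - x)

def bifurcation_points(r_values, x0=0.5, transient=1000, keep=64):
    """Return (r, x) pairs: control parameter on the x axis, state on the y axis."""
    diagram = []
    for r in r_values:
        x = x0
        for _ in range(transient):         # eliminate the transient states
            x = logistic(r, x)
        for _ in range(keep):              # record the steady-state points
            x = logistic(r, x)
            diagram.append((r, x))
    return diagram

def distinct_states(diagram, r, digits=4):
    """Distinct steady-state values recorded for one parameter value."""
    return sorted({round(x, digits) for rr, x in diagram if rr == r})

diagram = bifurcation_points([2.8, 3.2, 3.5])
```

With these three parameter values the recorded points form one, two and four branches respectively, which is the beginning of the period-doubling cascade.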
To distinguish phenomena of deterministic chaos from noise or from systems that are entirely
stochastic, we can use the Lyapunov exponents and the series of generalized dimensions
(Hausdorff, fractal, correlation). The former define the level of chaos in a dynamical system,
whereas the latter define a measure of complexity of the system.
Modeling of Chaotic Systems in the ChaoPhS Program 5

Fig. 3. Bifurcation diagram showing a cascade of period doubling of phase trajectory orbit

It is rather hard to calculate these coefficients analytically; however, they can be
relatively easily determined from a sampled time series of the investigated system.
The Lyapunov exponents are numerical coefficients of the exponential growth of the distance
between neighbouring points in phase space when a transformation acts on it. For the simplest
transformation $x_{n+1} = a x_n$, after n steps we obtain $x_n = a^n x_0$, which can easily be
written as $x_n = x_0 e^{n \ln a}$. The value $\ln a$ shows the proportion in which the
distance between points changes in one step of the transformation. For multidimensional
systems, where the transformation has the form $x_{n+1} = A x_n$, the Lyapunov exponents are
equal to $\lambda_k = \ln a_k$, where $a_1, a_2, \ldots, a_k$ are the eigenvalues of the matrix A.
In directions where trajectories diverge from each other the Lyapunov exponents are positive;
where they converge, the exponents are negative. The condition for preserving the measure is
det A = 1, which means that the product of all eigenvalues is equal to 1. For continuous
nonlinear systems, the rate of motion on each trajectory is set by a tangent vector. The
transformation matrix is the Jacobian matrix, $J_{ij} = \partial f_j / \partial x_i$, whose
elements are functions of the coordinates of points in the phase space and define the rate of
change of the j-th coordinate in the $x_i$ direction. Therefore these exponents are calculated
locally and their values are obtained in a small neighbourhood of the explored point.
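For a one-dimensional map the local character of the exponent can be illustrated directly: the exponent is the trajectory average of the logarithm of the local stretching factor |f'(x)|. The sketch below is an assumption-laden illustration (function name, initial points and iteration counts are ours); it reproduces ln a for the linear transformation and gives a value close to ln 2 for the logistic map with parameter 4, which is the known analytical result for that map.

```python
import math

def lyapunov_1d(f, df, x0, n_steps, n_transient=100):
    """Average of ln|f'(x_n)| along a trajectory of the map x -> f(x)."""
    x = x0
    for _ in range(n_transient):
        x = f(x)
    total = 0.0
    for _ in range(n_steps):
        total += math.log(abs(df(x)))      # local stretching at the current point
        x = f(x)
    return total / n_steps

# Linear transformation x_{n+1} = a x_n: the exponent equals ln a exactly.
a = 0.5
lam_linear = lyapunov_1d(lambda x: a * x, lambda x: a, 1.0, 200)

# Logistic map x_{n+1} = 4 x (1 - x): local derivative 4 - 8 x.
lam_logistic = lyapunov_1d(lambda x: 4.0 * x * (1.0 - x),
                           lambda x: 4.0 - 8.0 * x, 0.2, 20000)
```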
In order to determine the largest Lyapunov exponent, this research uses the algorithm
developed by Rosenstein, Collins and De Luca [16].
Let the sequence:

$$x = \{x_1, x_2, x_3, \ldots, x_N\} \qquad (3)$$

represent the samples of a time series of one of the state variables for which the exponents
are being estimated, whereas:

$$X_i = [X_1 \; X_2 \; \ldots \; X_n]^T \qquad (4)$$

where $X_i$ – vector of state variables obtained in discrete time i, n – number of state
variables (embedding dimension of the system's trajectory).
Applying the Takens method [12] of attractor reconstruction from the time series, we obtain
the vector of delayed state variables:

$$X_i = [x_i \; x_{i+J} \; \ldots \; x_{i+(m-1)J}]^T \qquad (5)$$

where: J – reconstruction delay, m – embedding dimension of the space of the delayed state
variables vector.
To choose the embedding dimension m correctly, we apply the Takens theorem:

$$m \geq 2n + 1 \qquad (6)$$
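The reconstruction (5) amounts to simple index arithmetic on the sampled series. A minimal sketch, with the hypothetical helper name `delay_embed` and zero-based indices:

```python
def delay_embed(x, m, J):
    """Build the delayed vectors [x[i], x[i+J], ..., x[i+(m-1)J]] of (5).

    x is the sampled time series (3), m the embedding dimension and
    J the reconstruction delay expressed in samples.
    """
    M = len(x) - (m - 1) * J               # number of complete vectors
    if M <= 0:
        raise ValueError("time series too short for this m and J")
    return [[x[i + k * J] for k in range(m)] for i in range(M)]

vectors = delay_embed([0, 1, 2, 3, 4, 5], m=3, J=2)
```

A series of N samples therefore yields N − (m − 1)J reconstructed vectors, so large values of m and J shorten the usable data set.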

After reconstruction of the vector of state variables, we find the distance $d_j(0)$ from the
reference point j to its nearest neighbour, defined by the Euclidean norm:

$$d_j(0) = \min_{X_{\hat{j}}} \| X_j - X_{\hat{j}} \| \qquad (7)$$

where: $d_j(0)$ – initial distance of the j-th point from its nearest neighbour $X_{\hat{j}}$.
It can be assumed that the distance $d_j(i)$ is equal to:

$$d_j(i) \approx C_j e^{\lambda_1 (i \cdot \Delta t)} \qquad (8)$$

where $C_j$ – initial distance.
Taking the logarithm of both sides of the equation, we obtain:

$$\ln d_j(i) \approx \ln C_j + \lambda_1 (i \cdot \Delta t) \qquad (9)$$

The largest Lyapunov exponent can be obtained by calculating, with the least squares method,
the slope of the line (9) with respect to the step index i and dividing it by the sample
interval Δt of the time series x.
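The steps (5) to (9) can be put together in a short sketch. It is an illustration under our own assumptions, not the ChaoPhS implementation: all function names are hypothetical, the logistic map is used as the data source, the neighbour search is a plain double loop, and the fit is made against the time i·Δt, which is equivalent to fitting against i and dividing the slope by Δt.

```python
import math

def logistic_series(n, r=4.0, x0=0.2, transient=100):
    """Sampled time series (3) generated by the logistic map."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(n):
        out.append(x)
        x = r * x * (1.0 - x)
    return out

def delay_embed(x, m, J):
    """Delayed vectors (5)."""
    M = len(x) - (m - 1) * J
    return [[x[i + k * J] for k in range(m)] for i in range(M)]

def nearest_neighbours(X, min_sep):
    """Index of the nearest neighbour (7) of each vector, skipping temporally close ones."""
    nn = []
    for j, xj in enumerate(X):
        best, best_d = -1, float("inf")
        for k, xk in enumerate(X):
            if abs(j - k) <= min_sep:
                continue
            d = math.dist(xj, xk)
            if 0.0 < d < best_d:
                best, best_d = k, d
        nn.append(best)
    return nn

def mean_log_divergence(X, nn, i_max):
    """Average of ln d_j(i) over all reference points j, for i = 0..i_max."""
    M = len(X)
    curve = []
    for i in range(i_max + 1):
        logs = []
        for j, k in enumerate(nn):
            if k < 0 or j + i >= M or k + i >= M:
                continue
            d = math.dist(X[j + i], X[k + i])
            if d > 0.0:
                logs.append(math.log(d))
        curve.append(sum(logs) / len(logs))
    return curve

def least_squares_slope(xs, ys):
    """Slope of the least squares line through the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def largest_lyapunov(series, m, J, dt, min_sep, i_max):
    """Estimate lambda_1 as the slope of relation (9)."""
    X = delay_embed(series, m, J)
    nn = nearest_neighbours(X, min_sep)
    curve = mean_log_divergence(X, nn, i_max)
    times = [i * dt for i in range(i_max + 1)]
    return least_squares_slope(times, curve)

lam = largest_lyapunov(logistic_series(1000), m=2, J=1, dt=1.0, min_sep=5, i_max=4)
```

The fit range `i_max` has to stay inside the region where the divergence curve is still linear; once the distances become comparable with the attractor size the curve saturates and the slope is underestimated.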

Fig. 4. Method of numeric calculations of the Lyapunov exponent



For a qualitative description of the complexity of the chaotic system we can use the
correlation dimension $D_2$, which is a lower bound of the Hausdorff dimension $D_0$, i.e.
$D_2 \leq D_0$. The correlation dimension is defined as:

$$D_2 = \lim_{r \to 0} \frac{\ln C(r)}{\ln r} \qquad (10)$$

where $C(r)$ – correlation integral equal to:

$$C(r) = \frac{2}{M(M-1)} \sum_{i<k} \Theta[r - \| X_i - X_k \|] \qquad (11)$$

where: r – distance between points, Θ – the Heaviside function, M – number of vectors $X_i$.
If the time signal (5) is known, then it is possible to compute the correlation integral
$C(r)$. The correlation integral is the probability that the distance between two points on
the attractor is smaller than r.
In this work we use the Grassberger-Procaccia algorithm [11] to calculate $C(r)$ as the
correlation sum. Writing equation (10) in the form:

$$\ln C(r) = f(\ln r) \qquad (12)$$

it is possible to notice that the correlation dimension can be determined as the slope of the
function (12).
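A minimal sketch of the correlation sum (11) and of the slope of function (12) is given below. The Hénon map with its usual literature parameters is used as the data source, and the function names, the number of points and the set of radii are assumptions made for this illustration only.

```python
import bisect
import math

def henon_points(n, a=1.4, b=0.3, transient=100):
    """Points on the Henon attractor (parameter values assumed from the literature)."""
    x, y = 0.0, 0.0
    pts = []
    for i in range(n + transient):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= transient:
            pts.append((x, y))
    return pts

def correlation_sums(points, radii):
    """Correlation sum C(r) of (11) for every r in the ascending list radii."""
    counts = [0] * (len(radii) + 1)
    M = len(points)
    for i in range(M - 1):
        for k in range(i + 1, M):
            d = math.dist(points[i], points[k])
            # index of the first radius that is strictly larger than d
            counts[bisect.bisect_right(radii, d)] += 1
    sums, running = [], 0
    for c in counts[:-1]:
        running += c
        sums.append(2.0 * running / (M * (M - 1)))
    return sums

def correlation_dimension(points, radii):
    """Least squares slope of ln C(r) versus ln r, i.e. of function (12)."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(c) for c in correlation_sums(points, radii)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

radii = [0.01 * 1.5 ** k for k in range(8)]
d2 = correlation_dimension(henon_points(1000), radii)
```

The radii must be small compared with the size of the attractor, yet large enough for each C(r) to count a reasonable number of pairs; otherwise the fitted slope is dominated by edge effects or by statistical noise.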
The correlation integral can also be used as a tool for distinguishing deterministic
irregularities, arising from internal properties of a strange attractor, from external white
noise. If a strange attractor is embedded in an n-dimensional space and external white noise
is added, then each point on the attractor is surrounded by a homogeneous n-dimensional cloud
of points. The radius $r_0$ of this cloud is proportional to the intensity of the noise. For
$r \gg r_0$ the correlation integral counts these clouds as points and the slope of the
function $\ln C(r) = f(\ln r)$ is equal to the correlation dimension of the attractor. For
$r \ll r_0$ most of the counted points are situated inside homogeneously filled n-dimensional
cells and the slope tends to the value n.
In practical applications, sources of deterministic chaos include switched systems, e.g.
power electronics systems. In investigations using numerical simulations they cause additional
difficulties whose severity depends on the selected model of the switching elements. In the
case of a model of a system with a changeable structure, an ideal model of the element is
often used (zero switching time, zero resistance of the switch in the on-state and infinite
resistance in the off-state). The method of eliminating the right-hand side discontinuity of
a system is the numerical calculation of the switching moment $t_S$, followed by integration
according to rule (2) in the range $\langle t;\, t_S \rangle$ and setting the initial conditions
$x(t_S^+) = x(t_S^-)$ of the new integration (2) in the range $\langle t_S;\, t + h \rangle$. It is
possible to eliminate the left-hand side discontinuity (e.g. closing a switch in a circuit with

capacitance with a non-zero initial condition) by setting the initial conditions of the new
integration to $x(t_S^+) \neq x(t_S^-)$ and using a very small integration step.
When we select a model with a stationary structure, it is possible to replace the ideal
switching element with a real one and apply an algorithm for solving stiff differential
equations with a very small integration step.
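The handling of a right-hand side discontinuity described above can be sketched for a scalar state. The example is hypothetical: the two right-hand sides, the switching condition g and the bisection used to locate the switching moment are our assumptions, chosen so that the exact answer is known (the switch occurs at t = ln 2 and the state at t = 1 equals exp(−1)).

```python
import math

def rk4(f, t, x, h):
    """One classical Runge-Kutta step for a scalar state."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def step_with_switch(f_before, f_after, g, t, x, h, tol=1e-12):
    """Integrate over [t, t + h]; split the step at t_S if g changes sign.

    Returns the state at t + h and the switching moment (or None).
    """
    x_end = rk4(f_before, t, x, h)
    if g(x) * g(x_end) > 0:
        return x_end, None                 # no switching inside this step
    lo, hi = 0.0, h
    while hi - lo > tol:                   # bisection for the switching moment
        mid = 0.5 * (lo + hi)
        if g(x) * g(rk4(f_before, t, x, mid)) > 0:
            lo = mid
        else:
            hi = mid
    t_s = t + hi
    x_s = rk4(f_before, t, x, hi)          # state just before switching
    # continuity x(t_S+) = x(t_S-): restart from x_s with the new right-hand side
    return rk4(f_after, t_s, x_s, t + h - t_s), t_s

f_on = lambda t, x: 1.0 - x                # assumed model with the switch closed
f_off = lambda t, x: -x                    # assumed model with the switch open
g = lambda x: x - 0.5                      # switching when the state reaches 0.5

t, x, h = 0.0, 0.0, 0.1
t_switch = None
for _ in range(10):
    if t_switch is None:
        x, t_switch = step_with_switch(f_on, f_off, g, t, x, h)
    else:
        x = rk4(f_off, t, x, h)
    t += h
```

Integrating straight across the switching moment with a fixed step would instead mix both right-hand sides inside one step and reduce the accuracy to first order.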

4 Description of the ChaoPhS Program

The ChaoPhS (Chaotic Phenomena Simulations) program, whose block diagram is shown in Fig. 5,
was written in the C++ Builder software development kit, with the use of object-oriented
programming technology.
The program contains a library of mathematical models of chaotic systems, including power
electronics systems, from among which the tested object is selected. This library is open
because it is linked dynamically. This means that when the program is developed, new elements
can be added to the library without modifying and recompiling the entire program code. Each
system added to the library is treated as a class (in the sense of programming terminology)
which has definite methods used to analyse this system [6] and properties determining its
parameters. The methods of the class describing the system comprise, among others, the
implemented numerical algorithms solving the equations of the mathematical model of this
system and the mathematical tools used for its analysis. This concept and part of the
numerical methods were taken from work [14].
To solve the differential equation (1) it is possible to choose one of the following
methods: Runge-Kutta, Fehlberg, Dormand-Prince, Adams-Moulton, Gear, and Gragg with the
Bulirsch-Stoer extrapolation. It is also possible to choose the order of the method and the
integration step, or to use the option of an automatic selection of the step and, in some
cases, an automatic selection of the order of the method (Fig. 5). To improve the program's
performance and to simplify the implementation of methods for investigating nonlinear systems,
a method of approximating the state vector has been added. A discrete state vector obtained
at the discrete times $t_k = t_0 + k \cdot h$ (h – integration step) can be approximated to get
additional data between the discrete times $t_{k-1}$ and $t_k$. The applied third-order spline
approximation [13] has been supplemented with linear interpolation, which for some of the
studied systems gives better results when very small integration steps are used because of
the nonlinearity and nonstationarity of these objects.
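The linear variant of this approximation is easy to state explicitly. The helper below is a hypothetical sketch (its name and signature are not taken from the program): it returns the state at an intermediate time t between two stored integration points.

```python
def interpolate_state(t_prev, x_prev, t_next, x_next, t):
    """Linear interpolation of the state vector between t_prev and t_next."""
    if not t_prev <= t <= t_next:
        raise ValueError("t must lie between the two stored time points")
    w = (t - t_prev) / (t_next - t_prev)   # 0 at t_prev, 1 at t_next
    return [(1.0 - w) * a + w * b for a, b in zip(x_prev, x_next)]

state = interpolate_state(0.0, [0.0, 2.0], 1.0, [1.0, 4.0], 0.25)
```

Such intermediate values are needed, for example, when a Poincaré section point or a sampling moment falls between two steps of a solver with automatic step selection.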
Besides the waveform of the state vector, the program makes it possible to determine the
phase trajectory and to perform spectral analysis with the use of the discrete Fourier
transform (Horner's algorithm and the FFT algorithms of Cooley-Tukey and Sande-Tukey with a
defined radix). The methods of analysis of nonlinear dynamics, which were added to the used
numerical library [6], concern the determination of the Poincaré section, bifurcation
diagrams for selected system parameters, the

Fig. 5. Exemplary screenshots of the ChaoPhS program concerning the choice of numerical
methods solving the equations of the system's model: a) menu of model choice; b) menu
responsible for choice of simulation start; c) option dialog box of simulation parameters;
d) option dialog box of parameters of data analysis; e) dialog box for changing model
parameters; f) menu of the type of graphical chart describing the system; g) table with the
time series of the model's state vector

Lyapunov exponents and a correlation function (developed by Rosenstein, Collins, and
De Luca [16]).
The program also makes it possible to import data calculated in other programs (e.g. Matlab)
and to compute selected quantities characterizing the nonlinear dynamics (e.g. the Poincaré
section or the Lyapunov exponents). The data can be selected from the program menu <File>.
To choose the model one should open the <Model> submenu from the menu bar and select the
proper system from the objects available in the library (Fig. 5a). Next one should click the
<Option> command in the main menu and set the parameters of the simulation (Fig. 5c), that
is: the initial and final time of the simulation, the initial values of the state vector, the
algorithm of the ODE solver and its order, the precision and the integration step. The name
of the investigated model and the ODE solver algorithm then appear on the status bar at the
bottom of the program panel. In the main window a graphic symbol of the model should be
visible, which for convenience can be moved inside the program's panel. After setting the
parameters of the model (by double clicking on the graphic symbol of the system, Fig. 5e),
the calculations can be started by choosing the <Calculations> submenu and the <Solve>
command. The progress of the computation is shown in the progress bar placed below the main
panel. Afterwards a window with the waveforms of the state variables (Fig. 5b) appears on the
screen; it can be enlarged to the size of the main program window and changed dynamically
using the mouse. In the main menu one can activate the <Charts> submenu, from which it is
possible to choose the graphs: waveforms of state variables, phase portraits, frequency
spectrum, Poincaré section, bifurcation diagram and the functions used to estimate the largest
Lyapunov exponent and the correlation dimension (Fig. 5f). Before selecting any chart, it is
necessary to set the options in the <Numeric methods II> tab of the options dialog box
(Fig. 5d). There one can select the FFT algorithm, the approximation (or interpolation) method
of the state vector used when automatic selection of the integration step is enabled, and
also the parameters of the Poincaré section, the bifurcation diagram and the Lyapunov
exponent. It is also possible to view the computed state vector of the model equation by
selecting the <Table> command from the main menu. The vector appears inside the table shown
in Fig. 5g.
Apart from models of systems described by equation (1), the program can analyze models
represented by algebraic equations, e.g. the logistic map, the Hénon map, the Ikeda map
and others.
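For orientation, the three maps named above can be written down in a few lines. The formulas are the forms commonly given in the literature and the parameter values are the usual textbook choices; the paper does not state which values were used in ChaoPhS, so they are assumptions here, as are the function names.

```python
import math

def logistic_map(x, r=4.0):
    """Logistic map x -> r x (1 - x)."""
    return r * x * (1.0 - x)

def henon_map(x, y, a=1.4, b=0.3):
    """Henon map (x, y) -> (1 - a x^2 + y, b x)."""
    return 1.0 - a * x * x + y, b * x

def ikeda_map(x, y, u=0.9):
    """Ikeda map: a rotation by a state-dependent angle, scaled by u and shifted."""
    t = 0.4 - 6.0 / (1.0 + x * x + y * y)
    return (1.0 + u * (x * math.cos(t) - y * math.sin(t)),
            u * (x * math.sin(t) + y * math.cos(t)))

def iterate_2d(step, state, n):
    """Apply a two-dimensional map n times and return the visited states."""
    orbit = []
    for _ in range(n):
        state = step(*state)
        orbit.append(state)
    return orbit

ikeda_orbit = iterate_2d(ikeda_map, (0.0, 0.0), 1000)
```

Each of these maps has the iterative form $X_{n+1} = f(X_n)$, so the same bifurcation, Lyapunov exponent and correlation dimension tools can be applied to them without solving any differential equation.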

5 Verification of the ChaoPhS Program

To verify the tools implemented in ChaoPhS, tests were carried out on well-known models of
chaotic systems in the form of:
• recurrent maps: logistic, Hénon, Ikeda,
• differential equations (1) of the models: Lorenz, Rössler, Rössler hyperchaos, Chua circuit
with smooth nonlinearity.

Fig. 6. Bifurcation diagrams of tested mathematical models (panels: logistic map, Hénon map,
Ikeda map)

Fig. 7. Attractors of tested models (panels: logistic map, Hénon map, Ikeda map, Lorenz
system, Rössler system, Chua circuit with smooth nonlinearity)

Fig. 6 presents the bifurcation diagrams for the investigated test maps. All of the diagrams
are identical with those obtained in other publications [1,2,10], which proves the correctness
of the calculations performed on iterative maps characterized by the formula
$X_{n+1} = f(X_n)$.
Fig. 7 shows that systems described by (1) are also simulated correctly and that the
implemented methods of solving differential equations are proper. The phase trajectories
forming strange attractors are the same as those contained in [1,2,6,7,8,9,10].
To determine the largest Lyapunov exponent the program draws a chart of the distance between
trajectory points as a function of the largest exponent, $\ln d_j(i) = f(\lambda_1(t))$.
The exponent is calculated (using the least squares method in a selected range) on the

Fig. 8. Chart of function $\ln d_j(i) = f(\lambda_1(t))$ for logistic map

Fig. 9. Chart of function $\ln C(r) = f(\ln r)$ for logistic map

basis of the slope of this function. The correlation dimension $D_2$ can be determined on
the basis of a chart of the logarithm of the correlation integral (an exemplary chart is shown
in Fig. 9) as a function of the logarithm of the distance between neighbouring points,
$\ln C(r) = f(\ln r)$.
To determine the Lyapunov exponent and the correlation dimension, the time series of just one
state variable is sufficient because the program independently reconstructs

the attractor using the method of delayed time series [12]. Before those quantities are
computed in the program, it is necessary to input the following parameters: the embedding
dimension of the attractor obtained from the time series, the time delay, the number of
samples of the time series used in the calculations of the exponent $\lambda_1$ and the
dimension $D_2$, and also the window size, outside of which points are skipped.
Table 1 shows a comparison of the values of the largest Lyapunov exponent presented in
publications [15,16,20] with the exponents calculated using ChaoPhS. It can be noticed that
the compared values are similar.

Table 1. Comparison of values of largest Lyapunov exponents calculated in program ChaoPhS
with values presented in publications [15,16,20]

6 Simulation of Simple Power Electronics Converters


We present some results of simulations of practical switching systems on the example of a
step-down (buck) converter [17] with different kinds of modulation. For the converter we used
the sawtooth and the triangle signal as the carrier signal in the PWM modulation. The
investigations were carried out with the use of ChaoPhS and were compared with results
obtained with the Matlab program.

Fig. 10. Main panel of program ChaoPhS

Fig. 11. Frequency spectrum of DC/DC buck converter for control gain K as parameter:
K = 8.4, K = 12, K = 14.5, K = 23

Figure 10 shows the main panel of ChaoPhS with a scheme of the DC/DC converter and a chart
containing a voltage waveform in the steady state of the system.
Figure 11 shows the evolution of the converter's state from the steady state with the
1T-periodic orbit, through the 2T- and 4T-periodic states, up to chaotic operation of the
system. One can notice the doubling of the stripes of the frequency spectrum and, in the
final chart, the absence of a dominant frequency.

For a power electronics system with periodic waveforms of current and voltage whose frequency
equals the frequency of the external signal, that is, the frequency of the sawtooth signal in
the PWM generator, the largest Lyapunov exponent should be negative. This is caused by the
lack of divergence of the system's trajectories. The function $\ln d_j(i) = f(\lambda_1(t))$
must then have a negative slope (Fig. 12a). For a chaotic mode of operation the slope is
positive (Fig. 12b).

Fig. 12. Chart for calculation of largest Lyapunov exponent for: a) stable state K = 8.4;
b) chaos K = 23

Fig. 13. Nonlinear phase trajectory – chaotic activity of DC/DC buck converter

Figures 13 and 14 show the phase trajectory and the bifurcation diagram of the investigated
converter, which confirm the occurrence of chaotic phenomena when the control parameter
(system gain) changes.
Figure 15 represents a situation in which two attractors related to bifurcations arise during
the operation of the system. In this figure there are

compared two diagrams: one obtained using the presented program and one obtained using
Matlab. One can notice that:

Fig. 14. Bifurcation diagram of DC/DC buck converter with sawtooth carrier signal in PWM:
a) ChaoPhS, b) Matlab

• the two diagrams are similar in the range ⟨8, 25⟩ of the control parameter value,
• the additional clouds of points on the Matlab diagram are connected with an insufficient
filtration of the transient states of the system, which is a result of the long computation
time in comparison with the computation time of the presented program.
The operation of a buck converter with a triangle carrier signal of the PWM modulation [19]
is also presented.
Figure 16 shows a panel of ChaoPhS with a diagram of the converter, a window of the phase
space chart and a dialog box used to input parameters of the model. The

Fig. 15. a) Two attractors of DC/DC buck converter; b) enlargement of one attractor

picture of the phase space shows that the system is in the state of deterministic chaos.
Even though the structure of the converter has not been changed, the region of stable
operation occurs in a different range of values of the control parameter K. This can be
noticed when we compare the bifurcation diagrams of both systems presented in Figs. 14 and 18.
Additionally, Figure 18 shows how significant are the initial

conditions for which the simulation was performed. For the control parameter K = 10 with
different initial conditions, the Poincaré section can have one or three stable points.
Figure 17 shows that the structure of the attractor of the buck converter with the triangular
carrier signal of the PWM modulation is very complex and differs from that of the system with
the sawtooth carrier signal (Fig. 15).

Fig. 16. Main panel of ChaoPhS program for DC/DC buck converter with triangle carrier PWM
signal

Fig. 17. Poincaré section for DC/DC buck converter with triangle carrier PWM signal and
K = 23

Fig. 18. Bifurcation diagram for DC/DC buck converter with triangle carrier PWM signal

Fig. 19. Chart for obtaining the largest Lyapunov exponent for: a) stable state K = 8.4,
b) chaos K = 23

7 Conclusions
The paper presents a simulation program, ChaoPhS (Chaotic Phenomena Simulations), intended
for investigating deterministic chaos phenomena in various systems, among others in power
electronics converters. The program was written in the C++ Builder software development kit
with the use of object-oriented programming technology. Owing to the application of
dynamically linked libraries of the studied systems, of the methods of solving the equations
which describe them, and of the methods of analysis of the occurring chaos phenomena, the
presented program can easily be developed further.
The verification of the correctness of the analysis of well-known models of chaotic systems,
carried out with the use of the presented program, shows agreement with the results presented
in the bibliography.
In this paper we also introduced several models of power electronics converters whose chaotic
properties are an object of the authors' research.

References
[1] Ott, E.: Chaos in dynamical systems (in Polish). WNT, Warszawa (1997)
[2] Schuster, H.G.: Deterministic chaos. An introduction (in Polish). PWN, Warszawa (1995)
[3] Hamill, D.C.: Power electronics: A field rich in nonlinear dynamics. In: Nonlinear
Dynamics of Electronic Systems, Dublin (1995)
[4] Hirsch, M.W., Smale, S.: Differential Equations, Dynamical Systems and Linear Algebra.
Academic Press, New York (1974)
[5] Banerjee, S., Ranjan, P., Grebogi, C.: Bifurcations in one-dimensional piecewise smooth
maps: Theory and applications in switching circuits. IEEE Trans. on Circuits and
Systems – I 47(5) (2000)
[6] Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130 (1963)
[7] Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod.
Phys. 57, 617 (1985)
[8] Rössler, O.E.: An equation for continuous chaos. Phys. Lett. A 57, 397 (1976)
[9] Rössler, O.E.: An equation for hyperchaos. Phys. Lett. A 71, 155 (1979)
[10] Hénon, M.: A two-dimensional mapping with a strange attractor. Comm. Math. Phys. 50,
69 (1976)
[11] Grassberger, P., Procaccia, I.: Characterization of strange attractors. Phys. Rev. Lett. 50,
346 (1983)
[12] Takens, F.: Lecture Notes In Math, vol. 898. Springer, Heidelberg (1981)
[13] Baron, B., Piątek, Ł.: Metody numeryczne w C++ Builder. Helion, Gliwice (2004)
[14] Baron, B.: Układ dynamiczny jako obiekt klasy C++. IC-SPETO, Gliwice-Ustroń (2005)
[15] Wolf, A., Swift, J.B., Swinney, H.L., Vastano, J.A.: Determining Lyapunov exponents
from a time series. Physica D 16, 285 (1985)
[16] Rosenstein, M.T., Collins, J.J., De Luca, C.J.: A practical method for calculating largest
Lyapunov exponents from small data sets (1992)
[17] Porada, R., Mielczarek, N.: Wstępne badania symulacyjne zachowań chaotycznych
w układach energoelektronicznych. ZKwE, Kiekrz (2004)
[18] Porada, R., Mielczarek, N.: Preliminary Analysis of Chaotic Behaviour in Power
Electronics. EPNC, Poznań (2004)
[19] Porada, R., Mielczarek, N.: Badania zjawisk chaosu deterministycznego w zamkniętych
układach energoelektronicznych. In: IC-SPETO 2005, Gliwice-Ustroń (2005)
[20] Huang, P.J.: Control in Chaos (2000),
http://math.arizona.edu/~ura/001/huang.pojen/
Model of a Tribological Sensor Contacting
Rotating Disc

Vsevolod Vladimirov and Jacek Wróbel

AGH University of Science and Technology


Al. Mickiewicza 30, 30-059 Cracow, Poland
vladimir@mat.agh.edu.pl, vsevolod.vladimirov@gmail.com

Abstract. We study mechanical oscillations of a sensor forming a friction pair with the
rotating disc. In the absence of friction the model is described by a two-dimensional
Hamiltonian system of ODEs which is completely integrable. As the Coulomb-type friction is
added, the regimes appearing in the modelling system become more complicated. They are
investigated both by qualitative methods and by numerical simulation. With such a synthesis
we obtain a complete picture of the global behavior of the system, within a broad range of
values of the driving parameter, for two principal types of the modelling function simulating
the Coulomb friction. A sequence of bifurcations (limit cycles, double-limit cycles,
homoclinic bifurcations and other regimes) is observed as the driving parameter changes. The
pattern of bifurcations depends in an essential way upon the model of friction force employed
and this dependence is analyzed in detail. Much more complicated regimes appear as we
incorporate into the model the one-dimensional oscillations of the rotating element. The
system possesses in this case quasiperiodic, multiperiodic and, probably, chaotic solutions.

1 Model Describing the Vertical Rod Which Contacts Rotating Disc
1.1 Statement of the Problem
We consider a modelling system describing oscillations of a vertical rod. Its lower end is
fixed, while the upper one contacts the rotating disc. The geometry of the mechanical system
is shown in fig. 1. We assume in addition that the disc can perform vertical oscillations. In
such circumstances, nonlinear oscillations of the far end of the rod can be described by the
following second order equation:

$$\ddot{x} + x^3 - x + f(\dot{x} - \nu)\,[1 + \varepsilon \sin(\omega t)] = 0, \qquad (1)$$

where f is a Coulomb-type friction force having the properties:
• f is an antisymmetric function,
• f(ν) has a local minimum for some $\nu_0 > 0$.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 21–27.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
22 V. Vladimirov and J. Wróbel

Fig. 1. General view of the mechanical system: A – vertical rod; B – rotating disc

1.2 Local Analysis of the Autonomous Case


At first let us analyze the case ε = 0, for which equation (1) can be rewritten as the
following dynamical system:

$$\dot{x} = -y, \qquad \dot{y} = x^3 - x - f(y + \nu). \qquad (2)$$

To begin with, let us note that system (2) is Hamiltonian when f = 0 and it is completely
described by the Hamiltonian function

$$H(x, y) = \frac{x^4}{4} + \frac{y^2}{2} - \frac{x^2}{2},$$

which is constant on the phase trajectories (see fig. 2).
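That H is conserved can be checked in one line. The short derivation below differentiates H along solutions of system (2) with f = 0 and uses only the definitions given above.

```latex
\frac{dH}{dt}
  = \frac{\partial H}{\partial x}\,\dot{x} + \frac{\partial H}{\partial y}\,\dot{y}
  = (x^3 - x)(-y) + y\,(x^3 - x)
  = 0 .
```

The phase trajectories of the frictionless system are therefore the level curves of H.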
For f ≠ 0, all stationary points of system (2) lie on the horizontal axis and have the
representation $(x_*, 0)$, where $x_*$ satisfies the equation

$$x^3 - x = f(\nu). \qquad (3)$$

We assume in addition that $f(\nu) \in \left(-\frac{2\sqrt{3}}{9}, \frac{2\sqrt{3}}{9}\right)$.
Under this condition the system possesses three stationary points.
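The bound on f(ν) follows from the shape of the cubic on the left-hand side of (3). Writing $g(x) = x^3 - x$ (the name g is introduced here only for this remark), its local extrema are:

```latex
g'(x) = 3x^2 - 1 = 0 \;\Longrightarrow\; x = \pm\frac{1}{\sqrt{3}} \approx \pm 0.577,
\qquad
g\!\left(\mp\frac{1}{\sqrt{3}}\right) = \pm\frac{2\sqrt{3}}{9} \approx \pm 0.385 .
```

Every level lying strictly between the local minimum and the local maximum of g is attained exactly three times, which gives the three roots of equation (3).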
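It is easy to see that the Jacobi matrix of system (2), corresponding to a stationary point
$(x_*(\nu), 0)$, is given by the following formula:

$$\hat{J}(\nu) = \begin{pmatrix} 0 & -1 \\ 3x_*^2(\nu) - 1 & -f'(\nu) \end{pmatrix}$$

The eigenvalues of the matrix $\hat{J}(\nu)$ are as follows:

$$\lambda_1 = \frac{1}{2}\left[-f'(\nu) - \sqrt{4 - 12x_*^2(\nu) + f'(\nu)^2}\,\right],$$
$$\lambda_2 = \frac{1}{2}\left[-f'(\nu) + \sqrt{4 - 12x_*^2(\nu) + f'(\nu)^2}\,\right].$$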
2
Model of a Tribological Sensor Contacting Rotating Disc 23

Fig. 2. Phase portrait of system (2) in case f = 0

In accordance with the assumption that f(−x) = −f(x), we analyze the regimes appearing in the
system when ν > 0. It follows then from equation (3) that the coordinates $x_-$, $x_0$ and
$x_+$ of the critical points lie, respectively, inside the intervals (−1, −0.577),
(−0.577, 0), (1, 1.359). For $x_0(\nu) \in (-0.577, 0)$ the eigenvalues are real and have
opposite signs. The eigenvalues corresponding to the critical points
$x_-(\nu) \in (-1, -0.577)$ and $x_+(\nu) \in (1, 1.359)$ are complex. They satisfy the
inequalities

$$\mathrm{Re}[\lambda_{1,2}^{\pm}] > 0 \ \text{when} \ \nu < \nu_0, \qquad
\mathrm{Re}[\lambda_{1,2}^{\pm}] < 0 \ \text{when} \ \nu > \nu_0.$$

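So in the vicinity of the value $\nu = \nu_0$ the creation of limit cycles occurs. In order to
study their types we should calculate the real part of the first Floquet index $C_1(\nu_0)$
[1]. Luckily we can do it for both of the critical points $A_\pm = (x_\pm(\nu_0), 0)$
simultaneously. Performing the change of variables

$$x = -\frac{W}{\Omega} + x_\pm, \qquad (4)$$
$$y = U, \qquad (5)$$

where $\Omega = \sqrt{3x_\pm^2 - 1}$, we get the canonical representation

$$\begin{pmatrix} \dot{U} \\ \dot{W} \end{pmatrix} =
\begin{pmatrix} 0 & -\Omega \\ \Omega & 0 \end{pmatrix}
\begin{pmatrix} U \\ W \end{pmatrix} +
\begin{pmatrix} h_1 \\ h_2 \end{pmatrix}, \qquad (6)$$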

where

$$h_1 = -\frac{3W^2 x_\pm}{\Omega^2} - \frac{W^3}{\Omega^3} - f''(\nu_0)U^2 + f'''(\nu_0)U^3 + o(U^3),$$
$$h_2 = 0.$$

Using the well-known formula (see e.g. [1]), we obtain that

$$\mathrm{Re}(C_1(\nu_0)) = -3 f'''(\nu_0)/8.$$

From this we conclude that the stability type of the pair of limit cycles depends completely
on the sign of $f'''(\nu_0)$. If $f'''(\nu_0) > 0$, then stable limit cycles appear when
$\nu < \nu_0$. Conversely, for $f'''(\nu_0) < 0$ unstable limit cycles appear when
$\nu > \nu_0$.

1.3 Global Behavior of the Autonomous System


Above we have shown that the stability types of the periodic trajectories arising in system
(2) depend merely upon the sign of $f'''(\nu_0)$. Now we are going to present the global
behavior of this system and its dependence upon the parameter ν and the type of the modelling
function f simulating the Coulomb friction. We use the following approximation for f:

$$f(\nu) = \begin{cases} \varphi(\nu) & \text{when } \nu \in (0, \nu_1), \\
k \arctan(\nu - \nu_1) + \varphi(\nu_1) & \text{when } \nu > \nu_1, \end{cases}$$

where $\varphi(\nu) = a\nu^4 + b\nu^3 + c\nu^2 + d\nu + e$.
Numerical simulations show that, depending on the sign of $f'''(\nu_0)$, there are two types
of global behavior, as illustrated in figs. 3 and 4, while the remaining peculiarities of the
function f seem to be unimportant. The global phase portraits presented here could serve as a
basis for predicting the qualitative behavior of the autonomous system (2) in a broad range
of values of the parameter ν.

1.4 Non-autonomous Case


In the general case equation (1) can be presented as the following dynamical system:

$$\dot{x} = -y, \qquad \dot{y} = x^3 - x - f(y + \nu)\,[1 + \varepsilon \sin(\omega t)]. \qquad (7)$$

Fig. 3. Qualitative changes of phase portrait of system (2), case $f'''(\nu_0) > 0$


Fig. 4. Qualitative changes of phase portrait of system (2), case $f'''(\nu_0) < 0$

Fig. 5. Bifurcation diagrams of system (7), obtained for ε = 0.2, and increasing ν

Fig. 6. Bifurcation diagrams of system (7), obtained for ε = 0.6, and increasing ν

In what follows, we assume that ε ∈ (0, 1]. Numerical experiments show that the behavior of
system (7) does not differ from that of system (2) when ε ≪ 1. There are also no significant
changes when ε is of the order of unity but $\nu > \nu_0 + d$, i.e. in those cases when the
critical points $(x_\pm, 0)$ of the autonomous system are stable foci. But the behavior of
system (7) differs drastically from that of (2) when $\nu \in (0, \nu_0 + d)$. Qualitative
changes of the non-autonomous system, studied with the help of the Poincaré section technique
[1], are shown in figs. 5–10. They present the results of numerical simulations in which

Fig. 7. Bifurcation diagrams of system (7), obtained for ε = 1.0, and increasing ν

Fig. 8. Bifurcation diagrams of system (7), obtained for ε = 0.2, and decreasing ν

Fig. 9. Bifurcation diagrams of system (7), obtained for ε = 0.6, and decreasing ν

Fig. 10. Bifurcation diagrams of system (7), obtained for ε = 1.0, and decreasing ν
Model of a Tribological Sensor Contacting Rotating Disc 27

the driving parameter ν either grow or decreases. All figures present the results

of the simulation for the case f (ν0 ) < 0.

2 Concluding Remarks
A brief presentation of the global analysis of equation (1) shows that even the
autonomous case exhibits very rich behavior within the parameter range ν ∈
(0, ν0 + d] for some d > 0. The qualitative features of the phase trajectories
depend merely on the sign of f′(ν0) and seem to be insensitive to the other
details of the modelling function f, which represents the Coulomb-type friction.
The variety of solutions becomes much richer when the term that describes
vertical oscillation is incorporated. Analyzing the qualitative features
of the solutions, one can see that they become more and more complicated, depending
on the value of the parameter ε. As this parameter grows, system (7)
demonstrates periodic, quasiperiodic and multiperiodic regimes, period-doubling
cascades and, probably, chaotic oscillations. Note, however, that this is the case
only when ν ∈ (0, ν0 + d], because for sufficiently large values of the velocity, lying beyond
this interval, all motions in the system become asymptotically stable and
tend, depending on the initial values, to either (x+, 0) or (x−, 0).

Acknowledgements
The authors are greatly indebted to Dr T. Habdank-Wojewódzki for acquainting
them with his experimental results and for valuable suggestions.

Reference
[1] Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems and
Bifurcations of Vector Fields. Springer, New York (1987)
The Bifurcations and Chaotic Oscillations in Electric
Circuits with Arc

V. Sydorets

Paton Welding Institute


Bozhenko 11
Kiev, Ukraine
sidvn@ua.fm

Abstract. Autonomous electric circuits with an arc, governed by three ordinary differential
equations, were investigated. Under variation of two parameters we observed many kinds of
bifurcations, as well as periodic and chaotic behavior of this system. The bifurcation diagrams were
constructed and studied in detail. Routes to chaos were classified. Three basic
patterns of bifurcation diagrams were observed, possessing the properties (i) softness and reversibility; (ii)
stiffness and irreversibility; (iii) stiffness and reversibility.

1 Introduction
In recent years the investigation of nonlinear dissipative dynamical systems has
developed rapidly. Fundamental results were obtained, one of which is the discovery of deterministic
chaos in various mechanical, physical, chemical, biological, and ecological systems.
The same phenomena were found in electrical engineering. They were
studied in detail by L. Chua [1] and V. Anishchenko [2].
A classical nonlinearity, the electric arc in electric circuits, remains insufficiently
researched. The author has tried to make up for this deficiency, all the more so since a
mathematical model of the dynamic electric arc was proposed by I. Pentegov and,
jointly with the author, was improved and used in many applications [3].
As preliminary investigations [4] have shown, deterministic chaos can emerge in an
electric circuit with an arc.
The bifurcation phenomenon is of cardinal importance in nonlinear systems.
Under variation of two parameters, many kinds of bifurcations and of periodic and chaotic
regimes can be observed:

• Hopf bifurcation (supercritical or subcritical);
• Bifurcation of twin limit cycles (stable and unstable);
• Infinite cascade of period doubling bifurcations with transition to chaos;
• Finite cascade of period doubling bifurcations with or without transition to chaos;
• Reverse cascade of period doubling bifurcations;
• Intermittency;
• Crisis of attractor;
• Overlapping of attractor basins that leads to metastable chaos and isolated
regimes.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 29 – 42.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

A powerful tool for investigating nonlinear dissipative dynamical systems is the construction of
one- and two-parameter bifurcation diagrams [5-10]. One-parameter bifurcation
diagrams are very well suited to investigating the routes along which chaos develops.
Two-parameter diagrams make it possible to generalize these results and to reveal a set of
universal structures.
An electric circuit with an arc is a fairly simple and convenient system for investigation
because it possesses a rich collection of periodic and chaotic regimes [5-7].
In spite of the complexity and variety of the bifurcation diagrams of an electric circuit with
an arc, we could find several typical patterns. The classification was carried out with respect to
two important properties: softness or stiffness of the onset of chaos or periodicity, and reversibility
or irreversibility of the process as the bifurcation parameter rises or falls.
In this classification the pattern type does not depend on the concrete bifurcation that
causes it, and it also possesses self-similarity, which is a characteristic feature of embedded
patterns in self-organization.

2 Electric Circuits with Arc


Eight electric circuits with an arc, depicted in Fig. 1, were investigated. It is easy
to show that the processes in the circuits of Fig. 1e, 1f, 1g, 1h are similar to those
in the circuits of Fig. 1a, 1b, 1c, 1d, respectively. In circuits 1d and 1h oscillations never
exist. In circuits 1b and 1f the oscillations are only periodic. Periodic and chaotic
oscillations are observed in circuits 1a, 1e, 1c, and 1g. Therefore only the processes taking
place in circuit 1a will be described.
According to the generalized arc model [3], the arc is considered as a part of an electric
circuit. The voltage across this part is

\[
u_A = \frac{U(i_\theta)}{i_\theta}\, i, \tag{1}
\]

where i is the arc current, U(i) is the static volt-ampere characteristic of the arc, and iθ is the
state current of the arc [3].
The dimensionless system of differential equations describing the circuit depicted in Fig. 1a
comprises two Kirchhoff equations, for the contour and for the node, and the arc model
equation:

\[
\begin{aligned}
\dot{x} &= \frac{1}{L}\left( y - x z^{\frac{n-1}{2}} \right), \\
\dot{y} &= \frac{1}{RC}\left( 1 + R - y - R x \right), \\
\dot{z} &= x^{2} - z,
\end{aligned} \tag{2}
\]

where R, L, C are the resistance, inductance and capacitance of the electric circuit; n is the exponent in
the approximation of the static volt-ampere characteristic of the arc; and x, y, z are the dimensionless
reactor current, capacitor voltage, and square of the arc state current, respectively.
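System (2) can be integrated numerically with any standard scheme. The following is only a minimal sketch using a fixed-step fourth-order Runge-Kutta method. The function names, the step size, the initial state and the value n = -0.5 are illustrative assumptions (the text does not state the value of n that was used), and the code assumes that z stays positive so that the fractional power is real.

```python
from typing import Tuple

State = Tuple[float, float, float]


def arc_rhs(s: State, R: float, L: float, C: float, n: float) -> State:
    """Right-hand side of the dimensionless system (2); requires z > 0."""
    x, y, z = s
    dx = (y - x * z ** ((n - 1.0) / 2.0)) / L
    dy = (1.0 + R - y - R * x) / (R * C)
    dz = x * x - z
    return (dx, dy, dz)


def rk4_step(s: State, h: float, R: float, L: float, C: float, n: float) -> State:
    """One classical Runge-Kutta step of size h."""
    def shifted(base, k, a):
        return tuple(b + a * ki for b, ki in zip(base, k))

    k1 = arc_rhs(s, R, L, C, n)
    k2 = arc_rhs(shifted(s, k1, h / 2.0), R, L, C, n)
    k3 = arc_rhs(shifted(s, k2, h / 2.0), R, L, C, n)
    k4 = arc_rhs(shifted(s, k3, h), R, L, C, n)
    return tuple(
        si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for si, a, b, c, d in zip(s, k1, k2, k3, k4)
    )


def integrate(s0: State, h: float, steps: int,
              R: float, L: float, C: float, n: float) -> State:
    """Advance the state by `steps` RK4 steps and return the final state."""
    s = s0
    for _ in range(steps):
        s = rk4_step(s, h, R, L, C, n)
    return s


# R, L, C as in Fig. 3; n = -0.5 is an assumed placeholder for a falling characteristic.
print(integrate((1.1, 1.0, 1.0), 0.01, 100, R=15.0, L=1.0, C=2.7, n=-0.5))
```

A quick sanity check of the reconstruction is that the point (1, 1, 1) makes all three right-hand sides vanish, in agreement with the fixed point quoted in the caption of Fig. 3.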

Fig. 1. Eight electric circuits with arc

When the static volt-ampere characteristic of the arc is falling, two fixed points exist,
whose coordinates are found by setting the right-hand sides of system (2) equal to zero.
The only condition that can be obtained analytically is the condition of the Hopf
bifurcation [11]. For this purpose we linearize system (2) near the fixed point
for which the Kaufman condition holds.
One of the Hopf bifurcation conditions coincides with the condition that
the real part of a pair of complex roots of the characteristic polynomial equals zero:

\[
(RLC + RC + L)(1 + L + R + nRC) = RLC\,(R + n). \tag{3}
\]
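Condition (3) can be checked against a direct linearization. Linearizing system (2) at the fixed point (1, 1, 1) gives a characteristic polynomial λ^3 + a1 λ^2 + a2 λ + a3 with a1 = (RLC + RC + L)/(RLC), a2 = (1 + L + R + nRC)/(RLC) and a3 = (R + n)/(RLC), so that (3) is the equality a1 a2 = a3. The sketch below is a hedged illustration, not the author's procedure: it solves (3) as a quadratic in C and verifies that λ = i√a2 is then a root. The function names and the value n = -0.5 are assumptions.

```python
import math


def char_coeffs(R: float, L: float, C: float, n: float):
    """Coefficients a1, a2, a3 of the characteristic polynomial at (1, 1, 1)."""
    d = R * L * C
    a1 = (R * L * C + R * C + L) / d
    a2 = (1.0 + L + R + n * R * C) / d
    a3 = (R + n) / d
    return a1, a2, a3


def hopf_capacitance(R: float, L: float, n: float) -> float:
    """Smallest positive root C of condition (3), read as a quadratic in C (n != 0)."""
    A = R * (L + 1.0)          # RLC + RC + L = A*C + L
    B = 1.0 + L + R            # 1 + L + R + nRC = B + n*R*C
    qa = A * n * R
    qb = A * B + L * n * R - R * L * (R + n)
    qc = L * B
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        raise ValueError("condition (3) has no real root for these parameters")
    roots = [(-qb + sign * math.sqrt(disc)) / (2.0 * qa) for sign in (1.0, -1.0)]
    positive = [c for c in roots if c > 0.0]
    if not positive:
        raise ValueError("condition (3) has no positive root for these parameters")
    return min(positive)


# Assumed illustrative parameters: R = 15, L = 1, n = -0.5.
C_hopf = hopf_capacitance(15.0, 1.0, -0.5)
a1, a2, a3 = char_coeffs(15.0, 1.0, C_hopf, -0.5)
lam = 1j * math.sqrt(a2)                       # candidate purely imaginary root
residual = lam ** 3 + a1 * lam ** 2 + a2 * lam + a3
print(C_hopf, abs(residual))
```

Sweeping L in `hopf_capacitance` for fixed R would trace a curve of the kind shown in Fig. 2; whether its minimum matches the quoted Lm depends on the actual value of n, which is not given here.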

The basic distinction of the Hopf bifurcation in the considered circuit is that it
may be supercritical as well as subcritical. So local instability may arise either
through the separation of a stable limit cycle from the focus or through the merging of the
focus with an unstable limit cycle.
The curve of the Hopf bifurcation (see Fig. 2) in the parameter plane (L, C), which is
defined by formula (3), has a minimum. It turned out that on the side of small L,
up to the minimum (for R = 15, Lm = 2.7924741181414), the bifurcation is supercritical,
and beyond the minimum it is subcritical. The curve of the twin-cycle (tangent) bifurcation
joins the Hopf curve at the point where the kind of the Hopf bifurcation changes. Its location
was determined more exactly. The twin-cycle bifurcation curve lies below the Hopf bifurcation
curve. So under variation of the parameter C the system develops according to different
scenarios, depending on the value of the fixed parameter L.

Fig. 2. The curves of Hopf bifurcation in the (L, C) plane (curve labels: R = 0.5, 1.5, 5, 15, 50)

Fig. 3. Oscillations with period 1T as a result of the Hopf bifurcation (R = 15, L = 1, C = 2.7). Projection of the phase
portrait onto the (x, z) plane. Fixed point: (1, 1, 1).
As an example, the case R = 15 will be described. At small L (L < Lm), as
C rises, the Hopf bifurcation occurs and a stable limit cycle appears (Fig. 3). A further
rise of C leads to a period-doubling bifurcation: the period-1 limit cycle
becomes unstable and a stable period-2 cycle appears (Fig. 4). Self-oscillations
with half the frequency are established in the system. Then a cascade of period-doubling
bifurcations follows. As a result, cycles of period four, eight, sixteen, etc. appear (Figs. 5-7).

3 Period Doubling Bifurcations


As is well known [2], the period-doubling bifurcation cascade is one of the scenarios of
transition from an ordinary attractor to a strange one, i.e. of the transition from periodic
self-oscillations to chaotic ones.
Fig. 4. Cascade of period doubling bifurcations. Oscillations with period 2T (R = 15, L = 1, C = 2.9; projection onto the (x, z) plane).

Fig. 5. Cascade of period doubling bifurcations. Oscillations with period 4T (R = 15, L = 1, C = 3.025; projection onto the (x, z) plane).

Fig. 6. Cascade of period doubling bifurcations. Oscillations with period 8T (R = 15, L = 1, C = 3.035; projection onto the (x, z) plane).

In fact, at a certain value of C chaotic self-oscillations appear in the system (Fig. 8),
and the attractor becomes strange. Its strangeness consists in the fact that any of its trajectories
is unstable in the Lyapunov sense while the attractor is stable in the Poisson sense. A
strong dependence of the solution on the initial conditions demonstrates the instability in the
Lyapunov sense. Whereas in a periodic regime two trajectories with close initial conditions come
together, in a chaotic regime they diverge; the oscillations nevertheless remain stable because
the system is characterized by an overall contraction of phase volume (the divergence of the
system is negative; for system (2) it equals −z^((n−1)/2)/L − 1/(RC) − 1).

Fig. 7. Cascade of period doubling bifurcations. Oscillations with period 16T (R = 15, L = 1, C = 3.0385; projection onto the (x, z) plane).

Fig. 8. Chaotic oscillations: strange attractor (R = 15, L = 1, C = 3.19; projection onto the (x, z) plane).

Fig. 9. Periodic window in chaos. Oscillations with period 5T (R = 15, L = 1, C = 3.088; projection onto the (x, z) plane).

Fig. 10. Periodic window in chaos. Oscillations with period 3T (R = 15, L = 1, C = 3.13; projection onto the (x, z) plane).

Fig. 11. Periodic window in chaos. Oscillations with period 4T (R = 15, L = 1, C = 3.2; projection onto the (x, z) plane).
The structure of chaos is non-homogeneous: windows of
periodicity are observed in the chaotic region. For L = 1 they qualitatively coincide with the
windows of periodicity of the logistic map [2]: first comes a window in which the periodic
oscillations have period six, then a period-five window
(see Fig. 9), then a wide period-three window (see Fig. 10). As a
result of a period-doubling bifurcation, a window with period six appears. At
other values of L, windows of periodicity with periods 3, 4 (see Fig. 11), 6, 9, 12 occur.
At large values of the parameter L (L > Lm) the scenario of oscillation development in
the system differs greatly from the one described above. The development is
initiated by the twin limit cycle bifurcation, as a result of which a stable and an unstable limit
cycle appear stiffly.
So two attractors coexist in the system: the first is a stable fixed point,
the second a stable limit cycle. As C increases further, these attractors develop
independently. The limit cycle undergoes a period-doubling bifurcation cascade, after
which chaotic oscillations appear. The stable fixed point undergoes the subcritical Hopf
bifurcation, as a result of which, depending on the parameter values, either periodic
oscillations with period two or chaotic oscillations may appear stiffly. From
the bifurcation diagram one can see that, depending on the initial conditions, the transient
process tends to different attractors: either to a limit cycle or to a strange attractor.
The basins of attraction are separated by an unstable limit cycle.

4 Bifurcation Diagrams
For a more detailed study of the scenarios of chaos development, many researchers
employ the technique of constructing a one-parameter bifurcation diagram. The
values of the varied parameter are plotted on the abscissa, and one of the
coordinates of the Poincaré section points on the ordinate. As the section plane the half plane

\[
x^{2} - z = 0, \tag{4}
\]

with x > 1 is chosen. Judging by the third equation of system (2), the Poincaré section points
are the oscillation maxima of the variable z.
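The section condition can be applied to a sampled trajectory as follows. This is a hedged sketch rather than the procedure actually used for Figs. 12-16: it looks for sign changes of g = x^2 − z from positive to non-positive (where the derivative of z changes sign from plus to minus, i.e. a maximum of z), keeps only crossings with x > 1, and interpolates linearly between the two neighbouring samples. The function name `section_points` and the synthetic test signal are assumptions introduced for illustration.

```python
import math
from typing import List, Sequence, Tuple

State = Tuple[float, float, float]


def section_points(samples: Sequence[State]) -> List[float]:
    """Return interpolated z values at crossings of x^2 - z = 0 (maxima of z, x > 1)."""
    points: List[float] = []
    for (x0, _, z0), (x1, _, z1) in zip(samples, samples[1:]):
        g0 = x0 * x0 - z0
        g1 = x1 * x1 - z1
        if g0 > 0.0 >= g1:                    # dz/dt goes from + to -: local maximum of z
            w = g0 / (g0 - g1)                # linear interpolation weight in [0, 1]
            x_cross = x0 + w * (x1 - x0)
            if x_cross > 1.0:                 # restrict to the half with x > 1
                points.append(z0 + w * (z1 - z0))
    return points


# Synthetic check signal (not a solution of system (2)): z = 2 + sin t and
# x chosen so that x^2 - z = cos t, i.e. the section is hit at the maxima of z.
ts = [0.01 * k for k in range(1257)]
demo = [(math.sqrt(2.0 + math.sin(t) + math.cos(t)), 0.0, 2.0 + math.sin(t)) for t in ts]
print(section_points(demo))
```

For a bifurcation diagram, such section values would be collected after discarding a transient for each value of the varied parameter and plotted against that parameter.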
Fig. 12 shows the bifurcation diagram for L = 1 with the
parameter C varying from 2.8 to 3.4. All stages of the scenario described above are clearly
visible in it. The periodic windows in chaos are also clearly
visible. The emergence of periodic oscillations in a chaotic region may be considered a
self-organization process. Therefore the cause and mechanism of its
appearance are of interest.
For instance, at L = 1 the evolution of the strange attractor is clearly visible. At
first the chaotic state extends over neighboring orbits of the periodic oscillations
and the strange attractor has a strip structure. Narrow strips merge into wider ones as a
result of a “reverse” period-doubling bifurcation cascade, i.e. in the order 2^k, 2^(k-1),
..., 16, 8, 4, 2, 1. After the last “reverse” period-doubling bifurcation the strange
attractor densely covers a part of the phase space and has the structure of the so-called screw
strange attractor.
In Fig. 12 periodicity windows with period 5 (C = 3.088..3.090), 3 (C =
3.123..3.145, wide window), 4 (C = 3.200..3.210), 3 (C = 3.3355), and 1 (C = 3.3800)
are marked. The wide window with period 3 is present in almost all bifurcation
diagrams that contain a regime of developed chaos (this fact was noted in [12]). It
begins with a stiff destruction of chaos and ends with a period-doubling cascade (i.e. 3,
6, 12, ..., 3⋅2^k).
However, in the bifurcation diagram at L = 0.3 (Fig. 13), for example, two windows
with period 3 are observed. The development scenario of the first window coincides
with the one described above, but that of the second window is reversed.
The periodic window 2⋅3 (C = 4.1204..4.1586) in the bifurcation diagram at L =
0.2 (Fig. 14) both appears and is destroyed stiffly. The
window with period 2⋅2 (C = 4.204..4.205) is of particular interest, since two attractors
coexist in it and, depending on the initial conditions, either of them can be realized.
Analysis of the scenarios described above and comparison with well-known
approaches show that they coincide with the Feigenbaum scenario, especially in the regimes
preceding chaos (the period-doubling bifurcation cascade). Distinctions are observed in
chaotic regimes. There are parameter values (for instance L = 0.1 in Fig. 15) at which the
number of period-doubling bifurcations is finite and chaotic regimes do not arise.
At L = 0.11 an infinite period-doubling bifurcation cascade
occurs. However, it adjoins another cascade running in the reverse direction, through which
the chaotic oscillations give way to periodic ones (see Fig. 16).

Fig. 12. Bifurcation diagram at L = 1

Fig. 13. Bifurcation diagram at L = 0.3

Fig. 14. Bifurcation diagram at L = 0.2

Fig. 15. Bifurcation diagram at L = 0.1

Fig. 16. Bifurcation diagram at L = 0.11

5 Self-similarity and Scaling Invariance of Bifurcation Diagram


In spite of the complexity and variety of the bifurcation diagrams of an electric circuit with
an arc, several typical patterns are found. The classification was carried out with respect to
two important properties: (a) softness or stiffness of the onset of chaos or periodicity; (b)
reversibility or irreversibility of the process as the bifurcation
parameter rises or falls.
In such a classification the type of pattern does not depend on the concrete bifurcation
that causes it, and the pattern possesses self-similarity, which is typical of embedded structures.
The property of reversibility is important when carrying out ordinary physical
experiments. In what follows we call an experiment an ordinary physical experiment
when a dynamical system is observed under a sufficiently smooth change of the bifurcation
parameter. In this case the final values of the variables for one value of the parameter serve as
the initial values for the next value of the parameter. Ordinary physical experiments then include
the observation of most natural phenomena, most physical experiments in which the
initial conditions for the variables are not set in a particular way, and numerical experiments
that simulate an ordinary physical experiment.
In the studied system only three basic patterns were revealed, which possess
the following properties:

(i) softness and reversibility (Fig. 17);
(ii) stiffness and irreversibility (Fig. 17);
(iii) stiffness and reversibility (Fig. 18).

Pattern (i) is the well-known and widespread pattern of period-doubling bifurcations. It can
start either with a supercritical Hopf bifurcation (as, for instance, in the studied system at
small values of the parameter L) or with a period-doubling bifurcation when it is an embedded
structure (for instance, every subsequent branch of the bifurcation tree in Figs. 12-16 is
similar to the previous one). The property of softness means that all periodic oscillations born
at period-doubling bifurcations appear with zero amplitude, and that at the accumulation point
of the period-doubling bifurcations, although the transition to chaos is considered to happen
there, the power of the chaotic component is equal to zero. Under a reverse change of the
bifurcation parameter the processes occur in the reverse order. The continuation of pattern (i)
into the chaotic region is a cascade of so-called ‘reverse’ bifurcations, in which narrow chaotic
strips merge into wider ones. ‘Reverse’ bifurcations also possess the properties of
softness and reversibility.

Fig. 17. Patterns (i) and (ii) (diagram label: hysteresis)

Fig. 18. Patterns (iii) at ordinary and special physical experiments (diagram labels: isolated region, metastable chaos, crisis)
Pattern (ii) differs from pattern (i) in that, for certain values of the bifurcation
parameter, the system is bistable and two attractors (stable motions) coexist in it.
A repeller (unstable motion), which forms the boundary of the attractor basins, is located between
them. The system motion follows one of the attractors while the development of the other
attractor goes unnoticed. At the edges of the bistable zone the repeller merges
with one of the attractors and they annihilate each other, which shows up as a jump to the
remaining attractor. This phenomenon is known as hysteresis. It should be
emphasized that the jumps at the two edges occur in different directions.
As the bifurcation parameter increases (at L > Lm), chaos appears stiffly in the system as a
result of a period-doubling bifurcation. In other cases a stiff appearance
(appearance with nonzero amplitude) of periodic oscillations is possible.
As the bifurcation parameter is raised further, the development of chaos in patterns (ii)
and (i) coincides. However, if the bifurcation parameter then begins to fall,
the irreversibility of pattern (ii) shows itself. The process follows another path. System
regimes that did not appear while the bifurcation parameter was rising now appear.
The cascade of period-doubling bifurcations is observed in the reverse order. The last
bifurcation, at which the attractor disappears, is the tangent bifurcation of the stable and
unstable cycles.
Note that although in pattern (ii) not all regimes become
apparent simultaneously, they can in principle all be revealed by an ordinary physical
experiment with a rising and falling bifurcation parameter.
Pattern (iii) outwardly resembles pattern (ii) but has essential distinctions.
The boundary of the strange attractor intersects the repeller, and the basins of the two
attractors overlap. This phenomenon is called a crisis of the strange attractor. In the competition
between the two attractors, the chaotic attractor undergoing the crisis loses its attracting
properties.
A jump to the periodic (more stable) attractor occurs and a zone of so-called
metastable chaos appears. The attracting properties are restored only when the repeller
disappears by merging with the periodic attractor in a tangent bifurcation, which
coincides with the second crisis.
An ordinary physical experiment in the presence of pattern (iii) looks as
follows. As the bifurcation parameter rises, the oscillations in the system follow the periodic
attractor, and at the tangent bifurcation point developed chaotic oscillations appear stiffly. As
the bifurcation parameter decreases, at the crisis point the developed chaotic oscillations
stiffly become periodic ones. Thus the development of chaos is stiff and reversible.
Special attention should be paid to the fact that pattern (iii) has regimes that cannot
be revealed by an ordinary physical experiment. They can therefore be called
‘isolated’ regimes. They are bounded on one side by the tangent bifurcation and on the
other side by the strange attractor crisis.

Why are they impossible to reveal? The reason is that the jumps that occur in
the system are directed to the same side (unlike hysteresis, where the jumps are directed to
different sides), so it is not possible to enter the region of these regimes in a natural way.
This can be done either by a special physical experiment, in which the initial
conditions can be preset or the parameter value is changed very fast, or by modifying the
studied system through the superposition of pulses. In nature these regimes can show up as a
result of some extreme (extraordinary) events. However, even if isolated regimes occur in the
system, any change of the parameter leads to a transition into the region of simple regimes.
Although isolated regimes are a rather exotic phenomenon, they are important
from the viewpoint of studying the properties of the chaotic oscillations that appear in patterns (iii).
It turns out that the properties of chaos in this case are determined by the cascade of
period-doubling bifurcations that occurs in the isolated region, because this is the same attractor.
Although in the metastable chaos region the attractor loses its attracting properties, its
development continues.
Knowledge of the properties of isolated regimes helps to reveal them in the bifurcation
diagrams of the studied system (see the enlarged notes in Figs. 14-16).

6 Quantitative Estimations
Feigenbaum [2] established that the cascade of period-doubling bifurcations possesses
not only qualitative but also quantitative universal properties. It turned out that the
distances between successive bifurcation values of the parameter asymptotically form a
geometric progression whose ratio δ is a universal value, i.e. a value independent of the kind
of nonlinear system.
For the studied system the value
δ = 4.669220751009
was obtained already at the bifurcation to the period-64 cycle; it contains
five correct significant digits, which confirms the universality of δ.
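The estimate of δ from a list of successive bifurcation values C_1, C_2, ... is the ratio (C_k − C_(k−1)) / (C_(k+1) − C_k). The sketch below illustrates this arithmetic only; the bifurcation values of the arc circuit are not listed in the text, so the example uses commonly quoted approximate period-doubling thresholds of the logistic map, which are an outside illustration and not data from this study. The function name is an assumption.

```python
from typing import List, Sequence


def feigenbaum_ratios(values: Sequence[float]) -> List[float]:
    """Ratios of successive intervals between period-doubling bifurcation values."""
    return [
        (values[k] - values[k - 1]) / (values[k + 1] - values[k])
        for k in range(1, len(values) - 1)
    ]


# Approximate period-doubling thresholds of the logistic map (periods 2, 4, 8, 16),
# used here only as stand-in input; they are not taken from the arc circuit.
logistic_thresholds = [3.0, 3.449490, 3.544090, 3.564407]
print(feigenbaum_ratios(logistic_thresholds))
```

The early ratios are only rough; the estimate approaches the universal value as later bifurcations of the cascade are included, which is why the text quotes the value reached at the period-64 bifurcation.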

7 Conclusions
Electric circuits with an arc exhibit an abundance of periodic and chaotic behaviour.
Investigation of these circuits may be useful because their properties are universal and
can be applied to other nonlinear dynamical systems.

References
1. Syuan, W.: Family of Chua’s circuits. Trans. IEEE. 75(8), 55–65 (1987)
2. Anishchenko, V.S.: Complicated oscillation in simple, 312 p. Nauka, Moscow (1990) (in
Russian)
3. Pentegov, I.V., Sidorets, V.N.: Energy parameters in mathematical model of dynamical
welding arc. Automaticheskaya svarka 11, 36–40 (1988) (in Russian)
4. Sidorets, V.N., Pentegov, I.V.: Chaotic oscillations in RLC circuit with electric arc
Doklady AN Ukrainy, vol. 10, pp. 87–90 (1992) (in Russian)

5. Sidorets, V.N., Pentegov, I.V.: Appearance and structure of strange attractor in RLC
circuit with electric arc. Technicheskaya electrodynamica 2, 28–32 (1993) (in Russian)
6. Sidorets, V.N., Pentegov, I.V.: Deterministic chaos development scenarios in electric
circuit with arc. Ukrainian physical journal 39(11-12), 1080–1083 (1994) (in Ukrainian)
7. Sidorets, V.N.: Structures of bifurcation diagrams for electric circuit with arc. Technichna
electrodynamica 6, 15–18 (1998)
8. Vladimirov, V.A., Sidorets, V.N.: On the Peculiarities of Stochastic Invariant Solutions of
a Hydrodynamic System Accounting for Non-local Effects. Symmetry in Nonlinear
Mathematical Physics 2, 409–417 (1997)
9. Vladimirov, V.A., Sidorets, V.N.: On Stochastic Self Oscillation Solutions of Nonlinear
Hydrodynamic Model of Continuum Accounting for Relaxation Effects. Dopovidi
Nacionalnoyi akademiyi nauk Ukrayiny 2, 126–131 (1999) (in Russian)
10. Vladimirov, V.A., Sidorets, V.N., Skurativskii, S.I.: Complicated Travelling Wave
Solutions of a Modelling System Describing Media with Memory and Spatial Nonlocality.
Reports on Mathematical Physics 41(1/2), 275–282 (1999)
11. Sidorets, V.N.: Feature of analyses eigenvalues of mathematical models of nonlinear
electrical circuits. Electronnoe modelirovanie 20(5), 60–71 (1998) (in Russian)
12. Li, T., Yorke, J.A.: Period Three Implies Chaos American Math. Monthly 82, 985–991
(1975)
Soft Computing Models for Intelligent Control of
Non-linear Dynamical Systems

Oscar Castillo and Patricia Melin

Division of Graduate Studies and Research


Tijuana Institute of Technology
Tijuana, Mexico
ocastillo@tectijuana.mx

Abstract. We describe in this paper the application of soft computing techniques to controlling
non-linear dynamical systems in real-world problems. Soft computing consists of fuzzy logic,
neural networks, evolutionary computation, and chaos theory. Controlling real-world non-linear
dynamical systems may require the use of several soft computing techniques to achieve the
desired performance in practice. For this reason, several hybrid intelligent architectures have
been developed. The basic idea of these hybrid architectures is to combine the advantages of
each of the techniques involved in the intelligent system. Also, non-linear dynamical systems
are difficult to control due to the unstable and even chaotic behaviors that may occur in these
systems. The described applications include robotics, aircraft systems, biochemical reactors,
and manufacturing of batteries.

Keywords: Neural Networks, Fuzzy Logic, Genetic Algorithms, Intelligent Control.

1 Introduction
We describe in this paper the application of soft computing techniques and fractal
theory to the control of non-linear dynamical systems [8]. Soft computing consists of
fuzzy logic, neural networks, evolutionary computation, and chaos theory [23]. Each
of these techniques has been applied successfully to real world problems. However,
there are applications in which one of these techniques is not sufficient to achieve the
level of accuracy and efficiency needed in practice. For this reason, it is necessary to
combine several of these techniques to take advantage of the power that each
technique offers. We describe several hybrid architectures that combine different soft
computing techniques. We also describe the development of hybrid intelligent
systems combining several of these techniques to achieve better performance in
controlling real dynamical systems. We illustrate these ideas with applications to
robotic systems, aircraft systems, biochemical reactors, and manufacturing systems.
Each of these problems has its own characteristics, but all of them share in common
their non-linear dynamic behavior. For this reason, the use of soft computing
techniques is completely justified. In all of these applications, the results of using soft
computing techniques have been better than with traditional techniques.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 43 – 70.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Neural Network Models


A neural network model takes an input vector X and produces an output vector Y.
The relationship between X and Y is determined by the network architecture [23].
There are many forms of network architecture (inspired by the neural architecture of
the brain). The neural network generally consists of at least three layers: one input
layer, one output layer, and one or more hidden layers. Figure 1 illustrates a neural
network with p neurons in the input layer, one hidden layer with q neurons, and one
output layer with one neuron.

Fig. 1. Single hidden layer feedforward neural network (input units 1, 2, …, i, …, p+1; hidden units 1, …, j, …, q, q+1; one output unit)

In the neural network we will be using, the input layer has p+1 processing
elements, i.e., one for each predictor variable plus a processing element for the bias.
The bias element always has an input of one, X_{p+1} = 1. Each processing element in the
input layer sends signals X_i (i = 1,…,p+1) to each of the q processing elements in the
hidden layer. The q processing elements in the hidden layer (indexed by j = 1,…,q)
produce an “activation” a_j = F(Σ_i w_ij X_i), where w_ij are the weights associated with the
connections between the p+1 processing elements of the input layer and the jth
processing element of the hidden layer. Once again, processing element q+1 of the
hidden layer is a bias element and always has an activation of one, i.e. a_{q+1} = 1.
Assuming that the processing element in the output layer is linear, the network model
will be

(1)

Here π_i are the weights for the connections between the input layer and the output
layer, and θ_j are the weights for the connections between the hidden layer and the
output layer. The main requirement to be satisfied by the activation function F(.) is
that it be nonlinear and differentiable. Typical functions used are the sigmoid,
hyperbolic tangent, and the sine functions, i.e.:

(2)

The weights in the neural network can be adjusted to minimize some criterion such
as the sum of squared error (SSE) function:

(3)

Thus, the weights in the neural network are similar to the regression coefficients in
a linear regression model. In fact, if the hidden layer is eliminated, (1) reduces to the
well-known linear regression function. It has been shown [13, 24] that, given
sufficiently many hidden units, (1) is capable of approximating any measurable
function to any accuracy. In fact F(.) can be an arbitrary sigmoid function without any
loss of flexibility.
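Since the displayed formulas (1)-(3) are not reproduced in this text, the following sketch rests on an assumed form of the model that matches the verbal description: a linear output unit fed by direct input-to-output weights π_i and by hidden-to-output weights θ_j, with bias elements X_{p+1} = 1 and a_{q+1} = 1, and a sigmoid activation F. The function names and the weight layout are hypothetical.

```python
import math
from typing import Callable, Sequence


def sigmoid(u: float) -> float:
    """Logistic sigmoid, one of the activation functions named in the text."""
    return 1.0 / (1.0 + math.exp(-u))


def forward(
    x: Sequence[float],
    W: Sequence[Sequence[float]],      # W[j][i]: weight from input i to hidden unit j
    pi: Sequence[float],               # p+1 direct input-to-output weights
    theta: Sequence[float],            # q+1 hidden-to-output weights
    F: Callable[[float], float] = sigmoid,
) -> float:
    """Output of a single-hidden-layer network with a linear output unit (assumed form)."""
    xb = list(x) + [1.0]                                   # bias input X_{p+1} = 1
    a = [F(sum(w * xi for w, xi in zip(row, xb))) for row in W]
    a.append(1.0)                                          # bias activation a_{q+1} = 1
    linear_part = sum(p * xi for p, xi in zip(pi, xb))
    hidden_part = sum(t * aj for t, aj in zip(theta, a))
    return linear_part + hidden_part


def sse(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Sum of squared errors between targets and network outputs."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs))


W_demo = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]]              # q = 2 hidden units, p = 2 inputs
print(forward([2.0, 3.0], W_demo, [0.5, -1.0, 0.25], [0.0, 0.0, 0.0]))
```

With all θ_j set to zero the hidden layer drops out and the output is the linear regression function of the inputs, which is the reduction mentioned in the text.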
The most popular algorithm for training feedforward neural networks is the
backpropagation algorithm. As the name suggests, the error computed from the output
layer is backpropagated through the network, and the weights are modified according to
their contribution to the error function. Essentially, backpropagation performs a local
gradient search, and hence its implementation does not guarantee reaching a global
minimum. A number of heuristics are available to partly address this problem, some of
which are presented below. Instead of distinguishing between the weights of the
different layers as in Equation (1), we refer to them generically as wij in the following.
After some mathematical simplification the weight change equation suggested by
back-propagation can be expressed as follows:

(4)

Here, η is the learning coefficient and θ is the momentum term. One heuristic that
is used to prevent the neural network from getting stuck at a local minimum is the
random presentation of the training data. Another heuristic that can speed up
convergence is the cumulative update of weights, i.e., weights are not updated after
the presentation of each input-output pair, but are accumulated until a certain number
of presentations are made, this number referred to as an “epoch”. In the absence of the
second term in (4), setting a low learning coefficient results in slow learning, whereas
a high learning coefficient can produce divergent behavior. The second term in (4)
reinforces general trends, whereas oscillatory behavior is canceled out, thus allowing
a low learning coefficient but faster learning. Last, it is suggested that starting the
training with a large learning coefficient and letting its value decay as training
progresses speeds up convergence.
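The role of the two terms in the weight update can be illustrated with a minimal sketch. It assumes the usual momentum form of the rule, in which the new weight change is the negative gradient scaled by η plus θ times the previous weight change. The quadratic error surface, the function names and the values of `eta` and `theta` are hypothetical choices made only for illustration; they are not taken from the chapter.

```python
def momentum_descent(grad, w0, eta=0.1, theta=0.8, steps=200):
    """Iterate dw(t) = -eta * dE/dw + theta * dw(t-1) and apply w <- w + dw(t)."""
    w = list(w0)
    dw_prev = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        # First term: gradient step; second term: momentum from the last change.
        dw = [-eta * gi + theta * di for gi, di in zip(g, dw_prev)]
        w = [wi + di for wi, di in zip(w, dw)]
        dw_prev = dw
    return w


def grad_E(w):
    """Gradient of the hypothetical surface E(w) = (w0 - 1)^2 + 10 * (w1 + 2)^2."""
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]


w_final = momentum_descent(grad_E, [0.0, 0.0])
print(w_final)  # approaches the minimum at (1, -2)
```

With `theta=0.0` the same learning coefficient sits right at the edge of oscillation along the steep direction of this surface, which is the behavior the momentum term is meant to damp.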

2.1 Levenberg-Marquardt Modifications for Neural Networks

The method of steepest descent, also known as gradient method, is one of the oldest
techniques for minimizing a given function defined on a multidimensional space. This
method forms the basis for many optimization techniques. In general, the descent
direction is obtained from derivatives of the objective function E; Newton-type
methods also use its second derivatives. The matrix of second derivatives is known as
the Hessian matrix H. In classical Newton's method this matrix is used to define an
adaptation rule for a parameter vector θ as follows:
$$\theta_{next} = \theta_{now} - H^{-1} g \tag{5}$$
where g is the gradient vector consisting of all the first-order derivatives of the function
E. In Newton's method H needs to be positive definite for convergence.
Furthermore, if the Hessian matrix is not positive definite, the Newton direction may
point toward a local maximum or a saddle point. The Hessian can be altered by adding
a positive definite matrix P to H so that H + P is positive definite. Levenberg and Marquardt
[15] introduced this notion in least-squares problems. Later, Goldfeld et al. [11]
applied this concept to Newton's method. When P = λI, Equation (5) becomes

$$\theta_{next} = \theta_{now} - (H + \lambda I)^{-1} g \tag{6}$$

where I is the identity matrix and λ is some nonnegative value. Depending on the
magnitude of λ, the method transits smoothly between the two extremes: Newton's
method (λ → 0) and the well-known steepest descent method (λ → ∞). The various
Levenberg-Marquardt algorithms differ in the selection of λ. Goldfeld et al.
computed the eigenvalues of H and set λ to a little larger than the magnitude of the most
negative eigenvalue.
Moreover, when λ increases, ||θnext - θnow|| decreases. In other words, λ plays the
same role as an adjustable step length. That is, with some appropriately large λ, the
step length will be the right one. Of course, the step size η can be further introduced
and can be determined in conjunction with line search methods:

$$\theta_{next} = \theta_{now} - \eta (H + \lambda I)^{-1} g \tag{7}$$

For the case of neural networks these ideas are used to update (or learn) the
weights of the network [8].
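As a rough illustration of how λ moves the update between the Newton and steepest descent extremes, the sketch below applies one damped step of the form θ ← θ − η(H + λI)⁻¹g to a two-parameter problem. The quadratic objective, the function names and the closed-form 2x2 solve are assumptions made for this example only; a neural network trainer would build g and H (or an approximation of H) from the network errors instead.

```python
def lm_step(theta, grad, hess, lam, eta=1.0):
    """One update theta - eta * (H + lam * I)^(-1) g for two parameters."""
    g0, g1 = grad(theta)
    (h00, h01), (h10, h11) = hess(theta)
    a, b, c, d = h00 + lam, h01, h10, h11 + lam
    det = a * d - b * c
    # Solve (H + lam * I) s = g with the closed-form inverse of a 2x2 matrix.
    s0 = (d * g0 - b * g1) / det
    s1 = (-c * g0 + a * g1) / det
    return [theta[0] - eta * s0, theta[1] - eta * s1]


def grad_E(theta):
    """Gradient of the hypothetical objective E = (t0 - 1)^2 + 10 * (t1 + 2)^2."""
    return [2.0 * (theta[0] - 1.0), 20.0 * (theta[1] + 2.0)]


def hess_E(theta):
    """Constant Hessian of the same objective."""
    return [[2.0, 0.0], [0.0, 20.0]]


newton = lm_step([0.0, 0.0], grad_E, hess_E, lam=0.0)     # pure Newton step
damped = lm_step([0.0, 0.0], grad_E, hess_E, lam=1000.0)  # close to -g / lam
print(newton, damped)
```

For this quadratic the undamped step lands on the minimum in one iteration, while the heavily damped step is a short move along the negative gradient, in line with the remark that λ acts like an adjustable step length.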

3 Fractal Dimension of a Geometrical Object


Recently, considerable progress has been made in understanding the complexity of an
object through the application of fractal concepts [14] and dynamic scaling theory [3].
For example, financial time series show scaled properties suggesting a fractal
structure [8]. The fractal dimension of a geometrical object can be defined as follows:

$$d = \lim_{r \to 0} \frac{\ln N(r)}{\ln(1/r)} \tag{8}$$

where N(r) is the number of boxes covering the object and r is the size of the box. An
approximation to the fractal dimension can be obtained by counting the number of
boxes covering the boundary of the object for different r sizes and then performing a
logarithmic regression to obtain d (box counting algorithm). In Figure 2, we illustrate
the box counting algorithm for a hypothetical curve C. Counting the number of boxes
for different sizes of r and performing a logarithmic linear regression, we can estimate
the box dimension of a geometrical object with the following equation:
$$\ln N(r) = \ln \beta - d \ln r \tag{9}$$
This algorithm is illustrated in Figure 3.

Fig. 2. Box counting algorithm for a curve C

Fig. 3. Logarithmic regression to find dimension

The fractal dimension can be used to characterize an arbitrary object. The reason
for this is that the fractal dimension measures the geometrical complexity of objects.
In this case, a time series can be classified by using the numeric value of the fractal
dimension (d is between 1 and 2 because we are on the plane xy). The reasoning
behind this classification scheme is that when the boundary is smooth the fractal
dimension of the object will be close to one. On the other hand, when the boundary is
rougher the fractal dimension will be close to a value of two.
We developed a computer program in MATLAB for calculating the fractal
dimension of a sound signal. The computer program uses as input the figure of the
signal and counts the number of boxes covering the object for different grid sizes.
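The box counting procedure can be sketched in a few lines. The version below is not the MATLAB program mentioned above; it is a minimal Python illustration that assumes the object is given as a list of (x, y) points in the unit square, counts occupied grid boxes for several box sizes r, and fits the slope of ln N(r) against ln(1/r) by least squares. The function names and the two test shapes are hypothetical.

```python
import math


def box_count(points, r):
    """Number of grid boxes of side r that contain at least one point."""
    return len({(math.floor(x / r), math.floor(y / r)) for x, y in points})


def box_dimension(points, sizes):
    """Least-squares slope of ln N(r) against ln(1/r)."""
    xs = [math.log(1.0 / r) for r in sizes]
    ys = [math.log(box_count(points, r)) for r in sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


sizes = [1.0 / 2 ** k for k in range(1, 6)]            # r = 1/2, ..., 1/32
line = [(i / 4096, i / 4096) for i in range(4096)]     # smooth curve
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]  # filled region

d_line = box_dimension(line, sizes)
d_square = box_dimension(square, sizes)
print(d_line, d_square)
```

The straight line gives a dimension close to one and the filled square a dimension close to two, which are the two limits of the classification range described above.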

4 Intelligent Control Using Soft Computing


First, we describe a new method for adaptive model-based control of robotic dynamic
systems using a neuro-fuzzy-fractal approach. Intelligent control of robotic dynamic
systems is a difficult problem because the dynamics of these systems is highly non-
linear [5]. We describe an intelligent system for controlling robot manipulators to
illustrate our neuro-fuzzy-fractal approach for adaptive control. We use a new fuzzy
inference system for reasoning with multiple differential equations for modelling based
on the relevant parameters for the problem [6]. In this case, the fractal dimension [14]
of a time series of measured values of the variables is used as a parameter for the fuzzy
system. We use neural networks for identification and control of robotic dynamic
systems [4, 21]. The neural networks are trained with the Levenberg-Marquardt
learning algorithm with real data to achieve the desired level of performance.
Combining a fuzzy rule base [32] for modelling with the neural networks for
identification and control, an intelligent system for adaptive model-based control of
robotic dynamic systems was developed. We have very good simulation results for
several types of robotic systems for different conditions. The new method for control
combines the advantages of fuzzy logic (use of expert knowledge) with the advantages
of neural networks (learning and adaptability), and the advantages of the fractal
dimension (pattern classification) to achieve the goal of robust adaptive control of
robotic dynamic systems.
The neuro-fuzzy-fractal approach described above can also be applied to the case
of controlling biochemical reactors [21]. In this case, we use mathematical models of
the reactors to achieve adaptive model-based control. We also use a fuzzy inference
system for differential equations to take into consideration several models of the
biochemical reactor. The neural networks are used for identification and control. The
fractal dimension of the bacteria used in the reactor is also an important parameter in
the fuzzy rules, taking into account the complexity of the biochemical process. We have
obtained very good results for several food production processes in which the biochemical
reactor is controlled to optimize the production.
We have also used our hybrid approach for the case of controlling chaotic and
unstable behavior in aircraft dynamic systems [22]. For this case, we use
mathematical models for the simulation of aircraft dynamics during flight. The goal of
constructing these models is to capture the dynamics of the aircraft, so as to have a
way of controlling this dynamics to avoid dangerous behavior of the system. Chaotic
behavior has been related to the flutter effect that occurs in real airplanes, and for this
reason has to be avoided during flight. The prediction of chaotic behavior can be done
using the mathematical models of the dynamical system. We use a fuzzy inference
system combining multiple differential equations for modelling complex aircraft
dynamic systems. On the other hand, we use neural networks trained with the
Levenberg-Marquardt algorithm for control and identification of the dynamic
systems. The proposed adaptive controller performs rather well considering the
complexity of the domain.
We also describe in this paper several hybrid approaches for controlling
electrochemical processes in manufacturing applications. The hybrid approaches
combine soft computing techniques to achieve the goal of controlling the
manufacturing process to follow a desired production plan. Electrochemical processes,
like the ones used in battery formation, are very complex and for this reason very
difficult to control. Also, mathematical models of electrochemical processes are
difficult to derive and they are not very accurate. We need adaptive control of the
electrochemical process to achieve on-line control of the production line. Of course,
adaptive control is easier to achieve if one uses a reference model of the process
[21, 22]. In this case, we use a neural network to model the electrochemical process
due to the difficulty in obtaining a good mathematical model for the problem. The
other part of the problem is how to control the non-linear electrochemical process in
the desired way to achieve the production with the required quality. We developed a
set of fuzzy rules using expert knowledge for controlling the manufacturing process.
The membership functions for the linguistic variables in the rules were tuned using a
specific genetic algorithm. The genetic algorithm was used for searching the parameter
space of the membership functions using real data from production lines. Our
particular neuro-fuzzy-genetic approach has been implemented as an intelligent system
to control the formation of batteries in a real plant with very good results.

5 Intelligent Control of Robotic Systems


Given the dynamic equations of motion of a robot manipulator, the purpose of robot
arm control is to maintain the dynamic response of the manipulator in accordance
with some pre-specified performance criterion [7]. Although the control problem can
be stated in such a simple manner, its solution is complicated by inertial forces,
coupling reaction forces, and gravity loading on the links. In general, the control
problem consists of (1) obtaining dynamic models of the robotic system, and (2) using
these models to determine control laws or strategies to achieve the desired system
response and performance [10].
Among various adaptive control methods, the model-based adaptive control is the
most widely used and it is also relatively easy to implement. The concept of model-
based adaptive control is based on selecting an appropriate reference model and
adaptation algorithm, which modifies the feedback gains to the actuators of the actual
system.
Many authors have proposed linear mathematical models to be used as reference
models in the general scheme described before. For example, a linear second-order
time-invariant differential equation can be used as the reference model for each
degree of freedom of the robot arm. Defining the vector y(t) to represent the reference
model response and the vector x(t) to represent the manipulator response, the joint i of
the reference model can be described by
$$a_i y_i''(t) + b_i y_i'(t) + y_i(t) = r_i(t) \tag{10}$$
If we assume that the manipulator is controlled by position and velocity feedback
gains and the coupling terms are negligible, then the manipulator equation for joint i
can be written as
$$\alpha_i(t) x_i''(t) + \beta_i(t) x_i'(t) + x_i(t) = r_i(t) \tag{11}$$
where the system parameters αi(t) and βi(t) are assumed to vary slowly with time.
The fact that this control approach is not dependent on a complex mathematical
model is one of its major advantages, but stability considerations of the closed-loop
adaptive system are critical. A stability analysis is difficult and has only been carried
out using linearized models. However, the adaptability of the controller can become
questionable if the interaction forces among the various joints are severe (non-linear).
This is the main reason why soft computing techniques [7] have been proposed to
control this type of dynamic systems.
Adaptive fuzzy control extends fuzzy control theory so that the fuzzy controller
can either be applied to a wider class of uncertain systems or
fine-tune the parameters of a system to a desired accuracy [9]. In this scheme, a fuzzy
controller is designed based on knowledge of a dynamic system. This fuzzy controller
is characterized by a set of parameters. These parameters are either the controller
constants or functions of a model’s constants.
A controller is designed based on an assumed mathematical model representing a
real system. It must be understood that the mathematical model does not completely
match the real system to be controlled. Rather, the mathematical model is seen as an
approximation of the real system. A controller designed based on this model is
assumed to work effectively with the real system if the error between the actual system
and its mathematical representation is relatively insignificant. However, there exists a
threshold constant that sets a boundary for the effectiveness of a controller. An error
above this threshold will render the controller ineffective toward the real system.
An adaptive controller is set up to take advantage of additional data collected at run
time for better effectiveness. At run time, data are collected periodically at the
beginning of each constant time interval, $t_n = t_{n-1} + \Delta t$, where Δt is a constant
measurement of time, and $[t_{n-1}, t_n)$ is the duration between data collections. Let $D_n$ be the
set of data collected at time $t = t_n$. It is assumed that at any particular time, $t = t_n$, a
history of data $\{D_0, D_1, \ldots, D_n\}$ is always available. The more data available, the more
accurate the approximation of the system will become.
At run time, the control input is fed into both the real system and the mathematical
model representing the system. The output of the real system and the output of the
mathematical model are collected, and an error representing the difference between
these two outputs is calculated. Let x(t) be the output of the real system, and y(t) the
output of the mathematical model. The error ε(t) is defined as:
$$\varepsilon(t) = x(t) - y(t) \tag{12}$$
Figure 4 depicts this tracking of the difference between the mathematical model
and the real dynamic system it represents.

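[Figure: block diagram. The controller, driven by the desired output x_desired, produces
the control input u(t), which is fed to both the real dynamic system (output x(t)) and the
mathematical model (output y(t)); the difference of the two outputs is the error ε(t).]

Fig. 4. Tracking the error function between outputs of a real system and mathematical model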

An adaptive controller will be adjusted based on the error function ε(t). This
calculated data will be fed into either the mathematical model or the controller for
adjustment. Since the error function ε(t) is available only at run time, an adjusting
mechanism must be designed to accept this error as it becomes available, i.e., it must
evolve with the accumulation of data in time. At any time, $t = t_n$, the set of calculated
data in the form of a time series $\{\varepsilon(t_0), \varepsilon(t_1), \ldots, \varepsilon(t_n)\}$ is available and must be used by
the adjusting mechanism to update appropriate parameters.
In normal practice, instead of doing re-calculation based on a lengthy set of data,
the adjusting algorithm is reformulated to be based on two entities: (i) sufficient
information, and (ii) newly collected data. The sufficient information is a numerical
variable representing the set of data $\{\varepsilon(t_0), \varepsilon(t_1), \ldots, \varepsilon(t_{n-1})\}$ collected from the initial
time $t_0$ to the previous collecting cycle starting at time $t = t_{n-1}$. The new datum $\varepsilon(t_n)$ is
collected in the current cycle starting at time $t = t_n$.
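The idea of replacing the full error history by a single summary that is updated with each new datum can be sketched as follows. The text does not say which quantity serves as the sufficient information, so the running mean of the squared error used here is an assumption chosen for illustration, and the function name and sample errors are hypothetical.

```python
def update_summary(summary, eps_new):
    """Fold the new datum eps(t_n) into (count, running mean of squared error)."""
    count, mean_sq = summary
    count += 1
    # Incremental mean: no need to store or revisit the earlier errors.
    mean_sq += (eps_new ** 2 - mean_sq) / count
    return count, mean_sq


errors = [0.5, -0.25, 0.125, -0.0625]  # hypothetical error samples
summary = (0, 0.0)
for eps in errors:
    summary = update_summary(summary, eps)

batch_mean_sq = sum(e ** 2 for e in errors) / len(errors)
print(summary, batch_mean_sq)
```

The recursive value agrees with the batch computation over the whole history, which is what allows the adjusting mechanism to work from the summary and the newest datum alone.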
An adaptive controller will operate as follows. The controller is initially designed
as a function of a parameter set and state variables of a mathematical model. The
parameters can be updated any time during operation and the controller will adjust
itself to the newly updated parameters. The time frame is usually divided into a series
of equally spaced intervals $\{[t_n, t_{n+1}) \mid n = 0, 1, 2, \ldots;\ t_{n+1} = t_n + \Delta t\}$. At the beginning of
each time interval $[t_n, t_{n+1})$ observable data are collected and the error function $\varepsilon(t_n)$ is
calculated. This error is used to calculate the adjustment in the parameters of the
controller. New control input $u(t_n)$ for the time interval $[t_n, t_{n+1})$ is then calculated
based on the newly calculated parameters and fed into both the real dynamic system
under control and the mathematical model upon which the controller is designed. This
completes one control cycle. The next control cycle will consist of the same steps
repeated for the next time interval $[t_{n+1}, t_{n+2})$, and so on.
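The control cycle described above can be sketched for a deliberately simple case. The example below is not the neuro-fuzzy controller of this chapter: it assumes a hypothetical scalar discrete-time plant x(n+1) = a·x(n) + u(n) with unknown a, a model with the same structure and an estimate `a_hat`, and a normalized gradient rule for adjusting `a_hat`. All names, gains and numerical values are illustrative assumptions.

```python
def run_adaptive_loop(a_true=0.9, a_hat=0.5, x_desired=1.0, gain=0.5, cycles=200):
    """Repeat: measure, compute error, adjust parameter, compute and apply control."""
    x = 0.0       # output of the real system at t_n
    y = 0.0       # output of the mathematical model at t_n
    x_prev = 0.0  # measurement from the previous cycle, used by the update
    for _ in range(cycles):
        eps = x - y                                  # error between system and model
        a_hat += gain * eps * x_prev / (1.0 + x_prev * x_prev)  # parameter adjustment
        u = x_desired - a_hat * x                    # control from updated parameter
        x_prev = x
        y = a_hat * x + u                            # model response to u
        x = a_true * x + u                           # real system response to u
    return a_hat, x


a_hat_final, x_final = run_adaptive_loop()
print(a_hat_final, x_final)
```

In this toy setting the estimate converges to the true plant parameter and the output settles at the desired value; with a mismatch that is never corrected (`gain=0.0`) the output would settle away from `x_desired`.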

5.1 Mathematical Modelling of Robotic Dynamic Systems

We will consider, in this section, the case of modelling robotic manipulators [5]. The
general model for this kind of robotic system is the following:
$$M(q) q'' + V(q, q') q' + G(q) + F_d q' = \tau \tag{13}$$
where $q \in \mathbb{R}^n$ denotes the link position, $M(q) \in \mathbb{R}^{n \times n}$ is the inertia matrix,
$V(q, q') \in \mathbb{R}^{n \times n}$ is the centripetal-Coriolis matrix, $G(q) \in \mathbb{R}^n$ represents the gravity
vector, $F_d \in \mathbb{R}^{n \times n}$ is a diagonal matrix representing the friction term, and τ is the input
torque applied to the links. We show in Figure 5 the case of the two-link robot arm. In this
figure, we show the variables involved.

Fig. 5. Two-link robot arm indicating the variables involved

For the simplest case of a one-link robot arm, we have the scalar equation:
$$M_q q'' + F_d q' + G(q) = \tau \tag{14}$$
If G(q) is a linear function (G = Nq), then we have the "linear oscillator" model:
$$q'' + a q' + b q = c$$
where $a = F_d/M_q$, $b = N/M_q$ and $c = \tau/M_q$. This is the simplest mathematical model
for a one-link robot arm. More realistic models can be obtained for more complicated
functions G(q). For example, if $G(q) = N q^2$, then we obtain the "quadratic oscillator"
model:
$$q'' + a q' + b q^2 = c \tag{15}$$
where a, b and c are defined as above.
A more interesting model is obtained if we define $G(q) = N \sin q$. In this case, the
mathematical model is
$$q'' + a q' + b \sin q = c \tag{16}$$
where a, b and c are the same as above. This is the so-called "sinusoidally forced
oscillator". More complicated models for a one-link robot arm can be defined
similarly.
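The one-link models above differ only in the term that comes from G(q), so a single integrator can serve all of them. The sketch below integrates q'' + a q' + g(q) = c with a classical fourth-order Runge-Kutta scheme, where `g` stands for b·q, b·q² or b·sin q. The integrator, the step size and the parameter values a = 1, b = 2, c = 1 are assumptions chosen for illustration and do not come from the chapter.

```python
import math


def simulate_one_link(g, a, c, q0, dq0, t_end, dt=0.01):
    """Integrate q'' + a q' + g(q) = c with classical RK4; return final (q, q')."""
    def f(q, v):
        return v, c - a * v - g(q)

    q, v = q0, dq0
    for _ in range(int(round(t_end / dt))):
        k1q, k1v = f(q, v)
        k2q, k2v = f(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
        k3q, k3v = f(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
        k4q, k4v = f(q + dt * k3q, v + dt * k3v)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return q, v


a, b, c = 1.0, 2.0, 1.0  # hypothetical parameter values
q_lin, _ = simulate_one_link(lambda q: b * q, a, c, 0.1, 0.0, 40.0)
q_sin, _ = simulate_one_link(lambda q: b * math.sin(q), a, c, 0.1, 0.0, 40.0)
print(q_lin, q_sin)
```

With constant forcing and these values, the linear model settles at q = c/b and the sine model at q = arcsin(c/b), which gives a quick sanity check on the integrator before moving to time-varying inputs.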
For the case of a two-link robot arm, we can have two simultaneous differential
equations as follows:

$$\begin{aligned} q_1'' + a_1 q_1' + b_1 q_2^2 &= c_1 \\ q_2'' + a_2 q_2' + b_2 q_1^2 &= c_2 \end{aligned} \tag{17}$$

which is called the "coupled quadratic oscillators" model. In Equation (17) $a_1$, $b_1$, $a_2$,
$b_2$, $c_1$ and $c_2$ are defined similarly as in the previous models. We can also have the
"coupled cubic oscillators" model:

$$\begin{aligned} q_1'' + a_1 q_1' + b_1 q_2^3 &= c_1 \\ q_2'' + a_2 q_2' + b_2 q_1^3 &= c_2 \end{aligned} \tag{18}$$

Fig. 6. (a) Function approximation after 9 epochs, (b) SSE of the neural network

5.2 Simulation Results

To give an idea of the performance of our neuro-fuzzy approach for adaptive
model-based control of robotic systems, we show below simulation results obtained
for a single-link robot arm. The desired trajectory for the link was selected to be
$$q_d = t \sin(2.0t) \tag{19}$$
and the simulation was carried out with the initial values q(0) = 0.1 and q'(0) = 0. We used
three-layer neural networks (with 15 hidden neurons) trained with the Levenberg-Marquardt
algorithm, with hyperbolic tangent sigmoidal functions as the activation functions for
the neurons. We show in Figure 6(a) the function approximation achieved with the
neural network for control after 9 epochs of training with a variable learning rate. The
identification achieved by the neural network can be considered very good because
the error has been decreased to the order of $10^{-4}$. We show in Figure 6(b) the curve
relating the sum of squared errors (SSE) to the number of epochs of neural
network training. We can see in this figure how the SSE diminishes rapidly from
the order of $10^{2}$ to the order of $10^{-4}$. Still, we can obtain a
better approximation by using more hidden neurons or more layers. In any case, we
can see clearly how the neural network learns to control the robotic system, because
it is able to follow the arbitrary desired trajectory.

Fig. 7. (a) Non-linear surface for modelling, (b) fuzzy reasoning procedure

Fig. 8. (a) Simulation of position q1, (b) Simulation of position q2

We show in Figure 7(a) the non-linear surface for the fuzzy rule base for modelling.
The fuzzy system was implemented in the fuzzy logic toolbox of MATLAB [25]. We
show in Figure 7(b) the reasoning procedure for specific values of the fractal
dimension and number of links of the robotic system.
In Figure 8 we show simulation results for a two-link robot arm with a model given
by two coupled second order differential equations. Figure 8(a) shows the behavior of
position q1 and Figure 8(b) shows it for position q2 of the robot arm.
We can see from these figures the complex dynamic behavior of this robotic system
[7]. Of course, the complexity is even greater for higher dimensional robotic systems.
We have very good simulation results for several types of robotic manipulators for
different conditions. The new method for control combines the advantages of neural
networks (learning and adaptability) with the advantages of fuzzy logic (use of expert
knowledge) to achieve the goal of robust adaptive control of robotic dynamic systems.
We consider that our method for adaptive control can be applied to general non-linear
dynamical systems [8, 27] because the hybrid approach, combining neural networks
and fuzzy logic, does not depend on the particular characteristics of the robotic
dynamic systems.
The new method for adaptive control can also be applied for autonomous robots
[8], but in this case it may be necessary to include genetic algorithms for trajectory
planning.

6 Control of Biochemical Reactors


Process control of biochemical plants is also an attractive application because of the
potential benefits to both adaptive network research and to actual biochemical process
control. In spite of the extensive work on self-tuning controllers and model-reference
control, there are many problems in chemical processing industries for which current
techniques are inadequate. Many of the limitations of current adaptive controllers arise
in trying to control poorly modeled non-linear systems [1]. For most of these processes
extensive data are available from past runs, but it is difficult to formulate precise
models. This is precisely where adaptive networks are expected to be useful [31].
Bioreactors are difficult to model because of the complexity of the living organisms
in them, and they are difficult to control because one often cannot measure on-line the
concentration of the chemicals being metabolized or produced. Bioreactors can also
have markedly different operating regimes, depending on whether the bacteria are rapidly
growing or producing product. Model-based control of these reactors offers a dual
problem: determining a realistic process model and determining effective control laws
in the face of inaccurate process models and highly nonlinear processes [19, 20, 26].
Biochemical systems can be relatively simple in that they have few variables, but
still very difficult to control due to strong nonlinearities which are difficult to model
accurately. A prime example is the bioreactor. In its simplest form, a bioreactor is
simply a tank containing water and cells (e.g., bacteria) which consume nutrients
("substrate") and produce products (both desired and undesired) and more cells.
Bioreactors can be quite complex: cells are self-regulatory mechanisms, and can
adjust their growth rates and production of different products radically depending on
temperature and concentrations of waste products [16]. Systems with heating or
cooling, multiple reactors or unsteady operation greatly complicate the analysis.
Mathematical models for these systems can be expressed as differential (or
difference) equations [3, 17, 18].
Now we propose mathematical models that integrate our method for geometrical
modelling of bacteria growth using the fractal dimension [14] with the method for
modelling the dynamics of bacteria population using differential equations [27]. The
resulting mathematical models describe bacteria growth in space and in time, because
the use of the fractal dimension enables us to classify bacteria by the geometry of the
colonies and the differential equations help us to understand the evolution in time of
bacteria population.
We will consider first the case of using one type of bacteria for food production. The
mathematical model in this case can be of the following form:

$$\begin{aligned} \frac{dN}{dt} &= r\left(1 - \frac{N^{-D}}{K}\right) N^{-D} - \beta N^{-D} \\ \frac{dP}{dt} &= \beta N^{-D} \end{aligned} \tag{20}$$

where D is the fractal dimension, N is the bacteria population, P is the quantity of
chemical product, r is the rate of bacteria growth, K is the environment capacity, and
β is a biochemical conversion factor.
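We will consider now the case of two bacteria used for food production:

$$\begin{aligned} \frac{dN_1}{dt} &= \left[ r_1 - \frac{r_1}{K_1} N_1^{-D_1} - \frac{r_1}{K_1} \delta_{12} N_2^{-D_2} \right] N_1^{-D_1} - \beta N_1^{-D_1} \\ \frac{dN_2}{dt} &= \left[ r_2 - \frac{r_2}{K_2} N_2^{-D_2} - \frac{r_2}{K_2} \delta_{21} N_1^{-D_1} \right] N_2^{-D_2} - \eta N_2^{-D_2} \\ \frac{dP}{dt} &= \beta N_1^{-D_1} + \eta N_2^{-D_2} \end{aligned} \tag{21}$$

where D1 is the fractal dimension of bacteria 1, D2 is the fractal dimension of bacteria
2 and the rest of variables are as described in the last equation.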
As we can see from equations (20) and (21) the idea of our method of modelling is
to use the fractal dimension D as a parameter in the differential equations, so as to
have a way of classifying for which type of bacteria the equation corresponds. In this
way, equation (20), for example, can represent the model for food production using
one bacteria (the one defined by the fractal dimension D).
We have implemented a model-based neural controller using the architecture of
Figure 9. Two multilayer networks are used, one for the model of the plant and the
second for the controller. The Neural Networks were implemented in the MATLAB
programming language to achieve a high level of efficiency on the numerical
calculations needed for these modules. The Fractal module was also implemented in
the MATLAB programming language for the same reason. In this way we combine the
three methodologies to obtain the best of the three worlds (Neural Networks, Fuzzy
Logic and Fractal Theory) using for each the appropriate implementation language.

Fig. 9. Indirect Adaptive Neuro-Fuzzy-Fractal Control

Fig. 10. Simulation of the model for two bacteria used in food production

We show in Figure 10 simulation results of bacteria population used for food
production. We can see from this figure the complicated dynamics for the case of two
bacteria competing in the same environment, and at the same time producing the
chemical product necessary for food production.
We also show in Figure 11 simulation results for the case of two good bacteria
used for food production and one bad bacteria that is attacking the other ones. We can
see from this figure how one of the good bacteria is eliminated (the population goes
down to zero), which of course results in a decrease of the resulting quantity of the
food product. This case has to be avoided because of the damaging
effect of the bad bacteria. Intelligent control helps in avoiding these types of scenarios
for food production.

Fig. 11. Simulation of the model for two good bacteria and one bad one

We have used a general method for adaptive model-based control of non-linear
dynamic plants using Neural Networks, Fuzzy Logic and Fractal Theory. We
illustrated our method for control with the case of biochemical reactors. In this case,
the models represent the process of biochemical transformation between the microbial
life and their generation of the chemical product. We also describe in this paper an
adaptive controller based on the use of neural networks and mathematical models for
the plant. The proposed adaptive controller performs rather well considering the
complexity of the domain being considered in this research work. We can say that
combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages
that each of these methodologies has, can give good results for this kind of
application. Also, we believe that our neuro-fuzzy-fractal approach is a good
alternative for solving similar problems.

7 Intelligent Control of Aircraft Systems


The mathematical models of aircraft systems can be represented as coupled non-linear
differential equations [22]. In this case, we can develop a fuzzy rule base for modelling
that enables the use of the appropriate mathematical model according to the changing
conditions of the aircraft and its environment. For example, we can use the following
model of an airplane when wind velocity is relatively small:
$$p' = I_1(-q + l), \quad q' = I_2(p + m) \tag{22}$$
where $I_1$ and $I_2$ are the inertia moments of the airplane with respect to the x and y axes,
respectively, l and m are physical constants specific to the airplane, and p, q are the
positions with respect to the x and y axes, respectively. However, a more realistic model
of an airplane in three-dimensional space is as follows:
$$p' = I_1(-qr + l), \quad q' = I_2(pr + m), \quad r' = I_3(-pq + n) \tag{23}$$
where now $I_3$ is the inertia moment of the airplane with respect to the z axis, n is a
physical constant specific to the airplane, and r is the position along the z axis.
Considering now wind disturbances in the model, we have the following equation:
$$p' = I_1(-qr + l) - u_g, \quad q' = I_2(pr + m), \quad r' = I_3(-pq + n) \tag{24}$$
where $u_g$ is the wind velocity. The magnitude of wind velocity is dependent on the
altitude of the airplane in the following form:
$$u_g = u_{wind510} \left[ 1 + \frac{\ln(r/510)}{\ln(51)} \right]$$
where $u_{wind510}$ is the wind speed at 510 ft altitude (typical value = 20 ft/sec).
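The altitude dependence of the wind velocity is easy to evaluate numerically. The sketch below reads the formula as u_g = u_wind510 · (1 + ln(r/510) / ln(51)), with r in feet; that reading of the printed expression, the function name and the default of 20 ft/sec are assumptions for this example.

```python
import math


def wind_velocity(r, u_wind510=20.0):
    """Wind velocity u_g (ft/sec) at altitude r (ft), for r > 0."""
    return u_wind510 * (1.0 + math.log(r / 510.0) / math.log(51.0))


print(wind_velocity(510.0))    # reference altitude
print(wind_velocity(30000.0))  # a cruising altitude
```

Under this reading the expression returns the reference speed at 510 ft, vanishes at 10 ft, and grows logarithmically with altitude.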
If we use the models of Eq. (22)-(24) for describing aircraft dynamics, we can
formulate a set of rules that relate the models to the conditions of the aircraft and its
environment. Let us assume that M1 is given by Eq. (22), M2 is given by Eq. (23), and
M3 is given by Eq. (24). Now using the wind velocity ug and inertia moment I1 as
parameters, we can establish the fuzzy rule base for modelling [29, 30] as in Table 1.
In Table 1, we are assuming that the wind velocity ug can have only two possible
fuzzy values (small and large). This is sufficient to know if we have to use the
mathematical model that takes into account the effect of wind (M3) for ug large or if
we don’t need to use it and simply the model M2 is sufficient (for ug small). Also, the
inertia moment (I1) helps in deciding between models M1 and M2 (or M3).
To give an idea of the performance of our neuro-fuzzy-fractal approach for
adaptive control, we show below simulation results for aircraft dynamic systems.
First, we show in Figure 12(a) the fuzzy rule base for a prototype intelligent system
developed in the fuzzy logic toolbox of the MATLAB programming language. We
show in Figure 12(b) the non-linear surface for the problem of aircraft dynamics using
as input variables: fractal dimension and wind velocity.

Table 1. Fuzzy rule base for modelling aircraft systems

IF                                   THEN
Wind     Inertia   Fractal Dim       Model
Small    Small     Low               M1
Small    Small     Medium            M2
Small    Large     Low               M2
Small    Large     Medium            M2
Large    Small     Medium            M3
Large    Large     Medium            M3
Large    Large     High              M3

Fig. 12. (a) Fuzzy rule base (b) Non-linear surface for aircraft dynamics
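The rule base of Table 1 can be written down as a small lookup structure. The sketch below is a crisp simplification: it maps exact linguistic labels to a model name and does not perform fuzzy inference with membership functions, which is what the fuzzy logic toolbox implementation would do. The dictionary and function names are hypothetical.

```python
# Rows of Table 1 as (wind, inertia, fractal dimension) -> model.
RULES = {
    ("small", "small", "low"): "M1",
    ("small", "small", "medium"): "M2",
    ("small", "large", "low"): "M2",
    ("small", "large", "medium"): "M2",
    ("large", "small", "medium"): "M3",
    ("large", "large", "medium"): "M3",
    ("large", "large", "high"): "M3",
}


def select_model(wind, inertia, fractal_dim):
    """Return the model for a crisp combination of labels, or None if no rule matches."""
    return RULES.get((wind.lower(), inertia.lower(), fractal_dim.lower()))


print(select_model("Small", "Small", "Low"))     # M1
print(select_model("Large", "Small", "Medium"))  # M3
```

Combinations that do not appear in Table 1 return `None` here; a fuzzy system would instead blend the rules that fire partially.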

Fig. 13. (a) Simulation of position q (b) Simulation of position p

We show simulation results for an aircraft system obtained using our new method
for modelling dynamical systems. In Figure 13(a) and Figure 13(b) we show results
for an airplane with inertia moments: I1 = 1, I2 = 0.4, I3 = 0.05 and the constants are:
l = m = n = 1. The initial conditions are: p(0) = 0, q(0) = 0, r(0) = 0.
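A simulation of this kind can be sketched directly from the model equations. The code below implements the right-hand side of the three-dimensional model, with an optional wind term, and advances it with a classical fourth-order Runge-Kutta step using the inertia moments, constants and initial conditions quoted above. The integrator, the step size and the short horizon are assumptions for illustration; the chapter does not state how its simulations were integrated.

```python
def aircraft_rhs(state, inertia=(1.0, 0.4, 0.05), consts=(1.0, 1.0, 1.0), u_g=0.0):
    """Derivatives (p', q', r'); u_g = 0 gives the model without wind."""
    p, q, r = state
    i1, i2, i3 = inertia
    c_l, c_m, c_n = consts
    return (i1 * (-q * r + c_l) - u_g, i2 * (p * r + c_m), i3 * (-p * q + c_n))


def rk4_step(state, dt, u_g=0.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = aircraft_rhs(state, u_g=u_g)
    k2 = aircraft_rhs(shifted(state, k1, 0.5 * dt), u_g=u_g)
    k3 = aircraft_rhs(shifted(state, k2, 0.5 * dt), u_g=u_g)
    k4 = aircraft_rhs(shifted(state, k3, dt), u_g=u_g)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


state = (0.0, 0.0, 0.0)  # p(0) = q(0) = r(0) = 0
for _ in range(100):     # 100 steps of dt = 0.01, i.e. up to t = 1
    state = rk4_step(state, 0.01)
print(state)
```

Longer horizons and a non-zero `u_g` can be explored the same way, keeping in mind that the quadratic coupling terms make the trajectories sensitive to the step size.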
To give an idea of the performance of our neuro-fuzzy approach for adaptive
model-based control of aircraft dynamics, we show below (Figure 14) simulation
results obtained for the case of controlling the altitude of an airplane for a flight of 6
hours. We assume that the airplane takes about one hour to achieve the cruising
altitude of 30 000 ft, then cruises along for about three hours at this altitude (with minor
fluctuations), and finally descends for about two hours to its final landing point. We
will consider the desired trajectory as follows:

$$
r_d(t) =
\begin{cases}
30t + \sin 2t, & 0 \le t \le 1 \\
30 + 2\sin 10t, & 1 < t \le 4 \\
90 - 15t, & 4 < t \le 6
\end{cases}
$$


Of course, a complete desired trajectory for the airplane would have to include the
positions of the airplane in the x and y directions (variables p, q in the models).
However, for illustration purposes it is sufficient here to show the control
of the altitude r for the airplane.
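The piecewise desired trajectory above can be evaluated with a few lines of code. This
is a sketch under stated assumptions: t is taken in hours, the result in thousands of
feet (so that the cruise value 30 corresponds to 30 000 ft), the sine arguments are read
as 2t and 10t in radians, and the function name `desired_altitude` is hypothetical.

```python
import math


def desired_altitude(t):
    """Desired altitude r_d(t) for t in hours on [0, 6] (assumed unit: 1000 ft)."""
    if not 0.0 <= t <= 6.0:
        raise ValueError("t must lie in [0, 6] hours")
    if t <= 1.0:
        return 30.0 * t + math.sin(2.0 * t)      # climb
    if t <= 4.0:
        return 30.0 + 2.0 * math.sin(10.0 * t)   # cruise with minor fluctuations
    return 90.0 - 15.0 * t                        # descent
```

The trajectory starts at `desired_altitude(0.0) == 0.0` and ends at
`desired_altitude(6.0) == 0.0`, matching take-off and landing.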
We used three-layer neural networks (with 10 hidden neurons) trained with the
Levenberg-Marquardt algorithm, with hyperbolic tangent sigmoid functions as the
activation functions for the neurons. We show in Figure 14 the function approximation
achieved by the neural network for control after 800 epochs of training with a variable
learning rate. The identification achieved by the neural network (after 800 epochs) can be
considered very good because the error has been decreased to the order of $10^{-1}$. Still,
we can obtain a better approximation by using more hidden neurons or more layers. In
any case, we can see clearly (from Figure 14) how the neural network learns to
control the aircraft, because it is able to follow the arbitrary desired trajectory.
We have to mention here that these simulation experiments for the case of a
specific flight for a given airplane show very good results. We have also tried our
approach for control with other types of flights and airplanes with good simulation
results. Still, there is a lot of research to be done in this area because of the complex
dynamics of aircraft systems.

Fig. 14. Function approximation of the neural network for control of an airplane
We have developed a general method for adaptive model-based control of non-linear
dynamic systems using Neural Networks, Fuzzy Logic and Fractal Theory. We
illustrated our method for control with the case of controlling aircraft dynamics. In this
case, the models represent the aircraft dynamics during flight. We also described in this
paper an adaptive controller based on the use of neural networks and mathematical
models for the system. The proposed adaptive controller performs rather well considering
the complexity of the domain addressed in this research work. We have shown
that our method can be used to control chaotic and unstable behavior in aircraft systems.
Chaotic behavior has been associated with the “flutter” effect in real airplanes, and for
this reason it is very important to avoid this kind of behavior. We can say that combining
Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of
these methodologies has, can give good results for this kind of application. Also, we
believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar
problems.

8 Intelligent Control of the Battery Charging Process


A battery converts chemical energy into electrical energy. The chemical energy
contained in the electrodes and electrolyte is converted into electrical power by means
of electrochemical reactions. When the battery is connected to a source of direct
current, electrons flow through the external circuit and ions flow inside the battery,
so that charge accumulates in the battery. The quantity of electricity that is required
to charge the battery is determined by a law of nature postulated by Michael Faraday,
which is known as Faraday's law [2]. Faraday found that the quantity of electricity
required to produce an electrochemical change in a metal is related to the
relative weight of the metal. In the specific case of lead this is considered to be 118
ampere-hours per pound of positive active material per cell. In practice, more energy is
required to compensate for the losses due to heat and to the generation of gas.
We show in Table 2 experimental data for a specific type of battery with different
plate sizes and different numbers of plates per cell. In this table, we show
the charging time and the average current needed for the respective charge. In Table 2 we
can observe that to form a battery we need to apply a particular current intensity
during a certain amount of time to achieve the required charge for the battery.
The goal of battery manufacturers is to reduce the time required to charge the
battery. However, the current intensity cannot be increased arbitrarily because of the
physical characteristics of the specific battery [12]. If the current is increased too
much, the temperature in the battery will rise above a safe value, eventually
causing the destruction of the battery.
Table 2. Experimental data for different types of batteries

Type of Plate    Positive 0.060”, Negative 0.050”      Positive 0.070”, Negative 0.060”
Plates per cell  Total A.H.  72 hr Amp.  96 hr Amp.    Total A.H.  72 hr Amp.  96 hr Amp.
7                155         2.2         1.6           165         2.4         1.8
9                180         2.8         2.0           200         2.8         2.2
11               230         3.2         2.4           245         3.4         2.4
13               260         3.6         2.6           295         4.0         3.0
15               300         4.2         3.0           345         4.8         3.6
17               400         5.6         4.2           415         5.8         4.4
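As a rough plausibility check on Table 2, one may assume that the average current is
close to the total ampere-hours divided by the formation time. This relation is an
assumption made here for illustration and is not stated in the chapter; the listed
currents are experimental values, so they need not equal this quotient exactly. The
names `ROWS` and `average_current` are hypothetical.

```python
# Rows transcribed from Table 2 for the 0.060"/0.050" plates:
# (plates per cell, total ampere-hours, listed 72 h current, listed 96 h current)
ROWS = [
    (7, 155, 2.2, 1.6),
    (9, 180, 2.8, 2.0),
    (11, 230, 3.2, 2.4),
    (13, 260, 3.6, 2.6),
    (15, 300, 4.2, 3.0),
    (17, 400, 5.6, 4.2),
]


def average_current(ampere_hours, hours):
    """Constant current (A) that delivers the given charge in the given time."""
    return ampere_hours / hours


def compare(rows):
    """Pair each listed current with the quotient ampere-hours / hours."""
    return [
        (plates, average_current(ah, 72), i72, average_current(ah, 96), i96)
        for plates, ah, i72, i96 in rows
    ]
```

Calling `compare(ROWS)` places the computed and the listed currents side by side, so
that rows where the two differ noticeably can be inspected.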

8.1 Fuzzy Method for Control

In this approach we use a statistical model to represent the electrochemical process
and a fuzzy rule base for process control. The temperature in the battery depends on
the electrical current that circulates in it during its formation; this means that, to
maintain the temperature below a specific threshold, it is important to control the
intensity of the current. Therefore, for this case the independent variable is the average
current I, and the dependent variable is the average temperature T. A simple statistical
linear model can be stated as follows:

$$
T = \beta_0 + \beta_1 I \tag{25}
$$

where $\beta_0$ and $\beta_1$ are parameters to be estimated (by least squares) using real data for
this problem. In Table 3, we show experimental values for a battery of 6 volts which,
according to the manufacturer’s specifications, should be charged with 200 ampere-hours.

Table 3. Values of temperature and current for a battery of 200 ampere-hours

Hrs     T     I       Hrs     T     I
21:00   111   5.22    23:00   93    3.53
23:00   100   5.21    1:00    91    3.40
1:00    105   5.52    3:00    92    3.32
3:00    100   5.66    5:00    96    3.16
5:00    100   5.60    7:00    98    3.10
7:00    97    5.72    9:00    98    3.14
9:00    92    4.82    11:00   102   3.12
11:00   95    4.32    13:00   99    3.03
13:00   102   4.10    15:00   98    3.05
15:00   103   4.05    17:00   97    3.06
17:00   100   3.40    19:00   95    2.96
19:00   97    3.77    21:00   94    2.60
21:00   94    3.62    23:00   96    2.76

Fig. 15. Fuzzy Control of the process (block diagram: T and dT/dt enter the fuzzy
controller, whose output I drives the electro-chemical process, which returns T)

Fig. 16. Fuzzy rule base for controlling the Process

Using the data from Table 3 we can obtain (by the least squares method) the values
of $\beta_0$ and $\beta_1$ [28]. The equation is as follows:

$$
T = 88.03 + 2.5304\, I \tag{26}
$$

with a correlation value of only 0.57, which is due to the complexity of the data.
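The least-squares estimation of the linear model (25) can be sketched as follows, using
the (T, I) pairs as they are printed in Table 3. The function name `fit_line` is
hypothetical. With the values as transcribed here the result need not coincide with
Eq. (26), since the printed table may be rounded or may differ from the data set used in [28].

```python
# (T, I) pairs transcribed from Table 3, left columns first, then right columns.
DATA = [
    (111, 5.22), (100, 5.21), (105, 5.52), (100, 5.66), (100, 5.60),
    (97, 5.72), (92, 4.82), (95, 4.32), (102, 4.10), (103, 4.05),
    (100, 3.40), (97, 3.77), (94, 3.62), (93, 3.53), (91, 3.40),
    (92, 3.32), (96, 3.16), (98, 3.10), (98, 3.14), (102, 3.12),
    (99, 3.03), (98, 3.05), (97, 3.06), (95, 2.96), (94, 2.60),
    (96, 2.76),
]


def fit_line(pairs):
    """Ordinary least squares for T = b0 + b1 * I; returns (b0, b1, r)."""
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_i = sum(i for _, i in pairs) / n
    s_ii = sum((i - mean_i) ** 2 for _, i in pairs)
    s_tt = sum((t - mean_t) ** 2 for t, _ in pairs)
    s_it = sum((i - mean_i) * (t - mean_t) for t, i in pairs)
    b1 = s_it / s_ii                      # slope
    b0 = mean_t - b1 * mean_i             # intercept
    r = s_it / (s_ii * s_tt) ** 0.5       # correlation coefficient
    return b0, b1, r
```

Calling `fit_line(DATA)` returns the intercept, the slope and the correlation
coefficient. By construction the fitted line passes through the point of mean current
and mean temperature.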
For the fuzzy controller we used as input variables the temperature T and the
change of temperature dT/dt, and as output variable the current intensity that should
be applied to the battery. In Figure 15 we show the architecture of our control system.
The control method was implemented in the MATLAB language. For each of the
linguistic variables it was considered convenient to use five terms. In Figure 16 we
show the fuzzy rule base implemented in the Fuzzy Logic Toolbox of MATLAB. We
have 25 rules because we are using 5 linguistic terms for each of the two input
variables (5 × 5 = 25). The membership functions were tuned manually until they gave
the best values for the problem.
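To make the structure of such a controller concrete, the sketch below builds a rule base
with 5 terms per input (25 rules) and evaluates it by a weighted average. It is not the
MATLAB implementation of the chapter: the universes `T_RANGE` and `DT_RANGE`, the output
levels `CURRENTS`, the rule heuristic and the simplified zero-order Sugeno inference are
all assumptions chosen for illustration.

```python
# Hypothetical universes and output levels; none of these numbers come from the chapter.
T_RANGE = (90.0, 110.0)                  # temperature universe
DT_RANGE = (-5.0, 5.0)                   # temperature-change universe
CURRENTS = [2.0, 3.0, 4.0, 5.0, 6.0]     # singleton outputs (A), lowest to highest
N_TERMS = 5


def memberships(x, lo, hi, n=N_TERMS):
    """Degrees of membership of x in n evenly spaced triangular terms on [lo, hi]."""
    x = min(max(x, lo), hi)              # clamp to the universe
    step = (hi - lo) / (n - 1)
    return [max(0.0, 1.0 - abs(x - (lo + k * step)) / step) for k in range(n)]


# 5 x 5 = 25 rules: term indices (i for T, j for dT/dt) -> index into CURRENTS.
# Assumed heuristic: hotter or faster-heating batteries receive less current.
RULES = {
    (i, j): min(4, max(0, 4 - i - (j - 2)))
    for i in range(N_TERMS)
    for j in range(N_TERMS)
}


def control_current(temp, dtemp):
    """Weighted average of rule outputs (zero-order Sugeno inference, product AND)."""
    mu_t = memberships(temp, *T_RANGE)
    mu_d = memberships(dtemp, *DT_RANGE)
    num = den = 0.0
    for (i, j), k in RULES.items():
        w = mu_t[i] * mu_d[j]            # firing strength of rule (i, j)
        num += w * CURRENTS[k]
        den += w
    return num / den
```

With these assumed settings a cool battery receives more current than a hot one, for
example `control_current(92, 0)` is larger than `control_current(108, 0)`.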

8.2 Neuro-Fuzzy Method for Control

Since it is difficult to tune a particular inference system to model a complex dynamical
system [1], it is convenient to use adaptive fuzzy inference systems. Adaptive
neuro-fuzzy inference systems (ANFIS) can be used to adapt the membership functions and
consequents of the rule base according to historical data of the problem [13]. In this
case, we can use the data from Table 2 and apply the ANFIS methodology to find the
best fuzzy system for our problem. We used the Fuzzy Logic Toolbox of MATLAB to
apply the ANFIS methodology to our problem with 5 membership functions and
first-order Sugeno functions in the consequents. We show in Figure 17 the non-linear
surface for control.
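For reference, a first-order Sugeno rule base with two inputs has the standard form
below. The symbols $x$, $y$, $A_i$, $B_i$, $p_i$, $q_i$, $r_i$ are generic notation
and are not taken from the chapter; identifying $x$ and $y$ with the controller inputs
T and dT/dt is an assumption.

$$
R_i:\ \text{if } x \text{ is } A_i \text{ and } y \text{ is } B_i \text{ then } f_i = p_i x + q_i y + r_i,
\qquad
w_i = \mu_{A_i}(x)\,\mu_{B_i}(y),
\qquad
f = \frac{\sum_i w_i f_i}{\sum_i w_i}
$$

ANFIS adapts both the parameters of the membership functions $\mu_{A_i}$, $\mu_{B_i}$
and the consequent coefficients $p_i$, $q_i$, $r_i$ from data.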

Fig. 17. ANFIS surface for the process

8.3 Neuro-Fuzzy-Genetic Control

In this case, neural networks are used for modelling the electrochemical process,
fuzzy logic for controlling the electrical current and genetic algorithms for adapting
the membership functions of the fuzzy system [8]. A multilayer feedforward neural
network was used for modelling the electrochemical process. We used the data from
Table 3 and the Levenberg-Marquardt learning algorithm to train the neural network.
We used a three-layer neural network with 15 nodes in the hidden layer. The results of
training for 2000 epochs are as follows. The sum of squared errors was reduced from
about 200 initially to 11.25 at the end, which is a very good approximation in this
case. The fuzzy rule base was implemented in the Fuzzy Logic Toolbox of MATLAB.
In this case, 25 fuzzy rules were used because there were 5 linguistic terms for each
input variable.

8.4 Experimental Results

The three hybrid control systems were compared by simulating the formation
(charging) of a 6-volt battery. This particular battery is manually charged (in the plant)
by applying 2 amperes for 50 hours according to the manufacturer’s specifications. We
show in Table 4 the experimental results.

Table 4. Comparison of the Methods for Control

Control Method          Loading Time
Manual Control          50 hours
Conventional Control    36 hours
Fuzzy Control           32 hours
Neuro-Fuzzy Control     30 hours
Neuro-Fuzzy-Genetic     25 hours

We can see from Table 4 that the fuzzy control method reduces the time
required to charge the battery by 36% compared with manual control, and by 11.11%
compared with conventional PID control [27]. We can also see how ANFIS helps in reducing
this time even further, because we are using neural networks for adapting the intelligent
system. Now the reduction is 40% with respect to manual control. Finally, we can
notice that using a neuro-fuzzy-genetic approach reduces the time even more, because
the genetic algorithm optimizes the fuzzy system. In this case, the reduction is 50%
with respect to manual control.
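The percentages quoted above follow directly from the loading times in Table 4. The
short sketch below reproduces the arithmetic; the names `TIMES` and `reduction_percent`
are hypothetical.

```python
# Loading times in hours, transcribed from Table 4.
TIMES = {
    "Manual Control": 50,
    "Conventional Control": 36,
    "Fuzzy Control": 32,
    "Neuro-Fuzzy Control": 30,
    "Neuro-Fuzzy-Genetic": 25,
}


def reduction_percent(method, baseline):
    """Percentage by which `method` shortens the loading time of `baseline`."""
    return 100.0 * (TIMES[baseline] - TIMES[method]) / TIMES[baseline]
```

For instance, fuzzy control against manual control gives (50 - 32) / 50 = 36%, and
against conventional control (36 - 32) / 36, which is about 11.11%.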
We have described in this section three different approaches for controlling an
electrochemical process. We have shown that for this type of application the use of
several soft computing techniques can help in reducing the time required to produce a
battery. Even fuzzy control alone can reduce the formation time of a battery, but using
neural networks and genetic algorithms reduces the production time even more. In the
best case (Table 4), this means that manufacturers can produce the batteries in half the
time needed before.

9 Conclusions
We can say that hybrid intelligent systems can be used to solve difficult real-world
problems. Of course, the right hybrid architecture (and combination) has to be selected.
At the moment, there are no general rules to decide on the right architecture for specific
classes of problems. However, we can use the experience that other researchers have
gained on these problems to our advantage. Also, we always have to turn to
experimental work to test different combinations of soft computing techniques and
decide on the best one for ourselves. Finally, we can conclude that the use of soft
computing for controlling dynamical systems is a very fruitful area of research, because
of the excellent results that can be achieved without using complex mathematical
models [8, 23].

Acknowledgments
We would like to thank the research grant committee of CONACYT-Mexico, for the
financial support given to this research project, under grant 33780-A, and also
COSNET for the research grants 743.99-P, 414.01-P and 487.02-P. We would also
like to thank the Department of Computer Science of Tijuana Institute of Technology
for the time and resources given to this project.

References
[1] Albertos, P., Strietzel, R., Mart, N.: Control Engineering Solutions: A practical approach.
IEEE Computer Society Press, Los Alamitos (1997)
[2] Bode, H., Brodd, R.J., Kordesch, K.V.: Lead-Acid Batteries. John Wiley & Sons,
Chichester (1977)
[3] Castillo, O., Melin, P.: Developing a New Method for the Identification of
Microorganisms for the Food Industry using the Fractal Dimension. Journal of
Fractals 2(3), 457–460 (1994)
[4] Castillo, O., Melin, P.: Mathematical Modelling and Simulation of Robotic Dynamic
Systems using Fuzzy Logic Techniques and Fractal Theory. In: Proceedings of IMACS
1997, Berlin, Germany, vol. 5, pp. 343–348 (1997)
[5] Castillo, O., Melin, P.: A New Fuzzy-Fractal-Genetic Method for Automated
Mathematical Modelling and Simulation of Robotic Dynamic Systems. In: Proceedings
of FUZZ 1998, vol. 2, pp. 1182–1187. IEEE Press, Anchorage (1998)
[6] Castillo, O., Melin, P.: A New Fuzzy Inference System for Reasoning with Multiple
Differential Equations for Modelling Complex Dynamical Systems. In: Proceedings of
CIMCA 1999, pp. 224–229. IOS Press, Vienna (1999)
[7] Castillo, O., Melin, P.: Automated Mathematical Modelling, Simulation and Behavior
Identification of Robotic Dynamic Systems using a New Fuzzy-Fractal-Genetic
Approach. Journal of Robotics and Autonomous Systems 28(1), 19–30 (1999)
[8] Castillo, O., Melin, P.: Soft Computing for Control of Non-Linear Dynamical Systems.
Springer, Heidelberg (2001)
[9] Chen, G., Pham, T.T.: Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control
Systems. CRC Press, Boca Raton (2001)
[10] Fu, K.S., Gonzalez, R.C., Lee, C.S.G.: Robotics: Control, Sensing, Vision and
Intelligence. McGraw-Hill, New York (1987)
[11] Goldfeld, S.M., Quandt, R.E., Trotter, H.F.: Maximization by Quadratic Hill Climbing.
Econometrica 34, 541–551 (1966)
[12] Hehner, N., Orsino, J.A.: Storage Battery Manufacturing Manual III. Independent Battery
Manufacturers Association (1985)
[13] Jang, J.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing. Prentice Hall,
Englewood Cliffs (1997)
[14] Mandelbrot, B.: The Fractal Geometry of Nature. W.H. Freeman and Company, New
York (1987)
[15] Marquardt, D.W.: An Algorithm for Least Squares Estimation of Non-Linear Parameters.
Journal of the Society of Industrial and Applied Mathematics 11, 431–441 (1963)
[16] Melin, P., Castillo, O.: Modelling and Simulation for Bacteria Growth Control in the
Food Industry using Artificial Intelligence. In: Proceedings of CESA 1996, Gerf EC
Lille, Lille, France, pp. 676–681 (1996)
[17] Melin, P., Castillo, O.: An Adaptive Model-Based Neural Network Controller for
Biochemical Reactors in the Food Industry. In: Proceedings of Control 1997, pp. 147–150.
Acta Press, Canada (1997)
[18] Melin, P., Castillo, O.: An Adaptive Neural Network System for Bacteria Growth Control
in the Food Industry using Mathematical Modelling and Simulation. In: Proceedings of
IMACS World Congress 1997, vol. 4, pp. 203–208. W & T Verlag, Berlin (1997)
[19] Melin, P., Castillo, O.: Automated Mathematical Modelling and Simulation for Bacteria
Growth Control in the Food Industry using Artificial Intelligence and Fractal Theory.
Journal of Systems, Analysis, Modelling and Simulation, 189–206 (1997)
[20] Melin, P., Castillo, O.: An Adaptive Model-Based Neuro-Fuzzy-Fractal Controller for
Biochemical Reactors in the Food Industry. In: Proceedings of IJCNN 1998, Anchorage
Alaska, USA, vol. 1, pp. 106–111 (1998)
[21] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal
Control of Non-Linear Dynamic Plants: The Case of Biochemical Reactors. In:
Proceedings of IPMU 1998, vol. 1, pp. 475–482. EDK Publishers, Paris (1998)
[22] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal
of Non-Linear Dynamical Systems. In: Proceedings of ICNPAA, pp. 499–506. European
Conference Publications, Daytona Beach (1999)
[23] Melin, P., Castillo, O.: Modelling, Simulation and Control of Non-Linear Dynamical
Systems. Taylor and Francis Publishers, London (2002)
[24] Miller, W.T., Sutton, R.S., Werbos, P.J.: Neural Networks for Control. MIT Press,
Cambridge (1995)
[25] Nakamura, S.: Numerical Analysis and Graphic Visualization with MATLAB.
Prentice-Hall, Englewood Cliffs (1997)
[26] Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Prentice Hall Publishing,
Englewood Cliffs (1989)
[27] Rasband, S.N.: Chaotic Dynamics of Non-Linear Systems. John Wiley & Sons,
Chichester (1990)
[28] Sepulveda, R., Castillo, O., Montiel, O., Lopez, M.: Analysis of Fuzzy Control System
for Process of Forming Batteries. In: ISRA 1998, Mexico, pp. 203–210 (1998)
[29] Sugeno, M., Kang, G.T.: Structure Identification of Fuzzy Model. Fuzzy Sets and
Systems 28, 15–33 (1988)
[30] Takagi, T., Sugeno, M.: Fuzzy Identification of Systems and its Applications to
Modelling and Control. IEEE Transactions on Systems, Man and Cybernetics 15, 116–132
(1985)
[31] Ungar, L.H.: A Bioreactor Benchmark for Adaptive Network-Based Process Control. In:
Neural Networks for Control, pp. 387–402. MIT Press, Cambridge (1995)
[32] Zadeh, L.A.: The Concept of a Linguistic Variable and its Application to Approximate
Reasoning. Information Sciences 8, 43–80 (1975)
Model Reference Adaptive Control of Underwater Robot
in Spatial Motion

Jerzy Garus

Naval University
81-103 Gdynia ul. Śmidowicza 69, Poland
j.garus@amw.gdynia.pl

Abstract. The paper addresses nonlinear control of an underwater robot. The way-point
line-of-sight scheme is incorporated for tracking a desired trajectory. Command signals are
generated by an autopilot consisting of four controllers with a parameter adaptation law
implemented. Quality of control is considered in the presence of environmental disturbances.
Some computer simulations are provided to demonstrate the effectiveness, correctness and
robustness of the approach.

1 Introduction

Interest in underwater robotics has increased in recent years. The main
benefits of using an Underwater Robotic Vehicle (URV) are removing humans
from the dangers of the undersea environment and reducing the cost of exploring
deep seas. Currently, it is common to use URVs to accomplish missions such as
inspections of coastal and off-shore structures, cable maintenance, and
hydrographical and biological surveys. In the military field they are employed in such
tasks as surveillance, intelligence gathering, torpedo recovery and mine
countermeasures.
The URV is considered to be a floating platform carrying the tools required for
performing various functions, such as manipulator arms with interchangeable
end-effectors, cameras, scanners, sonars, etc. Automatic control of such objects is a
difficult problem because of their nonlinear dynamics [1, 3, 4, 5, 6]. Moreover, the
dynamics can change when the configuration is altered to suit the
mission. In order to cope with those difficulties, the control system should be
flexible.
A conventional URV operates in a crab-wise manner in four degrees of freedom
(DOF), with small roll and pitch angles that can be neglected during normal
operations. Therefore its basic motion is movement in the horizontal plane with some
variation due to diving.
The objective of the paper is to present the use of the adaptive inverse dynamics
algorithm for driving the robot along a desired trajectory in spatial motion. The paper
consists of four sections. Brief descriptions of the dynamical and kinematical
equations of motion of the URV and of the adaptive control law are presented in
Section 2. Next, some results of the simulation study are provided. Conclusions are
given in Section 4.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 71–83.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Nonlinear Adaptive Control Law


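The general motion of marine vessels in six DOF is described by the following vectors [2, 4, 5]:

$$
\begin{aligned}
\boldsymbol{\eta} &= [x, y, z, \phi, \theta, \psi]^T \\
\mathbf{v} &= [u, v, w, p, q, r]^T \\
\boldsymbol{\tau} &= [X, Y, Z, K, M, N]^T
\end{aligned}
\tag{1}
$$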
where:

η – vector of position and orientation in the inertial frame;


x, y, z – coordinates of position;
φ, θ, ψ – coordinates of orientation (Euler angles);
v – vector of linear and angular velocities with coordinates in the
body-fixed frame;
u, v, w – linear velocities along longitudinal, transversal and vertical axes;
p, q, r – angular velocities about longitudinal, transversal and vertical axes;
τ – vector of forces and moments acting on the robot in the body-fixed
frame;
X, Y, Z – forces along longitudinal, transversal and vertical axes;
K, M, N – moments about longitudinal, transversal and vertical axes.

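Nonlinear dynamical and kinematical equations of motion in the body-fixed frame
can be expressed as [4, 5]:

$$
\mathbf{M}\dot{\mathbf{v}} + \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) = \boldsymbol{\tau} \tag{2a}
$$

$$
\dot{\boldsymbol{\eta}} = \mathbf{J}(\boldsymbol{\eta})\mathbf{v} \tag{2b}
$$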

where:

M – inertia matrix (including added mass);


C(v) – matrix of Coriolis and centripetal terms (including added mass);
D(v) – hydrodynamic damping and lift matrix;
g (η) – vector of gravitational forces and moments;
J (η) – velocity transformation matrix between the body-fixed frame and
the inertial one.
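The robot’s dynamics in the inertial frame can be written as [4, 5]:

$$
\mathbf{M}_\eta(\boldsymbol{\eta})\ddot{\boldsymbol{\eta}} + \mathbf{C}_\eta(\mathbf{v}, \boldsymbol{\eta})\dot{\boldsymbol{\eta}} + \mathbf{D}_\eta(\mathbf{v}, \boldsymbol{\eta})\dot{\boldsymbol{\eta}} + \mathbf{g}_\eta(\boldsymbol{\eta}) = \boldsymbol{\tau}_\eta \tag{3}
$$

where, with $\mathbf{J}^{-T} = (\mathbf{J}^{-1})^T$:

$$
\begin{aligned}
\mathbf{M}_\eta(\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{M}\,\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{C}_\eta(\mathbf{v}, \boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\left[\mathbf{C}(\mathbf{v}) - \mathbf{M}\mathbf{J}^{-1}(\boldsymbol{\eta})\dot{\mathbf{J}}(\boldsymbol{\eta})\right]\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{D}_\eta(\mathbf{v}, \boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{D}(\mathbf{v})\,\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{g}_\eta(\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{g}(\boldsymbol{\eta}) \\
\boldsymbol{\tau}_\eta &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\boldsymbol{\tau}
\end{aligned}
$$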

There are parametric uncertainties in the dynamic model (2a), and some parameters
are generally unknown. Hence, parameter estimation is necessary in the case of
model-based control. For this purpose it is assumed that the robot equations of motion are
linear in a parameter vector $\mathbf{p}$, that is [8]:

$$
\mathbf{M}\dot{\mathbf{v}} + \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) \cong \mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\mathbf{p} = \boldsymbol{\tau} \tag{4}
$$

where $\mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})$ is a known matrix function of measured signals, usually
referred to as the regressor matrix (dimension n×r), and $\mathbf{p}$ is a vector of uncertain or
unknown parameters.
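Let us define the nonlinear URV dynamics (2a) in a compact form as:

$$
\mathbf{M}\dot{\mathbf{v}} + \mathbf{h}(\mathbf{v}, \boldsymbol{\eta}) = \boldsymbol{\tau} \tag{5}
$$

where $\mathbf{h}$ is the nonlinear vector:

$$
\mathbf{h}(\mathbf{v}, \boldsymbol{\eta}) = \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) \tag{6}
$$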

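The control law with parameter adaptation, under the assumption that the desired
trajectory $\boldsymbol{\eta}_d$, $\dot{\boldsymbol{\eta}}_d$ and $\ddot{\boldsymbol{\eta}}_d$ is given and the vectors $\boldsymbol{\eta}$, $\mathbf{v}$ and $\dot{\mathbf{v}}$ are
measured, takes the form [5, 8]:

$$
\boldsymbol{\tau} = \hat{\mathbf{M}}\mathbf{a} + \hat{\mathbf{h}}(\mathbf{v}, \boldsymbol{\eta}) \tag{7}
$$

where the hat denotes the adaptive parameter estimates.
Substituting (7) into (5) and adding and subtracting $\hat{\mathbf{M}}\dot{\mathbf{v}}$ on the left side of the
dynamical equation yields:

$$
\hat{\mathbf{M}}(\dot{\mathbf{v}} - \mathbf{a}) = \tilde{\mathbf{M}}\dot{\mathbf{v}} + \tilde{\mathbf{h}}(\mathbf{v}, \boldsymbol{\eta}) \tag{8}
$$

where $\tilde{\mathbf{M}} = \hat{\mathbf{M}} - \mathbf{M}$ and $\tilde{\mathbf{h}}(\mathbf{v}, \boldsymbol{\eta}) = \hat{\mathbf{h}}(\mathbf{v}, \boldsymbol{\eta}) - \mathbf{h}(\mathbf{v}, \boldsymbol{\eta})$.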
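Since the equations of motion are linear in the parameter vector $\mathbf{p}$, the following
parameterization can be applied:

$$
\tilde{\mathbf{M}}\dot{\mathbf{v}} + \tilde{\mathbf{h}}(\mathbf{v}, \boldsymbol{\eta}) = \mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\tilde{\mathbf{p}} \tag{9}
$$

where $\tilde{\mathbf{p}} = \hat{\mathbf{p}} - \mathbf{p}$ is the unknown parameter error vector.
Differentiation of the kinematical equation (2b) with respect to time yields:

$$
\dot{\mathbf{v}} = \mathbf{J}^{-1}(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \dot{\mathbf{J}}(\boldsymbol{\eta})\mathbf{v}\right] \tag{10}
$$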

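Substituting (10) into (8), using (9), and choosing the commanded acceleration $\mathbf{a}$ in the form
$\mathbf{a} = \mathbf{J}^{-1}(\boldsymbol{\eta})\left[\mathbf{a}_\eta - \dot{\mathbf{J}}(\boldsymbol{\eta})\mathbf{v}\right]$, the following expression is obtained:

$$
\hat{\mathbf{M}}\mathbf{J}^{-1}(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \mathbf{a}_\eta\right] = \mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\tilde{\mathbf{p}} \tag{11}
$$

Multiplying (11) by $\mathbf{J}^{-T}(\boldsymbol{\eta}) = \left(\mathbf{J}^{-1}(\boldsymbol{\eta})\right)^T$ gives:

$$
\hat{\mathbf{M}}_\eta(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \mathbf{a}_\eta\right] = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\tilde{\mathbf{p}} \tag{12}
$$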

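Furthermore, let the commanded acceleration $\mathbf{a}_\eta$ be chosen as the PDD2-type
control [5]:

$$
\mathbf{a}_\eta = \ddot{\boldsymbol{\eta}}_d - \mathbf{K}_D\dot{\tilde{\boldsymbol{\eta}}} - \mathbf{K}_P\tilde{\boldsymbol{\eta}} \tag{13}
$$

where $\tilde{\boldsymbol{\eta}} = \boldsymbol{\eta} - \boldsymbol{\eta}_d$ is the tracking error and $\mathbf{K}_P$, $\mathbf{K}_D$ are positive definite diagonal
matrices.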
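Hence, the error dynamics can be written in the form:

$$
\hat{\mathbf{M}}_\eta(\boldsymbol{\eta})\left[\ddot{\tilde{\boldsymbol{\eta}}} + \mathbf{K}_D\dot{\tilde{\boldsymbol{\eta}}} + \mathbf{K}_P\tilde{\boldsymbol{\eta}}\right] = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\tilde{\mathbf{p}} \tag{14}
$$

Assuming that $\hat{\mathbf{M}}_\eta^{-1}(\boldsymbol{\eta})$ exists, the expression (14) can be written in a state-space
form:

$$
\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\,\mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\tilde{\mathbf{p}} \tag{15}
$$

where:

$$
\mathbf{x} = \begin{bmatrix} \tilde{\boldsymbol{\eta}} \\ \dot{\tilde{\boldsymbol{\eta}}} \end{bmatrix}, \qquad
\mathbf{A} = \begin{bmatrix} \mathbf{0} & \mathbf{I} \\ -\mathbf{K}_P & -\mathbf{K}_D \end{bmatrix}, \qquad
\mathbf{B} = \begin{bmatrix} \mathbf{0} \\ \hat{\mathbf{M}}_\eta^{-1}(\boldsymbol{\eta}) \end{bmatrix}.
$$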
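If the parameter vector $\hat{\mathbf{p}}$ is updated according to the formula [5, 6]:

$$
\dot{\hat{\mathbf{p}}} = -\boldsymbol{\Gamma}^{-1}\,\mathbf{Y}^T(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})\,\mathbf{J}^{-1}(\boldsymbol{\eta})\,\mathbf{B}^T\mathbf{P}\mathbf{x} \tag{16}
$$

where $\boldsymbol{\Gamma}$ and $\mathbf{P}$ are symmetric positive definite matrices, then convergence of $\tilde{\boldsymbol{\eta}}$ to
zero is guaranteed.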
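The convergence claim can be motivated by a standard Lyapunov argument, sketched here
under assumptions that the text does not state explicitly: the true parameter vector
$\mathbf{p}$ is constant, $\mathbf{A}$ is Hurwitz (which holds for positive definite $\mathbf{K}_P$ and $\mathbf{K}_D$),
and $\mathbf{P}$ is the symmetric positive definite solution of
$\mathbf{A}^T\mathbf{P} + \mathbf{P}\mathbf{A} = -\mathbf{Q}$ for some symmetric positive definite $\mathbf{Q}$. The
shorthand $\boldsymbol{\Phi} = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta}, \mathbf{v}, \dot{\mathbf{v}})$ is introduced only for this sketch.

$$
\begin{aligned}
V &= \mathbf{x}^T\mathbf{P}\mathbf{x} + \tilde{\mathbf{p}}^T\boldsymbol{\Gamma}\tilde{\mathbf{p}} \\
\dot{V} &= -\mathbf{x}^T\mathbf{Q}\mathbf{x} + 2\tilde{\mathbf{p}}^T\left(\boldsymbol{\Phi}^T\mathbf{B}^T\mathbf{P}\mathbf{x} + \boldsymbol{\Gamma}\dot{\hat{\mathbf{p}}}\right) \\
&= -\mathbf{x}^T\mathbf{Q}\mathbf{x} \le 0
\quad \text{for} \quad \dot{\hat{\mathbf{p}}} = -\boldsymbol{\Gamma}^{-1}\boldsymbol{\Phi}^T\mathbf{B}^T\mathbf{P}\mathbf{x}
\end{aligned}
$$

Since $\boldsymbol{\Phi}^T = \mathbf{Y}^T\mathbf{J}^{-1}$, this choice coincides with (16). The inequality shows that
$\mathbf{x}$ and $\tilde{\mathbf{p}}$ remain bounded; concluding that $\tilde{\boldsymbol{\eta}}$ tends to zero additionally needs a
Barbalat-type argument with bounded signals, and the parameter estimates themselves
need not converge to the true values.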
A block diagram of the control system with parameter adaptation law is shown in
Fig. 1.

Fig. 1. A block diagram with the parameter adaptation law

3 Simulation Results
The main task of the proposed tracking control system is to minimize the distance of
the robot’s centre of gravity to the desired trajectory under the following assumptions:

1. the robot can move with varying linear velocities u, v, w and angular velocity r;
2. its velocities u, v, w, r, coordinates of position x, y, z and heading ψ are
measurable;
3. the desired trajectory is given by means of a set of way-points $\{(x_{di}, y_{di}, z_{di})\}$;
4. reference trajectories between two successive way-points are defined as smooth
and bounded curves;
5. the command signal τ consists of four components: $\tau_X = X$, $\tau_Y = Y$, $\tau_Z = Z$
and $\tau_N = N$, calculated from the control law (7).

The structure of the proposed automatic control system is depicted in Fig. 2.


Fig. 2. The main parts of the control system

To validate the performance of the developed nonlinear control law, some
simulation results obtained in the MATLAB/Simulink environment are presented. The
mathematical model of the URV is based on a real construction, i.e. the underwater
robotic vehicle called “Coral”, designed and built for the Polish Navy. It is an
open-frame robot controllable in four DOF, 1.5 m long, with a propulsion
system consisting of six thrusters. Displacement in the horizontal plane is achieved by means
of four thrusters, which generate a force of up to ±750 N, assuring speeds of up to ±1.2 m/s
and ±0.6 m/s in the x and y directions, respectively. In the vertical plane two thrusters are
used, assuring a speed of up to ±0.35 m/s. All parameters of the robot’s dynamics are
presented in the Appendix.
The numerical simulations have been done under the following assumptions:

1. The robot has to follow the desired trajectory beginning from (10 m, 10 m, 0 m),
passing target way-points: (10 m, 10 m, -5 m), (10 m, 90 m, -5 m), (30 m, 90 m,
-5 m), (30 m, 10 m, -5 m), (60 m, 10 m, -5 m), (60 m, 90 m, -5 m), (60 m, 90 m,
-15 m), (60 m, 10 m, -15 m), (30 m, 10 m, -15 m), (30 m, 90 m, -15 m),
(10 m, 90 m, -15 m) and ending at (10 m, 10 m, -15 m);
2. A turning point is reached when the robot is inside the 0.5-metre circle of
acceptance;
3. The sea current acts on the robot’s hull with maximum velocity 0.3 m/s and
direction 135°;
4. The dynamic equations of the robot’s motion are integrated at a higher frequency
(18 Hz) than the rest of the modules (6 Hz).
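The way-point switching implied by assumptions 1 and 2 can be sketched as follows. The
function name `next_waypoint_index` is hypothetical, and measuring the acceptance
region with the three-dimensional Euclidean distance is an assumption, since the text
speaks only of a circle of acceptance.

```python
import math

# Way-points transcribed from assumption 1 (metres).
WAYPOINTS = [
    (10, 10, 0), (10, 10, -5), (10, 90, -5), (30, 90, -5), (30, 10, -5),
    (60, 10, -5), (60, 90, -5), (60, 90, -15), (60, 10, -15),
    (30, 10, -15), (30, 90, -15), (10, 90, -15), (10, 10, -15),
]
ACCEPTANCE_RADIUS = 0.5  # metres, from assumption 2


def next_waypoint_index(position, index, radius=ACCEPTANCE_RADIUS):
    """Advance to the following way-point once the current target is reached."""
    reached = math.dist(position, WAYPOINTS[index]) <= radius
    if reached and index < len(WAYPOINTS) - 1:
        return index + 1
    return index
```

The index of the active target stays unchanged while the robot is farther than the
acceptance radius from it, and the last way-point is never skipped.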

It has been assumed that the time-varying reference trajectories from way-point i to
the next way-point i+1 are generated using desired speed profiles [7, 8]. Such an
approach allows us to keep a constant speed along a certain part of the path. For those
assumptions and the following initial conditions:
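$$
\begin{aligned}
\eta_{dk}(t_b) &= \eta_0, & \dot{\eta}_{dk}(t_b) &= \dot{\eta}_0 \\
\eta_{dk}(t_f) &= \eta_1, & \dot{\eta}_{dk}(t_f) &= \dot{\eta}_1 \\
\max \dot{\eta}_{dk}(t) &= \dot{\eta}_{\max}
\end{aligned}
\tag{17}
$$

where $k = 1, \ldots, 4$, the i-th segment of the trajectory in a period of time $t \in [t_b, t_f]$ is
modelled according to the expression [8]:

$$
\eta_{dk}(t) =
\begin{cases}
\eta_0 + \dfrac{\dot{\eta}_{\max} - \dot{\eta}_0}{2t_m}\,t^2, & t_b \le t \le t_m \\[2ex]
\dfrac{\eta_1 + \eta_0 - \dot{\eta}_{\max}(t_f - 2t_m)}{2} + \dot{\eta}_{\max}(t - t_m), & t_m < t \le t_f - t_m \\[2ex]
\eta_1 - \dfrac{\dot{\eta}_{\max} - \dot{\eta}_1}{2t_m}\,(t_f - t)^2, & t_f - t_m < t \le t_f
\end{cases}
$$

where $t_m = t_f - \dfrac{\eta_1 - \eta_0}{\dot{\eta}_{\max}}$.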
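The speed profile above (acceleration, constant speed, deceleration) can be evaluated
numerically. The sketch below assumes a segment that starts at $t_b = 0$ with zero
velocity at both ends and a maximum speed `v_max` carrying the sign of `eta1 - eta0`;
the function name `reference_position` and its argument names are hypothetical.

```python
def reference_position(t, eta0, eta1, v_max, t_f):
    """Position on one trajectory segment; assumes t_b = 0 and zero end velocities."""
    t_m = t_f - (eta1 - eta0) / v_max          # duration of each blend phase
    if not 0.0 < t_m <= t_f / 2.0:
        raise ValueError("v_max and t_f do not give a valid speed profile")
    if t <= t_m:                                # acceleration phase
        return eta0 + v_max / (2.0 * t_m) * t ** 2
    if t <= t_f - t_m:                          # constant-speed phase
        return (eta1 + eta0 - v_max * (t_f - 2.0 * t_m)) / 2.0 + v_max * (t - t_m)
    return eta1 - v_max / (2.0 * t_m) * (t_f - t) ** 2   # deceleration phase
```

For a 10 m segment travelled in 15 s with a maximum speed of 1 m/s, the blend time is
5 s and the position is continuous at the two switching instants (2.5 m at 5 s and
7.5 m at 10 s).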
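The control algorithm has been worked out on the basis of the simplified URV model
proposed in [4, 9]:

$$
\mathbf{M}_s\dot{\mathbf{v}} + \mathbf{D}_s(\mathbf{v})\mathbf{v} = \boldsymbol{\tau} \tag{18}
$$

where all kinematic and dynamic cross-coupling terms are neglected. Here $\mathbf{M}_s$ and
$\mathbf{D}_s(\mathbf{v})$ are diagonal matrices with the diagonal elements of the inertia matrix $\mathbf{M}$ and of the
nonlinear damping matrix $\mathbf{D}_n(\mathbf{v})$, respectively (see the Appendix). Uncertainties in
the above model are compensated in the control system. Therefore, the robot’s model
for spatial motion in four DOF can be written in the following form:

$$
\begin{aligned}
m_X\dot{u} + d_X u|u| &= \tau_X \\
m_Y\dot{v} + d_Y v|v| &= \tau_Y \\
m_Z\dot{w} + d_Z w|w| &= \tau_Z \\
m_N\dot{r} + d_N r|r| &= \tau_N
\end{aligned}
\tag{19}
$$

Defining the parameter vector $\mathbf{p}$ as
$\mathbf{p} = [m_X \;\; d_X \;\; m_Y \;\; d_Y \;\; m_Z \;\; d_Z \;\; m_N \;\; d_N]^T$, equation (18) can be written
in the form:

$$
\mathbf{Y}(\mathbf{v}, \dot{\mathbf{v}})\,\mathbf{p} = \boldsymbol{\tau} \tag{20}
$$

where:

$$
\mathbf{Y}(\mathbf{v}, \dot{\mathbf{v}}) =
\begin{bmatrix}
\dot{u} & u|u| & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \dot{v} & v|v| & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \dot{w} & w|w| & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \dot{r} & r|r|
\end{bmatrix}.
$$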
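The regressor form (20) can be checked numerically against the decoupled model (19).
In the sketch below the function names `regressor` and `forces` are hypothetical, and
any parameter values passed in are illustrative and are not the parameters of the
“Coral” vehicle.

```python
def regressor(v, v_dot):
    """4 x 8 regressor for v = (u, v, w, r) and its time derivative."""
    rows = []
    for k, (vel, acc) in enumerate(zip(v, v_dot)):
        row = [0.0] * 8
        row[2 * k] = acc                  # multiplies the inertia parameter m
        row[2 * k + 1] = vel * abs(vel)   # multiplies the quadratic damping parameter d
        rows.append(row)
    return rows


def forces(p, v, v_dot):
    """tau = Y(v, v_dot) p with p = [mX, dX, mY, dY, mZ, dZ, mN, dN]."""
    return [sum(y * q for y, q in zip(row, p)) for row in regressor(v, v_dot)]
```

Each component of the result equals the corresponding line of (19), for example the
first one is `mX * u_dot + dX * u * abs(u)`.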

Fig. 3. Track-keeping control under interaction of sea current disturbances (maximum velocity
0.3 m/s and direction 135°): desired (d) and real (r) trajectories (upper plot), x-, y-, z-position
and their errors (2nd to 4th plots), course and its error (5th plot), commands (lowest plot)

The control problem has been examined under the interaction of environmental
disturbances, i.e. a sea current. To simulate its effect on the robot’s motion, it is assumed
that the current’s velocity $V_c$ is slowly varying and its direction is fixed. For simulation
purposes the current velocity was generated by using the first-order Gauss-Markov process [5]:

$$
\dot{V}_c + \mu V_c = \omega \tag{21}
$$

where $\omega$ is Gaussian white noise, $\mu \ge 0$ is a constant and $0 \le V_c(t) \le V_{c\max}$.
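A discrete-time sketch of the process (21) is given below. It uses an explicit Euler
step and simple saturation to keep the velocity within its bounds; the noise standard
deviation `sigma`, the step `dt`, the seed and the function name `simulate_current` are
assumptions made for illustration and are not values from the paper.

```python
import random


def simulate_current(v0, mu, sigma, v_max, dt, steps, seed=0):
    """Euler discretization of dVc/dt + mu*Vc = w, saturated to [0, v_max]."""
    rng = random.Random(seed)
    v = min(max(v0, 0.0), v_max)
    history = [v]
    for _ in range(steps):
        w = rng.gauss(0.0, sigma)         # white-noise sample
        v = v + dt * (-mu * v + w)        # explicit Euler step
        v = min(max(v, 0.0), v_max)       # enforce 0 <= Vc <= Vc_max
        history.append(v)
    return history
```

A call such as `simulate_current(0.2, 0.01, 0.05, 0.3, 0.1, 500)` returns 501 samples
that stay between 0 and the maximum velocity of 0.3 m/s used in the simulations.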

Fig. 4. Estimates of mass and damping coefficients: set value (s) and estimate (e). Panels:
estimates for motion along the x, y and z axes (m [kg], d [kg/m]) and for rotation about the
z axis (m [kg m²], d [kg m²]), each plotted against time [s]

Results of track-keeping in the presence of external disturbances and time histories of
the command signals are presented in Fig. 3.
It can be noticed that the proposed autopilot provides good tracking control along
the desired trajectory in spatial motion. The main advantages of the approach are the
use of a simple nonlinear law to design the controllers and its high performance for
relatively large sea current disturbances (comparable with the resultant speed of the robot).
Since the true values of the components of the vector p are unknown, the estimation
process was started from half of the nominal values. Time histories of the estimated
parameters during track-keeping are presented in Fig. 4.

4 Conclusions
In the paper a nonlinear control system for the underwater robot has been described. The results obtained with the autopilot, which consists of four controllers with a parameter adaptation law, have shown that the proposed control system is simple and useful in practice.
Disturbances from the sea current were added in the simulation study to verify the performance, correctness and robustness of the approach.
Further work will be devoted to the problem of tuning the autopilot parameters in relation to the robot's dynamics.

References
[1] Antonelli, G., Caccavale, F., Sarkar, S., West, M.: Adaptive Control of an Autonomous
Underwater Vehicle: Experimental Results on ODIN. IEEE Transactions on Control
Systems Technology 9(5), 756–765 (2001)
[2] Bhattacharyya, R.: Dynamics of Marine Vehicles. John Wiley and Sons, Chichester (1978)
[3] Craven, J., Sutton, R., Burns, R.S.: Control Strategies for Unmanned Underwater Vehicles.
Journal of Navigation 1(51), 79–105 (1998)
[4] Fossen, T.I.: Guidance and Control of Ocean Vehicles. John Wiley and Sons, Chichester
(1994)
[5] Fossen, T.I.: Marine Control Systems. Marine Cybernetics AS, Trondheim (2002)
[6] Garus, J.: Design of URV Control System Using Nonlinear PD Control. WSEAS
Transactions on Systems 4(5), 770–778 (2005)
[7] Garus, J., Kitowski, Z.: Tracking Autopilot for Underwater Robotic Vehicle. In: Cagnol,
J., Zolesio, J.P. (eds.) Information Processing: Recent Mathematical Advances in
Optimization and Control, pp. 127–138. Presses de l’Ecole des Mines de Paris (2004)
[8] Spong, M.W., Vidyasagar, M.: Robot Dynamics and Control. John Wiley and Sons,
Chichester (1989)
[9] Yoerger, D.R., Slotine, J.E.: Robust Trajectory Control of Underwater Vehicles. IEEE
Journal of Oceanic Engineering (4), 462–470 (1985)

Appendix
The URV model. The following parameters of the dynamics of the underwater robot have been used in computer simulations:

$$M = \mathrm{diag}\{99.0,\ 108.5,\ 126.5,\ 8.2,\ 32.9,\ 29.1\}$$

$$D(v) = D + D_n(v) = \mathrm{diag}\{10.0,\ 0.0,\ 0.0,\ 0.223,\ 1.918,\ 1.603\} + \mathrm{diag}\{227.18\,u,\ 405.41\,v,\ 478.03\,w,\ 3.212\,p,\ 14.002\,q,\ 12.937\,r\}$$

$$C(v) = \begin{bmatrix}
0 & 0 & 0 & 0 & 26.0w & -28.0v \\
0 & 0 & 0 & -26.0w & 0 & 18.5u \\
0 & 0 & 0 & 28.0v & -18.5u & 0 \\
0 & 26.0w & -28.0v & 0 & 5.9r & -6.8q \\
-26.0w & 0 & 18.5u & -5.9r & 0 & 1.3p \\
28.0v & -18.5u & 0 & 6.8q & -1.3p & 0
\end{bmatrix}$$

$$g(\eta) = \begin{bmatrix}
-17.0\sin(\theta) \\
17.0\cos(\theta)\sin(\phi) \\
17.0\cos(\theta)\cos(\phi) \\
-279.2\cos(\theta)\sin(\phi) \\
-279.2\,(\sin(\theta) + \cos(\theta)\cos(\phi)) \\
0
\end{bmatrix}$$
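For readers who want to reuse these parameters, the appendix can be transcribed into code roughly as below. This is a sketch only: the function names are hypothetical, and the absolute values in the nonlinear damping are an assumption made here (common for quadratic drag, so that damping stays dissipative for negative velocities); the appendix as reproduced above lists plain velocity components.

```python
import numpy as np

# Inertia matrix M from the appendix
M = np.diag([99.0, 108.5, 126.5, 8.2, 32.9, 29.1])
D_LIN = np.array([10.0, 0.0, 0.0, 0.223, 1.918, 1.603])
D_NL = np.array([227.18, 405.41, 478.03, 3.212, 14.002, 12.937])

def damping(nu):
    """D(v) = D + Dn(v) for nu = [u, v, w, p, q, r].

    np.abs is an assumption, see the note above the code.
    """
    return np.diag(D_LIN + D_NL * np.abs(nu))

def coriolis(nu):
    """Coriolis and centripetal matrix C(v), entries copied from the appendix."""
    u, v, w, p, q, r = nu
    return np.array([
        [0.0, 0.0, 0.0, 0.0, 26.0 * w, -28.0 * v],
        [0.0, 0.0, 0.0, -26.0 * w, 0.0, 18.5 * u],
        [0.0, 0.0, 0.0, 28.0 * v, -18.5 * u, 0.0],
        [0.0, 26.0 * w, -28.0 * v, 0.0, 5.9 * r, -6.8 * q],
        [-26.0 * w, 0.0, 18.5 * u, -5.9 * r, 0.0, 1.3 * p],
        [28.0 * v, -18.5 * u, 0.0, 6.8 * q, -1.3 * p, 0.0],
    ])

def restoring(phi, theta):
    """Vector g(eta) of restoring forces and moments (roll phi, pitch theta)."""
    return np.array([
        -17.0 * np.sin(theta),
        17.0 * np.cos(theta) * np.sin(phi),
        17.0 * np.cos(theta) * np.cos(phi),
        -279.2 * np.cos(theta) * np.sin(phi),
        -279.2 * (np.sin(theta) + np.cos(theta) * np.cos(phi)),
        0.0,
    ])

nu = np.array([0.3, -0.2, 0.1, 0.05, -0.04, 0.02])  # arbitrary test velocity
C = coriolis(nu)
```

A quick sanity check on the transcription is that `C` is skew-symmetric, so the Coriolis term does no work: `nu @ C @ nu` vanishes for any velocity vector.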
Feedback Stabilization of Distributed Parameter
Gyroscopic Systems

Pawel Skruch

AGH University of Science and Technology, Institute of Automation,
al. Mickiewicza 30/B1, 30-059 Kraków, Poland
pawel.skruch@agh.edu.pl

Abstract. In this paper feedback stabilization of distributed parameter gyroscopic systems is discussed. The class of such systems is described by second-order operator equations. We show that the closed loop system, which consists of the controlled system, linear non-velocity feedback and a parallel compensator, is asymptotically stable. In the case where velocity is available, the parallel compensator is not necessary to stabilize the system. We present our results for the multi-input multi-output case. Stability is proved by LaSalle's theorem extended to infinite dimensional systems. Numerical examples are given to illustrate the effectiveness of the proposed controllers.

1 Introduction
Many physical systems are represented by partial differential equations. Examples include robots with flexible links and vibrating structures such as beams, buildings and bridges. For the most part, it is not possible or feasible to obtain a solution of these equations. Therefore, in practice, a distributed parameter system is first discretized to a matrix second-order model using some approximate method. Then the problem is solved for this discretized reduced-order model.
It is well known that a dangerous situation called resonance occurs when one or more natural frequencies of the system become equal or close to a frequency of the external force. Because a linear infinite dimensional system described by an operator second-order differential equation without damping term may have an infinite number of poles on the imaginary axis [17], [18], [26], the approximate solutions are not suitable for designing the stabilizer. To combat possible undesirable effects of vibrations, the dynamic effect of the system parts whose behaviour is described by partial differential equations has to be taken into account in designing a controller.
Stability of second-order systems, in both the finite and the infinite dimensional case, has been studied in the past. More recently, in [19] and [20] the dynamics of an LC ladder network and its stabilization by inner resistance, by velocity feedback and by first-order dynamic feedback were studied. Control problems for finite dimensional undamped second-order systems are discussed in [12] and [21]. In [28], a class of non-linear controllers is proposed to stabilize damped gyroscopic systems. Stabilization problems for infinite dimensional second-order systems have been studied by many authors; see, for example, [13], [14], [17], [23] and [24]. A good source of references to papers in which stabilization problems are treated can be found in [18].
The paper is organized as follows. In the next section we introduce the system and analyze some of its properties. In sections 3 and 4, we propose two types of control laws and prove that they asymptotically stabilize the system. In section 5 we present some numerical simulation results. Finally, we give some concluding remarks.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 85–97.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Description of the System


Let $\Omega \subset \mathbb{R}^N$ be a bounded domain with smooth boundary $\partial\Omega$. By X we denote a real Hilbert space consisting of square integrable functions on the set $\Omega$ with the following inner product:

$$\langle f, g\rangle_X = \int_\Omega f(\xi)\,g(\xi)\,d\xi. \qquad (1)$$

Let $L^2$ and $H^k$ be defined as follows:

$$L^2 = \Big\{ f : \Omega \to \mathbb{R}^n : \int_\Omega |f(\xi)|^2\,d\xi < \infty \Big\}, \qquad (2)$$

$$H^k = \big\{ f \in L^2 : f, f', \ldots, f^{(k)} \in L^2 \big\}. \qquad (3)$$

We consider a control system described by the second-order operator equation

$$\ddot{x}(t) + G\dot{x}(t) + Ax(t) = Bu(t), \quad t > 0, \qquad (4)$$

with initial conditions

$$x(0) = x_0 \in D(A), \quad \dot{x}(0) = x_1 \in X, \qquad (5)$$

where $x(t) \in X = L^2(\Omega)$. We assume that $A : (D(A) \subset X) \to X$ is a linear, generally unbounded, self-adjoint and positive definite operator with domain D(A) dense in X and compact resolvent $R(\lambda, A)$; $G \in L(X)$ is a linear, bounded and skew-adjoint (gyroscopic) operator. The control force is represented by the operator $B \in L(\mathbb{R}^r, X)$ defined as follows:

$$Bu(t) = \sum_{i=1}^{r} b_i u_i(t), \qquad (6)$$

where $B = [b_1\ b_2\ \cdots\ b_r]$, $b_i \in X$, $u(t) = [u_1(t)\ u_2(t)\ \cdots\ u_r(t)]^T$, $u_i(\cdot) \in L^2([0, \infty), \mathbb{R})$, $i = 1, 2, \ldots, r$. The state of the system is measured by averaging sensors, whose outputs are expressed by the linear and bounded operator $C \in L(X, \mathbb{R}^m)$

$$y(t) = Cx(t), \qquad (7)$$

where

$$Cx = [\langle c_1, x\rangle_X\ \ \langle c_2, x\rangle_X\ \cdots\ \langle c_m, x\rangle_X]^T, \qquad (8)$$

and $c_i \in X$, $i = 1, 2, \ldots, m$, are sensor influence functions.
From the Hilbert-Schmidt theory [5], [22], [31] for compact self-adjoint operators, it is well known that the operator A satisfies the following hypotheses:
(a) $0 \in \rho(A)$, i.e. $A^{-1}$ exists and is compact ($\rho(A)$ stands for the resolvent set of the operator A),
(b) A is closed,
(c) The operator A has only purely discrete spectrum consisting entirely of distinct real positive eigenvalues $\lambda_i$ with finite multiplicity $r_i < \infty$, where $0 < \lambda_1 < \ldots < \lambda_i < \ldots$, $\lim_{i\to\infty} \lambda_i = \infty$,
(d) For each eigenvalue $\lambda_i$ there exist $r_i$ corresponding eigenfunctions $\upsilon_{ik}$, $A\upsilon_{ik} = \lambda_i \upsilon_{ik}$, where $i = 1, 2, \ldots$, $k = 1, 2, \ldots, r_i$,
(e) The set of eigenfunctions $\upsilon_{ik}$, $i = 1, 2, \ldots$, $k = 1, 2, \ldots, r_i$, forms a complete orthonormal system in X.
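By introducing the new function space $\tilde{X} = D(A^{1/2}) \times X$, the equation (4) is reduced to the following abstract first-order form:

$$\frac{d}{dt}\tilde{x}(t) = (\tilde{A} + \tilde{G})\,\tilde{x}(t) + \tilde{B}u(t), \qquad (9)$$

where $\tilde{x} = \mathrm{col}(x, \dot{x})$, and the operators $\tilde{A}$, $\tilde{G}$ and $\tilde{B}$ are defined as

$$\tilde{A} = \begin{bmatrix} 0 & I \\ -A & 0 \end{bmatrix}, \quad \tilde{G} = \begin{bmatrix} 0 & 0 \\ 0 & -G \end{bmatrix}, \quad \tilde{B} = \begin{bmatrix} 0 \\ B \end{bmatrix}. \qquad (10)$$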
Remark 1. The operator A is positive and self-adjoint on the real Hilbert space X. The operator $A^{1/2}$ is well defined. Thus the operator $\tilde{A}$ (see (10)) on $\tilde{X} = D(A^{1/2}) \times X$ (see [18], [26]) is the infinitesimal generator of a $C_0$-semigroup $\tilde{S}(t)$ on $\tilde{X}$, $\|\tilde{S}(t)\| \le 1$, and the domain of $\tilde{A}$ is $D(\tilde{A}) = D(A) \times D(A^{1/2})$. In this case the inner product on $\tilde{X}$ is given by $\langle z, v\rangle = \langle A^{1/2} z_1, A^{1/2} v_1\rangle + \langle z_2, v_2\rangle$.

Remark 2. The operator G is bounded. Thus $\tilde{G}$ is bounded as well (see (10)). From the theorem on bounded perturbations of a generator [9], [18], [25], the operator $\tilde{A} + \tilde{G}$ (see (10)) is the infinitesimal generator of a $C_0$-semigroup on $\tilde{X}$.

Remark 3. In the real Hilbert space X and for the skew-adjoint operator G, the following equality is true:

$$\langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X + \tfrac{1}{2}\langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X - \tfrac{1}{2}\langle Gx, x\rangle_X = 0. \qquad (11)$$
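The identity (11) is what makes the gyroscopic term energy-neutral. A finite-dimensional illustration, with a randomly generated skew-symmetric matrix standing in for the operator G (an assumption made purely for this sketch), is:

```python
import numpy as np

rng = np.random.default_rng(1)

# Any matrix of the form R - R^T is skew-symmetric: G^T = -G
R = rng.standard_normal((5, 5))
G = R - R.T

x = rng.standard_normal(5)

# Finite-dimensional counterpart of <x, Gx>_X in (11)
quad = x @ G @ x
```

Up to rounding error `quad` is zero for every real vector `x`, which is why the term with G does not appear in the derivative of the Lyapunov functions used later in the paper.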

3 Stabilization in the Case Where Velocity Feedback Is Not Available
This section is devoted to the stabilization of the system (4), (7) in the case where only position feedback is available. The stabilizer will be constructed by placing actuators and sensors at the same location, which means that $C = B^*$ and consequently

$$y(t) = B^* x(t). \qquad (12)$$

We assume that the system (4) with the output (12) is approximately observable (see [1], [10], [11]).
Let us consider the linear dynamic feedback given by the formula (see also [13], [21])

$$u(t) = -K[w(t) + y(t)], \qquad (13)$$

$$\dot{w}(t) + A_w w(t) = B_w u(t), \quad w(0) = w_0, \qquad (14)$$

where $w(t) \in \mathbb{R}^m$, $A_w = \mathrm{diag}[\alpha_i]$, $B_w = \mathrm{diag}[\beta_i]$, $\alpha_i, \beta_i \in \mathbb{R}$, $\alpha_i > 0$, $\beta_i > 0$, $i = 1, 2, \ldots, m$, and $K = K^T > 0$ is a real positive definite matrix.
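To analyze the closed loop system, we first define the function space $Z = H^1(\Omega) \times L^2(\Omega) \times \mathbb{R}^m$ with the following inner product:

$$\langle z, \tilde{z}\rangle_Z = \langle Az_1, \tilde{z}_1\rangle_X + \langle z_2, \tilde{z}_2\rangle_X + z_3^T Q \tilde{z}_3 + (B^* z_1 + z_3)^T K (B^* \tilde{z}_1 + \tilde{z}_3), \qquad (15)$$

where $z = \mathrm{col}(z_1, z_2, z_3)$, $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2, \tilde{z}_3)$, $Q = \mathrm{diag}[\alpha_i/\beta_i] = A_w B_w^{-1}$. Let us note that the space Z with the inner product (15) is a Hilbert space. Now, the closed loop system (4), (12), (13), (14) can be written in the following abstract form:

$$\dot{z}(t) = Lz(t), \qquad (16)$$

where $z(t) = \mathrm{col}(x(t), \dot{x}(t), w(t))$ and $L : (D(L) \subset Z) \to Z$ is a linear operator defined as follows:

$$L = \begin{bmatrix} 0 & I & 0 \\ -A - BKB^* & -G & -BK \\ -B_w K B^* & 0 & -A_w - B_w K \end{bmatrix}. \qquad (17)$$

The closed loop system (16) can also be written in the following form [20]:

$$\begin{bmatrix} \dot{x}(t) \\ \ddot{x}(t) \\ \dot{w}(t) \end{bmatrix} = \begin{bmatrix} 0 & I & 0 \\ -A & -G & 0 \\ 0 & 0 & -A_w \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix} + \begin{bmatrix} 0 \\ B \\ B_w \end{bmatrix} u(t), \qquad (18)$$

$$s(t) = \begin{bmatrix} C_1 B^* & 0 & C_2 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix}, \qquad (19)$$

$$u(t) = -Ks(t), \qquad (20)$$

where the matrices $C_1 = C_2 = I$.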
Theorem 1. Suppose that the matrices $C_1$ and $C_2$ are real and invertible and the system (4), (12) is observable. Then the system (18), (19) is observable.

Proof. The system (18), (19) is observable if, for any complex number s, the system of equations

$$\begin{cases} s x_1 - x_2 = 0, \\ A x_1 + (G + sI) x_2 = 0, \\ (A_w + sI) x_3 = 0, \\ C_1 B^* x_1 + C_2 x_3 = 0 \end{cases} \qquad (21)$$

has no nonzero solution $x = \mathrm{col}(x_1, x_2, x_3)$ [11]. When $s \ne -\alpha_i$ for all $i = 1, 2, \ldots, m$, we have $x_3 = 0$ and (21) becomes

$$\begin{cases} s x_1 - x_2 = 0, \\ A x_1 + (G + sI) x_2 = 0, \\ B^* x_1 = 0. \end{cases} \qquad (22)$$

If the system (4), (12) is observable, (22) has no nonzero solution for any complex number s.
Next consider the case where $s = -\alpha_i$ for some $i = 1, 2, \ldots, m$. From (21) it follows that

$$A x_1 = (G\alpha_i - \alpha_i^2 I) x_1. \qquad (23)$$

From this it holds that

$$\langle A x_1, x_1\rangle_X = \langle (G\alpha_i - \alpha_i^2 I) x_1, x_1\rangle_X = -\alpha_i^2 \|x_1\|_X^2 \le 0, \qquad (24)$$

which implies that $x_1 = 0$, since A is positive, self-adjoint and has compact resolvent (see also lemma 2). Consequently, $x_2 = 0$, $x_3 = 0$. Therefore, the system (21) has no nonzero solution also for $s = -\alpha_i$. We have proved the theorem.

Theorem 2. Suppose that the system (4), (12) is approximately observable. Let us consider the system (16), where the operator L is given by (17). Then the following assertions are true:
(a) L is dissipative,
(b) $\mathrm{Ran}(\lambda_0 I - L) = Z$ for some $\lambda_0 > 0$,
(c) $\overline{D(L)} = Z$ and L is closed,
(d) The operator L generates a $C_0$-semigroup of contractions $T_L(t) \in L(Z)$, $t \ge 0$,
(e) The $C_0$-semigroup $T_L(t)$ generated by L is asymptotically stable.

Proof. (a) The linear operator L is dissipative if and only if

$$\|(\lambda I - L)z\|_Z \ge \lambda \|z\|_Z \quad \forall z \in D(L),\ \lambda > 0 \qquad (25)$$

(see [25]). In the real Hilbert space Z, the condition (25) is equivalent to

$$\|Lz\|_Z^2 - 2\lambda\langle Lz, z\rangle_Z \ge 0 \quad \forall z \in D(L),\ \lambda > 0. \qquad (26)$$

Using (15) and (17), we obtain

$$\begin{aligned} \langle Lz, z\rangle_Z ={}& \langle Az_2, z_1\rangle_X + \langle -Gz_2, z_2\rangle_X + \langle -(A + BKB^*)z_1 - BKz_3, z_2\rangle_X \\ &+ [-B_w K B^* z_1 - (A_w + B_w K) z_3]^T Q z_3 + [B^* z_2 - B_w K B^* z_1]^T K (B^* z_1 + z_3) \\ &+ [-(A_w + B_w K) z_3]^T K (B^* z_1 + z_3). \end{aligned} \qquad (27)$$

Simple calculations show that

$$\langle Lz, z\rangle_Z = -\hat{z}^T B_w \hat{z} \le 0, \qquad (28)$$

where

$$\hat{z} = K B^* z_1 + (Q + K) z_3. \qquad (29)$$

Since $\langle Lz, z\rangle_Z \le 0$, it follows that L is dissipative (see (26)).
(b) To prove the assertion (b), it is enough to show that for some $\lambda_0 > 0$, the operator $\lambda_0 I - L : Z \to Z$ is onto. Let $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2, \tilde{z}_3) \in Z$ be given. We have to find $z = \mathrm{col}(z_1, z_2, z_3) \in D(L)$ such that

$$(\lambda_0 I - L) z = \tilde{z}. \qquad (30)$$

Hence the following equations should hold:

$$\lambda_0 z_1 - z_2 = \tilde{z}_1, \qquad (31)$$

$$(A + BKB^*) z_1 + (\lambda_0 + G) z_2 + BK z_3 = \tilde{z}_2, \qquad (32)$$

$$B_w K B^* z_1 + (\lambda_0 I + A_w + B_w K) z_3 = \tilde{z}_3. \qquad (33)$$

From (31) and (33) we can determine $z_2$ and $z_3$:

$$z_2 = \lambda_0 z_1 - \tilde{z}_1, \qquad (34)$$

$$z_3 = (\lambda_0 I + A_w + B_w K)^{-1} (\tilde{z}_3 - B_w K B^* z_1). \qquad (35)$$

We can do this because the matrix $\lambda_0 I + A_w + B_w K$ is invertible (see lemma 1). Using (34) and (35) in (32) we obtain

$$\{\lambda_0^2 + \lambda_0 G + A + B[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1} B^*\} z_1 = (\lambda_0 + G)\tilde{z}_1 + \tilde{z}_2 - BK(\lambda_0 I + A_w + B_w K)^{-1} \tilde{z}_3. \qquad (36)$$

Define $\Gamma(\lambda_0)$ by

$$\Gamma(\lambda_0) = \lambda_0^2 + \lambda_0 G + A + B[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1} B^*. \qquad (37)$$

We know that $Q = Q^T > 0$ and $\lambda_0 B_w^{-1} = \mathrm{diag}[\lambda_0 \beta_i^{-1}]$. Moreover, the inverse of a real, symmetric and positive definite matrix is also a symmetric and positive definite matrix (see [32]). Hence, the matrix $[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1}$ exists and is symmetric and positive definite. Thus the operator $\Gamma(\lambda_0)$ is a closed operator with domain $D(\Gamma) = D(A)$ dense in X. Additionally,

$$\langle \Gamma z_1, z_1\rangle_X \ge (\lambda_0^2 + \delta) \|z_1\|_X^2, \qquad (38)$$

where the constant $\delta > 0$ can be determined by using lemmas 2 and 3; the term $\lambda_0 \langle G z_1, z_1\rangle_X$ vanishes by (11). This means that the operator $\Gamma(\lambda_0)$ is invertible and the equation (36) has a unique solution $z_1 \in D(A)$. The remaining unknowns $z_2 \in H^1(\Omega)$ and $z_3 \in \mathbb{R}^m$ can be uniquely determined from (34) and (35). This completes the proof of (b).
(c) If $\mathrm{Ran}(\lambda_0 I - L) = Z$ for some $\lambda_0 > 0$, then $\mathrm{Ran}(\lambda I - L) = Z$ for all $\lambda > 0$ [25]. Let us note that $\mathrm{Ran}(\lambda I - L) = Z$ also for $\lambda = 0$. Now, we know that the operator L is dissipative, the Hilbert space Z is reflexive and $\mathrm{Ran}(I - L) = Z$. All these properties imply that $\overline{D(L)} = Z$ and L is closed [25].
(d) Because of (a), (b) and (c), the statement that the operator L generates a $C_0$-semigroup of contractions $T_L(t) \in L(Z)$, $t \ge 0$, follows from the Lumer-Phillips theorem [6], [16], [17], [25].
(e) The asymptotic stability of the closed loop system (16) can be proved by LaSalle's invariance principle [15] extended to infinite dimensional systems [7], [8], [17], [29]. We introduce the following Lyapunov function:

$$V(x(t), \dot{x}(t), w(t)) = \tfrac{1}{2}\langle \dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2} w(t)^T Q w(t) + \tfrac{1}{2}[w(t) + B^* x(t)]^T K [w(t) + B^* x(t)], \qquad (39)$$

where $Q = \mathrm{diag}[\alpha_i/\beta_i] = A_w B_w^{-1}$. We can notice that $V(x, \dot{x}, w) = 0$ if and only if $\mathrm{col}(x, \dot{x}, w) = 0$. Otherwise $V(x, \dot{x}, w) > 0$. Taking the derivative of V with respect to time, we obtain

$$\dot{V}(x(t), \dot{x}(t), w(t)) = \langle \ddot{x}(t), \dot{x}(t)\rangle_X + \langle Ax(t), \dot{x}(t)\rangle_X + w(t)^T Q \dot{w}(t) + [w(t) + B^* x(t)]^T K [\dot{w}(t) + B^* \dot{x}(t)]. \qquad (40)$$

Along trajectories of the closed loop system (16) it holds that

$$\dot{V}(x(t), \dot{x}(t), w(t)) = -s(t)^T B_w s(t) \le 0, \qquad (41)$$

where $s(t) = K B^* x(t) + (Q + K) w(t)$. According to LaSalle's theorem, all solutions of (16) asymptotically tend to the maximal invariant subset of the set

$$S = \{ z \in Z : z = \mathrm{col}(x, \dot{x}, w),\ \dot{V}(z) = 0 \}, \qquad (42)$$

provided that the solution trajectories for $t \ge 0$ are precompact in Z. From $\dot{V} = 0$ we have $s(t) = 0$ (see (19) for $C_1 = K$, $C_2 = Q + K$). The system (18), (19) is observable (see theorem 1), thus we have $x = 0$, $\dot{x} = 0$, $w = 0$ and finally the largest invariant set contained in S is the set $\{0\}$.
The trajectories of the closed loop system (16) are precompact in Z if the set

$$\gamma(z^0) = \bigcup_{t \ge 0} T_L(t) z^0, \quad z^0 = z(0) \in D(L), \qquad (43)$$

is precompact in Z. Since the operator L generates a $C_0$-semigroup of contractions on Z, the solution trajectories $\{T_L(t)z^0,\ t \ge 0\}$ are bounded in Z. The precompactness of the solution trajectories is guaranteed if the operator $(\lambda I - L)^{-1} : Z \to Z$ is compact for some $\lambda > 0$ [2], [27]. We first notice that $\Gamma(\lambda)^{-1}$ exists for $\lambda \ge 0$ and is bounded. Therefore the operator $(\lambda I - L)^{-1}$ for $\lambda \ge 0$ exists and is bounded as well. Since the embedding of $H^2(\Omega) \times H^1(\Omega) \times \mathbb{R}^m$ into $H^1(\Omega) \times L^2(\Omega) \times \mathbb{R}^m$ is compact [30], it follows that $(\lambda I - L)^{-1} : Z \to Z$ is a compact operator. We have proved the theorem.
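The dissipativity identity (28) and the stability claim can be checked on a small matrix analogue of the closed loop operator (17). The sketch below is only an illustration under stated assumptions: the matrices A, G, B and the gains K, A_w, B_w are arbitrary finite-dimensional stand-ins chosen here (two degrees of freedom, one collocated actuator and sensor), not data from the paper.

```python
import numpy as np

# Assumed finite-dimensional stand-ins for the operators in (4), (13), (14)
A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # A = A^T > 0
G = np.array([[0.0, -1.0], [1.0, 0.0]])    # G = -G^T (gyroscopic)
B = np.array([[1.0], [0.0]])               # collocated input/output, C = B^T
K = np.array([[1.0]])                      # K = K^T > 0
Aw = np.diag([1.0])                        # alpha_i > 0
Bw = np.diag([1.0])                        # beta_i > 0
Q = Aw @ np.linalg.inv(Bw)

n, m = A.shape[0], K.shape[0]

# Matrix version of the closed loop operator L in (17)
L = np.block([
    [np.zeros((n, n)), np.eye(n), np.zeros((n, m))],
    [-A - B @ K @ B.T, -G, -B @ K],
    [-Bw @ K @ B.T, np.zeros((m, n)), -Aw - Bw @ K],
])

def inner_Z(z, v):
    """Inner product (15) on the finite-dimensional analogue of Z."""
    z1, z2, z3 = z[:n], z[n:2 * n], z[2 * n:]
    v1, v2, v3 = v[:n], v[n:2 * n], v[2 * n:]
    return (z1 @ A @ v1 + z2 @ v2 + z3 @ Q @ v3
            + (B.T @ z1 + z3) @ K @ (B.T @ v1 + v3))

rng = np.random.default_rng(0)
z = rng.standard_normal(2 * n + m)

# Left and right sides of (28), with z_hat taken from (29)
z_hat = K @ B.T @ z[:n] + (Q + K) @ z[2 * n:]
lhs = inner_Z(L @ z, z)
rhs = -z_hat @ Bw @ z_hat

# Largest real part of the closed loop eigenvalues
max_real = np.linalg.eigvals(L).real.max()
```

For this example `lhs` and `rhs` agree, and `max_real` is negative, in line with assertions (a) and (e) of Theorem 2. The infinite-dimensional proof needs the extra precompactness argument, which has no counterpart in the matrix case.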

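Lemma 1. The matrix $\lambda_0 I + A_w + B_w K$ is invertible for $\lambda_0 \ge 0$.
Proof. We can notice that $\lambda_0 I + A_w + B_w K = \tilde{A}_w + B_w K$, where $\tilde{A}_w = \mathrm{diag}[\lambda_0 + \alpha_i]$, $i = 1, 2, \ldots, m$. The matrix $B_w = \mathrm{diag}[\beta_i]$, and therefore $B_w^{-1} = \mathrm{diag}[\beta_i^{-1}]$. Hence $\tilde{A}_w + B_w K = B_w (B_w^{-1} \tilde{A}_w + K)$, where $B_w^{-1} \tilde{A}_w = \mathrm{diag}[\beta_i^{-1}(\lambda_0 + \alpha_i)]$. The matrix $B_w^{-1} \tilde{A}_w + K$ is symmetric and positive definite. This means that $(B_w^{-1} \tilde{A}_w + K)^{-1}$ exists. The proof is concluded by the remark that the product of invertible matrices is also invertible (see [32]).

Lemma 2. The operator A satisfies the following condition:

$$\langle Ax, x\rangle_X \ge \lambda_{\min} \|x\|_X^2, \qquad (44)$$

where $\lambda_{\min} = \min\{\lambda_n : \lambda_n \in \sigma(A),\ n = 1, 2, \ldots\}$ and $\sigma(A)$ stands for the discrete spectrum of A.

Lemma 3. For any real and positive definite matrix $\tilde{K} = \tilde{K}^T > 0$ there exists $\delta > 0$ such that

$$\langle (A + B\tilde{K}B^*)x, x\rangle_X = \langle Ax, x\rangle_X + (B^* x)^T \tilde{K} (B^* x) \ge \delta \|x\|_X^2. \qquad (45)$$

Lemmas 2 and 3 can be proved by using the following expansions in the Hilbert space X:

$$x = \sum_{i=1}^{\infty} \sum_{k=1}^{r_i} \langle x, \upsilon_{ik}\rangle_X\, \upsilon_{ik}, \quad x \in X, \qquad (46)$$

$$Ax = \sum_{i=1}^{\infty} \sum_{k=1}^{r_i} \lambda_i \langle x, \upsilon_{ik}\rangle_X\, \upsilon_{ik}, \quad x \in D(A). \qquad (47)$$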

4 Stabilization in the Case Where Velocity Feedback Is Available
In this section we consider the dynamic feedback

$$u(t) = -K_1 y(t) - K_2 \dot{y}(t), \qquad (48)$$

where $K_1 = K_1^T \ge 0$ and $K_2 = K_2^T > 0$ are real matrices. The control function (48) is applied to the system (4) with the output (12). The resulting closed loop system becomes

$$\ddot{x}(t) + (G + BK_2B^*)\dot{x}(t) + (A + BK_1B^*)x(t) = 0. \qquad (49)$$

We can reformulate the system (49) as a set of first-order equations. First, we introduce the Hilbert space $Z = H^1(\Omega) \times L^2(\Omega)$ with the following inner product:

$$\langle z, \tilde{z}\rangle_Z = \langle Az_1, \tilde{z}_1\rangle_X + \langle z_2, \tilde{z}_2\rangle_X + (B^* z_1)^T K_1 (B^* \tilde{z}_1), \qquad (50)$$

where $z = \mathrm{col}(z_1, z_2)$, $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2)$. In the new function space Z, the closed loop system (49) can be written in the abstract form

$$\dot{z}(t) = Lz(t), \qquad (51)$$

where $z(t) = \mathrm{col}(x(t), \dot{x}(t))$ and $L : (D(L) \subset Z) \to Z$ is a linear operator defined as follows:

$$L = \begin{bmatrix} 0 & I \\ -A - BK_1B^* & -G - BK_2B^* \end{bmatrix}. \qquad (52)$$

Theorem 3. Suppose that the system (4), (12) is approximately observable. Let us consider the system (51), where the operator L is given by (52). Then the following assertions are true:
(a) L is dissipative,
(b) $\mathrm{Ran}(\lambda_0 I - L) = Z$ for some $\lambda_0 > 0$,
(c) $\overline{D(L)} = Z$ and L is closed,
(d) The operator L generates a $C_0$-semigroup of contractions $T_L(t) \in L(Z)$, $t \ge 0$,
(e) The $C_0$-semigroup $T_L(t)$ generated by L is asymptotically stable.
Proof. The proof is carried out by the same method as the proof of theorem 2. The Lyapunov function for the system (51) is given by

$$V(z(t)) = \tfrac{1}{2}\langle \dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2}[B^* x(t)]^T K_1 [B^* x(t)], \qquad (53)$$

and the stability of the closed loop system is a consequence of LaSalle's theorem.
5 Illustrative Examples
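To illustrate our theory we consider the motion of a taut string, rotating about its ξ-axis with constant angular velocity ω (Fig. 1). It was shown in [3] and [4] that the small oscillations of such a string are governed by the system of partial differential equations

$$\begin{cases} \dfrac{\partial^2 x_1(t,\xi)}{\partial t^2} - 2\omega \dfrac{\partial x_2(t,\xi)}{\partial t} - \omega^2 x_1(t,\xi) - \dfrac{\partial^2 x_1(t,\xi)}{\partial \xi^2} = b(\xi)u(t), \\[2mm] \dfrac{\partial^2 x_2(t,\xi)}{\partial t^2} + 2\omega \dfrac{\partial x_1(t,\xi)}{\partial t} - \omega^2 x_2(t,\xi) - \dfrac{\partial^2 x_2(t,\xi)}{\partial \xi^2} = 0, \end{cases} \qquad (54)$$

Fig. 1. Small oscillations of a taut rotating string

where $t > 0$, $\xi \in (0, 1)$. The boundary conditions are of the form

$$x_1(t, 0) = x_1(t, 1) = 0, \quad x_2(t, 0) = x_2(t, 1) = 0, \qquad (55)$$

and the initial conditions are

$$x_1(0, \xi) = 0.1(1 - \xi)\xi, \quad \dot{x}_1(0, \xi) = 0, \quad x_2(0, \xi) = 0, \quad \dot{x}_2(0, \xi) = 0. \qquad (56)$$

The function $b : [0, 1] \to \mathbb{R}$ is defined as follows:

$$b(\xi) = \begin{cases} 1, & \text{for } 0.7 \le \xi \le 1.0, \\ 0, & \text{otherwise.} \end{cases} \qquad (57)$$

Then we find that the system (54) can be written in the form (4), where $X = L^2((0, 1), \mathbb{R}^2)$,

$$A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -x_1'' - \omega^2 x_1 \\ -x_2'' - \omega^2 x_2 \end{bmatrix}, \qquad (58)$$

$$D(A) = \{ x \in H^2((0, 1), \mathbb{R}^2) : x = [x_1\ x_2]^T,\ x_i(0) = x_i(1) = 0,\ i = 1, 2 \}, \qquad (59)$$

$$Bu(t) = \begin{bmatrix} b(\xi) \\ 0 \end{bmatrix} u(t), \quad \xi \in [0, 1],\ t \ge 0, \qquad (60)$$

$$G \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -2\omega x_2 \\ 2\omega x_1 \end{bmatrix}, \quad D(G) = X, \qquad (61)$$

(see also [3] and [4]). The output of the system (54) is calculated in the following way:

$$y(t) = B^* x(t) = \left\langle \begin{bmatrix} b \\ 0 \end{bmatrix}, \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right\rangle_X. \qquad (62)$$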
The open loop system is not asymptotically stable. In order to stabilize the system we can use one of the following controllers:

$$\dot{w}(t) + 0.5w(t) = 0.1u(t), \quad u(t) = -100.0[w(t) + y(t)], \qquad (63)$$

with $w(0) = 0.5$, or

$$u(t) = -10.0y(t) - 20.0\dot{y}(t). \qquad (64)$$

Simulation results are presented in Figs. 2 and 3. For comparison purposes, the output y(t) of the open loop system (dotted line) is given together with the output of the closed loop system (solid line).

Fig. 2. Effects of using the controller (63) in stabilization of the system (54) (output y plotted against t from 0 to 20)

Fig. 3. Effects of using the controller (64) in stabilization of the system (54) (output y plotted against t from 0 to 20)
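The example can be explored numerically with a modal (Galerkin) truncation of (54) in the sine basis $\sqrt{2}\sin(n\pi\xi)$. The sketch below is not the simulation used for Figs. 2 and 3: the angular velocity `OMEGA` and the number of modes `N_MODES` are assumptions (the text gives no numeric value for ω), and only the closed loop spectra for the controllers (63) and (64) are computed.

```python
import numpy as np

OMEGA = 1.0     # assumed angular velocity (not specified in the text)
N_MODES = 4     # assumed truncation order

n = np.arange(1, N_MODES + 1)
# Eigenvalues of A in (58) for the basis sqrt(2) sin(n pi xi)
lam = (n * np.pi) ** 2 - OMEGA ** 2
# Modal input coefficients: integral of b(xi) sqrt(2) sin(n pi xi) over [0.7, 1]
b = np.sqrt(2.0) * (np.cos(0.7 * n * np.pi) - np.cos(n * np.pi)) / (n * np.pi)

I = np.eye(N_MODES)
Z = np.zeros((N_MODES, N_MODES))
A = np.diag(np.concatenate([lam, lam]))        # modes of x1, then modes of x2
G = 2.0 * OMEGA * np.block([[Z, -I], [I, Z]])  # truncation of (61)
B = np.concatenate([b, np.zeros(N_MODES)]).reshape(-1, 1)  # truncation of (60)
d = 2 * N_MODES

def first_order(stiffness, damping):
    """State matrix of q'' + damping q' + stiffness q = 0."""
    return np.block([[np.zeros((d, d)), np.eye(d)],
                     [-stiffness, -damping]])

open_loop = first_order(A, G)

# Controller (64): u = -10 y - 20 y'
K1, K2 = 10.0, 20.0
closed_pd = first_order(A + K1 * B @ B.T, G + K2 * B @ B.T)

# Controller (63): w' + 0.5 w = 0.1 u, u = -100 (w + y); structure as in (17)
alpha, beta, K = 0.5, 0.1, 100.0
closed_pc = np.block([
    [np.zeros((d, d)), np.eye(d), np.zeros((d, 1))],
    [-A - K * B @ B.T, -G, -K * B],
    [-beta * K * B.T, np.zeros((1, d)), np.array([[-alpha - beta * K]])],
])

open_eigs = np.linalg.eigvals(open_loop)
decay_pd = np.linalg.eigvals(closed_pd).real.max()
decay_pc = np.linalg.eigvals(closed_pc).real.max()
```

In this truncation the open loop eigenvalues are purely imaginary, with frequencies $n\pi \pm \omega$, while both closed loops have all eigenvalues in the open left half-plane. The truncation order is kept small on purpose: the modal coefficient of b vanishes for n = 20, so such a mode would not be reached by the actuator in a truncated model.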

6 Concluding Remarks

We have investigated the stabilization of distributed parameter gyroscopic systems which are represented by second-order operator equations. The systems have an infinite number of poles on the imaginary axis. The assumption that the input and output operators are collocated has played an important role in the stabilization process. We have proposed a linear dynamic velocity feedback and a linear dynamic position feedback. In the case where velocity is not available, a parallel compensator is necessary to stabilize the system. The asymptotic stability of the closed loop system in both cases has been proved by LaSalle's invariance principle extended to infinite dimensional systems. Numerical simulation results have shown the effectiveness of the proposed controllers.

Acknowledgement
This work was supported by Ministry of Science and Higher Education in Poland
in the years 2008–2011 as a research project No N N514 414034.

References
[1] Curtain, R.F., Pritchard, A.J.: Infinite dimensional linear systems theory.
Springer, Heidelberg (1978)
[2] Dafermos, C.M., Slemrod, M.: Asymptotic behaviour of nonlinear contraction
semigroups. J. Funct. Anal. 13(1), 97–106 (1973)
[3] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Multi-input partial pole placement for
distributed parameter gyroscopic systems. In: Proc. of the 39th IEEE International
Conference on Decision and Control, Sydney (2000)
[4] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Single-input partial pole-assignment
in gyroscopic quadratic matrix and operator pencils. In: Proc. of the 14th International
Symposium of Mathematical Theory of Networks and Systems MTNS
2000, Perpignan, France (2000)
[5] Dunford, N., Schwartz, J.T.: Linear operators. Part II. Spectral theory. Self adjoint
operators in Hilbert space. Interscience, New York (1963)
[6] Engel, K.J., Nagel, R.: One-parameter semigroups for linear evolution equation.
Springer, New York (2000)
[7] Hale, J.K.: Dynamical systems and stability. J. Math. Anal. Appl. 26(1), 39–59
(1969)
[8] Hale, J.K., Infante, E.F.: Extended dynamical systems and stability theory. Proc.
Natl. Acad. Sci. USA 58(2), 405–409 (1967)
[9] Kato, T.: Perturbation theory for linear operators. Springer, New York (1980)
[10] Klamka, J.: Controllability of dynamical systems. PWN, Warszawa (1990) (in
Polish)
[11] Kobayashi, T.: Frequency domain conditions of controllability and observability
for a distributed parameter system with unbounded control and observation. Int.
J. Syst. Sci. 23(12), 2369–2376 (1992)
[12] Kobayashi, T.: Low gain adaptive stabilization of undamped second order systems.
Arch. Control Sci. 11(XLVII) (1-2), 63–75 (2001)
[13] Kobayashi, T.: Stabilization of infinite-dimensional undamped second order systems
by using a parallel compensator. IMA J. Math. Control Inf. 21(1), 85–94
(2004)
[14] Kobayashi, T., Oya, M.: Adaptive stabilization of infinite-dimensional undamped
second order systems without velocity feedback. Arch. Control Sci. 14(L) (1), 73–84
(2004)
[15] La Salle, J., Lefschetz, S.: Stability by Liapunov’s direct method with applications.
PWN, Warszawa (1966) (in Polish)
[16] Lumer, G., Phillips, R.S.: Dissipative operators in a Banach space. Pacific J.
Math. 11(2), 679–698 (1961)
[17] Luo, Z., Guo, B., Morgül, Ö.: Stability and stabilization of infinite dimensional
systems with applications. Springer, London (1999)
[18] Mitkowski, W.: Stabilization of dynamic systems. WNT, Warszawa (1991) (in
Polish)
[19] Mitkowski, W.: Dynamic feedback in LC ladder network. Bull. Pol. Acad. Sci.
Tech. Sci. 51(2), 173–180 (2003)
[20] Mitkowski, W.: Stabilisation of LC ladder network. Bull. Pol. Acad. Sci. Tech.
Sci. 52(2), 109–114 (2004)
[21] Mitkowski, W., Skruch, P.: Stabilization of second-order systems by linear position
feedback. In: Proc. of the 10th IEEE International Conference on Methods and
Models in Automation and Robotics, Miȩdzyzdroje, Poland, August 30–September
2, 2004, pp. 273–278 (2004)
[22] Mizohata, S.: The theory of partial differential equations. Cambridge Univ. Press,
Cambridge (1973)
[23] Morgül, Ö.: A dynamic control law for the wave equation. Automatica 30(11),
1785–1792 (1994)
[24] Morgül, Ö.: Stabilization and disturbance rejection for the wave equation. IEEE
Trans. Autom. Control 43(1), 89–95 (1998)
[25] Pazy, A.: Semigroups of linear operators and applications to partial differential
equations. Springer, New York (1983)
[26] Pritchard, A.J., Zabczyk, J.: Stability and stabilizability of infinite dimensional
systems. SIAM Rev. 23(1), 25–52 (1981)
[27] Saperstone, S.: Semidynamical systems in infinite dimensional spaces. Springer,
New York (1981)
[28] Skruch, P.: Stabilization of second-order systems by non-linear feedback. Int. J.
Appl. Math. Comput. Sci. 14(4), 455–460 (2004)
[29] Slemrod, M.: Stabilization of boundary control systems. J. Differ. Equations 22(2),
402–415 (1976)
[30] Tanabe, H.: Equations of evolution. Pitman, London (1979)
[31] Taylor, A.E., Lay, D.C.: Introduction to functional analysis. John Wiley & Sons,
New York (1980)
[32] Turowicz, A.: Theory of matrix, 5th edn., AGH, Kraków (1995) (in Polish)
Stabilization Results of Second-Order Systems
with Delayed Positive Feedback

Wojciech Mitkowski¹ and Pawel Skruch²

¹ AGH University of Science and Technology, Institute of Automatics,
al. Mickiewicza 30/B1, 30-059 Kraków, Poland
wmi@ia.agh.edu.pl
² AGH University of Science and Technology, Institute of Automatics,
al. Mickiewicza 30/B1, 30-059 Kraków, Poland
pawel.skruch@agh.edu.pl

Abstract. Oscillation and nonoscillation criteria are established for second-order systems with delayed positive feedback. We consider the stability conditions for the system without damping and with gyroscopic effect. A general algorithm for finding stability regions is proposed. Theoretical and numerical results are presented for the single-input single-output case. These results improve some oscillation criteria of [1], [2] and [6].

1 Introduction
The paper expands on a method proposed in [1], [2] and [6] for stabilizing second-order systems with delayed positive feedback. The system is described by the linear second-order differential equations

$$\ddot{x}(t) + G\dot{x}(t) + Ax(t) = Bu(t), \qquad (1)$$

$$y(t) = Cx(t), \qquad (2)$$

where $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}$, $y(t) \in \mathbb{R}$, $t \ge 0$. Here $\mathbb{R}^n$ and $\mathbb{R}$ are real vector spaces of column vectors, x(t), u(t), y(t) are vectors of states, inputs and outputs, respectively, $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times 1}$, $C \in \mathbb{R}^{1 \times n}$, $G \in \mathbb{R}^{n \times n}$. We assume that the matrix $A = A^T > 0$ is positive definite and that all eigenvalues of A have multiplicity one. The matrix $G = -G^T$ is called a skew-symmetric (gyroscopic) matrix. If we take the Laplace transform of (1) and (2) with zero initial conditions, we obtain

$$G_1(s) = \frac{Y(s)}{U(s)} = C(s^2 + sG + A)^{-1}B. \qquad (3)$$

In [7], it has been proved that the system (1) is not asymptotically stable. The eigenvalues of (1) are different from zero, occur in complex conjugate pairs and are located on the imaginary axis.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 99–108.
springerlink.com c Springer-Verlag Berlin Heidelberg 2009

In this paper, in order to stabilize the system (1), we use the following positive time-delay feedback:

$$u(t) = ky(t - \tau), \qquad (4)$$

where $k > 0$, $\tau > 0$ and $y(t) = 0$ for $t \in [-\tau, 0)$. Taking the Laplace transform of (4), we obtain

$$U(s) = G_2(s)Y(s), \qquad (5)$$

where

$$G_2(s) = ke^{-s\tau}. \qquad (6)$$

The closed loop system (see Fig. 1) is defined by the following transfer function:

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)}. \qquad (7)$$

Fig. 1. Closed loop system

We try to determine the range of allowable delays that guarantees stability of the system (7). The analysis of the closed loop system is made by using the Nyquist criterion (see for example [5]). Because all open loop poles are located on the imaginary axis, the system represented by the transfer function (7) will be asymptotically stable if there are no clockwise encirclements of the (−1, j0) point by the Nyquist plot of $G_{12}(s) = -G_1(s)G_2(s)$. If $j\omega^*$ is a pole of $G_{12}(s)$ of multiplicity $m^*$, then the "points" $G_{12}(j\omega^*_-)$ and $G_{12}(j\omega^*_+)$ are connected in the clockwise direction by a circular arc of radius $R = \infty$ and angle $\phi = m^*\pi$, which is centered at the origin.
Time-delay systems have been analyzed by many authors; see, for example, [3] and [4]. A good source of references to papers in which stabilization problems are treated can be found in [5].

2 System without Damping

Let us consider the controllable second-order system without damping

$$\ddot{x}(t) + Ax(t) = Bu(t), \qquad (8)$$

$$y(t) = B^T x(t), \qquad (9)$$

where $x(t) \in \mathbb{R}^2$, $u(t) \in \mathbb{R}$, $y(t) \in \mathbb{R}$, $t \ge 0$, $A \in L(\mathbb{R}^2, \mathbb{R}^2)$, $A = A^T > 0$, $B \in L(\mathbb{R}, \mathbb{R}^2)$. For the purpose of theoretical and numerical analysis we assume that

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \qquad (10)$$

If we take the Laplace transform of (8), (9) with zero initial conditions, we obtain

$$G_1(s) = \frac{Y(s)}{U(s)} = B^T (s^2 + A)^{-1} B. \qquad (11)$$

Simple calculations show that

$$G_1(s) = \frac{s^2 + 2}{(s^2 + 1)(s^2 + 3)}. \qquad (12)$$
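The step from (11) to (12) can be confirmed numerically by evaluating both expressions at a few complex test points. The helper names and the sample points below are chosen here for illustration only.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # matrices from (10)
B = np.array([[1.0], [0.0]])

def g1_from_matrices(s):
    """Evaluate B^T (s^2 I + A)^{-1} B, i.e. formula (11)."""
    return (B.T @ np.linalg.solve(s ** 2 * np.eye(2) + A, B)).item()

def g1_closed_form(s):
    """Evaluate the rational function (12)."""
    return (s ** 2 + 2) / ((s ** 2 + 1) * (s ** 2 + 3))

# Arbitrary test points away from the poles
samples = [0.5, 2.0j, 1.0 + 1.0j, -0.3 + 0.7j]
agree = all(np.isclose(g1_from_matrices(s), g1_closed_form(s)) for s in samples)

# Poles of (12): roots of s^4 + 4 s^2 + 3 = (s^2 + 1)(s^2 + 3)
poles = np.roots([1.0, 0.0, 4.0, 0.0, 3.0])
```

The roots returned in `poles` lie on the imaginary axis at $\pm j$ and $\pm\sqrt{3}j$, so the undamped open loop system cannot be asymptotically stable.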

The open loop system represented by the transfer function (12) is not asymp-
totically stable.√The poles of√ (12) are located on the imaginary axis: s1 = j,
s2 = −j, s3 = 3j, s4 = − 3j, j 2 = −1. In order to stabilize the system we
consider the following feedback:

U (s) = G2 (s)Y (s), G2 (s) = ke−sτ . (13)

In this case the closed loop system (11), (13) with the matrices (10) is described
by the transfer function

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)} = \frac{k(s^2 + 2)e^{-s\tau}}{s^4 + 4s^2 + 3 - k(s^2 + 2)e^{-s\tau}}. \tag{14}$$

The stability of the closed loop system (14) will be checked by exploring the
Nyquist plot of
G12 (s) = −G1 (s)G2 (s). (15)
The Nyquist plot allows us to gain insight into stability of the closed loop system
by analyzing the contour of the frequency response function G12 (jω) on the
complex plane. In this case

G12 (jω) = Re G12 (jω) + j Im G12 (jω), (16)

where
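$$\operatorname{Re} G_{12}(j\omega) = k\,\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\cos(\omega\tau), \tag{17}$$

$$\operatorname{Im} G_{12}(j\omega) = -k\,\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\sin(\omega\tau). \tag{18}$$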
We will try to plot the graph (16) only for positive frequency, that is for ω ∈
[0, +∞). The second half of the curve can be achieved by reflecting it over the
real axis. The magnitude and the phase of the function (16) are given by
$$|G_{12}(j\omega)| = k\left|\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\right|, \tag{19}$$

$$\tan\theta(\omega) = -\tan(\omega\tau). \tag{20}$$
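The real and imaginary parts (17) and (18) can be checked against a direct complex evaluation of $G_{12}(j\omega) = -G_1(j\omega)G_2(j\omega)$. The sketch below is only an illustration: the function names are hypothetical, and the sample frequencies are arbitrary values away from the poles.

```python
import cmath
import math

def g12(omega, k, tau):
    """Direct evaluation of G12(j*omega) = -G1(j*omega) * k * exp(-j*omega*tau)."""
    s = 1j * omega
    g1 = (s**2 + 2) / ((s**2 + 1) * (s**2 + 3))
    return -g1 * k * cmath.exp(-s * tau)

def g12_real_imag(omega, k, tau):
    """Real and imaginary parts according to (17) and (18)."""
    ratio = (omega**2 - 2) / (omega**4 - 4 * omega**2 + 3)
    return k * ratio * math.cos(omega * tau), -k * ratio * math.sin(omega * tau)

k, tau = 1.2, 0.9  # parameters of point A discussed in the text
for omega in (0.0, 0.5, 2.0):
    z = g12(omega, k, tau)
    re, im = g12_real_imag(omega, k, tau)
    # Last column: mismatch between the two evaluations
    print(omega, z, abs(z - complex(re, im)))
```

At $\omega = 0$ the direct evaluation gives $-\frac{2}{3}k$, which is the starting point of the Nyquist curve used in the stability argument.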


If $\omega = 0$, then $\operatorname{Im} G_{12}(j\omega) = 0$ and $\operatorname{Re} G_{12}(j\omega) = -\frac{2}{3}k$. A necessary condition for stability is therefore $k < \frac{3}{2}$, which means that the starting point of the curve $G_{12}(j\omega)$ lies to the right of the (−1, j0) point.
Let us consider what happens with the plot $G_{12}(j\omega)$ for $\omega \in (0, 1) \cup (1, \sqrt{3}) \cup (\sqrt{3}, +\infty)$. In this case the magnitude (19) is finite, therefore we need to find all intersections of the polar plot with the negative real axis. They take place at the frequencies satisfying the following conditions:

Im G12 (jω) = 0 and Re G12 (jω) < 0. (21)

At these frequencies the magnitude (19) must be less than 1, i.e.

$$|G_{12}(j\omega)| = k\left|\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\right| < 1. \tag{22}$$

Then we can be sure that there are no encirclements of the (−1, j0) point. The first condition $\operatorname{Im} G_{12}(j\omega) = 0$ holds when
$$\omega = \sqrt{2} \quad\text{or}\quad \omega = \frac{n\pi}{\tau}, \tag{23}$$
for all $n = 1, 2, \ldots$. At $\omega = \sqrt{2}$ we have $\operatorname{Re} G_{12}(j\omega) = 0$, so the magnitude $|G_{12}(j\omega)| = 0$. At $\omega = \frac{n\pi}{\tau}$, $n = 1, 2, \ldots$, the condition $\operatorname{Re} G_{12}(j\omega) < 0$ is equivalent to
$$(-1)^n k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4} < 0, \tag{24}$$
for $n = 1, 2, \ldots$.

Now, let us consider what happens at $\omega = 1$ and $\omega = \sqrt{3}$. Since the magnitude (19) is infinite at these frequencies, we need to be sure that
$$\lim_{\omega \to 1^-} \operatorname{Im} G_{12}(j\omega) > 0 \tag{25}$$
and
$$\lim_{\omega \to \sqrt{3}^-} \operatorname{Im} G_{12}(j\omega) > 0. \tag{26}$$
Then the "points" $G_{12}(j_-)$ and $G_{12}(j_+)$ (or $G_{12}(j\sqrt{3}_-)$ and $G_{12}(j\sqrt{3}_+)$) will be connected by the polar plot in the clockwise direction by a circular arc of radius $R = \infty$ and angle $\varphi = \pi$, centered at the origin. In other words, the (−1, j0) point will not be encircled by the curve $G_{12}(j\omega)$. For $k > 0$ the inequalities (25) and (26) are equivalent, respectively, to
$$\sin\tau > 0 \quad\text{and}\quad \sin(\sqrt{3}\tau) > 0. \tag{27}$$

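Combining all conditions, we have:

(a) $k \in (0, \frac{3}{2})$, $\sin\tau > 0$, $\sin(\sqrt{3}\tau) > 0$,

(b) If there exists $\omega_0 = \frac{n\pi}{\tau} \notin \{1, \sqrt{3}\}$, $n = 1, 2, \ldots$, such that
$$(-1)^n k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4} < 0,$$
then it must satisfy
$$k\left|\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4}\right| < 1.$$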
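Conditions (a) and (b) lend themselves to a direct numerical test of a given pair $(k, \tau)$. The following Python function is a minimal sketch under stated assumptions: the name `satisfies_conditions` and the frequency cut-off `omega_max` are hypothetical, and the cut-off relies on the fact that the magnitude (19) decays like $k/\omega^2$ for large $\omega$.

```python
import math

def satisfies_conditions(k, tau, omega_max=10.0):
    """Check conditions (a) and (b) for the undamped example (14).

    omega_max is an assumed cut-off: only crossings n*pi/tau up to this
    frequency are inspected, since the magnitude (19) is small beyond it.
    """
    # Condition (a): gain range and sign conditions at the two poles
    if not (0.0 < k < 1.5) or tau <= 0.0:
        return False
    if math.sin(tau) <= 0.0 or math.sin(math.sqrt(3.0) * tau) <= 0.0:
        return False
    # Condition (b): crossings of the real axis at omega = n*pi/tau
    n = 1
    while n * math.pi / tau <= omega_max:
        omega = n * math.pi / tau
        den = omega**4 - 4.0 * omega**2 + 3.0
        if den != 0.0:  # omega equal to a pole is excluded in (b)
            re = (-1) ** n * k * (omega**2 - 2.0) / den
            if re < 0.0 and abs(re) >= 1.0:
                return False
        n += 1
    return True

print(satisfies_conditions(1.2, 0.9))  # point A of the text
print(satisfies_conditions(1.0, 3.0))  # point C of the text
```

Scanning such a test over a grid of $(k, \tau)$ values is one possible way to draw a stability chart; the authors' own implementation is not shown in the source, so this should be read as an illustration only.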

Fig. 2. Stability regions for the system (14) (horizontal axis: k; vertical axis: τ; the points A, B and C are marked)

Fig. 3. Trajectories of the closed loop system (14) corresponding to the points A, B and C (horizontal axis: t; vertical axis: y)
Fig. 4. The plot of the function G12(jω) corresponding to the parameters k = 1.2, τ = 0.9 (point A) (axes: real, imag)

Fig. 5. The plot of the function G12(jω) corresponding to the parameters k = 0.5, τ = 7.53 (point B) (axes: real, imag)

Fig. 2 illustrates the stability regions for the system (14). For example, the point A = (1.2, 0.9) is located in the stability region. The point B = (0.5, 7.53) corresponds to a system that is stable but not asymptotically stable, and the point C = (1.0, 3.0) lies in the unstable region. The trajectories corresponding to the points A, B and C are shown in Fig. 3. Figs. 4, 5 and 6 present the Nyquist plots for the asymptotically stable, stable and unstable systems, respectively.
Fig. 6. The plot of the function G12(jω) corresponding to the parameters k = 1.0, τ = 3.0 (point C) (axes: real, imag)

3 Gyroscopic System
The gyroscopic system is a system of differential equations of the form

ẍ(t) + Gẋ(t) + Ax(t) = Bu(t), (28)

y(t) = B^T x(t), (29)


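where $x(t) \in \mathbb{R}^2$, $u(t) \in \mathbb{R}$, $y(t) \in \mathbb{R}$, $t \ge 0$, $A \in L(\mathbb{R}^2, \mathbb{R}^2)$, $A = A^T > 0$, $G \in L(\mathbb{R}^2, \mathbb{R}^2)$, $G = -G^T$, $B \in L(\mathbb{R}, \mathbb{R}^2)$. We assume that
$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad G = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \tag{30}$$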

The Laplace transform of the system (28) and (29) determines the following
transfer function:
$$G_1(s) = \frac{Y(s)}{U(s)} = B^T (s^2 I + sG + A)^{-1} B. \tag{31}$$
Using (30) in (31) we obtain

$$G_1(s) = \frac{s^2 + 2}{s^4 + 5s^2 + 3}. \tag{32}$$
The open loop system is not asymptotically stable; its eigenvalues are located on the imaginary axis:
$$s_{1,2} = \pm j\sqrt{\frac{5 - \sqrt{13}}{2}}, \qquad s_{3,4} = \pm j\sqrt{\frac{5 + \sqrt{13}}{2}}. \tag{33}$$
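The location of the open loop poles can be confirmed numerically from the denominator of (32). The sketch below uses `numpy.roots`; the variable names are illustrative, and the closed-form frequencies are obtained by solving $u^2 + 5u + 3 = 0$ for $u = s^2$.

```python
import numpy as np

# Denominator of (32): s^4 + 5 s^2 + 3
poles = np.roots([1.0, 0.0, 5.0, 0.0, 3.0])

# Frequencies from u^2 + 5u + 3 = 0 with u = s^2 = -omega^2
omega_low = np.sqrt((5.0 - np.sqrt(13.0)) / 2.0)
omega_high = np.sqrt((5.0 + np.sqrt(13.0)) / 2.0)

print(np.max(np.abs(poles.real)))   # real parts, expected to be at rounding level
print(np.sort(np.abs(poles.imag)))  # magnitudes of the imaginary parts
print(omega_low, omega_high)        # closed-form frequencies for comparison
```

Both values of $u = s^2$ are negative, which is why all four poles are purely imaginary.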
Fig. 7. Stability regions for the system (35) (horizontal axis: k; vertical axis: τ)

Let
$$U(s) = G_2(s)Y(s), \qquad G_2(s) = k e^{-s\tau}. \tag{34}$$
The closed loop system (31), (34) with the matrices (30) is given by
$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)} = \frac{k(s^2 + 2)e^{-s\tau}}{s^4 + 5s^2 + 3 - k(s^2 + 2)e^{-s\tau}}. \tag{35}$$

Let us note that the difference between the non-gyroscopic (12) and gyroscopic
system (32) is in the denominator of the appropriate transfer function. Using
the same technique as in the previous section, we can easily give the conditions,
which let us determine the range of allowable parameters k and τ . They are as
follows:
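(a) $k \in (0, \frac{3}{2})$, $\sin(\omega_1\tau) > 0$, $\sin(\omega_3\tau) > 0$, where $s_1 = j\omega_1$ and $s_3 = j\omega_3$ denote the poles (33) with positive imaginary part,
(b) If there exists $\omega_0 = \frac{n\pi}{\tau} \notin \{\omega_1, \omega_3\}$, $n = 1, 2, \ldots$, such that
$$(-1)^n k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4} < 0,$$
then it must satisfy
$$k\left|\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4}\right| < 1.$$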

Fig. 7 shows the graphical representation of the stability regions for the system (35).
Based on our discussion, we can establish an algorithm for finding the range of allowable parameters of the positive time-delay controller (4) that guarantees stability of the general second-order system (1), (2). The algorithm can be easily implemented in the MATLAB-Simulink environment.

ALGORITHM: Finding stability regions for the generalized second-order system with positive time-delay feedback.
INPUT: The matrices $G \in \mathbb{R}^{n \times n}$, $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times 1}$, $C \in \mathbb{R}^{1 \times n}$, the transfer function of the system
$$G_1(s) = C[s^2 I + sG + A]^{-1}B,$$
the transfer function of the controller
$$G_2(s) = k e^{-s\tau},$$
the transfer function of the closed loop system
$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)}.$$
OUTPUT: The set $S = \{(k, \tau) \in \mathbb{R}^2 :$ the closed loop system is asymptotically stable$\}$.
ASSUMPTIONS: $G = -G^T$, $A = A^T > 0$, the system is observable, all eigenvalues of the open loop system are located on the imaginary axis, and the multiplicity of every eigenvalue is equal to one.
STEP 1: Find the poles of the open loop system: $s_i = j\omega_i$, $\omega_i > 0$, $i = 1, 2, \ldots, n$.
STEP 2: Determine the set $S_1 = \{(k, \tau) \in \mathbb{R}^2 : \tau > 0,\ k > 0,\ kCA^{-1}B < 1\}$.
STEP 3: Determine the set $\Omega = \{\omega \in (0, +\infty) \setminus \{\omega_1, \omega_2, \ldots, \omega_n\} : \operatorname{Im} G_{12}(j\omega) = 0 \text{ and } \operatorname{Re} G_{12}(j\omega) < 0\}$.
STEP 4: Determine the set $S_2 = \{(k, \tau) \in \mathbb{R}^2 : \forall \omega^* \in \Omega\ |G_{12}(j\omega^*)| < 1,\ k > 0,\ \tau > 0\}$.
STEP 5: Determine the set $S_3 = \{(k, \tau) \in \mathbb{R}^2 : \lim_{\omega \to \omega_i^-} \operatorname{Im} G_{12}(j\omega) > 0,\ i = 1, 2, \ldots, n\}$.
STEP 6: Determine the set $S = S_1 \cap S_2 \cap S_3$.
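The steps above can be sketched in Python for the special case in which $G_1(j\omega)$ is real-valued, as it is in both examples of this chapter. This is an illustrative sketch, not the authors' MATLAB-Simulink implementation: the function `in_stability_set`, the callable `g` (standing for $-G_1(j\omega)$), the cut-off `omega_max` and the offset `eps` are all assumptions introduced here. Under this assumption $G_{12}(j\omega) = k\,g(\omega)\,e^{-j\omega\tau}$, so the real-axis crossings with nonzero real part occur at $\omega = n\pi/\tau$.

```python
import math

def in_stability_set(k, tau, g, poles, omega_max=20.0, eps=1e-6):
    """Membership test for S = S1, S2, S3 intersected, assuming real G1(j*omega).

    g(omega) returns -G1(j*omega); poles lists the omega_i of STEP 1.
    omega_max and eps are assumed numerical tolerances.
    """
    if k <= 0.0 or tau <= 0.0:
        return False
    # STEP 2: G12(0) = k*g(0) must lie to the right of -1 (k*C*A^{-1}*B < 1)
    if k * g(0.0) <= -1.0:
        return False
    # STEP 5: sign of Im G12 = -k*g(omega)*sin(omega*tau) just below each pole
    for w in poles:
        if -g(w - eps) * math.sin(w * tau) <= 0.0:
            return False
    # STEPS 3-4: negative real-axis crossings at omega = n*pi/tau
    n = 1
    while n * math.pi / tau <= omega_max:
        w = n * math.pi / tau
        if all(abs(w - p) > eps for p in poles):
            re = k * g(w) * (-1) ** n
            if re < 0.0 and abs(re) >= 1.0:
                return False
        n += 1
    return True

def g_gyro(w):
    """-G1(j*w) for the gyroscopic example (32)."""
    return (w**2 - 2.0) / (w**4 - 5.0 * w**2 + 3.0)

poles_gyro = [math.sqrt((5.0 - math.sqrt(13.0)) / 2.0),
              math.sqrt((5.0 + math.sqrt(13.0)) / 2.0)]

print(in_stability_set(1.0, 1.0, g_gyro, poles_gyro))
print(in_stability_set(1.0, 3.5, g_gyro, poles_gyro))
```

For a transfer function whose frequency response is not real, the set $\Omega$ of STEP 3 would have to be found by a root search on $\operatorname{Im} G_{12}(j\omega)$ instead; that case is not covered by this sketch.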

4 Concluding Remarks
In this paper the stabilization problem for matrix second-order systems has been discussed. We have presented our results for the single-input single-output case. The systems considered have all poles located on the imaginary axis. We have shown that such a system can be stabilized by delayed positive feedback. The analysis of the closed loop system has been performed using the Nyquist criterion. An algorithm for finding stability regions has been proposed and then validated by a series of numerical computations in the MATLAB-Simulink environment. It would be interesting to extend the results to infinite-dimensional second-order dynamical systems described by singular partial differential equations.

Acknowledgement
This work was supported by the Ministry of Science and Higher Education in Poland in the years 2008–2011 as research project No N N514 414034.

References
[1] Abdallah, C., Dorato, P., Benitez-Read, J., Byrne, R.: Delayed positive feedback can stabilize oscillatory systems. In: Proc. of the American Control Conference, San Francisco CA, pp. 3106–3107 (1993)
[2] Buslowicz, M.: Stabilization of LC ladder network by delayed positive feedback from output. In: Proc. XXVII International Conference on Fundamentals of Electrotechnics and Circuit Theory, IC-SPETO 2004, pp. 265–268 (2004) (in Polish)
[3] Elsgolc, L.E.: Introduction to the theory of differential equations with delayed argument. Nauka, Moscow (1964) (in Russian)
[4] Górecki, H., Fuksa, S., Grabowski, P., Korytowski, A.: Analysis and synthesis of time delay systems. PWN, Warszawa (1989)
[5] Mitkowski, W.: Stabilization of dynamic systems. WNT, Warszawa (1991) (in Polish)
[6] Mitkowski, W.: Static feedback stabilization of RC ladder network. In: Proc. XXVIII International Conference on Fundamentals of Electrotechnics and Circuit Theory, IC-SPETO, pp. 127–130 (2005)
[7] Skruch, P.: Stabilization of second-order systems by non-linear feedback. Int. J. Appl. Math. Comput. Sci. 14(4), 455–460 (2004)
A Comparison of Modeling Approaches for the
Spread of Prion Diseases in the Brain

Franziska Matthäus

Interdisciplinary Center for Scientific Computing,


University of Heidelberg, Germany
franziska.matthaeus@iwr.uni-heidelberg.de

Abstract. In this article we present and compare two different modeling approaches for the spread of prion diseases in the brain. The first is a reaction-diffusion model, which allows the description of prion spread in simple brain subsystems, like nerves or the spine. The second approach is the combination of epidemic models with transport on complex networks. With the help of these models we study the dependence of the disease progression on transport phenomena and the topology of the underlying network.

1 Introduction

The progression of prion diseases is accompanied on the one hand by the multiplication of the infective agent, and on the other hand by its spatial dispersion in the brain. However, models developed to describe the kinetics of prion disease progression usually include reaction terms but neglect prion transport. To close this gap, we present in this article different approaches to modeling prion propagation with a special focus on the spatial component. The spatial models provide information about the prion distribution in space, in addition to the temporal evolution of the concentration, and make it possible to determine how the concentration kinetics depend on prion transport.
Prion diseases are fatal neurodegenerative diseases caused by an infective agent that is neither a virus nor any other conventional agent, but a particle consisting solely of a wrongly folded protein, PrPsc [14]. This protein is an isoform of the native cellular prion protein, PrPc, which is present in many tissues but with the highest concentration in the brain. In comparison to the mainly alpha-helical PrPc, PrPsc is dominated by beta-sheet and characterized by a higher resistance to degradation by proteinase K and a tendency to aggregate. The infectivity of PrPsc lies in its ability to interact with the native PrPc, resulting in a change of PrPc into the pathologic form PrPsc. The interaction mechanism is not yet fully understood.
Because we are interested mainly in spatial effects, we will focus on the simplest kinetic model of prion-prion interaction. In the so-called heterodimer model [14, 5], PrPc is converted upon interaction with a single prion particle (see Figure 1). After the conversion, the two resulting infective agents dissociate and the process can start again with new PrPc.

Fig. 1. Scheme of the heterodimer model (species shown: PrPc and PrPsc)

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 109–117.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Prion transport in the brain is another field where experimental data is sparse. However, there are indications that prions move within the brain via axonal transport [1], where the transport happens in both directions (anterograde and retrograde). The speed of 1 mm/d coincides with the speed of passive neuronal transport [7].

2 The Reaction-Diffusion Approach


The heterodimer model (see Figure 1) can be written in the form of two differential equations, one describing the concentration dynamics of PrPc and one for the concentration dynamics of PrPsc. We denote the concentration of PrPc by $A$ and the concentration of PrPsc by $B$. In the model, PrPc is produced with a constant rate $v_0$ and degraded with rate $k_A$. Conversion is proportional to both concentrations, $A$ and $B$, with a constant $k_{AB}$. PrPsc is degraded with a rate $k_B$, where $k_A > k_B$, because of the higher resistance of PrPsc to proteases. For the spatial model we assume a one-dimensional domain $\Omega = \{x \in \mathbb{R}^1 : 0 \le x \le L\}$ and zero-flux boundary conditions,
$$\left.\frac{\partial A}{\partial x}\right|_{\partial\Omega \times T} = 0, \quad\text{and}\quad \left.\frac{\partial B}{\partial x}\right|_{\partial\Omega \times T} = 0. \tag{1}$$

The one-dimensional domain can be associated with simple brain substructures, like nerves or the spine. The reaction-diffusion system for the heterodimer model then has the following form:
$$\begin{aligned}
\frac{\partial A}{\partial t} &= v_0 - k_A \cdot A - k_{AB} \cdot A \cdot B + D\nabla^2 A,\\
\frac{\partial B}{\partial t} &= k_{AB} \cdot A \cdot B - k_B \cdot B + D\nabla^2 B,
\end{aligned} \tag{2}$$
with the initial conditions $A(0, x) = A_0(x) \ge 0$, $B(0, x) = B_0(x) \ge 0$.
This set of equations (2) has been used by Payne and Krakauer [13] to study
inter-strain competition. Qualitatively they could show how after co-infection
with two different prion strains the first inoculated strain can slow down or even
stop the spread of the second strain and prevail, even if it has a longer incubation
period.
The parameters for this model have been estimated in [10], and are summarized in Table 1. With the estimated parameter values the solutions of the system (2) can be analyzed qualitatively as well as quantitatively, and allow comparison with results from real experiments.

Table 1. Kinetic parameters for prion-prion interaction

Parameter   Value   Unit
v0          4       μg/(g·d)
kA          4       d^−1
kB          0.03    d^−1
kAB         0.15    (μg·d/g)^−1
D           0.05    mm^2/d

2.1 Results from the Reaction-Diffusion Approach


We solve Equations (2) with the boundary conditions (1) numerically using an implicit Euler discretization scheme and the initial conditions $A(0, x) = A_1^*$, $B(0, 0) > 0$ but small, and $B(0, x) = 0$ for $x > 0$, which corresponds to an infection with scrapie prion at one end of the domain. For these initial conditions, the solutions exhibit traveling wave behavior, as shown in Figure 2.
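A numerical experiment of this kind can be sketched in a few lines of Python. The code below is not the author's implementation: it swaps in a simpler semi-implicit splitting (a positivity-preserving reaction update followed by an implicit Euler diffusion step) in place of a fully implicit scheme, and the grid size, time step, end time and function name are assumptions. The kinetic parameters are those of Table 1, the seed value 0.025 μg/g is the one quoted with Figure 2, and the initial value $v_0/k_A$ is the steady state of (2) for $B = 0$.

```python
import numpy as np

def simulate_heterodimer(L=100.0, nx=201, dt=0.1, t_end=50.0,
                         v0=4.0, kA=4.0, kB=0.03, kAB=0.15, D=0.05,
                         B_seed=0.025):
    """Semi-implicit sketch for system (2) with zero-flux boundaries (1)."""
    x = np.linspace(0.0, L, nx)
    dx = x[1] - x[0]
    A = np.full(nx, v0 / kA)  # healthy steady state (B = 0)
    B = np.zeros(nx)
    B[0] = B_seed             # infection at one end of the domain

    # Discrete Laplacian; zero flux is imposed by mirroring at both ends
    lap = np.zeros((nx, nx))
    for i in range(nx):
        left = i - 1 if i > 0 else 1
        right = i + 1 if i < nx - 1 else nx - 2
        lap[i, i] -= 2.0
        lap[i, left] += 1.0
        lap[i, right] += 1.0
    lap /= dx**2
    # Implicit Euler diffusion step: (I - dt*D*lap) u_new = u_old
    step = np.linalg.inv(np.eye(nx) - dt * D * lap)

    for _ in range(int(round(t_end / dt))):
        # Reaction update with loss terms treated implicitly (keeps A, B >= 0)
        A = (A + dt * v0) / (1.0 + dt * (kA + kAB * B))
        B = B * (1.0 + dt * kAB * A) / (1.0 + dt * kB)
        # Diffusion update
        A = step @ A
        B = step @ B
    return x, A, B

x, A, B = simulate_heterodimer()
print(B[:5])
```

Running the function with a larger `t_end` should let a front of PrPsc develop and move away from the infected end; the short default run only shows the initial growth near $x = 0$.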

Fig. 2. Snapshot of a traveling wave for the heterodimer model at t = 450 days, with B(0) = 0.025 μg/g (horizontal axis: distance in mm; vertical axis: B in μg/g)

For the heterodimer model (2), the speed $c_B$ of the traveling wave front for scrapie prion can be determined analytically [10, 12]. It depends on the kinetic parameters as $c_B \propto \sqrt{D \cdot k_{AB} \cdot (A_1^* - A_2^*)}$, where $A_1^*$ and $A_2^*$ stand for the steady state concentrations of cellular prion in the healthy system (absence of scrapie prion) and in the diseased system (after infection with scrapie prions), respectively.
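For orientation, the two steady-state concentrations can be written out explicitly. The following short derivation is a sketch obtained by setting the reaction terms of (2) to zero and inserting the values of Table 1; it is not taken from the original text.

```latex
\begin{align*}
A_1^* &= \frac{v_0}{k_A} = \frac{4}{4} = 1~\mu\text{g/g}
      && \text{(healthy state, } B = 0\text{)},\\
A_2^* &= \frac{k_B}{k_{AB}} = \frac{0.03}{0.15} = 0.2~\mu\text{g/g}
      && \text{(diseased state, } B \neq 0\text{)},\\
B^*   &= \frac{v_0 - k_A A_2^*}{k_B} = \frac{4 - 0.8}{0.03} \approx 106.7~\mu\text{g/g}.
\end{align*}
```

With these values $A_1^* - A_2^* = 0.8$ μg/g, and $B^*$ lies within the range of the vertical axis of Figure 2 (0 to 120 μg/g).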
With the spatial model (2) it can also be shown that the diffusion coefficient has an influence on the overall concentration dynamics of PrPsc. In Figure 3 we show the dynamics of the PrPsc concentration, averaged over the domain Ω, for varying D. For small diffusion coefficients, the traveling wave forms and the access of PrPsc to its substrate PrPc is limited. In this case the concentration dynamics are dominated by linear growth. For large diffusion coefficients PrPsc quickly distributes in space and the concentration dynamics show a sigmoidal evolution, similar to the results of the heterodimer model without diffusion.
Fig. 3. Dependence of the overall concentration dynamics on the diffusion coefficient (horizontal axis: time in days; vertical axis: B concentration; curves for D = 0.9, 0.5, 0.1 and 0.05)

Fig. 4. Incubation times depending on the interval between intraocular infection and surgical eye removal. Experimental data from [16] (× with error bars) and simulation data for two different parameter sets. (Horizontal axis: time of eye removal in days; vertical axis: incubation time in days; legend: heterodimer model, incubation time for unenucleated mice)

The model (2) can also be related to experiments, for example when modeling the spread of prions in the mouse visual system. Here we can make use of the fact that the mouse visual system is nearly linear, with the optic nerve projecting from the eye to the lateral geniculate nucleus (LGN), and the optic radiations then projecting from the LGN to the visual cortex. Because of this simple structure the system can be approximated by a one-dimensional domain and our model applies.
Scott et al. [15, 16] carried out experiments to show the dependence of the incubation time $t_{inc}$ on the dose of intraocularly injected scrapie material. Furthermore, they investigated how the incubation time changes when the eye is surgically removed at different time intervals after intraocular infection. For the first experiment, the relationship $t_{inc} \propto \log(\text{dose})$ that was found can easily be reproduced with our spatial model. Here, however, the spatial component is not essential, as any model with a near-exponential initial phase would give the same result. The situation is different for the experiments regarding surgical eye removal, where a model without a spatial component is not sufficient. With a spatial model, eye removal can be simulated by a change in the domain Ω, in particular by inserting zero-flux boundary conditions at the position where the optic nerve is cut. To compare the results of the simulations with the real data we modified the model slightly to obtain a better description of the spatial domain and the steady state distribution of PrPc. For details see [9, 10]. The results of the simulation fit well to the experimental data, which show a decaying incubation time for larger intervals between infection and surgical eye removal (see Figure 4).

3 Network Models

The complexity of the brain neuronal network and the fact that prions are transported across the edges of this network make the application of reaction-diffusion equations to larger brain systems very difficult. However, some results on the spread of infections on networks can be obtained by combining epidemic models with transport on networks. In the previous section we showed that the disease kinetics depend on prion transport. In the present section we will show that the topology of the underlying network affects the disease spread as well.
Networks consist of a set of N nodes and M edges, where the nodes here represent the neuronal cells and an edge indicates that a connection (in the form of a synapse or gap junction) exists between two cells. The number of edges originating from a node corresponds to the number of neighbors of the node and is called the node's degree $k$. The average of the degrees of all nodes in the network is called the degree of the network, $\langle k \rangle$. Networks can be classified according to their degree distribution $P(k)$. In this article we will focus on one network model, called small world [17]. Small worlds can be constructed from d-dimensional regular grids by rewiring the edges with a probability $p$. Depending on $p$, this model interpolates between regular and totally random networks, and is therefore a good model for our purpose.
To describe the spread of infective diseases on networks, the network model is combined with a model of epidemic diseases, the SI model. The SI model classifies nodes into two discrete states, namely susceptible or infected. Susceptible nodes become infected with probability $\xi$, where $\xi$ is a function of the transmission probability between two neighbors $\lambda$, and the number of infected neighbors $m$:

$$\xi = 1 - (1 - \lambda)^m.$$
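The combination of a rewired grid with the SI rule can be illustrated by a small simulation. The following Python sketch is an assumption-laden illustration rather than the code used for the results below: the grid is non-periodic, each edge keeps one end and has its other end redirected to a random node with probability `p`, and all function names are hypothetical.

```python
import random

def small_world_grid(side, p, rng):
    """2-D square grid (non-periodic) whose edges are rewired with probability p."""
    n = side * side
    neighbors = [set() for _ in range(n)]
    edges = []
    for r in range(side):
        for c in range(side):
            u = r * side + c
            if c + 1 < side:
                edges.append((u, u + 1))      # edge to the right neighbor
            if r + 1 < side:
                edges.append((u, u + side))   # edge to the lower neighbor
    for u, v in edges:
        if rng.random() < p:
            # Redirect the far end to a random node (no self-loops, no duplicates)
            v = rng.randrange(n)
            while v == u or v in neighbors[u]:
                v = rng.randrange(n)
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors

def si_iterations(neighbors, lam, rng, fraction=0.95, max_steps=100000):
    """Number of synchronous SI iterations until `fraction` of nodes is infected."""
    n = len(neighbors)
    infected = [False] * n
    infected[rng.randrange(n)] = True         # single random seed node
    count = 1
    for step in range(1, max_steps + 1):
        newly = []
        for u in range(n):
            if not infected[u]:
                m = sum(infected[v] for v in neighbors[u])
                # Infection probability xi = 1 - (1 - lambda)^m
                if m > 0 and rng.random() < 1.0 - (1.0 - lam) ** m:
                    newly.append(u)
        for u in newly:
            infected[u] = True
        count += len(newly)
        if count >= fraction * n:
            return step
    return None  # threshold not reached, e.g. if rewiring disconnected the graph

rng = random.Random(0)
for p in (0.0, 0.1, 1.0):
    net = small_world_grid(30, p, rng)
    print(p, si_iterations(net, 0.5, rng))
```

A non-periodic grid of side $n$ has average degree $4(1 - 1/n)$, which gives 3.96 for a 100 × 100 grid.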

3.1 Results from the Network Approach


To show how the network topology affects the speed of the epidemic spread, we run simulations on two-dimensional small worlds. All networks consist of $10^4$ nodes and have an average degree of $\langle k \rangle = 3.96$, but differ in the rewiring probability $p$. Every simulation is started with a single (randomly chosen) infected node, and we measure the number of iterations until 95% of the network is infected. The result for every $p$ is the average over various realizations of the network.

Fig. 5. Number of iterations until 95% of the network is infected, depending on the rewiring probability p (horizontal axis: rewiring probability, logarithmic scale from 10^−4 to 10^0; vertical axis: number of iterations until 95% infected)

Fig. 6. Heterogeneity of 2-dimensional small worlds in dependence on the rewiring probability (horizontal axis: rewiring probability; vertical axis: heterogeneity ⟨k²⟩/⟨k⟩)
The result, displayed in Figure 5, shows clearly that the velocity of the spread increases with increasing rewiring probability. The crucial network feature is the degree heterogeneity of the network, defined as $\langle k^2 \rangle / \langle k \rangle$. For small worlds, the degree heterogeneity increases with $p$, as shown in Figure 6. In [3] it was shown that the time scale $\tau$ of the initial exponential growth of the epidemics is related to the degree heterogeneity as:
$$\tau = \frac{\langle k \rangle}{\lambda(\langle k^2 \rangle - \langle k \rangle)}. \tag{3}$$
This relation shows that for scale-free networks, characterized by a power-law degree distribution with $P(k) \propto k^{-\alpha}$, the epidemics can have an extremely fast initial growth, because here $\langle k^2 \rangle$ diverges with the network size.
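Both the degree heterogeneity and the time scale (3) follow directly from a list of node degrees. The helper functions below are a minimal sketch with hypothetical names; the example degree lists are made up for illustration.

```python
def degree_heterogeneity(degrees):
    """Return <k^2> / <k> for a list of node degrees."""
    n = len(degrees)
    k_mean = sum(degrees) / n
    k2_mean = sum(k * k for k in degrees) / n
    return k2_mean / k_mean

def growth_time_scale(degrees, lam):
    """Return tau = <k> / (lam * (<k^2> - <k>)), i.e. relation (3)."""
    n = len(degrees)
    k_mean = sum(degrees) / n
    k2_mean = sum(k * k for k in degrees) / n
    return k_mean / (lam * (k2_mean - k_mean))

regular = [4] * 10        # every node has degree 4
mixed = [2, 6] * 5        # same mean degree, larger spread
print(degree_heterogeneity(regular), growth_time_scale(regular, 0.1))
print(degree_heterogeneity(mixed), growth_time_scale(mixed, 0.1))
```

The two example lists have the same mean degree, so any difference in $\tau$ comes from the second moment $\langle k^2 \rangle$ alone.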
Fig. 7. Survival time of nodes in random networks in dependence on their degree (horizontal axis: node degree; vertical axis: average survival time in iterations)

The neuronal network of the brain is an example of a very large network, and although the degree variability of the nodes is bounded by the number of synapses a cell can form, this number can be as high as $2 \cdot 10^5$ for Purkinje cells [18]. To determine the exact growth rate of the number of infected cells, estimates for the transmission probability and for the degree heterogeneity are needed, which are points that still need experimental and theoretical investigation.
The spread of epidemics on networks differs in many aspects from diffusive
spread on homogeneous domains. One example is the following: On a homoge-
neous domain, the time when a cell becomes infected depends only on its distance
from the origin of the infection. On networks, this time is also influenced by the
degree of the node. To show this, we set p = 1 to obtain networks with a large
degree variation and again simulate the outbreak of an epidemic applying the SI
model.
Figure 7 shows the average survival time of nodes in dependence on their degree. One can see that on average nodes of high degree are infected earlier (have shorter survival times) than nodes of low degree. The reason is that nodes of high degree have more neighbors from which they can contract the disease. Instead of looking at the survival times directly, Barthélemy et al. [3] measured (with the same result) the average degree of the newly infected nodes and the inverse participation ratio, defined as $Y_2(t) = \sum_k (I_k/I)^2$, where $I_k/I$ denotes the fraction of infected nodes of degree $k$ among all infected nodes.

4 Conclusions
The two approaches describe the disease progression on different scales. The diffusion approach focuses on the mechanism of prion-prion interaction, but is limited to simple spatial domains. The network approach takes into account the complexity of the domain in which the transport of the infective agent takes place, but is consequently no longer specific to prion diseases.

The problem with models of prion spread in the brain is the scarcity of experimental data. Not only is the prion interaction mechanism not fully understood, but the topology of the brain neuronal network is also unclear. The aim of this article is to present some general results obtained by the use of very simple models.
With the appearance of new experimental data, the development of more detailed models will become feasible. A possibility here is the combination of a kinetic model for prion-prion interaction with transport on networks, and thus the study of reaction-diffusion systems on networks. Some work on reaction-diffusion systems on networks has been carried out for example in [6], which deals with annihilation processes of the types A + A → 0 and A + B → 0 on scale-free networks, or in [2], where the Gierer-Meinhardt model was studied on random and scale-free networks. The models for prion spread derived by combining prion-prion interaction with transport on networks should eventually not only account for long incubation periods but also provide a description of local prion accumulations and the formation of plaques.

References
[1] Armstrong, R.A., Lantos, P.L., Cairns, N.J.: The spatial patterns of prion deposits in Creutzfeldt-Jakob disease: comparison with β-amyloid deposits in Alzheimer's disease. Neurosci. Lett. 298, 53–56 (2001)
[2] Banerjee, S., Mallik, S.B., Bose, I.: Reaction-diffusion processes on random and scale-free networks (2004) arXiv:cond-mat/0404640
[3] Barthélemy, M., Barrat, A., Pastor-Satorras, R., Vespignani, A.: Velocity and hierarchical spread of epidemic outbreaks in scale-free networks (2004) arXiv:cond-mat/0311501
[4] Eigen, M.: Prionics or the kinetic basis of prion diseases. Biophys. Chem. 63, A1–A18 (1996)
[5] Galdino, M.L., de Albuquerque, S.S., Ferreira, A.S., Cressoni, J.C., dos Santos, R.J.V.: Thermo-kinetic model for prion diseases. Phys. A 295, 58–63 (2001)
[6] Gallos, L.K., Argyrakis, P.: Absence of kinetic effects in reaction-diffusion processes in scale-free networks. Phys. Rev. Lett. 92(13), 138301 (2004)
[7] Glatzel, M., Aguzzi, A.: Peripheral pathogenesis of prion diseases. Microbes. Infect. 2, 613–619 (2000)
[8] Harper, J.D., Lansbury Jr., P.T.: Models of amyloid seeding in Alzheimer's disease and scrapie: mechanistic truths and physiological consequences of the time-dependent solubility of amyloid proteins. Annu. Rev. Biochem. 66, 385–407 (1997)
[9] Matthäus, F.: Hierarchical modeling of prion spread in brain tissue, PhD thesis (2005)
[10] Matthäus, F.: Diffusion versus network models as descriptions for the spread of prion diseases in the brain. J. theor. Biol (in press) (2005)
[11] Masel, J., Jansen, V.A.A., Nowak, M.A.: Quantifying the kinetic parameters of prion replication. Biophys. Chem. 77, 139–152 (1999)
[12] Murray, J.D.: Mathematical Biology. Springer, Heidelberg (1989)
[13] Payne, R.J.H., Krakauer, D.C.: The spatial dynamics of prion disease. Proc. R. Soc. Lond. B 265, 2341–2346 (1998)
[14] Prusiner, S.B.: Prions. Proc. Natl. Acad. Sci. USA 95, 13363–13383 (1998)
[15] Scott, J.R., Davies, D., Fraser, H.: Scrapie in the central nervous system: neuroanatomical spread of infection and Sinc control of pathogenesis. J. Gen. Virol. 73, 1637–1644 (1992)
[16] Scott, J.R., Fraser, H.: Enucleation after intraocular scrapie injection delays the spread of infection. Brain Res. 504, 301–305 (1989)
[17] Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393, 440–442 (1998)
[18] http://faculty.washington.edu/chudler/facts.html#brain
Ensemble Modeling for Bio-medical Applications

Christian Merkwirth^1, Jörg Wichard^2,4, and Maciej J. Ogorzalek^1,3

1 Department of Information Technologies, Jagiellonian University, ul. Reymonta 4, Cracow, Poland, christianmerkwirth@web.de
2 Institute of Molecular Pharmacology, Robert-Rössle-Str. 10, Berlin, Germany, JoergWichard@web.de
3 AGH University of Science and Technology, al. Mickiewicza 30, Cracow, Poland, maciej@agh.edu.pl
4 Institut für Medizinische Informatik, Charité, Hindenburgdamm 30, 12200 Berlin, Germany

Abstract. In this paper we propose to use ensembles of models constructed using methods of Statistical Learning. The input data for model construction consists of real measurements taken in the physical system under consideration. Further we propose a program toolbox which allows the construction of single models as well as heterogeneous ensembles of linear and nonlinear model types. Several well performing model types, among which are ridge regression, k-nearest neighbor models and neural networks, have been implemented. Ensembles of heterogeneous models typically yield a better generalization performance than homogeneous ensembles. Additionally given are methods for model validation and assessment as well as adaptor classes performing transparent feature selection or random subspace training on a large number of input variables. The toolbox is implemented in Matlab and C++ and available under the GPL. Several applications of the described methods and the numerical toolbox itself are described. These include ECG modeling, classification of activity in drug design and ...

1 Introduction
Ensemble methods have gained increasing attention in the last decade [1, 2] and seem to be a promising approach for improving the generalization error of existing statistical learning algorithms in regression and classification tasks. The output of an ensemble model is the average of the outputs of the individual models belonging to the ensemble. In prediction problems an ensemble typically outperforms single models. Almost all ensemble methods described so far use models of one single class, e.g. neural networks [1, 2, 3, 4] or regression trees [5]. We suggest building ensembles of different model classes to improve the performance in regression problems. The theoretical background of our approach is provided by the bias/variance decomposition of the ensemble. We argue that an ensemble of heterogeneous models usually leads to a reduction of the ensemble variance because the cross terms in the variance contribution have a higher ambiguity. Further we describe the structure of the programming toolkit and its usage.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 119–135.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Learning Dependency from Data


The modeling problem can be described as follows [6]:
Given: a series of input-output pairs $(x^\mu, y^\mu)$ with $\mu = 1, \ldots, N$, or a functional dependence $y(x)$ (possibly corrupted by noise), we would like to choose a model (function) $\hat{f}$ out of some hypothesis space $H$ that is as close to the true $f$ as possible.
Two different tasks can be considered:
Classification $f : \mathbb{R}^D \to \{0, 1, 2, \ldots\}$: discrete output, sorting the input data into distinct classes having specific properties.
Regression $f : \mathbb{R}^D \to \mathbb{R}$: continuous output, finding a dependency on time or parameters.

2.1 Model Types Used in Statistical Learning


There exists a vast variety of models described in the literature, which can be grouped into some general classes [6]:
• Global Models
– Linear Models
– Polynomial Models
– Neural Networks (MLP)
– Support Vector Machines
• Semi–global Models
– Radial Basis Functions
– Multivariate Adaptive Regression Splines (MARS)
– Decision Trees (C4.5, CART)
• Local Models
– k–Nearest–Neighbors
• Hybrid Models
– Projection Based Radial Basis Functions Network (PRBFN)
Implementing any of these modeling methods usually leads to solving an
optimization problem: a matrix inversion in the case of linear regression, the
minimization of a loss function on the training data, or a quadratic
programming problem (e.g. for SVMs).

2.2 Validation and Model Selection


The key to model selection is the generalization error: how does the model
perform on unseen data (samples)? The exact generalization error is not
accessible since we have only a limited number of observations. Training on a small
data set tends to overfit, causing the generalization error to be significantly higher
than the training error. This is a consequence of a mismatch between the capacity of
the hypothesis space H (VC dimension) and the number of training observations.
Ensemble Modeling for Bio-medical Applications 121

Any model constructed has to pass the validation stage: estimation
of the generalization error using just the given data set. Naturally, we
select the model with the lowest (estimated) generalization error. Typical
remedies to improve the generalization error are:
• Manipulating the training algorithm (e.g. early stopping)
• Regularization by adding a penalty to the loss function
• Using algorithms with built-in capacity control (e.g. SVM)
• Relying on criteria like BIC, AIC, GCV or cross validation to select the
optimal model complexity
• Reformulating the loss function, e.g. by using an ε-insensitive loss
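Two of the remedies listed above can be written down in a few lines. The following is a minimal sketch, not ENTOOL code: the function names, the mean-absolute form of the ε-insensitive loss and the L2 form of the penalty are assumptions chosen for illustration.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.05):
    """Mean epsilon-insensitive absolute loss: residuals below eps cost nothing."""
    residuals = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(max(0.0, r - eps) for r in residuals) / len(residuals)


def penalized_loss(y_true, y_pred, weights, lam=0.1):
    """Mean squared error plus an L2 (ridge-style) penalty on the model weights."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse + lam * sum(w * w for w in weights)


# Illustrative values only (hypothetical targets and predictions).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.02, 2.5, 2.9]
loss = eps_insensitive_loss(y_true, y_pred, eps=0.05)
```

The first residual (0.02) lies inside the ε-tube and contributes nothing, so small deviations are not fitted at all; the penalty term in the second function grows with the weight norm and thereby limits the effective capacity.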

3 Ensemble Methods
Building an ensemble consists of averaging the outputs of several separately
trained models:
• Simple average $\bar{f}(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)$
• Weighted average $\bar{f}(x) = \sum_k w_k f_k(x)$ with $\sum_k w_k = 1$

For squared error, the ensemble generalization error is never larger than the
(weighted) average error of the individual models. An ensemble should consist of
well trained but diverse models.
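The two averaging rules can be sketched as follows. The three toy models and the test point are hypothetical and serve only to show that the squared error of the average does not exceed the mean squared error of the members.

```python
def simple_average(models, x):
    """Unweighted ensemble output: mean of the member outputs at x."""
    return sum(f(x) for f in models) / len(models)


def weighted_average(models, weights, x):
    """Weighted ensemble output; the weights are required to sum to one."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    return sum(w * f(x) for w, f in zip(weights, models))


# Three toy models of y = x**2 with different systematic errors (assumed).
models = [lambda x: x * x + 0.3, lambda x: x * x - 0.2, lambda x: 0.9 * x * x]
x, y = 2.0, 4.0

individual_errors = [(f(x) - y) ** 2 for f in models]
mean_individual_error = sum(individual_errors) / len(individual_errors)
ensemble_error = (simple_average(models, x) - y) ** 2
```

Because the members err in different directions, their errors partly cancel in the average, which is the effect the diversity requirement aims at.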

3.1 The Bias/Variance Decomposition for Ensembles


Our approach is based on the observation that the generalization error of an
ensemble model can be improved if the predictors on which the averaging is
done disagree and if their residuals are uncorrelated [7]. We consider the case
where we have a given data set D = {(x1 , y1 ), . . . , (xN , yN )} and we want to find
a function f (x) that approximates y also for unseen observations of x. These
unseen observations are assumed to stem from the same but not explicitly known
probability distribution P (x). The expected generalization error Err(x) given a
particular x and a training set D is

$$\mathrm{Err}(x) = E\left[(y - f(x))^2 \mid x, D\right] \qquad (1)$$

where the expectation E[·] is taken with respect to the probability distribution
P . The Bias/Variance Decomposition of Err(x) is

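$$\mathrm{Err}(x) = \sigma^2 + \left(E_D[f(x)] - E[y|x]\right)^2 + E_D\left[(f(x) - E_D[f(x)])^2\right] \qquad (2)$$
$$\phantom{\mathrm{Err}(x)} = \sigma^2 + \left(\mathrm{Bias}(f(x))\right)^2 + \mathrm{Var}(f(x)) \qquad (3)$$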

where the expectation ED [·] is taken with respect to all possible realizations of
training sets D with fixed sample size N and E[y|x] is the deterministic part of
the data and $\sigma^2$ is the variance of y given x. Balancing between the bias and
the variance term is a crucial problem in model building. If we try to decrease
the bias term on a specific training set, we usually increase the variance term and
vice versa. We now consider the case of an ensemble average $\hat{f}(x)$ consisting of
K individual models

$$\hat{f}(x) = \sum_{i=1}^{K} \omega_i f_i(x), \qquad \omega_i \ge 0, \qquad (4)$$

where the weights sum to one, $\sum_{i=1}^{K} \omega_i = 1$. If we put this into eqn. (2) we
get

$$\mathrm{Err}(x) = \sigma^2 + \mathrm{Bias}(\hat{f}(x))^2 + \mathrm{Var}(\hat{f}(x)), \qquad (5)$$

and we can have a look at the effects concerning bias and variance. The bias
term in eqn. (5) is the average of the biases of the individual models. So we
should not expect a reduction in the bias term compared to single models.
The variance term of the ensemble could be decomposed in the following way:

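$$\begin{aligned}
\mathrm{Var}(\hat{f}) &= E\left[(\hat{f} - E[\hat{f}])^2\right] \\
&= E\Big[\Big(\sum_{i=1}^{K} \omega_i f_i\Big)^2\Big] - \Big(E\Big[\sum_{i=1}^{K} \omega_i f_i\Big]\Big)^2 \\
&= \sum_{i=1}^{K} \omega_i^2 \left(E[f_i^2] - E^2[f_i]\right)
 + 2 \sum_{i<j} \omega_i \omega_j \left(E[f_i f_j] - E[f_i]\, E[f_j]\right), \qquad (6)
\end{aligned}$$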

where the expectation is taken with respect to D and x is dropped for simplicity.
The first sum in eqn. (6) gives the lower bound of the ensemble variance and
contains the variances of the ensemble members. The second sum contains the
cross terms of the ensemble members and disappears if the models are completely
uncorrelated [7]. The reduction of the variance of the ensemble is related to the
degree of independence of the single models. This is a key feature of the ensemble
approach.
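The split of the ensemble variance into member variances and cross terms can be checked numerically. The sketch below uses synthetic model outputs (a shared component plus independent noise) as a stand-in for the realizations of the training set D; the weights, the mixing factor 0.6 and the number of draws are arbitrary assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance E[ab] - E[a]E[b] of two equally long samples."""
    ma, mb = mean(a), mean(b)
    return mean([(u - ma) * (v - mb) for u, v in zip(a, b)])


random.seed(0)
n_draws = 2000                      # stand-in for realizations of D
weights = [0.5, 0.3, 0.2]           # omega_i, summing to one
shared = [random.gauss(0.0, 1.0) for _ in range(n_draws)]
# outputs[i][d]: output of model i at a fixed x for realization d (synthetic)
outputs = [[0.6 * s + random.gauss(0.0, 1.0) for s in shared] for _ in weights]

ensemble = [sum(w * f[d] for w, f in zip(weights, outputs)) for d in range(n_draws)]
var_direct = covariance(ensemble, ensemble)

K = len(weights)
# First sum: weighted variances of the individual members.
var_terms = sum(weights[i] ** 2 * covariance(outputs[i], outputs[i]) for i in range(K))
# Second sum: cross terms over all pairs i < j.
cross_terms = 2 * sum(
    weights[i] * weights[j] * covariance(outputs[i], outputs[j])
    for i in range(K)
    for j in range(i + 1, K)
)
```

Here `var_direct` agrees with `var_terms + cross_terms` up to rounding. Setting the mixing factor to zero makes the members independent, and the cross terms then shrink towards zero as the number of draws grows.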
There are several ways to increase model decorrelation. In the case of neural
network ensembles, the networks can have different topology, different training
algorithms or different training subsets [2, 1]. For the case of fixed topology, it is
sufficient to use different initial conditions for the network training [4]. Another
way of variance reduction is Bagging, where an ensemble of predictors is trained
on several bootstrap replicates of the training set [8]. When constructing k-
Nearest-Neighbor models, the number of neighbors and the metric coefficients
could be used to generate diversity.
Krogh et al. derive the equation E = Ē − Ā which relates the ensemble
generalization error E with the average generalization error Ē of the individual
models and the variance Ā of the model outputs with respect to the average out-
put. When keeping the average generalization error Ē of the individual models
constant, the ensemble generalization error E should decrease with increasing
diversity of the models Ā. Hence we try to increase Ā by using two strategies:
1. Resampling: We train each model on a randomly drawn subset of 80% of all
training samples. The number of models trained for one ensemble is chosen
so that usually all samples of the training set are covered at least once by
the different subsets.
2. Variation of model type: We employ two different model types, which are lin-
ear models trained by ridge regression and k-nearest-neighbor (k-NN) models
with adaptive metric.
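The relation E = Ē − Ā can be verified for a single sample with a few lines of code. This is a sketch under the assumption of squared error and weights that sum to one; the member outputs and the target value are made up.

```python
def ambiguity_decomposition(outputs, weights, y):
    """Return (E, E_bar, A_bar) for one sample with target y.

    E     : squared error of the weighted ensemble output
    E_bar : weighted average of the members' squared errors
    A_bar : weighted variance of the member outputs around the ensemble output
    """
    f_bar = sum(w * f for w, f in zip(weights, outputs))
    ensemble_error = (f_bar - y) ** 2
    avg_error = sum(w * (f - y) ** 2 for w, f in zip(weights, outputs))
    ambiguity = sum(w * (f - f_bar) ** 2 for w, f in zip(weights, outputs))
    return ensemble_error, avg_error, ambiguity


# Hypothetical outputs of three models for a sample with target 1.0.
E, E_bar, A_bar = ambiguity_decomposition([0.9, 1.4, 1.1], [1 / 3, 1 / 3, 1 / 3], y=1.0)
```

Since Ā is a weighted variance it cannot be negative, so the ensemble error never exceeds the average member error, and more disagreement at fixed Ē lowers E further.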

4 Out-of-Train Technique

The Out-of-Train (OOT) technique is a method for assessing the extra-sample
error and can be regarded as a combination of traditional cross-validation (CV)
and ensemble averaging. As for cross-validation, the data set is repeatedly
divided into training and test partitions. For each partitioning, a model is
constructed only on samples of the training partition. Test samples are not used for
model selection, deriving of stopping criteria or the like. The OOT output for
one sample of the data set is the average of the outputs of models for which this
sample was not part of the training set (out-of-train) as depicted in Figure 1.

[Figure 1: matrix of model outputs. Rows: index of sample (1 to 10). Columns: index
of model (1 to 20) and OOT. Values printed per row, from left to right:]

Index of sample   Outputs of the out-of-train models      OOT
 1                 0.2     −0.018  −0.018   0.14          0.076
 2                −0.42    −0.37   −0.022  −0.4          −0.3
 3                −0.38    −0.44   −0.42   −0.36         −0.4
 4                 0.17     0.17    0.11    0.17          0.16
 5                 0.018    0.034   0.054   0.086         0.048
 6                 0.13    −0.42   −0.4     0.13         −0.14
 7                 0.17     0.19    0.15    0.17          0.17
 8                −0.013   −0.035   0.041   0.063         0.014
 9                 0.025    0.017   0.063  −0.01          0.024
10                 0.13     0.21    0.16    0.16          0.16

Fig. 1. Averaging scheme for OOT calculation for an example data set of ten samples.
On this data set, 20 models were trained. Column j corresponds to model j. For each
model, samples used for training are colored white, while samples not used for training
are colored gray. For easier reading, only output values for test samples were printed on
the respective row and column. To compute the OOT output for the i-th sample, which
is depicted as gray value in the rightmost column, the average over the output of all
models for which this sample was not in the training fraction is calculated (averaging
over all gray fields in a row).
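The OOT averaging can be sketched in a few lines. The function name, the data layout (one list of outputs and one training mask per model) and the second sample are assumptions made for illustration; the four out-of-train outputs of the first sample are the values printed for sample 5 in Fig. 1.

```python
def oot_outputs(predictions, in_train):
    """Out-of-train average per sample.

    predictions[j][i]: output of model j for sample i.
    in_train[j][i]:    True if sample i was used to train model j.
    Returns None for a sample that every model has seen during training.
    """
    n_models, n_samples = len(predictions), len(predictions[0])
    result = []
    for i in range(n_samples):
        held_out = [predictions[j][i] for j in range(n_models) if not in_train[j][i]]
        result.append(sum(held_out) / len(held_out) if held_out else None)
    return result


# Five hypothetical models, two samples.
predictions = [[0.018, 0.5], [0.034, 0.7], [0.054, 0.9], [0.086, 1.1], [9.9, 1.3]]
in_train = [
    [False, True],
    [False, True],
    [False, False],
    [False, False],
    [True, False],   # model 4 was trained on sample 0, so its output 9.9 is ignored
]
oot = oot_outputs(predictions, in_train)
```

The output of a model for a sample it was trained on never enters the average, which is why the OOT output can serve as an estimate of the extra-sample error.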

5 Model Training and Cross Validation


In order to select models for the final ensemble we use a cross validation scheme
for model training. As the models are initialized with different parameters
(number of hidden units, number of nearest neighbors, initial weights, etc.), cross
validation helps us to find a proper value for these model parameters.
The cross validation is done in several training rounds on different subsets
of the entire training data. In every training round the data is divided into a
training set and a test set. The trained models are compared by evaluating their
prediction errors on the unseen data of the test set. The model with the smallest
test error is taken out and becomes a member of the ensemble. This is repeated
several times and the final ensemble is a simple average over its members. For
example, a K-fold cross validation training leads to an ensemble with K members,
where the weights in eqn. (4) become $\omega_i = 1/K$.
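The selection scheme described above can be sketched as follows. This is not the ENTOOL implementation: the candidate models are plain k-nearest-neighbor regressors on one-dimensional inputs, and the data, the candidate values of k and all function names are assumptions.

```python
import random


def knn_predict(train, k, x):
    """k-nearest-neighbor regression on 1-D inputs; train is a list of (x, y)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, k, test):
    return sum((knn_predict(train, k, x) - y) ** 2 for x, y in test) / len(test)


def crosstrain_ensemble(data, n_folds, candidate_ks):
    """One round per fold: fit every candidate on the training part and keep
    the one with the smallest error on the held-out part."""
    members = []
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [p for idx, p in enumerate(data) if idx % n_folds != fold]
        best_k = min(candidate_ks, key=lambda k: mse(train, k, test))
        members.append((train, best_k))
    return members


def ensemble_predict(members, x):
    """Simple average over the members, i.e. weights 1/K for K folds."""
    return sum(knn_predict(train, k, x) for train, k in members) / len(members)


random.seed(1)
xs = [i / 20 for i in range(100)]
data = [(x, x * x + random.gauss(0.0, 0.1)) for x in xs]   # noisy y = x**2
random.shuffle(data)

members = crosstrain_ensemble(data, n_folds=5, candidate_ks=[1, 3, 7])
prediction = ensemble_predict(members, 2.0)
```

Each member is trained on a different 80 % of the data and may use a different k, so the scheme selects model parameters and decorrelates the members at the same time.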

6 The ENTOOL Toolbox for Statistical Learning

The ENTOOL toolbox for statistical learning is designed to make state-of-the-art
machine learning algorithms available under a common interface. It allows
construction of single models or ensembles of (heterogeneous) models. ENTOOL
is Matlab-based with parts written in C++ and runs under Windows and Linux.

6.1 ENTOOL Software Architecture


Each model type is implemented as a separate class in our simulator, and all model
classes share a common interface. Model types can be exchanged simply by exchanging
the constructor call. The system allows for automatic generation of ensembles of models.
Models are divided into two kinds:
1. Primary models like linear models, neural networks, SVMs etc.
2. Secondary models that rely on primary models to calculate their output. All
ensemble models are secondary models.
Each selected model goes through three phases: construction, training and
evaluation. In the construction phase the topology of the model is specified. The model
cannot be used yet; it now has to be trained on some training data set $(x_i, y_i)$.
After training, the model can be evaluated on new/unseen inputs $(x_n)$.
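The idea of a common interface with primary and secondary models can be illustrated outside Matlab. The Python classes below are hypothetical and do not mirror the ENTOOL class hierarchy; only the method names `train` and `calc` are borrowed from the toolbox vocabulary.

```python
class MeanModel:
    """Hypothetical primary model: predicts the mean of the training targets."""

    def __init__(self):
        self.mean = None                     # construction: nothing learned yet

    def train(self, x, y):
        self.mean = sum(y) / len(y)
        return self

    def calc(self, x_new):
        return [self.mean for _ in x_new]


class LineModel:
    """Hypothetical primary model: least-squares line for 1-D inputs."""

    def __init__(self):
        self.slope, self.intercept = None, None

    def train(self, x, y):
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxx = sum((a - mx) ** 2 for a in x)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        self.slope = sxy / sxx
        self.intercept = my - self.slope * mx
        return self

    def calc(self, x_new):
        return [self.slope * a + self.intercept for a in x_new]


class AverageEnsemble:
    """Hypothetical secondary model: averages the outputs of its primary models."""

    def __init__(self, members):
        self.members = members

    def train(self, x, y):
        for member in self.members:
            member.train(x, y)
        return self

    def calc(self, x_new):
        outs = [member.calc(x_new) for member in self.members]
        return [sum(col) / len(col) for col in zip(*outs)]


x, y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
# The calling code is identical for every model type; only the constructor changes.
for model in (MeanModel(), LineModel(), AverageEnsemble([MeanModel(), LineModel()])):
    y_new = model.train(x, y).calc([4.0])
```

Because the ensemble exposes the same two methods as its members, it can be used wherever a primary model is expected, including as a member of another ensemble.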
6.2 Syntax
• Constructor syntax:
model = perceptron; creates an MLP model with default topology
model = perceptron(12); creates an MLP model with 12 hidden layer neurons
model = ridge; creates a linear model trained by ridge regression
• Training syntax:
model = train(model,x,y,[],[],0.05); trains the model with an ε-insensitive loss of 0.05 on the data set (x_i, y_i)
• Evaluation syntax:
y_new = calc(model, x_new); evaluates the model on new inputs
• How to build an ensemble of models:
ens = crosstrainensemble; creates an empty ensemble object
ens = train(ens,x,y,[],[],0.05); calls the training routines for several primary models and joins them into the ensemble object
• Ensemble evaluation:
y_new = calc(ens, x_new); evaluates the ensemble on new inputs

6.3 Adjusting Class Specific Training Parameters


The 5th argument of train specifies the training parameters. Apart from the
topology, training parameters often have to be specified. The parameters of a
model class can be queried, e.g.
tp = get(perceptron, 'trainparams');
which returns error_loss_margin: 0.0100; decay: 0.0010; rounds: 500;
mrate_init: 0.0100; max_weight: 10; mrate_grow: 1.2000; mrate_shrink: 0.5000.

6.4 Primary Models Types


The user has a choice of various primary model types:
ares          Adaptation of Friedman's MARS algorithm
ridge         Linear model trained by ridge regression
perceptron    Multilayer perceptron with iRPROP+ training
prbfn         Shimon Cohen's projection based radial basis function network
rbf           Mark Orr's radial basis function code
vicinal       k-nearest-neighbor regression with adaptive metric
mpmr          Thomas Strohmann's Minimax Probability Machine Regression
lssvm         Johan Suykens' least-squares SVM toolbox
tree          Adaptation of Matlab's built-in regression/classification trees
osusvm        SVM code based on Chih-Jen Lin's libSVM
vicinalclass  k-nearest-neighbor classification
6.5 Secondary Models Types


The user has a choice of various secondary model types which can be used for
ensembling or feature selection:
ensemble. Virtual parent class for all ensemble classes.
crosstrainensemble. Ensemble class that trains models according to the crosstraining
scheme. Creates ensembles of decorrelated models.
cvensemble. Ensemble class that trains models according to the cross-validation/
out-of-training scheme. Can be used to assess the OOT error.
extendingsetensemble. Boosting variant for regression.
subspaceensemble. Creates an ensemble of models where each single model is
trained on a random subspace of the input data set.
optimalsvm. Wrapper that trains RBF osusvm/lssvm with optimal parameter
settings (C and γ).
featureselector. Does feature selection and trains a model on the selected subset.
6.6 Experience in Ensembling


All the described methods have been implemented and are available for download
from our web sites http://chopin.zet.agh.edu.pl/~wichtel and
http://zti.if.uj.edu.pl/~merkwirth/entool.htm, which also contain a manual and an
installation guide. The toolkit is under continuous development and new features
and algorithms are being added. In the near future we will also make an on-line
statistical learning service available to users. The toolkit has been thoroughly
tested on a variety of problems from ECG modeling, CNN training, financial time
series, El Niño real data, physical measurements and many others [9, 10].

6.7 ECG Modeling


We used an ECG time series (see Figure 2) from the CinC Challenge 2000 data
sets: Data for development and evaluation of ECG-based apnea detectors (see the
Physiobank Database at URL: pcbim2.dsi.unifi.it/physiobank/database/apnea-ecg/index.html
for more information about the recording details). The data set
a01.dat was cropped to approximately 50000 samples by omitting later
measurements. From this time series we generated time-delay vectors of dimension 12
with delay 1 (see [11]). The modeling task consisted of learning the one-step-ahead
prediction for each of the 50000 delay vectors. Since the final error $MSE_E$
on this full data set F differs from run to run, we repeatedly partitioned F
randomly into a training partition $A_j$ of 40000 samples and a test partition $A'_j$
of 10000 samples. $A'_j$ is not used at all during the j-th run of the algorithm as
described in section 2; instead it is kept until the end of the training to assess the
test error of ensemble $E_j$. As parameters for the algorithm we choose N = 100, ΔN
= 10 and R = 6, which results in models $m_i$ that are trained on a maximum of
150 samples out of $A_j$. The repeated partitioning and execution of the algorithm leads to a
collection of ensembles $E_j$, j = 1, ..., 10. We then removed the worst performing
ensembles according to their error on the full data set F from that collection

Fig. 2. First 1000 samples of the ECG time series used for the numerical experiments.
Time-delay-reconstruction with dimension 12 and delay 1 was used to generate the
data sets F and G. Both data sets contain 50000 samples each. G is taken from the
later part of the ECG time series. Both data sets have no samples in common. Since
the MSE of models trained solely on F is lower on G than on F itself, the dynamics of
the ECG seems to be stationary.

and treated the reduced collection as an ensemble of ensembles EE. This method
has two advantages: If one run diverges or produces a substantially inferior
model, it is not used for the final collection of ensembles. Due to the stochastic
nature of the initial subset and of the training method used, the outputs of the
ensembles generated by the different runs are to some extent decorrelated. This
results not only in a substantial reduction of the error $MSE_{EE}$ of the collection
of ensembles on F, but also in a reduction of the generalization error on unseen
data (see [1, 2]). We used four different underlying model types which range from
strictly local to global models:
1. The model type that was mostly used within the numerical experiments is a
variant of the k-nearest-neighbor regressor with adaptive metric. In our case a
GA-like algorithm adapts the metric coefficients of one of the L1, L2 or L∞ metrics
as well as the number k of neighbors. The fitness of an individual is the negative
leave-one-out error on the training data, which can be easily calculated using a
fast nearest-neighbor algorithm ([24]). The metric coefficients are adapted by
mutation and crossover within a predefined number of generations.
2. As a semi-local model type we decided to use the hybrid PRBFN network based on
radial basis function and sigmoidal nodes (see [12]). We found this model type to
perform very well on several artificial and real world test problems.
3. As a global model type we chose a fully connected neural network with one hidden
layer and eight hidden layer neurons. The network is trained using a second-order
Levenberg-Marquardt algorithm. The implementation was taken from the NNSYSID2.0
toolbox written by Magnus Nørgaard (see [25]).
4. Another global model type we used was a linear model that was trained
using a cross-validation scheme.
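The time-delay reconstruction used to build the data sets (dimension 12, delay 1, one-step-ahead target) can be sketched as follows. The function name and the placeholder series are assumptions; no ECG data is involved.

```python
def delay_vectors(series, dim=12, delay=1):
    """Build (delay vector, next value) pairs for one-step-ahead prediction.

    The vector for time t is [s(t - (dim-1)*delay), ..., s(t - delay), s(t)]
    and its target is s(t + 1).
    """
    span = (dim - 1) * delay
    inputs, targets = [], []
    for t in range(span, len(series) - 1):
        inputs.append([series[t - k * delay] for k in range(dim - 1, -1, -1)])
        targets.append(series[t + 1])
    return inputs, targets


series = list(range(20))          # placeholder signal, not ECG data
X, y = delay_vectors(series, dim=12, delay=1)
```

A series of length L yields L − (dim − 1)·delay − 1 pairs, so consecutive delay vectors overlap strongly. Nearby samples are therefore highly correlated, which is worth keeping in mind when a data set built this way is partitioned at random.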
[Figure: three bar charts of the Mean Square Error (axis range 0 to 0.45) per Model Type.
Panel 1: Results of model types on training set F. Model types: k-NN, PRBFN,
NeuralNet, Linear, k-NN full. Series: Sequences, Ensembles of Sequences, Ensemble of
Standard Models.
Panel 2: Results of model types on test set G. Model types: k-NN, PRBFN, NeuralNet,
Linear, k-NN full. Series: Sequences, Ensembles of Sequences, Ensemble of Standard
Models.
Panel 3: Results on training set F with adding of randomly chosen samples. Model
types: k-NN, PRBFN, NeuralNet, Linear. Series: Sequences, Ensembles of Sequences.]

Several interesting points could be observed:

6.8 Classification of Anti-Viral Chemicals


6.8.1 NCI AIDS Antiviral Screen Data Set
To apply the ENTOOL toolbox to a problem encountered in cheminformatics,
we employed a special neural network type called Molecular Graph Networks
(MGN) that can be directly applied for QSAR applications ([13]).
We considered a data set of more than 42000 compounds from the DTP AIDS
antiviral screen data set of the NCI Open Database. The antiviral screen
utilized a soluble formazan assay to measure the ability of compounds to protect
human CEM cells [14] from HIV-1-induced cell death. In the primary screening
set of results, the activities of the compounds tested in the assay were
described to fall into three classes: confirmed active (CA) for compounds that
provided 100 % protection, confirmed moderately active (CM) for compounds
that provided more than 50 % protection, and confirmed inactive (CI) for the
remaining compounds or compounds that were toxic to the CEM cells and
therefore seemed to not provide any protection. The data set was obtained from
http://cactus.nci.nih.gov/ncidb2/download.html. The data set originally
consisted of 42689 2D structures with AIDS test data as of October 1999 and
was provided in SDF format. Seven compounds could not be parsed and had to
be removed. From the total of 42682 usable compounds 41179 compounds were

[Figure 3: ROC plot. x-axis: Frac. false positives (0 to 1); y-axis: Frac. true
positives (0 to 1). Curves: OOT Train Classes CM and CA; Test Classes CM and CA;
OOT Train Class CA only; Test Class CA only.]

Fig. 3. ROC curves for the classifiers constructed on the NCI AIDS Antiviral Screen
Data Set with ε-insensitive absolute loss. The Figure displays two pairs of ROC curves.
In this computational experiment we trained an ensemble of molecular graph networks
on a data set consisting of three classes of molecules (CI, CM and CA). To be able
to generate ROC curves, we had to reduce the number of classes to two by pooling
the molecules of two classes into a single class. The lower pair of ROC curves was
obtained by using the ensemble of classifiers to discriminate between CI as one class
and CA and CM as second class, while the upper pair details the ROC curves when
using the same ensemble of classifiers to discriminate between CI and CM as one class
and the confirmed actives CA as the second. The AUCs of the respective pairs of curves
are 0.82 resp. 0.81 for classification of CI versus CA and CM and 0.94 resp. 0.94 for
classification of CI and CM versus CA.
confirmed inactive, 1080 compounds were confirmed moderately active and 423
compounds were confirmed active.
To solve this multiclass classification problem, we used the one-versus-all ap-
proach based on the logistic loss function. We construct three ensembles of clas-
sifiers. Each of these three ensembles is trained to solve the binary classification
problem of discriminating one of the three classes against the rest and consists
of six MGNs. Each MGN consisted of 18 individual feature nets with iteration
depths ranging from 3 to 10 and a supervisor network with 24 hidden layer
neurons. The MGNs were trained by stochastic gradient descent with a fixed
number of $10^6$ gradient calculations. The global step size μ was decreased every
70000 gradient updates by a factor of 0.8. We randomly partitioned the data set
into a training set of 35000 compounds and test set of 7682 compounds. Each
MGN was trained on a random two-third of the 35000 training samples. Thus
the OOT output for every sample of the training set was computed by averaging
over 2 models while the output for the held-out test set by averaging over all 6
models of each ensemble.
Results for the classification experiments on NCI data set with classification
loss function are given in Figure 3. This figure displays two pairs of ROC curves.
The lower pair of ROC curves in Figure 3 was obtained by using the ensemble of
classifiers to discriminate between CI on the one hand and CA and CM on the
other hand, while the upper pair details the ROC curves when using the same
ensemble of classifiers to discriminate between CI and CM on the one hand
and the confirmed actives CA on the other. The remarkable coincidence of the
curves obtained by validation on the training part and from the held-out test
part of more than 7000 compounds indicates that the validation was performed
properly and does not exhibit overfitting. This result is supported by the AUCs
of the respective pairs of curves which are 0.82 (OOT) and 0.81 (test) for the
classification of CI versus CA and CM and both 0.94 for the classification of CI
and CM versus CA.
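The pooling of three classes into two and the AUC computation can be sketched as follows. The scores and labels are invented, and the pairwise AUC formula is a generic textbook formulation, not necessarily the procedure used for Figure 3.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outranks a negative one
    (ties count one half). Quadratic in the number of samples; a sketch only."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def pool_classes(class_labels, positive_classes):
    """Reduce a multi-class labeling to two classes for ROC analysis."""
    return [1 if c in positive_classes else 0 for c in class_labels]


# Hypothetical ensemble outputs for six compounds.
classes = ["CI", "CI", "CM", "CA", "CI", "CA"]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

auc_active = roc_auc(scores, pool_classes(classes, {"CM", "CA"}))   # CI vs. CM + CA
auc_ca_only = roc_auc(scores, pool_classes(classes, {"CA"}))        # CI + CM vs. CA
```

In this toy example the moderately active compound is scored below one inactive compound, which lowers the AUC of the pooled CM and CA task while the CA-only task remains perfectly separable. The same qualitative pattern appears in the reported AUCs of 0.82 versus 0.94.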
Results for the classification with logistic loss function are depicted in Figure 4.
The obtained AUC values are similar to the best results of several variants of a
classification method based on finding frequent subgraphs [15] (experiments H2
and H3 when omitting class CM from the test set for the ensemble constructed to
discriminate CA versus the two other classes). Wilton et al. [16] compare several
ranking methods for virtual screening on an older version of the NCI data set.
The best performing method there, binary kernel discrimination, is able to locate
12 % of all actives (CM and CA pooled) in the first 1 % and 35 % of all actives in
the first 5 % of the ranked NCI data set. MGNs trained with logistic loss are able
to find 36 % and 74 % of all actives in the first 1 % and 5 %, respectively, of the
NCI data set ranked according to the output of the ensemble of classifiers. When
interpreting the output of the three classifiers as pseudo-probabilities and assigning
the class label of the classifier with the highest output value to each sample, we are
able to compute confusion matrices for the OOT validation on the training set and for
[Figure 4: ROC plot. x-axis: Frac. false positives (0 to 1); y-axis: Frac. true
positives (0 to 1). Curves: Test Class CI; OOT Train Class CI; Test Class CM;
OOT Train Class CM; Test Class CA; OOT Train Class CA.]

Fig. 4. ROC curves for the three classifiers constructed on the NCI AIDS Antiviral
Screen Data Set using logistic loss and one-versus-all approach. The Figure displays
three pairs of ROC curves. In this computational experiment three ensembles of
molecular graph networks were trained on a data set consisting of three classes of
molecules (CI, CM and CA). The green/black pair of ROC curves corresponds to the
ensemble classifier discriminating class CM from the two other classes, the red/magenta
pair to class CI against the others. The blue/cyan pair details the ROC curves resulting
from the ensemble classifier trained to discriminate class CA against the two remaining
classes CI and CM. AUCs are 0.80 resp. OOT 0.81 for class CI, 0.75 resp. OOT 0.75
for class CM and 0.94 resp. OOT 0.91 for class CA.

Table 1. Confusion matrix for the OOT validation on the training set obtained by the
system of three classifiers on the NCI AIDS Antiviral Screen data set using logistic loss
and one-versus-all approach. The values displayed indicate the fraction of the samples
of each class that are classified into the respective classes. E.g. 83.5 % of the samples of
class CI are classified correctly, 12.6 % of the CI samples are classified wrongly into
class CM and the remaining 3.8 % are wrongly classified into class CA. While samples
of classes CI and CA are mostly classified correctly, samples of class CM (confirmed
moderately active) are recognized correctly in only 38 % of the cases.

               Predicted Class
Actual Class   CI      CM      CA
CI             0.835   0.126   0.038
CM             0.408   0.380   0.212
CA             0.124   0.187   0.690

the held-out test set, given in Tables 1 and 2. While classes CI and CA can be
correctly classified in a majority of the cases, samples of class CM are recognized
correctly in less than 40 % of all cases.
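The step from three one-versus-all outputs to a row-normalized confusion matrix can be sketched as follows. The four samples and their classifier outputs are hypothetical; only the decision rule (label of the classifier with the highest output) and the row normalization follow the description above.

```python
def predict_class(outputs):
    """outputs: dict mapping class name to the output of that class's
    one-versus-all classifier; the class with the highest output wins."""
    return max(outputs, key=outputs.get)


def confusion_matrix(actual, predicted, classes):
    """Row-normalized matrix: entry [a][p] is the fraction of the samples of
    actual class a that were assigned to class p."""
    counts = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    matrix = {}
    for a in classes:
        total = sum(counts[a].values())
        matrix[a] = {p: (counts[a][p] / total if total else 0.0) for p in classes}
    return matrix


classes = ["CI", "CM", "CA"]
samples = [   # (actual class, hypothetical outputs of the three classifiers)
    ("CI", {"CI": 0.9, "CM": 0.2, "CA": 0.1}),
    ("CI", {"CI": 0.4, "CM": 0.5, "CA": 0.1}),
    ("CM", {"CI": 0.3, "CM": 0.6, "CA": 0.2}),
    ("CA", {"CI": 0.1, "CM": 0.3, "CA": 0.7}),
]
actual = [a for a, _ in samples]
predicted = [predict_class(o) for _, o in samples]
cm = confusion_matrix(actual, predicted, classes)
```

Each row sums to one, so the diagonal entry of a row is the per-class recall. This normalization makes the small classes CM and CA comparable with the much larger class CI.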
Table 2. Confusion matrix for the held-out test set. 85 % of the samples of class CI are
classified correctly, 12 % of the samples of class CI are classified wrongly as belonging
to class CM and the remaining 3 % are wrongly classified to fall into class CA.

               Predicted Class
Actual Class   CI      CM      CA
CI             0.852   0.121   0.027
CM             0.444   0.369   0.187
CA             0.093   0.160   0.747

6.9 Classifiers for the Chromosome-Damaging Potential of Chemicals

We used ensemble models for classification in order to predict the
chromosome-damaging potential of chemicals as assessed in the in vitro chromosome
aberration (CA) test. This CA test is used in the early stages of
the drug discovery process in order to exclude toxic compounds. A detailed
description of our approach in comparison to the performance of existing methods
can be found in Rothfuss et al. [17].
The CA-test data used in this study were obtained from two recently
published data collections [18, 19]. Further details on the original data source can be
obtained from the references of both data compilations. The genotoxicity data
collection from Snyder et al. [19] contains in vitro cytogenetics data for 248
marketed pharmaceuticals. Structural information could be retrieved for 229 of the
248 compounds. Altogether, 189 negative and 40 positive data records from this
data source could be used for model-building purposes. The database collected
by Kirkland et al. [18] contains CA-test data for 488 structurally diverse
compounds, consisting of industrial, environmental, and pharmaceutical compounds.
Structural information was retrieved for 450 of these compounds. Altogether,
168 negative and 282 positive data records from this data source could be used
for model-building purposes.
The descriptors that serve as input variables for the classification were
calculated with the dragonX software [20] that was originally developed by Milano
Chemometrics and the QSAR Research Group. The software generates a total
number of 1664 molecular descriptors that are grouped into 20 different blocks,
such as constitutional descriptors, topological descriptors, and walk and path
counts [20]. For each compound in the data set, all 1664 descriptors were
calculated. Because many of these descriptors are redundant or carry correlated
information, feature selection needs to be performed in order to select the most
useful subset of descriptors to build a classification model. Our feature selection
approach follows the method of variable importance as proposed by Breiman
[21]. The underlying idea is to select descriptors on the basis of the decrease of
classification accuracy after the permutation of the descriptors [21]. Briefly, an
ensemble of decision trees is built on 90% of the data (training set), using all
descriptors as input variables and the associated activity (CA-test result) as output
variable. The prediction accuracy of the classification model on an out-of-training
portion of the data (test set) is recorded. In a second step, the same is
done after the successive permutation of each descriptor. The relative decrease
of classification accuracy is the variable importance, following the idea that the
most discriminative descriptors are the most important ones. The descriptor set
was reduced iteratively, resulting in a final set of 14 descriptors, including
topological charge indices, electronegativity and shape descriptors. Several of the
identified descriptors can be directly related to genotoxicity and specify
characteristics of structures involved in DNA modifications (see Rothfuss et al. [17]
and the references therein). Our final classifier was trained with several different
model classes to achieve a diverse ensemble:
• Classification and regression trees (CART)
• Support vector machines (SVM) with Gaussian kernels
• Linear and quadratic discriminant analysis
• Linear ridge models
• Feedforward neural networks with two hidden layers trained by gradient
descent
• K-nearest-neighbor models with adaptive metrics
In order to estimate the performance of the final ensemble model, we performed
a 20-fold cross-validation, wherein 10% of the data was randomly kept out as test
set and the remaining 90% of the data was used for model training. The results
with respect to training and test set are reported in Table 3.

Table 3. The performance of the ensemble classification model, mean values calculated
over 20 cross-validation-folds

               Accuracy   Sensitivity   Specificity
Training set   75.9%      75.1%         76.8%
Test set       71.6%      70.8%         71.4%
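The permutation-based variable importance used for descriptor selection can be sketched on synthetic data. A nearest-centroid classifier stands in for the ensemble of decision trees used in the chapter, and the two synthetic descriptors, the class separation and all function names are assumptions.

```python
import random


def nearest_centroid_fit(X, y):
    """Tiny stand-in classifier (the chapter uses an ensemble of decision trees)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X_test, y_test, rng):
    """Relative decrease of test accuracy after permuting one descriptor at a time."""
    base = accuracy(centroids, X_test, y_test)
    importances = []
    for j in range(len(X_test[0])):
        column = [x[j] for x in X_test]
        rng.shuffle(column)
        X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X_test, column)]
        importances.append((base - accuracy(centroids, X_perm, y_test)) / base)
    return importances


rng = random.Random(0)
# Synthetic data: descriptor 0 separates the two classes, descriptor 1 is pure noise.
X = [[label * 4.0 + rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for label in (0, 1) for _ in range(100)]
y = [label for label in (0, 1) for _ in range(100)]

order = list(range(len(X)))
rng.shuffle(order)
split = int(0.9 * len(order))                 # 90% training, 10% test
train_idx, test_idx = order[:split], order[split:]
model = nearest_centroid_fit([X[i] for i in train_idx], [y[i] for i in train_idx])
importances = permutation_importance(
    model, [X[i] for i in test_idx], [y[i] for i in test_idx], rng
)
```

Permuting the informative descriptor destroys its relation to the class label and the accuracy drops, whereas permuting the noise descriptor leaves the predictions essentially unchanged. Descriptors with the smallest importance are the natural candidates for removal in an iterative reduction.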

As pointed out by several research groups, state-of-the-art machine learning
approaches in the field of toxicology prediction can compete with most of the
commercial software tools [22, 23, 17], and they have the further advantage that
they can be trained with the additional in-house data collections of institutes
or companies.

7 Why Do Ensembling? Pros and Cons


Building ensembles of models gives several advantages:
• Straightforward extension of existing modeling algorithms
• Almost fool-proof minimization of generalization error
• Makes no assumptions on the structure of the underlying models
• Alleviates the problem of model selection
These advantages come at the expense of increased computational effort. The
described strategies have been tested extensively using measured data series from
experiments in electronic circuits, real ECG time series and financial data.

Acknowledgments
This work has been prepared in part within the scope of the Research Training
Network COSYC of SENS No. HPRN-CT-2000-00158 of the 5th EU Framework.

References
[1] Krogh, A., Vedelsby, J.: Neural network ensembles, cross validation, and active
learning. In: Tesauro, G., Touretzky, D., Leen, T. (eds.) Advances in Neural
Information Processing Systems, vol. 7, pp. 231–238. MIT Press, Cambridge (1995),
citeseer.ist.psu.edu/krogh95neural.html
[2] Perrone, M.P., Cooper, L.N.: When Networks Disagree: Ensemble Methods for
Hybrid Neural Networks. In: Mammone, R.J. (ed.) Neural Networks for Speech
and Image Processing, pp. 126–142. Chapman and Hall, Boca Raton (1993)
[3] Hansen, L., Salamon, P.: Neural Network Ensembles. IEEE Trans. on Pattern
Analysis and Machine Intelligence 12(10), 993–1001 (1990)
[4] Naftaly, U., Intrator, N., Horn, D.: Optimal ensemble averaging of neural networks.
Network, Comp. Neural Sys. 8, 283–296 (1997)
[5] Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and
Regression Trees. Wadsworth International Group, Belmont (1984)
[6] Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning.
Springer Series in Statistics. Springer, Heidelberg (2001)
[7] Krogh, A., Sollich, P.: Statistical mechanics of ensemble learning. Physical Review
E 55(1), 811–825 (1997)
[8] Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996),
citeseer.ist.psu.edu/breiman96bagging.html
[9] Merkwirth, C., Ogorzalek, M., Wichard, J.: Stochastic gradient descent training
of ensembles of DT-CNN classifiers for digit recognition. In: Proceedings of the
European Conference on Circuit Theory and Design ECCTD 2003, Kraków, Poland,
vol. 2, pp. 337–341 (September 2003)
[10] Wichard, J., Ogorzalek, M.: Iterated time series prediction with ensemble models.
In: Proceedings of the 23rd International Conference on Modelling Identification
and Control (2004)
[11] Suykens, J., Vandewalle, J. (eds.): Nonlinear Modeling - Advanced Black–Box
Techniques. Kluwer Academic Publishers, Dordrecht (1998)
[12] Cohen, S., Intrator, N.: A hybrid projection based and radial basis function
architecture. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 147–155.
Springer, Heidelberg (2000)
[13] Merkwirth, C., Lengauer, T.: Automatic generation of complementary descriptors
with molecular graph networks (2004)
[14] Weislow, O., Kiser, R., Fine, D., Bader, J., Shoemaker, R., Boyd, M.: New soluble
formazan assay for HIV-1 cytopathic effects: application to high flux screening of
synthetic and natural products for AIDS antiviral activity. J. Nat. Cancer Inst. 81,
577–586 (1989)
Ensemble Modeling for Bio-medical Applications 135

[15] Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based
approaches for classifying chemical compounds. In: Proceedings of the Third IEEE
International Conference on Data Mining ICDM 2003, Melbourne, Florida, pp.
35–42 (November 2003)
[16] Wilton, D., Willett, P., Lawson, K., Mullier, G.: Comparison of ranking methods
for virtual screening in lead-discovery programs. J. Chem. Inf. Comput. Sci. 43,
469–474 (2003)
[17] Rothfuss, A., Steger-Hartmann, T., Heinrich, N., Wichard, J.: Computational
prediction of the chromosome-damaging potential of chemicals. Chemical Research
in Toxicology 19(10), 1313–1319 (2006)
[18] Kirkland, D., Aardema, M., Henderson, L., Muller, L.: Evaluation of the ability
of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens
and non-carcinogens. Mutat. Res. 584, 1–256 (2005)
[19] Snyder, R.D., Pearl, G.S., Mandakas, G., Choy, W.N., Goodsaid, F.,
Rosenblum, I.Y.: Assessment of the sensitivity of the computational programs DEREK,
TOPKAT and MCASE in the prediction of the genotoxicity of pharmaceutical
molecules. Environ. Mol. Mutagen. 43, 143–158 (2004)
[20] Todeschini, R.: Dragon Software, http://www.talete.mi.it/dragon_exp.htm
[21] Breiman, L.: Arcing classifiers. The Annals of Statistics 26(3), 801–849 (1998),
http://citeseer.nj.nec.com/breiman98arcing.html
[22] Serra, J.R., Thompson, E.D., Jurs, P.C.: Development of binary classification of
structural chromosome aberrations for a diverse set of organic compounds from
molecular structure. Chem. Res. Toxicol. 16, 153–163 (2003)
[23] Li, H., Ung, C., Yap, C., Xue, Y., Li, Z., Cao, Z., Chen, Y.: Prediction of
genotoxicity of chemical compounds by statistical learning methods. Chem. Res.
Toxicol. 18, 1071–1080 (2005)
[24] McNames, J.: Innovations in Local Modeling for Time Series Prediction, Ph.D.
Thesis, Stanford University (1999)
[25] Norgaard, M.: Neural Network Based System Identification Toolbox, Tech. Report.
00-E-891, Department of Automation, Technical University of Denmark (2000),
http://www.iau.dtu.dk/research/control/nnsysid.html
Automatic Fingerprint Identification Based on
Minutiae Points

Maciej Hrebień and Józef Korbicz

Institute of Control and Computation Engineering


University of Zielona Góra
ul. Podgórna 50
65-246 Zielona Góra
Poland
{m.hrebien,j.korbicz}@issi.uz.zgora.pl

Introduction
In recent years security systems have played an important role in our community.
Cashless payment operations, restricted access to specific areas and the secrecy of
information stored in databases are only a small part of daily life that requires
special treatment. Besides traditional locks, keys and ID cards, there is an increased
interest in biometric technologies, that is, human identification based on one’s
individual features [2, 19].
Fingerprint identification is one of the most important biometric technologies
considered nowadays. The uniqueness of a fingerprint is exclusively determined by local
ridge characteristics called minutiae points. Automatic fingerprint matching depends
on the comparison of these minutiae and the relationships between them [13, 14, 18].
In this paper several methods of fingerprint matching are discussed, namely, the
Hough transform, the structural global star method and the speeded-up correlation
approach (Sect. 4). Because there is still a need to find the best matching approach,
experiments on on-line fingerprints were conducted to compare the quality and the
relative run time of the algorithms considered; the experimental results are
grouped in Section 5. The paper also gives a detailed description of image
enhancement (Sect. 2) and the minutiae detection scheme (Sect. 3) used in our research.

1 Fingerprint Representation
A fingerprint is a structure of ridges and valleys unique to each human being. The
uniqueness is exclusively determined by local ridge characteristics called minutiae
points and the relationships between them [6].
The two most common minutiae points considered in today’s research are known as
ending and bifurcation. An ending point is the place where a ridge ends its flow, and
a bifurcation is the place where a ridge forks into two parts (Fig. 1).

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 137 – 152.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Fig. 1. Example of ending and bifurcation points

2 Image Enhancement
A very common technique for reducing the quantity of information received from a
fingerprint scanner in the form of a grayscale image is Gabor filtering
[13]. The filter, based on local ridge orientation and frequency estimates, produces a
nearly binary output: the intensity histogram has a U-shaped form [8].
The Gabor filter is defined by

\[
h(x, y, f, \theta) = \exp\left(-\frac{1}{2}\left[\frac{x_\theta^2}{\delta_x^2} + \frac{y_\theta^2}{\delta_y^2}\right]\right)\cos(2\pi f x_\theta), \tag{1}
\]

where

\[
x_\theta = x\sin\theta + y\cos\theta \quad\text{and}\quad y_\theta = x\cos\theta - y\sin\theta,
\]

θ is the local ridge orientation, i.e., the angle that fingerprint ridges form with the
horizontal axis when crossing an arbitrarily small block, f is an estimate of the
ridge frequency in that block, and δx and δy are space constants defining the stretch of the
filter (Fig. 2).
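
As an illustration, Eq. (1) can be sampled on a small square grid to obtain a convolution mask. The sketch below is a direct transcription of the formula; the function name `gabor_kernel` and the `half_size` parameter are assumptions made for this example, and the default values follow the parameters quoted for Fig. 2.

```python
import math

def gabor_kernel(theta, f=0.1, delta_x=4.0, delta_y=4.0, half_size=5):
    """Sample Eq. (1) on a (2*half_size + 1)^2 grid centred at the origin."""
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Rotated coordinates exactly as defined below Eq. (1).
            x_t = x * math.sin(theta) + y * math.cos(theta)
            y_t = x * math.cos(theta) - y * math.sin(theta)
            envelope = math.exp(-0.5 * (x_t ** 2 / delta_x ** 2
                                        + y_t ** 2 / delta_y ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * f * x_t))
        kernel.append(row)
    return kernel
```

The mask is point-symmetric about its centre, which is the property exploited later in the text to reduce the number of filter calculations.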
Because the ridge flow does not change significantly in its local neighborhood, the
orientation angle is calculated for blocks of the fingerprint image rather than for each
point separately as suggested in the more direct mask method presented in [15]. The

Fig. 2. Graphical representation of the Gabor filter (f = 1/10, δx = δy = 4.0)



ridge orientation θ for a specified block centered at the position (i, j) can be then
estimated with the equations

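\[
\theta(i, j) = \frac{1}{2}\tan^{-1}\left(\frac{V_y(i, j)}{V_x(i, j)}\right), \tag{2}
\]

\[
V_x(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}}\ \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} 2\,\partial_x(u, v)\,\partial_y(u, v), \tag{3}
\]

\[
V_y(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}}\ \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} \left(\partial_x^2(u, v) - \partial_y^2(u, v)\right), \tag{4}
\]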

where ∂x(u, v) and ∂y(u, v) are pixel gradients at the position (u, v) (e.g., estimated
with Sobel’s mask [16]) and w is the block size (w = 16 for 500 dpi fingerprint images,
or w = 15 to avoid ambiguous selection of the central point [5]). Additionally,
taking into account the fact that fingerprint ridges are not directed, an orientation equal
to 225° can be considered as equal to 45°, so the orientation θ is usually described on
a half-open interval, for example, θ ∈ [0, π).
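
The block-wise estimate can be illustrated with the doubled-angle least-squares form that is common in the fingerprint literature. The sketch below is an assumption-laden illustration rather than a transcription of Eq. (2): it uses `atan2` to resolve the quadrant, takes the ridge direction as perpendicular to the dominant gradient direction, and may therefore differ from Eq. (2) in its angle convention. The gradients are assumed to be precomputed (e.g., with a Sobel mask).

```python
import math

def block_orientation(gx, gy):
    """Ridge orientation in [0, pi) for one block of gradient samples.

    gx, gy: 2-D lists with the x- and y-gradients of every pixel in the block.
    """
    s_xy = 0.0    # sum of 2 * gx * gy over the block, cf. Eq. (3)
    s_diff = 0.0  # sum of gx^2 - gy^2 over the block, cf. Eq. (4)
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            s_xy += 2.0 * dx * dy
            s_diff += dx * dx - dy * dy
    # Dominant gradient direction from the doubled-angle representation.
    gradient_dir = 0.5 * math.atan2(s_xy, s_diff)
    # Ridges run perpendicular to the gradient; fold into [0, pi).
    return (gradient_dir + math.pi / 2.0) % math.pi
```

For a block whose gradients all point along the x axis, the function returns π/2, i.e., vertical ridges.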
The local ridge frequency f can be estimated by counting an average number of
pixels between two consecutive peaks of gray-levels along the direction normal to the
local ridge orientation. The idea is based on a w × l (where w < l) oriented window
placed at the center of each block and rotated with the θ angle. The frequency of each
block is given by
\[
f(i, j) = \frac{1}{T(i, j)}, \tag{5}
\]

where T(i, j) is the average number of pixels between two consecutive peaks in the
so-called x-signature obtained from

\[
X_k = \frac{1}{w}\sum_{d=0}^{w-1} W(d, k), \quad k = 0, 1, \ldots, l - 1. \tag{6}
\]

If the filtering time must be minimized, the local ridge frequency f can be given a
constant value, the same for all blocks. Proper selection of this value is crucial
for the final result. Too large a frequency causes spurious ridges to be created. Too
small a frequency, on the contrary, merges nearby ridges into one.
For 500 dpi fingerprint images, the inter-ridge distance is approximately equal to 10 pixels,
so f can be given a value of 1/10 [10].
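
A minimal sketch of Eqs. (5) and (6) is given below. It assumes that the oriented window has already been extracted and is passed as a list of w rows with l gray-level samples each; the simple local-maximum peak detector is an assumption of this example, not a prescription from the cited works.

```python
def x_signature(window):
    """Eq. (6): average the w rows of an oriented window column by column."""
    w = len(window)
    length = len(window[0])
    return [sum(window[d][k] for d in range(w)) / w for k in range(length)]

def ridge_frequency(signature):
    """Eq. (5): f = 1 / T, with T the mean distance between consecutive peaks."""
    peaks = [k for k in range(1, len(signature) - 1)
             if signature[k - 1] < signature[k] >= signature[k + 1]]
    if len(peaks) < 2:
        return None  # frequency cannot be estimated in this block
    mean_spacing = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return 1.0 / mean_spacing
```

When the function returns `None`, a constant fallback such as the 1/10 value mentioned above could be used.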
The space constants δx and δy define the stretch of the Gabor filter along the OX
and the OY axis. Selecting their values is a trade-off. If δx and δy are too small, the

filter is not effective in removing noise. If δx and δy are too large, the filter is more
robust in removing noise but introduces a smoothing effect and the ridge details are
lost. The δx and δy values should be approximately equal to half the inter-ridge distance
to maximize enhancement effectiveness. For 500 dpi fingerprint images, δx and δy are
usually equal to 4.0 or, sometimes, δy = 3.0 if there is a concern about the creation of
spurious ridges [5].
The Gabor filtering process can be sped up by noting that the filter is
symmetrical about the OX as well as the OY axis, so the calculations can be significantly
reduced. Additionally, a segmentation mask (constructed, for example, using the
variance approach [12, 17]) can be used so that the filter calculations are
performed only in those parts of the image which were marked as blocks containing the
object’s pixels (ridges) [7].
An example result of fingerprint image enhancement with the final binary output is
illustrated in Fig. 3. The example input image was first normalized to reduce variance
in gray-levels of each pixel [1] (without changing the clarity between ridges and
valleys) with the equations
\[
M(X) = \frac{1}{N^2}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} X(i, j), \tag{7}
\]

\[
VAR(X) = \frac{1}{N^2}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \left(X(i, j) - M(X)\right)^2, \tag{8}
\]

\[
G(i, j) =
\begin{cases}
M_0 + \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) > M \\[2ex]
M_0 - \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) \le M,
\end{cases} \tag{9}
\]
where M0 and VAR0 are the expected mean value and variance (both usually equal to
128), and N is the size of the N × N input image X.
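
The normalization of Eqs. (7)-(9) can be sketched as follows, assuming the square-root form of Eq. (9) that is standard in the fingerprint enhancement literature. The image is represented as a list of rows of gray levels; the function name and the handling of a constant image are assumptions of this example.

```python
import math

def normalize(image, m0=128.0, var0=128.0):
    """Map an image to the desired mean m0 and variance var0, Eqs. (7)-(9)."""
    n = len(image) * len(image[0])
    mean = sum(sum(row) for row in image) / n                      # Eq. (7)
    var = sum((p - mean) ** 2 for row in image for p in row) / n   # Eq. (8)
    if var == 0.0:
        # A constant image carries no contrast; return the target mean.
        return [[m0 for _ in row] for row in image]
    out = []
    for row in image:
        new_row = []
        for p in row:
            shift = math.sqrt(var0 * (p - mean) ** 2 / var)
            # Eq. (9): add the shift above the mean, subtract it otherwise.
            new_row.append(m0 + shift if p > mean else m0 - shift)
        out.append(new_row)
    return out
```

After the transformation the output image has mean m0 and variance var0, while the ordering of gray levels between ridges and valleys is preserved.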

Fig. 3. Example of image enhancement and binarization based on the Gabor filter (1)

(1) Off-line fingerprint taken from the U.S. National Institute of Standards and Technology
database, NIST-4, http://www.nist.gov

3 Minutiae Detection

3.1 Image Thinning, Coordinates and Types

Image thinning can be considered as a process of erosion [4, 7]. Pixels on the
edges of an object (a fingerprint ridge) are removed only if they do not affect the
coherence of the object as a whole, and they are left untouched otherwise. The
process is repeated until there are no more surplus pixels to
remove. The thickness of ridges in the resulting skeleton image has to be equal to one pixel,
and the shape and run of the original ridges should be preserved. An example of a
thinned form of a fingerprint can be seen in Fig. 4.
To determine whether a pixel at the position (i, j) in the skeleton form of a
fingerprint is a minutiae point, we apply the mask rules illustrated in Fig. 5.
A bifurcation or an ending is defined in a place where the perimeter of the mask (the eight
nearest neighbors of the central point) is intersected in three parts or one part, respectively.
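
One common way to implement the mask rule is the crossing number: half the number of 0/1 transitions met when walking once around the eight neighbours. The sketch below uses this formulation as an assumption; it treats the skeleton as a list of rows with ridge pixels equal to 1 and requires the examined pixel to lie away from the image border.

```python
def crossing_number(skeleton, i, j):
    """Number of ridge branches crossing the perimeter of the 3x3 mask."""
    # Eight neighbours visited clockwise, starting from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    ring = [skeleton[i + di][j + dj] for di, dj in offsets]
    transitions = sum(abs(ring[k] - ring[(k + 1) % 8]) for k in range(8))
    return transitions // 2

def classify_pixel(skeleton, i, j):
    """Label a skeleton pixel as 'ending', 'bifurcation' or 'other'."""
    if skeleton[i][j] != 1:
        return "background"
    cn = crossing_number(skeleton, i, j)
    return {1: "ending", 3: "bifurcation"}.get(cn, "other")
```

A pixel on an ordinary ridge segment has a crossing number of 2 and is therefore not reported as a minutia.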

Fig. 4. Example of a thinned form of a fingerprint image from Fig. 3

Fig. 5. Example of 3×3 masks used to define: a) bifurcation, b) non-minutiae point, c) ending,
d) noise

3.2 Minutiae Orientation

To define the orientation of each minutia we can use a (7 × 7) mask technique with
angles quantized to 15° and the center placed in the minutiae point. The orientation of
an ending point is given by the point where the ridge crosses the mask. The
orientation of a bifurcation point can be estimated with the same method but only the
leading ridge is considered, that is, the ridge with the maximum sum of angles to the
other two ridges of the bifurcation (see, for instance, Fig. 6).

Fig. 6. Bifurcation (60°) and ending (210°) point orientation example

3.3 False Minutiae Points

At the end of the minutiae detection process, all determined points should be verified
to check that they were not created by accident, for example, as a result of filtering errors.
Thus, the following minutiae should be treated as false and removed from the set: those
at the borders of an image, those in a very close neighbourhood of the region marked
as the background in the segmentation mask, those created as a result of a local ridge
peak (a bifurcation very close to an ending point) and those arising from the pore
structure of a fingerprint (a ridge hole, i.e., two bifurcations in a close
neighbourhood with opposite orientations). The local ridge noise problem can be
reduced, e.g., by ridge smoothing techniques (a pixel is given a value using the
majority rule in the nearest neighbourhood [4]) applied just after image binarization,
so that all small ridge holes are patched and all peaks smoothed out.

4 Minutiae Matching

4.1 Hough Transform

Let MA and MB denote minutiae sets determined from the images A and B:

\[
\begin{aligned}
M^A &= \{m_1^A, m_2^A, \ldots, m_m^A\} \\
M^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}.
\end{aligned} \tag{10}
\]

Each minutia is defined by the image coordinates (x, y) and the orientation angle
θ ∈ [0, 2π], that is,

\[
\begin{aligned}
m_i^A &= \{x_i^A, y_i^A, \theta_i^A\}, \quad i = 1 \ldots m \\
m_j^B &= \{x_j^B, y_j^B, \theta_j^B\}, \quad j = 1 \ldots n.
\end{aligned} \tag{11}
\]

What we expect is to find such a transformation of the minutiae set MB into MC that
will be the best estimation of MA (the MA minutiae have to be covered by the MC
minutiae within a given distance tolerance (r0) and orientation tolerance (θ0)). This means that

\[
\forall_i\, \exists_j \left( S(m_i^A, m_j^C) \le r_0 \ \text{and}\ K(\theta_i^A, \theta_j^C) \le \theta_0 \right) \tag{12}
\]

has to be maximized, where S is a function defining the distance between a pair of
minutiae (in Chebyshev’s sense) and K is a function defining the difference
between minutiae orientations (assuming that the difference between 3° and 358° is
equal to 5°):

\[
\begin{aligned}
S(a, b) &= \max(|a_x - b_x|,\ |a_y - b_y|), \\
K(\alpha, \beta) &= \min(|\alpha - \beta|,\ 2\pi - |\alpha - \beta|).
\end{aligned} \tag{13}
\]
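
The two functions of Eq. (13) are short enough to be written out directly. In the sketch below, points are (x, y) tuples and angles are in radians; the extra modulo operation is an assumption added so that inputs outside [0, 2π] are also handled.

```python
import math

def chebyshev_distance(a, b):
    """S(a, b) of Eq. (13): the larger of the two coordinate differences."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def orientation_difference(alpha, beta):
    """K(alpha, beta) of Eq. (13): the smaller angle between two directions."""
    d = abs(alpha - beta) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)
```

With these definitions, orientations of 3° and 358° differ by 5°, as required in the text.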

∀i ∀j ∀k ∀l  A(i, j, k, l) ← 0

FOR {x_i^A, y_i^A, θ_i^A} ∈ M^A, i = 1...m
  FOR {x_j^B, y_j^B, θ_j^B} ∈ M^B, j = 1...n
    FOR θ_k ∈ {θ_1, θ_2, ..., θ_K}, k = 1...K
      IF K(θ_i^A + θ_k, θ_j^B) ≤ θ_0
        FOR s_l ∈ {s_1, s_2, ..., s_L}, l = 1...L
        {
          \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} \leftarrow \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} - s_l \begin{bmatrix} \cos\theta_k & \sin\theta_k \\ -\sin\theta_k & \cos\theta_k \end{bmatrix} \begin{bmatrix} x_j^B \\ y_j^B \end{bmatrix}

          Δx#, Δy#, θ#, s# ← discretize(Δx, Δy, θ_k, s_l)

          A(Δx#, Δy#, θ#, s#) ← A(Δx#, Δy#, θ#, s#) + 1
        }

(Δx+, Δy+, θ+, s+) ← arg max(A)

Fig. 7. Hough transform routine



The Hough transform, which was adopted for fingerprint matching [14], can be
performed to find the best alignment of the sets MA and MB including the possible
scale, rotation and displacement of the image A versus B. The transformation space is
discretized: each parameter of the geometric transform (Δx, Δy, θ, s) comes from a
finite set of values. A four-dimensional accumulator A is used to accumulate
evidence of alignment between each pair of minutiae considered. The best
parameters of the geometric transform, that is, (Δx+, Δy+, θ+, s+), are the arguments of the
maximum value in the accumulator (see the procedure in Fig. 7).
After performing the transformation, minutiae points are juxtaposed to calculate
the matching score with respect to their distance, orientation and type (with a given
tolerance).
An example result of the Hough transform is shown in Fig. 8.
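
The voting routine of Fig. 7 can be sketched as follows. The example is illustrative only: it stores the accumulator in a dictionary rather than a four-dimensional array, discretizes the displacement by rounding to a hypothetical `bin_size`, and represents each minutia as an (x, y, θ) tuple with θ in radians.

```python
import math
from collections import Counter

def orientation_difference(alpha, beta):
    d = abs(alpha - beta) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def hough_align(MA, MB, thetas, scales, theta0=0.1, bin_size=1.0):
    """Return ((dx, dy, theta, s), votes) for the most supported transform."""
    acc = Counter()
    for xa, ya, ta in MA:
        for xb, yb, tb in MB:
            for k, th in enumerate(thetas):
                # Orientation gate from Fig. 7.
                if orientation_difference(ta + th, tb) > theta0:
                    continue
                c, sn = math.cos(th), math.sin(th)
                for l, s in enumerate(scales):
                    # Displacement left after scaling and rotating the B point.
                    dx = xa - s * (c * xb + sn * yb)
                    dy = ya - s * (-sn * xb + c * yb)
                    key = (round(dx / bin_size), round(dy / bin_size), k, l)
                    acc[key] += 1
    if not acc:
        return None, 0
    (bx, by, k, l), votes = acc.most_common(1)[0]
    return (bx * bin_size, by * bin_size, thetas[k], scales[l]), votes
```

In practice the candidate sets `thetas` and `scales` and the bin size control the trade-off between alignment accuracy and the size of the accumulator.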

4.2 Global Star Method

The global star method is based on a structural model of fingerprints. Distinguishing
between the types of minutiae (ending or bifurcation) and including the possible
scale, rotation and displacement of images, a star can be created with the central point
in one of the minutiae and with the arms directed to the remaining ones (Fig. 9a).
Assuming, as in the previous considerations, that

\[
\begin{aligned}
M^A &= \{m_1^A, m_2^A, \ldots, m_m^A\} \\
M^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}
\end{aligned} \tag{14}
\]

denote sets of minutiae of one type, m stars for the image A and n stars for the
image B can be created:

\[
\begin{aligned}
S^A &= \{S_1^A, S_2^A, \ldots, S_m^A\} \\
S^B &= \{S_1^B, S_2^B, \ldots, S_n^B\},
\end{aligned} \tag{15}
\]

where each star can be defined as

\[
\begin{aligned}
S_i^A &= \{m_1^A, m_2^A, \ldots, m_m^A\}_{i=1 \ldots m}, \quad \text{center in } m_i^A \\
S_j^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}_{j=1 \ldots n}, \quad \text{center in } m_j^B.
\end{aligned} \tag{16}
\]

In contrast to the local methods [18], a voting technique for selecting the best
aligned pair of stars (S_wi^A, S_wj^B) can be performed (Fig. 10), which includes matching
features such as the between-minutiae angle K and the ridge count D (Fig. 9c and 9b,
respectively). In the final decision, the orientation of the minutiae is also taken into
account (Fig. 11) after their adjustment by the angle of orientation difference between
the central points of the stars from the best alignment (α).
An example result of the global star matching method is shown in Fig. 12.

Fig. 8. Example result of the Hough transform – matched minutiae, with a given tolerance, are
marked with ellipses

Fig. 9. General explanation of the star method: a) example star created for fingerprint ending
points, b) ridge counting (here equal to 5), c) example of relative angle determination between
the central minutiae and the remaining ones

∀ i=1...m ∀ j=1...n  A(i, j) ← 0

FOR S_i^A ∈ S^A, i = 1...m
  FOR S_j^B ∈ S^B, j = 1...n
    FOR m_k^A ∈ S_i^A − {m_i^A}
      assuming that: m_l^B ∈ S_j^B − {m_j^B}
      IF ∃l ( |D(m_j^B, m_l^B) − D(m_i^A, m_k^A)| ≤ d_0 and |K(m_j^B, m_l^B) − K(m_i^A, m_k^A)| ≤ k_0 )
        A(i, j) ← A(i, j) + 1

S_wi^A ← S^A(arg_i(max(A)))
S_wj^B ← S^B(arg_j(max(A)))

Fig. 10. First stage of the global star matching algorithm
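
The first stage can be illustrated with a short voting routine. The sketch assumes that every star has already been reduced to a list of (D, K) pairs, one per arm, holding the ridge count and the between-minutiae angle measured from the central point; this data layout and the default tolerances are assumptions of the example.

```python
def best_star_pair(stars_a, stars_b, d0=1, k0=0.1):
    """Return ((i, j), votes) for the best aligned pair of stars (cf. Fig. 10)."""
    best, best_votes = None, -1
    for i, star_a in enumerate(stars_a):
        for j, star_b in enumerate(stars_b):
            # An arm of star A votes if some arm of star B agrees with it
            # in ridge count and angle within the given tolerances.
            votes = sum(
                1 for (da, ka) in star_a
                if any(abs(db - da) <= d0 and abs(kb - ka) <= k0
                       for (db, kb) in star_b)
            )
            if votes > best_votes:
                best, best_votes = (i, j), votes
    return best, best_votes
```

The second stage (Fig. 11) repeats the inner test for the winning pair only and additionally checks the minutiae orientations.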

L ← 0

FOR m_k^A ∈ S_wi^A − {m_wi^A}
  assuming that: m_l^B ∈ S_wj^B − {m_wj^B}
  IF ∃l ( |D(m_wj^B, m_l^B) − D(m_wi^A, m_k^A)| ≤ d_0 and |K(m_wj^B, m_l^B) − K(m_wi^A, m_k^A)| ≤ k_0
          and T(θ_k^A, θ_l^B + α) ≤ θ_0 )
  {
    L ← L + 1
  }

Fig. 11. Second stage of the global star matching algorithm: determining the number
of matched minutiae pairs (L)

4.3 Correlation

Because non-linear distortion, skin conditions and finger pressure cause image
brightness and contrast to vary [13], the correlation between fingerprint
images cannot be applied directly. Moreover, taking into account the possible scale,
rotation and displacement, searching for the best correlation between two images
using an intuitive sum of squared differences is computationally very expensive. To
eliminate or at least reduce some of the above-mentioned problems, a binary
representation of the fingerprint can be used. To speed up the process of preliminary

Fig. 12. Example result of the global star method

Fig. 13. Example of preliminary aligned fingerprint segmentation masks (left) and correlation
between two impressions of the same finger (right), where red denotes the best alignment
(images obtained with a Digital Persona U.are.U 4000 scanner)

alignment, a segmentation mask can be used in conjunction with the center of gravity
of binary images. Also, the quantization of geometric transform features can be
applied, considering the scale and rotation only at the first stage (since displacement

∀i ∀j  A(i, j) ← A(i, j) ∗ A_seg(i, j)
∀i ∀j  B(i, j) ← B(i, j) ∗ B_seg(i, j)

[s_min, θ_min, d_min] ← [φ, φ, +∞]
[s^x_max, s^y_max, θ_max, Δx_max, Δy_max, d_max] ← [φ, φ, φ, φ, φ, −∞]

FOR s_i ∈ {s_1, s_2, ..., s_I}, i = 1...I
  FOR θ_j ∈ {θ_1, θ_2, ..., θ_J}, j = 1...J
  {
    d ← D_seg(A_seg, transform(B_seg, s_i, s_i, θ_j, Δx, Δy, S_i^B, S_j^B))
    IF d < d_min
      [s_min, θ_min, d_min] ← [s_i, θ_j, d]
  }

B ← transform(B, s_min, s_min, θ_min, Δx, Δy, S_i^B, S_j^B)
[S_i^B, S_j^B] ← [S_i^B − Δy, S_j^B + Δx]

FOR s_k^x ∈ {s_1^x, s_2^x, ..., s_K^x}, k = 1...K
  FOR s_l^y ∈ {s_1^y, s_2^y, ..., s_L^y}, l = 1...L
    FOR θ_m ∈ {θ_1, θ_2, ..., θ_M}, m = 1...M
      FOR Δx_n ∈ {Δx_1, Δx_2, ..., Δx_N}, n = 1...N
        FOR Δy_p ∈ {Δy_1, Δy_2, ..., Δy_P}, p = 1...P
        {
          d ← D_obj(A, transform(B, s_k^x, s_l^y, θ_m, Δx_n, Δy_p, S_i^B, S_j^B))
          IF d_max < d
            [s^x_max, s^y_max, θ_max, Δx_max, Δy_max, d_max] ← [s_k^x, s_l^y, θ_m, Δx_n, Δy_p, d]
        }

W ← D_obj(A, transform(B, s^x_max, s^y_max, θ_max, Δx_max, Δy_max, S_i^B, S_j^B)) / D_obj(A, A)

Fig. 14. Algorithm of finding the best correlation ratio (W) between the images A and B using
their segmentation masks Aseg and Bseg

Fig. 15. Example result of the best minutiae match for the correlation from Fig. 13

is the difference between the centers of gravity), minimizing the Dseg criterion (a simple
image XOR):

\[
D_{seg}(A_{seg}, B_{seg}) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1}
\begin{cases}
1, & \text{if } A_{seg}(i, j) \ne B_{seg}(i, j) \\
0, & \text{if } A_{seg}(i, j) = B_{seg}(i, j).
\end{cases} \tag{17}
\]
After a nearly optimal alignment of the segmentation masks has been found (Fig. 13), the
search for the best correlation is limited to a much smaller area. Taking into account the
rotation, the vertical and horizontal displacement, the stretch and an arbitrarily selected
granularity of these features, the best correlation can be found (Fig. 14) by searching for
the maximum value of the Dobj criterion (a double image XNOR):

\[
D_{obj}(A, B) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1}
\begin{cases}
1, & \text{if } A(i, j) = B(i, j) = obj \\
0, & \text{otherwise,}
\end{cases} \tag{18}
\]

where obj represents the object’s (ridge) pixel.
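
Both criteria, together with the correlation ratio W from Fig. 14, reduce to simple pixel counts. The sketch below assumes binary images stored as lists of rows with ridge (object) pixels equal to 1, and that the second image has already been geometrically transformed; the function names are chosen for this example only.

```python
def d_seg(a_seg, b_seg):
    """Eq. (17): number of pixels on which two segmentation masks differ."""
    return sum(1 for row_a, row_b in zip(a_seg, b_seg)
               for pa, pb in zip(row_a, row_b) if pa != pb)

def d_obj(a, b, obj=1):
    """Eq. (18): number of pixels that are object pixels in both images."""
    return sum(1 for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b) if pa == obj and pb == obj)

def correlation_ratio(a, b_transformed):
    """W from Fig. 14: overlap normalized by the self-overlap of image A."""
    return d_obj(a, b_transformed) / d_obj(a, a)
```

A value of W equal to 1 means that every ridge pixel of A is also a ridge pixel of the transformed B.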


Because fingerprint correlation does not tell us anything about minutiae matching,
the thinning process with minutiae detection should be applied to both binary images

from the best correlation. Then the two sets of minutiae can be compared to obtain the
matching score.
An example result of the correlation algorithm is shown in Fig. 13 and Fig. 15.

5 Experimental Results
The experiments were performed on a PC with a Digital Persona U.are.U 4000
fingerprint scanner. The database consists of 20 fingerprint images with 5 different
impressions (plus one more for the registration phase).
Three experiments were carried out. The first and the third one differ in the
parameter settings of each method. In the second one, the image selected for
the registration phase was chosen arbitrarily as the best one in the arbiter’s opinion (in
the first and the third experiment the registration image was the first fingerprint image
acquired from the user).
All images were enhanced with the Gabor filter described in Section 2 and
matched using the algorithms described in Section 4. The matching results,
evaluated according to Polish regulations concerning fingerprint identification based on
minutiae [6], and the time relations between the methods are summarized in Table 1.

Table 1. Summary of the achieved results

                                      Experiment   Hough Trans.   Global Star   Correlation
Matching percentage [%]               1            85             45            37
                                      2            88             76            70
                                      3            82             80            61
Avg. count of endings / bifurcations  1            25 / 7         10 / 3        16 / 4
in the best match                     2            25 / 7         13 / 4        19 / 5
                                      3            26 / 7         19 / 5        17 / 5
Number of images that did not cross   1            1              22            6
the matching threshold                2            1              10            1
                                      3            1              2             2
Time relation                         1, 2, 3      1 HT           ~6 HT         ~14 HT

Of the methods considered, the Hough transform gave the fastest response and the
highest hit ratio. Additionally, it can be quite easily
vectorized to perform more effectively on SIMD-organized computers.
The global star method is scale and rotation independent but computationally more
expensive because of the star creation process: determining the ridge count
between the mA and mB minutiae requires an iterative procedure, which is time consuming
(even if one notices that D(mA, mB) = D(mB, mA) in the star creation process with the
center in mA and mB). Moreover, filtering errors and poor image quality can
cause breaks in the continuity of ridges, which disturb the proper ridge count
determination and, as a consequence, produce a lower matching percentage in
common situations.

The analysis of the error set of the correlation method shows that, among the
algorithms considered, it is the most sensitive to the selection of the image for the
registration phase and to the parameter settings. Too small a fragment or a strongly deformed
fingerprint impression makes finding the unambiguous best correlation (maximum W
value in Fig. 13) significantly difficult. Additionally, it is time consuming because of
its complexity (a series of geometric transformations).

6 Conclusions
In this paper several methods of fingerprint matching were reviewed. The
experimental results show quality differences and time relations between the analyzed
algorithms. The influence of selecting an image for the registration phase can be
observed. The better the image selected, the higher the matching percentage and the
smaller the inconvenience if the system works as a lock.
The suboptimal parameters selected for the preliminary experiments show
that using global optimization techniques to find the best
parameters of each described method remains a challenge. Additionally, automatic image pre-selection
(classification), e.g., one based on global features of a fingerprint (such as core and
delta positions, the loop class), can speed up the whole matching process for very large
databases [3, 9, 11].
Heavy architecture-dependent software optimizations or even a hardware
implementation [14] can be considered if there is a big concern about speed. On the
other hand, if security is more important, hybrid solutions including, for example,
voice, face or iris recognition could be combined with fingerprint identification to
increase the system’s reliability [19].

References
[1] Andrysiak, T., Choraś, M.: Image retrieval based on hierarchical Gabor filters. Int. J. of
Appl. Math. and Comput. Sci. 15(4), 471–480 (2005)
[2] Bouslama, F., Benrejeb, M.: Exploring the human handwriting process. Int. J. of Appl.
Math. and Comput. Sci. 10(4), 877–904 (2000)
[3] Cappelli, R., Lumini, A., Maio, D., Maltoni, D.: Fingerprint classification by directional
image partitioning. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 402–421 (1999)
[4] Fisher, R., Walker, A., Perkins, S., Wolfart, E.: Hypermedia Image Processing Reference.
John Wiley & Sons, Chichester (1996)
[5] Greenberg, S., Aladjem, M., Kogan, D., Dimitrov, I.: Fingerprint image enhancement
using filtering techniques. In: IEEE Proc. 15th Int. Conf. Pattern Recognition, vol. 3, pp.
322–325 (2000)
[6] Grzeszyk, C.: Dactyloscopy. PWN, Warszawa (1992) (in Polish)
[7] Gonzalez, R., Woods, R.: Digital Image Processing. Prentice-Hall, Englewood Cliffs
(2002)
[8] Hong, L., Wan, Y., Jain, A.: Fingerprint image enhancement: algorithm and performance
evaluation. IEEE Trans. on Pattern Analysis and Machine Intelligence 20(8), 777–789
(1998)

[9] Jain, A., Minut, S.: Hierarchical kernel fitting for fingerprint classification and alignment.
In: IEEE Proc. 16th Int. Conf. Pattern Recognition, vol. 2, pp. 469–473 (2002)
[10] Jain, A., Prabhakar, S., Hong, L., Pankanti, S.: Fingercode: a filterbank for fingerprint
representation and matching. In: IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 2, pp. 2187–2194 (1999)
[11] Karu, K., Jain, A.: Fingerprint classification. Pattern Recognition 29(3), 389–404 (1996)
[12] Lai, J., Kuo, S.: An improved fingerprint recognition system based on partial thinning. In:
Proc. 16th Conf. on Computer Vision, Graphics and Image Processing, vol. 8, pp. 169–176
(2003)
[13] Maltoni, D., Maio, D., Jain, A., Prabhakar, S.: Handbook of Fingerprint Recognition.
Springer, Heidelberg (2003)
[14] Ratha, N., Karu, K., Chen, S., Jain, A.: A real-time matching system for large fingerprint
databases. IEEE Trans. on Pattern Analysis and Machine Intelligence 28(8), 799–813
(1996)
[15] Stock, R., Swonger, C.: Development and Evaluation of a Reader of Fingerprint
Minutiae, Cornell Aeronautical Laboratory, Technical Report (1969)
[16] Tadeusiewicz, R.: Vision systems of industrial robots, WNT (1992) (in Polish)
[17] Thai, R.: Fingerprint Image Enhancement and Minutiae Extraction, University of
Western Australia (2003)
[18] Wahab, A., Chin, S., Tan, E.: Novel approach to automated fingerprint recognition. IEE
Proc. in Vis. Image Signal Process 145(3) (1998)
[19] Zhang, D., Campbell, P., Maltoni, D., Bolle, R. (eds.): Special Issue on Biometric
Systems. IEEE Trans. on Systems, Man, and Cybernetics 35(3), 273–450 (2005)
Image Filtering Using the Dynamic Particles Method

L. Rauch and J. Kusiak

UST, AGH University of Science and Technology, Cracow, Poland


lrauch@agh.edu.pl, kusiak@metal.agh.edu.pl

Abstract. The holistic approaches used for image processing are considered in various types of
applications in the domain of applied computer science and pattern recognition. A new image
filtering method based on the dynamic particles (DP) approach is presented. It employs physics
principles for the 3D signal smoothing. The obtained results were compared with commonly
used denoising techniques including weighted average, Gaussian smoothing and wavelet
analysis. The calculations were performed on two types of noise superimposed on the image
data i.e. Gaussian noise and salt-pepper noise. The algorithm of the DP method and the results
of calculations are presented.

1 Introduction

1.1 Denoising Processes

The analysis of experimental measurement data is often difficult, and sometimes
even impossible, in its raw form because of superimposed noise. Properly
performed analysis based on denoising techniques allows the vital part
of the data to be extracted. Thanks to the denoising process, which is often very expensive
and time-consuming, the experimental data can be restored and used in further calculations.
There are many examples of such data obtained from experiments in
different domains of science, e.g. plastometric material tests,
determination of engine parameters, sound recording, market analysis, etc. In most
cases the observed noise is a result of external factors such as the sensitivity of industrial
measuring sensors or market impulses [1].
The above-mentioned measurements are mainly in the form of one-dimensional signals that
have to be pre-processed before further analysis (Fig. 1).
However, many experimental results are presented as multi-dimensional
data and also require the application of denoising algorithms. An example of such data
used in medical or industrial applications is image data in the form of
two-dimensional pictures. The analysis of pictures taken, for example, by an industrial
camera is very often difficult because of the low quality of the registered image caused by the
low resolution of the available equipment. Moreover, the data presented in the
picture is usually superimposed with noise. In most cases the noise in the image is
the difference between the real color that can be seen by the human eye and the value
that is registered by the camera. Thus, if there are many pixels in the picture with
an unsettled value, the whole image can be illegible.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 153 – 163.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Fig. 1. Examples of noisy measurements signals obtained from different plastometric material
tests [2]

However, the character of the noise can vary considerably. Several types of random
noise can be distinguished [3]:
• Gaussian noise – used for testing denoising algorithms, when the noise is generated
and then superimposed on the source image,
• White noise – noise that contains every frequency within the range of human
hearing (generally from 20 Hz to 20 kHz) in equal amounts,
• Salt-pepper noise – a specific type of noise that changes the values of randomly
chosen pixels to white or black.
The denoising method presented in this paper is intended for images saved in
grayscale.

1.2 State-of-the-Art

Several commonly used denoising methods exist. Each of them has advantages and
disadvantages, but none can be treated as a unified denoising and smoothing method.
The unification of such techniques should yield one method that can be applied to
different types of measurement data burdened with noise of different types. Existing
methods have to be reconfigured and adapted to new conditions, even if the analyzed
data has the same form but different noise. An example of such data is presented in
Figure 1, where two similar plots are shown. They contain the results of metal sample
compression tests performed with different tool velocities. Each of these curves is
loaded with noise of a different frequency, though they describe the same type of tested
material. Therefore, denoising methods should be designed to obtain similar results
independently of the noise character and, more importantly, independently of the curve
shape. This would allow the method to be applied in an automated denoising process that
does not require reconfiguration of input parameters or additional user interaction.

The process of denoising is in fact a problem of data approximation. There are
many such algorithms, but the most widely known and used are:
• moving weighted average and polynomial approximations,
• wavelet analysis [4] and artificial neural networks [5],
• a large family of convolution methods and frequency-based filters [6],
• Kalman statistical model processing [7],
• dedicated filtering (used mainly in image filtering processes), e.g. NL-means,
neighborhood models [3].
In the case of the polynomial approximation approach, the algorithms return well-fitting
smoothed curves, but if the data contains several thousand measured points, the
calculation time is very long and the method becomes inefficient. The weighted
average technique offers very fast and flexible data smoothing, but the assessment of
the obtained results is very difficult and based only on the user’s intuition. Moreover,
if the algorithm runs too long, the results converge to the straight line or surface
joining the border points of the data set. Thus, the main disadvantage of this method is
the lack of a stop criterion. Wavelet analysis is very similar to the traditional Fourier
method, but is more effective in analyzing physical situations where the signal contains
discontinuities and sharp peaks. It allows the denoising process to be applied on
different levels of signal decomposition, making the solution very precise and
controllable. Wavelets are mathematical functions that divide the data into different
frequency components. Each component is then analyzed with a resolution matched to its
frequency scale. The drawbacks of the method are the necessity of setting thresholds
each time the input data changes and of choosing the number of decomposition levels,
which can depend on the noise character. An approach based on artificial neural
networks is also often used, mainly the Generalized Regression Neural Network (GRNN).
The results obtained using that technique are smoother than in other methods, e.g.
wavelet analysis, but the application has to be reconfigured each time the data changes.
In some cases, even the type of the network must be changed, which is very inconvenient
during continuous calculations. Thus, the neural network approach is suitable for single
calculations, but not for the automated application of the denoising process.
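
The over-smoothing behaviour of the weighted average described above can be illustrated with a minimal sketch. The 3-point weights (0.25, 0.5, 0.25), the fixed end points and the sample values are assumptions chosen for illustration; they are not taken from the cited methods.

```python
def weighted_average_pass(signal, weights=(0.25, 0.5, 0.25)):
    """One pass of a 3-point moving weighted average; end points stay fixed."""
    w_left, w_mid, w_right = weights
    smoothed = list(signal)
    for i in range(1, len(signal) - 1):
        smoothed[i] = (w_left * signal[i - 1] + w_mid * signal[i]
                       + w_right * signal[i + 1])
    return smoothed


def smooth(signal, passes):
    """Apply the weighted average repeatedly (no stop criterion)."""
    for _ in range(passes):
        signal = weighted_average_pass(signal)
    return signal


# Hypothetical noisy measurements (illustrative values only).
noisy = [0.0, 2.1, 0.9, 3.2, 1.8, 4.1, 2.7, 5.2, 3.9, 6.1, 5.0]
few = smooth(noisy, 3)       # moderate smoothing, overall shape preserved
many = smooth(noisy, 2000)   # over-smoothed: approaches the line joining the end points
```

With a few passes the curve is merely smoothed, whereas after a large number of passes the interior points lie on the straight line between the two border points, which is why a stop criterion is needed.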
The review of the above-mentioned denoising methods reveals the main problems related
to the process of denoising:

• the definition of the stop criterion and the evaluation of the quality of results,
• techniques applied as iterative algorithms run too long in most cases,
• the results are oversimplified, which makes further analysis of the data useless,
• there is no unified method that could be applied to different types of noise
characterized by different frequencies.
The main objective of this paper is to present a scalable algorithm that can be
applied to different types of random noise. Moreover, the algorithm should be
equipped with a stop criterion that analyzes the progress of calculations by making
an ongoing assessment of the obtained results. The description of this method, the
results of its application to test sets of image data and their interpretation are
presented.

2 Dynamic Particle Method

2.1 Description of the DPA Idea

The idea of the Dynamic Particles (DP) algorithm is based on the definition of a
particle. Many definitions exist in different domains of science, but the most
general one treats the particle as an object placed in N-dimensional space.
From the mathematical point of view, the particle is a vector with N components,
one for each dimension of the space. This approach characterizes the particle’s
position, so that it can be analyzed relative to the others [8].
The paper presents an algorithm that performs calculations on three-dimensional
particles, where the particle is in fact a pixel in the form of a three-dimensional
vector:

• two dimensions define the position of the pixel in the image (width and height),
• the third dimension defines the value of the pixel in grayscale – values are
from the range of 0-255, where 0 indicates black and 255 indicates white.
Therefore, the whole image is obtained as a 3D surface made of points representing
the corresponding pixels. The values of all three dimensions should be normalized
before calculations. This process equalizes the influence that each dimension
has on the results of calculations. Finally, the obtained results are re-scaled to the
previous range of values.
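
The conversion of an image into normalized particles and back can be sketched as follows. The min-max scaling of every dimension to the range [0, 1] and the function names are assumptions made for illustration; the text does not specify the normalization formula.

```python
def image_to_particles(image):
    """Convert a grayscale image (list of rows, values 0-255) into
    3D particles (x, y, value) with every component scaled to [0, 1]."""
    height, width = len(image), len(image[0])
    particles = []
    for row in range(height):
        for col in range(width):
            x = col / (width - 1) if width > 1 else 0.0
            y = row / (height - 1) if height > 1 else 0.0
            particles.append((x, y, image[row][col] / 255.0))
    return particles


def particles_to_image(particles, width, height):
    """Re-scale the third component to 0-255 and rebuild the image rows."""
    values = [min(255, max(0, round(p[2] * 255.0))) for p in particles]
    return [values[r * width:(r + 1) * width] for r in range(height)]
```

For example, `particles_to_image(image_to_particles(img), width, height)` restores the original gray values of `img` when no filtering is applied in between.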

2.2 DP Algorithm

The DP algorithm employs elementary physical principles that determine the laws of
particle motion. Each particle has its own set of neighbour particles. The distance
between the particle and one of its neighbours is denoted as dij and calculated in
Euclidean space as:
$$ d_{ij} = \sqrt{\sum_{k=1}^{N} (x_{jk} - x_{ik})^2} \qquad (1) $$

where xjk and xik indicate the vector components in N-dimensional space. The force
between two particles is proportional to the distance between them, and the total
force is defined as the resultant of the forces associated with all neighbours. Thus,
the length of the resultant force acting on the particle can be treated as the
particle’s potential Vi. The gradient of this potential is mainly responsible for the
movement of the particle in each calculation step. The set of differential equations
of particle movement can be written as follows:

$$ \begin{cases} m_i \cdot \dfrac{dv_i}{dt} = -V_i - f_c \cdot v_i \\ dr = v_i \cdot dt \end{cases} \qquad (2) $$

where i is the index of the considered particle (pixel), mi is the mass of the particle
(the default mass is equal to 1), vi is the particle’s velocity, dr is the one-step
distance and fc is the friction coefficient.
The friction coefficient plays the role of a friction force, which is responsible for
braking the motion of the particles. The value of this coefficient should be less than
0.5 to sustain the stability of the whole set of particles. It has been found, in the
course of several calculations, that the best start value for fc (denoted Cc in Table 1)
is 0.4, which makes the algorithm fast and convergent.
After each iteration of the algorithm, the correction of fc is applied. If
the calculated resultant force is lower than the resultant force in the previous
iteration, then fc is reduced by the quotient of these forces. Thus, the convergence of
the algorithm is assured through the reduction of forces and fc in each step of
calculations.
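
A single time step of Eq. (2) and the correction of fc can be sketched as below. This is only an illustration under stated assumptions: an explicit Euler scheme with a unit time step is assumed, the motion is restricted to one scalar coordinate, and the driving term of Eq. (2) is passed in as the hypothetical argument `force`, because the text does not give its explicit formula.

```python
def euler_step(position, velocity, force, fc, mass=1.0, dt=1.0):
    """One explicit Euler step of  m * dv/dt = force - fc * v,  dr = v * dt
    for a single scalar coordinate (assumed scheme, not the authors' code)."""
    acceleration = (force - fc * velocity) / mass
    velocity = velocity + acceleration * dt
    position = position + velocity * dt
    return position, velocity


def update_friction(fc, force, force_old):
    """Reduce fc by the quotient of the forces when the force has decreased."""
    if force_old > 0.0 and force < force_old:
        fc = fc * (force / force_old)
    return fc
```

With the start value fc = 0.4, a force that drops from 1.0 to 0.5 between two iterations gives `update_friction(0.4, 0.5, 1.0)`, i.e. fc is halved, while a growing force leaves fc unchanged.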
An example of the calculation of a resultant force in 3D is presented in Figure 2.
The set of neighbours for each particle contains eight (full set) or four (subset)
particles.

Fig. 2. Visualization of the image as connected particles set
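
The two neighbourhood variants mentioned above (eight or four neighbours) can be sketched as follows. The treatment of border pixels, which simply receive fewer neighbours here, is an assumption; the function name is hypothetical.

```python
def neighbours(row, col, height, width, full=True):
    """Return the grid coordinates of the neighbours of pixel (row, col).

    full=True  -> up to eight neighbours (full set),
    full=False -> up to four neighbours (subset: up, down, left, right).
    Pixels on the image border get fewer neighbours (assumed behaviour).
    """
    offsets4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    offsets8 = offsets4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    result = []
    for d_row, d_col in (offsets8 if full else offsets4):
        r, c = row + d_row, col + d_col
        if 0 <= r < height and 0 <= c < width:
            result.append((r, c))
    return result
```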

The stop criterion of the proposed algorithm was implemented by establishing a
threshold of movement. If the force acting on a single particle is less than the
threshold defined at the beginning of the algorithm, the particle does not move any
longer. The whole algorithm reaches the end of the run when all particles are stopped.
However, the threshold responsible for the motion of the particles also defines the
smoothness of the expected results. If it is set to a small value, the algorithm
runs until all forces on the curve reach the threshold and the differences between
the positions of two adjacent particles are very low. Otherwise, the plot of the new
curve is sharper, preserving all the most important peaks. The value of this parameter
can vary between 10⁻⁵ and 10⁻²⁰. If its value is too small, it no longer has any impact
on the shape of the curve. Otherwise, if it is too high, the algorithm stops too early,
giving no smoothing effect.

2.3 Quality Assessment of Results

The precise validation of the obtained results is possible only when testing on
original data that does not contain noise. The proposed testing procedure consists
of three main steps:

• Preparation of testing data – the original image (without noise) is taken and
noised with generated random noise – the converted image is created,
• The denoising algorithm is applied to the converted image – the denoised image
is obtained,
• The similarity ratio between the original and denoised images is calculated.

The ratio of similarity is in most cases calculated as the standard deviation between
the original and denoised images [3]. If this coefficient is equal to zero, the
process of denoising was performed perfectly. At the moment there are no algorithms
giving such results. The main disadvantage is that the value of the ratio is absolute
and is therefore usually difficult to interpret. Therefore, a coefficient of
denoising quality has been proposed, which accounts for the differences between
the original image, the converted image and the denoised image as follows:

$$ D_q = \frac{\mathrm{calc\_diff}(S_i, N_i)}{\mathrm{calc\_diff}(S_i, D_i)} \qquad (3) $$
where Dq is denoising quality coefficient; Si – source (original) image; Ni – noised
image; Di – denoised image. The calc_diff function used in equation (3) is defined as
the modified standard deviation:

$$ \mathrm{calc\_diff} = \frac{\sum_i d_i}{n - 1} \qquad (4) $$
where di is the distance between corresponding particles in both images; n is the
number of points. A Dq coefficient equal to 1 means that the denoising process had
no effect. Thus, the Dq value should be greater than 1, and a higher value
means better denoising results. The tests performed on one-dimensional data
(signal denoising) indicated that high quality of denoising was obtained when the Dq
value was higher than 5. However, the character of images, which are often very
jagged, indicates that denoising quality in the range from 1.2 to 1.6 is satisfactory.
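
The coefficient Dq of Eq. (3) can be computed with a short sketch. It assumes that calc_diff is the sum of distances divided by n - 1, as Eq. (4) reads in print, and that the distance di between corresponding particles is the absolute difference of their gray values, which holds only if the pixel positions are unchanged; both points are assumptions.

```python
def calc_diff(image_a, image_b):
    """Sum of per-pixel distances divided by (n - 1), following Eq. (4).
    The distance is taken as the absolute gray-value difference (assumption)."""
    distances = [abs(a - b)
                 for row_a, row_b in zip(image_a, image_b)
                 for a, b in zip(row_a, row_b)]
    return sum(distances) / (len(distances) - 1)


def denoising_quality(source, noised, denoised):
    """Dq of Eq. (3): values above 1 indicate that denoising helped."""
    return calc_diff(source, noised) / calc_diff(source, denoised)


# Hypothetical 2x2 example (illustrative values only).
source = [[100, 100], [100, 100]]
noised = [[110, 90], [105, 95]]
denoised = [[104, 97], [102, 99]]
dq = denoising_quality(source, noised, denoised)
```

In this toy case the summed deviation from the source falls from 30 to 10 gray levels, so Dq is approximately 3, whereas passing the noised image as the "denoised" one gives Dq = 1.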

2.4 Computational Complexity

One of the main objectives of this paper was to create a scalable denoising algorithm.
Here, scalability means that the algorithm is applicable to data placed in
N-dimensional space. Thus, the complexity of such an algorithm should depend only
weakly on the number of dimensions.

Table 1. The algorithm of DP method

___________________________________________
int i = 0;                              // iteration counter
do {                                    // Begin of the calculations
  total = 0;
  for (int j = 0; j < data.length; j++) {
    // Motion of the particle
    if (ismoving[j] == 1) {
      // New particle position
      position[j] = calc_pos(j);
      // Store forces for reduction of Cc
      if (i > 0)
        force_old[j] = force[j];
      force[j] = pos_diff(data[j] - position[j]);
      // Accept the new position of the particle
      data[j] = position[j];
    }
    // Reduction of the Cc for each particle by the quotient of forces
    // (force/force_old < 1, so Cc decreases as described in Sect. 2.2)
    if (i > 0 && force[j] < force_old[j])
      cc[j] = cc[j] * (force[j] / force_old[j]);
    // Checking the stop criterion
    if (cc[j] < thres) {
      ismoving[j] = 0;
    } else {
      ismoving[j] = 1; total = 1;
    }
  }
  i++;
} while (total == 1);                   // End of the calculations
___________________________________________

The source code presented in Table 1 shows that the computational complexity of the
DP method depends mainly on the number of points in the data set. The pessimistic
variant of the calculations assumes that each iteration of the algorithm requires the
calculation of every particle’s position. The function responsible for these
calculations, called calc_pos, performs several iterations depending on the number of
neighbour particles and the number of dimensions. In the case of two-dimensional data
(an image), the influence of this function on the algorithm complexity is insignificant.
However, the number of main loop iterations, which depends on the initial value of Cc,
is important. Satisfactory results for image denoising (e.g. 400x400 pixels, which is
equal to 160 000 particles) are obtained after 25-50 iterations. Thus, the computational
complexity of the designed algorithm can be estimated in O notation as follows:

$$ O(n) = n \log_2 n \qquad (5) $$

where n is dependent on the number of particles and the log2n component is related to
the number of iterations.
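
As a quick arithmetic check of Eq. (5) for the 400x400 example, the following lines evaluate the logarithmic factor and the resulting operation count; the variable names are illustrative only.

```python
import math

n = 400 * 400                          # number of particles (pixels)
iterations_estimate = math.log2(n)     # logarithmic factor of Eq. (5)
operations_estimate = n * iterations_estimate
```

The logarithmic factor is slightly above 17, which is of the same order as the 25-50 main loop iterations reported above.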

3 Results
The created algorithm was tested on a data set containing several example images.
The images were characterized by different types of content:

• smoothed – content with several basic colors grouped in sets of pixels, e.g.
geometric figures (Fig. 3-5),
• jagged – real photos containing in most cases a full range of colors mixed with
each other at high frequency (Fig. 6-9).

Each of them was superimposed with two types of generated random noise:

• Gaussian noise – the values of standard deviation were set to 5, 10 and 20
separately. It means that the average value of noise was equal to 4%, 8% and 16%
of the average pixel values, respectively,
• Salt and pepper noise – the noise was superimposed on pixels with probability p.
If the random number was less than or equal to p, then the value of the pixel was
changed to zero. Otherwise, if the random number was greater than 1-p, the pixel
value was changed to 255.
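
The two noise generators described above can be sketched as follows. The clipping of noisy values to the range 0-255, the uniform random number in [0, 1) and the function names are assumptions; only the rules for p and 1-p follow the description in the text.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma,
    rounding and clipping the result to 0-255 (clipping is an assumption)."""
    return [[min(255, max(0, round(v + rng.gauss(0.0, sigma)))) for v in row]
            for row in image]


def add_salt_pepper_noise(image, p, rng):
    """Set a pixel to 0 if the random number is <= p, to 255 if it is
    greater than 1 - p, and leave it unchanged otherwise."""
    noised = []
    for row in image:
        new_row = []
        for v in row:
            u = rng.random()
            if u <= p:
                new_row.append(0)
            elif u > 1.0 - p:
                new_row.append(255)
            else:
                new_row.append(v)
        noised.append(new_row)
    return noised


rng = random.Random(0)                       # fixed seed for repeatability
flat = [[128] * 20 for _ in range(20)]       # hypothetical uniform test image
gauss_img = add_gaussian_noise(flat, 10, rng)
sp_img = add_salt_pepper_noise(flat, 0.1, rng)
```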

Finally, the data set contained 30 (thirty) images submitted for analysis with the
DP algorithm. Several chosen results are presented in Figures 3-9.

Fig. 3. Example of smooth image containing only 5 colors in original version, noised with
Gaussian (4%) noise

Fig. 4. Result of denoising process obtained by using DP method

Fig. 5. The comparison of magnified noised (Fig. 3) and denoised (Fig. 4) images – the
denoising quality coefficient is equal 1.495

Fig. 6. Example of original jagged image containing a lot of details – Lena picture

Fig. 7. The same image noised with Gaussian (8%) noise

Fig. 8. Results obtained using standard DP method

Fig. 9. Results obtained using DP method equipped with edge detection algorithm used
during the neighborhood determination process

The results obtained from the denoising process seem to be satisfactory. However, the
main observed disadvantage is the lack of an edge detection procedure, which can be seen
in Fig. 5. The subsequent figures present the application of the DP method supported
by a proper edge detection algorithm.

4 Conclusions and Discussion


The presented DP method was designed as a data denoising technique that can be
applied to measurement data of different dimensions without the need to reconfigure
the whole set of parameters. The design and implementation of such a unified method
was the main objective of this paper. The algorithm was tested on one-dimensional data
(measurement signals of different frequencies) and two-dimensional data (images, which
are the subject of this work). Moreover, the implementation of the algorithm allows
calculations to be performed on multi-dimensional data.
Two-dimensional data in the form of an image is a special type of measurement data
obtained from a photo camera. The analysis of such data is often impeded by
superimposed noise. The DP algorithm suppresses the noisy parts of images, allowing
further analysis.
An advantage of image data processing is the possibility of visual assessment of
results. The results obtained using the DP method are very similar to those
obtained by Gaussian filtering and wavelet smoothing.
Further development of this technique should focus on its application to
multi-dimensional data processing. The main objectives to be achieved are:
• design and implementation of the shared nearest neighborhood algorithm,
• design of a parallel version of the DP algorithm,
• combination of the DP method with a suitable Multi Dimensional Scaling (MDS)
algorithm.

Acknowledgements
Financial assistance of the KBN project No. 11.11.110.575 is acknowledged.

References
[1] Rauch, Ł., Talar, J., Zak, T., Kusiak, J.: Filtering of thermomagnetic data curve using
artificial neural network and wavelet analysis. In: Rutkowski, L., Siekmann, J.H.,
Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS, vol. 3070, pp. 1093–1098.
Springer, Heidelberg (2004)
[2] Gawąd, J., Kusiak, J., Pietrzyk, M., Di Rosa, S., Nicol, G.: Optimization Methods Used for
Identification of Rheological Model for Brass. In: Proc. 6th ESAFORM Conf. On Material
Forming, Salerno, Italy, pp. 359–362 (2003)
[3] Buades, A., Coll, B., Morel, J.M.: On image denoising methods, Centre de Matematiques
et de Leurs Applications, http://www.cmla.ens-cachan.fr
[4] Adelino, R., da Silva, F.: Bayesian wavelet denoising and evolutionary calibration. Digital
Signal Processing 14, 566–689 (2004)
[5] Falkus, J., Kusiak, J., Pietrzkiewicz, P., Pietrzyk, W.: The monograph, Intelligence in
Small World - nanomaterials for the 21th Century. In: Filtering of the industrial data for
the Artificial Neural Network Model of the Steel Oxygen Converter Process. CRC-PRESS,
Boca Raton (2003)
[6] Hara, S., Tsukada, T., Sasajima, K.: An in-line digital filtering algorithm for surface
roughness profiles. Precision Engineering 22, 190–195 (1998)
[7] Piovoso, M., Laplante, P.A.: Kalman filter recipes for real-time image processing. Real-
time Image Processing 9, 433–439 (2003)
[8] Dzwinel, W., Alda, W., Yuen, D.A.: Cross-Scale Numerical Simulations using Discrete
Particle Models. Molecular Simulation 22, 397 (1999)
The Simulation of Cyclic Thermal Swing Adsorption
(TSA) Process

Bogdan Ambrożek

Szczecin University of Technology, Department of Chemical Engineering and Environmental
Protection Processes, Al. Piastów 42, 71-065 Szczecin
ambog@ps.pl

Abstract. The dynamic behavior of cyclic thermal swing adsorption (TSA) system with a
column packed with fixed bed of adsorbent is predicted successfully with a rigorous dynamic
mathematical model. The set of partial differential equations (PDEs), representing the TSA, is
solved by the numerical method of lines (NMOL), using the FORTRAN subroutine DIVPAG
from the International Mathematical and Statistical Library (IMSL). The simulated TSA cycle
is operated in three steps: (i) an adsorption step with cold feed; (ii) a countercurrent desorption
step with hot inert gas; (iii) a countercurrent cooling step with cold inert gas. Exemplary
simulation results are presented for the propane adsorbed onto and desorbed from fixed bed of
activated carbon. Nitrogen is used as carrier gas during adsorption and as purge gas during
desorption and cooling.

1 Introduction
Cyclic thermal swing adsorption (TSA) processes have been widely used in
industry for the removal and recovery of pollutants, such as volatile organic
compounds (VOCs), from gaseous streams [1]. A typical TSA system consists of
two adsorption columns with a fixed bed of adsorbent and operates between two
different temperatures. While the adsorption process takes place in one column, the
bed in the other column is subjected to regeneration. During desorption, the first step
of regeneration, hot purge gas, which can be a slipstream of the purified gas or
another inert gas, flows through the bed. The adsorbate concentration in the purge gas
is much higher than in the feed gas, and this concentrated stream can be sent to an
incinerator. It is also possible to recover the adsorbate by condensing it out of the
purge gas stream. After completion of the desorption step, the bed is cooled.
From the mathematical point of view, cyclic TSA processes are classified as distributed
parameter systems, described by an integrated system of partial differential and
algebraic equations (IPDAEs) [2]. Each TSA process approaches a cyclic steady-state
(CSS). In this state the conditions at the end of each cycle are identical to those at
the start [3]. The difficulties in the design of TSA processes stem from the lack of
information about the influence of the process variables on the dynamic behavior of
the adsorption column and on the cyclic steady-state convergence time.
The purpose of the present paper is to provide a parametric analysis of thermal
swing adsorption. The effect of different operating conditions and some of the model
parameters on the concentration and temperature breakthrough curves was considered.
The cyclic steady-state cycles are obtained by cyclic computer simulation. The system
studied was propane adsorbed onto and desorbed from a fixed bed of activated carbon.
Nitrogen was used as carrier gas during adsorption and as purge gas during desorption
and cooling. The TSA cycle comprises the following steps: adsorption, desorption and
cooling.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 165 – 178.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Mathematical Model
The mathematical model describing the TSA process consists of integrated partial
differential and algebraic equations. The model equations were obtained by applying
differential material and energy balances to the adsorbent bed. The following
assumptions were made:
(1) The gas phase follows the ideal gas law.
(2) Constant pressure operation.
(3) Single adsorbate system.
(4) The velocity of the carrier gas is constant.
(5) Negligible radial concentration, temperature and velocity gradients within the bed.
(6) Negligible intraparticle heat transfer resistance.
The model considers mass and heat transfer resistances, axial diffusion and thermal
conductivity.
Based on the above assumptions, the adsorbate mass balance within the gas phase
is represented by the following equation:

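$$ -D_{ax} \frac{\partial^2 y}{\partial z^2} + \frac{G}{\rho_g \varepsilon} \frac{\partial y}{\partial z} + \frac{\partial y}{\partial t} - \frac{(1-\varepsilon)\,\rho_p}{\varepsilon\,\rho_g} \frac{\partial q}{\partial t} = 0 \qquad (1) $$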
The adsorbate balance around the solid phase is formulated using a linear driving
force expression:

$$ \frac{\partial q}{\partial t} = k \left( q^* - q \right) \qquad (2) $$

Heterogeneous energy balance around the gas phase in the packed bed accounts for
axial conduction and heat transfer to the solid phase and to the column wall:


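$$ -\frac{k_{ax}}{\rho_g C_{pg}} \frac{\partial^2 T_g}{\partial z^2} + \frac{G}{\rho_g \varepsilon} \frac{\partial T_g}{\partial z} + \frac{\partial T_g}{\partial t} + \frac{h_f \alpha_p (1-\varepsilon)}{\varepsilon \rho_g C_{pg}} \left( T_g - T_s \right) + \frac{4 h_w}{\varepsilon D \rho_g C_{pg}} \left( T_g - T_c \right) = 0 \qquad (3) $$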

The energy balance of the adsorbent particle includes the heat generated by adsorption
and is expressed as:

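$$ \frac{\partial T_s}{\partial t} - \frac{h_f \alpha_p}{\rho_p C_{ps}} \left( T_g - T_s \right) - \frac{\Delta H_a}{C_{ps}} \frac{\partial q}{\partial t} = 0 \qquad (4) $$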

Wall energy balance is stated as:


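$$ \frac{\partial T_c}{\partial t} - h_w \frac{\alpha_c}{C_{pc} \rho_c} \left( T_g - T_c \right) + U \frac{\alpha_{air}}{C_{pc} \rho_c} \left( T_c - T_{amb} \right) = 0 \qquad (5) $$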

The overall mass transfer coefficient k is calculated by combining the resistances
outside and inside the adsorbent particle [4,5]:

$$ \frac{1}{k} = \frac{\rho_p q^*}{k_f \rho_g \alpha_p y} + \frac{R_p}{5 D_{ps} \alpha_p} \qquad (6) $$
The effective diffusivity, Dps, is related to Knudsen and surface diffusivities as
follows:

$$ D_{ps} = D_s + D_K \frac{\varepsilon_p \rho_g}{\rho_p} \frac{\partial y^*}{\partial q} \qquad (7) $$
The Knudsen diffusion coefficient is calculated by the following equation [6]:
$$ D_K = 9.7 \cdot 10^{-5}\, \frac{r_e}{\tau_p} \left( \frac{T}{M} \right)^{1/2} \qquad (8) $$

The surface diffusion coefficient, DS , is expressed by the following equation [6]:

$$ D_s = \frac{1.61 \cdot 10^{-6}}{\tau_s} \exp\left( -\frac{E}{RT} \right) \qquad (9) $$
Mass and heat axial dispersion values are calculated with the following correlations
[7]:
$$ \frac{\varepsilon D_{ax}}{D_M} = 20 + 0.5\, Sc\, Re \qquad (10) $$

$$ \frac{k_{ax}}{k_g} = 7 + 0.5\, Pr\, Re \qquad (11) $$

The kf and hf values are calculated using the equations of Wakao and Chen [8]:

$$ Sh = 2.0 + 1.1\, Re^{0.6}\, Sc^{1/3} \qquad (12) $$

$$ Nu = 2.0 + 1.1\, Re^{0.6}\, Pr^{1/3} \qquad (13) $$
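
The correlations (10)-(13) are simple algebraic expressions and can be evaluated as in the sketch below. The function and argument names are hypothetical; the conversion of Sh and Nu into kf and hf is omitted because it requires the particle size and gas properties, which are not listed here.

```python
def sherwood(re, sc):
    """Eq. (12): Sh = 2.0 + 1.1 Re^0.6 Sc^(1/3)."""
    return 2.0 + 1.1 * re ** 0.6 * sc ** (1.0 / 3.0)


def nusselt(re, pr):
    """Eq. (13): Nu = 2.0 + 1.1 Re^0.6 Pr^(1/3)."""
    return 2.0 + 1.1 * re ** 0.6 * pr ** (1.0 / 3.0)


def axial_dispersion(re, sc, d_m, eps):
    """Eq. (10) solved for D_ax: eps * D_ax / D_M = 20 + 0.5 Sc Re."""
    return d_m * (20.0 + 0.5 * sc * re) / eps


def axial_conductivity(re, pr, k_g):
    """Eq. (11) solved for k_ax: k_ax / k_g = 7 + 0.5 Pr Re."""
    return k_g * (7.0 + 0.5 * pr * re)
```

In the limit Re = 0 both Sh and Nu reduce to 2.0, the stagnant-fluid limit of the correlations.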



Molecular diffusivity is calculated using the equation developed by Fuller et al. [9]:

$$ D_M = \frac{1.013 \cdot 10^{-2}\, T^{1.75} \left( 1/M + 1/M_g \right)^{1/2}}{P \left[ (D_v)^{1/3} + (D_{vg})^{1/3} \right]^2} \qquad (14) $$

The isosteric heat of adsorption is estimated using the Clausius-Clapeyron equation:


$$ \Delta H_a = R T^2 \left( \frac{\partial \ln p}{\partial T} \right)_q \qquad (15) $$
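
Eq. (15) can be evaluated numerically once an isotherm is available. The sketch below uses the temperature-dependent Langmuir isotherm of this paper, Eq. (23), with the parameter values reported in the Results section, and assumes that p is the adsorbate partial pressure P·y. Inverting the isotherm at constant loading q and differentiating gives the closed form ΔHa = R [B + Q qm / (qm - q)] with qm = q0 exp(Q/T); this closed form is derived here for illustration and is not stated in the text.

```python
import math

R = 8.314                                        # gas constant, J/(mol K)
Q0, Q, B0, B = 1.841, 323.7, 0.257e-7, 2466.5    # parameters of Eq. (23)


def pressure_at_loading(q, temp):
    """Partial pressure p = P*y that gives loading q at temperature temp,
    obtained by inverting the Langmuir isotherm of Eq. (23)."""
    q_m = Q0 * math.exp(Q / temp)
    b = B0 * math.exp(B / temp)
    return q / (b * (q_m - q))


def isosteric_heat_numeric(q, temp, d_temp=1e-3):
    """Eq. (15) with a central finite difference of ln p at constant q."""
    dlnp = (math.log(pressure_at_loading(q, temp + d_temp))
            - math.log(pressure_at_loading(q, temp - d_temp))) / (2.0 * d_temp)
    return R * temp ** 2 * dlnp


def isosteric_heat_analytic(q, temp):
    """Closed form derived from Eqs. (15) and (23) (derivation assumed here)."""
    q_m = Q0 * math.exp(Q / temp)
    return R * (B + Q * q_m / (q_m - q))
```

Both functions should agree closely, for example for q = 1.0 mol/kg and T = 293 K, which provides a simple consistency check of the derivative.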

For z = 0 and t > 0 two different boundary conditions are used:


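$$ D_{ax} \left. \frac{\partial y}{\partial z} \right|_{z=0} = -\frac{G}{\rho_g} \left( y|_{z=0^-} - y|_{z=0^+} \right) \qquad (16) $$

$$ k_{ax} \left. \frac{\partial T_g}{\partial z} \right|_{z=0} = -G C_{pg} \left( T_g|_{z=0^-} - T_g|_{z=0^+} \right) \qquad (17) $$

and

$$ y|_{z=0^-} = y|_{z=0^+} = y_o \qquad (18) $$

$$ T_g|_{z=0^-} = T_g|_{z=0^+} = T \qquad (19) $$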
Both of the above boundary conditions have been employed by other investigators
[4-6].
The boundary conditions at z = L and t > 0 are written as follows:

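$$ \left. \frac{\partial y}{\partial z} \right|_{z=L} = 0; \qquad \left. \frac{\partial T_g}{\partial z} \right|_{z=L} = 0 \qquad (20) $$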
The solution of the model equations requires the knowledge of the state of the
column at the beginning of each step. The initial conditions for 0 < z < L and
t = 0 are:
$$ q(0, z) = q_o(z); \quad y(0, z) = y_o(z); \quad T_s(0, z) = T_{so}(z); \quad T_g(0, z) = T_{go}(z); \quad T_c(0, z) = T_{co}(z) \qquad (21) $$
In the present study, it is assumed that the final concentration and temperature
profile in adsorbent bed for each step defines the initial conditions for the next step.
For the adsorption step in the first adsorption cycle:

$$ q_o(z) = 0; \quad y_o(z) = 0; \quad T_{so}(z) = T_{go}(z) = T_{co}(z) = T_a \qquad (22) $$

The temperature-dependent Langmuir isotherm equation was used to represent adsorption
equilibrium:

$$ q^* = \frac{q_o \exp(Q/T)\, b_o \exp(B/T)\, P y}{1 + b_o \exp(B/T)\, P y} \qquad (23) $$

3 Numerical Solution
The model developed in this work consists of partial differential equations (PDEs) for
mass and energy balances. The set of PDEs is first transformed into a dimensionless
form, and the resulting system is solved using the numerical method of lines
(NMOL) [10]. The spatial discretization is performed using second-order central
differencing, and the PDEs are reduced to a set of ordinary differential equations
(ODEs). The number of axial grid nodes was 30. The resulting set of ODEs was
solved using the FORTRAN subroutine DIVPAG of the International Mathematical
and Statistical Library (IMSL). The DIVPAG program employs Adams-Moulton’s or
Gear’s BDF method with variable order and step size.
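
The idea of the method of lines can be illustrated on a much simpler problem than the full TSA model. The sketch below discretizes a single linear convection-diffusion equation, ∂y/∂t = D ∂²y/∂z² - u ∂y/∂z, on 30 axial nodes with second-order central differences. It is not the author's implementation: the explicit Euler time integration replaces the variable-order DIVPAG solver, and the coefficients D and u, the fixed inlet value and the time step are assumptions chosen so that the explicit scheme is stable.

```python
def rhs(y, dz, d_ax, u):
    """Semi-discrete right-hand side dy_i/dt using second-order central
    differences. Node 0 is the inlet (fixed value, so its derivative is 0);
    the outlet uses a mirrored ghost node to impose a zero gradient."""
    n = len(y)
    dydt = [0.0] * n
    for i in range(1, n):
        left = y[i - 1]
        right = y[i + 1] if i < n - 1 else y[i - 1]
        diffusion = d_ax * (right - 2.0 * y[i] + left) / dz ** 2
        convection = u * (right - left) / (2.0 * dz)
        dydt[i] = diffusion - convection
    return dydt


def integrate(n_nodes=30, length=0.40, d_ax=1e-4, u=0.01,
              y_in=1.0, dt=0.1, t_end=200.0):
    """Integrate the ODE system with explicit Euler (assumed, for brevity)."""
    dz = length / (n_nodes - 1)
    y = [0.0] * n_nodes
    y[0] = y_in                       # inlet concentration held constant
    for _ in range(int(round(t_end / dt))):
        dydt = rhs(y, dz, d_ax, u)
        y = [yi + dt * di for yi, di in zip(y, dydt)]
    return y


profile = integrate()                 # concentration profile at t_end
```

The same structure (a function returning the time derivatives at all grid nodes, passed to an ODE integrator) applies to the full set of balances, where a stiff solver such as the BDF method mentioned above is the appropriate choice.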

4 Results and Discussion


The simulated TSA cycle (Figure 1) was operated in three steps: (i) an adsorption step
with cold feed (293 K); (ii) a countercurrent desorption step with hot inert gas; (iii) a
countercurrent cooling step with cold inert gas (293 K).
The system studied was propane adsorbed onto and desorbed from a fixed bed of
activated carbon (Columbia Grade L). Nitrogen was used as carrier gas during
adsorption and as purge gas during desorption and cooling. The adsorbent bed
was 0.40 m long, with a diameter of 0.07 m. The concentration of propane at the inlet
to the adsorption column during the adsorption step was y = 0.01 mol/mol, and the total
pressure was P = 0.25 MPa. The superficial gas flow rate was the same for each step and
was 7.0 mol/(m² s).
The appropriate set of constants in Eq. (23) for propane on activated carbon was
determined using the experimental isotherm data published in [11]. The following
values of parameters were obtained: q0 = 1.841 mol/kg, Q = 323.7 K, b0 = 0.257·10⁻⁷
Pa⁻¹, B = 2466.5 K.
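
With these constants, the equilibrium loading of Eq. (23) and the linear driving force rate of Eq. (2) can be evaluated directly. The sketch below is illustrative; the function names are hypothetical, and the pressure is expressed in Pa to match the unit of b0.

```python
import math


def langmuir_loading(y, temp, pressure,
                     q0=1.841, q_exp=323.7, b0=0.257e-7, b_exp=2466.5):
    """Equilibrium loading q* in mol/kg from Eq. (23).
    y: mole fraction, temp: K, pressure: Pa."""
    b = b0 * math.exp(b_exp / temp)
    q_max = q0 * math.exp(q_exp / temp)
    return q_max * b * pressure * y / (1.0 + b * pressure * y)


def ldf_rate(q, q_star, k):
    """Linear driving force expression of Eq. (2): dq/dt = k (q* - q)."""
    return k * (q_star - q)


# Feed conditions of the adsorption step: y = 0.01, T = 293 K, P = 0.25 MPa.
q_feed = langmuir_loading(0.01, 293.0, 0.25e6)
# Same gas composition at the purge gas temperature of 394 K.
q_hot = langmuir_loading(0.01, 394.0, 0.25e6)
```

The equilibrium loading at 394 K is much lower than at 293 K, which is the driving mechanism of thermal regeneration.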
The cyclic steady-state (CSS) cycles are obtained under various conditions by a
cyclic iteration method; complete cycles are run until the periodic state is achieved.
The adsorption step is terminated when the outlet concentration of the organic compound
rises to 5 % of the inlet concentration. The desorption step is terminated when the
outlet temperature exceeds 95 % of the inlet temperature. Cooling time depends
mainly on the required final outlet temperature. In this study the value of 300 K is
assumed. The final concentration and temperature profile in the adsorbent bed for each
step defines the initial conditions for the next step. It is assumed that the condition
for a periodic state is satisfied when the amount removed from the bed during
regeneration is equal to the amount that is accumulated in the bed during the adsorption
step. The following equation is used to determine the cyclic steady-state [2]:

Fig. 1. Three-step TSA process with fixed adsorbent bed

$$ \left( \int_0^L q\,dz \right)_{(n_c-1)\text{th cycle}} - \left( \int_0^L q\,dz \right)_{(n_c)\text{th cycle}} < \delta \qquad (24) $$

where δ is a value close to zero (in this work δ = 1·10⁻⁵).
Approximately 15-20 cycles are needed to achieve the cyclic steady-state,
depending on process conditions.
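
The cyclic iteration with the criterion of Eq. (24) can be sketched as a simple loop. The function `run_cycle` is a hypothetical placeholder for a complete adsorption-desorption-cooling simulation; the trapezoidal integration and the use of the absolute value of the difference are assumptions of this sketch.

```python
def bed_loading(q_profile, dz):
    """Trapezoidal estimate of the integral of q over the bed length."""
    return dz * (sum(q_profile) - 0.5 * (q_profile[0] + q_profile[-1]))


def run_to_cyclic_steady_state(run_cycle, q_profile, dz,
                               delta=1e-5, max_cycles=200):
    """Repeat complete cycles until the bed loading at the end of two
    successive cycles differs by less than delta (cf. Eq. (24))."""
    previous = bed_loading(q_profile, dz)
    for cycle in range(1, max_cycles + 1):
        q_profile = run_cycle(q_profile)
        current = bed_loading(q_profile, dz)
        if abs(previous - current) < delta:   # absolute value is an assumption
            return q_profile, cycle
        previous = current
    raise RuntimeError("cyclic steady state not reached")


def toy_cycle(q_profile, target=1.0):
    """Toy stand-in for a full TSA cycle: each cycle halves the distance
    of every node from a fixed periodic profile (illustration only)."""
    return [0.5 * q + 0.5 * target for q in q_profile]


dz = 0.40 / 29                               # 30 axial nodes on a 0.40 m bed
final_profile, n_cycles = run_to_cyclic_steady_state(toy_cycle, [0.0] * 30, dz)
```

In a real simulation, `toy_cycle` would be replaced by the three-step cycle, with the final profiles of each step used as initial conditions of the next one.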
The computer simulation results are used to study the effect of different operating
conditions and some of the model parameters on the concentration and temperature
breakthrough curves. The effects of adiabatic and non-adiabatic operation, purge gas
temperature during the desorption step, boundary conditions, axial diffusion and thermal
conductivity were investigated. Exemplary simulation results for the cyclic steady-
state are shown in Figures 2 - 11. Typical concentration and temperature breakthrough
curves for the adsorption, desorption and cooling steps are shown in Figures 2 - 6. In
the case of the desorption step, two transitions are apparent, connected by a plateau.
The breakthrough curves were highly influenced by purge gas temperature (Figure 7) and
heat loss through the adsorption column wall, especially for a small diameter adsorption
column (Figure 8). The modeling results show that concentration and temperature
breakthrough curves obtained using different boundary conditions, defined by
equations (16)-(17) and (18)-(19), are practically identical (Figure 9).

Fig. 2. Concentration breakthrough curve for adsorption step (axes: y [mol/mol] versus t [s])

Fig. 3. Temperature breakthrough curve for adsorption step (axes: T [K] versus t [s])

Fig. 4. Concentration breakthrough curve for desorption step. Purge gas temperature: 394 K.
(axes: y [mol/mol] versus t [s])

Fig. 5. Temperature breakthrough curve for desorption step. Purge gas temperature: 394 K.
(axes: T [K] versus t [s])

Fig. 6. Temperature breakthrough curve for cooling step. Purge gas temperature during desorption
step: 394 K. (axes: T [K] versus t [s])

Fig. 7. Effect of purge gas temperature on concentration breakthrough curve for desorption step
(curves: T = 394 K, T = 350 K, T = 310 K; axes: y [mol/mol] versus t [s])

Fig. 8. Concentration breakthrough curves for adiabatic and non-adiabatic desorption. Purge
gas temperature: 394 K (curves: adiabatic; non-adiabatic, D = 1 m; non-adiabatic, D = 0.07 m;
axes: y [mol/mol] versus t [s])

Fig. 9. Concentration breakthrough curves for different boundary conditions. Purge gas temperature:
394 K. (curves: Eqs (18) and (19); Eqs (16) and (17); axes: y [mol/mol] versus t [s])

Fig. 10. Effect of axial diffusion on concentration breakthrough curve for desorption step.
Purge gas temperature: 394 K. (curves: Eq. 10; Dax = 0; axes: y [mol/mol] versus t [s])

Fig. 11. Effect of axial thermal conductivity on temperature breakthrough curve for desorption
step. Purge gas temperature: 394 K. (curves: Eq. 11; kax = 0; axes: T [K] versus t [s])

The effect of axial diffusion on the concentration breakthrough curve is illustrated
in Figure 10. The effective axial diffusion coefficient, Dax , was (i) set equal to zero,
and (ii) calculated by the equation (10). Figure 11 represents the effect of axial
thermal conductivity on the temperature breakthrough curve. The value of kax was
varied in the same manner as for axial diffusion coefficient. Both breakthrough curves
were not significantly affected by axial thermal conductivity and axial diffusion, but
the required computer time was sensitive to the values of Dax and kax.

5 Conclusions
A theoretical study of thermal swing adsorption was carried out. A non-equilibrium,
non-adiabatic mathematical model was developed to simulate temperature and concentration
breakthrough curves for adsorption and regeneration. The cyclic steady-state
(CSS) cycles were obtained under various conditions by a cyclic iteration method. The
modeling results were used to study the effect of different operating conditions and
some of the model parameters on the concentration and temperature breakthrough
curves. The effects of adiabatic and non-adiabatic operation, purge gas temperature
during desorption step, boundary conditions, axial diffusion and thermal conductivity
were investigated.
Based on the modeling results the following conclusions are drawn:
(i) The breakthrough curves were highly influenced by purge gas temperature and
heat loss through the adsorption column wall.
(ii) The concentration and temperature breakthrough curves obtained using different
boundary conditions are practically identical.
(iii) The breakthrough curves were not significantly affected by axial thermal con-
ductivity and axial diffusion.

Symbols
bo – constant in Eq. 20
B – constant in Eq. 20
Cpc – heat capacity of column, J/(mol K)
Cpg – heat capacity of gas, J/(mol K)
Cps – heat capacity of solid, J/(kg K)
D – internal diameter of bed, m
Dax – axial diffusion coefficient, m²/s
DK – Knudsen diffusion coefficient, m²/s
DM – molecular diffusion coefficient, m²/s
Dps – effective particle diffusion coefficient, m²/s
DS – surface diffusion coefficient, m²/s
Dv, Dvg – diffusion volume of adsorbate, inert gas
E – surface diffusion energy of activation, J/mol
G – superficial molar gas flow rate, mol/(m² s)
hf – heat transfer coefficient from the bulk gas phase to the particle, W/(m² K)
hw – heat transfer coefficient from the bulk gas phase to the column wall, W/(m² K)
ΔHa – heat of adsorption of adsorbate, J/mol
k – overall mass transfer coefficient, 1/s
kax – axial thermal conductivity, W/(m K)
kf – film mass transfer coefficient, m/s
kg – thermal conductivity of gas, W/(m K)
L – bed length, m
M, Mg – molecular weight of adsorbate, inert gas, kg/kmol
nc – number of cycles
Nu – Nusselt number
p – partial pressure of adsorbate, Pa
P – total pressure, Pa
Pr – Prandtl number
q – adsorbate concentration in solid phase, mol/kg
qo – constant in Eq. 20
q* – value of q at equilibrium with y, mol/kg
Q – constant in Eq. 20
re – mean pore radius, Å
R – gas constant, J/(mol K)
Re – Reynolds particle number
Rp – particle radius, m
Sc – Schmidt number
Sh – Sherwood number
t – time, s
T – temperature, K
Tc – column wall temperature, K
Tg – gas temperature within the bed, K
Tgo – gas temperature at the feed conditions, K
Ts – solid phase temperature, K
Tamb – ambient temperature, K
U – overall heat transfer coefficient for column insulation, W/(m² K)
y – mole fraction of adsorbate in the gas phase, mol/mol
yo – mole fraction of adsorbate in the gas phase at the feed conditions, mol/mol
y* – mole fraction of adsorbate in the gas phase in equilibrium with q, mol/mol
z – axial distance, m
αair – ratio of the log mean surface area of the insulation to the volume of the
column wall, 1/m
αc – ratio of the internal surface area to the volume of the column wall, 1/m
αp – particle external surface area to volume ratio, 1/m
ε – bed voidage
εp – particle porosity
ρc – column density, kg/m³
ρg – gas density, mol/m³
ρp – particle density, kg/m³
τp, τs – pore, surface tortuosity factor
References
[1] Bathen, D., Breitbach, M.: Adsorptionstechnik. Springer, Berlin (2001)
[2] Ko, D., Moon, I., Choi, D.-K.: Analysis of the Contact Time in Cyclic Thermal Swing
Adsorption Process. Ind. Eng. Chem. Res. 41, 1603 (2002)
[3] Ding, Y., LeVan, M.D.: Periodic States of Adsorption Cycles III. Convergence
Acceleration for Direct Determination. Chem. Eng. Sci. 56, 5217 (2001)
[4] Schork, J.M., Fair, J.R.: Parametric Analysis of Thermal Regeneration of Adsorption
Beds. Ind. Eng. Chem. Res. 27, 457 (1988)
[5] Yun, J.-H., Choi, D.-K., Moon, H.: Benzene Adsorption and Hot Purge Regeneration in
Activated Carbon Beds. Chem. Eng. Sci. 55, 5857 (2000)
[6] Huang, C.-C., Fair, J.R.: Study of the Adsorption and Desorption of Multiple Adsorbates
in a Fixed Bed. AIChE J. 34, 1861 (1988)
[7] Wakao, N., Funazkri, T.: Effect of Fluid Dispersion Coefficients on Particle-to-Fluid
Mass Transfer Coefficients in Packed Beds. Chem. Eng. Sci. 33, 1375 (1978)
[8] Wakao, N., Chen, B.H.: Some Models for Unsteady-state Heat Transfer in Packed Bed
Reactors. In: Kulkarni, B., Mashelkar, R., Sharma, M. (eds.) Recent Trends in Chemical
Reaction Engineering, vol. 1, p. 254. Wiley Eastern Ltd., New Delhi (1987)
[9] Sinnott, R.K.: Coulson & Richardson’s Chemical Engineering, vol. 6.
Butterworth-Heinemann, Oxford (1999)
[10] Schiesser, W.E.: The Numerical Method of Lines. Academic Press, California (1991)
[11] Valenzuela, D.P., Myers, A.L.: Adsorption Equilibrium Data Handbook. Prentice-Hall,
Englewood Cliffs (1989)
The Stress Field Induced Diffusion

Marek Danielewski¹, Bartłomiej Wierzba², and Maciej Pietrzyk²

¹ Faculty of Materials Science and Ceramics, AGH University of Science and Technology,
Al. Mickiewicza 30, 30-059 Cracow, Poland
daniel@agh.edu.pl
² Faculty of Metals Engineering and Industrial Computer Science,
AGH University of Science and Technology,
Al. Mickiewicza 30, 30-059 Cracow, Poland
bwierzba@metal.agh.edu.pl, mpietrzyk@agh.edu.pl

Abstract. A mathematical description of mass transport in a multicomponent solution is
presented. The model is based on the Darken concept of the drift velocity. In order to present an example
of a real system we restrict the analysis to isotropic solids and to liquids for which the Navier
equation holds. The diffusion of components depends on the chemical potential gradients and
on the stress that can be induced by the diffusion and by the boundary and/or initial conditions.
In such a quasi-continuum the energy, momentum and mass transport are diffusion controlled
and the fluxes are given by the Nernst-Planck formulae. It is shown that the Darken method
combined with the Navier equations is valid for solid solutions as well as multicomponent liquids.

Keywords: stress, interdiffusion, Navier equation, multicomponent solution, alloys, drift velocity.

1 Introduction
The new understanding of diffusion in multicomponent systems started with
Kirkendall's experiments on the interdiffusion (ID) between Cu and Zn. The experiments
proved that diffusion by direct interchange of atoms, the prevailing idea of the
day, was incorrect and that a less-favored theory, the vacancy mechanism, must be
considered. In 1946, Kirkendall, along with his student, Alice Smigelskas,
co-authored a paper asserting that ID between Cu and Zn in brass shows movement of
the interface between the “initially different phases” due to ID. This discovery, known
since then as the “Kirkendall effect”, supported the idea that atomic diffusion occurs
through vacancy exchange [1]. It shows that the intrinsic diffusion fluxes of the
components differ, which causes swelling (creation) of one part and shrinkage (annihilation) of
the other part of the diffusion couple. The key conclusion is that local movement of
a solid (its lattice) or liquid due to diffusion is a real process. Once the solution is
non-uniform and the mobilities differ from each other, a vast number of
phenomena can occur: the Kirkendall markers move, Kirkendall-Frenkel
voids might be formed, stress is generated, etc. The concepts initiated by
Kirkendall played a decisive role in the development of the diffusion theory [2,3]. The
progress in the understanding of the ID phenomenology [4] nowadays allows an
attempt to further generalize the Darken method. The Darken method for multicomponent
solutions is based on the postulate that the total mass flow is a sum of diffusion and
drift flow [4]. The force arising from gradients causes the atoms of a particular
component to move with a velocity which in general may differ from the velocity of
the atoms of the other components. The medium is common for all the species and all the
fluxes are coupled. Thus, their local changes can affect the common drift velocity,
υdrift. The physical laws that govern the process are the continuity equations and the postulate
that the total molar concentration of the solution is constant. The extended Darken
method in one dimension [4] allows modeling the positions of the solution
boundaries, densities and the drift velocity. The physical laws are the same as in the original
Darken model. All the important differences are in the formulation of the initial and
boundary conditions. The model allows modelling ID for an arbitrary initial distribution of
the components, in the case of moving boundaries, of reactions and in many other
situations. The uniqueness and existence of the solution, the effective methods of
numerical solution and the successful modelling of the “diffusional structures” (“up-hill
diffusion”) prove the universality of the drift concept. It offers the sole opportunity to
describe ID in real solutions and in three dimensions, which is the objective of this work.
The presented model is solvable and has a unique solution [4].

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 179 – 188.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 The Darken Method


The core of the Darken method is the mass balance equation:
$$ \frac{\partial \rho_i}{\partial t} = -\operatorname{div} J_i, \qquad i = 1,\dots,r \tag{1} $$
and the postulated form of the flux of i-th element, Ji, that contains the diffusive and
the drift terms:

$$ J_i = J_i^{d} + \rho_i \upsilon^{drift}, \qquad i = 1,\dots,r \tag{2} $$

where $\upsilon^{drift}$ denotes the drift velocity, $J_i^{d}$ is the diffusion flux and r is the number of
components. The mass balance equation can be written in the internal reference frame
(relative to the drift velocity).
Thus from Eqs. (1) and (2) it follows:
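$$ \left.\frac{D\rho_i}{Dt}\right|_{\upsilon^{drift}} = -\operatorname{div} J_i^{d} - \rho_i \operatorname{div}\upsilon^{drift} = -\operatorname{div}\left(\rho_i x_i^{d}\right) - \rho_i \operatorname{div}\upsilon^{drift}, \qquad i = 1,\dots,r \tag{3} $$

where $x_i^{d}$ is the diffusion velocity. The derivative in Eq. (3) is called the Lagrangian,
substantial or material derivative:

$$ \left.\frac{D\rho_i}{Dt}\right|_{\upsilon^{drift}} = \frac{\partial \rho_i}{\partial t} + \upsilon^{drift}\cdot\operatorname{grad}\rho_i \tag{4} $$
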
and it gives the rate of density changes at the point moving with an arbitrary velocity,
here it is the drift velocity.
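
The step from Eqs. (1) and (2) to Eq. (3) is not written out in the text. A short sketch of it, using only the product rule for the divergence and the definition of the material derivative in Eq. (4), and introducing no symbols beyond those already defined, reads:

```latex
\begin{aligned}
\frac{\partial \rho_i}{\partial t}
  &= -\operatorname{div}\left(J_i^{d} + \rho_i \upsilon^{drift}\right)
   = -\operatorname{div} J_i^{d} - \rho_i \operatorname{div}\upsilon^{drift}
     - \upsilon^{drift}\cdot\operatorname{grad}\rho_i , \\
\left.\frac{D\rho_i}{Dt}\right|_{\upsilon^{drift}}
  &= \frac{\partial \rho_i}{\partial t} + \upsilon^{drift}\cdot\operatorname{grad}\rho_i
   = -\operatorname{div} J_i^{d} - \rho_i \operatorname{div}\upsilon^{drift} .
\end{aligned}
```

The convective term cancels, which is why only the diffusion flux and the divergence of the drift velocity remain in Eq. (3).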
The generally accepted form of the diffusion flux is the Nernst-Planck equation
[5,6]:

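$$ J_i^{d} = \rho_i B_i F_i \tag{5} $$

where $B_i$ and $F_i$ are the mobility of the i-th component and the force acting on it:

$$ F_i = -\operatorname{grad}\mu_i \tag{6} $$

Upon combining Eqs. (3), (5) and (6) the continuity equation becomes:

$$ \left.\frac{D\rho_i}{Dt}\right|_{\upsilon^{drift}} = \operatorname{div}\left[\rho_i B_i \operatorname{grad}\mu_i\right] - \rho_i \operatorname{div}\upsilon^{drift}, \qquad i = 1,\dots,r \tag{7} $$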

3 The Diffusion and Stress in the Multi Component Solution


3.1 Mass Balance

For all the processes that obey the mass conservation law and when the chemical
and/or nuclear reactions are not allowed (the reaction term can be omitted), the
equation of mass conservation holds, Eq (3). It is postulated here that the drift
velocity is the sum of the Darken drift velocity (generated by the interdiffusion) and the
deformation velocity $\upsilon^{\sigma}$ (generated by the stress):

$$ \upsilon^{drift} = \upsilon^{D} + \upsilon^{\sigma} \tag{8} $$
Darken postulated that diffusion fluxes are local and defined exclusively by the
local forcing (e.g., the chemical potential gradient, stress field, electric field etc.). He
postulated existence of the unique average velocity that he called the drift velocity. In
this work, we generalize the original Darken concept to include the elastic
deformation of an alloy. The Darken’s drift velocity, υD, is given by [2]:
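$$ \upsilon^{D} \overset{df}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i - \frac{1}{c}\sum_{i=1}^{r} c_i x_i^{d} - \upsilon^{\sigma} \tag{9} $$

The average total and the diffusion velocities are given by:

$$ \upsilon \overset{df}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i \tag{10} $$

$$ \upsilon^{d} \overset{df}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i^{d} \tag{11} $$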

The diffusion velocity of the i-th component and the concentration of the solution
are defined by:

$$ J_i^{d} = c_i x_i^{d} \tag{12} $$

$$ c = \sum_{i=1}^{r} c_i \tag{13} $$

From Eqs. (8) – (12), the following relations for the flux of the i-th element and its
velocity hold:

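$$ J_i = J_i^{d} + c_i \upsilon^{D} + c_i \upsilon^{\sigma} \tag{14} $$

$$ x_i = \upsilon^{D} + \upsilon^{\sigma} + x_i^{d} = \upsilon^{drift} + x_i^{d} \tag{15} $$

Upon summing Eqs. (14) over all components, the average local velocities satisfy
Eq. (9):

$$ \sum_{i=1}^{r} c_i x_i^{d} = \sum_{i=1}^{r} c_i x_i - c\,\upsilon^{D} - c\,\upsilon^{\sigma} $$

and from Eqs. (10), (11) and (14) it follows that

$$ \upsilon^{d} = \upsilon - \upsilon^{drift} = \upsilon - \upsilon^{D} - \upsilon^{\sigma} \tag{16} $$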
The postulate of the drift velocity allows rewriting Eq. (3) in the following form:

$$ \left.\frac{Dc_i}{Dt}\right|_{\upsilon^{drift}} + \operatorname{div}\left(c_i x_i^{d}\right) + c_i \operatorname{div}\upsilon^{drift} = 0 \tag{17} $$

Summing (17) over all components it is easy to show:

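$$ \left.\frac{Dc}{Dt}\right|_{\upsilon^{drift}} + \operatorname{div}\left(c\,\upsilon\right) - \upsilon^{drift}\cdot\operatorname{grad}c = 0 \tag{18} $$

and finally

$$ \frac{\partial c}{\partial t} + \operatorname{div}\left(c\,\upsilon\right) = 0 \tag{19} $$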
Thus, we have obtained the well known formulae for the mass conservation in the
multicomponent solution.
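
The summation that leads from Eq. (17) to Eqs. (18) and (19) can be spelled out as follows. This is a sketch that relies only on the definitions in Eqs. (11), (13) and (16) and on the product rule; no additional assumptions are introduced.

```latex
\begin{aligned}
0 &= \sum_{i=1}^{r}\left[\left.\frac{Dc_i}{Dt}\right|_{\upsilon^{drift}}
      + \operatorname{div}\left(c_i x_i^{d}\right) + c_i \operatorname{div}\upsilon^{drift}\right]
   = \left.\frac{Dc}{Dt}\right|_{\upsilon^{drift}}
      + \operatorname{div}\left(c\,\upsilon^{d}\right) + c \operatorname{div}\upsilon^{drift} \\
  &= \left.\frac{Dc}{Dt}\right|_{\upsilon^{drift}}
      + \operatorname{div}\left(c\,\upsilon\right)
      - \operatorname{div}\left(c\,\upsilon^{drift}\right) + c \operatorname{div}\upsilon^{drift}
   = \left.\frac{Dc}{Dt}\right|_{\upsilon^{drift}}
      + \operatorname{div}\left(c\,\upsilon\right) - \upsilon^{drift}\cdot\operatorname{grad}c .
\end{aligned}
```

Writing the material derivative as $\partial c/\partial t + \upsilon^{drift}\cdot\operatorname{grad}c$ then cancels the last term and gives Eq. (19).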

3.2 Stress and Strain Relations

The general form of the equation of motion for an elastic solid is very complex. We
will use the results that hold for an isotropic material. In such a case the equation
of motion reduces to the vector equation f = (λ + μ) grad div u + μ div grad u, where
f is the density of the force induced by the displacement vector u. It shows that an
isotropic material has only two elastic constants. To get the equation of motion for
such a material, we can set f = ρ ∂²u/∂t² and, upon neglecting any body forces such as
gravity, one gets [7]:

$$ \rho \frac{\partial^2 u}{\partial t^2} = (\lambda + \mu)\operatorname{grad}\operatorname{div}u + \mu \operatorname{div}\operatorname{grad}u . $$
An elastic body is defined as a material for which the stress tensor depends only on
a deformation tensor F,
$$ \sigma = \sigma(F) \tag{20} $$

We postulate in this work that the displacements are small. In such a case the
displacement gradient, H, is defined as the gradient of the displacement vector (u = x – X):

$$ H = \operatorname{grad}u = F - 1 \tag{21} $$

and the strain tensor is the symmetric part of H:

$$ \varepsilon = \frac{1}{2}\left(H + H^{T}\right) \tag{22} $$

where

$$ \varepsilon_{kl} = \frac{1}{2}\left(u_{k,l} + u_{l,k}\right) $$
The constitutive equation of an isotropic, linear and elastic body is known as
Hooke's law [8]:

$$ \sigma = (\lambda \operatorname{tr}\varepsilon)\,1 + 2\mu\,\varepsilon \tag{23} $$

where λ and μ denote the Lamé coefficients:

$$ \lambda = \frac{\nu E}{(1+\nu)(1-2\nu)} \quad\text{and}\quad \mu = \frac{E}{2(1+\nu)} \tag{24} $$

where E denotes Young's modulus and ν is Poisson's ratio.
The divergence of the stress tensor defined by Eq. (23) equals [9]:

$$ \operatorname{div}\sigma = (\lambda + \mu)\operatorname{grad}\operatorname{div}u + \mu \operatorname{div}\operatorname{grad}u \tag{25} $$
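
Equations (22) to (24) translate directly into a few lines of code. The sketch below is an illustration only: the function names are hypothetical, and the numerical values of Young's modulus, Poisson's ratio and the displacement gradient are assumed for the example and do not come from this work.

```python
import numpy as np

def lame_coefficients(E, nu):
    """Lame coefficients from Young's modulus E and Poisson's ratio nu, Eq. (24)."""
    lam = nu * E / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

def small_strain(H):
    """Strain tensor as the symmetric part of the displacement gradient H, Eq. (22)."""
    H = np.asarray(H, dtype=float)
    return 0.5 * (H + H.T)

def hooke_stress(eps, lam, mu):
    """Stress of an isotropic, linear elastic body, Eq. (23)."""
    eps = np.asarray(eps, dtype=float)
    return lam * np.trace(eps) * np.eye(3) + 2.0 * mu * eps

# Assumed example values (not from the text): E in Pa, nu dimensionless.
lam, mu = lame_coefficients(E=200e9, nu=0.3)
H = np.array([[1e-4, 2e-4, 0.0],
              [0.0,  0.0,  0.0],
              [0.0,  0.0,  0.0]])   # hypothetical small displacement gradient
sigma = hooke_stress(small_strain(H), lam, mu)
print(sigma)
```

Because the strain tensor is symmetric by construction, the computed stress is symmetric as well.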

3.3 Momentum Balance

The Navier and Navier-Lamé equations describe the momentum balance in the
compressible fluid and isotropic solid. The relations for the momentum and the
moment of momentum obtained in the theory of mass transport in continuum in which
diffusion takes place [10] allow to postulate the following relation:

$$ \rho \left.\frac{D\upsilon}{Dt}\right|_{\upsilon^{drift}} = \operatorname{div}\sigma^{*} + \rho f_b \tag{26} $$

where σ* and $f_b$ denote the overall Cauchy stress tensor and body force, respectively.
When interdiffusion is analyzed it is convenient to express the momentum balance as a
function of concentrations. Thus, upon dividing Eq. (26) by the overall molar mass
and using Eq. (15), one gets:

$$ c \left.\frac{D\upsilon^{drift}}{Dt}\right|_{\upsilon^{drift}} = \operatorname{div}\sigma + c f_b - c \left.\frac{D\upsilon^{d}}{Dt}\right|_{\upsilon^{drift}} \tag{27} $$

where σ and $f_b$ denote the overall stress tensor defined by Eq. (23) and the body
force, respectively.
In Eqs. (26) and (27) we postulate that the drift velocity defines the local frame of
reference. In the analyzed case of the regular, cubic and elastic crystal the following
relation holds [9,11]:
$$ \sigma - \sigma^{T} = 0 \tag{28} $$

3.4 Diffusion and Other Fluxes

The diffusion of the i-th component, Eq. (14), depends on both the stress and the
chemical potential gradient, Eqs. (5) and (6). Following Darken, the total flux is the sum
of the diffusion and drift terms:

$$ J_i = -c_i B_i \operatorname{grad}\left(\mu_i + \Omega_i p\right) + c_i \upsilon^{D} + c_i \upsilon^{\sigma} \tag{29} $$

Moreover, we limit the free energy density to the isostatic stress component.
Keeping only the diagonal terms [12,11] one gets:

$$ p = -\frac{1}{3}\operatorname{tr}\sigma \tag{30} $$

The free energy density (pressure) gradient will induce a diffusion flux of the
elements if their molar volumes differ [12]. The Nernst-Einstein equation relates the
mobility and the self-diffusion coefficient [12]:

$$ D_i = B_i kT \tag{31} $$

where k is the Boltzmann constant and T the absolute temperature.
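
The diffusion part of Eq. (29), together with Eqs. (30) and (31), can be illustrated in one dimension. The sketch below rests on several assumptions that are not made in the text: $\Omega_i$ is read as the volume per particle of component i, the chemical potential is that of an ideal solution (μ = kT ln c), the grid and the concentration profile are hypothetical, and the gradient is approximated with `numpy.gradient`. With zero pressure the flux should then reduce to Fick's law, −D dc/dx.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mobility(D, T):
    """Nernst-Einstein relation, Eq. (31): B = D / (k T)."""
    return D / (K_B * T)

def pressure(sigma):
    """Isostatic stress component, Eq. (30): p = -tr(sigma) / 3."""
    return -np.trace(np.asarray(sigma, dtype=float)) / 3.0

def diffusion_flux(c, mu, p, B, omega, dx):
    """One-dimensional diffusion term of Eq. (29): -c B d(mu + omega p)/dx."""
    return -c * B * np.gradient(mu + omega * p, dx)

# Hypothetical ideal-solution check at the temperature used in Section 4.
T = 1273.0
D = 1.106e-15                      # m^2/s, i.e. the Ni value of Section 4 converted from cm^2/s
x = np.linspace(0.0, 1.0, 1001)    # assumed grid, arbitrary length unit
dx = x[1] - x[0]
c = 1.0 + 0.5 * x                  # assumed linear concentration profile
mu_ideal = K_B * T * np.log(c)     # ideal-solution chemical potential (assumption)
J = diffusion_flux(c, mu_ideal, np.zeros_like(x), mobility(D, T), omega=0.0, dx=dx)
print(J[500], -0.5 * D)            # numerical flux next to the Fick's-law value
```

A non-zero pressure profile and non-zero `omega` add the stress-driven contribution to the same flux.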

3.5 Physical Laws (The Integral Form)

The mass conservation law has a form:


$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} c\, d\vartheta + \int_{\partial\beta(t)} c\,\upsilon^{d}\, ds = 0 $$

Using Eq. (16), it can be written in terms of drift velocity. Thus, the following
formulae form the integral balance equations for multicomponent solution [13]:

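$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} c\, d\vartheta + \int_{\partial\beta(t)} c\,\upsilon\, ds - \int_{\partial\beta(t)} c\,\upsilon^{drift}\, ds = 0 \tag{32} $$

$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} c\,\upsilon\, d\vartheta = \int_{\partial\beta(t)} \sigma\, ds + \int_{\beta(t)} c f_b\, d\vartheta \tag{33} $$

$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} x \times c\,\upsilon\, d\vartheta = \int_{\partial\beta(t)} x \times \sigma\, ds + \int_{\beta(t)} x \times c f_b\, d\vartheta \tag{34} $$

$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} \left( c e + \sum_{i=1}^{r} \frac{1}{2} c_i \left(\upsilon^{drift} + x_i^{d}\right)^{2} \right) d\vartheta = \int_{\partial\beta(t)} \upsilon\sigma\, ds - \int_{\partial\beta(t)} q_T\, ds + \int_{\beta(t)} c f_b \upsilon\, d\vartheta + \int_{\beta(t)} q_B\, d\vartheta \tag{35} $$

$$ \left.\frac{D}{Dt}\right|_{\upsilon^{drift}} \int_{\beta(t)} c\,\eta\, d\vartheta \ge - \int_{\partial\beta(t)} \frac{q_T}{T}\, ds + \int_{\beta(t)} \frac{q_B}{T}\, d\vartheta \tag{36} $$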

where e, qT, qB, and η denote the specific internal energy, heat flux (vector of heat
transfer), vector of heat source per unit mass produced by internal sources and density
of energy production, respectively.
The integral equations allow to derive the self-consistent set of the following
differential equations [13]:
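$$ \left.\frac{Dc}{Dt}\right|_{\upsilon^{drift}} + c \operatorname{div}\upsilon^{drift} + \operatorname{div}\left(c\,\upsilon^{d}\right) = 0 \tag{37} $$

$$ \operatorname{div}\sigma + c f_b - c \left.\frac{D\upsilon^{d}}{Dt}\right|_{\upsilon^{drift}} - c \left.\frac{D\upsilon^{drift}}{Dt}\right|_{\upsilon^{drift}} + \upsilon \operatorname{div}\left(c\,\upsilon^{d}\right) = 0 \tag{38} $$

$$ -c \left.\frac{De}{Dt}\right|_{\upsilon^{drift}} + e \operatorname{div}\left(c\,\upsilon^{d}\right) + c\,\upsilon \left.\frac{D\upsilon^{d}}{Dt}\right|_{\upsilon^{drift}} - \upsilon\upsilon \operatorname{div}\left(c\,\upsilon^{d}\right) - \sum_{i=1}^{r} c_i x_i \left.\frac{Dx_i^{d}}{Dt}\right|_{\upsilon^{drift}} - \operatorname{div}q_T + \sum_{i=1}^{r} \frac{1}{2} x_i x_i \operatorname{div}\left(c_i x_i^{d}\right) + \sigma : \operatorname{grad}\upsilon + q_B = 0 \tag{39} $$

$$ \sigma - \sigma^{T} = 0 \tag{40} $$

$$ -c \left.\frac{D\psi}{Dt}\right|_{\upsilon^{drift}} + \psi \operatorname{div}\left(c\,\upsilon^{d}\right) + c\,\upsilon \left.\frac{D\upsilon^{d}}{Dt}\right|_{\upsilon^{drift}} - \upsilon\upsilon \operatorname{div}\left(c\,\upsilon^{d}\right) - \sum_{i=1}^{r} c_i x_i \left.\frac{Dx_i^{d}}{Dt}\right|_{\upsilon^{drift}} + \sum_{i=1}^{r} \frac{1}{2} x_i x_i \operatorname{div}\left(c_i x_i^{d}\right) + \sigma : \operatorname{grad}\upsilon \ge 0 \tag{41} $$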
where the specific free energy is defined as ψ = e − ηT .

4 Results
There exists a solution of the above model. At present we solve this problem numerically
using the Finite Difference Method (FDM) in one dimension.
For demonstration the Cr-Fe-Ni system has been chosen. Interdiffusion modelling
in the closed Cr-Fe-Ni system has been done using the FDM and compared with
experimental results. For the calculations the following data have been used:

(a) The initial distribution of concentrations, Fig. 1.
(b) The activity of components (thermodynamic data).
(c) Calculated average self-diffusion coefficients at 1273 K:
D_Ni = 1.106·10⁻¹¹ cm² s⁻¹
D_Fe = 1.923·10⁻¹¹ cm² s⁻¹
D_Cr = 2.788·10⁻¹¹ cm² s⁻¹

Fig. 1. Initial concentration profiles of Cr-Ni-Fe
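
A one-dimensional finite-difference computation of this kind can be sketched as follows. This is not the implementation used in this work: it assumes an ideal solution (so that the diffusion flux reduces to −D_i ∂N_i/∂x in mole fractions), neglects the stress contribution (υσ = 0), keeps the total concentration constant and uses an explicit time step. In a closed system the average velocity vanishes, so by Eq. (16) the drift velocity equals minus the sum of the diffusion fluxes. The diffusivities are the values listed above; the grid, the time step, the initial step profile and the names `darken_step` and `simulate` are hypothetical.

```python
import numpy as np

def darken_step(N, D, dx, dt):
    """One explicit finite-difference step for mole fractions N (shape r x n)."""
    Jd = -D[:, None] * np.diff(N, axis=1) / dx      # diffusion flux at interior faces
    N_face = 0.5 * (N[:, 1:] + N[:, :-1])            # mole fractions at the faces
    v_drift = -Jd.sum(axis=0)                        # drift velocity for zero net flux
    J = Jd + N_face * v_drift                        # total flux, diffusion plus drift
    J = np.pad(J, ((0, 0), (1, 1)))                  # zero flux at both ends (closed system)
    return N - dt * np.diff(J, axis=1) / dx

def simulate(n_cells=100, n_steps=500):
    """Run the sketch for a hypothetical Cr-Fe-Ni step profile; returns (initial, final)."""
    D = np.array([2.788e-11, 1.923e-11, 1.106e-11])  # Cr, Fe, Ni in cm^2/s (from the list above)
    dx = 1.0e-4                                      # assumed cell size, cm
    dt = 0.2 * dx**2 / D.max()                       # conservative explicit time step, s
    left = np.array([0.4, 0.4, 0.2])                 # assumed compositions, not those of Fig. 1
    right = np.array([0.1, 0.3, 0.6])
    N0 = np.where(np.arange(n_cells) < n_cells // 2, left[:, None], right[:, None])
    N = N0.copy()
    for _ in range(n_steps):
        N = darken_step(N, D, dx, dt)
    return N0, N

N0, N = simulate()
print("largest deviation of the mole-fraction sum from 1:", np.abs(N.sum(axis=0) - 1.0).max())
```

Because the fluxes are evaluated at cell faces and the drift term is built from the same diffusion fluxes, the sketch conserves the amount of each component and keeps the mole fractions summing to one up to rounding.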

In Figure 2 the calculated concentration profiles of Cr, Fe and Ni are compared
with the experimental results and show satisfactory agreement:

Fig. 2. Calculated concentration profiles of Cr-Ni-Fe


Figure 3 shows the calculated drift velocity of the diffusion couple shown in
Fig. 1:

Fig. 3. Calculated drift velocity

In Figure 4 the pressure distribution is shown.

Fig. 4. Pressure diagram

The above figures illustrate the evolution of the concentration, the drift velocity and
the pressure. Comparison of the simulation data with experiment shows that the model is
valid, and that the mathematical description of interdiffusion and stress is an effective tool for
simulating these processes.
5 Concluding Remarks
The following conclusions can be drawn:
a) The mathematical description of interdiffusion in multicomponent systems has
been formulated. For the known thermodynamic data and diffusivities, the
evolution of the concentration profiles and drift velocity can be computed.
b) Effective formulae enable us to calculate the concentration profiles and the
drift velocity as a function of time and position.
c) The model was applied to the modelling of interdiffusion in the Cr-Fe-Ni diffusion
couple. The calculated concentration profiles were consistent with
experimental results.
d) The Navier–Stokes and Navier-Lamé equations for the case of multicomponent
solutions, where the concentrations are not uniform, have been
effectively used.

Acknowledgments
This work has been supported by the MNiI No. 11.11.110.643, under Grant No. 4
T08C 03024, and under Grant No. 3 T08C 044 30, financed during the period
2006-2008.

References
[1] Smigelskas, A.D., Kirkendall, E.: Trans. A.I.M.E. 171, 130 (1947)
[2] Darken, L.S.: Trans. AIME 174, 184 (1948)
[3] Danielewski, M.: Defect and Diffusion Forum 95–98, 125 (1993)
[4] Holly, K., Danielewski, M.: Phys. Rev. B 50, 13336 (1994)
[5] Nernst, W.: Z. Phys. Chem. 4, 129 (1889)
[6] Planck, M.: Ann. Phys. Chem. 40, 561 (1890)
[7] Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics. Addison-
Wesley, London (1964)
[8] Cottrell, A.H.: The mechanical properties of matter. John Wiley & Sons Inc., New York
(1964)
[9] Landau, L.D., Lifszyc, E.M.: Theory of elasticity, Nauka, Moscow (1987)
[10] Danielewski, M., Krzyżański, W.: The Conservation of Momentum and Energy in Open
Systems. Phys. Stat. Sol. 145, 351 (1994)
[11] Stephenson, G.B.: Acta metall. 36, 2663 (1988)
[12] Philibert, J.: Diffusion and Stress. In: Defect and Diffusion Forum, pp. 129–130 (1996)
[13] Danielewski, M., Wierzba, B.: The Unified Description of Interdiffusion in Solids and
Liquids. In: Proc. Conf. 1st International Conference on Diffusion in Solids and Liquids,
Aveiro, Portugal, p. 113 (2005)
Author Index

Ambrożek, Bogdan 165
Castillo, Oscar 43
Danielewski, Marek 179
Garus, Jerzy 71
Hrebień, Maciej 137
Korbicz, Józef 137
Kusiak, J. 153
Matthäus, Franziska 109
Melin, Patricia 43
Merkwirth, Christian 119
Mielczarek, Norbert 1
Mitkowski, Wojciech 99
Ogorzalek, Maciej J. 119
Pietrzyk, Maciej 179
Porada, Ryszard 1
Rauch, L. 153
Skruch, Pawel 85, 99
Sydorets, V. 29
Vladimirov, Vsevolod 21
Wichard, Jörg 119
Wierzba, Bartlomiej 179
Wróbel, Jacek 21
