Modelling Dynamics in Processes and Systems
Studies in Computational Intelligence, Volume 180
Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: kacprzyk@ibspan.waw.pl
Further volumes of this series can be found on our homepage: springer.com

Vol. 156. Dawn E. Holmes and Lakhmi C. Jain (Eds.), Innovations in Bayesian Networks, 2008. ISBN 978-3-540-85065-6
Vol. 157. Ying-ping Chen and Meng-Hiot Lim (Eds.), Linkage in Evolutionary Computation, 2008. ISBN 978-3-540-85067-0
Vol. 158. Marina Gavrilova (Ed.), Generalized Voronoi Diagram: A Geometry-Based Approach to Computational Intelligence, 2009. ISBN 978-3-540-85125-7
Vol. 159. Dimitri Plemenos and Georgios Miaoulis (Eds.), Artificial Intelligence Techniques for Computer Graphics, 2009. ISBN 978-3-540-85127-1
Vol. 160. P. Rajasekaran and Vasantha Kalyani David, Pattern Recognition using Neural and Functional Networks, 2009. ISBN 978-3-540-85129-5
Vol. 161. Francisco Baptista Pereira and Jorge Tavares (Eds.), Bio-inspired Algorithms for the Vehicle Routing Problem, 2009. ISBN 978-3-540-85151-6
Vol. 162. Costin Badica, Giuseppe Mangioni, Vincenza Carchiolo and Dumitru Dan Burdescu (Eds.), Intelligent Distributed Computing, Systems and Applications, 2008. ISBN 978-3-540-85256-8
Vol. 163. Pawel Delimata, Mikhail Ju. Moshkov, Andrzej Skowron and Zbigniew Suraj, Inhibitory Rules in Data Analysis, 2009. ISBN 978-3-540-85637-5
Vol. 165. Djamel A. Zighed, Shusaku Tsumoto, Zbigniew W. Ras and Hakim Hacid (Eds.), Mining Complex Data, 2009. ISBN 978-3-540-88066-0
Vol. 166. Constantinos Koutsojannis and Spiros Sirmakessis (Eds.), Tools and Applications with Artificial Intelligence, 2009. ISBN 978-3-540-88068-4
Vol. 167. Ngoc Thanh Nguyen and Lakhmi C. Jain (Eds.), Intelligent Agents in the Evolution of Web and Applications, 2009. ISBN 978-3-540-88070-7
Vol. 168. Andreas Tolk and Lakhmi C. Jain (Eds.), Complex Systems in Knowledge-based Environments: Theory, Models and Applications, 2009. ISBN 978-3-540-88074-5
Vol. 169. Nadia Nedjah, Luiza de Macedo Mourelle and Janusz Kacprzyk (Eds.), Innovative Applications in Data Mining, 2009. ISBN 978-3-540-88044-8
Vol. 170. Lakhmi C. Jain and Ngoc Thanh Nguyen (Eds.), Knowledge Processing and Decision Making in Agent-Based Systems, 2009. ISBN 978-3-540-88048-6
Vol. 171. Chi-Keong Goh, Yew-Soon Ong and Kay Chen Tan (Eds.), Multi-Objective Memetic Algorithms, 2009. ISBN 978-3-540-88050-9
Vol. 172. I-Hsien Ting and Hui-Ju Wu (Eds.), Web Mining Applications in E-Commerce and E-Services, 2009. ISBN 978-3-540-88080-6
Vol. 173. Tobias Grosche, Computational Intelligence in Integrated Airline Scheduling, 2009. ISBN 978-3-540-89886-3
Vol. 174. Ajith Abraham, Rafael Falcón and Rafael Bello (Eds.), Rough Set Theory: A True Landmark in Data Analysis, 2009. ISBN 978-3-540-89886-3
Vol. 175. Godfrey C. Onwubolu and Donald Davendra (Eds.), Differential Evolution: A Handbook for Global Permutation-Based Combinatorial Optimization, 2009. ISBN 978-3-540-92150-9
Vol. 176. Beniamino Murgante, Giuseppe Borruso and Alessandra Lapucci (Eds.), Geocomputation and Urban Planning, 2009. ISBN 978-3-540-89929-7
Vol. 177. Dikai Liu, Lingfeng Wang and Kay Chen Tan (Eds.), Design and Control of Intelligent Robotic Systems, 2009. ISBN 978-3-540-89932-7
Vol. 178. Swagatam Das, Ajith Abraham and Amit Konar, Metaheuristic Clustering, 2009. ISBN 978-3-540-92172-1
Vol. 179. Mircea Gh. Negoita and Sorin Hintea, Bio-Inspired Technologies for the Hardware of Adaptive Systems, 2009. ISBN 978-3-540-76994-1
Vol. 180. Wojciech Mitkowski and Janusz Kacprzyk (Eds.), Modelling Dynamics in Processes and Systems, 2009. ISBN 978-3-540-92202-5
Wojciech Mitkowski and Janusz Kacprzyk (Eds.)
Prof. Wojciech Mitkowski
Faculty of Electrical Engineering, Automatics
Computer Science and Electronics
AGH University of Science and Technology
Al. Mickiewicza 30
30-059 Krakow
Poland
Email: wmi36@op.pl
DOI 10.1007/978-3-540-92203-2
© 2009 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilm or in any other way,
and storage in data banks. Duplication of this publication or parts thereof is permitted
only under the provisions of the German Copyright Law of September 9, 1965, in
its current version, and permission for use must always be obtained from Springer.
Violations are liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publi-
cation does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general
use.
Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
springer.com
Preface
Dynamics characterizes virtually all phenomena we face in the real world, and processes that proceed in practically all kinds of inanimate and animate systems, notably social systems. For our purposes, dynamics is viewed as the time evolution of some characteristic features of the phenomena or processes under consideration. It is obvious that in virtually all non-trivial problems dynamics cannot be neglected, and should be taken into account in the analyses, first, to gain insight into the problem considered and, second, to be able to obtain meaningful results.
A convenient tool to deal with dynamics and the related evolution over time is the concept of a dynamic system which, for the purposes of this volume, can be characterized by the input (control), state and output spaces, and a state transition equation. Then, starting from an initial state, we can find a sequence of consecutive states (outputs) under consecutive inputs (controls); that is, we obtain a trajectory. The state transition equation may be given in various forms, exemplified by differential and difference equations, linear or nonlinear, deterministic or stochastic, or even fuzzy (imprecisely specified), fully or partially known, etc. These features can give rise to various problems the analysts may encounter, like numerical difficulties, instability, strange forms of behavior (e.g. chaotic), etc.
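As a minimal illustration of these notions, a linear, deterministic, discrete-time state transition equation x_{k+1} = A x_k + B u_k with output y_k = C x_k can be iterated to produce a trajectory (the matrices and inputs below are purely illustrative, not taken from any chapter of this volume):

```python
import numpy as np

def trajectory(A, B, C, x0, inputs):
    """Iterate the state transition equation x_{k+1} = A x_k + B u_k,
    collecting states and outputs y_k = C x_k along the way."""
    states, outputs = [x0], [C @ x0]
    x = x0
    for u in inputs:
        x = A @ x + B @ u          # state transition under control u
        states.append(x)
        outputs.append(C @ x)
    return states, outputs

# Illustrative 2-state system; the eigenvalues of A (0.5 and 0.8) lie
# inside the unit circle, so the trajectory converges under constant input.
A = np.array([[0.5, 0.1],
              [0.0, 0.8]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])
states, outputs = trajectory(A, B, C, np.zeros(2), [np.ones(1)] * 50)
# The trajectory approaches the steady state x* = (I - A)^{-1} B u = (2.5, 2.5).
```

Changing the form of the transition map (nonlinear, stochastic, fuzzy) changes only the update inside the loop; the trajectory concept stays the same.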
This volume is concerned with some modern tools and techniques which can be
useful for the modeling of dynamics. We focus our attention on two important areas
which play a key role nowadays, namely automation and robotics, and biological sys-
tems. We also add some new applications which can greatly benefit from the avail-
ability of effective and efficient tools for modeling dynamics, exemplified by some
applications in security systems.
The first part of the volume is concerned with more general tools and techniques
for the modeling of dynamics. We are particularly interested in the case of complex
systems which are characterized by a highly nonlinear dynamic behavior that can re-
sult in, for instance, chaotic behavior.
R. Porada and N. Mielczarek (Modeling of chaotic systems in program ChaoPhS) first consider some general issues related to non-linear dynamics, both from the perspective of gaining more knowledge on how to proceed in the case of such dynamics, and from that of tools and techniques which can be used in practice. Notably, they deal with simulation tools and propose a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), which is meant for studying chaotic phenomena in continuous and discrete systems, including systems used in practice. The structure of the program and the algorithms employed are presented, along with numerical tests on some models of chaotic systems known from the literature. Moreover, as an illustration, an example of using the proposed tools and techniques for the analysis of chaotic behavior in a power electronic system is presented.
V. Vladimirov and J. Wróbel (Oscillations of vertically hang elastic rod, contact-
ing rotating disc) present an analysis of mechanical oscillations of an elastic rod
forming a friction pair with a rotating disc. In the absence of friction the model is de-
scribed by a two-dimensional Hamiltonian system of ordinary differential equations
which is completely integrable. However, when a Coulomb type friction is added, the
situation becomes more complicated. The authors use both qualitative methods and numerical simulation. They obtain the complete global behavior of the system within a broad range of values of the driving parameter, for two principal types of modeling functions simulating the Coulomb friction. A sequence of bifurcations (limit cycles, double-limit cycles, homoclinic bifurcations and other regimes) is observed as the driving parameter changes. The patterns of bifurcations depend essentially upon the model of the frictional force, and this dependence is analyzed in detail. Much more complicated regimes appear when one-dimensional oscillations of the rotating element are incorporated into the model. In this case the system possesses quasiperiodic, multiperiodic and, probably, chaotic solutions.
V.N. Sidorets (The bifurcations and chaotic oscillations in electric circuits with arc) is concerned with autonomous electric circuits with an arc governed by three ordinary differential equations. By varying two parameters, many kinds of bifurcations and periodic and chaotic behaviors of this system are revealed. Bifurcation diagrams, a powerful tool for investigating bifurcations, have been used and studied, and routes to chaos have been considered using one-parameter bifurcation diagrams. Three basic patterns of bifurcation diagrams have been observed, possessing the properties of: softness and reversibility, stiffness and irreversibility, and stiffness and reversibility.
The second section of the volume is devoted to a key problem of modeling dynam-
ics in control and robotics, very relevant fields in which intelligent systems have
found numerous applications.
Oscar Castillo and Patricia Melin (Soft computing models for intelligent control of
non-linear dynamical systems) describe the application of soft computing techniques
(fuzzy logic, neural networks, evolutionary computation and chaos theory) to control-
ling non-linear dynamical systems in real-world problems. Since control of real world
non-linear dynamical systems may require the use of several soft computing tech-
niques to achieve a desired performance, several hybrid intelligent architectures have
been developed. The basic idea of these hybrid architectures is to combine the advan-
tages of each of the techniques involved. Moreover, this can also help in dealing with
the fact that non-linear dynamical systems are difficult to control due to the unstable
and even chaotic behaviors that may occur. Practical applications of the new control
architectures proposed include robotics, aircraft systems, biochemical reactors, and
manufacturing of batteries.
J. Garus (Model reference adaptive control of underwater robot in spatial motion)
discusses nonlinear control of an underwater robot. Emphasis is on the tracking of a
desired trajectory. Command signals are generated by an autopilot consisting of four
controllers with a parameter adaptation law that has been implemented.
External disturbances are assumed, and an index of control quality is introduced.
Results of computer simulations are provided to demonstrate the effectiveness, effi-
ciency, correctness and robustness of the approach proposed.
The fourth part of the volume is devoted to various issues related to the modeling
of dynamics in new application areas which have recently attracted much attention in
the research community and practice.
M. Hrebień and J. Korbicz (Automatic fingerprint identification based on minutiae points) deal with a problem that has recently attracted much attention and become of utmost importance, namely the use of individual specific features in human identification. In the paper, fingerprint identification is considered, specifically by means of local ridge characteristics called minutiae points. Automatic fingerprint matching depends on the comparison of these minutiae and the relationships between them. The authors discuss several methods of fingerprint matching, namely the Hough transform, the structural global star method and the speeded-up correlation approach. Since there is still a need for finding the best matching approach, research on on-line fingerprints has been conducted to compare quality differences and time relations between the algorithms considered, and the experimental results are shown. Some issues related to image enhancement and the minutiae detection schemes employed are also dealt with.
Ł. Rauch and J. Kusiak (Image filtering using the dynamic particles method) consider holistic approaches to image processing and their use in various types of applications in the domain of applied computer science and pattern recognition. A new image filtering method based on the dynamic particles approach is presented. It employs physical principles for 3D signal smoothing. The obtained results are compared with commonly used denoising techniques, including the weighted average, Gaussian smoothing and wavelet analysis. The calculations are performed on two types of noise superimposed on the image data, i.e. Gaussian noise and salt-and-pepper noise. The algorithm of the dynamic particles method and the results of the calculations are presented.
B. Ambrożek (The simulation of cyclic thermal swing adsorption (TSA) process)
deals with the prediction of the dynamic behavior of a cyclic thermal swing adsorp-
tion (TSA) system with a column packed with a fixed bed of adsorbent using a rigor-
ous dynamic mathematical model. The set of partial differential equations, represent-
ing the thermal swing adsorption, is solved by using numerical procedures from the
International Mathematical and Statistical Library (IMSL). The simulated thermal
swing adsorption cycle is operated in three steps: (i) an adsorption step with a cold
feed; (ii) a counter-current desorption step with a hot inert gas; (iii) a counter-current
cooling step with a cold inert gas. Some examples of simulations are presented for the
propane adsorbed onto and desorbed from a fixed bed of activated carbon. Nitrogen is
used as the carrier gas during adsorption and as the purge gas during desorption and
cooling.
M. Danielewski, B. Wierzba and M. Pietrzyk (The stress field induced diffusion) present a mathematical description of mass transport in a multi-component solution. The model is based on the Darken concept of the drift velocity. To be able to present an example of a real system, the authors restrict the analysis to isotropic solids and liquids for which the Navier equation holds. The diffusion of components depends on the chemical potential gradients and on the stress that can be induced by the diffusion and by the boundary and/or initial conditions. In such a quasi-continuum the energy, momentum and mass transport are diffusion-controlled and the fluxes are given by the Nernst-Planck formulae. It is shown that the Darken method combined with the Navier equations is valid for solid solutions as well as multi-component liquids.
We hope that the chapters, written by leading experts in the field, will provide interested readers with much information on topics which may be relevant for their research, and which is difficult to find in the vast scientific literature scattered over many fields and subfields of applied mathematics, control, robotics, security analysis, bioinformatics, mechanics, etc.
The idea of this volume resulted from very interesting discussions held during, and after, the well-attended Special Session on “Dynamical Systems – Modelling, Analysis and Synthesis” at the CMS “Computer Methods and Systems” International Conference held on November 14–16, 2005, and organized by the AGH University of Science and Technology in Cracow, Poland. We wish to thank all the attendees and participants in the discussions for the support and encouragement we have experienced while preparing this publication.
We wish to thank the contributors for their excellent work and a great collaboration
in this challenging and interesting editorial project. Special thanks are due to Dr.
Thomas Ditzinger and Ms. Heather King from Springer for their constant help and
support.
Abstract. Modeling of chaos phenomena in nonlinear dynamics requires more precise methods and simulation tools than the study of linear systems. Research on these phenomena, apart from its cognitive value, is also of technical importance: to obtain high-quality output signals in practical systems it is necessary to control, and even eliminate, chaotic behaviour. Popular simulation programs, e.g. Matlab, do not always meet strict requirements concerning the accuracy and speed of simulation. In this paper we introduce a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), for investigating chaotic phenomena in continuous and discrete systems, including systems encountered in practice. We also present the structure of the program and the numerical algorithms used. The program was tested on well-known models of chaotic systems from the literature. Selected results of research on chaotic phenomena appearing in simple power electronic systems are also presented.
1 Introduction
In recent years it is observed alot of interest in theory of deterministic chaos not only
among mathematicians and physicists, but also among representatives of technical
sciences. This theory analyzes irregular movement in the state space of nonlinear sys-
tem. Classic dynamic laws describe unambiguously the state of systems evolution as a
function of time, when initial conditions are known. The reason of observed chaotic
behaviour is not an external noise, but the property of nonlinear systems resulting in
exponential divergence of an initially close trajectory in the limited area of phase
space. The reason why the system behaves this way is its sensitivity to initial condi-
tions which makes impossible a long-term forecast of their trajectory, because in prac-
tice we can establish initial conditions only with a finite precision.
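This sensitivity can be illustrated numerically with the logistic map x_{n+1} = 4x_n(1 − x_n), a standard textbook example (the initial values below are illustrative): two trajectories started 10^−12 apart become macroscopically separated after a few dozen iterations.

```python
# Two trajectories of the logistic map x_{n+1} = 4 x_n (1 - x_n),
# started 1e-12 apart: their separation grows roughly like 2^n
# until it saturates at the size of the attractor.
def step(x):
    return 4.0 * x * (1.0 - x)

a, b = 0.2, 0.2 + 1e-12
max_sep = 0.0
for n in range(60):
    a, b = step(a), step(b)
    max_sep = max(max_sep, abs(a - b))
# After a few dozen iterations the two trajectories are macroscopically apart,
# even though they started closer than any realistic measurement precision.
```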
Research on deterministic chaos phenomena enables the identification of its causes and the determination of means for its elimination, which is essential in practical applications. Over longer time horizons, the state vector of a nonlinear system depends on the initial conditions and, significantly, also on the numerical methods applied to solve the equations describing the system. The application of a typical simulation program, e.g. Matlab, is often related to a very long computation time. Also the limited number of implemented numerical integration algorithms for the dynamics equations, and the lack of numerical instruments for determining the quantities that characterize nonlinear dynamics (e.g. the Poincaré section, Lyapunov exponents, etc.), contributed to our decision to write our own simulation program.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 1 – 20.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
2 R. Porada and N. Mielczarek
The paper describes the concept of deterministic chaos and the mathematical instruments used for its analysis. We introduce our own simulation program, ChaoPhS, describe the tests carried out on it, and demonstrate research results for a simple power electronic system (an example of a typical, strongly nonlinear switching structure used in practice), operating in a closed loop under various control and load conditions.
γ = {K ∈ M : K = g_Λ^t, t ∈ ℝ^1}   (2)

constitutes the orbit of the flow. The orbit is a curve lying on the manifold M and is the trajectory of equation (1). If equation (1) has a periodic solution with period T, then g_Λ^{t+T} = g_Λ^t, t ∈ ℝ^1, and orbit (2) is closed. The orbits of the flow g_Λ^t are the integral curves of the vector field F.
For a system with discrete time given in the form of an algebraic representation, the evolution as a function of time can be described by an equation in the form of a general iterative formula:

x_{n+1} = f_p(x_n)   (3)

where x_n and x_{n+1} describe the system state in the n-th and (n+1)-th steps of the evolution.
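Formula (3) translates directly into code; a minimal sketch using the logistic map f_p(x) = p·x·(1 − x) as the map f_p (the parameter and initial value are illustrative):

```python
def iterate_map(f, x0, n):
    """Evolve x_{n+1} = f(x_n) for n steps, returning the whole orbit."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

# Logistic map f_p(x) = p x (1 - x); for p = 4 the evolution is chaotic.
p = 4.0
orbit = iterate_map(lambda x: p * x * (1.0 - x), 0.2, 100)
```

Any other map f_p can be substituted for the lambda without changing the iteration scheme.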
Among the basic methods of nonlinear dynamics [2,3,4] it is possible to mention several mutually interrelated notions, like fixed points, orbits, attractors, the Poincaré section, the Lyapunov exponents, the Hausdorff dimension, the correlation function or bifurcation [1,2,7,11].

An attractor is a certain region, trajectory or point in the phase space towards which trajectories beginning in different regions of phase space head. The simplest attractor is a fixed point, when the system has a distinguished state towards which it is aiming regardless of the initial conditions. In a two-dimensional phase space only one more type of attractor is possible – the limit cycle. Limit cycles appear in nonlinear systems in which there exist elements dissipating energy and supporting the motion.
Modeling of Chaotic Systems in the ChaoPhS Program 3
The Poincaré sections simplify the attractor search problem through the analysis of points marked out by trajectories cutting through a chosen plane. The Poincaré map emerges from the orbits of the phase flow g_Λ^t, and its property, i.e. whether it is contracting or expanding, determines the system's behaviour.
The Lyapunov exponents are used to estimate the convergence or divergence of the
phase flow trajectory. The positive values of exponents mean the divergence of orbits
and chaos. The Lyapunov exponent is defined by the equation:
or by expanding in a Taylor series. In this research we use several methods of solving equation (1); they are all discussed in a further part of this paper.
For a qualitative study of a chaotic model the Poincaré section can be very helpful. It makes possible a simplification of the attractor search task through the analysis of points marked out by trajectories cutting through a chosen plane. Instead of continuous lines we obtain a set of points situated on this plane (Fig. 1). The plane is
Fig. 3. Bifurcation diagram showing a cascade of period doubling of phase trajectory orbit
It is rather hard to calculate these coefficients analytically; however, they can be relatively easily determined using sampled time series of the investigated system. The Lyapunov exponents are numerical coefficients of the exponential growth of the distance between neighbouring points in phase space when a transformation operates on it. For the simplest transformation x_{n+1} = a x_n, after n steps we obtain x_n = a^n x_0, which can easily be written as x_n = x_0 e^{n ln a}. The quantity ln a shows the proportion in which the distance between points changes in one step of the transformation. For multidimensional systems, where the transformation is given in the form x_{n+1} = A x_n, the Lyapunov exponents are equal to λ_k = ln a_k, where a_1, a_2, …, a_k are the eigenvalues of the matrix A. In directions where trajectories diverge from each other the Lyapunov exponents are positive and, on the contrary, where they converge the exponents are negative. The condition to preserve the measure is det A = 1, which means that the product of all eigenvalues is equal to 1. For continuous nonlinear systems, the rate of motion on each trajectory is set by a tangent vector. The transformation matrix is the Jacobian matrix J_ij = ∂f_j/∂x_i, where the J_ij are functions of the point coordinates in the phase space and define the rate of change of the j-th coordinate in the x_i direction. Therefore these exponents are calculated locally and their values are obtained in a small neighbourhood of the explored point.
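For a one-dimensional map the exponent can be estimated along a trajectory as the average of ln|f′(x_n)|; a sketch for the logistic map at p = 4, whose exact largest exponent is ln 2 (the initial value and iteration counts are illustrative):

```python
import math

def lyapunov_1d(f, dfdx, x0, n, transient=100):
    """Estimate the Lyapunov exponent of x_{n+1} = f(x_n) as the
    trajectory average of ln|f'(x_n)|."""
    x = x0
    for _ in range(transient):          # discard the transient part of the orbit
        x = f(x)
    total = 0.0
    for _ in range(n):
        # Guard against the measure-zero case f'(x) = 0.
        total += math.log(max(abs(dfdx(x)), 1e-300))
        x = f(x)
    return total / n

# Logistic map at p = 4: f(x) = 4x(1-x), f'(x) = 4 - 8x, exact exponent ln 2.
lam = lyapunov_1d(lambda x: 4.0 * x * (1.0 - x),
                  lambda x: 4.0 - 8.0 * x,
                  0.3, 200000)
```

This direct averaging needs the analytic derivative, which is rarely available for measured signals; hence the time-series algorithm discussed next.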
In order to determine the largest Lyapunov exponent, the algorithm developed by Collins, De Luca and Rosenstein [16] was used in this research.
Let the sequence:

x = {x_1, x_2, x_3, …, x_N}   (3)

represent the samples of a time series of one of the state variables for which the exponents are being estimated, whereas:

X_i = [X_1 X_2 … X_n]^T   (4)
m ≥ 2n + 1   (6)

After reconstruction of the vector of state variables, we find the distance d_j of the reference point j to its nearest neighbour X_ĵ, defined by the Euclidean norm:

d_j(0) = min_{X_ĵ} ||X_j − X_ĵ||   (7)

The distance then grows approximately exponentially at the rate of the largest exponent λ_1:

d_j(i) ≈ C_j e^{λ_1 (i·Δt)}   (8)

ln d_j(i) ≈ ln C_j + λ_1 (i·Δt)   (9)

The largest Lyapunov exponent can be obtained by calculating the slope coefficient of equation (9) using the least squares method and dividing it by the sampling interval of the time series x.
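The steps described above (delay reconstruction, nearest-neighbour search with a temporal exclusion window, tracking of ln d_j(i), and a least-squares slope) can be sketched as follows. This is a simplified single-neighbour variant, not the authors' implementation, and the embedding parameters are chosen by hand:

```python
import numpy as np

def largest_lyapunov(series, m=3, delay=1, dt=1.0, follow=8, window=10):
    """Simplified Rosenstein-style estimate of the largest Lyapunov
    exponent from a scalar time series (one nearest neighbour per point)."""
    N = len(series) - (m - 1) * delay
    # Delay reconstruction: X_j = (x_j, x_{j+delay}, ..., x_{j+(m-1)delay}).
    X = np.array([series[j:j + (m - 1) * delay + 1:delay] for j in range(N)])
    logs = np.zeros(follow)
    counts = np.zeros(follow)
    for j in range(N - follow):
        d = np.linalg.norm(X - X[j], axis=1)
        d[max(0, j - window):j + window + 1] = np.inf   # exclude temporal neighbours
        k = int(np.argmin(d))                           # nearest neighbour of X_j
        if k >= N - follow:
            continue
        for i in range(follow):                         # track the divergence d_j(i)
            dist = np.linalg.norm(X[j + i] - X[k + i])
            if dist > 0.0:
                logs[i] += np.log(dist)
                counts[i] += 1
    mean_log = logs / np.maximum(counts, 1)
    # Least-squares slope of <ln d_j(i)> versus i*dt gives lambda_1, as in eq. (9).
    return np.polyfit(np.arange(follow) * dt, mean_log, 1)[0]

# Chaotic test series from the logistic map at p = 4; saturation of the
# divergence biases the fit, but the estimate comes out clearly positive.
x, xs = 0.2, []
for _ in range(3000):
    x = 4.0 * x * (1.0 - x)
    xs.append(x)
lam1 = largest_lyapunov(np.array(xs))
```

In practice the fitting range must be restricted to the initial, linearly growing part of ln d_j(i); otherwise saturation at the attractor size flattens the slope.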
For a qualitative description of the complexity of a chaotic system we can use the correlation dimension D_2, being the lower limit of the Hausdorff dimension D_0, i.e. D_2 < D_0. The correlation dimension is defined as:

D_2 = lim_{r→0} ln C(r) / ln r   (10)

where the correlation integral C(r) is:

C(r) = 2/(M(M−1)) · Σ_{i≠k} Θ[r − ||X_i − X_k||]   (11)

From (10) and (11) it is possible to notice that the correlation dimension can be determined as the slope of the function (12).
The correlation integral can also be used as an instrument allowing one to distinguish deterministic irregularities, arising from internal properties of a strange attractor, from external white noise. If a strange attractor is embedded in an n-dimensional space and external white noise is added, then each point on the attractor is surrounded by a homogeneous n-dimensional cloud of points. The radius r_0 of this cloud is proportional to the intensity of the noise. For r ≫ r_0 the correlation integral counts these clouds as points and the slope of the function ln C(r) = f(ln r) is equal to the correlation dimension of the attractor. For r ≪ r_0 most of the counted points are situated inside homogeneously filled n-dimensional cells and the slope tends to the value n.
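Equations (10)-(11) can be sketched as follows: compute C(r) over a range of radii and take the slope of ln C(r) versus ln r in the scaling region (a Grassberger-Procaccia-style estimate; the test data, radii and point count are illustrative):

```python
import numpy as np

def correlation_dimension(X, radii):
    """Estimate D2 as the slope of ln C(r) versus ln r, with the
    correlation integral C(r) of eq. (11) computed over all point pairs."""
    M = len(X)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=2)[np.triu_indices(M, k=1)]
    # C(r) = 2/(M(M-1)) * number of pairs closer than r.
    C = np.array([2.0 * np.sum(dist < r) / (M * (M - 1)) for r in radii])
    mask = C > 0
    return np.polyfit(np.log(radii[mask]), np.log(C[mask]), 1)[0]

# Sanity check on points uniformly filling the unit square,
# whose correlation dimension should be close to 2.
rng = np.random.default_rng(0)
points = rng.random((800, 2))
radii = np.logspace(-1.8, -0.8, 8)     # scaling region chosen by hand
D2 = correlation_dimension(points, radii)
```

For an attractor reconstructed from a time series, the same routine is applied to the delay vectors; the choice of the scaling region is the delicate step, exactly because of the noise-radius effect described above.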
In practical applications, typical sources of deterministic chaos are switched systems, e.g. power electronic systems. In investigations using numerical simulations they cause additional difficulties whose severity depends on the selected model of the switching elements. In the case of a model of a system with a changeable structure, it is often an ideal model of the element (zero switching time, zero resistance of the switch in the on-state and infinite resistance in the off-state). The method to eliminate a right-hand side discontinuity of the system is the numerical calculation of the switching moment t_S, then the integration according to rule (2) in the range (t; t_S^−) and setting the initial conditions x(t_S^+) = x(t_S^−) for the new integration (2) in the range (t_S^+; t + h). It is possible to eliminate the left-hand side discontinuity (e.g. closing a switch in a circuit with
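The switching-moment procedure just described can be sketched as event detection during fixed-step integration: when the switching function changes sign within a step, the moment t_S is located by bisection on the step size, and the state x(t_S^−) becomes the initial condition x(t_S^+) for the new structure. This is a generic sketch (the integration rule and the test system are illustrative, not the ones used in ChaoPhS):

```python
import math

def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def find_switch(f, g, t, x, h, tol=1e-10):
    """Integrate until the switching function g(t, x) changes sign inside a
    step, then bisect on the step size to locate the switching moment t_S."""
    while True:
        x_new = rk4_step(f, t, x, h)
        if g(t, x) * g(t + h, x_new) <= 0.0:    # sign change inside [t, t+h]
            lo, hi = 0.0, h
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if g(t, x) * g(t + mid, rk4_step(f, t, x, mid)) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            x_ts = rk4_step(f, t, x, hi)        # x(t_S^-), reused as x(t_S^+)
            return t + hi, x_ts
        t, x = t + h, x_new

# Illustrative test: dx/dt = 1 - x from x(0) = 0, switching when x reaches 0.5.
# Analytically x(t) = 1 - e^{-t}, so the switching moment is t_S = ln 2.
ts, x_ts = find_switch(lambda t, x: 1.0 - x,
                       lambda t, x: x - 0.5,
                       0.0, 0.0, 0.05)
```

After the switch is located, integration simply restarts from (t_S, x_ts) with the right-hand side of the new circuit structure.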
Fig. 5. Exemplary screenshots of the ChaoPhS program concerning the choice of numerical methods for solving the equations of the system model: c) option dialog box of simulation parameters; d) option dialog box of data analysis parameters; f) menu of the type of graphical chart describing the system; g) table with the model's time series of the state vector
Fig. 6 presents the bifurcation diagrams for the investigated test maps. All of the diagrams are identical to those obtained in other publications [1,2,10], which confirms the correctness of the calculations performed on iterative maps characterized by the formula x_{n+1} = f(x_n).

Fig. 7 shows that systems described by (1) are also correctly simulated, and that the implemented methods of solving differential equations are correct. The phase trajectories forming strange attractors are the same as those contained in [1,2,6,7,8,9,10].

To determine the largest Lyapunov exponent, the program draws a chart of the distance between trajectory points as a function ln d_j(i) = f(λ_1(t)). The exponent is calculated (using the least squares method in a selected range) on the
Fig. 9. Chart of the function ln C(r) = f(ln r) for the logistic map
basis of the slope of this function. The correlation dimension D_2 can be determined on the basis of a chart of the logarithm of the correlation integral as a function of the logarithm of the distance between neighbouring points, ln C(r) = f(ln r) (an exemplary chart is shown in Fig. 9).
To determine the Lyapunov exponent and the correlation dimension, the time series of just one state variable is sufficient, because the program independently reconstructs the attractor using the method of delayed time series [12]. Before those quantities are computed, it is necessary to input the following parameters: the embedding dimension of the attractor obtained from the time series, the time delay, the number of time-series points used in the calculations of the exponent λ_1 and the dimension D_2, and also the window size outside of which points are skipped.
Table 1 shows a comparison of the values of the largest Lyapunov exponent presented in publications [15,16,20] with the exponents calculated using ChaoPhS. It can be noticed that the compared values are similar.
Fig. 11. Frequency spectrum of the DC/DC buck converter for control gain K as parameter: K = 8.4, K = 12, K = 14.5, K = 23
Figure 10 shows the main panel of ChaoPhS with a scheme of the DC/DC converter and a graphical chart containing a voltage waveform in the steady state of the system.

Figure 11 shows the evolution of the converter's state from the steady state with the 1T-periodic orbit, through the 2T- and 4T-periodic states, up to chaotic operation of the system. One can notice the doubling stripes of the frequency spectrum and, in the final chart, the absence of a leading frequency.
For a power electronic system with periodic current and voltage waveforms whose frequency equals the frequency of an external signal, that is, the frequency of the sawtooth signal in the PWM generator, the largest Lyapunov exponent should be negative. This is caused by the lack of dissipation of the system trajectory. The function ln d_j(i) = f(λ_1(t)) for this condition must have a negative slope (Fig. 12a). For chaotic operation the slope is positive (Fig. 12b).
Fig. 12. Chart for the calculation of the largest Lyapunov exponent for: a) stable state K = 8.4; b) chaos K = 23
Fig. 13. Nonlinear phase trajectory – chaotic activity of DC/DC buck converter
Figures 13 and 14 show the phase trajectory and the bifurcation diagram of the investigated converter, which confirm the occurrence of chaotic phenomena when the control parameter (system gain) is changed.
Figure 15 represents a situation in which two attractors related to bifurcations arise during the functioning of the system. In this figure two diagrams are compared – one obtained using the presented program and one obtained using Matlab. One can notice:
Fig. 14. Bifurcation diagram of DC/DC buck converter with sawtooth carrier signal in PWM:
a) ChaoPhS, b) Matlab
the similarity of these two diagrams in the range <8, 25> of the control parameter value. The additional clouds of points on the Matlab diagram are connected with insufficient filtering of the transient states of the system, which is a result of the long computation time in comparison with the computation time of the presented program.
The functioning of a buck-type converter with a triangular carrier signal of the PWM modulation [19] was also investigated.

Figure 16 shows a panel of ChaoPhS with a diagram of the converter, a window of the phase space chart and a dialog box used to input the parameters of the model. The
Modeling of Chaotic Systems in the ChaoPhS Program 17
a)
b)
Fig. 15. a) Two attractors of DC/DC buck converter; b) enlargement of one attractor
picture of the phase spaces shows that the system is in the state of a deterministic
chaos. Even though the structure of the converter has not been changed, the region of
stable work occurs in a different range of values of the controlled parameter K .
This can be noticed when the bifurcation diagrams of the two systems, presented in Figs. 14 and 18, are compared. Additionally, Figure 18 shows how significant the initial conditions of the simulation are: for the control parameter K = 10, depending on the initial conditions, the Poincaré section can have one or three stable points.
Figure 17 shows that the structure of the attractor of the buck converter with the triangular PWM carrier signal is very complex and differs from that of the system with the sawtooth carrier signal (Fig. 15).
Fig. 16. Main panel of ChaoPhS program for DC/DC buck converter with triangle carrier PWM
signal
Fig. 17. Poincaré section for DC/DC buck converter with triangle carrier PWM signal and
K = 23
Fig. 18. Bifurcation diagram for DC/DC buck converter with triangle carrier PWM signal
Fig. 19. Charts for obtaining the largest Lyapunov exponent for: a) stable state K = 8.4; b) chaos K = 23
7 Conclusions
The paper presents a simulation program, ChaoPhS (Chaotic Phenomena Simulations), intended for investigating deterministic chaos phenomena in various systems, among others in power electronics converters. The program was written in the C++ Builder development environment with the use of object-oriented programming. Owing to the use of dynamically linked libraries for the studied systems, for the methods of solving the equations that describe them, and for the methods of analysing the chaotic phenomena that occur, the presented program can easily be developed further.
Verification of the analysis of well-known models of chaotic systems, carried out with the use of the presented program, shows agreement with results reported in the literature.
The paper also introduces several models of power electronics converters whose chaotic properties are the subject of the authors' research.
20 R. Porada and N. Mielczarek
References
[1] Ott, E.: Chaos in dynamical systems (in Polish). WNT, Warszawa (1997)
[2] Schuster, H.G.: Deterministic chaos. An introduction (in Polish). PWN, Warszawa (1995)
[3] Hamill, D.C.: Power electronics: A field rich in nonlinear dynamics. In: Nonlinear Dy-
namics of Electronic Systems, Dublin (1995)
[4] Hirsch, M.W., Smale, S.: Differential Equations, Dynamical Systems and Linear Algebra.
Academic Press, New York (1974)
[5] Banerjee, S., Ranjan, P., Grebogi, C.: Bifurcations in one-dimensional piecewise smooth maps: Theory and applications in switching circuits. IEEE Trans. on Circuits and Systems – I 47(5) (2000)
[6] Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130 (1963)
[7] Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod.
Phys. 57, 617 (1985)
[8] Rössler, O.E.: An equation for continuous chaos. Phys. Lett. A 57, 397 (1976)
[9] Rössler, O.E.: An equation for hyperchaos. Phys. Lett. A 71, 155 (1979)
[10] Hénon, M.: A two-dimensional mapping with a strange attractor. Comm. Math. Phys. 50,
69 (1976)
[11] Grassberger, P., Procaccia, I.: Characterization of strange attractors. Phys. Rev. Lett. 50,
346 (1983)
[12] Takens, F.: Lecture Notes In Math, vol. 898. Springer, Heidelberg (1981)
[13] Baron, B., Piątek, Ł.: Metody numeryczne w C++ Builder. Helion, Gliwice (2004)
[14] Baron, B.: Układ dynamiczny jako obiekt klasy C++. IC-SPETO, Gliwice-Ustroń (2005)
[15] Wolf, A., Swift, J.B., Swinney, H.L., Vastano, J.A.: Determining Lyapunov exponents
from a time series. Physica D 16, 285 (1985)
[16] Rosenstein, M.T., Collins, J.J., De Luca, C.J.: A practical method for calculating largest
Lyapunov exponents from small data sets (1992)
[17] Porada, R., Mielczarek, N.: Wstępne badania symulacyjne zachowań chaotycznych
w układach energoelektronicznych. ZKwE, Kiekrz (2004)
[18] Porada, R., Mielczarek, N.: Preliminary Analysis of Chaotic Behaviour in Power Electronics. EPNC, Poznań (2004)
[19] Porada, R., Mielczarek, N.: Badania zjawisk chaosu deterministycznego w zamkniętych
układach energoelektronicznych. In: IC-SPETO 2005, Gliwice-Ustroń (2005)
[20] Huang, P.J.: Control in Chaos (2000),
http://math.arizona.edu/~ura/001/huang.pojen/
Model of a Tribological Sensor Contacting
Rotating Disc
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 21–27.
springerlink.com
© Springer-Verlag Berlin Heidelberg 2009
22 V. Vladimirov and J. Wróbel
Re[λ±1,2] > 0 when ν < ν0,
Re[λ±1,2] < 0 when ν > ν0,

where

h1 = −3W²x±/Ω² − W³/Ω³ − f(ν0)U² + f(ν0)U³ + o(U³),
h2 = 0.
Using the well-known formula (see e.g. [1]), we conclude that the stability type of the pair of limit cycles depends entirely on the sign of f(ν0): if f(ν0) > 0, stable limit cycles appear when ν < ν0; on the contrary, for f(ν0) < 0, unstable limit cycles appear when ν > ν0.
where ϕ(ν) = aν⁴ + bν³ + cν² + dν + e.
Numerical simulations show that, depending on the sign of f(ν0), there are two types of global behavior, as illustrated in Figs. 3 and 4, while the remaining peculiarities of the function f seem to be unimportant. The global phase portraits presented here can serve as a basis for predicting the qualitative behavior of the autonomous system (2) over a broad range of values of the parameter ν.
ẋ = −y, (7)
ẏ = x − x³ − f(y + ν) [1 + ε sin(ωt)].
Fig. 3. Qualitative changes of phase portrait of system (2), case f (ν0 ) > 0
Model of a Tribological Sensor Contacting Rotating Disc 25
Fig. 4. Qualitative changes of phase portrait of system (2), case f (ν0 ) < 0
Fig. 5. Bifurcation diagrams of system (7), obtained for ε = 0.2 and increasing ν
Fig. 6. Bifurcation diagrams of system (7), obtained for ε = 0.6 and increasing ν
In what follows we assume that ε ∈ (0, 1]. Numerical experiments show that the behavior of system (7) does not differ from that of system (2) when ε ≪ 1. There are also no significant changes when ε is of the order of unity but ν > ν0 + d, i.e. in those cases when the critical points (x±, 0) of the autonomous system are stable foci. However, the behavior of system (7) changes drastically from that of (2) when ν ∈ (0, ν0 + d). Qualitative changes of the non-autonomous system, studied with the help of the Poincaré section technique [1], are shown in Figs. 5–10. They present the results of numerical simulations in which
Fig. 7. Bifurcation diagrams of system (7), obtained for ε = 1.0 and increasing ν
Fig. 8. Bifurcation diagrams of system (7), obtained for ε = 0.2 and decreasing ν
Fig. 9. Bifurcation diagrams of system (7), obtained for ε = 0.6 and decreasing ν
Fig. 10. Bifurcation diagrams of system (7), obtained for ε = 1.0 and decreasing ν
the driving parameter ν either grows or decreases. All figures present the results of simulations for the case f(ν0) < 0.
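The stroboscopic Poincaré-section technique used for such diagrams can be sketched as follows. Because the friction function f is not specified in this excerpt, the sketch uses the closely related twin-well Duffing oscillator ẏ = x − x³ − δy + γ cos(ωt) as a generic stand-in, sampling the state once per forcing period:

```python
import math

# Stroboscopic Poincare section of a periodically forced oscillator.
# Stand-in model (not system (7) itself): the twin-well Duffing oscillator
#   x' = y,  y' = x - x**3 - delta*y + gamma*cos(omega*t),
# sampled once per forcing period T = 2*pi/omega after a transient.
def duffing_poincare(delta=0.25, gamma=0.3, omega=1.0,
                     n_transient=200, n_points=400, steps_per_period=400):
    period = 2.0 * math.pi / omega
    dt = period / steps_per_period

    def deriv(t, x, y):
        return y, x - x**3 - delta * y + gamma * math.cos(omega * t)

    def rk4_step(t, x, y):
        k1x, k1y = deriv(t, x, y)
        k2x, k2y = deriv(t + dt/2, x + dt/2 * k1x, y + dt/2 * k1y)
        k3x, k3y = deriv(t + dt/2, x + dt/2 * k2x, y + dt/2 * k2y)
        k4x, k4y = deriv(t + dt, x + dt * k3x, y + dt * k3y)
        return (x + dt/6 * (k1x + 2*k2x + 2*k3x + k4x),
                y + dt/6 * (k1y + 2*k2y + 2*k3y + k4y))

    t, x, y = 0.0, 1.0, 0.0
    section = []
    for n in range(n_transient + n_points):
        for _ in range(steps_per_period):
            x, y = rk4_step(t, x, y)
            t += dt
        if n >= n_transient:           # discard transient periods
            section.append((x, y))     # one stroboscopic point per period
    return section
```

Sweeping a parameter and plotting one coordinate of the collected section points against it produces exactly the kind of bifurcation diagram shown in Figs. 5–10.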
2 Concluding Remarks
A brief presentation of the global analysis of equation (1) shows that even the autonomous case exhibits very rich behavior within the parameter range ν ∈ (0, ν0 + d] for some d > 0. The qualitative features of the phase trajectories depend merely on the sign of f(ν0) and seem not to be sensitive to the other details of the modelling function f, which represents the Coulomb-type friction.
The variety of solutions becomes much richer when the term describing the vertical oscillation is incorporated. Analyzing the qualitative features of the solutions, one can see that they become more and more complicated, depending on the value of the parameter ε. As this parameter grows, system (7) demonstrates periodic, quasiperiodic and multiperiodic regimes, period-doubling cascades and, probably, chaotic oscillations. Note that this is the case only for ν ∈ (0, ν0 + d], because for sufficiently large values of the velocity, lying beyond this interval, all motions in the system become asymptotically stable and tend, depending on the initial values, to either (x+, 0) or (x−, 0).
Acknowledgements
The authors are very indebted to Dr T. Habdank-Wojewódzki for acquainting them with his experimental results and for valuable suggestions.
Reference
[1] Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems and Bi-
furcations of Vector Fields. Springer, New York (1987)
The Bifurcations and Chaotic Oscillations in Electric
Circuits with Arc
V. Sydorets
Abstract. Autonomous electric circuits with an arc, governed by three ordinary differential equations, are investigated. Under variation of two parameters we observed many kinds of bifurcations and both periodic and chaotic behaviors of this system. The bifurcation diagrams were constructed and studied in detail, and routes to chaos were classified. Three basic patterns of bifurcation diagrams were observed, possessing the properties of (i) softness and reversibility; (ii) stiffness and irreversibility; (iii) stiffness and reversibility.
1 Introduction
In recent years, investigations of nonlinear dissipative dynamical systems have developed rapidly. Fundamental results have been obtained, one of which is the discovery of deterministic chaos in various mechanical, physical, chemical, biological, and ecological systems. The same phenomena have been found in electrical engineering; they were studied in detail by L. Chua [1] and V. Anishchenko [2].
A classical nonlinearity – the electric arc in electric circuits – remains insufficiently researched. The author has tried to make up for this deficiency, the more so since the mathematical model of the dynamical electric arc was proposed by I. Pentegov and, jointly with the author, improved and used in many applications [3].
As preliminary investigations show [4], the emergence of deterministic chaos is possible in an electric circuit with an arc.
The bifurcation phenomenon is of cardinal importance in nonlinear systems. Under variation of two parameters, many kinds of bifurcations and periodic and chaotic regimes can be observed:
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 29 – 42.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
When the static volt-ampere characteristic of the arc is falling, two fixed points are present, whose coordinates may be found by setting system (2) equal to zero.
A single condition which may be obtained analytically is the condition of Hopf bifurcation [11]. For this case we carry out a linearization of system (2) around the point for which the Kaufman condition holds true.
One of the Hopf bifurcation conditions coincides with the condition that the real part of a pair of complex roots of the characteristic polynomial equals zero.
The basic distinction of the Hopf bifurcation in the considered circuit is that this bifurcation may be supercritical as well as subcritical. Local instability may thus arise either through the separation of a stable limit cycle from the focus or through the merging of the focus with an unstable limit cycle.
The curve of the Hopf bifurcation (see Fig. 2) in the parameter plane (L, C), which is defined by formulae (3), has a minimum. It turned out that from the side of small L up to the minimum (for R = 15, Lm = 2.7924741181414) the bifurcation is supercritical, while past the minimum it is subcritical. A curve of twin-cycle (tangent) bifurcation joins at the point where the kind of Hopf bifurcation changes; its location was determined more exactly.

Fig. 2. Hopf bifurcation curves in the parameter plane (L, C) for R = 0.5, 1.5, 5, 15, 50, and ∞

Fig. 3. Oscillations with period 1T as a result of the Hopf bifurcation: projection of the phase portrait onto the plane (x, z) for R = 15, L = 1, C = 2.7. The fixed point is (1, 1, 1).

The twin-cycle bifurcation curve lies under the Hopf bifurcation curve, so under variation of the parameter C the system develops according to different scenarios, depending on the value of the fixed parameter L.
For instance, consider the case R = 15. At small L (L < Lm), rising C produces the Hopf bifurcation with the advent of a stable limit cycle (Fig. 3). Further rising of C leads to a period-doubling bifurcation: the period-1 limit cycle becomes unstable and a stable period-2 cycle appears (Fig. 4); self-oscillations at half the frequency settle in the system. Then a period-doubling bifurcation cascade follows, as a result of which cycles of period 4, 8, 16, etc. appear (Figs. 5–7).
Figs. 4–11. Phase-portrait projections onto the plane (x, z) for R = 15, L = 1, at C = 2.9, 3.025, 3.035, 3.0385, 3.19, 3.088, 3.13 and 3.2
oscillations with period 2 or chaotic oscillations may appear stiffly. From the bifurcation diagram one can see that, depending on the initial conditions, the transient process tends to different attractors: either to a limit cycle or to a strange attractor. The attracting zones are separated by an unstable limit cycle.
4 Bifurcation Diagrams
For a more detailed study of the scenarios of chaos development, many researchers employ the technique of constructing a single-parameter bifurcation diagram. The values of the varied parameter are put on the abscissa axis, and one of the coordinates of the Poincaré section points on the ordinate axis. As the section plane, the half-plane
x² − z = 0, x > 1, (4)
is chosen. Judging by the third equation of system (2), the Poincaré section points will be the oscillation maxima of the variable z.
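Since system (2) is not reproduced in this excerpt, the construction of such a diagram can be illustrated on the well-known Rössler system, recording local maxima of one coordinate as the Poincaré section points:

```python
# Bifurcation-diagram data via Poincare section points taken as local maxima.
# Illustrative stand-in: the Rossler system x'=-y-z, y'=x+a*y, z'=b+z*(x-c),
# not the arc circuit of system (2). For each value of the swept parameter c,
# the local maxima of x after a transient are the section points.
def rossler_maxima(c, a=0.2, b=0.2, dt=0.01, t_transient=300.0, t_collect=300.0):
    def deriv(x, y, z):
        return (-y - z, x + a * y, b + z * (x - c))

    def rk4(x, y, z):
        k1 = deriv(x, y, z)
        k2 = deriv(x + dt/2*k1[0], y + dt/2*k1[1], z + dt/2*k1[2])
        k3 = deriv(x + dt/2*k2[0], y + dt/2*k2[1], z + dt/2*k2[2])
        k4 = deriv(x + dt*k3[0], y + dt*k3[1], z + dt*k3[2])
        return (x + dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
                y + dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]),
                z + dt/6*(k1[2] + 2*k2[2] + 2*k3[2] + k4[2]))

    x, y, z = 1.0, 1.0, 0.0
    for _ in range(int(t_transient / dt)):   # discard the transient
        x, y, z = rk4(x, y, z)
    xs = []
    for _ in range(int(t_collect / dt)):
        x, y, z = rk4(x, y, z)
        xs.append(x)
    # local maxima of x are the Poincare section points
    return [xs[i] for i in range(1, len(xs) - 1) if xs[i-1] < xs[i] > xs[i+1]]
```

Plotting the returned maxima against the swept parameter gives the bifurcation diagram: a periodic regime contributes a few repeated points, a chaotic regime a dense band.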
Figure 12 shows the bifurcation diagram for L = 1 with the parameter C varying from 2.8 to 3.4. All stages of the scenario described above are clearly visible on it, as are the periodic windows in chaos. The rise of periodic oscillations within the chaotic region may be considered a self-organization process; the cause and mechanism of its appearance are therefore of interest.
For instance, at L = 1 the evolution of the strange attractor is clearly visible. At first the chaotic state spreads among neighboring orbits of the periodic oscillations, and the strange attractor has a strip structure. Narrow strips join into wider ones as a result of a "reverse" period-doubling bifurcation cascade, i.e. in the order 2^k, 2^(k−1), ..., 16, 8, 4, 2, 1. After the last "reverse" period-doubling bifurcation the strange attractor densely covers a part of the phase space and has the structure of a so-called screw strange attractor.
In Fig. 12, periodicity windows with period 5 (C = 3.088–3.090), 3 (C = 3.123–3.145, a wide window), 4 (C = 3.200–3.210), 3 (C = 3.3355), and 1 (C = 3.3800) are marked. The wide window with period 3 is present in almost all bifurcation diagrams containing a regime of developed chaos (a fact noted in [12]). It begins with a stiff destruction of chaos and ends with a period-doubling cascade (i.e. 3, 6, 12, ..., 3·2^k).
However, on the bifurcation diagram at L = 0.3 (Fig. 13), for example, two windows with period 3 are observed. The development scenario for the first window coincides with the one described above, but the scenario for the second window is reversed.
The periodic window with period 2·3 (C = 4.1204–4.1586) on the bifurcation diagram at L = 0.2 (Fig. 14) both appears stiffly and is destroyed stiffly. Of particular interest is the window with period 2·2 (C = 4.204–4.205), since two attractors coexist in it and, depending on the initial conditions, either of them can be realized.
Analysis and comparison of the scenarios described above with well-known approaches show that they coincide with the Feigenbaum scenario, especially in the pre-chaotic regimes (the period-doubling bifurcation cascade). Distinctions are observed in the chaotic regimes. There are parameter values (for instance L = 0.1 in Fig. 15) at which the number of period-doubling bifurcations is limited and chaotic regimes do not arise.
At the value L = 0.11 an infinite period-doubling bifurcation cascade occurs. However, it adjoins another cascade occurring in the reverse direction, through which the chaotic oscillations pass over into periodic ones (see Fig. 16).
Pattern (i) is the well-known and widespread one of period-doubling bifurcations. It can start either with a supercritical Hopf bifurcation (as, for instance, in the studied system at small values of the parameter L) or with a period-doubling bifurcation when the structure is embedded (for instance, every subsequent branch of the bifurcation trees in Figs. 12–16 is similar to the previous one).

(Schematic: pattern (i); pattern (ii) with hysteresis and an isolated region; pattern (iii) with metastable chaos and crisis.)

The property of softness means that all periodic oscillations arising at period-doubling bifurcations appear with zero amplitude, and at the accumulation point of period-doubling bifurcations – although the transition to chaos is considered to happen there – the power of the chaotic component equals zero. On reverse changing of the bifurcation parameter, the processes occur in reverse order. The prolongation of pattern (i) into the chaotic region is a cascade of so-called "reverse" bifurcations, at which narrow chaotic strips join to form wider strips. "Reverse" bifurcations also possess the properties of softness and reversibility.
Pattern (ii) differs from pattern (i) in that, for certain values of the bifurcation parameter, the system is bistable and two attractors (stable motions) coexist in it. A repeller (unstable motion), which forms the boundary of the attractor basins, is located between them. The system's motion coincides with one of the attractors, while the development of the other attractor happens imperceptibly. At the edges of the bistable zone the repeller merges with one of the attractors, and their mutual destruction becomes apparent as a jump to the remaining attractor. This phenomenon is known as hysteresis. It must be emphasized that the jumps at different edges happen in different directions.
On increasing the bifurcation parameter (at L > Lm), chaos appears stiffly in the system as a result of a period-doubling bifurcation. In other cases a stiff appearance (an appearance with nonzero amplitude) of periodic oscillations is possible.
On further raising the bifurcation parameter, the development of chaos in patterns (ii) and (i) coincides. However, if the bifurcation parameter then begins to fall, the irreversibility of pattern (ii) shows itself: the process follows another path, and regimes which did not appear while the parameter was rising now appear. The cascade of period-doubling bifurcations is observed in reverse order. The last bifurcation, at which the attractor disappears, is the tangent bifurcation of the stable and unstable cycles.
It should be noted that although in pattern (ii) not all regimes become apparent simultaneously, they can in principle be revealed by an ordinary physical experiment with rising and falling bifurcation parameter.
Pattern (iii) outwardly resembles pattern (ii) but has essential distinctions. The boundary of the strange attractor intersects the repeller, and the basins of the two attractors overlap. This phenomenon is called a crisis of the strange attractor. A chaotic attractor undergoing such a crisis, which takes place in the competition of the two attractors, loses its attracting properties.
A jump to the periodic (more stable) attractor occurs, and a zone of so-called metastable chaos appears. The attracting properties are restored only when the repeller disappears by merging with the periodic attractor as a result of a tangent bifurcation, which coincides with the second crisis.
An ordinary physical experiment in the presence of pattern (iii) looks as follows. If the bifurcation parameter rises, the oscillations in the system coincide with the periodic attractor, and at the tangent bifurcation point developed chaotic oscillations appear stiffly. On decreasing the bifurcation parameter, at the crisis point the developed chaotic oscillations stiffly become periodic. Thus the development of chaos is stiff and reversible.
Special attention should be paid to the fact that pattern (iii) has regimes which cannot be revealed by an ordinary physical experiment; they may therefore be called "isolated" regimes. They are limited on one side by the tangent bifurcation and on the other by the crisis of the strange attractor.
Why is it impossible to reveal them? Because the jumps occurring in the system are directed to the same side (unlike hysteresis, where the jumps are directed to different sides), so it is not possible to reach this regime region in a natural way. It can be done either by a special physical experiment, in which the initial conditions can be preset or the parameter value changed very fast, or by modifying the studied system through superposition of pulses. In nature these regimes can become apparent as a result of some extreme (extraordinary) events. However, even if isolated regimes occur in the system, any change of a parameter leads to a transition into the region of simple regimes.
Although isolated regimes are a rather exotic phenomenon, they are important from the viewpoint of studying the properties of the chaotic oscillations that appear in pattern (iii). It turns out that the properties of chaos in this case are determined by the cascade of period-doubling bifurcations occurring in the isolated region, because it is the same attractor: although in the metastable-chaos region the attractor loses its attracting properties, its development continues.
Knowledge of the properties of isolated regimes helps to reveal them on the bifurcation diagram of the studied system (see the enlarged insets in Figs. 14–16).
6 Quantitative Estimations
Feigenbaum [2] determined that the cascade of period-doubling bifurcations possesses not only qualitative but also quantitative universal properties. It turned out that the parameter values at successive period doublings form a geometric series whose ratio δ is a universal value, i.e. a value independent of the kind of nonlinear system.
For the studied system, δ = 4.669220751009 was obtained already at the bifurcation to the period-64 cycle, which confirms the universality of δ: the value contains five correct significant digits.
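Feigenbaum's constant can be reproduced numerically from any period-doubling system. A sketch using the logistic map (not the arc circuit, whose equations (2) are not reproduced here): successive doubling points r_n are located by bisection on a period-closure test, and δ is estimated as the ratio of successive parameter intervals:

```python
# Estimate the Feigenbaum constant delta from the logistic map
# x -> r*x*(1-x). Illustrative sketch only; the arc-circuit system (2)
# is not reproduced in this excerpt.

def logistic_tail(r, transient=50_000, keep=16):
    x = 0.5
    for _ in range(transient):          # settle onto the attractor
        x = r * x * (1.0 - x)
    tail = []
    for _ in range(keep):
        tail.append(x)
        x = r * x * (1.0 - x)
    return tail

def closes_with_period(r, p, eps=1e-6):
    # True if the attractor repeats with period p (holds below the doubling point)
    tail = logistic_tail(r, keep=4 * p)
    return all(abs(tail[i + p] - tail[i]) < eps for i in range(len(tail) - p))

def doubling_point(p, lo, hi, iters=40):
    # bisection: period-p closure holds below the period-doubling point
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if closes_with_period(mid, p):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r1 = 3.0                                # exact: period 1 -> 2
r2 = doubling_point(2, 3.2, 3.5)        # period 2 -> 4 (near 3.449490)
r3 = doubling_point(4, 3.46, 3.55)      # period 4 -> 8 (near 3.544090)
delta = (r2 - r1) / (r3 - r2)           # first approximation to 4.6692...
```

Ratios of later doubling intervals converge toward the universal value δ = 4.6692...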
7 Conclusions
Electric circuits with an arc possess an abundance of periodic and chaotic behaviour. Investigation of these circuits may be useful because their properties are universal and can be applied to other nonlinear dynamical systems.
References
1. Syuan, W.: Family of Chua’s circuits. Trans. IEEE. 75(8), 55–65 (1987)
2. Anishchenko, V.S.: Complicated Oscillations in Simple Systems, 312 p. Nauka, Moscow (1990) (in Russian)
3. Pentegov, I.V., Sidorets, V.N.: Energy parameters in mathematical model of dynamical
welding arc. Automaticheskaya svarka 11, 36–40 (1988) (in Russian)
4. Sidorets, V.N., Pentegov, I.V.: Chaotic oscillations in RLC circuit with electric arc. Doklady AN Ukrainy 10, 87–90 (1992) (in Russian)
5. Sidorets, V.N., Pentegov, I.V.: Appearance and structure of strange attractor in RLC
circuit with electric arc. Technicheskaya electrodynamica 2, 28–32 (1993) (in Russian)
6. Sidorets, V.N., Pentegov, I.V.: Deterministic chaos development scenarios in electric
circuit with arc. Ukrainian physical journal 39(11-12), 1080–1083 (1994) (in Ukrainian)
7. Sidorets, V.N.: Structures of bifurcation diagrams for electric circuit with arc. Technichna
electrodynamica 6, 15–18 (1998)
8. Vladimirov, V.A., Sidorets, V.N.: On the Peculiarities of Stochastic Invariant Solutions of
a Hydrodynamic System Accounting for Non-local Effects. Symmetry in Nonlinear
Mathematical Physics 2, 409–417 (1997)
9. Vladimirov, V.A., Sidorets, V.N.: On Stochastic Self Oscillation Solutions of Nonlinear
Hydrodynamic Model of Continuum Accounting for Relaxation Effects. Dopovidi
Nacionalnoyi akademiyi nauk Ukrayiny 2, 126–131 (1999) (in Russian)
10. Vladimirov, V.A., Sidorets, V.N., Skurativskii, S.I.: Complicated Travelling Wave
Solutions of a Modelling System Describing Media with Memory and Spatial Nonlocality.
Reports on Mathematical Physics 41(1/2), 275–282 (1999)
11. Sidorets, V.N.: Feature of analyses eigenvalues of mathematical models of nonlinear
electrical circuits. Electronnoe modelirovanie 20(5), 60–71 (1998) (in Russian)
12. Li, T., Yorke, J.A.: Period Three Implies Chaos. American Math. Monthly 82, 985–991 (1975)
Soft Computing Models for Intelligent Control of
Non-linear Dynamical Systems
Abstract. We describe in this paper the application of soft computing techniques to controlling
non-linear dynamical systems in real-world problems. Soft computing consists of fuzzy logic,
neural networks, evolutionary computation, and chaos theory. Controlling real-world non-linear
dynamical systems may require the use of several soft computing techniques to achieve the
desired performance in practice. For this reason, several hybrid intelligent architectures have
been developed. The basic idea of these hybrid architectures is to combine the advantages of
each of the techniques involved in the intelligent system. Also, non-linear dynamical systems
are difficult to control due to the unstable and even chaotic behaviors that may occur in these
systems. The described applications include robotics, aircraft systems, biochemical reactors,
and manufacturing of batteries.
1 Introduction
We describe in this paper the application of soft computing techniques and fractal
theory to the control of non-linear dynamical systems [8]. Soft computing consists of
fuzzy logic, neural networks, evolutionary computation, and chaos theory [23]. Each
of these techniques has been applied successfully to real world problems. However,
there are applications in which one of these techniques is not sufficient to achieve the
level of accuracy and efficiency needed in practice. For this reason, it is necessary to
combine several of these techniques to take advantage of the power that each
technique offers. We describe several hybrid architectures that combine different soft
computing techniques. We also describe the development of hybrid intelligent
systems combining several of these techniques to achieve better performance in
controlling real dynamical systems. We illustrate these ideas with applications to
robotic systems, aircraft systems, biochemical reactors, and manufacturing systems.
Each of these problems has its own characteristics, but all of them share in common
their non-linear dynamic behavior. For this reason, the use of soft computing
techniques is completely justified. In all of these applications, the results of using soft
computing techniques have been better than with traditional techniques.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 43 – 70.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
44 O. Castillo and P. Melin
(Figure: layered feedforward network with input units 1, 2, ..., i, ..., p+1, hidden units 1, ..., j, ..., q, q+1, and one output unit.)
In the neural network we will be using, the input layer has p+1 processing elements, i.e., one for each predictor variable plus a processing element for the bias.
The bias element always has an input of one, Xp+1=1. Each processing element in the
input layer sends signals Xi (i=1,…,p+1) to each of the q processing elements in the
hidden layer. The q processing elements in the hidden layer (indexed by j=1,…,q)
produce an “activation” aj=F(ΣwijXi) where wij are the weights associated with the
connections between the p+1 processing elements of the input layer and the jth
processing element of the hidden layer. Once again, processing element q+1 of the
hidden layer is a bias element and always has an activation of one, i.e. aq+1=1.
Assuming that the processing element in the output layer is linear, the network model
will be
(1)
Here π_i are the weights for the connections between the input layer and the output
layer, and θj are the weights for the connections between the hidden layer and the
output layer. The main requirement to be satisfied by the activation function F(.) is
that it be nonlinear and differentiable. Typical functions used are the sigmoid,
hyperbolic tangent, and the sine functions, i.e.:
Soft Computing Models for Intelligent Control of Non-linear Dynamical Systems 45
(2)
The weights in the neural network can be adjusted to minimize some criterion such
as the sum of squared error (SSE) function:
(3)
Thus, the weights in the neural network are similar to the regression coefficients in
a linear regression model. In fact, if the hidden layer is eliminated, (1) reduces to the
well-known linear regression function. It has been shown [13, 24] that, given
sufficiently many hidden units, (1) is capable of approximating any measurable
function to any accuracy. In fact F(.) can be an arbitrary sigmoid function without any
loss of flexibility.
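The single-hidden-layer model described above can be sketched in a few lines. This is a minimal illustration (not the authors' code), with tanh as the activation F and the bias conventions X_{p+1} = 1 and a_{q+1} = 1:

```python
import numpy as np

# Minimal sketch of the single-hidden-layer network model described above
# (not the authors' code). tanh stands in for the activation F; biases are
# handled by appending a constant-1 column on the input and hidden layers.
def forward(X, W, theta, activation=np.tanh):
    Xb = np.hstack([X, np.ones((len(X), 1))])      # bias input X_{p+1} = 1
    A = activation(Xb @ W)                          # a_j = F(sum_i w_ij X_i)
    Ab = np.hstack([A, np.ones((len(A), 1))])       # bias unit a_{q+1} = 1
    return Ab @ theta                               # linear output element

def sse(y, y_hat):
    # sum-of-squared-errors criterion, as in (3)
    return float(np.sum((y - y_hat) ** 2))
```

With the hidden layer removed (A empty), the model collapses to the linear regression case noted in the text.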
The most popular algorithm for training feedforward neural networks is the
backpropagation algorithm. As the name suggests, the error computed from the output
layer is backpropagated through the network, and the weights are modified according to
their contribution to the error function. Essentially, backpropagation performs a local
gradient search, and hence its implementation does not guarantee reaching a global
minimum. A number of heuristics are available to partly address this problem, some of
which are presented below. Instead of distinguishing between the weights of the
different layers as in Equation (1), we refer to them generically as wij in the following.
After some mathematical simplification the weight change equation suggested by
back-propagation can be expressed as follows:
(4)
Here, η is the learning coefficient and θ is the momentum term. One heuristic that
is used to prevent the neural network from getting stuck at a local minimum is the
random presentation of the training data. Another heuristic that can speed up
convergence is the cumulative update of weights, i.e., weights are not updated after
the presentation of each input-output pair, but are accumulated until a certain number
of presentations are made, this number referred to as an “epoch”. In the absence of the
second term in (4), setting a low learning coefficient results in slow learning, whereas
a high learning coefficient can produce divergent behavior. The second term in (4)
reinforces general trends, whereas oscillatory behavior is canceled out, thus allowing
a low learning coefficient but faster learning. Last, it is suggested that starting the
training with a large learning coefficient and letting its value decay as training
progresses speeds up convergence.
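A batch backpropagation loop with the momentum term and cumulative ("epoch") weight updates described above might look as follows. This is an illustrative sketch, not the authors' implementation; it uses one hidden tanh layer, a linear output, and is exercised here on the XOR problem:

```python
import numpy as np

# Batch backpropagation with a momentum term and cumulative ("epoch")
# weight updates. Illustrative sketch only (not the authors' code).
def train_backprop(X, y, q=4, eta=0.01, momentum=0.9, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])             # bias input X_{p+1}=1
    W = rng.normal(scale=0.5, size=(X.shape[1] + 1, q))   # input -> hidden
    theta = rng.normal(scale=0.5, size=q + 1)             # hidden -> output
    dW_prev = np.zeros_like(W)
    dth_prev = np.zeros_like(theta)
    history = []
    for _ in range(epochs):
        A = np.tanh(Xb @ W)                        # hidden activations a_j
        Ab = np.hstack([A, np.ones((len(A), 1))])  # bias unit a_{q+1} = 1
        err = Ab @ theta - y                       # linear output minus target
        history.append(float(err @ err))           # SSE criterion
        g_theta = Ab.T @ err                       # gradient wrt theta
        g_W = Xb.T @ (np.outer(err, theta[:q]) * (1.0 - A**2))  # backpropagated
        dth = -eta * g_theta + momentum * dth_prev # update with momentum term
        dW = -eta * g_W + momentum * dW_prev
        theta = theta + dth
        W = W + dW
        dth_prev, dW_prev = dth, dW
    return W, theta, history
```

The momentum term reinforces consistent gradient directions while damping oscillatory components, allowing faster learning at a low learning coefficient, exactly as described in the text.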
The method of steepest descent, also known as the gradient method, is one of the oldest techniques for minimizing a given function defined on a multidimensional space. This method forms the basis for many optimization techniques. In Newton-type methods, the descent direction is determined from the second derivatives of the objective function E; the matrix of second derivatives is damped as
(6)
where I is the identity matrix and λ is some nonnegative value. Depending on the magnitude of λ, the method transits smoothly between the two extremes: Newton's method (λ → 0) and the well-known steepest descent method (λ → ∞). The various Levenberg-Marquardt algorithms differ in the selection of λ. Goldfeld et al. computed the eigenvalues of H and set λ a little larger than the magnitude of the most negative eigenvalue.
Moreover, when λ increases, || θnext - θnow || decreases. In other words, λ plays the
same role as an adjustable step length: with an appropriately large λ, the
step length will be the right one. Of course, a step size η can also be introduced
and determined in conjunction with line-search methods:
θnext = θnow - η(H + λI)^-1 g (7)
For the case of neural networks these ideas are used to update (or learn) the
weights of the network [8].
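A minimal sketch of the update in (6)-(7), assuming H is the Hessian and g the gradient of E; the quadratic test objective and its minimum are hypothetical, chosen only to exercise the step.

```python
import numpy as np

def lm_step(theta, grad, hessian, lam):
    """Levenberg-Marquardt step: theta_next = theta_now - (H + lam*I)^-1 g."""
    return theta - np.linalg.solve(hessian + lam * np.eye(len(theta)), grad)

# Hypothetical quadratic objective E = (t0 - 1)^2 + 10*(t1 + 2)^2, minimum at (1, -2).
theta = np.zeros(2)
H = np.diag([2.0, 20.0])                    # constant Hessian of this E
for _ in range(20):
    g = np.array([2 * (theta[0] - 1.0), 20 * (theta[1] + 2.0)])
    theta = lm_step(theta, g, H, lam=1e-3)  # small lam: close to Newton's method
```

A large `lam` instead makes each step a short move along -g, i.e. the steepest-descent extreme described in the text.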
d = lim(r→0) [log N(r) / log(1/r)] (8)
where N(r) is the number of boxes covering the object and r is the size of the box. An
approximation to the fractal dimension can be obtained by counting the number of
boxes covering the boundary of the object for different r sizes and then performing a
logarithmic regression to obtain d (box counting algorithm). In Figure 2, we illustrate
the box counting algorithm for a hypothetical curve C. Counting the number of boxes
Soft Computing Models for Intelligent Control of Non-linear Dynamical Systems 47
for different sizes of r and performing a logarithmic linear regression, we can estimate
the box dimension of a geometrical object with the following equation:
log N(r) = log c - d log r (9)

so that d is obtained as minus the slope of the regression line. This algorithm is
illustrated in Figure 3.
The fractal dimension can be used to characterize an arbitrary object. The reason
for this is that the fractal dimension measures the geometrical complexity of objects.
In this case, a time series can be classified by using the numeric value of the fractal
dimension (d lies between 1 and 2 because the object lies in the xy plane). The reasoning
behind this classification scheme is that when the boundary is smooth the fractal
dimension of the object will be close to one, whereas when the boundary is
rougher the fractal dimension will be close to two.
We developed a computer program in MATLAB for calculating the fractal
dimension of a sound signal. The computer program uses as input the figure of the
signal and counts the number of boxes covering the object for different grid sizes.
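The box-counting procedure described above can be sketched as follows. The grid anchoring, box sizes and test curve are illustrative choices, not the authors' MATLAB program.

```python
import numpy as np

def box_count_dimension(points, sizes):
    """Estimate the box dimension d by regressing log N(r) on log(1/r).

    points: (n, 2) array of points densely sampled on the object's boundary.
    sizes:  iterable of box side lengths r.
    """
    logs = []
    for r in sizes:
        # Count occupied boxes of side r on a grid anchored at the origin.
        boxes = {tuple(b) for b in np.floor(points / r).astype(int)}
        logs.append((np.log(1.0 / r), np.log(len(boxes))))
    x, y = zip(*logs)
    d, _ = np.polyfit(x, y, 1)  # slope of the regression line estimates d
    return d

# A straight line segment should have dimension close to 1.
t = np.linspace(0.0, 1.0, 20000)
line = np.column_stack([t, 0.5 * t])
d = box_count_dimension(line, sizes=[0.1, 0.05, 0.02, 0.01, 0.005])
```

For the smooth test curve the estimate lands near 1, matching the classification rule above; a rough boundary pushes it toward 2.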
Controlling robotic dynamic systems is a difficult problem because their dynamics are highly non-linear [5]. We describe an intelligent system for controlling robot manipulators to
illustrate our neuro-fuzzy-fractal approach for adaptive control. We use a new fuzzy
inference system for reasoning with multiple differential equations for modelling based
on the relevant parameters for the problem [6]. In this case, the fractal dimension [14]
of a time series of measured values of the variables is used as a parameter for the fuzzy
system. We use neural networks for identification and control of robotic dynamic
systems [4, 21]. The neural networks are trained with the Levenberg-Marquardt
learning algorithm with real data to achieve the desired level of performance.
Combining a fuzzy rule base [32] for modelling with the neural networks for
identification and control, an intelligent system for adaptive model-based control of
robotic dynamic systems was developed. We have very good simulation results for
several types of robotic systems for different conditions. The new method for control
combines the advantages of fuzzy logic (use of expert knowledge) with the advantages
of neural networks (learning and adaptability), and the advantages of the fractal
dimension (pattern classification) to achieve the goal of robust adaptive control of
robotic dynamic systems.
The neuro-fuzzy-fractal approach described above can also be applied to the case
of controlling biochemical reactors [21]. In this case, we use mathematical models of
the reactors to achieve adaptive model-based control. We also use a fuzzy inference
system for differential equations to take into consideration several models of the
biochemical reactor. The neural networks are used for identification and control. The
fractal dimension of the bacteria used in the reactor is also an important parameter in
the fuzzy rules to take into account the complexity of the biochemical process. We have
very good results for several food production processes in which the biochemical
reactor is controlled to optimize the production.
We have also used our hybrid approach for the case of controlling chaotic and
unstable behavior in aircraft dynamic systems [22]. For this case, we use
mathematical models for the simulation of aircraft dynamics during flight. The goal of
constructing these models is to capture the dynamics of the aircraft, so as to have a
way of controlling this dynamics to avoid dangerous behavior of the system. Chaotic
behavior has been related to the flutter effect that occurs in real airplanes, and for this
reason has to be avoided during flight. The prediction of chaotic behavior can be done
using the mathematical models of the dynamical system. We use a fuzzy inference
system combining multiple differential equations for modelling complex aircraft
dynamic systems. On the other hand, we use neural networks trained with the
Levenberg-Marquardt algorithm for control and identification of the dynamic
systems. The proposed adaptive controller performs rather well considering the
complexity of the domain.
We also describe in this paper several hybrid approaches for controlling
electrochemical processes in manufacturing applications. The hybrid approaches
combine soft computing techniques to achieve the goal of controlling the
manufacturing process to follow a desired production plan. Electrochemical processes,
like the ones used in battery formation, are very complex and for this reason very
difficult to control. Also, mathematical models of electrochemical processes are
difficult to derive and they are not very accurate. We need adaptive control of the
electrochemical process to achieve on-line control of the production line. Of course,
adaptive control is easier to achieve if one uses a reference model of the process
[21, 22]. In this case, we use a neural network to model the electrochemical process
due to the difficulty in obtaining a good mathematical model for the problem. The
other part of the problem is how to control the non-linear electrochemical process in
the desired way to achieve the production with the required quality. We developed a
set of fuzzy rules using expert knowledge for controlling the manufacturing process.
The membership functions for the linguistic variables in the rules were tuned using a
specific genetic algorithm. The genetic algorithm was used for searching the parameter
space of the membership functions using real data from production lines. Our
particular neuro-fuzzy-genetic approach has been implemented as an intelligent system
to control the formation of batteries in a real plant with very good results.
questionable if the interaction forces among the various joints are severe (non-linear).
This is the main reason why soft computing techniques [7] have been proposed to
control this type of dynamic systems.
Adaptive fuzzy control is an extension of fuzzy control theory that allows the fuzzy
controller to adjust itself, extending its applicability either to a wider class of
uncertain systems or to fine-tuning the parameters of a system to a desired accuracy [9].
In this scheme, a fuzzy controller is designed based on knowledge of a dynamic system.
This fuzzy controller is characterized by a set of parameters, which are either the
controller constants or functions of a model's constants.
A controller is designed based on an assumed mathematical model representing a
real system. It must be understood that the mathematical model does not completely
match the real system to be controlled. Rather, the mathematical model is seen as an
approximation of the real system. A controller designed based on this model is
assumed to work effectively with the real system if the error between the actual system
and its mathematical representation is relatively insignificant. However, there exists a
threshold constant that sets a boundary for the effectiveness of a controller. An error
above this threshold will render the controller ineffective toward the real system.
An adaptive controller is set up to take advantage of additional data collected at run
time for better effectiveness. At run time, data are collected periodically at the
beginning of each constant time interval, tn = tn-1 + Δt, where Δt is a constant
time step and [tn-1, tn) is the duration between data collections. Let Dn be the
set of data collected at time t = tn. It is assumed that at any particular time, t = tn, a
history of data {D0, D1, ..., Dn} is available. The more data available, the more
accurate the approximation of the system becomes.
At run time, the control input is fed into both the real system and the mathematical
model representing it. The output of the real system and the output of the
mathematical model are collected, and an error representing the difference between
these two outputs is calculated. Let x(t) be the output of the real system, and y(t) the
output of the mathematical model. The error ε(t) is defined as:

ε(t) = x(t) - y(t). (12)
Figure 4 depicts this tracking of the difference between the mathematical model
and the real dynamic system it represents.
Fig. 4. Tracking the error function between outputs of a real system and mathematical model
An adaptive controller will be adjusted based on the error function ε(t). This
calculated data will be fed into either the mathematical model or the controller for
adjustment. Since the error function ε(t) is available only at run time, an adjusting
mechanism must be designed to accept this error as it becomes available, i.e., it must
evolve with the accumulation of data in time. At any time, t = tn, the set of calculated
data in the form of a time series {ε(t0), ε(t1), ..., ε(tn)} is available and must be used by
the adjusting mechanism to update appropriate parameters.
In normal practice, instead of doing re-calculation based on a lengthy set of data,
the adjusting algorithm is reformulated to be based on two entities: (i) sufficient
information, and (ii) newly collected data. The sufficient information is a numerical
variable representing the set of data {ε(t0), ε(t1),..., ε(tn-1)} collected from the initial
time t0 to the previous collecting cycle starting at time t = tn-1. The new datum ε(tn) is
collected in the current cycle starting at time t = tn.
An adaptive controller will operate as follows. The controller is initially designed
as a function of a parameter set and state variables of a mathematical model. The
parameters can be updated any time during operation and the controller will adjust
itself to the newly updated parameters. The time frame is usually divided into a series
of equally spaced intervals {[tn, tn+1) | n = 0, 1, 2, ...; tn+1 = tn + Δt}. At the beginning of
each time interval [tn, tn+1), observable data are collected and the error function ε(tn) is
calculated. This error is used to calculate the adjustment in the parameters of the
controller. New control input u(tn) for the time interval [tn,tn+1) is then calculated
based on the newly calculated parameters and fed into both the real dynamic system
under control and the mathematical model upon which the controller is designed. This
completes one control cycle. The next control cycle will consist of the same steps
repeated for the next time interval [tn+1,tn+2), and so on.
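The control cycle just described can be sketched with a toy scalar example. Here the "real system" x = b_true·u, the model y = b_est·u, the certainty-equivalence controller u = x_des/b_est, and the gradient-style adjustment are all hypothetical stand-ins for the chapter's plant, model and adjusting mechanism.

```python
def adaptive_loop(b_true=2.0, b_est=0.5, x_des=1.0, gamma=0.2, steps=50):
    """Toy instance of the adaptive control cycle.  Each cycle: compute u
    from current parameters, observe the real system and the model, form
    the error of Eq. (12), and adjust the model parameter with the newest
    datum.  All numerical choices here are illustrative."""
    history = []
    for _ in range(steps):
        u = x_des / b_est          # control input u(t_n) from current parameters
        x = b_true * u             # output of the real system
        y = b_est * u              # output of the mathematical model
        eps = x - y                # Eq. (12): eps(t) = x(t) - y(t)
        b_est += gamma * eps * u   # adjusting mechanism evolves with new data
        history.append(eps)
    return b_est, history

b, hist = adaptive_loop()
```

After 50 cycles the estimated gain approaches the true gain and the model error shrinks, mirroring the per-interval adjustment loop in the text.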
We will consider, in this section, the case of modelling robotic manipulators [5]. The
general model for this kind of robotic system is the following:
M(q)q" + V(q, q')q' + G(q) + Fdq' = τ (13)

where q ∈ R^n denotes the link position, M(q) ∈ R^{n×n} is the inertia matrix, V(q, q') ∈
R^{n×n} is the centripetal-Coriolis matrix, G(q) ∈ R^n represents the gravity vector, Fd ∈
R^{n×n} is a diagonal matrix representing the friction term, and τ ∈ R^n is the input torque
applied to the links. We show in Figure 5 the case of the two-link robot arm. In this
figure, we show the variables involved.
For the simplest case of a one-link robot arm, we have the scalar equation:

Mq q" + Fd q' + G(q) = τ (14)
If G(q) is a linear function (G = Nq), then we have the "linear oscillator" model:
q" + aq' + bq = c
where a = Fd/Mq , b = N/Mq and c = τ/Mq. This is the simplest mathematical model
for a one-link robot arm. More realistic models can be obtained for more complicated
52 O. Castillo and P. Melin
functions G(q). For example, if G(q) = Nq^2, then we obtain the "quadratic oscillator"
model:

q" + aq' + bq^2 = c (15)
where a, b and c are defined as above.
A more interesting model is obtained if we define G(q) = Nsinq. In this case, the
mathematical model is
q" + aq' + b sin q = c (16)
where a, b and c are the same as above. This is the so-called "sinusoidally forced
oscillator". More complicated models for a one-link robot arm can be defined
similarly.
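The behavior of the sinusoidally forced oscillator (16) can be seen by integrating it numerically; the integration scheme and all parameter values below are arbitrary illustrative choices.

```python
import numpy as np

def simulate_forced_oscillator(a=0.5, b=1.0, c=0.2, q0=0.0, qd0=0.0,
                               dt=0.001, t_end=50.0):
    """Integrate q'' + a q' + b sin(q) = c (Eq. 16) with semi-implicit Euler."""
    q, qd = q0, qd0
    for _ in range(int(t_end / dt)):
        qdd = c - a * qd - b * np.sin(q)  # acceleration from Eq. (16)
        qd += dt * qdd
        q += dt * qd
    return q, qd

q, qd = simulate_forced_oscillator()
# With damping a > 0 and constant torque c < b, the arm settles near the
# equilibrium sin(q) = c/b, i.e. q ≈ arcsin(0.2) rad, with qd ≈ 0.
```

Larger torques or time-varying forcing produce the richer (including chaotic) behavior exploited later in the chapter.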
For the case of a two-link robot arm, we can have two simultaneous differential
equations as follows:

q"1 + a1q'1 + b1q2^2 = c1
q"2 + a2q'2 + b2q1^2 = c2 (17)

which is called the "coupled quadratic oscillators" model. In Equation (17) a1, b1, a2,
b2, c1 and c2 are defined similarly as in the previous models. We can also have the
"coupled cubic oscillators" model:
q"1 + a1q'1 + b1q2^3 = c1
q"2 + a2q'2 + b2q1^3 = c2 (18)
Fig. 6. (a) Function approximation after 9 epochs, (b) SSE of the neural network
Fig. 7. (a) Non-linear surface for modelling, (b) fuzzy reasoning procedure
The neural network for control was trained with the Levenberg-Marquardt learning
algorithm and hyperbolic tangent sigmoidal functions as the activation functions for
the neurons. We show in Figure 6(a) the function approximation achieved with the
neural network for control after 9 epochs of training with a variable learning rate. The
identification achieved by the neural network can be considered very good because
the error has been decreased to the order of 10^-4. We show in Figure 6(b) the curve
relating the sum of squared errors (SSE) to the number of epochs of neural
network training. We can see in this figure how the SSE diminishes rapidly from
the order of 10^2 to a value of the order of 10^-4. Still, we could obtain an even
better approximation by using more hidden neurons or more layers. In any case, we
can see clearly how the neural network learns to control the robotic system, because
it is able to follow the arbitrary desired trajectory.
We show in Figure 7(a) the non-linear surface for the fuzzy rule base for modelling.
The fuzzy system was implemented in the fuzzy logic toolbox of MATLAB [25]. We
show in Figure 7(b) the reasoning procedure for specific values of the fractal
dimension and number of links of the robotic system.
In Figure 8 we show simulation results for a two-link robot arm with a model given
by two coupled second order differential equations. Figure 8(a) shows the behavior of
position q1 and Figure 8(b) shows it for position q2 of the robot arm.
We can see from these figures the complex dynamic behavior of this robotic system
[7]. Of course, the complexity is even greater for higher dimensional robotic systems.
We have very good simulation results for several types of robotic manipulators for
different conditions. The new method for control combines the advantages of neural
networks (learning and adaptability) with the advantages of fuzzy logic (use of expert
knowledge) to achieve the goal of robust adaptive control of robotic dynamic systems.
We consider that our method for adaptive control can be applied to general non-linear
dynamical systems [8, 27] because the hybrid approach, combining neural networks
and fuzzy logic, does not depend on the particular characteristics of the robotic
dynamic systems.
The new method for adaptive control can also be applied for autonomous robots
[8], but in this case it may be necessary to include genetic algorithms for trajectory
planning.
adjust their growth rates and production of different products radically depending on
temperature and concentrations of waste products [16]. Systems with heating or
cooling, multiple reactors or unsteady operation greatly complicate the analysis.
Mathematical models for these systems can be expressed as differential (or
difference) equations [3, 17, 18].
Now we propose mathematical models that integrate our method for geometrical
modelling of bacteria growth using the fractal dimension [14] with the method for
modelling the dynamics of bacteria population using differential equations [27]. The
resulting mathematical models describe bacteria growth in space and in time, because
the use of the fractal dimension enables us to classify bacteria by the geometry of the
colonies and the differential equations help us to understand the evolution in time of
bacteria population.
We will consider first the case of using a single bacterial species for food production.
The mathematical model in this case can be of the following form:
dN/dt = r(1 - N^-D/K)N^-D - EN^-D
dP/dt = EN^-D (20)
Fig. 10. Simulation of the model for two bacteria used in food production
Fig. 11. Simulation of the model for two good bacteria and one bad one
down to zero), which of course results in a decrease in the resulting quantity of the
food product. This is a case that has to be avoided because of the harmful effect
of the bad bacteria. Intelligent control helps in avoiding these types of scenarios
in food production.
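A heavily simplified sketch of model (20): the fractal-dimension exponent is dropped here (D-dependent terms omitted), leaving a logistic population with extraction at rate E and an accumulated product P. All parameter values and the interpretation of the symbols are illustrative assumptions, not taken from the chapter.

```python
def simulate_production(r=0.8, K=100.0, E=0.1, N0=1.0, dt=0.01, t_end=100.0):
    """Euler integration of a simplified (D omitted) form of model (20):
    dN/dt = r(1 - N/K)N - E*N   (population with extraction)
    dP/dt = E*N                 (accumulated product)"""
    N, P = N0, 0.0
    for _ in range(int(t_end / dt)):
        dN = (r * (1.0 - N / K) - E) * N
        N += dt * dN
        P += dt * E * N
    return N, P

N, P = simulate_production()
# The population settles at the harvested equilibrium N* = K(1 - E/r)
# while the product P keeps accumulating at rate E*N*.
```

Driving E too high (or introducing a competing "bad" species, as in Figure 11) collapses N and hence the product yield, which is the scenario the intelligent controller is meant to avoid.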
We have used a general method for adaptive model-based control of non-linear
dynamic plants using Neural Networks, Fuzzy Logic and Fractal Theory. We
illustrated our method for control with the case of biochemical reactors. In this case,
the models represent the process of biochemical transformation between the microbial
life and its generation of the chemical product. We also described in this paper an
adaptive controller based on the use of neural networks and mathematical models for
the plant. The proposed adaptive controller performs rather well considering the
complexity of the domain being considered in this research work. We can say that
combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages
that each of these methodologies has, can give good results for this kind of
application. Also, we believe that our neuro-fuzzy-fractal approach is a good
alternative for solving similar problems.
conditions of the aircraft and its environment. For example, we can use the following
model of an airplane when wind velocity is relatively small:
p’ = I1(-q + l), q’ = I2(p + m) (22)
where I1 and I2 are the inertia moments of the airplane with respect to axis x and y,
respectively, l and m are physical constants specific to the airplane, and p, q are the
positions with respect to axis x and y, respectively. However, a more realistic model
of an airplane in three-dimensional space is as follows:
IF                                    THEN
Wind     Inertia    Fractal Dim      Model
Small    Small      Low              M1
Small    Small      Medium           M2
Small    Large      Low              M2
Small    Large      Medium           M2
Large    Small      Medium           M3
Large    Large      Medium           M3
Large    Large      High             M3
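Read crisply, the rule base above amounts to a lookup from (Wind, Inertia, Fractal Dim) to a model; the dictionary below is only a crisp approximation of the fuzzy (graded) matching actually performed by the inference system.

```python
# Crisp encoding of the rule table; the real system uses fuzzy membership
# degrees rather than exact label matches.
RULES = {
    ("Small", "Small", "Low"): "M1",
    ("Small", "Small", "Medium"): "M2",
    ("Small", "Large", "Low"): "M2",
    ("Small", "Large", "Medium"): "M2",
    ("Large", "Small", "Medium"): "M3",
    ("Large", "Large", "Medium"): "M3",
    ("Large", "Large", "High"): "M3",
}

def select_model(wind, inertia, fractal_dim):
    """Return the model chosen by the rule base, or a fallback marker."""
    return RULES.get((wind, inertia, fractal_dim), "no matching rule")
```

In a fuzzy implementation, partially matching rules would all fire with graded strength and the model outputs would be blended instead of switched.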
Fig. 12. (a) Fuzzy rule base, (b) non-linear surface for aircraft dynamics
We show simulation results for an aircraft system obtained using our new method
for modelling dynamical systems. In Figure 13(a) and Figure 13(b) we show results
for an airplane with inertia moments: I1 = 1, I2 = 0.4, I3 = 0.05 and the constants are:
l = m = n = 1. The initial conditions are: p(0) = 0, q(0) = 0, r(0) = 0.
Fig. 14. Function approximation of the neural network for control of an airplane
In any case, we can see clearly (from Figure 14) how the neural network learns to
control the aircraft, because it is able to follow the arbitrary desired trajectory.
We have to mention here that these simulation experiments for the case of a
specific flight for a given airplane show very good results. We have also tried our
approach for control with other types of flights and airplanes with good simulation
results. Still, there is a lot of research to be done in this area because of the complex
dynamics of aircraft systems.
We have developed a general method for adaptive model based control of non-linear
dynamic systems using Neural Networks, Fuzzy Logic and Fractal Theory. We
illustrated our method for control with the case of controlling aircraft dynamics. In this
case, the models represent the aircraft dynamics during flight. We also described in this
paper an adaptive controller based on the use of neural networks and mathematical
models for the system. The proposed adaptive controller performs rather well considering
the complexity of the domain being considered in this research work. We have shown
that our method can be used to control chaotic and unstable behavior in aircraft systems.
Chaotic behavior has been associated with the "flutter" effect in real airplanes, and for
this reason it is very important to avoid this kind of behavior. We can say that combining
Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of
these methodologies has, can give good results for this kind of application. Also, we
believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar
problems.
Type of Plate

Plates     Positive 0.060" / Negative 0.050"       Positive 0.070" / Negative 0.060"
per cell   Total A.H.   72 hr Amp.   96 hr Amp.    Total A.H.   72 hr Amp.   96 hr Amp.
7          155          2.2          1.6           165          2.4          1.8
9          180          2.8          2.0           200          2.8          2.2
11         230          3.2          2.4           245          3.4          2.4
13         260          3.6          2.6           295          4.0          3.0
15         300          4.2          3.0           345          4.8          3.6
17         400          5.6          4.2           415          5.8          4.4
where β0 and β1 are parameters to be estimated (by least squares) using real data for
this problem. In Table 3 we show experimental values for a battery of 6 Volts.

Table 3. Values of temperature and current for a battery of 200 ampere-hours
Hrs T I Hrs T I
21:00 111 5.22 23:00 93 3.53
23:00 100 5.21 1:00 91 3.40
1:00 105 5.52 3:00 92 3.32
3:00 100 5.66 5:00 96 3.16
5:00 100 5.60 7:00 98 3.10
7:00 97 5.72 9:00 98 3.14
9:00 92 4.82 11:00 102 3.12
11:00 95 4.32 13:00 99 3.03
13:00 102 4.10 15:00 98 3.05
15:00 103 4.05 17:00 97 3.06
17:00 100 3.40 19:00 95 2.96
19:00 97 3.77 21:00 94 2.60
21:00 94 3.62 23:00 96 2.76
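Least-squares estimation of β0 and β1 can be illustrated with the first column of Table 3, assuming, purely for illustration, a linear trend of current against elapsed time; the chapter does not state the exact model form being fitted.

```python
import numpy as np

# Elapsed hours (21:00 through 21:00 the next day, 2 h apart) and the
# measured current I from the first column of Table 3.
hours = np.arange(0, 26, 2)
current = np.array([5.22, 5.21, 5.52, 5.66, 5.60, 5.72, 4.82,
                    4.32, 4.10, 4.05, 3.40, 3.77, 3.62])

# Least-squares fit of the assumed linear model I = b0 + b1 * t.
b1, b0 = np.polyfit(hours, current, 1)
```

The fitted slope b1 is negative, reflecting the decay of the formation current over time; with a different assumed model form, the same least-squares machinery applies to transformed variables.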
[Block diagram: a fuzzy controller takes the temperature T and its derivative dT/dt as inputs and regulates the current I applied to the electrochemical process.]
In this case, neural networks are used for modelling the electrochemical process,
fuzzy logic for controlling the electrical current and genetic algorithms for adapting
the membership functions of the fuzzy system [8]. A multilayer feedforward neural
network was used for modelling the electrochemical process. We used the data from
Table 3 and the Levenberg-Marquardt learning algorithm to train the neural network.
We used a three-layer neural network with 15 nodes in the hidden layer. The results of
training for 2000 epochs are as follows. The sum of squared errors was reduced from
about 200 initially to 11.25 at the end, which is a very good approximation in this
case. The fuzzy rule base was implemented in the Fuzzy Logic Toolbox of MATLAB.
In this case, 25 fuzzy rules were used because there were 5 linguistic terms for each
input variable.
The three hybrid control systems were compared by simulating the formation
(loading) of a 6 Volts battery. This particular battery is manually loaded (in the plant)
by applying 2 amperes for 50 hours under manufacturer’s specifications. We show in
Table 4 the experimental results.
We can see from Table 4 that the fuzzy control method reduces the time required to
charge the battery by 36% compared with manual control, and by 11.11% compared
with conventional PID control [27]. We can also see how ANFIS helps in reducing
this time even further, because neural networks are used to adapt the intelligent
system; in this case the reduction is 40% with respect to manual control. Finally, we can
notice that the neuro-fuzzy-genetic approach reduces the time even more, because
the genetic algorithm optimizes the fuzzy system. Here the reduction is 50%
with respect to manual control.
We have described in this section three different approaches for controlling an
electrochemical process. We have shown that for this type of application the use of
several soft computing techniques can help in reducing the time required to produce a
battery. Even fuzzy control alone can reduce the formation time of a battery, but using
neural networks and genetic algorithms reduces the production time even further. Of
course, this means that manufacturers can produce batteries in half the time
needed before.
9 Conclusions
We can say that hybrid intelligent systems can be used to solve difficult real-world
problems. Of course, the right hybrid architecture (and combination) has to be selected.
At the moment, there are no general rules to decide on the right architecture for specific
classes of problems. However, we can use the experience that other researchers have
gained on these problems and use it to our advantage. Also, we always have to turn to
experimental work to test different combinations of soft computing techniques and
decide on the best one for ourselves. Finally, we can conclude that the use of soft
computing for controlling dynamical systems is a very fruitful area of research, because
of the excellent results that can be achieved without using complex mathematical
models [8, 23].
Acknowledgments
We would like to thank the research grant committee of CONACYT-Mexico, for the
financial support given to this research project, under grant 33780-A, and also
COSNET for the research grants 743.99-P, 414.01-P and 487.02-P. We would also
like to thank the Department of Computer Science of Tijuana Institute of Technology
for the time and resources given to this project.
References
[1] Albertos, P., Strietzel, R., Mart, N.: Control Engineering Solutions: A practical approach.
IEEE Computer Society Press, Los Alamitos (1997)
[2] Bode, H., Brodd, R.J., Kordesch, K.V.: Lead-Acid Batteries. John Wiley & Sons,
Chichester (1977)
[3] Castillo, O., Melin, P.: Developing a New Method for the Identification of
Microorganisms for the Food Industry using the Fractal Dimension. Journal of
Fractals 2(3), 457–460 (1994)
[4] Castillo, O., Melin, P.: Mathematical Modelling and Simulation of Robotic Dynamic
Systems using Fuzzy Logic Techniques and Fractal Theory. In: Proceedings of IMACS
1997, Berlin, Germany, vol. 5, pp. 343–348 (1997)
[5] Castillo, O., Melin, P.: A New Fuzzy-Fractal-Genetic Method for Automated
Mathematical Modelling and Simulation of Robotic Dynamic Systems. In: Proceedings
of FUZZ 1998, vol. 2, pp. 1182–1187. IEEE Press, Anchorage (1998)
[6] Castillo, O., Melin, P.: A New Fuzzy Inference System for Reasoning with Multiple
Differential Equations for Modelling Complex Dynamical Systems. In: Proceedings of
CIMCA 1999, pp. 224–229. IOS Press, Vienna (1999)
[7] Castillo, O., Melin, P.: Automated Mathematical Modelling, Simulation and Behavior
Identification of Robotic Dynamic Systems using a New Fuzzy-Fractal-Genetic
Approach. Journal of Robotics and Autonomous Systems 28(1), 19–30 (1999)
[8] Castillo, O., Melin, P.: Soft Computing for Control of Non-Linear Dynamical Systems.
Springer, Heidelberg (2001)
[9] Chen, G., Pham, T.T.: Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control
Systems. CRC Press, Boca Raton (2001)
[10] Fu, K.S., Gonzalez, R.C., Lee, C.S.G.: Robotics: Control, Sensing, Vision and
Intelligence. McGraw-Hill, New York (1987)
[11] Goldfeld, S.M., Quandt, R.E., Trotter, H.F.: Maximization by Quadratic Hill Climbing.
Econometrica 34, 541–551 (1966)
[12] Hehner, N., Orsino, J.A.: Storage Battery Manufacturing Manual III. Independent Battery
Manufacturers Association (1985)
[13] Jang, J.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing. Prentice Hall,
Englewood Cliffs (1997)
[14] Mandelbrot, B.: The Fractal Geometry of Nature. W.H. Freeman and Company, New
York (1987)
[15] Marquardt, D.W.: An Algorithm for Least Squares Estimation of Non-Linear Parameters.
Journal of the Society of Industrial and Applied Mathematics 11, 431–441 (1963)
[16] Melin, P., Castillo, O.: Modelling and Simulation for Bacteria Growth Control in the
Food Industry using Artificial Intelligence. In: Proceedings of CESA 1996, Gerf EC
Lille, Lille, France, pp. 676–681 (1996)
[17] Melin, P., Castillo, O.: An Adaptive Model-Based Neural Network Controller for
Biochemical Reactors in the Food Industry. In: Proceedings of Control 1997, pp. 147–
150. Acta Press, Canada (1997)
[18] Melin, P., Castillo, O.: An Adaptive Neural Network System for Bacteria Growth Control
in the Food Industry using Mathematical Modelling and Simulation. In: Proceedings of
IMACS World Congress 1997, vol. 4, pp. 203–208. W & T Verlag, Berlin (1997)
[19] Melin, P., Castillo, O.: Automated Mathematical Modelling and Simulation for Bacteria
Growth Control in the Food Industry using Artificial Intelligence and Fractal Theory.
Journal of Systems, Analysis, Modelling and Simulation, 189–206 (1997)
[20] Melin, P., Castillo, O.: An Adaptive Model-Based Neuro-Fuzzy-Fractal Controller for
Biochemical Reactors in the Food Industry. In: Proceedings of IJCNN 1998, Anchorage
Alaska, USA, vol. 1, pp. 106–111 (1998)
[21] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal
Control of Non-Linear Dynamic Plants: The Case of Biochemical Reactors. In:
Proceedings of IPMU 1998, vol. 1, pp. 475–482. EDK Publishers, Paris (1998)
[22] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal
of Non-Linear Dynamical Systems. In: Proceedings of ICNPAA, pp. 499–506. European
Conference Publications, Daytona Beach (1999)
[23] Melin, P., Castillo, O.: Modelling, Simulation and Control of Non-Linear Dynamical
Systems. Taylor and Francis Publishers, London (2002)
[24] Miller, W.T., Sutton, R.S., Werbos, P.J.: Neural Networks for Control. MIT Press,
Cambridge (1995)
[25] Nakamura, S.: Numerical Analysis and Graphic Visualization with MATLAB. Prentice-
Hall, Englewood Cliffs (1997)
[26] Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Prentice Hall Publishing,
Englewood Cliffs (1989)
[27] Rasband, S.N.: Chaotic Dynamics of Non-Linear Systems. John Wiley & Sons,
Chichester (1990)
[28] Sepulveda, R., Castillo, O., Montiel, O., Lopez, M.: Analysis of Fuzzy Control System
for Process of Forming Batteries. In: ISRA 1998, Mexico, pp. 203–210 (1998)
[29] Sugeno, M., Kang, G.T.: Structure Identification of Fuzzy Model. Fuzzy Sets and
Systems 28, 15–33 (1988)
[30] Takagi, T., Sugeno, M.: Fuzzy Identification of Systems and its Applications to
Modelling and Control. IEEE Transactions on Systems, Man and Cybernetics 15, 116–
132 (1985)
[31] Ungar, L.H.: A Bioreactor Benchmark for Adaptive Network-Based Process Control. In:
Neural Networks for Control, pp. 387–402. MIT Press, Cambridge (1995)
[32] Zadeh, L.A.: The Concept of a Linguistic Variable and its Application to Approximate
Reasoning. Information Sciences 8, 43–80 (1975)
Model Reference Adaptive Control of Underwater Robot
in Spatial Motion
Jerzy Garus
Naval University
81-103 Gdynia ul. Śmidowicza 69, Poland
j.garus@amw.gdynia.pl
Abstract. The paper addresses nonlinear control of an underwater robot. The way-point line-of-sight scheme is incorporated for tracking a desired trajectory. Command signals are generated by an autopilot consisting of four controllers with a parameter adaptation law implemented. The quality of control in the presence of environmental disturbances is considered. Some computer simulations are provided to demonstrate the effectiveness, correctness and robustness of the approach.
1 Introduction
Underwater robotics has attracted increasing interest in recent years. The main
benefits of using Underwater Robotic Vehicles (URVs) are removing humans
from the dangers of the undersea environment and reducing the cost of exploring
the deep seas. Currently, it is common to use URVs to accomplish missions such as
inspections of coastal and off-shore structures, cable maintenance, and
hydrographical and biological surveys. In the military field they are employed in such
tasks as surveillance, intelligence gathering, torpedo recovery and mine
countermeasures.
The URV is considered to be a floating platform carrying tools required for performing various functions, such as manipulator arms with interchangeable end-effectors, cameras, scanners, sonars, etc. Automatic control of such objects is a difficult problem because of their nonlinear dynamics [1, 3, 4, 5, 6]. Moreover, the dynamics can change as the configuration is altered to suit the mission. In order to cope with those difficulties, the control system should be flexible.
A conventional URV operates in a crab-wise manner in four degrees of freedom (DOF), with small roll and pitch angles that can be neglected during normal operations. Therefore its basic motion is movement in the horizontal plane with some variation due to diving.
The objective of the paper is to present the use of an adaptive inverse dynamics algorithm for driving the robot along a desired trajectory in spatial motion. The paper consists of four sections. Brief descriptions of the dynamical and kinematical equations of motion of the URV and the adaptive control law are presented in Section 2. Next, some results of the simulation study are provided. Conclusions are given in Section 4.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 71 – 83.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
\[ \eta = [x, y, z, \phi, \theta, \psi]^T, \quad v = [u, v, w, p, q, r]^T, \quad \tau = [X, Y, Z, K, M, N]^T \tag{1} \]
where η is the position and orientation vector expressed in the earth-fixed frame, v the linear and angular velocity vector expressed in the body-fixed frame, and τ the vector of forces and moments acting on the robot.
\[ M\dot{v} + C(v)v + D(v)v + g(\eta) = \tau \tag{2a} \]
\[ \dot{\eta} = J(\eta)v \tag{2b} \]
where M is the inertia matrix (including added mass), C(v) the matrix of Coriolis and centripetal terms, D(v) the damping matrix, g(η) the vector of restoring forces and moments, and J(η) the velocity transformation matrix between the body-fixed and the earth-fixed frame.
The robot’s dynamics in the inertial frame can be written as [4, 5]:
\[ M_\eta(\eta)\ddot{\eta} + C_\eta(v, \eta)\dot{\eta} + D_\eta(v, \eta)\dot{\eta} + g_\eta(\eta) = \tau_\eta \tag{3} \]
where:
\[
\begin{aligned}
M_\eta(\eta) &= \left(J^{-1}(\eta)\right)^T M J^{-1}(\eta) \\
C_\eta(v, \eta) &= \left(J^{-1}(\eta)\right)^T \left[C(v) - M J^{-1}(\eta)\dot{J}(\eta)\right] J^{-1}(\eta) \\
D_\eta(v, \eta) &= \left(J^{-1}(\eta)\right)^T D(v) J^{-1}(\eta) \\
g_\eta(\eta) &= \left(J^{-1}(\eta)\right)^T g(\eta) \\
\tau_\eta &= \left(J^{-1}(\eta)\right)^T \tau
\end{aligned}
\]
There are parametric uncertainties in the dynamic model (2a), and some parameters are generally unknown. Hence, parameter estimation is necessary in the case of model-based control. For this purpose it is assumed that the robot's equations of motion are linear in a parameter vector p, that is [8]:
\[ M\dot{v} + h(v, \eta) = \tau \tag{5} \]
\[ h(v, \eta) = C(v)v + D(v)v + g(\eta) \tag{6} \]
The control law is chosen as the inverse dynamics law with estimated parameters:
\[ \tau = \hat{M}a + \hat{h}(v, \eta) \tag{7} \]
where a denotes the commanded acceleration and \(\hat{M}\), \(\hat{h}\) are estimates of M and h. Substituting (7) into (5) yields
\[ \hat{M}(\dot{v} - a) = \tilde{M}\dot{v} + \tilde{h}(v, \eta) \tag{8} \]
where \(\tilde{M} = \hat{M} - M\) and \(\tilde{h}(v, \eta) = \hat{h}(v, \eta) - h(v, \eta)\).
Since the equations of motion are linear in the parameter vector p, the following
parameterization can be applied:
\[ \tilde{M}\dot{v} + \tilde{h}(v, \eta) = Y(\eta, v, \dot{v})\tilde{p} \tag{9} \]
where \(\tilde{p} = \hat{p} - p\) is the unknown parameter error vector.
Differentiation of the kinematical equation (2b) with respect to time yields:
\[ \dot{v} = J^{-1}(\eta)\left[\ddot{\eta} - \dot{J}(\eta)v\right] \tag{10} \]
\[ \hat{M}J^{-1}(\eta)\left[\ddot{\eta} - a_\eta\right] = Y(\eta, v, \dot{v})\tilde{p} \tag{11} \]
\[ \hat{M}_\eta(\eta)\left[\ddot{\eta} - a_\eta\right] = \left(J^{-1}(\eta)\right)^T Y(\eta, v, \dot{v})\tilde{p} \tag{12} \]
\[ a_\eta = \ddot{\eta}_d - K_D\dot{\tilde{\eta}} - K_P\tilde{\eta} \tag{13} \]
where \(\tilde{\eta} = \eta - \eta_d\) is the tracking error and K_P, K_D are positive definite diagonal matrices.
Hence, the error dynamics can be written in the form:
\[ \hat{M}_\eta(\eta)\left[\ddot{\tilde{\eta}} + K_D\dot{\tilde{\eta}} + K_P\tilde{\eta}\right] = \left(J^{-1}(\eta)\right)^T Y(\eta, v, \dot{v})\tilde{p} \tag{14} \]
\[ \dot{x} = Ax + BJ^{-T}(\eta)Y(\eta, v, \dot{v})\tilde{p} \tag{15} \]
where:
\[ x = \begin{bmatrix} \tilde{\eta} \\ \dot{\tilde{\eta}} \end{bmatrix}, \quad A = \begin{bmatrix} 0 & I \\ -K_P & -K_D \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ \hat{M}_\eta^{-1}(\eta) \end{bmatrix}. \]
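With K_P and K_D positive definite and diagonal, the homogeneous part of the error dynamics (15) is governed by a Hurwitz matrix A, so the tracking error decays when the parameter error vanishes. This is easy to check numerically; the gains below are illustrative, not taken from the paper:

```python
import numpy as np

# Illustrative PD gains for the 4-DOF case (not the paper's values)
KP = np.diag([4.0, 4.0, 4.0, 4.0])
KD = np.diag([2.5, 2.5, 2.5, 2.5])
I4 = np.eye(4)
Z4 = np.zeros((4, 4))

# Error-dynamics system matrix A = [[0, I], [-KP, -KD]] from (15)
A = np.block([[Z4, I4], [-KP, -KD]])

# All eigenvalues must lie strictly in the left half-plane (Hurwitz)
eigs = np.linalg.eigvals(A)
print(max(eigs.real))  # -1.25
```

Each decoupled mode obeys s^2 + 2.5 s + 4 = 0, so the real parts are all -1.25.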
The parameter vector \(\hat{p}\) is updated according to the adaptation formula given in [5, 6]:
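The adaptation formula itself is not reproduced above; schemes of this kind (cf. [5, 8]) typically drive \(\hat{p}\) with a gradient-type law \(\dot{\hat{p}} = -\Gamma Y^T e\) built from the regressor and a tracking-error signal. The sketch below shows one Euler step of such a law; the gain Γ, the error signal e and all numerical values are assumptions for illustration, not the paper's exact formula:

```python
import numpy as np

def adapt_step(p_hat, Y, e, Gamma, dt):
    """One Euler step of a gradient-type parameter update
    p_hat_dot = -Gamma @ Y.T @ e (illustrative form only; e is a
    filtered tracking-error signal, Gamma a positive definite gain)."""
    return p_hat - dt * Gamma @ Y.T @ e

# Toy usage: 8 parameters (m_X, d_X, ..., d_N), 4-DOF regressor
p_hat = np.zeros(8)
Y = np.zeros((4, 8))
Y[0, 0], Y[0, 1] = 1.0, 0.5        # e.g. u_dot = 1, |u|u = 0.5
e = np.array([0.2, 0.0, 0.0, 0.0])  # surge error only
Gamma = 0.1 * np.eye(8)
p_hat = adapt_step(p_hat, Y, e, Gamma, dt=0.01)
print(p_hat[:2])  # both entries move opposite to the error gradient
```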
3 Simulation Results
A main task of the proposed tracking control system is to minimize the distance between the robot's centre of gravity and the desired trajectory under the following assumptions:
1. the robot can move with varying linear velocities u, v, w and angular velocity r;
2. its velocities u, v, w, r and the coordinates of position x, y, z and heading ψ are measurable;
3. the desired trajectory is given by means of a set of way-points {(x_di, y_di, z_di)};
4. reference trajectories between two successive way-points are defined as smooth and bounded curves;
5. the command signal τ consists of four components: τ_X = X, τ_Y = Y, τ_Z = Z and τ_N = N calculated from the control law (7).
1. The robot has to follow the desired trajectory beginning from (10 m, 10 m, 0 m),
passing target way-points: (10 m, 10 m, -5 m), (10 m, 90 m, -5 m), (30 m, 90 m,
-5 m), (30 m, 10 m, -5 m), (60 m, 10 m, -5 m), (60 m, 90 m, -5 m), (60 m, 90 m,
-15 m), (60 m, 10 m, -15 m), (30 m, 10 m, -15 m), (30 m, 90 m, -15 m),
(10 m, 90 m, -15 m) and ending in (10 m, 10 m, -15 m);
2. The turning point is reached when the robot is inside of the 0.5 meter circle of
acceptance;
3. The sea current acts on the robot's hull with a maximum velocity of 0.3 m/s and a direction of 135°;
4. Dynamic equations of the robot’s motion are integrated with higher frequency
(18 Hz) than the rest of modules (6 Hz).
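The circle-of-acceptance switching of item 2 above can be sketched as a small helper; the 0.5 m radius comes from the text, while the function names and the way-point handling are illustrative:

```python
import math

R_ACC = 0.5  # radius of the circle of acceptance [m], from the text

def reached(pos, wp, r=R_ACC):
    """True when the robot is inside the circle (sphere) of acceptance."""
    return math.dist(pos, wp) <= r

def current_waypoint(pos, waypoints, idx):
    """Advance the target index when the active way-point is reached."""
    if idx < len(waypoints) - 1 and reached(pos, waypoints[idx]):
        idx += 1
    return idx

waypoints = [(10, 10, -5), (10, 90, -5), (30, 90, -5)]
idx = current_waypoint((10.1, 10.2, -4.9), waypoints, 0)
print(idx)  # 1: first way-point reached, switch to the next one
```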
It has been assumed that the time-varying reference trajectories from way-point i to the next way-point i+1 are generated using desired speed profiles [7, 8]. Such an approach allows us to keep a constant speed along parts of the path. Under these assumptions and the following initial conditions:
\[
\eta_{dk}(t) = \begin{cases}
\eta_0 + \dfrac{\dot{\eta}_{\max}}{2t_m}\,t^2 & 0 \le t \le t_m \\[4pt]
\eta_0 + \dot{\eta}_{\max}\left(t - \dfrac{t_m}{2}\right) & t_m < t \le t_f - t_m \\[4pt]
\eta_1 - \dfrac{\dot{\eta}_{\max}}{2t_m}\,(t_f - t)^2 & t_f - t_m < t \le t_f
\end{cases}
\]
where \(t_m = t_f - \dfrac{\eta_1 - \eta_0}{\dot{\eta}_{\max}}\).
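The reference trajectories between way-points thus follow a trapezoidal speed profile in the style of [7, 8]: parabolic blends joined by a constant-speed segment, built around the blend-time relation t_m = t_f − (η_1 − η_0)/η̇_max given in the text. A minimal sketch (function name and numerical values are illustrative):

```python
def lspb(t, eta0, eta1, v_max, t_f):
    """Trapezoidal profile: parabolic blends of length t_m joined by a
    constant-speed segment (cf. Spong & Vidyasagar [8]). Requires
    v_max * t_f > eta1 - eta0 so that the blend time t_m is positive."""
    t_m = t_f - (eta1 - eta0) / v_max          # blend time
    if t <= t_m:                                # accelerate
        return eta0 + v_max / (2.0 * t_m) * t * t
    if t <= t_f - t_m:                          # cruise at v_max
        return eta0 + v_max * (t - t_m / 2.0)
    return eta1 - v_max / (2.0 * t_m) * (t_f - t) ** 2   # decelerate

# Example: go from 0 m to 8 m in 10 s with a 1 m/s cruise speed
print(lspb(0.0, 0.0, 8.0, 1.0, 10.0), lspb(10.0, 0.0, 8.0, 1.0, 10.0))
```

The profile is continuous at both blend boundaries and hits the end points exactly.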
The control algorithm has been worked out on the basis of the simplified URV model proposed in [4, 9]:
\[ M_s\dot{v} + D_s(v)v = \tau \tag{18} \]
where all kinematic and dynamic cross-coupling terms are neglected. Here \(M_s\) and \(D_s(v)\) are diagonal matrices containing the diagonal elements of the inertia matrix M and of the nonlinear damping matrix \(D_n(v)\), respectively (see the Appendix). Uncertainties in the above model are compensated in the control system. Therefore, the robot's model for spatial motion in four DOF can be written in the following form:
\[
\begin{aligned}
m_X\dot{u} + d_X|u|u &= \tau_X \\
m_Y\dot{v} + d_Y|v|v &= \tau_Y \\
m_Z\dot{w} + d_Z|w|w &= \tau_Z \\
m_N\dot{r} + d_N|r|r &= \tau_N
\end{aligned}
\tag{19}
\]
Defining the parameter vector p as
\[ p = [m_X \;\; d_X \;\; m_Y \;\; d_Y \;\; m_Z \;\; d_Z \;\; m_N \;\; d_N]^T \]
the equation (19) can be written in the form:
\[ Y(v, \dot{v})p = \tau \tag{20} \]
where:
\[ Y(v, \dot{v}) = \begin{bmatrix}
\dot{u} & |u|u & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \dot{v} & |v|v & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \dot{w} & |w|w & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \dot{r} & |r|r
\end{bmatrix}. \]
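The factorization (20) can be verified numerically: the product Y(v, v̇)p must reproduce the left-hand sides of (19) component by component. A quick sketch with made-up parameter values:

```python
import numpy as np

def regressor(v, vdot):
    """Y(v, vdot) for the decoupled 4-DOF model: each row pairs an
    acceleration with the matching |.|.-type damping term."""
    u, sway, w, r = v
    du, dv, dw, dr = vdot
    return np.array([
        [du, abs(u) * u, 0, 0, 0, 0, 0, 0],
        [0, 0, dv, abs(sway) * sway, 0, 0, 0, 0],
        [0, 0, 0, 0, dw, abs(w) * w, 0, 0],
        [0, 0, 0, 0, 0, 0, dr, abs(r) * r],
    ])

# p = [m_X, d_X, m_Y, d_Y, m_Z, d_Z, m_N, d_N]; illustrative values
p = np.array([99.0, 77.0, 99.0, 197.0, 99.0, 197.0, 14.0, 10.0])
v = np.array([1.0, -0.5, 0.2, 0.1])
vdot = np.array([0.3, 0.0, -0.1, 0.05])
tau = regressor(v, vdot) @ p
# First component must equal m_X*du + d_X*|u|*u, as in (19)
print(np.isclose(tau[0], 99.0 * 0.3 + 77.0 * 1.0))  # True
```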
Fig. 3. Track-keeping control under interaction of sea current disturbances (maximum velocity 0.3 m/s and direction 135°): desired (d) and real (r) trajectories (upper plot), x-, y-, z-positions and their errors (2nd–4th plots), course and its error (5th plot), commands (bottom plot)
Fig. 4. Estimates of mass and damping coefficients: set value (s) and estimate (e)
4 Conclusions
In the paper a nonlinear control system for an underwater robot has been described. The results obtained with the autopilot, consisting of four controllers with a parameter adaptation law implemented, have shown that the proposed control system is simple and useful in practice.
Disturbances from a sea current were added in the simulation study to verify the performance, correctness and robustness of the approach.
Further work is devoted to the problem of tuning the autopilot parameters in relation to the robot's dynamics.
References
[1] Antonelli, G., Caccavale, F., Sarkar, S., West, M.: Adaptive Control of an Autonomous
Underwater Vehicle: Experimental Results on ODIN. IEEE Transactions on Control
Systems Technology 9(5), 756–765 (2001)
[2] Bhattacharyya, R.: Dynamics of Marine Vehicles. John Wiley and Sons, Chichester (1978)
[3] Craven, J., Sutton, R., Burns, R.S.: Control Strategies for Unmanned Underwater Vehicles.
Journal of Navigation 1(51), 79–105 (1998)
[4] Fossen, T.I.: Guidance and Control of Ocean Vehicles. John Wiley and Sons, Chichester
(1994)
[5] Fossen, T.I.: Marine Control Systems. Marine Cybernetics AS, Trondheim (2002)
[6] Garus, J.: Design of URV Control System Using Nonlinear PD Control. WSEAS
Transactions on Systems 4(5), 770–778 (2005)
[7] Garus, J., Kitowski, Z.: Tracking Autopilot for Underwater Robotic Vehicle. In: Cagnol,
J., Zolesio, J.P. (eds.) Information Processing: Recent Mathematical Advances in
Optimization and Control, pp. 127–138. Presses de l’Ecole des Mines de Paris (2004)
[8] Spong, M.W., Vidyasagar, M.: Robot Dynamics and Control. John Wiley and Sons,
Chichester (1989)
[9] Yoerger, D.R., Slotine, J.E.: Robust Trajectory Control of Underwater Vehicles. IEEE
Journal of Oceanic Engineering (4), 462–470 (1985)
Appendix
The URV model. The following parameters of dynamics of the underwater robot
have been used in computer simulations:
\[ C(v) = \begin{bmatrix}
0 & 0 & 0 & 0 & 26.0w & -28.0v \\
0 & 0 & 0 & -26.0w & 0 & 18.5u \\
0 & 0 & 0 & 28.0v & -18.5u & 0 \\
0 & 26.0w & -28.0v & 0 & 5.9r & -6.8q \\
-26.0w & 0 & 18.5u & -5.9r & 0 & 1.3p \\
28.0v & -18.5u & 0 & 6.8q & -1.3p & 0
\end{bmatrix} \]
\[ g(\eta) = \begin{bmatrix}
-17.0\sin\theta \\
17.0\cos\theta\sin\phi \\
17.0\cos\theta\cos\phi \\
-279.2\cos\theta\sin\phi \\
-279.2\left(\sin\theta + \cos\theta\cos\phi\right) \\
0
\end{bmatrix} \]
Feedback Stabilization of Distributed Parameter
Gyroscopic Systems
Pawel Skruch
1 Introduction
Many physical systems are represented by partial differential equations, for example robots with flexible links and vibrating structures such as beams, buildings and bridges. In most cases it is not possible or feasible to obtain an analytical solution of these equations. Therefore, in practice a distributed parameter system is first discretized to a matrix second-order model using some approximation method, and the problem is then solved for this discretized reduced-order model.
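As a concrete instance of this discretization step, the sketch below reduces an undamped vibrating string (a distributed parameter system) to a finite dimensional matrix second-order model q̈ = −Aq by finite differences; the grid size is illustrative, and the natural frequencies of the discretized model approximate those of the continuous system:

```python
import numpy as np

n, L = 50, 1.0                     # interior grid points, string length
h = L / (n + 1)

# Finite-difference stiffness matrix for -d^2/dxi^2 with fixed ends:
# the PDE x_tt = x_xixi becomes the matrix second-order model q'' = -A q
A = (np.diag([2.0] * n) + np.diag([-1.0] * (n - 1), 1)
     + np.diag([-1.0] * (n - 1), -1)) / h**2

# Natural frequencies are the square roots of the eigenvalues of A;
# for the unit string they should approximate k*pi, k = 1, 2, ...
freqs = np.sqrt(np.sort(np.linalg.eigvalsh(A)))
print(freqs[:3])  # close to pi, 2*pi, 3*pi
```

Resonance then occurs when an external forcing frequency approaches one of these values.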
It is well known that a dangerous situation called resonance occurs when one or more natural frequencies of the system become equal or close to a frequency of the external force. Because a linear infinite dimensional system described by an operator second-order differential equation without a damping term may have an infinite number of poles on the imaginary axis [17], [18], [26], the approximate solutions are not suitable for designing the stabilizer. To combat possible undesirable effects of vibrations, the dynamic effect of the system parts whose behaviour is described by partial differential equations has to be taken into account in designing a controller.
Stability of second-order systems in both the finite and the infinite dimensional case has been studied in the past. More recently, in [19] and [20] the dynamics and stability of an LC ladder network stabilized by inner resistance, by velocity feedback and by first-order dynamic feedback are studied. Control problems for finite dimensional undamped second-order systems are discussed in [12] and [21]. In [28], the class
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 85–97.
springerlink.com c Springer-Verlag Berlin Heidelberg 2009
where
\[ Cx = \left[\langle c_1, x\rangle_X \;\; \langle c_2, x\rangle_X \;\; \cdots \;\; \langle c_m, x\rangle_X\right]^T, \tag{8} \]
\(c_i \in X\), i = 1, 2, ..., m are sensor influence functions.
From the Hilbert-Schmidt theory [5], [22], [31] for compact self-adjoint oper-
ators, it is well-known that the operator A satisfies the following hypotheses:
(a) 0 ∈ ρ(A), i.e. A−1 exists and is compact (ρ(A) stands for the resolvent set
of the operator A),
(b) A is closed,
(c) The operator A has only purely discrete spectrum consisting entirely of dis-
tinct real positive eigenvalues λi with finite multiplicity ri < ∞, where
0 < λ1 < . . . < λi < . . ., limi→∞ λi = ∞,
(d) For each eigenvalue λi there exist ri corresponding eigenfunctions υik ,
Aυik = λi υik , where i = 1, 2, . . ., k = 1, 2, . . . , ri ,
(e) The set of eigenfunctions υik , i = 1, 2, . . ., k = 1, 2, . . . , ri , forms a complete
orthonormal system in X.
By introducing the new function space \(\tilde{X} = D(A^{1/2}) \times X\), the equation (4) is reduced to the following abstract first-order form:
\[ \frac{d}{dt}\tilde{x}(t) = (\tilde{A} + \tilde{G})\tilde{x}(t) + \tilde{B}u(t), \tag{9} \]
where \(\tilde{x} = \mathrm{col}(x, \dot{x})\), and the operators \(\tilde{A}\), \(\tilde{G}\) and \(\tilde{B}\) are defined as
\[ \tilde{A} = \begin{bmatrix} 0 & I \\ -A & 0 \end{bmatrix}, \quad \tilde{G} = \begin{bmatrix} 0 & 0 \\ 0 & -G \end{bmatrix}, \quad \tilde{B} = \begin{bmatrix} 0 \\ B \end{bmatrix}. \tag{10} \]
Remark 1. The operator A is positive and self-adjoint on the real Hilbert space X. The operator \(A^{1/2}\) is well defined. Thus the operator \(\tilde{A}\) (see (10)) on \(\tilde{X} = D(A^{1/2}) \times X\) (see [18], [26]) is the infinitesimal generator of a C0-semigroup \(\tilde{S}(t)\).
Remark 3. In the real Hilbert space X and for the skew-adjoint operator G, the following equality is true:
\[ \langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X + \tfrac{1}{2}\langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X - \tfrac{1}{2}\langle Gx, x\rangle_X = 0. \tag{11} \]
constructed by placing actuators and sensors at the same location, which means that C = B∗ and consequently
We assume that the system (4) with the output (12) is approximately observable
(see [1], [10], [11]).
Let us consider the linear dynamic feedback given by the formula (see also
[13], [21])
u(t) = −K[w(t) + y(t)], (13)
The closed loop system (16) can also be written in the following form [20]:
\[ \begin{bmatrix} \dot{x}(t) \\ \ddot{x}(t) \\ \dot{w}(t) \end{bmatrix} = \begin{bmatrix} 0 & I & 0 \\ -A & -G & 0 \\ 0 & 0 & -A_w \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix} + \begin{bmatrix} 0 \\ B \\ B_w \end{bmatrix} u(t), \tag{18} \]
\[ s(t) = \begin{bmatrix} C_1 B^* & 0 & C_2 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix}, \tag{19} \]
Proof. The system (18), (19) is observable if for any complex number s the equation
\[ \begin{cases} s x_1 - x_2 = 0, \\ A x_1 + (G + sI) x_2 = 0, \\ (A_w + sI) x_3 = 0, \\ C_1 B^* x_1 + C_2 x_3 = 0 \end{cases} \tag{21} \]
has no nonzero solution \(x = \mathrm{col}(x_1, x_2, x_3)\) [11]. When \(s \neq -\alpha_i\), i = 1, 2, ..., m, we have \(x_3 = 0\) and (21) becomes
\[ \begin{cases} s x_1 - x_2 = 0, \\ A x_1 + (G + sI) x_2 = 0, \\ B^* x_1 = 0. \end{cases} \tag{22} \]
If the system (4), (12) is observable, (22) has no nonzero solution for any complex
number s.
Next consider the case where \(s = -\alpha_i\) for some i = 1, 2, ..., m. From (21) it follows that
\[ A x_1 = (G\alpha_i - \alpha_i^2 I) x_1. \tag{23} \]
From this it holds that
Theorem 2. Suppose that the system (4), (12) is approximately observable. Let
us consider the system (16), where the operator L is given by (17). Then the
following assertions are true:
(a) L is dissipative,
(b) Ran (λ0 I − L) = Z for some λ0 > 0,
(c) D(L)cl = Z and L is closed,
(d) The operator L generates a C0 -semigroup of contractions TL (t) ∈ L(Z),
t ≥ 0,
(e) The C0 -semigroup TL (t) generated by L is asymptotically stable.
(see [25]). In the real Hilbert space Z, the condition (25) is equivalent to
\[ \langle Lz, z\rangle_Z = -\tilde{z}^T B_w \tilde{z} \le 0, \tag{28} \]
where
\[ \tilde{z} = K B^* z_1 + (Q + K) z_3. \tag{29} \]
Since \(\langle Lz, z\rangle_Z \le 0\), it follows that L is dissipative (see (26)).
(b) To prove the assertion (b), it is enough to show that for some \(\lambda_0 > 0\) the operator \(\lambda_0 I - L : Z \to Z\) is onto. Let \(\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2, \tilde{z}_3) \in Z\) be given. We have to find \(z = \mathrm{col}(z_1, z_2, z_3) \in D(L)\) such that
\[ \lambda_0 z_1 - z_2 = \tilde{z}_1, \tag{31} \]
\[ z_2 = \lambda_0 z_1 - \tilde{z}_1, \tag{34} \]
\[ z_3 = (\lambda_0 I + A_w + B_w K)^{-1}(\tilde{z}_3 - B_w K B^* z_1). \tag{35} \]
We can do this because the matrix \(\lambda_0 I + A_w + B_w K\) is invertible (see Lemma 1). Using (34) and (35) in (32) we obtain
Define Γ (λ0 ) by
matrix \([K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1}\). Thus the operator \(\Gamma(\lambda_0)\) is a closed operator with domain \(D(\Gamma) = D(A)\) dense in X. Additionally,
\[ \langle \Gamma z_1, z_1\rangle_X \ge (\lambda_0^2 + \delta)\|z_1\|_X^2, \tag{38} \]
where the constant δ > 0 can be determined by using lemmas 2 and 3. This
means that the operator Γ (λ0 ) is invertible and the equation (36) has a unique
solution z1 ∈ D(A). The remaining unknowns z2 ∈ H 1 (Ω) and z3 ∈ Rm can be
uniquely determined from (34) and (35). This completes the proof of (b).
(c) If for some λ0 > 0, Ran (λ0 I − L) = Z then Ran (λI − L) = Z for all λ > 0
[25]. Let us note that also Ran (λI − L) = Z for λ = 0. Now, we know that the
operator L is dissipative, the Hilbert space Z is reflexive and Ran (I − L) = Z.
All these properties imply that D(L)cl = Z and L is closed [25].
(d) Because of (a), (b) and (c), the statement that the operator L generates
a C0 -semigroup of contractions TL (t) ∈ L(Z), t ≥ 0, can be concluded from
Lumer-Phillips theorem [6], [16], [17], [25].
(e) The asymptotic stability of the closed loop system (16) can be proved by
LaSalle’s invariance principle [15] extended to infinite dimensional systems [7],
[8], [17], [29]. We introduce the following Lyapunov function:
\[ V(x(t), \dot{x}(t), w(t)) = \tfrac{1}{2}\langle \dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2}w(t)^T Q w(t) + \tfrac{1}{2}\left[w(t) + B^* x(t)\right]^T K \left[w(t) + B^* x(t)\right], \tag{39} \]
where \(Q = \mathrm{diag}[\alpha_i/\beta_i] = A_w B_w^{-1}\). We can notice that \(V(x, \dot{x}, w) = 0\) if and only if \(\mathrm{col}(x, \dot{x}, w) = 0\); otherwise \(V(x, \dot{x}, w) > 0\). Taking the derivative of V with respect to time, we obtain
\[ \dot{V}(x(t), \dot{x}(t), w(t)) = \langle \ddot{x}(t), \dot{x}(t)\rangle_X + \langle Ax(t), \dot{x}(t)\rangle_X + w(t)^T Q\dot{w}(t) + \left[w(t) + B^* x(t)\right]^T K \left[\dot{w}(t) + B^* \dot{x}(t)\right]. \tag{40} \]
Along trajectories of the closed loop system (16) it holds that
V̇ (x(t), ẋ(t), w(t)) = −s(t)T Bw s(t) ≤ 0, (41)
where s(t) = KB ∗ x(t) + (Q + K)w(t). According to LaSalle’s theorem, all solu-
tions of (16) asymptotically tend to the maximal invariant subset of the following
set
\[ S = \left\{ z \in Z : z = \mathrm{col}(x, \dot{x}, w), \; \dot{V}(z) = 0 \right\}, \tag{42} \]
provided that the solution trajectories for t ≥ 0 are precompact in Z. From
V̇ = 0 we have s(t) = 0 (see (19) for C1 = K, C2 = Q + K). The system (18),
(19) is observable (see theorem 1), thus we have x = 0, ẋ = 0, w = 0 and finally
the largest invariant set contained in S = {0} is the set {0}.
The trajectories of the closed loop system (16) are precompact in Z if the set
\[ \gamma(z^0) = \bigcup_{t \ge 0} T_L(t)z^0, \quad z^0 = z(0) \in D(L), \tag{43} \]
λmin = min {λn : λn ∈ σ(A), n = 1, 2, . . .}, σ(A) stands for the discrete spec-
trum of A.
Lemma 3. For any real positive definite matrix \(\tilde{K} = \tilde{K}^T > 0\) there exists \(\delta > 0\) such that
\[ \langle (A + B\tilde{K}B^*)x, x\rangle_X = \langle Ax, x\rangle_X + (B^*x)^T \tilde{K}(B^*x) \ge \delta\|x\|_X^2. \tag{45} \]
The Lemmas 2 and 3 can be proved by using the following expansions in the Hilbert space X:
\[ x = \sum_{i=1}^{\infty}\sum_{k=1}^{r_i} \langle x, \upsilon_{ik}\rangle_X\, \upsilon_{ik}, \quad x \in X, \tag{46} \]
\[ Ax = \sum_{i=1}^{\infty}\sum_{k=1}^{r_i} \lambda_i \langle x, \upsilon_{ik}\rangle_X\, \upsilon_{ik}, \quad x \in D(A). \tag{47} \]
where \(K_1 = K_1^T \ge 0\) and \(K_2 = K_2^T > 0\) are real matrices. The control function
(48) is applied to the system (4) with the output (12). The resulting closed loop
system becomes
Theorem 3. Suppose that the system (4), (12) is approximately observable. Let
us consider the system (51), where the operator L is given by (52). Then the
following assertions are true:
(a) L is dissipative,
(b) Ran (λ0 I − L) = Z for some λ0 > 0,
(c) D(L)cl = Z and L is closed,
(d) The operator L generates a C0 -semigroup of contractions TL (t) ∈ L(Z),
t ≥ 0,
(e) The C0 -semigroup TL (t) generated by L is asymptotically stable.
Proof. The proof shall be carried out by using the same method as in the proof
of theorem 2. The Lyapunov function for the system (51) is given by
\[ V(z(t)) = \tfrac{1}{2}\langle \dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2}[B^*x(t)]^T K_1 [B^*x(t)], \tag{53} \]
and the stability of the closed loop system is a consequence of LaSalle’s theorem.
5 Illustrative Examples
To illustrate our theory we consider the motion of a taut string rotating about its ξ-axis with constant angular velocity ω (Fig. 1). It was shown in [3] and [4] that the small oscillations of such a string are governed by the system of partial differential equations
\[
\begin{cases}
\dfrac{\partial^2 x_1(t,\xi)}{\partial t^2} - 2\omega\dfrac{\partial x_2(t,\xi)}{\partial t} - \omega^2 x_1(t,\xi) - \dfrac{\partial^2 x_1(t,\xi)}{\partial\xi^2} = b(\xi)u(t), \\[6pt]
\dfrac{\partial^2 x_2(t,\xi)}{\partial t^2} + 2\omega\dfrac{\partial x_1(t,\xi)}{\partial t} - \omega^2 x_2(t,\xi) - \dfrac{\partial^2 x_2(t,\xi)}{\partial\xi^2} = 0,
\end{cases}
\tag{54}
\]
where t > 0, ξ ∈ (0, 1). The boundary conditions are of the form
x1 (t, 0) = x1 (t, 1) = 0,
(55)
x2 (t, 0) = x2 (t, 1) = 0,
Then we find that the system (54) can be written in the form (4), where \(X = L^2((0,1), \mathbb{R}^2)\),
\[ A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -x_1'' - \omega^2 x_1 \\ -x_2'' - \omega^2 x_2 \end{bmatrix}, \tag{58} \]
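This operator has eigenfunctions sin(nπξ) and eigenvalues n²π² − ω², which are real, distinct and, for ω < π, positive, in line with the hypotheses placed on A earlier. A finite-difference check with ω = 1 (grid size is illustrative):

```python
import numpy as np

omega, n = 1.0, 200                    # rotation speed, interior grid points
h = 1.0 / (n + 1)

# -d^2/dxi^2 with fixed ends, then shift by -omega^2 as in (58)
D2 = (np.diag([2.0] * n) + np.diag([-1.0] * (n - 1), 1)
      + np.diag([-1.0] * (n - 1), -1)) / h**2
Aop = D2 - omega**2 * np.eye(n)        # one block of the operator A

lam = np.sort(np.linalg.eigvalsh(Aop))
print(lam[:2])  # approx pi^2 - omega^2 and 4*pi^2 - omega^2
```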
Fig. 2. Effects of using the controller (63) in stabilization of the system (54)
Fig. 3. Effects of using the controller (64) in stabilization of the system (54)
6 Concluding Remarks
infinite number of poles on the imaginary axis. An important role in the stabilization process has been played by the assumption that the input and output operators are collocated. We have proposed a linear dynamic velocity feedback and a linear dynamic position feedback. In the case where velocity is not available, a parallel compensator is necessary to stabilize the system. The asymptotic stability of
the closed loop system in both cases has been proved by LaSalle’s invariance
principle extended to infinite dimensional systems. Numerical simulation results
have shown the effectiveness of the proposed controllers.
Acknowledgement
This work was supported by Ministry of Science and Higher Education in Poland
in the years 2008–2011 as a research project No N N514 414034.
References
[1] Curtain, R.F., Pritchard, A.J.: Infinite dimensional linear systems theory.
Springer, Heidelberg (1978)
[2] Dafermos, C.M., Slemrod, M.: Asymptotic behaviour of nonlinear contraction
semigroups. J. Funct. Anal. 13(1), 97–106 (1973)
[3] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Multi-input partial pole placement for
distributed parameter gyroscopic systems. In: Proc. of the 39th IEEE International
Conference on Decision and Control, Sydney (2000)
[4] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Single-input partial pole-assignment
in gyroscopic quadratic matrix and operator pencils. In: Proc. of the 14th Inter-
national Symposium of Mathematical Theory of Networks and Systems MTNS
2000, Perpignan, France (2000)
[5] Dunford, N., Schwartz, J.T.: Linear operators. Part II. Spectral theory. Self adjoint
operators in Hilbert space. Interscience, New York (1963)
[6] Engel, K.J., Nagel, R.: One-parameter semigroups for linear evolution equation.
Springer, New York (2000)
[7] Hale, J.K.: Dynamical systems and stability. J. Math. Anal. Appl. 26(1), 39–59
(1969)
[8] Hale, J.K., Infante, E.F.: Extended dynamical systems and stability theory. Proc.
Natl. Acad. Sci. USA 58(2), 405–409 (1967)
[9] Kato, T.: Perturbation theory for linear operators. Springer, New York (1980)
[10] Klamka, J.: Controllability of dynamical systems. PWN, Warszawa (1990) (in
Polish)
[11] Kobayashi, T.: Frequency domain conditions of controllability and observability
for a distributed parameter system with unbounded control and observation. Int.
J. Syst. Sci. 23(12), 2369–2376 (1992)
[12] Kobayashi, T.: Low gain adaptive stabilization of undamped second order systems.
Arch. Control Sci. 11(XLVII) (1-2), 63–75 (2001)
[13] Kobayashi, T.: Stabilization of infinite-dimensional undamped second order sys-
tems by using a parallel compensator. IMA J. Math. Control Inf. 21(1), 85–94
(2004)
Abstract. Oscillation and nonoscillation criteria are established for second-order sys-
tems with delayed positive feedback. We consider the stability conditions for the system
without damping and with gyroscopic effect. A general algorithm for finding stability
regions is proposed. Theoretical and numerical results are presented for single-input
single-output case. These results improve some oscillation criteria of [1], [2] and [6].
1 Introduction
The paper expands on a method proposed by [1], [2] and [6] for stabilizing
second-order systems with delayed positive feedback. The system is described
by linear second-order differential equations
\[ G_1(s) = \frac{Y(s)}{U(s)} = C(s^2 + sG + A)^{-1}B. \tag{3} \]
In [7], it has been proved that the system (1) is not asymptotically stable. The
eigenvalues of (1) are different from zero, pairwise conjugated and located on
the imaginary axis.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 99–108.
springerlink.com c Springer-Verlag Berlin Heidelberg 2009
100 W. Mitkowski and P. Skruch
In this paper in order to stabilize the system (1) we use the following positive,
time-delay feedback
u(t) = ky(t − τ ), (4)
where k > 0, τ > 0 and y(t) = 0 for t ∈ [−τ, 0). Using the Laplace transform in
(4), we obtain
U (s) = G2 (s)Y (s), (5)
where
G2 (s) = ke−sτ . (6)
The closed loop system (see Fig. 1) will be defined by the following transfer
function:
\[ G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)}. \tag{7} \]
If we take the Laplace transform in (8), (9) and use zero initial conditions, we
obtain
\[ G_1(s) = \frac{Y(s)}{U(s)} = B^T(s^2 + A)^{-1}B. \tag{11} \]
Simple calculations show that
\[ G_1(s) = \frac{s^2 + 2}{(s^2 + 1)(s^2 + 3)}. \tag{12} \]
The open loop system represented by the transfer function (12) is not asymptotically stable. The poles of (12) are located on the imaginary axis: \(s_1 = j\), \(s_2 = -j\), \(s_3 = \sqrt{3}j\), \(s_4 = -\sqrt{3}j\), \(j^2 = -1\). In order to stabilize the system we consider the following feedback:
In this case the closed loop system (11), (13) with the matrices (10) is described
by the transfer function
The stability of the closed loop system (14) will be checked by exploring the
Nyquist plot of
G12 (s) = −G1 (s)G2 (s). (15)
The Nyquist plot allows us to gain insight into stability of the closed loop system
by analyzing the contour of the frequency response function G12 (jω) on the
complex plane. In this case
where
\[ \operatorname{Re} G_{12}(j\omega) = k\,\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\cos(\omega\tau), \tag{17} \]
\[ \operatorname{Im} G_{12}(j\omega) = -k\,\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\sin(\omega\tau). \tag{18} \]
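Formulas (17) and (18) follow from evaluating \(G_{12}(s) = -G_1(s)G_2(s)\) at \(s = j\omega\); the identity can be cross-checked numerically at any ω away from the poles 1 and √3:

```python
import cmath, math

def G12(w, k, tau):
    """-G1(jw)G2(jw) for G1 = (s^2+2)/((s^2+1)(s^2+3)), G2 = k e^{-s tau}."""
    s = 1j * w
    return -((s * s + 2) / ((s * s + 1) * (s * s + 3))) * k * cmath.exp(-s * tau)

def re_formula(w, k, tau):   # formula (17)
    return k * (w * w - 2) / (w**4 - 4 * w * w + 3) * math.cos(w * tau)

def im_formula(w, k, tau):   # formula (18)
    return -k * (w * w - 2) / (w**4 - 4 * w * w + 3) * math.sin(w * tau)

w, k, tau = 2.5, 1.2, 0.9
val = G12(w, k, tau)
print(abs(val.real - re_formula(w, k, tau)) < 1e-12,
      abs(val.imag - im_formula(w, k, tau)) < 1e-12)  # True True
```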
We plot the graph of (16) only for positive frequencies, that is for ω ∈ [0, +∞); the second half of the curve can be obtained by reflecting it over the real axis. The magnitude and the phase of the function (16) are given by
\[ |G_{12}(j\omega)| = k\left|\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\right|, \tag{19} \]
Then we will be sure that there are no encirclements of the (−1, j0) point. The first condition, \(\operatorname{Im} G_{12}(j\omega) = 0\), is true when
\[ \omega = \sqrt{2} \quad \text{or} \quad \omega = \frac{n\pi}{\tau}, \tag{23} \]
√
for all n = 1, 2, . . .. At ω = 2 we have Re G12 (jω) = 0. This means that the
magnitude |G12 (jω)| = 0. At ω = nπ τ , n = 1, 2, . . ., the condition Re G12 (jω) < 0
is equivalent to
(nπτ )2 − 2τ 4
(−1)n k < 0, (24)
(nπ) − 4(nπτ )2 + 3τ 4
4
for all n = 1, 2, . . ..
Now, let us consider what happens at ω = 1 and ω = √3. Since the magnitude (19) is infinite at these frequencies, we need to be sure that
\[ \lim_{\omega\to 1^-} \operatorname{Im} G_{12}(j\omega) > 0 \tag{25} \]
and
\[ \lim_{\omega\to \sqrt{3}^-} \operatorname{Im} G_{12}(j\omega) > 0. \tag{26} \]
Then the "points" \(G_{12}(j1^-)\) and \(G_{12}(j1^+)\) (or \(G_{12}(j\sqrt{3}^-)\) and \(G_{12}(j\sqrt{3}^+)\)) will be connected by the polar plot in the clockwise direction by a circular arc of radius R = ∞ and angle φ = π, centred at the origin. In other words, the (−1, j0) point will not be embraced by the curve \(G_{12}(j\omega)\). The inequalities (25) and (26) are equivalent to the following ones:
\[ \sin\tau > 0 \quad \text{and} \quad \sin\sqrt{3}\,\tau > 0, \tag{27} \]
respectively.
Stabilization Results of Second-Order Systems 103
\[ (-1)^n k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4} < 0, \]
(Fig. 2: stability regions of the closed loop system (14) in the (k, τ) plane, k ∈ [0, 1.5], τ ∈ [0, 50], with marked points A, B and C.)
Fig. 3. Trajectories of the closed loop system (14) corresponding to the points A, B
and C
Fig. 4. The plot of the function G12 (jω) corresponding to the parameters k = 1.2,
τ = 0.9 (point A)
Fig. 5. The plot of the function G12 (jω) corresponding to the parameters k = 0.5,
τ = 7.53 (point B)
Fig. 2 illustrates the stability regions for the system (14). For example, the point A = (1.2, 0.9) is located in the stability region. The point B = (0.5, 7.53) corresponds to a system that is stable but not asymptotically stable, and the point C = (1.0, 3.0) lies in the unstable region. The trajectories corresponding to the points A, B and C are shown in Fig. 3. Figs. 4, 5 and 6 present the Nyquist plots for the asymptotically stable, stable and unstable systems.
Fig. 6. The plot of the function G12 (jω) corresponding to the parameters k = 1.0,
τ = 3.0 (point C)
3 Gyroscopic System
The gyroscopic system is a system of differential equations of the form
The Laplace transform of the system (28) and (29) determines the following
transfer function:
\[ G_1(s) = \frac{Y(s)}{U(s)} = B^T(s^2 + sG + A)^{-1}B. \tag{31} \]
Using (30) in (31) we obtain
\[ G_1(s) = \frac{s^2 + 2}{s^4 + 5s^2 + 3}. \tag{32} \]
The open loop system is not asymptotically stable, its eigenvalues are located
on the imaginary axis
\[ s_{1,2} = \pm j\sqrt{\frac{5 - \sqrt{13}}{2}}, \quad s_{3,4} = \pm j\sqrt{\frac{5 + \sqrt{13}}{2}}. \tag{33} \]
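The eigenvalue locations (33) can be confirmed numerically from the characteristic polynomial \(s^4 + 5s^2 + 3\):

```python
import numpy as np

# Roots of s^4 + 5 s^2 + 3 = 0: s^2 = (-5 +/- sqrt(13))/2, both negative,
# so all four eigenvalues are purely imaginary.
roots = np.roots([1, 0, 5, 0, 3])
print(np.allclose(roots.real, 0, atol=1e-7))  # on the imaginary axis
mags = np.sort(np.unique(np.round(np.abs(roots), 6)))
print(mags)  # ~0.835 = sqrt((5-sqrt(13))/2), ~2.0743 = sqrt((5+sqrt(13))/2)
```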
(Fig. 7: stability regions of the closed loop system (35) in the (k, τ) plane, k ∈ [0, 1.5], τ ∈ [0, 50].)
Let
U (s) = G2 (s)Y (s), G2 (s) = ke−sτ . (34)
The closed loop system (31), (34) with the matrices (30) is given by
\[ G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)} = \frac{k(s^2 + 2)e^{-s\tau}}{s^4 + 5s^2 + 3 - k(s^2 + 2)e^{-s\tau}}. \tag{35} \]
Let us note that the difference between the non-gyroscopic (12) and gyroscopic
system (32) is in the denominator of the appropriate transfer function. Using
the same technique as in the previous section, we can easily give the conditions,
which let us determine the range of allowable parameters k and τ . They are as
follows:
(a) \(k \in (0, \tfrac{3}{2})\), \(\sin s_1\tau > 0\), \(\sin s_3\tau > 0\);
(b) if there exists \(\omega_0 = n\pi/\tau \notin \{s_1, s_3\}\), n = 1, 2, ..., such that
\[ (-1)^n k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4} < 0, \]
then it must satisfy
\[ k\,\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4} < 1. \]
Fig. 7 shows the graphical representation of the stability regions for the system
(35).
Based on our discussion, we can establish an algorithm for finding the range
of allowable parameters of the positive time-delay controller (4) in order to guar-
antee stability of the general second-order system (1), (2). The algorithm can be
easily implemented in MATLAB-Simulink environment.
ALGORITHM: The algorithm for finding stability regions for the generalized
second-order system with the positive time-delay feedback.
INPUT: The matrices \(G \in \mathbb{R}^{n\times n}\), \(A \in \mathbb{R}^{n\times n}\), \(B \in \mathbb{R}^{n\times 1}\), \(C \in \mathbb{R}^{1\times n}\), the transfer function of the system
G1 (s) = C[s2 + sG + A]−1 B,
the transfer function of the controller
G2 (s) = ke−sτ ,
the transfer function of the closed loop system
G1 (s)G2 (s)
G(s) = .
1 − G1 (s)G2 (s)
OUTPUT: The set S = {(k, τ ) ∈ R2 : the closed loop system is asymptotically
stable}.
ASSUMPTIONS: G = −GT , A = AT > 0, the system is observable, the open
loop system has all eigenvalues located on the imaginary axis, the multiplicity
of all eigenvalues is equal one.
STEP 1: Find the poles of the open loop system: si = jωi , ωi > 0, i =
1, 2, . . . , n.
STEP 2: Determine the set S1 = {(k, τ ) ∈ R2 : τ > 0, k > 0, kCA−1 B < 1}.
STEP 3: Determine the set Ω = {ω ∈ (0, +∞)\{ω1 , ω2 , . . . , ωn } : Im G12 (jω) =
0 and Re G12 (jω) < 0}.
STEP 4: Determine the set S2 = {(k, τ ) ∈ R2 : ∀ω∗ ∈Ω |G12 (jω ∗ )| < 1, k >
0, τ > 0}.
STEP 5: Determine the set S3 = {(k, τ ) ∈ R2 : limω→ωi− Im G12 (jω) > 0, i =
1, 2, . . . , n}.
STEP 6: Determine the set S = S1 ∩ S2 ∩ S3 .
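A compact point test implementing these steps for the non-gyroscopic example (12) is sketched below; the text suggests the MATLAB-Simulink environment, Python is used here purely for illustration, and the finite n_max cut-off on the crossing frequencies is an implementation assumption:

```python
import math

def ratio(w):
    """The factor (w^2-2)/(w^4-4w^2+3) appearing in (17)-(19)."""
    return (w * w - 2) / (w**4 - 4 * w * w + 3)

def asymptotically_stable(k, tau, n_max=200):
    """Point test: is (k, tau) in S = S1 & S2 & S3 for G1 from (12)?"""
    # S1: k, tau > 0 and k*G1(0) = 2k/3 < 1
    if not (k > 0 and tau > 0 and 2 * k / 3 < 1):
        return False
    # S3: Im G12 approaches the poles w = 1, sqrt(3) from the upper
    # half-plane, i.e. sin(tau) > 0 and sin(sqrt(3)*tau) > 0 (see (27))
    if not (math.sin(tau) > 0 and math.sin(math.sqrt(3) * tau) > 0):
        return False
    # S2: at real-axis crossings w = n*pi/tau with Re G12 < 0,
    # require |G12| < 1 so the point (-1, j0) is not encircled
    for n in range(1, n_max + 1):
        w = n * math.pi / tau
        if abs(w - 1) < 1e-9 or abs(w - math.sqrt(3)) < 1e-9:
            continue  # pole crossings are handled by the S3 condition
        r = (-1) ** n * k * ratio(w)
        if r < 0 and abs(r) >= 1:
            return False
    return True

print(asymptotically_stable(1.2, 0.9), asymptotically_stable(1.0, 3.0))
```

Points A = (1.2, 0.9) and C = (1.0, 3.0) from the earlier example are classified as asymptotically stable and not asymptotically stable, respectively, so the printed output is True False.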
4 Concluding Remarks
In this paper stabilization problem of matrix second-order systems has been
discussed. We have presented our results for single-input single-output case. The
systems have all poles located on the imaginary axis. We have proved that the
system can be stabilized by delayed positive feedback. The analysis of the closed
loop system has been performed using the Nyquist criterion. An algorithm for
finding stability regions has been proposed and then validated by a series of
numerical computations in the MATLAB-Simulink environment. It would be
interesting to extend the results to infinite-dimensional second-order dynamical
systems described by singular partial differential equations.
Acknowledgement
This work was supported by the Ministry of Science and Higher Education in Poland
in the years 2008–2011 as a research project No N N514 414034.
108 W. Mitkowski and P. Skruch
References
[1] Abdallah, C., Dorato, P., Benitez-Read, J., Byrne, R.: Delayed positive feedback
can stabilize oscillatory systems. In: Proc. of the American Control Conference,
San Francisco CA, pp. 3106–3107 (1993)
[2] Buslowicz, M.: Stabilization of LC ladder network by delayed positive feedback
from output. In: Proc. XXVII International Conference on Fundamentals of Elec-
trotechnics and Circuit Theory, IC-SPETO 2004, pp. 265–268 (2004) (in Polish)
[3] Elsgolc, L.E.: Introduction to the theory of differential equations with delayed
argument. Nauka, Moscow (1964) (in Russian)
[4] Górecki, H., Fuksa, S., Grabowski, P., Korytowski, A.: Analysis and synthesis of
time delay systems. PWN, Warszawa (1989)
[5] Mitkowski, W.: Stabilization of dynamic systems. WNT, Warszawa (1991) (in Pol-
ish)
[6] Mitkowski, W.: Static feedback stabilization of RC ladder network. In: Proc.
XXVIII International Conference on Fundamentals of Electrotechnics and Circuit
Theory, IC-SPETO, pp. 127–130 (2005)
[7] Skruch, P.: Stabilization of second-order systems by non-linear feedback. Int. J.
Appl. Math. Comput. Sci. 14(4), 455–460 (2004)
A Comparison of Modeling Approaches for the
Spread of Prion Diseases in the Brain
Franziska Matthäus
Abstract. In this article we will present and compare two different modeling ap-
proaches for the spread of prion diseases in the brain. The first is a reaction-diffusion
model, which allows the description of prion spread in simple brain subsystems, like
nerves or the spine. The second approach is the combination of epidemic models with
transport on complex networks. With the help of these models we study the dependence
of the disease progression on transport phenomena and the topology of the underlying
network.
1 Introduction
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 109–117.
springerlink.com
© Springer-Verlag Berlin Heidelberg 2009
110 F. Matthäus
PrPc + PrPsc −→ PrPsc + PrPsc
(see Figure 1). After the conversion, the two resulting infective agents dissociate
and the process can start again with new PrPc.
Prion transport in the brain is another field where experimental data is sparse.
However, there are indications that prions move within the brain via axonal
transport [1], where the transport happens in both directions (anterograde and
retrograde). The speed of 1 mm/d coincides with the speed of passive
neuronal transport [7].
The one-dimensional domain can be associated with simple brain substructures,
such as nerves or the spine. The reaction-diffusion system for the
heterodimer model then has the following form:
∂A/∂t = v0 − kA·A − kAB·A·B + D∇²A,
∂B/∂t = kAB·A·B − kB·B + D∇²B,      (2)
with the initial conditions A(0, x) = A0 (x) ≥ 0, B(0, x) = B0 (x) ≥ 0.
This set of equations (2) has been used by Payne and Krakauer [13] to study
inter-strain competition. Qualitatively they could show how after co-infection
with two different prion strains the first inoculated strain can slow down or even
stop the spread of the second strain and prevail, even if it has a longer incubation
period.
The parameters for this model have been estimated in [10], and are summarized
in Table 1. With the estimated parameter values the solutions of the system
take the form of a traveling wave (see Figure 2).
Table 1. Estimated parameter values for the heterodimer model [10]

v0    4      μg/(g·d)
kA    4      d⁻¹
kB    0.03   d⁻¹
kAB   0.15   (μg·d/g)⁻¹
D     0.05   mm²/d
[Fig. 2: scrapie prion concentration B (in μg/g, 0-120) versus distance (in mm, 0-100) at t = 450 days, with initial infection B(0) = 0.025 μg/g]
For the heterodimer model (2), the speed of the traveling wave front for scrapie
prion, cB, can be determined analytically [10, 12], and depends on the kinetic
parameters as cB = 2·√(D·kAB·(A1∗ − A2∗)), where A1∗ and A2∗ stand for the
steady state concentrations of cellular prion in the healthy system (absence of
scrapie prion) and in the diseased system (after infection with scrapie prions),
respectively.
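This relation can be checked by integrating system (2) numerically with the Table 1 parameters. From (2), the healthy steady state is A1∗ = v0/kA = 1 μg/g and the diseased one is A2∗ = kB/kAB = 0.2 μg/g, giving cB ≈ 0.155 mm/d. The following Python sketch (a simple explicit finite-difference scheme; domain size, grid spacing and the inoculation profile are choices made here, not taken from the chapter) measures the front speed:

```python
import numpy as np

# Parameters from Table 1
v0, kA, kB, kAB, D = 4.0, 4.0, 0.03, 0.15, 0.05

# Steady states implied by (2): healthy A1* = v0/kA, diseased A2* = kB/kAB
A1, A2 = v0 / kA, kB / kAB
c_theory = 2 * np.sqrt(D * kAB * (A1 - A2))        # ~0.155 mm/d

dx, dt = 0.25, 0.05                                 # mm, days (explicit Euler)
x = np.arange(0.0, 100.0, dx)
A = np.full(x.size, A1)                             # healthy tissue everywhere
B = np.where(x < 2.0, 0.025, 0.0)                   # inoculation near x = 0

def lap(u):                                         # 1-D Laplacian, zero-flux ends
    return (np.concatenate(([u[1]], u[:-1])) - 2 * u
            + np.concatenate((u[1:], [u[-2]]))) / dx**2

def front(B):                                       # rightmost half-maximum crossing
    above = np.nonzero(B > B.max() / 2)[0]
    return x[above[-1]] if above.size else 0.0

n_steps = int(400.0 / dt)
for step in range(1, n_steps + 1):
    dA = v0 - kA * A - kAB * A * B + D * lap(A)
    dB = kAB * A * B - kB * B + D * lap(B)
    A, B = A + dt * dA, B + dt * dB
    if step == n_steps // 2:
        x_mid = front(B)                            # front position at t = 200 d

c_num = (front(B) - x_mid) / 200.0                  # average speed over [200, 400] d
print(round(c_theory, 3), round(c_num, 3))
```

The measured speed agrees with the analytic value to within discretization error, and both are consistent with the front position visible in Figure 2 (roughly 70 mm at t = 450 d).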
With the spatial model (2) it can also be shown that the diffusion coefficient
has an influence on the overall concentration dynamics of PrPsc . In Figure 3
we show the dynamics of the PrPsc -concentration, averaged over the domain Ω,
for varying D. For small diffusion coefficients, the traveling wave forms and the
access of PrPsc to its substrate PrPc is limited. In this case the concentration
dynamics are dominated by linear growth. For large diffusion coefficients PrPsc
quickly distributes in space and the concentration dynamics show a sigmoidal
evolution, similar to the results of the heterodimer model without diffusion.
[Fig. 3: average PrPsc concentration B (0-120) versus time (0-400 days), for diffusion coefficients D = 0.9, 0.5, 0.1 and 0.05 mm²/d]
Fig. 4. Incubation times depending on the interval between intraocular infection and
surgical eye removal. Experimental data from [16] (× with error bars) and simulation
data for two different parameter sets.
The model (2) can also be related to experiments, for example when modeling
the spread of prions in the mouse visual system. Here we can make use of the fact
that the mouse visual system is nearly linear, with the optic nerve projecting
from the eye to the lateral geniculate nucleus (LGN), and the optic radiations then
projecting from the LGN to the visual cortex. Because of this simple structure
the system can be approximated by a one-dimensional domain and our model
applies.
Scott et al. [15, 16] carried out experiments to show the dependence of the in-
cubation time tinc on the dose of intraocularly injected scrapie material. Further-
more, they investigated how the incubation time changes when the eye is surgically
removed at different time intervals after intraocular infection. For the first exper-
iment, the relationship tinc ∝ log(dose) found can be easily reproduced with our
spatial model, however, here the spatial component is not essential, as any model
with a near-exponential initial phase would give the same result. The situation
is different for the experiments on surgical eye removal. Here a model
without a spatial component is not sufficient. With a spatial model, eye removal
can be simulated by a change in the domain Ω, in particular by inserting zero-flux
boundary conditions at the position where the optic nerve is cut. To compare the
results of the simulations with the real data we modified the model slightly to ob-
tain a better description of the spatial domain and the steady state distribution
of PrPc. For details see [9, 10]. The results of the simulation fit the experimental
data well, showing a decreasing incubation time for larger intervals between
infection and surgical eye removal (see Figure 4).
3 Network Models
The complexity of the brain neuronal network and the fact that prions are trans-
ported across the edges of this network make the application of reaction-diffusion
equations on larger brain systems very difficult. However, some results on the
spread of infections on networks can be obtained by combining epidemic models
with transport on networks. In the previous section we showed that the disease
kinetics are dependent on prion transport. In the present section we will show
that the topology of the underlying network affects the disease spread as well.
Networks consist of a set of N nodes and M edges, where the nodes represent
the neuronal cells and the edges denote whether a connection (in the form of a
synapse or gap junction) exists between two cells. The number of edges
originating from a node corresponds to the number of neighbors of the node
and is called the node's degree k. The average of the degrees of all nodes in
the network is called the degree of the network, ⟨k⟩. Networks can be classified
according to their degree distribution P(k). In this article we will focus on one
network model, called small world [17]. Small worlds can be constructed from d-
dimensional regular grids by rewiring the edges with a probability p. Depending
on p, this model interpolates between regular and totally random networks, and
is therefore a good model for our purpose.
To describe the spread of infective diseases on networks, the network model is
combined with a model of epidemic diseases, the SI model. The SI model classifies
nodes into two discrete states, namely susceptible or infected. Susceptible nodes
become infected with probability ξ, where ξ is a function of the transmission
probability between two neighbors λ, and the number of infected neighbors m:
ξ = 1 − (1 − λ)^m.
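The rewiring experiment behind Figure 5 can be sketched as follows. This is an illustration only: the Watts-Strogatz construction details, the network size, λ = 0.3, synchronous updates, and the 90 % stopping fraction (instead of 95 %, for robustness on possibly fragmented heavily rewired graphs) are assumptions made here:

```python
import random

def small_world(n, k, p, rng):
    """Watts-Strogatz small world: ring lattice with k neighbours on each
    side, every lattice edge rewired with probability p (no self-loops or
    duplicate edges)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for i in range(n):
        for j in range(1, k + 1):
            old = (i + j) % n
            if rng.random() < p and old in adj[i]:
                new = rng.randrange(n)
                while new == i or new in adj[i]:
                    new = rng.randrange(n)
                adj[i].discard(old); adj[old].discard(i)
                adj[i].add(new); adj[new].add(i)
    return adj

def si_spread_time(adj, lam, rng, frac=0.9, max_t=5000):
    """Iterations of the SI model, with infection probability
    xi = 1 - (1 - lam)^m, until a fraction `frac` of the nodes is infected."""
    infected = {0}
    for t in range(1, max_t + 1):
        newly = set()
        for v in adj:
            if v in infected:
                continue
            m = sum(1 for u in adj[v] if u in infected)
            if m and rng.random() < 1 - (1 - lam) ** m:
                newly.add(v)
        infected |= newly
        if len(infected) >= frac * len(adj):
            return t
    return max_t

rng = random.Random(1)
t_reg = si_spread_time(small_world(500, 2, 0.001, rng), 0.3, rng)
t_sw = si_spread_time(small_world(500, 2, 0.3, rng), 0.3, rng)
print(t_reg, t_sw)   # shortcuts created by rewiring speed up the spread
```

On the nearly regular lattice the infection advances as a front and needs many iterations, while a modest amount of rewiring creates shortcuts that let it reach most of the network far faster, as in Figure 5.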
Fig. 5. Number of iterations until 95% of the network is infected, depending on the
rewiring probability p
[Fig. 6: degree heterogeneity ⟨k²⟩/⟨k⟩ (4.0-4.8) versus rewiring probability p (0-1)]
The simulation was run until 95 % of the network got infected; the result for
every p is the average over various realizations of the network.
The result, displayed in Figure 5, shows clearly that the velocity of the spread
increases with increasing rewiring probability. The crucial network feature is
thereby the degree heterogeneity of the network, defined as ⟨k²⟩/⟨k⟩. For small
worlds, the degree heterogeneity increases with p, as shown in Figure 6. In [3] it
was shown that the time scale τ of the initial exponential growth of the epidemics
is related to the degree heterogeneity as:
τ = ⟨k⟩ / (λ(⟨k²⟩ − ⟨k⟩)).    (3)
This relation shows that for scale-free networks, characterized by a power-law
degree distribution with P(k) ∝ k^(−α), the epidemics can have an extremely fast
initial growth, because here ⟨k²⟩ diverges with the network size.
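The divergence of ⟨k²⟩ can be made concrete by computing ⟨k²⟩/⟨k⟩ for a truncated power-law degree distribution and letting the cutoff grow. This is a small illustrative computation; the exponent α = 2.5 and the cutoffs are arbitrary choices:

```python
def heterogeneity(alpha, kmax):
    """<k^2>/<k> for the truncated power law P(k) ~ k^(-alpha), k = 1..kmax."""
    z = sum(k ** -alpha for k in range(1, kmax + 1))
    k_avg = sum(k ** (1 - alpha) for k in range(1, kmax + 1)) / z
    k2_avg = sum(k ** (2 - alpha) for k in range(1, kmax + 1)) / z
    return k2_avg / k_avg

# For alpha < 3 the second moment, and hence the ratio, grows without bound
# as the degree cutoff kmax increases.
for kmax in (10, 100, 1000, 10000):
    print(kmax, round(heterogeneity(2.5, kmax), 2))
```

By eqn. (3), this growth of ⟨k²⟩/⟨k⟩ directly shrinks the time scale τ of the initial exponential growth.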
[Fig. 7: average survival time (55-80 iterations) versus node degree (2-12)]
The neuronal network of the brain is an example of a very large network, and
although the degree variability of the nodes is bounded by the number of synapses
a cell can form, this number can be as high as 2·10⁵ for Purkinje cells [18]. To
determine the exact growth rate of the number of infected cells, estimates for
the transmission probability and for the degree heterogeneity are needed, which
are points that still need experimental and theoretical investigation.
The spread of epidemics on networks differs in many aspects from diffusive
spread on homogeneous domains. One example is the following: On a homoge-
neous domain, the time when a cell becomes infected depends only on its distance
from the origin of the infection. On networks, this time is also influenced by the
degree of the node. To show this, we set p = 1 to obtain networks with a large
degree variation and again simulate the outbreak of an epidemic applying the SI
model.
Figure 7 shows the average survival time of nodes in dependence on their
degree. One can see that on average nodes of high degree are earlier infected
(have shorter survival times) than nodes of low degree. The reason is that nodes
of high degree have more neighbors from which they can contract the disease.
Instead of looking at the survival times directly, Barthélemy et al. [3] measured
(with the same result) the average degree of the newly infected nodes and the
inverse participation ratio, defined as Y2(t) = Σ_k (Ik/I)², where Ik/I denotes
the fraction of infected nodes of degree k in relation to all infected nodes.
4 Conclusions
The two approaches describe the disease progression on different scales. The
diffusion approach focuses on the mechanism of prion-prion interaction, but is
limited to simple spatial domains. The network approach takes into account the
complexity of the domain, in which the transport of the infective agent takes
place, but therefore is no longer specific for prion diseases.
The problem with models of prion spread in the brain is the scarcity of
experimental data. Not only is the prion interaction mechanism not fully
understood; the topology of the brain's neuronal network is also unclear. The
aim of this article is to present some general results obtained by the use of very
simple models.
With the appearance of new experimental data, the development of more
detailed models will become feasible. A possibility here is the combination of a
kinetic model for prion-prion interaction with transport on networks, and thus
the study of reaction-diffusion systems on networks. Some work on reaction-
diffusion systems on networks has been carried out, for example, in [6], which
deals with annihilation processes of the types A + A → 0 and A + B → 0 on
scale-free networks, or in [2], where the Gierer-Meinhardt model was studied
on random and scale-free networks. The models for prion spread derived by
combining prion-prion interaction with transport on networks eventually should
not only account for long incubation periods but also provide a description of
local prion accumulations and the formation of plaques.
References
[1] Armstrong, R.A., Lantos, P.L., Cairns, N.J.: The spatial patterns of prion deposits
in Creutzfeldt-Jakob disease: comparison with β-amyloid deposits in Alzheimer’s
disease. Neurosci. Lett. 298, 53–56 (2001)
[2] Banerjee, S., Mallik, S.B., Bose, I.: Reaction-diffusion processes on random and
scale-free networks (2004) arXiv:cond-mat/0404640
[3] Barthélemy, M., Barrat, A., Pastor-Satorras, R., Vespignani, A.: Velocity and
hierarchical spread of epidemic outbreaks in scale-free networks (2004) arXiv:cond-
mat/0311501
[4] Eigen, M.: Prionics or the kinetic basis of prion diseases. Biophys. Chem. 63,
A1–A18 (1996)
[5] Galdino, M.L., de Albuquerque, S.S., Ferreira, A.S., Cressoni, J.C., dos Santos,
R.J.V.: Thermo-kinetic model for prion diseases. Phys. A 295, 58–63 (2001)
[6] Gallos, L.K., Argyrakis, P.: Absence of kinetic effects in reaction-diffusion pro-
cesses in scale-free networks. Phys. Rev. Lett. 92(13), 138301 (2004)
[7] Glatzel, M., Aguzzi, A.: Peripheral pathogenesis of prion diseases. Microbes. In-
fect. 2, 613–619 (2000)
[8] Harper, J.D., Lansbury Jr., P.T.: Models of amyloid seeding in Alzheimer’s dis-
ease and scrapie: mechanistic truths and physiological consequences of the time-
dependent solubility of amyloid proteins. Annu. Rev. Biochem. 66, 385–407 (1997)
[9] Matthäus, F.: Hierarchical modeling of prion spread in brain tissue, PhD thesis
(2005)
[10] Matthäus, F.: Diffusion versus network models as descriptions for the spread of
prion diseases in the brain. J. Theor. Biol. (in press) (2005)
[11] Masel, J., Jansen, V.A.A., Nowak, M.A.: Quantifying the kinetic parameters of
prion replication. Biophys. Chem. 77, 139–152 (1999)
[12] Murray, J.D.: Mathematical Biology. Springer, Heidelberg (1989)
[13] Payne, R.J.H., Krakauer, D.C.: The spatial dynamics of prion disease. Proc. R.
Soc. Lond. B 265, 2341–2346 (1998)
[14] Prusiner, S.B.: Prions. Proc. Natl. Acad. Sci. USA 95, 13363–13383 (1998)
[15] Scott, J.R., Davies, D., Fraser, H.: Scrapie in the central nervous system: neu-
roanatomical spread of infection and Sinc control of pathogenesis. J. Gen. Vi-
rol. 73, 1637–1644 (1992)
[16] Scott, J.R., Fraser, H.: Enucleation after intraocular scrapie injection delays the
spread of infection. Brain Res. 504, 301–305 (1989)
[17] Watts, D.J., Strogatz, S.H.: Collective dynamics of ’small-world’ networks. Na-
ture 393, 440–442 (1998)
[18] http://faculty.washington.edu/chudler/facts.html#brain
Ensemble Modeling for Bio-medical Applications
1 Introduction
Ensemble methods have gained increasing attention in the last decade [1, 2] and
seem to be a promising approach for improving the generalization error of
existing statistical learning algorithms in regression and classification tasks.
The output of an ensemble model is the average of the outputs of the individual
models belonging to the ensemble. In prediction problems an ensemble typically
outperforms single models. Almost all ensemble methods described so far use
models of one single class, e.g. neural networks [1, 2, 3, 4] or regression trees [5].
We suggest building ensembles of different model classes in order to improve the
performance in regression problems. The theoretical background of our approach is
provided by the bias/variance decomposition of the ensemble. We argue that an
ensemble of heterogeneous models usually leads to a reduction of the ensemble
variance because the cross terms in the variance contribution have a higher
ambiguity. Furthermore, we describe the structure of the programming toolkit and its usage.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 119–135.
springerlink.com
© Springer-Verlag Berlin Heidelberg 2009
120 C. Merkwirth, J. Wichard, and M.J. Ogorzalek
Any type of model constructed has to pass the validation stage – estimation
of the generalization error using just the given data set. Naturally, we select
the model with the lowest (estimated) generalization error. Typical remedies to
improve the generalization error are:
• Manipulating the training algorithm (e.g. early stopping)
• Regularization by adding a penalty term to the loss function
• Using algorithms with built-in capacity control (e.g. SVM)
• Relying on criteria like BIC, AIC, GCV or cross-validation to select the
optimal model complexity
• Reformulating the loss function, e.g. by using an ε-insensitive loss
3 Ensemble Methods
Building an ensemble consists of averaging the outputs of several separately
trained models:

• Simple average: f̄(x) = (1/K) Σ_{k=1}^{K} f_k(x)
• Weighted average: f̄(x) = Σ_k w_k f_k(x) with Σ_k w_k = 1
The ensemble generalization error is never larger than the average expected
error of the individual models. An ensemble should consist of well trained but
diverse models.
The expected generalization error of a model f at input x is

Err(x) = E[(y − f(x))²],    (1)

where the expectation E[·] is taken with respect to the probability distribution
P. The bias/variance decomposition of Err(x) is

Err(x) = σ² + Bias(f(x))² + Var(f(x)),    (2)

with Bias(f(x)) = E_D[f(x)] − E[y|x] and Var(f(x)) = E_D[(f(x) − E_D[f(x)])²],
where the expectation ED [·] is taken with respect to all possible realizations of
training sets D with fixed sample size N and E[y|x] is the deterministic part of
the data and σ² is the variance of y given x. Balancing the bias and
the variance term is a crucial problem in model building. If we try to decrease
the bias term on a specific training set, we usually increase the variance term
and vice versa. We now consider the case of an ensemble average f̂(x) consisting of
K individual models

f̂(x) = Σ_{i=1}^{K} ω_i f_i(x),  ω_i ≥ 0,    (4)

where the weights sum to one: Σ_{i=1}^{K} ω_i = 1. If we put this into eqn. (2) we
get
Err(x) = σ² + Bias(f̂(x))² + Var(f̂(x)),    (5)
and we can have a look at the effects concerning bias and variance. The bias
term in eqn. (5) is the average of the biases of the individual models. So we
should not expect a reduction in the bias term compared to single models.
The variance term of the ensemble could be decomposed in the following way:

Var(f̂) = E[(f̂ − E[f̂])²]
       = E[(Σ_{i=1}^{K} ω_i f_i)²] − (E[Σ_{i=1}^{K} ω_i f_i])²
       = Σ_{i=1}^{K} ω_i² (E[f_i²] − E²[f_i])
         + 2 Σ_{i<j} ω_i ω_j (E[f_i f_j] − E[f_i] E[f_j]),    (6)
where the expectation is taken with respect to D and x is dropped for simplicity.
The first sum in eqn. (6) gives the lower bound of the ensemble variance and
contains the variances of the ensemble members. The second sum contains the
cross terms of the ensemble members and disappears if the models are completely
uncorrelated [7]. The reduction of the variance of the ensemble is related to the
degree of independence of the single models. This is a key feature of the ensemble
approach.
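The effect of the cross terms can be checked with a small numerical experiment. The following sketch is illustrative only: it uses synthetic "model outputs" of unit variance with a prescribed pairwise correlation ρ rather than real trained models:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 10, 200_000
w = np.full(K, 1.0 / K)          # equal ensemble weights

def ensemble_variance(rho):
    """Sampled variance of the weighted average of K unit-variance model
    outputs with pairwise correlation rho; by eqn. (6) the analytic value
    is 1/K + (1 - 1/K)*rho."""
    cov = np.full((K, K), rho)
    np.fill_diagonal(cov, 1.0)
    f = rng.multivariate_normal(np.zeros(K), cov, size=N)
    return (f @ w).var()

# Uncorrelated models: cross terms vanish and Var ~ 1/K = 0.1.
# Strongly correlated models (rho = 0.9): Var ~ 0.1 + 0.9*0.9 = 0.91.
print(round(ensemble_variance(0.0), 2), round(ensemble_variance(0.9), 2))
```

With uncorrelated members the ensemble variance drops to the lower bound 1/K of eqn. (6); with strongly correlated members the cross terms dominate and averaging gains almost nothing, which is exactly why decorrelation matters.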
There are several ways to increase model decorrelation. In the case of neural
network ensembles, the networks can have different topology, different training
algorithms or different training subsets [2, 1]. For the case of fixed topology, it is
sufficient to use different initial conditions for the network training [4]. Another
way of variance reduction is Bagging, where an ensemble of predictors is trained
on several bootstrap replicates of the training set [8]. When constructing k-
Nearest-Neighbor models, the number of neighbors and the metric coefficients
could be used to generate diversity.
Krogh et al. derive the equation E = Ē − Ā, which relates the ensemble
generalization error E with the average generalization error Ē of the individual
models and the variance Ā of the model outputs with respect to the average
output. When keeping the average generalization error Ē of the individual models
constant, the ensemble generalization error E should decrease with increasing
diversity Ā of the models. Hence we try to increase Ā by using two strategies:
1. Resampling: We train each model on a randomly drawn subset of 80% of all
training samples. The number of models trained for one ensemble is chosen
so that usually all samples of the training set are covered at least once by
the different subsets.
2. Variation of model type: We employ two different model types, which are lin-
ear models trained by ridge regression and k-nearest-neighbor (k-NN) models
with adaptive metric.
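These two strategies can be illustrated on a toy regression problem. The sketch below is not the chapter's MATLAB toolkit: the data set, the polynomial ridge fitter and the brute-force k-NN (without the adaptive metric) are simplified stand-ins chosen here for self-containedness:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem with noise
def target(x): return np.sin(3 * x)
X = rng.uniform(-1, 1, 200); y = target(X) + rng.normal(0, 0.3, 200)
Xt = np.linspace(-1, 1, 500); yt = target(Xt)

def fit_ridge(x, y, lam=1e-3, deg=5):
    # Polynomial ridge regression as a simple stand-in "linear" model
    A = np.vander(x, deg + 1)
    w = np.linalg.solve(A.T @ A + lam * np.eye(deg + 1), A.T @ y)
    return lambda q: np.vander(q, deg + 1) @ w

def fit_knn(x, y, k=7):
    # Plain k-nearest-neighbor regressor (no adaptive metric here)
    def predict(q):
        d = np.abs(q[:, None] - x[None, :])
        idx = np.argsort(d, axis=1)[:, :k]
        return y[idx].mean(axis=1)
    return predict

models = []
for i in range(20):
    sub = rng.choice(200, size=160, replace=False)   # strategy 1: 80% resampling
    fit = fit_ridge if i % 2 == 0 else fit_knn       # strategy 2: vary model type
    models.append(fit(X[sub], y[sub]))

preds = np.array([m(Xt) for m in models])
mse_single = ((preds - yt) ** 2).mean(axis=1)        # per-model test error
mse_ens = ((preds.mean(axis=0) - yt) ** 2).mean()    # error of the average
print(round(mse_ens, 4), round(mse_single.mean(), 4))
```

Consistent with E = Ē − Ā, the error of the averaged predictor is below the average error of its members, and the margin grows with the diversity generated by resampling and mixing model types.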
4 Out-of-Train Technique
Fig. 1. Averaging scheme for OOT calculation for an example data set of ten samples.
On this data set, 20 models were trained. Column j corresponds to model j. For each
model, samples used for training are colored white, while samples not used for training
are colored gray. For easier reading, only output values for test samples were printed on
the respective row and column. To compute the OOT output for the i-th sample which
is depicted as gray value in the rightmost column, the average over the output of all
models for which this sample was not in the training fraction is calculated (averaging
over all gray fields in a row).
The out-of-train (OOT) technique combines cross-validation-style data splitting
and ensemble averaging. As for cross-validation, the data set is repeatedly
divided into training and test partitions. For each partitioning, a model is
constructed only on samples of the training partition. Test samples are not used for
model selection, deriving of stopping criteria or the like. The OOT output for
one sample of the data set is the average of the outputs of models for which this
sample was not part of the training set (out-of-train) as depicted in Figure 1.
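The averaging scheme of Figure 1 can be written compactly. This is an illustrative sketch with random stand-in model outputs; in a real run the training masks and predictions would come from the actual models:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_models = 10, 20

# membership[i, j] = True if sample i was in the training fraction of model j;
# outputs[i, j] = output of model j on sample i (random stand-ins here).
membership = rng.random((n_samples, n_models)) < 0.8
outputs = rng.normal(size=(n_samples, n_models))

# OOT output: average over the models that did NOT see the sample in training
oot_count = (~membership).sum(axis=1)
oot_output = (outputs * ~membership).sum(axis=1) / np.maximum(oot_count, 1)
print(oot_output.round(2))
```

Each entry corresponds to one gray row of Figure 1: the sample's outputs from all models that held it out, averaged into the rightmost OOT column.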
6.2 Syntax
• Constructor syntax:
model = perceptron; will create an MLP model with default topology
model = perceptron(12); will create an MLP model with 12 hidden layer neurons
model = ridge; will create a linear model trained by ridge regression
• Training syntax:
model = train(model, x, y, [], [], 0.05);
trains the model with an ε-insensitive loss of 0.05 on the data set (xi, yi)
• Evaluation syntax:
y_new = calc(model, x_new); evaluates the model on new inputs
• How to build an ensemble of models:
ens = crosstrainensemble; will create an empty ensemble object
ens = train(ens, x, y, [], [], 0.05); calls the training routines for several primary
models and joins them into the ensemble object
• Ensemble evaluation:
y_new = calc(ens, x_new); evaluates the ensemble on new inputs.
and treated the reduced collection as an ensemble of ensembles (EE). This method
has two advantages: if one run diverges or produces a substantially inferior
model, it is not used for the final collection of ensembles. Moreover, due to the
stochastic nature of the initial subset and of the training method used, the
outputs of the ensembles generated by the different runs are to some extent
decorrelated. This results not only in a substantial reduction of the error
MSE_EE of the collection of ensembles on F, but also in a reduction of the
generalization error on unseen data (see [1, 2]). We used four different underlying
model types, ranging from strictly local to global models:
1. The model type that was mostly used within the numerical experiments is a
variant of the k-nearest-neighbor regressor with adaptive metric. In our case a
GA-like algorithm adapts the metric coefficients of the L1, L2 or L∞ metric as
well as the number k of neighbors. The fitness of an individual is the negative
leave-one-out error on the training data, which can be easily calculated using a
fast nearest-neighbor algorithm ([24]). The metric coefficients are adapted by
mutation and crossover within a predefined number of generations.
2. As a semi-local model type we decided to use the hybrid PRBFN network
based on radial basis function and sigmoidal nodes (see [12]). We found this
model type to perform superior on several artificial and real-world test problems.
3. As a global model type we chose a fully connected neural network with one
hidden layer and eight hidden layer neurons. The network is trained using a
second-order Levenberg-Marquardt algorithm. The implementation was taken
from the NNSYSID 2.0 toolbox written by Magnus Nørgaard (see [25]).
4. Another global model type we used was a linear model that was trained
using a cross-validation scheme.
[Fig. 2: three panels of mean square error (0-0.35) by model type (k-NN, PRBFN, NeuralNet, Linear, k-NN full), for a data set of more than 42000 compounds from the DTP AIDS antiviral screen data set of the NCI Open Database]
We considered a data set of more than 42000 compounds from the DTP AIDS
antiviral screen data set of the NCI Open Database. The antiviral screen uti-
lized a soluble formazan assay to measure the ability of compounds to protect
human CEM cells [14] from HIV-1-induced cell death. In the primary screen-
ing set of results, the activities of the compounds tested in the assay were de-
scribed to fall into three classes: confirmed active (CA) for compounds that
provided 100 % protection, confirmed moderately active (CM) for compounds
that provided more than 50 % protection, and confirmed inactive (CI) for the
remaining compounds or compounds that were toxic to the CEM cells and there-
fore seemed to not provide any protection. The data set was obtained from
http://cactus.nci.nih.gov/ncidb2/download.html. The data set consisted
of originally 42689 2D structures with AIDS test data as of October 1999 and
was provided in SDF format. Seven compounds could not be parsed and had to
be removed. From the total of 42682 usable compounds, 41179 compounds were
Fig. 3. ROC curves for the classifiers constructed on the NCI AIDS Antiviral Screen
Data Set with ε-insensitive absolute loss. The Figure displays two pairs of ROC curves.
In this computational experiment we trained an ensemble of molecular graph networks
on a data set consisting of three classes of molecules (CI, CM and CA). To be able
to generate ROC curves, we had to reduce the number of classes to two by pooling
the molecules of two classes into a single class. The lower pair of ROC curves was
obtained by using the ensemble of classifiers to discriminate between CI as one class
and CA and CM as second class, while the upper pair details the ROC curves when
using the same ensemble of classifiers to discriminate between CI and CM as one class
and the confirmed actives CA as the second. The AUCs of the respective pairs of curves
are 0.82 resp. 0.81 for classification of CI versus CA and CM and 0.94 resp. 0.94 for
classification of CI and CM versus CA.
confirmed inactive, 1080 compounds were confirmed moderately active and 423
compounds were confirmed active.
To solve this multiclass classification problem, we used the one-versus-all ap-
proach based on the logistic loss function. We construct three ensembles of clas-
sifiers. Each of these three ensembles is trained to solve the binary classification
problem of discriminating one of the three classes against the rest and consists
of six MGNs. Each MGN consisted of 18 individual feature nets with iteration
depths ranging from 3 to 10 and a supervisor network with 24 hidden layer
neurons. The MGNs were trained by stochastic gradient descent with a fixed
number of 106 gradient calculations. The global step size μ was decreased every
70000 gradient updates by a factor of 0.8. We randomly partitioned the data set
into a training set of 35000 compounds and test set of 7682 compounds. Each
MGN was trained on a random two-thirds of the 35000 training samples. Thus
the OOT output for every sample of the training set was computed by averaging
over 2 models, while the output for the held-out test set was computed by
averaging over all 6 models of each ensemble.
Results for the classification experiments on NCI data set with classification
loss function are given in Figure 3. This figure displays two pairs of ROC curves.
The lower pair of ROC curves in Figure 3 was obtained by using the ensemble of
classifiers to discriminate between CI on the one hand and CA and CM on the
other hand, while the upper pair details the ROC curves when using the same
ensemble of classifiers to discriminate between CI and CM on the one hand
and the confirmed actives CA on the other. The remarkable coincidence of the
curves obtained by validation on the training part and from the held-out test
part of more than 7000 compounds indicates that the validation was performed
properly and does not exhibit overfitting. This result is supported by the AUCs
of the respective pairs of curves which are 0.82 (OOT) and 0.81 (test) for the
classification of CI versus CA and CM and both 0.94 for the classification of CI
and CM versus CA.
Results for the classification with logistic loss function are depicted in Figure 4.
The obtained AUC values are similar to the best results of several variants of a
classification method based on finding frequent subgraphs [15] (experiments H2
and H3 when omitting class CM from the test set for the ensemble constructed to
discriminate CA versus the two other classes). Wilton et al. [16] compare several
ranking methods for virtual screening on an older version of the NCI data set.
The best performing method there, binary kernel discrimination, is able to locate
12 % of all actives (CM and CA pooled) in the first 1 % and 35 % of all actives in
the first 5 % of the ranked NCI data set. MGNs trained with logistic loss are able
to find 36 % resp. 74 % of all actives in the first 1 % resp. 5 % of the NCI data
set ranked according to the output of the ensemble of classifiers. When interpret-
ing the output of the three classifiers as pseudo-probabilities and assigning the
class label of the classifier with highest output value to each sample, we are able
to compute confusion matrices for the OOT validation on the training set and for
Fig. 4. ROC curves for the three classifiers constructed on the NCI AIDS Antiviral
Screen Data Set using logistic loss and one-versus-all approach. The Figure displays
three pairs of ROC curves. In this computational experiment three ensembles of molec-
ular graph networks were trained on a data set consisting of three classes of molecules
(CI, CM and CA). The green/black pair of ROC curves corresponds to the ensemble
classifier discriminating class CM from the two other classes, the red/magenta pair
to class CI against the others. The blue/cyan pair details the ROC curves resulting
from the ensemble classifier trained to discriminate class CA against the two remaining
classes CI and CM. AUCs are 0.80 (test) and 0.81 (OOT) for class CI, 0.75 (test)
and 0.75 (OOT) for class CM, and 0.94 (test) and 0.91 (OOT) for class CA.
Table 1. Confusion matrix for the OOT validation on the training set obtained by the
system of three classifiers on the NCI AIDS Antiviral Screen data set using logistic loss
and one-versus-all approach. The values displayed indicate the fraction of the samples
of each class that are classified into the respective classes. E.g., 83.5 % of the samples of
class CI are classified correctly, 12.6 % of the CI samples are classified wrongly into
class CM and the remaining 3.8 % are wrongly classified into class CA. While samples
of classes CI and CA are mostly classified correctly, samples of class CM (confirmed
moderately active) are recognized correctly in only 38 % of the cases.
Predicted Class
Actual Class CI CM CA
CI 0.835 0.126 0.038
CM 0.408 0.380 0.212
CA 0.124 0.187 0.690
the held-out test set, given in Tables 1 and 2. While classes CI and CA can be
correctly classified in a majority of the cases, samples of class CM are recognized
correctly in less than 40 % of all cases.
132 C. Merkwirth, J. Wichard, and M.J. Ogorzalek
Table 2. Confusion matrix for the held-out test set. 85 % of the samples of class CI are
classified correctly, 12 % of the samples of class CI are classified wrongly as belonging
to class CM and the remaining 3 % are wrongly classified to fall into class CA.
Predicted Class
Actual Class CI CM CA
CI 0.852 0.121 0.027
CM 0.444 0.369 0.187
CA 0.093 0.160 0.747
training portion of the data (test set) is recorded. In a second step, the same is
done after the successive permutation of each descriptor. The relative decrease
of classification accuracy is the variable importance following the idea that the
most discriminative descriptors are the most important ones. The descriptor set
was reduced iteratively resulting in a final set of 14 descriptors, including topo-
logical charge indices, electronegativity and shape descriptors. Several of the
identified descriptors can be directly related to genotoxicity and specify char-
acteristics of structures involved in DNA modifications (see Rothfuss et al. [17]
and the references therein). Our final classifier was trained with several different
model classes to achieve a diverse ensemble:
• Classification and regression trees (CART)
• Support vector machines (SVM) with Gaussian kernels
• Linear and quadratic discriminant analysis
• Linear ridge models
• Feedforward neural networks with two hidden layers trained by gradient descent
• K-nearest-neighbor models with adaptive metrics
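The permutation-based variable importance described above — the relative decrease of test-set accuracy after permuting each descriptor column — can be sketched as follows. A minimal illustration under the assumption that the model exposes a `predict` method; it is not the authors' implementation:

```python
import numpy as np

def permutation_importance(model, X, y, rng=None):
    """Variable importance by descriptor permutation: the relative drop in
    accuracy after shuffling one descriptor column measures its importance."""
    if rng is None:
        rng = np.random.default_rng(0)
    baseline = np.mean(model.predict(X) == y)
    importances = []
    for col in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, col] = Xp[rng.permutation(len(Xp)), col]  # destroy this descriptor
        acc = np.mean(model.predict(Xp) == y)
        importances.append((baseline - acc) / max(baseline, 1e-12))
    return np.array(importances)
```

Descriptors whose permutation barely changes the accuracy can be dropped, which is the iterative reduction that led to the final set of 14 descriptors.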
In order to estimate the performance of the final ensemble model, we performed
a 20-fold cross-validation wherein 10 % of the data was randomly kept out as a test
set and the remaining 90 % of the data was used for model training. The results
with respect to the training and test sets are reported in Table 3.
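The validation protocol just described — a random 10 % hold-out repeated 20 times, sometimes called Monte Carlo cross-validation — can be sketched as follows. The `model_factory` callable is an assumption (anything returning an object with `fit`/`predict` methods):

```python
import numpy as np

def repeated_holdout(model_factory, X, y, n_repeats=20, test_frac=0.10, seed=0):
    """Repeated random hold-out: each repetition keeps test_frac of the data
    out as a test set and trains on the rest. Returns per-repeat accuracy."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accs = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        n_test = max(1, int(round(test_frac * n)))
        test, train = idx[:n_test], idx[n_test:]
        model = model_factory()
        model.fit(X[train], y[train])
        accs.append(np.mean(model.predict(X[test]) == y[test]))
    return np.array(accs)
```

The mean and spread of the returned accuracies correspond to the values averaged over the 20 folds in Table 3.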
Table 3. The performance of the ensemble classification model, mean values calculated
over 20 cross-validation-folds
As pointed out by several research groups, state-of-the-art machine learning
approaches in the field of toxicology prediction can compete with most of the
commercial software tools [22, 23, 17], and they have the further advantage that
they can be trained with the additional in-house data collections of institutes
or companies.
Acknowledgments
This work has been prepared in part within the scope of the Research Training
Network COSYC of SENS No. HPRN-CT-2000-00158 of the 5th EU Framework.
References
[1] Krogh, A., Vedelsby, J.: Neural network ensembles, cross validation, and active
learning. In: Tesauro, G., Touretzky, D., Leen, T. (eds.) Advances in Neural In-
formation Processing Systems, vol. 7, pp. 231–238. MIT Press, Cambridge (1995),
citeseer.ist.psu.edu/krogh95neural.html
[2] Perrone, M.P., Cooper, L.N.: When Networks Disagree: Ensemble Methods for
Hybrid Neural Networks. In: Mammone, R.J. (ed.) Neural Networks for Speech
and Image Processing, pp. 126–142. Chapman and Hall, Boca Raton (1993)
[3] Hansen, L., Salamon, P.: Neural Network Ensembles. IEEE Trans. on Pattern
Analysis and Machine Intelligence 12(10), 993–1001 (1990)
[4] Naftaly, U., Intrator, N., Horn, D.: Optimal ensemble averaging of neural networks.
Network, Comp. Neural Sys. 8, 283–296 (1997)
[5] Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regres-
sion Trees. Wadsworth International Group, Belmont (1984)
[6] Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning.
Springer Series in Statistics. Springer, Heidelberg (2001)
[7] Krogh, A., Sollich, P.: Statistical mechanics of ensemble learning. Physical Review
E 55(1), 811–825 (1997)
[8] Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996),
citeseer.ist.psu.edu/breiman96bagging.html
[9] Merkwirth, C., Ogorzalek, M., Wichard, J.: Stochastic gradient descent training
of ensembles of dt-cnn classifiers for digit recognition. In: Proceedings of the Eu-
ropean Conference on Circuit Theory and Design ECCTD 2003, Kraków, Poland,
vol. 2, pp. 337–341 (September 2003)
[10] Wichard, J., Ogorzalek, M.: Iterated time series prediction with ensemble models.
In: Proceedings of the 23rd International Conference on Modelling Identification
and Control (2004)
[11] Suykens, J., Vandewalle, J. (eds.): Nonlinear Modeling - Advanced Black–Box
Techniques. Kluwer Academic Publishers, Dordrecht (1998)
[12] Cohen, S., Intrator, N.: A hybrid projection based and radial basis function archi-
tecture. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 147–155.
Springer, Heidelberg (2000)
[13] Merkwirth, C., Lengauer, T.: Automatic generation of complementary descriptors
with molecular graph networks (2004)
[14] Weislow, O., Kiser, R., Fine, D., Bader, J., Shoemaker, R., Boyd, M.: New soluble
formazan assay for HIV-1 cytopathic effects: application to high-flux screening of
synthetic and natural products for AIDS antiviral activity. J. Nat. Cancer Inst. 81,
577–586 (1989)
[15] Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based ap-
proaches for classifying chemical compounds. In: Proceedings of the Third IEEE
International Conference on Data Mining ICDM 2003, Melbourne, Florida, pp.
35–42 (November 2003)
[16] Wilton, D., Willett, P., Lawson, K., Mullier, G.: Comparison of ranking methods
for virtual screening in lead-discovery programs. J. Chem. Inf. Comput. Sci. 43,
469–474 (2003)
[17] Rothfuss, A., Steger-Hartmann, T., Heinrich, N., Wichard, J.: Computational pre-
diction of the chromosome-damaging potential of chemicals. Chemical Research
in Toxicology 19(10), 1313–1319 (2006)
[18] Kirkland, D., Aardema, M., Henderson, L., Muller, L.: Evaluation of the ability
of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens
and non-carcinogens. Mutat. Res. 584, 1–256 (2005)
[19] Snyder, R.D., Pearl, G.S., Mandakas, G., Choy, W.N., Goodsaid, F., Rosen-
blum, I.Y.: Assessment of the sensitivity of the computational programs DEREK,
TOPKAT and MCASE in the prediction of the genotoxicity of pharmaceutical
molecules. Environ. Mol. Mutagen. 43, 143–158 (2004)
[20] Todeschini, R.: Dragon Software, http://www.talete.mi.it/dragon_exp.htm
[21] Breiman, L.: Arcing classifiers. The Annals of Statistics 26(3), 801–849 (1998),
http://citeseer.nj.nec.com/breiman98arcing.html
[22] Serra, J.R., Thompson, E.D., Jurs, P.C.: Development of binary classification of
structural chromosome aberrations for a diverse set of organic compounds from
molecular structure. Chem. Res. Toxicol. 16, 153–163 (2003)
[23] Li, H., Ung, C., Yap, C., Xue, Y., Li, Z., Cao, Z., Chen, Y.: Prediction of genotoxic-
ity of chemical compounds by statistical learning methods. Chem. Res. Toxicol. 18,
1071–1080 (2005)
[24] McNames, J.: Innovations in Local Modeling for Time Series Prediction, Ph.D.
Thesis, Stanford University (1999)
[25] Norgaard, M.: Neural Network Based System Identification Toolbox, Tech. Report.
00-E-891, Department of Automation, Technical University of Denmark (2000),
http://www.iau.dtu.dk/research/control/nnsysid.html
Automatic Fingerprint Identification Based on
Minutiae Points
Introduction
In recent years security systems have played an important role in our community.
Cashless payment operations, restricted access to specific areas, and the secrecy of
information stored in databases are only a small part of our daily lives that require
special treatment. Besides traditional locks, keys or ID cards, there is an increased
interest in biometric technologies, that is, human identification based on one's
individual features [2, 19].
Fingerprint identification is one of the most important biometric technologies
considered nowadays. The uniqueness of a fingerprint is exclusively determined by local
ridge characteristics called minutiae points. Automatic fingerprint matching depends
on the comparison of these minutiae and the relationships between them [13, 14, 18].
In this paper several methods of fingerprint matching are discussed, namely, the
Hough transform, the structural global star method and the speeded-up correlation
approach (Sect. 4). Because there is still a need to find the best matching approach,
experiments on on-line fingerprints were conducted to compare the quality differences
and time relations between the algorithms considered; the experimental results are
grouped in Section 5. One can also find here a detailed description of the image
enhancement (Sect. 2) and minutiae detection (Sect. 3) schemes used in our research.
1 Fingerprint Representation
A fingerprint is a structure of ridges and valleys unique to every human being. The
uniqueness is exclusively determined by local ridge characteristics called minutiae
points and the relationships between them [6].
The two most common minutiae points considered in today's research are known as
the ending and the bifurcation. An ending point is the place where a ridge ends its
flow, and a bifurcation is the place where a ridge forks into two parts (Fig. 1).
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 137 – 152.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
138 M. Hrebień and J. Korbicz
2 Image Enhancement
A very common technique for reducing the quantity of information received from a
fingerprint scanner in the form of a grayscale image is known as Gabor filtering
[13]. The filter, based on local ridge orientation and frequency estimations, produces
a nearly binary output – the intensity histogram has a U-shaped form [8].
The Gabor filter is defined by

\[ h(x, y, f, \theta) = \exp\!\left(-\frac{1}{2}\left[\frac{x_\theta^2}{\delta_x^2} + \frac{y_\theta^2}{\delta_y^2}\right]\right) \cos(2\pi f x_\theta), \qquad (1) \]

where x_θ = x cos θ + y sin θ and y_θ = −x sin θ + y cos θ are the image coordinates
rotated by the angle θ. The
ridge orientation θ for a specified block centered at the position (i, j) can then be
estimated with the equations

\[ \theta(i, j) = \frac{1}{2} \tan^{-1}\!\left(\frac{V_y(i, j)}{V_x(i, j)}\right), \qquad (2) \]

\[ V_x(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}} \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} 2\,\partial_x(u, v)\,\partial_y(u, v), \qquad (3) \]

\[ V_y(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}} \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} \left(\partial_x^2(u, v) - \partial_y^2(u, v)\right), \qquad (4) \]
where ∂x(u, v) and ∂y(u, v) are the pixel gradients at the position (u, v) (e.g., estimated
with Sobel's mask [16]) and w is the block size (w = 16 for 500 dpi fingerprint images,
or w = 15 to ensure unambiguous selection of the central point [5]). Additionally,
taking into account the fact that fingerprint ridges are not directed, an orientation
of 225° can be considered equal to 45°, so the orientation θ is usually described on
a half-open interval, for example, θ ∈ [0, π).
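The orientation estimate of Eqs. (2)–(4) can be sketched as follows. This is an illustrative transcription of the formulas exactly as stated (the function name and gradient-image layout are assumptions):

```python
import numpy as np

def block_orientation(gx, gy, i, j, w=16):
    """Ridge orientation of the block centred at (i, j), following Eqs. (2)-(4).
    gx, gy are gradient images (e.g., from Sobel masks); the doubled-angle
    formula maps directions theta and theta + pi to the same estimate."""
    h = w // 2
    bx = gx[i - h:i + h + 1, j - h:j + h + 1]
    by = gy[i - h:i + h + 1, j - h:j + h + 1]
    Vx = np.sum(2.0 * bx * by)          # Eq. (3)
    Vy = np.sum(bx ** 2 - by ** 2)      # Eq. (4)
    theta = 0.5 * np.arctan2(Vy, Vx)    # Eq. (2): tan(2*theta) = Vy / Vx
    return theta % np.pi                # half-open interval [0, pi)
```

Using `arctan2` rather than a plain arctangent keeps the quadrant information, and the final modulo implements the half-open interval θ ∈ [0, π) discussed above.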
The local ridge frequency f can be estimated by counting an average number of
pixels between two consecutive peaks of gray-levels along the direction normal to the
local ridge orientation. The idea is based on a w × l (where w < l) oriented window
placed at the center of each block and rotated with the θ angle. The frequency of each
block is given by
\[ f(i, j) = \frac{1}{T(i, j)}, \qquad (5) \]

where T(i, j) is the average number of pixels between two consecutive peaks in the
so-called x-signature obtained from

\[ X_k = \frac{1}{w} \sum_{d=0}^{w-1} W(d, k), \qquad k = 0, 1, \ldots, l-1. \qquad (6) \]
The local ridge frequency f can be given a constant value, the same for all blocks, if
the filtering time must be minimized. Certainly, proper selection of its value is crucial
for the final result. Too large a frequency will cause the creation of spurious ridges;
too small a frequency, on the contrary, will introduce the problem of merging nearby
ridges into one. For 500 dpi fingerprint images, the inter-ridge distance is approximately
equal to 10 pixels, so f can be given the value 1/10 [10].
The space constants δx and δy define the stretch of the Gabor filter along the OX
and OY axes. Selecting their values involves a trade-off. If δx and δy are too small, the
filter is not effective in removing noise. If δx and δy are too large, the filter is more
robust in removing noise but introduces a smoothing effect and the ridge details are
lost. The δx and δy values should be approximately equal to half the inter-ridge distance
to maximize enhancement effectiveness. For 500 dpi fingerprint images, δx and δy are
usually equal to 4.0 or, sometimes, δy = 3.0 if spurious ridge creation is a concern [5].
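With the rotated coordinates assumed above, the even-symmetric kernel of Eq. (1) can be sketched as follows, using the typical 500 dpi parameter values just discussed (a minimal illustration, not a full filtering pipeline):

```python
import numpy as np

def gabor_kernel(size, f, theta, dx=4.0, dy=4.0):
    """Even-symmetric Gabor kernel, Eq. (1), with rotated coordinates
    x_theta = x*cos(theta) + y*sin(theta), y_theta = -x*sin(theta) + y*cos(theta).
    Typical 500 dpi parameters: f = 1/10, dx = dy = 4.0."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xt = x * np.cos(theta) + y * np.sin(theta)
    yt = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-0.5 * (xt ** 2 / dx ** 2 + yt ** 2 / dy ** 2)) * np.cos(2 * np.pi * f * xt)
```

Filtering a block then amounts to convolving it with the kernel built from that block's estimated orientation θ and frequency f.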
To speed up the Gabor filtering process, one can notice that the filter is
symmetrical along the OX as well as the OY axis, so the calculations can be significantly
reduced. Additionally, a segmentation mask (constructed, for example, using the
variance approach [12, 17]) can be used so that the filter calculations are
performed only in those parts of the image which were marked as blocks containing the
object's pixels (ridges) [7].
An example result of fingerprint image enhancement with the final binary output is
illustrated in Fig. 3. The example input image was first normalized to reduce variance
in gray-levels of each pixel [1] (without changing the clarity between ridges and
valleys) with the equations
\[ M(X) = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} X(i, j), \qquad (7) \]

\[ VAR(X) = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left(X(i, j) - M(X)\right)^2, \qquad (8) \]

\[ G(i, j) = \begin{cases} M_0 + \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) > M, \\[2ex] M_0 - \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) \le M, \end{cases} \qquad (9) \]

where M0 and VAR0 are the desired mean value and variance (both usually equal to
128), and N is the size of the N × N input image X.
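The normalization of Eqs. (7)–(9) can be sketched compactly with array operations (an illustrative transcription; the function name is an assumption):

```python
import numpy as np

def normalize(X, M0=128.0, VAR0=128.0):
    """Image normalization following Eqs. (7)-(9): map the image to the
    desired mean M0 and variance VAR0 without inverting the contrast
    between ridges and valleys."""
    M = X.mean()                                    # Eq. (7)
    VAR = X.var()                                   # Eq. (8)
    dev = np.sqrt(VAR0 * (X - M) ** 2 / VAR)        # Eq. (9), common term
    return np.where(X > M, M0 + dev, M0 - dev)
```

Pixels above the original mean land above M0 and pixels below it land below M0, which preserves the ridge/valley polarity while fixing the global statistics.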
Fig. 3. Example of image enhancement and binarization based on the Gabor filter¹
¹ Off-line fingerprint taken from the U.S. National Institute of Standards and Technology
database, NIST-4, http://www.nist.gov
3 Minutiae Detection
Image thinning can be considered a process of erosion [4, 7]. All pixels on the
edges of an object (a fingerprint ridge) are removed only if they do not affect the
coherence of the object as a whole, and they are left untouched otherwise. The
skeleton form of a fingerprint is generated until there are no more surplus pixels to
remove. The thickness of the ridges in the resulting image has to be equal to one pixel,
and the shape and run of the original ridges should be preserved. An example of a
thinned fingerprint can be seen in Fig. 4.
To determine whether a pixel at the position (i, j) in the skeleton form of a
fingerprint is a minutia point, we apply the mask rules illustrated in Fig. 5.
A bifurcation or an ending is detected where the perimeter of the mask (the eight
nearest neighbors of the central point) is intersected by ridges in three places or
one place, respectively.
Fig. 5. Example of 3×3 masks used to define: a) bifurcation, b) non-minutiae point, c) ending,
d) noise
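The mask rules of Fig. 5 can be realized by counting background-to-ridge transitions around the 8-neighbour perimeter — a common crossing-number implementation, shown here as a sketch rather than the authors' exact masks:

```python
import numpy as np

def classify_pixel(skel, i, j):
    """Classify a skeleton pixel by the number of background-to-ridge
    crossings on its 8-neighbour perimeter: one crossing -> ending,
    three -> bifurcation (cf. the 3x3 masks of Fig. 5)."""
    if skel[i, j] == 0:
        return None
    # 8 neighbours in clockwise order around (i, j)
    nb = [skel[i-1, j-1], skel[i-1, j], skel[i-1, j+1], skel[i, j+1],
          skel[i+1, j+1], skel[i+1, j], skel[i+1, j-1], skel[i, j-1]]
    crossings = sum(nb[k] == 0 and nb[(k + 1) % 8] == 1 for k in range(8))
    if crossings == 1:
        return "ending"
    if crossings == 3:
        return "bifurcation"
    return None  # ordinary ridge pixel or noise
```

Two crossings correspond to a plain ridge passing through (Fig. 5b), and four or more indicate the noisy configurations of Fig. 5d.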
To define the orientation of each minutia we can use a 7 × 7 mask technique with
angles quantized to 15° and the center placed at the minutia point. The orientation of
an ending point is equal to the angle at which the ridge crosses the mask perimeter. The
orientation of a bifurcation point can be estimated with the same method, but only the
leading ridge is considered, that is, the ridge with the maximum sum of angles to the
other two ridges of the bifurcation (see, for instance, Fig. 6).
At the end of the minutiae detection process, all detected points should be verified
to check that they were not created by accident, for example, as a result of filtering
errors. Thus, all minutiae located at the borders of the image, in a very close
neighbourhood of a region marked as background in the segmentation mask, created as
a result of a local ridge peak (a bifurcation very close to an ending point), or
resulting from the pore structure of a fingerprint (the ridge hole – two bifurcations
in a close neighbourhood with opposite orientations) should be treated as false and
removed from the set. The local ridge noise problem can be reduced, e.g., by ridge
smoothing techniques (a pixel is given a value using the majority rule in its nearest
neighbourhood [4]) just after image binarization, so that all small ridge holes are
patched and all peaks smoothed out.
4 Minutiae Matching
Let M_A and M_B denote the minutiae sets determined from the images A and B.
Each minutia is defined by its image coordinates (x, y) and orientation angle
θ ∈ [0, 2π]. The spatial and angular distance measures used in the matching are

\[ S(a, b) = \max\left(|a_x - b_x|,\ |a_y - b_y|\right), \qquad K(\alpha, \beta) = \min\left(|\alpha - \beta|,\ 2\pi - |\alpha - \beta|\right). \qquad (13) \]
∀i ∀j ∀k ∀l : A(i, j, k, l) ← 0
FOR (x_i^A, y_i^A, θ_i^A) ∈ M_A, i = 1…m
  FOR (x_j^B, y_j^B, θ_j^B) ∈ M_B, j = 1…n
    FOR θ_k ∈ {θ_1, θ_2, …, θ_K}, k = 1…K
      IF K(θ_i^A + θ_k, θ_j^B) ≤ θ_0
        FOR s_l ∈ {s_1, s_2, …, s_L}, l = 1…L
        {
          [Δx, Δy]^T ← [x_i^A, y_i^A]^T − s_l · R(θ_k) · [x_j^B, y_j^B]^T,
            where R(θ_k) = [[cos θ_k, sin θ_k], [−sin θ_k, cos θ_k]]
          …
        }
The Hough transform, which was adapted for fingerprint matching [14], can be
used to find the best alignment of the sets M_A and M_B, including the possible
scale, rotation and displacement of image A versus image B. The transformation space
is discretized – each parameter of the geometric transform (Δx, Δy, θ, s) comes from
a finite set of values. A four-dimensional accumulator A is used to accumulate
evidence of alignment between each pair of minutiae considered. The best parameters
of the geometric transform, that is, (Δx⁺, Δy⁺, θ⁺, s⁺), are the arguments of the
maximum value in the accumulator (see the procedure in Fig. 7).
After performing the transformation, minutiae points are juxtaposed to calculate
the matching score with respect to their distance, orientation and type (with a given
tolerance).
An example result of the Hough transform is shown in Fig. 8.
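The accumulator search of Fig. 7 can be sketched as follows. This is a simplified illustration, not the authors' exact discretization; the bin size `dxy_step` and the angular tolerance default are hypothetical:

```python
import numpy as np

def hough_align(MA, MB, thetas, scales, dxy_step=4, theta0=np.pi / 12):
    """Accumulator-based alignment search: MA, MB are lists of (x, y, theta)
    minutiae; returns the (dx_bin, dy_bin, theta, s) cell with most votes."""
    votes = {}
    for xa, ya, ta in MA:
        for xb, yb, tb in MB:
            for tk in thetas:
                # angular compatibility test K(ta + tk, tb) <= theta0
                d = abs((ta + tk) - tb) % (2 * np.pi)
                if min(d, 2 * np.pi - d) > theta0:
                    continue
                for s in scales:
                    # displacement implied by rotating/scaling minutia b onto a
                    dx = xa - s * (xb * np.cos(tk) + yb * np.sin(tk))
                    dy = ya - s * (-xb * np.sin(tk) + yb * np.cos(tk))
                    key = (int(round(dx / dxy_step)), int(round(dy / dxy_step)), tk, s)
                    votes[key] = votes.get(key, 0) + 1
    # the cell with the most votes gives the best-aligned (dx, dy, theta, s)
    return max(votes, key=votes.get)
```

True correspondences all imply the same displacement and therefore pile up in one accumulator cell, while spurious pairs scatter across many cells.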
\[ S^A = \{S_1^A, S_2^A, \ldots, S_m^A\}, \qquad S^B = \{S_1^B, S_2^B, \ldots, S_n^B\}, \qquad (15) \]

where each star can be defined as
In contrast to the local methods [18], a voting technique for selecting the best
aligned pair of stars (S_wi^A, S_wj^B) can be performed (Fig. 10), including the
matching of features such as the between-minutiae angle K and the ridge count D
(Fig. 9b, c). In the final decision, the orientation of the minutiae is also taken
into account (Fig. 11) after their adjustment by the angle of orientation difference
between the central points of the stars from the best alignment (α).
An example result of the global star matching method is shown in Fig. 12.
Fig. 8. Example result of the Hough transform – matched minutiae, with a given tolerance,
are marked with ellipses
Fig. 9. General explanation of the star method: a) example star created for fingerprint ending
points, b) ridge counting (here equal to 5), c) example of relative angle determination between
the central minutiae and the remaining ones
FOR S_i^A ∈ S^A, i = 1…m
  FOR S_j^B ∈ S^B, j = 1…n
    FOR m_k^A ∈ S_i^A − {m_i^A}
      assuming that: m_l^B ∈ S_j^B − {m_j^B}
      IF ∃_l ( |D(m_j^B, m_l^B) − D(m_i^A, m_k^A)| ≤ d_0  and  |K(m_j^B, m_l^B) − K(m_i^A, m_k^A)| ≤ k_0 )
        A(i, j) ← A(i, j) + 1

L ← 0
…
  IF ( …  and  T(θ_k^A, θ_l^B + α) ≤ θ_0 )
  {
    L ← L + 1
  }
Fig. 11. Second stage of the global star matching algorithm – the way of determining
the number of matched minutiae pairs (L)
4.3 Correlation
Because of non-linear distortion, skin conditions or finger pressure, which cause
variations in image brightness and contrast [13], correlation between fingerprint
images cannot be applied directly. Moreover, taking into account the possible scale,
rotation and displacement, searching for the best correlation between two images
using an intuitive sum of squared differences is computationally very expensive. To
eliminate, or at least reduce, some of the above-mentioned problems, a binary
representation of the fingerprint can be used. To speed up the process of preliminary
Fig. 13. Example of preliminary aligned fingerprint segmentation masks (left) and correlation
between two impressions of the same finger (right), where red denotes the best alignment
(images obtained with a Digital Persona U.are.U 4000 scanner)
alignment, a segmentation mask can be used in conjunction with the centers of gravity
of the binary images. Also, quantization of the geometric transform features can be
applied, considering the scale and rotation only at the first stage (since displacement
IF d < d_min
  [s_min, θ_min, d_min] ← [s_i, θ_j, d]
}
…
IF d_max < d
  [s_max^x, s_max^y, θ_max, Δx_max, Δy_max, d_max] ← [s_k^x, s_l^y, θ_m, Δx_n, Δy_p, d]
}

W ← D_obj(A, transform(B, s_max^x, s_max^y, θ_max, Δx_max, Δy_max)) / D_obj(A, A)
Fig. 14. Algorithm of finding the best correlation ratio (W) between the images A and B using
their segmentation masks Aseg and Bseg
Fig. 15. Example result of the best minutiae match for the correlation from Fig. 13
is the difference between the centers of gravity), minimizing the D_seg criterion (a
simple image XOR):

\[ D_{seg}(A_{seg}, B_{seg}) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \begin{cases} 1, & \text{if } A_{seg}(i, j) \ne B_{seg}(i, j), \\ 0, & \text{if } A_{seg}(i, j) = B_{seg}(i, j). \end{cases} \qquad (17) \]
After finding nearly the best alignment of the segmentation masks (Fig. 13), the
search for the best correlation is limited to a much smaller area. Including the
rotation, vertical and horizontal displacement, stretch and an arbitrarily selected
granularity of these features, the best correlation can be found (Fig. 14) by searching
for the maximum value of the D_obj criterion (a double image XNOR):

\[ D_{obj}(A, B) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \begin{cases} 1, & \text{if } A(i, j) = B(i, j) = obj, \\ 0, & \text{otherwise}, \end{cases} \qquad (18) \]
from the best correlation. Then the two sets of minutiae can be compared to sum up
the matching score.
An example result of the correlation algorithm is shown in Fig. 13 and Fig. 15.
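The two criteria of Eqs. (17) and (18), and the correlation ratio W of Fig. 14, can be sketched with array operations (an illustrative transcription; function names are assumptions):

```python
import numpy as np

def d_seg(A_seg, B_seg):
    """Eq. (17): number of pixels where the two segmentation masks differ
    (an image XOR). Minimized during preliminary alignment."""
    return int(np.sum(A_seg != B_seg))

def d_obj(A, B, obj=1):
    """Eq. (18): number of pixels where both binary images contain an object
    (ridge) pixel. Maximized when searching for the best correlation."""
    return int(np.sum((A == obj) & (B == obj)))
```

The final ratio is then `W = d_obj(A, transformed_B) / d_obj(A, A)`, i.e., the fraction of A's ridge pixels that the aligned image B reproduces.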
5 Experimental Results
The experiments were performed on a PC with a Digital Persona U.are.U 4000
fingerprint scanner. The database consists of 20 fingerprint images with 5 different
impressions (plus one more for the registration phase).
There were three experiments carried out. The first and the third one differ in the
parameter settings of each method. In the second one, the image selected for the
registration phase was chosen arbitrarily as the best one in the arbiter's opinion (in
the first and the third experiment the registration image was the first fingerprint
image acquired from the user).
All images were enhanced with the Gabor filter described in Section 2 and
matched using the algorithms described in Section 4. The summary of the matching
results under the Polish regulations concerning fingerprint identification based on
minutiae [6], and the time relations between the methods, is grouped in Tab. 1.
As one can easily notice, the Hough transform gave the fastest response and the
highest hit ratio of the methods considered. Additionally, it can quite easily be
vectorized to perform more effectively on SIMD architectures.
The global star method is scale and rotation independent but computationally more
expensive because of the star creation process – determining the ridge count between
the m^A and m^B minutiae requires an iterative process, which is time consuming
(even if one notices that D(m^A, m^B) = D(m^B, m^A) in the star creation process with
the centers in m^A and m^B). Moreover, filtering errors and poor image quality can
cause breaks in the continuity of ridges, disturbing the proper ridge count
determination and, as a consequence, producing a lower matching percentage in
common situations.
The analysis of the error set of the correlation method shows that, of the algorithms
considered, it is the most sensitive to the selection of the image for the registration
phase and to the parameter settings. A too small fragment or a strongly deformed
fingerprint impression makes finding the unambiguous best correlation (maximum W
value in Fig. 13) significantly more difficult. Additionally, the method is time
consuming because of its complexity (a series of geometric transformations).
6 Conclusions
In this paper several methods of fingerprint matching were reviewed. The
experimental results show the quality differences and time relations between the
analyzed algorithms. The influence of selecting an image for the registration phase
can be observed: the better the selected image, the higher the matching percentage
and the smaller the inconvenience if the system works as a lock.
The suboptimal parameters selected for the preliminary experiments show that it is
still a challenge to use global optimization techniques for finding the best
parameters of each described method. Additionally, automatic image pre-selection
(classification), e.g., based on global features of a fingerprint (such as core and
delta positions or the loop class), can speed up the whole matching process for very
large databases [3, 9, 11].
Heavily architecture-dependent software optimizations or even hardware
implementations [14] can be considered if speed is a major concern. On the other
hand, if security is more important, hybrid solutions including, for example, voice,
face or iris recognition could be combined with fingerprint identification to
increase the system's reliability [19].
References
[1] Andrysiak, T., Choraś, M.: Image retrieval based on hierarchical Gabor filters. Int. J. of
Appl. Math. and Comput. Sci. 15(4), 471–480 (2005)
[2] Bouslama, F., Benrejeb, M.: Exploring the human handwriting process. Int. J. of Appl.
Math. and Comput. Sci. 10(4), 877–904 (2000)
[3] Cappelli, R., Lumini, A., Maio, D., Maltoni, D.: Fingerprint classification by directional
image partitioning. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 402–421 (1999)
[4] Fisher, R., Walker, A., Perkins, S., Wolfart, E.: Hypermedia Image Processing Reference.
John Wiley & Sons, Chichester (1996)
[5] Greenberg, S., Aladjem, M., Kogan, D., Dimitrov, I.: Fingerprint image enhancement
using filtering techniques. In: IEEE Proc. 15th Int. Conf. Pattern Recognition, vol. 3, pp.
322–325 (2000)
[6] Grzeszyk, C.: Dactyloscopy. PWN, Warszawa (1992) (in Polish)
[7] Gonzalez, R., Woods, R.: Digital Image Processing. Prentice-Hall, Englewood Cliffs
(2002)
[8] Hong, L., Wan, Y., Jain, A.: Fingerprint image enhancement: algorithm and performance
evaluation. IEEE Trans. on Pattern Analysis and Machine Intelligence 20(8), 777–789
(1998)
[9] Jain, A., Minut, S.: Hierarchical kernel fitting for fingerprint classification and alignment.
In: IEEE Proc. 16th Int. Conf. Pattern Recognition, vol. 2, pp. 469–473 (2002)
[10] Jain, A., Prabhakar, S., Hong, L., Pankanti, S.: Fingercode: a filterbank for fingerprint
representation and matching. In: IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 2, pp. 2187–2194 (1999)
[11] Karu, K., Jain, A.: Fingerprint classification. Pattern Recognition 29(3), 389–404 (1996)
[12] Lai, J., Kuo, S.: An improved fingerprint recognition system based on partial thinning. In:
Proc. 16th Conf. on Computer Vision, Graphics and Image Processing, vol. 8, pp. 169–
176 (2003)
[13] Maltoni, D., Maio, D., Jain, A., Prabhakar, S.: Handbook of Fingerprint Recognition.
Springer, Heidelberg (2003)
[14] Ratha, N., Karu, K., Chen, S., Jain, A.: A real-time matching system for large fingerprint
databases. IEEE Trans. on Pattern Analysis and Machine Intelligence 28(8), 799–813
(1996)
[15] Stock, R., Swonger, C.: Development and Evaluation of a Reader of Fingerprint
Minutiae, Cornell Aeronautical Laboratory, Technical Report (1969)
[16] Tadeusiewicz, R.: Vision systems of industrial robots, WNT (1992) (in Polish)
[17] Thai, R.: Fingerprint Image Enhancement and Minutiae Extraction, University of
Western Australia (2003)
[18] Wahab, A., Chin, S., Tan, E.: Novel approach to automated fingerprint recognition. IEE
Proc. in Vis. Image Signal Process 145(3) (1998)
[19] Zhang, D., Campbell, P., Maltoni, D., Bolle, R. (eds.): Special Issue on Biometric
Systems. IEEE Trans. on Systems, Man, and Cybernetics 35(3), 273–450 (2005)
Image Filtering Using the Dynamic Particles Method
Abstract. Holistic approaches to image processing are considered in various types of
applications in the domains of applied computer science and pattern recognition. A new
image filtering method based on the dynamic particles (DP) approach is presented. It
employs physics principles for 3D signal smoothing. The obtained results were compared
with commonly used denoising techniques, including the weighted average, Gaussian
smoothing and wavelet analysis. The calculations were performed on two types of noise
superimposed on the image data, i.e., Gaussian noise and salt-and-pepper noise. The
algorithm of the DP method and the results of the calculations are presented.
1 Introduction
The analysis of experimental measurement data is often difficult, and sometimes
even impossible, in its raw form because of superimposed noise. A properly
performed analysis based on denoising techniques allows extracting the vital part
of the data. Through the denoising process, which is often very expensive and time-
consuming, the experimental data can be restored and used in further calculations.
There exist many examples of such data obtained from experiments in different
domains of science, e.g., plastometric material tests, determination of engine
parameters, sound recording, market analysis, etc. In most cases the observed noise
is a result of external factors like the sensitivity of industrial measuring sensors
or market impulses [1].
The above-mentioned measurements are mainly one-dimensional signals that have to
be pre-processed before further analysis (Fig. 1).
However, many experimental results take the form of multi-dimensional data and
also require the application of denoising algorithms. An example of such data
used in medical or industrial applications is image data in the form of two-
dimensional pictures. The analysis of pictures taken, for example, by an industrial
camera is very often difficult because of the low quality of the registered image
caused by the low resolution of the available equipment. Moreover, the data presented
in the picture is usually superimposed with noise. In most cases the noise in the
image is the difference between the real color that can be seen by the human eye and
the value registered by the camera. Thus, if there are many pixels in the picture
with unsettled values, the whole image can be illegible.
154 L. Rauch and J. Kusiak
Fig. 1. Examples of noisy measurement signals obtained from different plastometric material tests [2]
However, the character of the noise can vary widely. Several types of random noise can be distinguished [3]:
Gaussian noise – used for testing denoising algorithms: the noise is generated and then superimposed on the source image,
White noise – noise that contains every frequency within the range of human hearing (generally from 20 Hz to 20 kHz) in equal amounts,
Salt-and-pepper noise – a specific type of noise that changes randomly chosen pixels to white or black.
The denoising method presented in this paper is dedicated to images saved in grayscale.
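The noise types listed above can be generated directly for testing a denoising algorithm; a minimal sketch (the image size, noise level and random seed are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=10.0):
    """Superimpose zero-mean Gaussian noise on a grayscale image (0-255)."""
    noisy = img.astype(float) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, fraction=0.05):
    """Set a random fraction of pixels to black (0) or white (255)."""
    noisy = img.copy()
    mask = rng.random(img.shape) < fraction
    noisy[mask] = rng.choice([0, 255], size=int(mask.sum()))
    return noisy

img = np.full((64, 64), 128, dtype=np.uint8)   # flat gray test image
g = add_gaussian_noise(img)
sp = add_salt_pepper_noise(img)
```

Corrupting a clean image in this controlled way is what makes the quantitative quality assessment of Section 3 possible.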
1.2 State-of-the-Art
Several commonly used denoising methods exist. Each of them has advantages and disadvantages, but none can be treated as a unified denoising and smoothing method. The unification of such techniques should yield one method applicable to different types of measurement data burdened with noise of different types. Existing methods have to be reconfigured and adapted to new conditions even if the analyzed data have the same form but carry different noise. An example of such data is presented in Figure 1, where two similar plots are shown. They contain results of compression tests of metal samples performed with different tool velocities. Each of these curves is loaded with noise of different frequencies, although they describe the same type of tested material. Therefore, denoising methods should be designed to obtain similar results independently of the noise character and, more importantly, independently of the curve shape. This would allow the method to be applied in an automated denoising process requiring neither reconfiguration of input parameters nor additional user interaction.
Image Filtering Using the Dynamic Particles Method 155
The process of denoising is in fact a data approximation problem. Many such algorithms exist; the most widely known and used are:
moving weighted average and polynomial approximations,
wavelet analysis [4] and artificial neural networks [5],
the large family of convolution methods and frequency-based filters [6],
Kalman statistical model processing [7],
dedicated filtering (used mainly in image filtering processes), e.g. NL-means and neighborhood models [3].
In the case of the polynomial approximation approach, the algorithms return well-fitting smoothed curves, but if the data contain several thousand measured points the calculation time becomes very long and the method proves inefficient. The weighted average technique allows very fast and flexible data smoothing, but the assessment of the obtained results is difficult and relies only on the user's intuition. Moreover, if the algorithm runs for too long, the result converges to the straight line or surface joining the border points of the data set. Thus, the main disadvantage of this method is the problem of defining a stop criterion for the algorithm. Wavelet analysis is very similar to the traditional Fourier method, but it is more effective in analyzing physical situations where the signal contains discontinuities and sharp peaks. It allows denoising to be applied at different levels of signal decomposition, making the solution precise and controllable. Wavelets are mathematical functions that divide the data into different frequency components; each component is then analyzed with a resolution matched to its frequency scale. The drawbacks of the method are the need to set thresholds each time the input data change and to choose the number of decomposition levels, which can depend on the noise character. The approach based on artificial neural networks is also often used, mainly the Generalized Regression Neural Network (GRNN). The results obtained using this technique are smoother than those of other methods, e.g. wavelet analysis, but the application has to be reconfigured each time the data change. In some cases even the type of network must be changed, which is very inconvenient during continuous calculations. Thus, the neural network approach is suitable for single calculations, but not for an automated denoising process.
The review of the above-mentioned denoising methods allows the main problems related to denoising to be identified:
the definition of the stop criterion and the evaluation of the quality of the results,
techniques applied as iterated algorithms run for too long in most cases,
the results are oversimplified, which makes further analysis of the data useless,
there is no unified method that could be applied to different types of noise characterized by different frequencies.
The main objective of this paper is the presentation of a scalable algorithm that can be applied to different types of random noise. Moreover, the algorithm is equipped with a stop criterion that analyzes the progress of the calculations, making a running assessment of the obtained results. A description of this method, the results of its application to image test data sets and their interpretation are presented.
The idea of the Dynamic Particles (DP) algorithm is based on the definition of a particle. Many definitions exist in different domains of science, but the most general one treats the particle as an object placed in an N-dimensional space. From the mathematical point of view, the particle is a vector with N components, one per dimension of the space. This approach characterizes the particle's position, so it can be analyzed relative to the others [8].
The paper presents an algorithm that performs calculations on three-dimensional particles, where each particle is in fact a pixel in the form of a three-dimensional vector:
two dimensions define the position of the pixel in the image (width and height),
the third dimension defines the grayscale value of the pixel – values are in the range 0-255, where 0 indicates black and 255 indicates white.
Therefore, the whole image is treated as a 3D surface made of points representing the pixels. The values of all three dimensions should be normalized before the calculations; this equalizes the influence that each dimension has on the results. Finally, the obtained results are re-scaled to the previous range of values.
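The mapping of a grayscale image to normalized 3-D particles, and the re-scaling back, can be sketched as follows (a minimal illustration; normalizing each dimension to the unit interval is an assumption consistent with the text, not the paper's stated scaling):

```python
import numpy as np

def image_to_particles(img):
    """Each pixel becomes a 3-D vector (x, y, value), scaled to [0, 1]."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.stack([xs.ravel() / (w - 1),
                     ys.ravel() / (h - 1),
                     img.ravel() / 255.0], axis=1)

def particles_to_image(p, shape):
    """Re-scale the third component back to the 0-255 grayscale range."""
    vals = np.clip(p[:, 2], 0.0, 1.0) * 255.0
    return vals.reshape(shape).round().astype(np.uint8)

img = np.arange(12, dtype=np.uint8).reshape(3, 4) * 20
p = image_to_particles(img)
back = particles_to_image(p, img.shape)
```

The DP iteration then operates only on the particle array, which is what makes the method dimension-agnostic.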
2.2 DP Algorithm
where x_{jk} and x_{ik} denote the vector components of particles j and i in the N-dimensional space. The force between two particles is proportional to the distance between them, and the force acting on a particle can be defined as the resultant of all forces exerted by its neighbours. Thus, the length of the resultant force acting on the particle can be treated as the particle's potential V_i. The gradient of this potential is mainly responsible for the movement of the particle in each calculation step. The set of differential equations of particle movement can be written as follows:

\begin{cases} m_i \dfrac{d v_i}{dt} = -\nabla V_i - f_c\, v_i \\ d r_i = v_i\, dt \end{cases}    (2)
The stop criterion of the proposed algorithm was solved by establishing a movement threshold. If the force acting on a single particle is less than the threshold defined at the beginning of the algorithm, the particle does not move any longer. The whole algorithm reaches the end of its run when all particles are stopped. However, the threshold responsible for the motion of the particles also defines the smoothness of the expected results. If it is set to a small value, the algorithm runs until all forces on the curve reach the threshold and the differences between the positions of two adjacent particles are very low. Otherwise, the plot of the new curve is sharper, sustaining the most important peaks. The value of this parameter can vary between 10^{-5} and 10^{-20}. If its value is too small, it has no further impact on the shape of the curve; if it is too high, the algorithm stops too early, giving no smoothing effect.
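The movement threshold described above can be sketched for a one-dimensional signal (a simplified illustration, not the paper's exact formulation: nearest-neighbour attraction, unit masses and an overdamped position update are assumptions):

```python
import numpy as np

def dp_smooth(y, threshold=1e-5, step=0.25, max_iter=10_000):
    """Move each interior particle along the resultant of its neighbour
    forces; stop once every force magnitude is below the threshold."""
    y = y.astype(float).copy()
    for _ in range(max_iter):
        # resultant force on particle i pulls it toward the neighbour mean
        force = 0.5 * (y[:-2] + y[2:]) - y[1:-1]
        if np.all(np.abs(force) < threshold):
            break                      # all particles stopped -> end of run
        y[1:-1] += step * force        # overdamped movement step
    return y

rng = np.random.default_rng(1)
noisy = np.sin(np.linspace(0.0, np.pi, 50)) + 0.1 * rng.normal(size=50)
smooth = dp_smooth(noisy, threshold=1e-3)
```

A lower threshold yields a smoother (eventually straight-line) result, while a higher one stops earlier and preserves more peaks, matching the behaviour described above.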
Precise validation of the obtained results is possible only when testing original data that do not contain noise. The testing procedure is proposed in three main steps:
Preparation of the test data – the original image (without noise) is taken and corrupted with generated random noise – the converted image is created,
The denoising algorithm is applied to the converted image – the denoised image is obtained,
The similarity ratio between the original and denoised images is calculated.
The similarity ratio is in most cases calculated as the standard deviation between the original and denoised images [3]. If this coefficient equals zero, the denoising was performed perfectly; at the moment no algorithm gives such results. The main disadvantage is that the value of the ratio is absolute, which usually impedes its interpretation. Therefore, a coefficient of denoising quality is proposed, evaluated from the differences between the original image, the converted image and the denoised image as follows:
D_q = \frac{calc\_diff(S_i, N_i)}{calc\_diff(S_i, D_i)}    (3)
where D_q is the denoising quality coefficient, S_i the source (original) image, N_i the noised image and D_i the denoised image. The calc_diff function used in equation (3) is defined as a modified standard deviation:

calc\_diff = \frac{\sum_i d_i}{n - 1}    (4)
where d_i is the distance between corresponding particles in both images and n is the number of points. A Dq coefficient equal to 1 means that the denoising process had no effect. Thus, the Dq value should be greater than 1, and a higher value means better denoising results. Tests performed on one-dimensional data (signal denoising) indicated that high-quality denoising was obtained when the Dq value was higher than 5. However, the character of images, which are often very jagged, indicates that a denoising quality in the range from 1.2 to 1.6 is satisfactory.
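Equations (3) and (4) can be evaluated directly; a minimal sketch (using pixel-wise absolute differences as the inter-particle distances d_i is an assumption, and the test images are synthetic):

```python
import numpy as np

def calc_diff(a, b):
    """Modified standard deviation of Eq. (4): sum of distances / (n - 1)."""
    d = np.abs(a.astype(float) - b.astype(float)).ravel()
    return d.sum() / (d.size - 1)

def denoising_quality(source, noised, denoised):
    """Dq = calc_diff(S, N) / calc_diff(S, D); Dq > 1 means the denoised
    image is closer to the source than the noised one."""
    return calc_diff(source, noised) / calc_diff(source, denoised)

S = np.full((8, 8), 100.0)
N = S + 10.0          # heavily corrupted (constant offset for illustration)
D = S + 2.0           # partially denoised
Dq = denoising_quality(S, N, D)   # close to 5.0
```

With these synthetic inputs the residual error shrinks by a factor of five, so Dq lands well above the 1.2-1.6 band quoted for real images.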
One of the main objectives of this paper was to create a scalable denoising algorithm. In this case the scalability property means that the algorithm is applicable to data placed in an N-dimensional space. Thus, the complexity of such an algorithm should depend only weakly on the number of dimensions.
The source code presented in Table 1 shows that the computational complexity of the DP method depends mainly on the number of points in the data set. The pessimistic variant of the calculations assumes that each iteration of the algorithm requires the position of every particle to be recalculated. The function responsible for these calculations, called calc_pos, performs several iterations based on the number of neighbour particles and the number of dimensions. In the case of two-dimensional data (an image) the influence of this function on the algorithm complexity is insignificant. However, the number of main loop iterations, which depends on the Cc initial value, is important. Satisfactory results for image denoising (e.g. 400x400 pixels, equal to 160 000 particles) are obtained after 25-50 iterations. Thus, the computational complexity of the designed algorithm can be estimated in O notation as O(n log₂ n), where n depends on the number of particles and the log₂ n component is related to the number of iterations.
3 Results
The tests of the created algorithm were performed on a data set containing several example images. The images were characterized by different types of content:
smoothed – content with several basic colors grouped in sets of pixels, e.g. geometric figures (Fig. 3-5),
jagged – real photos containing in most cases a full range of colors mixed together with high frequency (Fig. 6-9).
Each of them was superimposed with two types of generated random noise. Finally, the data set contained 30 (thirty) images submitted for analysis using the DP algorithm. Several chosen results are presented in Figures 3-9.
Fig. 3. Example of a smooth image containing only 5 colors in its original version, noised with Gaussian (4%) noise
Fig. 4. Result of the denoising process obtained by using the DP method
Fig. 5. The comparison of the magnified noised (Fig. 3) and denoised (Fig. 4) images – the denoising quality coefficient equals 1.495
Fig. 6. Example of an original jagged image containing a lot of details – the Lena picture
Fig. 7. The same image noised with Gaussian (8%) noise
Fig. 8. Results obtained using the standard DP method
Fig. 9. Results obtained using the DP method equipped with an edge detection algorithm used during the neighborhood determination process
The results obtained from the denoising process seem satisfactory. However, the main observed disadvantage is the lack of an edge detection procedure, which can be seen in Fig. 5. The subsequent figures present the application of the DP method supported by a proper edge detection algorithm.
Acknowledgements
Financial assistance of the KBN project No. 11.11.110.575 is acknowledged.
References
[1] Rauch, Ł., Talar, J., Zak, T., Kusiak, J.: Filtering of thermomagnetic data curve using
artificial neural network and wavelet analysis. In: Rutkowski, L., Siekmann, J.H.,
Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS, vol. 3070, pp. 1093–1098.
Springer, Heidelberg (2004)
[2] Gawąd, J., Kusiak, J., Pietrzyk, M., Di Rosa, S., Nicol, G.: Optimization Methods Used for
Identification of Rheological Model for Brass. In: Proc. 6th ESAFORM Conf. On Material
Forming, Salerno, Italy, pp. 359–362 (2003)
[3] Buades, A., Coll, B., Morel, J.M.: On image denoising methods, Centre de Matematiques
et de Leurs Applications, http://www.cmla.ens-cachan.fr
[4] Adelino, R., da Silva, F.: Bayesian wavelet denoising and evolutionary calibration. Digital
Signal Processing 14, 566–689 (2004)
[5] Falkus, J., Kusiak, J., Pietrzkiewicz, P., Pietrzyk, W.: Filtering of the industrial data for the Artificial Neural Network Model of the Steel Oxygen Converter Process. In: Intelligence in Small World – Nanomaterials for the 21st Century. CRC Press, Boca Raton (2003)
[6] Hara, S., Tsukada, T., Sasajima, K.: An in-line digital filtering algorithm for surface roughness profiles. Precision Engineering 22, 190–195 (1998)
[7] Piovoso, M., Laplante, P.A.: Kalman filter recipes for real-time image processing. Real-
time Image Processing 9, 433–439 (2003)
[8] Dzwinel, W., Alda, W., Yuen, D.A.: Cross-Scale Numerical Simulations using Discrete
Particle Models. Molecular Simulation 22, 397 (1999)
The Simulation of Cyclic Thermal Swing Adsorption
(TSA) Process
Bogdan Ambrożek
Abstract. The dynamic behavior of a cyclic thermal swing adsorption (TSA) system, with a column packed with a fixed bed of adsorbent, is predicted successfully with a rigorous dynamic mathematical model. The set of partial differential equations (PDEs) representing the TSA is solved by the numerical method of lines (NMOL), using the FORTRAN subroutine DIVPAG from the International Mathematical and Statistical Library (IMSL). The simulated TSA cycle is operated in three steps: (i) an adsorption step with cold feed; (ii) a countercurrent desorption step with hot inert gas; (iii) a countercurrent cooling step with cold inert gas. Exemplary simulation results are presented for propane adsorbed onto and desorbed from a fixed bed of activated carbon. Nitrogen is used as the carrier gas during adsorption and as the purge gas during desorption and cooling.
1 Introduction
Cyclic thermal swing adsorption (TSA) processes have been widely used in industry for the removal and recovery of pollutants, such as volatile organic compounds (VOCs), from gaseous streams [1]. A typical TSA system consists of two adsorption columns with fixed beds of adsorbent and operates between two different temperatures. While the adsorption process takes place in one column, the bed in the other column is subjected to regeneration. During desorption, the first step of regeneration, hot purge gas, which can be a slipstream of the purified gas or another inert gas, flows through the bed. The adsorbate concentration in the purge gas is much higher than in the feed gas, and this concentrated stream can be sent to an incinerator. It is also possible to recover the adsorbate by condensing it out of the purge gas stream. After completion of the desorption step, the bed is cooled.
In mathematical terms, cyclic TSA processes are classified as distributed parameter systems, described by an integrated system of partial differential and algebraic equations (IPDAEs) [2]. Each TSA process approaches a cyclic steady-state (CSS), in which the conditions at the end of each cycle are identical to those at the start [3]. The difficulties in the design of TSA processes stem from the lack of information about the influence of the process variables on the dynamic behavior of the adsorption column and on the cyclic steady-state convergence time.
The purpose of the present paper is to provide a parametric analysis of thermal swing adsorption. The effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves is investigated.
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 165 – 178.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
166 B. Ambrożek
2 Mathematical Model
The mathematical model describing the TSA process consists of integrated partial differential and algebraic equations. The model equations were obtained by applying differential material and energy balances to the adsorbent bed. The following assumptions were made:
(1) The gas phase follows the ideal gas law.
(2) Constant pressure operation.
(3) Single adsorbate system.
(4) The velocity of the carrier gas is constant.
(5) Negligible radial concentration, temperature and velocity gradients within the bed.
(6) Negligible intraparticle heat transfer resistance.
The model considers mass and heat transfer resistances, axial diffusion and thermal
conductivity.
Based on the above assumptions, the adsorbate mass balance within the gas phase
is represented by the following equation:
-D_{ax} \frac{\partial^2 y}{\partial z^2} + \frac{G}{\rho_g \varepsilon} \frac{\partial y}{\partial z} + \frac{\partial y}{\partial t} + \frac{(1 - \varepsilon)\rho_p}{\varepsilon \rho_g} \frac{\partial q}{\partial t} = 0    (1)
The adsorbate balance around the solid phase is formulated using a linear driving
force expression:
\frac{\partial q}{\partial t} = k \left( q^* - q \right)    (2)
The heterogeneous energy balance for the gas phase in the packed bed accounts for axial conduction and heat transfer to the solid phase and to the column wall:

-\frac{k_{ax}}{\rho_g C_{pg}} \frac{\partial^2 T_g}{\partial z^2} + \frac{G}{\rho_g \varepsilon} \frac{\partial T_g}{\partial z} + \frac{\partial T_g}{\partial t} + \frac{h_f \alpha_p (1 - \varepsilon)}{\varepsilon \rho_g C_{pg}} \left( T_g - T_s \right) + \frac{4 h_w}{\varepsilon D \rho_g C_{pg}} \left( T_g - T_c \right) = 0    (3)
The energy balance of the adsorbent particle includes the heat generated by adsorption
and is expressed as:
\frac{\partial T_s}{\partial t} - \frac{h_f \alpha_p}{\rho_p C_{ps}} \left( T_g - T_s \right) - \frac{\Delta H_a}{C_{ps}} \frac{\partial q}{\partial t} = 0    (4)
The Simulation of Cyclic Thermal Swing Adsorption (TSA) Process 167
\frac{1}{k} = \frac{\rho_p q^*}{k_f \rho_g \alpha_p y} + \frac{R_p}{5 D_{ps} \alpha_p}    (6)
The effective diffusivity, Dps, is related to Knudsen and surface diffusivities as
follows:
D_{ps} = D_s + D_K \frac{\varepsilon_p \rho_g}{\rho_p} \frac{\partial y^*}{\partial q}    (7)
The Knudsen diffusion coefficient is calculated by the following equation [6]:
D_K = 9.7 \cdot 10^{-5} \, \frac{r_e}{\tau_p} \left( \frac{T}{M} \right)^{1/2}    (8)
D_s = \frac{1.61 \cdot 10^{-6}}{\tau_s} \exp\!\left( -\frac{E}{RT} \right)    (9)
Mass and heat axial dispersion values are calculated with the following correlations
[7]:
\frac{\varepsilon D_{ax}}{D_M} = 20 + 0.5 \, Sc \, Re    (10)

\frac{k_{ax}}{k_g} = 7 + 0.5 \, Pr \, Re    (11)
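Correlations (10) and (11) translate directly into code; a small sketch (the numerical inputs for the dimensionless groups are illustrative assumptions):

```python
def axial_dispersion(D_M, eps, Sc, Re):
    """Eq. (10): eps * Dax / D_M = 20 + 0.5 * Sc * Re, solved for Dax."""
    return (20.0 + 0.5 * Sc * Re) * D_M / eps

def axial_conductivity(k_g, Pr, Re):
    """Eq. (11): kax / k_g = 7 + 0.5 * Pr * Re, solved for kax."""
    return (7.0 + 0.5 * Pr * Re) * k_g

Dax = axial_dispersion(D_M=1e-5, eps=0.4, Sc=1.2, Re=50.0)
kax = axial_conductivity(k_g=0.026, Pr=0.7, Re=50.0)
```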
The k f and h f values are calculated using equations of Wakao and Chen [8]:
Molecular diffusivity is calculated using the equation developed by Fuller et al. [9]:
D_M = \frac{1.013 \cdot 10^{-2} \, T^{1.75} \left( 1/M + 1/M_g \right)^{1/2}}{P \left[ \left( D_v \right)^{1/3} + \left( D_{vg} \right)^{1/3} \right]^2}    (14)
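Equation (14) can likewise be evaluated directly; a sketch for propane in nitrogen (the diffusion volumes and operating conditions are assumed illustrative values, with pressure taken in Pa, not quantities quoted from the paper):

```python
def fuller_diffusivity(T, P, M, M_g, Dv, Dvg):
    """Eq. (14): molecular diffusivity from the Fuller-type correlation."""
    num = 1.013e-2 * T**1.75 * (1.0 / M + 1.0 / M_g) ** 0.5
    den = P * (Dv ** (1.0 / 3.0) + Dvg ** (1.0 / 3.0)) ** 2
    return num / den

# propane (M = 44.1 g/mol) in nitrogen (M_g = 28.0 g/mol); the diffusion
# volumes Dv, Dvg are assumed standard Fuller increments
D_M = fuller_diffusivity(T=298.0, P=101325.0, M=44.1, M_g=28.0,
                         Dv=65.34, Dvg=17.9)
```

The result is of the order of 10^-5 m²/s, the expected magnitude for gas-phase binary diffusivities at ambient conditions.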
k_{ax} \left. \frac{\partial T_g}{\partial z} \right|_{z=0} = -G C_{pg} \left( T_g \big|_{z=0^-} - T_g \big|_{z=0^+} \right)    (17)
and
y \big|_{z=0^-} = y \big|_{z=0^+} = y_o    (18)

T_g \big|_{z=0^-} = T_g \big|_{z=0^+} = T    (19)
Both of the above boundary conditions have been employed by other investigators [4-6].
The boundary conditions at z = L and t > 0 are written as follows:
\left. \frac{\partial y}{\partial z} \right|_{z=L} = 0; \qquad \left. \frac{\partial T_g}{\partial z} \right|_{z=L} = 0    (20)
The solution of the model equations requires the knowledge of the state of the
column at the beginning of each step. The initial conditions for 0 < z < L and
t = 0 are:
q(0, z) = q_o(z); \quad y(0, z) = y_o(z)
T_s(0, z) = T_{so}(z); \quad T_g(0, z) = T_{go}(z)    (21)
T_c(0, z) = T_{co}(z)
In the present study, it is assumed that the final concentration and temperature profiles in the adsorbent bed for each step define the initial conditions for the next step. For the adsorption step in the first adsorption cycle:

q_o(z) = 0; \quad y_o(z) = 0,
T_{so}(z) = T_{go}(z) = T_{co}(z) = T_a    (22)
3 Numerical Solution
The model developed in this work consists of partial differential equations (PDEs) for the mass and energy balances. The set of PDEs is first transformed into a dimensionless form, and the resulting system is solved using the numerical method of lines (NMOL) [10]. The spatial discretization is performed using second-order central differencing, and the PDEs are reduced to a set of ordinary differential equations (ODEs). The number of axial grid nodes was 30. The resulting set of ODEs was solved using the FORTRAN subroutine DIVPAG of the International Mathematical and Statistical Library (IMSL). The DIVPAG program employs the Adams-Moulton or Gear's BDF method with variable order and step size.
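The method-of-lines reduction described above can be sketched for the isothermal adsorbate balance, Eqs. (1)-(2), alone (a simplified illustration, not the paper's solver: explicit Euler time stepping replaces DIVPAG, a linear isotherm q* = K·y is assumed, and all parameter values are arbitrary):

```python
import numpy as np

n, L = 30, 0.5                        # 30 axial grid nodes, bed length [m]
dz = L / (n - 1)
u, Dax = 0.05, 1e-4                   # interstitial velocity, axial dispersion
k, K = 0.01, 2.0                      # LDF coefficient, linear isotherm slope
beta = 1.0                            # lumped (1 - eps) * rho_p / (eps * rho_g)
y_feed = 0.01                         # inlet mole fraction

y = np.zeros(n)                       # gas-phase mole fraction at the nodes
q = np.zeros(n)                       # adsorbed-phase loading

dt = min(0.2 * dz**2 / Dax, 0.5 * dz / u)   # stable explicit step
for _ in range(2000):
    y[0] = y_feed                                        # inlet condition
    dq = k * (K * y - q)                                 # LDF rate, Eq. (2)
    conv = -u * (y[1:-1] - y[:-2]) / dz                  # upwind convection
    disp = Dax * (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dz**2
    y[1:-1] += dt * (conv + disp - beta * dq[1:-1])      # Eq. (1) rearranged
    y[-1] = y[-2]                                        # zero gradient, Eq. (20)
    q += dt * dq
```

DIVPAG would instead integrate the same semi-discrete ODE system with a variable-order, variable-step Adams or BDF scheme, which is far more efficient for this stiff problem.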
The final concentration and temperature profile in the adsorbent bed for each step defines the initial conditions for the next step. It is assumed that the condition for a periodic state is satisfied when the amount removed from the bed during regeneration is equal to the amount accumulated in the bed during the adsorption step.
The following equation is used to determine the cyclic steady-state [2]:
\left| \left( \int_0^L q \, dz \right)_{(n_c - 1)\text{th cycle}} - \left( \int_0^L q \, dz \right)_{(n_c)\text{th cycle}} \right| < \delta    (24)
where δ is a value close to zero (in this work δ = 1·10^{-5}). Approximately 15-20 cycles are needed to achieve the cyclic steady-state, depending on the process conditions.
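The cyclic steady-state check of Eq. (24) can be written directly; a minimal sketch (the trapezoidal rule and the synthetic loading profiles are illustrative assumptions):

```python
import numpy as np

def bed_integral(q, z):
    """Trapezoidal approximation of the integral of q(z) over the bed."""
    return float(np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(z)))

def cyclic_steady_state(q_prev, q_curr, z, delta=1e-5):
    """Eq. (24): the held-up amount changes by less than delta per cycle."""
    return abs(bed_integral(q_prev, z) - bed_integral(q_curr, z)) < delta

z = np.linspace(0.0, 0.5, 30)
q1 = 0.02 * np.exp(-5.0 * z)          # loading profile after cycle nc - 1
q2 = q1 + 1e-8                        # nearly identical profile after cycle nc
converged = cyclic_steady_state(q1, q2, z)
```

In the full simulation this test is evaluated after every complete adsorption-desorption-cooling cycle until it is satisfied.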
The computer simulation results are used to study the effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves. The effects of adiabatic and non-adiabatic operation, purge gas temperature during the desorption step, boundary conditions, axial diffusion and thermal conductivity were investigated. Exemplary simulation results for the cyclic steady-state are shown in Figures 2-11. Typical concentration and temperature breakthrough curves for the adsorption, desorption and cooling steps are shown in Figures 2-6. In the case of the desorption step, two transitions are apparent, connected by a plateau. The breakthrough curves were highly influenced by the purge gas temperature (Figure 7) and by heat loss through the adsorption column wall, especially for a small-diameter adsorption
Fig. 2. Concentration breakthrough curve (y [mol/mol] vs. t [s]) for the adsorption step
Fig. 3. Temperature breakthrough curve (T [K] vs. t [s]) for the adsorption step
Fig. 4. Concentration breakthrough curve (y [mol/mol] vs. t [s]) for the desorption step. Purge gas temperature: 394 K.
Fig. 5. Temperature breakthrough curve (T [K] vs. t [s]) for the desorption step. Purge gas temperature: 394 K.
Fig. 6. Temperature breakthrough curve (T [K] vs. t [s]) for the cooling step. Purge gas temperature during the desorption step: 394 K.
Fig. 7. Effect of purge gas temperature (394 K, 350 K, 310 K) on the concentration breakthrough curve (y [mol/mol] vs. t [s]) for the desorption step
Fig. 8. Concentration breakthrough curves (y [mol/mol] vs. t [s]) for adiabatic and non-adiabatic (D = 1 m, D = 0.07 m) desorption. Purge gas temperature: 394 K
Fig. 9. Concentration breakthrough curves (y [mol/mol] vs. t [s]) for different boundary conditions. Purge gas temperature: 394 K.
Fig. 10. Effect of axial diffusion (Eq. 10 vs. Dax = 0) on the concentration breakthrough curve (y [mol/mol] vs. t [s]) for the desorption step. Purge gas temperature: 394 K.
Fig. 11. Effect of axial thermal conductivity (Eq. 11 vs. kax = 0) on the temperature breakthrough curve (T [K] vs. t [s]) for the desorption step. Purge gas temperature: 394 K.
column (Figure 8). The modeling results show that the concentration and temperature breakthrough curves obtained using the different boundary conditions, defined by equations (16)-(17) and (18)-(19), are practically identical (Figure 9). The effect of axial diffusion on the concentration breakthrough curve is illustrated in Figure 10. The effective axial diffusion coefficient, Dax, was (i) set equal to zero and (ii) calculated from equation (10). Figure 11 shows the effect of axial thermal conductivity on the temperature breakthrough curve; the value of kax was varied in the same manner as the axial diffusion coefficient. Neither breakthrough curve was significantly affected by axial thermal conductivity or axial diffusion, but the required computer time was sensitive to the values of Dax and kax.
5 Conclusions
A theoretical study of thermal swing adsorption was made. A non-equilibrium, non-adiabatic mathematical model was developed to simulate temperature and concentration breakthrough curves for adsorption and regeneration. The cyclic steady-state (CSS) cycles were obtained under various conditions by a cyclic iteration method. The modeling results were used to study the effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves. The effects of adiabatic and non-adiabatic operation, purge gas temperature during the desorption step, boundary conditions, axial diffusion and thermal conductivity were investigated.
Based on the modeling results the following conclusions are drawn:
(i) The breakthrough curves were highly influenced by the purge gas temperature and by heat loss through the adsorption column wall.
(ii) The concentration and temperature breakthrough curves obtained using the different boundary conditions are practically identical.
(iii) The breakthrough curves were not significantly affected by axial thermal conductivity or axial diffusion.
Symbols
bo – constant in Eq. 20
B – constant in Eq. 20
Cpc – heat capacity of column, J/(mol K)
Cpg – heat capacity of gas, J/(mol K)
Cps – heat capacity of solid, J/(kg K)
D – internal diameter of bed, m
Dax – axial diffusion coefficient, m2/s
DK – Knudsen diffusion coefficient, m2/s
DM – molecular diffusion coefficient, m2/s
Dps – effective particle diffusion coefficient, m2/s
DS – surface diffusion coefficient, m2/s
Dv, Dvg – diffusion volume of adsorbate, inert gas
E – surface diffusion energy of activation, J/mol
References
[1] Bathen, D., Breitbach, M.: Adsorptionstechnik. Springer, Berlin (2001)
[2] Ko, D., Moon, I., Choi, D.-K.: Analysis of the Contact Time in Cyclic Thermal Swing
Adsorption Process. Ind. Eng. Chem. Res. 41, 1603 (2002)
[3] Ding, Y., LeVan, M.D.: Periodic States of Adsorption Cycles III. Convergence
Acceleration for Direct Determination. Chem. Eng. Sci. 56, 5217 (2001)
[4] Schork, J.M., Fair, J.R.: Parametric Analysis of Thermal Regeneration of Adsorption
Beds. Ind. Eng. Chem. Res. 27, 457 (1988)
[5] Yun, J.-H., Choi, D.-K., Moon, H.: Benzene Adsorption and Hot Purge Regeneration in Activated Carbon Beds. Chem. Eng. Sci. 55, 5857 (2000)
[6] Huang, C.-C., Fair, J.R.: Study of the Adsorption and Desorption of Multiple Adsorbates
in a Fixed Bed. AICHE J. 34, 1861 (1988)
[7] Wakao, N., Funazkri, T.: Effect of Fluid Dispersion Coefficients on Particle-to-Fluid
Mass Transfer Coefficients in Packed Beds. Chem. Eng. Sci. 33, 1375 (1978)
[8] Wakao, N., Chen, B.H.: Some Models for Unsteady-state Heat Transfer in Packed Bed
Reactors. In: Kulkarni, B., Mashelkar, R., Sharma, M. (eds.) Recent Trends in Chemical
Reaction Engineering, vol. 1, p. 254. Wiley Eastern Ltd., New Delhi (1987)
[9] Sinnott, R.K.: Coulson & Richardson’s Chemical Engineering, vol. 6. Butterworth-
Heinemann, Oxford (1999)
[10] Schiesser, W.E.: The Numerical Methods of Lines. Academic Press, California (1991)
[11] Valenzuela, D.P., Myers, A.L.: Adsorption Equilibrium Data Handbook. Prentice-Hall,
Englewood Cliffs (1989)
The Stress Field Induced Diffusion
Keywords: stress, interdiffusion, Navier equation, multicomponent solution, alloys, drift velocity.
1 Introduction
The new understanding of diffusion in multicomponent systems started with Kirkendall's experiments on the interdiffusion (ID) between Cu and Zn. The experiments proved that diffusion by direct interchange of atoms, the prevailing idea of the day, was incorrect and that a less-favored theory, the vacancy mechanism, must be considered. In 1946 Kirkendall, along with his student Alice Smigelskas, co-authored a paper asserting that ID between Cu and Zn in brass causes movement of the interface between the initially different phases. This discovery, known since then as the "Kirkendall effect", supported the idea that atomic diffusion occurs through vacancy exchange [1]. It reflects the different intrinsic diffusion fluxes of the components, which cause swelling (creation) of one part and shrinkage (annihilation) of the other part of the diffusion couple. The key conclusion is that local movement of the solid (its lattice) or liquid due to diffusion is a real process. Once the solution is non-uniform and the mobilities differ from each other, a vast number of phenomena can occur: Kirkendall marker movement, formation of Kirkendall-Frenkel voids, generation of stress, etc. The concepts initiated by Kirkendall played a decisive role in the development of diffusion theory [2,3]. The progress in understanding the ID phenomenology [4] nowadays allows an attempt to further generalize the Darken method. The Darken method for multicomponent
W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 179 – 188.
springerlink.com © Springer-Verlag Berlin Heidelberg 2009
180 M. Danielewski, B. Wierzba, and M. Pietrzyk
solutions is based on the postulate that the total mass flow is a sum of the diffusion and drift flows [4]. The force arising from the gradients causes the atoms of a particular component to move with a velocity which, in general, may differ from the velocity of the atoms of the other components. The medium is common to all the species and all the fluxes are coupled; thus their local changes can affect the common drift velocity, υ_drift. The physical laws that govern the process are the continuity equations and the postulate that the total molar concentration of the solution is constant. The extended Darken method in one dimension [4] allows modelling the positions of the solution boundaries, the densities and the drift velocity. The physical laws are the same as in the original Darken model; all the important differences lie in the formulation of the initial and boundary conditions. The model allows modelling ID for an arbitrary initial distribution of the components, in the case of moving boundaries, of reactions, and in many other situations. The uniqueness and existence of the solution, the effective methods of numerical solution and the successful modelling of "diffusional structures" ("up-hill diffusion") prove the universality of the drift concept. It offers a unique opportunity to describe ID in real solutions and in three dimensions – the objective of this work. The presented model is solvable and a unique solution of it exists [4].
where υ_drift denotes the drift velocity, J_i^d is the diffusion flux and r the number of
components. The mass balance equation can be written in the internal reference frame
(relative to the drift velocity).
Thus from Eqs. (1) and (2) it follows:
    Dρ_i/Dt |_{υ_drift} = −div J_i^d − ρ_i div υ_drift = −div(ρ_i x_i^d) − ρ_i div υ_drift,   i = 1, ..., r   (3)

where x_i^d is the diffusion velocity. The derivative in Eq. (3) is called the Lagrangian,
substantial or material derivative:
    Dρ_i/Dt |_{υ_drift} = ∂ρ_i/∂t + υ_drift · grad ρ_i   (4)
and it gives the rate of density change at a point moving with an arbitrary velocity,
here the drift velocity.
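As a numerical aside (not part of the original derivation), Eq. (4) can be checked directly on a made-up profile: for a density ρ(x, t) = x − υt that simply rides on the drift, the material derivative following the drift must vanish even though the local time derivative does not.

```python
# Material derivative, Eq. (4): D(rho)/Dt = d(rho)/dt + v * grad(rho),
# evaluated for the hypothetical advected profile rho(x, t) = x - v*t.
def material_derivative(drho_dt, grad_rho, v_drift):
    """Rate of density change at a point moving with v_drift."""
    return drho_dt + v_drift * grad_rho

v = 2.0           # drift velocity (arbitrary units)
grad_rho = 1.0    # d(rho)/dx for rho = x - v*t
drho_dt = -v      # local (Eulerian) time derivative

D_rho = material_derivative(drho_dt, grad_rho, v)
print(D_rho)      # 0.0: the profile is merely carried along by the drift
```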
The Stress Field Induced Diffusion 181
The generally accepted form of the diffusion flux is the Nernst-Planck equation
[5,6]:
    J_i^d = ρ_i B_i F_i   (5)

where B_i and F_i are the mobility of the i-th component and the force acting on it:

    F_i = −grad μ_i   (6)
Upon combining Eqs. (3), (5) and (6) the continuity equation becomes:

    Dρ_i/Dt |_{υ_drift} = div[ρ_i B_i grad μ_i] − ρ_i div υ_drift,   i = 1, ..., r   (7)
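The Nernst-Planck flux law of Eqs. (5)-(6) is simple to evaluate numerically; the sketch below uses made-up values for the density, the mobility and the chemical-potential gradient, purely for illustration.

```python
# Nernst-Planck diffusion flux, Eqs. (5)-(6): J_i = rho_i * B_i * F_i,
# with the driving force F_i = -grad(mu_i). All values are illustrative.
def diffusion_flux(rho_i, B_i, grad_mu_i):
    F_i = -grad_mu_i          # Eq. (6): force from the potential gradient
    return rho_i * B_i * F_i  # Eq. (5)

J = diffusion_flux(rho_i=4.0, B_i=0.5, grad_mu_i=-3.0)
print(J)  # 6.0: the flux runs down the chemical-potential gradient
```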
For all processes that obey the mass conservation law, and when chemical
and/or nuclear reactions are not allowed (the reaction term can be omitted), the
equation of mass conservation, Eq. (3), holds. It is postulated here that the drift
velocity is a sum of the Darken drift velocity (generated by the interdiffusion) and the
deformation velocity υ_σ (generated by the stress):

    υ_drift = υ_D + υ_σ   (8)
Darken postulated that diffusion fluxes are local and defined exclusively by the
local forcing (e.g., the chemical potential gradient, stress field, electric field, etc.). He
postulated the existence of a unique average velocity that he called the drift velocity. In
this work, we generalize the original Darken concept to include the elastic
deformation of an alloy. Darken's drift velocity, υ_D, is given by [2]:
    υ_D =df (1/c) Σ_{i=1}^{r} c_i x_i − (1/c) Σ_{i=1}^{r} c_i x_i^d − υ_σ   (9)
The average total and the diffusion velocities are given by:

    υ =df (1/c) Σ_{i=1}^{r} c_i x_i   (10)

    υ^d =df (1/c) Σ_{i=1}^{r} c_i x_i^d   (11)
The diffusion velocity of the i-th component and the concentration of the solution
are defined by:
    J_i^d = c_i x_i^d   (12)

    c = Σ_{i=1}^{r} c_i   (13)
From Eqs. (8) – (12), the following relations for the flux of the i-th element and its
velocity hold:
Upon summing Eqs. (14) for all components, the average local velocities satisfy
Eq. (9):
    Σ_{i=1}^{r} c_i x_i^d = Σ_{i=1}^{r} c_i x_i − c υ_D − c υ_σ   (15)

    υ^d = υ − υ_drift = υ − υ_D − υ_σ   (16)
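The consistency of definitions (9)-(11), (13) and (16) can be verified with a few lines of arithmetic; the component data below are made up solely for this check.

```python
# Check of Eq. (16), v_d = v - v_D - v_sigma, using the definitions
# (9)-(11) and (13). All component data are hypothetical.
c_i   = [1.0, 2.0, 3.0]     # molar concentrations, Eq. (13)
x_i   = [0.3, -0.1, 0.2]    # total velocities of the components
x_i_d = [0.1, 0.0, -0.2]    # diffusion velocities
v_sigma = 0.05              # deformation (stress) velocity

c   = sum(c_i)                                          # Eq. (13)
v   = sum(ci * xi for ci, xi in zip(c_i, x_i)) / c      # Eq. (10)
v_d = sum(ci * xd for ci, xd in zip(c_i, x_i_d)) / c    # Eq. (11)
v_D = v - v_d - v_sigma                                 # Eq. (9) rearranged

# Eq. (16): diffusion velocity = total velocity minus drift velocity
assert abs(v_d - (v - v_D - v_sigma)) < 1e-12
print(round(v_d, 6))   # -0.083333
```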
The postulate of the drift velocity allows rewriting Eq. (3) in the following form:

    Dc_i/Dt |_{υ_drift} + div(c_i x_i^d) + c_i div υ_drift = 0   (17)

    Dc/Dt |_{υ_drift} + div(c υ) − υ_drift · grad c = 0   (18)

and finally

    ∂c/∂t + div(c υ) = 0   (19)
Thus, we have obtained the well-known formula for mass conservation in a
multicomponent solution.
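A discretization of Eq. (19) in flux (conservative) form preserves the total concentration exactly; the 1-D sketch below, with an arbitrary grid, constant velocity and periodic boundaries (all assumptions of this illustration), demonstrates that property.

```python
# Conservative finite-difference update for Eq. (19),
# dc/dt + div(c v) = 0, in 1-D with periodic boundaries and a
# constant velocity. Grid and values are arbitrary.
N, dx, dt, v = 8, 1.0, 0.1, 1.0
c = [1.0, 2.0, 3.0, 2.0, 1.0, 0.5, 0.5, 1.0]

def step(c):
    # upwind flux c*v through the left face of each cell (v > 0)
    flux = [v * c[(j - 1) % N] for j in range(N)]
    return [c[j] - dt / dx * (flux[(j + 1) % N] - flux[j]) for j in range(N)]

total_before = sum(c)
c = step(c)
print(sum(c))   # equals total_before: the flux form conserves mass
```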
The general form of the equation of motion for an elastic solid is very complex. We
will use the results for an isotropic material. In such a case the equation
of motion reduces to the vector equation: f = (λ + μ) grad div u + μ div grad u, where
f is the density of the force induced by the displacement vector u. It shows that an
isotropic material has only two elastic constants. To get the equation of motion, f is
equated with the inertial term:

    ρ ∂²u/∂t² = (λ + μ) grad div u + μ div grad u.
An elastic body is defined as a material for which the stress tensor depends only on
the deformation tensor F:

    σ = σ(F)   (20)
We postulate in this work that the displacements are small. In such a case the
displacement gradient, H, is defined as the gradient of the displacement vector (u = x – X):
H = gradu = F − 1 (21)
and the strain tensor is the symmetric part of H:

    ε = (1/2)(H + H^T)   (22)
where
(u + ul , k )
1
ε kl = k ,l
2
The constitutive equation of an isotropic, linear and elastic body is known as
Hooke's law [8]:

    σ = (λ tr ε) 1 + 2μ ε   (23)
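Hooke's law, Eq. (23), together with the strain definition of Eq. (22), is easy to evaluate numerically; the Lamé constants and the displacement gradient below are placeholders, not data from this work.

```python
import numpy as np

# Hooke's law for an isotropic body, Eq. (23):
# sigma = lambda * tr(eps) * I + 2 * mu * eps.
# The Lame constants are hypothetical placeholders.
lam, mu = 2.0, 1.0

def hooke(eps):
    return lam * np.trace(eps) * np.eye(3) + 2.0 * mu * eps

# strain as the symmetric part of a displacement gradient, Eq. (22)
H = np.array([[0.01, 0.00, 0.02],
              [0.00, 0.00, 0.00],
              [0.01, 0.00, 0.03]])
eps = 0.5 * (H + H.T)
sigma = hooke(eps)

print(np.allclose(sigma, sigma.T))  # True: symmetric, as Eq. (28) requires
```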
The Navier and Navier-Lamé equations describe the momentum balance in a
compressible fluid and an isotropic solid, respectively. The relations for the momentum and the
moment of momentum obtained in the theory of mass transport in continua in which
diffusion takes place [10] allow postulating the following relation:

    ρ Dυ/Dt |_{υ_drift} = div σ* + ρ f_b   (26)

where σ* and f_b denote the overall Cauchy stress tensor and the body force, respectively.
where σ and f_b denote the overall stress tensor defined by Eq. (23) and the body
force, respectively.
In Eqs. (26) and (27) we postulate that the drift velocity defines the local frame of
reference. In the analyzed case of a regular, cubic and elastic crystal the following
relation holds [9,11]:

    σ − σ^T = 0   (28)
The diffusion of the i-th component, Eq. (14), depends on both the stress and the
chemical potential gradient, Eqs. (5) and (6). Following Darken, the total flux is a sum
of diffusion and drift terms:

    J_i = −c_i B_i grad(μ_i + Ω_i p) + c_i υ_D + c_i υ_σ   (29)
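The structure of Eq. (29) can be shown with a one-line numerical sketch; every value (concentration, mobility, molar volume, gradients, drift velocities) is an illustrative placeholder, not data from the paper.

```python
# Total flux of component i, Eq. (29): a diffusion term driven by the
# gradient of (mu_i + Omega_i * p) plus the two drift contributions.
# All numbers below are hypothetical.
def total_flux(c_i, B_i, grad_mu_i, Omega_i, grad_p, v_D, v_sigma):
    diffusion = -c_i * B_i * (grad_mu_i + Omega_i * grad_p)
    drift = c_i * (v_D + v_sigma)
    return diffusion + drift

J = total_flux(c_i=2.0, B_i=0.5, grad_mu_i=1.0, Omega_i=1.0e-5,
               grad_p=2.0e5, v_D=0.1, v_sigma=-0.02)
print(round(J, 6))   # -2.84
```

The stress enters the diffusional driving force only through the pressure term Ω_i p, while υ_σ carries its contribution to the common drift.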
Moreover, we limit the free energy density to the isostatic stress component.
Keeping only the diagonal terms [11,12] one gets:

    p = −(1/3) tr σ   (30)
The free energy density (pressure) gradient will induce the diffusion flux of
elements if their molar volumes differ [12]. The Nernst-Einstein equation relates the
mobility and the self-diffusion coefficient [12]:

    D_i = B_i kT   (31)
where k is the Boltzmann constant and T the absolute temperature.
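For orientation, Eq. (31) can be evaluated for a made-up mobility at a typical annealing temperature; the mobility value below is hypothetical and chosen only to give a diffusivity of a plausible order of magnitude.

```python
# Nernst-Einstein relation, Eq. (31): D_i = B_i * k * T.
# The mobility value is an illustrative assumption, not measured data.
k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 1273.0           # K, a typical interdiffusion anneal temperature
B = 5.0e7            # mobility (hypothetical)

D = B * k_B * T
print(f"{D:.3e}")    # self-diffusion coefficient, m^2/s
```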
Using Eq. (16), it can be written in terms of the drift velocity. Thus, the following
formulae form the integral balance equations for a multicomponent solution [13]:

    D/Dt |_{υ_drift} ∫_{β(t)} c dϑ + ∫_{∂β(t)} c υ ds − ∫_{∂β(t)} c υ_drift ds = 0   (32)

    D/Dt |_{υ_drift} ∫_{β(t)} c υ dϑ = ∫_{∂β(t)} σ ds + ∫_{β(t)} c f_b dϑ   (33)

    D/Dt |_{υ_drift} ∫_{β(t)} x × c υ dϑ = ∫_{∂β(t)} x × σ ds + ∫_{β(t)} x × c f_b dϑ   (34)

    D/Dt |_{υ_drift} ∫_{β(t)} ( c e + Σ_{i=1}^{r} (1/2) c_i (υ_drift + x_i^d)² ) dϑ = ∫_{∂β(t)} υ σ ds − ∫_{∂β(t)} q_T ds + ∫_{β(t)} c f_b υ dϑ + ∫_{β(t)} q_B dϑ   (35)

    D/Dt |_{υ_drift} ∫_{β(t)} c η dϑ ≥ − ∫_{∂β(t)} (q_T / T) ds + ∫_{β(t)} (q_B / T) dϑ   (36)
where e, q_T, q_B and η denote the specific internal energy, the heat flux (vector of heat
transfer), the heat source per unit mass produced by internal sources and the specific
entropy, respectively.
The integral equations allow deriving the self-consistent set of the following
differential equations [13]:

    Dc/Dt |_{υ_drift} + c div υ_drift + div(c υ^d) = 0   (37)

    div σ + c f_b − c Dυ^d/Dt |_{υ_drift} − c Dυ_drift/Dt |_{υ_drift} + υ div(c υ^d) = 0   (38)

    −c De/Dt |_{υ_drift} + e div(c υ^d) + c υ Dυ^d/Dt |_{υ_drift} − div q_T − υ υ div(c υ^d) − Σ_{i=1}^{r} c_i x_i Dx_i^d/Dt |_{υ_drift} + Σ_{i=1}^{r} (1/2) x_i x_i div(c_i x_i^d) + σ : grad υ + q_B = 0   (39)

    σ − σ^T = 0   (40)

    −c Dψ/Dt |_{υ_drift} + ψ div(c υ^d) + c υ Dυ^d/Dt |_{υ_drift} − υ υ div(c υ^d) − Σ_{i=1}^{r} c_i x_i Dx_i^d/Dt |_{υ_drift} + Σ_{i=1}^{r} (1/2) x_i x_i div(c_i x_i^d) + σ : grad υ ≥ 0   (41)

where the specific free energy is defined as ψ = e − ηT.
4 Results
A solution of the above model exists. At present we solve the problem numerically
using the finite difference method (FDM) in one dimension.
For demonstration the Cr-Fe-Ni system has been chosen. Interdiffusion modelling
in the closed Cr-Fe-Ni system has been done using the FDM and compared with
experimental results. For the calculations the following data have been used:
Figure 3 shows the calculated drift velocity of the diffusion couple shown in
Fig. 1.
The above figures illustrate the evolution of the concentration, the drift velocity and
the pressure. Comparison of the simulation data with experiment shows that the model is
valid, and that the mathematical description of interdiffusion and stress is an effective tool for
simulating such processes.
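A conservative 1-D FDM scheme of the kind used here can be sketched as follows; this is not the authors' code, and the binary simplification, the diffusivities and the step profile are all assumptions of the illustration, not the Cr-Fe-Ni data used in the paper.

```python
# 1-D finite-difference sketch of interdiffusion in a closed binary
# couple A-B with the Darken interdiffusion coefficient
# D = N_A*D_B + (1-N_A)*D_A. All parameters are hypothetical.
N, dx, dt = 50, 1.0e-6, 1.0e-4        # cells, cell size [m], time step [s]
D_A, D_B = 1.0e-13, 4.0e-13           # self-diffusion coefficients [m^2/s]
NA = [1.0] * (N // 2) + [0.0] * (N // 2)   # molar fraction of A: step couple

def step(NA):
    # fluxes through cell faces; zero flux at both ends (closed system)
    flux = [0.0] * (N + 1)
    for f in range(1, N):
        Nf = 0.5 * (NA[f - 1] + NA[f])
        D = Nf * D_B + (1.0 - Nf) * D_A   # Darken interdiffusion coefficient
        flux[f] = -D * (NA[f] - NA[f - 1]) / dx
    return [NA[j] - dt / dx * (flux[j + 1] - flux[j]) for j in range(N)]

for _ in range(1000):
    NA = step(NA)

print(round(sum(NA), 6))   # total amount of A is conserved by the flux form
```

The flux form guarantees mass conservation in the closed couple, mirroring the integral balance of Eq. (32) with zero boundary flux.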
5 Concluding Remarks
The following conclusions can be drawn:
a) The mathematical description of interdiffusion in multicomponent systems has
been formulated. For the known thermodynamic data and diffusivities, the
evolution of the concentration profiles and drift velocity can be computed.
b) Effective formulae enable us to calculate the concentration profiles and the
drift velocity as a function of time and position.
c) The model was applied to the modelling of interdiffusion in the Cr-Fe-Ni diffusion
couple. The calculated concentration profiles were consistent with
experimental results.
d) The Navier–Stokes and Navier–Lamé equations for the case of multicomponent
solutions, where the concentrations are not uniform, have been
effectively used.
Acknowledgments
This work has been supported by the MNiI No. 11.11.110.643, under Grant No. 4
T08C 03024, and under Grant No. 3 T08C 044 30, financed during the period 2006-
2008.
References
[1] Smigelskas, A.D., Kirkendall, E.: Trans. A.I.M.E. 171, 130 (1947)
[2] Darken, L.S.: Trans. AIME 174, 184 (1948)
[3] Danielewski, M.: Defect and Diffusion Forum 95–98, 125 (1993)
[4] Holly, K., Danielewski, M.: Phys. Rev. B 50, 13336 (1994)
[5] Nernst, W.: Z. Phys. Chem. 4, 129 (1889)
[6] Planck, M.: Ann. Phys. Chem. 40, 561 (1890)
[7] Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics. Addison-
Wesley, London (1964)
[8] Cottrell, A.H.: The mechanical properties of matter. John Wiley & Sons Inc., New York
(1964)
[9] Landau, L.D., Lifszyc, E.M.: Theory of elasticity, Nauka, Moscow (1987)
[10] Danielewski, M., Krzyżański, W.: The Conservation of Momentum and Energy in Open
Systems. Phys. Stat. Sol. 145, 351 (1994)
[11] Stephenson, G.B.: Acta metall. 36, 2663 (1988)
[12] Philibert, J.: Diffusion and Stress. In: Defect and Diffusion Forum, pp. 129–130 (1996)
[13] Danielewski, M., Wierzba, B.: The Unified Description of Interdiffusion in Solids and
Liquids. In: Proc. Conf. 1st International Conference on Diffusion in Solids and Liquids,
Aveiro, Portugal, p. 113 (2005)