5 Genetic Algorithms

Chapter 4 gave a concise, insightful description of traditional optimization methods that have a long history of development and application. This chapter introduces a "nontraditional" search or optimization method, known as the genetic algorithm (GA), which has become a promising search algorithm for complex engineering problems. The word "nontraditional" may not be entirely appropriate, because GAs are already widely used; it is used here simply to distinguish GAs from the methods discussed in Chapter 4.

Over the past two decades, many different versions of GAs have been developed. Combinations of GAs with traditional optimization methods, or hybrid GAs, have also been proposed by many researchers and have proved very effective for a large number of problems. This chapter describes the basic concept of the GA and then some modified GAs, with an emphasis on versions of the intergeneration projection GA (IP-GA), as well as methods that combine GAs with gradient-based methods. A large portion of this book is devoted to GAs because they are particularly useful for inverse problems, which are usually very complex in nature, for which the global optimum is always required, and for which forward solvers are often expensive. Most of the methods will be employed in solving the inverse problems presented in Chapter 7 through Chapter 13.

5.1 Introduction
Genetic algorithms are computational techniques that search for the optimum (maximum) of complex objective (fitness) functions based on a process that simulates Darwin's theory of natural evolution. In 1975, Holland established the theoretical foundation that initiated most contemporary developments of GAs. Since then, extensive research has been carried out on the theoretical investigation and engineering application of GAs. Due mainly to their applicability to problems with very complex objective functions, GAs have been successful in a wide variety of scientific fields, such as computational search, optimization, and machine learning. As effective optimization techniques, GAs have also been extremely successful in their applications to structural optimization problems, and they appear promising in dealing with complex, nonlinear, multimodal optimization problems, including inverse problems.

5.2 Basic Concept of GAs


GAs emulate the survival-of-the-fittest principle of nature to perform the search and are naturally formulated for maximization problems. They are also applicable to minimization problems, because a minimization problem can easily be converted into a maximization problem. Consider the following optimization problem to present the basic concept of the GA:

$$\begin{aligned} \text{Maximize} \quad & f(\mathbf{x}) \\ \text{Subject to} \quad & \mathbf{x}^L \le \mathbf{x} \le \mathbf{x}^U \end{aligned} \tag{5.1}$$

where x is the vector of parameters, x = {x1, x2, …, xN}^T. The superscripts L and U represent the lower and upper bounds of the parameters, respectively.

5.2.1 Coding
In a plain GA program, each parameter, xi (i = 1, 2, …, N), of a given problem should be coded into a finite-length string according to one of the coding methods, among which binary coding is the simplest and most popular. A so-called chromosome is formed as a superstring that combines all these finite-length strings and represents an individual (a candidate solution to the given problem). After the optimal individual is found, it is decoded back to the physical parameters. It should be noted that binary coding of the parameters is not absolutely necessary. As will be illustrated in Section 5.3.2, the parameters can be used directly in the so-called real parameter coded GA. Here, the popular binary coding is used to illustrate the process of GAs.
The objective function is often defined in terms of continuous parameters. GAs, however, operate on (binary-encoded) discrete parameters. Therefore, parameters in a continuous space should first be discretized and then encoded in binary form. The mathematical formulation for the binary encoding and decoding of the ith parameter can be given as (Haupt and Haupt, 1998)

Encoding:

$$\bar{x}_i = \frac{x_i - x_i^L}{x_i^U - x_i^L} \tag{5.2}$$

$$\text{gene}[m] = \text{round}\left[\bar{x}_i - 2^{-m} - \sum_{k=1}^{m-1}\text{gene}[k]\,2^{-k}\right] \tag{5.3}$$

Decoding:

$$\bar{x}_i^{qn} = \sum_{m=1}^{N_{gene}}\text{gene}[m]\,2^{-m} + 2^{-(N_{gene}+1)} \tag{5.4}$$

$$x_i^{q} = \bar{x}_i^{qn}\left(x_i^U - x_i^L\right) + x_i^L$$

where
$\bar{x}_i$: normalized ith parameter, $0.0 \le \bar{x}_i \le 1.0$
$x_i^L$: smallest value of the ith parameter
$x_i^U$: largest value of the ith parameter
gene[m]: the mth binary digit of $\bar{x}_i$
round[·]: round to the nearest integer
$N_{gene}$: number of bits in the gene
$\bar{x}_i^{qn}$: quantized version of $\bar{x}_i$
$x_i^q$: quantized version of $x_i$
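
To make the procedure concrete, the following is a minimal Python sketch of Equations 5.2 through 5.4. The function names and the 12-bit default are illustrative assumptions; the bit-by-bit remainder test used below is one standard way to realize the successive-bit extraction that Equation 5.3 expresses.

```python
def encode(x, x_lo, x_hi, n_bits=12):
    """Binary-encode a continuous parameter (Eqs. 5.2-5.3)."""
    x_norm = (x - x_lo) / (x_hi - x_lo)            # Eq. 5.2: normalize to [0, 1]
    gene, remainder = [], x_norm
    for m in range(1, n_bits + 1):
        bit = 1 if remainder >= 2.0 ** -m else 0   # mth bit of the binary expansion
        gene.append(bit)
        remainder -= bit * 2.0 ** -m
    return gene

def decode(gene, x_lo, x_hi):
    """Decode a gene back to the (quantized) parameter value (Eq. 5.4)."""
    n = len(gene)
    x_qn = sum(g * 2.0 ** -(m + 1) for m, g in enumerate(gene)) + 2.0 ** -(n + 1)
    return x_qn * (x_hi - x_lo) + x_lo

# Example: encode x = -1.0 on [-2, 2] with 12 bits, then decode it back
g = encode(-1.0, -2.0, 2.0)
print("".join(map(str, g)), decode(g, -2.0, 2.0))
```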

A plain GA program starts with a generation of chromosomes (individuals) randomly selected from the entire pool of the search space. The fitness value of each chromosome is evaluated by computing the fitness (objective) function. The following simulated genetic operators are then employed to simulate the natural evolution process, which leads to the fittest chromosome or individual, i.e., the solution or optimizer of the optimization problem.

5.2.2 Genetic Operators


Three basic genetic operators, namely selection, crossover, and mutation, are performed in that order on the chromosomes of the current generation to produce child generations that become fitter in the simulated evolution process. The details of these operators are given next.

5.2.2.1 Selection
Selection is a process in which a mating pool of individual chromosomes of the current generation is chosen, in a certain way, for reproduction of the child generation according to the fitness values of the chromosomes of the current generation. This operator is designed to improve the average quality of the population by giving individuals of higher fitness a higher probability of being copied to produce the new chromosomes of the child generation. The quality of an individual in the current generation is measured by its fitness value through the evaluation of the fitness function; therefore, the selection can focus on more promising regions in the search space. A number of selection schemes, such as proportionate selection, ranking selection, and tournament selection, are widely used in GA programs. Once a chromosome has been selected for reproduction, it enters a mating pool, a tentative new population ready for further genetic operations. The selection operation is thus an artificial emulation of natural selection in the Darwinian theory of survival.
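
As an illustration, the following is a minimal Python sketch of tournament selection, one of the schemes named above; the two-way tournament size and the function names are illustrative assumptions.

```python
import random

def tournament_select(population, fitness, k=2):
    """Pick one parent: sample k individuals at random and keep the fittest."""
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]

def build_mating_pool(population, fitness, pool_size):
    """Fill the mating pool by repeated tournaments."""
    return [tournament_select(population, fitness) for _ in range(pool_size)]
```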

5.2.2.2 Crossover
After the selection operation is completed and the mating pool is formed, the so-called crossover operator may proceed. Crossover is an operation that exchanges part of the genes in the chromosomes of two parents in the mating pool to create new individuals for the child generation; it is the most important operator in a GA. A simple crossover proceeds in two steps. First, members of the chromosomes in the mating pool are mated at random. Next, each pair of the randomly selected chromosomes undergoes a crossover using one of the following schemes to generate new chromosomes (Davis, 1991; Goldberg, 1989; Lawrence, 1987; Syswerda, 1989):

• One-point crossover scheme
• Multipoint crossover scheme
• Uniform crossover scheme

5.2.2.2.1 One-Point Crossover Scheme


A one-point crossover operator randomly selects a crossover point within a chromosome and then interchanges the two parent chromosomes at this point to produce two new offspring. For example, consider the following two parents that have been selected for crossover. The "|" symbol indicates the randomly chosen crossover point:

Parent #1: 011101|0001

crossover (5.5)

Parent #2: 100111|0101

The first part of the gene segment of the first parent is hooked up with
the second part of the gene segment of the second parent to make the first
offspring. The second offspring is built from the first part of the second
parent and the second part of the first parent:

Offspring #1: 011101|0101
Offspring #2: 100111|0001        (5.6)
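
A minimal Python sketch of the one-point scheme follows; representing chromosomes as bit strings and drawing the cut point uniformly at random are illustrative choices.

```python
import random

def one_point_crossover(parent1, parent2):
    """Cut two equal-length bit strings at a random point and swap the tails."""
    cut = random.randint(1, len(parent1) - 1)   # cut strictly inside the string
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

# Applied to the parents of Equation 5.5 (the cut point is drawn at random)
print(one_point_crossover("0111010001", "1001110101"))
```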

5.2.2.2.2 Multipoint Crossover Scheme


A multipoint crossover operator randomly selects a number of crossover points within a chromosome and then interchanges the gene segments of the two parent chromosomes between these points to produce two new offspring. In the following, a two-point crossover scheme is used to illustrate the process of the multipoint crossover operator. For example, consider two parents that have been selected for crossover:

Parent 1: 1101|010|101

crossover (5.7)

Parent 2: 0010|001|110

After interchanging the genes in the parent chromosomes between the crossover points, the following offspring are produced:

Offspring 1: 1101|001|101
Offspring 2: 0010|010|110        (5.8)

In the multipoint crossover scheme, more than one crossover point is selected in a pair of parent chromosomes. The crossover operator performs bit by bit at the gene bit level. The number of crossover points and the crossover positions, distinct from each other, in each pair of chromosomes are selected randomly.

5.2.2.2.3 Uniform Crossover Scheme


A uniform crossover operator decides which parent will contribute each of the gene bits in the offspring chromosomes with a given probability. This allows the parent chromosomes to be mixed at the gene bit level rather than at the gene segment level (as in the one-point and multipoint crossover schemes). The uniform crossover operation provides flexibility, but it also destroys building blocks in the chromosomes. For some problems, however, this additional flexibility outweighs the disadvantage of destroying building blocks. In the uniform crossover strategy, the crossover positions are predefined in a mask. This mask determines from which parent the genetic material is taken for each gene. All the chromosomes in a population are uniformly crossed over at the same positions. Note that, in the multipoint crossover strategy, each pair of chromosomes is crossed over at different points because no predefined mask is used. For example, consider the following two parents that have been selected for crossover:

Parent 1: ABCDEFGH

crossover with a mask (5.9)

Parent 2: IJKLMNOP

If the probability is 0.5, approximately half of the gene bits in the offspring
will come from parent 1 and the other half will come from parent 2. With
the mask of 1 0 1 0 1 0 1 0, the possible sets of offspring after uniform
crossover are:

Offspring 1: AJCLENGP
Offspring 2: IBKDMFOH        (5.10)

With the mask of 0 1 0 1 0 1 0 1, the possible offspring after uniform crossover are:

Offspring 1: IBKDMFOH
Offspring 2: AJCLENGP        (5.11)
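
The following Python sketch implements per-bit uniform crossover; drawing a fresh random decision per position with probability 0.5 is one common choice (a fixed mask, as in the text, would simply replace the random draw).

```python
import random

def uniform_crossover(parent1, parent2, p_swap=0.5):
    """Choose the donor parent bit by bit, mixing at the gene-bit level."""
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if random.random() < p_swap:   # swap this position between the parents
            g1, g2 = g2, g1
        child1.append(g1)
        child2.append(g2)
    return "".join(child1), "".join(child2)

print(uniform_crossover("ABCDEFGH", "IJKLMNOP"))
```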

In addition to these standard crossover operators, offspring can also be generated using other crossover operators, such as the arithmetic crossover operator and the heuristic crossover operator (Davis, 1991). Section 5.3.2 gives some crossover operators for real parameter coded GAs, and Section 9.5.1.1 investigates in detail the influence of the probability of the uniform crossover operator.

5.2.2.3 Mutation
The mutation operator is designed so that one or more of the chromosome's genes are mutated with a small probability. The goal of the mutation operator is to prevent the population from converging to a local minimum and to introduce new possible solutions into the generation. Without mutation, the population would rapidly become uniform under the combined effect of the selection and crossover operators. There are a number of mutation methods (OptiGA: http://www.optwater.com/optiga): flip bit, random, and min–max. For example, consider the following parent that has been selected for mutation. The bit at a selected point is mutated from 0 to 1:

Parent: 1101010101
mutation        (5.12)
Offspring: 1101011101
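
A minimal flip-bit mutation sketch in Python; the per-bit mutation probability of 0.02 mirrors the rate used in the worked example later in this section, and the function name is illustrative.

```python
import random

def flip_bit_mutation(chrom, p_mut=0.02):
    """Flip each bit of a binary string independently with probability p_mut."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if random.random() < p_mut else b for b in chrom)

print(flip_bit_mutation("1101010101"))
```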

The basic operators of GAs have now been briefly introduced. Contemporary developments of GAs have introduced many new techniques to improve the performance of the GA operators, as well as the way the coding is performed. For additional details, as well as the mathematical foundations, readers are referred to the references listed in Section 5.11. The next section demonstrates the application of the basic operations of the genetic algorithm via a simple optimization example.

5.2.3 A Simple Example


To demonstrate how the plain GA works, consider the minimization problem that was considered in Section 4.4.1:

$$\begin{aligned} \text{Minimize} \quad & f(x_1, x_2) = (x_1 + 1)^2 + (x_2 + 1)^2 \\ \text{Subject to} \quad & -2 \le x_1 \le 2, \;\; -2 \le x_2 \le 2 \end{aligned} \tag{5.13}$$

Obviously, the optimum solution of the problem is (–1, –1)^T, with a function value of zero.

5.2.3.1 Solution
Because GAs are often coded for maximization problems, the minimization problem specified by Equation 5.13 is first transformed into a maximization problem. A number of such transformations can be used. Here, the following fitness function is employed in the GA, following the transformation given by Deb (1998):

$$F(x_1, x_2) = \frac{1.0}{1.0 + f(x_1, x_2)} \tag{5.14}$$
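
In code, the objective of Equation 5.13 and the fitness transformation of Equation 5.14 are one-liners; the names objective and fitness below are illustrative.

```python
def objective(x1, x2):
    """Equation 5.13: minimum value 0 at (-1, -1)."""
    return (x1 + 1.0) ** 2 + (x2 + 1.0) ** 2

def fitness(x1, x2):
    """Equation 5.14: converts minimizing f into maximizing F."""
    return 1.0 / (1.0 + objective(x1, x2))

print(fitness(-1.0, -1.0))   # 1.0 at the optimum
```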

5.2.3.2 Representation (Encoding)


A binary vector is used to represent the real values of the parameters x1 and x2. The GA search space has been limited to a region of the parameter space, as listed in Table 5.1. These two parameters are discretized and translated into a chromosome of length 24 bits according to the binary coding procedure given in Section 5.2.1, with 12 bits for each parameter. In the entire search space, a total of 2^24 (≈1.678 × 10^7) possible combinations of these two parameters exists.

TABLE 5.1
GA Search Space for the Numerical Test of the Problem Defined by Equation 5.13

Parameter   Search Range    Possibilities   Number of Binary Digits
x1          –2.0 to 2.0     4096            12
x2          –2.0 to 2.0     4096            12

Note: The total number of combinations is 2^24 (≈1.678 × 10^7).

TABLE 5.2
Initial Generation of 15 Randomly Generated Chromosomes and the Corresponding Real Parameters and Fitness Values

No.  Binary Code                 x1        x2       Fitness
1    100011111101010001001000    0.2476   –0.9294  0.39039
2    001101000110010010001101   –1.1814   –0.8620  0.95061
3    001110011100110010101011   –1.0974    1.1678  0.17517
4    001110110101011010010100   –1.0730   –0.3551  0.70360
5    111000101110111111010001    1.5458    1.9551  0.06168
6    111011111100000100100111    1.7470   –1.7118  0.11046
7    110101010101110010010111    1.3338    1.1482  0.09040
8    001101010100101011011110   –1.1678    0.7175  0.25139
9    001100100011101111101101   –1.2156    0.9822  0.20098
10   010100000010111101010000   –0.7477    1.8291  0.11029
11   001000110010110101110110   –1.4510    1.3661  0.14702
12   011101101110111111111011   –0.1421    1.9961  0.09335
13   111010110101110111111111    1.6777    1.4999  0.06935
14   001010100111010010101010   –1.3368   –0.8337  0.87638
15   101111100000011010000011    0.9695   –0.3717  0.18962

5.2.3.3 Initial Generation and Evaluation Function


The GA starts from an initial generation that is usually created in a random manner. Table 5.2 shows the initial generation of 15 chromosomes (individuals) created randomly for this example. In this table, the binary coding, the real parameters, and the corresponding fitness values are explicitly listed.

5.2.3.4 Genetic Operations


Selection is first performed on the individuals of the initial generation. Several selection operators have been proposed; the following is the simplest one. All 15 individuals in the generation are evaluated and ranked in descending order of their fitness values. Only a number of the best individuals (usually about one half, i.e., seven in this case) with the highest fitness values in the generation are retained as new individuals for the next generation, and the rest are discarded based on the rule of survival of the fittest. These seven best individuals are also used to form the mating pool that produces the shortfall of eight new individuals for the next generation. Individuals from the mating pool are paired in a random fashion. Pairing chromosomes in a GA can be carried out by various methods, such as pairing from top to bottom, random pairing, rank weighting, etc. The most often used approaches are based on the selection probabilities of the individuals. The probability $p_i$ of the ith individual being selected for pairing is proportional to its fitness value and can be computed using

$$p_i = \frac{f_i}{\sum_{j=1}^{n_b} f_j} \tag{5.15}$$

where $n_b$ is the number of the best individuals (equal to seven in this example) and $f_i$ is the fitness of the ith individual. A simple algorithm can be coded to pair up eight pairs of parents from these seven individuals, based on the probability values obtained from Equation 5.15. Using the crossover operators, these parents are then mated to produce the shortfall of eight children for the next generation.
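
A minimal sketch of this fitness-proportionate pairing in Python, using the standard library's weighted sampling; treating the probabilities of Equation 5.15 as weights for random.choices is an illustrative implementation choice.

```python
import random

def pair_parents(mating_pool, fitness_values, n_pairs=8):
    """Draw parent pairs with probability proportional to fitness (Eq. 5.15)."""
    total = sum(fitness_values)
    probs = [f / total for f in fitness_values]
    # Note: a pair may occasionally repeat an individual; practical codes redraw.
    return [tuple(random.choices(mating_pool, weights=probs, k=2))
            for _ in range(n_pairs)]
```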
In the crossover operation, the crossover points must be determined first. In this example, one crossover point is randomly selected, and the gene segments in the chromosomes of the paired parent individuals are then exchanged. Assume that the following two paired individuals are selected from the mating pool:

Chromosome 1 (C1): 100011111101 010001001000
                       (x1)         (x2)
Chromosome 2 (C2): 001101000110 010010001101        (5.16)
                       (x1)         (x2)

The corresponding parameter values of these two chromosomes are

C1: $x_1^{C1} = 0.2476$,  $x_2^{C1} = -0.9294$
C2: $x_1^{C2} = -1.1814$, $x_2^{C2} = -0.8620$        (5.17)

These chromosomes are evaluated to arrive at their fitness values of

f(C1) = 0.39039
f(C2) = 0.95061        (5.18)

Assume that the crossover point was randomly selected after the 12th gene bit:

Chromosome 1 (C1): 100011111101 | 010001001000
Chromosome 2 (C2): 001101000110 | 010010001101        (5.19)

After the crossover operation, the two resulting offspring are

Offspring 1 (O1): 100011111101 | 010010001101
Offspring 2 (O2): 001101000110 | 010001001000        (5.20)

and the corresponding parameter values for these two offspring are

O1: $x_1^{O1} = 0.2476$,  $x_2^{O1} = -0.8620$
O2: $x_1^{O2} = -1.1814$, $x_2^{O2} = -0.9294$        (5.21)

These offspring are evaluated to obtain their fitness values

f(O1) = 0.38827
f(O2) = 0.963493        (5.22)

The crossover will produce a total of eight children. Together with the seven best individuals retained from the parent generation, a tentative generation of 15 individuals is now ready for the next genetic operation: mutation, a random alteration of a small percentage of the gene bits in the chromosomes. Mutation points are randomly selected for the individual chromosomes in the population pool. For example, for the mutation operator on chromosome 1 in Equation 5.16, if the mutation point is at the 20th bit, the bit at point 20 is mutated from 0 to 1:

Chromosome 3 (C3): 100011111101 010001001000
mutation        (5.23)
Offspring 3 (O3): 100011111101 010001011000

The corresponding parameter values for the mutated offspring are

O3: $x_1^{O3} = 0.2476$, $x_2^{O3} = -0.9138$        (5.24)

and this offspring evaluates to

f(O3) = 0.39003        (5.25)

After the mutation operation has been applied across all 15 individuals, a new generation of 15 is finally born, and the next cycle of evolution begins. The evolution is repeated until the best individual in the entire search space is found or the prescribed maximum number of generations is reached.
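
Putting the pieces together, the following is a compact sketch of the evolution loop just described, reusing the decode, fitness, one_point_crossover, and flip_bit_mutation sketches from earlier in this chapter; the elitist half-and-half replacement mirrors the simple selection scheme of this example, and all names are illustrative.

```python
import random

def run_ga(pop_size=15, n_keep=7, max_gen=100):
    """Plain-GA loop for the example of Equation 5.13 (24-bit chromosomes)."""
    def rand_chrom():
        return "".join(random.choice("01") for _ in range(24))

    def chrom_fitness(c):
        x1 = decode([int(b) for b in c[:12]], -2.0, 2.0)
        x2 = decode([int(b) for b in c[12:]], -2.0, 2.0)
        return fitness(x1, x2)

    pop = [rand_chrom() for _ in range(pop_size)]
    for _ in range(max_gen):
        pop.sort(key=chrom_fitness, reverse=True)
        best = pop[:n_keep]                      # keep the fittest half (elitism)
        children = []
        while len(best) + len(children) < pop_size:
            p1, p2 = random.choices(best, k=2)   # random pairing (Eq. 5.15 weighting omitted)
            child, _ = one_point_crossover(p1, p2)
            children.append(flip_bit_mutation(child))
        pop = best + children
    return max(pop, key=chrom_fitness)
```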

5.2.3.5 Results
For this numerical example, the following GA operational parameters have
been used:

• Population size is 15.
• Probability of crossover is 0.4.
• Probability of mutation is 0.02.
• Maximum number of generations is 100.

From the calculation results, it has been found that the best chromosome after 100 generations is 001111111110010000000000; the corresponding parameter values of this chromosome are x1 = –1.0017 and x2 = –0.9998, and the corresponding fitness value is 0.999993. The convergence of the fitness value against the number of generations for a GA run is plotted in Figure 5.1. It can be observed from the convergence curve that the GA converges very quickly at the beginning and very slowly at later stages; the convergence slows down significantly at the final stage of the search.

FIGURE 5.1
Convergence of a GA for the simple problem defined in Equation 5.14 (fitness f versus generation number over 100 generations). The GA converges very fast at the beginning and very slowly at the later stages.

5.2.4 Features of GAs


Genetic algorithms are stochastic global search methods and differ in fundamental concept from traditional gradient-based search techniques. One important feature of genetic algorithms is that they work on groups (generations) of points in the whole search space, while most gradient-based search techniques handle only one point at a time. For this reason, gradient-based search techniques depend heavily on the initial guess point and are more likely to be trapped at a local optimum in complex problems. Genetic algorithms work on a group of points and proceed in a more globally exploratory manner, and thus work well in many complex search problems where gradient-based search techniques fail. This feature gives GAs an edge in dealing with complicated, nonlinear, and multimodal optimization problems, including inverse problems.

Furthermore, GAs require only objective function information, while many other search techniques usually require auxiliary information in order to work properly. For example, gradient-based techniques need knowledge of the derivatives of the objective function in order to climb in the right direction toward the current (local) peak. GAs can work well for problems to which gradient-based search techniques are not applicable, such as problems whose objective function is not differentiable. This characteristic makes GAs more broadly applicable than many other search schemes for complex engineering problems.
Table 5.3 summarizes the differences between genetic algorithms and traditional gradient-based optimization and search procedures.

TABLE 5.3
Comparison between GAs and Gradient-Based Optimization and Search Procedures

Item                   GA                                  Gradient-Based Optimization
Search basis           Groups of points                    Single point
Initial guess          Not required                        Required
Function information   Objective function only             Objective function and its derivatives
Search rule            Probabilistic in nature             Deterministic laws
Convergence            Fast at the beginning, slow at      Relatively slow at the initial stage,
                       the later stage                     very fast at the later stage
Applicability          Global search for complex problems  Local search for simple problems
                       with many local optima              with a single optimum
Computing efficiency   Computationally expensive           Efficient

One major disadvantage of the GA is its high computational cost; generally, more evaluations of the objective function are required by a GA than by a traditional gradient-based search method. This drawback is critical for expensive forward solvers, but it becomes less so with faster computers or with simple objective functions that can be computed very quickly. For solving an inverse problem using GAs, finding a faster forward solver is very important for reducing the computing time, because GAs require a large number of calls to the forward solver. A small saving in a single run of the forward calculation can significantly reduce the total running time of the inverse problem.
Another major disadvantage of a GA is its deficiency for problems with many variables, because the search space grows exponentially with the number of variables. Gradient-based methods are far superior to GAs in this regard. GAs sometimes demonstrate very poor convergence performance, especially when the search has found a good individual very close to the global optimum. Because of the probabilistic nature of the GA, once a very good individual is found, finding a better one in the entire search space becomes much more difficult. In addition, their performance near the global solutions appears to be relatively imprecise when compared with conventional gradient-based optimization techniques, which use deterministic transition rules (Gen and Chen, 1997; Krishnan and Navin, 1998). The next section gives a brief review of developments in improving the GA's performance.

5.2.5 Brief Review of Improvements of GAs


To improve convergence performance and enhance searching capability, it has been recommended to combine GAs with conventional optimization techniques (Bosworth et al., 1972; Bethke, 1981; Goldberg, 1983; Angelo, 1996; Back et al., 1997; Dozier et al., 1998). GAs are good at global searching but slow at converging, while some conventional optimization techniques are good at fine-tuning but lack a global perspective, so a hybrid algorithm can be an ideal alternative. Such a hybrid algorithm can combine the global explorative power of GAs with the local exploitation behavior of conventional optimization techniques, complement their individual weak points, and thus outperform either one individually (Gen and Chen, 1997).

Various hybrid algorithms have been proposed so far (Davis, 1991; Gen and Chen, 1997; Cheng et al., 1999; Magyar et al., 2000). Basically, they can be classified into three categories (Xu et al., 2001c):

1. Inject problem-specific information into the existing genetic operators in order to reproduce offspring that possess higher fitness values. For example, Davidor (1991) defined the Lamarckian probability for mutations in order to make mutation operators more controllable. Yamada and Nakano (1992) designed a new crossover operator based on Giffler and Thompson's algorithm. Cheng et al. (1996) designed a new mutation operator based on a neighborhood search mechanism.

2. Design new heuristic-inspired operators in order to guide the genetic search more directly toward better solutions. For example, Bosworth et al. (1972) used the Fletcher–Reeves method together with the golden section search method as a new mutation operator. Grefenstette et al. (1985) developed a greedy, heuristic crossover operator, and Grefenstette (1991) introduced a Lamarckian operator. Davis (1991) and Miller et al. (1993) proposed an extra move operator and a local improvement operator. Magyar et al. (2000) proposed an adaptively fired hill-climber operator. Goldberg (1989) developed a G-bit improvement operation for binary strings, and Whitley et al. (1994) proposed a Baldwinian strategy that changes the value of the fitness function based on a hill-climbing operation.

3. Incorporate conventional optimization methods into GAs. This can be done in two typical ways. The first is to take the conventional optimization method as an add-on to the basic loop of the genetic algorithm: apply a conventional optimization method (typically a hill-climbing method) to each newly generated offspring to move it to a local optimum, and then replace the current individuals with these locally optimal solutions before putting the offspring back into the population (a minimal sketch of this step is given after this list). This approach is commonly called Lamarckian evolution, as explained by Kennedy (1993), or a memetic algorithm, as introduced by Moscato and Norman (1992) and Radcliffe and Surry (1994). The second approach is to run the GA first and then apply a conventional optimization method to obtain the final solution (Levine, 1996; Yang et al., 1995; Mohammed and Uler, 1997; Xiao and Yabe, 1998; Liu et al., 2002a; Xu and Liu, 2002d).
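
As referenced above, the following is a minimal Python sketch of the Lamarckian local-improvement step described in category 3; the greedy coordinate-wise hill climber and its step size are illustrative assumptions, not a specific published operator.

```python
def hill_climb(x, f, step=0.01, max_iter=50):
    """Greedy coordinate-wise hill climbing from x toward a local maximum of f."""
    best, f_best = list(x), f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial > f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            break
    return best

def lamarckian_step(offspring, f):
    """Replace each newly generated offspring by its locally improved version."""
    return [hill_climb(x, f) for x in offspring]
```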

Incorporating conventional optimization methods into GAs is so far the most common form of hybrid genetic algorithm in engineering practice, because such algorithms are relatively simple to implement (Gen and Chen, 1997; Levine, 1996). However, they usually carry a high computational cost, because a large number of function evaluations must be conducted in the local optimization process. Most conventional optimization methods used in hybrid algorithms are hill-climbing methods, chosen to maintain the flexibility of the algorithms (Ackley, 1987; Gorges-Schleuter, 1989; Davis, 1991; Kennedy, 1993; Whitley et al., 1994; Levine, 1996; Gong et al., 1996; Dozier et al., 1998; Magyar et al., 2000; Xu et al., 2001a). This usually results in an expensive computation in each local optimization process for realistic problems, where the number of decision variables is large and/or a single function evaluation takes considerable computation time (Xu et al., 2001c), making the implementation of hybrid algorithms difficult or even impossible in these cases.
Recently, a novel hybrid genetic algorithm was proposed by Xu et al. (2001c). This GA uses an additional operator called intergeneration projection (IP), and hence the algorithm is termed the intergeneration projection GA (IP-GA). In conventional or micro GAs (see the next section), the child generation is produced using the genes of the parent generation based on the fitness of the parent individuals. In the IP-GA, however, some of the individuals in the child generation are produced using genes of both the parent and grandparent generations. This intergeneration operator drastically improves the efficiency of the search for all the problems tested so far. The IP-µGA was later further improved by Liu's group (Xu et al., 2001c, 2002; Xu and Liu, 2002a, e; Yang et al., 2001), and the latest version of the IP-µGA is about 20 times faster than the µGA (Yang et al., 2001; Xu and Liu, 2002e).

In the next sections, several improved GAs will be introduced; the IP-GA will be emphasized, and detailed results will be provided.

5.3 Micro GAs


As mentioned before, one main disadvantage of GAs is that a relatively large number of forward evaluations is generally required. Hence, various other versions of GAs have been developed to improve performance, such as the micro GA (Krishnakumar, 1989), the messy GA (Goldberg et al., 1989), the nontraditional GA (Eshelman, 1989), etc.

The micro GA (µGA) is an extension of the "plain" GA. It is capable of avoiding premature convergence and of reaching the optimal region more effectively than the traditional GA (Krishnakumar, 1989; Carroll, 1996a). The µGA has been widely applied in engineering practice due to these advantages (Carroll, 1996b; Johnson and Abushagur, 1997; Xiao and Yabe, 1998; Abu-Lebdeh and Benekohal, 1999; Liu and Chen, 2001; Liu et al., 2002c, f; Wu et al., 2002; etc.).
Basically, the µGA uses an evolutionary strategy similar to that used in traditional GAs. Selection and crossover are still the basic genetic operations in the µGA, while mutation is usually omitted. Other operations, such as niching, elitism, etc., are also often recommended (Carroll, 1996a). Niching means that a multidimensional phenotypic sharing scheme with a triangular sharing function is implemented (Goldberg and Richardson, 1987). Elitism means that the best individual is always replicated in the next generation. These operations have been found effective in improving the convergence performance of the µGA (Carroll, 1996a; Sareni and Krahenbuhl, 1998), although they are not absolutely necessary.
The main differences between the µGA and traditional GAs lie in the population size of each generation and in the mechanism used to introduce and maintain genetic diversity (Abu-Lebdeh and Benekohal, 1999). Generally, the µGA operates on a very small population (typically 5 to 8 individuals). The small population size very often allows fast convergence to a local optimum in the encoded space within a few generations. To maintain genetic diversity in the population, the µGA uses a restart strategy rather than the conventional mutation operation. That is, once the current generation converges, a new generation of the same population size is created, consisting of the best individual from the previously converged generation together with new individuals randomly generated from the entire space. This evolutionary process is conducted sequentially until the global optimum is found (or the predesignated number of generations is reached); it is depicted schematically in Figure 5.2, where P(j) and C(j) denote the parents and children (offspring) in the jth generation, respectively.

FIGURE 5.2
Flow chart of the µGA. Starting from j = 0, the population P(0) is randomly initialized, binary encoded, decoded, and evaluated; each subsequent generation applies binary encoding, tournament selection, uniform crossover, decoding, evaluation, and elitism; when the convergence criterion is met, the algorithm restarts with the elite individual while the other members of P(j) are randomly selected. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
The key strategy of the µGA is to divide the GA search into many cycles, each of which finds a local optimum in the encoded space. To do this efficiently, the µGA uses a small population size for each "micro" generation, achieving fast convergence to a local optimum within one cycle, and then restarts the global exploration by randomly generating a relatively large number of individuals in the microgeneration of a new cycle. Elitism is always applied, from generation to generation and from cycle to cycle.
By introducing the micro technique, the µGA guarantees its robustness in a different way: whenever the microgeneration is reborn, new chromosomes are randomly generated, so new genetic information keeps flowing in. Krishnakumar's 1989 study pointed out that a µGA can avoid premature convergence and demonstrates faster convergence to the near-optimal region than does a plain GA for many multimodal problems.
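
The restart strategy is the distinctive ingredient of the µGA; the following Python fragment is a minimal sketch of it. The function name and the 24-bit chromosome length are illustrative, and the convergence test that triggers the restart is discussed in Section 5.4.1.

```python
import random

def micro_restart(pop, chrom_fitness, chrom_len=24):
    """Keep the elite individual and refill the micro-population randomly."""
    elite = max(pop, key=chrom_fitness)
    fresh = ["".join(random.choice("01") for _ in range(chrom_len))
             for _ in range(len(pop) - 1)]
    return [elite] + fresh
```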

5.3.1 Uniform µGA


The uniform crossover operator, developed by Syswerda in 1989, generally works better than the one-point and two-point crossovers. Following the reproduction process, in which pairs of chromosomes have been chosen for mating and stored in the mating pool, the uniform crossover operator proceeds. For each bit at the same position of two mated chromosomes, a random number is generated and compared with a preset crossover probability; if the random number is larger than the crossover probability, the crossover operator swaps the two bits of the mated chromosomes. If the random number is smaller, the two bits remain unchanged, and the crossover operation on this bit is finished. This operation is performed on every bit of the mated chromosomes in sequence. When the crossover operation completes, two new chromosomes have been created for the next GA operation.

A uniform µGA program combines the two improvements: the µGA strategy and the uniform crossover operator. Carroll's study (1996b) showed that the uniform µGA generally exhibits more robustness in handling an order-3 deceptive function than traditional GA methods, and it pointed out that the robustness of the uniform µGA lies in the constant infusion of new genetic information as the micropopulation restarts, as well as in the uniform crossover operator's characteristic of being unbiased with respect to position.

5.3.2 Real Parameter Coded µGA


As summarized by Man et al. (1999), binary encoding is, in general, the most classic method used by GA researchers because of its simplicity and traceability. The conventional GA operations and theory (schemata theory) are also developed on the basis of this fundamental structure. However, direct manipulation of real-valued chromosomes (Janikow and Michalewicz, 1991; Wright, 1991) has also raised considerable interest. The study by Janikow and Michalewicz (1991) indicates that the floating-point representation is faster in computation and more consistent from run to run.
A real parameter coded microgenetic algorithm (real µGA) has been constructed by Liu and Ma (2003) based on the concept of the µGA. The flow chart of the real µGA is presented in Figure 5.3. Comparing Figure 5.3 with Figure 5.2, it can be seen that the two algorithms are basically the same; both consist of many subcycles. At the beginning of every subcycle, a new generation is formed from randomly generated individuals together with the best individual of the last generation. Because of the small population size used in the µGA and the real µGA, each run converges quickly to a local optimum. After the convergence occurs, the subcycle ends and the next subcycle starts. Each subcycle typically consists of several generations. Every generation includes tournament selection, elitism, and crossover operators; a mutation operator is not present in the process.

FIGURE 5.3
Flowchart of the real µGA. Starting from j = 0, the population P(0) is randomly initialized and evaluated; each subsequent generation applies selection, crossover, evaluation, and elitism; on convergence, the subcycle restarts with the elite individual.
Although many similarities have been mentioned, some differences do exist. Two of them are:

• A different crossover operator must be constructed in the real µGA due to the different coding scheme. The crossover operator used in the µGA operates on a binary string, so it cannot be used directly in the real µGA. The types of crossover operators will be detailed in the following subsections.

• Convergence has different meanings in the µGA and the real µGA. In the µGA, convergence means that less than a certain percentage of the total bits of the other individuals in a generation differ from the best individual. In the real µGA, convergence means that all the individuals in a generation are very near to each other in physical space. In other words, convergence occurs in real physical space in the real µGA, but in bit space in the µGA. In the real µGA, the search covers a large proportion of the entire search space at the beginning of every subcycle. The search range covered reduces as the search progresses, until all the candidates in a generation are crowded into a very small area and the convergence criterion is reached. Once that happens, new randomly generated individuals flow in and the next subcycle starts. In the µGA, no clear physical interpretation of the convergence can be provided.

FIGURE 5.4
Schematic representation of different crossover operators: (a) simple crossover; (b) uniform arithmetical crossover; (c) uniform heuristic crossover; and (d) uniform extended arithmetical crossover. Each panel indicates the possible locations of the child x relative to the parents x1 and x2.

5.3.2.1 Four Crossover Operators


Four crossover operators have been introduced for the real µGA (Liu and Ma, 2003). For all the crossover operators discussed here, it is assumed that two parents generate one child. The two parents and the child can be written as:

$$\mathbf{x}_1 = \{x_{11}\;\; x_{12}\;\; \ldots\;\; x_{1n}\} \quad \text{(parent 1)} \tag{5.26}$$

$$\mathbf{x}_2 = \{x_{21}\;\; x_{22}\;\; \ldots\;\; x_{2n}\} \quad \text{(parent 2)} \tag{5.27}$$

$$\mathbf{x} = \{x_{1}\;\; x_{2}\;\; \ldots\;\; x_{n}\} \quad \text{(the child)} \tag{5.28}$$

where $x_{ij}$ stands for the jth parameter of the ith parent individual. The four crossover operators are plotted schematically in Figure 5.4(a–d). They can be expressed mathematically as follows (a code sketch implementing all four is given after this list):

• Simple crossover:

$$\mathbf{x} = \{x_{11}\;\; x_{12}\;\; \ldots\;\; x_{1i}\;\; x_{2,i+1}\;\; x_{2,i+2}\;\; \ldots\;\; x_{2n}\} \tag{5.29}$$

in which crossover occurs at the randomly selected ith position. Using this operator, the child is located at one of the corner points of the rectangle whose diagonal is $\mathbf{x}_1\mathbf{x}_2$, as shown in Figure 5.4(a). This operator has been used by Wright (1991).

• Uniform arithmetical crossover:

$$x_i = a_i x_{1i} + (1 - a_i)x_{2i}, \quad i = 1, 2, \ldots, n \tag{5.30}$$

in which $a_i \in [0, 1]$ is randomly selected. Using this operator, the child x lies in the rectangle whose diagonal is $\mathbf{x}_1\mathbf{x}_2$, as shown in Figure 5.4(b); in the arithmetical crossover operator used by Wright (1991), by contrast, the child x can only lie on the diagonal line $\mathbf{x}_1\mathbf{x}_2$ itself.

• Uniform heuristic crossover:

$$x_i = a_i(x_{2i} - x_{1i}) + x_{2i} \tag{5.31}$$

in which $a_i \in [0, 1]$ is randomly selected, and the fitness value at $\mathbf{x}_2$ is larger than that at $\mathbf{x}_1$. This operator is called the uniform heuristic crossover because it uses the fitness value of the function to determine the direction of the search, as shown in Figure 5.4(c). It differs from the heuristic crossover operator used by Wright (1991), in which the child x can only lie on the line segment extended from $\mathbf{x}_1\mathbf{x}_2$.

• Uniform extended arithmetical crossover:

$$x_i = 2x_{1i} - x_{2i} + 3a_i(x_{2i} - x_{1i}) \tag{5.32}$$

in which $a_i \in [0, 1]$ is also randomly selected. This operator is named the uniform extended arithmetical crossover here because it extends the search range of the uniform arithmetical crossover and the uniform heuristic crossover, as shown in Figure 5.4(d). This crossover has been used by Liu and Ma (2003).
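
As referenced above, the following Python sketch implements Equations 5.29 through 5.32 on parameter lists; the function names are illustrative, and for the heuristic operator the caller is assumed to pass the fitter parent as x2.

```python
import random

def simple_crossover(x1, x2):
    """Eq. 5.29: cut at a random parameter position i and join the two parts."""
    i = random.randint(1, len(x1) - 1)
    return x1[:i] + x2[i:]

def uniform_arithmetical(x1, x2):
    """Eq. 5.30: per-parameter random blend; the child lies in the x1-x2 box."""
    return [q + random.random() * (p - q) for p, q in zip(x1, x2)]

def uniform_heuristic(x1, x2):
    """Eq. 5.31: step beyond the fitter parent x2, directed away from x1."""
    return [q + random.random() * (q - p) for p, q in zip(x1, x2)]

def uniform_extended(x1, x2):
    """Eq. 5.32: sample the extended range from 2*x1 - x2 to 2*x2 - x1."""
    return [2.0 * p - q + 3.0 * random.random() * (q - p) for p, q in zip(x1, x2)]
```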

With these four crossover operators, four versions of the real µGA are constructed. In order to compare the performance of the different algorithms meaningfully, all parameters and operations are kept the same, except for the coding scheme, crossover operator, and convergence criteria. The details of the operations and parameters used here for the five algorithms, i.e., the uniform µGA and the four real µGAs, are:

• Tournament selection
• Uniform simple, uniform arithmetical, uniform heuristic, or uniform extended arithmetical crossover; for the uniform crossover in the uniform µGA, the probability of crossover is set to 0.5
• Elitism operator
• No mutation operation
• Population size of each generation set to 5
• The population convergence criterion for the real µGA is 2%, meaning that convergence occurs when all the candidates in a generation are located so near to each other in physical space that the maximum distance between two candidates is less than 2% of the search range. In the uniform µGA, the population convergence criterion is set to 5%, meaning that convergence occurs when less than 5% of the total bits of the other individuals in a generation differ from the best individual.

5.3.2.2 Test Functions


To examine the effectiveness of real-coded GAs and other GAs, the six typical benchmark functions listed in Table 5.4 are used to test the performance of the various modified GAs in searching for their optima. These functions are selected from the examples used in the programme on Advanced Genetic Algorithms in Engineering, School of Computer Science, Sophia Antipolis, France (available at http://www.essi.fr/~parisot/GA200/ga.html). They have been specially designed to have many local optima and one or more global optima. To visualize their features, the two two-dimensional functions F1 and F2 are plotted in Figure 5.5 and Figure 5.6.

5.3.2.3 Performance of the Test Functions


These six test functions are used in this section to compare the performance of the real µGAs and the uniform µGA. Five different algorithms are examined: the real µGAs with four different crossover operators, and the uniform µGA. Because the most significant operator in the real µGAs is the crossover operator, the tests are designed to examine the performance of the different crossover operators. The generation numbers required to achieve different best fitness values by the different algorithms are tabulated in Table 5.5 to Table 5.10 for the six test functions, respectively. The convergence processes of all these GAs are plotted in Figure 5.7 to Figure 5.12. From these tables and figures, the performance of each algorithm can be observed:

TABLE 5.4
Test Functions

F1: $f(x_1, x_2) = \prod_{i=1}^{2}\left[\sin(5.1\pi x_i + 0.5)\right]^6 \exp\!\left[\frac{-4(\log 2)(x_i - 0.0667)^2}{0.64}\right]$ (π = 3.14159)
    Variable bound: 0 < x_i < 1.0, i = 1, 2. Global optimum: (0.0669, 0.0669). Fitness value: 1.0 (maximum).

F2: $f(x_1, x_2) = \sum_{i=1}^{5} i\cos\big((i+1)x_1 + i\big)\,\sum_{i=1}^{5} i\cos\big((i+1)x_2 + i\big)$
    Variable bound: –10 < x_i < 10, i = 1, 2. Global optima: (4.8581, –7.0835), (–1.4251, –0.8003), (–0.8003, –1.4251). Fitness value: –186.7309 (minimum).

F3: $f(x_1, x_2) = x_1^4/4 - x_1^2/2 + x_1/10 + x_2^2/2$
    Variable bound: –10 < x_i < 10, i = 1, 2. Global optimum: (–1.0467, 0.0). Fitness value: –0.3524 (minimum).

F4: $f(x_1, x_2, x_3) = \sum_{i=1}^{3}\left[(x_1 - x_i^2)^2 + (x_i - 1)^2\right]$
    Variable bound: –5 < x_i < 5, i = 1, 2, 3. Global optimum: (1.0, 1.0, 1.0). Fitness value: 0.0 (minimum).

F5: $f(x_1, x_2, x_3) = \sum_{i=1}^{3}\left[(a x_1 - b x_i^2)^2 + (c x_i - d)^2\right]$, with 0.999 ≤ a, b, c, d ≤ 1.001 chosen randomly
    Variable bound: –5 < x_i < 5, i = 1, 2, 3. Global optimum: (1.0, 1.0, 1.0). Fitness value: 0.0 (minimum).

F6: $f(x_1, x_2, x_3, x_4) = -\sum_{i=1}^{5}\left[\sum_{j=1}^{4}\big(x_j - d(j, i)\big)^2 + c(i)\right]^{-1}$
    with d[4, 5] = (4, 4, 4, 4; 1, 1, 1, 1; 8, 8, 8, 8; 6, 6, 6, 6; 3, 7, 3, 7) and c[5] = (0.1, 0.2, 0.2, 0.4, 0.4)
    Variable bound: 0 < x_i < 10.0, i = 1, 2, 3, 4. Global optimum: (4.0, 4.0, 4.0, 4.0). Fitness value: –10.1532 (minimum).

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.
FIGURE 5.5
Test function F1, which has a number of local optima in the search space. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.6
Test function F2, which has a number of local optima in the search space. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)


TABLE 5.5
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F1
(each cell: best function value / generation number)

Real-µGA,         Real-µGA,        Real-µGA,        Real-µGA,        Binary µGA
Crossover 1       Crossover 2      Crossover 3      Crossover 4
0.456 / 50        0.5113 / 170     0.5139 / 44      0.5655 / 100     0.58124 / 5
0.8696 / 191      0.8208 / 103     0.8398 / 114     0.8471 / 138     0.8423 / 242
0.95347 / 1101    0.9785 / 977     0.9785 / 977     0.9500 / 280     0.8423 / (up to 2000)
0.9989 / 1879     0.9996 / 693     0.9996 / 1020    0.9996 / 450     0.8423 / (up to 2000)

TABLE 5.6
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F2
(each cell: best function value / generation number)

Real-µGA,         Real-µGA,        Real-µGA,        Real-µGA,        Binary µGA
Crossover 1       Crossover 2      Crossover 3      Crossover 4
–41.895 / 12      –38.82 / 12      –48.497 / 7      –33.211 / 13     –44.04 / 26
–140.68 / 57      –120.87 / 83     –147.94 / 183    –171.39 / 14     –114.4 / 52
–182.58 / 109     –182.30 / 140    –185.60 / 230    –186.51 / 33     –186.06 / 151
–186.11 / 1753    –186.47 / 157    –186.50 / 257    –186.70 / 58     –186.19 / 221

TABLE 5.7
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F3
(each cell: best function value / generation number)

Real-µGA,         Real-µGA,        Real-µGA,        Real-µGA,        Binary µGA
Crossover 1       Crossover 2      Crossover 3      Crossover 4
–0.2134 / 16      –0.1508 / 53     –0.1536 / 50     –0.1009 / 46     –0.1345 / 28
–0.3437 / 128     –0.3514 / 146    –0.3502 / 77     –0.3455 / 47     –0.3279 / 86
–0.3509 / 1698    –0.3524 / 187    –0.35236 / 112   –0.35237 / 132   –0.35238 / 225

TABLE 5.8
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F4
(each cell: best function value / generation number)

Real-µGA,               Real-µGA,       Real-µGA,       Real-µGA,        Binary µGA
Crossover 1             Crossover 2     Crossover 3     Crossover 4
2.043 / 23              0.9847 / 16     1.298 / 62      0.9928 / 136     0.4703 / 22
1.3573 / (up to 2000)   0.00133 / 96    0.00087 / 122   0.000202 / 331   0.0092245 / 1180
1.3573 / (up to 2000)   0.000 / 131     0.000 / 183     0.000 / 1464     0.0092245 / 2000

TABLE 5.9
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F5
(each cell: best function value / generation number)

Real-µGA,               Real-µGA,                   Real-µGA,         Real-µGA,         Binary µGA
Crossover 1             Crossover 2                 Crossover 3       Crossover 4
5.8747 / 16             0.9927 / 16                 0.1946 / 63       0.9965 / 136      0.47 / 22
1.3795 / (up to 2000)   0.836e-4 / 130              0.9488e-4 / 157   0.2369e-3 / 352   0.999e-2 / (up to 2000)
1.3795 / (up to 2000)   0.153e-4 / 131 (up to 2000) 1.529e-5 / 467    1.529e-5 / 452    0.999e-2 / (up to 2000)

TABLE 5.10
Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F6
(each cell: best function value / generation number)

Real-µGA,        Real-µGA,        Real-µGA,        Binary µGA
Crossover 2      Crossover 3      Crossover 4
–1.019 / 39      –2.119 / 29      –1.0557 / 67     –1.208 / 14
–5.3675 / 129    –5.0255 / 63     –5.0183 / 271    –2.6301 / (up to 1168, failed)
–10.150 / 855    –10.152 / 156    –10.151 / 1302   –2.6301 / (up to 1168, failed)

FIGURE 5.7
Convergence of the real µGAs (crossover operators 1 to 4) and the binary-coded µGA for test function F1 (best function value versus generation number).

FIGURE 5.8
Convergence of the real µGAs and the binary-coded µGA for test function F2 (best function value versus generation number).

FIGURE 5.9
Convergence of the real µGAs and the binary-coded µGA for test function F3 (best function value versus generation number).

FIGURE 5.10
Convergence of the real µGAs and the binary-coded µGA for test function F4 (best function value versus generation number).

FIGURE 5.11
Convergence of the real µGAs and the binary-coded µGA for test function F5 (best function value versus generation number).

FIGURE 5.12
Convergence of the real µGAs and the binary-coded µGA for test function F6 (best function value versus generation number).

• The real µGA with crossover operator 1 performs poorly for every test function, and the search even failed for test function F6, as shown in Figure 5.12. The poor performance of this algorithm may be due to the limitation of the simple crossover operator, whereby only the corner points of the rectangle (see Figure 5.4a) can be explored by the new generations.

• The real µGA with crossover operator 2 performs reasonably well for test functions F1, F2, F3, F4, and F6, but it performs badly for test function F5. This phenomenon can be explained by the biased nature of crossover operator 2. Using the uniform arithmetical crossover (see Figure 5.4b), points outside the rectangle have no chance to be tested in new generations, resulting in a bias against points near the boundary of the search range; points in the middle of the search range are given a higher probability of being tried.

• The real µGAs with crossover operators 3 and 4 perform reasonably well for all the functions. The algorithm with crossover operator 3 outperforms that with crossover operator 4 for test functions F3, F4, and F6. However, it underperforms crossover operator 4 for the multimodal test functions F1 and F2 (cf. Figure 5.5 and Figure 5.6), because it may be deceived at some stage in the search process. It is expected that, with an increasing deception level of the function, the performance of the algorithm with crossover operator 3 may deteriorate.

• The real µGA with crossover operator 4 performs consistently well for all the test functions. In order to compare the performance of the real µGA with crossover operator 4 against the binary uniform µGA clearly, the convergence results for the six test functions are collected in Table 5.11. It can be seen that the real µGA with crossover operator 4 consistently converges faster than the uniform µGA. By using the uniform extended arithmetical crossover operator, the search process is faster and more accurate, and it is not easily deceived.

TABLE 5.11
Convergence Comparison between the Real µGA with Crossover 4 and the Binary µGA
(each cell: convergence point / generation number)

Test Function (Maximum or Minimum)   Real µGA, Crossover 4   Binary µGA
F1 (1.000)                           0.9996 / 450            0.834 / up to 2000 (failed to find the solution)
F2 (–186.7309)                       –186.70 / 58            –186.19 / 221
F3 (–0.3524)                         –0.3524 / 132           –0.3524 / 225
F4 (0.0)                             0.000 / 1464            0.00922 / 2000
F5 (0.0)                             1.529e-5 / 452          0.96e-2 / up to 2000 (failed to find the solution)
F6 (–10.1532)                        –10.151 / 1302          –2.6301 / up to 2000 (failed to find the solution)

To summarize the above observations, the real µGA with crossover operator 4 is recommended due to its consistently good performance for all six test functions studied. These findings are very similar to those reported by Liu and Ma (2003).

5.4 Intergeneration Projection Genetic Algorithm (IP-GA)


The IP-GA was proposed by Xu et al. (2001c). In the IP-GA, the child generation is produced using information from both the parent and grandparent generations. The IP-GA was originally developed based on the µGA, to make use of its small population size per generation and so maximize efficiency; it was therefore termed the IP-µGA. The concept of the IP is, of course, applicable to all other versions of GAs. In this book, only the IP-µGA is used but, for simplicity, the abbreviation IP-GA will be used to refer to the IP-µGA. The IP-GA starts from the modified µGA.

5.4.1 Modified µGA
It is obvious that the population size and the measuring criterion for defining
the population convergence have a great influence upon the performance of
µGAs. The issue of population size was examined and the corresponding
procedure to determine the best population size was developed by Abu-
Lebdeh and Benekohal (1999). The criterion for defining the population
convergence was described by Carroll (1996a) as having less than 5% of the
genes (or bits) of the other individuals different from the best individual in
one generation; it has been successfully applied in engineering practice so
far (Xiao and Yabe, 1998; Carroll, 1996a, b).
Improvement of this criterion is still possible, however, because it takes
into account only the number of “different genes,” and not their positions
in the compared chromosome strings. In fact, if two individuals have the
same number of genes different from the best individual, but those different
genes sit at different positions in the compared chromosomes, their
Euclidean distances from the best individual may be significantly different
in the solution space (or real-value parameter space). This can be
immediately demonstrated by the following example (see Table 5.12).

TABLE 5.12
Comparison of Euclidean Distances between Two Chromosomes in µGAs
Binary String Real Value Euclidean Distance
Chromosome A 1101|1001|0001 13 | 9 | 1 ||A–C||2 = 2
Chromosome B 0101|1001|0011 5|9|3 ||B–C||2 = 8
Chromosome C 1101|1001|0011 13 | 9 | 3
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Chromosomes A, B, and C are each constructed as a 12-bit string coded from
three real-value parameters. Chromosomes A and B each have one gene
different from chromosome C. Because the different gene in chromosome A is
at a position different from that in chromosome B, the Euclidean distance
between chromosomes A and C is significantly different from that between
chromosomes B and C in the solution space.
The solution space is the real space to define whether or not the population
converges; therefore, it is insufficient for the criterion to take into account
only the number of different genes in coding space without considering the
differences of their Euclidean distances in solution space. A modified crite-
rion has been introduced to overcome this problem (Xu et al., 2001c), in
which a weight wi is introduced to take into account the position of the
different gene, i.e., the difference of Euclidean distance in solution space for
the compared individuals. The weight wi is given as:

$$ w_i = \frac{2i}{(N-1)\sum_{j=1}^{n_p} n_j (n_j + 1)} \tag{5.33} $$

where i is the position of the different gene, counted from right to left
within the substring representing the corresponding real parameter; nj is
the number of genes (or bits) in the jth substring; np is the number of
parameters to be optimized; and N is the population size. It is obvious
from Equation 5.33 that the more leftward the position of the different
gene in two substrings of the compared chromosomes, the larger the
Euclidean distance between these two chromosomes in the solution space
and, thus, the larger the weight wi. This means the different gene has
more influence on the population convergence of the µGA.
As for the two extreme cases, in which every gene of the N − 1 compared
chromosomes is identical to (complete convergence) or different from the
corresponding gene of the best individual (Xu et al., 2001c):

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} \frac{2i\,\delta_{kji}}{(N-1)\sum_{j=1}^{n_p} n_j(n_j+1)} = 0 \tag{5.34} $$

and

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} \frac{2i\,\delta_{kji}}{(N-1)\sum_{j=1}^{n_p} n_j(n_j+1)} = 1 \tag{5.35} $$

respectively, where $\delta_{kji} = 1$ if the ith gene in the jth substring of
the kth compared chromosome differs from that of the best individual, and
$\delta_{kji} = 0$ otherwise. Therefore, the criterion for defining the
population convergence of the µGA is set as:

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} \frac{2i\,\delta_{kji}}{(N-1)\sum_{j=1}^{n_p} n_j(n_j+1)} \le \gamma \tag{5.36} $$

With reference to the convergence criterion (Carroll, 1996a), it is
recommended that γ = 5 ~ 10%.
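
A compact sketch of this weighted convergence test is given below. The bit-string representation and function names are assumptions made for illustration, but the weight of each differing bit follows Equation 5.33 and the test itself follows Equation 5.36.

def population_converged(others_bits, best_bits, substring_lengths, gamma=0.05):
    # others_bits: the N - 1 chromosomes (strings of '0'/'1') other than the
    # best individual; best_bits: the best individual's chromosome;
    # substring_lengths: the number of bits n_j of each of the n_p parameters.
    n_minus_1 = len(others_bits)
    denom = n_minus_1 * sum(n * (n + 1) for n in substring_lengths)
    total = 0.0
    for chrom in others_bits:
        start = 0
        for n_j in substring_lengths:
            sub = chrom[start:start + n_j]
            best_sub = best_bits[start:start + n_j]
            for k in range(n_j):
                if sub[k] != best_sub[k]:
                    i = n_j - k                 # bit position counted from the right
                    total += 2.0 * i / denom    # weight of Equation 5.33
            start += n_j
    return total <= gamma                       # criterion of Equation 5.36

When every gene of every compared chromosome differs from the best individual, the accumulated total equals exactly 1, which reproduces the normalization of Equation 5.35.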

5.4.2 Intergeneration Projection (IP) Operator


The intergeneration projection (IP) operator aims to find a better individual
by jumping along the move direction of the best individual at two consec-
utive generations so as to improve the convergence rate. It usually requires
no additional function evaluations.
Construction of the move direction of the best individual is the key to
implementing the IP operator. Optimization methods based on the heuristic
pattern move are actually a kind of direct search method. Generally, they
are less efficient than the traditional gradient-based methods; however, they
have usually been the preferred choice in hybrid genetic algorithms. This is
due to their simplicity and also the fact that many real optimization problems
require the use of computationally expensive simulation packages to calcu-
late the values of objective functions. It is very difficult or extremely expen-
sive to compute the derivatives of objective functions in such cases. In
addition, some objective functions formulated from the real world may be
nondifferentiable or noncontinuous, making the gradient-based methods
inapplicable.
Intergeneration projection (IP) is performed using the two best individuals
in the current (parent) and the previous (grandparent) generations, denoted
by p_j^b and p_{j-1}^b, respectively. The IP operator produces two new child
individuals, c1 and c2, around p_j^b, based on the formulas (Xu et al., 2001c):

$$ c_1 = p_j^b + \alpha \left( p_j^b - p_{j-1}^b \right) \tag{5.37} $$

$$ c_2 = p_{j-1}^b + \beta \left( p_j^b - p_{j-1}^b \right) \tag{5.38} $$

where α and β are the control parameters of the IP operator; both are
recommended to be within the range from 0.3 to 0.7. The effect of the
control parameters on the evolutionary process is addressed in the follow-
ing examples.

The two newly obtained individuals, c1 and c2, are used to replace the two
worst individuals in the present offspring. Because some kind of gradient
information between the generations is used, Equation 5.37 and Equation 5.38
are expected to lead to a better individual. This feature is especially important
when searching has entered into the local region around the global optimum,
where the best individual is close to the global optimum.
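
Equation 5.37 and Equation 5.38 amount to only a few vector operations per generation. A minimal sketch, assuming each individual is stored as a list of real parameters, is:

def ip_operator(best_now, best_prev, alpha=0.5, beta=0.5):
    # best_now, best_prev: best individuals of the parent and grandparent
    # generations; the children are projected along best_now - best_prev.
    c1 = [pb + alpha * (pb - pp) for pb, pp in zip(best_now, best_prev)]  # Eq. 5.37
    c2 = [pp + beta * (pb - pp) for pb, pp in zip(best_now, best_prev)]   # Eq. 5.38
    return c1, c2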

5.4.3 Hybridization of Modified µGA with IP Operator


Based on the preceding discussion, the IP-GA can be outlined as follows (a minimal code sketch of this loop is given after the list):

1. Letting j = 0, initialize the population of individuals, P(j) = (pj1, pj2, …, pjN).
2. Evaluate the fitness values of P(j).
3. Check the termination condition. If “yes,” the process ends. Other-
wise, j = j + 1 and go to the next step.
4. Conduct the genetic operations (selection, crossover, etc.) to
   generate the initial offspring C(j) = (cj1, cj2, …, cjN).
5. Evaluate the fitness values of offspring C(j), and find the two worst
individuals.
6. Perform the IP operation using the two best individuals, p_j^b and p_{j-1}^b.
7. Generate two new individuals, c1 and c2, by conducting the interpo-
lation and extrapolation along the direction of pattern move, and
evaluate their fitness values.
8. Replace the two worst individuals in the initial C(j) with c1 and c2
   to obtain the updated offspring, Ch(j) = (cj1, cj2, …, cjN−2, c1, c2), used in
   the next round of evolution.
9. Check if population convergence occurs in offspring Ch(j). If “yes,”
implement restarting strategy. Otherwise, go back to step 3.
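
The following is a minimal code sketch of the loop just outlined. The helper callables (init_population, genetic_operations, has_converged, restart) stand in for the usual µGA machinery and are assumptions, not part of the source; the sketch reuses the ip_operator function given earlier, and fitness is the (maximized) fitness function.

def ip_ga(fitness, init_population, genetic_operations, has_converged,
          restart, max_generations, alpha=0.5, beta=0.5):
    parents = init_population()                                   # step 1
    best_prev = best_now = max(parents, key=fitness)              # step 2
    for _ in range(max_generations):                              # step 3
        children = genetic_operations(parents)                    # step 4
        children.sort(key=fitness)                                # worst first (step 5)
        c1, c2 = ip_operator(best_now, best_prev, alpha, beta)    # steps 6-7
        children[0], children[1] = c1, c2                         # replace two worst (step 8)
        if has_converged(children):                               # step 9
            children = restart(children)
        parents = children
        best_prev, best_now = best_now, max(children, key=fitness)
    return best_now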

The flowchart of the IP-GA is depicted in Figure 5.13. When compared
with the conventional µGA shown in Figure 5.2, some features of the IP-GA
can be observed as follows:

• The main difference between the IP-GA and the conventional µGA is
  the addition of a local intergeneration projection (IP) operator in the
  evolution process. Because this IP operator is a simple heuristic
  operator, the IP-GA can basically be regarded as the second kind of
  hybrid algorithm mentioned in Section 5.2.5.
FIGURE 5.13
Flow chart of the IP-GA. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

• The IP-GA is different from the conventional Lamarckian approach in
  its hybridization principle. The Lamarckian approach uses the
  incorporated local operator to move all the newly generated offspring
  C(j) to their local optima in each generation (Moscato and Norman, 1992;
  Radcliffe and Surry, 1994), which usually results in expensive
  computation. The IP-GA uses the IP operator only to find a better
  individual near the present best individual; it does not require the
  individuals c1 and c2 to be local optima. This greatly simplifies the
  local search process and reduces the computation cost of the
  hybridization process.
• The IP operator in the IP-GA affects the evolution process in a
  self-adaptive manner. At the early stage of evolution, the subspace
  Sp{c_j^b: f(c_j^b) ≥ f(c^b)} is large (see Figure 5.14), where
  c_j^b ∈ C(j), f(c_j^b) = max{f(cj1), f(cj2), …, f(cjN)}, f(·) is the
  fitness function, c^b ∈ {c1, c2}, and f(c^b) = max{f(c1), f(c2)}. This
  means the probability p(f(c_j^b) ≥ f(c^b)) is large; in other words, the
  conventional genetic operations based on the stochastic model have a
  greater chance of generating an individual c_j^b better than the c^b
  generated by the IP operator. As a result, the conventional genetic
  operators dominate at this stage. At the later stage, as the subspace
  Sp{c_j^b: f(c_j^b) ≥ f(c^b)} becomes smaller, the probability
  p(f(c_j^b) ≥ f(c^b)) also becomes correspondingly smaller and smaller.
  As a result, the best individual in one generation is mainly generated
  by the IP operator rather than by the conventional genetic operations,
  which means that the IP operator plays a more important role. This
  self-adaptive feature of the IP operator is very beneficial to the whole
  evolution process. The lesser effect of the IP operator at the early
  stage helps avoid the pitfall of the search sticking at a local optimum,
  because searching at this stage focuses on finding the promising areas,
  which is achieved mainly by the conventional genetic operations. The
  larger effect of the IP operator at the later stage greatly speeds up
  the convergence of the evolution process, because most searching at this
  stage focuses on finding a better solution in the neighborhood of the
  present best individual until the global optimum is reached.

FIGURE 5.14
Effect of the intergeneration projection operation on the evolution process (fitness value vs. individuals; the subspaces Sp1 and Sp2 illustrate the early and later stages). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
• The IP operator always shifts its starting point for the search,
  keeping it at the best individual in the present population, no matter
  how this best individual was obtained (by the conventional genetic
  operations or by the IP operator in the previous generation). This
  ensures that inserting the IP operator does not cause the evolution
  process to stick at a local optimum.
• The IP operator costs little computationally in obtaining the two new
  individuals c1 and c2, because no evaluation of the objective function
  is required in this process. The computation cost for the IP-GA to
  reproduce each new generation is hardly increased when compared with
  the conventional GAs. Thus, time saving is a remarkable advantage of
  the IP-GA when compared with other hybrid algorithms, such as those
  incorporating the hill-climbing method.
• The implementation of integrating the IP operator into the basic loop
  of GAs is simple and straightforward. It is therefore convenient to use
  this hybrid algorithm in engineering practice. In addition, because the
  IP operator can be programmed as an independent subroutine called in
  the computation process, this idea of hybridization is also easy to
  incorporate into any existing GA software package.

5.4.4 Performance Tests and Discussions


To examine the effectiveness of the IP-GA, six typical benchmarking func-
tions listed in Table 5.4 are tested to see how fast their global optima can be
obtained using the IP-GA algorithm.

5.4.4.1 Convergence Performance of the IP-GA


For each test function, 18 cases are studied in order to test the convergence
performance of the IP-GA fully. These 18 cases use the same genetic operators
but different combinations of α and β. The genetic operators are set as: a
population size of 7, tournament selection, no mutation, niching, elitism,
probability of uniform crossover of 0.5, one child, and γ = 5%. The 18 com-
binations of α and β are created by setting β = 0.5, varying α from 0.1 to 0.9
with an increment of 0.1, and setting α = 0.6, varying β from 0.1 to 0.9 with
the same increment. Table 5.13 and Table 5.14 show their convergence results
in terms of the numbers of generations, nIP-GA, that the IP-GA has taken to
reach the global optimum. For comparison, the conventional µGA with the
same genetic operators but without the IP operator incorporated is also run
for these six test functions. Their results are also shown in Table 5.13 and
Table 5.14, where nµGA is the number of generations to convergence when
using the µGA and fn is the best fitness value at generation n.

TABLE 5.13
Comparison of Numbers of Generations to Convergence Using µGA and IP-GA
for Different α with β =0.5
nIP-GAa (β =0.5, α varies from 0.1 ~ 0.9) nµGAb nIP-GA/nµGA (%)
No. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 (fn) Min Max
F1 189 325 213 209 44 151 96 103 293 >500 <8.8 <65
(0.9998)
F2 177 229 164 58 72 187 68 104 94 >500 <11.6 <45.8
(–185.83)
F3 261 353 236 278 202 80 198 286 326 493 16.2 71.6
(–0.3524)
F4 326 446 249 278 309 266 247 363 418 >1000 <24.7 <44.6
(–0.0090)
F5 337 458 235 274 265 188 279 389 437 >1000 <18.8 <45.8
(–0.0093)
F6 1759 959 746 532 436 596 873 682 1232 >3000 <14.5 <58.6
(–5.0556)
a nIP-GA = the number of generations to convergence using the IP-GA.
b nµGA = the number of generations to convergence using the µGA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

TABLE 5.14
Comparison of Numbers of Generations to Convergence Using µGA and IP-GA
for Different β with α = 0.6
nIP-GAa (α = 0.6, β varies from 0.1 ~ 0.9) nµGAb nIP-GA / nµGA (%)
No. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 (fn) Min Max
F1 302 32 97 54 151 203 55 66 87 >500 <6.4 <60.4
(0.9998)
F2 205 71 93 110 187 105 91 40 97 >500 <8 <41
(–185.83)
F3 351 367 198 234 80 135 186 259 301 493 16.2 74.4
(–0.3524)
F4 333 482 388 198 266 231 206 589 342 >1000 <19.8 <58.9
(–0.0090)
F5 392 513 401 201 188 303 257 556 312 >1000 <18.8 <55.6
(–0.0093)
F6 770 1214 768 486 596 723 512 780 1106 >3000 <17.1 <40.5
(–5.0556)
a nIP-GA = the number of generations to convergence for the IP-GA.
b nµGA = the number of generations to convergence for the µGA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

It can be seen that the IP-GA demonstrates excellent convergence
performance over the conventional µGA. It takes only 6.4 ~ 74.4% of the
number of generations required in the µGA to obtain the global optimum
for any α and β in the range of 0.1 ~ 0.9. This means that the IP-GA can
always perform better compared to the conventional µGA even with the
worst combination of α and β. If the parameters α and β are further limited
to a smaller range of 0.3 ~ 0.7, the maximal ratio nIP-GA/nµGA would decrease
from 74.4 to 56.4%.
The computation time for reproducing the same number of generations
using the IP-GA is almost equivalent to that using the conventional µGA for
each test function. For example, both take about 1 minute to complete the
evolution process of the first 500 generations for test function F1 on the
SGI/Cray workstation. This feature clearly results from the fact that no
function evaluation is required in the added IP operator.
To reveal the evolution process, Figure 5.15 to Figure 5.20 show the con-
vergence processes of test functions F1 ~ F6, respectively, using the IP-GA
(α = 0.6, β = 0.5) against the conventional µGA.

FIGURE 5.15
Comparison of convergence processes for test function F1 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). The IP-GA provides a quick convergence. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.16
Comparison of convergence processes for test function F2 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.17
Comparison of convergence processes for test function F3 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.18
Comparison of convergence processes for test function F4 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

5.4.4.2 Effect of Control Parameters α and β


It can be found from Table 5.13 and Table 5.14 that the different combinations
of α and β result in different nIP-GA in the IP-GA for each of the test functions.
The selection of α and β has a significant effect upon the evolution process
of the IP-GA. To reveal this feature more clearly, Figure 5.21 and
Figure 5.22 show the convergence processes of test function F1 when using
the 18 different combinations of α and β.
Further observation results in the following findings:

• Using any combination of α and β, the IP-GA always converges
  significantly faster than the conventional µGA when the same genetic
  operators are used.

FIGURE 5.19
Comparison of convergence processes for test function F5 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.20
Comparison of convergence processes for test function F6 (fitness value vs. number of generations; µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

• Further improvement in the convergence performance of the IP-GA
  depends on a better combination of α and β. Extreme values of α and β
  (too small or too large, such as 0.1 or 0.9) usually result in less
  improvement.
• It is difficult to specify exactly the values of α and β that produce
  the best convergence performance for all the test functions. For
  example, α = 0.6, β = 0.2 is the best choice for test function F1,
  resulting in the fastest convergence (only 32 generations required).
  However, this choice does not generate the best results for the other
  test functions. This means that the best selection of α and β is
  fitness-function-dependent.

FIGURE 5.21
Evolution processes using different α for function F1 (β = 0.5; α = 0.1 ~ 0.9). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.22
Evolution processes using different β for function F1 (α = 0.6; β = 0.1 ~ 0.9). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

Based on the preceding analysis, parameters α and β are recommended to be
within the range of 0.3 ~ 0.7. It can be found from Table 5.15 that the
means of nIP-GA obtained using α and β within this recommended range
decrease noticeably compared with those obtained using α and β within
0.1 ~ 0.9. A selection of α and β within 0.3 ~ 0.7 may not be the optimal
choice for a specific fitness function; however, it always yields a
significantly better result than the conventional µGA.

5.4.4.3 Effect of the IP Operator


It is interesting to quantitatively reveal the influence of the IP operator
on the evolution process in the IP-GA. A simple way to this end is to count
the number of best individuals, nb, generated by the IP operator in the
evolution process; the larger nb is, the stronger the influence of the IP
operator.

TABLE 5.15
Means of the Numbers of Generations
(nIP-GAa) of IP-GA to Convergence When Using
Different Ranges of α and β
α~β F1 F2 F3 F4 F5 F6
0.3 ~ 0.7 127 113 183 264 259 627
0.1 ~ 0.9 151 120 248 338 342 845
a nIP-GA = the number of generations to conver-
gence for the IP-GA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7),
601–631, 2001. With permission.

TABLE 5.16
Numbers of Best Individuals (nb) Generated by the IP Operator Using Different
α for Function F1 (β = 0.5)
α 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
nb 144 278 173 168 30 121 67 73 260
nb/nIP-GA (%) 76.2 85.5 81.2 80.4 75.0 80.1 69.8 70.9 88.7
a nIP-GA = the number of generations to convergence for the IP-GA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

TABLE 5.17
Numbers of Best Individuals (nb) Generated by the IP Operator Using
Different β for Function F1 (α = 0.6)
β 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
nb 249 18 75 35 121 157 29 40 67
nb/nIP-GA (%) 82.5 56.3 77.7 64.8 80.1 77.3 52.7 60.6 77.0
a nIP-GA = the number of generations to convergence for the IP-GA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Table 5.16 and Table 5.17 show the ratio nb/nIP-GA for test function
F1. The ratio of nb/nIP-GA ranges from about 53 ~ 89% for all 18 cases. This
means that the IP operator plays a very important role in the whole evolution
process — also true for the other test functions. Table 5.18 shows the mean
ratios of nb/nIP-GA for functions F1 through F6. All of them are over 69%.
Figure 5.23 shows the convergence processes of test function F1 when using
the IP-GA with three different sets of α and β. The “•” mark indicates the
best individuals generated by the IP operator. This mark becomes denser with
increasing generation number in the three convergence curves, meaning that
the IP operator plays a more important role in approaching the global
optimum. This self-adaptive feature of the IP operator is highly desirable.
It is this feature that ensures the IP-GA is capable of exploring the
promising areas containing the global optima at the early stage and of
converging quickly to the final solution at the later generations.

TABLE 5.18
Mean Ratios of nba/nIP-GAb for All Test Functions
No. F1 F2 F3 F4 F5 F6
nb/nIP-GA (%) 69.3 75.3 80.1 73.2 75.6 78.1
a nb = number of best individuals generated by the IP operator.
b nIP-GA = the number of generations to convergence for the IP-GA.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001.
With permission.

FIGURE 5.23
Effect of the IP operator on the evolution process of function F1 (curves for α = 0.3, β = 0.5; α = 0.5, β = 0.5; and α = 0.6, β = 0.6). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)


5.4.4.4 Comparison with Hybrid GAs Incorporated with the Hill-Climbing Method
The hybrid GAs incorporating the hill-climbing method have been widely
applied in present engineering practice, so it is worthwhile to compare
this kind of hybrid GA with the IP-GA. In this study, two typical schemes
for the hybrid GAs incorporating the hill-climbing method are used. The
first is to integrate the hill-climbing method with the µGA (denoted
GA-HC(1)); this is called the Lamarckian evolution algorithm (Kennedy,
1993) or memetic algorithm (Moscato and Norman, 1992; Radcliffe and Surry,
1994). The second is to run the µGA for a predesignated number of
generations, npre, then apply a hill-climbing method to all the obtained
individuals and finally take the best solution (denoted GA-HC(2)). The
minimal step length for variable pi in the hill-climbing method is set to
[pimax – pimin]/32768 so as to be in accordance with the computation
accuracy in the IP-GA, where the number of possible values for each
variable is 32,768. The comparison study is done for all six test functions.


TABLE 5.19
Results Obtained from Hill-Climbing Method in GA-HC(1)a for Function F2
Starting Points Ending Points
Function Function
Generation Individual Variables Value Variable Value nf-hillb
1 1 –9.6728, 5.6255 22.0841 –9.7803, 5.4827 38.2959 69
2 –7.2533, 2.8965 18.8253 –7.0837, 2.7860 38.2959 63
3 0.6772, 5.3959 30.6487 0.8219, 5.4827 54.4048 57
4 –7.1868, –9.9219 19.5499 –7.0837, –9.7803 38.2959 60
5 0.0003, –9.9169 –7.1568 –0.1956, –0.0342 3.0050 91
6 –9.9774, –9.6802 –1.3402 –10.0165, –0.780 0.0946 53
7 –9.3262, 0.3177 –13.4106 –9.7803, 0.3342 10.1556 112
2 1 –9.4775, 6.1065 12.8347 –9.2865, 6.0875 30.7807 71
2 –6.9249, 2.7525 27.1076 –7.0837, 2.7860 38.2959 65
3 –8.2366, –9.8401 14.1813 –8.2904, –9.7803 16.2861 60
4 0.8219, 5.4827 54.4048 0.8219, 5.4827 54.4048 16
5 2.8336, –9.8804 –5.1788 2.7860, –10.0366 1.0463 85
6 –9.9756, –9.5239 0.4181 –9.7803, –9.2865 9.5388 49
7 –9.0576, 0.4739 –1.2259 –8.7939, 0.3342 13.8031 74
3 1 1.1429, 6.1065 9.5043 1.3199, 6.0875 24.9369 71
2 –9.5868, 8.2409 –0.9176 –9.2865, 8.0889 9.8608 79
3 –9.4903, 5.4729 –18.4041 –9.7803, 5.4827 38.2959 81
4 –9.2816, 0.9116 11.5597 –9.2865, 0.8219 13.5513 55
5 –7.1618, –7.3705 –32.5859 –7.0837, –7.7081 186.7307 92
6 0.8219, 5.4827 54.4048 0.8219, 5.4827 54.4048 16
7 –6.5575, 0.4788 16.8553 –6.4788, 0.3342 32.7709 55
a GA-HC(1) = method integrating the hill-climbing method with the µGA.
b nf-hill = the number of function evaluations taken in the hill-climbing process.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Table 5.19 shows the results obtained from the hill-climbing method in the
GA-HC(1) for test function F2. Function F2 is selected as a representative
function for detailed discussion because it has many local optima (see
Figure 5.6). In Table 5.19, the starting points are actually the offspring
in the present generation, obtained from the present parents using the
conventional genetic operations. The ending points are the local optima
obtained using the hill-climbing method from the corresponding starting
points; they are the parents of the next generation. Table 5.19 shows that,
in the first three generations, the conventional genetic operations in the
µGA failed to discover the global optimum. The best fitness values in the
first, second, and third generations are 30.6487 (individual 3), 54.4048
(individual 4), and 54.4048 (individual 6), respectively. However, starting
from individual 5 in the offspring of generation 3, the hill-climbing method
successfully discovered the global optimum (fitness value = 186.7307). The
total number of function evaluations nf taken in this process is 1395 (the
sum of the numbers of function evaluations taken by both the hill-climbing
method and
by the genetic operators in the first three generations). For visualization
of the searching process of the hill-climbing method, Figure 5.24 and Figure 5.25
show how this method started from the initial points provided by the µGA
at the generation 1 and generation 3, respectively, to the corresponding
ending points.

FIGURE 5.24
Hill-climbing searching starting from offspring in generation 1 in the GA-HC(1) for function
F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.25
Hill-climbing searching starting from offspring in generation 3 in the GA-HC(1) for function
F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
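
The exact hill-climbing scheme used in GA-HC(1) and GA-HC(2) is not spelled out in the source, so the following coordinate-wise sketch is only illustrative; it does, however, use the minimal step length [pimax – pimin]/32768 mentioned above and counts function evaluations in the spirit of nf-hill.

def hill_climb(f, x0, lower, upper, levels=32768):
    # Maximizes f by repeatedly trying one minimal step up or down along
    # each coordinate, accepting any improvement, until no move helps.
    x = list(x0)
    fx = f(x)
    n_eval = 1
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            step = (upper[i] - lower[i]) / levels   # minimal step length
            for move in (step, -step):
                trial = list(x)
                trial[i] = min(max(trial[i] + move, lower[i]), upper[i])
                f_trial = f(trial)
                n_eval += 1
                if f_trial > fx:
                    x, fx, improved = trial, f_trial, True
    return x, fx, n_eval

A climber restricted to the minimal step is slow in practice; actual implementations usually start with larger steps and shrink them, which changes only the step schedule, not the idea.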

TABLE 5.20
Best Individuals Obtained from Hill-Climbing Method in GA-
HC(2)a for Function F2
Ending Point (Best Individual) No. of
Generation (npre)b Variable Function Value Optima nf-hillc
5 0.8219, 5.4827 54.4048 — 743
10 –6.4788, 5.4827 123.5766 — 619
20 –6.4788, 5.4827 123.5766 — 801
30 –1.4249, 5.4827 186.7307 1 1087
40 –1.4249, 5.4827 186.7307 3 1060
a GA-HC(2) = the GA is run for a predesignated number of generations (npre), then
  the hill-climbing method is applied to get the final solution.
b npre = predesignated number of generations from which the hill-climbing method starts.
c nf-hill = number of function evaluations taken in the hill-climbing process.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With
permission.

Table 5.20 shows the best individuals and their fitness values obtained by
the hill-climbing method in the GA-HC(2). The hill-climbing searching starts
from the offspring in generations 5, 10, 20, 30, and 40, respectively. It
can be seen that the hill-climbing searching starting from the offspring at
generations 5, 10, and 20 fails to obtain the global optimum (see Figure 5.26
for the first case). It is not successful until npre increases to 30: one of
the offspring successfully leads the hill-climbing searching to reach the
global optimum (Figure 5.27). With the further increase of npre, more
offspring become close to the global optimum.
FIGURE 5.26
Hill-climbing searching starting from offspring in generation 5 in the GA-HC(2) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.27
Hill-climbing searching starting from offspring in generation 30 in the GA-HC(2) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

The number of offspring that can lead
the hill-climbing searching successfully to reach the global optimum has also
correspondingly increased. For example, when npre increases to 40, there are
three offspring that have successfully led the hill-climbing searching to get
to the global optimum. However, the larger npre usually results in more
function evaluations.
Comparing the GA-HC(1) with the GA-HC(2), the total number of function
evaluations, nf, taken to reach the global optimum for test function F2 is 1395
and 1297 (npre = 30), respectively. The GA-HC(2) costs less computationally
than the GA-HC(1) — also true for the other test functions. However, it is
usually difficult to designate the number npre properly in the GA-HC(2) (Yang
et al., 1995; Xiao and Yabe, 1998; Xu et al., 2001c). Improper selection of npre
usually results in the overuse of function evaluations or failing to get the
global optimum.
Table 5.21 shows the number nf for the six test functions when using the
GA-HC(1), GA-HC(2), IP-GA, and the conventional µGA, respectively. It can
be found that the IP-GA incurs the least computation cost among all three
of these algorithms. The advantage is more obvious for test functions with
more decision variables such as functions, F3, F4, and F5. In addition, the
IP-GA does not have difficulty in choosing npre in the GA-HC(2).
Nevertheless, there is one situation in which this IP-GA is likely not to
perform particularly well: when the individual p_j^b is identical to
p_{j-1}^b at the jth generation of the evolution process, the IP operator
fails to find new individuals different from those in C(j). This decreases
the population diversity and also incurs unnecessary evaluations of
identical individuals within one generation.


TABLE 5.21
Comparison of Numbers of Function Evaluations (nf) Using
Different Algorithms for Test Functions
Algorithm F1 F2 F3 F4 F5 F6
GA-HC(1) a 1476 1395 6199 10004 10012 9615
GA-HC(2) b 1373 1297 3375 6736 6765 8004
IP-GA 1150 1024 1654 2383 2338 5650
µGA >3500 >3500 3451 >7000 >7000 >21,000
a GA-HC(1) = method integrating the hill-climbing method with
µGA.
b GA-HC(2) = GA is run at predesignated number of generations
(npre), then hill-climbing method is applied to get the final solution.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001.
With permission.

5.5 Improved IP-GA


Improvements for the previous IP-GA are made to overcome the mentioned
problem and further increase the searching efficiency of the algorithm (Xu
et al., 2002). This includes:

• The IP operator is improved by using an alternative way to construct
  the move direction of the best individual, so that it can find better
  individuals different from those in C(j) with a significantly increased
  probability. That is, the move direction of the best individual is
  constructed using either p_j^b and p_{j-1}^b, or p_j^b and p_j^s (the
  second best individual at the jth generation) when p_j^b is identical
  to p_{j-1}^b.
• Only the better of two new individuals obtained from the IP operator
is used to replace the worst individual in the current C(j) to imple-
ment the hybridization process. This is obviously beneficial to avoid-
ing the decrease of population diversity due to the insertion of two
new individuals that are close to each other, and also beneficial to
decreasing the population size.
• The mutation operation is employed in the evolution process to increase
  the population diversity. This is especially beneficial when the
  improved IP operator fails to find new individuals different from those
  in C(j), i.e., when p_j^b, p_{j-1}^b, and p_j^s are all identical. This
  new IP-GA is termed the improved IP-GA.

5.5.1 Improved IP Operator


Improvement on the IP operator is carried out such that p_j^b and
p_{j-1}^b are used only if they are not identical; otherwise, p_j^b and
p_j^s are used. This means that the better individual c is obtained by
(Xu et al., 2002):

$$ f(c) = \max\{ f(c_1),\, f(c_2) \}, \qquad c \in \{c_1, c_2\} $$

$$ c_1 = p_j^b + \alpha \left( p_j^b - p \right), \qquad c_2 = p + \beta \left( p_j^b - p \right) \tag{5.39} $$

$$ p = \begin{cases} p_{j-1}^b, & p_j^b \ne p_{j-1}^b \\ p_j^s, & p_j^b = p_{j-1}^b \end{cases} $$

where α and β are recommended to be within 0.1 ~ 0.5 and 0.3 ~ 0.7,
respectively. It is clear that α and β decide how far the newly generated
individual c is from the present best individual p_j^b. The selection of α
and β has obvious effects on the convergence process of the IP-GA; a
detailed discussion of this point is given in Section 5.4.4, and the
following examples address it further.
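
A minimal sketch of the improved operator of Equation 5.39 follows; the list-of-reals representation and the direct equality test between individuals are illustrative assumptions.

def improved_ip_operator(fitness, best_now, best_prev, second_best,
                         alpha=0.2, beta=0.5):
    # When the best individuals of two consecutive generations coincide,
    # the second best individual of the current generation is used instead
    # to construct the move direction (Equation 5.39).
    p = second_best if best_now == best_prev else best_prev
    c1 = [pb + alpha * (pb - pk) for pb, pk in zip(best_now, p)]
    c2 = [pk + beta * (pb - pk) for pb, pk in zip(best_now, p)]
    # Only the better of the two trial individuals is kept.
    return max((c1, c2), key=fitness)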

5.5.2 Implementation of the Improved IP Operator


The implementation of the improved IP operator can be carried out exactly
as described in Section 5.4.3, except that only the following steps are adopted:

• Carry out the conventional genetic operations (niching, selection,
  crossover, mutation, and elitism), which result in a new generation.
  Details can be found in Section 5.2.2. The mutation operator is
  employed in the improved IP-GA, as highlighted in Figure 5.28.
• Generate offspring C(j) = (cj1 , cj2,…,cjN), and evaluate their fitness
values. They are expected to be closer to the global optimum than
those in the P(j).
• Carry out the projection operation:
• Using Equation 5.39, construct the move direction of the best
individual.
• Generate the individuals c1, c2, and evaluate their fitness values.
• Select the better individual.

This process is depicted in Figure 5.28. Basically, the improved IP-GA takes
the same strategy in incorporating the IP operator into the basic loop of the
µGA as that used in the previous IP-GA. It thus maintains the main advan-
tages of IP-GAs:

• Very little computation effort is required in the projection operator.
• The incorporated projection operator affects the evolution process in a
  self-adaptive manner so as to ensure global searching and fast
  convergence.
• The implementation of the improved IP operator is straightforward. It
  is therefore convenient to use in engineering practice.

FIGURE 5.28
Flow chart of the improved IP-GA. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

5.5.3 Performance Test


The test functions listed in Table 5.4 have been used for the performance test
of the improved IP-GA.


5.5.3.1 Performance of the Improved IP-GA


Performance of the improved IP-GA on convergence is investigated in terms
of the number of generations (or number of function evaluations) required
to obtain the global optimum for the preceding benchmark functions. To make
the results statistically meaningful, each benchmark function is tested 40
times using the improved IP-GA with different combinations of α and β and
different initial random number seeds idum. With a different negative number
idum, Knuth's algorithm generates a different series of random numbers. The
combinations are sampled at α = 0.1, 0.3; β = 0.5, 0.618; and idum = –1000,
–5000, –10,000, –15,000, –20,000, –30,000, –35,000, –40,000, –45,000, and
–50,000, using the full factorial combination method. The genetic operators
and other operation parameters used are: probability of uniform crossover
of 0.5, probability of mutation of 0.02, tournament selection, one child,
niching, elitism, population size of 5, and γ = 5%. Table 5.22 shows the
mean number of generations (nIIP-GA) of the improved IP-GA to obtain the
global optimum for the six benchmark functions listed in Table 5.4.
In order to have a fair and meaningful comparison, the IP-GA and µGA
are also run 40 times with genetic operators and operation parameters similar
to those of the improved IP-GA (except for population size N = 7) for all the
six test functions listed in Table 5.4. nIIP-GA, nIP-GA, and nµGA are the means of
numbers of generations to obtain global optimum when using the improved
IP-GA, IP-GA, and µGA, respectively. The corresponding results are shown
in Table 5.22. It can be found that the improved IP-GA demonstrates a much
faster convergence than the conventional µGA as well as the previous IP-
GA. To show the comparison clearly, two relative ratios are defined as
RatioIP = (5 × nIIP-GA)/(7 × nIP-GA) and RatioµGA = (5 × nIIP-GA)/(7 × nµGA),
where the factors 5 and 7 are the respective population sizes. These

TABLE 5.22
Convergence Performance of Improved IP-GA and Comparison with IP-GA
and µGAa
Function No.   Global Optimum   Value   nIIP-GAb   nIP-GAc   nµGAd   RatioIP (%)e   RatioµGA (%)f
F1 (0.0669, 0.0669) 1.0 158 189 984 59.7 11.5
F2 (–1.4251, –0.8003)g –186.73 185 348 1136 38.0 11.6
F3 (–1.0467, 0.0) –0.352 175 437 983 28.6 12.7
F4 (1.0, 1.0, 1.0) 0.0 261 544 1561 34.3 11.9
F5 (1.0, 1.0, 1.0) 0.0 229 583 1648 28.1 9.9
F6 (4.0, 4.0, 4.0, 4.0) –10.153 652 1195 3271 39.0 14.2
a Obtained from 40 independent runs.
b nIIP-GA = number of generations to convergence for the improved IP-GA.
c nIP-GA = number of generations to convergence for the IP-GA.
d nµ GA = number of generations to convergence for the µGA.
e Ratio IP (%) = (5 × nIIP-GA)/(7 × nIP-GA) × 100%.
f Ratio µGA (%) = (5 × nIIP-GA)/(7 × nµGA) × 100%.
g One of global optima.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

FIGURE 5.29
Convergence processes of the improved IP-GA against the IP-GA and µGA: (a) function F3; (b) function F6. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

two relative ratios are calculated and shown in Table 5.22. From this table,
it can be found that the improved IP-GA takes only 9.9 ~ 14.2% (or 28.1 ~
59.7%) of the number of function evaluations required by the µGA (or the
IP-GA) to obtain the global optimum for these benchmark functions. Figure
5.29 shows the convergence processes of functions F3 and F6 using the
improved IP-GA against the IP-GA and µGA, from which the outstanding
convergence performance of the improved IP-GA can be seen clearly.

5.5.3.2 Effect of the Mutation Operation


Traditionally, the mutation operation is not used in the µGA (Krishnakumar,
1989; Carroll, 1996a, b). However, it is recommended in the improved IP-GA
for increasing the population diversity. To test the effect of the mutation
operation, the preceding benchmark functions are investigated using the
improved IP-GA with and without the mutation operation, respectively.
TABLE 5.23
Effect of Mutation Operator on Performance of Improved IP-
GA for the Case with α = 0.2, β = 0.5, and idum = –10,000
F1 F2 F3 F4 F5 F6
nIIP-GA(a)a 97 126 104 173 170 412
nIIP-GA(b)b 107 156 116 352 417 >1000
Ratio (%) c 90.7 80.8 89.7 49.1 40.8 <41.2
a nIIP-GA(a) = number of generations to convergence for the improved
IP-GA with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved
IP-GA without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

TABLE 5.24
Effect of Coefficient α on Performance of Improved IP-GA for
Function F1 Where β = 0.5 and idum = –10,000
α 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8
nIIP-GA(a) a 147 86 50 119 38 81 63 56
nIIP-GA(b)b 149 109 86 130 44 91 74 91
Ratio (%)c 98.7 78.9 89.3 91.5 86.4 89.0 85.1 61.5
a nIIP-GA(a) = number of generations to convergence for the improved IP-GA
with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA
without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

Table 5.23 shows that the improved IP-GA with the mutation operation finds
the global optimum faster than that without the mutation operation for all
the benchmark functions. Further investigations on the effect of the
mutation operation associated with the variation of α, β, and idum are also
performed; the results are given in Table 5.24 to Table 5.26.

5.5.3.3 Effect of the Coefficients α and β


To study the effect of α and β on the improved IP-GA, benchmark function
F1 is investigated again using the different α and β, with the same genetic
operators and other operation parameters (idum = –10,000). Two schemes of
the improved IP-GA with and without mutation operation are used. Table
5.24 and Table 5.25 show that nIIP-GA corresponding to the different α (α =
0.1 ~ 0.8) with β = 0.5 ranges from 38 to 147, while that corresponding to
the different β (β = 0.2 ~ 0.9) with α = 0.2 ranges from 70 to 486. Figure 5.30
shows the convergence processes.
Further investigation has shown that it is difficult to specify exactly the
values of α and β that give the best convergence performance for all the
benchmark functions.
TABLE 5.25
Effect of Coefficient β on Performance of Improved IP-GA
β 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
nIIP-GA(a) a 486 217 70 86 80 164 372 243
nIIP-GA(b) b 121 106 98 109 94 74 109 99
Ratio (%)c 401.6 204.7 71.4 78.9 85.1 221.6 341.3 245.5
a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with
the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA
without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

TABLE 5.26
Effect of the Random Number Seed idum on Performance of Improved IP-GA
and µGA
i dum (× 102) 1 50 100 150 200 300 350 400 450 500
nIIP-GA(a)a 245 164 97 150 196 229 60 175 67 197
nIIP-GA(b)b 279 230 107 77 186 238 185 128 395 199
nµGAc 823 2741 1229 1512 1105 240 1043 415 526 208
a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with the
mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA without the
mutation operator.
c nµ GA = number of generations to convergence for µGA.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

However, it has been found that any combination of α and β always results
in the improved IP-GA converging significantly faster than the µGA using
the same genetic operators and operation parameters. It is found from this
study that α and β should be within 0.1 ~ 0.5 and 0.3 ~ 0.7, respectively.
The recommended choice is α = 0.1 ~ 0.3 and β = 0.5, which generally yields
good results in numerical experiments.

5.5.3.4 Effect of the Random Number Seed


A total of 10 different random number seeds idum has been used to investigate
their influence on the convergence performance of the improved IP-GA. All
are selected to be negative according to the suggestion made by Carroll
(1996a, b) in a public version (1.7) of the GA program. To show the effect
of idum , Table 5.26 presents the corresponding nIIP-GA and nµGA for function F1
when the improved IP-GA and µGA use the different idum . In these investi-
gations, the genetic operators and the operation parameters remain the same
(α = 0.2, β = 0.5). From Table 5.26, it can be found that the improved IP-GA
is not as sensitive as the µGA to idum . This feature makes the improved IP-
GA more robust to use in practice.

FIGURE 5.30
Effects of coefficients α and β on the convergence processes of the improved IP-GA for test function F1: (a) effect of α with β = 0.5 and idum = –10,000; (b) effect of β with α = 0.2 and idum = –10,000. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

5.6 IP-GA with Three Parameters (IP3-GA)


In the IP-GA, as well as the improved IP-GA, two new individuals are
generated in each generation using forward and internal interpolations
based on the two best individuals in the neighboring generations. Compared
to the µGA, the IP-GA has shown great success in time saving to search for
the global optimum; it can always perform better than the µGA. The only
drawback for this method is that the searching performance depends greatly
on the parameters for interpolations, and the improvement may not be
significant for some discrete or singular functions. In order to overcome this
shortcoming, a further improvement has been implemented in the IP-GA
using three parameters (Yang et al., 2002a). For convenience of description,
this improved IP-GA is termed IP3-GA for the use of three parameters.


5.6.1 Three-Parameter IP Operator


The main idea of this further modification is that the best individuals in
the adjacent generations are selected as two basic individuals; three new
individuals near the two original ones are then generated through forward,
internal, and backward interpolations. The distance between the new
individuals and the original ones can be adjusted by changing the
corresponding interpolation parameter. These newly produced individuals are
evaluated, and the best one is inherited by the next generation. Because
only three additional new individuals are introduced in each generation,
the extra evaluation work is small, but the speed of searching for the true
solution is significantly improved.
In order to find out whether better individuals exist near p_j^b and
p_{j-1}^b, three new individuals, c1, c2, and c3, are generated through the
forward, internal, and backward interpolations, respectively, which can be
expressed by the following equations (Yang et al., 2001):

$$ c_1 = p_j^b + \alpha \left( p_j^b - p_{j-1}^b \right) \tag{5.40} $$

$$ c_2 = p_{j-1}^b + \beta \left( p_j^b - p_{j-1}^b \right) \tag{5.41} $$

$$ c_3 = p_j^b - \gamma \left( p_j^b - p_{j-1}^b \right) \tag{5.42} $$

where α, β, and γ are three non-negative decimal parameters whose values
can be changed to adjust the distances of these new individuals from the
original individuals p_j^b and p_{j-1}^b. To achieve stable convergence,
the three parameters generally lie in the ranges 0 ≤ α ≤ 1.0, 0 ≤ β ≤ 1.0,
and 0 ≤ γ ≤ 1.0.
The procedure of this IP3-GA is shown in Figure 5.31.
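
The three projections of Equation 5.40 to Equation 5.42 can be sketched as follows, again assuming each individual is stored as a list of real parameters; the best of c1, c2, and c3 would then be passed on to the next generation.

def ip3_operator(best_now, best_prev, alpha=0.2, beta=0.5, gamma=0.2):
    d = [pb - pp for pb, pp in zip(best_now, best_prev)]       # move direction
    c1 = [pb + alpha * di for pb, di in zip(best_now, d)]      # forward (Eq. 5.40)
    c2 = [pp + beta * di for pp, di in zip(best_prev, d)]      # internal (Eq. 5.41)
    c3 = [pb - gamma * di for pb, di in zip(best_now, d)]      # backward (Eq. 5.42)
    return c1, c2, c3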

5.6.2 Performance Comparison


Table 5.27 gives results for the testing functions listed in Table 5.4. The
performance comparison between the IP3-GA and the IP-GA is presented, and
the ratio of generations required to reach the desired fitness for the
IP3-GA over the IP-GA is also listed in this table. This table is not
exactly consistent with the preceding studies for the µGA, IP-GA, and
improved IP-GA; this is due to the difference in the parameters used in
these GAs, and the number of binary digits used for parameter
discretization is also different.
For all the testing functions, the results show that the IP3-GA has per-
formed much better in terms of accuracy of results or convergence speed
compared to the µGA. The parameters are not yet optimized (α = 0.2, β =
0.5, and γ = 0.2 are arbitrarily used). The more variables in each
individual, the greater the improvement. The results also show that the IP3-
GA performed a little better than the IP-GA for some cases (F1, F2, F4, F5).

FIGURE 5.31
Flowchart of the IP3-GA.

For other cases (such as F3, F6), the IP3-GA performed substantially better
than the IP-GA.

5.7 GAs with Search Space Reduction (SR-GA)


Several modified GAs have been presented. All these improvements are
aimed at speeding up the local search with the help of local operators or
getting out of the stagnation using IP operators. As mentioned in the last
paragraph of Section 5.2.4, the stagnation in the latest stage of the GA search
process is due to the fact that when a very good individual is found, it is
very difficult to find a better individual from the entire original searching
space. The main reason is the significant reduction of the space that contains
the better individuals compared with the entire original space that is
unchanged. Therefore, the best approach to solve this problem is to shrink
the searching space while the GA is advancing, so as to increase the chance
of getting better individuals.


TABLE 5.27
Performance of IP3-GA and Comparison with Other Methods

                                             µGA                 IP-GA                IP3-GA              RatioIPi/RatioIP3j
No.  (x1, …, xn)opta        foptb     nµGAc   fµGAd       nIP-GAe  fIP-GAf     nIP3-GAg  fIP3-GAh    (%)
F1   (0.0669, 0.0669)      1.0       984     1.0         189      1.0         221       1.0         19.2/22.5
F2   (–1.4251, –0.8003)    –186.73   1136    –186.73     348      –186.73     277       –186.73     30.6/24.4
F3   (–1.0467, 0.0)        –0.3524   983     –0.3524     437      –0.3524     137       –0.3524     44.5/13.9
F4   (1.0, 1.0, 1.0)       0.0       1561    –2.235E–8   544      –2.235E–8   331       –2.235E–8   34.8/21.2
F5   (1.0, 1.0, 1.0)       0.0       1648    –0.917E–7   583      1.254E–5    503       1.254E–5    35.4/30.5
F6   (4.0, 4.0, 4.0, 4.0)  –10.153   3271    –10.1532    1195     –10.1532    330       –10.1532    36.5/10.1
a (x1, …, xn)opt = optimal point of the test functions.
b fopt = function value at the optimal point.
c nµGA = number of generations to convergence for the µGA.
d fµGA = function value at the convergence point of the µGA.
e nIP-GA = number of generations to convergence for the IP-GA.
f fIP-GA = function value at the convergence point of the IP-GA.
g nIP3-GA = number of generations to convergence for the IP3-GA.
h fIP3-GA = function value at the convergence point of the IP3-GA.
i RatioIP = nIP-GA/nµGA × 100%.
j RatioIP3 = nIP3-GA/nµGA × 100%.
A technique has been proposed by Liu et al. (2002h) to narrow the search
domain after generations of GA runs. It is termed space-reduction GA (SR-
GA) here and works as follows.
After a number of generations, the maximum and minimum values,
PMAXj and PMINj, of each parameter can be found from the M best indi-
viduals up to this stage, where j refers to the jth parameter to be identified.
A new reduced search space is defined as follows:

$$ PMAX_j^{new} = PMAX_j + \alpha \left( PMAX_j^{old} - PMIN_j^{old} \right) \tag{5.43} $$

$$ PMIN_j^{new} = PMIN_j - \alpha \left( PMAX_j^{old} - PMIN_j^{old} \right) \tag{5.44} $$

where α is a predefined factor, and PMAX_j^old and PMIN_j^old are the
maximum and minimum values of the jth parameter in the previous search
domain. This procedure is depicted in Figure 5.32.
In the SR-GA, a sufficient number of generations is first carried out to
ensure that the recorded M best individuals cover the space that contains
the global optimum of the objective function. The parameter M should be
chosen to avoid trapping at a local optimal point when the objective error
function is not unimodal or not continuous. The parameter α is used to
ensure that the local best individual is not excluded from the new GA
search process. The combination of M and α ensures that the GA can find
the best individual for complicated objective functions, even when the
searching space is reduced.

FIGURE 5.32
Schematic drawing of the search space reduction in the SR-GA (fitness value vs. individuals; M = 3). (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002h. With permission.)
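
A minimal sketch of the space-reduction step of Equation 5.43 and Equation 5.44 is given below. The default α = 0.1 is an assumed value, since the source describes α only as a predefined factor.

def reduce_search_space(best_individuals, old_lower, old_upper, alpha=0.1):
    # best_individuals: the M best individuals recorded so far, each a list
    # of parameter values.  The new bounds bracket the spread of these M
    # individuals, widened on each side by alpha times the width of the
    # previous search domain (Equations 5.43 and 5.44).
    new_lower, new_upper = [], []
    for j in range(len(old_lower)):
        values = [ind[j] for ind in best_individuals]
        margin = alpha * (old_upper[j] - old_lower[j])
        new_upper.append(max(values) + margin)
        new_lower.append(min(values) - margin)
    return new_lower, new_upper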


The SR-GA has been applied to engineering prediction problems and has
proved to be very efficient compared to the plain GA (see Section 13.1.2).
The authors strongly believe that this idea of search space reduction is one
of the most effective ways to solve the convergence stagnation problem in
GAs. Therefore, much more effort should be made in this direction to further
improve GAs by developing more efficient ways to reduce the search space as
the individuals approach the global optimum.

5.8 GA Combined with the Gradient-Based Method


As discussed in Chapter 4, gradient-based optimization methods have a high
probability of converging to a local optimum, depending on the given initial
guess. The advantage of gradient-based optimization is that it converges
very fast to the local optimum, especially when the initial guess is close to
the optimum. However, the search for suitable initial points for a locally
converged optimization method often proves to be difficult. On the other
hand, GAs hold complementary promises in searching for the global opti-
mum in comparison with traditional optimization methods. The other
advantages of GAs are the capability to escape from the local optima and
no need for initial guesses. GAs are, however, computationally expensive;
their converging performance slows down significantly at the later stage of
searching. This can often be observed from the convergence curve of a GA,
where it converges very fast at the beginning and very slowly at the later
stage, as shown in Figure 5.1. Thus, it is natural to combine a GA with a
traditional optimization method so as to provide ideal performance for the
optimization procedure, which is often vital in nonlinear optimization
problems. In this way, not only can the global optimum be ensured, but
results can also be obtained at a reasonably fast speed.

5.8.1 Combined Algorithm


As reviewed in Section 5.2.5, several kinds of combined algorithms have
been proposed. One of them has been used by Liu et al. (2002a) for deter-
mining the material properties of composites. This combined optimization
method couples the µGA with the modified Levenberg–Marquardt method,
which is efficient for solving nonlinear least squares problems. The
subroutine BCLSF of IMSL is employed directly in the combined method; it
implements the modified Levenberg–Marquardt method, with the Jacobian
obtained by the finite-difference method.
This combined algorithm performs in three steps:

1. The µGA is used to determine the initial points. The main purpose is
   to select a set of better solutions close to the optima; the selection
   criterion limits the function value to below a required value.


2. Each of these solutions is used as the initial point in searching
   for an individual local optimum using the BCLSF (i.e., the
   gradient-based method).
3. All solutions from the BCLSF search are considered local optima of
   the function. The global optimum is found from these solutions simply
   by comparing their corresponding objective function values (a minimal
   code sketch follows this list).
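
To make these steps concrete, the following is a minimal sketch of the combined method, not the authors' code. SciPy's least_squares routine stands in here for the IMSL BCLSF subroutine used in the book; ga_candidates is assumed to be a list of (point, function value) pairs collected from the first few µGA generations, and residuals maps a parameter vector to the residual array (an example for Equation 5.48 appears in the next subsection).

    import numpy as np
    from scipy.optimize import least_squares

    def combined_search(residuals, ga_candidates, bounds, threshold):
        # Step 1: keep GA individuals whose function value is below the threshold.
        seeds = [x for x, f in ga_candidates if f < threshold]
        # Step 2: refine each seed with the gradient-based local solver.
        results = [least_squares(residuals, np.asarray(x0, dtype=float),
                                 bounds=bounds)
                   for x0 in seeds]
        # Step 3: pick the local solution with the lowest cost as the global
        # optimum (SciPy's cost is half the sum of squared residuals, so the
        # ranking of the local optima is unchanged).
        return min(results, key=lambda r: r.cost)

For the test problem of the next subsection, bounds would be ([-6.0, -6.0], [6.0, 6.0]).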

5.8.2 Numerical Example


In order to demonstrate this combined method clearly, consider the
Himmelblau function as the benchmark test problem. The Himmelblau function
can be written in nonlinear least squares form:

Minimize   F(x1, x2) = Σ_{i=1}^{2} [fi(x1, x2)]²        (5.45)
Subject to −6 ≤ x1, x2 ≤ 6

where

f1(x1, x2) = x1² + x2 − 11;    f2(x1, x2) = x1 + x2² − 7        (5.46)

Note that the preceding function has four minimum points that can be
obtained by solving the following equations:

x1² + x2 − 11 = 0;    x1 + x2² − 7 = 0        (5.47)

The solutions to these equations are (3.0, 2.0)T, (–2.805, 3.131)T, (–3.779,
–3.283)T, and (3.584, –1.848)T. This study is concerned with designing a
function that has only one global minimum point. Two terms are therefore
added to the Himmelblau function to form the following nonlinear least
squares problem:

Minimize   g(x1, x2) = Σ_{i=1}^{4} [fi(x1, x2)]²        (5.48)
Subject to −6 ≤ x1, x2 ≤ 6

where

f3(x1, x2) = 0.316(x1 − 3);    f4(x1, x2) = 0.316(x2 − 2)        (5.49)


The additional terms do not alter the location of the optimum or the
function value at the global optimal point (3.0, 2.0)T; they alter the
locations and function values of the other three minimum points, thereby
making them local minimum points. Therefore, the global minimum of Equation
5.48 is still at (3.0, 2.0)T and has a function value of zero, while the
other three local minima have higher function values of 3.498, 7.386, and
1.515, respectively. This problem has been studied by Deb (1998), who found
that, on average, only one out of four runs of the steepest descent
algorithm solves the preceding problem to the global optimum, and that a
successful run takes 215 function evaluations on average to converge. This
finding is typical of many traditional gradient-based optimization
algorithms: if they do not begin with a sufficiently good point, they may
converge to a wrong solution at a local minimum. In contrast, the GA can
find the global minimum of the function most of the time; the average number
of function evaluations required to achieve the global minimum is 520.
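
As a quick, illustrative check of Equations 5.48 and 5.49 (not code from the book), the modified function can be evaluated directly:

    import numpy as np

    def residuals(x):
        # The four residual terms of Equations 5.46 and 5.49.
        x1, x2 = x
        return np.array([x1**2 + x2 - 11.0,    # f1
                         x1 + x2**2 - 7.0,     # f2
                         0.316 * (x1 - 3.0),   # f3
                         0.316 * (x2 - 2.0)])  # f4

    def g(x):
        # Objective of Equation 5.48: sum of squared residuals.
        return float(np.sum(residuals(x) ** 2))

    print(g([3.0, 2.0]))   # 0.0 at the global minimum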
As a comparison, a uniform µGA with binary parameter coding, tournament
selection, uniform crossover, and elitism is adopted to solve the problem.
The population size of each generation is set to 5 and the probability of
uniform crossover is set to 0.6. The population convergence criterion is
5%; i.e., convergence occurs when less than 5% of the total bits of the
other individuals in a generation differ from those of the best individual.
A new population, in which the best individual of the last generation is
replicated, is then randomly generated and the evolution process restarts.
Knuth's subtractive method is used to generate random numbers.
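
The restart test can be sketched as follows. This is one reading of the 5% criterion, assuming chromosomes are stored as equal-length bit strings and that best is one of the population members; the helper name is chosen here, not taken from the book.

    def micro_ga_converged(population, best, threshold=0.05):
        # Convergence occurs when fewer than `threshold` of the total bits
        # of the other individuals in the generation differ from the bits
        # of the best individual.
        others = [ind for ind in population if ind is not best]
        total_bits = sum(len(ind) for ind in others)
        differing = sum(b1 != b2
                        for ind in others
                        for b1, b2 in zip(ind, best))
        return differing / total_bits < threshold

When the test is satisfied, the best individual is replicated into a freshly randomized micropopulation and the evolution restarts.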
The search space defined for this numerical test is listed in Table 5.28.
The two parameters are coded into chromosomes; in the whole search space, a
total of 2^14 (16,384) possible combinations of these two parameters exists.
It has been found that the uniform µGA can find the global minimum of the
function, and the average number of function evaluations required to achieve
the global minimum is 480, i.e., fewer than with the plain GA (520). The
error function value against the number of generations for one GA run is
plotted in Figure 5.33. From this figure, it can be seen that the GA reaches
“better” points fast, but its convergence to the “best” is very slow.

TABLE 5.28
Uniform µGA Search Space for Numerical Test Defined by Equation 5.48

Parameter    Search Range    Possibilities    No. of Binary Digits
x1           [–6.0, 6.0]     128              7
x2           [–6.0, 6.0]     128              7

Note: Total population in the entire search space is 2^14 (16,384).
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191,
1909–1921, 2002. With permission.
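
The 7-bit coding of Table 5.28 implies a decoding step along the following lines; the linear mapping over 2^7 levels is a common convention assumed here, not necessarily the exact mapping used in the book.

    def decode(bits, lo=-6.0, hi=6.0):
        # Map a 7-bit substring of the chromosome to a value in [lo, hi];
        # 7 bits give 2**7 = 128 possibilities per parameter, 2**14 for both.
        level = int(bits, 2)
        return lo + (hi - lo) * level / (2 ** len(bits) - 1)

    x1 = decode("1100000")   # one of the 128 admissible values of x1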


[Plot: function value (0 to 35) versus number of generations (0 to 200) for one µGA run.]

FIGURE 5.33
Convergence of a µGA for the numerical test problem defined by Equation 5.48. (From Liu,
G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.)

In this example, the µGA reaches the point (3.176, 1.906)T at 51 generations,
but it does not converge to the global minimum point (3.0, 2.0)T until 96
generations.
Now, the combined optimization method is used to solve the same problem.
In the first step, the uniform µGA is used to isolate the best zones in the
parameter space; in other words, the uniform µGA is employed as a tool to
determine the initial estimates of the parameters. Four sets of parameters
are selected based on the results generated from the first five generations,
with the selection criterion that the function value be below 200. The
values of these selected sets and their corresponding function values are
listed in Table 5.29. When the features of all the parents from the five
generations of the uniform µGA are studied, only four sign patterns —
(+, +), (–, +), (+, –), and (–, –) — are found among these parents; these
are selected as the better zones in the parameter space. In the second step,
these four sets of parameters are taken as the four initial points, and the
BCLSF is applied four times, each time starting from a different initial
point. The results from the BCLSF are shown in Table 5.30.
All these solutions can be considered local minima of the function; the
global minimum is found from them simply by comparing their corresponding
function values. The global solution is easily found to be (3.0, 2.0)T, with
a function value equal to zero (set 1 in Table 5.30). The number of function
evaluations required for the BCLSF to converge is very small (about six per
BCLSF run) because the initial points are already very close to these minima.
For the present method, 25 function evaluations are performed in the GA runs
and 26 in the BCLSF, so 51 function evaluations in total are required in the
combined method.


TABLE 5.29
Selected Sets of Better Solutions Close to Optima from Uniform µGA
for the Numerical Test Defined by Equation 5.48a

Set Number    Point (x1, x2)       Function Value
1             (3.929, 0.635)       33.45
2             (–4.447, –4.635)     128.73
3             (–4.165, 4.259)      167.90
4             (2.329, –2.706)      78.36

a Results obtained at the fifth generation.
Note: Total number of function evaluations in the GA search stage
is 5 × 5 = 25.
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191,
1909–1921, 2002. With permission.

TABLE 5.30
Results from Gradient-Based Method (BCLSF) for Numerical Test
Defined by Equation 5.48

Set Number    Initial Point       Corresponding Solution    Function Evaluations    Function Value at Solution Point
1             (3.929, 0.635)      (3.0, 2.0)                8                       0.000
2             (–4.447, –4.635)    (–3.763, –3.266)          6                       7.367
3             (–4.165, 4.259)     (–2.787, 3.128)           6                       3.487
4             (2.329, –2.706)     (3.581, –1.821)           6                       1.504

Note: Total number of function evaluations in the gradient-based method
search stage is 26.
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002.
With permission.

This is significantly fewer than the 215 function evaluations of the steepest
descent method (even for the successful runs), the 480 function evaluations
of the uniform µGA, and the 520 function evaluations of the plain GA. This
numerical test demonstrates the high efficiency of the combined method.

5.9 Other Minor Tricks in Implementation of GAs


The implementation of a GA in an inverse procedure is schematically
outlined in Figure 5.34. In the applications presented in Chapter 7 through
Chapter 13, the µGAs and IP-GAs play a very important role in solving
a wide range of inverse problems. Special implementation techniques
will be addressed separately for each of the practical applications.


[Flowchart: trial parameters drawn from the search range are passed to the
forward solver; the computed results are compared with the measurement data
(or simulated measurements generated by adding random noise to
computer-generated results) to form the error (fitness) function, which is
the input to the GA; the GA generates new trial parameters until the
stopping criterion is met and then outputs the identified parameters.]

FIGURE 5.34
Flowchart of the computational inverse technique using the GA for the inverse analysis. (From
Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.)

The following addresses some common minor tricks useful in achieving better
efficiency.
The first minor improvement of the µGA is to record the best individual of
the current generation and reuse it directly in the next generation. For a
population size of 5, in each generation (except the initial generation, in
which all five individuals must be evaluated), only four individuals need to
be newly evaluated with the forward solver. This reduces the forward
computation by one fifth; a minimal sketch is given below. The technique has
been implemented in Chapter 8 and Chapter 9 for the material property
identification of composites, and the improvement is obviously worthwhile
for cases in which the forward computation is computationally expensive.
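
As an illustration (helper names are hypothetical), the bookkeeping can look like this:

    def evaluate_generation(new_individuals, forward_solver, elite):
        # elite: (chromosome, fitness) recorded from the previous generation.
        # Its fitness is reused, so only the four newly generated individuals
        # of a five-member population need a costly forward-solver run.
        scored = [(ind, forward_solver(ind)) for ind in new_individuals]
        scored.append(elite)
        return max(scored, key=lambda pair: pair[1])   # becomes the next elite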
Another improvement of the convergence rate of the GA is the two-stage
searching method, which consists of a global search at the first stage and a
local search at the second stage. The local search is performed by reducing
the search space after the global search has located the likely region of
the optima; a sketch of this composition is given below. This method has
been employed in Chapter 12 and Chapter 13.
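
Reusing the reduce_search_space sketch given earlier, the two-stage method might be composed as follows; run_ga is a hypothetical GA driver returning the best individuals (best first) found within the given bounds.

    def two_stage_search(run_ga, lower, upper, alpha, m_best):
        # Stage 1: global search over the full space.
        best = run_ga(lower, upper)            # (M, N) array of top individuals
        # Shrink the domain around the m_best top individuals (Eqs. 5.43-5.44).
        new_lower, new_upper = reduce_search_space(best[:m_best],
                                                   lower, upper, alpha)
        # Stage 2: local search in the reduced space.
        return run_ga(new_lower, new_upper)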
For the application of the combined optimization method, Chapter 12
provides the detailed implementation of the combined technique, as well as
the technical issue of switching from the GA to the gradient-based
optimization algorithm for inversely detecting cracks in composite structures.


5.10 Remarks

• Remark 5.1 — genetic algorithms are stochastic global search methods
  and differ in their fundamental concept from traditional gradient-based
  search techniques. For complicated optimization problems in which the
  derivatives of the objective functions are difficult or impossible to
  obtain, the GA can still work well. This characteristic makes GAs more
  generally applicable than many other search schemes.
• Remark 5.2 — the micro GA (µGA) is a variation of traditional GAs that
  is able to avoid premature convergence and, for many problems, performs
  better in reaching the optimal region than traditional GAs. Through the
  microtechnique, the µGA guarantees its robustness in a different way:
  whenever the micropopulation is reborn, new chromosomes are randomly
  generated to ensure that new genetic information keeps flowing throughout
  the search process.
• Remark 5.3 — the IP-GA uses an additional operator called intergeneration
  projection (IP). In the IP-GA, the child generation is produced using
  genes of both the parent and grandparent generations to achieve much
  better convergence. The concept of intergeneration projection is
  applicable to all other versions of GAs.
• Remark 5.4 — a method combining the GA with a gradient-based optimization
  algorithm has been suggested as an effective optimization method. In this
  method, the genetic algorithm is first used to select a set of better
  solutions that are close to the optima; then the gradient-based
  optimization algorithm is applied using these better solutions as the
  initial guesses. Finally, the optimum is determined from the solutions of
  the gradient-based optimization algorithm by comparing their corresponding
  fitness values. This method takes advantage of the global operation of the
  GA and the fast convergence of the gradient-based optimization algorithm.

5.11 Some References for Genetic Algorithms


Ackley, D., A Connectionist Machine for Genetic Hillclimbing, Kluwer Academic
Publishers, Boston, 1987.
Bethke, A.D., Genetic algorithms as function optimizers, University of Michigan,
Diss. Abstr. Int., 41(9), 3503B, 1981.
Coley, D.A., An Introduction to Genetic Algorithms for Scientists and Engineers,
World Scientific, Singapore, 1999.
Davis, L., Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
Davis, L. (Ed.), Genetic Algorithms and Simulated Annealing, Morgan Kaufmann
Publishers, London, 1987.
Gen, M. and Cheng, R., Genetic Algorithms and Engineering Design, John Wiley &
Sons, New York, 1997.
Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning,
Addison-Wesley, Reading, MA, 1989.
Haupt, R.L. and Haupt, S.E., Practical Genetic Algorithms, John Wiley & Sons,
New York, 1998.
Holland, J.H., Adaptation in Natural and Artificial Systems, University of Michigan
Press, Ann Arbor, 1975.
Krishnakumar, K., Micro-genetic algorithms for stationary and non-stationary
function optimization, Proc. SPIE Intelligent Control and Adaptive Systems,
Vol. 1196, Philadelphia, PA, 1989, 289.
Levine, D., Users Guide to the PGAPack Parallel Genetic Algorithm Library,
Argonne National Laboratory, Argonne, IL, 1996.
Man, K.F., Tang, K.S., and Kwong, S., Genetic Algorithms: Concepts and Designs,
Springer-Verlag, London, 1999.
