Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Evolutionary Computing
Evolution in the Changing World Looking at the world around us, we see a diversity of life - millions of species each with its own unique behaviour patterns and characteristics. All of these plants, animals, birds, fishes and other creatures have evolved, and continue evolving, over millions of years. They have adapted themselves to a constantly shifting and changing environment in order to survive. Those weaker members of a species tend to die away, leaving the stronger and fitter to mate, create offspring and ensure the continuing survival of the species. Their lives are dictated by the laws of natural selection and Darwinian evolution struggle for existence and survive for the fittest. Such an evolutionary process is shown in Figure 1.
Figure 1: Evolution in the nature And it is upon these ideas that evolutionary computing (EC) is based. Evolutionary computing is the emulation of the process of natural selection in a search procedure. In nature, organisms have certain characteristics that influence their ability to survive and reproduce. These characteristics are represented by encoding of information contained in the chromosomes of the organisms. New offspring chromosomes are created by means of mating
http://www.infm.ulst.ac.uk/~siddique
and reproduction mechanisms. The end result will be offspring chromosomes that contain the best characteristics of each parent chromosomes. The process of natural selection ensures that more fit individuals have opportunity to mate most of the time, leading to expectation that the offspring have a similar or better fitness. A Natural Perspective of Evolutionary Algorithms The population can be simply viewed as a collection of interacting creatures. As each generation of creatures comes and goes, the weaker ones tend to die away without producing children, while the stronger mate, combining attributes of both parents, to produce new, and perhaps unique children to continue the cycle. Occasionally, a mutation creeps into one of the creatures, diversifying the population even more. Remember that in nature, a diverse population within a species tends to allow the species to adapt to it's environment with more ease.
Population
Calculate Fitness
http://www.infm.ulst.ac.uk/~siddique
Each chromosome X represents a point in the search space. A chromosome consists of a number of genes xi, where the gene is the functional unit of inheritance. In terms of optimisation, a gene represents one parameter of the optimisation problem. The characteristics represented by chromosome can be divided into classes of evolutionary information: genotypes and phenotypes. A genotype describes the genetic composition of an individual as inherited from its parents. Genotypes provide a mechanism to store experiential evidence as gathered by parents. A phenotype is the expressed behavioural traits of an individual in a specific environment. Encoding Encoding of genetic information in evolutionary computing should be such that the representation and the problem space are close together, i.e. a natural representation of the problem. This also allows the incorporation of knowledge about the problem domain into EC system in the form of special genetic operators. Encoding of chromosomes is the first question to ask when starting to solve a problem with evolutionary computing. There is no exact way of choosing an encoding scheme rather it depends on the problem at question. In this lecture, some of the encoding schemes will be introduced that have been in use with some success in the field. Binary coding The most commonly used chromosome representation in the EA is the binary coding scheme. Each decision variable in the parameter set is encoded as a binary string and concatenated to form a chromosome. Some problems can be expressed very efficiently using binary coding shown as follows
X = {(b 1 , b 2 , b l ), (b 1 , b 2 , b l ), }, b i {0,1}
(2)
http://www.infm.ulst.ac.uk/~siddique
Example: Problem is to optimise a function f(x1,x2,x3) that takes real values between [0.0,1.0] and each value is represented by 8 digits. In binary coding the string {00000000} corresponds to real value 0.0 and {11111111} corresponds to 1.0. Now the chromosome is represented by {x1,x2,x3}looks like X={00000001 00101000 10110001} 0.00390625 0.15625 0.69140625 1/256=0.00390625 The problem existing in the binary coding lies in the fact that a long string always occupies the computer memory even though only a few bits are actually involved in the crossover and mutation operations. This is particularly the case when a lot of parameters are needed to be optimised and a higher precision is required for the final result. Also, empirical evidence suggests that large Hamming distance in the representational mapping between adjacent values, as in the case of binary coding, can result in the search process being deceived or unable to efficiently locate the global minimum (Caruana and Schaffer, 1988). To overcome the inefficient occupation of memory and inefficient search process, there is an increasing interest in alternative encoding strategies such as integer or real-valued encoding. Gray coding While binary coding is frequently used, it has the disadvantages called Hamming Cliffs. A Hamming Cliff is formed when two numerically adjacent values have bit representations that have a large Hamming distance. For example, consider the decimal numbers 7 and 8. The corresponding binary representations using 4-bits are 7=0111 and 8=1000, with a Hamming distance of 4. This causes a problem when a small change in variables should result in a small change in fitness. An alternative bit representation is to use Gray coding. Gray coding has the advantage over binary coding in that the Hamming distance between two successive numerical values is one. Binary numbers can be easily converted into Gray coding using the conversion
g = g1 g k g k = (bk 1 bk )
(3)
g 1 = b1 .
Example: Represent the decimal numbers 1-8 in binary and Gray coding and show Hamming Cliffs between two subsequent numbers.
http://www.infm.ulst.ac.uk/~siddique
Binary
Hamming cliff 2 1 3 1 2 1 4
Gray
Hamming cliff 1 1 1 1 1 1 1
The use of real-valued genes in EA is claimed to offer a number of advantages in numerical function optimisation over binary encoding (Wright, 1991). Efficiency of EA is increased as there is no need to convert binary strings into real values before each function evaluation and hence there is no loss in precision caused by conversion. For detail description of real-valued encoding schemes, see (Michalewicz,1992). When real values are used in chromosome representation, chromosomes are simply a string of real values
(4)
Example: Problem is to optimise a function f(x1,x2,x3) that takes real values between [0.0,1.0] and each value is represented by a real value rather than binary digits. Now the chromosome is represented by {x1,x2,x3}looks like
X={0.00390625 0.15625 0.69140625} The advantage of real-valued coding over the binary coding includes increased precision and chromosome string becomes shorter. Also real-valued coding gives a greater freedom to use special crossover and mutation techniques.
Hybrid coding
Combination of binary and real-values can be used in chromosome representation depending on the problem. X = {x 1 , x 2 , x 3 } = {(bbb), (bbb), (r)} e.g. X={100, 010, 7} (5)
http://www.infm.ulst.ac.uk/~siddique
This type of coding poses extra constraints on the EA operators. Normal cross over can still be applied but the mutation operator has to be changed in that it reinitialises a gene depending on the coding of that gene.
Permutation coding
Permutation encoding can be used in ordering problems, such as travelling salesman problem or task ordering problem. In permutation encoding, every chromosome is a string of numbers that represent a position in a sequence. An example of chromosomes with permutation encoding is shown below Chromosome A 1 5 3 2 6 4 7 9 8 Chromosome B 8 5 6 7 2 3 1 4 9
Permutation encoding is useful for ordering problems. For some types of crossover and mutation corrections must be made to leave the chromosome consistent (i.e. have real sequence in it) for some problems.
Value coding
Direct value encoding can be used in problems where some more complicated values such as real numbers are used. Use of binary encoding for this type of problems would be difficult. In the value encoding, every chromosome is a sequence of some values. Values can be anything connected to the problem, such as (real) numbers, chars or any objects. An example of chromosomes with value encoding is as follows
Chromosome A 1.2324 5.3243 0.4556 2.3293 2.4545 Chromosome B Chromosome C Chromosome D ABDJEIFJDHDIERJFDLDFLFEGT (back), (back), (right), (forward), (left) NB, NB, ZO, ZO, PS, NS, NS, ZO
Value encoding is a good choice for some special problems. However, for this encoding it is often necessary to develop some new crossover and mutation specific for the problem.
Tree coding
Tree encoding is used mainly for evolving programs or expressions, i.e. for genetic programming. In the tree encoding every chromosome is a tree of some objects, such as functions or commands in programming language. For example
http://www.infm.ulst.ac.uk/~siddique
Chromosome A
Chromosome B
(+ x (/ 5 y))
Tree encoding is useful for evolving programs or any other structures that can be encoded in trees. Programming language like LISP is often used for this purpose, since programs in LISP are represented directly in the form of tree and can be easily parsed as a tree, so the crossover and mutation can be done relatively easily. Finding a function that would approximate given pairs of values.
Population
Population is a randomly generated set of chromosomes or individuals used in a search procedure as shown in equation (6). The size of population is an important factor in and depends on the problem at hand. Typically, a population is composed of between 20 and 100 individuals (De Jong, 1975). A variant called micro GA uses very small population, approximately 10, with a restrictive genetic operators in order to implement real-time execution (Karr, 1991).
1X1 2 X = 1 M N X1
1 2
X2
X2 K M M N X2 K
Xm 2 Xm M N Xm
1
(6)
Initialisation: It is reasonable to discretize and place upper and lower bounds on the solution spaces for each of these parameters. An initial population of N chromosomes is generated using a random number generator that uniformly distributes numbers within the desired search space. An example of such a search space is shown in Figure 3. A variation is the extended random initialisation procedure whereby a number of random initialisation are tried for each individual and the one with the best performance is chosen for the initial population
http://www.infm.ulst.ac.uk/~siddique
(Bramlette, 1991). Other users have seeded the initial population with some individuals that are known a priori. e.g. Let g = f(x,y,z) is the function to be optimised. The parameters are x, y and z. The population P is initialised with a random number generator.
1X = M NX
Z
Y M Y
(7)
Z=10
Solution space
Y
Z= -3
f (x )
i =1 i
http://www.infm.ulst.ac.uk/~siddique
where N is the population size and xi is the ith individual. This fitness value ensures that each individual has a probability of reproducing according to its relative fitness. The fitness function f(xi) must be defined by user and very often problem dependent.
Genetic Operators
Each generation of an EA produces a new generation of individuals, representing a set of new potential solutions to the optimisation problem. The new generation is formed through application of operators. There are three basic genetic operators found in every evolutionary computation. (Although some algorithms may not employ the crossover operator, we shall refer to them as evolutionary algorithms rather than genetic algorithms.)