In recent years, genetic algorithms have been gaining wide attention from the research
community. The genetic algorithm (GA) is a rapidly growing area of Artificial Intelligence. It is
categorised as a subclass of evolutionary algorithms and is applicable to a large number of
optimisation problems in science and industry.
Optimisation algorithms are categorised into two major groups: deterministic and
probabilistic [Weise 2007]. Deterministic algorithms do not contain instructions that use
random numbers to decide what to do or how to modify data. They employ heuristics
to define the processing order of solution candidates. Probabilistic algorithms find
their applications in problems where there is a correlation between possible solutions and their
utility for a given problem. They are non-deterministic and employ random numbers for
finding a solution. They are also referred to as stochastic algorithms.
[Figure: classification of optimisation algorithms. Node labels recoverable from the diagram:
Optimisation Algorithms; Deterministic; Stochastic or Probabilistic; State Space Search; Branch
and Bound; Algebraic Geometry; Monte Carlo Algorithms; Artificial Intelligence; Soft Computing;
Computational Intelligence; Evolutionary Computation; Evolutionary Algorithms; Genetic
Algorithms; Genetic Programming; Evolutionary Strategies; Evolutionary Programming;
Differential Evolution; Learning Classifier System; Memetic Algorithms; Swarm Intelligence; Ant
Colony Optimisation; Particle Swarm Optimisation; Artificial Bee Colony Optimisation; Hill
Climbing; Tabu Search; Simulated Annealing; Random Optimisation.]
Genetic algorithms are based on Charles Darwin’s theory of evolution, which describes the
principle of natural selection, “Survival of the Fittest”. A genetic algorithm imitates the
process of evolution and follows the process of natural selection. In this imitation, the genetic
algorithm allows populations of potential solutions to an optimisation problem to die or to
reproduce with variations, gradually becoming adapted to their environment.
Darwin’s ideas about the principles of life can be summarized by the following three basic
principles:
There is a population of individuals with different properties and abilities. An upper limit
for the number of individuals in a population exists.
Nature creates new individuals with similar properties to the existing individuals.
Promising individuals are selected more often for reproduction by natural selection.
Darwin described natural selection as the foundation of biological evolution in
his work ‘On the Origin of Species’ in 1859. A few lines from that work describing the idea are
reproduced below.
“Owing to this struggle for life, variations, however slight and from whatever cause
proceeding, if they be in any degree profitable to the individuals of a species, in their
infinitely complex relations to other organic beings and to their physical conditions of life,
will tend to the preservation of such individuals, and will generally be inherited by the
offspring. The offspring, also, will thus have a better chance of surviving, for, of the many
individuals of any species which are periodically born, but a small number can survive. I
have called this principle, by which each slight variation, if useful, is preserved, by the term
Natural Selection.” [Darwin 1859]
Natural selection is probabilistic but favours the fittest individual in the generation. A single
individual of a population is affected by other individuals of the population as well as by the
environment. Individuals with an advantage have a greater chance of survival. Natural
selection results from the differing abilities of individuals to survive and reproduce in their
environment. Adaptation is a progressive increase in the degree to which a species becomes
genetically well suited to its environment. A principal mechanism of adaptation is natural
selection, in which individuals superior in survival or reproductive ability in the prevailing
environment contribute a disproportionate share of genes to future generations, thereby
gradually increasing the frequency of favourable alleles in the whole population.
For example, the ancestors of giraffes had short necks but fed on leaves high up in the trees.
Individuals with slightly longer necks could reach more leaves and were more likely to survive
and reproduce, so the character of a longer neck was passed on to their descendants, eventually
producing the long-necked giraffe [Weblink1]. As a result of the struggle for existence, natural
selection is at work in nature, allowing only those individuals that are suitably adapted to the
environment to survive. This is called “survival of the fittest”. In the third edition of the
Origin [Darwin 1861], Darwin mentioned the neck of the giraffe as an adaptation for feeding, as
shown in Figure 1.3. He counted it among the beautiful adaptations in nature, such as the long
neck of the giraffe for browsing on the branches of trees. In the sixth edition of the Origin
[Darwin 1872], Darwin gave consideration to the long neck as an adaptation for feeding. This
suggests that natural selection plays a major role in the survival process.
Another example is the Peppered Moth, Biston betularia. Before the Industrial Revolution, the
common form of this moth, typica, was light in colour with small dark spots, which acted as
good camouflage on light-coloured lichen. With the increase in pollution, tree trunks became
dark and the mutated form, carbonaria, became prevalent, as shown in Figure 1.4. When moths
landed on these trees and other blackened surfaces, the dark-coloured ones were harder for
birds to spot and eat and, consequently, they more often lived long enough to
reproduce. Over generations, the environment continued to favour darker moths. As a result,
they progressively became more common. By 1895, 98% of the moths in the vicinity of
English cities like Manchester were black [Weblink4]. Since the 1950s, control measures for
air pollution have reduced the amount of air pollutants. As a result, lichen
has grown back, making trees lighter in colour, and natural selection favours the lighter moth
variety, typica, in non-polluted forests [Weblink3]. Adaptation over generations of the peppered
moth is shown in Figure 1.5.
Figure 1.4: Natural Selection in Peppered Moth. Image Source: [Weblink2]
Figure 1.5: Adaptation over Generations in Peppered Moth. Image Source: [Weblink3]
pairs of diploid chromosomes, totalling 46 chromosomes. The cell and its organisation are
depicted in Figure 1.6.
Chromosomes are thread-like or rod-like hereditary structures of DNA and protein which store,
replicate and transmit coded genetic information. Sutton and Boveri [Sutton 1902, Boveri
1902] proposed the chromosome theory of inheritance. It was established that the chromosome is
the physical basis of inheritance and DNA is the chemical basis of inheritance. The shape and
size of a chromosome differ from phase to phase in the cell cycle. The chemical composition of
chromosomes is 40% DNA, 15% RNA, 50% Histone proteins, 8.5% Non-histone proteins,
traces of lipids and traces of minerals like Ca, Mg and Fe [Bonner et al. 1968].
DNA represents the hereditary or genetic material. It is a long double-helix polynucleotide
that lies packed in a chromosome only a few micrometres long, as shown in Figure 1.7.
DNA controls the inheritance of traits from one generation to the next through two processes:
Replication and Information Transfer [Gardener 1984].
Chromosomes bear genes. All the hereditary information is located in the genes.
Chromosomes form a link between the offspring and the parents. Genes code the
characteristics of an individual. The possible variants of a gene for one property are called
alleles, and a gene can take different alleles [Sivanandam et al. 2007]. For example, there is a
gene for eye colour. Its possible alleles are black, brown, blue and green. The set of all possible
alleles present in a particular population forms a gene pool. This gene pool determines all
different possible variations for the future generations. The size of the gene pool helps in
determining the diversity of the individuals in the population. The set of all the genes of a
specific species is called the genome. Every gene has a unique position on the genome
called the locus. Figure 1.8 shows the positions of various genes on the chromosomes in
different colours and Figure 1.9 depicts the gene constitution on a DNA molecule represented
by different colours.
Genetics is the study of biologically inherited traits including traits that are influenced in part
by the environment. Each species of a living organism has a unique set of inherited
characteristics. Inherited traits are determined by elements of heredity called genes that are
transmitted from parent to offspring in reproduction. Genes have an organisation within
chromosomes that can be changed and thus provide variation in the traits of organisms.
Genes not only have a basic role in the origin and life of individual organisms but they also
cause changes in populations through gene variations. Genetics deals with the inherent
mechanisms that control constancy and change in living organisms in contrast to evolution
that leads to progressive changes in the gene pool [Hartl 2001].
Reproduction of species via genetic information is carried out by Mitosis or Meiosis. Mitosis
is a cell division process in which chromosomes replicate and become equally distributed, both
quantitatively and qualitatively, into two daughter nuclei. It is also called somatic cell
division. Mitosis keeps all the somatic cells of an organism genetically similar. It is involved
in asexual reproduction and regeneration. Mitosis ensures that each daughter cell has the
same genetic information as the mother cell. Mitosis is a process of nuclear division that
maintains the chromosome number when a somatic cell divides.
Meiosis is a double division which occurs in a diploid cell and gives rise to four haploid cells.
Meiosis represents sexual reproduction. Meiosis is a process in sexual reproduction through
which the chromosome number of diploid (2n) germ cells is reduced to half (n) in formation
of mature reproductive cells or gametes. It involves two successive cell divisions. Meiosis is
the mode of cell division that results in haploid daughter cells containing only one member of
each pair of chromosomes [Gardener 1984, Hartl 2001]. The process generates genetic
diversity because each daughter cell contains a different set of alleles. Meiosis reduces the
diploid number of chromosomes to the haploid number. Figure 1.10 shows a pictorial
representation of both cell division processes, Meiosis and Mitosis.
corresponding to the genotype. An organism’s genotype is the set of genes that it carries. An
organism’s phenotype is all of its observable characteristics—which are influenced both by
its genotype and by the environment. For example, differences in the genotypes can produce
different phenotypes. In house cats, the genes for ear form are different, causing one of these
cats to have normal ears and the other to have curled ears (shown in Figure 1.11). Phenotype
is the outward, physical manifestation of the organism. Genotype is the internally coded,
inheritable information carried by all living organisms. This stored information is used as a
blueprint or set of instructions for building and maintaining a living creature. Figure 1.12
shows different phenotypes for eye colour in human beings.
Figure 1.11: Different Phenotypes for Cat’s Ears. Image Source: [Weblink10]
Figure 1.12: Different Phenotypes for Human Eye Colour. Image Source: [Weblink11]
The relationship between the genotype and phenotype is that the Genotype codes for the
Phenotype. The internally coded, inheritable information, or Genotype, carried by all living
organisms, holds the critical instructions that are used and interpreted by the cellular
machinery of the cells to produce the outward, physical manifestation, or Phenotype of the
organism. A change in the environment can also affect the phenotype. For example, the pink
colour of flamingoes is not encoded in their genotype (see Figure 1.13). The food intake of
flamingoes makes their phenotype white or pink [Weblink10].
1.6 Linkage between Nature and Computation
In the previous sections, evolution and the biological concepts of genetics have been detailed. In
short, evolution is cumulative genetic change in a population through time.
In the past few decades, various computational methods have been developed whose
working is inspired by nature. One such computational method is the evolutionary
algorithm.
John Holland presented the genetic algorithm in his book Adaptation in Natural and Artificial
Systems in 1975 as an abstraction of biological evolution. He defined GA as: [Holland 1975]
“A method for moving from one population of "chromosomes" to a new
population by using a kind of "natural selection" together with the
genetics−inspired operators of crossover, mutation, and inversion”.
David E. Goldberg in 1989 defined genetic algorithms as: [Goldberg 1989]
“Genetic Algorithms are adaptive heuristic search algorithms based on the
evolutionary ideas of natural selection and natural genetics”.
Genetic algorithms were introduced by John Holland and subsequently studied by many
researchers. The genetic algorithm was found to be a general model of adaptive processes, but by
far the largest application of the technique is in the domain of optimisation.
1.9 Fitness Function
Genetic algorithms are robust, stochastic optimisation algorithms that search for the optimal
value of an objective function, which depends on the problem to be solved. The standard
approach to an optimisation problem begins by designing an objective function that
models the problem’s objectives while incorporating any constraints. The objective function of
an optimisation technique corresponds to the fitness function in a genetic algorithm. The fitness
function evaluates chromosomes by assigning each one a fitness value. Fitness is proportional to
the utility or ability of the individual that the chromosome represents. The measure of fitness
helps in evolving good solutions and implementing natural selection: it forms the
basis for selection and facilitates improvements in forthcoming generations. Mathematically,
the fitness function is associated with maximising or minimising the value of fitness, depending
on the problem to be solved. The fitness function can be of three types: an
objective function representing a mathematical model, computational model or computer
simulation; a subjective function, where humans choose better solutions over worse ones; or a
co-evolved function arising out of cooperative and competitive environments [Sastry
2002]. The fitness function should be problem specific. Fitness can be quantified by a single
numerical value in single-objective optimisation or by multiple measures in a multi-objective
optimisation problem.
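As an illustration of an objective function acting as a fitness function, the following minimal sketch evaluates a binary chromosome in a single-objective maximisation problem. The objective f(x) = x^2 and the helper names decode and fitness are assumptions chosen for the example; they do not come from the cited sources.

```python
def decode(bits):
    """Interpret a list of 0/1 genes as an unsigned integer, most significant bit first."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def fitness(bits):
    """Hypothetical single-objective fitness: maximise f(x) = x^2 for the decoded x."""
    x = decode(bits)
    return x * x


chromosome = [0, 1, 1, 0, 1]  # decodes to x = 13
print(decode(chromosome), fitness(chromosome))
```

For a minimisation problem the same structure applies; one common choice is to negate the objective so that a larger fitness value still means a better individual.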
The genetic algorithm is a search algorithm that iteratively transforms a set of strings, each
with an associated fitness value, called a population, into a new population of offspring
objects using the Darwinian principle of natural selection and operations such as
crossover and mutation. The algorithm begins with a set of solutions represented by a population
of chromosomes. Solutions from one population are taken and used to form a new population.
This is motivated by the hope that the new population will be better than the old one. Solutions
are selected to form new solutions (offspring) according to their
fitness: the more suitable they are, the more chances they have to reproduce. This is repeated
until some condition is satisfied.
The space of all feasible solutions is called the search space or state space. Each point in the
search space represents one possible solution. Each possible solution can be marked by its
value or fitness for the problem. The difficulty is that the search can be very complicated. One
may not know where to look for a solution or where to start. There are many methods one can
use for finding a suitable solution, but these methods do not necessarily provide the best
solution. Genetic algorithms help to look for the best solution among a number of possible
solutions.
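To make the notion of a search space concrete, the sketch below enumerates every point of a small binary search space and marks each point with a fitness value. The function names and the count-of-ones fitness are hypothetical, chosen only for illustration.

```python
from itertools import product


def enumerate_search_space(n_bits):
    """Return every binary string of length n_bits as a list of 0/1 genes."""
    return [list(bits) for bits in product([0, 1], repeat=n_bits)]


def count_ones(bits):
    """Hypothetical fitness used only for this illustration: number of 1 genes."""
    return sum(bits)


space = enumerate_search_space(4)      # 2**4 candidate solutions
best = max(space, key=count_ones)      # exhaustive search over the whole space
print(len(space), best)
```

Exhaustive enumeration is feasible only for very short strings, because the number of points doubles with every additional bit. This is the situation in which a guided stochastic search such as a genetic algorithm becomes attractive.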
A simple genetic algorithm that yields good results in many practical problems is composed
of four operators [De Jong 1975, Beasley et al. 1993a]:
ii) Crossover: It occurs after reproduction or selection. It creates two new individuals
(strings) from two existing ones by recombining the parts delimited by a
randomly chosen crossover point.
iii) Mutation: It is the occasional random alteration of the value of a string position. Mutation
creates a new string by altering the value of an existing string.
iv) Replacement: It is the last stage of the breeding step in any genetic algorithm cycle. It is
used to decide which individuals stay in a population and which get replaced.
The following section describes in detail the sequence of steps in a genetic algorithm and
Figure 1.14 shows the pictorial representation of genetic algorithm in the form of a flowchart.
b. [Crossover] Mate the selected chromosomes as per the given crossover probability
to form new offspring.
c. [Mutation] Mutate new chromosomes as per the given mutation probability.
d. [Replace] Replace the old population of chromosomes with the new
population.
5. [Convergence check] If the maximum number of generations is reached, then stop,
and return the best solution.
6. [Loop] Go to step 3.
[Figure 1.14: Flowchart of the genetic algorithm. Recoverable stages: Selection, Crossover,
Mutation, Convergence Check, End.]
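The cycle of selection, crossover, mutation, replacement and convergence check can be sketched as a short program. This is a minimal illustration under stated assumptions rather than a reference implementation: the fitness function one_max, the default parameter values, the use of fitness-proportionate selection and the +1 offset on the selection weights are all choices made for this example.

```python
import random


def one_max(chromosome):
    """Hypothetical fitness function: the number of 1 bits (to be maximised)."""
    return sum(chromosome)


def run_ga(n_bits=20, pop_size=30, p_crossover=0.9, p_mutation=0.02,
           max_generations=50, seed=0):
    rng = random.Random(seed)
    # Start from a random population of binary chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=one_max)

    for _ in range(max_generations):  # convergence check: generation limit
        # The +1 offset keeps every weight positive (an assumption of this sketch).
        weights = [one_max(c) + 1 for c in population]
        new_population = []
        while len(new_population) < pop_size:
            # Selection: fitness-proportionate choice of two parents.
            parent1, parent2 = rng.choices(population, weights=weights, k=2)
            # Crossover: one-point, applied with probability p_crossover.
            if rng.random() < p_crossover:
                point = rng.randint(1, n_bits - 1)
                child1 = parent1[:point] + parent2[point:]
                child2 = parent2[:point] + parent1[point:]
            else:
                child1, child2 = parent1[:], parent2[:]
            # Mutation: flip each bit independently with probability p_mutation.
            for child in (child1, child2):
                for i in range(n_bits):
                    if rng.random() < p_mutation:
                        child[i] = 1 - child[i]
                new_population.append(child)
        # Replacement: the new population replaces the old one.
        population = new_population[:pop_size]
        generation_best = max(population, key=one_max)
        if one_max(generation_best) > one_max(best):
            best = generation_best
    return best


best = run_ga()
print(one_max(best), best)
```

The fixed seed makes a run repeatable; a different seed gives a different run of the same stochastic procedure.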
Encoding of a Chromosome:
Encoding of chromosomes is the first question to ask when starting to solve a problem with a
genetic algorithm. A chromosome should in some way contain information about the solution
that it represents. The most common encoding is a binary string: each chromosome is
represented by a string of bits (shown in Figure 1.15). Each bit in the string can represent
some characteristic of the solution. Another possibility is that the whole string represents
a number. Of course, there are many other ways of encoding. The encoding depends mainly
on the problem being solved. For example, one can directly encode integers or real numbers;
sometimes it is useful to encode permutations, and so on.
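[Figure 1.15: A chromosome encoded as the binary string 0 1 0 1 1 1 0 0 1 1, with one position
labelled "gene" and its bit value labelled "allele".]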
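The two readings of a binary chromosome described above (each bit as a characteristic, or the whole string as a number) can be sketched as follows. The helper names, the item labels and the interval used for the real-valued mapping are hypothetical and serve only as an illustration.

```python
def bits_to_int(bits):
    """Read the whole binary string as one unsigned integer."""
    return int("".join(str(b) for b in bits), 2)


def bits_to_real(bits, low, high):
    """Map the binary string linearly onto the real interval [low, high]."""
    max_value = 2 ** len(bits) - 1
    return low + (high - low) * bits_to_int(bits) / max_value


def selected_items(bits, items):
    """Read each bit as a characteristic: item i is included when bit i is 1."""
    return [item for bit, item in zip(bits, items) if bit == 1]


chromosome = [0, 1, 0, 1, 1, 1, 0, 0, 1, 1]   # the 10-bit string of Figure 1.15
print(bits_to_int(chromosome))
print(bits_to_real(chromosome, -5.0, 5.0))
print(selected_items(chromosome, list("abcdefghij")))

# Permutation encoding: the chromosome is an ordering, e.g. a visiting order of 5 cities.
tour = [3, 0, 4, 1, 2]
print(sorted(tour) == list(range(5)))
```

Which reading is appropriate depends on the problem: a subset-selection problem suits the bit-per-characteristic view, a numerical problem suits the number view, and an ordering problem suits a permutation.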
Selection:
Chromosomes are selected from the population to be parents for crossover. The problem is
how to select these chromosomes. According to Darwin's theory of evolution, the best ones
survive to create new offspring. There are many methods for selecting the best chromosomes.
Examples include roulette wheel selection, rank selection and steady state selection.
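Roulette wheel selection, the first method named above, can be sketched as follows: each chromosome receives a slice of the wheel proportional to its fitness, and a random spin picks the parent. The function name and the sample data are hypothetical, and the sketch assumes non-negative fitness values with a positive total.

```python
import random


def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and their sum is positive.
    """
    total = sum(fitnesses)
    spin = rng.random() * total          # a point on the wheel in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]                # guard against floating-point round-off


rng = random.Random(42)
population = ["A", "B", "C"]
fitnesses = [1, 2, 7]
counts = {name: 0 for name in population}
for _ in range(10000):
    counts[roulette_wheel_select(population, fitnesses, rng)] += 1
print(counts)
```

Over many spins the fitter chromosomes are chosen more often, but less fit ones still have a chance, which is what keeps selection probabilistic rather than strictly greedy.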
Crossover:
Crossover depends upon the encoding scheme used for the problem. Crossover operates on
selected genes from the parent chromosomes and creates new offspring. The simplest way of
performing crossover is to choose a crossover point at random, copy everything before
this point from the first parent and then copy everything after the crossover point from the
other parent. There are many other ways to perform crossover, such as n-point crossover,
uniform crossover and order crossover. Crossover can be quite complicated and depends
mainly on the encoding of chromosomes [Beasley et al. 1993b, Ryan 2000]. A crossover
designed for a specific problem can improve the performance of the genetic algorithm.
The one-point crossover operation on binary strings is illustrated in Figure 1.16.
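One-point crossover on binary strings, as described above, can be sketched in a few lines. The parent strings and the function names are hypothetical examples rather than the contents of Figure 1.16.

```python
import random


def one_point_crossover(parent1, parent2, point):
    """Swap the tails of two equal-length strings after position `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def random_one_point_crossover(parent1, parent2, rng):
    """Choose the crossover point at random, keeping at least one gene from each parent."""
    point = rng.randint(1, len(parent1) - 1)
    return one_point_crossover(parent1, parent2, point)


parent1 = "11001011"
parent2 = "11011111"
print(one_point_crossover(parent1, parent2, 5))
print(random_one_point_crossover(parent1, parent2, random.Random(0)))
```

With the crossover point at 5, the first child takes its first five genes from the first parent and the remaining three from the second; the second child is the mirror image.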
Mutation:
After crossover is performed, mutation takes place. Mutation is intended to
prevent all solutions in the population from falling into a local optimum of the problem being
solved. The mutation operation randomly changes the offspring resulting from crossover. In the
case of binary encoding, a few randomly chosen bits can be switched from 1 to 0 or from 0 to 1.
Mutation can then be illustrated as follows:
Chromosome A 101100101100101011100101
After Mutation 101000101100101011100101
The technique of mutation (as well as crossover) depends mainly on the encoding of
chromosomes. For example, in case of permutation encoding, mutation could be performed
as an exchange of two genes.
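Both kinds of mutation mentioned above can be sketched briefly: bit-flip mutation for binary encoding and an exchange of two genes for permutation encoding. The function names are hypothetical; the first call reproduces the Chromosome A example, in which the fourth bit is flipped.

```python
import random


def flip_bits(chromosome, positions):
    """Return a copy of a binary string with the bits at `positions` inverted."""
    bits = list(chromosome)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)


def bit_flip_mutation(chromosome, p_mutation, rng):
    """Flip each bit independently with probability p_mutation."""
    positions = [i for i in range(len(chromosome)) if rng.random() < p_mutation]
    return flip_bits(chromosome, positions)


def swap_mutation(permutation, rng):
    """Permutation encoding: exchange two distinct, randomly chosen genes."""
    i, j = rng.sample(range(len(permutation)), 2)
    mutated = list(permutation)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


chromosome_a = "101100101100101011100101"
print(flip_bits(chromosome_a, [3]))          # flips the fourth bit, as in the example above
rng = random.Random(1)
print(bit_flip_mutation(chromosome_a, 0.05, rng))
print(swap_mutation([0, 1, 2, 3, 4], rng))
```

Swap mutation keeps the result a valid permutation, which a bit-flip style change would not guarantee.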
Replacement:
When a new generation of offspring is produced, the next question is which of the newly
generated offspring move forward to the next generation and which chromosomes of the
current generation they replace. The answer to this question is based on Darwin’s
principle of “Survival of the Fittest” [Fogel 1995]: fitter individuals have a better chance to
survive and be carried forward to the next generation, leaving behind the less fit ones. The
process of forming the next generation of individuals by replacing or removing some offspring or
parent individuals is carried out by the replacement operator and is known as the
replacement scheme [Sivanandam et al. 2007]. Basically, there are two kinds of replacement
strategies for maintaining the population: generational replacement and steady state
replacement. In generational replacement, the entire population of genomes is replaced at each
generation, so generations are non-overlapping. With elitism, the complete population is
replaced except for the best member of each generation, which is carried over to the next
generation without modification [Affenzeller et al. 2009]. Steady state replacement involves
overlapping populations, which means that only a small fraction of the population is replaced
during each iteration. In steady state replacement, new individuals are inserted into the
population as soon as they are created [Sarma et al. 1997].
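The replacement strategies above can be contrasted in a short sketch. The function names are hypothetical, and the steady state policy shown here (a new child replaces the worst member only if the child is fitter) is one possible choice assumed for illustration; the text does not prescribe which individual is replaced.

```python
def generational_replacement(population, offspring):
    """The whole population is replaced by the offspring."""
    return list(offspring)


def elitist_replacement(population, offspring, fitness):
    """Keep the best current member unchanged; fill the rest with the best offspring."""
    elite = max(population, key=fitness)
    survivors = sorted(offspring, key=fitness, reverse=True)[:len(population) - 1]
    return [elite] + survivors


def steady_state_replacement(population, child, fitness):
    """Insert one new child immediately, replacing the worst member if the child is fitter."""
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    new_population = list(population)
    if fitness(child) > fitness(new_population[worst]):
        new_population[worst] = child
    return new_population


population = [[1, 1, 1], [0, 0, 0], [1, 0, 0]]
offspring = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]
print(generational_replacement(population, offspring))
print(elitist_replacement(population, offspring, sum))
print(steady_state_replacement(population, [1, 1, 0], sum))
```

In the generational variant the best individual [1, 1, 1] is lost, whereas the elitist and steady state variants both retain it.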
1.11 Parameters of Genetic Algorithms
A number of parameters control the precise operation of the genetic algorithm. They are:
a) Crossover probability: It is the measure of how often crossover will be performed. If there
is no crossover, offspring are exact copies of their parents. If there is crossover, offspring are
made from parts of both parents' chromosomes. If the crossover probability is 100%, then all
offspring are made by crossover. If it is 0%, the whole new generation is made from exact
copies of chromosomes from the old population. Crossover is performed in the hope that new
chromosomes will contain the good parts of old chromosomes and will therefore
be better.
b) Mutation probability: It is the measure of how often parts of a chromosome will be
mutated. If there is no mutation, offspring are generated immediately after crossover
without any change. If mutation is performed, one or more parts of a chromosome are
changed. If the mutation probability is 100%, the whole chromosome is changed; if it is 0%,
nothing is changed. Mutation generally prevents the genetic algorithm from falling into
local extremes and helps in recovering lost genetic material. Mutation should not
occur very often, because the genetic algorithm would then behave like a random search.
c) Population size: It is the number of chromosomes present in the population
(representing one generation). If there are too few chromosomes, the genetic algorithm has
few options available for crossover and only a small part of the search space is explored. On
the other hand, if there are too many chromosomes in one population, the
genetic algorithm slows down.
d) Selection Pressure: Each of the genetic operations (crossover, mutation or replacement)
involves both parent and child chromosomes. The selection of parent chromosomes is
biased towards highly fit chromosomes: a fitter chromosome is more likely to be a
parent than an unfit one in genetic operations. The selection pressure is defined as the
ratio of the probability that the fittest member of the population is selected as a
parent to the probability that an average member is selected as a parent. Too high a
selection pressure makes the population converge too early, a problem known as
premature convergence.
e) Number of Operations: The genetic algorithm starts off with a random population of
chromosomes. Genetic operations (crossover, mutation, replacement) are then applied
iteratively to the population. The number of operations is the number of
operator applications performed over the course of a genetic algorithm run.
f) Elitism: When creating a new population by crossover and mutation, there is a chance
that the best chromosome is lost. Elitism is the name of the method that first copies the
best chromosome (or a few of the best chromosomes) to the new population [De Jong 1975]. The
rest of the population is constructed in the usual way. Elitism can rapidly increase the
performance of a GA, because it prevents the loss of the best solution found so far.
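The parameters listed above can be gathered in one place, and the selection pressure can be computed for a given population. In the sketch below the class name, the field names and all default values are illustrative assumptions, not recommendations from the cited sources. The selection_pressure function additionally assumes fitness-proportionate (roulette wheel) selection, under which a member with fitness f is chosen with probability f divided by the total fitness, so the ratio in the definition reduces to the maximum fitness divided by the mean fitness.

```python
from dataclasses import dataclass


@dataclass
class GAParameters:
    """Illustrative container for the parameters discussed above (values are assumptions)."""
    crossover_probability: float = 0.9
    mutation_probability: float = 0.01
    population_size: int = 50
    elite_count: int = 1
    max_generations: int = 100


def selection_pressure(fitnesses):
    """P(best is selected) / P(average member is selected) under roulette wheel selection."""
    mean_fitness = sum(fitnesses) / len(fitnesses)
    return max(fitnesses) / mean_fitness


def expected_mutations(chromosome_length, mutation_probability):
    """Expected number of mutated genes per chromosome with independent per-gene mutation."""
    return chromosome_length * mutation_probability


params = GAParameters()
print(params)
print(selection_pressure([1, 2, 3, 10]))
print(expected_mutations(20, params.mutation_probability))
```

A value of 1.0 means that every member is equally likely to be a parent; larger values indicate a stronger bias towards the fittest member and hence a higher risk of premature convergence.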
Genetic algorithms are powerful and broadly applicable stochastic search and optimisation
techniques. They have been used in a large number of scientific and engineering problems and
models. Optimisation problems occur in many technical, economic and scientific projects, for
example the minimisation of cost, time and risk or the maximisation of quality, profit and
efficiency. Thus the development of general strategies is of great value.
Automatic Programming: They are used to evolve computer programs for specific
tasks and to design other computational structures such as cellular automata and sorting
networks.
Optimisation: Genetic algorithms are widely used for optimisation tasks, including
both numerical and combinatorial optimisation problems such as the Travelling Salesman
Problem, circuit design [Louis 1993], job shop scheduling [Goldstein 1991], video and
sound quality optimisation, telecommunication routing, the state assignment problem,
timetabling, and traffic and shipment routing.
Engineering Design: They are also used to optimise the structure and operational
design of buildings, factories, machines etc. They are used to design heat exchangers,
robot gripping arms, flywheels, turbines etc.
Robotics: A robot’s design depends on the job it is intended to do, so there are
many different designs for robots. Genetic algorithms can be used to search a range of
optimal designs and components for each specific use and can return entirely
new types of robots.
Machine Learning: These algorithms are used for machine learning applications like
classification and prediction, protein structure prediction etc. They are also used to
design neural networks, to evolve rules for learning classifier systems and symbolic
production systems.
Ecological Model: Genetic algorithms are used to model ecological phenomena such
as biological arms races, host-parasite evolution, symbiosis and resource flow in
ecologies.
Evolvable Hardware: Genetic algorithms are used to develop computer models that use
stochastic operators to evolve new configurations from old ones, so as to develop new
electronic circuits that can be termed evolvable hardware.
Strategy planning and Decision making: Genetic algorithms are widely used in
solving business problems in functional areas such as finance, marketing
and production. They can be used in activities like tactical information management,
grouping and network design, and job scheduling for better decision making and
management.
Game Playing: Genetic algorithms are also applied in game theory, and they are
widely used in developing computer games and simulated environments.
Encryption and code breaking: Genetic algorithms can be used both to create
encryption for sensitive data as well as to break those codes.
1.13 Summary
Living organisms exhibit extremely sophisticated learning, decision making and processing
abilities that allow them to survive and proliferate. Nature has always served as inspiration
for scientific and technological developments. The genetic algorithm is one such
nature-inspired technique used to solve search and optimisation problems. Genetic algorithms are
based on the principles of evolution via natural selection, employing a population of
individuals that undergo selection in the presence of variation-inducing operators such as
mutation and crossover. They are used in a vast range of applications such as search,
optimisation, decision making, machine learning, robotics and many more.