

Machine Learning:
Algorithms and Applications
Floriano Zini

Free University of Bozen-Bolzano


Faculty of Computer Science

Academic Year 2011-2012

Lecture 4: 19th March 2012

Evolutionary computing

These slides are mainly taken from A.E. Eiben and J.E. Smith,
Introduction to Evolutionary Computing


The Main EC Metaphor

EVOLUTION            PROBLEM SOLVING
Environment     ↔    Problem
Individual      ↔    Candidate solution
Fitness         ↔    Quality

Fitness → chances for survival and reproduction

Quality → chance for seeding new solutions

In nature, fitness is an observed, secondary quantity; in EC, fitness is primary (explicitly defined)

Motivations for EC
›  Developing, analyzing, and applying problem-solving methods, a.k.a. algorithms, is a central theme in mathematics and computer science

›  Time for thorough problem analysis decreases

›  Complexity of problems to be solved increases

›  Consequence: ROBUST PROBLEM-SOLVING technology is needed


Evolutionary machine learning

›  We have corresponding sets of inputs & outputs and seek a model that delivers the correct output for every known input

Modelling example: loan applicant creditworthiness

›  A British bank evolved a creditworthiness model to predict the loan-repayment behavior of new applicants

›  Evolving: prediction models

›  Fitness: model accuracy on historical data


Modelling example: mushroom edibility

›  Classify mushrooms as edible or not edible

›  Evolving: classification models

›  Fitness: classification accuracy on a training set of edible and not edible mushrooms

EC metaphor

›  A population of individuals exists in an environment with limited resources
›  Competition for those resources causes selection of those fitter individuals that are better adapted to the environment
›  These individuals act as seeds for the generation of new individuals through recombination and mutation
›  The new individuals have their fitness evaluated and compete (possibly also with parents) for survival
›  Over time, natural selection causes a rise in the fitness of the population


General scheme of EAs


[Cycle diagram] Initialization → Population → Parent selection → Parents →
Recombination (crossover) → Mutation → Offspring → Survivor selection →
back into the Population, repeated until Termination

EA scheme in pseudo-code
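
A minimal Python sketch of this general scheme (the helper names `init_population`, `evaluate`, `select_parents`, `recombine`, `mutate`, `select_survivors` and `terminated` are assumptions, not from the slides):

```python
def run_ea(init_population, evaluate, select_parents, recombine,
           mutate, select_survivors, terminated):
    """Generic EA loop; all concrete operators are passed in as functions."""
    population = init_population()                     # INITIALISE
    fitness = [evaluate(ind) for ind in population]    # EVALUATE
    while not terminated(population, fitness):         # TERMINATION CONDITION
        parents = select_parents(population, fitness)                # parent selection
        offspring = [mutate(child)                                   # mutation
                     for p1, p2 in zip(parents[::2], parents[1::2])  # pair up parents
                     for child in recombine(p1, p2)]                 # recombination
        off_fitness = [evaluate(ind) for ind in offspring]           # evaluate offspring
        population, fitness = select_survivors(                      # survivor selection
            population, fitness, offspring, off_fitness)
    return population, fitness
```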


Main EA components

›  Representation

›  Population

›  Evaluation

›  Selection (parent selection, survivor selection)


›  Variation (mutation, recombination)
›  Initialization

›  Termination condition

Representation
[Figure: phenotype space and genotype space. Encoding (representation) maps
each phenotype to a genotype; decoding (inverse representation) maps each
genotype back to a phenotype]

In order to find the global optimum, every feasible solution must be
represented in genotype space
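
As a concrete illustration (not from this slide; it anticipates the binary representation used later for the x² example), an encoding/decoding pair between integer phenotypes and bit-string genotypes:

```python
def encode(x, n_bits=5):
    """Encoding (representation): integer phenotype -> bit-string genotype."""
    return format(x, "0{}b".format(n_bits))

def decode(bits):
    """Decoding (inverse representation): bit-string genotype -> integer phenotype."""
    return int(bits, 2)

assert encode(13) == "01101" and decode("01101") == 13   # 01101 <-> 13
```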


Population

›  Role: holds the candidate solutions of the problem as individuals (genotypes)
›  Formally, a population is a multiset of individuals, i.e. repetitions are possible
›  The population is the basic unit of evolution, i.e., the population is evolving, not the individuals
›  Selection operators act on the population level
›  Variation operators act on the individual level

Evaluation (fitness) function


›  A.k.a. quality function or objective function
›  Role:
›  represents the task to solve, the requirements to adapt to (can be seen as “the environment”)
›  enables selection (provides the basis for comparison)
›  e.g., some phenotypic traits are advantageous or desirable (e.g., big ears cool the body better); these traits are rewarded with more offspring that will, in expectation, carry the same trait
›  Assigns a single real-valued fitness to each phenotype, which forms the basis for selection
›  So the more discrimination (different values), the better
›  Typically we talk about fitness being maximised
›  Some problems may be best posed as minimisation problems, but the conversion is trivial


Selection
Role:
›  Identifies individuals
›  to become parents
›  to survive
›  Pushes the population towards higher fitness
›  Usually probabilistic
›  high-quality solutions are more likely to be selected than low-quality ones
›  but this is not guaranteed
›  even the worst individual in the current population usually has a non-zero probability of being selected
›  This stochastic nature can aid escape from local optima

Selection mechanism example

Example: roulette wheel selection

fitness(A) = 3  →  selection probability 3/6 = 50%
fitness(B) = 1  →  selection probability 1/6 ≈ 17%
fitness(C) = 2  →  selection probability 2/6 ≈ 33%

In principle, any selection mechanism can be used for
parent selection as well as for survivor selection
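
A minimal sketch of roulette-wheel selection on the example above (the function name is an assumption, not from the slides):

```python
import random

def roulette_wheel_select(individuals, fitnesses, n):
    """Select n individuals, each drawn with probability proportional to its fitness."""
    return random.choices(individuals, weights=fitnesses, k=n)

# Example from the slide: A, B and C with fitness 3, 1 and 2
selected = roulette_wheel_select(["A", "B", "C"], [3, 1, 2], n=2)
```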


Survivor selection
›  A.k.a. replacement
›  Most EAs use a fixed population size, so they need a way of going from (parents + offspring) to the next generation
›  Often deterministic (while parent selection is usually stochastic)
›  Fitness based: e.g., rank parents + offspring and take the best
›  Age based: make as many offspring as parents and delete all parents
›  Sometimes a combination of stochastic and deterministic (elitism)
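
A minimal sketch of the two replacement schemes mentioned above (function names assumed, not from the slides):

```python
def fitness_based_replacement(parents, offspring, fitness, mu):
    """Rank parents + offspring by fitness and keep the mu best."""
    pool = parents + offspring
    return sorted(pool, key=fitness, reverse=True)[:mu]

def age_based_replacement(parents, offspring):
    """Offspring replace all parents (assumes as many offspring as parents)."""
    return list(offspring)
```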

Variation operators
›  Role: to generate new candidate solutions
›  Usually divided into two types according to their arity (number of inputs):
›  arity 1: mutation operators
›  arity >1: recombination operators
›  arity = 2 is typically called crossover
›  arity > 2 is formally possible, but seldom used in EC

›  There has been much debate about the relative importance of recombination and mutation
›  Nowadays most EAs use both
›  Variation operators must match the given representation


Mutation
›  Role: causes small, random variance
›  Acts on one genotype and delivers another
›  The element of randomness is essential and differentiates it from other unary heuristic operators

before:  1 1 1 1 1 1 1
after:   1 1 1 0 1 1 1

Recombination
›  Role: merges information from parents into offspring
›  The choice of what information to merge is stochastic
›  Most offspring may be worse than, or the same as, the parents
›  The hope is that some are better, by combining elements of genotypes that lead to good traits

Parents (cut after position 3):    1 1 1 | 1 1 1 1      0 0 0 | 0 0 0 0
Offspring (tails exchanged):       1 1 1 | 0 0 0 0      0 0 0 | 1 1 1 1


Initialisation / Termination
›  Initialisation is usually done at random
›  need to ensure an even spread and mixture of possible allele values
›  can include existing solutions, or use problem-specific heuristics, to “seed” the population

›  Termination condition is checked every generation:
›  reaching some (known/hoped-for) fitness
›  reaching some maximum allowed number of generations
›  reaching some minimum level of diversity
›  reaching some specified number of generations without fitness improvement

Example: the 8-queens problem

Place 8 queens on an 8×8 chessboard in such a way that they cannot check each other


The 8-queens problem: representation

Phenotype: a board configuration

Genotype: a permutation of the numbers 1–8 (obvious mapping: the number in position i gives the row of the queen in column i)

Example genotype: 1 3 5 2 6 4 7 8

The 8-queens problem: fitness evaluation

›  Penalty of one queen: the number of queens she can check

›  Penalty of a configuration: the sum of the penalties of all queens

›  Note: penalty is to be minimized

›  Fitness of a configuration: the inverse penalty, to be maximized
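
A minimal sketch of this penalty-based fitness (function names and the exact inverse-penalty formula are assumptions, not fixed by the slides):

```python
def queen_penalty(perm, i):
    """Penalty of queen i: the number of other queens she can check (diagonals only,
    since a permutation already rules out shared rows and columns)."""
    return sum(1 for j in range(len(perm))
               if j != i and abs(perm[i] - perm[j]) == abs(i - j))

def configuration_penalty(perm):
    """Penalty of a configuration: the sum of the penalties of all queens."""
    return sum(queen_penalty(perm, i) for i in range(len(perm)))

def fitness(perm):
    """Fitness to maximize: one possible 'inverse penalty' (an assumed formula)."""
    return 1.0 / (1.0 + configuration_penalty(perm))

# Example with the genotype from the slides:
# configuration_penalty([1, 3, 5, 2, 6, 4, 7, 8])
```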


The 8-queens problem: mutation

Small variation in one permutation, e.g.:

•  swapping the values of two randomly chosen positions

1 3 5 2 6 4 7 8  →  1 3 7 2 6 4 5 8   (values at positions 3 and 7 swapped)
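
A minimal sketch of this swap mutation (the function name is an assumption, not from the slides):

```python
import random

def swap_mutation(perm):
    """Return a copy of the permutation with the values of two randomly chosen
    positions swapped."""
    child = list(perm)
    i, j = random.sample(range(len(child)), 2)   # two distinct positions
    child[i], child[j] = child[j], child[i]
    return child

# e.g. swap_mutation([1, 3, 5, 2, 6, 4, 7, 8]) might give [1, 3, 7, 2, 6, 4, 5, 8]
```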

The 8-queens problem: recombination

Combining two permutations into two new permutations:
•  choose a random crossover point
•  copy the first parts into the children
•  create the second parts by inserting values from the other parent:
• in the order they appear there
• beginning after the crossover point
• skipping values already in the child

Crossover point after position 3:
Parent 1:  1 3 5 | 2 6 4 7 8      Child 1:  1 3 5 | 4 2 8 7 6
Parent 2:  8 7 6 | 5 4 3 2 1      Child 2:  8 7 6 | 2 4 1 3 5
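
A minimal sketch of this "cut-and-crossfill" crossover (the function name is an assumption, not from the slides):

```python
import random

def cut_and_crossfill(p1, p2):
    """Cut both parents at a random point; fill each child's tail with the values
    of the other parent, in the order they appear there starting after the cut,
    skipping values already in the child."""
    n = len(p1)
    cut = random.randint(1, n - 1)

    def make_child(first, other):
        child = list(first[:cut])
        # scan the other parent starting right after the cut, wrapping around
        for k in range(n):
            value = other[(cut + k) % n]
            if value not in child:
                child.append(value)
        return child

    return make_child(p1, p2), make_child(p2, p1)

# With cut == 3 the parents from the slide give:
# ([1, 3, 5, 4, 2, 8, 7, 6], [8, 7, 6, 2, 4, 1, 3, 5])
# (here the cut point is drawn at random, so results will vary)
```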


The 8-queens problem: selection

›  Parent selection:
›  pick 5 parents and take the best two to undergo crossover
›  Survivor selection (replacement):
›  when inserting a new child into the population, choose an existing member to replace by:
›  sorting the whole population by decreasing fitness
›  enumerating this list from high to low
›  replacing the first member with a fitness lower than the given child
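
A minimal sketch of these two mechanisms (picking the 5 candidates uniformly at random is an assumption; the function names are not from the slides):

```python
import random

def select_parents(population, fitness):
    """Pick 5 candidates (here: uniformly at random, an assumed detail) and
    return the best two to undergo crossover."""
    candidates = random.sample(population, 5)
    candidates.sort(key=fitness, reverse=True)
    return candidates[0], candidates[1]

def insert_child(population, child, fitness):
    """Sort the population by decreasing fitness and replace the first member
    whose fitness is lower than the child's (if any)."""
    population.sort(key=fitness, reverse=True)
    for i, member in enumerate(population):
        if fitness(member) < fitness(child):
            population[i] = child
            break
    return population
```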

8 Queens Problem: summary

Note that this is only one possible set of choices of operators and parameters


Typical behavior of an EA
Stages in optimizing on a 1-dimensional fitness landscape

Early stage: quasi-random population distribution

Mid-stage: population arranged around/on hills

Late stage: population concentrated on high hills

Typical run: progression of fitness


[Plot: best fitness in the population vs. time (number of generations)]

Typical run of an EA shows so-called “anytime behavior”


Typical run: progression of fitness


[Plot: best fitness in the population vs. time (number of generations); most of the progress is made in the 1st half of the run, much less in the 2nd half]

›  Are long runs beneficial?
›  It depends on how much you want the last bit of progress
›  It may be better to do more short runs

Is it worth expending effort on smart initialization?


[Plot: best fitness in the population vs. time (number of generations). F = fitness level reached directly by smart initialisation; T = time needed to reach level F after random initialisation]

•  Answer: it depends.
• Possibly good, if good solutions/methods exist
• Care is needed


EAs as problem solvers: Goldberg view (1989)


[Plot: performance of methods across the scale of “all” problems. A special, problem-tailored method peaks sharply on its target problems; an evolutionary algorithm performs moderately well across the whole range; random search performs uniformly poorly]

EAs as problem solvers: Michalewicz view (1996)

[Plot: performance of several differently designed/tuned EAs (EA 1–EA 4) across the scale of “all” problems; each performs best on a different subset of problems, so there is no single, uniformly superior EA]


Genetic Algorithms

GA Quick Overview

›  Developed: USA in the 1970’s
›  Early names: J. Holland, K. DeJong, D. Goldberg
›  Holland’s original GA is now known as the simple genetic algorithm (SGA)
›  Other GAs use different:
›  Representations
›  Mutations
›  Crossovers
›  Selection mechanisms


Representation

SGA technical summary tableau

Representation        Binary strings
Recombination         N-point or uniform crossover
Mutation              Bitwise bit-flipping with fixed probability
Parent selection      Fitness-proportionate
Survivor selection    All children replace parents
Speciality            Emphasis on crossover


SGA reproduction cycle

1.  Select parents for the mating pool (size of mating pool = population size)
2.  Shuffle the mating pool
3.  Apply crossover to each consecutive pair with probability pc, otherwise copy the parents
4.  Apply mutation to each offspring (bit-flip with probability pm independently for each bit)
5.  Replace the whole population with the resulting offspring
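
A minimal Python sketch of this cycle for bit-list chromosomes (a sketch assuming an even population size; function and parameter names are not from the slides):

```python
import random

def sga_generation(population, fitness, pc, pm):
    """One SGA reproduction cycle on a population of bit lists."""
    n = len(population)
    # 1. fitness-proportionate selection of the mating pool (same size as population)
    weights = [fitness(ind) for ind in population]
    pool = random.choices(population, weights=weights, k=n)
    # 2. shuffle the mating pool
    random.shuffle(pool)
    offspring = []
    # 3. 1-point crossover on consecutive pairs with probability pc
    for p1, p2 in zip(pool[::2], pool[1::2]):
        if random.random() < pc:
            cut = random.randint(1, len(p1) - 1)
            c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        else:
            c1, c2 = p1[:], p2[:]
        offspring += [c1, c2]
    # 4. bit-flip mutation with probability pm per bit
    offspring = [[1 - bit if random.random() < pm else bit for bit in ind]
                 for ind in offspring]
    # 5. the offspring replace the whole population
    return offspring
```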

SGA operators: 1-point crossover

›  Choose a random point on the two parents

›  Split the parents at this crossover point

›  Create children by exchanging tails

›  pc is typically in the range (0.6, 0.9)


SGA operators: mutation

›  Alter each gene independently with a probability pm
›  pm is called the mutation rate
›  Typically between 1/pop_size and 1/chromosome_length

SGA operators: Selection


Main idea: better individuals get a higher chance
›  chances proportional to fitness

›  Implementation: roulette wheel technique
›  assign to each individual a part of the roulette wheel
›  spin the wheel n times to select n individuals

Fitness(A) = 3  →  3/6 = 50%
Fitness(B) = 2  →  2/6 ≈ 33%
Fitness(C) = 1  →  1/6 ≈ 17%


An example after Goldberg ‘89 (1)

›  Simple problem: max x² over {0, 1, …, 31}

›  GA approach:
›  representation: binary code, e.g., 01101 ↔ 13
›  population size: 4
›  1-point xover, bitwise mutation
›  roulette wheel selection
›  random initialization

›  We show one generational cycle done by hand

X2 example: selection


X2 example: Crossover

X2 example: Mutation
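
A minimal sketch of one generational cycle of this GA on max x² (the random draws will differ from the hand-worked example; the pc and pm values are assumptions, not from the slides):

```python
import random

def fitness(bits):
    """Decode the 5-bit genotype to an integer x and return x**2."""
    return int(bits, 2) ** 2

def one_generation(pop, pc=0.7, pm=0.01):
    """One SGA cycle: roulette-wheel selection, 1-point crossover, bit-flip mutation."""
    mating_pool = random.choices(pop, weights=[fitness(i) for i in pop], k=len(pop))
    random.shuffle(mating_pool)
    offspring = []
    for p1, p2 in zip(mating_pool[::2], mating_pool[1::2]):
        if random.random() < pc:                         # 1-point crossover
            cut = random.randint(1, len(p1) - 1)
            p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        offspring += [p1, p2]
    return ["".join(b if random.random() >= pm else str(1 - int(b)) for b in ind)
            for ind in offspring]                        # bitwise mutation

population = [format(random.randrange(32), "05b") for _ in range(4)]  # random init
next_generation = one_generation(population)
```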


The simple GA
›  Has been the subject of many (early) studies
›  still often used as a benchmark for novel GAs
›  Shows many shortcomings, e.g.:
›  the representation is too restrictive
›  the mutation & crossover operators are only applicable to bit-string & integer representations
›  the selection mechanism is sensitive to converging populations with close fitness values
›  the generational population model (step 5 in the SGA reproduction cycle) can be improved with explicit survivor selection

Alternative Crossover Operators


›  Performance with 1-point crossover depends on the order in which variables occur in the representation
›  more likely to keep together genes that are near each other
›  can never keep together genes from opposite ends of the string
›  this is known as positional bias
›  can be exploited if we know about the structure of our problem, but this is not usually the case


n-point crossover
›  Choose n random crossover points
›  Split along those points
›  Glue parts, alternating between parents

›  Generalisation of 1-point crossover (still some positional bias)

Uniform crossover
›  Assign 'heads' to one parent, 'tails' to the other
›  Flip a coin for each gene of the first child
›  Make an inverse copy of the gene for the second child
›  Inheritance is independent of position
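
A minimal sketch of uniform crossover as described above (the function name is an assumption, not from the slides):

```python
import random

def uniform_crossover(p1, p2):
    """For each gene, flip a coin: 'heads' copies the gene from p1 into child 1,
    'tails' from p2; child 2 always gets the inverse choice at that position."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < 0.5:          # heads
            c1.append(g1); c2.append(g2)
        else:                              # tails
            c1.append(g2); c2.append(g1)
    return c1, c2
```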


Crossover OR mutation?
›  Decades-long debate: which one is better / necessary / the main operator (vs. a background operator)

›  Answer (at least, rather wide agreement):
›  it depends on the problem, but
›  in general, it is good to have both
›  each has its own role
›  a mutation-only EA is possible; a xover-only EA would not work

Crossover OR mutation? (cont’d)

Exploration: discovering promising areas in the search space, i.e. gaining information about the problem
Exploitation: optimising within a promising area, i.e. using information

There is co-operation AND competition between them

›  Crossover is explorative: it makes a big jump to an area somewhere “in between” two (parent) areas
›  Mutation is exploitative: it creates small random diversions, thereby staying near (in the area of) the parent


Crossover OR mutation? (cont’d)

›  Only crossover can combine information from two parents

›  Only mutation can introduce new information (alleles)

›  To hit the optimum you often need a ‘lucky’ mutation

