
Solving TSP using Genetic Algorithms

Mateusz Matyjaszczyk - 140293481


Abstract
In the Travelling Salesman Problem (TSP), we wish to visit n cities in
the shortest possible time (or distance), without ever visiting any city twice.
An exhaustive search of every solution requires O(n!) time and is thus infeasible
for large n. An alternative to this is the Simple Genetic
Algorithm (SGA) which mimics the behaviour of natural selection to find the
best route. This report introduces such an algorithm, and then applies it to
the TSP using MATLAB. An investigation is made into the behaviour of the
algorithm under various parameters.

1 Introduction

1.1 Search Problems and Travelling Salesman Problem (TSP)

Suppose we have a collection of items, with each item being slightly different from
the other items in the collection. We wish to find an item in this collection with
specific characteristics. This is the simplest definition of a search problem. To find
this item, we could start by checking each item until we find the item we are looking
for. This is called an exhaustive search. Although such an approach is successful if
the collection in question is very small, it quickly breaks down for large collections
as the time taken to find the item grows drastically.
One famous example of a search problem is the Travelling Salesman Problem
(TSP). Suppose we are given a number of cities (nodes) and the distances between
these cities (edges) such that every city is connected with all the other cities. We
wish to compute the shortest distance in which we can travel to every city, without
visiting any city twice. There are n cities we wish to visit. Thus, when we choose
a starting point we have n − 1 cities to choose from for the next destination. Then,
there are n − 2 cities to choose from after that, and so on. Therefore, the number
of possible routes is
N = (n − 1)(n − 2) · · · 1 = (n − 1)!,


where N is the number of possible routes. However, due to this being an undirected graph, reverse paths have exactly the same total distance. That is, the route
{1, 2, 3, 4} is the same as the route {4, 3, 2, 1}. So, the number of unique routes is
given by

N = (n − 1)!/2.
Consider a small number of nodes, say n = 6. Then, there are N = 5!/2 = 120/2 = 60
different routes. We could easily use exhaustive search to find the value of every route
to obtain the shortest route. Next, consider a problem with n = 16 nodes. Then,
there are N = 15!/2 = 653,837,184,000 unique routes. Suppose we have a computer
that can examine 100,000 routes per second. To visit every route, it would take us
over 75 days, even at such a high speed of calculation. The time increase is even
more concerning for higher values of n.
In this project, it will be investigated how search problems with such large search
spaces can be tackled using Genetic Algorithms.
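These counts are easy to verify programmatically. The short Python check below is purely illustrative (the report's own code is written in MATLAB):

```python
import math

def unique_routes(n):
    # Number of distinct undirected tours through n fully connected cities:
    # fix a start city, order the remaining n-1 cities, and halve the count
    # because every route and its reverse have the same total distance.
    return math.factorial(n - 1) // 2

print(unique_routes(6))                  # 60 routes
print(unique_routes(16))                 # 653837184000 routes
seconds = unique_routes(16) / 100_000    # at 100,000 routes per second
print(seconds / 86_400)                  # roughly 75.7 days
```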

1.2 Overview of Genetic Algorithms

In the 1970s, Holland (1992) proposed an evolutionary algorithm to tackle search
problems, called the Simple Genetic Algorithm (SGA). This algorithm imitates
behaviour seen in nature during natural selection. In short, the algorithm has the
following procedure:
1. Represent some random solutions as chromosomes. Evaluate the fitness of the
chromosomes (how close the solutions are to the goal state).
2. Select the fittest chromosomes, called the parent chromosomes.
3. Cross the parent chromosomes using some well defined process to obtain the
children. Allow random mutations to occur.
4. Repeat the above steps until a given condition is met (e.g. a certain number
of repetitions, solution good enough, etc.)
Each of the above steps will now be discussed and some common techniques used
in Genetic Algorithms will be shown.
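The four steps above can be sketched as a generic loop. This is a Python illustration only; the report's own implementation is in MATLAB, and every helper passed in here is a user-supplied placeholder:

```python
import random

def sga(init_population, fitness, select, crossover, mutate, max_gens=200):
    # Step 1: create an initial population of chromosomes.
    population = init_population()
    for _ in range(max_gens):                     # step 4: repeat until done
        scored = [(fitness(c), c) for c in population]
        parents = select(scored)                  # step 2: pick the fittest
        children = []                             # step 3: crossover + mutation
        while len(children) < len(population):
            a, b = random.sample(parents, 2)
            children.extend(mutate(child) for child in crossover(a, b))
        population = children[:len(population)]
    return min(population, key=fitness)           # best solution found
```

Any representation, selection scheme, crossover and mutation operator discussed below can be plugged into this skeleton.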
2 Simple Genetic Algorithm (SGA)

2.1 Representation

Before any computation can take place, the solutions to the search problem need to
be expressed as chromosomes, which are made up of genes. In such a representation,
the user should be able to move between solutions (states) by performing only a small
change to the genes. For example, if the solution to a given search problem is an
integer, we can express it as a binary number. For example, the solution 105 could
be written as:

1 1 0 1 0 0 1

(Note: in a binary representation the bits/genes are counted from the right, starting
at 0.)

We can easily move between solutions by flipping one bit. So, by flipping bit 0, we
could go to the state:

1 1 0 1 0 0 0

(Michalewicz and Fogel, 2004, p.35) propose an alternative representation for the
TSP. Suppose we number our cities 1, . . . , n. We can express a route by a permutation
of these values. Thus, for a problem with n = 7 cities, this could be:

4 7 6 3 5 2 1

Suppose we wish to find a similar solution. We could simply increment a single
gene by 1. For example, changing bit 0 produces the following solution:

4 7 6 3 5 2 2

However, this solution is illegal since we visit the city labelled "2" twice, which
violates the problem description. An alternative would be to swap two consecutive
bits/genes. For example, swapping the two right-most genes produces the following
solution:

4 7 6 3 5 1 2
Typically, a genetic algorithm has a large number of chromosomes. Such a collection of chromosomes is called a population while a population at a given time
is called a generation. A large population is important for the later stages of the
algorithm, but it also increases the number of calculations, and thus decreases the
speed of the algorithm.
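The permutation representation and its legal "small change" can be sketched in Python (an illustration under assumed helper names, not the report's MATLAB code):

```python
import random

def random_route(n):
    # A chromosome is a permutation of the city labels 1..n.
    route = list(range(1, n + 1))
    random.shuffle(route)
    return route

def neighbour(route):
    # A legal small change: swap two consecutive genes. Unlike
    # incrementing a gene, this always yields another valid permutation.
    i = random.randrange(len(route) - 1)
    new = route[:]
    new[i], new[i + 1] = new[i + 1], new[i]
    return new
```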
2.2 Fitness Function

We need a way of distinguishing between how well the solutions perform, which in
genetic algorithms is called the fitness of the chromosome. A fitness function is
used to measure this. (Mitchell, 1996, p. 7-8) used a toy example of maximising a
function to illustrate the role of a fitness function. Suppose we have a quadratic
fitness function
f(x) = −x² + 14x + 17,        (1)
for which we wish to find the maximum. We have a binary representation with 5
bits, similar to the representation seen in Section 2.1. Thus, we can evaluate the
fitness using the above function for a small population of 4 chromosomes.
0 0 1 0 1   Value = 5    Fitness = 62

0 0 1 1 1   Value = 7    Fitness = 66

0 1 0 0 1   Value = 9    Fitness = 62

0 1 1 0 0   Value = 12   Fitness = 41

From this, we can see that the second chromosome is the fittest in this population
since we wish to obtain the maximum. Next, consider an evaluation function for the
TSP. Suppose we have the chromosome:
4 7 6 3 5 1 2
We are able to compute the distance between each consecutive city e.g. 4 and 7
then 7 and 6 and so on. Thus, the fitness of a route in TSP can be simply calculated
as the sum of those distances.
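This fitness evaluation is a one-line sum. The Python sketch below uses a small made-up distance matrix purely for illustration (it is not TSPLIB data, and `route_fitness` is not the report's MATLAB function):

```python
def route_fitness(route, dist):
    # The fitness of a TSP chromosome is the sum of the distances
    # between each pair of consecutive cities on the route.
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

# A tiny symmetric distance matrix for three cities (made up):
dist = {1: {2: 10, 3: 15}, 2: {1: 10, 3: 20}, 3: {1: 15, 2: 20}}
print(route_fitness([1, 2, 3, 1], dist))  # 10 + 20 + 15 = 45
```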

2.3 Selection

2.3.1 Roulette-Wheel Selection

Once we are able to assess the fitness of a given population, we can select the fittest
chromosomes, to which we can later apply some operators. The main intuition behind
this step is that by choosing the fittest chromosomes as parents, we will also produce
fit children using crossover later on. This idea is known as the Building Blocks
Hypothesis.

Roulette-Wheel Selection (RWS) is one of the most basic algorithms for selection.
In this method, each individual chromosome has a probability of being selected
proportional to its fitness. Jebari and Madiafi (2013) state the following procedure
for using RWS:

1. Let the chromosomes in the population be i = 1, . . . , n. Evaluate the fitness of
each chromosome using the fitness function f(i).

2. Calculate the total fitness, denoted by S = Σᵢ f(i).

3. For each chromosome, work out its proportional fitness k(i) = f(i)/S.

4. Choose each chromosome i with probability k(i).


Consider the example population seen in Section 2.2. Such a population had
four chromosomes {i1, ..., i4} with the corresponding fitnesses {62, 66, 62, 41}. The
total fitness of this population is S = 231. Thus, the probability of choosing each
chromosome is k = {0.27, 0.29, 0.27, 0.18} for each respective value of i. This can
then be expressed as intervals, such as [0, 0.27), [0.27, 0.27 + 0.29), and so on. We
can then select a parent by generating a random number from the standard uniform
distribution and choosing the interval into which this random number falls.
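The interval-walking procedure can be sketched as follows (a Python illustration of the scheme, assuming a fitness that is positive and to be maximised, as in the Section 2.2 example):

```python
import random

def roulette_wheel_select(population, fitnesses):
    # Each chromosome i is chosen with probability k(i) = f(i)/S:
    # draw u ~ U(0, S) and walk the cumulative fitness intervals.
    total = sum(fitnesses)
    u = random.uniform(0, total)
    cumulative = 0.0
    for chromosome, f in zip(population, fitnesses):
        cumulative += f
        if u <= cumulative:
            return chromosome
    return population[-1]  # guard against floating-point rounding

# The example population from Section 2.2 (fitnesses 62, 66, 62, 41):
pop = ["i1", "i2", "i3", "i4"]
fit = [62, 66, 62, 41]
```

Over many draws, `i2` is picked most often (probability 66/231 ≈ 0.29) and `i4` least often (41/231 ≈ 0.18).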
Jebari and Madiafi (2013) suggest that this method might be unsuitable when
a dominant chromosome that is much fitter than the other chromosomes is present
in the population. In such a case, this chromosome will almost always be selected,
and this could lead to premature convergence of the algorithm.
2.3.2 Tournament Selection

Tournament selection (TOS) is an alternative to RWS. This scheme involves choosing
a number of chromosomes and then performing a "tournament". The method can
be summarised using the following procedure:

1. Choose k chromosomes for the tournament. Order the chromosomes in descending
order of fitness. Let n = 0.

2. Let 0.5 ≤ p ≤ 0.9. Choose the fittest chromosome with probability p(1 − p)ⁿ.

3. If no chromosome has been chosen, continue: delete the fittest chromosome
from the tournament and increment n.

4. Run the above two steps until a chromosome has been chosen.
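The procedure can be sketched in Python (an illustration, not the report's MATLAB function; for the TSP the "fittest" route is the shortest one, so contestants are sorted by ascending distance):

```python
import random

def tournament_select(population, fitness, k=5, p=0.7):
    # Draw k chromosomes, rank them fittest-first, then accept the
    # n-th best with probability p(1 - p)^n, as in the procedure above.
    contestants = sorted(random.sample(population, k), key=fitness)
    n = 0
    while True:
        if random.random() < p * (1 - p) ** n:
            return contestants[n]
        n = (n + 1) % k  # nobody chosen in a full pass: start over
```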

Selection pressure is the degree to which fitter chromosomes are favoured to be
chosen. The more likely the fitter chromosomes are to be chosen, the higher the
selection pressure. Since the convergence of the algorithm is largely dependent on
selection pressure, Miller and Goldberg (1995) discuss that too little selection pressure
could lead to the GA not converging, or taking unnecessarily long to converge.
On the other hand, excessive selection pressure could lead to premature convergence,
meaning that the optimal solution is not found as the algorithm is stuck in a local
optimum. In addition, Miller and Goldberg (1995) suggest that for tournament selection
the selection pressure can be altered by a proper choice of k (the size of the tournament).
Selection pressure can also be altered by the choice of p: when p is high, the fitter
chromosomes are more likely to be chosen, which increases selection pressure.
Jebari and Madiafi (2013) have shown that GAs which implement TOS have a
better convergence rate than ones that use RWS. There exist many other selection
schemes, such as truncation selection and rank selection. The choice of scheme is
often determined by the problem to be solved and the selection pressure associated
with each scheme.

2.4 Crossover

2.4.1 One-point Crossover (1PX)

As outlined in Section 1.2, once some parent chromosomes have been selected, a
crossover operation needs to be applied to produce children (or offspring) chromosomes.
The most basic type of such an operation is the one-point crossover (1PX).
Suppose we have two binary chromosomes. We choose a single crossover point,
say between bits 1 and 2. (Note: a crossover point is shown by a double line between
consecutive bits/genes.)

Parent 1   1 0 1 ‖ 0 0
Parent 2   0 1 0 ‖ 1 0

Let the three bits to the left of the crossover point stay the same. The bits to
the right of the crossover point are swapped between parents. That is, bits 0 and 1
of parent 1 become bits 0 and 1 of parent 2 and vice versa. Thus, we create the
following children:

Child 1   1 0 1 ‖ 1 0
Child 2   0 1 0 ‖ 0 0


(Sivanandam and Deepa, 2007, p. 51) suggest that using 1PX might not be able
to pass desired characteristics to the children. This can occur when the good
information is stored at either end of the parent chromosomes. On the other hand,
the simplicity of 1PX reduces the run time of the GA.
2.4.2 Two-point Crossover (2PX)

To overcome the problem of desired characteristics not being passed on from the
parents to the children, an alternative could be to use a two-point crossover (2PX).
In such a crossover operator we define two crossover points. The children inherit
the head and the tail of the chromosome from the corresponding parent (parent 1
to child 1 and parent 2 to child 2). The information between the crossover points is
swapped between parents, that is, child 1 obtains the information from parent 2 and
vice versa.
Next, we illustrate this using a simple example. Suppose we have the same pair
of parent chromosomes as seen in Section 2.4.1, but the crossover points are now
between genes 0 and 1 and between genes 2 and 3:
Parent 1   1 0 ‖ 1 0 ‖ 0
Parent 2   0 1 ‖ 0 1 ‖ 0

Perform the crossover operation by copying the parents and then swapping the
bits between the two crossover points defined above. Rename the new chromosomes
as children.

Child 1   1 0 ‖ 0 1 ‖ 0
Child 2   0 1 ‖ 1 0 ‖ 0

Next, obtain the fitness of the children using (1). The fitness of parent 1
and parent 2 can be shown to be −103 and 57 respectively. Similarly, the fitness
of child 1 and child 2 is −55 and 41 respectively. Thus, performing the crossover
operator creates children that are genetically different from their parents, although
some characteristics of the parents are preserved.
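For binary chromosomes, 2PX is a pair of slice swaps. The Python sketch below reproduces the worked example (for simplicity the indices here count from the left, so the swapped middle segment is the 3rd and 4th symbol, i.e. genes 2 and 1 when counting from the right):

```python
def two_point_crossover(p1, p2, c1, c2):
    # Children keep the head and tail of the corresponding parent
    # and swap the middle segment between positions c1 and c2.
    child1 = p1[:c1] + p2[c1:c2] + p1[c2:]
    child2 = p2[:c1] + p1[c1:c2] + p2[c2:]
    return child1, child2

parent1 = [1, 0, 1, 0, 0]
parent2 = [0, 1, 0, 1, 0]
print(two_point_crossover(parent1, parent2, 2, 4))
# ([1, 0, 0, 1, 0], [0, 1, 1, 0, 0])
```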
2.4.3 Partially-Matched Crossover (PMX)

Next, consider an example of a chromosome representation that is a permutation.
Two parents have been selected and a 2PX operator is to be applied. The crossover
points are between genes 0 and 1 and between genes 2 and 3.


Parent 1 2 3

4 5

Parent 1 4 5

2 1

Child 1

2 3

2 1

Child 2

4 5

4 5

Performing such an operator creates children that are no longer permutations.
A way of addressing this problem is to use the partially-matched crossover (PMX).
Here, we will consider PMX for parents with two crossover points, but the technique
can be applied to any number of crossover points. Consider the example above,
where we obtained two children after performing the 2PX operator. Note the violations
of the permutations:
Child 1   2 3 2 1 1   Violations: 1, 2
Child 2   4 5 4 5 3   Violations: 4, 5

It is desired to keep the values of the genes between the crossover points (the
parts swapped between parents) the same, as this was the outcome of the operator
that we applied. Thus, we can change bits 0, 3 and 4. We adjust the children by
inputting the violations of child 2 into child 1 and vice versa. So, 4 and 5 will be
used to correct child 1. Moving from the left, the first violation is 1, which is located
in bits 0 and 1. We cannot change bit 1 because it is between the crossover points.
Thus, we change bit 0 to one of the violations from child 2. It does not matter which
violation we choose, but for consistency we will move from the smallest to the largest
value in the violation list. Thus, bit 0 becomes 4. The other violation of child 1
is in bits 2 and 4. We cannot change bit 2 as it is between the crossover points, so
we change bit 4 to a value from the violation list for child 2. The only value left in
the list is 5 (4 was used for the first violation), so bit 4 becomes 5. We apply the
same method to obtain a corrected child 2. For this example, the process can be
summarised as:

Child 1   2 3 2 1 1   Violations: 1, 2

First violation is 1, located in bits 0 and 1. Correct bit 0 (outside the crossover points):

Child 1   2 3 2 1 4

Second violation is 2, located in bits 2 and 4. Correct bit 4 (outside the crossover points):

Child 1   5 3 2 1 4

This is now a correct permutation. Perform a similar correction for child 2. We go from

Child 2   4 5 4 5 3

to

Child 2   2 1 4 5 3

which is a correct permutation. Thus, the corrected children are:

Child 1   5 3 2 1 4
Child 2   2 1 4 5 3

Various other crossover operators exist for permutations, the most notable
being cycle crossover (CX) and order crossover (OX). Kumar and Kumar (2012)
have shown that for the TSP, PMX performs better than the other two crossover
operators mentioned.
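A PMX repair can be sketched in Python. This is the common textbook variant, which repairs duplicates by following the mapping defined by the swapped segment; the hand-worked repair above fills in violations in a slightly different order, so the first child may come out as a different (but equally legal) permutation, while the second child matches exactly:

```python
def pmx(p1, p2, c1, c2):
    # Swap the middle segments [c1:c2] (left-counted indices), then
    # repair any duplicate outside the segment by replacing it via the
    # value mapping that the swap defines, until no conflict remains.
    def make_child(keep, donor):
        child = keep[:]
        child[c1:c2] = donor[c1:c2]
        mapping = dict(zip(donor[c1:c2], keep[c1:c2]))
        for i in list(range(c1)) + list(range(c2, len(child))):
            while child[i] in child[c1:c2]:
                child[i] = mapping[child[i]]
        return child
    return make_child(p1, p2), make_child(p2, p1)

# The parents from the text; the swapped segment is positions 2..3
# from the left (genes 2 and 1 when counting from the right).
print(pmx([2, 3, 4, 5, 1], [4, 5, 2, 1, 3], 2, 4))
# ([4, 3, 2, 1, 5], [2, 1, 4, 5, 3])
```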

2.5 Mutation

Suppose we have a population of chromosomes:

1 0 1 0 1
1 0 0 1 0
0 0 1 1 1
1 0 1 1 0

Consider bit 3. The only value this gene takes in the whole population is 0.
Thus, even if we perform a crossover operator, we might never be able to obtain
1 as the value of this gene. Thus, we need a way for the chromosomes to obtain
new information that is not present in the original population. To achieve this, in
his original proposal of a genetic algorithm, Holland (1992) proposed a mutation
operator. This operator allows each gene to be altered with some small probability
0 ≤ m ≤ 1.
For a binary representation, this can be done by generating a random number
u from the standard uniform distribution and then flipping the bit if u < m. To
illustrate this, consider the example below with m = 0.2.
Chromosome   1 0 1 0 1

Generate 5 random numbers ui for i = 1, ..., 5:

u = 0.76, 0.17, 0.74, 0.39, 0.66

Flip each bit if ui < m:

Updated chromosome   1 1 1 0 1
Thus, by performing the mutation operator we obtained a chromosome with
information that was not available in the original population as bit 3 was flipped.
(Sivanandam and Deepa, 2007, p. 56) suggest an alternative mutation operator,
where a mutation is only applied if it would improve the fitness of the chromosome.
However, such an operator would increase the running time of the algorithm.

For a permutation representation, we cannot apply the above mutation operator,
as it would create illegal chromosomes. Thus, an alternative would be to swap two
consecutive genes with probability m. This way, every mutation produces
chromosomes that are legal solutions (permutations).
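Both mutation operators can be sketched in a few lines of Python (illustrative helpers, not the report's MATLAB `mutate` function):

```python
import random

def mutate_binary(bits, m):
    # Flip each bit independently with probability m, using u ~ U(0, 1).
    return [b ^ 1 if random.random() < m else b for b in bits]

def mutate_permutation(route, m):
    # Flipping genes would break a permutation, so with probability m
    # we swap a pair of consecutive genes instead: the result is always
    # a legal route.
    route = route[:]
    if random.random() < m:
        i = random.randrange(len(route) - 1)
        route[i], route[i + 1] = route[i + 1], route[i]
    return route
```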

3 Applying SGA to TSP

3.1 Introduction

TSPLIB is an online library of sample TSP problems. Two such problems will be
selected and the genetic algorithm proposed below will be applied to them in MATLAB.
The two problems chosen are ulysses16 (n = 16) and ulysses22 (n = 22).

3.2 Representation and Evaluation Function

In Section 2.1 it was suggested that a permutation is an appropriate representation
for the TSP. The problem description states that we always wish to start and finish
in the same city. Thus, for ulysses16 we would always start and finish at the node
numbered 1. Between these two genes, we would have a permutation of the values
2 to 16. Thus, an example of a chromosome for the problem ulysses16 would be:

1 14 13 12 7 6 15 5 11 9 10 16 3 2 4 8 1
TSPLIB also provides a distance matrix between these cities. We can calculate
the score of this chromosome by obtaining the distance between consecutive cities
and then summing those distances. For example, considering the example above, the
distance can be calculated as follows:

Count   Current Node   Next Node   Distance
1       1              14          479
2       14             13          52
...     ...            ...         ...
16      8              1           60
                       Total       6859

This is also the optimal route, as defined by the problem description on TSPLIB.
The matrix of distances is saved in a .csv file in the directory of the program. This
allows the user to choose the distance matrix for which to calculate the route. Thus,
for the above route we can use the command:
solution=[1,14,13,12,7,6,15,5,11,9,10,16,3,2,4,8,1];
getScore(solution,'ulysses16dist.csv')
ans =
6859
Here, solution is a vector containing the chromosome and ulysses16dist.csv is the
name of the file where the matrix of distances is located. This function returns the
score, which agrees with the score provided by TSPLIB for this route.

3.3 Initial population

Once we are able to represent routes and score them, we can generate a population
of chromosomes. This will be used to obtain an initial population to which we will
then apply the genetic algorithm. Thus, if we wish to obtain 5 routes and their
scores for the ulysses16 problem, we can use the code:
[population,scores]=initialPopulation(5,16,'ulysses16dist.csv')
population =

     1     9     5    12     7     8    11    13     3     6   ...
     1    12     4    13     7    15     9     5     3    14   ...
     1     8    15    10     3    14     5     6     7    11   ...
     1     4    10    13     6    14     8    15    11     7   ...
     1    12    10     2    13     8     6     9    15     5   ...

scores =

   14079   13075   10968   13273   12882

Thus, this function correctly generates legal chromosomes of the problem requested.
The scores are also obtained.
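A Python analogue of this generator might look as follows (a sketch under assumed names; the actual `initialPopulation` is a MATLAB function reading TSPLIB data, whereas `dist` here is any nested distance matrix supplied by the caller):

```python
import random

def initial_population(size, n, dist):
    # Each chromosome starts and ends at city 1, with a random
    # permutation of the cities 2..n in between; the score of a route
    # is its total distance under the matrix `dist`.
    population, scores = [], []
    for _ in range(size):
        route = [1] + random.sample(range(2, n + 1), n - 1) + [1]
        population.append(route)
        scores.append(sum(dist[a][b] for a, b in zip(route, route[1:])))
    return population, scores
```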

3.4 Selection

In Section 2.3 we proposed two selection schemes: Roulette-Wheel Selection (RWS)
and Tournament Selection (TOS). Of these two, Jebari and Madiafi (2013) have
shown that TOS has a better convergence rate. Thus, for this particular genetic
algorithm we will use a TOS scheme.
The pseudocode outlined in Section 2.3.2 has been written as a function. To illustrate
the workings of this function, consider the ulysses16 problem with a population size of
10 chromosomes. We wish to perform the selection amongst k = 5 (size of tournament)
randomly chosen chromosomes from the population. Select chromosomes using the
function defined in Section 2.3.2 with p = 0.5. Run the following code, using an initial
population as generated before.
A=initialPopulation(10,16,'ulysses16dist.csv')
tournament_Selection(A,0.5,5)
A =

     1     9     8     7    13     6    15    12    14   ...
     1     3     5     7     6    13    12    15    11   ...
     1     5    12     7    14     2     3     6    10   ...
     1     4     2     6    10     7     9     8    14   ...
     1     3     2    10     8    11     9     6    14   ...
     1    14    15     7    11     5     8    10     3   ...
     1     6    14    15     7     4     9     2    13   ...
     1     3    14    11     8     6    13     4     5   ...
     1     9     2    13    15     4    12     8     6   ...
     1     9     7     8    15    12     5     3     2   ...

ans =

     1     9     8     7    13     6    15    12    14   ...
     1     6    14    15     7     4     9     2    13   ...

Hence, the function works correctly by choosing some chromosomes from a given
population.

3.5 Crossover

Once some chromosomes have been selected as parents, they will need to be crossed
over to obtain children chromosomes. In Section 2.4 some common crossover operators
were defined. In Section 2.4.3 it was mentioned that Partially-Matched
Crossover (PMX) is a suitable choice for a permutation representation and that Kumar
and Kumar (2012) have shown this operator to have better convergence rates
than other common operators for permutation representations. Thus, this genetic
algorithm will use a PMX operator.
To perform PMX, we need to define the crossover points. Since we want
our genetic algorithm to apply to problems with different numbers of nodes, it is
unsuitable to keep the crossover points constant. For example, crossover points of
5 and 12 might be suitable for ulysses16, since the majority of the information within
the chromosomes will be swapped between the parents. However, such crossover points
would be unsuitable for larger problems such as berlin52 with n = 52 nodes, since the
genes in the interval (12, 52) after the second crossover point would never be altered.
Thus, we define the crossover points to be at 0.3 and 0.7 of the number of nodes
(rounded to the nearest whole number).
Run the following code to check the crossover function.
A=initialPopulation(10,16,'ulysses16dist.csv');
selected=tournament_Selection(A,0.5,5)
children=crossover2PMX(selected)
selected =

     1    13    10     4    14     6    11    15     8     9     7   ...
     1     5    10     9     4     8     3    13     6    12    14   ...

children =

     1    15    10     4    14     9     3    13     6    12     7   ...
     1     5    10     9     4     8    11    15     3     6    14   ...

We can see that the children are significantly different from the parents, with
the information between the crossover points being swapped between the parents to
create the child chromosomes. Thus, the crossover operator works correctly.

3.6 Mutation

In Section 2.5 it was mentioned that we can perform mutation in a permutation
representation by swapping two consecutive genes with probability m. Such a function
was written and it will now be tested using a random population of two chromosomes.
Let m = 0.1.
A=initialPopulation(2,16,'ulysses16dist.csv')
mutate(A,0.1,2,16)
A =
1 5 2 3 12 8 9 7 6 4 15 10 11 14 13 1
1 14 9 11 8 2 5 10 7 (4 12) 15 6 13 3 1
ans=
1 5 2 3 12 8 9 7 6 4 15 10 11 14 13 1
1 14 9 11 8 2 5 10 7 (12 4) 15 6 13 3 1

Here, we can see that in the bottom chromosomes 4 and 12 have swapped positions
(shown in brackets). Thus, this function works correctly.

3.7 Termination criteria

Before running a genetic algorithm, we need to define a termination criterion. This
allows the genetic algorithm to terminate when a desired solution is found.
One option is for the function to look for a certain solution. Thus, the program
runs until a chromosome with a specific score is reached. If we are looking
for the best solution to the ulysses16 problem, we would terminate the algorithm when
the best score in the population is 6859. This approach is valid when we are looking
for a solution that is good enough, e.g. we want a solution with a score lower
than 7500 but we are not interested in the optimal one. However, if we wish to find
an optimal route, we would usually not know the score of the optimum solution, so
we cannot use this approach in such a case. This approach can also result in the
algorithm getting stuck in a local minimum. If we wish to run the ulysses16 problem
until we find a solution with a score lower than 9500, we run the following code:
main(50,30,0.1,9500,0.5,'ulysses16')
A second option could be to terminate the algorithm after the solution has
not improved for a certain number q of generations. For example, suppose we are
running an algorithm with q = 50. In iteration 70 we improved our solution. We keep
running the algorithm and find that the solution in iteration 120 has the same score
as in iteration 70, so we terminate the algorithm. Such an approach could lead to
premature termination if the algorithm cannot escape a local optimum. This
termination criterion can be run using the following command with q = 50:
main2(30,20,0.1,50,0.5,'ulysses16')
The final option for a termination criterion is a combination of the two criteria
above. We would run the algorithm while both of these criteria are satisfied: the score
of the best chromosome is greater than some threshold, and the solution has improved
in the last q iterations. If the second condition is broken, the algorithm restarts.
This is repeated until we find a solution with a score lower than the one we are looking
for. Suppose we wish to find a solution with a score less than 8500 for the ulysses16
problem. We also let q = 50, that is, we restart the program if no improvement has
been made in 50 generations.
main3(30,20,0.1,50,0.5,'ulysses16',8500)
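The combined criterion can be sketched as a restart loop. In this Python illustration, `run_ga` is a hypothetical stand-in for the main GA loop (the report's `main`-style MATLAB functions); it is assumed to stop on its own after `q` generations without improvement and to return its best route and score:

```python
def run_with_restarts(run_ga, target, q=50, max_restarts=20):
    # Keep restarting the GA (which itself stops after q generations
    # without improvement) until a run returns a best score at or
    # below `target`, or the restart budget is exhausted.
    best_route, best_score = None, float("inf")
    for _ in range(max_restarts):
        route, score = run_ga(stall_limit=q)
        if score < best_score:
            best_route, best_score = route, score
        if best_score <= target:
            break
    return best_route, best_score
```

The `max_restarts` cap is an added safeguard so the loop cannot run forever if the target score is unreachable.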

3.8 Visualising the performance of the algorithm

In order to help determine how well a given run of the algorithm performs, some
characteristics of each generation need to be captured. One such characteristic is
the fitness score of the fittest chromosome, which is the distance of the best solution
found so far. The change of this over time can tell us how fast the algorithm converges
and whether it is getting stuck in a local optimum. This allows us to determine the
behaviour of the algorithm under certain parameters, which in turn allows us to
make appropriate choices of the parameters in order to find the optimum solution.
Another characteristic of the generation we may be interested in is the average
fitness score of all chromosomes in the population. This can tell us how diverse our
population is. The more diverse a population is, the smaller the selection pressure,
and this results in the algorithm being less likely to get stuck in a local optimum.
Given some termination criteria, we want to compare the fitness of the initial
population with the fitness of the final population. To do this, we may take the best
solution in each of these generations and plot it. TSPLIB provides the positions of
the nodes in the problem which can then be plotted as points in the right order to
visualise a path. When calling a problem, say ulysses16, the program will read the
positions from the file ulysses16pos.csv, provided such a file is located in the same
directory. We will show the workings of such a function by generating two random
chromosomes from the ulysses16 problem and plotting the solutions. Note that here
the two solutions are randomly generated using the initial population function seen
earlier on, but in the main program these will be the best solutions in the first and
final generations.
A=initialPopulation(2,16,'ulysses16dist.csv')
drawRoute(A(1,:),A(2,:),'ulysses16dist.csv','ulysses16pos.csv')
The output of this command can be seen in Figure 1.
Figure 1: An example of visualising two random solutions.

4 Results

4.1 Investigating the effect of the size of the population

It will now be investigated how the size of the population affects the convergence
of the algorithm. To investigate this, the program obtains the value of the fittest
chromosome of each consecutive generation and upon terminating plots these values against the number of the generation. In addition, the average fitness of each
generation is also shown as this can be used to comment on the diversity of the
population.
The algorithm was run with three different sizes of the population: 10, 25 and
40. The program uses the second termination criteria (see Section 3.7) so it assumes
that the program is stuck in a local optima when an improved solution has not been
found in the last 150 generations. The other parameters have been kept constant.
The results of such a test are shown in Figure 2.
Firstly, consider the number of generations performed by each execution. The
number of generations increases as the population size increases: we have 191
generations for a population size of 10, 339 generations for a population size of 25 and
finally 751 generations for a population of size 40. Holland (1992) states that the
genetic algorithm aims to avoid getting stuck in a local optimum by keeping a large
population of solutions. The results obtained here correspond to this, as the number
of iterations before the run is stuck in a local optimum increases as the population
size increases.
Also, compare the values of the minimum and the average. The difference between
these two values tells us how varied our population is. The closer the two are, the
more uniform the population is. Uniform populations are undesirable as this increases
the chance of the algorithm getting stuck in a local optimum. In Figure 2 we can
clearly see that the population is much more diverse for larger population sizes, due
to the bigger difference between the minimum and the average. It is worth noting
that the best solution found is closer to the optimal solution for higher population
sizes.
However, increasing the population size also causes the running time of the
algorithm to increase. In Figure 2 the running time of each run of the algorithm has
been noted. Since the runs have a different number of generations, we need to work
out the time taken to obtain each generation. Thus, we divide the running time by
the number of generations for each run. We obtain values of 0.055, 0.075 and 0.095
for the population sizes of 10, 25 and 40 respectively. Thus, the running time per
generation increases with the population size.
To conclude, we have shown how the population size can affect the convergence
of the genetic algorithm. Thus, it is important to keep the population diverse by
having a large population size at each generation. However, we have also shown that
increasing the population size increases the running time. Therefore, due to this
trade-off, the population size should be large enough to avoid local optima but not
overly large so that the computation time becomes infeasible.
4.2 Investigating the effect of mutation

We have applied a mutation operator to help keep genetic diversity in each generation. It will now be investigated how different mutation rates can affect the
convergence of our algorithm. Results of such an experiment are shown in Figure 3.
When m = 0, there is no mutation present. Thus, if some information is not
present in the initial generation, we will not be able to reach solutions with those
characteristics. Thus, an algorithm which does not use the mutation operator is
much more likely to get stuck in a local optima, where every solution eventually
becomes the same. This is the case in the graph for m = 0 in Figure 3 as the average
and minimum solutions are identical for generations > 25 which indicates that all the
chromosomes in the population are exactly the same. Note that the last generation
the minimum solution has improved in is generation 9.
As we increase the mutation rate, we also increase the difference between the
average and the minimum solution. Thus, mutation helps us to maintain a diverse
population. In turn, this allows the algorithm to not converge prematurely.
The number of generations of each run is also noted. We can see that as we increase m we also increase the number of generations. This corresponds with what we found above: higher mutation rates decrease the chance of getting stuck in a local optimum.
The minimum solutions also seem to be lower for the higher mutation rates. However, very high mutation rates are also undesirable. In such cases, the genetic algorithm behaves more like a random walk, as good characteristics are less likely to be preserved, which could lead to the algorithm taking unnecessarily long to converge.
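The mutation operator used in this report swaps two consecutive genes with some probability. The MATLAB implementation is not reproduced here, but a minimal Python sketch of such an operator could look like:

```python
import random

def mutate(tour, m):
    """Swap each pair of consecutive genes with probability m.

    With m = 0 the tour is returned unchanged; a large m scrambles
    the tour so heavily that good sub-routes are rarely preserved.
    """
    tour = tour[:]  # work on a copy, leave the parent intact
    for i in range(len(tour) - 1):
        if random.random() < m:
            tour[i], tour[i + 1] = tour[i + 1], tour[i]
    return tour
```

Note that a swap always yields a valid permutation, so no repair step is needed afterwards.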

4.3 Optimal solution found

For the ulysses16 problem an optimal solution was found, and it can be seen in Figure 4. The command used to obtain this result is
main3(30,20,0.1,150,0.5,ulysses16,6859).
As was found in Section 4.1, higher population sizes tend to keep the population more diverse, which is a desired characteristic. However, an overly large population size slows the algorithm down considerably. Thus, a population size of 30
was chosen, which allowed for a diverse population while keeping the computation
reasonably fast.
We have also chosen the mutation parameter m = 0.1. This was investigated
in Section 4.2 and it was found that mutation can help in keeping the population
diverse. When m is too large, the algorithm behaves more like a random walk,
meaning that the algorithm runs unnecessarily long. A value was chosen between
these two extremes.
We are also required to specify two parameters for tournament selection. We
need to choose k (the size of the tournament) and p (probability of selecting each
chromosome). It is desired to keep k large as this allows for a larger variety of
chromosomes to be selected. Recall that 0.5 ≤ p ≤ 0.9. A small p means that less fit chromosomes are more likely to be chosen, while a large p means that the fittest parents are almost always chosen. Thus, a small p was chosen, so that the algorithm is more likely to pick less fit parents, which in turn makes it less likely to get stuck in a local optimum.
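Under the usual formulation of probabilistic tournament selection (rank the k sampled chromosomes by fitness, then accept the best with probability p, the second best with probability p(1 − p), and so on), the scheme can be sketched in Python as follows; the function and parameter names are illustrative, not the report's MATLAB code:

```python
import random

def tournament_select(population, tour_length, k, p):
    """Pick one parent: sample k chromosomes, rank them by tour length
    (shorter = fitter), then accept the i-th ranked one with
    probability p * (1 - p)**i."""
    contenders = sorted(random.sample(population, k), key=tour_length)
    for chromosome in contenders:
        if random.random() < p:
            return chromosome
    return contenders[-1]  # rare fallback: accept the weakest
```

With p near 0.9 the fittest contender nearly always wins; with p near 0.5 weaker parents win often enough to slow premature convergence, which is the reasoning applied above.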
Finally, the remaining two parameters are the solution to be found (in this case 6859) and q (the number of generations after which the algorithm is restarted if a solution has not been found). Finding the optimal solution using this command can take some time, and the time varies between executions due to the randomness in selection and mutation.
Next, the algorithm was applied to the ulysses22 problem. The following command was used to find a solution with a score of 7200 or less.
main3(20,20,0.1,150,0.5,ulysses22,7200).
The results can also be seen in Figure 5. The algorithm was unable to find the optimal solution of 7013 due to being stuck in a local optimum. However, the solution found, with a score of 7153, is still a very good solution. When considering the graphed version of this solution (shown in the bottom figure), it can be seen that this is close to what one would imagine the optimal solution to be. The only difference between this solution and the optimal one is probably a few swaps in the very congested area in the middle. The solution found is superior to the best
chromosome in the initial population.
Finally, this example shows how genetic algorithms can be applied in practice. When faced with a large TSP problem, we will usually not know the optimal solution. Thus, we would not be able to tell whether the solution found is optimal. We could instead search for a solution that is below a certain threshold (7200 here). Once such a solution has been found, we would decrease the threshold and look for a solution below this new threshold. We could continue this until the algorithm fails to find a solution below the current threshold within a reasonable time.
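The threshold-lowering procedure just described can be sketched as a simple loop. Here run_ga is a hypothetical stand-in for one full GA run, assumed to return the best score found below the threshold, or None if it runs out of time:

```python
def threshold_search(run_ga, threshold, step=1):
    """Repeatedly lower the target score until the GA stops delivering.

    run_ga(threshold) is assumed to return the best score found below
    the threshold, or None if no such tour was found in time.
    """
    best = None
    while True:
        score = run_ga(threshold)
        if score is None:
            return best  # no improvement in time: keep the last best
        best = score
        threshold = score - step  # demand strictly better next round
```

For ulysses22 this would start at 7200 and walk the threshold down towards the (unknown) optimum.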

Conclusion

In this project, we have outlined the workings of the Simple Genetic Algorithm. We have introduced a number of selection schemes (Roulette Wheel Selection, Tournament Selection) and crossover operators (One-Point Crossover, Two-Point Crossover, Partially Matched Crossover), as well as the general approach to solving search problems (fitness function, representation, etc.). These tools can be applied to solve a range of search problems. However, many other operators exist for the genetic algorithm, which may be better suited to different problems. Michalewicz and Fogel (2004) mention that there are as many genetic algorithms as there are problems that they are built to solve.
With the knowledge of the workings of the Genetic Algorithm, we have applied
this to the Travelling Salesman Problem. We considered two problems: ulysses16
with 16 nodes and ulysses22 with 22 nodes. We defined a permutation representation
and then applied the knowledge of genetic algorithms to choose suitable operators for
this algorithm. We chose a tournament selection scheme and partially matched crossover, choices based on research that has shown these to be the most suitable operators for the TSP. We introduced
a mutation operator that swaps two consecutive genes with some probability.
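For reference, a Python sketch of the standard partially matched crossover (PMX) on the permutation representation is given below; the cut points are chosen by the caller, and details may differ from the report's MATLAB implementation:

```python
def pmx(parent1, parent2, cut1, cut2):
    """Copy parent1's segment [cut1:cut2] into the child, then place
    parent2's displaced genes via the segment's positional mapping,
    so the result is always a valid permutation."""
    n = len(parent1)
    child = [None] * n
    child[cut1:cut2] = parent1[cut1:cut2]
    copied = set(child[cut1:cut2])
    for i in range(cut1, cut2):
        gene = parent2[i]
        if gene not in copied:
            # follow the mapping until we land outside the segment
            pos = i
            while cut1 <= pos < cut2:
                pos = parent2.index(parent1[pos])
            child[pos] = gene
    # remaining positions come straight from parent2
    for i in range(n):
        if child[i] is None:
            child[i] = parent2[i]
    return child
```

For example, pmx([1,2,3,4,5,6,7,8], [3,7,5,1,6,8,2,4], 3, 6) yields [3,7,8,4,5,6,2,1], a valid tour inheriting the middle segment 4, 5, 6 from the first parent.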
To help visualise the workings of the algorithm, we graph some characteristics of
the population over the running of the algorithm. We also visualise the initial and
optimal solutions as graphs to show the difference between the starting solution and
the optimal one found. Finally, we defined three different termination criteria for
the algorithm.
We then investigated the effect of the size of the population and the mutation
probability. We found that higher population sizes help to maintain diversity of
the population. However, there is a trade-off as the running time of the algorithm
is extended. It was also shown that mutation has a similar trade-off where some
mutation can help to keep a diverse population but a very high mutation rate could
lead to the algorithm taking unnecessarily long to converge.
Using these results, we ran the algorithm to find the best solution for the ulysses16 problem. We knew the optimal solution, so the algorithm was allowed to run until this specific solution was found. Such an approach is not useful in practice, as we are usually not aware of the optimal solution. Thus, an approach where we look for a solution below a certain threshold was suggested. The best solution found for the ulysses22 problem was 7153, whilst it can be shown that the optimal solution for that problem is 7013.
This project can easily be extended to look at problems with a larger number
of nodes. There are many other selection schemes and crossover operators that have not been covered here and could also be investigated. Finally, other metaheuristic algorithms, such as simulated annealing or tabu search, can be applied to solve this problem.

References
J.H. Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 2nd Edition. A Bradford Book, 1992.
K. Jebari and M. Madiafi. Selection methods for genetic algorithms. Int. J. Emerg. Sci., 3:333–344, 2013.
N. Kumar, K. Bidhan, and R. Kumar. A comparative analysis of PMX, CX and OX crossover operators for solving travelling salesman problem. IJLRST, 1:98–101, 2012.
Z. Michalewicz and D.B. Fogel. How to Solve It: Modern Heuristics, 2nd Edition. Springer, 2004.
B.L. Miller and D.E. Goldberg. Genetic algorithms, tournament selection, and the effects of noise. Comp. Sys., 9:193–212, 1995.
M. Mitchell. An Introduction to Genetic Algorithms. A Bradford Book, 1996.
S.N. Sivanandam and S.N. Deepa. Introduction to Genetic Algorithms. Springer, 2007.


Figure 2: The graphs for the average and minimum path for different settings of the population size. The other parameters have been kept constant at meaningful values. The runtime of each execution is also shown.

(a) Population size = 10. Runtime: 10.42 seconds.

(b) Population size = 25. Runtime: 25.47 seconds.

(c) Population size = 40. Runtime: 71.37 seconds.


Figure 3: The graphs for the average and minimum path for different settings of the mutation rate. The other parameters have been kept constant at meaningful values.

(a) Mutation rate m = 0

(b) Mutation rate m = 0.01

(c) Mutation rate m = 0.05

(d) Mutation rate m = 0.10


Figure 4: The optimal solution found for the ulysses16 problem

(a) Behaviour of the algorithm over time

(b) Optimal route


Figure 5: The best solution found for the ulysses22 problem

(a) Behaviour of the algorithm over time

(b) Best route found
