
Cybernetics and Systems Analysis, Vol. 39, No. 1, 2003

GENETIC ALGORITHMS USED TO SOLVE SCHEDULING PROBLEMS

N. N. Glibovets and S. A. Medvid’ UDC 681.3.06

A general description of genetic algorithms is given. An optimal scheduling problem is examined from
the viewpoint of real needs of a modern university. A solution, based on a genetic algorithm, is
proposed for the problem. The genetic algorithm implementations are reviewed and their convergence
rate and quality are analyzed.

Keywords: genetic algorithm, evolutionary computation, reproduction modes, mutation of chromosomes, selection of chromosomes, optimization problem.

Genetic algorithms are considered a successful approach to solving optimization problems. Their underlying principle [1–6] was proposed by J. Holland (University of Michigan) in 1975 [7]. Genetic algorithms have a large number of implementations, with various optimization, improvement, and tuning techniques united by the common idea of refining the basic algorithm.
Operation of any genetic algorithm is based on the principle of evolution first described by Ch. Darwin. There is a mathematical result, the so-called Schema Theorem, that explains the reason for the efficiency of such an approach [8, 9]. As in the natural model of evolution, an optimal solution is sought in parallel among a great number of variants. Therefore, the terminology used for genetic (or evolutionary) algorithms is basically that of Darwin's theory of evolution.
As distinct from ordinary algorithms, where an optimal solution is sought linearly, in a single variant, evolutionary algorithms maintain a set of alternative solutions, which compete with each other and vary according to definite laws during operation of the algorithm. The best solution is finally selected and is considered the result of the algorithm.
Let us illustrate the practical application of genetic algorithms by solving an NP-complete optimization problem: scheduling in an educational institution. Such a problem has to be solved in any college, and the time required to solve an NP-complete problem exactly grows drastically even for an insignificant increase in the size and complexity of the input data.

PRINCIPLE OF GENETIC ALGORITHMS OPERATION

Let us consider the basic principle of operation of a classical genetic algorithm, which can be divided into three
stages: initialization, simulation of the evolutionary process, and completion of operation.
Initialization. A population (set) of a significant number of candidate solutions is created. In the general case, no conditions are imposed on their generation: they are completely random and are optimized further during the second part of the algorithm. However, in actual implementations of genetic algorithms, various heuristics are frequently used already when creating the initial population of solutions. These heuristics are specific to each optimization problem, but their main purpose is to increase the average optimality already among the initial solutions. This, in turn, significantly reduces the operating time of the main body of the algorithm.

“Kyiv-Mohylyanska Academy” University, Kiev, Ukraine, glib@ukma.kiev.ua; mserge@ukma.kiev.ua. Translated from Kibernetika i Sistemnyi Analiz, No. 1, pp. 95-108, January-February 2003. Original article submitted February 11, 2002.

In classical implementations of genetic algorithms, alternative solutions are specified as a set of bit strings, each string being called a chromosome and each bit, respectively, a gene. How the chromosomes are represented (for example, as bit strings) is of no semantic importance, except for its influence on the efficiency of a specific program implementing the algorithm.
Simulation of Evolutionary Process. The next phase of the algorithm is the longest and the most important, since evolutionary laws are applied to the initial population. As is generally known, the main principles of evolution are inheritance, mutation, and selection. It is these rules that are applied iteratively to the initial population, which starts varying continuously under their action. It turns out to be convenient to simulate natural evolutionary processes by considering each solution as a chromosome consisting of indivisible parts (genes). Therefore, at the implementation stage the solutions are represented as chromosomes, and in the algorithm the laws of evolution are realized as follows.
• The mutation operator is applied to each gene of all chromosomes of the first generation; with a small probability, it changes the gene to some other value.
• Sections of chromosomes of the first generation are interchanged; thus, chromosomes of the second generation are formed. The higher the estimate of a first-generation chromosome, the larger the probability of its participation in formation of the next generation.
• As soon as the same number of chromosomes is formed in the second generation as in the first one, the second generation completely replaces the first one and the iteration repeats.
It is clear intuitively that the operation parameters of the second phase of the genetic algorithm can be selected so that the population of solutions improves step by step, i.e., becomes more optimal. The type of reproduction used in the genetic algorithm determines how generations are changed. Two main types of reproduction are distinguished: the generation mode and the steady-state mode.
The generation type of reproduction means a complete replacement of the population with the next generation at each
iteration of the algorithm [7]. Let the number of chromosomes in the population be equal to N. Then at each iteration of the
algorithm N / 2 pairs of parent chromosomes are selected, using which N children chromosomes will be generated. The
generation thus obtained completely substitutes the parent generation.
In steady-state reproduction, as distinct from the generation mode, each iteration processes only one pair of chromosomes, i.e., the previous generation is not replaced as a whole. The children chromosomes generated are returned to the initial population and replace the worst solutions contained in it [10].
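To make the overall scheme concrete, a minimal Python sketch of this main loop is given below. The fitness, crossover, and mutate callables are placeholders (assumptions), and only the two replacement strategies described above are shown; this is an illustration, not the authors' implementation.

import random

def evolve(population, fitness, crossover, mutate, mode="generation",
           generations=1000, target=0.99):
    """Minimal sketch of a genetic algorithm loop (illustrative only).

    population -- list of chromosomes (any representation)
    fitness    -- callable: chromosome -> value in [0, 1], larger is better
    crossover  -- callable: (parent1, parent2) -> (child1, child2)
    mutate     -- callable: chromosome -> chromosome (small random change)
    mode       -- "generation" (full replacement) or "steady" (steady state)
    """
    n = len(population)
    for _ in range(generations):
        if mode == "generation":
            # Build a complete new generation from N/2 parent pairs.
            new_population = []
            for _ in range(n // 2):
                p1, p2 = select(population, fitness), select(population, fitness)
                c1, c2 = crossover(p1, p2)
                new_population += [mutate(c1), mutate(c2)]
            population = new_population
        else:
            # Steady state: one pair per iteration, children replace the worst.
            p1, p2 = select(population, fitness), select(population, fitness)
            c1, c2 = crossover(p1, p2)
            population.sort(key=fitness)           # worst solutions first
            population[0], population[1] = mutate(c1), mutate(c2)
        best = max(population, key=fitness)
        if fitness(best) >= target:                # completion condition
            break
    return max(population, key=fitness)

def select(population, fitness, t=2):
    """Binary tournament selection: the better of t random chromosomes."""
    return max(random.sample(population, t), key=fitness)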
Crossovers and Mutations. The main function of mutations and of interchanging definite parts of chromosomes (so-called crossovers) is examination of variants of solutions. A specific implementation of the algorithm should provide a potential possibility of obtaining any solution from the whole space of possible solutions using only crossovers and mutations. Indeed, a crossover allows us to examine a large number of variants efficiently (since the initial solutions were formed arbitrarily). The main task of mutation is to ensure passage to the solutions that cannot be reached from the initial state of the population using crossovers alone. Therefore, the probability of a mutation is a subject of special study [11–14]. On the one hand, mutation should not bring chaos into the population by destroying “good” solutions (this situation is observed for a high mutation probability). A vivid example of the unacceptability of a high level of mutation is that, under a high level of radiation, living organisms usually die out after a few generations. On the other hand, mutation should provide wide opportunities for exhaustive search for solutions. It is not in vain that mutation plays a key role in nature: evolution would be impossible without it.
To produce the next generation, a set of genetic crossover operators is used, mainly intended to generate the next generation of children chromosomes by recombining parts of parent chromosomes through an interchange of segments between pairs of chromosomes. A “two parents” scheme is used most frequently in genetic algorithms, but some studies present genetic algorithms in which more than two parents take part in generating the children chromosomes [3]. The following types of crossovers are used most frequently: one-point, two-point, N-point, and homogeneous. In a one-point crossover, two chromosomes are selected and a random crossing point is generated. One-point crossover is studied in [8, 15]. This process is shown in Fig. 1.
Two-point crossover (Fig. 2) differs from the single-point one only in that two crossing points are selected arbitrarily, and the chromosomes interchange the sections between these two points.
An N-point crossover is a generalization of the one- and two-point crossovers obtained by defining N crossing points. These points divide the chromosomes into N + 1 sections, and only the even (or only the odd) sections are exchanged.
A homogeneous crossover (Fig. 3) was first described in [4]. Each gene of the first parent has a 50% chance to exchange places with the respective gene of the second parent.

Parents
..... 34 22 73 54 25 67 12 35 65 85 49 92 44 57 .....

..... 69 77 54 16 23 54 80 69 44 78 12 33 24 55 .....
Crossover point
Children
..... 34 22 73 54 25 67 12 69 44 78 12 33 24 55 .....

..... 69 77 54 16 23 54 80 35 65 85 49 92 44 57 .....

Fig. 1. One-point crossover.

Parents
..... 34 22 73 54 25 67 12 35 65 85 49 92 44 57 .....

..... 69 77 54 16 23 54 80 69 44 78 12 33 24 55 .....
Crossover point 1 Crossover point 2
Children
..... 34 22 73 16 23 54 80 69 44 78 49 92 44 57 .....

..... 69 77 54 54 25 67 12 35 65 85 12 33 24 55 .....

Fig. 2. Two-point crossover.

Parents
..... 34 22 73 54 25 67 12 35 65 85 49 92 44 57 .....

..... 69 77 54 16 23 54 80 69 44 78 12 33 24 55 .....
Children
..... 69 22 73 16 23 67 80 35 65 85 12 92 24 55 .....

..... 34 77 54 54 25 54 12 69 44 78 49 33 44 57 .....

Fig. 3. Homogeneous crossover.

The genes that are candidates for crossover are marked in bold in one of the parent chromosomes in the original figure. The choice was arbitrary, with 50% probability for each gene. The children chromosomes were derived from the parent chromosomes by interchanging the genes selected at the previous step.
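For chromosomes represented as lists of genes, as in Figs. 1–3, these three crossover operators can be sketched in Python roughly as follows (an illustrative sketch, not the authors' code).

import random

def one_point_crossover(p1, p2):
    """Exchange the tails of two parents after a random crossover point."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def two_point_crossover(p1, p2):
    """Exchange the section between two random crossover points."""
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:a] + p2[a:b] + p1[b:],
            p2[:a] + p1[a:b] + p2[b:])

def uniform_crossover(p1, p2):
    """Homogeneous crossover: each pair of genes is swapped with 50% probability."""
    c1, c2 = list(p1), list(p2)
    for i in range(len(p1)):
        if random.random() < 0.5:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2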
There are two more genetic operators that influence only one chromosome, i.e., do not participate in producing the next generations of solutions. These operators are mutation and inversion, which are intended for random modification of a single solution (schedule). Since the initial schedules are generated arbitrarily, such generation and the subsequent crossovers will not necessarily provide search over the whole decision space. The operators of mutation and inversion are intended to introduce into the population random solutions that are difficult or impossible to derive by applying crossover operators alone.
The probability of a mutation or inversion should be comparatively small but balanced. On the one hand, it should generate new solutions, and on the other hand, it should not bring chaos to the population. Unsuccessful solutions generated by mutation (those decreasing the value of the criterion function of a chromosome) will fall out during selection, and successful ones will have more chances for life due to the selection.
The mutation operator replaces one of the genes of the chromosome with a random value. It is applied to all genes but with a small probability of taking effect. After mutation, the estimate function of the chromosome should be recalculated.

The action of the inversion operator is as follows. Two positions in a chromosome are selected arbitrarily, and the section between them is inverted, i.e., the order of the genes within it is reversed. As we see, such inversion does not influence the estimate function of the chromosome. However, a crossover subsequently applied to the chromosome will generate entirely different children chromosomes. The inversion operator should be applied with a probability significantly smaller than unity.
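A sketch of these two single-chromosome operators for a list-of-genes representation might look as follows; the set of admissible gene values is an assumption, and inversion is implemented here as reversal of the order of genes in the selected section.

import random

def mutate(chromosome, admissible_values, p_mutation=0.01):
    """With a small probability, replace each gene with a random admissible value."""
    result = list(chromosome)
    for i in range(len(result)):
        if random.random() < p_mutation:
            result[i] = random.choice(admissible_values)
    return result

def invert(chromosome, p_inversion=0.001):
    """With a small probability, reverse the order of genes in a random section."""
    result = list(chromosome)
    if random.random() < p_inversion:
        a, b = sorted(random.sample(range(len(result) + 1), 2))
        result[a:b] = reversed(result[a:b])
    return result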
Selection. Such an important principle of the theory of evolution as selection is realized in the algorithm by letting chromosomes of the previous generation participate in forming the next generation with different probabilities. Due to this, a more optimal solution (chromosome) has more chances to propagate to the next generation. However, this does not mean that the least optimal solutions have no chances at all. By analogy with similar processes in nature, a suboptimal chromosome may participate in forming the next generation while the most optimal one remains without offspring. But the probability of this is rather small, since otherwise optimal solutions could be eliminated. As in nature, a situation may occur in which a suboptimal solution contains a fragment of the optimal one and serves as the basis for generating the optimal solution in subsequent generations.
Another complication in constructing solutions is the set of constraints that should be observed in the optimal result. Two types of constraints are formally introduced: rigid and nonrigid ones.
When rigid constraints are violated, the solution obtained is incorrect, i.e., it is impossible to apply it in practice. An example of violation of a rigid constraint is a schedule in which one lecturer has to deliver three lectures simultaneously in three lecture rooms.
Rigid constraints should be specified in the configuration of the algorithm and should not affect its implementation. The number of rigid constraints should be small, since their violation means a complete impossibility of constructing the solution, i.e., the conditions become such that none of the possible solutions satisfies all of them.
For this reason, nonrigid constraints are introduced, which, as distinct from rigid ones, are determined not by the impossibility of a specific event but by its undesirability, and different nonrigid constraints specify different degrees of undesirability. For example, suppose there are two alternative solutions. One of them violates the nonrigid constraint on the existence of “windows” (free time between lectures) in the schedule of a specific group (two groups in the schedule have one “window” each during the whole week), and in the second solution there are long uninterrupted lectures in three rooms. It is necessary to select the better schedule among those described. To formalize such a choice, weight coefficients are introduced for each nonrigid constraint, which influence the estimate of a particular variant of the schedule. In this case, the schedule with the greater value of the estimate function is considered the better one.
Thus, the selection operator is important for genetic algorithms. There are several types of selection: tournament, truncation selection, linear and exponential (ranking selection), and proportional. All types of selection are based on comparing chromosomes and selecting the best of them. Such comparison is carried out by means of the estimate function.
In tournament selection, t chromosomes are selected arbitrarily from the parent population, and the best one is
selected among them. The number of repetitions of the procedure corresponds to N , which is the number of chromosomes in
the population. Tournaments are frequently carried out between two individuals (binary tournaments). For t individuals, we
obtain a general case of tournament selection. Such selection algorithm can be implemented efficiently, since its operation
does not require sorting of the initial population. The time estimate of the algorithm is O ( N ). This method of selection was
analyzed first in [16], and mathematical analysis was carried out in [9].
When truncation selection with a threshold T is used, only the T best individuals are selected, each of them having the same probability of being selected [1]. Since such a selection algorithm requires ordering the whole population, the truncation selection algorithm has the time estimate O ( N ⋅ ln N ).
John Holland [7] proposed the proportional method of selection in the first implementation of the genetic algorithm in 1975. For each individual, the probability of being selected is directly proportional to the value of its estimate. Such an algorithm can be realized with the time estimate O ( N ).
Ranking selection, first proposed by Baker, eliminates serious shortcomings of proportional selection [2]. In this case, all chromosomes of the initial population are sorted by their estimate, the best of them being assigned the rank N and the worst the rank 1. The probability of selection is directly proportional to the rank of the chromosome thus obtained. Exponential ranking selection differs from the linear one in that the probabilities depend exponentially on the rank. The base of the exponential function is a parameter 0 < c < 1, which does not vary during the whole operation of the algorithm. These methods also require sorting of the population; therefore, their time estimate is determined by the time estimate of the sorting algorithm, which, as is generally known, amounts to O ( N ⋅ ln N ).
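The selection methods listed above can be sketched in Python as follows; the estimate (fitness) function is assumed to return larger values for better chromosomes, the truncation threshold is taken here as a fraction of the population, and all parameter names are illustrative.

import random

def tournament_selection(population, fitness, t=2):
    """Pick the best of t randomly chosen chromosomes; O(N) per generation."""
    return max(random.sample(population, t), key=fitness)

def truncation_selection(population, fitness, threshold=0.5):
    """Keep only the best threshold*N individuals and pick one of them uniformly."""
    ranked = sorted(population, key=fitness, reverse=True)
    best = ranked[:max(1, int(threshold * len(ranked)))]
    return random.choice(best)

def proportional_selection(population, fitness):
    """Roulette wheel: selection probability proportional to the estimate value."""
    return random.choices(population, weights=[fitness(c) for c in population])[0]

def ranking_selection(population, fitness, c=0.9):
    """Exponential ranking: the best chromosome has rank N and weight c**(N - rank)."""
    ranked = sorted(population, key=fitness)          # rank 1 (worst) ... rank N (best)
    n = len(ranked)
    weights = [c ** (n - rank) for rank in range(1, n + 1)]
    return random.choices(ranked, weights=weights)[0]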

It is necessary to formalize estimation of a chromosome in the population. All the selection methods are based on
comparing chromosomes with each other according to estimates. The estimate function is introduced so that larger values of
this function correspond to a smaller number of the constraints violated by the solution.
Completion of Genetic Algorithm. In the course of simulating the evolutionary process, solutions become more and more optimal and the number of conflicts decreases step by step. As distinct, for example, from the algorithm of exhaustive search, the genetic algorithm does not guarantee derivation of the most optimal solution. It is possible to find a good solution fast, but it will not necessarily be optimal. However, the majority of actual optimization problems do not require derivation of the most optimal solution.
In connection with this feature of genetic algorithms, the problem arises of finding a condition for stopping the basic cycle of the algorithm and outputting the final solution. Usually, with this aim in view, a threshold value of the function estimating solution optimality is determined. If one or several chromosomes from the population have reached the given level of optimality, the algorithm stops, and the best solution is output.
In the process of searching for a solution with a genetic algorithm, the most important point is constant support of validity (admissibility) of the solutions during operation of the algorithm, i.e., keeping chromosomes that do not violate rigid constraints. This saves CPU time, which would otherwise be spent on calculating and supporting inadmissible solutions. Each of the genetic operators described (except for inversion) may affect a chromosome so that rigid constraints are violated in it. Therefore, as soon as such an operator completes, it is necessary to check whether the newly formed chromosome violates rigid constraints and, if it does, to make a rollback, returning the chromosome to its previous state. After such a rollback, the algorithm may follow one of two strategies: either not repeat the application of the genetic operator, or repeat it a definite number of times and stop the attempts once it is clear that none of the repetitions gave a correct schedule.
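A minimal sketch of this rollback strategy is given below; the operator and the violates_rigid_constraints predicate are hypothetical parameters standing in for the corresponding parts of a concrete implementation.

import copy

def apply_with_rollback(chromosome, operator, violates_rigid_constraints,
                        max_attempts=5):
    """Apply a genetic operator; roll back if a rigid constraint is violated.

    Returns the modified chromosome, or the original one unchanged if every
    attempt produced an inadmissible schedule.
    """
    for _ in range(max_attempts):
        candidate = operator(copy.deepcopy(chromosome))  # work on a copy
        if not violates_rigid_constraints(candidate):
            return candidate                             # admissible result
    return chromosome                                    # rollback: keep the old state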

THE LECTURE SCHEDULING PROBLEM

Since the scheduling problem is NP-complete [17], the number of possible solutions equals T^N even for a problem as simple as allocating N examinations to T time slots. This number is huge even for an educational institution with a small number of students. Yet the examination scheduling problem is much simpler than the lesson scheduling problem. For example, each examination is precisely associated with a definite group of students and a definite teacher. This is not true for an ordinary lesson, because a group of teachers may conduct a lesson attended by several groups of students. Thus, defining which teacher conducts lessons for which group is also a part of scheduling.
Let us introduce the following rigid constraints: one lecturer can conduct only one lesson, in one room, at any given time; one group may have only one lesson at any moment; one room can be used for only one lesson at any moment. We restrict ourselves to this list so that a feasible schedule exists. For example, we could also consider rigid the following constraint: a room should be free during two time slots (pairs of lectures) after each lesson (for example, for airing). But in that case, if the number of rooms is small and there are many lessons, it may happen that all possible schedules violate the rigid constraints, i.e., the problem has no solution at all.
Among nonrigid constraints, let us select the following: there should be as few “windows” as possible in individual schedules of teachers and students; a lesson should not take place if the number of students in a group exceeds the number of places in the room.
Thus, given as input a set of data describing the schedule components for a definite period (a semester or a trimester), namely, lists of students, teachers, subjects, and rooms, the relations between them, the number of educational weeks and the number of hours for each subject, and the sets of rigid and nonrigid constraints, the algorithm should produce a schedule that violates no rigid constraints and satisfies the nonrigid constraints as far as possible.
Adapting the Genetic Algorithm to the Scheduling Problem. When genetic algorithms are applied to a real scheduling problem, we should first of all define an internal data format which, on the one hand, is convenient for the algorithm and, on the other hand, takes into account all peculiarities of the external data.
It is expedient to store the input data in an external database in a format convenient for filling in and presentation. This will allow the scheduling module to be connected to a more complex system for managing the university educational process. However, such a format cannot be used inside the algorithm because of the low speed of access to such data and because its structure does not correspond to the internal structures of the algorithm. Therefore, before starting the algorithm, it is necessary to transform the external data to the internal format, and to make the inverse transformation after completion.
In the case being considered, alternate solutions can be presented by bit strings by constructing a definite mapping.

But it is inconvenient to work with such a mapping, since it requires significant resources (for example, when calculating the estimate function for each schedule, bit strings would have to be constantly converted into actual data). Therefore, we turn to a more natural implementation of chromosomes.
In biology, a gene is an indivisible unit, a part of a chromosome, that encodes one protein. Similarly, in our representation of a chromosome, a gene encodes one tuple of the schedule, namely a triple (group, room, time). A group determines a set of other data: the training course for which the group is formed (for example, Chemistry-1, English-2) and the type of lessons (lecture, laboratory work, practical work). Each type of lessons determines a list of possible teachers and rooms (these data arrive from the outside). In addition, a group has a list of students and a teacher conducting the lessons of the group.
These data can be divided into two categories:
• static data, which do not vary during operation of the algorithm (the training course and the type of lesson for each group), and
• dynamic data, which actually appear in the schedule itself (the list of students in each group and the teacher of this group). The list of students may vary (if this is not restricted from the outside). For example, for four practical groups in Chemistry-1 it is of no importance to which group a student belongs; therefore, scheduling also includes generation of the lists of these groups. Similarly, it does not matter which teacher (from the admissible list) teaches a given group.
We will identify static data by a group code and dynamic data by the number of its modification. Thus, each group is represented in a gene by static and dynamic parts. The static part consists of the subject, the type of lessons, and the group number, and the dynamic part consists of the current modification of the group (lecturer and list of students). Within the limits of one schedule, the modification of the group will be the same for groups with the same code (generated for one subject and type of lessons) but with different numbers.
Let us consider an example. Let six groups for practical lessons in Chemistry-1 be formed. These groups are identified as follows: C1P-1, C1P-2, C1P-3, C1P-4, C1P-5, and C1P-6. Here C1 stands for Chemistry-1, P means practical lessons, and the digit is the serial number of the group. Let 60 students attend Chemistry-1, i.e., there are ten students in each group. Three teachers conduct the practical lessons, two groups each. In this case, the number of groups, the title of the course, and the type of lessons are static information. The specific list of students in each group and the correspondence between teachers and groups are dynamic information.
Thus, a group will be represented in a gene by a triple: the group code (for example, C1P), the group number (for example, 3), and the modification number. The modification number defines the lists of students in the groups and which teacher conducts lessons in each group (a correspondence is established between the modification number, on the one hand, and the lists of students in the groups and the teacher-group assignment, on the other).
According to combinatorics, the number of possible ways in which three teachers may be assigned to six groups, two groups each, equals 6!/(2!)^3 = 90. If the lists of students in the groups are assumed constant, then we can represent the modification number by seven bits (2^7 = 128). It is possible to specify a one-to-one correspondence between this number and the assignment of teachers to each of the six groups. The initial constraint (each teacher conducts lessons in two and only two groups) will not be violated in this case.
In a similar way, we can associate the modification number with the lists of students. There are many more variants here, but they can also be represented by a rather small number of bits. In the above example, the number of different partitions of the students into groups is determined by the expression
N = 60! / (10!)^6 ≈ 3.64 ⋅ 10^42 ≈ 2^141.4,

i.e., 142 bits are required to represent the modification number for the lists of groups and, taking into account the variants of teacher assignment, 149 bits are required for the modification number in the above example.
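These counts are easy to verify; a short Python check of the arithmetic of the example follows.

from math import factorial, log2

# Ways to assign 3 distinct teachers to 6 groups, two groups each.
teacher_assignments = factorial(6) // factorial(2) ** 3   # 90, fits in 7 bits (2**7 = 128)

# Ways to split 60 students into 6 distinct groups of 10 students.
student_partitions = factorial(60) // factorial(10) ** 6  # about 3.64e42

print(teacher_assignments)                  # 90
print(f"{student_partitions:.2e}")          # ~3.64e+42
print(log2(student_partitions))             # ~141.4, hence 142 bits
# Encoding the two numbers separately gives 142 + 7 = 149 bits, as in the text.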
Thus, the group modification number cannot be represented in a machine implementation as an ordinary integer, since it may take on very large values. Therefore, we will represent it as a bit field that has a one-to-one mapping to the lists of students and the correspondence between teachers and groups. This field may amount to several kilobytes. However, it need not be stored within the gene. The group modification number can be defined not for each lesson in the schedule (a gene is a lesson) but for the group code (in our example, C1P). Therefore, all the groups in the schedule whose identifier begins with C1P should have the same modification number.
Since there are six similar groups in the example being considered, each having several lessons during one week, it is inexpedient to store the bit fields of group modification in the chromosome. First, it would require a lot of memory, and second, one would have to constantly keep these bit fields synchronized, changing all the remaining fields when one of them is modified.

Fig. 4. Representation of a gene in binary form.
Therefore, it is advisable to store the group code in a chromosome as a pointer to a structure which contains the whole information about the group (for example, its membership in a definite training course) and a bit field that specifies the group modification number.
For each group code there is only one such structure, which is referenced by the appropriate pointers from the chromosome. Therefore, if one of the genes is modified (for example, in group C1P-5), the modification numbers of all the remaining similar groups (from C1P-1 to C1P-6) are automatically changed.
Representation of a Chromosome. Each chromosome is a set of genes. Since the number of lessons is known even before the algorithm starts, we may represent a chromosome as a dynamic array, each element of which is a structure representing a gene, as shown in Fig. 4. The structure containing the training course, the type of lessons, and the modification number is shared by several groups with the same code.
Thus, one gene in a chromosome is one lesson in the schedule. The chromosome itself looks as follows: gene 1, gene 2, gene 3, ..., gene N.
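In Python terms, this layout could be sketched roughly as follows; the class and field names are illustrative and are not taken from the authors' implementation.

from dataclasses import dataclass
from typing import List

@dataclass
class GroupInfo:
    """Shared per group code: static data plus the current modification."""
    course: str              # e.g. "Chemistry-1"
    lesson_type: str         # "lecture", "practical", or "laboratory"
    group_count: int         # number of groups with this code
    modification: bytes      # bit field encoding student lists and teacher assignment

@dataclass
class Gene:
    """One lesson in the schedule: the triple (group, room, time)."""
    group_info: GroupInfo    # pointer to the shared structure identified by the group code
    group_number: int        # serial number of the group, e.g. 3 for C1P-3
    room: int
    time_slot: int

# A chromosome is simply an array of genes, one gene per lesson.
Chromosome = List[Gene]

Because all genes of the groups C1P-1 to C1P-6 reference the same GroupInfo object, changing its modification field through one gene changes it for all similar groups at once, as described above.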
Initialization. Genetic algorithms do not require following any rules in constructing the initial generation of solutions. At the beginning, the schedules may be conflicting, with violations of rigid constraints. Naturally, this simplifies formation of the initial generation and leaves only one easily realizable method for it: generation of a completely random schedule. But later, at the second phase of the algorithm, such a mode of creating the initial population causes a number of problems.
First, it becomes necessary to determine weight coefficients for rigid constraints as well, and these coefficients should be much larger than those determined for nonrigid constraints. It is difficult to select a proper ratio, and doing so requires lengthy experiments. Second, significant resources will be spent on calculating, modifying, and supporting a priori inadmissible schedules. These inadmissible solutions might become conflict-free during operation of the algorithm, but the probability of this is negligibly small for completely random initial solutions.
Therefore, let us maintain conflict-free schedules at all stages of operation of the algorithm [18]. Naturally, this complicates the algorithm, but the gain is incomparably higher: no CPU time is wasted, and there is no need to introduce weight coefficients for rigid constraints, which significantly simplifies tuning of the algorithm.
At the stage of initial generation, a scheduling algorithm, which is non-conflicting with respect to rigid constraints,
has the following form.
1. Determine group structure.
1.1. For each training course, form the list of students from input data.
1.2. Mix contents of the lists.
1.3. From the input data, determine sets of groups (lecture, practical, etc.).
1.4. For each set, calculate the number of groups and their size.
1.5. Sequentially fill in lists of students in the groups.

2. Assign teaching staff responsible for groups.
2.1. From input data, determine the courses and types of lessons a teacher may conduct.
2.2. Form arbitrarily the correspondence “a teacher ↔ a group.”
3. Determine the time of each lesson.
3.1. Place groups arbitrarily into time slots with regard for the following requirements:
3.1.1. There are no rigid conflicts as to teachers.
3.1.2. There are no rigid conflicts as to students.
4. Determine a room.
4.1. For each obtained lesson, determine a random room from the list of admissible ones so that there is no conflict as
to occupied rooms.
Since many schedules should be generated in this way, and all of them should be different, random choices are used in the algorithm (see Items 1.2, 2.2, 3.1, and 4.1). In this way, it is possible to obtain many non-optimized but conflict-free (with respect to rigid constraints) schedules.
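A compressed Python sketch of Items 1–4 is given below; the accessor functions teacher_of, students_of, and rooms_of are hypothetical placeholders for the input data described above.

import random

def random_conflict_free_schedule(groups, time_slots, teacher_of, students_of, rooms_of):
    """Generate one random schedule with no rigid conflicts (a sketch of Items 1-4).

    groups      -- list of group identifiers, one entry per lesson to be placed
    time_slots  -- list of available time slots
    teacher_of  -- callable: group -> teacher assigned to the group (Item 2)
    students_of -- callable: group -> list of students in the group (Item 1)
    rooms_of    -- callable: group -> admissible rooms for the group
    """
    busy_teachers = {t: set() for t in time_slots}
    busy_students = {t: set() for t in time_slots}
    busy_rooms = {t: set() for t in time_slots}
    schedule = []
    for group in groups:
        for slot in random.sample(time_slots, len(time_slots)):   # Item 3: random slot
            teacher = teacher_of(group)
            students = set(students_of(group))
            free_rooms = [r for r in rooms_of(group) if r not in busy_rooms[slot]]
            if (teacher not in busy_teachers[slot]                # requirement 3.1.1
                    and not (students & busy_students[slot])      # requirement 3.1.2
                    and free_rooms):                              # Item 4: a room is free
                room = random.choice(free_rooms)
                busy_teachers[slot].add(teacher)
                busy_students[slot].update(students)
                busy_rooms[slot].add(room)
                schedule.append((group, room, slot))              # one gene: (group, room, time)
                break
    return schedule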
Then, from the generated schedules, the chromosomes that comprise the initial population to be modified later are formed. Note that, according to the data structure proposed, information on the lists of groups and on teachers is specified in bit fields. In practice, these are a binary representation of the numbers that denote the group modification number. These numbers are uniquely associated with a definite list of all subgroups and with a set of correspondences “teacher ↔ group.”
Main Phase of Operation of the Algorithm. Thus, after initialization, we have a data structure representing a set of
chromosomes, which is convenient for the main phase of the algorithm (emulation of evolutionary processes). At this phase,
the algorithm assumes an iterative realization of the following operations over schedules-chromosomes:
• selection of chromosomes for reproduction;
• application of modification operators to candidate chromosomes;
• reproduction of a new generation;
• check for conditions of completion.
These steps are applied to the population of solutions until check for algorithm completion conditions gives a positive
result. Let us describe this in more detail as applied to the data structure obtained after initialization of the scheduling
algorithm.
Selection of Chromosomes. As mentioned above, there are several methods of selecting chromosomes for participation in forming the next generation. All of them are based on one or another approach to defining the reproductive ability of a chromosome. This ability is determined through the estimate function and directly influences the probability that the chromosome is involved in generating descendant solutions.
Estimate and Cost Functions. The estimate function determines the degree of ability of a chromosome to form the next generations: E = 1 / (1 + C), where E is the estimate of the chromosome and C is its cost. Here the cost function is C = Σ_i w_i ⋅ n_i, where i is the number of a nonrigid constraint, w_i is its weight coefficient, and n_i is the number of violations of this constraint. The estimate function takes on values from 0 to 1, and the fewer nonrigid constraints are violated, the greater this function. The algorithm is directed toward deriving a solution with the greatest value of this function.
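In code, the two functions might look as follows; representing the configuration as pairs of a weight and a violation-counting function is an assumption of this sketch.

def cost(chromosome, nonrigid_constraints):
    """C = sum over nonrigid constraints of weight * number of violations."""
    return sum(weight * count_violations(chromosome)
               for weight, count_violations in nonrigid_constraints)

def estimate(chromosome, nonrigid_constraints):
    """E = 1 / (1 + C): lies in (0, 1], equal to 1 for a schedule with no violations."""
    return 1.0 / (1.0 + cost(chromosome, nonrigid_constraints))

# Hypothetical configuration: (weight, violation counter) pairs, e.g.
# nonrigid_constraints = [(3.0, count_windows), (1.0, count_overfull_rooms)]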
Process of Selection. Several selection functions are implemented. All of them have a standard interface for passing and receiving parameters and differ only in the way of selection. Through configuration, we can choose one of the selection functions (in turn or with a definite probability). This opens additional possibilities for tuning the algorithm.
Types of Selection. Let us note some peculiarities of the selection methods in a specific implementation of the scheduling algorithm.
Tournament selection is implemented in the scheduling algorithm for pairs of solutions; i.e., the better chromosome is selected from a random pair of chromosomes. Depending on the type of reproduction used in a given call, the tournament selection function returns either a pointer to one chromosome (steady-state reproduction, see below) or an array (generation type of reproduction) containing pointers to half of the chromosomes of the population. These chromosomes are the best candidates for formation of a new generation.
Truncation selection returns T pointers to chromosomes in the steady-state reproduction mode, where T is much less than the total number of chromosomes in the population. For the generation type of reproduction, the number of returned pointers equals half the number of chromosomes in the population.

Proportional selection and its improved modification, ranking selection, like the tournament selection function, return either a pointer to one chromosome or an array of pointers whose number equals half the number of chromosomes in the population, depending on the reproduction mode.
Reproduction Conditions. The implementation of the scheduling algorithm provides for its operation in two reproduction modes: generation and steady state.
The generation mode of reproduction provides a complete replacement of the previous generation by a new one at each iteration. The following sequence of operations is fulfilled in this case.
• The selection function is called twice; each time it returns an array of pointers to chromosomes.
• The operator of formation of the next generation, which forms two chromosomes written into a new array, is applied pairwise to the elements of the arrays.
• The generated array of chromosomes replaces the previous generation.
Let us point out that when the selection function is called, the returned pointer array may contain pointers that refer to the same chromosome. Therefore, one chromosome may participate several times in formation of a new generation. At the same time, this means that some chromosomes from the parent generation will disappear without leaving descendants.
The steady state reproduction is close to the natural one, since it foresees simultaneous existence of chromosomes
pertaining to several generations. The sequence of operations is as follows.
• Call twice the selection function. In this case, it returns one or a small number of pointers to chromosomes (for
selection type description see above).
• Apply the operator of formation of a new generation.
Descendant chromosomes replace the worst chromosomes of the population. Note that in this operation mode the selection function returns sets of pointers that refer to each physical chromosome only once, i.e., unlike the generation reproduction mode, there are no repetitions here.
Genetic Operators. Let us define peculiarities of implementation of genetic operators for the data structure described
above, which is used in the scheduling algorithm.
Crossover in the Scheduling Algorithm. A gene was defined as a triple (group, room, time). But this does not mean that crossover points in a chromosome are selected only on gene boundaries, i.e., between the “time” component of the previous gene and the “group” component of the next one, without breaking genes apart. On the contrary, for a more exhaustive search of possible solutions it is expedient that crossover also break genes of parent chromosomes, so that one part of a gene passes to the first descendant and another part to the second one. Thus, we have a guaranteed way to derive new genes even by crossover alone, in addition to mutation and other single-chromosome genetic operators (this will be discussed later).
Let us consider the peculiarities of representing an educational group as a part of a gene. As distinct from the time and the room, which are indivisible, the crossover point may separate the code of a group (for example, C1P-2, see the previous examples) from its modification number. This means that in a descendant chromosome the code of a group may be inherited from one parent and the modification number from another. To provide such a possibility, it is necessary to use modification numbers as bit fields of the same size, namely, the largest among all the groups. For groups requiring a smaller size, a many-to-one correspondence is specified, in which several values of the bit field correspond to one modification of the group. Such an approach to coding group modifications allows crossover to separate group codes from their modifications.
The mutation operator arbitrarily changes one of the three components of a gene in a chromosome. Therefore, after mutation of a chromosome, either the group, or the room, or the time of the lesson changes in one of its genes.
For a solution chromosome to remain meaningful, the component subject to such modification is replaced not with a completely random value but with an arbitrary value from a definite set. Thus, when the mutation operator acts on the group component, its value may be replaced with any other admissible group.
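A sketch of such a domain-restricted mutation for the (group, room, time) gene could look as follows; the lists of admissible values are parameters supplied by the configuration and are assumptions of this sketch.

import random

def mutate_gene(gene, admissible_groups, admissible_rooms, time_slots):
    """Replace one randomly chosen component of a gene with an admissible value."""
    group, room, slot = gene
    component = random.choice(("group", "room", "time"))
    if component == "group":
        group = random.choice(admissible_groups)   # any other accessible group
    elif component == "room":
        room = random.choice(admissible_rooms)     # a room from the allowed list
    else:
        slot = random.choice(time_slots)           # another time slot
    return (group, room, slot)

Even with admissible values, the mutated gene may still violate a rigid constraint; in that case the rollback strategy described earlier is applied.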
Completion of the Algorithm. The check of the completion condition is carried out at the end of each iteration of the main phase of the scheduling algorithm. As in the majority of other genetic algorithms, the algorithm stops once the threshold value of the estimate function, specified at the configuration stage, is attained. One or several alternative schedules whose estimate functions are greater than or equal to the preset value are output.
In summary, it should be mentioned that a test of the genetic algorithm on the example of compiling a schedule for the computer science department of the National University “Kyiv-Mohylyanska Academy”, with a rather complex profile of preferences, has shown the practical acceptability of the algorithm.

REFERENCES

1. H. Muhlenbein and D. Schlierkamp-Voosen, “Predictive models for the breeder genetic algorithm,” Evolutionary
Computation, 1, No. 1, 25-50 (1993).
2. J. J. Grefenstette and J. E. Baker, “How genetic algorithms work: A critical look at implicit parallelism,” in: Proc. 3rd
Intern. Conf. on Genetic Algorithms, Morgan Kaufmann, San Mateo (1989), pp. 20-27.
3. A. E. Eiben, P.-E. Raue, and Z. Ruttkay, “Genetic algorithms with multi-parent recombination,” in: Proc. 3rd Conf. on
Parallel Problem Solving from Nature, Springer-Verlag, Berlin (1994), pp. 78-87.
4. G. Syswerda, “Uniform crossover in genetic algorithms,” in: Proc. 3rd Intern. Conf. on Genetic Algorithms, Morgan
Kaufmann, San Mateo (1989), pp. 2-9.
5. N. J. Radcliffe, “Equivalence class analysis of genetic algorithms,” Complex Systems, 5, No. 2, 183-205 (1990).
6. Yu. V. Kapitonova and A. A. Letichevskii, “Theorem proving in a mathematical information environment,” Kibern.
Sist. Analiz, No. 4, 3-12 (1998).
7. J. H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor (1975).
8. R. Poli, “Exact schema theory for genetic programming and variable-length genetic algorithms with one-point
crossover,” in: Genetic Programming and Evolvable Machines, Morgan Kaufmann, Las Vegas (2001), pp. 469-476.
9. T. Blickle and L. Thiele, “A mathematical analysis of tournament selection,” in: Proc. 6th Intern. Conf. on Genetic
Algorithms (ICGA’ 95), Morgan Kaufmann, San Mateo (1995), pp. 9-16.
10. D. Whitley, “The GENITOR algorithm and selection pressure,” in: J. D. Schaffer (ed.), Proc. 3rd Intern. Conf. on
Genetic Algorithms, Morgan Kaufmann, San Mateo (1989), pp. 116-121.
11. J. Hesser and R. Manner, “Towards an optimal mutation probability for genetic algorithms,” Parallel Problem Solving
from Nature, 11, 96-119 (1991).
12. T. Haynes, R. Wainwright, S. Sen, and D. Schoenefeld, “Strongly typed genetic programming in evolving cooperation
strategies,” in: Proc. 6th Intern. Conf. on Genetic Algorithms, Morgan Kaufmann, San Mateo (1995), pp. 271-278.
13. T. Jansen and I. Wegener, “On the choice of the mutation probability for the (1 + 1) EA,” Parallel Problem Solving
from Nature, 6, 233-239 (2000).
14. T.C. Fogarty, “Varying the probability of mutation in the genetic algorithm,” in: Proc. 3rd Intern. Conf. on Genetic
Algorithms, Morgan Kaufmann, La Jolla, CA (1989), pp. 104-109.
15. R. Poli and W. B. Langdon, “Genetic programming with one-point crossover,” in: Proc. 2nd On-line World Conf. on
Soft Computing in Engineering Design and Manufacturing, Springer-Verlag, London (1997).
16. T. Blickle and L. Thiele, “Genetic programming and redundancy,” in: Genetic Algorithms within the Framework of
Evolutionary Computation, Max-Planck-Institut für Informatik, Saarbrücken (1994), pp. 33-38.
17. S. Even, A. Itai, and A. Shamir, “On the complexity of timetable and multicommodity flow problems,” SIAM J.
Comp., 5, No. 4, 691-703 (1976).
18. W. Erben and K. Keppler, “A general algorithm for solving a weekly course timetabling problem”, in: Lecture Notes
in Computer Science, 1153, Springer-Verlag, London (1999), pp. 198-211.

