

Solving MAX-3SAT problems using GA
Li Yicheng
l.y.c.li@umail.leidenuniv.nl
s1684930

Introduction
In computational complexity theory, the maximum satisfiability problem (MAX-SAT) is the problem
of determining the maximum number of clauses of a given Boolean formula in conjunctive normal
form (CNF) that can be made true by an assignment of truth values to the variables of the formula. It is a
generalization of the Boolean satisfiability problem (SAT), which asks whether there exists a truth
assignment that makes all clauses true. If no such assignment exists, MAX-SAT asks for the
assignment that makes as many clauses true as possible.
MAX-3SAT means that the given Boolean formula is in 3-CNF, where every clause consists of
exactly 3 literals.
Every k-SAT problem can be reduced to 3-SAT.
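To make the definition above concrete, the following is a minimal sketch in Python (not the language of the original implementation). It counts satisfied clauses and finds the MAX-SAT optimum of a tiny formula by brute force. The function names and the example formula are assumptions chosen for illustration; literals are written as signed integers, so 2 means x2 and -2 means not(x2).

```python
from itertools import product

def count_satisfied(clauses, assignment):
    """Number of clauses that contain at least one true literal.

    A literal is a non-zero int: k means variable k, -k means its negation.
    assignment[k - 1] is the truth value (True/False) of variable k.
    """
    return sum(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_max_sat(clauses, n_vars):
    """Try all 2**n_vars assignments; only feasible for very small n_vars."""
    return max(
        (count_satisfied(clauses, bits), bits)
        for bits in product([False, True], repeat=n_vars)
    )

# Hypothetical 3-CNF formula: all 8 sign patterns over x1, x2, x3.
# Every assignment falsifies exactly one of these clauses, so the formula
# is unsatisfiable and the MAX-SAT optimum is 7 out of 8.
clauses = [(s1 * 1, s2 * 2, s3 * 3) for s1, s2, s3 in product((1, -1), repeat=3)]
best_score, best_bits = brute_force_max_sat(clauses, 3)
print(best_score, "of", len(clauses), "clauses can be satisfied")
```

Brute force is only usable here because there are 2^3 = 8 assignments; the rest of the report is about what to do when enumeration is out of reach.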
The MAX-SAT problem is NP-hard, since its solution easily leads to the solution of the Boolean
satisfiability problem, which is NP-complete. Many exact solvers for MAX-SAT have been
developed during recent years. In 2006 the SAT Conference hosted the first MAX-SAT evaluation
comparing the performance of practical solvers for MAX-SAT, as it had done in the past for the
pseudo-Boolean satisfiability problem and the quantified Boolean formula problem. Because of its
NP-hardness, large MAX-SAT instances cannot in general be solved exactly in reasonable time, and
one must often resort to approximation algorithms and heuristics [4].
MAX-SAT has many applications, for example in Electronic Design Automation and in Field
Programmable Gate Array (FPGA) routing, where a large number of logical operations need to be
implemented. Even a simple FPGA design can involve thousands of logic elements, and optimizing
the routing so that the time delay is as small as possible is a problem related to SAT.

Problem Description
The problem is described as follows: implement a Genetic Algorithm (GA) that, given a 3-CNF
expression, finds an assignment of the variables that maximizes the number of satisfied clauses.
For this experiment, the simple (1+1) Genetic Algorithm is used as the solver. (1+1) means that
one parent generates one child. A constant mutation probability of 1/n is used, where n is the
number of variables. In the later discussion, different constant and varying mutation rates are
compared. The GA solver is also compared with a Monte-Carlo approach, and plots are used to
show the difference in performance.


Implementation
The function takes two parameters: the first is the path of the cnf file and the second is the
number of evaluations to run. A cnf file follows a fixed format that expresses a formula in
conjunctive normal form. A function named read_cnf reads the file according to this format and
returns a matrix for the evaluation function. The number of variables and the number of clauses
can be read from this matrix for later use. The evaluate function (in GA terms, the fitness
function) returns the number of true clauses.
The following steps show the details of implementing the GA solution.
The right shows the routine.
1. Given the number of variables and the fitness function, the program starts by generating the
initial parent. It uses the rand function provided by Matlab, because this function generates
uniformly distributed values, so the initial bit string is not biased. The fitness function then
evaluates the first parent and the result is recorded.
2. Loop for the given number of evaluations. In every iteration, generate a new random bit
string in which each bit is 1 with the mutation probability; this string determines which bits of
the parent are flipped. In the implementation the flip is done by
(x + bool) mod 2
because
(0 + 1) mod 2 = 1 (flip)
(1 + 1) mod 2 = 0 (flip)
(0 + 0) mod 2 = 0 (no flip)
(1 + 0) mod 2 = 1 (no flip)
Here, x is the original string and bool is the random bit string generated from the mutation
probability and the random function. The result is the offspring.
3. Evaluate the offspring and compare the result with its parent's evaluation. If the offspring
gets a better score, it becomes the new parent.
4. Repeat the loop until the evaluation budget is reached. The last parent is the best solution
found within the budget.
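The four steps above can be sketched as follows. This is a Python illustration under stated assumptions, not the original Matlab code: clauses are assumed to be tuples of signed integers (k for variable k, -k for its negation), the names count_satisfied and one_plus_one_ga are hypothetical, and the initial parent's evaluation is assumed not to count against the loop budget.

```python
import random

def count_satisfied(clauses, bits):
    """Fitness: number of clauses with at least one true literal.

    bits[k - 1] is 0 or 1 for variable k. The comparison with (lit > 0)
    works because True == 1 and False == 0 in Python.
    """
    return sum(
        any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def one_plus_one_ga(clauses, n_vars, budget, rng=None):
    """(1+1) GA with bit-flip mutation at the constant rate 1/n_vars."""
    rng = rng or random.Random()
    p = 1.0 / n_vars
    # Step 1: uniformly random initial parent, evaluated once.
    parent = [rng.randint(0, 1) for _ in range(n_vars)]
    parent_fit = count_satisfied(clauses, parent)
    # Steps 2 to 4: one offspring per iteration until the budget is used.
    for _ in range(budget):
        mask = [1 if rng.random() < p else 0 for _ in range(n_vars)]
        child = [(x + b) % 2 for x, b in zip(parent, mask)]  # flip where mask is 1
        child_fit = count_satisfied(clauses, child)
        if child_fit > parent_fit:  # strictly better offspring replaces the parent
            parent, parent_fit = child, child_fit
    return parent, parent_fit

# Hypothetical toy instance with 3 variables and 4 clauses.
clauses = [(1, -2, 3), (-1, 2, 3), (1, 2, -3), (-1, -2, -3)]
best, score = one_plus_one_ga(clauses, n_vars=3, budget=100, rng=random.Random(42))
print(score, "of", len(clauses), "clauses satisfied")
```

Because the parent is replaced only by a strictly better offspring, the recorded score never decreases over the run.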

Experiments
1. For every cnf file, run the GA and Monte-Carlo search respectively. Record the final solution
quality after 10000 evaluations and average over 20 runs.
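A minimal sketch of this protocol, in Python rather than the Matlab used for the experiments. The provided Monte-Carlo program is not shown in this report, so monte_carlo_search below is an assumed reading of it (independent uniform bit strings, keep the best), and summarize_runs is a hypothetical helper. The sample standard deviation is used, which is an assumption about how the "Std dev" column was computed. A stand-in fitness (the number of ones) replaces the clause counter to keep the example short.

```python
import random
import statistics

def monte_carlo_search(fitness, n_vars, budget, rng):
    """Blind random search: draw `budget` independent bit strings, keep the best."""
    best_bits, best_fit = None, float("-inf")
    for _ in range(budget):
        bits = [rng.randint(0, 1) for _ in range(n_vars)]
        fit = fitness(bits)
        if fit > best_fit:
            best_bits, best_fit = bits, fit
    return best_bits, best_fit

def summarize_runs(search, runs, seed=0):
    """Call search(rng) `runs` times with distinct seeds.

    Returns (mean, sample standard deviation) of the final fitness values.
    """
    finals = [search(random.Random(seed + r))[1] for r in range(runs)]
    return statistics.mean(finals), statistics.stdev(finals)

# Stand-in fitness: sum(bits) counts the ones in a 20-bit string.
avg, std = summarize_runs(
    lambda rng: monte_carlo_search(sum, n_vars=20, budget=200, rng=rng),
    runs=20,
)
print(f"avg = {avg:.4f}, std dev = {std:.4f}")
```

Giving every run its own seed keeps the 20 runs independent, and the same summarize_runs call can wrap a GA search so that both methods are measured in the same way.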


               Monte-Carlo Search       Genetic Algorithm
file name      Avg         Std dev      Avg         Std dev
uf50-01        201.5500    0.6863       216.8500    0.9881
uf100-01       401.4500    2.0641       427.7500    1.2085
uf100-02       400.6500    1.7554       427.4000    1.1425
uf100-03       401.0500    1.3563       428.0000    1.1698
uf200-01       788.3000    3.0625       854.9000    1.9167
uuf50-01       206.2000    1.1965       216.5000    0.8272
uuf100-01      400.5000    1.8778       427.0500    0.5104
uuf100-02      399.5500    1.2763       427.3500    1.1367
uuf100-03      401.0000    2.2478       427.2500    1.0195
uuf200-01      787.4000    3.0157       851.8000    1.6092

Table 1: Final solution quality after 10000 function evaluations, averaged over 20 runs

2. Compare the performance of the GA and Monte-Carlo search. (The Monte-Carlo search is
provided.)
Performance on benchmark file uuf200-01:

Figure 1: Final solution quality after 10000 function evaluations


Figure 2: Final solution quality after 5000 function evaluations

Discussion and Conclusion


1. Analysis of the Monte-Carlo search's poor final score
According to the data gathered in the first experiment, Monte-Carlo search clearly struggles to
reach scores as high as the GA's. This is reasonable, because Monte-Carlo search generates each
new bit string at random without using any information from earlier ones. It is a blind search,
and most evaluations are wasted. In addition, the provided Monte-Carlo program might not have
used a different seed on each run, which could lead to the same bit strings appearing again. I
therefore added a seed function to the experiment to make sure that Monte-Carlo search
generates different bit strings each time, but the result was the same. The more likely
explanation is that MC is simply not lucky enough to find the best string within 10000
evaluations. How many evaluations would MC search need to score as high as the GA? If half of
all possible bit strings were to be visited, then with 200 variables the evaluation budget would be
2^200/2, a number so large that 10000 is negligible by comparison. So how does MC search
behave when given an expression with a small number of variables, where it can cover a large
share of the possible combinations? I ran the following experiments to test GA and MC
performance on MAX-3SAT problems with a small number of variables (12 and 14). These four
experiments explore under what conditions Monte-Carlo search might perform better.
In the first experiment, I used a simple script to create a cnf file with 12 variables. By basic
combinatorics, 12 variables allow (12*11*10)/(3*2*1) * 2^3 = 1760 distinct clauses; 600 of them
are generated with the random function. The appendix will give the script to generate such a cnf
file. The performance is shown in figure 3.
In the second experiment, I used a cnf file with 14 variables and 700 clauses, which gives
2^14 = 16384 different bit strings. The evaluation budget is 2000. The performance is shown in
figure 4.
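The clause count and the random instance generation described above can be sketched as follows. This is not the script from the appendix: the function names are hypothetical, the output is assumed to follow the common DIMACS CNF layout (a "p cnf" header, then one clause per line terminated by 0), and the clauses are assumed to be sampled without repetition.

```python
import random
from itertools import combinations, product
from math import comb

def all_3_clauses(n_vars):
    """Every clause over 3 distinct variables, each literal plain or negated."""
    return [
        tuple(sign * var for sign, var in zip(signs, variables))
        for variables in combinations(range(1, n_vars + 1), 3)
        for signs in product((1, -1), repeat=3)
    ]

def random_3cnf_dimacs(n_vars, n_clauses, rng):
    """Sample distinct clauses and format them as DIMACS CNF text."""
    chosen = rng.sample(all_3_clauses(n_vars), n_clauses)
    lines = [f"p cnf {n_vars} {n_clauses}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in chosen]
    return "\n".join(lines) + "\n"

# C(12, 3) * 2^3 = 220 * 8 = 1760 possible clauses for 12 variables.
n_possible = len(all_3_clauses(12))
print(n_possible, comb(12, 3) * 2**3)

# A hypothetical instance matching the first experiment: 12 variables, 600 clauses.
text = random_3cnf_dimacs(12, 600, random.Random(1))
print(text.splitlines()[0])
```

Enumerating all clauses first is only practical for small n; for the 14-variable case it yields C(14, 3) * 8 = 2912 candidates, which is still cheap.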


Figure 3: Performance on 12 variables, 600 clauses

Figure 4: Performance on 14 variables, 700 clauses, 2000 evaluations

Figure 5: Performance on 14 variables, 700 clauses, 4000 evaluations

In the third experiment, I used the same cnf expression as in experiment 2 (14 variables and 700
clauses), this time with an evaluation budget of 4000 (4000/2^14 = 24.4%, almost 25% of the
search space). The performance is shown in figure 5.
In the last experiment, the same cnf expression (14 variables and 700 clauses) is used, and this
time the evaluation budget is 16000, which is 97.6% of the search space. The performance is
shown in figure 6.
In the first experiment there are 12 variables, so after 2^12 = 4096 evaluations an optimized MC
that generates a different random bit string each time is guaranteed to find the best score. I
chose an evaluation budget of 1000 to compare MC and GA on a small number of variables. The
question was whether MC, given about 25% of the search space (1000/4096 ≈ 24.4%), has a good
chance of finding the best bit string. The result shows that it is possible: both reach the same
result. Still, the GA does better because it converges faster.
However, because MC search is blind and random, with enough luck it may reach a better score
faster than the GA within the given number of evaluations, as can be seen in figure 4 and figure
5, where MC gets the better score. This suggests a surprising possibility: the GA may fail to find
the best score because it partly relies on the previous bit string. The last experiment was done to
test this.
With a very large number of variables, as in the given cnf files, MC will not be so lucky. It is not
possible to tell when Monte-Carlo search will suddenly hit the best score, and most difficult SAT
problems involve hundreds of variables.
To conclude, if the only two choices are the genetic algorithm and Monte-Carlo search, and the
evaluation budget is limited by time or hardware, the GA is the safer and better choice.
However, when the number of variables is not very large and much of the space can be searched,
MC may perform better: after its fast start, the GA may fail to keep improving and may not
generate the best bit string, as shown in the last experiment in this section. So I regard MC
search as a kind of sudden surprise.


Figure 6: Performance on 14 variables, 700 clauses, 16384 evaluations

2. How the mutation probability affects the velocity of GA search

To generate the first table, the constant mutation probability 1/n was used in every experiment.
How does the GA perform with a different mutation probability, and is 1/n the best choice?
The following experiment was conducted to test this. I varied the mutation probability from 1/n
to 0.5 (at which point the GA behaves like MC search) over nine values: 1/n, 1/n + step*k for
k = 1 to 7, and 0.5. The cnf file for this experiment is uf200-01, which has 200 variables. Figure
7 shows the result of the experiment.

[Plot: "Performance with different mutation rate"; x-axis: Evaluate Times, 0 to 5000; y-axis
range 720 to 860; curves: 1/n, 1/n+step*1 to 1/n+step*7, 0.5]
Figure 7: Performance of GA with different mutation probabilities, file: uf200-01, 5000
evaluations

Clearly 1/n is a good choice; it means that on average one bit is flipped per mutation. But does
there exist a mutation probability that is better than 1/n?
Traditionally in Genetic Algorithms, the mutation probability parameter keeps a constant value
during the search. However, it is difficult to determine a priori which value is best suited for a
given problem. Besides, there is a growing demand for up-to-date optimization software that can
be used by a non-specialist. Thus, I tested a dynamic mutation probability. For the counting
ones problem, an approximation of the optimum mutation rate is p* = 1/(2(fa+1) - l), where fa
is the fitness of the current bit string and l is its length. How this rate behaves on the SAT
problem is shown in the following figure.
As the figure shows, it does not perform well in the beginning but catches up later. What a good
dynamic mutation probability for the SAT problem would look like still needs to be studied.
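As a small illustration of the dynamic rate, here is a Python sketch of the formula with the clause count used as fa and the number of variables used as l. The function name is hypothetical, and the guard for small denominators is an addition of this sketch, not part of the formula as stated.

```python
from fractions import Fraction

def dynamic_rate(fitness, length):
    """Mutation rate p = 1 / (2 * (fitness + 1) - length).

    Guard (an assumption of this sketch): if the denominator is below 2,
    the formula would give a rate above 0.5 or a negative value, so the
    rate is capped at 0.5.
    """
    denom = 2 * (fitness + 1) - length
    if denom < 2:
        return Fraction(1, 2)
    return Fraction(1, denom)

# Counting ones with l = 200: at the optimum (fa = 200) the rate is 1/202.
print(dynamic_rate(200, 200))

# SAT with 200 variables: fa is a clause count, so it can far exceed l.
print(dynamic_rate(720, 200), float(dynamic_rate(720, 200)))
print(Fraction(1, 200), float(Fraction(1, 200)))
```

With fa = 720 satisfied clauses and l = 200 the formula gives 1/1242, which is well below the constant rate 1/200 = 0.005.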
To reason about the convergence velocity of a (1+1)-GA, the probability that the next evaluation
of the expression increases should be taken into consideration. This is not like the counting
ones problem, where it is enough to count how many zeros become ones and how many ones
become zeros, which is easy to evaluate. In SAT, changing one bit may lead to a large change in
the evaluation.
To get a rough estimate of the convergence velocity, I make two assumptions. First, the
variables are uniformly distributed over the clauses, which means that every variable is equally
likely to appear. Because there are 200 variables, I can assume each variable appears with
probability 1/200. The file uf200-01 has 860 clauses, which means there are 860*3 = 2580 literals
in total. So taking a1 as an example, there are about 2580/200 = 12.9, roughly 13, occurrences of
a1 or not(a1) in total. The second assumption is that a and not(a) appear equally often. In this
case there are about 6 occurrences of a1 and about 6 of not(a1). Generalizing these two
assumptions: for a 3-CNF expression with n variables and m clauses, each variable occurs about
m*3/n times, split evenly between the variable and its complement.
Continuing with uf200-01: if a variable is 1 and becomes 0, the roughly 6 clauses containing its
positive literal each lose a true literal, so the score can drop by up to 6. The roughly 6 clauses
containing its complement now have that literal true, but the score rises only for those that were
not already satisfied by another literal. The net change therefore depends on the other literals in
each clause, about which the two assumptions say nothing. So these assumptions cannot carry
the analysis further.
This led me to compare the number of occurrences of a variable and of its complement. If
amount(a) > amount(~a), it is likely better to set a to 1. So the distribution of the variables
matters, and expressions with different distributions may have different optimum mutation
rates.

[Plot: "Mutation rate 1/(2(fa+1)-n), file: uf200-01"; x-axis: 0 to 5000 evaluations; y-axis range
720 to 860; curves: 1/n, 1/(2(fa+1)-n)]
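The occurrence estimate m*3/n and the comparison of amount(a) with amount(~a) can be checked on a concrete clause list. The Python sketch below uses hypothetical function names and a made-up three-clause example; clauses are tuples of signed integers, with k for a variable and -k for its complement.

```python
from collections import Counter

def expected_occurrences(n_vars, n_clauses, k=3):
    """Average occurrences per variable under the uniformity assumption: k*m/n."""
    return k * n_clauses / n_vars

def literal_counts(clauses):
    """Count how often each signed literal appears across all clauses."""
    return Counter(lit for clause in clauses for lit in clause)

def majority_polarity(clauses, n_vars):
    """For each variable: 1 if it appears more often plain than negated, else 0.

    Ties are resolved to 0, which is an arbitrary choice in this sketch.
    """
    counts = literal_counts(clauses)
    return [1 if counts[v] > counts[-v] else 0 for v in range(1, n_vars + 1)]

# uf200-01 has n = 200 variables and m = 860 clauses.
print(expected_occurrences(200, 860))

# Hypothetical example: x1 appears only plain, x2 mostly negated, x3 mostly plain.
clauses = [(1, 2, 3), (1, -2, 3), (1, -2, -3)]
print(majority_polarity(clauses, 3))
```

Running literal_counts on the real benchmark file would show how far the actual distribution is from the uniform, evenly split assumption.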

The discussion above leads to the following conclusions:
1. For a very large search space, a Genetic Algorithm is a better choice than Monte-Carlo
search for finding a good solution within a limited number of evaluations.
2. For a relatively small search space, where most or all of the possibilities can be visited,
Monte-Carlo search may perform better, because it does not rely on information from
former bit strings and may surprisingly find the solution quickly, whereas the Genetic
Algorithm may fail to reach the best score because it mutates the former bit string.
3. 1/n is a relatively good choice for the mutation rate in this problem, but when
constructing a dynamic mutation rate to converge as fast as possible, the distribution of
the variables should be taken into consideration.

Reference
https://en.wikipedia.org/wiki/Maximum_satisfiability_problem#Solvers
