
Parallel Computing 8 (1988) 177-183 North-Holland


The approximate solution of the Euclidean traveling salesman problem on a CRAY X-MP
Renate GURKE
Zentralinstitut für Angewandte Mathematik, Kernforschungsanlage Jülich GmbH, 5170 Jülich, Fed. Rep. Germany

Abstract. The efficient use of MIMD computers calls for a careful choice of adequate algorithms as well as for an implementation taking into account the particular architecture. To demonstrate these facts, a parallel algorithm to find an approximate solution to the Euclidean Traveling Salesman Problem (ETSP) is presented. The algorithm is a parallelization of Karp's partitioning algorithm, a divide-and-conquer method for solving the ETSP approximately. Since the successor of any vertex in the tour is usually a nearby vertex, the problem can be "geographically" partitioned into subproblems which can then be solved independently. The resulting subtours can be combined into a single tour which is an approximate solution to the ETSP. The algorithm is implemented on a CRAY X-MP with two and four processors, and results using macrotasking and microtasking are presented.

Keywords. Euclidean traveling salesman problem, parallel algorithm, CRAY X-MP, multitasking.

1. Introduction

Given N vertices and the distance between each pair of vertices, the Traveling Salesman Problem (TSP) is to find the shortest tour in a labelled graph where every vertex is visited exactly once. For this problem, which seems to be very simple, the time to find the exact solution increases exponentially with N. For this reason, approximate algorithms, i.e. algorithms which find a suboptimal solution in polynomial time, are attractive. Many of these algorithms find tours which are usually only a few percent longer than the shortest tour, yet have low-order polynomial time complexity [2]. Of the many approximate algorithms available for the Euclidean Traveling Salesman Problem (ETSP), a suitable candidate for parallelization is Karp's partitioning algorithm, because of its divide-and-conquer structure [4].

2. Karp's partitioning algorithm

Karp's partitioning algorithm is a recursive method for solving the ETSP. In the ETSP each vertex has a location in a Euclidean space, and thus the distances are Euclidean. Since the successor of any vertex in the tour is usually a nearby vertex, the problem can be 'geographically' partitioned into two subproblems which can then be solved independently. Combining the resulting subtours into a single tour yields an approximate solution to the ETSP [5]. A set of vertices is divided into two subsets. Partitioning can be done either by x-coordinates or by y-coordinates, depending on whether the distribution of the vertices is greater in the x- or y-direction, respectively. The only vertex included in both subsets is the vertex with the median x- (or y-) coordinate. Thus, the extremal x- and y-coordinates of the set of vertices in any partition must be calculated in order to determine the partitioning direction. Once the direction is chosen, the proper median point must be found, so that the set can be divided evenly into two subsets; one subset contains at most one more element than the other.

0167-8191/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

Karp's partitioning algorithm is described in Fig. 1:

    sort array X by x-coordinates;
    sort array Y by y-coordinates;
    find the position of the elements of X in array Y and form array XASSOC;
    find the position of the elements of Y in array X and form array YASSOC;
    partition(1);

    procedure partition(i)
    begin
      if number of elements in partition i > maximum partition size then
      begin
        if distribution of vertices in x-direction >= distribution of vertices in y-direction
          then partition arrays X and Y by x-coordinates
          else partition arrays X and Y by y-coordinates;
        for j := 2*i to 2*i + 1 do
          partition(j);
        combine the subtours of partition 2*i and partition 2*i + 1
      end
      else
        solve the problem of partition i by the farthest-insertion heuristic
    end;

Fig. 1. Karp's partitioning algorithm.

Array X contains the vertices sorted by their x-coordinates; similarly, array Y contains the vertices sorted by their y-coordinates. Two other arrays, XASSOC and YASSOC, serve as pointers between the elements of X and Y: the position in array Y of the vertex stored in X[i] is given by XASSOC[i], and likewise X[YASSOC[i]] = Y[i]. Logically each partition has its own X, XASSOC, Y and YASSOC arrays; implemented as subarrays of large arrays, each of them is accessed via a base pointer. Finally, the x- and y-coordinates of the vertices are stored in arrays XLOC and YLOC, respectively. Assuming that the set of vertices is to be partitioned by x-coordinates, the first half of array X can be copied into the first subset of array X, and similarly the second half into the second subset. The elements of array Y cannot be copied into the proper subsets of array Y in the same way, because the position of each element in the new array depends on its index in array X. The subset into which each element has to be copied can be decided using array YASSOC.
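The construction of these four arrays can be sketched as follows (illustrative Python, not the paper's FORTRAN; function name and 0-based indexing are this sketch's assumptions):

```python
# Build the arrays X, Y, XASSOC and YASSOC from the coordinate
# arrays XLOC and YLOC, as described above. Indices are 0-based.
def build_arrays(xloc, yloc):
    n = len(xloc)
    # X and Y hold the vertex numbers sorted by x- and y-coordinate.
    X = sorted(range(n), key=lambda v: xloc[v])
    Y = sorted(range(n), key=lambda v: yloc[v])
    pos_in_Y = {v: i for i, v in enumerate(Y)}
    pos_in_X = {v: i for i, v in enumerate(X)}
    # XASSOC[i] is the position of vertex X[i] in Y, so Y[XASSOC[i]] = X[i];
    # likewise X[YASSOC[i]] = Y[i].
    XASSOC = [pos_in_Y[v] for v in X]
    YASSOC = [pos_in_X[v] for v in Y]
    return X, Y, XASSOC, YASSOC
```

The two pointer arrays are inverses of each other in the sense that following XASSOC from X and YASSOC from Y always lands on the same vertex.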
This divide-and-conquer process continues recursively until a set of vertices is reached whose cardinality is smaller than a predefined value called the maximum partition size (MAXELS). The resulting subproblems are solved using the farthest-insertion heuristic. The farthest-insertion heuristic springs from the intuition that if the rough outline of the tour can be constructed from the widely separated vertices, then the finer details of the tour resulting from the inclusion of nearer vertices can be filled in without increasing its total length significantly. The algorithm adds the vertex which is farthest from any vertex in the partially completed tour, in such a way that the increase in the length of the partially completed tour is minimized. When the third vertex is added to the tour, it must be checked whether the resulting tour is traversed in a counterclockwise direction; otherwise the sequence of vertices has to be reversed. A common direction of all subtours is important for combining them.

The vertex common to both subsets is called the source of these subtours. There are two possibilities for combining two subtours:
(1) The edge in partition i directed to the source and the edge in partition j directed from the source are deleted. Then the predecessor of the source in partition i and the successor of the source in partition j are joined.
(2) The edge in partition j directed to the source and the edge in partition i directed from the source are deleted. Then the predecessor of the source in partition j and the successor of the source in partition i are joined.
Joining two subtours is done by deciding which of these two possible shortcuts shortens the tour by the greater amount.

Fig. 2. Distribution of vertices. (Scatter plot of the nine example vertices.)

Example 2.1. Consider the distribution of vertices in Fig. 2. This distribution of vertices yields the following arrays:

    X      = 5 9 7 1 2 3 8 4 6
    XASSOC = 6 2 7 3 4 9 1 5 8
    YASSOC = 7 2 4 5 8 1 3 9 6
    Y      = 8 9 1 2 4 5 7 6 3

Now the first partitioning of arrays X and Y is presented. First, array X is divided into two subsets, because the distribution of the vertices is greater in the x-direction than in the y-direction:

    X = 5 9 7 1 2 3 8 4 6   becomes   5 9 7 1 2  and  2 3 8 4 6

Which elements of array Y have to be copied into which subset can be determined using array YASSOC; the median vertex 2 belongs to both subsets:

    Y      = 8 9 1  2  4 5 7 6 3
    subset = 2 1 1 1,2 2 1 1 2 2
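This subset determination can be sketched as follows (illustrative Python, not the paper's FORTRAN; the function name is this sketch's assumption). Element Y[i] belongs to the first subset when its position in X, given by YASSOC[i], lies at or before the median position, and to the second subset otherwise; the median vertex itself goes into both:

```python
# Split array Y into the two subsets of the partition using YASSOC.
# median_pos is the position of the median vertex in array X; the
# median vertex is copied into both subsets.
def split_Y(Y, YASSOC, median_pos):
    Y1 = [y for y, p in zip(Y, YASSOC) if p <= median_pos]
    Y2 = [y for y, p in zip(Y, YASSOC) if p >= median_pos]
    return Y1, Y2
```

With the arrays of Example 2.1 (positions 1-based) and the median vertex at position 5 of X, this reproduces the subsets 9 1 2 5 7 and 8 2 4 6 3.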

Fig. 3. Forming two subtours.

Fig. 4. Possibilities for combining.

Having obtained this information, the elements are copied into the correct subsets:

    Y = 8 9 1 2 4 5 7 6 3   becomes   9 1 2 5 7  and  8 2 4 6 3
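The farthest-insertion heuristic that solves each subset can be sketched as follows (illustrative Python, not the paper's FORTRAN; the counterclockwise orientation check mentioned above is omitted here):

```python
import math

# Farthest-insertion heuristic: repeatedly take the remaining vertex
# farthest from the partial tour and insert it where the tour length
# increases the least. points maps a vertex number to (x, y).
def farthest_insertion(points):
    vertices = list(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    tour = [vertices[0]]
    remaining = set(vertices[1:])
    while remaining:
        # The remaining vertex farthest from any vertex in the tour ...
        v = max(remaining, key=lambda u: min(dist(u, t) for t in tour))
        remaining.remove(v)
        if len(tour) < 2:
            tour.append(v)
            continue
        # ... is inserted at the position of minimum length increase.
        k = len(tour)
        best = min(range(k), key=lambda i: dist(tour[i], v)
                   + dist(v, tour[(i + 1) % k])
                   - dist(tour[i], tour[(i + 1) % k]))
        tour.insert(best + 1, v)
    return tour
```

On the four corners of a unit square the sketch recovers the optimal tour of length 4.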

Using the farthest-insertion heuristic for this example, the two resulting subtours are shown in Fig. 3. The two possibilities for combining these tours are shown in Fig. 4. Comparing both alternatives, the first tour turns out to be the better solution; therefore it is chosen as an approximate solution of the original problem.
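The combination step of Example 2.1 can be sketched as follows (illustrative Python; names and structure are this sketch's, not the paper's). Both subtours are vertex lists traversed in the same direction and share the source vertex; both shortcuts described above are tried, and the shorter combined tour is kept:

```python
import math

def combine(tour_i, tour_j, source, points):
    d = lambda a, b: math.dist(points[a], points[b])

    def splice(a, b):
        # Delete the edge into the source in a and the edge out of the
        # source in b, then join a's predecessor of the source to b's
        # successor of the source.
        a = a[a.index(source):] + a[:a.index(source)]  # a starts at source
        b = b[b.index(source):] + b[:b.index(source)]
        return a + b[1:]

    length = lambda t: sum(d(t[k], t[(k + 1) % len(t)]) for k in range(len(t)))
    # Possibility (1) and possibility (2); keep the shorter result.
    return min(splice(tour_i, tour_j), splice(tour_j, tour_i), key=length)
```

splice(tour_i, tour_j) realizes possibility (1), splice(tour_j, tour_i) possibility (2).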

3. Implementation

A modification of Karp's partitioning algorithm is implemented in FORTRAN on a CRAY X-MP/22 and on a CRAY X-MP/48 at Jülich (FORTRAN compiler CFT 1.15 BF2 and operating system COS 1.15 BF3). The sequential program has been compared with several parallel versions using the CRAY multitasking facilities described in [1]. All programs make use of loop vectorization. Karp's recursive algorithm has been replaced by a version using a stack technique. As input the algorithm gets two arrays XLOC and YLOC containing the x- and y-coordinates; in our examples they were produced using a random number generator. Sorting the arrays XLOC and YLOC yields the arrays X and Y. Furthermore, the algorithm requires the computation of the arrays XASSOC and YASSOC.

An important factor concerning both the result and the execution time of the algorithm is the maximum partition size MAXELS, i.e. the maximum number of vertices in a subset which is not subdivided further. The execution times for our examples decrease as MAXELS increases from 3 to 15, and increase significantly as MAXELS increases from 15 to N/2 + 1. For example, the sequential execution time with MAXELS = 3 for a problem with 1000 vertices, measured on a CRAY X-MP/22, was 53.95 ms. With MAXELS = 14 we measured an execution time of 39.75 ms. Choosing MAXELS = 501, only one partitioning of the arrays is done; the two resulting subtours are solved and then combined. The execution time was 567.42 ms. It can be observed that the resulting tour length in our examples decreases with increasing MAXELS.

Using two processors, a second task is created after the first array partitioning. Then both tasks work asynchronously on their subarrays until each of them obtains a subtour as a solution. The initial task then combines these two subtours to get an approximate solution of the original problem [3]. The good performance of the parallel algorithm is due to the fact that synchronization of both tasks is only necessary before combining the last two subtours. Another advantage in parallelizing this algorithm is the nearly equal granularity of both tasks, which leads to a balanced workload and good CPU utilization. Using four processors to execute the algorithm in parallel, both tasks create an additional task after the next partitioning. Then four tasks work asynchronously on their subarrays until each obtains a subtour as a solution. Accordingly, two tasks combine two subtours each, and finally the initial task performs the last combination.

Besides macrotasking, a second possibility for parallel program execution is microtasking. Using microtasking directives, the algorithm can be parallelized in the same way as for macrotasking with two tasks described above. Creating four tasks has to be done in a different manner, because it is not possible to create two tasks first and two additional tasks later on. For this reason the algorithm has been modified in the following way: at the beginning, the partitioning is done into four subarrays instead of two. The following partitionings are done as before. Every task works independently on its partition until it obtains an approximate solution. Then all tasks are halted and two new tasks are created to combine two subtours each. The last combination, which yields an approximate solution of the algorithm, is done sequentially.
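The two-processor scheme described above can be sketched as follows, using Python threads in place of CRAY macrotasking (split, solve and combine stand for the geometric partitioning, the farthest-insertion heuristic and the subtour combination; all names are this sketch's assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

# One split, two tasks working asynchronously, one final combination
# performed by the initial task.
def solve_with_two_tasks(problem, split, solve, combine):
    left, right = split(problem)            # first array partitioning
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(solve, left)       # task 1
        f2 = pool.submit(solve, right)      # task 2
        # The only synchronization point: wait for both subtours,
        # then combine them in the initial task.
        return combine(f1.result(), f2.result())
```

The structure makes the paper's observation visible: the tasks share no state and synchronize exactly once, just before the final combination.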

Fig. 5. Speedup values for ETSP on CRAY X-MP/48: MAXELS = 14. (Curves: macrotasking and microtasking with 2 and 4 processors; x-axis: number of vertices, 100 to 900.)

Fig. 6. Speedup values for ETSP on CRAY X-MP/48: MAXELS = 4. (Curves: macrotasking and microtasking with 2 and 4 processors; x-axis: number of vertices, 10 to 50.)

Parallelizing the algorithm in this way, i.e. partitioning the arrays into four subarrays at once, the tour given as a solution of the algorithm may differ from the tour obtained by dividing into two parts at the beginning. In doing so, Karp's idea of dividing into two parts by x- or y-coordinates depending on the distribution of the vertices is lost; in general the bisection leads to better results. The same results are guaranteed only if the sequential program is forced to divide in the same direction the first three times. The execution times were measured on a dedicated machine using the function IRTC to count the clock periods. Comparing the speedup values obtained with macrotasking and microtasking, one can see that a better speedup is achieved using microtasking, especially for small problems; the reason is the small overhead of the microtasking calls. The speedup which can be achieved with four processors is nearly 3 (Fig. 5). In Fig. 6 the speedup values for small problems are shown. Considering the speedup values obtained with four processors, it is obvious that the best speedup is reached if every task has the same number of vertices, i.e. if N + 3 is a multiple of 4. An analogous effect can be observed for the results with two processors: the first two subtours have the same number of vertices if the number of vertices of the original problem is odd. The execution times of the sequential program and of both parallel versions for two processors (macrotasking and microtasking) measured on the CRAY X-MP/22 are nearly the same as for the CRAY X-MP/48; therefore diagrams for the CRAY X-MP/22 are omitted.

4. Conclusion

The speedup which can be achieved for Karp's partitioning algorithm is about 75% of the theoretically attainable speedup. It can be observed that microtasking leads to a better speedup than macrotasking, especially if the problems are small. Nevertheless, microtasking with four processors as implemented here results in different tour lengths in most cases, because this implementation deviates from the structure of the algorithm as designed by Karp. To get the same results, a parallelization with microtasking has to be done in a completely different manner and not as a derivation of the macrotasking version.

Acknowledgment

The author wants to thank Dr. P. Weidner, J.-Fr. Hake and W. Nagel for many helpful remarks.

References

[1] CRAY Computer Systems, CRAY X-MP Multitasking, Programmer's Reference Manual, Revision C, Cray Research Inc., SN-0222, 1986.
[2] B. Golden, L. Bodin, T. Doyle and W. Stewart, Jr., Approximate traveling salesman algorithms, Oper. Res. 28 (3) Part 2 (1980) 694-711.
[3] R. Gurke, Graphenalgorithmen für MIMD-Rechner, KFA Jülich, Jül-Spez-355, 1986.
[4] R. Karp, Probabilistic analysis of partitioning algorithms for the traveling salesman problem in the plane, Math. Oper. Res. 2 (1977) 209-224.
[5] M.J. Quinn, The design and analysis of algorithms and data structures for the efficient solution of graph theoretic problems on MIMD computers, Ph.D. Thesis, Computer Science Department, Washington State University, Pullman, WA, 1983.
