Introduction
In this laboratory we investigated the union/find data structure as a means of solving problems that
would otherwise be difficult to solve. In particular, we implemented a class that encapsulates the
logic of creating and printing a maze based on an (n x m) matrix. In the latter section of the
laboratory, we used the union/find data structure to detect cycles in a weighted graph read from a
*.csv file and to perform Kruskal's Minimum Spanning Tree Algorithm. Finally, because the union/find
data structure was essential to this laboratory, we also analyzed the running time of the three
methods it can use: the standard, compressed, and ranked/height methods.
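To make the cycle-detection idea concrete, the following is a minimal sketch of Kruskal's algorithm on top of a union/find structure. It is not the code written for this lab: the class name KruskalSketch, the representation of each edge as an array {u, v, weight}, and the choice of path compression with union by rank are all assumptions made for illustration, and reading the *.csv file is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical names throughout; not the lab's actual classes.
class KruskalSketch {
    // Disjoint-set forest: parent[i] == i marks the root of a set.
    private final int[] parent;
    private final int[] rank;

    KruskalSketch(int vertexCount) {
        parent = new int[vertexCount];
        rank = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            parent[i] = i;
        }
    }

    // Find with path compression: every visited node is re-pointed at the root.
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }

    // Union by rank. Returns false when a and b are already in the same set,
    // which means the edge (a, b) would close a cycle.
    boolean union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) {
            return false;
        }
        if (rank[ra] < rank[rb]) {
            int t = ra;
            ra = rb;
            rb = t;
        }
        parent[rb] = ra;
        if (rank[ra] == rank[rb]) {
            rank[ra]++;
        }
        return true;
    }

    // Assumed edge format: {u, v, weight}. Returns the edges kept in the tree.
    static List<int[]> minimumSpanningTree(int vertexCount, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, Comparator.comparingInt((int[] e) -> e[2]));
        KruskalSketch sets = new KruskalSketch(vertexCount);
        List<int[]> tree = new ArrayList<>();
        for (int[] e : sorted) {
            // Skip any edge whose endpoints are already connected.
            if (sets.union(e[0], e[1])) {
                tree.add(e);
            }
        }
        return tree;
    }
}
```

Sorting the edges by weight and rejecting every edge for which union returns false is what ties cycle detection and the spanning tree together: an edge is kept only if it joins two previously separate sets.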
Maze (Maze.java)
To process the labyrinth walls properly, we need a system to represent them. In an (n x m)
matrix, each cell considered by the algorithm typically has two walls, the bottom and the right.
Every row has 2m – 1 walls, because the last column has no right wall, with the exception of the
last row, which has no bottom walls. The scheme is shown below for a 3 x 4 matrix:
Figure 5.1
For instance, this example has 4 columns, so each row (with the exception of the last row) has
2m – 1 = 2 * 4 – 1 = 7 walls. You can see that the walls of the first row are numbered from 0 up to 6.
Numbering the walls, however, does not by itself tell us the cell numbers. Couldn't we just divide the
wall number by 2 to get the cell number? That works for the first row, but each row has one wall fewer
than two per cell, so dividing by two falls behind by one more in every subsequent row. Because each
additional row is missing one more wall, I devised the following rule to determine the cell number for
a given wall:
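IF current Row is not last Row THEN

$$\text{Cell Number} = \left\lfloor \frac{\text{wall number} + \frac{\text{wall number}}{2m - 1}}{2} \right\rfloor$$

ELSE

$$\text{Cell Number} = \left\lfloor \frac{\text{wall number} + \frac{\text{wall number}}{2m - 1} + (\text{wall number}) \bmod (2m - 1) + 1}{2} \right\rfloor$$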
Notice that if the wall lies in the last row, we lose twice as many walls, since the bottom walls are
not considered there. To account for this loss, we add the number of walls we are missing, which
depends on the column the wall belongs to, and then add one. We add the one because 0 mod x is
always 0.
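The two-branch rule above can be sketched in Java as follows. This is an illustration, not the code from Maze.java: the class and method names are hypothetical, and the inner fraction is assumed to be an integer division (which also yields the row index), a choice that does not change the result once the outer floor is applied.

```java
// Hypothetical helper; names are illustrative, not taken from Maze.java.
class WallMapping {
    // Returns the cell number for a wall in a grid with n rows and m columns,
    // following the two-branch rule with integer division throughout.
    static int cellOfWall(int wall, int n, int m) {
        int wallsPerRow = 2 * m - 1;
        // Assumption: integer division, so this is also the wall's row index.
        int row = wall / wallsPerRow;
        if (row != n - 1) {
            // Not the last row: add one per preceding row, then halve.
            return (wall + row) / 2;
        }
        // Last row: additionally add the column offset and one, then halve.
        return (wall + row + wall % wallsPerRow + 1) / 2;
    }
}
```

For the 3 x 4 example, walls 0 and 1 both map to cell 0, wall 7 (the first wall of the second row) maps to cell 4, and wall 14 (the first wall of the last row) maps to cell 8.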
The class generates the maze by repeatedly selecting a random wall from the range
[0, 2nm – n – m), since there are 2nm – n – m walls in total, calculating the numbers of the two cells
it separates, and using the union/find data structure to find out whether those two cells are already
connected. If they are not, the wall is removed and the two sets are merged. The loop continues until
the first and last cells are connected, that is, until they belong to the same set.
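The generation loop can be sketched as below. This is a sketch under stated assumptions rather than the lab's Maze.java: the class name MazeSketch, the seed parameter, and the removed array are hypothetical. In particular, the source does not say which of a cell's two walls comes first in the numbering, so the code assumes that within a row an even offset is a right wall and an odd offset is a bottom wall, with the single wall of the last column being a bottom wall. The cell computation is written in terms of row and offset, which is algebraically the same as the rule given earlier.

```java
import java.util.Random;

// Hypothetical sketch; not the lab's Maze.java.
class MazeSketch {
    final int n;
    final int m;
    final boolean[] removed;      // removed[w] is true once wall w is knocked down
    private final int[] parent;   // union/find forest over the n * m cells

    MazeSketch(int n, int m, long seed) {
        this.n = n;
        this.m = m;
        int wallCount = 2 * n * m - n - m;
        removed = new boolean[wallCount];
        parent = new int[n * m];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        Random random = new Random(seed);
        // Keep removing walls until the first and last cells share a set.
        while (find(0) != find(n * m - 1)) {
            int wall = random.nextInt(wallCount);
            int[] cells = cellsOfWall(wall);
            int a = find(cells[0]);
            int b = find(cells[1]);
            if (a != b) {
                parent[a] = b;
                removed[wall] = true;
            }
        }
    }

    // Find with path halving, a simple form of path compression.
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Returns the two cells separated by a wall.
    // ASSUMPTION: even offset = right wall, odd offset = bottom wall, and the
    // last column's only wall is a bottom wall; the last row has right walls only.
    int[] cellsOfWall(int wall) {
        int wallsPerRow = 2 * m - 1;
        int row = wall / wallsPerRow;
        int offset = wall % wallsPerRow;
        if (row == n - 1) {
            int cell = row * m + offset;
            return new int[] {cell, cell + 1};
        }
        int cell = row * m + offset / 2;
        boolean bottom = offset % 2 == 1 || offset == wallsPerRow - 1;
        return new int[] {cell, bottom ? cell + m : cell + 1};
    }
}
```

Because a wall is removed only when its two cells lie in different sets, the removed walls never create a cycle, and at most nm – 1 walls can be removed before every cell is connected.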
The program is structured as multiple classes, each encapsulating one aspect of the lab.
Main.java provides an easy-to-follow menu that illustrates the usage of these classes and gives
the user the option to re-run the program.
Experimental Results
In the experiments we measured the running times of all three implementations of the
union/find data structure. The results are shown below, in milliseconds:
n      Standard    Compression    Rank/Height
50     16          4              2
100    114         5              7
150    348         22             12
200    418         18             23
250    524         36             37
300    398         36             57
350    18886       62             47
400    14901       99             57
450    44705       224            121
500    147509      115            100
Table 5.1
Figure 5.2
The data structure was tested by setting both of the dimensions to the same value n. This lets us
increase the input size by a constant step between executions of each method. The results above
show that both the compression and the rank/height methods run significantly faster than the
standard method: at n = 500 the standard method took 147509 ms, against 115 ms and 100 ms for the
other two.
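One way to see where such a gap can come from is to count parent-pointer steps instead of milliseconds. The sketch below is an assumption about what the three methods look like, not the implementations timed in this lab: the class UnionFindVariants, the Mode enum, the hops counter, and the chain-shaped workload are all hypothetical, and the workload is chosen to be unfavorable to the plain method rather than to reproduce the measurements above.

```java
// Hypothetical comparison harness; not the implementations timed in the lab.
class UnionFindVariants {
    enum Mode { STANDARD, COMPRESSION, RANK }

    private final int[] parent;
    private final int[] height;
    private final Mode mode;
    long hops = 0;   // parent-pointer steps taken while locating roots

    UnionFindVariants(int size, Mode mode) {
        this.mode = mode;
        parent = new int[size];
        height = new int[size];
        for (int i = 0; i < size; i++) {
            parent[i] = i;
        }
    }

    int find(int x) {
        int root = x;
        while (parent[root] != root) {
            root = parent[root];
            hops++;
        }
        if (mode == Mode.COMPRESSION) {
            // Second pass: point every node on the path directly at the root.
            while (parent[x] != root) {
                int next = parent[x];
                parent[x] = root;
                x = next;
            }
        }
        return root;
    }

    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) {
            return;
        }
        if (mode == Mode.RANK) {
            // Attach the shorter tree under the taller one.
            if (height[ra] < height[rb]) {
                int t = ra;
                ra = rb;
                rb = t;
            }
            parent[rb] = ra;
            if (height[ra] == height[rb]) {
                height[ra]++;
            }
        } else {
            // Plain linking: always hang the first root under the second.
            parent[ra] = rb;
        }
    }

    // Merges elements 1..size-1 into the set of element 0, one at a time,
    // and reports how many pointer steps the finds needed in total.
    static long hopsForChain(int size, Mode mode) {
        UnionFindVariants sets = new UnionFindVariants(size, mode);
        for (int i = 0; i + 1 < size; i++) {
            sets.union(0, i + 1);
        }
        return sets.hops;
    }
}
```

On this workload the plain method builds one long chain, so the number of steps grows quadratically with the number of elements, while the other two modes keep every find short. A random maze workload is less extreme, so the counts here should be read as an illustration of the mechanism only.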
Conclusion
The union/find data structure, although a very simple one, is a helpful tool that allows us to solve
more abstract problems. Because it is such a useful and frequently used tool, one must choose the
proper implementation of the structure. In a particular application, one implementation may be
preferable to the other two. In this lab, however, we have concluded that union/find structures using
either the compression or the rank/height method are the ideal candidates for use.