- As Robert Sedgewick has said, "It is an interesting computational puzzle to write a program that generates all possible ways of rearranging N distinct items." Applications include "traveling-salesman" problem, where it is necessary to find the shortest path through all points in a graph. Permutations are also used in the analysis of sorting algorithms.
Usual approaches to finding the permutations of a data set include the use of graphs as suggested by Robert Sedgewick or using recursion as suggested by Jacques Arsac. Both of these approaches have a common drawback they require helping data structures to perform the permutation. To create permutations using graphs, it is necessary to have data space for the data set to be permuted, as well as data space for the graph structures to perform the permutation. The recursive algorithm requires the overhead of recursive calls, or the use of a stack if implemented without recursion. These helping data structures can require as much space as the original data set. This paper discusses research in progress which will offer a new approach to the permutation problem, that retains the speed of the above algorithms, without requiring helping data structures.
Upon inspecting character data sets as they permute, it was found that through a set of logical steps a set of character data can be permuted into its next permutation. It then becomes possible to find all permutations in sorted order by repeatedly calling the algorithm on the last output obtained. These same steps can be used on any data set in which the data have magnitude relationships. It was also discovered that each permutation could be created "in place", so no additional helping structure is required. Preliminary analysis reveal that the creation of each permutation has a worst case running time of O(n) and a much faster average running time.
The SEPA Algorithm
- Three steps are needed to convert a data set to its next permutation. The first step is to find the key field for this permutation. The first field which changes between the two permutations is the key field. The key field in "abdca" as it becomes "acabd" (its next sorted order permutation) is the field containing the value 'b'.
- The key field can be determined by checking pairs of digits (or fields) starting from the rear (or right side) of the set. For each pair, if the left value of the pair is smaller than the right, then the field on the left is the key field. At this point all values to the right of the key field must be in sorted order from right to left (if not a field to the right would be the key field). If the key does not exist, the last permutation has been created. At this time, the data set is in reverse sorted order (from left to right).
- For clarification, those values to the right of the key field will be termed the "tail" of the set, while those to the left will be the "head". Note that the head, key, and tail will be different for each permutation.
- To find the next permutation in order, the key field must be exchanged with the smallest value in the tail which is larger than the key value. For example in the data set "abdca" the field with the value 'b' is the key, and the field containing the value 'c' has the smallest value in the tail which is larger than the key value. After swapping values, the set becomes "acdba". It is possible to find this value by comparing each value with the key value, starting at the end of the data set. The first value larger than the key value is the value to be exchanged with the key value. Notice that the tail (the 'c' is the key now) is still in reverse sorted order. This is important, because the last step is putting the tail into sorted order.
- Since the tail is in reverse sorted order, to sort it requires only a single loop, which works from both ends of the tail at the same time, swapping each pair of items while moving to the center of the tail. In this example, a character data set was used, but as stated before the logic behind this algorithm will work with other data sets, as long as there is a less than/greater than relationship among the data items. If all permutations are required, it is necessary that the original data set be the first permutation (exist in sorted order).
A quick review of the algorithm shows it is compose of three loops (an example in C is given as Listing 1). The first finds the key. The second finds the value to swap with the key. The third reorders the tail from reverse order to sorted order. A short analysis shows all three loops are bounded by the size of the data set, therefore the running time is bounded by O(n).
The author with proceed to create in-depth proof showing the algorithm produces all permutations, as well as giving formal worst case, and average run time bound analysis. Representative implementations of both graph and recursive based algorithms will be giving with similar analysis as a comparison to this new approach. A memory constraint analysis will also be given for each of the algorithms.
This algorithm's lack of helping data structures makes it a contribution to the field of Computer Science.
The author thanks Dr. Christopher Jones (Utah Valley State College) and Dr. Cory Barker (Brigham Young University-Hawaii Campus) for their support for this project.