Reverse Search - Sebastian Nowozins slow blog

One of my all-time favorite algorithms is reverse search proposed by David Avis and Komei Fukuda in 1992, PDF.

Reverse search is an algorithm to solve enumeration problems, that is, problems where you would like to list a finite set of typically combinatorially related elements. Reverse search is not quite an algorithm, rather it is a general construction principle that is applicable to a wide variety of problems and often leads to optimal algorithms for enumeration problems.

Problems in which reverse search is applicable often have the flavour where the elements have a natural partial order (such as sets, sequences, graphs where we can define subsets, subsequences, and subgraphs), or where there is a natural neighborhood relation between elements which can be used to traverse from one element to the other (such as the linear programming bases considered in the Avis and Fukuda examples).

The reverse search construction leads to a structured search space that is also suitable for combinatorial search and optimization algorithms. For example, we can often readily use the resulting enumeration tree in branch-and-bound search methods. I made heavy use of this possibility during my PhD a few years ago during my work with Koji Tsuda, and reverse search is the working horse in my CVPR 2007, ICCV 2007, and ICDM 2008 papers. (Needless to say, I have fond memories of it, but even now I regularly see applications of the reverse search idea.) In the following, my presentation will differ quite a bit from the Avis and Fukuda paper.

Basic Idea

At its core reverse search is a method to organize all elements to be enumerated into a tree where the nodes in the tree each represent a single element. Each element appears exactly once in the tree and by traversing the tree from the root we can enumerate all elements exactly once.

Here is the recipe:

Define a ``reduction'' operation which takes an enumeration element and reduces it to a simpler one. This defines an enumeration tree.
Invert the reduction operation.
Enumerate all elements, starting from the root.

Let us illustrate this recipe first on a simple example: enumerating subsets of a given set. Say we are given the set \(\{1,2,3\}\) and would like to enumerate subsets. To define the reduction operation we simply say ``remove the largest integer from the set''. Formally, this defines defines a function \(f\) from the set of sets to the set of sets. Here is an illustration:

Set of three integers and reduction operation

Now we consider the inverse map \(f^{-1}\), from the set of sets to the set of powersets. Here is an illustration:

Inverse reduction operation

The inverse defines an enumeration strategy: we start at \(\emptyset\) and evaluate \(f^{-1}(\emptyset) = \{\{1\}, \{2\}, \{3\}\}\). For each set element we now recurse. This enumerates all elements in the tree exactly once.

The above recipe has the following practical advantages:

Reverse search often yields a simple algorithm.
Typically there is no additional memory or bookkeeping required beyond the recursion call stack, so that the total memory required is \(O(r)\) where \(r\) is the recursion depth.
Yields a output-linear polynomial-delay enumeration algorithms, which means that the total time complexity is linear in the number of items enumerated and for each item only polynomial time is needed. (This slightly unconventional notion of complexity makes sense for enumeration problems because the answer is often exponential in the size of the input.)
Often yields optimal enumeration algorithms in terms of memory and runtime.
The resulting algorithms are trivially parallelizable over the enumeration tree.

Ok, the above was a trivial example, let us look at a more complicated example.

Example: Enumerating all Connected Subgraphs

Let us consider a non-trivial application of the reverse search idea: enumerating all connected subgraphs of a given graph.

To apply the recipe, how could the reduction operation look like? Intuitively, we are given a connected graph and we could remove a single vertex from the graph, thereby making it smaller. By removing one vertex at a time we would eventually arrive at the empty graph.

But given a graph, how do we determine which vertex to remove? For this, let us assume all vertices in the given graph have a unique integer index. Then, given such a graph we can then attempt to remove the highest integer vertex, just as in the set example above. Here we hit a complication: upon removal of the vertex the graph may become disconnected. For example, consider the chain graph \(1-3-2\). Here the vertex labeled \(3\) would be removed, yielding two disconnected components, which violates the requirement of enumerating only connected subgraphs. Therefore we simply say: ``Remove the highest-index vertex such that the resulting graph remains connected''.

Here is an example of the reduction operation in action on the following simple cycle graph:

Cycle graph with four nodes

The enumeration tree of all fourteen connected subgraphs (counting the empty graph as well) looks as follows. Here each arrow is the application of one reduction operation.

All connected subgraphs of the cycle graph with four nodes

Looking at the above tree, you can note the following:

The graph \(1-4-2\) has the highest vertex \(4\) but this cannot be removed because it would yield a disconnected subgraph; therefore the reduction operation removes \(2\) instead.
By construction, there is a unique path from every graph to the root.
By construction only connected subgraphs are present in the tree, and each such graph is present exactly once.

In order to enumerate all connected subgraphs, we have to invert the arrows of this graph. That is, we have to invert the reduction operation and given a graph we have to generate all child nodes in the reversed graph. This reversion is what gives reverse search its name.

The inverse operation is described as follows: ``given a connected subgraph, add a vertex which will become the highest-index vertex and whose removal retains a connected graph." This is quite a mouthful but luckily the actual implementation is simple.

Here is a Julia implementation.

using LightGraphs

is_connected1(g::Graph) = nv(g) <= 1 ? true : is_connected(g)
is_removable(g::Graph, vset::IntSet, rmv) =
    is_connected1(induced_subgraph(g, setdiff(vset, rmv)))
rm_vertex(g::Graph, vset::IntSet) =
    maximum(filter(rmv -> is_removable(g, vset, rmv), vset))

function connsubgraphs(g::Graph)
    function _connsubgraphs(vset::IntSet)
        produce(copy(vset))   # output current subgraph vertex set

        # Generate child nodes of the current subgraph.
        # Consider all vertices not yet in graph
        for add_vi = filter(v -> !in(v, vset), vertices(g))
            push!(vset, add_vi)     # Add new vertex
            if is_connected1(induced_subgraph(g, vset)) &&
                add_vi == rm_vertex(g, vset)
                # Recurse
                _connsubgraphs(vset)
            end
            setdiff!(vset, add_vi)  # Remove new vertex
        end
    end
    function _connsubgraphs()
        _connsubgraphs(IntSet())
    end
    Task(_connsubgraphs)
end

g = Graph(4)
add_edge!(g, 1, 3)
add_edge!(g, 3, 2)
add_edge!(g, 4, 2)
add_edge!(g, 1, 4)
S = collect(connsubgraphs(g))

Note the key statements between the push! and setdiff! lines that govern the recursion. In the if-condition we check that the new graph remains connected and the added vertex would be the one that would be removed.

The above code uses the Julia producer-consumer pattern. When run, it produces the following output, identical to the above diagram.

14-element Array{Any,1}:
 IntSet([])          
 IntSet([1])         
 IntSet([1, 3])      
 IntSet([1, 2, 3])   
 IntSet([1, 2, 3, 4])
 IntSet([1, 3, 4])   
 IntSet([1, 4])      
 IntSet([1, 2, 4])   
 IntSet([2])         
 IntSet([2, 3])      
 IntSet([2, 3, 4])   
 IntSet([2, 4])      
 IntSet([3])         
 IntSet([4])

Conclusion

Reverse search is a general recipe to construct tree-structured enumeration methods useful for enumerating combinatorial sets and optimization over them.

In fact, it is so useful that some authors have reinvented reverse search without noticing. For example, the popular gSpan algorithm of Yan and Han published in 2003 defines a clever total ordering on labeled graphs essentially in order to be able to define the reduction operation needed in reverse search.

So, check it out, the Avis and Fukuda paper is very rich and well worth a read! (If you prefer a different presentation similar to the one above but more technical, have a look at my PhD thesis.)

Acknowledgements. I thank Koji Tsuda for reading a draft version of the article and providing feedback.