Nearest Path Algorithm
Nearest Path Algorithm
c 2005 Society for Industrial and Applied Mathematics
Vol. 34, No. 6, pp. 1398–1431
Abstract. We present a new scheme for computing shortest paths on real-weighted undirected
graphs in the fundamental comparison-addition model. In an efficient preprocessing phase our al-
gorithm creates a linear-size structure that facilitates single-source shortest path computations in
O(m log α) time, where α = α(m, n) is the very slowly growing inverse-Ackermann function, m the
number of edges, and n the number of vertices. As special cases our algorithm implies new bounds
on both the all-pairs and single-source shortest paths problems. We solve the all-pairs problem
in O(mn log α(m, n)) time and, if the ratio between the maximum and minimum edge lengths is
O(1)
bounded by n(log n) , we can solve the single-source problem in O(m + n log log n) time. Both
these results are theoretical improvements over Dijkstra’s algorithm, which was the previous best
for real weighted undirected graphs. Our algorithm takes the hierarchy-based approach invented by
Thorup.
Key words. single-source shortest paths, all-pairs shortest paths, undirected graphs, Dijkstra’s
algorithm
DOI. 10.1137/S0097539702419650
∗ Received by the editors December 13, 2002; accepted for publication (in revised form) October
15, 2004; published electronically July 26, 2005. This work was supported by Texas Advanced
Research Program grant 003658-0029-1999 and NSF grant CCR-9988160. A preliminary version of
this paper, titled Computing shortest paths with comparisons and additions, was presented at the
13th Annual ACM-SIAM Symposium on Discrete Algorithms, 2002, San Francisco, CA.
http://www.siam.org/journals/sicomp/34-6/41965.html
† Max Planck Institut für Informatik, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany
([email protected]). This author’s work was also supported by an Alexander von Humboldt
Postdoctoral Fellowship and by an MCD Graduate Fellowship.
‡ Department of Computer Sciences, The University of Texas at Austin, Austin, TX 78712 (vlr@
cs.utexas.edu).
1398
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1399
has also been a focus on computing approximate shortest paths—see Zwick’s recent
survey [Z01]. One common assumption is that the graph is integer-weighted, though
structurally unrestricted, and that the machine model is able to manipulate the in-
teger representation of weights. Shortest path algorithms based on scaling [G85b,
GT89, G95] and fast matrix multiplication [Sei95, GM97, AGM97, Tak98, SZ99, Z02]
have running times that depend on the magnitude of the integer edge weights, and
therefore yield improved algorithms only for sufficiently small edge weights. In the
case of the matrix multiplication–based algorithms the critical threshold is rather
low: even edge weights sublinear in n can be too large. Dijkstra’s algorithm can be
sped up in the integer-weight model by using an √integer priority queue.1 The best
bounds on Dijkstra’s algorithm to date are O(m log log n) (expected) [HT02] and
O(m + n log log n) [Tho03]. Both of these algorithms use multiplication, a non-AC 0
operation; see [Tho03] for bounds in the AC 0 model. Thorup [Tho99] considered
the restricted case of integer-weighted undirected graphs and showed that on an AC 0
random access machine (RAM), shortest paths could be computed in linear time.
Thorup invented what we call in this paper the hierarchy-based approach to shortest
paths.
The techniques developed for integer-weighted graphs (scaling, matrix multipli-
cation, integer sorting, and Thorup’s hierarchy-based approach) seem to depend cru-
cially on the graph being integer-weighted. This state of affairs is not unique to the
shortest path problem. In the weighted matching [G85b, GT89, GT91] and maxi-
mum flow problems [GR98], for instance, the best algorithms for real- and integer-
weighted graphs have running times differing by a polynomial factor. For the shortest
path problem on positively weighted graphs the integer/real gap is only logarith-
mic. It is of great interest whether an integer-based approach is inherently so, or
whether it can yield a faster algorithm for general, real-weighted inputs. In this
paper we generalize Thorup’s hierarchy-based approach to the comparison-addition
model (see section 2.1) and, as a corollary, to real-weighted input graphs. For the
undirected APSP problem we nearly eliminate the existing integer/real gap, reducing
it from log n to log α(m, n), where α is the incomprehensibly slowly growing inverse-
Ackermann function. Before stating our results in detail, we first give an overview
of the hierarchy-based approach and discuss the recent hierarchy-based shortest path
algorithms [Tho99, Hag00, Pet04, Pet02b].
Hierarchy-based algorithms should be thought of as preprocessing schemes for
answering SSSP queries in nonnegatively weighted graphs. The idea is to compute
a small non–source-specific structure that encodes useful information about all the
shortest paths in the graph. We measure the running time of a hierarchy-based algo-
rithm with two quantities: P, the worst case preprocessing cost on the given graph,
and M, the marginal cost of one SSSP computation after preprocessing. Therefore,
solving the s-sources shortest path problem requires s · M + P time. If s = n, that is,
if we are solving APSP, then for all known hierarchy algorithms the P term becomes
negligible. However, P may be dominant (in either the asymptotic or real-world sense)
for smaller values of s. In Thorup’s original algorithm [Tho99], P and M are both
O(m); recall that his algorithm works on integer-weighted undirected graphs. Hagerup
[Hag00] adapted Thorup’s algorithm to integer-weighted directed graphs, incurring a
slight loss of efficiency in the process. In [Hag00], P = O(min{m log log C, m log n}),2
1 It can also be sped up using an integer sorting algorithm in conjunction with Thorup’s reduction
where C is the maximum edge weight and M = O(m + n log log n). After the ini-
tial publication of our results [PR02a], Pettie [Pet04, Pet02b] gave an adaptation of
the hierarchy-based approach to real-weighted directed graphs. The main result of
[Pet04] is an APSP algorithm running in time O(mn + n2 log log n), which improved
upon the O(mn + n2 log n) bound derived from multiple runs of Dijkstra’s algorithm
[Dij59, J77, FT87]. The result of [Pet04] is stated in terms of the APSP problem
because its preprocessing cost P is O(mn), making it efficient only if s is very close to
n. In [Pet02b] (see also [Pet03]) the nonuniform complexity of APSP is considered;
the main result of [Pet02b] is an algorithm performing O(mn log α(m, n)) comparison
and addition operations. This bound is essentially optimal when m = O(n) due to
the trivial Ω(n2 ) lower bound on APSP.
In this paper we give new bounds on computing undirected shortest paths in
real-weighted graphs. For our algorithm, the preprocessing cost P is O(mst(m, n) +
min{n log n, n log log r}), where mst(m, n) is the complexity of the minimum span-
ning tree problem and r is the ratio of the maximum-to-minimum edge weight. This
bound on P is never worse than O(m + n log n), though if r is not excessively large,
O(1)
say less than n(log n) , P is O(m + n log log n). We show that the marginal cost M
of our algorithm is asymptotically equivalent to split-findmin(m, n), which is the
decision-tree complexity of a certain data structuring problem of the same name. It
was known that split-findmin(m, n) = O(mα(m, n)) [G85a]; we improve this bound
to O(m log α(m, n)). Therefore, the marginal cost of our algorithm is essentially (but
perhaps not precisely) linear. Theorem 1.1 gives our general result, and Corollaries
1.2 and 1.3 relate it to the canonical APSP and SSSP problems, respectively.
Theorem 1.1. Let P = mst(m, n) + min{n log n, n log log r}, where m and n are
the number of edges and vertices in a given undirected graph, r bounds the ratio of
any two edge lengths, and mst(m, n) is the cost of computing the graph’s minimum
spanning tree. In O(P) time an O(n)-space structure can be built that allows the com-
putation of SSSPs in O(split-findmin(m, n)) time, where split-findmin(m, n) =
O(m log α(m, n)) represents the decision-tree complexity of the split-findmin problem
and α is the inverse-Ackermann function.
Corollary 1.2. The undirected APSP problem can be solved on a real-weighted
graph in O(n · split-findmin(m, n)) = O(mn log α(m, n)) time.
Corollary 1.3. The undirected SSSP problem can be solved on a real-weighted
graph in O(split-findmin(m, n)+mst(m, n)+min{n log n, n log log r}) = O(mα(m, n)+
min{n log n, n log log r}) time.
The running time of our SSSP algorithm (Corollary 1.3) is rather unusual. It con-
sists of three terms, where the first two are unknown (but bounded by O(mα(m, n)))
and the third depends on a nonstandard parameter: the maximum ratio of any two
edge lengths.3 A natural question is whether our SSSP algorithm can be substantially
improved. In section 6 we formally define the class of “hierarchy-based” SSSP algo-
rithms and show that any comparison-based undirected SSSP algorithm in this class
must take time Ω(m+min{n log n, n log log r}). This implies that our SSSP algorithm
is optimal for this class, up to an inverse-Ackermann factor, and that no hierarchy-
based SSSP algorithm can improve on Dijkstra’s algorithm, for r unbounded.
Pettie, Ramachandran, and Sridhar [PRS02] implemented a simplified version of
our algorithm. The observed marginal cost of the [PRS02] implementation is nearly
linear, which is in line with our asymptotic analysis. Although it is a little slower
3 Dinic’s implementation [Din78, Din03] of Dijkstra’s algorithm also depends on r, in both time
than Dijkstra’s algorithm in solving SSSP, it is faster in solving the s-sources shortest
path problem, in some cases for s as small as 3. In many practical situations it is
the s-sources problem, not SSSP, that needs to be solved. For instance, if the graph
represents a physical network, such as a network of roads or computers, it is unlikely
to change very often. Therefore, in these situations a nearly linear preprocessing cost
is a small price to pay for more efficient shortest path computations.
1.1. An overview. In section 2 we define the SSSP and APSP problems and
review the comparison-addition model and Dijkstra’s algorithm [Dij59]. In section
3 we generalize the hierarchy approach to real-weighted graphs and give a simple
proof of its correctness. In section 4 we propose two implementations of the general
hierarchy-based algorithm, one for proving the asymptotic bounds of Theorem 1.1
and one that is simpler and uses more standard data structures. The running times
of our implementations depend heavily on having a well-balanced hierarchy. In section
5 we give an efficient method for constructing balanced hierarchies; it is based on a
hierarchical clustering of the graph’s minimum spanning tree. In section 6 we prove a
lower bound on the class of hierarchy-based undirected SSSP algorithms. In section
7 we discuss avenues for further research.
2. Preliminaries. The input is a weighted, undirected graph G = (V, E, ),
where V = V (G) and E = E(G) are the sets of n vertices and m edges, respectively,
and : E → R assigns a real length to each edge. The distance from vertex u to
vertex v, denoted d(u, v), is the length of the minimum length path from u to v, or
∞ if there is no such path from u to v, or −∞ if there is no such path of minimum
length. The APSP problem is to compute d(u, v), for (u, v) ∈ V × V , and the SSSP
problem is to compute d(u, v) for some specified source u and all v ∈ V .
If, in an undirected graph, some connected component contains an edge of negative
length, say e, then the distance between two vertices u and v in that component is −∞:
one can always construct a path of arbitrarily small length by concatenating a path
from u to e, followed by the repetition of e a sufficient number of times, followed by a
path from e to v. Without loss of generality we will assume that : E → R+ assigns
only positive edge lengths. A slightly restricted problem (which forbids the types of
paths described above) is the shortest simple path problem. This problem is NP-hard
as it generalizes the Hamiltonian path problem. However, Edmonds showed that when
there is no negative weight simple cycle, the problem is solvable in polynomial time
by a reduction to weighted matching—see [AMO93, p. 496] and [G85a].
2.1. The comparison-addition model. We use the term comparison-addition
model to mean any uniform model in which real numbers are subject to only compar-
ison and addition operations. The term comparison-addition complexity refers to the
number of comparison and addition operations, ignoring other computational costs.
In the comparison-addition model we leave unspecified the machine model used for
all data structuring tasks. Our results as stated hold when that machine model is a
RAM. If instead we assume a pointer machine [Tar79], our algorithms slow down by
at most an inverse-Ackermann factor.4
The comparison-addition model has some aesthetic appeal because it is the sim-
plest model appropriate to computing shortest paths and many other network opti-
4 The only structure we use whose complexity changes between the RAM and pointer machine
models is the split-findmin structure. On a pointer machine there are matching upper and lower
bounds of Θ(mα) [G85a, LaP96], whereas on the RAM the complexity is somewhere between Ω(m)
and O(m log α)—see Appendix B.
1402 SETH PETTIE AND VIJAYA RAMACHANDRAN
our algorithm does not, and neither do [F76, Tak92, Han04, Z04, Pet04, Pet02b].
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1403
min{n log r, n log n}) lower bound on computing directed SSSP with a “hierarchy-
type” algorithm, where r bounds the ratio of any two edge lengths. In section 6 we
prove a lower bound of Ω(m + min{n log log r, n log n}) on hierarchy-type algorithms
for undirected SSSP. These last two lower bounds are essentially tight for hierarchy-
type algorithms, on directed and undirected graphs, respectively.
Graham, Yao, and Yao [GYY80] proved that the information-theoretic argument
cannot prove a nontrivial ω(n2 ) lower bound on the comparison-complexity of APSP,
where additions are granted for free. It is also simple to see that there can be no
nontrivial information-theoretic lower bound on SSSP.
2.2. Dijkstra’s algorithm. Our algorithm, like [Tho99, Hag00], is best un-
derstood as circumventing the limitations of Dijkstra’s algorithm. We give a brief
description of Dijkstra’s algorithm in order to illustrate its complexity and introduce
some vocabulary.
For a vertex set S ⊆ V (G), let dS (u, v) denote the distance from u to v in the
subgraph induced by S ∪ {v}. Dijkstra’s algorithm maintains a tentative distance
function D(v) and a set of visited vertices S satisfying Invariant 2.1. Henceforth, s
denotes the source vertex.
Invariant 2.1. Let s be the source vertex and v be an arbitrary vertex:
d(s, v) if v ∈ S,
D(v) =
dS (s, v) if v ∈ S.
Choosing an initial assignment of S = ∅, D(s) = 0, and D(v) = ∞ for v = s
clearly satisfies the invariant. Dijkstra’s algorithm consists of repeating the following
step n times: choose a vertex v ∈ V (G)\S such that D(v) is minimized, set S :=
S ∪ {v}, and finally, update tentative distances to restore Invariant 2.1. This last
part involves relaxing each edge (v, w) by setting D(w) = min{D(w), D(v) + (v, w)}.
Invariant 2.1 and the positive-weight assumption imply D(v) = d(s, v) when v is
selected. It is also simple to prove that relaxing outgoing edges of v restores Invariant
2.1.
The problem with Dijkstra’s algorithm is that vertices are selected in increasing
distance from the source, a task that is at least as hard as sorting n numbers. Main-
taining Invariant 2.1, however, does not demand such a particular ordering. In fact,
it can be seen that selecting any vertex v ∈ S for which D(v) = d(s, v) will maintain
Invariant 2.1. All hierarchy-type algorithms [Tho99, Hag00, Pet04, Pet02b] maintain
Invariant 2.1 by generating a weaker certificate for D(v) = d(s, v) than “D(v) is min-
imal.” Any such certificate must show that for all u ∈ S, D(u) + d(u, v) ≥ D(v). For
example, Dijkstra’s algorithm presumes there are no negative length edges, hence
d(u, v) ≥ 0, and by choice of v ensures D(u) ≥ D(v). This is clearly a suffi-
cient certificate. In Dinic’s version [Din78] of Dijkstra’s algorithm the lower bound
d(u, v) ≥ min is used, where min is the minimum edge length. Thus Dinic is free
to visit any v ∈ S for which D(v)/min is minimal. All hierarchy-type algorithms
[Tho99, Hag00, Pet04, Pet02b], ours included, precompute a much stronger lower
bound on d(u, v) than d(u, v) ≥ min .
3. The hierarchy approach and its correctness. In this section we gener-
alize the hierarchy-based approach of [Tho99] to real-weighted graphs. Because the
algorithm follows directly from its proof of correctness, we will actually give a kind of
correctness proof first.
Below, X ⊆ V (G) denotes any set of vertices, and s always denotes the source
vertex. Let I be a real interval. The notation X I refers to the subset of X whose
1404 SETH PETTIE AND VIJAYA RAMACHANDRAN
4.1. Visit. Consider the Visit procedure given below. We prove that Visit cor-
rectly computes SSSPs by demonstrating that it is an implementation of Generalized-
Visit, which was already proved correct.
Visit(x, [a, b)).
Input: x is a node in a proper hierarchy H; V (x) is (S, [a, b))-safe and
Invariant 2.1 is satisfied.
Output guarantee: Invariant 2.1 is satisfied and Spost = Spre ∪ V (x)[a,b) ,
where Spre and Spost are the set S before and after the call.
1. If x is a leaf and D(x) ∈ [a, b), then let S := S ∪ {x}, relax all edges
incident on x, restoring Invariant 2.1, and return.
2. If Visit(x, ·) is being called for the first time, create a bucket array of
diam(x)/norm(x) + 1 buckets. Bucket i represents the interval
D(x) if D(x) + diam(x) < b,
where ax = b−D(x)
b − norm (x) norm(x) otherwise.
Fig. 1. First observe that when ax is initialized we have D(x) ≥ ax ≥ a, as in the figure. If
ax is chosen such that norm(x) divides (b − ax ), then by Definition 4.1(ii) either norm(x) divides
norm(p(x)) (which puts us in Case 1) or diam(x) < norm(p(x)) (putting us in Case 2); that is,
norm(x) does not divide (b + norm(p(x)) − ax ), but it does not matter since we’ll never reach
b + norm(p(x)) anyway. If ax is chosen so that norm(x) does not divide (b − ax ), then ax = D(x)
and D(x) + diam(x) < b (putting us in Case 3), meaning we will never reach b. Note that by
the definition of diam(x) (Definition 4.1) and Invariant 2.1, for any vertex in u ∈ V (x) we have
d(s, u) ≤ d(s, x) + diam(x) ≤ D(x) + diam(x).
hierarchies (Definitions 3.2 and 4.1). Since the bucketed nodes form a norm(x)-
partition, one can easily see that the recursive calls in step 3 of Visit correspond
to the recursive calls in Generalized-Visit. However, their interval arguments are
different. We sketch below why this change does not affect correctness.
In Generalized-Visit the intervals passed to recursive calls are of the form
[a , min{a + t, b}), whereas in Visit they are [a , a + t) = [a , a + norm(x)). We will
argue why a + t = a + norm(x) is never more than b. The main idea is to show that
we are always in one of the three cases portrayed in Figure 1.
If norm(x) divides norm(p(x)) and ax is chosen in step 2 so that t = norm(x)
divides (b − ax ), then we can freely substitute the interval [a , a + t) for [a , min{a +
t, b}) since they will be identical. Note that in our algorithm (b − a) = norm(p(x)).6
The problems arise when norm(x) does not divide either norm(p(x)) or (b − ax ).
In order to prove the correctness of Visit we must show that the input guarantee
(regarding safe-ness) is satisfied for each recursive call. We consider two cases: when
we are in the first recursive call to Visit(x, ·) and any subsequent call. Suppose we
are in the first recursive call to Visit(x, ·). By our choice of ax in step 2, either
b = ax + q · norm(x) for some integer q, or b > D(x) + diam(x) = ax + diam(x). If
it is the first case, each time the outer while-loop is entered we have a < b, which,
6 Strictly speaking, this does not hold for the initial call because in this case, x = root(H) is the
root of the hierarchy H and there is no such node p(x). The argument goes through just fine if we
let p(root(H)) denote a dummy node such that norm(p(root(H))) = ∞.
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1409
Theorem 4.2, given below, establishes some new bounds on the problem that are
just slightly better than Gabow’s original data structure [G85a]. Refer to Appendix
B for a proof. Thorup [Tho99] gave a similar data structure for integer keys in the
RAM model that runs in linear time. It relies on the RAM’s ability to sort small sets
of integers in linear time [FW93].
Theorem 4.2. The split-findmin problem can be solved on a pointer machine in
O(n + mα) time while making only O(n + m log α) comparisons, where α = α(m, n) is
the inverse-Ackermann function. Alternatively, split-findmin can be solved on a RAM
in time Θ(split-findmin(m, n)), where split-findmin(m, n) = O(n+m log α) is the
decision-tree complexity of the problem.
We use the split-findmin structure to maintain D-values as follows. In the be-
ginning there is one sequence consisting of the n leaves of H in an order consistent
with some depth-first search traversal of H. For any leaf v in H we maintain, by
appropriate decrease-key operations, that key(v) = D(v). During execution of Visit
we will say an H-node is unresolved if it lies in another node’s bucket array but its
tentative distance (D-value) is not yet finalized. The D-value of an H-node becomes
finalized, in the sense that it never decreases again, during step 3 of Visit, either by
being removed from some bucket array or passed, for the first time, to a recursive call
of Visit. (It follows from Definition 3.1 and Invariant 2.1 that D(y) = d(s, y) at the
first recursive call to y.) One can verify a couple properties of the unresolved nodes.
1410 SETH PETTIE AND VIJAYA RAMACHANDRAN
First, each unvisited leaf has exactly one unresolved ancestor. Second, to implement
Visit we need only query the D-values of unresolved nodes. Therefore, we maintain
that for each unresolved node y, there is some sequence in the split-findmin structure
corresponding to V (y), the descendants of y. Now suppose that a previously unre-
solved node y is resolved in step 3 of Visit. The deg(y) children of y will immediately
become unresolved, so to maintain our correspondence between sequences and unre-
solved nodes, we perform deg(y) − 1 split operations on y’s sequence, so that the
resulting subsequences correspond to y’s children.
We remark that the split-findmin structure we use can be simplified slightly be-
cause we know in advance where the splits will occur. However, this knowledge does
not seem to affect the asymptotic complexity of the problem.
Lemma 4.3. Let Δx ≥ 1 denote the number of buckets between the first open
bucket at the time of x’s insertion and the bucket from which x was enumerated.
The
bucket-heap can be implemented on a pointer machine to run in O(N + x log Δx )
time, where N is the number of operations.
When Visit(x, ·) is called for the first time, we initialize the bucket-heap at x
with a call to create(norm(x), ax ), followed by a number of insert operations for each
of x’s children, where the key of a child is its D-value. Here ax is the beginning of the
real interval represented by the bucket array, and norm(x) the width of each bucket.
Every time the D-value of a bucketed node decreases, which can easily be detected
with the split-findmin structure, we perform a decrease-key on the corresponding item
in the bucket-heap. We usually refer to buckets not by their cardinal number but by
their associated real interval, e.g., bucket [ax , ax + norm(x)).
4.4. Analysis of Visit. In this section we bound the time required to compute
SSSP with Visit as a function of m, n, and the given hierarchy H. We will see
later that the dominant term in this running time corresponds to the split-findmin
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1411
structure, whose complexity is no more than O(m log α) but could turn out to be
linear.
Lemma 4.4. Let H be a proper hierarchy. Computing SSSPs with Visit on
H takes time O(split-findmin(m, n) + φ(H)), where split-findmin(m, n) is the
complexity of the split-findmin problem and
diam(x) diam(p(x))
φ(H) = + log +1 .
norm(x) norm(p(x))
x∈H
x∈H such that
norm(x)=norm(p(x))
Proof. The split-findmin(m, n) term represents the time to relax edges (in step
1) and update the relevant D-values of H-nodes, as described in section 4.2. Except
for the costs associated with updating D-values, the overall time of Visit is linear
in the number of recursive calls and the bucketing costs. The two terms of φ(H)
represent these costs. Consider the number of calls to Visit(x, I) for a particular H-
node x. According to step 3 of Visit, there will be zero calls to x unless norm(x) =
norm(p(x)). If it is the case that norm(x) = norm(p(x)), then for all recursive calls
on x, the given interval I will have the same width: norm(z) for some ancestor z of x.
By Definition 4.1(i), norm(z) ≥ norm(x), and therefore the number of such recursive
calls on x is ≤ diam(x)/norm(x) + 2; the extra 2 counts the first and last recursive
calls, which may cover negligible parts of the interval [d(s, x), d(s, x) + diam(x)]. By
Definition 4.1(iii),|H| < 2n, and therefore the total number of recursive calls is
bounded by 4n + x diam(x)/norm(x), where the summation is over H nodes whose
norm-values differ from their parents’ norm-values.
Now consider the bucketing costs of Visit if implemented with the bucket-heap.
According to steps 2 and 3, a node y is bucketed either because Visit(p(y), ·) was
called for the first time, or its parent p(y) was removed from the first open bucket (of
some bucket array), say bucket [a, a + norm(p(y))). In either case, this means that
d(s, p(y)) ∈ [a, a + norm(p(y))) and that d(s, y) ∈ [a, a + norm(p(y)) + diam(p(y))).
To use the terminology of Lemma 4.3, Δy ≤ diam(p(y))/norm(p(y)), and the
total
bucketing costs would be #(buckets scanned) + #(insertions) + #(dec-keys) +
x log(diam(p(x))/norm(p(x)) + 1), which is O(φ(H) + m + n).
In section 5 we give a method for constructing a proper hierarchy H such that
φ(H) = O(n). This bound together with Lemma 4.4 shows that we can compute
SSSP in O(split-findmin(m, n)) time, given a suitable hierarchy. Asymptotically
speaking, this bound is the best we are able to achieve. However, the promising
experimental results of a simplified version of our algorithm [PRS02] have led us to
design an alternate implementation of Generalized-Visit that is both theoretically
fast and easier to code.
Proof. The split-findmin term plays the same role in Visit-B as in Visit.
Visit-B is different than Visit in that it makes recursive calls on all hierarchy nodes,
not just those with different norm-values than their parents. Using the same argu-
ment as in Lemma 4.5, we can bound the number of recursive calls of the form Visit-
B(x, ·) as diam(x)/norm(x) + 2; this gives the first summation in ψ(H). Assuming
an optimal heap is used (for example, aFibonacci heap [FT87]), all decrease-keys
take O(m) time, and all deletions take x deg(x) log deg(x) time. The bound on
deletions follows since each of the deg(x) children of x is inserted into and deleted
from x’s heap at most once.
In section 5 we construct a hierarchy H such that ψ(H) = Θ(nlog∗ n), imply-
ing an overall bound on Visit-B of O(m + nlog∗ n), since split-findmin(m, n) =
O(mα(m, n)) = O(m + nlog∗ n). Even though ψ(H) = Ω(nlog∗ n) in the worst case,
we are only able to construct very contrived graphs for which this lower bound is
tight.
5. Efficient construction of balanced hierarchies. In this section we con-
struct a hierarchy that works well for both Visit and Visit-B. The construction
procedure has three distinct phases. In phase 1 we find the graph’s minimum span-
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1413
ning tree, denoted M , and classify its edges by length. This classification immediately
induces a coarse hierarchy, denoted H0 , which is analogous to the component hierarchy
defined by Thorup [Tho99] for integer-weighted graphs. Although H0 is proper, using
it to run Visit or Visit-B may result in a slow SSSP algorithm. In particular, φ(H0 )
and ψ(H0 ) can easily be Θ(n log n), giving no improvement over Dijkstra’s algorithm.
Phase 2 facilitates phase 3, in which we produce a refinement of H0 , called H; this is
the “well balanced” hierarchy we referred to earlier. The refined hierarchy H is con-
structed so as to minimize the φ(H) and ψ(H) terms in the running times of Visit and
Visit-B. In particular, φ(H) will be O(n), and ψ(H) will be O(nlog∗ n). Although
H could be constructed directly from M (the graph’s minimum spanning tree), we
would not be able to prove the time bound of Theorem 1.1 using this method. The
purpose of phase 2 is to generate a collection of small auxiliary graphs that—loosely
speaking—capture the structure and edge lengths of certain subtrees of the minimum
spanning tree. Using the auxiliary graphs in lieu of M , we are able to construct H in
phase 3 in O(n) time.
In section 5.1 we define all the notation and properties used in phases 1, 2, and
3 (sections 5.2, 5.3, and 5.4, respectively). In section 5.5 we prove that φ(H) = O(n)
and ψ(H) = O(nlog∗ n).
5.1. Some definitions and properties.
5.1.1. The coarse hierarchy. Our refined hierarchy H is derived from a coarse
hierarchy H0 , which is defined here and in section 5.2. Although H0 is typically very
simple to describe, the general definition of H0 is rather complicated since it must
take into account certain extreme circumstances. H0 is defined w.r.t. an increasing
sequence of norm-values: norm1 , norm2 , . . ., where all edge lengths are at least
as large as norm1 . (Typically normi+1 = 2 · normi ; however, this is not true in
general.) We will say that an edge e is at level i if (e) ∈ [normi , normi+1 ), or
alternatively, we may write norm(e) = normi to express that e is at level i. A level
i subgraph is a maximal connected subgraph restricted to edges with level i or less,
that is, with length strictly less than normi+1 . Therefore, the level zero subgraphs
consist of single vertices. A level i node in H0 corresponds to a nonredundant level
i subgraph, where a level i subgraph is redundant if it is also a level i − 1 subgraph.
This nonredundancy property guarantees that all nonleaf H0 -nodes have at least
two children. The ancestor relationship in H0 should be clear: x is an ancestor of
y if and only if the subgraph of y is contained in the subgraph of x, i.e., V (y) ⊆
V (x). The leaves of H0 naturally correspond to graph vertices, and the internal
nodes to subgraphs. The coarse hierarchy H0 clearly satisfies Definition 4.1(i), (iii),
(iv); however, we have to be careful in choosing the norm-values if we want it to be
a proper hierarchy, that is, for it to satisfy Definition 4.1(ii) as well. Our method for
choosing the norm-values is deferred to section 5.2.
5.1.2. The minimum spanning tree. By the cut property of minimum span-
ning trees (see [CLRS01, PR02c]) the H0 w.r.t. G is identical to the H0 w.r.t. M ,
the minimum spanning tree (MST) of G. Therefore, the remainder of this section is
mainly concerned with M , not the graph itself. If X ⊆ V (G) is a set of vertices, we let
M (X) be the minimal connected subtree of M containing X. Notice that M (X) can
include vertices outside of X. Later on we will need M to be a rooted tree in order to
talk coherently about a vertex’s parent, ancestors, children, and so on. Assume that
M is rooted at an arbitrary vertex. The notation root(M (X)) refers to the root of
the subtree M (X).
1414 SETH PETTIE AND VIJAYA RAMACHANDRAN
5.1.3. Mass and diameter. The mass of a vertex set X ⊆ V (G) is defined as
def
mass(X) = (e).
e∈E(M (X))
Extending this notation, we let M (x) = M (V (x)) and mass(x) = mass(V (x)),
where x is a node in any hierarchy. Since the MST path between two vertices in
M (x) is an upper bound on the shortest path between them, mass(x) is an upper
bound on the diameter of V (x). Recall from Definition 4.1 that diam(x) denoted any
upper bound on the diameter of V (x); henceforth, we will freely substitute mass(x)
for diam(x).
5.1.4. Refinement of the coarse hierarchy. We will say that H is a refine-
ment of H0 if all nodes in H0 are also represented in H. An equivalent definition,
which provides us with better imagery, is that H is derived from H0 by replacing each
node x ∈ H0 with a rooted subhierarchy H(x), where the root of H(x) corresponds
to (and is also referred to as) x and the leaves of H(x) correspond to the children of
x in H0 . Consider a refinement H of H0 where each internal node y in H(x) satisfies
deg(y) = 1 and norm(y) = norm(x). One can easily verify from Definitions 3.2 and
4.1 that if H0 is a proper hierarchy, so too is H. Of course, in order for φ(H) and
ψ(H) to be linear or near-linear, H(x) must satisfy certain properties. In particular,
it must be sufficiently short and balanced. By balanced we mean that a node’s mass
should not be too much smaller than its parent’s mass.
5.1.5. Lambda values. We will use the following λ-values in order to quantify
precisely our notion of balance:
−q
λ0 = 0, λ1 = 12 and λq+1 = 2λq ·2 .
Lemma 5.1 gives a lower bound on the growth of the λ-values; we give a short
proof before moving on.
Lemma 5.1. min{q : λq ≥ n} ≤ 2log∗ n.
2
Proof. Let Sq be a stack of q twos; for example, S3 = 22 = 16. We will prove
that λq ≥ S q/2 , giving the lemma. One can verify that this statement holds for
q ≤ 9. Assume that it holds for all q ≤ q.
λq−1 ·2−(q−1) −q
λq+1 = 22 2
{definition of λq+1 }
S ·2−(q−1) −q
2 (q−1)/2
≥2 {inductive assumption}
S
2 (q−1)/2−1
≥2 =S (q+1)/2 {holds for q ≥ 9}.
The third line follows from the inequality S (q−1)/2 · 2−(q−1) − q ≥ S (q−1)/2 −1 ,
which holds for q ≥ 9.
5.1.6. Ranks. Recall from section 5.1.4 that our refined hierarchy H is derived
from H0 by replacing each node x ∈ H0 with a subhierarchy H(x). We assign to all
nodes in H(x) a nonnegative integer rank. The analysis of our construction would
become very simple if for every rank j node y in H(x), mass(y) = λj · norm(x).
Although this is our ideal situation, the nature of our construction does not allow
us to place any nontrivial lower or upper bounds on the mass of y. We will assign
ranks in order to satisfy Property 5.1, given below, which ensures us a sufficiently
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1415
good approximation to the ideal. It is mainly the internal nodes of H(x) that can
have subideal ranks; we assign ranks to the leaves of H(x) (representing children of x
in H0 ) to be as close to the ideal as possible.
We should point out that the assignment of ranks is mostly for the purpose of
analysis. Rank information is never stored explicitly in the hierarchy nodes, nor is
rank information used, implicitly or explicitly, in the computation of shortest paths.
We only refer to ranks in the construction of H and when analyzing their effect on
the φ and ψ functions.
Property 5.1. Let x ∈ H0 and y, z ∈ H(x) ⊆ H.
(a) If y is an internal node of H(x), then norm(y) = norm(x) and deg(y) > 1.
(b) If y is a leaf of H(x) (i.e., a child of x in H0 ), then y has rank j, where j is
maximal s.t. mass(y)/norm(x) ≥ λj .
(c) Let y be a child of a rank j node. We call y stunted if mass(y)/norm(x) <
λj−1 /2. Each node has at most one stunted child.
(d) Let y be of rank j. The children of y can be divided into three sets: Y1 , Y2 , and
a singleton {z} such that (mass(Y1 ) + mass(Y2 ))/norm(x) < (2 + o(1)) · λj .
(e) Let X be the nodes of H(x) of some specific rank. Then y∈X mass(y) ≤
2 · mass(x).
Before moving on, let us examine some features of Property 5.1. Part (a) is
asserted to guarantee that H is proper. Part (b) shows how we set the rank of leaves
of H(x). Part (c) says that at most one child of any node is less than half its ideal
mass. Part (d) is a little technical but basically says that for a rank j node y, although
mass(y) may be huge, the children of y can be divided into sets Y1 , Y2 , {z} such that
Y1 and Y2 are of reasonable mass—around λj · norm(x). However, no bound is placed
on the mass contributed by z. Part (e) says that if we restrict our attention to the
nodes of a particular rank, their subgraphs do not overlap in too many places. To
see how two subgraphs might overlap, consider {xi }, the set of nodes of some rank
in H(x). By our construction it will always be the case that the vertex sets {V (xi )}
are disjoint; however, this does not imply that the subtrees {M (xi )} are edge-disjoint
because M (xi ) can, in general, be much larger than V (xi ).
We show in section 5.5 that if H is a refinement of H0 and H satisfies Property
5.1, then φ(H) = O(n) and ψ(H) = O(nlog∗ n). Recall from Lemmas 4.4 and 4.5 that
φ(H) and ψ(H) are terms in the running times of Visit and Visit-B, respectively.
5.2. Phase 1: The MST and the coarse hierarchy. Pettie and Ramachan-
dran [PR02c] recently gave an MST algorithm that runs in time proportional to the
decision-tree complexity of the MST problem. As the complexity of MST is triv-
ially Ω(m) and only known to be O(mα(m, n)) [Chaz00], it is unknown whether this
cost will dominate or be dominated by the split-findmin(m, n) term. (This issue is
mainly of theoretical interest.) In the analysis we use mst(m, n) to denote the cost
of computing the MST. This may be interpreted as the decision-tree complexity of
MST [PR02c] or the randomized complexity of MST, which is known to be linear
[KKT95, PR02b].
Recall from section 5.1.1 that H0 was defined w.r.t. an arbitrary increasing se-
quence of norm-values. We describe below exactly how the norm-values are chosen,
then prove that H0 is a proper hierarchy. Our method depends on how large r is,
which is the ratio of the maximum-to-minimum edge length in the minimum spanning
tree. If r < 2n , which can easily be determined in O(n) time, then the possible norm-
values are {min · 2i : 0 ≤ i ≤ log r + 1}, where min is the minimum edge length
in the graph. If r ≥ 2n , then let e1 , . . . , en−1 be the edges in M in nondecreasing
1416 SETH PETTIE AND VIJAYA RAMACHANDRAN
order by length and let J = {1} ∪ {j : (ej ) > n · (ej−1 )} be the indices that mark
large separations in the ((ei ))1≤i<n sequence. The possible norm-values are then
{(ej ) · 2i : i ≥ 0, j ∈ J and (ej ) · 2i < (ej+1 )}.
Under either definition, normi is the ith largest norm-value, and for an edge
e ∈ E(M ), norm(e) = normi if (e) ∈ [normi , normi+1 ). Notice that if no edge
length falls within the interval [normi , normi+1 ), then normi is an unused norm-
value. We only need to keep track of the norm-values in use, of which there are no
more than n − 1.
Lemma 5.2. The coarse hierarchy H0 is a proper hierarchy.
Proof. As we observed before, parts (i), (iii), and (iv) of Definition 4.1 are sat-
isfied for any monotonically increasing sequence of norm-values. Definition 4.1(ii)
states that if x is a hierarchy node, either norm(p(x))/norm(x) is an integer or
diam(x)/norm(p(x)) < 1. Suppose that x is a hierarchy node and norm(p(x))/norm(x)
is not integral; then norm(x) = (ej1 ) · 2i1 and norm(p(x)) = (ej2 ) · 2i2 , where
j2 > j1 . By our method for choosing the norm-values, the lengths of all MST edges
are either at least (ej2 ) or less than (ej2 )/n. Since edges in M (x) have length less
than (ej2 ), and hence less than (ej2 )/n, diam(x) < (|V (x)| − 1) · (ej2 )/n < (ej2 ) ≤
norm(p(x)).
Lemma 5.3. We can compute the minimum spanning tree M , and norm(e) for
all e ∈ E(M ), in O(mst(m, n) + min{n log log r, n log n}) time.
Proof. mst(m, n) represents the time to find M . If r < 2n , then by Lemma 2.2
we can find norm(e) for all e ∈ M in O(log r + n log log r) = O(n log log r) time. If
r ≥ 2n , then n log log r = Ω(n log n), so we simply sort the edges of M and determine
the indices J in O(n log n) time. Suppose there are nj edges e s.t. norm(e) is of the
form (ej ) · 2i . Since (e)/(e
j ) ≤ n , we need only generate nj log n values of the
nj
form (ej ) · 2 . A list of the j nj log n = n log n possible norm-values can easily be
i
generated in sorted order. By merging this list with the list of MST edge lengths, we
can determine norm(e) for all e ∈ M in O(n log n) time.
Lemma 5.4, given below, will come in handy in bounding the running time of
our preprocessing and SSSP algorithms. It says that the total normalized mass in
H0 is linear in n. Variations of Lemma 5.4 are at the core of the hierarchy approach
[Tho99, Hag00, Pet04, Pet02b].
Lemma 5.4.
mass(x)
< 4(n − 1).
norm(x)
x∈H0
Proof. Recall that the notation norm(e) = normi stands for (e) ∈
[normi , normi+1 ), where normi , is the ith largest norm-value. Observe that if
e ∈ M is an MST edge with norm(e) = normi , e can be included in mass(x) for no
more than one x at levels i, i+1, . . . in H0 . Also, it follows from our definitions that for
every normi in use, normi+1 /normi ≥ 2, and for any MST edge, (e)/norm(e) < 2.
Therefore, we can bound the normalized mass in H0 as
mass(x) ∞
(e)
≤
norm(x) j=i
norm j
x∈H0 e∈M
norm(e)=normi
∞
(e)
≤ < 4(n − 1).
e∈M j=i
2j−i · normi
norm(e)= normi
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1417
2
3
2
Fig. 2. On the left is a subtree of M , the MST, where X is the set of blackened vertices. In the
center is M (X), the minimal subtree of M connecting X, and on the right is T (X), derived from
M (X) by splicing out unblackened degree 2 nodes in M (X) and adjusting edge lengths appropriately.
Unless otherwise marked, all edges in this example are of length 1.
u1 = root(Tx ) u1 = root(Tx )
Already traversed
u2 u2
u3 u3
w = LCA(v, u4 ) w = the new u4
u4 v v = the new u5
Fig. 3. The blackened vertices represent those known to be in T (x). The active path of the
traversal is shown in bold edges. Before v is processed (left) the stack consists of u1 , u2 , u3 , u4 ,
where u4 is the last vertex in the traversal known to be in T (x) and w = LCA(v, u4 ), which implies
w ∈ T (x). After v is processed (right) the stack is set to u1 , u2 , u3 , w, v and w is blackened.
parent of uq in T (x). So, to maintain the stack properties, we pop off uq and add the
edge (uq , uq−1 ) to T (x).
Succinct-Tree(v): A procedure for constructing T (x), for a given x ∈ H0 .
The argument v is a vertex in the MST M .
The stack for T (x) consists of vertices u1 , . . . , uq , which
are known to be in T (x) but whose parents in T (x) are
not yet known. All but uq are on the active path of the
DFS traversal. Initially the stack for T (x) is empty.
where v ∈ T (x), will produce an array of sets v[·] whose elements are nodes in H(x)
that represent (collectively) the subtree of T (x) rooted at v. The set v[j] holds rank
j nodes, which, taken together, are not yet massive enough to become a rank j + 1
node. We extend the mass notation to sets v[·] as follows. Bear in mind that this
mass is w.r.t. the tree T (x), not M (x). By Lemma 5.5(ii), mass w.r.t. T (x) is a good
approximation to the mass of the equivalent subtree in M (x):
⎛ ⎞
Property 5.1(a) states that internal nodes in H(x) must have norm-values equal
to that of x, which we satisfy by simply assigning them the proper norm-values, and
that no node of H(x) has one child. By our treatment of one-element sets in the
promotion procedure of Definition 5.7, it is simply impossible to create a one-child
node in H(x). Property 5.1(e) follows from Lemma 5.5(ii) and the observation that
the mass (in T (x)) represented by nodes of the same rank is disjoint. Now consider
Property 5.1(c), regarding stunted nodes. We show that whenever a set v[j] accepts a
new node z, either v[j] is immediately promoted, or z is not stunted, or the promotion
of z into v[j] represents the last promotion in the construction of H(x). Consider the
pattern of promotions in line 9. We promote the sets v[0], . . . , v[j] in a cascading
fashion: v[0] to v[1], v[1] to v[2], and so on. The only set accepting a new node that
is not immediately promoted is v[j + 1], so in order to prove Property 5.1(c) we must
show that the node derived from promoting v[0], . . . , v[j] is not stunted. By choice
of j, mass(v[j]) ≥ λj+1 · norm(x), where mass is w.r.t. the tree T (x). By Lemma
5.5(ii) the mass of the equivalent tree in M (x) is at least λj+1 · norm(x)/2, which is
exactly the threshold for this node being stunted. Finally, consider Property 5.1(d).
Before the merging step in line 7, none of the sets in v[·] or w[·] is massive enough
to be promoted. Let v[·] and w[·] denote the sets associated with v and w before the
merging in step 7, and let v [·] denote the set associated with v after step 7. By the
definition of mass we have
mass(v [j]) = mass(v[j]) + mass(w[j]) + (v, w) < 2 · λj+1 · norm(x) + (v, w).
below. The time for merging the v[·] and w[·] sets in line 7 is only proportional to the
shorter list; this time bound is given by expression t2 below.
mass(v)
t1 = O 1 + log∗ ,
norm(x)
min{mass(v[·]), mass(w[·])}
t2 = O 1 + log∗ ,
norm(x)
where mass(v[·]) is just the total mass represented by the v[·] sets. We update the
mass of only the first t1 + t2 sets in v[·], and, as a rule, we update v[j + 1] half as
often as v[j]. It is routine to show that Refine-Hierarchy will have a lower bound
on the mass of v[j] that is off by a 1+o(1) factor, where the o(1) is a function of
j.7 This leads to the conspicuous 2 + o(1) in Property 5.1(d). To bound the cost of
Refine-Hierarchy we model its computation as a binary tree: leaves represent the
creation of nodes in lines 1–4, and internal nodes represent the merging events in line
7. The cost of a leaf f is log∗ (mass(f )/norm(x)), and the cost of an internal node
f with children f1 and f2 is 1 + log∗ (min{mass(f1 )/norm(x), mass(f2 )/norm(x)}).
We can think of charging the cost of f collectively to the mass in the subtree of f1 or
f2 , whichever is smaller. Therefore, no unit of mass can be charged for two nodes f
and g if the total mass under f is within twice the total mass under g. The total cost
is then
∞
mass(T (x)) log∗ (2i ) mass(x)
cost(f ) = O |T (x)| + · i
=O .
norm(x) i=0
2 norm(x)
f
be updated at most 2j−i − 1 times before the mass of v[j] is updated. Since 2j−1 − 1 · λi λj , our
neglecting to update the mass of v[j] causes a negligible error.
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1423
mass(y) 2α
deg(y) log ≤ + 2 log(α + β) {see explanations below}
norm(y) λj−1
max{α log(2λj ), β}
=O
λj−1
α+β mass(y)
=O = O .
2 j−1 norm(y) · 2j−1
The first line follows from our bound on deg(y) and the definitions of α and β.
The second line follows since α < (2 + o(1))λj and α log(α + β) = O(max{α log α, β}).
The lastline follows since log λj = λj−1 /2j−1 > 1. By the above bound and Property
5.1(e), y∈H(x) deg(y) log(mass(y)/norm(y)) = O(mass(x)/norm(x)). Therefore,
by Lemma 5.4, the second summations in both φ(H) and ψ(H) are bounded by
O(n).
6. Limits of hierarchy-type algorithms. In this section we state a simple
property (Property 6.1) of all hierarchy-type algorithms and give a lower bound on
any undirected SSSP algorithm satisfying that property. The upshot is that our
SSSP algorithm is optimal (up to an inverse-Ackermann factor) for a fairly large
class of SSSP algorithms, which includes all hierarchy-type algorithms, variations on
Dijkstra’s algorithm, and even a heuristic SSSP algorithm [G01].
We will state Property 6.1 in terms of directed graphs. Let cycles(u, v) denote
the set of all cycles, including nonsimple cycles, that pass through both u and v,
and let sep(u, v) = minC∈cycles(u,v) maxe∈C (e). Note that in undirected graphs
sep(u, v) corresponds exactly to the longest edge on the MST path between u and v.
Property 6.1. An SSSP algorithm with the hierarchy property computes, aside
from shortest paths, a permutation πs : V (G) → V (G) such that for any vertices u, v,
we find d(s, u) ≥ d(s, v) + sep(u, v) =⇒ πs (u) > πs (v), where s is the source and d
the distance function.
The permutation πs corresponds to the order in which vertices are visited when
the source is s. Property 6.1 says that πs is loosely sorted by distance, but may invert
pairs of vertices if their relative distance is less than their sep-value. To see that
our hierarchy-based algorithm satisfies Property 6.1, consider two vertices u and v.
Let x be the LCA of u and v in H, and let u and v be the ancestors of u and v,
respectively, which are children of x. By our construction of H, norm(x) ≤ sep(u, v).
If d(s, u) ≥ d(s, v) + sep(u, v), then d(s, u) ≥ d(s, v) + norm(x), and therefore the
recursive calls on u and v that cause u and v to be visited are not passed the same
interval argument, since both intervals have width norm(x). The recursive call on u
1424 SETH PETTIE AND VIJAYA RAMACHANDRAN
. . .
must, therefore, precede the recursive call on v , and u must be visited before v.
Theorem 6.1. Suppose that our computational model allows any set of func-
tions from RO(1) → R and comparison between two reals. Any SSSP algorithm for
real-weighted graphs satisfying Property 6.1 makes Ω(m + min{n log log r, n log n})
operations in the worst case, where r is the ratio of the maximum to minimum edge
length.
Proof. Let q be an integer. Assume without loss of generality that 2q divides
n − 1. The MST of the input graph is as depicted in Figure 4. It consists of the source
vertex s, which is connected to p = (n − 1)/2 vertices in the top row, each of which
is paired with one vertex in the bottom row. All the vertices (except s) are divided
into disjoint groups, where group i consists of exactly p/q randomly chosen pairs of
vertices. There are exactly p!/(p/q)!q = q Ω(p) possible group arrangements. We will
show that any algorithm satisfying Property 6.1 must be able to distinguish them.
We choose edge lengths as follows. All edges in group i have length 2i . This
includes edges from s to the group’s top row and between the two rows. Other
non-MST edges are chosen so that shortest paths from s correspond to paths in
the MST. Let vi denote any vertex in the bottom row of group i. Then d(s, vi ) =
2 · 2i and sep(vi , vj ) = 2max{i,j} . By Property 6.1, vi must be visited before vj if
d(s, vi ) + sep(vi , vj ) ≤ d(s, vj ), which is true for i < j since 2 · 2i + 2j ≤ 2 · 2j .
Therefore, any algorithm satisfying Property 6.1 must be prepared to visit vertices in
q Ω(p) distinct permutations and make at least Ω(p log q) = Ω(n log log r) comparisons
in the worst case. It also must include every non-MST edge in at least one operation,
which gives the lower bound.
Theorem 6.1 shows that our SSSP algorithm is optimal among hierarchy-type
algorithms, to within a tiny inverse-Ackermann factor. A lower bound on directed
SSSP algorithms satisfying Property 6.1 is given in [Pet04]. Theorem 6.1 differs from
that lower bound in two respects. First, the [Pet04] bound is Ω(m+min{n log r, n log n}),
which is Ω(m + n log n) for even reasonably small values of r. Second, the [Pet04]
bound holds even if the algorithm is allowed to compute the sep function (and sort
the values) for free. Contrast this with our SSSP algorithm, where the main obstacle
to achieving linear time is the need to sort the sep-values.
avenues for further research, the most interesting of which is developing a feasible
alternative to Property 5.1 that does not have an intrinsic sorting bottleneck. This
would be a backward approach to algorithm design: first we define a desirable prop-
erty, then we hunt about for algorithms with that property. Another avenue, which
might have some real-world impact, is to reduce the preprocessing cost of the directed
shortest path algorithm in [Pet04] from O(mn) to near-linear, as it is in our algorithm.
The marginal cost of computing SSSP with our algorithm may or may not be
linear; it all depends on the complexity of the split-findmin structure. This data
structure, invented first by Gabow [G85a] for use in a weighted matching algorithm,
actually has connections with other fundamental problems. For instance, it can be
used to solve both the minimum spanning tree and shortest path tree sensitivity
analysis problems [Pet03]. (The sensitivity analysis problem is to decide how much
each edge’s length can be perturbed without changing the solution tree.) Therefore,
by Theorem 4.2 both these problems have complexity O(m log α(m, n)), an α/ log α
improvement over Tarjan’s path-compression–based algorithm [Tar82]. If we consider
the offline version of the split-findmin problem, where all splits and decrease-keys
are given in advance, one can show that it is reducible to both the MST problem
and the MST sensitivity analysis problem. None of these reductions proves whether
mst(m, n) dominates split-findmin(m, n) or vice versa; however, they do suggest
that we have no hope of solving the MST problem [PR02b, PR02c] without first
solving the manifestly simpler split-findmin and MST sensitivity analysis problems.
The experimental study of Pettie, Ramachandran, and Sridhar [PRS02] shows
that our algorithm is very efficient in practice. However, the [PRS02] study did not
explore all possible implementation choices, such as the proper heap to use, the best
preprocessing algorithm, or different implementations of the split-findmin structure.
To our knowledge no one has investigated whether the other hierarchy-type algorithms
[Tho99, Hag00, Pet04] are competitive in real-world scenarios.
An outstanding research problem in parallel computing is to bound the time-work
complexity of SSSP. There are several published algorithms on the subject [BTZ98,
CMMS98, KS97, M02, TZ96], though none runs in worst-case polylogarithmic time
using work comparable to Dijkstra’s algorithm. There is clearly a lot of parallelism in
the hierarchy-based algorithms. Whether this approach can be effectively parallelized
is an intriguing question.
Level 3
2
1
0
Gabow [G85a] gave an elegant algorithm for this problem that is nearly optimal.
On an initial sequence of n elements, it handles up to n − 1 splits and m decrease-keys
in O((m + n)α(m, n)) time. Gabow’s algorithm runs on a pointer machine [Tar79].
We now prove Theorem 4.2 from section 4.2.
Proof of Theorem 4.2. In Gabow’s decrease-key routine a sequence of roughly
α variables needs to be updated, although it is already known that their values are
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1427
where the first {1, . . . , q} represents the argument to the findmin query. The findmin-
action function can either perform a comparison (represented by V (H)×V (H)) which,
if performed, will alter the state, or return an answer to the findmin query, represented
by the second {1, . . . , q}. One simply applies the findmin-action function until it
produces an answer. We will represent the findmin-action function as a table. Since
the state is represented in o(log n) bits, we can keep it in one machine word; therefore,
computing the findmin-action function (and updating the state) takes constant time
on a RAM.
One can see that any split-findmin solver can be converted, without loss of effi-
ciency, into one that performs comparisons only during calls to findmin. Therefore,
finding the optimal findmin-action function is tantamount to finding the optimal split-
findmin solver.
We have now reduced the split-findmin problem to a brute force search over the
4 4
findmin-action function. There are less than F = 23q · q · (q 4 + q) < 24q distinct
findmin-action functions, most of which do not produce correct answers. There are
2
less than I = (2q + q 2 (q + 1))q +3q distinct instances of the problem, because the
number of decrease-keys is < q 2 , findmins < 2q, and splits < q. Furthermore, each
operation can be a split or findmin, giving the 2q term, or a decrease-key, which
requires us to choose an element and where to fit its new key into the permutation,
giving the q 2 (q + 1) term. Each findmin-action/problem instance pair can be tested
for correctness in V = O(q 2 ) time, and therefore all correct findmin-action functions
4
can be chosen in time F · I · V = 2Ω(q ) . For q = log log n this is o(n), meaning the
time for this brute force search does not affect the other constant factors involved.
How do we choose the optimal split-findmin solver? This is actually not a trivial
question because of the possibility of there not being one solver that dominates all
1428 SETH PETTIE AND VIJAYA RAMACHANDRAN
others on all input sizes. Consider charting the worst-case complexity of a solver S
as a function gS of the number of operations p in the input sequence. It is plausible
that certain solvers are optimal for only certain densities p/q. We need to show
that for some solver S ∗ , gS ∗ is within a constant factor of the lower envelope of
{gS }S , where S ranges over all correct solvers. Let Sk be the optimal solver for 2k
operations. We let S ∗ be the solver that mimics Sk from operations 2k−1 + 1 to
2k . At operation 2k it resets its state, reexecutes all 2k operations under Sk+1 , and
continues using Sk+1 until operation 2k+1 . Since gSk+1 (2k+1 ) ≤ 2 · gSk (2k ), it follows
that gS ∗ (p) ≤ 4 · minS {gS (p)}.
Our overall algorithm is very simple. We divide the n elements into n = n/q
superelements, each representing a contiguous block of q elements. Each unsplit se-
quence then consists of three parts: two subsequences in the leftmost and rightmost
superelements and a third subsequence consisting of unsplit superelements. We use
Gabow’s algorithm on the unsplit superelements, where the key of a superelement
is the minimum over constituent elements. For the superelements already split, we
use the S ∗ split-findmin solver constructed as above. The cost of Gabow’s algo-
rithm is O((m + n/q)α(m, n/q)) = O(m + n), and the cost of using S ∗ on each
superelement is Θ(split-findmin(m, n)) by construction; therefore the overall cost is
Θ(split-findmin(m, n)).
One can easily extend the proof to randomized split-findmin solvers by defining
the findmin-action as selecting a distribution over actions.
We note that the time bound of Theorem 4.2 on pointer machines is provably
optimal. La Poutré [LaP96] gave a lower bound on the pointer machine complexity of
the split-find problem, which is subsumed by the split-findmin problem. The results
in this section address the RAM complexity and decision-tree complexity of split-
findmin, which are unrelated to La Poutré’s result.
REFERENCES
[AGM97] N. Alon, Z. Galil, and O. Margalit, On the exponent of the all pairs shortest path
problem, J. Comput. System Sci., 54 (1997), pp. 255–262.
[AHU76] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, On finding lowest common ancestors
in trees, SIAM J. Comput., 5 (1976), pp. 115–132.
[AMO93] R. K. Ahuja, T. L. Magnati, and J. B. Orlin, Network Flows: Theory, Algorithms,
and Applications, Prentice–Hall, Englewood Cliffs, NJ, 1993.
[AMOT90] R. K. Ahuja, K. Mehlhorn, J. B. Orlin, and R. E. Tarjan, Faster algorithms for
the shortest path problem, J. ACM, 37 (1990), pp. 213–223.
[BKRW98] A. L. Buchsbaum, H. Kaplan, A. Rogers, and J. R. Westbrook, Linear-time
pointer-machine algorithms for LCAs, MST verification, and dominators, in Pro-
ceedings of the 30th ACM Symposium on Theory of Computing (STOC), Dallas,
TX, 1998, ACM, New York, 1998, pp. 279–288.
[BTZ98] G. S. Brodal, J. L. Tra̋ff, and C. D. Zaroliagis, A parallel priority queue with
constant time operations, J. Parallel and Distrib. Comput., 49 (1998), pp. 4–21.
[Chaz00] B. Chazelle, A minimum spanning tree algorithm with inverse-Ackermann type com-
plexity, J. ACM, 47 (2000), pp. 1028–1047.
[CLRS01] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algo-
rithms, MIT Press, Cambridge, MA, 2001.
[CMMS98] A. Crauser, K. Mehlhorn, U. Meyer, and P. Sanders, A parallelization of Di-
jkstra’s shortest path algorithm, in Proceedings of the 23rd International Sympo-
sium on Mathematical Foundations of Computer Science (MFCS), Lecture Notes
in Comput. Sci. 1450, Springer, New York, 1998, pp. 722–731.
[Dij59] E. W. Dijkstra, A note on two problems in connexion with graphs, Numer. Math., 1
(1959), pp. 269–271.
[Din78] E. A. Dinic, Economical algorithms for finding shortest paths in a network, Trans-
portation Modeling Systems, (1978), pp. 36–44 (in Russian).
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1429
[Din03] Y. Dinitz, Personal communication, Ben-Gurion University, Be’er Sheva, Israel, 2003.
[FG85] A. M. Frieze and G. R. Grimmett, The shortest-path problem for graphs with random
arc-lengths, Discrete Appl. Math., 10 (1985), pp. 57–77.
[FR01] J. Fakcharoenphol and S. Rao, Planar graphs, negative weight edges, shortest paths,
and near linear time, in Proceedings of the 42nd IEEE Symposium on Foundations
of Computer Science (FOCS), Las Vegas, NV, 2001, IEEE Press, Piscataway, NJ,
pp. 232–241.
[F91] G. N. Frederickson, Planar graph decomposition and all pairs shortest paths, J. ACM,
38 (1991), pp. 162–204.
[F76] M. L. Fredman, New bounds on the complexity of the shortest path problem, SIAM
J. Comput., 5 (1976), pp. 83–89.
[FT87] M. L. Fredman and R. E. Tarjan, Fibonacci heaps and their uses in improved network
optimization algorithms, J. ACM, 34 (1987), pp. 596–615.
[FW93] M. L. Fredman and D. E. Willard, Surpassing the information-theoretic bound with
fusion trees, J. Comput. System Sci., 47 (1993), pp. 424–436.
[G01] A. V. Goldberg, A simple shortest path algorithm with linear average time, in Pro-
ceedings of the 9th European Symposium on Algorithms (ESA), Lecture Notes in
Comput. Sci. 2161, Springer, New York, 2001, pp. 230–241.
[G85a] H. N. Gabow, A scaling algorithm for weighted matching on general graphs, in Proceed-
ings of the 26th IEEE Symposium on Foundations of Computer Science (FOCS),
Portland, OR, 1985, IEEE Press, Piscataway, NJ, pp. 90–100.
[G85b] H. N. Gabow, Scaling algorithms for network problems, J. Comput. System Sci., 31
(1985), pp. 148–168.
[G95] A. V. Goldberg, Scaling algorithms for the shortest paths problem, SIAM J. Comput.,
24 (1995), pp. 494–504.
[GM97] Z. Galil and O. Margalit, All pairs shortest distances for graphs with small integer
length edges, Inform. and Comput., 134 (1997), pp. 103–139.
[GR98] A. V. Goldberg and S. Rao, Beyond the flow decomposition barrier, J. ACM, 45
(1998), pp. 783–797.
[GT89] H. N. Gabow and R. E. Tarjan, Faster scaling algorithms for network problems,
SIAM J. Comput., 18 (1989), pp. 1013–1036.
[GT91] H. N. Gabow and R. E. Tarjan, Faster scaling algorithms for general graph-matching
problems, J. ACM, 38 (1991), pp. 815–853.
[GYY80] R. L. Graham, A. C. Yao, and F. F. Yao, Information bounds are weak in the shortest
distance problem, J. ACM, 27 (1980), pp. 428–444.
[Hag00] T. Hagerup, Improved shortest paths on the word RAM, in Proceedings of the 27th
International Colloquium on Automata, Languages, and Programming (ICALP),
Lecture Notes in Comput. Sci. 1853, Springer, New York, 2000, pp. 61–72.
[Hag04] T. Hagerup, Simpler computation of single-source shortest paths in linear average
time, in Proceedings in the 21st Annual Symposium on Theoretical Aspects of Com-
puter Science (STACS), Montpellier, France, 2004, Springer, New York, pp. 362–
369.
[Han04] Y. Han, Improved algorithm for all pairs shortest paths, Inform. Process. Lett., 91
(2004), pp. 245–250.
[HKRS97] M. R. Henzinger, P. N. Klein, S. Rao, and S. Subramanian, Faster shortest path
Sci., 55 (1997), pp. 3–23.
algorithms for planar graphs, J. Comput. System
[HT02] Y. Han and M. Thorup, Integer sorting in O(n log log n) expected time and linear
space, in Proceedings of the 43rd Annual Symposium on Foundations of Computer
Science (FOCS), Vancouver, 2002, IEEE Press, Piscataway, NJ, pp. 135–144.
[J77] D. B. Johnson, Efficient algorithms for shortest paths in sparse networks, J. ACM, 24
(1977), pp. 1–13.
[K70] L. R. Kerr, The Effect of Algebraic Structure on the Computational Complexity of
Matrix Multiplication, Technical report TR70-75, Computer Science Department,
Cornell University, Ithaca, NY, 1970.
[KKP93] D. R. Karger, D. Koller, and S. J. Phillips, Finding the hidden path: Time bounds
for all-pairs shortest paths, SIAM J. Comput., 22 (1993), pp. 1199–1217.
[KKT95] D. R. Karger, P. N. Klein, and R. E. Tarjan, A randomized linear-time algorithm
for finding minimum spanning trees, J. ACM, 42 (1995), pp. 321–329.
[KS97] P. N. Klein and S. Subramanian, A randomized parallel algorithm for single-source
shortest paths, J. Algorithms, 25 (1997), pp. 205–220.
[KS98] S. G. Kolliopoulos and C. Stein, Finding real-valued single-source shortest paths in
o(n3 ) expected time, J. Algorithms, 28 (1998), pp. 125–141.
1430 SETH PETTIE AND VIJAYA RAMACHANDRAN
[LaP96] H. LaPoutré, Lower bounds for the union-find and the split-find problem on pointer
machines, J. Comput. System Sci., 52 (1996), pp. 87–99.
[M01] U. Meyer, Single-source shortest-paths on arbitrary directed graphs in linear average-
case time, in Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete
Algorithms (SODA), Washington, DC, 2001, SIAM, Philadelphia, pp. 797–806.
[M02] U. Meyer, Buckets strike back: Improved parallel shortest-paths, in Proceedings of
the 16th International Parallel and Distributed Processing Symposium (IPDPS),
Ft. Lauderdale, FL, 2002, IEEE Computer Society Press, Los Alamitos, CA, pp. 75–
82.
[Mit00] J. S. B. Mitchell, Geometric shortest paths and network optimization, in Handbook
of Computational Geometry, North–Holland, Amsterdam, 2000, pp. 633–701.
[MN00] K. Mehlhorn and S. Näher, LEDA: A Platform for Combinatorial and Geometric
Computing, Cambridge University Press, Cambridge, UK, 2000.
[MT87] A. Moffat and T. Takaoka, An all pairs shortest path algorithm with expected time
O(n2 log n), SIAM J. Comput., 16 (1987), pp. 1023–1031.
[Pet02b] S. Pettie, On the comparison-addition complexity of all-pairs shortest paths, in Pro-
ceedings of the 13th International Symposium on Algorithms and Computation
(ISAAC’02), Vancouver, 2002, Springer, New York, pp. 32–43.
[Pet03] S. Pettie, On the Shortest Path and Minimum Spanning Tree Problems, Ph.D.
thesis, Department of Computer Sciences, The University of Texas at Austin,
Austin, TX, 2003; also available online as Technical report TR-03-35 at
http://www.cs.utexas.edu/ftp/pub/techreports/tr03-35.ps.gz.
[Pet04] S. Pettie, A new approach to all-pairs shortest paths on real-weighted graphs, Spe-
cial Issue of Selected Papers from the 29th International Colloqium on Automata
Languages and Programming (ICALP 2002), Theoret. Comput. Sci., 312 (2004),
pp. 47–74.
[PR02a] S. Pettie and V. Ramachandran, Computing shortest paths with comparisons and
additions, in Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete
Algorithms (SODA), San Francisco, CA, 2002, SIAM, Philadelphia, pp. 267–276.
[PR02b] S. Pettie and V. Ramachandran, Minimizing randomness in minimum spanning
tree, parallel connectivity, and set maxima algorithms, in Proceedings of the 13th
Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), San Francisco,
CA, 2002, SIAM, Philadelphia, pp. 713–722.
[PR02c] S. Pettie and V. Ramachandran, An optimal minimum spanning tree algorithm,
J. ACM, 49 (2002), pp. 16–34.
[PRS02] S. Pettie, V. Ramachandran, and S. Sridhar, Experimental evaluation of a new
shortest path algorithm, in Proceedings of the 4th Workshop on Algorithm En-
gineering and Experiments (ALENEX), San Francisco, CA, 2002, Springer, New
York, pp. 126–142.
[Sei95] R. Seidel, On the all-pairs-shortest-path problem in unweighted undirected graphs,
J. Comput. System Sci., 51 (1995), pp. 400–403.
[SP75] P. M. Spira and A. Pan, On finding and updating spanning trees and shortest paths,
SIAM J. Comput., 4 (1975), pp. 375–380.
[Spi73] P. M. Spira, A new algorithm for finding all shortest paths in a graph of positive arcs
in average time O(n2 log2 n), SIAM J. Comput., 2 (1973), pp. 28–32.
[SZ99] A. Shoshan and U. Zwick, All pairs shortest paths in undirected graphs with integer
weights, in Proceedings of the 40th Annual IEEE Symposium on Foundations of
Computer Science (FOCS), New York, 1999, IEEE Press, Piscataway, NJ, pp. 605–
614.
[Tak92] T. Takaoka, A new upper bound on the complexity of the all pairs shortest path prob-
lem, Inform. Process. Lett., 43 (1992), pp. 195–199.
[Tak98] T. Takaoka, Subcubic cost algorithms for the all pairs shortest path problem, Algo-
rithmica, 20 (1998), pp. 309–318.
[Tar79] R. E. Tarjan, A class of algorithms which require nonlinear time to maintain disjoint
sets, J. Comput. System Sci., 18 (1979), pp. 110–127.
[Tar79b] R. E. Tarjan, Applications of path compression on balanced trees, J. ACM, 26 (1979),
pp. 690–715.
[Tar82] R. E. Tarjan, Sensitivity analysis of minimum spanning trees and shortest path trees,
Inform. Process. Lett., 14 (1982), pp. 30–33; Corrigendum, Inform. Process. Lett.,
23 (1986), p. 219.
[Tho00] M. Thorup, Floats, integers, and single source shortest paths, J. Algorithms, 35 (2000),
pp. 189–201.
A SHORTEST PATH ALGORITHM FOR UNDIRECTED GRAPHS 1431
[Tho03] M. Thorup, Integer priority queues with decrease key in constant time and the single
source shortest paths problem, in Proceedings of the 35th Annual ACM Symposium
on Theory of Computing (STOC), San Diego, CA, 2003, ACM, New York, pp. 149–
158.
[Tho99] M. Thorup, Undirected single-source shortest paths with positive integer weights in
linear time, J. ACM, 46 (1999), pp. 362–394.
[TZ96] J. L. Träff and C. D. Zaroliagis, A simple parallel algorithm for the single-source
shortest path problem on planar digraphs, in Parallel Algorithms for Irregularly
Structured Problems, Lecture Notes in Comput. Sci. 1117, Springer, New York,
1996, pp. 183–194.
[Z01] U. Zwick, Exact and approximate distances in graphs—A survey, in Proceedings of the
9th European Symposium on Algorithms (ESA), University of Aarhus, Denmark,
2001, pp. 33–48; available online at http://www.cs.tau.ac.il/∼zwick/.
[Z02] U. Zwick, All pairs shortest paths using bridging sets and rectangular matrix multipli-
cation, J. ACM, 49 (2002), pp. 289–317.
[Z04] U. Zwick, A slightly improved sub-cubic algorithm for the all pairs shortest paths prob-
lem with real edge lengths, in Proceedings of the 15th International Symposium
on Algorithms and Computation (ISAAC), Lecture Notes in Comput. Sci. 3341,
Springer, New York, 2004, pp. 921–932.