Chapter 25 All-Pairs Shortest Paths 25.0.1
Given a weighted, directed graph G = (V,E),
with weight function w:E -> R, the all-pairs
shortest-paths problem is to find a shortest
path from u to v for every pair of vertices u
and v -- to make a table of distances between
cities, for instance.
One solution is to run a single-source
shortest-paths algorithm on each vertex. If
the edge weights are non-negative we can use
Dijkstra's algorithm. If the min-priority
queue is implemented by a linear array, a
binary min-heap, or a Fibonacci heap, we get
running times of O(V^3), O(V E lg V), and
O(V^2 lg V + V E) respectively.
If there are negative edges, we must use the
Bellman-Ford algorithm, with a running time of
O(V^2 E), which is O(V^4) for a dense graph.
We can improve on this.
We use the adjacency matrix representation in
the algorithms in this chapter, except for
Johnson's algorithm (Section 25.3). We assume
that the vertices are numbered 1, 2, ..., |V|
= n, so the edge weights are represented by an
n x n matrix W = (w_ij), where
/ 0 if i = j
w_ij = < the weight of (i,j) if (i,j) is in E
\ infinity for (i,j) not in E
25.0.2
Negative-weight edges are allowed, but we
assume there are no negative-weight cycles in
Section 25.1. The output of an all-pairs
shortest-paths algorithm is an n x n matrix
D = (d_ij), where d_ij = delta(i,j) is the
shortest-path weight from i to j.
The predecessor matrix PI = (pi_ij) tells how
to get from one vertex to another along a
shortest path, where pi_ij = NIL if i = j or
if there is no path from i to j, and otherwise
pi_ij is the predecessor of j on some shortest
path from i. The subgraph formed by the i-th
row of PI should be a shortest-paths tree with
root i. For each vertex i in V, we define the
predecessor subgraph of G for i as
G_pi,i = (V_pi,i , E_pi,i), where
V_pi,i = { j in V : pi_ij not NIL } Union {i}
and
E_pi,i = {(pi_ij, j) : j is in V_pi,i - {i} }
If G_pi,i is a shortest-paths tree, the
following prints a shortest path from i to j:
PRINT-ALL-PAIRS-SHORTEST-PATH(PI,i,j)
1 if i = j
2 then print i
3 else if pi_ij = NIL
4 then print "no path from" i "to" j
5 else
PRINT-ALL-PAIRS-SHORTEST-PATH(PI,i,pi_ij)
6 print j
Chapter outline 25.0.3
Section 25.1 solves the all-pairs shortest-
paths problem using a dynamic-programming
algorithm, based on matrix multiplication,
which can be made to run in O(V^3 lg V) time.
Section 25.2 solves the all-pairs shortest-
paths problem by the Floyd-Warshall algorithm,
another dynamic-programming algorithm, which
runs in O(V^3) time. Also, a closely related
algorithm can find the transitive closure of a
directed graph.
Section 25.3 solves the all-pairs shortest-
paths problem with Johnson's algorithm, which
runs in O(V^2 lg V + V E) time and uses the
adjacency-list representation. It is good for
large, sparse graphs. It can handle negative
weight edges and can detect negative weight
cycles.
Conventions: First we assume G = (V,E) has n
= |V| vertices. Second, matrices, such as W,
L, or D, are denoted by upper case, and their
elements by subscripted lower case letters,
such as w_ij, l_ij, or d_ij. In these notes
we use notation like L(m) = (l(m)_ij) to
indicate iterates instead of having "(m)" as
superscripts as in the book. Finally, for an
n x n matrix A, the value n is stored in the
attribute rows[A].
25.1 Shortest paths and matrix 25.1.1
multiplication
The four steps for developing a dynamic-
programming algorithm to solve a problem are:
1. Characterize the structure of an optimal
solution.
2. Recursively define the value of an optimal
solution.
3. Compute the value of an optimal solution in
a bottom-up fashion.
4. Construct an optimal solution from the
computed information.
The structure of a shortest path
Consider a shortest path p from i to j, and
suppose p contains at most m edges. Assuming
no negative-weight cycles, m is finite. If
i = j, then p has no edges and weight 0. If
p'
i not = j, then we decompose p as i ^-> k -> j
where p' has at most m - 1 edges. Lemma 24.1
says that p' is a shortest path from i to k,
and so delta(i,j) = delta(i,k) + w_kj.
A recursive solution to the all-pairs 25.1.2
shortest-paths problem
Let l(m)_ij be the minimum weight of any path
from i to j that contains m edges. When m = 0
there is a shortest path from i to j (with no
edges) if and only if i = j, so that:
l(0)_ij = / 0 if i = j
\ infinity if i not = j
For m > 0, l(m)_ij is the minimum of l(m-1)_ij
and the minimum weight of any path from i to j
consisting of m edges, obtained by looking at
all possible predecessors k of j. Thus:
l(m)_ij = min( l(m-1)_ij,
min {l(m-1)_ik + w_kj} )
1<=k<=n
= min {l(m-1)_ik + w_kj}
1<=k<=n
since w_jj = 0 for all j.
If there are no negative-weight cycles, then
for every pair of vertices i and j for which
delta(i,j) < infinity, there is a shortest
path from i to j that is simple and contains
at most n - 1 edges; a path with more than n-1
edges cannot have lower weight. The actual
shortest-path weights are therefore given by:
delta(i,j) = l(n-1)_ij = l(n)_ij = l(n+1)_ij =
... (Equation 25.3)
25.1.3
Assuming the input matrix W = (w_ij), we
compute a series of matrices L(m) = (l(m)_ij)
for m = 1, 2, ..., n-1. The final matrix
L(n-1) contains the shortest-path weights.
Also l(1)_ij = w_ij for all i, j, so L(1) = W.
The following procedure computes L(m) from W
and L(m-1), and so extends the shortest paths
computed so far by one more edge. (It is
written so the input and output is independent
of m.) Because of the three nested for loops,
its running time is Theta(n^3).
EXTEND-SHORTEST-PATHS(L,W)
1 n <- rows[L]
2 let L' = (l'_ij) be an n x n matrix
3 for i <- 1 to n do
4 for j <- 1 to n do
5 l'_ij <- infinity
6 for k <- 1 to n do
7 l'_ij <- min(l'_ij, l_ik + w_kj)
8 return L'
Note that if we make the substitutions:
l --> a
w --> b
l' --> c
min --> +
+ --> x
25.1.4
and replace infinity (the identity for min) by
0 (the identity for +), we obtain the standard
procedure for matrix multiplication:
MATRIX-MULTIPLY(A,B)
1 n <- rows[A]
2 let C be an n x n matrix
3 for i <- 1 to n do
4 for j <- 1 to n do
5 c_ij <- 0
6 for k <- 1 to n do
7 c_ij <- c_ij + a_ik x b_kj
8 return C
Returning to the all-pairs shortest-paths
problem, let A x B denote the matrix "product"
returned by EXTEND-SHORTEST-PATHS(A,B), then
extend the shortest paths edge by edge by
computing the sequence of n - 1 matrices:
L(1) = L(0) x W = W
L(2) = L(1) x W = W^2
L(3) = L(2) x W = W^3
.
.
L(n-1) = L(n-2) x W = W^(n-1)
As argued above, L(n-1) = W^(n-1) contains the
shortest-path weights. The procedure below
computes this sequence in Theta(n^4) time.
Figure 25.1 (page 626) shows a graph and the
matrices computed by the procedure.
SLOW-ALL-PAIRS-SHORTEST-PATHS(W) 25.1.5
1 n <- rows[W]
2 L(1) <- W
3 for m <- 2 to n - 1
4 do L(m) <- EXTEND-SHORTEST-PATHS(L(m-1),W)
5 return L(n-1)
Improving the running time
Note that we only need L(n-1), and if there
are no negative-weight cycles, L(m) = L(n-1)
for all m >= n - 1 by Equation 25.3 above.
Also the matrix "product" defined by
EXTEND-SHORTEST-PATHS is associative, so we
can compute L(n-1) with c = ceiling(lg(n - 1))
such products by computing the sequence:
L(1) = W
L(2) = W^2 = W x W
L(4) = W^4 = W^2 x W^2
L(8) = W^8 = W^4 x W^4
.
.
L(2^c) = W^(2^c) = W^(2^(c-1)) x W^(2^(c-1))
and since 2^ceiling(lg(n-1)) >= n - 1,
L(2^c) = L(2^ceiling(lg(n-1))) = L(n-1).
The following procedure computes the sequence
above by "repeated squaring".
FASTER-ALL-PAIRS-SHORTEST-PATHS(W) 25.1.6
1 n <- rows[W]
2 L(1) <- W
3 m <- 1
4 while m < n - 1 do
5 L(2m) <- EXTEND-SHORTEST-PATHS(L(m),L(m))
6 m <- 2m
7 return L(m)
FASTER-ALL-PAIRS-SHORTEST-PATHS runs in
Theta(n^3 lg n) time, since each of the
ceiling(lg(n - 1)) matrix "products" takes
Theta(n^3) time. Note that the code is tight,
containing no complex data structures, so that
the constant in the Theta-notation is small.
Note: to actually find all the shortest paths,
i.e. to compute the matrix PI, we can either
compute it from the final matrix L(n-1) by
Exercise 25.1-6 in O(n^3) additional time, or
we can modify the EXTEND-SHORTEST-PATHS and
SLOW-ALL-PAIRS-SHORTEST-PATHS algorithms to
compute a sequence PI(1), PI(2), ..., PI(n-1)
= PI, along with L(1), L(2), ..., L(n-1) by
Exercise 25.1-7 on page 628. This is the
fourth step of dynamic programming.
25.2 The Floyd-Warshall algorithm 25.2.1
We can use a different dynamic programming
method to obtain the Floyd-Warshall algorithm,
which runs in Theta(V^3) time. It assumes
that there are no negative-weight cycles (but
it can be used to detect such cycles -- see
Exercise 25.2-6 page 635).
The structure of the shortest path
In the Floyd-Warshall algorithm, we consider
"intermediate" vertices of a shortest path,
where an intermediate vertex of a simple path
p = is any vertex other
than v_0 or v_l.
Under the assumption that the vertices of G
are V = {1, 2, ..., n}, we consider the subset
{1, 2, ..., k} for some k. For any pair of
vertices i, j in V, let p be a minimum-weight
path from i to j whose intermediate vertices
are all taken from {1, 2, ..., k} (note that p
is simple). The Floyd-Warshall algorithm uses
a relationship between p and shortest paths in
the set {1, 2, ..., k-1}. This relationship
depends on whether or not k is an intermediate
vertex of path p:
- If k is not an intermediate vertex 25.2.2
of p, then all intermediate vertices are the
set {1, 2, ..., k-1}. Thus a shortest path
from i to j with intermediate vertices in
the set {1, 2, ..., k-1} is also a shortest
path from i to j with intermediate vertices
in the set {1, 2, ..., k}.
- If k _is_ an intermediate vertex of path p,
p1 p2
then we decompose p into i ^-> k ^-> j as
shown in Figure 25.3 (page 630). By Lemma
24.1, p1 is a shortest path from i to k with
all intermediate vertices in {1, 2, ..., k}.
Since k is not an intermediate vertex of p1,
p1 is a shortest path from i to k with all
intermediate vertices in {1, 2, ..., k-1}.
Similarly, p2 is a shortest path from vertex
k to vertex j with all intermediate vertices
in {1, 2, ..., k-1}.
A recursive solution to the all-pairs
shortest-paths problem
Let d(k)_ij be the weight of a shortest path
from i to j with all intermediate vertices in
the set {1, 2, ..., k}. When k = 0, a path
from i to j with no intermediate vertices
numbered higher than 0 has no intermediate
vertices at all; such a path has at most one
edge, and hence d(0)_ij = w_ij. Thus we get
the recursive definition of d(k)_ij below.
d(k)_ij = (Equation 25.5) 25.2.3
/ w_ij if k = 0
< min( d(k-1)_ij, d(k-1)_ik + d(k-1)_kj )
\ if k > 0
Since for any path, all intermediate vertices
are in the set {1, 2, ..., n}, the matrix D(n)
= (d(n)_ij) gives the final answer -- i.e.
d(n)_ij = delta(i,j) for all i,j in V.
Computing the shortest-path weights bottom-up
Based on Equation 25.5, the following bottom-
up procedure computes the values d(k)_ij in
order of increasing values of k. Figure 25.4
(page 631) shows the matrices computed for the
graph of Figure 25.1 (page 626).
FLOYD-WARSHALL(W)
1 n <- rows[W]
2 D(0) <- W
3 for k <- 1 to n do
4 for i <- 1 to n do
5 for j <- 1 to n do
6 d(k)_ij = min( d(k-1)_ij,
d(k-1)_ik + d(k-1)_kj )
7 return D(n)
The running time of FLOYD-WARSHALL is
determined by the triply nested for loops, and
since line 6 takes Theta(1) time, the whole
algorithm runs in Theta(n^3) time. As before,
the code is tight so the constant is small.
Constructing a shortest path 25.2.4
As in the all-pairs shortest-paths algorithms
we can construct the predecessor matrix PI
from the matrix D of shortest-path weights in
O(n^3) time (Exercise 25.1-6 page 628).
Alternatively, we can compute PI in parallel
with D during the run of the FLOYD-WARSHALL
algorithm. Specifically, we compute a
sequence PI(1), PI(2), ..., PI(n) of matrices
where PI = PI(n) and pi(k)_ij is the
predecessor of j on a shortest path from i to
j with intermediate vertices in the set
{1, 2, ..., k}. Here is the recursive
definition of pi(k)_ij:
If k = 0, a shortest path has no intermediate
vertices between i and j, so
pi(0)_ij = / NIL if i = j or w_ij = infinity
\ i if i not = j & w_ij < infinity
For k > 0, if we can take a shorter path using
k, i.e. i ^-> k ^-> j, where k not = j, then
j's predecessor is the same as its predecessor
on a shortest path k ^-> j only using vertices
in the set {1, 2, ..., k-1}. Otherwise, if
by using k, there is no shorter path from i to
j, then the predecessor for j is the same as
its predecessor in a shortest path that just
uses {1, 2, ..., k-1}. Specifically:
pi(k)_ij = 25.2.5
/ pi(k-1)_ij
/ if d(k-1)_ij <= d(k-1)_ik + d(k-1)_kj
\ pi(k-1)_kj
\ if d(k-1)_ij > d(k-1)_ik + d(k-1)_kj
Figure 25.4 (page 631) shows the PI(k)s for
the graph in Figure 25.1 (page 626). Exercise
25.2-3 asks for inclusion of the idea above
into FLOYD-WARSHALL (and to prove that the
predecessor subgraph G_pi,i is a shortest-
paths tree with root i). Exercise 25.2-7
indicates another way to find shortest-paths.
Transitive closure of a directed graph
Given a directed graph G = (V,E) with vertex
set v = {1, 2, ..., n}, the transitive closure
of G is defined as the graph G* = (V,E*) where
E* = {(i,j): there is a path from i to j in G}
One way to compute G* in Theta(n^3) time is to
assign a weight of 1 to each edge of E and run
FLOYD-WARSHALL. If there is a path from i to
j, d_ij < n, otherwise d_ij = infinity.
Another similar way computes G* in Theta(n^3)
also, but can save time and space in practice.
It substitutes v (logical OR) and ^ (logical
AND) for min and + in FLOYD-WARSHALL.
We define matrices T(k) by setting 25.2.6
t(k)_ij to 1 (TRUE) if there is a path from i
to j with all intermediate vertices in the set
{1, 2, ..., k}, and 0 (FALSE) otherwise. Then
(i,j) is in E* if and only if t(n)_ij = 1. So
t(0)_ij = / 0 if i not = j and (i,j) not in E
\ 1 if i = j or (i,j) is in E
and for k > 0
t(k)_ij = t(k-1)_ij v (t(k-1)_ik ^ t(k-1)_kj)
TRANSITIVE-CLOSURE(G)
1 n <- |V[G]|
2 for i <- 1 to n do
3 for j <- 1 to n do
4 if j = j or (i,j) in E[G]
5 then t(0)_ij <- 1
6 else t(0)_ij <- 0
7 for k <- 1 to n do
8 for i <- 1 to n do
9 for j <- 1 to n do
10 t(k)_ij =
t(k-1)_ij v (t(k-1)_ik ^ t(k-1)_kj)
11 return T(n)
Figure 25.5 (page 634) shows a sample run of
TRANSITIVE-CLOSURE, showing the T(k) matrices.
Note: 1) the logical bitwise operations v and
^ are often faster on computers, and 2) only
single bits need be stored in each t(k)_ij,
reducing the space requirements by a factor
equal to the word size of the computer.
25.3 Johnson's algorithm for sparse 25.3.1
graphs
Idea: re-weight the edges so that they are
non-negative if there are no negative weight
cycles in the original graph. Add a vertex s
and an edge with weight 0 to every original
vertex. Define h(v) = delta(s,v) = distance
computed by Bellman-Ford. Then re-weight as
follows: w-hat(u,v) = w(u,v) + h(u) - h(v)
Lemma 25.1: Re-weighting preserves shortest
paths: p is a shortest path for w if and only
if p is a shortest path for w-hat. Also w has
a negative weight cycle iff w-hat does.
Moreover, if there are no negative weight
cycles, then w-hat(u,v) >= 0.
JOHNSON(G)
1 compute G', where V[G'] = V[G] U {s}
E[G'] = E[G] U {(s,v): v in V[G]} and
w(s,v) = 0 for all v in V[G]
2 if BELLMAN-FORD(G',w,s) = FALSE
3 then print "graph has neg. wt. cycle"
4 else for each v in V[G'] do
5 h(v) <- delta(s,v) by Bellman-Ford
6 for each edge (u,v) in E[G']
7 w-hat(u,v) <- w(u,v) + h(u) - h(v)
8 for each u in V[G] do
9 run DIJKSTRA(G,w-hat,u) to compute
delta-hat(u,v) for all v in V[g]
10 for each v in V[G] do
11 d_uv <- delta-hat(u,v)+h(v)-h(u)
12 return D