Chapter 25  All-Pairs Shortest Paths

25.0.1
Given a weighted, directed graph G = (V,E) with weight function w: E -> R, the all-pairs shortest-paths problem is to find a shortest path from u to v for every pair of vertices u and v -- to make a table of distances between cities, for instance.

One solution is to run a single-source shortest-paths algorithm from each vertex. If the edge weights are non-negative we can use Dijkstra's algorithm. If the min-priority queue is implemented by a linear array, a binary min-heap, or a Fibonacci heap, we get running times of O(V^3), O(V E lg V), and O(V^2 lg V + V E), respectively. If there are negative edges, we must use the Bellman-Ford algorithm, with a running time of O(V^2 E), which is O(V^4) on a dense graph. We can improve on this.

We use the adjacency-matrix representation in the algorithms of this chapter, except for Johnson's algorithm (Section 25.3). We assume the vertices are numbered 1, 2, ..., |V| = n, so the edge weights are represented by an n x n matrix W = (w_ij), where

    w_ij = 0                    if i = j
         = the weight of (i,j)  if i != j and (i,j) in E
         = infinity             if i != j and (i,j) not in E

25.0.2
Negative-weight edges are allowed, but in Section 25.1 we assume there are no negative-weight cycles.

The output of an all-pairs shortest-paths algorithm is an n x n matrix D = (d_ij), where d_ij = delta(i,j) is the shortest-path weight from i to j. The predecessor matrix PI = (pi_ij) tells how to get from one vertex to another along a shortest path: pi_ij = NIL if i = j or if there is no path from i to j, and otherwise pi_ij is the predecessor of j on some shortest path from i. The subgraph formed by the i-th row of PI should be a shortest-paths tree with root i.
For each vertex i in V, we define the predecessor subgraph of G for i as G_pi,i = (V_pi,i, E_pi,i), where

    V_pi,i = { j in V : pi_ij != NIL } Union {i}
    E_pi,i = { (pi_ij, j) : j in V_pi,i - {i} }

If G_pi,i is a shortest-paths tree, the following procedure prints a shortest path from i to j:

PRINT-ALL-PAIRS-SHORTEST-PATH(PI, i, j)
1  if i == j
2      print i
3  elseif pi_ij == NIL
4      print "no path from" i "to" j
5  else PRINT-ALL-PAIRS-SHORTEST-PATH(PI, i, pi_ij)
6      print j

Chapter outline

25.0.3
Section 25.1 solves the all-pairs shortest-paths problem using a dynamic-programming algorithm based on matrix multiplication, which can be made to run in O(V^3 lg V) time.

Section 25.2 solves the problem by the Floyd-Warshall algorithm, another dynamic-programming algorithm, which runs in O(V^3) time. A closely related algorithm can find the transitive closure of a directed graph.

Section 25.3 solves the problem with Johnson's algorithm, which runs in O(V^2 lg V + V E) time and uses the adjacency-list representation. It is good for large, sparse graphs. It can handle negative-weight edges and can detect negative-weight cycles.

Conventions: First, we assume G = (V,E) has n = |V| vertices. Second, matrices such as W, L, or D are denoted by upper-case letters, and their elements by subscripted lower-case letters, such as w_ij, l_ij, or d_ij. In these notes we use notation like L(m) = (l(m)_ij) to indicate iterates, instead of having "(m)" as a superscript as in the book. Finally, for an n x n matrix A, the value n is stored in the attribute A.rows.

25.1  Shortest paths and matrix multiplication

25.1.1
The four steps for developing a dynamic-programming algorithm to solve a problem are:
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from the computed information.
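As a concrete sketch, the PRINT-ALL-PAIRS-SHORTEST-PATH procedure above might be rendered in Python (illustrative only: 0-based vertex numbering, with the predecessor matrix stored as a list of lists and None playing the role of NIL):

```python
def print_all_pairs_shortest_path(pi, i, j):
    """Print the vertices of a shortest path from i to j, given the
    predecessor matrix pi (None plays the role of NIL)."""
    if i == j:
        print(i)
    elif pi[i][j] is None:
        print("no path from", i, "to", j)
    else:
        # Recurse to j's predecessor on the path from i, then print j.
        print_all_pairs_shortest_path(pi, i, pi[i][j])
        print(j)
```

As in the pseudocode, the recursion bottoms out either at the source i or at a NIL entry signalling that no path exists.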
The structure of a shortest path (Step 1)

Consider a shortest path p from i to j, and suppose p contains at most m edges. Assuming no negative-weight cycles, m is finite. If i = j, then p has no edges and weight 0. If i != j, we decompose p as

    i  --p'-->  k  ->  j

where p' has at most m - 1 edges. Lemma 24.1 says that p' is a shortest path from i to k, and so delta(i,j) = delta(i,k) + w_kj.

25.1.2
A recursive solution to the all-pairs shortest-paths problem (Step 2)

Let l(m)_ij be the minimum weight of any path from i to j that contains at most m edges. When m = 0 there is a shortest path from i to j (with no edges) if and only if i = j, so that

    l(0)_ij = 0         if i = j
            = infinity  if i != j

For m > 0, l(m)_ij is the minimum of l(m-1)_ij and the minimum weight of any path from i to j consisting of at most m edges, obtained by looking at all possible predecessors k of j. Thus

    l(m)_ij = min( l(m-1)_ij, min_{1<=k<=n} { l(m-1)_ik + w_kj } )
            = min_{1<=k<=n} { l(m-1)_ik + w_kj }

since w_jj = 0 for all j.

If there are no negative-weight cycles, then for every pair of vertices i and j for which delta(i,j) < infinity, there is a shortest path from i to j that is simple and so contains at most n - 1 edges; a path with more than n - 1 edges cannot have lower weight. The actual shortest-path weights are therefore given by

    delta(i,j) = l(n-1)_ij = l(n)_ij = l(n+1)_ij = ...        (Equation 25.3)

25.1.3
Taking the input matrix W = (w_ij), we compute a series of matrices L(m) = (l(m)_ij) for m = 1, 2, ..., n-1. The final matrix L(n-1) contains the shortest-path weights. Since l(1)_ij = w_ij for all i, j, we have L(1) = W. The following procedure computes L(m) from W and L(m-1), and so extends the shortest paths computed so far by one more edge. (It is written so that the input and output are independent of m.) Because of the three nested for loops, its running time is Theta(n^3).
EXTEND-SHORTEST-PATHS(L, W)
1  n = L.rows
2  let L' = (l'_ij) be a new n x n matrix
3  for i = 1 to n
4      for j = 1 to n
5          l'_ij = infinity
6          for k = 1 to n
7              l'_ij = min(l'_ij, l_ik + w_kj)
8  return L'

Note that if we make the substitutions

    l --> a,  w --> b,  l' --> c,  min --> +,  + --> x

25.1.4
and replace infinity (the identity for min) by 0 (the identity for +), we obtain the standard procedure for matrix multiplication:

MATRIX-MULTIPLY(A, B)
1  n = A.rows
2  let C be a new n x n matrix
3  for i = 1 to n
4      for j = 1 to n
5          c_ij = 0
6          for k = 1 to n
7              c_ij = c_ij + a_ik x b_kj
8  return C

Returning to the all-pairs shortest-paths problem, let A x B denote the matrix "product" returned by EXTEND-SHORTEST-PATHS(A, B). We extend the shortest paths edge by edge by computing the sequence of n - 1 matrices

    L(1)   = L(0) x W   = W
    L(2)   = L(1) x W   = W^2
    L(3)   = L(2) x W   = W^3
    ...
    L(n-1) = L(n-2) x W = W^(n-1)

As argued above, L(n-1) = W^(n-1) contains the shortest-path weights. Step 3: the procedure below computes this sequence in Theta(n^4) time. Figure 25.1 (page 690) shows a graph and the matrices computed by the procedure.

25.1.5
SLOW-ALL-PAIRS-SHORTEST-PATHS(W)
1  n = W.rows
2  L(1) = W
3  for m = 2 to n - 1
4      L(m) = EXTEND-SHORTEST-PATHS(L(m-1), W)
5  return L(n-1)

Improving the running time

Note that we only need L(n-1), and if there are no negative-weight cycles, L(m) = L(n-1) for all m >= n - 1 by Equation 25.3 above. Moreover, the matrix "product" defined by EXTEND-SHORTEST-PATHS is associative, so we can compute L(n-1) with only c = ceiling(lg(n-1)) such products by computing the sequence

    L(1)   = W
    L(2)   = W^2     = W x W
    L(4)   = W^4     = W^2 x W^2
    L(8)   = W^8     = W^4 x W^4
    ...
    L(2^c) = W^(2^c) = W^(2^(c-1)) x W^(2^(c-1))

Since 2^ceiling(lg(n-1)) >= n - 1, we have L(2^c) = L(2^ceiling(lg(n-1))) = L(n-1). Step 3': the following procedure computes the sequence above by "repeated squaring".
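In executable form, EXTEND-SHORTEST-PATHS and the repeated-squaring driver might be sketched in Python as follows (an illustrative sketch: 0-based indices, with weights held in a square list-of-lists using float('inf') for absent edges and 0 on the diagonal):

```python
INF = float('inf')

def extend_shortest_paths(L, W):
    """One min-plus 'product': extend every shortest path by
    allowing one more edge (Theta(n^3) time)."""
    n = len(L)
    L_next = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                L_next[i][j] = min(L_next[i][j], L[i][k] + W[k][j])
    return L_next

def faster_all_pairs_shortest_paths(W):
    """Repeated squaring: compute L(1), L(2), L(4), ... until m >= n - 1."""
    n = len(W)
    L = W            # L(1) = W
    m = 1
    while m < n - 1:
        L = extend_shortest_paths(L, L)   # L(2m) = L(m) x L(m)
        m *= 2
    return L
```

The Theta(n^4) SLOW version would instead multiply by W in a loop for m = 2, ..., n - 1; the driver above performs only ceiling(lg(n-1)) products.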
25.1.6
FASTER-ALL-PAIRS-SHORTEST-PATHS(W)
1  n = W.rows
2  L(1) = W
3  m = 1
4  while m < n - 1
5      L(2m) = EXTEND-SHORTEST-PATHS(L(m), L(m))
6      m = 2m
7  return L(m)

FASTER-ALL-PAIRS-SHORTEST-PATHS runs in Theta(n^3 lg n) time, since each of the ceiling(lg(n-1)) matrix "products" takes Theta(n^3) time. Note that the code is tight, containing no complex data structures, so the constant hidden in the Theta-notation is small.

Note: to actually find all the shortest paths (Step 4), i.e. to compute the matrix PI, we can either compute it from the final matrix L(n-1) in O(n^3) additional time (Exercise 25.1-6), or modify EXTEND-SHORTEST-PATHS and SLOW-ALL-PAIRS-SHORTEST-PATHS to compute a sequence PI(1), PI(2), ..., PI(n-1) = PI along with L(1), L(2), ..., L(n-1) (Exercise 25.1-7, page 692). This is the fourth step of dynamic programming.

25.2  The Floyd-Warshall algorithm

25.2.1
We can use a different dynamic-programming formulation to obtain the Floyd-Warshall algorithm, which runs in Theta(V^3) time. It assumes that there are no negative-weight cycles (but it can be used to detect such cycles -- see Exercise 25.2-6, page 700).

The structure of a shortest path (Step 1)

In the Floyd-Warshall algorithm, we consider the "intermediate" vertices of a shortest path, where an intermediate vertex of a simple path p = <v_0, v_1, ..., v_l> is any vertex of p other than v_0 or v_l. Under the assumption that the vertices of G are V = {1, 2, ..., n}, we consider the subset {1, 2, ..., k} for some k. For any pair of vertices i, j in V, let p be a minimum-weight path from i to j whose intermediate vertices are all taken from {1, 2, ..., k} (note that p is simple). The Floyd-Warshall algorithm uses a relationship between p and shortest paths with intermediate vertices in the set {1, 2, ..., k-1}. This relationship depends on whether or not k is an intermediate vertex of path p:

25.2.2
- If k is not an intermediate vertex of p, then all intermediate vertices of p are in the set {1, 2, ..., k-1}.
  Thus a shortest path from i to j with intermediate vertices in the set {1, 2, ..., k-1} is also a shortest path from i to j with intermediate vertices in the set {1, 2, ..., k}.

- If k _is_ an intermediate vertex of path p, we decompose p into

      i  --p1-->  k  --p2-->  j

  as shown in Figure 25.3. By Lemma 24.1, p1 is a shortest path from i to k with intermediate vertices in {1, 2, ..., k}. Since k is not an intermediate vertex of p1, p1 is a shortest path from i to k with all intermediate vertices in {1, 2, ..., k-1}. Similarly, p2 is a shortest path from vertex k to vertex j with all intermediate vertices in {1, 2, ..., k-1}.

A recursive solution to the all-pairs shortest-paths problem (Step 2)

Let d(k)_ij be the weight of a shortest path from i to j with all intermediate vertices in the set {1, 2, ..., k}. When k = 0, a path from i to j with no intermediate vertex numbered higher than 0 has no intermediate vertices at all; such a path has at most one edge, and hence d(0)_ij = w_ij. Thus we get the recursive definition of d(k)_ij below.

25.2.3
    d(k)_ij = w_ij                                      if k = 0
            = min( d(k-1)_ij, d(k-1)_ik + d(k-1)_kj )   if k > 0
                                                  (Equation 25.5)

Since for any path all intermediate vertices are in the set {1, 2, ..., n}, the matrix D(n) = (d(n)_ij) gives the final answer -- i.e. d(n)_ij = delta(i,j) for all i, j in V.

Computing the shortest-path weights bottom-up

Step 3: based on Equation 25.5, the following bottom-up procedure computes the values d(k)_ij in order of increasing values of k. Figure 25.4 (page 696) shows the matrices computed for the graph of Figure 25.1.

FLOYD-WARSHALL(W)
1  n = W.rows
2  D(0) = W
3  for k = 1 to n
4      for i = 1 to n
5          for j = 1 to n
6              d(k)_ij = min( d(k-1)_ij, d(k-1)_ik + d(k-1)_kj )
7  return D(n)

The running time of FLOYD-WARSHALL is determined by the triply nested for loops; since line 6 takes Theta(1) time, the whole algorithm runs in Theta(n^3) time. As before, the code is tight, so the hidden constant is small.
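A minimal executable sketch of FLOYD-WARSHALL, under the same illustrative conventions as before (0-based indices, float('inf') for absent edges, 0 on the diagonal):

```python
def floyd_warshall(W):
    """Floyd-Warshall in Theta(n^3) time.  The k-th pass allows vertex k
    as an intermediate vertex.  A fresh matrix per pass mirrors the D(k)
    notation; updating D in place would also be correct."""
    n = len(W)
    D = [row[:] for row in W]          # D(0) = W (copied, W untouched)
    for k in range(n):
        D = [[min(D[i][j], D[i][k] + D[k][j]) for j in range(n)]
             for i in range(n)]
    return D
```

The triply nested iteration matches lines 3-6 of the pseudocode, with the min of Equation 25.5 in the innermost position.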
25.2.4
Constructing a shortest path (Step 4)

As with the earlier all-pairs shortest-paths algorithms, we can construct the predecessor matrix PI from the matrix D of shortest-path weights in O(n^3) time (Exercise 25.1-6, page 692). Alternatively, we can compute PI in parallel with D during the run of the FLOYD-WARSHALL algorithm. Specifically, we compute a sequence PI(1), PI(2), ..., PI(n) of matrices, where PI = PI(n) and pi(k)_ij is the predecessor of j on a shortest path from i to j with intermediate vertices in the set {1, 2, ..., k}.

Here is the recursive definition of pi(k)_ij. If k = 0, a shortest path has no intermediate vertices between i and j, so

    pi(0)_ij = NIL  if i = j or w_ij = infinity
             = i    if i != j and w_ij < infinity

For k > 0, if we can take a shorter path through k, i.e. i ~> k ~> j with k != j, then j's predecessor is the same as its predecessor on a shortest path from k to j using only vertices in the set {1, 2, ..., k-1}. Otherwise, if going through k gives no shorter path from i to j, then j's predecessor is the same as on a shortest path from i to j that uses only {1, 2, ..., k-1}. Specifically:

25.2.5
    pi(k)_ij = pi(k-1)_ij  if d(k-1)_ij <= d(k-1)_ik + d(k-1)_kj
             = pi(k-1)_kj  if d(k-1)_ij >  d(k-1)_ik + d(k-1)_kj

Figure 25.4 (page 696) shows the PI(k) matrices for the graph in Figure 25.1 (page 690). Exercise 25.2-3 asks for the inclusion of this idea in FLOYD-WARSHALL (and for a proof that the predecessor subgraph G_pi,i is a shortest-paths tree with root i). Exercise 25.2-7 indicates another way to find shortest paths.

Transitive closure of a directed graph

Given a directed graph G = (V,E) with vertex set V = {1, 2, ..., n}, the transitive closure of G is defined as the graph G* = (V, E*), where

    E* = { (i,j) : there is a path from i to j in G }

One way to compute G* in Theta(n^3) time is to assign a weight of 1 to each edge of E and run FLOYD-WARSHALL. If there is a path from i to j, then d_ij < n; otherwise d_ij = infinity.
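This unit-weight reduction might be sketched as follows (an illustrative sketch; the graph is assumed to be given as a vertex count n and a 0-based edge list):

```python
def transitive_closure_by_weights(n, edges):
    """Transitive closure via Floyd-Warshall on unit edge weights:
    j is reachable from i exactly when d_ij is finite."""
    INF = float('inf')
    D = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for (u, v) in edges:
        D[u][v] = 1                      # every edge gets weight 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    # (i,j) in E* iff a path exists, i.e. iff d_ij < infinity
    return [[D[i][j] < INF for j in range(n)] for i in range(n)]
```

Every vertex is trivially reachable from itself here, since d_ii = 0 is finite.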
Another, similar way also computes G* in Theta(n^3) time, but can save time and space in practice. It substitutes v (logical OR) and ^ (logical AND) for min and + in FLOYD-WARSHALL.

25.2.6
We define matrices T(k) by setting t(k)_ij to 1 (TRUE) if there is a path from i to j with all intermediate vertices in the set {1, 2, ..., k}, and 0 (FALSE) otherwise. Then (i,j) is in E* if and only if t(n)_ij = 1. So

    t(0)_ij = 1  if i = j or (i,j) in E
            = 0  if i != j and (i,j) not in E

and for k > 0

    t(k)_ij = t(k-1)_ij v ( t(k-1)_ik ^ t(k-1)_kj )

TRANSITIVE-CLOSURE(G)
1   n = |G.V|
2   for i = 1 to n
3       for j = 1 to n
4           if i == j or (i,j) in G.E
5               t(0)_ij = 1
6           else t(0)_ij = 0
7   for k = 1 to n
8       for i = 1 to n
9           for j = 1 to n
10              t(k)_ij = t(k-1)_ij v ( t(k-1)_ik ^ t(k-1)_kj )
11  return T(n)

Figure 25.5 (page 698) shows a sample run of TRANSITIVE-CLOSURE, including the T(k) matrices. Note: 1) the bitwise logical operations v and ^ are often faster on real computers, and 2) only a single bit need be stored for each t(k)_ij, reducing the space requirement by a factor equal to the word size of the computer.

25.3  Johnson's algorithm for sparse graphs

25.3.1
Idea: if there are no negative-weight cycles in the original graph, re-weight the edges so that all weights are non-negative. Add a new vertex s and an edge of weight 0 from s to every original vertex. Define h(v) = delta(s,v), the distance computed by Bellman-Ford. Then re-weight as follows:

    w-hat(u,v) = w(u,v) + h(u) - h(v)

Lemma 25.1: Re-weighting preserves shortest paths: p is a shortest path under w if and only if p is a shortest path under w-hat, and w has a negative-weight cycle if and only if w-hat does. Moreover, if there are no negative-weight cycles, then w-hat(u,v) >= 0 for every edge (u,v).

JOHNSON(G)
1   compute G', where G'.V = G.V U {s},
        G'.E = G.E U {(s,v) : v in G.V}, and
        w(s,v) = 0 for all v in G.V
2   if BELLMAN-FORD(G', w, s) == FALSE
3       print "graph has neg. wt. cycle"
4   else for each vertex v in G'.V
5       h(v) = delta(s,v) as computed by Bellman-Ford
6       for each edge (u,v) in G'.E
7           w-hat(u,v) = w(u,v) + h(u) - h(v)
8       let D = (d_uv) be a new n x n matrix
9       for each vertex u in G.V
10          run DIJKSTRA(G, w-hat, u) to compute delta-hat(u,v) for all v in G.V
11          for each vertex v in G.V
12              d_uv = delta-hat(u,v) + h(v) - h(u)
13      return D
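A sketch of JOHNSON in Python (illustrative only: 0-based vertices, edges given as (u, v, w) triples, with Bellman-Ford and a binary-heap Dijkstra inlined; it returns None in place of printing the negative-cycle message):

```python
import heapq

INF = float('inf')

def johnson(n, edges):
    """Johnson's algorithm sketch.  Graph: vertices 0..n-1, edges as
    (u, v, w) triples.  Returns the n x n distance matrix as a list of
    lists, or None if a negative-weight cycle is detected."""
    # Add a source s = n with a 0-weight edge to every original vertex.
    aug = edges + [(n, v, 0) for v in range(n)]
    # Bellman-Ford from s computes h(v) = delta(s, v).
    h = [INF] * (n + 1)
    h[n] = 0
    for _ in range(n):                 # |V'| - 1 = n relaxation passes
        for (u, v, w) in aug:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    for (u, v, w) in aug:              # detect negative-weight cycles
        if h[u] + w < h[v]:
            return None
    # Re-weight: w_hat(u,v) = w(u,v) + h(u) - h(v) >= 0.
    adj = [[] for _ in range(n)]
    for (u, v, w) in edges:
        adj[u].append((v, w + h[u] - h[v]))
    # Dijkstra from each vertex on the re-weighted graph.
    D = [[INF] * n for _ in range(n)]
    for s in range(n):
        dist = [INF] * n
        dist[s] = 0
        pq = [(0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue               # stale queue entry
            for (v, w) in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(pq, (dist[v], v))
        for v in range(n):             # undo the re-weighting
            if dist[v] < INF:
                D[s][v] = dist[v] + h[v] - h[s]
    return D
```

With a binary-heap Dijkstra as here, the running time is O(V E lg V); a Fibonacci heap would give the O(V^2 lg V + V E) bound stated above.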