Chapter 24 Single-Source Shortest Paths

A motorist wants to find the shortest route from Duluth to Boston, given a map of the U.S. marked with distances between neighboring towns.

Abstractly, in a shortest-paths problem, we are given a weighted, directed graph G = (V,E), with weight function w: E -> R mapping edges to real numbers. The weight of a path p = <v_0, v_1, ..., v_k> is the sum of the weights of its edges:

    w(p) = Sum_{i=1}^{k} w(v_(i-1), v_i)

The shortest-path weight from u to v is:

    delta(u,v) = min{ w(p) : p is a path u ^-> v }   if there is a path from u to v
    delta(u,v) = infinity                            otherwise

A shortest path from u to v is then defined as any path p with weight w(p) = delta(u,v). Edge weights can also be used to represent time, cost, penalties, loss, etc.

Variants

In this chapter, we focus on the single-source shortest-paths problem: given a graph G = (V,E), we want to find a shortest path from a given source vertex s to each vertex v in V. An algorithm for this problem also solves several variants:

- Single-destination shortest-paths problem: Find a shortest path to a given destination vertex t from each vertex v. We can reduce this problem to the single-source problem by reversing the direction of the edges in E.
- Single-pair shortest-path problem: For given vertices u and v, find a shortest path from u to v. Solving the single-source problem with source u solves this problem as well. Note: no algorithms are known for this problem that run asymptotically faster than the best single-source shortest-path algorithms in the worst case.
- All-pairs shortest-paths problem: Find a shortest path from u to v for every pair of vertices u and v. This problem (discussed in Chapter 25), interesting by itself, can be solved faster than by running the single-source algorithm once for each vertex u.

Optimal substructure of a shortest path

Lemma 24.1 Subpaths of shortest paths are shortest paths. Let p = <v_1, v_2, ..., v_k> be a shortest path from v_1 to v_k and, for any i and j such that 1 <= i < j <= k, let p_ij = <v_i, v_(i+1), ..., v_j> be the subpath of p from v_i to v_j. Then p_ij is a shortest path from v_i to v_j.
Proof: By contradiction. Assume there is a shorter path p_ij' from v_i to v_j. Then following p from v_1 to v_i, then p_ij' from v_i to v_j, then p from v_j to v_k would give a path from v_1 to v_k of smaller weight than p, contradicting the fact that p is a shortest path from v_1 to v_k.

Negative-weight edges

In some instances of the shortest-paths problem there may be edges with negative weights. If G = (V,E) contains no negative-weight cycles reachable from s, then for all v in V, the shortest-path weight delta(s,v) remains well defined even if it is negative. However, if there is a negative-weight cycle reachable on a path from s to v, we define delta(s,v) to be -infinity. See Figure 24.1 (page 646).

Some shortest-path algorithms assume edge weights are nonnegative; others allow negative weights as long as there are no negative-weight cycles reachable from s. Typically an algorithm of the latter kind can detect and report a negative-weight cycle if one exists.

Cycles

As just noted, a shortest path cannot contain a negative-weight cycle. A shortest path also cannot contain a positive-weight cycle, since removing the cycle would give a path with the same endpoints and smaller weight. We can also remove all 0-weight cycles from a path, since doing so does not change the path weight. Thus we can assume that shortest paths have no cycles, and therefore contain at most |V| distinct vertices and at most |V| - 1 edges.

Representing shortest paths

Given a vertex v in a graph G = (V,E), we define its predecessor v.pi as another vertex or NIL. The shortest-path algorithms in this chapter set pi so that the predecessor chain starting at v runs backward to s along a shortest path from s to v. Thus if v.pi is not NIL, PRINT-PATH(G,s,v) from Section 22.2 will print a shortest path from s to v. During the run of a shortest-path algorithm, the v.pi values need not indicate shortest paths, but they will at termination.
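As a rough illustration of how predecessor values encode paths, here is a minimal Python sketch in the spirit of PRINT-PATH. It is not the textbook procedure: the dictionary pi (mapping each vertex to its predecessor, with None standing in for NIL) and the function name path_from_predecessors are assumptions made for this example, and the function returns the path as a list instead of printing it.

```python
def path_from_predecessors(pi, s, v):
    """Follow the predecessor chain backward from v.

    pi maps each vertex to its predecessor (None plays the role of NIL).
    Returns the list of vertices from s to v, or None if the chain
    starting at v never reaches s.
    """
    path = []
    cur = v
    while cur is not None:
        path.append(cur)
        if cur == s:
            path.reverse()  # the chain was collected from v back to s
            return path
        cur = pi.get(cur)
    return None


# Hypothetical predecessor values: s -> a -> b, and c not reached.
pi = {"s": None, "a": "s", "b": "a", "c": None}
print(path_from_predecessors(pi, "s", "b"))  # ['s', 'a', 'b']
print(path_from_predecessors(pi, "s", "c"))  # None
```

The path is only guaranteed to be a shortest path once the algorithm that filled in pi has terminated.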
As in BFS, we are interested in the predecessor subgraph G_pi = (V_pi,E_pi), where:

    V_pi = {v in V : v.pi != NIL} Union {s}
    E_pi = {(v.pi,v) in E : v in V_pi - {s}}

For the algorithms in this chapter, we prove that at termination G_pi is a "shortest-paths tree": a rooted tree that contains a shortest path from s to every vertex reachable from s. G_pi is like the breadth-first tree of Section 22.2, but it contains shortest paths defined in terms of edge weights instead of numbers of edges.

If no negative-weight cycles are reachable from s, a shortest-paths tree rooted at s is a directed subgraph G' = (V',E') such that:

1. V' is the subset of V of vertices reachable from s in G,
2. G' forms a rooted tree with root s, and
3. for all v in V', the unique simple path from s to v in G' is a shortest path from s to v in G.

Shortest paths are not necessarily unique, and neither are shortest-paths trees, as shown in Figure 24.2 (page 648).

Relaxation

For each vertex v in V, we maintain a field v.d, the shortest-path estimate, which is an upper bound on the weight of a shortest path from s to v. We can initialize the shortest-path estimates and predecessors in Theta(V) time by:

    INITIALIZE-SINGLE-SOURCE(G,s)
        for each vertex v in G.V
            v.d = infinity
            v.pi = NIL
        s.d = 0

Relaxation, the process of relaxing an edge (u,v), consists of testing whether we can improve the shortest path to v found so far by going through u and, if so, updating v.d and v.pi. Figure 24.3 (page 649) shows two examples of relaxing an edge, one where the shortest-path estimate decreases and one where it does not. Here is the code to do relaxation on (u,v):

    RELAX(u,v,w)
        if v.d > u.d + w(u,v)
            v.d = u.d + w(u,v)
            v.pi = u

Each algorithm in this chapter calls INITIALIZE-SINGLE-SOURCE and then repeatedly relaxes edges. The algorithms differ in how many times and in what order they relax edges: the Bellman-Ford algorithm relaxes each edge |V| - 1 times, whereas Dijkstra's algorithm and the algorithm for directed acyclic graphs relax each edge exactly once.
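The two procedures above translate almost line for line into Python. In this sketch, which is only one possible encoding, the attributes v.d and v.pi are stored in dictionaries d and pi, the weight function w is a dictionary keyed by edge (u, v), and math.inf plays the role of infinity. All of these names, and the small example graph, are assumptions of the example.

```python
import math


def initialize_single_source(vertices, s):
    """Set every estimate to infinity and every predecessor to None (NIL)."""
    d = {v: math.inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    return d, pi


def relax(u, v, w, d, pi):
    """Relax edge (u, v); return True if the estimate for v improved."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pi[v] = u
        return True
    return False


# Hypothetical three-vertex graph with made-up weights.
w = {("s", "a"): 5, ("s", "b"): 2, ("b", "a"): 1}
d, pi = initialize_single_source(["s", "a", "b"], "s")
relax("s", "a", w, d, pi)  # d["a"] drops from infinity to 5
relax("s", "b", w, d, pi)  # d["b"] drops from infinity to 2
relax("b", "a", w, d, pi)  # d["a"] improves to 3 and pi["a"] becomes "b"
relax("s", "a", w, d, pi)  # no change, since 3 <= 0 + 5
print(d, pi)
```

Relaxing an edge whose tail still has estimate infinity changes nothing, because math.inf plus a finite weight is still math.inf.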
Properties of shortest paths and relaxation

Several properties of relaxation and shortest paths are used to prove correctness of the algorithms in this chapter. They are stated here and proved in Section 24.5.

Triangle inequality (Lemma 24.10)
For any edge (u,v) in E, we have delta(s,v) <= delta(s,u) + w(u,v).

For the remaining properties, we assume that INITIALIZE-SINGLE-SOURCE(G,s) has been called and that v.d and v.pi change only through relaxation steps.

Upper-bound property (Lemma 24.11)
We always have v.d >= delta(s,v) for all v in V, and once v.d = delta(s,v), it never changes.

No-path property (Corollary 24.12)
If there is no path from s to v, then we always have v.d = delta(s,v) = infinity.

Convergence property (Lemma 24.14)
If s ^-> u --> v is a shortest path in G for some u,v in V, and if u.d = delta(s,u) at any time prior to relaxing edge (u,v), then v.d = delta(s,v) at all times afterward.

Path-relaxation property (Lemma 24.15)
If p = <v_0, v_1, ..., v_k> is a shortest path from s = v_0 to v_k, and the edges of p are relaxed in the order (v_0,v_1), (v_1,v_2), ..., (v_(k-1),v_k), then v_k.d = delta(s,v_k). This property holds regardless of any other relaxation steps that occur, even if they are intermixed with relaxations of the edges of p.

Predecessor-subgraph property (Lemma 24.17)
Once v.d = delta(s,v) for all v in V, the predecessor subgraph is a shortest-paths tree rooted at s.

Chapter outline

Section 24.1 presents the Bellman-Ford algorithm, which can handle negative-weight edges and can detect negative-weight cycles. Section 24.2 gives a linear-time algorithm for finding shortest paths from a single source in a directed acyclic graph. Section 24.3 covers Dijkstra's algorithm, which runs faster than the Bellman-Ford algorithm but requires nonnegative edge weights. Section 24.4 shows how the Bellman-Ford algorithm can solve a special case of "linear programming". Finally, Section 24.5 proves the properties of shortest paths and relaxation listed above.

We use some conventions when doing arithmetic with infinity.
If a != -infinity, then a + infinity = infinity + a = infinity. And if a != +infinity, then a + (-infinity) = (-infinity) + a = -infinity.

All algorithms in this chapter assume that the graph is stored in the adjacency-list representation, and that the weight of each edge is stored with it, so that we can find the edge weights in O(1) time per edge.

24.1 The Bellman-Ford algorithm

The Bellman-Ford algorithm solves the single-source shortest-paths problem even when edge weights may be negative. It returns a boolean value: if there is a negative-weight cycle reachable from the source, it returns FALSE to indicate that no solution exists; otherwise it returns TRUE and produces the shortest paths and their weights. Figure 24.4 (page 652) shows how it works.

    BELLMAN-FORD(G,w,s)
    1  INITIALIZE-SINGLE-SOURCE(G,s)
    2  for i = 1 to |G.V| - 1
    3      for each edge (u,v) in G.E
    4          RELAX(u,v,w)
    5  for each edge (u,v) in G.E
    6      if v.d > u.d + w(u,v)
    7          return FALSE
    8  return TRUE

The Bellman-Ford algorithm runs in O(V E) time, since line 1 takes Theta(V) time, lines 3-4 take Theta(E) time and are executed |V| - 1 times, and lines 5-7 take O(E) time.

Lemma 24.2 Let G = (V,E) be a weighted, directed graph with source s and weight function w, and assume that G has no negative-weight cycles reachable from s. Then, after the |V| - 1 iterations of lines 2-4, v.d = delta(s,v) for all vertices v that are reachable from s.

Proof: Let v be reachable from s, and let p = <v_0, v_1, ..., v_k> be any acyclic shortest path from v_0 = s to v_k = v. Path p has at most |V| - 1 edges, so k <= |V| - 1. Each of the |V| - 1 iterations of the for loop of lines 2-4 relaxes all |E| edges. In particular, edge (v_(i-1), v_i) is among the edges relaxed in iteration i, for i = 1, 2, ..., k. Thus the edges of p are relaxed in order, and v.d = v_k.d = delta(s,v_k) = delta(s,v) by the path-relaxation property.

Corollary 24.3 Let G = (V,E) be a weighted, directed graph with source s and weight function w. Then for each vertex v in V, there is a path from s to v if and only if BELLMAN-FORD terminates with v.d < infinity when it is run on G.
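For concreteness, the BELLMAN-FORD procedure of this section might be sketched in Python as follows. The representation is an assumption of the example, not part of the text: the graph is given as a list of vertices plus a list of (u, v, weight) triples, the estimates and predecessors live in dictionaries d and pi, and the function returns them together with the TRUE/FALSE result.

```python
import math


def bellman_ford(vertices, edges, s):
    """Return (ok, d, pi); ok is False if a negative-weight cycle is reachable from s."""
    # INITIALIZE-SINGLE-SOURCE
    d = {v: math.inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    # |V| - 1 passes, each relaxing every edge once
    for _ in range(len(vertices) - 1):
        for u, v, wt in edges:
            if d[v] > d[u] + wt:
                d[v] = d[u] + wt
                pi[v] = u
    # If some edge can still be relaxed, a negative-weight cycle is reachable
    for u, v, wt in edges:
        if d[v] > d[u] + wt:
            return False, d, pi
    return True, d, pi


# Hypothetical graph with one negative edge and no negative-weight cycle.
vertices = ["s", "a", "b", "c"]
edges = [("s", "a", 4), ("s", "b", 5), ("b", "a", -3), ("a", "c", 2)]
ok, d, pi = bellman_ford(vertices, edges, "s")
print(ok, d)  # True {'s': 0, 'a': 2, 'b': 5, 'c': 4}

# Hypothetical graph in which the cycle a -> b -> a has weight -1.
ok2, _, _ = bellman_ford(["s", "a", "b"],
                         [("s", "a", 1), ("a", "b", -2), ("b", "a", 1)], "s")
print(ok2)  # False
```

An edge list is used here only because the algorithm iterates over all edges in each pass; an adjacency-list representation works equally well.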
Proof of Corollary 24.3: Exercise 24.1-2.

Theorem 24.4 (Correctness of Bellman-Ford) Let BELLMAN-FORD be run on a weighted, directed graph G = (V,E) with source s and weight function w. If G contains no negative-weight cycle reachable from s, then the algorithm returns TRUE, v.d = delta(s,v) for all v in V, and the predecessor subgraph G_pi is a shortest-paths tree rooted at s. If G does contain a negative-weight cycle reachable from s, then the algorithm returns FALSE.

Proof: On pages 653-654.

24.2 Single-source shortest paths in directed acyclic graphs

If we relax the edges of a weighted dag G = (V,E) according to a topological sort of its vertices, we can compute shortest paths from a single source in Theta(V + E) time. Shortest paths are always well defined in a dag, since even if there are negative-weight edges, there are no cycles and hence no negative-weight cycles.

    DAG-SHORTEST-PATHS(G,w,s)
    1  topologically sort the vertices of G
    2  INITIALIZE-SINGLE-SOURCE(G,s)
    3  for each vertex u, in topological order
    4      for each vertex v in G.Adj[u]
    5          RELAX(u,v,w)

The topological sort of line 1 can be done in Theta(V + E) time. Line 2 takes Theta(V) time. Each edge is examined exactly once in lines 3-5, for a total of Theta(E) time, and so the total running time is Theta(V + E).

Theorem 24.5 If a weighted, directed graph G = (V,E) has source s and no cycles, then at the termination of DAG-SHORTEST-PATHS, v.d = delta(s,v) for all v in V, and G_pi is a shortest-paths tree.

Proof: We first show that v.d = delta(s,v) for all v in V at termination. If v is not reachable from s, then v.d = delta(s,v) = infinity by the no-path property. If v is reachable from s, let p = <v_0, v_1, ..., v_k> be a shortest path, where v_0 = s and v_k = v. Since we process the vertices in topological order, the edges on p are relaxed in the order (v_0,v_1), (v_1,v_2), ..., (v_(k-1),v_k), so by the path-relaxation property, v_i.d = delta(s,v_i) at termination for i = 0, 1, ..., k. Finally, by the predecessor-subgraph property, G_pi is a shortest-paths tree.

An application of this algorithm arises in determining critical paths in PERT (Program Evaluation and Review Technique) chart analysis.
Edges represent jobs to perform, and edge weights represent the times required to perform them. If edge (u,v) enters vertex v and edge (v,x) leaves v, then job (u,v) must be performed before job (v,x). A path through this dag represents a sequence of jobs that must be performed in a particular order. A critical path is a longest path through the dag, corresponding to the longest time needed to perform any sequence of jobs. Thus the weight of a critical path is a lower bound on the total time to perform all the jobs. We can find a critical path by either:

- negating the edge weights and running DAG-SHORTEST-PATHS, or
- running DAG-SHORTEST-PATHS after replacing infinity by -infinity in INITIALIZE-SINGLE-SOURCE and ">" by "<" in RELAX.

24.3 Dijkstra's algorithm

Dijkstra's algorithm solves the single-source shortest-paths problem on a weighted, directed graph G = (V,E) with w(u,v) >= 0 for each edge (u,v) in E. It maintains a set S of vertices whose final shortest-path weights from s have been determined. The algorithm repeatedly selects the vertex u in V - S with the minimum shortest-path estimate, adds u to S, and relaxes all edges leaving u. It uses a min-priority queue Q of vertices, keyed by their d values.

    DIJKSTRA(G,w,s)
    1  INITIALIZE-SINGLE-SOURCE(G,s)
    2  S = phi          // Initialize S to be empty
    3  Q = G.V
    4  while Q != phi
    5      u = EXTRACT-MIN(Q)
    6      S = S Union {u}
    7      for each vertex v in G.Adj[u]
    8          RELAX(u,v,w)

Figure 24.6 (page 659) shows an example. Note that vertices are never inserted into Q after line 3, and each vertex is extracted from Q and added to S exactly once, so the while loop iterates exactly |V| times.

Because Dijkstra's algorithm always chooses the "lightest" vertex in V - S to add to S, we say it is a greedy algorithm. Greedy algorithms do not always yield optimal results, but the following theorem and corollary show that Dijkstra's algorithm does indeed compute shortest paths.
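A direct, unoptimized Python rendering of the DIJKSTRA procedure above might look like the following sketch. The adjacency-dictionary representation, the function name, and the example graph with its weights are assumptions of this example; EXTRACT-MIN is performed by a linear scan over the vertices still in Q, which is the simplest possible min-priority queue.

```python
import math


def dijkstra(adj, s):
    """adj maps every vertex to a list of (neighbor, weight) pairs, weights >= 0.

    Returns (d, pi, order), where order lists the vertices in the order
    they were added to S.
    """
    d = {v: math.inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    order = []    # the set S, kept as a list to show the order of insertion
    Q = set(adj)  # Q = V - S throughout
    while Q:
        u = min(Q, key=lambda v: d[v])  # EXTRACT-MIN by linear scan
        Q.remove(u)
        order.append(u)
        for v, wt in adj[u]:            # relax every edge leaving u
            if d[v] > d[u] + wt:
                d[v] = d[u] + wt
                pi[v] = u
    return d, pi, order


# Hypothetical five-vertex graph with nonnegative weights.
adj = {
    "s": [("t", 10), ("y", 5)],
    "t": [("x", 1), ("y", 2)],
    "y": [("t", 3), ("x", 9), ("z", 2)],
    "x": [("z", 4)],
    "z": [("s", 7), ("x", 6)],
}
d, pi, order = dijkstra(adj, "s")
print(order)  # ['s', 'y', 'z', 't', 'x']
print(d)      # {'s': 0, 't': 8, 'y': 5, 'x': 9, 'z': 7}
```

Vertices enter S in nondecreasing order of their final estimates, which is the greedy choice described above.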
Theorem 24.6 (Correctness of Dijkstra's algorithm) Dijkstra's algorithm, run on a weighted, directed graph G = (V,E) with nonnegative weight function w and source s, terminates with u.d = delta(s,u) for all vertices u in V.

Proof: We use the following loop invariant: At the start of each iteration of the while loop, v.d = delta(s,v) for all v in S.

It suffices to show that u.d = delta(s,u) at the time u is added to S, since by the upper-bound property the equality then holds at all times thereafter.

Initialization: Initially, S = phi, so the invariant is trivially true.

Maintenance: We wish to show that u.d = delta(s,u) for the vertex u added to S in each iteration. For the purpose of contradiction, let u be the first vertex for which u.d != delta(s,u) when it is added to S. We show that u.d = delta(s,u) actually holds at that time, obtaining the contradiction.

We must have u != s, since s is the first vertex added to S and s.d = delta(s,s) = 0 at that time. Since u is not s, S is not empty just before u is added to S. There must be some path from s to u, for otherwise u.d = delta(s,u) = infinity by the no-path property, which contradicts u.d != delta(s,u). Since there is at least one path, there is a shortest path p from s to u. Prior to adding u to S, p connects a vertex in S, namely s, to a vertex in V - S, namely u. Let y be the first vertex along p such that y is in V - S, and let x in S be y's predecessor on p. Thus, as shown in Figure 24.7 (page 597), p can be decomposed as

    s ^-> x --> y ^-> u

where p1 denotes the subpath from s to x and p2 the subpath from y to u (either of which may have no edges).

We claim that y.d = delta(s,y) when u is added to S. To see this, note that x is in S. Then, because u was chosen as the first vertex with u.d != delta(s,u) when it is added to S, we had x.d = delta(s,x) when x was added to S. Edge (x,y) was relaxed at that time, so the claim follows from the convergence property.

We can now obtain a contradiction to prove that u.d = delta(s,u).
Because y occurs before u on a shortest path from s to u and all edge weights are nonnegative, we have delta(s,y) <= delta(s,u), and thus

    y.d  = delta(s,y)
        <= delta(s,u)      (*)
        <= u.d             (by the upper-bound property)

But since both u and y were in V - S when u was chosen in line 5, we also have u.d <= y.d, because u.d was a minimum. So the two inequalities in (*) are in fact equalities:

    y.d = delta(s,y) = delta(s,u) = u.d

In particular delta(s,u) = u.d, which contradicts our choice of u. Thus we conclude that u.d = delta(s,u) when u is added to S, and that this equality is maintained at all later times too.

Termination: At termination, Q = phi, which together with the fact that Q = V - S is maintained throughout the algorithm implies that S = V. Thus u.d = delta(s,u) for all u in V.

Corollary 24.7 If we run Dijkstra's algorithm on a weighted, directed graph G = (V,E) with nonnegative weight function w and source s, then at termination, the predecessor subgraph G_pi is a shortest-paths tree rooted at s.

Proof: Immediate from Theorem 24.6 and the predecessor-subgraph property.

Analysis

Dijkstra's algorithm uses the min-priority queue operations INSERT (implicit in line 3), EXTRACT-MIN (line 5), and DECREASE-KEY (implicit in RELAX, called in line 8). INSERT and EXTRACT-MIN are each called once per vertex. Since each vertex v is added to S exactly once, each edge in Adj[v] is examined in the for loop of lines 7-8 exactly once in the course of the algorithm. The adjacency lists contain |E| edges in total, so the for loop iterates |E| times overall, and there are at most |E| DECREASE-KEY operations.

The running time of Dijkstra's algorithm depends on the min-priority queue implementation. If we number the vertices 1 to |V|, we can store v.d in the vth entry of a simple array. Each INSERT and DECREASE-KEY operation then takes O(1) time, but each EXTRACT-MIN takes O(V) time (since we have to search the whole array), for a total time of O(V^2 + E) = O(V^2).

If the graph is sparse, we can improve on this by implementing the min-priority queue with a binary min-heap. Each of the |V| EXTRACT-MIN operations takes O(lg V) time. The time to build the heap is O(V). Each DECREASE-KEY takes O(lg V) time, and there are at most |E| of them.
So the total running time is O((V + E) lg V), which is O(E lg V) if all vertices are reachable from the source. This is an improvement over the O(V^2) array implementation if E = o(V^2/lg V).

If we implement the min-priority queue with a Fibonacci heap, the amortized cost of each of the |V| EXTRACT-MIN operations is O(lg V), each of the |V| INSERT operations takes O(1) amortized time, and each DECREASE-KEY call (of which there are at most |E|) takes O(1) amortized time, for a total running time of O(V lg V + E).
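To make the binary-heap variant concrete, here is a hedged Python sketch. Python's standard heapq module provides a binary min-heap but no DECREASE-KEY operation, so the sketch substitutes a common workaround (often called lazy deletion): whenever an estimate improves, a new (estimate, vertex) entry is pushed, and entries for vertices that have already been finalized are skipped when popped. This is a variant of the DECREASE-KEY formulation analyzed above, not the same algorithm. The heap can hold up to |E| + 1 entries, so each heap operation costs O(lg E), which is O(lg V) when there are no parallel edges. The function name, the adjacency-dictionary representation, and the example graph are assumptions of the example.

```python
import heapq
import math


def dijkstra_heap(adj, s):
    """adj maps every vertex to a list of (neighbor, weight) pairs, weights >= 0."""
    d = {v: math.inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    heap = [(0, s)]  # entries are (estimate, vertex) pairs
    done = set()     # plays the role of the set S
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue  # stale entry left over from an earlier, larger estimate
        done.add(u)
        for v, wt in adj[u]:
            if d[v] > du + wt:
                d[v] = du + wt
                pi[v] = u
                heapq.heappush(heap, (d[v], v))  # stands in for DECREASE-KEY
    return d, pi


# Hypothetical graph; vertex "q" is not reachable from "s".
adj = {
    "s": [("t", 10), ("y", 5)],
    "t": [("x", 1), ("y", 2)],
    "y": [("t", 3), ("x", 9), ("z", 2)],
    "x": [("z", 4)],
    "z": [("s", 7), ("x", 6)],
    "q": [("s", 1)],
}
d, pi = dijkstra_heap(adj, "s")
print(d)  # {'s': 0, 't': 8, 'y': 5, 'x': 9, 'z': 7, 'q': inf}
```

Only vertices reachable from the source ever enter the heap in this variant, so unreachable vertices simply keep the estimate infinity.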