Chapter 24 Single-Source Shortest Paths
A motorist wants to find the shortest route
from Duluth to Boston, given a map of the U.S.
marked with distances between neighboring
towns. Abstractly, in a shortest-paths
problem, we are given a weighted, directed
graph G = (V,E), with weight function w:E -> R
mapping edges to real numbers. The weight of
a path p = <v_0, v_1, ..., v_k> is the sum of
its edge weights:

    w(p) = Sum_{i=1}^{k} w(v_(i-1), v_i)
The shortest-path weight from u to v is:

                 / min{w(p) : u ^-> v by a path p}  if there is a
    delta(u,v) = <                                  path from u to v
                 \ infinity                         otherwise
A shortest path from u to v is then defined as
any path p with weight w(p) = delta(u,v).
Edge weights can also be used to represent
time, cost, penalties, loss, etc.
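As a concrete illustration of the definition above, the path weight w(p) can be computed directly from a table of edge weights. A minimal Python sketch; the graph, the weight dict w, and the name path_weight are made up for illustration:

```python
# Edge weights stored as a dict keyed by (u, v) pairs; the
# vertices and weights here are hypothetical.
w = {('s', 'a'): 3, ('a', 'b'): -1, ('b', 't'): 4}

def path_weight(path, w):
    """w(p): the sum of w(v_(i-1), v_i) over the edges of p."""
    return sum(w[(u, v)] for u, v in zip(path, path[1:]))

print(path_weight(['s', 'a', 'b', 't'], w))  # 3 + (-1) + 4 = 6
```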
Variants
In this chapter, we focus on the single-source
shortest-paths problem: given graph G = (V,E),
we want to find a shortest path from a given
source vertex s to each vertex v in V. Many
other problems can be solved the same way:
24.0.2
Single-destination shortest-paths problem:
Find a shortest path to a given destination
vertex t from each vertex v. We can reduce
this problem to the single-source problem by
reversing the direction of the edges in E.
Single-pair shortest-path problem: For given
vertices u and v, find a shortest path from
u to v. Note: no algorithms are known for
this problem that run asymptotically faster
than the best single-source
algorithms in the worst case.
All-pairs shortest-paths problem: Find a
shortest path from u to v for every pair of
vertices u and v. This problem (discussed
in Chapter 25), interesting by itself, can
be solved faster than running the single-
source algorithm once for each vertex u.
Optimal substructure of a shortest path
Lemma 24.1
Subpaths of shortest paths are shortest paths.
Let p = <v_1, v_2, ..., v_k> be a shortest
path from v_1 to v_k and, for any i and j such
that 1 <= i < j <= k, let p_ij = <v_i, ..., v_j>
be the subpath of p from v_i to v_j. Then
p_ij is a shortest path from v_i to v_j.
Proof: By contradiction. Assume there
is a shorter path p_ij' from v_i to v_j. Then
following p from v_1 to v_i, then p_ij' from
v_i to v_j, then p from v_j to v_k would give
a shorter path from v_1 to v_k, contradicting
the fact that p is a shortest path from v_1
to v_k.
Negative-weight edges
In some cases, in the shortest-paths problem
there may be edges with negative weights. If
G = (V,E) contains no negative-weight cycles
reachable from s, then for all v in V, the
shortest-path weight delta(s,v) remains well-
defined even if it is negative. However, if
there is a negative-weight cycle reachable on
a path from s to v, we define delta(s,v) to be
-infinity. See Figure 24.1 on page 583.
Some shortest-path algorithms assume edge
weights are nonnegative; others allow negative
weights if there are no negative-weight cycles
reachable from s. Usually an algorithm can
detect and report negative-weight cycles.
Cycles
We just argued that a shortest path cannot
contain a negative-weight cycle. Also, a
shortest path cannot contain a positive-weight
cycle, since removing the cycle would give a
shorter path. We can also remove all 0-weight
cycles, since doing so does not affect the
path weight.
Thus we can assume that shortest paths have no
cycles, and therefore can contain at most |V|
distinct vertices and |V| - 1 edges.
Representing shortest paths
Given a vertex v in a graph G = (V,E), we
define its predecessor pi[v] as another vertex
or NIL. The shortest-path algorithms in this
chapter set pi so that the predecessor chain
starting at v runs backward to s along a
shortest path from s to v. Thus if pi[v] is
not NIL, PRINT-PATH(G,s,v) from Section 22.2
will print a shortest path from s to v.
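The predecessor chain can be followed in a few lines. PRINT-PATH itself is in Section 22.2; the helper below (path_to, taking pi as a Python dict) is a hypothetical variant that returns the path as a list instead of printing it:

```python
def path_to(pi, s, v):
    """Follow predecessor pointers back from v. Returns the list of
    vertices on the recorded path from s to v, or None if pi records
    no path to v."""
    if v == s:
        return [s]
    if pi.get(v) is None:
        return None
    prefix = path_to(pi, s, pi[v])
    return None if prefix is None else prefix + [v]

pi = {'a': 's', 'b': 'a'}       # made-up predecessor map
print(path_to(pi, 's', 'b'))    # ['s', 'a', 'b']
```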
During the run of a shortest-path algorithm,
the pi values need not indicate shortest paths
but they will at termination. As in BFS, we
are interested in the predecessor subgraph
G_pi = (V_pi,E_pi), where:
V_pi = {v in V : pi[v] not = NIL} Union {s}
and
E_pi = {(pi[v],v) in E : v is in V_pi - {s} }
It is proved for algorithms in this
chapter that G_pi is a "shortest-paths tree":
a rooted tree that has a shortest path from s
to every vertex reachable from s. G_pi is
like the breadth-first tree of Section 22.2,
but it contains shortest paths defined in
terms of weights instead of numbers of edges.
If no negative-weight cycles are reachable
from s, a shortest-paths tree rooted at s is
a directed subgraph G' = (V',E') such that:
1. V' is the subset of V of vertices reachable
from s in G,
2. G' forms a rooted tree with root s, and
3. for all v in V', the unique simple path
from s to v in G' is a shortest path from s
to v in G.
Shortest paths are not necessarily unique, and
neither are shortest-paths trees, as shown
in Figure 24.2 on page 585.
Relaxation
For each vertex v in V, we maintain a field
d[v], the shortest-path estimate, which is an
upper bound on the weight of a shortest path
from s to v. We can initialize shortest-path
estimates and predecessors in Theta(V) time by
INITIALIZE-SINGLE-SOURCE(G,s)
1 for each vertex v in V[G]
2 do d[v] <- infinity
3 pi[v] <- NIL
4 d[s] <- 0
Relaxation, the process of relaxing an edge
(u,v), consists of testing if we can improve
the shortest path to v found so far by going
through u, and if so, updating d[v] and pi[v].
Figure 24.3 (page 586) shows two examples of
relaxing an edge, one where the shortest-path
estimate decreases and one where it doesn't.
Here is the code to do relaxation on (u,v):
RELAX(u,v,w)
1 if d[v] > d[u] + w(u,v)
2 then d[v] <- d[u] + w(u,v)
3 pi[v] <- u
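Both routines translate almost line for line into Python. A minimal sketch, assuming d and pi are dicts and w is a dict keyed by (u, v) edges; these representations are choices of this sketch, not mandated by the text:

```python
INF = float('inf')

def initialize_single_source(vertices, s):
    """d[v] <- infinity and pi[v] <- NIL for every v; then d[s] <- 0."""
    d = {v: INF for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    return d, pi

def relax(u, v, w, d, pi):
    """Test whether edge (u, v) improves the estimate d[v]; if so,
    update d[v] and the predecessor pi[v]."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pi[v] = u
```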
Each algorithm in this chapter calls
INITIALIZE-SINGLE-SOURCE and then repeatedly
relaxes edges. In the Bellman-Ford algorithm,
each edge is relaxed many times; in the other
two algorithms each edge is relaxed once.
Properties of shortest paths and relaxation
Several properties of relaxation and shortest
paths are used to prove correctness of the
algorithms in this chapter. They are stated
here and proved in Section 24.5.
Triangle inequality (Lemma 24.10)
For any edge (u,v) in E, we have:
delta(s,v) <= delta(s,u) + w(u,v)
For all of the following, we assume that
INITIALIZE-SINGLE-SOURCE has been called and
that d[v] and pi[v] only change by relaxation.
Upper-bound property (Lemma 24.11)
We always have d[v] >= delta(s,v), and once
d[v] = delta(s,v) it never changes.
No-path property (Corollary 24.12)
If there is no path from s to v, then we
always have d[v] = delta(s,v) = infinity.
Convergence property (Lemma 24.14)
If s ^-> u --> v is a shortest path in G for
some u,v in V, and if d[u] = delta(s,u) at
any time prior to relaxing edge (u,v), then
d[v] = delta(s,v) at all times afterward.
Path-relaxation property (Lemma 24.15)
If p = <v_0, v_1, ..., v_k> is a shortest
path from v_0 to v_k, and the edges of p are
relaxed in the order (v_0,v_1), (v_1,v_2),
..., (v_k-1,v_k), then d[v_k] = delta(s,v_k)
and this property holds regardless of any
other relaxation steps that occur, even if
mixed with relaxations of the edges of p.
Predecessor-subgraph property (Lemma 24.17)
 Once d[v] = delta(s,v) for all v in V, the
 predecessor subgraph G_pi is a shortest-paths
 tree rooted at s.
Chapter outline
Section 24.1 presents the Bellman-Ford
algorithm, which can handle negative-weight
edges and can detect negative-weight cycles.
Section 24.2 gives a linear-time algorithm for
finding shortest paths from a single source in
a directed acyclic graph. Section 24.3 covers
Dijkstra's algorithm which runs faster than
the Bellman-Ford algorithm but requires
nonnegative weights. Section 24.4 shows how
the Bellman-Ford algorithm can solve a special
case of "linear programming". Finally,
Section 24.5 proves the properties of shortest
paths and relaxation listed above.
We use some conventions when doing arithmetic
with infinity. If "a" is not = -infinity,
a + infinity = infinity + a = infinity.
And if "a" is not = +infinity,
a + (-infinity) = (-infinity) + a = -infinity.
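IEEE-754 floating point follows the same conventions, which is one reason float('inf') serves as a convenient initial estimate in a Python sketch. Note that the conventions above leave infinity + (-infinity) undefined; in floating point it yields NaN:

```python
import math

inf = float('inf')
print(5 + inf)                   # inf
print(5 + (-inf))                # -inf
print(math.isnan(inf + (-inf)))  # True: undefined by the conventions
```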
All algorithms in this chapter assume that
the graph is stored in the adjacency-list
representation, and that the weight of each
edge is stored with it, so that we can find
the edge weights in O(1) time per edge.
24.1 The Bellman-Ford algorithm
The Bellman-Ford algorithm solves the single-
source shortest-paths problem even when the
edge weights may be negative. If there is a
negative-weight cycle reachable from the
source, the algorithm reports its existence;
otherwise it produces the shortest paths and
their weights.
Figure 24.4 (page 589) shows how it works.
BELLMAN-FORD(G,w,s)
1 INITIALIZE-SINGLE-SOURCE(G,s)
2 for i <- 1 to |V[G]| - 1
3 do for each edge (u,v) in E[G]
4 do RELAX(u,v,w)
5 for each edge (u,v) in E[G]
6 do if d[v] > d[u] + w(u,v)
7 then return FALSE
8 return TRUE
The Bellman-Ford algorithm runs in O(V E)
time, since Line 1 takes Theta(V), Lines 5-7
take O(E), and Lines 3-4 take Theta(E) time
and are executed |V| - 1 times.
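The pseudocode above can be sketched in Python as follows; the edge-list-plus-weight-dict representation is an assumption of this sketch:

```python
def bellman_ford(vertices, edges, w, s):
    """Return (no_negative_cycle, d, pi). edges is a list of (u, v)
    pairs, w maps each pair to its weight, s is the source."""
    d = {v: float('inf') for v in vertices}   # INITIALIZE-SINGLE-SOURCE
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):        # |V| - 1 passes
        for u, v in edges:                    # relax every edge
            if d[v] > d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
                pi[v] = u
    for u, v in edges:                        # a still-relaxable edge
        if d[v] > d[u] + w[(u, v)]:           # betrays a reachable
            return False, d, pi               # negative-weight cycle
    return True, d, pi
```

On a graph with a negative-weight cycle reachable from s, the final pass still finds a relaxable edge, so the function returns False.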
Lemma 24.2
Let G = (V,E) be a weighted, directed graph
with source s and weight function w, and
assume that G has no negative-weight cycles
reachable from s. Then, after the |V| - 1
iterations of lines 2-4, d[v] = delta(s,v) for
all vertices v that are reachable from s.
Proof: Let v be reachable from s, and let
p = <v_0, v_1, ..., v_k> be any acyclic
shortest path from v_0 = s to v_k = v. Path p
has at most |V| - 1 edges, so k <= |V| - 1.
Each iteration of the for loop relaxes all |E|
edges. In particular edge (v_(i-1), v_i) is
relaxed in iteration i for i = 1, 2, ..., k.
Then d[v] = d[v_k] = delta(s,v_k) = delta(s,v)
by the path-relaxation property.
Corollary 24.3
Let G = (V,E) be a weighted, directed graph
with source s and weight function w. Then for
each vertex v, there is a path from s to v if
and only if BELLMAN-FORD terminates with
d[v] < infinity when it is run on G.
Proof: Exercise 24.1-2
Theorem 24.4 (Correctness of Bellman-Ford)
If BELLMAN-FORD is run on a weighted, directed
graph G = (V,E) with source s and weight w,
and with no negative-weight cycle reachable
from s, then it returns TRUE, the predecessor
subgraph G_pi is a shortest-paths tree, and
d[v] = delta(s,v); otherwise it returns FALSE.
Proof: On pages 590-591.
24.2 Single-source shortest paths in
directed acyclic graphs
If we relax the edges of a weighted dag G =
(V,E) according to a topological sort of its
vertices, we can compute shortest paths from a
single source in O(V + E) time. Shortest
paths are well defined, since there are no
(negative weight) cycles.
DAG-SHORTEST-PATHS(G,w,s)
1 topologically sort the vertices of G
2 INITIALIZE-SINGLE-SOURCE(G,s)
3 for each vertex u, taken in topologically
sorted order
4 do for each vertex v in Adj[u]
5 do RELAX(u,v,w)
The topological sort of line 1 can be done in
Theta(V + E) time. Line 2 takes Theta(V)
time. Each of the edges is examined exactly
once in lines 3-5, for a total of Theta(E)
time, and so the total time is Theta(V + E).
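A Python sketch of the same procedure; the DFS-based topological sort is one standard choice, and the adjacency-dict representation is an assumption of this sketch:

```python
def dag_shortest_paths(adj, w, s):
    """adj maps every vertex to a list of its successors; w maps each
    (u, v) edge to its weight. O(V + E) overall."""
    order, seen = [], set()
    def visit(u):                       # topological sort by DFS:
        seen.add(u)                     # list vertices in reverse
        for v in adj[u]:                # order of finishing time
            if v not in seen:
                visit(v)
        order.append(u)
    for u in adj:
        if u not in seen:
            visit(u)
    order.reverse()
    d = {v: float('inf') for v in adj}  # INITIALIZE-SINGLE-SOURCE
    pi = {v: None for v in adj}
    d[s] = 0
    for u in order:                     # relax edges in topological order
        for v in adj[u]:
            if d[v] > d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
                pi[v] = u
    return d, pi
```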
Theorem 24.5
If a weighted, directed graph G = (V,E) has
source s and no cycles, at the termination of
DAG-SHORTEST-PATHS, d[v] = delta(s,v) for all
v in V, and G_pi is a shortest-paths tree.
Proof: First, d[v] = delta(s,v) for all v in
V at termination. If v isn't reachable from s
d[v] = delta(s,v) = inf. (no-path property).
If v is reachable from s, let p = 24.2.2
be a shortest path, where
v_0 = s and v_k = v. Since we processed the
vertices in topological order, the edges on p
are relaxed in the order (v_0,v_1), (v_1,v_2),
..., (v_(k-1),v_k), so by the path-relaxation
property, d[v_i] = delta(s,v_i) for i = 0, 1,
..., k. Finally, by the predecessor subgraph
property, G_pi is a shortest-paths tree.
An application of this algorithm arises in
determining critical paths in PERT (Program
Evaluation & Review Technique) chart analysis.
Edges represent jobs to perform and weights
represent the time to perform them. If (u,v)
enters vertex v, and (v,x) leaves v, the job
(u,v) must be performed before (v,x). A path
through this dag represents a sequence of jobs
that must be performed in a particular order.
A critical path is a longest path through the
dag, i.e. a longest time needed to perform a
sequence of jobs. The weight of a critical
path is a lower bound on the time to perform
all the jobs. We can find a critical path by:
- negating the edge weights and running
DAG-SHORTEST-PATHS, or
- running DAG-SHORTEST-PATHS after replacing
infinity by -infinity in INITIALIZE-SINGLE-
SOURCE, and ">" by "<" in RELAX.
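The second variant can be sketched directly: initialize estimates to -infinity and flip the comparison in RELAX. A minimal Python illustration, assuming the vertices are supplied already in topological order; the dag and its weights are made up:

```python
def longest_path_weights(order, adj, w, s):
    """order: the vertices of the dag in topological order. Returns
    the longest-path weight from s to each vertex, using RELAX with
    '>' replaced by '<' and infinity replaced by -infinity."""
    d = {v: float('-inf') for v in order}
    d[s] = 0
    for u in order:
        for v in adj[u]:
            if d[v] < d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
    return d

adj = {'s': ['a', 'b'], 'a': ['b'], 'b': []}          # toy PERT dag
w = {('s', 'a'): 2, ('s', 'b'): 6, ('a', 'b'): 3}
print(longest_path_weights(['s', 'a', 'b'], adj, w, 's')['b'])  # 6
```

Here the critical (longest) path to b has weight 6 via the direct edge, whereas the shortest path has weight 5 via a.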
24.3 Dijkstra's algorithm
Dijkstra's algorithm solves the single-source
shortest-paths problem on a weighted, directed
graph G = (V,E), and assumes that w(u,v) >= 0
for each edge in E. It maintains a set S of
vertices whose final shortest-path weights
have already been determined. The algorithm
repeatedly selects a vertex u in V - S with
the minimum shortest-path estimate, adds u to
S, and relaxes all edges leaving u. It uses a
min-priority queue Q of vertices, keyed by
their d values.
DIJKSTRA(G,w,s)
1 INITIALIZE-SINGLE-SOURCE(G,s)
2 S <- phi |> Initialize S to be empty
3 Q <- V[G]
4 while Q not = phi
5 do u <- EXTRACT-MIN(Q)
6 S <- S Union {u}
7 for each vertex v in Adj[u]
8 do RELAX(u,v,w)
Figure 24.6 (page 596) shows an example.
Note that vertices are never inserted into Q
after line 3, so that the while loop iterates
exactly |V| times. Since Dijkstra's algorithm
always chooses the "lightest" vertex in V - S
to add to S, we say it is a greedy algorithm.
Greedy algorithms do not always yield optimal
results in general, but the following theorem
and corollary show that Dijkstra's algorithm
does indeed compute shortest paths.
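A Python sketch using the standard-library heapq module. heapq provides no DECREASE-KEY, so this version pushes a fresh entry on every successful relaxation and skips stale entries when they are popped ("lazy deletion"), a common practical substitute:

```python
import heapq

def dijkstra(adj, w, s):
    """adj maps vertices to successor lists; w[(u, v)] >= 0 for every
    edge. Returns shortest-path estimates d and predecessors pi."""
    d = {v: float('inf') for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    q = [(0, s)]
    done = set()                       # the set S of finished vertices
    while q:
        _, u = heapq.heappop(q)        # EXTRACT-MIN
        if u in done:                  # stale entry: u already in S
            continue
        done.add(u)
        for v in adj[u]:               # relax every edge leaving u
            if d[v] > d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
                pi[v] = u
                heapq.heappush(q, (d[v], v))
    return d, pi
```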
Theorem 24.6
(Correctness of Dijkstra's algorithm)
Dijkstra's algorithm, run on a weighted,
directed graph G = (V,E) with nonnegative
weight function w and source s, terminates
with d[u] = delta(s,u) for all vertices u in V.
Proof: We use the following loop invariant:
At the start of each iteration of the while
loop, d[v] = delta(s,v) for all v in S.
It suffices to show that d[u] = delta(s,u) at
the time u is added to S since, by the
upper-bound property, the equality holds
thereafter.
Initialization: Initially, S = phi so the
invariant is certainly true.
Maintenance: We wish to show d[u] = delta(s,u)
for the vertex added to S. For the purpose
of contradiction, let u be the first vertex
for which d[u] not = delta(s,u) when it is
added to S. We show that d[u] = delta(s,u)
actually holds, obtaining the contradiction.
We must have u not = s since s is the first
vertex added and d[s] = delta(s,s) = 0.
Since u is not s, S is not empty just before
u is added to S. There must be some path
from s to u, otherwise d[u] = delta(s,u) =
infinity by the no-path property, which
contradicts d[u] not = delta(s,u).
Since there is at least one path,
there is a shortest path p from s to u.
Prior to adding u to S, p connects a vertex
in S, namely s, to a vertex in V - S, namely
u. Let y be the first vertex along p such
that y is in V - S, and let x in S be y's
predecessor. Thus, as shown in Figure 24.7
(page 597), p can be decomposed as:
p1 p2
s ^-> x --> y ^-> u. We claim that d[y] =
delta(s,y) when u is added to S. Note that
x is in S. Then because u was the first
vertex with d[u] not = delta(s,u) when it
was added to S, we had d[x] = delta(s,x)
when x was added to S. Edge (x,y) was
relaxed at that time so the claim follows
from the convergence property.
We can now get a contradiction to prove that
d[u] = delta(s,u). Because y occurs before
u on a shortest path from s to u and weights
are nonnegative, delta(s,y) <= delta(s,u), and
    d[y]  = delta(s,y)
         <= delta(s,u)                     (*)
         <= d[u]  (by the upper-bound property)
Since both u and y were in V - S when u was
chosen in line 5, we have d[u] <= d[y] also,
since d[u] was a minimum. So the
inequalities in (*) are equalities:
d[y] = delta(s,y) = delta(s,u) = d[u]
In particular delta(s,u) = d[u],
which contradicts our choice of u. Thus we
conclude that delta(s,u) = d[u] when u was
added to S, and that this equality is
maintained at all later times too.
Termination: At termination, Q is phi, which
together with Q = V - S, implies that S = V.
Thus d[u] = delta(s,u) for all u in V.
Corollary 24.7
If we run Dijkstra's algorithm on a weighted,
directed graph G, then at termination, the
predecessor subgraph is a shortest-paths tree.
Proof: Immediate from Theorem 24.6 and the
predecessor-subgraph property.
Analysis
Note that Dijkstra's algorithm uses the
min-priority queue operations INSERT (implicit
in line 3), EXTRACT-MIN in line 5, and
DECREASE-KEY (implicit in RELAX). Since each
vertex v is added to S once, each edge in
Adj[v] is examined in the for loop once in the
course of the algorithm, so there are a total
of |E| iterations of the for loop, and thus a
total of at most |E| DECREASE-KEY operations.
The run time of Dijkstra's algorithm depends
on the min-priority queue implementation. If
we number the vertices 1 to |V|, we can store
d[v] in a simple array, so each INSERT and
DECREASE-KEY operation takes O(1) time, but
EXTRACT-MIN takes O(V) time (since we have to
search the whole array), for a total time of
O(V^2 + E) = O(V^2).
If the graph is sparse, we can implement the
min-priority queue with a binary min-heap.
Each of the |V| EXTRACT-MIN operations takes
O(lg V) time. The time to build the heap is
O(V). Each DECREASE-KEY takes O(lg V), and
there are at most |E| of them. So the total
running time is O((V + E)lg V), which is
O(E lg V) if all vertices are reachable from
the source: an improvement if E = o(V^2/lg V).
If we implement the min-priority queue with a
Fibonacci heap, each of the |V| EXTRACT-MIN
operations takes O(lg V) amortized time, and
each DECREASE-KEY call (of which there are at most
|E|) takes O(1) amortized time, for a total of
O(V lg V + E) (since the amortized cost of the
|V| INSERT operations is O(1) ).