Chapter 23 Minimum Spanning Trees
Given a connected, undirected weighted graph
G = (V,E), we often want a least-cost subset
T of E that connects all vertices. Such a
subset T is acyclic and connects every vertex,
so it forms a tree, which we call a spanning
tree. The problem of finding T is the
minimum-spanning-tree (MST) problem.
Figure 23.1 (page 562) shows a minimum
spanning tree example:
             8           7
      (b)=========(c)=========(d)
     //|         // \\         |\\
 4 //  |       //2    \\       |  \\ 9
  //   |      //       \\      |   \\
(a)  11|    (i)        4 \\    |14  (e)
   \   |   /   \          \\   |   /
  8 \  | 7/     \6          \\ |  / 10
     \ | /       \           \\| /
      (h)=========(g)=========(f)
             1           2
(Double lines mark the edges of the tree.)
The total weight of the spanning tree is 37;
replacing (b,c) by (a,h) gives another minimum
spanning tree of weight 37.
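The arithmetic can be checked by writing the graph as an edge list and
summing the two trees. This is only a sketch: the weights are read off
Figure 23.1, and the names EDGES, TREE, ALT_TREE, weight and
is_spanning_tree are illustrative, not part of the text.

```python
# Edge weights read off Figure 23.1 (undirected; each edge listed once).
EDGES = {
    ("a", "b"): 4, ("a", "h"): 8, ("b", "c"): 8, ("b", "h"): 11,
    ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2, ("d", "e"): 9,
    ("d", "f"): 14, ("e", "f"): 10, ("f", "g"): 2, ("g", "h"): 1,
    ("g", "i"): 6, ("h", "i"): 7,
}

# The tree edges of the figure.
TREE = [("a", "b"), ("b", "c"), ("c", "i"), ("c", "f"),
        ("c", "d"), ("d", "e"), ("f", "g"), ("g", "h")]

def weight(edges):
    """Total weight of a list of edges."""
    return sum(EDGES[e] for e in edges)

def is_spanning_tree(edges):
    """True if `edges` connects all vertices using exactly |V| - 1 edges."""
    vertices = {v for e in EDGES for v in e}
    if len(edges) != len(vertices) - 1:
        return False
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = set(), [next(iter(vertices))]
    while stack:                      # depth-first search from any vertex
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u])
    return seen == vertices

# Swap (b,c) for (a,h): both edges weigh 8, so the total is unchanged.
ALT_TREE = [e for e in TREE if e != ("b", "c")] + [("a", "h")]

print(weight(TREE), is_spanning_tree(TREE))          # 37 True
print(weight(ALT_TREE), is_spanning_tree(ALT_TREE))  # 37 True
```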
We examine two "greedy" algorithms to find T:
Kruskal's and Prim's. Each can be made to run
in O(E lg V) time using binary heaps; by using
Fibonacci heaps, Prim's algorithm can be sped
up to O(E + V lg V).
Section 23.1 gives a generic MST algorithm;
Section 23.2 gives Kruskal's and Prim's
algorithms.
23.1 Growing a minimum spanning tree
The following generic algorithm grows the
tree maintaining the following loop invariant:
Prior to each iteration, A is a subset of
some minimum spanning tree.
At each step, we find a "safe edge" (u,v) --
one that can be added to A without violating
the invariant.
GENERIC-MST(G,w)
1  A <- phi
2  while A does not form a spanning tree
3      do find a safe edge (u,v) for A
4         A <- A Union {(u,v)}
5  return A
Initialization: After line 1, A trivially
satisfies the loop invariant.
Maintenance: The loop in lines 2-4 maintains
the invariant by only adding safe edges.
Termination: The loop exits only when A forms
a spanning tree. All edges added to A are in a
minimum spanning tree, so the set A returned
in line 5 must be a minimum spanning tree.
Theorem 23.1 tells us how to recognize safe
edges, but first we need some terminology.
A cut (S, V - S) of an undirected graph
G = (V,E) is a partition of V (see Figure 23.2
page 564). We say an edge crosses the cut if
one endpoint is in S and the other is in V-S.
We say a cut respects a set A of edges if no
edge in A crosses the cut. An edge is a light
edge crossing a cut if its weight is the
minimum of any edge crossing the cut. More
generally, we say an edge is a light edge for
a property if it has the minimum weight of any
edge satisfying that property.
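The three definitions translate directly into small predicates. The
sketch below is illustrative only: the function names crosses, respects
and light_edges are not from the text, and the four-vertex graph is a
made-up example.

```python
def crosses(edge, S):
    """An edge crosses the cut (S, V - S) if exactly one endpoint is in S."""
    u, v = edge
    return (u in S) != (v in S)

def respects(S, A):
    """The cut (S, V - S) respects A if no edge of A crosses it."""
    return not any(crosses(e, S) for e in A)

def light_edges(weights, S):
    """All minimum-weight edges crossing the cut (ties are possible)."""
    crossing = [e for e in weights if crosses(e, S)]
    if not crossing:
        return []
    least = min(weights[e] for e in crossing)
    return [e for e in crossing if weights[e] == least]

# Hypothetical example graph, not taken from the text.
weights = {("p", "q"): 3, ("q", "r"): 1, ("r", "s"): 5, ("p", "s"): 1}
S = {"p", "q"}
A = [("p", "q")]

print(respects(S, A))           # True: (p,q) lies entirely inside S
print(light_edges(weights, S))  # [('q', 'r'), ('p', 's')]
```

Note that a light edge need not be unique: here two crossing edges tie
at weight 1.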
Theorem 23.1
Let G = (V,E) be a connected, undirected graph
with weight function w. Let A be a subset of
E that is included in some MST for G, let
(S, V - S) be any cut that respects A, and let
(u,v) be a light edge crossing (S, V - S).
Then (u,v) is a safe edge for A.
Proof: Let T be a MST that includes A, and
assume that T does not contain (u,v), since if
it does, we are done. We shall construct
another MST T' that includes A Union {(u,v)},
thus showing that (u,v) is a safe edge for A.
The edge (u,v) forms a cycle with the path p
from u to v in T, as shown in Figure 23.3.
Since u and v are on opposite sides of the cut
(S, V - S), there is at least one edge (x,y)
in T on path p that also crosses the cut. Now
(x,y) is not in A since the cut respects A.
Also since (x,y) is on the unique path
from u to v in T, removing (x,y) breaks T into
two components. Adding (u,v) reconnects them
to form a new spanning tree:
T' = ( T - {(x,y)} ) Union {(u,v)}
We now show T' is a MST. Since (u,v) is a
light edge crossing (S, V - S) and (x,y) also
crosses this cut, w(u,v) <= w(x,y). Therefore
w(T') = w(T) - w(x,y) + w(u,v)
<= w(T).
But T is a MST, so w(T) <= w(T'), and so T'
must be a MST also.
It remains to show that (u,v) is actually a
safe edge for A. We know A is a subset of T'
since A is a subset of T and (x,y) is not in
A; thus A Union {(u,v)} is a subset of T'.
Since T' is a MST, (u,v) is safe for A.
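The exchange step of the proof can be tried on the graph of Figure 23.1.
The sketch below assumes the tree of that figure as T, the cut
S = {a, b}, and the light edge (a,h); the helper names tree_path and
exchange are illustrative, and W lists only the weights needed here.

```python
def tree_path(tree, u, v):
    """Edges on the unique path from u to v in a tree (given as edge list)."""
    adj = {}
    for x, y in tree:
        adj.setdefault(x, []).append(y)
        adj.setdefault(y, []).append(x)
    parent = {u: None}
    stack = [u]
    while stack:                       # depth-first search rooted at u
        x = stack.pop()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                stack.append(y)
    path = []
    while v != u:                      # walk back from v to u
        path.append((parent[v], v))
        v = parent[v]
    return path[::-1]

def exchange(tree, S, uv):
    """Return T' = (T - {(x,y)}) Union {(u,v)}, where (x,y) is an edge of
    the u-v path in T that crosses the cut (S, V - S)."""
    u, v = uv
    x, y = next(e for e in tree_path(tree, u, v)
                if (e[0] in S) != (e[1] in S))
    return [e for e in tree if set(e) != {x, y}] + [uv]

# Weights of the relevant edges, read off Figure 23.1.
W = {("a", "b"): 4, ("b", "c"): 8, ("c", "i"): 2, ("c", "f"): 4,
     ("c", "d"): 7, ("d", "e"): 9, ("f", "g"): 2, ("g", "h"): 1,
     ("a", "h"): 8}
T = [e for e in W if e != ("a", "h")]

T_prime = exchange(T, {"a", "b"}, ("a", "h"))
print(sum(W[e] for e in T), sum(W[e] for e in T_prime))  # 37 37
```

Here the path from a to h in T is a-b-c-f-g-h, its only edge crossing
the cut is (b,c), and w(a,h) = w(b,c) = 8, so w(T') = w(T).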
Corollary 23.2
Let G = (V,E) and w be as above. Let A be a
subset of E that is included in some MST for
G, and let C = (V_C, E_C) be a connected
component in the forest G_A = (V,A). If (u,v)
is a light edge connecting C to some other
component in G_A, then (u,v) is safe for A.
Proof: The cut (V_C, V - V_C) respects A, and
(u,v) is a light edge for this cut. Therefore
(u,v) is safe for A.
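Corollary 23.2 gives one concrete way to fill in line 3 of GENERIC-MST:
pick a component of G_A and take a light edge leaving it. The sketch
below is a deliberately simple, inefficient illustration under that
assumption; the name generic_mst and the relabelling scheme are not from
the text, and the graph is assumed connected. The weights are read off
Figure 23.1.

```python
def generic_mst(vertices, weights):
    """Grow A one safe edge at a time, choosing edges via Corollary 23.2."""
    A = []
    comp = {v: v for v in vertices}     # component label in G_A = (V, A)
    while len(A) < len(vertices) - 1:   # A does not yet span the graph
        # C is the component containing a fixed vertex; by the corollary
        # any other component would be an equally valid choice.
        label = comp[vertices[0]]
        # Edges with exactly one endpoint in C connect C to another
        # component; a minimum-weight one is a light edge, hence safe.
        leaving = [e for e in weights
                   if (comp[e[0]] == label) != (comp[e[1]] == label)]
        u, v = min(leaving, key=weights.get)
        A.append((u, v))
        # Merge the component at the far end of (u,v) into C.
        other = comp[v] if comp[u] == label else comp[u]
        for x in vertices:
            if comp[x] == other:
                comp[x] = label
    return A

EDGES = {
    ("a", "b"): 4, ("a", "h"): 8, ("b", "c"): 8, ("b", "h"): 11,
    ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2, ("d", "e"): 9,
    ("d", "f"): 14, ("e", "f"): 10, ("f", "g"): 2, ("g", "h"): 1,
    ("g", "i"): 6, ("h", "i"): 7,
}
VERTICES = sorted({x for e in EDGES for x in e})

A = generic_mst(VERTICES, EDGES)
print(len(A), sum(EDGES[e] for e in A))   # 8 37
```

Always growing the component of the same vertex happens to be the rule
Prim's algorithm uses; choosing the globally lightest edge between any
two components gives Kruskal's rule.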
23.2 The algorithms of Kruskal & Prim
Kruskal's algorithm
Kruskal's algorithm finds a safe edge to add
to the growing forest by finding an edge (u,v)
of least weight that connects two trees in the
forest. Let C1 and C2 be the two trees that
(u,v) connects. Since (u,v) must be a light
edge connecting C1 to some other tree,
Corollary 23.2 says that (u,v) is a safe edge
for A. Kruskal's algorithm is a greedy
algorithm, since it always adds an edge of
least possible weight.
It uses a disjoint-set data structure. Each
set contains the vertices in a tree of the
current forest. FIND-SET(u) returns a
representative element from the set containing
u. The UNION procedure combines trees. An
example is shown in Figure 23.4 (page 568).
MST-KRUSKAL(G,w)
1  A <- phi
2  for each vertex v in V[G]
3      do MAKE-SET(v)
4  sort the edges of E into nondecreasing order
   by weight
5  for each edge (u,v) in E, taken in order
6      do if FIND-SET(u) not = FIND-SET(v)
7            then A <- A Union {(u,v)}
8                 UNION(u,v)
9  return A
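MST-KRUSKAL can be sketched in Python as follows. The disjoint-set
forest here uses union by rank and path compression, which is an
assumption about the implementation the notes have in mind; the names
mst_kruskal, find_set and union mirror the pseudocode but are otherwise
illustrative. The weights are read off Figure 23.1.

```python
def mst_kruskal(vertices, weights):
    """Return the list of MST edges chosen by Kruskal's algorithm."""
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):
        root = x
        while parent[root] != root:     # first pass: locate the root
            root = parent[root]
        while parent[x] != root:        # second pass: compress the path
            parent[x], x = root, parent[x]
        return root

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        if rank[rx] < rank[ry]:         # attach the shallower tree
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    A = []
    for u, v in sorted(weights, key=weights.get):  # nondecreasing weight
        if find_set(u) != find_set(v):  # (u,v) joins two different trees
            A.append((u, v))
            union(u, v)
    return A

EDGES = {
    ("a", "b"): 4, ("a", "h"): 8, ("b", "c"): 8, ("b", "h"): 11,
    ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2, ("d", "e"): 9,
    ("d", "f"): 14, ("e", "f"): 10, ("f", "g"): 2, ("g", "h"): 1,
    ("g", "i"): 6, ("h", "i"): 7,
}
VERTICES = sorted({x for e in EDGES for x in e})

A = mst_kruskal(VERTICES, EDGES)
print(len(A), sum(EDGES[e] for e in A))   # 8 37
```

Ties between equal-weight edges are broken by sort order, so the edge
set returned may differ from the tree drawn in the figure while having
the same total weight.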
Kruskal's algorithm's running time
depends on the implementation of the disjoint-
set data structure. The implementation of
Section 21.3 is asymptotically the fastest one
known, so we assume it. Line 1 takes O(1)
time, and line 4 takes O(E lg E). The for
loop of lines 5-8 does O(E) FIND-SET and UNION
operations on the disjoint-set forest. Along
with the |V| MAKE-SET operations, these take a
total of O( (V + E) alpha(V) ) time (where
alpha is the very slowly growing function
defined in Section 21.4). Because we assume G
is connected, |E| >= |V| - 1, and so the
disjoint-set operations take O( E alpha(V) )
time. Moreover, since alpha(|V|) = O(lg V) =
O(lg E), the total running time is O(E lg E).
But since |E| < |V|^2, we have lg|E| = O(lg V)
and so we can restate the running time of
Kruskal's algorithm as O(E lg V).
Prim's algorithm
In Prim's algorithm, the set A forms a single
tree and the safe edge to add is a light edge
connecting the tree to a vertex not yet in the
tree. By Corollary 23.2, this rule only adds
edges that are safe for A, so when the
algorithm ends, the edges of A form a MST.
This strategy is greedy since each step adds
the edge that contributes the least possible
weight to the tree. Figure 23.5 (page 571)
illustrates the algorithm.
Prim's algorithm makes use of a min-priority
queue Q to hold all the vertices that are not
yet in the tree. For each vertex v, key[v] is
the minimum weight of any edge connecting v to
a vertex in the tree (key[v] = infinity if no
such edge exists). pi[v] is the parent of v
in the tree. The set A is kept implicitly as:
A = {(v,pi[v]) : v is in V - {r} - Q }
When the algorithm terminates, Q is empty and
the MST A for G is thus:
A = {(v,pi[v]) : v is in V - {r} }
MST-PRIM(G,w,r)
 1  for each u in V[G]
 2      do key[u] <- infinity
 3         pi[u] <- NIL
 4  key[r] <- 0
 5  Q <- V[G]
 6  while Q not empty
 7      do u <- EXTRACT-MIN(Q)
 8         for each v in Adj[u]
 9             do if v in Q and w(u,v) < key[v]
10                   then pi[v] <- u
11                        key[v] <- w(u,v)
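A Python sketch of MST-PRIM follows. Python's heapq module has no
DECREASE-KEY, so this version swaps in a different technique: it pushes
a fresh heap entry whenever key[v] drops and skips stale entries when
they are popped ("lazy deletion"). It also starts the heap with r alone
rather than all of V. The names mst_prim, adj and in_tree are
illustrative, and the weights are read off Figure 23.1.

```python
import heapq
import math

def mst_prim(adj, r):
    """adj maps each vertex to {neighbour: weight}; r is the root.
    Returns (pi, key) as in the pseudocode."""
    key = {u: math.inf for u in adj}
    pi = {u: None for u in adj}
    key[r] = 0
    in_tree = set()                     # plays the role of V - Q
    heap = [(0, r)]
    while heap:
        _, u = heapq.heappop(heap)      # EXTRACT-MIN
        if u in in_tree:                # stale entry: u was already added
            continue
        in_tree.add(u)
        for v, w in adj[u].items():
            if v not in in_tree and w < key[v]:
                pi[v] = u
                key[v] = w
                heapq.heappush(heap, (w, v))   # stands in for DECREASE-KEY
    return pi, key

EDGES = {
    ("a", "b"): 4, ("a", "h"): 8, ("b", "c"): 8, ("b", "h"): 11,
    ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2, ("d", "e"): 9,
    ("d", "f"): 14, ("e", "f"): 10, ("f", "g"): 2, ("g", "h"): 1,
    ("g", "i"): 6, ("h", "i"): 7,
}
ADJ = {}
for (u, v), w in EDGES.items():         # build symmetric adjacency lists
    ADJ.setdefault(u, {})[v] = w
    ADJ.setdefault(v, {})[u] = w

pi, key = mst_prim(ADJ, "a")
print(sum(key.values()))                # 37
```

With lazy deletion the heap can hold up to O(E) entries, so this sketch
runs in O(E lg E) = O(E lg V) time, matching the binary-heap bound but
not the Fibonacci-heap one.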
The algorithm maintains the three-part loop
invariant for the while loop of lines 6-11:
Prior to any iteration of lines 6-11,
1. A = {(v,pi[v]) : v is in V - {r} - Q }
2. The vertices currently in the MST are
those in V - Q
3. For all vertices v in Q, if pi[v] is not
NIL, then key[v] < infinity and key[v] is
the weight of a light edge (v,pi[v]) that
connects v to a vertex already in the MST
Line 7 identifies a vertex u in Q that is an
endpoint of a light edge crossing the cut
(V - Q, Q). Removing u from Q adds it to the
set V - Q of vertices in the tree, thus adding
(u,pi[u]) to A (except in the first iteration,
where u = r). The for loop of lines 8-11
updates the key and pi fields of each vertex v
adjacent to u but not in the tree, which
maintains Part 3.
If Q is implemented as a binary min-heap, we
can use BUILD-MIN-HEAP to do lines 1-5 in O(V)
time. The body of the while loop runs |V|
times, and each EXTRACT-MIN takes O(lg V)
time, for a total of O(V lg V). The for loop
of lines 8-11 runs O(E) times altogether,
since the adjacency lists have 2|E| entries in
total. The test for membership in Q in line 9
can be done in O(1) time by using a membership
bit. Line 11 involves an implicit DECREASE-KEY
operation, which is O(lg V); the total time is
O(V lg V + E lg V) = O(E lg V).
We can do better with Fibonacci heaps, where
DECREASE-KEY takes only O(1) amortized time.
Thus the total running time can be improved to
O(E + V lg V) by using a Fibonacci heap.