Chapter 13 Red-Black Trees
Red-black trees are one scheme for ensuring
that binary search trees remain balanced, so
that their height is always O(lg n), where n
is the number of keys.
13.1 Properties of red-black trees
A red-black tree is a binary search tree with
one extra bit of storage per node: its color,
either RED or BLACK. By constraining the
coloring of nodes, red-black trees ensure that
any path from the root to a leaf is no more
than twice as long as any other such path, so
red-black trees are approximately balanced.
A binary search tree is a red-black tree if
it satisfies the red-black tree properties:
1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both of its children
are black.
5. For each node, all paths from that node to
descendant leaves contain the same number of
black nodes.
Figure 13.1(a), page 275, shows an example.
For convenience, we use a single sentinel,
nil[T], to represent NIL in the tree T. Its
color is BLACK. It stands in for all the
leaves and for the parent of the root. Figure
13.1(b), page 275, shows an example.
Since we are only interested in internal,
key-holding nodes, we omit the leaves from
drawings, as shown in Figure 13.1(c), page
275.
We define the black-height, bh(x), of a node
x as the number of black nodes on a path from
x to a leaf (but not counting x), which is
well defined by property 5. The black-height
of a tree is the black-height of its root.
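With black-height defined, the properties can
be checked recursively. The Python below is
only a sketch under assumptions made here, not
part of the notes: a node is a tuple (color,
key, left, right), None stands for a NIL leaf,
and the names color_of, black_height, and
is_red_black are illustrative. Property 1 holds
by construction, and the binary-search-tree
ordering is not checked.

```python
RED, BLACK = "RED", "BLACK"

def color_of(node):
    # Property 3: a NIL leaf (None in this sketch) is black.
    return BLACK if node is None else node[0]

def black_height(node):
    """Return bh(node), or None if property 4 or 5 fails in its subtree."""
    if node is None:
        return 0                      # a leaf has no nodes below it
    color, _key, left, right = node
    counts = []
    for child in (left, right):
        bh = black_height(child)
        if bh is None:
            return None
        if color == RED and color_of(child) == RED:
            return None               # property 4: red node with a red child
        # Black nodes on a path through this child, not counting `node` itself.
        counts.append(bh + (1 if color_of(child) == BLACK else 0))
    return counts[0] if counts[0] == counts[1] else None   # property 5

def is_red_black(root):
    # Property 2 (black root) plus properties 4 and 5 via black_height.
    return color_of(root) == BLACK and black_height(root) is not None

good = (BLACK, 2, (RED, 1, None, None), (RED, 3, None, None))
red_red = (BLACK, 3, (RED, 2, (RED, 1, None, None), None), None)
uneven = (BLACK, 2, (BLACK, 1, None, None), None)
```

Here good satisfies all the properties,
red_red breaks property 4, and uneven breaks
property 5 (one side has an extra black node).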
Lemma 13.1 A red-black tree with n internal
nodes has height at most 2lg(n + 1).
Proof: We first show that the subtree rooted
at a node x contains at least 2^bh(x) - 1
internal nodes by induction on the height of
x. If the height is 0, x must be a leaf and
the subtree rooted at x contains 2^bh(x) - 1 =
2^0 - 1 = 1 - 1 = 0 internal nodes.
If x has height > 0, each child has a
black-height of either bh(x) or bh(x) - 1,
depending on whether the child is red or
black. Since a child has height less than x,
we can apply the induction hypothesis: each
child has at least 2^(bh(x)-1) - 1 internal
nodes. So the subtree rooted at x contains at
least 2(2^(bh(x)-1)-1) + 1 = 2^bh(x) - 1
internal nodes, as desired.
Now let h be the height of the tree. By
property 4, at least half the nodes on any
simple path from the root to a leaf, not
including the root, must be black. So the
black-height of the root must be at least
h/2, and thus:
n >= 2^bh(root) - 1 >= 2^(h/2) - 1,
or: n + 1 >= 2^(h/2). Taking lg of each
side: lg(n + 1) >= h/2, or h <= 2lg(n + 1),
which is what we want, finishing the proof.
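To get a feel for the bound, it can be
evaluated for a few values of n. This is a
small illustrative sketch; the function names
height_bound and min_internal_nodes are
assumptions made here, not names from the
notes.

```python
import math

def height_bound(n):
    """Upper bound 2 lg(n + 1) of Lemma 13.1 for a tree with n internal nodes."""
    return 2 * math.log2(n + 1)

def min_internal_nodes(h):
    """The same inequality solved for n: a tree of height h has n >= 2^(h/2) - 1."""
    return 2 ** (h / 2) - 1

# Heights are integers, so the usable bound is the floor of 2 lg(n + 1).
bounds = {n: math.floor(height_bound(n)) for n in (1, 7, 1_000_000)}
```

For example, a red-black tree holding one
million keys has height at most 39, while an
unbalanced binary search tree on the same keys
could have height one million.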
Consequently, the dynamic set queries SEARCH,
MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR
will run in O(lg n) time since they run in
O(h) time on a search tree of height h, and
any red-black tree with n nodes is a search
tree of height O(lg n).
Note: TREE-INSERT and TREE-DELETE of Chapter
12 would also run in O(lg n) time, but they
would not necessarily preserve the red-black
tree properties. However, by being careful,
INSERT and DELETE _can_ be made to run in
O(lg n) while preserving the red-black tree
properties, as will be shown in Sections 13.3
and 13.4. This is done by performing
rotations and recoloring nodes, which maintain
the red-black tree properties and so keep the
tree balanced.
13.2 Rotations
A rotation is a local operation that preserves
the binary-search-tree property. Figure 13.2
shows left and right rotations. For a left
rotation on a node x, we assume that right[x]
= y is not nil[T]. The rotation "pivots"
counter-clockwise around the link from x to y,
making y the new root of the subtree and x its
left child. Here is the code for LEFT-ROTATE.
Figure 13.3, page 279, shows how it works.
LEFT-ROTATE(T,x)
y <- right[x]                 |> Set y
right[x] <- left[y]           |> Turn y's left subtree
if left[y] not = nil[T]       |>   into x's right subtree
   then p[left[y]] <- x
p[y] <- p[x]                  |> Link x's parent to y
if p[x] = nil[T]              |> x was the root
   then root[T] <- y
   else if x = left[p[x]]
           then left[p[x]] <- y
           else right[p[x]] <- y
left[y] <- x                  |> Put x on y's left
p[x] <- y
The code for RIGHT-ROTATE is symmetric; it and
LEFT-ROTATE only change a fixed number of
pointers and so they run in O(1) time.
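The two rotations can be sketched in Python as
follows. The Node and Tree classes, the
attribute names, and the inorder helper are
assumptions made for this illustration; T.nil
plays the role of the sentinel nil[T].

```python
class Node:
    def __init__(self, key, nil=None):
        self.key = key
        self.p = self.left = self.right = nil

class Tree:
    def __init__(self):
        self.nil = Node(None)         # sentinel playing the role of nil[T]
        self.root = self.nil

def left_rotate(T, x):
    y = x.right                       # assumed not to be T.nil
    x.right = y.left                  # y's left subtree becomes x's right subtree
    if y.left is not T.nil:
        y.left.p = x
    y.p = x.p                         # link x's parent to y
    if x.p is T.nil:
        T.root = y                    # x was the root
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    y.left = x                        # put x on y's left
    x.p = y

def right_rotate(T, y):
    x = y.left                        # assumed not to be T.nil
    y.left = x.right                  # x's right subtree becomes y's left subtree
    if x.right is not T.nil:
        x.right.p = y
    x.p = y.p                         # link y's parent to x
    if y.p is T.nil:
        T.root = x                    # y was the root
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    x.right = y                       # put y on x's right
    y.p = x

def inorder(T, x):
    return [] if x is T.nil else inorder(T, x.left) + [x.key] + inorder(T, x.right)

T = Tree()
a, b, c = (Node(k, T.nil) for k in (10, 20, 30))
T.root = a                            # chain 10 -> 20 -> 30 down the right spine
a.right, b.p = b, a
b.right, c.p = c, b

left_rotate(T, a)                     # 20 becomes the root, 10 its left child
root_after_left = T.root.key
keys_after_left = inorder(T, T.root)
right_rotate(T, b)                    # undoes the left rotation
```

The inorder key sequence is the same before
and after each rotation, which is what
preserving the binary-search-tree property
means here.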
13.3 Insertion
Insertion into an n-node red-black tree can
be done in O(lg n) time. We slightly modify
TREE-INSERT to insert the node, colored red,
then call RB-INSERT-FIXUP to re-establish the
red-black tree properties by recolorings and
rotations.
RB-INSERT(T,z)
 1  y <- nil[T]
 2  x <- root[T]
 3  while x not = nil[T]
 4     do y <- x
 5        if key[z] < key[x]
 6           then x <- left[x]
 7           else x <- right[x]
 8  p[z] <- y
 9  if y = nil[T]
10     then root[T] <- z
11     else if key[z] < key[y]
12             then left[y] <- z
13             else right[y] <- z
14  left[z] <- nil[T]
15  right[z] <- nil[T]
16  color[z] <- RED
17  RB-INSERT-FIXUP(T,z)
The 4 modifications are: 1) nil[T] replaces
NIL, 2) z's children are set to nil[T], 3) z
is colored RED, 4) RB-INSERT-FIXUP is called.
RB-INSERT-FIXUP(T,z)
 1  while color[p[z]] = RED
 2     do if p[z] = left[p[p[z]]]
 3           then y <- right[p[p[z]]]
 4                if color[y] = RED                   |> Case:
 5                   then color[p[z]] <- BLACK        |> 1
 6                        color[y] <- BLACK           |> 1
 7                        color[p[p[z]]] <- RED       |> 1
 8                        z <- p[p[z]]                |> 1
 9                   else if z = right[p[z]]
10                           then z <- p[z]           |> 2
11                                LEFT-ROTATE(T,z)    |> 2
12                        color[p[z]] <- BLACK        |> 3
13                        color[p[p[z]]] <- RED       |> 3
14                        RIGHT-ROTATE(T,p[p[z]])     |> 3
15           else (same as "then" clause with
                  "right" and "left" exchanged)
16  color[root[T]] <- BLACK
We examine the code in three major steps:
1) What violations of red-black tree
properties are introduced by RB-INSERT?
2) What is the goal of the while loop in
lines 1-15?
3) How do the three cases perform fix-up?
Figure 13.4, page 282, shows a sample fix-up.
1) Properties 1, 3, and 5 are still satisfied
but maybe not property 2 (root is BLACK) or
property 4 (RED node can't have RED child).
2) The while loop maintains the following
three-part invariant:
At the start of each iteration of the loop:
a. Node z is red.
b. If p[z] is the root, then p[z] is BLACK.
c. There is at most one violation of the
red-black properties -- either property 2
or 4. If 2 is violated, it is because z
is the root and is red. If 4 is violated
it is because both z and p[z] are red.
To check the invariant, we start with the
initialization and termination arguments. In
the proof of maintenance, we note that two
things can happen: z moves up the tree or some
rotations are done and the loop terminates.
Initialization:
a. When RB-INSERT-FIXUP is called, z is the
red node that was added.
b. If p[z] is the root, then p[z] started
out black and has not changed.
c. If there is a violation of property 2,
the red root must be the new node z, the
only internal node; and the parent and
both children are black (nil), so there is
no violation of property 4.
If 4 is violated, then since the children
of z are black and the tree had no other
violations before z was added, the only
violation is now: both z and p[z] are red.
Termination:
The loop terminates when p[z] becomes black.
(If z is the root, p[z] is nil and black.)
Thus there is no violation of property 4, so
the only violation can be of property 2,
which is fixed by line 16. So all red-black
properties hold when RB-INSERT-FIXUP ends.
Maintenance: There are 6 cases to consider in
the while loop, but 3 cases are symmetric to
the other 3, depending on whether z's parent
p[z] is a left or right child of p[p[z]],
which is determined at line 2. This is major
step 3) of our analysis. If p[z] is the
root, it is black. Since we only enter the
loop if p[z] is red, we know p[z] isn't the
root in that case, and so p[p[z]] exists.
We distinguish case 1 from cases 2 and 3 by
the color of z's parent's sibling, or "uncle".
Line 3 makes y point to z's uncle. Line 4
tests whether y is red; if so, case 1 is
executed, otherwise cases 2 and 3 are. In
each case, p[p[z]] is black, since p[z] is red
and property 4 is violated only between z and
p[z].
Case 1: z's uncle is red
Figure 13.5 shows case 1 (lines 5-8), which
is executed when both p[z] and y are red.
Since p[p[z]] is black, we can color both p[z]
and y black (fixing the problem of z and p[z]
both being red) and color p[p[z]] red, which
maintains property 5. The pointer z then
moves up two levels to p[p[z]], and the loop
repeats.
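The recoloring of case 1 can be tried on a
four-node fragment. This is only a sketch: the
dictionary of colors, the node names g, p, y,
z, and the helpers case_1 and blacks are
assumptions made for illustration.

```python
RED, BLACK = "RED", "BLACK"

def case_1(color, p, y, g):
    """Recolor as in lines 5-7 and return the new z of line 8 (the grandparent)."""
    color[p] = BLACK
    color[y] = BLACK
    color[g] = RED
    return g

def blacks(color, path):
    # Number of black nodes among the named nodes on a path.
    return sum(1 for name in path if color[name] == BLACK)

# g is black with red children p and y; z is a red child of p.
color = {"g": BLACK, "p": RED, "y": RED, "z": RED}
before = (blacks(color, ["g", "p", "z"]), blacks(color, ["g", "y"]))
new_z = case_1(color, "p", "y", "g")
after = (blacks(color, ["g", "p", "z"]), blacks(color, ["g", "y"]))
```

Both paths through the fragment carry one
black node before and after the recoloring, so
property 5 is untouched, and the only node
that may now have a red parent is the new z.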
Now we show that case 1 maintains the loop
invariant. We let z be the node of the
current iteration and z' = p[p[z]] be the
node at the beginning of the next iteration.
a. Since this iteration colors p[p[z]] red,
z' is red when the next iteration starts.
b. p[z'] is p[p[p[z]]] in this iteration, and
its color doesn't change. If p[z'] is the
root, it was black before this iteration
and remains black.
c. We have shown that case 1 maintains
property 5, and it doesn't cause violations
of properties 1 and 3.
If z' is the root at the start of the next
iteration, then case 1 just corrected the
lone violation of property 4. Since z' is
red and is the root, property 2 is the only
one violated and is due to z'.
If z' is not the root at the start of the
next iteration, then case 1 has not created
a violation of property 2. Case 1 fixed
the lone violation of property 4 that
existed at the start of this iteration. It
made z' red and left p[z'] alone. If p[z']
was black, there is no violation of 4. If
p[z'] was red, coloring z' red created one
violation of property 4 between z' & p[z'].
Figure 13.6, page 286, shows:
Case 2: z's uncle is black & z = a right child
Case 3: z's uncle is black & z = a left child
Cases 2 and 3 are distinguished by whether
z is a left or right child. In case 2, z is
a right child and we use a left rotation to
transform it into case 3, where z is a left
child. Since both z and p[z] are red, the
rotation doesn't affect the black-height of
nodes or property 5. In either case, z's
uncle is black, otherwise we would be in case
1. Also p[p[z]] exists, since it existed
before this iteration, and lines 10 and 11
move z up then down one level, so the
identity of p[p[z]] remains unchanged. In
case 3, we do some color changes and a
rotation, which preserves property 5, and
then we are done since there are no longer 2
red nodes in a row. The while loop is not
executed again since p[z] is now black.
Next we show that cases 2 and 3 maintain the
loop invariant.
a. Case 2 makes z point to p[z] which is red.
No other change to z occurs in cases 2 & 3.
b. Case 3 makes p[z] black, so if it is the
root at the start of the next iteration, it
is black.
c. As in case 1, cases 2 and 3 maintain
properties 1, 3, and 5.
Since z is not the root in cases 2 and 3,
we know property 2 isn't violated. Cases
2 and 3 don't cause a violation of property
2, since the only node made red becomes a
child of a black node by the rotation in
case 3.
Cases 2 and 3 correct the lone violation
of property 4 and they do not cause another
violation. This finishes the proof.
Since we have shown that each iteration of
the loop maintains the invariant, we have
shown that RB-INSERT-FIXUP correctly restores
red-black properties.
Analysis
Since the height of a red-black tree with n
nodes is O(lg n), lines 1-16 of RB-INSERT take
O(lg n) time. In RB-INSERT-FIXUP, the loop
repeats only in case 1, and then z moves up
the tree two levels, so the total number of
times the while loop can be executed is
O(lg n) also. Thus RB-INSERT takes a total of
O(lg n) time. Note: it never performs more
than two rotations since the loop terminates
if case 2 or case 3 is executed.
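Putting the pieces of this section together,
insertion can be sketched in Python as below.
This is an illustration under assumptions, not
the notes' own code: the classes, the single
rotate helper with a side argument (which
folds LEFT-ROTATE and RIGHT-ROTATE, and the
symmetric "else" clause, into one code path),
and the checking helpers height, black_count,
and inorder are all introduced here. Unlike
RB-INSERT, rb_insert takes a key and creates
the node itself.

```python
import math

RED, BLACK = "RED", "BLACK"

class Node:
    def __init__(self, key, color, nil=None):
        self.key, self.color = key, color
        self.p = self.left = self.right = nil

class Tree:
    def __init__(self):
        self.nil = Node(None, BLACK)            # black sentinel, as nil[T]
        self.root = self.nil

def rotate(T, x, side):
    # side="left" is LEFT-ROTATE(T, x); side="right" is RIGHT-ROTATE(T, x)
    other = "right" if side == "left" else "left"
    y = getattr(x, other)
    setattr(x, other, getattr(y, side))
    if getattr(y, side) is not T.nil:
        getattr(y, side).p = x
    y.p = x.p
    if x.p is T.nil:
        T.root = y
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    setattr(y, side, x)
    x.p = y

def rb_insert(T, key):
    z = Node(key, RED, T.nil)                   # lines 14-16: nil children, colored RED
    y, x = T.nil, T.root
    while x is not T.nil:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is T.nil:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z
    rb_insert_fixup(T, z)

def rb_insert_fixup(T, z):
    while z.p.color == RED:
        side = "left" if z.p is z.p.p.left else "right"
        other = "right" if side == "left" else "left"
        y = getattr(z.p.p, other)               # z's uncle
        if y.color == RED:                      # case 1: recolor, move z up
            z.p.color = y.color = BLACK
            z.p.p.color = RED
            z = z.p.p
        else:
            if z is getattr(z.p, other):        # case 2: rotate into case 3
                z = z.p
                rotate(T, z, side)
            z.p.color = BLACK                   # case 3: recolor and rotate
            z.p.p.color = RED
            rotate(T, z.p.p, other)
    T.root.color = BLACK

def height(T, x):
    return 0 if x is T.nil else 1 + max(height(T, x.left), height(T, x.right))

def black_count(T, x):
    # Black nodes on each path from x to a leaf, or -1 if property 4 or 5 fails.
    if x is T.nil:
        return 1
    l, r = black_count(T, x.left), black_count(T, x.right)
    bad = x.color == RED and RED in (x.left.color, x.right.color)
    return -1 if l < 0 or l != r or bad else l + (x.color == BLACK)

def inorder(T, x):
    return [] if x is T.nil else inorder(T, x.left) + [x.key] + inorder(T, x.right)

T = Tree()
for k in range(1, 16):                          # sorted input: the bad case for a plain BST
    rb_insert(T, k)
```

Inserting the keys 1 to 15 in sorted order
would give a plain binary search tree of
height 15; with the fix-up, the height stays
within the 2lg(n + 1) = 8 bound of Lemma 13.1.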
13.4 Deletion
Like insertion, deletion from an n-node
red-black tree takes O(lg n) time. It's a bit
more complicated than insertion. We slightly
modify TREE-DELETE to delete a node, and call
RB-DELETE-FIXUP to re-establish the red-black
tree properties, if needed.
RB-DELETE(T,z)
 1  if left[z] = nil[T] or right[z] = nil[T]
 2     then y <- z
 3     else y <- TREE-SUCCESSOR(z)
 4  if left[y] not = nil[T]
 5     then x <- left[y]
 6     else x <- right[y]
 7  p[x] <- p[y]
 8  if p[y] = nil[T]
 9     then root[T] <- x
10     else if y = left[p[y]]
11             then left[p[y]] <- x
12             else right[p[y]] <- x
13  if y not = z
14     then key[z] <- key[y]
15          copy y's satellite data into z
16  if color[y] = BLACK
17     then RB-DELETE-FIXUP(T,x)
18  return y
The 3 modifications are: 1) nil[T] replaces
NIL, 2) p[x] <- p[y] is done unconditionally,
and 3) RB-DELETE-FIXUP is called only if y is
black; if y is red the red-black properties
still hold when y is spliced out because:
- no black-heights have changed,
- no red nodes have been made adjacent, and
- since y could not have been the root if it
was red, the root remains black.
The node x is either y's sole child or nil[T]
if y had no child, so line 7 sets x's parent
to the node that was y's parent regardless.
RB-DELETE-FIXUP(T,x)
 1  while x not = root[T] and color[x] = BLACK
 2     do if x = left[p[x]]
 3           then w <- right[p[x]]
 4                if color[w] = RED                        |> Case:
 5                   then color[w] <- BLACK                |> 1
 6                        color[p[x]] <- RED               |> 1
 7                        LEFT-ROTATE(T,p[x])              |> 1
 8                        w <- right[p[x]]                 |> 1
 9                if color[left[w]] = BLACK and
                     color[right[w]] = BLACK
10                   then color[w] <- RED                  |> 2
11                        x <- p[x]                        |> 2
12                   else if color[right[w]] = BLACK
13                           then color[left[w]] <- BLACK  |> 3
14                                color[w] <- RED          |> 3
15                                RIGHT-ROTATE(T,w)        |> 3
16                                w <- right[p[x]]         |> 3
17                        color[w] <- color[p[x]]          |> 4
18                        color[p[x]] <- BLACK             |> 4
19                        color[right[w]] <- BLACK         |> 4
20                        LEFT-ROTATE(T,p[x])              |> 4
21                        x <- root[T]                     |> 4
22           else (same as "then" clause but with
                  "right" and "left" exchanged)
23  color[x] <- BLACK
If the spliced-out node y in RB-DELETE is
black, three problems may arise. First, if y
had been the root and a red child of y becomes
the new root, property 2 is violated. Second,
if both x and p[y] (= p[x] now) were red, we
have violated property 4. Third, y's removal
causes any path that contained y to have one
fewer black node, so property 5 is violated.
We can correct the third problem (we correct
the others in RB-DELETE-FIXUP also) by saying
that x has an "extra" black, thus restoring
property 5, i.e. we push y's blackness onto
its child x. But this violates property 1
since x is either "doubly black" or "red and
black". We don't have to record this in x's
color since we _always_ assume that x carries
an extra black. RB-DELETE-FIXUP eliminates
the need for this extra black before exiting.
RB-DELETE-FIXUP restores properties 1, 2, and
4. Exercises 13.4-1 and 13.4-2 show that it
restores 2 and 4; so we focus on property 1.
The goal of the while loop in lines 1-22 is to
move the extra black up the tree until:
1. x is red-and-black, in which case it is
colored (singly) black in line 23,
2. x is the root, in which case the extra
black can be simply "removed", or
3. we can do suitable rotations & recolorings.
Within the while loop, x always points to the
doubly black node. Line 2 determines whether
x is a left or right child (RB-DELETE-FIXUP
shows the code for a left child; the code for
a right child, line 22, is symmetric). We
maintain w as (a pointer to) the sibling of x.
Since x is doubly black, w cannot be nil[T] --
otherwise the number of blacks from p[x] on
paths through x and w would be different.
The 4 cases in the code are illustrated in
Figure 13.7, page 291. We must show property
5 is preserved in each case. So the key idea
is to show that the number of black nodes,
including x's extra black, in paths from the
root shown to the leaves is maintained by the
transformation in each case. For example:
In Figure 13.7(a) (case 1), the number of
black nodes from the root to subtrees alpha
and beta is 3, both before and after the
transformation. And the number of black nodes
from the root to gamma, delta, epsilon, and
zeta is 2 before and after transformation.
In order to cut down on the number of cases,
in Figure 13.7(b) (case 2), we let the color
of the root be c, which can be either red or
black, and we let count(c) denote the "black
count" of a color: count(RED) = 0, and
count(BLACK) = 1. In this case, the number of
black nodes from the root to alpha or beta is
2 + count(c) before and after transformation.
The same is true for the other subtrees.
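The count(c) bookkeeping for case 2 can be
checked mechanically. The sketch below assumes
the shape that case 2 describes: p[x] has color
c, x is black and carries the extra black, and
w is black with two black children. The helper
blacks_on_path and the variable names are
illustrative assumptions.

```python
RED, BLACK = "RED", "BLACK"

def count(c):
    return 1 if c == BLACK else 0

def blacks_on_path(colors, extra=0):
    """Black nodes on a path given as a list of colors, plus any extra blacks on it."""
    return sum(count(c) for c in colors) + extra

results = []
for c in (RED, BLACK):
    # Before case 2: path p[x], x (with its extra black) down to alpha,
    # and path p[x], w, left[w] down to gamma.
    alpha_before = blacks_on_path([c, BLACK], extra=1)
    gamma_before = blacks_on_path([c, BLACK, BLACK])
    # After case 2: the extra black now sits on p[x], and w has been colored red.
    alpha_after = blacks_on_path([c, BLACK], extra=1)
    gamma_after = blacks_on_path([c, RED, BLACK], extra=1)
    results.append((c, alpha_before, alpha_after, gamma_before, gamma_after))
```

For either choice of c, all four counts equal
2 + count(c), so the transformation leaves the
black counts unchanged.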
Now we analyze the cases.
Case 1: x's sibling w is red
This is the case in Figure 13.7(a) and lines
5-8. Since w must have black children, we can
switch the colors of w and p[x] and perform a
left-rotation on p[x] without violating the
red-black properties. The new sibling of x,
one of w's children before the rotation, is
black, so we have converted case 1 into case
2, 3, or 4. In cases 2, 3, and 4, w is black,
and the cases are distinguished by the colors
of w's children.
Case 2: x's sibling w is black, and both of
w's children are black
This is the case in Figure 13.7(b) and lines
10-11. Since w is black, we take one black
off both x and w, and put it on p[x], leaving
x with one black and w red; p[x] becomes the
new x. Note that if we enter case 2 through
case 1, the new x is red-and-black, since the
original p[x] was red. Its color attribute is
therefore RED, so the loop terminates, and the
new x is colored (singly) black by line 23.
Case 3: x's sibling w is black, and w's left
child is red and its right child is black
This is the case in Figure 13.7(c) and lines
13-16. We can switch the colors of w and
left[w] and then do a right rotation on w
without violating the red-black properties.
The new sibling w of x is now black with a
red right child, so we are in case 4.
Case 4: x's sibling w is black, and w's right
child is red
This is the case in Figure 13.7(d) and lines
17-21. By making some color changes and
doing a left rotation on p[x], we can remove
the extra black on x, making it singly black,
without violating red-black properties. Then
we set x to the root to exit the while loop.
Analysis
The running time of RB-DELETE without a call
to RB-DELETE-FIXUP is O(lg n) as we saw before
in analyzing ordinary binary search trees. In
RB-DELETE-FIXUP, cases 1, 3, and 4 each
terminate after a constant number of color
changes and at most three rotations. Case 2
is the only case in which the while loop can
repeat, and then the pointer x moves up the
tree at most O(h) = O(lg n) times and no
rotations are performed. Thus RB-DELETE-FIXUP
takes O(lg n) time and performs at most 3
rotations. So the overall time for RB-DELETE
is also O(lg n).
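Deletion as described in this section can be
sketched in Python as follows. This is an
illustration under assumptions rather than the
notes' own code: the classes, the build method
that constructs a tree from nested tuples
(color, key, left, right), the find helper, the
single rotate helper with a side argument
(covering both rotations and the symmetric
"else" clause), and the black_count checker
are all introduced here. Satellite data is
omitted.

```python
RED, BLACK = "RED", "BLACK"

class Node:
    def __init__(self, key=None, color=BLACK):
        self.key, self.color = key, color
        self.p = self.left = self.right = None

class Tree:
    def __init__(self, spec):
        self.nil = Node()                       # black sentinel, as nil[T]
        self.root = self.build(spec, self.nil)
    def build(self, spec, parent):
        if spec is None:                        # a leaf
            return self.nil
        node = Node(spec[1], spec[0])           # spec = (color, key, left, right)
        node.p = parent
        node.left, node.right = self.build(spec[2], node), self.build(spec[3], node)
        return node

def rotate(T, x, side):       # side="left": LEFT-ROTATE; side="right": RIGHT-ROTATE
    other = "right" if side == "left" else "left"
    y = getattr(x, other)
    setattr(x, other, getattr(y, side))
    if getattr(y, side) is not T.nil:
        getattr(y, side).p = x
    y.p = x.p
    if x.p is T.nil:
        T.root = y
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    setattr(y, side, x)
    x.p = y

def find(T, key):
    x = T.root
    while x is not T.nil and x.key != key:
        x = x.left if key < x.key else x.right
    return x

def rb_delete(T, z):
    y = z
    if z.left is not T.nil and z.right is not T.nil:
        y = z.right                             # TREE-SUCCESSOR(z) is the leftmost
        while y.left is not T.nil:              # node of z's right subtree
            y = y.left
    x = y.left if y.left is not T.nil else y.right
    x.p = y.p                                   # line 7: unconditional, x may be nil[T]
    if y.p is T.nil:
        T.root = x
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    if y is not z:
        z.key = y.key
    if y.color == BLACK:
        rb_delete_fixup(T, x)
    return y

def rb_delete_fixup(T, x):
    while x is not T.root and x.color == BLACK:
        side = "left" if x is x.p.left else "right"
        other = "right" if side == "left" else "left"
        w = getattr(x.p, other)                 # x's sibling
        if w.color == RED:                      # case 1
            w.color, x.p.color = BLACK, RED
            rotate(T, x.p, side)
            w = getattr(x.p, other)
        if getattr(w, side).color == BLACK and getattr(w, other).color == BLACK:
            w.color = RED                       # case 2: move the extra black up
            x = x.p
        else:
            if getattr(w, other).color == BLACK:            # case 3
                getattr(w, side).color, w.color = BLACK, RED
                rotate(T, w, other)
                w = getattr(x.p, other)
            w.color, x.p.color = x.p.color, BLACK           # case 4
            getattr(w, other).color = BLACK
            rotate(T, x.p, side)
            x = T.root
    x.color = BLACK

def black_count(T, x):        # blacks on each path to a leaf, or -1 on a violation
    if x is T.nil:
        return 1
    l, r = black_count(T, x.left), black_count(T, x.right)
    bad = x.color == RED and RED in (x.left.color, x.right.color)
    return -1 if l < 0 or l != r or bad else l + (x.color == BLACK)

leaf = lambda k: (BLACK, k, None, None)
T = Tree((BLACK, 4, (BLACK, 2, leaf(1), leaf(3)), (BLACK, 6, leaf(5), leaf(7))))
valid = []
for k in (1, 5, 4, 7):
    rb_delete(T, find(T, k))
    valid.append(T.root.color == BLACK and black_count(T, T.root) > 0)
```

Starting from an all-black tree on the keys 1
to 7, the deletions of 1 and 5 go through case
2, the deletion of 4 splices out its successor
and only recolors a red child, and the deletion
of 7 goes through the mirrored cases 3 and 4.
The tree is checked after every deletion.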