Chapter 13 Red-Black Trees
Red-black trees are one scheme for ensuring
that binary search trees remain balanced, so
that their height never gets larger than
O(lg n), where n is the number of keys.
13.1 Properties of red-black trees
A red-black tree is a binary search tree with
an extra bit of data, its color: RED or BLACK.
By constraining the coloring of nodes,
red-black trees ensure that any path from the
root to a leaf is no more than twice as long
as any other such path, so red-black trees are
approximately balanced.
A binary search tree is a red-black tree if
it satisfies the red-black tree properties:
1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both of its children
are black.
5. For each node, all paths from that node to
descendant leaves contain the same number of
black nodes.
Figure 13.1(a) shows an example.
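The five properties can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes a node is a tuple (color, key, left, right) with None standing for a black NIL leaf, and the names black_height and is_red_black are illustrative. It checks only the color properties (2, 4, and 5), not the binary-search-tree ordering.

```python
RED, BLACK = "RED", "BLACK"

def black_height(node):
    """Return the number of black nodes below `node` on any path to a leaf.

    Raises ValueError if property 4 or property 5 fails in this subtree.
    A node is (color, key, left, right); None is a black NIL leaf (property 3).
    """
    if node is None:
        return 0
    color, _key, left, right = node
    counts = []
    for child in (left, right):
        child_is_black = child is None or child[0] == BLACK
        if color == RED and not child_is_black:
            raise ValueError("property 4: red node with a red child")
        # Black nodes on paths through this child, counting the child itself.
        counts.append(black_height(child) + (1 if child_is_black else 0))
    if counts[0] != counts[1]:
        raise ValueError("property 5: unequal black counts")
    return counts[0]

def is_red_black(root):
    """Check properties 2, 4, and 5 (1 and 3 hold by construction here)."""
    if root is not None and root[0] != BLACK:   # property 2
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True

# A small valid tree: black root 7, red 3 with black children, black 11 with red child 13.
good = (BLACK, 7,
        (RED, 3, (BLACK, 1, None, None), (BLACK, 5, None, None)),
        (BLACK, 11, None, (RED, 13, None, None)))
# An invalid tree: red node 3 has red child 1.
bad = (BLACK, 7, (RED, 3, (RED, 1, None, None), None), None)

print(is_red_black(good), is_red_black(bad))
```

Every path from the root of the valid tree passes two black nodes below the root (counting the NIL leaf), so the helper reports 2 for it.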
For convenience, we use a single sentinel,
T.nil, to represent NIL in the tree T. Its
color is BLACK. It represents all the leaves
and the parent of the root. Figure 13.1(b)
shows an example.
Since we are only interested in internal,
key-holding nodes, we omit drawing the leaves,
as shown in Figure 13.1(c).
We define the black-height, bh(x), of a node
x as the number of black nodes on a path from
x to a leaf (but not counting x), which is
well defined by property 5. The black-height
of a tree is the black-height of its root.
Lemma 13.1 A red-black tree with n internal
nodes has height at most 2lg(n + 1).
Proof: We first show that the subtree rooted
at a node x contains at least 2^bh(x) - 1
internal nodes by induction on the height of
x. If the height is 0, x must be a leaf and
the subtree rooted at x contains 2^bh(x) - 1 =
2^0 - 1 = 1 - 1 = 0 internal nodes.
If x has height > 0, each child has black-height
either bh(x) or bh(x) - 1, and since
a child has height less than x, we can apply
the induction hypothesis: each child has at
least 2^(bh(x)-1) - 1 internal nodes. So the
subtree rooted at x contains at least
2(2^(bh(x)-1)-1) + 1 = 2^bh(x) - 1 internal
nodes, as desired.
Now let h be the height of the tree. By
property 4, at least half the nodes must be
black on any simple path from the root to a
leaf, not including the root. So the black-height
of the root must be at least h/2, thus:
n >= 2^bh(root) - 1 >= 2^(h/2) - 1,
or: n + 1 >= 2^(h/2), and taking lg of each
side: lg(n + 1) >= h/2, or h <= 2lg(n + 1),
which is what we want, finishing the proof.
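To get a feel for the bound in Lemma 13.1, the short sketch below evaluates 2lg(n + 1) for a few values of n. The function name max_rb_height is an illustrative assumption, not something from the notes. Because the height is an integer, the bound is rounded down.

```python
import math

def max_rb_height(n):
    """Largest integer height permitted by Lemma 13.1 for n internal nodes."""
    return math.floor(2 * math.log2(n + 1))

for n in (1, 7, 1000, 10**6):
    print(n, max_rb_height(n))
```

For example, a red-black tree holding a million keys has height at most 39, since 2lg(1,000,001) is a little under 40.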
Consequently, the dynamic set queries SEARCH,
MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR
will run in O(lg n) time since they run in
O(h) time on a search tree of height h, and
any red-black tree with n nodes is a search
tree of height O(lg n).
Note: TREE-INSERT and TREE-DELETE of Chapter
12 would also run in O(lg n) time, but they
would not necessarily preserve the red-black
tree properties. However, by being careful,
INSERT and DELETE _can_ be made to run in
O(lg n) while preserving the red-black tree
properties, as will be shown in Sections 13.3
and 13.4. This is done by performing
rotations and recoloring nodes, which maintain
the red-black tree properties and so keep the
tree balanced.
13.2 Rotations
A rotation is a local operation that preserves
the binary-search-tree property. Figure 13.2
shows left and right rotations. For a left
rotation on a node x, we assume that x.right
= y is not T.nil. The rotation "pivots"
counter-clockwise around the link from x to y,
making y the new root of the subtree and x its
left child. Here is the code for LEFT-ROTATE.
Figure 13.3 shows how it works.
LEFT-ROTATE(T,x)
1  y = x.right            // Set y
2  x.right = y.left       // Turn y's left subtree
3  if y.left != T.nil     //   into x's right subtree
4      y.left.p = x
5  y.p = x.p              // Link x's parent to y
6  if x.p == T.nil        // x was the root
7      T.root = y
8  else if x == x.p.left
9      x.p.left = y
10 else x.p.right = y
11 y.left = x             // Put x on y's left
12 x.p = y
The code for RIGHT-ROTATE is symmetric; it and
LEFT-ROTATE only change a fixed number of
pointers and so they run in O(1) time.
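The rotations translate almost line for line into Python. The sketch below is an illustration under assumptions: the Node and Tree classes and the Tree.node helper are hypothetical scaffolding, and colors are omitted because rotations do not look at them. It rotates left and then right around the same link and confirms that the inorder key sequence, and hence the binary-search-tree property, is unchanged.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None

class Tree:
    def __init__(self):
        self.nil = Node(None)          # sentinel T.nil
        self.root = self.nil

    def node(self, key):
        """Create a node whose children and parent are all T.nil."""
        n = Node(key)
        n.left = n.right = n.p = self.nil
        return n

def left_rotate(T, x):
    y = x.right                        # set y
    x.right = y.left                   # turn y's left subtree into x's right subtree
    if y.left is not T.nil:
        y.left.p = x
    y.p = x.p                          # link x's parent to y
    if x.p is T.nil:
        T.root = y
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    y.left = x                         # put x on y's left
    x.p = y

def right_rotate(T, y):
    x = y.left                         # mirror image of left_rotate
    y.left = x.right
    if x.right is not T.nil:
        x.right.p = y
    x.p = y.p
    if y.p is T.nil:
        T.root = x
    elif y is y.p.right:
        y.p.right = x
    else:
        y.p.left = x
    x.right = y
    y.p = x

def inorder(T, x):
    return [] if x is T.nil else inorder(T, x.left) + [x.key] + inorder(T, x.right)

# Build x = 10 with left child 5 and right child y = 20, whose children are 15 and 25.
T = Tree()
a, x, b, y, c = (T.node(k) for k in (5, 10, 15, 20, 25))
T.root = x
x.left, x.right, a.p, y.p = a, y, x, x
y.left, y.right, b.p, c.p = b, c, y, y

before = inorder(T, T.root)
left_rotate(T, x)
root_after_left, after_left = T.root.key, inorder(T, T.root)
right_rotate(T, y)
root_after_right, after_right = T.root.key, inorder(T, T.root)
print(before, root_after_left, root_after_right)
```

After the left rotation y (key 20) is the subtree root with x as its left child; the right rotation on y undoes it.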
13.3 Insertion
Insertion into an n-node red-black tree can
be done in O(lg n) time. We slightly modify
TREE-INSERT to insert the node, colored red,
then call RB-INSERT-FIXUP to re-establish the
red-black tree properties by recolorings and
rotations.
RB-INSERT(T,z)
1  y = T.nil
2  x = T.root
3  while x != T.nil
4      y = x
5      if z.key < x.key
6          x = x.left
7      else x = x.right
8  z.p = y
9  if y == T.nil
10     T.root = z
11 else if z.key < y.key
12     y.left = z
13 else y.right = z
14 z.left = T.nil
15 z.right = T.nil
16 z.color = RED
17 RB-INSERT-FIXUP(T,z)
The four modifications to TREE-INSERT are:
1) T.nil replaces NIL, 2) z's children are set
to T.nil, 3) z is colored RED, and
4) RB-INSERT-FIXUP is called.
RB-INSERT-FIXUP(T,z)
1  while z.p.color == RED
2      if z.p == z.p.p.left
3          y = z.p.p.right
4          if y.color == RED                // Case:
5              z.p.color = BLACK            // 1
6              y.color = BLACK              // 1
7              z.p.p.color = RED            // 1
8              z = z.p.p                    // 1
9          else if z == z.p.right
10                 z = z.p                  // 2
11                 LEFT-ROTATE(T,z)         // 2
12             z.p.color = BLACK            // 3
13             z.p.p.color = RED            // 3
14             RIGHT-ROTATE(T,z.p.p)        // 3
15     else (same as "if" clause with
           "right" and "left" exchanged)
16 T.root.color = BLACK
We examine the code in three major steps:
1) What violations of red-black tree
properties are introduced by RB-INSERT?
2) What is the goal of the while loop in
lines 1-15?
3) How do the three cases perform fix-up?
Figure 13.4 shows a sample fix-up.
1) Properties 1, 3, and 5 are still satisfied
but maybe not property 2 (root is BLACK) or
property 4 (RED node can't have RED child).
2) The while loop maintains the
following three-part invariant:
At the start of each iteration of the loop:
a. Node z is red.
b. If z.p is the root, then z.p is BLACK.
c. There is at most one violation of the
red-black properties -- either property 2
or 4. If 2 is violated, it is because z
is the root and is red. If 4 is violated
it is because both z and z.p are red.
To check the invariant, we start with the
initialization and termination arguments. In
the proof of maintenance, we note that two
things can happen: z moves up the tree or some
rotations are done and the loop terminates.
Initialization:
a. When RB-INSERT-FIXUP is called, z is the
red node that was added.
b. If z.p is the root, then z.p started
out black and has not changed.
c. If there is a violation of property 2,
the red root must be the new node z, the
only internal node; and the parent and
both children are black (nil), so there is
no violation of property 4.
If 4 is violated, then since the children
of z are black and the tree had no other
violations before z was added, the only
violation is now: both z and z.p are red.
Termination:
The loop terminates when z.p becomes black.
(If z is the root, z.p is T.nil, which is black.)
Thus there is no violation of property 4, so
the only violation can be of property 2,
which is fixed by line 16. So all red-black
properties hold when RB-INSERT-FIXUP ends.
Maintenance: There are 6 cases to consider in
the while loop, but 3 cases are symmetric to
the other 3, depending on whether z's parent
z.p is a left or right child of z.p.p,
which is determined at line 2. This is major
step 3) of our analysis. If z.p is the
root, it is black. Since we only enter the
loop if z.p is red, we know z.p isn't the
root in that case, and so z.p.p exists.
We distinguish case 1 from 2 and 3 by the
color of z's parent's sibling or "uncle".
Line 3 makes y point to z's uncle. Line 4
tests whether y is red; if so, we execute
case 1, otherwise cases 2 and 3. In each case,
z.p.p is black, since z.p is red and property
4 is only violated between z and z.p.
Case 1: z's uncle is red
Figure 13.5 shows case 1 (lines 5-8), which
is done when both z.p and y are red. Since
z.p.p is black, we can color both z.p and y
black (fixing the problem of z and z.p being
red) and color z.p.p red, and so maintain
property 5. The pointer z moves up two levels
to z.p.p, and the loop is repeated.
Now we show that case 1 maintains the loop
invariant. We let z be the node of the
current iteration and z' = z.p.p be the
node at the beginning of the next iteration.
a. Since this iteration colors z.p.p red,
z' is red when the next iteration starts.
b. z'.p is z.p.p.p in this iteration, and
its color doesn't change. If z'.p is the
root, it was black before this iteration
and remains black.
c. We have shown that case 1 maintains
property 5, and it doesn't cause violations
of properties 1 and 3.
If z' is the root at the start of the next
iteration, then case 1 just corrected the
lone violation of property 4. Since z' is
red and is the root, property 2 is the only
one violated and is due to z'.
If z' is not the root at the start of the
next iteration, then case 1 has not created
a violation of property 2. Case 1 fixed
the lone violation of property 4 that
existed at the start of this iteration. It
made z' red and left z'.p alone. If z'.p
was black, there is no violation of 4. If
z'.p was red, coloring z' red created one
violation of property 4 between z' and z'.p.
Figure 13.6 shows:
Case 2: z's uncle is black and z is a right child
Case 3: z's uncle is black and z is a left child
Cases 2 and 3 are distinguished by whether
z is a left or right child. In case 2, z is
a right child and we use a left rotation to
transform it into case 3, where z is a left
child. Since both z and z.p are red, the
rotation doesn't affect the black-height of
nodes or property 5. In either case, z's
uncle is black, otherwise we would be in case
1. Also z.p.p exists, since it existed
before this iteration, and lines 10 and 11
move z up then down one level, so the
identity of z.p.p remains unchanged. In
case 3, we do some color changes and a
rotation, which preserves property 5, and
then we are done since there are no longer 2
red nodes in a row. The while loop is not
executed again since z.p is now black.
Next we show that cases 2 and 3 maintain the
loop invariant.
a. Case 2 makes z point to z.p, which is red.
No other change to z occurs in cases 2 and 3.
b. Case 3 makes z.p black, so if it is the
root at the start of the next iteration, it
is black.
c. As in case 1, cases 2 and 3 maintain
properties 1, 3, and 5.
Since z is not the root in cases 2 and 3,
we know property 2 isn't violated. Cases
2 and 3 don't cause a violation of property
2, since the only node made red becomes a
child of a black node by the rotation in
case 3.
Cases 2 and 3 correct the lone violation
of property 4 and they do not cause another
violation. This finishes the proof.
Since we have shown that each iteration of
the loop maintains the invariant, we have
shown that RB-INSERT-FIXUP correctly restores
red-black properties.
Analysis
Since the height of a red-black tree with n
nodes is O(lg n), lines 1-16 of RB-INSERT take
O(lg n) time. In RB-INSERT-FIXUP, the loop
repeats only in case 1, and then z moves up
the tree two levels, so the total number of
times the while loop can be executed is
O(lg n) also. Thus RB-INSERT takes a total of
O(lg n) time. Note: it never performs more
than two rotations since the loop terminates
if case 2 or case 3 is executed.
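The insertion procedure can be exercised end to end with a short Python sketch. This is an illustration under assumptions rather than the notes' own code: the Node and Tree classes are hypothetical scaffolding, rb_insert takes a key instead of a prebuilt node, and a single rotate(T, x, side) helper stands in for both LEFT-ROTATE and RIGHT-ROTATE so that the symmetric "else" branch of RB-INSERT-FIXUP does not have to be written twice. The demo inserts 1000 keys in sorted order, the worst case for an unbalanced search tree, and compares the resulting height with the bound of Lemma 13.1.

```python
import math

RED, BLACK = "RED", "BLACK"

class Node:
    def __init__(self, key, color, nil=None):
        self.key, self.color = key, color
        self.left = self.right = self.p = nil

class Tree:
    def __init__(self):
        self.nil = Node(None, BLACK)      # sentinel T.nil, always black
        self.root = self.nil

def rotate(T, x, side):
    """side="left" is LEFT-ROTATE(T, x); side="right" is RIGHT-ROTATE(T, x)."""
    other = "right" if side == "left" else "left"
    y = getattr(x, other)
    setattr(x, other, getattr(y, side))
    if getattr(y, side) is not T.nil:
        getattr(y, side).p = x
    y.p = x.p
    if x.p is T.nil:
        T.root = y
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    setattr(y, side, x)
    x.p = y

def rb_insert_fixup(T, z):
    while z.p.color == RED:
        side = "left" if z.p is z.p.p.left else "right"
        other = "right" if side == "left" else "left"
        y = getattr(z.p.p, other)         # z's uncle
        if y.color == RED:                # case 1: recolor and move z up two levels
            z.p.color = y.color = BLACK
            z.p.p.color = RED
            z = z.p.p
        else:
            if z is getattr(z.p, other):  # case 2: rotate into case 3
                z = z.p
                rotate(T, z, side)
            z.p.color = BLACK             # case 3: recolor and rotate, loop ends
            z.p.p.color = RED
            rotate(T, z.p.p, other)
    T.root.color = BLACK

def rb_insert(T, key):
    z = Node(key, RED, T.nil)             # new node is red with T.nil children
    y, x = T.nil, T.root
    while x is not T.nil:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is T.nil:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z
    rb_insert_fixup(T, z)

def height(T, x):
    return 0 if x is T.nil else 1 + max(height(T, x.left), height(T, x.right))

def check(T, x):
    """Return the black-height of x, asserting properties 4 and 5 on the way."""
    if x is T.nil:
        return 0
    if x.color == RED:
        assert x.left.color == BLACK and x.right.color == BLACK
    l = check(T, x.left) + (x.left.color == BLACK)
    r = check(T, x.right) + (x.right.color == BLACK)
    assert l == r
    return l

T, n = Tree(), 1000
for key in range(1, n + 1):
    rb_insert(T, key)
h = height(T, T.root)
print(h, "<=", 2 * math.log2(n + 1))
```

The string-based side selection costs a little speed compared with two hand-written rotations; it is used here only to keep the sketch short.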
13.4 Deletion
Like insertion, deletion from an n-node
red-black tree takes O(lg n) time. It's a bit
more complicated than insertion. We modify
TRANSPLANT to apply to red-black trees:
RB-TRANSPLANT(T,u,v)
1  if u.p == T.nil          // u is the root
2      T.root = v
3  else if u == u.p.left    // u is left child
4      u.p.left = v
5  else u.p.right = v       // u is right child
6  v.p = u.p
RB-TRANSPLANT differs from TRANSPLANT in two
ways: T.nil replaces NIL, and v.p = u.p is
done unconditionally in line 6.
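The effect of the unconditional assignment in line 6 shows up when v is the sentinel. The sketch below is an illustration with hypothetical Node and Tree scaffolding: it replaces a leaf by T.nil and shows that T.nil.p then points at the old parent, so later code can follow x.p even when x is T.nil.

```python
class Node:
    def __init__(self, key, nil=None):
        self.key = key
        self.left = self.right = self.p = nil

class Tree:
    def __init__(self):
        self.nil = Node(None)          # sentinel T.nil
        self.root = self.nil

def rb_transplant(T, u, v):
    if u.p is T.nil:                   # u is the root
        T.root = v
    elif u is u.p.left:                # u is a left child
        u.p.left = v
    else:                              # u is a right child
        u.p.right = v
    v.p = u.p                          # done even when v is T.nil

T = Tree()
root, five = Node(10, T.nil), Node(5, T.nil)
T.root = root
root.left, five.p = five, root

rb_transplant(T, five, five.right)     # five.right is T.nil
print(root.left is T.nil, T.nil.p.key)
```

With an ordinary NIL pointer this assignment would be impossible, which is why TRANSPLANT in Chapter 12 has to guard it.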
RB-DELETE is like TREE-DELETE, with added
lines that (1) keep track of a node y that
might cause violations of the red-black
properties, (2) remember y's original color,
and (3) keep track of the node x that moves
into y's original position.
If z has fewer than two children, y becomes z
and if z has two children, y becomes z's
successor, which moves into z's position.
Finally, RB-DELETE-FIXUP is called to change
colors and do rotations to restore red-black
properties.
RB-DELETE(T,z)
1  y = z
2  y-original-color = y.color
3  if z.left == T.nil                // case (a)
4      x = z.right
5      RB-TRANSPLANT(T,z,z.right)
6  else if z.right == T.nil          // case (b)
7      x = z.left
8      RB-TRANSPLANT(T,z,z.left)
9  else y = TREE-MINIMUM(z.right)
10     y-original-color = y.color
11     x = y.right
12     if y.p == z                   // case (c)
13         x.p = y
14     else RB-TRANSPLANT(T,y,y.right)
15         y.right = z.right         // case (d) step 1
16         y.right.p = y
17     RB-TRANSPLANT(T,z,y)          // case (c)
18     y.left = z.left               // and case (d)
19     y.left.p = y                  // step 2
20     y.color = z.color
21 if y-original-color == BLACK
22     RB-DELETE-FIXUP(T,x)
In addition to the trivial replacements of
NIL by T.nil and TRANSPLANT by RB-TRANSPLANT,
TREE-DELETE and RB-DELETE differ as follows:
- We maintain y as either z if z had fewer
than 2 children, or z's successor otherwise.
- We save y's original color, and if
z had 2 children, give y z's color. At the
end, if y's original color was BLACK, we
call RB-DELETE-FIXUP to fix color problems.
- We also maintain x as the node that goes
into y's original position, so x.p points to
y's original parent, even if x is T.nil.
- Finally, if y was black, there may be some
violations of red-black properties, so we
call RB-DELETE-FIXUP to fix them. If y was
red, the red-black properties still hold if
y is removed since:
1. No black-heights have changed.
2. No red nodes have been made adjacent.
Because y takes z's place and z's color, we
cannot have two adjacent red nodes at y's
new position. Also, if y was not z's
right child, then y's original right child
x (which must be black) replaces y.
3. Since y was red, it could not be the root,
so the root remains black.
If y was black, three problems may arise,
which RB-DELETE-FIXUP fixes. (1) If y had
been the root, and a red child of y becomes
the new root, property 2 is violated. (2) If
both x and x.p are red, property 4 is violated.
(3) Moving y may cause some path to have one
fewer black node, violating property 5.
We can fix the property 5 violation
by saying that x carries an "extra" black.
Thus x is "doubly black" or "red-and-black"
(and is actually BLACK or RED respectively),
but x will _always_ point to the only node
that has double coloring. Here is the code
for RB-DELETE-FIXUP:
RB-DELETE-FIXUP(T,x)
1  while x != T.root and x.color == BLACK
2      if x == x.p.left
3          w = x.p.right
4          if w.color == RED                   // Case:
5              w.color = BLACK                 // 1
6              x.p.color = RED                 // 1
7              LEFT-ROTATE(T,x.p)              // 1
8              w = x.p.right                   // 1
9          if w.left.color == BLACK and
                  w.right.color == BLACK
10             w.color = RED                   // 2
11             x = x.p                         // 2
12         else if w.right.color == BLACK
13                 w.left.color = BLACK        // 3
14                 w.color = RED               // 3
15                 RIGHT-ROTATE(T,w)           // 3
16                 w = x.p.right               // 3
17             w.color = x.p.color             // 4
18             x.p.color = BLACK               // 4
19             w.right.color = BLACK           // 4
20             LEFT-ROTATE(T,x.p)              // 4
21             x = T.root                      // 4
22     else (same as "if" clause but with
           "right" and "left" exchanged)
23 x.color = BLACK
RB-DELETE-FIXUP restores properties 1, 2, and
4. Exercises 13.4-1 and 13.4-2 show that it
restores properties 2 and 4, so we focus on
property 1. The goal of the while loop in lines 1-22
is to move the extra black up the tree until:
1. x is red-and-black, in which case it is
colored (singly) black in line 23,
2. x is the root, in which case the extra
black can be simply "removed", or
3. we can do suitable rotations and recolorings.
Within the while loop, x always points
to the doubly black node. Line 2 determines
if x is a left or right child (RB-DELETE-FIXUP
shows the code for a left child; the code for
a right child, line 22, is symmetric). We
maintain w as (a pointer to) the sibling of x.
Since x is doubly black, w cannot be T.nil --
otherwise the number of blacks from x.p on
paths through x and w would be different.
The 4 cases in the code are illustrated in
Figure 13.7. We must show property 5 is
preserved in each case. So the key idea
is to show that the number of black nodes,
including x's extra black, in paths from the
root shown to the leaves is maintained by the
transformation in each case. For example:
In Figure 13.7(a) (case 1), the number
of black nodes from the root to subtrees alpha
and beta is 3, both before and after the
transformation. And the number of black nodes
from the root to gamma, delta, epsilon, and
zeta is 2 before and after transformation.
In order to cut down on the number of cases,
in Figure 13.7(b) (case 2), we let the color
of the root be c, which can be either red or
black, and we let count(c) denote the "black
count" of a color: count(RED) = 0, and
count(BLACK) = 1. In this case, the number of
black nodes from the root to alpha or beta is
2 + count(c) before and after transformation.
The same is true for the other subtrees.
Now we analyze the cases.
Case 1: x's sibling w is red
This is the case in Figure 13.7(a) and lines
5-8. Since w must have black children, we can
switch the colors of w and x.p and perform a
left-rotation on x.p without violating the
red-black properties. The new sibling of x,
one of w's children before rotation, is black,
so we have converted case 1 to case 2, 3, or 4.
The sibling w is black in cases 2, 3, and 4,
and they are distinguished by the colors of
w's children.
Case 2: x's sibling w is black, and both of
w's children are black
This is the case in Figure 13.7(b) and lines
10-11. Since w is black, we take one black
off both x and w, and put it on x.p leaving
x with one black and w red; x.p becomes the
new x. Note that if we enter case 2 through
case 1, the new x is red-and-black, since the
original x.p was red. So the color of the
new x is red and the loop terminates. The new
x is then colored black by line 23.
Case 3: x's sibling w is black, and w's left
child is red and its right child is black
This is the case in Figure 13.7(c) and lines
13-16. We can switch the colors of w and
w.left and do a right rotation on w without
violating the red-black properties. Now we
are in case 4.
Case 4: x's sibling w is black, and
w's right child is red
This is the case in Figure 13.7(d) and lines
17-21. By making some color changes and
doing a left rotation on x.p we can remove
the extra black on x, making it singly black,
without violating red-black properties. Then
we set x to the root to exit the while loop.
Analysis
The running time of RB-DELETE without a call
to RB-DELETE-FIXUP is O(lg n) as we saw before
in analyzing ordinary binary search trees. In
RB-DELETE-FIXUP, cases 1, 3, and 4 each
terminate after a constant number of color
changes and at most three rotations. Case 2
is the only case in which the while loop can
repeat, and then the pointer x moves up the
tree at most O(h) = O(lg n) times and no
rotations are performed. Thus RB-DELETE-FIXUP
takes O(lg n) time and performs at most 3
rotations. So the overall time for RB-DELETE
is also O(lg n).
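Deletion can also be traced on a small tree in Python. The sketch below is an illustration under assumptions, not the notes' own code: Node, Tree, and shape are hypothetical scaffolding, a single rotate(T, x, side) helper replaces LEFT-ROTATE and RIGHT-ROTATE (it reuses rb_transplant for the parent relinking), and the fix-up folds the symmetric "else" branch into the same code by choosing the side at run time. The five-node tree is linked by hand so that the two deletions pass through cases 1 and 2, and then the mirror images of cases 3 and 4.

```python
RED, BLACK = "RED", "BLACK"

class Node:
    def __init__(self, key, color, nil=None):
        self.key, self.color = key, color
        self.left = self.right = self.p = nil

class Tree:
    def __init__(self):
        self.nil = Node(None, BLACK)              # sentinel T.nil, always black
        self.root = self.nil

def rb_transplant(T, u, v):
    if u.p is T.nil:
        T.root = v
    elif u is u.p.left:
        u.p.left = v
    else:
        u.p.right = v
    v.p = u.p

def rotate(T, x, side):
    """side="left" is LEFT-ROTATE(T, x); side="right" is RIGHT-ROTATE(T, x)."""
    other = "right" if side == "left" else "left"
    y = getattr(x, other)
    setattr(x, other, getattr(y, side))
    if getattr(y, side) is not T.nil:
        getattr(y, side).p = x
    rb_transplant(T, x, y)                        # link x's parent to y
    setattr(y, side, x)
    x.p = y

def rb_delete(T, z):
    y, y_color = z, z.color
    if z.left is T.nil:
        x = z.right
        rb_transplant(T, z, z.right)
    elif z.right is T.nil:
        x = z.left
        rb_transplant(T, z, z.left)
    else:
        y = z.right
        while y.left is not T.nil:                # TREE-MINIMUM(z.right)
            y = y.left
        y_color, x = y.color, y.right
        if y.p is z:
            x.p = y
        else:
            rb_transplant(T, y, y.right)
            y.right = z.right
            y.right.p = y
        rb_transplant(T, z, y)
        y.left = z.left
        y.left.p = y
        y.color = z.color
    if y_color == BLACK:
        rb_delete_fixup(T, x)

def rb_delete_fixup(T, x):
    while x is not T.root and x.color == BLACK:
        side = "left" if x is x.p.left else "right"
        other = "right" if side == "left" else "left"
        w = getattr(x.p, other)                   # x's sibling
        if w.color == RED:                        # case 1
            w.color, x.p.color = BLACK, RED
            rotate(T, x.p, side)
            w = getattr(x.p, other)
        if w.left.color == BLACK and w.right.color == BLACK:
            w.color = RED                         # case 2: push the extra black up
            x = x.p
        else:
            if getattr(w, other).color == BLACK:  # case 3
                getattr(w, side).color = BLACK
                w.color = RED
                rotate(T, w, other)
                w = getattr(x.p, other)
            w.color = x.p.color                   # case 4
            x.p.color = BLACK
            getattr(w, other).color = BLACK
            rotate(T, x.p, side)
            x = T.root
    x.color = BLACK

def shape(T, x):
    """Nested (key, color, left, right) tuples, with None for T.nil."""
    if x is T.nil:
        return None
    return (x.key, x.color, shape(T, x.left), shape(T, x.right))

T = Tree()
n = {k: Node(k, c, T.nil) for k, c in
     [(10, BLACK), (5, BLACK), (20, RED), (15, BLACK), (25, BLACK)]}
T.root = n[10]
for parent, left, right in [(10, 5, 20), (20, 15, 25)]:
    n[parent].left, n[parent].right = n[left], n[right]
    n[left].p = n[right].p = n[parent]
rb_delete(T, n[5])                                # black leaf: cases 1 and 2
after_first = shape(T, T.root)
rb_delete(T, n[20])                               # two children: mirrored cases 3 and 4
after_second = shape(T, T.root)
```

After the first deletion the root is 20 with black children 10 and 25, and 15 is a red right child of 10. After the second the tree is 15 with black children 10 and 25.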