Chapter 12 Binary Search Trees
Dynamic sets are sets that can grow or shrink
(by adding or removing elements). Search
trees are data structures that support many of
the dynamic-set operations: SEARCH, MINIMUM,
MAXIMUM, PREDECESSOR, SUCCESSOR, INSERT, and
DELETE. Thus a search tree can be used both
as a dictionary and as a priority queue.
Basic operations on a binary search tree (BST)
take time proportional to its height h. For a
complete binary tree with n nodes this is
Theta(lg(n)) in the worst case, and for a
"linear" tree (a chain of n nodes) it is
Theta(n). The expected height of a randomly
built BST is O(lg(n)), so the operations take
Theta(lg(n)) time on average.
There are variations on BSTs whose worst-case
performance can be guaranteed to be good.
12.1 What is a binary search tree?
Each node contains a key value, and pointers
(possibly NIL) left, right (to children), and
p (to parent), in addition to satellite data.
See Figure 12.1 for examples. The keys
satisfy the binary search tree property:
For any node x, if y is a node in the left
subtree of x, then y.key <= x.key; if y is a
node in the right subtree of x, then
y.key >= x.key.
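The property can be checked mechanically. The following Python sketch is one way to do it, under assumptions not in the notes: a hypothetical Node class (the parent pointer p is left out because the check does not need it), None standing in for NIL, and an illustrative helper is_bst that passes the allowed key range down the tree.

```python
class Node:
    """A BST node; None stands in for NIL (parent pointer omitted here)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(x, lo=None, hi=None):
    """Return True if the subtree rooted at x satisfies the BST property,
    with every key additionally confined to the closed range [lo, hi]."""
    if x is None:
        return True
    if lo is not None and x.key < lo:
        return False
    if hi is not None and x.key > hi:
        return False
    # Left-subtree keys must be <= x.key, right-subtree keys >= x.key.
    return is_bst(x.left, lo, x.key) and is_bst(x.right, x.key, hi)


good = Node(6, Node(5, Node(2), Node(5)), Node(7, None, Node(8)))
bad = Node(6, Node(5, Node(2), Node(9)), Node(7))  # 9 lies in 6's left subtree
print(is_bst(good), is_bst(bad))  # True False
```

Comparing each node only with its two children would not be enough: in the second tree, 9 is a valid right child of 5 but still violates the property at the root.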
We can visit the nodes of a BST, T, in sorted
order by key using an inorder tree walk
(preorder and postorder tree walks are defined
similarly). The following prints the keys in
sorted order with the call
INORDER-TREE-WALK(T.root):
INORDER-TREE-WALK(x)
1 if x != NIL
2     INORDER-TREE-WALK(x.left)
3     print x.key
4     INORDER-TREE-WALK(x.right)
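A minimal Python sketch of the walk, assuming a hypothetical Node class with key, left, and right attributes and None in place of NIL. The visit callback is an illustrative addition so the keys can be collected instead of printed.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder_tree_walk(x, visit):
    """Call visit(key) on every key of the subtree rooted at x, in sorted order."""
    if x is not None:
        inorder_tree_walk(x.left, visit)   # everything <= x.key first
        visit(x.key)
        inorder_tree_walk(x.right, visit)  # then everything >= x.key


root = Node(6, Node(5, Node(2), Node(5)), Node(7, None, Node(8)))
keys = []
inorder_tree_walk(root, keys.append)
print(keys)  # [2, 5, 5, 6, 7, 8]
```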
Theorem 12.1 If x is the root of an n-node
subtree, then the call INORDER-TREE-WALK(x)
takes Theta(n) time.
Proof: Let T(n) denote the time taken by
INORDER-TREE-WALK(x) when x is the root of an
n-node subtree. Then T(0) = c, a positive
constant time to do the test for x being NIL.
For n > 0, suppose the left subtree of x has
k nodes, so the right subtree has n - k - 1
nodes, so T(n) = T(k) + T(n - k - 1) + d for
some positive constant d. We show that
T(n) = (c + d)n + c by the substitution method.
For n = 0, (c + d)*0 + c = c = T(0). For n > 0,
T(n) = T(k) + T(n - k - 1) + d
     = ((c+d)k + c) + ((c+d)(n-k-1) + c) + d
     = (c+d)n + c - (c+d) + c + d
     = (c+d)n + c
Hence T(n) = Theta(n).
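The closed form does not depend on how the nodes split between the two subtrees. As a numeric sanity check (not a proof), the Python sketch below evaluates the recurrence for three different choices of k. The constants C and D and the split functions are arbitrary illustrative choices.

```python
C, D = 3, 5  # arbitrary positive constants standing in for c and d


def T(n, split):
    """Evaluate T(n) = T(k) + T(n - k - 1) + d with T(0) = c,
    where split(n) chooses the left-subtree size k for an n-node tree."""
    if n == 0:
        return C
    k = split(n)
    return T(k, split) + T(n - k - 1, split) + D


splits = {
    "left subtree always empty": lambda n: 0,
    "right subtree always empty": lambda n: n - 1,
    "as balanced as possible": lambda n: (n - 1) // 2,
}

for name, split in splits.items():
    ok = all(T(n, split) == (C + D) * n + C for n in range(60))
    print(name, ok)
```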
12.2 Querying a binary search tree
Query operations on a BST: SEARCH (the most
common), MINIMUM, MAXIMUM, SUCCESSOR, and
PREDECESSOR. Each can be performed in time
O(h), where h is the height of the tree.
Searching
Given a pointer to the root of a BST and a
key, k, TREE-SEARCH returns a pointer to a
node with key k if one exists, or NIL if not.
TREE-SEARCH(x,k)
1 if x == NIL or k == x.key
2     return x
3 if k < x.key
4     return TREE-SEARCH(x.left, k)
5 else return TREE-SEARCH(x.right, k)
The search progresses downward in the tree,
as in Figure 12.2, so the number of nodes
encountered, and hence the running time, is
O(h), where h is the height of the tree.
An iterative version, which is more efficient
on most computers:
ITERATIVE-TREE-SEARCH(x,k)
1 while x != NIL and k != x.key
2     if k < x.key
3         x = x.left
4     else x = x.right
5 return x
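Both versions carry over to Python almost line for line. This is a sketch under the assumption of a hypothetical Node class with key, left, and right attributes, with None playing the role of NIL; the sample keys are chosen only for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search(x, k):
    """Recursive search: return a node with key k in the subtree at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)
    return tree_search(x.right, k)


def iterative_tree_search(x, k):
    """Same search with the recursion unrolled into a loop."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x


root = Node(15,
            Node(6, Node(3), Node(7, None, Node(13))),
            Node(18, Node(17), Node(20)))
hit = tree_search(root, 13)             # follows 15 -> 6 -> 7 -> 13
miss = iterative_tree_search(root, 14)  # falls off the tree below 13
print(hit.key, miss)  # 13 None
```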
Minimum and maximum
A node in a BST whose key is a minimum can be
found by following left pointers until a NIL
is encountered. The following procedure
returns a pointer to the node with minimum
key in a BST rooted at x. It is correct by
the binary-search-tree property.
TREE-MINIMUM(x)
1 while x.left != NIL
2     x = x.left
3 return x
Similarly, the following procedure returns a
pointer to the node with maximum key:
TREE-MAXIMUM(x)
1 while x.right != NIL
2     x = x.right
3 return x
Both these procedures run in O(h) time for
the same reason SEARCH runs in O(h) time.
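In Python the two procedures might look as follows. This sketch assumes a hypothetical Node class and, like the pseudocode, assumes x is not NIL (None) on entry.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_minimum(x):
    """Follow left pointers from x (assumed not None) to the smallest key."""
    while x.left is not None:
        x = x.left
    return x


def tree_maximum(x):
    """Follow right pointers from x (assumed not None) to the largest key."""
    while x.right is not None:
        x = x.right
    return x


root = Node(15,
            Node(6, Node(3, Node(2), Node(4)), Node(7)),
            Node(18, Node(17), Node(20)))
print(tree_minimum(root).key, tree_maximum(root).key)  # 2 20
print(tree_maximum(root.left).key)                     # 7, max of the subtree at 6
```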
Successor and predecessor
The following procedure returns the successor
of a node x in a BST, and NIL if x.key is the
largest key in the tree:
TREE-SUCCESSOR(x)
1 if x.right != NIL
2     return TREE-MINIMUM(x.right)
3 y = x.p
4 while y != NIL and x == y.right
5     x = y
6     y = x.p
7 return y
If the right subtree of x is not empty, the
successor is the leftmost node in the right
subtree, found by TREE-MINIMUM(x.right).
If the right subtree of x is empty and x has
a successor y, then y is the lowest ancestor
of x whose left child is also an ancestor of x
(Exercise 12.2-6; every node counts as its own
ancestor). To find this y, we go up the tree
from x until we reach a node that is the left
child of its parent (lines 3-7).
The running time of TREE-SUCCESSOR is O(h)
since we either follow a path up the tree or
down the tree, the length of such paths
is O(h), and we execute a constant number of
operations at each node. The same is true of
TREE-PREDECESSOR, defined symmetrically.
Even if the keys are not distinct, we can
define the successor or predecessor of a node
x as the node returned by these procedures.
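A Python sketch of TREE-SUCCESSOR, together with one possible reading of the symmetric TREE-PREDECESSOR (the notes do not spell the latter out). It assumes a hypothetical Node class whose constructor also sets the children's parent pointer p; nodes are compared by identity (is) rather than by key.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.p = None
        for child in (left, right):   # wire up parent pointers
            if child is not None:
                child.p = self


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_maximum(x):
    while x.right is not None:
        x = x.right
    return x


def tree_successor(x):
    if x.right is not None:
        return tree_minimum(x.right)
    y = x.p
    while y is not None and x is y.right:  # climb while x is a right child
        x = y
        y = y.p
    return y


def tree_predecessor(x):
    """Mirror image of tree_successor: swap left/right and minimum/maximum."""
    if x.left is not None:
        return tree_maximum(x.left)
    y = x.p
    while y is not None and x is y.left:
        x = y
        y = y.p
    return y


n13 = Node(13, Node(9))
root = Node(15,
            Node(6, Node(3, Node(2), Node(4)), Node(7, None, n13)),
            Node(18, Node(17), Node(20)))
print(tree_successor(n13).key)     # 15: climbs 13 -> 7 -> 6 -> 15
print(tree_predecessor(root).key)  # 13: maximum of the left subtree
```

Starting at tree_minimum(root) and calling tree_successor repeatedly visits the keys in sorted order, ending with None after the largest key.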
Theorem 12.2: The dynamic-set queries SEARCH,
MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR
can be made to run in O(h) time on a BST of
height h.
12.3 Insertion and deletion
The insertion and deletion operations of a
dynamic set are modifiers and allow it to
change. For a BST, we also need to preserve
the binary-search-tree property.
Insertion
To insert a new node, z, into a BST, T, we
assume that z.key = v, and z.p, z.left, and
z.right are all NIL.
TREE-INSERT(T,z)
 1 y = NIL
 2 x = T.root
 3 while x != NIL
 4     y = x
 5     if z.key < x.key
 6         x = x.left
 7     else x = x.right
 8 z.p = y
 9 if y == NIL          // Tree T was empty
10     T.root = z
11 else if z.key < y.key
12     y.left = z
13 else y.right = z
Figure 12.3 shows how TREE-INSERT works: it
begins at the root and traces a path downward
(lines 3-7), maintaining y as x's parent. When
x becomes NIL, that is where we want to place
z; lines 8-13 set pointers to do that.
TREE-INSERT runs in O(h) time for the same
reason as the query procedures above.
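A Python sketch of TREE-INSERT, assuming hypothetical Node and Tree classes (Tree only holds the root attribute). The helper inorder and the sample keys are illustrative additions.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    y = None            # trailing pointer: parent of x
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:       # tree T was empty
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
for k in [12, 5, 18, 2, 9, 15, 19, 17, 13]:
    tree_insert(T, Node(k))

print(inorder(T.root))  # [2, 5, 9, 12, 13, 15, 17, 18, 19]
n13 = T.root.right.left.left
print(n13.key, n13.p.key)  # 13 15: the path was 12 -> 18 -> 15
```

Every new node becomes a leaf, so the shape of the tree depends entirely on the insertion order.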
Deletion
The procedure for deleting a node z from a BST
T takes T and (a pointer to) z as arguments.
It considers 3 cases (see Figure 12.4):
- if z has no children, we remove it by
replacing its parent's pointer to z with NIL.
- if z has only one child, we elevate that
child to take z's position by having z's
parent point to the child.
- if z has two children, we find z's successor
y (in z's right subtree) and have y take z's
position. The rest of z's original right
subtree becomes y's new right subtree, and z's
left subtree becomes y's new left subtree. It
matters whether y is z's right child or not,
leading to two subcases.
The TREE-DELETE procedure organizes these
cases a bit differently, into the 4 cases
shown in Figure 12.4:
- If z has no left child (Fig. 12.4a), we
replace z by its right child. When z's right
child is also NIL, z has no children; when
z's right child is not NIL, z has one child.
- If z has just one child, which is its left
child, we replace z by that child (Fig. 12.4b).
- Otherwise z has two children. We find z's
successor y, which is in z's right subtree and
has no left child (Exercise 12.2-5). We move
y to z's position, adjusting subtrees:
  - If y is z's right child (Fig. 12.4c), we
  replace z by y, keeping y's right child.
  - If y is not z's right child (Fig. 12.4d),
  we first replace y by its own right child,
  then replace z by y.
We use a routine TRANSPLANT to move subtrees
around in a binary tree: the subtree rooted at
u is replaced by the subtree rooted at v, v
becoming the appropriate child of u's parent.
TRANSPLANT(T,u,v)
1 if u.p == NIL           // u is the root
2     T.root = v
3 else if u == u.p.left   // u is left child
4     u.p.left = v
5 else u.p.right = v      // u is right child
6 if v != NIL
7     v.p = u.p
NOTE: TRANSPLANT does not update v.left and
v.right; the caller does that if needed.
TREE-DELETE(T,z)
 1 if z.left == NIL                // case (a)
 2     TRANSPLANT(T,z,z.right)
 3 else if z.right == NIL          // case (b)
 4     TRANSPLANT(T,z,z.left)
 5 else y = TREE-MINIMUM(z.right)
 6     if y.p != z                 // case (d)
 7         TRANSPLANT(T,y,y.right) // step 1
 8         y.right = z.right
 9         y.right.p = y
10     TRANSPLANT(T,z,y)           // case (c)
11     y.left = z.left             // and case (d)
12     y.left.p = y                // step 2
TREE-DELETE runs in O(h) time, since all
steps take a constant amount of time except
TREE-MINIMUM, which runs in O(h) time.
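TRANSPLANT and TREE-DELETE can be sketched in Python as follows. The Node and Tree classes, the tree_insert and inorder helpers, and the sample keys are assumptions made for illustration. The keys are chosen so that the four deletions go through cases (a), (b), (c), and (d) in that order.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    y, x = None, T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def transplant(T, u, v):
    """Replace the subtree rooted at u by the one rooted at v (v may be None)."""
    if u.p is None:
        T.root = v
    elif u is u.p.left:
        u.p.left = v
    else:
        u.p.right = v
    if v is not None:
        v.p = u.p


def tree_delete(T, z):
    if z.left is None:               # case (a): no left child
        transplant(T, z, z.right)
    elif z.right is None:            # case (b): only a left child
        transplant(T, z, z.left)
    else:
        y = tree_minimum(z.right)    # z's successor; it has no left child
        if y.p is not z:             # case (d): y is deeper in the right subtree
            transplant(T, y, y.right)
            y.right = z.right
            y.right.p = y
        transplant(T, z, y)          # cases (c) and (d)
        y.left = z.left
        y.left.p = y


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
nodes = {k: Node(k) for k in [12, 5, 18, 2, 9, 15, 19, 13, 17, 1]}
for z in nodes.values():
    tree_insert(T, z)
for k in [9, 2, 18, 12]:             # cases (a), (b), (c), (d) in turn
    tree_delete(T, nodes[k])
print(inorder(T.root))  # [1, 5, 13, 15, 17, 19]
print(T.root.key)       # 13, the successor of the deleted root 12
```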
Theorem 12.3: The dynamic-set operations
INSERT and DELETE can be made to run in
O(h) time on a BST of height h.
12.4 Randomly built binary search trees
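A randomly built BST on n keys is one obtained
by inserting the keys in random order into an
initially empty tree, with each of the n!
orderings equally likely. It can be shown that
the expected height of such a tree is O(lg(n))
when the keys are distinct.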
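The gap between a "linear" tree and a randomly built one is easy to observe. The Python sketch below is a small experiment, not part of the analysis: the function bst_height, the value n = 1000, and the seed are arbitrary illustrative choices. Height is counted in edges on the longest root-to-leaf path.

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_height(keys):
    """Insert keys in the given order into an empty BST and return its height,
    tracked as the largest depth at which any node was placed."""
    root = None
    height = 0
    for k in keys:
        z = Node(k)
        if root is None:
            root = z
            continue
        x, depth = root, 0
        while True:
            depth += 1
            if k < x.key:
                if x.left is None:
                    x.left = z
                    break
                x = x.left
            else:
                if x.right is None:
                    x.right = z
                    break
                x = x.right
        height = max(height, depth)
    return height


n = 1000
rng = random.Random(12)          # fixed seed so the run is repeatable
shuffled = list(range(n))
rng.shuffle(shuffled)

sorted_height = bst_height(range(n))   # sorted input gives a chain
random_height = bst_height(shuffled)
print("sorted order: ", sorted_height)            # n - 1 = 999
print("random order: ", random_height)
print("floor(lg n):  ", math.floor(math.log2(n)))  # lower bound on any height
```

Inserting the keys in sorted order yields height n - 1, while a random order typically yields a height that is a small multiple of lg(n).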