The assignment consists of two parts. For the first part, you will write code for and demonstrate the B-tree operations except for deletion. You will also write pseudocode for B-tree deletion. In the second part, you will implement B-tree deletion and write pseudocode for binomial heaps.
Here is an example using treetest1:
treetest1 commands
+------------------------------------------------+
| # create tree |
| c |
| |
| # insert 10 items |
| i 372 |
| i 245 |
| i 491 |
| i 474 |
| i 440 |
| i 122 |
| i 418 |
| i 125 |
| i 934 |
| i 752 |
| |
| # Show tree structure |
| S |
| |
| # search for item 122 in tree |
| s 122 |
| |
| # search for item 441 not in tree |
| s 441 |
| |
| # end of test |
+------------------------------------------------+
The output from treetest1 is below:
treetest1 script using B-trees
+-------------------------------------------------------+
| # create tree |
| # insert 10 items |
| # Show tree structure |
| |
| Structure of Btree (rotated 90 degrees to the left): |
| |
|        934 |
|        752 |
|        491 |
|        474 |
|  440 |
|        418 |
|        372 |
|        245 |
|        125 |
|        122 |
| |
| # search for item 122 in tree |
| Key 122 found at index 1 in node: |
| |
| Leaf = True, n = 5 |
| Keys: 122 125 245 372 418 |
| |
| # search for item 441 not in tree |
| Key 441 not found |
| |
| # end of test |
+-------------------------------------------------------+
For Part 1, also implement B-Tree-Minimum() and B-Tree-Maximum().
Discussion of B-tree Implementation:
As mentioned in class, B-trees are balanced trees designed to hold large amounts of data and to work efficiently with disk storage. In addition to the standard insert, delete, and search operations, you need to implement ShowTree(), which displays the structure of a B-tree (rotated 90 degrees to the left), and auxiliary operations such as Split-Child(), Merge-Children(), Borrow-Left(), Minimum(), etc.
Note: Remember that the minimum degree is t = 3.
Here is the overall implementation strategy that I used: I first implemented B-Tree-Create(), B-Tree-Insert(), and ShowTree(), and tested them. Then I implemented and tested B-Tree-Search(). Finally, I implemented deletion one case at a time, implementing the necessary auxiliary functions as I went. Unless you do the extra credit, you can ignore the Read-Disk() and Write-Disk() operations (and possibly even deallocation of nodes, though in practice you should definitely deallocate them). The following paragraphs contain more details.
Implementation of Insertion, etc.
I implemented B-Tree-Create() and B-Tree-Insert() (along with B-Tree-Split-Child() and B-Tree-Insert-Nonfull()) by exactly following the pseudocode in the text.
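As a rough illustration of what "following the pseudocode in the text" can look like, here is a minimal Python sketch of create, split-child, insert-nonfull, and insert. It is not the required solution: the names BTreeNode, BTree, split_child, insert_nonfull, and insert are assumptions made for this sketch, the nodes use Python lists with 0-based indices instead of the 1-based key_i[x] and c_i[x] arrays, and the disk operations are omitted.

```python
T = 3  # minimum degree, as required by the assignment


class BTreeNode:
    """Hypothetical node layout: keys and children are plain Python lists."""

    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []      # at most 2T-1 keys
        self.children = []  # len(keys) + 1 children when not a leaf


class BTree:
    def __init__(self):
        self.root = BTreeNode(leaf=True)  # plays the role of B-Tree-Create


def split_child(x, i):
    """Split the full child x.children[i] (0-based) around its median key."""
    y = x.children[i]
    z = BTreeNode(leaf=y.leaf)
    median = y.keys[T - 1]
    z.keys = y.keys[T:]          # upper T-1 keys move to the new node z
    y.keys = y.keys[:T - 1]      # lower T-1 keys stay in y
    if not y.leaf:
        z.children = y.children[T:]
        y.children = y.children[:T]
    x.keys.insert(i, median)     # the median moves up into the parent
    x.children.insert(i + 1, z)


def insert_nonfull(x, k):
    """Insert k into the subtree rooted at x, which must not be full."""
    i = len(x.keys)
    while i > 0 and k < x.keys[i - 1]:
        i -= 1
    if x.leaf:
        x.keys.insert(i, k)
        return
    if len(x.children[i].keys) == 2 * T - 1:
        split_child(x, i)
        if k > x.keys[i]:
            i += 1
    insert_nonfull(x.children[i], k)


def insert(tree, k):
    r = tree.root
    if len(r.keys) == 2 * T - 1:   # full root: grow the tree by one level
        s = BTreeNode(leaf=False)
        s.children.append(r)
        tree.root = s
        split_child(s, 0)
        insert_nonfull(s, k)
    else:
        insert_nonfull(r, k)
```

Inserting the ten treetest1 keys in order with this sketch gives a root holding 440, a left leaf holding 122 125 245 372 418, and a right leaf holding 474 491 752 934, which agrees with the node printed for the search in the sample output.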
Implementation of ShowTree()
Here is some pseudocode for ShowTree():
ShowTree(x, depth)    |> x points to root of subtree of given depth
    if not leaf[x] then
        ShowTree( c_(n[x]+1)[x], depth + 1 )
    for i <- n[x] downto 1 do
        print key_i[x] right-justified in a field of (depth * 6) + 4
        if not leaf[x] then
            ShowTree( c_i[x], depth + 1 )
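The ShowTree() pseudocode above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the BTreeNode class and the show_tree name are made up for the sketch, indices are 0-based, and the lines are collected in a list before printing so they are easy to inspect.

```python
class BTreeNode:
    """Hypothetical node: a node with no children is treated as a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.leaf = not self.children


def show_tree(x, depth=0, out=None):
    """Collect the display lines, largest keys first (tree rotated left)."""
    if out is None:
        out = []
    if not x.leaf:
        # Rightmost subtree comes first, so it ends up at the top.
        show_tree(x.children[len(x.keys)], depth + 1, out)
    for i in range(len(x.keys) - 1, -1, -1):
        # Right-justify each key in a field of (depth * 6) + 4 characters.
        out.append(str(x.keys[i]).rjust(depth * 6 + 4))
        if not x.leaf:
            show_tree(x.children[i], depth + 1, out)
    return out


# The tree that the treetest1 inserts produce with t = 3.
root = BTreeNode(
    [440],
    [BTreeNode([122, 125, 245, 372, 418]), BTreeNode([474, 491, 752, 934])],
)
lines = show_tree(root)
print("\n".join(lines))
```

Keys deeper in the tree are indented six more columns per level, so the root key 440 sits furthest left and the leaf keys line up one level to its right.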
Implementation of the Search Operation
For B-Tree-Search(), I first defined a new data type, BTreeLocation, with two fields: nodePtr, which points to a BTreeNode, and index, which is the index of a key in that node. I then followed the pseudocode in the text, returning (NIL,0) instead of NIL for an unsuccessful search.
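One possible shape for this in Python is sketched below. The field names node_ptr and index mirror the nodePtr and index fields described above; the BTreeNode class and the btree_search name are assumptions for the sketch. The returned index is 1-based to match the sample output ("found at index 1"), and None stands in for NIL.

```python
from typing import NamedTuple, Optional


class BTreeNode:
    """Hypothetical node: a node with no children is treated as a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.leaf = not self.children


class BTreeLocation(NamedTuple):
    node_ptr: Optional[BTreeNode]  # None plays the role of NIL
    index: int                     # 1-based key index, 0 when not found


def btree_search(x, k):
    """Return the location of k in the subtree rooted at x."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return BTreeLocation(x, i + 1)   # convert to a 1-based index
    if x.leaf:
        return BTreeLocation(None, 0)    # unsuccessful search
    return btree_search(x.children[i], k)


left = BTreeNode([122, 125, 245, 372, 418])
right = BTreeNode([474, 491, 752, 934])
root = BTreeNode([440], [left, right])

found = btree_search(root, 122)
missing = btree_search(root, 441)
```

Returning a pair even on failure means the caller can always unpack the result the same way and only has to test whether the node field is NIL.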
Implementation of the Deletion Operation
For the deletion operation, I followed the strategy of the text, using a non-recursive top-level routine B-Tree-Delete() that called a recursive routine B-Tree-Delete-Fullenough() (this is a common strategy for dealing with recursive routines -- to take care of special initial boundary cases). Also, in general, I found that several routines could be designed by "reading" a related routine backward. This was somewhat true of B-Tree-Delete(), which I checked by reading B-Tree-Insert() backward; here is the pseudocode:
B-Tree-Delete(T,k)
    r <- root[T]
    if n[r] = 0
        then error: "Attempt to delete from empty tree."
        else B-Tree-Delete-Fullenough(r,k)
             if n[r] = 0 and not leaf[r]
                 then root[T] <- c_1[r]
                      deallocate r
For B-Tree-Delete-Fullenough(), I followed the pseudocode given in the lab session. But note a couple of subtleties. First, in case 3b, there are two subcases: when c_i[x] is merged with c_(i-1)[x] or with c_(i+1)[x]. These two cases can be handled by simply decrementing i if needed and then calling B-Tree-Merge-Children() as follows:
if i = n[x] + 1    |> Then we need to merge from left, not from right
    then i <- i - 1
B-Tree-Merge-Children( x, i, c_i[x], c_(i+1)[x] )
(rather than having two separate calls to B-Tree-Merge-Children()).
B-Tree-Minimum(x)
    while not leaf[x]
        do x <- c_1[x]
    return key_1[x]
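In Python, B-Tree-Minimum() and its mirror image B-Tree-Maximum() might look like the sketch below. The BTreeNode class and the function names are assumptions for illustration, and the lists are 0-based, so c_1[x] becomes children[0] and the last child becomes children[-1].

```python
class BTreeNode:
    """Hypothetical node: a node with no children is treated as a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.leaf = not self.children


def btree_minimum(x):
    """Follow the leftmost child pointers down to a leaf."""
    while not x.leaf:
        x = x.children[0]
    return x.keys[0]


def btree_maximum(x):
    """Follow the rightmost child pointers down to a leaf."""
    while not x.leaf:
        x = x.children[-1]
    return x.keys[-1]


root = BTreeNode(
    [440],
    [BTreeNode([122, 125, 245, 372, 418]), BTreeNode([474, 491, 752, 934])],
)
print(btree_minimum(root), btree_maximum(root))
```

Both routines assume the subtree is non-empty. Called on a child subtree, they give the successor and predecessor keys needed in the "find the predecessor/successor" deletion cases.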
Overall strategy for B-Tree-Delete-Fullenough()
My overall strategy to implement B-Tree-Delete-Fullenough() was to implement and test case 1 (and "case 4") first, then the "find the predecessor/successor" cases 2a and 2b, then the "merge" of case 2c (also used in case 3b), then the "borrow-left" and "borrow-right" subcases of case 3a, and finally the "merge" of case 3b.
As I have mentioned, the "borrow-left" operation is sort of like the "rotate" operation used in red-black trees (I think that the split-child and merge-children operations could be mostly performed by a sequence of "borrow" operations also, so in that sense split and merge are kinds of "super-rotates"). The pseudocode for B-Tree-Borrow-Left() is below; the steps in B-Tree-Borrow-Right() are similar.
B-Tree-Borrow-Left( x, i )
    y <- c_(i-1)[x]                |> rename left child to simplify pseudocode
    z <- c_i[x]                    |> rename child to simplify pseudocode
    shift all keys (and child pointers if not leaf[z]) up one index
                                   |> in node z to make room for one more key and child pointer
    increment n[z]
    key_1[z] <- key_(i-1)[x]       |> "rotate" keys: the separator moves down into z
    key_(i-1)[x] <- key_(n[y])[y]  |> and the largest key of y moves up into x
    if not leaf[z]
        then c_1[z] <- c_(n[y]+1)[y]   |> and move a child pointer into z
    decrement n[y]
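The borrow-left steps can be sketched in Python as shown here. This is an illustration only: BTreeNode and borrow_left are hypothetical names, and i is a 0-based child index, so the left sibling is children[i - 1] and the separating key is keys[i - 1]. Python's list insert and pop take care of the shifting and of the n[z] and n[y] bookkeeping.

```python
class BTreeNode:
    """Hypothetical node: a node with no children is treated as a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.leaf = not self.children


def borrow_left(x, i):
    """Move one key from the left sibling of x.children[i] through x."""
    y = x.children[i - 1]             # left sibling, must have a key to spare
    z = x.children[i]                 # child that needs one more key
    z.keys.insert(0, x.keys[i - 1])   # separator moves down into z
    x.keys[i - 1] = y.keys.pop()      # largest key of y moves up into x
    if not z.leaf:
        # The last subtree of y becomes the first subtree of z.
        z.children.insert(0, y.children.pop())


left = BTreeNode([122, 125, 245, 372, 418])
right = BTreeNode([474, 491])
parent = BTreeNode([440], [left, right])
borrow_left(parent, 1)
print(left.keys, parent.keys, right.keys)
```

The total number of keys is unchanged and the ordering across the three nodes is preserved, which is why the operation resembles a rotation.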
Implementation of B-Tree-Merge-Children()
The B-Tree-Merge-Children() operation is similar to B-Tree-Split-Child(). As in B-Tree-Split-Child(), to simplify the pseudocode, let y = c_i[x] and z = c_(i+1)[x], where it is assumed that both y and z have t-1 keys. Here is the pseudocode:
B-Tree-Merge-Children( x, i, y, z )
    copy the t-1 keys of z into the top t-1 key positions of y
    if not leaf[y] then
        copy the t child pointers of z into the top t child pointer positions of y
    key_t[y] <- key_i[x]    |> move the splitting key down to y
    n[y] <- 2t - 1          |> y is now a full node
    shift the keys from index i+1 to n[x] down one index in node x
    shift the child pointers from index i+2 to n[x]+1 down one index in node x
    decrement n[x]          |> since it has lost its splitting key
    deallocate z
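A Python sketch of the merge, under the same assumptions as the pseudocode (both children hold t-1 keys), might look like this. The names BTreeNode, merge_children, and merge_around are hypothetical, indices are 0-based, and dropping the reference to z stands in for deallocation. The small merge_around helper shows the "decrement i if the child is the last one" adjustment used in case 3b.

```python
T = 3  # minimum degree


class BTreeNode:
    """Hypothetical node: a node with no children is treated as a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.leaf = not self.children


def merge_children(x, i):
    """Merge x.children[i], the key x.keys[i], and x.children[i + 1]."""
    y = x.children[i]
    z = x.children[i + 1]
    y.keys.append(x.keys.pop(i))   # splitting key moves down into y
    y.keys.extend(z.keys)          # then the t-1 keys of z
    if not y.leaf:
        y.children.extend(z.children)
    del x.children[i + 1]          # x loses its pointer to z
    return y


def merge_around(x, i):
    """Merge child i with a sibling: the right one, or the left if i is last."""
    if i == len(x.keys):           # last child has no right sibling
        i -= 1
    return merge_children(x, i)


parent = BTreeNode(
    [300, 600],
    [BTreeNode([100, 200]), BTreeNode([400, 500]), BTreeNode([700, 800])],
)
merged = merge_around(parent, 2)   # last child, so it merges with its left sibling
print(parent.keys, merged.keys)
```

After the call the merged node holds 2t-1 keys, so it is full, and the parent has one key and one child pointer fewer than before.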
Testing:
Note that treetest6 for this assignment was generated using random numbers, so there are a number of duplicate keys, both for insert and delete. The implementation discussed above and in the text can handle duplicates, so just insert any duplicate keys. That way, when you delete duplicate keys, each one will be deleted and you won't generate any "search key not found" messages. The result of treetest6 should have the keys 133 and 357 in its root.
What To Hand In: