Computer Science 4611
Database Management Systems

Programming Assignment 3
Buffer Manager (25 points)
Due Wednesday, March 26, 2003

Introduction

In programming assignments 3-6, you will be implementing a simple DBMS. For an overview of how you will construct the DBMS, read this page. For this assignment you will implement a simplified version of the Buffer Manager layer, without support for concurrency control or recovery. You will be given the code for the lower layer, the Disk Space Manager.

You will carry out this assignment, and subsequent ones, in teams of three. The teams will be selected by the instructor; you can find your team assignment here. You should begin by reading the chapter on Disks and Files to get an overview of buffer management. This material will also be covered in class.

Your Buffer Manager code will provide an interface between the Disk Space Manager and the upper levels of code. The Disk Space Manager, implemented by the class DB, provides basic routines for allocating and deallocating file pages. We will not be implementing this class ourselves because this aspect of a system tends to be very system specific (also, we will be using a very simple model of pages). The Buffer Manager provides a simple set of routines for obtaining, pinning, and unpinning pages that are associated with a file (later layers of the DBMS will organize pages into files). You should familiarize yourself with the low-level routines that are available by reading db.h carefully.

To make a local copy of the code you need to implement, download the file bm.tar.Z. This is a compressed tar archive. To unpack it, do the following:

  % uncompress bm.tar
  % tar xvf bm.tar

This will create a directory BufMgr that contains the code provided to you as well as skeletons for the code you need to write. The code comes with a makefile. It is unlikely you will need to modify it (the only files you should need to change are buf.h and buf.C). The code also comes with two testing programs, db_tester and bm_tester. The makefile builds both test programs when you type "make" (you can rebuild each one individually by typing "make db_tester" or "make bm_tester"). db_tester focuses primarily on testing the Disk Space Manager code (though it needs the Buffer Manager code to be working), and bm_tester primarily tests the Buffer Manager code. To recompile all of the code, first type "make clean", which removes all current .o and executable files, and then type "make" again.

The Buffer Manager Interface

The simplified Buffer Manager interface that you will implement in this assignment allows a client (a higher level program that calls the Buffer Manager) to allocate/de-allocate pages on disk, to bring a disk page into the buffer pool and pin it, and to unpin a page in the buffer pool.

The methods that you have to implement are described below (the skeletons for this class can be found in buf.h and buf.C):

class BufMgr {

public:

    // Allocate pages (frames) for the pool in main memory.
  BufMgr(int numbuf);  

    // Should flush all dirty pages in the pool to 
    // disk before shutting down and deallocate the
    // buffer pool in main memory.
  ~BufMgr();   

    // Check if this page is in buffer pool.  If it is, increment the pin_count
    // and return a pointer to this page.  If the pin_count was 0 before the
    // call, the page was a replacement candidate, but is no longer a candidate;
    // be sure to remove this from the CLOCK list of candidates.
    // If the page is not in the pool, choose a frame (from the set of
    // replacement candidates) to hold this page, read the page (using
    // the appropriate DB class method) and pin it.
    // Also, must write out the old page in chosen frame if it is dirty 
    // before reading new page.  (You can assume that emptyPage == 0 for
    // this assignment.)
  Status pinPage(PageId PageId_in_a_DB, Page*& page,int emptyPage=0);

    // Should be called with dirty==TRUE if the client has modified the page.
    // If so, this call should set the dirty bit for this frame.  Further,
    // if pin_count > 0 should decrement it, and if it becomes zero, should
    // update the CLOCK list by adding an entry for this frame.
    // If pin_count == 0 before this call, return error.
  Status unpinPage(PageId globalPageId_in_a_DB, int dirty=FALSE);

    // Call DB object to allocate a run of new pages and 
    // find a frame in the buffer pool for the first page
    // and pin it. (This call allows a client of the Buffer Manager
    // to allocate pages on disk.) If buffer is full, i.e., you 
    // can't find a frame for the first page, ask DB to deallocate 
    // all these pages, and return error.
  Status newPage(PageId& firstPageId, Page*& firstpage,int howmany=1); 

    // This method should be called to delete a page that is on disk.
    // This routine must call the DB class to deallocate the page. 
  Status freePage(PageId globalPageId); 

    // Used to flush a particular page of the buffer pool to disk
    // Should call the write_page method of the DB class
  Status flushPage(int pageId);

    // Flushes all pages of the buffer pool to disk
  Status flushAllPages();

    // Total number of Buffers
  unsigned int getNumBuffers();

    // Count of the number of unpinned Buffers
  unsigned int getNumUnpinnedBuffers();
};

Internal Design

The buffer pool is a collection of frames (page-sized sequences of main memory bytes) that is managed by the Buffer Manager. It should be stored as an array bufPool[numbuf] of Page objects.

In addition, you should maintain an array of buffer descriptors, bufDescr, one per frame. Each descriptor is a record with three fields: page number, pin_count, and dirtybit.

The pin_count field is an integer, page number is a PageId object, and dirtybit is a boolean. Together these describe the page that is stored in the corresponding frame. A page is identified by a page number that is generated by the DB class when the page is allocated, and is unique over all pages in the database. The PageId type is defined as an integer type in minidb.h.
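As a rough illustration of the descriptor record described above, the sketch below shows one possible layout. The names FrameDescr, INVALID_PAGE, and makeDescriptors are hypothetical and do not come from the provided skeletons, and PageId is redeclared here only so the fragment stands alone (in the assignment it comes from minidb.h).

```cpp
#include <cassert>
#include <vector>

// Assumption: PageId is an integer type, as the handout says minidb.h defines it.
typedef int PageId;

// Hypothetical sentinel meaning "this frame holds no page yet".
const PageId INVALID_PAGE = -1;

// One descriptor per frame; the three fields follow the handout.
struct FrameDescr {
    PageId pageNumber;  // page currently stored in the frame
    int    pin_count;   // number of outstanding pins on that page
    bool   dirtybit;    // true if the page was modified since it was read

    FrameDescr() : pageNumber(INVALID_PAGE), pin_count(0), dirtybit(false) {}
};

// Builds the descriptor array: one default (empty, unpinned, clean) entry per frame.
std::vector<FrameDescr> makeDescriptors(int numbuf) {
    assert(numbuf > 0);
    return std::vector<FrameDescr>(numbuf);
}
```

A std::vector is used here only for brevity; a plain array allocated in the BufMgr constructor would serve equally well.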

A simple hash table should be used to figure out what frame a given disk page occupies. The hash table should be implemented (entirely in main memory) by using an array of pointers to lists of <page number, frame number> pairs. The array is called the directory and each list of pairs is called a bucket. Given a page number, you should apply a hash function to find the directory entry pointing to the bucket that contains the frame number for this page, if the page is in the buffer pool. If you search the bucket and don't find a pair containing this page number, the page is not in the pool. If you find such a pair, it will tell you the frame in which the page resides.
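The directory-of-buckets layout described above might be sketched as follows. This is only one way to do it: the class name PageTable and its method names are hypothetical, a plain modulus stands in for the hash function, and -1 is used as an assumed "not in the pool" result.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

typedef int PageId;  // assumption: integer page numbers, as in minidb.h

class PageTable {
    // A bucket is a list of <page number, frame number> pairs.
    typedef std::list<std::pair<PageId, int> > Bucket;

public:
    explicit PageTable(int htsize) : dir_(htsize) { assert(htsize > 0); }

    // Returns the frame holding `page`, or -1 if the page is not in the pool.
    int lookup(PageId page) const {
        const Bucket& b = dir_[hash(page)];
        for (Bucket::const_iterator it = b.begin(); it != b.end(); ++it)
            if (it->first == page) return it->second;
        return -1;
    }

    // Records that `page` now lives in `frame`.
    void insert(PageId page, int frame) {
        dir_[hash(page)].push_back(std::make_pair(page, frame));
    }

    // Removes the entry for `page`; returns false if there was none.
    bool remove(PageId page) {
        Bucket& b = dir_[hash(page)];
        for (Bucket::iterator it = b.begin(); it != b.end(); ++it) {
            if (it->first == page) {
                b.erase(it);
                return true;
            }
        }
        return false;
    }

private:
    // Placeholder hash: page number modulo the directory size.
    std::size_t hash(PageId page) const {
        return static_cast<std::size_t>(page) % dir_.size();
    }

    std::vector<Bucket> dir_;  // the directory: one bucket per entry
};
```

With 7 buckets, pages 3 and 10 land in the same bucket, so a lookup has to compare page numbers inside the bucket rather than trust the directory slot alone.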

The hash function must distribute values in the domain of the search field uniformly over the collection of buckets. If we have HTSIZE buckets, numbered 0 through HTSIZE-1, a hash function h of the form h(value) = (a*value+b) mod HTSIZE works well in practice. HTSIZE should be chosen to be a prime number.
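A hash function of this form could look like the sketch below. The values chosen for HTSIZE, HASH_A, and HASH_B are illustrative assumptions, not values required by the assignment, and hashPage is a hypothetical name.

```cpp
#include <cassert>

typedef int PageId;  // assumption: integer page numbers, as in minidb.h

const int HTSIZE = 31;  // illustrative prime number of buckets
const int HASH_A = 7;   // illustrative multiplier a
const int HASH_B = 3;   // illustrative offset b

// h(value) = (a*value + b) mod HTSIZE, giving a bucket index in [0, HTSIZE-1].
int hashPage(PageId value) {
    assert(value >= 0);  // this sketch assumes page numbers are non-negative
    // Widen before multiplying so a*value cannot overflow a 32-bit int.
    long long h = (static_cast<long long>(HASH_A) * value + HASH_B) % HTSIZE;
    return static_cast<int>(h);
}
```

For example, with these constants page 10 maps to bucket (7*10 + 3) mod 31 = 11.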

When a page is requested the buffer manager should do the following:

  1. Check the buffer pool (by using the hash table) to see if it contains the requested page. If the page is not in the pool, it should be brought in as follows:
    1. Choose a frame for replacement, using the CLOCK replacement policy.
    2. If the frame chosen for replacement is dirty, flush it (i.e., write out the page that it contains to disk, using the appropriate DB class method).
    3. Read the requested page (again, by calling the DB class) into the frame chosen for replacement; the pin_count and dirtybit for the frame should be initialized to 0 and FALSE, respectively.
    4. Delete the entry for the old page from the Buffer Manager's hash table and insert an entry for the new page. Also, update the entry for this frame in the bufDescr array to reflect these changes.
  2. Pin the requested page by incrementing the pin_count in the descriptor for this frame, and return a pointer to the page to the requestor.
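The request sequence above can be sketched end to end with a toy pool. Everything here is a stand-in chosen for illustration: FakeDisk replaces the DB class (whose real interface is in db.h), a linear scan replaces the hash table, "first unpinned frame" replaces the replacement policy, and the names TinyPool, pin, and unpin are hypothetical rather than the BufMgr methods you must implement.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

typedef int PageId;          // assumption: integer page numbers
const PageId NO_PAGE = -1;   // hypothetical "frame is empty" marker

struct Page { char data[16]; };  // tiny stand-in for a real page

// Stand-in for the DB class: pages live in an in-memory map.
struct FakeDisk {
    std::map<PageId, Page> pages;
    int writes;  // counts write_page calls so flushing can be observed
    FakeDisk() : writes(0) {}
    void read_page(PageId id, Page& out) { out = pages[id]; }
    void write_page(PageId id, const Page& in) { pages[id] = in; ++writes; }
};

struct Descr {
    PageId page; int pin_count; bool dirty;
    Descr() : page(NO_PAGE), pin_count(0), dirty(false) {}
};

class TinyPool {
public:
    TinyPool(int numbuf, FakeDisk& disk)
        : frames_(numbuf), descr_(numbuf), disk_(disk) {}

    // Returns a pointer to the pinned page, or 0 if every frame is pinned.
    Page* pin(PageId id) {
        int f = find(id);                  // step 1: is the page in the pool?
        if (f < 0) {
            f = chooseVictim();            // 1.1: choose a frame
            if (f < 0) return 0;
            Descr& d = descr_[f];
            if (d.page != NO_PAGE && d.dirty)
                disk_.write_page(d.page, frames_[f]);  // 1.2: flush if dirty
            disk_.read_page(id, frames_[f]);           // 1.3: read new page
            d.page = id; d.pin_count = 0; d.dirty = false;  // 1.4: bookkeeping
        }
        ++descr_[f].pin_count;             // step 2: pin and return
        return &frames_[f];
    }

    // Returns false if the page is not in the pool or is not pinned.
    bool unpin(PageId id, bool dirty) {
        int f = find(id);
        if (f < 0 || descr_[f].pin_count == 0) return false;
        if (dirty) descr_[f].dirty = true;
        --descr_[f].pin_count;
        return true;
    }

private:
    int find(PageId id) const {            // linear scan in place of hashing
        for (std::size_t i = 0; i < descr_.size(); ++i)
            if (descr_[i].page == id) return static_cast<int>(i);
        return -1;
    }
    int chooseVictim() const {             // first unpinned frame, if any
        for (std::size_t i = 0; i < descr_.size(); ++i)
            if (descr_[i].pin_count == 0) return static_cast<int>(i);
        return -1;
    }

    std::vector<Page> frames_;
    std::vector<Descr> descr_;
    FakeDisk& disk_;
};
```

With a one-frame pool, pinning page 5, modifying it, unpinning it as dirty, and then pinning page 6 forces exactly one write: the dirty page 5 is flushed before page 6 is read into the same frame.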

To implement the CLOCK replacement policy, you should maintain a counter in the system that is incremented whenever a page is brought into memory. The counter value should be associated with that page (indicating when that page was brought into memory). When a frame is to be chosen for replacement, you should pick, from among the replacement candidates (frames whose pin_count is 0), the frame containing the page that was brought in first according to its clock value (the oldest page read into a frame).
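The selection rule as this handout describes it (oldest load time first) might be sketched as below. The struct FrameInfo, its loadedAt field, and the function chooseVictim are hypothetical names, and skipping pinned frames is an assumption based on the pinPage comment about choosing from the set of replacement candidates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame record for this sketch only.
struct FrameInfo {
    int pin_count;           // frames with pin_count > 0 are not candidates
    unsigned long loadedAt;  // counter value when the page was brought in
};

// Returns the index of the unpinned frame with the smallest loadedAt value,
// or -1 if every frame is pinned.
int chooseVictim(const std::vector<FrameInfo>& frames) {
    int victim = -1;
    for (std::size_t i = 0; i < frames.size(); ++i) {
        assert(frames[i].pin_count >= 0);
        if (frames[i].pin_count != 0) continue;  // pinned: skip
        if (victim < 0 || frames[i].loadedAt < frames[victim].loadedAt)
            victim = static_cast<int>(i);
    }
    return victim;
}
```

A linear scan over the descriptors is enough for small pools; keeping the candidates in a list ordered by load time, as the comments in the class listing suggest, avoids the scan.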

Error Protocol and Debugging

Be sure to follow the error protocol described in new_error.h. Note that you will likely want to add new error codes and new error messages to the tables provided for you.

The makefile compiles the code using the -g flag. This means that you can debug the executables produced using gdb. I have also set up the code with a command line debugging system. When running either of the test programs you can add the command line arguments db, bm, or gory. These turn on debugging flags in the Disk Space Manager module (db), the Buffer Manager module (bm), and some extended "gory details" output (gory). Note that you may want to add debugging commands in your Buffer Manager code following this protocol.

What to Turn In, and When

Print out your versions of buf.h and buf.C. You should test your code using the test routines bm_tester and db_tester and print out the results. Next, write up a team report of how your code is implemented. This report should give an overview of how you completed the BufMgr class and any new classes you created. It should also discuss the algorithms you used to solve the problem. This report should be at least two pages long but no longer than four pages. Each team member should also write up an individual report (at least half a page but no more than a page) discussing their contributions to the coding process and how the overall team interaction went.

You must also submit your code electronically (but only once for each team). To do this go to the link https://webapps.d.umn.edu/service/webdrop/rmaclin/cs4611-1-s2003/upload.cgi and follow the directions for uploading a file.

To make your code easier to check and grade please use the following procedure for collecting the code before uploading it: