UNIVERSITY OF MINNESOTA--DULUTH

COMPUTER SCIENCE DEPARTMENT

CS 4611: DATABASE MANAGEMENT SYSTEMS

Assignment 5: Sort-Merge Join
Due: Friday, December 17, 1999 (NO LATE ASSIGNMENTS)
Instructor: Rich Maclin
40 points

Introduction

In this assignment, you will implement the sort-merge join algorithm. You will carry out this assignment in teams with the same partner(s) as for the previous assignments.

Available Documentation

You should begin by reading the chapter on Implementation of Relational Operations, in particular, the section on Sort-Merge Join.

What You Have to Implement

class sortMerge {
  public:

    sortMerge(
      char       *filename1,    // Name of heapfile for relation R.
      int         len_in1,      // # of columns in R.
      AttrType    in1[],        // Array containing field types of R.
      short       t1_str_sizes[], // Array containing size of columns in R.
      int         join_col_in1, // The join column number of R.
      char       *filename2,    // Name of heapfile for relation S.
      int         len_in2,      // # of columns in S.
      AttrType    in2[],        // Array containing field types of S.
      short       t2_str_sizes[], // Array containing size of columns in S.
      int         join_col_in2, // The join column number of S.
      char       *filename3,    // Name of heapfile for merged results.
      int         amt_of_mem,   // Number of pages available for sorting.
      TupleOrder  order,        // Sorting order: Ascending or Descending.
      Status&     s             // Status of constructor.
    ); 
   ~sortMerge();
}; 

The sortMerge constructor joins two relations R and S, stored in the heapfiles filename1 and filename2 respectively, using the sort-merge join algorithm. Columns are numbered from 0: those of R run from 0 to len_in1 - 1, and those of S from 0 to len_in2 - 1. Concatenate each matching pair of records and write the result into the heapfile filename3. The error layer for the sortMerge class is JOINS.
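
The merge phase of the join can be sketched on plain in-memory arrays, as below. This is only an illustration under stated assumptions, not the required implementation: the Rec struct and the mergeJoin function are hypothetical names, both inputs are assumed to be already sorted in ascending key order, and nothing here uses the Minibase classes. The point it shows is the handling of duplicate keys: when a group of equal keys is found, the start of the S group is remembered so that it can be rescanned for every matching R record.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory record: a join key plus some payload.
struct Rec {
    int key;
    int payload;
};

// Merge phase of a sort-merge join. Both inputs must already be sorted
// in ascending order on key. Returns every matching (R, S) pair.
std::vector<std::pair<Rec, Rec>> mergeJoin(const std::vector<Rec> &r,
                                           const std::vector<Rec> &s)
{
    auto byKey = [](const Rec &a, const Rec &b) { return a.key < b.key; };
    assert(std::is_sorted(r.begin(), r.end(), byKey));
    assert(std::is_sorted(s.begin(), s.end(), byKey));

    std::vector<std::pair<Rec, Rec>> out;
    std::size_t i = 0, j = 0;
    while (i < r.size() && j < s.size()) {
        if (r[i].key < s[j].key) {
            ++i;                          // R is behind: advance R
        } else if (r[i].key > s[j].key) {
            ++j;                          // S is behind: advance S
        } else {
            const int key = r[i].key;
            const std::size_t mark = j;   // start of the matching S group
            // Pair every R record in the group with every S record in it.
            for (; i < r.size() && r[i].key == key; ++i) {
                for (j = mark; j < s.size() && s[j].key == key; ++j) {
                    out.push_back({r[i], s[j]});
                }
            }
            // Here i and j both sit just past their groups.
        }
    }
    return out;
}
```

In this sketch the saved index mark plays the role that a saved record position would play when the inputs are scanned from heapfiles rather than held in memory.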

You will need to use the following classes, which are provided: Sort, HeapFile, and Scan. Call the Sort constructor to sort a heapfile. To compare the join columns of two tuples, create a function tupleCmp that compares their key fields. Write it so that its return value follows the strcmp convention: negative, zero, or positive when the first key is less than, equal to, or greater than the second. Once a scan is opened on a heapfile, the scan cursor can be positioned at any record within the heapfile by calling the Scan method position with an RID argument. The next call to the Scan method getNext will proceed from the new cursor position.
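
A comparison function with strcmp-style return values might look like the sketch below. The signature is an assumption made for illustration: the KeyType enum is a hypothetical stand-in for the Minibase AttrType (whose actual values are declared in the provided include files), and the function is handed pointers to the two key fields rather than whole tuples. Adapt the parameters to whatever your own code needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the Minibase AttrType enum.
enum KeyType { keyInteger, keyString };

// Compare two join-column values. Returns a negative value, zero, or a
// positive value when key1 is less than, equal to, or greater than key2.
// len is the field width in bytes (used for string keys only).
int tupleCmp(const void *key1, const void *key2, KeyType type, std::size_t len)
{
    assert(key1 != nullptr && key2 != nullptr);

    if (type == keyInteger) {
        int a, b;
        // Copy the bytes out: a field inside a record may not be aligned.
        std::memcpy(&a, key1, sizeof a);
        std::memcpy(&b, key2, sizeof b);
        // Comparing instead of subtracting avoids integer overflow.
        return (a > b) - (a < b);
    }

    // Fixed-width string field: compare at most len characters.
    return std::strncmp(static_cast<const char *>(key1),
                        static_cast<const char *>(key2), len);
}
```

Returning (a > b) - (a < b) rather than a - b is a deliberate choice here: the subtraction can overflow when the two keys have opposite signs and large magnitudes.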

Where to Find Makefiles, Code, etc.

The structure files for the classes you will be using can be found in the directory:

        /usr/local/minibase/minibase-2.0/include

You will need to copy files from the src directory for this assignment. To do this you need to follow the same steps as in assignments 1 and 2:

  1. Make an appropriate directory to work in.
  2. Copy the file Makefile from the directory
                    /usr/local/minibase/mini_hwk/assign/SM_Join/src
  3. Execute the call:
                    make setup
     which will copy the appropriate files.
  4. Implement the assignment -- you can type make to attempt to compile your solution.

The files are:

What to Turn In, and When

You should turn in copies of your code together with copies of the output produced by running the tests provided.