CS 4611 Fall 2004 Sample Exam 2 Questions

Some sample exam 2 questions:

1. Briefly define the following terms:
   
   Clustered Index

   Unclustered Index

   Primary Index

   Secondary Index

   Database Workload

   Index-only plan

   System Catalog

   Double Buffering

   Reduction Factor

   Bulk Loading of a B+ Tree

2. What is meant by the term Buffer Replacement Policy?  Give an example of
   a buffer replacement policy and explain how this policy affects a DB.
   Make sure to give advantages and disadvantages of your sample policy.

3. What are the advantages of using a variable length format for records in
   a file.  Give an example of how you might represent a record with four
   fields, each of varying lengths.

4. Describe how each of the following page formats represents records:

   * Variable length records with a slot directory
   * Packed fixed length records
   * Unpacked fixed length records with a bitmap

5. Three file storage mechanisms are heap files, sorted files and hashed files.
   Give advantages and disadvantages of each.  Make your answers specific with
   respect to normal DB operations such a searching for a particular record,
   or scanning all of the records, etc.

6. We discussed three representations for data entries in an index file. 
    Define these three possible representations and discuss advantages and
    disadvantages of each representation and when they would be appropriate.

7. For each of the following index methods describe
     (1) How the index is represented
     (2) Where data is stored in the index
     (3) How the following operations are executed
         searching the index for a key value
         insert a key value in the index
         deleting a key value from the index
   Also discuss the advantages and disadvantages of each method.

   The methods:

     ISAM
     B+ Trees
     Static Hashing
     Extendible Hashing
     Linear Hashing

8. Give the algorithm for each of the following external sorting methods.
   How many I/O operations would be needed for each assuming a file of
   87 pages and 15 available buffers?

   2 Way External Merge Sort

   N Way External Merge Sort

9. Why is heapsort often used as your sorting mechanism for the internal
   sort during an external sort?  What is the average, best and worst case
   performance of this sort and under what conditions do these occur?

10. If you have a selection operator

    SELECT equality condition ( relation )

    how would you perform the operation under the following conditions, and
    what would be the estimated cost of your approach in disk I/Os.

    - There is no index, data is unsorted
    - There is a clustered tree index
    - There is a hash index

    How would your answer change if the condition were not an equality 
    condition?

    If there is more than one index available, how do you choose which
    mechanism(s) to use?

11. How is the projection operator implemented with sorting?  How about hashing?
    What are the expected costs for these operations in terms of disk I/Os?

12. Give pseudo-code for the following join operations (you may assume the join
    is performed on a single attribute with an equality condition).  Also give
    the expected cost for each operation.

    Simple Nested Loops Join

    Index Nested Loops Join

    Block Nested Loops Join

    Sort-Merge Join

    Hash-Join

13. What is meant by the term reduction factor?  How does it apply in query
    evaluation?  Give at least two examples of reduction factors and how they
    might affect a particular query.

14. System R only considers left deep plans.  Why is this?  Give an example of
    a plan that is left deep and one that is not.