Some sample exam 2 questions:
1. Briefly define the following terms:
Clustered Index
Unclustered Index
Primary Index
Secondary Index
Database Workload
Index-only plan
System Catalog
Double Buffering
Reduction Factor
Bulk Loading of a B+ Tree
2. What is meant by the term Buffer Replacement Policy? Give an example of
a buffer replacement policy and explain how this policy affects a DB.
Make sure to give advantages and disadvantages of your sample policy.
3. What are the advantages of using a variable length format for records in
a file. Give an example of how you might represent a record with four
fields, each of varying lengths.
4. Describe how each of the following page formats represents records:
* Variable length records with a slot directory
* Packed fixed length records
* Unpacked fixed length records with a bitmap
5. Three file storage mechanisms are heap files, sorted files and hashed files.
Give advantages and disadvantages of each. Make your answers specific with
respect to normal DB operations such a searching for a particular record,
or scanning all of the records, etc.
6. We discussed three representations for data entries in an index file.
Define these three possible representations and discuss advantages and
disadvantages of each representation and when they would be appropriate.
7. For each of the following index methods describe
(1) How the index is represented
(2) Where data is stored in the index
(3) How the following operations are executed
searching the index for a key value
insert a key value in the index
deleting a key value from the index
Also discuss the advantages and disadvantages of each method.
The methods:
ISAM
B+ Trees
Static Hashing
Extendible Hashing
Linear Hashing
8. Give the algorithm for each of the following external sorting methods.
How many I/O operations would be needed for each assuming a file of
87 pages and 15 available buffers?
2 Way External Merge Sort
N Way External Merge Sort
9. Why is heapsort often used as your sorting mechanism for the internal
sort during an external sort? What is the average, best and worst case
performance of this sort and under what conditions do these occur?
10. If you have a selection operator
SELECT equality condition ( relation )
how would you perform the operation under the following conditions, and
what would be the estimated cost of your approach in disk I/Os.
- There is no index, data is unsorted
- There is a clustered tree index
- There is a hash index
How would your answer change if the condition were not an equality
condition?
If there is more than one index available, how do you choose which
mechanism(s) to use?
11. How is the projection operator implemented with sorting? How about hashing?
What are the expected costs for these operations in terms of disk I/Os?
12. Give pseudo-code for the following join operations (you may assume the join
is performed on a single attribute with an equality condition). Also give
the expected cost for each operation.
Simple Nested Loops Join
Index Nested Loops Join
Block Nested Loops Join
Sort-Merge Join
Hash-Join
13. What is meant by the term reduction factor? How does it apply in query
evaluation? Give at least two examples of reduction factors and how they
might affect a particular query.
14. System R only considers left deep plans. Why is this? Give an example of
a plan that is left deep and one that is not.