## CS 4611 Take Home Final

### To take this final you must signup in class or see me (Rich) by April 26

You must submit a signed copy of the following page with your exam.

1. Draw an ER diagram to capture the following information: [25 points]
• Students have a unique ID#, a name, an address, and a major (which is a department ID, see next)
• Departments have a unique ID#, a name, a building, and a count of how many majors it has
• Courses have a department (a department ID#), a course number, a semester character (F for Fall, S for Spring) and a year, the combination of all four values is unique
• Professors have a unique ID#, a name, a title, and how many years they have been at the university
• Every student must take at least one class.
• Each instance of a course is taught by one professor.
• Every student is advised by a professor.
• Every professor advises at least one student.
• Every student has a major (which is a department), students may have more than one major
• Every department has at least one class.
2. Define in SQL a set of relations to capture the information from the previous question. Make sure to indicate keys and include foreign keys. Can you capture all of the constraints listed above in your definition? If not, why not? Could you capture these constraints in some other manner (besides defining relations)? (If so, how?) [20 points]
3. For the following relations:
```     Supplier (id: integer, name: string(20), address: string(50))
- id is unique
Part (id: integer, color: string(10), name: string(20), cost: real)
- id is unique
Contractor (id: integer, name: string(20), address: string(50))
- id is unique
Project (id: integer, cid: integer, budget: real, duedate: date)
- id is unique
- cid is a foreign key referring to Contractor
Order (projid: integer, partid: integer, suppid: integer, quantity: integer)
- the combination of projid, partid, suppid is unique
- projid is a foreign key referring to Project
- partid is a foreign key referring to Part
- suppid is a foreign key referring to Supplier
```

Construct the following queries:

• The names of parts that have been ordered for every project (in Relation Algebra) [10 points]
• Suppliers who supply only Blue or Yellow parts (in Domain Relational Calculus) [10 points]
• The name of the contractor who has the largest total budget (the total of all the budgets that contractor is responsible for) and the project id of the largest budget project that contractor is responsible for (in SQL) [15 points]
• The name of the contractor, the project id, and the total amount committed to parts being ordered for a project -- an order is for a quantity of a part, the total cost of the order is that quantity times the cost of the part (in SQL) [15 points]
• The names of contractors who have at least one project that has a budget larger than \$10,000 (in QBE) [10 points]
4. For the indexing methods ISAM, B+ trees, Extendible Hashing and Linear Hashing attempt to characterize the amount of space needed for each method in the best, worst, and averages cases based on the number of keys in the index and how many keys are stored on a page. You may introduce terms as needed (for example, you will likely need to include a term capturing overflow pages for ISAM). Also characterize the best, worst and average case lookup time (in page I/Os) for each method. Make sure to justify your answers and list any assumptions. [20 points]
5. For the linear hashing indexing method: [20 points]
• Describe how the index is represented
• Where and how the data is stored
• How the following operations are executed: searching for a key, inserting a tuple, and deleting a tuple
• The advantages and disadvantages of this method with respect to other indexing methods.
6. Define the terms 3NF and BCNF. Define a relation that is not in 3NF. Show how you could decompose that relation into 3NF. Then show how you could further decompose the resulting relations to result in BCNF. Discuss the advantages and disadvantages of your decompositions. [20 points]
7. Answer Exercise 14.8 from page 409 of the text. [30 points]
8. Answer Exercise 19.2 from page 566 of the text for the question of whether these schedules are conflict-serializable. If they are not, show a dependency graph indicating why not. [20 points]
9. Define Strict Two Phase Locking. Give an example of a schedule where a potential concurrency problem would be avoided using Strict 2PL. Make sure to clearly indicate how the locking affects the schedule. [15 points]
10. Define write-ahead logging. Explain why it is so important for recovery. Define the Aries method for crash recovery. Include an example with transactions that would be identified in all three phases and how those transactions would be dealt with. [20 points]