CS277 - Database System Implementation
Spring 2002, Prof. Arthur Keller

Assignment #1
Due in class on Thursday April 11, 2002

State all assumptions.

Problem 1. (40 points)

Consider a 5.25 inch disk with 16 magnetic surfaces rotating at 7200 rpm. It has a usable capacity of 8 gigabytes (2**33 bytes) stored on 1024 cylinders. Assume 10% of each track is used as overhead.

a)
What is the burst bandwidth this disk could support reading a single 4-Kbyte (2**12 bytes) block from one track?
b)
What is the sustained bandwidth this disk could support reading an entire track?
c)
What is the average rotational latency, assuming it is not necessary to start at the beginning of the track?
d)
Assuming the average seek time is 12 ms, what is the average time to fetch a 4-Kbyte (2**12 bytes) sector?

Problem 2. (15 points)

Consider the "new" Megatron 747 disk, whose properties are defined in Examples 11.3 and 11.5 in the textbook. Suppose that we know that the last I/O request accessed cylinder 500.
a)
What is the expected (average) number of cylinders that will be traveled due the very next I/O request to this disk?
b)
What is the expected block access time, again given that the head is on cylinder 500 initially?

Problem 3. (15 points)

Suppose that we are scheduling I/O requests for the new Megatron 747 disk (Examples 11.3, 11.5). Use average seek, latency and transfer times of 6.5, 7.8, and 0.5 milliseconds. Initially the head is at cylinder 4000, and then the following requests come in (times in milliseconds):

   time =  0; request for block on cylinder 2000 arrives
   time =  1; request for block on cylinder 7000 arrives
   time = 10; request for block on cylinder  500 arrives
   time = 20; request for block on cylinder 4000 arrives
   time = 25; request for block on cylinder 5000 arrives
a)
If we use the elevator scheduling algorithm, at what time is each request serviced completely?
b)
If we use a first-come-first-served scheduler, at what time is each request serviced fully?

Problem 4. (20 points)

You are designing a file system for a medical application. Each patient record has 10 fields that always occur (e.g., name, patient number) and 50 fields that may or may not be relevant or known for a patient (e.g., number of children given birth to, cholesterol level). Assume that each of the optional fields is relevant or known for a particular patient with probability p. The values for all fields are a fixed size of 12 bytes.

You are considering two options:

(i)
A fixed format record.
(ii)
A variable format record where all fields are tagged. Each tag is one byte.

a)
What is the expected size of a record for each option? Your answer may be a function of p.
b)
For what range of p values is the fixed format option best?