Homework #5

Assigned: December 2nd
Due: Friday, December 5th at 11:59 PM

NOTE: This homework is optional. If you choose to turn it in, it will replace your lowest homework grade from the first four homeworks.

Please read this information on how to submit homework online. Hard copies of homework will not be accepted.

All work on this homework must be your own. Please read (and follow!) the academic honesty policy for this class.

  1. On Mac and Windows, double-clicking a file automatically launches the program that knows how to deal with the file, and hands the file as a parameter to the program. List two ways the operating system could know which program to run. Which approach do you believe is better? How could you change the mapping? For example, suppose you wanted to open text files with emacs rather than Microsoft Word.
  2. Some operating systems provide a rename system call to give a file a new name. Is there any difference between using this call to change a file's name and simply copying the file to a new file with a new name, followed by deletion of the old file? Can you think of examples where rename would work better (or faster) than copying followed by deletion? Are there cases where copying/deletion would work better?
  3. It has been suggested that the first part of each UNIX file be kept in the same disk block as its inode. What good would this do?
  4. Consider a UNIX file system with 12 direct pointers in each inode and 4 KB file blocks.
    1. How much overhead would be needed to store index blocks for a 100 MB file, not including the inode itself? What percentage of total file size would the overhead be? Remember that blocks must be allocated in their entirety if any part of the block is needed.
    2. On average, how many 4 KB blocks would have to be read to get a random block from the (100 MB) file? Again, assume the inode is already in memory and doesn't need to be read from disk. HINT: figure out how many of the file's blocks are reached through each of the direct, single indirect, double indirect, and triple indirect pointers, and compute the average from this information.
  5. The Elephant file system (like some others) is designed so that files are never deleted; rather, they're simply made invisible. This allows users to "time travel" to retrieve old copies of their files. A similar technique is used with the OldFiles directory on unix.ic.
    1. What are the advantages to such an approach?
    2. What are the disadvantages?
    3. How could you reduce the space used by such a system? Are there simple criteria you might use to select some files that should be permanently erased?
  6. Often, a disk drive doesn't experience requests uniformly distributed across the disk. In particular, cylinders containing directories or file block pointers (inodes) will be accessed more frequently than cylinders with just file blocks. How would you deal with this uneven request distribution?
    1. Would any of the disk scheduling algorithms discussed earlier in class be helpful? Why or why not?
    2. Where on the disk could you put the "hot" cylinders to improve performance using a standard scheduling algorithm?
    3. Since many file systems find blocks via indirection (block pointers), how could you design the file system to improve performance and reduce the number of accesses to the "hot" cylinders that hold directory and block pointer information?
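
As a starting point for problem 4, the inode layout described there can be sketched in Python. This is only an illustration of how logical block numbers map onto the direct, single, double, and triple indirect pointers. The problem does not state the size of a block pointer, so the 4-byte value below is an assumption, and the function names are hypothetical helpers, not part of any real file system API.

```python
BLOCK_SIZE = 4 * 1024   # 4 KB blocks, from the problem statement
N_DIRECT = 12           # direct pointers per inode, from the problem statement
POINTER_SIZE = 4        # ASSUMPTION: 4-byte block pointers (not given in the problem)


def level_capacities(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE,
                     n_direct=N_DIRECT):
    """Return how many data blocks each indirection level can address."""
    ptrs = block_size // pointer_size  # pointers that fit in one index block
    return {
        "direct": n_direct,
        "single": ptrs,
        "double": ptrs ** 2,
        "triple": ptrs ** 3,
    }


def level_of_block(index, **kwargs):
    """Return (level name, disk reads needed) for logical block `index`.

    The inode is assumed to be in memory already, so a direct block costs
    one read and each extra level of indirection costs one more.
    """
    reads = 1
    for name, capacity in level_capacities(**kwargs).items():
        if index < capacity:
            return name, reads
        index -= capacity  # skip past the blocks covered by this level
        reads += 1
    raise ValueError("block index exceeds the maximum file size")
```

For example, level_of_block(12) returns ("single", 2) under these assumptions: the first twelve blocks are direct, so block 12 is the first one that needs an index block to be read before the data block. If you assume a different pointer size, pass it as pointer_size and state that assumption in your answer.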


Last updated 2 Dec 2003 by Ethan L. Miller (elm@ucsc.edu)