CMPS 232: Distributed Systems
Final Projects
Overview
Each student in the class must complete a project on a topic related to distributed systems. This is a broad area: fault tolerance, security, storage, code mobility, caching and replication, and naming are all fair game. The project can be an implementation or design project (preferred) or a survey of existing work in the field. If you do a survey, your grade in the class is limited to a B.
Project Topics
Your project may be on any topic related to distributed systems, so you have a lot of leeway. As noted in the syllabus, survey papers are acceptable but limit your class grade to a B.
Potential project topics are listed below. You're encouraged to pick a topic not listed if you have one you'd like to work on.
- Write a peer-to-peer system that can exchange file fragments using machine learning techniques to adaptively pick the "best" location for the fragments (contact qxin at soe dot ucsc dot edu for more info)
- Investigate the use of WebDAV for a wide-area file system. What's needed to make this work?
- Develop an rsync-like system that synchronizes your files whenever you connect to the network. This should be as fast as possible when you're connected, and should minimize bandwidth and time online for the sync. Currently, rsync requires a good deal of time "online" to compare files; this implementation should keep separate lists and transfer as little data as possible as quickly as possible. It should also be compatible with existing file systems. [thanks to Scott Brandt for the topic]
- Compare existing distributed failure detection and notification techniques, and see how they scale to a system with 10,000+ nodes.
- Investigate protocols for discovering new resources (i.e., newly added disks) in a large-scale archive and for moving data around in such an archive.
- Explore secure protocols for large-scale (~10,000 disks or more) storage systems, focusing on approaches that limit an attacker's ability to damage the system (data loss or other effects) with pinpointed attacks.
- Investigate techniques for long-term data preservation in a scalable (10,000 disk) storage system. This should include data preservation using aggressive m/n erasure coding and protocols for locating data in such a system.
- Additional topics to be added shortly...
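As a starting point for the rsync-like topic above, the idea of keeping "separate lists" can be sketched roughly as follows. This is only an illustration under stated assumptions, not a required design: each side keeps a manifest mapping file paths to content digests, and the client compares its current manifest against a saved copy of the remote one while offline, so that only the changed files need to be sent once a connection is available. The function names `build_manifest` and `plan_sync` are hypothetical, and the use of SHA-256 over whole files is an assumption made for brevity; rsync itself works at the block level with rolling checksums.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents.

    Assumption: files are small enough to read whole; a real system would
    hash in chunks and probably cache digests keyed by size and mtime.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(full, root)] = digest
    return manifest


def plan_sync(local, remote_snapshot):
    """Compare two manifests offline; return (paths to send, paths to delete).

    `remote_snapshot` is the manifest saved after the last successful sync
    (a hypothetical convention for this sketch).
    """
    # Send anything that is new locally or whose digest differs.
    to_send = sorted(p for p, d in local.items() if remote_snapshot.get(p) != d)
    # Delete anything the remote side has that no longer exists locally.
    to_delete = sorted(p for p in remote_snapshot if p not in local)
    return to_send, to_delete
```

A project along these lines would still need to handle changes made on both sides, partial-file transfers, and interrupted connections, none of which this sketch attempts.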
Deliverables
Final paper
You must write a final paper on your project. This paper should be about 8 pages of 10-point, two-column text, including figures. In other words, it should be similar to a (relatively short) conference paper. An author kit will be available online in early May. The author kit has information on preparing your paper in LaTeX, which is strongly encouraged. You may, however, use any document system you like, as long as the final paper is submitted as a PDF.
Poster presentation
More on this in a bit....
Deadlines
Please see the schedule for a list of deadlines. The schedule is the final authority for deadlines; any changes to deadlines will be made on the schedule but might not be reflected elsewhere.
Last updated 8 Apr 2005 by Ethan L. Miller (elm-at-ucscXd0tXedu)