CSE 710 – Wide Area Distributed File Systems

Spring 2016 – Project Ideas

 

Project-1: Design and Implementation of a Serverless Distributed File System for Smartphones: 


 

In this project, the students will develop a distributed file system for file access and sharing across multiple Android smartphones. This will be a serverless file system: it will not require any external server component, and none of the participating phones will act as a server. In that sense, it will be a peer-to-peer (p2p) distributed file system with a POSIX interface. Each phone will be able to export certain portions of its local file system to other users (i.e., enable data sharing), and other phones will be able to locate and import/mount those remote files and directories into their own local file systems. Performance and scalability will be the major design considerations. Authentication and authorization of remote clients will also be an important component of the project. The participating phones can be connected through either Wi-Fi or 4G.
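
One small piece of the export mechanism described above is deciding whether a path requested by a remote peer falls inside a directory the phone has chosen to share. The following is a minimal sketch of that check, not a prescribed design. The function name is_exported and the example paths are hypothetical. It assumes export roots are absolute POSIX paths, and it does not resolve symbolic links, which a real implementation would also have to handle.

```python
import posixpath


def is_exported(requested_path, export_roots):
    """Return True if requested_path lies inside one of the exported directories.

    Assumption: export_roots contains absolute POSIX paths (hypothetical setup).
    """
    # Anchor the request at "/" and collapse ".." segments so that a peer
    # cannot climb out of an exported directory.
    target = posixpath.normpath(posixpath.join("/", requested_path))
    for root in export_roots:
        root = posixpath.normpath(root)
        # commonpath compares whole path components, so "/sdcard/DCIMX"
        # is not mistaken for a child of "/sdcard/DCIM".
        if posixpath.commonpath([target, root]) == root:
            return True
    return False


if __name__ == "__main__":
    exports = ["/sdcard/DCIM"]  # hypothetical exported directory
    print(is_exported("/sdcard/DCIM/photo.jpg", exports))
    print(is_exported("/sdcard/DCIM/../private/notes.txt", exports))
```

A check like this would sit behind the authentication and authorization step, so that only requests from permitted peers ever reach it.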

 

Project-2: Energy Efficiency in Mobile Systems: 

 

Every year, we move more than 1 zettabyte of data over the Internet globally, which consumes several terawatt-hours of electricity and costs the world economy billions of US dollars. Mobile systems are responsible for a good portion of this global data movement as well as of the associated energy consumption. It is possible to make mobile data transfers more energy efficient without any performance degradation, and thereby to reduce the overall cost of data movement drastically. In this project, you will analyze several application-layer parameters that affect throughput and power consumption in mobile data transfers, and develop techniques to minimize energy consumption without penalizing performance.
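
The trade-off above can be made concrete with a simple model: the energy of a transfer is the average power drawn multiplied by the transfer time, and the transfer time is the data size divided by the throughput. The sketch below applies that model to pick, among several measured settings, the one with the lowest energy that still meets a throughput floor. The function names, the setting labels, and all numbers are hypothetical placeholders, not measurements.

```python
def transfer_energy_joules(size_bytes, throughput_bytes_per_s, avg_power_watts):
    """Energy = average power x duration, with duration = size / throughput."""
    duration_s = size_bytes / throughput_bytes_per_s
    return avg_power_watts * duration_s


def pick_setting(measurements, size_bytes, min_throughput_bytes_per_s):
    """Return the label of the lowest-energy setting that meets the throughput floor.

    measurements: list of (label, throughput_bytes_per_s, avg_power_watts) tuples.
    Returns None if no setting is fast enough.
    """
    candidates = [m for m in measurements if m[1] >= min_throughput_bytes_per_s]
    if not candidates:
        return None
    best = min(
        candidates,
        key=lambda m: transfer_energy_joules(size_bytes, m[1], m[2]),
    )
    return best[0]


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    measured = [
        ("setting A", 5_000_000, 1.5),
        ("setting B", 10_000_000, 2.0),
        ("setting C", 11_000_000, 3.0),
    ]
    print(pick_setting(measured, 100_000_000, 5_000_000))
```

Note that the fastest setting is not necessarily the cheapest one in energy terms: a setting that draws much more power for a small gain in throughput can cost more joules per transfer.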

 

Project-3: Distributed Object Storage for Genomics:

 

As in several other areas of science, the data generation rate in genomics has been growing exponentially, and will reach zettabyte scale in a few years. For this reason, efficient storage, search, and retrieval of genomics data has been a big challenge for domain scientists. In this project, the students will study, analyze, and propose a new distributed object storage system that organizes distributed collections of genomes, identified variants, and other relevant data (e.g., medical annotations) into an easily discoverable, searchable, and accessible data store, spanning multiple geographic sites and exploiting domain experts' knowledge about the data. This data store will eliminate the need for centralized data gathering, and will scale as more sites and genomes are added.
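
One way to picture a store that avoids centralized data gathering is a federated search: each site keeps its own objects and metadata, and a query is sent to every site and the hits are merged. The sketch below illustrates only that idea and is not the system to be proposed. The Site class, the federated_search function, the object identifiers, and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One geographic site holding its own objects (hypothetical model)."""
    name: str
    objects: dict = field(default_factory=dict)  # object_id -> metadata dict

    def put(self, object_id, metadata):
        self.objects[object_id] = metadata

    def search(self, **criteria):
        # Return ids of local objects whose metadata matches every criterion.
        return [
            oid
            for oid, meta in self.objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


def federated_search(sites, **criteria):
    """Query every site and merge the results; no data leaves its home site."""
    hits = []
    for site in sites:
        for oid in site.search(**criteria):
            hits.append((site.name, oid))
    return sorted(hits)


if __name__ == "__main__":
    site_a = Site("site-a")
    site_b = Site("site-b")
    # Placeholder identifiers and metadata, for illustration only.
    site_a.put("object-1", {"kind": "genome", "species": "example"})
    site_b.put("object-2", {"kind": "variant", "species": "example"})
    site_b.put("object-3", {"kind": "genome", "species": "example"})
    print(federated_search([site_a, site_b], kind="genome"))
```

Adding a site here only means adding one more entry to the list that is queried, which is the scaling property the project description asks for. A real design would also need indexing, replication, and access control.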

 

Project-4: Benchmarking Distributed File Systems:

 

You will choose any two distributed file systems you wish (these could also be mobile file systems) and benchmark them using standard benchmarking tools (such as IOZone, Bonnie, Postmark, TPC, Andrew, etc.) to evaluate and compare them in terms of operational throughput, access latency, availability, consistency, etc. You may choose which benchmarking tools to use, as long as they satisfy the requirements.
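
To make the throughput and latency metrics concrete, the following is a minimal sketch of a write micro-benchmark that can be pointed at any directory, for example the mount point of a file system under test. It is not a substitute for the standard tools listed above. The function name benchmark_writes and its parameters are hypothetical, and the default file count and size are arbitrary.

```python
import os
import tempfile
import time


def benchmark_writes(directory, file_count=20, file_size=64 * 1024):
    """Write file_count files of file_size bytes and report simple metrics."""
    payload = os.urandom(file_size)
    latencies = []
    paths = []
    start = time.perf_counter()
    for i in range(file_count):
        path = os.path.join(directory, f"bench_{i}.dat")
        paths.append(path)
        t0 = time.perf_counter()
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the OS page cache
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    # Clean up the benchmark files so the directory is left as it was found.
    for path in paths:
        os.remove(path)
    total_bytes = file_count * file_size
    return {
        "bytes_written": total_bytes,
        "throughput_bytes_per_s": total_bytes / elapsed,
        "mean_latency_s": sum(latencies) / len(latencies),
    }


if __name__ == "__main__":
    # A temporary local directory stands in for the mount point under test.
    with tempfile.TemporaryDirectory() as mount_point:
        print(benchmark_writes(mount_point))
```

Running the same function against the mount points of the two chosen file systems gives directly comparable numbers, although results from such a small workload should be read with caution.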