CSE 710

Wide Area Distributed File Systems

Spring 2012

 

 

Instructor:

 

Prof. Tevfik Kosar

Office: 245 Bell Hall

Phone: 645-2323

Email: tkosar@buffalo.edu

Office hours: Fri 11:30am-1:00pm

 

Course Description:

 

As the data requirements of commercial and scientific applications continue to increase, the ability to share large amounts of data across widely distributed sites (e.g., data centers, clouds, clusters, supercomputers) becomes more and more important.

 

This seminar will discuss state-of-the-art research, development, and deployment efforts in wide-area distributed file systems on clustered, grid, and cloud infrastructures. We will read and discuss two papers every week in one of the following areas:

 

·      File System Design Decisions

·      Performance, Scalability, and Consistency issues in File Systems

·      Traditional Distributed File Systems

·      Parallel Cluster File Systems

·      Wide Area Distributed File Systems

·      Cloud File Systems

·      Commercial vs Open Source File System Solutions

 

Course Location and Time:

 

The seminar will be held on Mondays, 12:00pm-2:00pm, in 113A Davis Hall. The first class will be on Monday, January 23, 2012.

 

Reading List:

 

The reading list for this seminar is available here.

 

Grading:

 

This is a research course. There will be no exams and no projects (unless there is a request from individual students for a term project). Each student will present 1 paper and will write reviews for 2 others. Each student is expected to read all papers, submit questions and comments about the papers, attend classes, and join the discussion of the papers. Grading will be P/F.

 

Useful Links:

 

·      How to Read a Paper, by S. Keshav.

·      Reviewing a Technical Paper, by M. Ernst

 

Paper Review Format Guidelines:

 

·      1 paragraph executive summary (what are the authors trying to achieve? potential contributions of the paper?)

·      2-3 paragraphs of details (key ideas? motivation & justification? strengths and weaknesses? technical flaws? supported with results? comparison with other systems? future work? anything on which you disagree with the authors?)

·      1-2 paragraphs summarizing the discussions in the class.

 

Course Blog:

 

All paper reviews will be posted on the course blog at http://cse710.blogspot.com/. Please make sure you visit this blog regularly. Also, do not forget to post your questions and comments on the papers to be discussed by midnight every Friday.

 

 

Seminar Schedule:

 

| Date | Week | Papers to be Discussed | Presenter | Reviewers |
|---|---|---|---|---|
| Jan. 23 | 1 | Introduction: Wide Area Distributed File Systems | Kosar | |
| Jan. 30 | 2 | [1] The Sun Network File System: Design, Implementation and Experience (NFS) | Prabhat | Nandavanam, Agrawal |
| | | [2] Scale and Performance in a Distributed File System (AFS) | Kapoor | Harinathagupta, Murali |
| Feb. 6 | 3 | [3] Disconnected Operation in the Coda File System | Bachhav | Grama Prasad, Talbar |
| | | [4] Serverless Network File Systems (xFS) | Chakka | Syed, Ochani |
| Feb. 13 | 4 | [5] PVFS: A Parallel File System for Linux Clusters | Inamdar | Zhang, Rudraraju |
| | | [6] Lustre: A Scalable, High-Performance File System | Grama Ramaprasad | Dabhere, Sreevathsa |
| Feb. 20 | 5 | [7] GPFS: A Shared-Disk File System for Large Computing Clusters | Srinath | Ross, Sural |
| | | [8] Scalable Performance of the Panasas Parallel File System | Arslan | Murali, Kapoor |
| Feb. 27 | 6 | [9] Nache: Design and Implementation of a Caching Proxy for NFSv4 | Yildiz | Agrawal, Chakka |
| | | [10] Panache: A Parallel File System Cache for Global File Access | Ki | Desai Shridhar, Holavanalli |
| Mar. 5 | 7 | [11] OceanStore: An Architecture for Global-Scale Persistent Storage | Sankara Narayanan | Arslan, Inamdar |
| | | [12] Shark: Scaling File Servers via Cooperative Caching | Syed | Talbar, Seth |
| Mar. 12 | | | | |
| Mar. 19 | 8 | [13] Pangaea: a symbiotic wide-area file system | Agrawal | Ki, Srinath |
| | | [15] A Distributed File System for a Wide-Area High Performance Computing Infrastructure | Desai Shridhar / Murali | Yildiz, Sankara Narayanan / Ochani, Bachhav |
| Apr. 2 | 9 | [16] The Google File System | Sural | Kapoor, Nandavanam |
| | | [17] The Hadoop Distributed File System | Gupta | Prabhat, Arslan |
| | | [18] Ceph: A Scalable, High-Performance Distributed File System | Ross | Rudraraju, Ki |
| Apr. 9 | 10 | [19] Flexible, Wide-Area Storage for Distributed Systems with WheelFS | Zhang | Sreevathsa, Dabhere |
| | | [20] Bigtable: A Distributed Storage System for Structured Data | Nandavanam | Sural, Syed |
| | | [21] Dynamo: Amazon’s Highly Available Key-value Store | Talbar | Chakka, Harinathagupta |
| Apr. 16 | 11 | [22] PNUTS: Yahoo’s Hosted Data Serving Platform | Ochani | Holavanalli, Ross |
| | | [23] Cassandra – A Decentralized Structured Storage System | Rudraraju | Inamdar, Yildiz |
| Apr. 23 | 12 | [24] Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems | Sreevathsa | Seth, Desai Sridhar |
| | | [25] Megastore: Providing Scalable, Highly Available Storage for Interactive Services | Holavanalli | Gupta, Zhang |
| Apr. 30 | 13 | [26] Safety, Visibility, and Performance in a Wide-Area File System | Seth | Sankara Narayanan, Prabhat |
| | | [28] Energy-Efficiency and Storage Flexibility in the Blue File System | Debhere / Harinathagupta | Bachhav, Grama Ramaprasad / Srinath, Gupta |