UB - University at Buffalo, The State University of New York Computer Science and Engineering

CSE 726: Hot Topics in Cloud Computing

This page refers to the Fall 2013 offering of CSE 726 only. The information on this page does not necessarily apply to every offering of CSE 726.

Fall 2013

19475

We will read and present papers covering five major topics in cloud computing (see below). Each student is expected to make 2-3 presentations and do a term project.

Major Topics to be covered:

1. Cloud Computing Application issues, including i) Google-like search/query and data processing using MapReduce/Hadoop, ii) Amazon-like Infrastructure as a Service (IaaS), iii) High Performance Computing cloud apps, and iv) other apps (SaaS, PaaS, etc.). The key is to understand their differences/similarities, workload characteristics, computing and communications resource requirements, and cost-performance criteria (failure tolerance, availability requirements, delay/throughput, etc.), as well as the suitability/feasibility of multi-data-center computing, public and private cloud computing, and multiple, heterogeneous cloud computing.

2. Cloud Computing file/storage systems such as the Google File System, the Hadoop Distributed File System (HDFS), and the General Parallel File System (GPFS), and database systems that are column-based, document-based, relational, key/value, graph-based, etc. The key is to understand their pros/cons, how they can support distributed computing and virtualization, their resource requirements, and their performance criteria (including consistency).

3. Middleware (Operational, Management and Kernel) issues, including i) admission control and resource allocation, ii) advance reservation and scheduling, iii) support for multi-tenancy and virtual data centers (VDCs) via VM-to-server mapping and VLAN establishment, iv) performance and failure monitoring, and v) disaster and failure recovery. The key is to understand the cost/performance tradeoffs, including overbooking, risk management, availability guarantees and other general Service Level Agreement (SLA) terms, and disaster or failure recovery, and how the availability/reliability requirements impact resource allocation, reservation/scheduling and server consolidation.

4. Virtualization/Hypervisor issues related to i) server/machine (including CPU/memory), disk storage, network, and I/O virtualization, ii) backup and checkpointing mechanisms, and iii) migration to optimize performance/power and the overhead involved. The key is to understand the performance impact/overhead of virtualization, specifically that of running multiple VMs on a single server (and how to minimize such impact/overhead), how much additional computing and communications bandwidth is needed for backup/checkpointing and migration, and how much delay is involved in these operations (and how to speed them up).

5. Datacenter networking issues related to i) topology (symmetric, scale-free, expandable, etc.), ii) capacity (oversubscription ratio), iii) redundancy (failure tolerance/multi-path), iv) technology (optical, wireless, hybrid), v) inter-DCN vs. intra-DCN, and vi) TCP variants and performance. The key here is to understand application requirements (from Topic 1), suitable routing and transport-layer protocols, caching/content distribution strategies (in the case of multiple datacenters), and cost-performance tradeoffs.

CSE 589 or equivalent. Good knowledge of operating systems and distributed systems.

Ph.D.: This course does not fulfill core area or core course requirements.

M.S.: This course does not fulfill core area or core course requirements.
