Introduction to Parallel and Distributed Processing

Welcome to CSE 470/570 "Intro PDP", an introductory course for those who want to learn
how to efficiently use modern parallel and distributed systems.

Intro PDP will be offered in Fall 2017!

Seats are still available.
You can register via UB HUB.

Fall 2017

Course Resources

Multiple course resources are available!

Course resources include lecture videos and slides, course readings, and useful software tools. Note that you will have to authenticate to access this content. Ask your instructor if you have not yet received access instructions.


Dr. Jaroslaw 'Jaric' Zola


Course Overview

This course is intended for students interested in the efficient use of modern parallel systems, ranging from multi-core and many-core processors to large-scale distributed memory clusters. The course puts equal emphasis on the theoretical foundations of parallel computing and the practical aspects of different parallel programming models. It begins with a survey of common parallel architectures and types of parallelism, followed by an overview of formal approaches to assessing the scalability and efficiency of parallel algorithms and their implementations. In the second part, the course covers the most common current parallel programming techniques and APIs, including those for shared address space systems, many-core accelerators, distributed memory clusters, and big data analytics platforms. Each component of the course involves solving practical computational and data-driven problems, ranging from basic algorithms like sorting or searching to graph and numerical data analysis.


The course consists of a series of lectures organized into five topical modules. Each lecture module is complemented with a programming assignment exposing practical aspects of the covered material. The course outline is provided below:

  1. Overview of parallel processing landscape: why and how, types of parallelism, Flynn’s taxonomy and brief overview of parallel architectures, Exascale computing vs. Exascale data, practical demonstration of CCR as an example HPC center. (3 lectures)
     Basic concepts in parallel processing: formal definition of parallelism, concepts of work, speedup, efficiency, overhead, strong and weak scalability (Amdahl’s law, Gustafson’s law), practical considerations using parallel sum and parallel prefix. (4 lectures)
  2. Multi-core programming: shared memory and shared address space, data and task parallelism, Cilk+, OpenMP, Intel TBB data structures (time permitting), parallel merge sort, pointer jumping, parallel BFS. (9 lectures)
  3. Distributed memory programming: Message Passing Interface (including one-sided communication, derived datatypes and MPI-IO), interconnect topologies, latency + bandwidth model, parallel matrix-vector product, parallel connected components, sample sort. (9 lectures)
  4. Higher-level programming models: MapReduce, Apache Spark and Resilient Distributed Datasets, Bulk Synchronous Parallel model, Pregel and Apache GraphX, triangle counting, connected components, single source shortest path. (9 lectures)
  5. Many-core programming: SIMD parallelism, massively parallel GPGPU accelerators, data movement and organization, matrix-matrix product, connected components. (6 lectures)
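
To give a flavor of the scalability concepts listed in the first module, here is a minimal Python sketch of the textbook forms of Amdahl's law (fixed problem size, strong scaling) and Gustafson's law (problem size growing with the processor count, weak scaling). It is an illustration only, not course material: the function names and the `serial_fraction` parameter are hypothetical, and the formulas assume an idealized program with no communication or synchronization overhead.

```python
def amdahl_speedup(serial_fraction: float, p: int) -> float:
    """Amdahl's law: speedup on p processors for a fixed problem size.

    serial_fraction is the share of the sequential run time that
    cannot be parallelized (an assumed, idealized quantity).
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


def gustafson_speedup(serial_fraction: float, p: int) -> float:
    """Gustafson's law: scaled speedup when work grows with p.

    Here serial_fraction is the share of the parallel run time
    spent in the serial part.
    """
    return serial_fraction + (1.0 - serial_fraction) * p


def efficiency(speedup: float, p: int) -> float:
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup / p


if __name__ == "__main__":
    s = 0.1  # hypothetical serial fraction, chosen only for illustration
    for p in (1, 10, 100, 1000):
        sa = amdahl_speedup(s, p)
        sg = gustafson_speedup(s, p)
        print(f"p={p:5d}  Amdahl={sa:8.3f} (eff {efficiency(sa, p):.3f})  "
              f"Gustafson={sg:8.3f} (eff {efficiency(sg, p):.3f})")
```

Under Amdahl's law the speedup is bounded by 1 / serial_fraction no matter how many processors are added, while under Gustafson's law the scaled speedup keeps growing linearly with p.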


You can download the full syllabus from here.


© 2017 Jaroslaw Zola