EECE.6520 Parallel & Mp Architect

(a.k.a Parallel Computing)

Credits: 3

Contact Hours: 3 hours of lecture

Instructor: Seung Woo Son

Textbooks:

  • Required: P. Pacheco, An Introduction to Parallel Programming, Morgan Kaufmann, 2011.
  • Recommended:
    • B. Chapman, G. Jost and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming, The MIT Press, 2008.
    • David B. Kirk and Wen-mei Hwu, Programming Massively Parallel Processors: A Hands-on Approach, 2nd Edition, Morgan Kaufmann, 2014.
    • B. Gropp, T. Hoefler, R. Thakur, and E. Lusk, Using Advanced MPI: Modern Features of the Message-Passing Interface, MIT Press, 2014.

Other supplemental materials: All supplemental materials can be found on the UMass Lowell Blackboard portal. These materials include lecture slides, handouts, recordings, assignments, quizzes, and other documentation.

Course Catalog Description:

This course is an introduction to parallel computing for engineers and scientists, covering basic concepts, widely used parallel computing systems, and parallel programming models. With the prevalence of multi-core processors and recent accelerators such as NVIDIA GPUs and Intel Xeon Phi coprocessors, parallel programming is everywhere; even laptops and cellphones have several processing cores. Application programmers therefore need a productive way to express their computations in order to exploit the available parallel computing power.

The course will be structured as lectures, homework, programming assignments, and a final project. Students are expected to code several programming projects using selected parallel programming models and measure their performance. The final project will implement parallel algorithms/programs using one or more programming models.

Prerequisites: There is no formal prerequisite, but basic programming skills (e.g., EECE.2160 ECE Application Programming) are recommended. An understanding of computer architecture (e.g., EECE.4820 Computer Architecture and Design) is a plus.

Grading: Attendance (5%), Written and Programming Assignments (50%), Exams (25%), Course project (20%)

Required or elective? This course may be used as a technical elective for Computer Engineering majors.

Course Outcomes:

By the end of this course, students will be able to:

  1. Analyze a given problem for various parallel computing architectures and design a suitable parallel algorithm using an appropriate parallel programming model.
  2. Understand the basics of shared memory parallel architectures and programming.
  3. Design a shared memory parallel program for a given parallel algorithm using both explicit (Pthreads) and implicit (OpenMP) parallel programming, and evaluate its performance.
  4. Understand the basics of distributed memory parallel architectures and programming.
  5. Design a message-passing distributed memory parallel program for a given parallel algorithm using the portable Message-Passing Interface (MPI), and evaluate its performance.
  6. Understand the basics of accelerator-based parallel architectures and programming.

Course Topics

  • Parallel computing architecture: General concepts of parallel and distributed computing architecture, weak and strong scaling, task and data partitioning, performance evaluation, job scheduling and submission on HPC systems.
  • Distributed memory architecture and programming: General notion of distributed memory architecture, Message-Passing Interface (MPI) programming.
  • Shared memory architecture and programming: General notion of shared memory architecture, OpenMP/threads programming.
  • Heterogeneous architecture and programming: General notion of parallel architecture with accelerators such as GPUs and FPGAs, CUDA/OpenCL programming.
  • Parallel algorithms: N-body solver, tree search problem, etc.


Note that the schedule below is tentative and based on 14 meetings.

| Week | Topic | Reading | Assignment |
|------|-------|---------|------------|
| 1 | Course overview; Parallel computing architecture (1/2) | Chapters 1 and 2 | HW1 |
| 2 | Parallel computing architecture (2/2) | Chapter 2 | |
| 3 | XSEDE tutorial and lab | | |
| 4 | Distributed memory programming (1/3) | Chapter 3 | HW2 |
| 5 | Distributed memory programming (2/3) | Chapter 3 | Programming 1; project idea |
| 6 | Distributed memory programming (3/3) | Chapter 3 | |
| 7 | Midterm exam | | |
| 8 | Shared memory programming (Pthreads) | Chapter 4 | Programming 2; project proposal |
| 9 | Shared memory programming (OpenMP) (1/2) | Chapter 5 | |
| 10 | Shared memory programming (OpenMP) (2/2) | Chapter 5 | Programming 3 |
| 11 | CUDA (1/2) | Course material by NVIDIA | |
| 12 | CUDA (2/2) | | Programming 4 |
| 13 | Parallel algorithms | Chapter 6 | |
| 14 | Project presentation | | |
| Exam week | Final exam | | |