CENG 443

Heterogeneous Parallel Programming

This course introduces GPU-based heterogeneous programming. It covers the basic concepts of parallel architectures and parallel programming, the CUDA programming model, and practical examples drawn from the fundamentals of accelerated computing with CUDA.

Course Objectives

1. Understand CPU-GPU heterogeneous architectures.

2. Design and implement heterogeneous parallel programs.

3. Understand GPU execution units and the memory hierarchy in order to improve execution performance.

4. Understand performance evaluation and optimization methods targeting GPU devices.

Recommended or Required Reading

– David B. Kirk, Wen-mei W. Hwu: Programming Massively Parallel Processors: A Hands-on Approach, Morgan Kaufmann Publishers

– NVIDIA Accelerated Computing Teaching Kit

– NVIDIA Deep Learning Institute (DLI) Workshops

Learning Outcomes

1. To demonstrate the ability to design and implement programs for heterogeneous parallel programming platforms

2. To demonstrate experience in applying parallel programming patterns to various programming problems

3. To be able to analyze parallel program performance and apply performance optimizations

Topics
Introduction
Fundamentals of Accelerated Computing with CUDA (DLI Workshop)
Fundamentals of Accelerated Computing with CUDA (DLI Workshop)
Fundamentals of Accelerated Computing with CUDA (DLI Workshop)
Midterm
Fundamentals of Accelerated Computing with CUDA (DLI Workshop)
Getting Started with AI on Jetson Nano
Getting Started with AI on Jetson Nano
Accelerating CUDA Applications with Multiple GPUs (DLI Workshop)
Accelerating CUDA Applications with Multiple GPUs (DLI Workshop)
Accelerating CUDA Applications with Multiple GPUs (DLI Workshop)
Project Presentations

Grading

Midterm: 20%

Assignments/Final Project: 40%

Final: 40%