High Performance Computing

About the Course

Welcome to CSV880, Special Topics in Parallel Computing.

Parallelism has been employed for many years, mainly in high-performance computing. Interest in it has grown lately due to the physical constraints preventing frequency scaling. Parallel Computers can be roughly classified as either (i) shared-memory systems: multi-core/multi-processor computers having multiple processing elements in a single machine; or (ii) distributed-memory systems: typically using multiple computers to work on the same task. Typically, practical systems employ hybrid systems comprising of distributed memory systems with each computer comprising of shared memory system.

Distributed-memory systems are more scalable than shared-memory systems and are therefore more frequently used in Supercomputers. Such large scale systems typically comprise of a massive number of dedicated processors placed in close proximity to each other and connected using a special dedicated network such as a mesh or hypercube architecture. This saves considerable time in moving data around and makes it possible for the processors to work together. Supercomputers play an important role in computational science and are used in a wide range of fields, including weather forecasting, climate research, oil and gas exploration, and physical simulations.

Developing parallel programs involves aspects related to designing parallel algorithms, performing efficient communication and synchronization amongst the processors. In this course, we will be studying some basic computational routines that are used in Supercomputing (solving system of linear equations, fast Fourier transforms and graph problems). We will study algorithms for solving these problems in parallel and the underlying communication patterns. We will study various network topologies (mesh, hypercube, dragonfly) and relate the above communication patterns to these topologies.

Announcements

January 21, 2014
Organizational class in 501

Course Contents

The tentative contents for the course are outlined below:

1. Introduction to High Performance Computing
- - Focus on Distributed Memory Systems (Architecture)
- - Network Interconnects and message passing paradigm
- - Applications and computation kernels (FFT, LU)
- - Basic Data Decomposition Strategies (Block, Cyclic, Block-Cyclic)

2. Introduction to network interconnects
- - Hypercube
- - Mesh and Torus networks
- - Dragonfly networks
- - Communication collectives and primitives

3. Algorithms for Computation Kernels
- - Solving System of Linear Equations (and associated communication patterns)
- - Fast Fourier Transforms (and associated communication patterns)
- - Graph Algorithms (BFS) (and associated communication patterns)

4. Communication Primitives and Collectives
- - Blocking and Non-blocking sends/receives
- - Broadcasts, Reduce, All-to-all, Gather/Scatter
- (with Algorithms for different topologies)

+ Simulator based assignment
(implementing communication patterns for different topologies)

References, Resources, Assignments

Reference Book: Introduction to Parallel Computing by Grama, Karypis, Kumar, Gupta
MPI Tutorial @ LLNL
Assignment 1 due date extended to 3rd March (no more extensions).

    Format for submission:
    Please create a zip/tgz containing the following:
    1. Source Files (no binaries)
    2. makefile or script that compiles the code
    3. example file called "example
    4. a file named "run" that contains a valid command to run the mpi based code on the example file.
        The code should work for different example files and run time parameters!

    The name of the zip should be "Roll No-date of submission" e.g. "csm2009012-28-02-2014".
    Please submit by email to Priyanka.

    Penalty for late submission:
    20% till 5 March, 50% till 10 March, 100% beyond.
Assignment 2 due date 13th April.

Download code.zip here

    Format for submission:
    Please create a zip/tgz containing the following:
    1. Source Files (no binaries)
    2. makefile or script that compiles the code
    3. a file named "run" that contains a valid command to run the code.
        The code should work for different torus sizes (>2)!

    The name of the zip should be "Asg2-Roll No-date of submission" e.g. "Asg2-csm2009012-28-02-2014".
    Please submit by email to Priyanka.

    Penalty for late submission:
    20% till 20 April, 50% till 25 April, 100% beyond.