High Performance Computing I

This lecture provides an introduction to parallel computer architectures and standard programming interfaces for parallel numerical algorithms. We focus on a few numerical algorithms, such as dense matrix multiplication and LU decomposition, which allow us to approach the theoretical peak performance step by step.


  • Introduction to the programming language C++ with a special focus on numerical linear algebra
  • Memory hierarchies
  • Parallel computer architectures
    • shared memory with POSIX threads and OpenMP
    • distributed systems with MPI
    • GPUs


  • Monday, 2 pm to 6 pm, Helmholtzstraße 18, room E.20, and in the labs Helmholtzstraße 18, room E.44, and O27/211.
  • Friday, 2 pm to 6 pm, Helmholtzstraße 22, room E.04, and in the labs Helmholtzstraße 18, room E.44, and O27/211.
  • The first lecture will be on 14 October 2019, 2 pm.


Linear algebra, calculus, numerical linear algebra, and programming are required. We expect some basic knowledge of C. Knowledge of C++ is not required, as we provide an introduction to C++. Introduction to High Performance Computing is recommended but not strictly required. Bear in mind, however, that the lab assignments will probably take considerably more time without the associated introductory lecture of the bachelor program.


The exams will be held orally in March and April 2020. An oral exam takes about 45 minutes. You are free to choose English or German as the exam language. Active and successful participation in the labs (at least half of the quizzes passed) and registration for the exam at HIS are required.

The following days are offered for oral exams:

  • 19 March 2020 (no slots left)
  • 24 March 2020 (no slots left)
  • 26 March 2020 (no slots left)
  • 31 March 2020 (no slots left)
  • 2 April 2020 (no slots left)

To register for one of the free slots (available in the morning and in the afternoon), send an email to Andreas F. Borchert.

All oral exams are cancelled until further notice because of the measures to contain the spread of the SARS-CoV-2 virus. Please check the current information about the handling of this crisis at our university and the current information from Dezernat II.


Resources and examples of the lectures will be posted here:

  • 14 October 2019: First steps with vectors in C: notes, session
  • 18 October 2019: First steps with matrices in C: notes, session
  • 21 October 2019: Selected BLAS level 1 functions, benchmarks and gnuplot: session
  • 25 October 2019: Simple cache optimizations: session
  • 28 October 2019: Simple cache optimizations for GEMM: session
  • 4 November 2019: Cache optimizations for GEMV: session
  • 8 November 2019: First steps with C++: session
  • 11 November 2019: C++ tools for managing memory buffers: session
  • 15 November 2019: Packing matrix blocks for an efficient GEMM implementation: session
  • 18 November 2019: GEMM micro kernel, GEMM macro kernel, GEMM frame routine: session
  • 22 November 2019: Generic classes, template functions, and static polymorphism: session
  • 25 November 2019: Function objects and lambda expressions: session
  • 29 November 2019: Unblocked LU factorization: session
  • 2 December 2019: More on vector and matrix classes: session
  • 6 December 2019: First steps with threads in C++: session
  • 9 December 2019: Mutex and condition variables: session
  • 13 December 2019: Thread pools (part one): session
  • 16 December 2019: Thread pools (part two): session
  • 20 December 2019: GEMM with AVX-optimized micro kernels: session
  • 23 December 2019: Another unblocked LU factorization, blocked LU factorization: session
  • 10 January 2020: Using MKL-BLAS for LU factorization, improved blocked LU factorization (divide and conquer): session
  • 13 January 2020: Introduction to OpenMP: session
  • 17 January 2020: Introduction to MPI: session
  • 20 January 2020: Transfer of vectors and matrices using MPI: session
  • 24 January 2020: Scatter and gather operations, asynchronous communication, two-dimensional grids: session
  • 27 January 2020: Distributed matrices (with scatter and gather operations): session
  • 31 January 2020: Distributed GEMM: session
  • 3 February 2020: Introduction to CUDA: slides, session
  • 7 February 2020: Virtual vs. physical GPU architecture, matrices: slides, session
  • 10 February 2020: Global synchronization and two-dimensional aggregation, CUDA streams: slides, session
  • 14 February 2020: A simple multigrid solver: session

Lab sessions

Lab sessions will be held twice a week. Each session comes with an online guide that summarizes the preceding lecture and provides practical exercises.

A registration at SLC for HPC I is necessary to participate in the lab sessions. Please get an SLC account through https://anmelden.mathematik.uni-ulm.de/ if you do not have one yet.

Some of the lab sessions come with a quiz, which is typically due a week later:


Number  Issued            Deadline                 PDF
1       25 October 2019   4 November 2019, 2 pm    quiz01.pdf
2       8 November 2019   15 November 2019, 2 pm   quiz02.pdf
3       18 November 2019  25 November 2019, 2 pm   quiz03.pdf
4       6 December 2019   13 December 2019, 2 pm   quiz04.pdf
5       10 January 2020   17 January 2020, 2 pm    quiz05.pdf


Dr. Andreas F. Borchert
Helmholtzstr. 20
Room 1.23

Dr. Michael Lehn
Helmholtzstr. 20
Room 1.09

M. Sc. Constantin Greif
Helmholtzstr. 20
Room 1.28