Introduction to High Performance Computing (Softwaregrundlagen HPC) - SS 2019
High Performance Computing is about implementing numerical algorithms so that they run as close as possible to the theoretical peak performance of the underlying hardware. This requires adapting the algorithms to the hardware, a topic that the usual lectures on numerical analysis or numerical linear algebra do not cover.
It turns out that performance depends mainly on a few elementary linear algebra operations. These operations became known as BLAS (Basic Linear Algebra Subprograms). An efficient BLAS implementation is therefore crucial for scientific computing and scientific applications. Both commercial (Intel MKL, AMD ACML, Sun Performance Library, ...) and open source (BLIS, ATLAS, ...) implementations of BLAS are available.
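To give a rough idea of what such an elementary operation looks like, here is a minimal, unoptimized sketch in C of the matrix-matrix product C <- beta*C + alpha*A*B, in the spirit of the BLAS routine dgemm. The function name `gemm_ref` and its simplified signature (column-major storage, no transposition flags) are assumptions made for illustration; they are not the ulmBLAS interface.

```c
#include <stddef.h>

/*
 * Hypothetical reference kernel: C <- beta*C + alpha*A*B.
 * A is m x k, B is k x n, C is m x n. All matrices are stored
 * column-major with leading dimensions ldA, ldB, ldC.
 */
void
gemm_ref(size_t m, size_t n, size_t k,
         double alpha,
         const double *A, size_t ldA,
         const double *B, size_t ldB,
         double beta,
         double *C, size_t ldC)
{
    for (size_t j = 0; j < n; ++j) {
        /* Scale column j of C; for beta == 0 overwrite C without reading it. */
        for (size_t i = 0; i < m; ++i) {
            C[i + j*ldC] = (beta == 0.0) ? 0.0 : beta * C[i + j*ldC];
        }
        /* Add alpha * A * B(:,j), one column of A at a time. */
        for (size_t l = 0; l < k; ++l) {
            double t = alpha * B[l + j*ldB];
            for (size_t i = 0; i < m; ++i) {
                C[i + j*ldC] += t * A[i + l*ldA];
            }
        }
    }
}
```

The innermost loop walks down columns, which matches the column-major layout. A straightforward version like this is correct but typically runs far below peak performance for large matrices.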
So if a lecture is called High Performance Computing, it has to deal with BLAS! One way of dealing with BLAS is merely reading papers and using existing BLAS implementations as a black box. But if you really want to understand it, you have to implement your own BLAS! We will call this implementation ulmBLAS, and it will be on par with commercial implementations.
Developing your own ulmBLAS means that you will start with an empty source file, so this is a hands-on class. It covers the following topics:
- Introduction to the programming languages C, assembly and Fortran.
- Concepts of compilers and linkers.
- Hardware architectures.
- Cache optimization of numerical methods.
- SIMD (Single Instruction Multiple Data) programming with SSE, AVX, AVX2.
- Instruction pipeline optimization.
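
As a small taste of the cache optimization topic, the following sketch in C computes C <- C + A*B block by block, so that each block is reused while it is still likely to be in cache. The name `gemm_blocked`, the helper `min_size` and the block size `BLOCK = 32` are assumptions for illustration only; the block size is not tuned for any particular machine.

```c
#include <stddef.h>

enum { BLOCK = 32 };   /* assumed, untuned block size */

static size_t
min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical blocked update C <- C + A*B.
 * A is m x k, B is k x n, C is m x n, all column-major
 * with leading dimensions ldA, ldB, ldC.
 */
void
gemm_blocked(size_t m, size_t n, size_t k,
             const double *A, size_t ldA,
             const double *B, size_t ldB,
             double *C, size_t ldC)
{
    for (size_t jb = 0; jb < n; jb += BLOCK) {
        size_t nb = min_size(BLOCK, n - jb);
        for (size_t lb = 0; lb < k; lb += BLOCK) {
            size_t kb = min_size(BLOCK, k - lb);
            for (size_t ib = 0; ib < m; ib += BLOCK) {
                size_t mb = min_size(BLOCK, m - ib);
                /* Multiply an mb x kb block of A with a kb x nb block of B. */
                for (size_t j = jb; j < jb + nb; ++j) {
                    for (size_t l = lb; l < lb + kb; ++l) {
                        double t = B[l + j*ldB];
                        for (size_t i = ib; i < ib + mb; ++i) {
                            C[i + j*ldC] += t * A[i + l*ldA];
                        }
                    }
                }
            }
        }
    }
}
```

This only shows the loop structure of blocking. High-performance implementations such as BLIS additionally copy blocks into contiguous buffers and hand the innermost work to a small hand-optimized kernel.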