# 263-2300-00L How To Write Fast Numerical Code

Semester | Frühjahrssemester 2016 |

Dozierende | M. Püschel |

Periodizität | jährlich wiederkehrende Veranstaltung |

Lehrsprache | Englisch |

Kommentar | Prerequisite: Master student, solid C programming skills. |

Kurzbeschreibung | This course introduces the student to the foundations and state-of-the-art techniques in developing high performance software for numerical functionality such as linear algebra and others. The focus is on optimizing for the memory hierarchy and for special instruction sets. Finally, the course will introduce the recent field of automatic performance tuning. |

Lernziel | Software performance (i.e., runtime) arises through the interaction of algorithm, its implementation, and the microarchitecture the program is run on. The first goal of the course is to provide the student with an understanding of this interaction, and hence software performance, focusing on numerical or mathematical functionality. The second goal is to teach a general systematic strategy how to use this knowledge to write fast software for numerical problems. This strategy will be trained in a few homeworks and semester-long group projects. |

Inhalt | The fast evolution and increasing complexity of computing platforms pose a major challenge for developers of high performance software for engineering, science, and consumer applications: it becomes increasingly harder to harness the available computing power. Straightforward implementations may lose as much as one or two orders of magnitude in performance. On the other hand, creating optimal implementations requires the developer to have an understanding of algorithms, capabilities and limitations of compilers, and the target platform's architecture and microarchitecture. This interdisciplinary course introduces the student to the foundations and state-of-the-art techniques in high performance software development using important functionality such as linear algebra functionality, transforms, filters, and others as examples. The course will explain how to optimize for the memory hierarchy, take advantage of special instruction sets, and, if time permits, how to write multithreaded code for multicore platforms. Much of the material is based on state-of-the-art research. Further, a general strategy for performance analysis and optimization is introduced that the students will apply in group projects that accompany the course. Finally, the course will introduce the students to the recent field of automatic performance tuning. |