Onur Mutlu: Catalogue data in Autumn Semester 2022

Name: Prof. Dr. Onur Mutlu
Field: Computer Science
Address:
Dep. Inf.techno.u.Elektrotechnik
ETH Zürich, ETZ F 84
Gloriastrasse 35
8092 Zürich
SWITZERLAND
E-mail: onur.mutlu@safari.ethz.ch
URL: https://people.inf.ethz.ch/omutlu/
Department: Information Technology and Electrical Engineering
Relationship: Full Professor

Number | Title | ECTS | Hours | Lecturers
227-0085-34L Projects & Seminars: Exploration of Emerging Memory Systems (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | O. Mutlu
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: DRAM is predominantly used to build the main memory systems of modern computing devices. Emerging memory technologies (RRAM, PCM, STT-MRAM, FeRAM) provide an exciting opportunity to replace or complement DRAM. Simulation-based experimental studies are key for understanding the complex interactions between DRAM, emerging memory technologies, and modern applications. Ramulator is an extensible main memory simulator providing cycle-accurate performance models for a variety of commercial DRAM standards (e.g., DDR3/4, LPDDR3/4, GDDR5, HBM), emerging memory technologies, and academic proposals. Ramulator has a modular design that enables easy integration of additional standards, technologies and mechanisms. Ramulator is written in C++11 and can be easily integrated into full-system simulators such as gem5.

In this P&S, you will design new memory and memory controller mechanisms for improving overall system performance, energy consumption, reliability, security, scalability and cost. You will extend Ramulator with these new designs and evaluate their performance, energy consumption, and reliability using modern applications.
This will be the right P&S for you if you would like to learn about state-of-the-art and future memory and memory controller designs and their interaction with modern applications.

This P&S will also give you hands-on experience in simulating and understanding the memory system behavior of modern workloads such as machine learning, graph analytics, and genome analysis.

The course is conducted in English.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=ramulator
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=ramulator
Literature:
Learning Materials
===============

An old version of Ramulator:
https://github.com/CMU-SAFARI/ramulator

Original Ramulator paper:
https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf

An example study of modern workloads and DRAM architectures using Ramulator:
https://people.inf.ethz.ch/omutlu/pub/Workload-DRAM-Interaction-Analysis_sigmetrics19_pomacs19.pdf

An example recent study of a new DRAM architecture using Ramulator:
https://people.inf.ethz.ch/omutlu/pub/CLR-DRAM_capacity-latency-reconfigurable-DRAM_isca20.pdf

An example recent study of a new virtual memory system architecture using Ramulator:
https://people.inf.ethz.ch/omutlu/pub/VBI-virtual-block-interface_isca20.pdf

Several examples of new ideas enabled by Ramulator-based evaluation:
https://people.inf.ethz.ch/omutlu/pub/rowclone_micro13.pdf
https://people.inf.ethz.ch/omutlu/pub/salp-dram_isca12.pdf
https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf
https://people.inf.ethz.ch/omutlu/pub/DR_STRANGE_EndtoEnd-DRAM-TRNG_hpca22.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course)
- Good knowledge of the C/C++ programming languages
- Interest in making things efficient and solving problems
- Interest in understanding software development and hardware design, and their interactions
227-0085-35L Projects & Seminars: FPGA-based Exploration of DRAM and RowHammer (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | O. Mutlu
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: DRAM is predominantly used to build the main memory systems of modern computing devices. To improve the performance, reliability, and security of DRAM, it is critical to perform experimental characterization and analysis of existing cutting-edge DRAM chips.

SoftMC is an FPGA-based DRAM testing infrastructure that enables the programmer to perform all low-level DRAM operations (i.e., DDR commands) in a cycle-accurate manner. SoftMC provides a simple and intuitive high-level programming interface (in C++) that completely hides the low-level details of the FPGA from programmers. Programmers implement test routines in C++, and the test routines automatically get translated into the low-level SoftMC memory controller operations in the FPGA. SoftMC developers write low-level hardware description language code to enable new and faster studies.

In this P&S, you will have the chance to learn how DRAM is organized and operates at a low level, and to gain practical experience in using SoftMC while developing SoftMC programs for new DRAM characterization studies related to performance, reliability, and security. You may also improve the SoftMC infrastructure itself to enable new studies. And, who knows, you might discover new security vulnerabilities like RowHammer.

This will be the right P&S for you if you are interested in DRAM technology and would like to learn more about it as well as FPGA technology and how it can be used for practical purposes such as understanding and mitigating RowHammer attacks, generating true random numbers, reducing memory latency, fingerprinting and identifying devices, and improving reliability.

The course is conducted in English.

See: https://safari.ethz.ch/projects_and_seminars/doku.php?id=softmc
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=softmc
Literature:
Learning Materials:
===================

- An old version of SoftMC is here: https://github.com/CMU-SAFARI/SoftMC
- SoftMC description: https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf
- SoftMC lecture: https://www.youtube.com/watch?v=tnSPEP3t-Ys
- Example RowHammer study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/Revisiting-RowHammer_isca20.pdf
- Example security attack study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/rowhammer-TRRespass_ieee_security_privacy20.pdf
- Example neural network acceleration study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/EDEN-efficient-DNN-inference-with-approximate-memory_micro19.pdf
- Example random number generation study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/drange-dram-latency-based-true-random-number-generator_hpca19.pdf
- Example physical unclonable function study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/dram-latency-puf_hpca18.pdf
- The original RowHammer study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course)
- Familiarity with FPGA programming
- Interest in low-level system exploration and memory
- Interest in discovering why things do or do not work and solving problems
227-0085-51L Projects & Seminars: Programming Heterogeneous Computing Systems with GPUs and other Accelerators (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | O. Mutlu, J. Gómez Luna
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: The increasing difficulty of scaling the performance and efficiency of CPUs every year has created the need for turning computers into heterogeneous systems, i.e., systems composed of multiple types of processors that can better suit different types of workloads or parts of them. More than a decade ago, Graphics Processing Units (GPUs) became general-purpose parallel processors, making their outstanding processing capabilities available to many workloads beyond graphics. GPUs have been key to the recent rise of Machine Learning and Artificial Intelligence, whose training times were impractically long before the use of GPUs. Field-Programmable Gate Arrays (FPGAs) are another example of a computing device that can deliver impressive benefits in terms of performance and energy efficiency. More specific examples are (1) a plethora of specialized accelerators (e.g., Tensor Processing Units for neural networks), and (2) near-data processing architectures (i.e., placing compute capabilities near or inside memory/storage).

Despite the great advances in the adoption of heterogeneous systems in recent years, there are still many challenges to tackle, for example:
- Heterogeneous implementations (using GPUs, FPGAs, TPUs) of modern applications from important fields such as bioinformatics, machine learning, graph processing, medical imaging, personalized medicine, robotics, virtual reality, etc.
- Scheduling techniques for heterogeneous systems with different general-purpose processors and accelerators, e.g., kernel offloading, memory scheduling, etc.
- Workload characterization and programming tools that enable easier and more efficient use of heterogeneous systems.

If you are enthusiastic about working hands-on with different software, hardware, and architecture projects for heterogeneous systems, this is your P&S. You will have the opportunity to program heterogeneous systems with different types of devices (CPUs, GPUs, FPGAs, TPUs), propose algorithmic changes to important applications to better leverage the compute power of heterogeneous systems, understand different workloads and identify the most suitable device for their execution, design optimized scheduling techniques, etc. In general, the goal will be to reach the highest performance reported for a given important application.

The course is conducted in English.

The course has two main parts:
- Weekly lectures on GPU and heterogeneous programming.
- Hands-on project: Each student develops their own project.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems
Content: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems for past examples.
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems
Literature:
Learning Materials
============

1. An introduction to SIMD processors and GPUs:
http://www.youtube.com/watch?v=hOeIkAYraTE

2. An introduction to GPUs and heterogeneous programming: http://www.youtube.com/watch?v=y40-tY5WJ8A

3. Example recent studies of FPGA and GPU implementation for bioinformatics:
GateKeeper: FPGA for bioinformatics (Bioinformatics 2017): https://people.inf.ethz.ch/omutlu/pub/gatekeeper_FPGA-genome-prealignment-accelerator_bionformatics17.pdf
SneakySnake: Pre-alignment filter on FPGA and GPU (Bioinformatics 2020): https://people.inf.ethz.ch/omutlu/pub/SneakySnake_UniversalGenomePrealignmentFilter_bioinformatics20.pdf

4. An example recent study of a suite of heterogeneous benchmarks:
Chai: heterogeneous benchmarks (ISPASS 2017): https://chai-benchmarks.github.io/assets/ispass17.pdf

5. An example recent study of a medical image application on GPU:
GPU for medical imaging (CMPB 2020): https://people.inf.ethz.ch/omutlu/pub/bsplines_interpolation_on_GPUs_compmethodsprograms-biomedicine20.pdf

6. Example studies of programming tools and performance portability on heterogeneous systems:
Boyi: execution models for FPGAs (FPGA 2020): https://people.inf.ethz.ch/omutlu/pub/boyi-opencl-execution-model-selection-for-FPGAs_fpga20.pdf
Zorua: hardware support for GPU performance portability (MICRO 2016): https://people.inf.ethz.ch/omutlu/pub/zorua-holistic-GPU-virtualization_micro16.pdf
Locality descriptor: Cross-layer abstraction to express data locality on GPUs (ISCA 2018): https://people.inf.ethz.ch/omutlu/pub/LocalityDescriptor-Cross-Layer-GPU-Data-Locality-Abstraction_isca18.pdf

7. Example studies of scheduling techniques for heterogeneous systems:
Thread scheduling (MICRO 2011): https://people.inf.ethz.ch/omutlu/pub/large-gpu-warps_micro11.pdf
DASH: memory scheduling (TACO 2016): https://people.inf.ethz.ch/omutlu/pub/dash_deadline-aware-heterogeneous-memory-scheduler_taco16.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course).
- Familiarity with C/C++ programming and strong coding skills.
- Interest in future computer architectures and computing paradigms.
- Interest in discovering why things do or do not work and solving problems
- Interest in making systems efficient and usable
227-0085-56L Projects & Seminars: Intelligent Architectures via Hardware/Software Cooperation (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | O. Mutlu
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: Modern general-purpose processors are agnostic to an application’s high-level semantic information. Hence, they employ prediction-based techniques to enable computational and memory optimizations, such as prefetching, cache management policies, memory data placement, instruction scheduling, and many others. The potential of such optimizations is limited by the limited information the underlying hardware can discover on its own, and the prediction hardware they require comes with large area, power, and complexity overheads. Purely-hardware optimizations cannot achieve their performance potential and waste power, complexity, and hardware area, since they are not aware of the application characteristics. On the other hand, purely-software optimizations are fundamentally tied to and limited by the underlying hardware.

A promising way to increase the performance of modern applications is to co-design software and hardware. Hence, lately both industry and academia are making serious attempts to improve performance, energy and security using hardware/software cooperative schemes such as application-specific hardware accelerators (e.g., Google’s Tensor Processing Unit) and application-specific extensions in general-purpose processors (e.g., Media Engine in Apple M1).

In this course, we will explore several different topics around hardware/software co-design such as: (i) new hardware/software interfaces (e.g., virtual memory, instruction set architecture) to enhance performance, energy and security, (ii) hardware/software co-design schemes to improve the performance of the memory subsystem in killer memory-intensive applications (e.g., sparse and irregular workloads), (iii) hardware/software cooperative machine-learning-based techniques for different microarchitectural components such as prefetchers, caches and branch predictors, which would continuously learn from the vast amount of memory accesses seen by a processor and adapt to the varying workload and system conditions.

If you are enthusiastic about working hands-on to design both software and hardware, this is your P&S. You will have the opportunity to study modern applications, propose software changes to better match the underlying hardware components, design new hardware components that better match the overlying software and come up with new machine-learning techniques to design efficient microarchitectural components. You will also learn how to program industry-supported microarchitectural simulators and study the performance of modern workloads after your hardware/software modifications.

Preferable:
- Hands-on experience with Machine Learning frameworks (depends on the topic you choose)

The course is conducted in English.

Course website: https://safari.ethz.ch/projects_and_seminars/
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/
Literature:
Learning materials
============

[1] Onur Mutlu, "Intelligent Architectures for Intelligent Machines",
Invited Keynote Paper in Proceedings of the 2020 International Symposia on VLSI (VLSI): https://people.inf.ethz.ch/omutlu/pub/intelligent-architectures-for-intelligent-machines_keynote-paper_VLSI20.pdf

[2] Kanellopoulos et al. "SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations", Proceedings of the 52nd International Symposium on Microarchitecture (MICRO 2019): https://people.inf.ethz.ch/omutlu/pub/SMASH-sparse-matrix-software-hardware-acceleration_micro19.pdf

[3] Bera et al. "Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning" Proceedings of the 54th International Symposium on Microarchitecture (MICRO 2021): https://people.inf.ethz.ch/omutlu/pub/Pythia-customizable-hardware-prefetcher-using-reinforcement-learning_micro21.pdf

[4] Hajinazar et al. "The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework" Proceedings of the 47th International Symposium on Computer Architecture (ISCA 2020): https://people.inf.ethz.ch/omutlu/pub/VBI-virtual-block-interface_isca20.pdf

[5] Vijaykumar et al. "A Case for Richer Cross-layer Abstractions: Bridging the Semantic Gap with Expressive Memory", Proceedings of the 45th International Symposium on Computer Architecture (ISCA 2018): https://people.inf.ethz.ch/omutlu/pub/X-MEM_Expressive-Memory-for-Rich-Cross-Layer-Abstractions_isca18.pdf

[6] Vijaykumar et al. "MetaSys: A Practical Open-Source Metadata Management System to Implement and Evaluate Cross-Layer Optimizations", TACO 2022: https://arxiv.org/abs/2105.08123

[7] Vijaykumar et al. "The Locality Descriptor: A Holistic Cross-Layer Abstraction to Express Data Locality in GPUs"
Proceedings of the 45th International Symposium on Computer Architecture (ISCA 2018): https://people.inf.ethz.ch/omutlu/pub/LocalityDescriptor-Cross-Layer-GPU-Data-Locality-Abstraction_isca18.pdf

[8] Besta et al. "SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems", Proceedings of the 54th International Symposium on Microarchitecture (MICRO 2021): https://people.inf.ethz.ch/omutlu/pub/SISA-GraphMining-on-PIM_micro21.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course).
- Familiarity with C/C++ programming and strong coding skills.
- Interest in future computer architectures and computing paradigms.
- Interest in discovering why things do or do not work and solving problems
- Interest in making systems efficient and usable
227-2210-00L Computer Architecture
8 credits | 6G + 1A | O. Mutlu
Abstract: Computer architecture is the science & art of designing and optimizing hardware components and the hardware/software interface to create a computer that meets design goals. This course covers basic components of a modern computing system (memory, processors, interconnects, accelerators). The course takes a hardware/software cooperative approach to understanding and designing computing systems.
Learning objective: We will learn the fundamental concepts of the different parts of modern computing systems, as well as the latest major research topics in Industry and Academia. We will extensively cover memory systems (including DRAM and new Non-Volatile Memory technologies, memory controllers, flash memory), new paradigms like processing-in-memory, parallel computing systems (including multicore processors, coherence and consistency, GPUs), heterogeneous computing, interconnection networks, specialized systems for major data-intensive workloads (e.g. graph analytics, bioinformatics, machine learning), etc. We will focus on fundamentals as well as cutting-edge research. Significant attention will be given to real-life examples and tradeoffs, as well as critical analysis of modern computing systems.
Content: The principles presented in the lecture are reinforced in the laboratory through 1) the design and implementation of a cycle-accurate simulator, where we will explore different components of a modern computing system (e.g., pipeline, memory hierarchy, branch prediction, prefetching, caches, multithreading), and 2) the extension of state-of-the-art research simulators (e.g., Ramulator) for more in-depth understanding of specific system components (e.g., memory scheduling, prefetching).

See the course website for detailed and complete content of past incarnations of the course: https://safari.ethz.ch/architecture
Lecture notes: All the materials (including lecture slides) will be provided on the course website: https://safari.ethz.ch/architecture/

The video recordings of the lectures are expected to be made available after lectures.

See https://safari.ethz.ch/architecture for past examples.
Literature: We will provide required and recommended readings in every lecture. They will mainly consist of research papers presented in major Computer Architecture and related conferences and journals.

See https://safari.ethz.ch/architecture for past examples.
Prerequisites / Notice: Digital Design and Computer Architecture (https://safari.ethz.ch/digitaltechnik)
227-2211-00L Seminar in Computer Architecture (restricted registration)
Number of participants limited to 28.

The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not attend the seminar, will officially fail the seminar.
2 credits | 2S | O. Mutlu, M. H. K. Alser, J. Gómez Luna
Abstract: In this seminar course, we will cover fundamental and cutting-edge research papers in computer architecture. The course will consist of multiple components that are aimed at improving students' technical skills in computer architecture, critical thinking and analysis on computer architecture concepts, as well as technical presentation of concepts and papers in both spoken and written forms.
Learning objective: The main objective is to learn how to rigorously analyze and present papers and ideas on computer architecture. We will have rigorous presentation and discussion of selected papers during lectures and a written report delivered by each student at the end of the semester.
This course is for those interested in computer architecture. Registered students are expected to attend every lecture, participate in the discussion, and create a synthesis report at the end of the course.
Content: Topics will center around computer architecture. We will, for example, discuss papers on hardware security; new execution paradigms like processing in memory; architectural acceleration mechanisms for key applications like machine learning, graph processing and bioinformatics; memory systems; interconnects; various fundamental and emerging paradigms in computer architecture; hardware/software co-design and cooperation; fault tolerance; energy efficiency; heterogeneous and parallel systems; technology scaling; new execution models, etc.

See https://safari.ethz.ch/architecture_seminar for past examples.
Lecture notes: All the materials will be posted on the course website: https://safari.ethz.ch/architecture_seminar/
Links to past course materials, including the synthesis report assignment, can be found on this page: https://safari.ethz.ch/architecture_seminar
Literature: Key papers and articles on both fundamentals and cutting-edge topics in computer architecture will be provided and discussed. These will be posted on the course website.

See https://safari.ethz.ch/architecture_seminar for past examples.
Prerequisites / Notice: Design of Digital Circuits.
Students should have done very well in Digital Design and Computer Architecture (https://safari.ethz.ch/digitaltechnik) and show a genuine interest in Computer Architecture research and practice.