Juan Gómez Luna: Catalogue data in Autumn Semester 2022

Name: Dr. Juan Gómez Luna
Department: Information Technology and Electrical Engineering
Relationship: Lecturer

Number | Title | ECTS | Hours | Lecturers
227-0085-33L Projects & Seminars: Accelerating Genome Analysis with FPGAs, GPUs, and New Execution Paradigms (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | M. H. K. Alser, J. Gómez Luna
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: In this course, we will cover the basics of genome analysis to understand the computational steps of the entire pipeline and find the computational bottlenecks. Students will learn about the existing efforts for accelerating one or more of these steps and will have the chance to carry out a hands-on project to improve these efforts.

The course is conducted in English.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=bioinformatics
Content: A genome encodes a set of instructions for performing some functions within our cells. Analyzing our genomes helps, for example, to determine differences in these instructions (known as genetic variations) from human to human that may cause diseases or different traits. One benefit of knowing the genetic variations is better understanding and diagnosis of diseases and the development of efficient drugs.

Computers are widely used to perform genome analysis using dedicated algorithms and data structures. However, timely analysis of genomic data remains a daunting challenge, due to the complex algorithms and large datasets used for the analysis. Increasing the number of processing cores used for genome analysis decreases the overall analysis time, but significantly escalates the cost of building, maintaining, and cooling such a computing cluster, as well as the power/energy consumed by the cluster. This is a critical shortcoming with respect to both energy production and environmental friendliness. Cloud computing platforms can be used as an alternative to distribute the workload, but transferring the data between the clinic and the cloud poses new privacy and legal concerns.
Lecture notes: See past course materials here: https://safari.ethz.ch/projects_and_seminars/doku.php?id=bioinformatics
Literature:
Learning Materials
===============

1. A survey on accelerating genome analysis: https://arxiv.org/pdf/2008.00961
2. A detailed survey on the state-of-the-art algorithms for sequencing data: https://arxiv.org/pdf/2003.00110
3. An example of how to accelerate genomic sequence matching by two orders of magnitude with the help of FPGAs or GPUs: https://arxiv.org/abs/1910.09020
4. An example of how to accelerate the read mapping step by an order of magnitude without using hardware acceleration: https://arxiv.org/pdf/1912.08735
5. An example of using a different computing paradigm to accelerate the read mapping step and reduce its energy consumption: https://arxiv.org/pdf/1708.04329
6. Two examples of using software/hardware co-design to accelerate genomic sequence matching by two orders of magnitude: https://arxiv.org/abs/1604.01789 https://arxiv.org/abs/1809.07858
Prerequisites / Notice: Prerequisites of the course:

- No prior knowledge in bioinformatics or genome analysis is required.
- Digital Design and Computer Architecture (or equivalent course)
- Good knowledge of the C programming language is required.
- Experience in at least one of the following is highly desirable: FPGA implementation, GPU programming.
- Interest in making things efficient and solving problems
227-0085-36L Projects & Seminars: Genome Sequencing on Mobile Devices (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | M. H. K. Alser, J. Gómez Luna
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: Genome analysis is the foundation of many scientific and medical discoveries, and serves as a key enabler of personalized medicine. This analysis is currently limited by the inability of existing technologies to read an organism’s complete genome. Instead, a dedicated machine (called a sequencer) extracts a large number of shorter random fragments of an organism’s DNA sequence, known as reads. Small, handheld sequencers such as ONT MinION and Flongle make it possible to sequence bacterial and viral genomes in the field, thus facilitating analyses of disease outbreaks such as COVID-19, Ebola, and Zika. However, large, capable computers are still needed to perform genome assembly, which tries to reassemble read fragments back into an entire genome sequence. This limits the benefits of mobile sequencing and may pose problems in rapid diagnosis of infectious diseases, tracking outbreaks, and near-patient testing. The problem is exacerbated in developing countries and during crises, where access to the internet, cloud services, or data centers is even more limited.
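
To make the notion of reads concrete, the following minimal Python sketch draws short random fragments from a toy sequence. It is an illustration only, not a model of a real sequencer: the function name `sample_reads`, the toy genome, and the error-free fixed-length reads are all assumptions made for this example, and the start position is kept only so the result can be inspected (a real sequencer does not report it).

```python
import random


def sample_reads(genome, read_length, num_reads, seed=0):
    """Return (start, fragment) pairs drawn uniformly at random from genome.

    Hypothetical helper for illustration: reads are error-free and all have
    the same length, which real sequencing data does not guarantee.
    """
    if not 0 < read_length <= len(genome):
        raise ValueError("read_length must be between 1 and len(genome)")
    rng = random.Random(seed)  # seeded so that the example is reproducible
    reads = []
    for _ in range(num_reads):
        start = rng.randrange(len(genome) - read_length + 1)
        reads.append((start, genome[start:start + read_length]))
    return reads


if __name__ == "__main__":
    toy_genome = "ACGTACGTTAGCCGATAGGCTA"  # made-up sequence
    for start, read in sample_reads(toy_genome, read_length=6, num_reads=4):
        print(start, read)
```

Genome assembly is the inverse problem: recovering the original sequence from the fragments alone, without the start positions.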

In this course, we will cover the basics of genome analysis to understand the speed-accuracy tradeoff in using computationally-lightweight heuristics versus accurate computationally-expensive algorithms. Such heuristic algorithms typically operate on a smaller dataset that can fit in the memory of today’s mobile devices. Students will experimentally evaluate different heuristic algorithms and observe their effect on the end results. This evaluation will give the students the chance to carry out a hands-on project to implement one or more of these heuristic algorithms on their smartphones and help society by enabling on-site analysis of genomic data.
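
The speed-accuracy tradeoff can be sketched with two ways of comparing a pair of sequences. The Python example below is a minimal illustration under stated assumptions and is not necessarily one of the algorithms used in the course: `edit_distance` is the classic dynamic-programming computation, whose cost grows with the product of the two lengths, while `shared_kmer_fraction` is a hypothetical cheap heuristic that only checks how many length-k substrings the sequences share and can therefore misjudge similarity.

```python
def edit_distance(a, b):
    """Exact Levenshtein distance via dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(a, b, k=4):
    """Cheap similarity estimate: fraction of k-mers shared by a and b.

    Hypothetical heuristic for illustration; it ignores k-mer order and
    multiplicity, so it trades accuracy for speed.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


if __name__ == "__main__":
    read, ref = "ACGTACGTGA", "ACGTTCGTGA"  # made-up sequences
    print(edit_distance(read, ref), shared_kmer_fraction(read, ref))
```

A heuristic of this kind could be used to discard clearly dissimilar pairs before running the exact computation on the remainder.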

The course is conducted in English.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=genome_seq_mobile
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=genome_seq_mobile
Literature:
Learning Materials
===============

1. A survey on accelerating genome analysis: https://arxiv.org/pdf/2008.00961

2. A detailed survey on the state-of-the-art algorithms for sequencing data: https://arxiv.org/pdf/2003.00110

3. An example of how to accelerate genomic sequence matching by two orders of magnitude with the help of FPGAs or GPUs: https://arxiv.org/abs/1910.09020

4. An example of how to accelerate the read mapping step by an order of magnitude without using hardware acceleration: https://arxiv.org/pdf/1912.08735

5. An example of using a different computing paradigm to accelerate the read mapping step and reduce its energy consumption: https://arxiv.org/pdf/1708.04329

6. Two examples of using software/hardware co-design to accelerate genomic sequence matching by two orders of magnitude: https://arxiv.org/abs/1604.01789 https://arxiv.org/abs/1809.07858

7. An example of a purely software method for fast genome sequence analysis: http://www.biomedcentral.com/content/pdf/1471-2164-14-S1-S13.pdf
Prerequisites / Notice: Prerequisites of the course:
- No prior knowledge in bioinformatics or genome analysis is required.
- Good knowledge of the C programming language and of programming in general is required.
- Interest in making things efficient and solving problems
227-0085-37L Projects & Seminars: Data-Centric Architectures: Fundamentally Improving Performance and Energy (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | J. Gómez Luna
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: Data movement between the memory units and the compute units of current computing systems is a major performance and energy bottleneck. From large-scale servers to mobile devices, data movement costs dominate computation costs in terms of both performance and energy consumption. For example, data movement between the main memory and the processing cores accounts for 62% of the total system energy in consumer applications. As a result, the data movement bottleneck is a huge burden that greatly limits the energy efficiency and performance of modern computing systems. This phenomenon is an undesired effect of the dichotomy between memory and the processor, which leads to the data movement bottleneck.

Many modern and important workloads such as machine learning, computational biology, graph processing, databases, video analytics, and real-time data analytics suffer greatly from the data movement bottleneck. These workloads are exemplified by irregular memory accesses, relatively low data reuse, low cache line utilization, low arithmetic intensity (i.e., ratio of operations per accessed byte), and large datasets that greatly exceed the main memory size. The computation in these workloads cannot usually compensate for the data movement costs. In order to alleviate this data movement bottleneck, we need a paradigm shift from the traditional processor-centric design, where all computation takes place in the compute units, to a more data-centric design where processing elements are placed closer to or inside where the data resides. This paradigm of computing is known as Processing-in-Memory (PIM).
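
As a rough illustration of arithmetic intensity (operations per accessed byte), the Python sketch below compares two textbook kernels. The function names, the 4-byte element size, and the idealized traffic model (every input and output element moves between memory and processor exactly once, with perfect reuse in between) are assumptions made for this example, not measurements of any real system.

```python
def arithmetic_intensity(num_operations, bytes_accessed):
    """Ratio of operations to accessed bytes."""
    if bytes_accessed <= 0:
        raise ValueError("bytes_accessed must be positive")
    return num_operations / bytes_accessed


def vector_add_intensity(n, bytes_per_element=4):
    """c[i] = a[i] + b[i]: one addition per element, two reads and one write."""
    return arithmetic_intensity(n, 3 * n * bytes_per_element)


def matmul_intensity(n, bytes_per_element=4):
    """n x n matrix multiply: 2 * n**3 operations (multiply and add).

    Idealized assumption: each of the three matrices is moved only once,
    i.e., 3 * n**2 elements of traffic in total.
    """
    return arithmetic_intensity(2 * n ** 3, 3 * n * n * bytes_per_element)


if __name__ == "__main__":
    print("vector add:", vector_add_intensity(1_000_000))
    print("matrix multiply:", matmul_intensity(1024))
```

Under these assumptions, vector addition performs a small constant fraction of an operation per byte regardless of n, so it is bound by data movement, whereas the intensity of matrix multiplication grows with n. Workloads of the first kind are the ones described above as suffering from the data movement bottleneck.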

This is your perfect P&S if you want to become familiar with the main PIM technologies, which represent "the next big thing" in Computer Architecture. You will work hands-on with the first real-world PIM architecture, will explore different PIM architecture designs for important workloads, and will develop tools to enable research of future PIM systems. Projects in this course span software and hardware as well as the software/hardware interface. You can potentially work on developing and optimizing new workloads for the first real-world PIM hardware, explore new PIM designs in simulators, or do something else that can advance our understanding of the PIM paradigm.

The course is conducted in English.

The course has two main parts:
Weekly lectures on processing-in-memory.
Hands-on project: Each student develops his/her own project.

Course website: https://safari.ethz.ch/projects_and_seminars/
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/
Literature:
Learning materials
============

Summary papers about recent research in PIM.
https://people.inf.ethz.ch/omutlu/pub/ModernPrimerOnPIM_springer-emerging-computing-bookchapter21.pdf
https://people.inf.ethz.ch/omutlu/pub/ProcessingDataWhereItMakesSense_micpro19-invited.pdf
https://people.inf.ethz.ch/omutlu/pub/processing-in-memory_workload-driven-perspective_IBMjrd19.pdf

PIM Simulators.
Ramulator-PIM: A version of Ramulator simulator for PIM.
https://github.com/CMU-SAFARI/ramulator-pim
DAMOV simulator.
https://github.com/CMU-SAFARI/DAMOV

UPMEM SDK documentation: The first real-world PIM architecture.
https://sdk.upmem.com/2021.3.0/

An example recent study of 3D-stacked PIM for consumer workloads.
https://people.inf.ethz.ch/omutlu/pub/Google-consumer-workloads-data-movement-and-PIM_asplos18.pdf

An example recent study of lightweight PIM functionality on 3D-stacked memory:
https://people.inf.ethz.ch/omutlu/pub/pim-enabled-instructons-for-low-overhead-pim_isca15.pdf

An example recent study of a PIM accelerator for graph processing.
https://people.inf.ethz.ch/omutlu/pub/tesseract-pim-architecture-for-graph-processing_isca15.pdf

An example recent study of a Processing-using-Memory system.
https://people.inf.ethz.ch/omutlu/pub/ambit-bulk-bitwise-dram_micro17.pdf
https://people.inf.ethz.ch/omutlu/pub/SIMDRAM_asplos21.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course).
- Familiarity with C/C++ programming.
- Interest in future computer architectures and computing paradigms.
- Interest in discovering why things do or do not work and solving problems
- Interest in making systems efficient and usable
227-0085-51L Projects & Seminars: Programming Heterogeneous Computing Systems with GPUs and other Accelerators (restricted registration)
Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.
3 credits | 3P | O. Mutlu, J. Gómez Luna
Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning and teach the methodology of project work.
Learning objective: The increasing difficulty of scaling the performance and efficiency of CPUs every year has created the need for turning computers into heterogeneous systems, i.e., systems composed of multiple types of processors that are better suited to different types of workloads or parts of them. More than a decade ago, Graphics Processing Units (GPUs) became general-purpose parallel processors, making their outstanding processing capabilities available to many workloads beyond graphics. GPUs have been key to the recent rise of Machine Learning and Artificial Intelligence, whose training times were impractically long before the use of GPUs. Field-Programmable Gate Arrays (FPGAs) are another example of a computing device that can deliver impressive benefits in terms of performance and energy efficiency. More specific examples are (1) a plethora of specialized accelerators (e.g., Tensor Processing Units for neural networks), and (2) near-data processing architectures (i.e., placing compute capabilities near or inside memory/storage).

Despite the great advances in the adoption of heterogeneous systems in recent years, there are still many challenges to tackle, for example:
- Heterogeneous implementations (using GPUs, FPGAs, TPUs) of modern applications from important fields such as bioinformatics, machine learning, graph processing, medical imaging, personalized medicine, robotics, virtual reality, etc.
- Scheduling techniques for heterogeneous systems with different general-purpose processors and accelerators, e.g., kernel offloading, memory scheduling, etc.
- Workload characterization and programming tools that enable easier and more efficient use of heterogeneous systems.

If you are enthusiastic about working hands-on with different software, hardware, and architecture projects for heterogeneous systems, this is your P&S. You will have the opportunity to program heterogeneous systems with different types of devices (CPUs, GPUs, FPGAs, TPUs), propose algorithmic changes to important applications to better leverage the compute power of heterogeneous systems, understand different workloads and identify the most suitable device for their execution, design optimized scheduling techniques, etc. In general, the goal will be to reach the highest performance reported for a given important application.

The course is conducted in English.

The course has two main parts:
Weekly lectures on GPU and heterogeneous programming.
Hands-on project: Each student develops his/her own project.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems
Content: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems for past examples.
Lecture notes: See https://safari.ethz.ch/projects_and_seminars/doku.php?id=heterogeneous_systems
Literature:
Learning Materials
============

1. An introduction to SIMD processors and GPUs:
http://www.youtube.com/watch?v=hOeIkAYraTE

2. An introduction to GPUs and heterogeneous programming: http://www.youtube.com/watch?v=y40-tY5WJ8A

3. Example recent studies of FPGA and GPU implementation for bioinformatics:
GateKeeper: FPGA for bioinformatics (Bioinformatics 2017): https://people.inf.ethz.ch/omutlu/pub/gatekeeper_FPGA-genome-prealignment-accelerator_bionformatics17.pdf
SneakySnake: Pre-alignment filter on FPGA and GPU (Bioinformatics 2020): https://people.inf.ethz.ch/omutlu/pub/SneakySnake_UniversalGenomePrealignmentFilter_bioinformatics20.pdf

4. An example recent study of a suite of heterogeneous benchmarks:
Chai: heterogeneous benchmarks (ISPASS 2017): https://chai-benchmarks.github.io/assets/ispass17.pdf

5. An example recent study of a medical image application on GPU:
GPU for medical imaging (CMPB 2020): https://people.inf.ethz.ch/omutlu/pub/bsplines_interpolation_on_GPUs_compmethodsprograms-biomedicine20.pdf

6. Example studies of programming tools and performance portability on heterogeneous systems:
Boyi: execution models for FPGAs (FPGA 2020): https://people.inf.ethz.ch/omutlu/pub/boyi-opencl-execution-model-selection-for-FPGAs_fpga20.pdf
Zorua: hardware support for GPU performance portability (MICRO 2016): https://people.inf.ethz.ch/omutlu/pub/zorua-holistic-GPU-virtualization_micro16.pdf
Locality descriptor: Cross-layer abstraction to express data locality on GPUs (ISCA 2018): https://people.inf.ethz.ch/omutlu/pub/LocalityDescriptor-Cross-Layer-GPU-Data-Locality-Abstraction_isca18.pdf

7. Example studies of scheduling techniques for heterogeneous systems:
Thread scheduling (MICRO 2011): https://people.inf.ethz.ch/omutlu/pub/large-gpu-warps_micro11.pdf
DASH: memory scheduling (TACO 2016): https://people.inf.ethz.ch/omutlu/pub/dash_deadline-aware-heterogeneous-memory-scheduler_taco16.pdf
Prerequisites / Notice: Prerequisites of the course:
- Digital Design and Computer Architecture (or equivalent course).
- Familiarity with C/C++ programming and strong coding skills.
- Interest in future computer architectures and computing paradigms.
- Interest in discovering why things do or do not work and solving problems
- Interest in making systems efficient and usable
227-2211-00L Seminar in Computer Architecture (restricted registration)
Number of participants limited to 28.

The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not attend the seminar, will officially fail the seminar.
2 credits | 2S | O. Mutlu, M. H. K. Alser, J. Gómez Luna
Abstract: In this seminar course, we will cover fundamental and cutting-edge research papers in computer architecture. The course will consist of multiple components that are aimed at improving students' technical skills in computer architecture, critical thinking and analysis on computer architecture concepts, as well as technical presentation of concepts and papers in both spoken and written forms.
Learning objective: The main objective is to learn how to rigorously analyze and present papers and ideas on computer architecture. We will have rigorous presentation and discussion of selected papers during lectures and a written report delivered by each student at the end of the semester.
This course is for those interested in computer architecture. Registered students are expected to attend every lecture, participate in the discussion, and create a synthesis report at the end of the course.
Content: Topics will center around computer architecture. We will, for example, discuss papers on hardware security; new execution paradigms like processing in memory; architectural acceleration mechanisms for key applications like machine learning, graph processing and bioinformatics; memory systems; interconnects; various fundamental and emerging paradigms in computer architecture; hardware/software co-design and cooperation; fault tolerance; energy efficiency; heterogeneous and parallel systems; technology scaling; new execution models, etc.

See https://safari.ethz.ch/architecture_seminar for past examples.
Lecture notes: All the materials will be posted on the course website: https://safari.ethz.ch/architecture_seminar/
Links to past course materials, including the synthesis report assignment, can be found on this page: https://safari.ethz.ch/architecture_seminar
Literature: Key papers and articles on both fundamentals and cutting-edge topics in computer architecture will be provided and discussed. These will be posted on the course website.

See https://safari.ethz.ch/architecture_seminar for past examples.
Prerequisites / Notice: Design of Digital Circuits.
Students should have done very well in Digital Design and Computer Architecture (https://safari.ethz.ch/digitaltechnik) and show a genuine interest in Computer Architecture research and practice.