Valentina Boeva: Catalogue data in Spring Semester 2022

Name: Prof. Dr. Valentina Boeva
Field: Biomedical Informatics
Address
Professur für Biomedizininformatik
ETH Zürich, CAB G 32.2
Universitätstrasse 6
8092 Zürich
SWITZERLAND
E-mail: valentina.boeva@inf.ethz.ch
Department: Computer Science
Relationship: Assistant Professor (Tenure Track)

Number | Title | ECTS | Hours | Lecturers
252-0868-00L | Data Science for Medicine | 4 credits | 4V | J. Vogt, V. Boeva, G. Rätsch
Restricted registration: only for Human Medicine BSc.
Abstract: Machine Learning (ML) methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, and work on practical projects to solve medical problems with the help of ML.
Learning objective: The course will start with a general introduction to ML, covering supervised and unsupervised learning techniques such as classification and regression models, feature selection and data preprocessing, clustering, and dimensionality reduction. After introducing the basic methodologies, we will continue with the most relevant applications of ML in medicine, such as working with time series, medical notes and medical images.
Content: Over the last few years, we have observed a rapid growth of Machine Learning (ML) in medicine. ML methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, discuss the main challenges they present and their current technical solutions, and work on practical projects to solve medical problems with the help of ML.
Prerequisites / Notice: Prerequisite: attendance/exam of 252-0866-00 Digital Medicine I
261-5120-00L | Machine Learning for Health Care | 5 credits | 2V + 2A | V. Boeva, G. Rätsch, J. Vogt
Restricted registration: number of participants limited to 150.
Abstract: The course will review the most relevant methods and applications of Machine Learning in Biomedicine, discuss the main challenges they present and their current technical problems.
Learning objective: Over the last few years, we have observed rapid growth in the field of Machine Learning (ML), mainly due to improvements in ML algorithms, increased data availability and a reduction in computing costs. This growth is having a profound impact on biomedical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in biomedicine, and discuss the main challenges they present and their current technical solutions.
Content: The course will consist of four topic clusters that cover the most relevant applications of ML in Biomedicine:
1) Structured time series: Time series of structured data often appear in biomedical datasets and present challenges such as variables with different periodicities, dependence on static data, etc.
2) Medical notes: A vast amount of medical observations is stored as free text. We will analyze strategies for extracting knowledge from them.
3) Medical images: Images are a fundamental piece of information in many medical disciplines. We will study how to train ML algorithms with them.
4) Genomics data: ML in genomics is still an emerging subfield, but given that genomics data are arguably the most extensive and complex datasets that can be found in biomedicine, it is expected that many relevant ML applications will arise in the near future. We will review and discuss current applications and challenges.
Prerequisites / Notice: Data Structures & Algorithms, Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line

Relation to Course 261-5100-00 Computational Biomedicine: This course is a continuation of the previous course with new topics related to medical data and machine learning. The format of Computational Biomedicine II will also be different. It is helpful but not essential to attend Computational Biomedicine before attending Computational Biomedicine II.
263-3300-00L | Data Science Lab | 14 credits | 9P | C. Zhang, V. Boeva, R. Cotterell, J. Vogt, F. Yang
Restricted registration: only for Data Science MSc.
Abstract: In this class, we bring together data science applications provided by ETH researchers outside computer science and teams of computer science master's students. Two to three students will form a team working on data science/machine learning-related research topics provided by scientists in a diverse range of domains such as astronomy, biology, social sciences, etc.
Learning objective: The goal of this class is for students to gain experience in dealing with data science and machine learning applications "in the wild". Students are expected to go through the full process, starting from data cleaning and continuing through modeling, execution, debugging, error analysis, and quality/performance refinement.
Prerequisites / Notice: Prerequisites: At least 8 credit points (KP) must have been obtained under Data Analysis and at least 8 credit points (KP) under Data Management and Processing.
263-5351-00L | Machine Learning for Genomics | 5 credits | 2V + 1U + 1A | V. Boeva
Restricted registration: The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not provide project work and/or do not show up for the exam, will officially fail the course.

Number of participants limited to 75.
Abstract: The course reviews solutions that machine learning provides to the most challenging questions in human genomics.
Learning objective: Over the last few years, the parallel development of machine learning methods and molecular profiling technologies for human cells, such as sequencing, has created an extremely powerful tool for gaining insight into cellular mechanisms in healthy and diseased contexts. In this course, we will discuss state-of-the-art machine learning methodology that solves, or attempts to solve, common problems in human genomics. At the end of the course, you will be familiar with (1) classical and advanced machine learning architectures used in genomics, (2) bioinformatics analysis of human genomic and transcriptomic data, and (3) data types used in this field.
Content:
- Short introduction to major concepts of molecular biology: DNA, genes, genome, central dogma, transcription factors, epigenetic code, DNA methylation, signaling pathways
- Prediction of transcription factor binding sites, open chromatin, histone marks, promoters, nucleosome positioning (convolutional neural networks, position weight matrices)
- Prediction of variant effects and gene expression (hidden Markov models, topic models)
- Deconvolution of mixed signal
- DNA, RNA and protein folding (RNN, LSTM, transformers)
- Data imputation for single cell RNA-seq data, clustering and annotation (diffusion and methods on graphs)
- Batch correction (autoencoders, optimal transport)
- Survival analysis (Cox proportional hazard model, regularization penalties, multi-omics, multi-tasking)
Prerequisites / Notice: Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line; having taken Computational Biomedicine is highly recommended