# Search result: Catalogue data in Spring Semester 2020

Statistics Master | The following courses belong to the curriculum of the Master's Programme in Statistics. The corresponding credits do not count as external credits even for course units where an enrolment at ETH Zurich is not possible. | |||||

Core Courses | In each subject area, the core courses offered are normally mathematical as well as application-oriented in content. For each subject area, only one of these is recognised for the Master's degree. | |||||

Regression | No offering in this semester (401-3622-00L Statistical Modelling is offered in the autumn semester). | |||||

Analysis of Variance and Design of Experiments | No offering in this semester | |||||

Multivariate Statistics | ||||||

Number | Title | Type | ECTS | Hours | Lecturers | |
---|---|---|---|---|---|---|

401-6102-00L | Multivariate Statistics Does not take place this semester. | W | 4 credits | 2G | not available | |

Abstract | Multivariate Statistics deals with joint distributions of several random variables. This course introduces the basic concepts and provides an overview of classical and modern methods of multivariate statistics. We will consider the theory behind the methods as well as their applications. | |||||

Objective | After the course, you should be able to: - describe the various methods and the concepts and theory behind them - identify adequate methods for a given statistical problem - use the statistical software "R" to efficiently apply these methods - interpret the output of these methods | |||||

Content | Visualization / Principal component analysis / Multidimensional scaling / The multivariate Normal distribution / Factor analysis / Supervised learning / Cluster analysis | |||||

Lecture notes | None | |||||

Literature | The course will be based on class notes and books that are available electronically via the ETH library. | |||||

Prerequisites / Notice | Target audience: This course is the more theoretical version of "Applied Multivariate Statistics" (401-0102-00L) and is targeted at students with a math background. Prerequisite: A basic course in probability and statistics. Note: The courses 401-0102-00L and 401-6102-00L are mutually exclusive. You may register for at most one of these two course units. | |||||

401-0102-00L | Applied Multivariate Statistics | W | 5 credits | 2V + 1U | F. Sigrist | |

Abstract | Multivariate statistics analyzes data on several random variables simultaneously. This course introduces the basic concepts and provides an overview of classical and modern methods of multivariate statistics including visualization, dimension reduction, supervised and unsupervised learning for multivariate data. An emphasis is on applications and solving problems with the statistical software R. | |||||

Objective | After the course, you are able to: - describe the various methods and the concepts behind them - identify adequate methods for a given statistical problem - use the statistical software R to efficiently apply these methods - interpret the output of these methods | |||||

Content | Visualization, multivariate outliers, the multivariate normal distribution, dimension reduction, principal component analysis, multidimensional scaling, factor analysis, cluster analysis, classification, multivariate tests and multiple testing | |||||

Lecture notes | None | |||||

Literature | 1) "An Introduction to Applied Multivariate Analysis with R" (2011) by Everitt and Hothorn; 2) "An Introduction to Statistical Learning: With Applications in R" (2013) by James, Witten, Hastie and Tibshirani. Electronic versions (pdf) of both books can be downloaded for free from the ETH library. | |||||

Prerequisites / Notice | This course is targeted at students with a non-math background. Requirements: 1) Introductory course in statistics (min: t-test, regression; ideal: conditional probability, multiple regression) 2) Good understanding of R (if you don't know R, it is recommended that you study chapters 1, 2, 3, 4, and 5 of "Introductory Statistics with R" by Peter Dalgaard, which is freely available online from the ETH library). An alternative course with more emphasis on theory is 401-6102-00L "Multivariate Statistics" (only every second year). 401-0102-00L and 401-6102-00L are mutually exclusive. You can register for only one of these two courses. | |||||

Time Series and Stochastic Processes | ||||||

Number | Title | Type | ECTS | Hours | Lecturers | |

401-6624-11L | Applied Time Series | W | 5 credits | 2V + 1U | M. Dettling | |

Abstract | The course starts with an introduction to time series analysis (examples, goal, mathematical notation). In the following, descriptive techniques, modeling and prediction as well as advanced topics will be covered. | |||||

Objective | Getting to know the mathematical properties of time series, as well as the requirements, descriptive techniques, models, advanced methods and software that are necessary such that the student can independently run an applied time series analysis. | |||||

Content | The course starts with an introduction to time series analysis that comprises examples and goals. We continue with notation and descriptive analysis of time series. A major part of the course will be dedicated to modeling and forecasting of time series using the flexible class of ARMA models. More advanced topics covered afterwards are time series regression, state space models and spectral analysis. | |||||
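For orientation, the ARMA class named in the Content entry above is commonly written as below. This is the standard textbook form, not taken from the course script; the orders p and q, the coefficients \phi_i and \theta_j, and the white-noise term \varepsilon_t are notation assumed here.

```latex
X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2)
```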

Lecture notes | A script will be available. | |||||

Mathematical Statistics | No offering in this semester | |||||

Specialization Areas and Electives | ||||||

Statistical and Mathematical Courses | ||||||

Number | Title | Type | ECTS | Hours | Lecturers | |

401-4632-15L | Causality | W | 4 credits | 2G | C. Heinze-Deml | |

Abstract | In statistics, we are used to searching for the best predictors of some random variable. In many situations, however, we are interested in predicting a system's behavior under manipulations. For such an analysis, we require knowledge about the underlying causal structure of the system. In this course, we study concepts and theory behind causal inference. | |||||

Objective | After this course, you should be able to - understand the language and concepts of causal inference - know the assumptions under which one can infer causal relations from observational and/or interventional data - describe and apply different methods for causal structure learning - given data and a causal structure, derive causal effects and predictions of interventional experiments | |||||

Prerequisites / Notice | Prerequisites: basic knowledge of probability theory and regression | |||||

401-4627-00L | Empirical Process Theory and Applications | W | 4 credits | 2V | S. van de Geer | |

Abstract | Empirical process theory provides a rich toolbox for studying the properties of empirical risk minimizers, such as least squares and maximum likelihood estimators, support vector machines, etc. | |||||

Objective | ||||||

Content | In this series of lectures, we will start by considering exponential inequalities, including concentration inequalities, for the deviation of averages from their mean. We furthermore present some notions from approximation theory, because this enables us to assess the modulus of continuity of empirical processes. We introduce, e.g., the Vapnik-Chervonenkis dimension: a combinatorial concept (from learning theory) of the "size" of a collection of sets or functions. As statistical applications, we study consistency and exponential inequalities for empirical risk minimizers, and asymptotic normality in semi-parametric models. We moreover examine regularization and model selection. | |||||

401-3632-00L | Computational Statistics | W | 8 credits | 3V + 1U | M. H. Maathuis | |

Abstract | We discuss modern statistical methods for data analysis, including methods for data exploration, prediction and inference. We pay attention to algorithmic aspects, theoretical properties and practical considerations. The class is hands-on and methods are applied using the statistical programming language R. | |||||

Objective | The student obtains an overview of modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. The methods are applied using the statistical programming language R. | |||||

Content | See the class website | |||||

Prerequisites / Notice | At least one semester of (basic) probability and statistics. Programming experience is helpful but not required. | |||||

401-3602-00L | Applied Stochastic Processes Does not take place this semester. | W | 8 credits | 3V + 1U | not available | |

Abstract | Poisson processes; renewal processes; Markov chains in discrete and in continuous time; some applications. | |||||

Objective | Stochastic processes are a way to describe and study the behaviour of systems that evolve in some random way. In this course, the evolution will be with respect to a scalar parameter interpreted as time, so that we discuss the temporal evolution of the system. We present several classes of stochastic processes, analyse their properties and behaviour and show by some examples how they can be used. The main emphasis is on theory; in that sense, "applied" should be understood to mean "applicable". | |||||

Literature | - R. N. Bhattacharya and E. C. Waymire, "Stochastic Processes with Applications", SIAM (2009), available online: http://epubs.siam.org/doi/book/10.1137/1.9780898718997 - R. Durrett, "Essentials of Stochastic Processes", Springer (2012), available online: http://link.springer.com/book/10.1007/978-1-4614-3615-7/page/1 - M. Lefebvre, "Applied Stochastic Processes", Springer (2007), available online: http://link.springer.com/book/10.1007/978-0-387-48976-6/page/1 - S. I. Resnick, "Adventures in Stochastic Processes", Birkhäuser (2005) | |||||

Prerequisites / Notice | Prerequisites are familiarity with (measure-theoretic) probability theory as it is treated in the course "Probability Theory" (401-3601-00L). | |||||

401-3642-00L | Brownian Motion and Stochastic Calculus | W | 10 credits | 4V + 1U | W. Werner | |

Abstract | This course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and connection with partial differential equations. | |||||

Objective | This course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and connection with partial differential equations. | |||||

Lecture notes | Lecture notes will be distributed in class. | |||||

Literature | - J.-F. Le Gall, Brownian Motion, Martingales, and Stochastic Calculus, Springer (2016). - I. Karatzas, S. Shreve, Brownian Motion and Stochastic Calculus, Springer (1991). - D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Springer (2005). - L.C.G. Rogers, D. Williams, Diffusions, Markov Processes and Martingales, vol. 1 and 2, Cambridge University Press (2000). - D.W. Stroock, S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer (2006). | |||||

Prerequisites / Notice | Familiarity with measure-theoretic probability as in the standard D-MATH course "Probability Theory" will be assumed. Textbook accounts can be found for example in - J. Jacod, P. Protter, Probability Essentials, Springer (2004). - R. Durrett, Probability: Theory and Examples, Cambridge University Press (2010). | |||||

401-6228-00L | Programming with R for Reproducible Research | W | 1 credit | 1G | M. Mächler | |

Abstract | Deeper understanding of R: Function calls, rather than "commands". Reproducible research and data analysis via Sweave and Rmarkdown. Limits of floating point arithmetic. Understanding how functions work. Environments, packages, namespaces. Closures, i.e., functions returning functions. Lists and [mc]lapply() for easy parallelization. Performance measurement and improvements. | |||||

Objective | Learn to understand R as a (very versatile and flexible) programming language and learn about some of its lower level functionalities which are needed to understand *why* R works the way it does. | |||||

Content | See "Skript": https://github.com/mmaechler/ProgRRR/tree/master/ETH | |||||

Lecture notes | Material available from Github https://github.com/mmaechler/ProgRRR/tree/master/ETH (typically will be updated during course) | |||||

Literature | Norman Matloff (2011) The Art of R Programming - A tour of statistical software design. No Starch Press, San Francisco. In stock at Polybuchhandlung (CHF 42.-). More material, notably H. Wickham's "Advanced R": see my ProgRRR github page. | |||||

Prerequisites / Notice | R knowledge on the same level as after *both* parts of the ETH lecture 401-6217-00L Using R for Data Analysis and Graphics. An interest in digging deeper than average R users do. Bring your own laptop with a recent version of R installed. | |||||

401-3629-00L | Quantitative Risk Management | W | 4 credits | 2V + 1U | P. Cheridito | |

Abstract | This course introduces methods from probability theory and statistics that can be used to model financial risks. Topics addressed include loss distributions, risk measures, extreme value theory, multivariate models, copulas, dependence structures and operational risk. | |||||

Objective | The goal is to learn the most important methods from probability theory and statistics used in financial risk modeling. | |||||

Content | 1. Introduction 2. Basic Concepts in Risk Management 3. Empirical Properties of Financial Data 4. Financial Time Series 5. Extreme Value Theory 6. Multivariate Models 7. Copulas and Dependence 8. Operational Risk | |||||

Lecture notes | Course material is available on https://people.math.ethz.ch/~patrickc/qrm | |||||

Literature | A. J. McNeil, R. Frey and P. Embrechts, Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University Press, Princeton, 2015 (Revised Edition), http://press.princeton.edu/titles/10496.html | |||||

Prerequisites / Notice | The course corresponds to the Risk Management requirement for the SAA ("Aktuar SAV Ausbildung") as well as for the Master of Science UZH-ETH in Quantitative Finance. | |||||

401-4658-00L | Computational Methods for Quantitative Finance: PDE Methods | W | 6 credits | 3V + 1U | C. Schwab | |

Abstract | Introduction to principal methods of option pricing. Emphasis on PDE-based methods. Prerequisites: MATLAB programming and knowledge of numerical mathematics at ETH BSc level. | |||||

Objective | Introduce the main methods for efficient numerical valuation of derivative contracts in Black-Scholes markets as well as in incomplete markets due to Levy processes or due to stochastic volatility models. Develop implementations of pricing methods in MATLAB. Finite Difference / Finite Element based methods for the solution of the pricing integrodifferential equation. | |||||

Content | 1. Review of option pricing. Wiener and Levy price process models. Deterministic, local and stochastic volatility models. 2. Finite Difference Methods for option pricing. Relation to bi- and multinomial trees. European contracts. 3. Finite Difference methods for Asian, American and Barrier type contracts. 4. Finite element methods for European and American style contracts. 5. Pricing under local and stochastic volatility in Black-Scholes Markets. 6. Finite Element Methods for option pricing under Levy processes. Treatment of integrodifferential operators. 7. Stochastic volatility models for Levy processes. 8. Techniques for multidimensional problems. Baskets in a Black-Scholes setting and stochastic volatility models in Black Scholes and Levy markets. 9. Introduction to sparse grid option pricing techniques. | |||||

Lecture notes | There will be English, typed lecture notes as well as MATLAB software for registered participants in the course. | |||||

Literature | R. Cont and P. Tankov: Financial Modelling with Jump Processes, Chapman and Hall Publ. 2004. Y. Achdou and O. Pironneau: Computational Methods for Option Pricing, SIAM Frontiers in Applied Mathematics, SIAM Publishers, Philadelphia 2005. D. Lamberton and B. Lapeyre: Introduction to Stochastic Calculus Applied to Finance (second edition), Chapman & Hall/CRC Financial Mathematics Series, Taylor & Francis Publ. Boca Raton, London, New York 2008. J.-P. Fouque, G. Papanicolaou and K.-R. Sircar: Derivatives in Financial Markets with Stochastic Volatility, Cambridge University Press, Cambridge, 2000. N. Hilber, O. Reichmann, Ch. Schwab and Ch. Winter: Computational Methods for Quantitative Finance, Springer Finance, Springer, 2013. | |||||

401-2284-00L | Measure and Integration | W | 6 credits | 3V + 2U | F. Da Lio | |

Abstract | Introduction to abstract measure and integration theory, including the following topics: Caratheodory extension theorem, Lebesgue measure, convergence theorems, L^p-spaces, Radon-Nikodym theorem, product measures and Fubini's theorem, measures on topological spaces | |||||

Objective | Basic acquaintance with the abstract theory of measure and integration | |||||

Content | Introduction to abstract measure and integration theory, including the following topics: Caratheodory extension theorem, Lebesgue measure, convergence theorems, L^p-spaces, Radon-Nikodym theorem, product measures and Fubini's theorem, measures on topological spaces | |||||

Lecture notes | New lecture notes in English will be made available during the course | |||||

Literature | 1. L. Evans and R.F. Gariepy, "Measure theory and fine properties of functions" 2. Walter Rudin, "Real and complex analysis" 3. R. Bartle, "The elements of Integration and Lebesgue Measure" 4. The notes by Prof. Michael Struwe, Spring Semester 2013, https://people.math.ethz.ch/~struwe/Skripten/AnalysisIII-FS2013-12-9-13.pdf 5. The notes by Prof. Urs Lang, Spring Semester 2019, https://people.math.ethz.ch/~lang/mi.pdf 6. P. Cannarsa & T. D'Aprile, "Lecture notes on Measure Theory and Functional Analysis", http://www.mat.uniroma2.it/~cannarsa/cam_0607.pdf | |||||

401-3903-11L | Geometric Integer Programming | W | 6 credits | 2V + 1U | J. Paat | |

Abstract | Integer programming is the task of minimizing a linear function over all the integer points in a polyhedron. This lecture introduces the key concepts of an algorithmic theory for solving such problems. | |||||
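The task described in the Abstract above can be stated compactly as follows. This is the usual textbook formulation rather than the lecturer's own notation; the matrix A, the vectors b and c, and the dimension n are symbols assumed here, with the polyhedron given by the inequalities A x \le b.

```latex
\min \{\, c^{\top} x \;:\; A x \le b,\; x \in \mathbb{Z}^{n} \,\},
\qquad A \in \mathbb{R}^{m \times n},\; b \in \mathbb{R}^{m},\; c \in \mathbb{R}^{n}
```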

Objective | The purpose of the lecture is to provide a geometric treatment of the theory of integer optimization. | |||||

Content | Key topics are: - Lattice theory and the polynomial time solvability of integer optimization problems in fixed dimension. - Structural properties of integer sets that reveal other parameters affecting the complexity of integer problems - Duality theory for integer optimization problems from the vantage point of lattice free sets. | |||||

Lecture notes | not available, blackboard presentation | |||||

Literature | Lecture notes will be provided. Other helpful materials include Bertsimas, Weismantel: Optimization over Integers, 2005 and Schrijver: Theory of linear and integer programming, 1986. | |||||

Prerequisites / Notice | "Mathematical Optimization" (401-3901-00L) | |||||

401-4944-20L | Mathematics of Data Science | W | 8 credits | 4G | A. Bandeira | |

Abstract | Mostly self-contained, but fast-paced, introductory master's-level course on various theoretical aspects of algorithms that aim to extract information from data. | |||||

Objective | Introduction to various mathematical aspects of Data Science. | |||||

Content | These topics lie in overlaps of (Applied) Mathematics with: Computer Science, Electrical Engineering, Statistics, and/or Operations Research. Each lecture will feature a couple of Mathematical Open Problem(s) related to Data Science. The main mathematical tools used will be Probability and Linear Algebra, and a basic familiarity with these subjects is required. There will also be some (although knowledge of these tools is not assumed) Graph Theory, Representation Theory, Applied Harmonic Analysis, among others. The topics treated will include Dimension reduction, Manifold learning, Sparse recovery, Random Matrices, Approximation Algorithms, Community detection in graphs, and several others. | |||||

Lecture notes | https://people.math.ethz.ch/~abandeira/TenLecturesFortyTwoProblems.pdf | |||||

Prerequisites / Notice | The main mathematical tools used will be Probability, Linear Algebra (and real analysis), and a working knowledge of these subjects is required. In addition to these prerequisites, this class requires a certain degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs. We encourage students who are interested in mathematical data science to take both this course and "227-0434-10L Mathematics of Information" taught by Prof. H. Bölcskei. The two courses are designed to be complementary. A. Bandeira and H. Bölcskei | |||||

227-0434-10L | Mathematics of Information | W | 8 credits | 3V + 2U + 2A | H. Bölcskei | |

Abstract | The class focuses on mathematical aspects of 1. Information science: Sampling theorems, frame theory, compressed sensing, sparsity, super-resolution, spectrum-blind sampling, subspace algorithms, dimensionality reduction 2. Learning theory: Approximation theory, uniform laws of large numbers, Rademacher complexity, Vapnik-Chervonenkis dimension | |||||

Objective | The aim of the class is to familiarize the students with the most commonly used mathematical theories in data science, high-dimensional data analysis, and learning theory. The class consists of the lecture, exercise sessions with homework problems, and of a research project, which can be carried out either individually or in groups. The research project consists of either 1. software development for the solution of a practical signal processing or machine learning problem or 2. the analysis of a research paper or 3. a theoretical research problem of suitable complexity. Students are welcome to propose their own project at the beginning of the semester. The outcomes of all projects have to be presented to the entire class at the end of the semester. | |||||

Content | Mathematics of Information 1. Signal representations: Frame theory, wavelets, Gabor expansions, sampling theorems, density theorems 2. Sparsity and compressed sensing: Sparse linear models, uncertainty relations in sparse signal recovery, matching pursuits, super-resolution, spectrum-blind sampling, subspace algorithms (MUSIC, ESPRIT, matrix pencil), estimation in the high-dimensional noisy case, Lasso 3. Dimensionality reduction: Random projections, the Johnson-Lindenstrauss Lemma Mathematics of Learning 4. Approximation theory: Nonlinear approximation theory, fundamental limits on compressibility of signal classes, Kolmogorov-Tikhomirov epsilon-entropy of signal classes, optimal compression of signal classes, recovery from incomplete data, information-based complexity, curse of dimensionality 5. Uniform laws of large numbers: Rademacher complexity, Vapnik-Chervonenkis dimension, classes with polynomial discrimination, blessings of dimensionality | |||||

Lecture notes | Detailed lecture notes will be provided at the beginning of the semester and as we go along. | |||||

Prerequisites / Notice | This course is aimed at students with a background in basic linear algebra, analysis, statistics, and probability. We encourage students who are interested in mathematical data science to take both this course and "401-4944-20L Mathematics of Data Science" by Prof. A. Bandeira. The two courses are designed to be complementary. H. Bölcskei and A. Bandeira | |||||

261-5110-00L | Optimization for Data Science | W | 8 credits | 3V + 2U + 2A | B. Gärtner, D. Steurer | |

Abstract | This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in data science. | |||||

Objective | Understanding the theoretical guarantees (and their limits) of relevant optimization methods used in data science. Learning general paradigms to deal with optimization problems arising in data science. | |||||

Content | This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in machine learning and data science. In the first part of the course, we will first give a brief introduction to convex optimization, with some basic motivating examples from machine learning. Then we will analyse classical and more recent first and second order methods for convex optimization: gradient descent, projected gradient descent, subgradient descent, stochastic gradient descent, Nesterov's accelerated method, Newton's method, and Quasi-Newton methods. The emphasis will be on analysis techniques that occur repeatedly in convergence analyses for various classes of convex functions. We will also discuss some classical and recent theoretical results for nonconvex optimization. In the second part, we discuss convex programming relaxations as a powerful and versatile paradigm for designing efficient algorithms to solve computational problems arising in data science. We will learn about this paradigm and develop a unified perspective on it through the lens of the sum-of-squares semidefinite programming hierarchy. As applications, we are discussing non-negative matrix factorization, compressed sensing and sparse linear regression, matrix completion and phase retrieval, as well as robust estimation. | |||||
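As a small illustration of the simplest first-order method listed in the Content entry above, the following is a minimal sketch of plain gradient descent on a one-dimensional convex function. It is not course material: the function name `gradient_descent`, the fixed step size, the iteration count and the test function f(x) = (x - 3)^2 are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, step=0.1, iters=100):
    """Run plain gradient descent with a fixed step size.

    grad  : function returning the gradient at a point (assumed name)
    x0    : starting point
    step  : fixed step size (an illustrative choice, not a tuned value)
    iters : number of update steps
    """
    x = x0
    for _ in range(iters):
        # Move against the gradient direction.
        x = x - step * grad(x)
    return x


# Hypothetical test problem: f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)  # close to the minimizer x = 3
```

For this quadratic, each step shrinks the distance to the minimizer by a constant factor of 0.8, which is the kind of convergence statement the analyses mentioned in the course description make precise for whole function classes.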

Prerequisites / Notice | As background, we require material taught in the course "252-0209-00L Algorithms, Probability, and Computing". It is not necessary that participants have actually taken the course, but they should be prepared to catch up if necessary. | |||||

252-0220-00L | Introduction to Machine Learning Limited number of participants. Preference is given to students in programmes in which the course is being offered. All other students will be waitlisted. Please do not contact Prof. Krause for any questions in this regard. If necessary, please contact studiensekretariat@inf.ethz.ch | W | 8 credits | 4V + 2U + 1A | A. Krause | |

Abstract | The course introduces the foundations of learning and making predictions based on data. | |||||

Objective | The course will introduce the foundations of learning and making predictions from data. We will study basic concepts such as trading off goodness of fit against model complexity. We will discuss important machine learning algorithms used in practice, and provide hands-on experience in a course project. | |||||

Content | - Linear regression (overfitting, cross-validation/bootstrap, model selection, regularization, [stochastic] gradient descent) - Linear classification: Logistic regression (feature selection, sparsity, multi-class) - Kernels and the kernel trick (Properties of kernels; applications to linear and logistic regression); k-nearest neighbor - Neural networks (backpropagation, regularization, convolutional neural networks) - Unsupervised learning (k-means, PCA, neural network autoencoders) - The statistical perspective (regularization as prior; loss as likelihood; learning as MAP inference) - Statistical decision theory (decision making based on statistical models and utility functions) - Discriminative vs. generative modeling (benefits and challenges in modeling joint vs. conditional distributions) - Bayes' classifiers (Naive Bayes, Gaussian Bayes; MLE) - Bayesian approaches to unsupervised learning (Gaussian mixtures, EM) | |||||

Literature | Textbook: Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press | |||||

Prerequisites / Notice | Designed to provide a basis for following courses: - Advanced Machine Learning - Deep Learning - Probabilistic Artificial Intelligence - Seminar "Advanced Topics in Machine Learning" | |||||

252-0526-00L | Statistical Learning Theory | W | 7 credits | 3V + 2U + 1A | J. M. Buhmann, C. Cotrini Jimenez | |

Abstract | The course covers advanced methods of statistical learning: - Variational methods and optimization. - Deterministic annealing. - Clustering for diverse types of data. - Model validation by information theory. | |||||

Objective | The course surveys recent methods of statistical learning. The fundamentals of machine learning, as presented in the courses "Introduction to Machine Learning" and "Advanced Machine Learning", are expanded from the perspective of statistical learning. | |||||

Content | - Variational methods and optimization. We consider optimization approaches for problems where the optimizer is a probability distribution. We will discuss concepts like maximum entropy, information bottleneck, and deterministic annealing. - Clustering. This is the problem of sorting data into groups without using training samples. We discuss alternative notions of "similarity" between data points and adequate optimization procedures. - Model selection and validation. This refers to the question of how complex the chosen model should be. In particular, we present an information theoretic approach for model validation. - Statistical physics models. We discuss approaches for approximately optimizing large systems, which originate in statistical physics (free energy minimization applied to spin glasses and other models). We also study sampling methods based on these models. | |||||

Lecture notes | A draft of a script will be provided. Lecture slides will be made available. | |||||

Literature | Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer, 2001. L. Devroye, L. Gyorfi, and G. Lugosi: A probabilistic theory of pattern recognition. Springer, New York, 1996 | |||||

Prerequisites / Notice | Knowledge of machine learning (Introduction to Machine Learning and/or Advanced Machine Learning). Basic knowledge of statistics. | |||||

252-3900-00L | Big Data for Engineers This course is not intended for Computer Science and Data Science MSc students! | W | 6 credits | 2V + 2U + 1A | G. Fourny | |

Abstract | This course is part of the series of database lectures offered to all ETH departments, together with Information Systems for Engineers. It introduces the most recent advances in the database field: how do we scale storage and querying to Petabytes of data, with trillions of records? How do we deal with heterogeneous data sets? How do we deal with alternate data shapes like trees and graphs? | |||||

Objective | This lecture is complementary to Information Systems for Engineers, as they cover different time periods of database history and practices -- you can even take both lectures at the same time. The key challenge of the information society is to turn data into information, information into knowledge, knowledge into value. This has become increasingly complex. Data comes in larger volumes, diverse shapes, from different sources. Data is more heterogeneous and less structured than forty years ago. Nevertheless, it still needs to be processed fast, with support for complex operations. This combination of requirements, together with the technologies that have emerged in order to address them, is typically referred to as "Big Data." This revolution has led to a completely new way to do business, e.g., develop new products and business models, but also to do science -- which is sometimes referred to as data-driven science or the "fourth paradigm". Unfortunately, the quantity of data produced and available -- now in the Zettabyte range (that's 21 zeros) per year -- keeps growing faster than our ability to process it. Hence, new architectures and approaches for processing it were and are still needed. Harnessing them must involve a deep understanding of data not only in the large, but also in the small. The field of databases evolves at a fast pace. In order to be prepared, to the extent possible, for the (r)evolutions that will take place in the next few decades, the emphasis of the lecture will be on the paradigms and core design ideas, while today's technologies will serve as supporting illustrations thereof. After attending this lecture, you should have gained an overview and understanding of the Big Data landscape, which is the basis on which one can make informed decisions, i.e., pick and orchestrate the relevant technologies together for addressing each business use case efficiently and consistently. | |||||

Content | This course gives an overview of database technologies and of the most important database design principles that lay the foundations of the Big Data universe. It specifically targets students with a scientific or engineering, but not Computer Science, background. We take the monolithic, one-machine relational stack from the 1970s, smash it down and rebuild it on top of large clusters: starting with distributed storage, and all the way up to syntax, models, validation, processing, indexing, and querying. A broad range of aspects is covered with a focus on how they all fit together in the big picture of the Big Data ecosystem. No data is harmed during this course; however, please be psychologically prepared that our data may not always be in normal form. - physical storage: distributed file systems (HDFS), object storage (S3), key-value stores - logical storage: document stores (MongoDB), column stores (HBase) - data formats and syntaxes (XML, JSON, RDF, CSV, YAML, protocol buffers, Avro) - data shapes and models (tables, trees) - type systems and schemas: atomic types, structured types (arrays, maps), set-based type systems (?, *, +) - an overview of functional, declarative programming languages across data shapes (SQL, JSONiq) - the most important query paradigms (selection, projection, joining, grouping, ordering, windowing) - paradigms for parallel processing, two-stage (MapReduce) and DAG-based (Spark) - resource management (YARN) - what a data center is made of and why it matters (racks, nodes, ...) - underlying architectures (internal machinery of HDFS, HBase, Spark) - optimization techniques (functional and declarative paradigms, query plans, rewrites, indexing) - applications. Large scale analytics and machine learning are outside of the scope of this course. | |||||

Literature | Papers from scientific conferences and journals. References will be given as part of the course material during the semester. | |||||

Prerequisites / Notice | This course is not intended for Computer Science and Data Science students. Computer Science and Data Science students interested in Big Data MUST attend the Master's level Big Data lecture, offered in Fall. Requirements: programming knowledge (Java, C++, Python, PHP, ...) as well as basic knowledge of databases (SQL). If you have already built your own website with a backend SQL database, this is perfect. Attendance is especially recommended for those who attended Information Systems for Engineers last Fall, which introduced the "good old databases of the 1970s" (SQL, tables and cubes). However, this is not a strict requirement, and it is also possible to take the lectures in reverse order. | |||||

263-5300-00L | Guarantees for Machine Learning | W | 5 credits | 2V + 2A | F. Yang | |

Abstract | This course teaches classical and recent methods in statistics and optimization commonly used to prove theoretical guarantees for machine learning algorithms. The knowledge is then applied in project work that focuses on understanding phenomena in modern machine learning. | |||||

Objective | This course is aimed at advanced Master's and doctoral students who want to understand and/or conduct independent research on theory for modern machine learning. For this purpose, students will learn common mathematical techniques from statistical learning theory. In independent project work, they then apply their knowledge and go through the process of critically questioning recently published work, finding relevant research questions and learning how to effectively present research ideas to a professional audience. | |||||

Content | This course teaches some classical and recent methods in statistical learning theory aimed at proving theoretical guarantees for machine learning algorithms, including topics in - concentration bounds, uniform convergence - high-dimensional statistics (e.g. Lasso) - prediction error bounds for non-parametric statistics (e.g. in kernel spaces) - minimax lower bounds - regularization via optimization. The project work focuses on active theoretical ML research that aims to understand modern phenomena in machine learning, including but not limited to - how overparameterization could help generalization (interpolating models, linearized NN) - how overparameterization could help optimization (non-convex optimization, loss landscape) - complexity measures and approximation theoretic properties of randomly initialized and trained NN - generalization of robust learning (adversarial robustness, standard and robust error tradeoff) - prediction with calibrated confidence (conformal prediction, calibration) | |||||

Prerequisites / Notice | It’s absolutely necessary for students to have a strong mathematical background (basic real analysis, probability theory, linear algebra) and good knowledge of core concepts in machine learning taught in courses such as “Introduction to Machine Learning” and “Regression”/“Statistical Modelling”. It's also helpful to have taken a course in optimization or approximation theory. In addition to these prerequisites, this class requires a certain degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs. |
