This course provides an in-depth theoretical treatment of optimization methods that are relevant in data science.
Learning objective
Understanding the guarantees and limits of relevant optimization methods used in data science. Learning theoretical paradigms and techniques to deal with optimization problems arising in data science.
Content
This course provides an in-depth theoretical treatment of classical and modern optimization methods that are relevant in data science.
After a general discussion of the role of optimization in the process of learning from data, we give an introduction to the theory of (convex) optimization. Based on this, we present and analyze algorithms in the following four categories: first-order methods (gradient and coordinate descent, Frank-Wolfe, subgradient and mirror descent, stochastic and incremental gradient methods); second-order methods (Newton and quasi-Newton methods); non-convexity (local convergence, provable global convergence, cone programming, convex relaxations); min-max optimization (extragradient methods).
The emphasis is on the motivations and design principles behind the algorithms, on provable performance bounds, and on the mathematical tools and techniques used to prove them. The goal is to equip students with a fundamental understanding of why optimization algorithms work and what their limits are. This understanding will help in selecting suitable algorithms for a given application, but providing concrete practical guidance is not our focus.
Prerequisites / Notice
A solid background in analysis and linear algebra; some background in theoretical computer science (computational complexity, analysis of algorithms); the ability to understand and write mathematical proofs.
Performance assessment
Performance assessment information (valid until the course unit is held again)
The performance assessment is only offered in the session after the course unit. Repetition is only possible after re-enrolling.
Mode of examination
written 180 minutes
Additional information on mode of examination
Four times during the semester, we will hand out graded assignments (compulsory continuous performance assessments). The solutions are expected to be typeset in LaTeX or similar. Solutions will be graded and together contribute 40% to the final grade. Concretely, let P1, P2, P3, P4 be the performances in the four graded assignments, each measured as the percentage of points attained (between 0% and 100%). A graded assignment that is not handed in counts as a performance of 0%. Let PE be the performance in the final exam, measured in the same way. Then the overall course performance is computed as P = 0.1*P1 + 0.1*P2 + 0.1*P3 + 0.1*P4 + 0.6*PE. A course performance of P >= 50% is guaranteed to lead to a passing grade, but depending on the overall performance of the cohort, we may lower the threshold for a passing grade. Assignments can be discussed with colleagues, but we expect an independent writeup.
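As an illustration of the formula for P, here is a minimal Python sketch. The function name `course_performance`, the use of `None` for an assignment that was not handed in, and the example scores are assumptions made for this sketch and are not part of the course regulations.

```python
def course_performance(assignments, exam):
    """Return the overall course performance P in percent.

    assignments: the four assignment performances P1..P4 in percent (0-100);
                 None marks an assignment that was not handed in
                 (a convention of this sketch only).
    exam:        the final exam performance PE in percent (0-100).
    """
    if len(assignments) != 4:
        raise ValueError("expected exactly four graded assignments")
    # An assignment that is not handed in counts as 0%.
    scores = [0.0 if p is None else float(p) for p in assignments]
    # P = 0.1*P1 + 0.1*P2 + 0.1*P3 + 0.1*P4 + 0.6*PE
    return sum(0.1 * p for p in scores) + 0.6 * float(exam)


# Hypothetical student: three assignments handed in, one missing.
p = course_performance([80, 70, None, 90], exam=55)
print(f"P = {p:.1f}%")  # P = 57.0%

# P >= 50% guarantees a passing grade. Because the threshold may be
# lowered, P < 50% does not by itself imply a failing grade.
guaranteed_pass = p >= 50
```

In this hypothetical case the assignments contribute 8 + 7 + 0 + 9 = 24 percentage points and the exam contributes 0.6 * 55 = 33, so P = 57%, which is above the guaranteed passing threshold.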
Written aids
4 pages (A4) of written material (no restrictions regarding form or content)
This information can be updated until the beginning of the semester; information on the examination timetable is binding.