This course introduces the core computer vision techniques and algorithms that autonomous cars use to perceive the semantics and geometry of their driving environment, localize themselves in it, and predict its dynamic evolution. Emphasis is placed on techniques tailored for real-world settings, such as multi-modal fusion, domain-adaptive and outlier-aware architectures, and multi-agent methods.
Learning objective
Students will learn the fundamentals of autonomous cars and of the computer vision models and methods these cars use to analyze their environment and navigate in it. Students will be presented with state-of-the-art representations and algorithms for semantic, geometric and temporal visual reasoning in automated driving and will gain hands-on experience in developing computer vision algorithms and architectures for solving such tasks.
After completing this course, students will be able to:
1. understand the operating principles of visual sensors in autonomous cars
2. differentiate between the core architectural paradigms and components of modern visual perception models and describe their logic and the role of their parameters
3. systematically categorize the main visual tasks related to automated driving and understand the primary representations and algorithms used for solving them
4. critically analyze and evaluate current research in the area of computer vision for autonomous cars
5. practically reproduce state-of-the-art computer vision methods in automated driving
6. independently develop new models for visual perception
Content
The lectures cover the following topics:
1. Fundamentals (a) Fundamentals of autonomous cars and their visual sensors (b) Fundamental computer vision architectures and algorithms for autonomous cars
3. Geometric perception and localization (a) Depth estimation (b) 3D reconstruction (c) Visual localization (d) Unimodal visual/lidar 3D object detection
4. Robust perception: multi-modal, multi-domain and multi-agent methods (a) Multi-modal 2D and 3D object detection (b) Visual grounding and verbo-visual fusion (c) Domain-adaptive and outlier-aware semantic perception (d) Vehicle-to-vehicle communication for perception
The practical projects involve implementing complex computer vision architectures and algorithms and applying them to real-world, multi-modal driving datasets. In particular, students will develop models and algorithms for: 1. Semantic segmentation and depth estimation 2. Sensor calibration for multi-modal 3D driving datasets 3. 3D object detection using lidars
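As a rough illustration of why sensor calibration matters for multi-modal driving data, the sketch below projects lidar points into a camera image with a pinhole model. It is not taken from the course material: the function name `project_lidar_to_image`, the extrinsic matrix `T_cam_lidar` and the intrinsic matrix `K` are hypothetical, and lens distortion is ignored.

```python
import numpy as np


def project_lidar_to_image(points_lidar, T_cam_lidar, K):
    """Project N x 3 lidar points to pixel coordinates (pinhole model).

    points_lidar: (N, 3) array of points in the lidar frame.
    T_cam_lidar:  (4, 4) assumed extrinsic transform, lidar frame -> camera frame.
    K:            (3, 3) assumed camera intrinsic matrix.
    Returns (pixels, depths) for the points that lie in front of the camera.
    """
    n = points_lidar.shape[0]
    # Homogeneous coordinates so the rigid transform is one matrix product.
    homogeneous = np.hstack([points_lidar, np.ones((n, 1))])
    points_cam = (T_cam_lidar @ homogeneous.T).T[:, :3]

    # Points behind the image plane cannot be projected meaningfully.
    in_front = points_cam[:, 2] > 0
    points_cam = points_cam[in_front]

    # Perspective projection: apply the intrinsics, then divide by depth.
    uvw = (K @ points_cam.T).T
    pixels = uvw[:, :2] / uvw[:, 2:3]
    return pixels, points_cam[:, 2]
```

If the calibration is accurate, the projected points line up with the corresponding image structures. Systematic offsets in such an overlay usually point to errors in the extrinsic or intrinsic parameters.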
Lecture notes
Lecture slides are provided in PDF format.
Prerequisites / Notice
Students are expected to have solid basic knowledge of linear algebra, multivariate calculus, and probability theory, and a basic background in computer vision and machine learning. All practical projects require a solid background in programming and are based on Python and Python libraries such as PyTorch.
Competencies
Subject-specific Competencies
Concepts and Theories
assessed
Techniques and Technologies
assessed
Method-specific Competencies
Analytical Competencies
assessed
Media and Digital Technologies
fostered
Problem-solving
assessed
Social Competencies
Communication
fostered
Cooperation and Teamwork
fostered
Personal Competencies
Creative Thinking
assessed
Critical Thinking
assessed
Performance assessment
Performance assessment information (valid until the course unit is held again)
The performance assessment is offered only in the session after the course unit. Repetition is possible only after re-enrolling.
Mode of examination
written, 120 minutes
Additional information on mode of examination
The final grade is calculated from the session examination grade and the overall projects grade, each weighted at 50%.
The projects are an integral part of the course; they are group-based and their completion is compulsory. A failing overall projects grade results in a failing final grade for the course. Students who do not pass the projects are required to de-register from the exam.
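The grading rule above can be sketched as a small function. This is an illustration only, under stated assumptions: the function name `final_grade` is hypothetical, the pass mark of 4.0 is an assumed threshold that the text above does not state, and no rounding is applied because none is specified.

```python
from typing import Tuple


def final_grade(exam: float, projects: float, pass_mark: float = 4.0) -> Tuple[float, bool]:
    """Return (weighted average, passed) for the 50/50 grading rule.

    pass_mark is an assumption; the course description does not give it.
    """
    # Exam and projects each contribute 50% to the final grade.
    weighted = 0.5 * exam + 0.5 * projects
    # A failing projects grade fails the course regardless of the average.
    passed = projects >= pass_mark and weighted >= pass_mark
    return weighted, passed
```

For example, `final_grade(6.0, 3.5)` yields a weighted average of 4.75 but is still not a pass, because the projects grade is below the assumed pass mark.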
Written aids
One A4 sheet of paper. Simple non-programmable calculator.
This information can be updated until the beginning of the semester; the information on the examination timetable is binding.
Only public learning materials are listed.
Groups
No information on groups available.
Restrictions
General
Special students and auditors need special permission from the lecturers. Permission from the lecturers is required for all students.
Places
90 at the most
Priority
Until 20.09.2024, registration for the course unit is possible only for the primary target group
Primary target group
Robotics, Systems and Control MSc (159000)
Doctorate Mechanical and Process Engineering (164002)
Electrical Engineering and Information Technology MSc (237000)
Doctorate Information Technology and Electrical Engineering (239002)
Doctorate Information Technology and Electrical Engineering ETH-UZH (241000)
Data Science MSc (261000)
Computer Science MSc (263000)
Doctorate Computer Science (264002)