The course covers fundamentals of large-scale data stream processing. The focus is on the design and architecture of modern distributed streaming systems as well as algorithms for analyzing data streams.
Objective
This course provides an overview of the data stream processing model and introduces modern platforms and tools for analyzing massive data streams. By the end of the course, students should be able to use techniques for extracting knowledge from continuous, fast data streams. They will also have gained a deep understanding of the design and implementation of modern distributed stream processors through a series of hands-on exercises.
Content
Modern data-driven applications require continuous, low-latency processing of large-scale, rapid data events such as videos, images, emails, chats, clicks, search queries, financial transactions, traffic records, and sensor measurements. Extracting knowledge from these data streams is particularly challenging due to their high speed and massive volume. Distributed stream processing has recently become highly popular across industry and academia because it can both improve established data processing tasks and enable novel applications with real-time requirements. In this course, we will study the design and architecture of modern distributed streaming systems as well as fundamental algorithms for analyzing data streams.
Lecture notes
Schedule and lecture notes will be posted on the course website: Link
Prerequisites / Notice
The exercise sessions will be a mixture of (1) reviews, discussions, and evaluations of research papers on data stream processing, and (2) programming assignments on implementing data stream mining algorithms and analysis tasks.
- Basic knowledge of relational data management and distributed systems.
- Basic programming skills in Java and/or Rust are necessary to carry out the practical exercises and final project.
Performance assessment
Performance assessment information (valid until the course unit is held again)
Repetition only possible after re-enrolling for the course unit.
Additional information on mode of examination
The course consists of lectures, exercises, and a final semester project. There will be no formal examination at the end of the course. Students are graded continuously based on their participation in class (10%), weekly assignments (50%), and the semester project (40%).