
263-3826-00L  Data Stream Processing and Analytics

Semester: Spring Semester 2019
Lecturers: V. Kalavri
Periodicity: non-recurring course
Language of instruction: English

Abstract: The course covers fundamentals of large-scale data stream processing. The focus is on the design and architecture of modern distributed streaming systems as well as algorithms for analyzing data streams.
Objective: This course aims to provide an overview of the data stream processing model and to introduce modern platforms and tools for analyzing massive data streams. By the end of the course, students should be able to use techniques for extracting knowledge from continuous, fast data streams. They will also have gained a deep understanding of the design and implementation of modern distributed stream processors through a series of hands-on exercises.
Content: Modern data-driven applications require continuous, low-latency processing of large-scale, rapid data events such as videos, images, emails, chats, clicks, search queries, financial transactions, traffic records, sensor measurements, etc. Extracting knowledge from these data streams is particularly challenging due to their high speed and massive volume.
Distributed stream processing has recently become highly popular across industry and academia because it can both improve established data processing tasks and facilitate novel applications with real-time requirements. In this course, we will study the design and architecture of modern distributed streaming systems as well as fundamental algorithms for analyzing data streams.
Lecture notes: Schedule and lecture notes will be posted on the course website:
Prerequisites / Notice: The exercise sessions will be a mixture of (1) reviews, discussions, and evaluations of research papers on data stream processing, and (2) programming assignments on implementing data stream mining algorithms and analysis tasks.

- Basic knowledge of relational data management and distributed systems.

- Basic programming skills in Java and/or Rust are necessary to carry out the practical exercises and final project.