263-5200-00L  Data Mining: Learning from Large Data Sets

Semester: Spring Semester 2015
Lecturers: A. Krause
Periodicity: yearly recurring course
Course: Does not take place this semester.
Language of instruction: English
Comment: The course will be offered again in the autumn semester 2015.



Catalogue data

Abstract: Many scientific and commercial applications require insights from massive, high-dimensional data sets. This course introduces principled, state-of-the-art techniques from statistics, algorithms, and discrete and convex optimization for learning from such large data sets. The course covers both theoretical foundations and practical applications.
Objective: Many scientific and commercial applications require us to obtain insights from massive, high-dimensional data sets. In this graduate-level course, we will study principled, state-of-the-art techniques from statistics, algorithms, and discrete and convex optimization for learning from such large data sets. The course will cover both theoretical foundations and practical applications.
Content: Topics covered:
- Dealing with large data (data centers; Map-Reduce/Hadoop; Amazon Mechanical Turk)
- Fast nearest neighbor methods (shingling, locality-sensitive hashing)
- Online learning (online optimization and regret minimization, online convex programming, applications to large-scale Support Vector Machines)
- Multi-armed bandits (exploration-exploitation tradeoffs, applications to online advertising and relevance feedback)
- Active learning (uncertainty sampling, pool-based methods, label complexity)
- Dimension reduction (random projections, nonlinear methods)
- Data streams (sketches, coresets, applications to online clustering)
- Recommender systems
Prerequisites / Notice: Solid basic knowledge of statistics, algorithms and programming. Background in machine learning is helpful but not required.

Performance assessment

Performance assessment information (valid until the course unit is held again)
Performance assessment as a semester course
ECTS credits: 4 credits
Examiners: A. Krause
Type: session examination
Language of examination: English
Repetition: The performance assessment is only offered in the session after the course unit. Repetition only possible after re-enrolling.
Mode of examination: written, 120 minutes
Additional information on mode of examination: The grade is determined by a project [30%] and the final written exam [70%].
Written aids: Two A4 pages (i.e. one A4 sheet of paper), either handwritten or in a font size of at least 11 points
This information can be updated until the beginning of the semester; information on the examination timetable is binding.

Learning materials

 
Main link: Information
Only public learning materials are listed.

Courses

Number | Title | Hours | Lecturers
263-5200-00 V | Data Mining: Learning from Large Data Sets (does not take place this semester) | 2 hrs | A. Krause
263-5200-00 U | Data Mining: Learning from Large Data Sets (does not take place this semester) | 1 hr | A. Krause

Groups

No information on groups available.

Restrictions

There are no additional restrictions for the registration.

Offered in

Programme | Section | Type
Certificate of Advanced Studies in Computer Science | Focus Courses and Electives | W
Computer Science Master | Focus Elective Courses Information Systems | W
Computer Science Master | Focus Elective Courses Visual Computing | W
Robotics, Systems and Control Master | Artificial Intelligence | W
Statistics Master | Statistical and Mathematical Courses | W