252-0376-00L Data Warehousing
Semester | Spring Semester 2012 |
Lecturers | R. Marti |
Periodicity | yearly recurring course |
Language of instruction | English |
Courses
Number | Title | Hours | Lecturers | |||||||
---|---|---|---|---|---|---|---|---|---|---|
252-0376-00 V | Data Warehousing | 2 hrs |
| R. Marti | ||||||
252-0376-00 U | Data Warehousing | 1 hrs |
| R. Marti |
Catalogue data
Abstract | Data Warehousing from the perspectives of data modelers, developers, and end users. Topics: On-Line Analytical Processing (OLAP) vs On-Line Transaction Processing (OLTP); data modeling for OLAP; OLAP querying and Data Mining algorithms; integration of data from existing databases: Extract Transform Load (ETL) processes, incl. mapping to a common data model and dealing with data quality issues. |
Objective | Know the raison d’ être for data warehouses, be able to design logical database schemas for On-Line Analytical Processing, know and correctly apply the options of physical database structures, be able to write SQL queries using advanced grouping and ranking operators and window functions, and know various approaches to data mining. |
Content | 1 Introduction Characterization of Data Warehouses (DWs); basic architecture of DWs; OLAP vs OLTP 2 Data Marts Star and Snowflake Schemas: dimensions, measures and facts; advanced grouping, aggregation and ranking in SQL, pivoting; special architectural support in relational DBMSs: bitmap indexes, materialized views; columnar data stores 3 The Integrated Data Warehouse Designing an integrated database schema and Master Data Management (MDM); valid time and transaction time, design patterns for versioned, temporal and bi-temporal data 4 ETL Processing Staging and Landing Areas; quality assurance per source, e.g. dealing with missing or incorrect data; integration issues, e.g. determining common entities from different sources (entity resolution); data profiling 5 Data Mining Detecting outliers, clusters, and association rules from large structured data sets. 6 Loose Ends (time permitting) Workflow support; data lineage 7 Business Environment The “enterprise feedback loop”: measuring the performance of an enterprise and planning future resource allocation |
Lecture notes | The slides presented in the lecture will be published on the Web Page. |
Literature | Buying a textbook is not required. Pointers to selected literature will be published on the Web Page. However, for "book collectors", the following two references are recommended: 1) Christopher Adamson: Star Schema, The Complete Reference. McGraw Hill 2010. This book contains a good overview of data warehouses from design and usage perspectives. 2) Jiawei Han, Micheline Kamber, Jian Pei: Data Mining, Concepts and Techniques, 3rd Edition. Morgan Kaufmann, 2012. The focus of this book is on data mining techniques (chapters 6-12), but it also covers the basics of data warehouses (chapters 2-5). |
Prerequisites / Notice | A working knowledge of relational database technology, i.e. "core" SQL and the basic architecture of a DBMS (especially the basics of query evaluation and B-tree indexes), is assumed. |
Performance assessment
Performance assessment information (valid until the course unit is held again) | |
Performance assessment as a semester course | |
ECTS credits | 4 credits |
Examiners | R. Marti |
Type | session examination |
Language of examination | English |
Repetition | The performance assessment is only offered in the session after the course unit. Repetition only possible after re-enrolling. |
Mode of examination | written 60 minutes |
Written aids | "Open book examination": Any documents candidates wish to take along, including books, but excluding electronic devices. |
This information can be updated until the beginning of the semester; information on the examination timetable is binding. |
Learning materials
No public learning materials available. | |
Only public learning materials are listed. |
Groups
No information on groups available. |
Restrictions
There are no additional restrictions for the registration. |
Offered in
Programme | Section | Type | |
---|---|---|---|
Computer Science Master | Focus Elective Courses Information Systems | W | |
Certificate of Advanced Studies in Computer Science | Focus Courses and Electives | W |