263-2910-00L  Program Analysis

SemesterSpring Semester 2015
LecturersM. Vechev
Periodicityyearly recurring course
Language of instructionEnglish


AbstractModern program analysis techniques are the predominant approach for automatically reasoning about real world programs -- its techniques have been applied in a vast range of application domains.

The course provides an introduction to the fundamental principles, applications, and research trends of modern program analysis.
ObjectiveThe course has 4 main objectives:

* Understand the foundational principles behind program analysis techniques.

* Understand how to apply these principles to build practical, working analyzers for real world problems.

* Understand how to combine these techniques with other approaches (e.g. machine learning techniques) to build powerful end-to-end reasoning systems, not possible otherwise.

* Gain familiarity with the state-of-the-art in the area and the future research trends in the next 5-10 years.
ContentThe last decade has seen an explosion in modern program analysis techniques. These techniques are increasingly being used to reason about a vast range of computational paradigms including:

* finding security violations in web and mobile applications such as JavaScript and Android
* practical type checking and inference (e.g. Facebook's recently released Flow analyzer).
* combinations with machine learning techniques for learning from massive programming data guiding prediction of program properties and prediction of new code.
* establishing properties of biological systems (e.g. DNA computation)
* finding serious errors in systems software (e.g. Linux kernel, device drivers, file systems)
* automatic discovery of new algorithms (e.g. concurrent data structures, distributed algorithms) and end-user programming.
* compilers for domain specific languages
* architecture-driven reasoning of concurrent software (e.g. Intel's x86, ARM, IBM's Power).

This course will provide a comprehensive introduction to modern, state-of-the-art program analysis concepts, principles and research trends, including:

* Static & Dynamic Analysis:
- concepts: memory safety, type checking and inference, typestate, concurrency analysis, abstract interpretation (domains, soundness, precision, fixed points)
- frameworks: Valgrind, FastTrack, EventRacer, Apron, PPL, Facebook's Flow analyzer.

* Statistical program reasoning:
- concepts: combining analysis with statistical models (e.g. Language models, Bayesian networks, Neural networks, etc)
- frameworks: Slang, JSNice (Link)

* Predicate abstraction:
- concepts: Graf-Saidi, Boolean programs, lazy abstraction
- frameworks: Microsoft's SLAM for C programs, Fender

* Symbolic execution:
- concepts: SMT, concolic execution
- frameworks: S2E, KLEE, Sage

* Security Analysis:
- concepts: static + dynamic combination
- example: malware detection

* Pointer analysis:
- concepts: Andersen's, Steensgaard's analysis
- frameworks: Soot, LLVM, WALA

* Program synthesis:
- concepts: L*, version spaces, PBE, CEGIS
- frameworks: Sketch, AGS, SmartEdit, ReSynth

* Applications of Analysis & Synthesis:
- GPU programs, security errors, device drivers, concurrent algorithms, end-user programming.

To gain a deeper understanding of how to apply these techniques in practice, the course will involve a small hands-on programming project where based on the principles introduced in class, the students will build a program reasoning engine (e.g. analysis, predictions) for a modern programming language.
Lecture notesThe lectures notes will be distributed in class.
LiteratureDistributed in class.
Prerequisites / NoticeThis course is aimed at both graduate (M.Sc., PhD) students as well as advanced undergraduate students.