851-0739-01L  Building a Robot Judge: Data Science For the Law

Semester: Spring Semester 2019
Lecturers: E. Ash
Periodicity: yearly recurring course
Language of instruction: English
Comment: Particularly suitable for students of D-INFK, D-ITET, D-MTEC


Abstract: This course explores the automation of decisions in the legal system. We delve into the tools from natural language processing and machine learning needed to predict judge decision-making and ask whether it is possible -- or even desirable -- to build a robot judge.
Objective: Is a concept of justice what truly separates man from machine? Recent advances in data science have caused many people to reconsider their responses to this question. With the expanding digitization of legal data and corpora, alongside rapid developments in natural language processing and machine learning, the prospect of automating legal decisions arises.

Data science technologies have the potential to improve legal decisions by making them more efficient and consistent. The benefits to society from this automation could be significant. On the other hand, there are serious risks that automated systems could replicate or amplify existing legal biases and rigidities.

This course introduces students to the data science tools that are unlocking legal materials for computational and scientific analysis. We begin with the problem of representing laws as data, with a review of techniques for featurizing texts, extracting legal information, and representing documents as vectors. We explore methods for measuring document similarity and clustering documents based on legal topics or other features. Visualization methods include word clouds and t-SNE plots for spatial relations between documents.
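
As a rough illustration of what representing documents as vectors and measuring their similarity can look like, the following minimal sketch builds TF-IDF vectors and compares them by cosine similarity, using only the Python standard library. It is not taken from the course materials: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the smoothed IDF formula, and the three example sentences are all assumptions chosen for illustration, and TF-IDF is only one of several possible featurization schemes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an illustrative choice): terms found in every
        # document still receive a small positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical example texts, invented for this sketch.
docs = [
    "The tenant shall pay rent to the landlord each month.",
    "The landlord may terminate the lease if the tenant fails to pay rent.",
    "The defendant was convicted of burglary and sentenced to prison.",
]
vectors = tfidf_vectors(docs)
sim_lease = cosine_similarity(vectors[0], vectors[1])
sim_crime = cosine_similarity(vectors[0], vectors[2])
print(f"lease vs. lease clause:   {sim_lease:.3f}")
print(f"lease vs. criminal ruling: {sim_crime:.3f}")
```

Under these assumptions, the two lease-related sentences score as more similar to each other than either does to the criminal ruling, which is the kind of signal a clustering method could then group by legal topic.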

We next consider legal prediction problems. Given the evidence and briefs in this case, how will a judge probably decide? How likely is a criminal defendant to commit another crime? How much additional revenue will this new tax law collect? Students will investigate and implement the relevant machine learning tools for making these types of predictions, including regression, classification, and deep neural network models.

We then use these predictions to better understand the operation of the legal system. Under what conditions do judges tend to make errors? Against which types of defendants do parole boards exhibit bias? Which jurisdictions have the most tax loopholes? In a semester project, student groups will conceive and implement a research design for examining this type of empirical research question.

Some programming experience in Python is required, and some experience with text mining is highly recommended.