lacoco-lab

Linguistic Interpretability for Neural Models of Language

Software Project

Course Description:

For this course, you will work with one or two other students to propose and implement a software project focused on linguistic interpretability for neural-network-based models of language. This course runs in the summer semester, i.e. April-July 2025.

You will select an interpretability method and use it to analyze how one or more neural models represent and/or process formal linguistic categories (cf. examples below). You can use a pre-trained model, trained on language modeling or some other objective, or, for more complicated methods or tasks, you can train your own smaller model. The model can be trained on any natural language. Many of the technical aspects are flexible and open to discussion, e.g. at the project proposal phase. The key requirement is that your final project analyzes linguistically relevant categories and how they are realized in a given neural network model.

For a more comprehensive technical background, I highly recommend taking the seminar Interpreting and Analyzing Neural Language Models in parallel with this project.

Instructor: Kate McCurdy

Prerequisites: You should be comfortable working with neural networks and various machine learning techniques. You should also have sufficient linguistic background to understand and evaluate the categories of interest. Look through the lists of example methods and tasks — if you would not be capable of implementing one of the listed methods, or evaluating one of the listed tasks, this is probably not the course for you.

Registration: If you want to take part, please send an email to kmccurdy ( at sign ) lst.uni-saarland.de by Friday, April 18 (i.e. after the kickoff meeting on April 10; feel free to come and ask questions there). N.B. The registration deadline was extended by one week, from April 11, to accommodate the lack of a CMS page!

In your email, please:

Project Timeline

Example project components

Example methods

Note that some methods can be used on large language models, while others are more suitable for smaller models. Select a method appropriate for the task.

Example categories

Case Studies

Research works with a linguistic interpretability focus.