
Theory of Machine Learning for Language Models

This lecture covers modern theory of machine learning (e.g., generalization bounds). We will introduce techniques that are useful across many model families and data domains, and we will especially illustrate these methods with applications to understanding the learning of language models/LLMs.
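As a flavor of the kind of result covered, a classical generalization bound (via Hoeffding's inequality and a union bound, for a finite hypothesis class $\mathcal{H}$ and bounded loss in $[0,1]$) states that with probability at least $1-\delta$ over $n$ i.i.d. training samples, simultaneously for all $h \in \mathcal{H}$:

$$
L(h) \;\le\; \widehat{L}(h) + \sqrt{\frac{\ln|\mathcal{H}| + \ln(1/\delta)}{2n}},
$$

where $L(h)$ is the population loss and $\widehat{L}(h)$ the empirical (training) loss. The course develops such bounds and their refinements for richer model families.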

Content at a glance (tentative):

Prerequisites:

To gauge whether your prerequisites are sufficient, check the early chapters of the notes linked under “Material” and see how accessible they look to you.

Logistics

This lecture is offered in Summer Semester 2026.

When: TBD

Where: TBD

Lecturers: Michael Hahn and Yash Sarrof

Credit Points: 6 CP

Material

Our primary text will be Tengyu Ma’s lecture notes from CS229M at Stanford (see also this iteration). We will treat a selection of key topics from these notes.

We’ll also take a view towards Transformers/language models, based on further readings (e.g., Edelman et al. 2022, Wei et al. 2021, Hahn and Rofin 2024, Huang et al. 2025).

Syllabus

TBD

The syllabus will evolve over the course of the semester; we’ll adjust the selection of topics based on what works best.

Grading

Other