Proseminar (Introductory Seminar): Large Language Models

This proseminar is organized by: Prof. Dr. Michael Hahn

When and where: Thursdays 12:15–13:45, Room -1.05 (Building C7.2)

The class is centered on Jurafsky & Martin [J&M] (3rd edition): https://web.stanford.edu/~jurafsky/slp3/

Specifically, the January 2025 version: https://web.stanford.edu/~jurafsky/slp3/ed3book_Jan25.pdf

Please register in the Course Management System (CMS): https://cms.sic.saarland/prposeminar_llms_25/. We will use CMS for announcements and other communication.

The course is aimed primarily at students of B.Sc. Computerlinguistik in the second semester.

The number of participants is limited to 20. Please note that registration in CMS does not guarantee that you will be admitted. Admission will be finalized in the first week of the semester.

Every participant is expected to attend all sessions. Please let Michael know if you cannot attend a session. Attendance is important because in-class discussion is a substantive component of this proseminar.

The course is worth 5 CP. Every participant gives a presentation and submits a term paper.

Each presentation covers only a few pages of the J&M book, but those pages are dense with technical content. A key component of your presentation will be to take technical material from the book, thoroughly understand it, and present it in a form that the other seminar participants can follow. Be prepared to invest substantial effort into understanding the technical details of your assigned part of the book. Besides giving a good presentation, you should be prepared to answer questions from seminar participants and from Michael.

You should view the assigned part of the book as just a starting point; you are highly encouraged to draw on resources beyond the J&M book (scholarly resources such as those referenced in the book, but also blog posts, videos, etc.) to deepen your understanding of your topic. Your presentation should clearly indicate which resources you drew on.

Syllabus

!!! UNDER CONSTRUCTION !!!

11 sessions

Date | Topic | Reading | Concepts/Pages to Cover | Who
April 10 | - | | |
April 17 | first session | | |
April 24 | - | | |
May 01 | holiday, no class | | |
May 08 | N-Gram Models | Chapter 3.1 | n-grams; log-probabilities (pages 33-37) | TBD
May 15 | Testing, Perplexity, Sampling | Chapters 3.2, 3.3, 3.4 | train-test split, perplexity (pages 38-41), sampling (pages 42-43) | TBD
May 22 | Neural Networks | Chapters 7.1, 7.3 | units, ReLU, feedforward networks, softmax (pages 133-134, 138-140) | TBD
May 29 | holiday, no class | | |
June 05 | Attention | Chapter 9.1 | definition of a single attention head (pages 186-190) | TBD
June 12 | Transformers | Chapter 9.2 | components of a transformer (pages 191-193; high-level idea of pages 194-197) | TBD
June 19 | holiday, no class | | |
June 26 | Input and Output in Transformers | Chapters 9.4, 9.5 | input encodings, language modeling head (pages 197-201) | TBD
July 03 | Pretraining | Chapters 10.3, 10.4 | pages 210-213, 214-216 | TBD
July 10 (online) | Prompting and Few-Shot Prompting | Chapters 12.1, 12.4 | prompting, in-context learning, chain-of-thought | TBD
July 17 | Post-Training | Chapter 12.3 | pages 249-254 | TBD

If there are more than 18 participants, we will jointly agree on an extra session so that everyone gets the chance to give a full 45-minute presentation.

Grading

!!! SUBJECT TO CHANGE !!!

Grades are composed of: 40% term paper, 40% presentation (including responses to questions), 20% participation in other students’ presentations.