Proseminar (Introductory Seminar): Large Language Models

This proseminar is organized by: Prof. Dr. Michael Hahn

When and where: Thursdays 12:15–13:45, Room -1.05 (Building C7.2)

The class is centered on Jurafsky & Martin [J&M] (3rd edition): https://web.stanford.edu/~jurafsky/slp3/

Specifically, the January 2025 version: https://web.stanford.edu/~jurafsky/slp3/ed3book_Jan25.pdf

Please register in the Course Management System (CMS): https://cms.sic.saarland/prposeminar_llms_25/. We will use CMS for announcements and other communication.

The course is aimed primarily at students of B.Sc. Computerlinguistik in the second semester.

The number of participants is limited to 20. Please note that registration in CMS does not guarantee that you will be admitted. Admission will be finalized in the first week of the semester.

Every participant is expected to attend all sessions. Please let Michael know if you cannot attend a session. Attendance is important because in-class discussion is a substantive component of this proseminar.

The course is worth 5 CP. Every participant gives a presentation and submits a term paper.

Each presentation covers only a few pages of the J&M book, but those pages are dense with technical content. A key component of your presentation will be to take technical material from the book, thoroughly understand it, and present it in a form that the other seminar participants can follow. Be prepared to invest substantial effort into understanding the technical details of your assigned part of the book. Besides giving a good presentation, you should be prepared to answer questions from seminar participants and from Michael.

You should view the assigned part of the book as just a starting point; you are highly encouraged to draw on resources beyond the J&M book (scholarly resources such as those referenced in the book, but also blog posts, videos, etc.) to deepen your understanding of your topic. Your presentation should clearly indicate which resources you drew on.

Syllabus

!!! UNDER CONSTRUCTION !!!

11 sessions

Date | Topic | Reading | Concepts/Pages to Cover | Who
April 10 | - | | |
April 17 | first session | | |
April 24 | - | | |
May 01 | holiday, no class | | |
May 08 | N-Gram Models | Chapter 3.1 | n-grams; log-probabilities (pages 33-37) | TBD
May 15 | Testing, Perplexity, Sampling | Chapters 3.2, 3.3, 3.4 | train-test split, perplexity (pages 38-41), sampling (pages 42-43) | TBD
May 22 | Neural Networks | Chapters 7.1, 7.3 | units, ReLU, feedforward networks, softmax (pages 133-134, 138-140) | TBD
May 29 | holiday, no class | | |
June 05 | Attention | Chapter 9.1 | definition of a single attention head (pages 186-190) | TBD
June 12 | Transformers | Chapter 9.2 | components of a transformer (pages 191-193; high-level idea of pages 194-197) | TBD
June 19 | holiday, no class | | |
June 26 | Input and Output in Transformers | Chapters 9.4, 9.5 | input encodings, language modeling head (pages 197-201) | TBD
July 03 | Pretraining | Chapters 10.3, 10.4 | pages 210-213, 214-216 | TBD
July 10 (online) | Prompting and Few-Shot Prompting | Chapters 12.1, 12.4 | prompting, in-context learning, chain-of-thought | TBD
July 17 | Post-Training | Chapter 12.3 | pages 249-254 | TBD

If there are more than 18 participants, we will jointly agree on an extra session so that everyone gets the chance to give a full 45-minute presentation.

Grading

!!! SUBJECT TO CHANGE !!!

Grades are composed of: 40% term paper, 40% presentation (including responses to questions), 20% participation in other students’ presentations.