lacoco-lab

Milestones in Machine Learning and Language: Historical Readings

Course Description

Modern AI feels like it’s everywhere: models that write, speak, see, play games, and arguably even reason. However, many researchers today feel a sense of déjà vu: incremental papers, rebranded benchmarks, recycled ideas. Are we reaching the limits of what can be achieved just by scaling models? Is the field running out of new ideas?

This seminar takes a step back, and way back, to understand how machine learning and language technology evolved, both technically and philosophically. We’ll examine the early hopes, dead ends, breakthroughs, and rediscoveries that brought us to today’s transformer-based models.

We’ll ask:

What did early AI researchers believe language and learning were?
Why were neural networks once declared useless — and then revived to define modern AI?
What kinds of research actually shifted paradigms?
Are we at a similar inflection point today?

Prerequisites: This seminar will not presuppose knowledge of any of these fields. However, a willingness to engage with technical content is important.

We’ll read classic work from figures like Turing, Shannon, Chomsky, Rosenblatt, Minsky, Angluin, and Valiant. In a field obsessed with the latest preprint, reading old papers might be an unusual experience, but our hope is that the takeaways will prove useful, maybe even generative. By studying where ideas came from (and how they were nearly lost), you may come away with a deeper appreciation of today’s models and new ways to think about the next generation of models.

Course Management System: CMS

Instructors: Yash Sarrof

Time: Every Wednesday (Starting 15th October), 16:15 to 17:45

Room: Building C7 3 - Seminar Room 1.14

Registration:

There are no more spots left in the seminar.

If you are an LST / CoLi student and want to take this class, you should register directly in the Course Management System (CMS). Admission decisions will be made around the end of the first week of the semester.
If you are a Computer Science student, you should initially register via the Computer Science department seminar registration system. If you want to take the seminar but were not selected by the assignment system, please apply for the waiting list by emailing ysarrof@lst.uni-saarland.de. Only register in the Course Management System (CMS) once you have been selected by the assignment system or otherwise admitted by us.

Syllabus

Each week covers foundational work in AI, ML, or NLP, presented by students and followed by discussion. Pairings are based on thematic coherence. This list is not binding, and depending on the interest of participants, we might substitute a paper here and there or break up the contents of a session over multiple weeks.

Theme 1 – Computation and Intelligence

Turing (1936): On Computable Numbers
Turing (1950): Computing Machinery and Intelligence

Theme 2 – Information and Early AI

Shannon (1948): A Mathematical Theory of Communication

Theme 3 – Language: Structure vs. Distribution

Zellig Harris (1954): Distributional Structure
Noam Chomsky (1956): Three Models for the Description of Language

Theme 4 – The Perceptron and the First AI Winter

Rosenblatt (1958): The Perceptron
Minsky & Papert (1969): Perceptrons (Book, Chapter 6)

Theme 5 – Symbolic Reasoning and Natural Language

Winograd (1972): Understanding Natural Language (SHRDLU) [Only Section 1]

Theme 6 – Formal Models of Learnability

Gold (1967): Language Identification in the Limit
Angluin (1980): Inductive Inference of Formal Languages from Positive Data

Theme 7 – Learning and Generalization

Solomonoff (1964): A Formal Theory of Inductive Inference - Part I
Solomonoff (1964): A Formal Theory of Inductive Inference - Part II
Valiant (1984): A Theory of the Learnable
Vapnik (1998): An overview of Statistical Learning Theory

Schedule

Date	Topic/Theme	Readings	Presenters
15.10	Introduction
22.10	No Class
29.10	Theme 1 – Computation and Intelligence	Turing (1936): On Computable Numbers; Turing (1950): Computing Machinery and Intelligence	Zichao Wei, Antonia Wächter
5.11	Theme 2 – Information Theory	Shannon (1948): A Mathematical Theory of Communication	He Zhu, Shane John Paul
12.11	Theme 3 – Language: Structure vs. Distribution	Zellig Harris (1954): Distributional Structure; Noam Chomsky (1956): Three Models for the Description of Language	Mohammad Saqib Siddiqui, Yu-Hua Hu
19.11	Theme 4 – The Perceptron and the First AI Winter	Rosenblatt (1958): The Perceptron; Minsky & Papert (1969): Perceptrons (Book, Chapter 6)	Franka Beyer, Meropi Lampropoulou
26.11	Theme 5 – Symbolic Reasoning and Natural Language	Winograd (1972): Understanding Natural Language (SHRDLU) [Only Section 1]	Yifang Li
3.12	No Class
10.12	Background for 2nd half [Optional]
17.12	Theme 6 – Formal Models of Learnability	Gold (1967): Language Identification in the Limit	Kaushik Sengupta, Bekham Fallah
24.12	No Class
31.12	No Class
7.01	Theme 6 – Formal Models of Learnability (cont.)	Angluin (1980): Inductive Inference of Formal Languages from Positive Data	Martha Schubert, Alena Tsanda
14.01	Theme 7 – Learning and Generalization	Solomonoff (1964): A Formal Theory of Inductive Inference - Part I; Solomonoff (1964): A Formal Theory of Inductive Inference - Part II	Maura Gitayani, Mathew Titus
21.01	Theme 7 – Learning and Generalization	Valiant (1984): A Theory of the Learnable; Vapnik (1998): An overview of Statistical Learning Theory	Syed Muhammad Hamza Raza, Gaffar Saeed
28.01	Discussion, Closing Thoughts + Final Paper Discussions

Questions about readings

Please register on the CMS forum to post questions. Every student must submit one question about the readings before each scheduled presentation session. The deadline is typically Tuesday at noon before the presentation day. You may submit more than one question; your grade for the week is the highest score among your questions, so you can also ask basic clarification questions.

Submit one question per week (by Tuesday at noon).
Grading:
- 0 = not submitted
- 1 = superficial
- 2 = thoughtful / insightful
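
As a rough illustration of the rule above, the weekly question grade can be sketched in a few lines of Python. The function name `weekly_question_grade` is hypothetical and not part of CMS; the sketch only assumes that each submitted question receives a score of 0, 1, or 2 and that the best one counts.

```python
def weekly_question_grade(scores):
    """Return the grade for one week of reading questions.

    `scores` holds the score (0, 1, or 2) of each question a student
    submitted that week. The best score counts; an empty list means
    nothing was submitted, which is graded 0.
    """
    allowed = {0, 1, 2}
    for score in scores:
        if score not in allowed:
            raise ValueError(f"score must be 0, 1, or 2, got {score!r}")
    # max(..., default=0) covers the "not submitted" case.
    return max(scores, default=0)


if __name__ == "__main__":
    # One basic question (1) and one insightful question (2): the 2 counts.
    print(weekly_question_grade([1, 2]))
    # No submission this week.
    print(weekly_question_grade([]))
```

Under this reading, asking an extra basic question can never lower the weekly grade.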

Presentations

Each week students present in pairs.
Target: 45 min presentation + 30–40 min discussion

Not all papers are equally difficult, and evaluation will take this into account. For shorter and more accessible papers, presentations are expected to go into greater detail: explain the background concepts thoroughly, give more context about the state of the field at the time, and walk the audience carefully through the technical contributions. For longer or more technically challenging papers, the expectations are more flexible — you are not required to cover every detail. Instead, focus on helping the class understand the main ideas, why they mattered, and how they connect to the broader history of AI. In these cases, evaluation will be more lenient, since the goal is learning rather than exhaustive coverage.

In general, presentations should emphasize big-picture findings, motivations, and key results rather than trying to reproduce all the technical minutiae. For example, rather than listing every number from a results table, highlight the takeaways. If a paper contains multiple similar experiments, you may choose a representative subset. Conversely, if a paper is largely a review or conceptual piece, you are encouraged to bring in material from its references to enrich the discussion.

When preparing your talk, ask yourself: What problem was the paper trying to solve? Why was this approach novel or controversial at the time? How did it influence later work? Background explanations of key concepts are strongly encouraged, especially if you expect your classmates to be unfamiliar with them. You should also critically engage with the reading: do you agree with the authors’ assumptions, arguments, or conclusions? What do you think they missed?

Presentations will typically be done in pairs. In that case, avoid splitting the session into two disconnected halves; instead, coordinate to draw connections between the papers and highlight similarities or contrasts. Since you have ample time (~45 minutes presentation plus 30–40 minutes discussion), speak at a pace that the audience can follow, illustrate points with examples, and don’t hesitate to repeat important ideas. Remember: the purpose of the presentation is communication. If the audience cannot follow, the effort is wasted for both presenter and listener.

A key component of the presentation is facilitating discussion. Don’t leave all questions to the end; engage the audience throughout, refer to the questions submitted on the forum, and invite comments. Think carefully about what aspects of the paper will spark debate or curiosity. Likewise, when attending other students’ talks, you are expected to participate actively in the discussion.

Final Papers (for the 7 CP version)

Students taking the 7 CP version will complete a term paper based on a small, well-scoped independent project. The aim isn’t to produce a publishable breakthrough, but to explore a question that links the seminar’s historical readings to present-day methods, evidence, or debates. Many of the works we’ll read try to make fuzzy ideas more precise — for example, by turning broad questions about learning, language, or intelligence into definitions, theorems, or algorithms. A good project might take one of these formalizations and see how well its assumptions hold up in practice, reinterpret it using modern tools, or even suggest a different way the same problem could have been framed. For instance, you might revisit Gold’s “identification in the limit” and ask what it would look like in probabilistic terms, or compare Chomsky’s structuralist view of grammar with the distributional approaches that came later. Projects can take many forms — reproducing an old experiment, running a small modern experiment inspired by a classic paper, or writing a careful conceptual analysis. What ties them together is that each project should use the readings as a starting point to think about how ideas are framed, what those framings highlight, and what they leave out. At the end of each week’s session, I encourage you to reflect — on your own or with classmates — about what kinds of projects the readings suggest. Often the best ideas come from simple questions like: “What did this formalization leave out?” or “How else could we capture the same intuition today?”

Deliverable & format

Report: up to 8 pages main text (self-contained) in NeurIPS style, with unlimited appendix for prompts, code snippets, extra figures/analyses.

Content:

Brief literature review (situating your question in the classic readings)
Motivation
Methods
Results
Discussion/limitations

Evidence: Some quantitative or systematic qualitative validation is expected (e.g., small experiments, ablations, error analyses).

Submission: via CMS. Due: end of March 2026

Target something you can complete with modest compute and time: 1–2 datasets or a small synthetic corpus, and a clear evaluation plan. If using large models/APIs, keep runs minimal and reproducible; prefer open-source baselines where possible.

How it’s graded (high level)

Grounding in seminar readings & historical framing (connection and accuracy)
Clarity of question and appropriateness of method
Rigor of analysis (metrics, controls, or systematic qualitative protocol)
Insightfulness of discussion (what the results say about the historical claim/idea)
Writing quality and reproducibility (appendix artifacts, seeds, configs)

Attendance

Active participation is expected in every seminar session. You may miss up to two sessions without providing justification. For each additional session missed beyond that, you must submit a short (~500 words) writeup summarizing your thoughts on the paper(s) discussed in the session you missed.

Evaluation

Important: Study programs may differ in which versions of the class you can take. Please check with your study program coordinator if in doubt.

For students taking the seminar for 4 credits:

Presentation: 40%
Questions about readings: 30%
Participation in class: 30%

For students taking the seminar for 7 credits:

Presentation: 20%
Questions about readings: 15%
Participation in class: 15%
Final paper: 50%
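
The weightings above amount to a weighted average. The following Python sketch shows the arithmetic under one loud assumption: that every component is first scored on a common 0 to 100 scale. The page does not specify how component scores are normalized, and the names `WEIGHTS` and `overall_score` are purely illustrative.

```python
# Percentage weights copied from the two lists above.
WEIGHTS = {
    "4cp": {"presentation": 40, "questions": 30, "participation": 30},
    "7cp": {"presentation": 20, "questions": 15, "participation": 15,
            "final_paper": 50},
}


def overall_score(version, component_scores):
    """Combine component scores into one weighted score.

    Assumption (not stated on this page): each component score is on a
    0-100 scale. `version` is "4cp" or "7cp".
    """
    weights = WEIGHTS[version]
    missing = set(weights) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Weights are percentages, so divide by 100 at the end.
    return sum(weights[name] * component_scores[name] for name in weights) / 100


if __name__ == "__main__":
    scores = {"presentation": 80, "questions": 90, "participation": 100}
    print(overall_score("4cp", scores))
    print(overall_score("7cp", {**scores, "final_paper": 70}))
```

How the resulting number maps to a final grade is not covered here and is up to the instructors.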

Important: You need to be registered on LSF for the correct version of the class in order to receive credit.

Contact

Please contact Yash (ysarrof@lst.uni-saarland.de) or Michael (mhahn@lst.uni-saarland.de) for any questions.

Accommodations

If you need any accommodations due to a disability or chronic illness, please contact either Michael (mhahn@lst.uni-saarland.de) or the Equal Opportunities and Diversity Management Unit of the university.