A Formal Framework for Understanding Length Generalization in Transformers

Publication
ICLR 2025