Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities

Publication
NeurIPS 2025