Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B

Publication
arXiv preprint