InversionView: A General-Purpose Method for Reading Information from Neural Activations

Publication
Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
Previously: Mechanistic Interpretability Workshop at ICML 2024 (oral). 🏆 Awarded Second Place Prize.