When Circuits Go Atomic: Pruning Transformers One Neuron at a Time
Opening — Why this matters now

Mechanistic interpretability has a scaling problem. As language models grow larger and more embedded in high‑stakes workflows, the old habit of waving at “important attention heads” is starting to look quaint. If we want to understand how models reason — not just where something lights up — we need circuit discovery methods that scale without drowning GPUs in activations or collapsing everything into blunt architectural units. ...