The Data Diet for Reasoning Models: Why Less (But Smarter) Wins
A model-training team has a familiar bad habit: when the model fails, the team asks for more. More examples. More domains. More synthetic prompts. More compute. More benchmarks to average over until the unpleasant details become small enough to ignore. This habit is understandable. It is also expensive. And, according to SuperNova, it may be the wrong first instinct. ...