Stop Wasting Tokens: ESTAR and the Economics of Early Reasoning Exit
Opening: Why This Matters Now

Large Reasoning Models (LRMs) have developed a curious habit: they keep thinking long after they already know the answer. In the race toward higher benchmark scores, more tokens became the default solution. Need better math accuracy? Add 3,000 reasoning tokens. Want stronger medical QA performance? Let the model “think harder.” Compute is cheap, until it isn’t. ...