LemmaBench: When AI Finally Meets Real Mathematics
Opening: Why This Matters Now
Every few months, a headline declares that AI can now “solve Olympiad math” or “prove theorems at gold-medal level.” Investors cheer. Researchers argue. Skeptics mutter something about data contamination. But here’s the uncomfortable question: are we measuring real mathematical reasoning, or just performance on carefully curated, increasingly familiar datasets? ...