When Benchmarks Lie: Teaching Leaderboards to Care About Preferences
A leaderboard is a comforting object. It gives procurement teams, product managers, and slightly sleep-deprived founders the same small pleasure: a ranked list. Bigger number, better model. Further down the list, worse model. Decision made. Spreadsheet closed. Everyone can return to pretending vendor evaluation is objective. Unfortunately, benchmarks do not care what your business actually needs. ...