Beyond the Pareto Frontier: Pricing LLM Mistakes in the Real World

For all the hype about model accuracy, inference cost, and latency, most organizations still squint at scatter plots to decide which large language model (LLM) to use. But what if a single number could cut through the tradeoff fog and tell you which model is worth deploying for your use case, under your constraints? That’s the bold proposal in a recent paper by Zellinger and Thomson from Caltech: treat LLM selection as an economic decision. Rather than searching for models on the accuracy-cost “Pareto frontier,” they suggest putting a dollar price on errors, delays, and abstentions. Think of it as a model selection framework that answers one question: how much is a mistake worth to you? ...
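
To make the idea of a single dollar figure concrete, here is a minimal sketch in Python. It is not the paper's formulation: the linear sum, the `ModelProfile` and `Prices` types, and every number below are illustrative assumptions. Each model's expected cost per query is taken to be its inference cost plus dollar penalties for errors, abstentions, and latency.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Hypothetical per-model measurements (all values are made up)."""
    name: str
    cost_per_query: float   # inference cost in dollars
    error_rate: float       # fraction of queries answered incorrectly
    abstention_rate: float  # fraction of queries the model declines to answer
    latency_s: float        # average response time in seconds


@dataclass
class Prices:
    """Hypothetical dollar prices a deployment attaches to each outcome."""
    error: float          # dollars lost per wrong answer
    abstention: float     # dollars lost per abstention
    latency_per_s: float  # dollars lost per second of waiting


def expected_cost(model: ModelProfile, prices: Prices) -> float:
    """Expected dollars per query under an assumed simple linear model."""
    return (
        model.cost_per_query
        + model.error_rate * prices.error
        + model.abstention_rate * prices.abstention
        + model.latency_s * prices.latency_per_s
    )


def cheapest(models: list[ModelProfile], prices: Prices) -> ModelProfile:
    """Pick the model with the lowest expected dollar cost per query."""
    return min(models, key=lambda m: expected_cost(m, prices))


MODELS = [
    ModelProfile("small", cost_per_query=0.001, error_rate=0.10,
                 abstention_rate=0.0, latency_s=0.5),
    ModelProfile("large", cost_per_query=0.02, error_rate=0.03,
                 abstention_rate=0.0, latency_s=2.0),
]

# Two illustrative deployments: mistakes are cheap in one, expensive in the other.
LOW_STAKES = Prices(error=0.05, abstention=0.01, latency_per_s=0.001)
HIGH_STAKES = Prices(error=10.0, abstention=0.01, latency_per_s=0.001)

if __name__ == "__main__":
    for label, prices in [("low stakes", LOW_STAKES), ("high stakes", HIGH_STAKES)]:
        best = cheapest(MODELS, prices)
        print(f"{label}: {best.name} at ${expected_cost(best, prices):.4f} per query")
```

With these invented numbers, the cheaper model wins when a mistake costs five cents and the more accurate one wins when a mistake costs ten dollars. The ranking depends on the prices you assign, not only on where each model sits on a scatter plot.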

July 8, 2025 · 4 min · Zelina