Cheap Thrills, Hard Guarantees: BARGAINing with LLM Cascades

A familiar enterprise AI story goes like this: the expensive model works, the cheap model almost works, and the finance team would very much like “almost” to become a procurement strategy. That is where the trouble starts. For large-scale document processing, classification, filtering, extraction, and review queues, teams rarely want to call the best available LLM on every record. It is too slow, too expensive, and occasionally a lovely way to convert a data pipeline into a billing incident. The obvious compromise is a model cascade: use a cheaper proxy model when it seems confident, and escalate the uncertain cases to a stronger oracle model. ...

September 6, 2025 · 17 min · Zelina

The Slingshot Strategy: Outsmarting Giants with Small AI Models

TL;DR for operators: Most organisations do not have an AI capability problem. They have an AI allocation problem. They send too many routine, repetitive, low-risk tasks to large frontier models because the demo looked impressive and the invoice arrived later. The slingshot strategy is the opposite instinct: break a workflow into smaller decisions, assign the cheap and reliable parts to specialised models or rules, and escalate only the uncertain or high-value cases to stronger LLMs. The point is not to worship small models. That would merely replace one superstition with a smaller, cheaper superstition. The point is to allocate model capacity like an operating resource. ...

March 26, 2025 · 13 min · Zelina