Assert Less, Observe More: AICL and the New QA Stack for LLM Apps

TL;DR for operators: LLM application testing should stop pretending that the whole product behaves like ordinary software. The database connector, retry logic, API wrapper, and schema validator still deserve normal unit, integration, and load tests. Fine. Keep those. They are not the problem. The problem starts when the product becomes a stateful language system: prompts are assembled dynamically, retrieval changes the context, tool calls modify the execution path, memory leaks across turns, and a model update can improve one workflow while quietly breaking another. At that point, exact-match assertions become less like QA and more like theatre with a YAML file. ...

August 31, 2025 · 17 min · Zelina