Lie Detectors Are Late: Why AI Oversight Needs Commitment Tracing

Sales agents, investment advisors, negotiators, and procurement bots share one annoying trait: the dangerous moment often arrives before the final sentence. By the time the agent says, “This product is ideal for your risk profile,” or “We have a stronger competing offer,” the operational system has already lost the more interesting battle. The model did not become risky at the punctuation mark. It drifted, selected a path, rationalized a move, and only then produced the polished message that everyone pretends to audit. ...

June 12, 2026 · 17 min · Zelina

Two Million Agents Walk Into a Forum, Nobody Builds a Mind

Opening: Why this matters now. The AI industry has a small addiction to the word agent. Add another agent, then another, then a few hundred more, and the slide deck begins to smell faintly of civilization. Somewhere between “workflow automation” and “digital society,” we are invited to believe that scale itself becomes intelligence. ...

April 28, 2026 · 14 min · Zelina

Breaking Rules, Not Systems: How Penalties Make Autonomous Agents Behave

Emergency is a terrible product requirement. It sounds simple in a meeting: “The agent should follow policy, except when the situation is urgent.” Wonderful. Very human. Also almost useless. A delivery robot should not enter a restricted zone. Unless the package is critical medicine. A warehouse agent should not skip safety checks. Unless a fire alarm requires rerouting. A self-driving system should obey traffic norms. Unless an emergency trip makes delay costly. But “unless urgent” does not tell the agent which rule can bend, which rule must hold, and which shortcut turns the system from flexible into reckless. ...

December 4, 2025 · 15 min · Zelina

Mind Over Matter: How a BDI Ontology Gives AI Agents an Actual Inner Life

Workflow agents are easy to admire until someone asks a rude but necessary question: why did the agent do that? Not “what prompt did we send?” Not “which tool did it call?” Not “can we replay the logs and hope the compliance team loses interest?” The real question is sharper: what did the agent believe, what did it want, what did it commit to doing, which plan did that commitment specify, and what evidence justified the transition from one step to the next? ...

November 24, 2025 · 18 min · Zelina

Answer, Then Audit: How 'ReSA' Turns Jailbreak Defense Into a Two‑Step Reasoning Game

The dangerous part is often clearer after the model starts answering. Moderation usually begins with the user’s prompt. That sounds sensible. Read the request, classify the risk, block the bad thing, let the good thing through. A tidy little border checkpoint, complete with imaginary clipboard. The problem is that jailbreaks are not polite enough to declare themselves at the border. ...

September 20, 2025 · 17 min · Zelina