Human-in-the-Loop

When AI Drives, Who’s in Control? — Reclaiming Determinism in Agentic Systems

A car does not care whether an AI answer is impressive. It cares whether the answer arrives before the intersection. That small timing problem is where a large part of today’s agentic AI discussion becomes unserious. We keep asking whether models are smart enough to act. In cyber-physical systems, the more painful question is whether the system around the model can make action repeatable, bounded, and recoverable when the model is late, vague, or simply wrong. ...

The Ask Gap: Why AI Agents Fail Not Because They Can’t Think — But Because They Don’t Know When to Stop

A ticket lands in the queue. It looks ordinary: update a parser, answer a business question, patch a workflow, produce a SQL query. The agent opens the files, explores the schema, writes code, runs a few checks, and submits something plausible. The output is polished. The reasoning trace is confident. The dashboard marks the task as completed. ...

The Stochastic Gap: Why Your AI Agent Fails Before It Starts

A procurement workflow looks boring until an AI agent touches it. Before that moment, the process is usually wrapped in the comforting machinery of enterprise software: approval rules, validation checks, role permissions, exception paths, and enough audit trails to make everyone feel governed. Then someone inserts an agent and asks it to “handle the workflow.” The agent may know the words. It may call the right tools. It may even produce a next step that looks plausible. ...

From Copilots to Colleagues: The Organizational Leap to Agentic AI

Bookings are not glamorous. They arrive through email, booking platforms, supplier messages, customer updates, and last-minute changes that somehow always appear after the plan has already been “finalized.” Someone reads them. Someone reconciles them. Someone checks activity availability. Someone checks transport capacity. Someone updates the planning sheet. Someone notices that one family needs pickup from a different location. Someone quietly prevents tomorrow morning from becoming a small logistical circus. ...

When Agents Ask for Help: Teaching LLMs the Art of Expert Collaboration

A help desk ticket is rarely solved by the first sentence. Someone says, “The report is wrong.” Then comes the real work: wrong where, compared with what, after which data refresh, under which permission level, and whether “wrong” means mathematically false or merely politically inconvenient. The expert does not just hand over an answer. The expert asks questions, reconstructs context, and turns a vague failure into a useful diagnosis. ...

Think-with-Me: When LLMs Learn to Stop Thinking

A model can be wrong because it did not think enough. That part is easy to understand. The more annoying failure is when the model already had the answer, kept going, second-guessed itself into a ditch, and then presented the ditch with confidence. This is the special comedy of large reasoning models: sometimes the expensive part is not the intelligence, but the hesitation after the intelligence has already done its job. ...

When LLMs Stop Guessing and Start Complying: Agentic Neuro-Symbolic Programming

The problem is not that LLMs cannot write code. It is that they write the wrong kind too confidently. A familiar scene: someone gives an LLM a task, receives a block of code that looks elegant, runs it, and discovers that it has invented an API, misunderstood the library, or solved a neighboring problem with excellent grammar. This is annoying when the target is ordinary Python. It is worse when the target is a specialized framework where the code is supposed to encode logic, constraints, and domain structure. ...

Label Now, Drive Later: Why Autonomous Driving Needs Fewer Clicks, Not Smarter Annotators

Clicks are a cost centre. In a 3D annotation tool, deleting an unnecessary bounding box may take one or two seconds. Creating a missed vehicle annotation from scratch takes about 23 seconds. Correcting a poorly positioned box falls somewhere in between. These actions may all count as model errors. They do not cost the same amount of human time. ...
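The arithmetic behind that point can be sketched in a few lines. This is only an illustration under stated assumptions: the deletion and creation costs follow the figures quoted above, the correction cost of 10 seconds is a placeholder for "somewhere in between", and the two models and their error counts are hypothetical.

```python
# Per-action time costs in seconds. Deletion and creation follow the figures
# quoted in the teaser; the correction cost is an assumed placeholder.
SECONDS_PER_ACTION = {
    "delete_false_positive": 1.5,  # midpoint of "one or two seconds"
    "create_missed_object": 23.0,  # "about 23 seconds"
    "correct_box": 10.0,           # ASSUMPTION: "somewhere in between"
}


def annotation_seconds(error_counts):
    """Estimate the human time (in seconds) needed to fix the given errors."""
    return sum(SECONDS_PER_ACTION[kind] * n for kind, n in error_counts.items())


# Two hypothetical models that each make 100 errors, with different mixes.
model_a = {"delete_false_positive": 80, "create_missed_object": 10, "correct_box": 10}
model_b = {"delete_false_positive": 10, "create_missed_object": 80, "correct_box": 10}

print(annotation_seconds(model_a))  # 450.0
print(annotation_seconds(model_b))  # 1955.0
```

Under these assumed numbers, both models have the same error count, yet the second one costs more than four times as much annotator time to fix, because most of its errors are missed objects.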

When the Tutor Is a Model: Learning Gains, Guardrails, and the Quiet Rise of AI Co‑Tutors

A tutor has three student chats open. In the first, a student has confused a factor with a multiple. In the second, another has substituted a negative number incorrectly. In the third, the student has already found the answer but is rapidly losing patience with being asked to explain it. The tutor must diagnose each problem, compose an appropriate question, maintain the students’ attention, and decide when further explanation becomes counterproductive. Doing this well requires mathematical knowledge, pedagogical discipline, emotional judgment, and enough spare attention to avoid replying to the wrong child. ...

From Branch Reports to Franchise Intelligence: AI Agents for Retail Execution Control

A franchise retail chain replaced the manual coordination and delayed reporting in its branch monitoring with an AI-agent-enabled workflow covering performance, promotions, inventory, customer feedback, and franchisee support.