Autonomous Systems

When AI Reviews AI: Turning Foundation Models into Safety Inspectors

Inspection is not glamorous. It is not the robot demo, not the dashboard, not the moment a prototype obediently follows a traffic cone across a test track. Inspection is the slow, expensive discipline of asking whether the thing that worked once will behave acceptably when the weather changes, the path bends, the sensor gets confused, or the requirement was written by a tired engineer using the phrase “successfully complete” as if English were a formal language. ...

Scenes, Screens, and Sim-to-Real Dreams: Why Scenario Queries Matter

Road testing has one inconvenient flaw: reality insists on happening in real time. That is a problem for autonomous vehicles, robots, drones, and other cyber-physical systems whose failures are rare, contextual, and often expensive to reproduce. Simulation helps because it lets engineers manufacture awkward situations on demand: the pedestrian who appears at the worst possible moment, the parked car blocking the lane, the unprotected turn that requires social judgement rather than just geometry. Lovely. Except simulation has its own embarrassing little issue: a failure in simulation may be a real system weakness, or it may be an artefact of synthetic sensor data wearing a lab coat. ...

Thinking Fast and Flowing Slow: Real-Time Reasoning for Autonomous Agents

Delay is not a footnote in automation. It is the product. A customer support agent that takes thirty seconds to decide whether to escalate has already shaped the customer’s mood. A warehouse robot that produces the correct plan after the pallet has moved has produced something closer to poetry than control. A trading assistant that generates a gorgeous hedge after the market has repriced is not sophisticated. It is late, which is the expensive version of wrong. ...

Teaching Safety to Machines: How Inverse Constraint Learning Reimagines Control Barrier Functions

Factory robots, drones, and autonomous vehicles do not usually fail because nobody cared about safety. They fail because “safe” is annoyingly difficult to write down. An operator may know that a drone should not scrape the ground, that a warehouse robot should not cut across a human worker’s path, or that an autonomous car should not tailgate even when the road is technically clear. But turning that judgement into a formal mathematical boundary is another matter. The physical system has dynamics. The controller has limits. The dangerous state may not be a simple wall or circle. And the difference between “safe enough” and “please do not put that in production” may live in patterns of behaviour rather than in a clean rule. ...

Forkcast: How Pro2Guard Predicts and Prevents LLM Agent Failures

TL;DR for operators: ProbGuard is a runtime safety monitor that tries to answer a more useful question than “Has the agent broken a rule?” It asks: “Given where the agent is now, how likely is it to end up breaking a rule soon?” That shift matters. Many agent failures are not single bad actions. They are bad trajectories: the robot chooses the wrong object, the car carries too much speed into a risky scene, the workflow skips a confirmation step three moves before data is exposed. A conventional rule-based guardrail often detects the problem when the violation is already visible. ProbGuard tries to detect the probability mass moving toward the violation earlier. ...
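
To make the "how likely is a violation soon?" question concrete, here is a minimal Python sketch. It assumes the agent's behaviour has been abstracted into a small discrete-time Markov chain, which the excerpt above does not state. The state names, transition probabilities, horizon, and threshold are all hypothetical, chosen to mirror the skipped-confirmation example. This is an illustration of bounded reachability, not the tool's actual implementation.

```python
# Hypothetical abstract states and transition probabilities (illustration only).
# Each row maps a state to its successor states and their probabilities.
TRANSITIONS = {
    "start": {"confirm": 0.8, "skip_confirm": 0.2},
    "confirm": {"done": 1.0},
    "skip_confirm": {"done": 0.5, "violation": 0.5},
    "done": {"done": 1.0},            # absorbing safe state
    "violation": {"violation": 1.0},  # absorbing unsafe state
}


def violation_probability(state, horizon, unsafe="violation"):
    """Probability of reaching `unsafe` within `horizon` steps from `state`."""
    # With zero steps left, only the unsafe state itself counts as a hit.
    prob = {s: 1.0 if s == unsafe else 0.0 for s in TRANSITIONS}
    # Each iteration extends the look-ahead by one step.
    for _ in range(horizon):
        prob = {
            s: 1.0 if s == unsafe
            else sum(p * prob[t] for t, p in TRANSITIONS[s].items())
            for s in TRANSITIONS
        }
    return prob[state]


def should_intervene(state, horizon=3, threshold=0.3):
    """Flag a state whose short-horizon violation risk meets the threshold."""
    return violation_probability(state, horizon) >= threshold


if __name__ == "__main__":
    for s in ("start", "skip_confirm"):
        print(s, violation_probability(s, 3), should_intervene(s))
```

In this toy chain, no rule has been broken yet in `skip_confirm`, but the monitor would already flag it because the estimated risk exceeds the threshold. A purely rule-based check would stay silent until `violation` is actually reached.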