Fault Lines & Safety Nets: How RAFFLES Finds the First Domino in Agent Failures
A failed agent run rarely fails politely. It does not raise its hand at step 4 and say, “Here is the causal error; please patch the planner.” It drifts. A web agent grabs the wrong source. A coding agent trusts a bad assumption. A verifier rubber-stamps a plausible-looking answer. Twenty steps later, the final output is wrong, the dashboard says “failed,” and the team is left doing digital archaeology with a very expensive shovel. ...