Breaking Rules, Not Systems: How Penalties Make Autonomous Agents Behave
Emergency is a terrible product requirement. It sounds simple in a meeting: “The agent should follow policy, except when the situation is urgent.” Wonderful. Very human. Also almost useless. A delivery robot should not enter a restricted zone. Unless the package is critical medicine. A warehouse agent should not skip safety checks. Unless a fire alarm requires rerouting. A self-driving system should obey traffic norms. Unless an emergency trip makes delay costly. But “unless urgent” does not tell the agent which rule can bend, which rule must hold, and which shortcut turns the system from flexible into reckless. ...