Safety First, Reward Second — But Not Last
The safest robot in a factory is the one that never moves. It will not collide with a worker, damage a component, cross a restricted boundary, or exceed a speed limit. Its incident statistics will be immaculate. Its productivity statistics will be less impressive. This absurdly safe robot captures a genuine problem in reinforcement learning. When an agent is trained under strict safety constraints, an algorithm can reduce violations by teaching the agent to avoid doing anything difficult. The resulting policy may satisfy the safety department, at least on paper, while quietly failing at the task it was deployed to perform. ...