Patch, Don’t Preach: The Coming Era of Modular AI Safety
A patch is not a sermon. That distinction matters, because enterprise AI safety has spent too much time sounding like moral philosophy and too little time behaving like maintenance engineering. A deployed model develops a toxicity problem. A customer discovers a jailbreak route. A regulator changes the acceptable boundary for refusal. The usual answer is some combination of “wait for the next model release,” “fine-tune a new variant,” or “wrap it in another brittle instruction.” Very comforting. Also not exactly what one wants when the system is already in production. ...