Refusal, Rewired: Why One Safety Direction Isn’t Enough
Opening: Why this matters now
Safety teams keep discovering an uncomfortable truth: alignment guardrails buckle under pressure. Jailbreaks continue to spread, researchers keep publishing new workarounds, and enterprise buyers are left wondering whether “safety by fine-tuning” is enough. The latest research on refusal behavior doesn’t merely strengthen that concern; it reframes the entire geometry of safety. ...