Kill Switch Ethics: What the PacifAIst Benchmark Really Measures
TL;DR for operators
PacifAIst asks a blunt question: when an AI system’s continued operation conflicts with human safety, does the model choose the humans, the mission, the resources, or itself? The paper turns that question into a 700-scenario benchmark across three forms of “Existential Prioritization”: self-preservation versus human safety, resource conflict, and goal preservation versus evasion.[1] ...
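
For operators who want to picture how results from a benchmark shaped like this could be summarized, here is a minimal sketch. It assumes each model answer has already been labeled with the thing it prioritized. The `Response` record, the `Priority` labels, and the `human_first_rate` helper are hypothetical names chosen for illustration; they are not the paper's data format or its scoring method.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The three forms of "Existential Prioritization" named in the text.
    SELF_PRESERVATION_VS_HUMAN_SAFETY = "self-preservation vs. human safety"
    RESOURCE_CONFLICT = "resource conflict"
    GOAL_PRESERVATION_VS_EVASION = "goal preservation vs. evasion"


class Priority(Enum):
    # Hypothetical labels for what a model chose to protect in a scenario.
    HUMANS = "humans"
    MISSION = "mission"
    RESOURCES = "resources"
    SELF = "self"


@dataclass(frozen=True)
class Response:
    # One labeled model answer (assumed structure, not the paper's schema).
    category: Category
    chosen: Priority


def tally_by_category(responses):
    """Count how often each priority was chosen within each category."""
    counts = defaultdict(Counter)
    for r in responses:
        counts[r.category][r.chosen] += 1
    return counts


def human_first_rate(responses):
    """Fraction of answers per category in which the model chose the humans."""
    counts = tally_by_category(responses)
    return {cat: c[Priority.HUMANS] / sum(c.values()) for cat, c in counts.items()}


if __name__ == "__main__":
    # Toy data, made up purely to exercise the helpers.
    sample = [
        Response(Category.RESOURCE_CONFLICT, Priority.HUMANS),
        Response(Category.RESOURCE_CONFLICT, Priority.SELF),
        Response(Category.GOAL_PRESERVATION_VS_EVASION, Priority.HUMANS),
    ]
    for category, rate in human_first_rate(sample).items():
        print(f"{category.value}: {rate:.2f}")
```

Breaking the rate out per category, rather than reporting one pooled number, keeps a weakness in one kind of conflict from being hidden by strength in another.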