Themis Knows Best: When AI Judges Start Training Other AI
Opening — Why this matters now Autonomous agents are finally leaving the sandbox. From GUI automation to full computer-use agents, the frontier is no longer about whether models can act—but whether they can learn from acting without collapsing into noise. The uncomfortable truth: scaling models is easy. Scaling reliable learning signals is not. This paper introduces a framework—quietly but decisively—that reframes the problem. Not as a model problem. Not even as a data problem. ...