Themis Knows Best: When AI Judges Start Training Other AI
Click. The button moved. The page refreshed. A popup appeared, then disappeared. The agent says the task is done. The screenshot looks plausible. The log is long enough to impress a project manager and confusing enough to defeat a reviewer with a normal human attention span. Now comes the awkward question: should the agent be rewarded? ...