Rules of Attraction: How LLMs Learn to Judge Better Than We Do
Opening: Why this matters now
In the last year, AI evaluation quietly became the industry’s most fragile dependency. LLMs are now asked to judge everything, from student essays to political sentiment to the quality of each other’s outputs. Companies use them to score customer emails, assess compliance risks, and even grade internal documentation. The problem is obvious: we’re relying on systems that struggle to agree with themselves. ...