Consistency Is Not a Coincidence: When LLM Agents Disagree With Themselves
Opening — Why This Matters Now We are entering the age of agentic AI. Not chatbots. Not autocomplete on steroids. Agents that search, retrieve, execute, and decide. And here is the uncomfortable question: If you run the same LLM agent on the same task twice — do you get the same behavior? According to the recent empirical study “When Agents Disagree With Themselves: Measuring Behavioral Consistency in LLM-Based Agents” (arXiv:2602.11619v1), the answer is often no. ...