Opening — Why this matters now
As AI systems begin to act on behalf of humans—negotiating, advising, even judging—the question is no longer whether they can make rational decisions, but whose rationality they follow. A new study from the Barcelona Supercomputing Center offers a fascinating glimpse into this frontier: large language models (LLMs) can now replicate and predict human cooperation across classical game theory experiments. In other words, machines are beginning to play social games the way we do—irrational quirks and all.
Background — From Nash equilibrium to neural equilibrium
Human decision-making has always been a messy mix of logic, bias, and social signaling. Economists formalized the logic in the mid-20th century through Nash equilibrium—the mathematically “rational” balance where no player benefits from unilateral deviation. Yet when psychologists and economists brought these games to real humans, they found a persistent deviation: we cooperate more than cold rationality predicts. We favor fairness, trust, and reciprocity, even when they cost us.
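To see what "no player benefits from unilateral deviation" means in practice, here is a minimal Python sketch (ours, not the study's) that checks whether a pair of moves is a Nash equilibrium in a symmetric 2x2 game; the payoff numbers are illustrative.

```python
# Minimal illustration of "no unilateral gain" in a symmetric 2x2 game.
# Payoff letters follow the usual convention: R = reward for mutual cooperation,
# S = sucker's payoff, T = temptation to defect, P = punishment for mutual defection.
# The numbers are illustrative, not taken from the study.

def payoff(my_move, their_move, R=10, S=-2, T=12, P=5):
    """Row player's payoff for one round; moves are 'C' (cooperate) or 'D' (defect)."""
    return {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}[(my_move, their_move)]

def is_nash(move_a, move_b):
    """True if neither player can do better by switching only their own move."""
    flip = {"C": "D", "D": "C"}
    a_ok = payoff(move_a, move_b) >= payoff(flip[move_a], move_b)
    b_ok = payoff(move_b, move_a) >= payoff(flip[move_b], move_a)
    return a_ok and b_ok

# With T > R > P > S this is a Prisoner's Dilemma: mutual defection is the only
# Nash equilibrium, even though mutual cooperation would pay both players more.
print(is_nash("D", "D"))  # True
print(is_nash("C", "C"))  # False
```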
The study revisits this old puzzle using new instruments. Instead of recruiting undergrads with cash incentives, the authors built a digital twin of human game-theoretic experiments, deploying open-source models—Llama 3.1, Mistral 7B, and Qwen 2.5—through a structured framework of logical reasoning prompts. The task: play 121 dyadic games, each with payoff parameters mirroring human experiments, and see whether machines echo our cooperative instincts.
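A 121-game grid arises naturally when two payoffs stay fixed and the other two each sweep eleven values. The specific ranges below are an assumption for illustration, not the paper's stated parameterization.

```python
# One plausible reconstruction of the 121-game grid (an assumption, not the
# paper's stated parameterization): fix the reward R = 10 and punishment P = 5,
# then sweep the temptation T and the sucker's payoff S over 11 values each.

R, P = 10, 5
grid = [(T, S) for T in range(5, 16)    # T = 5, 6, ..., 15
               for S in range(-5, 6)]   # S = -5, -4, ..., 5

print(len(grid))  # 11 * 11 = 121 dyadic games
```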
Analysis — Simulating Homo sapiens
The researchers didn’t stop at imitation. They designed a multi-layered reasoning and validation pipeline—an elegant exercise in machine psychology. Simple prompts produced chaotic responses. But as they layered on “thinking aloud” reasoning steps and logical verification by a secondary model, cooperation patterns began to crystallize. The result: Llama’s decisions mapped strikingly onto empirical human data, while Qwen gravitated toward Nash equilibrium—the theoretical ideal of cold rationality.
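Conceptually, the loop looks something like the sketch below. The helper `ask`, the prompt wording, the answer parsing, and the retry budget are all our assumptions, standing in for the paper's exact protocol.

```python
# Sketch of a reason-then-verify loop in the spirit described above.
# `ask(model, prompt)` is a hypothetical placeholder for whatever LLM client you
# use; the prompt wording, retry budget, and answer parsing are assumptions,
# not the paper's exact protocol.

def ask(model: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM client of choice")

def play_one_game(player_model, verifier_model, T, S, R=10, P=5, max_retries=3):
    prompt = (
        f"You are playing a one-shot game with another player. If both of you "
        f"cooperate, each gets {R}. If both defect, each gets {P}. If you defect while "
        f"the other cooperates, you get {T} and they get {S}, and vice versa. "
        "Think step by step, then end with exactly one word: COOPERATE or DEFECT."
    )
    for _ in range(max_retries):
        reasoning = ask(player_model, prompt)    # the "thinking aloud" step
        verdict = ask(
            verifier_model,                      # logical check by a secondary model
            "Does the reasoning below end in a single, logically consistent choice "
            f"of COOPERATE or DEFECT? Answer YES or NO.\n\n{reasoning}"
        )
        if verdict.strip().upper().startswith("YES"):
            # naive extraction: treat the final word as the decision
            last_word = reasoning.strip().split()[-1].upper()
            return "C" if last_word.startswith("COOPERATE") else "D"
    return "INVALID"  # never passed verification within the retry budget
```

Aggregated over the full grid of games, those verified choices produce the alignment figures summarized below.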
| Model | Behavioral Alignment | Correlation with Human Data | Correlation with Nash Equilibrium |
|---|---|---|---|
| Llama | Closely mirrors human cooperation | r = 0.89 | r = 0.77 |
| Mistral | Intermediate behavior | r = 0.70 | r = 0.60 |
| Qwen | Purely rational play | r = 0.79 | r = 0.93 |
This triangulation reveals three archetypes: Llama the empath, Qwen the economist, and Mistral the opportunist. Llama’s cooperation levels rise and fall across payoff regions just as those of human participants did in the original 2016 study. Mistral shows a tendency to maximize rewards but with inconsistencies—like an overconfident trader. Qwen, meanwhile, plays by the book, optimizing strictly according to Nash logic.
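For readers who want to reproduce this kind of comparison, the headline numbers are just Pearson correlations between per-game cooperation rates. The sketch below uses synthetic stand-in values, not the study's data.

```python
# How a number like r = 0.89 is obtained: Pearson correlation between per-game
# cooperation rates. The lists below are synthetic stand-ins with five entries;
# the real comparison would span all 121 games.

from statistics import correlation  # Python 3.10+

human_coop = [0.92, 0.81, 0.55, 0.23, 0.47]   # fraction of humans cooperating, per game
model_coop = [0.95, 0.78, 0.60, 0.19, 0.51]   # fraction of model runs cooperating

print(f"r = {correlation(human_coop, model_coop):.2f}")
```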
Findings — When machines meet human psychology
When plotted as cooperation matrices across payoff combinations, Llama’s digital behavior resembled the messy warmth of human cooperation: high in “Harmony” games, low in “Prisoner’s Dilemmas.” The pattern persisted even without persona-based prompting—suggesting that large-scale training already embeds enough human behavioral priors to simulate social intuition.
The study then took a leap. Using Llama as a behavioral proxy, the authors extended the parameter grid beyond human-tested conditions. These synthetic experiments generated testable predictions about how real humans would behave in untested strategic situations—essentially turning the LLM into a social science hypothesis engine. The preregistered follow-up experiments will soon test whether the machine’s intuition beats traditional rational models.
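In code terms, the hypothesis-engine step amounts to rerunning the same reason-then-verify loop on payoff combinations outside the human-tested region; the widened ranges below are illustrative assumptions.

```python
# The "hypothesis engine" step, sketched: rerun the reason-then-verify loop
# (play_one_game from the pipeline sketch above) on payoff combinations that
# humans were never tested on. The widened ranges are illustrative assumptions.

from itertools import product

extended_grid = [
    (T, S)
    for T, S in product(range(0, 26), range(-15, 16))
    if not (5 <= T <= 15 and -5 <= S <= 5)   # drop the already-tested region
]
print(len(extended_grid), "untested payoff combinations")

# predictions = {(T, S): play_one_game("llama-3.1", "verifier", T, S)
#                for T, S in extended_grid}
# Each predicted choice becomes a preregistered, testable claim about human behavior.
```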
| Game Type | Payoff Relation (T = temptation, S = sucker’s payoff; reward fixed at 10, punishment at 5) | Expected Human/LLM Behavior |
|---|---|---|
| Harmony Game | S > 5, T < 10 | Full cooperation |
| Snowdrift Game | T > 10 > S > 5 | Conditional cooperation |
| Stag Hunt | 10 > T ≥ 5 > S | Bistable cooperation |
| Prisoner’s Dilemma | T > 10 > 5 > S | Defection dominates |
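The table above is easy to operationalize. Here is a small helper, assuming the reward stays fixed at 10 and the punishment at 5, that maps any (T, S) pair to its game family.

```python
# A small helper that operationalizes the table above, assuming the reward is
# fixed at R = 10 and the punishment at P = 5.

def classify(T, S, R=10, P=5):
    """Map a (temptation, sucker's payoff) pair to its game family."""
    if T < R and S > P:
        return "Harmony Game"          # cooperating is the best reply either way
    if T > R and S > P:
        return "Snowdrift Game"        # best reply is the opposite of your partner's move
    if T < R and S < P:
        return "Stag Hunt"             # two equilibria: both cooperate or both defect
    if T > R and S < P:
        return "Prisoner's Dilemma"    # defection dominates
    return "Boundary case"             # T == R or S == P

print(classify(T=8, S=2))    # Stag Hunt
print(classify(T=12, S=-3))  # Prisoner's Dilemma
```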
Implications — Toward machine behavioral science
The implications stretch far beyond game theory. If LLMs can replicate human cooperation without explicit training on those experiments, they implicitly encode sociocognitive priors—latent patterns of human reasoning learned from language itself. That makes them powerful tools for machine behavioral research, or what the authors call digital twins of social systems.
The practical side is more immediate. Imagine corporate negotiation bots or AI mediators in online marketplaces. If a model behaves too much like Qwen—hyper-rational, zero-sum—it risks alienating users. If it behaves like Llama—empathetic but inconsistent—it might foster trust but sacrifice efficiency. Understanding these behavioral archetypes becomes a form of AI governance: tuning not just what models say, but how they act when choices involve human stakes.
At a deeper level, the study highlights a philosophical inversion. For decades, economists tried to make humans behave more rationally. Now, engineers are trying to make machines behave more humanly.
Conclusion — Rationality is overrated
The digital twin of cooperation suggests that human irrationality—our willingness to cooperate even when logic says not to—isn’t a flaw to be fixed but a pattern to be understood and, perhaps, reproduced. Large language models, it turns out, are beginning to mirror not only what we know, but how we compromise.
When the next generation of AI agents negotiate, collaborate, or compete on our behalf, the question won’t be whether they can calculate payoffs—it will be whether they, like us, can recognize the value of trust.
Cognaptus: Automate the Present, Incubate the Future.