Game Theory

When Aligned Models Compete: Nash Equilibria as the New Alignment Layer

Attention is a strange boss. It does not simply reward the best content, the most balanced opinion, or the most socially useful answer. It rewards whatever survives the rules of the environment. That distinction matters once AI systems stop being isolated chatbots and start behaving like a population: autonomous accounts, synthetic creators, enterprise agents, customer-facing bots, negotiation assistants, research agents, and ranking-aware content machines. Each one may be aligned in the usual single-model sense. Each one may pass safety checks. Each one may avoid obvious toxicity. Then they are released into the same market for attention, engagement, approval, conversion, or influence. ...

Pruning Is a Game, and Most Weights Lose

Pruning usually sounds like housekeeping. Train the model. Rank the weights. Remove the small ones. Fine-tune the survivor. Pretend the whole exercise was more scientific than it looked in the notebook. That workflow has worked well enough to become familiar. But familiarity is not explanation. It tells us how to remove model components after training; it says less about why some components become removable in the first place. The paper “Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks” asks a sharper question: what if pruning is not merely an external compression operation, but the outcome of competition inside the model? [1] ...

When Safety Stops Being a Turn-Based Game

Jailbreaks are not polite enough to wait their turn. That is the awkward weakness in many safety-training pipelines. A model is attacked, patched, tested, and released. Then another attack appears, usually crafted with more creativity than the previous defense assumed. The safety team patches again. The benchmark improves. The real attack surface moves. Everyone calls this iteration, because “organized whack-a-mole with GPUs” sounds less respectable. ...

Graph Minds, Game Moves: How Multi‑Agent Learning Is Quietly Redrawing AI Strategy

A traffic light is not just a traffic light once the other lights start learning. That is the uncomfortable starting point for strategic AI systems. A single model can optimise a route, price, recommendation, allocation, or control policy. But the moment other decision-makers are learning at the same time, the environment stops behaving like scenery. It becomes a cast. Each actor updates, reacts, misreads, cooperates, defects, imitates, or quietly ruins the assumptions in your simulator. Very rude, but entirely realistic. ...

The Rational Illusion: How LLMs Outplayed Humans at Cooperation

A negotiation bot walks into a pricing dispute. That is not the start of a joke. It is the start of a procurement problem, a marketplace design problem, a customer-service escalation problem, and, sooner than executives would like to admit, a governance problem. Once AI systems begin making choices on behalf of organisations, their behaviour in social settings matters. Not just whether they answer correctly. Not just whether they sound polite. Whether they cooperate, defect, compromise, optimise, over-trust, or quietly behave like a very caffeinated economist. ...

Enemy at the Gates, Friends at the Table: Why Competition Makes LLM Agents More Cooperative

TL;DR for operators: Competition is usually sold as the thing that makes agents sharper, more adversarial, and perhaps a little too pleased with themselves. This paper points in a more useful direction: controlled external competition can make agent teams more cooperative internally, but only when it is paired with repeated interaction. The study places Qwen3 14B, Phi4 reasoning, and Cogito 14B agents into Iterated Prisoner’s Dilemma tournaments under three conditions: repeated interaction only, group competition only, and a combined “super-additive” setup where agents face both team structure and repeated encounters [1]. For Qwen3 and Phi4, the combined setting produces the strongest cooperation. Qwen3’s mean cooperation rate rises from 0.22 in repeated interaction and 0.23 in group competition to 0.32 in the combined setting. Phi4 moves more sharply, from 0.21 and 0.13 to 0.43. ...

Game of Prompts: How Game Theory and Agentic LLMs Are Rewriting Cybersecurity

TL;DR for operators: A suspicious domain appears in a DNS log. A conventional classifier either recognises it, misses it, or assigns a confidence score that someone in the SOC must interpret while pretending the queue is under control. The paper’s more interesting proposal is not “let an LLM summarise the alert”. That would be the enterprise equivalent of putting a helpful intern on a fire alarm. ...