Diffusing to Coordinate: When Multi-Agent RL Learns to Breathe

Robots are easy to imagine as individuals. A quadruped walks. A drone flies. A warehouse arm picks. The business slide is usually kind enough to show one machine, one task, one satisfying arrow from input to output. Reality is less polite. A quadruped is not one decision-maker. It is a committee of limbs negotiating with gravity. A multi-drone system is not one policy with four propellers. It is a moving argument about timing, local perception, shared goals, and what not to crash into. A factory cell with multiple robotic agents is even worse: every local action changes the environment other agents are trying to understand. ...

February 23, 2026 · 17 min · Zelina

Don’t Self-Sabotage Me Now: Rational Policy Gradients for Sane Multi-Agent Learning

Kitchen work is not hard because chopping onions is metaphysically difficult. It is hard because two people must agree, implicitly and quickly, who gets the onion, who holds the plate, who waits by the pot, and who moves out of the corridor before everyone performs a small culinary traffic accident. That is why Overcooked remains such a useful multi-agent benchmark. It turns coordination into something visible. Agents do not merely need to “perform a task”; they need to infer what another agent is about to do and avoid becoming a sentient obstacle. ...

November 13, 2025 · 14 min · Zelina