Coaching the Swarm: Why Multi‑Agent RL Finally Scales

Blame is the unglamorous foundation of automation. When a human team misses a deadline, managers rarely ask only, “Did the project succeed?” They ask a more useful question: which handoff failed? Did the analyst misunderstand the data? Did engineering break the pipeline? Did the reviewer approve a bad output because the earlier work looked plausible? This is the difference between evaluation and coaching. Evaluation produces a score. Coaching produces a diagnosis. ...

February 3, 2026 · 17 min · Zelina

The Rise of FreePhD: How Multiagent Systems are Reimagining the Scientific Method

A broken file link is not usually where scientific revolutions begin. It is, however, where many automated workflows die. That is why the most revealing moment in the freephdlabor paper is not the grand claim about personalised AI research groups. It is the rather unromantic episode where the system tries to write a paper, discovers that the experiment data are missing because of a failed symlink, attempts workarounds, fails validation, reports the failure, gets routed back through resource preparation, rebuilds the workspace correctly, and only then proceeds to manuscript generation.1 ...

October 25, 2025 · 15 min · Zelina

Personas with Purpose: How TinyTroupe Reimagines Multiagent Simulation

TL;DR for operators: TinyTroupe is not another “let’s make five agents debate the product roadmap” toy. The paper’s useful move is sharper: it treats persona simulation as a different engineering problem from assistive AI.1 Assistive agents are trained to be helpful, polite, comprehensive, and often suspiciously agreeable. Human simulation needs almost the opposite: inconsistency, reluctance, taste, memory, background, class signals, cultural context, and the ability to say “no” for reasons that are not optimised for the user’s happiness. Annoying, yes. Also known as customers. ...

July 15, 2025 · 19 min · Zelina