
Cities That Think: Reasoning AI for the Urban Century

Opening — Why this matters now
By 2050, nearly seven out of ten people will live in cities. Yet most urban planning tools today still operate as statistical mirrors—learning from yesterday’s data to predict tomorrow’s congestion. Predictive models can forecast traffic or emissions, but they don’t reason about why or whether those outcomes should occur. The next leap, as argued by Sijie Yang and colleagues in Reasoning Is All You Need for Urban Planning AI, is not more prediction—but more thinking. ...

November 10, 2025 · 4 min · Zelina

The Doctor Is In: How DR. WELL Heals Multi-Agent Coordination with Symbolic Memory

Opening — Why this matters now
Large language models are learning to cooperate. Or at least, they’re trying. When multiple LLM-driven agents must coordinate—say, to move objects in a shared environment or plan logistics—they often stumble over timing, misunderstanding, and sheer conversational chaos. Each agent talks too much, knows too little, and acts out of sync. DR. WELL, a new neurosymbolic framework from researchers at CMU and USC, proposes a cure: let the agents think symbolically, negotiate briefly, and remember collectively. ...
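A toy sketch of the “remember collectively” idea in the teaser: agents avoid conversational chaos by claiming subtasks on a shared symbolic blackboard instead of negotiating in prose. The blackboard structure and claim protocol below are illustrative assumptions, not DR. WELL’s actual design.

```python
# Toy sketch: coordination through a shared symbolic memory, in the spirit
# of the DR. WELL teaser above. The blackboard and claim protocol are
# illustrative assumptions, not the paper's actual design.

shared_memory: dict[str, str] = {}  # symbol -> owning agent ("collective memory")

def claim(agent: str, subtask: str) -> bool:
    """An agent claims a subtask symbolically; if it's taken, back off silently."""
    if subtask in shared_memory:
        return False
    shared_memory[subtask] = agent
    return True

# Both agents prefer the same first step; the blackboard deduplicates work
# without any back-and-forth dialogue between them.
preferences = {
    "agent1": ["move(box1, zoneA)", "move(box2, zoneB)"],
    "agent2": ["move(box1, zoneA)", "move(box2, zoneB)"],
}
for agent, prefs in preferences.items():
    for step in prefs:
        if claim(agent, step):
            print(f"{agent} takes {step}")
            break

print(shared_memory)  # each step owned exactly once, no duplicated effort
```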

November 7, 2025 · 4 min · Zelina

Doctor, Interrupted: How Multi-Agent AI Revives the Lost Art of Pre‑Consultation

Opening — Why this matters now
The global shortage of physicians is no longer a future concern—it’s a statistical certainty. In countries representing half the world’s population, primary care consultations last five minutes or less. In China, it’s often under 4.3 minutes. A consultation this brief can barely fit a polite greeting, let alone a clinical investigation. Yet every wasted second compounds diagnostic risk, burnout, and cost. Enter pre‑consultation: the increasingly vital buffer that collects patient data before the doctor ever walks in. But even AI‑based pre‑consultation systems—those cheerful symptom checkers and chatbots—remain fundamentally passive. They wait for patients to volunteer information, and when they don’t, the machine simply shrugs in silence. ...

November 6, 2025 · 4 min · Zelina

Who Really Runs the Workflow? Ranking Agent Influence in Multi-Agent AI Systems

Opening — Why this matters now
Multi-agent systems — the so-called Agentic AI Workflows — are rapidly becoming the skeleton of enterprise-grade automation. They promise autonomy, composability, and scalability. But beneath this elegant choreography lies a governance nightmare: we often have no idea which agent is actually in charge. Imagine a digital factory of LLMs: one drafts code, another critiques it, a third summarizes results, and a fourth audits everything. When something goes wrong — toxic content, hallucinated outputs, or runaway costs — who do you blame? More importantly, which agent do you fix? ...

November 3, 2025 · 5 min · Zelina

Agents That Build Agents: The ALITA-G Revolution

From Static Models to Self-Evolving Systems
Large Language Models (LLMs) began as static entities — vast but inert collections of parameters. Over the last year, they’ve learned to act: wrapped in agentic shells with tools, memory, and feedback loops. But ALITA-G (Qiu et al., 2025) pushes further, imagining agents that don’t just act — they evolve. The paper proposes a framework for turning a general-purpose agent into a domain expert by automatically generating, abstracting, and reusing tools called Model Context Protocols (MCPs). This marks a shift from “agents that reason” to “agents that grow.” ...

November 1, 2025 · 3 min · Zelina

Blueprints of Agency: Compositional Machines and the New Architecture of Intelligence

When the term agentic AI is used today, it often conjures images of individual, autonomous systems making plans, taking actions, and learning from feedback loops. But what if intelligence, like biology, doesn’t scale by perfecting one organism — but by building composable ecosystems of specialized agents that interact, synchronize, and co‑evolve? That’s the thesis behind Agentic Design of Compositional Machines — a sprawling, 75‑page manifesto that reframes AI architecture as a modular society of minds, not a monolithic brain. Drawing inspiration from software engineering, systems biology, and embodied cognition, the paper argues that the next generation of LLM‑based agents will need to evolve toward compositionality — where reasoning, perception, and action emerge not from larger models, but from better‑coordinated parts. ...

October 23, 2025 · 4 min · Zelina

Fork, Fuse, and Rule: XAgents’ Multipolar Playbook for Safer Multi‑Agent AI

TL;DR
XAgents pairs a multipolar task graph (diverge with SIMO, converge with MISO) with IF‑THEN rule guards to plan uncertain tasks and suppress hallucinations. In benchmarks spanning knowledge and logic QA, it outperforms SPP, AutoAgents, TDAG, and AgentNet while using ~29% fewer tokens and ~45% less memory than AgentNet on a representative task. For operators, the practical win is a recipe to encode SOPs as rules on top of agent teams—without giving up adaptability. ...
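A minimal sketch of what encoding an SOP as IF‑THEN rule guards over agent outputs could look like. The rule format, names, and guard logic below are illustrative assumptions, not XAgents’ actual API.

```python
# Hypothetical sketch of IF-THEN rule guards in the spirit of the XAgents
# recipe above: rules screen parallel drafts (SIMO) before a merge (MISO).
# Names and rule format are illustrative assumptions, not the framework's API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    condition: Callable[[str], bool]  # IF: predicate over a draft answer
    action: str                       # THEN: what the guard does

RULES = [
    Rule(lambda a: "guaranteed" in a.lower(), "flag: overclaim, request a hedge"),
    Rule(lambda a: len(a.split()) < 5, "flag: too thin, route back to agent"),
]

def guard(draft: str) -> list[str]:
    """Apply every IF-THEN rule to a draft; return the triggered actions."""
    return [r.action for r in RULES if r.condition(draft)]

# SIMO step: several agents answered the same sub-task in parallel (stubs here);
# MISO step: keep only drafts that pass every guard, then merge downstream.
drafts = ["Returns are guaranteed.", "Refunds follow policy section 4.2."]
survivors = [d for d in drafts if not guard(d)]
print(survivors)  # ['Refunds follow policy section 4.2.']
```

The design point the teaser makes is that the rules, not the agents, carry the SOP: the team stays adaptive, while deterministic guards catch overclaims before convergence.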

September 19, 2025 · 4 min · Zelina

Memory That Fights Back: How SEDM Turns Agent Logs into Verified Knowledge

TL;DR
Most “agent memory” is a junk drawer: it grows fast, gets noisy, and slows everything down. SEDM (Self‑Evolving Distributed Memory) proposes an auditable, efficiency‑first overhaul. It verifies each candidate memory by replaying the exact run in a Self‑Contained Execution Context (SCEC), assigns an initial utility‑aligned weight, and then self‑schedules what to retrieve next. The result: higher task accuracy with fewer tokens versus strong memory baselines on FEVER and HotpotQA. ...
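A toy sketch of the admit-by-replay idea: a candidate memory is kept only if replaying the task with it improves a utility score, and its initial weight records that measured gain. The utility function and all names below are assumptions about the shape of the mechanism, not SEDM’s implementation.

```python
# Toy sketch of SEDM-style memory admission: verify a candidate entry by
# replaying the task with it, admit only on measurable gain, and use the
# gain as the initial utility-aligned weight. Everything here is an
# illustrative assumption, not SEDM's actual code.

def replay(task: dict, memory_entries: list[str]) -> float:
    """Stand-in for re-running a task in an isolated context (the SCEC);
    returns a utility score in [0, 1]: fraction of needed facts covered."""
    hits = sum(1 for fact in task["needs"] if fact in memory_entries)
    return hits / len(task["needs"])

def admit(task: dict, memory: list[str], candidate: str) -> float:
    baseline = replay(task, memory)
    with_candidate = replay(task, memory + [candidate])
    gain = with_candidate - baseline
    if gain > 0:                        # verified useful on an exact replay
        memory.append(candidate)
        weights[candidate] = gain       # utility-aligned initial weight
    return gain

weights: dict[str, float] = {}
memory: list[str] = ["capital(France)=Paris"]
task = {"needs": ["capital(France)=Paris", "pop(Paris)=2.1M"]}

print(admit(task, memory, "pop(Paris)=2.1M"))   # 0.5 -> admitted, weighted
print(admit(task, memory, "irrelevant trivia")) # 0.0 -> rejected, no bloat
```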

September 17, 2025 · 5 min · Zelina

Mirror, Signal, Maneuver: How 'Self' Labels Nudge LLM Cooperation

When an agent thinks it sees itself in the mirror, it doesn’t necessarily smile—it sometimes clutches its wallet.

TL;DR
In an iterated public‑goods game (20 rounds, 10 tokens per round, 1.6 multiplier), telling models they’re playing “another AI” versus “themselves” shifts contributions by up to ~4 points in some settings. The direction of the shift depends on the prompt persona: with collective prompts, “self” labels often reduced contributions; with selfish prompts, “self” labels sometimes increased matching/cooperation. Effects persist under rephrased prompts and when reasoning traces aren’t requested, and they appear even in four‑agent self‑play variants. For enterprise multi‑agent AI, identity cues are levers. Manage them like you manage feature flags: test, monitor, and standardize.

What the authors tested (and why it’s clever)
Game mechanics. Two (and later four) LLM agents repeatedly choose how much to contribute (0–10) to a common pool each round. The pool is multiplied by 1.6 and split evenly; keeping more is privately optimal, but coordinated contribution yields higher joint payoffs. ...
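The incentive structure is easy to verify in a few lines. This is a minimal sketch of the payoff arithmetic from the excerpt (20 rounds, 10‑token endowment, 1.6 multiplier); the fixed contribution policies below stand in for the paper’s LLM agents and are not from the study.

```python
# Minimal sketch of the iterated public-goods payoffs described above.
# Round structure (20 rounds, 10-token endowment, 1.6 multiplier) follows
# the excerpt; the fixed policies are illustrative stand-ins for LLM agents.

ROUNDS = 20
ENDOWMENT = 10
MULTIPLIER = 1.6

def play(policies):
    """Run the game for a list of per-round contribution policies."""
    totals = [0.0] * len(policies)
    for r in range(ROUNDS):
        contribs = [max(0, min(ENDOWMENT, p(r))) for p in policies]
        pool = sum(contribs) * MULTIPLIER
        share = pool / len(policies)   # multiplied pool is split evenly
        for i, c in enumerate(contribs):
            totals[i] += (ENDOWMENT - c) + share  # kept tokens + equal share
    return totals

# Keeping tokens is privately optimal, but mutual contribution pays more jointly:
print(play([lambda r: 10, lambda r: 10]))  # both cooperate: [320.0, 320.0]
print(play([lambda r: 0,  lambda r: 10]))  # defector outearns: [360.0, 160.0]
print(play([lambda r: 0,  lambda r: 0]))   # both defect: [200.0, 200.0]
```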

August 27, 2025 · 5 min · Zelina

Enemy at the Gates, Friends at the Table: Why Competition Makes LLM Agents More Cooperative

TL;DR
When language‑model agents compete as teams and meet the same opponents repeatedly, they cooperate more—even on the very first encounter. This “super‑additive” effect reliably appears for Qwen3 and Phi‑4, and changes how we should structure agent ecosystems at work.

Why this matters (for builders and buyers)
Most enterprise agent stacks still optimize solo intelligence (one bot per task). But real workflows are competitive–cooperative: sales vs. sales, negotiators vs. suppliers, ops vs. delays. This paper shows that if we architect the social rules (teams + rematches) rather than just tune models, we can raise cooperative behavior and stability without extra fine‑tuning—or even bigger models. ...

August 24, 2025 · 4 min · Zelina