
Many Arms, Fewer Bugs: Why Coding Agents Need to Stop Working Alone

Opening — Why this matters now
For all the breathless demos, AI coding agents still collapse embarrassingly often when faced with real software engineering: large repositories, ambiguous issues, long horizons, and no hand-holding. Benchmarks like SWE-bench-Live have made this painfully explicit. Models that look heroic on curated tasks suddenly forget how to navigate a codebase without spiraling into context soup. ...

December 31, 2025 · 4 min · Zelina

When Reflection Needs a Committee: Why LLMs Think Better in Groups

Opening — Why this matters now
LLMs have learned how to explain themselves. What they still struggle with is learning from those explanations. Reflexion was supposed to close that gap: let the model fail, reflect in natural language, try again — no gradients, no retraining, just verbal reinforcement. Elegant. Cheap. And, as this paper demonstrates, fundamentally limited. ...

December 28, 2025 · 3 min · Zelina

FinAgent: When AI Starts Shopping for Your Groceries (and Your Health)

Opening — Why this matters now
Inflation doesn’t negotiate, food prices don’t stay put, and household budgets—especially middle‑income ones—are asked to perform daily miracles. Most digital tools respond politely after the damage is done: expense trackers explain where money went, diet apps scold what you ate. What they rarely do is coordinate. This paper proposes FinAgent, an agentic AI system that does something radical by modern standards: it plans ahead, adapts continuously, and treats nutrition and money as the same optimization problem. ...

December 25, 2025 · 4 min · Zelina

Don’t Tell the Robot What You Know

Opening — Why this matters now
Large Language Models are very good at knowing. They are considerably worse at helping. As AI systems move from chat interfaces into robots, copilots, and assistive agents, collaboration becomes unavoidable. And collaboration exposes a deeply human cognitive failure that LLMs inherit wholesale: the curse of knowledge. When one agent knows more than another, it tends to communicate as if that knowledge were shared. ...

December 20, 2025 · 4 min · Zelina

Artism, or How AI Learned to Critique Itself

Opening — Why this matters now
AI didn’t kill originality. It industrialized its absence. Contemporary art has been circling the same anxiety for decades: the sense that everything has already been done, named, theorized, archived. AI merely removed the remaining friction. What once took years of study and recombination now takes seconds of probabilistic interpolation. The result is not a new crisis, but a visible one. ...

December 18, 2025 · 4 min · Zelina

NeuralFOMO: When LLMs Care About Being Second

Opening — Why this matters now
LLMs no longer live alone. They rank against each other on leaderboards, bid for tasks inside agent frameworks, negotiate in shared environments, and increasingly compete—sometimes quietly, sometimes explicitly. Once models are placed side-by-side, performance stops being purely absolute. Relative standing suddenly matters. This paper asks an uncomfortable question: do LLMs care about losing—even when losing costs them nothing tangible? ...

December 16, 2025 · 4 min · Zelina

Agents Without Time: When Reinforcement Learning Meets Higher-Order Causality

Opening — Why this matters now
Reinforcement learning has spent the last decade obsessing over better policies, better value functions, and better credit assignment. Physics, meanwhile, has been busy questioning whether time itself needs to behave nicely. This paper sits uncomfortably—and productively—between the two. At a moment when agentic AI systems are being deployed in distributed, partially observable, and poorly synchronized environments, the assumption of a fixed causal order is starting to look less like a law of nature and more like a convenience. Wilson’s work asks a precise and unsettling question: what if decision-making agents and causal structure are the same mathematical object viewed from different sides? ...

December 12, 2025 · 3 min · Zelina

When Agents Think in Waves: Diffusion Models for Ad Hoc Teamwork

Opening — Why this matters now
Collaboration is the final frontier of autonomy. As AI agents move from single-task environments to shared, unpredictable ones — driving, logistics, even disaster response — the question is no longer can they act, but can they cooperate? Most reinforcement learning (RL) systems still behave like lone wolves: excellent at optimization, terrible at teamwork. The recent paper PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork proposes a striking alternative — a diffusion-based framework where agents learn not just to act, but to anticipate and adapt, even alongside teammates they’ve never met. ...

November 11, 2025 · 3 min · Zelina

When AI Argues Back: The Promise and Peril of Evidence-Based Multi-Agent Debate

Opening — Why this matters now
The world doesn’t suffer from a lack of information—it suffers from a lack of agreement about what’s true. From pandemic rumors to political spin, misinformation now spreads faster than correction, eroding trust in institutions and even in evidence itself. As platforms struggle to moderate and fact-check at scale, researchers have begun asking a deeper question: Can AI not only detect falsehoods but also argue persuasively for the truth? ...

November 11, 2025 · 4 min · Zelina

When AI Discovers Physics: Inside the Multi-Agent Renaissance of Scientific Machine Learning

Opening — Why this matters now
Scientific discovery has always been bottlenecked by one thing: human bandwidth. In scientific machine learning (SciML), where physics meets data-driven modeling, that bottleneck shows up as painstaking trial and error—architectures tuned by hand, loss functions adjusted by intuition, and results validated by weeks of computation. Enter AgenticSciML, a new framework from Brown University that asks a radical question: What if AI could not only run the experiment, but design the method itself? ...

November 11, 2025 · 4 min · Zelina