Cover image

Many Arms, Fewer Bugs: Why Coding Agents Need to Stop Working Alone

Opening — Why this matters now For all the breathless demos, AI coding agents still collapse embarrassingly often when faced with real software engineering: large repositories, ambiguous issues, long horizons, and no hand-holding. Benchmarks like SWE-bench-Live have made this painfully explicit. Models that look heroic on curated tasks suddenly forget how to navigate a codebase without spiraling into context soup. ...

December 31, 2025 · 4 min · Zelina
Cover image

When Rewards Learn to See: Teaching Humanoids What the Ground Looks Like

Opening — Why this matters now Humanoid robots can now run, jump, and occasionally impress investors. What they still struggle with is something more mundane: noticing the stairs before falling down them. For years, reinforcement learning (RL) has delivered impressive locomotion demos—mostly on flat floors. The uncomfortable truth is that many of these robots are, functionally speaking, blind. They walk well only because the ground behaves politely. Once the terrain becomes uneven, discontinuous, or adversarial, performance collapses. ...

December 21, 2025 · 4 min · Zelina
Cover image

Let There Be Light (and Agents): Automating Quantum Experiments

Opening — Why this matters now Quantum optics sits at an awkward intersection: conceptually elegant, mathematically unforgiving, and operationally tedious. Designing even a “classic” experiment often means stitching together domain intuition, optical components, and simulation code—usually in tools that were never designed for conversational exploration. As AI agents move from text completion to task execution, the obvious question emerges: can they design experiments, not just describe them? ...

December 20, 2025 · 3 min · Zelina
Cover image

Memory Over Models: Letting Agents Grow Up Without Retraining

Opening — Why this matters now We are reaching the awkward teenage years of AI agents. LLMs can already do things: book hotels, navigate apps, coordinate workflows. But once deployed, most agents are frozen in time. Improving them usually means retraining or fine-tuning models—slow, expensive, and deeply incompatible with mobile and edge environments. The paper “Beyond Training: Enabling Self-Evolution of Agents with MOBIMEM” takes a blunt stance: continual agent improvement should not depend on continual model training. Instead, evolution should happen where operating systems have always handled adaptation best—memory. ...

December 20, 2025 · 4 min · Zelina
Cover image

Shaking the Stack: Teaching Seismology to Talk Back

Opening — Why this matters now Scientific software has a strange tradition: world‑class physics wrapped in workflows that feel frozen in the 1990s. Seismology is no exception. SPECFEM — arguably the gold standard for seismic wave simulation — delivers extraordinary numerical fidelity, but only after users survive a rite of passage involving fragile text files, shell scripts, and MPI incantations. ...

December 17, 2025 · 4 min · Zelina
Cover image

When AI Discovers Physics: Inside the Multi-Agent Renaissance of Scientific Machine Learning

Opening — Why this matters now Scientific discovery has always been bottlenecked by one thing: human bandwidth. In scientific machine learning (SciML), where physics meets data-driven modeling, that bottleneck shows up as painstaking trial and error—architectures tuned by hand, loss functions adjusted by intuition, and results validated by weeks of computation. Enter AgenticSciML, a new framework from Brown University that asks a radical question: What if AI could not only run the experiment, but design the method itself? ...

November 11, 2025 · 4 min · Zelina
Cover image

Thinking Fast and Flowing Slow: Real-Time Reasoning for Autonomous Agents

Opening — Why this matters now AI agents are getting smarter—but not faster. Most large language model (LLM) systems still behave like cautious philosophers in a chess match: the world patiently waits while they deliberate. In the real world, however, traffic lights don’t freeze for an AI car mid-thought, and market prices don’t pause while a trading agent reasons about “the optimal hedge.” The new study Real-Time Reasoning Agents in Evolving Environments by Wen et al. (2025) calls this out as a fundamental flaw in current agent design—and offers a solution that blends human-like intuition with deliberative reasoning. ...

November 10, 2025 · 4 min · Zelina
Cover image

Fast Minds, Cheap Thinking: How Predictive Routing Cuts LLM Reasoning Costs

Opening — Why this matters now Large reasoning models like GPT-5 and s1.1-32B can solve Olympiad-level problems — but they’re computationally gluttons. Running them for every query, from basic arithmetic to abstract algebra, is like sending a rocket to fetch groceries. As reasoning models become mainstream in enterprise automation, the question is no longer “Can it reason?” but “Should it reason this hard?” ...

November 9, 2025 · 4 min · Zelina
Cover image

Unpacking the Explicit Mind: How ExplicitLM Redefines AI Memory

Why this matters now Every few months, another AI model promises to be more “aware” — but awareness is hard when memory is mush. Traditional large language models (LLMs) bury their knowledge across billions of parameters like a neural hoarder: everything is stored, but nothing is labeled. Updating a single fact means retraining the entire organism. The result? Models that can write essays about Biden while insisting he’s still president. ...

November 6, 2025 · 4 min · Zelina
Cover image

When AI Packs Too Much Hype: Reassessing LLM 'Discoveries' in Bin Packing

Opening — Why this matters now The academic world has been buzzing ever since a Nature paper claimed that large language models (LLMs) had made “mathematical discoveries.” Specifically, through a method called FunSearch, LLMs were said to have evolved novel heuristics for the classic bin packing problem—an NP-hard optimization task as old as modern computer science itself. The headlines were irresistible: AI discovers new math. But as with many shiny claims, the real question is whether the substance matches the spectacle. ...

November 5, 2025 · 5 min · Zelina