
When Your House Talks Back: Teaching Buildings to Think About Energy

Opening — Why this matters now
Buildings quietly consume around a third of the world’s energy. Most of that consumption is governed not by grand strategy, but by human habit: when people cook, charge vehicles, cool rooms, or forget to turn things off. For decades, Building Energy Management Systems (BEMS) promised optimization. In practice, they delivered dashboards—dense, technical, and mostly ignored. ...

January 1, 2026 · 4 min · Zelina

Many Arms, Fewer Bugs: Why Coding Agents Need to Stop Working Alone

Opening — Why this matters now
For all the breathless demos, AI coding agents still collapse embarrassingly often when faced with real software engineering: large repositories, ambiguous issues, long horizons, and no hand-holding. Benchmarks like SWE-bench-Live have made this painfully explicit. Models that look heroic on curated tasks suddenly forget how to navigate a codebase without spiraling into context soup. ...

December 31, 2025 · 4 min · Zelina

The Web, Reimagined as a World Model

Opening — Why this matters now
Language agents are no longer satisfied with short conversations and disposable prompts. They want places—environments where actions have consequences, memory persists, and the world does not politely forget everything after the next API call. Unfortunately, today’s tooling offers an awkward choice: either rigid web applications backed by databases, or fully generative world models that hallucinate their own physics and promptly lose the plot. ...

December 30, 2025 · 4 min · Zelina

Guardrails Over Gigabytes: Making LLM Coding Agents Behave

Opening — Why this matters now
AI coding agents are everywhere—and still, maddeningly unreliable. They pass unit tests they shouldn’t. They hallucinate imports. They invent APIs with confidence that would be admirable if it weren’t so destructive. The industry response has been predictable: bigger models, longer prompts, more retries. This paper proposes something less glamorous and far more effective: stop asking stochastic models to behave like deterministic software engineers. ...

December 27, 2025 · 4 min · Zelina

Traffic, but Make It Agentic: When Simulators Learn to Think

Opening — Why this matters now
Traffic simulation has always promised more than it delivers. City planners, transport researchers, and policymakers are told that with the right simulator, congestion can be eased, emissions reduced, and infrastructure decisions made rationally. In practice, most simulators demand deep domain expertise, rigid workflows, and a tolerance for configuration pain that few real-world users possess. ...

December 25, 2025 · 4 min · Zelina

Let There Be Light (and Agents): Automating Quantum Experiments

Opening — Why this matters now
Quantum optics sits at an awkward intersection: conceptually elegant, mathematically unforgiving, and operationally tedious. Designing even a “classic” experiment often means stitching together domain intuition, optical components, and simulation code—usually in tools that were never designed for conversational exploration. As AI agents move from text completion to task execution, the obvious question emerges: can they design experiments, not just describe them? ...

December 20, 2025 · 3 min · Zelina

Memory Over Models: Letting Agents Grow Up Without Retraining

Opening — Why this matters now
We are reaching the awkward teenage years of AI agents. LLMs can already do things: book hotels, navigate apps, coordinate workflows. But once deployed, most agents are frozen in time. Improving them usually means retraining or fine-tuning models—slow, expensive, and deeply incompatible with mobile and edge environments. The paper “Beyond Training: Enabling Self-Evolution of Agents with MOBIMEM” takes a blunt stance: continual agent improvement should not depend on continual model training. Instead, evolution should happen where operating systems have always handled adaptation best—memory. ...

December 20, 2025 · 4 min · Zelina

Shaking the Stack: Teaching Seismology to Talk Back

Opening — Why this matters now
Scientific software has a strange tradition: world‑class physics wrapped in workflows that feel frozen in the 1990s. Seismology is no exception. SPECFEM — arguably the gold standard for seismic wave simulation — delivers extraordinary numerical fidelity, but only after users survive a rite of passage involving fragile text files, shell scripts, and MPI incantations. ...

December 17, 2025 · 4 min · Zelina

When Agents Loop: Geometry, Drift, and the Hidden Physics of LLM Behavior

Opening — Why this matters now
Agentic AI systems are everywhere—self-refining copilots, multi-step reasoning chains, autonomous research bots quietly talking to themselves. Yet beneath the productivity demos lurks an unanswered question: what actually happens when an LLM talks to itself repeatedly? Does meaning stabilize, or does it slowly dissolve into semantic noise? The paper “Dynamics of Agentic Loops in Large Language Models” offers an unusually rigorous answer. Instead of hand-waving about “drift” or “stability,” it treats agentic loops as discrete dynamical systems and analyzes them geometrically in embedding space. The result is less sci‑fi mysticism, more applied mathematics—and that’s a compliment. ...

December 14, 2025 · 4 min · Zelina

Forget Me Not: How IterResearch Rebuilt Long-Horizon Thinking for AI Agents

Opening — Why this matters now
The AI world has become obsessed with “long-horizon” reasoning—the ability for agents to sustain coherent thought over hundreds or even thousands of interactions. Yet most large language model (LLM) agents, despite their size, collapse under their own memory. The context window fills, noise piles up, and coherence suffocates. Alibaba’s IterResearch tackles this problem not by extending memory—but by redesigning it. ...

November 11, 2025 · 4 min · Zelina