
Gated Sparse Attention: Speed Without the Sink

Opening — Why this matters now
Long-context language models have crossed an uncomfortable threshold. Context windows now stretch to 128K tokens and beyond, yet the core attention mechanism still scales quadratically. The result is a growing mismatch between what models can theoretically ingest and what is economically and operationally feasible. At the same time, training instability — loss spikes, attention sinks, brittle gradients — continues to haunt large-scale runs. ...

January 24, 2026 · 4 min · Zelina

ResMAS: When Multi‑Agent Systems Stop Falling Apart

Opening — Why this matters now
Multi-agent systems (MAS) built on large language models have developed a bad habit: they work brilliantly—right up until the moment one agent goes off-script. A single failure, miscommunication, or noisy response can quietly poison the entire collaboration. In production environments, this isn’t a hypothetical risk; it’s the default operating condition. ...

January 11, 2026 · 4 min · Zelina

Infinite Tasks, Finite Minds: Why Agents Keep Forgetting—and How InfiAgent Cheats Time

Opening — Why this matters now
Everyone wants an autonomous agent that can just keep going. Write a literature review. Audit 80 papers. Run an open-ended research project for days. In theory, large language models (LLMs) are perfect for this. In practice, they quietly collapse under their own memory. The problem isn’t model intelligence. It’s state. ...

January 7, 2026 · 4 min · Zelina

Unpacking the Explicit Mind: How ExplicitLM Redefines AI Memory

Why this matters now
Every few months, another AI model promises to be more “aware” — but awareness is hard when memory is mush. Traditional large language models (LLMs) bury their knowledge across billions of parameters like a neural hoarder: everything is stored, but nothing is labeled. Updating a single fact means retraining the entire organism. The result? Models that can write essays about Biden while insisting he’s still president. ...

November 6, 2025 · 4 min · Zelina

Blueprints of Agency: Compositional Machines and the New Architecture of Intelligence

When the term agentic AI is used today, it often conjures images of individual, autonomous systems making plans, taking actions, and learning from feedback loops. But what if intelligence, like biology, doesn’t scale by perfecting one organism — but by building composable ecosystems of specialized agents that interact, synchronize, and co‑evolve?

That’s the thesis behind Agentic Design of Compositional Machines — a sprawling, 75‑page manifesto that reframes AI architecture as a modular society of minds, not a monolithic brain. Drawing inspiration from software engineering, systems biology, and embodied cognition, the paper argues that the next generation of LLM‑based agents will need to evolve toward compositionality — where reasoning, perception, and action emerge not from larger models, but from better‑coordinated parts. ...

October 23, 2025 · 4 min · Zelina

Layers of Thought: How Hierarchical Memory Supercharges LLM Agent Reasoning

Most LLM agents today think in flat space. When you ask a long-term assistant a question, it either scrolls endlessly through past turns or scours an undifferentiated soup of semantic vectors to recall something relevant. This works—for now. But as tasks get longer, more nuanced, and more personal, this memory model crumbles under its own weight. A new paper proposes an elegant solution: H-MEM, or Hierarchical Memory. Instead of treating memory as one big pile of stuff, H-MEM organizes past knowledge into four semantically structured layers: Domain, Category, Memory Trace, and Episode. It’s the difference between a junk drawer and a filing cabinet. ...

August 1, 2025 · 3 min · Zelina