Bubble Trouble: Why Top‑K Retrieval Keeps Letting LLMs Down

A practical reading of Context Bubble construction: why enterprise RAG needs constrained, auditable context assembly rather than larger top-k piles.

January 16, 2026 · 18 min · Zelina

Drawing with Ghost Hands: When GenAI Helps Architects — and When It Quietly Undermines Them

A mechanism-first reading of experimental evidence showing why GenAI helps novice architectural designers, fails to broadly lift performance, and can quietly weaken creative agency.

January 16, 2026 · 16 min · Zelina

One Agent Is a Bottleneck: When Genomics QA Finally Went Multi-Agent

A mechanism-first reading of GenomAgent: why specialized multi-agent orchestration improved genomics QA accuracy while cutting tool-use cost.

January 16, 2026 · 15 min · Zelina

Reasoning or Guessing? When Recursive Models Hit the Wrong Fixed Point

A mechanistic reading of HRM shows why recursive depth can look like reasoning while behaving more like attractor search—and how that changes reliability testing for business AI systems.

January 16, 2026 · 16 min · Zelina

When Agents Talk Back: Why AI Collectives Need a Social Theory

A mechanism-first reading of why LLM agent teams cannot be governed by single-agent benchmarks or MARL logic alone.

January 16, 2026 · 18 min · Zelina

When Goals Collide: Synthesizing the Best Possible Outcome

How multi-property LTLf synthesis turns impossible all-or-nothing specifications into computable frontiers of guaranteed outcomes.

January 16, 2026 · 16 min · Zelina

When Models Know They’re Wrong: Catching Jailbreaks Mid-Sentence

SafeProbing suggests that jailbreak defense may work better when models are monitored during generation, not judged only after the damage is already written.

January 16, 2026 · 3 min · Zelina

EvoFSM: Teaching AI Agents to Evolve Without Losing Their Minds

A mechanism-first reading of EvoFSM, a finite-state-machine approach to making self-evolving AI research agents more adaptive without letting them rewrite themselves into chaos.

January 15, 2026 · 13 min · Zelina

Knowing Is Not Doing: When LLM Agents Pass the Task but Fail the World

Task2Quiz shows why agent evaluation needs to separate task completion from grounded environment understanding.

January 15, 2026 · 14 min · Zelina

Lean LLMs, Heavy Lifting: When Workflows Beat Bigger Models

A case-first look at why structured workflows and data tools, not just larger models, are the real bottleneck-breakers for large-scale optimization modeling.

January 15, 2026 · 12 min · Zelina