
Greedy, but Not Blind: Teaching Optimization to Listen

Opening — Why this matters now Public-sector AI has a credibility problem. Not because it cannot optimize—but because it optimizes too cleanly. In health system planning, decisions are rarely about pure efficiency. They are negotiated compromises shaped by terrain, politics, institutional memory, and hard-earned intuition. Classic optimization methods politely ignore all that. This paper tackles a question many planners quietly ask but rarely formalize: Can we let algorithms optimize without silencing human judgment—and still keep mathematical guarantees intact? ...

January 19, 2026 · 4 min · Zelina

Think-with-Me: When LLMs Learn to Stop Thinking

Opening — Why this matters now The AI industry has developed an unhealthy obsession with thinking longer. More tokens, deeper chains, bigger context windows—surely that must mean better reasoning. Except, increasingly, it doesn’t. Large Reasoning Models (LRMs) often reason past the point of usefulness, slipping into self-validation loops or overwriting correct answers with unnecessary exploration. This paper proposes a heretical idea in the age of scaling: maybe the model doesn’t need to think more—it needs to know when to stop. ...

January 19, 2026 · 3 min · Zelina

One-Shot Brains, Fewer Mouths: When Multi-Agent Systems Learn to Stop Talking

Opening — Why this matters now Multi-agent LLM systems are having a moment. Software engineering agents argue with each other, math solvers debate proofs, and code reviewers nitpick outputs like caffeinated interns. The results are often impressive—and painfully expensive. Token budgets explode, latency compounds, and the coordination logic starts to look like an over-managed meeting that should have been an email. ...

January 18, 2026 · 4 min · Zelina

Redundancy Overload Is Optional: Finding the FDs That Actually Matter

Opening — Why this matters now Functional dependency (FD) discovery has quietly become a victim of its own success. Modern algorithms can enumerate everything—and that is precisely the problem. On realistic schemas, exhaustive FD discovery produces hundreds of thousands of valid dependencies, most of which are technically correct and practically useless. Computationally expensive. Cognitively overwhelming. Operationally irrelevant. ...

January 18, 2026 · 4 min · Zelina

When the Right Answer Is No Answer: Teaching AI to Refuse Messy Math

Opening — Why this matters now Multimodal models have become unnervingly confident readers of documents. Hand them a PDF, a scanned exam paper, or a photographed worksheet, and they will happily extract text, diagrams, and even implied structure. The problem is not what they can read. It is what they refuse to unread. In real classrooms, mathematics exam papers are not pristine artifacts. They are scribbled on, folded, stained, partially photographed, and occasionally vandalized by enthusiastic graders. Yet most document benchmarks still assume a polite world where inputs are complete and legible. This gap matters. An AI system that confidently invents missing math questions is not merely wrong—it is operationally dangerous. ...

January 18, 2026 · 4 min · Zelina

Recommendations With Receipts: When LLMs Have to Prove They Behaved

Opening — Why this matters now LLMs are increasingly trusted to recommend what we watch, buy, or read. But trust breaks down the moment a regulator, auditor, or policy team asks a simple question: prove that this recommendation followed the rules. Most LLM-driven recommenders cannot answer that question. They can explain themselves fluently, but explanation is not enforcement. In regulated or policy-heavy environments—media platforms, marketplaces, cultural quotas, fairness mandates—that gap is no longer tolerable. ...

January 17, 2026 · 4 min · Zelina

When Memory Stops Guessing: Stitching Intent Back into Agent Memory

Opening — Why this matters now Everyone is chasing longer context windows. Million-token prompts. Endless chat logs. The assumption is simple: if the model can see everything, it will remember correctly. This paper shows why that assumption fails. In long-horizon, goal-driven interactions, errors rarely come from missing information. They come from retrieving the wrong information—facts that are semantically similar but contextually incompatible. Bigger windows amplify the problem. Noise scales faster than relevance. ...

January 17, 2026 · 3 min · Zelina

Drawing with Ghost Hands: When GenAI Helps Architects — and When It Quietly Undermines Them

Opening — Why this matters now Architectural studios are quietly changing. Not with robotic arms or parametric scripts, but with prompts. Text-to-image models now sit beside sketchbooks, offering instant massing ideas, stylistic variations, and visual shortcuts that once took hours. The promise is obvious: faster ideation, lower friction, fewer blank pages. The risk is less visible. When creativity is partially outsourced, what happens to confidence, authorship, and cognitive effort? ...

January 16, 2026 · 4 min · Zelina

One Agent Is a Bottleneck: When Genomics QA Finally Went Multi-Agent

Opening — Why this matters now Genomics QA is no longer a toy problem for language models. It sits at the uncomfortable intersection of messy biological databases, evolving schemas, and questions that cannot be answered from static training data. GeneGPT proved that LLMs could survive here—barely. This paper shows why surviving is not the same as scaling. ...

January 16, 2026 · 3 min · Zelina

Reasoning or Guessing? When Recursive Models Hit the Wrong Fixed Point

Opening — Why this matters now Reasoning models are having a moment. Latent-space architectures promise to outgrow chain-of-thought without leaking tokens or ballooning costs. Benchmarks seem to agree. Some of these systems crack puzzles that leave large language models flat at zero. And yet, something feels off. This paper dissects a flagship example—the Hierarchical Reasoning Model (HRM)—and finds that its strongest results rest on a fragile foundation. The model often succeeds not by steadily reasoning, but by stumbling into the right answer and staying there. When it stumbles into the wrong one, it can stay there too. ...

January 16, 2026 · 4 min · Zelina