
Graph Work, Not Graph Worship: RAGA Turns RAG Into an Auditable Knowledge Operation

TL;DR for operators: RAGA is not another “add a graph and accuracy goes up” paper. That would be too convenient, and therefore suspicious. The useful idea is more operational: treat retrieval-augmented generation as a knowledge management process, not a pile of embeddings with a polite chatbot on top. The paper proposes RAGA, short for Reading-And-Graph-building-Agent, an autonomous system that reads documents, searches existing graph knowledge, verifies whether new entities or relations should be added, and then constructs or updates a knowledge graph with source-linked provenance.[1] Its core loop is Read–Search–Verify–Construct, implemented as a ReAct-style tool-calling agent rather than a one-shot extraction pipeline. ...

June 16, 2026 · 20 min · Zelina
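
For readers who want the Read–Search–Verify–Construct loop from the RAGA teaser in concrete form, here is a minimal Python sketch. It is not the paper's implementation: the paper describes a ReAct-style tool-calling agent, whereas this is a plain deterministic loop. The names `Graph`, `read`, and `ingest`, the pipe-delimited triple format, and the verify rule (skip a triple already recorded from the same source) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # Hypothetical store: (subject, relation, object) -> set of source doc ids.
    edges: dict = field(default_factory=dict)

    def search(self, triple):
        """Return the sources already recorded for this triple (may be empty)."""
        return self.edges.get(triple, set())

    def construct(self, triple, source):
        """Add the triple, or attach another source to an existing triple."""
        self.edges.setdefault(triple, set()).add(source)


def read(text):
    """Toy 'read' step: one 'subject | relation | object' triple per line."""
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples


def ingest(graph, doc_id, text):
    """Run read -> search -> verify -> construct for one document."""
    log = []
    for triple in read(text):
        known_sources = graph.search(triple)
        # Assumed verify rule: nothing to do if this source is already linked.
        if doc_id in known_sources:
            log.append(("skipped", triple))
            continue
        action = "linked" if known_sources else "added"
        graph.construct(triple, doc_id)
        log.append((action, triple))
    return log
```

Calling `ingest` twice with the same triple from two different document ids leaves one edge carrying both ids, which is the source-linked provenance idea in miniature. A real system would replace the toy `read` step with model-driven extraction and a much stricter verify step.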

Search, Critique, Repeat: Critic-R Turns RAG Complaints into Retriever Training

Search failure is boring until it becomes expensive. A research agent asks for evidence. The retriever returns documents. The reasoning model reads them, continues writing, and eventually produces a confident answer. Somewhere in the middle, the evidence was slightly wrong: not irrelevant enough to trigger an obvious failure, not useful enough to support the next reasoning step. The agent proceeds anyway, because that is what agents do when we dress up uncertainty as workflow automation. ...

June 8, 2026 · 17 min · Zelina

Memory Lane Has Potholes: MemFail and the Business of Testing Agent Recall

Memory is where enterprise AI demos go to become operationally embarrassing. In the demo, the assistant remembers that a client prefers concise weekly updates, that a trader avoids high-leverage positions after volatility spikes, or that a procurement manager only approves a supplier when compliance documents are current. In production, the same assistant may remember the attractive half of the fact and quietly lose the condition. It recalls “approves supplier” but forgets “only when compliance documents are current.” Congratulations: the agent has not forgotten. It has remembered dangerously. ...

June 4, 2026 · 15 min · Zelina

Don’t Average the Needle: Spectral Retrieval and the RAG Evidence Problem

Enterprise search has a very old habit wearing a very modern jacket: it averages. A policy document becomes one vector. A runbook becomes one vector. A postmortem full of operational detail becomes one vector. Then a RAG system asks that one vector whether the document is relevant. This is convenient, fast, and usually defensible — until the relevant answer is a narrow paragraph hiding inside a large document. At that point, the retrieval system is no longer searching for evidence. It is asking a crowd to speak for the witness. ...

May 30, 2026 · 16 min · Zelina
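
The averaging problem in the teaser above is easy to reproduce with toy numbers. The sketch below compares a query against a document's mean vector and against its best-matching chunk. The three-dimensional vectors are invented for illustration, and "take the maximum over chunks" is only a simple stand-in used to expose the effect; it is not the spectral method the post goes on to discuss.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Element-wise mean: the 'one vector per document' representation."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical document: three chunks on one topic, one narrow chunk on another.
chunks = [
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],  # the "needle" paragraph
]
query = [0.0, 1.0, 0.0]  # asks about exactly what the needle covers

doc_score = cosine(query, mean_vector(chunks))
best_chunk_score = max(cosine(query, chunk) for chunk in chunks)

print(f"averaged document score: {doc_score:.3f}")
print(f"best single chunk score: {best_chunk_score:.3f}")
```

With these numbers the averaged score comes out near 0.316 while the best chunk scores 1.0: the one relevant paragraph is outvoted three to one by its neighbors.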

Provenance, Not Providence: Why AI Answers Need Receipts

Opening: Why this matters now. The current AI market has become very good at producing fluent answers and very bad at explaining where those answers came from. This is not a minor inconvenience. It is the difference between an assistant that can be trusted in an operational workflow and an assistant that merely performs confidence with attractive typography. ...

May 9, 2026 · 14 min · Zelina

Receipts, Please: RAG’s New Evidence Stack

Opening: Why this matters now. The original business pitch for retrieval-augmented generation was wonderfully simple: connect the model to your documents, ask questions, get grounded answers. No need to retrain the model. No need to wait for the next foundation-model release. Just give the chatbot some files and let productivity bloom. ...

May 7, 2026 · 17 min · Zelina

When AI Can Solve But Can't Search: The MathNet Equation

Search. That is the unglamorous part of AI work. The demo asks a model to solve a clean problem. The enterprise system asks a model to find the right prior case, retrieve the relevant precedent, avoid the misleading near-match, and then adapt the answer without making a confident mess of it. MathNet is interesting because it puts that distinction under pressure. The paper introduces a large multilingual, multimodal Olympiad mathematics benchmark, but the more useful business lesson is not merely that frontier models can solve hard math. We already have enough leaderboards wearing medals. The sharper finding is that models and embedding systems can still fail at recognizing when two problems are mathematically the same, or when one problem is structurally useful for another.[1] ...

April 23, 2026 · 13 min · Zelina

Write-Back to the Future: When Your RAG Starts Learning

A RAG system usually fails in a very ordinary way. The retriever finds something relevant, but not quite enough. The generator receives five passages, three of which are useful, one of which is decorative furniture, and one of which looks relevant only because it shares the right vocabulary. The answer is then expected to emerge from this little committee of half-helpful paragraphs. Sometimes it does. Sometimes it does what committees do. ...

March 27, 2026 · 19 min · Zelina

Build a Small RAG Knowledge Tool

How to build a lightweight retrieval-augmented knowledge tool with grounded answers, source citations, narrow scope, and a realistic MVP.

March 16, 2026 · 5 min · Michelle

Memory Diet for AI Agents: Distilling Conversations Without Forgetting

Memory has become the awkward invoice attached to every serious AI agent demo. A short chatbot can survive on vibes. A long-running coding assistant cannot. After a few weeks of debugging sessions, architecture debates, config changes, rejected fixes, and “remember we tried this already?” moments, the agent’s past becomes valuable. It also becomes inconveniently large. The obvious solution is to stuff more transcript into the prompt. The obvious solution is usually how software gets expensive before it gets useful. ...

March 16, 2026 · 16 min · Zelina