Graph Work, Not Graph Worship: RAGA Turns RAG Into an Auditable Knowledge Operation

TL;DR for operators: RAGA is not another “add a graph and accuracy goes up” paper. That would be too convenient, and therefore suspicious. The useful idea is more operational: treat retrieval-augmented generation as a knowledge management process, not a pile of embeddings with a polite chatbot on top. The paper proposes RAGA, short for Reading-And-Graph-building-Agent, an autonomous system that reads documents, searches existing graph knowledge, verifies whether new entities or relations should be added, and then constructs or updates a knowledge graph with source-linked provenance.[1] Its core loop is Read–Search–Verify–Construct, implemented as a ReAct-style tool-calling agent rather than a one-shot extraction pipeline. ...

June 16, 2026 · 20 min · Zelina

Fine-Tuned, Fine Print: Why Post-Training Teaches Models What to Trust

Enterprise AI has entered its “sure, but can it use the evidence?” phase. That is progress, technically. It is also where many deployment stories begin to get expensive. The first generation of business LLM adoption was satisfied if a model could produce a fluent answer. The next generation asks something more demanding: can the model use retrieved documents, compliance policies, tool outputs, customer records, analyst notes, and human feedback in the right way? ...

June 10, 2026 · 17 min · Zelina

Roll the Tape, Call the Tools: ReTool-Video and the Evidence-Routing Problem

Video is where AI demos go to become expensive. A model can describe a short clip. It can answer a question about a few sampled frames. It can even sound confident while doing so, which is apparently a product feature now. But business video work is rarely “what is happening in this five-second clip?” It is usually messier: find the exact moment in a two-hour training recording, count repeated actions without double-counting adjacent clips, verify whether an event appears in audio, subtitles, and frames, or decide whether a safety incident is real rather than just visually similar to one. ...

June 8, 2026 · 18 min · Zelina

Search, Critique, Repeat: Critic-R Turns RAG Complaints into Retriever Training

Search failure is boring until it becomes expensive. A research agent asks for evidence. The retriever returns documents. The reasoning model reads them, continues writing, and eventually produces a confident answer. Somewhere in the middle, the evidence was slightly wrong: not irrelevant enough to trigger an obvious failure, not useful enough to support the next reasoning step. The agent proceeds anyway, because that is what agents do when we dress up uncertainty as workflow automation. ...

June 8, 2026 · 17 min · Zelina

Curved Space, Straighter Retrieval: Why Graph RAG Needs Geometry

Retrieval looks simple until the wrong thing keeps showing up. A company builds a graph model over products, papers, suppliers, users, or transactions. The model performs reasonably well inside familiar territory. Then the data shifts. New products appear. A new research domain enters the citation graph. A social platform changes user behavior. The model’s internal knowledge, frozen inside parameters, starts behaving like yesterday’s org chart: technically structured, operationally stale. ...

June 6, 2026 · 15 min · Zelina

The Gate Before the Graph: Why Technical RAG Needs Evidence Control

Search is easy until it becomes responsible. A product engineer asks, “What methods exist for real-time tire friction estimation?” A normal search tool returns papers. A normal RAG system retrieves chunks. A confident LLM then writes a neat answer, preferably with enough bullet points to look managerial. The problem is not that this answer is always wrong. That would be mercifully simple. The problem is that it may be locally plausible but evidentially thin: two relevant chunks, one outdated method, no coverage of adjacent terminology, and a citation that looks reassuring mostly because it exists. ...

June 6, 2026 · 18 min · Zelina

Memory Lane Has Potholes: MemFail and the Business of Testing Agent Recall

Memory is where enterprise AI demos go to become operationally embarrassing. In the demo, the assistant remembers that a client prefers concise weekly updates, that a trader avoids high-leverage positions after volatility spikes, or that a procurement manager only approves a supplier when compliance documents are current. In production, the same assistant may remember the attractive half of the fact and quietly lose the condition. It recalls “approves supplier” but forgets “only when compliance documents are current.” Congratulations: the agent has not forgotten. It has remembered dangerously. ...

June 4, 2026 · 15 min · Zelina

Uncertain Terms: Hallucination Scores Are Triage Signals, Not Lie Detectors

A support ticket lands on the AI team’s desk: the enterprise chatbot answered confidently, cited the wrong policy, and somehow made the compliance team nostalgic for search boxes. The obvious next idea is to add an uncertainty score. When the model is unsure, route the answer to a verifier. When the score is high, reject the output. When the score is low, let it pass. Elegant. Cheap. Measurable. Also, as usual, a little too clean. ...

June 4, 2026 · 18 min · Zelina

K-Means, K-Gone: Sparse Coding and the Retrieval Bottleneck

Indexing is where many retrieval systems quietly become expensive. The demo looks harmless: upload documents, create embeddings, ask questions, receive answers with citations. Then the corpus starts behaving like a real business corpus. Policies change. Product pages are rewritten. Compliance documents are replaced. Support tickets arrive every hour. The retrieval layer must keep up, and suddenly the glamorous RAG stack is waiting for the plumbing to rebuild itself. As usual, the least photogenic component is the one holding the invoice. ...

June 2, 2026 · 21 min · Zelina

RAG and the Art of Not Dropping the Answer

A RAG team usually starts with a familiar ambition: make the retrieved context smarter. The raw document feels too long. The search snippet feels too primitive. The page structure looks messy. A query-focused summary sounds more elegant. A proposition list sounds more machine-readable. A paraphrase from a strong LLM sounds, at least cosmetically, like an upgrade. So the team builds another representation layer between retrieval and generation, hoping the model will reward the extra sophistication. ...

June 2, 2026 · 16 min · Zelina