RAG | Cognaptus

Memory Diet for AI Agents: Distilling Conversations Without Forgetting

Memory has become the awkward invoice attached to every serious AI agent demo. A short chatbot can survive on vibes. A long-running coding assistant cannot. After a few weeks of debugging sessions, architecture debates, config changes, rejected fixes, and “remember we tried this already?” moments, the agent’s past becomes valuable. It also becomes inconveniently large. The obvious solution is to stuff more transcript into the prompt. The obvious solution is usually how software gets expensive before it gets useful. ...

When AI Meets the Delivery Room: Designing Safe LLM Chatbots for Maternal Health

A patient does not usually send a neatly structured medical case report. She sends a short message. “Baby moving less today.” “Severe headache and blurred vision.” “What foods increase iron?” To a normal chatbot, these are three user queries. To a maternal-health system, they are three different operating modes. One can be answered with general education. One may require urgent escalation. One may be harmless, or not, depending on pregnancy stage, timing, severity, and missing context. This is where the usual AI product fantasy quietly breaks down: the hardest part is not producing a fluent answer. The hardest part is deciding whether the system should answer at all. ...

From Hallucination to Verification: Why AI Needs a Pharmacist’s Mindset

Prescription checks are a good way to humble AI. Not because the language is impossible. Drug labels, clinical notes, dosage instructions, contraindications, and interaction warnings are all text-heavy. LLMs are good at text. That part is not the problem. The problem is that prescription verification is not a writing task. It is a safety task disguised as a reading task. A pharmacist is not merely asking, “Does this paragraph sound medically reasonable?” The real question is narrower and harsher: given this patient, this drug, this dose, this route, this timing, this interaction profile, and this missing or available clinical data, is there a specific safety issue that must be raised? ...

Memory Matters: Teaching Medical AI to Remember Like a Pathologist

Memory is a boring word until the diagnosis is wrong. A pathologist does not look at a whole-slide image as a flat picture. They see morphology, compare it with disease categories, recall grading criteria, filter out misleading patterns, and decide which pieces of old knowledge deserve attention in the current case. That last part is easy to understate. Expertise is not only having knowledge. It is knowing when to activate it. ...

Mind the Units: Why LLMs Still Can't Count (And How CONE Fixes It)

Numbers look harmless until they enter a business database. A revenue field says 50. A dosage field says 50. An age field says 50. A follow-up period says 50. A unit may be present, missing, abbreviated, buried in the column header, or inconsistently written as ml, mL, or something the spreadsheet inherited from a PDF extraction pipeline during its villain era. ...

Double Helix, Double Checks: Why Agentic AI Needs Governance Before It Writes Your Code

Code is where AI confidence goes to become expensive. A chatbot can produce a plausible function in ten seconds. An agent can now plan a refactor, split files, update interfaces, generate documentation, and politely leave behind a system that fails because one event payload forgot a required field. Very efficient. Very modern. Very annoying. ...

Memory Isn’t Personal: Why LLMs Still Forget What You Like

A customer tells your AI assistant that she dislikes crowded tourist attractions. Three weeks later, she asks for a weekend itinerary. A good assistant should not proudly recommend the busiest landmark in the city. A less good assistant will do exactly that, but in a warm tone. This is the quiet failure mode behind many “personal AI” demos. The interface remembers the conversation. The product claims continuity. The model may even have a giant context window large enough to swallow a small novel. Yet when the user asks a new question, the system behaves as if the earlier preference is just decorative text floating somewhere in the attic. ...

When AI Agents Read the Manual: Why τ-Knowledge Exposes the Limits of LLM Reasoning

A customer asks a banking agent to handle a routine request. Freeze a card. Replace a lost wallet. Open a better savings account. Close an old credit card. Apply a referral bonus. Nothing here sounds like artificial general intelligence. It sounds like Tuesday morning in a customer support queue. Then the agent has to read the internal policy, discover which tool exists, verify the customer’s account state, notice that one action blocks another, decide whether the user’s claim needs verification, and make the right database update. ...

The Context Ceiling: When Long Context Stops Thinking

Documents are the easiest way to fool an AI system into looking serious. A procurement team uploads the full contract archive. A compliance team adds policy manuals, audit notes, and emails. A financial analyst stuffs transcripts, filings, and market commentary into one heroic prompt. The interface accepts it. The model answers fluently. Everyone relaxes. ...

Mirror, Mirror on the LLM: Teaching Models to Think About Their Thinking

Evidence is not the same as judgment. Anyone who has watched an AI assistant work through a multi-document question has seen the strange version of this failure. The model finds the relevant fact. It even says something that looks like the right answer. Then, a few paragraphs later, it invents an extra condition, follows that condition with great confidence, and lands somewhere else. ...