LLM-Agents

The Model That Forgot Itself: Why LLMs Drift Without Knowing

A chatbot can say the right thing for ten turns and still forget what it was trying to do. That is the uncomfortable idea behind “Probing the Lack of Stable Internal Beliefs in LLMs,” a paper that studies whether large language models can maintain an unstated goal across a multi-turn interaction.[1] The paper is not asking whether a model can avoid obvious contradictions. That is the familiar version of consistency: did the assistant say one thing on Monday and the opposite thing on Tuesday? ...

Belief Is a Graph: Why LLM Agents Need Structured Minds

Memory is the polite word we use when an LLM agent remembers a document, a user preference, or a previous chat message. It sounds reassuring. It also hides the awkward part: most agent memory is just stored text waiting to be retrieved. That is useful, but it is not the same as belief. ...

The Illusion of Anonymity: When AI Connects the Dots You Thought Were Safe

Anonymized data is still a story. A customer log has no name. A research interview has no email address. A support transcript has placeholders where the direct identifiers used to be. Everyone relaxes. Compliance smiles politely. The spreadsheet is now “anonymous.” This is the small office ritual behind a very large assumption: if we remove direct identifiers, the remaining data becomes hard enough to link back to real people. ...

The Hidden Playbook of LLMs: How AI Quietly Thinks Like a Hacker

Security work has always had a slightly unfashionable virtue: it forces abstractions to confess. A chatbot demo can survive a vague answer. A vulnerability analyst cannot. When the task is binary analysis, the system has to move through addresses, functions, call sites, arguments, sinks, and partial evidence. It has to decide which path is worth following, which branch is noise, when to stop staring at one hypothesis, and when to crawl back to an earlier lead. In other words, it has to do the thing most AI product pages politely avoid naming: control the search. ...

When Alignment Meets Reality: Why LLMs Can’t Agree With Themselves

A policy says one thing. A customer says another. A retrieved document says something newly alarming. A compliance rule says stop. A business workflow says continue. This is where large language models become interesting, and by “interesting” I mean expensive. Most companies still talk about LLM alignment as if it were a calibration problem. Tune the model. Add a system prompt. Insert a safety policy. Wrap it with retrieval. Then expect the assistant to behave consistently across messy real-world tasks. The paper “Are Dilemmas and Conflicts in LLM Alignment Solvable? A View from Priority Graph” argues that this expectation is too neat for the problem being solved.[1] ...

Ants in the Machine: What Swarm Intelligence Teaches Us About Routing LLM Agents

Routing is the unglamorous part of agentic AI. Which is exactly why it matters. A company can assemble a neat little digital workforce: one agent plans, one agent searches, one agent codes, one agent critiques, one agent writes the final answer. It looks sophisticated on a diagram. Then production traffic arrives, and the system discovers a more ancient truth: a committee is not useful if every request goes through the wrong people in the wrong order. ...

From Hallucination to Verification: Why AI Needs a Pharmacist’s Mindset

Prescription checks are a good way to humble AI. Not because the language is impossible. Drug labels, clinical notes, dosage instructions, contraindications, and interaction warnings are all text-heavy. LLMs are good at text. That part is not the problem. The problem is that prescription verification is not a writing task. It is a safety task disguised as a reading task. A pharmacist is not merely asking, “Does this paragraph sound medically reasonable?” The real question is narrower and harsher: given this patient, this drug, this dose, this route, this timing, this interaction profile, and this missing or available clinical data, is there a specific safety issue that must be raised? ...

Prompt Politics: How Tiny Policies Can Steer Entire AI Societies

Agents are easy to create. That is now the boring part. Give one LLM a persona, give another LLM a conflicting persona, add a shared task, let them talk, and suddenly the demo looks like a little society. A farmer argues with a conservationist. A rural teacher argues with an urban parent. A policy maker tries to sound balanced, because apparently even simulated bureaucracy has survival instincts. ...

Silver Bots: When Agentic AI Becomes the Caregiver

Medication is simple until someone forgets it twice, sleeps badly, skips breakfast, and says they feel “fine.” That is the real texture of elderly care. It is not one clean signal. It is a slow accumulation of weak signals: changed gait, missed pills, restless sleep, lower appetite, vague pain, repeated questions, a daughter who cannot visit this week, a nurse covering too many rooms, a home that is technically “smart” but not exactly wise. ...

When Plans Talk Back: Conversational AI Meets Classical Planning

Schedule three people, one car, two children, five afternoon activities, and several goals that quietly hate each other. Then ask a normal person to find the best plan. That is already a planning problem. Now ask the same person to understand why a plan failed, which goals caused the failure, what could be added without breaking the plan, and what must be sacrificed if one more constraint is enforced. ...