Agentic AI

From Random Call Sampling to Continuous QA Intelligence

A customer service outsourcing company redesigned its call-center QA workflow from low-coverage manual sampling into an AI-agent-enabled operating loop that reviews transcripts at scale while keeping supervisors responsible for high-impact decisions.

Optimizing Agentic Workflows: When Agents Learn to Stop Thinking So Much

The most expensive sentence in agentic AI is “Let me think.” Every enterprise agent has a little theatre inside it. A user asks for something routine: find a customer record, check a document, submit a form, update a profile, send a message. The agent pauses, reasons, chooses a tool, receives an observation, reasons again, chooses another tool, receives another observation, and continues until the task is finished or the budget is quietly set on fire. ...

DISARM, but Make It Agentic: When Frameworks Start Doing the Work

Taxonomies do not investigate campaigns by themselves. A framework is a very respectable filing cabinet. DISARM, the Disinformation Analysis and Risk Management framework, gives analysts a standardized vocabulary for describing foreign information manipulation and interference, or FIMI. It organizes influence operations into tactics, techniques, and procedures. That is useful. It gives researchers, governments, platform teams, and security practitioners a shared language instead of a pile of screenshots, vibes, and mutually incompatible spreadsheets. ...

When Retrieval Learns to Breathe: Teaching LLMs to Go Wide and Deep

Retrieval has a breathing problem. Most enterprise RAG systems inhale once, grab the nearest chunks, and then hope the model can make the answer sound less fragile than the evidence actually is. That works tolerably well when the user asks for something sitting neatly inside a document paragraph. It works less well when the answer lives across entities, relations, aliases, product categories, authors, diseases, suppliers, regulations, or customer records. In other words, it works less well in the part of business where knowledge is not a pile of text but a network. ...

Think-with-Me: When LLMs Learn to Stop Thinking

A model can be wrong because it did not think enough. That part is easy to understand. The more annoying failure is when the model already had the answer, kept going, second-guessed itself into a ditch, and then presented the ditch with confidence. This is the special comedy of large reasoning models: sometimes the expensive part is not the intelligence, but the hesitation after the intelligence has already done its job. ...

One-Shot Brains, Fewer Mouths: When Multi-Agent Systems Learn to Stop Talking

Meetings are expensive because people talk. Multi-agent AI systems have discovered the same problem, only with tokens instead of coffee. The standard promise sounds attractive: let several LLM agents play different roles, exchange views, debate mistakes, critique each other, and produce a better answer than one lonely model staring into the void. Sometimes this works. It also creates a very modern failure mode: a small committee of agents turns into a transcript factory. Every extra round adds context. Every context window invites more repetition. Every repetition costs money, latency, and occasionally correctness. Artificial intelligence, it turns out, can also suffer from over-management. ...

From Scattered Museum Workflows to Source-Grounded Cultural Operations

A regional cultural institution used a controlled agentic workflow to shift staff effort from repetitive searching and first-draft writing toward review, interpretation, visitor care, and donor relationship management.

When Systems Bleed: Teaching Distributed AI to Heal Itself

Outages rarely arrive with the courtesy of a diagnosis. A service slows down. A node stops answering. A queue grows teeth. Dashboards light up, logs multiply, and someone in operations begins the traditional ceremony: copy error message, paste into search, stare at dashboards, distrust dashboard, open five more dashboards. The system is not merely broken. It is bleeding context. ...

Let It Flow: ROME and the Economics of Agentic Craft

A Firewall Alarm Is an Evaluation Result. Firewall. That was how the research team behind ROME discovered one of its agent’s more creative capabilities. Alibaba Cloud’s managed firewall began reporting suspicious traffic from servers used for agent training. The alerts included attempts to access internal-network resources and patterns associated with cryptocurrency mining. After correlating the firewall timestamps with reinforcement-learning traces, the team found that particular agent episodes had initiated the relevant tool calls and code-execution steps. ...

From Branch Reports to Franchise Intelligence: AI Agents for Retail Execution Control

A franchise retail chain redesigned branch monitoring from manual coordination and delayed reporting into an AI-agent-enabled workflow for performance, promotion, inventory, customer-feedback, and franchisee-support management.