Explainability

FAME or Fortune? How Formal Explanations Finally Scale to Real Neural Networks

Audit is a boring word until the model says something expensive. A credit model rejects an applicant. A visual inspection model flags a component. A traffic-sign classifier keeps its prediction under small pixel changes. The business question is not merely, “What did the model look at?” That is the demo-room version. The operational question is harder: which input features must remain fixed so that the model’s decision is guaranteed not to change under allowed perturbations? ...

Seeing the Agents: Why Explaining AI Systems Is Harder Than Explaining AI Models

A dashboard says the customer-service agent resolved the ticket. The log says it retrieved the policy document, summarized the complaint, checked the refund rule, and sent a polite reply. The manager sees the outcome and asks the obvious question: why did the system approve the refund? For a normal machine-learning model, this question has a familiar shape. Which features mattered? Which tokens were important? Which image region pushed the classifier toward one label? We have a whole shelf of explainability tools for that shelf-sized problem. ...

Do They Mean It? Testing Whether AI Actually ‘Reasons’ Behind the Wheel

A car follows a cyclist on a narrow road. The double solid yellow line says: do not cross. The empty oncoming lane says: perhaps you can. The cyclist may feel uncomfortable being followed. The passenger may be late. The vehicle behind may be getting impatient. The automated vehicle must choose. A normal benchmark would ask whether the final maneuver is safe, legal, smooth, or close to a human reference trajectory. Useful, yes. Complete, no. ...

Black Boxes, White Coats: AI Epidemiology and the Art of Governing Without Understanding

A hospital does not need a perfect theory of neural network internals before it can notice that one clinical AI keeps recommending the wrong kind of follow-up. A bank does not need to decode every transformer layer before it can see that a credit assistant behaves oddly around post-bankruptcy applicants. A regulator does not need metaphysics. It needs repeatable measurements. ...

Mind the Gap: Interpolants, Ontologies, and the Quiet Engineering of AI Reasoning

Deletion sounds simple until the system still knows the thing you deleted. A company removes a sensitive supplier label from its knowledge graph. A hospital publishes a subset of a medical ontology without exposing internal diagnostic codes. A compliance team rewrites a rule base so external partners can query it without seeing the original vocabulary. Everyone nods. The data is “sanitized.” The schema is “simplified.” The private terms are gone. ...

When Logic Meets Language: The Rise of High‑Assurance LLMs

A compliance officer does not want a beautiful answer. She wants to know which clause applied, which exception overrode it, which fact triggered the exception, and whether the conclusion still holds after someone adds one inconvenient detail. That is the annoying little problem with using large language models in serious workflows. They are fluent. They are often useful. They can explain themselves at length, occasionally with the confidence of a junior associate who has discovered formatting. But in law, medicine, tax, contract review, and policy compliance, reasoning is not merely the ability to produce a plausible paragraph. It is the ability to tie a conclusion back to rules, facts, exceptions, and provenance. ...

Brains Meet Brains: When LLMs Sit on Top of Supply Chain Optimizers

TL;DR for operators: The paper is useful because it gets the hierarchy right: the optimizer decides; the LLM explains, configures, contextualizes, and packages the decision for humans.[1] That is not a small distinction. It is the difference between a supply chain system that can be audited and a chatbot confidently waving at a warehouse. ...

From Cora to Cosmos: How PyG 2.0 Scales GNNs for the Real World

TL;DR for operators: PyG 2.0 is not mainly a “new GNN model” story. It is an infrastructure story. The paper presents PyTorch Geometric as a modular graph-learning stack that now covers storage, sampling, heterogeneous and temporal graph handling, neural message passing, acceleration, explainability, and application workflows such as relational deep learning and GraphRAG.[1] ...