Counterfactuals

Explaining the Explainers: Why Faithful XAI for LLMs Finally Needs a Benchmark

Hiring. A candidate writes a personal statement. A screening model gives a score. A manager asks the AI system why. The explanation says work experience mattered most, education came next, and demographic variables barely moved the decision. Everyone relaxes, because the explanation sounds reasonable. That is the dangerous part. A reasonable explanation is not necessarily a faithful explanation. A counterfactual edit that looks plausible is not necessarily a causal counterfactual. And a model that appears insensitive to demographic concepts may not be “fair”; it may simply have learned, or been aligned, to suppress visible sensitivity in the narrow setting being tested. ...

Counterfactuals, Concepts, and Causality: XAI Finally Gets Its Act Together

Explanations should answer the question people actually ask

Audit meeting. A model has made a decision. Someone projects a heatmap. The highlighted pixels are around a chin, an eye, a forehead, or some other facial region that looks important because the model says it is important. Everyone nods carefully. Nobody is much wiser. The model has technically been “explained,” in the same way a smoke alarm explains fire by making noise. ...

Counterfactuals Unchained: How Causality Escapes Its Own Models

A loan is rejected. Now explain why.

A borrower is rejected by an automated lending system. The compliance team asks a simple question: What caused the rejection? A naïve answer points to a variable: low income, high debt ratio, thin credit history, missing documentation, or some equally respectable-looking field in the model. A better answer asks what would have happened if that variable had changed. A still better answer asks which surrounding facts must be held fixed while we imagine that change. ...