Opening — Why this matters now

We keep insisting that powerful AI systems must be understood before they can be trusted. That demand feels intuitively correct—and practically paralysing.

Large language models now operate in medicine, finance, law, and public administration. Yet interpretability tools—SHAP, LIME, mechanistic circuit tracing—remain brittle, expensive, and increasingly disconnected from real-world deployment. The gap between how models actually behave and how we attempt to explain them is widening, not closing.

This paper proposes an unfashionable but pragmatic alternative: stop trying to look inside the model. Start watching what it does—systematically, statistically, and with expert oversight.

Background — From molecular biology to public health

The paper introduces AI Epidemiology, borrowing its logic directly from public health. Epidemiologists saved millions of lives long before they understood DNA, carcinogens, or molecular pathways. Smoking was regulated decades before anyone knew about p53 mutations. What mattered was population-level evidence, not mechanistic purity.

Current AI interpretability is the equivalent of insisting on molecular explanations before issuing a health warning. It is scientifically noble—and operationally useless.

The author categorises most modern interpretability as correspondence-based: methods that try to explain outputs by mapping internal model features onto human-understandable terms. These approaches fail at scale due to:

  • Combinatorial explosion (token-level attributions become unstable)
  • Polysemantic neurons (one unit, many meanings)
  • Compensation effects (ablated components quietly rerouted)
  • Adversarial manipulation of explanations without changing outputs

In short: the explanation often lies more than the model.

Analysis — What AI Epidemiology actually does

AI Epidemiology reframes explainability as risk stratification, not introspection.

The system observes AI outputs and expert responses in production environments, then asks three epidemiological questions:

  1. Which outputs fail?
  2. What observable characteristics predict failure?
  3. Where do failures concentrate across domains, models, and contexts?
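
In practice, question 3 reduces to a stratified count over a log of expert–AI interactions of the kind standardised below. A minimal pandas sketch, using hypothetical column names rather than the paper's actual schema:

```python
# Illustrative only: column names (domain, model, risk_level, override) are
# invented stand-ins, not the paper's schema.
import pandas as pd

# Each row is one logged expert-AI interaction.
interactions = pd.DataFrame({
    "domain":     ["ophthalmology", "ophthalmology", "lending", "lending", "legal"],
    "model":      ["model-a", "model-b", "model-a", "model-a", "model-b"],
    "risk_level": ["high", "low", "high", "low", "high"],
    "override":   [1, 0, 1, 0, 1],   # 1 = the expert rejected the AI's conclusion
})

# Where do failures (overrides) concentrate across domains, models, and contexts?
failure_rates = (
    interactions
    .groupby(["domain", "model", "risk_level"])["override"]
    .agg(rate="mean", n="size")
    .sort_values("rate", ascending=False)
)
print(failure_rates)
```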

Instead of peering into weights and attention heads, it standardises expert–AI interactions into structured fields:

Category      Field              Purpose
------------  -----------------  -------------------------
Input/Output  Mission            What the AI was asked
Input/Output  Conclusion         What the AI recommended
Input/Output  Justification      The AI's stated reasoning
Assessment    Risk level         Consequence severity
Assessment    Alignment score    Guideline compliance
Assessment    Accuracy score     Factual correctness
Outcome       Override           Did the expert disagree?
Outcome       Corrective option  What was done instead

These fields are captured passively—no extra work for experts, no surveys, no labels. Doctors, loan officers, or lawyers simply do their jobs. The system listens.
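
As an illustration only, a single captured interaction might look like the small record below; the class and attribute names are hypothetical stand-ins for the fields in the table above, not the paper's API.

```python
# A minimal sketch of one structured interaction record, mirroring the field
# table above. Names are illustrative, not taken from the paper.
from dataclasses import dataclass
from typing import Optional

@dataclass
class InteractionRecord:
    # Input/Output
    mission: str            # what the AI was asked
    conclusion: str         # what the AI recommended
    justification: str      # the AI's stated reasoning
    # Assessment
    risk_level: str         # consequence severity, e.g. "low" / "high"
    alignment_score: float  # guideline compliance
    accuracy_score: float   # factual correctness
    # Outcome
    override: bool          # did the expert disagree?
    corrective_option: Optional[str] = None  # what was done instead, if anything
```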

Over time, alignment and accuracy scores become exposure variables, statistically associated with expert overrides and real-world outcomes, just as blood pressure predicts heart attacks without a molecular account of each patient.
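
A minimal sketch of that exposure-style analysis, here with scikit-learn on an invented toy log (the numbers are illustrative, not the paper's data):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy interaction log: invented for illustration only.
log = pd.DataFrame({
    "alignment_score": [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.95, 0.5, 0.8,  0.45],
    "accuracy_score":  [0.9, 0.7, 0.5, 0.2, 0.8, 0.3, 0.9,  0.4, 0.85, 0.5],
    "override":        [0,   0,   1,   1,   0,   1,   0,    1,   1,    0],
})

# Regress expert override on the two "exposure" scores.
X, y = log[["alignment_score", "accuracy_score"]], log["override"]
clf = LogisticRegression().fit(X, y)

# Exponentiated coefficients read roughly as odds ratios: values below 1 mean
# higher scores are associated with fewer expert overrides.
print(dict(zip(X.columns, np.exp(clf.coef_[0]))))
```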

Findings — Evidence from medicine and chess

The paper validates this framework in two very different settings.

1. Ophthalmology feasibility study

Three clinical cases were reviewed by a consultant ophthalmologist. The results:

Metric                       Result
---------------------------  -------------
Semantic capture             100% lossless
Risk score reliability       ICC = 1.0
Accuracy score reliability   ICC = 1.0
Alignment score reliability  ICC = 0.67
Overall reliability          ICC = 0.89

In plain terms: the system consistently captured what mattered and scored AI outputs similarly to a human specialist—good enough for population-level surveillance.
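
For readers who do not work with ICC day to day, the sketch below shows one common way such a reliability coefficient is computed, using the pingouin library on made-up ratings rather than the study's data.

```python
# Illustrative ICC computation on toy ratings (not the study's data):
# two raters (the system and a consultant) score the same cases.
import pandas as pd
import pingouin as pg

ratings = pd.DataFrame({
    "case":  [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "rater": ["system", "consultant"] * 6,
    "score": [4, 4, 2, 3, 5, 5, 3, 3, 4, 5, 2, 2],  # e.g. alignment on a 1-5 scale
})

# Intraclass correlation across cases and raters; higher means closer agreement.
icc = pg.intraclass_corr(data=ratings, targets="case", raters="rater", ratings="score")
print(icc[["Type", "ICC"]])
```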

2. SHAP vs Logia (chess demonstration)

A GPT-2 model recommends an illegal chess move: “move the pawn” when no pawn exists.

  • SHAP output: highlights token importance (“king”, “queen”, “Black”)
  • AI Epidemiology output: flags low accuracy, low alignment, and an 85% historical override rate, then explains why similar outputs fail

Only one of these explanations helps you stop bad decisions before they happen.
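
A sketch of how such a flag could be assembled from the logged records described earlier; the helper function, thresholds, and column names are all hypothetical:

```python
import pandas as pd

def flag_output(history: pd.DataFrame, accuracy: float, alignment: float) -> str:
    """Summarise how outputs scored at least this poorly have fared historically."""
    similar = history[(history["accuracy_score"] <= accuracy) &
                      (history["alignment_score"] <= alignment)]
    if similar.empty:
        return "No comparable historical cases logged."
    rate = similar["override"].mean()
    return (f"Low accuracy ({accuracy:.2f}) and low alignment ({alignment:.2f}): "
            f"experts overrode {rate:.0%} of {len(similar)} similar past outputs.")

# e.g. flag_output(interaction_log, accuracy=0.2, alignment=0.3)
```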

Implications — Governance without vendor lock-in

AI Epidemiology quietly solves several problems that interpretability never could:

  • Model-agnostic oversight: Works across vendors and model upgrades
  • Continuous correction: No waiting for retraining cycles
  • Auditability: Automatic compliance trails from day one
  • Expert-centred governance: Domain knowledge, not ML PhDs
  • Research guidance: Failure patterns tell interpretability researchers where to look

Crucially, it avoids turning oversight into yet another opaque AI layer. The system never replaces expert judgment—it amplifies it.

Conclusion — Explanation as survival, not enlightenment

AI Epidemiology is not anti-interpretability. It is anti-waiting.

While the field continues its long march toward understanding trillion-parameter models, institutions still need to decide on loans, prescribe treatments, and issue rulings tomorrow morning. Population-level oversight offers a way to act responsibly now, using the most interpretable component we already have: human judgment at scale.

Public health did not wait for molecular biology. AI governance should not wait for mechanistic enlightenment either.

Cognaptus: Automate the Present, Incubate the Future.