TL;DR for operators
Most business causal analysis begins with an uncomfortable little fiction: that someone knows the causal graph. The marketing team wants to know whether a campaign caused retention. The risk team wants to know whether a policy change reduced defaults. The operations team wants to know whether a staffing rule improved service levels. Everyone has observational data. Nobody has a clean experimental intervention. Somewhere, usually in a deck with too many arrows, a causal diagram appears.
The paper behind this article asks what happens when that diagram is not fully known [1]. Instead of assuming one causal graph, it treats a causal abstraction as a collection of possible causal diagrams compatible with partial knowledge. The question becomes: can a causal query still be identified across that whole collection?
The answer is not simply yes or no. The paper gives a hierarchy of identifiability notions. Some are strong but hard to verify. Some are easier to operationalise but more restrictive. Some require a single proof that works across every compatible graph. Others merely require the same causal estimand to be valid across graphs, even if each graph needs a different proof. That distinction is the small trapdoor in the floor.
For operators, the business value is a cleaner audit of causal claims under uncertainty. If your analytics team does not know the full graph, the right question is not “is this effect identifiable?” It is: identifiable under which criterion, with which common proof, across which collection of possible graphs? The answer determines whether the result is safe to automate, merely plausible for expert review, or still dependent on unstated modelling assumptions.
The paper does not deliver a production algorithm, a software package, or empirical benchmarks. It is a theoretical framework. Its practical contribution is more basic and more useful: it names the different levels of causal confidence that often get blurred together in business analytics. Apparently “we can identify the effect” was doing rather a lot of unpaid labour.
The familiar lie: we know the graph
Causal inference is attractive because it promises answers to questions prediction cannot settle. Prediction asks what is likely to happen. Causality asks what would happen if we changed something. That change might be a price intervention, a credit rule, a medical treatment, a fulfilment policy, or a sales incentive.
In Pearl-style causal inference, the target is often written as something like $P(Y \mid do(X), Z)$: the distribution of an outcome $Y$ under an intervention on $X$, possibly conditional on $Z$. The problem is that businesses usually observe $P(V)$, the joint distribution of measured variables, not the interventional distribution. Identifiability asks whether the causal quantity can be expressed as a function of observational data alone.
In the clean textbook version, a causal diagram tells us which transformations are legal. Do-calculus then rewrites an interventional expression into a do-free formula. Once the formula contains only observational terms, the analyst can estimate it from data.
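As a concrete instance of such a rewrite, assume a graph in which a set of covariates $Z$ satisfies the backdoor criterion for the pair $(X, Y)$. That assumption is made here purely for illustration and is not taken from the paper. Under it, the standard backdoor adjustment gives:

$$
P(y \mid do(x)) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
$$

Every term on the right-hand side is a conditional or marginal of $P(V)$, so the formula is do-free and can, in principle, be estimated from observational data.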
The catch is the diagram. A fully specified causal graph requires knowing which variables cause which other variables, which unobserved common causes exist, and which paths should be blocked or left open. In real settings, that knowledge is partial. Product teams know some dependencies. Clinicians know some mechanisms. Economists know some timing constraints. Engineers know some process structure. Nobody knows the full graph. A causal graph in an enterprise setting is usually a negotiated artefact, not a photograph of reality.
Yvernes and co-authors take that mess seriously. They model a causal abstraction as a collection of causal diagrams. Each graph in the collection is a possible causal world consistent with the abstraction. The identifiability problem is no longer “does this query identify in graph $G$?” It is “what kind of identifiability survives across all graphs in collection $C$?”
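To make "a collection of graphs" tangible, here is a minimal Python sketch. It assumes a deliberately simple encoding of partial knowledge: a set of edges known to exist (`required`) and a set of edges that may or may not exist (`optional`). This encoding, and the names `compatible_graphs` and `is_acyclic`, are hypothetical conveniences for illustration; the paper's abstractions are more general and are not defined this way.

```python
from itertools import combinations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return visited == len(nodes)

def compatible_graphs(nodes, required, optional):
    """Yield every acyclic edge set made of `required` plus a subset of `optional`."""
    optional = sorted(optional)
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            edges = frozenset(required) | frozenset(extra)
            if is_acyclic(nodes, edges):
                yield edges

# Hypothetical partial knowledge: X -> Y is certain, three other edges are not.
nodes = ["X", "Y", "Z"]
required = {("X", "Y")}
optional = {("Z", "X"), ("Z", "Y"), ("Y", "Z")}
collection = list(compatible_graphs(nodes, required, optional))
print(len(collection))  # 5 of the 8 candidate edge sets are acyclic
```

Even three uncertain edges already produce five candidate diagrams, which is the object an identifiability audit then has to cover.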
That shift sounds small. It is not. It changes the object being audited.
One causal claim, many possible guarantees
The paper’s central move is to separate several notions that are easy to confuse. In a single graph, identifiability has a familiar meaning: a causal query is identifiable if there exists a causal estimand valid in every structural causal model inducing that graph. In a collection of graphs, that definition becomes more delicate.
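The paper's notions fall into two broad families: those defined by the existence of a valid estimand (IG, IGP) and those defined by a shared proof or criterion (ICD, ICGC, ICB, ICF):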
| Notion | What it asks | Operational flavour |
|---|---|---|
| Identifiability through graphs (IG) | Is there one causal estimand valid across every graph in the collection? | Strong epistemic claim, potentially hard to check |
| Identifiability through graphs knowing the true distribution (IGP) | Is the query identifiable after restricting to models consistent with the true observational density? | Distribution-dependent; theoretically interesting, operationally demanding |
| Identifiability by common do-calculus (ICD) | Is there one do-calculus proof that works in every graph? | More deployable because the proof is uniform |
| Identifiability by common graphical criterion (ICGC) | Is there a graphical criterion satisfied by all graphs? | A criterion-level certificate; equivalent to ICD in the paper’s hierarchy |
| Common backdoor / common frontdoor (ICB / ICF) | Does a specific familiar criterion work across all graphs? | Useful but incomplete special cases |
The comparison matters because these terms describe different kinds of confidence. “The effect is identifiable in every graph” does not necessarily mean “the same proof works in every graph.” The former is a statement about the existence of a shared estimand. The latter is a statement about a uniform method.
That is the misconception the paper quietly dismantles. In a single graph, completeness of do-calculus makes identifiability and proof existence feel tightly coupled. In a collection of graphs, the coupling weakens. One graph may need one proof, another graph may need a different proof, and both may still lead to the same observational formula. Whether this always collapses into a common proof is exactly where the paper leaves an open conjecture.
For a business analytics team, this distinction is not academic hair-splitting. It is the difference between a causal claim that can be encoded into a repeatable governance rule and one that still requires graph-by-graph expert judgement.
The hierarchy is a map of deployability, not just logic
The paper’s hierarchy can be read as a ladder. At the bottom are highly specific criteria such as common backdoor and common frontdoor. These are easy to explain and often easy to check, but they are not complete. A causal effect may be identifiable even when neither criterion applies. That is already true in a single graph, so it is unsurprising in collections.
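A common backdoor check can be sketched in a few dozen lines of Python. The sketch assumes each graph is a plain DAG over observed variables, given as a collection of `(parent, child)` edges, with a single treatment `x` and outcome `y`. Latent confounders, which the paper's setting may include, are not modelled here. The function names are hypothetical, and d-separation is tested with the standard moralised-ancestral-graph construction.

```python
def ancestors(edges, targets):
    """Return `targets` plus every node with a directed path into them."""
    result, frontier = set(targets), list(targets)
    while frontier:
        node = frontier.pop()
        for parent, child in edges:
            if child == node and parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result

def descendants(edges, source):
    """Return every node reachable from `source` by a directed path."""
    result, frontier = set(), [source]
    while frontier:
        node = frontier.pop()
        for parent, child in edges:
            if parent == node and child not in result:
                result.add(child)
                frontier.append(child)
    return result

def d_separated(edges, x, y, z):
    """Test whether x and y are d-separated given z in a DAG."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    undirected = {frozenset(e) for e in sub}
    # Moralise: link every pair of parents that share a child.
    for child in keep:
        parents = [a for a, b in sub if b == child]
        for i, p in enumerate(parents):
            for q in parents[i + 1:]:
                undirected.add(frozenset((p, q)))
    # x and y are separated iff no undirected path avoids the nodes in z.
    seen, frontier = {x}, [x]
    while frontier:
        node = frontier.pop()
        if node == y:
            return False
        for pair in undirected:
            if node in pair:
                (other,) = pair - {node}
                if other not in seen and other not in z:
                    seen.add(other)
                    frontier.append(other)
    return True

def backdoor_holds(edges, x, y, z):
    """Backdoor criterion: z has no descendant of x and blocks all backdoor paths."""
    if set(z) & descendants(edges, x):
        return False
    pruned = [(a, b) for a, b in edges if a != x]  # drop edges leaving x
    return d_separated(pruned, x, y, z)

def common_backdoor(collection, x, y, z):
    """True if the same adjustment set z satisfies backdoor in every graph."""
    return all(backdoor_holds(g, x, y, z) for g in collection)

# Two hypothetical graphs that disagree on whether Z -> Y exists.
g1 = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
g2 = [("Z", "X"), ("X", "Y")]
print(common_backdoor([g1, g2], "X", "Y", {"Z"}))  # True
print(common_backdoor([g1, g2], "X", "Y", set()))  # False
```

Adjusting for `Z` is valid in both graphs, while the empty set is valid only in `g2`. The first is a common certificate; the second is a claim that silently depends on which graph is true.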
Above them sits common graphical criterion, ICGC. This is broader: instead of asking whether backdoor or frontdoor works, it asks whether some graphical criterion works across the whole collection.
The paper then proves that ICGC and ICD are equivalent. The reason is elegant. Any graphical criterion for a single graph can be established by a do-calculus proof; conversely, a common do-calculus proof is built from graphical independence conditions, whose conjunction forms a common graphical criterion. In other words, if the same proof works everywhere, that proof itself induces a shared criterion.
Then ICD implies IG. If there is one do-calculus proof that transforms the query into a do-free formula across every graph, then the same estimand is valid across every graph. This implication is straightforward and operationally important: a common proof is a certificate of graph-level identifiability.
Finally, IG implies IGP. If an estimand works across all structural causal models compatible with the graphs, it also works after narrowing attention to models compatible with a particular true observational distribution. Restricting the model class cannot make a universally valid estimand invalid.
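Collected in one line, the implications described above form the following chain. The converse of $\text{ICD} \Rightarrow \text{IG}$ is the part the paper leaves open.

$$
\text{ICB or ICF} \;\Rightarrow\; \text{ICGC} \;\Leftrightarrow\; \text{ICD} \;\Rightarrow\; \text{IG} \;\Rightarrow\; \text{IGP}
$$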
The resulting comparison looks like this:
| Level | What it gives | Why it matters in practice | What it does not give |
|---|---|---|---|
| ICB / ICF | A common backdoor or frontdoor adjustment | Simple explanation; easy audit trail | Completeness |
| ICGC | A shared graphical condition | Broader certificate than a named adjustment rule | A guarantee that the criterion is easy to find |
| ICD | One do-calculus proof across all graphs | Repeatable, automatable proof logic | Coverage of every IG case, which remains an open conjecture |
| IG | One estimand valid across all compatible graphs | Strong abstraction-level causal identifiability | A common proof, unless the open conjecture resolves that way |
| IGP | Identifiability after restricting to models that match the true observational density | Shows how distributional knowledge can help | Practical finite-sample availability |
This table is not a maturity model. More general is not automatically better. IG is more permissive than ICD, but ICD is easier to operationalise. A business wants the most general valid claim only if it can verify, maintain, and explain it. Otherwise, the result becomes another clever analytics artefact that nobody wants to sign off.
The maximal-graph shortcut is the most operational theorem
The most immediately practical result is not the hierarchy diagram. It is the maximal-graph reduction.
Collections of compatible graphs can be large. If every uncertain edge creates multiple candidate diagrams, the collection can quickly become unwieldy. The paper proves that, for graphical notions of identifiability considered in the framework—IG, ICD, ICGC, ICB, and ICF—it is sufficient to check the maximal graphs under graph inclusion.
The intuition is simple. If $G_2$ is a subgraph of $G_1$, then its edges are a subset of those of $G_1$. Removing edges can only add graphical independences; it never destroys one that holds in the larger graph. The paper formalises this in Lemma 1: if a causal estimand, do-calculus proof, or graphical criterion applies in the larger graph, it also applies in the subgraph. Theorem 2 then follows: identifiability in the whole collection is equivalent to identifiability in the subcollection of maximal graphs.
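The reduction itself is easy to sketch. The Python below assumes every graph in the collection shares the same node set and is represented by its set of directed edges, so that graph inclusion is just edge-set inclusion. The function name `maximal_graphs` and the example collection are hypothetical.

```python
def maximal_graphs(collection):
    """Keep only graphs whose edge set is not strictly contained in another's."""
    unique = {frozenset(g) for g in collection}
    return [g for g in unique if not any(g < other for other in unique)]

# Hypothetical collection of five candidate graphs over X, Y, Z.
collection = [
    {("X", "Y")},
    {("X", "Y"), ("Z", "X")},
    {("X", "Y"), ("Z", "Y")},
    {("X", "Y"), ("Y", "Z")},
    {("X", "Y"), ("Z", "X"), ("Z", "Y")},
]
maximal = maximal_graphs(collection)
print(len(maximal))  # 2
```

Five graphs collapse to two maximal ones, so by the theorem only those two need to be checked. The pairwise comparison is quadratic in the number of graphs, and it presupposes that the collection has been enumerated, which is exactly where large abstractions become expensive.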
For operators, this is the part worth underlining. It gives design guidance for causal abstractions:
| Design choice | Consequence |
|---|---|
| Build an abstraction that induces many incomparable maximal graphs | Identifiability checks may remain complex |
| Build an abstraction with few maximal graphs | The audit burden shrinks |
| Build an abstraction with one greatest graph | Identifiability across the collection reduces to identifiability in that graph; IG and ICD coincide |
This does not magically solve causal inference. It does something more modest: it tells teams where complexity enters. If a domain abstraction explodes into many maximal candidate graphs, the hard part may not be estimation. It may be the representation of uncertainty itself.
That is a useful warning for enterprise causal platforms. Data scientists often spend energy on estimators, debiasing methods, and model selection. This paper points to an earlier bottleneck: how partial causal knowledge is encoded before any estimator appears. Bad abstraction design can turn identifiability into a combinatorial swamp. The swamp may be mathematically respectable. It is still a swamp.
The examples are conceptual tests, not empirical evidence
The paper does not run experiments. There are no datasets, performance tables, or ablation curves. Its evidence consists of definitions, proofs, examples, and links to prior work.
That distinction matters because the article should not be read as saying “this method improves causal estimation.” The paper is not benchmarking estimators. It is classifying identifiability notions.
The examples serve different purposes:
| Paper component | Likely purpose | What it supports | What it does not prove |
|---|---|---|---|
| Example 1: common backdoor across two graphs | Illustration | Shows how a familiar adjustment set can work uniformly across a collection | That common backdoor is complete |
| Lemma 1 | Main theoretical support | Shows monotonicity under subgraph inclusion for estimands, do-calculus proofs, and criteria | That maximal graphs are always few |
| Theorem 2 | Main theoretical support | Reduces graphical identifiability checks to maximal graphs | That checking maximal graphs is computationally cheap |
| Figure 1 hierarchy | Conceptual synthesis | Displays implication and non-implication relationships | That all open relationships are resolved |
| Example 2 | Separation illustration | Shows IG is not complete for IGP because true distributional knowledge can help | That IGP is practical with finite samples |
| IG vs ICD conjecture | Open theoretical boundary | Identifies the unresolved relation between graph-level identifiability and common-proof identifiability | A proof of strict separation |
This is a theory paper doing theory work. That should not be held against it. But it should also not be translated into business claims it does not make. It provides a grammar for auditing causal identifiability under partial graph knowledge. It does not provide a push-button causal analytics engine, despite the industry’s persistent desire to turn every theorem into a SaaS feature by Tuesday.
Why IGP is powerful and awkward
IGP, identifiability through graphs knowing the true observational distribution, is a particularly instructive notion because it shows how extra distributional knowledge can change identifiability.
IG asks for an estimand that works across all SCMs compatible with the graph collection. IGP narrows the model class further: only SCMs whose observational distribution equals the true data density are considered. In principle, that extra information can make a query identifiable even when graph-only reasoning cannot.
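Written side by side, the two notions differ only in which models the estimand has to survive. The notation below is a paraphrase of the description above rather than the paper's exact definitions: $\mathcal{C}$ for the collection, $M$ for an SCM, $f$ for the estimand, and $P^\ast(V)$ for the true observational distribution are symbols assumed here for illustration.

$$
\text{IG:}\quad \exists f \;\; \forall G \in \mathcal{C} \;\; \forall M \text{ inducing } G: \quad P_M(Y \mid do(X), Z) = f\big(P_M(V)\big)
$$

$$
\text{IGP:}\quad \exists f \;\; \forall G \in \mathcal{C} \;\; \forall M \text{ inducing } G \text{ with } P_M(V) = P^\ast(V): \quad P_M(Y \mid do(X), Z) = f\big(P^\ast(V)\big)
$$

The extra constraint $P_M(V) = P^\ast(V)$ shrinks the set of models under consideration, which is why IG implies IGP and not the other way round.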
The paper’s Example 2 demonstrates this point. It constructs a collection of two graphs and a query where the effect is not identifiable through graphs alone. Two SCMs can produce the same observational distribution while yielding different interventional quantities. However, if the true observational distribution includes an independence condition, the causal effect may become identifiable through graphs knowing that density.
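The first half of that argument, two SCMs with identical observational distributions but different interventional ones, can be reproduced with a toy pair of models. The sketch below is not the paper's Example 2. It is a deliberately minimal stand-in using two binary variables and the two graphs $X \to Y$ and $Y \to X$, with hypothetical function names.

```python
from itertools import product

def model_a(x_noise, y_noise, do_x=None):
    """Graph X -> Y: X is a fair coin, Y copies X."""
    x = x_noise if do_x is None else do_x
    y = x
    return x, y

def model_b(x_noise, y_noise, do_x=None):
    """Graph Y -> X: Y is a fair coin, X copies Y unless set by intervention."""
    y = y_noise
    x = y if do_x is None else do_x
    return x, y

def distribution(model, do_x=None):
    """Exact joint P(X, Y), enumerating two independent fair-coin noises."""
    dist = {}
    for x_noise, y_noise in product([0, 1], repeat=2):
        outcome = model(x_noise, y_noise, do_x)
        dist[outcome] = dist.get(outcome, 0.0) + 0.25
    return dist

def prob_y1(dist):
    """Marginal probability that Y = 1."""
    return sum(p for (_, y), p in dist.items() if y == 1)

# Same observational joint in both models.
print(distribution(model_a) == distribution(model_b))  # True
# Different answers to P(Y = 1 | do(X = 1)).
print(prob_y1(distribution(model_a, do_x=1)))  # 1.0
print(prob_y1(distribution(model_b, do_x=1)))  # 0.5
```

No amount of observational data distinguishes the two models, yet they disagree about the intervention, so the effect is not identifiable across this two-graph collection.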
Operationally, this is both intriguing and inconvenient. Businesses never know the true density. They estimate it from finite, noisy, biased, shifting data. IGP therefore belongs more naturally to the theoretical boundary of what could be identifiable with perfect observational knowledge than to the practical centre of causal governance.
Still, it teaches an important lesson. Graphical knowledge is not the only source of identifiability. Distributional facts can matter. But once the claim depends on true distributional properties, the audit burden changes. The question becomes not only “is the graph abstraction sufficient?” but also “how confident are we that the required distributional property holds in the population we are acting on?”
That is where finite samples, selection bias, measurement drift, and regime change re-enter the room, carrying paperwork.
The unresolved IG-versus-ICD gap is the paper’s sharpest edge
The open conjecture concerns the relationship between IG and ICD.
ICD says there is one do-calculus proof that works across the whole collection. IG says there is one causal estimand valid across the whole collection. Since ICD implies IG, the question is whether the reverse ever fails. Can a query be identifiable through graphs, with a common estimand, while no single do-calculus proof works across all compatible graphs?
The authors conjecture that such a separation exists, although they do not provide a concrete example. To prove it, one would need a collection of graphs, a causal query, and graph-specific do-calculus proofs that produce the same identification formula, while no common proof is valid across all graphs. To disprove it, one would need a universal procedure turning any IG case into a common do-calculus proof.
This open question is not a footnote. It governs how much operational comfort we can draw from graph-level identifiability.
If IG always implied ICD, then any abstraction-level identifiability result could be translated into a uniform proof. Governance would become cleaner. Analysts could ask for the common derivation and inspect it. Causal automation would have firmer ground.
If IG does not imply ICD, then there are cases where a causal effect is identifiable across all compatible graphs, but not by a single shared proof. That would mean some valid causal claims require graph-specific reasoning even though they converge to the same estimand. They may be correct, but less convenient to certify.
This is the kind of distinction that matters in regulated or high-stakes analytics. A bank, insurer, hospital, or public agency does not merely need a result to be mathematically true under some careful interpretation. It needs a reproducible argument for why the result is safe to use. ICD is closer to that argument. IG may be broader, but broader is not always easier to govern.
What this changes for business causal practice
The business implication is not that every company should immediately implement IG, ICD, ICGC, ICB, and ICF checks. Most organisations are not there. Many are still confusing propensity scores with causality and calling correlation “directional insight” when the room gets tense.
The more useful translation is a decision framework for causal claims under incomplete graph knowledge.
First, represent uncertainty explicitly. If the causal structure is not fully known, do not hide that uncertainty inside one preferred graph. Encode the plausible graph family induced by domain knowledge. This may involve known temporal ordering, forbidden edges, grouped variables, known confounders, or assumptions about latent common causes.
Second, ask what level of identifiability is being claimed. A common backdoor adjustment is easy to explain, but may be too restrictive. Common do-calculus is broader and still gives a uniform proof. IG is broader still in coverage but may be harder to operationalise, because it does not guarantee a common proof. IGP may depend on distributional facts that finite data only approximate.
Third, inspect maximal graphs. If the abstraction has a small set of maximal elements, identifiability auditing becomes more manageable. If it has many, the abstraction itself may need redesign. The issue is not merely computational inconvenience. A large set of maximal graphs signals that the organisation’s causal knowledge is too diffuse to support a clean claim without further assumptions.
Fourth, separate estimation from identification. Estimation error, model misspecification, and sample bias matter, but only after the target is identifiable. This paper lives upstream from estimation. It asks whether the target can in principle be written using observational distributions under the assumed causal uncertainty. No amount of machine learning glamour rescues a non-identifiable estimand. It merely produces a more expensive guess.
A governance checklist for causal abstractions
A practical analytics team could turn the paper into a review checklist:
| Governance question | Good answer | Warning sign |
|---|---|---|
| What is the causal query? | Explicit $P(Y \mid do(X), Z)$-style target | Vague “impact” language |
| What is the abstraction? | A defined collection of compatible graphs | One informal diagram treated as truth |
| Which identifiability criterion is claimed? | ICB, ICF, ICGC, ICD, IG, or IGP stated clearly | “Identifiable” with no qualifier |
| Is the proof common across graphs? | One do-calculus proof or common criterion | Graph-specific reasoning hidden in appendices |
| How many maximal graphs matter? | Small and auditable | Large or unenumerated |
| Does the claim rely on true distributional knowledge? | Distributional assumptions tested and bounded | Infinite-sample logic smuggled into finite data |
| What remains unresolved? | Estimation, transportability, sampling, or graph uncertainty stated separately | Limitations scattered as decorative caution |
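One way to make the checklist enforceable is to record each claim as structured data and flag the warning signs mechanically. The Python sketch below is a hypothetical illustration: the class name, field names, and the default threshold of 10 maximal graphs are all assumptions, not anything prescribed by the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

CRITERIA = {"ICB", "ICF", "ICGC", "ICD", "IG", "IGP"}

@dataclass
class CausalClaimReview:
    query: str                          # e.g. "P(Y | do(X), Z)"
    criterion: Optional[str]            # one of CRITERIA, or None if unstated
    common_proof: bool                  # one proof or criterion across all graphs?
    maximal_graphs: Optional[int]       # None means not enumerated
    distribution_checked: bool = False  # only relevant when criterion == "IGP"

    def warnings(self, max_auditable: int = 10) -> List[str]:
        """Return the checklist warning signs this claim triggers."""
        found = []
        if "do(" not in self.query:
            found.append("query is not an explicit do-expression")
        if self.criterion not in CRITERIA:
            found.append("identifiability criterion not stated")
        if self.criterion in {"ICB", "ICF", "ICGC", "ICD"} and not self.common_proof:
            found.append("claimed criterion requires a common proof")
        if self.maximal_graphs is None:
            found.append("maximal graphs not enumerated")
        elif self.maximal_graphs > max_auditable:
            found.append("too many maximal graphs to audit")
        if self.criterion == "IGP" and not self.distribution_checked:
            found.append("distributional assumption untested")
        return found

clean = CausalClaimReview("P(Y | do(X))", "ICD", True, 2)
vague = CausalClaimReview("impact of campaign", None, False, None)
print(clean.warnings())       # []
print(len(vague.warnings()))  # 3
```

The point of the sketch is only that each row of the table can become a field with a testable condition, which is what turns a review habit into a gate.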
This is not bureaucracy for its own sake. It prevents a common failure mode: treating a causal result as production-ready because it is identifiable in one plausible graph, while ignoring other plausible graphs induced by the same partial knowledge.
In business terms, the paper helps convert causal modelling from a bespoke expert craft into something closer to an auditable design discipline. Not fully automated. Not painless. Just less mystical.
Where the paper stops
The boundaries are important.
First, the paper is theoretical. It provides definitions, implications, examples, and a conjecture. It does not provide a complete algorithmic treatment of the hierarchy. The authors explicitly point to the need to classify algorithmic complexity and develop efficient verification procedures.
Second, the maximal-graph reduction is useful but not a guarantee of tractability. If the maximal subcollection is large, checking it may still be difficult. The theorem says where to look, not that the search is cheap.
Third, IGP assumes access to the true observational density. In business settings, this is an idealisation. Finite data can suggest independence or dependence; it does not hand over $P^\ast(V)$ on a silver tray.
Fourth, the IG-versus-ICD relationship remains unresolved. That open problem determines whether every graph-level identifiability result can be converted into a common proof. Until it is resolved, teams should treat IG and ICD as meaningfully different operational statuses.
Finally, the framework says nothing by itself about whether the variables are well-measured, whether the abstraction is substantively correct, whether the effect transports to another market, or whether an intervention is ethically or legally permissible. Identification is necessary for credible causal estimation. It is not a full decision policy. Causality, like governance, refuses to be solved by naming one theorem and walking away.
The useful answer is not yes or no
The paper’s value lies in replacing a binary question with a better classification.
Under full graph knowledge, identifiability can already be hard. Under partial graph knowledge, it becomes easy to say something that sounds rigorous while silently switching between different standards of proof. Yvernes and co-authors make those standards explicit. They show how common graphical criteria, common do-calculus, graph-level identifiability, and distribution-aware identifiability relate—and where the map still has an uncharted edge.
For business users, the lesson is crisp. When a team claims a causal effect is identifiable without knowing the full causal graph, ask what survived the uncertainty. Did a common adjustment set survive? Did a common graphical criterion survive? Did a common do-calculus proof survive? Or did the same estimand merely emerge after graph-specific reasoning?
Those are different answers. They support different levels of automation, auditability, and decision confidence.
The fog does not disappear. But with the right hierarchy, at least we stop pretending every shadow is a road.
[1] Clément Yvernes, Emilie Devijver, Marianne Clausel, and Eric Gaussier, “Identifiability in Causal Abstractions: A Hierarchy of Criteria,” arXiv:2507.06213, 2025. https://arxiv.org/abs/2507.06213