Opening — Why this matters now

Modern AI systems are increasingly judged not just by what they do, but by why they do it. Regulators want explanations. Engineers want guarantees. Businesses want robustness under change. Yet, quietly, a paradox has been growing inside our models: systems that behave exactly the same on the surface may rely on entirely different internal reasoning.

This paper drags that paradox—known as the Rashomon effect—out of the comfortable world of classification and drops it into the messier, higher-stakes domain of sequential decision-making. The result is not just an academic extension, but a practical warning: identical behavior does not imply identical logic, and pretending otherwise can cost you robustness, verification time, and trust.

Background — From one-shot predictions to policies

The Rashomon effect, originally articulated by Leo Breiman, describes a familiar phenomenon in supervised learning: multiple models trained on the same data achieve near-identical accuracy, often even identical predictions, while relying on different features internally. Saliency maps, feature attributions, and counterfactual explanations all reveal this quiet disagreement beneath apparent consensus.

Sequential decision-making complicates matters. Here, an agent does not make a single prediction; it executes a policy, selecting actions over time in a stochastic environment. Success or failure is path-dependent and probabilistic. You cannot simply line up predictions against labels and call it a day.

The key conceptual shift in the paper is this:

In sequential decision-making, observable behavior is fully captured by the induced Markov chain generated by a policy interacting with its environment.

If two policies induce the same discrete-time Markov chain (DTMC) with respect to a given objective, they are observationally indistinguishable—even if they think very differently inside.
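To make "induced Markov chain" concrete: for a tabular MDP with transition tensor T[s, a, t] and stochastic policy pi[s, a], marginalising out the action yields the chain an observer actually sees. A minimal sketch (the array names are illustrative, not the paper's code):

```python
import numpy as np

def induced_dtmc(T: np.ndarray, pi: np.ndarray) -> np.ndarray:
    """Collapse an MDP into the DTMC induced by a fixed policy.

    T  : transition tensor, T[s, a, t] = P(t | s, a)
    pi : stochastic policy,  pi[s, a]  = P(a | s)

    Returns P with P[s, t] = sum_a pi[s, a] * T[s, a, t].
    """
    return np.einsum("sat,sa->st", T, pi)

def behaviorally_equivalent(T, pi_1, pi_2, atol=1e-12):
    """Two policies are observationally indistinguishable iff their induced
    chains coincide (the paper restricts this comparison to states that are
    actually reachable, via model checking)."""
    return np.allclose(induced_dtmc(T, pi_1), induced_dtmc(T, pi_2), atol=atol)
```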

Analysis — Translating Rashomon into dynamics

The authors propose a clean, formal definition of the Rashomon effect for sequential tasks:

  1. Multiple policies are trained on identical expert data (via behavioral cloning).
  2. The policies induce identical DTMCs for a specified property (verified via probabilistic model checking).
  3. The policies differ in internal structure, measured by a user-chosen metric such as saliency-based feature importance.

Only when all three conditions hold do we have a genuine sequential Rashomon set.
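In code, the definition amounts to a two-stage filter over already-trained policies (condition 1 holds upstream by construction, since all were cloned from the same expert data). A minimal sketch, where same_dtmc and saliency are hypothetical callables standing in for the model checker and the user's chosen attribution metric:

```python
import numpy as np

def behavioral_classes(policies, same_dtmc):
    """Condition 2: partition policies into behavioral equivalence classes."""
    classes = []
    for pi in policies:
        for cls in classes:
            if same_dtmc(pi, cls[0]):  # equivalence relation: one representative suffices
                cls.append(pi)
                break
        else:
            classes.append([pi])
    return classes

def is_rashomon_set(cls, saliency, eps):
    """Condition 3: behaviorally identical, yet internally distinct."""
    maps = [saliency(pi) for pi in cls]
    return len(cls) > 1 and any(
        np.linalg.norm(a - b) > eps
        for i, a in enumerate(maps) for b in maps[i + 1:]
    )
```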

Why model checking matters

In classification, identical predictions are easy to verify. In stochastic environments, they are not. A single rollout proves nothing.

The paper’s methodological backbone is probabilistic model checking, which constructs the entire induced DTMC of a policy and verifies properties exactly—no sampling error, no Monte Carlo hand-waving. This makes behavioral equivalence a mathematical fact, not a statistical guess.
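Concretely, verifying a reachability property on the induced DTMC reduces to solving a linear system. A minimal sketch, assuming every non-target state can reach the target (real model checkers first prune probability-zero states, and can use exact rational arithmetic where this sketch uses floats):

```python
import numpy as np

def reachability_probabilities(P: np.ndarray, target: set) -> np.ndarray:
    """Probability of eventually reaching `target` from each state of a DTMC.

    P      : row-stochastic transition matrix of the induced DTMC
    target : indices of goal states

    Solves the standard first-passage system (I - P_AA) x_A = P_AB @ 1,
    where A are non-target states and B the target states. No rollouts,
    no sampling error.
    """
    n = P.shape[0]
    B = sorted(target)
    A = [s for s in range(n) if s not in target]
    x = np.ones(n)                       # goal states reach with probability 1
    P_AA = P[np.ix_(A, A)]
    P_AB = P[np.ix_(A, B)]
    x[A] = np.linalg.solve(np.eye(len(A)) - P_AA, P_AB @ np.ones(len(B)))
    return x
```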

Algorithmically, the process looks like this:

| Step | Purpose |
| --- | --- |
| Induced DTMC construction | Enumerate all reachable states and transitions under the policy |
| Property verification | Check reachability probabilities exactly |
| Cross-policy comparison | Identify behaviorally identical policies |
| Attribution analysis | Filter for internal diversity |
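The first row is conceptually just graph search. A minimal sketch, assuming a deterministic policy(state) and an environment model step_distribution(state, action) returning {next_state: probability}; both names are illustrative, and in practice this enumeration happens inside the model checker rather than in hand-written Python:

```python
from collections import deque

def build_induced_dtmc(initial_state, policy, step_distribution):
    """Enumerate every state reachable under the policy, with transition probabilities."""
    transitions = {}                       # state -> {next_state: probability}
    frontier = deque([initial_state])
    while frontier:
        s = frontier.popleft()
        if s in transitions:               # already expanded
            continue
        transitions[s] = step_distribution(s, policy(s))
        frontier.extend(transitions[s])    # dict iteration yields successor states
    return transitions
```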

What survives this pipeline is not coincidence—it is structure.

Findings — Identical behavior, divergent reasoning

Using a taxi navigation environment, the authors trained 100 neural policies on the same expert dataset. Model checking revealed:

  • 10 behavioral equivalence classes
  • 82 policies in the largest class, all achieving the objective with probability 1

Yet saliency analysis told a different story. Within that dominant class, policies disagreed sharply on which features mattered most: fuel level, passenger location, coordinates, or job count.

In other words: same actions, same states, same success—different mental models.
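"Mental model" needs a concrete metric, and the paper leaves that choice to the user. One simple option is gradient saliency; a minimal sketch, assuming the policies are small torch networks over a flat feature vector, with pi_a, pi_b, and s purely hypothetical:

```python
import torch
import torch.nn as nn

def gradient_saliency(policy_net: nn.Module, state: torch.Tensor) -> torch.Tensor:
    """Feature importance as |d(chosen action's logit) / d(input feature)|."""
    state = state.clone().requires_grad_(True)
    logits = policy_net(state)
    logits[logits.argmax()].backward()     # differentiate the winning logit
    return state.grad.abs()

# Hypothetical usage: even when pi_a and pi_b choose the same action in every
# reachable state, their saliency rankings can disagree:
#   gradient_saliency(pi_a, s).argmax() != gradient_saliency(pi_b, s).argmax()
```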

When the environment shifts, masks fall off

The real test came under distribution shift. The task was made harder by increasing the required number of completed jobs.

| Policy | 5 jobs | 6 jobs | 7 jobs | 8 jobs | 9 jobs | 10 jobs |
| --- | --- | --- | --- | --- | --- | --- |
| Individual Rashomon members | Identical | Divergent | Often fail | Fail | Fail | Fail |
| Majority-vote ensemble | Good | Better | Degrades | Poor | Poor | Poor |
| Permissive Rashomon policy | Perfect | Perfect | Perfect | Perfect | Perfect | Perfect |

The punchline is subtle but devastating:

  • Policies that were provably identical in the original environment diverged dramatically under shift.
  • Internal differences mattered—even when behavior initially did not.

Implications — Robustness, verification, and explainability

This work has three implications that business and engineering leaders should not ignore.

1. Explainability is underdetermined

If multiple internally distinct policies are behaviorally identical, then any single explanation is incomplete by definition. Trusting one saliency map over another becomes a matter of choice, not truth.

2. Ensembles should be principled, not accidental

The Rashomon set offers a systematic way to build robust ensembles. Not random restarts. Not blind averaging. Verified behavioral equivalence first—diversity second.
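As a sketch of the recipe: certify behavioral equivalence with the model checker first, then aggregate. The helper below assumes deterministic policies callable on a state; the names are illustrative.

```python
from collections import Counter

def rashomon_majority_vote(rashomon_set, state):
    """Act by majority vote over a verified behavioral equivalence class.

    In-distribution the members agree by construction; under shift, the
    vote aggregates their now-divergent judgments."""
    votes = Counter(policy(state) for policy in rashomon_set)
    return votes.most_common(1)[0][0]
```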

3. Verification can get cheaper

By constructing permissive policies from Rashomon sets, the authors shrink the state space by an order of magnitude while retaining optimal performance. For safety-critical systems, that is not a footnote—it is a budget line.
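One plausible reading of the construction (a sketch of the idea, not the paper's exact algorithm): a permissive policy allows, in each state, any action that some verified-equivalent member takes. The resulting model keeps only the actions the class actually uses, so it is far smaller than the unrestricted environment, and one check replaces one per member.

```python
def permissive_actions(rashomon_set, state):
    """Union of the actions the Rashomon members would take in `state`.

    Verifying the single model this induces, rather than every member's
    DTMC separately, is one way the reported savings can arise."""
    return {policy(state) for policy in rashomon_set}
```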

Conclusion — Same path, different compass

This paper makes a quiet but important point: in sequential decision-making, identical behavior can conceal fundamentally different internal reasoning—and those differences surface exactly when conditions change.

The Rashomon effect is no longer just an interpretability curiosity. It is a structural property of decision-making systems, with direct consequences for robustness, verification, and trust.

If your agent always takes the right action, you may still want to ask why. Because when the map changes, only some compasses stay true.

Cognaptus: Automate the Present, Incubate the Future.