Opening — Why this matters now
Modern AI systems are increasingly judged not just by what they do, but by why they do it. Regulators want explanations. Engineers want guarantees. Businesses want robustness under change. Yet, quietly, a paradox has been growing inside our models: systems that behave exactly the same on the surface may rely on entirely different internal reasoning.
This paper drags that paradox—known as the Rashomon effect—out of the comfortable world of classification and drops it into the messier, higher-stakes domain of sequential decision-making. The result is not just an academic extension, but a practical warning: identical behavior does not imply identical logic, and pretending otherwise can cost you robustness, verification time, and trust.
Background — From one-shot predictions to policies
The Rashomon effect, introduced to statistical modeling by Leo Breiman, describes a familiar phenomenon in supervised learning: multiple models trained on the same data make the same predictions while relying on different features internally. Saliency maps, feature attributions, and counterfactual explanations all reveal this quiet disagreement beneath apparent consensus.
Sequential decision-making complicates matters. Here, an agent does not make a single prediction; it executes a policy, selecting actions over time in a stochastic environment. Success or failure is path-dependent and probabilistic. You cannot simply line up predictions against labels and call it a day.
The key conceptual shift in the paper is this:
In sequential decision-making, observable behavior is fully captured by the induced Markov chain generated by a policy interacting with its environment.
If two policies induce the same discrete-time Markov chain (DTMC) with respect to a given objective, they are observationally indistinguishable—even if they think very differently inside.
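Concretely, for an MDP with transition function P(s'|s, a) and a (possibly stochastic) policy π, the induced chain marginalizes the action choice away. This is the standard construction (notation mine, not lifted from the paper):

```latex
P_\pi(s' \mid s) \;=\; \sum_{a \in A} \pi(a \mid s)\, P(s' \mid s, a)
```

Two policies then count as observationally indistinguishable for a property precisely when their induced kernels assign that property the same probability.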
Analysis — Translating Rashomon into dynamics
The authors propose a clean, formal definition of the Rashomon effect for sequential tasks:
- Multiple policies are trained on identical expert data (via behavioral cloning).
- The policies induce identical DTMCs with respect to a specified property (verified via probabilistic model checking).
- The policies differ in internal structure, measured by a user-chosen metric such as saliency-based feature importance.
Only when all three conditions hold do we have a genuine sequential Rashomon set.
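Here is a minimal sketch of how the three conditions compose, assuming training and model checking have already produced the inputs (names and the diversity threshold are illustrative, not the paper's code):

```python
import numpy as np

def sequential_rashomon_set(verified_prob, saliency, eps=0.25):
    """Select a sequential Rashomon set: behaviorally identical, internally diverse.

    verified_prob: {policy_id: exact objective probability from the model checker}
    saliency:      {policy_id: feature-importance vector}
    Condition 1 (same expert data, behavioral cloning) is assumed upstream.
    """
    # Condition 2: identical verified behavior w.r.t. the objective.
    # Equality is meaningful here because the probabilities are exact.
    best = max(verified_prob.values())
    behavioral_class = [p for p, pr in verified_prob.items() if pr == best]

    # Condition 3: keep members whose saliency profiles are mutually distinct
    # (one way to operationalize "differ in internal structure").
    chosen = []
    for p in behavioral_class:
        v = np.asarray(saliency[p], float)
        if all(np.linalg.norm(v - np.asarray(saliency[q], float)) > eps
               for q in chosen):
            chosen.append(p)
    return chosen

# Toy usage: three behaviorally identical policies, two reasoning styles.
probs = {"p1": 1.0, "p2": 1.0, "p3": 1.0}
sal = {"p1": [0.9, 0.1], "p2": [0.88, 0.12], "p3": [0.1, 0.9]}
print(sequential_rashomon_set(probs, sal))  # ['p1', 'p3']
```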
Why model checking matters
In classification, identical predictions are easy to verify. In stochastic environments, they are not. A single rollout proves nothing.
The paper’s methodological backbone is probabilistic model checking, which constructs the entire induced DTMC of a policy and verifies properties exactly—no sampling error, no Monte Carlo hand-waving. This makes behavioral equivalence a mathematical fact, not a statistical guess.
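In code terms, "exact" means solving the reachability equations of the induced DTMC as a linear system, not averaging rollouts. A self-contained toy over exact rationals (my example chain, not the paper's taxi model; real checkers like PRISM or Storm also precompute which states can reach the goal at all):

```python
from fractions import Fraction as F

# Toy DTMC: states 0..3, state 3 = goal, state 2 = absorbing failure.
# P[s][t] = Pr(s -> t).
P = {
    0: {1: F(1, 2), 2: F(1, 2)},
    1: {0: F(1, 3), 3: F(2, 3)},
    2: {2: F(1)},   # failure sink
    3: {3: F(1)},   # goal, absorbing
}
goal, fail = {3}, {2}

# Reachability values x_s = Pr[eventually reach goal from s] satisfy
# x_s = sum_t P[s][t] * x_t, with x = 1 on goal and x = 0 on fail.
unknowns = [s for s in P if s not in goal | fail]
idx = {s: i for i, s in enumerate(unknowns)}
n = len(unknowns)

# Assemble (I - P_restricted) x = b over exact rationals.
A = [[F(int(i == j)) for j in range(n)] for i in range(n)]
b = [F(0)] * n
for s in unknowns:
    for t, p in P[s].items():
        if t in goal:
            b[idx[s]] += p
        elif t in idx:
            A[idx[s]][idx[t]] -= p

# Gauss-Jordan elimination: exact arithmetic, no sampling error.
for i in range(n):
    piv = A[i][i]
    A[i] = [a / piv for a in A[i]]
    b[i] /= piv
    for k in range(n):
        if k != i and A[k][i]:
            f = A[k][i]
            A[k] = [ak - f * ai for ak, ai in zip(A[k], A[i])]
            b[k] -= f * b[i]

print({s: b[idx[s]] for s in unknowns})
# {0: Fraction(2, 5), 1: Fraction(4, 5)}
```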
Algorithmically, the process looks like this:
| Step | Purpose |
|---|---|
| Induced DTMC construction | Enumerate all reachable states and transitions under the policy |
| Property verification | Check reachability probabilities exactly |
| Cross-policy comparison | Identify behaviorally identical policies |
| Attribution analysis | Filter for internal diversity |
What survives this pipeline is not coincidence—it is structure.
Findings — Identical behavior, divergent reasoning
Using a taxi navigation environment, the authors trained 100 neural policies on the same expert dataset. Model checking revealed:
- 10 behavioral equivalence classes
- 82 policies in the largest class, all achieving the objective with probability 1
Yet saliency analysis told a different story. Within that dominant class, policies disagreed sharply on which features mattered most: fuel level, passenger location, coordinates, or job count.
In other words: same actions, same states, same success—different mental models.
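Saliency can be probed without touching the training stack. A dependency-free sketch using finite differences (gradient saliency would be the usual choice for a neural policy; the linear "policies" below are mine, built to show identical behavior with different internals):

```python
import numpy as np

def saliency(policy_logits, state, eps=1e-4):
    """Per-feature importance at one state: how much the chosen action's
    score moves when each input feature is perturbed."""
    base = policy_logits(state)
    a = int(np.argmax(base))                 # action the policy takes
    imp = np.zeros(len(state))
    for i in range(len(state)):
        bumped = state.astype(float)
        bumped[i] += eps
        imp[i] = abs(policy_logits(bumped)[a] - base[a]) / eps
    return imp / imp.sum()                   # normalized importance profile

# Two policies, same argmax action on this state, different weights.
W1 = np.array([[2.0, 0.1], [0.0, 0.0]])     # leans on feature 0 (e.g. fuel)
W2 = np.array([[0.1, 2.0], [0.0, 0.0]])     # leans on feature 1 (e.g. location)
s = np.array([1.0, 1.0])
print(saliency(lambda x: W1 @ x, s))        # ~[0.95, 0.05]
print(saliency(lambda x: W2 @ x, s))        # ~[0.05, 0.95]
```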
When the environment shifts, masks fall off
The real test came under distribution shift. The task was made harder by increasing the required number of completed jobs.
| Policy | 5 jobs | 6 jobs | 7 jobs | 8 jobs | 9 jobs | 10 jobs |
|---|---|---|---|---|---|---|
| Individual Rashomon members | Identical | Divergent | Often fail | Fail | Fail | Fail |
| Majority-vote ensemble | Good | Better | Degrades | Poor | Poor | Poor |
| Permissive Rashomon policy | Perfect | Perfect | Perfect | Perfect | Perfect | Perfect |
The punchline is subtle but devastating:
- Policies that were provably identical in the original environment diverged dramatically under shift.
- Internal differences mattered—even when behavior initially did not.
Implications — Robustness, verification, and explainability
This work has three implications that business and engineering leaders should not ignore.
1. Explainability is underdetermined
If multiple internally distinct policies are behaviorally identical, then any single explanation is incomplete by definition. Trusting one saliency map over another becomes a matter of choice, not truth.
2. Ensembles should be principled, not accidental
The Rashomon set offers a systematic way to build robust ensembles. Not random restarts. Not blind averaging. Verified behavioral equivalence first—diversity second.
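A Rashomon ensemble can be as simple as majority vote over verified members. A sketch, assuming each policy is a callable from state to action (illustrative signature, not the paper's code):

```python
from collections import Counter

def majority_vote(policies, state):
    """Act by majority vote over a verified Rashomon set of policies."""
    votes = Counter(p(state) for p in policies)
    return votes.most_common(1)[0][0]

# Toy usage: members agree in-distribution, split out-of-distribution.
members = [lambda s: "north", lambda s: "north", lambda s: "south"]
print(majority_vote(members, state=(0, 0)))  # north
```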
3. Verification can get cheaper
By constructing permissive policies from Rashomon sets, the authors shrink the state space by an order of magnitude while retaining optimal performance. For safety-critical systems, that is not a footnote—it is a budget line.
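One minimal reading of the permissive construction: at each state, permit any action that some Rashomon member would take, yielding a single nondeterministic model to verify instead of one verification run per member (the paper's actual construction may differ in detail):

```python
from collections import defaultdict

def permissive_policy(policies, states):
    """Union the Rashomon members' action choices into one
    nondeterministic policy: state -> set of permitted actions.

    policies: callables state -> action (illustrative signature).
    """
    allowed = defaultdict(set)
    for s in states:
        for p in policies:
            allowed[s].add(p(s))
    return dict(allowed)

# Toy usage with two members that differ on one state.
p1 = {(0, 0): "north", (1, 0): "east"}
p2 = {(0, 0): "north", (1, 0): "south"}
print(permissive_policy([p1.get, p2.get], [(0, 0), (1, 0)]))
# {(0, 0): {'north'}, (1, 0): {'east', 'south'}}
```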
Conclusion — Same path, different compass
This paper makes a quiet but important point: in sequential decision-making, identical behavior can conceal fundamentally different internal reasoning—and those differences surface exactly when conditions change.
The Rashomon effect is no longer just an interpretability curiosity. It is a structural property of decision-making systems, with direct consequences for robustness, verification, and trust.
If your agent always takes the right action, you may still want to ask why. Because when the map changes, only some compasses stay true.
Cognaptus: Automate the Present, Incubate the Future.