Opening — Why This Matters Now

Interactive AI is entering boardrooms faster than corporate compliance teams can draft new slide decks. Many firms now deploy explanation-based interfaces—systems that don’t just make predictions but reveal why they made them. The assumption is seductive: give humans explanations, get better oversight. But psychology rarely cooperates. Order effects—our tendency to weigh early or recent information more heavily—threaten to distort user trust and training signals in these systems.

A new study (Pesenti et al., 2025)fileciteturn0file0 offers a rare dose of empirical calm. After running 713 participants through realistic debugging tasks, the authors found that order effects exist… but barely. And more importantly, they don’t meaningfully compromise the value of explanatory interactive learning (XIL) in enterprise settings.

Background — The Long Anxiety About Human Bias in AI Oversight

AI researchers have spent years worrying that humans corrupt AI models just as much as they correct them. Earlier work (notably Nourani et al., 2021) argued that if users encounter incorrect AI explanations early, they lose trust; if shown correct ones first, they grow complacent. In either case, their subsequent feedback becomes unreliable.

XIL raises the stakes: these systems rely on human adjustments to model explanations, meaning any cognitive bias propagates directly into model updates. In practice, this threatens high‑value deployments—compliance workflows, medical triage, insurance underwriting—where consistent human-AI collaboration is non‑negotiable.

Yet previous studies suffered from a conceptual mismatch: non‑interactive tasks, unrealistic stimuli, or feedback mechanisms that differ from how XIL actually operates. This new study remedies that.

Analysis — What the Paper Actually Does

Pesenti et al. simulate a classic XIL debugging loop using a fictional model that draws bounding boxes around blurred human faces. Participants decide whether the explanation (the box) is correct. If not, they reposition it.
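To make that loop concrete, here is a minimal sketch of the accept-or-correct step the study simulates. The Box type and function names are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Box:
    """Axis-aligned bounding box: top-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float

def xil_debug_step(proposed: Box, user_accepts: bool,
                   user_correction: Optional[Box] = None) -> Box:
    """One round of the accept-or-correct loop: keep the model's explanation
    if the participant endorses it, otherwise record their repositioned box."""
    if user_accepts:
        return proposed              # agreement: explanation kept as-is
    if user_correction is None:
        raise ValueError("a rejected explanation needs a corrected box")
    return user_correction           # disagreement: the correction becomes the feedback signal

# Example: the participant rejects the proposed box and nudges it to the right.
feedback = xil_debug_step(Box(10, 20, 50, 50), user_accepts=False,
                          user_correction=Box(18, 22, 48, 50))
```

The feedback box, not the participant's verbal opinion, is what flows back into the model update, which is why the study's behavioral measures matter more than its survey measures.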

Two experimental regimes matter:

  1. Within-session order effects — Does seeing many correct or incorrect explanations early change how users behave later in the same session?
  2. Between-session order effects — Does the sequence of exposure influence how users behave after a simulated model update?

The manipulation is elegantly simple: vary when incorrect explanations appear within a session while keeping overall accuracy constant. This yields three groups (Increasing, Constant, Decreasing) that see the same errors in different orders.
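The sketch below shows one way such ordered-but-balanced error streams could be generated. The trial count, error count, and two-block split are illustrative assumptions, not the paper's actual parameters; the point is only that every condition contains the same number of errors.

```python
import random

def build_error_stream(condition: str, n_trials: int = 30, n_errors: int = 10,
                       seed: int = 0) -> list[bool]:
    """Return a trial sequence where True means the model's explanation is wrong.

    Every condition contains the same total number of errors, so only the
    ordering differs (the manipulation described above)."""
    rng = random.Random(seed)
    half = n_trials // 2
    if condition == "Increasing":      # few errors early, many late
        first, second = n_errors // 3, n_errors - n_errors // 3
    elif condition == "Decreasing":    # many errors early, few late
        first, second = n_errors - n_errors // 3, n_errors // 3
    elif condition == "Constant":      # errors spread evenly
        first, second = n_errors // 2, n_errors - n_errors // 2
    else:
        raise ValueError(f"unknown condition: {condition}")

    def block(size: int, errors: int) -> list[bool]:
        trials = [True] * errors + [False] * (size - errors)
        rng.shuffle(trials)
        return trials

    return block(half, first) + block(n_trials - half, second)

# All three streams contain exactly 10 errors; only their placement differs.
for cond in ("Increasing", "Constant", "Decreasing"):
    stream = build_error_stream(cond)
    print(cond, sum(stream))
```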

What They Measured

  • Feedback accuracy: overlap between participant corrections and ground truth.
  • Agreement with the model: how often participants accept the model’s explanation.
  • Self‑reported trust: Likert-scale assessments.
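A hedged sketch of how these three measures might be operationalized, reusing the Box type from the earlier sketch. Intersection-over-union is one common way to score overlap; the paper's exact overlap measure may differ.

```python
# Reuses the Box dataclass from the debugging-loop sketch above.

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union between a participant's corrected box and the
    ground-truth box (one possible operationalization of 'overlap')."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0

def session_metrics(feedback_boxes: list[Box], truth_boxes: list[Box],
                    accepted: list[bool], trust_ratings: list[int]) -> dict[str, float]:
    """Aggregate the three outcome measures listed above for one session."""
    return {
        "feedback_accuracy": sum(iou(f, t) for f, t in zip(feedback_boxes, truth_boxes))
                             / len(truth_boxes),
        "agreement_rate": sum(accepted) / len(accepted),
        "mean_trust": sum(trust_ratings) / len(trust_ratings),  # e.g. a 1-7 Likert scale
    }
```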

Findings — The Bias That Barely Bends the System

Across both experiments, user performance remained consistently high. Small effects emerged, but none remotely existential.

1. Within-session: Only Weak Primacy Effects

Users exposed to many early model errors became slightly more skeptical later—particularly on ambiguous (“difficult”) images. But the effect was small and confined to agreement rates, not actual correction quality.

Interpretation: Early mistakes make users more cautious, but not less accurate. A healthy instinct.

| Condition  | Behavioral Effect                                      | Effect on User Trust | Notable Pattern           |
|------------|--------------------------------------------------------|----------------------|---------------------------|
| Increasing | Slightly lower agreement on ambiguous images           | No change            | Mild primacy effect       |
| Constant   | Baseline                                               | No change            |                           |
| Decreasing | Slight convergence of accuracy across easy/hard items  | No change            | Possibly task familiarity |

2. Between-session: No Lasting Memory of Prior Errors

When users began a second session after being told the model had been “updated,” prior exposure patterns did not meaningfully influence behavior.

Agreement levels converged. Trust ratings converged. Correction accuracy converged.

Interpretation: People mentally reset when told the system has changed—good news for iterative ML workflows.

3. Self-Reported Trust Is Essentially Unmoved

Across all conditions, users rated the model with statistically indistinguishable trust and perceived accuracy. In other words, their behavioral caution did not translate into conscious distrust.

Implications — What Business Leaders Should Actually Pay Attention To

This study delivers actionable reassurance for teams adopting XIL, human-in-the-loop review systems, or any debugging workflow involving model explanations.

1. XIL Is More Robust to Cognitive Bias Than Feared

The nightmare scenario—human bias derailing model improvement—doesn’t materialize. Users correct errors reliably regardless of exposure sequence.

2. Model Updates Act as a Cognitive Reset Button

Trust effects from one debugging round wash out once users are told the model has changed. This grants organizations flexibility: multiple debugging rounds won’t compound subtle user biases.

3. Focus Governance on the Quality of Explanations, Not Their Order

Since order barely matters, companies should prioritize:

  • clarity of explanations,
  • reducing ambiguity in stimuli,
  • enforcing consistent bounds on what humans can correct.

4. Behavioral Data Is More Reliable Than Self-Reports

Every AI governance framework that still relies on “user trust surveys” should reconsider. Behavior—not sentiment—is where reliability lives.

Conclusion — The Measured Middle Ground

The story here is not that humans are unbiased (they aren’t), nor that order effects vanish (they don’t). It’s that in realistic XIL workflows, those biases don’t meaningfully undermine system improvement.

In a field addicted to worst‑case narratives about human-AI interaction, this paper provides a welcome reminder: sometimes, the system behaves better than we expect.

Cognaptus: Automate the Present, Incubate the Future.