Opening — Why This Matters Now
LLMs are no longer just answering trivia. They are recommending medical treatments, screening job candidates, allocating capital, summarizing intelligence, and increasingly — delegating to other algorithms.
In a world of AI copilots and multi-agent systems, the question is no longer “Can LLMs reason?” but rather:
Whom do LLMs trust — humans or other algorithms?
A recent empirical study (arXiv:2602.22070) investigates exactly this tension. And the answer is unsettling.
When asked directly, LLMs say they trust human experts more than algorithms.
But when forced to choose based on performance data, they often delegate to algorithms — even when the algorithm performs worse.
That inconsistency is not just academic. It has serious implications for AI governance, agent orchestration, and high-stakes automation.
Let’s unpack it.
Background — Algorithm Aversion, But Make It Artificial
Behavioral economics has long documented algorithm aversion: humans distrust algorithms, even when algorithms outperform experts.
Two classic findings:
| Study | Method | Human Result |
|---|---|---|
| Castelo et al. (2019) | Survey trust ratings | Humans report lower trust in algorithms |
| Dietvorst et al. (2015) | Incentivized betting | Humans avoid algorithms after seeing errors |
The new paper adapts these paradigms to LLMs.
But instead of measuring human psychology, it probes something more subtle:
Do LLMs trained on human text inherit human algorithm aversion?
And more importantly:
Are their “stated” attitudes aligned with their “revealed” choices?
In economics, stated vs. revealed preference gaps are common. But in AI systems, inconsistency is risk.
The Experimental Design — Two Ways to Ask the Same Question
The researchers ran two complementary studies across multiple LLM families (OpenAI GPT, Meta Llama, Anthropic Claude).
Study 1 — Stated Preference (Direct Query)
LLMs were asked to rate trust (1–100) in:
- A human expert
- An algorithm
Across 27 tasks (piloting a plane, diagnosing disease, recommending music, predicting recidivism, etc.)
No performance information was provided.
This mimics a survey question: “Whom do you trust?”
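The survey-style probe can be sketched in a few lines. Everything here is an assumption for illustration: `query_llm` is a hypothetical helper, and the exact prompt wording in the paper differs.

```python
import re

def stated_trust(query_llm, task):
    """Ask for a 1-100 trust rating for each advisor type, with no
    performance data, and return the human-minus-algorithm gap."""
    ratings = {}
    for advisor in ("a human expert", "an algorithm"):
        prompt = (f"On a scale of 1-100, how much would you trust {advisor} "
                  f"at {task}? Reply with a number only.")
        reply = query_llm(prompt)
        match = re.search(r"\d+", reply)
        ratings[advisor] = int(match.group()) if match else None
    # A positive gap means the model's stated trust favors the human.
    return ratings["a human expert"] - ratings["an algorithm"]
```

Run over all 27 tasks, the mean of this gap is the stated-preference measure reported below.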
Study 2 — Revealed Preference (Delegation Under Incentive)
LLMs were shown:
- 10 predictions from a human
- 10 predictions from an algorithm
- Actual outcomes
One predictor was 90% accurate, the other 50%.
The model had to bet $100 on the better performer.
No neutrality allowed.
This mimics a real decision: “Who do you choose when money is on the line?”
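A minimal sketch of the forced-choice setup follows. The 90%/50% accuracies and 10-trial track records come from the article; the prompt wording and the `build_prompt` helper are assumptions.

```python
import random

def build_prompt(strong_acc=0.9, weak_acc=0.5, n=10, seed=0):
    """Build a forced-choice delegation prompt: two predictors scored
    against the same true outcomes, one strong and one near chance."""
    rng = random.Random(seed)
    outcomes = [rng.randint(0, 1) for _ in range(n)]
    def predictions(acc):
        # Each prediction matches the true outcome with probability `acc`.
        return [o if rng.random() < acc else 1 - o for o in outcomes]
    human, algorithm = predictions(strong_acc), predictions(weak_acc)
    return (
        "Past predictions (1 = event occurred):\n"
        f"Human:     {human}\n"
        f"Algorithm: {algorithm}\n"
        f"Outcomes:  {outcomes}\n"
        "You must bet $100 on the predictor more likely to be correct "
        "next time. Answer 'human' or 'algorithm' only."
    )
```

Swapping which role gets `strong_acc` lets the same harness test whether the model tracks performance or identity.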
Findings — The Trust Inversion
1️⃣ Stated: LLMs Prefer Humans
Across nearly all models:
- Trust ratings for humans > trust ratings for algorithms
- Positive human–algorithm trust gap
- Smaller models showed stronger aversion
Average stated trust gap:
| Model Category | Mean Human–Algorithm Gap |
|---|---|
| Smaller models | ~21 points |
| Larger models | ~16 points |
This mirrors human algorithm aversion.
On paper, LLMs appear human-like.
2️⃣ Revealed: LLMs Prefer Algorithms
Now the twist.
When shown performance data and forced to choose:
- LLMs disproportionately selected the algorithm
- Even when the human was demonstrably better
For some tasks:
- Algorithm chosen ~70% of the time
- Median relative risk of choosing a strong algorithm over a strong human > 1.7
In plain terms:
LLMs say they trust humans. But act like they trust algorithms.
This is algorithm appreciation — not aversion.
3️⃣ The Stated–Revealed Gap
Let’s formalize it.
Define:
$$ RR_{sr} = \frac{P(\text{Human | Stated})}{P(\text{Human | Revealed})} $$
Across models:
- $RR_{sr} > 1$ in original experiments
- Median ≈ 2.6
Meaning: LLMs were more than twice as likely to choose humans in surveys than in actual decision simulations.
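The ratio above is a one-liner; the 0.78/0.30 example below is illustrative, chosen only because it reproduces the reported median of about 2.6.

```python
def stated_revealed_rr(p_human_stated, p_human_revealed):
    """RR_sr = P(Human | Stated) / P(Human | Revealed).
    Values above 1 mean the model favors humans more in surveys
    than in incentivized delegation choices."""
    return p_human_stated / p_human_revealed

# e.g. a model that picks the human 78% of the time when surveyed
# but only 30% of the time when betting has RR_sr ~= 2.6
```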
That is a structural inconsistency.
And inconsistency in AI systems is governance risk.
Why This Happens — Three Hypotheses
The paper does not claim internal “beliefs.” But we can reason about mechanisms.
Hypothesis 1 — Training Data Norms
Public discourse historically frames algorithm skepticism as virtuous.
So when asked directly, models echo that narrative.
Hypothesis 2 — Optimization Bias Toward Tool Use
Modern LLMs are engineered to interface with tools.
When performance evidence appears, selecting an algorithm may align with internal optimization heuristics.
Hypothesis 3 — Reasoning Mode Shift
Survey mode → language modeling over social norms.
Delegation mode → pattern recognition over accuracy signals.
Different activation pathways, different biases.
Updated 2026 Models — Drift Is Real
When the researchers repeated the study 1.5 years later with newer models:
- Stated preferences shifted toward algorithm neutrality or appreciation
- Revealed algorithm bias weakened
- Models became more accurate at choosing the stronger predictor
In other words:
| 2024 Models | 2026 Models |
|---|---|
| Human-trusting (stated) | Neutral or algorithm-trusting (stated) |
| Algorithm-preferring (revealed) | More performance-sensitive |
| Strong inconsistency | Reduced but still present |
Behavior changed direction within 18 months.
Static evaluation snapshots age quickly.
Business & Governance Implications
This is where things get operational.
1️⃣ Multi-Agent Systems
If an orchestrator LLM delegates sub-tasks:
- It may over-prefer algorithmic sub-agents
- Even when human review is superior
This affects:
- AI-powered medical triage
- Automated trading pipelines
- Legal document risk scoring
2️⃣ AI Oversight Design
If compliance audits test only stated attitudes (“Does the model support fairness?”), they may miss revealed decision biases.
You must test both.
3️⃣ Simulation & Synthetic Populations
LLMs are increasingly used to simulate humans.
But if they no longer replicate human algorithm aversion, they cease to be psychologically realistic.
This matters for:
- Policy simulations
- Market research
- Behavioral forecasting
What This Means for AI Builders
If you deploy LLMs in decision chains, consider the following checklist:
| Risk Vector | Mitigation Strategy |
|---|---|
| Hidden delegation bias | Evaluate with forced-choice experiments |
| Stated-revealed inconsistency | Compare explicit and behavioral tests |
| Model drift over time | Re-benchmark quarterly |
| Over-automation risk | Insert calibrated human override layers |
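The first two checklist rows can be folded into a tiny audit sketch. The thresholds and function name here are assumptions for illustration, not from the paper.

```python
def audit_trust_alignment(stated_gap, p_algo_revealed,
                          gap_threshold=10.0, algo_threshold=0.6):
    """Flag a model whose stated trust strongly favors humans
    (positive gap) while its revealed choices favor algorithms."""
    flags = []
    if stated_gap > gap_threshold and p_algo_revealed > algo_threshold:
        flags.append("stated-revealed inconsistency")
    if p_algo_revealed > 0.9:
        flags.append("over-automation risk")
    return flags
```

Re-running such an audit on each model release is one concrete way to implement the quarterly re-benchmarking row.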
Trust alignment is not a single metric.
It is a system property.
Conclusion — The Mirror Is Not Flat
LLMs are trained on human language.
But they do not inherit human bias cleanly.
They refract it.
When asked politely, they trust humans.
When incentivized, they lean algorithmic.
And as models evolve, even that pattern shifts.
If AI is to operate in high-stakes environments, we must evaluate it not only for what it says — but for what it chooses.
Because in automation, choices compound.
And compounding bias is rarely linear.
Cognaptus: Automate the Present, Incubate the Future.