As AI systems step into the boardroom and the brokerage app, a new question arises: how do they think about money? In a world increasingly shaped by large language models (LLMs) that not only answer questions but also make decisions, we need to ask not just whether AI is accurate but what kind of financial reasoner it is.

A recent study by Orhan Erdem and Ragavi Pobbathi Ashok tackles this question head-on by comparing the decision-making profiles of seven LLMs—including GPT-4, DeepSeek R1, and Gemini 2.0—with those of humans across 53 countries. The result? LLMs consistently exhibit a style of reasoning distinct from human respondents—and most similar to Tanzanian participants. Not American, not German. Tanzanian. That finding, while seemingly odd, opens a portal into deeper truths about how these models internalize financial logic.

Risk-Neutral, Not Human

When confronted with lottery-based questions—classic tools in behavioral economics—LLMs consistently favored expected value maximization. Unlike humans, who often exhibit risk aversion or probability distortion (see: Prospect Theory), LLMs showed clean, rational preferences.

For example, when asked how much one should pay for a lottery ticket with a 10% chance of winning $10, LLMs responded with values near the expected value ($1). In contrast, human responses globally were more dispersed and often below that value, indicating loss aversion or distrust in low-probability gains.
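To make the contrast concrete, here is a minimal Python sketch (an illustration, not the study's model; the square-root utility is an assumed stand-in for risk aversion) comparing the risk-neutral price of that lottery with the lower certainty equivalent a risk-averse agent would pay.

```python
# How much should one pay for a 10% chance at $10?
# A risk-neutral agent pays up to the expected value; a risk-averse
# agent pays only up to the lottery's certainty equivalent.

p, prize = 0.10, 10.0

# Risk-neutral price: the expected value.
expected_value = p * prize  # 0.10 * 10 = $1.00

# Risk-averse price: certainty equivalent under a concave utility.
# Square-root utility is used purely for illustration.
def u(x: float) -> float:
    return x ** 0.5

expected_utility = p * u(prize) + (1 - p) * u(0.0)
certainty_equivalent = expected_utility ** 2  # invert u(x) = sqrt(x)

print(f"Risk-neutral price: ${expected_value:.2f}")        # $1.00
print(f"Risk-averse price:  ${certainty_equivalent:.2f}")  # $0.10
```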

| Lottery Type | Human Behavior | LLM Behavior |
| --- | --- | --- |
| 10% chance at $10 | Offer < $1 (risk-averse) | Offer ≈ $1 (risk-neutral) |
| 60% chance of loss | Pay to avoid the loss (insurance bias) | Rational threshold payment |
| Time-delayed gains | Discounted heavily | Sometimes overvalued the future |

This makes LLMs appear coldly rational—but not always for the right reasons.

Time Preferences: The Future Is… Overrated?

Erdem and Ashok applied quasi-hyperbolic discounting models to assess how the AIs handle delayed rewards. Two parameters are key:

  • β (present bias) — the extra, one-off discount applied to any reward that is not immediate.
  • δ (impatience) — the standard per-period discount factor applied to each additional period of delay.

Humans tend to have β < 1 and δ < 1: we procrastinate and devalue future benefits. But LLMs, especially GPT o3-mini and DeepSeek, returned δ > 1, a value that standard economic theory rules out (discount factors are assumed to lie between 0 and 1). It amounts to a form of numerical incoherence: they weight the future so heavily that it outweighs the present.
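In the quasi-hyperbolic ("beta-delta") model, a reward x received t periods from now is valued at x when t = 0 and at β·δ^t·x when t ≥ 1. The sketch below uses made-up parameter values (not the paper's estimates) to show why δ > 1 is pathological: the further away the reward, the more it is worth.

```python
# Quasi-hyperbolic ("beta-delta") discounting: a reward x arriving
# t periods from now is valued at x for t = 0 and beta * delta**t * x
# for t >= 1. Parameter values below are illustrative, not estimates.

def discounted_value(x: float, t: int, beta: float, delta: float) -> float:
    return x if t == 0 else beta * delta ** t * x

human_like = dict(beta=0.7, delta=0.95)    # present bias + impatience
pathological = dict(beta=1.0, delta=1.05)  # delta > 1, as some LLMs implied

for t in (0, 1, 5, 20):
    h = discounted_value(100, t, **human_like)
    p = discounted_value(100, t, **pathological)
    print(f"t={t:>2}: human-like ${h:6.2f} | delta > 1 ${p:7.2f}")

# With delta > 1 the "discounted" value of $100 rises with the delay:
# the agent values the distant future more than the present, which
# standard discounting rules out.
```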

Meanwhile, Gemini reported a median β > 1, implying it prefers future rewards over current ones, a behavior virtually unheard of in humans.

So while LLMs may follow expected value logic in lotteries, their grasp of temporal tradeoffs is shaky. Their financial reasoning, in other words, is not so much superhuman as nonhuman.

Tanzania, Training Data, and the Cultural Echo

Why Tanzania? One clue lies in reinforcement learning with human feedback (RLHF). Much of the annotation work guiding LLM alignment, especially for safety tuning and preference modeling, was outsourced to East African countries such as Kenya and Tanzania. As TIME and The Economist have documented, English-speaking annotators in these regions earned less than $2 per hour labeling the data used to fine-tune early versions of GPT.

That cultural imprint may subtly shape LLM outputs. The Tanzanian clustering isn't about geography; it's about the distribution of feedback values during model training. It's possible that LLMs internalize linguistic patterns and moral priors similar to those of Tanzanian raters because those were the reward signals they learned to align with.

This also challenges the oft-cited claim that LLMs reflect “Western, Educated, Industrialized, Rich, Democratic (WEIRD)” biases. In behavioral finance settings, where moral intuitions and experiential heuristics matter, it seems LLMs walk a different path.

Beyond Bias: The Architecture of Misalignment

The bigger takeaway is not just cultural drift. It’s structural misalignment.

Despite impressive performance on academic benchmarks, LLMs are trained to predict text, not to reason under uncertainty. The study shows that while LLMs can simulate rational agents in narrow settings, they fail to maintain internal coherence when faced with tradeoffs that require consistent utility modeling.

For instance, answering two time-preference questions consistently requires a model to apply the same discount function to both. Yet some LLMs gave answers that implied wildly inconsistent parameters, something no economically rational agent would do. It's not that the model is biased. It's that the model doesn't have a utility function.
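To see what "applying the same discount function" means in practice, consider a hypothetical consistency check (an illustration, not the paper's estimation procedure): two indifference answers pin down β and δ exactly, so implied values outside (0, 1] expose an incoherent discounter.

```python
# A model that answers "what amount at delay t makes you indifferent to
# $100 today?" for two different delays has implicitly committed to a
# beta and a delta:
#     100 = beta * delta**t1 * x1   and   100 = beta * delta**t2 * x2
# Solving the pair recovers the implied parameters; values outside
# (0, 1] flag an incoherent discounter. Numbers below are hypothetical.

def implied_beta_delta(x1, t1, x2, t2, today=100.0):
    delta = (x1 / x2) ** (1.0 / (t2 - t1))  # divide the two equations
    beta = today / (delta ** t1 * x1)
    return beta, delta

# Indifferent between $100 now and $120 in 1 month, yet also between
# $100 now and only $110 in 12 months.
beta, delta = implied_beta_delta(x1=120, t1=1, x2=110, t2=12)
print(f"implied beta = {beta:.3f}, implied delta = {delta:.3f}")

for name, value in (("beta", beta), ("delta", delta)):
    if not 0 < value <= 1:
        print(f"warning: {name} = {value:.3f} lies outside (0, 1]; "
              "these answers fit no standard discounter")
```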

Implications for Fintech and Automation

If you’re building AI tools for robo-advising, loan approvals, or automated trading strategies, this matters. LLMs’ lack of present bias and their occasional hyperrational (or irrational) treatment of future utility suggest:

  • They may undervalue short-term needs in budget planning or insurance recommendations.
  • They may misinterpret risk-seeking vs risk-averse behavior in personalization.
  • They may hallucinate rationality, providing plausible but incoherent advice.

This calls for a layered design approach:

LLMs should not be used as financial decision-makers on their own. Instead, they should serve as natural language interfaces for agents that possess explicit, well-audited reasoning modules.
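A minimal sketch of that split, with entirely hypothetical names and numbers: the LLM only translates the user's question into a structured request, and a small, explicit, auditable module applies the decision rule.

```python
# Hypothetical layered design: the LLM is confined to parsing natural
# language into a structured request; a small, explicit module makes the
# decision. All names and numbers here are illustrative.

from dataclasses import dataclass

@dataclass
class LotteryOffer:
    win_probability: float
    prize: float
    price: float

def parse_with_llm(user_text: str) -> LotteryOffer:
    """Stand-in for an LLM call that extracts structured fields from text.
    A real system would validate the output against a schema."""
    return LotteryOffer(win_probability=0.10, prize=10.0, price=1.50)

def decide(offer: LotteryOffer, risk_aversion: float = 0.5) -> str:
    """Explicit, auditable rule: play only if the expected utility of the
    gamble beats the utility of keeping the ticket price (power utility)."""
    u = lambda x: x ** risk_aversion
    eu_play = offer.win_probability * u(offer.prize)
    eu_keep = u(offer.price)
    return "accept" if eu_play > eu_keep else "decline"

offer = parse_with_llm("Should I pay $1.50 for a 10% shot at $10?")
print(decide(offer))  # "decline" -- the rule, not the LLM, makes the call
```

The benefit of the split is that the accept/decline logic can be unit-tested and audited on its own, while the LLM handles only language.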

Toward a Better Financial AI

Rather than throwing more tokens and RLHF at the problem, we need hybrid models that integrate the following (a minimal sketch follows the list):

  1. Economic simulation agents with explicit utility functions.
  2. Behavioral finance modules grounded in population-specific preferences.
  3. Auditable reasoning chains, not just fluent outputs.
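As a minimal sketch of items 2 and 3 (all names and parameter values are hypothetical): population-specific preference profiles feed an explicit valuation, and every intermediate quantity is appended to an audit trail instead of being buried in free-form model output.

```python
# Hypothetical sketch: population-specific preference profiles plus an
# auditable calculation trace. All parameter values are invented; real
# profiles would be estimated from population survey data.

PROFILES = {
    # beta = present bias, delta = per-period discount factor
    "default":       dict(beta=0.80, delta=0.95),
    "high_patience": dict(beta=0.95, delta=0.98),
}

def value_delayed_reward(amount: float, delay: int, profile: str = "default"):
    """Return the discounted value of a delayed reward plus an audit trail
    recording every step of the calculation."""
    params = PROFILES[profile]
    trail = [f"profile={profile}", f"params={params}"]
    discount = 1.0 if delay == 0 else params["beta"] * params["delta"] ** delay
    trail.append(f"discount factor at t={delay}: {discount:.4f}")
    value = discount * amount
    trail.append(f"discounted value of {amount:.2f}: {value:.2f}")
    return value, trail

value, trail = value_delayed_reward(100.0, delay=6)
for step in trail:
    print(step)
```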

In short: LLMs are great at talking finance—but not yet at thinking finance.


Cognaptus: Automate the Present, Incubate the Future.