In financial sentiment analysis, the devil has always been in the labeling. Most datasets — even the industry-standard Financial-Phrasebank — ask human annotators to tag headlines as positive, negative, or neutral. But here’s the problem: the market often disagrees.
Take a headline reporting widening losses. Annotators marked it “negative.” Yet the stock rose the next day. Welcome to the disconnect.
Enter FinMarBa, a bold new dataset that cuts out the middleman — the human — and lets the market itself do the labeling. Developed by Lefort et al. (2025), this 61,252-item dataset uses next-day price reactions to classify financial news, creating a labeling method that is empirically grounded, scalable, and (critically) aligned with investor behavior.
Label It Like You Mean It: The FinMarBa Pipeline
Here’s how it works:
| Step | Action | Tool |
|---|---|---|
| 1 | Collect Bloomberg Market Wraps (2010–2024) | Bloomberg |
| 2 | Extract daily headlines from summaries | GPT-4 |
| 3 | Identify relevant tickers per headline | GPT-4 |
| 4 | Compute next-day % return for each ticker | Market data |
| 5 | Compare to historical quantiles (5-year rolling) | Statistical filter |
| 6 | Label as Positive (>60th %ile), Negative (<30th %ile), Neutral otherwise | Market-driven rule |
No more guessing how “bad” a loss sounds. If the market rewards it, it’s labeled positive.
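Here is a minimal sketch of that labeling rule in pandas. The percentile cutoffs follow the table above; the return series and the example returns are hypothetical stand-ins, not the paper's actual data.

```python
import numpy as np
import pandas as pd

def label_headline(next_day_ret: float, hist_rets: pd.Series) -> str:
    """Label a headline by where the ticker's next-day return falls
    in its own 5-year return distribution."""
    pos_cut = hist_rets.quantile(0.60)  # >60th percentile -> Positive
    neg_cut = hist_rets.quantile(0.30)  # <30th percentile -> Negative
    if next_day_ret > pos_cut:
        return "Positive"
    if next_day_ret < neg_cut:
        return "Negative"
    return "Neutral"

# Hypothetical 5-year rolling window of daily returns (~252 trading days/year).
rng = np.random.default_rng(0)
hist = pd.Series(rng.normal(0.0004, 0.012, 5 * 252))

print(label_headline(0.021, hist))    # big up-move  -> Positive
print(label_headline(-0.018, hist))   # big drawdown -> Negative
print(label_headline(0.0002, hist))   # middling     -> Neutral
```

Note the asymmetric cutoffs: a headline has to clear the 60th percentile to count as positive, but only fall below the 30th to count as negative.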
A Better Classifier Needs a Better Dataset
To test their theory, the authors trained two sentiment models:
- FinMarBaBERT: Trained on FinMarBa headlines and labels
- FinBERT: Trained on Financial-Phrasebank
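Neither training recipe is spelled out here, but as a rough sketch, fine-tuning a BERT-style classifier on market-labeled headlines with Hugging Face transformers looks something like this. The base model name, the label mapping, and the two example headlines are illustrative assumptions, not the paper's actual setup.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative stand-ins; real training would use all 61,252 FinMarBa pairs.
data = Dataset.from_dict({
    "text": ["Company X posts record quarterly revenue",
             "Regulator opens probe into Company Y"],
    "label": [0, 1],  # assumed mapping: 0=Positive, 1=Negative, 2=Neutral
})

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
ds = data.map(lambda b: tok(b["text"], truncation=True, max_length=64,
                            padding="max_length"), batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)

trainer = Trainer(
    model=model,
    train_dataset=ds,
    args=TrainingArguments(output_dir="finmarba-bert",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
)
trainer.train()
```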
Then they backtested both signals on the S&P 500 from 2019 to 2024. The result?
- FinMarBaBERT Sharpe Ratio: 0.30
- FinBERT Sharpe Ratio: -0.13
In financial terms, this is night and day. A positive Sharpe means the signal delivers real risk-adjusted returns. A negative one means you're being misled by noise.
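For reference, here is how the Sharpe ratio of a simple sentiment-driven strategy is computed. The `signal` and `index_ret` series below are synthetic placeholders, and the paper's actual backtest construction may differ.

```python
import numpy as np

def annualized_sharpe(daily_rets: np.ndarray, periods: int = 252) -> float:
    # Annualized Sharpe: mean daily return over its volatility, scaled.
    return np.sqrt(periods) * daily_rets.mean() / daily_rets.std(ddof=1)

rng = np.random.default_rng(1)
signal = rng.choice([-1, 0, 1], size=5 * 252)       # prior-day sentiment: short/flat/long
index_ret = rng.normal(0.0003, 0.01, size=5 * 252)  # stand-in S&P 500 daily returns

# Go long after bullish days, short after bearish ones, stay flat otherwise.
strategy_ret = signal * index_ret
print(f"Sharpe: {annualized_sharpe(strategy_ret):.2f}")
```

With random placeholders this prints a number near zero; the point is the computation, not the value.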
FinMarBa’s sentiment labels also reflect the natural optimism bias of equities:
| Label | FinMarBa (%) | Phrasebank (%) |
|---|---|---|
| Positive | 42.11 | 28.13 |
| Negative | 31.43 | 12.46 |
| Neutral | 26.45 | 59.41 |
Robustness You Can Trust
The authors went further, running forward-looking perturbation tests: headlines were shuffled within 5–15 day windows, so that a controlled fraction of them arrived "early" and leaked future information into the signal. As that fraction grew, FinMarBa's Sharpe improved, which is exactly what you would expect if its labels encode genuinely directional, market-relevant information; a signal with no real link to future returns would gain nothing from the leak. (A mechanical sketch of the perturbation follows the table below.)
The FinBERT-based model, by contrast, just wobbled.
| Window size | Sharpe gain at 50% future information |
|---|---|
| 5 days | +1.94 |
| 10 days | +0.62 |
| 15 days | +0.39 |
FinMarBa isn’t just more predictive — it’s more resilient.
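Mechanically, that leak can be simulated like this, reusing `signal`, `index_ret`, and `annualized_sharpe` from the sketch above. This only illustrates how the perturbation is constructed: with synthetic placeholders the measured gain hovers around zero, and only a genuinely predictive signal, like the one the paper reports, would show gains like those in the table.

```python
def leak_future(signal: np.ndarray, window: int = 5, frac: float = 0.5,
                seed: int = 2) -> np.ndarray:
    """With probability `frac`, replace each day's signal with one drawn
    from the next `window` days, simulating headlines that arrive early."""
    rng = np.random.default_rng(seed)
    leaked = signal.copy()
    for t in range(len(signal) - window):
        if rng.random() < frac:
            leaked[t] = signal[t + rng.integers(1, window + 1)]
    return leaked

for w in (5, 10, 15):
    gain = (annualized_sharpe(leak_future(signal, window=w) * index_ret)
            - annualized_sharpe(signal * index_ret))
    print(f"{w}-day window: Sharpe gain {gain:+.2f}")
```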
Why This Matters
Most finance LLMs are still trained on human-labeled data. But as models get stronger, the bottleneck shifts to the quality of supervision. FinMarBa’s innovation isn’t a model — it’s a new truth signal for the financial world.
By using price reaction as ground truth, it offers:
- A scalable annotation framework
- Cross-market generality (equities, bonds, commodities, crypto)
- Alignment with real investor behavior
This is a dataset not just for researchers, but for quants, asset managers, and financial AI builders who want their models to trade — not just parse.
You can grab the dataset here and the fine-tuned model here.
Cognaptus: Automate the Present, Incubate the Future