In the wake of the GameStop and AMC frenzies, financial firms and researchers have been racing to decode one question: Can social media sentiment predict stock prices?
A new paper from researchers at Wrocław University of Science and Technology provides a sobering answer: not really. Despite employing advanced sentiment models—including a ChatGPT-annotated and emoji-savvy version of Financial-RoBERTa—the study found only weak and inconsistent relationships between sentiment and price movement for GME and AMC.
But that doesn’t mean Reddit is irrelevant. In fact, some of the strongest signals came from simpler metrics: the number of Reddit comments and Google search interest, both of which showed moderate correlation and bidirectional Granger causality with stock price changes. In short, attention, not emotion, appears to be the real mover.
When RoBERTa Meets Diamond Hands
The researchers collected ~7 million Reddit posts from /r/wallstreetbets during the height of the 2021 meme stock mania and ran them through three sentiment engines:
Model | Type | Notes |
---|---|---|
TextBlob | Rule-based | Simple polarity scores (-1 to 1) |
Financial-RoBERTa | Pretrained | Trained on SEC filings, earnings transcripts, etc. |
Finetuned RoBERTa | Fine-tuned | Financial-RoBERTa further trained on ChatGPT-labeled Reddit data; tuned to slang and emojis |
To offset the scarcity of negative examples in the highly imbalanced sentiment dataset, the team used ChatGPT again to generate synthetic ones, mimicking the aggressive and sarcastic tone of Reddit haters.
They also included a fourth model: a bare-bones emoji counter that just tallies usage of 🚀, 💎, and 🦍. Surprisingly, this crude approach outperformed sentiment classifiers in several causality tests.
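To make the idea concrete, here is a minimal sketch of an emoji tally of the kind described above. It is not the researchers' code: the function names, the input shape (a dict mapping a date string to that day's comments), and the tracked emoji set are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: an illustrative emoji set, not the paper's exact list.
TRACKED_EMOJIS = ("🚀", "💎")


def count_emojis(comment, tracked=TRACKED_EMOJIS):
    """Return how often each tracked emoji appears in one comment."""
    return Counter({emoji: comment.count(emoji) for emoji in tracked})


def daily_emoji_totals(comments_by_day, tracked=TRACKED_EMOJIS):
    """Sum tracked-emoji usage per day.

    comments_by_day maps a date string to a list of comment texts
    (a hypothetical input format chosen for this sketch).
    """
    totals = {}
    for day, comments in comments_by_day.items():
        day_counts = Counter()
        for comment in comments:
            day_counts.update(count_emojis(comment, tracked))
        totals[day] = sum(day_counts.values())
    return totals


# Made-up comments, for illustration only.
example = {
    "2021-01-27": ["GME 🚀🚀🚀", "💎 hands"],
    "2021-01-28": ["no emojis here"],
}
totals = daily_emoji_totals(example)
```

The resulting daily totals form a time series that can be compared with price changes in the same way as any sentiment score.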
The Real Predictors: Volume and Virality
Sentiment values showed minimal correlation with stock prices. For GME:
- Finetuned RoBERTa correlation: 0.05 (Pearson), -0.11 (Kendall's Tau)
- Emoji Counter: -0.29 to -0.39, oddly stronger in magnitude than the finetuned model, though negative
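For readers unfamiliar with the two statistics, the sketch below computes both from scratch. Pearson measures linear association, while Kendall's Tau measures agreement in rank order, which is why the two can disagree in sign on noisy data. This is a plain-Python illustration, not the study's pipeline; the tau-a variant (no tie correction) and the toy numbers are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / all pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Toy series, invented for illustration (not data from the study).
sentiment = [1, 2, 3, 4]
price_change = [1, 3, 2, 4]
r = pearson(sentiment, price_change)
tau = kendall_tau(sentiment, price_change)
```

In practice a library routine with tie handling and p-values would be the safer choice; the point here is only to show what each number measures.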
However, two metrics consistently demonstrated stronger relationships:
Feature | GME Pearson | AMC Pearson |
---|---|---|
Number of Comments | 0.52 | 0.38 |
Google Trends | 0.43 | 0.48 |
And in Granger causality tests, both metrics had significant bidirectional effects with stock prices when using stationary data. That is:
- More chatter helps predict subsequent price moves
- Price moves help predict subsequent chatter
A feedback loop of attention and volatility.
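A Granger test asks whether past values of one series improve a forecast of another beyond what that series' own past already provides. The sketch below shows the core of the idea under simplifying assumptions that are mine, not the paper's: a single lag, first differencing as the route to stationarity, synthetic data, and an F-statistic reported without a p-value. All function names are hypothetical.

```python
import random
from itertools import accumulate


def difference(series):
    """First-difference a series, a common step toward stationarity."""
    return [b - a for a, b in zip(series, series[1:])]


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f_stat(cause, effect):
    """F-statistic: does lagged `cause` improve a lag-1 autoregression of `effect`?"""
    y = effect[1:]
    restricted = [[1.0, effect[t]] for t in range(len(effect) - 1)]
    unrestricted = [[1.0, effect[t], cause[t]] for t in range(len(effect) - 1)]
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    dof = len(y) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic data: price changes follow the previous day's attention change.
rng = random.Random(0)
attention_changes = [rng.gauss(0, 1) for _ in range(200)]
price_changes = [0.0] + [
    0.8 * attention_changes[t - 1] + 0.5 * rng.gauss(0, 1) for t in range(1, 200)
]
attention_level = list(accumulate(attention_changes))
price_level = list(accumulate(price_changes))

d_attention = difference(attention_level)
d_price = difference(price_level)
f_attention_to_price = granger_f_stat(d_attention, d_price)
f_price_to_attention = granger_f_stat(d_price, d_attention)
```

Because the synthetic data is built with a one-way dependence, only the attention-to-price statistic comes out large here; the study's finding was that both directions were significant on real data. For actual analysis, an established implementation such as `grangercausalitytests` in statsmodels, which handles multiple lags and p-values, is likely the better tool.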
Why Sentiment Struggles (Even with ChatGPT’s Help)
- Retail sentiment is noisy: Saying “GME to the moon” could be sarcastic, serious, or just trend-bait. Tone doesn’t always map to action.
- Emoji compression > verbose analysis: Emojis may signal stronger conviction than long rants.
- Sentiment ≠ Intent: A post expressing doubt may still reflect a user’s intent to buy the dip. NLP struggles with that nuance.
- Echo chambers distort signals: Popular opinions get amplified, while dissent fades—leading to sentiment overrepresentation.
Even the finetuned RoBERTa model, trained on slang and labeled by ChatGPT, could not overcome these fundamental limitations.
Where Do We Go From Here?
This study highlights a counterintuitive lesson for financial AI teams:
Sometimes dumb signals work better than smart ones.
Volume, emoji use, and search trends capture real-time attention and emotion more directly than complex NLP pipelines. For now, predicting meme stock moves from Reddit sentiment remains more art than science.
But the door isn’t shut. Future models may:
- Combine multimodal sentiment (text + emojis + memes)
- Factor in network propagation (who is saying what, and how it spreads)
- Distinguish between bullish sarcasm and genuine pessimism
Until then, tracking 🚀s and Reddit comment counts might beat your transformer.
Cognaptus: Automate the Present, Incubate the Future