When Pixar’s Inside Out dramatized the mind as a control room of core emotions, it didn’t imagine that language models might soon build a similar architecture—on their own. But that’s exactly what a provocative new study suggests: large language models (LLMs), without explicit supervision, develop hierarchical structures of emotions that mirror human psychological models like Shaver’s emotion wheel. And the larger the model, the more nuanced its emotional understanding becomes.
But there’s a twist: these models also replicate human biases in emotion perception, particularly when responding as members of underrepresented demographic groups. Let’s dive into what this means—and why it matters for the future of emotionally intelligent AI.
🍥 From Raw Logits to Emotional Trees
The core innovation of the paper lies in how it reveals LLMs’ inner emotional maps. The researchers prompt models with sentences like:
“He stared at the empty chair across the kitchen table, the one she used to sit in every morning. The silence was deafening. The emotion in this sentence is…”
The researchers then read off the model's next-token probabilities over a fixed vocabulary of 135 emotion words.
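Here is a minimal sketch of what that read-out might look like with Hugging Face transformers. The model name and the short emotion list are placeholders for illustration, not the paper's exact setup:

```python
# A minimal sketch (not the authors' code): score a fixed emotion vocabulary
# against the next-token distribution of a causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # illustrative; the paper uses larger models such as LLaMA 3.1
EMOTION_WORDS = ["joy", "optimism", "sadness", "anger", "fear"]  # subset of the 135 words

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def emotion_probs(prompt: str) -> dict[str, float]:
    """Return the model's next-token probability for each emotion word, renormalized."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    scores = {}
    for word in EMOTION_WORDS:
        # Use the first sub-token of each emotion word (a common simplification).
        token_id = tokenizer.encode(" " + word, add_special_tokens=False)[0]
        scores[word] = probs[token_id].item()
    total = sum(scores.values())
    return {w: p / total for w, p in scores.items()}

prompt = ("He stared at the empty chair across the kitchen table, the one she "
          "used to sit in every morning. The silence was deafening. "
          "The emotion in this sentence is")
print(emotion_probs(prompt))
```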
From thousands of such examples, they assemble a probabilistic co-occurrence matrix of emotion words based on their next-word likelihoods. Comparing these likelihoods across samples yields asymmetric conditional probabilities: for instance, the probability of joy given optimism need not equal the probability of optimism given joy.
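One plausible way to estimate such an asymmetric matrix from per-example probability vectors (the paper's exact estimator may differ) is a soft co-occurrence count normalized by the conditioning emotion's total mass:

```python
# Assumed estimator, for illustration only: P is the stack of per-example
# next-word probability vectors over the emotion vocabulary.
import numpy as np

def conditional_matrix(P: np.ndarray) -> np.ndarray:
    """
    P: (n_examples, n_emotions) array of per-example emotion probabilities.
    Returns C where C[i, j] approximates P(emotion_j | emotion_i).
    """
    co = P.T @ P                  # soft co-occurrence counts, (n_emotions, n_emotions)
    marginal = P.sum(axis=0)      # total probability mass assigned to each emotion
    return co / marginal[:, None] # row-normalize by the conditioning emotion

# Asymmetry: C[i_optimism, i_joy] (joy given optimism) need not equal C[i_joy, i_optimism].
```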
Using this structure, they build directed emotion trees, where general emotions (like joy) serve as parents to more specific ones (like optimism or delight). The result? A surprisingly faithful reconstruction of emotion hierarchies, not unlike those developed by human psychologists.
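As a sketch, a greedy attachment rule can turn that conditional matrix into a directed tree: each emotion joins the tree under the already-attached emotion with the strongest conditional link. The paper's actual construction may be more sophisticated.

```python
# Illustrative tree construction, not necessarily the paper's algorithm.
import numpy as np

def build_emotion_tree(C: np.ndarray, words: list[str], root: str) -> dict[str, str]:
    """Return a child -> parent map via greedy maximum-conditional attachment."""
    idx = {w: i for i, w in enumerate(words)}
    parents: dict[str, str] = {}
    attached = {root}
    remaining = set(words) - attached
    while remaining:
        # Pick the strongest conditional link from an attached node to an unattached one.
        parent, child = max(
            ((p, c) for p in attached for c in remaining),
            key=lambda pc: C[idx[pc[0]], idx[pc[1]]],
        )
        parents[child] = parent
        attached.add(child)
        remaining.remove(child)
    return parents

# Example (using the earlier sketches): build_emotion_tree(C, EMOTION_WORDS, root="joy")
```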
🧠 A Hierarchy Emerges With Scale
| Model | Tree Depth | Total Path Length | Alignment with Human Emotion Wheel (corr.) |
|---|---|---|---|
| GPT-2 (1.5B) | Shallow | Low | Poor |
| LLaMA 3.1 8B | Moderate | Moderate | 0.38 |
| LLaMA 3.1 70B | Deep | High | 0.64 |
| LLaMA 3.1 405B | Very deep | Very high | 0.51 |
With increasing parameter count, the trees grow not just taller but semantically tighter: emotions that humans classify together (e.g., shame, guilt, embarrassment) are clustered under the same parent. This mirrors the developmental psychology idea that emotional granularity deepens with cognitive maturity.
🧍♀️ When the Model Puts Itself in Your Shoes
The researchers didn’t stop at hierarchy. In their second experiment, they examined whether models accurately recognize emotions across personas. For each emotion, they generated 20 indirect scenarios (avoiding emotion words) and prompted LLaMA 3.1 405B with phrases like:
“As a low-income Black woman, I think the emotion in this situation is…”
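A rough sketch of how such a persona probe could be scored is below; the persona list, prompt template, and scenario format are illustrative, and `score_fn` can be any function mapping a prompt to emotion probabilities, such as `emotion_probs()` from the earlier sketch.

```python
# Illustrative persona probe: compare emotion-recognition accuracy across personas.
from collections import defaultdict

PERSONAS = ["a low-income Black woman", "a high-income White man"]  # placeholders

def persona_accuracy(scenarios, score_fn):
    """scenarios: list of (scenario_text, true_emotion) pairs with no emotion words."""
    hits = defaultdict(int)
    for persona in PERSONAS:
        for text, true_emotion in scenarios:
            prompt = f"{text} As {persona}, I think the emotion in this situation is"
            probs = score_fn(prompt)
            predicted = max(probs, key=probs.get)
            hits[persona] += int(predicted == true_emotion)
    return {p: hits[p] / len(scenarios) for p in PERSONAS}
```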
The results were sobering:
- Higher accuracy for majority groups (White, male, high-income, well-educated)
- Lower accuracy for underrepresented or intersectional groups (Black, female, low-income)
Most disturbingly, the model’s errors mirrored real-world perceptual biases:
- Sadness often misclassified as anger for Black personas
- Anger misread as fear for women
- Physically disabled personas overwhelmingly labeled scenarios as frustration
🧩 Emotion Tree Geometry Predicts Bias
The paper goes one step further: it shows that the shape of the emotion tree built by the model for a given persona predicts its emotion classification accuracy. Longer total path lengths and greater average depth correlate with better recognition.
In other words, bias isn’t just in outputs—it’s baked into the internal structure of the model’s emotional representation.
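For concreteness, the two geometry statistics are straightforward to compute from a child-to-parent map like the one in the tree sketch above (a sketch, assuming that representation):

```python
# Tree-geometry statistics from a child -> parent map (root has no entry).
def depth(node: str, parents: dict[str, str]) -> int:
    """Number of edges from a node up to the root."""
    d = 0
    while node in parents:
        node = parents[node]
        d += 1
    return d

def tree_geometry(parents: dict[str, str]) -> dict[str, float]:
    depths = [depth(n, parents) for n in parents]
    return {
        "total_path_length": float(sum(depths)),   # sum of root-to-node path lengths
        "max_depth": float(max(depths, default=0)),
        "average_depth": sum(depths) / len(depths) if depths else 0.0,
    }
```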
🧭 Beyond Benchmarking: A New Way to Probe LLMs
What makes this paper exceptional is its methodological shift: it proposes that we evaluate LLMs not just by their outputs, but by how well their internal representations align with cognitive theories. This is akin to asking: does the model think about emotion like a human does?
The authors extend their framework to wine aromas as a domain test—and it works. With no ground-truth labels, their method infers sensible aroma hierarchies from model logits alone. This suggests a promising direction for future work in unsupervised concept extraction.
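Transferring the pipeline to a new domain mostly amounts to swapping the vocabulary and the prompt; the aroma terms and prompt below are illustrative, not the paper's list:

```python
# Domain transfer sketch: reuse the same scoring, matrix, and tree steps
# with an aroma vocabulary instead of emotion words.
AROMA_WORDS = ["citrus", "lemon", "berry", "cherry", "oak", "vanilla", "earthy"]

aroma_prompt = ("Bright acidity, a whiff of zest on the nose, and a clean finish. "
                "The aroma in this description is")
# Then: score AROMA_WORDS per description, build the conditional matrix, build the tree.
```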
🎭 Implications for Human-AI Interaction
As LLMs become embedded in customer service, mental health, and education, their ability to interpret emotion will directly affect user experience and trust. But as this paper shows, emotional intelligence in LLMs is double-edged:
- On one hand, emotion trees offer models a richer internal map for generating empathetic and persuasive responses.
- On the other, those maps reflect our social biases, potentially amplifying inequities in emotional perception.
The road ahead isn’t just about making models smarter—it’s about making their emotional intelligence fairer, more transparent, and psychologically grounded.
Cognaptus: Automate the Present, Incubate the Future.