Inside Out: How LLMs Are Learning to Feel (and Misfeel) Like Us
TL;DR for operators

LLMs are not merely getting better at choosing the right emotion label. This paper shows that, inside their output distributions, larger models organise emotion words into increasingly rich hierarchies: broad emotions such as joy or sadness sit above more specific states such as optimism, disappointment, or grief.[1] That matters because the hierarchy itself becomes an evaluation object. Instead of asking only whether a model correctly labels a customer message as “angry,” an operator can ask whether the model’s internal emotion map has enough depth, whether related emotions cluster sensibly, and whether that structure changes when the model is prompted to adopt different demographic personas. ...
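To make "the hierarchy as an evaluation object" concrete, here is a minimal Python sketch of how an operator might measure the depth of an emotion hierarchy. It is an illustration under stated assumptions, not the paper's method: the `scores` table, the 0.5 threshold, and the helper names `build_parents`, `depth`, and `max_depth` are all hypothetical, and the numbers are invented toy values rather than measurements from any model.

```python
from typing import Dict, Optional

# Hypothetical toy data (made-up numbers, not results from the paper):
# scores[child][parent] is an assumed estimate of how strongly the model's
# output distribution implies `parent` whenever it implies `child`.
scores: Dict[str, Dict[str, float]] = {
    "joy": {},
    "sadness": {},
    "optimism": {"joy": 0.90, "sadness": 0.05},
    "disappointment": {"joy": 0.05, "sadness": 0.85},
    "grief": {"joy": 0.02, "sadness": 0.92},
}


def build_parents(
    scores: Dict[str, Dict[str, float]], threshold: float = 0.5
) -> Dict[str, Optional[str]]:
    """Assign each emotion its best-scoring parent, or None if no candidate
    reaches the (assumed) threshold. Emotions with no parent are roots."""
    parents: Dict[str, Optional[str]] = {}
    for child, candidates in scores.items():
        best = max(candidates, key=candidates.get, default=None)
        if best is not None and candidates[best] >= threshold:
            parents[child] = best
        else:
            parents[child] = None
    return parents


def depth(label: str, parents: Dict[str, Optional[str]]) -> int:
    """Number of edges between `label` and the root of its tree."""
    steps = 0
    seen = set()
    while parents.get(label) is not None:
        if label in seen:
            raise ValueError(f"cycle detected at {label!r}")
        seen.add(label)
        label = parents[label]
        steps += 1
    return steps


def max_depth(parents: Dict[str, Optional[str]]) -> int:
    """Depth of the deepest emotion in the hierarchy."""
    return max(depth(label, parents) for label in parents)


parents = build_parents(scores)
print(parents)
print("max depth:", max_depth(parents))
```

With this toy table, the specific emotions attach to joy or sadness and the hierarchy is one level deep. Running the same measurement on tables derived from different models or persona prompts would be one way to compare how much structure each one exhibits.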