The Chatbot Passed the Test. Then It Bowed Too Low.
TL;DR for operators NICE is useful because it does not ask whether a model has “social intelligence” as one grand, vaguely flattering trait. It breaks social intelligence into a diagnostic structure: 4 categories, 11 dimensions, 34 facets, and 137 Chinese-context ranking items. That matters because a model can look socially competent in aggregate while failing on the interaction behaviours that make or break real deployments. ...