Human-AI Interaction

When 'Check the AC' Becomes the Hard Part

TL;DR for operators: Smart-home assistants do not fail only when users are vague. They fail when users become efficient. The PEC-Home paper studies a familiar pattern: after repeated interaction, people stop saying the whole thing. “Please turn on the air conditioner in the bedroom and set it to 26 degrees at 10 PM” eventually becomes “check the AC” or “handle that thing.” Humans manage this because shared context, identity, place, and prior routines do the missing work. Current LLM assistants are much less charming under that burden. ...

The Chatbot Passed the Test. Then It Bowed Too Low.

TL;DR for operators: NICE is useful because it does not ask whether a model has “social intelligence” as one grand, vaguely flattering trait. It breaks social intelligence into a diagnostic structure: 4 categories, 11 dimensions, 34 facets, and 137 Chinese-context ranking items. That matters because a model can look socially competent in aggregate while failing on the interaction behaviours that make or break real deployments. ...

When Your AI Knows Too Little: The Hidden Bottleneck in Personal Agents

Lunch is a simple word. In an AI assistant demo, “order me lunch” looks like the kind of request that should be easy by now. Open the food app. Pick something. Pay. Done. The button-clicking part is no longer the miracle. The problem is everything the user did not say. Do they avoid peanuts? Do they usually order from Tuantuan or Chilemei? Is “light lunch” about calories, price, time, or avoiding the food coma before a meeting? Should the assistant ask first, or does asking defeat the whole point of assistance? And if the user says no, does the assistant actually stop, or does it “helpfully” continue doing the wrong thing with the confidence of a junior consultant holding a fresh slide deck? ...

The Mood Doesn’t Move the Model — But It Can Route It

Tone is an attractive business lever because it feels cheap. No new model. No new data pipeline. No procurement meeting in which someone says “governance layer” with a straight face. Just add a more emotional sentence before the prompt and hope the model becomes sharper. This is exactly the kind of idea that spreads because it is easy to try and hard to interpret. One team finds that urgency helps. Another finds that politeness helps. A third discovers that telling the model you are scared improves one benchmark and damages another. Soon the organization has a secret prompt cookbook, which is always a classy substitute for measurement. ...

Protocol Over Prompts: When Structure Becomes Strategy in AI Communication

Prompts are now office furniture. Everyone has them. Everyone complains about them. Nobody is quite sure who owns the standard version. One team keeps a Notion page of “best prompts.” Another hides theirs in a spreadsheet. A third tells new staff to “just ask clearly,” which is not a method, but it does have the administrative elegance of doing nothing. ...

Team Sync or Team Sink: When AI Starts Reading Your Pulse

Pulse is a tempting number. Put two people in a high-pressure task, strap a wearable to each wrist, measure how their bodies move together, and it becomes very easy to tell a neat story: synchronized teams are aligned teams; aligned teams perform better; therefore, AI should monitor physiological synchrony and intervene when people fall out of sync. ...

The Art of Interrupting AI: When Knowing Isn’t Talking

The meeting-room test AI still fails: Meeting rooms are unforgiving places for intelligence. A person can know the topic, understand the slides, recognize every face around the table, and still be a terrible participant. Speak too early, and they interrupt. Speak too late, and the moment has passed. Say something factually relevant but socially tone-deaf, and the room quietly deducts points. No spreadsheet records this. Everyone notices anyway. ...

The Long Conversation Problem: How MAPO Teaches AI to Care Over Time

Customer support has a familiar failure mode: the first answer sounds polished, the second answer sounds patient, the third answer sounds as if the system has quietly forgotten what problem it is solving. The user is still there. The emotional state has changed. The unresolved issue has shifted. The model, meanwhile, keeps producing individually acceptable replies, like a waiter bringing one beautifully plated dish at a time to the wrong table. ...

When Plans Talk Back: Conversational AI Meets Classical Planning

Schedule three people, one car, two children, five afternoon activities, and several goals that quietly hate each other. Then ask a normal person to find the best plan. That is already a planning problem. Now ask the same person to understand why a plan failed, which goals caused the failure, what could be added without breaking the plan, and what must be sacrificed if one more constraint is enforced. ...

When Predictions Persuade: The Hidden Causal Risks of AI Decision Support

A prediction looks harmless when it is presented as “just information.” A loan officer sees a default-risk score. A doctor sees a survival estimate. A welfare caseworker sees a predicted probability of program success. The model does not press the button. The human still decides. Everyone in the room can therefore relax, at least until the audit committee arrives with coffee and regrettable questions. ...