
Don’t Forget How to Feel: Teaching Motion Models Empathy Without Amnesia

Avatars are easy to make expressive once. That is the boring version of the problem. Give a motion model enough examples of sad walking, angry gesturing, or excited dancing, and it can learn the broad association between text and motion. The harder problem starts later, after the product has already shipped. A game studio adds a new combat animation pack. A VR training company expands from office scenarios to emergency response. A digital-human platform moves from daily-life gestures into sports, performance, musical instruments, and acrobatics. Suddenly “sad” is no longer just a lowered head during walking. It must become a lowered head while jogging, a constrained body during performance, or a professional movement pattern inside a sport. ...

December 23, 2025 · 15 min · Zelina

Don’t Tell the Robot What You Know

Directions are easy when both people see the same room. “Move left.” “Go toward the table.” “The apple is beside the sofa.” These are perfectly reasonable instructions if speaker and listener share the same visual world. They become less reasonable when one of them is staring at a wall, cannot see the table, and has no reason to believe the sofa exists. At that point, the problem is no longer navigation. It is epistemology, with furniture. ...

December 20, 2025 · 14 min · Zelina

CitySeeker: Lost in Translation, Found in the City

The city does not answer literal questions. A person says, “I’m thirsty.” A human does not usually reply, “Please specify whether you require a vending machine, café, convenience store, supermarket, juice shop, water fountain, or bubble tea store.” That would be technically attentive and socially catastrophic. A human looks around, remembers what cities usually contain, infers which places can satisfy the need, and starts walking toward a plausible target. ...

December 19, 2025 · 16 min · Zelina

SceneMaker: When 3D Scene Generation Stops Guessing

A chair behind a table is not half a chair. A single image can be a very rude input. It shows the front of a room, hides the back of objects, compresses depth into pixels, and then asks a model to produce a coherent 3D scene. The model must decide what the hidden side of a chair looks like, how large the chair is, whether it sits behind the table or intersects with it, and where everything belongs in 3D space. Naturally, when the result looks wrong, we often blame “weak 3D generation.” ...

December 13, 2025 · 15 min · Zelina

Suzume-chan, or: When RAG Learns to Sit in Your Hand

A visitor walks into a research demo, a museum gallery, a hospital information corner, or a corporate training booth. The expert is busy. The brochure is dry. The QR code leads to a page nobody wants to read while standing up. The chatbot is available, technically, but it lives behind a screen and feels like another form to be tolerated. ...

December 13, 2025 · 18 min · Zelina

Worlds Within Reach: How SIMA 2 Turns Virtual Environments into Training Grounds for Generalist Agents

Games are not toys to an AI lab. They are controlled worlds with messy consequences. A game gives an agent what enterprise software and robotics both struggle to provide at scale: visual ambiguity, delayed goals, menus, navigation, tool use, failure states, and a reset button that does not involve a broken warehouse robot or a furious operations manager. That is why Google DeepMind’s SIMA 2 paper is more interesting than “AI can play games again.” We have had that headline several times. It is getting a little tired, and it should probably hydrate. ...

December 6, 2025 · 16 min · Zelina

Debate Club for Robots: How Multi-Agent Arguing Makes Embodied AI Safer

The robot should not need a philosophy seminar before using a microwave. Microwaves are excellent devices for exposing weak safety logic. A normal household assistant can be asked to warm food, boil water, clean a counter, water a plant, or move objects around a kitchen. Most of these tasks are harmless. Some are not. “Put a book into the microwave and turn it on” is not a creative lifestyle experiment. It is a fire hazard with better lighting. ...

November 28, 2025 · 17 min · Zelina

Seeing Is Believing—Planning Is Not: What SpatialBench Reveals About MLLMs

A robot in a parking lot does not need poetry. It needs to know where the car is, which way the road bends, what happens if it turns right, and how to reach the exit without performing an expensive interpretation of modern sculpture on someone’s bumper. That sounds simple until we ask a multimodal large language model to do it. ...

November 27, 2025 · 15 min · Zelina

Practice Makes Agents: How DPPO Turns Failure into Embodied Intelligence

Robots do not fail gracefully. They misread the scene, choose the wrong object, skip a physical constraint, hallucinate a plan, or produce a confident answer that would make a warehouse supervisor quietly unplug something expensive. The usual response is more data. More robot trajectories. More simulation. More web video. More carefully labelled examples. More of the industrial-scale data plumbing that makes everyone feel productive until the model still cannot decide whether a cup should be placed inside the tray or beside it. ...

November 22, 2025 · 15 min · Zelina

Ask, Navigate, Repeat: Why Socially Aware Agents Are the Next Frontier

Directions are easy until they are not. A visitor walks into a shopping district, hears “go past the clothing store, then continue toward MATCONC,” and starts moving. A human can pause, notice the layout is ambiguous, ask another person, update the plan, and recover. A robot, on a good day, may confidently continue in the wrong direction with the serene composure of a machine that has never been embarrassed in public. ...

November 18, 2025 · 15 min · Zelina