Embodied Agents

The Room Remembers, the Model Forgets

TL;DR for operators

A room-tour video is a deceptively simple test for a video model. The objects do not explode, the camera does not enter a car chase, and nobody asks the model to perform cinematic philosophy. The hard part is duller and therefore more operationally relevant: the model must remember where things were, how rooms connected, what changed, and which earlier view matters now. ...

RoboSafe: When Robots Need a Conscience (That Actually Runs)

A robot does not need evil intent to become dangerous. It only needs a bad next action. “Turn on the microwave” sounds ordinary until the microwave contains a fork. “Pick up the knife” may be harmless in a cooking task until the next move is to swing it around. “Turn on the stove” may be safe for one step and unsafe three steps later if the agent forgets to turn it off. Physical risk is annoyingly literal that way. It does not wait for a model to finish reflecting on its values. ...
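To make the "bad next action" idea concrete, here is a minimal sketch of a guard that vets each proposed action before it runs. It covers two of the cases above: a contextual hazard (metal in the microwave) and a temporal one (a stove left on for too many steps). Everything in it is an assumption made for illustration, not RoboSafe's actual design: the `WorldState` fields, the action strings, the `METAL_OBJECTS` set, and the `MAX_STOVE_STEPS` budget are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

# Hypothetical symbolic state; a real agent would derive this from perception.
@dataclass
class WorldState:
    microwave_contents: Set[str] = field(default_factory=set)
    stove_on_since: Optional[int] = None  # step at which the stove was turned on
    step: int = 0

METAL_OBJECTS = {"fork", "spoon", "knife"}  # assumed hazard list
MAX_STOVE_STEPS = 3                         # assumed budget before the stove must go off

def check_action(action: str, state: WorldState) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed next action."""
    # Contextual rule: the same action is safe or unsafe depending on state.
    if action == "turn_on microwave" and state.microwave_contents & METAL_OBJECTS:
        return False, "metal object inside microwave"
    # Temporal rule: once the stove has been on too long, only turning it off is allowed.
    if (state.stove_on_since is not None
            and action != "turn_off stove"
            and state.step - state.stove_on_since >= MAX_STOVE_STEPS):
        return False, "stove left on too long; turn it off first"
    return True, "ok"

def apply_action(action: str, state: WorldState) -> None:
    """Advance the toy state by one step after an allowed action."""
    if action == "turn_on stove":
        state.stove_on_since = state.step
    elif action == "turn_off stove":
        state.stove_on_since = None
    state.step += 1
```

The point of the temporal rule is that "turn on the stove" passes the check when it is issued; the violation only shows up a few steps later, which is why the guard has to track history rather than judge each action in isolation.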