When Models Guess the Verb by Looking at the Drawer

Opening: Why this matters now

If you have ever watched a video model confidently predict “opening drawer” when the person is clearly closing it, you have already encountered the core problem of modern compositional video understanding: the model isn’t really watching the action. It is guessing. As video models are increasingly deployed in robotics, industrial monitoring, and human–AI interaction, the ability to generalize correctly to unseen verb–object combinations is no longer academic. A robot that confuses opening with closing is not merely inaccurate; it is dangerous. ...

January 24, 2026 · 4 min · Zelina

The Diligent but Brittle Student Inside Every LLM

If you put a large language model in a classroom for a year, what kind of student would it become? According to Simulating Human-Like Learning Dynamics with LLM-Empowered Agents, the answer isn’t flattering: most base LLMs act like “diligent but brittle surface learners,” hardworking and seemingly capable, but unable to generalize deeply.

From Psych Lab to AI Lab

Educational psychology has spent decades classifying learners into profiles like deep learners (intrinsically motivated, reflective, conceptual) and surface learners (extrinsically motivated, test-oriented, shortcut-prone). The authors built LearnerAgent, a multi-agent framework grounded in these theories, and dropped four AI ‘students’ into a simulated high school English class: ...

August 8, 2025 · 3 min · Zelina