Scalpels, Agents, and Orchestrators: When Surgery Meets Autonomous Workflows
A mechanism-first reading of how VISA uses LLM orchestration, memory, and specialised agents to make voice-controlled surgical data interfaces more reliable.
SpatialThinker shows that better reward design, not just more data or depth sensors, can make multimodal models reason more reliably about 3D space.
A practical reading of new learning-theory results showing why noisy data may still permit generation, but can quietly destroy broad coverage.
PhysWorld shows how generated task videos become useful for robots only after geometry, physics, and residual learning do the unfashionable work.
A mechanism-first look at why DiagramIR’s structured back-translation pipeline makes AI-generated math diagrams easier to verify, cheaper to govern, and harder to excuse.
A close reading of an AI data-visualization platform paper shows where automated analytics can compress workflows, and where the evidence still stops short of replacing analysts.
A mechanism-first analysis of how multi-agent Text-to-Cypher systems turn graph querying from brittle prompting into executable, database-grounded retrieval.
A mechanism-first reading of STV, a task-vector method that makes many-shot multimodal adaptation less about longer prompts and more about knowing which attention heads to touch.
JobSphere shows how multilingual RAG can make government employment portals more usable and cheaper to operate, while leaving the problem far from magically solved.
A mechanism-first reading of why refusal in language models may behave less like a switch and more like a structured manifold.