When we talk about AI in science, the picture usually stops at the screen: algorithms simulating molecules, predicting reactions, or summarizing literature. But in LabOS, AI finally steps off the screen and into the lab. It doesn’t just compute hypotheses; it helps perform them.

The Missing Half of Scientific Intelligence

For decades, computation and experimentation have formed two halves of discovery — theory and touch, model and pipette. AI has supercharged the former, giving us AlphaFold and generative chemistry, but the physical laboratory has remained stubbornly analog. Robotic automation can execute predefined tasks, yet it lacks situational awareness — it can’t see contamination, notice a wrong reagent, or adapt when a human makes an unscripted move.

LabOS, developed by teams at Stanford and Princeton, fills that gap. It’s not another digital assistant for scientists; it’s a multimodal co-scientist that thinks, sees, and collaborates in real laboratories. The system links agentic AI reasoning with XR smart glasses, Vision-Language Models (VLMs) trained on lab videos, and self-evolving multi-agent systems capable of autonomous reasoning and tool creation.

Think of it as ChatGPT with eyes, memory, and gloves.

From Agentic AI to XR Co-Scientist

At the core of LabOS lies a self-evolving agent architecture. Like a research group condensed into silicon, it includes:

  • Manager: plans experiments. Breaks a research goal into structured tasks, reagents, and quality checks.
  • Developer: executes computational work. Writes and runs analysis code, e.g., gene expression comparison.
  • Critic: evaluates results. Checks intermediate outputs and refines hypotheses.
  • Tool Creator: expands the system’s capability. Autonomously integrates new APIs and analysis tools from literature.
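The division of labor above can be sketched as a minimal pipeline. This is an illustrative toy, not LabOS’s actual implementation: the class names mirror the agent roles, but the planning, execution, and review logic are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    goal: str
    plan: list[str] = field(default_factory=list)
    results: dict[str, str] = field(default_factory=dict)

class Manager:
    def plan(self, task: Task) -> Task:
        # Break the research goal into ordered sub-steps (placeholder logic).
        task.plan = [f"step {i}: {part}" for i, part in enumerate(task.goal.split(", "), 1)]
        return task

class Developer:
    def execute(self, task: Task) -> Task:
        # Run each planned step; here we only record a stub result.
        for step in task.plan:
            task.results[step] = "ok"
        return task

class Critic:
    def review(self, task: Task) -> bool:
        # Accept the run only if every planned step produced a result.
        return all(step in task.results for step in task.plan)

def run_pipeline(goal: str) -> bool:
    task = Manager().plan(Task(goal))
    task = Developer().execute(task)
    return Critic().review(task)

print(run_pipeline("compare gene expression, refine hypothesis"))  # True
```

In the real system each role would be backed by an LLM call rather than hand-written rules; the point is only that plan, execution, and critique are separate agents passing structured state.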

This “agentic lab team” doesn’t remain static — it learns from every problem. LabOS continually updates a Template Library of reasoning workflows and grows a Tool Ocean of reusable analytical components, making it progressively smarter with use. Benchmark tests show it outperforms frontier models on biomedical reasoning challenges like Humanity’s Last Exam (HLE) and LAB-Bench.
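The “Tool Ocean” idea, a capability set that grows at runtime, can be pictured as a simple registry of callables. This is a hypothetical sketch; the names `ToolOcean`, `register`, and `fold_change` are invented for illustration.

```python
class ToolOcean:
    """A minimal registry of reusable analysis tools (illustrative only)."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        # Newly created or literature-derived tools land here at runtime.
        self._tools[name] = fn

    def call(self, name, *args, **kwargs):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](*args, **kwargs)

ocean = ToolOcean()
ocean.register("fold_change", lambda treated, control: treated / control)
print(ocean.call("fold_change", 8.0, 2.0))  # 4.0
```

The design choice worth noting is that tools are data, not code baked into the agent, which is what lets the system expand its own capability without redeployment.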

But computation is only half of its identity. The other half — and the more radical one — is how it physically connects with the scientist through XR glasses.

Seeing Through the Scientist’s Eyes

The LabOS team collected over 200 egocentric lab videos — scientists wearing cameras while conducting real experiments — to create LabSuperVision (LSV), a benchmark for scientific visual reasoning. These videos taught the VLM how to map visual cues (e.g., pipetting, labeling, contamination) to procedural meaning.

When tested on LSV, even top commercial models like GPT‑4o, Gemini 2.5 Pro, and Cosmos‑1 struggled, scoring below 3 out of 5 in accuracy for identifying procedural errors. LabOS’s fine-tuned LabOS‑VLM‑235B, however, exceeded 90% accuracy — detecting missteps such as sterile breaches or incorrect incubation times in real experiments.

Through XR glasses, LabOS overlays this perception with guidance. As the researcher performs steps, the AI watches and offers corrections: “You missed the centrifuge cap,” or “Incubation time off by two minutes.” The AI also records the entire workflow automatically, generating a digital twin of the experiment for future training or replication.
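A rough intuition for this kind of procedural checking: compare the sequence of observed actions against an expected protocol and flag what is missing or out of order. The real system infers actions from egocentric video with a VLM; the step names below are invented for illustration.

```python
# Hypothetical protocol; the real one comes from the experiment plan.
PROTOCOL = ["sterilize_gloves", "pipette_sample", "cap_centrifuge", "start_centrifuge"]

def flag_errors(observed):
    """Return human-readable warnings for missing or out-of-order steps."""
    warnings = [f"missing step: {s}" for s in PROTOCOL if s not in observed]
    # Check that the steps we did observe appear in protocol order.
    order = [PROTOCOL.index(s) for s in observed if s in PROTOCOL]
    if order != sorted(order):
        warnings.append("steps performed out of order")
    return warnings

print(flag_errors(["sterilize_gloves", "pipette_sample", "start_centrifuge"]))
# ['missing step: cap_centrifuge']
```

Substituting a learned perception model for the hard-coded `observed` list is exactly where the fine-tuned VLM earns its keep.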

This fusion of egocentric perception and context-aware feedback turns LabOS into an intelligent collaborator — not a supervisor, but a co-pilot.

When AI Joins the Wet Lab

LabOS’s capabilities go beyond observation. In proof-of-concept studies, it has already contributed to biomedical discoveries:

  • Cancer Immunotherapy: LabOS autonomously identified CEACAM6 as a regulator of tumor resistance to natural killer (NK) cells — a target later validated in live experiments.
  • Mechanistic Discovery: It proposed ITSN1 as a regulator of cell–cell fusion, a result confirmed via CRISPRi assays.
  • Stem Cell Engineering: Using XR copiloting, LabOS guided researchers through complex gene-editing of induced pluripotent stem cells, flagging deviations and documenting each micro-action for reproducibility.

Each of these cases reflects a critical shift: AI is no longer advisory to science; it is participatory.

The Architecture of a Self-Evolving Laboratory

LabOS introduces a new concept — self-evolving scientific infrastructure. Every interaction becomes data. Every experiment becomes training. Over time, this feedback turns a lab into a living intelligence loop:

  1. AI proposes a hypothesis or plan.
  2. Human executes it with XR-assisted feedback.
  3. AI observes, records, and critiques performance.
  4. Both learn, updating the shared model and toolset.
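The four-step loop above can be expressed as a schematic function. The callables here are stand-ins: `execute` would be the human-with-XR step and `critique` the AI review, both invented names for this sketch.

```python
def discovery_loop(hypotheses, execute, critique, rounds=3):
    """Iterate the human-AI loop: propose, run, review, update (schematic)."""
    knowledge = []
    for h in hypotheses[:rounds]:
        observation = execute(h)            # human performs with XR-assisted feedback
        verdict = critique(h, observation)  # AI reviews the recorded run
        knowledge.append((h, verdict))      # both sides update the shared record
    return knowledge

log = discovery_loop(
    hypotheses=["A regulates B", "C drives fusion"],
    execute=lambda h: f"data for {h}",
    critique=lambda h, obs: "supported" if "regulates" in h else "inconclusive",
)
print(log)  # [('A regulates B', 'supported'), ('C drives fusion', 'inconclusive')]
```

The accumulating `knowledge` list is the sketch-level analogue of the shared model and toolset both parties update.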

The result is an ecosystem where intuition and computation reinforce each other — a “neural‑scientific network” rather than mere workflow automation.

Implications: From Reproducibility to Creativity

If reproducibility has long been science’s Achilles’ heel, LabOS offers a path to mend it. By recording not only what was done but how, it makes tacit human skills transferable. A novice can perform like an expert by following the AI’s contextual guidance, while the system simultaneously learns from both.

Yet the larger promise is creative acceleration. The same feedback loop that keeps errors out of the lab can also bring serendipity back in — encouraging rapid hypothesis iteration and human‑AI idea fusion. In the long view, this may redefine what “doing science” means.

As Le Cong and Mengdi Wang, LabOS’s lead authors, suggest: science moves fastest when thought meets action. LabOS ensures both now share the same neural substrate.


Cognaptus: Automate the Present, Incubate the Future