Opening — Why this matters now
For years, we have asked large language models to explain science. The paper behind SAGA asks a more uncomfortable question: what happens when we ask them to do science instead?
Scientific discovery has always been bottlenecked not by ideas, but by coordination — between hypothesis generation, experiment design, evaluation, and iteration. SAGA reframes this entire loop as an agentic system problem. Not a chatbot. Not a single model. A laboratory of cooperating AI agents.
Background — From tools to teams
Prior work has shown that LLMs can assist with isolated tasks such as molecule generation, materials screening, and process optimization. But these systems usually stop where things get messy: multi‑objective trade‑offs, domain‑specific constraints, and long feedback loops.
SAGA (Scalable Agent‑based Generative Architecture) departs from the “one‑model‑does‑all” fantasy. Instead, it borrows from how real research groups work: decomposing discovery into specialized roles, each optimized for a narrow responsibility.
Analysis — What SAGA actually does
At its core, SAGA is a multi‑agent orchestration framework. Each agent has:
- A defined role (generator, evaluator, planner, critic)
- A task‑specific objective function
- Access to domain simulators or scoring functions
These agents interact through structured feedback rather than free‑form chat. Crucially, SAGA enforces grounded evaluation — candidates are not judged by linguistic plausibility, but by experimentally relevant metrics.
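To make the pattern concrete, here is a minimal sketch of such a loop. Every name here (`Candidate`, `Generator`, `Evaluator`, `Critic`, `run_loop`) is an illustrative assumption, not SAGA's actual API; what matters is the shape: specialized roles, structured feedback, grounded scoring.

```python
# Minimal sketch of a SAGA-style orchestration loop.
# All class and function names are hypothetical; the paper's real
# interfaces may differ. The point is the structure: specialized
# roles exchanging structured feedback, scored by a simulator.
from dataclasses import dataclass, field

@dataclass
class Candidate:
    payload: str                               # e.g., a SMILES string or a DNA sequence
    scores: dict = field(default_factory=dict)

class Generator:
    def propose(self, feedback: list) -> list:
        # An LLM call would go here, conditioned on structured critiques.
        raise NotImplementedError

class Evaluator:
    def score(self, cand: Candidate) -> dict:
        # Grounded evaluation: a simulator or scoring function,
        # not an LLM's opinion of plausibility.
        raise NotImplementedError

class Critic:
    def critique(self, cand: Candidate) -> str:
        # Turns raw scores into actionable, structured feedback.
        raise NotImplementedError

def run_loop(gen: Generator, ev: Evaluator, critic: Critic, rounds: int = 10) -> Candidate:
    feedback: list = []
    best = None
    for _ in range(rounds):
        for cand in gen.propose(feedback):
            cand.scores = ev.score(cand)
            feedback.append(critic.critique(cand))
            if best is None or cand.scores["objective"] > best.scores["objective"]:
                best = cand
    return best
```

Note the design choice: the Evaluator never sees the conversation, only the candidate, so linguistic fluency cannot leak into the score.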
Domains covered
| Domain | Objective | Evaluation Signals |
|---|---|---|
| Antibiotic discovery | Novel active compounds | Potency, toxicity, diversity |
| Inorganic materials | Superhard materials | DFT energy, hardness, HHI supply risk |
| DNA sequence design | Cell‑specific enhancers | MPRA expression, specificity |
| Chemical process design | Feasible flowsheets | Purity, capex, recycle penalties |
This breadth is not cosmetic. It demonstrates that the agentic pattern — not the chemistry — is the real contribution.
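If the pattern rather than the chemistry is the contribution, then moving between rows of this table should amount to swapping one component. A sketch of that idea, assuming a simple evaluator registry (all names hypothetical; the simulator calls are stand-ins, not SAGA's real code):

```python
# Sketch: in a domain-agnostic loop, only the grounded evaluator changes.
# Signal names mirror the table above; docking, DFT, MPRA models, and
# flowsheet solvers would replace the placeholder returns in practice.
from typing import Callable

def antibiotic_eval(molecule: str) -> dict[str, float]:
    # Would call docking / toxicity predictors in practice.
    return {"potency": 0.0, "toxicity": 0.0, "diversity": 0.0}

def materials_eval(structure: str) -> dict[str, float]:
    # Would call a DFT surrogate and a supply-risk (HHI) lookup.
    return {"dft_energy": 0.0, "hardness": 0.0, "hhi_risk": 0.0}

DOMAIN_EVALUATORS: dict[str, Callable[[str], dict[str, float]]] = {
    "antibiotic_discovery": antibiotic_eval,
    "inorganic_materials": materials_eval,
    # "dna_enhancers" and "process_design" plug in the same way.
}
```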
Findings — What worked (and what didn’t)
Across domains, SAGA consistently outperformed single‑agent or purely generative baselines. Two results stand out:
- Search efficiency improved: agent feedback curbed mode collapse and premature convergence (a selection sketch follows this list).
- Constraint satisfaction increased, especially in process design, where naive generation fails quickly.
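One generic way feedback can curb mode collapse is to penalize candidates that resemble ones already kept. This is a sketch of that idea, not the paper's exact mechanism; `similarity` is a placeholder (e.g., Tanimoto similarity for molecules, edit distance for sequences):

```python
# Sketch of diversity-penalized selection, a generic anti-mode-collapse
# device. Reuses the Candidate class from the earlier sketch.
def select_diverse(candidates, k, similarity, alpha=0.5):
    """Greedily pick up to k candidates, trading raw score against redundancy."""
    selected = []
    pool = sorted(candidates, key=lambda c: c.scores["objective"], reverse=True)
    for cand in pool:
        if len(selected) >= k:
            break
        # Penalty grows with similarity to anything already selected.
        penalty = max((similarity(cand, s) for s in selected), default=0.0)
        if cand.scores["objective"] - alpha * penalty > 0:
            selected.append(cand)
    return selected
```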
However, the paper is refreshingly honest about limitations. In DNA design, for instance, some generated enhancers showed strong activity but poor specificity — a reminder that biological objectives remain stubbornly multi‑dimensional.
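That tension is a standard multi-objective problem: a candidate can sit on the Pareto front while still being useless in practice. A minimal, generic dominance filter (not from the paper) makes the point:

```python
# Generic Pareto filter for the activity-vs-specificity tension.
# Collapsing the two objectives into one scalar would hide exactly
# the failure mode the paper reports.
def pareto_front(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Keep points not dominated on (activity, specificity), both maximized."""
    front = []
    for p in points:
        if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points):
            front.append(p)
    return front

# A high-activity, low-specificity enhancer survives the filter
# even though it may be useless in the target cell type.
print(pareto_front([(0.9, 0.1), (0.5, 0.8), (0.4, 0.4)]))
# -> [(0.9, 0.1), (0.5, 0.8)]
```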
Implications — Why businesses should care
SAGA is not a lab curiosity. It is a template.
For enterprises exploring AI‑driven R&D, the message is clear:
- Stop asking models for answers
- Start assigning them roles
- Measure outputs with real‑world cost functions
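A "real-world cost function" can be as plain as a scoring routine whose terms are denominated in specs and dollars. Here is a sketch for the process-design row above; the weights and terms are invented for illustration, not taken from the paper:

```python
# Sketch: scoring a generated flowsheet with business-relevant terms
# rather than linguistic plausibility. All weights are made up and
# would need calibration in any real deployment.
def flowsheet_cost(purity: float, capex_usd: float, recycle_fraction: float) -> float:
    purity_shortfall = max(0.0, 0.99 - purity)  # hard spec: 99% purity
    return (
        1e6 * purity_shortfall      # missing the spec is heavily penalized
        + capex_usd                 # upfront capital cost
        + 5e4 * recycle_fraction    # recycle penalty, per the table above
    )
```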
This architecture generalizes beyond science — into product design, supply‑chain optimization, even financial strategy. Anywhere decisions require iterative reasoning under constraints, agentic systems beat monolithic models.
Conclusion — From prompts to processes
SAGA signals a quiet but decisive shift. The future of applied AI will not be about smarter prompts, but about structured autonomy.
LLMs are no longer interns taking notes. With the right scaffolding, they become junior researchers — tireless, opinionated, and surprisingly disciplined.
Cognaptus: Automate the Present, Incubate the Future.