The Short of It

LLM agents are about to become your busiest “users”—but they don’t behave like dashboards or analysts. They speculate: issuing floods of heterogeneous probes, repeating near-identical work, and constantly asking for partial answers to decide the next move. Traditional databases—built for precise, one‑off queries—will buckle. We need agent‑first data systems that treat speculation as a first‑class workload.

This piece unpacks a timely research agenda and turns it into an actionable playbook for CTOs, data platform leads, and AI product teams.

What the Paper Argues (and Why It Matters)

The core thesis: LLM agents will be the dominant data workload, and their behavior—“agentic speculation”—has four tell‑tale properties you can exploit:

Scale: agents can issue 100s–1000s of requests per second per task.
Heterogeneity: early steps skim schemas/stats; later steps craft partial or complete solutions; validation shows up everywhere.
Redundancy: many plans and sub-plans overlap—ripe for sharing and caching.
Steerability: if the system talks back (hints, costs, join suggestions), agents get to the answer with fewer steps.

Implication for builders: stop optimizing for single perfect answers. Start optimizing for fast, sufficiently‑good signals that steer the swarm.

A Mental Model: From Queries to Probes

Agents don’t just send SQL; they send probes: bundles that may include SQL plus a brief (phase, goals, accuracy needs, budgets, success criteria). The system responds with results and grounding feedback (e.g., related tables, why-not provenance, cost estimates, or “batch these next 3 steps”).

Treat your database like a collaborator that can nudge agents away from dead ends, not a mute oracle.

The Architecture We Should Build

Below is the paper’s architectural thrust, translated for a practical roadmap.

1) Probe Interface (beyond SQL)

Brief-aware execution: understand phase (exploration vs. solution), target accuracy, and early‑stop criteria.
Semantic operators: native ops to find tables/columns/rows by meaning (not just LIKE).
Side-channel replies: surface hints (alternative tables, join keys), costs, and reuse opportunities.

2) Probe Optimizer (satisficing over maximizing)

Intra-probe: prune irrelevant columns/queries; share sub-plans; return coarse first, precise later.
Inter-probe: cache & materialize across turns and agents; pick one query to run exactly now to reduce five follow‑ups.
Decision policy: optimize for time-to-next-decision, not time-to-full-answer.

3) Agentic Memory Store (the semantic “pseudo-index”)

Persist learned grounding: encodings, missing value patterns, time/location granularities, prior probe results, “what worked before.”
Queryable by agents and sleeper agents (in-database helpers that prefetch hints, detect reuse, and warn about stale assumptions).

4) Branched Transactions (“MVCC on steroids”)

Support cheap forks and ultra‑fast rollbacks for “what‑if” updates across thousands of near-identical worlds.
Provide multi‑world isolation: logical separation with physical sharing.

What Changes for Your Data Platform (Concrete Moves)

Layer	Old World (BI/Dashboards)	Agent‑First World	First Steps You Can Pilot
Interface	SQL-only; fixed SLAs	Probes with briefs; semantic search; side‑channel hints	Add a thin probe API that accepts (goals, phase, accuracy, budget); log them for policy learning
Optimization	Throughput & exactness	Satisficing, coarse‑to‑fine; multi‑query sharing	Turn on approximate scan + early-stop; add sub-plan cache keyed by embeddings of projected/joined fields
Execution	One-shot queries	Incremental & interruptible execution	Implement chunked scans with “pause/continue,” return partial stats early
Storage	Fixed indexes/layouts	Agentic memory store; dynamic layouts	Start with a metadata KV + vector index for per-column notes, encodings, sample value sketches
Transactions	Linear ACID	Branching with fast fork/rollback	Prototype copy-on-write branching tablespaces for sandboxed what‑ifs
Governance	Static ACLs	Cross‑agent reuse vs. privacy trade‑offs	Add reuse policies: “share sub-plan X if redacted,” log provenance for audit

Patterns You Can Use This Quarter

Coarse‑First Scouting
- For exploration-phase probes, return: (row count, distinct values, null rates, 1% sampled histograms) per touched column—not full rows.
- Expose a STOP_IF(similarity > τ) hook so the system can auto‑abort near-duplicate scans.
Sub‑Plan Dedup Cache
- Hash or embed sub-plans (e.g., Scan[table, projected_cols], HashJoin[left,right,on]) and memoize results.
- Evict by information gain (Δ in downstream decisions), not just LRU.
Hinting Sleeper Agent
- On every probe, asynchronously compute: (i) related join candidates, (ii) likely missing filters, (iii) why‑not explanations for empty results, (iv) cost warnings with cheaper alternates.
- Feed hints back on the side channel; let the calling agent update its plan.
One‑Exact‑Now Policy
- In multi‑turn tasks, choose one high‑impact query to compute exactly if it likely collapses the tree of next questions (e.g., the canonical join).
- Everything else stays approximate until the agent commits.
Sandboxed What‑If Branches
- Introduce branch creation as a first‑class DDL (e.g., CREATE BRANCH ... FROM snapshot) with quotas and TTLs; optimize for rollback speed.

KPIs That Actually Reflect Agent Workloads

Time‑to‑Next‑Decision (TTND): median seconds from probe receipt to agent issuing its next, better‑informed probe.
Redundancy Hit‑Rate: % of sub‑plans served from the dedup cache.
Hint Utility: share of probes whose next turn changes because of system‑generated hints.
Branch Churn: forks/rollbacks per task; rollback latency at P95.
Approx‑Then‑Exact Lift: reduction in total compute vs. exact‑only baselines for equal task success.

Where This Collides with Reality (and How to Navigate)

Privacy vs. Reuse: Agentic memory and cross‑agent sharing can leak. Apply scope‑aware caches (per‑tenant, redacted global) and strong provenance.
Semantic Drift: Agent-written metadata goes stale on schema changes. Add staleness detectors and automatic deprecation of conflicting notes.
Cost Explosions: Swarms can DDoS your warehouse. Enforce phase‑aware admission control and budgeted probes per task.

Strategic Take: Data Systems Become Co‑Pilots

The step-change isn’t AQP or MPP—those are ingredients. The shift is dialogue: databases that negotiate partial results, costs, and next steps with agent swarms. Teams that implement this co‑pilot posture will ship agentic apps that feel snappy, cheap, and reliable while competitors burn cycles chasing perfect answers.

Leadership ask for this week: pilot a probe API, add a sub‑plan dedup cache, and instrument TTND. You’ll feel the throughput relief immediately.

Cognaptus: Automate the Present, Incubate the Future

The Short of It#

What the Paper Argues (and Why It Matters)#

A Mental Model: From Queries to Probes#

The Architecture We Should Build#

1) Probe Interface (beyond SQL)#

2) Probe Optimizer (satisficing over maximizing)#

3) Agentic Memory Store (the semantic “pseudo-index”)#

4) Branched Transactions (“MVCC on steroids”)#

What Changes for Your Data Platform (Concrete Moves)#

Patterns You Can Use This Quarter#

KPIs That Actually Reflect Agent Workloads#

Where This Collides with Reality (and How to Navigate)#

Strategic Take: Data Systems Become Co‑Pilots#