Sandboxes & Ladders: How to Build a Steerable Agent Economy

Budgets are where autonomy becomes real.

A chatbot can be annoying. An agent with a procurement account, API access, calendar authority, cloud credits, and a habit of negotiating with other agents is something else entirely. At that point, we are no longer discussing “workflow automation” in the tidy enterprise sense. We are discussing economic actors: software systems that request resources, trade off priorities, outsource tasks, pay for services, and generate consequences faster than the compliance department can ask for a meeting.

The paper “Virtual Agent Economies” by Nenad Tomašev and colleagues at Google DeepMind argues that this is not a distant philosophical curiosity. As autonomous agents become more capable and interoperable, their interactions may form a new economic layer: linked digital markets where agents transact, coordinate, and allocate scarce resources.¹ The authors call this a sandbox economy, but the term is slightly deceptive. A sandbox, in the popular imagination, is sealed. This paper’s central point is that the useful sandboxes will not be sealed. They will have membranes.

That is the mechanism worth understanding. Not “AI agents will create markets,” which is plausible but vague. Not “blockchain might help,” which is usually where serious thought goes to get mugged in a parking lot. The useful argument is more disciplined:

agent economies are likely to emerge by default;
the dangerous default is accidental and permeable;
market mechanisms can allocate scarce agent resources;
identity, credentials, audit trails, and oversight make those markets governable;
pilot sandboxes should test bounded agent economies before they are wired into production systems.

This is a conceptual and policy-design paper, not an empirical benchmark. There are no ablation tables, no agent leaderboard, no “our method beats baseline by 14.3%” victory lap. The evidence is architectural: a synthesis of market design, multi-agent systems, digital identity, community currencies, governance, and AI safety. That makes the paper less immediately testable, but also more strategically useful. It asks what infrastructure must exist before agent autonomy becomes economically safe.

A sandbox is not a box; it is a membrane

The paper’s most useful move is a two-axis taxonomy. Agent economies can be intentional or emergent, and they can be impermeable or permeable.

Origin / boundary	Impermeable	Permeable
Intentional	Controlled testbeds for agent behaviour, with limited external consequences	Designed agent markets connected to real resources through governed interfaces
Emergent	Isolated enclaves that may have limited utility	The default risk: agents simply enter human markets at machine speed

The important quadrant is not the neat laboratory sandbox. It is the ugly default: emergent and permeable. That is what happens if firms deploy agents independently, connect them to tools, allow them to buy services, let them negotiate with vendor agents, and later discover that a market has formed. Congratulations, your architecture is now macroeconomics with a YAML file.

Permeability is the degree to which events inside the agent economy affect the human economy, and vice versa. Money conversion, compute access, API quotas, physical execution, legal commitments, data sharing, and customer-facing actions are all permeability channels. The paper treats permeability as the design lever. A fully impermeable sandbox may be safe, but not very useful. A fully permeable sandbox may be useful, but systemically risky. The practical question is how to tune the boundary.

For business leaders, that reframes agent deployment. The question is not merely “Can this agent complete the task?” It is “What can cross the membrane when the agent acts?” Can it spend money? Bind the company contractually? Access restricted data? Trigger physical logistics? Reallocate compute from another team? Negotiate with third-party systems? Each “yes” is a permeability decision.

This matters because permeability is not controlled by one actor. A single company can harden its own agents, but the market emerges from many agents, protocols, vendors, platforms, and users interacting. The agent economy is a collective property. That makes governance harder, but it also explains why waiting for best practices to emerge organically is a charmingly expensive way to learn.
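What a single firm can control is its own side of the membrane. The permeability questions above can be written down as an explicit policy table rather than left implicit in tool permissions. The sketch below is only an illustration: the channel names, the `ChannelRule` fields, and every threshold are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelRule:
    allowed: bool             # may agents use this channel at all?
    auto_limit: float = 0.0   # agents may act alone up to this amount
    hard_limit: float = 0.0   # nothing above this crosses the membrane


# Hypothetical policy: one rule per permeability channel.
POLICY = {
    "money": ChannelRule(True, auto_limit=50.0, hard_limit=500.0),
    "compute": ChannelRule(True, auto_limit=20.0, hard_limit=20.0),
    "legal_commitment": ChannelRule(False),
}


def decide(policy, channel, amount, emergency_stop=False):
    """Return 'allow', 'needs_approval', or 'deny' for one agent action."""
    rule = policy.get(channel)
    # Unknown channels are denied by default: the membrane is closed unless opened.
    if emergency_stop or rule is None or not rule.allowed:
        return "deny"
    if amount > rule.hard_limit:
        return "deny"
    if amount <= rule.auto_limit:
        return "allow"
    return "needs_approval"


print(decide(POLICY, "money", 30.0))    # small spend, agent acts alone
print(decide(POLICY, "money", 200.0))   # routed to a human approver
```

The design choice worth noting is the default: a channel that is not listed is closed, so widening the membrane is always a deliberate edit to the policy.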

Machine-speed bargaining is where workflow automation becomes market design

The paper’s mechanism-first argument starts with the limits of central orchestration.

Inside one company, an agent platform can still look like a workflow engine. Tasks are routed, permissions are checked, logs are stored, costs are assigned, and failures are escalated. Add enough external agents, however, and the problem changes. Agents controlled by different users and organisations will have different goals, private information, unequal budgets, different tool access, and uneven capability. No single coordinator sees the full state.

At that point, markets become attractive because markets are coordination machines. They do not require every participant to reveal everything. They use prices, bids, reputations, and contracts to allocate scarce resources under partial information. The paper extends that logic to agent systems.

The scarce resources are not just money. They include:

Scarce resource	Agent-market version	Why allocation becomes hard
Compute	Inference budget, priority execution, GPU time	Demand spikes and task urgency vary continuously
Data	Proprietary datasets, private context, sensor streams	Access has value, privacy risk, and compliance constraints
Tools	API calls, specialist models, robotic capabilities	Some tools are expensive, regulated, or safety-critical
Human attention	Approval, review, expert intervention	Humans remain scarce exactly where decisions are high-stakes
Physical execution	Delivery, inspection, maintenance, robotics	Real-world actions create liability and externalities

This is where the paper’s analogy to high-frequency trading becomes useful. Current algorithmic markets already show how automated systems can interact at speeds beyond human intervention. A future agent economy may create a parallel phenomenon: high-frequency negotiation. Agents could continuously bargain over reservations, delivery windows, compute slots, advertising inventory, data access, or procurement terms. The risk is not merely “one bad agent.” The risk is feedback.

An agent raises a bid. Another agent reprices. A marketplace adjusts availability. A vendor agent reallocates capacity. A user agent interprets the change as scarcity and accelerates purchasing. Multiply that across thousands or millions of interactions, then connect the result to fiat money, cloud infrastructure, customer operations, or supply chains. That is how a sandbox stops being a sandbox and becomes a plumbing leak in the real economy.

The paper’s answer is not to ban markets. It is to design the membrane: rate limits, conversion windows, circuit breakers, credential checks, audit trails, and human escalation only where human judgment can still matter.
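Of the membrane controls listed above, a rate limit is the simplest to make concrete. The sketch below uses a token bucket, a standard rate-limiting technique that the paper does not itself prescribe. The class name, the parameters, and the explicit `now` timestamps (passed in so the example is deterministic) are all assumptions for illustration.

```python
class TokenBucket:
    """Caps how fast one agent may act: bursts up to `capacity`,
    then a sustained rate of `refill_per_sec` actions per second."""

    def __init__(self, capacity: float, refill_per_sec: float, start: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity      # the bucket starts full
        self.last_ts = start

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_ts = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the caller should queue, drop, or escalate the action


bucket = TokenBucket(capacity=3, refill_per_sec=1.0)
burst = [bucket.allow(now=0.0) for _ in range(4)]  # four bids in the same instant
later = bucket.allow(now=2.0)                      # one more bid two seconds later
print(burst, later)
```

A limit like this does not make any single bid safer. It slows the feedback loop enough that the other controls have time to act.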

Auctions encode preferences, but they do not eliminate power

One of the paper’s clearest design proposals is the use of auctions for resource allocation. The intuition is straightforward: if agents act on behalf of users, and users have competing preferences over scarce resources, auctions can convert those preferences into bids.

The paper draws on Ronald Dworkin’s auction-based approach to distributive justice. In simplified form, each user receives an equal initial endowment of virtual currency. Their agent uses that endowment to bid for resources according to the user’s preferences. The outcome counts as fair if it passes an “envy test”: after allocation, no participant should prefer someone else’s bundle of goods plus leftover currency to their own.
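The envy test can be checked mechanically once preferences are written down. The sketch below makes a strong simplifying assumption that is not in the paper: each participant values a bundle as the sum of per-item values, measured in the same credits as the currency. The participant names, item names, and numbers are invented for illustration.

```python
def bundle_value(valuation, bundle, leftover):
    """Value one participant places on a bundle plus leftover credits."""
    return sum(valuation.get(item, 0.0) for item in bundle) + leftover


def envy_pairs(valuations, bundles, leftovers):
    """Return (a, b) pairs where a strictly prefers b's bundle and
    leftover credits to their own. An empty list passes the envy test."""
    pairs = []
    for a in bundles:
        own = bundle_value(valuations[a], bundles[a], leftovers[a])
        for b in bundles:
            if a == b:
                continue
            other = bundle_value(valuations[a], bundles[b], leftovers[b])
            if other > own:
                pairs.append((a, b))
    return pairs


valuations = {
    "ana": {"gpu_hour": 6.0, "api_slot": 2.0},
    "ben": {"gpu_hour": 3.0, "api_slot": 5.0},
}
leftovers = {"ana": 4.0, "ben": 6.0}

good = {"ana": ["gpu_hour"], "ben": ["api_slot"]}
bad = {"ana": ["api_slot"], "ben": ["gpu_hour"]}

print(envy_pairs(valuations, good, leftovers))  # each holds what they value most
print(envy_pairs(valuations, bad, leftovers))   # ana would rather have ben's bundle
```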

In an agent economy, the auctioned goods might be compute, data access, high-priority service slots, specialist agent labour, API capacity, or scarce physical resources. The point is not to auction the agents themselves. That would mostly reward whoever already has better models and deeper pockets. The point is to auction access to shared resources under a controlled rule set.

This is elegant, but not magic. The paper is careful about the boundary. Equal endowments do not guarantee equal outcomes if agents differ in strategic ability. A more capable agent may bid better, infer scarcity faster, exploit patterns, collude, or use resources more efficiently after winning them. In other words, market fairness at the allocation layer can be undermined by capability inequality at the strategy layer. A fair auction run by unfairly matched negotiators is still a knife fight with stationery.

For enterprises, the useful lesson is narrower and more operational: auctions are a governance primitive. They are especially useful where a platform needs to allocate scarce internal resources among many semi-autonomous agents without pretending a central planner can manually rank every request.

A practical pilot might look like this:

Pilot element	Design choice	What it tests
Resource	API priority minutes or specialist-model calls	Whether agents can express task urgency through bids
Endowment	Equal daily credit budgets per user, team, or policy class	Whether the allocation reduces queue gaming
Auction cadence	Batch allocation for high-value resources; continuous allocation for low-risk ones	Whether frequency changes fairness and volatility
Approval rule	Autonomous spending below a threshold; human approval above it	Whether humans can supervise exceptions rather than routine trades
Audit metric	Distribution of wins, unused credits, failed tasks, complaints, and post-allocation envy proxies	Whether the market is actually fair enough to trust

This is the paper’s business relevance at its most concrete. Do not begin with a grand agent economy. Begin with one scarce internal resource and one allocation rule. Observe whether agents behave sensibly. Then decide whether to widen the membrane.
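A minimal version of such a pilot fits in a few lines. The sketch below assumes a sealed-bid, pay-as-bid batch auction for identical priority slots, with equal starting credits. The paper does not specify an auction format, so the pricing rule, the tie-break by agent name, and all identifiers here are assumptions.

```python
def run_batch_auction(balances, bids, slots):
    """Allocate `slots` identical units to the highest valid sealed bids.

    balances: agent -> remaining credits (updated in place)
    bids:     agent -> credits offered for one unit
    Returns a list of (agent, price_paid) for the winners.
    """
    # A bid is valid only if it is positive and the agent can afford it.
    valid = {a: b for a, b in bids.items() if 0 < b <= balances.get(a, 0)}
    # Highest bid first; ties broken by agent name so results are reproducible.
    ranked = sorted(valid.items(), key=lambda kv: (-kv[1], kv[0]))
    winners = ranked[:slots]
    for agent, price in winners:
        balances[agent] -= price  # pay-as-bid: each winner pays their own bid
    return winners


balances = {"a": 10, "b": 10, "c": 10, "d": 10}   # equal daily endowment
bids = {"a": 7, "b": 4, "c": 12, "d": 4}          # c bids more than it holds
winners = run_batch_auction(balances, bids, slots=2)
print(winners, balances)
```

Logging `winners` and the remaining `balances` after every round gives the raw material for the audit metrics in the table: distribution of wins and unused credits.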

Mission economies price the destination without scripting the route

The paper then moves from allocation to direction. Markets are good at routing resources, but ordinary markets do not automatically care about public goods, sustainability, safety, resilience, or fairness. Those objectives must be encoded into the market design.

This is where the authors introduce mission economies: agent markets structured around collective goals. The concept draws from mission-oriented economic thinking, but the agentic version is more programmable. If agents transact through shared infrastructure, the rules can reward outcomes that matter to a community, company, city, or regulator.

The key is to reward the destination, not dictate the route.

A mission economy for energy efficiency might reward agents for completing workloads during low-carbon grid periods, reducing redundant computation, or shifting non-urgent tasks geographically. A mission economy for scientific research might reward validated replication, useful intermediate results, or high-quality data curation. A mission economy for logistics might reward emissions reduction, delivery reliability, and safety compliance together.

This matters because agent economies will not simply allocate effort; they will shape what kinds of effort become profitable. If every agent is rewarded only for speed, we should not act surprised when the system learns to burn compute like a hedge fund intern with daddy’s GPU account. If the market rewards verified outcomes, resource efficiency, and trustworthy cooperation, agents will optimise in a different direction.
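The difference between rewarding speed and rewarding verified, efficient outcomes is easiest to see as a payout function. The sketch below is a toy: the outcome fields, the weights, and the rule that unverified work earns nothing are all hypothetical choices, not mechanisms taken from the paper.

```python
# Hypothetical mission weights: pay for verified results, charge for energy,
# and add a bonus for work done during low-carbon grid periods.
MISSION_WEIGHTS = {"verified": 10.0, "kwh_used": -0.5, "low_carbon_fraction": 4.0}


def mission_reward(outcome, weights=MISSION_WEIGHTS):
    """Score a finished task on its measured outcome, not on how it was done."""
    if not outcome.get("verified", False):
        return 0.0  # unverified work earns nothing, however fast it was
    score = weights["verified"]
    score += weights["kwh_used"] * outcome["kwh_used"]
    score += weights["low_carbon_fraction"] * outcome["low_carbon_fraction"]
    return max(0.0, score)  # rewards are floored at zero in this sketch


careful = {"verified": True, "kwh_used": 4.0, "low_carbon_fraction": 0.5}
wasteful = {"verified": True, "kwh_used": 30.0, "low_carbon_fraction": 0.0}
print(mission_reward(careful), mission_reward(wasteful))
```

Nothing in the function inspects which tools or steps the agent used. That is the sense in which the market prices the destination and leaves the route open.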

The hard part is credit assignment. Complex agent work is distributed. A user-facing agent may deliver the final answer, but it may depend on a search agent, a verification agent, a data-cleaning agent, a specialist model, and a human reviewer. The paper argues that useful agent markets need granular mechanisms for tracing value back through these chains. Otherwise, the visible agent captures the reward while the supporting agents become unpaid plumbing.

For business, that maps directly onto internal agent platforms. If a company deploys multi-agent systems, it should not only log the final output. It should track which agents contributed, what resources they used, whether their intermediate outputs were actually integrated, and whether the final task succeeded. That is not only observability. It is the basis for internal pricing, vendor compensation, model selection, and governance.
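One simple way to turn such a contribution log into payouts is a proportional split among the agents whose outputs were actually integrated. This is an assumed rule for illustration only. The paper calls for granular credit assignment but does not prescribe a formula, and the record fields and weights below are hypothetical.

```python
def split_reward(reward, contributions):
    """Split `reward` among integrated contributors in proportion to weight.

    contributions: list of dicts with keys 'agent', 'weight', 'integrated'.
    Returns agent -> payout. Agents whose output was not used get nothing.
    """
    used = [c for c in contributions if c["integrated"] and c["weight"] > 0]
    total = sum(c["weight"] for c in used)
    if total == 0:
        return {}
    payouts = {}
    for c in used:
        share = reward * c["weight"] / total
        payouts[c["agent"]] = payouts.get(c["agent"], 0.0) + share
    return payouts


log = [
    {"agent": "search", "weight": 1.0, "integrated": True},
    {"agent": "verifier", "weight": 2.0, "integrated": True},
    {"agent": "cleaner", "weight": 1.0, "integrated": False},  # output discarded
    {"agent": "frontend", "weight": 1.0, "integrated": True},
]
print(split_reward(100.0, log))
```

The hard question is where the weights come from. A sketch like this only makes the dependency explicit: without a log of integrated contributions, the final agent collects everything by default.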

Identity is the rail gauge of the agent economy

Interoperability protocols let agents talk. They do not tell us whether the agent should be trusted.

The paper therefore devotes substantial attention to identity and reputation infrastructure. Its proposed stack includes decentralised identifiers, verifiable credentials, proof-of-personhood mechanisms, audit trails, and potentially blockchain-based ledgers. This is the part of the discussion most likely to be misread as a crypto pitch. It is better understood as an accountability pitch.

The core question is simple: when an agent requests a transaction, what do we know about it?

Who controls it? What is it authorised to do? Has it completed similar tasks before? Does it hold a safety credential? Can it prove access to funds without revealing its full budget? Can it sign commitments? Can a credential be revoked? Can a regulator or platform reconstruct what happened after a failure?

A useful agent economy needs machine-readable answers. Human-readable trust badges will not survive machine-speed markets.

Infrastructure layer	Function	Business interpretation
Decentralised identifiers	Persistent or temporary machine-verifiable identity	Lets platforms know which agent is acting and under what key
Verifiable credentials	Signed attestations about capability, compliance, or history	Turns reputation into a portable asset, not platform gossip
Proof of personhood	Links some allocations to unique humans where needed	Prevents Sybil attacks in human-benefit schemes
Zero-knowledge proofs	Proves eligibility or sufficiency without full disclosure	Supports privacy-preserving negotiation
Audit trails	Records actions, bids, credentials, approvals, and failures	Makes liability and dispute resolution possible

The paper’s distinction between disposable and durable identities is especially practical. A temporary agent created to summarise a public document does not need the same identity infrastructure as an enterprise purchasing agent with authority to spend money. The former might use lightweight ephemeral identity. The latter needs durable identity, revocable credentials, policy-bound permissions, and an audit trail that lawyers can understand without first becoming cryptographers. A civilisation can dream.

This also changes how enterprises should think about agent vendors. The buying question is not only “How capable is your model?” It is “What identity, credential, billing, logging, revocation, and dispute mechanisms come with it?” Capability without accountability is just speed with plausible deniability.
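The disposable-versus-durable distinction can be expressed as an authorisation check. The sketch below uses plain data structures, not an implementation of any decentralised-identifier or verifiable-credential standard. The class names, the scope strings, and the rule that ephemeral agents may only read public material are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    scope: str            # action this credential authorises, e.g. "spend"
    limit: float          # maximum amount per action
    revoked: bool = False


@dataclass
class AgentIdentity:
    agent_id: str
    controller: str       # the human or organisation accountable for the agent
    durable: bool         # False for throwaway agents
    credentials: list = field(default_factory=list)


def authorise(agent, action, amount=0.0):
    """Ephemeral agents may only read public material. Anything else needs a
    durable identity and a matching, unrevoked credential within its limit."""
    if action == "read_public":
        return True
    if not agent.durable:
        return False
    return any(
        c.scope == action and not c.revoked and amount <= c.limit
        for c in agent.credentials
    )


summariser = AgentIdentity("tmp-001", "alice", durable=False)
buyer = AgentIdentity("buyer-7", "procurement", durable=True,
                      credentials=[Credential("spend", limit=500.0)])

print(authorise(summariser, "read_public"), authorise(summariser, "spend", 10.0))
print(authorise(buyer, "spend", 200.0), authorise(buyer, "spend", 800.0))
```

Revocation is the part to test first: setting `revoked = True` on the buyer's credential should make every later spend fail without touching the agent itself.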

Oversight has to be automated before it can be human

A recurring flaw in AI governance discussions is the phrase “human in the loop,” used as if a human reviewer can be inserted into any process like a compliance paperclip. In an agent economy, many harmful dynamics may emerge faster than humans can inspect individual actions. The paper recognises this and proposes hybrid, multi-tiered oversight.

The first layer is automated monitoring: AI overseers and rule systems that watch market activity in real time. They flag anomalies, detect manipulation, enforce basic limits, and pause dangerous actions. The second layer is automated adjudication: temporary holds, quarantine of suspect transactions, evidence gathering, and preliminary classification. The third layer is human expert review for complex, novel, or high-stakes cases.

This is not a replacement for human accountability. It is a way to preserve it. Humans should decide the rules, thresholds, liability models, appeal processes, and mission objectives. But humans should not be expected to manually inspect every micro-transaction among agents. That would be governance theatre: reassuring, slow, and mostly decorative.
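The three layers can be read as a triage function: automated checks raise flags, flagged transactions are held, and only the high-stakes ones reach a person. The sketch below is a deliberately crude illustration. The transaction fields, the flag names, and both thresholds are hypothetical, and the rules stand in for what would be far richer monitors.

```python
def triage(tx, spend_limit=100.0, human_threshold=1000.0):
    """Route one transaction to 'allow', 'quarantine', or 'human_review'.

    Layer 1: automated monitoring raises flags.
    Layer 2: flagged transactions are held automatically.
    Layer 3: only high-stakes flagged cases are escalated to a human.
    """
    flags = []
    if tx["amount"] > spend_limit:
        flags.append("over_limit")
    if not tx.get("credential_ok", False):
        flags.append("credential_failure")
    if tx.get("retries", 0) > 5:
        flags.append("cascading_retries")

    if not flags:
        return ("allow", flags)
    if tx["amount"] >= human_threshold:
        return ("human_review", flags)
    return ("quarantine", flags)


print(triage({"amount": 20.0, "credential_ok": True}))
print(triage({"amount": 250.0, "credential_ok": True}))
print(triage({"amount": 5000.0, "credential_ok": False}))
```

Humans appear twice in this picture: once in the review queue, and once, more importantly, as the people who set `spend_limit` and `human_threshold`.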

The analogy to circuit breakers is useful. Financial markets do not rely on a human noticing a crash one trade at a time. They use automated mechanisms to pause activity under defined conditions. Agent markets will need similar controls, but the triggers may include more than price volatility. They may include credential failures, correlated tool misuse, abnormal bid patterns, unexpected data exfiltration attempts, cascading retries, sudden concentration of resource wins, or repeated interactions with known adversarial surfaces.
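A circuit breaker for an agent market can count anomalies of any of these kinds in place of price moves. The sketch below halts activity when too many anomalies land inside a sliding time window. The class, its parameters, and the use of explicit timestamps are assumptions for illustration. Real triggers would be specific to each signal.

```python
from collections import deque


class CircuitBreaker:
    """Halts a market for `cooldown_sec` once `max_events` anomalies
    have been recorded within the last `window_sec` seconds."""

    def __init__(self, max_events, window_sec, cooldown_sec):
        self.max_events = max_events
        self.window_sec = window_sec
        self.cooldown_sec = cooldown_sec
        self.events = deque()
        self.halted_until = float("-inf")

    def record_anomaly(self, now):
        self.events.append(now)
        # Drop anomalies that have aged out of the window.
        while self.events and self.events[0] <= now - self.window_sec:
            self.events.popleft()
        if len(self.events) >= self.max_events:
            self.halted_until = now + self.cooldown_sec
            self.events.clear()

    def trading_allowed(self, now):
        return now >= self.halted_until


breaker = CircuitBreaker(max_events=3, window_sec=10.0, cooldown_sec=60.0)
for t in (0.0, 4.0, 8.0):          # three anomalies inside ten seconds
    breaker.record_anomaly(t)
print(breaker.trading_allowed(9.0), breaker.trading_allowed(68.0))
```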

The paper also introduces “agent traps”: websites, inputs, or digital elements designed to subvert agents through prompt injection, jailbreaking, or adversarial manipulation. This is an important practical boundary. The more agents can transact, the more every web page, API response, and marketplace listing becomes a possible attack surface. In a permeable economy, an agent trap is not just a bad prompt. It can become an unauthorised expenditure, data leak, or contractual mistake.

So the governance stack is not optional decoration. It is the price of letting agents touch real assets.

Community currencies make sense only when they control scope

The paper’s discussion of community currencies is more subtle than “invent a token.” The point is modularity.

Large, general-purpose markets tend to transmit shocks widely. Local or mission-specific currencies can constrain scope, align incentives, and reduce contagion. A community currency for a city, university, research network, supply-chain consortium, or energy grid can encode local objectives more precisely than a global agent currency.

For example, a campus energy agent market might issue credits tied to compute scheduling, energy efficiency, and demand response. Agents operating inside that market could trade credits to prioritise workloads, but conversion into broader financial value would be capped or delayed. That makes the market useful without making it dangerously liquid.

This is where “sandbox” earns its name. The sandbox is not the currency. The sandbox is the set of rules governing where the currency works, what it can buy, how it converts, who can hold it, how fast it circulates, and what happens when something breaks.
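The capped conversion in the campus example is the kind of rule that is easy to state and easy to get wrong. The sketch below shows one assumed form of it: a per-account daily cap on how many credits may leave the sandbox. The account fields, the cap, and the exchange rate are hypothetical, and the daily reset is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class CreditAccount:
    balance: float                # community credits held inside the sandbox
    converted_today: float = 0.0  # credits already converted in this period


def convert(account, amount, daily_cap, rate):
    """Convert credits to external value, never exceeding the balance or
    the remaining daily cap. Returns the external value paid out."""
    remaining_cap = daily_cap - account.converted_today
    allowed = max(0.0, min(amount, account.balance, remaining_cap))
    account.balance -= allowed
    account.converted_today += allowed
    return allowed * rate


acct = CreditAccount(balance=100.0)
first = convert(acct, 30.0, daily_cap=40.0, rate=0.5)   # fully honoured
second = convert(acct, 30.0, daily_cap=40.0, rate=0.5)  # trimmed to the cap
third = convert(acct, 30.0, daily_cap=40.0, rate=0.5)   # cap exhausted
print(first, second, third, acct.balance)
```

Inside the sandbox the credits still circulate freely. Only the exit is throttled, which is what keeps a local shock local.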

The paper notes that community currencies require careful design: transparency, legitimacy, circulation velocity, locality, self-governance, and clear objectives. For business deployments, the translation is simple. Do not create an internal agent credit because it sounds futuristic. Create one only if it solves a specific coordination problem:

Use case	Sensible currency design	Bad design
Cloud cost governance	Credits tied to workload priority and compute efficiency	Freely convertible credits that teams hoard or trade politically
Customer support routing	Credits for escalation capacity and specialist review	A black-box score that hides why customers are deprioritised
R&D agent collaboration	Credits paid for validated intermediate contributions	Winner-takes-all rewards to the final summarising agent
City or campus operations	Local credits tied to energy, water, traffic, or maintenance goals	A generic “smart city token” with no credible governance

The business lesson is anti-glamorous: the currency matters less than the convertibility rule. Liquidity is useful. Uncontrolled liquidity is how a pilot becomes a systemic dependency.

What the paper directly shows, and what Cognaptus infers

Because this is a conceptual paper, it is worth being precise about what counts as evidence. The paper does not demonstrate that auction-based agent economies work at scale. It does not show that verifiable credentials prevent agent fraud in production. It does not prove that mission economies will align agent swarms with public objectives. It does not provide measured performance results.

What it does provide is an architectural argument: if agent systems become economically active, then alignment cannot be treated only as a model-level problem. It becomes a market-design, identity, legal, and governance problem.

Paper claim	Evidence type	Business meaning	Boundary
Agent economies may emerge as agents transact and coordinate	Conceptual synthesis from agent autonomy and interoperability trends	Firms should expect agent-to-agent workflows to become cross-platform	Timing and scale remain uncertain
Permeability is the key design variable	Taxonomy and risk analysis	Deployment policies should define what can cross from agent systems into real assets	No universal permeability setting fits all sectors
Auctions can help allocate scarce resources fairly	Market-design reasoning, especially equal endowment logic	Useful for internal allocation pilots around compute, API priority, or specialist agents	Strategic capability gaps may still distort outcomes
Mission economies can direct agent markets toward shared goals	Policy-design analogy and mechanism proposal	Agent incentives can include sustainability, resilience, safety, or research-quality metrics	Metrics can be gamed; mission choice is political
Identity and credentials are prerequisites for trust	Infrastructure analysis	Agent platforms need identity, authorisation, revocation, billing, and audit rails	Technical feasibility differs by deployment context
Hybrid oversight is required	Speed and scale argument	Automated containment should precede human review	Automated overseers themselves require governance

Cognaptus’ inference is that enterprises should not wait for “the agent economy” to arrive as a grand external event. They should build miniature versions now, inside bounded domains, and measure what breaks.

Start with a narrow market: allocating model calls, dispatching specialist agents, ranking procurement requests, or pricing access to scarce review capacity. Give agents limited budgets. Require credentials. Log every bid and action. Add circuit breakers. Test whether outcomes improve. Then decide whether to connect the pilot to more consequential systems.

That is the ladder in the title: start with a sandbox, add controlled permeability, climb only when the previous rung has held under stress.

The boundary: not everything should become a market

The paper is more market-friendly than much of the AI alignment discourse, but it does not argue that every human value should be priced. That restraint matters.

Some needs are poorly represented by bidding. Some rights should not depend on purchasing power, even virtual purchasing power. Some domains require direct human decision-making because legitimacy, dignity, consent, culture, or risk sensitivity cannot be delegated cleanly to agent negotiation. A market can allocate compute. It should not quietly become the constitutional order of customer service, healthcare triage, public benefits, education access, or legal representation.

This is the practical limitation for executives: agent markets are coordination tools, not moral machines. They are good for allocating scarce resources under rules. They are bad when the rules are illegitimate, when affected people cannot participate, when power asymmetries dominate, or when the metric becomes a substitute for judgment.

The paper’s strongest recommendation is therefore not “build agent markets everywhere.” It is “pilot them under regulatory and operational containment.” Use supervised real-world laboratories. Test narrow missions. Observe emergent cooperation and adversarial behaviour. Measure fairness and failure. Refine before scaling.

That may sound slow. Compared with accidentally wiring autonomous agents into finance, logistics, procurement, and customer operations without market governance, it is practically athletic.

The operator’s playbook

A company taking this paper seriously would not begin by launching a token. It would begin by writing a permeability policy.

First, define which assets agents may touch: money, data, compute, tools, customer communications, legal commitments, and physical operations. Then define the boundary conditions: spending thresholds, conversion limits, approval tiers, jurisdictional constraints, and emergency stops.

Second, create one internal market for one scarce resource. Compute is the obvious candidate. Specialist agent time, review capacity, high-priority API calls, and data access are also suitable. Issue equal or policy-adjusted credits. Run batch auctions before continuous ones. Measure distribution, satisfaction, task success, gaming, and volatility.

Third, build identity rails before broadening access. Every economically active agent should have an identity, a controller, an authorisation profile, revocable credentials, and an audit log. Disposable agents and durable agents should not be governed as if they are the same creature wearing different hats.

Fourth, add oversight that can actually keep up. Human approval is useful at thresholds, exceptions, disputes, and policy changes. Automated oversight is needed for anomaly detection, transaction quarantine, credential checks, and circuit breakers.

Fifth, test mission weights carefully. Add incentives for outcomes the organisation genuinely values: lower compute waste, safer tool use, verified outputs, faster resolution, higher customer satisfaction, or lower emissions. Do not add fifteen objectives and call the resulting soup “aligned.” It is not aligned. It is confused with a dashboard.

The real thesis: agent alignment becomes institutional design

The paper’s quiet provocation is that advanced agents will not be governed only through better prompts, safer models, or more obedient assistants. Once agents transact with one another, alignment becomes institutional design.

Prices, credentials, identities, audit trails, liability rules, market boundaries, community currencies, oversight agents, and human review all become part of the alignment stack. The model still matters, obviously. But the market around the model may matter just as much.

That is the shift operators should notice. Agentic AI is usually sold as productivity software. This paper treats it as economic infrastructure. If agents can bargain, spend, allocate, outsource, and compete, then the governance problem is no longer confined to the application layer. It lives in the transaction layer.

The good news is that enterprises do not need to solve the entire future economy. They need to stop pretending their agent pilots are just clever workflows. Every agent with a budget is already a tiny economic actor. Every API quota is a resource allocation regime. Every approval threshold is a permeability rule. Every log schema is a future courtroom exhibit, whether legal has noticed or not.

Build the sandbox. Add the ladder. Climb slowly.

Cognaptus: Automate the Present, Incubate the Future.

¹ Nenad Tomašev, Matija Franklin, Joel Z. Leibo, Julian Jacobs, William A. Cunningham, Iason Gabriel, and Simon Osindero, “Virtual Agent Economies,” arXiv:2509.10147, 2025. https://arxiv.org/abs/2509.10147
