From Wallets to Warlords: How AI Agents Are Colonizing Web3

TL;DR for operators

The useful reading of this paper is not “AI agents are coming to crypto.” That is already obvious, and in some corners of the market, painfully over-branded.

The sharper point is that Web3-AI agents are forming a stack. At the bottom are infrastructure and trust layers: protocols, DePIN systems, verification mechanisms, execution environments, and agent-development platforms. On top sit the applications: DeFi agents, portfolio tools, market-intelligence systems, governance assistants, security auditors, creative agents, and RWA managers. The paper’s dataset of 133 projects shows this stack is not evenly valued. Infrastructure accounts for 67.8% of the analysed $6.92 billion market capitalisation, even though incubation platforms show the most project activity.¹

For an operator, that split matters. Builder platforms suggest experimentation. Infrastructure valuation suggests where the market thinks control points may sit. DeFi agents suggest better workflow automation and user interfaces, not guaranteed alpha. Governance agents suggest cheaper proposal analysis and enforcement monitoring, not automatic democracy. Security agents suggest scalable first-pass auditing and threat intelligence, not the retirement of human auditors. Trust mechanisms suggest verifiable execution and accountability, not magical immunity from hallucination.

The paper is strongest as a landscape map and mechanism synthesis. It is weaker as proof of commercial durability. It does not show that these projects have sustainable revenue, defensible moats, or reliable autonomous performance under adversarial conditions. Crypto valuations are volatile; agent reliability is still fragile; and prompt injection becomes much less amusing when the agent has wallet permissions.

So the business takeaway is simple: do not ask whether “AI agents in Web3” are hype or real. They are both. Ask which layer is being built, what the agent is allowed to do, who verifies its actions, and what happens when it is confidently wrong. That is where the grown-up conversation begins, inconveniently.

Wallets were only the first act

A wallet used to be a signing tool. The user selected an action, reviewed the transaction, paid the gas, and hoped they had not just approved something regrettable. Web3’s promise was self-custody; its daily reality was tabs, bridges, approvals, seed phrases, slippage settings, and interfaces apparently designed by people who believe onboarding is a character test.

AI agents enter this scene as translators and operators. They can read market data, interpret natural language instructions, plan multi-step actions, interact with smart contracts, monitor portfolios, summarise proposals, and flag suspicious code. That makes them more than chatbots stapled onto token dashboards. In the paper’s framing, the agent becomes an actor inside the Web3 environment, not merely a customer-support layer outside it.

The mechanism is bidirectional.

Web3 gives agents four things they usually lack: programmable execution, native assets, verifiable records, and decentralised trust mechanisms. Agents give Web3 four things it badly needs: automation, interpretation, adaptive decision-making, and tolerable user experience. One side supplies rails; the other supplies cognition. That is the bargain.

The paper studies this bargain across 133 Web3-AI agent projects. Its contribution is not a single new model or benchmark. It is a market and systems map: four primary categories, ten subcategories, market-capitalisation patterns, and a synthesis of how agents are being inserted into DeFi, governance, security, and trust infrastructure. The paper’s method combines project collection from CoinMarketCap, Product Hunt, and GitHub; keyword filtering and snowball sampling; inclusion checks for meaningful Web3-AI integration; qualitative card sorting; and market analysis for the 77 projects with available capitalisation data.

That matters because this market is easy to misread. If one looks only at the loudest examples, Web3-AI agents look like tokenised mascots with Discord accounts. Some are exactly that. The dataset, however, shows a wider structure: development tools, agent marketplaces, DeFi executors, investment analytics, agent protocols, trust layers, smart-contract development tools, gaming agents, content agents, and RWA-related systems.

The serious question is not whether every item in the category deserves reverence. Spoiler: no. The serious question is where autonomous execution becomes economically useful, and where the trust boundary becomes too dangerous to delegate.

The taxonomy shows a stack, not a fad

The paper organises Web3-AI agent projects into four primary categories: AI Agent Incubation, Infrastructure, Financial Services, and Creative & Virtual. These categories are not mutually exclusive. The paper notes that 27 projects span multiple categories, and the full market capitalisation of multi-category projects is attributed to each relevant category in the analysis. That is a necessary boundary: the taxonomy is best read as a functional map, not a clean accounting ledger.

Category	What it contains	Operational reading
AI Agent Incubation	Builder platforms, marketplaces, monetisation tools, launchpads	The experimentation layer: people are trying to make agent creation easier, cheaper, and more tokenisable.
Infrastructure	Agent protocols, DePIN, trust layers, AI-powered development tools	The control layer: compute, coordination, verification, and developer primitives.
Financial Services	DeFAI agents and investment analytics	The execution layer: agents that trade, rebalance, analyse, bridge, and explain.
Creative & Virtual	Games, metaverse agents, content creation, AI-powered RWA systems	The attention and asset layer: agents as characters, creators, digital labour, and asset managers.

The distribution is revealing. AI Agent Incubation has the highest project count, with 56 projects, or 42.1% of the dataset. Financial Services is close behind with 55 projects. Infrastructure has 34 projects. Creative & Virtual has 28.

At first glance, that might suggest incubation and financial applications are the centre of gravity. Project count, however, is not capital concentration. The market-capitalisation data covers 77 of the 133 projects and totals about $6.92 billion. Within that subset, Infrastructure accounts for $4.69 billion, or 67.8% of the analysed market capitalisation, despite being only about a quarter of the project universe. The average market capitalisation in Infrastructure is also much higher than in the other categories.
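The shares quoted above can be reproduced from the article's own rounded figures. The following is a minimal Python sketch; the helper name `pct` is an assumption introduced here, and because the inputs are rounded, results may differ from the paper's in the last digit.

```python
def pct(part: float, whole: float) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL_PROJECTS = 133      # projects in the taxonomy
ANALYSED_CAP_BN = 6.92    # USD billions, across the 77 projects with cap data

# Share of projects (by count) for two of the four categories.
incubation_share = pct(56, TOTAL_PROJECTS)
infrastructure_share = pct(34, TOTAL_PROJECTS)

# Share of analysed market capitalisation held by Infrastructure.
infrastructure_cap_share = pct(4.69, ANALYSED_CAP_BN)

print(incubation_share, infrastructure_share, infrastructure_cap_share)
```

Read side by side, the numbers make the mismatch concrete: Infrastructure is roughly a quarter of the projects by count but about two thirds of the analysed capitalisation.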

That is the paper’s first important business signal: development energy and investor confidence are not pointing to exactly the same layer.

Incubation has breadth. Infrastructure has valuation. This is not unusual in emerging technical markets. Application builders multiply quickly because the surface area is large and the entry barrier can be low. Infrastructure concentrates value because the market expects foundations, standards, and bottlenecks to matter later. Whether that expectation is correct is not proven by the paper. But it is visible in the data.

There is also a network concentration pattern. Ethereum hosts 45 project instances in the analysed network data, representing 39.5% of network instances and 87.4% of the associated market capitalisation. Solana and Base follow in project count, but with far lower associated capitalisation. So yes, multi-chain activity is emerging. No, the value distribution is not yet equally multi-chain. Decentralisation, meet market concentration. Try not to look surprised.

The paper also reports a strong power-law pattern: the top 10% of projects with available market-cap data, seven projects out of 77, account for $5.13 billion, or 74.2% of analysed market capitalisation. This is the main evidence for the market-structure argument, not a robustness check. It says the category is already stratified: a few large infrastructure and platform projects carry much of the visible value, while many smaller projects populate the experimentation layer.

For business readers, the taxonomy is less useful as a list of names and more useful as a screening device. When someone pitches a Web3-AI agent product, ask which layer it belongs to. Builder platform? Execution agent? Analytics interface? Trust layer? Security tool? Digital character? The answer tells you what kind of risk you are underwriting.

The agent bargain: Web3 gives rails, AI gives agency

The paper’s Figure 2 describes a typical Web3-AI agent architecture. The LLM sits inside an agent loop with planning, memory, goal orientation, observation, action, and orchestration. Through tool interfaces such as the Model Context Protocol, the agent interacts with wallets, blockchains, smart contracts, DeFi protocols, and market data.

This figure is best treated as an implementation detail with strategic consequences. It is not evidence that current systems are reliable. It explains the interface boundary: the agent reasons in natural language and tool calls; Web3 executes through wallets, smart contracts, and protocols.

That interface boundary is where the opportunity sits.

A human user says, “Move idle USDC into a lower-risk yield strategy, but avoid unaudited pools and do not bridge unless the expected gain covers gas.” A conventional app needs menus, filters, warnings, protocol pages, maybe a tutorial, and ideally a user with patience. An agent can turn that intent into a sequence: inspect balances, query yields, check risk labels, compare chains, estimate gas, generate a plan, ask for confirmation, then execute. In theory.

The “in theory” is carrying a small warehouse of risk. But the mechanism is real: agents can translate intent into on-chain action. That is why the Web3-AI intersection is more interesting than a chatbot wrapper. The agent is not just answering questions about a system. It can participate in the system.
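The intent-to-plan step in the USDC example can be sketched in a few lines of Python. Everything here is hypothetical: the `Pool` type, the pool names, the yields, the 30-day horizon, and the simple non-compounding gain estimate are assumptions for illustration, not anything the paper specifies. "Lower-risk" is reduced to a single audited flag, and "do not bridge unless the expected gain covers gas" is read as "a cross-chain pool must beat local options after subtracting bridge cost".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    name: str
    chain: str
    apy: float      # annual yield as a fraction, e.g. 0.04 for 4%
    audited: bool


def expected_gain(amount: float, apy: float, days: int) -> float:
    """Simple (non-compounding) yield estimate over the horizon."""
    return amount * apy * days / 365


def choose_pool(amount: float, home_chain: str, pools: List[Pool],
                bridge_gas: float, days: int = 30) -> Optional[Pool]:
    """Pick the audited pool with the best gain net of bridging cost."""
    best, best_net = None, 0.0
    for pool in pools:
        if not pool.audited:            # constraint: avoid unaudited pools
            continue
        net = expected_gain(amount, pool.apy, days)
        if pool.chain != home_chain:    # constraint: bridging must pay for its gas
            net -= bridge_gas
        if net > best_net:
            best, best_net = pool, net
    return best


def build_plan(amount: float, home_chain: str, pools: List[Pool],
               bridge_gas: float) -> List[str]:
    """Turn the chosen pool into an ordered list of human-readable steps."""
    pool = choose_pool(amount, home_chain, pools, bridge_gas)
    if pool is None:
        return []
    steps = []
    if pool.chain != home_chain:
        steps.append(f"bridge {amount} USDC from {home_chain} to {pool.chain}")
    steps.append(f"deposit {amount} USDC into {pool.name}")
    return steps


def execute(plan: List[str], confirmed: bool) -> List[str]:
    """Return the steps that would run; nothing runs without confirmation."""
    return list(plan) if confirmed else []


# Hypothetical pools: names, chains, and yields are invented for this example.
POOLS = [
    Pool("home-lending", "ethereum", 0.04, audited=True),
    Pool("remote-lending", "base", 0.06, audited=True),
    Pool("mystery-farm", "ethereum", 0.20, audited=False),
]

cheap_bridge = build_plan(10_000, "ethereum", POOLS, bridge_gas=5.0)
costly_bridge = build_plan(10_000, "ethereum", POOLS, bridge_gas=20.0)
```

With a cheap bridge the plan moves funds to the higher-yield pool on the other chain; with an expensive one it stays local. The unaudited 20% pool is never considered, and `execute` returns nothing until the user confirms. A real agent would replace the hard-coded pool list with live data, which is exactly where the "in theory" starts earning its keep.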

The paper’s later sections apply this mechanism across four domains: DeFi, governance, security, and trust. These are not four separate stories. They are variations of the same trade: delegate cognition to the agent, constrain execution through Web3, and hope the verification layer is strong enough before the agent finds an expensive way to be creative.

DeFi agents are workflow automation before they are investment genius

The financial-services part of the dataset includes DeFAI agents and investment analytics tools. The paper identifies four roles for AI agents in DeFi: autonomous trading strategy implementation, intelligent portfolio construction, AI-driven market analysis, and improved accessibility through natural language interfaces.

The first role is execution. Agents can translate high-level user intent into token swaps, liquidity provision, lending, reward harvesting, cross-chain movements, and other smart-contract interactions. This is where the wallet stops being a passive signing device and becomes part of an automated operating loop.

The second role is portfolio management. The paper describes agents that analyse transaction history, risk tolerance, market conditions, protocol risks, yield opportunities, and volatility, then recommend or execute portfolio actions. This is not fundamentally different from robo-advisory logic in traditional finance, except the execution surface is messier, faster, composable, and often less forgiving.

The third role is market intelligence. Agents aggregate on-chain data, tokenomics, sentiment, smart-money flows, news, and protocol updates. In a fragmented crypto market, this may be less about producing prophetic alpha and more about reducing search cost. The operator’s question should be: does the agent improve signal discovery, or does it merely summarise noise with better typography?

The fourth role is accessibility. Natural language interfaces can hide gas settings, bridge mechanics, protocol fragmentation, and wallet complexity. This is commercially important because Web3 has spent years proving that “the user owns everything” is not the same as “the user understands anything.”

The business interpretation is therefore grounded but not euphoric. DeFi agents may create value first as operational assistants: faster execution, fewer manual steps, better monitoring, more personalised workflows, and more legible risk surfaces. They do not automatically create durable trading advantage. The paper surveys implementations and roles; it does not prove that AI agents consistently outperform human traders, index strategies, or existing automation systems.

That boundary matters. In finance, automation often arrives wearing the costume of intelligence. The costume is sometimes lovely. The P&L is less sentimental.

Governance agents attack the boring parts of democracy

Web3 governance is not mainly limited by ideals. It is limited by attention, expertise, coordination, and follow-through. Proposal texts are long. Token holders are busy. Technical consequences are hard to assess. Voting participation is often thin. Execution after voting can be opaque. The paper argues that AI agents can support governance across proposal analysis, community engagement, monitoring, enforcement, and adaptive mechanism design.

The most immediate use is proposal digestion. Agents can summarise proposals, identify economic implications, flag security-relevant smart-contract changes, translate technical language into accessible explanations, and track sentiment in community discussions. This is not glamorous, but neither is reading a 40-page governance post at midnight. Markets have been built on less noble forms of summarisation.

The second use is delegated voting and decision support. Users may configure agents with preferences, constraints, or values, and have them recommend or cast votes. That raises obvious accountability questions. If an agent votes against a treasury proposal because it misread a parameter, who is responsible: the user, the DAO, the agent developer, the model provider, or the nearest intern?

The third use is post-vote enforcement. Agents can monitor whether approved proposals are actually implemented by tracking smart-contract interactions, treasury movements, parameter changes, and off-chain data sources. This is a more concrete and underappreciated use case. Governance does not end at voting; it ends when decisions are executed correctly.
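At its core, post-vote monitoring is a comparison between what was approved and what is observed. A minimal Python sketch, assuming both have already been reduced to plain key-value parameters; the parameter names and values are hypothetical, and how the observed state is fetched (node, indexer, off-chain source) is deliberately left out.

```python
from typing import Any, Dict, Tuple


def find_deviations(approved: Dict[str, Any],
                    observed: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each approved parameter to (expected, actual) where they differ.

    A parameter missing from the observed state is reported with actual=None.
    """
    deviations = {}
    for key, expected in approved.items():
        actual = observed.get(key)
        if actual != expected:
            deviations[key] = (expected, actual)
    return deviations


# Hypothetical proposal outcome and hypothetical observed state.
approved = {"fee_bps": 30, "treasury_grant_usdc": 250_000, "quorum_pct": 4}
observed = {"fee_bps": 30, "treasury_grant_usdc": 300_000}

report = find_deviations(approved, observed)
```

Here the report flags a grant that was paid at a different amount than approved and a quorum change that has not been applied at all. The hard part in practice is not the comparison but mapping proposal text to checkable parameters, which is where the agent's interpretation, and its error rate, enters.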

The fourth use is adaptive mechanism design. Agents can analyse participation history, voting patterns, quorum failures, incentive structures, and proposal lifecycles, then recommend changes. This is more speculative, but strategically interesting. If DAOs are institutions, agents may become their analysts, clerks, compliance monitors, and occasionally their overconfident consultants. Progress has a sense of humour.

The paper is careful enough to note that dedicated governance-agent categories were not explicitly identified in the landscape analysis. Governance functions are instead supported by platforms and agents from other categories, especially incubation and content-oriented systems. That is an important boundary. Governance-agent value is conceptually strong, but the market category is less mature than DeFi or infrastructure.

For businesses and DAOs, the near-term opportunity is not “autonomous governance.” It is governance operations: proposal triage, risk review, translation, monitoring, and accountability dashboards. The farther one moves toward delegated voting and adaptive mechanism design, the more the risk shifts from productivity to legitimacy.

Security agents are strongest when treated as scalable auditors, not replacement priests

The paper’s security section compares traditional Web3 security approaches with AI-agent-enabled systems. This section is partly a comparison with prior work and existing tools, not a new benchmark run by the authors. That distinction matters.

Traditional smart-contract security tools include static analysis, symbolic execution, formal verification, and machine-learning-based vulnerability detection. Each has strengths. Each also has constraints. Static tools can generate false positives and miss business-logic issues. Symbolic execution can run into state explosion. Formal verification can be rigorous but expensive and specification-heavy. Older ML approaches can struggle with generalisation beyond training distributions.

AI agents are positioned as a way to address some of these gaps by combining code understanding, reasoning, explanation, and tool orchestration. The paper cites systems such as GPTScan and iAudit as evidence that LLM-powered or multi-agent approaches can detect complex smart-contract vulnerabilities with promising precision and accuracy. It also discusses AI-powered platforms that combine auditing, phishing detection, fraud monitoring, and threat intelligence.

The business implication is practical: security agents may reduce the cost and time of first-pass analysis. They can scan code, generate explanations, triage risks, monitor suspicious activity, and support human auditors. They may also help smaller teams that cannot afford continuous manual review.

But the paper does not justify replacing expert security review for high-value contracts. Nor should it. Smart-contract failures are expensive, irreversible, and popular with people who do not send thank-you notes. The correct framing is augmentation: agents expand coverage, speed, and consistency; humans retain responsibility for final judgement, especially around business logic, protocol design, adversarial assumptions, and economic exploits.

A useful operational model is:

Security task	Agent value	Human still needed for
First-pass code scanning	Speed, coverage, vulnerability pattern detection	Prioritisation and false-positive resolution
Business-logic review	Hypothesis generation and explanation	Deep protocol reasoning and economic attack modelling
Phishing and fraud monitoring	Real-time pattern recognition	Response design, user communication, escalation
Audit documentation	Drafting, traceability, repeatability	Final accountability and sign-off

This is where the Web3-AI agent story becomes more credible. “Agent replaces auditor” is hype. “Agent gives auditors more reach and better triage” is a product.

Trust layers are the part everyone should stop skipping

Most AI-agent discussions obsess over capability. In Web3, capability without verification is a liability with branding.

The paper’s trust section is therefore central. It argues that Web3 mechanisms can support AI-agent reliability through cryptographic security, privacy-preserving computation, decentralised consensus, verification systems, transparent governance, and immutable audit trails.

This is the other half of the bargain. AI gives Web3 agency; Web3 gives AI constraints.

Trusted Execution Environments and cryptographic methods such as fully homomorphic encryption can help agents compute over sensitive data or execute actions in more verifiable environments. Consensus mechanisms and smart contracts can validate agent actions without relying on a central authority. Decentralised knowledge graphs and provenance systems can help agents access verified information. On-chain records can create audit trails for agent decisions and transactions.

None of this makes the model truthful. It makes parts of the environment verifiable.

That distinction is crucial. A blockchain can record what an agent did. It cannot guarantee that the agent’s reasoning was sensible. A TEE can protect execution integrity. It cannot make a bad strategy good. A reputation system can track behaviour. It cannot eliminate adversarial manipulation. Trust infrastructure reduces certain classes of uncertainty; it does not abolish judgement.

For operators, this means the key design question is permissions. What can the agent observe? What can it decide? What can it execute without confirmation? What limits apply by asset, protocol, transaction size, chain, time window, and risk score? What logs are generated? Who can dispute an action? Can the agent be paused?
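Those questions translate fairly directly into a policy check that sits between the agent and the wallet. The Python sketch below is one possible shape, not a standard or any project's actual API: the `Policy` and `Action` types, the field names, and the three-way allow/confirm/deny outcome are all assumptions made for illustration. It covers only asset, protocol, transaction size, logging, and pause; chain, time-window, and risk-score limits would be further fields.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Action:
    asset: str
    protocol: str
    amount_usd: float


@dataclass
class Policy:
    allowed_assets: FrozenSet[str]
    allowed_protocols: FrozenSet[str]
    max_tx_usd: float          # hard ceiling per transaction
    confirm_above_usd: float   # above this, a human must approve
    paused: bool = False       # kill switch: deny everything while set


def authorize(action: Action, policy: Policy,
              log: List[Tuple[Action, str]]) -> str:
    """Return 'allow', 'confirm', or 'deny', and record the decision."""
    if policy.paused:
        decision = "deny"
    elif action.asset not in policy.allowed_assets:
        decision = "deny"
    elif action.protocol not in policy.allowed_protocols:
        decision = "deny"
    elif action.amount_usd > policy.max_tx_usd:
        decision = "deny"
    elif action.amount_usd > policy.confirm_above_usd:
        decision = "confirm"
    else:
        decision = "allow"
    log.append((action, decision))   # every decision leaves an audit entry
    return decision


# Hypothetical policy: one asset, one protocol, small unattended limit.
policy = Policy(
    allowed_assets=frozenset({"USDC"}),
    allowed_protocols=frozenset({"lending-v1"}),
    max_tx_usd=5_000.0,
    confirm_above_usd=500.0,
)
audit_log: List[Tuple[Action, str]] = []

small = authorize(Action("USDC", "lending-v1", 100.0), policy, audit_log)
medium = authorize(Action("USDC", "lending-v1", 1_000.0), policy, audit_log)
large = authorize(Action("USDC", "lending-v1", 10_000.0), policy, audit_log)
wrong_asset = authorize(Action("WETH", "lending-v1", 100.0), policy, audit_log)

policy.paused = True
after_pause = authorize(Action("USDC", "lending-v1", 100.0), policy, audit_log)
```

The design choice worth noting is that the check is deterministic and lives outside the model. The agent can propose whatever it likes; the policy decides what reaches the signer, and the log records both.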

The difference between a helpful agent and a small financial warlord is often a permissions table.

The market signal is infrastructure first, applications second

The paper’s most useful business evidence is the mismatch between project count and market capitalisation.

Incubation platforms dominate project activity. This suggests experimentation: builders are creating marketplaces, launchpads, agent tooling, and monetisation systems. Financial services also has broad activity, showing obvious demand for trading, analytics, and portfolio automation. Creative and virtual agents extend the category into attention markets, gaming, digital identity, and content.

Yet infrastructure captures the majority of analysed capitalisation. That suggests the market currently values foundations more than surface-level applications. The projects most likely to matter are not necessarily the ones with the liveliest chatbot demos. They may be the ones controlling agent coordination, verification, compute, identity, data provenance, wallet integration, or developer access.

This does not mean every infrastructure token is a good investment. Please do not tattoo market capitalisation onto your investment thesis. It means the category’s control points appear to sit below the application layer.

A business can use the paper’s map in three ways.

First, for product positioning. A Web3-AI product should be clear about whether it is competing as an interface, an execution agent, an analytics layer, an infrastructure protocol, or a trust mechanism. Fuzzy positioning may help on a pitch deck. It does not help integration buyers.

Second, for partnership strategy. Application teams may need infrastructure partners for wallet permissions, cross-chain execution, verification, storage, or auditability. Infrastructure teams may need application partners to prove usage beyond speculative token demand.

Third, for risk assessment. The more autonomy an agent receives, the more important the trust layer becomes. A market-intelligence bot can be wrong with limited damage. A portfolio agent with wallet permissions can be wrong expensively. A governance agent can be wrong institutionally. A smart-contract auditor can be wrong catastrophically. Same word, “agent”; very different blast radius.

What the paper directly shows, and what Cognaptus infers

The paper directly shows a surveyed landscape of 133 Web3-AI agent projects, organised into four primary categories and ten subcategories. It directly shows that market-capitalisation data was available for 77 projects, totalling about $6.92 billion as of early April 2025. It directly shows concentration: Infrastructure accounts for 67.8% of analysed market capitalisation; the top seven projects account for 74.2%; Ethereum dominates the network-value distribution in the analysed sample.

The paper also directly synthesises application roles across DeFi, governance, security, and trust mechanisms. These sections are partly based on project examples and related literature. They are useful for mechanism mapping, not for proving that every described capability is mature, profitable, or robust in production.

Cognaptus infers three business implications.

First, Web3-AI agents should be evaluated by delegated authority, not by interface polish. A natural-language interface is not the product if the agent cannot safely observe, decide, and execute. Conversely, a plain interface can still be valuable if the permissioning, logging, and execution rails are robust.

Second, infrastructure and trust layers deserve disproportionate attention. In a market where agents act on-chain, the scarce assets are not only models. They are verifiable execution, secure key management, identity, reputation, memory, audit trails, and cross-protocol coordination.

Third, the best near-term enterprise-grade use cases are likely assisted autonomy, not full autonomy. DeFi monitoring, proposal summarisation, security triage, transaction preparation, compliance logging, portfolio recommendations, and controlled execution workflows are more realistic than “let the agent manage everything while everyone goes for coffee.” The coffee can wait.

The boundary conditions are not decorative

The paper’s limitations are not generic academic modesty. They materially affect how the results should be used.

The first boundary is data coverage. The taxonomy covers 133 projects, but market-capitalisation analysis covers 77. That is still useful, but not complete. Early-stage, private, open-source, or inactive projects may be represented differently from listed token projects.

The second boundary is market volatility. The capitalisation figures are a snapshot from April 2025. In crypto, market value is a weather report pretending to be geology. It can indicate attention and confidence, but it is not proof of durable revenue, retention, or defensibility.

The third boundary is overlapping classification. Multi-category projects are assigned to multiple categories, and their full market capitalisation is counted in each relevant category. This helps map functional reach but complicates clean category-level valuation comparisons.
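The effect of full attribution is easy to see in miniature. The Python sketch below first checks the overlap implied by the article's own category counts, then uses three invented projects (names and capitalisations are hypothetical) to show why category totals can sum to more than the market they describe.

```python
from collections import defaultdict
from typing import Dict, List

# Category project counts as reported in the article.
category_counts = {
    "AI Agent Incubation": 56,
    "Financial Services": 55,
    "Infrastructure": 34,
    "Creative & Virtual": 28,
}
TOTAL_PROJECTS = 133

# Category assignments beyond one per project.
extra_assignments = sum(category_counts.values()) - TOTAL_PROJECTS

# Hypothetical projects: names and caps are invented for illustration only.
projects = [
    {"name": "alpha", "cap": 100.0, "categories": ["Infrastructure"]},
    {"name": "beta", "cap": 60.0,
     "categories": ["Infrastructure", "Financial Services"]},
    {"name": "gamma", "cap": 40.0, "categories": ["AI Agent Incubation"]},
]


def category_totals(items: List[dict]) -> Dict[str, float]:
    """Attribute each project's full cap to every category it belongs to."""
    totals: Dict[str, float] = defaultdict(float)
    for project in items:
        for category in project["categories"]:
            totals[category] += project["cap"]
    return dict(totals)


totals = category_totals(projects)
unique_total = sum(p["cap"] for p in projects)   # each project counted once
attributed_total = sum(totals.values())          # multi-category caps repeat
```

In the toy data the projects are worth 200 in total, but the category totals add up to 260 because the multi-category project is counted twice. The same logic means category shares in the paper need not sum to 100%, so they are better read as "how much value touches this layer" than as slices of a single pie.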

The fourth boundary is evidence type. The paper is a systematic landscape and synthesis study. It does not run a unified benchmark across all surveyed agents. Security metrics cited for tools such as GPTScan or iAudit come from prior systems and related work. Governance and trust sections are mechanism analyses supported by examples, not controlled deployment trials.

The fifth boundary is agent reliability. The paper explicitly identifies hallucination, limited context memory, computational cost, prompt injection, jailbreaking, and user-trust barriers. These are not side issues. They define the ceiling of autonomy. An unreliable agent with no wallet access is a nuisance. An unreliable agent with asset control is an incident report.

The next frontier is agent sovereignty, which should make adults nervous

The paper’s discussion points to future research directions: persistent agent memory, portable AI-agent digital assets, decentralised agent identity, decentralised multi-agent coordination, and RWA integration.

These directions are coherent. Agents need memory if they are to act across long time horizons. They need wallets or asset interfaces if they are to participate economically. They need identity if reputation and accountability are to persist across platforms. They need coordination protocols if multiple agents are to collaborate without central orchestration. They need RWA integration if they are to manage tokenised claims on real-world assets.

They also escalate the risk.

An agent with memory can learn, but it can also preserve contaminated context. An agent with assets can transact, but it can also lose money. An agent with identity can build reputation, but it can also become a legal puzzle. A swarm of agents can coordinate, but coordination is not always benign. RWA agents can improve asset operations, but they inherit every oracle, verification, legal, and custody problem in tokenised real-world finance.

This is why the phrase “autonomous AI agent” should not be treated as a feature by default. Autonomy is a design variable. Sometimes more is better. Often, bounded autonomy is the product.

The practical maturity model looks like this:

Stage	Agent capability	Suitable business use
Advisory	Summarises, explains, recommends	Research, education, market monitoring, proposal digestion
Assisted execution	Prepares actions for user approval	DeFi workflows, bridge routing, portfolio rebalancing suggestions
Bounded autonomy	Executes within strict limits	Low-value recurring transactions, monitoring, alerts, reward harvesting
Delegated autonomy	Controls meaningful assets or votes	Requires strong identity, auditability, risk controls, dispute processes
Multi-agent autonomy	Coordinates with other agents across protocols	Still mostly research and frontier experimentation

Many projects will claim to be near the bottom of that table. Most should operate near the top until their control systems mature.

The sober opportunity

The paper makes Web3-AI agents look less like a gimmick and more like an emerging operating layer for decentralised systems. That does not make the market clean, safe, or sane. It means the underlying pattern is worth understanding.

AI agents can reduce Web3’s usability tax. They can automate DeFi workflows, improve portfolio monitoring, accelerate market intelligence, support governance operations, strengthen audit workflows, and create more accountable execution environments. Web3 can give agents wallets, smart-contract access, cryptographic verification, decentralised identity, persistent memory, and audit trails.

The result is not simply “AI plus blockchain.” It is a new delegation problem.

Who gets to act? Under whose authority? With what limits? Using which data? Verified by whom? Reversible how? Insured by what? Governed where? These are not philosophical flourishes. They are product requirements.

The paper’s landscape says the ecosystem is already broad. Its market analysis says infrastructure is where visible value is concentrating. Its domain synthesis says agents are becoming operators across finance, governance, security, and trust. Its limitations say we are not yet ready to hand them the keys without supervision.

That is the balanced reading. Web3-AI agents are not just wallets with better grammar. Nor are they autonomous financial gods. They are emerging machine actors inside programmable economies.

Some will be tools. Some will be infrastructure. Some will be toys. A few may become institutions.

And because this is Web3, several will become cautionary tales with excellent branding.

Cognaptus: Automate the Present, Incubate the Future.

¹ Yiming Shen, Jiashuo Zhang, Zhenzhe Shao, Wenxuan Luo, Yanlin Wang, Ting Chen, Zibin Zheng, and Jiachi Chen, “Web3 × AI Agents: Landscape, Integrations, and Foundational Challenges,” arXiv:2508.02773, 2025, https://arxiv.org/pdf/2508.02773.
