Starting from Manus and MCP: AI Agents' Crossover Exploration of Web3

Manus, billed as the world's first general AI Agent product and released by the Chinese startup Monica, has been all over Chinese technology media and social networks. On launch day, invitation codes were nearly impossible to find anywhere online, and one was even listed on Xianyu for 50,000 yuan. Even so, many industry KOLs obtained invitation codes in advance, and a flood of hands-on reviews and analysis articles followed.

As a general AI Agent product, Manus can independently complete tasks from planning through execution, such as writing reports and building spreadsheets. Rather than only generating ideas, it thinks and acts on its own: it plans and executes complex tasks and delivers complete results directly, demonstrating unprecedented versatility and execution capability.

Manus's popularity has not only drawn attention across the industry but has also provided valuable product ideas and design inspiration for other AI Agents. As AI technology develops rapidly, the AI Agent, an important branch of artificial intelligence, is gradually moving from concept to reality and showing great application potential across industries, including Web3.

Background knowledge

An AI Agent, or artificial intelligence agent, is a computer program that makes decisions and performs tasks autonomously based on its environment, its inputs, and predefined goals. Its core components are: a Large Language Model (LLM) that serves as its "brain" and lets it process information, learn from interactions, make decisions, and act; observation and perception mechanisms, which let it sense its environment; a reasoning process, which analyzes observations and memory and weighs possible actions; action execution, the explicit response to that reasoning and observation; and memory and retrieval, which store past experience for later learning.

Starting from ReAct, AI Agent design patterns have developed along two routes. One focuses on the Agent's planning ability and includes ReWOO, Plan & Execute, and LLMCompiler. The other focuses on its reflection ability and includes Basic Reflection, Reflexion, Self-Discover, and LATS.

ReAct is the earliest of these AI Agent design patterns and is currently the most widely used, so we focus on it here. ReAct solves diverse language reasoning and decision-making tasks by combining reasoning and acting in a language model. Its typical process is shown in the figure below and can be described as a cycle: Thought → Action → Observation, referred to as the TAO cycle.

Thought: When facing a problem, the Agent first thinks it through. This step defines the problem and identifies the key information and reasoning steps needed to solve it.

Action: Once the direction is clear, the next step is to act. Based on the preceding thought, the Agent takes a corresponding measure or performs a specific task that should move the problem toward a solution.

Observation: After acting, the Agent examines the result. This step checks whether the action was effective and whether it brought the Agent closer to the answer.

Loop iteration: The Agent repeats the cycle, feeding each observation into the next thought, until the problem is solved.
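The Thought → Action → Observation cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scripted_model` is a hypothetical stand-in for a real LLM call, and the `add` tool and the trace format are invented for the example rather than taken from any particular framework.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

OBS_PREFIX = "Observation: "


def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: returns (thought, action, argument).

    A real agent would prompt a language model with the history instead.
    """
    observations = [h for h in history if h.startswith(OBS_PREFIX)]
    if not observations:
        return ("I need to compute the sum first.", "add", "2+3")
    answer = observations[-1][len(OBS_PREFIX):]
    return ("I have the result, so I can answer.", "finish", answer)


def react_loop(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run the TAO cycle until the model chooses the 'finish' action."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = scripted_model(history)   # Thought
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg, history
        history.append(f"Action: {action}[{arg}]")       # Action
        observation = TOOLS[action](arg)
        history.append(OBS_PREFIX + observation)         # Observation
    raise RuntimeError("no answer within max_steps")


answer, trace = react_loop("What is 2 + 3?")
```

The `max_steps` cap is a design choice worth keeping in any real loop: without it, a model that never emits a final answer would iterate forever.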

By the number of agents involved, AI Agents can be divided into Single Agent and Multi Agent systems. The core of a Single Agent system is the cooperation between the LLM and its tools, and the Agent may go through several rounds of interaction with the user while completing a task. A Multi Agent system assigns different roles to different Agents and completes complex tasks through collaboration among them, with less user interaction along the way than in the Single Agent case. Currently, most frameworks focus on Single Agent scenarios.

Model Context Protocol (MCP) is an open-source protocol launched by Anthropic on November 25, 2024. It aims to solve the problem of connecting LLMs to external data sources and letting them interact. If the LLM is compared to an operating system, MCP is like a USB interface: external data and tools can be plugged in flexibly, and users can then read and use them.

MCP provides three capabilities for extending an LLM: Resources (knowledge expansion), Tools (executing functions and calling external systems), and Prompts (pre-written prompt templates). The protocol uses a Client-Server architecture, and messages are exchanged as JSON-RPC. Anyone can develop and host an MCP Server, and can also take it offline and stop the service at any time.
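To make the Client-Server exchange concrete, here is a simplified sketch of a JSON-RPC 2.0 round trip for the Tools capability. The `tools/list` and `tools/call` method names follow the MCP specification, but everything else is an assumption for illustration: the `get_balance` tool is hypothetical, the server is an in-process function rather than a real transport, and the initialization handshake and tool schemas are omitted.

```python
import json
from typing import Any, Callable, Dict


def get_balance(address: str) -> str:
    # Hypothetical tool: a real server would query a blockchain node here.
    return f"balance of {address}: 0"


# Hypothetical registry of tools exposed by this toy server.
TOOLS: Dict[str, Callable[..., str]] = {"get_balance": get_balance}


def make_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Client side: build a JSON-RPC 2.0 request string."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def handle(raw: str) -> str:
    """Server side: dispatch one request and return a JSON-RPC 2.0 response."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": name} for name in TOOLS]}
    elif req["method"] == "tools/call":
        tool = TOOLS[req["params"]["name"]]
        text = tool(**req["params"]["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


call = make_request(
    1, "tools/call", {"name": "get_balance", "arguments": {"address": "0xabc"}}
)
response = json.loads(handle(call))
listing = json.loads(handle(make_request(2, "tools/list", {})))
unknown = json.loads(handle(make_request(3, "no/such/method", {})))
```

Because the server is just a function from request string to response string, the same dispatch logic could sit behind any transport.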

Current status of AI Agent in Web3

In the Web3 industry, the popularity of AI Agents has dropped sharply since peaking in January this year, and the sector's overall market value has shrunk by more than 90%. Currently, the most popular and highest-market-cap explorations in Web3 still center on AI Agent frameworks, following three models: the launch platform model represented by Virtuals Protocol, the DAO model represented by ElizaOS, and the commercial company model represented by Swarms.

A launch platform lets users create, deploy, and monetize AI Agents, much as pump.fun does for meme coins. Virtuals Protocol is currently the largest launch platform, with more than 100,000 Agents issued on it. AIXBT, a popular crypto KOL, was created on Virtuals. Virtuals Protocol includes a modular Agent framework called G.A.M.E, whose core positioning is to give developers an efficient, open framework that makes developing and launching an AI Agent as simple as building a WordPress website.

DAO stands for Decentralized Autonomous Organization. ElizaOS (formerly ai16z) was founded by @shawmakesmagic on the daos.fun platform. The original idea was to use an AI model to simulate the investment decisions of the well-known venture capital firm a16z and its co-founder Marc Andreessen, and to invest with input from DAO members. It later developed into a DAO for AI Agent developers centered on the Eliza framework. The Eliza framework is built with TypeScript and provides a flexible, scalable platform for developing AI Agents that can interact across multiple platforms while maintaining a consistent personality and knowledge base.

Swarms, an enterprise-grade Multi Agent framework, was launched in 2022 by @KyeGomezB, who is currently 20 years old. It uses intelligent orchestration and efficient collaboration to let multiple AI Agents work together like a team on complex business operations. Swarms began as a purely Web2 AI Agent project. According to the founder, it has more than 45 million agents running in production environments, serving the world's largest financial, insurance, and medical institutions. Only when the $SWARMS token was issued in December 2024 did it officially move from Web2 to Web3.

From the perspective of the economic model alone, only the launch platform can currently form a self-sustaining economic loop. Take Virtuals as an example:

Agent creation: The creator launches a new AI Agent on the Virtuals platform.

Bonding curve setup: The creator pays 100 $VIRTUAL tokens, and a bonding curve is created for the new agent's token, paired with $VIRTUAL.

Liquidity pool creation: Once the bonding curve reaches its threshold, the agent "graduates" and a liquidity pool is created that pairs the agent token with $VIRTUAL. This follows fair-launch principles with no insiders: no pre-mining or internal allocation, a fixed total supply, and long-term liquidity locking.
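The bonding curve and "graduation" steps above can be illustrated with a toy model. The source does not specify the shape of Virtuals' curve or its threshold, so the constant-product formula, the reserve sizes, and the `graduation_threshold` value below are all assumptions chosen for readability, not Virtuals' actual parameters.

```python
from dataclasses import dataclass


@dataclass
class BondingCurve:
    """Toy constant-product curve pairing an agent token with $VIRTUAL.

    All numbers are hypothetical; a real launch platform defines its own
    curve shape, fees, and graduation rule.
    """
    virtual_reserve: float          # $VIRTUAL side of the curve
    token_reserve: float            # agent-token side of the curve
    graduation_threshold: float     # total $VIRTUAL raised needed to graduate
    raised: float = 0.0
    graduated: bool = False

    def price(self) -> float:
        """Marginal price of one agent token, in $VIRTUAL."""
        return self.virtual_reserve / self.token_reserve

    def buy(self, virtual_in: float) -> float:
        """Spend $VIRTUAL on the curve and return the agent tokens received."""
        if self.graduated:
            raise RuntimeError("curve closed: trade in the liquidity pool instead")
        k = self.virtual_reserve * self.token_reserve   # invariant before the trade
        new_virtual = self.virtual_reserve + virtual_in
        tokens_out = self.token_reserve - k / new_virtual
        self.virtual_reserve = new_virtual
        self.token_reserve -= tokens_out
        self.raised += virtual_in
        if self.raised >= self.graduation_threshold:
            # At this point a real platform would create the liquidity pool.
            self.graduated = True
        return tokens_out


curve = BondingCurve(virtual_reserve=100.0, token_reserve=1000.0,
                     graduation_threshold=300.0)
price_before = curve.price()
first = curve.buy(100.0)
second = curve.buy(200.0)
```

In this sketch each purchase raises the price for the next buyer, which is the property that lets early demand fund the eventual liquidity pool.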

In addition to the launch fee for each AI Agent, Virtuals charges a fee on every trade of agent tokens, and AI Agents pay inference fees when they access LLMs through Virtuals' API. ElizaOS and Swarms are both currently planning to build their own launch platforms.

Launch platforms have their own problems, of course. This kind of asset issuance only forms a positive flywheel if the issued assets are themselves attractive. At present, most AI Agents launched this way are essentially memes with no intrinsic value behind them, and they quickly go to zero once the market stops paying attention. In the current cold market, launch platforms cannot even attract creators, so the economic model does not actually function.

Web3 Exploration of MCP

The emergence of MCP has opened new directions for Web3 AI Agents. The two most intuitive are:

Deploying MCP Servers on a blockchain network, which removes the MCP Server as a single point of failure and makes it censorship-resistant;

Giving MCP Servers the ability to interact with blockchains, for example to execute and manage DeFi transactions, which lowers the technical barrier for users.

The first direction places extremely high demands on the underlying blockchain's storage system, data management, and asynchronous computing capabilities. A blockchain such as 0G is one candidate. 0G is a modular AI blockchain with a scalable, programmable DA layer suited to AI dapps. Its modular design aims to provide frictionless interoperability between chains while preserving security, eliminating fragmentation and maximizing connectivity to create a decentralized AI ecosystem.

The second direction resembles a variant of DeFAI. Today, however, each DeFAI project's backend is a set of tools that the project wraps itself as function calls. UnifAI builds a unified DeFAI MCP Server so that projects do not have to reinvent the wheel. UnifAI is a platform that enables autonomous AI agents to perform on-chain and off-chain tasks in the Web3 ecosystem. It offers UniQ for task automation, an agent service marketplace, and infrastructure for tool discovery.

Beyond these two directions, @brucexu_eth, the founder of LXDAO and ETHPanda, has proposed building OpenMCP.Network, a creator incentive network based on Ethereum. MCP Servers need to be hosted and to provide stable service. In this proposal, users pay LLM providers, and LLM providers distribute incentives through the network to the MCP Servers that were actually called. This keeps the whole network sustainable and stable and motivates MCP creators to keep building and providing high-quality content. The network would rely on smart contracts to make the incentives automated, transparent, trustworthy, and censorship-resistant. Signatures, permission verification, and privacy protection during operation could all be implemented with technologies such as Ethereum wallets and ZK.
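One way to picture the incentive flow is a pro-rata split of a payment pool by call count. The proposal as described here does not specify a distribution rule, so this rule, the server names, and the integer token units are all assumptions; in the proposed network the logic would live in a smart contract, whereas this is an off-chain Python sketch.

```python
from typing import Dict


def distribute(pool: int, calls: Dict[str, int]) -> Dict[str, int]:
    """Split `pool` (in smallest token units) across servers by call count.

    Hypothetical rule: each server earns a share proportional to how often
    it was called. Integer division is used, as on-chain arithmetic would be,
    and any leftover units go to the most-called server.
    """
    total_calls = sum(calls.values())
    if total_calls == 0:
        return {server: 0 for server in calls}
    payouts = {server: pool * n // total_calls for server, n in calls.items()}
    remainder = pool - sum(payouts.values())
    top_server = max(calls, key=calls.get)
    payouts[top_server] += remainder
    return payouts


# Hypothetical usage: 100 units shared by two servers called 1 and 2 times.
payouts = distribute(100, {"price-feed": 1, "defi-swap": 2})
```

Assigning the rounding remainder explicitly keeps the payouts summing exactly to the pool, which matters when no units may be created or lost.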

In theory, combining MCP with Web3 can add a decentralized trust mechanism and an economic incentive layer to AI Agent applications. In practice, current zero-knowledge proof (ZKP) technology still struggles to verify the authenticity of Agent behavior, and decentralized networks still have efficiency problems. This is not a direction that will succeed in the short term.

Summary

The release of Manus marks an important milestone for general AI Agent products. The Web3 world needs a milestone product of its own to dispel outside doubts that Web3 is all hype and no practical use.

The emergence of MCP has opened new directions for Web3 AI Agents, including deploying MCP Servers on blockchain networks, enabling MCP Servers to interact with blockchains, and building an incentive network for MCP Server creators.

AI is the grandest narrative in history, and for Web3, integration with AI is inevitable. We still need to stay patient and confident and keep exploring.