Abstract
OpenManus is a powerful AI agent framework whose strength lies in a well-designed modular architecture. This article analyzes the OpenManus system architecture in depth and examines the design philosophy and implementation of its core components, including the agent base class, the tool system, configuration management, and multi-agent collaboration. With this understanding of OpenManus's internals, developers are better placed to customize, extend, and optimize the framework.
Main Text
1. Architecture Overview
OpenManus follows a modular design: components are loosely coupled, which makes the system easy to extend and maintain. The overall architecture breaks down into the modules outlined below.
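As a rough orientation, the layout below is reconstructed from the module paths referenced throughout this article; treat it as a sketch rather than an exhaustive listing:

app/
├── agent/                  # Agent hierarchy: BaseAgent, ReActAgent, ToolCallAgent, ...
│   ├── base.py
│   └── toolcall.py
├── tool/                   # Tool system: BaseTool, ToolCollection, PythonExecute, WebSearch, ...
│   ├── base.py
│   └── tool_collection.py
├── flow/                   # Multi-agent flows: FlowFactory, PlanningFlow
│   ├── flow_factory.py
│   └── planning.py
├── sandbox/                # Sandboxed execution environment
│   └── client.py
├── llm.py                  # LLM interface layer
├── config.py               # Singleton configuration backed by TOML
└── schema.py               # Shared data models: Message, Memory, AgentState, ...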
2. Agent System Design
2.1 The Base Agent Class
The agent system is built around [BaseAgent](file:///D:/project/OpenManus/app/agent/base.py#L17-L196), which defines an agent's basic behavior and state management:
# Core BaseAgent implementation
class BaseAgent(BaseModel, ABC):
    # Core attributes
    name: str = Field(..., description="Unique name of the agent")
    description: Optional[str] = Field(None, description="Optional agent description")

    # Prompt configuration
    system_prompt: Optional[str] = Field(
        None, description="System-level instruction prompt"
    )
    next_step_prompt: Optional[str] = Field(
        None, description="Prompt for determining next action"
    )

    # Dependencies
    llm: LLM = Field(default_factory=LLM, description="Language model instance")
    memory: Memory = Field(default_factory=Memory, description="Agent's memory store")
    state: AgentState = Field(
        default=AgentState.IDLE, description="Current agent state"
    )

    # Execution control
    max_steps: int = Field(default=10, description="Maximum steps before termination")
    current_step: int = Field(default=0, description="Current step in execution")
2.2 State Management Pattern
The agent manages its execution state with a state machine pattern:
# AgentState enum definition
class AgentState(str, Enum):
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    ERROR = "error"
State transitions are handled safely by the [state_context](file:///D:/project/OpenManus/app/agent/base.py#L70-L93) context manager:
@asynccontextmanager
async def state_context(self, new_state: AgentState):
    """Context manager for safe agent state transitions."""
    if not isinstance(new_state, AgentState):
        raise ValueError(f"Invalid state: {new_state}")

    previous_state = self.state
    self.state = new_state
    try:
        yield
    except Exception as e:
        self.state = AgentState.ERROR
        raise e
    finally:
        self.state = previous_state
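As a usage sketch, a run loop built on top of state_context can wrap the whole execution phase in the context manager so the agent always returns to a consistent state; the run method below is hypothetical and not the exact BaseAgent code:

# Hypothetical sketch of a run loop that relies on state_context
async def run(self, request: str) -> str:
    if request:
        self.update_memory("user", request)

    results = []
    async with self.state_context(AgentState.RUNNING):
        while self.current_step < self.max_steps and self.state != AgentState.FINISHED:
            self.current_step += 1
            # Exceptions raised by step() propagate through state_context
            results.append(await self.step())
    return "\n".join(results)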
2.3 Memory Management
The agent manages conversation history and context through the [Memory](file:///D:/project/OpenManus/app/schema.py#L39-L57) class:
def update_memory(
    self,
    role: ROLE_TYPE,
    content: str,
    base64_image: Optional[str] = None,
    **kwargs,
) -> None:
    """Add a message to the agent's memory."""
    message_map = {
        "user": Message.user_message,
        "system": Message.system_message,
        "assistant": Message.assistant_message,
        "tool": lambda content, **kw: Message.tool_message(content, **kw),
    }

    if role not in message_map:
        raise ValueError(f"Unsupported message role: {role}")

    kwargs = {"base64_image": base64_image, **(kwargs if role == "tool" else {})}
    self.memory.add_message(message_map[role](content, **kwargs))
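A brief usage sketch (the agent variable and message text are placeholders): the role string selects the matching Message factory, and for "tool" messages any extra keyword arguments are forwarded to Message.tool_message.

agent.update_memory("user", "Please analyze the latest sales data")
agent.update_memory("assistant", "I'll start by loading the CSV file.")
# Unsupported roles raise ValueError("Unsupported message role: ...")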
3. Tool System Architecture
3.1 Tool Base Class Design
Every tool inherits from the [BaseTool](file:///D:/project/OpenManus/app/tool/base.py#L8-L32) base class:
class BaseTool(ABC, BaseModel):
    name: str
    description: str
    parameters: Optional[dict] = None

    async def __call__(self, **kwargs) -> Any:
        """Execute the tool with given parameters."""
        return await self.execute(**kwargs)

    @abstractmethod
    async def execute(self, **kwargs) -> Any:
        """Execute the tool with given parameters."""

    def to_param(self) -> Dict:
        """Convert tool to function call format."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }
3.2 Tool Collection Management
The [ToolCollection](file:///D:/project/OpenManus/app/tool/tool_collection.py#L12-L71) class manages a set of tools:
class ToolCollection:
    def __init__(self, *tools: BaseTool):
        self.tools = tools
        self.tool_map = {tool.name: tool for tool in tools}

    def to_params(self) -> List[Dict[str, Any]]:
        return [tool.to_param() for tool in self.tools]

    async def execute(
        self, *, name: str, tool_input: Dict[str, Any] = None
    ) -> ToolResult:
        tool = self.tool_map.get(name)
        if not tool:
            return ToolFailure(error=f"Tool {name} is invalid")
        try:
            result = await tool(**tool_input)
            return result
        except ToolError as e:
            return ToolFailure(error=e.message)
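A brief usage sketch inside an async context, using the tool classes imported later in section 8.1; the "python_execute" name and its "code" argument are assumptions about the PythonExecute tool rather than something shown above:

tools = ToolCollection(PythonExecute(), WebSearch())

schema = tools.to_params()  # JSON schema list passed to the LLM as `tools=`
result = await tools.execute(
    name="python_execute",
    tool_input={"code": "print(2 ** 10)"},
)
# Unknown names return ToolFailure(error="Tool ... is invalid") instead of raising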
3.3 The Tool-Calling Agent
[ToolCallAgent](file:///D:/project/OpenManus/app/agent/toolcall.py#L19-L250) is the base class for agents that support tool calls:
class ToolCallAgent(ReActAgent):
    available_tools: ToolCollection = ToolCollection(
        CreateChatCompletion(), Terminate()
    )
    tool_choices: TOOL_CHOICE_TYPE = ToolChoice.AUTO
    special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])

    async def think(self) -> bool:
        """Process current state and decide next actions using tools"""
        # Request a response with the tool options attached
        response = await self.llm.ask_tool(
            messages=self.messages,
            system_msgs=(
                [Message.system_message(self.system_prompt)]
                if self.system_prompt
                else None
            ),
            tools=self.available_tools.to_params(),
            tool_choice=self.tool_choices,
        )
        # Process the response and any tool calls
        # ...

    async def act(self) -> str:
        """Execute tool calls and handle their results"""
        if not self.tool_calls:
            if self.tool_choices == ToolChoice.REQUIRED:
                raise ValueError(TOOL_CALL_REQUIRED)
            return self.messages[-1].content or "No content or commands to execute"

        results = []
        for command in self.tool_calls:
            result = await self.execute_tool(command)
            # Handle each tool's execution result
            # ...
4. LLM Interface Layer
4.1 Multi-Model Support
The [LLM](file:///D:/project/OpenManus/app/llm.py#L235-L766) class supports multiple large language model backends:
class LLM:
    def __init__(
        self, config_name: str = "default", llm_config: Optional[LLMSettings] = None
    ):
        llm_config = llm_config or config.llm
        llm_config = llm_config.get(config_name, llm_config["default"])
        self.model = llm_config.model
        self.max_tokens = llm_config.max_tokens
        self.temperature = llm_config.temperature
        self.api_type = llm_config.api_type
        self.api_key = llm_config.api_key
        self.api_version = llm_config.api_version
        self.base_url = llm_config.base_url

        # Initialize the client according to the API type
        if self.api_type == "azure":
            self.client = AsyncAzureOpenAI(
                base_url=self.base_url,
                api_key=self.api_key,
                api_version=self.api_version,
            )
        elif self.api_type == "aws":
            self.client = BedrockClient()
        else:
            self.client = AsyncOpenAI(api_key=self.api_key, base_url=self.base_url)
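The config_name argument selects one of the named sections of the TOML file shown in section 5.2, falling back to the default settings when that name is missing. A brief sketch:

default_llm = LLM()                     # backed by the [llm] section
vision_llm = LLM(config_name="vision")  # backed by [llm.vision]; falls back to [llm] if absent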
4.2 The Tool-Calling Interface
The [ask_tool](file:///D:/project/OpenManus/app/llm.py#L81-L135) method supports tool calls:
async def ask_tool(
    self,
    messages: List[Union[dict, Message]],
    system_msgs: Optional[List[Union[dict, Message]]] = None,
    timeout: int = 300,
    tools: Optional[List[dict]] = None,
    tool_choice: TOOL_CHOICE_TYPE = ToolChoice.AUTO,
    temperature: Optional[float] = None,
    **kwargs,
) -> ChatCompletionMessage | None:
    # Format the messages
    # Count the tokens
    # Validate the tools and the tool_choice value
    # Send the request and process the response
    ...
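Combining this with the ToolCollection from section 3.2, a think-style call looks roughly like the sketch below; the Message and ToolChoice names and the shape of the returned tool calls are assumptions based on the OpenAI-compatible ChatCompletionMessage type rather than code shown above.

response = await llm.ask_tool(
    messages=[Message.user_message("What is 2 ** 10? Use the Python tool.")],
    system_msgs=[Message.system_message("You are a precise assistant.")],
    tools=tools.to_params(),
    tool_choice=ToolChoice.AUTO,
)
for call in response.tool_calls or []:
    print(call.function.name, call.function.arguments)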
5. Configuration Management
5.1 Configuration Structure
The [Config](file:///D:/project/OpenManus/app/config.py#L273-L340) class manages configuration as a singleton:
class Config:
    _instance = None
    _lock = threading.Lock()
    _initialized = False

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    @property
    def llm(self) -> Dict[str, LLMSettings]:
        return self._config.llm

    @property
    def sandbox(self) -> SandboxSettings:
        return self._config.sandbox

    @property
    def browser_config(self) -> Optional[BrowserSettings]:
        return self._config.browser_config
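Other modules read settings through the shared singleton; the sketch below assumes app.config exposes a module-level config instance, which the llm_config = llm_config or config.llm line in section 4.1 implies.

from app.config import config

default_model = config.llm["default"].model  # e.g. "gpt-4o" from the [llm] section
if config.sandbox.use_sandbox:
    print(f"Running tools in container image {config.sandbox.image}")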
5.2 The TOML Configuration File
The configuration file uses the TOML format and supports multiple model configurations:
# Global LLM configuration
[llm]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..."
max_tokens = 4096
temperature = 0.0

# Optional model-specific LLM configuration
[llm.vision]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..."

# Browser configuration
[browser]
headless = false
disable_security = true

# Search configuration
[search]
engine = "Google"
fallback_engines = ["DuckDuckGo", "Baidu", "Bing"]
6. Multi-Agent Collaboration
6.1 The Flow Factory Pattern
[FlowFactory](file:///D:/project/OpenManus/app/flow/flow_factory.py#L15-L30) uses the factory pattern to create different types of flows:
class FlowFactory:
    @staticmethod
    def create_flow(
        flow_type: FlowType,
        agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]],
        **kwargs,
    ) -> BaseFlow:
        flows = {
            FlowType.PLANNING: PlanningFlow,
        }
        flow_class = flows.get(flow_type)
        if not flow_class:
            raise ValueError(f"Unknown flow type: {flow_type}")
        return flow_class(agents, **kwargs)
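A usage sketch that wires an agent into a planning flow and runs it; CustomAgent is the example agent defined later in section 8.1, and the import path for FlowType is assumed to sit next to FlowFactory.

import asyncio

from app.flow.flow_factory import FlowFactory, FlowType


async def main() -> None:
    flow = FlowFactory.create_flow(
        flow_type=FlowType.PLANNING,
        agents={"custom": CustomAgent()},
    )
    print(await flow.execute("Collect and summarize this week's AI framework releases"))


asyncio.run(main())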
6.2 The Planning Flow
[PlanningFlow](file:///D:/project/OpenManus/app/flow/planning.py#L49-L442) implements a planning flow for multi-agent collaboration:
class PlanningFlow(BaseFlow):
    def __init__(
        self, agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]], **data
    ):
        # Initialize the agents
        super().__init__(agents, **data)

        # Set the executor keys
        if not self.executor_keys:
            self.executor_keys = list(self.agents.keys())

    async def execute(self, input_text: str) -> str:
        """Execute the planning flow with agents."""
        try:
            # Create the initial plan
            if input_text:
                await self._create_initial_plan(input_text)

            result = ""
            while True:
                # Get information about the current step
                self.current_step_index, step_info = await self._get_current_step_info()

                # Exit condition
                if self.current_step_index is None:
                    result += await self._finalize_plan()
                    break

                # Execute the current step
                step_type = step_info.get("type") if step_info else None
                executor = self.get_executor(step_type)
                step_result = await self._execute_step(executor, step_info)
                result += step_result + "\n"

                # Check whether the executing agent wants to terminate
                if hasattr(executor, "state") and executor.state == AgentState.FINISHED:
                    break

            return result
        except Exception as e:
            logger.error(f"Error in PlanningFlow: {str(e)}")
            return f"Execution failed: {str(e)}"
7. Execution Environment and Sandbox
7.1 Sandbox Configuration
[SandboxSettings](file:///D:/project/OpenManus/app/config.py#L131-L143) defines the sandbox configuration:
class SandboxSettings(BaseModel):
    """Configuration for the execution sandbox"""

    use_sandbox: bool = Field(False, description="Whether to use the sandbox")
    image: str = Field("python:3.12-slim", description="Base image")
    work_dir: str = Field("/workspace", description="Container working directory")
    memory_limit: str = Field("512m", description="Memory limit")
    cpu_limit: float = Field(1.0, description="CPU limit")
    timeout: int = Field(300, description="Default command timeout (seconds)")
    network_enabled: bool = Field(
        False, description="Whether network access is allowed"
    )
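These fields map directly onto a section of the TOML file from section 5.2; the [sandbox] table name is an assumption based on the Config.sandbox property, and the values below simply spell out the defaults with the sandbox switched on.

# Sandboxed execution
[sandbox]
use_sandbox = true
image = "python:3.12-slim"
work_dir = "/workspace"
memory_limit = "512m"
cpu_limit = 1.0
timeout = 300
network_enabled = false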
7.2 The Sandbox Client
[SANDBOX_CLIENT](file:///D:/project/OpenManus/app/sandbox/client.py#L115-L115) provides the interface for sandbox operations:
class SandboxClient:
    async def execute_command(
        self,
        command: str,
        timeout: Optional[int] = None,
        work_dir: Optional[str] = None,
    ) -> SandboxResult:
        """Execute a command in the sandbox environment."""
        # Run the command inside the container
        # Handle timeouts and errors
        # Return the execution result
        ...
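A usage sketch against the interface outlined above; the execute_command signature and the module-level SANDBOX_CLIENT instance follow this outline and the link above, not a verified API.

from app.sandbox.client import SANDBOX_CLIENT

result = await SANDBOX_CLIENT.execute_command(
    "python -c 'print(2 ** 10)'",
    timeout=60,
    work_dir="/workspace",
)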
8. Practical Examples
8.1 Implementing a Custom Agent
from pydantic import Field

from app.agent.toolcall import ToolCallAgent
from app.tool import ToolCollection
from app.tool.python_execute import PythonExecute
from app.tool.web_search import WebSearch


class CustomAgent(ToolCallAgent):
    name: str = "custom_agent"
    description: str = "Example custom agent"

    available_tools: ToolCollection = Field(
        default_factory=lambda: ToolCollection(
            PythonExecute(),
            WebSearch(),
        )
    )

    system_prompt: str = """You are an agent specialized in data analysis tasks.
You can use the Python execution and web search tools to complete them."""

    async def step(self) -> str:
        """Run a single step."""
        thought = await self.think()
        if thought:
            action = await self.act()
            return action
        return "Task complete"
8.2 Implementing a Tool Extension
from app.tool.base import BaseTool


class CustomTool(BaseTool):
    name: str = "custom_tool"
    description: str = "Example custom tool"
    parameters: dict = {
        "type": "object",
        "properties": {
            "param1": {
                "type": "string",
                "description": "Example parameter 1",
            },
            "param2": {
                "type": "integer",
                "description": "Example parameter 2",
            },
        },
        "required": ["param1"],
    }

    async def execute(self, param1: str, param2: int = 10) -> str:
        """Run the tool logic."""
        # Implement the actual functionality here
        result = f"Processed parameters: {param1}, {param2}"
        return result
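The new tool plugs into the existing machinery without further changes, as this quick sketch inside an async context shows:

tools = ToolCollection(CustomTool())

print(CustomTool().to_param())  # function-calling schema handed to the LLM
result = await tools.execute(name="custom_tool", tool_input={"param1": "hello"})
print(result)  # -> Processed parameters: hello, 10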
9. Performance Optimization Strategies
9.1 Token Counting Optimization
class TokenCounter:
    def count_message_tokens(self, messages: List[dict]) -> int:
        """Count the total number of tokens in a list of messages."""
        total_tokens = self.FORMAT_TOKENS  # Base tokens for the message format
        for message in messages:
            tokens = self.BASE_MESSAGE_TOKENS  # Base tokens per message

            # Tokens for the role
            tokens += self.count_text(message.get("role", ""))

            # Tokens for the content
            if "content" in message:
                tokens += self.count_content(message["content"])

            # Tokens for tool calls
            if "tool_calls" in message:
                tokens += self.count_tool_calls(message["tool_calls"])

            total_tokens += tokens
        return total_tokens
9.2 Caching Mechanism
from functools import lru_cache


class LLM:
    @lru_cache(maxsize=128)
    def count_tokens(self, text: str) -> int:
        """Count the number of tokens in a text."""
        if not text:
            return 0
        return len(self.tokenizer.encode(text))
10. Error Handling and Logging
10.1 Exception Handling
class ToolCallAgent(ReActAgent):
    async def execute_tool(self, command: ToolCall) -> str:
        """Execute a single tool call and handle errors."""
        name = command.function.name
        try:
            # Parse the arguments
            args = json.loads(command.function.arguments or "{}")

            # Execute the tool
            result = await self.available_tools.execute(name=name, tool_input=args)

            # Format the result
            observation = (
                f"Observed output of cmd `{name}` executed:\n{str(result)}"
                if result
                else f"Cmd `{name}` completed with no output"
            )
            return observation
        except json.JSONDecodeError:
            error_msg = f"Error parsing arguments for {name}: Invalid JSON format"
            logger.error(
                f"📝 Failed to parse arguments for '{name}': invalid JSON, arguments: {command.function.arguments}"
            )
            return f"Error: {error_msg}"
        except Exception as e:
            error_msg = f"⚠️ Tool '{name}' encountered a problem: {str(e)}"
            logger.exception(error_msg)
            return f"Error: {error_msg}"
Summary
The OpenManus architecture reflects modern software engineering best practices:
- Modular design: each component has a clear responsibility, which eases maintenance and extension
- Programming to interfaces: abstract base classes define standard interfaces
- Factory pattern: complex objects are created through factory classes
- Strategy pattern: multiple LLM backends and tool-choice strategies are supported
- State management: agent state is handled through a state machine
- Error handling: thorough exception handling and logging throughout
With a solid grasp of these design patterns and implementation details, developers can build more capable AI applications on top of OpenManus and contribute to the framework's further development.
Practical Recommendations
- Understand the core concepts: build a deep understanding of agents, tools, and flows
- Follow the existing design patterns when extending the framework
- Use caching judiciously: cache operations that are called frequently and return stable results
- Implement thorough error handling in custom components
- Watch performance: pay attention to token counting and execution efficiency