# Magentic-One
> [!IMPORTANT]
> **Note (December 22nd, 2024):** We recommend using the [Magentic-One API](https://github.com/microsoft/autogen/tree/main/python/packages/autogen-ext/src/autogen_ext/teams/magentic_one.py) as the preferred way to interact with Magentic-One. The API provides a more streamlined and robust interface for integrating Magentic-One into your projects.

> [!CAUTION]
> Using Magentic-One involves interacting with a digital world designed for humans, which carries inherent risks. To minimize these risks, consider the following precautions:
>
> 1. **Use Containers**: Run all tasks in docker containers to isolate the agents and prevent direct system attacks.
> 2. **Virtual Environment**: Use a virtual environment to run the agents and prevent them from accessing sensitive data.
> 3. **Monitor Logs**: Closely monitor logs during and after execution to detect and mitigate risky behavior.
> 4. **Human Oversight**: Run the examples with a human in the loop to supervise the agents and prevent unintended consequences.
> 5. **Limit Access**: Restrict the agents' access to the internet and other resources to prevent unauthorized actions.
> 6. **Safeguard Data**: Ensure that the agents do not have access to sensitive data or resources that could be compromised. Do not share sensitive information with the agents.
>
> Be aware that agents may occasionally attempt risky actions, such as recruiting humans for help or accepting cookie agreements without human involvement. Always ensure agents are monitored and operate within a controlled environment to prevent unintended consequences. Moreover, be cautious that Magentic-One may be susceptible to prompt injection attacks from webpages.

> [!NOTE]
> This code is currently being ported to AutoGen AgentChat. If you want to build on top of Magentic-One, we recommend waiting for the port to be completed. In the meantime, you can use this codebase to experiment with Magentic-One.

We are introducing Magentic-One, our new generalist multi-agent system for solving open-ended web and file-based tasks across a variety of domains. Magentic-One represents a significant step towards developing agents that can complete tasks that people encounter in their work and personal lives.
Find additional information about Magentic-One in our [blog post](https://aka.ms/magentic-one-blog) and [technical report](https://arxiv.org/abs/2411.04468).

> _Example_: The figure above illustrates the Magentic-One multi-agent team completing a complex task from the GAIA benchmark. Magentic-One's Orchestrator agent creates a plan, delegates tasks to other agents, and tracks progress towards the goal, dynamically revising the plan as needed. The Orchestrator can delegate tasks to a FileSurfer agent to read and handle files, a WebSurfer agent to operate a web browser, or a Coder or Computer Terminal agent to write or execute code, respectively.
## Architecture

Magentic-One is based on a multi-agent architecture in which a lead Orchestrator agent is responsible for high-level planning, directing other agents, and tracking task progress. The Orchestrator begins by creating a plan to tackle the task, gathering needed facts and educated guesses in a Task Ledger that it maintains. At each step of its plan, the Orchestrator creates a Progress Ledger where it self-reflects on task progress and checks whether the task is completed. If the task is not yet completed, it assigns one of Magentic-One's other agents a subtask to complete. After the assigned agent completes its subtask, the Orchestrator updates the Progress Ledger and continues in this way until the task is complete. If the Orchestrator finds that progress is not being made for enough steps, it can update the Task Ledger and create a new plan. This is illustrated in the figure above; the Orchestrator's work is thus divided into an outer loop, where it updates the Task Ledger, and an inner loop, where it updates the Progress Ledger.
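
The two-loop control flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual Orchestrator implementation: the `TaskLedger` and `ProgressLedger` classes, the `run_orchestrator` function, the `max_stalls` and `max_replans` parameters, and the toy stubs are all hypothetical names introduced here, and the planning and reflection steps (LLM calls in the real system) are stubbed out as plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TaskLedger:
    """Outer-loop state: the task, gathered facts, and the current plan."""
    task: str
    facts: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)


@dataclass
class ProgressLedger:
    """Inner-loop state: the Orchestrator's reflection at a single step."""
    is_complete: bool
    is_progressing: bool
    next_agent: str = ""
    instruction: str = ""


def run_orchestrator(
    task: str,
    agents: Dict[str, Callable[[str], str]],
    make_plan: Callable[[str, List[str]], List[str]],
    reflect: Callable[[TaskLedger, List[str]], ProgressLedger],
    max_stalls: int = 2,   # hypothetical stall threshold
    max_replans: int = 2,  # hypothetical cap on new plans
) -> List[str]:
    history: List[str] = []
    for _ in range(max_replans + 1):
        # Outer loop: (re)create the Task Ledger and the plan.
        ledger = TaskLedger(task=task, plan=make_plan(task, history))
        stalls = 0
        while True:
            # Inner loop: reflect on progress, then delegate a subtask.
            progress = reflect(ledger, history)
            if progress.is_complete:
                return history
            if progress.is_progressing:
                stalls = 0
            else:
                stalls += 1
                if stalls >= max_stalls:
                    break  # stalled for too long: return to the outer loop
            result = agents[progress.next_agent](progress.instruction)
            history.append(f"{progress.next_agent}: {result}")
    return history


# Toy stand-ins for the planning and reflection steps.
def toy_plan(task: str, history: List[str]) -> List[str]:
    return ["write code", "run code"]


def toy_reflect(ledger: TaskLedger, history: List[str]) -> ProgressLedger:
    step = len(history)
    if step >= len(ledger.plan):
        return ProgressLedger(is_complete=True, is_progressing=True)
    agent = "Coder" if step == 0 else "ComputerTerminal"
    return ProgressLedger(False, True, agent, ledger.plan[step])


toy_agents: Dict[str, Callable[[str], str]] = {
    "Coder": lambda instruction: f"done ({instruction})",
    "ComputerTerminal": lambda instruction: f"done ({instruction})",
}

if __name__ == "__main__":
    for line in run_orchestrator("demo task", toy_agents, toy_plan, toy_reflect):
        print(line)
```

In this sketch, breaking out of the inner `while` loop is what triggers a new plan, mirroring the description of the Orchestrator revising the Task Ledger when progress stalls.
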
Overall, Magentic-One consists of the following agents:
- Orchestrator: This is the lead agent responsible for task decomposition and planning, directing other agents in executing subtasks, tracking overall progress, and taking corrective actions as needed.
- WebSurfer: This is an LLM-based agent that is proficient in commanding and managing the state of a Chromium-based web browser. With each incoming request, the WebSurfer performs an action on the browser and then reports on the new state of the web page. The action space of the WebSurfer includes navigation (e.g., visiting a URL, performing a web search); web page actions (e.g., clicking and typing); and reading actions (e.g., summarizing or answering questions). The WebSurfer relies on the accessibility tree of the browser and on set-of-marks prompting to perform its actions.
- FileSurfer: This is an LLM-based agent that commands a markdown-based file preview application to read local files of most types. The FileSurfer can also perform common navigation tasks such as listing the contents of directories and navigating a folder structure.
- Coder: This is an LLM-based agent specialized through its system prompt for writing code, analyzing information collected from the other agents, or creating new artifacts.
- ComputerTerminal: Finally, ComputerTerminal provides the team with access to a console shell where the Coder’s programs can be executed, and where new programming libraries can be installed.

Together, Magentic-One’s agents provide the Orchestrator with the tools and capabilities that it needs to solve a broad variety of open-ended problems, as well as the ability to autonomously adapt to, and act in, dynamic and ever-changing web and file-system environments.

While the default multimodal LLM we use for all agents is GPT-4o, Magentic-One is model-agnostic and can incorporate heterogeneous models to support different capabilities or meet different cost requirements when getting tasks done. For example, it can use different LLMs and SLMs, and their specialized versions, to power different agents. We recommend a strong reasoning model, such as GPT-4o, for the Orchestrator agent. In a different configuration of Magentic-One, we also experiment with using OpenAI o1-preview for the outer loop of the Orchestrator and for the Coder, while other agents continue to use GPT-4o.
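
As a rough illustration of the per-agent model assignment described above, the mapping below uses a default model with a small table of overrides. The role keys and the `model_for` helper are hypothetical and are not part of the Magentic-One codebase; only the model names and the pairing of o1-preview with the Orchestrator's outer loop and the Coder come from the text.

```python
from typing import Dict

# Default model for every agent, per the text above.
DEFAULT_MODEL = "gpt-4o"

# Overrides for the alternative configuration described above.
# The role keys are hypothetical labels chosen for this sketch.
MODEL_OVERRIDES: Dict[str, str] = {
    "Orchestrator (outer loop)": "o1-preview",
    "Coder": "o1-preview",
}


def model_for(role: str, overrides: Dict[str, str] = MODEL_OVERRIDES) -> str:
    """Return the model assigned to a role, falling back to the default."""
    return overrides.get(role, DEFAULT_MODEL)


if __name__ == "__main__":
    for role in ("Orchestrator (outer loop)", "Coder", "WebSurfer", "FileSurfer"):
        print(f"{role}: {model_for(role)}")
```
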
### Logging in Team One Agents
Team One agents can emit several log events that can be consumed by a log handler (see the example log handler in [utils.py](src/autogen_magentic_one/utils.py)). The currently emitted events are:
- OrchestrationEvent: emitted by an [Orchestrator](src/autogen_magentic_one/agents/base_orchestrator.py) agent.
- WebSurferEvent: emitted by a [WebSurfer](src/autogen_magentic_one/agents/multimodal_web_surfer/multimodal_web_surfer.py) agent.

In addition, developers can handle and process logs generated by the AutoGen core library (e.g., LLMCallEvent). See the example log handler in [utils.py](src/autogen_magentic_one/utils.py) for how this can be implemented. By default, the logs are written to a file named `log.jsonl`, which can be configured as a parameter to the defined log handler. These logs can be parsed to retrieve data about agent actions.
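
The following sketch shows one way a handler could write events to a JSON Lines file and how such a file could be parsed afterwards, using only the Python standard library. It is not the handler from `utils.py`: the `JsonlFileHandler` class, the `count_events` function, and the `type` and `message` field names are assumptions made for illustration, and the real `log.jsonl` schema may differ.

```python
import json
import logging
from collections import Counter


class JsonlFileHandler(logging.Handler):
    """Append each log record to a file as one JSON object per line."""

    def __init__(self, path: str = "log.jsonl") -> None:
        super().__init__()
        self.path = path

    def emit(self, record: logging.LogRecord) -> None:
        entry = {
            # Assumption: events are logged as objects, so the class name
            # of the message identifies the event type.
            "type": type(record.msg).__name__,
            "message": record.getMessage(),
            "timestamp": record.created,
        }
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")


def count_events(path: str = "log.jsonl") -> Counter:
    """Count log entries per event type in a JSON Lines file."""
    counts: Counter = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                counts[json.loads(line).get("type", "unknown")] += 1
    return counts
```

Attaching the handler with `logging.getLogger(name).addHandler(JsonlFileHandler("log.jsonl"))` and later calling `count_events("log.jsonl")` would give a per-event-type tally; the logger name to attach to depends on the library and is not specified here.
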
# Setup and Usage
You can install the Magentic-One package and then run the example code to see how the agents work together to accomplish a task.
1. Clone the code and install the package:
The easiest way to install is with the [uv package installer](https://docs.astral.sh/uv/getting-started/installation/), which must be installed separately. Using uv is optional, however.
Clone the repository, then use uv to set up and activate a virtual environment:
```bash
git clone https://github.com/microsoft/autogen.git
cd autogen/python
uv sync --all-extras
source .venv/bin/activate
```
For Windows, run `.venv\Scripts\activate` to activate the environment.
2. I