Tools: Open Source Project of the Day (Part 10): AgentEvolver - Self-Evolving Agent System for Autonomous Learning and Evolution
2026-03-08
admin
## Introduction

"If AI agents could evolve like biological organisms — autonomously discovering problems, accumulating experience, and optimizing strategies — they would no longer be static tools, but truly 'growing' intelligent entities."

This is Part 10 of the "Open Source Project of the Day" series. Today we explore AgentEvolver (GitHub). Traditional AI agent training requires large amounts of manually annotated data, which is expensive and hard to scale. AgentEvolver uses three self-evolving mechanisms — Self-Questioning, Self-Navigating, and Self-Attributing — to let agents autonomously generate tasks, accumulate experience, and optimize strategies, achieving true self-evolution.
## Project Introduction

AgentEvolver is an efficient self-evolving agent system that enables AI agents to learn and evolve autonomously through three core mechanisms: Self-Questioning, Self-Navigating, and Self-Attributing.

## Project Background

The project was created in 2024 (based on GitHub activity) and is actively maintained. Its core purpose is to build agents that autonomously generate tasks, reuse experience, and optimize strategies, targeting agent training and research, complex environment interaction, automatic task generation, and experience reuse.

## Quick Start

AgentEvolver requires conda and the CUDA toolkit.

## Project Advantages

Why choose AgentEvolver? Compared with traditional agent training methods, its three self-evolving mechanisms enable autonomous task generation, experience reuse, and fine-grained credit assignment, significantly reducing training costs, improving efficiency, and delivering strong results on the AppWorld and BFCL-v3 benchmarks.

## Architecture Design

AgentEvolver adopts a service-oriented data-flow architecture that integrates environment sandboxes, LLMs, and experience management as modular services.

- Self-Questioning lets agents autonomously explore environments and generate diverse tasks.
- Self-Navigating improves exploration efficiency through experience summarization and reuse.
- Self-Attributing achieves efficient policy optimization through fine-grained credit assignment.

## Performance

AgentEvolver achieves strong results on the AppWorld and BFCL-v3 benchmarks, with significant improvements over baseline models. Ablation experiments show that the three mechanisms perform best in combination.

## Game Arena

AgentEvolver Game Arena extends the system to multi-agent social game environments. The training curve for the assassin role in Avalon shows that AgentEvolver can effectively improve agent performance on complex social-reasoning tasks.
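The advantage processor implements an algorithm the project calls ADCA-GRPO. Its details are beyond this post, but the group-relative baseline idea that GRPO-family methods build on can be sketched as follows. This is an illustrative sketch, not the project's actual implementation:

```python
# Background sketch: group-relative advantage estimation, the idea behind
# GRPO-family algorithms. NOT AgentEvolver's ADCA-GRPO implementation.
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each trajectory's reward against its sampling group."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 4 rollouts of the same task; above-average rollouts
# get positive advantage, below-average ones get negative advantage.
adv = group_relative_advantages([0.0, 1.0, 0.0, 1.0])
```

The group baseline removes the need for a learned value function; AgentEvolver's attribution mechanism then refines this coarse trajectory-level signal into step-level credit.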
## Environment Compatibility

AgentEvolver provides standardized interfaces for seamless integration with various external environments, and integrates ReMe for experience management.

## Who Should Use This

AgentEvolver is especially suitable for AI agent researchers and developers, researchers who need to train autonomous agents, enterprises looking to reduce agent training costs, technical professionals interested in self-evolving systems, and research teams that need multi-agent training.

It is not suitable for users who only need simple agents, scenarios that don't require autonomous learning, or developers without a reinforcement learning background.

Welcome to visit my personal homepage for more useful knowledge and interesting products.

## Installation
```bash
# Step 1: Install base dependencies
bash install.sh

# Step 2: Set up an environment service (AppWorld as example)
cd env_service/environments/appworld && bash setup.sh

# Step 3: Set up ReMe (optional, for experience management)
bash external/reme/install_reme.sh

# Step 4: Start training
conda activate agentevolver

# Method 1: Basic example (without ReMe)
python launcher.py --conf examples/basic.yaml --with-appworld

# Method 2: Full example (with ReMe: questioning + navigating + attributing)
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
```bash
# Copy the config file
cp example.env .env

# Edit .env to set your API key and conda path, then run training.

# Basic training (using the built-in environment dataset)
python launcher.py --conf examples/basic.yaml --with-appworld

# Full self-evolving training
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
```text
AgentEvolver System
├── Environment Service
│   ├── AppWorld environment
│   ├── BFCL-v3 environment
│   ├── Game Arena (Avalon, Diplomacy)
│   └── Custom environment interface
├── LLM Service
│   ├── Qwen2.5-7B/14B
│   ├── Other LLM support
│   └── API call wrapper
├── Experience Manager
│   ├── ReMe integration
│   ├── Experience pool management
│   └── Experience summarization and reuse
├── Task Manager
│   ├── Task exploration
│   ├── Synthetic task generation
│   └── Training data management
└── Advantage Processor
    ├── ADCA-GRPO algorithm
    ├── Credit assignment
    └── Policy optimization
```
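The Environment Service above exposes environments through a standardized interface. The project's actual specification lives in `env_service`; the sketch below shows the common reset/step contract such interfaces typically follow. All class and method names here are assumptions for illustration, not AgentEvolver's API:

```python
# Illustrative sketch of a standardized agent-environment interface.
# Names are assumptions, not AgentEvolver's actual API.
from dataclasses import dataclass

@dataclass
class StepResult:
    observation: str   # what the agent sees after acting
    reward: float      # scalar feedback for the step
    done: bool         # whether the episode has ended

class CounterEnv:
    """Toy environment: the agent must act 'increment' three times to win."""
    def reset(self) -> str:
        self.count = 0
        return "counter=0"

    def step(self, action: str) -> StepResult:
        if action == "increment":
            self.count += 1
        done = self.count >= 3
        return StepResult(f"counter={self.count}", 1.0 if done else 0.0, done)

# Rollout loop: the same loop works for any environment exposing reset/step,
# which is what lets one trainer drive AppWorld, BFCL-v3, or a custom env.
env = CounterEnv()
obs = env.reset()
done = False
while not done:
    result = env.step("increment")
    obs, done = result.observation, result.done
```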
```bash
# Install ReMe
bash external/reme/install_reme.sh

# Train with ReMe
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
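ReMe's internals are not covered in this post. As a toy illustration of the retrieve-and-reuse idea behind experience management, here is a minimal experience pool keyed by word overlap; this is purely illustrative and not ReMe's API:

```python
# Toy experience pool: store (task, lesson) pairs and retrieve the lessons
# whose task description shares the most words with a new task.
# Purely illustrative; real retrieval in ReMe is more sophisticated.

def word_overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

class ExperiencePool:
    def __init__(self):
        self.entries: list[tuple[str, str]] = []

    def add(self, task: str, lesson: str) -> None:
        self.entries.append((task, lesson))

    def retrieve(self, task: str, k: int = 1) -> list[str]:
        ranked = sorted(self.entries,
                        key=lambda e: word_overlap(e[0], task), reverse=True)
        return [lesson for _, lesson in ranked[:k]]

pool = ExperiencePool()
pool.add("send an email in AppWorld", "open the mail app before composing")
pool.add("book a cab ride", "confirm the pickup location first")
hints = pool.retrieve("reply to an email")  # most word overlap with the email task
```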
## What You'll Learn

- Core self-evolution mechanisms and how AgentEvolver works
- How the three mechanisms (Self-Questioning, Self-Navigating, Self-Attributing) work together
- How to set up and train a self-evolving agent system
- Service-oriented data-flow architecture design
- Performance on the AppWorld and BFCL-v3 benchmarks
- Comparative analysis with other agent training frameworks

## Prerequisites

- Basic understanding of AI agents and reinforcement learning
- Familiarity with Python programming
- Understanding of basic LLM concepts
- Basic knowledge of reinforcement learning training pipelines (optional)

## The Three Self-Evolution Mechanisms

- Self-Questioning: agents autonomously explore environments and generate diverse tasks, eliminating the cost of expensive manual dataset construction
- Self-Navigating: summarizes and reuses cross-task experience to guide higher-quality exploration
- Self-Attributing: handles long trajectories, discovers causal contributions of intermediate steps, and enables fine-grained, efficient policy optimization

## Problems Solved

- Agent training requires large amounts of manually annotated data at high cost
- Lack of autonomous exploration makes it hard to discover new tasks
- Experience cannot be effectively reused, leading to low exploration efficiency
- Credit assignment in long trajectories is imprecise, making policy optimization inefficient
- Integrating different environments is difficult without a unified training framework

## Who It's For

- AI agent researchers and developers
- Researchers needing to train autonomous agents
- Enterprises looking to reduce agent training costs
- Technical professionals interested in self-evolving systems

## Author/Team Introduction

- Background: Alibaba DAMO Academy ModelScope team, focused on AI model and system development
- Contributors: 10 contributors including @YunpengZhai, @TaoShuchang, @Xinji-Mai, and others
- Philosophy: building efficient, autonomous, evolvable AI agent systems
- Website: modelscope.github.io/AgentEvolver

## Project Stats

- ⭐ GitHub Stars: 1.1k+ (continuously growing)
- 🍴 Forks: 128+
- 📦 Version: actively updated
- 📄 License: Apache-2.0 (fully open source, free to use)
- 🌐 Website: modelscope.github.io/AgentEvolver
- 📚 Documentation: complete usage guides and API documentation
- 💬 Community: active GitHub Issues
- 📊 Paper: arXiv:2511.10395

## Project History

- 2024: project created; core self-evolution mechanisms built
- 2024–2025: refined the three mechanisms; added multi-environment support
- 2025: published paper; achieved strong performance on the AppWorld and BFCL-v3 benchmarks
- 2026: continuous optimization; added Game Arena multi-agent scenario support

## Core Purpose

- Autonomous task generation: through Self-Questioning, agents explore environments and generate diverse tasks
- Experience-guided exploration: through Self-Navigating, summarize and reuse cross-task experience to improve exploration efficiency
- Fine-grained credit assignment: through Self-Attributing, precisely identify the contributions of key steps in long trajectories
- Efficient policy optimization: building on fine-grained credit assignment, optimize policies more efficiently

## Use Cases

- Agent training and research
  - Training autonomously exploring AI agents
  - Studying the effectiveness of self-evolution mechanisms
  - Reducing agent training costs
- Complex environment interaction
  - AppWorld application-operation tasks
  - BFCL-v3 complex reasoning tasks
  - Multi-agent social games (Avalon, Diplomacy)
- Automatic task generation
  - Automatically discovering new tasks in the environment
  - Generating diverse training data
  - Reducing manual annotation costs
- Experience reuse and optimization
  - Cross-task experience summarization and reuse
  - Improved exploration efficiency
  - Accelerated agent learning

## Installation Prerequisites

- conda: environment management
- CUDA toolkit: GPU acceleration
- Python 3.x: primary programming language

## Core Features

- Self-Questioning: agents autonomously explore environments and generate diverse tasks, eliminating manual dataset construction costs
- Self-Navigating: summarizes and reuses cross-task experience to guide high-quality exploration
- Self-Attributing: handles long trajectories and discovers the causal contributions of intermediate steps, enabling fine-grained policy optimization
- Environment compatibility: standardized interfaces for seamless integration with external environments and tool APIs
- Flexible context management: built-in tools for managing multi-turn context and complex interaction logic
- Modular architecture: decoupled components that are easy to customize, extend, and upgrade
- Game Arena support: extends to multi-agent social game environments, supporting interaction, evaluation, and training

## Self-Questioning Mechanism

How it works:

- The agent autonomously explores the environment
- It discovers new tasks and challenges in the environment
- It automatically generates task descriptions and training data
- This eliminates expensive manual dataset construction

Advantages:

- High task diversity, covering varied scenarios in the environment
- No manual annotation needed, significantly reducing costs
- High task quality, grounded in actual environment exploration

## Self-Navigating Mechanism

How it works:

- Summarizes successful cross-task experiences
- Builds an experience knowledge base
- Reuses relevant experience in new tasks
- Guides higher-quality exploration

Advantages:

- Significantly improves exploration efficiency
- Reusable experience avoids repeated exploration
- Steers the agent toward higher-quality strategies

## Self-Attributing Mechanism

How it works:

- Analyzes the intermediate steps of long trajectories
- Identifies the causal contributions of key steps
- Assigns credit according to those contributions
- Enables fine-grained policy optimization

Advantages:

- Precise credit assignment avoids incorrect attribution
- High policy-optimization efficiency
- Supports long-trajectory processing

## AppWorld Benchmark

- Qwen2.5-7B + AgentEvolver: avg@8 32.4%, best@8 51.2%
- Qwen2.5-14B + AgentEvolver: avg@8 48.7%, best@8 69.4%

Improvements over baseline models:

- 7B model: avg@8 improved from 1.8% to 32.4%
- 14B model: avg@8 improved from 18.0% to 48.7%

## BFCL-v3 Benchmark

- Qwen2.5-7B + AgentEvolver: avg@8 57.9%, best@8 69.0%
- Qwen2.5-14B + AgentEvolver: avg@8 66.5%, best@8 76.7%

Improvements over baseline models:

- 7B model: avg@8 improved from 29.8% to 57.9%
- 14B model: avg@8 improved from 41.6% to 66.5%

## Mechanism Ablation Study

- +Questioning: significant performance improvement
- +Questioning & Navigating: further improves exploration efficiency
- +Questioning & Attributing: fine-grained optimization brings additional gains
- AgentEvolver (full): all three mechanisms together yield the best performance

## Game Arena Core Capabilities

- Web interface interaction: observe agent reasoning and communication in real time, or participate as a human player
- Scalable evaluation: run large-scale self-play or mixed-model tournaments, with configurable setups and leaderboards
- End-to-end training: directly train LLM agents with reinforcement learning methods (such as GRPO) in social game environments

## Supported Games

- Avalon: a social-reasoning game that tests agents' reasoning and communication abilities
- Diplomacy: a complex multi-agent strategy game that tests long-term planning and collaboration

## Environment Interface

- Standardized interface: unified environment interface specification
- Tool API integration: supports integration with various tools and APIs
- Custom environments: easy to add through the standard interface

## Supported Environments

- AppWorld: application-operation task environment
- BFCL-v3: complex reasoning task environment
- Game Arena: multi-agent social game environment
- Custom environments: integrated through the standard interface

## Experience Management (ReMe)

- Experience summarization: summarize successful cross-task experiences
- Experience pool management: store and retrieve experiences from the pool
- Experience reuse: apply relevant experience to new tasks

## Official Resources

- 🌟 GitHub: https://github.com/modelscope/AgentEvolver
- 🌐 Website: modelscope.github.io/AgentEvolver
- 📄 Paper: arXiv:2511.10395
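Pulling the three mechanisms together, the self-evolution loop described in this post can be caricatured as follows. This is a conceptual sketch of the control flow only; every function body is a stand-in, and none of the names come from the project's code:

```python
# Conceptual sketch of the self-evolution loop: question -> navigate -> attribute.
# All function bodies are stand-ins; only the control flow mirrors the idea.
import random

random.seed(0)  # deterministic toy run

def self_question(env_name: str, n: int) -> list[str]:
    # Self-Questioning: explore the environment and propose tasks.
    return [f"{env_name}-task-{i}" for i in range(n)]

def self_navigate(task: str, experience: list[str]) -> list[float]:
    # Self-Navigating: use past experience to guide a rollout.
    # Here we just fake per-step rewards, slightly better with more experience.
    bonus = 0.1 * len(experience)
    return [random.random() + bonus for _ in range(4)]

def self_attribute(step_rewards: list[float]) -> list[float]:
    # Self-Attributing: assign credit to steps (here: each step's share of the total).
    total = sum(step_rewards) or 1.0
    return [r / total for r in step_rewards]

experience: list[str] = []
for task in self_question("appworld", 3):
    step_rewards = self_navigate(task, experience)
    credits = self_attribute(step_rewards)
    # The highest-credit step becomes reusable experience for later tasks.
    best_step = max(range(len(credits)), key=credits.__getitem__)
    experience.append(f"{task}: step {best_step} mattered most")
```

In the real system each of these stand-ins is an LLM-driven service, and the credit signal feeds a GRPO-style policy update rather than a plain list append.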