MindAgent, A Framework That Enables Multi-agent Collaboration, Is Now Available!

Agent Simulation 10/10/2023

3 main points
✔️ Designed MindAgent, a multi-agent planning framework to facilitate LLM collaboration
✔️ Proposed CUISINEWORLD, a benchmark for evaluating LLM multi-agent planning performance, and the Collaboration Score (CoS) evaluation metric
✔️ Comparative experiments demonstrate that work efficiency and task accomplishment rates increase as the number of agents performing work increases

MindAgent: Emergent Gaming Interaction
written by Ran Gong, Qiuyuan Huang, Xiaojian Ma, Hoi Vo, Zane Durante, Yusuke Noda, Zilong Zheng, Song-Chun Zhu, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao
(Submitted on 18 Sep 2023 (v1), last revised 19 Sep 2023 (this version, v2))
Comments: The first three authors contributed equally. 28 pages
Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Multiagent Systems (cs.MA)

code：

The images used in this article are from the paper, the introductory slides, or were created based on them.

Introduction

In recent years, much attention has focused on the planning capabilities of large language model (LLM) multi-agent systems, namely the ability to perform complex scheduling and to coordinate agents so that they complete sophisticated tasks.

However, compared with single-agent planning, which existing research covers widely, multi-agent planning is far more complex because the action space grows exponentially with the number of agents.
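To make the growth concrete: if each agent chooses one of k actions per step, N agents acting at the same time have k^N joint actions. The sketch below is only an illustration. The five action names are placeholders assumed for this example, not taken from the paper, and real actions would also take arguments, which makes the space larger still.

```python
from itertools import product

# Hypothetical per-agent action set, assumed for illustration only.
ACTIONS = ["goto", "get", "put", "activate", "noop"]

def joint_action_count(num_agents: int, actions=ACTIONS) -> int:
    """Number of joint actions when every agent picks one action per step."""
    return len(actions) ** num_agents

def enumerate_joint_actions(num_agents: int, actions=ACTIONS):
    """Explicitly list every joint action (only feasible for small inputs)."""
    return list(product(actions, repeat=num_agents))

for n in range(1, 5):
    # Prints 5, 25, 125, 625 for 1 to 4 agents.
    print(n, joint_action_count(n))
```

With only five actions, going from one agent to four already multiplies the number of options a planner must consider by 125.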

For this reason, benchmarks for multi-agent settings remain scarce, even though many game frameworks have been developed for the single-agent setting.

Against this background, this article introduces a paper that designs MindAgent, a multi-agent planning framework that facilitates collaboration among multiple agents driven by an LLM, proposes a new benchmark, CUISINEWORLD, together with an evaluation metric, the Collaboration Score (CoS), and reports comprehensive experiments under various conditions.

CUISINEWORLD

To evaluate LLM performance in multi-agent planning, this paper proposes a benchmark called CUISINEWORLD, shown on the left in the figure below.

Let's take a closer look.

Task

CUISINEWORLD is a game that mimics a virtual kitchen environment, in which multiple agents are directed to use a variety of utensils and ingredients to fulfill as many orders as possible within a limited time frame.

Orders are drawn from 33 different dishes made from 27 different ingredients, ranging from tuna sashimi, which only requires cutting the tuna meat, to dishes like pork pasta, which require a variety of cooking utensils.

These dishes are grouped based on the difficulty of cooking them, resulting in 12 game levels.

At the start of a task, a food order is added to the task list, and when a matching dish is placed on the dining table, the task is considered complete and the dish is removed from the list.

Conversely, if the time limit (which varies with the complexity of the dish) is reached first, the task is considered failed and the dish is likewise removed from the list.

This design requires LLMs to properly plan for multiple agents to maximize overall productivity, as new orders come in rapidly while existing orders must be cooked before time runs out.
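The task-list rules described above can be sketched in a few lines. This is a minimal illustration and not the benchmark's actual implementation: the class names, the method names, and the time limits of 10 and 25 steps are all assumptions made for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    dish: str
    deadline: int  # step at which the order expires

@dataclass
class TaskList:
    orders: list = field(default_factory=list)
    completed: int = 0
    failed: int = 0

    def add(self, dish: str, now: int, time_limit: int) -> None:
        """A new food order joins the list with a dish-specific time limit."""
        self.orders.append(Order(dish, now + time_limit))

    def serve(self, dish: str) -> bool:
        """A dish placed on the dining table completes the oldest matching order."""
        for order in self.orders:
            if order.dish == dish:
                self.orders.remove(order)  # safe: we return right after removing
                self.completed += 1
                return True
        return False

    def expire(self, now: int) -> None:
        """Orders whose time limit has been reached count as failed and are removed."""
        remaining = [o for o in self.orders if o.deadline > now]
        self.failed += len(self.orders) - len(remaining)
        self.orders = remaining

# Hypothetical time limits, chosen only for the example.
tasks = TaskList()
tasks.add("tuna sashimi", now=0, time_limit=10)
tasks.add("pork pasta", now=0, time_limit=25)
tasks.serve("tuna sashimi")   # served in time: completed
tasks.expire(now=25)          # pork pasta reaches its limit: failed
print(tasks.completed, tasks.failed)
```

Either way the order leaves the list, so the counts of completed and failed orders are what distinguish a good plan from a poor one.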

Human-Agent Collaboration

CUISINEWORLD is designed with a text interface, allowing for collaboration between agents as well as between humans and agents.

In addition to controlling the player with a standard keyboard, a VR device can also be used, as shown in the bottom image below.

This VR functionality allows users to physically move in-game elements such as players and cooking utensils in a 3D environment and collaborate with agents for more immersive and realistic interactions.

Collaboration Score (CoS)

This paper proposes the Collaboration Score (CoS), a metric that evaluates how well LLMs can plan for multiple agents and complete food orders in CUISINEWORLD.

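The CoS is defined by the following equation:

$$
\mathrm{CoS} = \frac{1}{M} \sum_{i=1}^{M} \frac{\#\,\text{completed tasks}\,[\tau_{\text{int},(i)}]}{\#\,\text{completed tasks}\,[\tau_{\text{int},(i)}] + \#\,\text{failed tasks}\,[\tau_{\text{int},(i)}]}
$$

where $\tau_{\text{int}}$ is the task interval, that is, the number of steps after which a new order is added to the task list in CUISINEWORLD, within an episode capped at a maximum of T steps, and M is the total number of task intervals evaluated (M = 5 by default).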

Thus, CoS represents the average completion rate of tasks across various conditions with different task intervals within CUISINEWORLD, with higher scores indicating more efficient collaboration among multi-agents.
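Following the description above (the average task completion rate over the evaluated task intervals), the CoS can be computed as in the sketch below. The function name, the input layout, and the numbers are assumptions made for this example. Scoring an interval with no finished orders as 0.0 is also a choice of this sketch, not something the article specifies.

```python
def collaboration_score(results):
    """Average completion rate over the evaluated task intervals.

    `results` maps each task interval to a (completed, failed) pair.
    """
    rates = []
    for completed, failed in results.values():
        total = completed + failed
        # Assumption of this sketch: an interval with no finished orders scores 0.0.
        rates.append(completed / total if total else 0.0)
    return sum(rates) / len(rates)

# Made-up counts for three task intervals (10, 20, and 30 steps).
example = {10: (6, 4), 20: (4, 1), 30: (3, 0)}
score = collaboration_score(example)
print(round(score, 3))  # 0.8, the mean of 0.6, 0.8, and 1.0
```

Because each interval contributes its own completion rate, a short interval with many failed orders pulls the score down even when the longer intervals are handled perfectly.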

MindAgent: Infrastructure For Gaming AI

The paper designs MindAgent, a multi-agent planning framework intended to facilitate collaboration among multiple agents driven by an LLM.

MindAgent's architecture is shown in the figure below.

As the figure shows, MindAgent's architecture consists of four modules: Planning Skills & Tool Use, Action, LLM, and Memory.

CUISINEWORLD's gaming environment requires a variety of planning skills and tools to complete tasks, and the Planning Skills & Tool Use module supplies these skills together with the related game information.

In addition, it converts the relevant game data into a structured text format that can be processed by the LLM.

The Action module extracts actions from text input and converts them into a domain-specific language (DSL). It is also responsible for validating the DSL so that it does not cause errors at runtime.
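As a rough illustration of what such an Action module has to do, the sketch below extracts call-like actions from free-form text and checks them before execution. The DSL shown here (action names, argument counts, and the `agent0` / `storage0` identifiers) is hypothetical and assumed only for this example; the article does not give the actual grammar.

```python
import re

# Hypothetical DSL: action name -> number of arguments. Not the paper's actual grammar.
ACTION_ARITY = {"goto": 2, "get": 3, "put": 2, "activate": 2, "noop": 1}
CALL_PATTERN = re.compile(r"(\w+)\(([^()]*)\)")

def extract_actions(llm_text: str):
    """Pull `name(arg, ...)` calls out of free-form LLM output."""
    actions = []
    for name, arg_str in CALL_PATTERN.findall(llm_text):
        args = [a.strip() for a in arg_str.split(",") if a.strip()]
        actions.append((name, args))
    return actions

def validate(action, agents, locations):
    """Return an error message, or None if the action is well-formed."""
    name, args = action
    if name not in ACTION_ARITY:
        return f"unknown action: {name}"
    if len(args) != ACTION_ARITY[name]:
        return f"{name} expects {ACTION_ARITY[name]} arguments, got {len(args)}"
    if args[0] not in agents:
        return f"unknown agent: {args[0]}"
    if len(args) > 1 and args[1] not in locations:
        return f"unknown location: {args[1]}"
    return None

text = "Plan: goto(agent0, storage0) then fly(agent1, blender0)."
parsed = extract_actions(text)
errors = [validate(a, {"agent0", "agent1"}, {"storage0", "blender0"}) for a in parsed]
print(errors)  # the first action passes, the second is rejected as unknown
```

Rejecting malformed actions before they reach the game is what keeps a single bad LLM reply from causing a runtime error.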

The LLM module is the dispatcher of the multi-agent system and makes decisions based on information sent by other modules.

The Memory module is responsible for recording the state of the environment and the state of the agent at each time step in a place called Memory History.

These modules, together with a design based on in-context learning (learning from task descriptions and input/output examples without updating model parameters), allow MindAgent to improve its planning capabilities in multi-agent settings.
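One way to picture how these pieces fit together is a loop that rebuilds the prompt at every step from instructions, a few demonstrations, recent history, and the current game state. The sketch below is an assumption-laden illustration, not MindAgent's code: the function names, the prompt layout, the three-step history window, and the `fake_llm` stand-in (used in place of a real API call) are all invented for this example.

```python
from typing import Callable, List

def build_prompt(instructions: str, demos: List[str], history: List[str], state: str) -> str:
    """Assemble an in-context prompt; no model parameters are updated."""
    parts = [instructions, "Examples:"] + demos
    parts += ["History:"] + history[-3:]  # keep only the most recent steps
    parts += ["Current state:", state, "Next actions:"]
    return "\n".join(parts)

def run_episode(llm: Callable[[str], str], states: List[str],
                instructions: str, demos: List[str]) -> List[str]:
    memory: List[str] = []  # stands in for the Memory History
    for step, state in enumerate(states):
        prompt = build_prompt(instructions, demos, memory, state)
        reply = llm(prompt)  # the LLM acts as the dispatcher
        memory.append(f"step {step}: state={state} | actions={reply}")
    return memory

def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call so that the sketch runs offline."""
    return "noop(agent0)"

log = run_episode(
    fake_llm,
    ["order: tuna sashimi", "agent0 holds tuna"],
    "Dispatch the agents.",
    ["goto(agent0, storage0)"],
)
print(len(log))  # 2
```

Everything the model "learns" during an episode lives in the prompt: the demonstrations and the recorded history change what it sees, while its weights stay fixed.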

Experiments and Results

The paper reports experiments in CUISINEWORLD under a variety of conditions to investigate LLM performance in a multi-agent setting. (All experiments were performed using the OpenAI API and the Anthropic API.)

The table below shows the task accomplishment rate and CoS for each different number of agents (2-4) at different task levels (very simple, simple, intermediate, and advanced).

As the table shows, across task levels, the more agents there are, the higher the work efficiency (CoS).

In addition, for all task levels and numbers of agents, the task accomplishment rate was low when the number of tasks was small and stabilized as the number of tasks increased.

We can speculate that this is a result of the MindAgent framework, designed with four modules and in-context learning, which improved the LLM's planning ability in multi-agent settings with each task.

The figure below shows the task success rates for different task levels (levels 0-9) and numbers of agents (2-4).

As the figure shows, across task levels, more agents generally mean higher work efficiency (a steeper slope in the graph).

These experimental results suggest that LLMs may be able to coordinate more agents and perform tasks more efficiently, an important insight for future research.

Summary

How was it? In this article, we introduced a paper that designed MindAgent, a multi-agent planning framework that facilitates collaboration among multiple agents driven by an LLM, proposed a new benchmark, CUISINEWORLD, and an evaluation metric, the Collaboration Score (CoS), and conducted comprehensive experiments under various conditions.

In addition to investigating LLM planning capabilities in multi-agent settings and obtaining results that inform future research, the paper also works toward future game systems in which humans and AI collaborate seamlessly, for example by enabling human play in a VR environment within CUISINEWORLD.

The authors say that the insights and discoveries in this paper may lead not only to technological advances but also to games that are more engaging and enjoyable for players, so there is hope that the work will drive progress in the gaming field.

For those who are interested, the details of CUISINEWORLD and the MindAgent architecture introduced here can be found in the original paper.


田中侑李
