File:The Landscape of Emerging AI Agent Architectures Masterman 2404.11584v1.pdf

From Chrysalis Archive
Jump to navigation Jump to search

The_Landscape_of_Emerging_AI_Agent_Architectures_Masterman_2404.11584v1.pdf (file size: 1.58 MB, MIME type: application/pdf)

Summary

THE LANDSCAPE OF EMERGING AI AGENT ARCHITECTURES FOR REASONING, PLANNING, AND TOOL CALLING: A SURVEY

ABSTRACT

This survey paper examines the recent advancements in AI agent implementations, with a focus on their ability to achieve complex goals that require enhanced reasoning, planning, and tool execution capabilities. The primary objectives of this work are to a) communicate the current capabilities and limitations of existing AI agent implementations, b) share insights gained from our observations of these systems in action, and c) suggest important considerations for future developments in AI agent design. We achieve this by providing overviews of single-agent and multi-agent architectures, identifying key patterns and divergences in design choices, and evaluating their overall impact on accomplishing a provided goal. Our contribution outlines key themes when selecting an agentic architecture, the impact of leadership on agent systems, agent communication styles, and key phases for planning, execution, and reflection that enable robust AI agent systems.
Keywords AI Agent · Agent Architecture · AI Reasoning · Planning · Tool Calling · Single Agent · Multi Agent · Agent Survey · LLM Agent · Autonomous Agent

Introduction

Since the launch of ChatGPT, many of the first wave of generative AI applications have been a variation of a chat over a corpus of documents using the Retrieval Augmented Generation (RAG) pattern. While there is a lot of activity in making RAG systems more robust, various groups are starting to build what the next generation of AI applications will look like, centralizing on a common theme: agents.
 
Beginning with investigations into recent foundation models like GPT-4 and popularized through open-source projects like AutoGPT and BabyAGI, the research community has experimented with building autonomous agent-based systems [19, 1].
 
As opposed to zero-shot prompting of a large language model where a user types into an open-ended text field and gets a result without additional input, agents allow for more complex interaction and orchestration. In particular, agentic systems have a notion of planning, loops, reflection and other control structures that heavily leverage the model’s inherent reasoning capabilities to accomplish a task end-to-end. Paired with the ability to use tools, plugins, and function calling, agents are empowered to do more general-purpose work.
 
Among the community, there is a current debate on whether single or multi-agent systems are best suited for solving complex tasks. While single agent architectures excel when problems are well-defined and feedback from other agent-personas or the user is not needed, multi-agent architectures tend to thrive more when collaboration and multiple distinct execution paths are required.
 

arXiv:2404.11584v1 [cs.AI] 17 Apr 2024

File history

Click on a date/time to view the file as it appeared at that time.

Date/TimeDimensionsUserComment
current07:38, 22 May 2024 (1.58 MB)Don86326 (talk | contribs)Category:AI Category:AI agents ;''THE LANDSCAPE OF EMERGING AI AGENT ARCHITECTURES FOR REASONING, PLANNING, AND TOOL CALLING: A SURVEY'' ABSTRACT This survey paper examines the recent advancements in AI agent implementations, with a focus on their ability to achieve complex goals that require enhanced reasoning, planning, and tool execution capabilities. The primary objectives of this work are to a) communicate the current capabilities and limitations of existing AI agent implementat...

The following page uses this file: