Dynamic Agents - Deliberate, Plan, Generate and Execute Tasks
In Part I of this series, we provided a brief introduction to software agents - programs that autonomously perform tasks or make decisions based on numerous factors, including their environment, user input and predefined rules. At one end of the spectrum, software agents are simple rules-based systems that only execute preconceived step-by-step instructions; at the other, software agents act as autonomous assistants, independently devising and executing plans on behalf of users or other programs.
Goal-directed (GD) agents are a subset of software agents and one of the more advanced forms; these agents are capable of dynamic planning and execution of tasks based on inputs or environmental conditions. Autonomous vehicles are just one example of goal-directed agents. They make many complex decisions in real time to navigate safely to a predefined destination. Because this decision-making includes identifying and reacting to traffic, road conditions and unexpected obstacles, autonomous vehicles must be dynamic and able to adjust their route as needed.
Goal-directed agents are also applied in other fields. In healthcare, such agents are used to provide medical consultations, including analyzing symptoms, providing recommendations and booking appointments. In e-commerce, agents analyze user behavior, purchase history and preferences to recommend other products that the user is more likely to purchase. In finance, agents are used to protect user accounts and financial data by monitoring transactions in real time to identify potentially fraudulent activity.
The Intersection of Goal-Directed Agents & Generative AI
With the release of GPT-3.5 and GPT-4, there has been a lot of buzz around generative AI this year. Generative AI models are a type of artificial intelligence that can generate new content, whether text (e.g., GPT-4, Llama Chat), images (Midjourney, DALL-E), or music (AIVA, Amper Music). Many organizations have begun to industrialize the raw zero-shot capabilities of generative AI models into more elaborate, enterprise-grade offerings.
Among all of the candidate approaches, one natural progression in the development of goal-directed agents is to integrate such agents with generative AI models. We have already begun to observe the fruit of such integration. In the recent paper, Generative Agents: Interactive Simulacra of Human Behavior, researchers from Stanford University created a simulated environment of generative agents - computational entities that simulate realistic human behavior, doing everything from performing everyday activities to interacting socially. Think of The Sims meets generative AI. The authors explain how they orchestrate the simulated environment as follows:
To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent’s experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine’s Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time.
The architecture uses a large language model to let agents record, reflect on, and dynamically use their experiences to inform future actions. Conjoining large language models with computational, interactive agents resulted in a simulated environment that reflected believable individual and social behavior.
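The record, reflect and retrieve cycle described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class and method names are hypothetical, retrieval uses simple word overlap plus recency instead of a language model, and the "reflection" is a plain concatenation standing in for a model-generated summary.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    step: int  # position in the stream (higher means more recent)
    text: str  # natural-language description of the experience


@dataclass
class MemoryStream:
    """Hypothetical, simplified record of an agent's experiences."""
    memories: list = field(default_factory=list)
    step: int = 0

    def record(self, text: str) -> None:
        # Store every experience as plain natural language.
        self.step += 1
        self.memories.append(Memory(self.step, text))

    def retrieve(self, query: str, k: int = 2) -> list:
        # Assumption: score = shared words with the query + a recency bonus.
        query_words = set(query.lower().split())

        def score(memory: Memory) -> float:
            overlap = len(query_words & set(memory.text.lower().split()))
            recency = memory.step / self.step
            return overlap + recency

        ranked = sorted(self.memories, key=score, reverse=True)
        return [memory.text for memory in ranked[:k]]

    def reflect(self, topic: str) -> str:
        # Stand-in for a model-written reflection: join the related
        # memories and store the result as a new, higher-level memory.
        related = self.retrieve(topic, k=3)
        summary = f"Reflection on {topic}: " + "; ".join(related)
        self.record(summary)
        return summary


stream = MemoryStream()
stream.record("Ana wants to throw a Valentine's Day party")
stream.record("Ben is reading a book at the library")
stream.record("Cleo was invited to the party by Ana")
print(stream.retrieve("party"))
print(stream.reflect("party"))
```

In a fuller system, the retrieved memories would be placed into the language model's prompt so that the agent's next plan is informed by what it has already experienced.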
The coupling of generative AI models with goal-directed agents has also paved the way for Auto-GPT - an application that has the power to generate content autonomously. Auto-GPT harnesses generative AI models and overlays the ability to act autonomously: to process a request, select the best tools to fulfill it, create a plan for how to use them, and then execute that plan. In their traditional form, generative AI models do not have the ability for such advanced independent planning and execution. For example, you may ask GPT-4 to create a dinner recipe that has five or fewer ingredients; it takes the request, processes it and returns a recipe. As a goal-directed software agent, Auto-GPT can process more general, difficult requests, such as ordering a pizza, creating a website or helping a user negotiate a refund.
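The four steps named above (process a request, select tools, create a plan, execute the plan) can be sketched as a small loop. This is an illustrative assumption, not Auto-GPT's actual code: the tool names, the `make_plan` stub and the `run_agent` function are all hypothetical, and a real agent would ask a generative model to produce the plan instead of using a keyword check.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical tools; a real agent would call external services here.
def search_menu(request: str) -> str:
    return "margherita pizza"


def place_order(item: str) -> str:
    return f"order placed: {item}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search_menu": search_menu,
    "place_order": place_order,
}


def make_plan(request: str, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    # Stand-in for the generative model: choose an ordered list of tool
    # names for the request. Only one request type is recognized here.
    if "pizza" in request.lower():
        return [name for name in ("search_menu", "place_order") if name in tools]
    return []


def run_agent(request: str) -> List[Tuple[str, str]]:
    # Plan first, then execute each step, feeding every tool's output
    # into the next tool and keeping a log of what happened.
    plan = make_plan(request, TOOLS)
    log: List[Tuple[str, str]] = []
    result = request
    for tool_name in plan:
        result = TOOLS[tool_name](result)
        log.append((tool_name, result))
    return log


for step in run_agent("Order a pizza for dinner"):
    print(step)
```

Keeping planning separate from execution means the plan can be inspected, or shown to a user for approval, before any tool is actually run.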
Kelvin Chat Powered by Kelvin Agent – Our Auto-GPT Style Agent
This week we previewed Kelvin Chat - a next-generation chat experience powered by our Kelvin Agent. While we look forward to sharing more in the days and weeks to come, we can say that Kelvin Chat is an ‘Auto-GPT’ style agent which can create and execute plans using one or more of 50+ tools from the Kelvin Legal Data OS. The Kelvin Legal Data OS is a full legal data stack with an end-to-end data pipeline that can preprocess unstructured and semi-structured data for use in a range of potential applications. Kelvin OS features a series of legal-specific tools, including OCR (Kelvin OCR), a spell checker (Kelvin Speller), and a tokenizer and sentence segmenter/chunker (Kelvin Document Index). It can also leverage the Kelvin Conversion Engine, Kelvin NLP, Kelvin Graph and many other tools. For a given problem, our Kelvin Agent will suggest potential plans for using the tools contained within the broader OS (and execute a chosen plan at the behest of the user). Users can also design and execute their own plans, checklists or custom workflows using our no-code, low-code interface.