"There are several use cases where it is valuable to maintain a _store_ of useful facts that can be intelligently added to the context of the agent just before a specific step. The typically use case here is a RAG pattern where a query is used to retrieve relevant information from a database that is then added to the agent's context.\n",
"\n",
"\n",
"AgentChat provides a {py:class}`~autogen_core.memory.Memory` protocol that can be extended to provide this functionality. The key methods are `query`, `update_context`, `add`, `clear`, and `close`. \n",
"\n",
"- `add`: add new entries to the memory store\n",
"- `query`: retrieve relevant information from the memory store \n",
"- `update_context`: mutate an agent's internal `model_context` by adding the retrieved information (used in the {py:class}`~autogen_agentchat.agents.AssistantAgent` class) \n",
"- `clear`: clear all entries from the memory store\n",
"- `close`: clean up any resources used by the memory store \n",
"{py:class}`~autogen_core.memory.ListMemory` is provided as an example implementation of the {py:class}`~autogen_core.memory.Memory` protocol. It is a simple list-based memory implementation that maintains memories in chronological order, appending the most recent memories to the model's context. The implementation is designed to be straightforward and predictable, making it easy to understand and debug.\n",
"In the following example, we will use ListMemory to maintain a memory bank of user preferences and demonstrate how it can be used to provide consistent context for agent responses over time."
"[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)]\n",
"[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_apWw5JOedVvqsPfXWV7c5Uiw', is_error=False)]\n",
"TaskResult(messages=[TextMessage(source='user', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 46, 33, 492791, tzinfo=datetime.timezone.utc), content='What is the weather in New York?', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 46, 33, 494162, tzinfo=datetime.timezone.utc), content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)], type='MemoryQueryEvent'), ToolCallRequestEvent(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=123, completion_tokens=19), metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 46, 34, 892272, tzinfo=datetime.timezone.utc), content=[FunctionCall(id='call_apWw5JOedVvqsPfXWV7c5Uiw', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 46, 34, 894081, tzinfo=datetime.timezone.utc), content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_apWw5JOedVvqsPfXWV7c5Uiw', is_error=False)], type='ToolCallExecutionEvent'), ToolCallSummaryMessage(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 46, 34, 895054, tzinfo=datetime.timezone.utc), content='The weather in New York is 23 °C and Sunny.', type='ToolCallSummaryMessage', tool_calls=[FunctionCall(id='call_apWw5JOedVvqsPfXWV7c5Uiw', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')], results=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_apWw5JOedVvqsPfXWV7c5Uiw', is_error=False)])], stop_reason=None)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Run the agent with a task.\n",
"stream = assistant_agent.run_stream(task=\"What is the weather in New York?\")\n",
"await Console(stream)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can inspect that the `assistant_agent` model_context is actually updated with the retrieved memory entries. The `transform` method is used to format the retrieved memory entries into a string that can be used by the agent. In this case, we simply concatenate the content of each memory entry into a single string."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[UserMessage(content='What is the weather in New York?', source='user', type='UserMessage'),\n",
" SystemMessage(content='\\nRelevant memory content (in chronological order):\\n1. The weather should be in metric units\\n2. Meal recipe must be vegan\\n', type='SystemMessage'),\n",
" FunctionExecutionResultMessage(content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_apWw5JOedVvqsPfXWV7c5Uiw', is_error=False)], type='FunctionExecutionResultMessage')]"
"We see above that the weather is returned in Centigrade as stated in the user preferences. \n",
"\n",
"Similarly, assuming we ask a separate question about generating a meal plan, the agent is able to retrieve relevant information from the memory store and provide a personalized (vegan) response."
"[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)]\n",
"- 1 cup mushrooms, sliced (shiitake or any variety you prefer)\n",
"- 2 green onions, chopped\n",
"- 1 tablespoon soy sauce (optional)\n",
"- 1/2 cup seaweed (such as wakame)\n",
"- 1 tablespoon sesame oil\n",
"- 1 tablespoon grated ginger\n",
"- Salt to taste\n",
"\n",
"**Instructions:**\n",
"1. In a pot, heat the sesame oil over medium heat.\n",
"2. Add the grated ginger and sauté for about a minute until fragrant.\n",
"3. Pour in the vegetable broth and bring it to a simmer.\n",
"4. Add the miso paste, stirring until fully dissolved.\n",
"5. Add the tofu cubes, mushrooms, and seaweed to the broth and cook for about 5 minutes.\n",
"6. Stir in soy sauce if using, and add salt to taste.\n",
"7. Garnish with chopped green onions before serving.\n",
"\n",
"Enjoy your delicious and nutritious vegan miso soup! TERMINATE\n"
]
},
{
"data": {
"text/plain": [
"TaskResult(messages=[TextMessage(source='user', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 47, 19, 247083, tzinfo=datetime.timezone.utc), content='Write brief meal recipe with broth', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 47, 19, 248736, tzinfo=datetime.timezone.utc), content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)], type='MemoryQueryEvent'), TextMessage(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=528, completion_tokens=233), metadata={}, created_at=datetime.datetime(2025, 6, 12, 17, 47, 26, 130554, tzinfo=datetime.timezone.utc), content=\"Here's another vegan broth-based recipe:\\n\\n**Vegan Miso Soup**\\n\\n**Ingredients:**\\n- 4 cups vegetable broth\\n- 3 tablespoons white miso paste\\n- 1 block firm tofu, cubed\\n- 1 cup mushrooms, sliced (shiitake or any variety you prefer)\\n- 2 green onions, chopped\\n- 1 tablespoon soy sauce (optional)\\n- 1/2 cup seaweed (such as wakame)\\n- 1 tablespoon sesame oil\\n- 1 tablespoon grated ginger\\n- Salt to taste\\n\\n**Instructions:**\\n1. In a pot, heat the sesame oil over medium heat.\\n2. Add the grated ginger and sauté for about a minute until fragrant.\\n3. Pour in the vegetable broth and bring it to a simmer.\\n4. Add the miso paste, stirring until fully dissolved.\\n5. Add the tofu cubes, mushrooms, and seaweed to the broth and cook for about 5 minutes.\\n6. Stir in soy sauce if using, and add salt to taste.\\n7. Garnish with chopped green onions before serving.\\n\\nEnjoy your delicious and nutritious vegan miso soup! TERMINATE\", type='TextMessage')], stop_reason=None)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"stream = assistant_agent.run_stream(task=\"Write brief meal recipe with broth\")\n",
"await Console(stream)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Custom Memory Stores (Vector DBs, etc.)\n",
"\n",
"You can build on the `Memory` protocol to implement more complex memory stores. For example, you could implement a custom memory store that uses a vector database to store and retrieve information, or a memory store that uses a machine learning model to generate personalized responses based on the user's preferences etc.\n",
"\n",
"Specifically, you will need to overload the `add`, `query` and `update_context` methods to implement the desired functionality and pass the memory store to your agent.\n",
"\n",
"\n",
"Currently the following example memory stores are available as part of the {py:class}`~autogen_ext` extensions package. \n",
"\n",
"- `autogen_ext.memory.chromadb.ChromaDBVectorMemory`: A memory store that uses a vector database to store and retrieve information. \n",
"\n",
"- `autogen_ext.memory.chromadb.SentenceTransformerEmbeddingFunctionConfig`: A configuration class for the SentenceTransformer embedding function used by the `ChromaDBVectorMemory` store. Note that other embedding functions such as `autogen_ext.memory.openai.OpenAIEmbeddingFunctionConfig` can also be used with the `ChromaDBVectorMemory` store.\n"
"[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_ufpz7LGcn19ZroowyEraj9bd', is_error=False)]\n",
" # Create assistant agent with ChromaDB memory\n",
" assistant_agent = AssistantAgent(\n",
" name=\"assistant_agent\",\n",
" model_client=model_client,\n",
" tools=[get_weather],\n",
" memory=[chroma_user_memory],\n",
" )\n",
"\n",
" stream = assistant_agent.run_stream(task=\"What is the weather in New York?\")\n",
" await Console(stream)\n",
"\n",
" await model_client.close()\n",
" await chroma_user_memory.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that you can also serialize the ChromaDBVectorMemory and save it to disk."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'{\"provider\":\"autogen_ext.memory.chromadb.ChromaDBVectorMemory\",\"component_type\":\"memory\",\"version\":1,\"component_version\":1,\"description\":\"Store and retrieve memory using vector similarity search powered by ChromaDB.\",\"label\":\"ChromaDBVectorMemory\",\"config\":{\"client_type\":\"persistent\",\"collection_name\":\"preferences\",\"distance_metric\":\"cosine\",\"k\":2,\"score_threshold\":0.4,\"allow_reset\":false,\"tenant\":\"default_tenant\",\"database\":\"default_database\",\"embedding_function_config\":{\"function_type\":\"sentence_transformer\",\"model_name\":\"all-MiniLM-L6-v2\"},\"persistence_path\":\"/var/folders/wg/hgs_dt8n5lbd3gx3pq7k6lym0000gn/T/tmp9qcaqchy\"}}'"
"The RAG (Retrieval Augmented Generation) pattern which is common in building AI systems encompasses two distinct phases:\n",
"\n",
"1. **Indexing**: Loading documents, chunking them, and storing them in a vector database\n",
"2. **Retrieval**: Finding and using relevant chunks during conversation runtime\n",
"\n",
"In our previous examples, we manually added items to memory and passed them to our agents. In practice, the indexing process is usually automated and based on much larger document sources like product documentation, internal files, or knowledge bases.\n",
"\n",
"> Note: The quality of a RAG system is dependent on the quality of the chunking and retrieval process (models, embeddings, etc.). You may need to experiement with more advanced chunking and retrieval models to get the best results.\n",
"\n",
"### Building a Simple RAG Agent\n",
"\n",
"To begin, let's create a simple document indexer that we will used to load documents, chunk them, and store them in a `ChromaDBVectorMemory` memory store. "
" chunks: int = await indexer.index_documents(sources)\n",
" print(f\"Indexed {chunks} chunks from {len(sources)} AutoGen documents\")\n",
"\n",
"\n",
"await index_autogen_docs()"
]
},
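{
"cell_type": "markdown",
"metadata": {},
"source": [
"The query cell below assumes a `rag_assistant` agent wired to the populated memory store. A minimal sketch of that wiring (assuming `rag_memory` is the `ChromaDBVectorMemory` instance the indexer populated, and an OpenAI model client):\n",
"\n",
"```python\n",
"from autogen_agentchat.agents import AssistantAgent\n",
"from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
"\n",
"# Attach the populated vector memory so relevant chunks are retrieved for each task.\n",
"rag_assistant = AssistantAgent(\n",
"    name=\"rag_assistant\",\n",
"    model_client=OpenAIChatCompletionClient(model=\"gpt-4o\"),  # assumed model\n",
"    memory=[rag_memory],\n",
")\n",
"```"
]
},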
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"---------- user ----------\n",
"What is AgentChat?\n",
"Query results: results=[MemoryContent(content='ng OpenAI\\'s GPT-4o model. See [other supported models](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/models.html). ```python import asyncio from autogen_agentchat.agents import AssistantAgent from autogen_ext.models.openai import OpenAIChatCompletionClient async def main() -> None: model_client = OpenAIChatCompletionClient(model=\"gpt-4o\") agent = AssistantAgent(\"assistant\", model_client=model_client) print(await agent.run(task=\"Say \\'Hello World!\\'\")) await model_client.close() asyncio.run(main()) ``` ### Web Browsing Agent Team Create a group chat team with a web surfer agent and a user proxy agent for web browsing tasks. You need to install [playwright](https://playwright.dev/python/docs/library). ```python # pip install -U autogen-agentchat autogen-ext[openai,web-surfer] # playwright install import asyncio from autogen_agentchat.agents import UserProxyAgent from autogen_agentchat.conditions import TextMentionTermination from autogen_agentchat.teams import RoundRobinGroupChat from autogen_agentchat.ui import Console from autogen_ext.models.openai import OpenAIChatCompletionClient from autogen_ext.agents.web_surfer import MultimodalWebSurfer async def main() -> None: model_client = OpenAIChatCompletionClient(model=\"gpt-4o\") # The web surfer will open a Chromium browser window to perform web browsing tasks. web_surfer = MultimodalWebSurfer(\"web_surfer\", model_client, headless=False, animate_actions=True) # The user proxy agent is used to ge', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 1, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://raw.githubusercontent.com/microsoft/autogen/main/README.md', 'score': 0.48810458183288574, 'id': '16088e03-0153-4da3-9dec-643b39c549f5'}), MemoryContent(content='els_usage=None content='AutoGen is a programming framework for building multi-agent applications.' type='ToolCallSummaryMessage' The call to the on_messages() method returns a Response that contains the agent’s final response in the chat_message attribute, as well as a list of inner messages in the inner_messages attribute, which stores the agent’s “thought process” that led to the final response. Note It is important to note that on_messages() will update the internal state of the agent – it will add the messages to the agent’s history. So you should call this method with new messages. You should not repeatedly call this method with the same messages or the complete history. Note Unlike in v0.2 AgentChat, the tools are executed by the same agent directly within the same call to on_messages() . By default, the agent will return the result of the tool call as the final response. You can also call the run() method, which is a convenience method that calls on_messages() . It follows the same interface as Teams and returns a TaskResult object. Multi-Modal Input # The AssistantAgent can handle multi-modal input by providing the input as a MultiModalMessage . from io import BytesIO import PIL import requests from autogen_agentchat.messages import MultiModalMessage from autogen_core import Image # Create a multi-modal message with random image and text. pil_image = PIL . Image . open ( BytesIO ( requests . get ( "https://picsum.photos/300/200" ) . 
content )', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 3, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/agents.html', 'score': 0.4665141701698303, 'id': '3d603b62-7cab-4f74-b671-586fe36306f2'}), MemoryContent(content='AgentChat Termination Termination # In the previous section, we explored how to define agents, and organize them into teams that can solve tasks. However, a run can go on forever, and in many cases, we need to know when to stop them. This is the role of the termination condition. AgentChat supports several termination condition by providing a base TerminationCondition class and several implementations that inherit from it.
"---------- rag_assistant ----------\n",
"[MemoryContent(content='ng OpenAI\\'s GPT-4o model. See [other supported models](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/models.html). ```python import asyncio from autogen_agentchat.agents import AssistantAgent from autogen_ext.models.openai import OpenAIChatCompletionClient async def main() -> None: model_client = OpenAIChatCompletionClient(model=\"gpt-4o\") agent = AssistantAgent(\"assistant\", model_client=model_client) print(await agent.run(task=\"Say \\'Hello World!\\'\")) await model_client.close() asyncio.run(main()) ``` ### Web Browsing Agent Team Create a group chat team with a web surfer agent and a user proxy agent for web browsing tasks. You need to install [playwright](https://playwright.dev/python/docs/library). ```python # pip install -U autogen-agentchat autogen-ext[openai,web-surfer] # playwright install import asyncio from autogen_agentchat.agents import UserProxyAgent from autogen_agentchat.conditions import TextMentionTermination from autogen_agentchat.teams import RoundRobinGroupChat from autogen_agentchat.ui import Console from autogen_ext.models.openai import OpenAIChatCompletionClient from autogen_ext.agents.web_surfer import MultimodalWebSurfer async def main() -> None: model_client = OpenAIChatCompletionClient(model=\"gpt-4o\") # The web surfer will open a Chromium browser window to perform web browsing tasks. web_surfer = MultimodalWebSurfer(\"web_surfer\", model_client, headless=False, animate_actions=True) # The user proxy agent is used to ge', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 1, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://raw.githubusercontent.com/microsoft/autogen/main/README.md', 'score': 0.48810458183288574, 'id': '16088e03-0153-4da3-9dec-643b39c549f5'}), MemoryContent(content='els_usage=None content='AutoGen is a programming framework for building multi-agent applications.' type='ToolCallSummaryMessage' The call to the on_messages() method returns a Response that contains the agent’s final response in the chat_message attribute, as well as a list of inner messages in the inner_messages attribute, which stores the agent’s “thought process” that led to the final response. Note It is important to note that on_messages() will update the internal state of the agent – it will add the messages to the agent’s history. So you should call this method with new messages. You should not repeatedly call this method with the same messages or the complete history. Note Unlike in v0.2 AgentChat, the tools are executed by the same agent directly within the same call to on_messages() . By default, the agent will return the result of the tool call as the final response. You can also call the run() method, which is a convenience method that calls on_messages() . It follows the same interface as Teams and returns a TaskResult object. Multi-Modal Input # The AssistantAgent can handle multi-modal input by providing the input as a MultiModalMessage . from io import BytesIO import PIL import requests from autogen_agentchat.messages import MultiModalMessage from autogen_core import Image # Create a multi-modal message with random image and text. pil_image = PIL . Image . open ( BytesIO ( requests . get ( "https://picsum.photos/300/200" ) . 
content )', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 3, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/agents.html', 'score': 0.4665141701698303, 'id': '3d603b62-7cab-4f74-b671-586fe36306f2'}), MemoryContent(content='AgentChat Termination Termination # In the previous section, we explored how to define agents, and organize them into teams that can solve tasks. However, a run can go on forever, and in many cases, we need to know when to stop them. This is the role of the termination condition. AgentChat supports several termination condition by providing a base TerminationCondition class and several implementations that inherit from it. A termination conditio
"---------- rag_assistant ----------\n",
"AgentChat is part of the AutoGen framework, a programming environment for building multi-agent applications. In AgentChat, agents can interact with each other and with users to perform various tasks, including web browsing and engaging in dialogue. It utilizes models from OpenAI for chat completions and supports multi-modal input, which means agents can handle inputs that include both text and images. Additionally, AgentChat provides mechanisms to define termination conditions to control when a conversation or task should be concluded, ensuring that the agent interactions are efficient and goal-oriented. TERMINATE\n"
"stream = rag_assistant.run_stream(task=\"What is AgentChat?\")\n",
"await Console(stream)\n",
"\n",
"# Remember to close the memory when done\n",
"await rag_memory.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This implementation provides a RAG agent that can answer questions based on AutoGen documentation. When a question is asked, the Memory system retrieves relevant chunks and adds them to the context, enabling the assistant to generate informed responses.\n",
"\n",
"For production systems, you might want to:\n",
"1. Implement more sophisticated chunking strategies\n",
"2. Add metadata filtering capabilities\n",
"3. Customize the retrieval scoring\n",
"4. Optimize embedding models for your specific domain\n"
"`autogen_ext.memory.mem0.Mem0Memory` provides integration with `Mem0.ai`'s memory system. It supports both cloud-based and local backends, offering advanced memory capabilities for agents. The implementation handles proper retrieval and context updating, making it suitable for production environments.\n",
"\n",
"In the following example, we'll demonstrate how to use `Mem0Memory` to maintain persistent memories across conversations:"