# Retrieval Augmentation
Retrieval Augmented Generation (RAG) is a powerful technique that combines language models with external knowledge retrieval to improve the quality and relevance of generated responses.
One way to realize RAG in AutoGen is to construct agent chats with `AssistantAgent` and `RetrieveUserProxyAgent` classes.
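To make the idea concrete, here is a minimal, library-free sketch of the retrieve-then-augment flow. It is not how AutoGen implements retrieval: the bag-of-words similarity stands in for a real embedding model, and the helper names (`tokenize`, `cosine_similarity`, `retrieve`, `build_prompt`) are hypothetical, chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


# Toy corpus (illustrative only).
docs = [
    "FLAML supports parallel training with Spark.",
    "Chroma is a vector database.",
]
question = "How do I use Spark for parallel training?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

In a real setup, the augmented prompt would be sent to the language model; the agent classes described below handle retrieval, prompt construction, and the conversation loop.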
## Example Setup: RAG with Retrieval Augmented Agents
The following is an example setup demonstrating how to create retrieval augmented agents in AutoGen:
### Step 1. Create an instance of `AssistantAgent` and `RetrieveUserProxyAgent`.
Here, the `RetrieveUserProxyAgent` instance acts as a proxy agent that retrieves relevant information based on the user's input.
Refer to the [doc](https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/retrieve_user_proxy_agent) for more information on the detailed configurations.
```python
import os

import autogen
import chromadb
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# Assumption: LLM endpoints are listed in an OAI_CONFIG_LIST file or environment variable.
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        # Documents to index: remote files plus a local docs directory.
        "docs_path": [
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md",
            os.path.join(os.path.abspath(""), "..", "website", "docs"),
        ],
        "custom_text_types": ["mdx"],
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "client": chromadb.PersistentClient(path="/tmp/chromadb"),
        "embedding_model": "all-mpnet-base-v2",
        # set to False if you don't want to reuse an existing collection, but you'll need to remove the collection manually
        "get_or_create": True,
    },
    code_execution_config=False,  # set to False if you don't want to execute the code
)
```
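The `chunk_token_size` setting limits how large each indexed chunk can be. As a rough illustration of the idea only, the sketch below splits text on whitespace and groups the pieces into fixed-size chunks. This is an assumption-laden approximation: `chunk_by_tokens` is a hypothetical helper, whitespace-separated words stand in for model tokens, and AutoGen's actual splitting logic is not reproduced here.

```python
def chunk_by_tokens(text: str, chunk_token_size: int) -> list[str]:
    """Split text into chunks of at most chunk_token_size whitespace-separated tokens."""
    if chunk_token_size <= 0:
        raise ValueError("chunk_token_size must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[i : i + chunk_token_size])
        for i in range(0, len(tokens), chunk_token_size)
    ]


chunks = chunk_by_tokens("one two three four five", chunk_token_size=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Smaller chunks tend to give more focused matches, while larger chunks keep more surrounding context per retrieved passage; the right value depends on the model's context window and the documents.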
### Step 2. Initiating Agent Chat with Retrieval Augmentation
Once the retrieval augmented agents are set up, you can initiate a chat with retrieval augmentation using the following code:
```python
code_problem = "How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached."
# search_string is used as an extra filter for the embeddings search;
# in this case, we only want to search documents that contain "spark".
ragproxyagent.initiate_chat(
    assistant, message=ragproxyagent.message_generator, problem=code_problem, search_string="spark"
)
```
*You'll need to install `chromadb<=0.5.0` if you see an issue like [#3551](https://github.com/microsoft/autogen/issues/3551).*
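Conceptually, `search_string` narrows the candidate set before similarity ranking: only documents that contain the string are considered. The sketch below illustrates that filter-then-rank idea in plain Python. The `search` function, the word-overlap score, and the toy documents are all hypothetical; a real vector database applies the filter and the embedding search internally, and its matching rules (for example, case sensitivity) may differ.

```python
from typing import Callable


def search(
    docs: list[str],
    score: Callable[[str], float],
    search_string: str = "",
    n_results: int = 2,
) -> list[str]:
    """Keep only docs containing search_string, then rank by score (higher is better)."""
    candidates = [d for d in docs if search_string in d] if search_string else list(docs)
    return sorted(candidates, key=score, reverse=True)[:n_results]


docs = [
    "Use spark for parallel training in FLAML.",
    "FLAML tunes hyperparameters efficiently.",
    "Install pyspark before running spark jobs.",
]

# Hypothetical relevance score: number of words shared with the question.
question_words = {"flaml", "parallel", "training"}


def overlap_score(doc: str) -> float:
    return len(question_words & set(doc.lower().rstrip(".").split()))


results = search(docs, overlap_score, search_string="spark")
print(results)
```

Here the second document is excluded even though it mentions FLAML, because it does not contain "spark".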
## Example Setup: RAG with Retrieval Augmented Agents with PGVector
The following is an example setup demonstrating how to create retrieval augmented agents in AutoGen with PGVector as the vector database:
### Step 1. Create an instance of `AssistantAgent` and `RetrieveUserProxyAgent`.
Here, the `RetrieveUserProxyAgent` instance acts as a proxy agent that retrieves relevant information based on the user's input.
In `db_config`, specify either the `connection_string` or the `host`, `port`, `database`, `username`, and `password`.
```python
import os

import autogen
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# Assumption: LLM endpoints are listed in an OAI_CONFIG_LIST file or environment variable.
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": [
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md",
            os.path.join(os.path.abspath(""), "..", "website", "docs"),
        ],
        "vector_db": "pgvector",
        "collection_name": "autogen_docs",
        "db_config": {
            "connection_string": "postgresql://testuser:testpwd@localhost:5432/vectordb",  # Optional - connect to an external vector database
            # "host": None,  # Optional vector database host
            # "port": None,  # Optional vector database port
            # "database": None,  # Optional vector database name
            # "username": None,  # Optional vector database username
            # "password": None,  # Optional vector database password
        },
        "custom_text_types": ["mdx"],
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "get_or_create": True,
    },
    code_execution_config=False,
)
```
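The two ways of filling in `db_config` carry the same information. As a sketch of how the individual fields map onto a PostgreSQL connection URI, the hypothetical helper below (not part of AutoGen) assembles one from its parts, percent-encoding the username and password so that special characters do not break the URI.

```python
from urllib.parse import quote


def build_connection_string(
    host: str, port: int, database: str, username: str, password: str
) -> str:
    """Assemble a PostgreSQL URI; username and password are percent-encoded."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"postgresql://{user}:{pwd}@{host}:{port}/{database}"


conn = build_connection_string("localhost", 5432, "vectordb", "testuser", "testpwd")
print(conn)  # postgresql://testuser:testpwd@localhost:5432/vectordb
```

In practice, credentials are better read from environment variables or a secrets manager than hard-coded in the configuration.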
### Step 2. Initiating Agent Chat with Retrieval Augmentation
Once the retrieval augmented agents are set up, you can initiate a chat with retrieval augmentation using the following code:
```python
code_problem = "How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached."
# search_string is used as an extra filter for the embeddings search;
# in this case, we only want to search documents that contain "spark".
ragproxyagent.initiate_chat(
    assistant, message=ragproxyagent.message_generator, problem=code_problem, search_string="spark"
)
```
## Online Demo
[Retrieval-Augmented Chat Demo on Hugging Face](https://huggingface.co/spaces/thinkall/autogen-demos)
## More Examples and Notebooks
For more detailed examples and notebooks showcasing the usage of retrieval augmented agents in AutoGen, refer to the following:
- Automated Code Generation and Question Answering with Retrieval Augmented Agents - [View Notebook](/docs/notebooks/agentchat_RetrieveChat)
- Automated Code Generation and Question Answering with [PGVector](https://github.com/pgvector/pgvector) based Retrieval Augmented Agents - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_RetrieveChat_pgvector.ipynb)
- Automated Code Generation and Question Answering with [Qdrant](https://qdrant.tech/) based Retrieval Augmented Agents - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_RetrieveChat_qdrant.ipynb)
- Automated Code Generation and Question Answering with [MongoDB Atlas](https://www.mongodb.com/) based Retrieval Augmented Agents - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_RetrieveChat_mongodb.ipynb)
- Chat with OpenAI Assistant with Retrieval Augmentation - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_oai_assistant_retrieval.ipynb)
- **RAG**: Group Chat with Retrieval Augmented Generation (with 5 group member agents and 1 manager agent) - [View Notebook](/docs/notebooks/agentchat_groupchat_RAG)
## Roadmap
Explore our detailed roadmap [here](https://github.com/microsoft/autogen/issues/1657) for planned advancements around RAG. Your contributions, feedback, and use cases are highly appreciated! We invite you to engage with us and play a pivotal role in the development of this impactful feature.