{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Using RetrieveChat with Qdrant for Retrieve Augmented Code Generation and Question Answering\n", "\n", "[Qdrant](https://qdrant.tech/) is a high-performance vector search engine/database.\n", "\n", "This notebook demonstrates the usage of Qdrant for RAG, based on [agentchat_RetrieveChat.ipynb](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_RetrieveChat.ipynb).\n", "\n", "\n", "RetrieveChat is a conversational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)).\n", "\n", "We'll demonstrate usage of RetrieveChat with Qdrant for code generation and question answering w/ human feedback.\n", "\n", "````{=mdx}\n", ":::info Requirements\n", "Some extra dependencies are needed for this notebook, which can be installed via pip:\n", "\n", "```bash\n", "pip install \"pyautogen[retrievechat-qdrant]\" \"flaml[automl]\"\n", "```\n", "\n", "For more information, please refer to the [installation guide](/docs/installation/).\n", ":::\n", "````" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install \"pyautogen[retrievechat-qdrant]\" \"flaml[automl]\" -q" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Set your API Endpoint\n", "\n", "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "models to use: ['gpt4-1106-preview', 'gpt-4o', 'gpt-35-turbo', 'gpt-35-turbo-0613']\n" ] } ], "source": [ "from qdrant_client import QdrantClient\n", "from sentence_transformers import SentenceTransformer\n", "\n", "import autogen\n", "from autogen import AssistantAgent\n", "from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent\n", "\n", "# Accepted file formats for that can be stored in\n", "# a vector database instance\n", "from autogen.retrieve_utils import TEXT_FORMATS\n", "\n", "config_list = autogen.config_list_from_json(\"OAI_CONFIG_LIST\")\n", "\n", "assert len(config_list) > 0\n", "print(\"models to use: \", [config_list[i][\"model\"] for i in range(len(config_list))])" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "````{=mdx}\n", ":::tip\n", "Learn more about configuring LLMs for agents [here](/docs/topics/llm_configuration).\n", ":::\n", "````" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accepted file formats for `docs_path`:\n", "['rtf', 'jsonl', 'xml', 'json', 'md', 'rst', 'docx', 'msg', 
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accepted file formats for `docs_path`:\n", "['rtf', 'jsonl', 'xml', 'json', 'md', 'rst', 'docx', 'msg', 'pdf', 'log', 'xlsx', 'org', 'txt', 'csv', 'pptx', 'tsv', 'yml', 'epub', 'yaml', 'ppt', 'htm', 'doc', 'odt', 'html']\n" ] } ], "source": [ "print(\"Accepted file formats for `docs_path`:\")\n", "print(TEXT_FORMATS)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Construct agents for RetrieveChat\n", "\n", "We start by initializing the `AssistantAgent` and `RetrieveUserProxyAgent`. The system message of the `AssistantAgent` needs to be set to \"You are a helpful assistant.\" The detailed instructions are given in the user message. Later we will use `RetrieveUserProxyAgent.message_generator` to combine the instructions and a retrieval-augmented generation task into an initial prompt to be sent to the LLM assistant.\n", "\n", "You can find the list of all the embedding models supported by Qdrant [here](https://qdrant.github.io/fastembed/examples/Supported_Models/)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "67171b10626248ba8b5bff0f5a4d6895", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 1. Create an AssistantAgent instance named \"assistant\"\n", "assistant = AssistantAgent(\n", "    name=\"assistant\",\n", "    system_message=\"You are a helpful assistant.\",\n", "    llm_config={\n", "        \"timeout\": 600,\n", "        \"cache_seed\": 42,\n", "        \"config_list\": config_list,\n", "    },\n", ")\n", "\n", "# Optionally create the embedding function object\n", "sentence_transformer_ef = SentenceTransformer(\"all-distilroberta-v1\").encode\n", "client = QdrantClient(\":memory:\")\n", "\n", "# 2. Create the RetrieveUserProxyAgent instance named \"ragproxyagent\"\n", "# Refer to https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/retrieve_user_proxy_agent\n", "# and https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/vectordb/qdrant\n", "# for more information on the RetrieveUserProxyAgent and QdrantVectorDB\n", "ragproxyagent = RetrieveUserProxyAgent(\n", "    name=\"ragproxyagent\",\n", "    human_input_mode=\"NEVER\",\n", "    max_consecutive_auto_reply=10,\n", "    retrieve_config={\n", "        \"task\": \"code\",\n", "        \"docs_path\": [\n", "            \"https://raw.githubusercontent.com/microsoft/flaml/main/README.md\",\n", "            \"https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md\",\n", "        ],  # change this to your own path, such as https://raw.githubusercontent.com/microsoft/autogen/main/README.md\n", "        \"chunk_token_size\": 2000,\n", "        \"model\": config_list[0][\"model\"],\n", "        \"db_config\": {\"client\": client},\n", "        \"vector_db\": \"qdrant\",  # Qdrant database\n", "        \"get_or_create\": True,  # set to False if you don't want to reuse an existing collection\n", "        \"overwrite\": True,  # set to True if you want to overwrite an existing collection\n", "        \"embedding_function\": sentence_transformer_ef,  # If omitted, fastembed's \"BAAI/bge-small-en-v1.5\" will be used\n", "    },\n", "    code_execution_config=False,\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Example 1\n", "\n", "Use RetrieveChat to answer a question and ask for human-in-the-loop feedback.\n", "\n", "Problem: Is there a function named `tune_automl` in FLAML?"
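, "\n", "\n", "The next cell runs this query. As a minimal sketch of how such a chat is kicked off (the `qa_problem` variable name is illustrative; `message_generator` assembles the initial retrieval-augmented prompt from the problem and the retrieved context):\n", "\n", "```python\n", "# Ask the question; the proxy agent retrieves relevant chunks from Qdrant,\n", "# builds an augmented prompt via message_generator, and sends it to the assistant.\n", "qa_problem = \"Is there a function called tune_automl?\"\n", "chat_result = ragproxyagent.initiate_chat(\n", "    assistant, message=ragproxyagent.message_generator, problem=qa_problem\n", ")\n", "```"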
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Trying to create collection.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2024-07-15 23:19:34,988 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 3 chunks.\u001b[0m\n", "Model gpt4-1106-preview not found. Using cl100k_base encoding.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "VectorDB returns doc_ids: [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]\n", "\u001b[32mAdding content of doc 987f060a-4399-b91a-0e51-51b6165ea5bb to context.\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Model gpt4-1106-preview not found. Using cl100k_base encoding.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33mragproxyagent\u001b[0m (to assistant):\n", "\n", "You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the\n", "context provided by the user.\n", "If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.\n", "For code generation, you must obey the following rules:\n", "Rule 1. You MUST NOT install any packages because all the packages needed are already installed.\n", "Rule 2. You must follow the formats below to write your code:\n", "```language\n", "# your code\n", "```\n", "\n", "User's question is: Is there a function called tune_automl?\n", "\n", "Context is: [](https://badge.fury.io/py/FLAML)\n", "\n", "[](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml)\n", "\n", "[](https://pepy.tech/project/flaml)\n", "[](https://discord.gg/Cppx2vSPVP)\n", "\n", "\n", "\n", "# A Fast Library for Automated Machine Learning & Tuning\n", "\n", "
\n",
" \n",
"
\n",
"