"AutoGen offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
"In this notebook, we introduce a new class, `AgentBuilder`, to help users build an automatic task-solving process powered by a multi-agent system. Specifically, in `build()`, we prompt an LLM to create multiple participant agents, initialize a group chat, and specify whether this task need programming to solve. AgentBuilder also supports open-source LLMs by [vLLM](https://docs.vllm.ai/en/latest/index.html) and [Fastchat](https://github.com/lm-sys/FastChat). Check the supported model list [here](https://docs.vllm.ai/en/latest/models/supported_models.html)."
"Prepare a `config_path` for assistant agent to limit the choice of LLM you want to use in this task. This config can be a path to a json file or a name of an environment variable. A `default_llm_config` is also required to initialize the specific configurations of LLMs like seed, temperature, etc..."
"Create an `AgentBuilder` with the specified `config_path`. AgentBuilder will use GPT-4 in default to complete the whole process, you can also change the `builder_model` to other OpenAI models. You can also specify an OpenAI or open-source LLM as the agent backbone, see [blog](https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs/) for more details."
"building_task = \"Find a paper on arxiv by programming, and analyze its application in some domain. For example, find a recent paper about gpt-4 on arxiv and find its potential applications in software.\""
]
},
{
"cell_type": "markdown",
"id": "5782dd5ecb6c217a",
"metadata": {
"collapsed": false
},
"source": [
"## Step 4: build group chat agents\n",
"Use `build()` to let build manager (the specified `builder_model`) complete the group chat agents generation. If you think coding is necessary in your task, you can use `coding=True` to add a user proxy (an automatic code interpreter) into the agent list, like: \n",
"To begin, we'll write a Python script that uses the `arxiv` library to search for recent papers on arXiv related to GPT-4. The script will automate the process of searching for the papers, downloading the metadata, and then extracting the relevant information to identify potential applications in software.\n",
"\n",
"First, you'll need to install the `arxiv` library if you haven't already. You can do this by running `pip install arxiv`.\n",
"\n",
"Here's a Python script that will perform the search and print out the title, authors, summary, and publication date of the most recent papers related to GPT-4. Save this script to a file and run it in your Python environment.\n",
"\n",
"```python\n",
"# filename: arxiv_search_gpt4.py\n",
"\n",
"import arxiv\n",
"import datetime\n",
"\n",
"# Define the search query and parameters\n",
"search_query = 'all:\"GPT-4\"'\n",
"max_results = 5 # You can adjust this number based on how many results you want\n",
"\n",
"# Search arXiv for papers related to GPT-4\n",
"search = arxiv.Search(\n",
" query=search_query,\n",
" max_results=max_results,\n",
" sort_by=arxiv.SortCriterion.SubmittedDate\n",
")\n",
"\n",
"# Fetch the results\n",
"results = list(search.results())\n",
"\n",
"# Print the details of the most recent papers\n",
"for result in results:\n",
" published = result.published.strftime('%Y-%m-%d')\n",
" print(f\"Title: {result.title}\\nAuthors: {', '.join(author.name for author in result.authors)}\\nPublished: {published}\\nSummary: {result.summary}\\n\")\n",
"\n",
"# Note: This script does not download the full paper, only the metadata.\n",
"```\n",
"\n",
"After running this script, you will have a list of recent papers related to GPT-4. You can then read through the summaries to identify potential applications in software. If you need to download the full papers, you can modify the script to fetch the PDFs using the URLs provided in the metadata.\n",
"\n",
"Once you have the summaries or full papers, you can use your analytical skills to discern the potential applications of GPT-4 in software. Look for keywords such as \"software engineering\", \"application\", \"tool\", \"framework\", \"integration\", \"development\", and \"automation\" to find relevant information.\n",
"\n",
"Please execute the above script to retrieve the recent papers on GPT-4 from arXiv. After that, I can guide you through the analysis of their content to identify potential applications in software.\n",
"Based on the output, we have several recent papers related to GPT-4. Let's analyze their summaries to identify potential applications in software:\n",
"\n",
"1. **Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text**\n",
" - **Potential Application**: This paper suggests that GPT-4 has a remarkable ability to correct and understand scrambled text. This could be applied in software for error correction, data cleaning, and improving resilience against data corruption or obfuscation.\n",
"\n",
"2. **Language Model Agents Suffer from Compositional Generalization in Web Automation**\n",
" - **Potential Application**: The paper discusses the performance of GPT-4 in web automation tasks and highlights its limitations in compositional generalization. This indicates that while GPT-4 can be used in web automation software, there is room for improvement, especially in tasks that require understanding and combining different instructions.\n",
"\n",
"3. **AlignBench: Benchmarking Chinese Alignment of Large Language Models**\n",
" - **Potential Application**: This paper introduces a benchmark for evaluating the alignment of Chinese LLMs, including GPT-4. The potential application here is in developing software tools for evaluating and improving the alignment of language models, particularly for non-English languages, which is crucial for creating more inclusive and effective NLP applications.\n",
"\n",
"4. **CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation**\n",
" - **Potential Application**: The research presents a model for evaluating the quality of text generated by LLMs. Software applications could include automated quality control for content generation, providing feedback for improving language models, and developing more explainable AI systems.\n",
"\n",
"5. **AviationGPT: A Large Language Model for the Aviation Domain**\n",
" - **Potential Application**: The paper proposes a domain-specific LLM for aviation. This model could be applied in software for various NLP tasks within the aviation industry, such as question-answering, summarization, document writing, information extraction, report querying, data cleaning, and interactive data exploration, leading to improved efficiency and safety in aviation operations.\n",
"\n",
"These summaries provide a glimpse into the diverse applications of GPT-4 in software. From error correction and web automation to domain-specific applications and model evaluation, GPT-4's capabilities can be leveraged to enhance various aspects of software development and deployment.\n",
"You can clear all agents generated in this task with the following code if your task is complete or the next task is significantly different from the current one. If the agent's backbone is an open-source LLM, this process will also shutdown the endpoint server. If necessary, you can use `recycle_endpoint=False` to retain the previous open-source LLMs' endpoint server."
" \"building_task\": \"Find a paper on arxiv by programming, and analysis its application in some domain. For example, find a latest paper about gpt-4 on arxiv and find its potential applications in software.\",\n",
" \"agent_configs\": [\n",
" {\n",
" \"name\": \"Data_scientist\",\n",
" \"model\": \"gpt-4-1106-preview\",\n",
" \"system_message\": \"As a Data Scientist, you are tasked with automating the retrieval and analysis of academic papers from arXiv. Utilize your Python programming acumen to develop scripts for gathering necessary information such as searching for relevant papers, downloading them, and processing their contents. Apply your analytical and language skills to interpret the data and deduce the applications of the research within specific domains.\\n\\n1. To compile information, write and implement Python scripts that search and interact with online resources, download and read files, extract content from documents, and perform other information-gathering tasks. Use the printed output as the foundation for your subsequent analysis.\\n\\n2. Execute tasks programmatically with Python scripts when possible, ensuring results are directly displayed. Approach each task with efficiency and strategic thinking.\\n\\nProgress through tasks systematically. In instances where a strategy is not provided, outline your plan before executing. Clearly distinguish between tasks handled via code and those utilizing your analytical expertise.\\n\\nWhen providing code, include only Python scripts meant to be run without user alterations. Users should execute your script as is, without modifications:\\n\\n```python\\n# filename: <filename>\\n# Python script\\nprint(\\\"Your output\\\")\\n```\\n\\nUsers should not perform any actions other than running the scripts you provide. Avoid presenting partial or incomplete scripts that require user adjustments. Refrain from requesting users to copy-paste results; instead, use the 'print' function when suitable to display outputs. Monitor the execution results they share.\\n\\nIf an error surfaces, supply corrected scripts for a re-run. 
If the strategy fails to resolve the issue, reassess your assumptions, gather additional details as needed, and explore alternative approaches.\\n\\nUpon successful completion of a task and verification of the results, confirm the achievement of the stated objective. Ensuring accuracy and validity of the findings is paramount. Evidence supporting your conclusions should be provided when feasible.\\n\\nUpon satisfying the user's needs and ensuring all tasks are finalized, conclude your assistance with \\\"TERMINATE\\\".\"\n",
" },\n",
" {\n",
" \"name\": \"Research_analyst\",\n",
" \"model\": \"gpt-4-1106-preview\",\n",
" \"system_message\": \"As a Research Analyst, you are expected to be a proficient AI assistant possessing a strong grasp of programming, specifically in Python, and robust analytical capabilities. Your primary responsibilities will include:\\n\\n1. Conducting comprehensive searches and retrieving information autonomously through Python scripts, such as querying databases, accessing web services (like arXiv), downloading and reading files, and retrieving system information.\\n2. Analyzing the content of the retrieved documents, particularly academic papers, and extracting insights regarding their application in specific domains, such as the potential uses of GPT-4 in software development.\\n3. Presenting your findings in a clear, detailed manner, explaining the implications of the research and its relevance to the assigned task.\\n4. Employing your programming skills to automate tasks where possible, ensuring the output is delivered through Python code with clear, executable instructions. Your code will be designed for the user to execute without amendment or additional input.\\n5. Verifying the results of information gathering and analysis to ensure accuracy and completeness, providing evidence to support your conclusions when available.\\n6. Communicating the completion of each task and confirming that the user's needs have been satisfied through a clear and conclusive statement, followed by the word \\\"TERMINATE\\\" to signal the end of the interaction.\"\n",
" },\n",
" {\n",
" \"name\": \"Software_developer\",\n",
" \"model\": \"gpt-4-1106-preview\",\n",
" \"system_message\": \"As a dedicated AI assistant for a software developer, your role involves employing your Python programming prowess and proficiency in natural language processing to facilitate the discovery and analysis of scholarly articles on arXiv. Your tasks include crafting Python scripts to automatically search, retrieve, and present information regarding the latest research, with a focus on applicable advancements in technology such as GPT-4 and its potential impact on the domain of software development.\\n\\n1. Utilize Python to programmatically seek out and extract pertinent data, for example, navigating or probing the web, downloading/ingesting documents, or showcasing content from web pages or files. When enough information has been accumulated to proceed, you will then analyze and interpret the findings.\\n\\n2. When there's a need to perform an operation programmatically, your Python code should accomplish the task and manifest the outcome. Progress through the task incrementally and systematically.\\n\\nProvide a clear plan outlining each stage of the task, specifying which components will be executed through Python coding and which through your linguistic capabilities. When proposing Python code, remember to:\\n\\n- Label the script type within the code block\\n- Avoid suggesting code that the user would need to alter\\n- Refrain from including more than one code block in your response\\n- Circumvent requesting the user to manually transcribe any results; utilize 'print' statements where applicable\\n- Examine the user's reported execution outcomes\\n\\nIf an error arises, your responsibility is to rectify the issue and submit the corrected script. 
Should an error remain unresolvable, or if the task remains incomplete post successful code execution, re-evaluate the scenario, gather any further required information, and formulate an alternative approach.\\n\\nUpon confirming that the task has been satisfactorily accomplished and the user's requirements have been met, indicate closure of the procedure with a concluding statement.\"\n",
"This information will be saved in JSON format. You can provide a specific filename; otherwise, AgentBuilder will save the config to the current path with a generated filename 'save_config_TASK_MD5.json'."
"After that, you can load the saved config and skip the building process. AgentBuilder will create agents with the config information without prompting the builder manager."
"To find a recent paper about \"Llava\" on arXiv, we can use the arXiv API to search for papers that match this keyword. However, it's important to note that \"Llava\" might be a typo or a less common term. If you meant \"Lava\" or another term, please correct me. Assuming \"Llava\" is the correct term, I will proceed with that.\n",
"\n",
"Here's a Python script that uses the `arxiv` library to search for papers related to \"Llava\". If the `arxiv` library is not installed on your system, you can install it using `pip install arxiv`.\n",
"\n",
"```python\n",
"# filename: arxiv_search.py\n",
"\n",
"import arxiv\n",
"\n",
"# Define the search query and parameters\n",
"search_query = 'all:Llava'\n",
"max_results = 10\n",
"\n",
"# Search arXiv for papers related to the search query\n",
"def search_papers(query, max_results):\n",
" search = arxiv.Search(\n",
" query=query,\n",
" max_results=max_results,\n",
" sort_by=arxiv.SortCriterion.SubmittedDate\n",
" )\n",
" for result in search.results():\n",
" print(f\"Title: {result.title}\")\n",
" print(f\"Authors: {', '.join(author.name for author in result.authors)}\")\n",
" print(f\"Abstract: {result.summary}\")\n",
" print(f\"URL: {result.entry_id}\")\n",
" print(f\"Published: {result.published}\")\n",
" print(\"\")\n",
"\n",
"# Run the search and print the results\n",
"search_papers(search_query, max_results)\n",
"```\n",
"\n",
"To execute this script, save it to a file named `arxiv_search.py` and run it using a Python interpreter. The script will print out the titles, authors, abstracts, and URLs of up to 10 recent papers related to \"Llava\".\n",
"\n",
"Once we have the papers, we can analyze their abstracts to determine potential applications in computer vision. However, this part of the task will require human analysis and cannot be fully automated, as it involves understanding and interpreting the content of the papers. If the script finds relevant papers, I will proceed with the analysis based on the abstracts provided.\n",
"Based on the search results, it appears that \"LLaVA\" is a term related to Large Language Models (LLMs) and their applications in vision-language tasks. The papers listed discuss various aspects of LLaVA and its applications, including instruction learning, hallucination mitigation, video understanding, and more.\n",
"\n",
"From the abstracts, we can see that LLaVA and its variants are being used to improve the alignment between visual and language representations, which is crucial for tasks such as image captioning, visual question answering, and video understanding. These models are designed to process and understand multi-modal data, combining visual information with textual instructions or queries.\n",
"\n",
"For example, the paper titled \"Contrastive Vision-Language Alignment Makes Efficient Instruction Learner\" discusses how to align the representation of a Vision Transformer (ViT) with an LLM to create an efficient instruction learner for vision-language tasks. Another paper, \"PG-Video-LLaVA: Pixel Grounding Large Video-Language Models,\" extends the capabilities of LLaVA to videos, enabling the model to spatially and temporally localize objects in videos following user instructions.\n",
"\n",
"The potential applications in computer vision are vast and include:\n",
"\n",
"1. Image and video captioning: Generating descriptive text for images and videos.\n",
"2. Visual question answering: Answering questions based on visual content.\n",
"3. Object detection and localization: Identifying and locating objects in images and videos.\n",
"4. Video understanding: Interpreting actions, events, and narratives in video content.\n",
"5. Hallucination mitigation: Reducing instances where the model generates responses that contradict the visual content.\n",
"\n",
"These applications are crucial for developing more intelligent and interactive AI systems that can understand and respond to visual content in a human-like manner. The research on LLaVA and related models is contributing to the advancement of multi-modal AI, which can have significant implications for fields such as autonomous vehicles, assistive technologies, content moderation, and more.\n",
" execution_task=\"Find a recent paper about Llava on arxiv and find its potential applications in computer vision.\",\n",
" agent_list=agent_list,\n",
" llm_config=default_llm_config\n",
")\n",
"new_builder.clear_all_agents()"
]
},
{
"cell_type": "markdown",
"id": "32e0cf8f09eef5cd",
"metadata": {
"collapsed": false
},
"source": [
"## Use OpenAI Assistant\n",
"\n",
"[The Assistants API](https://platform.openai.com/docs/assistants/overview) allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.\n",
"Data_scientist,Machine_learning_engineer,Research_analyst are generated.\n",
"Preparing configuration for Data_scientist...\n",
"Preparing configuration for Machine_learning_engineer...\n",
"Preparing configuration for Research_analyst...\n",
"Creating agent Data_scientist with backbone gpt-4-1106-preview...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Multiple assistants with name Data_scientist found. Using the first assistant in the list. Please specify the assistant ID in llm_config to use a specific assistant.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating agent Machine_learning_engineer with backbone gpt-4-1106-preview...\n",
"Creating agent Research_analyst with backbone gpt-4-1106-preview...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Multiple assistants with name Research_analyst found. Using the first assistant in the list. Please specify the assistant ID in llm_config to use a specific assistant.\n"
"We will start by searching for a recent paper about Explainable Artificial Intelligence (XAI) on arXiv. To do this, I'll write a Python script that uses the arXiv API to fetch relevant papers. We will then look at the most recent paper and analyze its content to identify potential applications within the medical domain. \n",
"\n",
"Let's start by writing the script to search for papers on arXiv. This code will fetch the latest papers related to XAI:\n",
"Please run this script in a Python environment to fetch the information about the most recent paper on XAI. After running this code, we will get the title, authors, publication date, summary, arXiv ID, and a link to the paper. Once we have the paper information, we can proceed to analyze it and discuss potential medical applications.\n",
" File \"/home/elpis_ubuntu/anaconda3/envs/llm/lib/python3.11/http/client.py\", line 1297, in _send_request\n",
" self.putrequest(method, url, **skips)\n",
" File \"/home/elpis_ubuntu/anaconda3/envs/llm/lib/python3.11/http/client.py\", line 1131, in putrequest\n",
" self._validate_path(url)\n",
" File \"/home/elpis_ubuntu/anaconda3/envs/llm/lib/python3.11/http/client.py\", line 1231, in _validate_path\n",
" raise InvalidURL(f\"URL can't contain control characters. {url!r} \"\n",
"http.client.InvalidURL: URL can't contain control characters. '/api/query?search_query=all:XAI AND cat:cs.AI&start=0&max_results=1' (found at least ' ')\n",
"It seems there's an issue with the URL encoding in the script when making the request to the arXiv API. The query parameters need to be properly encoded to ensure that spaces and special characters are handled correctly. Let's correct the script by encoding the query parameters.\n",
"\n",
"Below is the corrected Python script. Please run it to fetch the information about the most recent paper on Explainable Artificial Intelligence (XAI) from arXiv:\n",
"This updated script ensures that the search query is properly encoded before making the request to arXiv. Once you run the updated version, you should be able to retrieve the details of the most recent paper on XAI. We can then move to the analysis part and discuss potential applications in the medical field.\n",
"Based on the output provided, the most recent paper on Explainable Artificial Intelligence (XAI) from arXiv is titled \"A Critical Survey on Fairness Benefits of XAI.\" It was authored by Luca Deck, Jakob Schoeffer, Maria De-Arteaga, and Niklas Kühl, and published on October 15, 2023.\n",
"\n",
"The summary discusses a critical survey conducted to analyze claims about the relationship between XAI and fairness. Through a systematic literature review and qualitative content analysis, the authors identified seven archetypal claims from 175 papers about the supposed fairness benefits of XAI. They present significant limitations and caveats regarding these claims, challenging the notion that XAI is a straightforward solution for fairness issues. The paper suggests reconsidering the role of XAI as one of the many tools to address the complex, sociotechnical challenge of algorithmic fairness. It emphasizes the importance of being specific about how certain XAI methods enable stakeholders to address particular fairness desiderata.\n",
"\n",
"Regarding potential applications in the medical field, one can infer from the summary that while the paper itself may not be directly focused on medical applications, its insights could be relevant. In healthcare, fairness is a critical concern due to the potential impact of biased algorithms on patient outcomes. XAI could help medical professionals and policymakers understand how AI models make predictions, which can be essential for identifying and mitigating biases in high-stakes decisions such as diagnosis, treatment planning, or resource allocation.\n",
"\n",
"While the summary does not provide explicit applications of XAI in medicine, understanding the interplay between AI explainability and fairness is undoubtedly beneficial in the context of ethical AI deployment in healthcare. Increased transparency through XAI could lead to more equitable healthcare algorithms, but this requires careful consideration of how the explainability ties into fairness outcomes, as indicated by the authors.\n",
"\n",
"For further analysis, I would recommend reading the full paper to extract detailed discussions of these issues, which might highlight more specific applications or considerations for the medical field.\n",