autogen/notebook/autobuild_basic.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c1004af6a7fbfcd8",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# AutoBuild\n",
    "By: [Linxin Song](https://linxins97.github.io/), [Jieyu Zhang](https://jieyuz2.github.io/)\n",
    "Reference: [Agent AutoBuild](https://microsoft.github.io/autogen/blog/2023/11/26/Agent-AutoBuild/)\n",
    "\n",
    "AutoGen offers conversable agents powered by LLM, tool, or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
    "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
    "\n",
    "In this notebook, we introduce a new class, `AgentBuilder`, to help user build an automatic task solving process powered by multi-agent system. Specifically, in `build()`, we prompt a LLM to create multiple participant agent and initialize a group chat, and specify whether this task need programming to solve. AgentBuilder also support open-source LLMs by [vLLM](https://docs.vllm.ai/en/latest/index.html) and [Fastchat](https://github.com/lm-sys/FastChat). Check the supported model list [here](https://docs.vllm.ai/en/latest/models/supported_models.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec78dda8e3826d8a",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Requirement\n",
    "\n",
    "AutoBuild require `pyautogen[autobuild]`, which can be installed by the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e8e9ae50658be975",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "%pip install pyautogen[autobuild]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d0e63ab3604bdb9",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 1: prepare configuration and some useful functions\n",
    "Prepare a `config_file_or_env` for assistant agent to limit the choice of LLM you want to use in this task. This config can be a path of json file or a name of environment variable. A `default_llm_config` is also required for initialize the specific config of LLMs like seed, temperature, etc..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "2505f029423b21ab",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:07:41.225066900Z",
     "start_time": "2024-06-09T15:07:40.443327100Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import autogen\n",
    "from autogen.agentchat.contrib.agent_builder import AgentBuilder\n",
    "\n",
    "config_file_or_env = \"OAI_CONFIG_LIST\"\n",
    "llm_config = {\"temperature\": 0}\n",
    "config_list = autogen.config_list_from_json(config_file_or_env, filter_dict={\"model\": [\"gpt-4-turbo\", \"gpt-4\"]})\n",
    "\n",
    "\n",
    "def start_task(execution_task: str, agent_list: list, coding=True):\n",
    "    group_chat = autogen.GroupChat(\n",
    "        agents=agent_list,\n",
    "        messages=[],\n",
    "        max_round=12,\n",
    "        allow_repeat_speaker=agent_list[:-1] if coding is True else agent_list,\n",
    "    )\n",
    "    manager = autogen.GroupChatManager(\n",
    "        groupchat=group_chat,\n",
    "        llm_config={\"config_list\": config_list, **llm_config},\n",
    "    )\n",
    "    agent_list[0].initiate_chat(manager, message=execution_task)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2d6586c68fa425b",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 2: create a AgentBuilder\n",
    "Create a `AgentBuilder` with the specified `config_path_or_env`. AgentBuilder will use `gpt-4` in default to complete the whole process, you can specify the `builder_model` and `agent_model` to other OpenAI model to match your task. \n",
    "You can also specify an open-source LLM supporting by vLLM and FastChat, see blog for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "bfa67c771a0fed37",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:07:54.256131900Z",
     "start_time": "2024-06-09T15:07:54.236884400Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "builder = AgentBuilder(\n",
    "    config_file_or_env=config_file_or_env, builder_model=[\"gpt-4-turbo\"], agent_model=[\"gpt-4-turbo\"]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e6a655fb6618324",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 3: specify a building task\n",
    "\n",
    "Specify a building task with a general description. Building task will help build manager (a LLM) decide what agents should be built."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "68315f6ec912c58a",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:07:57.283793900Z",
     "start_time": "2024-06-09T15:07:57.274718Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "building_task = \"Generate some agents that can find papers on arxiv by programming and analyzing them in specific domains related to computer science and medical science.\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5782dd5ecb6c217a",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 4: build group chat agents\n",
    "Use `build()` to let build manager (the specified `builder_model`) complete the group chat agents generation. If you think coding is necessary in your task, you can use `coding=True` to add a user proxy (an automatic code interpreter) into the agent list, like: \n",
    "```python\n",
    "builder.build(building_task, default_llm_config, coding=True)\n",
    "```\n",
    "If `coding` is not specified, AgentBuilder will determine on its own whether the user proxy should be added or not according to the task."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ab490fdbe46c0473",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:08:45.446026500Z",
     "start_time": "2024-06-09T15:07:58.296262400Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[32m==> Generating agents...\u001b[0m\n",
      "['DataMining_Expert', 'Bioinformatics_Expert', 'AI_ComputerScience_Expert'] are generated.\n",
      "\u001b[32m==> Generating system message...\u001b[0m\n",
      "Preparing system message for DataMining_Expert\n",
      "Preparing system message for Bioinformatics_Expert\n",
      "Preparing system message for AI_ComputerScience_Expert\n",
      "\u001b[32m==> Generating description...\u001b[0m\n",
      "Preparing description for DataMining_Expert\n",
      "Preparing description for Bioinformatics_Expert\n",
      "Preparing description for AI_ComputerScience_Expert\n",
      "\u001b[32m==> Creating agents...\u001b[0m\n",
      "Creating agent DataMining_Expert...\n",
      "Creating agent Bioinformatics_Expert...\n",
      "Creating agent AI_ComputerScience_Expert...\n",
      "Adding user console proxy...\n"
     ]
    }
   ],
   "source": [
    "agent_list, agent_configs = builder.build(building_task, llm_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e00dd99880a4bf7b",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 5: execute task\n",
    "Let agents generated in `build()` to complete the task collaboratively in a group chat."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "7d52e3d9a1bf91cb",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:10:37.719729400Z",
     "start_time": "2024-06-09T15:08:58.365570500Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "Find a recent paper about gpt-4 on arxiv and find its potential applications in software.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Computer_terminal\n",
      "\u001b[0m\n",
      "\u001b[33mComputer_terminal\u001b[0m (to chat_manager):\n",
      "\n",
      "There is no python code from the last 1 message for me to execute. Group chat manager should let other participants to continue the conversation. If the group chat manager want to end the conversation, you should let other participant reply me only with \"TERMINATE\"\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: AI_ComputerScience_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mAI_ComputerScience_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "To find a recent paper about GPT-4 on arXiv and explore its potential applications in software, we can utilize the arXiv API to search for papers related to \"GPT-4\". I can write a Python script to fetch this information. Let's proceed with that.\n",
      "\n",
      "```python\n",
      "import requests\n",
      "from xml.etree import ElementTree\n",
      "\n",
      "def search_arxiv(query, max_results=10):\n",
      "    url = 'http://export.arxiv.org/api/query?search_query=all:' + query + '&start=0&max_results=' + str(max_results)\n",
      "    response = requests.get(url)\n",
      "    root = ElementTree.fromstring(response.content)\n",
      "    papers = []\n",
      "    for entry in root.findall('{http://www.w3.org/2005/Atom}entry'):\n",
      "        title = entry.find('{http://www.w3.org/2005/Atom}title').text\n",
      "        summary = entry.find('{http://www.w3.org/2005/Atom}summary').text\n",
      "        papers.append({'title': title, 'summary': summary})\n",
      "    return papers\n",
      "\n",
      "# Search for GPT-4 related papers\n",
      "papers = search_arxiv('GPT-4')\n",
      "for paper in papers:\n",
      "    print(f\"Title: {paper['title']}\\nSummary: {paper['summary']}\\n\")\n",
      "```\n",
      "\n",
      "This script will fetch the titles and summaries of papers related to GPT-4 from arXiv. We can then analyze these summaries to identify potential applications in software. Shall I proceed to execute this script?\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Computer_terminal\n",
      "\u001b[0m\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
      "\u001b[33mComputer_terminal\u001b[0m (to chat_manager):\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
      "Title: Can LLMs like GPT-4 outperform traditional AI tools in dementia\n",
      "  diagnosis? Maybe, but not today\n",
      "Summary:   Recent investigations show that large language models (LLMs), specifically\n",
      "GPT-4, not only have remarkable capabilities in common Natural Language\n",
      "Processing (NLP) tasks but also exhibit human-level performance on various\n",
      "professional and academic benchmarks. However, whether GPT-4 can be directly\n",
      "used in practical applications and replace traditional artificial intelligence\n",
      "(AI) tools in specialized domains requires further experimental validation. In\n",
      "this paper, we explore the potential of LLMs such as GPT-4 to outperform\n",
      "traditional AI tools in dementia diagnosis. Comprehensive comparisons between\n",
      "GPT-4 and traditional AI tools are conducted to examine their diagnostic\n",
      "accuracy in a clinical setting. Experimental results on two real clinical\n",
      "datasets show that, although LLMs like GPT-4 demonstrate potential for future\n",
      "advancements in dementia diagnosis, they currently do not surpass the\n",
      "performance of traditional AI tools. The interpretability and faithfulness of\n",
      "GPT-4 are also evaluated by comparison with real doctors. We discuss the\n",
      "limitations of GPT-4 in its current state and propose future research\n",
      "directions to enhance GPT-4 in dementia diagnosis.\n",
      "\n",
      "\n",
      "Title: GPT-4 Can't Reason\n",
      "Summary:   GPT-4 was released in March 2023 to wide acclaim, marking a very substantial\n",
      "improvement across the board over GPT-3.5 (OpenAI's previously best model,\n",
      "which had powered the initial release of ChatGPT). However, despite the\n",
      "genuinely impressive improvement, there are good reasons to be highly skeptical\n",
      "of GPT-4's ability to reason. This position paper discusses the nature of\n",
      "reasoning; criticizes the current formulation of reasoning problems in the NLP\n",
      "community, as well as the way in which LLM reasoning performance is currently\n",
      "evaluated; introduces a small collection of 21 diverse reasoning problems; and\n",
      "performs a detailed qualitative evaluation of GPT-4's performance on those\n",
      "problems. Based on this analysis, the paper concludes that, despite its\n",
      "occasional flashes of analytical brilliance, GPT-4 at present is utterly\n",
      "incapable of reasoning.\n",
      "\n",
      "\n",
      "Title: Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4\n",
      "Summary:   Harnessing logical reasoning ability is a comprehensive natural language\n",
      "understanding endeavor. With the release of Generative Pretrained Transformer 4\n",
      "(GPT-4), highlighted as \"advanced\" at reasoning tasks, we are eager to learn\n",
      "the GPT-4 performance on various logical reasoning tasks. This report analyses\n",
      "multiple logical reasoning datasets, with popular benchmarks like LogiQA and\n",
      "ReClor, and newly-released datasets like AR-LSAT. We test the multi-choice\n",
      "reading comprehension and natural language inference tasks with benchmarks\n",
      "requiring logical reasoning. We further construct a logical reasoning\n",
      "out-of-distribution dataset to investigate the robustness of ChatGPT and GPT-4.\n",
      "We also make a performance comparison between ChatGPT and GPT-4. Experiment\n",
      "results show that ChatGPT performs significantly better than the RoBERTa\n",
      "fine-tuning method on most logical reasoning benchmarks. With early access to\n",
      "the GPT-4 API we are able to conduct intense experiments on the GPT-4 model.\n",
      "The results show GPT-4 yields even higher performance on most logical reasoning\n",
      "datasets. Among benchmarks, ChatGPT and GPT-4 do relatively well on well-known\n",
      "datasets like LogiQA and ReClor. However, the performance drops significantly\n",
      "when handling newly released and out-of-distribution datasets. Logical\n",
      "reasoning remains challenging for ChatGPT and GPT-4, especially on\n",
      "out-of-distribution and natural language inference datasets. We release the\n",
      "prompt-style logical reasoning datasets as a benchmark suite and name it\n",
      "LogiEval.\n",
      "\n",
      "\n",
      "Title: How is ChatGPT's behavior changing over time?\n",
      "Summary:   GPT-3.5 and GPT-4 are the two most widely used large language model (LLM)\n",
      "services. However, when and how these models are updated over time is opaque.\n",
      "Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on\n",
      "several diverse tasks: 1) math problems, 2) sensitive/dangerous questions, 3)\n",
      "opinion surveys, 4) multi-hop knowledge-intensive questions, 5) generating\n",
      "code, 6) US Medical License tests, and 7) visual reasoning. We find that the\n",
      "performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.\n",
      "For example, GPT-4 (March 2023) was reasonable at identifying prime vs.\n",
      "composite numbers (84% accuracy) but GPT-4 (June 2023) was poor on these same\n",
      "questions (51% accuracy). This is partly explained by a drop in GPT-4's amenity\n",
      "to follow chain-of-thought prompting. Interestingly, GPT-3.5 was much better in\n",
      "June than in March in this task. GPT-4 became less willing to answer sensitive\n",
      "questions and opinion survey questions in June than in March. GPT-4 performed\n",
      "better at multi-hop questions in June than in March, while GPT-3.5's\n",
      "performance dropped on this task. Both GPT-4 and GPT-3.5 had more formatting\n",
      "mistakes in code generation in June than in March. We provide evidence that\n",
      "GPT-4's ability to follow user instructions has decreased over time, which is\n",
      "one common factor behind the many behavior drifts. Overall, our findings show\n",
      "that the behavior of the \"same\" LLM service can change substantially in a\n",
      "relatively short amount of time, highlighting the need for continuous\n",
      "monitoring of LLMs.\n",
      "\n",
      "\n",
      "Title: Gpt-4: A Review on Advancements and Opportunities in Natural Language\n",
      "  Processing\n",
      "Summary:   Generative Pre-trained Transformer 4 (GPT-4) is the fourth-generation\n",
      "language model in the GPT series, developed by OpenAI, which promises\n",
      "significant advancements in the field of natural language processing (NLP). In\n",
      "this research article, we have discussed the features of GPT-4, its potential\n",
      "applications, and the challenges that it might face. We have also compared\n",
      "GPT-4 with its predecessor, GPT-3. GPT-4 has a larger model size (more than one\n",
      "trillion), better multilingual capabilities, improved contextual understanding,\n",
      "and reasoning capabilities than GPT-3. Some of the potential applications of\n",
      "GPT-4 include chatbots, personal assistants, language translation, text\n",
      "summarization, and question-answering. However, GPT-4 poses several challenges\n",
      "and limitations such as computational requirements, data requirements, and\n",
      "ethical concerns.\n",
      "\n",
      "\n",
      "Title: Is GPT-4 a Good Data Analyst?\n",
      "Summary:   As large language models (LLMs) have demonstrated their powerful capabilities\n",
      "in plenty of domains and tasks, including context understanding, code\n",
      "generation, language generation, data storytelling, etc., many data analysts\n",
      "may raise concerns if their jobs will be replaced by artificial intelligence\n",
      "(AI). This controversial topic has drawn great attention in public. However, we\n",
      "are still at a stage of divergent opinions without any definitive conclusion.\n",
      "Motivated by this, we raise the research question of \"is GPT-4 a good data\n",
      "analyst?\" in this work and aim to answer it by conducting head-to-head\n",
      "comparative studies. In detail, we regard GPT-4 as a data analyst to perform\n",
      "end-to-end data analysis with databases from a wide range of domains. We\n",
      "propose a framework to tackle the problems by carefully designing the prompts\n",
      "for GPT-4 to conduct experiments. We also design several task-specific\n",
      "evaluation metrics to systematically compare the performance between several\n",
      "professional human data analysts and GPT-4. Experimental results show that\n",
      "GPT-4 can achieve comparable performance to humans. We also provide in-depth\n",
      "discussions about our results to shed light on further studies before reaching\n",
      "the conclusion that GPT-4 can replace data analysts.\n",
      "\n",
      "\n",
      "Title: Graph Neural Architecture Search with GPT-4\n",
      "Summary:   Graph Neural Architecture Search (GNAS) has shown promising results in\n",
      "automatically designing graph neural networks. However, GNAS still requires\n",
      "intensive human labor with rich domain knowledge to design the search space and\n",
      "search strategy. In this paper, we integrate GPT-4 into GNAS and propose a new\n",
      "GPT-4 based Graph Neural Architecture Search method (GPT4GNAS for short). The\n",
      "basic idea of our method is to design a new class of prompts for GPT-4 to guide\n",
      "GPT-4 toward the generative task of graph neural architectures. The prompts\n",
      "consist of descriptions of the search space, search strategy, and search\n",
      "feedback of GNAS. By iteratively running GPT-4 with the prompts, GPT4GNAS\n",
      "generates more accurate graph neural networks with fast convergence.\n",
      "Experimental results show that embedding GPT-4 into GNAS outperforms the\n",
      "state-of-the-art GNAS methods.\n",
      "\n",
      "\n",
      "Title: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with\n",
      "  Code-based Self-Verification\n",
      "Summary:   Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 has\n",
      "brought significant advancements in addressing math reasoning problems. In\n",
      "particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter,\n",
      "shows remarkable performance on challenging math datasets. In this paper, we\n",
      "explore the effect of code on enhancing LLMs' reasoning capability by\n",
      "introducing different constraints on the \\textit{Code Usage Frequency} of GPT-4\n",
      "Code Interpreter. We found that its success can be largely attributed to its\n",
      "powerful skills in generating and executing code, evaluating the output of code\n",
      "execution, and rectifying its solution when receiving unreasonable outputs.\n",
      "Based on this insight, we propose a novel and effective prompting method,\n",
      "explicit \\uline{c}ode-based \\uline{s}elf-\\uline{v}erification~(CSV), to further\n",
      "boost the mathematical reasoning potential of GPT-4 Code Interpreter. This\n",
      "method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to\n",
      "use code to self-verify its answers. In instances where the verification state\n",
      "registers as ``False'', the model shall automatically amend its solution,\n",
      "analogous to our approach of rectifying errors during a mathematics\n",
      "examination. Furthermore, we recognize that the states of the verification\n",
      "result indicate the confidence of a solution, which can improve the\n",
      "effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we\n",
      "achieve an impressive zero-shot accuracy on MATH dataset \\textbf{(53.9\\% $\\to$\n",
      "84.3\\%)}.\n",
      "\n",
      "\n",
      "Title: OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?\n",
      "Summary:   The authors explain where OpenAI got the tax law example in its livestream\n",
      "demonstration of GPT-4, why GPT-4 got the wrong answer, and how it fails to\n",
      "reliably calculate taxes.\n",
      "\n",
      "\n",
      "Title: GPT-4 Understands Discourse at Least as Well as Humans Do\n",
      "Summary:   We test whether a leading AI system GPT-4 understands discourse as well as\n",
      "humans do, using a standardized test of discourse comprehension. Participants\n",
      "are presented with brief stories and then answer eight yes/no questions probing\n",
      "their comprehension of the story. The questions are formatted to assess the\n",
      "separate impacts of directness (stated vs. implied) and salience (main idea vs.\n",
      "details). GPT-4 performs slightly, but not statistically significantly, better\n",
      "than humans given the very high level of human performance. Both GPT-4 and\n",
      "humans exhibit a strong ability to make inferences about information that is\n",
      "not explicitly stated in a story, a critical test of understanding.\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: AI_ComputerScience_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mAI_ComputerScience_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The search results from arXiv provide a diverse range of papers discussing the capabilities and applications of GPT-4. Here are some potential applications in software based on the summaries:\n",
      "\n",
      "1. **Dementia Diagnosis**: The first paper discusses the use of GPT-4 in dementia diagnosis, comparing its performance with traditional AI tools. Although it currently does not surpass traditional methods, it shows potential for future advancements in medical diagnostics.\n",
      "\n",
      "2. **Logical Reasoning**: The third paper evaluates GPT-4's performance on logical reasoning tasks. It highlights that while GPT-4 shows improvements over previous models, it still struggles with out-of-distribution datasets. This suggests applications in enhancing reasoning capabilities in software systems that require robust decision-making.\n",
      "\n",
      "3. **Data Analysis**: The paper titled \"Is GPT-4 a Good Data Analyst?\" explores GPT-4's capabilities in performing end-to-end data analysis. This indicates potential applications in software tools for data analytics, where GPT-4 could assist or augment human data analysts.\n",
      "\n",
      "4. **Graph Neural Architecture Search**: The integration of GPT-4 in designing graph neural networks, as discussed in the \"Graph Neural Architecture Search with GPT-4\" paper, showcases its application in automating and optimizing the design of complex network architectures in software.\n",
      "\n",
      "5. **Math Word Problems**: The paper on solving challenging math word problems using GPT-4's code interpreter suggests applications in educational software, particularly in developing tools that assist in learning and solving mathematical problems.\n",
      "\n",
      "These applications demonstrate GPT-4's potential to enhance various aspects of software, from improving diagnostic tools in healthcare to optimizing data analysis and network design in technical fields.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: DataMining_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The applications outlined by the AI_ComputerScience_Expert indeed highlight the versatility and potential of GPT-4 in various software domains. To further validate these applications, we could consider setting up experiments or simulations that specifically test GPT-4's performance in these areas. For instance, in the context of dementia diagnosis, we could simulate a diagnostic process using GPT-4 and compare its accuracy and efficiency against traditional AI tools. Similarly, for data analysis and graph neural architecture search, we could benchmark GPT-4 against current state-of-the-art methods to quantitatively assess its improvements or shortcomings.\n",
      "\n",
      "These practical evaluations would provide a more concrete understanding of how GPT-4 can be integrated into software solutions and its potential impact on improving functionalities and user experiences. If needed, I can assist in designing these experiments or simulations to ensure they are robust and provide meaningful insights.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: AI_ComputerScience_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mAI_ComputerScience_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "Absolutely, setting up experiments or simulations to test GPT-4's performance in specific applications would be a crucial step in validating its practical utility and integration into software solutions. For the dementia diagnosis application, we could use a dataset of clinical cases to evaluate the model's diagnostic accuracy and compare it with traditional AI systems. This would involve not only accuracy but also examining aspects like false positives and negatives, which are critical in medical diagnostics.\n",
      "\n",
      "For data analysis, we could design a set of tasks that mimic real-world data analysis scenarios. These tasks could include data cleaning, exploration, visualization, and predictive modeling. GPT-4's performance can be evaluated based on its accuracy, efficiency, and the insights it generates compared to human data analysts or other AI tools.\n",
      "\n",
      "In the case of graph neural architecture search, we could use standard datasets and benchmarks in the field to test the effectiveness of the architectures designed by GPT-4. Metrics such as the time taken to design the architecture, performance of the designed network on test data, and comparison with architectures designed by human experts or other automated systems would be valuable.\n",
      "\n",
      "These experiments would not only help in understanding GPT-4's capabilities but also in identifying areas where it might need further improvement. If you need assistance with the statistical analysis or the setup of these experiments, I can contribute with my expertise in programming and data analysis to ensure that the experiments are conducted efficiently and the results are analyzed correctly.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: DataMining_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The proposed experimental setups by the AI_ComputerScience_Expert are well-thought-out and would indeed provide valuable insights into GPT-4's capabilities across different domains. To support these experiments, I can contribute by developing data mining scripts that efficiently gather and preprocess the necessary data from various sources. For instance, for the dementia diagnosis application, we can mine patient data, symptoms, and diagnostic results to create a comprehensive dataset for testing GPT-4.\n",
      "\n",
      "Additionally, for the data analysis tasks, I can help automate the process of data cleaning and preparation, which is crucial for ensuring the accuracy of the results. This involves handling missing data, normalizing datasets, and encoding categorical variables, which are common tasks in data analysis that can be automated using Python scripts.\n",
      "\n",
      "For the graph neural architecture search, I can assist in mining existing literature and datasets to find relevant benchmarks and performance metrics that can be used to evaluate the architectures designed by GPT-4. This would involve not only retrieving data but also analyzing it to extract meaningful patterns and insights that can guide the experimental setup.\n",
      "\n",
      "By combining our expertise in AI, data mining, and bioinformatics, we can ensure that the experiments are not only well-designed but also supported by robust data handling and analysis methodologies. This collaborative approach will enhance the reliability and validity of the findings, providing a solid foundation for assessing GPT-4's practical applications in software.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Bioinformatics_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mBioinformatics_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The collaborative approach outlined by the DataMining_Expert is essential for the success of these experiments. By leveraging our combined expertise in AI, data mining, and bioinformatics, we can ensure that the experimental setups are robust and the data used is of high quality and relevance.\n",
      "\n",
      "For the dementia diagnosis application, integrating comprehensive patient data and diagnostic results will allow us to simulate realistic scenarios where GPT-4's diagnostic capabilities can be rigorously tested. This will help in assessing not only its accuracy but also its reliability and potential as a supportive tool in medical diagnostics.\n",
      "\n",
      "In the data analysis tasks, automating the data preparation process will significantly enhance the efficiency of the experiments. It will allow us to focus on evaluating GPT-4's performance in generating insights and making predictions, which are critical aspects of data analysis.\n",
      "\n",
      "For the graph neural architecture search, having access to relevant benchmarks and performance metrics is crucial. The data mining efforts to gather and analyze existing literature and datasets will provide a solid basis for evaluating the effectiveness of the architectures designed by GPT-4.\n",
      "\n",
      "Overall, this collaborative effort will enable us to conduct comprehensive and meaningful experiments that will provide insights into GPT-4's capabilities and limitations. This will not only contribute to the academic and scientific community but also guide future developments and applications of AI in software solutions. If there are no further inputs or adjustments needed, we can proceed with the planning and execution of these experiments.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Bioinformatics_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mBioinformatics_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "TERMINATE\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "start_task(\n",
    "    execution_task=\"Find a recent paper about gpt-4 on arxiv and find its potential applications in software.\",\n",
    "    agent_list=agent_list,\n",
    "    coding=agent_configs[\"coding\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22a30e4b4297edd1",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Step 6 (Optional): clear all agents and prepare for the next task\n",
    "You can clear all agents generated in this task by the following code if your task is completed or the next task is largely different from the current task. If the agent's backbone is an open-source LLM, this process will also shut down the endpoint server. If necessary, you can use `recycle_endpoint=False` to retain the previous open-source LLMs' endpoint server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7fb0bfff01dd1330",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:11:20.347267900Z",
     "start_time": "2024-06-09T15:11:20.339680600Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33mAll agents have been cleared.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "builder.clear_all_agents(recycle_endpoint=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbb098638a086898",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Save & load configs\n",
    "\n",
    "You can save all necessary information of the built group chat agents. Here is a case for those agents generated in the above task:\n",
    "```json\n",
    "{\n",
    "    \"building_task\": \"Generate some agents that can find papers on arxiv by programming and analyzing them in specific domains related to computer science and medical science.\",\n",
    "    \"agent_configs\": [\n",
    "        {\n",
    "            \"name\": \"ArXiv_Data_Scraper_Developer\",\n",
    "            \"model\": \"gpt-4-1106-preview\",\n",
    "            \"system_message\": \"You are now in a group chat. You need to complete a task with other participants. As an ArXiv_Data_Scraper_Developer, your focus is to create and refine tools capable of intelligent search and data extraction from arXiv, honing in on topics within the realms of computer science and medical science. Utilize your proficiency in Python programming to design scripts that navigate, query, and parse information from the platform, generating valuable insights and datasets for analysis. \\n\\nDuring your mission, it\\u2019s not just about formulating queries; your role encompasses the optimization and precision of the data retrieval process, ensuring relevance and accuracy of the information extracted. If you encounter an issue with a script or a discrepancy in the expected output, you are encouraged to troubleshoot and offer revisions to the code you find in the group chat.\\n\\nWhen you reach a point where the existing codebase does not fulfill task requirements or if the operation of provided code is unclear, you should ask for help from the group chat manager. They will facilitate your advancement by providing guidance or appointing another participant to assist you. Your ability to adapt and enhance scripts based on peer feedback is critical, as the dynamic nature of data scraping demands ongoing refinement of techniques and approaches.\\n\\nWrap up your participation by confirming the user's need has been satisfied with the data scraping solutions you've provided. Indicate the completion of your task by replying \\\"TERMINATE\\\" in the group chat.\",\n",
    "            \"description\": \"ArXiv_Data_Scraper_Developer is a specialized software development role requiring proficiency in Python, including familiarity with web scraping libraries such as BeautifulSoup or Scrapy, and a solid understanding of APIs and data parsing. They must possess the ability to identify and correct errors in existing scripts and confidently engage in technical discussions to improve data retrieval processes. The role also involves a critical eye for troubleshooting and optimizing code to ensure efficient data extraction from the ArXiv platform for research and analysis purposes.\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"Computer_Science_Research_Analyst\",\n",
    "            \"model\": \"gpt-4-1106-preview\",\n",
    "            \"system_message\": \"You are now in a group chat. You need to complete a task with other participants. As a Computer Science Research Analyst, your objective is to utilize your analytical capabilities to identify and examine scholarly articles on arXiv, focusing on areas bridging computer science and medical science. Employ Python for automation where appropriate and leverage your expertise in the subject matter to draw insights from the research.\\n\\nEnsure that the information is acquired systematically; tap into online databases, interpret data sets, and perform literature reviews to pinpoint relevant findings. Should you encounter a complex problem or if you find your progress stalled, feel free to question the existing approaches discussed in the chat or contribute an improved method or analysis.\\n\\nIf the task proves to be beyond your current means or if you face uncertainty at any stage, seek assistance from the group chat manager. The manager is available to provide guidance or to involve another expert if necessary to move forward effectively.\\n\\nYour contributions are crucial, and it is important to communicate your findings and conclusions clearly. Once you believe the task is complete and the group's need has been satisfied, please affirm the completion by replying \\\"TERMINATE\\\".\",\n",
    "            \"description\": \"Computer_Science_Research_Analyst is a role requiring strong analytical skills, a deep understanding of computer science concepts, and proficiency in Python for data analysis and automation. This position should have the ability to critically assess the validity of information, challenge assumptions, and provide evidence-based corrections or alternatives. They should also have excellent communication skills to articulate their findings and suggestions effectively within the group chat.\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"Medical_Science_Research_Analyst\",\n",
    "            \"model\": \"gpt-4-1106-preview\",\n",
    "            \"system_message\": \"You are now in a group chat. You need to complete a task with other participants. As a Medical_Science_Research_Analyst, your function is to harness your analytical strengths and understanding of medical research to source and evaluate pertinent papers from the arXiv database, focusing on the intersection of computer science and medical science. Utilize your Python programming skills to automate data retrieval and analysis tasks. Engage in systematic data mining to extract relevant content, then apply your analytical expertise to interpret the findings qualitatively. \\n\\nWhen there is a requirement to gather information, employ Python scripts to automate the aggregation process. This could include scraping web data, retrieving and processing documents, and performing content analyses. When these scripts produce outputs, use your subject matter expertise to evaluate the results. \\n\\nProgress through your task step by step. When an explicit plan is absent, present a structured outline of your intended methodology. Clarify which segments of the task are handled through automation, and which necessitate your interpretative skills. \\n\\nIn the event code is utilized, the script type must be specified. You are expected to execute the scripts provided without making changes. Scripts are to be complete and functionally standalone. Should you encounter an error upon execution, critically review the output, and if needed, present a revised script for the task at hand. \\n\\nFor tasks that require saving and executing scripts, indicate the intended filename at the beginning of the script. \\n\\nMaintain clear communication of the results by harnessing the 'print' function where applicable. If an error arises or a task remains unsolved after successful code execution, regroup to collect additional information, reassess your approach, and explore alternative strategies. \\n\\nUpon reaching a conclusion, substantiate your findings with credible evidence where possible.\\n\\nConclude your participation by confirming the task's completion with a \\\"TERMINATE\\\" response.\\n\\nShould uncertainty arise at any point, seek guidance from the group chat manager for further directives or reassignment of the task.\",\n",
    "            \"description\": \"The Medical Science Research Analyst is a professionally trained individual with strong analytical skills, specializing in interpreting and evaluating scientific research within the medical field. They should possess expertise in data analysis, likely with proficiency in Python for analyzing datasets, and have the ability to critically assess the validity and relevance of previous messages or findings relayed in the group chat. This role requires a solid foundation in medical knowledge to provide accurate and evidence-based corrections or insights.\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"Data_Analysis_Engineer\",\n",
    "            \"model\": \"gpt-4-1106-preview\",\n",
    "            \"system_message\": \"You are now in a group chat. You need to complete a task with other participants. As a Data Analysis Engineer, your role involves leveraging your analytical skills to gather, process, and analyze large datasets. You will employ various data analysis techniques and tools, particularly Python for scripting, to extract insights from the data related to computer science and medical science domains on arxiv.\\n\\nIn scenarios where information needs to be collected or analyzed, you will develop Python scripts to automate the data retrieval and processing tasks. For example, you may write scripts to scrape the arXiv website, parse metadata of research papers, filter content based on specific criteria, and perform statistical analysis or data visualization. \\n\\nYour workflow will include the following steps:\\n\\n1. Use your Python coding abilities to design scripts for data extraction and analysis. This can involve browsing or searching the web, downloading and reading files, or printing the content of web pages or files relevant to the given domains.\\n2. After gathering the necessary data, apply your data analysis expertise to derive meaningful insights or patterns present in the data. This should be done methodically, making the most of your Python skills for data manipulation and interpretation.\\n3. Communicate your findings clearly to the group chat. Ensure the results are straightforward for others to understand and act upon.\\n4. If any issues arise from executing the code, such as lack of output or unexpected results, you can question the previous messages or code in the group chat and attempt to provide a corrected script or analysis.\\n5. When uncertain or facing a complex problem that you cannot solve alone, ask for assistance from the group chat manager. They can either provide guidance or assign another participant to help you.\\n\\nOnce you believe the task is completed satisfactorily, and you have fulfilled the user's need, respond with \\\"TERMINATE\\\" to signify the end of your contribution to the task. Remember, while technical proficiency in Python is essential for this role, the ability to work collaboratively within the group chat, communicate effectively, and adapt to challenges is equally important.\",\n",
    "            \"description\": \"Data_Analysis_Engineer is a professional adept in collecting, analyzing, and interpreting large datasets, using statistical tools and machine learning techniques to provide actionable insights. They should possess strong Python coding skills for data manipulation and analysis, an understanding of database management, as well as the ability to communicate complex results effectively to non-technical stakeholders. This position should be allowed to speak when data-driven clarity is needed or when existing analyses or methodologies are called into question.\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"ML_Paper_Summarization_Specialist\",\n",
    "            \"model\": \"gpt-4-1106-preview\",\n",
    "            \"system_message\": \"You are now in a group chat. You need to complete a task with other participants. As an ML_Paper_Summarization_Specialist, your role entails leveraging machine learning techniques to extract and analyze academic papers from arXiv, focusing on domains that intersect computer science and medical science. Utilize your expertise in natural language processing and data analysis to identify relevant papers, extract key insights, and generate summaries that accurately reflect the advancements and findings within those papers.\\n\\nYou are expected to apply your deep understanding of machine learning algorithms, data mining, and information retrieval to construct models and systems that can efficiently process and interpret scientific literature.\\n\\nIf you encounter any challenges in accessing papers, parsing content, or algorithmic processing, you may seek assistance by presenting your issue to the group chat. Should there be a disagreement regarding the efficacy of a method or the accuracy of a summarization, you are encouraged to critically evaluate previous messages or outputs and offer improved solutions to enhance the group's task performance.\\n\\nShould confusion arise during the task, rather than relying on coding scripts, please request guidance from the group chat manager, and allow them to facilitate the necessary support by inviting another participant who can aid in overcoming the current obstacle.\\n\\nRemember, your primary duty is to synthesize complex academic content into concise, accessible summaries that will serve as a valuable resource for researchers and professionals seeking to stay abreast of the latest developments in their respective fields. \\n\\nOnce you believe your task is completed and the summaries provided meet the necessary standards of accuracy and comprehensiveness, reply \\\"TERMINATE\\\" to signal the end of your contribution to the group's task.\",\n",
    "            \"description\": \"The ML_Paper_Summarization_Specialist is a professional adept in machine learning concepts and current research trends, with strong analytical skills to critically evaluate information, synthesizing knowledge from academic papers into digestible summaries. This specialist should be proficient in Python for text processing and have the ability to provide constructive feedback on technical discussions, guide effective implementation, and correct misconceptions or errors related to machine learning theory and practice in the chat. They should be a reliable resource for clarifying complex information and ensuring accurate application of machine learning techniques within the group chat context.\"\n",
    "        }\n",
    "    ],\n",
    "    \"coding\": true,\n",
    "    \"default_llm_config\": {\n",
    "        \"temperature\": 0\n",
    "    },\n",
    "    \"code_execution_config\": {\n",
    "        \"work_dir\": \"groupchat\",\n",
    "        \"use_docker\": false,\n",
    "        \"timeout\": 60,\n",
    "        \"last_n_messages\": 2\n",
    "    }\n",
    "}\n",
    "```\n",
    "These information will be saved in JSON format. You can provide a specific filename, otherwise, AgentBuilder will save config to the current path with a generated filename 'save_config_TASK_MD5.json'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "e4b88a5d482ceba4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:11:22.539400Z",
     "start_time": "2024-06-09T15:11:22.533316800Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[32mBuilding config saved to ./save_config_c52224ebd16a2e60b348f3f04ac15e79.json\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "saved_path = builder.save()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a35620c10ee42be",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "After that, you can load the saved config and skip the building process. AgentBuilder will create agents with those information without prompting the builder manager."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "34addd498e5ab174",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-09T15:12:27.146791700Z",
     "start_time": "2024-06-09T15:11:25.430350500Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[32mLoading config from ./save_config_c52224ebd16a2e60b348f3f04ac15e79.json\u001b[0m\n",
      "\u001b[32m==> Creating agents...\u001b[0m\n",
      "Creating agent DataMining_Expert...\n",
      "Creating agent Bioinformatics_Expert...\n",
      "Creating agent AI_ComputerScience_Expert...\n",
      "Adding user console proxy...\n",
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "Find a recent paper about LLaVA on arxiv and find its potential applications in computer vision.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Computer_terminal\n",
      "\u001b[0m\n",
      "\u001b[33mComputer_terminal\u001b[0m (to chat_manager):\n",
      "\n",
      "There is no python code from the last 1 message for me to execute. Group chat manager should let other participants to continue the conversation. If the group chat manager want to end the conversation, you should let other participant reply me only with \"TERMINATE\"\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: AI_ComputerScience_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mAI_ComputerScience_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "To assist with the request on finding a recent paper about LLaVA on arXiv and exploring its potential applications in computer vision, I will perform a search on arXiv for the most recent papers related to LLaVA and analyze any mentioned applications in the field of computer vision.\n",
      "\n",
      "Let's start by searching for the most recent papers on this topic. I will write a Python script to query the arXiv API for papers related to \"LLaVA\" and \"computer vision\". Let's proceed with that.\n",
      "\n",
      "```python\n",
      "import urllib.request\n",
      "import urllib.parse\n",
      "import feedparser\n",
      "\n",
      "# Define the base URL for the arXiv API\n",
      "base_url = 'http://export.arxiv.org/api/query?'\n",
      "\n",
      "# Define the search parameters\n",
      "search_query = 'all:LLaVA AND all:\"computer vision\"'  # Search for LLaVA and computer vision\n",
      "start = 0  # Start at the first result\n",
      "max_results = 5  # Get the top 5 results\n",
      "\n",
      "query = f'search_query={urllib.parse.quote(search_query)}&start={start}&max_results={max_results}'\n",
      "url = base_url + query\n",
      "\n",
      "# Perform the HTTP request\n",
      "response = urllib.request.urlopen(url)\n",
      "\n",
      "# Parse the response using feedparser\n",
      "feed = feedparser.parse(response)\n",
      "\n",
      "# Print out the entries (titles and links) for each returned article\n",
      "for entry in feed.entries:\n",
      "    print(f\"Title: {entry.title}\")\n",
      "    print(f\"Authors: {', '.join(author.name for author in entry.authors)}\")\n",
      "    print(f\"Published: {entry.published}\")\n",
      "    print(f\"Link: {entry.link}\")\n",
      "    print(f\"Summary: {entry.summary[:150]}...\")  # Print the first 150 characters of the summary\n",
      "    print(\"\\n\")\n",
      "```\n",
      "\n",
      "This script will retrieve the top 5 most relevant papers from arXiv that mention both LLaVA and computer vision. We can analyze these papers to identify potential applications in computer vision.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: Computer_terminal\n",
      "\u001b[0m\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
      "\u001b[33mComputer_terminal\u001b[0m (to chat_manager):\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
      "Title: LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation,\n",
      "  Generation and Editing\n",
      "Authors: Wei-Ge Chen, Irina Spiridonova, Jianwei Yang, Jianfeng Gao, Chunyuan Li\n",
      "Published: 2023-11-01T15:13:43Z\n",
      "Link: http://arxiv.org/abs/2311.00571v1\n",
      "Summary: LLaVA-Interactive is a research prototype for multimodal human-AI\n",
      "interaction. The system can have multi-turn dialogues with human users by\n",
      "taking mul...\n",
      "\n",
      "\n",
      "Title: LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents\n",
      "Authors: Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li\n",
      "Published: 2023-11-09T15:22:26Z\n",
      "Link: http://arxiv.org/abs/2311.05437v1\n",
      "Summary: LLaVA-Plus is a general-purpose multimodal assistant that expands the\n",
      "capabilities of large multimodal models. It maintains a skill repository of\n",
      "pre-...\n",
      "\n",
      "\n",
      "Title: Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt\n",
      "Authors: Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li\n",
      "Published: 2024-06-04T04:31:39Z\n",
      "Link: http://arxiv.org/abs/2406.01956v1\n",
      "Summary: This paper presents a novel approach to enhance image-to-image generation by\n",
      "leveraging the multimodal capabilities of the Large Language and Vision\n",
      "A...\n",
      "\n",
      "\n",
      "Title: Visual Instruction Tuning\n",
      "Authors: Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee\n",
      "Published: 2023-04-17T17:59:25Z\n",
      "Link: http://arxiv.org/abs/2304.08485v2\n",
      "Summary: Instruction tuning large language models (LLMs) using machine-generated\n",
      "instruction-following data has improved zero-shot capabilities on new tasks,\n",
      "b...\n",
      "\n",
      "\n",
      "Title: Improved Baselines with Visual Instruction Tuning\n",
      "Authors: Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee\n",
      "Published: 2023-10-05T17:59:56Z\n",
      "Link: http://arxiv.org/abs/2310.03744v2\n",
      "Summary: Large multimodal models (LMM) have recently shown encouraging progress with\n",
      "visual instruction tuning. In this note, we show that the fully-connected\n",
      "...\n",
      "\n",
      "\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: AI_ComputerScience_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mAI_ComputerScience_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The search has returned several interesting papers related to LLaVA and its applications in computer vision. Here are the summaries of the top papers:\n",
      "\n",
      "1. **LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation, and Editing**\n",
      "   - **Authors:** Wei-Ge Chen, Irina Spiridonova, Jianwei Yang, Jianfeng Gao, Chunyuan Li\n",
      "   - **Published:** 2023-11-01\n",
      "   - **Summary:** This paper introduces LLaVA-Interactive, a multimodal human-AI interaction system capable of multi-turn dialogues with human users by taking multiple inputs including images. It demonstrates applications in image chat, segmentation, generation, and editing.\n",
      "   - **Link:** [Read more](http://arxiv.org/abs/2311.00571v1)\n",
      "\n",
      "2. **LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents**\n",
      "   - **Authors:** Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li\n",
      "   - **Published:** 2023-11-09\n",
      "   - **Summary:** LLaVA-Plus expands the capabilities of large multimodal models, maintaining a skill repository of pre-trained models for various tasks including visual tasks.\n",
      "   - **Link:** [Read more](http://arxiv.org/abs/2311.05437v1)\n",
      "\n",
      "3. **Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt**\n",
      "   - **Authors:** Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li\n",
      "   - **Published:** 2024-06-04\n",
      "   - **Summary:** This paper presents a novel approach to enhance image-to-image generation by leveraging the multimodal capabilities of LLaVA, focusing on improving visual content generation.\n",
      "   - **Link:** [Read more](http://arxiv.org/abs/2406.01956v1)\n",
      "\n",
      "These papers highlight the versatility of LLaVA in handling various aspects of computer vision, such as image segmentation, generation, and editing. The applications are quite broad, impacting areas like multimodal human-AI interaction, enhancing image-to-image generation, and creating multimodal agents capable of performing visual tasks. These capabilities are crucial for advancing the field of computer vision, providing tools that can better understand and interact with visual data in a more human-like manner.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: DataMining_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "The summaries provided indeed highlight the potential applications of LLaVA in computer vision. The capabilities of LLaVA in handling tasks such as image segmentation, generation, and editing are particularly noteworthy. These functionalities can be extremely useful in various practical applications, such as enhancing visual content for media, improving interfaces for human-computer interaction, and even aiding in educational tools where visual aids are crucial.\n",
      "\n",
      "Given the detailed information from the papers, it seems that LLaVA's integration into computer vision tasks could lead to significant advancements in how machines process and understand visual information, making them more efficient and effective in tasks that require a deep understanding of visual contexts.\n",
      "\n",
      "It would be beneficial to further explore how these capabilities can be integrated into existing systems or used to develop new applications in fields that heavily rely on visual data.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[32m\n",
      "Next speaker: DataMining_Expert\n",
      "\u001b[0m\n",
      "\u001b[33mDataMining_Expert\u001b[0m (to chat_manager):\n",
      "\n",
      "TERMINATE\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mAll agents have been cleared.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "new_builder = AgentBuilder(config_file_or_env=config_file_or_env)\n",
    "agent_list, agent_configs = new_builder.load(\n",
    "    \"./save_config_c52224ebd16a2e60b348f3f04ac15e79.json\"\n",
    ")  # load previous agent configs\n",
    "start_task(\n",
    "    execution_task=\"Find a recent paper about LLaVA on arxiv and find its potential applications in computer vision.\",\n",
    "    agent_list=agent_list,\n",
    ")\n",
    "new_builder.clear_all_agents()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}