autogen/notebook/autogen_agent_auto_feedback_from_code_execution.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Interactive LLM Agent\n",
    "\n",
    "FLAML offers an experimental feature of interactive LLM agents, which can be used to solve various tasks, including coding and math problem-solving.\n",
    "\n",
    "In this notebook, we demonstrate how to use `PythonAgent` and `UserProxyAgent` to write code and execute the code. Here `PythonAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for the human user to execute the code written by `PythonAgent`, or automatically execute the code. Depending on the setting of `user_interaction_mode` and `max_consecutive_auto_reply`, the `UserProxyAgent` either solicits feedback from the human user or uses auto-feedback based on the result of code execution. For example, when `user_interaction_mode` is set to \"ALWAYS\", the `UserProxyAgent` will always prompt the user for feedback. When user feedback is provided, the `UserProxyAgent` will directly pass the feedback to `PythonAgent` without doing any additional steps. When no user feedback is provided, the `UserProxyAgent` will execute the code written by `PythonAgent` directly and return the execution results (success or failure and corresponding outputs) to `PythonAgent`.\n",
    "\n",
    "## Requirements\n",
    "\n",
    "FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [autogen] option:\n",
    "```bash\n",
    "pip install flaml[autogen]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
     "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
     "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
     "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
    }
   },
   "outputs": [],
   "source": [
    "# %pip install flaml[autogen]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set your API Endpoint\n",
    "\n",
    "The [`config_list_gpt4_gpt35`](https://microsoft.github.io/FLAML/docs/reference/autogen/oai/openai_utils#config_list_gpt4_gpt35) function tries to create a list of gpt-4 and gpt-3.5 configurations using Azure OpenAI endpoints and OpenAI endpoints. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files:\n",
    "\n",
    "- OpenAI API key: os.environ[\"OPENAI_API_KEY\"] or `openai_api_key_file=\"key_openai.txt\"`.\n",
    "- Azure OpenAI API key: os.environ[\"AZURE_OPENAI_API_KEY\"] or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
    "- Azure OpenAI API base: os.environ[\"AZURE_OPENAI_API_BASE\"] or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
    "\n",
    "It's OK to have only the OpenAI API key, or only the Azure Open API key + base.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from flaml import oai\n",
    "\n",
    "config_list = oai.config_list_gpt4_gpt35()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example Task: Create and execute a python script with agents\n",
    "\n",
    "In the example below, let's see how to use the agents in FLAML to write a python script and execute the script. This process involves constructing a `PythonAgent` to serve as the assistant, along with a `UserProxyAgent` that acts as a proxy for the human user. In this example demonstrated below, when constructing the `UserProxyAgent`,  we select the `human_input_mode` to \"NEVER\". This means that the `UserProxyAgent` will not solicit feedback from the human user until the limit defined by `max_consecutive_auto_reply` is reached. For the purpose of this example, we've set this limit to 10."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "Create and execute a script to plot a rocket without using matplotlib\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "Creating a rocket involves using ASCII characters to display it visually. Here's a simple script to get you started:\n",
      "\n",
      "```python\n",
      "# filename: rocket.py\n",
      "def plot_rocket():\n",
      "    rocket = '''\n",
      "          |\n",
      "         /_\\\n",
      "        /^|^\\ \n",
      "        //| \\\\\n",
      "       // | \\\\\n",
      "    '''\n",
      "    print(rocket)\n",
      "\n",
      "if __name__ == \"__main__\":\n",
      "    plot_rocket()\n",
      "```\n",
      "\n",
      "Save this code in a file named `rocket.py` and execute the script. After that, let me know the results.\n",
      "\n",
      " >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
      "          |\n",
      "         /_        /^|^\\ \n",
      "        //| \\\n",
      "       // | \\\n",
      "    \n",
      "\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "It seems that there's an issue with the spacing in the rocket drawing. Let's fix that.\n",
      "\n",
      "```python\n",
      "# filename: rocket.py\n",
      "def plot_rocket():\n",
      "    rocket = '''\n",
      "          |\n",
      "         /_\\\\\n",
      "        /^|^\\\\\n",
      "        //|\\\\\\\\\n",
      "       // | \\\\\\\\\n",
      "    '''\n",
      "    print(rocket)\n",
      "\n",
      "if __name__ == \"__main__\":\n",
      "    plot_rocket()\n",
      "```\n",
      "\n",
      "Save this code in the existing `rocket.py` file and execute the script again. After that, let me know the results.\n",
      "\n",
      " >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
      "          |\n",
      "         /_\\\n",
      "        /^|^\\\n",
      "        //|\\\\\n",
      "       // | \\\\\n",
      "    \n",
      "\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "Great! The rocket is now displayed correctly. If you have any more tasks or modifications, please let me know. Otherwise, I hope you enjoyed the ASCII rocket!\n",
      "\n",
      "TERMINATE\n"
     ]
    }
   ],
   "source": [
    "from flaml.autogen.agent.coding_agent import PythonAgent\n",
    "from flaml.autogen.agent.user_proxy_agent import UserProxyAgent\n",
    "\n",
    "# create an assistant which is essentially a PythonAgent instance named \"coding_agent\"\n",
    "assistant = PythonAgent(\"coding_agent\", request_timeout=600, seed=42, config_list=config_list)\n",
    "# create a UserProxyAgent instance named \"user\"\n",
    "user = UserProxyAgent(\n",
    "    \"user\",\n",
    "    human_input_mode=\"NEVER\",\n",
    "    max_consecutive_auto_reply=10,\n",
    "    is_termination_msg=lambda x: x.rstrip().endswith(\"TERMINATE\"),\n",
    ")\n",
    "# the assistant receives a message from the user, which contains the task description\n",
    "assistant.receive(\n",
    "    \"\"\"Create and execute a script to plot a rocket without using matplotlib\"\"\",\n",
    "    user,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's see how to use the agents to first write the generated script to a file and then execute the script in two sessions of conversation between the `PythonAgent` and the `UserProxyAgent`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "Create a temp.py file with the following content:\n",
      "```\n",
      "print('Hello world!')\n",
      "```\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "Here is the code to create the temp.py file with the specified content. Please execute this code:\n",
      "\n",
      "```python\n",
      "with open('temp.py', 'w') as file:\n",
      "    file.write(\"print('Hello world!')\")\n",
      "```\n",
      "\n",
      "After executing this code, you should have a file named temp.py with the content:\n",
      "\n",
      "```\n",
      "print('Hello world!')\n",
      "```\n",
      "\n",
      " >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "Great! The temp.py file has been created successfully. Now, you can run this file to see the output. If you need any further assistance, feel free to ask.\n",
      "\n",
      "TERMINATE\n"
     ]
    }
   ],
   "source": [
    "# it is suggested to reset the assistant to clear the state if the new task is not related to the previous one.\n",
    "assistant.reset()\n",
    "assistant.receive(\n",
    "    \"\"\"Create a temp.py file with the following content:\n",
    "    ```\n",
    "    print('Hello world!')\n",
    "    ```\"\"\",\n",
    "    user,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The example above involves code execution. In FLAML, code execution is triggered automatically by the `UserProxyAgent` when it detects an executable code block in a received message and no human user input is provided. This process occurs in a designated working directory, using a Docker container by default. Unless a specific directory is specified, FLAML defaults to the `flaml/autogen/extensions` directory. Users have the option to specify a different working directory by setting the `work_dir` argument when constructing a new instance of the `UserProxyAgent`.\n",
    "\n",
    "Upon successful execution of the preceding code block, a file named `temp.py` will be created and saved in the default working directory `flaml/autogen/extensions`. Now, let's prompt the assistant to execute the code contained within this file using the following line of code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "Execute temp.py\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "To execute temp.py, run the following code:\n",
      "\n",
      "```python\n",
      "import os\n",
      "\n",
      "os.system('python temp.py')\n",
      "```\n",
      "\n",
      "This code imports the os module and then runs the temp.py file. After executing this code, you should see the output:\n",
      "\n",
      "Hello world!\n",
      "\n",
      " >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
      "\n",
      "**** coding_agent received message from user ****\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: Hello world!\n",
      "\n",
      "\n",
      "**** user received message from coding_agent ****\n",
      "\n",
      "I'm glad that the code execution was successful and you got the desired output! If you need any further help or assistance with another task, feel free to ask.\n",
      "\n",
      "TERMINATE\n"
     ]
    }
   ],
   "source": [
    "assistant.receive(\"\"\"Execute temp.py\"\"\", user)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
   }
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {
     "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
      "model_module": "@jupyter-widgets/base",
      "model_module_version": "2.0.0",
      "model_name": "LayoutModel",
      "state": {
       "_model_module": "@jupyter-widgets/base",
       "_model_module_version": "2.0.0",
       "_model_name": "LayoutModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "LayoutView",
       "align_content": null,
       "align_items": null,
       "align_self": null,
       "border_bottom": null,
       "border_left": null,
       "border_right": null,
       "border_top": null,
       "bottom": null,
       "display": null,
       "flex": null,
       "flex_flow": null,
       "grid_area": null,
       "grid_auto_columns": null,
       "grid_auto_flow": null,
       "grid_auto_rows": null,
       "grid_column": null,
       "grid_gap": null,
       "grid_row": null,
       "grid_template_areas": null,
       "grid_template_columns": null,
       "grid_template_rows": null,
       "height": null,
       "justify_content": null,
       "justify_items": null,
       "left": null,
       "margin": null,
       "max_height": null,
       "max_width": null,
       "min_height": null,
       "min_width": null,
       "object_fit": null,
       "object_position": null,
       "order": null,
       "overflow": null,
       "padding": null,
       "right": null,
       "top": null,
       "visibility": null,
       "width": null
      }
     },
     "454146d0f7224f038689031002906e6f": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "HBoxModel",
      "state": {
       "_dom_classes": [],
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "HBoxModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/controls",
       "_view_module_version": "2.0.0",
       "_view_name": "HBoxView",
       "box_style": "",
       "children": [
        "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
        "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
        "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
       ],
       "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
       "tabbable": null,
       "tooltip": null
      }
     },
     "577e1e3cc4db4942b0883577b3b52755": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "FloatProgressModel",
      "state": {
       "_dom_classes": [],
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "FloatProgressModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/controls",
       "_view_module_version": "2.0.0",
       "_view_name": "ProgressView",
       "bar_style": "success",
       "description": "",
       "description_allow_html": false,
       "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
       "max": 1,
       "min": 0,
       "orientation": "horizontal",
       "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
       "tabbable": null,
       "tooltip": null,
       "value": 1
      }
     },
     "6086462a12d54bafa59d3c4566f06cb2": {
      "model_module": "@jupyter-widgets/base",
      "model_module_version": "2.0.0",
      "model_name": "LayoutModel",
      "state": {
       "_model_module": "@jupyter-widgets/base",
       "_model_module_version": "2.0.0",
       "_model_name": "LayoutModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "LayoutView",
       "align_content": null,
       "align_items": null,
       "align_self": null,
       "border_bottom": null,
       "border_left": null,
       "border_right": null,
       "border_top": null,
       "bottom": null,
       "display": null,
       "flex": null,
       "flex_flow": null,
       "grid_area": null,
       "grid_auto_columns": null,
       "grid_auto_flow": null,
       "grid_auto_rows": null,
       "grid_column": null,
       "grid_gap": null,
       "grid_row": null,
       "grid_template_areas": null,
       "grid_template_columns": null,
       "grid_template_rows": null,
       "height": null,
       "justify_content": null,
       "justify_items": null,
       "left": null,
       "margin": null,
       "max_height": null,
       "max_width": null,
       "min_height": null,
       "min_width": null,
       "object_fit": null,
       "object_position": null,
       "order": null,
       "overflow": null,
       "padding": null,
       "right": null,
       "top": null,
       "visibility": null,
       "width": null
      }
     },
     "74a6ba0c3cbc4051be0a83e152fe1e62": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "ProgressStyleModel",
      "state": {
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "ProgressStyleModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "StyleView",
       "bar_color": null,
       "description_width": ""
      }
     },
     "7d3f3d9e15894d05a4d188ff4f466554": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "HTMLStyleModel",
      "state": {
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "HTMLStyleModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "StyleView",
       "background": null,
       "description_width": "",
       "font_size": null,
       "text_color": null
      }
     },
     "b40bdfb1ac1d4cffb7cefcb870c64d45": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "HTMLModel",
      "state": {
       "_dom_classes": [],
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "HTMLModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/controls",
       "_view_module_version": "2.0.0",
       "_view_name": "HTMLView",
       "description": "",
       "description_allow_html": false,
       "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
       "placeholder": "",
       "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
       "tabbable": null,
       "tooltip": null,
       "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
      }
     },
     "ca245376fd9f4354af6b2befe4af4466": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "HTMLStyleModel",
      "state": {
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "HTMLStyleModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "StyleView",
       "background": null,
       "description_width": "",
       "font_size": null,
       "text_color": null
      }
     },
     "dc83c7bff2f241309537a8119dfc7555": {
      "model_module": "@jupyter-widgets/base",
      "model_module_version": "2.0.0",
      "model_name": "LayoutModel",
      "state": {
       "_model_module": "@jupyter-widgets/base",
       "_model_module_version": "2.0.0",
       "_model_name": "LayoutModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "LayoutView",
       "align_content": null,
       "align_items": null,
       "align_self": null,
       "border_bottom": null,
       "border_left": null,
       "border_right": null,
       "border_top": null,
       "bottom": null,
       "display": null,
       "flex": null,
       "flex_flow": null,
       "grid_area": null,
       "grid_auto_columns": null,
       "grid_auto_flow": null,
       "grid_auto_rows": null,
       "grid_column": null,
       "grid_gap": null,
       "grid_row": null,
       "grid_template_areas": null,
       "grid_template_columns": null,
       "grid_template_rows": null,
       "height": null,
       "justify_content": null,
       "justify_items": null,
       "left": null,
       "margin": null,
       "max_height": null,
       "max_width": null,
       "min_height": null,
       "min_width": null,
       "object_fit": null,
       "object_position": null,
       "order": null,
       "overflow": null,
       "padding": null,
       "right": null,
       "top": null,
       "visibility": null,
       "width": null
      }
     },
     "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
      "model_module": "@jupyter-widgets/controls",
      "model_module_version": "2.0.0",
      "model_name": "HTMLModel",
      "state": {
       "_dom_classes": [],
       "_model_module": "@jupyter-widgets/controls",
       "_model_module_version": "2.0.0",
       "_model_name": "HTMLModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/controls",
       "_view_module_version": "2.0.0",
       "_view_name": "HTMLView",
       "description": "",
       "description_allow_html": false,
       "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
       "placeholder": "",
       "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
       "tabbable": null,
       "tooltip": null,
       "value": "100%"
      }
     },
     "f1355871cc6f4dd4b50d9df5af20e5c8": {
      "model_module": "@jupyter-widgets/base",
      "model_module_version": "2.0.0",
      "model_name": "LayoutModel",
      "state": {
       "_model_module": "@jupyter-widgets/base",
       "_model_module_version": "2.0.0",
       "_model_name": "LayoutModel",
       "_view_count": null,
       "_view_module": "@jupyter-widgets/base",
       "_view_module_version": "2.0.0",
       "_view_name": "LayoutView",
       "align_content": null,
       "align_items": null,
       "align_self": null,
       "border_bottom": null,
       "border_left": null,
       "border_right": null,
       "border_top": null,
       "bottom": null,
       "display": null,
       "flex": null,
       "flex_flow": null,
       "grid_area": null,
       "grid_auto_columns": null,
       "grid_auto_flow": null,
       "grid_auto_rows": null,
       "grid_column": null,
       "grid_gap": null,
       "grid_row": null,
       "grid_template_areas": null,
       "grid_template_columns": null,
       "grid_template_rows": null,
       "height": null,
       "justify_content": null,
       "justify_items": null,
       "left": null,
       "margin": null,
       "max_height": null,
       "max_width": null,
       "min_height": null,
       "min_width": null,
       "object_fit": null,
       "object_position": null,
       "order": null,
       "overflow": null,
       "padding": null,
       "right": null,
       "top": null,
       "visibility": null,
       "width": null
      }
     }
    },
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}