autogen/notebook/autogen_agent_auto_feedback_from_code_execution.ipynb
Qingyun Wu fed28e700b
add agent notebook and documentation (#1052)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-28 03:17:23 +00:00


{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Interactive LLM Agent\n",
"\n",
"FLAML offers an experimental feature of interactive LLM agents, which can be used to solve various tasks, including coding and math problem-solving.\n",
"\n",
"In this notebook, we demonstrate how to use `PythonAgent` and `UserProxyAgent` to write and execute code. `PythonAgent` is an LLM-based agent that writes Python code (in a Python code block) for a given task. `UserProxyAgent` acts as a proxy for the human user: it either lets the user respond to `PythonAgent`, or executes the code automatically. Depending on the settings of `human_input_mode` and `max_consecutive_auto_reply`, the `UserProxyAgent` either solicits feedback from the human user or uses auto-feedback based on the result of code execution. For example, when `human_input_mode` is set to \"ALWAYS\", the `UserProxyAgent` always prompts the user for feedback. When the user provides feedback, the `UserProxyAgent` passes it directly to `PythonAgent` without any additional steps. When the user provides no feedback, the `UserProxyAgent` executes the code written by `PythonAgent` and returns the execution results (success or failure, with the corresponding output) to `PythonAgent`.\n",
"\n",
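"The feedback rule for the \"ALWAYS\" mode described above can be sketched in plain Python. This is only an illustration under stated assumptions, not FLAML's implementation: `choose_reply` and `run_code` are hypothetical names, and the reply string only loosely mirrors the execution messages shown later in this notebook.\n",
"\n",
"```python\n",
"# Hypothetical sketch of the reply decision; not FLAML's actual code.\n",
"def choose_reply(human_feedback, run_code):\n",
"    # Non-empty human feedback is forwarded to the assistant unchanged.\n",
"    if human_feedback:\n",
"        return human_feedback\n",
"    # With no feedback, the proxy runs the code and reports the outcome.\n",
"    exitcode, output = run_code()\n",
"    status = 'execution succeeded' if exitcode == 0 else 'execution failed'\n",
"    return f'exitcode: {exitcode} ({status}), code output: {output}'\n",
"\n",
"\n",
"# The callable stands in for real code execution (an assumption of this sketch).\n",
"print(choose_reply('Please add comments.', lambda: (0, '')))\n",
"print(choose_reply('', lambda: (0, 'Hello world!')))\n",
"```\n",
"\n",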
"## Requirements\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install FLAML with the `[autogen]` option:\n",
"```bash\n",
"pip install flaml[autogen]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {
"iopub.execute_input": "2023-02-13T23:40:52.317406Z",
"iopub.status.busy": "2023-02-13T23:40:52.316561Z",
"iopub.status.idle": "2023-02-13T23:40:52.321193Z",
"shell.execute_reply": "2023-02-13T23:40:52.320628Z"
}
},
"outputs": [],
"source": [
"# %pip install flaml[autogen]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set your API Endpoint\n",
"\n",
"The [`config_list_gpt4_gpt35`](https://microsoft.github.io/FLAML/docs/reference/autogen/oai/openai_utils#config_list_gpt4_gpt35) function tries to create a list of gpt-4 and gpt-3.5 configurations using Azure OpenAI endpoints and OpenAI endpoints. It assumes the API keys and API bases are stored in the corresponding environment variables or local txt files:\n",
"\n",
"- OpenAI API key: `os.environ[\"OPENAI_API_KEY\"]` or `openai_api_key_file=\"key_openai.txt\"`.\n",
"- Azure OpenAI API key: `os.environ[\"AZURE_OPENAI_API_KEY\"]` or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
"- Azure OpenAI API base: `os.environ[\"AZURE_OPENAI_API_BASE\"]` or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
"\n",
"It's OK to have only the OpenAI API key, or only the Azure OpenAI API key + base.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from flaml import oai\n",
"\n",
"config_list = oai.config_list_gpt4_gpt35()"
]
},
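{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The \"environment variable or local txt file\" lookup described above can be pictured with a small helper. This is a rough sketch, not FLAML's implementation: `load_api_key` is a hypothetical name, and the order used here (environment variable first, then the first non-empty line of the file) is an assumption made for illustration.\n",
"\n",
"```python\n",
"import os\n",
"from pathlib import Path\n",
"\n",
"\n",
"# Hypothetical helper; FLAML's own lookup may differ in detail.\n",
"def load_api_key(env_var, key_file):\n",
"    # Assumed order: use the environment variable when it is set and non-empty.\n",
"    key = os.environ.get(env_var)\n",
"    if key:\n",
"        return key\n",
"    # Otherwise fall back to the first non-empty line of the text file.\n",
"    path = Path(key_file)\n",
"    if path.exists():\n",
"        for line in path.read_text().splitlines():\n",
"            if line.strip():\n",
"                return line.strip()\n",
"    return None\n",
"\n",
"\n",
"key = load_api_key('OPENAI_API_KEY', 'key_openai.txt')\n",
"print('key found' if key else 'no key configured')\n",
"```\n",
"\n",
"The helper returns `None` rather than raising, so a caller could try the OpenAI key first and the Azure OpenAI key + base afterwards."
]
},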
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example Task: Create and execute a Python script with agents\n",
"\n",
"In the example below, we use the agents in FLAML to write a Python script and execute it. This involves constructing a `PythonAgent` to serve as the assistant, along with a `UserProxyAgent` that acts as a proxy for the human user. When constructing the `UserProxyAgent`, we set `human_input_mode` to \"NEVER\". This means that the `UserProxyAgent` will not solicit feedback from the human user until the limit defined by `max_consecutive_auto_reply` is reached. For this example, we set the limit to 10."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"Create and execute a script to plot a rocket without using matplotlib\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"Creating a rocket involves using ASCII characters to display it visually. Here's a simple script to get you started:\n",
"\n",
"```python\n",
"# filename: rocket.py\n",
"def plot_rocket():\n",
" rocket = '''\n",
" |\n",
" /_\\\n",
" /^|^\\ \n",
" //| \\\\\n",
" // | \\\\\n",
" '''\n",
" print(rocket)\n",
"\n",
"if __name__ == \"__main__\":\n",
" plot_rocket()\n",
"```\n",
"\n",
"Save this code in a file named `rocket.py` and execute the script. After that, let me know the results.\n",
"\n",
" >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"exitcode: 0 (execution succeeded)\n",
"Code output: \n",
" |\n",
" /_ /^|^\\ \n",
" //| \\\n",
" // | \\\n",
" \n",
"\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"It seems that there's an issue with the spacing in the rocket drawing. Let's fix that.\n",
"\n",
"```python\n",
"# filename: rocket.py\n",
"def plot_rocket():\n",
" rocket = '''\n",
" |\n",
" /_\\\\\n",
" /^|^\\\\\n",
" //|\\\\\\\\\n",
" // | \\\\\\\\\n",
" '''\n",
" print(rocket)\n",
"\n",
"if __name__ == \"__main__\":\n",
" plot_rocket()\n",
"```\n",
"\n",
"Save this code in the existing `rocket.py` file and execute the script again. After that, let me know the results.\n",
"\n",
" >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"exitcode: 0 (execution succeeded)\n",
"Code output: \n",
" |\n",
" /_\\\n",
" /^|^\\\n",
" //|\\\\\n",
" // | \\\\\n",
" \n",
"\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"Great! The rocket is now displayed correctly. If you have any more tasks or modifications, please let me know. Otherwise, I hope you enjoyed the ASCII rocket!\n",
"\n",
"TERMINATE\n"
]
}
],
"source": [
"from flaml.autogen.agent.coding_agent import PythonAgent\n",
"from flaml.autogen.agent.user_proxy_agent import UserProxyAgent\n",
"\n",
"# create an assistant which is essentially a PythonAgent instance named \"coding_agent\"\n",
"assistant = PythonAgent(\"coding_agent\", request_timeout=600, seed=42, config_list=config_list)\n",
"# create a UserProxyAgent instance named \"user\"\n",
"user = UserProxyAgent(\n",
" \"user\",\n",
" human_input_mode=\"NEVER\",\n",
" max_consecutive_auto_reply=10,\n",
" is_termination_msg=lambda x: x.rstrip().endswith(\"TERMINATE\"),\n",
")\n",
"# the assistant receives a message from the user, which contains the task description\n",
"assistant.receive(\n",
" \"\"\"Create and execute a script to plot a rocket without using matplotlib\"\"\",\n",
" user,\n",
")"
]
},
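{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The `is_termination_msg` argument used above is an ordinary predicate on the message text, so it can be tried out on its own. The sketch below rewrites the same check as a named function; the sample messages are made up for illustration.\n",
"\n",
"```python\n",
"def is_termination_msg(message):\n",
"    # Trailing whitespace is ignored before checking the suffix.\n",
"    return message.rstrip().endswith('TERMINATE')\n",
"\n",
"\n",
"# A message that ends with the keyword stops the conversation.\n",
"print(is_termination_msg('The rocket is displayed correctly. TERMINATE  '))\n",
"# The keyword elsewhere in the message does not count.\n",
"print(is_termination_msg('TERMINATE once the task is done, then reply.'))\n",
"```\n",
"\n",
"Because only the end of the message is checked, the assistant has to put `TERMINATE` last, as it does in the transcript above."
]
},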
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we use the agents in two separate conversation sessions between the `PythonAgent` and the `UserProxyAgent`: the first writes a script to a file, and the second executes it."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"Create a temp.py file with the following content:\n",
"```\n",
"print('Hello world!')\n",
"```\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"Here is the code to create the temp.py file with the specified content. Please execute this code:\n",
"\n",
"```python\n",
"with open('temp.py', 'w') as file:\n",
" file.write(\"print('Hello world!')\")\n",
"```\n",
"\n",
"After executing this code, you should have a file named temp.py with the content:\n",
"\n",
"```\n",
"print('Hello world!')\n",
"```\n",
"\n",
" >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"exitcode: 0 (execution succeeded)\n",
"Code output: \n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"Great! The temp.py file has been created successfully. Now, you can run this file to see the output. If you need any further assistance, feel free to ask.\n",
"\n",
"TERMINATE\n"
]
}
],
"source": [
"# Reset the assistant to clear its state when the new task is unrelated to the previous one.\n",
"assistant.reset()\n",
"assistant.receive(\n",
" \"\"\"Create a temp.py file with the following content:\n",
" ```\n",
" print('Hello world!')\n",
" ```\"\"\",\n",
" user,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The example above involves code execution. In FLAML, the `UserProxyAgent` triggers code execution automatically when it detects an executable code block in a received message and no human input is provided. Execution takes place in a designated working directory, using a Docker container by default. Unless a directory is specified, FLAML uses the `flaml/autogen/extensions` directory. To use a different working directory, set the `work_dir` argument when constructing the `UserProxyAgent`.\n",
"\n",
"Upon successful execution of the preceding code block, a file named `temp.py` is created in the default working directory `flaml/autogen/extensions`. Now, let's prompt the assistant to execute the code in this file with the next code cell."
]
},
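{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the execution step more concrete, here is a rough local sketch of running a code string inside a working directory and collecting the exit code and output. It is an illustration under stated assumptions rather than FLAML's implementation: `run_in_work_dir` and `tmp_code.py` are hypothetical names, and it uses a plain subprocess instead of a Docker container.\n",
"\n",
"```python\n",
"import subprocess\n",
"import sys\n",
"import tempfile\n",
"from pathlib import Path\n",
"\n",
"\n",
"# Hypothetical helper: runs locally, without Docker (an assumption of this sketch).\n",
"def run_in_work_dir(code, work_dir, filename='tmp_code.py'):\n",
"    # Write the code to a file inside work_dir, then run it from that directory.\n",
"    work_dir = Path(work_dir)\n",
"    work_dir.mkdir(parents=True, exist_ok=True)\n",
"    (work_dir / filename).write_text(code)\n",
"    result = subprocess.run(\n",
"        [sys.executable, filename],\n",
"        cwd=work_dir,\n",
"        capture_output=True,\n",
"        text=True,\n",
"        timeout=60,\n",
"    )\n",
"    # Return the exit code together with everything the script printed.\n",
"    return result.returncode, result.stdout + result.stderr\n",
"\n",
"\n",
"with tempfile.TemporaryDirectory() as tmp:\n",
"    exitcode, output = run_in_work_dir('print(1 + 1)', tmp)\n",
"    print(exitcode, output.strip())\n",
"```\n",
"\n",
"Files written by the executed code with relative paths land in the working directory, which is why a file created in one session can be found and run in the next."
]
},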
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"Execute temp.py\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"To execute temp.py, run the following code:\n",
"\n",
"```python\n",
"import os\n",
"\n",
"os.system('python temp.py')\n",
"```\n",
"\n",
"This code imports the os module and then runs the temp.py file. After executing this code, you should see the output:\n",
"\n",
"Hello world!\n",
"\n",
" >>>>>>>> NO HUMAN INPUT RECEIVED. USING AUTO REPLY FOR THE USER...\n",
"\n",
"**** coding_agent received message from user ****\n",
"\n",
"exitcode: 0 (execution succeeded)\n",
"Code output: Hello world!\n",
"\n",
"\n",
"**** user received message from coding_agent ****\n",
"\n",
"I'm glad that the code execution was successful and you got the desired output! If you need any further help or assistance with another task, feel free to ask.\n",
"\n",
"TERMINATE\n"
]
}
],
"source": [
"assistant.receive(\"\"\"Execute temp.py\"\"\", user)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"vscode": {
"interpreter": {
"hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
}
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {
"2d910cfd2d2a4fc49fc30fbbdc5576a7": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"454146d0f7224f038689031002906e6f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
"IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
"IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
],
"layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
"tabbable": null,
"tooltip": null
}
},
"577e1e3cc4db4942b0883577b3b52755": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
"max": 1,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
"tabbable": null,
"tooltip": null,
"value": 1
}
},
"6086462a12d54bafa59d3c4566f06cb2": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"74a6ba0c3cbc4051be0a83e152fe1e62": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"7d3f3d9e15894d05a4d188ff4f466554": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"b40bdfb1ac1d4cffb7cefcb870c64d45": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
"placeholder": "",
"style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
"tabbable": null,
"tooltip": null,
"value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
}
},
"ca245376fd9f4354af6b2befe4af4466": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"dc83c7bff2f241309537a8119dfc7555": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e4ae2b6f5a974fd4bafb6abb9d12ff26": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
"placeholder": "",
"style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
"tabbable": null,
"tooltip": null,
"value": "100%"
}
},
"f1355871cc6f4dd4b50d9df5af20e5c8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
}
},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}