autogen/notebook/agentchat_logging.ipynb

318 lines
11 KiB
Plaintext
Raw Normal View History

Logging (#1146) * WIP:logging * serialize request, response and client * Fixed code formatting. * Updated to use a global package, and added some test cases. Still very-much a draft. * Update work in progress. * adding cost * log new agent * update log_completion test in test_agent_telemetry * tests * fix formatting * Added additional telemetry for wrappers and clients. * WIP: add test for oai client and oai wrapper table * update test_telemetry * fix format * More tests, update doc and clean up * small fix for session id - moved to start_logging and return from start_logging * update start_logging type to return str, add notebook to demonstrate use of telemetry * add ability to get log dataframe * precommit formatting fixes * formatting fix * Remove pandas dependency from telemetry and only use in notebook * formatting fixes * log query exceptions * fix formatting * fix ci * fix comment - add notebook link in doc and fix groupchat serialization * small fix * do not serialize Agent * formatting * wip * fix test * serialization bug fix for soc moderator * fix test and clean up * wip: add version table * fix test * fix test * fix test * make the logging interface more general and fix client model logging * fix format * fix formatting and tests * fix * fix comment * Renaming telemetry to logging * update notebook * update doc * formatting * formatting and clean up * fix doc * fix link and title * fix notebook format and fix comment * format * try fixing agent test and update migration guide * fix link * debug print * debug * format * add back tests * fix tests --------- Co-authored-by: Adam Fourney <adamfo@microsoft.com> Co-authored-by: Victor Dibia <victordibia@microsoft.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2024-02-14 19:54:17 -05:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<!--\n",
"tags: [\"logging\", \"debugging\"]\n",
"description: |\n",
" Provide capabilities of runtime logging for debugging and performance analysis.\n",
"-->\n",
"\n",
"# Runtime Logging with AutoGen \n",
"\n",
"AutoGen offers utilities to log data for debugging and performance analysis. This notebook demonstrates how to use them. \n",
"\n",
"In general, users can initiate logging by calling `autogen.runtime_logging.start()` and stop logging by calling `autogen.runtime_logging.stop()`"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Logging session ID: 6e08f3e0-392b-434e-8b69-4ab36c4fcf99\n",
"\u001b[33muser_proxy\u001b[0m (to assistant):\n",
"\n",
"What is the height of the Eiffel Tower? Only respond with the answer and terminate\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001b[33massistant\u001b[0m (to user_proxy):\n",
"\n",
"The height of the Eiffel Tower is approximately 330 meters.\n",
"\n",
"TERMINATE\n",
"\n",
"--------------------------------------------------------------------------------\n"
]
}
],
"source": [
"import json\n",
"import autogen\n",
"from autogen import AssistantAgent, UserProxyAgent\n",
"import pandas as pd\n",
"\n",
"# Setup API key. Add your own API key to config file or environment variable\n",
"llm_config = {\n",
" \"config_list\": autogen.config_list_from_json(\n",
" env_or_file=\"OAI_CONFIG_LIST\",\n",
" ),\n",
" \"temperature\": 0.9,\n",
"}\n",
"\n",
"# Start logging\n",
"logging_session_id = autogen.runtime_logging.start(config={\"dbname\": \"logs.db\"})\n",
"print(\"Logging session ID: \" + str(logging_session_id))\n",
"\n",
"# Create an agent workflow and run it\n",
"assistant = AssistantAgent(name=\"assistant\", llm_config=llm_config)\n",
"user_proxy = UserProxyAgent(\n",
" name=\"user_proxy\",\n",
" code_execution_config=False,\n",
" human_input_mode=\"NEVER\",\n",
" is_termination_msg=lambda msg: \"TERMINATE\" in msg[\"content\"],\n",
")\n",
"\n",
"user_proxy.initiate_chat(\n",
" assistant, message=\"What is the height of the Eiffel Tower? Only respond with the answer and terminate\"\n",
")\n",
"autogen.runtime_logging.stop()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting Data from the SQLite Database \n",
"\n",
"`logs.db` should be generated, by default it's using SQLite database. You can view the data with GUI tool like `sqlitebrowser`, using SQLite command line shell or using python script:\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def get_log(dbname=\"logs.db\", table=\"chat_completions\"):\n",
" import sqlite3\n",
"\n",
" con = sqlite3.connect(dbname)\n",
" query = f\"SELECT * from {table}\"\n",
" cursor = con.execute(query)\n",
" rows = cursor.fetchall()\n",
" column_names = [description[0] for description in cursor.description]\n",
" data = [dict(zip(column_names, row)) for row in rows]\n",
" con.close()\n",
" return data"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>invocation_id</th>\n",
" <th>client_id</th>\n",
" <th>wrapper_id</th>\n",
" <th>session_id</th>\n",
" <th>request</th>\n",
" <th>response</th>\n",
" <th>is_cached</th>\n",
" <th>cost</th>\n",
" <th>start_time</th>\n",
" <th>end_time</th>\n",
" <th>total_tokens</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>e8bb00d7-6da5-4407-a949-e19b55d53da8</td>\n",
" <td>139819167322784</td>\n",
" <td>139823225568704</td>\n",
" <td>8821a150-8c78-4d05-a858-8a64f1d18648</td>\n",
" <td>You are a helpful AI assistant.\\nSolve tasks u...</td>\n",
" <td>The height of the Eiffel Tower is approximatel...</td>\n",
" <td>1</td>\n",
" <td>0.01572</td>\n",
" <td>2024-02-13 15:06:22.082896</td>\n",
" <td>2024-02-13 15:06:22.083169</td>\n",
" <td>507</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>c8522790-0067-484b-bb37-d39ae80db98b</td>\n",
" <td>139823225568656</td>\n",
" <td>139823225563040</td>\n",
" <td>fb0ef547-a2ac-428b-8c20-a5e63263b8e1</td>\n",
" <td>You are a helpful AI assistant.\\nSolve tasks u...</td>\n",
" <td>The height of the Eiffel Tower is approximatel...</td>\n",
" <td>1</td>\n",
" <td>0.01572</td>\n",
" <td>2024-02-13 15:06:23.498758</td>\n",
" <td>2024-02-13 15:06:23.499045</td>\n",
" <td>507</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>91c3f6c0-c6f7-4306-89cd-f304c9556de4</td>\n",
" <td>139823225449024</td>\n",
" <td>139819166072448</td>\n",
" <td>6e08f3e0-392b-434e-8b69-4ab36c4fcf99</td>\n",
" <td>You are a helpful AI assistant.\\nSolve tasks u...</td>\n",
" <td>The height of the Eiffel Tower is approximatel...</td>\n",
" <td>1</td>\n",
" <td>0.01572</td>\n",
" <td>2024-02-13 15:06:24.688990</td>\n",
" <td>2024-02-13 15:06:24.689238</td>\n",
" <td>507</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id invocation_id client_id wrapper_id \\\n",
"0 1 e8bb00d7-6da5-4407-a949-e19b55d53da8 139819167322784 139823225568704 \n",
"1 2 c8522790-0067-484b-bb37-d39ae80db98b 139823225568656 139823225563040 \n",
"2 3 91c3f6c0-c6f7-4306-89cd-f304c9556de4 139823225449024 139819166072448 \n",
"\n",
" session_id \\\n",
"0 8821a150-8c78-4d05-a858-8a64f1d18648 \n",
"1 fb0ef547-a2ac-428b-8c20-a5e63263b8e1 \n",
"2 6e08f3e0-392b-434e-8b69-4ab36c4fcf99 \n",
"\n",
" request \\\n",
"0 You are a helpful AI assistant.\\nSolve tasks u... \n",
"1 You are a helpful AI assistant.\\nSolve tasks u... \n",
"2 You are a helpful AI assistant.\\nSolve tasks u... \n",
"\n",
" response is_cached cost \\\n",
"0 The height of the Eiffel Tower is approximatel... 1 0.01572 \n",
"1 The height of the Eiffel Tower is approximatel... 1 0.01572 \n",
"2 The height of the Eiffel Tower is approximatel... 1 0.01572 \n",
"\n",
" start_time end_time total_tokens \n",
"0 2024-02-13 15:06:22.082896 2024-02-13 15:06:22.083169 507 \n",
"1 2024-02-13 15:06:23.498758 2024-02-13 15:06:23.499045 507 \n",
"2 2024-02-13 15:06:24.688990 2024-02-13 15:06:24.689238 507 "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def str_to_dict(s):\n",
" return json.loads(s)\n",
"\n",
"\n",
"log_data = get_log()\n",
"log_data_df = pd.DataFrame(log_data)\n",
"\n",
"log_data_df[\"total_tokens\"] = log_data_df.apply(\n",
" lambda row: str_to_dict(row[\"response\"])[\"usage\"][\"total_tokens\"], axis=1\n",
")\n",
"\n",
"log_data_df[\"request\"] = log_data_df.apply(lambda row: str_to_dict(row[\"request\"])[\"messages\"][0][\"content\"], axis=1)\n",
"\n",
"log_data_df[\"response\"] = log_data_df.apply(\n",
" lambda row: str_to_dict(row[\"response\"])[\"choices\"][0][\"message\"][\"content\"], axis=1\n",
")\n",
"\n",
"log_data_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Computing Cost \n",
"\n",
"One use case of logging data is to compute the cost of a session."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total tokens for all sessions: 1521, total cost: 0.0472\n",
"Total tokens for session 6e08f3e0-392b-434e-8b69-4ab36c4fcf99: 507, cost: 0.0157\n"
]
}
],
"source": [
"# Sum totoal tokens for all sessions\n",
"total_tokens = log_data_df[\"total_tokens\"].sum()\n",
"\n",
"# Sum total cost for all sessions\n",
"total_cost = log_data_df[\"cost\"].sum()\n",
"\n",
"# Total tokens for specific session\n",
"session_tokens = log_data_df[log_data_df[\"session_id\"] == logging_session_id][\"total_tokens\"].sum()\n",
"session_cost = log_data_df[log_data_df[\"session_id\"] == logging_session_id][\"cost\"].sum()\n",
"\n",
"print(\"Total tokens for all sessions: \" + str(total_tokens) + \", total cost: \" + str(round(total_cost, 4)))\n",
"print(\n",
" \"Total tokens for session \"\n",
" + str(logging_session_id)\n",
" + \": \"\n",
" + str(session_tokens)\n",
" + \", cost: \"\n",
" + str(round(session_cost, 4))\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "autog",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}