{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_client_cost.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"\n",
"Licensed under the MIT License.\n",
"\n",
"# Use AutoGen's OpenAIWrapper for cost estimation\n",
"The `OpenAIWrapper` from `autogen` tracks token counts and costs of your API calls. Use the `create()` method to initiate requests and `print_usage_summary()` to retrieve a detailed usage report, including total cost and token usage for both cached and actual requests.\n",
"\n",
"- `mode=[\"actual\", \"total\"]` (default): print usage summary for non-caching completions and all completions (including cache).\n",
"- `mode='actual'`: only print non-cached usage.\n",
"- `mode='total'`: only print all usage (including cache).\n",
"\n",
"Reset your session's usage data with `clear_usage_summary()` when needed.\n",
"\n",
"## Requirements\n",
"\n",
"AutoGen requires `Python>=3.8`:\n",
"```bash\n",
"pip install \"pyautogen\"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set your API Endpoint\n",
"\n",
"The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import autogen\n",
"\n",
"# config_list = autogen.config_list_from_json(\n",
"# \"OAI_CONFIG_LIST\",\n",
"# filter_dict={\n",
"# \"model\": [\"gpt-3.5-turbo\", \"gpt-4-1106-preview\"],\n",
"# },\n",
"# )\n",
"\n",
"config_list = autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\n",
" \"model\": [\"gpt-3.5-turbo\", \"gpt-35-turbo\"],\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It first looks for an environment variable named \"OAI_CONFIG_LIST\", which must be a valid JSON string. If that variable is not found, it then looks for a JSON file named \"OAI_CONFIG_LIST\". It filters the configs by model (you can filter by other keys as well).\n",
"\n",
"The config list looks like the following:\n",
"```python\n",
"config_list = [\n",
" {\n",
" \"model\": \"gpt-4\",\n",
" \"api_key\": \"<your OpenAI API key>\",\n",
" }, # OpenAI API endpoint for gpt-4\n",
" {\n",
"\"model\": \"gpt-35-turbo-0613\", # 0613 or newer is needed to use functions\n",
" \"base_url\": \"<your Azure OpenAI API base>\", \n",
" \"api_type\": \"azure\", \n",
" \"api_version\": \"2023-08-01-preview\", # 2023-07-01-preview or newer is needed to use functions\n",
" \"api_key\": \"<your Azure OpenAI API key>\"\n",
" }\n",
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods."
]
},
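{
"cell_type": "markdown",
"metadata": {},
"source": [
"The filtering step can be sketched in plain Python. This is a simplified stand-in for illustration, not autogen's actual implementation:\n",
"```python\n",
"# Hypothetical config entries; real entries would hold actual API keys.\n",
"configs = [\n",
"    {'model': 'gpt-4', 'api_key': '<key>'},\n",
"    {'model': 'gpt-35-turbo', 'api_key': '<key>'},\n",
"]\n",
"filter_dict = {'model': ['gpt-3.5-turbo', 'gpt-35-turbo']}\n",
"# Keep only configs whose value for every filter key is in the allowed list.\n",
"filtered = [c for c in configs if all(c.get(k) in v for k, v in filter_dict.items())]\n",
"print(filtered)  # only the gpt-35-turbo entry remains\n",
"```"
]
},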
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OpenAIWrapper with cost estimation"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In update_usage_summary\n",
"0.0001555\n"
]
}
],
"source": [
"from autogen import OpenAIWrapper\n",
"\n",
"client = OpenAIWrapper(config_list=config_list)\n",
"messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'},]\n",
"response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=None)\n",
"print(response.cost)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage Summary\n",
"\n",
"When creating an instance of `OpenAIWrapper`, the cost of all completions from that instance is recorded. Call `print_usage_summary()` to check your usage summary, and `clear_usage_summary()` to reset it.\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"No usage summary. Please call \"create\" first.\n"
]
}
],
"source": [
"from autogen import OpenAIWrapper\n",
"\n",
"client = OpenAIWrapper(config_list=config_list)\n",
"messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'},]\n",
"client.print_usage_summary() # print usage summary"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In update_usage_summary\n",
"----------------------------------------------------------------------------------------------------\n",
"Usage summary excluding cached usage: \n",
"Total cost: 0.00026\n",
"* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
"\n",
"All completions are non-cached: the total cost with cached completions is the same as actual cost.\n",
"----------------------------------------------------------------------------------------------------\n",
"----------------------------------------------------------------------------------------------------\n",
"Usage summary excluding cached usage: \n",
"Total cost: 0.00026\n",
"* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
"----------------------------------------------------------------------------------------------------\n",
"----------------------------------------------------------------------------------------------------\n",
"Usage summary including cached usage: \n",
"Total cost: 0.00026\n",
"* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
"----------------------------------------------------------------------------------------------------\n"
]
}
],
"source": [
"# The first creation\n",
"# By default, cache_seed is set to 41 and enabled. If you don't want to use cache, set cache_seed to None.\n",
"response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
"client.print_usage_summary() # default to [\"actual\", \"total\"]\n",
"client.print_usage_summary(mode='actual') # print actual usage summary\n",
"client.print_usage_summary(mode='total') # print total usage summary"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n",
"{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n"
]
}
],
"source": [
"# Access the usage summaries as dictionaries\n",
"print(client.actual_usage_summary)\n",
"print(client.total_usage_summary)"
]
},
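{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reported cost is just per-token pricing applied to the token counts above. With the per-1K prices assumed here for `gpt-35-turbo` at the time of writing ($0.0015 prompt, $0.002 completion; check your model's current pricing), the arithmetic reproduces the figure shown:\n",
"```python\n",
"# Assumed per-1K-token prices; verify against current pricing.\n",
"prompt_price, completion_price = 0.0015, 0.002\n",
"prompt_tokens, completion_tokens = 25, 110\n",
"cost = prompt_tokens / 1000 * prompt_price + completion_tokens / 1000 * completion_price\n",
"print(round(cost, 7))  # 0.0002575\n",
"```"
]
},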
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In update_usage_summary\n",
"----------------------------------------------------------------------------------------------------\n",
"Usage summary excluding cached usage: \n",
"Total cost: 0.00026\n",
"* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
"\n",
"Usage summary including cached usage: \n",
"Total cost: 0.00052\n",
"* Model 'gpt-35-turbo': cost: 0.00052, prompt_tokens: 50, completion_tokens: 220, total_tokens: 270\n",
"----------------------------------------------------------------------------------------------------\n"
]
}
],
"source": [
"# Since cache is enabled, the same completion will be returned from cache and will not incur any actual cost.\n",
"# So the actual cost doesn't change while the total cost doubles.\n",
"response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
"client.print_usage_summary()"
]
},
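{
"cell_type": "markdown",
"metadata": {},
"source": [
"The actual-vs-total bookkeeping above can be modeled in a few lines: every completion counts toward the total summary, but only non-cached completions count toward the actual summary. This is a simplified illustration, not autogen's implementation:\n",
"```python\n",
"actual_cost, total_cost = 0.0, 0.0\n",
"# (from_cache, cost) for a fresh call followed by a cache hit of the same request\n",
"for from_cache, cost in [(False, 0.00026), (True, 0.00026)]:\n",
"    total_cost += cost  # every completion counts toward total\n",
"    if not from_cache:\n",
"        actual_cost += cost  # only fresh API calls count toward actual\n",
"print(actual_cost, round(total_cost, 5))  # actual stays put; total doubles\n",
"```"
]
},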
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"No usage summary. Please call \"create\" first.\n"
]
}
],
"source": [
"# clear usage summary\n",
"client.clear_usage_summary() \n",
"client.print_usage_summary()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In update_usage_summary\n",
"----------------------------------------------------------------------------------------------------\n",
"No actual cost incurred (all completions are using cache).\n",
"\n",
"Usage summary including cached usage: \n",
"Total cost: 0.00026\n",
"* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
"----------------------------------------------------------------------------------------------------\n"
]
}
],
"source": [
"# All completions are returned from cache, so no actual cost is incurred.\n",
"response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
"client.print_usage_summary()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "msft",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 2
}