{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved. \n", "\n", "Licensed under the MIT License.\n", "\n", "# Use AutoGen's OpenAIWrapper for cost estimation\n", "The `OpenAIWrapper` from `autogen` tracks token counts and costs of your API calls. Use the `create()` method to initiate requests and `print_usage_summary()` to retrieve a detailed usage report, including total cost and token usage for both cached and actual requests.\n", "\n", "- `mode=[\"actual\", \"total\"]` (default): print usage summary for non-cached completions and all completions (including cache).\n", "- `mode='actual'`: only print non-cached usage.\n", "- `mode='total'`: only print all usage (including cache).\n", "\n", "Reset your session's usage data with `clear_usage_summary()` when needed.\n", "\n", "## Requirements\n", "\n", "AutoGen requires `Python>=3.8`:\n", "```bash\n", "pip install \"pyautogen\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set your API Endpoint\n", "\n", "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import autogen\n", "\n", "# config_list = autogen.config_list_from_json(\n", "# \"OAI_CONFIG_LIST\",\n", "# filter_dict={\n", "# \"model\": [\"gpt-3.5-turbo\", \"gpt-4-1106-preview\"],\n", "# },\n", "# )\n", "\n", "config_list = autogen.config_list_from_json(\n", " \"OAI_CONFIG_LIST\",\n", " filter_dict={\n", " \"model\": [\"gpt-3.5-turbo\", \"gpt-35-turbo\"],\n", " },\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It first looks for the environment variable \"OAI_CONFIG_LIST\", which needs to be a valid json string. 
If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by model (you can filter by other keys as well).\n", "\n", "The config list looks like the following:\n", "```python\n", "config_list = [\n", " {\n", " \"model\": \"gpt-4\",\n", " \"api_key\": \"\",\n", " }, # OpenAI API endpoint for gpt-4\n", " {\n", " \"model\": \"gpt-35-turbo-0613\", # 0613 or newer is needed to use functions\n", " \"base_url\": \"\", \n", " \"api_type\": \"azure\", \n", " \"api_version\": \"2023-08-01-preview\", # 2023-07-01-preview or newer is needed to use functions\n", " \"api_key\": \"\"\n", " }\n", "]\n", "```\n", "\n", "You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## OpenAIWrapper with cost estimation" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "In update_usage_summary\n", "0.0001555\n" ] } ], "source": [ "from autogen import OpenAIWrapper\n", "\n", "client = OpenAIWrapper(config_list=config_list)\n", "messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'},]\n", "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=None)\n", "print(response.cost)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Usage Summary\n", "\n", "When creating an instance of OpenAIWrapper, the cost of all completions from the same instance is recorded. You can call `print_usage_summary()` to check your usage summary. To reset it, use `clear_usage_summary()`.\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No usage summary. 
Please call \"create\" first.\n" ] } ], "source": [ "from autogen import OpenAIWrapper\n", "\n", "client = OpenAIWrapper(config_list=config_list)\n", "messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'},]\n", "client.print_usage_summary() # print usage summary" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "In update_usage_summary\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", "Total cost: 0.00026\n", "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n", "\n", "All completions are non-cached: the total cost with cached completions is the same as actual cost.\n", "----------------------------------------------------------------------------------------------------\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", "Total cost: 0.00026\n", "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n", "----------------------------------------------------------------------------------------------------\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary including cached usage: \n", "Total cost: 0.00026\n", "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n", "----------------------------------------------------------------------------------------------------\n" ] } ], "source": [ "# The first creation\n", "# By default, cache_seed is set to 41 and enabled. 
If you don't want to use the cache, set cache_seed to None.\n", "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n", "client.print_usage_summary() # defaults to [\"actual\", \"total\"]\n", "client.print_usage_summary(mode='actual') # print actual usage summary\n", "client.print_usage_summary(mode='total') # print total usage summary" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n", "{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n" ] } ], "source": [ "# Access the usage summaries as dictionaries\n", "print(client.actual_usage_summary)\n", "print(client.total_usage_summary)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "In update_usage_summary\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", "Total cost: 0.00026\n", "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n", "\n", "Usage summary including cached usage: \n", "Total cost: 0.00052\n", "* Model 'gpt-35-turbo': cost: 0.00052, prompt_tokens: 50, completion_tokens: 220, total_tokens: 270\n", "----------------------------------------------------------------------------------------------------\n" ] } ], "source": [ "# Since cache is enabled, the same completion will be returned from cache, which will not incur any actual cost. 
\n", "# So the actual cost doesn't change, but the total cost doubles.\n", "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n", "client.print_usage_summary()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No usage summary. Please call \"create\" first.\n" ] } ], "source": [ "# clear usage summary\n", "client.clear_usage_summary() \n", "client.print_usage_summary()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "In update_usage_summary\n", "----------------------------------------------------------------------------------------------------\n", "No actual cost incurred (all completions are using cache).\n", "\n", "Usage summary including cached usage: \n", "Total cost: 0.00026\n", "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n", "----------------------------------------------------------------------------------------------------\n" ] } ], "source": [ "# All completions are returned from the cache, so no actual cost is incurred.\n", "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n", "client.print_usage_summary()" ] } ], "metadata": { "kernelspec": { "display_name": "msft", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 2 }