mirror of https://github.com/microsoft/autogen.git
synced 2025-10-31 09:50:11 +00:00
commit 7a4ba1a732: * init commit * add doc, notebook and test * fix test * update * update * update * update
309 lines · 11 KiB · Plaintext
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_client_cost.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
    "\n",
    "Licensed under the MIT License.\n",
    "\n",
    "# Use AutoGen's OpenAIWrapper for cost estimation\n",
    "The `OpenAIWrapper` from `autogen` tracks token counts and costs of your API calls. Use the `create()` method to initiate requests and `print_usage_summary()` to retrieve a detailed usage report, including the total cost and the token usage for both cached and actual requests.\n",
    "\n",
    "- `mode=[\"actual\", \"total\"]` (default): print the usage summary for non-cached completions and for all completions (including cache).\n",
    "- `mode='actual'`: only print non-cached usage.\n",
    "- `mode='total'`: only print all usage (including cache).\n",
    "\n",
    "Reset your session's usage data with `clear_usage_summary()` when needed.\n",
    "\n",
    "## Requirements\n",
    "\n",
    "AutoGen requires `Python>=3.8`:\n",
    "```bash\n",
    "pip install \"pyautogen\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set your API Endpoint\n",
    "\n",
    "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import autogen\n",
    "\n",
    "# config_list = autogen.config_list_from_json(\n",
    "#     \"OAI_CONFIG_LIST\",\n",
    "#     filter_dict={\n",
    "#         \"model\": [\"gpt-3.5-turbo\", \"gpt-4-1106-preview\"],\n",
    "#     },\n",
    "# )\n",
    "\n",
    "config_list = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
    "        \"model\": [\"gpt-3.5-turbo\"],\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It first looks for an environment variable named \"OAI_CONFIG_LIST\", which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by model (you can filter by other keys as well).\n",
    "\n",
    "The config list looks like the following:\n",
    "```python\n",
    "config_list = [\n",
    "    {\n",
    "        \"model\": \"gpt-4\",\n",
    "        \"api_key\": \"<your OpenAI API key>\",\n",
    "    },  # OpenAI API endpoint for gpt-4\n",
    "    {\n",
    "        \"model\": \"gpt-35-turbo-0613\",  # 0613 or newer is needed to use functions\n",
    "        \"base_url\": \"<your Azure OpenAI API base>\",\n",
    "        \"api_type\": \"azure\",\n",
    "        \"api_version\": \"2023-08-01-preview\",  # 2023-07-01-preview or newer is needed to use functions\n",
    "        \"api_key\": \"<your Azure OpenAI API key>\"\n",
    "    }\n",
    "]\n",
    "```\n",
    "\n",
    "You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## OpenAIWrapper with cost estimation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In update_usage_summary\n",
      "0.0001555\n"
     ]
    }
   ],
   "source": [
    "from autogen import OpenAIWrapper\n",
    "\n",
    "client = OpenAIWrapper(config_list=config_list)\n",
    "messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'}]\n",
    "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=None)\n",
    "print(response.cost)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage Summary\n",
    "\n",
    "When creating an instance of `OpenAIWrapper`, the cost of all completions from the same instance is recorded. Call `print_usage_summary()` to check your usage summary, and `clear_usage_summary()` to reset it.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "No usage summary. Please call \"create\" first.\n"
     ]
    }
   ],
   "source": [
    "from autogen import OpenAIWrapper\n",
    "\n",
    "client = OpenAIWrapper(config_list=config_list)\n",
    "messages = [{'role': 'user', 'content': 'Can you give me 3 useful tips on learning Python? Keep it simple and short.'}]\n",
    "client.print_usage_summary()  # print usage summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In update_usage_summary\n",
      "----------------------------------------------------------------------------------------------------\n",
      "Usage summary excluding cached usage: \n",
      "Total cost: 0.00026\n",
      "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
      "\n",
      "All completions are non-cached: the total cost with cached completions is the same as actual cost.\n",
      "----------------------------------------------------------------------------------------------------\n",
      "----------------------------------------------------------------------------------------------------\n",
      "Usage summary excluding cached usage: \n",
      "Total cost: 0.00026\n",
      "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
      "----------------------------------------------------------------------------------------------------\n",
      "----------------------------------------------------------------------------------------------------\n",
      "Usage summary including cached usage: \n",
      "Total cost: 0.00026\n",
      "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
      "----------------------------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# The first creation.\n",
    "# By default, cache_seed is set to 41 and caching is enabled. If you don't want to use the cache, set cache_seed to None.\n",
    "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
    "client.print_usage_summary()  # defaults to mode=[\"actual\", \"total\"]\n",
    "client.print_usage_summary(mode='actual')  # print the actual usage summary\n",
    "client.print_usage_summary(mode='total')  # print the total usage summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n",
      "{'total_cost': 0.0002575, 'gpt-35-turbo': {'cost': 0.0002575, 'prompt_tokens': 25, 'completion_tokens': 110, 'total_tokens': 135}}\n"
     ]
    }
   ],
   "source": [
    "# Access the usage summaries directly as dictionaries.\n",
    "print(client.actual_usage_summary)\n",
    "print(client.total_usage_summary)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In update_usage_summary\n",
      "----------------------------------------------------------------------------------------------------\n",
      "Usage summary excluding cached usage: \n",
      "Total cost: 0.00026\n",
      "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
      "\n",
      "Usage summary including cached usage: \n",
      "Total cost: 0.00052\n",
      "* Model 'gpt-35-turbo': cost: 0.00052, prompt_tokens: 50, completion_tokens: 220, total_tokens: 270\n",
      "----------------------------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# Since the cache is enabled, the same completion is returned from the cache and incurs no actual cost.\n",
    "# So the actual cost doesn't change, but the total cost doubles.\n",
    "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
    "client.print_usage_summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "No usage summary. Please call \"create\" first.\n"
     ]
    }
   ],
   "source": [
    "# Clear the usage summary.\n",
    "client.clear_usage_summary()\n",
    "client.print_usage_summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In update_usage_summary\n",
      "----------------------------------------------------------------------------------------------------\n",
      "No actual cost incurred (all completions are using cache).\n",
      "\n",
      "Usage summary including cached usage: \n",
      "Total cost: 0.00026\n",
      "* Model 'gpt-35-turbo': cost: 0.00026, prompt_tokens: 25, completion_tokens: 110, total_tokens: 135\n",
      "----------------------------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# All completions are returned from the cache, so no actual cost is incurred.\n",
    "response = client.create(messages=messages, model=\"gpt-35-turbo-1106\", cache_seed=41)\n",
    "client.print_usage_summary()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "msft",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}