autogen/website/docs/topics/non-openai-models/cloud-cerebras.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cerebras\n",
    "\n",
    "[Cerebras](https://cerebras.ai) has developed the world's largest and fastest AI processor, the Wafer-Scale Engine-3 (WSE-3). Notably, the CS-3 system can run large language models like Llama-3.1-8B and Llama-3.1-70B at extremely fast speeds, making it an ideal platform for demanding AI workloads.\n",
    "\n",
    "While it's technically possible to adapt AutoGen to work with Cerebras' API by updating the `base_url`, this approach may not fully account for minor differences in parameter support. Using this library will also allow for tracking of the API costs based on actual token usage.\n",
    "\n",
    "For more information about Cerebras Cloud, visit [cloud.cerebras.ai](https://cloud.cerebras.ai). Their API reference is available at [inference-docs.cerebras.ai](https://inference-docs.cerebras.ai)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Requirements\n",
    "To use Cerebras with AutoGen, install the `autogen-agentchat[cerebras]` package."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install autogen-agentchat[\"cerebras\"]~=0.2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting Started\n",
    "\n",
    "Cerebras provides a number of models to use. See the list of [models here](https://inference-docs.cerebras.ai/introduction).\n",
    "\n",
    "See the sample `OAI_CONFIG_LIST` below showing how the Cerebras client class is used by specifying the `api_type` as `cerebras`.\n",
    "```python\n",
    "[\n",
    "    {\n",
    "        \"model\": \"llama3.1-8b\",\n",
    "        \"api_key\": \"your Cerebras API Key goes here\",\n",
    "        \"api_type\": \"cerebras\"\n",
    "    },\n",
    "    {\n",
    "        \"model\": \"llama3.1-70b\",\n",
    "        \"api_key\": \"your Cerebras API Key goes here\",\n",
    "        \"api_type\": \"cerebras\"\n",
    "    }\n",
    "]\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Credentials\n",
    "\n",
    "Get an API Key from [cloud.cerebras.ai](https://cloud.cerebras.ai/) and add it to your environment variables:\n",
    "\n",
    "```\n",
    "export CEREBRAS_API_KEY=\"your-api-key-here\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## API parameters\n",
    "\n",
    "The following parameters can be added to your config for the Cerebras API. See [this link](https://inference-docs.cerebras.ai/api-reference/chat-completions) for further information on them and their default values.\n",
    "\n",
    "- max_tokens (null, integer >= 0)\n",
    "- seed (number)\n",
    "- stream (True or False)\n",
    "- temperature (number 0..1.5)\n",
    "- top_p (number)\n",
    "\n",
    "Example:\n",
    "```python\n",
    "[\n",
    "    {\n",
    "        \"model\": \"llama3.1-70b\",\n",
    "        \"api_key\": \"your Cerebras API Key goes here\",\n",
    "        \"api_type\": \"cerebras\"\n",
    "        \"max_tokens\": 10000,\n",
    "        \"seed\": 1234,\n",
    "        \"stream\" True,\n",
    "        \"temperature\": 0.5,\n",
    "        \"top_p\": 0.2, # Note: It is recommended to set temperature or top_p but not both.\n",
    "    }\n",
    "]\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Two-Agent Coding Example\n",
    "\n",
    "In this example, we run a two-agent chat with an AssistantAgent (primarily a coding agent) to generate code to count the number of prime numbers between 1 and 10,000 and then it will be executed.\n",
    "\n",
    "We'll use Meta's LLama-3.1-70B model which is suitable for coding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "from autogen.oai.cerebras import CerebrasClient, calculate_cerebras_cost\n",
    "\n",
    "config_list = [{\"model\": \"llama3.1-70b\", \"api_key\": os.environ.get(\"CEREBRAS_API_KEY\"), \"api_type\": \"cerebras\"}]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Importantly, we have tweaked the system message so that the model doesn't return the termination keyword, which we've changed to FINISH, with the code block."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "from autogen import AssistantAgent, UserProxyAgent\n",
    "from autogen.coding import LocalCommandLineCodeExecutor\n",
    "\n",
    "# Setting up the code executor\n",
    "workdir = Path(\"coding\")\n",
    "workdir.mkdir(exist_ok=True)\n",
    "code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)\n",
    "\n",
    "# Setting up the agents\n",
    "\n",
    "# The UserProxyAgent will execute the code that the AssistantAgent provides\n",
    "user_proxy_agent = UserProxyAgent(\n",
    "    name=\"User\",\n",
    "    code_execution_config={\"executor\": code_executor},\n",
    "    is_termination_msg=lambda msg: \"FINISH\" in msg.get(\"content\"),\n",
    ")\n",
    "\n",
    "system_message = \"\"\"You are a helpful AI assistant who writes code and the user executes it.\n",
    "Solve tasks using your coding and language skills.\n",
    "In the following cases, suggest python code (in a python coding block) for the user to execute.\n",
    "Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.\n",
    "When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.\n",
    "Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.\n",
    "If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n",
    "When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.\n",
    "IMPORTANT: Wait for the user to execute your code and then you can reply with the word \"FINISH\". DO NOT OUTPUT \"FINISH\" after your code block.\"\"\"\n",
    "\n",
    "# The AssistantAgent, using Llama-3.1-70B on Cerebras Inference, will take the coding request and return code\n",
    "assistant_agent = AssistantAgent(\n",
    "    name=\"Cerebras Assistant\",\n",
    "    system_message=system_message,\n",
    "    llm_config={\"config_list\": config_list},\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33mUser\u001b[0m (to Cerebras Assistant):\n",
      "\n",
      "Provide code to count the number of prime numbers from 1 to 10000.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mCerebras Assistant\u001b[0m (to User):\n",
      "\n",
      "To count the number of prime numbers from 1 to 10000, we will utilize a simple algorithm that checks each number in the range to see if it is prime. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself.\n",
      "\n",
      "Here's how we can do it using a Python script:\n",
      "\n",
      "```python\n",
      "def count_primes(n):\n",
      "    primes = 0\n",
      "    for possiblePrime in range(2, n + 1):\n",
      "        # Assume number is prime until shown it is not. \n",
      "        isPrime = True\n",
      "        for num in range(2, int(possiblePrime ** 0.5) + 1):\n",
      "            if possiblePrime % num == 0:\n",
      "                isPrime = False\n",
      "                break\n",
      "        if isPrime:\n",
      "            primes += 1\n",
      "    return primes\n",
      "\n",
      "# Counting prime numbers from 1 to 10000\n",
      "count = count_primes(10000)\n",
      "print(count)\n",
      "```\n",
      "\n",
      "Please execute this code. I will respond with \"FINISH\" after you provide the result.\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Replying as User. Provide feedback to Cerebras Assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:  \n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[31m\n",
      ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "# Start the chat, with the UserProxyAgent asking the AssistantAgent the message\n",
    "chat_result = user_proxy_agent.initiate_chat(\n",
    "    assistant_agent,\n",
    "    message=\"Provide code to count the number of prime numbers from 1 to 10000.\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tool Call Example\n",
    "\n",
    "In this example, instead of writing code, we will show how Meta's Llama-3.1-70B model can perform parallel tool calling, where it recommends calling more than one tool at a time.\n",
    "\n",
    "We'll use a simple travel agent assistant program where we have a couple of tools for weather and currency conversion.\n",
    "\n",
    "We start by importing libraries and setting up our configuration to use Llama-3.1-70B and the `cerebras` client class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "from typing import Literal\n",
    "\n",
    "from typing_extensions import Annotated\n",
    "\n",
    "import autogen\n",
    "\n",
    "config_list = [\n",
    "    {\n",
    "        \"model\": \"llama3.1-70b\",\n",
    "        \"api_key\": os.environ.get(\"CEREBRAS_API_KEY\"),\n",
    "        \"api_type\": \"cerebras\",\n",
    "    }\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create our two agents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the agent for tool calling\n",
    "chatbot = autogen.AssistantAgent(\n",
    "    name=\"chatbot\",\n",
    "    system_message=\"\"\"\n",
    "        For currency exchange and weather forecasting tasks,\n",
    "        only use the functions you have been provided with.\n",
    "        When you summarize, make sure you've considered ALL previous instructions.\n",
    "        Output 'HAVE FUN!' when an answer has been provided.\n",
    "    \"\"\",\n",
    "    llm_config={\"config_list\": config_list},\n",
    ")\n",
    "\n",
    "# Note that we have changed the termination string to be \"HAVE FUN!\"\n",
    "user_proxy = autogen.UserProxyAgent(\n",
    "    name=\"user_proxy\",\n",
    "    is_termination_msg=lambda x: x.get(\"content\", \"\") and \"HAVE FUN!\" in x.get(\"content\", \"\"),\n",
    "    human_input_mode=\"NEVER\",\n",
    "    max_consecutive_auto_reply=1,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create the two functions, annotating them so that those descriptions can be passed through to the LLM.\n",
    "\n",
    "We associate them with the agents using `register_for_execution` for the user_proxy so it can execute the function and `register_for_llm` for the chatbot (powered by the LLM) so it can pass the function definitions to the LLM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Currency Exchange function\n",
    "\n",
    "CurrencySymbol = Literal[\"USD\", \"EUR\"]\n",
    "\n",
    "# Define our function that we expect to call\n",
    "\n",
    "\n",
    "def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:\n",
    "    if base_currency == quote_currency:\n",
    "        return 1.0\n",
    "    elif base_currency == \"USD\" and quote_currency == \"EUR\":\n",
    "        return 1 / 1.1\n",
    "    elif base_currency == \"EUR\" and quote_currency == \"USD\":\n",
    "        return 1.1\n",
    "    else:\n",
    "        raise ValueError(f\"Unknown currencies {base_currency}, {quote_currency}\")\n",
    "\n",
    "\n",
    "# Register the function with the agent\n",
    "\n",
    "\n",
    "@user_proxy.register_for_execution()\n",
    "@chatbot.register_for_llm(description=\"Currency exchange calculator.\")\n",
    "def currency_calculator(\n",
    "    base_amount: Annotated[float, \"Amount of currency in base_currency\"],\n",
    "    base_currency: Annotated[CurrencySymbol, \"Base currency\"] = \"USD\",\n",
    "    quote_currency: Annotated[CurrencySymbol, \"Quote currency\"] = \"EUR\",\n",
    ") -> str:\n",
    "    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount\n",
    "    return f\"{format(quote_amount, '.2f')} {quote_currency}\"\n",
    "\n",
    "\n",
    "# Weather function\n",
    "\n",
    "\n",
    "# Example function to make available to model\n",
    "def get_current_weather(location, unit=\"fahrenheit\"):\n",
    "    \"\"\"Get the weather for some location\"\"\"\n",
    "    if \"chicago\" in location.lower():\n",
    "        return json.dumps({\"location\": \"Chicago\", \"temperature\": \"13\", \"unit\": unit})\n",
    "    elif \"san francisco\" in location.lower():\n",
    "        return json.dumps({\"location\": \"San Francisco\", \"temperature\": \"55\", \"unit\": unit})\n",
    "    elif \"new york\" in location.lower():\n",
    "        return json.dumps({\"location\": \"New York\", \"temperature\": \"11\", \"unit\": unit})\n",
    "    else:\n",
    "        return json.dumps({\"location\": location, \"temperature\": \"unknown\"})\n",
    "\n",
    "\n",
    "# Register the function with the agent\n",
    "\n",
    "\n",
    "@user_proxy.register_for_execution()\n",
    "@chatbot.register_for_llm(description=\"Weather forecast for US cities.\")\n",
    "def weather_forecast(\n",
    "    location: Annotated[str, \"City name\"],\n",
    ") -> str:\n",
    "    weather_details = get_current_weather(location=location)\n",
    "    weather = json.loads(weather_details)\n",
    "    return f\"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We pass through our customer's message and run the chat.\n",
    "\n",
    "Finally, we ask the LLM to summarise the chat and print that out."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
      "\n",
      "What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
      "\n",
      "\u001b[32m***** Suggested tool call (210f6ac6d): weather_forecast *****\u001b[0m\n",
      "Arguments: \n",
      "{\"location\": \"New York\"}\n",
      "\u001b[32m*************************************************************\u001b[0m\n",
      "\u001b[32m***** Suggested tool call (3c00ac7d5): currency_calculator *****\u001b[0m\n",
      "Arguments: \n",
      "{\"base_amount\": 123.45, \"base_currency\": \"EUR\", \"quote_currency\": \"USD\"}\n",
      "\u001b[32m****************************************************************\u001b[0m\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[35m\n",
      ">>>>>>>> EXECUTING FUNCTION weather_forecast...\u001b[0m\n",
      "\u001b[35m\n",
      ">>>>>>>> EXECUTING FUNCTION currency_calculator...\u001b[0m\n",
      "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
      "\n",
      "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
      "\n",
      "\u001b[32m***** Response from calling tool (210f6ac6d) *****\u001b[0m\n",
      "New York will be 11 degrees fahrenheit\n",
      "\u001b[32m**************************************************\u001b[0m\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
      "\n",
      "\u001b[32m***** Response from calling tool (3c00ac7d5) *****\u001b[0m\n",
      "135.80 USD\n",
      "\u001b[32m**************************************************\u001b[0m\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
      "\n",
      "New York will be 11 degrees fahrenheit.\n",
      "123.45 EUR is equivalent to 135.80 USD.\n",
      " \n",
      "For a great holiday, explore the Statue of Liberty, take a walk through Central Park, or visit one of the many world-class museums. Also, you'll find great food ranging from bagels to fine dining experiences. HAVE FUN!\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "LLM SUMMARY: New York will be 11 degrees fahrenheit. 123.45 EUR is equivalent to 135.80 USD. Explore the Statue of Liberty, walk through Central Park, or visit one of the many world-class museums for a great holiday in New York.\n",
      "\n",
      "Duration: 73.97937774658203ms\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "# start the conversation\n",
    "res = user_proxy.initiate_chat(\n",
    "    chatbot,\n",
    "    message=\"What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.\",\n",
    "    summary_method=\"reflection_with_llm\",\n",
    ")\n",
    "\n",
    "end_time = time.time()\n",
    "\n",
    "print(f\"LLM SUMMARY: {res.summary['content']}\\n\\nDuration: {(end_time - start_time) * 1000}ms\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that the Cerebras Wafer-Scale Engine-3 (WSE-3) completed the query in 74ms -- faster than the blink of an eye!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
Add Cerebras Integration (#3585) * Cerebras Integration * Address feedback * Fix typo * Run formatter 2024-09-30 17:14:55 -04:00			`{`
			`"cells": [`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"# Cerebras\n",`
			`"\n",`
			`"[Cerebras](https://cerebras.ai) has developed the world's largest and fastest AI processor, the Wafer-Scale Engine-3 (WSE-3). Notably, the CS-3 system can run large language models like Llama-3.1-8B and Llama-3.1-70B at extremely fast speeds, making it an ideal platform for demanding AI workloads.\n",`
			`"\n",`
			"While it's technically possible to adapt AutoGen to work with Cerebras' API by updating the `base_url`, this approach may not fully account for minor differences in parameter support. Using this library will also allow for tracking of the API costs based on actual token usage.\n",
			`"\n",`
			`"For more information about Cerebras Cloud, visit [cloud.cerebras.ai](https://cloud.cerebras.ai). Their API reference is available at [inference-docs.cerebras.ai](https://inference-docs.cerebras.ai)."`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
Fix typos in Cerebras doc (#3590) * Fix typos in Cerebras doc Fix typo in Cerebras documentation * FIx formatting 2024-10-02 11:10:55 -04:00			`"## Requirements\n",`
Update version to 0.2.36, update package name (#3592) * Update version to 0.2.36, update package name * update publish * Formatting * Update README.md * update email 2024-10-01 20:05:11 -04:00			"To use Cerebras with AutoGen, install the `autogen-agentchat[cerebras]` package."
Add Cerebras Integration (#3585) * Cerebras Integration * Address feedback * Fix typo * Run formatter 2024-09-30 17:14:55 -04:00			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": null,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
Update version to 0.2.36, update package name (#3592) * Update version to 0.2.36, update package name * update publish * Formatting * Update README.md * update email 2024-10-01 20:05:11 -04:00			`"!pip install autogen-agentchat[\"cerebras\"]~=0.2"`
Add Cerebras Integration (#3585) * Cerebras Integration * Address feedback * Fix typo * Run formatter 2024-09-30 17:14:55 -04:00			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## Getting Started\n",`
			`"\n",`
			`"Cerebras provides a number of models to use. See the list of [models here](https://inference-docs.cerebras.ai/introduction).\n",`
			`"\n",`
Fix typos in Cerebras doc (#3590) * Fix typos in Cerebras doc Fix typo in Cerebras documentation * FIx formatting 2024-10-02 11:10:55 -04:00			"See the sample `OAI_CONFIG_LIST` below showing how the Cerebras client class is used by specifying the `api_type` as `cerebras`.\n",
Add Cerebras Integration (#3585) * Cerebras Integration * Address feedback * Fix typo * Run formatter 2024-09-30 17:14:55 -04:00			"```python\n",
			`"[\n",`
			`" {\n",`
			`" \"model\": \"llama3.1-8b\",\n",`
			`" \"api_key\": \"your Cerebras API Key goes here\",\n",`
			`" \"api_type\": \"cerebras\"\n",`
			`" },\n",`
			`" {\n",`
			`" \"model\": \"llama3.1-70b\",\n",`
			`" \"api_key\": \"your Cerebras API Key goes here\",\n",`
			`" \"api_type\": \"cerebras\"\n",`
			`" }\n",`
			`"]\n",`
			"```"
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## Credentials\n",`
			`"\n",`
			`"Get an API Key from [cloud.cerebras.ai](https://cloud.cerebras.ai/) and add it to your environment variables:\n",`
			`"\n",`
			"```\n",
			`"export CEREBRAS_API_KEY=\"your-api-key-here\"\n",`
			"```"
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## API parameters\n",`
			`"\n",`
			`"The following parameters can be added to your config for the Cerebras API. See [this link](https://inference-docs.cerebras.ai/api-reference/chat-completions) for further information on them and their default values.\n",`
			`"\n",`
			`"- max_tokens (null, integer >= 0)\n",`
			`"- seed (number)\n",`
			`"- stream (True or False)\n",`
			`"- temperature (number 0..1.5)\n",`
			`"- top_p (number)\n",`
			`"\n",`
			`"Example:\n",`
			"```python\n",
			`"[\n",`
			`" {\n",`
			`" \"model\": \"llama3.1-70b\",\n",`
			`" \"api_key\": \"your Cerebras API Key goes here\",\n",`
			`" \"api_type\": \"cerebras\"\n",`
			`" \"max_tokens\": 10000,\n",`
			`" \"seed\": 1234,\n",`
			`" \"stream\" True,\n",`
			`" \"temperature\": 0.5,\n",`
			`" \"top_p\": 0.2, # Note: It is recommended to set temperature or top_p but not both.\n",`
			`" }\n",`
			`"]\n",`
			"```"
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## Two-Agent Coding Example\n",`
			`"\n",`
			`"In this example, we run a two-agent chat with an AssistantAgent (primarily a coding agent) to generate code to count the number of prime numbers between 1 and 10,000 and then it will be executed.\n",`
			`"\n",`
			`"We'll use Meta's LLama-3.1-70B model which is suitable for coding."`
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 32,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
			`"import os\n",`
			`"\n",`
			`"from autogen.oai.cerebras import CerebrasClient, calculate_cerebras_cost\n",`
			`"\n",`
			`"config_list = [{\"model\": \"llama3.1-70b\", \"api_key\": os.environ.get(\"CEREBRAS_API_KEY\"), \"api_type\": \"cerebras\"}]"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"Importantly, we have tweaked the system message so that the model doesn't return the termination keyword, which we've changed to FINISH, with the code block."`
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 33,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
			`"from pathlib import Path\n",`
			`"\n",`
			`"from autogen import AssistantAgent, UserProxyAgent\n",`
			`"from autogen.coding import LocalCommandLineCodeExecutor\n",`
			`"\n",`
			`"# Setting up the code executor\n",`
			`"workdir = Path(\"coding\")\n",`
			`"workdir.mkdir(exist_ok=True)\n",`
			`"code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)\n",`
			`"\n",`
			`"# Setting up the agents\n",`
			`"\n",`
			`"# The UserProxyAgent will execute the code that the AssistantAgent provides\n",`
			`"user_proxy_agent = UserProxyAgent(\n",`
			`" name=\"User\",\n",`
			`" code_execution_config={\"executor\": code_executor},\n",`
			`" is_termination_msg=lambda msg: \"FINISH\" in msg.get(\"content\"),\n",`
			`")\n",`
			`"\n",`
			`"system_message = \"\"\"You are a helpful AI assistant who writes code and the user executes it.\n",`
			`"Solve tasks using your coding and language skills.\n",`
			`"In the following cases, suggest python code (in a python coding block) for the user to execute.\n",`
			`"Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.\n",`
			`"When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.\n",`
			`"Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.\n",`
			`"If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n",`
			`"When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.\n",`
			`"IMPORTANT: Wait for the user to execute your code and then you can reply with the word \"FINISH\". DO NOT OUTPUT \"FINISH\" after your code block.\"\"\"\n",`
			`"\n",`
Fix typos in Cerebras doc (#3590) * Fix typos in Cerebras doc Fix typo in Cerebras documentation * FIx formatting 2024-10-02 11:10:55 -04:00			`"# The AssistantAgent, using Llama-3.1-70B on Cerebras Inference, will take the coding request and return code\n",`
Add Cerebras Integration (#3585) * Cerebras Integration * Address feedback * Fix typo * Run formatter 2024-09-30 17:14:55 -04:00			`"assistant_agent = AssistantAgent(\n",`
			`" name=\"Cerebras Assistant\",\n",`
			`" system_message=system_message,\n",`
			`" llm_config={\"config_list\": config_list},\n",`
			`")"`
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 35,`
			`"metadata": {},`
			`"outputs": [`
			`{`
			`"name": "stdout",`
			`"output_type": "stream",`
			`"text": [`
			`"\u001b[33mUser\u001b[0m (to Cerebras Assistant):\n",`
			`"\n",`
			`"Provide code to count the number of prime numbers from 1 to 10000.\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"\u001b[33mCerebras Assistant\u001b[0m (to User):\n",`
			`"\n",`
			`"To count the number of prime numbers from 1 to 10000, we will utilize a simple algorithm that checks each number in the range to see if it is prime. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself.\n",`
			`"\n",`
			`"Here's how we can do it using a Python script:\n",`
			`"\n",`
			"```python\n",
			`"def count_primes(n):\n",`
			`" primes = 0\n",`
			`" for possiblePrime in range(2, n + 1):\n",`
			`" # Assume number is prime until shown it is not. \n",`
			`" isPrime = True\n",`
			`" for num in range(2, int(possiblePrime ** 0.5) + 1):\n",`
			`" if possiblePrime % num == 0:\n",`
			`" isPrime = False\n",`
			`" break\n",`
			`" if isPrime:\n",`
			`" primes += 1\n",`
			`" return primes\n",`
			`"\n",`
			`"# Counting prime numbers from 1 to 10000\n",`
			`"count = count_primes(10000)\n",`
			`"print(count)\n",`
			"```\n",
			`"\n",`
			`"Please execute this code. I will respond with \"FINISH\" after you provide the result.\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n"`
			`]`
			`},`
			`{`
			`"name": "stdout",`
			`"output_type": "stream",`
			`"text": [`
			`"Replying as User. Provide feedback to Cerebras Assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: \n"`
			`]`
			`},`
			`{`
			`"name": "stdout",`
			`"output_type": "stream",`
			`"text": [`
			`"\u001b[31m\n",`
			`">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"`
			`]`
			`}`
			`],`
			`"source": [`
			`"# Start the chat, with the UserProxyAgent asking the AssistantAgent the message\n",`
			`"chat_result = user_proxy_agent.initiate_chat(\n",`
			`" assistant_agent,\n",`
			`" message=\"Provide code to count the number of prime numbers from 1 to 10000.\",\n",`
			`")"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## Tool Call Example\n",`
			`"\n",`
			`"In this example, instead of writing code, we will show how Meta's Llama-3.1-70B model can perform parallel tool calling, where it recommends calling more than one tool at a time.\n",`
			`"\n",`
			`"We'll use a simple travel agent assistant program where we have a couple of tools for weather and currency conversion.\n",`
			`"\n",`
			"We start by importing libraries and setting up our configuration to use Llama-3.1-70B and the `cerebras` client class."
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 36,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
			`"import json\n",`
			`"import os\n",`
			`"from typing import Literal\n",`
			`"\n",`
			`"from typing_extensions import Annotated\n",`
			`"\n",`
			`"import autogen\n",`
			`"\n",`
			`"config_list = [\n",`
			`" {\n",`
			`" \"model\": \"llama3.1-70b\",\n",`
			`" \"api_key\": os.environ.get(\"CEREBRAS_API_KEY\"),\n",`
			`" \"api_type\": \"cerebras\",\n",`
			`" }\n",`
			`"]"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"Create our two agents."`
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 43,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
			`"# Create the agent for tool calling\n",`
			`"chatbot = autogen.AssistantAgent(\n",`
			`" name=\"chatbot\",\n",`
			`" system_message=\"\"\"\n",`
			`" For currency exchange and weather forecasting tasks,\n",`
			`" only use the functions you have been provided with.\n",`
			`" When you summarize, make sure you've considered ALL previous instructions.\n",`
			`" Output 'HAVE FUN!' when an answer has been provided.\n",`
			`" \"\"\",\n",`
			`" llm_config={\"config_list\": config_list},\n",`
			`")\n",`
			`"\n",`
			`"# Note that we have changed the termination string to be \"HAVE FUN!\"\n",`
			`"user_proxy = autogen.UserProxyAgent(\n",`
			`" name=\"user_proxy\",\n",`
			`" is_termination_msg=lambda x: x.get(\"content\", \"\") and \"HAVE FUN!\" in x.get(\"content\", \"\"),\n",`
			`" human_input_mode=\"NEVER\",\n",`
			`" max_consecutive_auto_reply=1,\n",`
			`")"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"Create the two functions, annotating them so that those descriptions can be passed through to the LLM.\n",`
			`"\n",`
			"We associate them with the agents using `register_for_execution` for the user_proxy so it can execute the function and `register_for_llm` for the chatbot (powered by the LLM) so it can pass the function definitions to the LLM."
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 44,`
			`"metadata": {},`
			`"outputs": [],`
			`"source": [`
			`"# Currency Exchange function\n",`
			`"\n",`
			`"CurrencySymbol = Literal[\"USD\", \"EUR\"]\n",`
			`"\n",`
			`"# Define our function that we expect to call\n",`
			`"\n",`
			`"\n",`
			`"def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:\n",`
			`" if base_currency == quote_currency:\n",`
			`" return 1.0\n",`
			`" elif base_currency == \"USD\" and quote_currency == \"EUR\":\n",`
			`" return 1 / 1.1\n",`
			`" elif base_currency == \"EUR\" and quote_currency == \"USD\":\n",`
			`" return 1.1\n",`
			`" else:\n",`
			`" raise ValueError(f\"Unknown currencies {base_currency}, {quote_currency}\")\n",`
			`"\n",`
			`"\n",`
			`"# Register the function with the agent\n",`
			`"\n",`
			`"\n",`
			`"@user_proxy.register_for_execution()\n",`
			`"@chatbot.register_for_llm(description=\"Currency exchange calculator.\")\n",`
			`"def currency_calculator(\n",`
			`" base_amount: Annotated[float, \"Amount of currency in base_currency\"],\n",`
			`" base_currency: Annotated[CurrencySymbol, \"Base currency\"] = \"USD\",\n",`
			`" quote_currency: Annotated[CurrencySymbol, \"Quote currency\"] = \"EUR\",\n",`
			`") -> str:\n",`
			`" quote_amount = exchange_rate(base_currency, quote_currency) * base_amount\n",`
			`" return f\"{format(quote_amount, '.2f')} {quote_currency}\"\n",`
			`"\n",`
			`"\n",`
			`"# Weather function\n",`
			`"\n",`
			`"\n",`
			`"# Example function to make available to model\n",`
			`"def get_current_weather(location, unit=\"fahrenheit\"):\n",`
			`" \"\"\"Get the weather for some location\"\"\"\n",`
			`" if \"chicago\" in location.lower():\n",`
			`" return json.dumps({\"location\": \"Chicago\", \"temperature\": \"13\", \"unit\": unit})\n",`
			`" elif \"san francisco\" in location.lower():\n",`
			`" return json.dumps({\"location\": \"San Francisco\", \"temperature\": \"55\", \"unit\": unit})\n",`
			`" elif \"new york\" in location.lower():\n",`
			`" return json.dumps({\"location\": \"New York\", \"temperature\": \"11\", \"unit\": unit})\n",`
			`" else:\n",`
			`" return json.dumps({\"location\": location, \"temperature\": \"unknown\"})\n",`
			`"\n",`
			`"\n",`
			`"# Register the function with the agent\n",`
			`"\n",`
			`"\n",`
			`"@user_proxy.register_for_execution()\n",`
			`"@chatbot.register_for_llm(description=\"Weather forecast for US cities.\")\n",`
			`"def weather_forecast(\n",`
			`" location: Annotated[str, \"City name\"],\n",`
			`") -> str:\n",`
			`" weather_details = get_current_weather(location=location)\n",`
			`" weather = json.loads(weather_details)\n",`
			`" return f\"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}\""`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"We pass through our customer's message and run the chat.\n",`
			`"\n",`
			`"Finally, we ask the LLM to summarise the chat and print that out."`
			`]`
			`},`
			`{`
			`"cell_type": "code",`
			`"execution_count": 45,`
			`"metadata": {},`
			`"outputs": [`
			`{`
			`"name": "stdout",`
			`"output_type": "stream",`
			`"text": [`
			`"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",`
			`"\n",`
			`"What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"\u001b[33mchatbot\u001b[0m (to user_proxy):\n",`
			`"\n",`
			`"\u001b[32m*** Suggested tool call (210f6ac6d): weather_forecast ***\u001b[0m\n",`
			`"Arguments: \n",`
			`"{\"location\": \"New York\"}\n",`
			`"\u001b[32m*************************************************************\u001b[0m\n",`
			`"\u001b[32m*** Suggested tool call (3c00ac7d5): currency_calculator ***\u001b[0m\n",`
			`"Arguments: \n",`
			`"{\"base_amount\": 123.45, \"base_currency\": \"EUR\", \"quote_currency\": \"USD\"}\n",`
			`"\u001b[32m****************************************************************\u001b[0m\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"\u001b[35m\n",`
			`">>>>>>>> EXECUTING FUNCTION weather_forecast...\u001b[0m\n",`
			`"\u001b[35m\n",`
			`">>>>>>>> EXECUTING FUNCTION currency_calculator...\u001b[0m\n",`
			`"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",`
			`"\n",`
			`"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",`
			`"\n",`
			`"\u001b[32m*** Response from calling tool (210f6ac6d) ***\u001b[0m\n",`
			`"New York will be 11 degrees fahrenheit\n",`
			`"\u001b[32m**************************************************\u001b[0m\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",`
			`"\n",`
			`"\u001b[32m*** Response from calling tool (3c00ac7d5) ***\u001b[0m\n",`
			`"135.80 USD\n",`
			`"\u001b[32m**************************************************\u001b[0m\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"\u001b[33mchatbot\u001b[0m (to user_proxy):\n",`
			`"\n",`
			`"New York will be 11 degrees fahrenheit.\n",`
			`"123.45 EUR is equivalent to 135.80 USD.\n",`
			`" \n",`
			`"For a great holiday, explore the Statue of Liberty, take a walk through Central Park, or visit one of the many world-class museums. Also, you'll find great food ranging from bagels to fine dining experiences. HAVE FUN!\n",`
			`"\n",`
			`"--------------------------------------------------------------------------------\n",`
			`"LLM SUMMARY: New York will be 11 degrees fahrenheit. 123.45 EUR is equivalent to 135.80 USD. Explore the Statue of Liberty, walk through Central Park, or visit one of the many world-class museums for a great holiday in New York.\n",`
			`"\n",`
			`"Duration: 73.97937774658203ms\n"`
			`]`
			`}`
			`],`
			`"source": [`
			`"import time\n",`
			`"\n",`
			`"start_time = time.time()\n",`
			`"\n",`
			`"# start the conversation\n",`
			`"res = user_proxy.initiate_chat(\n",`
			`" chatbot,\n",`
			`" message=\"What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.\",\n",`
			`" summary_method=\"reflection_with_llm\",\n",`
			`")\n",`
			`"\n",`
			`"end_time = time.time()\n",`
			`"\n",`
			`"print(f\"LLM SUMMARY: {res.summary['content']}\\n\\nDuration: {(end_time - start_time) * 1000}ms\")"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"We can see that the Cerebras Wafer-Scale Engine-3 (WSE-3) completed the query in 74ms -- faster than the blink of an eye!"`
			`]`
			`}`
			`],`
			`"metadata": {`
			`"kernelspec": {`
			`"display_name": "Python 3 (ipykernel)",`
			`"language": "python",`
			`"name": "python3"`
			`},`
			`"language_info": {`
			`"codemirror_mode": {`
			`"name": "ipython",`
			`"version": 3`
			`},`
			`"file_extension": ".py",`
			`"mimetype": "text/x-python",`
			`"name": "python",`
			`"nbconvert_exporter": "python",`
			`"pygments_lexer": "ipython3",`
			`"version": "3.8.13"`
			`}`
			`},`
			`"nbformat": 4,`
			`"nbformat_minor": 4`
			`}`