- **AutoGen has expanded integrations with a variety of cloud-based model providers beyond OpenAI.**
- **Leverage models and platforms from Gemini, Anthropic, Mistral AI, Together.AI, and Groq for your AutoGen agents.**
- **Utilise models specifically for chat, language, image, and coding.**
- **LLM provider diversification can provide cost and resilience benefits.**
In addition to the recently released AutoGen [Google Gemini](https://ai.google.dev/) client, new client classes for [Mistral AI](https://mistral.ai/), [Anthropic](https://www.anthropic.com/), [Together.AI](https://www.together.ai/), and [Groq](https://groq.com/) enable you to utilize over 75 different large language models in your AutoGen agent workflow.
These new client classes tailor AutoGen's underlying messages to each provider's unique requirements and remove that complexity from the developer, who can then focus on building their AutoGen workflow.
Using them is as simple as installing the client-specific library and updating your LLM config with the relevant `api_type` and `model`. We'll demonstrate how to use them below.
The community is continuing to enhance and build new client classes as cloud-based inference providers arrive. So, watch this space, and feel free to [discuss](https://discord.gg/pAbnFJrkgZ) or [develop](https://github.com/microsoft/autogen/pulls) another one.
## Benefits of choice
The need to use only the best models to overcome workflow-breaking LLM inconsistency has diminished considerably over the last 12 months.
These new classes provide access to the very largest trillion-parameter models from OpenAI, Google, and Anthropic, continuing to provide the most consistent
and competent agent experiences. However, it's worth trying smaller models from the likes of Meta, Mistral AI, Microsoft, Qwen, and many others. Perhaps they
are capable enough for a task, or sub-task, or even better suited (such as a coding model)!
Using smaller models will have cost benefits, but they also allow you to test models that you could run locally, allowing you to determine if you can remove cloud inference costs
altogether or even run an AutoGen workflow offline.
On the topic of cost, these client classes also include provider-specific token cost calculations so you can monitor the cost impact of your workflows. With costs per million
tokens as low as 10 cents (and some are even free!), cost savings can be noticeable.
## Mix and match
How does Google's Gemini 1.5 Pro model stack up against Anthropic's Opus or Meta's Llama 3?
Now you have the ability to quickly change your agent configs and find out. If you want to run all three in the one workflow,
AutoGen's ability to associate specific configurations to each agent means you can select the best LLM for each agent.
## Capabilities
The common requirements of text generation and function/tool calling are supported by these client classes.
Multi-modal support, such as for image/audio/video, is an area of active development. The [Google Gemini](https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini) client class can be
used to create a multimodal agent.
## Tips
Here are some tips when working with these client classes:
- **Most to least capable** - start with larger models and get your workflow working, then iteratively try smaller models.
- **Right model** - choose one that's suited to your task, whether it's coding, function calling, knowledge, or creative writing.
- **Agent names** - these cloud providers do not use the `name` field on a message, so be sure to use your agent's name in their `system_message` and `description` fields, as well as instructing the LLM to 'act as' them. This is particularly important for "auto" speaker selection in group chats as we need to guide the LLM to choose the next agent based on a name, so tweak `select_speaker_message_template`, `select_speaker_prompt_template`, and `select_speaker_auto_multiple_template` with more guidance.
- **Context length** - as your conversation gets longer, models need to support larger context lengths, be mindful of what the model supports and consider using [Transform Messages](https://microsoft.github.io/autogen/docs/topics/handling_long_contexts/intro_to_transform_messages) to manage context size.
- **Provider parameters** - providers have parameters you can set such as temperature, maximum tokens, top-k, top-p, and safety. See each client class in AutoGen's [API Reference](https://microsoft.github.io/autogen/docs/reference/oai/gemini) or [documentation](https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini) for details.
- **Prompts** - prompt engineering is critical in guiding smaller LLMs to do what you need. [ConversableAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent), [GroupChat](https://microsoft.github.io/autogen/docs/reference/agentchat/groupchat), [UserProxyAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/user_proxy_agent), and [AssistantAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/assistant_agent) all have customizable prompt attributes that you can tailor. Here are some prompting tips from [Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview)([+Library](https://docs.anthropic.com/en/prompt-library/library)), [Mistral AI](https://docs.mistral.ai/guides/prompting_capabilities/), [Together.AI](https://docs.together.ai/docs/examples), and [Meta](https://llama.meta.com/docs/how-to-guides/prompting/).
- **Help!** - reach out on the AutoGen [Discord](https://discord.gg/pAbnFJrkgZ) or [log an issue](https://github.com/microsoft/autogen/issues) if you need help with or can help improve these client classes.
Now it's time to try them out.
## Quickstart
### Installation
Install the appropriate client based on the model you wish to use.
```sh
pip install pyautogen["mistral"] # for Mistral AI client
pip install pyautogen["anthropic"] # for Anthropic client
pip install pyautogen["together"] # for Together.AI client
pip install pyautogen["groq"] # for Groq client
```
### Configuration Setup
Add your model configurations to the `OAI_CONFIG_LIST`. Ensure you specify the `api_type` to initialize the respective client (Anthropic, Mistral, or Together).
```yaml
[
{
"model": "your anthropic model name",
"api_key": "your Anthropic api_key",
"api_type": "anthropic"
},
{
"model": "your mistral model name",
"api_key": "your Mistral AI api_key",
"api_type": "mistral"
},
{
"model": "your together.ai model name",
"api_key": "your Together.AI api_key",
"api_type": "together"
},
{
"model": "your groq model name",
"api_key": "your Groq api_key",
"api_type": "groq"
}
]
```
### Usage
The `[config_list_from_json](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils/#config_list_from_json)` function loads a list of configurations from an environment variable or a json file.
```py
import autogen
from autogen import AssistantAgent, UserProxyAgent
config_list = autogen.config_list_from_json(
"OAI_CONFIG_LIST"
)
```
### Construct Agents
Construct a simple conversation between a User proxy and an Assistant agent
```py
user_proxy = UserProxyAgent(
name="User_proxy",
code_execution_config={
"last_n_messages": 2,
"work_dir": "groupchat",
"use_docker": False, # Please set use_docker = True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
**NOTE: To integrate this setup into GroupChat, follow the [tutorial](https://microsoft.github.io/autogen/docs/notebooks/agentchat_groupchat) with the same config as above.**
## Function Calls
Now, let's look at how Anthropic's Sonnet 3.5 is able to suggest multiple function calls in a single response.
This example is a simple travel agent setup with an agent for function calling and a user proxy agent for executing the functions.
One thing you'll note here is Anthropic's models are more verbose than OpenAI's and will typically provide chain-of-thought or general verbiage when replying. Therefore we provide more explicit instructions to `functionbot` to not reply with more than necessary. Even so, it can't always help itself!
Let's start with setting up our configuration and agents.
```py
import os
import autogen
import json
from typing import Literal
from typing_extensions import Annotated
# Anthropic configuration, using api_type='anthropic'
anthropic_llm_config = {
"config_list":
[
{
"api_type": "anthropic",
"model": "claude-3-5-sonnet-20240620",
"api_key": os.getenv("ANTHROPIC_API_KEY"),
"cache_seed": None
}
]
}
# Our functionbot, who will be assigned two functions and
# given directions to use them.
functionbot = autogen.AssistantAgent(
name="functionbot",
system_message="For currency exchange tasks, only use "
"the functions you have been provided with. Do not "
"reply with helpful tips. Once you've recommended functions "
"reply with 'TERMINATE'.",
is_termination_msg=lambda x: x.get("content", "") and (x.get("content", "").rstrip().endswith("TERMINATE") or x.get("content", "") == ""),
llm_config=anthropic_llm_config,
)
# Our user proxy agent, who will be used to manage the customer
# request and conversation with the functionbot, terminating
# when we have the information we need.
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
system_message="You are a travel agent that provides "
"specific information to your customers. Get the "
"information you need and provide a great summary "
"so your customer can have a great trip. If you "
"have the information you need, simply reply with "
"'TERMINATE'.",
is_termination_msg=lambda x: x.get("content", "") and (x.get("content", "").rstrip().endswith("TERMINATE") or x.get("content", "") == ""),
return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"
```
Finally, we start the conversation with a request for help from our customer on their upcoming trip to New York and the Euro they would like exchanged to USD.
Importantly, we're also using Anthropic's Sonnet to provide a summary through the `summary_method`. Using `summary_prompt`, we guide Sonnet to give us an email output.
```py
# start the conversation
res = user_proxy.initiate_chat(
functionbot,
message="My customer wants to travel to New York and "
"they need to exchange 830 EUR to USD. Can you please "
"provide them with a summary of the weather and "
"exchanged currently in USD?",
summary_method="reflection_with_llm",
summary_args={
"summary_prompt": """Summarize the conversation by
providing an email response with the travel information
for the customer addressed as 'Dear Customer'. Do not
provide any additional conversation or apologise,
just provide the relevant information and the email."""
},
)
```
After the conversation has finished, we'll print out the summary.
```py
print(f"Here's the LLM summary of the conversation:\n\n{res.summary['content']}")
```
Here's the resulting output.
```text
user_proxy (to functionbot):
My customer wants to travel to New York and they need to exchange 830 EUR
to USD. Can you please provide them with a summary of the weather and
Certainly. I'll provide an email response to the customer with the travel
information as requested.
Dear Customer,
We are pleased to provide you with the following information for your
upcoming trip to New York:
Weather Forecast:
The current forecast for New York indicates a temperature of 11 degrees
Fahrenheit. Please be prepared for very cold weather and pack appropriate
warm clothing.
Currency Exchange:
We have calculated the currency exchange for you. Your 830 EUR will be
equivalent to approximately 913 USD at the current exchange rate.
We hope this information helps you prepare for your trip to New York. Have
a safe and enjoyable journey!
Best regards,
Travel Assistance Team
```
So we can see how Anthropic's Sonnet is able to suggest multiple tools in a single response, with AutoGen executing them both and providing the results back to Sonnet. Sonnet then finishes with a nice email summary that can be the basis for continued real-life conversation with the customer.
## More tips and tricks
For an interesting chess game between Anthropic's Sonnet and Mistral's Mixtral, we've put together a sample notebook that highlights some of the tips and tricks for working with non-OpenAI LLMs. [See the notebook here](https://microsoft.github.io/autogen/docs/notebooks/agentchat_nested_chats_chess_altmodels).