## Why are these changes needed?
Enables usage statistics for streaming responses by default.
There is a similar bug in the AzureAI client. Theoretically adding the
parameter
```
model_extras={"stream_options": {"include_usage": True}}
```
should fix the problem, but I'm currently unable to test that workflow
## Related issue number
closes https://github.com/microsoft/autogen/issues/6548
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
## Why are these changes needed?
FIX/mistral could not recive name field, so add model transformer for
mistral
## Related issue number
Closes#6147
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [x] I've made sure all auto checks have passed.
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
## Why are these changes needed?
Multimodal message fill context with other routine. However current
`_set_empty_to_whitespace` is fill with context.
So, error occured.
And, I checked `multimodal_user_transformer_funcs` and I found it, in
this routine, context must not be empty.
Now remove the `_set_empty_to_whitespace` when multimodal message,
<!-- Please give a short summary of the change and the problem this
solves. -->
## Related issue number
Closes#6439
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [x] I've made sure all auto checks have passed.
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
## Why are these changes needed?
| Package | Test time-Origin (Sec) | Test time-Edited (Sec) |
|-------------------------|------------------|-----------------------------------------------|
| autogen-studio | 1.64 | 1.64 |
| autogen-core | 6.03 | 6.17 |
| autogen-ext | 387.15 | 373.40 |
| autogen-agentchat | 54.20 | 20.67 |
## Related issue number
Related #6361
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
The current implementation of consecutive `SystemMessage` merging
applies only to models where `model_info.family` starts with
`"gemini-"`.
Since PR #6327 introduced the `multiple_system_messages` field in
`model_info`, we can now generalize this logic by checking whether the
field is explicitly set to `False`.
This change replaces the hardcoded family check with a conditional that
merges consecutive `SystemMessage` blocks whenever
`multiple_system_messages` is set to `False`.
Test cases that previously depended on the `"gemini"` model family have
been updated to reflect this configuration flag, and renamed accordingly
for clarity.
In addition, for consistency across conditional logic, a follow-up PR is
planned to refactor the Claude-specific transformation condition
(currently implemented via `create_args.get("model",
"unknown").startswith("claude-")`)
to instead use the existing `is_claude()`.
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
This PR improves fallback safety when an invalid `model_family` is
supplied to `get_transformer()`. Previously, if a user passed an
arbitrary or incorrect `family` string in `model_info`, the lookup could
fail without falling back to `ModelFamily.UNKNOWN`.
Now, we explicitly check whether `model_family` is a valid value in
`ModelFamily.ANY`. If not, we fallback to `_find_model_family()` as
intended.
## Related issue number
Related #6011#issuecomment-2779957730
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [x] I've made sure all auto checks have passed.
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Just simple change.
```python
messages: List[LLMMessage] = [UserMessage(content="Call the pass tool with input 'task'", source="user")]
```
to
```python
messages: List[LLMMessage] = [UserMessage(content="Call the pass tool with input 'task' and talk result", source="user")]
```
And, now.
Anthropic model could pass that test case
`test_model_client_with_function_calling`.
-> Yup. Before, claude could not pass that test case.
With this change, Claude (Anthropic) models are now able to pass the
test case successfully.
Before this fix, Claude failed to interpret the intent correctly. Now,
it can infer both tool usage and follow-up generation.
This change is backward-compatible with other models (e.g., GPT-4) and
improves cross-model consistency for function-calling tests.
This PR improves how model_family is resolved when selecting a
transformer from the registry.
Previously, model families were inferred using a simple prefix-based
match like:
```
if model.startswith(family): ...
```
This works for cleanly prefixed models (e.g., `gpt-4o`, `claude-3`) but
fails for models like `mistral-large-latest`, `codestral-latest`, etc.,
where prefix-based matching is ambiguous or misleading.
To address this:
• model_family can now be passed explicitly (e.g., via ModelInfo)
• _find_model_family() is only used as a fallback when the value is
"unknown"
• Transformer lookup is now more robust and predictable
• Example integration in to_oai_type() demonstrates this pattern using
self._model_info["family"]
This change is required for safe support of models like Mistral and
other future models that do not follow standard naming conventions.
Linked to discussion in
[#6151](https://github.com/microsoft/autogen/issues/6151)
Related : #6011
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
## Why are these changes needed?
This PR fixes a `400 - invalid_request_error` that occurs when using
Anthropic models and the **final message is from the assistant and ends
with trailing whitespace**.
Example error:
```
Error code: 400 - {'error': {'code': 'invalid_request_error', 'message': 'messages: final assistant content cannot end with trailing whitespace', ...}}
```
To unblock ongoing internal usage, this patch introduces an **ad-hoc
fix** that strips trailing whitespace if the model is Anthropic and the
last message is from the assistant.
## Related issue number
Ad-hoc fix for issue discussed here:
https://github.com/microsoft/autogen/issues/6167
Follow-up structural proposal here:
https://github.com/microsoft/autogen/issues/6167https://github.com/microsoft/autogen/issues/6167#issuecomment-2768592840
## Why are these changes needed?
This change addresses a compatibility issue when using Google Gemini
models with AutoGen. Specifically, Gemini returns a 400 INVALID_ARGUMENT
error when receiving a response with an empty "text" parameter.
The root cause is that Gemini does not accept empty string values (e.g.,
"") as valid inputs in the history of the conversation.
To fix this, if the content field is falsy (e.g., None, "", etc.), it is
explicitly replaced with a single whitespace (" "), which prevents the
Gemini model from rejecting the request.
- **Gemini API compatibility:** Gemini models reject empty assistant
messages (e.g., `""`), causing runtime errors. This PR ensures such
messages are safely replaced with whitespace where appropriate.
- **Avoiding regressions:** Applying the empty content workaround **only
to Gemini**, and **only to valid message types**, avoids breaking OpenAI
or other models.
- **Reducing duplication:** Previously, message transformation logic was
scattered and repeated across different message types and models.
Modularizing this pipeline removes that redundancy.
- **Improved maintainability:** With future model variants likely to
introduce more constraints, this modular structure makes it easier to
adapt transformations without writing ad-hoc code each time.
- **Testing for correctness:** The new structure is verified with tests,
ensuring the bug fix is effective and non-intrusive.
## Summary
This PR introduces a **modular transformer pipeline** for message
conversion and **fixes a Gemini-specific bug** related to empty
assistant message content.
### Key Changes
- **[Refactor]** Extracted message transformation logic into a unified
pipeline to:
- Reduce code duplication
- Improve maintainability
- Simplify debugging and extension for future model-specific logic
- **[BugFix]** Gemini models do not accept empty assistant message
content.
- Introduced `_set_empty_to_whitespace` transformer to replace empty
strings with `" "` only where needed
- Applied it **only** to `"text"` and `"thought"` message types, not to
`"tools"` to avoid serialization errors
- **Improved structure for model-specific handling**
- Transformer functions are now grouped and conditionally applied based
on message type and model family
- This design makes it easier to support future models or combinations
(e.g., Gemini + R1)
- **Test coverage added**
- Added dedicated tests to verify that empty assistant content causes
errors for Gemini
- Ensured the fix resolves the issue without affecting OpenAI models
---
## Motivation
Originally, Gemini-compatible endpoints would fail when receiving
assistant messages with empty content (`""`).
This issue required special handling without introducing brittle, ad-hoc
patches.
In addressing this, I also saw an opportunity to **modularize** the
message transformation logic across models.
This improves clarity, avoids duplication, and simplifies future
adaptations (e.g., different constraints across model families).
---
## 📘 AutoGen Modular Message Transformer: Design & Usage Guide
This document introduces the **new modular transformer system** used in
AutoGen for converting `LLMMessage` instances to SDK-specific message
formats (e.g., OpenAI-style `ChatCompletionMessageParam`).
The design improves **reusability, extensibility**, and
**maintainability** across different model families.
---
### 🚀 Overview
Instead of scattering model-specific message conversion logic across the
codebase, the new design introduces:
- Modular transformer **functions** for each message type
- Per-model **transformer maps** (e.g., for OpenAI-compatible models)
- Optional **conditional transformers** for multimodal/text hybrid
models
- Clear separation between **message adaptation logic** and
**SDK-specific builder** (e.g., `ChatCompletionUserMessageParam`)
---
### 🧱 1. Define Transform Functions
Each transformer function takes:
- `LLMMessage`: a structured AutoGen message
- `context: dict`: metadata passed through the builder pipeline
And returns:
- A dictionary of keyword arguments for the target message constructor
(e.g., `{"content": ..., "name": ..., "role": ...}`)
```python
def _set_thought_as_content_gemini(message: LLMMessage, context: Dict[str, Any]) -> Dict[str, str | None]:
assert isinstance(message, AssistantMessage)
return {"content": message.thought or " "}
```
---
### 🪢 2. Compose Transformer Pipelines
Multiple transformer functions are composed into a pipeline using
`build_transformer_func()`:
```python
base_user_transformer_funcs: List[Callable[[LLMMessage, Dict[str, Any]], Dict[str, Any]]] = [
_assert_valid_name,
_set_name,
_set_role("user"),
]
user_transformer = build_transformer_func(
funcs=base_user_transformer_funcs,
message_param_func=ChatCompletionUserMessageParam
)
```
- The `message_param_func` is the actual constructor for the target
message class (usually from the SDK).
- The pipeline is **ordered** — each function adds or overrides keys in
the builder kwargs.
---
### 🗂️ 3. Register Transformer Map
Each model family maintains a `TransformerMap`, which maps `LLMMessage`
types to transformers:
```python
__BASE_TRANSFORMER_MAP: TransformerMap = {
SystemMessage: system_transformer,
UserMessage: user_transformer,
AssistantMessage: assistant_transformer,
}
register_transformer("openai", model_name_or_family, __BASE_TRANSFORMER_MAP)
```
- `"openai"` is currently required (as only OpenAI-compatible format is
supported now).
- Registration ensures AutoGen knows how to transform each message type
for that model.
---
### 🔁 4. Conditional Transformers (Optional)
When message construction depends on runtime conditions (e.g., `"text"`
vs. `"multimodal"`), use:
```python
conditional_transformer = build_conditional_transformer_func(
funcs_map=user_transformer_funcs_claude,
message_param_func_map=user_transformer_constructors,
condition_func=user_condition,
)
```
Where:
- `funcs_map`: maps condition label → list of transformer functions
```python
user_transformer_funcs_claude = {
"text": text_transformers + [_set_empty_to_whitespace],
"multimodal": multimodal_transformers + [_set_empty_to_whitespace],
}
```
- `message_param_func_map`: maps condition label → message builder
```python
user_transformer_constructors = {
"text": ChatCompletionUserMessageParam,
"multimodal": ChatCompletionUserMessageParam,
}
```
- `condition_func`: determines which transformer to apply at runtime
```python
def user_condition(message: LLMMessage, context: Dict[str, Any]) -> str:
if isinstance(message.content, str):
return "text"
return "multimodal"
```
---
### 🧪 Example Flow
```python
llm_message = AssistantMessage(name="a", thought="let’s go")
model_family = "openai"
model_name = "claude-3-opus"
transformer = get_transformer(model_family, model_name, type(llm_message))
sdk_message = transformer(llm_message, context={})
```
---
### 🎯 Design Benefits
| Feature | Benefit |
|--------|---------|
| 🧱 Function-based modular design | Easy to compose and test |
| 🧩 Per-model registry | Clean separation across model families |
| ⚖️ Conditional support | Allows multimodal / dynamic adaptation |
| 🔄 Reuse-friendly | Shared logic (e.g., `_set_name`) is DRY |
| 📦 SDK-specific | Keeps message adaptation aligned to builder interface
|
---
### 🔮 Future Direction
- Support more SDKs and formats by introducing new message_param_func
- Global registry integration (currently `"openai"`-scoped)
- Class-based transformer variant if complexity grows
---
## Related issue number
Closes#5762
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ v ] I've made sure all auto checks have passed.
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Anthropic SDK could not takes multiple system messages.
However some autogen Agent(e.g. SocietyOfMindAgent) makes multiple
system messages.
And... Gemini with OpenaiSDK do not take error. However is not working
mulitple system messages.
(Just last one is working)
So, I simple change of, "merge multiple system message" at these cases.
## Related issue number
Closes#6116Closes#6117
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
So the behavior of hosted R1 model is the same as locally hosted R1 model.
Addresses: #5989
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Resolves#5982
This PR adds support for `json_schema` as a `response_format` type in
`OpenAIChatCompletionClient`. This is necessary because it allows the
client to be serialized along with the schema. If user use
`response_format=SomeBaseModel`, the client cannot be serialized.
Usage:
```python
# Structured output response, with a pre-defined JSON schema.
OpenAIChatCompletionClient(...,
response_format = {
"type": "json_schema",
"json_schema": {
"name": "name of the schema, must be an identifier.",
"description": "description for the model.",
# You can convert a Pydantic (v2) model to JSON schema
# using the `model_json_schema()` method.
"schema": "<the JSON schema itself>",
# Whether to enable strict schema adherence when
# generating the output. If set to true, the model will
# always follow the exact schema defined in the
# `schema` field. Only a subset of JSON Schema is
# supported when `strict` is `true`.
# To learn more, read
# https://platform.openai.com/docs/guides/structured-outputs.
"strict": False, # or True
},
},
)
````
R1 reasoning tokens from hosted R1 model were not parsed correctly for the openai client
Resolves#5941
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
These changes are needed because there is currently no way to get
logging information about Streaming LLM requests/responses.
I decided to put the StreamStart event AFTER the first chunk so there
aren't false positives about connections/auth.
Closes#5730
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Resolves#5745
Also made sure to log LLMCallEvent from all builtin model clients, and
added unit test for coverage.
---------
Co-authored-by: Ryan Sweet <rysweet@microsoft.com>
Co-authored-by: Victor Dibia <victordibia@microsoft.com>
<!-- Thank you for your contribution! Please review
https://microsoft.github.io/autogen/docs/Contribute before opening a
pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
<!-- Please give a short summary of the change and the problem this
solves. -->
The PR introduces two changes.
The first change is adding a name attribute to
`FunctionExecutionResult`. The motivation is that semantic kernel
requires it for their function result interface and it seemed like a
easy modification as `FunctionExecutionResult` is always created in the
context of a `FunctionCall` which will contain the name. I'm unsure if
there was a motivation to keep it out but this change makes it easier to
trace which tool the result refers to and also increases api
compatibility with SK.
The second change is an update to how messages are mapped from autogen
to semantic kernel, which includes an update/fix in the processing of
function results.
## Related issue number
<!-- For example: "Closes #1234" -->
Related to #5675 but wont fix the underlying issue of anthropic
requiring tools during AssistantAgent reflection.
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
---------
Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com>
Resolves#5192
Test
```python
import asyncio
import os
from random import randint
from typing import List
from autogen_core.tools import BaseTool, FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console
async def get_current_time(city: str) -> str:
return f"The current time in {city} is {randint(0, 23)}:{randint(0, 59)}."
tools: List[BaseTool] = [
FunctionTool(
get_current_time,
name="get_current_time",
description="Get current time for a city.",
),
]
model_client = OpenAIChatCompletionClient(
model="anthropic/claude-3.5-haiku-20241022",
base_url="https://openrouter.ai/api/v1",
api_key=os.environ["OPENROUTER_API_KEY"],
model_info={
"family": "claude-3.5-haiku",
"function_calling": True,
"vision": False,
"json_output": False,
}
)
agent = AssistantAgent(
name="Agent",
model_client=model_client,
tools=tools,
system_message= "You are an assistant with some tools that can be used to answer some questions",
)
async def main() -> None:
await Console(agent.run_stream(task="What is current time of Paris and Toronto?"))
asyncio.run(main())
```
```
---------- user ----------
What is current time of Paris and Toronto?
---------- Agent ----------
I'll help you find the current time for Paris and Toronto by using the get_current_time function for each city.
---------- Agent ----------
[FunctionCall(id='toolu_01NwP3fNAwcYKn1x656Dq9xW', arguments='{"city": "Paris"}', name='get_current_time'), FunctionCall(id='toolu_018d4cWSy3TxXhjgmLYFrfRt', arguments='{"city": "Toronto"}', name='get_current_time')]
---------- Agent ----------
[FunctionExecutionResult(content='The current time in Paris is 1:10.', call_id='toolu_01NwP3fNAwcYKn1x656Dq9xW', is_error=False), FunctionExecutionResult(content='The current time in Toronto is 7:28.', call_id='toolu_018d4cWSy3TxXhjgmLYFrfRt', is_error=False)]
---------- Agent ----------
The current time in Paris is 1:10.
The current time in Toronto is 7:28.
```
---------
Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>