This PR adds a module-level docstring to `_message_transform.py`, as
requested in the review for [PR
#6063](https://github.com/microsoft/autogen/pull/6063).
The documentation includes:
- Background and motivation behind the modular transformer design
- Key concepts such as transformer functions, pipelines, and maps
- Examples of how to define, register, and use transformers
- Design principles to guide future contributions and extensions
By embedding this explanation directly into the module, contributors and
maintainers can more easily understand the structure, purpose, and usage
of the transformer pipeline without needing to refer to external
documents.
## Related issue number
Follow-up to [PR #6063](https://github.com/microsoft/autogen/pull/6063)
## Why are these changes needed?
This change addresses a compatibility issue when using Google Gemini
models with AutoGen. Specifically, Gemini returns a 400 INVALID_ARGUMENT
error when receiving a response with an empty "text" parameter.
The root cause is that Gemini does not accept empty string values (e.g.,
"") as valid inputs in the history of the conversation.
To fix this, if the content field is falsy (e.g., None, "", etc.), it is
explicitly replaced with a single whitespace (" "), which prevents the
Gemini model from rejecting the request.
- **Gemini API compatibility:** Gemini models reject empty assistant
messages (e.g., `""`), causing runtime errors. This PR ensures such
messages are safely replaced with whitespace where appropriate.
- **Avoiding regressions:** Applying the empty content workaround **only
to Gemini**, and **only to valid message types**, avoids breaking OpenAI
or other models.
- **Reducing duplication:** Previously, message transformation logic was
scattered and repeated across different message types and models.
Modularizing this pipeline removes that redundancy.
- **Improved maintainability:** With future model variants likely to
introduce more constraints, this modular structure makes it easier to
adapt transformations without writing ad-hoc code each time.
- **Testing for correctness:** The new structure is verified with tests,
ensuring the bug fix is effective and non-intrusive.
## Summary
This PR introduces a **modular transformer pipeline** for message
conversion and **fixes a Gemini-specific bug** related to empty
assistant message content.
### Key Changes
- **[Refactor]** Extracted message transformation logic into a unified
pipeline to:
- Reduce code duplication
- Improve maintainability
- Simplify debugging and extension for future model-specific logic
- **[BugFix]** Gemini models do not accept empty assistant message
content.
- Introduced `_set_empty_to_whitespace` transformer to replace empty
strings with `" "` only where needed
- Applied it **only** to `"text"` and `"thought"` message types, not to
`"tools"` to avoid serialization errors
- **Improved structure for model-specific handling**
- Transformer functions are now grouped and conditionally applied based
on message type and model family
- This design makes it easier to support future models or combinations
(e.g., Gemini + R1)
- **Test coverage added**
- Added dedicated tests to verify that empty assistant content causes
errors for Gemini
- Ensured the fix resolves the issue without affecting OpenAI models
---
## Motivation
Originally, Gemini-compatible endpoints would fail when receiving
assistant messages with empty content (`""`).
This issue required special handling without introducing brittle, ad-hoc
patches.
In addressing this, I also saw an opportunity to **modularize** the
message transformation logic across models.
This improves clarity, avoids duplication, and simplifies future
adaptations (e.g., different constraints across model families).
---
## 📘 AutoGen Modular Message Transformer: Design & Usage Guide
This document introduces the **new modular transformer system** used in
AutoGen for converting `LLMMessage` instances to SDK-specific message
formats (e.g., OpenAI-style `ChatCompletionMessageParam`).
The design improves **reusability, extensibility**, and
**maintainability** across different model families.
---
### 🚀 Overview
Instead of scattering model-specific message conversion logic across the
codebase, the new design introduces:
- Modular transformer **functions** for each message type
- Per-model **transformer maps** (e.g., for OpenAI-compatible models)
- Optional **conditional transformers** for multimodal/text hybrid
models
- Clear separation between **message adaptation logic** and
**SDK-specific builder** (e.g., `ChatCompletionUserMessageParam`)
---
### 🧱 1. Define Transform Functions
Each transformer function takes:
- `LLMMessage`: a structured AutoGen message
- `context: dict`: metadata passed through the builder pipeline
And returns:
- A dictionary of keyword arguments for the target message constructor
(e.g., `{"content": ..., "name": ..., "role": ...}`)
```python
def _set_thought_as_content_gemini(message: LLMMessage, context: Dict[str, Any]) -> Dict[str, str | None]:
assert isinstance(message, AssistantMessage)
return {"content": message.thought or " "}
```
---
### 🪢 2. Compose Transformer Pipelines
Multiple transformer functions are composed into a pipeline using
`build_transformer_func()`:
```python
base_user_transformer_funcs: List[Callable[[LLMMessage, Dict[str, Any]], Dict[str, Any]]] = [
_assert_valid_name,
_set_name,
_set_role("user"),
]
user_transformer = build_transformer_func(
funcs=base_user_transformer_funcs,
message_param_func=ChatCompletionUserMessageParam
)
```
- The `message_param_func` is the actual constructor for the target
message class (usually from the SDK).
- The pipeline is **ordered** — each function adds or overrides keys in
the builder kwargs.
---
### 🗂️ 3. Register Transformer Map
Each model family maintains a `TransformerMap`, which maps `LLMMessage`
types to transformers:
```python
__BASE_TRANSFORMER_MAP: TransformerMap = {
SystemMessage: system_transformer,
UserMessage: user_transformer,
AssistantMessage: assistant_transformer,
}
register_transformer("openai", model_name_or_family, __BASE_TRANSFORMER_MAP)
```
- `"openai"` is currently required (as only OpenAI-compatible format is
supported now).
- Registration ensures AutoGen knows how to transform each message type
for that model.
---
### 🔁 4. Conditional Transformers (Optional)
When message construction depends on runtime conditions (e.g., `"text"`
vs. `"multimodal"`), use:
```python
conditional_transformer = build_conditional_transformer_func(
funcs_map=user_transformer_funcs_claude,
message_param_func_map=user_transformer_constructors,
condition_func=user_condition,
)
```
Where:
- `funcs_map`: maps condition label → list of transformer functions
```python
user_transformer_funcs_claude = {
"text": text_transformers + [_set_empty_to_whitespace],
"multimodal": multimodal_transformers + [_set_empty_to_whitespace],
}
```
- `message_param_func_map`: maps condition label → message builder
```python
user_transformer_constructors = {
"text": ChatCompletionUserMessageParam,
"multimodal": ChatCompletionUserMessageParam,
}
```
- `condition_func`: determines which transformer to apply at runtime
```python
def user_condition(message: LLMMessage, context: Dict[str, Any]) -> str:
if isinstance(message.content, str):
return "text"
return "multimodal"
```
---
### 🧪 Example Flow
```python
llm_message = AssistantMessage(name="a", thought="let’s go")
model_family = "openai"
model_name = "claude-3-opus"
transformer = get_transformer(model_family, model_name, type(llm_message))
sdk_message = transformer(llm_message, context={})
```
---
### 🎯 Design Benefits
| Feature | Benefit |
|--------|---------|
| 🧱 Function-based modular design | Easy to compose and test |
| 🧩 Per-model registry | Clean separation across model families |
| ⚖️ Conditional support | Allows multimodal / dynamic adaptation |
| 🔄 Reuse-friendly | Shared logic (e.g., `_set_name`) is DRY |
| 📦 SDK-specific | Keeps message adaptation aligned to builder interface
|
---
### 🔮 Future Direction
- Support more SDKs and formats by introducing new message_param_func
- Global registry integration (currently `"openai"`-scoped)
- Class-based transformer variant if complexity grows
---
## Related issue number
Closes#5762
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ v ] I've made sure all auto checks have passed.
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Rename the `ChatMessage` and `AgentEvent` base classes to `BaseChatMessage` and `BaseAgentEvent`.
Bring back the `ChatMessage` and `AgentEvent` as union of built-in concrete types to avoid breaking existing applications that depends on Pydantic serialization.
Why?
Many existing code uses containers like this:
```python
class AppMessage(BaseModel):
name: str
message: ChatMessage
# Serialization is this:
m = AppMessage(...)
m.model_dump_json()
# Fields like HandoffMessage.target will be lost because it is now treated as a base class without content or target fields.
```
The assumption on `ChatMessage` or `AgentEvent` to be a union of concrete types could be in many existing code bases. So this PR brings back the union types, while keep method type hints such as those on `on_messages` to use the `BaseChatMessage` and `BaseAgentEvent` base classes for flexibility.
Anthropic SDK could not takes multiple system messages.
However some autogen Agent(e.g. SocietyOfMindAgent) makes multiple
system messages.
And... Gemini with OpenaiSDK do not take error. However is not working
mulitple system messages.
(Just last one is working)
So, I simple change of, "merge multiple system message" at these cases.
## Related issue number
Closes#6116Closes#6117
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
When using the `ACADynamicSessionsCodeExecutor` it includes the stdout
from the execution but also the `results` property from the call to
dynamic sessions. In some situations, when the executed code results in
a file being saved this is included in the result:
```console
Plot saved as 'results_by_date.png'
{'type': 'image', 'format': 'png', 'base64_data': 'iVBORw0KGgoAAAANSUhEUgAAA90AAAJOCAYAAACqS2TfAAAAOXRFWHRTb2Z0d2FyZQ...
```
In some situations, this additional output is not desirable:
- when displaying the code output to a user - in this case, the stdout
content is dwarfed by the base64 encoded file content
- when an LLM agent is going to evaluate the code output to determine
next steps - in this case, the base64 content will be included in the
message history sent to the LLM increasing the prompt token cost
To handle these cases, this PR adds a new (optional) argument to the
`ACADynamicSessionsCodeExecutor` constructor that would allow
suppressing the result content (but default to False to preserve the
current behaviour in the default case)
(from #6042)
Closes#6042
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
This PR adds missing model entries for OpenAI-compatible endpoints,
including gpt-4.5-turbo, gpt-4.5-turbo-preview, and claude-3.5-sonnet.
This improves coverage and avoids potential fallback or mismatch issues
when initializing clients.
This PR refactored `AgentEvent` and `ChatMessage` union types to
abstract base classes. This allows for user-defined message types that
subclass one of the base classes to be used in AgentChat.
To support a unified interface for working with the messages, the base
classes added abstract methods for:
- Convert content to string
- Convert content to a `UserMessage` for model client
- Convert content for rendering in console.
- Dump into a dictionary
- Load and create a new instance from a dictionary
This way, all agents such as `AssistantAgent` and `SocietyOfMindAgent`
can utilize the unified interface to work with any built-in and
user-defined message type.
This PR also introduces a new message type, `StructuredMessage` for
AgentChat (Resolves#5131), which is a generic type that requires a
user-specified content type.
You can create a `StructuredMessage` as follow:
```python
class MessageType(BaseModel):
data: str
references: List[str]
message = StructuredMessage[MessageType](content=MessageType(data="data", references=["a", "b"]), source="user")
# message.content is of type `MessageType`.
```
This PR addresses the receving side of this message type. To produce
this message type from `AssistantAgent`, the work continue in #5934.
Added unit tests to verify this message type works with agents and
teams.
So the behavior of hosted R1 model is the same as locally hosted R1 model.
Addresses: #5989
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
<!-- Thank you for your contribution! Please review
https://microsoft.github.io/autogen/docs/Contribute before opening a
pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
Add utf encoding to file reading.
Without this, a default system encoding will be used. On Windows
machines this can default to any local encoding causing errors.
```python
with open(
os.path.join(os.path.abspath(os.path.dirname(__file__)), "page_script.js"), "rt", encoding="utf-8"
) as fh:
```
<!-- Please give a short summary of the change and the problem this
solves. -->
## Related issue number
<!-- For example: "Closes #1234" -->
Closes#6093
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
added support for the thought process in tool calls for
`OpenAIChatCompletionClient`, allowing additional text produced by a
model alongside tool calls to be preserved in the thought field of
`CreateResult`. This PR extends the same functionality to
`AzureAIChatCompletionClient` for consistency across model clients.
#5650
Co-authored-by: Jay Prakash Thakur <jathakur@microsoft.com>
## Why are these changes needed?
This PR fixes a `TypeError: Cannot instantiate typing.Union` that occurs
when using the `MultimodalWebSurfer_agent` with Anthropic models. The
error was caused by the incorrect usage of `typing.Union` as a class
constructor instead of a type hint within the `_anthropic_client.py`
file. The code was attempting to instantiate `typing.Union`, which is
not allowed. The fix correctly uses `typing.Union` within type hints,
and uses the correct `Base64ImageSourceParam` type. It also updates the
`pyproject.toml` dependency.
## Related issue number
Closes#6035
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [v] I've made sure all auto checks have passed.
---------
Co-authored-by: Victor Dibia <victordibia@microsoft.com>
Optionally limit what files and folders FileSurfer can access
(constraining it to a subtree of the FS).
This is not a replacement for Docker sandboxing, but can be used in
conjunction with sandboxing to help prevent FileSurfer from accessing
sensitive files.
## Why are these changes needed?
when I want to create a ACASessionsExecutor instance and execute some
code, the default library imported does not work. It always returns:
"ClientAuthenticationError: Authentication failed: AADSTS70011: The
provided request must include a 'scope' input parameter. The provided
value for the input parameter 'scope' is not valid. The scope
https://dynamicsessions.io/ is not valid. Trace ID:
d75efa58-8be7-44ef-8839-aacfdc850600 Correlation ID:
a8e4d859-92da-4fbe-a8e0-05116323ab55 Timestamp: 2025-03-14 14:15:09Z"
After changing the scope in _ensure_access_token to be
"https://dynamicsessions.io/.default" rather than
""https://dynamicsessions.io/" and it worked.
## Related issue number
issue #5946
## Checks
- [Y ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ Y] I've made sure all auto checks have passed.
Co-authored-by: edwinwu <edwin@Edwin-MBA.local>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Resolves#5982
This PR adds support for `json_schema` as a `response_format` type in
`OpenAIChatCompletionClient`. This is necessary because it allows the
client to be serialized along with the schema. If user use
`response_format=SomeBaseModel`, the client cannot be serialized.
Usage:
```python
# Structured output response, with a pre-defined JSON schema.
OpenAIChatCompletionClient(...,
response_format = {
"type": "json_schema",
"json_schema": {
"name": "name of the schema, must be an identifier.",
"description": "description for the model.",
# You can convert a Pydantic (v2) model to JSON schema
# using the `model_json_schema()` method.
"schema": "<the JSON schema itself>",
# Whether to enable strict schema adherence when
# generating the output. If set to true, the model will
# always follow the exact schema defined in the
# `schema` field. Only a subset of JSON Schema is
# supported when `strict` is `true`.
# To learn more, read
# https://platform.openai.com/docs/guides/structured-outputs.
"strict": False, # or True
},
},
)
````
R1 reasoning tokens from hosted R1 model were not parsed correctly for the openai client
Resolves#5941
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
`poe check` fails on main on Windows due to a combination line ending
mismatches, Unix-specific commands, and Windows-specific `asyncio`
behavior. This PR attempts to fix this (so that `poe check` on a
freshly-pulled `main` passes on Windows 11.)
Currently we have SecretStr type for model clients to promote security
best practices.
- when we dump_component, keys are serialized as SecreteStr ..
- when we load_component ... SecreteStr type is passed to the client in
the api_key field. This i causes the type problems as the clients expect
a string type.
This PR updates the from_config method for model clients to ensure we
get the value from SecretStr.
Closes#5944
Fixes issues like the following trace:
```
packages/autogen_ext/agents/file_surfer/_markdown_file_browser.py", line 39, in __init__
self.set_path(self._base_path)
File "/home/hmozannar/webby/.venv/lib/python3.12/site-packages/autogen_ext/agents/file_surfer/_markdown_file_browser.py", line 67, in set_path
self._open_path(path)
File "/home/hmozannar/webby/.venv/lib/python3.12/site-packages/autogen_ext/agents/file_surfer/_markdown_file_browser.py", line 210, in _open_path
io.StringIO(self._fetch_local_dir(path)), file_extension=".txt"
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hmozannar/webby/.venv/lib/python3.12/site-packages/autogen_ext/agents/file_surfer/_markdown_file_browser.py", line 248, in _fetch_local_dir
mtime = datetime.datetime.fromtimestamp(os.path.getmtime(full_path)).strftime("%Y-%m-%d %H:%M")
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen genericpath>", line 67, in getmtime
PermissionError: [Errno 13] Permission denied: '/home/hmozannar/webby/autogen-studio/frontend/readme.txt'
```
Use of `SKChatCompletionAdapter` reliably fails with "'MockValSer'
object cannot be converted to 'SchemaSerializer'"; can repro with this
example:
https://microsoft.github.io/autogen/stable/user-guide/core-user-guide/components/model-clients.html#semantic-kernel-adapter
This appears to be related to
https://github.com/pydantic/pydantic/issues/7713 - commit uses
workaround from
https://github.com/pydantic/pydantic/issues/7713#issuecomment-2604574418
## Why are these changes needed?
This unblocks use of the Semantic Kernel integration by addressing the
above-referenced error, enabling the integration to perform as expected.
## Related issue number
N/A, see https://github.com/pydantic/pydantic/issues/7713 for context,
though.
## Checks
- [X] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- None needed, internal only change.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- None added; this works on my machine, but I'm not clear on the root
cause of the issue and have no strong opinion on whether this is the
ideal way to fix it long term - simply leaning towards PR`ing a tenative
fix instead of raising an issue.
- [ ] I've made sure all auto checks have passed.
- I am not familiar with these, but assume they will be run during CI.
---------
Co-authored-by: Leonardo Pinheiro <leosantospinheiro@gmail.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
These changes are needed because there is currently no way to get
logging information about Streaming LLM requests/responses.
I decided to put the StreamStart event AFTER the first chunk so there
aren't false positives about connections/auth.
Closes#5730
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
This pull request introduces the integration of the `llama-cpp` library
into the `autogen-ext` package, with significant changes to the project
dependencies and the implementation of a new chat completion client. The
most important changes include updating the project dependencies, adding
a new module for the `LlamaCppChatCompletionClient`, and implementing
the client with various functionalities.
### Project Dependencies:
*
[`python/packages/autogen-ext/pyproject.toml`](diffhunk://#diff-095119d4420ff09059557bd25681211d1772c2be0fbe0ff2d551a3726eff1b4bR34-R38):
Added `llama-cpp-python` as a new dependency under the `llama-cpp`
section.
### New Module:
*
[`python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/__init__.py`](diffhunk://#diff-42ae3ba17d51ca917634c4ea3c5969cf930297c288a783f8d9c126f2accef71dR1-R8):
Introduced the `LlamaCppChatCompletionClient` class and handled import
errors with a descriptive message for missing dependencies.
### Implementation of `LlamaCppChatCompletionClient`:
*
`python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/_llama_cpp_completion_client.py`:
- Added the `LlamaCppChatCompletionClient` class with methods to
initialize the client, create chat completions, detect and execute
tools, and handle streaming responses.
- Included detailed logging for debugging purposes and implemented
methods to count tokens, track usage, and provide model information.…d
chat capabilities
<!-- Thank you for your contribution! Please review
https://microsoft.github.io/autogen/docs/Contribute before opening a
pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
<!-- Please give a short summary of the change and the problem this
solves. -->
## Related issue number
<!-- For example: "Closes #1234" -->
## Checks
- [X ] I've included any doc changes needed for
https://microsoft.github.io/autogen/. See
https://microsoft.github.io/autogen/docs/Contribute#documentation to
build and test documentation locally.
- [X ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ X] I've made sure all auto checks have passed.
---------
Co-authored-by: aribornstein <x@x.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Co-authored-by: Ryan Sweet <rysweet@microsoft.com>
This pull request introduces a new feature to the `FileSurfer` agent and
`MarkdownFileBrowser` by adding support for specifying a base path for
file browsing.
*
`python/packages/autogen-ext/src/autogen_ext/agents/file_surfer/_file_surfer.py`:
* Added `base_path` parameter to `FileSurfer` class and its
initialization method, with a default value of the current working
directory (`os.getcwd()`).
[[1]](diffhunk://#diff-084847b5e64c659c9aff0bd2d05bbcd0fff2c819a4b91bbe65fa0566054c0972R58)
[[2]](diffhunk://#diff-084847b5e64c659c9aff0bd2d05bbcd0fff2c819a4b91bbe65fa0566054c0972R80-R85)
* Updated `MarkdownFileBrowser` initialization within `FileSurfer` to
use the `base_path` parameter.
*
`python/packages/autogen-ext/src/autogen_ext/agents/file_surfer/_markdown_file_browser.py`:
* Added `base_path` parameter to `MarkdownFileBrowser` class and its
initialization method, with a default value of the current working
directory (`os.getcwd()`).
* Updated `MarkdownFileBrowser` to use the `base_path` for setting the
initial path and returning the current page path.
<!-- Thank you for your contribution! Please review
https://microsoft.github.io/autogen/docs/Contribute before opening a
pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
Add anthropic docs
- Add api docs
- Add sample code + usage in agent chat user guide
<!-- Please give a short summary of the change and the problem this
solves. -->
## Related issue number
<!-- For example: "Closes #1234" -->
Closes#5856
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
Resolves#5745
Also made sure to log LLMCallEvent from all builtin model clients, and
added unit test for coverage.
---------
Co-authored-by: Ryan Sweet <rysweet@microsoft.com>
Co-authored-by: Victor Dibia <victordibia@microsoft.com>
Fixes#4821 by adding a `close()` method to all clients.
Additionally:
* The m1 CLI is updated to close the client before exiting.
* The playwrightcontroller is updated to suppress some other unrelated
chatty warnings (e.g,, produced by markitdown when encountering
conversions that require external utilities)
Resolves#4075
1. Introduce custom runtime parameter for all AgentChat teams
(RoundRobinGroupChat, SelectorGroupChat, etc.). This is done by making
sure each team's topics are isolated from other teams, and decoupling
state from agent identities. Also, I removed the closure agent from the
BaseGroupChat and use the group chat manager agent to relay messages to
the output message queue.
2. Added unit tests to test scenarios with custom runtimes by using
pytest fixture
3. Refactored existing unit tests to use ReplayChatCompletionClient with
a few improvements to the client.
4. Fix a one-liner bug in AssistantAgent that caused deserialized agent
to have handoffs.
How to use it?
```python
import asyncio
from autogen_core import SingleThreadedAgentRuntime
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_ext.models.replay import ReplayChatCompletionClient
async def main() -> None:
# Create a runtime
runtime = SingleThreadedAgentRuntime()
runtime.start()
# Create a model client.
model_client = ReplayChatCompletionClient(
["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"],
)
# Create agents
agent1 = AssistantAgent("assistant1", model_client=model_client, system_message="You are a helpful assistant.")
agent2 = AssistantAgent("assistant2", model_client=model_client, system_message="You are a helpful assistant.")
# Create a termination condition
termination_condition = TextMentionTermination("10", sources=["assistant1", "assistant2"])
# Create a team
team = RoundRobinGroupChat([agent1, agent2], runtime=runtime, termination_condition=termination_condition)
# Run the team
stream = team.run_stream(task="Count to 10.")
async for message in stream:
print(message)
# Save the state.
state = await team.save_state()
# Load the state to an existing team.
await team.load_state(state)
# Run the team again
model_client.reset()
stream = team.run_stream(task="Count to 10.")
async for message in stream:
print(message)
# Create a new team, with the same agent names.
agent3 = AssistantAgent("assistant1", model_client=model_client, system_message="You are a helpful assistant.")
agent4 = AssistantAgent("assistant2", model_client=model_client, system_message="You are a helpful assistant.")
new_team = RoundRobinGroupChat([agent3, agent4], runtime=runtime, termination_condition=termination_condition)
# Load the state to the new team.
await new_team.load_state(state)
# Run the new team
model_client.reset()
new_stream = new_team.run_stream(task="Count to 10.")
async for message in new_stream:
print(message)
# Stop the runtime
await runtime.stop()
asyncio.run(main())
```
TODOs as future PRs:
1. Documentation.
2. How to handle errors in custom runtime when the agent has exception?
---------
Co-authored-by: Ryan Sweet <rysweet@microsoft.com>
_(EXPERIMENTAL, RESEARCH IN PROGRESS)_
In 2023 AutoGen introduced [Teachable
Agents](https://microsoft.github.io/autogen/0.2/blog/2023/10/26/TeachableAgent/)
that users could teach new facts, preferences and skills. But teachable
agents were limited in several ways: They could only be
`ConversableAgent` subclasses, they couldn't learn a new skill unless
the user stated (in a single turn) both the task and how to solve it,
and they couldn't learn on their own. **Task-Centric Memory** overcomes
these limitations, allowing users to teach arbitrary agents (or teams)
more flexibly and reliably, and enabling agents to learn from their own
trial-and-error experiences.
This PR is large and complex. All of the files are new, and most of the
added components depend on the others to run at all. But the review
process can be accelerated if approached in the following order.
1. Start with the [Task-Centric Memory
README](https://github.com/microsoft/autogen/tree/agentic_memory/python/packages/autogen-ext/src/autogen_ext/task_centric_memory).
1. Install the memory extension locally, since it won't be in pypi until
it's merged. In the `agentic_memory` branch, and the `python/packages`
directory:
- `pip install -e autogen-agentchat`
- `pip install -e autogen-ext[openai]`
- `pip install -e autogen-ext[task-centric-memory]`
2. Run the Quickstart sample code, then immediately open the
`./pagelogs/quick/0 Call Tree.html` file in a browser to view the work
in progress.
3. Click through the web page links to see the details.
2. Continue through the rest of the main README to get a high-level
overview of the architecture.
3. Read through the [code samples
README](https://github.com/microsoft/autogen/tree/agentic_memory/python/samples/task_centric_memory),
running each of the 4 code samples while viewing their page logs.
4. Skim through the 4 code samples, along with their corresponding yaml
config files:
1. `chat_with_teachable_agent.py`
2. `eval_retrieval.py`
3. `eval_teachability.py`
4. `eval_learning_from_demonstration.py`
5. `eval_self_teaching.py`
6. Read `task_centric_memory_controller.py`, referring back to the
previously generated page logs as needed. This is the most important and
complex file in the PR.
7. Read the remaining core files.
1. `_task_centric_memory_bank.py`
2. `_string_similarity_map.py`
3. `_prompter.py`
8. Read the supporting files in the utils dir.
1. `teachability.py`
2. `apprentice.py`
3. `grader.py`
4. `page_logger.py`
5. `_functions.py`
<!-- Thank you for your contribution! Please review
https://microsoft.github.io/autogen/docs/Contribute before opening a
pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
<!-- Please give a short summary of the change and the problem this
solves. -->
The PR introduces two changes.
The first change is adding a name attribute to
`FunctionExecutionResult`. The motivation is that semantic kernel
requires it for their function result interface and it seemed like a
easy modification as `FunctionExecutionResult` is always created in the
context of a `FunctionCall` which will contain the name. I'm unsure if
there was a motivation to keep it out but this change makes it easier to
trace which tool the result refers to and also increases api
compatibility with SK.
The second change is an update to how messages are mapped from autogen
to semantic kernel, which includes an update/fix in the processing of
function results.
## Related issue number
<!-- For example: "Closes #1234" -->
Related to #5675 but wont fix the underlying issue of anthropic
requiring tools during AssistantAgent reflection.
## Checks
- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.
---------
Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com>