autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-09-12 17:56:20 +00:00

Author	SHA1	Message	Date
Copilot	c150f85044	Add `tool_choice` parameter to `ChatCompletionClient` `create` and `create_stream` methods (#6697 ) ## Summary Implements the `tool_choice` parameter for `ChatCompletionClient` interface as requested in #6696. This allows users to restrict which tools the model can choose from when multiple tools are available. ## Changes ### Core Interface - Core Interface: Added `tool_choice: Tool \| Literal["auto", "required", "none"] = "auto"` parameter to `ChatCompletionClient.create()` and `create_stream()` methods - Model Implementations: Updated client implementations to support the new parameter, for now, only the following model clients are supported: - OpenAI - Anthropic - Azure AI - Ollama - `LlamaCppChatCompletionClient` currently not supported Features - "auto" (default): Let the model choose whether to use tools, when there is no tool, it has no effect. - "required": Force the model to use at least one tool - "none": Disable tool usage completely - Tool object: Force the model to use a specific tool --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ekzhu <320302+ekzhu@users.noreply.github.com> Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-06-30 14:15:28 +09:00
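The four `tool_choice` values described in this commit can be illustrated with a minimal sketch. `ToolStub` and `to_openai_tool_choice` are hypothetical names, not part of AutoGen; the output follows the shape of the `tool_choice` request field in the OpenAI Chat Completions API, and how each AutoGen client actually performs the mapping is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Union


@dataclass
class ToolStub:
    """Hypothetical stand-in for a tool object; only the name matters here."""

    name: str


ToolChoice = Union[ToolStub, Literal["auto", "required", "none"]]


def to_openai_tool_choice(tool_choice: ToolChoice) -> Union[str, Dict[str, Any]]:
    """Translate a tool_choice value into an OpenAI-style request field."""
    if isinstance(tool_choice, ToolStub):
        # A tool object forces the model to call that specific tool.
        return {"type": "function", "function": {"name": tool_choice.name}}
    if tool_choice in ("auto", "required", "none"):
        # The string modes are passed through unchanged.
        return tool_choice
    raise ValueError(f"Unsupported tool_choice: {tool_choice!r}")
```

Rejecting unknown strings early keeps a typo such as `"requried"` from silently falling back to the default behavior.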
Tejas Dharani	11b7743b7d	Fix completion tokens none issue 6352 (#6665 )	2025-06-26 23:26:27 +00:00
peterychang	8a2582c541	SK KernelFunction from ToolSchemas (#6637) ## Why are these changes needed? Only a subset of available tools will be sent to SK ## Related issue number resolves https://github.com/microsoft/autogen/issues/6582 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed.	2025-06-06 10:15:56 -04:00
peterychang	03394a42c0	Default usage statistics for streaming responses (#6578 ) ## Why are these changes needed? Enables usage statistics for streaming responses by default. There is a similar bug in the AzureAI client. Theoretically adding the parameter ``` model_extras={"stream_options": {"include_usage": True}} ``` should fix the problem, but I'm currently unable to test that workflow ## Related issue number closes https://github.com/microsoft/autogen/issues/6548 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed.	2025-05-28 14:32:04 -04:00
Miroslav Pokrovskii	aa22b622d0	feat: add qwen3 support (#6528 ) ## Why are these changes needed? Add ollama qwen 3 support	2025-05-14 09:52:13 -07:00
EeS	978cbd2e89	FIX/mistral could not receive name field (#6503) ## Why are these changes needed? Mistral models could not receive the `name` field, so this adds a model transformer for Mistral. ## Related issue number Closes #6147 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-05-12 19:32:14 -07:00
peterychang	9118f9b998	Add ability to register Agent instances (#6131) ## Why are these changes needed? Nice to have functionality ## Related issue number Closes #6060 ## Checks - [x] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-05-12 15:34:48 +00:00
Victor Dibia	6427c07f5c	Fix AnthropicBedrockChatCompletionClient import error (#6489 ) <!-- Thank you for your contribution! Please review https://microsoft.github.io/autogen/docs/Contribute before opening a pull request. --> Some fixes with the AnthropicBedrockChatCompletionClient - Ensure `AnthropicBedrockChatCompletionClient` exported and can be imported. - Update the BedrockInfo keys serialization - client argument can be string (similar to api key in this ) but exported config should be Secret - Replace `AnthropicBedrock` with `AsyncAnthropicBedrock` : client should be async to work with the ag stack and the BaseAnthropicClient it inherits from - Improve `AnthropicBedrockChatCompletionClient` docstring to use the correct client arguments rather than serialized dict format. Expect ```python from autogen_ext.models.anthropic import AnthropicBedrockChatCompletionClient, BedrockInfo from autogen_core.models import UserMessage, ModelInfo async def main(): anthropic_client = AnthropicBedrockChatCompletionClient( model="anthropic.claude-3-5-sonnet-20240620-v1:0", temperature=0.1, model_info=ModelInfo(vision=False, function_calling=True, json_output=False, family="unknown", structured_output=True), bedrock_info=BedrockInfo( aws_access_key="<aws_access_key>", aws_secret_key="<aws_secret_key>", aws_session_token="<aws_session_token>", aws_region="<aws_region>", ), ) # type: ignore result = await anthropic_client.create([UserMessage(content="What is the capital of France?", source="user")]) print(result) await main() ``` <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes #1234" --> Closes #6483 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. 
See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-05-10 09:12:02 -07:00
EeS	6fc4f53212	FIX: `MultiModalMessage` in gemini with openai sdk error occurred (#6440) ## Why are these changes needed? A multimodal message has its content filled in by a separate routine, but `_set_empty_to_whitespace` was also being applied to it, which caused an error. In `multimodal_user_transformer_funcs` the content must not be empty, so `_set_empty_to_whitespace` is now removed for multimodal messages. ## Related issue number Closes #6439 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-05-01 09:27:31 -07:00
EeS	a283d268df	TEST/change gpt4, gpt4o series to gpt4.1nano (#6375) ## Why are these changes needed? \| Package \| Test time-Origin (Sec) \| Test time-Edited (Sec) \| \|-------------------------\|------------------\|-----------------------------------------------\| \| autogen-studio \| 1.64 \| 1.64 \| \| autogen-core \| 6.03 \| 6.17 \| \| autogen-ext \| 387.15 \| 373.40 \| \| autogen-agentchat \| 54.20 \| 20.67 \| ## Related issue number Related #6361 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed.	2025-04-23 17:51:25 +00:00
Peter Jausovec	d051da52c3	fix: ollama fails when tools use optional args (#6343 ) ## Why are these changes needed? `convert_tools` failed if Optional args were used in tools (the `type` field doesn't exist in that case and `anyOf` must be used). This uses the `anyOf` field to pick the first non-null type to use. ## Related issue number Fixes #6323 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. --------- Signed-off-by: Peter Jausovec <peter.jausovec@solo.io> Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-04-22 00:06:46 +00:00
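The `anyOf` fallback described in this commit can be sketched as follows. `resolve_property_type` is a hypothetical helper written for illustration, not the actual `convert_tools` code; it assumes the JSON Schema shape that Pydantic typically emits for an `Optional[...]` field.

```python
from typing import Any, Dict


def resolve_property_type(prop: Dict[str, Any]) -> str:
    """Return the JSON Schema type of a property, tolerating Optional fields."""
    if "type" in prop:
        # Plain (non-optional) arguments carry a direct `type` field.
        return prop["type"]
    # Optional arguments have no `type`; pick the first non-null `anyOf` entry.
    for option in prop.get("anyOf", []):
        option_type = option.get("type")
        if option_type and option_type != "null":
            return option_type
    raise ValueError(f"Cannot determine a type for property: {prop!r}")
```

For example, `resolve_property_type({"anyOf": [{"type": "integer"}, {"type": "null"}]})` returns `"integer"`.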
EeS	1de07ab293	Generalize Continuous SystemMessage merging via model_info[“multiple_system_messages”] instead of `startswith("gemini-")` (#6345 ) The current implementation of consecutive `SystemMessage` merging applies only to models where `model_info.family` starts with `"gemini-"`. Since PR #6327 introduced the `multiple_system_messages` field in `model_info`, we can now generalize this logic by checking whether the field is explicitly set to `False`. This change replaces the hardcoded family check with a conditional that merges consecutive `SystemMessage` blocks whenever `multiple_system_messages` is set to `False`. Test cases that previously depended on the `"gemini"` model family have been updated to reflect this configuration flag, and renamed accordingly for clarity. In addition, for consistency across conditional logic, a follow-up PR is planned to refactor the Claude-specific transformation condition (currently implemented via `create_args.get("model", "unknown").startswith("claude-")`) to instead use the existing `is_claude()`. Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-04-21 11:30:35 -07:00
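The merging rule from this commit can be sketched in isolation. `Msg` and `merge_consecutive_system_messages` are hypothetical names, and joining with a newline is an assumption; the only behavior taken from the commit message is that consecutive system messages are merged when `multiple_system_messages` is explicitly `False`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Msg:
    """Hypothetical minimal message: a role and its text content."""

    role: str
    content: str


def merge_consecutive_system_messages(
    messages: List[Msg], multiple_system_messages: Optional[bool]
) -> List[Msg]:
    """Merge runs of system messages when the model cannot take several."""
    if multiple_system_messages is not False:
        # Only an explicit False triggers merging; None or True leave input as is.
        return list(messages)
    merged: List[Msg] = []
    for msg in messages:
        if msg.role == "system" and merged and merged[-1].role == "system":
            # Assumed separator: a single newline between merged contents.
            merged[-1] = Msg("system", merged[-1].content + "\n" + msg.content)
        else:
            merged.append(Msg(msg.role, msg.content))
    return merged
```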
EeS	b24df29ad0	Fix/transformer aware any modelfamily (#6213 ) This PR improves fallback safety when an invalid `model_family` is supplied to `get_transformer()`. Previously, if a user passed an arbitrary or incorrect `family` string in `model_info`, the lookup could fail without falling back to `ModelFamily.UNKNOWN`. Now, we explicitly check whether `model_family` is a valid value in `ModelFamily.ANY`. If not, we fallback to `_find_model_family()` as intended. ## Related issue number Related #6011#issuecomment-2779957730 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-04-05 19:58:16 -07:00
Eric Zhu	d4ac2ca6de	Fix streaming + tool bug in Ollama (#6193 ) Fix a bug that caused tool calls to be truncated in OllamaChatCompletionClient when streaming is on.	2025-04-03 14:56:01 -07:00
Victor Dibia	bd572cc112	Ensure message sent to LLMCallEvent for Anthropic is serializable (#6135 ) Messages sent as part of `LLMCallEvent` for Anthropic were not fully serializable The example below shows TextBlock and ToolUseBlocks inside the content of messages - these throw downsteam errors in apps like AGS (or event sinks) that expect serializable dicts inside the LLMCallEvent. ``` [ {'role': 'user', 'content': 'What is the weather in New York?'}, {'role': 'assistant', 'content': [TextBlock(citations=None, text='I can help you find the weather in New York. Let me check that for you.', type='text'), ToolUseBlock(id='toolu_016W8g55GejYGBzRRrcsnt7M', input={'city': 'New York'}, name='get_weather', type='tool_use')]}, {'role': 'user', 'content': [{'type': 'tool_result', 'tool_use_id': 'toolu_016W8g55GejYGBzRRrcsnt7M', 'content': 'The weather in New York is 73 degrees and Sunny.'}]} ] ``` This PR attempts to first serialize content of anthropic messages before they are passed to `LLMCallEvent` ``` [ {'role': 'user', 'content': 'What is the weather in New York?'}, {'role': 'assistant', 'content': [{'citations': None, 'text': 'I can help you find the weather in New York. Let me check that for you.', 'type': 'text'}, {'id': 'toolu_016W8g55GejYGBzRRrcsnt7M', 'input': {'city': 'New York'}, 'name': 'get_weather', 'type': 'tool_use'}]}, {'role': 'user', 'content': [{'type': 'tool_result', 'tool_use_id': 'toolu_016W8g55GejYGBzRRrcsnt7M', 'content': 'The weather in New York is 73 degrees and Sunny.'}]} ] ```	2025-04-02 18:01:42 -07:00
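The serialization step described in this commit can be sketched without the Anthropic SDK. `TextBlockStub`, `serialize_block`, and `serialize_content` are hypothetical names; the sketch assumes SDK blocks expose a Pydantic-style `model_dump()` method and uses a dataclass as a stand-in so it runs with the standard library only.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict, List, Union


@dataclass
class TextBlockStub:
    """Hypothetical stand-in for an SDK content block."""

    text: str
    type: str = "text"


def serialize_block(block: Any) -> Dict[str, Any]:
    """Convert one content block into a plain dict."""
    if isinstance(block, dict):
        return block
    if hasattr(block, "model_dump"):
        # Assumed path for Pydantic-based SDK objects.
        return block.model_dump()
    if is_dataclass(block) and not isinstance(block, type):
        return asdict(block)
    raise TypeError(f"Cannot serialize block of type {type(block).__name__}")


def serialize_content(content: Union[str, List[Any]]) -> Union[str, List[Dict[str, Any]]]:
    """Serialize message content so it can be passed to json.dumps."""
    if isinstance(content, str):
        return content
    return [serialize_block(block) for block in content]
```

After this step, `json.dumps(serialize_content(...))` succeeds where the raw block objects would raise a `TypeError`.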
EeS	d7f2b56846	FIX:simple fix on tool calling test for anthropic (#6181) A simple change to the test prompt, from ```python messages: List[LLMMessage] = [UserMessage(content="Call the pass tool with input 'task'", source="user")] ``` to ```python messages: List[LLMMessage] = [UserMessage(content="Call the pass tool with input 'task' and talk result", source="user")] ``` With this change, Claude (Anthropic) models pass the test case `test_model_client_with_function_calling`. Before this fix, Claude failed to interpret the intent correctly. Now, it can infer both tool usage and follow-up generation. This change is backward-compatible with other models (e.g., GPT-4) and improves cross-model consistency for function-calling tests.	2025-04-02 23:10:11 +00:00
EeS	27da37efc0	[Refactor] model family resolution to support non-prefixed names like Mistral (#6158 ) This PR improves how model_family is resolved when selecting a transformer from the registry. Previously, model families were inferred using a simple prefix-based match like: ``` if model.startswith(family): ... ``` This works for cleanly prefixed models (e.g., `gpt-4o`, `claude-3`) but fails for models like `mistral-large-latest`, `codestral-latest`, etc., where prefix-based matching is ambiguous or misleading. To address this: • model_family can now be passed explicitly (e.g., via ModelInfo) • _find_model_family() is only used as a fallback when the value is "unknown" • Transformer lookup is now more robust and predictable • Example integration in to_oai_type() demonstrates this pattern using self._model_info["family"] This change is required for safe support of models like Mistral and other future models that do not follow standard naming conventions. Linked to discussion in [#6151](https://github.com/microsoft/autogen/issues/6151) Related : #6011 --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-04-02 22:08:17 +00:00
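The resolution order described in this commit (explicit family first, prefix matching only as a fallback for `"unknown"`) can be sketched as follows. `KNOWN_FAMILIES`, `find_model_family`, and `resolve_family` are illustrative names and values, not the registry used by AutoGen.

```python
from typing import Tuple

# Illustrative family prefixes only; the real registry is an assumption here.
KNOWN_FAMILIES: Tuple[str, ...] = ("gpt-4o", "claude-3", "mistral")


def find_model_family(model: str) -> str:
    """Prefix-based fallback, as in the original matching logic."""
    for family in KNOWN_FAMILIES:
        if model.startswith(family):
            return family
    return "unknown"


def resolve_family(model: str, family: str = "unknown") -> str:
    """Prefer an explicitly supplied family; infer from the name otherwise."""
    if family != "unknown":
        return family
    return find_model_family(model)
```

With this ordering, `resolve_family("codestral-latest", "mistral")` resolves to `"mistral"` even though the model name carries no `mistral` prefix.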
EeS	9de16d5f70	Fix/anthropic could not end with trailing whitespace at assistant content (#6168) ## Why are these changes needed? This PR fixes a `400 - invalid_request_error` that occurs when using Anthropic models and the final message is from the assistant and ends with trailing whitespace. Example error: ``` Error code: 400 - {'error': {'code': 'invalid_request_error', 'message': 'messages: final assistant content cannot end with trailing whitespace', ...}} ``` To unblock ongoing internal usage, this patch introduces an ad-hoc fix that strips trailing whitespace if the model is Anthropic and the last message is from the assistant. ## Related issue number Ad-hoc fix for issue discussed here: https://github.com/microsoft/autogen/issues/6167 Follow-up structural proposal here: https://github.com/microsoft/autogen/issues/6167#issuecomment-2768592840	2025-04-02 00:56:08 +00:00
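The ad-hoc fix from this commit amounts to a small preprocessing step, sketched here on plain dict messages. `strip_final_assistant_whitespace` is a hypothetical name, and the dict message shape is an assumption rather than the client's internal representation.

```python
from typing import Any, Dict, List


def strip_final_assistant_whitespace(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a copy where a final assistant message has no trailing whitespace."""
    result = [dict(message) for message in messages]  # shallow copies; input untouched
    if not result:
        return result
    last = result[-1]
    if last.get("role") == "assistant" and isinstance(last.get("content"), str):
        # Only the last message is affected; earlier assistant turns are kept as is.
        last["content"] = last["content"].rstrip()
    return result
```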
EeS	fbdd89b46b	[BugFix][Refactor] Modular Transformer Pipeline and Fix Gemini/Anthropic Empty Content Handling (#6063 ) ## Why are these changes needed? This change addresses a compatibility issue when using Google Gemini models with AutoGen. Specifically, Gemini returns a 400 INVALID_ARGUMENT error when receiving a response with an empty "text" parameter. The root cause is that Gemini does not accept empty string values (e.g., "") as valid inputs in the history of the conversation. To fix this, if the content field is falsy (e.g., None, "", etc.), it is explicitly replaced with a single whitespace (" "), which prevents the Gemini model from rejecting the request. - Gemini API compatibility: Gemini models reject empty assistant messages (e.g., `""`), causing runtime errors. This PR ensures such messages are safely replaced with whitespace where appropriate. - Avoiding regressions: Applying the empty content workaround only to Gemini, and only to valid message types, avoids breaking OpenAI or other models. - Reducing duplication: Previously, message transformation logic was scattered and repeated across different message types and models. Modularizing this pipeline removes that redundancy. - Improved maintainability: With future model variants likely to introduce more constraints, this modular structure makes it easier to adapt transformations without writing ad-hoc code each time. - Testing for correctness: The new structure is verified with tests, ensuring the bug fix is effective and non-intrusive. ## Summary This PR introduces a modular transformer pipeline for message conversion and fixes a Gemini-specific bug related to empty assistant message content. ### Key Changes - [Refactor] Extracted message transformation logic into a unified pipeline to: - Reduce code duplication - Improve maintainability - Simplify debugging and extension for future model-specific logic - [BugFix] Gemini models do not accept empty assistant message content. 
- Introduced `_set_empty_to_whitespace` transformer to replace empty strings with `" "` only where needed - Applied it only to `"text"` and `"thought"` message types, not to `"tools"` to avoid serialization errors - Improved structure for model-specific handling - Transformer functions are now grouped and conditionally applied based on message type and model family - This design makes it easier to support future models or combinations (e.g., Gemini + R1) - Test coverage added - Added dedicated tests to verify that empty assistant content causes errors for Gemini - Ensured the fix resolves the issue without affecting OpenAI models --- ## Motivation Originally, Gemini-compatible endpoints would fail when receiving assistant messages with empty content (`""`). This issue required special handling without introducing brittle, ad-hoc patches. In addressing this, I also saw an opportunity to modularize the message transformation logic across models. This improves clarity, avoids duplication, and simplifies future adaptations (e.g., different constraints across model families). --- ## 📘 AutoGen Modular Message Transformer: Design & Usage Guide This document introduces the new modular transformer system used in AutoGen for converting `LLMMessage` instances to SDK-specific message formats (e.g., OpenAI-style `ChatCompletionMessageParam`). The design improves reusability, extensibility, and maintainability across different model families. --- ### 🚀 Overview Instead of scattering model-specific message conversion logic across the codebase, the new design introduces: - Modular transformer functions for each message type - Per-model transformer maps (e.g., for OpenAI-compatible models) - Optional conditional transformers for multimodal/text hybrid models - Clear separation between message adaptation logic and SDK-specific builder (e.g., `ChatCompletionUserMessageParam`) --- ### 🧱 1. 
Define Transform Functions Each transformer function takes: - `LLMMessage`: a structured AutoGen message - `context: dict`: metadata passed through the builder pipeline And returns: - A dictionary of keyword arguments for the target message constructor (e.g., `{"content": ..., "name": ..., "role": ...}`) ```python def _set_thought_as_content_gemini(message: LLMMessage, context: Dict[str, Any]) -> Dict[str, str \| None]: assert isinstance(message, AssistantMessage) return {"content": message.thought or " "} ``` --- ### 🪢 2. Compose Transformer Pipelines Multiple transformer functions are composed into a pipeline using `build_transformer_func()`: ```python base_user_transformer_funcs: List[Callable[[LLMMessage, Dict[str, Any]], Dict[str, Any]]] = [ _assert_valid_name, _set_name, _set_role("user"), ] user_transformer = build_transformer_func( funcs=base_user_transformer_funcs, message_param_func=ChatCompletionUserMessageParam ) ``` - The `message_param_func` is the actual constructor for the target message class (usually from the SDK). - The pipeline is ordered — each function adds or overrides keys in the builder kwargs. --- ### 🗂️ 3. Register Transformer Map Each model family maintains a `TransformerMap`, which maps `LLMMessage` types to transformers: ```python __BASE_TRANSFORMER_MAP: TransformerMap = { SystemMessage: system_transformer, UserMessage: user_transformer, AssistantMessage: assistant_transformer, } register_transformer("openai", model_name_or_family, __BASE_TRANSFORMER_MAP) ``` - `"openai"` is currently required (as only OpenAI-compatible format is supported now). - Registration ensures AutoGen knows how to transform each message type for that model. --- ### 🔁 4. Conditional Transformers (Optional) When message construction depends on runtime conditions (e.g., `"text"` vs. 
`"multimodal"`), use: ```python conditional_transformer = build_conditional_transformer_func( funcs_map=user_transformer_funcs_claude, message_param_func_map=user_transformer_constructors, condition_func=user_condition, ) ``` Where: - `funcs_map`: maps condition label → list of transformer functions ```python user_transformer_funcs_claude = { "text": text_transformers + [_set_empty_to_whitespace], "multimodal": multimodal_transformers + [_set_empty_to_whitespace], } ``` - `message_param_func_map`: maps condition label → message builder ```python user_transformer_constructors = { "text": ChatCompletionUserMessageParam, "multimodal": ChatCompletionUserMessageParam, } ``` - `condition_func`: determines which transformer to apply at runtime ```python def user_condition(message: LLMMessage, context: Dict[str, Any]) -> str: if isinstance(message.content, str): return "text" return "multimodal" ``` --- ### 🧪 Example Flow ```python llm_message = AssistantMessage(name="a", thought="let’s go") model_family = "openai" model_name = "claude-3-opus" transformer = get_transformer(model_family, model_name, type(llm_message)) sdk_message = transformer(llm_message, context={}) ``` --- ### 🎯 Design Benefits \| Feature \| Benefit \| \|--------\|---------\| \| 🧱 Function-based modular design \| Easy to compose and test \| \| 🧩 Per-model registry \| Clean separation across model families \| \| ⚖️ Conditional support \| Allows multimodal / dynamic adaptation \| \| 🔄 Reuse-friendly \| Shared logic (e.g., `_set_name`) is DRY \| \| 📦 SDK-specific \| Keeps message adaptation aligned to builder interface \| --- ### 🔮 Future Direction - Support more SDKs and formats by introducing new message_param_func - Global registry integration (currently `"openai"`-scoped) - Class-based transformer variant if complexity grows --- ## Related issue number Closes #5762 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. 
See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ v ] I've made sure all auto checks have passed. --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-03-30 21:09:30 -07:00
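The pipeline composition described in this commit (an ordered list of functions, each adding or overriding keys in the builder kwargs) can be sketched in a few lines. `build_transformer_sketch`, `set_role`, and `set_content_or_whitespace` are hypothetical names, messages are plain dicts, and `dict` stands in for an SDK message constructor; the real `build_transformer_func` may differ.

```python
from typing import Any, Callable, Dict, List

TransformFunc = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


def build_transformer_sketch(
    funcs: List[TransformFunc], message_param_func: Callable[..., Any]
) -> Callable[[Dict[str, Any], Dict[str, Any]], Any]:
    """Compose transformer functions into one message converter."""

    def transformer(message: Dict[str, Any], context: Dict[str, Any]) -> Any:
        kwargs: Dict[str, Any] = {}
        for func in funcs:
            # Later functions override keys set by earlier ones.
            kwargs.update(func(message, context))
        return message_param_func(**kwargs)

    return transformer


def set_role(role: str) -> TransformFunc:
    """Factory returning a transformer that sets a fixed role."""
    return lambda message, context: {"role": role}


def set_content_or_whitespace(message: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
    """Replace falsy content with a single space, as in the Gemini workaround."""
    return {"content": message.get("content") or " "}


user_transformer = build_transformer_sketch(
    funcs=[set_role("user"), set_content_or_whitespace],
    message_param_func=dict,  # stand-in for an SDK message constructor
)
```

Because each step returns a plain dict, individual steps can be unit-tested without constructing SDK objects.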
EeS	0cd3ff46fa	FIX: Anthropic and Gemini could take multiple system messages (#6118) The Anthropic SDK cannot take multiple system messages. However, some AutoGen agents (e.g. SocietyOfMindAgent) produce multiple system messages. Gemini via the OpenAI SDK does not raise an error, but multiple system messages do not work there either (only the last one takes effect). This change merges multiple system messages in these cases. ## Related issue number Closes #6116 Closes #6117 --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-03-28 09:05:54 -07:00
Jay Prakash Thakur	b5ff7ee355	feat(ollama): Add thought field support and fix LLM control parameters (#6126 )	2025-03-26 23:14:26 -07:00
y26s4824k264	0bec835d59	Emit <think> and </think> around reasoning chunks from model_extras in choices.delta, so that the behavior of a hosted R1 model is the same as that of a locally hosted R1 model. Addresses: #5989 --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-03-25 16:17:53 -07:00
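The tag emission described in this commit can be sketched as a small generator. `wrap_reasoning` and the `(kind, text)` chunk shape are hypothetical; the actual client reads reasoning text from the stream delta, which is not modeled here.

```python
from typing import Iterable, Iterator, Tuple


def wrap_reasoning(chunks: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield chunk text, surrounding each run of reasoning chunks with think tags."""
    in_reasoning = False
    for kind, text in chunks:
        if kind == "reasoning":
            if not in_reasoning:
                in_reasoning = True
                yield "<think>"
            yield text
        else:
            if in_reasoning:
                in_reasoning = False
                yield "</think>"
            yield text
    if in_reasoning:
        # Close the tag if the stream ends while still reasoning.
        yield "</think>"
```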
Jay Prakash Thakur	7047fb8b8d	Add support for thought field in AzureAIChatCompletionClient (#6062 ) added support for the thought process in tool calls for `OpenAIChatCompletionClient`, allowing additional text produced by a model alongside tool calls to be preserved in the thought field of `CreateResult`. This PR extends the same functionality to `AzureAIChatCompletionClient` for consistency across model clients. #5650 Co-authored-by: Jay Prakash Thakur <jathakur@microsoft.com>	2025-03-24 17:33:10 -07:00
Eric Zhu	a8cef327f1	Support json schema for response format type in OpenAIChatCompletionClient (#5988 ) Resolves #5982 This PR adds support for `json_schema` as a `response_format` type in `OpenAIChatCompletionClient`. This is necessary because it allows the client to be serialized along with the schema. If user use `response_format=SomeBaseModel`, the client cannot be serialized. Usage: ```python # Structured output response, with a pre-defined JSON schema. OpenAIChatCompletionClient(..., response_format = { "type": "json_schema", "json_schema": { "name": "name of the schema, must be an identifier.", "description": "description for the model.", # You can convert a Pydantic (v2) model to JSON schema # using the `model_json_schema()` method. "schema": "<the JSON schema itself>", # Whether to enable strict schema adherence when # generating the output. If set to true, the model will # always follow the exact schema defined in the # `schema` field. Only a subset of JSON Schema is # supported when `strict` is `true`. # To learn more, read # https://platform.openai.com/docs/guides/structured-outputs. "strict": False, # or True }, }, ) ````	2025-03-18 03:14:42 +00:00
ZakWork	685142cf51	Fix R1 reasoning parser for openai client (#5961 ) R1 reasoning tokens from hosted R1 model were not parsed correctly for the openai client Resolves #5941 --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-03-17 10:09:41 -07:00
Eric Zhu	aba41d74d3	feat: add structured output to model clients (#5936 )	2025-03-15 07:58:13 -07:00
Eric Zhu	a4b6372813	Use SecretStr type for api key (#5939 ) To prevent accidental export of API keys	2025-03-13 21:29:19 -07:00
Eitan Yarmush	817f728d04	add LLMStreamStartEvent and LLMStreamEndEvent (#5890 ) These changes are needed because there is currently no way to get logging information about Streaming LLM requests/responses. I decided to put the StreamStart event AFTER the first chunk so there aren't false positives about connections/auth. Closes #5730 --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-03-11 15:02:46 -07:00
PythicCoder	6a3acc4548	Feature add Add LlamaCppChatCompletionClient and llama-cpp (#5326 ) This pull request introduces the integration of the `llama-cpp` library into the `autogen-ext` package, with significant changes to the project dependencies and the implementation of a new chat completion client. The most important changes include updating the project dependencies, adding a new module for the `LlamaCppChatCompletionClient`, and implementing the client with various functionalities. ### Project Dependencies: * [`python/packages/autogen-ext/pyproject.toml`](diffhunk://#diff-095119d4420ff09059557bd25681211d1772c2be0fbe0ff2d551a3726eff1b4bR34-R38): Added `llama-cpp-python` as a new dependency under the `llama-cpp` section. ### New Module: * [`python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/__init__.py`](diffhunk://#diff-42ae3ba17d51ca917634c4ea3c5969cf930297c288a783f8d9c126f2accef71dR1-R8): Introduced the `LlamaCppChatCompletionClient` class and handled import errors with a descriptive message for missing dependencies. ### Implementation of `LlamaCppChatCompletionClient`: * `python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/_llama_cpp_completion_client.py`: - Added the `LlamaCppChatCompletionClient` class with methods to initialize the client, create chat completions, detect and execute tools, and handle streaming responses. - Included detailed logging for debugging purposes and implemented methods to count tokens, track usage, and provide model information.…d chat capabilities <!-- Thank you for your contribution! Please review https://microsoft.github.io/autogen/docs/Contribute before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. 
--> ## Related issue number <!-- For example: "Closes #1234" --> ## Checks - [X ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally. - [X ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ X] I've made sure all auto checks have passed. --------- Co-authored-by: aribornstein <x@x.com> Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com> Co-authored-by: Ryan Sweet <rysweet@microsoft.com>	2025-03-10 16:53:53 -07:00
Eric Zhu	740afe5b61	Add ToolCallEvent and log it from all builtin tools (#5859 ) Resolves #5745 Also made sure to log LLMCallEvent from all builtin model clients, and added unit test for coverage. --------- Co-authored-by: Ryan Sweet <rysweet@microsoft.com> Co-authored-by: Victor Dibia <victordibia@microsoft.com>	2025-03-07 16:04:45 -08:00
Leonardo Pinheiro	906b09e451	fix: Update SKChatCompletionAdapter message conversion (#5749 ) <!-- Thank you for your contribution! Please review https://microsoft.github.io/autogen/docs/Contribute before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> The PR introduces two changes. The first change is adding a name attribute to `FunctionExecutionResult`. The motivation is that semantic kernel requires it for their function result interface and it seemed like a easy modification as `FunctionExecutionResult` is always created in the context of a `FunctionCall` which will contain the name. I'm unsure if there was a motivation to keep it out but this change makes it easier to trace which tool the result refers to and also increases api compatibility with SK. The second change is an update to how messages are mapped from autogen to semantic kernel, which includes an update/fix in the processing of function results. ## Related issue number <!-- For example: "Closes #1234" --> Related to #5675 but wont fix the underlying issue of anthropic requiring tools during AssistantAgent reflection. ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. --------- Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com>	2025-03-03 23:05:54 +00:00
rylativity	5615f40a30	5663 ollama client host (#5674 ) @ekzhu should likely be assigned as reviewer ## Why are these changes needed? These changes address the bug reported in #5663. Prevents TypeError from being thrown at inference time by ollama AsyncClient when `host` (and other) kwargs are passed to autogen OllamaChatCompletionClient constructor. It also adds ollama as a named optional extra so that the ollama requirements can be installed alongside autogen-ext (e.g. `pip install autogen-ext[ollama]`). @ekzhu, I will need some help or guidance to ensure that the associated test (which requires ollama and tiktoken as dependencies of the OllamaChatCompletionClient) can run successfully in autogen's test execution environment. I have also left the "I've made sure all auto checks have passed" check below unchecked as this PR is coming from my fork. (UPDATE: auto checks appear to have passed after opening PR, so I have checked box below) ## Related issue number Intended to close #5663 ## Checks - [x] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [x] I've made sure all auto checks have passed. --------- Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local> Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com> Co-authored-by: peterychang <49209570+peterychang@users.noreply.github.com>	2025-02-26 11:02:48 -05:00
Victor Dibia	05fc763b8a	add anthropic native support (#5695 ) Claude 3.7 just came out. It's a pretty capable model and it would be great to support it in Autogen. This could augment the already excellent support we have for Anthropic via the SKAdapters in the following ways - Based on the ChatCompletion API similar to the ollama and openai client - Configurable/serializable (can be dumped); this means it can be used easily in AGS. ## What is Supported (video below shows the client being used in autogen studio) https://github.com/user-attachments/assets/8fb7c17c-9f9c-4525-aa9c-f256aad0f40b - streaming - tool calling / function calling - drop in integration with assistant agent. - multimodal support ```python from dotenv import load_dotenv import os load_dotenv() from autogen_agentchat.agents import AssistantAgent from autogen_agentchat.ui import Console from autogen_ext.models.openai import OpenAIChatCompletionClient from autogen_ext.models.anthropic import AnthropicChatCompletionClient model_client = AnthropicChatCompletionClient( model="claude-3-7-sonnet-20250219" ) async def get_weather(city: str) -> str: """Get the weather for a given city.""" return f"The weather in {city} is 73 degrees and Sunny." agent = AssistantAgent( name="weather_agent", model_client=model_client, tools=[get_weather], system_message="You are a helpful assistant.", # model_client_stream=True, ) # Run the agent and stream the messages to the console. async def main() -> None: await Console(agent.run_stream(task="What is the weather in New York?")) await main() ``` result ``` messages = [ UserMessage(content="Write a very short story about a dragon.", source="user"), ] # Create a stream. stream = model_client.create_stream(messages=messages) # Iterate over the stream and print the responses. 
print("Streamed responses:") async for response in stream: # type: ignore if isinstance(response, str): # A partial response is a string. print(response, flush=True, end="") else: # The last response is a CreateResult object with the complete message. print("\n\n------------\n") print("The complete response:", flush=True) print(response.content, flush=True) print("\n\n------------\n") print("The token usage was:", flush=True) print(response.usage, flush=True) ``` ## Why are these changes needed? ## Related issue number Closes #5205 Closes #5708 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. cc @rohanthacker	2025-02-26 07:27:41 +00:00
Eric Zhu	9fd8eefc55	fix: Structured output with tool calls for OpenAIChatCompletionClient (#5671 ) Resolves: #5568 Also, refactored some unit tests. Integration tests against OpenAI endpoint passed: https://github.com/microsoft/autogen/actions/runs/13484492096 Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>	2025-02-24 14:18:46 +00:00
Victor Dibia	170b8cc893	Make ChatCompletionCache support component config (#5658 ) This PR makes ChatCompletionCache support component config ## Why are these changes needed? Ensures we have a path to serializing ChatCompletionCache, similar to the ChatCompletion client that it wraps. This PR does the following - Makes CacheStore serializable first (part of this includes converting from Protocol to base class). Makes its derivatives serializable as well (diskcache, redis) - Makes ChatCompletionCache serializable - Adds some tests ## Related issue number Closes #5141 ## Checks - [ ] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. cc @nour-bouzid	2025-02-23 19:49:22 -08:00
Eric Zhu	7784f44ea6	feat: Add thought process handling in tool calls and expose ThoughtEvent through stream in AgentChat (#5500 ) Resolves #5192 Test ```python import asyncio import os from random import randint from typing import List from autogen_core.tools import BaseTool, FunctionTool from autogen_ext.models.openai import OpenAIChatCompletionClient from autogen_agentchat.agents import AssistantAgent from autogen_agentchat.ui import Console async def get_current_time(city: str) -> str: return f"The current time in {city} is {randint(0, 23)}:{randint(0, 59)}." tools: List[BaseTool] = [ FunctionTool( get_current_time, name="get_current_time", description="Get current time for a city.", ), ] model_client = OpenAIChatCompletionClient( model="anthropic/claude-3.5-haiku-20241022", base_url="https://openrouter.ai/api/v1", api_key=os.environ["OPENROUTER_API_KEY"], model_info={ "family": "claude-3.5-haiku", "function_calling": True, "vision": False, "json_output": False, } ) agent = AssistantAgent( name="Agent", model_client=model_client, tools=tools, system_message= "You are an assistant with some tools that can be used to answer some questions", ) async def main() -> None: await Console(agent.run_stream(task="What is current time of Paris and Toronto?")) asyncio.run(main()) ``` ``` ---------- user ---------- What is current time of Paris and Toronto? ---------- Agent ---------- I'll help you find the current time for Paris and Toronto by using the get_current_time function for each city. 
---------- Agent ---------- [FunctionCall(id='toolu_01NwP3fNAwcYKn1x656Dq9xW', arguments='{"city": "Paris"}', name='get_current_time'), FunctionCall(id='toolu_018d4cWSy3TxXhjgmLYFrfRt', arguments='{"city": "Toronto"}', name='get_current_time')] ---------- Agent ---------- [FunctionExecutionResult(content='The current time in Paris is 1:10.', call_id='toolu_01NwP3fNAwcYKn1x656Dq9xW', is_error=False), FunctionExecutionResult(content='The current time in Toronto is 7:28.', call_id='toolu_018d4cWSy3TxXhjgmLYFrfRt', is_error=False)] ---------- Agent ---------- The current time in Paris is 1:10. The current time in Toronto is 7:28. ``` --------- Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>	2025-02-21 13:58:32 -08:00
Eric Zhu	69c0b2b5ef	fix: Add model info validation and improve error messaging (#5556 ) Introduce validation for the ModelInfo dictionary to ensure required fields are present. Resolves #5501	2025-02-14 18:09:33 -08:00
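The required-field validation described in the commit above can be illustrated with a minimal, self-contained sketch. It does not reproduce autogen's actual implementation: the function name `check_model_info` is hypothetical, and the list of required fields is an assumption taken from the `model_info` example that appears elsewhere in this log (`family`, `function_calling`, `vision`, `json_output`).

```python
from typing import Any, Mapping

# Assumed required fields, copied from the model_info example in this log.
REQUIRED_FIELDS = ("family", "function_calling", "vision", "json_output")


def check_model_info(model_info: Mapping[str, Any]) -> None:
    """Hypothetical validator: raise ValueError naming every missing field."""
    missing = [field for field in REQUIRED_FIELDS if field not in model_info]
    if missing:
        raise ValueError(
            "Missing required field(s) in model_info: " + ", ".join(missing)
        )


# A complete dictionary passes silently.
check_model_info(
    {
        "family": "claude-3.5-haiku",
        "function_calling": True,
        "vision": False,
        "json_output": False,
    }
)
```

Reporting all missing fields in one error, rather than failing on the first, is one way to give the more helpful messaging the commit title mentions.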
Eric Zhu	ec314c586c	feat: Add strict mode support to BaseTool, ToolSchema and FunctionTool (#5507 ) Resolves #4447 The `openai` client's structured output support goes through its beta client, which requires the function JSON schema to be strict when in structured output mode. Reference: https://platform.openai.com/docs/guides/function-calling#strict-mode	2025-02-13 19:44:55 +00:00
wistuba	7a772a2fcd	feat: add indicator for tool failure to FunctionExecutionResult (#5428 ) Some LLMs receive an explicit signal about tool use failures. Closes #5273 Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-02-09 21:57:50 -08:00
Eric Zhu	9a028acf9f	feat: enhance Gemini model support in OpenAI client and tests (#5461 )	2025-02-09 10:12:59 -08:00
Leonardo Pinheiro	b868e32b05	fix: update SK adapter stream tool call processing. (#5449 ) ## Why are these changes needed? The current stream processing of SK model adapter returns on the first function call chunk but this behavior is incorrect and ends up returning with an incomplete function call. The observed behavior is that the function name and arguments are split into different chunks and this update correctly processes the chunks in this way. ## Related issue number Fixes the reply in #5420 ## Checks - [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. --------- Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com>	2025-02-09 14:39:19 +10:00
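The chunk handling described in the commit above (function name and arguments arriving in separate stream chunks) can be sketched as follows. This is only an illustration under stated assumptions: `ToolCallChunk`, `ToolCall`, and `merge_tool_call_chunks` are hypothetical names, not Semantic Kernel or autogen types, and the chunk shape (an index plus optional id, name, and an argument fragment) is assumed for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ToolCallChunk:
    """Hypothetical stream chunk; every field except index may be absent."""

    index: int
    call_id: Optional[str] = None
    name: Optional[str] = None
    arguments: str = ""


@dataclass
class ToolCall:
    """Hypothetical fully assembled function call."""

    call_id: str = ""
    name: str = ""
    arguments: str = ""


def merge_tool_call_chunks(chunks: Iterable[ToolCallChunk]) -> List[ToolCall]:
    """Consume the whole stream, merging fragments that share an index."""
    calls: Dict[int, ToolCall] = {}
    for chunk in chunks:
        call = calls.setdefault(chunk.index, ToolCall())
        if chunk.call_id:
            call.call_id = chunk.call_id
        if chunk.name:
            call.name = chunk.name
        # Argument text arrives in pieces and is concatenated in order.
        call.arguments += chunk.arguments
    return [calls[i] for i in sorted(calls)]


stream = [
    ToolCallChunk(index=0, call_id="call_1", name="get_current_time"),
    ToolCallChunk(index=0, arguments='{"city": '),
    ToolCallChunk(index=0, arguments='"Paris"}'),
]
merged = merge_tool_call_chunks(stream)
```

Returning after the first chunk in this stream would yield a call with a name but empty arguments, which is the incomplete result the commit describes; reading to the end of the stream avoids that.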
afourney	0b659de36d	Mitigates #5401 by optionally prepending names to messages. (#5448 ) Mitigates #5401 by optionally prepending names to messages. Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-02-08 07:04:24 +00:00
Leonardo Pinheiro	be085567ea	fix: remove sk tool adapter plugin name (#5444 ) ## Why are these changes needed? Semantic kernel prepends the plugin name to the tool name when passing the tools to model clients and this is causing a mismatch between tool names in SK and the AssistantAgent. Since plugin names are optional, we have opted to remove it. ## Related issue number Closes #5420 ## Checks - [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. --------- Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com>	2025-02-08 04:54:05 +00:00
Eric Zhu	901ab1276d	feat: enhance AzureAIChatCompletionClient validation and add unit tests (#5417 ) Resolves #5414	2025-02-07 18:32:14 +00:00
Eric Zhu	569bc19769	feat: add gemini model families, enhance group chat selection for Gemini model and add tests (#5334 ) Resolves #5322	2025-02-03 18:32:35 +00:00
Eric Zhu	f656ff1e01	feat: Support R1 reasoning text in model create result; enhance API docs (#5262 ) Resolves #5255 --------- Co-authored-by: afourney <adamfo@microsoft.com>	2025-01-30 11:03:54 -08:00
Eric Zhu	44db2cc1fb	fix: handle non-string function arguments in tool calls and add corresponding warnings (#5260 )	2025-01-30 16:49:22 +00:00
Eric Zhu	225eb9d0b2	feat: introduce ModelClientStreamingChunkEvent for streaming model output and update handling in agents and console (#5208 ) Resolves #3983 * introduce `model_client_stream` parameter in `AssistantAgent` to enable token-level streaming output. * introduce `ModelClientStreamingChunkEvent` as a type of `AgentEvent` to pass the streaming chunks to the application via `run_stream` and `on_messages_stream`, although this will not affect the inner messages list in the final `Response` or `TaskResult`. * handle this new message type in `Console`.	2025-01-29 02:49:02 +00:00
Eric Zhu	b441d5b43a	fix: Enhance OpenAI client to handle additional stop reasons and improve tool call validation in tests to address empty tool_calls list. (#5223 ) Resolves #5222	2025-01-27 21:16:47 +00:00
Sachin Joglekar	8926206479	Implement default in-memory store for ChatCompletionCache (#5188 )	2025-01-25 21:07:58 +00:00

68 Commits