autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-11-07 21:34:00 +00:00

Author	SHA1	Message	Date
gagb	d563ba0a1d	Run poe fmt	2025-03-26 11:45:18 -07:00
gagb	e634b43b51	Remove unused dependency 'tiktoken' and related token counting function from OAI coder	2025-03-26 11:37:48 -07:00
gagb	6bab9e8df5	Merge branch 'main' into gagb/qualcoder	2025-03-21 13:16:12 -07:00
afourney	eaef7bab7c	Allow Docker-out-of-docker in AGBench (#6047) This PR allows docker-out-of-docker scenarios to be run in agbench (e.g., agent teams that rely on the DockerCommandLineExecutor). This is becoming increasingly important for benchmarking and testing, since the behaviors of running local executors can diverge in important ways.	2025-03-21 12:55:00 -07:00
gagb	881b3c41bc	Run poe lint on linter	2025-03-20 19:36:23 -07:00
gagb	04a01d7d1c	Add reflection pattern	2025-03-20 19:20:22 -07:00
gagb	878aa4c3fc	Add linter to AGBench (#6022) This pull request introduces a new linting feature to the benchmark configuration in the `agbench` package. The main changes include adding a new command to the CLI, implementing the linter functionality, and integrating it with the existing codebase. ### New Linting Feature: * `python/packages/agbench/src/agbench/cli.py`: Added `lint_cli` import and integrated the new "lint" command into the `main` function. ### Linter Implementation: * `python/packages/agbench/src/agbench/linter/__init__.py`: Added necessary imports to initialize the linter module. * `python/packages/agbench/src/agbench/linter/_base.py`: Defined core classes such as `Document`, `Code`, `CodeExample`, `CodedDocument`, and the `BaseQualitativeCoder` protocol. * `python/packages/agbench/src/agbench/linter/cli.py`: Implemented the `lint_cli` function, which includes loading log files, coding them, and printing the results. * `python/packages/agbench/src/agbench/linter/coders/oai_coder.py`: Implemented the `OAIQualitativeCoder` class to interact with OpenAI for coding documents and caching results.
Example usage: (two screenshots, not shown) > If you are in VSCode Terminal, you can click on the links in the terminal output to jump to the exact error. --------- Co-authored-by: afourney <adamfo@microsoft.com>	2025-03-20 19:05:42 +00:00
gagb	a3a2f43234	Add line content field to CodeExample and enhance examples description; implement log summary generation in CLI	2025-03-19 16:04:57 -07:00
gagb	7eb13a28a1	Enhance linter models with detailed field descriptions and improve severity-based output formatting	2025-03-19 15:49:31 -07:00
gagb	172f9f1cec	Run poe format	2025-03-19 14:51:26 -07:00
gagb	98a2827265	Add linter functionality with CLI integration for log analysis	2025-03-19 14:02:04 -07:00
afourney	e5ab7d55cf	Some pandas series were not being handled correctly (#5972)	2025-03-17 07:16:18 +00:00
afourney	22b68b96b6	Added a flag to agbench to enable Azure identity. (#5977)	2025-03-17 00:10:44 -07:00
Eric Zhu	483532180a	Improvements to agbench (#5776) 1. Add host network support in Docker and remove unused requirements from argument check. 2. Use Pandas to simplify summary statistic calculations. 3. Add running time to summary statistics. Now the default tabulate output looks like this:
```
Using tabulation method defined in '/home/ekzhu/autogen/python/packages/agbench/benchmarks/HumanEval/Scripts/custom_tabulate.py'

    Task Id       Trial 0 Success      Trial 0 Time
--  ------------  -----------------  --------------
 0  HumanEval_0   True                            3
 1  HumanEval_1   False                          15
 2  HumanEval_2   True                            2
 3  HumanEval_3   True                           11
 4  HumanEval_4   True                            4
 5  HumanEval_5   True                            2
 6  HumanEval_6   False                          18
 7  HumanEval_7   True                            2
 8  HumanEval_8   True                            2
 9  HumanEval_9   True                           12
10  HumanEval_10  False                          11
11  HumanEval_11  True                            2
12  HumanEval_12  True                            3
13  HumanEval_13  True                            1
14  HumanEval_14  True                            4
15  HumanEval_15  True                            1
16  HumanEval_16  True                            2
17  HumanEval_17  False                          76
18  HumanEval_18  True                            4
19  HumanEval_19  True                            3
20  HumanEval_20  True                            5
21  HumanEval_21  True                            3
22  HumanEval_22  True                            1
23  HumanEval_23  True                            2
24  HumanEval_24                                nan

Summary Statistics

           Successes    Failures    Missing    Total    Average Success Rate    Average Time    Total Time
-------  -----------  ----------  ---------  -------  ----------------------  --------------  ------------
Trial 0           20           4          1       25                     0.8           7.875           189

CAUTION: 'autogenbench tabulate' is in early preview and is not thoroughly tested.
Please do not cite values from these calculations in academic work without first inspecting and verifying the results in the run logs yourself.
```
--------- Co-authored-by: Ryan Sweet <rysweet@microsoft.com>	2025-03-16 09:13:12 -07:00
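The summary row quoted in commit 483532180a can be re-derived from the per-task rows. The following is a minimal sketch in plain Python, not agbench's pandas-based implementation: the `summarize` helper and the row layout are hypothetical. It assumes that the average success rate is taken over all tasks (missing ones included), while the average time is taken only over tasks that reported a time. Both assumptions agree with the quoted figures (20/25 = 0.8 and 189/24 = 7.875).

```python
# Per-task times for HumanEval_0..HumanEval_23, copied from the quoted table.
TIMES = [3, 15, 2, 11, 4, 2, 18, 2, 2, 12, 11, 2,
         3, 1, 4, 1, 2, 76, 4, 3, 5, 3, 1, 2]
# Task indices whose "Trial 0 Success" value is False in the quoted table.
FAILED = {1, 6, 10, 17}

# Each row is (task id, success, time); None marks a missing value.
ROWS = [(f"HumanEval_{i}", i not in FAILED, t) for i, t in enumerate(TIMES)]
ROWS.append(("HumanEval_24", None, None))  # no result recorded for this task


def summarize(rows):
    """Hypothetical helper: recompute the summary row for one trial."""
    successes = sum(1 for _, ok, _ in rows if ok is True)
    failures = sum(1 for _, ok, _ in rows if ok is False)
    missing = sum(1 for _, ok, _ in rows if ok is None)
    times = [t for _, _, t in rows if t is not None]
    total = len(rows)
    return {
        "Successes": successes,
        "Failures": failures,
        "Missing": missing,
        "Total": total,
        # Assumption: the rate is taken over all tasks, missing ones included.
        "Average Success Rate": successes / total,
        # Assumption: the mean time skips tasks without a recorded time.
        "Average Time": sum(times) / len(times),
        "Total Time": sum(times),
    }


if __name__ == "__main__":
    for name, value in summarize(ROWS).items():
        print(f"{name}: {value}")
```

Skipping missing values when averaging matches the default behavior of pandas reductions, which is presumably why the quoted average time is 7.875 rather than 189/25.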
afourney	af5dcc7fdf	Significant updates to agbench. (#5313) - Updated HumanEval template to use AgentChat - Update templates to use config.yaml for model and other configuration - Read environment from ENV.yaml (ENV.json still supported but deprecated) - Temporarily removed WebArena and AssistantBench. Neither had viable Templates after `autogen_magentic_one` was removed. Templates need to be updated to AgentChat (in a future PR, but this PR is getting big enough already)	2025-02-07 18:01:44 +00:00
afourney	088a50faa5	Remove old autogen_magentic_one package. (#5305) This PR removes the older `autogen_magentic_one` package, and directs people to use the new AgentChat implementation. Hopefully this eases confusion. --------- Co-authored-by: Jack Gerrits <jack@jackgerrits.com> Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2025-01-31 15:14:40 -08:00
Jack Gerrits	538f39497b	Replace create_completion_client_from_env with component config (#4928) * Replace create_completion_client_from_env with component config * json load	2025-01-08 14:33:28 +00:00
Jack Gerrits	fadff4aece	Fix definition of workspace package, remove uv pin (#4830) * Fix definition of workspace package, remove uv pin * add --all-packages * pin docs uv versions for older project structure * try old version to verify CI * Use workflow target * change syntax * change check * try with var in matrix * add all packages to workspace * remove project table	2024-12-27 13:11:42 -05:00
Jack Gerrits	87011ae01b	Migrate model context and models modules out of components (#4613) * Move model context out of components * move models out of components * rename docs file	2024-12-09 10:00:08 -08:00
Eric Zhu	8dac072658	Update references in docs (#4590) * Update agent doc * Remove outdated doc * Update references * Update readme * Update readme	2024-12-06 01:59:28 -08:00
Eric Zhu	fa550c2c36	fix docs (#4589) * fix doc on distributed runtime * Fix references * Update references * Fix import paths in user guide notebooks for code executor components	2024-12-06 01:23:05 -08:00
Jack Gerrits	2b878763f8	Move grpc runtimes to ext, flatten application (#4553) * Move grpc runtimes to ext, flatten application * rename to grpc * fmt	2024-12-04 16:23:20 -08:00
Victor Dibia	777f2abbd7	Load and Save state in AgentChat (#4436) 1. convert dataclass types to pydantic basemodel 2. add save_state and load_state for ChatAgent 3. state types for AgentChat --------- Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2024-12-04 16:14:41 -08:00
Jack Gerrits	3022369eeb	Flatten core base and components (#4513) * Flatten core base and components * remove extra files * dont export from deprecated locations * format * fmt	2024-12-03 17:00:44 -08:00
Griffin Bassman	e037596228	typo: agbench readme (#4302)	2024-11-21 19:17:30 -05:00
Leonardo Pinheiro	38f62e1609	migrate models (#3848) * migrate models * Update python/packages/autogen-agentchat/src/autogen_agentchat/agents/_tool_use_assistant_agent.py Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com> * refactor missing imports * ignore type check errors * Update python/packages/autogen-ext/src/autogen_ext/models/_openai/_model_info.py Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com> * update packages index page --------- Co-authored-by: Leonardo Pinheiro <lpinheiro@microsoft.com> Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>	2024-10-22 11:40:41 -04:00
Hussein Mozannar	e11d84b996	Adding Benchmarks to agbench (#3803) * Move from tomllib to tomli * added example code for magentic-one + code comments * adding benchmarks temporarily * add license for datasets * revert changes to magentic-one * change license location --------- Co-authored-by: Ryan Sweet <rysweet@microsoft.com>	2024-10-18 06:33:33 +02:00
Max Golovanov	636591a149	Use MCR registry instead of Docker's registry (#3814) * Update FunctionCallGenerator.cs to address race condition Update FunctionCallGenerator.cs to address race condition * Update Dockerfile Use MCR registry * Update Dockerfile Use MCR registry	2024-10-17 07:08:50 -07:00
Hussein Mozannar	373adc9a34	Adding Benchmarks back into agbench and updates to agbench (#3711)	2024-10-11 15:46:18 -07:00
Jack Gerrits	2526c69ce9	Include license file in package (#3703)	2024-10-09 15:01:09 -04:00
Jack Gerrits	1174fcd92e	Merge branch 'main' into staging	2024-10-02 14:38:28 -04:00
afourney	d7190cbe9e	Removes easyocr from mdconvert (#653) * Removes easyocr from mdconvert * Updated uv lock * Remove unused variable.	2024-09-26 18:22:44 -04:00
Jack Gerrits	dc02719f7c	Check for prints (#616) * Check for prints * format	2024-09-23 20:10:57 +00:00
Jack Gerrits	93e7127f1f	Change references from agenext to autogen (#610)	2024-09-23 10:46:05 -04:00
Jack Gerrits	6bf62262a4	fix config for pyright for most (#561)	2024-09-18 20:23:53 +00:00
Ryan Sweet	7d7fc8a912	.NET cleanup and refactor (#558) Moves some shared code from samples into core. complete/cleanup the rename to Microsoft.AutoGen adds new projects in AutoGen.Extensions	2024-09-18 11:57:51 -07:00
Jack Gerrits	306541e247	Fixup ruff config and inclusions (#495) * add tests to ruff for core * fmt * lint * lint fixes * fixup more dirs * dont include non python * lint fixes * lint fixes * fix dir name * dont relative include	2024-09-13 10:41:15 -04:00
afourney	243c095796	Updated the root path discovery in agbench to reflect latest folder structure. (#433)	2024-08-29 15:07:17 -07:00
Jack Gerrits	4ff5610853	Migrate to uv and poe for workspace management and task running (#424) * Migrate to uv and poe for workspace management and task running * install python * try fix * ensure workspace venv is used * package dir * move nbqa to mypy task * separate sync, clarify docs	2024-08-29 09:46:06 -04:00
Jack Gerrits	5e8840d13c	Python: organize packages in package directory (#420) * Move packages to packages directory * remove screenshot * update some paths	2024-08-28 13:35:21 -04:00

40 Commits