autogen

yujunjun/autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-08-11 10:11:27 +00:00

Commit Graph

Author

SHA1

Message

Date

KazooTTT

a122ffe541

Fix/typo (#1034)

* fix: typo

* fix: typo

* fix: typo of function name

* fix: typo of function name of test file

* Update test_token_count.py

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>

2023-12-22 16:00:46 +00:00

afourney

f8b4b4259b

Adds the GAIA benchark to the Testbed. This PR depends on #792 (#810)

* Re-added completion logging when using older versions of autogen.

* Extended scenario definitions and templating to include folders.

* Prepare collate_human_eval.py for working with group chat scenarios.

* Converted HumanEval to the folder-based approach, and added GroupChat scenarios.

* Fixed the default termination message.

* Fixed another termination condition.

* Updated compatible autogen versions.

* Added initial support for GAIA benchmark.

* Fixed a bug in executing the finalize scripts.

* Generalized the template further to support multiple folder copy operations.

* Refined GAIA support, and broke scenarios down by difficulty.

* Added some experimental scripts for computing metrics over GAIA. This is a first version, and will likely need refinement.

* Added instructions for cloning GAIA

* Updated README to fix some typos.

* Added a script to format GAIA reslts for the leaderboard.

* Update samples/tools/testbed/scenarios/GAIA/Templates/BasicTwoAgents/scenario.py

Co-authored-by: LeoLjl <3110503618@qq.com>

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: LeoLjl <3110503618@qq.com>

2023-12-06 01:46:10 +00:00

2 Commits