mirror of https://github.com/microsoft/autogen.git
synced 2025-08-21 15:12:05 +00:00
Improve team-one readme (#225)
* Update readme
* Improve readme further
* Add results
parent 9e814cbad8
commit e69dd92c4f
@@ -55,16 +55,30 @@ Team-One uses agents with the following personas and capabilities:
 
 ### Performance
 
 Team-One currently achieves the following performance on complex agent benchmarks:
 
-GAIA
+_GAIA_
 
-TODO
+| Level | Task Completion Rate* |
+|-------|---------------------|
+| Level 1 | 49% (26/53) |
+| Level 2 | 26% (22/86) |
+| Level 3 | 8% (2/26) |
+| Total | 30% (50/165) |
 
-WebArena
+*Indicates the percentage of tasks completed successfully on the development set.
 
-TODO
+_WebArena_
 
+| Site | Task Completion Rate |
+|----------------|----------------|
+| Reddit | 49% (27/55) |
+| Shopping | 23% (22/96) |
+| CMS | 16% (16/101) |
+| Gitlab | 41% (32/79) |
+| Maps | 35% (23/65) |
+| Multiple Sites | % (--/26) |
+| Total | 28% (120/422) |
 
 
 # Setup
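The completion rates in the added tables can be recomputed from the raw counts. The sketch below is not part of the commit; the helper names `rate` and `totals` are hypothetical. It assumes whole-percent rounding. It also assumes that the WebArena total counts the 26 multi-site tasks in its denominator while adding no completions for them. That assumption matches the reported 120/422, but the diff does not state it.

```python
from fractions import Fraction

# (completed, attempted) pairs copied from the tables in the diff.
gaia = {"Level 1": (26, 53), "Level 2": (22, 86), "Level 3": (2, 26)}
webarena = {
    "Reddit": (27, 55),
    "Shopping": (22, 96),
    "CMS": (16, 101),
    "Gitlab": (32, 79),
    "Maps": (23, 65),
}
# The "Multiple Sites" row reports no completed count ("--"), only 26 tasks.
MULTI_SITE_TASKS = 26


def rate(completed, attempted):
    """Completion rate as a whole-number percentage, rounding half up."""
    return int(Fraction(completed * 100, attempted) + Fraction(1, 2))


def totals(rows, extra_attempted=0):
    """Sum completed and attempted counts; extra_attempted adds tasks
    that appear in the denominator without a reported completed count."""
    completed = sum(c for c, _ in rows.values())
    attempted = sum(a for _, a in rows.values()) + extra_attempted
    return completed, attempted


if __name__ == "__main__":
    for name, rows, extra in (("GAIA", gaia, 0),
                              ("WebArena", webarena, MULTI_SITE_TASKS)):
        done, tried = totals(rows, extra)
        print(f"{name}: {rate(done, tried)}% ({done}/{tried})")
```

Under these assumptions the recomputed totals agree with the "Total" rows of both tables.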