mirror of https://github.com/microsoft/autogen.git
synced 2025-08-21 15:12:05 +00:00
Improve team-one readme (#225)
* Update readme
* Improve readme further
* Add results
parent 9e814cbad8
commit e69dd92c4f
@@ -55,16 +55,30 @@ Team-One uses agents with the following personas and capabilities:

 ### Performance

 Team-One currently achieves the following performance on complex agent benchmarks:

-GAIA
+_GAIA_

-TODO
+| Level | Task Completion Rate* |
+|-------|---------------------|
+| Level 1 | 49% (26/53) |
+| Level 2 | 26% (22/86) |
+| Level 3 | 8% (2/26) |
+| Total | 30% (50/165) |

-WebArena
+*Indicates the percentage of tasks completed successfully on the development set.

-TODO
+_WebArena_

+| Site | Task Completion Rate |
+|----------------|----------------|
+| Reddit | 49% (27/55) |
+| Shopping | 23% (22/96) |
+| CMS | 16% (16/101) |
+| Gitlab | 41% (32/79) |
+| Maps | 35% (23/65) |
+| Multiple Sites | % (--/26) |
+| Total | 28% (120/422) |


 # Setup
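Each completion rate in the tables added by this commit appears to be the completed count divided by the attempted count, rounded to the nearest whole percent. The sketch below rechecks that arithmetic under that assumption. The names `percent`, `GAIA`, `WEBARENA`, and `column_sums` are hypothetical helpers introduced here, not part of Team-One or AutoGen. The Multiple Sites row is incomplete in the source, so it is left out of the per-row data and only its attempted count (26) is used.

```python
# Counts copied from the tables in the diff: label -> (completed, attempted).
GAIA = {"Level 1": (26, 53), "Level 2": (22, 86), "Level 3": (2, 26)}
GAIA_TOTAL = (50, 165)

WEBARENA = {
    "Reddit": (27, 55),
    "Shopping": (22, 96),
    "CMS": (16, 101),
    "Gitlab": (32, 79),
    "Maps": (23, 65),
}
WEBARENA_TOTAL = (120, 422)
# The source gives only the attempted count for "Multiple Sites".
WEBARENA_MULTI_SITE_ATTEMPTED = 26


def percent(completed: int, attempted: int) -> int:
    """Round completed/attempted to the nearest whole percent (half rounds up)."""
    return (200 * completed + attempted) // (2 * attempted)


def column_sums(rows: dict) -> tuple:
    """Sum the completed and attempted columns of a results table."""
    completed = sum(c for c, _ in rows.values())
    attempted = sum(a for _, a in rows.values())
    return completed, attempted


if __name__ == "__main__":
    for name, rows, total in (
        ("GAIA", GAIA, GAIA_TOTAL),
        ("WebArena", WEBARENA, WEBARENA_TOTAL),
    ):
        for label, (c, a) in rows.items():
            print(f"{name} {label}: {percent(c, a)}% ({c}/{a})")
        print(f"{name} Total: {percent(*total)}% ({total[0]}/{total[1]})")
```

Running the script prints one line per table row in the same `NN% (completed/attempted)` form the README uses, which makes it easy to compare against the diff by eye.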