WebArena Benchmark
This scenario implements the WebArena benchmark. The evaluation code in evaluation_harness has been modified from WebArena. We retain the WebArena license and include it here as LICENSE.
References
Zhou, Shuyan, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng et al. "WebArena: A realistic web environment for building autonomous agents." arXiv preprint arXiv:2307.13854 (2023).