| We are excited to introduce JungleGym, our new open-source project by Marco Mascorro and Matt Bornstein, featuring datasets and tools for developing autonomous web agents. Benchmarking and testing AI agents remains a challenge, and JungleGym aims to make it easier for builders in this space. Project overview: https://junglegym.ai/ Live demo: https://junglegym.ai/TreeVoyager%20(LLM%20DOM%20Parser) GitHub: https://github.com/a16z-infra/JungleGym Datasets included: Mind2Web (Deng et al.), WebArena (Shuyanzhxyc, Frankxu2004, _Hao_Zhu et al.), and AgentInstruct (Zeng et al.). Agents can be tested against the ground truths in these datasets, which are accessible via the JungleGym API. *TreeVoyager*: A new LLM-based DOM parser that simplifies implementing task logic for agent developers, inspired by Tree of Thoughts (ShunyuYao12) and Minecraft Voyager (Guanzhi_Wang, DrJimFan). This project builds on the *remarkable work* of the teams behind these datasets, incorporates feedback from agent developers such as SigGravitas, DivGarg, and Yohei Nakajima, and draws inspiration from projects like World of Bits (TimShi_ai). |