Hacker News
Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation (arxiv.org)
1 point by randomwalker 240 days ago