Hacker News
Benchmarking LLM Agents on Consequential Real World Tasks (the-agent-company.com)
2 points by suprgeek 546 days ago