Hacker News
How to Make a Good Terminal Bench Task (twitter.com)
3 points by neversupervised 85 days ago
1 comment

I've been a contributor and reviewer for Terminal Bench since last August, and this post covers what I've learned from designing and reviewing tasks. The guidance is broadly applicable to anyone building an agentic benchmark. I would love feedback from the HN community.