Hacker News
by esperent 29 days ago
When you share them, please also share the setup so people can easily rerun them. Nearly every eval I've seen shares the LLM session transcript but not the actual harness setup, etc. that they used.