Hacker News
by a24j 145 days ago
Can you share the agent-comparison harness code or point to something similar? I want to learn about benchmarking models in a basic or practical sense.
1 comment

Thanks so much!!