by azzarcher 1039 days ago
How does this stand out from https://benchllm.com/?
1 comment

I really dislike benchllm's use of YAML files for test cases - I'd rather they be in code.

""" input: What's 1+1? Be very terse, only numeric output expected: - 2 - 2.0 """

Agreed. No one should ever have to touch YAML for writing unit tests for LLMs. Ever. Most people writing agents and LLM applications are Python developers/data scientists/ML engineers.
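
For illustration, here is roughly what the YAML case quoted above could look like as a plain Python test. This is only a sketch: `ask_model` is a hypothetical stand-in for whatever function calls your LLM (stubbed here so the snippet runs offline), and none of this is the actual API of benchllm or any other library.

```python
# Hypothetical stand-in for a real LLM call; stubbed so the example runs offline.
def ask_model(prompt: str) -> str:
    return "2"


# Accepted answers, mirroring the `expected` list in the YAML case.
EXPECTED = {"2", "2.0"}


def test_one_plus_one() -> None:
    # Same prompt as the YAML `input` field.
    answer = ask_model("What's 1+1? Be very terse, only numeric output").strip()
    assert answer in EXPECTED, f"unexpected answer: {answer!r}"


# Runs directly as a script; a test runner such as pytest would also collect it.
test_one_plus_one()
```

The upside of keeping it in code is that the prompt, the accepted answers, and the check live in one place and can use ordinary Python tooling.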