Hacker News
by
azzarcher
1039 days ago
How does this stand out from https://benchllm.com/?
Eddygandr
1039 days ago
I really dislike benchllm's use of YAML files for test cases. I'd rather they be in code.
"""
input: What's 1+1? Be very terse, only numeric output
expected:
  - 2
  - 2.0
"""
jacky2wong
1039 days ago
Agreed. No one should ever have to touch YAML for writing unit tests for LLMs. Ever. Most people writing agents and LLM applications are Python developers/data scientists/ML engineers.
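For illustration, the same case written as a plain Python test might look something like the sketch below. The names `fake_model` and `check_case` are made up for this example and are not part of benchllm or any other library; the stub simply stands in for a real LLM call.

```python
from typing import Callable, Sequence


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always answers '2'."""
    return "2"


def check_case(
    model: Callable[[str], str], prompt: str, expected: Sequence[str]
) -> bool:
    """Return True if the stripped model output equals any accepted answer."""
    return model(prompt).strip() in expected


def test_one_plus_one() -> None:
    # Same input and accepted answers as the YAML case quoted in this thread.
    prompt = "What's 1+1? Be very terse, only numeric output"
    assert check_case(fake_model, prompt, ["2", "2.0"])


if __name__ == "__main__":
    test_one_plus_one()
```

Since the test function is an ordinary `test_*` function, a runner such as pytest should be able to collect it, and swapping `fake_model` for a real client call would be the only change needed to exercise an actual model.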