Hacker News
by pawanjswal 379 days ago
I have seen many LLM devs encounter this at some point. Good to see that you are not only pointing out the inconsistency but also actively advocating for a common benchmark.