Hacker News
by nomel 968 days ago
Fifth sentence:
> However, we’ve found that HumanEval is a poor indicator of real-world helpfulness.