Hacker News
by nomel 968 days ago
Fifth sentence:

> However, we’ve found that HumanEval is a poor indicator of real-world helpfulness.