Hacker News
by riku_iki 968 days ago
I think it is valid criticism that the HumanEval benchmark is not completely representative; they also say so in the post.