Hacker News
by coder543 789 days ago
I agree HumanEval isn't great, but I've found that it is better than not having anything. Maybe we'll get better benchmarks someday.

What would make "Double" perform better than any other hosted system?