Hacker News
by hongbo_zhang 134 days ago
This is a benchmark comparing the latest models on a new programming language, chosen to avoid overfitting. The latest models generalize quite well to new languages: from a single prompt they can write tens of thousands of lines of code that just works.