by skywhopper
807 days ago
Yes, and to be clear, the benchmark used here is merely the 300 simplest problems in the larger benchmark suite, which itself is only a tiny subset of issues from a dozen large (and presumably well-curated) Python projects. Not to mention that making the code fix is only a tiny part of resolving an issue. There should also be explanations and added test cases. In other words, I doubt the 22% of “fixes” would pass review by the project owner if a human submitted them.