|
|
|
|
|
by zuzuen_1
308 days ago
|
|
I would be more interested in Qodo's performance on the swe-bench-multilingual benchmark. Swe-bench-verified only includes bugs related to python breakages. The best submission is swe-bench-multilingual is Claude 3.7 Sonnet which solves ~43% of the issues in the dataset. |
|