by ectopasm83
806 days ago
The point is that the success rate is progressing, paper after paper.

> The baseline results of Magis (10%), Devin (14%) are evaluated in another subset of SWE-bench, which we cannot directly compare with, so we take the results from their technical reports as a reference.

Wondering how it compares with these models.
/s