by guccihat
484 days ago
> The Exercism problems have proven to be very effective at measuring an LLM's ability to modify existing code

The Aider Polyglot website also states that the benchmark "...asks the LLM to edit source files to complete 225 coding exercises".

However, looking at the actual tests [0], it is not about editing code bases; it is rather just solving simple programming exercises. What am I missing?

[0] https://github.com/Aider-AI/polyglot-benchmark