by abhikrc 807 days ago
The entire setup is available for inspection; please see

https://github.com/nus-apr/auto-code-rover

If you need example bugs, we can provide those too. Some examples also appear in the arXiv paper; please see

https://arxiv.org/abs/2404.05427


This is super fascinating stuff, excellent work. As most of us don't have the time to read the entirety of the paper, are you able to directly link to some issues which have been landed and closed? Some personal favorites would be awesome :)

I think I speak for others when I say the best way to judge the efficacy of this project is to see some real-world, on-site examples of it being used in prod. I'm especially curious about its performance on feature requests or flaky bug reports, as opposed to reliable test failures. I expect the former are much tougher!

Thank you for your interest. There are some interesting examples in the SWE-bench-lite benchmark which are resolved by AutoCodeRover:

- From sympy: https://github.com/sympy/sympy/issues/13643. AutoCodeRover's patch for it: https://github.com/nus-apr/auto-code-rover/blob/main/results...

- Another one from scikit-learn: https://github.com/scikit-learn/scikit-learn/issues/13070. AutoCodeRover's patch (https://github.com/nus-apr/auto-code-rover/blob/main/results...) modified a few lines below (compared to the developer patch) and wrote a different comment.

There are more examples in the results directory (https://github.com/nus-apr/auto-code-rover/tree/main/results).

fwiw the example issue highlighted in the post was already fixed by a human 3 years ago so I wouldn't expect to see much in the way of real life fixed issues yet.