This is super fascinating stuff, excellent work. As most of us don't have the time to read the entirety of the paper, are you able to directly link to some issues which have been landed and closed? Some personal favorites would be awesome :)
I think I speak for others when I say the best way to judge the efficacy of this project is some real-world, on-site examples of it being used in prod. I'm especially curious about its performance on feature requests or flaky bug reports as opposed to reliable test failures. I expect the former is much tougher!
fwiw, the example issue highlighted in the post was already fixed by a human 3 years ago, so I wouldn't expect to see much in the way of real-life fixed issues yet.