Hacker News
by simonw 263 days ago
Let's look at every PR on GitHub in public repos (many of which are likely to be under open source licenses) that may have been created with LLM tools, using GitHub Search for various clues:

GitHub Copilot: 247,000 https://github.com/search?q=is%3Apr+author%3Acopilot-swe-age... - is:pr author:copilot-swe-agent[bot]

Claude: 147,000 https://github.com/search?q=is%3Apr+in%3Abody+%28%22Generate... - is:pr in:body ("Generated with Claude Code" OR "Co-Authored-By: Claude" OR "Co-authored-by: Claude")

OpenAI Codex: ~2,000,000 (over-estimate, there's no obvious author reference here so this is just title or body containing "codex"): https://github.com/search?q=is%3Apr+%28in%3Abody+OR+in%3Atit... - is:pr (in:body OR in:title) codex

Suggestions for improvements to this methodology are welcome!
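
For anyone who wants to reproduce the searches, here is a minimal Python sketch that turns the three queries above into github.com search URLs. The query strings are copied from the post; the `QUERIES` dict and the `search_url` helper are hypothetical names made up for illustration, and the URL shape is only assumed to match the truncated links above.

```python
from urllib.parse import urlencode

# Search queries copied verbatim from the post above.
QUERIES = {
    "GitHub Copilot": "is:pr author:copilot-swe-agent[bot]",
    "Claude": (
        'is:pr in:body ("Generated with Claude Code" '
        'OR "Co-Authored-By: Claude" OR "Co-authored-by: Claude")'
    ),
    "OpenAI Codex": "is:pr (in:body OR in:title) codex",
}


def search_url(query: str, merged_only: bool = False) -> str:
    """Build a github.com search URL for a query (hypothetical helper).

    With merged_only=True the is:merged qualifier is appended, which
    restricts the results to merged pull requests.
    """
    if merged_only:
        query += " is:merged"
    # urlencode percent-encodes ':' '(' '"' '[' and turns spaces into '+'.
    return "https://github.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for tool, query in QUERIES.items():
        print(tool)
        print("  all:   ", search_url(query))
        print("  merged:", search_url(query, merged_only=True))
```

The result counts still have to be read off the search page by hand; this only builds the links.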

4 comments

What's the acceptance rate on such PRs?
Add is:merged to see.

For Copilot I got 151,000 out of 247,000 = 61%

For Claude 124,000 / 147,000 = 84%

For Codex 1.7m / 2m = 85%
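
The arithmetic behind those percentages, as a small Python sketch. The counts are the rounded figures quoted above; `COUNTS` and `acceptance_rate` are hypothetical names used only for illustration, and "acceptance rate" here just means merged PRs divided by all matching PRs.

```python
# (merged, total) counts as quoted in this thread, rounded by GitHub Search.
COUNTS = {
    "GitHub Copilot": (151_000, 247_000),
    "Claude": (124_000, 147_000),
    "OpenAI Codex": (1_700_000, 2_000_000),  # total is a known over-estimate
}


def acceptance_rate(merged: int, total: int) -> float:
    """Return merged / total as a fraction between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return merged / total


if __name__ == "__main__":
    for tool, (merged, total) in COUNTS.items():
        rate = acceptance_rate(merged, total)
        print(f"{tool}: {merged:,} / {total:,} = {rate:.0%}")
```

Since the Codex search matches any PR mentioning "codex", its rate says less about the tool than the other two do.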

... I just found out there's an existing repo and site that's been running these kinds of searches for a while: https://prarena.ai/ and https://github.com/aavetis/PRarena
That's using the total as the denominator. How many are actually useful?
The main problem with your search methodology is that maybe AI is good at generating a high volume of slop commits.

Slop commits are not unique to AI. Every project I’ve worked on had that person who has a high commit count, and when you peek at the commits they are just noise.

I’m not saying you’re wrong btw. Just saying this is a possible hole in the methodology.

HN people: lines of code and numbers of PRs are irrelevant to determine the capabilities of a developer.

Also HN people: look at the magic slop machine, it made all these lines of code and PRs, it is irrefutable proof that it's good and AGI

Both of these things can be true at the same time:

1. Counting lines of code is a bad way to measure developer productivity.

2. The number of merged PRs on GitHub overall that were created with LLM assistance is an interesting metric for evaluating how widely these tools are being used.