|
Unfortunately, we still don't have great metrics for developer productivity, other than the hilari-bad lines-of-code metric. Jira tickets, sprints, points, t-shirt sizes: all of that is an attempt to bring something measurable to the table, but everyone knows it's really fuzzy. What I do know, though, is that ChatGPT can finish a leetcode problem before I've even fully parsed the question. There are definitely ratholes to get stuck in and lose time on when trying to get the LLM to give the right answer, but LLM-unassisted programming has the same problem. When using an LLM to help, there's a bunch of different contexts I don't have to load in because the LLM is handling them, giving me more headspace to think about the bigger problems at hand. No matter what a study says, as soon as it comes out it's going to get picked apart, because people aren't going to believe the results. This shit's not properly measurable like in a hard science, so you're going to have to settle for subjective opinions. If you want to make it a competition, how would you rank John Carmack, Linus Torvalds, Grace Hopper, and Fabrice Bellard? How do you even try and make that comparison? How do you measure and compare something you don't have a ruler for? |
This is an interesting case for two reasons. One is that leetcode consists of distilled, elementary problems that are well known in CS: given all the CS papers, or even blogs, at your disposal, you should be able to solve them all by pattern matching the solution. Real work is anything but that. The elementary problems have solutions in libraries, but everything in between is complicated and messy and requires handling the unexpected or underdefined cases. The second reason is that leetcode problems are fully specified in a concise description, with an example and no outside parameters. Just spending the time to define your problem to that level for the LLM likely gets you more than halfway to the solution. And that kind of detailed spec really takes time to create.