Hacker News
by antirez 2317 days ago
I did some trivial math. Redis is composed of 100k lines of code, and I wrote at least 70k of that in 10 years. I never work more than 5 days per week and I take 1 month of vacation every year, so assuming I work 22 days every month for 11 months:

    70000/(22*11*10) = ~29 LOC / day
Which is not too far from 10. There are days when I write 300-500 LOC, but I guess a lot of work went into rewriting stuff and fixing bugs, so I rewrote the same lines again and again over the years. Still, I think that rework should be taken into account, so the Mythical Man Month book is indeed quite accurate.

However, this math is a bit off because over those 10 years I also wrote quite a number of side projects. But still, max ~50 LOC / day.
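For anyone who wants to redo the arithmetic, here is a minimal Python sketch that uses only the figures from the post. The last value is an illustration rather than a number from the post: it shows what total output the ~50 LOC/day ceiling would imply over the same working days.

```python
# Figures taken from the post above.
LINES_WRITTEN = 70_000
WORK_DAYS_PER_MONTH = 22
WORK_MONTHS_PER_YEAR = 11
YEARS = 10

work_days = WORK_DAYS_PER_MONTH * WORK_MONTHS_PER_YEAR * YEARS
loc_per_day = LINES_WRITTEN / work_days

# Illustration only: total lines a 50 LOC/day rate would imply
# over the same number of working days (side projects included).
implied_total_at_50 = 50 * work_days

print(work_days)              # 2420
print(round(loc_per_day, 1))  # 28.9
print(implied_total_at_50)    # 121000
```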

From the sounds of it that also includes a decent amount of greenfield work.

I don't have hard figures to easily consult but I'd guess that I'm at about your average in total, but then on the days when I'm refactoring/debugging existing stuff, honestly it could be like 3 lines a day, or 5, or just planning/sketching something out.

It's like the old mechanic trope. It's not hard to replace a bolt, what's hard is knowing which bolt to replace and where.

This "LOC as a proxy for productivity" metric seems so much harder to measure in a useful way on brownfield work.

On my most productive (in my own estimation) brownfield days in recent memory, the codebase would generally shrink by several hundred LOC.

I also find that a huge factor in my code production rate on brownfield projects might not have much to do with me, because it's factors like, "Is the code well-documented, easy to understand, and backed by tests that make the intended behavior clear? Or do I have to start by burning days or weeks on wrangling with Chesterton's Fence?"

And, on the other side of it, when is documenting and cleaning my own code to guard some future maintainer from that situation vital, and when am I burning a day of my own time to save someone else only an hour in expectation? All I know for sure in that situation is that, if my manager is assiduously counting LOC, ticket close rate, anything like that, then game theory demands that I should never bother to spend an extra hour on making it more maintainable if I expect that the cost of that decision will be borne by one of my teammates. The 10X rockstar developer at a previous team of mine taught me that lesson in a rather brutal manner.

> I should never bother to spend an extra hour on making it more maintainable if I expect that the cost of that decision will be borne by one of my teammates.

If you want to be a team lead, though, or even just have people follow your lead, I find that not only do you want to worry about these costs, but you need to talk openly about them, and be seen addressing them. Most devs follow the ones they trust, no matter what title they have.

On all the projects where we tried to build people up instead of just getting shit done, we were consistently getting more shit done at the two year mark, if not sooner. Any idiot can ship a version 1.0.0, but it takes some talent (and luck) to ship version 2.3.0.

From what I’ve seen, Postgres followed a similar model, and if you look at the performance benchmarks over time, it has progressively narrowed the gap with each major release. That kind of momentum is something worth sacrificing for.

> If you want to be a team lead, though... you want to worry about these costs...

This may depend on the extent to which your organization conforms to the Peter principle.

I am about to rewrite something (different cloud, blabla) and if I do a good job I think the program’s LOC will go down by about 25%, the unit testing LOC will go up by maybe 50%, and maybe 50% of each will be the same old logic while the other half will be new logic.

I have no idea how I would try to count that if I wanted to measure “productivity.”
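One way to see why this is hard to count is to work the estimate through with made-up starting sizes. The sketch below is only an illustration: the 10,000 and 4,000 line starting points are hypothetical, and it reads "50% of each will be the same old logic" as half of the resulting code being carried over, which is one possible interpretation.

```python
def rewrite_line_counts(old_loc: int, size_factor: float, kept_share: float) -> dict:
    """Line accounting for a rewrite.

    old_loc:     size before the rewrite
    size_factor: new size / old size (0.75 means it shrinks by 25%)
    kept_share:  fraction of the NEW code that is old logic carried over
    """
    new_loc = round(old_loc * size_factor)
    kept = round(new_loc * kept_share)
    added = new_loc - kept      # lines that are new logic
    removed = old_loc - kept    # old lines that did not survive
    return {
        "net": new_loc - old_loc,
        "added": added,
        "removed": removed,
        "churn": added + removed,
    }

# Hypothetical starting sizes; the percentages come from the comment above.
program = rewrite_line_counts(10_000, 0.75, 0.5)
tests = rewrite_line_counts(4_000, 1.5, 0.5)

total_net = program["net"] + tests["net"]
total_churn = program["churn"] + tests["churn"]

print(program)      # {'net': -2500, 'added': 3750, 'removed': 6250, 'churn': 10000}
print(tests)        # {'net': 2000, 'added': 3000, 'removed': 1000, 'churn': 4000}
print(total_net)    # -500
print(total_churn)  # 14000
```

Under these assumptions the repository shrinks by 500 lines while 14,000 lines are touched, so a net LOC count and an added-plus-removed count tell very different stories about the same work.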

Interesting, so maybe a more accurate measure is LOC added + LOC removed.

Or even better, don't measure LOC.

LOC is a terrible measure of productivity, but I like to think about it more as a measure of capacity. LOC/day is useful as an upper bound in the same way pages per day is an upper bound for authors. Stephen King is known for being one of the most prolific writers, and he can’t produce more than 8 publishable pages a day (the top hit on Google suggests his average may be closer to 6). Knowing that LOC/day is so low on average can really help keep estimates honest and remind us how truly difficult what we do actually is.

Isaac Asimov, another prolific author, averaged something like 2800 words per day[1] over the most productive period of his career, which works out to about 9 pages, so that seems like a good estimate.

1: Google gives a much higher number, but the results all seem to go back to the same estimate, which is more hand-wavy than a printed source I found in college when researching it. Sorry, I don't have it to cite. The other number from that source was 1800 published words per day if you start from his first published book, which is absurd, since new authors tend to be much less prolific.

LOC is a poor way to measure progress, but it's not a bad sanity check on time estimates for a proposed project. Most experienced programmers have some points of reference where they know the approximate LOC count, and can make a rough functionality/complexity analogy to a proposal. If the proposal allows less time than the team would need to produce a comparable amount of tested code at 10-100 lines/day (wherever your team is in that range), then you should probably revisit the estimate.

Apply a few of these kinds of comparisons using different metrics, and you may be able to improve your estimates.
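The sanity check described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not an estimation method: the function names, the 20,000-line reference project, the team of 4, and the 30-day proposal are all hypothetical, and only the 10-100 lines/day range comes from the comment.

```python
def minimum_days(comparable_loc: int, team_size: int, loc_per_dev_day: float) -> float:
    """Fewest working days the team would need to produce that much tested code."""
    return comparable_loc / (team_size * loc_per_dev_day)

def estimate_is_plausible(proposed_days: float, comparable_loc: int,
                          team_size: int, loc_per_dev_day: float) -> bool:
    """True if the proposal leaves at least the minimum time implied by LOC/day."""
    return proposed_days >= minimum_days(comparable_loc, team_size, loc_per_dev_day)

# Hypothetical proposal: comparable to a 20,000-line project, team of 4, 30 working days.
comparable_loc, team_size, proposed_days = 20_000, 4, 30

for rate in (10, 50, 100):  # lines per developer per day
    days = minimum_days(comparable_loc, team_size, rate)
    ok = estimate_is_plausible(proposed_days, comparable_loc, team_size, rate)
    print(rate, days, ok)
# 10 500.0 False
# 50 100.0 False
# 100 50.0 False
```

Even at the optimistic end of the range the hypothetical 30-day proposal falls short, which is the signal to revisit the estimate.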

Bill Gates said something like: measuring progress on a program by LOC is like measuring progress on an aircraft by its weight.

"Measuring programming progress by lines of code is like measuring aircraft building progress by weight."

I'm no expert in aircraft, but I'm guessing that in both cases the relationship between progress and the metric in question is logarithmic: the first bits to be put in place represent the bulk of the (weight|LOC), but only a relatively small percentage of overall time and effort.

Same metric applies - lighter is better.

;)

Alrighty.
BRB reformatting the codebase.

Has progress on that been fairly continuous with the project growing steadily (almost) nonstop? Or was a big portion of that the groundwork just to get it to a point of being usable in the beginning?

With projects I work on, I'll often write a few thousand lines of foundation in a couple of weeks, then I'm adding a line here and there as needed. The first 1000 lines are always easy. The next 10 can take days.

I think it has been fairly constant. The main change is that in the first years I might do a ton of work one month and nothing the next, whereas now it is spread more evenly.

And if you have to bounce around across different codebases, you obviously go a lot slower.

If I'm pounding out the same boilerplate code I've written for every greenfield app, I can go at a phenomenal speed. But if I'm put into code I'm not very familiar with, 98% of my effort is understanding the existing code and 2% of it is making that 1-10 LOC change.

Over time, over multiple iterations on multiple versions on various platforms (maybe excluding tests), this seems about right.

I remember when I was prototyping an OpenGL thing on SunOS or a Renderman on same (maybe Solaris?), I was working ALL of the time and cranking out LOTS of code. Then refactor-refactor, fix technical debt, slowly add features without breaking, more automated tests, more platforms, and then, tada! My effective rate of coding (measured by LoC) was depressingly low.

I guess that I'm happy that it appears that I'm effective and relatively efficient, but I'm not the LoC-cranking-out machine that I thought I was. Sobering.

Just curious, do you still code in vacation days?
Nope.
Thanks! Good to relax~~