Hacker News
by jblow 2438 days ago
Some of the points you raise are matters of style (I have files that are 10k lines long and think they are better that way; also I don't think unit tests are as useful as claimed), but yeah, when you have 600 files and 11MB of code to do something that is supposed to be simple and also is security-critical ... you have a big problem.

What a coincidence. I watched your video on Preventing the Collapse of Civilization just yesterday. (Mostly agree with things said there, with the exception of the part about Smalltalk. Ironically, the video was recommended to me in a Smalltalk chat channel.) Seems relevant.

> Some of the points you raise are matters of style (I have files that are 10k lines long and think they are better that way)

It's true that line counts are a matter of style. However, some coding styles tax cognitive bandwidth and working memory more than others, which causes people to skim over specifics and miss bugs. Even if you are comfortable with a 10K LOC file you wrote, whoever reviews your code will probably be overwhelmed until they fully comprehend its structure. Grouping related things into smaller files is a way to focus their attention and communicate the intended relations between code units.

> also I don't think unit tests are as useful as claimed

In general, I agree. Not a fan of TDD zealotry. But in this particular case some unit tests would be beneficial. There is a lot going on in some methods.

They do seem to run PVS-Studio static analysis, which is somewhat reassuring, but I don't think that's enough in code that is so important, complex and widely used.

It should be very obvious once you think about it that breaking code into more files, and procedures into subprocedures, is pushing complexity around rather than simplifying ... and it’s pushing the complexity from somewhere visible, into an invisible structure that the viewer has to then reconstruct in his mind. I think in a great many cases it’s not a good idea.

There is a reason the OpenBSD folks decided to write doas :)

> I have files that are 10k lines long and think they are better that way

Whether you prefer this practice stylistically or not doesn't change the fact that it is objectively worse in terms of code quality and maintainability, even if you are able to compensate for those deficits with your own skill/experience.

You might have no problem navigating code that you've already built a complete mental model around, but any future collaborators or inheritors of that code will have to waste hours or days rebuilding a mental model that you could have just codified in the structure of the source code.

> objectively worse

That's an opinion. There's nothing objective about the assertion that many small files are inherently more secure. However, that many small files and one large file have equivalent power, can express all the same programs, and can be equally secure or insecure... those actually are all objective facts.

I didn't say it was objectively more secure. Obviously it's possible to have code that is of poor quality/maintainability but still happens to be more secure than its alternatives. Maintaining good code quality is not a magic bullet that prevents all bugs.

Then replace "secure" with "maintainable" or "high quality". It's still a subjective opinion, unless you can prove that large files are inherently less maintainable or lower quality (which you can't, because all the arguments here are subjective and, like almost everything in software development, these opinions are driven by fashion rather than any evidence).

It's popular to use "objectively" for "clearly" and I've been guilty of this myself, but let's try to reserve the term for actual matters of fact, lest it lose any meaning.

Besides it being unclear whether one is better or worse, it's not clear there's even a practical difference when a significant number of programmers will be navigating and learning code through an IDE anyway. They'll be jumping from function to function, barely even aware of which file each one is in.

Personally I prefer smaller files, but that's only because of compile times. I started my career with punch cards, so I know what slow compile times look like. For a long time they seemed to be getting ever shorter, until I switched to a "modern C++" project. Now compile times are not quite back to the punch-card days, but back to maybe the late 90s, and breaking stuff up into smaller files helps.