by timhrothgar 1615 days ago
Perhaps I should update the question. I'm not referring to ALL software quality. I'm referring to the quality of codebases that are 1) old, 2) large, and 3) supported by many people.

You make a good point though. Perhaps I just miss the good ol' days of working on small teams with small codebases that were pretty easily maintained.

5 comments

Even with those caveats, it's unclear whether the premise is satisfied. The Linux kernel is old (old enough to drink!), large (and getting larger!), and supported by many people. Has its quality declined? I don't think so. It supports more hardware than ever. Kernel panics don't happen nearly as often as they used to. New features (BPF) make kernel programming easier. It's difficult to say that the Linux kernel's quality has declined over time.
People bring up the Linux kernel in the same way they bring up Oprah as an example of being successful in America.

99.99999% of software in big corporations will not have even 5% of the quality of the Linux kernel. The kernel has such high quality because Linus is a dictator, training everybody, in not always so nice ways, to write code that doesn't break anything and that is maintainable.

He cares about the code and he has the status and the mandate to prevent it from becoming shit.

The Linux kernel is an easy example, sure, but it's not the only one. You can say the same thing about proprietary operating systems and software as well. It's been a long time since I've had misbehaving applications blue-screen Windows. That's something that used to happen daily when I ran Windows 98. Or if you prefer MacOS, I remember MacOS Lion being incredibly unstable, with beach-balls and crashes galore. That's all been fixed.

Applications have gotten better too. Browsers have gotten significantly more robust against misbehaving pages. Microsoft Office doesn't eat my work (even if I forget to hit ctrl-s).

In fact, I would go so far as to say that the only software which has gotten worse is video games. Used to be you could put in a disc, install the game, and be reasonably assured that you were getting a playable final product. Today, you put in the disc, install the game, and then have to download multiple gigabytes of patches... and the game is still often buggy (Bethesda, I'm looking at you!).

> 99.99999% of software in big corporations will not have even 5% of the quality of the Linux kernel.

That is true. But it was equally true when Linus Torvalds dropped the first version of the Linux kernel all the way back in '91. It's not clear to me that things have gotten worse since then.

Regarding games, I wonder if it is just an issue of expectations.

For the longest time, AAA studios mostly released simple first person shooters with straightforward enemy AI and simple physics. And Bethesda and Obsidian released gloriously buggy RPGs. Nowadays, every game includes open world elements, RPG elements, and more complicated NPC interactions... and it turns out that they are all full of bugs. Complex games have complex problems that don't reveal themselves until players do weird things.

Not to mention all these RPG systems add a whole additional layer to mess up -- character stats might not be 'buggy' exactly, but they might be very poorly 'balanced.' It is really easy to miss some skill interactions, and sometimes multipliers end up exploding. I mean, we saw Blizzard fail to balance Diablo II for like a decade or so, and Wizards of the Coast tries to balance D&D, but that takes all the fun out of character building. RPGs of any significant complexity are, I think, just fundamentally prone to exploding numbers.

> Not to mention all these RPG systems add a whole additional layer to mess up -- character stats might not be 'buggy' exactly, but they might be very poorly 'balanced.'

I'm not talking about balance issues. I'm specifically talking about issues where the game is clearly and obviously not working as designed. Clipping errors. Objects flying off into the sky because of problems with collision detection. Textures not loading. NPCs walking facefirst into walls because of buggy pathing code.

I think what's happened is that game development studios' reach has exceeded their grasp. They want to make these huge open worlds with numerous storylines, quests, etc, all with rich graphics and physics, but they just don't have the time or budget to get it right. So they rush out what they have, knowing full well that it has massive numbers of untested corner cases and hope that the PR blowback from the bugs isn't so bad that it ruins their reputation.

I 100% agree -- the balance issues just popped into my head as a tangent. Another example of how engines and rendering improve over time but there are lots of sort of 'other problems' that really don't scale with technology advancements.
> Complex games have complex problems that don't reveal themselves until players do weird things.

If you don't think them through they will. Just write it right.

If it were that simple, don’t you think game developers would have done just that? I have worked both in and outside the games industry, and it is clear to me that AAA games are orders of magnitude more complex than your everyday biz app.
I mean, I don't work in games, this is just an observation as a player who's bumped into his fair share of bugs. But if I bump into Bethesda I'll mention that advice to them.
> 99.99999% of software in big corporations will not have even 5% of the quality of the Linux kernel.

Having done a lot of work on the kernel, I think you’d be surprised at the relatively low quality of a lot of kernel drivers.

The core code is generally quite good, but it’s not true that the kernel is full of pristine code.

I’ve worked at multiple companies where code quality standards exceeded a lot of the weirder stuff I’ve seen (and often fixed) in kernel driver code.

I don't think it's just Linus (though he certainly helps).

The kernel, unlike many other understaffed open source projects, has lots of developers working on it (the majority paid these days). At the same time, unlike software built by corporations, there are no non-technical PHBs to say "ship it now and clean it up later".

Linux has found the sweet spot of getting companies on board to provide paid developers (it's much easier to do good work when you can allocate large blocks of time to it because you're paid to do it) whilst at the same time preventing them from having too much say over the technical directions and timelines. Typically a company employing kernel developers will have a say in what they do (drivers, core kernel, ...) but not in how they do it.

But it's not really a model that can work for all open source software.

For related reasons, I think that companies that build lots of software but whose product isn't actually software (like Google, which is really an advertising company) often produce better quality software than pure software companies (like Microsoft), because the technical staff are left more alone by management, who prefer to concentrate on the "real business".

Right, politeness is killing corporate software.
I had this specific example in mind. How has Linux done it? Was it ultimately due to the benevolent dictator for life (BDFL) management practice?
The kernel isn't working on commercial deadlines. Maintainers don't have to worry about their stack rank at the end of the year. They don't have to justify their head count. Some of this happens internally I'm sure, at places like IBM, Red Hat, and Intel, but none of it is coming from Linus.
Probably that, and the average Linux contributor being a lot brighter than your run of the mill programmer.
If someone were to design a good small set of opcodes, we could go back to rolling machine code by hand. While at it, we might as well fill the ATX box with cute tiny isolated computers that talk over IP addresses. They have to be small enough so as not to leave space for an OS. Some ROM is fine ofc.
>> Kernel panics don't happen nearly as often as they used to

The earliest kernel I used in something one might call production was 1.2.12, around 1995. I must say that even then, with those early kernels, I had no panics at all and much higher uptimes (patching for security wasn't as much of an issue at that time ;-) )

There were many exploitable remote root holes then, but no one was attacking them because Windows was a much juicier target.
> Perhaps I should update the question. I'm not referring to ALL software quality. I'm referring to the quality of codebases that are 1) old, 2) large, and 3) supported by many people.

Because Google[1] and Facebook came along and scared everyone by iterating at a LOLWTF pace. Companies in surrounding spaces looked at what was taking up time in their release schedules, and the answer was "test passes." So they fired all their testers and told devs to add unit tests, but unit tests don't cut it.

Companies that used to have immaculate software quality had dedicated test automation engineers who had the job of abusing software in crazy bizarre ways. Then they hired armies of manual testers to go over anything that hadn't been automated.

Lots of problems existed with this system, one of which was that career advancement for software engineers in test was limited, because it is hard to get recognized for the two primary jobs of an SDET:

Signing off on code

Blocking a release on quality grounds

So you had a gradual rot of SDET and test orgs at companies, with pools of brilliance that slowly got drained as the best engineers got tired of being undervalued.

Start from that base, and then around 2010 everything needs to start "moving fast".

Apple and MS both get rid of their test teams, and with two of the largest employers of dedicated software engineers in test getting out of the field, the entire field itself falls apart. Now it is career suicide, an ever shrinking career path that pays far less than doing "real" development work.

That leaves us where we are today. Everything sucks and breaks all the time.

[1] Everyone forgets how bad the first 5 major versions of Android were.

I have no QA testing, and the code I maintain hasn’t had a customer production bug for 5+ years. And we are talking complex software written in C++. The reason why is 90,000 end-to-end use case tests. I routinely implement new features or refactor major parts of the application, and the tests will tell me if it is ready for production or not.
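For anyone wondering how an end-to-end use case test differs from a unit test, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: `run_app`, the command format, and the scenarios are hypothetical and have nothing to do with the commenter's actual C++ suite. The point is only that each test pushes a whole scenario through the application's public entry point and compares the complete output, rather than poking at one function in isolation.

```python
# Hypothetical toy application: a line-oriented inventory tool.
# None of these names come from the thread; they only illustrate the idea.

def run_app(script):
    """Public entry point: run a whole command script, return all output."""
    stock = {}
    out = []
    for line in script.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        cmd, args = parts[0], parts[1:]
        if cmd == "add":
            name, qty = args[0], int(args[1])
            stock[name] = stock.get(name, 0) + qty
            out.append(f"ok {name}={stock[name]}")
        elif cmd == "remove":
            name, qty = args[0], int(args[1])
            if stock.get(name, 0) < qty:
                # Reject the request and leave the stock unchanged.
                out.append(f"error insufficient {name}")
            else:
                stock[name] -= qty
                out.append(f"ok {name}={stock[name]}")
        elif cmd == "report":
            out.append(",".join(f"{k}={v}" for k, v in sorted(stock.items())))
        else:
            out.append(f"error unknown {cmd}")
    return "\n".join(out)


# Each use case is (scenario name, full input script, full expected output).
USE_CASES = [
    ("restock then sell",
     "add widget 5\nremove widget 2\nreport",
     "ok widget=5\nok widget=3\nwidget=3"),
    ("oversell is rejected and leaves stock untouched",
     "add widget 1\nremove widget 2\nreport",
     "ok widget=1\nerror insufficient widget\nwidget=1"),
]


def run_use_cases():
    """Return the names of failing use cases (empty list means all pass)."""
    return [name for name, script, expected in USE_CASES
            if run_app(script) != expected]


if __name__ == "__main__":
    failures = run_use_cases()
    print("all use cases pass" if not failures else f"FAILED: {failures}")
```

Because the tests only touch the entry point and the observable output, the internals of `run_app` can be refactored freely and the same scenarios still say whether behavior changed, which is presumably why a large suite of this kind can stand in for a manual test pass.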
Old large codebases are mostly maintained by people who weren’t around when the code was originally coming into existence. They don’t know the implicit design assumptions and decisions, or even the history of requirements. One thing you’ll find in nearly any software project of any age is a lack of good documentation of those things, so as you lose the community folklore, people will start making myopic changes, cargo-culting, violating future-looking design principles, and so forth. Pretty soon you just have a pile of incoherent features and making systematic improvements is hard because the code is no longer systematic.
> 1) old, 2) large, and 3) supported by many people.

Software entropy implies that software that doesn't change will always suffer entropy if the environment changes. In a vacuum, we would never need to change software once it was "feature complete". But that's not how the world works. Environments change and so that inevitably means software will corrode.

Candidly, I think you're just looking at the world through rose tinted glasses.

You listed the reason it's a mess yourself - many people worked on it and it's old.

For big companies, there is no incentive at all for a developer to personally care about and do battle over things like code quality and reducing complexity.

If they use any kind of agile (lol) system, it will just be about polishing the turd so it doesn't break and adding features in some way that doesn't require big rewrites.