by kbolino 638 days ago
Leaning into LTS is nice until you near EOL and have to migrate everything in an often Herculean effort to work with the next LTS release.
4 comments

Like 12 years of life cycle is not enough for you to plan a transition?

You can use the entire life cycle but no one is forcing you to. You can update from one LTS to another every 2 years, or 4 years, or 5 years... you decide.

I don't really think we're in disagreement here. The longer you wait, the harder the transition will be. LTS is a good foundation, and usually the right choice for "enterprise" or "business" settings, but you should not rely overmuch on any one LTS release's way of doing things, when the wider Linux ecosystem moves much faster.
The longer you wait the harder the pain. The less you wait the more frequent the pain. So it depends on the function that converts intensity and frequency to suffering :p But, most importantly, the fact that LTS gives you a choice is what I was highlighting.

For the scope I operate in, which is pretty standard Linux packages (PostgreSQL, MariaDB, Nginx, Docker, OpenVPN, OpenSSH), the changes between 16.04 and 22.04 have been quite OK to deal with.

It's a tradeoff: a big effort once every 4 or 5 years vs. a hopefully smaller effort every year. Sometimes the intermediate smaller steps help you move forward, sometimes they just mean more migrations. Sometimes the software/hardware you need means you can't use an LTS OS at all.
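To make the tradeoff concrete, here is a toy model in Python. Everything in it is an assumption for illustration: the function name `total_effort`, the fixed overhead per migration, and the idea that accumulated drift grows as a power of the time waited. None of the numbers are measurements; it's only a sketch of why the answer depends on how fast the pain compounds.

```python
def total_effort(years, interval, base_cost, drift_per_year, exponent):
    """Total migration effort over `years` if you migrate every `interval` years.

    Hypothetical model: each migration has a fixed overhead (`base_cost`)
    plus a drift term that grows with the time since the last migration,
    raised to `exponent` (1 = pain grows linearly, 2 = pain compounds).
    """
    migrations = years // interval
    per_migration = base_cost + (drift_per_year * interval) ** exponent
    return migrations * per_migration


# Linear drift: fixed overhead dominates, so migrating rarely is cheaper.
yearly_linear = total_effort(10, 1, base_cost=2, drift_per_year=1, exponent=1)  # 30
rare_linear = total_effort(10, 5, base_cost=2, drift_per_year=1, exponent=1)    # 14

# Compounding drift: waiting makes each migration much worse.
yearly_compound = total_effort(10, 1, base_cost=2, drift_per_year=1, exponent=2)  # 30
rare_compound = total_effort(10, 5, base_cost=2, drift_per_year=1, exponent=2)    # 54
```

With these made-up inputs, the rare migration wins when drift is linear and loses when it compounds, which is about all a model this crude can tell you.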

If possible, it's nicer to pick established, mature software for as much of your stack as you can, so that there's less of a difference in APIs over longer time frames. But it's not always possible.

It's not terrible in my experience of doing it several times now.

It is definitely less terrible than trying to unfuck tangles of terraform / terragrunt / yaml / bits of cloud infra.

I went through the migration from CentOS 6 to 7 and never want to do anything like that again. The good news, I guess, is that it never will happen again: CentOS is basically dead anyway, and it's not likely that so many core pieces of system software will change that drastically anymore.
I did CentOS 3 -> 4 -> 5 -> 6 -> 7 -> Debian. Very few problems.

(30 nodes)

I can't imagine you leaned into any one of those releases, then. That sequence involves major changes to the kernel, the init system, the configuration management tools, the core libraries, Apache, Python, Perl, etc. Any one of those alone could (and did, in my experience) trigger a major rewrite of configuration and/or code.

I'm glad it was painless for you. In my experience, it was not, and most of the reasons were beyond my control.

What does lean into mean here? A lot of software from 20 years ago compiles (if needed) and runs fine on the latest versions.
Every major release of every major distribution makes choices. These are choices about what software to include in the first place, what versions of that software to pin (especially for LTS releases), what default configuration to provide, recommendations about how to solve certain problems, etc. These choices are made based upon the experience and opinions of the distribution maintainers. However, those maintainers are (usually) not major contributors to the software they're distributing. This means distros can make "bad" choices, choosing for example to focus on software that eventually dies out, or recommending configurations that eventually get deprecated or removed, etc. Sometimes, these choices are even made in a way such that they exclude what will become the winning alternative, leaving no migration path except complete and total overhaul.

If all Linux is to you is a place to run some application software, these choices are mostly irrelevant. As long as the software you care about continues to run, the other things are just picayune details. If this comes off as derisive, I apologize, because I'm actually broadly endorsing that view of things, as much as it is possible to achieve. But if you start really taking advantage of the things which the distribution provides out of the box and recommends, especially around large-scale multi-system operation, you end up buying into the distribution's choices. When a large organization you're a part of does it too, now the sunk costs really start to mount. As the Linux ecosystem continues to evolve, especially in different directions than the distribution chose at the time, the cost of migrating to later releases grows. This is all a good reason to me to not marry oneself so tightly to those particular choices, but that isn't always feasible with deadlines and compliance requirements and so on bearing down on the sysadmin.

There's also an even bigger problem that can arise: the distribution can just end, as with the termination of CentOS, leaving lots of people hanging. In that case, I know some who started to pay Red Hat for RHEL, but most seem to have moved on to other distros, like Ubuntu. That kind of migration has a lot of the same issues, too, once again leading me to recommend not leaning into the particulars too much.

apache -> nginx. Python versions. postgres. All fine.
Did you crossgrade to Debian in-place?
What is it that people do that breaks so often due to lack of backwards compatibility from the OS?

IMO, the lure of an LTS is that you don't need to keep testing whether your computer is still working every week when a set of updates comes. It's not that the details of the things your software depends on remain frozen. If your software depends on the details of something, you should add it as a dependency.

The bigger problem IMO is not that things break, it's that if you depend on one LTS release too heavily, and you wait too long to migrate from one LTS to another, everything breaks all at once.

What should be a gradual migration as new things develop turns into a singular nightmare.

What are you depending on in the OS that isn't extremely backwards compatible?

Once in a decade you get something like a breaking upgrade of nginx, or the glibc debacle of 2003. That may take a person-week to fix[1], which can hardly be called "herculean".

[1] That is, if you go with 1 person * 1 week. If you try to go with 7 people * 1 day, it will suddenly cost 7 person-weeks. But the only way upgrading becomes such a rush is if you borked a lot of things prior to it.

Off the top of my head, some of the things that have broken at an LTS transition that I've been involved with are out-of-tree kernel module builds, C code using OpenSSL, Puppet config, Salt config, RPM specfiles, Python code, Perl code, Apache configs, shell scripts, Java code, bootloader configs, bootstrap scripts, and init scripts/configs (esp. sysvinit to systemd). Any one of these things is not a problem in isolation; the problem is having to fix all of them at once. Too much complexity put into any one of them (often arising from external requirements or rushed implementations) also makes migrating harder. Waiting until the 11th hour on the EOL clock just adds to the stress of the process.

Many of my bad experiences were because of corporate policies and lack of proper prioritization at levels above system administration. However, the sysadmin does have some choice in the matter, especially when greenfielding. You can turn stability into a vice if you're not careful.