Hacker News
by ezekiel68 457 days ago
In our environments, we use Fedora. We run the package upgrades weekly in a test env, make sure the functional/integration tests pass, then roll those upgrades forward to the stage and prod envs. Very seldom (twice in 5 years) have we caught a problem in the lower environment that blocked an upgrade from moving on toward prod. In both of those instances, newer package upgrades in the test env fixed the problems within a week or two, without us needing to open an issue ourselves in the Fedora forums.
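
For illustration only, the test -> stage -> prod flow described above could be sketched roughly as follows. This is not the poster's actual tooling: `promote`, `upgrade`, and `tests_pass` are hypothetical names, and re-checking the tests in every environment (rather than only in test) is an assumption.

```python
from typing import Callable, List, Sequence

# Environment order taken from the post; everything else is hypothetical.
ENVIRONMENTS = ("test", "stage", "prod")


def promote(
    upgrade: Callable[[str], None],
    tests_pass: Callable[[str], bool],
    environments: Sequence[str] = ENVIRONMENTS,
) -> List[str]:
    """Upgrade environments in order and stop at the first failing one.

    Returns the environments that were upgraded and passed their tests.
    """
    promoted: List[str] = []
    for env in environments:
        upgrade(env)  # e.g. apply the weekly package upgrades
        if not tests_pass(env):
            # A failure in a lower environment blocks the roll-forward.
            break
        promoted.append(env)
    return promoted
```

With a failure in test, nothing after test is touched, so the upgrade simply waits for the next weekly run.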

Still, after one nasty experience in 2023, we always wait six to eight weeks after a new Fedora version is released before starting to attempt one of those upgrades. This has worked spectacularly well for us. We get all the benefits of newer mainline kernel drivers for recent server motherboard chipsets and CPUs while maintaining a very solid OS. CVEs seldom even get close to us, since they often affect much older versions of system packages.
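
A minimal sketch of that waiting period as a date check, assuming the six-week lower bound is used as the gate. The function name and the dates in the usage note are made up for illustration.

```python
from datetime import date, timedelta

# Lower bound of the "six to eight weeks" mentioned in the post.
MIN_WAIT = timedelta(weeks=6)


def ready_for_release_upgrade(
    release_date: date, today: date, min_wait: timedelta = MIN_WAIT
) -> bool:
    """Return True once at least `min_wait` has passed since the release."""
    return today - release_date >= min_wait
```

For a hypothetical release on 2024-01-01, `ready_for_release_upgrade(date(2024, 1, 1), date(2024, 2, 12))` is the first day the check returns `True`.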

2 comments

That's an excellent example of what I like to call "good IT hygiene". I too would like to know what kind of tools you have to perform the functional and integration tests, and to execute the various rollouts.
Without going too deeply into details, we use common non-cloud-native platforms such as Jenkins to configure and schedule the tests. Unit tests are often baked into Makefiles, while functional/integration tests are usually written as shell scripts, Python scripts, or (depending on what needs to happen) even Ansible playbooks. This allows us to avoid cloud vendor lock-in while still using the cloud to host this infra and the deployment envs themselves.

Edit: we use Makefiles, not because we are writing code in C (we are not) but because our tech culture is very familiar with using 'make' to orchestrate polyglot language builds and deployments.
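
As a rough idea of what a Python-script functional test of this kind might look like (a sketch, not the poster's scripts): it runs a list of commands and reports which ones failed, so that a scheduler or a `make` target can act on the exit status. The check inside `main` is a hypothetical placeholder.

```python
import subprocess
import sys
from typing import List, Sequence


def run_checks(commands: Sequence[List[str]]) -> List[List[str]]:
    """Run each command and return the ones that exited non-zero."""
    failures: List[List[str]] = []
    for command in commands:
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            print(f"FAIL {command}: {completed.stderr.strip()}")
            failures.append(command)
    return failures


def main() -> int:
    # Hypothetical smoke check: core modules still import after an upgrade.
    checks = [[sys.executable, "-c", "import ssl, sqlite3"]]
    return 1 if run_checks(checks) else 0
```

Ending the script with `sys.exit(main())` would turn any failed check into a non-zero exit status.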

That's quite impressive by the standards I'm used to. Do you mind if I ask what scale you're operating at and what tools you use to manage the staged rollout?
re: tools, see my reply to a peer comment.

Our scale isn't ginormous. Fewer than two dozen microservices and we sometimes fudge the 'microservice' definition somewhat to allow some of those services (such as pure lookups) to host their isolated tables in the same database schemas/instances. We always mock external web service calls in the test env since a Fedora update either will or will not screw up the ability to hit an endpoint via HTTP (has never happened) -- in other words, hitting a real, live service would add nothing to the results of the dev test outcomes.
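
To make the mocking point concrete, here is a minimal sketch using Python's standard `unittest.mock`. The `fetch_rate` function, its URL, and the JSON payload are all invented for the example; the thread does not say which mocking approach is actually used.

```python
import json
import urllib.request
from unittest import mock


def fetch_rate(currency: str) -> float:
    """Hypothetical service code that calls an external HTTP endpoint."""
    url = f"https://rates.example.invalid/{currency}"  # made-up endpoint
    with urllib.request.urlopen(url) as response:
        return json.load(response)["rate"]


def test_fetch_rate_uses_mocked_endpoint() -> None:
    # Replace urlopen so the test never touches the network.
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        fake_response = fake_urlopen.return_value.__enter__.return_value
        fake_response.read.return_value = b'{"rate": 1.25}'

        assert fetch_rate("EUR") == 1.25
        fake_urlopen.assert_called_once_with(
            "https://rates.example.invalid/EUR"
        )
```

The test exercises the service's own parsing logic after an OS upgrade while leaving the real endpoint out of the picture.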