by contingencies 3264 days ago
So... they're not running any kind of devops system at all. If they were, they could just run the upgrade on the image, test it, deploy it. All the "months of careful planning and many many tests" they did are basically wasted time.

I wouldn't be proud of this; quite the opposite. I would suggest critically reviewing the entire infrastructure management strategy, since losing months to a single upgrade is obviously indicative of greater problems.

3 comments

It sounds to me like they are in fact running a giant devops system, all for the purpose of not using virtual static IPs.

Instead of just provisioning fresh VMs and migrating customer data, they're doing this massive upgrade in place on existing machines to avoid losing the assigned IPs.

I guess they decided the benefits of being cloud provider agnostic outweighed the downside of spending months of man hours automating in-place OS upgrades.

A good system can handle multiple requirements. You present a false dichotomy.
I'm not sure that I'm quite as critical; the technique itself looks useful for some scenarios.

I've worked on a Windows project where we solved a similar problem by booting from vhd, so you can "just" write a new vhd, uniquify it with the per-machine config, update the boot menu and reboot. All data are on a separate volume, naturally.
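
To make the sequence concrete, here is a rough sketch of that image-swap flow as a toy Python model. Nothing here touches real disks or a real boot loader: the `Machine` class, the `upgrade_by_image_swap` function, and the file naming are all hypothetical, invented only to show the order of the steps and why the data volume stays out of the picture.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Machine:
    """Toy model of one host: a boot menu plus a separate data volume."""
    hostname: str
    boot_entries: Dict[str, dict] = field(default_factory=dict)
    default_entry: Optional[str] = None
    data_volume: Dict[str, str] = field(default_factory=dict)


def upgrade_by_image_swap(machine: Machine, golden_image: str, version: str) -> str:
    """Stage a new OS image beside the old one and make it the default boot target."""
    entry = f"os-{version}"
    # 1. "Write a new vhd": a per-machine copy of the golden image (name is made up).
    image_path = f"{machine.hostname}-{version}.vhd"
    # 2. "Uniquify it": attach the per-machine config (here just the hostname).
    machine.boot_entries[entry] = {
        "image": image_path,
        "source": golden_image,
        "config": {"hostname": machine.hostname},
    }
    # 3. "Update the boot menu": the previous entry stays around for rollback.
    machine.default_entry = entry
    # 4. A reboot would follow; machine.data_volume is never modified.
    return entry


m = Machine("web01", data_volume={"db": "customer rows"})
upgrade_by_image_swap(m, "golden-v1.vhd", "v1")
upgrade_by_image_swap(m, "golden-v2.vhd", "v2")
print(m.default_entry)         # os-v2
print(sorted(m.boot_entries))  # ['os-v1', 'os-v2']
```

The point of the design is that the old entry is still in the boot menu after the swap, so rolling back is just flipping the default and rebooting again.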

I'm surprised they went to all this effort for only 2000 machines though.

I work on a "devops" team and this is similar to what we do. We have a "reinstall the whole damn image" flag but using it by default would just cause unnecessary downtime for us, especially in our non-production environments.