Hacker News
by wmf 5234 days ago
First you fail over to another server, then you repair the original failed server; this takes repair out of the critical path. This generally requires no changes to an app as long as it's crash-safe. There's plenty of software that does this (e.g. Heartbeat, Red Hat Cluster), but because it doesn't work in the cloud, people forgot about it.
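To make the ordering concrete, here is a rough sketch of the fail-over-then-repair idea. It is a toy in-memory model, not how Heartbeat or Red Hat Cluster actually work; `Node`, `FailoverPair`, `check`, and `repair_pending` are names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a server; "healthy" would come from a real health check.
    name: str
    healthy: bool = True


@dataclass
class FailoverPair:
    active: Node
    standby: Node
    repair_queue: list = field(default_factory=list)

    def check(self) -> Node:
        """One monitoring tick. If the active node is down and the standby is up,
        promote the standby right away and only queue the failed node for repair."""
        if not self.active.healthy and self.standby.healthy:
            failed = self.active
            self.active, self.standby = self.standby, failed
            self.repair_queue.append(failed)
        return self.active

    def repair_pending(self) -> None:
        """Runs later, out of band: service is already restored by the time this is called."""
        while self.repair_queue:
            node = self.repair_queue.pop()
            node.healthy = True  # placeholder for the real repair work


pair = FailoverPair(active=Node("a"), standby=Node("b"))
pair.active.healthy = False   # simulate a crash of the active server
serving = pair.check()        # fail over first
pair.repair_pending()         # repair afterwards, off the critical path
```

The point of the split is that `check` never waits on a repair: the time to restore service depends only on detecting the failure and promoting the standby.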