Hacker News
by hkchad 689 days ago
How does one solve for their own device management/fallback strategy? They are in an industry that is regulated to have endpoint protection. So even if they had a 1:1 DR zone, they would have been running MS and CS on both. Without safeguards that only Microsoft or CS could put in place for them (a staggered rollout), both primary and DR would go down. It's really hard to put the blame on the customers (Delta) in this instance.

No industry is going to invest in 2 completely independently built and run systems that share zero components/vendors because that's what it would have taken here.

3 comments

When their computers went down, plenty of airlines started writing boarding passes by hand to keep things moving. Delta's operations, meanwhile, were completely frozen for a full week, which cost them half a billion dollars. So it's not like the problem was unavoidable or unsolvable. Everyone but Delta managed.
Delayed update rollout is an enterprise-level Windows setting that’s used by my company (as well as on my personal Windows machine through registry edits) specifically to protect business-critical systems from this scenario. While Microsoft and CrowdStrike should not have pushed this broken update, are they actually responsible for Delta’s or anyone else’s losses? My opinion is no, but we’ll see how a lawsuit shakes out.
I’d say test updates before rolling them out? I have about 10 standalone machines at work that test updates a day before they go out, or in emergencies at least a couple of hours before. The tests are basic (run some software, do some reboots, run a couple of benchmarks), and I’m by no means an IT genius, but I would have caught this fk up.
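
For anyone who wants the idea spelled out, here is a rough Python sketch of that kind of gate: push the update to a small ring of test machines first, run the checks, and only move on to the next ring if every machine passes. Everything in it is made up for illustration (the `Ring` class, the `staged_rollout` function, the host names, and the simulated health check); it is not how CrowdStrike, Microsoft, or any particular management tool does it.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Ring:
    """A named group of machines that receive an update together (hypothetical)."""
    name: str
    hosts: List[str]


def staged_rollout(rings: List[Ring],
                   apply_update: Callable[[str], None],
                   health_check: Callable[[str], bool]) -> List[str]:
    """Update rings in order and stop at the first ring with an unhealthy host.

    Returns the names of the rings that were updated and passed their checks.
    """
    completed = []
    for ring in rings:
        for host in ring.hosts:
            apply_update(host)
        # Gate: every host in this ring must pass before the next ring starts.
        if not all(health_check(host) for host in ring.hosts):
            break
        completed.append(ring.name)
    return completed


def simulate(update_is_broken: bool) -> Tuple[List[str], Set[str]]:
    """Toy simulation: a broken update makes every updated host fail its check."""
    updated: Set[str] = set()
    rings = [
        Ring("canary", [f"test-{i}" for i in range(10)]),        # assumed test machines
        Ring("production", ["gate-kiosk", "crew-scheduler"]),    # assumed critical hosts
    ]

    def apply_update(host: str) -> None:
        updated.add(host)

    def health_check(host: str) -> bool:
        # Stand-in for "run some software, reboot, run a benchmark".
        return not (update_is_broken and host in updated)

    return staged_rollout(rings, apply_update, health_check), updated
```

In this toy run, `simulate(True)` stops after the canary ring, so the production hosts never receive the broken update, while `simulate(False)` completes both rings. A real setup would also need a soak period between rings and a rollback path, which this sketch leaves out.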