|
|
|
|
|
by eastdakota
219 days ago
|
|
Because we initially thought it was an attack. And then when we figured it out we didn’t have a way to insert a good file into the queue. And then we needed to reboot processes on (a lot) of machines worldwide to get them to flush their bad files. |
|
- A product depends on frequent configuration updates to defend against attackers.
- A bad data file is pushed into production.
- The system is unable to easily/automatically recover from bad data files.
(The CrowdStrike outages were quite a bit worse though, since it took down the entire computer and remediation required manual intervention on thousands of desktops, whereas parts of Cloudflare were still usable throughout the outage and the issue was 100% resolved in a few hours)