by _8j50
2116 days ago
I think you're stuck on the politics. Level3 is their competition but initially CF was blamed. CF owes it to their customers and investors to explain to them why they had an outage and how they responded to it, and they do not need to talk in detail about an unrelated past incident (just because it was related to flowspec does not mean it was a similar outage), and they certainly should not wait for Level3's investigation. I would expect Google to have a similar explanation if a significant number of GCP customers faced an outage. You should know, it wasn't just someone else's network that went down; that network brought down a big chunk of the internet with it. I think technical honesty comes before political appearances. The #hugops and the mention of their past experience with a flowspec outage are clearly there to signal that the blogpost is not there for blaming or making L3 look bad.
The professional way to write a blog post like this is from your own perspective. Identify the proximate cause (the peer), name names if you must, talk about how awesome your own systems are, show some of your monitoring if you like, and talk about what you'll do in the future to be even more resilient to this class of problems.
That's all to the good and much of Cloudflare's blog was exactly that. It would've been fine if they had left it like that.
Acknowledging there is no postmortem (yet) but then pointlessly speculating about what it might contain is what I have a problem with.
I don't speak for Google but if I found out we had written a post like this, I would speak up and advocate to change it.