Hacker News
by cle 2493 days ago
> As to what can be done to prevent similar failures, the FCC is recommending CenturyLink and other backbone providers take some basic steps, such as disabling unused features on network equipment, installing and maintaining alarms that warn admins when memory or processor use is reaching its peak, and having backup procedures in the event networking gear becomes unreachable.

Disabling unused services? Alarms when nearing resource limits? Contingency plans? How is this the first time this has come up?! These are like security & devops 101.
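
For what it's worth, the "alarm when nearing resource limits" part is not much code. Here is a minimal sketch, assuming made-up names (`Threshold`, `check_usage`) and made-up 80%/90% levels; none of this comes from the FCC report or from CenturyLink's gear, and how you actually read memory or CPU usage off a device is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Hypothetical alert levels, as a percentage of capacity."""
    warn: float      # start warning admins at this level
    critical: float  # escalate at this level


def check_usage(name: str, used_pct: float, limit: Threshold) -> Optional[str]:
    """Return an alert message if usage is near its limit, else None."""
    # Check the higher level first so a critical reading is never
    # reported as a mere warning.
    if used_pct >= limit.critical:
        return f"CRITICAL: {name} at {used_pct:.0f}%"
    if used_pct >= limit.warn:
        return f"WARNING: {name} at {used_pct:.0f}%"
    return None


if __name__ == "__main__":
    limits = Threshold(warn=80.0, critical=90.0)  # assumed values
    for resource, pct in [("cpu", 50.0), ("cpu", 85.0), ("memory", 95.0)]:
        alert = check_usage(resource, pct, limits)
        if alert is not None:
            print(alert)
```

The hard part in practice is getting the alert to someone over a path that still works when the gear itself is unreachable, which is the "backup procedures" item in the quote.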

3 comments

It's kind of funny. These are best practices for running basic, run-of-the-mill web services, even something like a forum or personal homepage. Admittedly there's an obvious, massive difference in complexity, but you would expect the gold-standard best practices to come from something mission-critical like core Internet services and flow down to less critical services, not the other way around.
Well, it is easy to find time to add gold-plating to a small, basically useless service, but those guys are probably swamped, or trying to cut costs by being agile or something.
Cutting costs by fighting fires all the time^W^W^W^W^Wremoving smoke detectors. The classic strategy.
> How is this the first time this has come up?

It's not, obviously.

If one is cynical, it's just a way for the FCC to look like it is doing something. Or, if one attributes great, great rhetorical skill to the FCC, it's their way to lambaste CenturyLink for not even adhering to 101-level principles. I tend to believe the latter.

Here is a much more technical and thorough analysis of the incident: https://blog.thousandeyes.com/centurylink-outage-lessons-man...