Hacker News
by dan1234 1421 days ago
This new outage occurred almost exactly 24 hours after the first one - is there some automated process tripping everything?

The incident report states that it's a power issue (as they said last night), but none of my servers were rebooted last night, so it's clearly something at the network level, and it seems to be affecting all servers.

4 comments

Maybe it’s the time the cleaner plugs in the vacuum.
Or the wrong light switch on the wall.
A power issue and a networking issue are not mutually exclusive; it could very well be that the networking equipment is on a separate power delivery circuit from the actual servers.
Yes, sorry I didn't mean to imply that Linode were lying about the cause!

I look forward to their full postmortem about the issue (they promised that after the first outage, but the second happened before they'd published anything).

So they lost connectivity but were not rebooted (as in, your uptime counter didn't reset)?

Sounds like they had enough battery to safely pause all the instances and then resume them when power came back.

So what does your monitoring show?

I wouldn't expect you to be able to monitor your provider's power, but you can monitor your own. Perhaps you have your own UPS in a colo-style setup, or at least a promise of some sort. Perhaps your stuff isn't important enough for that - that's your risk assessment.

Your monitoring should be able to distinguish a power failure from a network failure.
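
A rough sketch of that check, in Python. It assumes your external monitoring gives you the start and end of the unreachable window and that you can read the host's boot time afterwards (on Linux, from /proc/uptime). The function names and result labels are made up for illustration, and the rule is only a heuristic: a host that was paused and resumed will look the same as one that only lost its network.

```python
from datetime import datetime, timedelta, timezone


def read_boot_time():
    """Return the boot time of this host (Linux only).

    Assumption: /proc/uptime exists and its first field is the number
    of seconds since boot.
    """
    with open("/proc/uptime") as f:
        uptime_seconds = float(f.read().split()[0])
    return datetime.now(timezone.utc) - timedelta(seconds=uptime_seconds)


def classify_outage(outage_start, outage_end, boot_time):
    """Guess whether an unreachable window involved a reboot.

    outage_start / outage_end: when external checks stopped and resumed
    reaching the host. boot_time: the host's boot time, read after it
    became reachable again.
    """
    if outage_end < outage_start:
        raise ValueError("outage_end must not be before outage_start")
    if boot_time >= outage_start:
        # The host came up during or after the window, so it restarted:
        # consistent with losing power (or a deliberate reboot).
        return "power-or-reboot"
    # Boot time predates the window, so the uptime counter never reset:
    # consistent with a network fault, or with a pause and resume.
    return "network-or-pause"
```

Usage would be something like classify_outage(start, end, read_boot_time()) run on each server once it is reachable again, with start and end taken from whatever external checker you use.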