by ryanbrunner 1559 days ago
Not much here points to an infrastructure-level failure - it's possible, but it's just as likely an application-level failure somewhere in GitHub's code. The API is returning 500s rather than 503s, and the failures come back relatively quickly, so it's not obviously a server outage.
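
To make that heuristic concrete, here is a rough sketch in Python. The function name, the one-second cutoff, and the decision to treat 502/504 like 503 are all illustrative assumptions, not anything GitHub publishes; it only encodes the reasoning that a quick 500 suggests application code ran and failed, while a gateway-style error suggests something in front of the application.

```python
def guess_failure_layer(status: int, latency_s: float,
                        fast_threshold_s: float = 1.0) -> str:
    """Rough triage guess from one response (hypothetical helper).

    fast_threshold_s is an arbitrary, assumed cutoff for a "quick" failure.
    """
    if status < 500:
        return "no server error"
    if status == 500 and latency_s <= fast_threshold_s:
        # A fast 500 implies a server accepted the request and its code failed.
        return "likely application-level"
    if status in (502, 503, 504):
        # Gateway/unavailable errors usually come from a proxy or load balancer.
        # Lumping 502 and 504 in with 503 is an assumption of this sketch.
        return "possibly infrastructure-level"
    # Slow 500s and other 5xx codes do not clearly point either way.
    return "inconclusive"


if __name__ == "__main__":
    print(guess_failure_layer(500, 0.2))
    print(guess_failure_layer(503, 0.2))
```

A single response proves little either way, so a guess like this is only meaningful over many samples.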

It's yellow lights across the board; literally nothing is green. That usually indicates some sort of software-infrastructure-level failure or cascading failure, not an application-level failure, which usually manifests as one or two specific services going down (depending on how you define "infrastructure" and "application" - with IaC, arguably the software-defined infrastructure _is_ an application). I doubt it's a physical hardware issue. It's rarely hardware (except when your DS catches fire).

No red lights, so it's probably not something catastrophic like that Facebook DNS SNAFU, but it definitely smells infrastructure- or deployment-scoped. Either it's a small DNS issue, or some load balancers are sending traffic to servers that can't handle it programmatically (schema change?), so they are barfing.

Only a load balancer (as a piece of infrastructure) can turn the lights yellow across the board. Not much else can.
Databases, caches, or the authentication service? For me, read-only requests are working fine and I haven't seen any issues with them. Submitting new content (e.g. comments) is where it's failing for me. It might be that their database primary is falling over.
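
As a rough sketch of that read-versus-write comparison in Python: the helper names, the 50% threshold, and the sample probe data below are all made up for illustration, and a write-only failure pattern is merely consistent with a struggling database primary, not proof of one.

```python
from collections import defaultdict

# HTTP methods treated as read-only for this sketch.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


def failure_rates(probes):
    """Return the 5xx failure rate per request kind ("read" / "write").

    probes is an iterable of (http_method, status_code) pairs.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for method, status in probes:
        kind = "read" if method.upper() in READ_METHODS else "write"
        totals[kind] += 1
        if status >= 500:
            failures[kind] += 1
    return {kind: failures[kind] / totals[kind] for kind in totals}


def suspect_write_path(probes, threshold=0.5):
    """True when writes mostly fail while reads mostly succeed.

    threshold is an arbitrary, assumed cutoff.
    """
    rates = failure_rates(probes)
    return rates.get("read", 0.0) < threshold <= rates.get("write", 0.0)


if __name__ == "__main__":
    # Hypothetical observations, not real GitHub data.
    sample = [("GET", 200), ("GET", 200), ("POST", 500), ("POST", 500)]
    print(failure_rates(sample), suspect_write_path(sample))
```

If reads and writes fail at similar rates, the check returns False, which fits the "lights across the board" picture better than a primary-only problem.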