| > a VP must sign off on changing status pages, which is... backwards to say the least. I think most people's experience with "VP's" makes them not realize what AWS VP's do. VP's here are not sitting in an executive lounge wining and dining customers, chomping on cigars and telling minions to "Call me when the data center is back up and running again!" They are on the tech call, working with the engineers, evaluating the problem, gathering the customer impact, and attempting to balance communicating too early with being precise. Is there room for improvement? Yes. I wish we would just throw up a generic "Shit's Fucked Up. We Don't Know Why Yet, But We're Working On It" message. But the reason why we don't, doesn't have anything to do with having to get VP approval to put that message up. The VP's are there in the trenches most of the time. |
I gotta say, the implication that you can't register an outage until you know why it happened is pretty damning. The status page is where we look to see if services are effected, if that information can't be shared there until you understand the cause, that's very broken.
The AWS status page has become kind of a joke to customers.
I was encouraged to see the announcement in OP say that there is "a new version of our Service Health Dashboard" coming. I hope it can provide actual capabilities to display, well, service health.
From how people talk about it, it kind of sounds like updates to the Service Health Dashboard are currently purely a manual process. Rather than automated monitoring automatically updating the Service Health Dashboard in any way at all. I find that a surprising implementation for an organization of Amazon's competence and power. That alarms me more than who it is that has the power to manually update it; I agree that I don't have enough knowledge of AWS internal org structures to have an opinion on if it's the "right" people or not.
I suspect AWS must have internal service health pages that are actually automatically updated in some way by monitoring, that is, that actually work to display service health. It seems like a business decision rather than a technical challenge if the public facing system has no inputs but manual human entry, but that's just how it seems from the outside, I may not have full information. We only have what Amazon shares with us of course.