by dekhn 1655 days ago

Sure, but... that just raises more questions :)

Taken literally, what you are saying is that the service could be down and an executive could override that, preventing them from paying customers for a service outage, even if the service did have an outage and the customer could prove it (screenshots, metrics from other cloud providers, many different folks seeing it).

I'm sure there is some subtlety to this, but it does mean that large corps with influence should be talking to AWS to ensure that status information corresponds with actual service outages.

3 comments

meetups323 1655 days ago

Large corps with influence get what they want regardless. Status page goes red and the small corps start thinking they can get what they want too.

scrose 1655 days ago

> Status page goes red and the small corps start thinking they can get what they want too.

I think you mean "start thinking they can get what they pay for"

emodendroket 1655 days ago

I have no inside knowledge or anything, but it seems like there are a lot of scenarios with degraded performance where people could argue about whether it really constitutes an outage.

dilyevsky 1655 days ago

One time GCP argued that since they did return 404s on GCS for a few hours, that wasn't an uptime/latency SLA violation, so we were not entitled to a refund (though they refunded us anyway).

Enginerrrd 1655 days ago

Man, between costs and shenanigans like this, why don't more companies self-host?

dilyevsky 1655 days ago

1. Leadership prefers to blame cloud when things break rather than take responsibility.

2. Cost is not an issue (until it is, but you're already locked in, so oh well).

3. FAANG has drained the talent pool of people who know how.

emodendroket 1654 days ago

If you think that's bad, you should see the outages when you self-host without a big enough team to really manage it.

pm90 1655 days ago

Opex > Capex. If companies thought about the long term, yes, they might consider it. But unless the cloud providers fuck up really badly, they're OK to take the heat occasionally and tolerate a bit of nonsense.

dilyevsky 1655 days ago

You can lease equipment you know…

dekhn 1655 days ago

Yep. I was an SRE who worked at Google and also launched a product on Google Cloud. We had these arguments all the time, and the contract language often provides a way for the provider to weasel out.

jedberg 1655 days ago

Like I said, I never worked there and this is all hearsay, but there is a lot of nuance being missed here, like partial outages.

dekhn 1655 days ago

This is no longer a partial outage. The status page reports elevated API error rates, DynamoDB issues, and EC2 API error rates; my company's monitoring is significantly affected (i.e., our IT folks can't tell us what isn't working); and my AWS training class isn't working either.

If this needed a CEO to eventually get around to pressing a button that said "show users the actual information about a problem", that reflects poorly on Amazon.

dhsigweb 1655 days ago

My friend works at a telemetry company that does monitoring, and they are working on alerting customers to cloud service outages before the cloud providers do, since the providers like to sit on their hands for a while (presumably to try to fix it before anyone notices).
