by rose-knuckle17 243 days ago
AWS had an outage. Many companies were impacted. Headlines around the world blame AWS. The real news is how easy it is to identify companies that have put cost management ahead of service resiliency.

Lots of orgs operating wholly in AWS, and sometimes only within us-east-1, had no operational problems last night. Some of that is by design (not using the impacted services). Some of that is good resiliency in design. And some of that was dumb luck (accidentally good design).

Overall, those companies that had operational problems likely wouldn't have invested in resiliency expenses in any other deployment strategy either. It could have happened to them in Azure, GCP or even a home-rolled datacenter.

3 comments

Redundancy is insanely expensive especially for SaaS companies where the biggest cost is cloud.

Are customers willing to pay companies for that redundancy? I think not. A 3-hour outage once every few years is fine for non-critical services.

In general it is not expensive. In most cases you can either load balance across two regions all the time or have a fallback region that you scale out/up and switch to if needed.
Quite expensive to build though. Many of these companies don't have the sharpest engineers building multi-cloud.

IMO, going multi AZ or multi-cloud adds a good amount of complexity.

TBH I don't care if last.fm doesn't work for 8 hours a year, that isn't a big deal. My bank? Yeah that should work.

Multi tenancy is expensive. You’d need to have every single service you depend on, including 3rd party services, on multi tenancy. In many cases, such as the main DB, you need dedicated resources. You’re also most likely going to need expensive enterprise SLAs.

Servers are easy. I’m sure most companies already have servers that can be spun up. Things related to data are not.

You don't need expensive SLAs to do data replication or load balancing in the cloud. It is pretty basic.
I was talking about 3rd party services.

And no, data replication or load balancing is not easy, nor cheap.

You wrote "You’d need to have every single service you depend on, including 3rd party services, on multi tenancy." This is highly incorrect. I worked at several companies that have a multi tenancy strategy. It is:

* Automated.
* Scoped to business critical services. Typically not including many of the 3rd party services.
* Uses data replication, which is a feature in any modern cloud.
* Load balancing, by DNS basically for free or a real LB somewhere on the edge.

If you fail at this you probably fail at disaster recovery too or any good practice on how to run things in the cloud. Most likely because of very poor architecture.
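
To make the DNS-style failover point above concrete, here is a minimal sketch in Python. It is not any cloud provider's API: the `Region` type, the `pick_endpoint` function, the region names and the hostnames are all hypothetical, and the `healthy` flag stands in for whatever health check you actually run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    name: str       # hypothetical region label
    endpoint: str   # hypothetical hostname served from that region
    healthy: bool   # stand-in for the result of the latest health check


def pick_endpoint(regions: List[Region]) -> str:
    """Return the endpoint of the first healthy region, in priority order."""
    for region in regions:
        if region.healthy:
            return region.endpoint
    raise RuntimeError("no healthy region available")


# Primary is down, so traffic should be pointed at the fallback region.
regions = [
    Region("primary", "api.primary.example.com", healthy=False),
    Region("fallback", "api.fallback.example.com", healthy=True),
]
print(pick_endpoint(regions))
```

This only covers the routing decision. Keeping the fallback's data current (the replication bullet) is the part the other commenters are arguing is hard, and it is not modeled here.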

>> Redundancy is insanely expensive especially for SaaS companies

That right there means the business model is fucked to begin with. If you can't have a resilient service, then you should not be offering that service. Period. Solution: we were fine before the cloud, just a little slower. No problem going back to that for some things. Not everything has to be just in time at lowest possible cost.

Three nines might be good enough when you're Fortnite. Probably not when you're Robinhood.
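
For anyone who wants the arithmetic behind "three nines", a quick sketch, assuming a 365-day year and counting all downtime against the budget (the function name is just illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service can be down and still meet the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("two nines", 99.0), ("three nines", 99.9), ("four nines", 99.99)]:
    print(f"{label} ({pct}%): {allowed_downtime_hours(pct):.2f} hours/year")
```

Three nines works out to roughly 8.76 hours of downtime a year, which is about the "8 hours a year" figure mentioned upthread.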
The part that makes no sense is - it's not cost management. AWS costs ten to a hundred times MORE than any other option - they just don't put it in the headline number.