Hacker News
by Nextgrid 1429 days ago
I have 2 takes on this:

1) AWS is already really expensive, even on a single AZ. Replicating to a second AZ would almost double your costs. If you stay on a single AZ, an old-school bare-metal setup on something like Hetzner/OVH/etc. becomes significantly more cost-effective, since you're not using AWS's advantages in this area anyway. And as we've seen in practice, AWS is not meaningfully more reliable: how many times have AWS's AZs gone down, compared with the bare-metal HN server, which had its only significant outage very recently? That makes sense considering the AWS control plane is orders of magnitude more complex than an old-school bare-metal server, which just needs power and a network port.

2) It is extremely hard to keep a system reliable over time, since during non-outage periods everything appears to work fine even if you've accidentally introduced a hard dependency on a single AZ. It is even harder to account for second-order effects, such as an inter-AZ link suddenly becoming saturated during the outage. I'm personally not confident at all in Amazon's (or frankly, any public cloud provider's) ability to actually guarantee seamless failover during an outage, since the only way to prove it works is to have a real outage that induces those second-order effects. AWS, or any other cloud provider, isn't going to do that: an intentional, regularly scheduled outage for testing would hurt anyone who deliberately doesn't use multiple AZs, essentially pricing them out of the market by forcing them to either commit to the cost increase of multi-AZ or move to a provider that doesn't do scheduled outages for testing purposes.
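
To put rough numbers on that kind of second-order effect, here is a back-of-the-envelope sketch. It assumes load is spread evenly across AZs and that a failed AZ's share is redistributed evenly over the survivors; both are simplifying assumptions (real traffic is messier), and the function names are made up for illustration.

```python
def surviving_utilization(num_azs: int, utilization: float) -> float:
    """Per-AZ utilization after one AZ fails.

    Assumes (hypothetically) that every AZ carried the same load and that
    the failed AZ's share is split evenly across the remaining AZs.
    """
    if num_azs < 2:
        raise ValueError("need at least two AZs to fail over")
    return utilization * num_azs / (num_azs - 1)


def max_safe_utilization(num_azs: int) -> float:
    """Highest steady-state utilization that still fits after losing one AZ."""
    if num_azs < 2:
        raise ValueError("need at least two AZs to fail over")
    return (num_azs - 1) / num_azs


if __name__ == "__main__":
    for n in (2, 3, 4):
        after = surviving_utilization(n, 0.6)
        status = "saturated" if after > 1.0 else "ok"
        print(f"{n} AZs at 60% each: survivors at {after:.0%} ({status})")
```

Under these assumptions, two AZs that each look comfortable at 60% end up at 120% the moment one disappears, and nothing in day-to-day operation would have told you so.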

1 comment

Going bare-metal is a premature optimization. Most startups that go that route don't survive long enough to make use of this optimization.

Take advantage of AWS (or Azure, or DO) until you're big enough that bringing the action in-house is a financially and technically prudent option.

It’s premature when it’s premature. It’s late when it’s not.