Hacker News
by speedgoose 1658 days ago
> Third, you’re gonna go down when the cloud goes down.

Not necessarily. You just need to not be stuck with a single cloud provider. The likelihood of more than one availability zone going down on a single cloud provider is not that low in practice. Especially when the problem is a software bug.

The likelihood of AWS, Azure, and OVH going down at the same time is low. So if you need to stay online when AWS fails, don't put all your eggs in the AWS basket.

That means not using proprietary cloud solutions from a single cloud provider. It has a cost, so it's not always worth it.

3 comments

> using proprietary cloud solutions from a single cloud provider, it has a cost so it's not always worth it.

But perhaps some software design choices could be made to alleviate these costs. For example, you could have a read-only replica on Azure or whatever backup cloud provider, and design your software interfaces to allow the use of such read-only replicas. At least you'd be degraded rather than unavailable. Ditto with web servers etc.

This has a cost, but it's lower than entirely replicating all of the proprietary features in a different cloud.
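A minimal sketch of that degraded mode, assuming a key-value style interface. The names `DictStore`, `DegradableStore`, and `StoreUnavailable` are hypothetical stand-ins, not any provider's API: reads fall back to the read-only replica on the backup provider, while writes simply fail until the primary is back.

```python
class StoreUnavailable(Exception):
    """Raised when a backing store cannot serve a request."""


class DictStore:
    """In-memory stand-in for a database endpoint (hypothetical)."""

    def __init__(self, data=None, healthy=True):
        self.data = dict(data or {})
        self.healthy = healthy

    def get(self, key):
        if not self.healthy:
            raise StoreUnavailable("store is down")
        return self.data.get(key)

    def put(self, key, value):
        if not self.healthy:
            raise StoreUnavailable("store is down")
        self.data[key] = value


class DegradableStore:
    """Reads fall back to a read-only replica; writes need the primary."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def get(self, key):
        try:
            return self.primary.get(key), "primary"
        except StoreUnavailable:
            # Degraded mode: possibly stale data from the backup provider.
            return self.replica.get(key), "replica"

    def put(self, key, value):
        # No fallback here: the replica is read-only, so writes fail
        # while the primary is down.
        self.primary.put(key, value)
```

Returning the source alongside the value lets callers show a "read-only, data may be stale" notice instead of an error page.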

Complex systems are expensive to operate, in many ways.

The more complexity you build into your own systems on top of the providers you depend on, the more likely you are to shoot yourself in the foot when you run into complexity issues that you’ve never seen before.

And the time that is most likely to happen is when one of your complex service providers goes down.

If the kind of thing you’re talking about could feasibly be done, then Netflix would have already done it. The fact that Netflix hasn’t solved this problem is a strong indicator that piling more proprietary complexity on top of all the vendor complexity you inherit from a given service is a really hard problem in and of itself.

True multi-cloud redundancy is hard to test - because it’s everything from DNS on up and it’s hard to ask AWS to go offline so you can verify Azure picks up the slack.
I deeply concur with this statement. I think folks here are conflating a one-off test with keeping your redundancy up to date as apps evolve.
Sure you can. Firewall AWS off from whatever machine does the health checks in the redundancy implementation.
What happens when your health check system fails?
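One possible way to think about both points, sketched under assumptions: run several independent health checkers and fail over only when a strict majority report the primary as down. A drill then amounts to making the checkers' probes fail (for example by firewalling them off from AWS), and a single broken checker cannot trigger or block failover on its own. The function name and the provider labels are illustrative, not part of any real tool.

```python
def choose_provider(probe_results, primary="aws", backup="azure"):
    """Pick the provider to route to from independent probe results.

    probe_results: list of booleans, one per health checker, True when
    that checker could reach the primary. A checker that itself failed
    to report should be left out of the list.
    """
    if not probe_results:
        # No checker reported at all: do not fail over blindly.
        return primary
    failures = sum(1 for ok in probe_results if not ok)
    # Fail over only when a strict majority of checkers see the primary down.
    if failures * 2 > len(probe_results):
        return backup
    return primary
```

This does not remove the problem, it moves it: the checkers and the voting logic are themselves more complexity to keep working.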
It's true, but you can do load balancing at the DNS level.
And you will get 1/N of requests timing out or erroring out, while paying 2x or 3x the costs. So it might be worth it in some cases, but you need to evaluate it very, very carefully.
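The arithmetic behind that trade-off, as a rough sketch. It assumes equal-weight DNS round-robin, clients that do not retry against another record, and every provider sized to carry the full load alone. All three are simplifying assumptions, and the function names are made up for illustration.

```python
def failed_fraction(n_providers, n_down=1):
    """Share of requests sent to a dead provider under equal-weight DNS."""
    if n_providers <= 0:
        raise ValueError("need at least one provider")
    if not 0 <= n_down <= n_providers:
        raise ValueError("n_down must be between 0 and n_providers")
    return n_down / n_providers


def cost_multiple(n_providers):
    """Relative cost when each provider is provisioned for full traffic."""
    return n_providers


for n in (2, 3):
    print(f"{n} providers: {failed_fraction(n):.0%} of requests fail "
          f"while one is down, at {cost_multiple(n)}x the cost")
```

Under these assumptions, adding a third provider lowers the failing share from a half to a third but raises the bill from 2x to 3x, which is the evaluation the comment above asks for.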