Hacker News new | ask | show | jobs
by valenterry 355 days ago
How does AWS do that though? Do the re-implement all the code in every region? Because even the slightest re-use of code could trigger a synchronous (possibly delayed) downtime of all regions.
3 comments

Reusing code doesn't trigger region dependencies.

> Do the re-implement all the code in every region?

Everyone does.

The difference is AWS very strongly ensures that regions are independent failure domains. The GCP architecture is global with all the pros and cons that implies. e.g GCP has a truly global load balancer while AWS can not since everything is at core regional.

They definitely roll out code (at least for some services) one region at a time. That doesn't prevent old bugs/issues from coming up but it definitely helps prevent new ones from becoming global outages.
Right, that makes sense. But if it's an evil bug that triggers e.g. over a year-change only, then that might not help.

So I suppose theoretically also AWS can go down all together, even if less likely.

Region (and even availability zones) in AWS are independent. The regions all have overlapping IPv4 addresses, so direct cross-region connectivity is impossible.

So it's actually really hard to accidentally make cross-region calls, if you're working inside the AWS infrastructure. The call has to happen over the public Internet, and you need a special approval for that.

Deployments also happen gradually, typically only a few regions at a time. There's an internal tool that allows things to be gradually rolled out and automatically rolled back if monitoring detects that something is off.