by whoisjuan 1108 days ago
Why is it always us-east-1 though?

I have always stayed away from that region because it seems significantly less reliable than other regions.

7 comments

It's the:

* Largest (DDoS'd most, most complex, scaling issues etc)

* Oldest (More time for weird idiosyncrasies to take hold)

* Where most testing happens

* Where new products are deployed first

1) and 2) certainly apply. 3) and 4) don't. Testing in the largest region is one of the biggest anti-patterns.
4 is still generally true. Most new features drop in us-east-1 on launch day.
us-east-1 is usually deployed to after several smaller regions. It typically falls in the middle of the week, depending on the pipeline.

Just because a feature is there on launch day doesn't mean it was deployed to first. Features are often hidden behind flags that are switched for launch.
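
A minimal sketch of that distinction between deploying and launching, in Python. Every name here (`DarkLaunch`, `deploy`, `launch`, the day numbers) is hypothetical and for illustration only; it does not describe any real AWS tooling.

```python
class DarkLaunch:
    """Tracks where code is deployed separately from where it is visible."""

    def __init__(self):
        self.deployed_on = {}      # region -> day the code landed (still hidden)
        self.flag_enabled = set()  # regions where the flag has been switched on

    def deploy(self, region, day):
        # Shipping the code does not expose the feature to customers.
        self.deployed_on[region] = day

    def launch(self, regions):
        # Only regions that already have the code can be switched on.
        self.flag_enabled.update(r for r in regions if r in self.deployed_on)

    def is_visible(self, region):
        return region in self.flag_enabled


rollout = DarkLaunch()
rollout.deploy("ap-southeast-2", day=1)  # a smaller region goes first
rollout.deploy("us-east-1", day=4)       # the largest region comes later
visible_before = rollout.is_visible("us-east-1")

# Launch day: the flag flips everywhere at once.
rollout.launch(["ap-southeast-2", "us-east-1"])
visible_after = rollout.is_visible("us-east-1")
```

In this sketch the feature appears in both regions on the same launch day even though us-east-1 received the code three days after the smaller region.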

I'm well aware of that, but the point is that when the feature is ungated to the public, it's in us-east-1 and gets all that load. It also gets more load than the rest, because a lot of big customers are based in us-east-1, including much of Amazon itself.
AWS doesn't test there, last I checked; they roll out to smaller regions first.
Most AWS engineering is closest to (and tested in) us-west-2 (PDX) or us-east-2 (Ohio)
It's also the home of single region services...

IAM, CloudFront ACM certs, etc.

Those are not single-region services. Changes must be executed there, but the data is replicated globally. If you don’t need to make changes in the context of those services, they will keep working in the other regions even during an incident in the primary region.
It's also

* The only place where the IAM dashboard can be accessed from. I need to access it NOW. I can't.

Looking forward to Auckland coming online, which should be the opposite to most of these factors, and will make game streaming bearable (for me)
us-east-1 is the largest region, so it is where changes meet scale.

It is also a massively complex beast in itself, spanning dozens of datacenters with massive amounts of fiber between them. That is much more fragile than having everything in a single building, and as you scale up the number of components, you increase the rate of failure.

No AWS region is in a single building; they aren't amateurs like Azure. Each region has at least 3 AZs, each of which is at least one physical DC.
And yet it's AWS that's down.
Touché. Still, I'd rate the overall reliability of AWS higher than Azure; and even if that weren't the case, security issues make Azure look like a very poor choice.
I actually just wrote about this very thing. It's not just that it SEEMS less reliable, it absolutely is:

https://statusgator.com/blog/is-north-virginia-aws-region-th...

I don't think this article has any value. Are you only counting region-wide outages? US East is probably 10x the size of any other region, with more AZs than any other region.
I suspect it's where they concentrate a lot of their control plane.
us-east-1 is AWS's oldest region, and has the most legacy infrastructure, in ways that many other regions do not.
I thought I read that this is where they deploy new changes first. Can anyone confirm?
No, definitely not. Usually pipelines deploy over 1-2 week periods, and they don't deploy on Fridays, holidays, or high-traffic periods like December.

Deployments start off very conservative, maybe 1-2 small regions on the first day of deployments. As you gain confidence, the pipeline deploys to more regions/bigger regions.

A pipeline that deploys to 22 regions over one week might go from 2 small regions on Monday, to 4 small/medium regions on Tuesday, 8 medium/large regions on Wednesday, and 8 regions on Thursday.

us-east-1 is usually going to be deployed to on the Wednesday or Thursday in this example, but that isn't always the case, because sometimes deployments are accelerated for feature launches (especially around re:Invent) or retried because of a failure.
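
The 22-region example can be sketched in a few lines of Python. This is only an illustration under assumptions: `plan_waves` is a made-up helper, the region names are placeholders, and regions are assumed to be sorted strictly from smallest to largest, which real pipelines may not do.

```python
def plan_waves(regions_by_size, wave_sizes):
    """Split regions (smallest first) into consecutive daily waves."""
    if sum(wave_sizes) != len(regions_by_size):
        raise ValueError("wave sizes must cover every region exactly once")
    waves, start = [], 0
    for size in wave_sizes:
        waves.append(regions_by_size[start:start + size])
        start += size
    return waves


# Placeholder names, ordered smallest to largest (assumption).
regions = [f"region-{i:02d}" for i in range(1, 23)]
days = ["Monday", "Tuesday", "Wednesday", "Thursday"]

# 2 + 4 + 8 + 8 = 22 regions, matching the example in the comment.
waves = plan_waves(regions, [2, 4, 8, 8])
schedule = dict(zip(days, waves))

for day, wave in schedule.items():
    print(f"{day}: {len(wave)} regions")
```

With a strict size ordering the largest region lands in Thursday's wave; moving it into Wednesday's wave is just a matter of reordering the input list.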

There are best practice guides within Amazon that very closely detail how you should deploy, although it is up to the teams to follow them, which they usually do an okay job of.

I don't believe it's true. I was working on one of the biggest AWS services and we always deployed to small regions first.

@dijit is right: https://news.ycombinator.com/item?id=36315736

I have a suspicion that AWS uses some regions as canaries. Because we control both ends of things, I have personally noted that certain AWS functions clearly break in Australia first.
When I worked there, there were few hard and fast rules. Every team had its own release processes, so there was a lot of variance. It has been a couple of years, so this may have changed.

Typically, a team would group their regions into batches and deploy their change to one batch at a time. Usually they follow a geometric progression, so the first batch has one region, the second batch has two regions, the third batch has four regions, and so on. This batching was performed for the sake of time; nobody wants to wait a month for a single change to finish rolling out.
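
A rough sketch of that geometric batching in Python, assuming batch sizes double each time and the final batch takes whatever regions are left. The helper name `geometric_batches` and the region names are made up for illustration.

```python
def geometric_batches(regions):
    """Group regions into batches of size 1, 2, 4, ... in the order given."""
    batches, size, start = [], 1, 0
    while start < len(regions):
        # The last slice may be shorter than `size` if regions run out.
        batches.append(regions[start:start + size])
        start += size
        size *= 2
    return batches


# Placeholder region names (assumption); 22 matches the count used upthread.
regions = [f"region-{i:02d}" for i in range(1, 23)]
batches = geometric_batches(regions)
batch_sizes = [len(b) for b in batches]

print(batch_sizes)
```

For 22 regions this gives 5 batches instead of 22 one-region steps, which is the time saving the batching is meant to buy.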

One reason not to deploy to us-east-1 in the first batch is so you don't blow up your biggest region. The fewer customers you break, the better.

One reason not to deploy to us-east-1 in the last batch is that there are a lot of batches. If a problem is uncovered after deploying the last batch, then someone has to initiate rollbacks for every single region.

Some teams tried to compromise and put us-east-1 in one of the earlier batches.

When I worked at AWS, IIRC, us-east-1 was one of the last regions we deployed to. So this is very confusing to me.
From observing my wife's teams over the years, they deploy new _products_ early to that region, but deploying code changes starts in smaller regions.
Because it was one of the first regions, and it shows its age and a less-than-rigorous rollout compared to the other regions.