by jedberg 1429 days ago
Sorry all I jinxed it. Yesterday I was in a meeting and said "The only regional outages AWS has ever had were in us-east-1, so we should just move to us-east-2."

Now I guess we have to move to us-west-2. :)

Update: looks like it's only one zone anyway, so my statement still stands!

7 comments

Stay in us-east-1, they provide Chaos Monkey for free. It's a feature.
I always say stay in use1 because almost everybody is there and when it's suffering any kind of outage so much of the internet is affected that it's no big deal that you're a part of the outage. People just go outside and get some air knowing it will be back up in a few hours, usually right around the time the AWS status page acknowledges that there is an issue.
After my own heart
Chaos Kong is the one that takes out whole regions. ;)
I think it's Chaos Cthulhu the way it's going recently...
us-rlyeh-1
Chaos Trump
I'm moving to us-weast-1
what kind of compass are ya reading lad?
I presume they are trying to express an extra cardinal dimension perpendicular to the plane. Deep underground in Bezos's new evil lair, perhaps?
Better to use us-nouth-3
The AWS status board, posted elsewhere in the comments, seems to think this is an AZ outage, not a regional one.

Edit: although, one of our vendors that uses AWS has said that they think ELB registration is impacted (but I don't recall if that's regional?) and R53 is impacted (which is supposed to be global, IIRC). Dunno how much truth there is to it as we don't use AWS directly.

AWS is notorious for underreporting and failing to report. They do not have a safe culture, and it's bad for your career if there is a major outage.
Please don't move to us-west. We are probably going to have an 11-point earthquake the next day.

Thanks!

In all seriousness, we've been deploying everything on us-west-2, and it seems to have dodged most of the outages recently. Is there something special about that data center?
Classically, us-east-1 received most of the hate given its immense size (it used to be several times larger than any other) and status as the first large aws data center. It also seemed to launch new aws features first but that may have been my imagination. If true, I'm sure always running the latest builds was not great for stability.

us-west-2 has had outages as well but it is less common, even rare. I've been pushing companies to make their initial deployments onto us-west-2 for over ten years now. I occasionally get kudos messages in my inbox :)

I believe us-east-1 runs some of the control plane, and a us-east-1 outage can effectively take a service in a different region offline, as it can break IAM authentication.
93.99999
There are six 9s in there. Pretty solid!
Maybe Amazon should make the actual datacenter behind us-east-1 depend on the customer, as they do with the AZs :P
Doesn't AWS IoT run only in us-east-1?

And I think Alexa skills, if anybody cares about those.

It's never been a default datacenter. For a long time the default when you first logged into the console was us-east-1 so a lot of companies set up there (that's where all of reddit was run for a long time and Netflix too). At some point they switched the default to us-east-2.

So anyone who is in us-west-2 is there intentionally, which makes me assume there is a smaller footprint there (but I have no idea).

Rather the opposite - us-west-2 is big but not the biggest region, or the smallest, or the oldest or newest, and it's not partitioned off like the China or GovCloud regions. Because us-west-2 is fairly typical it tends to be one of the last regions to get software updates, after they've been tested in prod elsewhere.
Looks like this particular issue was due to power loss, and for power us-west-2 has one clear advantage: its power comes directly from the Columbia River and is highly unlikely to have demand-based outages.
Maybe not the entire region. Amazon was reportedly building a data center complex next to the natural gas Hermiston Generating Plant some distance from the river.
If us-west-2 goes down in the next few days we’ll expect an explanation.
Naive question: don't people who care about resiliency have their services in more than one datacenter? Or is datacenter failure considered such a rare event that it's not worth the cost/trouble of using more?
AWS makes it pretty easy to operate in multiple AZs within a region (each AZ is considered a separate datacenter but in real life each AZ is multiple datacenters that are really close to each other).

That being said, there is still an added cost and complexity to operate in multiple AZs, because you have to synchronize data across the AZs. Also you have to have enough reserved instances to move into when you lose an AZ, because if you're running lean and each zone is serving 33% of your traffic, suddenly the two that are left need to serve 50% each.

The bigger companies with overhead reservations will get all the instances before you can launch any on demand during an AZ failure.
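
To put rough numbers on the headroom point above, here is a minimal sketch. It assumes traffic is split evenly across zones and that load scales linearly with traffic; the function names are made up for illustration.

```python
def surviving_zone_share(zones: int, failed: int = 1) -> float:
    """Fraction of total traffic each surviving zone serves after a failure.

    Assumes an even split across zones (an illustrative simplification).
    """
    if not 0 <= failed < zones:
        raise ValueError("failed must be >= 0 and smaller than zones")
    return 1 / (zones - failed)


def extra_capacity_needed(zones: int, failed: int = 1) -> float:
    """Extra capacity each surviving zone needs, relative to its normal load."""
    if not 0 <= failed < zones:
        raise ValueError("failed must be >= 0 and smaller than zones")
    return zones / (zones - failed) - 1


# Three zones, one lost: each survivor goes from about 33% to 50% of traffic,
# which is 50% more load than it normally carries.
print(surviving_zone_share(3))
print(extra_capacity_needed(3))
```

So running lean across three AZs means each zone needs roughly half again its normal capacity available, either reserved or obtainable on demand, to absorb the loss of one zone.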

> each AZ is considered a separate datacenter but in real life each AZ is multiple datacenters that are really close to each other

For AWS specifically, I’m fairly certain they maintain a minimum distance and are much more strict on requirements to be on different grids etc than other Cloud providers. A few years ago they were calling out Azure and Google Cloud on exactly what you describe (having data centers essentially on the same street almost).

A single AZ may have neighboring datacenters, but they are very strict on having datacenters for different AZs be at least 100km apart and on different flood plains and power grids.
This should be at most 100km. The typical range is 60km-100km.
100km? Oh really?
https://docs.aws.amazon.com/sap/latest/general/arch-guide-ar...

Each Availability Zone can be multiple data centers. At full scale, it can contain hundreds of thousands of servers. They are fully isolated partitions of the AWS global infrastructure. With its own powerful infrastructure, an Availability Zone is physically separated from any other zones. There is a distance of several kilometers, although all are within 100 km (60 miles) of each other.

I think you may have slightly misread. I think what’s being said is that a single logical AZ may actually be multiple physical datacenters in close proximity.
At least in eu-north-1 the three AZs are located in different towns, about 50 km apart (Västerås, Eskilstuna and Katrineholm).
Some people care about it but not enough to justify the added downsides - multi-data center is expensive (you pay per data center) and it’s complex (data sharding/duplication/sync).

If you’re Amazon, where every second is millions of $ in transactions, you care more than a startup that has 1 request per minute. Even if you accept the risk, you still care when your DC goes down.

Also, a large chunk of AWS is managed from a single data center so if that one goes down you may still have issues with your service in another data center.

I'd consider using it, but the biggest roadblock for me is that I work in a regulated industry in Australia, and until AWS finishes their Melbourne region (next year maybe?) I'm stuck in one region because all private data needs to stay in Australia.

Also, I think a lot, but not all, of the services I use work okay with multiple regions.

On top of that, I was looking at the documentation for KMS keys yesterday, and a KMS key can be multiregion, but if you don't create it as multiregion from the start, you can't update the multiregion attribute. So you need to create a new KMS key and update everything to use the new multiregion key.
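
For anyone starting fresh, a minimal sketch of creating the key as multi-Region up front. It assumes `kms_client` is a boto3 KMS client (for example `boto3.client("kms")`) and relies on the `MultiRegion` parameter of `create_key`; the helper name and the description string are hypothetical.

```python
def create_multi_region_key(kms_client, description: str) -> str:
    """Create a multi-Region primary KMS key and return its key ID.

    kms_client is assumed to be a boto3 KMS client. MultiRegion has to be
    set at creation time, since it cannot be changed on an existing key.
    """
    response = kms_client.create_key(
        Description=description,
        MultiRegion=True,
    )
    return response["KeyMetadata"]["KeyId"]


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# key_id = create_multi_region_key(boto3.client("kms"), "example key")
```

Existing single-Region keys still need the create-a-new-key-and-migrate route described above.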

AWS works with multiple availability zones (AZs) per region; some products deploy across several of them by default, while others leave it up to you.
AWS makes it trivially easy to distribute across more than one datacenter... The only time that outages make the news is when they all fail in a region.
Jinxed for sure. I refuse to deploy resources into us-east-1 unless required by the service.