Hacker News
by nevir 236 days ago
It's really not that nefarious.

IAD datacenters have forever been the place where Amazon software developers implement services first (well before AWS was a thing).

Multi-AZ support often comes second (more than you think; Amazon is a pragmatic company), and not every service is easy to make TRULY multi-AZ.

And then other services depend on those services, and may also fall into the same trap.

...and so much of the tech/architectural debt gets concentrated into a single region.

1 comment

Right, like I said: crazy. With certain other clouds, anything in production must be multi-AZ. That's reinforced by both culture and technical constraints, and sometimes by BCDR/contract audits [zones chosen by a third party at random].
It sure is a blast when they decide to cut off (or simulate the loss of) a whole DC just to see what breaks, I bet :)
The disconnect case was simple: breakage was as expected. The island was lost until we drew it on the map again. Things got really interesting when it was a full power-down and back on.

Were the docs/tooling up to date? Tough bet. Much easier to fix BGP or whatever.