by happymellon
1417 days ago
> you are miles ahead of just about every other web company.

I'm curious who these web companies are. Use something like Lambda and you get multi-AZ for free. https://docs.aws.amazon.com/lambda/latest/dg/security-resili...

Dynamo is another service that wouldn't be impacted, as it is multi-AZ. Getting Postgres RDS multi-region would require an extra couple of lines in your CDK, but is fairly straightforward.

At least one cluster had a node on “affected” hardware (per AWS). Aurora failed to fail over properly and the cluster ended up in a weird error state, requiring intervention from AWS. Could not write to the DB at all. This took several hours to resolve.
All that to say that it’s never straightforward. In today’s event, it was pure luck of the draw as to whether a multi-AZ Aurora cluster was going to have >60 seconds of pain.
That SaaS has been running Aurora for years and has never experienced anything similar. I was very surprised when I heard the cluster was in a non-customer-fixable state and required manual intervention. I’ve shilled Aurora hard. Now I’m unsure.
Thank goodness they had an enterprise support deal, or who knows if they’d still be having issues now.