Hacker News
by mdriley 1017 days ago
The report says the cooling issue caused "a loss of service availability for a subset of [one] Availability Zone".

How did a single-AZ failure cause outages for two dozen services?

Why did a single-AZ failure mean "approximately half of Cosmos DB clusters in the Australia East region were either down or heavily degraded" and require those clusters to do a cross-region failover?