by whs 1832 days ago
We had a similar problem where a running ETL job caused a production outage due to binlog pressure.

One thing that surprised us is that our TAM says that on a single-AZ, write-heavy workload, normal MySQL would have higher performance, as Aurora synchronously writes to storage servers in other AZs. On an immediate read-after-write workload, that would mean it takes longer to acquire a lock.

2 comments

> One thing that surprised us is that our TAM says that on a single-AZ, write-heavy workload, normal MySQL would have higher performance, as Aurora synchronously writes to storage servers in other AZs.

What is surprising about a multi-AZ database having higher latency than one that runs in only one AZ?

From what I can tell, they provisioned their DB instance(s) in a single AZ, but weren't aware that Aurora automatically provisions its own storage and always uses multiple AZs. We touch on the separation of compute and storage in the post.

I think the surprise is that it's not possible to have a truly "single AZ" Aurora database, even though you might have thought you provisioned your DB instances that way.
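
To make the latency point concrete, here is a rough sketch of why the instance placement doesn't help. It assumes the layout described in the Aurora papers (six storage copies, two per AZ across three AZs, with a write acknowledged once four of the six respond). The round-trip numbers and function names are made up for illustration and are not measurements or anything from an AWS API.

```python
# Assumed round-trip times in milliseconds. Illustrative only, not measured.
SAME_AZ_RTT_MS = 0.3
CROSS_AZ_RTT_MS = 1.0


def quorum_commit_latency(node_rtts_ms, quorum):
    """Time until `quorum` nodes have acknowledged a write sent to all nodes in parallel."""
    if not 1 <= quorum <= len(node_rtts_ms):
        raise ValueError("quorum must be between 1 and the number of nodes")
    # The write completes when the quorum-th fastest acknowledgement arrives.
    return sorted(node_rtts_ms)[quorum - 1]


# Hypothetical single-AZ MySQL: one local volume, one acknowledgement needed.
single_az_ms = quorum_commit_latency([SAME_AZ_RTT_MS], quorum=1)

# Aurora-like layout seen from the writer's AZ: 2 local copies, 4 remote copies,
# and 4 of 6 acknowledgements required.
aurora_like_rtts = [SAME_AZ_RTT_MS] * 2 + [CROSS_AZ_RTT_MS] * 4
aurora_like_ms = quorum_commit_latency(aurora_like_rtts, quorum=4)

print(f"single-AZ write: {single_az_ms} ms, 4-of-6 quorum write: {aurora_like_ms} ms")
```

Since only two copies live in the writer's AZ, at least two of the four acknowledgements have to come from another AZ, so under these assumptions every commit pays at least one cross-AZ round trip no matter where the DB instance runs.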

I see. I haven’t used Aurora, but have had experience running write-heavy workloads on RDS. EBS failures would regularly (like monthly) cause our write latency to spike 3-5x. If Aurora’s storage layer architecture is more resilient to those types of problems, that seems like a huge win.

Hopefully this should not be a surprise if you are using Aurora. The papers on the topic are very clear about how they scale the storage.

This seems plausible given our understanding of the database internals. In general, we found our AWS contacts to be knowledgeable and forthcoming about complex tradeoffs between Aurora and vanilla MySQL, even if some of that information is hard or impossible to find in the docs.