Wasn't aware that there was noticeably higher latency between availability zones in the same AWS region. Kinda thought the whole point was to run replicas of your application in multiple to achieve higher availability.
They also charge you like 1c/GB for traffic egress between the zones. To top it off there are issues with AWS loadbalancers in multi-zone setups. Ultimately i've come to the conclusion that large multi-zonal clusters is a mistake. Do several single-zone disposable clusters if you want zone redundancy.
At $WORK traffic between zones ($REGION-DataTransfer-Regional-Bytes) is our second largest cost on our AWS bill, more than our EC2/EKS cost. It adds up to mid six figures each year. We try to minimize this where it is easy to do so. For example, our EKS pods perform reads against RDS read replicas in the same AZ only, but you're out of luck for writes to the primary instance. To reduce this in any significant way can eat up a lot of time, and for us, the cost is enough to be painful but not enough to dedicate an engineer to fixing.
This is precisely how Amazon's bread is buttered. An outage affecting an entire AZ is rare enough that I would feel pretty happy making all our clusters single-AZ, but it would be a fool's errand for me to convince management to go against Amazon's official recommendations.
I would LOVE to pitch something else I'm working on that is solving this problem in EKS, cross zone data transfer.
It's a plugin that enables traffic re-direction for any service that is using an IP in any given VPC. If you have say multiple RDS Reader instances, it will first attempt to use local AZ instances first, but the other instances are available if local instances are non-functional. So you do not loose HA or failover features.
The plugin does not require any reconfiguration on your apps. It works similar to Topology Aware Routing (https://kubernetes.io/docs/concepts/services-networking/topo...) in Kubernetes, but it works for services outside of Kubernetes. The plugin even works for non-Kubernetes setup as well.
This AZP solution is fine for services that is have one IP or primary instance, like RDS Writer instance. It does not work for anything that is "stateless" and multi-AZ, like RDS Read-only instances or ALBs.
That was the origin for this solution. A client app had to issue millions of small SQL queries where the first query had to complete before the second query could be made. Millions of MS adds up.
Lowest possible latency would of course be running the client code on the same physical box as the SQL server, but thats hard to do.
In my experience it has been relatively high variance – it does get as low as 0.5, but can be 3-4. That's an order of magnitude difference, and can be the difference between a great and a terrible UX when you amplify it across many RPCs.
In general the goal should be to deploy as much of the stack in one zone as possible, and have multiple zones for redundancy.
> In general the goal should be to deploy as much of the stack in one zone as possible
Agree. The can be a few downsides one has to consider if you have to fail over to another zone. Worst case, there isn't sufficient capacity available when you fail over if everyone else is asking for capacity at the same time. If one uses e.g. karpenter, you should be able to be very diverse in the instance selection process, so that you get at least some capacity, but maybe not the preferred.
I was surprised to. Of course it makes sense when you look at it hard enough, two seperate DCs won't have the same latency than internal DC communication. It might have the same physical wire-speed, but physical distance matter.