by deafcalculus 3211 days ago
Won't DNS failover be painfully slow? Some clients ignore small TTL values, and I've seen DNS updates take several hours to propagate.

I thought one of the advantages of multiple zones is that zonal failover can happen with "zero" downtime (this seems to be the case with Amazon RDS).

2 comments

The default answer includes multiple A records, so if clients can't reach one of the IPs, they try another. Nothing needs to propagate for that to kick in; it's just ordinary client retry behavior.
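To make the retry behavior concrete, here is a rough sketch of what a client does with several A records: resolve once, then walk the list until a connection succeeds. The helper names `resolve_all` and `connect_any` are made up for illustration, and real clients (browsers, HTTP libraries) differ in ordering and timeouts, so treat this as an approximation rather than how any particular client works.

```python
import socket


def resolve_all(host, port):
    """Return every distinct IP address the resolver gives for host, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr[0] not in seen:
            seen.append(sockaddr[0])
    return seen


def connect_any(addresses, port, timeout=2.0, connect=socket.create_connection):
    """Try each address in turn; return (address, connection) for the first success.

    `connect` is injectable (hypothetical parameter) so the loop can be
    exercised without touching the network.
    """
    last_error = None
    for addr in addresses:
        try:
            return addr, connect((addr, port), timeout=timeout)
        except OSError as exc:
            # Unreachable or refused: remember the error and move on to the next IP.
            last_error = exc
    raise last_error or OSError("no addresses to try")
```

Usage would look something like `connect_any(resolve_all("example.com", 443), 443)`. No DNS change is involved: the failed IP is simply skipped on the client side.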

We do also withdraw an IP from DNS if it fails; when we measure it, we see that over 99% of clients and resolvers do honor TTLs and the change is effected very quickly. We've been using this same process for www.amazon.com for a long time.

Contrast that with an alternative like BGP anycast, where an update can take minutes to propagate as BGP peers pass it along to each other in sequence.

RDS failover still uses DNS and you still need to be aware of client TTLs:

"Because the underlying IP address of a DB instance can change after a failover, caching the DNS data for an extended time can lead to connection failures if your application tries to connect to an IP address that no longer is in service."

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_B...
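On the application side, the caching problem in that quote is usually handled by resolving the hostname again on every reconnect instead of holding on to an IP. A minimal sketch, assuming a plain TCP connection; the function name `connect_with_fresh_dns` and its parameters are hypothetical, and the OS or a local caching resolver may still cache answers underneath this:

```python
import socket
import time


def connect_with_fresh_dns(host, port, attempts=5, delay=1.0,
                           resolve=socket.getaddrinfo,
                           connect=socket.create_connection,
                           sleep=time.sleep):
    """Re-resolve host before every attempt so a post-failover IP gets picked up.

    `resolve`, `connect`, and `sleep` are injectable (hypothetical parameters)
    so the retry loop can be tested without a network.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # Look the name up again each time; never reuse an earlier IP.
            infos = resolve(host, port, type=socket.SOCK_STREAM)
            ip = infos[0][4][0]
            return connect((ip, port), timeout=2.0)
        except OSError as exc:
            last_error = exc
            if attempt + 1 < attempts:
                sleep(delay)
    raise last_error or OSError("no attempts made")
```

How quickly this recovers still depends on the record's TTL and on whatever caches sit between the application and the authoritative servers.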