
I am quoting the author and their analysis of the initial query. That observation matches my own experimental results: initial queries are very slow on Route 53 LBR. A distribution of query times is useless and misleading, because later queries are cached if you have a sufficient TTL, so only the first few really matter in the performance results.

Later queries are fast, of course, because the results stay cached for the duration of the TTL.
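To make the "first query vs. cached queries" point concrete, here is a minimal sketch of how one might time them separately instead of pooling everything into one distribution. It assumes lookups go through the OS resolver via `socket.getaddrinfo`, so whether later lookups are actually served from a cache depends on the local and upstream resolvers. The function names and the hostname in the usage note are made up for illustration.

```python
import socket
import statistics
import time


def time_lookups(hostname, count=5, resolve=socket.getaddrinfo,
                 clock=time.perf_counter):
    """Resolve `hostname` `count` times; return latencies in ms, in query order."""
    timings = []
    for _ in range(count):
        start = clock()
        resolve(hostname, None)  # port is irrelevant for a pure name lookup
        timings.append((clock() - start) * 1000.0)
    return timings


def summarize(timings):
    """Report the first lookup on its own, apart from the (likely cached) rest."""
    first, rest = timings[0], timings[1:]
    return {
        "first_ms": first,
        "later_median_ms": statistics.median(rest) if rest else None,
    }
```

Something like `summarize(time_lookups("www.example.com"))` would then show the first lookup next to the median of the following ones. For a fair "cold" number the name must not already be in any cache along the path, which this sketch does not enforce.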

Even if the DNS is very poorly configured, all queries after the first one will benefit from the cache! So the first few queries matter much more, and this is what we should be talking about instead of distributions and percentiles.

Said differently: if each of your visitors has to wait a second or two until the site comes up the first time, it may still give them a bad impression, even if the site works normally after that.

I personally measured the DNS delay on the first Route53 reply to be over 700 ms. For the author it is 2000 ms. These results are in the same order of magnitude, and make Route53 unsuitable for many applications. Of course, you could start hacking, like keeping the Amazon cache warm by issuing queries through cron, or by setting extremely long TTLs and hoping your visitors' DSL modems will keep your A records in cache as long as you asked for - but these are just hacks trying to compensate for the fact that the first DNS query takes SECONDS to process.
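For reference, the cron-based warming hack could look roughly like the sketch below. It assumes the slow path is only hit by the first query after an idle period, and it only warms whatever resolver the machine running it uses, not your visitors' resolvers, which is part of why it is a hack. `warm_cache` is a made-up name.

```python
import socket


def warm_cache(hostnames, resolve=socket.getaddrinfo):
    """Resolve each hostname once; return the names that failed to resolve."""
    failed = []
    for name in hostnames:
        try:
            resolve(name, None)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

A cron entry would call `warm_cache([...])` with your own record names at an interval shorter than the TTL, and log anything returned in the failed list.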

Route53 LBR DNS is not sold as "slow and requiring hacks". It's supposed to be fast, simple to run, and to integrate with different ecosystems. To me, it seems to be none of that.

After assessing Route53 as fubar, I switched from AWS to Azure: TrafficManager offers the same features, and the first request takes less than 350 ms. There must still be some cruft in there, but at least it is manageable.