by LogicX 5285 days ago
TL;DR: You make a fine point about testing a new domain's resolution with dig +trace so that you don't poison your cache. Existing domains need not apply.

You may need to explain your theory somewhere other than your own website until it comes back up under this load... I can't get to your article because it's down, but I see you have your TTL set to 72 hours:

  www.simonluijk.com.	259200	IN	CNAME	simonluijk.com.
  simonluijk.com.		259200	IN	A	46.102.244.108
Whatever you wrote in your article, the 'myth' aspect comes down to all of the TTLs involved, many of which are out of your control. For example, the listing of your domain in the TLD:

  simonluijk.com.		172800	IN	NS	a.ns.zerigo.net.
  simonluijk.com.		172800	IN	NS	b.ns.zerigo.net.
  simonluijk.com.		172800	IN	NS	d.ns.zerigo.net.
  simonluijk.com.		172800	IN	NS	c.ns.zerigo.net.
  ;; Received 261 bytes from 192.12.94.30#53(e.gtld-servers.net) in 114 ms
Which I see is set to 48 hours...
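
For anyone who doesn't read TTLs in seconds fluently, here is a minimal Python sketch that converts the values in the dig output above into hours. The `Record` type and `parse_dig_line` helper are made up for illustration, and the parsing assumes dig's usual "name TTL class type value" answer layout.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int      # seconds
    rtype: str
    value: str


def parse_dig_line(line: str) -> Record:
    """Parse one answer line laid out as 'name TTL class type value'."""
    name, ttl, _klass, rtype, value = line.split(None, 4)
    return Record(name, int(ttl), rtype, value.strip())


# Two lines copied from the dig output quoted above.
lines = [
    "simonluijk.com.\t\t259200\tIN\tA\t46.102.244.108",
    "simonluijk.com.\t\t172800\tIN\tNS\ta.ns.zerigo.net.",
]

records = [parse_dig_line(line) for line in lines]
for r in records:
    print(f"{r.name} {r.rtype}: {r.ttl} s = {r.ttl / 3600:g} hours")
```

The A record works out to 72 hours and the NS delegation to 48 hours, matching the figures quoted in this comment.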

So someone could have a combination of a 48-hour cached response for your domain's DNS servers and a 72-hour cached response from your own DNS servers for your record. That's not even taking into account resolvers which ignore the TTL values and substitute their own.
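
To put a rough number on that combination, here is a back-of-the-envelope sketch in Python. It is a simplified model, not a description of any particular resolver: it assumes the resolver honors TTLs exactly, and that after a nameserver change the old nameservers keep answering with the old record. The function name is invented for illustration.

```python
NS_TTL = 172800   # delegation TTL from the .com servers (48 hours)
A_TTL = 259200    # TTL on the A record itself (72 hours)


def worst_case_stale_seconds(ns_ttl: int, a_ttl: int,
                             nameservers_changed: bool) -> int:
    """Upper bound on how long a TTL-honoring resolver may serve old data.

    Assumption: if the nameservers changed, a resolver can fetch the old
    A record from the old servers just before its cached NS set expires,
    and then keep that A record for a further a_ttl seconds.
    """
    if nameservers_changed:
        return ns_ttl + a_ttl
    # Only the record changed: the cached copy simply has to expire.
    return a_ttl


record_only = worst_case_stale_seconds(NS_TTL, A_TTL, nameservers_changed=False)
with_ns_move = worst_case_stale_seconds(NS_TTL, A_TTL, nameservers_changed=True)
print(f"record change only: {record_only / 3600:g} hours")
print(f"nameserver move:    {with_ns_move / 3600:g} hours")
```

Under those assumptions a plain record change is bounded by the 72-hour TTL, while a nameserver move could stretch to 120 hours. Resolvers that substitute their own TTLs fall outside this model.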

Update: Finally got your article to load. Yes, for a brand new domain, manually testing resolution using dig +trace first, until you confirm it works (avoiding poisoning your cache with a negative response) is a fine suggestion.

Surely the registrar warnings exist for the more likely scenarios of any changes to existing domains. Added TL;DR at the top.

Update 2: Removed alternative resolver rants, and updated to emphasize the dig +trace option - as per author's comment below, and original article.

2 comments

Sadly, this is exactly what the author missed in their article. DNS propagation is directly controlled by the TTL setting on a domain entry.

TTL stands for Time To Live: the number of seconds that a DNS entry asks DNS servers to keep it in their caches (presuming the DNS server does not override this with a higher or lower number, which is entirely its choice but not common). This is done so that requests to adomain.com do not require a DNS lookup to the main server for every page request.

It is true that if you have not done a lookup on the domain, then your computer and DNS servers would presumably not have any active DNS records for it. So you can make a change and, voilà, within 5 minutes (the next time you visit the site) you will have the updated record. However, if you had recently done a DNS query and received the old record, you will need to wait for that entry to expire before the DNS server you are using will look it up again. This doesn't go into any of the fun of what happens when you have 2 or more DNS servers set up. Ultimately, what people are seeing is that the real wait is substantially shorter than the "48 hour" period. Most ISPs stick to this default number anyway, to reduce worried support requests from clients who think otherwise but don't know anything about how DNS works, so support will never be able to explain this in layman's terms (or wait, did I just do that?)
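
The two cases described above (never looked up versus recently cached) can be shown with a toy TTL cache in Python. This is only a sketch: the `TtlCache` class, the dictionary standing in for the authoritative server, and the 192.0.2.x documentation addresses are all invented for illustration, and time is passed in explicitly rather than read from a clock.

```python
class TtlCache:
    """Toy resolver cache that keeps each answer until its TTL runs out."""

    def __init__(self):
        self._entries = {}  # name -> (value, expires_at)

    def lookup(self, name, now, authoritative):
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                  # still cached: no fresh lookup
        value, ttl = authoritative[name]     # ask the "main server"
        self._entries[name] = (value, now + ttl)
        return value


# Hypothetical zone data: name -> (address, TTL in seconds).
zone = {"adomain.com.": ("192.0.2.1", 3600)}

warm = TtlCache()
first = warm.lookup("adomain.com.", now=0, authoritative=zone)

# The record is changed right after the first lookup.
zone["adomain.com."] = ("192.0.2.2", 3600)

during = warm.lookup("adomain.com.", now=1800, authoritative=zone)  # old entry
after = warm.lookup("adomain.com.", now=3600, authoritative=zone)   # expired

cold = TtlCache()  # a resolver that never looked the name up before
fresh = cold.lookup("adomain.com.", now=5, authoritative=zone)

print(first, during, after, fresh)
```

The warm cache keeps returning the old address until its entry expires, while the cold cache sees the new address immediately.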

me: web hosting sysadmin also dealing with clients. Yes, people really do freak out about DNS problems, and we quote 72 hours because we have clients on 6 continents.

Realistically, it takes 30 minutes - 4 hours for DNS updates to stick. Use http://host-tracker.com/ to check the IP of your site -- that's what we do. It tests something like 80 locations, and the results show the IP returned.

You are absolutely correct regarding the TTLs. Although I've seen well-intentioned help articles suggesting things like setting your TTL to 10-300 seconds, most "big" recursive resolvers will ignore TTLs below 3600 seconds (1 hour), so this doesn't really help.

Props to anyone who knows what RFC covers this behaviour and cites a minimum valid TTL. I'm not aware of any, but I'm not totally up on my RFCs :)

http://www.ietf.org/rfc/rfc1034.txt:

The TTL is assigned by the administrator for the zone where the data originates. While short TTLs can be used to minimize caching, and a zero TTL prohibits caching, the realities of Internet performance suggest that these times should be on the order of days for the typical host. If a change can be anticipated, the TTL can be reduced prior to the change to minimize inconsistency during the change, and then increased back to its former value following the change.

and http://www.ietf.org/rfc/rfc1912.txt:

1-5 days are typical values.
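
The RFC 1034 advice about reducing the TTL before an anticipated change implies a deadline: the lower TTL only helps once every copy cached under the old TTL has had time to expire. A small Python sketch of that arithmetic follows; the function name and the dates are hypothetical, and it assumes resolvers honor the published TTL.

```python
from datetime import datetime, timedelta


def latest_time_to_lower_ttl(change_at: datetime,
                             old_ttl_seconds: int) -> datetime:
    """Latest moment to publish the reduced TTL before a planned change.

    Anything cached just before the TTL is lowered can live for the full
    old TTL, so the reduction must happen at least that far ahead.
    """
    return change_at - timedelta(seconds=old_ttl_seconds)


# Hypothetical planned change, with the 72-hour TTL discussed in this thread.
change_at = datetime(2024, 1, 10, 12, 0)
deadline = latest_time_to_lower_ttl(change_at, old_ttl_seconds=259200)
print(deadline)  # 2024-01-07 12:00:00
```

After the change has settled, the TTL can be raised back to its former value, as the RFC suggests.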

Exactly. If you tell someone it's going to take 24 hours and due to caching it takes 48, they're going to be pretty pissed. On the other hand, if you tell someone it's going to take 72 and it really takes 2, they are going to be quite happy.

There are so many variables involved in DNS TTLs that it really makes more sense to over-estimate things.

Please provide further detail on these '"big" recursive resolvers' that ignore TTLs. I've yet to see one in the wild, so I'm somewhat dubious of the claim.

(Please don't be vague - post the addresses of the resolvers in question.)

We just moved our DNS from Network Solutions to Route 53 this month, and I can verify that there are indeed resolvers that'll ignore TTLs. Ours were 3-6 hours, but it took some sites about 24 hours to pick up our new SOA.

Which would have been fine - the A records were the same - but no, NetSol instantly starts serving a blank "Business Profile" landing page A-record. Thanks, people who used to run the Internet.

I know one such caching server was ns1.dns.rcn.net. (But only from inside RCN; querying it from Comcast gave different results. Same IP address, so I'm assuming it's anycast.) whatsmydns.net reported others as "Bell South" and "Cox" (I can't recall the locations, I think one was in Georgia).

I just queried ns1.dns.rcn.net for an rrset that has a TTL of 120 seconds and it returned appropriate TTLs.

EDIT: It also does the right thing with even shorter TTLs - try `dig 40.2.+.rp.secret-wg.org txt @ns1.dns.rcn.net`.

Oh, it returned appropriate-looking TTLs even at the time; we didn't watch them go down to zero and wrap to their original value, but I suspect that's what they did.

Also, if you're not on RCN, you aren't getting the same NS1 as someone who is. (Again, I assume anycast or load balancing, but I'm handwaving; I haven't understood routing since gated.conf changed.)

My boss was on RCN at home, and I was a few miles away on Comcast. We both pointed dig at 207.172.3.8 and hammered on our domain name; he saw stale results, I saw fresh ones.

Would've loved to have the expertise and tools set up to figure out what went wrong, but we just went to bed and by lunch it sorted itself out.

I was not expecting so much traffic. I kicked in a few more gunicorn instances. Hope that helps.

Well, that's the point of using the +trace option. It makes dig act as the resolver, bypassing all of the caches.

My apologies - I glossed over your explanation of +trace, as I've used it for years for other purposes, without this being the primary intention. Expunging my other resolver rants from my OP.