by rkoz 4659 days ago
Care to explain how DNS alone can do distributed config management?
3 comments

Spotify stores some configuration in DNS to good effect:

http://labs.spotify.com/tag/dns/

It's obviously no Zookeeper but it is proven and mature.

That's an interesting use of DNS indeed.

They also mention that the DNS-for-service-discovery approach is starting to reach its limits, and that Spotify is considering Zookeeper (quote):

   We have not yet (as of January 2013) started implementing a replacement. We are 
   looking into using Zookeeper as an authoritative source for a static and dynamic 
   service registry, likely with a DNS facade.
I find their reasons curious. Why in the world are they using zone files? There are tons of DNS servers that support database backends, and writing tools to interact with them is easy.

For that matter, writing an authoritative-only DNS backend is easy (been there, done that: it took about one week from starting to read the RFCs until having a production-ready backend; it takes little time because most/all of the hard work is in the recursive resolvers, and the DNS protocol is actually very well described in the RFCs).

And claiming DNS provides a static view of the world is a bit funny - DNS provides TTL values for everything. If you want a dynamic view, you specify low TTLs, and make sure your clients honour them, and couple that with fast replication of the zone data. There's plenty of options for that, from the duct tape (my DNS server could update however many records you could write to disk on your hardware per second - via a small script that used Qmail as a queueing messaging server...) to well polished products.
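To make the "clients honour them" part concrete, here is a minimal sketch of a client-side cache that forgets an answer once its TTL has elapsed. The class name, the hostname in the usage note and the injectable clock are my own assumptions for illustration, not anything from Spotify's setup or a particular resolver library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Caches DNS-style answers and drops them once their TTL has elapsed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, value: str, ttl_seconds: float) -> None:
        # Store the answer together with its absolute expiry time.
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # TTL elapsed: forget the answer so the caller has to re-query.
            del self._entries[name]
            return None
        return value
```

With a 5 second TTL, `cache.put("db.internal.example", "10.0.0.5", 5)` is served from the cache for 5 seconds, after which `cache.get` returns `None` and the application has to ask DNS again. A client that keeps the answer longer than that is what makes DNS look static.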

Couple that with NOTIFY and IXFR, and the protocol provides every mechanism necessary for keeping zone data replicated and up to date. Many modern DNS servers also let you instead simply rely on database replication (e.g. you can have the DNS server serve data out of Postgres, and use Postgres replication to keep the zones up to date over multiple servers), or leave it to you to do updates.
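For anyone who hasn't looked at how a secondary decides whether to pull changes after a NOTIFY: it compares the primary's SOA serial with its own, using the wrap-around serial arithmetic from RFC 1982. A rough sketch of that comparison follows; the function names are mine, and the real transfer logic (IXFR itself, retries, fallback to AXFR) is left out.

```python
SERIAL_BITS = 32
MODULUS = 1 << SERIAL_BITS
HALF_RANGE = 1 << (SERIAL_BITS - 1)


def serial_newer(candidate: int, current: int) -> bool:
    """Return True if candidate is 'greater than' current per RFC 1982.

    Serials are 32-bit and wrap around, so a plain integer comparison gives
    the wrong answer near the wrap point. A difference of exactly 2**31 is
    undefined in the RFC; this sketch treats it as "not newer".
    """
    diff = (candidate - current) % MODULUS
    return 0 < diff < HALF_RANGE


def needs_transfer(primary_serial: int, local_serial: int) -> bool:
    # A secondary would request an incremental transfer only if the
    # primary's zone is ahead of its own copy.
    return serial_newer(primary_serial, local_serial)
```

So `needs_transfer(2013010102, 2013010101)` is true, and so is `needs_transfer(5, 4294967290)`, because the serial has wrapped past 2**32.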

The appeal with DNS here is the long track record and existence of servers that have been battered to death in far more hostile environments than most internal service discovery systems ever will need to deal with.

The downside to DNS is that to get things like guaranteed consistency, you'd need a backend that can guarantee it, and clients that don't cache (which means you need to be careful about what resolvers you rely on). And then it might be just as easy to just deploy one of the options in this article (but there's nothing inherent in DNS that prevents that either).

Oh! Treat records as key-value pairs, and encode the value into the IP. If your timeout is 10 seconds, add a DNS record timeout.server.yourdomain which resolves to an IPv6 address with value 10. It gets tougher with ASCII strings, but you could support multi-record configs as well. Then your application just uses nslookup to download the config when you reload it.
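A quick sketch of the encoding half of that idea, using Python's standard `ipaddress` module. The helper names and the length-prefixed layout for short ASCII strings are assumptions of mine, just one way it could be done; the actual lookup of the AAAA record is left out.

```python
import ipaddress


def encode_int(value: int) -> str:
    """Pack a non-negative integer (below 2**128) into an IPv6 address."""
    return str(ipaddress.IPv6Address(value))


def decode_int(address: str) -> int:
    """Recover the integer from an address produced by encode_int."""
    return int(ipaddress.IPv6Address(address))


def encode_text(text: str) -> str:
    """Pack up to 15 ASCII characters into one address.

    Assumed layout: the first byte holds the length, the remaining 15 bytes
    hold the text padded with zero bytes.
    """
    raw = text.encode("ascii")
    if len(raw) > 15:
        raise ValueError("at most 15 ASCII characters fit in one record")
    packed = bytes([len(raw)]) + raw.ljust(15, b"\x00")
    return str(ipaddress.IPv6Address(packed))


def decode_text(address: str) -> str:
    """Recover the text from an address produced by encode_text."""
    packed = ipaddress.IPv6Address(address).packed
    return packed[1:1 + packed[0]].decode("ascii")
```

Here `encode_int(10)` gives `::a`, which is what timeout.server.yourdomain would resolve to. Anything longer than 15 characters would need the multi-record scheme mentioned above.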

If someone builds this I will be their best friend

DNS may be good for lightweight service discovery; people have been doing it for ages. However, I won't waste any time trying to dress it up as an answer for real-world config management problems (distributed, hierarchical, model-agnostic, consistent... "stuff").
The only one on your checklist that DNS doesn't cover is consistency, and there's tons of applications where short term inconsistency is totally acceptable.
Well, at the very least there's a TXT record for strings. Access control and consistency may both be issues though.
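One wrinkle with TXT: on the wire the record data is a sequence of length-prefixed character-strings of at most 255 bytes each, so longer values have to be split. A small sketch of that framing, with function names of my own; joining the chunks back together on read is a convention I'm assuming here (SPF does the same), not something the DNS spec mandates.

```python
def txt_rdata(value: bytes) -> bytes:
    """Frame a value as TXT record data: one length byte before each chunk."""
    # An empty value still needs one (empty) character-string.
    chunks = [value[i:i + 255] for i in range(0, len(value), 255)] or [b""]
    return b"".join(bytes([len(chunk)]) + chunk for chunk in chunks)


def parse_txt_rdata(rdata: bytes) -> bytes:
    """Concatenate the character-strings found in TXT record data."""
    out = bytearray()
    pos = 0
    while pos < len(rdata):
        length = rdata[pos]
        pos += 1
        if pos + length > len(rdata):
            raise ValueError("truncated character-string in TXT rdata")
        out += rdata[pos:pos + length]
        pos += length
    return bytes(out)
```

A short value like `b"timeout=10"` becomes a single character-string with a one-byte length prefix; a 600-byte value becomes three of them.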
Crazily enough, this is built into most DNS implementations. It's called Hesiod-class records: http://en.wikipedia.org/wiki/Hesiod_(name_service)

If you've ever wondered why DNS requires the "IN" (for "Internet") in all its record declarations, it's to make this distinction. The other two options are "HS" (for Hesiod), and "CH", for http://en.wikipedia.org/wiki/Chaosnet.

You can also use DNS like a distributed cache. This is useful when you have millions of clients because their local DNS server will do caching for you. You can also use it like a Bloom filter where cache hits are true positives and anything that misses might or might not be a valid key.
That sounds like the opposite of a Bloom filter. In a Bloom filter, you can have false positives but you always get true negatives.
I meant to say "except" where I went on explaining the opposite kind of lookups.
That doesn't sound much at all like a Bloom filter.
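For reference, a toy Bloom filter showing the property being discussed: a key that was added is always reported as possibly present, while a key that was never added can still be reported as present (a false positive). The sizes and the SHA-256 based hashing are arbitrary choices for this sketch.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A negative answer is always trustworthy and a positive one may not be. A cache-hit scheme where hits are certain and misses are ambiguous has the guarantees the other way around.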
Look up how service discovery via SRV records works. That is how Active Directory (amongst other things) works under the covers.
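Roughly, each SRV record carries a priority, a weight, a port and a target host; a client tries the lowest priority value first and spreads load across records of equal priority according to weight. A simplified sketch of that selection follows. The record type and function name are mine, and RFC 2782 describes a more detailed ordering procedure than this single weighted pick.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def pick_target(
    records: Sequence[SrvRecord], rng: Optional[random.Random] = None
) -> SrvRecord:
    """Pick one record: lowest priority value wins, weight breaks ties."""
    if not records:
        raise ValueError("no SRV records to choose from")
    if rng is None:
        rng = random.Random()
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Given records for a hypothetical `_ldap._tcp.example.com` with priorities 10, 10 and 20, the priority-20 host is only a fallback and is never returned by `pick_target` while the priority-10 records are in the list.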