by beevek 4227 days ago
A simple example might help. Imagine you have a node in California and a node in NY, and a user in NJ. Let's assume for now that geographic proximity is actually a good proxy for performance. If we make a bad routing decision and send your user to the CA node, we're adding a lot of overhead to their session with your application: every round trip on that TCP connection (e.g., every HTTP request and its response) has to cross the country and back. Even if we spit out a DNS answer really, really fast, if it's the wrong answer, the user has a bad experience.

It's actually even worse than that, because the user isn't the one directly doing the DNS query: there's an intermediary DNS resolver, which will cache the response. If the "wrong" response gets cached, then every user of that resolver will get that wrong response until the cache expires. So not only have we made the original requester's experience bad, but we've negatively impacted every other user of your application that's leveraging the same resolver.
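To make the caching effect concrete, here is a minimal sketch of a caching resolver. Everything in it is made up for illustration: the `CachingResolver` class, the 60-second TTL, the hostname, and the "authoritative" function that hands out the wrong answer first. It is a toy model, not how any real resolver is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedAnswer:
    answer: str
    expires_at: float


class CachingResolver:
    """Toy intermediary resolver: caches one answer per hostname until its TTL runs out."""

    def __init__(self, authoritative: Callable[[str], str], ttl_seconds: float):
        self.authoritative = authoritative  # hypothetical stand-in for the authoritative DNS
        self.ttl = ttl_seconds              # assumed TTL, chosen only for the example
        self.cache: Dict[str, CachedAnswer] = {}
        self.upstream_queries = 0

    def resolve(self, hostname: str, now: float) -> str:
        entry = self.cache.get(hostname)
        if entry is not None and now < entry.expires_at:
            return entry.answer  # every user behind this resolver gets the cached answer
        self.upstream_queries += 1
        answer = self.authoritative(hostname)
        self.cache[hostname] = CachedAnswer(answer, now + self.ttl)
        return answer


# The authoritative side answers "wrong" (CA) once, then "right" (NY).
answers = iter(["ca-node", "ny-node"])
resolver = CachingResolver(lambda host: next(answers), ttl_seconds=60)

# Four different users behind the same resolver, querying at t = 0, 10, 59, 60 seconds.
results = [resolver.resolve("app.example.com", now=t) for t in (0, 10, 59, 60)]
print(results)
print(resolver.upstream_queries)
```

In this toy run the first three users all get the CA answer, even though only one query ever reached the authoritative side with that result. The fourth user, arriving after the cache entry expires, triggers a fresh query.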

Time to first byte is a combination of a lot of things, and if you're in more than one datacenter, it doesn't much matter how fast you spit out a DNS response if you're giving the wrong one and impacting the rest of the session going forward.

That doesn't mean you shouldn't expect the best of both worlds: sending the user to the "best" endpoint, fast.

On the filter chain question: the typical approach to doing any kind of decision making in a DNS system is to add some new proprietary record type, like a geo record or a health checking record. That's kind of the natural thing to do in DNS at first glance.

But if you want to get any kind of complex routing behavior, you're going to need an awful lot of different record types implementing those different behaviors, and what you end up with behind the scenes is what I often call a spaghetti of different DNS records all pointing at each other in some kind of big decision tree -- maybe a geo record, pointing at a bunch of health checking records, pointing at a bunch of CNAMEs, pointing at a bunch of A records. This quickly becomes unmanageable, and every new kind of routing you want to do results in another layer in the decision tree, all to resolve a single hostname.

The Filter Chain is a way to collapse all that down to something much more manageable and performant by bringing all the context into one place and thinking of routing as a collection of simple actions that you're taking on some input data (answers you could give, and details about those answers).
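A rough sketch of that idea, to show the shape rather than the real thing: the `Answer` and `Request` types, the filter names, the region labels, and the fail-open behavior when everything is down are all assumptions made for this example, not the actual Filter Chain implementation or its API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Answer:
    address: str   # what we could return for the hostname
    region: str    # detail about the answer: where the node lives
    up: bool       # detail about the answer: health check result


@dataclass(frozen=True)
class Request:
    client_region: str  # assumed to be derived from the requester somehow


Filter = Callable[[List[Answer], Request], List[Answer]]


def drop_down(answers: List[Answer], request: Request) -> List[Answer]:
    """Remove answers that are failing health checks."""
    healthy = [a for a in answers if a.up]
    return healthy or answers  # assumption: if everything is down, keep them all


def prefer_client_region(answers: List[Answer], request: Request) -> List[Answer]:
    """Move answers in the client's region to the front (stable sort keeps other order)."""
    return sorted(answers, key=lambda a: a.region != request.client_region)


def select_first(n: int) -> Filter:
    """Build a filter that keeps only the first n answers."""
    def _filter(answers: List[Answer], request: Request) -> List[Answer]:
        return answers[:n]
    return _filter


def run_chain(filters: List[Filter], answers: List[Answer], request: Request) -> List[Answer]:
    """Apply each simple action in order to the list of candidate answers."""
    for f in filters:
        answers = f(answers, request)
    return answers


candidates = [
    Answer("192.0.2.10", "us-west", up=True),
    Answer("192.0.2.20", "us-east", up=True),
    Answer("192.0.2.30", "us-east", up=False),
]
chain = [drop_down, prefer_client_region, select_first(1)]
result = run_chain(chain, candidates, Request(client_region="us-east"))
print(result[0].address)
```

The point of the shape is that all the context (candidate answers plus their metadata) sits in one list, and adding a new routing behavior means adding one more small function to `chain` instead of another layer of records pointing at each other.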