Hacker News new | ask | show | jobs
by bhauer 3533 days ago
> A low tolerance for DNS response times, and suddenly large chunks of the internet are failing a lot...

Hang on a second. I feel that you're piling on other resolver changes in order to make a point. I'm not suggesting that the tolerance for DNS response times be reduced. Nor am I suggesting a scenario where the authority gets one shot after their TTL, after which they're considered dead forever. I would expect my caching DNS resolver to periodically re-attempt to resolve with the authority once we've entered the period after the authority's TTL.

> Leak a route, DDoS a DNS provider, and watch as traffic everywhere goes to an attack server because servers everywhere "protect" people by serving known-stale data rather than failing safe.

I think you're suggesting that someone could commandeer an IP and then prevent the rightful owner to correct their DNS to point to a temporary new IP.

Isn't the real problem in this scenario the ability to commandeer an IP? The malicious actor would also need to be able to provide a valid certificate at the commandeered IP. And at that point, I feel we've got a problem way beyond DNS resolution caching. Besides, if what you have proposed is possible, isn't it also possible against any current domain for the duration of their authoritative TTL? That is, a domain that specifies an 8-hour TTL is vulnerable to exactly this kind of scenario for up to an 8-hour window. Has this IP commandeering and certificate counterfeiting happened before?

3 comments

> Hang on a second. I feel that you're piling on other resolver changes in order to make a point.

Yes. The point I am making is the additional failure modes that need to be considered and the pain they can cause. Historically have caused.

At no point did I ever think you were suggesting that one failure to respond renders a server dead to your resolver forever. Instead, I expect that your resolver will see a failure to respond from a resolver a high percentage of the time, leading to frequent serving of stale data.

> Isn't the real problem in this scenario the ability to commandeer an IP?

You're absolutely right! The real problem here is the ability to commandeer an IP.

However, that the real problem is in another castle does not excuse technical design decisions that compound the real problem and increase the damage potential.

> Instead, I expect that your resolver will see a failure to respond from a resolver a high percentage of the time, leading to frequent serving of stale data.

If this were true, the current failure mode would have end users receiving NX DOMAIN a "high percentage of the time," which obviously is not happening.

{edit: To be clear, I'm reading the quote as you stating that "failure to resolve" currently happens a high percentage of the time, and therefore this new logic would result in extended TTLs more often than the original post would assume they would happen}

> However, that the real problem is in another castle does not excuse technical design decisions that compound the real problem and increase the damage potential.

It's fair to point out that this change, combined with other known issues could create a "perfect storm," but as was pointed out this exploit is already possible within the current authoritative TTL window. Exploiting the additional caching rules would just be a method of extending that TTL window.

On the other hand, where do you draw the line here? If you had to make sure that no exploits were possible most of the systems that exist today would never have gotten off the ground. It seems a bit like complaining that the locks to the White House can be exploited (picked), while missing the fact that they are only supposed to slow someone down before the "men with guns" can react.

Based on the highly unscientific sample of the set of questions asked by my coworkers in my office today, the failure mode of end users receiving NX DOMAIN has happened much more than on most days.

I don't need to make sure no exploits are possible. However, it at all possible, I'd like to help ensure that things aren't accidentally made more dangerous. It's one thing to consider and make a tradeoff. It's quite another to be ignorant of what the price is.

Well, it obviously happens when the resolver is down, but that's the situation that this logic is being proposed to smooth over. The normal day-to-day does not see a high percentage of resolvers failing to respond, or else people would be getting NX DOMAIN for high profile domains much more often.
I'm just trying to make sure we don't wind up making DNS poisoning nastier in an effort to be more user-friendly.
All the attacks mentioned here seem to be of the following shape:

1. Let's somehow get a record that points at a host controlled by us into many resolvers (by compromising a host or by actually inserting a record).

2. Let's prolong the time this record is visible to many people by denying access to authoritative name servers of a domain.

(1) is unrelated to caching-past-end-of-ttl, so you need to be able to do (1) already. (2) just prolongs the time (1) is effective and required you to be able to deny access to the correct DNS server. Is it really that much easier to deny access to a DNS server than it is to redirect traffic to that DNS server and supply bogus reponses?

WRT the second attack, what they're referring to is actually DNS cache poisoning - inserting a false record into the DNS pointing your name at an attacker-controlled IP address. This is a fairly common attack, but usually has an upper time limit - the TTL (which is often limited by DNS servers).

This proposal would allow an attacker to prolong the effects of cache poisoning by running a simultaneous DDoS against un-poisoned upstream DNS servers.

Not sure whether it could be used in a legitimate attack (probably), but it can definitely lead to confusing behavior in some scenarios. You switch servers, your old IP is handed to some random person, your website temporarily goes down - and now your visitors end up at some random website. Would you want that? Especially if you're a business?

Also, "commandeering" an IP of a small hosting might be easier than you think. It depends entirely on how they recycle addresses.