Hacker News
by textmode 2963 days ago
"Alternative To /etc/hosts

You don't have to use the hosts file (or addn-hosts ), but performance starts to suffer once the list of domains gets past 120,000.

jacobsalmela says: June 24, 2015 at 06:53

It's partially due to the amount of domains on the lists controlled by the other sites. ~120,000 seemed to be the sweet spot. Once it got higher than that, the hosts format performed better. But a faster SD card can make a difference..."

https://jacobsalmela.com/2015/06/16/block-millions-ads-netwo...

The "list of domains" here is a list of domains to which the user does not want her computer to connect.

The author is suggesting list sizes over 120,000 begin to trigger performance issues, using this dnsmasq-based approach.

What about another "list of domains", one that comprises all the domains to which the user does want to connect?

Would it be more or less than 120,000?

For over 15 years I have been running authoritative nameservers on the local network, using tinydns and later nsd, including a custom root.

cdb, the key-value store used in tinydns, on its own is useful for storing domain->ipaddr mappings. I can store lists up to 4GB.

If I understand correctly, the rough equivalent in Pi-Hole is serving /etc/hosts or some other list of hosts via dnsmasq. (If I recall correctly, pdns_recursor can also serve /etc/hosts.)

IME, controlling both /etc/hosts and authoritative DNS has made it very easy to block ads since they almost always rely on DNS.

However, I use authoritative DNS as a substitute for recursive DNS.

/etc/resolv.conf lists authoritative nameservers, not resolvers.

As such, DNS is primarily used not to block but to selectively permit. (To build the zonefiles, I use a separate method for "prefetching" the needed IP addresses in bulk that does not use recursive DNS. It has worked beautifully for over 15 years. On the local network I have encrypted DNS lookups via authoritative queries to CurveDNS-proxied authoritative nameservers; no recursive resolvers are needed.)
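
As an illustration only (this is not the prefetching method mentioned above, which is not described here), a domain->ipaddr list could be turned into tinydns-data lines roughly like this. The function name, the example domains, the documentation-range addresses and the TTL are all assumptions made for the sketch. The "+" line type is the tinydns-data form that creates an A record.

```python
import ipaddress

def tinydns_a_lines(mappings, ttl=86400):
    """Render '+fqdn:ip:ttl' lines (A records) in tinydns-data format.

    `mappings` is a hypothetical dict of domain -> IPv4 address string.
    """
    lines = []
    for name in sorted(mappings):
        # Validate the address; raises ValueError on malformed input.
        ip = ipaddress.IPv4Address(mappings[name])
        # Normalize the name: lowercase, no trailing dot.
        fqdn = name.strip().lower().rstrip(".")
        lines.append(f"+{fqdn}:{ip}:{ttl}")
    return lines

if __name__ == "__main__":
    # Example data only; 192.0.2.0/24 is reserved for documentation.
    allowed = {"example.com": "192.0.2.10", "example.org": "192.0.2.20"}
    print("\n".join(tinydns_a_lines(allowed)))
```

The resulting lines would go into the data file that tinydns-data compiles to data.cdb.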

Forgoing recursive DNS, the approach is similar to a firewall ruleset where the default is to block everything. The user then adds specific rules to allow desired traffic (or in this case domain resolutions).

In other words, the approach I chose was to determine what domains I wanted to access instead of trying to identify every possible domain that needed to be blocked. Every domain is blocked by default until I allow it.
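
The default-deny idea can be sketched in a few lines. This is a toy lookup under stated assumptions, not a nameserver: the ALLOWED table, the resolve function and the addresses are hypothetical, and a plain dict stands in for a real store such as cdb.

```python
from typing import Dict, Optional

# Hypothetical allow list: domain -> IP address (documentation-range addresses).
ALLOWED: Dict[str, str] = {
    "example.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}

def normalize(name: str) -> str:
    """Lowercase the name and strip a trailing dot so lookups are consistent."""
    return name.strip().lower().rstrip(".")

def resolve(name: str, allowed: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the stored address for an allowed domain, or None if it is not listed.

    Anything absent from the allow list is treated as blocked (default deny).
    """
    table = ALLOWED if allowed is None else allowed
    return table.get(normalize(name))

if __name__ == "__main__":
    for query in ("Example.COM.", "ads.example.net"):
        answer = resolve(query)
        print(query, "->", answer if answer is not None else "blocked (not on allow list)")
```

Nothing has to be known about ad domains in advance; a name that was never added simply does not resolve.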

Although I have no need for Pi-Hole personally, I would like to see it succeed. I am glad to see other users taking an interest in DNS.

The reason I ask about the size of the "allow" domain list is that after more than 15 years I am still not even close to reaching 120,000 domains. I wonder how many domains other users visit.

To rephrase the question: given two lists of domains, (1) all the domains the user wants to allow and (2) all the domains she wants to block, which is the larger list?

The answer will vary from one user to another.