Hacker News
by ignoramous 1937 days ago
I develop a FOSS ad-blocking DNS stub resolver and client, and I believe DNS-based content blocking will become drastically less effective as it gets more popular.

Besides CNAMEs breaking all sorts of assumptions that client software makes (and hence also causing security headaches in the process, as outlined in the paper), there are a couple of other DNS cloaking techniques that the paper doesn't discuss:

1. ALIAS records (not standardized? popularized by Route53) hide CNAME-like pointers. Another variant of this is, some DNS nameservers (like Cloudflare) flatten CNAME records (aka transparently ALIAS endpoints): CNAMEs aren't sent with the answer, that is, you're straight up served the A/AAAA record with IPs (which could easily be third-party). DNSSEC doesn't help here, afaik.

2. The shiny new SVCB/HTTPS records open up another avenue for DNS cloaking. For example, consider this set of records (not verified for correctness) forming a chain of pointers:

    example.com SVCB IN 0 example.net
    example.net CNAME IN example.org
    example.org SVCB IN 0 example.us
    example.us SVCB IN 1 example.uk (ipv4hint=2.2.2.2, ipv6hint=2:2::2)
    example.uk SVCB IN 0 example.de
    example.de CNAME IN example.fr
    example.fr SVCB IN 1 . (ipv4hint=..., ipv6hint=...)
    example.fr SVCB IN 2 example.es (ipv4hint=..., ...)
    example.fr SVCB IN 3 example.it (...)
    example.fr CNAME IN example.ru
    example.es CNAME IN example.it
    example.it SVCB IN 1 . (...)
    example.it SVCB IN 2 example.ch (...)
    example.it A IN 4.4.4.4
    example.it AAAA IN 4:4::4
    example.ch SVCB IN 0 example.ru
    example.ru SVCB IN 1 . (...)
    example.ru A IN 3.3.3.3
    example.ru AAAA IN 3:3::3
(the above is missing the example where targets follow "port prefix naming" viz. _443._https.example.com)

Though it remains trivial to uncloak domains hiding behind SVCB/HTTPS records, implementations have to be careful about what they let through. Flattened CNAMEs and ALIAS records, however, remain undetectable to my knowledge.
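As a rough illustration of the uncloaking described above, here is a minimal Python sketch that walks CNAME and SVCB/HTTPS targets over a toy, in-memory record set and checks every hop against a blocklist. The record set, the names `tracker.example` and `flat.example.com`, and the helpers `alias_chain` and `is_blocked` are all hypothetical; a real resolver would issue DNS queries rather than read a dict.

```python
# Toy zone data: name -> list of (rtype, priority, target).
# Hypothetical records loosely modelled on the chain above, not real DNS answers.
RECORDS = {
    "example.com": [("SVCB", 0, "example.net")],      # priority 0 = alias form
    "example.net": [("CNAME", None, "example.org")],
    "example.org": [("SVCB", 0, "tracker.example")],
    "tracker.example": [("A", None, "192.0.2.1")],
    # A flattened name: the nameserver answers with an address only,
    # so no pointer to the third party is visible to the client.
    "flat.example.com": [("A", None, "192.0.2.1")],
}

BLOCKLIST = {"tracker.example"}


def alias_chain(name, records, max_hops=16):
    """Return every name reachable from `name` via CNAME or SVCB/HTTPS targets."""
    seen = []
    queue = [name]
    while queue and len(seen) < max_hops:
        current = queue.pop(0)
        if current in seen:
            continue  # guard against pointer loops
        seen.append(current)
        for rtype, _priority, target in records.get(current, []):
            # A target of "." refers back to the owner name, so there is
            # nothing new to follow in that case.
            if rtype in ("CNAME", "SVCB", "HTTPS") and target != ".":
                queue.append(target)
    return seen


def is_blocked(name, records, blocklist):
    """True if any name along the pointer chain is on the blocklist."""
    return any(hop in blocklist for hop in alias_chain(name, records))
```

With this toy data, `is_blocked("example.com", RECORDS, BLOCKLIST)` is true because the chain ends at a blocklisted name, while `flat.example.com` passes even though it resolves to the same address: the flattened answer leaves no name to match on.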

But: All indications are that it is foolish to rely on DNS to discern between first-party and third-party. I mean, I can already run www.example.com on Netlify, app.example.com on Vercel, api.example.com on AWS, and cdn.example.com on Cloudflare... and those endpoints could very well be running anything the cloud providers want (third party).

IP-based firewalls don't suffer from these shortcomings, but enforcing IP blocks is complicated by virtual hosting (multiple web services behind a single IP) and IPv6 (too many addresses to curate and block).

7 comments

I just use an HTTP client that does not automatically load resources nor run Javascript. Using such a client, the user, by voluntarily typing the name of a website or following a URL, decides what to retrieve (a page, e.g., index.html), not the web developer. If the website developer is allowed to decide what the user involuntarily retrieves, then it stands to reason a website seeking revenue through online advertising will make sure the user involuntarily retrieves ads, or cookies from a tracker. For example, by letting the ad server or tracker use a subdomain of the website as a "cloak".

The fact that the technique relies on a CNAME or some other DNS indirection seems to suggest that the ad server or tracker will have a different IP from the website. That may be another weak point in any effort to conceal the fact that some resources referenced in the page or Javascript files are only necessary for advertising purposes. If both content and ad cruft were being served from a single IP, then that might pose more of a challenge in deciphering what to retrieve. I have yet to see that and doubt I ever will.

I am a believer that ultimately whitelisting is more effective than blacklisting. Request what you want, leave the rest. As opposed to letting a browser request everything according to a web developer's wishes, and then you try to block stuff. With extensions, third party assistance, etc.
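To make the "request what you want, leave the rest" idea concrete, here is a small Python sketch that parses a page and lists the resources a typical browser would load automatically, splitting them by a host allowlist instead of fetching anything. It is only an illustration: the class `ResourceLister`, the function `split_by_allowlist`, the sample page, and the host `metrics.example.com` are made up, and the set of auto-loaded tags is deliberately incomplete.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceLister(HTMLParser):
    """Collect URLs that a typical browser would fetch without being asked."""

    # (tag, attribute) pairs treated as auto-loaded; an assumption, not exhaustive.
    AUTO_LOADED = {("script", "src"), ("img", "src"), ("iframe", "src"), ("link", "href")}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for attr, value in attrs:
            if value and (tag, attr) in self.AUTO_LOADED:
                # Resolve relative references against the page URL.
                self.resources.append(urljoin(self.base_url, value))


def split_by_allowlist(html, base_url, allowed_hosts):
    """Return (wanted, skipped) resource URLs; nothing is downloaded here."""
    parser = ResourceLister(base_url)
    parser.feed(html)
    wanted = [u for u in parser.resources if urlsplit(u).hostname in allowed_hosts]
    skipped = [u for u in parser.resources if urlsplit(u).hostname not in allowed_hosts]
    return wanted, skipped


PAGE = '''<html><head><script src="https://metrics.example.com/t.js"></script></head>
<body><img src="/logo.png"><a href="/about.html">About</a></body></html>'''

wanted, skipped = split_by_allowlist(PAGE, "https://www.example.com/", {"www.example.com"})
```

Because the allowlist matches exact hostnames, a tracker cloaked behind a subdomain such as `metrics.example.com` ends up in `skipped` unless the user explicitly adds it, and the plain link to `/about.html` is never fetched until the user chooses to follow it.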

> I just use an HTTP client that does not automatically load resources nor run Javascript.

For interest, what do you use? A standard browser with plugins, or a specialised client?

For making HTTP requests, I use a variety of commandline programs, mostly non-custom. For reading HTML I use links, mostly. For reading other formats I use UNIX utilities. These are all small programs that I can easily edit and re-compile if something annoys me and I want it removed.

Today's "standard browser" that runs Javascript is an omnibus, overly complex, kitchen sink program that is inextricably linked to the online advertising industry. Online ads and tracking generally do not work without the help of one of these so-called "standard" browsers.

I don't see a way out of this without the ability to selectively block or fake browser APIs and detect tracking with heuristics, just like old antivirus and spyware blocking software.
Unfortunately, both Chrome and Safari seem dead set on killing any ability for extensions to block based on heuristics.

> DNS-based content-blocking will become drastically ineffective as it gets more popular

So does URL-based content blocking. I recently wanted to block YouTube/Twitter ads on my own; to my dismay, the ads were buried deep in some JSON response, and the ad resource URLs are not easily distinguishable from real content.

> is foolish to rely on DNS to discern between first-party and third-party

Correct, because first-party / third-party is not a technical difference, but a social/commercial one. app.example.com may run in a different cloud and still be part of the same first-party service.

IPv6 also greatly complicates IP-based blocking. There are so many IPv6 addresses that it'd be relatively cheap for an ad tracker to develop a system that uses a new one every day.

If you get allocated a /32, you have 96 bits of address space to play with (128 - 32). You could use a different IP every millisecond and not have problems.

Yeah, but they'd all be in the same /32, which is trivial to block. You'd have to intersperse "legitimate" users through your space to thwart that.

What would be wrong with blocking the entire /32 if you know the owner of it is using it in such a way?

Ad trackers often use an ISP or cloud provider that has many other customers, and which network an ISP has assigned to a given customer is not public information. Even if a company has its own AS, blocking it is not always an option: Google, Oracle, IBM and others could potentially use any IP in their networks for ads, but are too big to block.
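The /32 arithmetic and the single-prefix block discussed in this sub-thread can be sketched with Python's standard `ipaddress` module. The prefix `2001:db8::/32` is the IPv6 documentation range, used here purely as a stand-in; `in_blocked_range` is a hypothetical helper, not part of any firewall.

```python
import ipaddress

# Documentation prefix used as a stand-in for an allocated /32.
net = ipaddress.ip_network("2001:db8::/32")

host_bits = net.max_prefixlen - net.prefixlen  # 128 - 32 = 96
total = net.num_addresses                      # 2**96 addresses

# Rotating to a fresh address every millisecond:
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365
years_to_exhaust = total // MS_PER_YEAR        # more than 10**18 years


def in_blocked_range(addr, blocked=net):
    """One prefix rule covers every address in the /32, however often it rotates."""
    return ipaddress.ip_address(addr) in blocked
```

This shows both sides of the exchange: the address space is effectively inexhaustible for rotation, yet a single prefix match such as `in_blocked_range("2001:db8:1234::1")` catches all of it. What the sketch cannot capture is the objection raised last, namely that the prefix may be shared with customers nobody wants to block.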
> Another variant of this is, some DNS nameservers (like Cloudflare) flatten CNAME records (aka transparently ALIAS endpoints)

AFAIK, Cloudflare only flattens CNAMEs at the root level, and that's because CNAMEs at the root are not permitted by the standard. They have to convert it to an A record to be standards compliant.

> AFAIK, cloudflare only flattens CNAMEs at the root level

Cloudflare nameservers also flatten CNAMEs pointing to workers.dev (a domain they own), for example.

My intention was to point out that nameservers that flatten CNAMEs render DNS-based blocking ineffective.

At what point does an opt-in approach make more sense than opt-out?

Cookie banners were invented for this purpose, but as we can see, they are not the best solution. Why would anyone opt in to any kind of tracking anyway, after all the privacy issues of recent years? Still, some kind of standard would be really important. I'm working on implementing tracking solutions that respect user consent and privacy, but even simple website analytics gets really complicated because there are currently no industry standards: every third party handles user consent differently. And we haven't even gotten to the point of gathering the user consent.