by defaultname 1859 days ago
If we go down that road, however, sites can make ads completely indistinguishable from desired content. Same domain, same stream, no easily marked container. All of the imperative adblocking tech in the world, short of queuing everything through a neural engine post render, can't block what is possible.

So there has always been a detente between adblockers and publishers, presuming the former hit a small enough set of users that it was just ignored. It seems that is no longer the case.


That would require delivering ads from first-party servers, right? So third-party ad and tracking networks would die a painful death.
More likely, they'd get upgraded to "first-party" tracking by acting as a CDN layer, where the ad networks do the proxying/caching to fetch the actual content upstream before merging it with the ads and serving the whole thing in a single request. How would you block ads from Cloudflare if it became an ad network?
The same way you do now with ublock origin. By removing specific elements which match a rule.

The networks will still be able to track you, but ads will be blockable until pages discard the DOM and switch to rendering the whole page on a canvas.
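
To make the "removing specific elements which match a rule" idea concrete, here is a minimal sketch in Python. It is not how uBlock Origin is implemented; the class names, the `ElementHider` helper, and the sample markup are all made up for illustration, and the rule language is reduced to "drop any element carrying a blocked class".

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ElementHider(HTMLParser):
    """Re-emits HTML while dropping elements that carry a blocked class.

    Simplification: character references are decoded by the parser and
    are not re-escaped on output.
    """

    def __init__(self, blocked_classes):
        super().__init__()
        self.blocked = set(blocked_classes)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a removed element

    def _matches_rule(self, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        return bool(classes & self.blocked)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or self._matches_rule(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img/>: emit or drop as a single unit.
        if not self.skip_depth and not self._matches_rule(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)


def hide_elements(html, blocked_classes):
    parser = ElementHider(blocked_classes)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = (
    '<div><p>Article text</p>'
    '<div class="ad-banner sponsored"><img src="x.png"><a href="#">Buy</a></div>'
    '<p>More text</p></div>'
)
cleaned = hide_elements(page, {"ad-banner"})
```

The rule only works because the ad sits in its own container with a recognisable class. If the same markup arrived without one, the rule would have nothing to match.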

Then you serve the ads on the same set of elements that also contain critical content to the user. You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself. Same goes for any other kind of "element" that doesn't explicitly set itself apart from the actual content.

The ads have long ago started evolving away from a simple "here's an ad neatly placed into its own semantic container so that blockers can target it".

> You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself

You probably won't ever read this, but you actually can.

The pre/post-video ads have always been blockable with uBlock Origin, and the mid-stream adverts inserted by the content creators themselves can be skipped using SponsorBlock.

Couldn't they just be proxied through a first party server?
Some trackers are having people set up CNAME records on their domains, so the tracker cookies appear to be first party:

https://arxiv.org/abs/2102.09301

uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.
For anyone else who wanted to know more like me, here's a good rundown: https://www.reddit.com/r/uBlockOrigin/comments/f8qnpc/ublock...

Note that CNAME uncloaking only works on Firefox; chromium-based browsers do not support the required API.
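
A rough sketch of what CNAME uncloaking amounts to, in Python. This is not uBlock Origin's code. The hostnames, the `CNAME_RECORDS` table standing in for real DNS lookups, and the blocklist are all hypothetical; a real blocker would get the canonical name from a resolver rather than a dict.

```python
# Hypothetical DNS data standing in for real CNAME lookups.
CNAME_RECORDS = {
    "metrics.news.example": "news-example.tracker.example",
    "news-example.tracker.example": "edge.tracker.example",
    "static.news.example": "cdn.hosting.example",
}

# Hypothetical blocklist of tracker domains.
BLOCKED_DOMAINS = {"tracker.example"}


def resolve_cname_chain(hostname, records, max_hops=8):
    """Follow CNAME aliases from hostname, returning every name visited."""
    chain = [hostname]
    # The hop limit keeps a misconfigured CNAME loop from running forever.
    while chain[-1] in records and len(chain) <= max_hops:
        chain.append(records[chain[-1]])
    return chain


def is_blocked(hostname, blocked):
    """True if hostname or any of its parent domains is on the blocklist."""
    labels = hostname.split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


def should_block(hostname, records=CNAME_RECORDS, blocked=BLOCKED_DOMAINS):
    """Block if any name along the CNAME chain is a known tracker."""
    return any(
        is_blocked(name, blocked)
        for name in resolve_cname_chain(hostname, records)
    )
```

Checking only the requested name, `metrics.news.example` looks first party and passes. Checking the whole chain exposes the tracker domain behind it, which is the step that needs a DNS lookup from inside the extension.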

And for me this is one of the reasons, probably the biggest, that I don't want to buy an iPad: it doesn't allow running full-blown Firefox.

I've spent hours debating a move to an iPad instead of an Android tablet, and it comes down to 1. Lightning instead of USB-C (I can't afford the iPad Pro), which I can live with, and 2. Firefox, which is just a blocker.

> uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.

... which in turn is a static list of domains that needs to be regularly updated, and therefore is not really failsafe. uBlock0 uses Adguard's scraped dataset [1] as a fallback source to do this, as Chrome extensions cannot make DNS requests without a DNS-over-HTTPS endpoint.

Firefox, however, provides the `dns` API [2] to make requests via the native OS resolver (which in turn is also not failsafe, since those are unencrypted, plain old manipulable DNS UDP requests).

[1] https://github.com/AdguardTeam/cname-trackers

[2] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...

uBlock Origin on Firefox is able to perform CNAME uncloaking to block this shenanigan.
TBH that is the future. Tracking won't die; it will just evolve to become harder to block.
That would partially defeat the purpose. First-party ads see first-party cookies, so they have no inherent cross-site tracking ability.
And yet Facebook is already doing something along the same lines by adding fbcid=<trackingnumber> to every outbound link, and then the site that receives the link can report back "I saw fbcid=<trackingnumber>". Sure, it makes third-party tracking require more trust, but wouldn't some analysis tell you if your client is trying to game your ads for revenue etc.?
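
For what it's worth, link decoration like this can be undone on the client by stripping the parameter before the request goes out. A minimal Python sketch follows. The `TRACKING_PARAMS` set is an assumption: `fbcid` is the name as written in the comment above, while `fbclid` and `gclid` are the click-ID parameters commonly seen on Facebook and Google links. The example URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of click-ID parameter names; extend as needed.
TRACKING_PARAMS = {"fbcid", "fbclid", "gclid"}


def strip_click_ids(url, tracking_params=TRACKING_PARAMS):
    """Return url with known click-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in tracking_params
    ]
    # Rebuild the URL with only the surviving query parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

This only helps when the identifier travels in a recognisable parameter. If the site folds it into the path or an opaque token, there is no name to match on.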
Cross-site tracking via cookies (and 3rd-party cookies in general) has already been dead for years.
That would be more effort than including a single html-script tag to import google analytics. I have hope that most parties would decide that the extra server load and difficulties would make it not worth it.
You underestimate the desire for precision tracking, unfortunately. Hiding behind custom subdomains is common. Stepping up to cloaking it to be delivered from the application is more effort but it'll happen.
You overestimate the technical ability of publishers. Frankly, it’s pathetic to rely on a third party’s hot linked JavaScript, but no one can be arsed to understand how it works, so they just add the tags to GTM instead of realizing that they could trivially implement A/B testing or whatever themselves.
If you as the ad network don't connect to the end user directly, how can you be sure you're not being defrauded by the site owner?
Still more effort than before. Twiddling with server configs requires more work than inserting a JS snippet.
I wonder if Google’s Web Packaging standard was intended to eventually make it possible to deliver both the page and the ads from the same server without enabling one party to tamper with the other.
While this is an obvious route you could take, this is a significant jump from the current behaviour where the content has to be signed by the owner of the domain that packaged it and is treated as if it was served by that domain (on a given scheme/port), and the same-origin policy applies like normal, and thus the ads would continue to be treated as third-party.

It _does_ potentially allow performance gains, insofar as you're then able to send a single bundle containing both first and third party content, but it isn't a gain from the point-of-view of avoiding adblockers (aside from the most primitive DNS/IP level ones).

...That's what YouTube does. Ad video content comes from redirector.gvt.com -> xxx.googlevideo.com just like solicited video content.

(Yes, I classify advertising as spam.)

But somehow it’s still blockable. I don’t see any ads on YT with Firefox + uBlock Origin.
You can also reverse proxy it through your server. I did that once for Google analytics on a demo site.
> If we go down that road, however, sites can make ads completely indistinguishable from desired content.

This is exactly the reason why I'm building a web browser with a statistical representation of both the DOM/CSS Layout _and_ the network traffic, so that neural networks can be trained on classifying ads and malicious actors.

There are a lot of networking requirements for such a peer-to-peer system to work, like a consensus on DNS/CNAME/PTR or a consensus on TLS cert validity.

But I honestly believe that this is unavoidable in the near future, given that most browsers these days are just a Chrome/Chromium shim, where Google's business model obviously conflicts with the idea of blocking ads.

I think by law, ads have to be declared as such for users.
That works in theory but fails in practice. Most ads will just have an "Opinion" tag stapled on them.