by shawnz 1860 days ago
This is basically the exact fear which was being expressed by users when Google announced that they would require Chrome extensions to only use declarative content blocking starting with Manifest v3 (which anecdotally convinced me to switch to Firefox).

If we go down that road, however, sites can make ads completely indistinguishable from desired content. Same domain, same stream, no easily marked container. All of the imperative adblocking tech in the world, short of queuing everything through a neural engine post render, can't block what is possible.

So there has always been a detente between adblockers and publishers, presuming the former hit a small enough set of users that it was just ignored. It seems that is no longer the case.

That would require delivering ads from first-party servers, right? So third-party ad and tracking networks would die a painful death.
More likely, they'd get upgraded to "first-party" tracking by acting as a CDN layer where the ad networks do the proxying/caching to get the actual content upstream before merging it with the ads and serving the whole thing in a single request. How would you block ads from Cloudflare if it became an ad network?
The same way you do now with uBlock Origin: by removing specific elements which match a rule.

The networks will still be able to track you, but ads will be blockable until pages discard the DOM and switch to rendering the whole page on a canvas.
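To make "removing specific elements which match a rule" concrete, here is a minimal sketch in Python. It is not how uBlock Origin is implemented; the `Node` type, the `remove_matching` helper, and the class name `ad-banner` are all hypothetical, and a tiny in-memory tree stands in for the real DOM.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy stand-in for a DOM element (hypothetical, for illustration only)."""
    tag: str
    classes: set = field(default_factory=set)
    children: list = field(default_factory=list)
    text: str = ""


def remove_matching(node, blocked_classes):
    """Drop every descendant whose class set intersects blocked_classes."""
    kept = []
    for child in node.children:
        if child.classes & blocked_classes:
            continue  # the rule matched: remove this element and its subtree
        kept.append(remove_matching(child, blocked_classes))
    node.children = kept
    return node
```

The rule only works while the ad sits in an element that the rule can single out. If the ad shares an unmarked element with the content, or the page is painted onto a canvas, there is no node for such a rule to match.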

Then you serve the ads on the same set of elements that also contain critical content to the user. You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself. Same goes for any other kind of "element" that doesn't explicitly set itself apart from the actual content.

Ads long ago started evolving away from the simple model of "here's an ad neatly placed into its own semantic container so that blockers can target it".

> You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself

you probably won't ever read this, but you actually can.

the pre/post video ads have always been blockable with uBlock Origin, and the mid-stream adverts inserted by the content creators themselves can be skipped using SponsorBlock.

Couldn't they just be proxied through a first party server?
Some trackers are having people set up CNAME records on their domains, so the tracker cookies appear to be first party:

https://arxiv.org/abs/2102.09301

uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.
For anyone else who wanted to know more like me, here's a good rundown: https://www.reddit.com/r/uBlockOrigin/comments/f8qnpc/ublock...

Note that CNAME uncloaking only works on Firefox; chromium-based browsers do not support the required API.

> uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.

... which in turn is a static list of domains that needs to be regularly updated, and therefore is not really failsafe. uBlock0 uses AdGuard's scraped dataset [1] as a fallback source to do this, as Chrome extensions cannot make DNS requests without a DNS-over-HTTPS endpoint.

Firefox, however, has provided the `dns` API [2] to do requests via the native OS resolver (which in turn is also not failsafe, since these are plain old unencrypted, manipulable DNS requests over UDP).

[1] https://github.com/AdguardTeam/cname-trackers

[2] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
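As a rough illustration of the uncloaking idea discussed here, the sketch below follows a hostname's CNAME chain and checks whether any alias lands on a known tracker domain. It is only a sketch under assumptions: a plain dictionary (`cname_records`) stands in for real DNS answers or a scraped dataset, and every hostname and function name is made up.

```python
def matches_domain(host, domains):
    """True if host equals one of the domains or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def resolve_chain(host, cname_records, max_hops=8):
    """Follow CNAME records from host, guarding against loops and long chains."""
    chain = [host]
    seen = {host}
    while host in cname_records and len(chain) <= max_hops:
        host = cname_records[host]
        if host in seen:
            break  # CNAME loop: stop following
        chain.append(host)
        seen.add(host)
    return chain


def is_cloaked_tracker(host, cname_records, tracker_domains):
    """A first-party-looking host is cloaked if any alias is a tracker domain."""
    chain = resolve_chain(host, cname_records)
    return any(matches_domain(h, tracker_domains) for h in chain[1:])
```

With live DNS lookups the dictionary is replaced by resolver answers; with a static list it has to be refreshed regularly, which is the weakness pointed out above.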

uBlock Origin on Firefox is able to perform CNAME uncloaking to block this shenanigan.
TBH that is the future. Tracking won't die; it will just evolve to become harder to block.
That would partially defeat the purpose. First-party ads see first-party cookies, so they have no inherent cross-site tracking ability.
And yet Facebook is already doing something along the same lines by adding fbclid=<trackingnumber> to every outbound link, and then the site that receives the link can report back "I saw fbclid=<trackingnumber>". Sure, it makes 3rd-party tracking require more trust, but wouldn't some analysis tell you if your client is trying to game your ads for revenue etc...?
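The link-decoration trick described above, and the usual countermeasure of stripping the parameter, can be sketched with Python's standard `urllib.parse`. The helper names and example URLs are hypothetical; the only real-world detail assumed is that Facebook's click-identifier parameter is spelled `fbclid`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PARAMS = {"fbclid"}  # extend with other click-ID parameters as needed


def decorate(url, click_id):
    """Append a click identifier to an outbound link, as the referring site would."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("fbclid", click_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def strip_tracking(url, params=TRACKING_PARAMS):
    """Remove known tracking parameters while keeping the rest of the query."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in params
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Stripping only helps against parameters whose names are known in advance; an identifier folded into the path or into an opaque token would pass straight through.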
Cross-site tracking via cookies (and 3rd-party cookies in general) has already been dead for years.
That would be more effort than including a single html-script tag to import google analytics. I have hope that most parties would decide that the extra server load and difficulties would make it not worth it.
You underestimate the desire for precision tracking, unfortunately. Hiding behind custom subdomains is common. Stepping up to cloaking it to be delivered from the application is more effort but it'll happen.
You overestimate the technical ability of publishers. Frankly, it’s pathetic to rely on a third party’s hot linked JavaScript, but no one can be arsed to understand how it works, so they just add the tags to GTM instead of realizing that they could trivially implement A/B testing or whatever themselves.
if you as the ad network don't connect to the end user directly: how can you be sure you're not being defrauded by the site owner?
still more effort than before. twiddling with server configs requires more work than inserting a js snippet.
I wonder if Google’s Web Packaging standard was intended to eventually make it possible to deliver both the page and the ads from the same server without enabling one party to tamper with the other.
While this is an obvious route you could take, it is a significant jump from the current behaviour. Today the content has to be signed by the owner of the domain that packaged it and is treated as if it was served by that domain (on a given scheme/port); the same-origin policy applies like normal, and thus the ads would continue to be treated as third-party.

It _does_ potentially allow performance gains, insofar as you're then able to send a single bundle containing both first and third party content, but it isn't a gain from the point-of-view of avoiding adblockers (aside from the most primitive DNS/IP level ones).

...That's what YouTube does. Ad video content comes from redirector.gvt.com -> xxx.googlevideo.com just like solicited video content.

(Yes, I classify advertising as spam.)

But somehow it’s still blockable. I don’t see any ads on YT with Firefox + uBlock.
You can also reverse proxy it through your server. I did that once for Google analytics on a demo site.
> If we go down that road, however, sites can make ads completely indistinguishable from desired content.

This is exactly the reason why I'm building a web browser with a statistical representation of both the DOM/CSS Layout _and_ the network traffic, so that neural networks can be trained on classifying ads and malicious actors.

There are a lot of networking requirements for such a peer-to-peer system to work, like a consensus on DNS/CNAME/PTR or consensus on TLS cert validity.

But I honestly believe that this is unavoidable in the near future, given that most browsers these days are just a Chrome/Chromium shim where obviously Google's business model conflicts with the idea of blocking ads.

I think by law, ads have to be declared as such for users.
That works in theory but fails in practice. Most ads will just have an "Opinion" tag stapled on them.
Blocking YouTube ads requires injecting JS; it has nothing to do with Manifest v3.
A major justification for removing the previous content blocking API was that it could be used to do things like inject JS. So clearly the intention is to have content blocking extensions not do that at all. Although in this specific case, it might still be possible.

AdGuard and uBO for example use the content blocking API to inject blocking "scriptlets" on sites where this kind of thing is required. That kind of usage is made much more inconvenient with Manifest v3.

I don't think so. Injecting JS is a valid use case and will work forever; probably a huge majority of extensions do that. The intention is to make content blocking extensions more performant.

It's very easy to inject JS. I don't know whether you're talking from your own experience, but I wrote my own little extension to replace uBlock (with my own list of rules and blocks), and to inject JS or CSS you just have to add a line in manifest.json, which has nothing to do with the blocking API.

See here where Justin Schuh says the sole motivation is for privacy reasons: https://twitter.com/justinschuh/status/1134092257190064128

I know it is easy to inject JS and that you can do it with the manifest file. But without the old content blocking API you can't dynamically inject different snippets on different pages based on filter lists for example (unless you inject something on every page).

I wouldn't be surprised if in the future, content blocking extensions won't be allowed in the store if they use such broad permissions for example.
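The difference between a static manifest entry and filter-list-driven injection can be sketched as follows. This is a simplified illustration, not uBO's or AdGuard's actual code: the rule syntax is a cut-down form resembling scriptlet filters (`hostname##+js(name, args)`), and the scriptlet names, hostnames, and helper functions are invented.

```python
import re

# Simplified scriptlet rule: "host1,host2##+js(scriptlet-name, arg1, arg2)"
RULE_RE = re.compile(r"^(?P<hosts>[^#]+)##\+js\((?P<args>.*)\)$")


def parse_rules(lines):
    """Parse filter-list lines into (hosts, scriptlet_name, args) tuples."""
    rules = []
    for line in lines:
        match = RULE_RE.match(line.strip())
        if not match:
            continue  # comments and other rule types are ignored in this sketch
        hosts = [h.strip() for h in match["hosts"].split(",")]
        args = [a.strip() for a in match["args"].split(",")]
        rules.append((hosts, args[0], args[1:]))
    return rules


def scriptlets_for(hostname, rules):
    """Pick the scriptlets whose rules apply to this page's hostname."""
    selected = []
    for hosts, name, args in rules:
        if any(hostname == h or hostname.endswith("." + h) for h in hosts):
            selected.append((name, args))
    return selected
```

The selection happens per page at runtime from an updatable list, which is exactly what a fixed `content_scripts` line in manifest.json cannot express unless something is injected on every page.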

Well, to be fair, not really. Scriptlets will continue to work just okay.

To be completely honest, Manifest V3 technically is not THAT bad, and its capabilities at the current moment are really close to what major ad blockers can do.

There're still some things that bother me:

1. Debugging a content blocker is really inconvenient (not as bad as Safari though)
2. The future. What if its development stalls after it's released?
3. Google's goal (probably, for Manifest V4) is to make content blocking completely declarative, i.e. get rid of any host permissions and content scripts.