(note: I'm hesitant to post this; any site that actually did this is a site I'm never visiting again)
The way to get around adblocking is very long crypto tokens, but not in the subdomain. All that is needed is a front-end proxy that takes each session[1] and rewrites all href/src addresses to point to the proxy. This means all URLs in the page are of the form
https://example.com/proxy/<crypto-token>
# or in the no-cookie case
https://example.com/proxy/<crypto-token>/<session-id>
Rewriting client-side generated URLs is an exercise left for the relevant Javascript framework, but only requires the addition of a simple API in the proxy to convert URLs, or some sort of bypass/whitelist mechanism.
The tokens used by the proxy can either be the ciphertext of the actual URL or a synthetic token that references the real URL stored in a DB in the proxy. Such details are left to the implementation of the proxy.
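A minimal sketch of the rewriting step, assuming the synthetic-token variant (the ciphertext variant would need a real cipher and is not shown). `TokenTable`, `rewrite_html` and the regex-based attribute matching are names and shortcuts I made up for illustration; a real proxy would use a proper HTML parser and a persistent store.

```python
import re
import secrets

PROXY_PREFIX = "https://example.com/proxy/"  # URL shape described above


class TokenTable:
    """In-memory stand-in for the proxy's token DB (hypothetical)."""

    def __init__(self):
        self._by_token = {}

    def issue(self, real_url):
        # 32 random bytes, URL-safe base64: carries no information about the URL.
        token = secrets.token_urlsafe(32)
        self._by_token[token] = real_url
        return token

    def resolve(self, token):
        # Returns the real URL, or None for an unknown token.
        return self._by_token.get(token)


# Matches quoted href/src attributes only; unquoted attributes are ignored here.
ATTR_RE = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_html(html, table):
    """Replace every quoted href/src value with a proxy URL holding a fresh token."""

    def replace(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        return f"{attr}={quote}{PROXY_PREFIX}{table.issue(url)}{quote}"

    return ATTR_RE.sub(replace, html)
```

Every call to `issue` hands out a new token, so the same ad URL never appears under the same token twice, which is what leaves a URL filter with nothing stable to match on.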
The point is that you have to send the crypto token back to the proxy to get either 1) a redirect to the real URL, 2) the actual content served by the site in question (either from the proxy directly or as a tunnel), or 3) the advertisement/whatever, with the proxy acting as a live proxy to the real ad server, with all the stupid tracking information passed along as extra HTTP headers (in the style of "X-Forwarded-For"). The client only ever sees URLs from the single domain, each obfuscated into a crypto token. No URL would give up any distinguishing characteristic an adblocker can use as a filter.
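The three outcomes can be sketched as a single dispatch function. This is only an illustration with no network I/O: `Target`, `Response` and `handle` are hypothetical names, and the "fetch upstream" step is reduced to recording which URL and headers the proxy would use.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Target:
    kind: str  # "redirect", "content", or "ad"
    url: str   # the real URL hidden behind the token


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)           # sent to the client
    upstream_url: Optional[str] = None                    # what the proxy would fetch
    upstream_headers: dict = field(default_factory=dict)  # sent to the upstream


def handle(token, targets, client_ip):
    """Decide what the proxy does with a token sent back by the client."""
    target = targets.get(token)
    if target is None:
        return Response(status=404)
    if target.kind == "redirect":
        # Case 1: bounce the client to the real URL.
        return Response(status=302, headers={"Location": target.url})
    upstream_headers = {}
    if target.kind == "ad":
        # Case 3: pass client details to the ad server as extra headers.
        upstream_headers["X-Forwarded-For"] = client_ip
    # Cases 2 and 3: the proxy fetches the upstream itself and relays the body.
    return Response(status=200, upstream_url=target.url,
                    upstream_headers=upstream_headers)
```

From the client's side, cases 2 and 3 are indistinguishable: both are a 200 from example.com for an opaque token.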
The only costs are the cost of running the proxy, and a bit of latency on each GET request because of the extra hop through the proxy.
It might be possible to find heuristics to block in the client's DOM, which is why I have expressed concern in the past[2] about the people who will use WebAssembly and a <canvas> tag to bypass the DOM. These two techniques in combination will make adblocking nearly impossible without first either breaking crypto or solving the halting problem.
[1] As defined in the usual manner, either as a cookie or embedded in the URLs the proxy generates.
Sure, you can block all of these URLs. This means you block jQuery, all other Javascript, all CSS, all images, the user-clickable URLs on the page, etc.
The point was that all URLs get encrypted/replaced at the proxy. If you load the page and block all of them, you get not only a page that doesn't work, you also lose access to all other pages on the site.
This is an extremely rude way to run a website; it completely betrays your contempt for and mistrust of the client. Given that quite a few websites already demonstrate this with their malicious and purposefully misleading ads, I expect they won't think twice about capturing all URLs with a proxy (or similar technique).
Incidentally, this type of method could be extended very easily to prevent all deep linking. Also note that this isn't theoretical. I saw this done in... 1998/1999. It had problems and was expensive to run on the server, but I suspect Moore's law has easily had enough time to solve that problem.
If both legit and ad URLs all look the same, and can only be distinguished by decrypting with a key, then adblockers may be unable to differentiate between the two.
You can differentiate with a list of object SHA hashes that you blacklist based on ad blocker user feedback. You'll still need to fetch the object, but you can dump it before rendering.
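Roughly, the client-side check would look like this. A sketch only: SHA-256 is my assumption (the comment just says "SHA"), and `should_drop` plus the shape of the blocklist are made up for illustration.

```python
import hashlib


def should_drop(body: bytes, blocked_digests: set) -> bool:
    """Return True if the fetched object's SHA-256 digest is on the blocklist."""
    return hashlib.sha256(body).hexdigest() in blocked_digests


# Hypothetical blocklist built from user reports of known ad objects.
known_ad = b"bytes of a reported ad image"
blocked = {hashlib.sha256(known_ad).hexdigest()}
```

The object still has to be downloaded before it can be hashed, so this saves rendering, not bandwidth.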
Excellent point - you could monitor the ABP database and if the hash appears, modify the content (shifting the value slightly on a single pixel) so the thieves need to block the new one.
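To illustrate why that countermeasure works: changing a single bit of the object gives a completely different digest. The sketch below assumes the image is already a raw buffer of pixel values; decoding and re-encoding a real PNG/JPEG is not shown, and `nudge_pixel` is a hypothetical helper.

```python
import hashlib


def nudge_pixel(pixels: bytes, index: int = 0) -> bytes:
    """Flip the lowest bit of one byte, i.e. shift one channel value by 1."""
    changed = bytearray(pixels)
    changed[index] ^= 1
    return bytes(changed)


original = bytes([200, 120, 40] * 4)  # four RGB pixels of made-up data
variant = nudge_pixel(original, index=5)

# The two digests differ even though only one byte changed by one step.
digest_before = hashlib.sha256(original).hexdigest()
digest_after = hashlib.sha256(variant).hexdigest()
```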
I assume by thieves you mean ad networks, because I never agreed to retrieve their content, let alone view it.
It's an arms race, as always. And just as the media industry couldn't beat piracy, ad networks aren't going to beat blockers, even if it means content producers get their content stripped and distributed via other channels.
Ok so what are the search implications here? How does this impact Google indexing? And if it is done in such a way that Google can make a distinction, how far behind would a great adblocker be?
I don't see how it would impact Google at all. Does Google change their behavior if you include
Adding a mechanism to bypass URL-encryption for actual (non-ad) external links would be easy, if someone cares about traditional PageRank-style references.
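Such a bypass could be as small as a hostname check before the rewrite step. A sketch under assumptions: `SITE_HOST`, `AD_HOSTS` and `should_tokenize` are hypothetical, and a real deployment would need a better way to know which hosts serve ads.

```python
from urllib.parse import urlsplit

SITE_HOST = "example.com"
AD_HOSTS = {"ads.example.net"}  # hypothetical list of ad-serving hosts


def should_tokenize(url: str) -> bool:
    """Tokenize same-site and ad URLs; leave ordinary external links readable."""
    host = urlsplit(url).hostname
    if host is None or host == SITE_HOST:
        return True  # relative or same-site URL: hide it like everything else
    return host in AD_HOSTS  # external: only hide it if it is an ad host
```

Ordinary outbound links then stay as plain URLs that a crawler can follow, while site assets and ads remain indistinguishable from each other.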
Other than that, the client (possibly GoogleBot) still gets the same page content; it's only the URLs that change. I have no idea (and really don't care) if URL changes affect the "search engine optimization" games some people like to play. I'm describing a way people could actually defeat adblocking; if that method is annoying to use because of side effects, that's not my concern.
Are you using "publisher" to mean anything other than the author of the website? I'm not sure. It seems like you're suggesting that the ad network is the publisher, and the actual website is just "hosting", but I may be parsing your comment incorrectly.
As for any concern about advertisers not trusting the publications they want to do business with, that's their problem. They can easily verify, with random anonymous checks, that a website's URL-encrypting proxy isn't modifying their ads. Contracts that require the publisher to pass through the ads unmodified (and with the appropriate HTTP headers or any other technical detail) would be the obvious next step.
There are several other obvious ways for advertisers to gain power over the actual publishers of a website, but I'm not really interested in enumerating ways to keep ads on the internet.
You wouldn't have to tamper with existing data coming from site visitors; you could simply generate fake visits. That would be much, much harder to check for. You could of course do that with the current model, but it requires a lot more resources (distributed IP addresses to come from being the largest issue).
Since this is specifically targeted at getting around adblocking, the ad networks wouldn't be able to rely on cookies etc., because the same types of visitors will often block/clear them.
They'll lose some of their SEO juice because their URLs are no longer human-friendly. Hashing everything would be more costly than letting some people block ads.
[2] https://news.ycombinator.com/item?id=10211050