by nomilk 754 days ago
AdFlush (F1 score: 0.98) seems to do better than some other adblockers: AdGraph (F1 score: 0.93), WebGraph (F1 score: 0.90), and WTAgraph (F1 score: 0.84), but it raises the question: why not compare to the most popular adblockers, such as uBlock Origin and Adblock Plus?
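
For readers unfamiliar with the metric: F1 is the harmonic mean of precision and recall. A minimal sketch in Python, using made-up confusion counts that are not taken from the paper:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of blocked requests that really were ads
    recall = tp / (tp + fn)     # share of ads that were actually blocked
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, chosen only for illustration.
example = f1_score(tp=98, fp=2, fn=2)
```

Because it is a harmonic mean, a blocker cannot reach a high F1 by being aggressive alone (low precision) or cautious alone (low recall).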

I think the authors want to compare apples with apples, so they only compare their algorithm to other adblockers that use algorithms, as opposed to those which use crowdsourced lists. The paper somewhat acknowledges this:

> However, manual maintenance of these filter lists requires significant human effort

Seems like one of those tasks where crowdsourcing scales so nicely (only one person has to report an ad for it to go into a crowdsourced list that blocks it for millions of others) that it makes an algorithmic approach unnecessary.


The filter-based adblockers are at risk though, with Google's new extension thingy that - at least a few years ago, I haven't heard from it since - limited the number of rules. If there's a non-rule-based system that is 98% effective then that would circumvent the arbitrary rule limits that Google set.
My understanding is that under manifest v3[1] only a list of rules is allowed. An algorithmic ad blocker wouldn't be able to work at all.

[1] https://arstechnica.com/gadgets/2023/11/google-chrome-will-l...

This is true. Extensions currently (manifest v2) are able to evaluate net requests dynamically, and are able to modify requests according to a dynamic ruleset that the extension can retrieve from some filter list published on the internet.

Under manifest v3, extensions are not able to dynamically inspect requests; instead, they may only apply rules to net requests. Even worse, there is a limitation of only 5000 rules per extension!! [1]

Even WORSE worse, under Chrome's manifest v3 rules, the extension cannot load any external code! Meaning that blocklists must be packaged with the extension. [2] Now, one might read that link as not affecting block lists: a list is not a "library" and it's not "code" so long as it's just a list of textual rules... However, Google considers the following to be a violation: "Building an interpreter to run complex commands fetched from a remote source, even if those commands are fetched as data". [3]

Sneaky sneaky. An extension update (and hence new app store submission) is required to update filter lists.

In other words, dynamic net requests are banned, and remotely-updated blocklists are banned as well.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...

[2] https://developer.chrome.com/docs/extensions/develop/migrate...

[3] https://developer.chrome.com/docs/webstore/program-policies/...
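
To make the dynamic-versus-declarative distinction above concrete, here is a rough model in Python. It is not extension code: the rule's field names mirror the declarativeNetRequest rule shape, but the substring matching is a stand-in and not Chrome's real urlFilter semantics, and all hosts are hypothetical.

```python
# MV2-style (simplified model): the extension supplies arbitrary code that
# sees each request and decides what to do with it.
def dynamic_blocker(url: str, resource_type: str) -> bool:
    # Any logic is allowed here, e.g. a heuristic or a model score.
    return resource_type == "script" and "track" in url

# MV3-style (simplified model): the extension supplies only data, and the
# browser evaluates it. Hypothetical rule for illustration.
RULES = [
    {
        "id": 1,
        "priority": 1,
        "action": {"type": "block"},
        "condition": {
            "urlFilter": "ads.example.com",
            "resourceTypes": ["script", "image"],
        },
    },
]

def declarative_blocker(url: str, resource_type: str, rules: list[dict]) -> bool:
    """Stand-in for browser-side evaluation: plain substring match."""
    for rule in rules:
        cond = rule["condition"]
        if cond["urlFilter"] in url and resource_type in cond["resourceTypes"]:
            return rule["action"]["type"] == "block"
    return False
```

In the first model the decision logic belongs to the extension; in the second the extension can only hand over a list, which is why a classifier-style blocker does not fit.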

Chrome allows at least 30000 static rules + 30000 dynamic rules[1].

[1] https://developer.chrome.com/docs/extensions/reference/api/d...

That's not enough. Just uBlock Origin's default list "uBlock filters – Ads" already accounts for over 38,000 rules. EasyList is over 87,000!
10x more https://blog.chromium.org/2024/05/manifest-v2-phase-out-begi...

"Based on input from the extension community, we also increased the number of rulesets for declarativeNetRequest, allowing extensions to bundle up to 330,000 static rules and dynamically add a further 30,000."
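
Putting the numbers quoted in this subthread side by side. This is a back-of-the-envelope sketch: the list sizes are the "over N" figures given above, treated as lower bounds, and it assumes one list entry maps to one rule and ignores overlap between lists, both of which are simplifications.

```python
# Approximate rule counts quoted above (lower bounds).
lists = {"uBlock filters - Ads": 38_000, "EasyList": 87_000}
needed = sum(lists.values())  # ignores any overlap between the two lists

old_static_limit = 30_000    # static limit cited earlier in the thread
new_static_limit = 330_000   # static limit from the Chromium blog quote

fits_old = needed <= old_static_limit
fits_new = needed <= new_static_limit
headroom = new_static_limit - needed
```

Under these assumptions the two lists together need about 125,000 rules, which exceeds the old limit but leaves roughly 205,000 rules of headroom under the new one.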

If Manifest v3 is really this bad then it's probably still possible to build adblockers by DLL hooking the browser. It should also not affect browsers with built-in adblocking like Brave and Vivaldi.
> it's probably still possible to build adblockers by DLL hooking the browser.

I like this. Or possibly the COM API, but I'm not a Windows expert.

How complex is it to revert the manifest changes and bring v2 support back to Chromium? Or is it intentionally made super complex by Google?
Microsoft decided it was prohibitive for them. So probably overly difficult.
I would say it just works for them. Considering they show ads in the Windows Start menu now.
If Google's goal is to thwart adblockers by creating limitations on what browser extensions can do, then creating a browser extension that blocks ads within the current set of limitations is a temporary solution at best.
Google doesn't control the browser, the user does.
Google controls the APIs that extension writers can use. They are currently using that control to impose limits on what adblocker extensions can do. [1][2]

You could download the Chromium source and patch it to change the extensions APIs (or better, just use Firefox), but the majority of users won't do this, and extension writers aren't going to make a version for a patched Chromium browser unless it has significant market share and support.

[1] https://nordvpn.com/blog/manifest-v3-ad-blockers/

[2] https://www.eff.org/deeplinks/2021/12/chrome-users-beware-ma...

You could always provide an extension that loads itself as a .dll/.so. I don't see much difference in friction between adding an extension through Google's website vs. downloading setup.exe from somewhere. Of course, like you say, using less user-hostile software is preferable.
Such extensions would be trivially easy for Google to break with Chrome updates. You also cannot distribute an extension like that through any of the usual extension stores.

Better to just use a browser that actually respects its users.

That might work for highly tech savvy people, but that's a very small minority of users. Google will still make ad blocking near-impossible for 99.99% of its users.
Firefox has 2.9%. Safari has 18.12%. Everything else is Chrome or reskinned Chrome, with Chrome itself being 65.3%.

Unless you’re running that 20%, Google controls it, and they basically write the standards anymore.

Oh, of course if you run Google-written software without modifications, you're not really controlling it. So if you want to control it, either go inside and tinker with the code, or - easier? - switch to a non-Google browser.

I thought this was rather obvious, at least for those worried about experience. Do you think all those who realize they're suffering from ads don't think about using a non-Chromium browser?

I honestly don’t think they think about a non-Chromium browser, and if they do think of it, they reject it for unfounded reasons. If they did use a non-Chromium browser, Firefox would have a larger market share.
And if ad blocking doesn't work on Chrome, Firefox usage will go up.
I guess that's why uBO Lite exists :) I started using it a couple of months ago instead of uBlock Origin, and still haven't seen any ads since.

https://github.com/uBlockOrigin/uBOL-home

I think eventually there is nothing that can stop certain ads on Chrome once specific APIs are removed, even using manifest v3. Maybe someone could chime in on this, as it's really confusing now since Google keeps pushing back the date to remove manifest v2. (This might be outdated info)
We'll create a shim to render the page in the background and use AI to remove ads and then serve the result to the user, at the least. Fuck ads and malvertising
Yes and: There will be a tipping point where it'll be easier to allow the content rather than blocking the garbage. Dynamic screen scraping, more or less.
Yeah, it generally does feel like a "Catch me if you can" situation. I'm sure that there will be different ad-blockers once those APIs are removed, as there seems to be a very strong desire from some people not to see ads.

I hope we'll not end up in a DRM-like system where ads are somehow really baked in and content stops working for lay-people if they try to circumvent ads.

And that will be the day Chrome dies.
The day Google starts blocking ad blocking users is the day the exodus starts from Google services.
I think you're overestimating the number of people who 1) care and 2) use adblocking extensions or any extension for that matter.

Google knows what will likely happen, and pays people lots of money to know.

Without commenting on Google[1], I think this sort of thing is true in the short term but less true in the long term. I expect that, were Chrome to ban ad blockers, technical folks will start to teach non-technical folks in their orbit how to e.g. install Firefox to regain ad-blocking capability. I think it would take some number of years but there would be a pushback in the medium- to long-term.

1. Googler, opinion solely my own.

This is ironically how Chrome got its big push into the mainstream. Would be great if that’s how it got pushed out. But the world of influential techies, especially amongst the younger, seems to have gotten smaller. Perhaps I’m wrong
They'd massively alienate a large and motivated subset of their userbase with the ability to build viable alternatives to Google products or at least build more active means to circumvent their platform restrictions.
I think you are unfortunately correct about this.

I am consistently blown away when I inadvertently experience the Internet without ad-blocking. It’s absolute garbage.

I am sad that people are either OK with this or don’t care. For many they don’t know any better, and asking many of those same groups to install and manage plugins is a fraught request.

32.8% of global users use an ad blocker. (33% of Americans.) [1]

Chrome's market share is about 65% [2]. If their recent manifest changes eventually break ad blocking (which seems to be the goal), it'll lose a bunch of market share (I guess they're optimizing for short-term profit).

[1] https://backlinko.com/ad-blockers-users [2] https://gs.statcounter.com/browser-market-share

Why do you think everybody switched from IE to Chrome? Because their tech friends told them to or did it for them.

The day Chrome can't sufficiently block ads anymore is the day Chrome dies.

Do you remember the IE exodus to Firefox pre-2010? Yeah Google better watch its hyperback.
They learned from Microsoft's mistake: most browsers run on Chromium, while they have Firefox by the balls with their default search engine deal. Not to mention Firefox is hellbent on snatching defeat from the jaws of victory.
I don't know what you mean. They are already blocking adblock users on YouTube and there is certainly no exodus happening there. A few people complain about it and get a handful of upvotes on social media from their friends, but it hasn't even come close to rising to "backlash" status.
Are they? I block ads on YouTube and I’m still allowed to watch videos.

I suspect they have silently stopped blocking ad blockers.

I remember there were a lot of reports about this being the case, but there is no way I am not blocking Google.

I suspect that such a move would draw significant scrutiny from regulators, potentially far outweighing any impacts from users switching browsers on their own.
Real easy problem to solve by just switching back to Firefox
The first thing you see when you open Firefox is an ad for Amazon and Expedia.
I don't. Are you talking about 'sponsored shortcuts'? You can turn those off in the settings. It's on the first page you see when you hit the settings button in the top right
Isn't this the case for a bloom filter (vacuum maybe)? You can have very few rules.
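
For context on the suggestion above, a minimal Bloom filter sketch in Python. The sizes and host names are arbitrary assumptions for illustration. Note that it trades exactness for space: lookups never miss an added entry but can return false positives, and it needs code to evaluate, so whether a rules-only API could express such a lookup is a separate question.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array with k hash probes; no false negatives."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

blocked = BloomFilter()
for host in ["ads.example.com", "tracker.example.net"]:  # hypothetical hosts
    blocked.add(host)
```

The memory cost is fixed by `num_bits` no matter how many hosts are added; only the false-positive rate grows, which for an ad blocker means occasionally blocking a legitimate host.
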
> only one person has to report an ad for it to go into a crowdsourced list that blocks it for millions of others

Is it that easy? Sounds very abusable

Yes, and some list maintainers accept money to add or remove you from the list (officially, or unofficially through a secondary maintainer, depending on the list), but otherwise it's no different than getting a domain marked as malware or phishing (with a few paid editors on Phishtank or VirusTotal).

It's easier to get a domain added than removed. And as for the "corruption"/"racketeering" part, it's a "win-win" for the adblockers and the list maintainers.

Adblockers also often pay browsers to be integrated by default (AdGuard, Adblock Plus, etc), and then they negotiate with publishers to whitelist some domains (not necessarily the most obvious, can just be analytics).

"We offer your domain to be unblocked on xx millions of devices by default, this will give you a revenue uplift of +yy%"

Which lists do this? Do any of them ship with uBlock Origin?
Humans are really the primary attack vectors for any security system.
yes, one of my clients was hit by this and i was tasked with solving the situation.

i had to create a ticket in a repo explaining why blocking a whole domain instead of a single subdomain was actually pretty bad. they approved it and reverted the change.

finding where exactly i had to open the ticket and what to write was a “down the rabbit hole” experience.

Domains are cheap, don't serve content on an ad domain maybe?

Sounds like perhaps your task was to ensure a company's ads got through an adblocker?

my task was to rectify an issue in one of these crowd sourced lists of ad servers.

they were blocking a whole domain instead of blocking the ad-serving subdomain.

the issue was rectified, the main domain was replaced by the ad-serving subdomain.
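
The difference between the two list entries can be sketched as follows. This is a simplified model of hostname matching in Python with hypothetical domains, not the actual logic of any particular blocker or list.

```python
def is_blocked(hostname: str, blocked_entries: set[str]) -> bool:
    """True if hostname equals a blocked entry or is a subdomain of one."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check the hostname itself, then each parent domain in turn.
    return any(
        ".".join(labels[i:]) in blocked_entries for i in range(len(labels))
    )

too_broad = {"example.com"}     # hypothetical: the whole domain is listed
targeted = {"ads.example.com"}  # hypothetical: only the ad-serving subdomain

site_broken = is_blocked("www.example.com", too_broad)   # main site caught too
site_works = not is_blocked("www.example.com", targeted) # main site unaffected
ads_blocked = is_blocked("ads.example.com", targeted)    # ads still blocked
```

With the broad entry, every subdomain of the site matches; with the targeted entry, only the ad-serving host and anything beneath it does.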

Still, as pbhjpbhj suggested, if I were publishing both content and ads, I would consider publishing the ads on a different domain (not just a subdomain) to reduce technical issues. Domains with ugly names are very cheap.
of course, and this is a valid proposal. but that was outside the remit.
You could be right but you are definitely jumping to a conclusion here.

The default lists used by uBlock for example include things like error tracking telemetry, Sentry for example.

I can see why people want to block that stuff (privacy) but it’s not exactly an “ad”

Yes, but the effects of that abuse are observable and easily fixable. If suddenly a whole site goes offline for a bunch of people a change like that is likely to get reversed very quickly.
there is an entire section in the paper subtitled "Comparison with uBlock Origin".
practical solutions don't get you published
"Practical solutions" also leave you vulnerable to cat and mouse games against sites that block or bypass adblockers (even with ublock origin). The end game is to have heuristic/AI adblocking which would directly hook into browser rendering so that it becomes undetectable. Obviously leading browsers do not support this for extensions, but forking Chromium wouldn't be so hard.
"doing thing X works and everyone uses it, so bad actors invest time against thing X. While thing Y isn't used by anyone so bad actors aren't spending time to work around it, q.e.d. we prove thing Y is better".

i don't really buy your argument

The argument is that Y is more robust.