by diplocorp 2662 days ago
Do you mind shedding light on how such click fraud is detected?
2 comments

The most usual technique is to set up click baits/traps: once you click on a trap link, you (= IP or UID via cookie) are added to an ignore list, and none of your actions are invoiced to advertisers. Simple, and it works.
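A minimal sketch of the trap/ignore-list idea described above. The class name, the link IDs and the visitor ID are all made up for illustration; a real ad server would key on IP or a cookie UID and persist the list somewhere.

```python
class TrapFilter:
    """Marks visitors who clicked a trap link so their clicks are not invoiced."""

    def __init__(self, trap_link_ids):
        self.trap_link_ids = set(trap_link_ids)  # hypothetical trap link IDs
        self.ignored_visitors = set()            # IPs or cookie UIDs

    def record_click(self, visitor_id, link_id):
        """Return True if this click should be invoiced to the advertiser."""
        if link_id in self.trap_link_ids:
            # Clicking a trap puts the visitor on the ignore list.
            self.ignored_visitors.add(visitor_id)
            return False
        return visitor_id not in self.ignored_visitors


trap_filter = TrapFilter(trap_link_ids={"trap-1"})
print(trap_filter.record_click("visitor-a", "ad-42"))   # ordinary click
print(trap_filter.record_click("visitor-a", "trap-1"))  # trap click
print(trap_filter.record_click("visitor-a", "ad-42"))   # visitor is now ignored
```

Note that this only affects billing; nothing here stops the visitor from being tracked.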
This actually sounds ideal for me too. Please ignore me and stop tracking me.
They didn't say they stopped tracking you, just that they didn't charge the advertisers.
But in that case it still looks like an effective way of fighting that business model.
Or a way to receive more aggressive advertising, or "you are using AdNauseam" overlays.
Aggressively advertising to someone you know hates ads is stupid. They'll just boycott you out of spite.
LOL. Of course they're still charging the advertisers.
That sounds correct: don’t pay for ad placements, but do charge advertisers. Right up there with the usual adtech morals.
Of course, traps are not the only technique. Other things that get checked: whether you use a datacenter IP, which country/location you are in, and whether your browser footprint matches what you send in your headers and user agent. Some may even validate whether you move the mouse like a human (check how reCAPTCHA works). Also take into account that adtech companies have a lot of statistics to analyze, so a single click will not be detected, but thousands likely will be.
This is effective. When all is said and done, ad marketplaces stop bidding on your pageviews (at least, the quality ones do). Over time fewer networks want your impressions and the publisher ends up seeing a worse RoI on ads.
What would happen if there were a popular extension that shared UID cookies between all its users?
That’s a great idea :)
How do you set an ad trap? A Selenium bot only clicks on what a user can see... I doubt you could notice the difference.
This extension doesn't use Selenium. Plus, that's not entirely true: Selenium sees the HTML and DOM while a user sees the final render, and there are ways to hide content from a user while showing it to Selenium-style bots.
I thought Selenium would throw an exception when the element being clicked is not actually visible.
If “ignore all clicks from a user that clicks >3 ads on a page” isn’t good enough for an ad network, it can add three or four ‘ads’ that technically are visible, but the same color as the page background. If a user clicks a few of them, ignore all clicks from that user on that page.
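The "more than 3 clicks on a page" rule above could be sketched roughly like this. The class and method names are hypothetical, and the threshold of 3 is just the number from the comment, not a known industry value.

```python
from collections import Counter


class PageClickLimiter:
    """Discards all of a user's clicks on a page once they exceed a threshold."""

    def __init__(self, max_clicks=3):
        self.max_clicks = max_clicks
        self.clicks = Counter()  # (user_id, page_id) -> number of ad clicks

    def record_click(self, user_id, page_id):
        self.clicks[(user_id, page_id)] += 1

    def billable_clicks(self, user_id, page_id):
        """Return the click count to invoice, or 0 if the user went over the limit."""
        count = self.clicks[(user_id, page_id)]
        return 0 if count > self.max_clicks else count


limiter = PageClickLimiter(max_clicks=3)
for _ in range(4):
    limiter.record_click("user-1", "page-1")
print(limiter.billable_clicks("user-1", "page-1"))  # over the limit, so nothing is billed
```

Background-colored decoy 'ads' would simply be extra link IDs feeding the same counter.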

AdNauseam could detect that, too, but it gets exceedingly hard, slowing down the user’s browser. So, I think the ad network can win that battle.

Does it? I don't think you can reliably identify whether something is visible if the other site, which controls the CSS, does not want you to. It's a classic arms race situation.

    <span style="opacity: 0.001">Trap!</span>
Selenium can reliably do that, as it is a proper web browser controlled through an API.
Yes, but it is not a human eye and brain.
Except that publishers will compare their own GA or MOAT data against what you’re reporting and wonder why the hell you’re reporting significantly fewer impressions than other networks and their own tools.
But how do you discern real users from fake ones? Do you search for specific plugins? And what about bots which are using headless sessions?
First, as varelaz says, one important criterion is your ISP. MaxMind provides information on whether you are "Corporate" or "Residential". Generally, when you are Corporate / Datacenter, you get put into a low-quality tier, or even get no ads at all on some networks.
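As a rough sketch of the tiering idea: the snippet below does not use MaxMind's database or API at all. It stands in a hard-coded list of networks (RFC 5737 documentation ranges, chosen purely as placeholders) for whatever ISP-type lookup a network really uses, and the tier names are made up.

```python
import ipaddress

# Placeholder "datacenter" ranges (RFC 5737 documentation blocks), NOT real data.
DATACENTER_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def traffic_tier(ip_string, datacenter_networks=DATACENTER_NETWORKS):
    """Return a hypothetical quality tier for the given client IP."""
    address = ipaddress.ip_address(ip_string)
    if any(address in network for network in datacenter_networks):
        return "low-quality"
    return "standard"


print(traffic_tier("203.0.113.9"))  # falls inside a placeholder datacenter range
print(traffic_tier("192.0.2.1"))    # not in the list
```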

Users following invisible links are definitely bots, but otherwise the main idea is to verify the coherence of the headers, and to verify whether there is a difference between the browser's theoretical capabilities and reality.
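A sketch of what "theoretical capabilities vs. reality" could look like in code. It assumes some client-side probe (not shown) reports the names of global objects it found. The rule table is illustrative only and far too crude for real traffic; the function names are hypothetical.

```python
# Illustrative rules only: globals a client claiming a given family should or should not expose.
EXPECTED = {
    "chrome": {"must_have": {"chrome"}, "must_not_have": {"ActiveXObject"}},
    "ie": {"must_have": {"ActiveXObject"}, "must_not_have": {"chrome"}},
}


def claimed_family(user_agent):
    """Very rough guess at the browser family the User-Agent header claims."""
    ua = user_agent.lower()
    if "trident" in ua or "msie" in ua:
        return "ie"
    if "chrome" in ua:
        return "chrome"
    return None


def is_coherent(user_agent, observed_globals):
    """Return False when observed globals contradict the claimed browser family."""
    family = claimed_family(user_agent)
    if family is None:
        return True  # no rule for this family in the sketch
    rule = EXPECTED[family]
    observed = set(observed_globals)
    return rule["must_have"] <= observed and not (rule["must_not_have"] & observed)


print(is_coherent("Mozilla/5.0 Chrome/70.0 Safari/537.36", {"ActiveXObject"}))
```

The design choice here is to look for contradictions rather than to prove a client is genuine: one mismatch is enough to downgrade the traffic.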

The behaviour is not so important because advertising networks generally have frequency capping support per IP/UID.

A long time ago, lots of fraud bots used COM/MSHTML interfaces (like https://docs.microsoft.com/en-us/previous-versions/windows/i... ), so even if a bot declared itself as Chrome, it was obviously IE.

Nowadays the fraud happens more through Android WebViews.

It's very easy to distinguish two browsers, and browsers that declare themselves "no tracking" are even easier to track in real-life scenarios because their signature is very different.

Take two iOS Safari browsers on the same 3G network: it's very difficult to differentiate them. But take two Brave browsers and it's quite easy to track the user.

CasperJS/PhantomJS/Selenium bots usually run with the default resolution, and they leak some JavaScript properties like window._phantom, window.Buffer, window.emit or window.webdriver (Selenium).
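Server-side, a check for those leaks might look something like the sketch below. It assumes a client-side script (not shown) has already reported the names of the window properties it found plus the screen resolution. The function name is made up, and the "default" resolutions are a caller-supplied placeholder since the comment above does not say what they are.

```python
# Property names taken from the comment above.
AUTOMATION_MARKERS = {"_phantom", "Buffer", "emit", "webdriver"}


def automation_signals(reported_globals, resolution, default_resolutions):
    """Return a list of reasons to suspect an automated browser (empty if none)."""
    signals = sorted(AUTOMATION_MARKERS & set(reported_globals))
    if resolution in default_resolutions:
        signals.append("default-resolution")
    return signals


# (800, 600) is only a placeholder, not a claim about any tool's actual default.
print(automation_signals(["document", "_phantom"], (800, 600), {(800, 600)}))
print(automation_signals(["document"], (1440, 900), {(800, 600)}))
```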

Sounds like I'd prefer a way to get my IP designated as corporate... Mediacom, wanna help? (haha)
Mission accomplished!

or xkcd #810

To add to what has been said in the other comments, even a simple algorithm and some stats can detect that. You don't even need a machine learning model. It's rare that someone clicks on ads, rarer still several times on the same one, and even rarer several times every day. The behavior of such a user will just look like an aberration on the chart, with hundreds of times more clicks than the next maximum.
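For what it's worth, the "no ML needed" version can be sketched in a few lines. The function name, the factor of 100 and the sample numbers are all invented for illustration; a real network would pick its own threshold from its own data.

```python
from statistics import median


def click_outliers(clicks_per_user, factor=100):
    """Return users whose click count is at least `factor` times the median."""
    if not clicks_per_user:
        return []
    typical = median(clicks_per_user.values())
    # Guard against a median of 0, which is likely since most people never click ads.
    threshold = max(typical, 1) * factor
    return sorted(user for user, clicks in clicks_per_user.items() if clicks >= threshold)


# Made-up daily click counts.
daily_clicks = {"a": 0, "b": 1, "c": 2, "d": 1, "clicker": 4000}
print(click_outliers(daily_clicks))
```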
Maybe I misinterpret, but this argument from you and some others that "this is easily detectable" is _exactly_ the good part. They want to ignore me? Mission accomplished. They want to discard my clicks and subtract them from the payout? Mission accomplished. Etc. In any case they are playing the game for me.

"Good day sir, you lose, I said good day!"