| HN Mirror

by singron 2624 days ago

There are multiple types of fraud. One is bots that generate fake impressions, but another is fraudulent publishers that place ads improperly (e.g. overlapping ads or invisible ads). In the second type, the user is legitimate, so you can't entirely rely on something that identifies illegitimate users. I think this is one reason why ads aren't always sandboxed in iframes: you need a way to detect whether the ad is actually visible in the root frame.
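To make the placement checks concrete, here is a minimal sketch of the geometry involved, assuming the ad and viewport rectangles are already known in root-frame coordinates. The `Rect` type and the function names are hypothetical, and real measurement code runs in the browser and has to deal with scrolling, nested frames, and occlusion by non-ad elements, none of which is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in root-frame pixel coordinates (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float

    @property
    def area(self) -> float:
        return max(self.width, 0.0) * max(self.height, 0.0)


def intersection_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles, or 0.0 if they do not overlap."""
    w = min(a.left + a.width, b.left + b.width) - max(a.left, b.left)
    h = min(a.top + a.height, b.top + b.height) - max(a.top, b.top)
    return max(w, 0.0) * max(h, 0.0)


def visible_fraction(ad: Rect, viewport: Rect) -> float:
    """Fraction of the ad's area inside the viewport; zero-sized ads count as invisible."""
    if ad.area == 0:
        return 0.0
    return intersection_area(ad, viewport) / ad.area


def overlapping_pairs(ads: list[Rect]) -> list[tuple[int, int]]:
    """Index pairs of ads whose rectangles overlap each other."""
    return [
        (i, j)
        for i in range(len(ads))
        for j in range(i + 1, len(ads))
        if intersection_area(ads[i], ads[j]) > 0
    ]
```

A publisher stacking two ad slots on top of each other would show up in `overlapping_pairs`, and a 0x0 or off-screen slot would get a `visible_fraction` of 0.0. Neither check says anything about whether the user is a bot, which is the point of the comment above.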

Behavior tracking is difficult since it's hard to say that a legitimate user will never do something. E.g. large ISP NATs thwart IP tracking by giving many customers the same IP, and Safari blocks third-party cookies.

Google has a somewhat well-known bot countermeasure called BotGuard that does a decent job of proving that you are probably running an entire browser, but that only marginally increases the cost of fraud, to that of running a browser instance per bot. Increasing per-impression cost for fraudsters can put them out of business, but increasing per-impression cost to detect fraudsters can put advertisers out of business.

Also, ad-targeting is often a realtime problem. You have to decide what ad, if any, to show within milliseconds. Do you never show ads to unrecognized users? How much turnaround time will you need before you can precompute a profile and start showing ads to a legitimate user? How much turnaround time do you need for detecting and blocking fraud?
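The deadline pressure can be sketched as follows. This is only an illustration under stated assumptions: `ask_bidder` is a hypothetical stand-in for a network call to an ad source, the delays and bids are made up, and "highest bid that arrives in time wins" is a simplification, not a description of any real ad system.

```python
import asyncio


async def ask_bidder(name: str, delay_s: float, bid: float) -> tuple[str, float]:
    """Hypothetical ad source: sleeps to simulate network latency, then returns its bid."""
    await asyncio.sleep(delay_s)
    return name, bid


async def choose_ad(bidders, deadline_s: float):
    """Return the best (name, bid) that arrived before the deadline, or None."""
    if not bidders:
        return None
    tasks = [asyncio.create_task(ask_bidder(*b)) for b in bidders]
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    # Anything still running at the deadline is dropped, however good its bid is.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    responses = [task.result() for task in done]
    if not responses:
        return None  # nobody answered in time: show no ad (or a fallback)
    return max(responses, key=lambda r: r[1])
```

With a 100 ms deadline, a bidder that answers in 10 ms wins even if a slower bidder would have paid more. Fraud checks on the request path face the same budget, which is why slow detection has to happen offline.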

Unfortunately, specific countermeasures aren't often published, since one of the greatest costs of ad fraud is figuring out and then circumventing countermeasures. E.g. you might have a hard time reverse engineering something faster than it's being engineered by 20 people at Google.

rightbyte 2624 days ago

"Also, ad-targeting is often a realtime problem."

Surely Google are caching a queue of ads for each user and similarly for "random unknown user"? Why would this have to be real time?

aggronn 2624 days ago

Programmatic advertising is 100% a real-time, per-request bidding process. There is no queue of ads. Virtually all banner advertising on the web now is done this way.

Macha 2624 days ago

Just because it's Google's code on the publisher page doesn't mean it's Google's customer's ad that shows up on the page. It's entirely possible a third party is willing to pay more than any of Google's own customers, so the impression is auctioned among both Google's customers and Google's partners (who in turn auction it among their own customers).
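A rough sketch of that nested auction, under loud assumptions: the function names are hypothetical, bids are plain numbers, the highest bid simply wins, and pricing rules, fees, and floors are not modeled. It only shows the structure of partners running their own auction and forwarding a winner.

```python
def run_auction(bids: dict[str, float]):
    """Return (winner, bid) for the highest bid, or None if nobody bid."""
    if not bids:
        return None
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


def exchange_auction(direct_bids: dict[str, float],
                     partner_bids: dict[str, dict[str, float]]):
    """Auction one impression among direct customers and partners.

    Each partner first runs its own auction among its customers and
    submits only that winner to the outer auction.
    """
    candidates = dict(direct_bids)
    for partner, customer_bids in partner_bids.items():
        inner = run_auction(customer_bids)
        if inner is not None:
            inner_winner, inner_bid = inner
            candidates[f"{partner}/{inner_winner}"] = inner_bid
    return run_auction(candidates)
```

For example, `exchange_auction({"a": 1.0, "b": 1.5}, {"p1": {"x": 2.0}})` picks the partner's customer `p1/x`, even though the code on the page belongs to the exchange. Because every bid depends on the user and the page at that moment, none of this can be queued up ahead of time.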

Also advertisers often want to do dynamic stuff too. Or may be willing to pay more for the same user in different contexts. Or utterly unwilling to have their ad on sites with UGC. And you don't know where the user will show up next.
