by doctorpangloss 2658 days ago
Before anyone thinks this is a Eureka anti-ad-blocking technology: Clearly you still need client-side javascript, distributed by the mediator, to ensure that the impression is actually delivered and the click is actually registered.

Otherwise, obviously, the server could just maliciously record impressions/clicks.

Then, logically, if uBlock Origin doesn't remove the ad, but does successfully remove the mediator's script, the server can never book the impression. So why waste precious bandwidth (actually INCREASING the cost of ad delivery for the publisher) delivering an ad you can never be paid for? Boggles the mind.

Embedding the ad into the video is more akin to a native ad, which is generally understood by the advertiser to not have measurable conversion and to be strictly context (as opposed to user) targeted.

We are going full circle--that is, back to the beginning--of ad technology.

12 comments

> Clearly you still need client-side javascript, distributed by the mediator, to ensure that the impression is actually delivered and the click is actually registered.

There's no point to client-side JavaScript: The baddies just write JavaScript that rewrites basic objects using Object.defineProperty so that document.visibilityState always says so (and so on), or that lie to the visibility sensor. Or they just make a whole fake web browser that runs on a Server. You are in an arms-race, and verification companies simply can't/don't do a very good job.

> Otherwise, obviously, the server could just maliciously record impressions/clicks.

I offer ads† to publishers server-side via XML or JSON, and they can stitch them into the page however they want, and I've been doing this for years.
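
A minimal sketch of what stitching a JSON ad into a page on the server could look like. The payload fields (`headline`, `clickUrl`, `sponsor`), the `<!-- AD_SLOT -->` placeholder, and the example URL are assumptions for illustration, not the format of any real ad platform:

```javascript
// Escape text before placing it into HTML so ad copy cannot inject markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical ad payload shape: { headline, clickUrl, sponsor }.
function renderAd(ad) {
  return (
    `<a class="sponsored" href="${escapeHtml(ad.clickUrl)}">` +
    `${escapeHtml(ad.headline)} (sponsored by ${escapeHtml(ad.sponsor)})</a>`
  );
}

// Replace a placeholder in the publisher's own template with the rendered ad.
// A replacer function is used so "$" in ad copy is not treated specially.
function stitch(pageTemplate, ad) {
  return pageTemplate.replace("<!-- AD_SLOT -->", () => renderAd(ad));
}

const adJson =
  '{"headline":"Try Acme <Pro>","clickUrl":"https://publisher.example/out?c=42","sponsor":"Acme"}';
const page = stitch(
  "<main><p>Article text</p><!-- AD_SLOT --></main>",
  JSON.parse(adJson)
);
console.log(page);
```

Because the ad arrives as first-party markup from the publisher's own server, there is no separate ad-domain request for a blocker to filter in this sketch.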

My publishers often get paid by click, but some of the more valuable ads are paid on referral. Occasionally I see a CPM/CPD deal go through, but it's usually with a larger publisher whose traffic sources I can understand. I won't help anyone do a CPM/CPD deal unless I understand their traffic.

You're right that it's much too easy to buy traffic from e.g. Google and spray it at my impression and click trackers, which is why I don't rely on them: Adblock and uBlock can remove the trackers all they want, but their users still get ads from my platform, and my publishers will still get paid.

†: Strictly speaking: I offer a platform for publishers, and often help my customers get introductions/recommendations/connections to advertisers who believe in data-driven online sponsorship.

> Embedding the ad into the video is more akin to a native ad, which is generally understood by the advertiser to not have measurable conversion

Server-side stitching can be done in real time, bespoke, and with standard VAST tags. Not everyone is doing this, because DASH/HLS are simpler and still "good enough".

> and to be strictly context (as opposed to user) targeted.

I've worked with several (big) brands who have done completers and demo-guaranteed video campaigns. There is absolutely user-targeting in video.

> There's no point to client-side JavaScript: The baddies just write JavaScript that rewrites basic objects using Object.defineProperty so that document.visibilityState always says so (and so on), or that lie to the visibility sensor. Or they just make a whole fake web browser that runs on a Server. You are in an arms-race, and verification companies simply can't/don't do a very good job.

I agree it's an arms race, but why do you think it favors the attacker? Bot/spam detection is incredibly important, and the folks I've worked with in spam detection are really good at what they do.

(Disclosure: I work on ads at Google, though not in spam. Speaking only for myself.)

> There's no point to client-side JavaScript: The baddies just write JavaScript that rewrites basic objects using Object.defineProperty so that document.visibilityState always says so (and so on), or that lie to the visibility sensor. Or they just make a whole fake web browser that runs on a Server. You are in an arms-race, and verification companies simply can't/don't do a very good job.

You cannot overwrite javascript properties in frames from another domain, right? Am I missing something?

A fake web browser requires a lot of IP addresses. Widespread abuse seems hard to me, especially when combined with Google's hidden "I'm not a robot" thingy.

> You cannot overwrite javascript properties in frames from another domain, right? Am I missing something?

You don't need to.

The SSP or publisher can slip the naughty JavaScript directly into the ad tag.

> A fake webbrowser requires a lot of IP addresses.

You may be surprised to learn there's a market for buying IP addresses, and they're cheaper than the revenue a bad actor can gain from using them.

There's also a lot of toolbars that embed some limited tunnelling functionality that they can then resell.

There's also a market for hacked DSL routers that you can tunnel through.

You can't use ReCaptcha (or any captcha) for ads. Captchas work because they prevent access to content users want until they solve the captcha.

If you put ads behind a captcha? Well in all honesty you're just doing a service to the user by hiding the ads behind a captcha they're never going to solve (even if they are not robots) because it's not in their best interest to do so.

> You can't use ReCaptcha (or any captcha) for ads. Captchas work because they prevent access to content users want until they solve the captcha.

If you've used ReCaptcha in the past few years [1] you might have noticed it often doesn't ask you to solve a captcha. The parent is describing using a similar approach of detecting bots to identify ad impressions that shouldn't be counted (spam).

[1] https://security.googleblog.com/2014/12/are-you-robot-introd...

(Disclosure: I work at Google in ads, though not in spam.)

There is a hidden "I'm not a robot" "captcha". You might use that to help detect whether the impression/view/click was legit.

https://developers.google.com/recaptcha/docs/invisible

You can programmatically invoke the challenge from the ad's javascript.

If you follow that train of thought to its logical (if perverse) conclusion, we can soon expect ads as the subject matter of captcha.

Instead of selecting three pictures that have a given "thing" in them, we'll be picking the ones showing a given brand among otherwise generic signs.

I've seen some websites that do that, i.e., you watch a short ad and then type in the brand name from the ad.

So life imitates art, again. Too bad the artist is a dystopian dadaist.

> Before anyone thinks this is a Eureka anti-ad-blocking technology: Clearly you still need client-side javascript, distributed by the mediator, to ensure that the impression is actually delivered and the click is actually registered.

I don't suppose I understand why client-side JavaScript is needed here. The server-side code could simply generate a unique hash for every visitor and include that in the campaign link. Then, server-side code on the receiving end can read this hash, record a unique hit, and monitor the user on the campaign landing page to see if a lead is generated.

This seems obvious to me, but I don't actually work in advertising. Where is the break in this system? What am I missing that allows this to be exploited, in a way that only client-side JavaScript can fix?

EDIT: In context, I've realized that my proposed solution might work for clicks, but would do nothing for tracking impressions. Hrm. I'm not really sure if that problem is solvable. Then again, I'm also not a fan of impression-based ad tracking (it feels creepy), so maybe I don't mind if it remains broken.
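
The per-visitor hash idea for clicks could be sketched roughly as below. This is only an illustration under stated assumptions: `ClickLedger`, the campaign ID, and the landing URL are hypothetical, the store is in memory, and `Math.random` stands in for a proper cryptographically secure token source:

```javascript
// In-memory ledger of issued click tokens. A real system would persist these
// and use a cryptographically secure random source; Math.random is a stand-in.
class ClickLedger {
  constructor() {
    this.issued = new Map(); // token -> { campaignId, redeemed }
    this.counter = 0;
  }

  // Called when the server renders an ad: returns a token to embed in the link.
  issue(campaignId) {
    this.counter += 1; // the counter guarantees tokens differ within this ledger
    const token = `${this.counter}-${Math.random().toString(36).slice(2)}`;
    this.issued.set(token, { campaignId, redeemed: false });
    return token;
  }

  // Called by the landing-page server: counts each token at most once.
  redeem(token) {
    const entry = this.issued.get(token);
    if (!entry || entry.redeemed) return false; // unknown or replayed token
    entry.redeemed = true;
    return true;
  }

  // Number of distinct tokens redeemed for a campaign.
  uniqueClicks(campaignId) {
    let n = 0;
    for (const e of this.issued.values()) {
      if (e.campaignId === campaignId && e.redeemed) n += 1;
    }
    return n;
  }
}

const ledger = new ClickLedger();
const tokenA = ledger.issue("spring-sale");
const tokenB = ledger.issue("spring-sale");
const link = `https://advertiser.example/landing?t=${encodeURIComponent(tokenA)}`;
console.log(link);
// First redemption counts; the replay and the forged token do not.
console.log(ledger.redeem(tokenA), ledger.redeem(tokenA), ledger.redeem("forged"));
console.log(ledger.uniqueClicks("spring-sale"));
```

Note that this only deduplicates clicks. It says nothing about whether a human made them, and nothing about impressions, which is the gap the EDIT points out.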

A couple of pointers in answer:

- Video advertising is not satisfied with impression and click metrics alone. In general, advertisers want to know whether their video played while in view of a human (or at least in view on a screen), and for how long it played (or at the very least a rough estimate, such as how many progress markers, like the midpoint, were reached).

- The concept of an "impression" is often not very well defined, but counting it as "the server served a request with the video payload" is far too optimistic and leaves big holes for fraudsters to exploit. Requiring client-side JavaScript to run at least demands additional software, and therefore additional cost (if minimal), from the fraudster.

- You can't really "just" monitor the user on the campaign landing page: different sites are involved, with different cookies. Reconciling them is doable, but it requires work that the advertiser may not be willing or able to do.
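
To make the "how long it played" point concrete, here is a rough sketch of mapping watch time onto the progress events commonly used in VAST tracking (`start`, `firstQuartile`, `midpoint`, `thirdQuartile`, `complete`). The function name and the assumption that playback actually began are mine, not from any particular player:

```javascript
// VAST-style progress events and the fraction of the video each one marks.
const PROGRESS_EVENTS = [
  ["start", 0],
  ["firstQuartile", 0.25],
  ["midpoint", 0.5],
  ["thirdQuartile", 0.75],
  ["complete", 1],
];

// Returns the events reached after `watchedSeconds` of a `durationSeconds` video.
// Assumes playback actually began, so "start" is always included.
function progressEventsReached(watchedSeconds, durationSeconds) {
  if (!(durationSeconds > 0)) throw new RangeError("duration must be positive");
  // Clamp to [0, 1] so bad inputs cannot report more than a full view.
  const fraction = Math.min(Math.max(watchedSeconds / durationSeconds, 0), 1);
  return PROGRESS_EVENTS.filter(([, at]) => fraction >= at).map(([name]) => name);
}

console.log(progressEventsReached(16, 30));
console.log(progressEventsReached(30, 30));
```

These events are reported by the client, which is exactly why they need client-side code and why a server-only count of "payload served" cannot substitute for them.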

If all script appears to come from the website (advertiser scripts routed via the website with random filenames), it'll probably be pretty hard to filter that kind of behavior without breaking other websites.

Sure! You still need client-side JavaScript, though.

Threat models:

* The ad server outright telling lies to get paid for nothing.

* The ad server not being trusted to validate that views are legitimate.

Client-side JavaScript can probe the DOM and execution environment for abnormalities indicative of automation.
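
As a rough illustration of such probing, the sketch below checks a navigator-like object for a few commonly cited automation hints. The choice of signals and the hint labels are assumptions for illustration, and each of them can be spoofed by a determined attacker:

```javascript
// Inspect a navigator-like object for common automation hints.
// The chosen signals and their labels in the result are illustrative only.
function automationHints(nav) {
  const hints = [];
  // navigator.webdriver is set to true by WebDriver-controlled browsers.
  if (nav.webdriver === true) hints.push("webdriver-flag");
  // Some headless Chrome builds have advertised themselves in the user agent.
  if (/HeadlessChrome/.test(nav.userAgent || "")) hints.push("headless-user-agent");
  // An empty language list is unusual for a browser used by a person.
  if (!nav.languages || nav.languages.length === 0) hints.push("no-languages");
  return hints;
}

// In a browser this would be automationHints(navigator); plain objects are
// used here so the sketch also runs outside a browser.
const ordinary = { webdriver: false, userAgent: "Mozilla/5.0 Chrome/120.0", languages: ["en-US"] };
const scripted = { webdriver: true, userAgent: "Mozilla/5.0 HeadlessChrome/120.0", languages: [] };
console.log(automationHints(ordinary));
console.log(automationHints(scripted));
```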

Yes, but as was said earlier, client-side JavaScript can also be blocked. Everything has downsides.

It was posited that client-side JavaScript was needed, and the reply was that they couldn't see why it was 'needed'. I agree with that: I can see why it might be wished for, but you do not need it to record clicks. You do need it for other things, or to improve your understanding of the clicks.

Basically any solution is going to be composed of many pieces, all of those pieces susceptible to attack in different ways.

If client-side JavaScript is blocked, then just don't pay for that impression.

But why can't a malicious server also serve up some JS that modifies the behaviour of the JS served by the ad network?

You don't need a malicious server.

You can use Google to do this, and you'll get a Google-branded domain name for your object-hijacking JavaScript. The number of times I've seen something like document.visibilityState='visible' in people's ads (or ad wrappers) is astounding.

Isn't document.visibilityState a read-only property?

https://developer.mozilla.org/en-US/docs/Web/API/Document/vi...

No.

It is not.

    Object.defineProperty(document, 'visibilityState', { value: "visible", writable: false })
demonstrates trivially that the documentation is clearly wrong.

Maybe it says it's "read-only" because Google wants bad guys to do this sort of thing, since it makes advertisers buy more ads from them.

Or maybe it's an honest mistake that neither Mozilla, nor Google (nor Microsoft or anyone else it seems) has any idea what "read-only" means.
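
For anyone following along outside a browser, the distinction can be sketched with a stand-in object. `FakeDocument` is hypothetical and assumes the real property is likewise a getter without a setter on the prototype. Plain assignment fails, but defining an own property on the instance shadows the getter:

```javascript
// Stand-in for the browser's Document: a getter with no setter on the prototype.
// (FakeDocument is hypothetical; real code would target `document`.)
class FakeDocument {
  get visibilityState() {
    return "hidden";
  }
}

function tryAssign(doc) {
  "use strict";
  try {
    doc.visibilityState = "visible"; // no setter: throws TypeError in strict mode
    return "assigned";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

const doc = new FakeDocument();
const assignResult = tryAssign(doc);
const afterAssign = doc.visibilityState; // the getter is untouched by the failed assignment

// Defining an own data property on the instance shadows the prototype getter.
Object.defineProperty(doc, "visibilityState", { value: "visible", writable: false });
const afterDefine = doc.visibilityState;

console.log(assignResult, afterAssign, afterDefine);
```

So "read-only" here only describes what assignment does. It does not stop script running in the same page from redefining the property.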

They try to do that sometimes, but it's rather harder to claim innocence for that than for other methods.

The server could track it based on the fact it delivered the content to the end user with the same unique hash present, no?

You mean like a tracking pixel? The issue here is that all requests to the ad domain can be blocked by the ad blocker.

But for giant sites like NY Times or the like, you can solve this through commercial agreements and rights to audit.

> We are going full circle--that is, back to the beginning--of ad technology.

I guess they will end up serving ads and "content" from the same site, but this site will be controlled directly by the advertisers, not by the publishers.

AMP

> Otherwise, obviously, the server could just maliciously record impressions/clicks.

I'd imagine that advertisers would just bake it into the cost of the impression like they do today for click fraud. In fact, I think this type of fraud is much preferable to click fraud.

That's not what native ads are. They are about following the form and behavior of the surrounding content and can have all the same complex interaction and conversion tracking as any other ad.

Also, video players with embedded ads have been around for years. The technology is already more advanced than this and is just starting to roll out. Despite what you might think, there are plenty of checks against fraud and any publisher that does such blatant video fraud will get caught very quickly.

Clearly we need an AI/ML that will actively scan any video footage and cut out anything that looks like an ad.

Then people might actually find out what is and isn't an ad.

It will start cutting out actual original footage at some point, because non-obvious ads are already there from the beginning, mixed with primary content in a way that can't easily be separated.

What if there existed a standard protocol to (automatically) interact with websites' CMSs for the buying and selling of ad space via an API?

Maybe even programmatically allowing the upload of images or videos? (Maybe even ADsafe [0] scripts?)

[0] https://github.com/douglascrockford/ADsafe

Honestly, I'd be happy with that - the biggest problem with ads today is that they're a major source of tracking and malware.

The server could continue to serve ads until the client reports back that one ad was rendered. If the message is blocked, the content would never be delivered, thus making ad blockers content blockers. I'm not saying this is a good thing or a bad thing. I'm just saying there are more cats and more mice out there.
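
A toy sketch of that gating logic, assuming a hypothetical `AdGate` class, string session IDs, and an in-memory store. A real server would also have to decide how much to trust the render report itself:

```javascript
// Tracks, per session, whether the client has confirmed rendering an ad.
class AdGate {
  constructor() {
    this.sessions = new Map(); // sessionId -> { adsServed, confirmed }
  }

  // Look up a session's state, creating it on first use.
  _session(id) {
    if (!this.sessions.has(id)) {
      this.sessions.set(id, { adsServed: 0, confirmed: false });
    }
    return this.sessions.get(id);
  }

  // Client beacon: "an ad was rendered". If the beacon is blocked, this is never called.
  confirmRender(id) {
    this._session(id).confirmed = true;
  }

  // Decide what to send next: keep serving ads until a render is confirmed.
  next(id) {
    const s = this._session(id);
    if (s.confirmed) return "content";
    s.adsServed += 1;
    return "ad";
  }
}

const gate = new AdGate();
console.log(gate.next("visitor-1")); // no confirmation yet, so an ad is served
gate.confirmRender("visitor-1");
console.log(gate.next("visitor-1")); // confirmed, so content is served
console.log(gate.next("visitor-2")); // a session that never confirms keeps getting ads
```
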
It's entirely possible to stitch a targeted advertisement into a video file. This happens.

Are ads only sold CPC these days? No CPM ads?

CPM ads are still out there, for brand awareness. But you'll be paying for ads that no one clicks, so most advertisers go for CPC.