

by wakatime 1951 days ago

I'm very selective with the external scripts allowed on my websites. Ad networks are notorious for running malicious JavaScript on popular sites like the NYTimes[1] and Yahoo[2] home pages. Any plans for an API so sites can receive ad content as JSON and display it without ever executing your external JavaScript? I might consider it for future side projects if I could npm install your client library instead of including an external script tag.

[1] https://www.nytimes.com/2009/09/13/business/media/13note.htm...

[2] https://www.washingtonpost.com/news/the-switch/wp/2014/01/04...

4 comments

pbalau 1950 days ago

> Any plans for an API so sites can receive ad content as JSON and display it without ever executing your external JavaScript?

That will be a very easy target for faking impressions...


onion2k 1950 days ago

That's trivial to solve though. Just don't pay per impression for users who opt for that method of delivery. Pay for clicks instead.


franga2000 1949 days ago

That will be a very easy target for faking clicks...


wakatime 1950 days ago

It's the same JavaScript running, just hosted on your domain instead of externally. The JS shouldn't support eval, which is unfortunately a common way to display ads in networks with embedded external scripts. Version updates can go through your review too.


mike_d 1950 days ago

The current state of the art in ad fraud detection has basically become "here is a bunch of weird random stuff, let's see if you get the right answer." That stuff is delivered by dynamic JavaScript.

What you are proposing is AMP Ads, which is universally hated by advertisers and publishers.


hansvm 1950 days ago

Speaking as someone who knows approximately nothing about ad fraud, what additional protections exist? The scheme you described could be easily thwarted by appropriately sandboxing their script to modify a shadow DOM instead of the real one (and countermeasures like checking the page once in a while would just as well apply to a JSON approach).


geocar 1950 days ago

Ad fraud takes a number of different forms, including:

* Buying a low-value ad (like a banner) and cramming a high-value ad (like a video) in there, and lying to the ad server about the visibility, sound, etc., using JavaScript. Sandboxing is typically stymied by a number of cross-domain limitations in real browsers that we can detect server-side.

* Buying installs for an older/hacked browser (or browser extensions) that has been scripted up to load the ads. People would embed these in screen savers, making real users visit these ad pages when the PC owner was unlikely to be around. They won't have the protection real browsers have, and so can trivially modify the network profile.

* Making a headless browser call ads on pages. These pages "look" valid, and you can visit them to see the ads, but the headless browser has collected cookies from various shopping sites, and uses a number of home/DSL proxy services to obscure detection. For any single impression, they look indistinguishable from legitimate traffic.

These are detected in different ways: JavaScript helps some for the first two, but in the second it's mostly that you're looking for bugs in the implementation (and it's just that JavaScript gives you a wider search area). Usually these things are "home grown", so if you've got a wide view of the industry, and can change your scripts frequently, you can "detect" them being built in real-time.

However, that last one is tricky. Outside of bugs[1], you're left with timing attacks, which I won't enumerate because their obscurity is the strongest protection for continued utility. In general they work on the principle of leaking some identifying data in HTTP and DNS responses, and rely on the fact that the headless browser needs to call lots of ads to pay for the electricity and Internet that it uses, so we get lots of opportunities for a collision.

[1]: https://geocar.sdf1.org/browser-verification.html
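
The "lots of opportunities for a collision" point can be made concrete with a rough model. This is only a sketch under an assumption not stated in the comment: that each ad call independently exposes the headless browser with some small probability p (a hypothetical parameter). The chance of at least one detection over n calls is then:

```latex
P(\text{at least one collision in } n \text{ calls}) = 1 - (1 - p)^{n}
% Illustrative numbers only: p = 0.001, n = 5000
% 1 - 0.999^{5000} \approx 1 - e^{-5.0} \approx 0.993
```

Under that assumption, even a very small per-call leak becomes close to certain detection once the bot needs thousands of impressions to cover its costs.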


pbalau 1950 days ago

If I have control over the code that displays the ads, then I can fake the impressions. The past 20 years of development in the ad tech space didn't happen just because; it happened because there is a real problem that needs solving, a problem that constantly evolves.


geocar 1950 days ago

I've built an ad network that supports exactly this: I give you (the publisher) a bag of JSON or XML that tags up the content, and you decide how to render it. I typically pay on click, but I have paid impressions in some cases where the publisher and I can reach a level of trust. I don't think wakatime.com would be an appropriate publisher for me, but maybe you have other sites that are more appropriate.

My original goal was avoiding ad blockers: by having the publisher render the ad themselves, it doesn't obviously look like an ad, and as long as the publisher doesn't make the page itself an ad farm, users do not tend to block with custom CSS (that might end up in popular blocking tools). It seems to work okay: we've been operational for over five years at this point, and I've not seen one of the publisher domains or CSS show up in uBlock.
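
A minimal sketch of what "publisher renders the JSON" might look like on the server side. The field names (`title`, `click_url`, `image_url`) and the function `render_ad` are hypothetical and not taken from any real ad network's API; the point is only that every field is escaped and nothing from the network is executed as script.

```python
import html
import json
from urllib.parse import urlparse


def render_ad(ad_json: str) -> str:
    """Render a JSON ad payload (hypothetical schema) as inert HTML."""
    ad = json.loads(ad_json)
    title = ad["title"]
    click_url = ad["click_url"]
    image_url = ad["image_url"]

    # Accept only https URLs so a payload cannot smuggle in javascript: links.
    for url in (click_url, image_url):
        if urlparse(url).scheme != "https":
            raise ValueError(f"refusing non-https URL: {url!r}")

    # Escape every field before interpolating it into markup.
    safe_title = html.escape(title, quote=True)
    safe_click = html.escape(click_url, quote=True)
    safe_image = html.escape(image_url, quote=True)
    return (
        f'<a class="sponsored" href="{safe_click}" rel="sponsored noopener">'
        f'<img src="{safe_image}" alt="{safe_title}">'
        f"<span>{safe_title}</span></a>"
    )
```

The publisher keeps full control over markup and CSS, which is also why such an ad is hard to target with generic blocking rules. It does nothing to address the impression and click fraud concerns raised elsewhere in the thread.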


simonmales 1951 days ago

I used to do ad scheduling 10+ years ago and at least in those times the ads had NOSCRIPT tags.

But for it to really work you would need to store a cookie to correctly redirect the user.

I like your idea of rendering the ads on the server side, but I would hope they would have super low response times. Or at least a low timeout on your side.
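
The "low timeout on your side" idea could be sketched as follows, assuming the ad network exposes a plain HTTPS endpoint returning JSON. The function name `fetch_ad_json`, the 200 ms default, and the injectable `opener` parameter are illustrative choices, not part of any real service.

```python
import urllib.request


def fetch_ad_json(url, timeout_s=0.2, opener=urllib.request.urlopen):
    """Return the ad payload as text, or None if the network is slow or down.

    Note: the urlopen timeout applies to each blocking socket operation,
    so it bounds connect and read stalls rather than total request time.
    """
    try:
        with opener(url, timeout=timeout_s) as resp:
            return resp.read().decode("utf-8")
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses; a payload
        # that is not valid UTF-8 raises a ValueError subclass.
        return None
```

When the function returns `None`, the page can be rendered without the ad slot (or with a house ad) so a slow ad server never delays the response.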


wakatime 1951 days ago

You can render client side too, the key is no external JavaScript is being trusted to run on the page.

That of course means the common practice of advertisers pasting a JavaScript snippet, the network doing a review process, then rendering that snippet as an advertisement on some property would not be allowed on this Ad Network.


etaioinshrdlu 1950 days ago

It seems like iframes may be a better solution. They provide quite strong isolation of inside to outside communication. I worked in the ad tech industry 5 years ago and everything was iframes.
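
For illustration, a publisher-side helper that emits a sandboxed iframe might look like the sketch below. The helper name and default slot size are hypothetical. The `sandbox` attribute is standard HTML: leaving out `allow-same-origin` puts the frame in an opaque origin, so scripts inside it cannot touch the parent page's DOM or cookies.

```python
import html


def sandboxed_ad_iframe(src, width=300, height=250):
    """Build an iframe tag that isolates third-party ad content."""
    safe_src = html.escape(src, quote=True)
    return (
        f'<iframe src="{safe_src}" '
        # Scripts may run and clicks may open a new tab, but the frame
        # gets no same-origin access to the embedding page.
        'sandbox="allow-scripts allow-popups" '
        f'width="{int(width)}" height="{int(height)}" '
        'loading="lazy" referrerpolicy="no-referrer"></iframe>'
    )
```

`loading="lazy"` defers off-screen frames, which may soften some of the rendering cost, though it does not remove it.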


max_ 1950 days ago

Don't they take a toll on CPU & rendering times?

What do you think of images rendered on the backend?


etaioinshrdlu 1950 days ago

Yep, they definitely have a performance impact. Plain old image tags are really lightweight and secure. Unfortunately these simple methods are highly susceptible to ad impression fraud. Really, all online advertising is susceptible to fraud, but if you're paying for clicks or impressions, you can tame the fraud with massive amounts of JS, browser sniffing, data collection, and aggregate analysis. This is what the large ad networks like Google do and it's a large industry with many actors.

Reducing this data collection and turning to simpler methods like images increases fraud, which decreases the amount honest publishers would earn (likely hugely). So it's definitely doable, but tends to not make much economic sense at large scale.
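
As a toy illustration of "aggregate analysis", and nothing like what a large network actually runs, one could flag publishers whose click-through rate sits far above the median. The function name, the 5x ratio, and the minimum-impression cutoff are all made-up assumptions for the sketch.

```python
from statistics import median


def flag_suspicious_publishers(stats, ratio=5.0, min_impressions=1000):
    """Return publishers whose CTR exceeds `ratio` times the median CTR.

    stats maps publisher id -> (impressions, clicks).
    """
    # Ignore tiny samples, where CTR is mostly noise.
    ctrs = {
        pub: clicks / impressions
        for pub, (impressions, clicks) in stats.items()
        if impressions >= min_impressions
    }
    if not ctrs:
        return []
    baseline = median(ctrs.values())
    if baseline <= 0:
        return []
    return sorted(pub for pub, ctr in ctrs.items() if ctr > ratio * baseline)
```

With only impression and click counts available (the image-tag case), this kind of coarse check is about all that is possible, which is the economic problem described above.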
