by Nick-Craver 2556 days ago
I just wanted to chime in from Stack Overflow here and let people know: we are aware of the issue. And we're NOT okay with it. We're trying to sort out how to kill the audio behavior now. It's not very straightforward to find where it's coming from, but we are working on it. We've also reached out to Google for their assistance in tracking it down. If anyone can offer advice, we'll more than happily take it.

- Nick Craver, Architecture Lead at Stack Overflow

15 comments

Why are you allowing arbitrary javascript to be served to your users?
Wish I could upvote this 1,000 times.

It's ridiculous. It's a text-based ad. At worst, it's a clickable image. At what point did it become okay in your minds to let advertisers run arbitrary code?

I've left ads turned on specifically on StackOverflow because 1) I want to support StackOverflow, and 2) I trust them not to run malicious ads.

I don't even care that they're running ads network-wide. But if they're going to be running these kinds of ads anywhere on the site, they're going right on the ad block list along with everyone else.

It’s completely insane. Can you imagine a TV station receiving ads on tapes and playing them to their audience without looking at them first? Can you imagine TV stations occasionally showing ads containing porn, urging people to kill, showing extreme violence during cartoons, or containing specially crafted audio that blows out your speakers, and the TV station just shrugs and says they try their best to stop these things but they can’t stop everything?

Imagine a TV ad that tries to make your phone call a 1-900 number so they can rip you off, and the station says they don’t know where it came from but they’re trying real hard to put a stop to it. And somehow watching the ads themselves before broadcasting them never crosses their mind.

It’s worse than that. Imagine a TV ad which sends malicious code that gets executed to your television, which profiles the hardware in your TV and sends information about your viewing habits (tied to a unique ID) back to the advertiser.

In any other context we would call this a security vulnerability. I think that label also applies here.

You don’t need to, it happens already. Many TVs do screen grabs and send everything you do to the manufacturer or partners.
My Vizio's built-in software tries to do that. There's a reason it's not allowed to connect to wifi.
When you say "it's not allowed", do you trust its own settings? Are you sure it's not doing something like [0]? How do you even protect against that?

[0]: https://www.reddit.com/r/privacy/comments/bpr6xs/if_you_choo...

I bet you could do that with an ad that plays “Alexa- call 1-900-555-1234”.
The state of web ads is closer to the public pinboard, only instead of having ads for grandma's couch, it's Mr. CEO trying every trick to drain your money and track you.
TV spots are very limited. Digital ad impressions number in the billions, with tens of millions of ad creatives. It’s not the same situation.
The only reason it’s not the same situation is because they’re willing to throw their users under the bus for a little extra cash. If they wanted to exert more control, they absolutely could. Ads would cost more and we’d see fewer distinct ads as a result.
That is absolutely not the only reason. Digital ads work entirely differently from the TV medium, and it's more than "a little extra cash".

No single publisher today really has the power to change much, no matter how big they are. The issue lies with adtech (like Google) and advertisers.

Or -- the more expensive ads don't justify the ROI, meaning advertisers don't buy them, meaning fewer ads, but less content.
If you can't manage to oversee it because of the scale you don't deserve to take advantage of the scale.
That sounds nice but is neither realistic nor even sensible. There are other solutions, like sandboxing to prevent access to features; it's not an unsolvable problem.
Well, I would argue that if billions will see the content, that gives more reason to have it checked over before serving, no?
Billions? No single creative is seen by that many. In fact, with dynamic creative optimization (DCO) and all the optimization that happens, you can easily get creatives that are custom generated and only seen by a few individuals or even a single person.
I think this comment[1] on the linked Meta question explains it pretty well:

> To the people confused why ads need to run their own Javascript (even ones that are just static images): The short answer is that Ad Networks do not and cannot trust website operators. They need to run their own JavaScript served from their own servers in order to verify that a real user saw the ad and for how long, and they can't trust the website operator to tell them. And these pieces of JavaScript tend to be more invasive and privacy-destroying than the website's JS because they care, far more than the actual website does, that the "user" is not a bank of iphones in a sweatshop in China.

[1]: https://meta.stackoverflow.com/questions/386487/why-is-stack...

Not just arbitrary JavaScript, arbitrary JavaScript where they can’t easily even see where it came from! Sheesh.

Could we require advertisers to sign their ad code to have a trail of where it came from, prevent tampering, and make it easier to pull the plug on bad actors?

The people bearing the costs of the internet ad economy aren’t the people in any position to do anything about it. So there’s very little pressure to fix anything.

Maybe if the US government started threatening to enact something like GDPR unless the adtech industry gets its shit together.

Large adtech demand/sell side platforms do not want to remove these bad actors because they make money on percentage of spend. They are incentivized to increase volume and ad spend at all costs, and there is no regulation to stop them from doing otherwise by continuing to deal with shady companies and known malware techniques.
This is not a solution. JS still runs, it just has limited access to certain features.

You also need to somehow <iframe> the ad content (and serve it from somewhere else with the feature policy header set/attribute on the iframe set) or else sacrifice use of these features on your own site.

The solution is to make the ads inert. They do not need to run code.
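For what the `<iframe>` approach might look like, here is a minimal sketch. The helper name `renderAdFrame`, its options, and the `ads.example` URL are hypothetical, and which features a policy can actually switch off varies by browser; nothing here confirms that audio-based fingerprinting is covered. An empty `sandbox` attribute (no `allow-scripts`) is what makes the framed ad inert.

```typescript
// Hypothetical helper, not part of any ad library: renders markup for an ad slot
// that is isolated in an <iframe>.
interface AdFrameOptions {
  src: string;              // ad URL, ideally served from a separate origin
  allowScripts: boolean;    // false = inert ad: no script runs inside the frame
  deniedFeatures: string[]; // policy-controlled features to turn off in the frame
}

// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function renderAdFrame(opts: AdFrameOptions): string {
  // An empty sandbox attribute applies every restriction, including "no scripts".
  const sandbox = opts.allowScripts ? "allow-scripts" : "";
  // The allow attribute takes entries of the form: feature 'none'; feature 'none'
  const allow = opts.deniedFeatures.map((f) => `${f} 'none'`).join("; ");
  return (
    `<iframe src="${escapeAttr(opts.src)}" ` +
    `sandbox="${sandbox}" allow="${escapeAttr(allow)}"></iframe>`
  );
}

// Usage: an inert ad frame with microphone and autoplay denied.
const inertAd = renderAdFrame({
  src: "https://ads.example/creative?id=1&size=728x90",
  allowScripts: false,
  deniedFeatures: ["microphone", "autoplay"],
});
```

With `allowScripts: true` the frame still runs the ad's JavaScript and only the listed features are restricted, which is the limitation described above.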

Why are you allowing arbitrary JavaScript to run on your device?
Sites like StackOverflow require JavaScript to work (or at least, to work in a manner approaching interactivity). So, even someone who disables JavaScript normally, would presumably enable it in order to use this popular and useful site. Furthermore – and importantly – they place trust in StackOverflow not to abuse the privilege of executing arbitrary JavaScript. That is an entirely reasonable thing for a technically savvy modern web user to do.

By serving this ad with JavaScript not vetted to StackOverflow's presumed standard, StackOverflow has violated that trust. Thus the onus is on them, not the user, to remove the offending ad or risk damaging their brand.

Honestly, what you said is like saying "why would you ever not keep a hand on your wallet" after someone got pickpocketed in a nice restaurant. Reasonable people have reasonable expectations of safety in certain places which they trust to provide it for them. No-one should go around being constantly paranoid of pickpockets everywhere, no more than anyone on the web should be constantly paranoid of malicious JavaScript even on sites with established records of safety.

> So, even someone who disables JavaScript normally, would presumably enable it in order to use this popular and useful site.

I agree that StackOverflow is at fault here, but enabling JS is not a binary choice — "allow all JS on this site" vs "block all JS on this site" are not your only options.

Tools like uMatrix allow me to control JS coming from different domains on different domains independently. For example, on SO I have enabled JS from Stack Exchange and related domains, but not from Google or other snoopers.

Revenues are important. The users will not notice unless something happens. And when something happens they forget fast.
More money that way
From the post:

"The ad is attempting to use the Audio API as one of literally hundreds of pieces of data it is collecting about your browser in an attempt to "fingerprint" it... Your browser may be blocking this particular API, but it's not blocking most of the data."

Seems like killing the audio is the metaphorical putting a finger in the dyke of serving arbitrary JavaScript to your users.

Maybe in the dyke holding back user outrage, but the dyke of serving arbitrary JavaScript was never built in the first place.
It's spelled "dike".
Not in England, which incidentally is where English originates from.
Nick, how did things go so wrong from three years ago?

e.g. https://news.ycombinator.com/item?id=20289841

I don’t know. I am so very much trying to find out and push to make things better.
So no vetting on new ad tech?
> we are aware of the issue.

> We're trying to sort out how to kill the audio behavior now.

Are you really aware of the issue? The issue people have here is not the fact that the ad is trying to access the audio api per se but that it is trying to fingerprint the users.

If you're "NOT okay with it", how about stopping ads completely until you resolve this problem? That should give a bigger impetus to solve it ASAP as the bottom line gets hit for multiple stakeholders.

This is not just ads, but about fingerprinting and tracking users somehow or the other by third parties. It's plain evil, and not a decent thing to continue foisting on your unsuspecting users after you've known it. Tell management to take an ethical stance and preserve the reputation of SO.

Probably not his call. By "we" he's probably talking about the engineering team, which in many cases is nothing more than a conduit for whims of the marketing and sales teams.

The only time they'd do that is if the marketing team decided that the value-add from taking ads off cancelled out the profit loss from taking the ads off.

I completely understand that it may not be his call. That's why I said "Tell management to take an ethical stance and preserve the reputation of SO."

Maybe he (or someone else in the team) has already given this as a temporary solution but it's been rejected. Since we don't know what's going on in the background, this suggestion being put on a public forum is still worthwhile. It could also help external parties (like HN readers) add more pressure in not letting this kind of surveillance continue just because the company doesn't want to stop making money while they're working on a solution or waiting for Google (or someone else) to help.

Every minute they delay cutting this off puts thousands of people in a position of vulnerability.

So, we have:

- Stack Overflow makes a blog post about not using dynamic ads.

- Dynamic ads found on Stack Overflow, with aggressive fingerprinting.

- Architecture Lead doesn't know how this happened and is getting serious.

I have so many questions. I hope this gets a post-mortem.

The fundamental problem seems to be that you are including non-sandboxed JavaScript that you don’t control.

Perhaps you should stop doing that.

Would something like SafeFrame have avoided this issue?

https://www.iab.com/guidelines/safeframe/

Hi Nick,

If you're serious about this, I've built tools for the publisher side for stopping exactly this.

My email address is in my profile.

I’m very interested and very serious. Email sent.
I just saw this post, where a potential justification was provided for a similar script in the past: https://meta.stackoverflow.com/questions/335956/adzerk-servi...

It's hard to read the obfuscated code and be sure what's being done with the browser environment information. This script seems to generate some hash and put in some global variables, presumably for some other script to consume. I don't know whether such scripts send it to a server, compare it locally to a previously-known value, or ignore it.

I would pay for an ad-free version of Stack Overflow. Take my money, please.
I think the data in aggregate is worth more than people like you would pay for an ad-free service.
This is the actual problem at the heart of it all. And even if it were more profitable to take subscription fees than to serve ads, what's stopping you from "double dipping" and serving ads anyway?
Or taking your subscription money and tracking you anyway. Knowing your interests on one site helps target you elsewhere.
ArsTechnica (obviously a very different site compared to SO) has an ad free subscription model where it also removed all trackers for paying subscribers. It's possible to do this in an ethical way. Whether the site publisher is interested or not is a different matter.
> what's stopping you from "double dipping" and serving ads anyway?

People looking at the source code, like what happened here.

You think the NY Times, LinkedIn, etc. are going to have the same response as StackOverflow? Good luck even getting in touch with someone who knows what you're talking about.
If LinkedIn (to choose a random example) advertises one of the perks of subscribing is that you won't be tracked, and then tracks you anyway, that's a story for The New York Times et al.
Very likely. I'd pay hundreds of dollars a year to Gogle if they guaranteed* me, with severe legal repercussions otherwise, that they wouldn't track me, or allow a single bit of my data, anonymized or not, leave their servers, or be used in any other way that wasn't for my own purpose.

Re-selling digital personas as commodities must be far more lucrative.

> Gogle

Is that the evil twin?

I actually wonder about this. SO's typical user is tech-savvy, and I would imagine many access the site with adblockers on (I do). So I suspect my value to the site in terms of ads is close to zero. I would happily pay a monthly subscription to know that the service will remain, given how much value I derive from it, if that gave me the assurance that they won't track me with ads/cookies/fingerprinting.

Their other income is from job ads, and I guess the value is that they have lots of data points about their logged in users (with scores high enough to imply they've interacted with the site a fair bit), in the form of what is posted, worth more than the aggregated list of websites that a user sees (as reported by ads).

I'd love to know more about this, as I have very little understanding of the economics of serving targeted ads. How much can they be making from ads?

But they mostly wouldn't pay in ads either. The difference may be pay-for-ad-free vs adblock-and-no-money rather than getting more ad views.
It looks like something using fingerprintjs2.

This library is very popular.

https://github.com/Valve/fingerprintjs2/blob/master/fingerpr...

Not sure how that plays with rules about how you can place ads etc, but <iframe> with a feature policy can stop access to audio I think.
Why don't you block all the JavaScript not coming from your origin and just display a simple link+PNG as advertising?
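One way to express "only scripts from our own origin, ads as plain images" is a Content-Security-Policy header. The sketch below is an assumption about how such a policy could be assembled, not what Stack Overflow does; `buildCsp` and the `ads.example` host are made-up names, and a real site would also have to account for its own inline scripts and any CDNs it relies on.

```typescript
// Hypothetical helper: builds a CSP header value that allows scripts only from
// the page's own origin, while still letting ad images load from listed hosts.
function buildCsp(adImageHosts: string[]): string {
  const directives: Record<string, string[]> = {
    "default-src": ["'self'"],
    "script-src": ["'self'"],             // no third-party or inline scripts
    "img-src": ["'self'", ...adImageHosts], // ad creatives as static images only
  };
  return Object.keys(directives)
    .map((name) => `${name} ${directives[name].join(" ")}`)
    .join("; ");
}

// Usage: send this string as the Content-Security-Policy response header.
const cspHeaderValue = buildCsp(["https://ads.example"]);
```

The ad itself then reduces to an `<a>` wrapping an `<img>`, with click tracking handled by the link target rather than by script.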
This is exactly why I block third party advertisements for myself and everyone that uses my network.
I hear from multiple people that they receive ads about topics they only talked to friends about but never entered into a search engine.

Google is currently as far away as it could be from its previously world-famous "don't be evil" corporate culture.

Other examples are AMP, where Google wants to make it harder to de-individualise URLs. This is being driven to an extent where Chrome on Android makes it harder to edit the URL.

Or games like Ingress or PokemonGo, which in my opinion help Google constantly update their WiFi SSIDs-to-GPS-location database. This database is then furthermore being used to track users' location through a little permission called "WiFi Control", which also cannot be found in the regular App Permissions settings entry.

To me, WiFi Control sounds nothing like location tracking. But I have to admit, I am not a native speaker. Therefore I might be misunderstanding something.

"Don't be evil" was replaced by "Do the right thing" years ago. Great piece of corporate speak right there.