by DrBazza 1732 days ago
Since I've been "reassured" that we're not being listened to by smart speakers, I can only assume that familial links, i.e. the 'Facebook graph', are now exploited by ad companies as well.

Anec-data: we visited the in-laws in a different part of the country, so tracking 'knows' that my device has moved geographically. While we were there, they searched for garages and glazing. We returned home, and now every advert is for garages and glazing.


That’s most likely because your device used their Wi-Fi and ended up behind their NAT, using the same IP as them. The anonymous tracking cookie IDs on your device get associated with searches from that IP (i.e. “cookie 1234 is in-market for glazing”), and you take those cookies back home with you. This is actually a massive failure because you’re not likely to purchase glazing at all.
I'm not sure it's a failure. You could argue that showing him ads for glazing could be beneficial in case he spots a good deal and decides to talk to them about it.

Now I don't think the ad network is that smart, or that this outcome was explicitly intended, but presumably in aggregate they're seeing better results by merging targeting buckets by IP than not, so they continue doing it.
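
A toy sketch of the IP-based bucket merging described above. This is not how any real ad network is known to work; the event format, cookie names, and the `merge_segments_by_ip` helper are all hypothetical, and the IPs are documentation addresses.

```python
from collections import defaultdict

def merge_segments_by_ip(events):
    """events: iterable of (cookie_id, ip, segment_or_None).

    Returns cookie_id -> set of interest segments, where every cookie
    seen on an IP inherits all segments observed from that IP.
    """
    segments_by_ip = defaultdict(set)
    cookies_by_ip = defaultdict(set)
    for cookie_id, ip, segment in events:
        cookies_by_ip[ip].add(cookie_id)
        if segment is not None:
            segments_by_ip[ip].add(segment)

    profiles = defaultdict(set)
    for ip, cookies in cookies_by_ip.items():
        for cookie_id in cookies:
            # The merge step: the cookie picks up whatever the IP was tagged with.
            profiles[cookie_id] |= segments_by_ip[ip]
    return dict(profiles)

# Hypothetical events: the in-laws search, the visitor only joins their Wi-Fi.
events = [
    ("cookie-inlaws", "203.0.113.7", "glazing"),
    ("cookie-inlaws", "203.0.113.7", "garages"),
    ("cookie-visitor", "203.0.113.7", None),   # visitor behind the same NAT
    ("cookie-visitor", "198.51.100.4", None),  # visitor back home
]
profiles = merge_segments_by_ip(events)
```

The visitor's cookie never searched for anything, yet it ends up tagged with both segments and carries them home.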

What reassurances have you had about not being listened to?

I keep cam/mic access switched off a lot for this reason, and I've only seen generic denials that it is happening or could happen, but nothing concrete, much like the pre-Snowden outlook on internet surveillance.

If someone wants to convince me FB et al. are doing this, they can show me packet traces of audio being uploaded for analysis, or in the alternative (if the theory is on-phone analysis) resource consumption data from the phone doing it.

There's been so much discussion of this theory that we'd have seen one of those by now. I have zero love for the surveillance shops, but there's just no real evidence this happens.

Resource consumption for low-accuracy voice transcription is trivial - think in the single-digit milliwatt range[1] for custom hardware. It would also be really easy to hide the resulting small amount of textual data in routine communications with the service's server.

You can't rule out audio transcription on the technical basis that "it's too hard" alone, because it's not too hard.

(for Google, that is - Facebook is constrained by the Android sandbox, but Google has their opaque Google Play services blob on almost every Android phone)

That's obviously not a reason to believe that it does exist - we'd only know that if a Google whistleblower stepped forward, or if someone reverse-engineered Play Services - but we can't rule it out on a technical basis alone.

[1] https://groups.csail.mit.edu/sls/publications/2018/Price_IEE...
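
A back-of-envelope version of the milliwatt argument above. Every number here is an assumption chosen for illustration (a 5 mW constant load, a 12 Wh phone battery, 2 hours of speech a day at 150 words per minute, 6 bytes per word); the milliwatt figure in the comment is for custom hardware, so a general-purpose phone SoC may do worse.

```python
def daily_energy_share(load_mw, battery_wh):
    """Fraction of one full battery charge used per day by a constant load."""
    energy_wh_per_day = load_mw / 1000 * 24  # mW -> W, times 24 hours
    return energy_wh_per_day / battery_wh

def daily_transcript_bytes(speech_hours, words_per_minute=150, bytes_per_word=6):
    """Rough size of a plain-text transcript of one day's speech."""
    return int(speech_hours * 60 * words_per_minute * bytes_per_word)

# Assumed values, not measurements.
share = daily_energy_share(load_mw=5, battery_wh=12)
size = daily_transcript_bytes(speech_hours=2)
```

Under these assumptions the load is about 1% of a battery charge per day, and the transcript is on the order of 100 KB before compression, which is why neither battery drain nor traffic volume alone would settle the question.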

Better to focus on the things that we have evidence that they are doing (and there is plenty of abusive behavior we know about) than to speculate about unlikely attack angles and say "we can't rule it out" (proving a negative is nearly impossible). Working that way just leads to a focus on the wrong threats.
I'd be satisfied with a "beyond a reasonable doubt"-type argument. Somebody pre-registers some hypotheses, like "I've never thought about beanie babies in my life, but I'm going to talk about them in front of my phone. I expect I'll start seeing ads for them at a noticeably higher rate than before." Such an experiment wouldn't be difficult to conduct, but I've never heard of it being done methodically, only noticed after the fact.

I'd be very pleased to see it done though. Complications will certainly arise, so N should be large, and there should be several independent replications.
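
One way the pre-registered comparison above could be scored, as a minimal sketch: count how many ad impressions mention the topic before and after the conversation, then run a one-sided two-proportion z-test. The function name and the counts are hypothetical, and the normal approximation is shaky when hit counts are small, so an exact test would be a better choice for a real study.

```python
import math

def one_sided_two_proportion_z(hits_before, n_before, hits_after, n_after):
    """Return (z, p_value) for H1: topic ad rate after > rate before."""
    p_before = hits_before / n_before
    p_after = hits_after / n_after
    pooled = (hits_before + hits_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    if se == 0:
        return 0.0, 0.5  # no variation at all: no evidence either way
    z = (p_after - p_before) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value

# Made-up counts: topic ads seen out of 1000 impressions in each period.
z, p = one_sided_two_proportion_z(
    hits_before=2, n_before=1000, hits_after=15, n_after=1000
)
```

The threshold for "noticeably higher" and the observation windows would need to be fixed before the conversation takes place, otherwise the test inherits the after-the-fact bias the comment complains about.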

Maybe it's overly cynical, but this also makes me think of the Volkswagen emissions kerfuffle. Would it be so surprising if the ad software was sophisticated enough to know when it was being tested, and try to play dumb?

My wife and I did this. We don’t watch pro baseball at all and we live in the southeast. Had a 10 minute conversation about the Cincinnati Reds and within a few minutes we were seeing MLB ads on Facebook.

I don’t know how they are doing it…but it happens.

> What reassurances have you had about not being listened to?

I know enough about tech to know that it's a very, very, very hard technical problem, and hiding it is basically impossible. And no one has shown anything even resembling a breadcrumb pointing towards it, let alone proof.

> I know enough about tech to know that it's a very, very, very hard technical problem, and hiding it is basically impossible.

This gets repeatedly asserted, and it's false. Low-accuracy voice transcription is a solved problem, and is relatively easy to hide, as long as you have API access[1]. (So Facebook is probably in the clear, as neither Apple nor Google is crazy enough to let them have invisible microphone access, but it would be relatively easy for Google. Play Services hook, anyone?)

[1] https://news.ycombinator.com/item?id=27142812

I'm very well aware that low-accuracy voice transcription is possible.

But it's naive to think that having an algorithm equals solving a technical problem. The algorithm isn't even the hard part. The problem is how to deploy it at scale without anyone leaking it (employees or vendors), how to hide it so well that no security researcher can find it, and how to do all of that in a way that still lets them use it and get value out of it.

And then compare the risks of doing that with the risks and ROI of, for example, improving search accuracy, so people will just come and tell you more about the stuff they want.

Have you considered that they might be tracking your thoughts and speech through microchips they implanted in you during your Covid vaccination? /s