Hacker News
by rickhanlonii 3161 days ago
They're listening, but not to your voice or with a microphone.

Facebook is listening to your data--to all of our data, all at once. They have locations, searches, clicks, messaging, photos, hashtags, and any other form of browsing patterns for everyone in your country, everyone in your neighborhood, everyone in the same room as you.

They have enough data, with such advanced analysis, that on occasion they can get really close to what you're thinking/doing without you explicitly telling them. They're doing this frequently enough to creep a lot of people out, and they're only going to get better at it over time.

What we have here is something like a Turing test for privacy: a sufficiently advanced amount of data and analysis will be indistinguishable from surveillance.

It's simply unnecessary to listen with a microphone.

Edit: found this slide from F8 in 2015 where they said they store 300PB and process 10PB (with a P) of data per day https://www.instagram.com/p/0tUjrQKH6R

14 comments

Yes, I think people really underestimate how much data they leak all day long, and how strong the pattern is. Simply emitting bluetooth and wifi, never mind cellular, all day long can expose an incredible amount of stuff, provided someone can piece it all together (which is exactly what Facebook and other ad companies do).

We like to pretend we're unique and powered by free-will but the reality is most things you do are fairly easy to track and predict.

We're not in 1984, but certainly the capability is _already_ there. The difference between us and 1984 is only really one of intent, and you can guess how long that'll last.

> The difference between us and 1984 is only really one of intent

I'm not convinced the intent isn't there; I'm also confused about why you'd assume that. As a free software activist, I wonder whether you have ever checked the EFF or FSF websites. I'm not saying the EFF and FSF are saving the world (no, they are not), but if you browse what they're dealing with, you can easily see the intent is there. It is simply hard to do all this unless people believe the intent is not there, which is why nothing is done as explicitly as in 1984. For example, Windows 10 has built-in ads INSIDE the operating system: your OS, for which you already paid quite a price, forces you to see ads and tracks your behavior "to optimize user experience". People are not worried about this since ads are such an integral part of our lives. But Microsoft fully intends to track which websites I visit, what I eat, what I like, where I go. Maybe you can say the intention is "different", in that in 1984 they use my personal information to prosecute me, whereas this is not what Microsoft is after. Well, then this brings us to some Foucault reading. In a world where all my information is collected, what difference does it make that it is used for one specific cause today, when that cause can change tomorrow? What if the NSA asks Microsoft to hand over a bunch of people's data? (We all know this already happened, but it is better practice to ask questions.)

I am a programmer, I use my computer all the time, pretty much every single awake moment of mine. I don't want to be watched. But as you said it is depressingly hard to achieve this, since it feels like only 0.1% of the population shares this same desire. Most people are okay being watched, and this is part of the problem.

> ads are such an integral part of our lives

I'll skip the Futurama reference for now, but for me ads are far rarer than in the 90s. Adblock handles most things online; iPlayer, Netflix and Amazon don't really have adverts on TV (they have a graphical menu, but I wouldn't call that an advert); Spotify doesn't have adverts. Sure, there are a few adverts if I go somewhere like London, and Sainsbury's advertise their own deals in store, and of course products on the shelf are adverts in and of themselves, but the only advert I have seen in the last 24 hours was one raising awareness of cystic fibrosis on a billboard outside Sainsbury's. Oh, and the train announcement saying "the on board shop is open for teas coffees and light refreshments". There wouldn't have been any product placement in the episode of Once Upon a Time I watched last night either.

The TV jingles I remember from childhood are still etched in my brain; the adverts I see in this decade are ones I hunt out deliberately (the John Lewis Xmas advert for example, or film trailers, or something about Amazon delivery people stealing your sofa that was recommended at work).

I did see adverts on Saturday night when we went to see Thor. I don't recall what was advertised other than Odeon Limitless. The trailers are adverts too of course; alas, they don't really work - we'd rather see more films at the cinema than we do now, but getting babysitting is a pain. We missed Spider-Man and Kingsman, for example.

If anything there are more opportunities to avoid adverts than ever before.

It definitely depends where you are living, and how you travel.

If you haven't been to London recently I understand you might say there are a few ads, but lately I have really begun to notice just how prevalent they are. The tube is absolutely plastered with them - in the train carriages, the escalators, the tunnels between platforms. Construction work fences are covered in ads. And this is for thousands of people who travel 30-45mins two times every day, completely filled with ads. You don't necessarily pay them much attention every day, but they definitely have an effect on which companies, brands and even theatre performances pop into my head first.

On top of this, I took a flight to Heathrow last night and there were ads on the plane, ads all over the arrivals area as you walk to the taxi rank. In the Uber home I noticed the motorway connecting Heathrow and London has bright electronic billboards one after the other - it never seems to end.

I just wanted to give some perspective on what the commuter experience is like in London (and I'm sure in other large cities, too). It may feel like you experience fewer ads personally, but that may not be representative of the general public.

The apps you mentioned are a select few (admittedly good examples of how TV ads have gone down), but most free apps now rely on some kind of advertising model to be profitable. Almost every social media platform forces ads into your timeline. Perhaps we just don't tend to notice how pervasive they are.

I guess one side effect of being glued to your phone all the time is that you don't notice adverts. But the tube had adverts in the 90s, and before - http://howlsandwhispers.co.uk/articles/bob-mazzer-london-und...

I did notice adverts on a flight from Delhi last month at the start of a film, but it's very rare I'll watch anything on a plane's entertainment system, partly because of things like adverts.

Even tiny cities in comparison will give you a similar experience. You just have to look. There's probably a lot of people looking right through them.
Not sure I agree re: Netflix and Spotify. Both have curated home pages with questionable amounts of accuracy. Netflix in particular places all of its own content on the top row of the selection. My Fire TV Stick home page is jammed with ads for content on the landing screen. Amazon’s website is a gigantic ad. My online shopping now has suggested items based on what I’m ordering (ordering strawberries: now you get suggestions for branded cream in the same search).

Product placement also appears to be rampant (despite you mentioning it explicitly in your post) - many shows highlight specific vehicles and brands, or specific phones, brands of food...

Another one I’ve found is that lots of blogs (particularly food ones) have sponsored entries. So you have someone writing about food and having recipes that use a particular brand of harissa, or a particular food processor.

All these things are advertising, and are mostly heavily targeted too.

Absolutely. uBlock Origin scrubs every page clean of ads and trackers.

It's really not THAT hard to stay off the grid.

Quick note to say that despite uMatrix/uBlock Origin, it's hard not to be tracked via canvas fingerprinting, where the use of a blocker can make you more trackable. I mean, it's not necessarily the ad companies at that point, but you're still a data point for marketing/surveillance.

https://en.wikipedia.org/wiki/Canvas_fingerprinting

I was the audience to at least a hundred advertisements on my way to work today.

Do I mind? Not really. Probably because I can't point out a single one.

> Most people are okay being watched, and this is part of the problem.

I am not sure I agree with that. I think the vast majority of people react the same way to surveillance: at best they don't like it, and otherwise they outright hate it.

But most people are not conscious of this surveillance. If there was a guy following them absolutely everywhere, all day long, and logging on a notepad everything they do and everything they say, they would go absolutely mad. It's just that current surveillance, through cookies, CCTV, server-side logging, is too stealthy for people to notice.

> We're not in 1984

https://m.imgur.com/vRBtL

"that Huxley not Orwell was right"

Implies only one can be right. There doesn't have to be only one answer.

    Some say the world will end in fire, 
    Some say in ice. 
    From what I’ve tasted of desire 
    I hold with those who favor fire. 
    But if it had to perish twice, 
    I think I know enough of hate 
    To say that for destruction ice 
    Is also great 
    And would suffice.
Every time we avoid one possible dystopia, the possibility for another will surface. We just have to hold them at bay faster than they can arrive.
> "that Huxley not Orwell was right" Implies only one can be right.

You're very right about this observation. That being said, I'm kind of tired of the 1984 meme because the state of things and the way we seem to be heading looks nothing like the future presented in 1984. In fact, if we are expecting dystopia to present itself like the one portrayed in 1984 it might obscure the fundamental issues and complexities that our society faces. Brave New World, on the other hand, has a better chance of helping us reveal them.

Direct link to the image:

https://i.imgur.com/vRBtL.jpg

I get redirected regardless.
When a person submits their email for free Wifi at a place with Zenreach, they pull from databases and build a profile on the person.

I don't know which services provide this or how they gather data, but the thought of anyone being able to download a detailed file on you with only an email or phone number is scary.

A few months ago I did a Google search for my name, as I do once or twice a year just to see what comes up. This time I had mixed reactions, because among the first results was the full text of a scanned old magazine dated 1978 or so, with my name and complete street address. That was long before the Internet was publicly available, and having such data in a magazine would not hurt anyone, because people expected it could be useful for getting in touch with others with similar interests (electronics in this case). It surely worked for me, because I got a free subscription and some job offers which I would have gladly accepted if I weren't only 12 years old :) But nearly 40 years have passed, the ubiquitous magazine called the Internet is read by anyone for free, and having your address and/or personal data there isn't that safe anymore. Luckily I have relocated a few times since, but what if I hadn't? Technically speaking it's great to see an old magazine brought back to life, but what about unfiltered personal information that one would expect to remain buried and that will instead remain available forever in other contexts? I'm not bashing Google's bots for crawling around to turn anything into searchable data, and I surely would never want laws to limit what can be searched, but pay attention if you have shared personal data in printed material: until yesterday we said "everything you put on the internet stays there forever", and today there is more to it.
As a company director since the age of 15 I’ve had my personal address publicly listed for quite a while now.

Prior to that there were phone books and whilst you could choose to be delisted clearly a lot of people didn’t since the books were so thick!

I thoroughly agree that privacy is important but I don’t think people ever really minded very much or if they did not enough to do something about it like lobby for laws to ensure personal addresses are not disclosed publicly.

What's the reason for keeping your address a secret? Genuinely curious, as where I live (se) almost all people's addresses are public information, and it's really practical.
33 bits. Information is a force multiplier, not power itself (Francis Bacon was wrong.)

https://33bits.wordpress.com/about/

It takes only 33 bits of distinctive information to identify a given person. Specific information about a person, including background, can help provide further information on them -- it tells you where to look (and more importantly, a very good idea as to where not to), who to talk to, and what they might have done.
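For anyone wondering where the 33 comes from, here is a rough sketch of the arithmetic. The world population figure and the attribute shares below are my own round-number assumptions for illustration, not numbers from the linked site, and the attributes are treated as independent, which real ones are not.

```python
import math

def bits_to_identify(population: int) -> int:
    """Smallest whole number of bits that gives every person a distinct label."""
    return math.ceil(math.log2(population))

def bits_revealed(share: float) -> float:
    """Bits of information gained by learning an attribute held by `share` of people."""
    return -math.log2(share)

WORLD_POPULATION = 8_000_000_000  # assumed round figure

needed = bits_to_identify(WORLD_POPULATION)

# Hypothetical attributes with assumed shares, treated as independent.
clues = {
    "country (1 in 800 people)": 1 / 800,
    "birthday (1 in 365)": 1 / 365,
    "birth year (1 in 80)": 1 / 80,
    "gender (1 in 2)": 1 / 2,
}

gathered = sum(bits_revealed(share) for share in clues.values())
remaining_candidates = WORLD_POPULATION * math.prod(clues.values())

print(f"bits needed: {needed}")
print(f"bits gathered from clues: {gathered:.1f}")
print(f"people still matching: {remaining_candidates:.0f}")
```

Under these assumptions, four mundane facts already cover most of the distance, which is the sense in which each extra detail tells you where to look and where not to.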

"If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged."

- Cardinal Richelieu (a/k/a Armand Jean du Plessis, Cardinal-Duc de Richelieu et de Fronsac)

"the forced revelation of information makes individual privilege and power more important. When everyone has to play with their cards on the table, so to speak, then people who feel like they can be themselves without consequence do so freely -- these generally being people with support groups of like-minded people, and who are neither economically nor physically vulnerable. People who are more vulnerable to consequences use concealment as a method of protection: it makes it possible to speak freely about controversial subjects, or even about any subjects, without fear of harassment."

https://plus.google.com/+YonatanZunger/posts/WegYVNkZQqq

(Yonatan Zunger is the former chief architect of Google+.)

I work with prisoners. I have a unique name. My family members and I are the only people with our names in the country, and possibly the world.

If you google my name, you will find my family (parents, sibling), their home address, phone numbers, ages, occupations. It is local.

Upon release or through veiled communication, I or my family members could easily be targeted for harm by inmates or former inmates.

This is a very real concern for me. I have found no way to permanently remove this information from search results.

I encourage people to think more creatively when they cannot think of potential disadvantages of easily searchable personal information.

Create a batch of new "yous" with different addresses and make them public across the internet. Your problem is too little misinformation.
> What's the reason for keeping your address a secret?

I, and the people I love, have public political opinions on the internet, and SWATting is a thing.

I can either keep my address a secret (and I do; I use my PO Box whenever possible), or I can decline to participate in public civic discourse and encourage the people I love to do the same.

I think it's not so much keeping this sort of information a secret, it has never really been a secret as anyone with some motivation could find out these things. The big thing these days with computerisation and the Internet is it is making these things trivial to acquire, accumulate, and search. It's not a matter of going to offices and looking through paper records, because it's all at someone's fingertips.
> it's really practical

To what end?

I can't think of a situation where someone might need my address and not be able to get it from me directly.

Addresses can be used for authentication; they routinely are in the UK, where I live now.

There are also possible physical safety issues, especially in a country like the US where people seem to have guns and are willing to use them.

I miss Sweden’s transparency and widespread trust.

Lookup address-by-name seems much more damning than lookup-name-by-address.

Much less is to be assumed about you, like just your name.

Versus the country you live in, state/province/territory, city/village/township, neighborhood, etc.

We _are_ in 1984, but unlike the book, it's not imposed on us. We signed up for it. And those of us frequenting HN most likely are the ones helping to build it.

Think about it: tomorrow is Monday. How many folks on this board will be working on analytics and tracking once they get in to the office? A lot of us, guaranteed. And how many of us will take a principled stand, swearing not to work on products until surveillance features are removed? A few of us, guaranteed.

You say people will take a stand, but the vast and overwhelming number of us won't; we will just continue to help build the greatest surveillance system in history. And then tell each other how we’re unicorns that are saving the world, that we cherish freedom and liberty and the hacker ethic, when really we just want to get paid.
Nobody knows what stand to take. We stood and fought for open source everywhere, thinking that would be our salvation. But apparently there is more to it than that, and we aren't exactly sure what.
Sure we are sure, we just don't like it. RMS gets a lot of flak, but he's mostly right: this is not a technical problem with a technical solution, it's a political problem with a necessarily political solution. Most of us live in democracies, so if you want to fix the problem, it's actually quite straightforward; it's just that people don't want to give up their nice job and nice hobbies.
If you know what the law should be, please tell us.
Oh my god stop being so dramatic, both of you. The vast majority of people are not actively contributing to the sorry state of data privacy, nor are they heroically working to fix it. Put the blame where it's due.
Are you kidding me? HN is _built_ on startups that specialize in selling ads.
>We _are_ in 1984, but unlike the book, it's not imposed on us.

We're living in both 1984 and Brave New World.

See this illustration comparing 1984 and Brave New World.

https://i.imgur.com/vRBtL.jpg

The other difference between us and 1984 is that we are volunteering that data for free.
Right, like the example from Target. It is possible that facebook has access to a stream of Target purchases or credit card purchase information. I'd be 0% surprised if there was a data sharing agreement with credit rating companies.

Even if they just knew the price of the purchase, combined with the job (burn risk) and location data (traveled from work to nearest store with pharmacy during the day), it's possible that you might be able to infer a burn with enough accuracy to be valuable to advertisers.

"It is possible that facebook has access to a stream of Target purchases" ...

It's not possible, it's simply true. In fact, this feature of the facebook ad platform is available to any advertiser on Facebook/Instagram/etc, whether you have $100 to spend or $100M.

You simply upload your customers' purchases to Facebook with data such as zip code, email, etc., which Facebook then uses to optimize your advertising budget and find more customers similar to the ones you already have (the feature is called look-alike audiences).

"Google said that it captures around 70% of credit and debit card transactions in the US."

source: http://www.bbc.com/news/technology-40027706

If Google has access, I am sure Facebook does too.

I'm sure Facebook would like to do what you're saying, but I think that they're a long way from having those capabilities.

They have billions of users, billions of ads, and trillions of pieces of metadata. Sifting through that to produce a guess like "User 1234 burned themselves and would be interested in product 5678" would be an amazing and scary piece of AI.

Not to mention that Facebook is limited by the ads it can sell. The manufacturer of the burn cream doesn't give Facebook a bucket of money and free rein. They choose how and where the ads are shown. There is currently no option for "Users who recently burned themselves".

Last year there were 29,000 categories that could be targeted. If an ad sales rep wanted to close a deal, how difficult would it be to add one more for recent burn victims, or people who recently bought a specific product?

From the links that m_ke posted, facebook already has ties to loyalty card information, so it's very possible that they didn't need to do any inference, they just had the data directly.

In any case, the main point is that facebook doesn't need to listen to what people are saying, it has a ton of other data streams that could explain the stories in the link.

What would the point be in advertising a burn cream to someone that just bought a burn cream? It's not like it's a common recurring purchase (like toothpaste, for example).

Like the article says, it's probably just the Baader-Meinhof phenomenon* at work. It's like when you buy a new car - you suddenly start seeing that same model of car everywhere.

* https://www.damninteresting.com/the-baader-meinhof-phenomeno...

I think the proposed logic here is "a person like you buys burn cream so we advertise burn cream to people like you." In this scenario the person who actually bought the item will get the ad too.

To me, advertising a product to people who already bought it is a sound strategy. I see no contradiction. I'd rather like to hear a cogent argument as to why that is wasted effort, as is usually implied.

In your example, maybe I left the bottle of burn cream at the office. So I will want to get another one on the way home. It's effective to get reminded in my Facebook about that item (and the brand) isn't it? Yes it's creepy but it works.

> facebook doesn't need to listen to what people are saying, it has a ton of other data streams

The same could be said for any of those data streams. But just because Big F has access to those other sources of data doesn't mean it excludes voice recognition.

I'd be very surprised if FB wasn't in this arena. That's how big tech companies work these days — they do what everyone else is doing. Like getting into self-driving cars.

Apple[1], Google, Amazon, MS, etc... all have voice recognition boxes. FB doesn't need to market a box for your living room, it already has one in everyone's pocket.

([1] Apple says it doesn't use any of your Siri words against you. If you don't trust FB's word, you might not trust Apple's, either. But at least on the Mac you can keep dictation off the internet, if you choose.)

I agree that even with an enormous amount of data there are limits to what facebook can realistically infer...I particularly enjoyed the things that facebook’s ad platform determined to be my hobbies when I checked recently: https://m.imgur.com/nWCWn63

I’ve had an account since 2005, even if I use it much less than I used to, you’d think they’d have enough data to do better than that.

Think smaller. They don't need to consider a billion users at once; for modeling you can start with hundreds of thousands or a few million. A trillion pieces of metadata can be reduced to tens of thousands of features (e.g. instead of feeding in the raw status updates, assign each update to some class, put in some mood measurement, etc.). This reduction does not need to be fully automated; it can be human-assisted (humans coming up with ideas on what kind of features to extract). Instead of considering all possible ads, you can start by taking out the long tail of ads with very little interaction data and focus on some specific categories.
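To make the "reduce raw updates to a handful of features" idea concrete, here is a deliberately crude sketch. The keyword lists, feature names and sample updates are all made up for illustration; a real pipeline would presumably use trained classifiers rather than word matching, and nothing here reflects how Facebook actually does it.

```python
from collections import Counter

# Hypothetical, hand-picked keyword lists standing in for real classifiers.
TOPIC_KEYWORDS = {
    "travel": {"flight", "airport", "hotel"},
    "food": {"dinner", "recipe", "restaurant"},
    "health": {"doctor", "pharmacy", "burn"},
}
POSITIVE = {"great", "love", "happy"}
NEGATIVE = {"awful", "hate", "ouch"}

def extract_features(updates: list[str]) -> dict[str, float]:
    """Collapse raw status updates into per-topic shares plus an average mood score."""
    topic_counts = Counter()
    mood = 0
    for text in updates:
        words = set(text.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                topic_counts[topic] += 1
        mood += len(words & POSITIVE) - len(words & NEGATIVE)
    n = max(len(updates), 1)  # avoid dividing by zero for users with no updates
    features = {f"share_{topic}": topic_counts[topic] / n for topic in TOPIC_KEYWORDS}
    features["mood"] = mood / n
    return features

updates = [
    "ouch burn at work again",
    "great dinner at the new restaurant",
    "waiting at the pharmacy",
    "love this recipe",
]
features = extract_features(updates)
print(features)
```

The point is only that four free-text updates become four numbers, and a model downstream never needs to see the raw text.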

Building these kinds of predictive models is not black magic. This is what companies do to figure out who they should target in direct marketing campaigns. Applying some magical deep learning dust can quite likely improve the results (at Facebook scale), but it's not mandatory. Computationally the hardest part is building the models. Once they are done, applying them to millions and millions of customers is more straightforward.

When considering what is possible, you need to also consider the stakes. Facebook ad revenue is something like $10 billion per quarter[1]. To make more money, I can think of three ways: add more users, make users spend more time looking at their feed or make more money per ad slot. Better targeting means more money per ad slot since some customers are paying directly based on the clicks. Since improvements lead to better revenue, extra hardware costs are easy to justify.

[1] http://www.adweek.com/digital/facebook-raked-in-9-16-billion...

Patterns like these can definitely be inferred by machine learning, with well principled models.

P(product_class=bandage | job=factory_worker, pharmacy_visit_last_month=1) >> P(product_class=bandage)
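A toy illustration of that inequality, estimated by plain counting. The records are invented so the arithmetic is easy to follow; the only claim is that conditioning on two mundane attributes can move a probability a lot.

```python
# Hypothetical records: (job, pharmacy_visit_last_month, bought_bandage)
records = (
    [("factory_worker", True, True)] * 30
    + [("factory_worker", True, False)] * 20
    + [("factory_worker", False, True)] * 5
    + [("factory_worker", False, False)] * 145
    + [("office_worker", True, True)] * 10
    + [("office_worker", True, False)] * 90
    + [("office_worker", False, True)] * 5
    + [("office_worker", False, False)] * 695
)

def probability(rows, predicate):
    """Fraction of rows for which predicate(row) is true."""
    return sum(1 for row in rows if predicate(row)) / len(rows)

def bought_bandage(row):
    return row[2]

# Baseline: P(product_class=bandage)
p_bandage = probability(records, bought_bandage)

# Conditioned: P(product_class=bandage | job=factory_worker, pharmacy_visit_last_month=1)
matching = [row for row in records if row[0] == "factory_worker" and row[1]]
p_bandage_given_context = probability(matching, bought_bandage)

lift = p_bandage_given_context / p_bandage
print(f"P(bandage) = {p_bandage:.2f}")
print(f"P(bandage | factory worker, pharmacy visit) = {p_bandage_given_context:.2f}")
print(f"lift = {lift:.1f}x")
```

With these made-up counts the baseline is 5% and the conditioned estimate is 60%, so the two context features alone make the purchase twelve times more likely.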

Also, Googling something on your phone is so common and easy nowadays that people just forget that they did it. It's literally become an integral part of the thinking process.
My suspicion on this is that stores offer "reward cards" so they can get you to sign something that allows them to gather and sell your data.
Rewards cards / loyalty cards are designed to incentivize you to spend more with the stores. Ostensibly reselling data is a lot less valuable than directly trying to convince you to shop more at their establishments.
That was true at the beginning of the loyalty card phase. But now the big chains are learning how to data mine and discovering the value of their data to others.

Now your loyalty card data is worth twice as much. Once to the retailer for internal marketing, targeting, purchasing, etc... and again selling it to outside parties.

Thanks, and sorry to reply to this late. I often wonder: if I take a loyalty card and then, for example, purchase twice as much double cream as the next person, does this get sold to health insurers who then charge me a higher premium? The permutations are endless, e.g. too much beer, too many condoms, too much spicy dip.
If you look at the history of the Tesco Clubcard (one of the early ones to appear in the UK), the data they mined from their new ability to tie purchasing trends to customers more than made up for the cost of the rewards being offered to get people to use it.

There's a reason they got so dominant over here - they were simply way ahead of the game on offering what their customers actually wanted, because they were the only people who actually had a clue.

That has of course changed now.

If you read the Walgreens Rewards privacy policy, it covers everything from retail transactions to security camera footage, and they are able to tie everything together. Through the rewards program they're also able to gather and sell data about your health that would otherwise be protected by regulation, especially as people are incentivized to use the card, even for prescriptions... and no one actually reads [between the lines] of the privacy policy.
I always thought they were up front about that being true. At least for the 1% off cards that supermarkets offer.
That's true and credit card companies are very open about it (at least if you read their reports for investors). The advantage of store cards is that the store knows exactly which products you bought, whereas they only know the total if you used a Mastercard/Visa.
>whereas they only know the total if you used a Mastercard/Visa

This is simply not true with respect to the big retailers in the US. This started as far back as '06, though the details of the implementation have changed a bit.

Only if it's a co-branded card. If you use your Visa issued by some bank to go shopping at Walmart, they won't get your personal details together with all items you bought. They can get the items but without the personal details (including age, wealth, etc), they're not that useful.
We had the ability to do this at the largest merchant acquirer 10 years ago as long as your bank signed up. Walmart was one of the first customers and would stream SKU-level data straight to our servers from the POS terminal.
You make a very good point. However, in this case there is very strong evidence that Facebook is also listening to the microphone.

Coincidentally, I recently had a friend convey a story where the Facebook app suddenly recommended a new connection immediately after certain information was spoken in a verbal conversation. This instance was particularly damning because, due to the sensitivity of the information that was spoken, it had been very carefully kept out of the digital footprint.

I consider it likely that this is merely just another example of people underestimating how much data they're leaking. I would wager to guess that there would have been far easier methods of deducing that new connection other than listening to your conversation and somehow inferring it from that stream of noise. You just don't know which, but it's not unimaginable that that certain information was leaked in another way as well, and probably in a more structured format than unlabelled audio.
Maybe the other party looked the friend up.

Once Facebook recommended that I connect to a guy with an interesting name, and I wondered where I had seen this name before. I looked through my emails, and I saw that I had bought something from him on eBay several years earlier. I know I never gave Facebook access to my mail account, but guess what, I'm pretty sure the guy gave access to his. Facebook saw that we sent mail to each other, and it asked me if I wanted to be FB friends with this random eBayer...

It's not unimaginable, but the frequency of these stories should make you wonder.

I had this happen personally. Had a conversation about my work with this girl I know which ended up being mostly about project management. The girl told me later she started getting ads for project management stuff.

It's not a subject she's interested in, has ever searched for information on, and has no relevance to her job working the counter at a sandwich shop. We have no social media accounts in common and don't even talk that often.

The issue is in the story teller. If the person telling the story doesn't understand what other data they're leaking or the data the people they interact with are leaking you cannot take their word that "Facebook is listening to my microphone" at face value, no matter how frequently they say it.

Also wouldn't a constant stream of audio - even low quality audio - ruin battery life? I realise it's a phone, so its base usage is a constant stream of audio, but I can't help but feel that this or something else would be giving it away.

As they have a bug bounty program I imagine there's plenty of people watching raw network activity between app and Facebook too.

>Also wouldn't a constant stream of audio - even low quality audio - ruin battery life?

Facebook is a well-known battery hog, at least on iOS. I can't imagine their devs are any better at Android.

Facebook probably knew because she was logged in to it on her work computer, where she was also searching for and reading project management websites. Since every website has a like button it means that Facebook can correlate the two.
This girl has zero interest in project management, the whole reason she brought it up to me is because she thought it was weird because it's a subject she never even thought of outside of the one conversation we had just prior to her being fed ads for the same thing.
> Turing test

Sorry to 'actually' you but that's phrased like Clarke's third law: "Any sufficiently advanced technology is indistinguishable from magic."

https://en.wikipedia.org/wiki/Clarke%27s_three_laws

True, but I have to say that I appreciated both references.
Exactly. People underestimate the amount of data Facebook (and similar companies) can and do scoop up. I’ve seen countless stories supposedly proving that they’re listening through the microphone. However, most of them can easily be explained away if you consider the vast amounts of data they collect.
The idea that they're listening to your mic is rather ridiculous. I mean, we'd find out eventually, and the consequences of widespread illegal wiretapping would be severe for Facebook, they would really cease to exist.

However, as you said, they don't need to. Localization and usage metrics alone can tell you an incredible amount of extremely detailed information.

Just looking at the Facebook iOS app, they don't declare the "audio" background mode in their Info.plist. So instantly we know that they're not recording anything in the background.
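
For anyone who wants to repeat that kind of check, here is a minimal sketch in Python using the standard `plistlib` module. It only tests whether an Info.plist lists "audio" under the `UIBackgroundModes` key. The sample plist is hand-written for illustration (the `com.example.app` identifier and its contents are assumptions, not Facebook's actual file), and a missing entry is only a hint, not proof of anything.

```python
import plistlib


def declares_background_audio(info_plist_bytes: bytes) -> bool:
    """Return True if the plist lists "audio" under UIBackgroundModes."""
    info = plistlib.loads(info_plist_bytes)
    return "audio" in info.get("UIBackgroundModes", [])


# Hypothetical, hand-written plist for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>CFBundleIdentifier</key>
    <string>com.example.app</string>
    <key>UIBackgroundModes</key>
    <array>
        <string>fetch</string>
        <string>remote-notification</string>
    </array>
</dict>
</plist>
"""

print(declares_background_audio(SAMPLE))  # False
```

In practice you would read the bytes from the Info.plist inside the app bundle instead of using the inline sample.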

For foreground, I suppose it'd be fairly trivial to tell if the app was making calls to AVAudioRecorder.

I wish someone would actually write about how ridiculous these claims are in a way that non-technical people would be able to understand. There's so much bias, conjecture and downright false proof.
I agree. I'd love something shorter than having to send someone to read half of the LessWrong Sequences. I wish CGP Grey or Kurzgesagt could do a video on that.

The basic thing you need to make people comprehend: to learn a fact X, you don't have to actually learn it - it's enough for you to learn about such Y that P(X|Y) >> P(X). And you don't have to learn Y either; it's enough to learn V and W such that P(Y|V) >> P(Y) and P(Y|W) >> P(Y). Etc. This method applies recursively.

And once they comprehend the causality graph this forms, you need to make them understand just how much information we radiate all the time, and how humans are still very inefficient in doing the calculations mentioned above with all that data. Facebook is only the tip of an iceberg; it's only going to get worse from here, because modern technology keeps letting us explore the causality graph even further and faster.
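
To make the P(X|Y) >> P(X) idea concrete, here is a rough sketch in Python that applies Bayes' rule to a few weak signals in turn. Every number and signal name is invented for illustration, and the update assumes the signals are conditionally independent given X (a naive-Bayes simplification), which real data will not satisfy exactly.

```python
def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' rule: P(X | Y) from P(X), P(Y | X) and P(Y | not X)."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1.0 - prior))


# X = "this person is about to shop for cat food". All numbers are made up.
p_x = 0.01  # prior P(X) before any evidence

# (signal, P(signal | X), P(signal | not X)) -- hypothetical values
signals = [
    ("phone was located in a pet store", 0.30, 0.02),
    ("a friend posted kitten photos", 0.20, 0.05),
    ("searched for 'litter box'", 0.40, 0.01),
]

for name, p_if_true, p_if_false in signals:
    p_x = posterior(p_x, p_if_true, p_if_false)
    print(f"{name}: P(X | evidence so far) = {p_x:.3f}")
```

None of the signals mentions cat food, yet each one is far more likely when X is true than when it is false, so the estimate climbs quickly from the 1% prior.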

> the consequences of widespread illegal wiretapping would be severe for Facebook, they would really cease to exist.

Unfortunately, I don't think they would. They'd get a slap-on-the-wrist fine, maybe have one or two senior executives resign, they'd post a mea culpa saying how they know it was wrong and promise to do better, and then everyone would forget about it a few weeks later when some new thing started to dominate the news cycle.

Indeed. I doubt they would go down due to this. They'd receive some negative attention. They'll make some sweet PR statements and hop, they're back in business.
Ummm....https://www.reddit.com/r/videos/comments/79i4cj/youtube_user...

It's a pretty easy test - they just said "cat food" a bunch of times and ads for cat food came up in their feed the next day. I'm sure if you were interested you could probably do the same experiment with some other product you've never posted about or talked about online and come up with similar results. It seems strange that you would just assume that the people who are bringing up this issue are simply misinformed, or do not also know that facebook also looks at other parts of their data. I think most lay people understand how data mining is leveraged these days.

Hi, first comment so please be gentle.

I'd like to point out that this video doesn't actually provide any evidence that Facebook was listening.

They said they talked about cat food all day, then they showed an ad on Facebook for cat food.

There's no way to verify that they didn't make the video of the cat food ad on Facebook and then talk about cat food all day.

Nor is there any evidence that they talked about cat food all day at all. No recording of the conversations etc.

In fact I'm a little confused about how this is even a question.

I'm not a programmer so I don't know, but: Wouldn't it be trivial for someone who knows how to write Android software to monitor if an application is accessing the audio input device?

I mean, I know that on Linux you can monitor whether or not a device is being opened.
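
A minimal sketch of that kind of monitoring on a desktop Linux box, in Python: it walks `/proc/<pid>/fd` and reports processes holding a file open under `/dev/snd/` (where ALSA exposes sound devices). This is an illustration under assumptions, not a recipe for Android: there, and on desktops running a sound server, apps usually reach the microphone through an intermediary process, so the device may show up under that process instead of the app. You also need enough privileges to read other processes' fd directories.

```python
import os


def processes_with_open_device(device_prefix: str = "/dev/snd/", proc_root: str = "/proc"):
    """Map PID -> list of open paths under device_prefix, read from <proc_root>/<pid>/fd."""
    found = {}
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style procfs; nothing to inspect
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if target.startswith(device_prefix):
                found.setdefault(int(entry), []).append(target)
    return found


if __name__ == "__main__":
    for pid, paths in processes_with_open_device().items():
        print(pid, sorted(set(paths)))
```

Running it in a loop while using an app would show when something opens a capture device, at least for programs that talk to the device directly.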

Why doesn't someone check if Facebook is accessing the audio input?

That's been done, and no one has found any evidence that this is happening. Hence its status as a conspiracy theory.

As soon as you start assuming that the OS is giving the facebook app access to the microphone regardless of whether you allowed it in the settings, things start to get absurd

Link please?
> I'm not a programmer so I don't know, but: Wouldn't it be trivial for someone who knows how to write Android software to monitor if an application is accessing the audio input device?

Or even by MITMing the connection and looking at the network packets. But yes, it's not outside the realms of possibility to hook into the microphone driver on a rooted Android phone and check when it's being activated.

I'm not one to leap to Facebook's defence and if this is happening it needs to be shut down ASAP, but I suspect that there would be at least some credible evidence out there if it were indeed the case.

I'd like to see some more rigorous testing than this. They said it took 2 days to happen? Other people's anecdotes indicate it happens within minutes. I don't think this is conclusive, it's still within the realm of confirmation bias.

I feel like there'd be hundreds of these videos if it was real. How many people tried to make this video and it didn't work? I hope someone buys 10 new phones, opens 10 new Facebook accounts, and does some statistically rigorous testing.

This is also easy to fake, and I trust some random YouTuber much less than a public Facebook statement. It would be intensely stupid for the head of ads to deny this if they were doing it.
Well, she wasn't under oath, so there is no firm legal requirement for her to be truthful. Or maybe, being a public representative, she's not been briefed about this. Legal security through compartmentalization is pretty common in the corporate world.
> Well, she wasn't under oath, so there is no firm legal requirement for her to be truthful.

The same is true for the random YouTuber ;)

For sure :)
That could easily have been a random ad, combined with something like the Baader-Meinhof phenomenon.

To be a scientific test, it needs a list of products, randomly assigned half into a study group and half into a control group. Then talk all the time about stuff in the study group but not the control group, and see if there is a statistically significant difference between the ads shown for products in the study group vs the control group.
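
The design above could be sketched roughly like this in Python. The helper names, the product list, and the ad counts are all hypothetical; the significance check is a one-sided Fisher exact test (a hypergeometric tail), which is one reasonable choice for small counts, not the only one.

```python
import random
from math import comb


def assign_groups(products, seed=0):
    """Randomly split products into (study, control) halves."""
    shuffled = list(products)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def fisher_one_sided(study_hits, study_n, control_hits, control_n):
    """P(at least study_hits of the advertised products fall in the study group)
    if talking about a product has no effect on the ads shown."""
    total_hits = study_hits + control_hits
    total = study_n + control_n
    upper = min(total_hits, study_n)
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, study_n - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / comb(total, study_n)


# Hypothetical product list; talk only about the ones in `study`.
products = [f"product-{i}" for i in range(20)]
study, control = assign_groups(products)

# Invented outcome: ads appeared for 7 of 10 study products, 1 of 10 controls.
p_value = fisher_one_sided(study_hits=7, study_n=10, control_hits=1, control_n=10)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value would mean that such a lopsided split is unlikely to be luck. With equal hit counts in both groups the p-value stays large, which is what a single anecdote cannot tell you.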

In the video he claims not to have owned a cat or talked about cats for 20 years. They chose a subject that they are certain they have, as a couple, never exchanged words on.

Explaining away doesn't explain. For an average, normal FB user, this was as scientific as they could get. It would be nice to see someone repeat the experiment with a pro-hacker type on hand and a packet sniffer ++.

There are too many people giving a company like fb, who have form on the implementation of morally and ethically dubious practices, the benefit of the doubt, all the while dismissing any and all claims of users. It looks fishy at best, and I'm not name calling, it isn't allowed on this forum, but there are a disproportionate number of FB defenders popping up wherever this story surfaces.

Except he probably sees hundreds of ads for things he's never talked about for 20 years. It's not scientific at all, because he has no way of distinguishing a listening ad from luck. My statistical method would distinguish that.

But I guarantee you there are hundreds of pro-hacker types not giving facebook the benefit of the doubt. They're reverse engineering the apps, monitoring API calls from rooted phones, monitoring network traffic, etc. There are entire forums of people dedicated to hacking on android and finding rooting methods, etc.

There are tons of security researchers in academia desperate for a nice paper, eagerly looking for something juicy like facebook app listening to users. In fact these researchers have found that hundreds of android apps are listening to you[1]. But facebook isn't on that list.

The EFF has tons of smart people eager to dig into any little privacy mistake that a major company makes. They for sure would launch a huge lawsuit against facebook if this came out, and they very likely have some researchers looking into whether there's any listening.

Why haven't you heard about all the pro-hacker research into facebook's spying? Because they have all found no spying, because there isn't any. They don't want to publish that they found nothing, because then it would look like they're defending facebook, and these pro-hacker types don't like facebook and don't want to defend them.

[1] https://arstechnica.com/information-technology/2017/05/there...

This is what I came here to say. They may or may not be listening, but the end result is the same.
> Facebook is listening to your data--to all of our data, all at once.

So is Google. What makes FB even creepier is that they don't just watch everything people do, but they manipulate people in controlled experiments to extract as much money from them as possible.

> they manipulate people in controlled experiments to extract as much money from them as possible.

You just described A/B testing. This is a technique used by every successful company. The only thing unique about Facebook is the data.

FB is also okay using data other companies choose not to. Ethics is definitely a factor, too.
This is the thing. It's one thing to offer someone a solution they might want or appreciate based on their search. It's another to manipulate them into wanting it through emotional plays.
> and any other form of browsing patterns for everyone in your country

Definitely. Basically every major website has facebook code on it for social sharing and for "analytics". Facebook surely uses this analytics data to track you from site to site and know exactly what you're looking at. If your browser is signed into facebook, they know exactly who you are too.

That's all very true and very important, but I think it misses the point of contention that many people are rightly focusing on.

There's an issue of whether facebook is, despite what they say, doing some sort of data analysis that's relevant to advertising, in addition to all the other listening and analyzing they do with the rest of our data.

"all of our data" includes recorded audio.
What an insightful comment. Not sure if that’s your idea or not but the bit about enough data being indistinguishable from surveillance is very interesting and if evolved properly may one day make a great argument for constitutional protections around data collection and privacy.
> a sufficiently advanced amount of data and analysis will be indistinguishable from surveillance.

What’s surveillance if not “sufficiently advanced amount of data”? They’re indistinguishable because they are the same.

Facebook isn’t “collecting data”. It really is following you.

So... Why should we care?

If I had to give my father some reasons why this should concern him, I'd come up blank. Maybe you can fill in a few.

The reasons you should care are numerous. One facet of the dystopian nightmare that total surveillance inevitably brings is already being realized in China.

https://www.wsj.com/articles/chinas-new-tool-for-social-cont...

They're collecting data on children without their informed consent. Babies today have their information capital compromised from the moment they are born, via their parents' shares, purchases, and activity, without their say.
But then why should anyone care about "informed consent" of their "information capital"?
Because the risks involved with complete knowledge of everyone at every moment are huge. Consider that most developed countries have for decades spent millions on spy agencies to get just enough dirt on people of interest to be able to manipulate them.

Imagine that the same information is now available on all people to many giant companies, some with almost government-like spheres of influence, and you can see the potential for manipulation grow.

The other part of Facebook in particular is that they sell that leverage to advertisers. An example of that would be the possibility that Russia influenced the US election. Whether or not it happened in either direction, the concept and possibility is something to be worried about.

The reason we should still care even though it seems ubiquitous already is because we messed up by getting here, but we can still fix it.

Well I'm paranoid enough to think such possibilities are a problem. But how do you sell those concerns to the average person?
Already many companies check the social media accounts of people applying for a job. A company can't legally discriminate against you based on gender, race, ethnicity, age, etc... But an HR drone can easily roundfile your resume because he doesn't agree with your political views.

That's why employers requiring social media information is illegal in a dozen states or so. (Illinois, I know for sure. There are others.)

Is your question why privacy is important, or are you thinking of something else?
They have to respect our privacy. The moment they don't, it will cause widespread outrage.

Once I realized that, it was hard to think of reasons why their actions are wrong.

Facebook is amassing a huge amount of valuable information on every member.

I disagree that the threat of outrage is sufficient to stop the information being used badly. For example, there is outrage against Equifax, but their cache of information has still been compromised. We have seen data leak after data leak from various companies. If a country's spy agencies want to go after the data, they have a lot of resources, from hacking to legislation to physical intrusion or coercion.

Add to that, it seems like Facebook doesn't institutionally care about privacy (probably because it is hard to explain something to someone whose paycheck depends on them not understanding it). For example, http://actualfacebookgraphsearches.tumblr.com has some very damaging graph searches. Or people who have been outed as gay by incomprehensible privacy settings.

Facebook is a sieve, and the reason to care about them having a lot of information is the same reason to care about privacy in general.

Remember when Target snitched to a teenager’s parents that she was pregnant because her purchases fit the “is pregnant” profile, and everyone got outraged, and they stopped profiling people?

Actually, only the first two things happened.

Actually, only one of those things happened, the one in the middle. I'll let you dig into Forbes's "sources" to figure out how they managed to twist a hypothetical into something that already happened.
I did a quick search and found a Forbes article based on a New York Times article which claims it really happened:

https://mobile.nytimes.com/2012/02/19/magazine/shopping-habi...

It is possible that their source was lying or mistaken, of course.

I ... don’t know what to say. What they are doing now is respectful?

If we ever come to widespread outrage, it will be much too late. In fact, I don’t think there can be outrage. FB would detect it and make it disappear. Somehow.

There has been enough manufactured outrage to make people or decision makers stop caring about it.
> The moment they don't, it will cause widespread outrage.

lol.

For me the fear is insurance companies denying cover or increasing premiums based on some random search, share or like that actually has no relevance to the cover, then not telling you why they made that decision.