by Ticklee 1888 days ago
They already have, the internet is a mess.

Every website you visit wants more and more of your data. Facebook played a huge role in making this level of data sharing widespread.


I would claim the opposite. Facebook normalized the belief that the information people put on their FB page was NOT being scraped by many people even though it was. The rise of Facebook accompanied a whole belief system about "things I share with my friends on the Internet".

It seems like Facebook is now large enough that they're effectively owning up to the unavoidable truth - there's no way that information made available to all subscribers of some largish social network isn't going to be public to the world.

That's not an accurate framing of the situation. Sure, they relied on people's technological illiteracy to do things people didn't really think were possible for a while. But in the face of the news about recent leaks, and the Cambridge Analytica scandal in particular, they have had to switch to a more active PR strategy to quell the concerns people have about their product(s).

The Cambridge Analytica scandal was three years ago and this article is about PR moves Facebook is doing now.

And sure, I don't give every gruesome detail in the rise, but I'd still claim that the overall situation is that Facebook is large enough and its model porous enough that a variety of actors have scraped it, are scraping it and will scrape it. And given this, Facebook has to start owning up to an inevitable situation. Keep in mind, the Cambridge Analytica scandal was predicated on Facebook's claimed data model (which I'd claim isn't just false but also "can't be true"). Sure, the easiest way to scrape it is having API access, which it's hard not to give to your advertisers. But if Facebook gave no one API access, various actors would be scraping it directly.

And overall, I'd say the Cambridge Analytica scandal was the thing that wasn't a good framing of the broad problems of Facebook and privacy.

Edit: "But in the face of the news about recent leaks, and the Cambridge Analytica scandal in particular, they have had to switch to a more active PR strategy to quell the concerns people have about their product(s)."

And I'd say this is, again, actually the wrong frame. Facebook is at the center of the storm, no doubt. But there is no large social network possible that wouldn't be subject to the general privacy problems of Facebook. Facebook created the fantasy definition of privacy and Facebook violated that definition, but no one could satisfy it.

Scandals linger far longer than 3 years.

A great example is M&M’s dye choice, which became controversial due to customer confusion over which red dyes were harmful. The company couldn’t simply change the dye, because what they were using wasn’t problematic. In the end they had to flat out stop selling red M&M’s for over a decade, and their reintroduction was surprisingly controversial.

If you read my gp, I'm not arguing the Cambridge Analytica scandal didn't influence Facebook. I'm arguing the real, larger frame is that Facebook can't help but be porous and it's acknowledging that truth out of self-interest. That helps them avoid scandal, yes, but contrary to the earlier poster, "it's cause of scandal" or "it's cause Facebook bad" is a bad, distorting frame. And that isn't saying Facebook is good, it's saying the entire framework of social networks and things propagating on the Internet creates a certain kind of "playing field".

I would speculate, in fact, that Facebook is acting now to make the obvious point that of course people are going to be scraping the data on their site: after X many scandals, it's becoming obvious that people will do that, that they will do it to any site like Facebook, and that Facebook will have much clearer cover if it "normalizes" things that are ... fricken normal.

I'd further speculate that they couldn't act when Cambridge Analytica was fresher because they'd have been seen as self-justifying, and back then they had to be seen as humble and apologetic.

>customer confusion over which red dyes were harmful

Never heard of that; is that cochineal?

Some red azo food dyes have been shown to be carcinogenic.

It's not just Facebook. Every sales, marketing, or product person in the world basically has an unlimited appetite for data and will push to suck up as much data as possible.

There is a logical reason for this: one of the toughest things is knowing what your users actually want and what their actual pain points are. In advertising there's an analogous problem often summarized as: "I know I am wasting 80% of my ad spend, but I don't know which 80%."

Every single incentive on the business side rewards data grabbing. This will never change unless users vote hard with their wallets or unless there is protective legislation.

I wholeheartedly think HTTP/S is irreparably damaged. For example, it is impossible to find good information on search engines, even the free ones like Searx. If you are able to find a website, you can bet it includes 10MiB of trackers and ads.

Hopefully someone writes a better protocol with no third party cookies and heavily restricted javascript.

Hate to nitpick, but those things are not features of the HTTP protocol from the IETF but of HTML from the W3C.

Nitpick away, when I am ignorant I'd rather be told than stay ignorant.

As the other poster pointed out, those are properties of HTML and not HTTP/S. But what I'd like to point out is that this:

> heavily restricted javascript

Is basically impossible. Any useful subset of javascript would be Turing-complete, and therefore enough to do whatever's necessary to track the user. Literally all you need to be able to do is make an HTTP request and bam, you can track.
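To make the "one HTTP request is enough" point concrete, here's a minimal sketch in TypeScript. The endpoint `tracker.example`, the parameter names `vid` and `page`, and the helper `buildBeaconUrl` are all made up for illustration. The sketch only builds the URL; nothing is actually sent.

```typescript
// Hypothetical collector endpoint; not a real service.
const COLLECTOR = "https://tracker.example/pixel.gif";

// Build the URL a script would request to report a page view.
// Everything the tracker learns travels in the query string.
function buildBeaconUrl(visitorId: string, page: string): string {
  const params = [
    "vid=" + encodeURIComponent(visitorId),
    "page=" + encodeURIComponent(page),
  ];
  return COLLECTOR + "?" + params.join("&");
}

// In a browser, requesting this URL by any means (an image load, a fetch,
// a form post) would deliver the identifier to the collector.
const beacon = buildBeaconUrl("abc 123", "/pricing?plan=pro");
```

The script needs no special API for this, only some way to cause a request to a URL it controls.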

Turing-complete is (kind of) irrelevant, the question is what (equivalent of) system calls it has access to. E.g., javascript should not be able to set cookies or cause network traffic after page load by default.
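A rough sketch of that distinction, in TypeScript. This is a toy model, not how any browser actually works: `Sandbox`, `Capability`, and `trackingScript` are hypothetical names, and the "network" and "cookie" calls only write to a log. The script can compute anything it likes, but its effects depend on which capabilities the host granted.

```typescript
// Hypothetical capability model: a script can only reach the outside world
// through functions the host chooses to hand it.
type Capability = "network" | "cookies";

class Sandbox {
  readonly log: string[] = [];
  private readonly granted: Set<Capability>;

  constructor(granted: Capability[]) {
    this.granted = new Set(granted);
  }

  // Stand-in for a network call: records the attempt instead of sending anything.
  request(url: string): boolean {
    const allowed = this.granted.has("network");
    this.log.push((allowed ? "sent request: " : "blocked request: ") + url);
    return allowed;
  }

  // Stand-in for writing a cookie.
  setCookie(name: string, value: string): boolean {
    const allowed = this.granted.has("cookies");
    this.log.push((allowed ? "set cookie: " : "blocked cookie: ") + name + "=" + value);
    return allowed;
  }
}

// The script can still loop and compute whatever it likes...
function trackingScript(sb: Sandbox): number {
  let fingerprint = 0;
  for (let i = 0; i < 1000; i++) fingerprint += i;
  // ...but what it can do with the result depends on the sandbox.
  sb.setCookie("vid", String(fingerprint));
  sb.request("https://tracker.example/hit?vid=" + fingerprint);
  return fingerprint;
}

const locked = new Sandbox([]); // the "restricted by default" case
const permissive = new Sandbox(["network", "cookies"]);
trackingScript(locked);
trackingScript(permissive);
```

Both runs compute the same value; only the permissive sandbox lets it leave the page.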

> all you need to be able to do is make an HTTP request

Precisely. Inability to do this is (part of) what "heavily restricted javascript" means.

Haha yeah, never going to happen. No network calls = no real-time dashboards, which is a no-go for basically any company. Not only that, it doesn't really solve the issue. You can just make the user click a button which triggers the network call anyways, and bam you can track again. Restricting javascript wouldn't work.
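The click workaround can be sketched the same way. This TypeScript toy assumes a made-up policy where requests are only allowed while a user click is being handled. `GestureGatedNetwork` and both URLs are hypothetical, and nothing is really sent.

```typescript
// Hypothetical policy: requests are permitted only while a user click is being handled.
class GestureGatedNetwork {
  readonly sent: string[] = [];
  private inGesture = false;

  // The host calls this when the user clicks; the handler runs with network access.
  dispatchClick(handler: () => void): void {
    this.inGesture = true;
    try {
      handler();
    } finally {
      this.inGesture = false;
    }
  }

  // Stand-in for a network call: records the URL instead of sending anything.
  request(url: string): boolean {
    if (!this.inGesture) return false;
    this.sent.push(url);
    return true;
  }
}

const net = new GestureGatedNetwork();

// Tracking on page load is refused under this policy...
const loadAllowed = net.request("https://tracker.example/hit?event=load");

// ...but once the user clicks anything, the tracking call rides along.
net.dispatchClick(() => {
  net.request("https://shop.example/api/cart"); // the request the user asked for
  net.request("https://tracker.example/hit?event=click"); // the one they didn't
});
```

Under this policy the page-load beacon is dropped, but the first click still reports to the tracker.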
Blogs were a major death blow to the web. Before that, web pages were more like books, references, unique compilations and compendiums of knowledge, random nonsense, and individual people's musings for no other reason than to put it out there. Now, 99% of all "blog" content is just there to somehow sucker a poor soul into typing an unfortunate series of terms into a search bar out of desperation for knowledge/info, such that the semi-random search results get the person's eyeballs to fall on said "blog" content so that an ad can be sold.