by TechBro8615 1447 days ago
Indeed. It's the height of hypocrisy for a company to define the borders of its own system and then prosecute those it considers in violation of them. There is no consideration given to whether the data should have been collected and retained by Facebook in the first place, regardless of whatever arbitrary access policies they defined to fit their own business and data model.

It's not clear what Facebook's position on scraping truly is. Sometimes they downplay it as "normalized and widespread," and other times they castigate it as inexplicably legal and clearly immoral, or even outright "in violation of state and federal law." For example:

- April 2021. Researchers find an exposed database containing the scraped data of 533 million Facebook users. Some news reports refer to it as a "breach." Facebook attempts to downplay the issue as the result of third-party scraping. Headline in ZDNet: "Internal Facebook email reveals intent to frame data scraping as ‘normalized, broad industry issue’" [0]

- October 2020. Facebook announces lawsuits against companies it claimed created a "malicious extension on Google’s Chrome Web Store designed to scrape Facebook, in violation of Facebook’s Terms and Policies and state and federal law." [1]

So... which is it? Does Facebook believe that scraping is a "normalized, broad industry issue"? Or is it a violation of "state and federal law"? It seems like they measure the severity of scraping primarily by the reactions of political commentators.

And what's the difference between automating a browser and automating an API client? Why did Facebook design an API for accessing the data they collected, if it's illegal to collect? They've even claimed to be the victim of Cambridge Analytica, who purchased a "quiz" application created by a developer who pieced it together using code straight from the "examples" section of Facebook's API documentation.

There is one obvious resolution to this apparent contradiction. If we remove Facebook from the question, then the contradiction resolves itself. All we need to do is stop presuming that Facebook has the right to collect and retain this data in the first place. And as a user, if you publish your data to a website designed for sharing it with other people, then by definition it is no longer private data. Therein lies the central question: what is "semi-private" data, and who controls its boundaries?

[0] https://www.zdnet.com/article/facebook-internal-email-reveal...

[1] https://about.fb.com/news/2020/10/taking-legal-action-agains...

P.S. Another thing they never mention is why companies want to scrape lists of Facebook users. Perhaps it has something to do with the "lookalike audience" feature, and its more precisely targetable predecessors, which allow advertisers to upload a list of usernames and email addresses for targeted advertising?