by kordlessagain 1448 days ago
Facebook has hidden much of Instagram's content behind logins, so that makes most of it "not public".

At the same time, I don't think all of Instagram's users care whether their images are hidden or not.

It's quite unfortunate Facebook/Meta is using hostile language and the word "scraping" together in this case. Scraping is a legitimate process used by various business models to gather information from the Web, which itself was originally intended to be an open forum for people to share content.

Hostile business models have corrupted that intent and turned the Web into a competitive environment that harms users and legitimate businesses that may not have the funding larger corporations can muster.

I have a "scraper" I've built that will either snapshot a page from a user's browser or crawl it remotely with Selenium/Firefox, on the user's behalf, to save the content in an index that the same user can search later. It's not automated, nor does it parse and crawl URLs in the pages saved. It doesn't use page content in a wider context, either.
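
A minimal sketch of the save-then-search half of that idea, assuming the page text has already been captured (from a browser snapshot or a Selenium/Firefox fetch). `UserPageIndex` and its methods are hypothetical names for illustration, not the actual tool's code. It keeps one index per user and never follows links found in saved pages.

```python
import re
from collections import defaultdict


class UserPageIndex:
    """In-memory store of pages one user has saved, searchable by keyword."""

    def __init__(self):
        self.pages = {}                   # url -> saved text
        self.postings = defaultdict(set)  # token -> urls containing it

    @staticmethod
    def _tokens(text):
        # Lowercased alphanumeric words; deliberately simple.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def save(self, url, text):
        """Store one page. Links inside the text are not followed."""
        if url in self.pages:
            # Re-saving a page replaces its old entries.
            for tok in self._tokens(self.pages[url]):
                self.postings[tok].discard(url)
        self.pages[url] = text
        for tok in self._tokens(text):
            self.postings[tok].add(url)

    def search(self, query):
        """Return URLs whose saved text contains every query term."""
        terms = self._tokens(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


index = UserPageIndex()
index.save("https://example.com/a", "Open web scraping notes")
index.save("https://example.com/b", "Notes on search indexes")
print(index.search("scraping notes"))
```

A real version would persist the index to disk and use a proper full-text engine; the point here is only that each save is triggered by the user and each search is scoped to that user's own pages.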

I've spent a significant amount of time trying to "work around" anti-scraping efforts by various companies and it's frustrating to see hostility instead of cooperation in certain types of use.

1 comment

> Facebook has hidden much of Instagram's content behind logins, so that makes most of it "not public".

1) It was public when the content was posted by its authors. Facebook locked it down retroactively, regardless of the author's intent.

2) A login requirement doesn't make it non-public, if making an account is trivial, and there are already hundreds of millions of accounts. Is the plot of Avengers: Endgame also not public, because it's locked behind a ticket purchase or subscription?

Also, the login requirement isn't consistent. For example, Google doesn't need to log in to index those pages, and neither do you for the first few profiles. Only after your identity (IP or fingerprint) is known does Instagram start locking public content behind login gates.