HN Mirror


by asdkhadsj 2173 days ago

Do you host the content as well? What's the legality of that?

I ask because I want to embed some archiving and "reader mode" logic into an app of mine that would be FOSS and self-hosted. However, that means each individual user would effectively be scraping and archiving, and possibly spreading over p2p, news content (as data sources).

So I'm curious if there is some underlying "fair use"-like mechanism that allows Archive, Outline.com, and you to consume news content without it being considered piracy.

1 comments

caballeto 2173 days ago

This is a great question. Thus far I have assumed that content that is publicly available without any limitation (e.g. membership access) can be scraped by anyone. You can take a look at hiQ Labs v. LinkedIn [1]. LinkedIn's public data was scraped by a data analytics company, and the ruling was against LinkedIn.

I also think that the use case matters: I don't republish their content on the site, but merely provide it via an API. Technically, it could be argued that users could get this data themselves, but it is easier for them to use a service like this one to simplify things.

[1] https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn

artembugara 2173 days ago

hiQ Labs v. LinkedIn is a bad example. It is super specific.

LinkedIn did not own the data (it was the users').

In your case, you are reselling a copyrighted product.

Of course you can scrape. That does not mean you can distribute what you scrape.

hboon 2171 days ago

What about the Google Search API?