by Grimm1 1681 days ago
No? If you place information publicly on a website it's pretty much fair game, no copyright violation, especially regarding user-generated information. That's my take, but legally it's a gray area and it's still going back and forth in the courts (at least in the US). For a while, though, before the decision was vacated by the Supreme Court, scraping publicly available information on a site was legally protected, which seemed in line with my thoughts on it.

If we are to live in a mutually prosperous society, how is the labor, and therefore the well-being, of the content creator improved by a web scraper? Does this precedent not injure future opportunities for devoting one’s life to making website data available for others to scrape?

By my understanding, any website with a copyright disclaimer warrants its data as exclusively its own and is granting permission for other web users to generate it, i.e. people are not entitled to share that web data with anyone. So if they are, and we agree that it’s good that they do, and they continue to create information for others to know, how do we avoid the implicit harm in extracting data with nothing being given in return, while possibly harming the internet experience for everyone accessing the same information?

I'm actually cool assuming there is implicit harm and no benefit, but by that logic we need to tear Google down too. I'm cool making that trade but it has to be done equally.

If you can't make that trade, then you've weighed the value provided by an organization like Google as greater than the copyright of these content creators. In that case, I want other players who may want to challenge Google to have the same protections and access Google does, so they have a chance at providing the same value.

Well actually, isn’t Google improving the value of the content, ergo the property itself, by improving its accessibility? I was implying a one-way street, where the data is accumulated in a way that can lead to server crashes, which I don’t believe Google’s web crawl does at all (in fact that would be counter-productive).
I'd say most crawlers are looking to provide enriched value for content at their end use. Google is just an aggregator (the biggest by far) but other aggregators are looking to provide similar value.
So then an aggregator is different than a scraping service with respect to the value given to the rest of humanity? In that, in principle, one adds value to the content creation and the other deducts from it through potentially harmful interference with its reciprocity?
I'd say many aggregators do offer value to the rest of humanity, but I imagine there are probably some exceptions. Also, not all scraper services offer no value; it's just different value to different people.

Some scraping services make their money by offering scraping services to companies for specific information and you could argue they provide value to other businesses that way, but not to the broader "rest of humanity".

So I'm not sure it's as simple as just "aggregator" good, "scraping service" bad, as value provided takes on many different forms, and that's what makes this difficult.

I guess it may come down to your take on what you think of middlemen, because they are all effectively middlemen in the data economy.

Edit: I was rereading your comment. With respect directly to the value added to the content, then yes, maybe it is clearer that aggregators are in principle different, because they do add that value, whereas scraping services that sell the data do not offer any enrichment to the content creator. I personally think protecting content aggregators that republish the data to create visibility or other value for the content creator, to the extent that they're not worried about being sued for it, is probably worthwhile because of the net benefit to our ability to find information/content.