Hacker News
by luffy 5731 days ago
Regarding scraping:

I have a hard time seeing the difference between a human reading a web page with sports scores on it and entering those scores into your application, and a scraper grabbing those scores automatically. In most cases, the source pages are publicly available and don't require agreeing to a terms of service contract.

Scraping a site and using the actual HTML in your application would be a copyright violation, definitely. Sometimes a particular format can even be patented. So I'd definitely stay away from actually scraping out an entire table and inserting that into your app.

But the scores and other facts themselves are not subject to copyright. So what is the specific legal issue if you scrape only non-copyrightable facts from a publicly available web page? I'm genuinely curious to know.

1 comments

I don't know. I agree that game scores, roster data, etc. are facts, but work was performed at some point to compile that data, so I assume there must be some legal restriction on simply copying and reusing it. Of course, I don't know for sure; it's just a gut feeling. I wish I could find a definitive answer without paying legal consultation fees I can't afford.
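From a legal point of view (in the U.S., at least), duplicating facts is allowed under Feist Publications v. Rural Telephone Service. I learned this in my IP law class. That case rejected the "sweat of the brow" idea that the effort spent compiling facts is enough, on its own, to earn copyright protection. Duplicating original expression is what's forbidden. Stats, imo, fall into the "facts" category.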