by brechin 4935 days ago
Scraping, in itself, is usually not wrong, illegal, or prohibited by a site's TOS. Your site's TOS, for example, would allow me to scrape all your content for my own personal use, but I couldn't re-publish or re-sell the info.

It seems hard to limit legitimate uses of a free resource without changing the requirements on how users access the site (require account signup, use CAPTCHAs, use CSS/JS to only display properly in a browser).

As one who does a lot of scraping, I have encountered few barriers that can't be (legally) overcome with a reasonable amount of effort.

1 comment

>Your site would allow me, for example, to scrape all your content for my own personal use //

On what basis are you claiming this? It sounds like it would be true under the fair-use provisions of US copyright law, but it's certainly not true in the UK (and by extension, I presume, not for scraping you perform on content served from the UK, though I've yet to read a thorough treatment of how the [i.e. any] law deals with server locations).

Commercial restrictions usually reach much further than selling, too: not only could you not resell the content, you couldn't distribute it either (whether by publishing or otherwise).

From TOS on his site:

"Cucumbertown authorizes you to view, download and/or print the Materials only for personal, non-commercial use, provided that you keep intact all copyright and other proprietary notices contained in the original Materials."

So, for example, I could grab everything from the site (minus copyrighted images, etc.) and make my own personal DB of the content. Obviously that's a lot of effort for a little reward for one person, but if I created a repo with a set of tools for people to do this for themselves it could become a big legitimate source of "scraping" traffic.
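To make the "personal DB" idea concrete, here is a rough sketch of what the storage half of such a tool might look like, using only the Python standard library. Everything in it is my own assumption for illustration: the `pages` table layout and the names `TextExtractor`, `open_db`, and `save_page` are hypothetical, and the fetching step is deliberately left out (you pass in HTML you have already downloaded). It keeps the raw HTML alongside the extracted text so that any copyright or proprietary notices in the original stay intact, as the TOS clause quoted above requires.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text of a page.

    Text inside <script> and <style> elements is skipped.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def open_db(path=":memory:"):
    """Open (or create) the local database; the schema is a hypothetical one."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, title TEXT, body TEXT, raw_html TEXT)"
    )
    return conn


def save_page(conn, url, html):
    """Store one already-downloaded page, keyed by URL.

    The raw HTML is kept unmodified next to the extracted text, so
    notices in the original are preserved.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, title, body, raw_html) "
        "VALUES (?, ?, ?, ?)",
        (url, parser.title, "\n".join(parser.chunks), html),
    )
    conn.commit()
```

Usage would be something like `conn = open_db("recipes.db")` followed by `save_page(conn, url, html)` for each page you have fetched. Saving the same URL twice just overwrites the earlier row.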