I built a simple CRUD app for a previous (small) employer. Nothing special technology-wise, but a good concept, sound business model, and backed up with a couple of full-time staff creating content for it. Line one of the T&Cs was "no scraping". Business model was based on sales to individual users but we were prepared to do analysis in aggregate if asked.

A scraper company, funded by magic money (Knight Foundation grants) and $1m of VC, convinced a (UK) Government department to pay them to scrape our site for some analysis the department wanted. They'd never contacted us, never asked for permission, never asked if we could supply the data.

Our company was bumping along at this point and having to lay people off. Income from a nice lucrative Government contract would have kept a couple more people in work.

The scraper company's FAQ was, in my view, full-on unethical:

> "we check the robots.txt file. If the site permits robots in general to scrape their site (NOT just GoogleBot!), then we will do so. We will make no effort to look for other terms and conditions as well."

You will ostentatiously "make no effort to look" for T&Cs in case they prohibit the significant contract you're about to sign with the Government? Whoa.

So how I feel about web scraping is simple: "don't be evil". If you're diverting income or traffic from the original site, don't do it. If you're genuinely adding value, go for it, but be open, be prepared to work with the original site, and be prepared to accede to their wishes.