Hacker News
by jazzyjackson 1789 days ago
When is it illegal to scrape a website?
1 comment

There are bad scrapers out there. Plenty of common problems:

- Denial of Service by queries - they hit search pages with complex or slow queries, diving to the 1000th page of results. This kills the db.

- Denial of Service by parallelization - they hit 1000s of pages at once, causing the server to run out of memory or hit other issues. This kills the web server.

- Denial of Service by bugs - their code is buggy, and a slight change to the page causes their scrapes to repeat ad nauseam.

- Bad URL/cookie scrapes - they hit URLs that perform actions (say, add to cart) against websites. This leaves sites tracking bogus data for abandoned carts, sessions, and item popularity.
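
For what it's worth, avoiding all four of these from the scraper side doesn't take much. Here is a rough sketch, not a definitive implementation: the constants (`MAX_PAGE`, `MAX_RETRIES`, `MIN_DELAY_SECONDS`), the `ACTION_HINTS` substrings, the `page` query parameter, and the `fetch` callable are all my own assumptions for illustration, and a real site will need its own rules.

```python
import time
from urllib.parse import urlparse, parse_qs

# All values below are assumed for illustration; tune them per site.
MAX_PAGE = 50             # don't dive to the 1000th page of results
MAX_RETRIES = 3           # a buggy parse must not re-fetch forever
MIN_DELAY_SECONDS = 1.0   # one request at a time, with a pause
ACTION_HINTS = ("add-to-cart", "add_to_cart", "checkout", "logout", "delete")


def is_safe_url(url, page_param="page"):
    """Return True if the URL looks read-only and is not deep in pagination."""
    parsed = urlparse(url)
    lowered = (parsed.path + "?" + parsed.query).lower()
    # Skip URLs that look like they perform an action on the site.
    if any(hint in lowered for hint in ACTION_HINTS):
        return False
    # Treat a missing page parameter as page 1.
    pages = parse_qs(parsed.query).get(page_param, ["1"])
    try:
        return int(pages[0]) <= MAX_PAGE
    except ValueError:
        return False


def polite_crawl(urls, fetch, sleep=time.sleep):
    """Fetch URLs sequentially with a delay, a retry cap, and URL filtering.

    `fetch` is a hypothetical callable that takes a URL and returns the page
    body, or None when the fetch or parse failed.
    """
    results = {}
    seen = set()
    for url in urls:
        # Never fetch the same URL twice, and never fetch unsafe ones.
        if url in seen or not is_safe_url(url):
            continue
        seen.add(url)
        for _ in range(MAX_RETRIES):
            body = fetch(url)
            sleep(MIN_DELAY_SECONDS)  # pause after every request
            if body is not None:
                results[url] = body
                break
    return results
```

The crawl is deliberately sequential, so there is never more than one request in flight, and a URL that keeps failing gets dropped after `MAX_RETRIES` attempts instead of being hammered.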

If scraping didn't affect the server negatively, it would feel less illegal.

Let's not forget data mining. People build whole businesses on this and a lot of them are parasitic. They are profiting from data that is sometimes costly to obtain.

All the companies that scrape LinkedIn tie it to other social media and build power profiles on people, such that the CIA is clapping with a smile.