

by cookiecaper 2944 days ago

The generally accepted precedent is that yes, unwanted scraping violates the CFAA. Because the problem is relatively new, the case law around it is still developing, but there have been many high-profile scraping cases, and the scraper almost always loses.

The reality is that the CFAA is extremely broad and if we want to protect "scraping", better termed something like "data preservation" or "data recovery", we need to change the CFAA, copyright, and the applicability of EULAs (which effectively work to plug any tiny leak that someone may've found through the CFAA-copyright combo).

Copyright itself makes it effectively illegal to read a web page without the owner's consent, even if a) there is no trespass/unauthorized access (CFAA); and b) there is no infringement in the actual content extracted. This occurs because the markup and other necessary supporting material around a page are a copyrighted work, and just reading them into memory and then immediately discarding them is considered a sufficiently tangible copy to infringe on that work.

This is called the "RAM Copy Doctrine", and it has been [mis]applied to scrapers many times. In Facebook v. Power Ventures, it was used to stop a startup from helping Facebook users extract their own content. That founder was left owing $3M in damages to Facebook.

hiQ v. LinkedIn is the most notable recent exception, but those rulings seem to be pure judicial activism, unsupported by precedent or really any legal underpinnings, and will surely be overturned on appeal.

For every high-profile hiQ v. LinkedIn-style success, there are a good number of losses. Scrapers losing has been fairly routine since Craigslist v. 3Taps.

IANAL, but my SaaS business, which depended on a key piece of scraped data, was destroyed by a legal threat from a Fortune 100.