by titusjohnson
1567 days ago
Please don't attempt to equate internet traffic to door locking. It's a tired old argument that fails the moment critical thought is applied.
> Unauthorized access can occur whether the bucket is public or not. The law does not require that sufficient measures (or any measures, really) be taken to protect the assets in question. We can disagree as to whether it should, but that's not how it's written today.
Citation needed. Probably more than one.
Web scraping is most certainly legal. Everything involved in the ridiculous "breaking and entering an unlocked residential door" is done a billion times a day by web scrapers as a matter of course. The act of doing GET / wraps up finding a home, evaluating its entrances, knocking, opening the door, and taking photos of the entryway. In 50ms.
I do agree with your last line. Definitely think about whether a judge would laugh at you or not...
|
It's a useful metaphor that gets people convicted. You might not like it or agree with it, but that's the way it is.
> Web scraping is most certainly legal. Everything involved in the ridiculous "breaking and entering an unlocked residential door" is done a billion times a day by web scrapers as a matter of course
Unfortunately you, like others, are ignoring the crucial element of consent. Web scraping is done lawfully only with the consent of the website scraped. When scraping is done non-consensually -- even if the website is public -- it can be considered trespass to chattels and might even constitute a CFAA violation. I know this because my company scraped eBay without their consent in the late 1990s/early 2000s and was shut down by a lawsuit. See, e.g., eBay v. Bidder's Edge, 100 F. Supp. 2d 1058 (N.D. Cal. 2000) (not my specific employer at the time, but in the same business).
Ignore robots.txt at your peril, and treat the absence of one as a lack of consent. That's what Google and other search engines do.
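For anyone writing a scraper, here is a minimal sketch of that conservative policy in Python, using the standard library's urllib.robotparser. The function name may_fetch and the "examplebot" user agent are made up for illustration, and the caller is assumed to have already downloaded robots.txt (passing None if the site has none). This only checks robots.txt rules; it says nothing about terms of service or what a court would decide.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: Optional[str], user_agent: str, url: str) -> bool:
    """Return True only if a robots.txt exists and permits user_agent to fetch url.

    may_fetch is a hypothetical helper, not part of any library.
    """
    if robots_txt is None:
        # No robots.txt at all: treated as lack of consent, per the policy above.
        return False
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything under /private/ is off limits to all agents.
rules = "User-agent: *\nDisallow: /private/\n"
print(may_fetch(rules, "examplebot", "https://example.com/private/data"))
print(may_fetch(rules, "examplebot", "https://example.com/listings"))
print(may_fetch(None, "examplebot", "https://example.com/listings"))
```

Note that this is stricter than the parser's own default: an empty robots.txt allows everything, whereas a missing one is refused here.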