by snickerdoodle12 325 days ago
They're not just training on pirated content, they've also scraped literally the entire internet and used that too.

Scraping the public internet is also not a CFAA violation.
CFAA bans accessing a protected computer without authorization. Hitting URLs denied by robots.txt has been argued to be just that.
> Hitting URLs denied by robots.txt has been argued to be just that.

"Has been argued" -- sure, but never successfully. In fact, in hiQ v. LinkedIn, the 9th Circuit ruled twice (once before, and again on remand after applying the Supreme Court's ruling in Van Buren v. US) that continuing to access data on a public website, despite a cease-and-desist letter on top of robots.txt, does not constitute access "without authorization" under the CFAA.

Now do every other jurisdiction.
CFAA was mentioned specifically, which means only US jurisdiction is relevant here.