|
|
|
|
|
by achrono
395 days ago
|
|
If anyone was skeptical of the US government being deeply entrenched with these companies in letting this blatant violation of the spirit of the law [1] continue, this should hopefully secure the conclusion. And for the future, here's one heuristic: if there is a profound violation of the law anywhere that (relatively speaking) is ignored or severely downplayed, it is likely that interested parties have arrived at an understanding. Or in other words, a conspiracy. [1] There are tons of legal arguments on both sides, but for me it is enough to ask: if this is not illegal and is totally fair use (maybe even because, oh no look at what China's doing, etc.), why did they have to resort to & foster piracy in order to obtain this? |
|
European here, but why do you think this is so clear cut? There are other jurisdictions where training on copyrighted data has already been allowed by law/caselaw (Germany and Japan). Why do you need a conspiracy in the US?
AFAICT the US copyright law deals with direct reproductions of a copyrighted piece of content (and also carves out some leeway with direct reproduction, like fair use). I think we can all agree by now that LLMs don't fully reproduce "letter perfect" content, right? What then is the "spirit" of the law that you think was broken here? Isn't this the definition of "transformative work"?
Of note is also the other big case involving books - the one where google was allowed to process mountains of books, they were sued and allowed to continue. How is scanning & indexing tons of books different than scanning & "training" an LLM?