by twoodfin
1063 days ago
It actually doesn’t even matter if LLMs reproduce copyrighted data from their training. The issue is that a human copied the data from its source into memory for use in training, and this copy was likely not fair use under cases like MAI Systems.

The Supreme Court hasn’t ruled on a software case like this, as far as I know. But given the recent 7-2 decision against Andy Warhol’s estate for his copying of photographs of Prince, this doesn’t seem like a Court that’s ready to say copying terabytes of unlicensed material for a commercial purpose is OK.

I’m going to guess this ends with Congress setting up some kind of clearinghouse for copyrighted training material: You opt in to be included, you get fees from OpenAI when they use what you added. This isn’t unprecedented: Congress set up special rules and processes for things like music recordings repeatedly over the years.

https://scholarship.law.edu/cgi/viewcontent.cgi?referer=&htt...