by albert180
888 days ago
I think the biggest issue is with publishing the datasets.
Then people and companies would discover that it's full of their copyrighted content and sue.
I wouldn't be surprised if they slurped the whole of Z-Library et al. into their models, or if Google did the same with its entire Google Books dataset.
|
If a human knows a song "by heart" (imperfectly), it is not considered copyright infringement.
If an LLM knows a song as part of its training data, then it is copyright infringement.
But what if you developed a model with no prepared training data and forced it to learn from its own sensory inputs? Instead of shoveling it bits, you played it this particular song and it (imperfectly) recorded the song with its sensory input device, the same way humans listen to and experience music.
Is the latter learning model infringing on the copyright of the song?