by Hizonner
689 days ago
But the pitch-shifted song is still recognizably a creative work. It has identifiable, humanly comprehensible forms of all the original creative elements that Swift originally put into it (plus I guess a de minimis amount of extra creativity from the choice to pitch shift it).
If I take a string of data from a true hardware RNG, XOR it with a Taylor Swift song, and throw away the original random stream, is the resulting fundamentally random bit string still a derivative work of the song? As with the ML model, you can't recognize the song in it. And as with at least some training examples in the inputs of most ML models, you can't recover the song from it either.
It feels like the test for whether X is derivative for copyright purposes should include some kind of attention to whether X is a creative work at all. Maybe not, but then what test do you use?
I do recognize the possibility that the models might not themselves be eligible for copyright as independent works, yet still infringe copyright in the training inputs. It seems messy, but not impossible.
... and as I said elsewhere, it's also messy that while you generally can't recover every training input from the model, you can usually recover something very close to some of the training inputs.
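For concreteness, the XOR thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `song` is a placeholder byte string rather than real audio, `xor_bytes` is a hypothetical helper, and `secrets.token_bytes` (the OS's cryptographic RNG) stands in for a true hardware RNG.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Placeholder data standing in for the song's audio bytes.
song = b"placeholder bytes standing in for an audio file"

# Assumption: the OS CSPRNG stands in for a true hardware RNG.
pad = secrets.token_bytes(len(song))

# The "mixed" string is what would be kept or distributed.
mixed = xor_bytes(song, pad)

# While the pad still exists, XORing again restores the song exactly.
recovered = xor_bytes(mixed, pad)

# Once the pad is thrown away, nothing that remains distinguishes
# this song from any other input of the same length.
del pad
```

With a uniformly random pad that is used once and then discarded, `mixed` is statistically independent of `song`, which is the one-time-pad property the question relies on.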
It's not a copy of it, and when you distribute it you're not distributing the original. So it's not a derivative for copyright purposes.
It can still be a derivative for other legal purposes. Judges don't appreciate it when you do funny math tricks like that and will see through them.
> It feels like the test for whether X is derivative for copyright purposes should include some kind of attention to whether X is a creative work at all. Maybe not, but then what test do you use?
Yes, that's how US copyright law works. (Well, sort of…)
The more transformative a work is, the less it counts as a copy of the original, since it either falls under fair use exemptions or is clearly a different category of thing.
If a model were a derivative of its training data, then Google's snippets and thumbnails would be derivatives of its search results and would be illegal too, unless you wrote a new law to specifically allow them.
In other countries (Germany, Japan) fair use is weaker, but there are laws that specifically make model training legal in certain circumstances, and presumably the same goes for Google snippets.