by cameldrv
210 days ago
They’re sort of separate. In a sense you could say that the ChatGPT model is a lossily compressed version of its training corpus. We acknowledge that a JPEG of a copyrighted image is a violation, even though JPEG compression is lossy. If the model can recite Harry Potter nearly word for word, even if imperfectly, this is evidence that the model itself is an encoding of the book (among other things). You hear people saying that a trained model can’t be a violation because humans can recite poetry, etc., but a transformer model is not human, and, importantly both philosophically and economically, human brains can’t be copied and scaled.