by rubinelli
1244 days ago
De minimis is a longstanding defense in copyright law. If you are copying very little from very many works, as is the case when you turn multiple petabytes into a few gigabytes of neural network weights, you are in the clear. The problem arises when models overfit and spit out almost perfect copies of the training data.
For example, I could take a massive 8K video and convert it into a very small 144p YouTube video. Am I in the clear simply because the output is tiny compared to the input? Similarly, I could take a huge studio master copy of a song and convert it to a very small and rather compressed (distorted) MP3.
I partially agree that part of the problem is models spitting out perfect copies, but I do think there is a bigger problem. Copyright is a complex concept that can't be defined exclusively by a single metric like size, and any mathematical definition will in the end be killed if large copyright holders feel threatened by it.