by polotics
674 days ago
No, the <10GB size of the model does not imply that any less copyright infringement is occurring, IMHO.
The fact that very efficient compression is involved does not change the fact that an uncompressed copy of the copyrighted material was input into the process that generated the model, in breach of that material's copyright.
|
Transformers analyze images; they don't copy them. You might call this semantics, but you probably also wouldn't call out an algorithm that counts black pixels on website images as a "copyright violation".
There is a lot of nuance here and a lot to consider. Transformers are not archives of images; they are archives of relationships. This is key because you don't have to copy an image to measure the relationships between its pixels.
Train a transformer on one image, and it will just output noisy garbage.