| I'm not even sure if the first part is true. Has it been determined if AI models are intellectual property? Machine-generated content may not be copyrightable. It isn't just the output of generative AI that falls under this; the models themselves do too. Can you copyright a set of coefficients for a formula? With a JPEG, the image being reproduced is considered the thing that holds the copyright. Being the first to run the calculations that produce a compressed version of that data should not grant you any special rights to that compressed form. An AI model is just a form of that writ large. When the models generalize and create new content, it is hard to see how either the output or the model that generated it could be considered someone's property. People possess models; I'm not sure they own them. There are, however, billions of dollars at play here, and enough money can buy you whichever legal opinion you want. |
If companies train on data they don't own and expect to own their model weights, that's hypocritical.
Model weights shouldn't be copyrightable if the training data was pilfered.
But this hasn't been tested because models are locked away in data centers as trade secrets. There's no opportunity to observe or copy them outside of using their outputs as synthetic data.
On that subject, training on model outputs should be fair use, and we should use legislation to defend access to it (similar to web scraping provisions).