I believe we first need to answer whether the copyright of an AI model’s source text or images affects its output. My opinion (and note I’m a software engineer, not a lawyer) is that an AI, being a statistical model and not generally intelligent, should not be allowed to disregard the copyright of its source material. This would, I think, require the AI’s creator to secure a license for all of its sources that allows this sort of transformation and presentation. Further, a user of the AI would themselves require a license to use the output. The alternative seems to be “anything goes”.
A model trained on several copyrighted data sources cannot then be used in a way that depends on only a subset of those sources.
So all parameters of usage and compensation should be settled by contract between the model builder and the supplier of the copyrighted data, before the copyrighted material is used.
Or to put it simply: using copyrighted material to create a model would NOT be considered fair use.
That’s it. That’s the standard. No complicated new laws required.
Model builders obtain permission to use copyrighted material from copyright holders based on any terms both agree to.
Terms might involve model usage limits, term limits, one-time compensation, per-use compensation, data source credits, or anything else either party wants.
The likely result will be some standard sets of terms becoming popular and well known. But nobody has to agree to anything they don’t want to.