| I don’t think it makes sense for both model builders and the model’s users to separately obtain licenses for the same works used in the training set. A model trained on several copyrighted data sources cannot be used in a way that depends on only a subset of those sources. So all parameters of usage and compensation should be settled by contract between the model builder and the copyrighted data supplier, before the copyrighted material is used. Or to put it simply: using copyrighted material to create a model would NOT be considered fair use. That’s it. That’s the standard. No complicated new laws required. Model builders obtain permission to use copyrighted material from copyright holders on whatever terms both agree to. Terms might involve model usage limits, term limits, one-time compensation, per-use compensation, data source credits, or anything else either party wants. The likely result is that a few standard sets of terms will become popular and well known. But nobody has to agree to anything they don’t want to. |