"What are the intellectual-property risks of using generative AI tools?
The Oracle Contributor Agreement (OCA) requires that a contributor own the intellectual property rights in each contribution and be able to grant those rights to Oracle, without restriction. Most generative AI tools, however, are trained on copyrighted and licensed content, and their output can include content that infringes those copyrights and licenses, so contributing such content would violate the OCA. Whether a user of a generative AI tool has IP rights in content generated by the tool is the subject of active litigation."
Even if training on copyrighted material is considered fair use, there is still the issue that LLMs may reproduce significant parts of their training data. In fact, there is an ongoing lawsuit in Germany (GEMA vs. OpenAI) because ChatGPT reproduced significant parts of existing song lyrics, which very likely violates German copyright law. The whole thing really is a legal minefield, and some companies do indeed prohibit the use of LLMs for this very reason, at least until these legal questions are settled.