To be honest, that was my first thought on reading that headline as well. Given that those large companies especially (though who knows how the smaller ones got their training data) got a huge amount of backlash for their unprecedented collection of data from all over the web and everywhere else, it's kinda ironic to talk about intellectual property.
If you use one of those AI models as a basis for your own AI model, the real danger could be that the owners of the originating data go after you at some point as well.
Standard corporate hypocrisy. "Rules for thee, not for me."
If you actually expected anything to be open about OpenAI's products, please get in touch, I have an incredible business opportunity for you in the form of a bridge in New York.
They got backlash, but (if I'm not mistaken) it was ruled that it's okay to use copyrighted works in your model.
So if a model is copyrighted, you should still be able to use it if you generate a different one based on it. I.e. copyright laundering. I assume this would be similar to how fonts work. You can copyright a font file, but not the actual shapes. So if you re-encode the shapes with different points, that's legal.
But, I don't think a model can be copyrighted. Isn't it the case that something created mechanically can't be copyrighted? It has to be authored by a person.
I find it weird that so many hackers go out of their way to approve of the legal claims of Big AI before it's even settled, instead of undermining Big AI. Isn't the hacker ethos all about decentralization?
Standard disclaimer. Like inserting a bunch of 'hypothetically' in a comment telling one where to find some piece of abandoned media where using an unsanctioned channel would entail infringing upon someone's intellectual property.
I understand that it's not very clear whether the neural net and its weights & biases are considered IP. I personally think that if some OpenAI employee just leaks GPT-4o, it isn't magically public domain for everyone to use. I think lawmakers would start to sue AWS if they just re-hosted ChatGPT. Not that I endorse it, but especially in IP, and in law in general, "judge-made law" ("Richterrecht" in German) is prevalent, and laws are not a DSL with a few ifs and whiles.
But it is also a "cover my ass" notice, as others said. I live in Germany and our law regarding "hacking" is quite ancient.
The simple fact that models are released under a license, which may or may not be free, implies that they are intellectual property. You can't license something that is not intellectual property.
It is a standard disclaimer; if you disagree, talk to your lawyer. The legal situation of AI models is such a mess that I am not even sure that a non-specialist professional will be of great help, let alone random people on the internet.
1. the current, unproven-in-court legal understanding,
2. standard disclaimer to cover OP's ass
3. tongue-in-cheek reference to the prevalent argument that training AI on data, and then offering it via AI is being a parasite on that original data
> reference to the prevalent argument that training AI on data, and then offering it via AI is being a parasite on that original data
Prevalent or not, phrased this way it's clear how nonsensical it is. The data isn't hurt or destroyed in the process of being trained on, nor does the process deprive the data owners of their data or of the opportunity to monetize it the way they ordinarily would.
The right terms here are "learning from", "taking inspiration from", not "being a parasite".
(Now, feeling entitled to rent because someone invented something useful and your work accidentally turned out to be useful, infinitesimally, in making it happen - now that is wanting to be a parasite on society.)
I think the bad part of it is stripping consent from the original creators after they published their work. I personally see it as an unfortunate side effect of change. The artists of the future can create with AI already in mind, but the current and previous generations did not have that privilege.
Getting back to "learning from", I think the issue is not the learning part, but the recreation part. AI can churn out content at a rate orders of magnitude higher than before, even in the age of Fiverr and other such tools and opportunities. This changes the dynamics of the interaction: previously it took someone tens of hours to create something, and now it takes AI minutes. That is not participating in the same playing field, it's absolutely dominating it, completely changing it. That is something to have feelings about, especially if one's livelihood is also impacted. Data is not destroyed, and neither is its ownership, but people don't usually want the exact thing, they are content with a good enough thing, and this takes away a lot of power from the artists, whose work is the lifeblood of artistic AI in the first place.
So I don't think it's as nonsensical as you make it out to be. But I do understand that it's not cut and dried the other way around either. Gatekeeping culture is definitely not a humane thing to do. Culture comes and goes, intermingles, inspires and changes all the time, and people take from it and add to it all the time. Preserving copyright perfectly would neuter it, and slant the landscape even more towards the already powerful.