Another funny, possibly sad coincidence is that the licenses that made open source what it is will probably be absolutely useless going forward because, as recent precedent has shown, companies can train on whatever they have legally gained access to.
On the other hand, AGPL continues to be the future of F/OSS.
MIT is also still useful; it lets me release code where I don't really care what other people do with it as long as they don't sue me (an actual possibility in some countries).
The US, for one. You can sue nearly anyone for nearly anything, even something you obviously won't win in court, as long as you find a lawyer willing to do it; you don't need any actual legal standing to waste the target's time and money.
Even the most unscrupulous lawyer is going to look at the MIT license, realize the target can defend it for a trivial amount of money (a single form letter from their lawyer) and move on.
You can sue for damages if there is malware in the code; no license protects you from liability for distributing harmful products, even if you do it for free.
And illegally too. Anthropic didn't pay for those books they used.
It's too late at this point. The damage is done. These companies trained on illegally obtained data and they will never be held accountable for that. The training is done and they got what they needed. So even if they can't train on it in the future, it doesn't matter. They already have those base models.
Then punitive measures are in order. Add it to the pile of illegal, immoral, and unethical behavior of the feudal tech oligarchs already long overdue for justice. The harm they have done and are doing to humanity should not remain unpunished.
And the legality of this may vary by jurisdiction. There’s a nonzero chance that they pay a few million in the US for stealing books but the EU or Canada decide the training itself was illegal.
Then the EU and Canada just won't have any sovereign LLMs. They'll have to decide if they'd rather prop up some artificial monopoly or support (by not actively undermining) innovation.
It’s not going to happen. The EU is desperate to stop being in fourth place in technology and will do absolutely nothing to put a damper on this. It’s their only hope to get out of the rut.
If I can reproduce the entirety of most books off the top of my head and sell that to people as a service, it's a copyright violation. If AI does it, it's fair use.
>If I can reproduce the entirety of most books off the top of my head and sell that to people as a service, it's a copyright violation. If AI does it, it's fair use.
Assuming you're referring to Bartz v. Anthropic, that is explicitly not what the ruling said, in fact it's almost the inverse. The judge said that output from an AI model which is a straight up reproduction of copyrighted material would likely be an explicit violation of copyright. This is on page 12/32 of the judgement[1].
But the vast majority of output from an LLM like Claude is not a word-for-word reproduction; it's a transformative use of the original work. In fact, the authors bringing the suit didn't even claim that it had reproduced their work. From page 7, "Authors do not allege that any infringing copy of their works was or would ever be provided to users by the Claude service." That's because Anthropic is already explicitly filtering out results that might contain copyrighted material. (I've run into this myself while trying to translate foreign language song lyrics to English. Claude will simply refuse to do this.)[2]
They should still have to pay damages for possessing the copyrighted material. That's possession, which courts have found is copyright violation. Remember all the 12 year olds who got their parents sued back in the 2000s? They had unauthorized copies.
I don't know what exactly you're referring to here. The model itself is not a copy; you can't find the copyrighted material in the weights. Even if you could, you're allowed under existing case law to make copies of a work for personal use if the copies have a different character and as long as you don't yourself share the new copies. Take the Sony Betamax case, which found that it was legal, as a fair use of copyrighted material, to copy a publicly aired broadcast onto a recording medium like VHS or Betamax for the purposes of time-shifting one's consumption.
Now, Anthropic was found to have pirated copyrighted work when they downloaded and trained Claude on the LibGen library. And they will likely pay substantial damages for this. So on those grounds, they're as screwed as the 12 year olds and their parents. The trial to determine damages hasn't happened yet though.
This was immediately my reaction as well, but I'm not a judge so what do I know. In my own mind I mark it as a "spice must flow" moment -- it will seem inevitable in retrospect but my simple (almost surely incorrect) take is that there just wasn't a way this was going to stop AI's progress. AI as a trend has incredible plot armor at this point in time.
Is the hinge that the tools can recall a huge portion (not perfectly of course) but usually don't? What seems even more straightforward is the substitute-good idea: it seems reasonable to assume people will buy fewer copies of book X when they start generating books heavily inspired by book X.
But this is probably just a case of a layman wandering into a complex topic. Maybe AI has just nestled into the absolute perfect spot in current copyright law, just like other things that seem like they should be illegal now but aren't.
Yeah, that dipshit judge just opened the floodgates for more problems. The problem is they don't understand how this stuff works and they're in the position of having to make a judgement on it. They're completely unprepared to do so.
Now there's precedent for future cases where theft of code or any other work of art can be considered fair use.
So interestingly, free meant autonomy for Stallman and the original proponents of "copyleft" style licenses too, but autonomy for end-users, not developers. Rightly or wrongly, Stallman et al. believed the copyleft style licenses maximized autonomy for end-users; that was the intent.
I read through it, and I think the analysis suffers from overlooking that when the person modifying the program is also its user, it's fine.
Free software refers to user freedoms, not developer freedoms.
I don't think the below is right:
> > Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software.
>
> Let's break it down:
>
> > If you modify the Program
>
> That is if you are a developer making changes to the source code (or binary, but let's ignore that option)
>
> > your modified version
>
> The modified source code you have created
>
> > must prominently offer all users interacting with it remotely through a computer network
>
> Must include the mandatory feature of offering all users interacting with it through a computer network (computer network is left undefined and subject to wide interpretation)
I read the AGPL to mean if you modify the program then the users of the program (remotely, through a computer network) must be able to access the source code.
It has yet to be tested, but that seems like the common-sense reading to me (which matters, because judges do apply judgement). It just seems like they are trying too hard to do a legal gotcha. I'm not a lawyer so I can't speak to that, but I certainly don't read it the same way.
I don't agree with this interpretation of every-change-is-a-violation either:
> Step 1: Clone the GitHub repo
>
> Step 2: Make a change to the code - oops, license violation! Clause 13! I need to change the source code offer first!
>
> Step 1.5: Change the source code offer to point to your repo
This example seems incorrect -- modifying the code does not automatically make people interact with the program over a network...
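For what it's worth, here is roughly how I picture the obligation kicking in under that reading: only once remote users actually interact with your modified version. This is a minimal sketch, not legal advice. The URL, the `source_offer` helper, and the `/source` route are all made up for illustration, and whether this would satisfy clause 13 in court is an assumption on my part.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical location of the modified version's Corresponding Source.
SOURCE_URL = "https://example.com/my-fork/source.tar.gz"


def source_offer(source_url: str = SOURCE_URL) -> str:
    """Build the notice shown to every user who reaches the service remotely."""
    return f"This service runs modified AGPL-licensed software. Source: {source_url}"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/source":
            # Send remote users straight to the source, at no charge.
            self.send_response(302)
            self.send_header("Location", SOURCE_URL)
            self.end_headers()
            return
        # Every other response carries the offer alongside the normal output.
        body = f"Hello from the service.\n{source_offer()}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the service locally; not called automatically."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts it on localhost. If you only edit the code and never run it for anyone else over a network, nothing in this sketch is ever reached, which is the point I was making above.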
"free software" was defined by the GNU/FSF... so I generally default to their definitions. I don't think the license falls afoul of their stated definitions.
That said, they're certainly anti-capitalist zealots, that's kind of their thing. I don't agree with that, but that's beside the point.
It's not really "virtually impossible to comply with". It's very restrictive, yes, but not hard to comply with if you want to.
And yes, it is an EULA pretending to be a license. I'd put good odds on it being illegal in my country, and it may even be illegal in the US. But it's well aligned with the goals of GNU.
Hell is, by design, a consequence for poor people. (People could literally pay the church to not go to hell[0]). Rich people have no consequences whatsoever, let alone poor people consequences.
Not "by design", as historically hell came first. It was only much later that the Catholic Church started talking about purgatory and the possibility of reducing your punishment by paying money.
The people running AI companies have figured out that there is no such thing as hell. We have to come up with new reasons for people to behave in a friendly way.
We already have such reasons. Besides, all religious "kindness" was never kindness without strings attached, even though they'd like you to think that was the case.
Open source may be necessary but it is not sufficient. You also needed the compute power and architecture discoveries and the realisation that lots of data > clever feature mapping for this kind of work.
A world without open source may have given birth to 2020s AI but probably at a slower pace.