by dogcomplex 108 days ago
We should be removing IP law entirely, not strengthening it to cover entire classes of problem even when implemented entirely differently. Same for anyone trying to claim "colorful monster creatures" as innately Pokemon IP. Just because someone climbed a mountain first doesn't mean they own it forever. Nobody should be honouring any of these claims.

Nor should we be treating AI models themselves as respected IP. They're built on everyone else's data. Throw away this whole class of law, it's irrelevant in this new world.

5 comments

> own it forever

Well we could try fixing the forever part. Copyright is out of control. I’d like to see a world with much less power given to IP. Sometimes I even say I want it eradicated entirely. But realistically we should start by cutting things back. Maybe give software an especially short copyright period.

Reset it back to 20 years and make that a hard limit for both patents and copyright. No renewals. Zero exceptions. Let the market sort the rest out.

There are always going to be downsides and edge cases when granting any party a monopoly over anything. At least if it's limited to 2 decades, any unintended consequences, philosophical objections, etc. are hopefully kept within reason.

That would be insane for aerospace software, where you might spend most of that time getting the code certified (required to break the $0 revenue threshold), let alone paying back your costs and then making an actual profit.

Meanwhile, there are cases where copyright of more than 2 years is overkill.

I don't know what, but it seems like some sort of mechanism for variable-length IP duration is needed.

Is copyright meaningful for aerospace software? I'm largely unfamiliar with that domain but I have trouble imagining that (for example) Boeing cares much about people redistributing or hacking on the control software for a 777. How would that impact their bottom line?

I could understand for medical devices maybe but even then it seems like the software is a tiny part of the overall cost of a given design. A competitor could already do a clean room reimplementation in that case.

But I guess it wouldn't be all that bad if there were a carefully crafted extension for government certified software that was explicitly tied to the length of the certification process.

The only problem with this certified software exception is that I foresee they'll write the law as "expiration timer starts when software has finished certification", then some lobby group will get the regulatory departments to adopt a new process of partial certification where said software is usable in devices but 'finished certification' is never reached, so the copyright gets dragged out forever.
Nope, it falls more under trade secrets than copyright.

If you do something that requires stealing the code (publishing it, selling it, etc) the company can legally fuck you up.

Now, once it's in the wind, it becomes almost impossible to pursue from a practical point of view, as any implementer can claim trade secrets to avoid showing you the code.

I think the point is more that many kinds of software (presumably including aerospace software) don't really need any kind of protection from redistribution, because the software is effectively only useful for a specific design. Much of the effort in creating it goes not into algorithms that a competitor could steal absent copyright or some alternative protection, but into certifying that the software fits the rest of the system, which any competitor making use of the software would have to do again.

Also remember that the original point of copyright and patent protections is to encourage people to create the protected works in the first place but Boeing isn't just going to stop making aerospace software without copyright because their hardware will be useless without it. So if anything, any software that is needed for hardware made by the same company to function doesn't really have any right to be copyrightable at all.

If certification is the actual cost, you don't need copyright, at all. SQLite is in the public domain. Your moat is the certification itself, not the code.
Certification isn't a moat; either the software is certified as safe/bug-free or it isn't. If it's safe, that just makes it more valuable to pirates.
That's absurd.

I can't use SQLite for aviation even though it was certified.

I can't even claim FIPS compliance for my software without going through an expensive process, even though I only use FIPS approved primitives.

Building on certified/compliant libraries helps, but their vendors can certainly contractually make me pay for it.

All OSS libraries have a warranty disclaimer; using them according to even those licenses automatically excludes "fitness for a particular purpose."

Why would public domain software be any different?

The moat is the certification process, not the code itself. "I copied this from somewhere after it was already certified" might fast track something, but it's not gonna fly with "certification was good, done."

> some sort of mechanism for variable-length IP duration is needed

I've always liked the idea of a Harberger tax-style patent enforcement fee:

The patent owner declares the value of their patent on an annual basis and pays 1-5% of that declared value per year for the privilege of relying on the government to enforce their exclusive ownership of the patent. At any point, another party can buy the patent at its declared value, which discourages patent-holders from declaring artificially low values. The annual fee discourages artificially high valuations for indefinite periods of time -- as the patent yields less return over time it makes less sense to keep paying a high annual fee, encouraging owners to lower the declared valuation or end the patent protection altogether when it's no longer profitable.

To discourage hoarding patents indefinitely one could either set a hard upper limit (e.g. 60 years) or increase the fee over time, for example every few years the fee increases by 1% until at some point the patent is effectively publicly owned.
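
A rough sketch of the arithmetic in Python, in case it helps. Everything concrete here is an assumption for illustration: the `Patent` class, the function names, the 1% starting rate, and the "+1% every 5 years" schedule are made up to match the description above, not taken from any real proposal or statute.

```python
from dataclasses import dataclass


@dataclass
class Patent:
    owner: str
    declared_value: float  # self-assessed by the owner, revisable each year


def annual_fee(patent: Patent, years_held: int,
               base_rate: float = 0.01,   # assumed starting rate (1%)
               step: float = 0.01,        # assumed increase per step (1%)
               step_every: int = 5) -> float:  # assumed "every few years"
    """Enforcement fee: declared value times a rate that climbs over time."""
    rate = base_rate + step * (years_held // step_every)
    # Cap at 100% so the fee never exceeds the declared value itself.
    return patent.declared_value * min(rate, 1.0)


def buy_out(patent: Patent, buyer: str, offer: float) -> bool:
    """Anyone may take over the patent by paying at least its declared value."""
    if offer < patent.declared_value:
        return False
    patent.owner = buyer
    return True
```

The two functions pull in opposite directions: declaring a low value keeps `annual_fee` cheap but makes `buy_out` easy for a competitor, while declaring a high value blocks the buyout but costs more every year, and more still as the rate steps up.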

Wait for the great new times when an AI will certify aerospace, automotive and medical SW. Waiting for that. It will be 1000x better and faster than the existing processes
Or maybe it shouldn't take 10+ years to certify aerospace software.
Have you seen the quality of regular software though? And the failure rate of regular physical items? The only reason I trust aircraft is because of the process.

Consider, if you will, that if some guy were to fly a drone the size of a car that he knocked together in his garage over a residential area, people would not accept that. Yet private pilots in Cessnas fly over neighborhoods constantly.

Good news! LLM output cannot be copyrighted. Everything that an LLM produces is automatically, irrevocably, in the public domain.
Not quite in my opinion. The output of an LLM from a simple prompt falls into the public domain, but if you also give a copyrighted work as input, the mechanistic transformation performed will not alter the original license (same as encoding a video does not change its license).
Are training data counted as input?

It would be interesting to see a court ruling that the output of LLMs trained on copyleft code is licensed under the GPL ... and all other viral licenses simultaneously.

> Are training data counted as input?

It is quantum legality: using copyrighted input is legal or illegal depending on the observer.

Schrodinger's Chat
Unless your LLM works by quoting large parts of copyrighted works, its reinterpretations of them aren't covered by the original copyright, because they're not copies.
What if the output regurgitates some other legal entity’s boilerplate licence agreement? Is the output automatically licensed to that entity?
No, the copyright is the colour of the bits, and red bits with a comment saying "these bits are blue" are not blue bits, but you may be prosecuted for fraud.
It's wild to me that there haven't been more court cases to answer questions like those being asked in this thread.

No one knows.

But we also know from other research that LLMs don't actually do mechanistic translations. Even when they are asked to and say that they did, they're basically rewriting the code from their training data.
If the LLM output is already someone else's copyrighted work, the LLM doesn't change that?
If that occurs and it’s a substantial enough body of output that it is itself copyrightable and not covered by fair use. Confluence of those conditions is intentionally rare.
The LLM cannot produce copyrighted work.

If the LLM reproduces a human's copyrighted work, then that copyright still stands. This is, in effect, the same as photocopying someone else's writing. The LLM was trained on the copyrighted work, is incapable of producing new copyrightable work, so if it duplicates the original work then the original author's copyright still stands.

I am not a lawyer

Same as it ever was: Either trade secrets or license files that are treated as suggestions.
What if you used the LLM to generate works that were already copyrighted?
There was a recent case that everyone has been describing as "LLM output can't be copyrighted" but what it actually said was you can't register the AI as the author.
This is not true, and I'd love to see some actual citation here.

The courts have repeatedly said that copyright only applies to human creativity. The Supreme Court explicitly said this when they refused to hear the appeal:

https://en.wikisource.org/wiki/Thaler_v._Perlmutter,_Refusal...

> "We affirm our decision to refuse registration for the Work because it lacks the human authorship necessary to be eligible for copyright protection."

So they're saying that the LLM cannot be the author, because LLMs cannot claim copyright.

The related case about patents is more supportive of the narrative that AIs cannot be authors (see https://www.cafc.uscourts.gov/opinions-orders/21-2347.OPINIO...), specifically: "Here, there is no ambiguity: the Patent Act requires that inventors must be natural persons; that is, human beings."

The patent situation is that the Act says that the inventor must be an individual, which the courts are interpreting to mean a human, so the LLM cannot be named as the inventor. So, in this case, yes, this is just saying that an LLM cannot be named as the inventor of a patent. That's not the same thing as what the courts are saying with copyrights.

> So they're saying that the LLM cannot be the author, because LLMs cannot claim copyright.

They're saying that the LLM can't be the author.

Now suppose you supply the LLM with a prompt that contains human creativity, it performs a deterministic mathematical transformation on the prompt to produce a derivative text, and you want to copyright that, claiming yourself as the author. What happens then?

If you think the answer is that you can't, how do you distinguish that from what happens when someone writes source code and has a compiler turn it into a binary computer program? Or do you think that e.g. Windows binaries can't be copyrighted because they were compiled by a machine?

> Now suppose you supply the LLM with a prompt

My understanding was that they did in fact do just that, but the court somehow misunderstood what they were doing, and assumed that the LLM was working completely autonomously without any human input at all, which isn't really possible IMO. Someone told it what to do.

They also argued that you couldn't copyright an output that you can't explain how it came to be, i.e. if they had been able to articulate how an LLM works, the outcome might have been quite different, which I found surprising.

If art in general (human-made or otherwise) is always derived from existing influences... should we really be forced to explain how or why we created a piece of art in order to defend it?

The usual bar for copyright infringement of a derivative work is, from what I have seen, "how much did you copy from the original, and how obvious is it", which is of course a subjective determination that would be made by each individual judge or jury of a case.

> What happens then?

The part that the human created, the prompt, can be copyrighted.

The part that the LLM created, cannot be.

Copyright in code works exactly the same way: the source code is copyrighted. The binary code is only copyrighted to the extent that it is derived from the source code. This is well-established.

Powerful interests want it to be true.
IMO the bigger question is how would you even tell if a work was generated by an LLM? There's a ton of code being written out there; the folks who generated it are going to claim they authored it for copyright purposes, and those who want to use it are going to claim it was LLM-generated. So what happens?
The alleged author, when bringing a copyright infringement suit, will submit testimony claiming they wrote it. Parties to the suit will have a chance to present arguments and evidence. Then, the claim will be adjudicated by a judge and/or jury.
That code isn't going to be open source. And if you use someone else's closed source code you are violating laws that have nothing to do with copyright.
I'm not sure I understand. I'm not talking about stolen/leaked code here. I'm saying: imagine you claim you're the author of some piece of code. You may or may not have written it with an LLM, but even if so, assume you have the full rights to all the inputs. You post it publicly on GitHub. You don't attach a license, or perhaps you attach a restrictive license that doesn't permit much beyond viewing. Someone comes across your code, finds it brilliant, and wants to use it. If that code was non-copyrightable (such as generated via an LLM), then they're fine doing it without your permission, no? But if that code was copyrightable, then they're not permitted to do so, correct?

So now consider two questions:

1. You actually didn't use an LLM, but they believe & claim you did. Who has the burden of proof to show that you actually own the copyright, and how do they do so?

2. They write new code that you feel is based on yours. They claim they washed it through an LLM, but you don't believe so. Who has the burden of proof here and how do they do so?

Good questions.

My take on the answers (I am not a lawyer):

1. You copy their code. They bring a copyright claim (let's assume this isn't a DMCA thing and they're actually bringing a claim to court). Your defence is "the LLM wrote it so no copyright attaches". Since they're asserting their copyright claim, they would have to provide evidence for that claim (same as in any other copyright case), including providing evidence that a human wrote it (which is new, and required to defeat your defence).

2. They copy your code. You bring a copyright case. Their defence is "I used an LLM to wash the code without copying". Since they're not disputing your copyright claim to the original code, you don't have to defend or prove your copyright. But you do have to prove that their code infringes on your copyright, which would mean proving that the LLM copied your code when creating the new code. This has been done before by demonstrating similarity.

Can you expand on that, please? Which other laws are infringed if you use someone else's closed source code?
You used an illegal leak to train your LLM.
Is Pierre Menard really the author of his Quixote?
I think it can be copyrighted, or at least it is a very complex legal issue. Coding support is used in commercial apps where copyrights are fully reserved. It cannot feasibly be determined whether any given output is purely LLM-generated or not.
I would be okay with just keeping it but limiting it severely. If you release music and you can't sell enough albums in 20 years, that's not society's problem. A lot of artists release albums every 1 - 3 years anyway, so they're always selling some records, or were before streaming became the way to "own" music. Most make their money off of concerts anyway.

For movies and shows, charge an increasing fee to renew the copyright. Eventually studios will give up certain movies. The older the movie, the more you pay.

We could also just have some of the rights go away after X amount of years. Maybe after so much time it's still not legal to copy the original work, but it is legal to make a cover song, or a derivative work using the same character. At another point maybe it's no longer illegal to copy for free, but it is still illegal to sell without permission.

I personally think we should have shorter limits for non-creator owners of copyright, and for creators it should be like 20 years or death whichever comes last. I also think compulsory licensing should exist for everything.

The problem here is that large companies can do whatever they want and regular people cannot. Don't worry, they won't be allowing you the same rights as these companies.
But some people designed their entire lives around the assumption of IP protections.

If we remove IP laws, we should remove all private property laws!