Hacker News new | ask | show | jobs
by FeepingCreature 91 days ago
> So as of today, the Copyright system does not have a way for the output of a non-human produced set of files to contain the grant of permissions which the OpenBSD project needs to perform combination and redistribution.

This seems extremely confused. The copyright system does not have a way to grant these permissions because the material is not covered under copyright! You can distribute it at will, not due to any sort of legal grant but simply because you have the ability and the law says nothing to stop you.

4 comments

This all relies, as the article points out, on everyone looking directly at code that both looks like and works like the only extant codebase for EXT4 and nonetheless concluding that in fact the computer conjured it from the aether. If I wrote a program that zipped up the Linux kernel source, unzipped it, and grepped -v for comments it would not then be magically transformed into unattributable public domain software.
Under the premise advanced in the quote, copyright is not being violated because there is none. Thus, the quote makes no sense as stated. It may be that, additionally, copyright is in fact being violated (I don't believe it myself), but if so that's a separate argument.
The premise of the quote does not contain the assumption that there is no copyright to the code. In fact the various contributors do not advance an opinion about whether code written by an AI can be granted copyright. Rather they are saying that it is obviously derivative of code that is under copyright, that is only distributed under terms which, however many dry cleaners process it, will still conflict with the license under which they publish their software.
Different people advance different arguments in the thread. The BSD argument is "we cannot distribute it because it is not copyrightable, thus we cannot put it under a BSD license." This is simply incoherent.
> Rather they are saying that it is obviously derivative of code that is under copyright

Derivatives are not subject to copyright, unless they are close to, and contain substantial verbatim copies from, the original. It's a virtual certainty that a vibe-coded Ext4 FS is none of the above.

Redefining copyright as some weird patenting of similar ideas is absurd.

> If I wrote a program that zipped up the Linux kernel source, unzipped it, and grepped -v for comments it would not then be magically transformed into unattributable public domain software.

That's not the case here. A re-implemented piece of software that does not contain meaningful verbatim excerpts from the original is not subject to the copyright of the original.

that is not certain. if you read code and then reimplement it using the original code as reference, the claim has been made that this falls under the copyright of the original because the new code is derived from the old code. unfortunately this particular situation has not yet been tested in court. but clean room implementations are done specifically to avoid the risk reading the original code poses. if this was clear cut then clean room development would not be needed.

this is similar to creating an extension to some program, because the extension could not be written without the original even if the interface the extension is using is a public API. the claim has been made that the copyright of the original program applies. i think the linux kernel is an example here.

see also these questions on stackexchange:

https://softwareengineering.stackexchange.com/questions/2087...

https://softwareengineering.stackexchange.com/questions/8675...

What if one reverse engineered the original logic, for example translating the assembly code into a higher level language. They didn't use or look at the original code. Does that still count as "clean room"? What's the legal difference between that and deriving the logic just from observing how the running program acts?
there is no legal precedent that clarifies what clean room development is. clean room development is a precaution to stay away as far as possible from the original code in order to reduce the risk of infringement. clearly, not looking at the assembly code is better than looking at it.
> this is similar to creating an extension to some program

There's no such thing as "an extension to some program". A derivative work is a work that contains the original. Using the privileges provided by copyright law, the creator may impose licensing restrictions on how the original work is used - but that's contract law, not copyright.

For example the GPL and the AGPL define different sets of use restrictions, none of that matters in this case because the original work is not being reproduced or used per se.

As I already said in my other, down-voted comment - copyright is only about verbatim, or near verbatim copies, in whole or in part - it's the spirit that both judgment and the letter of the law are supposed to follow. Copying of functionality is not subject to copyright.

For example, one can use the same topic for a work of poetry for a similar aesthetic effect and that doesn't infringe other poems.

The GPL used a hack to stretch copyright law into a near opposite but stretching it further goes into absurd territory, achieving the opposite of what the GPL claims to protect.

a kernel driver is an extension to the kernel. yet, even with a clearly defined API it is a derived work of the kernel.

> one can use the same topic for a work of poetry for a similar aesthetic effect and that doesn't infringe other poems

because the new poem does not depend on the original.

the kernel driver is useless without the kernel

> a kernel driver is an extension to the kernel. yet, even with a clearly defined API it is a derived work of the kernel.

Maybe, in some alternative universe, that could be correct but it isn't anywhere on Earth.

You can write a BSD-licensed driver as a Linux module and distribute it separately all you want - copyright law is OK with that.

The moment you insert the module into the kernel, the whole thing (kernel + driver) becomes a derivative work and you're forbidden from using it by the GPL - the license, not copyright... Copyright only gives the creators of the kernel the privileged power to impose that contractual restriction.

Long time ago, some BSD guys were trying to convince me that the GPL was primarily a weapon against BSD and other less restrictive licenses but I didn't believe it back then... boy, was I wrong.

You showed me how the GPL can be used for threats against the free modification of software by arguing for the addition of new, absurd powers to copyright - the opposite of what the GPL proponents are promoting it for. It's indeed a license that must be avoided at all cost.

Just because you can distribute something doesn't mean you aren't violating someone else's copyright. You cannot assume that just because a language model popped out some code for you that it is clear of any other claims.

This is just lazy copyright whitewashing.

> This seems extremely confused. The copyright system does not have a way to grant these permissions because the material is not covered under copyright!

This opinion is simplistic. LLMs are trained with pre-existing content, and their output directly reflects their training corpus. This means LLMs can generate output that matches verbatim existing work. And that work can very well be subjected to copyright.

Language models are good at translation and retrieval. This also extends to computer languages. LLMs translate from GPL to other licenses the same way Google translate turns French to English, except that the source material is implicitly stored in the LLM.
this is disputed. see my comment here, especially the stackexchange links: https://news.ycombinator.com/edit?id=47557250
Eh … the argument will likely be that things created by Thing at the behest of Author are owned by the Author. It’ll take a few cases going through the courts, or an Act of Congress, to solidify this stuff.
Just like we settled on photographers having copyright on the works created by their camera. The same arguments seem to apply.

The US Copyright Office has published a piece that argues otherwise, but a) unless they pass regulation their opinion doesn't really matter, and b) there is way too much money resting on the assumption code can be copyrighted despite AI involvement.

It's not settled. The monkey selfie copyright dispute ruled that a monkey that pressed the button to take a selfie does not and cannot own the copyright to that photo, and neither does the photographer whose camera it was. How that extends to AI generated code is for the courts to decide, but there are some parallels to that case.

https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

But with the monkey there are two levels of separation from the artist: the human makes the creative decision to hand the camera to a monkey, who presses the trigger, and the camera makes the picture. Compared to the single layer of separation of a photographer choosing framing and camera parameters, pressing the trigger and the camera taking the picture. Or the zero levels of separation when the artist paints the picture.

A programmer writing code would be like the painter, and the programmer writing a prompt for Claude looks a lot like the photographer. The prompt is the creative work that makes it copyrightable, just like the artistic choices of the photographer make the photo copyrightable

You could argue that the prompt is more like a technical description than a creative work. But then the same should probably be true of the code itself, and consequently copyright should not apply to code at all

The copyright office's argument is that the AI is more like a freelancer than like a machine such as a camera. Which you might equate to the monkey, who's also a bit freelancer-like. But I have my doubts that holds up in court. Monkeys are a lot more sentient than AIs.

The copyright office is pretty clear on this if you read: https://www.copyright.gov/ai/Copyright-and-Artificial-Intell....

There is case law establishing that commissioning a work from another entity doesn't give you co-authorship; the entity doing the work and making creative decisions is the entity that gets copyright.

In order for you to have co-authorship of the commissioned work you have to be involved and pretty much giving instruction-level detail to the real author. The opinion cites many cases showing that this is not how LLM prompts work.

The monkey selfie case is also relevant because it solidifies that non-persons cannot claim copyright. That means the LLM cannot claim copyright, and therefore it has no copyright that can be passed on to the LLM operator.

The law is whatever it needs to be to satisfy monied interests, with the degree of acceptable adaptation being a function of the unity of those interests and the political ascendancy of those in favor.

Overwhelmingly this is in favor of treating ai as a tool like Photoshop.

Even those against AI disagree among themselves on various matters, and will overwhelmingly want a cut, not a different interpretation.

This filesystem driver was made by a human using AI, not a monkey.
Haven't there already been a few cases, each of which found that mechanically-produced works are not copyrightable?
no