by pjc50 105 days ago
How would you tell that it's LLM-generated in that case?

If the submitter is prepared to explain the code and vouch for its quality then that might reasonably fall under "don't ask, don't tell".

However, if LLM output is either (a) uncopyrightable or (b) considered a derivative work of the source that was used to train the model, then you have a legal problem. And the legal system does care about invisible "bit colour".


It's (c) copyright of the operator.

For one simple reason. Intention.

Here's some code for example: https://i.imgur.com/dp0QHBp.png

Both sides written by an LLM. Both sides written based on my explicit prompts explaining exactly how I want it to behave, then testing, retesting, and generally doing all the normal software eng due diligence necessary for basic QA. Sometimes the prompts are explicitly "change this variable name" and it ends up changing 2 lines of code no different from a find/replace.

Also I'm watching it reason in real time by running terminal commands to probe runtime data and extrapolate the right code. I've already seen it fix basic bugs because an RFC wasn't adhered to perfectly. Even leaving a nice comment explaining why we're ignoring the RFC in that one spot.

Eventually these arguments get kinda exhausting. People will use it to build stuff, and the stuff they build ends up retraining it, so we're already hundreds of generations deep on the retraining, and talking about licenses at this point feels absurd to me.

I think you need to read the report from the US Copyright Office that specifically says that it's *not* (c) copyright of the operator.

It doesn't matter if the "change this variable name" instruction ends up with the same result as a human operator using a text editor.

There is a big difference between "change this variable name" and "refactor this code base to extract a singleton".

You may as well be the MPAA right now throwing threats around sharing MP3s. We're past the point of caring and the laws will catch up with reality eventually. The US Copyright Office says things that get overturned in court all the time.

Tell me, how have laws “caught up with” “the [RIAA…] throwing threats around sharing MP3s?” So far as I know that’s still considered copyright infringement and the person doing it, if caught, can be liable for very substantial statutory damages.

It sounds like you really can’t handle being told “no, you can’t use an LLM for this” by someone else, even if they have every right to do so. You should probably talk to your therapist about that.

lol, ask the software industry whether or not they're "past the point of caring" about the licenses on their software.

Whether it's an OSS license or a commercial license, both are dependent on copyright as the underlying IP Right.

The courts have so far (in the US) agreed with the Copyright Office's reasoning.

Use an LLM as a tool, mostly OK.

Use it to create source from scratch, no copyright as the author isn't human.

Use it to modify existing software, and copyright covers only whatever original work remains.

The entire industry is right now encouraging LLM use all day, every day at big corps, including mine. If your argument is that the code we are producing isn't the copyright of our employers, you won't get very far. Call it the realpolitik of tech if you want.