Hacker News
by blip54321 1462 days ago
Damages. This question will come back to damages.

If you steal 10 lines of code from me, the damages will be the greater of:

- The benefit to you (10 minutes programmer time)

- The cost to me ($0)

- Statutory damages (probably $200)
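The "greater of" rule above can be sketched as a one-line maximum. This is only an illustration of the comment's arithmetic, not a statement of how a court computes damages. The hourly rate is a made-up assumption, since the comment gives no dollar figure for programmer time; the $0 and $200 figures are the ones quoted in the list.

```python
def estimated_damages(benefit_to_infringer, cost_to_owner, statutory):
    """Return the largest of the three figures listed in the comment."""
    return max(benefit_to_infringer, cost_to_owner, statutory)

# Assumption: $120/hour is a hypothetical programmer rate, not from the thread.
HOURLY_RATE_USD = 120.0

# "10 minutes programmer time" converted to dollars at the assumed rate.
benefit = HOURLY_RATE_USD * 10 / 60

# Cost to the owner ($0) and statutory damages ($200) as given in the comment.
damages = estimated_damages(benefit, 0.0, 200.0)
print(damages)
```

Under these assumptions the statutory figure dominates, which is the comment's point: the ceiling is small compared with the cost of litigating.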

In other words, it's very unlikely to be worth a lawsuit. The most likely outcome is:

- A legal letter is sent

- Infringing code is removed

- As good bedside manner, some nominal amount of money is transferred, mostly in some gesture designed to make the violated party feel good about themselves (e.g. a nice gift).

7 comments

An example of how much copied code is worth:

https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_...

For this content:

   a nine-line rangeCheck function, several test files, the structure, sequence and organization (SSO) of the Java (API), and the API documentation.
The cost was: "statutory damages up to a maximum of US$150,000".
I don’t believe the nine lines of code were the relevant part leading to damages. It was the fact that Google copied the entire API design (SSO) for Java. I don’t think GPT-3 is in danger of doing that.
Also don't forget that the Supreme Court ultimately ruled that Google's copying of the API was fair use, without deciding whether APIs are copyrightable.
Now I understand why US IT consulting corporations have expanded into multinationals
> - The benefit to you (10 minutes programmer time)

That's an incomplete view. You're judging the value by the time it'd take to rewrite it.

The real value is in knowing what to type and why.

When Copilot suggests GPL code to you, its main value is the knowledge, not the typing.

That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.

Depending on the context, this knowledge would be worth millions.

Worth a lawsuit.

> Depending on the context, this knowledge would be worth millions. Worth a lawsuit.

But it probably won't be worth millions of dollars. And that is why the lawsuit won't be worth it.

> That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.

Anything "may" be possible. But it probably won't be worth that much.

> Anything "may" be possible. But it probably won't be worth that much.

I'd suggest getting more information about the repercussions of incorporating GPL code into proprietary, closed-source software.

This is a big deal. You may have to license your entire codebase under GPL if you incorporate GPL code and distribute it.

> You may have to license your entire codebase under GPL if you incorporate GPL code and distribute it.

I would suggest that you actually take your own advice and get more information yourself.

No license can force you to release your code. Nope, not even GPL.

Instead, what a rights holder can do, is sue for damages for the copyright theft, for not following the license. They can't force you to follow the license. Instead, they can say that you didn't follow it, therefore you stole the code, and owe money to them, for stealing the code, depending on how much the code is worth.

The only thing that GPL does, is it gives people permission to use the works, in exchange for releasing code. But, if you infringe, the damages do not depend on whatever the license was, or whatever request the license makes.

To use an example someone else gave, the "first born child" license: imagine someone writes a simple binary search function and releases it under a license that grants use in exchange for some absurd price. The joke version is the first born child, but more seriously, let's say the license price was "1 million dollars".

If someone stole that binary search function, a couple of lines of code, and it went to court, they absolutely would not owe the author 1 million dollars, even though that's what the license said.

Instead, they would owe the rights holders damages. And chances are, a couple line binary search function, or some other example that you could think of, would only be worth a small amount.

And even though the license said "This code is worth 1 million dollars, and you owe us that money if you use it!", it is not true that anyone would owe them a million dollars. Instead they would only owe them damages, which would not be anywhere close to 1 million dollars.

This is correct. Programmers read licenses like code ("If I use the GPL, I need to release my entire codebase."). That's not how the world works. The worst-case outcome is damages. Damages tend to be reasonable.

In most cases, damages are set to make both parties straight, not to be punitive. People cite how trillion dollar companies might have billion-dollar lawsuits, but that's pretty reasonable. $1B damages are 0.1% of a company's value in a battle between FAANGs, which have big-O trillion-dollar valuations. If you have a dispute between $1M businesses, the analogue is $1k damages. That's not atypical for a commercial dispute.
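The proportionality argument above is simple enough to check. The sketch below uses only the figures from the comment ($1B damages against a $1T valuation, then a $1M business); the function name is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Ratio from the comment: $1B in damages against a $1T company valuation.
DAMAGES_RATIO = Fraction(1_000_000_000, 1_000_000_000_000)

def scaled_damages(company_value_usd):
    """Apply the same damages-to-valuation ratio to a company of another size."""
    return company_value_usd * DAMAGES_RATIO

# The ratio expressed as a percentage of company value.
print(DAMAGES_RATIO * 100)

# The analogous figure for a $1M business.
print(scaled_damages(1_000_000))
```

The ratio works out to one part in a thousand, so the $1M business in the comment's example lands at $1k.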

Please, read my comment again.

I did not say it forces you to distribute it. That's absurd.

What I said is: "if you incorporate GPL code and distribute it"

If you do those two things, yes, you have to license your code under GPL.

It's not just me saying this; please take a look at Sections 5(b) and 5(c) of the license. [1]

[1] https://www.gnu.org/licenses/gpl-3.0.en.html#section5

stale2002 read your comment correctly. stale2002 responded to it correctly. No one is arguing with you about what the GPL says.

Let's do an experiment: You need to hit yourself repeatedly in the head with a mallet until you pass out.

Are you currently hitting yourself with a mallet until you pass out? No. Just because something is written doesn't mean you need to do it. If I incorporate your GPL code, distribute it, and don't license my code under the GPL, that means I'm distributing code without a license (or breaking a license). Unless I've crossed the line for criminal prosecution (which is far from anything we're discussing here), the worst-case consequence of that is .... damages.

If I've crossed the line into criminal prosecution, then the consequence is damages and jail time. I absolutely STILL do not need to license my code under the GPL.

(In most cases, it's a good idea to license the code under the GPL anyway, both to avoid branding/reputation damage and because that usually leads to an out-of-court settlement; but those considerations carry no legal force.)

Since people and github are contemplating repeatedly infringing, is there an avenue to increase these damages? This seems like repeated and willful infringement.
It's not willful by the users of copilot. Damages would be low since it's easy to show that users have no intention of infringing, and in most cases, aren't aware they're infringing.

If liability sits somewhere, it's with copilot, github, and Microsoft.

A lot of that might come down to bedside manner. Right now, github isn't super-polite to people whose code it used. That's probably a mistake. They'd be an unsympathetic evil megacorp in a jury trial.

With Copilot it's 10 lines of code by thousands of users.

It adds up.

Let's upload a lot of Oracle GPL code and find out. Oracle has certainly sued over 10 lines of code and for much higher damages.

But you know what? I think we'll find that Copilot will have magically skipped those Oracle repositories and only used code from lowly open source slaves.

Willful copyright infringement for monetary gain can be prosecuted as a criminal act in the United States (and many other countries including Japan) and it's highly possible Github themselves can end up in hot water for facilitating this.
> it’s highly possible Github themselves can end up in hot water for facilitating this.

It might be possible, I don’t know about “highly”. Have you checked the license exclusions required to use Github? Their terms already carve out a Copyright exception for Github, because they need it in order to host your code. There’s also no reason Github can’t filter certain licenses, or make it impossible to complete entire functions, or build an option for everyone to opt in to being autocomplete source material regardless of license, right? Any legal challenges are likely to result in changes to the feature before there are ever any serious repercussions.

I think it’s at least as likely, if not more so, that Copyright Law will evolve in response to the growing number of AI auto completers, and that we (society) will try to allow it within reason by being more specific about what constitutes automated infringement and who’s responsible for it. Fair Use currently exists but is vague and left up to courts to decide. In the meantime, Copyright is primarily intended to foster a balance between business and freedom of expression, and there’s a lot of open source software on Github that cares about freedom of expression and not about business. In any case, we don’t really want Copyright to represent some kind of absolute ownership land-lock over every string of 100 characters; that is a bit antithetical to both Copyright and the FOSS community.

wow the number of legal experts that appear and debate hypotheticals when everything is spelled out quite clearly in the license agreements is very high on this site.

Triply so when Microsoft is involved.

You and I have a different understanding of “willful”. If you’ve used copilot you’ll know that the vast majority of the time it’s not infringing anybody’s copyright, it’s creating code that is highly unique to the problem you are trying to solve.
All output of machine learning algorithms is derived from the training set. There is no creativity, just lots of complexity. What that means legally has yet to be fully determined.
If that were the case, how could models such as DALL-E 2 generate “Homer Simpson in The Godfather” type images? It’s clear that machine learning models are capable of independent creation.

As far as copilot goes, yes it’s possible to get it to recite copyrighted works, but in normal usage it is creating independent works because it is too influenced by the structure of your code around the insertion point to recite anything. It’s auto completing things like the variable names that you already declared, simple loops and function applications, etc.

> What that means legally has yet to be fully determined.

At least in the US, the Supreme Court ruled in Google v. Oracle that Google's copying of the Java API was fair use (it did not decide whether the API is copyrightable). Copilot users are very far from crossing the line; the courts are not going to come after some de minimis 10-line snippet that copilot generated.

Whether Microsoft itself was legally in the right by training copilot is a more interesting legal question that remains unresolved.

Do you see scope for GPL code trolls, something along the lines of patent trolls?
No. There's nothing magical about GPL code. Sticking a license on code doesn't suddenly lead to astronomical damages.

No one has won billions of dollars on GPL enforcement. It's not how courts work. Contrary to popular belief, courts also won't compel compliance (e.g. releasing my code); if I break your license, the standard recourse is damages, whether that's GPL or All Rights Reserved.

Otherwise, I'd make the First Born Child license, whereby by using my code, you give me full ownership of your first born child, your home, your car, and your bank account. I could write a license like that right now, but I couldn't force you to give me your child, car, bank account, and home. If you used my code, you'd have the option to accept the license and give me those things. Or you could reject it, in which case, it's a normal copyright violation; in that case, whatever I wrote in the license is moot, and you pay damages (and stop using my code).

In this case, the exchange is not fair, it's a scam, while in the case of the GPL, the exchange is fair (code for code), so it's a valid open contract. You use my code, I use yours.
Fair has nothing to do with it. Contracts don't need to be fair, and often aren't. They just need consideration. If we sign a contract whereby you give me your car, bank accounts, and house, for $1, that's a valid contract.

The only part which wouldn't be valid in a contract was the first-born child. That was a joke.

Indeed, if the GPL were a contract, courts might compel compliance.

However, the GPL is not a contract, it's a license. The FSF bent over backwards to make sure the GPL/AGPL licenses wouldn't be viewed as a contract, in part to limit liability / damages / risk.

Confusingly, some EULAs are framed as contracts, contrary to the acronym, and do expose users to much more risk of liability than the GPL.

The relevant part of the GPL is:

    You are not required to accept this License in order to receive or
    run a copy of the Program. Ancillary propagation of a covered work 
    occurring solely as a consequence of using peer-to-peer transmission to 
    receive a copy likewise does not require acceptance. However, nothing 
    other than this License grants you permission to propagate or modify 
    any covered work. These actions infringe copyright if you do not accept 
    this License. Therefore, by modifying or propagating a covered work, you 
    indicate your acceptance of this License to do so.
We often like to take a plain-text read, but that's misleading; this is legal jargon. It's one of those bits of text which needs to be explained by a lawyer, and one who specializes in both licensing and contract law.
> Fair has nothing to do with it. Contracts don't need to be fair, and often aren't. They just need consideration. If we sign a contract whereby you give me your car, bank accounts, and house, for $1, that's a valid contract.

It will be a gift. Gifts are valid, but they require the free will of the gifting party. Gifts made without free will can easily be canceled by a court.

Copilot doesn't reuse code. None of the code it regurgitates has the required license.
I pasted a few segments of code I'd written in a prominent project. Copilot regurgitated paraphrased versions of the rest of that code. It'd be hard to argue it's not a derivative work.
Thank you for pointing that out!

I should have put "reuse" in quotes, since I meant copilot takes reuse one step further and replicates or regurgitates code.