Hacker News new | ask | show | jobs
by ghoward 1809 days ago
I understand your concerns.

Honestly, with this whole debacle, I am not going to be accepting outside contributions anyway.

I also understand the concern with a problematic license. However, I don't plan to make a specific exemption about machine learning, but rather tie up an ambiguity.

What I think I'll do is that the license will require that when the licensed source code is used, partially or fully, as an input to an algorithm, the license terms must be distributed with the output of that algorithm.

I don't think this is a violation of the GPL at all because the GPL requires you to distribute the license with the binary code of GPL'ed code, and such binary code is the output of an algorithm (the compiler) whose input was the source code.

But what it would do is put the onus on GitHub that, if they used my code in training that data, if they distributed the results (as they are doing), they must distribute my license terms as well and tell users that some of the results are under those terms.

1 comments

> binary code is the output of an algorithm (the compiler) whose input was the source code.

Just because binary code is produced by the operation of an algorithm on source code doesn’t make all output produced any algorithm on that source code binary code. Otherwise checksums and hashes and prime numbers would be copyrighted.

Bats are not birds.

You have a point, which is why the legal system would still require that a copy be substantial before they count it as infringing. I would argue that Copilot has already been shown to copy substantial portions, though.