Hacker News
by h2odragon 1132 days ago
When I published stuff to GitHub, it had open licenses: I wanted anyone and everyone to make whatever use of it they could. I didn't foresee this use, and I'm not fond of Microsoft (to say the least), but it certainly falls into the area of things I explicitly allowed when publishing.

I suspect many others who publish there feel the same way.

6 comments

Perhaps your license permits Copilot reuse, but that is not true of every F/OSS license. Some require attribution of the original authors. Some require the distributor to make available the source code, including any modifications made to the software.

Software authors are not upset about the mere reuse of their code; it's the violation of such license terms that is problematic. If attribution is required but neglected or impossible, that's typically known as "plagiarism", you know.

First, if the copying is found to be fair use (which is very likely), then attribution and the other requirements of a copyright license will not apply.

Second, the only aspects of code that need to follow the license are the parts that are covered by copyright. That excludes anything that is functional. Since optimizations are functional rather than expressive in nature, an optimized sorting algorithm, for example, would not be covered. What would be covered is how that algorithm is organized… the API, file structure, class names, i.e., the arbitrary parts of code that everyone argues about.

If copying by AI is generally found to be fair use, then we will see this in music, porn, advertising, politically charged contexts, and other situations where authors have a history of disagreeing with how their work gets used. Unstable Diffusion is an ongoing test of how far fair use may be applied.

I find it very likely that copyright law will be changed if training on copyrighted material becomes universally allowed under fair use. The alternative is that training on software code is allowed but training on images/videos/music is not, which I do not find likely.

> Second, the only aspects of code that need to follow the license are the parts that are covered by copyright

The legal system doesn't generally work that way. The question judges tend to look at is whether the accused party can reasonably be said to have copied someone else's work without permission. We can look at either the Napster or The Pirate Bay court cases and see how little weight judges tend to give arguments that rely exclusively on a technical detail (a torrent file is not the same as a movie!).

The actual test will happen once Microsoft’s source code gets leaked and we start training our models on it. Until then we will keep hearing how using our work for free is good and that copyright is out of fashion. As soon as the tables are turned, this narrative will go away.

Every creator and author has their own idea of what they want and do not want. A license is rarely comprehensive enough to cover all of it, and as time goes on those ideas can also change with the author.

The best thing you can do now is start porting projects elsewhere, leaving the history behind on that platform and pointing users to the new home/mirrors. You could also consider a different license. I hope the FSF comes up with an exception like they did in the A of AGPL about how LLMs can use the data (or require the data to be open, etc.).

The problem is when someone uploads something to GitHub that they have a license to share (e.g. via the GPL) but are not the copyright owner of.

Well, I certainly had some expectations that are covered in the license, i.e., that derivative work is subject to some constraints and that copyright notices are not removed from the code.
> I wanted anyone and everyone to make whatever use of it they could.

> didn't foresee this use

So you didn’t really want just any use. You only wanted the uses you found acceptable? Then you didn’t really want it to be “open”.

I didn't foresee it, I do not object to it, and I probably would not have objected had I known beforehand.

Code I don't want others to use, I don't publish.

I only work on open source code that I am either getting paid for now or was paid for in the past, and that I have genericized and taken through my employer’s very straightforward open source process.

By default the license we use is MIT. If I ever did for some reason choose to open source my own work, it would be a similar license.

I don’t like the idea of claiming something is “open” and then placing restrictions on it.