by NoZebra120vClip 1132 days ago
Perhaps your license permits Copilot reuse, but that is not true of every F/OSS license. Some require attribution of the original authors. Some require the distributor to make available the source code, including any modifications made to the software.

Software authors are not upset about the mere reuse of their code; it's the violation of such license terms that is problematic. If attribution is required, but neglected or impossible, that's typically known as "plagiarism", you know.

First, if the copying is found to be fair use (which is very likely), then attribution and the other requirements of a copyright license will not apply.

Second, the only aspects of code that need to follow the license are the parts of the code that are covered by copyright. That excludes anything that is functional. Since optimizations are functional and not expressive in nature, an optimized sorting algorithm, for example, would not be covered. What would be covered is how that algorithm is organized… the API, file structure, class names, i.e., the arbitrary parts of code that everyone argues about.

If copying by AI is generally found to be fair use, then we will see this in music, porn, advertising, politically associated situations, and other situations where authors have a history of disagreeing with how their work gets used. Unstable Diffusion is an ongoing test of how far fair use may be applied.

I find it very likely that copyright law will be changed if training on copyrighted material becomes universally allowed under fair use. The alternative is that training on software code is allowed, but training on images/videos/music is not, which I do not find likely.

> Second, the only aspects of code that need to follow the license are the parts of the code that are covered by copyright

The legal system doesn't generally work that way. The question judges tend to look at is whether the accused party can reasonably be said to have copied someone else's work without permission. We can look at either the Napster or The Pirate Bay court cases and see how little weight judges tend to give arguments that rely exclusively on a technical detail (A torrent file is not the same as a movie!).

The actual test will happen once Microsoft's source code gets leaked and we start training our models on it. Until then we will keep hearing how using our work for free is good and that copyright is out of fashion. As soon as the tables are turned, this narrative will go away.