by PeterisP 1405 days ago
Currently most of the ML field is working on an assumption (untested in courts, but also not disputed in courts) that a trained model is not a derivative work of its source data, as copyright law defines the term, no matter what the colloquial understanding of "derivative" implies. There is also a serious argument that the model is not copyrightable at all (since any noncreative transformation of data is not copyrightable, and there is solid precedent stating that it is absolutely irrelevant how much effort and money the transformation took), in which case the model simply can't be a derivative work.

If that assumption holds (we won't know for sure until there is some precedent, multiple years down the line), then there is no violation of copyright law even if they refuse the license.

In essence, I do not think it is appropriate to say that "Copilot is violating licenses", because it is an open legal question whether that act is a violation or fully within their rights. It is undecided at the moment, and there will be an answer only after we get either a relevant precedent ruling or new legislation, both of which will take quite some time.