by cpcallen 1077 days ago
I'm disappointed that the article is only making the (somewhat pedantic) distinction between source code and weights. From the quotation marks in the headline I hoped that it would instead be making the distinction between human-readable source code and machine-readable compiled form.

For example, IMHO (IANAL) an AI code-completion tool that has been trained on GPL software is (or should be) legal to distribute only if it is accompanied by the training code _and all the code ingested during training_ (or an offer to provide such code upon request).

1 comment

This is an interesting point. Reading the OSI Open Source Definition, specifically its section on source code (quoted below), I'm inclined to treat the training data as part of the source code when deciding whether a model counts as open source.

  2. Source Code
  The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
https://opensource.org/osd/