by cgearhart
1093 days ago
No, I don’t think it’s just a question of recall accuracy. The issue really hinges on whether or not the AI itself is a derivative work of the training data, as I think that would trigger certain requirements in the original source licenses. Lots of folks seem to think that it is not a derivative work because (a) the model is just a bunch of numeric weights and doesn’t contain any explicit code; and (b) it’s possible for the model to output original code in some cases. But that’s flawed reasoning, because it’s quite clear that the model weights do contain perfect copies of at least some training code, and the models can reproduce that code perfectly (without the original license) when prompted. Thus it seems clear that the model itself should be treated as a derivative work, whereas a human is not, even if they memorize the code they read.