HN Mirror

by breve 1 hour ago

If an LLM is trained on GPL code, then that code has become an intrinsic part of the model (because if it hasn't, then what was the value of training on it?). So shouldn't that model now also be licensed under the GPL?

And how do I know the LLM output is not reproducing substantial chunks of GPL'd code, making my code GPL?

3 comments

Ekaros 1 hour ago

Or alternatively: an LLM is not human, and non-human-generated content has no copyright protection, meaning all generative model output is automatically public domain.

olsondv 1 hour ago

GitHub Copilot has filters for enterprise that remove the GPL code before it gets returned. At least that’s how my company has been covering itself.

imglorp 1 hour ago

Maybe this, but multiply by N licenses. Any given output may have ideas from all of them.

Law is probably going to take a while to catch up here.
