I don't think GPL licenses really stop companies from scraping the code and using it to train their LLMs. Also, how would you prove they actually used your code?
Nothing. You protect yourself by keeping your code off GitHub. Anyway, once software is produced WITH LLMs, which will inevitably repeat your code, then you sue.
Open source is MADE to be copied. It's meant to be. You WANT it. You just want it to benefit EVERYONE, not only the corporations that make money from it, and there are defenses for that.
Open source, free software, is a philosophy, not a business opportunity per se. Business is just tangential to it.
I strongly agree. LLMs have eviscerated the core idea and motivation behind using the GPL, in a way I didn't see coming at all. We've published a lot of unique GPL code (e.g., in SageMath), which anybody can now easily "regenerate" as part of some other closed (or open) software using an LLM. They wouldn't even know they are creating a derived work of our GPL code. I have had to come to terms with this...
I don't think this is as clear as you say it is. If a person "regenerates" GPL code from memory, it is still a copyright violation. LLMs can't own copyright on their own, so a person must be responsible for the result. So it should be a violation. The GPL should help (if you actually sue).