|
> Pre-training is, actually, our collective gift I feel like this wording isn't great when there are many impactful open source programmers who have explicitly stated that they don't want their code used to train these models and licensed their work in a world where LLMs didn't exist. It wasn't their "gift", it was unwillingly taken from them. > I'm a programmer, and I use automatic programming. The code I generate in this way is mine. My code, my output, my production. I, and you, can be proud. I've seen LLMs generate code that I have immediately recognized as being copied a from a book or technical blog post I've read before (e.g. exact same semantics, very similar comment structure and variable names). Even if not legally required, crediting where you got ideas and code from is the least you can do. While LLMs just launder code as completely your own. |
That’s been the fate of many creators since the dawn of time. Kafka explicitly stated that he wanted his works to be burned after his death. So when you’re reading about Gregor’s awkward interactions with his sister, you’re literally consuming the private thoughts of a stranger who stated plainly that he didn’t want them shared with anyone.
Yet people still talk about Kafka’s “contribution to literature” as if it were otherwise, with most never even bothering to ask themselves whether they should be reading that stuff at all.