Hacker News
by Copyrightest 104 days ago
NetBSD has a very reasonable stance:

  If you commit code that was not written by yourself, double check that the license on that code permits import into the NetBSD source repository, and permits free distribution. Check with the author(s) of the code, make sure that they were the sole author of the code and verify with them that they did not copy any other code.

  Code generated by a large language model or similar technology, such as GitHub/Microsoft's Copilot, OpenAI's ChatGPT, or Facebook/Meta's Code Llama, is presumed to be tainted code, and must not be committed without prior written approval by core.
https://www.netbsd.org/developers/commit-guidelines.html

No, it is not reasonable to presume code generated by any large language model is "tainted code." What does that even mean? It sounds like a Weird Al parody of the song "Tainted Love."
“Taint” has been a term of art in Open Source for decades. That you don’t know this reveals your ignorance, not any sort of cleverness.

LLMs regurgitate their training data. If they’re generating code, they’re not modeling the syntax of a language to solve a problem, they’re reproducing code they ingested, code that is covered by copyright. Just regurgitating that code via an LLM rather than directly from your editor’s clipboard does not somehow remove that copyright.

It’s clear you think you should be allowed to use LLMs to do whatever you want. Fortunately there are smarter people than you out there who recognize that there are situations where their use is not advised.