Training on code that unintentionally contains vulnerabilities is a problem, but I'm even more worried about bad actors intentionally putting vulnerable code on GitHub in the hope that it becomes training data. They might learn how to disguise that code to sneak it into Copilot (if disguise is even necessary) and introduce backdoors, etc. It could be especially dangerous because of the "stamp of approval" Copilot has from GitHub/Microsoft. People who would never copy/paste code from the web might get a false sense of security from using Copilot.