Hacker News new | ask | show | jobs
by qayxc 1804 days ago
This would be even more difficult to achieve than previous attempts (e.g. in the Linux kernel [0]) due to the fact that an attacker needs to corrupt thousands of repositories that are guaranteed to be part of the training set.

Potential attackers would have two problems: 1) getting malicious checked into many repos and 2) making sure that these repos find their way into future deployed versions of GPT-3/Codex/CoPilot.

CoPilot generates enough vulnerable code as-is [1], so the extra effort isn't even required.

[0] https://www.bleepingcomputer.com/news/security/linux-bans-un...

[1] https://cyber-reports.com/2021/07/14/devsecai-github-copilot...