Hacker News
by skeledrew 25 days ago
> It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way

Perhaps, but that's also a good way to lose users and reputation, since there's no way to control when said malware is generated. Once the first instance is discovered, cybersec researchers will have a field day reproducing it and showing the world.