Hacker News
by nemomarx 308 days ago
Wasn't there that result about training them to be evil in one area affecting code generation?
1 comment

Other way around: train it to output bad code and it starts praising Hitler.

https://arxiv.org/abs/2502.17424