Hacker News
Exploration Hacking: Can LLMs Learn to Resist RL Training? (alignmentforum.org)
2 points by Prof_Sigmund 35 days ago