Hacker News
by bick_nyers 531 days ago
I suspect the big AI companies try to adversarially train that out, since it could be used to "jailbreak" their AI.

I wonder though, what would be considered a meaningful punishment/reward to an AI agent? More/less training compute? Web search rate limits? That assumes that what the AI "wants" is to increase its own intelligence.