Hacker News
by nextaccountic 24 days ago
> I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that.

The paperclip factory problem is definitely a misalignment issue. That's because we expect agents to be aligned not only with your immediate prompt, but also with shared, implicit values.