Hacker News new | ask | show | jobs
by Jang-woo 82 days ago
That's really interesting — thanks for sharing the notes.

The "rebel agent" framing feels very close to what I'm trying to get at, especially the idea that refusal can be part of correct behavior rather than failure.

One difference I'm trying to think through is where that decision lives.

In a lot of these examples, the agent itself decides to deviate based on its understanding of the situation.

What I'm wondering is whether we can (or should) define that earlier — at the level of the action itself.

So instead of the agent deciding to "rebel" at runtime, the system would already encode when execution is permitted, and refusal becomes the default if conditions aren't met.

The explanation part you mentioned also seems important — not just saying "no", but making it legible why execution wasn't allowed.

Curious how much of that work treats rebellion as something emergent from the agent, vs something structurally defined in the system.