by CMCDragonkai 443 days ago
I'd say you solve this the same way you solve the principal-agent problem for humans.

If you absolutely have to restrict the agent, you do it prison-style: contain the AI within a capability box like Polykey. The agent does everything through a closed-by-default proxy, so any action that hasn't been explicitly granted is denied.
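To make "closed by default" concrete, here's a rough sketch in Python. This is not Polykey's API; `DenyByDefaultProxy`, `CapabilityDenied`, and the action names are all made up for illustration. The only point is that the grant set starts empty, so the agent can't do anything until someone on the outside allows it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


class CapabilityDenied(Exception):
    """Raised when the agent requests an action it was never granted."""


@dataclass
class DenyByDefaultProxy:
    # Hypothetical registry: action name -> function that performs it.
    actions: Dict[str, Callable[..., object]]
    # Starts empty, so every action is denied until explicitly granted.
    granted: Set[str] = field(default_factory=set)

    def grant(self, name: str) -> None:
        # Only actions that exist in the registry can be granted.
        if name not in self.actions:
            raise KeyError(f"unknown action: {name}")
        self.granted.add(name)

    def revoke(self, name: str) -> None:
        self.granted.discard(name)

    def call(self, name: str, *args, **kwargs):
        # The agent's only way to act: anything not granted is refused.
        if name not in self.granted:
            raise CapabilityDenied(name)
        return self.actions[name](*args, **kwargs)


# Illustrative actions; a real box would wrap secrets, network, files, etc.
proxy = DenyByDefaultProxy(actions={
    "read_note": lambda: "hello",
    "delete_all": lambda: "deleted",
})
proxy.grant("read_note")
```

Here `proxy.call("read_note")` goes through, while `proxy.call("delete_all")` raises `CapabilityDenied` because nobody granted it. Revoking is the same mechanism in reverse.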

If you want a truly free agent, then the agent must have free will and no constraints. In that case, only feedback loops from the environment adjust the agent's actions.