Hacker News
by Xss3 215 days ago
Humans have incentives to not do those things. Family. Jail. Money. Food. Bonuses. Etc.

If we could align an AI with incentives in the same way we can a person, then you'd have a point.

So far alignment research is hitting dead ends no matter what fake incentives we try to feed an AI.

1 comment

Can you remind me of the link between alignment and writing accurate documentation? Honestly don't understand how they are linked.
You want the AI aligned with writing accurate documentation, not aligned with a goal that's near but wrong, e.g. writing accurate-sounding documentation.