by joe_the_user 2393 days ago
I am confused by how this is supposed to be useful. It seems like the researchers are defining side-effects as things that "disrupt the world" (of this life game) and training an AI to avoid this.

But this seems like at best one of a whole host of unexpected effects one might consider. An AI that discriminates in a way that society frowns on might not "disrupt the world" in such a visible fashion.

I don't see how one can get away with an entity doing stuff for you without that entity understanding your model of the world.

1 comment

Yes, this is one specific safety problem -- there are many other RL safety problems that deserve high-quality benchmarks too. See e.g. https://arxiv.org/pdf/1606.06565.pdf or https://medium.com/@deepmindsafetyresearch/building-safe-art... for discussions of the problem space.