Hacker News
by Animats 1846 days ago
Their paper "Concrete problems in AI safety"[1] is interesting. Could be more concrete. They've run into the "common sense" problem, which I sometimes define, for robots, as "getting through the next 30 seconds without screwing up". They're trying to address it by playing with the weighting in goal functions for machine learning.

They write "Yet intuitively it seems like it should often be possible to predict which actions are dangerous and explore in a way that avoids them, even when we don’t have that much information about the environment." For humans, yes. None of the tweaks on machine learning they suggest do that, though. If your constraints are in the objective function, the objective function needs to contain the model of "don't do that". Which means you've just moved the common sense problem to the objective function.
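
To make that concrete, here's a minimal sketch, not anything from the paper. The names task_reward, is_dangerous and penalty_weight, the toy actions, and the numbers are all made up for illustration. Putting the constraint into the objective is one line of arithmetic; the hard part is that someone still has to supply is_dangerous.

```python
from typing import Callable

def penalized_objective(
    task_reward: Callable[[str], float],
    is_dangerous: Callable[[str], bool],
    penalty_weight: float,
) -> Callable[[str], float]:
    """Return an objective that subtracts a fixed penalty for dangerous actions."""
    def objective(action: str) -> float:
        # The "common sense" lives entirely in is_dangerous, which is assumed given.
        penalty = penalty_weight if is_dangerous(action) else 0.0
        return task_reward(action) - penalty
    return objective

# Hypothetical toy problem: the raw reward prefers the destructive action.
rewards = {"mop_floor": 1.0, "knock_over_vase": 1.5, "wait": 0.0}
hazards = {"knock_over_vase"}  # hand-written model of "don't do that"

objective = penalized_objective(
    task_reward=lambda a: rewards[a],
    is_dangerous=lambda a: a in hazards,
    penalty_weight=10.0,
)
best_action = max(rewards, key=objective)
```

With the hand-written hazards set the optimizer picks the harmless action. Take that set away and it happily knocks over the vase, which is the point: the weighting does nothing until something else has already labeled the dangerous actions.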

Important problem to work on, even though nobody has made much progress on it in decades.

[1] https://arxiv.org/pdf/1606.06565.pdf

1 comments

So they're working on safety, explainability, steerability and so forth. It seems like you could call these "holistic" properties, and they have always been hard to add to a system. Indeed, it seems like in a conventionally constructed system they tend to have to be engineered in from the start.

The success of deep learning might be something of a curse - it's had enough success that creating a safe system automatically seems to mean modifying a neural net to be safe, despite it not having the "engineered from the start" quality.

"The success of deep learning might be something of a curse"

Possibly. In AI, someone has a good idea, there's great enthusiasm, people in the field predict Strong AI Real Soon Now, the good idea hits a ceiling, and then people are stuck for a while. AI has been through four cycles of that. The ceiling of the current cycle may be in sight.

The next big problem is, as they say, "safety", or "common sense". Nobody really has a handle on how to do that yet. Checking proposed explorations against a simulation of some kind works if you can simulate the world well enough. Outside that niche, it's hard.
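
A rough sketch of the simulation check, under heavy assumptions: a hypothetical 1-D track with unsafe cells at both ends, and a simulate function that is assumed to match the real world exactly. All the names here (CLIFF, simulate, safe_actions, explore) are invented for the example.

```python
import random

CLIFF = {0, 9}       # hypothetical unsafe positions on a 1-D track
ACTIONS = (-1, 1)    # step left or step right

def simulate(position: int, action: int) -> int:
    """Toy world model. Assumed perfect, which is the easy case."""
    return position + action

def safe_actions(position: int) -> list[int]:
    """Keep only the actions whose simulated outcome avoids the cliff."""
    return [a for a in ACTIONS if simulate(position, a) not in CLIFF]

def explore(start: int, steps: int, seed: int = 0) -> list[int]:
    """Random exploration that vets each proposed action in simulation first."""
    rng = random.Random(seed)
    position, visited = start, [start]
    for _ in range(steps):
        candidates = safe_actions(position)
        if not candidates:
            break  # nothing passes the check, so stop exploring
        # In this toy, the "real" step is the simulator itself.
        position = simulate(position, rng.choice(candidates))
        visited.append(position)
    return visited

path = explore(start=5, steps=50)
```

The explorer never reaches cells 0 or 9, but only because simulate and the real dynamics are the same function here. Once the model and the world disagree, the filter approves actions it shouldn't.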

Collecting vast amounts of data to predict what can go wrong without an underlying model runs into combinatorial limits. More things can go wrong than you are likely to be able to observe.
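
Back-of-envelope version of that limit, assuming (as a deliberate simplification) that a situation is described by independent yes/no factors. The factor count and observation count below are illustrative, not measured.

```python
def coverage_upper_bound(num_binary_factors: int, num_observations: int) -> float:
    """Best-case fraction of distinct situations seen, if no observation repeats."""
    total_situations = 2 ** num_binary_factors
    return min(1.0, num_observations / total_situations)

# 40 yes/no factors and a billion observations (both hypothetical numbers).
fraction_seen = coverage_upper_bound(num_binary_factors=40, num_observations=10**9)
```

Since 2**40 is about 1.1 trillion, a billion observations cover under 0.1% of the combinations even in the best case, and every added factor halves that.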

Good that there are people thinking about this.