by ansgri 3257 days ago
https://en.wikipedia.org/wiki/Model_predictive_control

Of course imagining possible outcomes before executing is useful! And it has many uses outside deep learning. No reason to invent new words, really, at least not without referring to the established ones.

Maybe there is a serious novel idea, but I've missed it.

Basically, if you need to control a complex process (i.e. bring some future outcome into line with your plan), you can build a forward model of the system under control (which is simpler than a reverse model) and use an optimization technique (combinatorial, e.g. graph-based; numeric derivative-free, e.g. pattern search; or differential) to find the optimal current action.
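To make the recipe above concrete, here is a minimal sketch of the derivative-free variant, using random shooting rather than pattern search. Everything in it is an assumption for illustration: the toy point-mass `forward_model`, the quadratic `cost`, the horizon and the candidate count are not taken from the article or from any particular MPC library.

```python
import random

def forward_model(state, action, dt=0.1):
    """Hypothetical toy dynamics: 1-D point mass, action is an acceleration."""
    pos, vel = state
    vel = vel + action * dt
    pos = pos + vel * dt
    return (pos, vel)

def cost(state, target):
    """Penalize distance from the target and, lightly, residual velocity."""
    pos, vel = state
    return (pos - target) ** 2 + 0.1 * vel ** 2

def mpc_action(state, target, horizon=15, n_candidates=300, rng=None):
    """Sample random action sequences, roll each through the forward model,
    and return only the first action of the cheapest sequence."""
    rng = rng or random.Random(0)
    best_cost, best_first = float("inf"), 0.0
    for _ in range(n_candidates):
        plan = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = state, 0.0
        for a in plan:
            s = forward_model(s, a)
            total += cost(s, target)
        if total < best_cost:
            best_cost, best_first = total, plan[0]
    return best_first

def run_closed_loop(steps=100, target=1.0, seed=0):
    """Re-plan at every step (receding horizon) and apply one action."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    for _ in range(steps):
        a = mpc_action(state, target, rng=rng)
        state = forward_model(state, a)
    return state

if __name__ == "__main__":
    print(run_closed_loop())
```

Only the first action of the best plan is executed before re-planning, which is what separates MPC from open-loop trajectory optimization. Here the forward model is handwritten; the replies below are about the case where it has to be learned.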

5 comments

The link between imagining and deep learning is rather in the opposite direction - it has always been obvious that imagining possible outcomes before executing would be useful, but the novelty is that deep learning has allowed them to actually make "imagination" that works.

MPC is a useful concept if you have a predictive model that's at least vaguely close to the actual behavior. In some contexts (e.g. modeling of particular industrial systems) programmers could build such a model, but in the general case that's absolutely not feasible. The world is full of problems where, practically speaking, you cannot manually build a forward model of the system under control.

So this article is about initial research on systems that can construct such a predictive model/imagination from experience. It offers a proof of concept that current deep learning approaches let us build systems that learn such predictive models, which wasn't really possible before. Further development of this concept seems to be the way we can actually apply things like MPC to problems where we won't build a forward model ourselves; in the long run, that means pretty much all problems.

> systems that can construct such a predictive model/imagination from experience

I just want to emphasize this point as the crux here. We have many, many techniques for AI that involve doing roll-outs once a smart human with domain knowledge hands the system a fully-formed model of the dynamics. Not so many where the dynamics are learned.

In the 1990s, I was thinking of this for legged locomotion over rough terrain, fast turns, and such. The idea was to use a mediocre but fast physical simulation to answer "what-if" questions, allowing planning of moves about two or three steps ahead in the real world, or at least a realistic simulator. Then use a learning model to learn corrections for differences between the mediocre simulator and the "real world", or good simulator. The system would start out somewhat klutzy and get better. Eventually, perhaps to the parkour level.
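The "fast simulator plus learned correction" idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the system described above: `fast_sim` ignores drag, `real_world` is a stand-in with a made-up linear drag term, and the correction is a single coefficient fitted by one-dimensional least squares rather than a full learning model.

```python
import random

def fast_sim(v, a, dt=0.1):
    """Mediocre but fast model: velocity update with no drag (assumed)."""
    return v + a * dt

def real_world(v, a, dt=0.1):
    """Stand-in for reality or a good simulator: adds a drag term (assumed)."""
    return v + (a - 0.5 * v) * dt

def collect(n=200, seed=0):
    """Gather (velocity, action, observed next velocity) samples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        v = rng.uniform(-2.0, 2.0)
        a = rng.uniform(-1.0, 1.0)
        samples.append((v, a, real_world(v, a)))
    return samples

def fit_correction(samples):
    """Fit residual = observed - fast_sim as k * v by least squares."""
    num = sum(v * (v_next - fast_sim(v, a)) for v, a, v_next in samples)
    den = sum(v * v for v, _, _ in samples)
    return num / den

def corrected_sim(v, a, k):
    """Fast simulator plus the learned correction term."""
    return fast_sim(v, a) + k * v

if __name__ == "__main__":
    k = fit_correction(collect())
    print(k, corrected_sim(1.0, 0.3, k), real_world(1.0, 0.3))
```

In this toy case the residual is exactly linear in velocity, so one coefficient recovers it. A real mismatch between simulator and world would need a richer learned model.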

Once you have a model, you can invert it to make a controller, as the post above points out. For classical linear models, this can be done analytically. For non-linear models, you can use the model to train a controller, running the model with random inputs to generate a training set.
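A rough sketch of the non-linear case: run the model with random inputs, record what happened, and use that record as a training set for a controller. The `model` below is a made-up example, and a nearest-neighbour lookup stands in for whatever learner one would actually train; both are assumptions for illustration.

```python
import math
import random

def model(x, u):
    """Hypothetical non-linear forward model: next state from state and input."""
    return 0.9 * x + 0.5 * math.sin(u)

def make_training_set(n=5000, seed=0):
    """Drive the model with random states and inputs, recording
    (state, resulting next state, input that produced it)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-math.pi / 2, math.pi / 2)
        data.append((x, model(x, u), u))
    return data

def controller(data, x, x_desired):
    """Inverse model by lookup: return the input whose recorded
    (state, next state) pair is closest to the requested one."""
    best = min(data, key=lambda d: (d[0] - x) ** 2 + (d[1] - x_desired) ** 2)
    return best[2]

if __name__ == "__main__":
    data = make_training_set()
    u = controller(data, 0.2, 0.4)
    print(u, model(0.2, u))
```

The input range is limited to where `sin` is monotonic so that the inverse is unique. Where several inputs reach the same next state, a learned inverse can average between them and fail.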

(I spent several years working on the simulator problem, shipped a simulation product ("Falling Bodies", the first ragdoll simulator that didn't suck)[1] and eventually sold the technology to a physics engine startup and went on to other things. Even today, as Sony and Boston Dynamics have demonstrated at great time and expense, there's no market for legged robots yet.)

[1] https://www.youtube.com/watch?v=5lHqEwk7YHs

Precisely what I wanted to say. The 'imagination' they describe is simply reasoning about the future based on current information (with implied uncertainty). Large chunks of any AI textbook are about analyzing future states from present states, planning actions to manipulate that, and the effects of uncertainty on it.

'Imagination' isn't even a good word for it---in conversational English, we often use the word for thinking about models of fictitious states that can't happen, which has subtle value for humans, but not yet for machines.

Moreover, data-driven control isn't a new concept. It's not my field so I can't comment on what's new here, but I've heard about learning dynamics and rewards in a control theory context plenty of times.

In a control theory context, we're likely talking about inferring a small handful of parameters where the relationships between them are well known. In this paper they're inferring the entire dynamics of an environment from thousands of raw pixel values. This is not something that admits a tractable exact optimization.

That may be fair. I only read the first paper with the spaceships and mazes, which are much more traditional problems.

It sounds like the second paper is the more interesting one from your description though, so I will give that a read.

>Maybe there is a serious novel idea, but I've missed it.

I don't know if learned models are novel, but they certainly aren't vanilla MPC. (In my quick scan of them, only the second paper mentions learning models.)