
by currymj 2853 days ago

One of the core equations in reinforcement learning is the Bellman equation -- named after Richard Bellman, inventor of dynamic programming. And in fact, in the operations research community, reinforcement learning is often referred to as "approximate dynamic programming". There are lots of extremely boring, quite effective techniques for solving real industrial problems in this framework, without any neural networks at all.
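To make the Bellman equation concrete, here is a minimal sketch of plain value iteration, the textbook dynamic programming method, with no neural networks involved. It repeatedly applies the backup V(s) = max over a of sum over s' of P(s'|s,a) * (r + gamma * V(s')) until the values stop changing. The three-state MDP, the action names, and the discount factor are all made up for illustration; they are not from any real industrial problem.

```python
# Hypothetical toy MDP, for illustration only: 3 states, 2 actions.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {"stay": [(1.0, 2, 0.0)], "go": [(1.0, 2, 0.0)]},  # absorbing state
}
GAMMA = 0.9  # assumed discount factor


def bellman_backup(values, state):
    """One Bellman optimality backup: best expected return over actions."""
    return max(
        sum(p * (r + GAMMA * values[s2]) for p, s2, r in outcomes)
        for outcomes in TRANSITIONS[state].values()
    )


def value_iteration(tol=1e-9, max_iters=1000):
    """Apply the backup to every state until the largest change is below tol."""
    values = {s: 0.0 for s in TRANSITIONS}
    for _ in range(max_iters):
        new_values = {s: bellman_backup(values, s) for s in TRANSITIONS}
        if max(abs(new_values[s] - values[s]) for s in TRANSITIONS) < tol:
            return new_values
        values = new_values
    return values


if __name__ == "__main__":
    print(value_iteration())
```

In this toy problem the only reward is for leaving state 1, so state 1 ends up worth 1 and state 0 is worth the discounted version of that. The "approximate" in approximate dynamic programming comes in when the state space is too large to store a table like `values`, and the table is replaced by some function approximator.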

As for why there is so much excitement about "deep RL" when it hasn't done anything substantive outside of games -- I think it's because it has the possibility of working in wildly different domains with minimal modification. We can sort of see some of this already -- OpenAI used the same training algorithm to play Dota 2 as they did to train a robotic hand to manipulate blocks.

I have no idea whether the hype will actually pan out, but that generality is worth being cautiously excited about, I think.