Hacker News
by gradstudent 3257 days ago
I'm not a planning guy, but I work in a closely related community, so I'm at least somewhat familiar with the area.

Looking at the first paper (https://arxiv.org/pdf/1707.06170.pdf), it seems surprisingly shallow and light on details. So they have a learning system for continuous planning. So what? The AI Planning community has been doing this for ages with MDPs and POMDPs, solving problems where the planning domain has some discrete variables and some continuous variables. Here's a summary tutorial from Scott Sanner at ICAPS 2012: http://icaps12.icaps-conference.org/planningschool/slides-Sa...

Speaking of ICAPS: this conference is the primary venue for disseminating scientific results to researchers in the area. Yet the authors here cite exactly one ICAPS paper. WTF?

My bullshit detector is blaring.

2 comments

I agree. Besides (PO)MDPs, the control people also get into neural networks whenever they come into vogue.

This thesis from 2000 was the first Google hit for "reinforcement learning control theory": http://www.cs.colostate.edu/~anderson/res/rl/matt-diss.pdf

BTW, people in related fields may work on similar things but don't always publish at the same venue -- labels matter. For example, ICRA and RSS are some of the top robotics venues and people trying to sell themselves as roboticists will prefer to publish there.

EDIT: In the second paper, they learn the model only from the images, not from the game state, which is neat. That should be highlighted more than the one sentence it was given.

Not my field either, but prima facie it does seem suspiciously close to good-old hallucinated feedback-like techniques, POMDPs, etc. in the planning / ML-oriented robotics community (see e.g. [1]). Didn't read too carefully, though...

[1] Boots et al. (2011). Closing the learning-planning loop with predictive state representations. http://journals.sagepub.com/doi/10.1177/0278364911404092