by gwern
3506 days ago
Or there's just so much work being done that people are stepping on each other's toes. There were like 5-10 papers at ICLR this year on just the same 'let's treat NN hyperparameters as an MDP' idea. Or look at the concrete distribution - two different Google groups published papers on almost exactly the same idea simultaneously, apparently in total ignorance of each other. Or the lipreading NNs from the past week - two Oxford groups working on different datasets with different architectures, both publishing >human results. The pace is so fast the left hand doesn't know what the right hand is doing. It's fun for the spectator (you have no idea what will come out tomorrow, much less next month or next year) but it strikes me as rather inefficient.
For the meta-learning papers, you may be interested in reading the related work section of the RL^2 paper (https://arxiv.org/pdf/1611.02779.pdf).
To quote:
"Our work draws inspiration from a particular line of work (Younger et al., 2001; Santoro et al., 2016; Vinyals et al., 2016), which formulates meta-learning as an optimization problem, and can thus be optimized end-to-end via gradient descent."
"Another line of work (Hochreiter et al., 2001; Younger et al., 2001; Andrychowicz et al., 2016; Li & Malik, 2016) studies meta-learning over the optimization process. There, the meta-learner makes explicit updates to a parametrized model."
So several groups were inspired by the same works, applied the meta-learning idea to RL problems, and submitted for the same ICLR deadline. That still makes sense, right?