Probably comes down to whether the model can be trained with gradient descent (at least in the short term).

A general, pre-trained, RL-guided architecture search (#1), together with more choices of nonlinearity (#2), feature extraction (#3), pooling and memory augmentation (#4), and other tricks (#5), could be very powerful across many domains. Let it accept multiple pre-trained models as priors and we're well on our way to general AI, or at least to a place where most data scientists could be automated away.

(#1: DeepMind had a demo a year or so back that was quite novel.) (#2: I vaguely remember someone training decision trees with gradient descent; I could definitely see a 'random forest' layer appearing in the middle of deep nets.) (#3: just convolutions + tricks, really.) (#4: Neural Turing Machine etc.) (#5: any attention mechanism, any sequence mechanism (RNN/LSTM etc.), or any graph/relational understanding like the recent DeepMind paper.)
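
To make #2 a bit more concrete, here is a rough sketch of what "a decision tree trained with gradient descent" could look like. This is not the work I half-remember, just my own toy: a single soft split (a stump) where the hard "go left / go right" test is replaced by a sigmoid gate so the whole thing is differentiable. The names (`predict`, `train`, `left`, `right`), the toy data, and the hyperparameters are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(params, x):
    # params = (w, b, left, right): gate weight, gate bias, two leaf values
    w, b, left, right = params
    p = sigmoid(w * x + b)  # soft probability of routing x to the right leaf
    return (1.0 - p) * left + p * right

def mse(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def train(data, params, steps=2000, lr=0.3):
    # Plain batch gradient descent with hand-derived gradients of the MSE.
    w, b, left, right = params
    n = len(data)
    for _ in range(steps):
        gw = gb = gl = gr = 0.0
        for x, y in data:
            p = sigmoid(w * x + b)
            out = (1.0 - p) * left + p * right
            d_out = 2.0 * (out - y) / n                   # dLoss/d_out
            d_z = d_out * (right - left) * p * (1.0 - p)  # chain rule through the gate
            gw += d_z * x
            gb += d_z
            gl += d_out * (1.0 - p)
            gr += d_out * p
        w -= lr * gw
        b -= lr * gb
        left -= lr * gl
        right -= lr * gr
    return (w, b, left, right)

# Toy 1-D problem (assumed for illustration): label is 1 when x > 0, else 0.
data = [(i / 10.0, 1.0 if i > 0 else 0.0) for i in range(-10, 11) if i != 0]
initial = (1.0, 0.0, 0.0, 0.0)
trained = train(data, initial)
print("loss before:", mse(initial, data), "loss after:", mse(trained, data))
```

Because every piece is differentiable, a block like this could in principle sit in the middle of a network and be trained end to end with everything else, which is the property the 'random forest' layer idea would need. A real version would stack many such gates into a deeper tree and average several trees.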