Learning to Execute and Neural Turing Machines


	Learning to Execute and Neural Turing Machines (plus.google.com)
	41 points by wojzaremba 4261 days ago

3 comments

radarsat1 4261 days ago

The Turing Machine idea makes a lot of sense... the machine is simply a state machine graph that interacts with the memory, so it's sensible that it could be "learned", much as with any genetic algorithm approach. Making the whole system differentiable is a pretty cool trick, however.
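
To make the differentiability point concrete, here is a minimal sketch of "soft" memory access: instead of picking one memory row, the machine reads and writes a weighted blend of all rows, so every step is a smooth function of the weights. The function names, the sharpness parameter, and the erase/add form of the write are illustrative assumptions in plain Python, not the authors' code.

```python
import math

def softmax(scores):
    # Numerically stable softmax: weights are positive and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)  # small constant avoids division by zero

def content_weights(memory, key, sharpness):
    # Hypothetical helper: rows similar to `key` get more weight.
    return softmax([sharpness * cosine(row, key) for row in memory])

def soft_read(memory, weights):
    # Weighted average of all rows rather than a hard lookup of one row.
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]

def soft_write(memory, weights, erase, add):
    # Each row is partially erased and partially overwritten, scaled by its weight.
    return [
        [cell * (1 - w * e) + w * a for cell, e, a in zip(row, erase, add)]
        for w, row in zip(weights, memory)
    ]
```

Because nothing here involves a discrete choice, gradients can flow through the read and write steps, which is what lets gradient descent stand in for a search procedure such as a genetic algorithm.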

That said, the biggest challenge here, I imagine, is evaluating the learned system. It may give right answers, but how often does it give wrong answers? How can the learned "machine" be tested for correctness? How does overfitting come into the picture? For instance, halting can be neither proved nor guaranteed. This strikes me as a fundamental advantage of the more functional "feed forward" approach of most learning systems.


shawntan 4261 days ago

The paper discusses putting the NTM through several tasks, and tests for "overfitting", or how well it has generalised the task, by giving it a longer task than it has seen during training. For example, in the copy task, they trained it on sequences of length 20, but tested it on a sequence of length 100.

Of course, this doesn't guarantee anything, but they also take a look at some of the internals of the learnt system which are more easily interpreted, and found that it does some pretty consistent things.
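
A rough sketch of that kind of length-generalisation check, for anyone who wants to try it on their own model. Everything here is an assumption for illustration: the 8-bit vector width, the `evaluate` helper, and the `truncating_model` stand-in, which simply mimics a system that only "works" up to the training length.

```python
import random

def make_copy_example(length, width, rng):
    # A random binary sequence; the target is an exact copy of the input.
    seq = [[rng.randint(0, 1) for _ in range(width)] for _ in range(length)]
    return seq, [row[:] for row in seq]

def bit_error_rate(predicted, target):
    wrong = sum(
        p != t
        for p_row, t_row in zip(predicted, target)
        for p, t in zip(p_row, t_row)
    )
    total = sum(len(row) for row in target)
    return wrong / total

def evaluate(model, lengths, width=8, trials=20, seed=0):
    # Average bit error rate of `model` at each sequence length.
    rng = random.Random(seed)
    results = {}
    for length in lengths:
        errors = []
        for _ in range(trials):
            inputs, target = make_copy_example(length, width, rng)
            errors.append(bit_error_rate(model(inputs), target))
        results[length] = sum(errors) / len(errors)
    return results

def truncating_model(inputs, limit=20):
    # Hypothetical stand-in for a learned model: copies the first `limit`
    # rows correctly and outputs zeros for anything longer.
    return [row[:] if i < limit else [0] * len(row) for i, row in enumerate(inputs)]

results = evaluate(truncating_model, lengths=[20, 100])
```

A model that has only memorised the training regime looks perfect at length 20 and falls apart at length 100, which is the failure this kind of test is meant to expose. Passing it is still evidence, not proof.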


noiv 4261 days ago

In many cases there are no correct answers, only adequate ones. Did I choose the right job, car or partner? Who can tell?


agibsonccc 4261 days ago

I would like to add for those of you not familiar: The Neural Turing Machine method uses a neural network called a recurrent neural net. Recurrent neural nets are used in modeling time series data and have a neat concept of training called back propagation through time.

Here's a neat tutorial with an RBM (typically a feed forward net) as a recurrent net, for those who want to just see what a recurrent net "looks like":

http://deeplearning.net/tutorial/rnnrbm.html
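
For those who would rather see the idea in a few lines than in a full tutorial, here is a minimal sketch of a recurrent net and back propagation through time. It assumes a single scalar hidden unit with a tanh activation and a squared-error loss on the final state only; the names `w`, `u`, `forward` and `bptt` are made up for this example and are unrelated to the linked tutorial's code.

```python
import math

def forward(w, u, xs):
    # Unroll the recurrence h_t = tanh(w * h_{t-1} + u * x_t), starting at h_0 = 0.
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, y):
    # Squared error between the final hidden state and the target.
    return 0.5 * (forward(w, u, xs)[-1] - y) ** 2

def bptt(w, u, xs, y):
    # Walk the unrolled network backwards, accumulating the gradient that
    # each time step contributes to the shared weights w and u.
    hs = forward(w, u, xs)
    dh = hs[-1] - y                       # dL/dh_T
    dw = du = 0.0
    for t in range(len(xs), 0, -1):
        dpre = dh * (1.0 - hs[t] ** 2)    # back through tanh
        dw += dpre * hs[t - 1]
        du += dpre * xs[t - 1]
        dh = dpre * w                     # pass gradient on to h_{t-1}
    return dw, du
```

The "through time" part is just the backwards loop: the same two weights are reused at every step, so their gradients are summed over all steps.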


shawntan 4261 days ago

Nitpicking here, but while the authors do use a recurrent neural net (RNN), they do not use it exclusively.

The system consists of a memory element and a controller element. In their evaluation of the system, they use both a standard feed-forward network and an RNN with long short-term memory (LSTM) units as the controller element. In certain tasks, the feed-forward network works better.

+1 on the deeplearning.net tutorials, and theano. I've learnt a lot from there.
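
To illustrate the controller distinction, a rough sketch under loud assumptions: both controllers below are single tanh layers over plain Python lists, and the recurrent one is a plain tanh RNN swapped in for brevity, not an LSTM. The function names and the idea of feeding the previous read vector alongside the input are illustrative, not taken from the authors' code.

```python
import math

def dense_tanh(weights, inputs):
    # One fully connected tanh layer; each row of `weights` is one output unit.
    return [math.tanh(sum(w * i for w, i in zip(row, inputs))) for row in weights]

def feedforward_controller(x, read_vec, weights):
    # Stateless: the output depends only on the current input and the last
    # memory read, so anything to be remembered must live in the memory.
    return dense_tanh(weights, x + read_vec)

def recurrent_controller(x, read_vec, hidden, weights):
    # Stateful: the previous hidden state is fed back in, giving the
    # controller a private scratchpad in addition to the external memory.
    return dense_tanh(weights, x + read_vec + hidden)

ff_out = feedforward_controller([1.0], [0.0, 0.5], [[0.5, -0.5, 1.0]])
rnn_a = recurrent_controller([1.0], [0.0, 0.5], [0.0], [[0.5, -0.5, 1.0, 2.0]])
rnn_b = recurrent_controller([1.0], [0.0, 0.5], [1.0], [[0.5, -0.5, 1.0, 2.0]])
```

With identical input and read vector, the feed-forward controller always gives the same output, while the recurrent one also depends on what it carried over from earlier steps.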


agibsonccc 4259 days ago

Right. Mainly just low hanging fruit for those who aren't in this stuff day to day.

In a lot of my talks and day to day conversations, I've found people don't know the difference between feed forward, recurrent, recursive, ... architectures. You get the point :P


dang 4261 days ago

The neural Turing machines article has had significant recent attention here: https://hn.algolia.com/?q=neural+turing+machines#!/story/for.... If someone wants to post the "Learning to Execute" article, that would be great.

Originally I thought we could just change the present url to that one, but since the comments are only about the other paper, it seems better to just treat this as a dupe.


javierluraschi 4261 days ago

http://arxiv.org/pdf/1410.4615v1.pdf


dang 4261 days ago

I meant that someone should submit it as a story.
