by shawntan 4261 days ago
The paper puts the NTM through several tasks and tests for "overfitting", i.e. how well it has generalised, by giving it a longer instance of the task than it saw during training. For example, in the copy task they trained it on sequences of length 20 but tested it on a sequence of length 100.
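
To make that setup concrete, here is a rough sketch of how copy-task data for such a length-generalisation test could be generated. This is not the paper's code: the 8-bit vector width, the extra delimiter channel, and the names make_copy_example and bit_errors are all my own assumptions for illustration.

```python
import random

def make_copy_example(length, width=8, rng=random):
    """Build one copy-task example (hypothetical layout, not the paper's).

    Inputs: `length` random bit vectors, one delimiter step, then `length`
    blank steps during which a model would be expected to emit the copy.
    Targets: the original bit vectors.
    """
    seq = [[rng.randint(0, 1) for _ in range(width)] for _ in range(length)]
    # One extra channel is reserved for the delimiter flag (assumption).
    inputs = [bits + [0] for bits in seq]
    inputs.append([0] * width + [1])
    inputs += [[0] * (width + 1) for _ in range(length)]
    return inputs, seq

def bit_errors(predicted, targets):
    """Count the bits that differ between a prediction and the targets."""
    return sum(p != t
               for p_row, t_row in zip(predicted, targets)
               for p, t in zip(p_row, t_row))

rng = random.Random(0)
# Length seen in training versus the longer length used only for testing.
train_inputs, train_targets = make_copy_example(20, rng=rng)
test_inputs, test_targets = make_copy_example(100, rng=rng)
```

A model that had only memorised length-20 behaviour would rack up bit errors on the length-100 example, whereas one that learnt an actual copy procedure should not.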

Of course, this doesn't guarantee anything, but they also look at some of the internals of the learnt system that are more easily interpreted, and find that it behaves pretty consistently.