by dave_sullivan 4257 days ago
Re: point of the paper, I think it addresses a current need in representation learning research, namely the question of "OK, we can teach really large neural networks stuff, but how do we compress that knowledge efficiently?" How can we learn more compact/efficient/reliable/discrete representations? I've only just finished reading it through, but this seems to me to be a promising direction and one I'd like to see more research on.

Re: number of training examples, I'm taking the chart on pg 11 to mean the number of training examples shown. Based on that, it looks like the NTM is learning a lot faster than the LSTM. As far as I can tell, it's getting to near 0 loss about 20,000 examples in? Whether learning w/ 20k examples is impressive depends on the domain; personally, I think it's comparatively impressive.

Re: cherry picking of tasks to highlight perceived strengths of NTM, fair enough. Although this is one I'll be playing around with a bit to find out where those strengths start and stop...

Any thoughts on how this compares to the approach of HTMs?