Hacker News
by YetAnotherNick 2105 days ago
Actually, we do have a few mechanisms for long-term memory, such as the Neural Turing Machine (NTM), which has explicit memory cells that the neural network can read from and write to. I think the only thing holding NTMs back is that they are not as computationally efficient as a fixed-size-context Transformer.
1 comment

What's holding NTMs back is that they are hard to train, even harder than RNNs. They are not much less efficient than a Transformer. Rather, the Transformer has all the advantages of the NTM while being much easier to train.

Actually, the way I see it, the Transformer is a direct descendant of memory-based architectures (NTM, MemNet, stack-based RNNs...) that is both expressive and easy to train.
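
To make the family resemblance concrete, here is a rough sketch in plain Python that puts an NTM-style content-based read next to a single-query scaled dot-product attention read. Both compute a softmax over similarity scores and return a weighted sum of memory rows. The function names, the toy memory, and the query are all made up for illustration. The sketch leaves out the NTM's location-based addressing and its write (erase/add) step, as well as the learned projections and multiple heads of a real Transformer.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Small epsilon guards against division by zero for all-zero vectors.
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / (norm + 1e-8)

def weighted_sum(weights, rows):
    width = len(rows[0])
    return [sum(w * row[j] for w, row in zip(weights, rows)) for j in range(width)]

def ntm_content_read(key, memory, beta=1.0):
    """NTM-style content addressing (illustrative name): softmax over
    cosine similarity between the key and each memory row, sharpened by beta."""
    weights = softmax([beta * cosine(key, row) for row in memory])
    return weights, weighted_sum(weights, memory)

def attention_read(query, keys, values):
    """Scaled dot-product attention for one query (illustrative name):
    softmax over dot products divided by sqrt(dimension)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return weights, weighted_sum(weights, values)

# Toy memory with three 2-dimensional slots (hypothetical values).
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.1]

ntm_w, ntm_out = ntm_content_read(query, memory, beta=5.0)
# Here the same rows serve as both keys and values; a Transformer would
# first apply separate learned projections to obtain them.
att_w, att_out = attention_read(query, memory, memory)

print("NTM read weights:      ", [round(w, 3) for w in ntm_w])
print("Attention read weights:", [round(w, 3) for w in att_w])
```

In both cases the first slot, the one closest to the query, gets the largest weight. The differences in this toy version are the similarity function (cosine with a sharpening factor versus scaled dot product) and the fact that attention reads from a context recomputed at every layer rather than from a persistent memory that the network also has to learn to write to.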