Hacker News new | ask | show | jobs
by visarga 715 days ago
There still are important distinctions. RNNs have constant memory while transformers expand their memory with each new token. They are related, but one could in theory process an unbounded sequence while the other cannot because of growing memory usage.