Hacker News
by SomewhatLikely 659 days ago
https://arxiv.org/abs/2006.16236 Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention