Hacker News
by macrolime 1211 days ago
RWKV is showing that RNNs may be able to perform on par with transformers.

https://github.com/BlinkDL/RWKV-LM