Hacker News
by naasking 1210 days ago
So maybe RWKV [1] is the next step. It parallelizes even better and seems to have no sequence limit.
[1] https://github.com/BlinkDL/RWKV-LM