by karalala
772 days ago
It's xLSTM contradicting existing peer-reviewed papers lmao. Either the xLSTM authors should fix their benchmarks or the existing peer-reviewed papers should be retracted. RWKV-v6 > RWKV-v5 > RWKV-v4, not the other way round, obviously.
HGRN 8 ppl worse than baseline transformers? NIPS 2023 spotlight paper btw.