Author here. The core claim: RWKV-7 (2.9B params, RNN) scores 72.8% avg across
standard benchmarks vs LLaMA 3.2's 69.7% — trained on 3.1T tokens vs ~9T.
Same parameter count, one-third the data.
The more interesting result is architectural: under the standard conjecture
that TC⁰ ≠ NC¹, RWKV-7 provably exceeds TC⁰, the complexity class bounding
standard Transformers (Merrill & Sabharwal's proof in the paper). It can solve
state-tracking problems that fixed-depth attention cannot.
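To make "state tracking" concrete, here is a toy version of the kind of task usually meant: follow where n items end up after a long sequence of swaps. This is only an illustration of the problem, not RWKV-7 code; `track_swaps` is a name I made up for the example. The point is that the answer needs one small state update per input token, in order, which is the natural shape for a recurrence.

```python
def track_swaps(swaps, n=5):
    """Return the final arrangement of n items after a sequence of position swaps.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    state = list(range(n))  # fixed-size state: which item sits at each position
    for i, j in swaps:      # one constant-cost update per input token
        state[i], state[j] = state[j], state[i]
    return tuple(state)


print(track_swaps([(0, 1), (1, 2)], n=3))  # prints (1, 2, 0)
```

With n = 5 the swaps generate the full permutation group S5, whose word problem is the textbook example of something believed to sit outside TC⁰.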
Inference uses constant memory regardless of context length, since there is no
KV cache to grow. The hybrid variant (RWKV-X) hits 99.8% passkey retrieval at
64K tokens and 1.37x Flash Attention v3 throughput at 128K.
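A rough back-of-the-envelope sketch of why the missing KV cache matters. The assumptions are mine: both function names and the config numbers are invented for illustration (they are not the real RWKV-7 or LLaMA configs), the recurrent state is modeled as one head_dim x head_dim matrix per head per layer, and smaller per-layer buffers are ignored.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Keys plus values for every cached token in every layer: grows linearly with n_tokens."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


def recurrent_state_bytes(n_layers, n_heads, head_dim, bytes_per_elem=2):
    """Assumed matrix-valued state (head_dim x head_dim) per head: independent of n_tokens."""
    return n_layers * n_heads * head_dim * head_dim * bytes_per_elem


# Made-up configuration, used for both models purely to compare scaling.
cfg = dict(n_layers=32, n_heads=40, head_dim=64)

for n_tokens in (1_000, 64_000, 128_000):
    kv = kv_cache_bytes(n_tokens, cfg["n_layers"], cfg["n_heads"], cfg["head_dim"])
    rnn = recurrent_state_bytes(**cfg)
    print(f"{n_tokens:>7} tokens: KV cache {kv / 2**20:9.1f} MiB, "
          f"recurrent state {rnn / 2**20:6.1f} MiB")
```

The absolute numbers depend entirely on the assumed config; the thing to look at is that the first column scales with context length and the second does not.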
> Specifically, we collected new data created after January 2025, including: [...] new fiction on Archive of Our Own (Various, 2025),
Not sure how to feel about this. From a researcher's point of view, reproducibility is important, but the last time someone publicly collected data from AO3, the community was not very fond of that.
Yeah, that HF dataset page is rough. 247+ threads, mostly DMCA reports, archive-locked fics scraped without consent, dataset reuploaded after takedown. The AO3 community had every reason to be furious.
Not RWKV-specific though. Most large corpora include the same sources; they just don't list them explicitly. Whether the transparency makes it better or worse is a real question.
Paper: https://arxiv.org/abs/2503.14456 (COLM 2025, peer-reviewed)
Weights: https://huggingface.co/collections/RWKV/rwkv-v7-67d43835efa2...
Code: https://github.com/BlinkDL/RWKV-LM (Apache 2.0)
Happy to discuss the delta rule generalization, the TC⁰ proof, or the benchmark methodology. I went through 36 sources digging into the caveats.