by VHRanger 806 days ago
It seems only OK as a model? Looking at the LLM chat leaderboard it's 71st and the 14B version is worse than a lot of 7B models:

https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...

Also, llama.cpp makes inference accessible for a lot of people, and it's not available for RWKV.

Not to knock the model, I'm sure it's good. I also like that it's a successful example of citizen science.

It's just not popular enough to have the inference infrastructure transformers have, not established enough to attract enough money to get 60B+ models trained, and so on.

3 comments

This leaderboard is not the best way to compare model architectures, because the dataset and finetuning have too much influence on the ranking. I think perplexity on a particular dataset would be a better basis for comparison.
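
For anyone unfamiliar with the metric, here is a minimal sketch of how perplexity is computed from per-token log-probabilities: it is the exponential of the average negative log-likelihood. The function name and the two sets of log-probabilities are made up for illustration and do not come from RWKV or any real model. The comparison is only fair when both models are scored on the same text with the same tokenization; otherwise you would want to normalize per byte or per character instead.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical log-probs that two models assign to the same four tokens.
model_a = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
model_b = [math.log(0.25)] * 4

# Lower is better: the model with lower perplexity predicts the text better.
print("model A:", perplexity(model_a))
print("model B:", perplexity(model_b))
```

In practice the log-probabilities would come from running each model over a shared held-out dataset, which takes the finetuning and chat-style prompting out of the comparison.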
>Also, llama.cpp makes inference accessible for a lot of people, and it's not available for RWKV.

It absolutely is: https://github.com/RWKV/rwkv.cpp .

I believe it is undertrained, at minimum.