Hacker News
by senseiV 965 days ago
ah yes, RWKV, always great to mention. crazy how no one talks about it, it's literally the most powerful multilingual model at the 1B and 3B scales, and it's probably going for 14B and 7B too