Hacker News
by whimsicalism 1061 days ago
good point. for some reason i always leave out RWKV when thinking of the transformer models, perhaps because it is more of a redux