Hacker News
by whimsicalism 845 days ago
> There are some papers suggesting that transformers are better than SSMs in fundamental ways

I mean, vanilla transformers are also shown to fail at the tasks they present.