Hacker News
by solomatov 895 days ago
> state-space models make transformer based models obsolete

We'll see pretty soon whether they work at large scale. I hope they will, but they might not. There are models that might outperform more advanced models at small scale, and I haven't heard how Mamba performs at GPT scale.