by solomatov
895 days ago
> state-space models make transformer based models obsolete

We will see pretty soon whether they work at a large scale. I hope they will, but they might not. There are models that might outperform more advanced models at a smaller scale, and I haven't heard how Mamba performs at GPT scale.