by sigmoid10
843 days ago
True, but bear in mind the Mamba preprint is less than three months old. A lot of people are probably experimenting with these ideas right now, and training a completely new, large foundation model with a different architecture will take a significant amount of time.