by edflsafoiewq
744 days ago
The doc comment at the top of the .py file is sufficiently descriptive:
"""Simple, minimal implementation of Mamba in one file of Numpy adapted from (1) and inspired from (2).
Suggest reading the following before/while reading the code:
[1] Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Albert Gu and Tri Dao)
https://arxiv.org/abs/2312.00752
[2] The Annotated S4 (Sasha Rush and Sidd Karamcheti)
https://srush.github.io/annotated-s4
|
|
Even that first line you posted is unhelpfully circular, defining Mamba as an implementation of Mamba.
Call me old-fashioned, but a best-practice README should concisely state what the thing is and why it exists, i.e. the problem it solves. (And not with a circular definition.)