Hacker News
by
swimwiththebeat
839 days ago
Does anyone know if this is using the Mamba architecture[1] instead of transformers? It looks like it uses a state space model (SSM) layer.
[1]: https://arxiv.org/abs/2312.00752
2 comments
milliondreams
837 days ago
We covered state space models in a blog post here -
https://blog.dragonscale.ai/state-space-models/
It gives an overview of Mamba and StripedHyena.
sal9000
839 days ago
It came earlier than Mamba. It uses Hyena Hierarchy blocks, which are considered SSMs but are not the same as Mamba.