Hacker News
by
trextrex
740 days ago
I'm not clear on what advantage this architecture has over Mamba/Griffin. They also have linear scaling and better sequence parallelism, and they are competitive in performance with transformers.
2 comments
lalaland1125
740 days ago
The whole field seems to be having issues with comparisons right now.
We really don't even know how Mamba and Griffin compare.
wave_1
740 days ago
state tracking...