Hacker News
by
cubefox
497 days ago
Loosely related thought: A year ago, there was a lot of talk about the Mamba SSM architecture replacing transformers. Apparently that hasn't happened so far.
1 comment
thesz
496 days ago
Just like with neural networks and Adam [1], LLMs evolve to make transformers their best building block.
[1] https://parameterfree.com/2020/12/06/neural-network-maybe-ev...