Hacker News
by
cubefox
83 days ago
They don't say anything about dropping training speed.
1 comment
estearum
82 days ago
> a departure from Mamba-2, which optimized for training speed.
cubefox
82 days ago
Yes? Mamba-2 optimized for training speed compared to Mamba-1. Mamba-3 adds optimization for inference. These are pretty much version numbers.