by cubefox 83 days ago
They don't say anything about dropping training speed.
1 comment

> a departure from Mamba-2, which optimized for training speed.

?

Yes? Mamba-2 was optimized for training speed compared to Mamba-1. Mamba-3 adds optimizations for inference. These are pretty much version numbers.