by esquire_900 83 days ago
This is sort of what their first sentence states? Except your line implies that they are fast in both training and inference, whereas they imply they are focusing on inference and giving up training speed for it.

It's a nice opening as it is imo

They don't say anything about dropping training speed.
> a departure from Mamba-2, which optimized for training speed.

?

Yes? Mamba-2 optimized for training speed compared to Mamba-1. Mamba-3 adds optimization for inference. These are pretty much version numbers.