Hacker News
MiniMax teased M3 Sparse Attention: 9.7x prefilling, 15.6x decoding at 1M (twitter.com)
9 points by rebekkamikkoa 25 days ago