by anshumankmr 254 days ago
Also worth checking out is Codestral. I think it had a 256k context and used Mamba, even if it is a slightly older model now. It worked great for a Text2SQL use case we worked on.

Magistral 2509 just came out. It slows down a lot once you go over 40,000 tokens of context. Still, it's quite a fantastic model.