by anon373839 490 days ago
Mistral's partnership with Cerebras for inference hardware has received less commentary than I expected. They're basically blowing the competition out of the water, with Le Chat getting 1,100+ tokens per second of per-user throughput.
3 comments

Yes, I'm really impressed by the speed as well.

A bit more about the collaboration can be found here:

https://cerebras.ai/blog/mistral-le-chat

For those who haven't tried it yet, it's best to see for yourself. It is visibly, significantly faster:

https://chat.mistral.ai/chat

That's just crazy.

I'm curious when someone will run the right experiment, one where some LLM on Cerebras reasons so well, at such scale, and so fast that it does something truly novel.