| I don't really see how NVIDIA shipping so many chips matters. If more people want Cerebras chips, they will presumably be manufactured. I agree that Cerebras manufactures fewer than 300 wafers per year, probably around 250-300, calculated from $1.6-2 million per unit and their 2024 revenue, but I don't see how that matters either. I also don't see how core counts matter, though I assume that Cerebras is some kind of giant VLIW-y thing where you can give different instructions to different subprocessors. I imagine that the model weights are stored in little pieces on each processor, and that each processor does some calculation and hands the result on. Then you never need to load the weights; the only thing you're passing around is activations, going from wafer 1 to wafer 2 and so on up to wafer 20. When this is running at full speed, I believe it can be very efficient, better than a small GPU like those made by NVIDIA. Yes, a lot of the area will be on-chip memory/SRAM, but a lot of it will also be logic, and that logic will be computing things instead of being used to move things from RAM to on-chip memory. I don't have any deep knowledge of this system, really, nothing beyond what I've explained here, but I believe that Mistral are using these systems because they're completely superb and superior to GPUs for their purposes, and that they will have made a carefully weighed decision based on actual performance and actual cost. |
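For anyone who wants to check the 250-300 figure quoted above, here is a minimal sketch of the arithmetic. It assumes all revenue comes from selling units at $1.6-2 million each, which is a simplification. The function names and the $500 million revenue value are hypothetical placeholders for illustration, not a reported number; swap in whatever revenue figure you trust.

```python
def units_per_year(revenue_usd, price_low_usd, price_high_usd):
    """Return (min_units, max_units) implied by a revenue figure and a unit price range.

    Assumes every dollar of revenue comes from unit sales (a simplification).
    """
    if not 0 < price_low_usd <= price_high_usd:
        raise ValueError("expected 0 < price_low_usd <= price_high_usd")
    # Fewest units if every unit sold at the high price, most at the low price.
    return revenue_usd / price_high_usd, revenue_usd / price_low_usd


def implied_revenue(units_low, units_high, price_low_usd, price_high_usd):
    """Return the (min, max) revenue consistent with a unit range and a price range."""
    return units_low * price_low_usd, units_high * price_high_usd


# Hypothetical revenue placeholder, NOT a reported figure.
ASSUMED_REVENUE_USD = 500_000_000

low, high = units_per_year(ASSUMED_REVENUE_USD, 1_600_000, 2_000_000)
print(f"Implied units per year: {low:.0f} to {high:.1f}")

rev_low, rev_high = implied_revenue(250, 300, 1_600_000, 2_000_000)
print(f"Revenue consistent with 250-300 units: ${rev_low:,} to ${rev_high:,}")
```

Going the other way, 250-300 units at $1.6-2 million each is consistent with anything from $400 million to $600 million of unit revenue, so the estimate is only as good as the revenue number and the all-hardware assumption behind it.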
Mistral is a small fish in the grand scheme of things. I would assume that using Cerebras is a way to differentiate themselves in a market where they are largely ignored. Being largely ignored is also why Mistral is still small enough for Cerebras to handle its needs. If they grow to OpenAI levels, there is no chance Cerebras will be able to meet their demand.
Finally, I researched this out of curiosity last year, and my remarks here are based on that.