So I talked with people from Cerebras at a tradeshow and it seems like you have to have extremely advanced cooling to keep that thing working. IIRC the user agreement says "you can't turn it off or else the warranty is void". With the way their chip is designed, I would be strongly worried that the giant chip has warping issues, for example when certain cores are dark and the heat generation is uneven (or if it gets shut down by accident in the middle of LLM inference). There may even be chip-to-chip variation depending on which cores got dq'd based on their on-the-spot testing.
Already through the grapevine I'm hearing that H100s and B100s have to be replaced more often.... than you'd want? I suspect people are mum about it because otherwise they might lose sweetheart discounts from Nvidia. I can't imagine that Cerebras, even with the extreme engineering of their cooling system, has truly solved cooling in a way that isn't a pain in the ass (otherwise they wouldn't have the clause?) and if I were building a datacenter I would be very worried about having to do annoying and capital-intensive replacements.
I have nowhere near the knowledge required to say yes or no to your argument. My point is that the guy who wrote the article is shilling a pre-IPO company while FUDing the competitors, and it's really surprising that it got that many upvotes.
maybe but it shouldn't be surprising. cerebras's designs were born ~2014, ~pre-transformers, and the megachips were initially targeted at hpc workloads. it was definitely a "solution looking for a problem" back then and is now drifting into square-peg-in-round-hole territory (see sibling comment about groq). I'm surprised they have gotten their raw perf as high as they have by now.