The memory bandwidth for that is 150GB/sec. Inference speed is memory-bandwidth bound, so that memory is useless for inference. Discrete GPUs would run circles around the CSE-3 at inference if it tried to use the external DRAM.
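To put a rough number on the bandwidth-bound argument, here is a back-of-envelope sketch. It assumes batch-1 decoding of a dense model, where every generated token has to stream all the weights from memory once, so tokens/sec is capped at bandwidth divided by model size. The 70B-parameter model, the 2 bytes per parameter, and the 3000 GB/s "GPU-class" figure are illustrative assumptions of mine, not numbers from any spec sheet; only the 150 GB/s figure comes from the comment above.

```python
def max_decode_tokens_per_sec(bandwidth_gb_s: float,
                              params_billion: float,
                              bytes_per_param: float) -> float:
    """Upper bound on batch-1 decode rate if each token reads all weights once."""
    # 1e9 params * bytes/param = that many GB of weights (using 1 GB = 1e9 bytes).
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


# 150 GB/s is the external-memory figure quoted above.
# The model size and the GPU bandwidth below are hypothetical, for scale only.
external_dram = max_decode_tokens_per_sec(150, params_billion=70, bytes_per_param=2)
gpu_class_hbm = max_decode_tokens_per_sec(3000, params_billion=70, bytes_per_param=2)

print(f"external DRAM: {external_dram:.2f} tokens/sec upper bound")
print(f"GPU-class HBM: {gpu_class_hbm:.2f} tokens/sec upper bound")
```

This ignores KV-cache traffic, batching, and compute limits, so treat it as a ceiling, not a prediction. Under these assumptions the external DRAM path tops out at roughly one token per second.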
The other comment already clarified that 150GB/sec = 1.2Tbps. That said, the CSE-3 did not change this figure. It is buried in their specification sheets somewhere if you care to search for it. I did last year, which is how I know.
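For anyone who wants to check the unit conversion, a minimal sketch (the function name is mine, and it assumes decimal units, i.e. 1 GB = 8 Gb and 1 Tb = 1000 Gb):

```python
def gb_per_sec_to_tbps(gigabytes_per_sec: float) -> float:
    """Convert gigabytes per second to terabits per second (decimal units)."""
    gigabits_per_sec = gigabytes_per_sec * 8  # 8 bits per byte
    return gigabits_per_sec / 1000            # 1000 Gb per Tb


print(gb_per_sec_to_tbps(150))
```

So 150 GB/s and 1.2 Tbps are the same figure, just written in bytes versus bits.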
That's ... right! Huh, missed that (assuming all units were written properly and mean what they mean).
Edit: yeah, double checked their site and everything. Dang, their IO is indeed "slow". They claim 1 microsecond latencies, but still, an H100 can move much more data than that.
It is useless for inference, but it is great for training. It used to be more prominent on their website, but it is harder to find references to it now that they are mimicking Groq’s business model.