by ryao 384 days ago
The memory bandwidth for that is 150GB/sec. Inference speed is memory-bandwidth bound, so that memory is useless for inference. Discrete GPUs would run circles around the CSE-3 at inference if they tried using the external DRAM.
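
To make the bandwidth-bound argument concrete, here is a rough back-of-the-envelope sketch. It assumes every generated token requires reading all model weights once from that memory, and it uses a hypothetical 70B-parameter model at 2 bytes per parameter; neither the model size nor the function name comes from the thread, only the 150GB/sec figure does.

```python
def max_tokens_per_second(bandwidth_bytes_per_s: float, bytes_read_per_token: float) -> float:
    """Upper bound on decode speed when each token must stream the weights once."""
    return bandwidth_bytes_per_s / bytes_read_per_token


# 150 GB/s is the figure quoted in the comment above.
EXTERNAL_DRAM_BANDWIDTH = 150e9

# Hypothetical model for illustration: 70e9 parameters at 2 bytes each (16-bit weights).
MODEL_BYTES = 70e9 * 2

if __name__ == "__main__":
    bound = max_tokens_per_second(EXTERNAL_DRAM_BANDWIDTH, MODEL_BYTES)
    print(f"at most ~{bound:.2f} tokens/sec per sequence")
```

Under those assumptions the ceiling is on the order of one token per second per sequence, which is the sense in which that memory would be "useless" for inference. Batching and caching change the picture, so treat this as an estimate, not a benchmark.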

Where do you get those 150GB/sec from?

Here [1] they imply they can reach 1.2Tbps (allegedly, I know), and that's the previous generation ...

1: https://f.hubspotusercontent30.net/hubfs/8968533/Virtual%20B...

The other comment already clarified that 150GB/sec = 1.2Tbps. That said, the CSE-3 did not change this figure. It is buried in their specification sheets somewhere if you care to search for it. I did last year, which is how I know.
Doesn't 1.2Tbps / 8 = 150GB/sec, because 8b = 1B?
That's ... right! Huh, missed that (assuming all units were written properly and mean what they mean).

Edit: yeah, double checked their site and everything. Dang, their IO is indeed "slow". They claim 1 microsecond latencies, but still, an H100 can move much more data than that.
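
For anyone who wants to redo the unit conversion from the thread, a minimal sketch, assuming "Tbps" means 10^12 bits per second and "GB" means 10^9 bytes (the helper name is made up for illustration):

```python
BITS_PER_BYTE = 8


def tbps_to_gigabytes_per_s(tbps: float) -> float:
    """Convert terabits per second to gigabytes per second (decimal prefixes)."""
    bits_per_s = tbps * 1e12
    bytes_per_s = bits_per_s / BITS_PER_BYTE
    return bytes_per_s / 1e9


if __name__ == "__main__":
    # 1.2 Tbps is the figure quoted from the linked document.
    print(f"{tbps_to_gigabytes_per_s(1.2):.0f} GB/sec")
```

With 1.2 as input this comes out to about 150, so the two figures in the thread describe the same bandwidth in different units.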