HN Mirror

by fxtentacle 242 days ago

"273 GB/sec memory bandwidth"

Really? Less RAM bandwidth than an Epyc CPU? And 4x to 8x less than a consumer GPU?

How come this doesn’t massively limit LLM inference speeds?

1 comment

qskousen 242 days ago

It does: inference is much slower than on a consumer video card. The draw of the Spark and systems like it is the massive amount of memory available to the GPU.
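To put rough numbers on the bandwidth limit, here is a back-of-the-envelope sketch. It assumes single-stream decoding of a dense model, where every generated token has to stream all the weights from memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The 70B-parameter, 4-bit model is a hypothetical example, not something from the thread, and the function names are made up for illustration. The GPU figures simply apply the "4x to 8x" multiplier quoted above.

```python
def weights_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (ignores KV cache and activations)."""
    return params_billions * bits_per_weight / 8


def max_decode_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if each token reads all weights once."""
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    spark_bw = 273.0  # GB/s, the figure quoted in the thread
    # Hypothetical model: 70B parameters at 4 bits per weight.
    model_gb = weights_size_gb(params_billions=70, bits_per_weight=4)

    # Compare the quoted bandwidth with 4x and 8x that figure.
    for label, bw in [("273 GB/s", spark_bw),
                      ("4x", 4 * spark_bw),
                      ("8x", 8 * spark_bw)]:
        bound = max_decode_tokens_per_sec(bw, model_gb)
        print(f"{label}: at most {bound:.1f} tokens/s for a {model_gb:.0f} GB model")
```

Under these assumptions a 35 GB model tops out at about 7.8 tokens/s at 273 GB/s. The catch is that a 35 GB model does not fit in a typical consumer card's VRAM at all, which is the memory-capacity trade-off described above. Real throughput will be lower than this bound, and batching or mixture-of-experts models change the picture.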
