HN Mirror



	by kergonath 512 days ago
	I see this comment all the time. But realistically, if you want more than 1 token/s you’re going to need GeForce cards, and 100 GB worth of those would cost quite a lot as well.


nenaoki 512 days ago

https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

GB10, or DIGITS, is $3,000 for 1 PFLOP (@4-bit) and 128GB unified memory. Storage configurable up to 4TB.

Two units can be paired to run 405B (4-bit), though probably not very fast: memory bandwidth is lower than a typical GPU's, and that is the main bottleneck for LLM inference.
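For a rough sense of the numbers, here is a back-of-envelope sketch. It assumes a dense model where every weight is read once per generated token, and it ignores KV cache, activations, and interconnect overhead between the paired units. The bandwidth values in the loop are hypothetical placeholders, not published specs for this device, and the function names are made up for illustration.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Crude upper bound on decode speed: one full pass over the weights per token."""
    return bandwidth_gb_s / model_gb


model_gb = weights_gb(405, 4)            # 405B parameters at 4 bits each
fits_in_two_units = model_gb <= 2 * 128  # two units with 128 GB each (from the comment above)

print(f"weights: {model_gb} GB, fits in 2 x 128 GB: {fits_in_two_units}")

# Hypothetical bandwidth figures in GB/s, only to show how the bound scales.
for bandwidth in (250, 500, 1000):
    bound = max_tokens_per_s(bandwidth, model_gb)
    print(f"{bandwidth} GB/s -> at most ~{bound:.1f} tokens/s")
```

The weights alone come to about 202.5 GB, which is why a single 128 GB unit is not enough and a pair is. The tokens/s figure is only a ceiling; real throughput would be lower.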

kergonath 512 days ago

That’s not something I can get, so it’s not really relevant. There is always a better device around the corner.

justincormack 512 days ago

Not shipping until May or so.