| HN Mirror



	by Numerlor 244 days ago
	The GPU is significantly faster and it has CUDA, though I'm not sure where it'd fit in the market. At the lower price points you have the AMD machines, which are significantly cheaper even though they're slower and have worse support. Then there are Apple's machines with higher memory bandwidth, and even the NVIDIA AGX Thor is faster in GPU compute at the cost of a worse CPU and networking. At the 3-4K price point even a Threadripper system becomes viable, and it can take significantly more memory.

1 comments

yencabulator 242 days ago

> The GPU is significantly faster and it has CUDA,

But (non-batched) LLM processing is usually limited by memory bandwidth, isn't it? Any extra speed the GPU has is not used by current-day LLM inference.


Numerlor 241 days ago

I believe only inference itself is bandwidth limited; prompt processing and other tasks, on the other hand, need the compute. As I understand it, the workstation as a whole is also aimed at local development before things are readied for the datacenter, not just at running LLMs.
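
For a rough feel of the bandwidth-versus-compute split discussed in this thread, here is a back-of-the-envelope sketch. It assumes the usual simplifications: non-batched generation reads every weight once per token, and prompt processing costs about 2 FLOPs per parameter per prompt token. All hardware and model numbers below are hypothetical placeholders, not specs of any machine mentioned here.

```python
def decode_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on non-batched generation speed.

    Assumes every weight byte is read from memory once per generated token,
    so memory bandwidth is the limit and extra GPU compute does not help.
    """
    return bandwidth_bytes_per_s / model_bytes


def prefill_seconds(n_params: float, prompt_tokens: int, flops_per_s: float) -> float:
    """Lower bound on prompt-processing time.

    Assumes roughly 2 FLOPs per parameter per prompt token, so this phase
    scales with compute throughput rather than memory bandwidth.
    """
    return 2.0 * n_params * prompt_tokens / flops_per_s


# Hypothetical values, chosen only to make the arithmetic easy to follow.
n_params = 8e9          # 8B-parameter model
bytes_per_param = 1.0   # 8-bit quantized weights
bandwidth = 250e9       # 250 GB/s memory bandwidth
compute = 100e12        # 100 TFLOP/s sustained
prompt_tokens = 4000

tps = decode_tokens_per_second(n_params * bytes_per_param, bandwidth)
prefill = prefill_seconds(n_params, prompt_tokens, compute)

print(f"decode upper bound: {tps:.2f} tokens/s")
print(f"prefill lower bound: {prefill:.2f} s for {prompt_tokens} prompt tokens")
```

Under these assumptions, doubling `compute` halves the prefill estimate and leaves the decode estimate unchanged, while doubling `bandwidth` does the opposite. Real systems land below both bounds.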
