by Numerlor 244 days ago
The GPU is significantly faster and it has CUDA, though I'm not sure where it'd fit in the market.

At the lower price points you have the AMD machines, which are significantly cheaper even though they're slower and have worse support. Then there are Apple's machines with higher memory bandwidth, and even the NVIDIA AGX Thor is faster in GPU compute, at the cost of a worse CPU and networking. At the 3-4K price point even a Threadripper system becomes viable, and that can take significantly more memory.

1 comments

> The GPU is significantly faster and it has CUDA,

But (non-batched) LLM processing is usually limited by memory bandwidth, isn't it? Any extra speed the GPU has is not used by current-day LLM inference.
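A rough back-of-the-envelope sketch of that point: if every weight has to be read from memory once per generated token, memory bandwidth caps single-stream decode speed no matter how fast the GPU is. The function name and the numbers below are illustrative assumptions, not the specs of any machine mentioned in this thread.

```python
def decode_tokens_per_second_ceiling(bandwidth_gb_s, params_billions, bytes_per_param):
    """Upper bound on non-batched decode speed.

    Assumes each generated token requires reading all model weights once,
    and ignores KV-cache traffic and compute time entirely.
    """
    # billions of params * bytes per param = model size in GB
    model_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / model_gb


# Hypothetical example: 70B parameters at 1 byte each, 250 GB/s of memory bandwidth.
ceiling = decode_tokens_per_second_ceiling(250, 70, 1.0)
print(f"at most ~{ceiling:.1f} tokens/s")
```

Under this simplified model, doubling the bandwidth doubles the ceiling, while extra GPU compute changes nothing.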

I believe only inference is bandwidth-limited; prompt processing and other tasks, on the other hand, need the compute. As I understand it, the workstation as a whole is also aimed at the local development process before things are readied for the datacenter, not just at running LLMs.
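To make the distinction concrete, here is a minimal roofline-style sketch. It assumes a dense model that costs about 2 FLOPs per parameter per token and reads its weights once per forward pass, and it ignores attention and KV-cache costs. All names and hardware numbers are hypothetical, chosen only for illustration.

```python
def step_time_seconds(tokens, params, bytes_per_param, peak_flops, bandwidth_bytes_s):
    """Estimate one forward pass over `tokens` tokens and say which limit dominates."""
    # Roughly one multiply and one add per parameter for each token.
    flops = 2 * params * tokens
    # Weights are read once and shared by every token in the pass.
    bytes_moved = params * bytes_per_param
    compute_time = flops / peak_flops
    memory_time = bytes_moved / bandwidth_bytes_s
    bound = "compute" if compute_time > memory_time else "memory"
    return max(compute_time, memory_time), bound


# Hypothetical hardware: 100 TFLOP/s peak, 250 GB/s bandwidth; 8B params at 2 bytes each.
hw = dict(params=8e9, bytes_per_param=2, peak_flops=100e12, bandwidth_bytes_s=250e9)

print(step_time_seconds(tokens=1, **hw))     # generating one token at a time
print(step_time_seconds(tokens=2048, **hw))  # processing a 2048-token prompt in one pass
```

With these made-up numbers the single-token step comes out memory-bound and the 2048-token prompt comes out compute-bound, which matches the split described above.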