by thulle 1034 days ago
With a 4090, nvidia-smi reports ~60-70% memory bandwidth usage at 99% GPU usage, so that'd be 650-750GB/s. Considering how slow inference is on the APU, maybe having 10% of the 4090's memory bandwidth isn't that much of an issue?
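
For anyone who wants to redo the arithmetic, here's a rough sketch. It assumes the 4090's spec-sheet peak of 1008 GB/s and treats the nvidia-smi percentage as a plain fraction of that peak, which is only an approximation, since that counter reports how busy the memory controller was rather than bytes moved. The function name is made up for illustration.

```python
# Assumption: the RTX 4090's spec-sheet peak memory bandwidth, in GB/s.
PEAK_4090_GBPS = 1008.0


def estimated_bandwidth(peak_gbps, utilization):
    """Scale a peak bandwidth by a utilization fraction (0.0 to 1.0)."""
    return peak_gbps * utilization


# Assumption: the reported utilization is read as a fraction of peak bandwidth.
low = estimated_bandwidth(PEAK_4090_GBPS, 0.60)
high = estimated_bandwidth(PEAK_4090_GBPS, 0.70)

# What "10% of the 4090's bandwidth" works out to under the same assumption.
apu_share = estimated_bandwidth(PEAK_4090_GBPS, 0.10)

print(f"{low:.0f}-{high:.0f} GB/s in use; 10% of peak is {apu_share:.0f} GB/s")
```

Under those assumptions the range lands a bit below the 650-750GB/s ballpark, but it's the same order of magnitude, so the point stands.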

edit: comment from the reddit thread:

> LLM isn't all that great as it is primarily memory bandwidth bound, ie almost no difference from a CPU if your memory bw is mere 12/25Gb/s. SD needs far more compute for inference - APU with slow memory helps.

- https://old.reddit.com/r/Amd/comments/15t0lsm/i_turned_a_95_...
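
To put a number on the "memory bandwidth bound" claim in that quote, here's a back-of-the-envelope sketch. It assumes every model weight is read from memory once per generated token, so bandwidth divided by model size gives a ceiling on tokens per second. The 4 GB model size is hypothetical, the function name is made up, and the quoted "25Gb/s" is read as 25 GB/s, which is a guess.

```python
def max_tokens_per_second(bandwidth_gbps, model_size_gb):
    """Ceiling on decode speed if all weights are read once per token."""
    return bandwidth_gbps / model_size_gb


# Hypothetical model size: roughly a 7B-parameter model at 4-bit quantization.
MODEL_SIZE_GB = 4.0

# Bandwidth figures in GB/s: the quote's 25 (assumed GB/s), 10% of a 4090,
# and the 4090's spec-sheet peak.
for label, bandwidth in [("slow system RAM", 25.0),
                         ("10% of a 4090", 100.8),
                         ("4090 peak", 1008.0)]:
    ceiling = max_tokens_per_second(bandwidth, MODEL_SIZE_GB)
    print(f"{label}: at most ~{ceiling:.1f} tokens/s")
```

This is only an upper bound; real throughput will be lower once compute, caches, and overhead come into play.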